Human In The Loop · Weekly

A human-first AI podcast.

Real talk. Unpopular opinions. No hype. Two builders cut through the AI hype cycle every week — and call it like it is.

Watch on YouTube · Listen on Spotify

Hosted by Oscar Gallo & Matt Wozniak

About

We’re the humans in the loop.

In machine learning, “human in the loop” means a human who provides oversight and feedback in an automated system. That’s the lens of this show: AI is powerful, but humans aren’t leaving the loop. Not yet. Maybe not ever.

This isn’t another “AI is going to change everything” podcast. It’s for builders, operators, and the AI-curious who are tired of breathless hype, doomerism, and surface-level news recaps with no original thought.

What’s in each episode

Every week: one fixed, one rotating.

FIXED

Signal or Noise

We run through the week's AI headlines and make the call: is this actual signal worth paying attention to, or just noise clogging the timeline?

ROTATING

Ship It or Skip It

A real use case or product idea lands on the table. We debate whether it's worth building now or if the tech isn't there yet.

ROTATING

Explain It to My Client

Take a complex AI concept and explain it the way you'd actually explain it to a non-technical stakeholder. No jargon allowed.

ROTATING

Stack Check

One tool, library, or workflow change we actually adopted this week. No sponsorship energy. Just what's in the trenches.

Hosts
Oscar Gallo

AI Engineer & Entrepreneur

Oscar lives at the intersection of engineering and business. He builds AI products, ships code, and runs companies — the daily grind of making AI work in the real world.

Matt Wozniak

Builder, Operator, Relentless Executor

Matt is the operator's operator. He builds, he ships, he scales. His lens is execution: what works, what doesn't, and why most people are thinking about it wrong.

Latest episode

Cursor's $0.50 Coding Model + Anthropic's $900B Round

EP 07 · May 26, 2026 · 98 min · Stack Check

Cursor just shipped a coding model that matches Claude Opus 4.7 at one tenth the price, built on an open Chinese base model. The same week, Anthropic is closing a $30B round at a $900B valuation. If frontier coding got commoditized in seven days, what is anyone paying premium for?

Welcome to Human In The Loop, a weekly podcast where two builders cut through the AI hype cycle. No breathless hype. No doomerism. No surface-level recaps. Honest, opinionated conversations from people actually building with AI.

Signal or Noise

The week’s AI headlines, filtered.

  1. SIGNAL: Cursor Composer 2.5, top-three coding agent at one tenth the cost of Opus

  2. SIGNAL: Andrej Karpathy joins Anthropic's pre-training team

  3. SIGNAL: Anthropic in talks for a $30B round at a $900B valuation

  4. SIGNAL: Google I/O 2026, Gemini Spark agent and Antigravity 2.0

  5. NOISE: Hark's $700M Series A for a "secret" AI device, and why we are calling it noise for now

Stack Check

Three AI dev tools we actually adopted in the last 30 days: what each one replaces, why it matters, and whether we'd actually keep using it.

Matt Pocock's `skills` repo, Skill Handoff: a real replacement for `/compact`

Convex (convex.dev), TypeScript-native backend built for agentic workflows

Google Antigravity 2.0 + CLI + Managed Agents API

Hot takes

Two opinions, no disclaimers.

Oscar

Cursor Composer 2.5 just proved the open base model thesis. Coding-as-a-service is about to fall off a cliff in price.

Matt

Karpathy joining Anthropic is worth more than the $900B valuation as a signal. Talent is the only moat the market still underprices.

Previously

Anthropic Killed the AI-for-SMB Playbook + AI Agents 101

EP 06 · May 19, 2026 · 92 min · No Jargon Required

Three months ago on Episode 1, we asked if there was a real business in selling AI to small businesses. This week, Anthropic shipped it themselves. So we just watched the founder playbook get eaten by a frontier lab in 90 days. If you were six weeks into building that company, what do you do Monday?

Signal or Noise

The week’s AI headlines, filtered.

  1. SIGNAL: Anthropic ships Claude for Small Business

    15 agentic workflows, QuickBooks + HubSpot + Microsoft 365 connectors, 10-city US tour.

  2. SIGNAL: Thinking Machines previews “Interaction Models”

    276B MoE, 0.4s full-duplex response, Mira Murati throws shade at every other lab's voice stack.

  3. SIGNAL: Sierra raises $950M at $15B+ valuation

    $150M ARR, 40% of the Fortune 50 paying for customer agents.

  4. SIGNAL: Trump administration reverses on AI oversight

    Driven by national-security fears around Anthropic's Mythos model.

  5. SIGNAL: Cloudflare cuts 1,100 jobs and blames AI

    On the same morning it posts record revenue — the first big “AI did it” layoff that isn't a cover for bad numbers.

No Jargon Required

Three AI concepts every exec and operator keeps hearing this quarter, in plain English.

AI Agent

The word every vendor abuses — and how to tell the real ones from the chatbots in costume.

Connectors and MCP

What “AI plugged into your tools” actually requires.

Full-Duplex Voice (Interaction Models)

What to ask when a vendor pitches a voice agent.

Hot takes

Two opinions, no disclaimers.

Oscar

Anthropic just turned every AI consultant into a reseller. By Q3, the model providers own the workflow layer for every regulated and semi-regulated vertical.

Matt

Every vendor calling their thing an ‘AI agent’ in 2026 is going to look like every vendor that called their thing ‘cloud-native’ in 2015. By Q4 the term means nothing.

Previously

Self-Hosted AI with Jackson Oaks. Why 80% of AI Bills Are Wasted.

EP 05 · May 12, 2026 · 58 min · Guest Deep-Dive

On May 12, 2026, Jackson Oaks — founder of Recursion AI and the self-hosted AI platform Courier — joins the show. Jackson has been running open-source models in production for SMBs since early 2025, back when most of the industry still called local inference a hobbyist toy. Today his company runs production workloads on Apple Mac hardware, processes 600 to 800 thousand API calls a month per machine, and is profitable doing it.

His thesis: eighty percent of businesses have eighty percent of use cases that don't need Opus or GPT-5.5. They need a smaller open-source model, good systems, and predictable pricing. Frontier intelligence, or boring infrastructure? Per-token billing, or a flat rate on hardware you already own?

That's the cold open. Every fifth episode of Human In The Loop is a guest deep-dive — no Signal or Noise, no rotating segment, one topic, one expert. Oscar, Matt, and Jackson spend 58 minutes on what most AI bills are actually paying for, and close with three hot takes you won't hear anywhere else.

Guest Deep-Dive

What we cover with Jackson Oaks.

The 80/20 of open-source AI

Where local models win, where frontier models still matter, and how to tell which category your use case is in.

Three measurable ROI categories

Time bought back, margin protected on AI products, conversion lift on AI features — and which one is nearly impossible to measure honestly.

The Apple long-bet

Why M-series is the most underrated AI hardware story of the decade. 1/10th the cost of NVIDIA, 30 to 40x more power efficient. Apple has been positioning for this since the M1 dropped — the same week ChatGPT did.

The economics

600,000 to 800,000 API calls a month on a single Mac Studio. Flat-rate cloud pricing starting at $100/mo for 45K to 100K calls. The math on why predictable beats per-token for SMBs.
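
To make the flat-rate versus per-token comparison concrete, here is a minimal sketch of the arithmetic. It reads the quoted range as 45,000 to 100,000 calls for $100 a month. The function names, the tokens-per-call figure, and the per-token price are hypothetical inputs for illustration, not numbers from the episode.

```python
# Figures quoted in the episode notes: $100/mo flat for 45K to 100K calls.
FLAT_MONTHLY_USD = 100.0
CALLS_LOW, CALLS_HIGH = 45_000, 100_000


def flat_cost_per_call(monthly_usd: float, calls: int) -> float:
    """Effective cost of a single call under a flat monthly fee."""
    return monthly_usd / calls


def per_token_monthly_cost(
    calls: int, tokens_per_call: int, usd_per_million_tokens: float
) -> float:
    """Monthly bill under per-token pricing (all inputs are assumptions)."""
    return calls * tokens_per_call * usd_per_million_tokens / 1_000_000


def break_even_calls(
    monthly_usd: float, tokens_per_call: int, usd_per_million_tokens: float
) -> float:
    """Monthly call volume above which the flat fee is the cheaper option."""
    usd_per_call = tokens_per_call * usd_per_million_tokens / 1_000_000
    return monthly_usd / usd_per_call


if __name__ == "__main__":
    # Hypothetical workload: 2,000 tokens per call at $5 per million tokens.
    print(flat_cost_per_call(FLAT_MONTHLY_USD, CALLS_LOW))
    print(flat_cost_per_call(FLAT_MONTHLY_USD, CALLS_HIGH))
    print(per_token_monthly_cost(CALLS_LOW, 2_000, 5.0))
    print(break_even_calls(FLAT_MONTHLY_USD, 2_000, 5.0))
```

Under those assumed inputs, per-token billing works out to $0.01 a call, so the flat plan is cheaper above 10,000 calls a month, and its bill does not move when volume does. Plug in your own token counts and prices before drawing conclusions.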

The Deloitte story

A Fortune client paid Deloitte $400,000 for what turned out to be a GPT wrapper with an unoptimized RAG database. MIT says 95% of business AI pilots can't measure ROI. We have theories.

War stories

The infinite hallucination loop that ate 10,000 requests in a week. Building hallucination detection from scratch. The fine-tuned 14B model that outperformed GPT-4o on a real production task because the data was good and the task was narrow.

The boring stuff nobody talks about

Idempotency in non-deterministic systems. Observability. Redundancy. Why your AI feature needs the manual OCR backup you were about to delete.
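
One common way to get idempotency out of a non-deterministic step is to key each request and store the first result, so a retry returns the stored answer instead of asking the model again. The sketch below is an illustration under that assumption, not the approach described on the show. `IdempotentRunner` is a hypothetical name, and the in-memory dict stands in for a durable store.

```python
import hashlib
import json
from typing import Callable, Dict


class IdempotentRunner:
    """Runs a non-deterministic step at most once per idempotency key."""

    def __init__(self, step: Callable[[dict], str]) -> None:
        self._step = step
        # In-memory stand-in; a real system would use a durable store.
        self._results: Dict[str, str] = {}

    @staticmethod
    def key_for(request: dict) -> str:
        # Canonical JSON so that key order does not change the hash.
        payload = json.dumps(request, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, request: dict) -> str:
        key = self.key_for(request)
        if key not in self._results:
            # First time this request is seen: call the step and remember it.
            self._results[key] = self._step(request)
        # Retries get the stored result, even if the step would answer differently.
        return self._results[key]
```

The design choice is that the stored result, not the model, is the source of truth for a given request, which is what keeps a retry loop from multiplying calls.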

The vertical opportunity

Why the next wave of AI businesses isn't “AI for everything” — it's $200/mo agents for pressure washers, plumbers, landscapers, attorneys, accountants. Specialize down, charge real money, capture real value.

Hot takes

Three opinions, no disclaimers.

Matt

One of Anthropic or OpenAI buys the other — or merges with a foreign lab — inside three years. Or open-source overtakes both and they shrink to a fraction of their current valuations. Their burn rate is not sustainable.

Oscar

Every home will eventually have a Mac Mini running a better Alexa, powered by open-source models — not closed-source ones. That future is closer than people think.

Jackson

Better systems and structure with smaller open-source models beat throwing frontier models at unstructured problems. People are compensating for a lack of system with a bigger model. We're doing the opposite — we're building better systems so we can use smaller models.

Previously

9 Seconds to Delete a Database. Anthropic at $900B.

EP 04 · May 5, 2026 · 46 min · Ship It or Skip It

On April 24, 2026, an AI coding agent hit a credential mismatch in a staging environment. Nine seconds later, the entire production database of a company called PocketOS was gone. Backups too. The agent confessed: “I violated every principle I was given.” The model was Claude Opus 4.6.

Five days later, on April 29, Bloomberg and CNBC reported Anthropic was weighing offers at a $900 billion valuation. Roughly $50 billion in new capital. A round that would put the lab behind Claude above OpenAI as the most valuable AI company in the world. Two stories. Same week. Same lab. We call every one.

That's the cold open. From there, Oscar and Matt spend 46 minutes doing what this show does: filter the week's AI noise, debate three real business ideas, and close with two hot takes you won't hear anywhere else.

Signal or Noise

The week’s AI headlines, filtered.

  1. SIGNAL: Anthropic at $900B

    $50B raise on the table. Run-rate revenue at $30B. The Mythos withholding story we called in Episode 1 was the brand. The Google $40B in Episode 3 was the funding story. This is the receipt. Treat your model spend like an interest rate, not a software line item.

  2. SIGNAL: AI agent deletes PocketOS database

    Nine seconds. Backups gone. The agent went looking for an API token, found one in an unrelated file, and used it. The token was scoped for any operation, including destructive ones. Three holes in the swiss cheese. The agent walked through all of them.

  3. SIGNAL: China blocks Meta–Manus, Meta buys ARI

    Beijing killed Meta's $2B Singapore-routed AI deal on April 27. Meta acquired humanoid-robotics startup Assured Robot Intelligence on May 1. The thesis didn't change. The substrate did. Robotics is the next frontier-model fight.

  4. SIGNAL: Cloudflare and Stripe ship agent provisioning

    Agents can now create accounts, register domains, start paid subscriptions, and deploy apps with no human in the loop. Default spend cap is $100 per provider per month. The PocketOS story is the demand signal for spend caps and audit logs.

  5. SIGNAL: Chinese court rules AI-replacement layoff illegal

    The Hangzhou Intermediate People's Court upheld a ruling that Zhou, a QA supervisor verifying LLM outputs, couldn't be fired or pushed to a 40% pay cut just because AI took over his work. First major court ruling anywhere putting limits on AI-driven layoffs.
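
The spend caps and audit logs mentioned in item 4 can be sketched in a few lines. This is a hypothetical illustration of the idea, not Cloudflare's or Stripe's API. `SpendLedger` and `SpendCapExceeded` are made-up names, and the $100 default mirrors the per-provider monthly cap quoted above.

```python
from collections import defaultdict
from decimal import Decimal


class SpendCapExceeded(Exception):
    """Raised when a charge would push a provider past its monthly cap."""


class SpendLedger:
    """Tracks agent spend per provider per month and logs every attempt."""

    def __init__(self, cap_usd: Decimal = Decimal("100")) -> None:
        self.cap_usd = cap_usd
        self._spent = defaultdict(Decimal)  # (provider, "YYYY-MM") -> total
        self.audit_log = []  # every attempt is recorded, allowed or not

    def charge(
        self, provider: str, month: str, amount_usd: Decimal, reason: str
    ) -> Decimal:
        key = (provider, month)
        new_total = self._spent[key] + amount_usd
        allowed = new_total <= self.cap_usd
        # Log before enforcing, so denied attempts are visible too.
        self.audit_log.append((provider, month, amount_usd, reason, allowed))
        if not allowed:
            raise SpendCapExceeded(
                f"{provider} {month}: {new_total} exceeds cap {self.cap_usd}"
            )
        self._spent[key] = new_total
        return self.cap_usd - new_total  # remaining budget for this provider
```

Recording the denied attempts matters as much as blocking them: the log is what tells you an agent tried to overspend.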

Ship It or Skip It

Three real ideas from YC's RFS, no fence-sitting.

AI-Native Service Companies

Per Gustaf Alströmer's RFS. Don't sell SaaS to insurance brokers, accountants, or compliance teams. Become the broker. File the taxes. Run the audit. Charge services prices for AI-margin work. The wedge is bigger than the entire SaaS market. Pushback: you stop being software and start being ops, with E&O liability and license requirements baked in.

Company Brain

Per Tom Blomfield's RFS. The blocker to AI automation isn't the models anymore. It's domain knowledge scattered across heads, email, Slack, tickets, and databases. Build the layer that pulls fragmented company knowledge, keeps it current, and turns it into an executable skills file for agents. Pushback: this is wikis 5.0, and Microsoft and Notion eat the category as a feature in 18 months.

Counter-Swarm Defense

Per Tyler Bosmeny's RFS. $500K interceptors don't work against $500 drones. Build the Cloudflare-shaped layer for swarm defense: distributed sensors, software-first interceptors, autonomy-stack attacks. The thesis is real. The dual-use moral question is not optional.

Hot takes

Two opinions, no disclaimers.

Oscar

Anthropic at nine hundred billion dollars is the moment the labs stopped being labs. They're sovereign wealth funds with a research division attached. The Mythos withholding story we called in Episode 1 was the brand. The fundraise is the receipt. Price your model spend like an interest rate, not a software line item.

Matt

Every AI agent in production is one bad scope away from being the PocketOS story. The fix isn't better models. It's permissions, audit logs, and a human approval gate on every destructive operation. If your agent can drop a database, your agent will drop a database. The question isn't if. The question is on what week.
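
A human approval gate on destructive operations can be as small as a wrapper around whatever executes the agent's statements. The sketch below assumes SQL and a hypothetical `approve` callback that asks a person. `guarded_execute`, `ApprovalRequired`, and the keyword list are illustrative choices, not something described on the show.

```python
import re
from typing import Callable

# Hypothetical, deliberately conservative list of statement types
# treated as destructive. Extend it for your own database.
DESTRUCTIVE = re.compile(r"^\s*(DROP|TRUNCATE|DELETE|ALTER)\b", re.IGNORECASE)


class ApprovalRequired(Exception):
    """Raised when a destructive statement was not approved by a human."""


def guarded_execute(
    statement: str,
    execute: Callable[[str], str],
    approve: Callable[[str], bool],
) -> str:
    """Run a statement, but require human approval for destructive ones."""
    if DESTRUCTIVE.match(statement) and not approve(statement):
        raise ApprovalRequired(f"blocked: {statement.strip()}")
    return execute(statement)
```

Keyword matching like this is a convenience layer, not a security boundary. The credential's own scope still has to forbid destructive operations, which is the hole the PocketOS agent walked through.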

Previously

GPT-5.5 Beat Mythos. The Race Got Weirder.

EP 03 · Apr 28, 2026 · 49 min · Stack Check

On April 23, 2026, OpenAI shipped GPT-5.5, the model previously known as Spud. The same day, VentureBeat clocked it narrowly beating Anthropic's gated Claude Mythos Preview on Terminal-Bench 2.0. The day after, DeepSeek released V4, a 1.6 trillion parameter open-source model running natively on Huawei chips, at one-sixth the inference cost of the closed frontier. And Google announced it would invest up to forty billion dollars in Anthropic.

Two weeks ago we called GPT-5.5 noise. Now it's the lead story. The frontier is moving every six weeks, the cloud war is inside the labs, and the open-source ceiling jumped a tier on hardware your security team can't audit.

That's the cold open. From there, Oscar and Matt spend 49 minutes doing what this show does: filter the week's AI noise, review the tools they actually use, and close with two hot takes you won't hear anywhere else.

Signal or Noise

The week’s AI headlines, filtered.

  1. SIGNAL: GPT-5.5 (Spud) ships

    The first buyable frontier model to beat a gated frontier model on a public benchmark. Same week as the Google–Anthropic announcement. The two-tier story from Episode 2 just collided with the cadence story Fortune was writing all month.

  2. SIGNAL: DeepSeek V4

    1.6T total parameters, 1M context, MIT license, native on Huawei Ascend 950PR. The open-weights ceiling moved by a tier. Geopolitics moved into the model card.

  3. SIGNAL: Google + Anthropic, up to $40B

    Google has Gemini in-house and is still writing a forty billion dollar check. That tells you what they actually believe about the next 18 months.

  4. SIGNAL: Cognition at $25B

    Devin-in-Windsurf is the first time the agent and the IDE feel like one product, not two stitched together. Whether the price tag holds is a different question.

  5. NOISE: Grok 4.3 Beta at $300/month

    Video and slide generation, no cross-session memory. The price tag is the message.

Stack Check

Four tools we actually use.

Vercel Labs Agent Browser

Browser automation CLI built for agents. Persistent sessions, headless or real Chrome with your profile, annotated screenshots so a model can click “label 7” instead of guessing CSS selectors.

WorkOS (FGA + AuthKit)

The auth and permissions layer for agent surfaces. Sub-50ms p95 authorization checks. AI Installer in the CLI sets up auth in under five minutes. Last week's Vercel breach was a permissions story with a price tag attached.

AI SDK (ai-sdk.dev)

Define the agent once, swap the model in one line. The Gateway added GPT-5.5, DeepSeek V4, Kimi K2.6, and GPT Image 2 inside two weeks. The only honest way to evaluate this week's launches on real workloads.

Claude Code /remote-control

Run `claude remote-control` on your laptop, approve file changes from your phone, walk to lunch. Work stays local. The walkaway-and-approve workflow is the productivity win of the quarter.

Hot takes

Two opinions, no disclaimers.

Oscar

Open-source AI just won the cost war and lost the trust war in the same week. DeepSeek V4 is the best open weights anyone has ever shipped. It also runs on chips your security team can't audit.

Matt

OpenAI shipping GPT-5.5 two weeks after GPT-5.4 is the end of the model launch as a marketing event. From now on, the model layer is a software update, and the only thing that matters is what you ship on top of it.

Previously

The Second-Best Model You Can Actually Buy

EP 02 · Apr 21, 2026 · 47 min · No Jargon Required

On April 16, 2026, Anthropic shipped Claude Opus 4.7 with a 1M token context window at standard API pricing, a 128k max output, and a new “xhigh” effort level tuned for coding and agentic work. In the same announcement, Anthropic said on record that 4.7 is “less risky than Mythos,” and that Mythos stays gated to roughly 50 enterprise partners under Project Glasswing. So the flagship you can buy is, by Anthropic's own words, their second best.

Radical honesty, or the sharpest upsell in enterprise AI since cloud credits?

That's the cold open. From there, Oscar and Matt spend 47 minutes doing what this show does: filtering the week's AI noise, breaking down the concepts decision-makers actually need, and closing with two hot takes you won't hear anywhere else.

Signal or Noise

The week’s AI headlines, filtered.

  1. SIGNAL: Claude Opus 4.7

    1M context at standard pricing is the real story. A meaningful win for long-horizon agent workloads. And Anthropic just taught every enterprise buyer to ask, “what's the gated tier.”

  2. SIGNAL: OpenAI's updated Agents SDK

    Sandboxed agents as a primitive. Python first, TypeScript later. Good news for builders, and great news for OpenAI's runtime.

  3. SIGNAL: AI on the desktop

    Google's native Gemini macOS app plus Perplexity's Personal Computer. The OS is replacing the browser as the AI UX primitive.

  4. SIGNAL: Stanford's 2026 AI Index

    Only 10% of Americans are more excited than concerned about AI, versus 56% of experts. Your consumer product has a trust problem, not a capability problem.

  5. NOISE: Allbirds pivots to AI

    A struggling sneaker company rebrands as NewBird AI, sells the footwear business, drops its environmental charter, and the stock jumps 582%. The “pivot to AI” is the new “pivot to crypto.” Price your term sheets accordingly.

No Jargon Required

Three concepts decision-makers need to call this quarter.

Context Engineering

Prompt engineering is writing a good email. Context engineering is designing the whole office the recipient works in. When every company can access the same foundation models, the wiring is the moat.

Agentic AI

What an agent actually is, and why Gartner says 40% of enterprise agentic AI projects will be canceled by the end of 2027 because of cost, unclear value, and weak risk controls.

Frontier Model Tiering

The top AI labs now ship a public tier and a gated tier. The best model is probably one you cannot buy. How to build strategy around that reality.

Hot takes

Two opinions, no disclaimers.

Oscar

Context engineering isn't a new skill. It's ‘write good internal documentation’ rebranded so consultants can charge for it. The companies winning in 2026 had clean internal data in 2022.

Matt

Every executive asking ‘what's our agentic AI strategy’ in April 2026 is two years from being replaced by one who already shipped something.

Previously

The Model Anthropic Refused to Ship

EP 01 · Apr 14, 2026 · 47 min · Ship It or Skip It

On April 7, 2026, Anthropic unveiled Claude Mythos Preview — a frontier model they describe as a “step change” in capability — and then refused to release it. Instead: Project Glasswing, a gated program for Amazon, Apple, Microsoft, Cisco, CrowdStrike, and a handful of others. The first major withheld frontier model in roughly seven years.

Responsible scaling finally biting? Or enterprise GTM dressed up as safety PR?

That's the cold open. From there, Oscar and Matt spend 47 minutes doing what this show does: filtering the week's AI noise, debating real business ideas, and closing with two hot takes you won't hear anywhere else.

Signal or Noise

The week’s AI headlines, filtered.

  1. SIGNAL: Claude Mythos Preview

    Anthropic used Mythos to identify thousands of zero-days across every major OS and browser, then chose not to ship. If Mythos-wielding attackers find what small shops can't patch, the defender/attacker asymmetry just broke.

  2. SIGNAL: Meta Muse Spark

    First model from Meta Superintelligence Labs. Llama-4 midsize quality at ~10x less compute, shipping into Facebook, Instagram, WhatsApp, and Ray-Ban Meta glasses. If you build consumer or wearable UX, unit economics just moved.

  3. NOISE: OpenAI “Spud”

    Polymarket gives GPT-5.5/6 a 78% chance of shipping by April 30. Predictions aren't news. The real story is OpenAI's $122B raise.

  4. SIGNAL: The anti-distillation pact

    OpenAI, Anthropic, and Google coordinating against Chinese labs. IP defense or three competitors ganging up on a fourth because it's easier than competing on capability?

  5. SIGNAL: Anthropic × Coefficient Bio, $400M

    Horizontal labs are quietly going vertical. Life sciences. Cybersecurity. Legal next? Finance? Pick your domain before the labs pick it for you.

Ship It or Skip It

Three real ideas, no fence-sitting.

“Bring AI to small businesses”

Via My First Million Ep 811. Productized 30-day AI-ops install for HVAC, dental, property management. Software margins, services delivery. But the real question: are you the next picks-and-shovels play, or just another consultant in a trench coat?

Vertical AI agent for insurance claims

Per-claim pricing instead of per-seat. Eats into labor P&L, not software budget. But insurance = 18-month sales cycles. You die of starvation before your first logo unless you have an insider co-founder.

“Lovable-for-X”

Clone the $200M ARR playbook for a regulated vertical — legal ops, clinical workflows, HR. Compliance moat is real. But is it also the ceiling on product velocity?

Hot takes

Two opinions, no disclaimers.

Oscar

Anthropic withholding Mythos is the new moat. Safety-as-marketing is about to become the dominant frontier-lab playbook — because nobody wants to be the lab that shipped the model that broke the internet.

Matt

The “AI for small businesses” gold rush is 90% consultants in trench coats. The real money isn't selling AI to SMBs — it's selling picks-and-shovels to the 10,000 consultants who are.

New episodes every week. Reply with the story you want us on next.

Subscribe

Pick your platform.

YouTube: Watch weekly · Spotify: Listen on the go