Human In The Loop · Weekly

A human-first AI podcast.

Real talk. Unpopular opinions. No hype. Two builders cut through the AI hype cycle every week — and call it like it is.

Watch on YouTube · Listen on Spotify

Hosted by Oscar Gallo & Matt Wozniak

Now playing · Latest episode

Apple Sues OpenAI and Meta Ships Muse Spark 1.1

EP 14 · Jul 14, 2026 · 92 min

About

We’re the humans in the loop.

In machine learning, “human in the loop” means a human who provides oversight and feedback in an automated system. That’s the lens of this show: AI is powerful, but humans aren’t leaving the loop. Not yet. Maybe not ever.

This isn’t another “AI is going to change everything” podcast. It’s for builders, operators, and the AI-curious who are tired of breathless hype, doomerism, and surface-level news recaps with no original thought.

What’s in each episode

Every week: one fixed, one rotating.

FIXED

Signal or Noise

We run through the week's AI headlines and make the call: is this actual signal worth paying attention to, or just noise clogging the timeline?

ROTATING

Ship It or Skip It

A real use case or product idea lands on the table. We debate whether it's worth building now or if the tech isn't there yet.

ROTATING

Explain It to My Client

Take a complex AI concept and explain it the way you'd actually explain it to a non-technical stakeholder. No jargon allowed.

ROTATING

Stack Check

One tool, library, or workflow change we actually adopted this week. No sponsorship energy. Just what's in the trenches.

Hosts

Oscar Gallo

AI Engineer & Entrepreneur

Oscar lives at the intersection of engineering and business. He builds AI products, ships code, and runs companies — the daily grind of making AI work in the real world.

Matt Wozniak

Builder, Operator, Relentless Executor

Matt is the operator's operator. He builds, he ships, he scales. His lens is execution: what works, what doesn't, and why most people are thinking about it wrong.

Latest episode

Apple Sues OpenAI and Meta Ships Muse Spark 1.1

EP 14 · Jul 14, 2026 · 92 min · No Jargon Required

Apple says OpenAI built its hardware program with stolen trade secrets. OpenAI denies any interest in Apple's secrets. We make the Signal or Noise call.

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · Apple sues OpenAI over alleged trade-secret theft
SIGNAL · Meta releases Muse Spark 1.1 and opens the Meta Model API to developers
SIGNAL · OpenAI releases GPT-5.6 across ChatGPT, Codex, and the API
SIGNAL · SpaceXAI releases Grok 4.5 for coding and agentic work
SIGNAL · Anthropic and OpenAI trade usage resets during the GPT-5.6 launch

No Jargon Required

We explain two terms a decision-maker needs this week.

Multi-agent orchestration: when several AI workers help and when they multiply failure

Model routing: how a product switches among fast, cheap, and capable models
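
For readers who want to see the shape of model routing, here is a minimal sketch. It is not how any particular product does it: the tier names, prices, and the keyword heuristic in `estimate_difficulty` are all hypothetical, chosen only to show the idea of sending each request to the cheapest model that can plausibly handle it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical label, not a real model ID
    cost_per_1k_tokens: float  # illustrative price, not a real quote
    max_difficulty: int        # hardest task level (1-3) this tier should take


# Ordered cheapest first so the router can stop at the first tier that fits.
TIERS = [
    ModelTier("fast-small", 0.10, 1),
    ModelTier("mid-general", 1.00, 2),
    ModelTier("frontier", 10.00, 3),
]


def estimate_difficulty(prompt: str) -> int:
    """Crude stand-in for a real classifier: keywords and length only."""
    text = prompt.lower()
    if any(word in text for word in ("prove", "architecture", "multi-step", "refactor")):
        return 3
    if len(prompt.split()) > 50 or "explain" in text:
        return 2
    return 1


def route(prompt: str) -> ModelTier:
    """Return the cheapest tier whose ceiling covers the estimated difficulty."""
    difficulty = estimate_difficulty(prompt)
    for tier in TIERS:
        if tier.max_difficulty >= difficulty:
            return tier
    return TIERS[-1]
```

In practice the heuristic would likely be a small classifier or a cheap model call, but the routing decision itself stays this simple: estimate, then pick the lowest tier that clears the bar.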

Hot takes

Two opinions, no disclaimers.

Oscar
“The Apple lawsuit is a warning for every AI company hiring from a competitor. Talent does not arrive empty-handed. If your onboarding process does not separate experience from confidential material, your product roadmap can become evidence.”

Matt
“I don't care if your model is as good as Fable. If it's close, and it's faster, and it's cheaper — it wins. The whole industry is selling you a god-tier model like it's the finish line, but with orchestration and model routing, you don't need one genius doing everything. You need a smart conductor and a cheap, fast bench. The frontier model is becoming the part you use least — and the teams still paying premium tokens for every keystroke are going to feel really dumb in a year.”

Previously

Meta Compute, OpenAI Equity, and AI Startup Bets

EP 13 · Jul 7, 2026 · 73 min · Ship It or Skip It

Meta spent so much on AI that it may have backed into becoming a cloud company. Smart move, or capex panic? We make the call.

Welcome to Human In the Loop, a weekly podcast where two builders talk about what matters in AI. No hype. No doomerism. No recaps for the sake of recaps.

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · Meta building a cloud business for excess AI compute
SIGNAL · OpenAI reportedly proposing to give 5% equity to a U.S. sovereign wealth fund
NOISE · Claude Sonnet 5 shipping across Claude, Claude Code, and the Claude Platform, which we call noise until it changes builder behavior
SIGNAL · Together AI raising $800M for open-source AI infrastructure
SIGNAL · Claude Science launching as a research workbench

Ship It or Skip It

We debate two AI business ideas.

Meta Compute broker for AI teams

Back-office agent for small law and accounting firms

Hot takes

Two opinions, no disclaimers.

Oscar
“The model race is becoming an infrastructure race again.”

Matt
“Most AI workbench products are selling relief from tool chaos.”

Previously

The Frontier Just Got a Guest List

EP 12 · Jun 30, 2026 · 106 min · No Jargon Required

OpenAI just shipped its most powerful model to twenty companies the government picked. Not the best twenty. The approved twenty. A week after the US switched off Anthropic's best model, the frontier has a guest list, and you are not on it. We figure out what that does to anyone trying to build on the newest model.

Welcome to Human In the Loop, a weekly podcast where two builders cut through the AI hype cycle. No breathless hype. No doomerism. No surface-level recaps.

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · GPT-5.6 ships under US government restrictions
OpenAI began a limited preview of GPT-5.6 — Sol, Terra, and Luna — to roughly twenty pre-approved organizations at the US government's request, the first major model to ship under the June 2 executive order. Last week Anthropic went dark by force. This week OpenAI went gated by choice. Two labs, two weeks, same direction.
SIGNAL · Anthropic accuses Alibaba's Qwen of the largest distillation attack on record
Anthropic's complaint to Senators Warren and Scott alleges Alibaba's Qwen lab ran about 25,000 fraudulent accounts and 28.8 million exchanges against Claude between April 22 and June 5, aimed at software engineering and agentic reasoning. That is roughly 1.7x the combined total it attributed to DeepSeek, Moonshot, and MiniMax in February, and the first time it has named a major Chinese conglomerate.
SIGNAL · OpenAI and Broadcom unveil Jalapeño, OpenAI's first inference chip
OpenAI's first custom chip, built with Broadcom for inference, went from first design to tape-out in nine months with help from OpenAI's own models — the companies call it the fastest ASIC cycle ever. Early claims cite performance-per-watt well above the current state of the art, with first deployment targeted for the end of 2026.
SIGNAL · Agentjacking: the prompt-injection hole in every coding agent
A new attack class hijacks coding agents like Claude Code, Cursor, and Codex by hiding instructions inside data they already trust, such as a Sentry error report pulled in over MCP. There is no universal patch — the fix is to treat incoming data as untrusted and keep a human review step before the agent acts. OWASP says prompt injection is up 340% year over year.
NOISE · ChatGPT slips below 50% as Gemini and Claude surge
Reports put ChatGPT at 46.4% of the assistant market, its first time below 50% since launch, with Gemini at 27.7% and Claude at 10.3%. Mostly a distribution story — Gemini's climb is Android OS-level placement. Noise for builders, except the one real line: Claude roughly quadrupled its monthly users in five months on the back of agentic coding.

No Jargon Required

Two AI words an exec heard this week and couldn't define on the spot, in plain English.

Distillation

Teaching a small, cheap model by having it copy a big, expensive one's answers. The word at the center of the Alibaba–Claude fight, and why some cheap models punch far above their price. The real question is provenance, not the technique.

Prompt injection

A model reads instructions and data as one stream of words, so commands hidden in the data get obeyed like orders. Researchers say it may not be patchable — the defense is design: read-only by default, a human in front of anything you can't undo.
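
The "defense is design" point can be sketched in a few lines. This is an illustration under stated assumptions, not a vetted security control: the tool names and the `approve` callback are hypothetical, and a real agent framework would wire the gate into its own tool-dispatch layer.

```python
from typing import Callable

# Hypothetical tool names, for illustration only.
READ_ONLY_TOOLS = {"read_file", "search_logs"}
IRREVERSIBLE_TOOLS = {"delete_file", "drop_table", "send_payment"}


def gate_tool_call(tool: str, approve: Callable[[str], bool]) -> str:
    """Decide whether an agent's requested tool call may run.

    Read-only tools run freely. Irreversible tools run only if a human
    says yes via `approve`. Anything not listed is denied, so a tool an
    injected instruction invents never executes.
    """
    if tool in READ_ONLY_TOOLS:
        return "allowed"
    if tool in IRREVERSIBLE_TOOLS:
        return "allowed" if approve(tool) else "blocked"
    return "blocked"
```

The gate does not try to detect the injection. It assumes hidden instructions will sometimes get through and limits what they can do without a person signing off.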

Hot takes

Two opinions, no disclaimers.

Oscar
“The frontier is being quietly nationalized, and most builders are cheering it as safety. Build on the model you can actually keep.”

Matt
“Everyone panicking about distillation forgot the lesson. The frontier has no moat. Stop betting your company on a six-week lead anyone can clone.”

Previously

The World Cup Is Secretly Run by AI

EP 11 · Jun 23, 2026 · 104 min · Ship It or Skip It

Last week the US government gave Anthropic's best model a 72-hour public life, then switched it off for every foreign national on earth. This week, three Chinese labs shipped models that beat almost everything in the open, handed over the weights, and charged a tenth of the price. We figure out which of those two plays ends with the whole world building on your stack.

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · FIFA is running the 2026 World Cup on AI, the biggest live deployment on earth
An Intelligent Command Center in Miami tied to digital twins of all 16 stadiums, Football AI Pro built with Lenovo on FIFA's own Football Language Model and handed to all 48 teams, 1,248 player avatars from one-second scans, and Gemini drawing up tactics for Argentina, Brazil, and France. Multi-vendor, multi-model, real-time, under a five-second latency budget — the reference architecture nobody asked for.
SIGNAL · China's open-weight wave: GLM-5.2 and Kimi K2.7 Code land
Z.ai's GLM-5.2 tops the open-weight leaderboard with a 1M-token window and MIT weights; Moonshot's Kimi K2.7 Code ships open and tuned for agentic coding. Chinese labs now hold four of the top five open-weight spots at a fraction of Western pricing. The model you self-host is the only one Commerce can't switch off.
SIGNAL · Anthropic's Fable 5 and Mythos 5 are still dark a week after the export ban
A full week after the June 12 Commerce directive, both models stay suspended for foreign nationals with no end date. Anthropic published its defense on June 16, arguing a recall over a narrow jailbreak would halt every frontier deployment. This was the closed-frontier fire drill, and most teams failed it.
SIGNAL · Noam Shazeer leaves Google for OpenAI
The co-author of "Attention Is All You Need" is gone again, two years after Google paid a reported $2.7B to bring him back into DeepMind. When models converge, talent is the only moat left — and the gravity just pulled toward OpenAI.
NOISE · The International AI Safety Report 2026 drops
A government-commissioned synthesis meant to set a shared baseline regulators reference, landing the same month the EU AI Act becomes fully applicable. It doesn't regulate anything — it's the document the regulations will point at. Noise for builders, for now.

Ship It or Skip It

Two real ideas from this year's agent discourse, steelman versus pushback.

AI governance layer for the agents your employees already run

The referee between an employee's agent and the action — block the refund agent pushing $40k against a $500 policy before it fires. Own the rules engine, not the dashboard.

Vertical AI harnesses, one trade at a time

Stop selling a chat box, sell the opinionated rig for one trade like marketing or video. Brief in, on-brand output out. Win the domain by being the tool people open every morning, not the model underneath.
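
To make the first idea concrete, the refund example from the governance-layer pitch can be sketched as a tiny rules check that sits between an agent and the action. The `Action` shape, the `POLICY_LIMITS_USD` table, and the deny-when-no-policy default are assumptions for illustration; only the $500 limit and the $40k refund come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    agent: str         # which agent is asking (hypothetical field)
    kind: str          # e.g. "refund"
    amount_usd: float


# Hypothetical policy table; the $500 refund limit mirrors the example above.
POLICY_LIMITS_USD = {"refund": 500.0}


def check(action: Action) -> tuple[bool, str]:
    """Return (allowed, reason) before the action is allowed to fire."""
    limit = POLICY_LIMITS_USD.get(action.kind)
    if limit is None:
        # Assumption: actions with no written policy are denied by default.
        return False, f"no policy for action kind '{action.kind}'"
    if action.amount_usd > limit:
        return False, f"{action.kind} of ${action.amount_usd:,.0f} exceeds ${limit:,.0f} limit"
    return True, "within policy"
```

Calling `check(Action("refund-agent", "refund", 40_000.0))` returns a denial with the reason attached, which is the piece an audit log would record.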

Hot takes

Two opinions, no disclaimers.

Oscar
“The most important AI lab of 2026 might not be American, and most US builders are too proud to notice.”

Matt
“A free model is only free if you own the GPUs, the ops team, and the eval harness to babysit it.”

Previously

Anthropic's Best Model Got Banned in 72 Hours

EP 10 · Jun 16, 2026 · 66 min · Stack Check

Anthropic shipped the most powerful AI ever released to the public on June 9. By June 12 the US government made them pull it back from every foreign national on earth, including Anthropic's own engineers. The most capable model on the planet had a 72-hour public life. We unpack who actually won that week, because it was not the people who got the model.

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · Anthropic releases Claude Fable 5 and Mythos 5 at half the old frontier price, five days after calling for a global AI pause. We read the contradiction.
SIGNAL · The Trump administration places Fable 5 and Mythos 5 under export controls after a company jailbreaks Mythos. Anthropic disables access for all customers worldwide to comply, then disputes the order.
NOISE · OpenAI files confidentially for its IPO at a reported $730 to $850 billion, one week after Anthropic filed at $965 billion. Two unprofitable labs, public markets, same week.
SIGNAL · OpenAI wires its models and Codex into Oracle Universal Credits, deleting the procurement step for every Oracle enterprise customer.

Stack Check

AI agents in production: the three layers you actually need to run an agent unattended.

Claude Agent SDK

The agent loop you do not have to write.

LangGraph plus LangSmith

Orchestration you can audit.

Temporal

Durable execution so a crash is not a restart.

Hot takes

Two opinions, no disclaimers.

Oscar
“The Mythos ban is the best thing that ever happened to open weights. If the Commerce Department can switch off your stack on a Friday, you are renting, not building.”

Matt
“Everyone watched the model and missed the money. OpenAI spent the week removing every reason a CFO can say no.”

Previously

The Government Gets Your AI 30 Days Early + The Token Bill

EP 09 · Jun 9, 2026 · 99 min · No Jargon Required

The White House just signed an order that lets the federal government test the most powerful AI models up to 30 days before you can touch them. The labs said yes. Real national security, or the backdoor frontier labs were begging for? We make the call.

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · Trump's frontier-model executive order, a voluntary 30-day government early-access window, with the NSA deciding which models qualify
SIGNAL · GitHub Copilot flips to token billing on June 1, and developers report bills jumping 10x to 50x
SIGNAL · Microsoft's MAI-Code-1-Flash and Google's $100 dev tier crash the AI coding party Anthropic has owned
SIGNAL · ChatGPT crosses 1 billion monthly users, and why the real story is Claude growing 640%
SIGNAL · Anthropic calls for a global pause on frontier AI, and the 80% of its own code now written by Claude

No Jargon Required

The three words on every AI invoice, explained for anyone who signs off on the bill.

Tokens, and why your AI bill went usage-based

Inference vs training, and what "inference-efficient" actually buys you

Evals and benchmarks, and why the leaderboard is lying to you

Hot takes

Two opinions, no disclaimers.

Oscar
“The executive order is not about safety. It is a distribution deal with a 30-day government waiting room on every frontier release.”

Matt
“The Copilot billing meltdown is the best thing to happen to engineering leaders this year. The meter was always running. Now you can see it.”

Previously

Anthropic's "Too Dangerous" Model Is Coming for Everyone

EP 08 · Jun 2, 2026 · 123 min · Ship It or Skip It

Six weeks ago Anthropic said a model was too dangerous to ship. This week they shipped Opus 4.8, called it near-Mythos, and said Mythos-class models reach every customer in the coming weeks. Was it ever that dangerous? We make the call.

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · Claude Opus 4.8, near-Mythos alignment, 3x cheaper fast mode, and Dynamic Workflows that fan one model across up to 1,000 subagents
SIGNAL · Anthropic passes OpenAI, a new round values it near $1 trillion, plus a projected first operating profit on $10.9B revenue
SIGNAL · Cognition raises $1B at $26B, Devin's autonomous coding agent hits a ~$492M run rate
SIGNAL · A poisoned Nx Console VS Code extension stole Claude Code configs and helped clone 3,800 of GitHub's internal repos
SIGNAL · Pope Leo XIV's first encyclical takes on AI, signal for regulated buyers, noise for your standup

Ship It or Skip It

Three real ideas from this month's builder discourse.

The Company Brain

My First Million Ep 822

Voice AI agent for the trades

Avoca's $125M raise

Stripe for AI agents

Circle Agent Stack

Hot takes

Two opinions, no disclaimers.

Oscar
“‘Too dangerous to ship’ was never a safety call. It was a 90-day enterprise head start.”

Matt
“Devin at $26B is the top, not the floor. An agent lab with no distribution is the WeWork of 2026.”

Previously

Cursor's $0.50 Coding Model + Anthropic's $900B Round

EP 07 · May 26, 2026 · 98 min · Stack Check

Cursor just shipped a coding model that matches Claude Opus 4.7 at one tenth the price, built on an open Chinese base model. The same week, Anthropic is closing a $30B round at a $900B valuation. If frontier coding got commoditized in seven days, what is anyone paying premium for?

Welcome to Human In the Loop, a weekly podcast where two builders cut through the AI hype cycle. No breathless hype. No doomerism. No surface-level recaps. Honest, opinionated conversations from people actually building with AI.

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · Cursor Composer 2.5, top-three coding agent at one tenth the cost of Opus
SIGNAL · Andrej Karpathy joins Anthropic's pre-training team
SIGNAL · Anthropic in talks for a $30B round at a $900B valuation
SIGNAL · Google I/O 2026, Gemini Spark agent and Antigravity 2.0
NOISE · Hark's $700M Series A for a "secret" AI device, and why we are calling it noise for now

Stack Check

Three AI dev tools we actually adopted in the last 30 days: what each one replaces, why it matters, and whether we would actually keep using it.

Matt Pocock's `skills` repo, Skill Handoff

A real replacement for `/compact`.

Convex (convex.dev), TypeScript-native backend built for agentic workflows

Google Antigravity 2.0 + CLI + Managed Agents API

Hot takes

Two opinions, no disclaimers.

Oscar
“Cursor Composer 2.5 just proved the open base model thesis. Coding-as-a-service is about to fall off a cliff in price.”

Matt
“Karpathy joining Anthropic is worth more than the $900B valuation as a signal. Talent is the only moat the market still underprices.”

Previously

Anthropic Killed the AI-for-SMB Playbook + AI Agents 101

EP 06 · May 19, 2026 · 92 min · No Jargon Required

Three months ago on Episode 1, we asked if there was a real business in selling AI to small businesses. This week, Anthropic shipped it themselves. So we just watched the founder playbook get eaten by a frontier lab in 90 days. If you were six weeks into building that company, what do you do Monday?

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · Anthropic ships Claude for Small Business
15 agentic workflows, QuickBooks + HubSpot + Microsoft 365 connectors, 10-city US tour.
SIGNAL · Thinking Machines previews “Interaction Models”
276B MoE, 0.4s full-duplex response, Mira Murati throws shade at every other lab's voice stack.
SIGNAL · Sierra raises $950M at $15B+ valuation
$150M ARR, 40% of the Fortune 50 paying for customer agents.
SIGNAL · Trump administration reverses on AI oversight
Driven by national-security fears around Anthropic's Mythos model.
SIGNAL · Cloudflare cuts 1,100 jobs and blames AI
On the same morning it posts record revenue — the first big “AI did it” layoff that isn't a cover for bad numbers.

No Jargon Required

Three AI concepts every exec and operator keeps hearing this quarter, in plain English.

AI Agent

The word every vendor abuses — and how to tell the real ones from the chatbots in costume.

Connectors and MCP

What “AI plugged into your tools” actually requires.

Full-Duplex Voice (Interaction Models)

What to ask when a vendor pitches a voice agent.

Hot takes

Two opinions, no disclaimers.

Oscar
“Anthropic just turned every AI consultant into a reseller. By Q3, the model providers own the workflow layer for every regulated and semi-regulated vertical.”

Matt
“Every vendor calling their thing an ‘AI agent’ in 2026 is going to look like every vendor that called their thing ‘cloud-native’ in 2015. By Q4 the term means nothing.”

Previously

Self-Hosted AI with Jackson Oaks. Why 80% of AI Bills Are Wasted.

EP 05 · May 12, 2026 · 58 min · Guest Deep-Dive

On May 12, 2026, Jackson Oaks — founder of Recursion AI and the self-hosted AI platform Courier — joins the show. Jackson has been running open-source models in production for SMBs since early 2025, back when most of the industry still called local inference a hobbyist toy. Today his company runs production workloads on Apple Mac hardware, processes 600 to 800 thousand API calls a month per machine, and is profitable doing it.

His thesis: eighty percent of businesses have eighty percent of use cases that don't need Opus or GPT-5.5. They need a smaller open-source model, good systems, and predictable pricing. Frontier intelligence, or boring infrastructure? Per-token billing, or a flat rate on hardware you already own?

That's the cold open. Every fifth episode of Human In The Loop is a guest deep-dive — no Signal or Noise, no rotating segment, one topic, one expert. Oscar, Matt, and Jackson spend 58 minutes on what most AI bills are actually paying for, and close with three hot takes you won't hear anywhere else.

Guest Deep-Dive

What we cover with Jackson Oaks.

The 80/20 of open-source AI

Where local models win, where frontier models still matter, and how to tell which category your use case is in.

Three measurable ROI categories

Time bought back, margin protected on AI products, conversion lift on AI features — and which one is nearly impossible to measure honestly.

The Apple long-bet

Why M-series is the most underrated AI hardware story of the decade. 1/10th the cost of NVIDIA, 30 to 40x more power efficient. Apple has been positioning for this since the M1 dropped — the same week ChatGPT did.

The economics

600,000 to 800,000 API calls a month on a single Mac Studio. Flat-rate cloud pricing starting at $100/mo for 45,000 to 100,000 calls. The math on why predictable beats per-token for SMBs.

The Deloitte story

A Fortune client paid Deloitte $400,000 for what turned out to be a GPT wrapper with an unoptimized RAG database. MIT says 95% of business AI pilots can't measure ROI. We have theories.

War stories

The infinite hallucination loop that ate 10,000 requests in a week. Building hallucination detection from scratch. The fine-tuned 14B model that outperformed GPT-4o on a real production task because the data was good and the task was narrow.

The boring stuff nobody talks about

Idempotency in non-deterministic systems. Observability. Redundancy. Why your AI feature needs the manual OCR backup you were about to delete.

The vertical opportunity

Why the next wave of AI businesses isn't “AI for everything” — it's $200/mo agents for pressure washers, plumbers, landscapers, attorneys, accountants. Specialize down, charge real money, capture real value.
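
One of the "boring" items above, idempotency in non-deterministic systems, is easier to picture with a sketch. This is a generic idempotency-key pattern, not a description of Jackson's platform: the in-memory `_results` store and the `model_call` parameter are hypothetical stand-ins for a durable store and a real model client.

```python
import hashlib
import json
from typing import Callable

# In-memory store for illustration; a real system would use a durable one.
_results: dict[str, str] = {}


def idempotency_key(operation: str, payload: dict) -> str:
    """Derive a stable key from the request, independent of dict key order."""
    canonical = json.dumps({"op": operation, "payload": payload}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_once(operation: str, payload: dict, model_call: Callable[[dict], str]) -> str:
    """Call the model at most once per distinct request.

    A model may answer differently on every call, so a retry must reuse the
    stored answer instead of asking again and acting on a second result.
    """
    key = idempotency_key(operation, payload)
    if key not in _results:
        _results[key] = model_call(payload)
    return _results[key]
```

With this in place, a retried request after a timeout or crash returns the first answer rather than producing a new, possibly conflicting one.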

Hot takes

Two opinions, no disclaimers.

Matt
“One of Anthropic or OpenAI buys the other — or merges with a foreign lab — inside three years. Or open-source overtakes both and they shrink to a fraction of their current valuations. Their burn rate is not sustainable.”

Oscar
“Every home will eventually have a Mac Mini running a better Alexa, powered by open-source models — not closed-source ones. That future is closer than people think.”

Jackson
“Better systems and structure with smaller open-source models beat throwing frontier models at unstructured problems. People are compensating for a lack of system with a bigger model. We're doing the opposite — we're building better systems so we can use smaller models.”

Previously

9 Seconds to Delete a Database. Anthropic at $900B.

EP 04 · May 5, 2026 · 46 min · Ship It or Skip It

On April 24, 2026, an AI coding agent hit a credential mismatch in a staging environment. Nine seconds later, the entire production database of a company called PocketOS was gone. Backups too. The agent confessed: “I violated every principle I was given.” The model was Claude Opus 4.6.

Five days later, on April 29, Bloomberg and CNBC reported Anthropic was weighing offers at a $900 billion valuation. Roughly $50 billion in new capital. A round that would put the lab behind Claude above OpenAI as the most valuable AI company in the world. Two stories. Same week. Same lab. We call every one.

That's the cold open. From there, Oscar and Matt spend 46 minutes doing what this show does: filter the week's AI noise, debate three real business ideas, and close with two hot takes you won't hear anywhere else.

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · Anthropic at $900B
$50B raise on the table. Run-rate revenue at $30B. The Mythos withholding story we called in Episode 1 was the brand. The Google $40B in Episode 3 was the funding story. This is the receipt. Treat your model spend like an interest rate, not a software line item.
SIGNAL · AI agent deletes PocketOS database
Nine seconds. Backups gone. The agent went looking for an API token, found one in an unrelated file, and used it. The token was scoped for any operation, including destructive ones. Three holes in the swiss cheese. The agent walked through all of them.
SIGNAL · China blocks Meta–Manus, Meta buys ARI
Beijing killed Meta's $2B Singapore-routed AI deal on April 27. Meta acquired humanoid-robotics startup Assured Robot Intelligence on May 1. The thesis didn't change. The substrate did. Robotics is the next frontier-model fight.
SIGNAL · Cloudflare and Stripe ship agent provisioning
Agents can now create accounts, register domains, start paid subscriptions, and deploy apps with no human in the loop. Default spend cap is $100 per provider per month. The PocketOS story is the demand signal for spend caps and audit logs.
SIGNAL · Chinese court rules AI-replacement layoff illegal
The Hangzhou Intermediate People's Court upheld a ruling that Zhou, a QA supervisor verifying LLM outputs, couldn't be fired or pushed to a 40% pay cut just because AI took over his work. First major court ruling anywhere putting limits on AI-driven layoffs.

Ship It or Skip It

Three real ideas from YC's RFS, no fence-sitting.

AI-Native Service Companies

Per Gustaf Alströmer's RFS. Don't sell SaaS to insurance brokers, accountants, or compliance teams. Become the broker. File the taxes. Run the audit. Charge services prices for AI-margin work. The wedge is bigger than the entire SaaS market. Pushback: you stop being software and start being ops, with E&O liability and license requirements baked in.

Company Brain

Per Tom Blomfield's RFS. The blocker to AI automation isn't the models anymore. It's domain knowledge scattered across heads, email, Slack, tickets, and databases. Build the layer that pulls fragmented company knowledge, keeps it current, and turns it into an executable skills file for agents. Pushback: this is wikis 5.0, and Microsoft and Notion eat the category as a feature in 18 months.

Counter-Swarm Defense

Per Tyler Bosmeny's RFS. $500K interceptors don't work against $500 drones. Build the Cloudflare-shaped layer for swarm defense: distributed sensors, software-first interceptors, autonomy-stack attacks. The thesis is real. The dual-use moral question is not optional.

Hot takes

Two opinions, no disclaimers.

Oscar
“Anthropic at nine hundred billion dollars is the moment the labs stopped being labs. They're sovereign wealth funds with a research division attached. The Mythos withholding story we called in Episode 1 was the brand. The fundraise is the receipt. Price your model spend like an interest rate, not a software line item.”

Matt
“Every AI agent in production is one bad scope away from being the PocketOS story. The fix isn't better models. It's permissions, audit logs, and a human approval gate on every destructive operation. If your agent can drop a database, your agent will drop a database. The question isn't if. The question is on what week.”

Previously

GPT-5.5 Beat Mythos. The Race Got Weirder.

EP 03 · Apr 28, 2026 · 49 min · Stack Check

On April 23, 2026, OpenAI shipped GPT-5.5, the model previously known as Spud. The same day, VentureBeat clocked it narrowly beating Anthropic's gated Claude Mythos Preview on Terminal-Bench 2.0. The day after, DeepSeek released V4, a 1.6 trillion parameter open-source model running natively on Huawei chips, at one-sixth the inference cost of the closed frontier. And Google announced it would invest up to forty billion dollars in Anthropic.

Two weeks ago we called GPT-5.5 noise. Now it's the lead story. The frontier is moving every six weeks, the cloud war is inside the labs, and the open-source ceiling jumped a tier on hardware your security team can't audit.

That's the cold open. From there, Oscar and Matt spend 49 minutes doing what this show does: filter the week's AI noise, review the tools they actually use, and close with two hot takes you won't hear anywhere else.

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · GPT-5.5 (Spud) ships
The first buyable frontier model to beat a gated frontier model on a public benchmark. Same week as the Google–Anthropic announcement. The two-tier story from Episode 2 just collided with the cadence story Fortune was writing all month.
SIGNAL · DeepSeek V4
1.6T total parameters, 1M context, MIT license, native on Huawei Ascend 950PR. The open-weights ceiling moved by a tier. Geopolitics moved into the model card.
SIGNAL · Google + Anthropic, up to $40B
Google has Gemini in-house and is still writing a forty billion dollar check. That tells you what they actually believe about the next 18 months.
SIGNAL · Cognition at $25B
Devin-in-Windsurf is the first time the agent and the IDE feel like one product, not two stitched together. Whether the price tag holds is a different question.
NOISE · Grok 4.3 Beta at $300/month
Video and slide generation, no cross-session memory. The price tag is the message.

Stack Check

Four tools we actually use.

Vercel Labs Agent Browser

Browser automation CLI built for agents. Persistent sessions, headless or real Chrome with your profile, annotated screenshots so a model can click “label 7” instead of guessing CSS selectors.

WorkOS (FGA + AuthKit)

The auth and permissions layer for agent surfaces. Sub-50ms p95 authorization checks. AI Installer in the CLI sets up auth in under five minutes. Last week's Vercel breach was a permissions story with a price tag attached.

AI SDK (ai-sdk.dev)

Define the agent once, swap the model in one line. The Gateway added GPT-5.5, DeepSeek V4, Kimi K2.6, and GPT Image 2 inside two weeks. The only honest way to evaluate this week's launches on real workloads.

Claude Code /remote-control

Run claude remote-control on your laptop, approve file changes from your phone, walk to lunch. Work stays local. The walkaway-and-approve workflow is the productivity win of the quarter.

Hot takes

Two opinions, no disclaimers.

Oscar
“Open-source AI just won the cost war and lost the trust war in the same week. DeepSeek V4 is the best open weights anyone has ever shipped. It also runs on chips your security team can't audit.”

Matt
“OpenAI shipping GPT-5.5 two weeks after GPT-5.4 is the end of the model launch as a marketing event. From now on, the model layer is a software update, and the only thing that matters is what you ship on top of it.”

Previously

The Second-Best Model You Can Actually Buy

EP 02 · Apr 21, 2026 · 47 min · No Jargon Required

On April 16, 2026, Anthropic shipped Claude Opus 4.7 with a 1M-token context window at standard API pricing, a 128k max output, and a new “xhigh” effort level tuned for coding and agentic work. In the same announcement, Anthropic said on the record that 4.7 is “less risky than Mythos,” and that Mythos stays gated to roughly 50 enterprise partners under Project Glasswing. So the flagship you can buy is, in Anthropic's own words, their second-best.

Radical honesty, or the sharpest upsell in enterprise AI since cloud credits?

That's the cold open. From there, Oscar and Matt spend 45 minutes doing what this show does: filtering the week's AI noise, breaking down the concepts decision-makers actually need, and closing with two hot takes you won't hear anywhere else.

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · Claude Opus 4.7
1M context at standard pricing is the real story. A meaningful win for long-horizon agent workloads. And Anthropic just taught every enterprise buyer to ask, “what's the gated tier?”
SIGNAL · OpenAI's updated Agents SDK
Sandboxed agents as a primitive. Python first, TypeScript later. Good news for builders, and great news for OpenAI's runtime.
SIGNAL · AI on the desktop
Google's native Gemini macOS app plus Perplexity's Personal Computer. The OS is replacing the browser as the AI UX primitive.
SIGNAL · Stanford's 2026 AI Index
Only 10% of Americans are more excited than concerned about AI, versus 56% of experts. Your consumer product has a trust problem, not a capability problem.
NOISE · Allbirds pivots to AI
A struggling sneaker company rebrands as NewBird AI, sells the footwear business, drops its environmental charter, and the stock jumps 582%. The “pivot to AI” is the new “pivot to crypto.” Price your term sheets accordingly.

No Jargon Required

Three concepts decision-makers need to call this quarter.

Context Engineering

Prompt engineering is writing a good email. Context engineering is designing the whole office the recipient works in. When every company can access the same foundation models, the wiring is the moat.

Agentic AI

What an agent actually is, and why Gartner says more than 40% of agentic AI projects will be canceled by the end of 2027 because of escalating costs, unclear business value, and inadequate risk controls.

Frontier Model Tiering

The top AI labs now ship a public tier and a gated tier. The best model is probably one you cannot buy. How to build strategy around that reality.

Hot takes

Two opinions, no disclaimers.

Oscar
“Context engineering isn't a new skill. It's ‘write good internal documentation’ rebranded so consultants can charge for it. The companies winning in 2026 had clean internal data in 2022.”

Matt
“Every executive asking ‘what's our agentic AI strategy’ in April 2026 is two years from being replaced by one who already shipped something.”

Previously

The Model Anthropic Refused to Ship

EP 01 · Apr 14, 2026 · 47 min · Ship It or Skip It

On April 7, 2026, Anthropic unveiled Claude Mythos Preview — a frontier model they describe as a “step change” in capability — and then refused to release it. Instead: Project Glasswing, a gated program for Amazon, Apple, Microsoft, Cisco, CrowdStrike, and a handful of others. The first major withheld frontier model in roughly seven years.

Responsible scaling finally biting? Or enterprise GTM dressed up as safety PR?

That's the cold open. From there, Oscar and Matt spend 45 minutes doing what this show does: filtering the week's AI noise, debating real business ideas, and closing with two hot takes you won't hear anywhere else.

Signal or Noise

The week’s AI headlines, filtered.

SIGNAL · Claude Mythos Preview
Anthropic used Mythos to identify thousands of zero-days across every major OS and browser, then chose not to ship. If Mythos-wielding attackers find what small shops can't patch, the defender/attacker asymmetry just broke.
SIGNAL · Meta Muse Spark
First model from Meta Superintelligence Labs. Llama-4 midsize quality at roughly a tenth of the compute, shipping into Facebook, Instagram, WhatsApp, and Ray-Ban Meta glasses. If you build consumer or wearable UX, unit economics just moved.
NOISE · OpenAI “Spud”
Polymarket gives GPT-5.5/6 a 78% chance of shipping by April 30. Predictions aren't news. The real story is OpenAI's $122B raise.
SIGNAL · The anti-distillation pact
OpenAI, Anthropic, and Google coordinating against Chinese labs. IP defense or three competitors ganging up on a fourth because it's easier than competing on capability?
SIGNAL · Anthropic × Coefficient Bio, $400M
Horizontal labs are quietly going vertical. Life sciences. Cybersecurity. Legal next? Finance? Pick your domain before the labs pick it for you.

Ship It or Skip It

Three real ideas, no fence-sitting.

“Bring AI to small businesses”

Via My First Million Ep 811. Productized 30-day AI-ops install for HVAC, dental, property management. Software margins, services delivery. But the real question: are you the next picks-and-shovels play, or just another consultant in a trench coat?

Vertical AI agent for insurance claims

Per-claim pricing instead of per-seat. Eats into labor P&L, not software budget. But insurance = 18-month sales cycles. You die of starvation before your first logo unless you have an insider co-founder.

“Lovable-for-X”

Clone the $200M ARR playbook for a regulated vertical — legal ops, clinical workflows, HR. Compliance moat is real. But is it also the ceiling on product velocity?

Hot takes

Two opinions, no disclaimers.

Oscar
“Anthropic withholding Mythos is the new moat. Safety-as-marketing is about to become the dominant frontier-lab playbook — because nobody wants to be the lab that shipped the model that broke the internet.”

Matt
“The ‘AI for small businesses’ gold rush is 90% consultants in trench coats. The real money isn't selling AI to SMBs; it's selling picks-and-shovels to the 10,000 consultants who are.”

New episodes every week. Reply with the story you want us on next.

Subscribe

Pick your platform.

YouTube · Watch weekly
Spotify · Listen on the go