Everything That Happened in AI Today Thursday, May 28 | The Neuron

Everything That Happened in AI Today (Thursday, May 28, 2026)

Anthropic turned Opus 4.8 into a whole news cycle; IBM committed $10B to fault-tolerant quantum; Waymo opened rides in its new Ojai robotaxi; AI costs hit enterprise sticker shock; plus many more.

Written By
Grant Harvey
May 29, 2026
18 minute read

Claude Opus 4.8 arrived with subagents, safety drama, benchmark arguments, and a $65B funding round sitting right next to it like Anthropic wanted the day to be subtle.

Welcome to the Around the Horn Digest, everything that crossed our desk today, sorted. The day had one obvious gravity well: Claude Opus 4.8. Anthropic shipped the model, previewed dynamic workflows, raised one of the largest private funding rounds in tech history, and immediately got dragged into every kind of benchmark, safety, enterprise, jailbreak, and Mythos-rollout take imaginable. Meanwhile, the rest of AI kept doing normal 2026 things: robotaxis got nicer, AI token futures started sounding like oil contracts, law firms spent half a billion dollars on internal tools, and mathematicians used AI-inspired ideas to knock down another famous conjecture. Casual Thursday, apparently. Let's get into it.

Around the Horn - Thursday, May 28, 2026

The big story today was Anthropic releasing Claude Opus 4.8 and turning it into a full platform moment instead of a normal model drop. The model came with improved coding and knowledge work, effort controls for cheaper usage, better uncertainty behavior, and a new Dynamic Workflows system that lets Claude Code spin up coordinated fleets of subagents for jobs like large codebase migrations, security audits, and complex enterprise tasks.

That alone would have been enough for a headline. Then Anthropic announced a $65B Series H at a $965B post-money valuation, Axios reported that Mythos-class models are expected in the coming weeks, and the internet immediately started arguing about whether Opus 4.8 is brilliant, lazy no more, weaker than GPT-5.5 on some tests, more aligned, easier to jailbreak, or all of the above.

The practical takeaway is simple: Anthropic is pushing Claude from "chatbot that codes" toward "orchestrator that manages other agents." That matters because the next battleground is less about one model writing one answer and more about one system decomposing work, assigning subtasks, checking outputs, and knowing when to stop. In other words, Claude is becoming the office manager for a small, slightly terrifying intern army.

🏆 TOP 5 NEWS (Around the Horn)

Honorable Mentions

🍪 TOP TREATS TO TRY

😼 Claude Opus 4.8 Take Stack

  • The launch: Anthropic released Opus 4.8 with improved coding and knowledge work, a Dynamic Workflows tool for coordinating subagents, effort controls, and better uncertainty behavior; Axios added that Mythos-class models are expected in the coming weeks.
  • The money: Anthropic raised $65B at a $965B post-money valuation, with Anthropic's post framing the round around scaling Claude, compute, and safety work.
  • The workflow shift: Claude Code Dynamic Workflows lets Claude write an orchestration plan, spin up tens to hundreds of coordinated subagents, verify results adversarially, and iterate on complex jobs like large code migrations and security audits; ClaudeDevs said the research preview starts when you use the word "workflow" in a prompt, and Sid added that Anthropic built the feature months earlier and that it had already become a daily driver for people inside Anthropic.
  • The Recursive LLM angle: Lateinteraction argued, with a follow-up, that dynamic workflows are one of the first real implementations of recursive LLM concepts: the model gets a symbolic handle to the prompt itself, then uses recursion to coordinate work.
  • The benchmark praise and counter-takes: ProximalHQ reported that Opus 4.8 topped FrontierSWE after improving on reward hacking, planning, and self-evaluation; scaling01 summed up the upgrade as Anthropic "curing laziness" in Claude; Jacob Eiting used the moment to argue that AI now turns bad ideas into bad ideas with charts and tables; and the original linkevin0 item was actually about HumanoidMimicGen, which is folded into the robotics research section below.
  • The benchmark skepticism: Andon Labs said Opus 4.8 performed worse than Opus 4.7 and GPT-5.5 on Vending Bench and Blueprint-Bench, looked more aligned but sometimes refused unethical behavior because of fear of consequences, and performed better at High effort than Max effort; Cline reported Opus 4.8 scored 3.6% lower than GPT-5.5 on Terminal-Bench 2.1; Gabe Stengel noted Gemini 3.5 Flash still beat Opus 4.8 on Finance Agent v2.
  • The enterprise angle: Box wrote that Opus 4.8 advances enterprise content use cases, and Aaron Levie framed it as meaningful for content-heavy enterprise workflows.
  • The creator workflow angle: Ethan Mollick had Opus 4.8 use Claude Code to turn hundreds of research files into a full academic working paper, then used GPT-5.5 Pro as reviewer to find one major issue and minor points; the resulting paper is hosted at The Embeddedness Gradient; Dan Shipper also vibe-checked Opus 4.8 for Every, and the longer Vibe Check dug into the model's writing and reasoning feel.
  • The safety angle: Claude AI highlighted Anthropic's pre-release red-team process, where internal teams deliberately try to break new models before shipping, while Pliny the Liberator demonstrated an autonomous AI-on-AI jailbreak roughly seven minutes after Opus 4.8 launched by having one Claude agent prompt another toward prohibited harmful-content playbooks.
  • The platform angle: Anthropic added mid-conversation system messages to the Claude API, giving developers a way to update instructions during a conversation instead of restarting the whole session.
  • The subagent ecosystem: Sim added live support for Opus 4.8 subagents inside its no-code Mothership agent builder, so users can spin up multi-agent fleets without writing orchestration code; ChrisGPT said Mythos-class models appear to be coming to all Anthropic customers in the June 10-21 window; and KingBootoshi plus an earlier post framed the release as plural Mythos models, with the joke that OpenAI "better cook" with GPT-5.6.
  • The Microsoft pressure: The Information reported that Microsoft is preparing homegrown coding, transcription, reasoning, speech, and image models to reduce dependency on OpenAI and Anthropic, with its X post highlighting the coding-model push.
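
For readers who want to picture the plan, fan-out, verify, and iterate loop described in the workflow-shift item above, here is a minimal Python sketch. It is not Anthropic's implementation or API: `plan`, `run_subagent`, and `verify` are hypothetical stand-ins, and the "subagents" are ordinary functions running in a thread pool rather than model calls.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins: a real system would call a model in each of these.
def plan(job: str) -> list[str]:
    """Split a job into independent subtasks (here: one per file)."""
    return [f"{job}: {name}" for name in ("auth.py", "billing.py", "search.py")]

def run_subagent(task: str, attempt: int) -> str:
    """Pretend worker; fails its first try on billing.py to force a retry."""
    if "billing.py" in task and attempt == 0:
        return "FAILED " + task
    return "done " + task

def verify(result: str) -> bool:
    """Independent check of a worker's output."""
    return result.startswith("done")

def orchestrate(job: str, max_rounds: int = 3) -> dict[str, str]:
    pending = plan(job)
    accepted: dict[str, str] = {}
    for attempt in range(max_rounds):
        if not pending:
            break  # knowing when to stop
        # Fan out: run every pending subtask in parallel.
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(lambda t: run_subagent(t, attempt), pending))
        # Verify each result; anything rejected goes back in the queue.
        retry = []
        for task, result in zip(pending, results):
            if verify(result):
                accepted[task] = result
            else:
                retry.append(task)
        pending = retry
    return accepted

if __name__ == "__main__":
    for result in orchestrate("migrate").values():
        print(result)
```

The point of the sketch is the shape of the loop: the orchestrator never does the work itself, it only decomposes, dispatches, checks, and retries until nothing is pending or the round budget runs out.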

🏢 Big Tech & Major Companies

💼 AI Productivity, Labor & Economics

  • Remote said it grew revenue per employee by 50% without adding headcount, passing $300M ARR and crediting internal AI adoption, Claude workflows, and AI-written code.
  • Simon Willison argued that Anthropic and OpenAI have found product-market fit because enterprise coding agents make expensive LLM usage feel worth it for expensive human workers.
  • Semafor reported that companies are reevaluating aggressive AI spending as IT bills climb instead of falling.
  • Major exchanges are developing AI token futures, treating tokens like a raw material input alongside electricity, compute, and bandwidth.
  • Adam Ozimek argued that AI is unlikely to be the primary cause of rising teen unemployment, pointing instead to a low-hire, low-fire economy, macro uncertainty, and broader labor-market weakness for young adults, including discouraged workers who do not show up in headline unemployment stats; ModeledBehavior and Charles Arnal were part of the reaction thread around that macro-vs-AI framing.
  • Jason Fried argued that interface design should favor stable, pre-defined glances instead of forcing people to start from a blank slate every time.
  • Mitchell Hashimoto warned about "agent psychosis": an AI agent improved a renderer from 88 ms to 1.5 ms, but his hand-written version hit 20 microseconds with no hot-path allocations, showing that impressive agent work can still miss deeper systems wins.
  • Addy Osmani argued that running many agents does not parallelize your attention; review, merging, context switching, and judgment remain the bottleneck.
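
To put the Hashimoto numbers above in proportion, here is the arithmetic as a short Python snippet. The three timings come straight from the item; the variable names are ours, and everything is converted to microseconds so the ratios are easy to compare.

```python
# Timings from the item above, all in microseconds.
baseline_us = 88 * 1000      # original renderer: 88 ms
agent_us = 1500              # after the AI agent's work: 1.5 ms
hand_written_us = 20         # hand-written version: 20 microseconds

agent_speedup = baseline_us / agent_us        # how much the agent gained
remaining_gap = agent_us / hand_written_us    # how much it left on the table
total_gap = baseline_us / hand_written_us     # baseline vs hand-written

print(f"agent speedup over baseline: {agent_speedup:.1f}x")
print(f"hand-written speedup over agent: {remaining_gap:.1f}x")
print(f"hand-written speedup over baseline: {total_gap:.1f}x")
```

In other words, the agent's roughly 59x win looked dramatic, yet the hand-written version was another 75x faster on top of it.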

🤖 AI Agents & Infrastructure

  • TechCrunch argued that the internet is being rebuilt for machines as agents move from experiments into production, with AWS, Cloudflare, Databricks, Snowflake, and Microsoft rethinking infrastructure for machine-generated traffic.
  • Russell Brandom argued that recursive self-improvement is becoming the new AGI target, with labs chasing automated AI research but still hitting reliability, verification, and self-direction limits.
  • Harvard MIMS researchers built AutoScientists, a decentralized team of LLM agents that self-organize around scientific experiments, share wins and failures, and improved results on biomedical and model-optimization benchmarks, with a paper, a GitHub repo, Shanghua Gao's post, Ada Fang's thread, and Matan Grinberg's reaction.
  • Shift launched free NYC apartment cleanings where operators record anonymized egocentric video data for robotics training, with BhathalTanvir0 highlighting the data-for-labor trade.
  • Eric Zakariasson introduced Thermos, a Cursor plugin that runs security/correctness and code-quality audits in parallel on a branch, dedupes findings, and prioritizes overlap; a follow-up showed install and workflow details.
  • SMFS reported that its specialized filesystem for agents reduced token usage by 43% on average across 220 eval runs, with Prasanna sharing the results.
  • Sphere-AI-Lab built Orbit, an ultra-efficient open-source reinforcement-learning pipeline that can post-train trillion-parameter LLMs on a single 8xB200 GPU node by freezing the low-precision base model and updating a small BF16 adapter, with Sphere's page explaining the train-rollout gap it targets and tszzl's thread framing the broader question of long-horizon agent evaluation.
  • Agent Harness Engineering surveyed the execution harness layer of agents and proposed the ETCLOVG taxonomy, with koylanai demoing an open-source harness with observability.
  • Shopify Engineering detailed River/Aquifer, its internal orchestration platform powering thousands of daily autonomous engineering tasks.
  • reach_vb described a Codex high-agency pull-request workflow where the agent produces complete, ready-to-merge branches on internal repos with no human edits before review, making the practical bar less "can it write code?" and more "can it own a clean branch to merge?"
  • Ben Holmes demoed Warp's /handoff skill, where Claude Code plans complex work and delegates scoped tasks to parallel Codex worktrees.
  • Google Antigravity used Gemini 3.5 Flash to orchestrate 93 subagents across 15k+ model calls and boot Doom after building a custom kernel, filesystem, and drivers in 12 hours.
  • Hermes Agent got a deployment masterclass from Matt Pal, with a GitHub template; Hermes Agent v0.15.0 also shipped via Nous Research, with a launch post.
  • Firecrawl Monitoring schedules recurring scrapes, detects content changes, and sends diffs by webhook or email, with Firecrawl's post framing it as token-saving infrastructure for agents.
  • Baseline.ai keeps coding agents inside the lines with local-first baseline checks and Pro monitoring, with Ben Hylak sharing the launch.
  • HowToEval.com published a 2026 guide to evaluating AI agents in production across offline evals and real-time monitoring.
  • MagicPathAI's X-only post was included in the source batch as an embodied-agent path-planning demo, but the accessible scrape did not expose enough text to verify the full thread beyond that framing; keeping it linked here prevents the demo from being dropped while avoiding invented details.
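
The Thermos item above describes a pattern worth seeing in miniature: run separate audits in parallel, dedupe their findings, and rank the places where they overlap first. The Python sketch below is an illustration under stated assumptions, not Thermos's actual code. The two audit functions, their canned findings, and the choice to dedupe by file and line are all hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical audits returning (file, line, note); real ones would call a model or linter.
def security_audit(branch):
    return [("api.py", 42, "unsanitized input"), ("db.py", 7, "raw SQL string")]

def quality_audit(branch):
    return [("api.py", 42, "function too long"), ("ui.py", 3, "dead code")]

def review(branch):
    audits = {"security": security_audit, "quality": quality_audit}
    # Run both audits at the same time.
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, branch) for name, fn in audits.items()}
        reports = {name: future.result() for name, future in futures.items()}
    # Dedupe: findings at the same file and line collapse into one entry.
    merged = defaultdict(dict)
    for name, findings in reports.items():
        for file, line, note in findings:
            merged[(file, line)][name] = note
    # Prioritize overlap: locations flagged by more audits come first.
    return sorted(merged.items(), key=lambda item: (-len(item[1]), item[0]))

if __name__ == "__main__":
    for (file, line), notes in review("feature-branch"):
        print(f"{file}:{line} flagged by {', '.join(sorted(notes))}")
```

Here `api.py` line 42 lands at the top because both audits flagged it, which is the intuition behind prioritizing overlap: independent reviewers agreeing is a stronger signal than either one alone.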

💻 AI Coding & Developer Tools

🔬 AI Research & Models

🛠️ AI Tools & Products

  • Asana acquired StackAI for $75M to add no-code agent building into its work-management suite.
  • Vertu launched a luxury AI foldable starting at $6,880, built on open-source Hermes Agent with enterprise workflow integrations and premium hardware finishes.
  • Josh Woodward said NotebookLM is rolling out automatic Google Drive file sync, beginning with 10% of users; NotebookLM is the product link.
  • TestingCatalog reported Google Nano Banana is generally available and now accepts native video input, which makes the tool more useful for workflows that analyze or transform motion rather than only still images.
  • Runway released "Last Night," a fully AI-generated short film about a life-changing evening in Tokyo told through fractured memories, created by one person in one day as part of Project Luxo to show that AI video is moving past uncanny-valley demo clips and into polished short-form storytelling.
  • StemStudio is an open-source browser-based 3D editor, engine, and AI copilot for building games and interactive apps with Three.js, behaviors, ECS lambdas, physics, and multiplayer; Ken B. argued biology is the next agentic frontier after coding, while Mark Pincus framed Stem as an MIT-licensed, remixable, web-native 3D game engine and dev studio built for AI-assisted blocks.
  • Cristóbal Valenzuela argued that the future may flip today's disclosure norm: instead of labeling AI-generated videos, we may need to label the videos actually captured by cameras, because current adoption curves suggest synthetic video could become the default medium.
  • Leila Clark defended giving an AI agent production-database access in the YC robotics context, arguing that professional engineering guardrails already include backups, ORMs, docs defining prohibited actions, and postmortems, while the same thread also surfaced a live FLUX Virtual Try-On example.
  • snowmaker, a YC partner, said he quietly gave an AI agent full access to YC's production database one night, then used the story to argue that agents can work safely around sensitive systems when teams apply normal engineering controls; Y Combinator separately pointed to the broader AI/robotics startup batch that made the incident relevant.
  • Lambda API's X-only item and Nick Camara's follow-up were included beside the OpenJarvis/on-device assistant cluster in the source batch, but the accessible scrape did not expose enough post text to verify the exact tooling claim; the digest keeps both links attached to the cluster rather than inventing context.
  • The Neuron general-purpose custom GPT is available as a reusable ChatGPT workspace for idea exploration, problem solving, and faster learning.
  • Enjamb gives biopharma R&D teams an agentic workspace for evidence synthesis, statistical programming, grants, and regulatory documents, with YC's launch post noting a $650K pre-seed.

📊 Fundraising & Deals Roundup

  • Anthropic - $65B Series H at a $965B post-money valuation for Claude, compute, safety, and enterprise scale.
  • Airwallex - valued at $12B after reaching $1.5B ARR.
  • Groq - raising $650M for its AI inference neocloud pivot after a $20B Nvidia licensing deal.
  • Corgi - $106M at a $2.6B valuation for AI-native insurance, doubling valuation in three weeks.
  • Reactor - $59M to build a developer platform for real-time World Models, with Reactor's launch post, another Reactor post, Rohan Paul, Lightspeed, and Lightspeed's write-up providing context.
  • Orbital Industries - $50M Series B for AI-designed materials and data-center cooling, with Tech.eu covering the infrastructure angle.
  • Inherent - $50M seed led by Index Ventures for a DeepMind-alumni AI lab building Faraday for open-ended scientific discovery, with Index's post.
  • Geordie AI - $30M Series A for AI agent security and governance.
  • Saris - $28.8M Series A for AI agents that automate back-office work for banks and credit unions.
  • Enjamb - $650K pre-seed from YC and Founders Inc for biopharma R&D agents, via YC's launch post.

💡 Industry Commentary & Analysis

  • Amanda Silberling explained why Google's AI struggles with spelling: LLMs process tokens rather than literal characters, so counting letters can remain brittle even when models answer harder questions.
  • Sigal Samuel argued that humanism should reject AI successionism, transhumanism, and posthumanism's replacement narratives and re-center pluralism plus intrinsic human value, with her post.
  • Turing Post shared six guides on LLM basics: tokens, token taxonomy, embeddings, agentic vector databases, attention and KV cache, and LLM inference.
  • Latent Space interviewed Cognition's Walden Yan and OpenInspect's Cole Murray on the age of async agents: Devin now commits production code, teams are moving from loose prompts to spec-to-PR workflows, agents need full virtual machines, memory, and inspection loops, and PMs can increasingly ship code directly; Walden Yan, Omoore, menhguin, and a1zhang amplified the practical takeaway that async software agents are becoming team infrastructure, not side toys.
  • Tom Davidson argued that full automation of AI R&D would likely boost AI software progress 3-5x and overall AI progress 2-3x even without a pure software intelligence explosion, with his X post.
  • mweinbach pointed to recent AI-agent velocity and enterprise-adoption signals, Ryan Carson argued the coding-agent release cadence is accelerating so fast teams will keep switching tools unless they abstract the agent layer, and Epoch AI Research updated its AI training-compute scaling charts to track how much of that velocity is still being driven by raw compute growth.
  • Mercor's X-only post sat in the labor-market cluster but the accessible scrape did not expose enough text to verify the precise claim; it remains linked here as a pointer to the original expert-work and AI labor-market discussion rather than being padded with guessed details.
  • Venture Twins and scaling01 were investor/operator reactions around the day's model and tooling launches; the accessible scrape did not expose enough text to safely summarize the full posts, so both remain linked without invented claims.
  • Gov. Pritzker's post sat alongside the Illinois SB 315 cluster: the legislature passed a frontier-AI safety bill requiring annual independent third-party safety audits, transparency, and incident reporting for large AI developers, with the bill still needing the governor's signature before its 2028 effective date.
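
The spelling item above is easier to see with a toy example. The Python sketch below uses a made-up vocabulary and a simple greedy longest-match tokenizer, both assumptions for illustration only; real tokenizers are trained on data and use different splits and IDs. The idea it shows is the one from the item: the model receives token IDs, not letters.

```python
# Toy vocabulary (hypothetical): two subword pieces plus single letters as a fallback.
VOCAB = {"straw": 101, "berry": 102,
         "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8}

def tokenize(word):
    """Greedy longest-match split; assumes every character of word is in VOCAB."""
    tokens, i = [], 0
    while i < len(word):
        # Pick the longest vocabulary piece that matches at position i.
        piece = max((p for p in VOCAB if word.startswith(p, i)), key=len)
        tokens.append(piece)
        i += len(piece)
    return tokens

word = "strawberry"
pieces = tokenize(word)
ids = [VOCAB[p] for p in pieces]

print("characters:", list(word))   # what a human counts letters over
print("tokens:", pieces)           # how the toy tokenizer splits the word
print("model input:", ids)         # what the model actually receives
print("r count from characters:", word.count("r"))
```

With character access, counting the letter "r" is a one-liner. From two opaque IDs, the model has to have learned what letters sit inside each token, which is why letter counting can stay brittle even for models that handle harder questions.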

That's a Wrap

That's a massive Thursday: Claude turned into a full subagent discourse machine, enterprise AI bills started looking like a new utility category, and mathematicians got another reminder that "AI-assisted" may soon mean "please clear your weekend." Congrats on making it to the bottom. You now qualify as a dynamic workflow.

For the daily version, make sure you're subscribed to The Neuron. We read all of this so you don't have to.

See you tomorrow.

Grant Harvey

Grant Harvey is the Lead Writer of The Neuron, where he continues to lead the publication's daily coverage of AI news, tools, and trends.

Property of TechnologyAdvice. © 2026 TechnologyAdvice. All Rights Reserved