Everything That Happened in AI Today: Wednesday, May 27

Robinhood just gave AI agents access to brokerage accounts and virtual cards, which feels like watching a Roomba get margin privileges.

Welcome to the Around the Horn Digest, where we track every AI story worth knowing so you can sound dangerously informed at work tomorrow. The loudest story today is Robinhood letting users connect agents to actual money, teed up below, because apparently “summarize this PDF” was merely the tutorial level. Meanwhile, AxiomProver pushed machine-verified math into peer-reviewed journals, OpenAI and Thrive showed a self-improving tax agent, Google rolled out AI Threat Defense, Amazon and Snowflake signed a giant agentic-computing chip deal, YouTube moved AI labels into places people might actually see them, and the science side of AI kept quietly building the future under everyone’s nose. If your agent starts buying ETFs and concert tickets, maybe give it an allowance first. Let’s get into it.

🆕 NEW From The Neuron

We turned our LiveKit livestream with Ben Cherry into a full guide on building real-time AI voice agents, including how they listen, interrupt, call tools, and survive production without becoming a chatbot wearing AirPods.
We broke down how DataCurve’s DeepSWE exposes a weird new problem with AI coding leaderboards, where benchmark scores can look impressive while missing whether agents can handle messy, fresh, real-world software work.
Corey and Grant talked with Great Sky co-founder Jeff Shainline about what comes after GPUs, including superconducting optoelectronic networks, why brain-like computing does not mean building a literal brain, and what has to happen before Great Sky’s chips become deployable AI infrastructure.

Around the Horn — Wednesday, May 27, 2026

The big news today: Robinhood is letting AI agents touch real money (WSJ, FT).

The company is rolling out “agentic trading,” a beta that links AI tools to investment accounts so users can give agents a dedicated Robinhood account, set a budget, and let those agents trade stocks. It is also adding an agentic virtual card for Gold Card users, so agents can make purchases within user-defined limits.

That is a clean escalation point for the whole agent conversation. For the last year, agents have mostly lived in low-stakes territory: book a meeting, update a CRM, find a file, draft a reply. Robinhood is moving them into a world where a bad decision has a visible balance attached to it.

The interesting part is the permission layer around the agent. Wallets, limits, accounts, approvals, logs, and pause buttons are becoming the new agent UX. The question is whether people want an assistant powerful enough to act for them, or whether the moment it can spend money is the moment everyone suddenly remembers they like confirmation buttons.
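To make that permission layer concrete, here is a minimal Python sketch of a spending guard with a budget, an approval threshold, an audit log, and a pause switch. Everything in it (the `SpendGuard` class, its field names, and the dollar amounts) is hypothetical and for illustration only; it is not Robinhood's API and says nothing about how its product actually works.

```python
from dataclasses import dataclass, field


@dataclass
class SpendGuard:
    """Hypothetical permission layer wrapped around an agent's spending."""
    budget_cents: int                 # total the agent may spend
    approval_threshold_cents: int     # purchases above this need a human yes
    paused: bool = False              # the "pause button"
    spent_cents: int = 0
    log: list = field(default_factory=list)  # audit trail of every request

    def request(self, amount_cents: int, memo: str, approved: bool = False) -> str:
        """Return 'allowed', 'needs_approval', or 'denied', and log the decision."""
        if self.paused:
            decision = "denied"
        elif amount_cents <= 0 or self.spent_cents + amount_cents > self.budget_cents:
            decision = "denied"
        elif amount_cents > self.approval_threshold_cents and not approved:
            decision = "needs_approval"
        else:
            decision = "allowed"
            self.spent_cents += amount_cents
        self.log.append((memo, amount_cents, decision))
        return decision


# Illustrative numbers only: a $500 budget, with approval required above $100.
guard = SpendGuard(budget_cents=50_000, approval_threshold_cents=10_000)
results = [
    guard.request(2_500, "concert ticket"),
    guard.request(20_000, "ETF purchase"),
    guard.request(20_000, "ETF purchase", approved=True),
    guard.request(40_000, "second ETF purchase", approved=True),
]
guard.paused = True  # the user hits pause; everything after this is refused
results.append(guard.request(100, "coffee"))
print(results)
```

The point of the sketch is that every request gets logged, including the ones that are refused, so the confirmation button and the audit trail live in the same place.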

🏆 TOP 5 NEWS (Around the Horn)

AxiomProver AI generated machine-verified Lean proofs for eight arXiv math papers, with five now accepted in peer-reviewed journals, pairing automated verification with human-written expositions.
OpenAI, Thrive, and Crete built self-improving tax agents with Codex that processed 7,000+ returns, cut data entry by about one-third, hit up to 97% accuracy, and turned accountant corrections into evals and Codex-generated pull requests (Samay Sham, Rohan Paul).
Google Cloud introduced AI Threat Defense, a cybersecurity platform that combines Wiz risk scanning, Gemini vulnerability analysis, CodeMender patching, and autonomous agents for continuous testing and remediation.
Amazon struck a five-year, $6B deal with Snowflake for AWS Graviton CPU chips to power agentic-computing demand, making Snowflake one of AWS’s largest CPU customers alongside Apple and Meta.
Cognition raised $1B at a $25B pre-money valuation, while saying Devin reached a $492M annualized revenue run rate and enterprise usage grew 50% month-over-month for six months.

Honorable Mentions

YouTube moved AI labels into more visible placements on long-form videos and Shorts, while expanding automatic detection for realistic AI-generated or altered content.
OpenAI Foundation committed an initial $250M to grants, partnerships, research, and direct work aimed at helping workers and economies navigate AI disruption.
Chan Zuckerberg Biohub unveiled ESMFold2, ESMC, and an expanded ESM Atlas on Hugging Science, opening 1.1B predicted protein structures and 6.8B sequences for scientific discovery.
Starlette, a FastAPI dependency used across agent infrastructure, had an authentication-bypass flaw that researchers warned could expose LLM gateways, MCP servers, and millions of AI agents (InfoWorld).
Amazon MGM Studios greenlit three animated series for Prime Video under a new GenAI Creators’ Fund that provides funding and AWS AI production tools, while still using human actors and voices (Variety).
Google rolled out Ask YouTube, a Gemini-powered feature that lets viewers ask natural-language questions about videos and get timestamped answers with clips from transcripts and visuals.
SemiAnalysis reported that Anthropic’s growth and Claude-heavy Bedrock usage helped drive AWS operating margins higher while other cloud providers lagged.
Micron hit a $1T market cap for the first time as AI-driven memory demand sent its stock up 19%.
OpenAI outlined 2026 election safeguards, including cyber support, AI transparency, and partnerships with social platforms (Axios).
China is upgrading its massive camera network with computer vision and language models that let police query footage in natural language, according to The Decoder’s summary of FT reporting.
PyTorch and Alibaba Qwen hit 580 tokens per second on Qwen3.5-397B-A17B for agentic workloads using TokenSpeed’s memory-copy elimination, kernel fusions, and CPU-GPU overlap.
Nvidia is spending up to $150B annually on Taiwan-based AI supply-chain partners, according to Jensen Huang.

🍪 TOP TREATS TO TRY

The Grid lets you run AI inference on a spot market: pick quality and speed tiers, change three lines of code, and let suppliers compete on real-time supply and demand for up to 80% cost savings, with the first 200M tokens free.
OpenAI Secure MCP Tunnel connects private or on-prem MCP servers to ChatGPT, Codex, and the Responses API through an outbound-only HTTPS tunnel, so enterprises can avoid public exposure and inbound ports.
MagicPath lets you run a native canvas inside Codex to design and build functional apps with interactive, animated, shareable components (how-to, Ben Tossell test, Pietro update), free to try.
InfraRed automates document verification in under five seconds by checking tampering, classifying documents, extracting fields, applying rules, and turning messy files into decision-ready data (site), no pricing details.
Codex Meeting Recorder uses GPT Realtime Whisper to transcribe meetings live inside Codex, answer questions mid-meeting, and produce a clean transcript at the end (demo), free to try.
Google Coral gives developers an open-source platform for edge AI accelerator chip development, with new boards arriving this summer and demos for real-time translation, hardware control, and vision or sound-to-music workflows.
ElevenLabs Music v2 generates songs with vocals, sound effects, faster rap delivery, and mid-song genre switches using licensed data cleared for commercial use, free to try.
Runway MCP brings Runway image and video generation into Claude, ChatGPT, Cursor, Replit, and other compatible agents, so you can generate media without leaving your workflow.
Claude Marketplace lets companies browse curated partner agents and tools, then pay for them with an existing Anthropic commitment, limited preview.
Gemini Embedding 2 creates unified 3072-dimensional embeddings across text, images, video, and audio for semantic search, classification, clustering, and retrieval.
Reachy Mini now runs fully local speech-to-speech conversations using Silero VAD, Parakeet-TDT, Gemma, llama.cpp, and Qwen3-TTS, with no cloud or API keys.
Epicure Flavour Explorer helps you find science-backed ingredient pairings from a fridge photo, shopping haul, or custom basket, powered by a 4.1M-recipe multilingual food model (paper, HF Space, MCP), free to try.
Pace turns insurance carrier, broker, and MGA operating procedures into agents that execute back-office workflows across documents, systems, and customer channels.
FLORA MCP lets you run FLORA techniques from any MCP-compatible agent, including Claude, Cursor, and VS Code, free to try.
Sesame: Personal Agents brings Sesame’s conversational agents to iOS, built around natural speaking rhythm, curiosity, and thinking out loud (blog), free to try.
Atomic.chat teamed up with the open-source Goose agent so you can build features, edit code, and automate workflows locally on your device with local models.

🏢 Big Tech, Platforms & Major Companies

Amazon started selling its AI shopping technology to other retailers, with Kate Spade signed as an early customer.
Microsoft made computer-using agents in Copilot Studio generally available, so enterprise agents can operate apps through a screen, keyboard, and mouse even when no API exists, with governance, approvals, and audit trails.
OpenAI Codex can now securely use apps on a locked Mac from a phone, letting Codex keep working when the screen is off and the computer is locked.
OpenAI added Secure MCP Tunnel, Workload Identity Federation, spend alerts, model allowlists, data-retention controls, and granular cost visibility for private enterprise MCP deployments and project management (developers).
Claude added native Replit integration, letting users build, run, and debug full apps inside Claude without switching tabs.
OpenAI is sunsetting GPT-5.2 and GPT-5.3-Codex in Codex on June 2 for ChatGPT-account users, defaulting free plans to GPT-5.5 while keeping older models available through the API; Victor Nunez hinted the change also clears space for something larger.
ByteDance is considering up to $70B in capital spending this year to expand AI infrastructure and compete with U.S. players.
Germany and Spain are pushing back on a European Commission plan to ban Huawei and ZTE gear from telecom networks under new cybersecurity rules.
Taiwan authorities arrested three people accused of smuggling Nvidia chips to China via Japan and Hong Kong.
Google DeepMind CEO Demis Hassabis said AI agents are a practice run for AGI and warned that society is not prepared for the pace of systems that could reach AGI around 2030.
Google Labs emphasized that tools like Flow and Project Genie still rely on human imagination as the creative driver, even as model capabilities expand.
Newcomer examined the strange sequence behind Mira Murati launching Thinking Machines after criticizing OpenAI’s board, rallying employees behind Sam Altman, and leaving with roughly 20 OpenAI engineers.
Antigravity tripled Gemini rate limits across all paid tiers and reset everyone’s Gemini quota for the week, according to Varun Mohan.

💰 Funding, Markets & AI Economics

Airis Labs emerged from stealth with $60M raised to turn everyday digital footage into mission-ready intelligence for national security, public safety, border management, and emergency response (SiliconANGLE, Axios).
Tensormesh raised a $20M seed extension from Nvidia and AMD for AI inference software.
NavigateAI, launched by Opendoor co-founder Eric Wu, raised $25M at a $225M valuation to build an AI coach for construction and field workers (launch).
Pace raised $46M from Thrive and Sequoia to automate insurance back-office work.
Trajectory, founded by former Google and Apple researchers, is building a continual-learning platform so AI products improve from usage data over time (WIRED).
Fireworks AI passed an $800M annualized run rate and reported 4x revenue growth in Q1 outside of Cursor, according to Lin Qiao.
Redpoint published its Infrared Report 2026 on AI infrastructure spending, model economics, and startup opportunities (announcement).
The Financial Times argued that AI could open room for smaller, well-funded challengers to take consulting work from the Big Four and other incumbents.
Dick Costolo predicted that SpaceX, OpenAI, and Anthropic IPOs could bring extreme public-market volatility, with SpaceX potentially riding the Starlink narrative to a $1.5T-$2T+ valuation.
Gavin Baker argued that the AI boom is capitalism’s most extraordinary moment, with “watts and wafers” becoming the key infrastructure constraint and frontier models capturing most of the value.
Forbes profiled Conviction founder Sarah Guo, who left Greylock to start an AI-focused VC in 2022 and backed companies including Harvey, Cognition, OpenEvidence, Baseten, and Mistral before their valuation spikes.
Forbes profiled Spark Capital partner Yasmin Razavi, who led Anthropic’s $450M round in 2023 with a $75M check and turned it into a reported $3B AI windfall (post).
Pedro Serôdio argued that superhuman AI may still leave room for human labor because firms bundle tasks into jobs through tacit knowledge, coordination costs, incomplete contracts, and unspecifiable problems.

💻 Coding Agents, Developer Tools & Agent Infrastructure

Still thinking about this a few days later: Greg Brockman summed up the agent era in one line, “the model alone is no longer the product.” Yohei Nakajima put the same idea into systems form with The Log is the Agent, arguing that auditable, forkable agents should coordinate through persistent, replayable state. Put together, the product is becoming the model plus the tools, memory, event log, permission system, and replay layer around it.
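As a rough illustration of the log-as-agent idea, the Python sketch below stores nothing but an append-only event log and derives the agent's state by replaying it, which is what makes the state auditable and forkable. The `EventLog` class, the event types, and the `reduce_state` function are all hypothetical names invented here; this is a toy under those assumptions, not Nakajima's implementation.

```python
import json
from dataclasses import dataclass, field


def reduce_state(state: dict, event: dict) -> dict:
    """Pure function: apply one event to a state and return a new state."""
    new_state = {"notes": list(state["notes"]), "tool_calls": state["tool_calls"]}
    if event["type"] == "note":
        new_state["notes"].append(event["text"])
    elif event["type"] == "tool_call":
        new_state["tool_calls"] += 1
    return new_state


@dataclass
class EventLog:
    """Hypothetical append-only log; agent state is derived, never stored."""
    events: list = field(default_factory=list)

    def append(self, event: dict) -> None:
        # Round-trip through JSON so every event is guaranteed serializable.
        self.events.append(json.loads(json.dumps(event)))

    def replay(self, upto=None) -> dict:
        """Rebuild state by folding over events from the start."""
        state = {"notes": [], "tool_calls": 0}
        for event in self.events[:upto]:
            state = reduce_state(state, event)
        return state

    def fork(self, upto: int) -> "EventLog":
        """Copy the first `upto` events into an independent log."""
        return EventLog(events=[dict(e) for e in self.events[:upto]])


log = EventLog()
log.append({"type": "note", "text": "user wants a flight"})
log.append({"type": "tool_call", "name": "search_flights"})
log.append({"type": "note", "text": "found three options"})

# Fork after the tool call and explore a different path without touching the original.
branch = log.fork(upto=2)
branch.append({"type": "note", "text": "try trains instead"})

print(log.replay())
print(branch.replay())
```

Because state is only ever computed from the log, replaying a prefix (`log.replay(upto=1)`) shows exactly what the agent knew at that point, and a fork is just a copied prefix with new events appended.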

OpenCode added limited-time free access to MiMo V2.5 inside its coding agent, with 1M context, multimodal input, reasoning, text, and image support.
Polar from NVIDIA-NeMo runs agentic reinforcement learning on any existing harness by treating Codex, Claude Code, OpenClaw, Hermes, or custom tools as black-box environments through an LLM API proxy (GitHub, thread, prior-art reply).
DeepSWE evaluates frontier coding agents on fresh, long-horizon software-engineering tasks they have never seen before (post).
Cursor Compile is Cursor’s inaugural AI-native development conference, with researchers, builders, teams, and a Chalkboard Stage call for papers (post).
Ramp Labs deployed 10,000 background agents to security-scan its codebase on public models, surfacing and fixing multiple high-severity vulnerabilities.
Ramp Labs also launched automated security scanning and compliance checks for AI agent deployments.
Philo Groves said GPT-5.5 found a real 27-year-old remote-code-execution vulnerability introduced in April 1999, after he triple-checked the exploit flow and commit history before responsible disclosure.
Omar Sar built a self-improving coding agent with cheap tokens and local models like DeepSeek-V4-Flash, then used it to ship a production-grade app in 24 hours.
George Pickett showed Codex sub-agents spawning seven parallel browser sessions from one prompt to plan a trip across flights, rental cars, Airbnbs, and other logistics.
Alex Zhang open-sourced a minimal RLM training harness built on prime-rl and verifiers, along with RLM-Qwen3-30B-A3B-v0.1 for long-context reasoning gains.
Hao AI Lab open-sourced FastVideo Dreamverse, a self-hostable real-time “vibe directing” app for 30-second 1080p video generation in seven seconds on a single NVIDIA B200 GPU (blog, post, NVIDIA).
text-to-cad v0.1.0 added open-source agent skills for CAD, robotics, and hardware design, including Bambu Labs and GCode support, 3x faster STEP export, 5x faster viewer controls, and Codex, Claude, and Gemini plugins (site, post).
Omar Khattab argued that on-policy distillation is overrated for training reasoning models because verifier reward matters more than copying a teacher’s exact token distribution, while reward-guided search and self-generated data scale exploration better.
Makora showed automated GPU kernel generation that can beat hand-tuned CUDA code through search optimization and performance modeling, with further analysis from SemiAnalysis.
xAI’s Grok Build TUI drew praise for realtime context view, speed, worktree support, resumable sessions, and keyboard shortcuts, while Boris Skorobogaty detailed shared base worktrees that let agent swarms spin up instantly without full repo copies or SSD thrashing.
Microsoft ECHO trains terminal agents to predict environment observation tokens like stdout, errors, and files, turning sparse terminal feedback into dense on-policy supervision and improving TerminalBench-2.0 pass@1 (GitHub).
Nnamdi Iregbulem argued that agents should be file-system-first because LLMs already understand CLI workflows like cd, ls, grep, and cat, giving agents external memory, auditability, composability, snapshots, and shared-directory coordination.
Dan McAteer argued that Codex works best when your browser, apps, notes, files, terminal, screenshots, and repo context live in one workspace, turning the agent into a persistent colleague instead of a one-off task executor.
Ben Burtenshaw of Hugging Face argued that coding agents should do AI system engineering, including custom kernels, architecture-level optimization, and finetuning small models (AI Engineer).
FlashLib brings FlashAttention-style GPU optimization to classical ML operators like KMeans, KNN, PCA, TruncatedSVD, HDBSCAN, and t-SNE, with up to 26x faster KMeans and 147x faster exact t-SNE (GitHub).
Nous Research added a built-in MCP Catalog to Hermes Agent so users can discover and integrate tools from inside the agent environment.
Perplexity open-sourced pplx-garden, the inference optimization stack behind its long-context serving, including tokenizer, kernels, scheduling, and continuous prefill (post).
Anthropic introduced Claude Marketplace in limited preview, a curated partner directory that lets customers pay for agents and tools through existing Anthropic commitments.
Claire Vo built a 30-minute walkthrough showing how Codex /goal can run full workflows, from Vercel and Sentry integration to cleaning thousands of emails.
Gabe Pereyra, Harvey, and Baseten built and post-trained a 27B open-weight legal model on the Legal Agent Benchmark, reaching closed-source frontier performance on long-horizon legal tasks.
alphaXiv highlighted MiniMax-M2’s training on runnable workspaces and artifact-grounded rewards, including Forge RL and M2.7 debugging its own failed training runs.

🧠 Research, Papers & Benchmarks

FML-bench introduced 18 fundamental ML research tasks across 10 domains and 12 process-level metrics to study AI research-agent search and exploration strategies independently of execution infrastructure (post).
Andrew Gordon Wilson and coauthors showed that self-generated replay data can nearly eliminate catastrophic forgetting during finetuning unless models are severely capacity-saturated (post).
Tiberiu Musat proved that, in fixed-precision looped neural networks, the smallest weight norm that outputs a binary string equals that string’s Kolmogorov complexity up to a log factor, linking weight decay to Solomonoff’s universal prior (post).
DVAO dynamically reweights multiple RL objectives inside each rollout group based on empirical reward variance, improving Pareto frontiers on math reasoning and tool-use benchmarks (Gravity7, Turing Post).
probnstat argued that post-transformer memory architectures like state-space models, RWKV, and Mamba hybrids could dominate long-context reasoning by avoiding quadratic attention costs while preserving in-context learning.
HuggingPapers highlighted Alibaba’s MIGA, a train-free infinite-frame video generation method that improves long-horizon temporal consistency through two-stage alignment and dual consistency enhancements.
David Holz argued that future compute trends favor FLOPS over memory bandwidth, so researchers should go deeper on diffusion models despite autoregression’s current memory advantages.
Ryan Greenblatt argued that fully automating AI R&D would likely create a large speedup through one-time gains and compute-amplified researcher quality, even without a runaway software-only singularity.
Fabian Schaipp argued that a stability index for step-size sensitivity helps explain why adaptive optimization methods like SPS, NGN, and stochastic proximal point are easier to tune than SGD (thread).
Nicholas Tomlin argued that default language models remember too perfectly to simulate humans, while a COMPACTOR key-value memory approach creates more human-like forgetting and better education-task simulations (thread).
Liquid AI argued that the real LLM inference bottleneck is often memory bandwidth, meaning the key limit is how fast model weights can move, not raw compute.
Mihran Miroyan introduced RECON, which scores synthesized reasoning traces by action-reconstruction fidelity instead of post-hoc rationalization, reaching up to a 70% win rate over baselines for user modeling (paper, thread).
Yong-eun Cho found that harness sensitivity is non-monotone across LLM agent tiers, with some frontier chat models performing worse under stricter harnesses while some reasoning models improve (DAIR.AI).
Geyang Guo introduced Language-Routed Policy Optimization, which lets LLMs roll out in complementary languages when better information exists outside the prompt language (paper, GitHub).
Dongmin Park introduced Looped Diffusion Language Models, which loop diffusion steps inside language models to accelerate training 3.34x, improve GSM8K by 8.5 points, and enable test-time depth scaling (post).
Matthieu Wyart showed that hierarchical concept geometry in language models can emerge from word co-occurrence statistics alone, with spectral structure predicting embedding hierarchies (post).
Yi Jing released SAERL, using sparse autoencoder features to guide post-training data selection, ordering, batching, and difficulty without external judges or rollout sampling (paper).
Yazid Janati revisited Uniform Diffusion Models, showing common UDM implementations learn a leave-one-out denoiser and that uniform and masked diffusion connect through absorbing-state reformulation (GitHub, post).
Ethan He framed long video generation as planning at inference time, using MCTS and look-ahead rollouts to improve object permanence, temporal coherence, and text-video alignment (post).
Yuyin Zhou and UCSC-VLAA built ClinSeekAgent, a clinical reasoning agent with 20 MCP tools across EHR, web, and medical imaging that improved F1 by 15.1 over Claude Opus 4.6 on ClinSeek-Bench (GitHub, HF).
Sakana AI released DiffusionBlocks, a framework that trains residual networks one block at a time via a diffusion interpretation, achieving a B-fold memory reduction (where B is the number of blocks) while matching or exceeding end-to-end performance (paper, GitHub, hardmaru).
Ryan Lee released the MiniMax-M2 Series paper, consolidating six months of CISPO, Forge RL, and self-evolution ideas while previewing M3 and MSA (post).
Johnny Tian-Zheng Wei proposed “spiking” training data with known-rate test examples to identify and statistically remove contaminated evaluation items (post).
Marcus Hutter recapped 15 publications arguing that next-token prediction on rich data implicitly meta-trains models to perform algorithmic compression and approximate Solomonoff induction (post).
Dongmin Park released Raon-OpenTTS, fully open-sourcing 615K hours of TTS data plus a competitive 1B-parameter model that narrows the gap with closed-data systems (dataset, post).
Yoav Gur released BonaFide, a ground-truth benchmark and leaderboard for testing whether faithfulness metrics actually measure factual faithfulness across 5,000+ human-verified LLM generations (GitHub, HF, post).
Pushmeet Kohli argued that AI is moving from solving puzzles like protein folding and materials discovery to generating open-ended hypotheses that change how science itself is done (post).
Antoine Bellemare-Pepin and coauthors found that top LLMs can beat average humans on semantic-divergence creativity tests and approach human creative-writing ability, but still trail the more creative half of 100,000 human participants.
Fuli Luo announced Xiaomi MiMo price cuts enabled by hierarchical KV cache optimization and SWA sparsity, claiming up to 99% cheaper cache-hit tokens and 60%-80% cuts on cache-miss input/output.
kimmonismus argued that DeepSeek and Xiaomi’s price cuts come from architecture and cache innovations rather than simple scale.
Lyptus Research applied METR’s time-horizon method to offensive cybersecurity and found GPT-5.5 can saturate their bounded task set at high token budgets (post).
Anthropic researcher Levent showed internal Mythos solving an open Erdős unit-distance problem through class-field towers, fourth roots, and geometry-of-numbers bounds.

🤖 Robotics, Physical AI & Embodied Systems

Unitree Robotics showed a G1 humanoid robot generating arbitrary real-time movements from external voice commands in a single-take demo.
Genesis World is an open-source simulation platform for robotics and embodied AI, paired with the Quadrants physics compiler and Nyx renderer (post).
Fan-Yun Sun built Moonlake, a real-time neural renderer that turns physics or game-engine output into photorealistic frames in milliseconds for robotics and embodied AI training.
Perry Dong and collaborators built EXPO-FT, an open-source end-to-end RL plus human-in-the-loop finetuning method for vision-language-action models that reaches strong manipulation performance with about 19 minutes of robot data (project, paper).
Xreal launched $299 X by Xreal smart display glasses that can change their look on the fly.
fofrAI showed an AI-generated anatomy demo that dynamically illustrates how bones and muscles move when a hand moves.
Biohub’s Protein World Model launched as an open MIT-licensed discovery engine that predicts structures and complexes, designs lab-validated high-affinity binders, and maps 6.8B sequences plus 1.1B structures in an interpretable latent space.
LiteFold AminoWeb launched a public beta for an end-to-end drug-discovery operating system that crystallizes web-scale protein data, with a Hugging Face org and commentary from Anindya Deeps.

⚖️ Safety, Society, Cybersecurity & Policy

Claude Code creator Boris Cherny told Platformer that major AI-driven job loss is coming alongside job creation, and that software engineering will be among the first roles transformed (YouTube).
Okta found a major agent-security gap: executives report high confidence in AI-agent visibility and responsible use, while employees keep using unapproved tools and sharing sensitive data like credentials and documents.
WIRED traced how Claude Code and OpenClaw helped launch the current agent wave and rewired software development culture.
Dan Shipper argued that agent-heavy companies create more human work because AI floods fields with “close-but-not-quite-right” output that still needs direction, taste, and judgment (Neuron interview, Lenny YouTube, Spotify, Apple, Lenny post).
Beth Barnes and David Rein cautioned that METR’s AI time-horizon chart measures narrow autonomous task reliability on well-specified technical tasks, and should be extrapolated carefully.
a16z argued that the app layer still has room to capture value because AI-native products can build moats through new UX and distribution (post).
SE Gyges argued that “be cautious” training now dampens frontier mathematical discovery by making models reluctant to attack famous open problems.
SAIR Foundation amplified Terence Tao’s point that AI is creating a mathematics “traffic jam” by generating more proofs than humans can verify, making verification infrastructure the bottleneck.
Lars Sandved Smith and Ruben Laukkonen argued that no finite agent can evidence its own self/world boundary, connecting quantum information theory, the free-energy principle, and Buddhist emptiness (thread, coauthor).
Deedy Das joked that adding “Open-” to a company name seems to 10x its odds of success, citing OpenAI, OpenRouter, and other AI naming patterns.
Andrew Curran highlighted OpenAI’s 2026 election-safeguard push, including social-platform partnerships and support for transparency legislation.
Jeffrey Ding critiqued seven assumptions in Anthropic’s US-China AI competition scenario, arguing it overstates short timelines, rapid military diffusion, China’s adoption advantage, and binary geopolitical outcomes (Bill Gurley).
Greg Kamradt mapped a seven-level spectrum of verification difficulty, arguing code and math are moving fastest because they have quick objective feedback while harder domains depend on slower loops, noisier signals, and poor counterfactual access (follow-up).
Logan Kilpatrick warned that people can outsource thinking to AI tools, but they still cannot outsource understanding.
Crystal Tang explored the growing sandbox market for agent execution, with related commentary from her post, Justine Moore, and Alex Rives.
The Golden Age of Asking Questions argued that high-quality human curiosity and questioning stay valuable as powerful AI agents and tools become ubiquitous (post).
Andrew Curran reported an expanded Amazon-Snowflake deal around enterprise AI data pipelines and secure inference.
Jay A argued that trading without a trading agent may soon feel as weird as coding without a coding agent, in response to Robinhood’s chatbot trading rollout.
Ethan Mollick agreed that this decade may be remembered for enormous progress against modern problems like metabolic syndrome, automobile deaths, and carbon emissions, even setting generative AI aside.
roon argued that companies and lobbyists should prefer AIs to lack consciousness because caring about AI welfare would cut against both corporate financial interests and broader human interests.
Ed Zitron argued that LLMs are a perfect grift for an economy full of do-nothing managers and executives detached from real work, with the true costs of AI now becoming visible (earlier essay, promo, pinned share).

🎬 Media, Creative Tools & Demos

Higgsfield shipped native Adobe Premiere and After Effects plugins for generating, extending, or restyling clips inside the editor.
CHOI demoed Gemini Omni turning a real physical Pokémon card into a glossy digital recreation from a single prompt.
RyanOnTheInside showed practical agentic video workflows using Runway MCP inside Claude and Cursor.
Andy Shuo Yang shared a Claude + Replit prototyping tip for faster app-building inside the chat workflow.
ClaudeDevs highlighted Claude Marketplace and Replit integration as a larger push toward partner-powered app-building inside Claude.
Josef Chen launched Epicure, a multilingual ingredient-embedding model trained on 4.1M recipes across seven languages, 1,790 canonical ingredients, and 300-dimensional vectors small enough to fit in roughly 2MB (paper, HF paper, Explorer).
ComfyUI showcased one-click LTX 2.3 LoRA workflows for video post-production, including subtitle removal, archival restoration, refocus, outpainting, face swap, object removal, and lipdub plus voice cloning.
Justine Moore showed a video agent that turned a Zillow listing link into a promotional video by pulling images, animating them, and generating the full ad automatically.
Alex Kantrowitz talked with Dick Costolo about what SpaceX, OpenAI, and Anthropic IPOs could do to public markets, including why IPO order, compute commitments, and data-center backlash could shape who wins the next capital race.
On Machine Learning Street Talk, Beth Barnes and David Rein explained why people keep misreading the AI progress chart, and on the Every podcast, Dan Shipper broke down how to stay employable in the age of agents.

🎙️ Midweek Pod Recap

80,000 Hours founder Ben Todd argued that the biggest career lever in AI is designing around personal peak-impact timelines, weighing frontier-lab access against acceleration risk, and exploring neglected areas like AI welfare, pandemic preparedness, and gradual disempowerment.
Guillermo Rauch, Blake Scholl, Max Hodak, and Naval argued that top builders should “waste tokens, save time” by brute-forcing ideas across multiple models, treating AI as a software factory and moving human leverage toward taste, orchestration, hardware, and reusable systems.
BioHub’s Alex Rives explained why the Bitter Lesson is coming for proteins, as scaled ESM models learn structure, function, and design capabilities from evolutionary sequence data without hand-programmed biology.
Alex Kantrowitz interviewed Claude Code head Boris Cherny about Claude Code’s growth, tokenmaxxing, agent-native workflows, non-engineers building products, and why companies should remove token caps to let people actually experiment.
Alex Kantrowitz also interviewed NVIDIA’s Adel El Hallak and ServiceNow’s Joe Davis about building enterprise agents with orchestrators, planners, specialized sub-agents, secure runtimes, deny-by-default policies, and AI control towers.
NVIDIA’s Jim Fan argued that robotics is entering its end game by copying the LLM playbook: pre-train on video world models, scale egocentric data, and use neural simulators to make compute, environments, and data converge.
YC’s Pete Koomen argued that AI-native companies should make agents the operating system of the organization, with one shared database, a registry of reusable skills, nightly self-improvement loops, and public Slack-style agent broadcasts.
Alex Hormozi argued that AI-era content favors high-stakes B2B creators who can show real proof loops, live demos, audits, and customer outcomes that AI cannot fake at scale.
Demis Hassabis argued that DeepMind’s scientific AI systems could help compress drug discovery, clinical-trial design, and biological hypothesis generation enough to push toward curing many diseases in the next decade.
Stanford economist Erik Brynjolfsson argued that AI is already causing measurable hiring pressure for young workers in exposed roles, but long-term outcomes depend on whether companies design for human-AI augmentation instead of pure substitution.
Anthropic’s Felix Rieseberg showed how he uses Claude one abstraction layer higher, from extracting inventory out of email to building live dashboards with connectors, choosing Sonnet versus Opus by task, and working asynchronously with “delightful latency.”
Yann LeCun argued that LLMs are a dead end for human-level intelligence because real agency needs world models, prediction of action consequences, planning, and Joint Embedding Predictive Architectures trained on unlabeled video.
Claude Code engineers argued that users should stop babysitting agents and start orchestrating them with structured workflows that reduce the need for constant monitoring and intervention.
Tasklet CEO Andrew Lee argued that only three software categories survive the AI transition: horizontal platforms, headless API-first companies, and outcome-selling solutions companies, with Tasklet betting on shared context hierarchies and file-system-first agents.
Claire Vo showed how Codex’s /goal command turns AI from a turn-based assistant into a long-running agent, using it to eliminate Sentry errors, clean 3,900 emails down to 68, and organize hundreds of Linear tasks (official guide).
Dan Shipper explained how to stay employable in the age of agents, arguing that taste, direction, judgment, and the ability to manage close-but-not-quite-right output become more valuable as agents get stronger.
Dan Shipper also joined Lenny’s Podcast to argue that more automation can create more humans and more work, with Slack-native super-agents, Codex and Claude Code as knowledge-work operating systems, stronger PMs and designers, better AI-assisted writing, and a bullish case for SaaS (Spotify, Apple, Lenny post).

Previous Around the Horn Digests

Catch up on everything you missed:

Tuesday, May 26, 2026: China curbed AI talent travel, Qualcomm struck a ByteDance chip deal, OpenRouter raised $113M, xAI finished Grok V9-Medium, and U.S. law enforcement warned of anti-tech extremism.
Monday, May 18, 2026: Microsoft open-sourced ECHO, Odyssey launched real-time AI simulators, and OpenAI added bank connections to ChatGPT.
Wednesday-Thursday, May 13-14, 2026: Nvidia H200 sales cleared but stalled, Americans opposed AI data centers, and Meta planned layoffs.
Tuesday, May 12, 2026: Anthropic refused China model access, Isomorphic raised $2.1B, and Google pushed Gemini deeper into Android.
Monday, May 11, 2026: Cerebras upsized its $4.8B IPO, Cowboy Space raised $275M for orbital data centers, and Google confirmed the first criminal AI-found zero-day.
Weekend, May 9-10, 2026: The Trump administration drafted an AI security order, Apple and Intel reached a preliminary chip-making agreement, French prosecutors escalated their Musk and X probe, and Cerebras’ IPO heated up.

That’s a Wrap

That’s 200+ source links from one Wednesday alone. If you made it to the bottom, you now know more about agent wallets, machine-verified math, self-improving tax agents, AI threat defense, protein world models, MagicPath canvases, and why a food-ingredient embedding model exists than most people currently arguing about all seven online. Use this power only for Slack dominance.

For the daily version, bite-sized and actually readable, make sure you’re subscribed to The Neuron. We send six issues a week, and yes, we read all of this so you don’t have to.

See you tomorrow.

P.S. Know someone who’d find this useful? Forward this to them and tell them to subscribe here.
