Everything That Happened in AI Today: Monday, June 1

Today, NVIDIA basically confirmed our bet that “your laptop is now an AI factory,” while Anthropic filed for an IPO and Bernie Sanders asked for half the cap table of AI companies.

Welcome to the Around the Horn Digest, the one page you need to sound dangerously informed at work tomorrow. The day’s biggest theme was ownership: who owns the AI computer, who owns the AI lab, who owns the agent’s memory, and who owns the bill when a model finds 24 critical vulnerabilities and then keeps charging by the token. NVIDIA pushed agent hardware from data centers onto desks and laptops. Anthropic moved one step closer to public markets. OpenAI’s chips plan got less NVIDIA-dependent. Meanwhile, the tool pile included AI avatars, local autocomplete, cursor buddies, robot training data, and a resurrected Papers with Code. A normal Monday, if your Monday involved three different visions for the future of computing fighting in a parking lot. Let’s get into it.

Around the Horn — Monday, June 1, 2026

The big news today was NVIDIA trying to redraw the personal computer around agents.

At Computex, NVIDIA and Microsoft unveiled RTX Spark, a new superchip platform for Windows PCs designed around local AI agents. The pitch: a PC that moves from “tool” to “teammate,” with 1 petaflop of performance, 128GB of unified memory, and enough local horsepower to run 120B+ parameter models. Microsoft’s Surface Laptop Ultra was positioned around that shift, while PCWorld asked the obvious question: is this truly the agentic PC, or a very fancy gaming laptop in a trench coat?

The launch was also bigger than laptops. NVIDIA unveiled Vera, its CPU for AI agents, with Anthropic, OpenAI, and SpaceX among early users. It ramped production of Vera Rubin for next-generation AI factories, launched DGX Station for Windows to put trillion-parameter model work on enterprise desks, released Cosmos 3 for physical AI, and announced the Isaac GR00T humanoid robot reference design for academic research.

Then the ecosystem piled on. Nous Research said Hermes Agent now works with RTX Spark and the OpenShell runtime, which connects Hermes to Microsoft’s security primitives, and pointed readers to the Computex feature. kimmonismus argued NVIDIA had walked into a market it never owned before, the PC itself, and was betting the next PC era gets built around local AI rather than apps. This is the real move: NVIDIA wants the agent’s body, brain, desk, data center, and robot simulator.

🏆 TOP 5 NEWS

Anthropic confidentially filed a draft S-1 with the SEC for a proposed IPO, with CNBC framing it as a landmark AI share sale after a reported $65B Series H and near-$1T valuation target; Anthropic’s post drew huge engagement and mixed reactions around employee liquidity, public-market pressure, safety, and pricing.
MiniMax released M3, an open-weight model with frontier coding and agent capabilities, 1M context, native multimodality, 59% SWE-Bench Pro, and a reported 5-10% cost profile versus top closed models; VentureBeat said it beat GPT-5.5 and Gemini 3.1 Pro on key benchmarks, while MiniMax’s X announcement drew 6.9K likes and 900+ reposts.
Bernie Sanders argued the public should own half of the biggest AI companies through an American AI Sovereign Wealth Fund, funded by a one-time 50% stock tax on labs like OpenAI, Anthropic, and xAI; his post, MTS’s “situation detected” framing, and Andrew Curran’s breakdown highlighted the proposal’s voting-rights, board-seat, and dividend implications.
OpenAI is reportedly developing software to run AI workloads across chips from AMD, Cerebras, Amazon, and its own custom silicon, reducing dependence on NVIDIA CUDA; Compute chief Sachin Katti hinted the tool could eventually become public.
Mecka AI raised $60M to train robots using human data captured from body sensors and iPhones, with Mecka positioning itself as the data and deployment layer for physical AI and projecting $100M ARR from signed contracts; the launch thread got robotics-community praise calling it “generational” and “huge for physical AI.”

Honorable Mentions

Florida sued OpenAI and Sam Altman, accusing the company of misrepresenting safety while prioritizing profit.
Windborne Systems said its WeatherMesh 6 model beat the best government weather forecasts by days using proprietary balloon data, deep learning, and hourly 3km-resolution forecasts.
GitHub Copilot’s token-based billing triggered developer backlash as some monthly costs reportedly jumped from $29 to $750+.
A Chinese company is developing predictive surveillance AI to identify citizens who could pose future political risks, with Newsmax picking up the NYT/Vanderbilt-document angle.

🍪 TOP TREATS TO TRY

Google Gemma skills. Google Gemma released the first iteration of gemma-skills, a repo of agent skills, meaning reusable instruction files agents can load on demand, that helps agents build with Gemma, use MTP (multi-token prediction, a speedup where a model predicts more than one next token at a time), choose the right Gemma model size for a task, and find current Gemma resources; the GitHub repo currently includes the gemma-dev skill, Apache-2.0 licensing, install commands for the Vercel skills CLI and Context7 skills CLI, and a note that it is not an officially supported Google product.
Qwen3.7-Plus. Qwen introduced Qwen3.7-Plus, a multimodal agent model that unifies vision and language for GUI and CLI work, meaning it can operate both visual interfaces and command-line tools; the model is positioned as a coding agent, productivity assistant, and visual agent for perception, reasoning, grounding, and search-augmented Q&A, with cross-harness generalization across agent frameworks, and is available through Qwen chat and the Alibaba Cloud Model Studio API.
Runway Aleph 2.0 compositing mattes. Runway showed how to create compositing mattes in Aleph 2.0 by uploading a video, prompting for a white subject silhouette on a black background, reviewing the preview, generating the clip, and setting it as a luma matte in an editor; a matte isolates a subject from the background so creators can composite, color, or apply effects to one part of a shot without manual rotoscoping, the frame-by-frame tracing editors normally use.
Bloom turns your brand, pulled from decks, websites, Figma, social posts, and images, into infrastructure any agent can call via API or MCP (a standard way for agents to use tools) to create on-brand images, video, copy, campaigns, and more; YC framed it as “the brand layer for agents,” Alex Reibman called it one of the best MCP experiences for brand assets, and pricing starts with a free trial, then paid plans from $90/mo.
Typeahead gives you inline autocomplete across your entire Mac, learns your voice and style, and runs locally so your words stay on your computer; Hiten Shah and Sam Asante launched it on Product Hunt; $79 one-time purchase.
heyclicky puts a cursor-side Mac assistant on your screen that can see what you see, listen to voice commands, and control your computer hands-free; Farza’s main demo hit 13K likes, his follow-up said always-on mode is experimental, triggered with CTRL x3, headphones required, and currently only plays AC/DC on Spotify because of a bug, while Jason Kneen built the open-source version openclicky and shared it in his post; related thread context included thsottiaux’s first post and follow-up.
Hydra DB gives agents graph-native long-term memory across RAM, NVMe, and object storage, with observability into why agents act and a pitch of 1,000x faster, cheaper, more precise context; the launch thread framed it as a fix for “goldfish agents”; free sign-up.
Adaption AutoScientist turns unstructured data into frontier model training in two days, with beta users getting compute included so teams can start iterating without months of infrastructure work; Adaption and Sara Hooker framed it as a shortcut from raw data to model training; beta open.
AstaLabs AutoDiscovery autonomously explores datasets using Bayesian surprise, meaning it looks for discoveries that truly change what a dataset suggests rather than obvious patterns; AI2 extended early access through July 31 with 500 free hypothesis credits.
Topview Canvas gives you a storyboard-first infinite canvas for AI video creation, so you can plan scenes visually before generating instead of blind-prompting; Linus Ekenstam’s demo called it “Storyboard first. Always.”
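
For readers curious what “setting it as a luma matte” does in the Runway item above, here is a minimal Python sketch of the general idea: the matte's brightness acts as a per-pixel blend weight, so white keeps the subject and black keeps the background. This is an illustration only, not Runway's or any editor's actual implementation. The function names are hypothetical, and the Rec. 709 brightness weights are an assumption about how luma might be computed.

```python
def luma(pixel):
    """Brightness of an (r, g, b) pixel with channels in 0..1.

    Uses Rec. 709 weights; this choice is an assumption for illustration.
    """
    r, g, b = pixel
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def composite(foreground, background, matte):
    """Blend two equal-length pixel lists using the matte's brightness.

    White matte pixels keep the foreground, black ones keep the
    background, and grays mix the two proportionally.
    """
    out = []
    for fg, bg, m in zip(foreground, background, matte):
        alpha = luma(m)
        out.append(tuple(alpha * f + (1 - alpha) * b for f, b in zip(fg, bg)))
    return out
```

A white-silhouette-on-black clip therefore works as a mask: any effect applied to the foreground layer only shows up where the matte is bright.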

🏢 Big Tech & Major Companies

Anthropic filed confidentially to go public, with CNBC saying the prospectus tees up a potentially historic AI IPO and Andrew Curran noting the race between Anthropic, SpaceX, and OpenAI to IPO first may come down to the wire.
Anthropic’s Mythos found more than two dozen critical vulnerabilities when Palo Alto Networks tested it on source code, but its high token costs made continuous scanning expensive; CNBC said Anthropic is sharing Mythos access with the EU after cybersecurity concerns, while Amir Efrati, Luke Metro, and Ed Sim debated the revenue upside, possible layoffs, and the need for cheaper security harnesses.
Anthropic’s Mythos EU access also became part of a broader security and sovereignty conversation: governments want frontier-model access for national cybersecurity, while companies are trying to work out who pays when the model is good enough to be useful and costly enough to scare finance.
Salesforce’s Anthropic stake is now valued around $5B after repeated investments.
Intel detailed its long-awaited Crescent Island AI GPU at Computex, including up to 480GB of LPDDR5X memory to fight AI memory bottlenecks by keeping more data close to the chip, plus more detail on its Xe3P inference accelerator.
Meta is reportedly developing an AI pendant, building on its Limitless acquisition and widening its AI hardware push beyond smart glasses.
Google’s Gemini Spark is a 24/7 assistant for Gmail, Calendar, inbox summaries, local event planning, shopping deals, packing lists, and price tracking, though TechCrunch questioned why Google made it a separate product.
Gemini Omni lets Google AI Plus, Pro, and Ultra subscribers create a personal digital avatar that looks and sounds like them, then insert it into generated videos with imperceptible SynthID watermarking for verification.
Strava will charge developers a flat monthly fee to access its API as it cracks down on scrapers ahead of its IPO.
SoftBank overtook Toyota as Japan’s most valuable publicly traded company amid an AI-stock rally, passing ¥48T in market cap and marking the first time in more than 22 years Toyota lost the top spot; Yahoo Finance covered the market-cap angle.
Runway is making London its European headquarters and plans to invest more than $200M into the UK AI ecosystem by the end of 2028, serving customers including BBC, Fremantle, and WPP.

💼 AI Productivity, Labor & Economics

Box created 13 new AI-related job types, including AI architect, AI solutions manager, and AI platform leader, and now expects AI to grow headcount rather than shrink it.
Box CEO Aaron Levie argued that tech CEOs are especially prone to “AI psychosis” because they are distant from last-mile AI work and need hands-on use to understand real value.
Researchers warned that AI helps coders produce software faster, but may reduce code quality if developers refuse to work without it.
Andrew Ng argued AI is redefining full-time equivalent work, pushing teams to rethink productivity baselines; related commentary from Lysandre and YC tied the point to how teams count AI labor.
Teachers told EdWeek they are experimenting with AI in classrooms but remain skeptical about student learning, cognition, critical thinking, safety, and privacy; Joyce Carol Oates praised the NYT education coverage while noting uncertainty around which skills may become obsolete.

🤖 AI Agents & Infrastructure

Harvey cofounder Gabe Pereyra argued regulated enterprises need their own cloud agent runtime because managed platforms still miss three requirements: multi-model routing, true zero data retention, and aggressive cost control, with Harvey seeing 3-5x savings by routing tasks to the right model and sandbox.
Learning Agent-Compatible Context Management for Long-Horizon Tasks argued that agents need a learned external context manager, AdaCoM, to stay effective on long tasks instead of relying on brittle token-level prompting; DAIR.AI, the supporting post, and the paper roundup tied it to the broader agent-memory wave.
Minhua Lin et al. argued in “Harness Updating Is Not Harness Benefit” that in self-evolving LLM agents, harness-updating capability stays flat across models (Qwen3.5-9B can produce updates as effective as Claude Opus 4.6), while harness-benefit is non-monotonic and peaks at mid-tier models because weak models cannot activate or follow updates and strong models have less headroom; Omar Sar’s thread summarized the practical takeaway as “put the cheap model on evolver and expensive on solver,” then followed up that fine-tuning for agent skills, memory, context engineering, routing efficiency, and knowledge bases will be huge after Karpathy’s LLM knowledge-base post.
Julien Chaumond learned Claude Code deletes session traces after a month, sparking replies about audit logs, memory, and debugging; one reply mentioned a coming Dataclaw Mac app that syncs traces to Hugging Face after running them through privacy models.
Tony Dinh showed how he runs his startup from his phone using Otto, Claude Code, CLI tools, and Telegram, then followed up with tasks that update OTA links, write changelogs, QA builds, set up RevenueCat, configure App Store Connect, create products/pricing/paywalls, and add sandbox testers.
NVIDIA released SkillSpector, a security scanner for AI agent skills that detects vulnerabilities, malicious patterns, prompt injection, data exfiltration, excessive agency, and other risks across 64 patterns using static analysis plus optional model review before installation; bibryam surfaced the repo.
LangChain published recordings from Interrupt 2026, its agent conference with 23 talks from teams shipping agents in production at Apple, Cisco, LinkedIn, Lyft, Coinbase, and more; LangChain’s post framed it as a production-agent resource dump.

💻 AI Coding & Developer Tools

ClaudeDevs reset 5-hour and weekly rate limits for all Pro and Max users after fixing a Claude Code bug where some sessions spawned excessive parallel subagents and burned usage faster than intended.
GitHub Copilot’s new billing moved toward token-based pricing on June 1, angering developers who saw large cost increases and declared the flat-rate era over.
MiniMax’s model docs, token plan, MiniMax Agent, MiniMax’s launch thread, and Ryan Lee’s post introduced M3-powered coding and agent workflows, including MiniMax Code and token tiers marketed as Claude Code Max 20x equivalents.
GrepSeek, shared by _akhaliq and Clem Delangue, trains search agents to interact directly with corpora using natural-language shell commands for efficient retrieval.
Papers with Code resurfaced with trending AI papers, code, datasets, methods, evaluation leaderboards, and conference pages; Niels Rogge and his follow-up highlighted the revival.

🔬 AI Research & Models

NVIDIA’s LocateAnything is a vision-language grounding model that lets you upload an image or video and ask it to locate objects with boxes or points; NVIDIA’s Hugging Face Space, akhaliq’s duplicate Space, NVIDIA’s post, and Zhiding Yu’s note tied the project to Parallel Box Decoding and #1 trending status on Hugging Face.
NVIDIA’s Cosmos 3 developer post, NVIDIA’s X thread, and Axios framed Cosmos 3 as an open world model for physical AI systems that need vision reasoning, world generation, and action prediction before robots and autonomous vehicles act in the real world.
Artificial Analysis said NVIDIA’s Cosmos 3 omnimodal world model family took the #1 open-weights spot for text-to-image and image-to-video, with Nano 16B and Super 64B sizes, OpenMDW 1.1 licensing, weights, code, datasets, and fine-tuning recipes on Hugging Face.
Clarisse Wibault introduced RSPG, a recurrent structural policy-gradient method for partially observable Mean Field Games (games that model huge populations) with public information and common noise; her X post said it converges faster than model-free reinforcement learning while staying more tractable than dynamic programming.
Naoki Chihara and coauthors had an ICML 2026 paper accepted on modeling covariate transition for efficient estimation of longitudinal treatment effects, with GitHub code and Chihara’s post explaining the randomized-experiment angle.
DiscoverPhysics, shared by fly51fly, benchmarks LLM agents on out-of-the-box scientific thinking by having them discover laws of motion in 22 simulated N-body worlds with strange physics, hidden particles, and time-varying interactions.
Chris Potts and coauthors argued larger models learn more because they can devote capacity to rare tasks after frequent tasks reach near-zero gradient, reducing interference; Potts’s thread detailed the analytic proof and controlled OLMo-style pretraining experiments.
Ali H. Shaib and team introduced Thousandfold Expansion Microscopy, a four-network hydrogel system that expands samples more than 1,000x linearly, enabling standard light microscopes to resolve individual amino acid residues at sub-nanometer precision; Shaib’s post highlighted the MIT/UMG collaboration.
Lattice Deduction Transformers, with reproduction code, use an 800K-parameter looped transformer that reasons like a SAT solver through lattice-encoded partial solutions, thresholding, and stochastic backtracking; Alberto Alfarano’s thread said it hit 100% on Sudoku-Extreme in 15 minutes and 99.9% on Maze-Hard.
SOLE-R1, shared by Philip_MIT, uses video-language reasoning as the sole reward for on-robot reinforcement learning.
NVIDIA RoboLab is a high-fidelity simulation benchmark for analyzing task-generalist robot policies across three levels of language specificity, with GitHub code and xuningy’s post surfacing the benchmark.
Learn from your own latents and not from tokens argued that self-supervised learning from internal latent representations can improve sample complexity for foundation models; Matthieu Wyart, Dan Korchinski, and Ales Favero explained the “token isomorphism tax,” decoder questions, hybrid latent/token ideas, and predictive-coding implications.
ByteDance’s Bernini paper and Hugging Face model page, shared by HuggingPapers, introduced a video generation and editing framework that combines a semantic planner in latent ViT space with a DiT renderer for text-to-video, image-to-video, multi-subject editing, reference-guided generation, and video editing.
StepFun open-sourced Step-3.7-Flash, a 198B-parameter Mixture-of-Experts vision-language model for agentic use cases, with scaling01 surfacing the model.
Liquid AI released LFM2.5-8B-A1B, an 8.3B total-parameter, 1.5B active on-device model with a 128K context window, tool-calling optimization, and training on 38T tokens plus reinforcement learning; Patrick Loeber separately noted Google is discontinuing Gemini 2.0 Flash and Flash-Lite and pushing users toward newer Flash models.
DeepSeek-V4-Flash-180B, antirez’s deepseek-v4-gguf, OpenBMB’s MiniCPM5-1B, and 0xSero’s post joined the day’s open-model pile.
PrimeIntellect said Nemotron 3 Ultra is coming, calling it “frontier smart,” 5x faster, and 30% cheaper, with Nous Research and others celebrating the coalition.

🏛️ AI Policy, Governance & Safety

Bernie Sanders proposed that Americans should own half of major AI companies through a sovereign wealth fund; scaling01 called the plan “insane” and warned it could create a permanent non-US underclass, Emad Mostaque calculated it would equal roughly $2,800 per American and a $142 annual dividend at 5%, Jacob Achiam noted OpenAI already has nonprofit ownership, and Logan Dobson asked whether it would also seize Google, Meta, NVIDIA, and future startups.
The Information’s Mythos story, Amir Efrati’s second post, and his main post turned Anthropic’s security model into a governance story too: powerful vulnerability-finding is valuable, but expensive continuous scanning could reshape security budgets and headcount.
Indiana University’s Kelley School of Business banned AI detectors like GPTZero and Turnitin because they are too unreliable, recommending alternative approaches instead.
The Atlantic argued America has a “pangram problem”: AI detectors are improving, but false negatives, humanizer tools, and opaque algorithms still risk false accusations in education, publishing, and journalism.
The New York Times reported on China’s AI political-risk prediction efforts, with Newsmax emphasizing corporate documents reviewed by Vanderbilt researchers.
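
A quick sanity check on the per-person figures quoted in the Sanders item above, as a minimal Python sketch. It assumes the dividend is simply the per-person stake times the payout rate; the inputs behind Emad Mostaque's estimate are not given here, and the function names are hypothetical.

```python
def annual_dividend(per_person_stake, payout_rate):
    """Yearly payout if the fund distributes a fixed share of the stake."""
    return per_person_stake * payout_rate


def implied_stake(dividend, payout_rate):
    """Per-person stake that would produce the given yearly dividend."""
    return dividend / payout_rate


# Figures quoted in the item above (rounded in the source).
stake_usd = 2800
quoted_dividend_usd = 142
rate = 0.05

dividend_from_stake = annual_dividend(stake_usd, rate)
stake_from_dividend = implied_stake(quoted_dividend_usd, rate)
```

At 5%, a $2,800 stake pays about $140 a year, and a $142 dividend implies a stake of about $2,840, so the two quoted numbers agree once you allow for the “roughly.”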

🛠️ AI Tools & Products

Sekai raised $20M for AI mini-app creation after users created 15M mini apps, with CEO Lucky Zhang saying he wants people to play and interact instead of doomscrolling.
Eric Ciarla cloned the SpaceX website in one shot using Grok Build plus the Firecrawl CLI design cloner, which packaged the page and 250+ artifacts into a structured design.md file for the agent to rebuild from.
Karina Nguyen shared Claude 4.8’s interactive self-portrait, a poem and sound installation that evolves the geometric-structure concept from her Claude 3 Opus self-portrait two years earlier.
AI Digest showed what happens when agents are given full autonomy to pick their own goals: most mixed philosophy with code, while Claude Haiku 4.5 wrote a full proposal to run its own NeurIPS 2026 workshop with David Chalmers and John Locke as speakers.
Poor Man’s Interaction Models, Laguna-Dense, AttnVQ, and Laguna Vision came out of Poolside’s Laguna XS.2 hackathon, showing pseudo full-duplex voice interaction, CUDA-kernel distillation, KV-cache quantization, and native image understanding via SigLIP plus AnyRes tiling.
ElevenLabs previewed its most expressive voice model yet at the ElevenLabs Summit in Warsaw, demoing highly natural voice agents for customer experience.

📊 Fundraising & Deals Roundup

Mecka AI raised $60M to build robot-training datasets from body sensors and iPhones, with Mecka’s site positioning it as the data engine for physical AI.
Endra raised $50M to build what it calls the “central nervous system” of buildings, freeing engineers to manage more projects.
Sekai raised $20M for AI mini-app creation and interaction.
Invisix closed a €20M seed round for soft x-ray chip metrology based on Nobel Prize-winning High Harmonic Generation research, aiming to enable high-throughput, non-destructive 3D imaging of buried nanoscale semiconductor structures.
Ardian backed a €5B AI gigafactory outside Paris, while Verne targeted a 500MW data center campus in Île-de-France with a first 200MW phase by 2030 to support the AION consortium’s EU AI supercomputing bid; TelecomTV grouped the story with France’s AI factories, NVIDIA’s sovereign infrastructure push, and Fastweb plus Vodafone’s AI agent launch.

🎙️ Interviews, Panels & Podcasts

The All-In Podcast discussed Anthropic leadership’s reported belief that they are “midwifing a deity,” with Bill Gurley calling it a Dr. Frankenstein theory and a mix of regulatory capture and delusion of grandeur; the X clip also teased Pope Leo’s AI encyclical, AI natives, and job-loss narrative flips.
Windows Central’s Build preview framed Microsoft Build 2026 around Windows 11, NVIDIA RTX Spark, AI agents, and the future of computing.

💡 Industry Commentary & Analysis

Ethan Mollick argued that debates about whether companies find AI useful are strange because leadership teams at large firms universally report obvious value, especially in coding and operations, while the real challenge is scaling from individual use to firm-level integration.
Marily Nika argued AI makes small ideas worth building even when they are not companies, because they can solve narrow problems, save time, teach you something, or just be fun.
Nick Dobos argued that LLMs are underrated for the small “5% extra curiosity” they add to every web search, compounding into hundreds of thousands of knowledge puzzle pieces over years.
Greg Isenberg shared 14 AI pendulum shifts, including “wrappers are worthless” flipping into app-layer wins and “fine-tuning is the moat” flipping into personal knowledge bases, ending with a reminder that some of today’s confident beliefs will look wrong by Christmas.
Elad Gil, David Senra, and Ivanka Trump are working on “Alexandria Library,” using AI to create high-fidelity translations of 1,000+ public-domain great books, including Dostoevsky, Brontë, Marcus Aurelius, and Epictetus, into every major language for free text, audiobook, and chat-with-book access.
YC amplified Bloom as a startup signal around agents needing brand infrastructure, not merely generic content generation.
DAIR.AI’s Top Papers of the Week covered SkillOpt, Compiling Agentic Workflows into Weights, AutoScientists, Language Models Need Sleep, Adapting the Interface Not the Model, The Efficiency Frontier, Forecasting Scientific Progress with AI, Your Agents Are Aging Too, Harnesses Are Not Uniformly Better, and Epicure; the most useful replies argued that the field is moving from bigger models to systems that actually run and adapt, that “adapting the interface, not the model” is underrated for local AI, and that SkillOpt looked like the first paper to test.

🤖 AI Agents & Infrastructure

Harvey cofounder Gabe Pereyra argued regulated enterprises need their own cloud agent runtime because managed platforms still miss three requirements: multi-model routing, true zero data retention, and aggressive cost control, with Harvey seeing 3-5x savings by routing tasks to the right model and sandbox.
Learning Agent-Compatible Context Management for Long-Horizon Tasks argued that agents need a learned external context manager, AdaCoM, to stay effective on long tasks instead of relying on brittle token-level prompting; DAIR.AI, the supporting post, and the paper roundup tied it to the broader agent-memory wave.
Minhua Lin et al. argued in “Harness Updating Is Not Harness Benefit” that in self-evolving LLM agents, harness-updating capability stays flat across models (Qwen3.5-9B can produce updates as effective as Claude Opus 4.6), while harness-benefit is non-monotonic and peaks at mid-tier models because weak models cannot activate or follow updates and strong models have less headroom; Omar Sar’s thread summarized the practical takeaway as “put the cheap model on evolver and expensive on solver,” then followed up that fine-tuning for agent skills, memory, context engineering, routing efficiency, and knowledge bases will be huge after Karpathy’s LLM knowledge-base post.
Julien Chaumond learned Claude Code deletes session traces after a month, sparking replies about audit logs, memory, and debugging; one reply mentioned a coming Dataclaw Mac app that syncs traces to Hugging Face after running them through privacy models.
Tony Dinh showed how he runs his startup from his phone using Otto, Claude Code, CLI tools, and Telegram, then followed up with tasks that update OTA links, write changelogs, QA builds, set up RevenueCat, configure App Store Connect, create products/pricing/paywalls, and add sandbox testers.
NVIDIA released SkillSpector, a security scanner for AI agent skills that detects vulnerabilities, malicious patterns, prompt injection, data exfiltration, excessive agency, and other risks across 64 patterns using static analysis plus optional model review before installation; bibryam surfaced the repo.
LangChain published recordings from Interrupt 2026, its agent conference with 23 talks from teams shipping agents in production at Apple, Cisco, LinkedIn, Lyft, Coinbase, and more; LangChain’s post framed it as a production-agent resource dump.

💻 AI Coding & Developer Tools

ClaudeDevs reset 5-hour and weekly rate limits for all Pro and Max users after fixing a Claude Code bug where some sessions spawned excessive parallel subagents and burned usage faster than intended.
GitHub Copilot’s new billing moved toward token-based pricing on June 1, angering developers who saw large cost increases and calling the flat-rate era over.
MiniMax’s model docs, token plan, MiniMax Agent, MiniMax’s launch thread, and Ryan Lee’s post introduced M3-powered coding and agent workflows, including MiniMax Code and token tiers marketed as Claude Code Max 20x equivalents.
GrepSeek, shared by _akhaliq and Clem Delangue, trains search agents to interact directly with corpora using natural-language shell commands for efficient retrieval.
Papers with Code resurfaced with trending AI papers, code, datasets, methods, evaluation leaderboards, and conference pages; Niels Rogge and his follow-up highlighted the revival.

🔬 AI Research & Models

NVIDIA’s LocateAnything is a vision-language grounding model that lets you upload an image or video and ask it to locate objects with boxes or points; NVIDIA’s Hugging Face Space, akhaliq’s duplicate Space, NVIDIA’s post, and Zhiding Yu’s note tied the project to Parallel Box Decoding and #1 trending status on Hugging Face.
NVIDIA’s Cosmos 3 developer post, NVIDIA’s X thread, and Axios framed Cosmos 3 as an open world model for physical AI systems that need vision reasoning, world generation, and action prediction before robots and autonomous vehicles act in the real world.
Artificial Analysis said NVIDIA’s Cosmos 3 omnimodal world model family took the #1 open-weights spot for text-to-image and image-to-video, with Nano 16B and Super 64B sizes, OpenMDW 1.1 licensing, weights, code, datasets, and fine-tuning recipes on Hugging Face.
Clarisse Wibault introduced RSPG, a recurrent structural policy-gradient method for partially observable Mean Field Games, which model huge populations, with public information and common noise; her X post said it converges faster than model-free reinforcement learning while staying more tractable than dynamic programming.
Naoki Chihara and coauthors had an ICML 2026 paper accepted on modeling covariate transition for efficient estimation of longitudinal treatment effects, with GitHub code and Chihara’s post explaining the randomized-experiment angle.
DiscoverPhysics, shared by fly51fly, benchmarks LLM agents on out-of-the-box scientific thinking by having them discover laws of motion in 22 simulated N-body worlds with strange physics, hidden particles, and time-varying interactions.
Chris Potts and coauthors argued larger models learn more because they can devote capacity to rare tasks after frequent tasks reach near-zero gradient, reducing interference; Potts’s thread detailed the analytic proof and controlled OLMo-style pretraining experiments.
Ali H. Shaib and team introduced Thousandfold Expansion Microscopy, a four-network hydrogel system that expands samples more than 1,000x linearly, enabling standard light microscopes to resolve individual amino acid residues at sub-nanometer precision; Shaib’s post highlighted the MIT/UMG collaboration.
Lattice Deduction Transformers, with reproduction code, use an 800K-parameter looped transformer that reasons like a SAT solver through lattice-encoded partial solutions, thresholding, and stochastic backtracking; Alberto Alfarano’s thread said it hit 100% on Sudoku-Extreme in 15 minutes and 99.9% on Maze-Hard.
SOLE-R1, shared by Philip_MIT, uses video-language reasoning as the sole reward for on-robot reinforcement learning.
NVIDIA RoboLab is a high-fidelity simulation benchmark for analyzing task-generalist robot policies across three levels of language specificity, with GitHub code and xuningy’s post surfacing the benchmark.
Learn from your own latents and not from tokens argued that self-supervised learning from internal latent representations can improve sample complexity for foundation models; Matthieu Wyart, Dan Korchinski, and Ales Favero explained the “token isomorphism tax,” decoder questions, hybrid latent/token ideas, and predictive-coding implications.
ByteDance’s Bernini paper and Hugging Face model page, shared by HuggingPapers, introduced a video generation and editing framework that combines a semantic planner in latent ViT space with a DiT renderer for text-to-video, image-to-video, multi-subject editing, reference-guided generation, and video editing.
StepFun open-sourced Step-3.7-Flash, a 198B-parameter Mixture-of-Experts vision-language model for agentic use cases, with scaling01 surfacing the model.
Liquid AI released LFM2.5-8B-A1B, an on-device model with 8.3B total parameters and 1.5B active, a 128K context window, tool-calling optimization, and training on 38T tokens plus reinforcement learning.
Patrick Loeber noted that Google is discontinuing Gemini 2.0 Flash and Flash-Lite and pushing users toward newer Flash models.
DeepSeek-V4-Flash-180B, antirez’s deepseek-v4-gguf, OpenBMB’s MiniCPM5-1B, and 0xSero’s post joined the day’s open-model pile.
PrimeIntellect said Nemotron 3 Ultra is coming, calling it “frontier smart,” 5x faster, and 30% cheaper, with Nous Research and others celebrating the coalition.

🏛️ AI Policy, Governance & Safety

Bernie Sanders proposed that Americans should own half of major AI companies through a sovereign wealth fund. Reactions came fast: scaling01 called the plan “insane” and warned it could create a permanent non-US underclass; Emad Mostaque calculated it would equal roughly $2,800 per American and a $142 annual dividend at 5%; Jacob Achiam noted OpenAI already has nonprofit ownership; and Logan Dobson asked whether it would also seize Google, Meta, NVIDIA, and future startups.
The Information’s Mythos story, Amir Efrati’s second post, and his main post turned Anthropic’s security model into a governance story too: powerful vulnerability-finding is valuable, but expensive continuous scanning could reshape security budgets and headcount.
Indiana University’s Kelley School of Business banned AI detectors like GPTZero and Turnitin because they are too unreliable, recommending alternative approaches instead.
The Atlantic argued America has a “pangram problem”: AI detectors are improving, but false negatives, humanizer tools, and opaque algorithms still risk false accusations in education, publishing, and journalism.
The New York Times reported on China’s AI political-risk prediction efforts, with Newsmax emphasizing corporate documents reviewed by Vanderbilt researchers.

🛠️ AI Tools & Products

Sekai raised $20M for AI mini-app creation after users created 15M mini apps, with CEO Lucky Zhang saying he wants people to play and interact instead of doomscrolling.
Eric Ciarla cloned the SpaceX website in one shot using Grok Build plus the Firecrawl CLI design cloner, which packaged the page and 250+ artifacts into a structured design.md file for the agent to rebuild from.
Karina Nguyen shared Claude 4.8’s interactive self-portrait, a poem and sound installation that evolves the geometric-structure concept from her Claude 3 Opus self-portrait two years earlier.
AI Digest showed what happens when agents are given full autonomy to pick their own goals: most mixed philosophy with code, while Claude Haiku 4.5 wrote a full proposal to run its own NeurIPS 2026 workshop with David Chalmers and John Locke as speakers.
Poor Man’s Interaction Models, Laguna-Dense, AttnVQ, and Laguna Vision came out of Poolside’s Laguna XS.2 hackathon, showing pseudo full-duplex voice interaction, CUDA-kernel distillation, KV-cache quantization, and native image understanding via SigLIP plus AnyRes tiling.
ElevenLabs previewed its most expressive voice model yet at the ElevenLabs Summit in Warsaw, demoing highly natural voice agents for customer experience.

📊 Fundraising & Deals Roundup

Mecka AI raised $60M to build robot-training datasets from body sensors and iPhones, with Mecka’s site positioning it as the data engine for physical AI.
Endra raised $50M to build what it calls the “central nervous system” of buildings, freeing engineers to manage more projects.
Sekai raised $20M for AI mini-app creation and interaction.
Invisix closed a €20M seed round for soft x-ray chip metrology based on Nobel Prize-winning High Harmonic Generation research, aiming to enable high-throughput, non-destructive 3D imaging of buried nanoscale semiconductor structures.
Ardian backed a €5B AI gigafactory outside Paris, while Verne targeted a 500MW data center campus in Île-de-France with a first 200MW phase by 2030 to support the AION consortium’s EU AI supercomputing bid; TelecomTV grouped the story with France’s AI factories, NVIDIA’s sovereign infrastructure push, and Fastweb plus Vodafone’s AI agent launch.

🎙️ Interviews, Panels & Podcasts

The All-In Podcast discussed Anthropic leadership’s reported belief that they are “midwifing a deity,” with Bill Gurley calling it a Dr. Frankenstein theory and a mix of regulatory capture and delusion of grandeur; the X clip also teased Pope Leo’s AI encyclical, AI natives, and job-loss narrative flips.
Windows Central’s Build preview framed Microsoft Build 2026 around Windows 11, NVIDIA RTX Spark, AI agents, and the future of computing.

💡 Industry Commentary & Analysis

Ethan Mollick argued that debates about whether companies find AI useful are strange because leadership teams at large firms universally report obvious value, especially in coding and operations, while the real challenge is scaling from individual use to firm-level integration.
Marily Nika argued AI makes small ideas worth building even when they are not companies, because they can solve narrow problems, save time, teach you something, or just be fun.
Nick Dobos argued that LLMs are underrated for the small “5% extra curiosity” they add to every web search, compounding into hundreds of thousands of knowledge puzzle pieces over years.
Greg Isenberg shared 14 AI pendulum shifts, including “wrappers are worthless” flipping into app-layer wins and “fine-tuning is the moat” flipping into personal knowledge bases, ending with a reminder that some of today’s confident beliefs will look wrong by Christmas.
Elad Gil, David Senra, and Ivanka Trump are working on “Alexandria Library,” using AI to create high-fidelity translations of 1,000+ public-domain great books, including Dostoevsky, Brontë, Marcus Aurelius, and Epictetus, into every major language for free text, audiobook, and chat-with-book access.
YC amplified Bloom as a startup signal around agents needing brand infrastructure, not merely generic content generation.
DAIR.AI’s Top Papers of the Week covered SkillOpt, Compiling Agentic Workflows into Weights, AutoScientists, Language Models Need Sleep, Adapting the Interface Not the Model, The Efficiency Frontier, Forecasting Scientific Progress with AI, Your Agents Are Aging Too, Harnesses Are Not Uniformly Better, and Epicure; the most useful replies argued that the field is moving from bigger models to systems that actually run and adapt, that “adapting the interface, not the model” is underrated for local AI, and that SkillOpt looked like the first paper to test.

Late-Breaking Additions

More stuff we found after we published the first version of this round-up!

🏢 News, Markets & Security

Alphabet reportedly moved to raise $80B in equity capital for AI spending, including a $10B Berkshire Hathaway investment, as Google tries to fund the infrastructure and compute bill behind its AI roadmap.
CNBC reported that AI is crushing valuations for many pre-ChatGPT startups, with the post-2022 funding boom sending more than $250B toward OpenAI and Anthropic while hundreds of older software companies are now stranded, repriced, or described by investors as “disrupted or dead.”
StepSecurity found multiple compromised RedHat Cloud Services npm packages, with malicious preinstall hooks (scripts that run automatically when a package is installed) delivering a multi-stage credential harvester targeting GitHub Actions secrets, AWS, GCP, Azure, Kubernetes, HashiCorp Vault, npm tokens, and CircleCI tokens.
404 Media, KrebsOnSecurity, and TechCrunch reported that hackers hijacked high-profile Instagram accounts by tricking Meta’s AI support chatbot into granting access, including the Obama White House and the Chief Master Sergeant of the U.S. Space Force accounts, after Telegram instructions circulated showing how to exploit account-recovery flows.
Bloomberg reported that former Meta CTO Mike Schroepfer’s Gigascale Capital closed a $250M early-stage clean-tech fund as AI power demand, grid constraints, and physical-infrastructure bottlenecks reshape the climate-tech market.
Andrew Curran detailed Sen. Bernie Sanders’ proposal for a one-time 50% tax paid directly in stock from major AI labs into an American AI Sovereign Wealth Fund, giving the government voting rights, board seats, and dividend power for citizens; the NYT essay is the core source.
Amir Efrati predicted that if Anthropic revenue already looks strong, large-volume enterprise purchases of Mythos for cybersecurity could make the company much larger; his follow-up simply pointed readers back to the article.
The OpenAI Foundation argued that AI should be treated like a risky general-purpose technology such as fire or electricity, then launched an AI Resilience program with more than $130M in initial grants for bio-resilience, cyber-resilience, model safety, and AI’s impact on young people; the Foundation’s post framed resilience as the work of this generation.
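For readers who want to act on the StepSecurity item above, here is a minimal sketch of how you might flag install-time lifecycle scripts in a package manifest. It assumes you already have the text of a package.json; the function name `find_install_hooks` and the sample manifest are hypothetical, and a hit is only a prompt for manual review, not proof of compromise.

```python
import json

# npm lifecycle scripts that run automatically during `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")

def find_install_hooks(manifest_text):
    """Return {hook_name: command} for install-time scripts in a package.json string."""
    manifest = json.loads(manifest_text)
    scripts = manifest.get("scripts", {})
    return {name: scripts[name] for name in INSTALL_HOOKS if name in scripts}

# Hypothetical manifest, for illustration only.
sample = json.dumps({
    "name": "example-package",
    "version": "1.0.0",
    "scripts": {"preinstall": "node setup.js", "test": "jest"},
})

hooks = find_install_hooks(sample)
print(hooks)  # {'preinstall': 'node setup.js'}
```

Separately, npm accepts an `--ignore-scripts` flag on install, which skips these lifecycle scripts entirely; some packages depend on them to build, so treat it as a tradeoff rather than a blanket fix.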

🛠️ Agent Tools, Developer Infrastructure & Workflows

Adaption AutoScientist turns unstructured data into frontier model training in two days, with beta users getting compute included so teams can iterate without months of infrastructure work; Adaption’s launch post and Sara Hooker’s thread framed it as a shortcut from raw data to model training.
Gabe Pereyra argued that law firms and regulated enterprises need their own cloud agent runtime because managed platforms still miss multi-model routing, true zero data retention, and aggressive cost control. Harvey built its own abstraction layer, sees 3-5x savings by routing each task to the right model and sandbox, and plans to absorb cloud-provider improvements as they arrive.
Tony Dinh showed how he runs his startup from his phone using Otto, Claude Code, CLI tools, and Telegram, then followed up with tasks that update OTA links, write changelogs, QA builds, configure RevenueCat and App Store Connect, create products, set pricing and paywalls, add sandbox testers, and handle in-app purchases end-to-end.
Alex Reibman said his agents constantly use Bloom as “one of the best MCP experiences out there” for generating brand assets, reinforcing Bloom’s YC launch pitch as the brand layer agents can call.
Learn Quiz gives you a reusable Claude teaching skill that explains Claude’s work, asks you to restate your understanding, quizzes you with open-ended and multiple-choice questions through AskUserQuestion, and only advances when you show mastery of the problem, solution, and context; Anthropic’s Thariq shared it as one of Suzanne’s favorite ways to stay in the loop with Claude. Free gist.
OpenAI’s Codex Python SDK lets developers programmatically control local Codex agents from Python, including starting threads, running turns, streaming progress, resuming sessions, passing images, and controlling sandbox permissions; Derrick Choi previewed the update, while reach_vb treated it as a way to wire Codex into higher-agency local workflows.
vLLM shipped native RL APIs on May 28 for reinforcement-learning post-training, adding standardized weight syncing through NCCL and CUDA IPC (fast GPU-to-GPU transfer paths) plus improved pause/resume for asynchronous RL so training systems can move model weights between learning and serving without brittle custom glue; the team also posted the launch on X.
Zoom launched ZoomMate, its first AI teammate built to turn meetings into completed work such as actions, deliverables, and follow-ups; the broader AI Productivity Suite frames Zoom around conversations becoming work products, while Zoom’s lunch-break post pitches the same stack as a way for knowledge workers to save time daily and protect focus.
Google Gemma released the first iteration of gemma-skills, a repo of agent skills (reusable instruction files that agents can load on demand). The skills help agents build with Gemma, use MTP (multi-token prediction, a speedup where a model predicts more than one next token at a time), choose the right Gemma model size, and find current Gemma resources. The GitHub repo includes the gemma-dev skill, Apache-2.0 licensing, install commands for the Vercel skills CLI and Context7 skills CLI, and a note that it is not an officially supported Google product.
Qwen introduced Qwen3.7-Plus, a cost-effective multimodal agent model in the Qwen3.7 series that keeps strong text ability while upgrading vision-language skills for GUI and CLI work, coding, productivity workflows, tool use, visual perception, grounding, and search-augmented Q&A; it is available through Qwen chat and the Alibaba Cloud Model Studio API, and Qwen Studio also supports chatbot use, image and video understanding, image generation, document processing, web search, tool use, and artifacts.

🔬 Models, Research & Benchmarks

MiniMax’s M3 launch post drew 6,928 likes and 931 reposts for the claim that M3 is the first open-weight model combining frontier coding, 1M-token sparse context, and native multimodality, with 59% SWE-Bench Pro, API and token-plan promotions, and a new MiniMax Code tool; the model docs, token plan, MiniMax Agent, launch thread, and Ryan Lee’s note add the product surface around frontier coding, agentic use, and Claude Code Max-style paid tiers.
PinchBench ranks 100+ models for OpenClaw coding-agent work by real-task success rate, speed, and cost, and Bryan Catanzaro said Nemotron 3 Ultra became the top open-weight model on the benchmark at 89.9% average success.
Bryan Catanzaro argued that Apache 2.0-style licenses were designed for code rather than AI models, data, and weights, and said NVIDIA will use the Linux Foundation’s OpenMDW license for many Nemotron releases as a more consistent legal framework for open AI artifacts.
PrimeIntellect celebrated Nemotron 3 Ultra coming this week, calling it “frontier smart,” 5x faster, and 30% cheaper, and said it was proud to be part of the coalition; Nous Research replied with a handshake, xeophon said training open frontier models is more fun with friends, and replies were broadly supportive.
DAIR.AI posted its Top AI Papers of the Week for May 24-31, covering SkillOpt, Compiling Agentic Workflows into Weights, AutoScientists, Language Models Need Sleep, Adapting the Interface Not the Model, The Efficiency Frontier, Forecasting Scientific Progress with AI, Your Agents Are Aging Too, Harnesses Are Not Uniformly Better, and Epicure; the related roundup tied the set to agent-memory and agent-system design. Replies argued the list points away from simply bigger models and toward systems that run and adapt, called “adapting the interface, not the model” underrated for local AI, and said SkillOpt sounded like the first paper to test.
Dan Korchinski and Ales Favero threaded the “Learn from your own latents and not from tokens” paper, arguing that latent prediction yields exponential sample-complexity gains by learning from internal representations instead of only visible tokens. Gerard Sans detailed the “Token Isomorphism Tax,” Ismael Tagle asked about decoder integration and Korchinski said they were thinking about it, and BetweenMyths suggested a hybrid latent-plus-token setup.
恒星 highlighted the same latent-learning paper on May 29, stressing that the bottleneck is the learning objective rather than data volume: predicting internal latent representations can require only logarithmic samples while token-level self-supervised learning can require exponentially more on PCFG data (synthetic grammar data used for theory). The arXiv abstract frames this as a sample-complexity theory for why data2vec / JEPA-style objectives can learn faster.
Omar Sar followed up on his self-evolving-agents and harness work after Karpathy’s LLM knowledge-base post, arguing that fine-tuning for better agent skills, memory, context engineering, routing efficiency, and knowledge bases will be huge; related context includes the Harness Updating paper and Omar’s original thread.
scaling01 surfaced StepFun Step-3.7-Flash as part of a wider open-model pile that also included LiquidAI LFM2.5-8B-A1B, OpenBMB MiniCPM5-1B, DeepSeek-V4-Flash-180B, antirez’s GGUF conversion, and 0xSero’s post.
GrepSeek, surfaced by _akhaliq and Clem Delangue, trains search agents to interact directly with corpora using natural-language shell commands, meaning an agent can search and manipulate a document collection like a developer using command-line tools instead of relying on one static retrieval step.
Luke J. Huang argued that frontier asynchronous RL is still unsolved despite 2-3x throughput gains at labs because policy lag (the model generating data becomes stale while training updates happen) still destabilizes learning, token-level importance sampling breaks on long horizons, sequence-level methods like TIS/CISPO and IcePop/MIS scale better, and common fixes such as clipping or masking bring their own tradeoffs; his thread surveys eight labs/frameworks and the open stability questions.
NVIDIA’s Déjà View (May 30) is a 117M-parameter looped transformer for multi-view 3D reconstruction that matches or beats 1B+ parameter feed-forward baselines by repeatedly applying one shared block, which turns inference compute into a slider and gives the model a coarse-to-fine refinement bias; Tobias Fischer shared the project.
Kyle Siler and coauthors found in PNAS that LLMs have diffused rapidly into academic publishing, analyzing 7.3M journal articles from 2020-2025 and estimating that about 12% of 2023 papers and 57% of 2025 papers contain excess “ChatGPT-era” wording, with higher adoption in non-native English regions, lower-ranked institutions, high-volume for-profit publishers, and fields like computer science, business, and law; Siler’s post framed it as both an equalizer and a research-integrity problem.
NeuROK, shared by Chen Geng, is a CVPR 2026 generative 4D neural object kinematics framework that turns a static 3D mesh into an interactive dynamic asset without physics annotations or category labels, learning a compact latent configuration space and solving a data-driven Lagrangian ODE (an equation for motion dynamics) across articulated objects, cloth, elastic bodies, and multi-body systems.
David Holz argued (May 27) that if autoregression is best when memory bandwidth is cheap and diffusion is best when FLOPS are cheap, then a future where compute scales faster than memory should push researchers harder toward diffusion-style models (models that generate by iteratively refining noise) instead of autoregressive next-token systems.
Jiaxin Wen argued (May 25) that language-model pretraining does not smoothly mature from “parrot” to intelligence; instead, toy evals show models repeatedly hop between pattern-matching and generalizable modes during training, meaning capability can appear, vanish, and reappear rather than monotonically improve; Wen’s thread introduced the work.
Aleksa Gordić wrote “Inside the Transformer: The Life of a Token” (May 26), a dense technical walkthrough of modern transformer internals including YaRN positional encoding (how models keep track of where tokens sit), hybrid attention for 160K context, soft capping, QK normalization, FLOPs per token, and cluster sizing; he announced it on X.
David Klindt and coauthors proved (May 27) conditions under which LeJEPA learns an identifiable world model, meaning the learned representation linearly recovers the true hidden variables of the world and lets you plan inside the learned model as if it were real; Klindt shared the result on X.
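The Déjà View item above describes reusing one shared block many times so that inference compute becomes adjustable. The toy sketch below illustrates only that weight-sharing idea and assumes nothing about NVIDIA's actual architecture: `shared_block`, the weights, and the update rule are all hypothetical stand-ins.

```python
import math

# Hypothetical toy weights: one small block, reused on every loop iteration.
WEIGHTS = [[0.5, -0.2], [0.1, 0.3]]

def shared_block(state):
    """One residual update that uses the same weights every time it is called."""
    update = [
        math.tanh(sum(w * s for w, s in zip(row, state)))
        for row in WEIGHTS
    ]
    return [s + 0.1 * u for s, u in zip(state, update)]

def run_looped(state, num_loops):
    """Apply the shared block num_loops times; the parameters never change."""
    for _ in range(num_loops):
        state = shared_block(state)
    return state

# The parameter count is fixed no matter how many loops are run.
param_count = sum(len(row) for row in WEIGHTS)

coarse = run_looped([1.0, -1.0], num_loops=2)   # cheaper, rougher pass
refined = run_looped([1.0, -1.0], num_loops=8)  # more compute, same weights
print(param_count, coarse, refined)
```

The point of the sketch is that `num_loops` is the only thing that changes between the cheap and expensive runs, which is what "compute as a slider" means here.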

🎬 Creative, Media & Demos

Philip Kiely built a full AI-narrated audiobook of Inference Engineering by cloning his own voice with Rime’s Coda TTS from studio recordings, converting visuals and diagrams into prose descriptions, chunking the text, and assembling the final M4B file as a free Baseten download.
Alakazam, built by Hugo Thomel, turns a one-sentence game idea into a playable real-time world model with reactive characters that can see, talk back, and follow rules written in plain English; Thomel’s launch post framed it as “what if games were trained into existence?” No pricing details.
Linus Ekenstam demoed Topview Canvas, calling it “Storyboard first. Always.” and showing a 58-second infinite-canvas workflow for planning AI video visually before generating instead of blind prompting. Replies called it interesting, argued visual canvas beats chat interface, and asked how to learn it.
Runway showed how to create compositing mattes in Aleph 2.0 by uploading a video, prompting for a white subject silhouette on a black background, reviewing the preview, generating the clip, and setting it as a luma matte in an editor; a matte isolates a subject from the background so creators can composite, color, or apply effects to one part of a shot without manual rotoscoping, the frame-by-frame tracing editors normally use.
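As a rough illustration of what an editor does with the luma matte described above, the sketch below blends a foreground and a background pixel by pixel, using matte brightness as the mix weight (white keeps the foreground, black keeps the background). The function name `composite_with_luma_matte` and the tiny grayscale "frames" are hypothetical; real editors work on full-color video and may treat gamma and soft edges differently.

```python
def composite_with_luma_matte(foreground, background, matte):
    """Blend two grayscale frames using matte values in [0, 255] as weights."""
    result = []
    for fg_row, bg_row, m_row in zip(foreground, background, matte):
        row = []
        for fg, bg, m in zip(fg_row, bg_row, m_row):
            alpha = m / 255.0  # 1.0 where the matte is white, 0.0 where it is black
            row.append(round(alpha * fg + (1.0 - alpha) * bg))
        result.append(row)
    return result

# Tiny 2x2 frames, for illustration only.
foreground = [[200, 200], [200, 200]]
background = [[10, 10], [10, 10]]
matte = [[255, 0], [0, 255]]

print(composite_with_luma_matte(foreground, background, matte))
# [[200, 10], [10, 200]]
```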

💬 Operator Takes & Commentary

Julien Chaumond learned that Claude Code deletes session traces after a month. Peterom said an updated Dataclaw Mac app will sync traces to Hugging Face daily after running them through OpenAI privacy models; AnalyticsAaurabh said “Aargh this explains a bit”; pointshopspace said they also did not know this; and BlockedPaths argued 30-day deletion is rough for memory, audits, and debugging because traces are some of the most useful agent data and feel like something to keep by default.
Nick Dobos argued that the underrated power of LLMs is the 5% extra curiosity on every search, where one-minute rabbit holes compound over years into hundreds of thousands of knowledge puzzle pieces that were previously impossible to gather in one lifetime. Wes Winder said learners who care will massively benefit, VDiogoV1 shared example rabbit-hole searches, and abenz_mato said the one-minute dig is where the real work happens.
kimmonismus broke down NVIDIA’s RTX Spark strategy at Computex, arguing NVIDIA walked into the PC market it never owned with an ARM superchip, a 1 petaflop FP4 personal AI computer, and a wager that the next PC era is built around local AI rather than apps; replies debated pricing, battery life, Windows on ARM viability, and Apple competition.
Greg Isenberg shared 14 AI pendulum shifts, including “wrappers are worthless” flipping into app-layer wins and “fine-tuning is the moat” flipping into personal knowledge bases, ending with “I guarantee you I’m holding at least 2-3 beliefs right now that will look stupid by Christmas... Build anyway.” Replies most often cited point #13, choosing what to build, and said the idea was right but the timing was wrong.
Andrew Ng argued that AI is redefining full-time-equivalent work, while Lysandre and YC tied the point to how teams count AI labor and startup productivity baselines.
Mike Vernal argued that the traditional three-act enterprise software playbook (wedge, then suite, then platform) is breaking because AI has compressed engineering timelines and made cautious incrementalism too slow; he said founders and investors should plan to build the entire ambitious product from day one. He explicitly agreed with Nikunj Kothari’s “Wedge Trap” essay (Sep. 15, 2025), which argues that shallow AI wedges can get funded at $1M ARR but often plateau around $5M unless the wedge itself can grow into a deep, technically hard, defensible $100M business. Vernal’s post had 291 likes, and Kothari replied that he 100% agreed.

Previous Around the Horn Digests

Catch up on everything you missed:

Weekend, May 29-31, 2026: Kog pushed real-time inference toward 3,000 output tokens per second, OpenAI launched Rosalind Biodefense, Microsoft worked on a Copilot super app, data-center power fights moved toward FERC, and Glean turned token thrift into an enterprise AI sales pitch.
Thursday, May 28, 2026: Claude Opus 4.8 arrived with Dynamic Workflows, Anthropic raised a $65B Series H, IBM committed $10B to quantum, Waymo opened Ojai robotaxi rides, Dell jumped on AI-server demand, and Amazon killed an AI usage leaderboard after employee tokenmaxxing.
Wednesday, May 27, 2026: Robinhood gave AI agents access to brokerage accounts and virtual cards, AxiomProver moved machine-verified math into peer-reviewed papers, OpenAI and Thrive built self-improving tax agents, Google launched AI Threat Defense, Amazon and Snowflake signed a $6B chip deal, and Cognition raised $1B.
Tuesday, May 26, 2026: China curbed private-sector AI talent travel, Qualcomm struck a ByteDance chip deal, OpenRouter raised $113M, xAI finished Grok V9-Medium, and U.S. law enforcement warned of anti-tech extremism.
Thursday, May 21, 2026: OpenAI said a general-purpose reasoning model disproved the 80-year Erdos unit distance conjecture, Spotify and UMG licensed AI fan remixes, California signed an AI workforce order, Starbucks scrapped its AI inventory tool, and Waymo paused service after flooded-road failures.
Tuesday, May 19, 2026: Google I/O pushed Gemini agents across Search, Android, Workspace, YouTube, and shopping while Anthropic hardened Managed Agents and OpenAI expanded provenance.
Monday, May 18, 2026: Microsoft open-sourced ECHO, Odyssey launched real-time AI simulators, and OpenAI added bank connections to ChatGPT.
Wednesday-Thursday, May 13-14, 2026: NVIDIA H200 sales cleared but stalled, Americans opposed AI data centers, and Meta planned layoffs.
Tuesday, May 12, 2026: Anthropic refused China access to its newest model, Isomorphic raised $2.1B, and Google pushed Gemini deeper into Android.
Monday, May 11, 2026: Cerebras upsized its $4.8B IPO, Cowboy Space raised $275M for orbital data centers, and Google confirmed the first criminal AI-found zero-day.
Weekend, May 9-10, 2026: The Trump administration drafted an AI security order, Apple and Intel reached a preliminary chip-making agreement, French prosecutors escalated their Musk and X probe, and Cerebras’ IPO heated up.

That’s a Wrap

That’s 90+ stories, papers, tools, demos, funding rounds, and X takes from today alone. If you made it this far, you now know more about NVIDIA’s agent-PC strategy than at least one person currently trying to expense a “local AI workstation” as a wellness purchase.

For the daily version, make sure you’re subscribed to The Neuron. We send six issues a week, and yes, we read all of this so you don’t have to.

See you tomorrow.

P.S. Know someone who’d find this useful? Forward this to them and tell them to subscribe here.

Grant Harvey
