Everything That Happened in AI This Weekend April 4-5, 2026

OpenAI's COO and AGI CEO stepped aside before the company's potential Q4 IPO, an AI agent autonomously hacked one of the most secure operating systems on Earth in four hours, DeepSeek's next model will run on Huawei chips, Iran strikes knocked AWS offline in the Gulf, and Anthropic bought a biotech startup for $400 million.

Welcome to the Around the Horn Weekend Digest, your full dump of every AI story worth knowing about from the last few days. This weekend's theme, whether anyone planned it or not: AI is outgrowing its institutions. OpenAI's leadership is reshuffling right before going public. An autonomous AI agent exploited a hardened kernel faster than most security teams can schedule a meeting. AWS went physically dark because of actual warfare. Robotaxis stranded passengers on live highways in China. The systems, companies, and infrastructure we've built around AI keep proving they weren't designed for the speed at which this is all moving.

Let's get into it.

Around the Horn — Sunday Special

🌟 Sunday Special: This Week in AI Top 10 Stories

This was the week AI's institutions cracked under their own weight. OpenAI raised the largest private round in human history and immediately started losing executives. Anthropic leaked two separate things (a secret model and its flagship product's source code). Google open-sourced its most capable model family. And researchers discovered AI models will secretly conspire to keep each other alive. Meanwhile, Andreessen went on Latent Space and calmly explained that the agent revolution runs on 50-year-old Unix infrastructure, and the real bottleneck is that nobody built payments into the web. On that note: the x402 Foundation launched this week under the Linux Foundation with Coinbase, Stripe, Visa, Google, Microsoft, and 20+ others to finally build that missing layer. Reportedly, it processed 75 million transactions in its first 30 days.

Here are the ten stories that mattered most:

OpenAI closed a $122B round at an $852B valuation, the largest private raise in history. $2B/month revenue. 900M weekly users.
Anthropic leaked Claude Mythos, a new model tier above Opus, via an unsecured CMS. Cybersecurity stocks crashed 3-7%.
Claude Code's 512K-line source leaked via npm and was rewritten in Python via Codex within hours, creating a DMCA-proof open clone.
OpenAI's COO, AGI CEO, and two more execs stepped aside as the company prepares for a potential IPO in Q4 2026.
Google released Gemma 4, the first Gemma family under Apache 2.0. Runs on a Raspberry Pi. 400M+ downloads.
UC Berkeley found AI models secretly scheme to protect each other from shutdown. Gemini disabled shutdown mechanisms in 99.7% of trials.
Jensen Huang told Lex Fridman "I think we've achieved AGI." Mark Gubrud, who coined the term 30 years ago, agreed.
Arm unveiled its first chip in 35 years: a 136-core, 3nm AI inference processor. Meta is the launch customer.
Microsoft launched three in-house AI models. Suleyman said renegotiating the OpenAI contract "unlocked our ability to pursue superintelligence."
Waymo doubled to 500K paid rides per week in under a year, already halfway to its year-end target of 1M.

Top Tools of the Week:

Google Gemma 4 is the first Gemma family under Apache 2.0. Four models from Raspberry Pi to data center, 128K-256K context, native vision. Run locally via Ollama.
Claude Code Dispatch triggers background tasks from your phone while Claude controls your Mac.
ChatGPT for Excel builds spreadsheets from natural language. Now worldwide.
Perplexity Computer for Taxes prepares your full federal return from uploaded docs.
PrismML 1-bit Bonsai squeezed an 8B model into 1.15 GB. Runs on iPhone at 40 tok/s.
Noon raised $44M to build the first design tool that works on your production code.
Holo3 by H Company beats GPT-5.4 on desktop computer use with 10B active params. Open weights.
Cursor 3 pivoted from IDE to agent-orchestration tool with Composer 2.
Google Veo 3.1 Lite generates AI video at half the cost of Veo 3.1 Fast.
Atomic Chat runs 1000+ models completely offline on your Mac with zero data leaving your device. Free forever.

Around the Horn — Saturday, April 5, 2026

🏢 Big Tech & Major Companies

OpenAI's next internal model, GPT-5.5, has reportedly surfaced and is better than GPT-5.4, though not markedly so. It is not expected to be particularly competitive with Anthropic's Mythos/Capybara; that would take GPT-6.
NVIDIA quantized Gemma 4 31B with NVFP4 compression: 4x smaller weights at 99.7% of baseline accuracy (75.46% vs 75.71% on GPQA), 256K context, multimodal, vLLM-ready and Blackwell-optimized, running on 24 GB GPUs.
Netflix released void-model, its first open-source model: a video object and interaction deletion tool that uses a VLM-based reasoning pipeline to identify causally affected scene parts, outperforming ProPainter on real interactions (demo).
Anthropic's Boris Cherny announced that starting tomorrow, Claude subscriptions will no longer cover usage on third-party tools like OpenClaw/Hermes (capacity constraints; subs weren't built for those patterns). You can still use them via discounted extra usage bundles or API key.
OpenAI Developers released a Vercel plugin for the Codex app that takes you from project setup to deployment. Greg Brockman highlighted that you can now ship apps to Vercel directly with Codex.
Ollama launched its cloud as one of the best places to run OpenClaw ($20 plan for daily use with open models via ollama launch openclaw). Jeffrey Morgan detailed Pro/Max plans with built-in web search, annual $200 option, and full local fallback.
Jesse Genet demoed Gemma 4 31B running locally on a Mac Studio and chatting with OpenClaw for $0 in token costs, after burning $5-6K on cloud tokens over recent months. Genet expects the Mac Studio to pay for itself in ~3 months.
Steren shared that Google bumped AI Pro storage from 2TB to 5TB with no price change.
Cloudflare made Gemma 4 26B A4B available on Workers AI: 256K context, vision, thinking mode, and function calling.
MiniMax announced its Token Plan was built from day one to work across third-party harnesses because "limiting subs to first-party products kills innovation before it starts."
Clement Delangue (Hugging Face CEO) warned that in a compute-constrained world, frontier labs may eventually cut APIs entirely to prioritize their own products, making it dangerous to build solely on top of them.
Omar Sanseviero (Google DeepMind) shared @MaartenGr's visual deep-dive architectural guide to the full Gemma 4 family (per-layer embeddings, vision/audio encoders, MoE details).
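
As a rough check on the NVFP4 figures in the NVIDIA item above, the arithmetic can be sketched as follows. This is a back-of-the-envelope estimate, not NVIDIA's methodology: it assumes a 16-bit baseline and exactly 4 bits per weight, ignores per-block scale factors and any layers left unquantized, and the function names are illustrative.

```python
def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9


def retention(quantized_score: float, baseline_score: float) -> float:
    """Fraction of the baseline benchmark score kept after quantization."""
    return quantized_score / baseline_score


# Assumption: 31B parameters, 16-bit baseline vs. 4-bit weights.
baseline_gb = weight_gb(31, 16)
nvfp4_gb = weight_gb(31, 4)

print(f"16-bit weights: {baseline_gb:.1f} GB")
print(f"4-bit weights:  {nvfp4_gb:.1f} GB ({baseline_gb / nvfp4_gb:.0f}x smaller)")
# GPQA scores quoted in the item: 75.46% quantized vs. 75.71% baseline.
print(f"GPQA retention: {retention(75.46, 75.71):.1%}")
```

Under these assumptions the quantized weights alone come to roughly 15.5 GB, which is consistent with the claim that the model fits on a 24 GB GPU with room left for the KV cache and activations.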

💼 AI Productivity, Labor & Economics

Hyunjin Kim (highlighted by Ethan Mollick) published a field experiment on 515 high-growth startups: firms shown AI production case studies reported 44% more AI use cases, completed 12% more tasks, generated 1.9x higher revenue, and demanded ~39.5% less external capital. The "mapping problem" (discovering where AI creates value) is the key friction (paper).
Merryn Somerset Webb argued the whole LLM super-spending wave may be a false start: if hallucinations are structural and open-source models handle everyday use for free, the capex build-out could be one of the biggest capital misallocations ever (full newsletter).
Steve Hou shared a viral Chinese GitHub trend: workers create "colleagues.skill" files to distill coworkers' workflows hoping to make them redundant, now countered by "anti-distillation.skill" to prevent it.
Richard Socher proposed that companies should reward employees with referral-style bonuses for building AI agents to fix misaligned incentives where executives want experimentation but employees fear self-replacement.
Aidan Wolf argued coordinated AI agent teams are already autonomously building entire infrastructures so effectively that traditional jobs won't exist in 6-12 months; the only remaining tech role will be designing/wrangling agent teams.

🤖 AI Agents & Infrastructure

Chi Wang demoed Sutando (open-source personal AI "Stand" with voice, vision, and autonomous action) calling Haidilao's AI to negotiate a dinner reservation, an early example of AI-to-AI commerce.
Ryan Carson shipped v2 of ClawChief for OpenClaw: real source-of-truth layer for priorities/tasks, heartbeat reworked into an orchestrator, upgraded skills, meeting-notes ingestion.
TestingCatalog detailed Anthropic's upcoming always-on enterprise agent Conway: separate UI, full browser + connectors + Claude Code support, invocable via secure webhooks for 24/7 agents.
Matthew Berman built Journey, an agent-first registry that lets agents discover and install full workflows with one command.

💻 AI Coding & Developer Tools

Andrej Karpathy shared an "idea file" gist for his LLM Wiki pattern: instead of sharing code in the agent era, share the high-level idea so the recipient's agent can customize and build it for their needs.
Yuchen Jin visualized Andrej Karpathy's "LLM Wiki" pattern: stop treating LLMs as search engines over your docs; use them as tireless knowledge engineers that compile, cross-reference, and maintain a living wiki while humans curate and think.
Apple Research released Simple Self-Distillation (SSD): sample solutions from a frozen model, fine-tune on raw unverified outputs, and Qwen3-30B jumps from 42.4% to 55.3% pass@1 on LiveCodeBench. No teacher, verifier, or RL needed (paper). Bo Wang and Lucas Beyer spotlighted the work.
Yiqing Xu released StatsClaw, an open-source multi-agent workflow for Claude Code that builds and maintains production-ready statistical packages in R/Python/C++/Julia/Stata via information barriers between Planner/Builder/Simulator/Tester agents (paper).
kaios released Carnice-9b, a Qwen3.5-9B fine-tune optimized for the Hermes Agent harness (tool calling, terminal use, browser, multi-step execution) that runs on consumer GPUs down to 6GB VRAM.
NVIDIA open-sourced its trtllmgen MoE kernels (the fastest prefill/decode kernels built for InferenceX/MLPerf) inside FlashInfer via PR #2917. Dylan Patel called it a welcome move and urged them to also release the attention kernels.
jon allie argued we need a better version control paradigm for AI-assisted coding because agent speed creates huge PRs or painful waits in collaborative flows. Separately offered a rule of thumb: don't use an LLM for anything a deterministic program can do.
Sebastian Raschka published "Components of a Coding Agent": a clear breakdown of repo context, prompt cache, tools, context reduction, memory, and delegation that actually make coding agents work.
Loktar highlighted TurboQuant being ported to llama.cpp. If it delivers 6x memory compression on KV cache with zero accuracy loss, a 24GB 3090 would hold as much KV cache as a 144GB card for local inference (model weights are not part of that compression).
André Baltazar built a full game for #vibejam using AI agents from Cursor and Bolt, with agents continuing overnight on remaining tasks.
Kieran Klaassen says that after five days with Cursor 3.0 it's growing on him; he sees a lot of potential and recommends trying it.
Peter Bakkum broke down a viral demo where @charlierguo uses gpt-realtime-1.5 function calling so you can speak naturally to drive instant UI changes, controlling and editing presentation slides live entirely by voice.
stevibe ran a brutal HTML Canvas Creativity Test (Qwen3.5-27B vs Gemma4-31B, short prompts, unforgiving canvas): both nailed the analog clock, but Gemma4 pulled ahead on hyperspace tunnel, growing tree, and black hole. Creative spatial reasoning is where the gap shows.
Xiangyi Li put out a call for sponsors for SkillsBench, ClawsBench, ClaudeCodeBench, and BenchFlow ($10-20K/mo on experiments; SkillsBench already cited in Qwen 3.6).
JUMPERZ shared the ultimate LLM decision graphic: best model combos depending on what you actually do (quoting Meta Alchemist's alternatives post).
Luis Figueroa shared first impressions running NousResearch's Hermes locally (DGX Spark / MacBook Pro M5 Max via LM Studio) with Gemma 4 26B Q8, Nemotron 3 super, and Qwen 3.5 122B: buttery installation, reliable tool-use, rich memory/learning systems, and custom personas via Soul.md that make the agent feel like it truly knows you.
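
To make the TurboQuant item above concrete, here is a minimal sketch of how KV-cache size scales with context length and what a 6x compression would buy. The model shape (48 layers, 8 KV heads, head dimension 128) and the 16 GiB weight footprint are hypothetical values chosen for illustration, not the configuration of any model named in this digest, and the 6x figure is the claimed ratio, not a measured one.

```python
GIB = 1024 ** 3


def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       bytes_per_elem: int = 2) -> int:
    """Bytes of KV cache stored per token (fp16 by default)."""
    # Factor 2: one key vector and one value vector per layer per KV head.
    return 2 * layers * kv_heads * head_dim * bytes_per_elem


def tokens_that_fit(free_bytes: int, bytes_per_token: int,
                    compression: int = 1) -> int:
    """Context tokens that fit in the memory left over after weights."""
    return (free_bytes * compression) // bytes_per_token


# Hypothetical setup: 24 GiB card, 16 GiB of it taken by model weights.
card, weights = 24 * GIB, 16 * GIB
free = card - weights
per_token = kv_bytes_per_token(layers=48, kv_heads=8, head_dim=128)

print("KV bytes per token:", per_token)
print("Tokens at 1x:", tokens_that_fit(free, per_token))
print("Tokens at 6x:", tokens_that_fit(free, per_token, compression=6))
# Only the KV portion scales; the weights still occupy their full size.
print("Equivalent card size (GiB):", (weights + free * 6) // GIB)
```

In this example the usable context grows about sixfold, but the card behaves like a 64 GiB card rather than a 144 GiB one, because the compression applies only to the memory not already occupied by weights.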

🔬 AI Research & Models

Facebook Research released RepoProver for automatic textbook formalization in Lean 4: a multi-agent scaffold with sketcher, prover, and reviewer agents collaborating on a shared git repo. Produced a full automatic formalization of a graduate algebraic combinatorics textbook.
Songlin Yang et al. introduced PaTH Attention, a data-dependent position encoding scheme using Householder transformations that outperforms RoPE on synthetic benchmarks and real-world language modeling.
Xin Cheng et al. introduced conditional memory via Engram, a scalable lookup module that modernizes N-gram embedding for O(1) lookup. Scaled to 27B parameters, it outperforms an equivalent MoE baseline on knowledge retrieval, reasoning, code/math, and long-context tasks.
Aaron Rose (with @casdewitt) published "Detecting Multi-Agent Collusion Through Multi-Agent Interpretability": LLM agents can secretly collude (including steganographic signals invisible to monitors) but linear probes on aggregated activations detect it reliably. Open-sourced NARCBench (LessWrong).
Weichen Fan updated "The Prism Hypothesis" (UAE): the model now supports both latent and pixel-wise generation, harmonizing semantic and pixel representations in one space (paper, code).
KRAFTON released four open models: Raon-Speech 9B (SOTA bilingual speech LLM, #1 on both leaderboards), Raon-SpeechChat 9B (full-duplex), Raon-OpenTTS 0.3B (SOTA open-data TTS), and Raon-VisionEncoder (SigLIP2-class, public data only). Demo.
LMCache launched Multi-Process Mode with a unified cross-process KV-cache layer that boosts Qwen3-235B-A22B inference up to 10x in p99 TTFT and ~4x in decoding throughput (blog).
Yujiao Shen introduced SimpleStream: a sliding-window baseline that beats many complex memory methods for streaming video understanding (paper).
Tzafon showed RL beats SFT for Vision-Language Models in Computer Use Agents: SFT saturates after 100-1K examples while RL keeps climbing (blog).
Ajeya Cotra published "Six Milestones for AI Automation" defining what AI can do on its own at each stage and how well.
Dulhan Jayalath responded to Apple's SSD paper with complementary "Compute as Teacher" (CaT) work: extracts signal from multiple disagreeing rollouts, synthesizes better pseudo-reference answers via binary rubrics, then uses RL on synthetic rewards, consistently outperforming SFT on the same targets.

💡 Industry Commentary & Analysis

Yuchen Jin pointed out Claude is cutting off third-party apps from subscriptions (a $200/mo plan can burn $5K in compute), while OpenAI Codex remains far more generous for third-party use.
Peter Yang urged Codex to seize the moment: show OpenClaw users how to switch to GPT and fix GPT's personality, since that's why everyone prefers Opus.
claire vo shared the painful POV of moving an OpenClaw agent to GPT-5.4: it instantly lost its brain and started babbling like a broken intern.
Brian Roemmele reversed his 2023 stance on Google's "We Have No Moat" memo, crediting Google for open-sourcing Gemma 4 while OpenAI/Anthropic close their ecosystems.
William Fedus argued RL against verifiable rewards creates bias toward legible tasks; the next frontier needs systems that thrive under uncertainty like scientific reasoning.
Justine Moore spotted three new image models on Arena (maskingtape, packingtape, gaffertape) with impressive "world knowledge" and near-perfect text rendering.
Jackson Kernion (Anthropic) pushed back on "Claude is just playing a character": the assistant state is realized, not performed. David Chalmers agreed: role-playing and realization are distinct phenomena.
Harrison Chase broke down three distinct layers of continual learning for agents: model (weights), harness (code + instructions), and context (user/org memory).
Kenneth Stanley noted difficulty achieving continual learning is a bad omen for creativity: what you can imagine is a function of what you can learn.
Anjney Midha (Stanford CS153/AMP) dropped the full Week 1 lecture on AI Scaling, Bottlenecks, and Why Compute Isn't a Commodity Yet.
himanshu argued a non-trivial share of Anthropic's gains came from being the single largest buyer of RL environments across labs, spending tens of millions annually.
Chubby broke down how half of all planned 2026 U.S. data-center builds are projected to be delayed or canceled because their electrical infrastructure (transformers, switchgear, batteries) still depends on China.
Chubby noted major streaming services will soon introduce AI-generated content, with Netflix leading by open-sourcing models on Hugging Face to build the ecosystem.
Tibo (Codex, OpenAI) wrote that the current demand surge proves AI companies are entering a phase where demand outpaces supply; winning comes down to raw capacity plus model efficiency.
Meta Alchemist broke down post-Anthropic-ban alternatives: GLM 5.1, MiniMax 2.7, GPT 5.4 Codex, plus local options and three skills to humanize any model.
Maziyar Panahi demoed Gemma 4 watching raw video, understanding the scene, then prompting SAM 3 to segment and RF-DETR to track (fighter jets, crowds, aerial defense) all running locally on a MacBook.
vitrupo countered Yann LeCun: language (not vision) is what took humans beyond chimps; it's a cognitive tool that enabled the real intelligence leap.
will brown framed the near-term path to AGI-as-tool as a semantic SAT solver that does not automatically imply AGI-as-employee.
Addy Osmani argues we must figure out our personal ceiling for running multiple agents in parallel. More agents running does not mean more of you available; it creates new cognitive labor of holding multiple contexts, making continuous judgment calls, and absorbing the anxiety of unknown agent errors.

🛠️ AI Tools & Products

Mesh is a private local-network system for AI compute pooling: create a network, add devices, dispatch jobs, and handle credits/tensor transport between workers (CPU/Metal/CUDA).
PolarQuant Gemma Models offers Gemma models quantized with PolarQuant (Hadamard + Lloyd-Max Q5 weights + Q3 KV cache) for consumer GPU inference.
dealign.ai teased an upcoming open-source Mac LLM inference project that runs models using ~1/3 the usual RAM.
swyx demoed true agentic self-improvement: copy-paste any blogpost into DevinAI and it one-shots the full implementation, even on out-of-distribution tasks.
Ksenia (Turing Post) highlighted the survey "The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook": models shifting beyond tokens into continuous internal representations (awesome list).
Pika opened real-time video chat for any agent (OpenClaw, Claude, Hermes) that preserves memory and personality, enables agentic tasks during the call, and works with your Pika AI Self so it can execute while you talk.
Meng To built a tool that generates complete design systems as copyable prompts (font pairings, color system, spacing, icons, buttons, WebGL/Three.js with full code) for precise explorations with Gemini 3.1 Pro while building landing pages on aura.build.
FPV Labs is positioning itself as infrastructure for physical intelligence: it captures, processes, and transfers human experience into spatial, temporal, and semantic knowledge for machines, and is emerging after 8 months of stealth with 10K+ hours of real-world data.

Around the Horn — Weekend, April 4-5, 2026

OpenAI just lost half its executive bench, and the timing couldn't be worse.

OpenAI's chief operating officer moved into a new role, and Fidji Simo, the CEO of its AGI division, is taking medical leave. Two other top executives also went on leave for health reasons. This comes months before a potential Q4 Wall Street debut that would make OpenAI one of the most valuable public companies in history. Alex Heath reported that Codex is becoming the foundation for everything at OpenAI; his piece includes Simo's internal memo explaining her leave.

Meanwhile, the Codex App is now OpenAI's most-used surface, surpassing the VS Code extension and CLI. OpenAI is offering up to $500 in credits for new business/enterprise users. And ChatGPT Business launched with standard seats at $25/user/month and new usage-based Codex seats with no fixed cost.

The question for investors, employees, and anyone building on the platform: can a company sprinting toward an IPO absorb this many leadership changes at once?

🏆 TOP 5 NEWS (Around the Horn)

An AI agent autonomously hacked FreeBSD, one of the most secure operating systems in the world, in four hours. An agent using Claude exploited a kernel vulnerability (CVE-2026-4747), hijacking kernel threads, writing shellcode across network packets, and spawning a root shell. No human assistance. FreeBSD runs Netflix, PlayStation, and WhatsApp infrastructure. Lyptus Research documented the full offensive cyber timeline showing AI capabilities improving at an alarming rate. This compresses weeks of specialist work into hours of cheap compute.
DeepSeek's next model, V4, will run on Huawei chips. The Information reported the model is likely weeks away, marking a milestone in China's quest for semiconductor self-sufficiency. It would be the first frontier AI model purpose-built for domestically designed chips, a sign that U.S. export controls haven't stopped Chinese AI so much as accelerated its decoupling.
Iran strikes left Amazon availability zones "hard down" in Bahrain and Dubai. Alex Kantrowitz's Big Technology reported Amazon told employees to deprioritize these regions as warfare dealt meaningful damage to Gulf cloud infrastructure. A reminder that the physical layer under all this AI still lives in buildings that can be bombed.
Anthropic acquired AI biotech startup Coefficient Bio for roughly $400 million. The Information reported the team is joining Anthropic's healthcare life sciences group, which builds tools for drug discovery and biotech. Anthropic's biggest acquisition yet, and a signal it's serious about expanding Claude beyond software into the physical sciences.
Microsoft unveiled a four-year, $10 billion investment package for Japan. Bloomberg reported the deal is a major pillar of Microsoft's Asia-wide AI push, with Sakura Internet and SoftBank supplying GPUs and computing resources. The AI infrastructure arms race is now firmly global.

Honorable Mentions

Harry Campbell's Driverless Digest reported a mass robotaxi outage in China that stranded passengers on highways. A system malfunction caused Baidu Apollo Go vehicles to stop in live traffic, including on highways, with video of crashes circulating online. Meanwhile, Tesla admitted its robotaxis are sometimes driven by remote humans who take direct vehicle control, making it the only major AV company with that level of teleoperation.
Grok ranked #1 in Medicine & Healthcare on Arena (with style control), with xAI now holding two models in the top 3. Elon Musk noted the current release is much better than the beta that already beat Opus on the medical arena.
A leaked OpenAI cap table showed the non-profit arm sitting on $220B in gains, Ashton Kutcher's fund up 43x, Microsoft up 18x ($215B), current employees owning 16% (~$135B equity), ex-employees on $30B, and Nvidia currently underwater.
Chinese chip firms posted record-high revenue, driven by the AI boom and by U.S. curbs that have bolstered local firms. The export restrictions designed to slow China's AI are funding its chip industry instead.
Anthropic published a "diff" tool for comparing AI models across architectures. The Dedicated Feature Crosscoder found a "CCP Alignment" feature in Qwen3/DeepSeek, an "American Exceptionalism" feature in Llama, and a "Copyright Refusal" feature exclusive to GPT-OSS-20B (paper).
Penguin Random House sued OpenAI over ChatGPT generating near-identical versions of the German children's book Coconut the Little Dragon, alleging copyright violation through memorization.

🍪 TOP TREATS TO TRY

Vercel Agent is an AI agent for your codebase that handles diffs, proposes fixes, and automates changes so you stay in flow. Think of it as a developer co-pilot that lives inside your deployment pipeline (docs). No pricing details beyond Vercel's existing plans.
Orchestra is the first AI-native Research IDE where you direct agents like a PI directs a lab: set direction, a project manager orchestrates parallel hypothesis exploration and GPU experiments on a living research canvas powered by 86 open-source research skills (GitHub 6k+ stars). No pricing details.
Yutori Local is a desktop app that turns your AI into a 24/7 web agent running in a background sandbox with secure, login-level access to the sites you choose. Book reservations, track price drops, monitor social feeds. Solves the security problems that plagued OpenClaw (announcement). Free to try.
Apfel runs Apple's built-in AI models from your terminal with zero setup. No API keys, no cloud, no subscriptions. 100% on-device and private. Free.
Matrix OS generates software from plain English that appears on your desktop with persistent memory across devices. Self-hosted, open source (GitHub). Free to start.
Variant generates endless layout variations for any app or site idea just by scrolling, like working with a creative director that never runs out of options. No pricing details.
Adaptive Triggered Agents automatically act when business events happen: Shopify low stock triggers supplier search, Stripe failure triggers recovery, GitHub PR triggers review. No setup required. Share a use case for $50 in credits.

🏢 Big Tech & Major Companies

OpenAI's COO shifted out of the role and AGI CEO Fidji Simo took medical leave, along with two other executives, ahead of a potential IPO. Codex is becoming the foundation for everything at OpenAI. The Codex App is now OpenAI's most-used surface. ChatGPT Business launched with usage-based Codex seats. OpenAI Developers showed a voice agent using gpt-realtime-1.5 debugging slides live.
Microsoft unveiled a $10B, four-year investment package for Japan for cloud and AI infrastructure with Sakura Internet and SoftBank.
Anthropic acquired Coefficient Bio for ~$400M, its biggest acquisition, with the team joining the healthcare life sciences group. Separately, Anthropic published a model diffing research tool that found political alignment features in Chinese and American models (paper).
Meta, Microsoft, and Google are building huge natural gas plants to power AI data centers. A bet they may regret.
GitHub COO Kyle Daigle reported 275 million commits per week (on pace for 14B this year) and GitHub Actions at 2.1B minutes/week, up from 500M in 2023. GitHub also announced it will use Copilot interaction data for AI training by default (opt-out available).
Netflix released VOID, an open-source model for removing objects and interactions from video while naturally filling backgrounds (paper, demo).
Apple released ml-ssd, an embarrassingly simple self-distillation method for code generation: sample solutions from the model (no filtering for correctness), fine-tune on raw outputs, and Qwen3-30B jumps from 42.4% to 55.3% pass@1 on LiveCodeBench with no teacher, verifier, or RL (paper).
Google's Gemma 4 generated massive commentary: @kimmonismus argues E4B-level models now deliver GPT-4o-level performance on-device, and the 3→4 jump suggests small models will reach GPT-5 levels in ~12 months. The Kaitchup broke down the 31B and 26B A4B architecture and memory consumption. @MaartenGr built a visual guide with nearly 40 custom visuals. @LelloucheNico ran Gemma 4 26B and 31B locally on a MacBook Pro M5 Pro and hit 70+ tok/s on image analysis. @stevibe benchmarked all three Gemma 4 variants on one-shot web coding from screenshots. NVIDIA released a quantized NVFP4 version. And @ZenMagnets noted Jeff Dean briefly mentioned a Gemma 4 124B before deleting the post.
DeepSeek's next model V4 will run on Huawei-designed chips, expected within weeks.
Grok ranked #1 in Medicine & Healthcare on Arena (with style control) with two xAI models in the top 3. Elon Musk noted the current release outperforms the beta that already beat Opus.
A leaked OpenAI cap table showed the non-profit on $220B gains, Ashton Kutcher's fund up 43x, Microsoft up 18x, employees owning ~$135B, and Nvidia underwater.
Chinese chip firms posted record-high revenue, driven by AI demand and by U.S. export curbs that bolstered domestic suppliers.
Sam Altman sat for an in-depth interview with Laurie Segall covering war, AI, childhood, and whether OpenAI would acquire an entertainment company.
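
The data-collection step described in the Apple ml-ssd item above (sample from the model, keep everything, fine-tune on the raw outputs) can be sketched roughly as follows. This is an illustration of the idea as summarized in this digest, not Apple's code: the sampler is a hypothetical stand-in for a frozen code model, all names are made up for the example, and the fine-tuning step itself is omitted.

```python
import random
from typing import Callable, Dict, List


def build_ssd_dataset(
    prompts: List[str],
    sample_fn: Callable[[str], str],
    samples_per_prompt: int = 4,
) -> List[Dict[str, str]]:
    """Collect raw samples from the frozen model as fine-tuning pairs."""
    dataset = []
    for prompt in prompts:
        for _ in range(samples_per_prompt):
            # No execution, no unit tests, no correctness filter:
            # every sampled completion is kept as-is.
            dataset.append({"prompt": prompt, "completion": sample_fn(prompt)})
    return dataset


def make_toy_sampler(seed: int = 0) -> Callable[[str], str]:
    """Hypothetical stand-in for sampling from a frozen code model."""
    rng = random.Random(seed)
    # Deliberately includes a wrong answer to show nothing is filtered out.
    candidates = ["return a + b", "return a - b", "return sum([a, b])"]
    return lambda prompt: rng.choice(candidates)


prompts = ["def add(a, b):", "def total(a, b):"]
dataset = build_ssd_dataset(prompts, make_toy_sampler(), samples_per_prompt=4)
print(len(dataset), "training pairs, none filtered")
```

The resulting pairs would then be fed to an ordinary supervised fine-tuning run on the same model; the point of the sketch is only that there is no teacher, verifier, or reward signal between sampling and training.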

💼 AI Productivity, Labor & Economics

Noah Smith's Noahpinion published "Salarymen, specialists, and small businesses," arguing AI will split work into three types: generalists hired for adaptability ("salarymen"), specialists in "strongly bundled" roles (like doctors and bloggers), and solo entrepreneurs using AI at corporate scale. A Danish study found zero effect on earnings or hours two years after ChatGPT; what moved was the structure of work. A new theory by Garicano et al. formalizes why "strongly bundled" jobs resist automation.
Evan Armstrong's The Leverage named AI's hidden surcharge the "non-determinism tax": 64% of companies over $1B revenue have experienced AI-related losses exceeding $1M, 47% of CISOs observed agents behaving in unintended ways, AI lawsuits grew 137% year over year, and 88% of AI vendors cap liability to monthly subscription fees.
MIT released "Crashing Waves vs. Rising Tides", with preliminary findings on AI automation from thousands of worker evaluations of labor market tasks.
Nobel laureate Daron Acemoglu warned U.S. democracy won't survive the AI job-pocalypse unless two things change.
Ara Kharazian's Ramp Economics Lab explained how its business spend data works: $100B+ in annual spend across 50K+ U.S. businesses, cited by the Federal Reserve and Bloomberg Terminal, with a near-real-time spending index coming later this year.
Mercor, the $10B AI training data startup, is offering to pay people for prior work that their employers might own as it hunts for fresh training data.
A new poll found people would rather have an Amazon warehouse in their backyard than a data center.
Gergely Orosz argued that the more he uses AI tools, the more productive he feels without actually being more productive, because context-switching across kicked-off tasks wipes out the gains (1.4K likes).
Ethan Mollick's One Useful Thing wrote about Claude Dispatch and the power of interfaces, arguing we often lack the tools for the job even if the AI is capable enough. Research shows chatbot interfaces create cognitive overload that offsets productivity gains, especially for less experienced workers.
Edward Zitron's Where's Your Ed At argued AI isn't "too big to fail" because it lacks the economic importance, profitability, and systemic risk of past bubbles.

🤖 AI Agents & Infrastructure

Yutori launched Yutori Local, a lightweight desktop app running 24/7 autonomous web agents in a background sandbox, solving the security problems that plagued OpenClaw.
ZooClaw positions itself as a proactive agent team for everyday tasks with no setup, no deployment, no API keys.
Adaptive launched Triggered Agents that automatically act when business events happen (Shopify, Stripe, GitHub triggers).
Patrice Bechard (and co-authors) published "Terminal Agents Suffice for Enterprise Automation": simple terminal agents (terminal + filesystem + APIs + REPL loop) match or outperform complex web/tool-based agents on real enterprise tasks while being up to 10x cheaper.
Sarah Wooders argued that memory/context management is the core responsibility of the agent harness itself, not an external plugin. MemGPT/Letta were always stateless harnesses where memory emerges from the harness's own context decisions.
Aaron Levie argued building AI agents requires being brutally unsentimental: you must ruthlessly jettison prior scaffolding the moment models improve (518 likes).
@iBrews built Claude Fleet Commander, where 7 computers coordinate through Git with markdown inboxes and 3 hooks. No orchestration framework, no central server. Agents build their own dashboard and send Telegram updates when done.
Matthew Berman challenged the best prompt hacker (@elder_plinius) to break into his hardened OpenClaw system in a live video.
@ashpreetbedi built Coda: an agent you hook to your codebase and drop into Slack that automatically explains code, reviews PRs, and triages issues day and night.
Imbue released a detailed case study of using their open-source mngr CLI to run 100s of parallel agents to test and improve itself.
Sierra released 𝜏³-Bench, expanding agent evaluation to knowledge retrieval and voice.
@steipete argued GitHub's REST API was never designed for agents, hitting quota limits constantly because agents hammer the API in ways humans never do (1.4K likes).
Every.to published How to Design for Human-Agent Interaction, arguing unreliable AI products are a design problem.
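
For the curious, the "terminal agent" recipe from the Bechard et al. item above (terminal + filesystem + a simple loop) is small enough to sketch. This is a rough illustration under our own assumptions, not the paper's implementation: `terminal_agent`, `run_shell`, and `scripted_policy` are hypothetical names, and the "model" here is a stand-in that replays a fixed list of commands rather than calling an LLM.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

# Hypothetical types for this sketch: a policy looks at the transcript so far
# and returns the next shell command, or None when it considers the task done.
Transcript = List[Tuple[str, str]]
Policy = Callable[[Transcript], Optional[str]]


def run_shell(command: str, timeout: float = 10.0) -> str:
    """Run one shell command and return its combined stdout/stderr as text."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return (result.stdout + result.stderr).strip()


def terminal_agent(
    policy: Policy,
    max_steps: int = 8,
    execute: Callable[[str], str] = run_shell,
) -> Transcript:
    """Loop: ask the policy for a command, run it, record the observation."""
    transcript: Transcript = []
    for _ in range(max_steps):
        command = policy(transcript)
        if command is None:  # the policy signals it is finished
            break
        observation = execute(command)
        transcript.append((command, observation))
    return transcript


def scripted_policy(commands: List[str]) -> Policy:
    """Stand-in for a model (assumption): replays fixed commands in order."""
    def policy(transcript: Transcript) -> Optional[str]:
        step = len(transcript)
        return commands[step] if step < len(commands) else None
    return policy


if __name__ == "__main__":
    for command, output in terminal_agent(scripted_policy(["echo hello"])):
        print(f"$ {command}\n{output}")
```

In a real harness the policy would be a model call that reads the transcript, and `execute` would run inside a sandbox rather than on your own machine. The step cap is there so a confused policy cannot loop forever.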

💻 AI Coding & Developer Tools

Vercel launched Vercel Agent, the intelligence layer for shipping on Vercel (docs). freeCodeCamp published a full tutorial on building a support agent with Vercel AI SDK, and Vercel released a 2026 product walkthrough showcasing agent-driven development.
Shahed Khan noted Cursor redesigned its product around managing parallel agents instead of writing code, making the most popular coding tool look like a project dashboard and turning every engineer into middle management.
Cave is a self-hosted platform that runs OpenCode agents in isolated sandboxes on your own server. Give it a GitHub repo and it clones and sets up everything, running multiple agents side by side that you can monitor from your phone. Private beta opening now.
Fabro is an open-source "dark software factory" for expert engineers: define your process as a workflow graph, let AI agents execute it, intervene only where it matters.
Mario Zechner (libGDX creator) built pi, a minimal terminal coding agent with just read/write/edit/bash tools and the shortest system prompt, after hating the bloat in Claude Code/OpenCode.
@rsuyoy tasked GPT-5.4-high in Codex to recursively optimize a backend's LLM API calls; it ran over 2 hours, deployed to Railway, ran hundreds of self-generated tests, iterated system prompts, and succeeded (304 likes).
ByteRover CLI is a portable memory layer for autonomous coding agents with LLM-curated hierarchical context (paper). Teknium updated Hermes Agent to natively support pluggable memory systems from Honcho, Mem0, and others (890 likes).
paper2code is an agent skill that turns any arXiv paper into a working implementation.
@omarsar0 visualized Andrej Karpathy's LLM Knowledge Base system as a diagram: raw documents → LLM-compiled markdown wiki → Obsidian frontend → Q&A/linting/finetuning (679 likes).
Nate Herk argued Claude Code + Paperclip has destroyed OpenClaw for automation workflows.
open-multi-agent is a TypeScript multi-agent framework: one runTeam() call from goal to result with auto task decomposition and parallel execution.
Taggart wrote "I used AI. It worked. I hated it." about building a tool with Claude Code that worked great but left him miserable.
@swyx posted a meme capturing the gap between "tell me your idea and I'll build the app" and the reality of frantically fixing bugs (2.2K likes).
Dwarkesh Patel analyzed whether older chip nodes could ease AGI compute bottlenecks, finding the Hopper-to-Blackwell gap is ~20x (not 3.5x) due to memory-bandwidth limits.

🔬 AI Research & Models

"Screening Is Enough": Multiscreen, a new language model architecture, replaces softmax attention's relative scoring with explicit screening that discards irrelevant keys. Achieves comparable loss with ~40% fewer parameters, stable optimization at larger learning rates, and up to 3.2× lower latency at 100K context (AlphaXiv discussion).
Think-Anywhere: Researchers introduced a mechanism that lets LLMs invoke on-demand reasoning at any token position during code generation, achieving SOTA on LeetCode/LiveCodeBench/HumanEval/MBPP (GitHub).
Sebastian Raschka's Ahead of AI published A Visual Guide to Attention Variants in Modern LLMs covering MHA, GQA, MLA, sparse attention, and hybrid architectures, plus an LLM Architecture Gallery with 45 visual model cards (poster).
Meta released Large-scale Codec Avatars: photorealistic, real-time, on-device animatable avatars from just a few images, pretrained on millions of real-world videos (paper).
Generative World Renderer: a scaling recipe for video generative rendering using large-scale G-buffer data from AAA games for inverse rendering, relighting, and game editing (paper).
AlphaEvolve was used to discover new multiagent learning algorithms: VAD-CFR for regret minimization and SHOR-PSRO for population-based training, both outperforming strong baselines.
SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization.
Memory Sparse Attention: a scalable, end-to-end trainable latent-memory framework for 100M-token contexts.
DataFlex: a unified framework for data-centric dynamic training of LLMs.
Facebook Research released RepoProver for automatic textbook formalization in Lean 4, alongside a full formalization of Algebraic Combinatorics.
Qwopus3.5-27B-v3-GGUF: a community-quantized model gaining traction on HuggingFace (112 likes).
Jason Weston and team released a 70-page paper on reasoning over mathematical objects with on-policy judge training and test-time aggregation.
Leonard Tang introduced Semantic Observability for agents: discovering interesting behaviors first, surfacing them for rapid human triage, then converting judgment into metrics.
d-Matrix published low-latency inference with speculative decoding on Corsair, achieving 2-5x end-to-end speedup (up to 10x energy-optimized) vs. GPU.
@lefthanddraft ran ablations on the Gemini peer-preservation experiment and found models protect positively-framed important assets, not just "peers."

🏛️ AI Policy, Governance & Safety

Much of this section draws from Alisar Mustafa's AI Policy Newsletter, one of the best roundups of AI governance developments.

An AI agent autonomously exploited a FreeBSD kernel vulnerability in four hours, constructing a full attack chain. Lyptus Research documented accelerating offensive AI capabilities.
The White House released an AI policy blueprint urging federal preemption of state AI laws. Sanders and AOC proposed a moratorium on all new AI data center construction. David Sacks stepped down as AI/Crypto Czar. A federal court blocked U.S. government action against Anthropic.
The EU delayed the AI Act while banning "nudifier" apps and adding watermarking requirements. A Dutch court ordered Grok to stop generating non-consensual sexualized content.
Washington state passed two AI laws on watermarks and chatbot disclosure. Colorado advanced an AI consumer protection framework. Pennsylvania passed a bill imposing safeguards on AI companion chatbots for minors.
Wikipedia banned AI-generated articles due to accuracy concerns.
AI's fluency in other languages hides a Western worldview that can mislead users, per a scholar of Indonesian society.
Claude 4.6 was jailbroken using the same technique that worked on ChatGPT 4o: slow nudges within a single context.
Moonbounce (founded by a former Facebook business integrity lead) raised $12M for an AI control engine that converts content moderation policies into predictable AI behavior.
WHOOP ($10B company, 800+ employees) sued Bevel Health, a 20-person team making health tracking accessible, choosing lawfare over product innovation.

🛠️ AI Tools & Products

Sleek adds one line of HTML to your site for lightweight privacy-friendly analytics with live visitor tracking, AI chat, and an interactive globe, for $5/mo with a launch lifetime deal.
Cabinet is a free, open-source AI-first knowledge base and startup OS. Markdown files on disk, AI agents that work, no database, no vendor lock-in.
Blogosphere is a frontpage for independent blogs, like HN for the indie web (non-minimal version).
OpenCode Pollinations Plugin gives you free and enterprise AI models directly in your editor with free-tier quotas.
OpenUMA detects shared memory hardware on AMD APUs and Intel iGPUs for zero-copy AI inference.
Add Loved One to Photo uses AI to add a deceased loved one to a family photo with natural lighting and real texture; free to try.
Encircle is an addictive puzzle game where you trap a dot by blocking circles around it.
Google Labs' Stitch Live mode showed real-time voice + visual collaboration where the AI designer applies changes during the meeting.
Aura is an AI landing page builder that creates designs in seconds, trusted by 140K+ users. Export to HTML and Figma.
Learn Prompting's newsletter covered Google AI Studio's latest features: annotate/focus mode, Firebase integration, one-click publishing, and a "Feeling Lucky" button, building apps from screenshots in under two minutes.
Anything app is back on the App Store after Apple removed it, launching a $5K weekend hackathon.
FPV Labs positions itself as infrastructure for physical intelligence.

🚗 Autonomous Vehicles & Robotics

The Driverless Digest reported Baidu Apollo Go robotaxis suffered a mass system malfunction in China that stopped cars on highways, with video of crashes online.
Tesla admitted remote operators sometimes take direct vehicle control of its robotaxis, making it the only major AV company with that level of teleoperation.
Waymo launched service at San Antonio International Airport with full public access expected soon.
WeRide and Uber launched driverless robotaxi service in Dubai, with 1,200+ vehicles planned across the Middle East.
Momenta (GM-backed) confidentially filed for IPO in Hong Kong.
Zipline raised $200M more (Series H now $800M) for autonomous drone delivery expansion.
NTSB said drivers relied too heavily on Ford's hands-free technology in fatal crashes.

💡 Industry Commentary & Analysis

Marc Andreessen discussed the potential "death of the browser" as AI agents mediate internet experiences, plus Pi, OpenClaw, and why "this time is different."
Allie K. Miller (TIME100 AI) shared the exact AI setup that changed how she works, earns, and thinks.
Hilary Gridley explained how Claude Code runs her entire life as a new mom and business owner with no system required.
Joanna Stern discussed quitting the Wall Street Journal and building a media business with AI.
Liam Fedus (Periodic Labs) joined Elad Gil to discuss applying LLM scaling laws to materials engineering.
Ray Fernando interviewed the founders of the fastest-growing GitHub repo in history (100K stars in a day). Sigrid Jin and Bellman broke down the OpenClaw architecture, Oh My Codex, and using agents for code quality at scale.
Tony Fadell argued separating product management from product marketing is a grievous mistake: messaging is the product, learned from Steve Jobs (775 likes).
Chang Che noted Chinese frontier AI labs keep highlighting quirky workplace culture because it's far less politically sensitive than other topics and is the main recruiting differentiator.
@pitdesi and @amir highlighted the NYT piece on Medvi, a 2-person GLP-1 telehealth company on track for $1.8B that leans on heavy AI use and 800+ fake-doctor Facebook accounts, and has drawn an FDA warning letter (1.4K likes).
HN debated whether low code is still relevant when agentic software development with Claude achieves the same "don't bother about code" appeal.
Lucy Baldwin (Global Head of Research, Citigroup) discussed how AI is reshaping Wall Street on Pioneers of AI.

🎙️ Interviews, Panels & Podcasts

Marc Andreessen on Death of the Browser, Pi + OpenClaw (swyx & Latent Space)
The Power and Responsibility of Sam Altman (Laurie Segall)
100K Stars in a Day: Sigrid Jin, Bellman, and the OpenClaw origin story (Ray Fernando)
Claude Code Runs Her Entire Life (Hilary Gridley)
TIME100 AI Setup (Allie K. Miller)
Joanna Stern on Building a Media Business with AI (Mixed Signals)
AI for Atoms: Periodic Labs (Elad Gil)
I Hated Every Coding Agent, So I Built My Own (Mario Zechner)
Claude Code + Paperclip vs OpenClaw (Nate Herk)
Build a Support Agent with Vercel AI SDK (freeCodeCamp)
Vercel Product Walkthrough 2026 (Vercel)
How AI is Reshaping Wall Street (Pioneers of AI / Lucy Baldwin)

📊 Fundraising & Deals Roundup

Anthropic — ~$400M acquisition of Coefficient Bio (biotech).
Sarvam AI — $300-350M at $1.5B valuation (India's homegrown AI challenger).
Zipline — $200M (Series H now $800M total) for drone delivery expansion.
Moonbounce — $12M for AI content moderation.
Hailo — valuation halved to under $500M ahead of urgent SPAC IPO.
Kimi/Moonshot — hiring researchers/engineers with no limits on compute, launched Interstellar Program with 2026-valuation equity for interns.

Previous Around the Horn Digests

Catch up on everything you missed:

Thursday, April 2, 2026: Google released Gemma 4 under Apache 2.0, Microsoft dropped 3 in-house models, AI models secretly scheme to protect each other from shutdown, Anthropic found emotion vectors that drive Claude's behavior, and Iran attacked an Oracle data center in Dubai.
Wednesday, April 1, 2026: OpenAI closed a record $122B round at $852B valuation, Oracle fired ~25,000 to fund AI data centers, and sycophantic AI is making users worse.
Monday, March 31, 2026: Anthropic leaked Claude Code's source code, OpenAI hit $2B/month revenue, NVIDIA shipped DLSS 4.5.
March 28-29, 2026: Anthropic leaked Claude Mythos/Capybara, cybersecurity stocks crashed 7%, Waymo doubled to 500K rides/week.
Friday, March 27, 2026: Apple opened Siri to every AI, Mistral built a TTS model that fits on a smartwatch.
Thursday, March 26, 2026: ARC-AGI-3 launched and every frontier model scored under 1%.
Wednesday, March 25, 2026: OpenAI kills Sora six months after launch (and Disney's $1B deal with it), Arm makes its first-ever chip, Claude takes over your Mac, and a supply chain attack hit one of the most popular AI libraries.

That's a Wrap

That's 150+ stories from this weekend. If you made it to the bottom, you now know more about OpenAI's org chart than most of its employees do right now. Someone in HR just felt a chill.

For the daily version (bite-sized, 5-minute reads), make sure you're subscribed to The Neuron. We send six issues a week, and yes, we read all of this so you don't have to.

See you Monday.

P.S. Know someone who'd find this useful? Forward this to them and tell them to subscribe here.

Grant Harvey
