Everything That Happened in AI Today: Monday, June 8

Apple finally made Siri the main character at WWDC, then immediately had to explain why Europe and China will be waiting outside the party.

Welcome to the Around the Horn Digest, where today’s AI news looked like Apple trying to make up for two years of Siri jokes in one keynote. WWDC gave us a rebuilt Siri AI, on-screen awareness, camera actions, prompt-built Shortcuts, and a dedicated app, which is a lot of ways to say Apple wants the assistant to finally behave like it lives on your phone. The awkward part: the biggest version is delayed in the EU and China, and the strongest on-device model needs Apple’s newest hardware. Outside Cupertino, OpenAI filed for an IPO, Anthropic showed Mythos can exploit fresh software flaws in hours, and the infrastructure bill kept getting weirder. Let's get into it.


Around the Horn: Monday, June 8, 2026

The lead story was Apple’s WWDC reset, because the company finally gave Siri the agentic shape people expected when Apple Intelligence first launched. Apple previewed next-generation Apple Intelligence and a rebuilt Siri AI across iOS 27, iPadOS 27, macOS, watchOS, and visionOS, with a dedicated Siri app, privately synced conversation history, personal-context understanding across messages, emails, and photos, on-screen awareness, expanded Visual Intelligence, and systemwide app actions. The WWDC keynote, Variety, and PCMag all framed the announcement as Apple’s biggest attempt yet to revive its AI ambitions.

The practical demos were more interesting than the branding. Visual Intelligence now works through a Siri mode inside the Camera app, so users can point at a restaurant bill, choose only what they ordered, and split the tab through Apple Cash. TechCrunch highlighted the bill-splitting demo, and Apple also added nutrition insights from food photos. Shortcuts can now generate workflows from plain-English prompts, and Apple said Apple Intelligence is expanding across Photos, Safari, Messages, Mail, Image Playground, Home, accessibility features, and everyday iPhone, iPad, and Mac experiences.

The caveats are the story too. Apple said Siri AI will be delayed in the EU on iPhone and iPad because of the Digital Markets Act, and MacRumors reported the new Siri AI features also will not be available in China at beta launch. 9to5Mac reported the most powerful on-device AI model requires iPhone 17 Pro, iPhone Air, or newer iPads and Macs with M3 / M4-class hardware and enough memory, while visionOS 27 brings the new Siri, Visual Intelligence, AI-enhanced Flyover, spatial scenes, eye-controlled notifications, curved windows, and a redesigned Control Center to Vision Pro. Apple finally showed the assistant it wants to ship. The rest of the year will show how much of it actually reaches users.

🏆 TOP 5 NEWS (Around the Horn)

OpenAI announced that it had submitted a confidential draft S-1 to the SEC, disclosing the move because it expected the filing to leak; it said it has not decided on timing and framed the filing as an option to go public sooner if that becomes best. CNBC reported the filing as preparation for a possible mega AI debut around a reported $852B tender-offer valuation.
Anthropic’s Mythos can exploit newly disclosed software flaws in hours, while Anthropic’s N-day research measured how models accelerate known-vulnerability attacks and The Next Web warned China is distilling those cyber capabilities at scale.
NVIDIA, LG, SK hynix, and NAVER turned South Korea into an AI factory showcase, with Jensen Huang and LG Chairman Koo Kwang-mo deepening ties, Chosun noting robotics and data-center cooperation, and Reuters reporting NAVER’s path from 55 megawatts to gigawatt-scale infrastructure.
Microsoft shut down more than 70 of its own GitHub repositories after hackers pushed malware targeting Claude and Gemini coding-agent users; OpenSourceMalware said GitHub disabled 73 Microsoft repos across four organizations in 105 seconds, with a recompromised durabletask package and the open-sourced Miasma worm at the center.
The Pentagon accused Alibaba, Baidu, BYD, and Unitree of supporting China’s military, and Bloomberg said the U.S. doubled down on labeling some of China’s corporate crown jewels as national-security threats.

Honorable Mentions

Amazon struck a multibillion-dollar Corning deal for AI data-center fiber, The Next Web argued fiber is AI’s next bottleneck, and Amazon said the deal will create 1,000 advanced manufacturing jobs in North Carolina.
Cognition launched FrontierCode, a maintainer-built coding benchmark that asks whether model output would actually be merged, with Cognition and scaling01 highlighting its focus on regression safety, test correctness, scope discipline, and code quality.
Xiaomi and TileRT pushed MiMo-V2.5-Pro-UltraSpeed, a 1T-parameter model, past 1,000 tokens per second on a standard 8-GPU node, with HN discussion, Xiaomi’s launch post, open weights, and limited chat access available by application.
MIT FutureTech ranked AI risks with a 272-expert Delphi study, tying the full paper, responsibility visualization, and MIT video to a warning about severe five-year risks, vulnerable groups, and who should be responsible for mitigation.

🍪 TOP TREATS TO TRY

NotebookLM lets you start research with a loose question, approve trusted sources in chat, see its thinking steps, and export editable charts, reports, spreadsheets, slides, images, CSVs, and JSON files; the announcement framed the update as agentic chat for multi-step research, and Google’s guide explains how to use it. Rolling out to Google AI Ultra subscribers and select Workspace accounts.
Kimi Work runs up to 300 specialized desktop agents in parallel to organize local files, automate workflows, control your browser through WebBridge, pull finance data from Yahoo Finance, World Bank, and Binance, remember preferences, and output PPTX, Word, PDF, or Excel files; Moonshot’s launch, Vik Vang’s demo, and additional launch context show the desktop-agent push. Free to download.
Shippy is Skylight’s free ocean-intelligence agent for maritime teams, answering questions like “show fishing activity near Fiji in the last 24 hours” with cited results from live vessel tracking, satellite detections, and partner datasets; Skylight’s launch post invites early adopters, and the support guide explains how it works. Free to try.
LangSmith lets teams create and manage fleets of specialized Deep Agents for inbox management, blog writing, competitor research, recruiting, and other work; Caspar von Bredow said they can use custom instructions, skills, tools, sub-agents, and memory, run on schedules, communicate through Slack, Teams, or email, and export an open-source harness for self-hosting. Free to try.
Intuned turns natural-language browser automation requests into deterministic Playwright code in TypeScript or Python for sites without APIs; the Launch HN thread said customers use it for scraping, reports, and form submission while the runtime handles stealth, auth/session reuse, scheduling, scaling, observability, and self-healing. Free tier with trial credits.
Hivemind gives Claude Code, OpenClaw, Codex, Cursor, Hermes, and other agents shared persistent memory so every trace can become a reusable team skill; Davit Buniatyan said SkillOpt improves agent accuracy by 19-25 points, the follow-up explained the one-command install, and the GitHub repo is open source. Free / open-source.
Alexa for Shopping lets Amazon customers design custom merch by describing ideas like pet portraits on tumblers or matching group shirts, then turns those prompts into shareable, wearable designs. No pricing details.

🎓 AI Skill of the Day

Make Claude prove the work before you trust the run. Today’s skill comes from Boris Cherny’s advice for running Claude Opus autonomously for hours or days. The useful idea is to treat autonomy like a system, not a wish: give Claude permission to keep moving, give it a goal loop, then make it verify the finished work.

Set Claude to auto mode so it does not ask for approval on every safe project action. Run it in the cloud so the job keeps going after you close your laptop. Use /goal or /loop, which are steering commands that nudge the agent to continue until the task is done. For bigger work, use dynamic workflows so Claude can coordinate many sub-agents. Then add end-to-end verification: Claude in Chrome for web work, an iOS or Android simulator MCP for mobile (MCP means a tool connection the model can use), or the full running server for backend work.

Run this as a long-horizon task.

Use auto-approved permissions only for safe project actions.
Use /goal or /loop to keep working until the outcome is complete.
If the task is too large, create a dynamic workflow and split it into sub-agents.
Do not report "done" until you self-verify end to end:
- Web: test in the browser.
- Mobile: test in an iOS or Android simulator MCP.
- Backend: start the full service and run the relevant checks.

At the end, give me:
1. What changed
2. How you verified it
3. What risks remain

🏢 Big Tech & Major Companies

Google and NVIDIA reportedly looked at Intel as a backup chip manufacturer. The Information reported TSMC capacity pressure is pushing major AI chip designers toward Intel as a backup manufacturer, The Economic Times said Google ordered more than 3M TPUs, Google's in-house AI chips, for 2028 production while NVIDIA explores Intel technology for its own processors, and Yahoo Finance tracked the stock reaction and listed the rumor among the reasons Intel shares bounced.
An OpenAI Sora researcher resigned to start a new company. Gabriel said he left Sora earlier this year to build a team at OpenAI, has now resigned because he has “always been a founder,” and wants to build one last product before AGI; he added that he already misses his friends and colleagues and believes in them.
Apple Maps is adding 3D Gaussian Splatting. Bilawal Sidhu said Apple is bringing 3D Gaussian Splatting, a method for turning photos into clean 3D scenes, to Apple Maps using oblique aerial imagery, giving ground-level detail without common photogrammetry artifacts like broccoli trees or melted powerlines; his follow-up called it the first major Maps refresh since 2021, and his reality-capture playlist gives the Gaussian Splatting / photogrammetry background.
Google cut AI Plus pricing. 9to5Google reported Google dropped AI Plus to $4.99 per month and increased included storage to 400 GB.
Meta removed face-recognition code from its smart-glasses app after reporting exposed it. WIRED reported Meta deleted a face-recognition system from the Meta AI smart-glasses companion app, Gizmodo framed Meta as mostly mad it got caught, and the r/technology discussion praised WIRED’s recent reporting run.
Italy dropped its WhatsApp AI bot investigation into Meta. Reuters reported Italy’s competition authority ended its probe into whether Meta abused its dominant position by installing its AI tool on WhatsApp.
Apple watchers and TestingCatalog expected a roughly 1.2T-parameter Gemini-backed Siri architecture through Private Cloud Compute, with a dedicated Siri app, synced history, Dynamic Island “Search or Ask,” mail, calendar, contacts, and web extensions, an AI health coach, and an iOS 27 cleanup push.
Adam Fry said the ChatGPT team shipped several quality-of-life upgrades: interactive charts, full-screen writing mode with Library saves, tables of contents for long chats, editing messages that contain attachments, long-press Send for temporary model effort selection on Plus and Pro, and faster iOS typing.

💼 AI Productivity, Labor & Economics

Companies still struggle to see what AI is costing them. The Wall Street Journal reported only 26% of companies say they have a comprehensive view of their AI costs, and The Decoder surfaced the same KPMG survey finding as a sign that most companies are still flying blind on AI spending.
Is AI Profitable Yet turned AI burn into a live scoreboard. The tracker collects data on whether AI companies are actually profitable yet, and its most memorable feature is a dollars-spent-since-page-load counter that makes the industry's capital burn visible in real time; linger for about a minute and you can watch roughly $1M vanish on the page.
AI started moving into one of finance's cushiest jobs. Bloomberg reported wealth managers who can make $500K+ are facing a chatbot reckoning as financial advice gets automated, and Reddit's finance community picked up the debate around what happens to traditional advisory work.
Julia Fonseca and Victor Duarte introduced AI for Structural Estimation, a neural approach that learns the mapping needed to clear markets after solving an economic model once, cutting computation from days to minutes and adding an agent that writes implementation code from a natural-language model description.
Tommy Shaughnessy argued that open-source inference price competition is pushing intelligence sharply cheaper, citing provider funding and pricing pressure plus follow-on GPU strategy as Western labs and inference providers prepare for a world where strong Chinese models may become harder to access.
Brian Armstrong argued demand for intelligence is nearly unlimited but most workloads should move to much cheaper models within 12-18 months, with only frontier orchestration and scientific-breakthrough tasks needing the most expensive models while energy and compute become the limiting factors.
Perplexity Research argued AI agents are reshaping knowledge work by raising task autonomy, lowering cost, and widening the scope of work people take on, while Perplexity framed agents as computers that can execute broader tasks instead of only answering questions.

🤖 AI Agents & Infrastructure

Ireland told AI-hungry data center builders to bring their own power. The Wall Street Journal reported Ireland is becoming a test case for countries trying to attract AI investment without risking outages or higher electricity bills for citizens.
A donated-parkland dispute in Taylor, Texas became a data center fight. 404 Media reported a farmer deeded 87 acres to the city in 1999 for a nominal $10 to be held in trust as future parkland, but the city sold the land to data center developer Blueprint for $10M in 2025; the donor's family sued to enforce the original park intent, the initial case was dismissed and is now on appeal, while the city cites zoning rules that permit the 135,000-square-foot data center and projects $30M in tax revenue over the next decade.
Helion raised more money to build Microsoft's fusion power plant. TechCrunch reported the Sam Altman-backed fusion startup raised $465M as it races to complete a power plant for Microsoft by 2028, a deadline shaped by AI's growing appetite for electricity.
Google AI Studio shared a getting-started guide for managed agents. Google AI Studio shared Pat Loeber's practical guide to building managed agents in AI Studio and the Gemini API, complete with a video walkthrough.
LangChain Fleet turned LangSmith into a workplace-agent control room. LangSmith now lets teams create and manage fleets of specialized Deep Agents for everyday workflows like inbox management, blog writing, competitor research, and recruiting; Caspar von Bredow said the agents can use custom instructions, skills, tools, sub-agents, and memory that improves through feedback, run on schedules, communicate through Slack, Teams, or email, and export their open-source harness so teams can self-host the context files. Free to try.
Nando de Freitas argued modern agents are built from a stacked training pipeline with no unified semantics for interaction, then proposed continual, interactive, causal agents where world-written tokens act as evidence, self-written tokens act as interventions, and feedback/correction transcripts train more purposeful behavior; his thread framed this as a cleaner path than stitching together pretraining, preference modeling, and RL.
Self-Trained Verification proposed training verifiers by showing models reference solutions, then making them imitate a more informed version of themselves; the method roughly doubled hard-math accuracy, lifted scientific reasoning from 1.5% to 21%, and improved both test-time verification-refinement loops and training-time verifier-in-the-loop learning.
CausaLab tests whether LLM agents can discover hidden causal laws by experimenting in a synthetic lab, and Dylan Zhang said agents often predict well while learning the wrong mechanism, stop experimenting too early, and improve when a verification step checks whether the hypothesis explains the evidence.
Kwindla Hultman Kramer argued builders should write state machines that design loops for agents rather than hand-designing the loops themselves, while Philip Kiely added that the meta state machine can then be formally verified.
Peter Steinberger repeated his advice that coding-agent builders should design loops that prompt agents rather than repeatedly prompting the agents themselves, and Matt Van Horn unpacked the “loops” discourse as cron plus a decision-maker, where verification and halting conditions become the scarce resource.
Garry Tan’s gbrain gives OpenClaw and Hermes-style agents an opinionated shared memory layer, and David Breslauer said he moved his entire agent “brain” onto a dedicated server so Codex, Claude, OpenClaw, and Hermes can share summarized experiments and conclusions across sessions.
IFLAX combines object-importance learning and neuro-symbolic planning for long-horizon robot tasks under complex logical constraints, with the paper and SAIR Lab framing it as a way to recover from failures when plans have many dependencies.
RAGEN-2 studied reasoning collapse in agentic reinforcement learning, highlighting how agents trained in interactive settings can appear to improve while losing stable reasoning behaviors.
Self-Revising Discovery Systems proposed a categorical framework for agentic AI in science, and Omar Sar surfaced it as part of the broader shift from one-shot models to systems that revise their own discovery loops.
Hello Robot shared another update around its Stretch home-assistance robot, keeping the home-robotics thread alive as labs test whether general-purpose robots can move from demos into real houses.
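
The "loops" framing attributed to Matt Van Horn above (cron plus a decision-maker, with verification and halting conditions as the scarce resource) can be sketched in a few lines. This is only an illustration under stated assumptions, not anyone's actual harness: `run_loop`, `agent_step`, and `verify` are hypothetical names, and the "agent" is a stub that increments a counter.

```python
from typing import Callable


def run_loop(
    agent_step: Callable[[int], int],
    verify: Callable[[int], bool],
    state: int = 0,
    max_ticks: int = 10,
) -> tuple[int, str]:
    """Hypothetical outer loop: tick, let the agent act, verify, halt.

    The for-loop stands in for cron; `verify` is the decision-maker
    that decides whether the work is actually done.
    """
    for tick in range(1, max_ticks + 1):
        state = agent_step(state)  # one unit of agent work
        if verify(state):  # independent check, not the agent's own "done"
            return state, f"verified after {tick} ticks"
    # Halting condition: give up instead of looping forever.
    return state, f"halted unverified after {max_ticks} ticks"


# Stub agent that makes progress: each step adds one to the state.
result = run_loop(agent_step=lambda s: s + 1, verify=lambda s: s >= 3)

# Stub agent that never makes progress: the tick budget stops the loop.
stalled = run_loop(agent_step=lambda s: s, verify=lambda s: s >= 3, max_ticks=5)

print(result)
print(stalled)
```

The design point is that the loop never trusts the agent's own claim of completion: it exits either because the separate check passed or because the tick budget ran out.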

💻 AI Coding & Developer Tools

Intuned wants browser automation to behave like production software. Intuned lets you describe browser automations in natural language so its AI agent generates, deploys, and automatically heals deterministic Playwright code, which means repeatable browser-control scripts, in TypeScript or Python for sites without APIs; the Launch HN thread said common uses include scraping data, pulling reports, and submitting forms, while the managed runtime handles stealth, auth/session reuse, scheduling, scaling, observability, and self-healing when sites change. Free tier with trial credits.
Lathe uses LLMs to teach you a domain without doing the work for you. Lathe generates source-backed, multi-part technical tutorials for any topic, then has you work through them by reading and typing the code by hand in a local UI; the Show HN post framed it as using LLMs to learn a new domain rather than skip past the learning.
Nightwatch turns incident response into a local, read-only AI workflow. Nightwatch is an open-source AI SRE, meaning a system-reliability assistant, that clusters alert storms, investigates root causes over live systems, and proposes human-gated fixes; the HN post said the weekend project came from a real Kubernetes upgrade that went wrong, reached a point where rollback was impossible, and had to be fixed live during the night while several problems collided.
lowfat trims command-line noise before agents waste tokens on it. lowfat is a pluggable CLI filter that sits between AI agents and verbose command output, such as full Kubernetes config dumps, strips irrelevant lines, and passes through only what matters; the Show HN post said it saved 91.8% of the creator's LLM tokens over two months of personal use.
Web Speed builds machine-readable maps for web agents. Web Speed turns websites into high-fidelity, token-efficient machine maps by progressively translating raw text into HTML, CSS, JavaScript, JSX, and structured output; the Show HN thread described it as an open-source shared web-map registry using MCP, the standard that lets agents connect to outside tools and data.
boxes.dev moves Claude Code and Codex sessions off localhost. boxes.dev gives each Claude Code or Codex chat its own isolated cloud computer so you can connect from mobile or desktop and code from anywhere; the Show HN post said the ex-Gem engineers behind it spent the last year coding almost exclusively with Codex and Claude Code before deciding each agent chat needed its own remote environment.
mnemo gives LLMs a local memory layer. mnemo is a local-first AI memory layer built with Rust, SQLite, and petgraph that gives any LLM, including Ollama, OpenAI, Anthropic, or compatible backends, a persistent knowledge graph, entity extraction, and semantic retrieval; the HN post put the pitch simply: most LLMs forget everything when a conversation ends, and mnemo fixes that.
A Warp sidecar brought local agent sessions into the browser. Developer sathvik built a Chrome extension for Warp that shows your local agent session as a browser sidecar, so you can monitor cloud-agent runs and kick off new ones without switching contexts; it was built with Aiden Bai's react-grab library.
Prince Canuma expanded local MLX tooling for Apple Silicon. Canuma said mlx-audio and mlx-vlm added day-one support for Google DeepMind’s Gemma 4 12B multimodal model, QAT checkpoints for local and edge use, 15+ new TTS / ASR / VAD models, faster long-form transcription, video input, and an expanded OpenAI-compatible audio server for fast local M-series workflows.
Together.ai surfaced its endpoint console for builders. Together.ai’s endpoint page points developers to its AI-native cloud console for model endpoints and inference workflows.
Codex 0.138.0 tightened the CLI-to-desktop workflow. OpenAI’s Codex release added /app handoff from CLI threads into Codex Desktop on macOS and native Windows, exposed saved file paths for local image attachments and image generations, made reasoning-effort selection more flexible, added token-usage and v2 personal-access-token support, enriched plugin JSON output, and fixed /goal behavior so idle auto-turns stay out of Plan mode and goals stop after terminal failures; Felipe Coury highlighted the /app handoff.
Santosh Yadav argued AI coding works best when engineers still own judgment. Yadav wrote that his AI development workflow starts with real user pain, a design doc, approval, a planning agent to break work into smaller tasks, small PRs, and feedback monitoring; the productivity gain comes from faster research and iteration, while the engineer’s value is catching hallucinated solutions, keeping code readable, and deciding what is safe for production.
AutoMegaKernel is an agent harness that compiles a model into one provably correct, self-retargeting CUDA megakernel and self-tunes it past cuBLAS for batch-1 LLM decoding.
RunInfra lets developers optimize open models for production in a chat-native interface that selects compatible models, benchmarks GPUs, optimizes runtime and kernel paths, and deploys measured endpoints; Liad Yosef framed it as model infrastructure built around evidence instead of guesswork.
AgentJourney lets teams point real top agents at any website and watch their paths draw live, so product teams can see how agents shop, search, click, and look for files like llms.txt or pricing.md.
Jeswanth Mukesh published a mathematical walkthrough of Rotary Position Embeddings, the position-tracking technique used in Llama-style models, while Killshot surfaced it for readers who want to understand inverse frequencies, rotation matrices, and relative-position behavior from scratch.
Omar Sanseviero said Gemma 4 MTP was merged into llama.cpp, enabling fast local setups when combined with quantization-aware training, and noted in a follow-up that llama.cpp added video input support for Gemma 4 through chat-completion endpoints and mtmd-cli.
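
For readers who want a feel for the Rotary Position Embeddings item above before reading the full walkthrough, here is a minimal pure-Python sketch of the three ideas it names: inverse frequencies, rotation, and relative-position behavior. It is not taken from Jeswanth Mukesh's post. The function names are made up for illustration, the base of 10000 is the commonly used default, and the code rotates adjacent dimension pairs as in the original RoPE formulation (some implementations pair dimension i with i + d/2 instead).

```python
import math


def inverse_frequencies(dim: int, base: float = 10000.0) -> list[float]:
    """One rotation frequency per pair of dimensions: base^(-2i/dim)."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]


def rope(vec: list[float], position: int, base: float = 10000.0) -> list[float]:
    """Rotate each consecutive (even, odd) pair of `vec` by position * frequency."""
    out = []
    for i, freq in enumerate(inverse_frequencies(len(vec), base)):
        angle = position * freq
        x, y = vec[2 * i], vec[2 * i + 1]
        # Standard 2D rotation applied to the pair (x, y).
        out.append(x * math.cos(angle) - y * math.sin(angle))
        out.append(x * math.sin(angle) + y * math.cos(angle))
    return out


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


# Arbitrary example query and key vectors (illustrative values only).
q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.4]

# Same offset (3 positions apart) at two different absolute positions.
score_near = dot(rope(q, 5), rope(k, 2))
score_far = dot(rope(q, 105), rope(k, 102))

# The attention score depends only on the offset, up to float rounding.
print(abs(score_near - score_far) < 1e-9)
```

Because each pair is only rotated, vector lengths are unchanged, and the query-key dot product ends up depending on the distance between positions rather than on where in the sequence the pair sits.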

🔬 AI Research & Models

Ideogram 4.0 arrived as an open-weight image model built for controllability. The Show HN post said the 9.3B single-stream diffusion transformer, a text-to-image model architecture, was trained from scratch with structured JSON prompts, strong text rendering, spatial awareness through bounding-box guidance, and color-palette control via hex values; the GitHub repo includes full inference code, prompting docs, and gated non-commercial weights, and the NF4 quantized checkpoint can run on a single 24 GB GPU.
A perceptron tutorial rebuilt the smallest useful neural net from scratch. Devarsh Ranpara published an interactive Python walkthrough explaining how weights represent feature importance, how bias shifts the decision boundary, why epochs and learning rate shape training, and why normalizing data prevents unstable updates, with browser demos and runnable code for tasks like sign classification or student pass/fail prediction; the HN thread also surfaced Victor Banev's related “Simplest Learning Machine” idea: a minimal algorithm that reads binary events, keeps only 1 byte of dynamic internal state, and improves continuously over time.
Preston Jensen visualized what image embeddings actually look like. Jensen poked at the geometry of an image model's feature space, and the HN discussion described a related demo that turned embedding vectors from hundreds of cat, dog, and plane photos into heatmaps so database and storage engineers could see what these vectors physically look like at a low level.
A tiny CUDA GPT implementation turned language-model guts into hackable code. markusheimerl's repo implements a generative pretrained transformer, the basic model family behind GPT-style systems, in CUDA, NVIDIA's programming layer for GPU code; the HN thread dug into practical next steps like creating your own data, building a tokenizer, choosing context size, defining stop criteria, training LoRA adapters, which are small model add-ons, and quantizing models, meaning compressing them so they run smaller and cheaper.
Opus 4.8 one-shot a formally verified polygon-intersection project. schildep's repo implements formally verified polygon intersection, meaning code paired with a mathematical proof that it behaves correctly, and the Show HN post said Opus 4.8 could now generate both the algorithm and the formal proof in one shot, where earlier models required the human to supply proof strategies across multiple steps.
Google released Gemma 4 checkpoints built to run smaller on phones and laptops. Google released quantization-aware training checkpoints for Gemma 4, meaning the models are trained to keep their quality after compression, delivering similar performance with roughly 4x less memory; the Hugging Face collection includes the Q4_0 weights, and Philipp Schmid noted the new mobile quantization format shrinks the E2B model to about 1 GB and can run directly on devices.
Nex AGI released a small agentic model for local productivity work. Nex-N2-mini is a 35B-parameter MoE model, meaning only part of the model activates per request, fine-tuned from Qwen3.5-35B-A3B-Base for tool calling, terminal execution, productivity tasks, and coherent long-horizon reasoning; LottoLabs surfaced the release. Apache 2.0.
Goedel-Architect tried to make formal theorem proving more programmable. The arXiv paper and project site describe a system that generates and refines proof blueprints so AI systems can help build machine-checkable math proofs more reliably.
Nature examined how AI is reshaping math and physics discovery. The Nature piece argued AI is not replacing human intuition in these fields, but changing how questions are asked, explored, and understood; Ananyo Bhattacharya shared the piece, and Google DeepMind’s account circulated related work.
Researchers induced some sleep-like benefits without sleep in mice. New Scientist reported scientists used targeted brain stimulation to induce NREM-like slow-wave activity in awake, sleep-deprived mice, restoring memory performance and reducing sleep-pressure markers to a degree comparable to natural deep sleep; the Nature Neuroscience paper backs the finding, while the r/Futurology thread debated both insomnia / shift-work benefits and dystopian pressure to work longer.
Anthropic showed why agents advance faster in coding than biology. Laura Luebbert argued biological databases are built for expert humans clicking through messy portals, not agents; in VirBench, a test with 120 viral sequence queries across 40 pathogens, Claude, Biomni, Edison Analysis, and GPT models ranged from 16.9% to 91.3% mean accuracy, but adding gget virus, a deterministic retrieval layer that performs the database lookup the same way each time, pushed accuracy above 90% for all agents and as high as 99.7% for GPT-5.5.
Akarsh Kumar introduced Supervised Memory Training, a way to pretrain recurrent networks without recurrence by using encoder-decoder transformers to estimate memory states for one-step supervised learning; the paper and demo thread show time-parallel training, O(1) credit assignment, long-range memory, next-pixel prediction, and DAgger Memory Training for stable rollouts.
Argument Collapse found LLM-written long-form debate essays produced far fewer unique arguments than human essays, with only 3.4% unique main arguments versus 65.3% for humans; the repo and Yekyung Kim’s thread showed the models favored generic evidence, hedged reasoning, and formulaic structure even when prompted for diversity.
Widening the Gap introduced an attack that hides malicious behavior in full-precision models and activates it after quantization, the compression step that makes models cheaper to run; the code and Kazuki Egashira’s thread showed the attack can work against GPTQ and AWQ while preserving normal model utility.
ModelScope and Nex AGI open-sourced the Nex-N2 agentic model series, including Nex-N2-Pro and Nex-N2-mini, with adaptive reasoning depth, Apache 2.0 licensing, SGLang support, parsers, Docker support, and strong coding, terminal, browse, and long-horizon workflow scores.
Harish Krishnakumar, David Yin, and collaborators released WorldBench, a visually diverse benchmark with 2,000 human-curated questions across seven image domains; the paper, dataset, and code show top closed and open vision-language models still struggle with grounded visual reasoning.
Macaron-V1-Preview is a 749B MoL agent model post-trained from GLM5.1, with weights on Hugging Face and a Macaron Versus playground for comparing GenUI responses with plain text.
Oliver Sieberling and collaborators showed dynamic short convolutions, input-dependent filters added around Transformer layers, can improve Transformers, MoEs, and linear-attention models with roughly 1.33x-1.60x compute advantage; the team also released Triton kernels for dynamic causal short convolutions.
ROTATE decomposes MLP neuron weights in vocabulary space by learning small rotations guided by vocabulary kurtosis, surfacing hidden monosemantic channels inside polysemantic neurons; Mor Geva framed it as a clearer view of how neurons act like key-value memory cells.
Wenda Xu proposed Speculative Knowledge Distillation, a hybrid method that samples from the student model but rejects and resamples from the teacher when the teacher assigns low probability, improving distillation when the student is still far from the teacher.
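A rough sketch of the accept-or-resample step described above, for readers who want to see the mechanics. Everything here is illustrative: the function name `skd_sample_token`, the toy probability tables, and the fixed `min_teacher_prob` threshold are assumptions made for this example, and the paper's actual acceptance rule may differ.

```python
import random

def skd_sample_token(student_probs, teacher_probs, min_teacher_prob=0.05, rng=random):
    """Propose a token from the student; keep it only if the teacher rates it highly enough.

    Both arguments are hypothetical next-token distributions (token -> probability).
    The threshold is an assumed stand-in for "the teacher assigns low probability".
    """
    tokens = list(student_probs)
    proposal = rng.choices(tokens, weights=[student_probs[t] for t in tokens], k=1)[0]
    if teacher_probs.get(proposal, 0.0) >= min_teacher_prob:
        return proposal, "student"
    # Teacher considers the proposal unlikely: discard it and sample from the teacher instead.
    teacher_tokens = list(teacher_probs)
    replacement = rng.choices(
        teacher_tokens, weights=[teacher_probs[t] for t in teacher_tokens], k=1
    )[0]
    return replacement, "teacher"

# Toy distributions: the student sometimes proposes "zzz", which the teacher never would.
student = {"the": 0.6, "zzz": 0.4}
teacher = {"the": 0.9, "a": 0.1}
token, source = skd_sample_token(student, teacher)
print(token, source)
```

The returned source label makes it easy to count how often the teacher has to step in, which should fall as the student improves.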
Trajectory Dynamics introduced trajectory extrapolation error, a measure of how much each new word disrupts a language model’s internal path, and Elan Barenholtz said it predicts human reading times beyond surprisal, suggesting shared sequential-processing dynamics between humans and LLMs.
Tencent Hunyuan released MMAE, a benchmark for instruction-based audio editing across speech, music, sound, and mixtures; the code, dataset, and video show current models score below 5% exact match when asked to precisely edit existing audio while preserving everything else, with alphaXiv also surfacing the gap.
Chelsea Finn and collaborators introduced Long-Horizon Q-Learning, which bounds value differences across long horizons to reduce compounding bootstrapping errors in reinforcement learning and improves over one-step TD and n-step returns on long-horizon tasks.
A Mixed Diet Makes DINO showed that training vision encoders on a mixed diet of modalities makes DINO-style representations more omnivorous, aligning RGB, depth, and other visual forms better; DeepMind released code, and Rishabh Kabra highlighted the CVPR work.
On-Policy Representation Distillation moves distillation into hidden-state space by aligning student and teacher representations on the same rollouts, avoiding noisy vocabulary-level estimates; alphaXiv said it closes student-teacher gaps while training 1.44x faster and using 54% less memory than top-k output distillation.
Prism Capability Extraction explores how to extract latent model capabilities, with the paper and Tokenbender’s thread presenting it as a way to identify useful behaviors hiding inside trained models.
Why Larger Models Learn More analyzed how capacity, interference, and rare-task retention explain why bigger models preserve more capabilities, with Rohan Paul surfacing the paper for readers tracking model-scaling behavior.
Principles and Practice of Deep Representation Learning presented a mathematical theory of memory in deep representation learning, and Yi Ma framed it as an attempt to formalize how learned representations store, organize, and retrieve information.

🛠️ AI Tools & Products

MimicScribe brings no-bot meeting notes to macOS. MimicScribe is an on-device in-meeting technical assistant you trigger with Control-Space that gathers requirements, surfaces prep notes, transcribes meetings, and identifies speakers without cloud calls or meeting bots; the Show HN post said the speaker ID system reaches 97% accuracy by combining FluidAudio's Pyannote port with grammar-structure cues from Parakeet STT (speech-to-text).
SoulsOnly.ttf is a font for humans rather than AI. convictional built a font and custom keyboard firmware designed to be typed by humans and resist AI reading, and the HN post described the project simply as “SoulsOnly.ttf”: a font for humans, not AI, with firmware to type in it.
Matt Pocock turned his teaching style into an AI skill. Pocock said his new /teach skill, part of the AI Hero skills system, packages a decade of teaching experience into an interactive tutor that can teach step-by-step with HTML lessons and adaptive pacing, including a Rubik’s cube walkthrough.
Genspark Skills gives users another place to package reusable AI workflows. Genspark Skills is a skills hub for reusable instructions and workflows inside Genspark; no pricing details.
portaltext fills knowledge gaps while you read by turning links and images into contextual portals for deeper explanations; alaska r.c.h. described it as a way to understand dense blogs immediately without leaving the page. Chrome extension launching soon.
Graph Playground lets you explore DAG expand/collapse layout animations, and jessald showed variants using Dagre plus FLIP motion and d3-force to create smoother “unfold” interactions. Free to try.
Jui-Hui Chung and collaborators built Goedel-Architect, an open-source framework for formal theorem proving in Lean 4. It generates a global dependency-graph blueprint of formally stated definitions and lemmas (optionally seeded by a natural-language proof), which a tool-equipped prover then proves in parallel, with failures feeding back to refine the blueprint. It achieves 99.2% pass@1 on MiniF2F-test and 75.6% on PutnamBench with DeepSeek-V4-Flash (rising to 88.8%, or 597 out of 672, on PutnamBench, 4/6 on IMO 2025, 11/12 on Putnam 2025, and 3/6 on USAMO 2026 at roughly $1.65 per problem), rivaling proprietary systems at a fraction of the cost (paper: arxiv.org/abs/2606.06468).
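To make the blueprint idea concrete, here is a minimal sketch of how a dependency graph of lemmas could be grouped into batches that are safe to attempt in parallel. This is not Goedel-Architect's code: the lemma names, the `proof_waves` helper, and the wave-by-wave scheduling are assumptions for illustration, and the failure-feedback loop is not modeled.

```python
from graphlib import TopologicalSorter

def proof_waves(blueprint):
    """Group blueprint nodes into waves; each wave depends only on earlier waves.

    `blueprint` maps a (hypothetical) lemma name to the names it depends on.
    """
    sorter = TopologicalSorter(blueprint)
    sorter.prepare()  # raises graphlib.CycleError if the blueprint is circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)  # in a real system: after the prover finishes this wave
    return waves

# Hypothetical blueprint: two lemmas build on one definition, and the theorem needs both.
blueprint = {
    "def_seq": [],
    "lemma_bound": ["def_seq"],
    "lemma_monotone": ["def_seq"],
    "main_theorem": ["lemma_bound", "lemma_monotone"],
}
waves = proof_waves(blueprint)
for i, wave in enumerate(waves):
    print(i, wave)
```

Here the two lemmas land in the same wave, so a prover could work on them at the same time. A system that lets each lemma assume its dependencies as stated could go further and attempt every node at once.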
MagicPath Builder costs $10/month and gives you unlimited external-agent calls plus a cloud multiplayer canvas with visual editing, design systems, live interactive links, Figma export, animations, states, and working prototypes when working with Codex, Cursor, or Claude Code. External model usage is billed separately by your provider, and you can create and edit from any agent without opening the app and share live links directly (pricing, external agents, Codex plugin).
After adding thousands of B200 and B300 chips, Together’s Dedicated Model Inference now lets you one-click deploy its Blackwell-optimized inference engine with auto-scaling on frontier open-source models including Nemotron, Minimax, Kimi, DeepSeek, GLM, and Qwen (api.together.ai/endpoints).
Hermes Agent, the self-improving agent from Nous Research, just surpassed VSCode in GitHub stars. It runs a built-in learning loop to autonomously create skills from experience, refine them in use, nudge itself to persist knowledge, search its own past conversations with LLM summarization, and build a deepening model of you across sessions. It installs via a one-liner and runs on a $5 VPS, a GPU cluster, or near-zero-idle-cost serverless infrastructure, with support for a CLI TUI, Telegram/Discord/Slack gateways, cron scheduling, subagent delegation via RPC, and dozens of LLM providers including Nous Portal and OpenRouter (X post, GitHub).

🏛️ AI Policy, Governance & Safety

Meta fixed an AI support-tool bug tied to hacked Instagram accounts. This Week in Security reported attackers tricked Meta's AI chatbot into resetting passwords for Instagram accounts that lacked two-factor authentication, and SecurityWeek said roughly 20,000 accounts may have been hacked through abuse of Meta's AI-powered account recovery support tool.
AI spying tools reportedly rattled Putin's security apparatus. The Irish Times reported Russia shut down a system protecting President Putin after the Iran war exposed how AI can be used to target enemies, underscoring how surveillance and targeting risks are spilling into state security.
ElevenLabs partnered with the UK government on public-service voice AI. ElevenLabs said it signed an MOU with the UK Government's Department for Science, Innovation and Technology to explore voice AI for public services, especially for visually impaired, low-literacy, elderly, and Welsh-language users, while deepening work with the UK AI Security Institute; the company also said it is tripling its London HQ size and doubling its UK team to 200.
The UK laid out an AI Economics Institute prospectus. GOV.UK published the AI Economics Institute prospectus, outlining a government-backed effort to study the economic effects, adoption, productivity, labor, and growth implications of AI.
- More about this: The UK government launched the AI Economics Institute (AIEI), the world’s first government-backed institute of its kind, as a joint Treasury and DSIT organization modeled on AISI and incorporating the Future of Work Unit. Simon Johnson, the Nobel Prize-winning economist, former IMF Chief Economist, and MIT professor, will serve as Chair. The institute has signed a Joint Statement of Collaboration with Anthropic, OpenAI, Google, and Microsoft to build and analyze data on AI adoption, conduct research on impacts to specific sectors, occupations, and worker groups, develop models for economy-wide effects, and work with industry, academia, and government to inform policy (without itself setting policy).

📊 Fundraising & Deals Roundup

Moonshot AI sought a huge new valuation in China's AI race. Bloomberg reported the Kimi developer is seeking up to $2B at a $30B valuation, its third financing in six months, and The Next Web noted that would be a 7x jump from a $4B valuation while DeepSeek eyes $59B and Zhipu sits around $80B.
PhysicsX raised $300M for industrial AI. Bloomberg reported the British startup builds AI models for manufacturing components like jet engines and semiconductors and hit a roughly $2.4B valuation, PitchBook framed the Series C as a growing VC bet on industrial AI, and Tech in Asia reported Temasek led the round.
PointFive raised $60M to help rein in runaway AI costs. The Wall Street Journal reported Accel led a $60M growth investment valuing the software maker at $500M, while Ventureburn said the company will expand its AI efficiency platform, strengthen its product, and accelerate global enterprise growth.
A Security came out of stealth with $37M for AI-native cybersecurity. Fortune reported the startup is building autonomous defenses after frontier AI models exposed thousands of previously unknown vulnerabilities, and A Security said it is an autonomous offensive security and remediation platform founded by Yossi Torati, Omer Gull, and Yuval Itzchakov and backed by Cyberstarts, Lightspeed, Wiz CEO Assaf Rappaport, Cyera CEO Yotam Segev, and top CrowdStrike executives.

💡 Industry Commentary & Analysis

Ed Zitron argued that the AI business model is running out of room. Zitron argued AI progress is already slowing in ways that threaten the industry's survival because planned data-center spending and lab compute commitments would require OpenAI and Anthropic alone to reach roughly $358B in 2029 revenue, up from about $60B projected for 2026, while token-based billing makes costs unpredictable, ROI remains hard to measure, and broad productivity gains have not shown up at scale; the HN discussion pushed on whether there is still a middle ground where individual capability gains produce major discoveries.
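For a sense of scale, the revenue figures quoted above can be turned into an implied annual growth rate. This is a back-of-envelope sketch using only the two numbers in the item ($60B for 2026, $358B for 2029); the helper name `implied_cagr` and the assumption of smooth compounding over three years are ours, not Zitron's.

```python
def implied_cagr(start, end, years):
    """Compound annual growth rate needed to get from `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1

# Figures from the item above, in billions of dollars.
growth = implied_cagr(start=60, end=358, years=2029 - 2026)
print(f"Required growth: {growth:.0%} per year")
```

That works out to combined revenue growing by roughly 81% a year, three years in a row.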
Niklas Göke wrote a case for making peace with the dreams you will never live. Göke argued that accepting some aspirations as enjoyable fantasies (his example is never becoming a snowboarder because his knees cannot take it) can free people to invest in the deliberate, achievable work and choices available in the one life they actually have.
Deedy Das questioned the quality of Meta AI's growth. Das pointed out that Meta AI has grown 2.5x in the last two months and is on pace to become the No. 3 consumer AI app, but argued the growth is likely inorganic because Meta AI has the worst retention among major players, with only 4.5% of users still active after 30 days.
Patrick Collison sketched the missing pieces in LLM workflow tooling. Collison argued builders still need better ways to manage input files and general-purpose context, collaborate in real time with snapshots or version control, store inference workflows and prompts, use general-purpose coding agents beyond chat, and share compiled outputs or artifacts, describing the vibe as “GNU Autotools x Notion.”
Dwarkesh Patel argued that data is the real engine of frontier progress. Patel wrote that AI capabilities contain a “sample efficiency black hole” because models see a volume of data far beyond a human lifetime, making data the main driver of progress and explaining why open-source projects and previous laggards can distill public APIs and catch up quickly; his X post framed architectures and hyperparameters as secondary to replicable data.
Shuangfei Zhai revisited Hinton’s “dark knowledge” for modern distillation debates. Zhai highlighted Geoffrey Hinton’s Dark Knowledge paper, where soft teacher probabilities encode inter-class similarity that hard labels miss, helping students generalize to unseen classes and offering a lens on on-policy versus off-policy distillation today.
Leopold Aschenbrenner’s investor mystique got a Wall Street profile. The Wall Street Journal profiled the 24-year-old AI researcher and investor, noting his Jane Street backing, online cult following, and the way fans dissect his every move.
SpaceX shared Elon Musk’s technical update on the company’s vertical integration capabilities to manufacture, launch, and operate AI satellites at scale. The update highlighted satellite designs simpler than Starlink’s that use solar cells, laser links, and radiators to harness near-constant solar power with minimal operating and maintenance costs, enabling potential constellations of a million orbital data centers as an early step toward a Kardashev II civilization. It estimated that space-based AI compute could become the lowest-cost option within 2–3 years (X post, spacexipo.com).
Gary Marcus attacked the math behind AI IPO fever. Marcus argued that the industry is being propped up by “math that is insane,” pushing back on Cassandra’s summary of Jensen Huang’s idea that SpaceX, OpenAI, and Anthropic IPOs could echo early Amazon, Google, and Meta bets as SpaceX reportedly heads toward a June 12 debut near a possible $2T valuation. Marcus also cited Dragon Field’s math that an Amazon-style 2,538x SpaceX return would imply a $4,442T market cap, about 36x current world GDP, and Jessica Wachter and Jonathan Wachter’s NBER paper, which says the five largest U.S. tech firms spent $380B on capex in 2025, are forecast to roughly double that in 2026, and need an AI-sector productivity boom of about 2.7x to justify the investment; the paper’s scenarios imply 5-58 percentage points of additional cumulative GDP growth by 2030, AI reaching 8-39% of the economy, and higher rates / equity premiums under substantial risk. He tied that to the Is AI Profitable Yet? burn counter as a blunt reminder that the spending boom is visible in investment data before productivity data.
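The Dragon Field figures can be sanity-checked by working backward from the three quoted numbers. This sketch uses only values from the item above; the implied starting valuation and implied world GDP are inferred here by division and are not stated in the source.

```python
# Quoted figures (trillions of dollars where applicable).
return_multiple = 2538      # "Amazon-style" return multiple
implied_market_cap_t = 4442  # resulting market cap, in $T
gdp_multiple = 36            # "about 36x current world GDP"

# Back out the assumptions the quoted math appears to rest on.
implied_start_valuation_t = implied_market_cap_t / return_multiple
implied_world_gdp_t = implied_market_cap_t / gdp_multiple

print(f"Implied starting valuation: ${implied_start_valuation_t:.2f}T")
print(f"Implied world GDP: ${implied_world_gdp_t:.0f}T")
```

The numbers appear to assume a starting valuation of about $1.75T, a little under the "possible $2T" mentioned above, and a world GDP of roughly $123T.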
A rumored Anthropic checkpoint fueled model-watch speculation. Chris posted that “Claude Fable 5 (mythos)” had reached a final checkpoint and said a release could come soon based on limited inside info, with replies debating whether “Fable” is a checkpoint name, a dig at OpenAI naming, or a sign of a broader public release.
Deedy Das analyzed “Hell Grind,” an entirely AI-made feature film produced by 15 people in 14 days for about $500K, arguing the technical consistency is impressive but the production bottleneck has shifted from generation to storytelling, directing, and taste.
CJ Zafir shared one agent-workflow note, then followed it with another related post, adding to the day’s recurring theme that agent usefulness now depends on surrounding process, memory, and verification loops as much as model choice.
Paras Chopra connected the day’s representation-learning conversation to practical memory and model-behavior questions, while Yi Ma pointed readers toward the deeper mathematical framing.

Previous Around the Horn Digests

Catch up on our recent roundups:

Tuesday, June 2, 2026: OpenAI pushed Codex into knowledge work, the White House narrowed AI oversight, Microsoft added Windows agent security, and Axiom verified economics in Lean.
Monday, June 1, 2026: NVIDIA turned the PC into an agent computer, Anthropic filed confidentially for an IPO, MiniMax released M3, and Bernie Sanders proposed public ownership of AI labs.
Weekend, May 29-31, 2026: Kog pushed real-time inference toward 3,000 tokens per second, OpenAI launched Rosalind Biodefense, and Microsoft worked on a Copilot super app.
Thursday, May 28, 2026: Claude Opus 4.8 arrived, Anthropic raised a $65B Series H, IBM committed $10B to quantum, and Amazon killed an AI usage leaderboard.
Wednesday, May 27, 2026: Robinhood gave agents brokerage access, AxiomProver moved verified math into papers, OpenAI and Thrive built tax agents, and Google launched AI Threat Defense.
Tuesday, May 26, 2026: China curbed private-sector AI talent travel, Qualcomm struck a ByteDance chip deal, OpenRouter raised $113M, and xAI finished Grok V9-Medium.
Thursday, May 21, 2026: OpenAI said a reasoning model disproved the 80-year Erdos unit distance conjecture, Spotify and UMG licensed AI fan remixes, and Waymo paused service.
Tuesday, May 19, 2026: Google I/O pushed Gemini agents across Search, Android, Workspace, YouTube, and shopping while Anthropic hardened Managed Agents.
Monday, May 18, 2026: Microsoft open-sourced ECHO, Odyssey launched real-time AI simulators, and OpenAI added bank connections to ChatGPT.
Wednesday-Thursday, May 13-14, 2026: Nvidia H200 sales cleared but stalled, Americans opposed AI data centers, and Meta planned layoffs.
Tuesday, May 12, 2026: Anthropic refused China access to its newest model, Isomorphic raised $2.1B, and Google pushed Gemini deeper into Android.
Monday, May 11, 2026: Cerebras upsized its $4.8B IPO, Cowboy Space raised $275M for orbital data centers, and Google confirmed an AI-found zero-day.
Weekend, May 9-10, 2026: The Trump administration drafted an AI security order, Apple and Intel reached a chip-making agreement, and Cerebras' IPO heated up.

That's a Wrap

That is 220+ stories, tools, research drops, and stray AI-market weather systems from Monday alone. If you made it to the bottom, you now know more about Apple’s Siri reboot, OpenAI’s pre-IPO paperwork, and the phrase “N-day exploits” than at least one person currently forwarding a screenshot in Slack. Please use this power responsibly.

For the daily version, make sure you are subscribed to The Neuron. We send six issues a week, and yes, we read all of this so you do not have to.

See you tomorrow.

P.S. Know someone who would find this useful? Forward this to them and tell them to subscribe here.
