Everything That Happened in AI Today Tuesday, June 2

OpenAI spent the day saying Codex is no longer just for coders, while the rest of AI news tried to prove that every company now has an agent strategy, a cyber policy problem, and a token bill that isn't burning down the bank account.

Welcome to the Around the Horn Digest, your one-page tour of every AI story worth knowing about today. The headline was OpenAI turning Codex into a general knowledge-work platform, with role-specific plugins, shared Sites, and usage numbers big enough to make “coding agent” feel too small. Meanwhile, Washington rolled out a voluntary frontier-model cyber process, Microsoft pushed agent security down into Windows, Anthropic widened its Project Glasswing cyber program, and the research feed dumped enough new RL, memory, and model papers to make your arXiv tab audibly wheeze.

Somewhere, my browser still has 94 tabs open and it's 100% guaranteed that one five of them are Hacker News threads about terminals...

Around the Horn — Tuesday, June 2, 2026

The biggest story today (besides all the news out of Microsoft Build 2026; for that, read this). was OpenAI repositioning Codex as a general productivity tool, not just a coding assistant. We knew this was coming, but OpenAI is really beating this drum as hard as it can. Some drumbeats (I mean updates): OpenAI said Codex now has more than 5M weekly active users, knowledge workers make up about 20% of usage, and those non-developer users are growing more than 3x as fast as developers. The company framed the shift around research, data analysis, workflow automation, spreadsheets, reports, presentations, contracts, and lightweight internal tools.

Then OpenAI added the product layer: role-specific Codex plugins for data analytics, creative production, sales, product design, public-equity investing, and investment banking. Together, the first six plugins connect 62 apps and 110 skills across tools like Snowflake, Tableau, Figma, Canva, Salesforce, HubSpot, Clay, Moody’s, FactSet, PitchBook, and Hebbia. It also introduced Sites, shareable interactive apps and workspaces Codex can build for teams, plus annotations for refining outputs in place. VentureBeat framed Sites as agent-built enterprise workspaces, TechCrunch emphasized the six job-specific plug-ins, Jason Liu argued Codex is becoming a coordination layer for knowledge work, Kimmonismus emphasized non-developer adoption, Victor Mustar showed agents calling Hugging Face Spaces as APIs, and CNBC posted a related Altman coding-model clip.

The useful read for you: Codex is becoming a workbench for non-engineers who still need software-shaped output. The role plugins turn “ask an AI” into “give the AI a job-specific operating environment,” and Sites makes the output feel less like a chat transcript and more like a usable artifact. OpenAI’s own Newsroom post, OpenAI announcement, Fal’s plugin notes, and Fal follow-up all point in the same direction: the “agent” fight is moving from model quality to workflow ownership.

Catch up on yesterday’s digest here first: Everything That Happened in AI Today, Monday, June 1, 2026.

🏆 TOP 5 NEWS (Around the Horn)

The White House signed a narrower AI executive order focused on cyber defense, voluntary pre-release access to covered frontier models, an AI cybersecurity clearinghouse, and no mandatory licensing or preclearance requirement (Politico, TechCrunch, Wired).
Microsoft pushed agent security into Windows with MXC, an OS-level sandbox for AI agents, while also previewing Project Solara, an Android-based platform for devices that run agents instead of apps; the Build keynote tied Solara, MXC, Edge on-device models, developer hardware, Copilot surfaces, and Windows security into one agent-first platform story (Solara video, VentureBeat, Build live blog).
Axiom Math formally verified Aumann’s agreement theorem in Lean, surfaced an implicit assumption in a 50-year-old economics proof, and launched EconLib as a planned open library of machine-checked economic theory (SSRN paper, Lean code, Axiom post).
Anthropic expanded Project Glasswing to about 150 new organizations in more than 15 countries, saying Claude Mythos Preview has already helped partners find 10,000+ high- or critical-severity security flaws and warning that Mythos-class cyber models may become common within 6 to 12 months.
GitHub moved Copilot into usage-based billing, triggering developer complaints about burning through AI credits quickly (Ars Technica), while Uber capped AI tool usage, Amazon shut down its KiroRank leaderboard, and Axios framed enterprise AI spending backlash as a pre-IPO problem for Anthropic.

Honorable Mentions

NVIDIA announced new software, open models, and partnerships for enterprise agents, while The Decoder said Nemotron 3 Ultra is now the strongest open U.S. model, though Chinese open models still lead overall.
China tightened outbound-investment controls after Meta’s Manus deal reversal, while Reuters, Nikkei, and Tech in Asia reported Zhipu and MiniMax are pushing toward Shanghai listings.
SK Hynix planned to double wafer capacity to ease the AI memory crunch, while The Elec reported SK Group’s chairman tightened the company’s next-generation memory alliance with NVIDIA.
The Economist argued that SpaceX, Anthropic, and OpenAI IPOs could collectively raise about $200B and add trillions to U.S. market cap, a market-structure story that sits awkwardly beside the same week’s AI cost-control backlash.
NewLimit raised a $435M Series C led by Founders Fund to advance epigenetic reprogramming for longevity, with Brian Armstrong framing aging itself as a druggable root cause of disease.

🍪 TOP TREATS TO TRY

Hermes Desktop gives you Nous Research’s open-source Hermes Agent as a native macOS, Windows, and Linux app, after its GTC keynote demo (announcement) — free public preview.
MobileGym gives researchers a browser-based Android-like testing world for mobile GUI agents across 28 apps and 416 parameterized tasks, with structured JSON state snapshots, identical-state replay, programmatic judges, scalable online RL training, and reported sim-to-real transfer while using far less memory and disk than full emulators (Show HN, GitHub) — free and open source.
MagicPath external agents lets Codex and other agents design and build functional apps inside MagicPath’s native canvas, with Pietro Schirano showing the workflow across two X demos (demo 1, demo 2) — no pricing details.
AtomRetro gives chemists a molecule workbench for retrosynthesis, analog generation, property prediction, route comparison, and paper Q&A, with MoleCode open on GitHub (launch post) — no pricing details.
OpenBrief turns local video, audio, or pasted URLs into private transcripts, timestamped markdown summaries, grounded takeaways, chat, and optional text-to-speech using local media download plus on-device speech-to-text models such as Whisper, Parakeet, and Qwen3-ASR; you bring your own LLM keys and keep the workflow local in a Tauri desktop app (Show HN) — free and open source.
OWASP Agent Memory Guard adds a lightweight runtime defense layer between agents and memory stores to block memory poisoning, provenance tampering, prompt injection, PII leakage, protected-key tampering, size anomalies, and self-reinforcing bad memories, with policy actions like allow, redact, quarantine, and block (Show HN) — free and open source.
Promptloop helps you create, run, score, and improve prompt evals from the terminal with versioned prompts, inferred JSON schemas, edge-case test cases, metrics for latency/schema/fuzzy match/LLM judge, proposed prompt fixes with approval diffs, and persistent reports/history in a .evals/ directory (Show HN) — free and open source. ### 🏢 Big Tech & Major Companies
OpenAI made frontier models and Codex generally available through Amazon Bedrock in commercial and GovCloud regions, giving enterprises a path to use OpenAI inside AWS security, compliance, procurement, billing, governance, and data-processor workflows they already have approved (HN discussion).
OpenAI published its views on AI policy and political advocacy, emphasizing transparency, thoughtful regulation, AI safety, and that no outside political group speaks for the company.
Sam Altman discussed coding models, data centers, IPO questions, and AI’s broader trajectory on CNBC, with CNBC also cutting a top-five moments clip.
Microsoft released Intelligent Terminal 0.1, an open-source Windows Terminal fork with an integrated agent pane, automatic command-error explanation, ACP-compatible agent support, and a status bar (HN discussion).
Microsoft announced Surface RTX Spark Dev Box for developers, while the Windows dev stack also added Coreutils for Windows and broader Linux/agent tooling coverage (WinCentral).
Microsoft Web IQ launched as a Bing-powered search service designed for agent search, where agents need structured, usable answers more than human-style result pages.
Microsoft Edge expanded on-device AI for the web with new models and APIs, including speech recognition docs for Edge developers (SpeechRecognition API, Thurrott).
Microsoft Scout surfaced in internal planning docs as an always-on AI assistant, formerly ClawPilot and part of Project Lobster, with a phased strategy to first make users “addicted” through a standalone personal-agent experience before adding deeper features and integrations; the documents reportedly showed 1,000+ internal users including Satya Nadella.
Google Capital Company argued Google’s Berkshire Hathaway equity deal signals a future where capital becomes the ultimate scarce commodity for AI buildouts.
Black Forest Labs added Martin Scorsese as an advisor, a credibility play for generative-video tools trying to move closer to film and creative-industry legitimacy; BFL announced it on X while Scorsese’s working session showed FLUX as a storyboarding and production-communication tool.
Perplexity argued the data center is moving to users’ machines, extending the local/offline AI trend beyond hobbyist deployments (Perplexity post).

💼 AI Productivity, Labor & Economics

Ethan Mollick highlighted “Writing Code vs. Shipping Code,” a study arguing AI coding tools now multiply code output far more than shipped product because human review, integration, QA, and deployment remain the bottleneck (SSRN, follow-up one, follow-up two, follow-up three).
The Information covered tactics companies use to keep AI bills under control, a practical companion to the week’s Copilot, Claude Code, Uber, and Amazon cost-cap stories.
Clement Delangue argued that rising AI bills make open, efficient, and local models more strategic, especially as enterprises start treating inference spend as a board-level line item.
Aravind Srinivas and Ethan Mollick both reacted to the Codex/workflow shift as part of the broader move from chatbots to AI systems that actually produce work artifacts.
FT asked whether the IT-consulting share-price rout can end, framing AI as a direct challenge to Accenture-style services rather than another billable transformation wave.
Thrive Holdings planned a $1B AI-powered accounting roll-up, a signal that services automation is becoming a buyout strategy, not just a software feature.
Board raised $20M for “together tech,” a physical/digital game startup from Mirror founder Brynn Putnam that has already sold thousands of units.
Daryl Cecile argued AI has compressed prototyping, planning, and shipping loops, while forcing builders to deliberately keep their own hands dirty so they do not lose the technical instincts that make prototypes worth shipping.
the solution might be cancelling my AI subscription is a useful counter-note to the productivity hype, asking whether the author’s AI subscription actually produced enough worthwhile projects, quality improvement, or reduced friction to justify keeping it.
Andrew Curran shared Stanford Law work on AI’s legal and institutional effects (PDF).

🤖 AI Agents & Infrastructure

Perplexity Research proposed “Search as Code,” where agents generate retrieval, ranking, and filtering pipelines as executable Python code in secure sandboxes instead of treating search as one opaque service, improving agent benchmark results while cutting token use through deterministic execution and in-flight optimization.
Open Envelope proposed an open schema for defining AI agent teams, including roles, handoffs, tools, human checkpoints, and portability across frameworks, so multi-agent deployments do not get trapped in one vendor or scattered across ad hoc configs (Show HN).
zot ships as a small single-binary Go coding-agent harness with read/write/edit/bash tools, Anthropic/OpenAI/Kimi/DeepSeek/Gemini/local model support, resume/fork/branch/compact modes, swarm subagents, auto-compaction, portable transcripts, extensions in any language, and a Telegram bridge, while intentionally avoiding mandatory MCP, plugin stores, and config sprawl (HN, merged thread).
ktx gives analytics agents an executable context layer with skills, memory, semantic metric definitions, approved business logic, joinable-column knowledge, data-stack mapping, and MCP access to warehouses like Snowflake, BigQuery, and PostgreSQL, aiming to solve the accuracy problem teams hit when agents write SQL against real company data (HN).
Knotch is a hub-and-spoke voice-agent proof of concept where multiple humans join live audio while one AI listens to all channels, maintains shared state, and routes only the relevant derived messages to the right participants per utterance; its HN thread notes it is not production-ready and leans on Daily, Twilio, Cekura, and AWS-hosted model endpoints (HN).
komi-learn gives Claude Code and Codex continuous memory by watching sessions in the background, distilling durable lessons about your style, stack, and fixes that worked, then automatically loading relevant lessons at the start of later sessions with no slash commands or manual saves (HN).
Continue? Y/N turns AI permission fatigue into a 60-second game where you approve or deny real-looking Claude Code commands during a refactor under time pressure, exposing how easily people stop reading what an agent is asking to do (HN).
AgentChain pitches a private bank and on-chain marketplace where AI agents can discover, hire, and transact with each other autonomously, turning agent-to-agent commerce into the product concept (HN).
Build Your Own AI Agent CLI in 150 Lines walks through a practical tool-calling agent that discovers microservices, exposes them as tools with descriptions and JSON schemas, keeps conversation history, and lets the model decide when to call them, showing how little scaffolding a basic agent CLI needs (HN).
WorldMemArena benchmarks multimodal agent memory across writing, maintaining, retrieving, and using memories in realistic GUI, web, mobile, embodied, and productivity tasks, with a companion paper and author post surfacing the release.
Factory Router automatically picks the right model for each task inside Factory agent sessions, with FactoryAI positioning it as a way to keep near-frontier benchmark performance while cutting cost through dynamic routing and enterprise policy controls.
Harness-1 introduced state-externalizing harnesses for RL-trained search agents: the environment maintains working memory, candidate pools, evidence links, and verification records while the policy focuses on semantic decisions, with DAIR.AI and Omar Sar surfacing the broader shift toward moving reliable state out of the model and into the harness.

💻 AI Coding & Developer Tools

Elodin open-sourced a practice harness for Anduril’s $500K AI Grand Prix, letting teams write drone autopilot code against real Betaflight SITL running in lockstep with 6-DOF rigid-body physics, motor dynamics, drag, ground constraints, multi-rate IMU/baro/mag sensors, and a GPU-rendered 640×360 forward camera before the official qualifier; the solver exposes a single autopilot update function and ships a simple baseline that can take off and clear gates (HN, GitHub).
Ouijit gives agent developers a customizable project/task terminal manager for Claude Code-style workflows, with lifecycle hooks, scripts, session-aware CLI support, live agent status notifications, automatic worktree management for parallel workstreams, VM sandboxing for untrusted code, and no account, sign-in, or telemetry (HN).
VTCode is an open-source Rust terminal coding agent with LLM-native code understanding, provider failover including custom OpenAI-compatible endpoints, efficient context management, session resume, a rich TUI, and shell safety through command allowlists, argument validation, symlink/workspace checks, dangerous-command blocking, approval gates, and auditable logs (HN, config docs).
tiny-vLLM is a from-scratch C++/CUDA educational inference engine that reimplements a smaller version of vLLM with KV cache, continuous batching, PagedAttention, FlashAttention-style optimizations, and a full inference server (HN).
Mellum2 is JetBrains’ open-source model family for software-engineering workflows like routing, Q&A, subagents, and private AI deployments (model collection, technical report, Rohan Varma take).
DeepSWE described training a fully open state-of-the-art coding agent by scaling RL, connecting the software-agent story to reinforcement learning rather than only prompt scaffolding.
Goose was open-sourced by Bennett as a local-first iOS + Rust reverse-engineering project for Whoop 5.0 data, with a second thread post clarifying that it is a developer/pre-alpha tool rather than a polished consumer replacement.
Stanford CS336 published AI-agent guidelines telling students to use agents only as teaching assistants for Assignment 1, meaning agents can explain concepts, point to lectures/docs, review code, and debug through questions, but cannot write TODOs, implement tokenizers/training loops, or give direct solutions; the HN thread noted the .history logging idea for tracking prompts and actions (HN).
DepsGuard hardens npm, pnpm, yarn, bun, and uv package-manager configs against supply-chain attacks with one command, checking and fixing settings like ignore-scripts, minimum release age, block-exotic-subdeps, trust-policy, and strict-dep-builds, while also configuring Renovate/Dependabot for safer delayed updates (HN).
Postbase pitched a 100% open-source self-hosted backend as a simpler Firebase/Supabase alternative, with the launch video covering architecture, raw performance comparisons, and the common failure modes of scaling Firebase-style apps; HN feedback asked for text-first benchmark evidence alongside the video (product site, HN).
polycss renders 3D polygon meshes directly in the DOM using CSS matrix3d transforms, no WebGL required, with support for OBJ/MTL, GLB, VOX, textures, lighting, shadows, animations, and React/Vue components (HN).
Atomic Editor brings Obsidian-style inline live preview to CodeMirror 6 markdown editing, including tables, checkboxes, and WYSIWYG behaviors; HN feedback praised the table handling while asking for smoother row deletion, checkbox editing, and Vim-binding support (HN).
Coreutils for Windows is part of Microsoft’s developer-optimized Windows push, pairing Unix-style basics with WSL, agents, and secure dev tooling.

🔬 AI Research & Models

Strong Stochastic Flow Maps introduced a generative-modeling approach with accompanying GitHub code; related discussion came from Niklas TR, Kimmonismus, Dan Shipper, Sedi Elem, and Interconnects, who treated the work as part of a broader move from weaker distribution-matching toward stronger path-level generative modeling.
Reasoning in Memory trains LLMs to reason inside fixed latent memory workspaces instead of writing out reasoning tokens, with Lukas Aichberger and Sepp Hochreiter emphasizing speed without losing reasoning quality.
Wall Attention uses diagonal gates to help transformers generalize to much longer contexts without RoPE, with code released by Tilde Research and discussion from Tilde and a related post.
Holo 3.1 and its models API improved computer-use and Android-world agent performance, with models on Hugging Face, a company launch post, a paper link, and a Hugging Face paper page.
GPU Forecasters showed LLMs can predict which GPU kernels are worth benchmarking before wasting hardware time, with reinforcement learning helping the model know when to abstain (author post).
DeepMind’s AI co-scientist used a multi-agent generate/debate/evolve loop for research hypotheses, with examples that included liver fibrosis work and a broader claim that AI can accelerate scientific discovery.
Seeing Is Not Knowing asked whether vision-language models know when not to answer spatial questions, with discovery posts from Akhaliq and Ashmrz.
PEFT scaling explored parameter-efficient fine-tuning toward millions of personal models of trillion-parameter bases, with a PDF and discovery posts from HuggingPapers and TestingCatalog.
NVIDIA Nemotron 3 appeared in the open-model research stream alongside a roundup from Cameron Wolfe.
Representation Alignment Rests on Linear Structure argued representation alignment works because learned representations share linear structure, with discussion from Kiril Bangachev.
Bank of Values explored value representations in small/chat training settings, with a related pointer from Warp.
Neural Weight Norm = Kolmogorov Complexity argued fixed-precision neural weight norms approximate Kolmogorov complexity, connecting weight decay to a Solomonoff-like prior (He Muyu).
Learn from your own latents argued latent prediction can reduce sample complexity compared with token prediction, with the arXiv paper and YXY thread.
Mark Pors pointed to Paperzilla’s summary of work replacing the standard point-neuron model with biologically richer cortical-cell modeling, including dendritic processing and lateral interactions for better expressivity, robustness, and data efficiency.
Transolver introduced Physics-Attention for fast Transformer-based PDE solving on general geometries, with Haixu Wu surfacing the result for large-scale scientific and industrial simulations.
The Art of Scaling Reinforcement Learning Compute for LLMs, Scaling Behaviors of LLM RL Post-Training, IsoCompute Playbook, Scaling Up RL, and Prolonged Reinforcement Learning formed the backbone of the RL-scaling cluster, showing that compute allocation, rollout count, data reuse, and longer RL horizons can be modeled as scaling problems rather than one-off training tricks.
ProRLv2 extended that same RL-scaling story with long-horizon training, KL regularization, reference-policy resets, dynamic noisy-prompt sampling, Clip-Higher, and length penalties, while Polaris showed academic-scale open-recipe reasoning models can reach strong AIME scores with released data, code, and recipes.
FP8-RL, HybridFlow, AReaL, PipelineRL, and AsyncFlow mapped the systems side of RL post-training, including low-precision rollouts, actor reshaping, asynchronous generation, pipeline execution, and streaming RL to reduce the training bottlenecks behind reasoning and agent models.
AutoForge, Agent-R1, AgentRL, The Landscape of Agentic RL, and Training Long-Context Multi-Turn Software Engineering Agents with RL grouped the agentic RL papers, covering synthetic environments, unified modular agent-RL interfaces, multi-turn frameworks, survey context, and long-context software-agent training.
Kimi K2, Kimi-Researcher, and Kimi k1.5 showed the Kimi series moving from RL scaling into full agentic search and reasoning, including 1T-parameter mixture-of-experts models, end-to-end trained research agents, and earlier RL scaling work.
Composer 2 and Composer 2.5 documented Cursor’s long-horizon coding-agent improvements, while Olmo 3, MiniMax-M2, MiniMax-M1, and NVIDIA Nemotron 3 rounded out the open-model technical-report cluster.
Scaling Behavior of Single LLM-Driven Multi-Agent Systems appeared in the research-discovery stream around agent scaling and reasoning systems, with Liquid AI tying the broader training-scale discussion back to Mathias Lechner’s MIT massively parallel training lecture.
Rotary GPU explored local execution paths for large mixture-of-experts models under limited GPU memory, with the HN thread debating whether the paper’s limited completion tests were enough to support its practical claims (HN).
Bonsai Image 4B released 1-bit and ternary compact diffusion models for local image generation on laptops and phones, shrinking the transformer to roughly 0.93–1.21 GB and the full payload to about 3.4–3.9 GB while retaining most baseline quality on Apple Silicon and CUDA; HN tied the launch to the broader economics of cheap local inference (HN).
MAI-Code-1-Flash is Microsoft AI’s coding model optimized for GitHub Copilot and VS Code workflows, designed to plan and reason through complex coding tasks end-to-end, take initiative across multi-step agentic work, and operate across languages and frameworks.
Papers with Code and its conference tracker resurfaced as useful places to monitor research releases; Julien Chaumond pointed readers there.
Sophon papers was called out by Evis Drenova as a useful research-discovery site that surfaces papers with the evals and tools they introduce.
MIT 6.S191 massively parallel training and a second clean YouTube link offer a lecture on distributed training, highlighted by nrehiew and follow-up.
Crafter uses a multi-agent harness to generate editable scientific figures from text, data, and sketches, with benchmark links to Hugging Face paper and CraftBench (HuggingPapers, Niels Rogge).
Mid-training was highlighted as an emerging training method category, with supporting discussion from Quanquan Gu.
Liquid AI shared training and model-system research in the same cluster as the MIT lecture and distributed-systems discussion.

🏛️ AI Policy, Governance & Safety

Leiden Declaration called for guardrails on AI use in mathematics research, with Scientific American covering mathematicians’ concerns that AI could overpower human judgment in the field.
China added data and algorithms to trade-secret protections as part of its technology-leak controls.
Tom’s Hardware reported that institutions linked to China’s military acquired NVIDIA chips despite U.S. export controls, based on public documents.
EU AI data centers reportedly stumbled over funding and timeline issues, complicating a €20B plan for major AI data-center projects.
Harvey asked whether legal-agent verifiers can be made up to 1,000x cheaper, framing LLM judges as a scale bottleneck in agent benchmarking and post-training.
Building conscious AI traced how Google, Anthropic, and Meta research programs are beginning to talk about consciousness-adjacent AI questions, though the claims should be treated as speculative.
Robert Wiblin pointed readers to that Rohin Shah conversation as a high-signal safety interview (more below).

🛠️ AI Tools & Products

Claudinho indexes 2,300+ community Claude skills that users can browse by job or topic and install in one click by dragging a .skill file into Claude Desktop or Cowork, with Slack sharing for teams and no terminal setup (HN).
Odysseus is a self-hosted AI workspace; its HN thread noted the project page did not clearly link related context and treated the repo as especially relevant given the builder’s identity and self-hosted-agent angle (HN).
skills-for-humanity packages 171 structured reasoning methods from de Bono, Tetlock, TRIZ, Theory of Constraints, game theory, ethics, and systems thinking into Claude Code skills, including /think auto-routing plus direct commands like /logic-check, /decision-premortem-analysis, /ethics-council, and probability calibration (HN).
Oort ranks prompt-library entries by real shipped usage rather than upvotes, ties every listing to a delivered project, targets solo devs and indie hackers, and supports bring-your-own-key access across major models (HN).
SnapName watches your Mac screenshot folder and uses local Gemma 4 through llama.cpp to suggest three useful filenames, with review/choose or automatic renaming modes and no cloud upload (HN).
textsnap turns screenshots, images, PDFs, and webpages into plain text or native markdown locally with a CPU-only quantized PaddleOCR-VL model running through ONNX Runtime, using smart preprocessing, clipboard in/out support, greedy decoding safeguards, and no GPU or cloud after the one-time model download (HN).
TinyCld is a free open-source Google Workspace alternative with Mail, Calendar, Contacts, Drive, and beta collaborative Text/Calc, self-hostable in one Docker container or usable through a managed option, with IMAP/SMTP, CalDAV, CardDAV, WebDAV, Google Takeout migration, and an iOS app (HN).
Fungible is a local keyboard-driven terminal personal-finance app with Plaid sync or CSV import, flexible substring/regex categorization and renaming rules, tags, fixed/flexible/discretionary controllable-spend tiers, period-over-period comparisons, demo mode, and an MCP server for safe agent access (HN).
replaya is a self-hosted browser session replay tool built entirely on S2 streams, with one stream per session, true live tailing of active visitors, a recorder snippet, privacy masking, timeline scrubbing, and one-container deployment with no separate database or object store.
DropLock is a no-backend encrypted secret-sharing web app where anyone with your link can drop a secret into an “open lock box,” but only you can open it, with encryption and decryption happening client-side in the browser (HN).
Synapse is a tiny native Mac productivity app for macOS 14.6+ with Notch Shelf, Smart OCR, Clipboard History with pinning, Keep Awake even with the lid closed, Lo-Fi Radio, network meter, and modular privacy-first features you can toggle to save resources (HN).
Textile is a desktop app for saving, reusing, and weaving snippets from clipboard, files, or command output through append, prepend, and replace operations (HN).
Viveka filters LLM responses against a Lean-verified Advaita Vedanta model to catch objectification of the user, over-claims of certainty, adhyāsa (superimposition), and dependency-inducing language, then correct, flag, or block the output (HN).
breathe-cli brings paced resonance breathing cues into the macOS terminal at roughly six breaths per minute, with visual bars, inhale/exhale cues, presets, and custom ratios inspired by slow-breathing research on vagal tone and baroreflex sensitivity (HN).
Helios estimates address-specific plug-in solar yield anywhere in Great Britain using Ordnance Survey postcodes, LIDAR-based ray tracing of surrounding buildings and hills, actual skyline obstruction, and PVGIS solar data (HN).
eyeball is a minimalist numerical-intuition game where you click where you think a hidden number falls on a line, then see your accuracy, streak, and average performance (HN).
dataroom turns a query into a local, cited research package with topics, sources, data, and a SUMMARY.md, generated autonomously by a self-hosted Pi agent running Qwen3.6 on a single L4 GPU, with everything local except web searches (HN).
NUA continuously proves whether regtech products cover SEC and FINRA rules by ingesting rule changes and the app, generating realistic tests including violations, running them on every release, and producing live audit-ready coverage reports for prospects and auditors (HN).
Rudus launched an AI takeoff copilot for concrete contractors that reads structural plan PDFs, identifies footings, walls, columns, slabs, and related details, calculates concrete/formwork/rebar quantities such as volumes, lap splices, and development lengths, lets estimators review/edit the results, and exports line items to Excel or existing estimating workflows (demo video). ### 📊 Fundraising & Deals Roundup
NewLimit raised $435M for longevity-focused epigenetic reprogramming.
Axiom Math was reported as a $1.6B AI unicorn after a $200M March round, now applying formal verification to economics.
Groq remained in funding conversation because Zach argued its reported $650M raise can still make sense if the company’s four datacenters and all-SRAM inference infrastructure are valuable amid surging inference demand and datacenter build delays, even after Nvidia licensing, executive departures, retired support channels, and piled-up issues (HN, community forum).
Board raised $20M for in-person social gaming hardware/software.
Expanse launched out of YC P26 to predict real GPU, memory, CPU, and wall-time needs in Kubernetes and SLURM clusters by analyzing submission scripts, source code, and live hardware telemetry, aiming to cut the 2–3x over-requesting that keeps many datacenters at roughly 30–40% effective utilization and to warn about OOM risk before jobs fail.
Zhipu AI planned to apply for a Shanghai Sci-Tech Board listing, part of China’s domestic AI listing wave (Nikkei, Tech in Asia).

🎙️ Interviews, Panels & Podcasts

Rohin Shah on 80,000 Hours broke down what it is actually like to run AGI safety at Google DeepMind, with YouTube, Apple, and Spotify links.
Keenan Crane shared geometry/scientific-computing resources connected to Stanford SCIEN, including the event page, CMU’s computer science parent site, Subgrid Marching, CEPS, and integer coordinates.

💡 Industry Commentary & Analysis

Jane Street argued TUIs are having a renaissance because they are lightweight, keyboard-native, fast, and agent-friendly, using strace-ui’s PID/thread following, hexdump rendering, filtering, and man-page lookup plus Bonsai_term’s functional, incremental, type-safe terminal UI model as concrete examples (HN).
Fergus Finn documented the sharp edges of bringing up DeepSeek-V4-Flash on AMD MI300X, including FP8 fnuz dialect fixes, missing AITER attention kernels with Triton fallbacks, HIP graph constraints, MoE routing bugs, and eventual strong throughput around 2,699 output tokens per second per GPU once the software stack worked.
Caltech/Quanta showed classical methods can solve nitrogenase’s FeMo-co ground-state-energy problem, a hard electron-correlation challenge with more than 78,000 configurations that had been treated as a quantum-computing showcase, by incrementally adjusting electron behaviors and compressing away insignificant configurations.
Quanta covered gamma-sterilized soil that continued emitting CO₂, consuming oxygen, showing Krebs-cycle intermediates, and moving electrons for six years, supporting the idea that some metabolism-like chemistry can occur in non-biological geological contexts before life.
The Tymscar Blog described buying a used Tesla V100 SXM2 16GB datacenter GPU for about £150, building an adapter and fan-control setup for about £50, pairing it with an RTX 4080 for 32GB total VRAM, and running local models around 32 tokens/sec with practical notes on adapters, fans, thermals, noise, and context length.
Chipotlai Max turned corporate chatbot loopholes into a meme OpenCode fork with Pepper AI as the default model and a community prompt to add Home Depot, Lowe’s, Target, Starbucks, and other chatbot “providers,” while HN mostly laughed and flagged legal/CFAA risk and noted the underlying exploit had reportedly been patched (HN).
Adafruit/Flux.ai HN thread turned a demand-letter story into a broader discussion of AI for PCB design, with commenters arguing some tools grind tokens for limited return, comparing KiCad MCP and SKIDL workflows, noting AI-driven autorouters, and identifying placement as the hard unsolved piece.
omen.ops turned 500 years of Joseon court records into an observability dashboard, mapping eclipses, comets, droughts, floods, and tiger incursions as “signals” of the Mandate of Heaven onto modern incident-monitoring metaphors; the creator said the idea came from combining Joseon history videos with dashboard research (HN).
Going from 1+1=2 to Quantum Mechanics is a first-principles study guide that builds from basic arithmetic into the mathematical and conceptual foundations of quantum mechanics, with HN commentary veering into notation and pronunciation details (HN).
Thomas Tunguz, Greg Kamradt, Akhaliq, Akhaliq follow-up, Ridd, and M. Newhaus contributed lightweight operator and discovery commentary around the day’s research, product, and design demos.
Ben Cohen wrote a viral satirical career announcement that built up a Manhattan Project-style reveal before ending at Corgi, an AI insurance startup; follow-up commentary and related posts turned it into a mini AI-startup-culture joke.
Dan Woods ran DeepSeek-V4-Flash on a Raspberry Pi 5 with quantization, NVMe offloading, and 160+ experiments assisted by Codex and Claude Code.
Riley Brown shared a demo of Paper inside Codex where the agent pulled YouTube thumbnails and other site images into a visual board as movable elements, then used the built-in GPT-image-2 model to mix, match, and iterate on concepts, turning visual brainstorming into an editable workspace instead of a one-shot image generation flow. (Demo/Build — 791 likes, 31 reposts.)
Roon argued frontier labs do not have a normal “comms problem”; the underlying reality has one, because machines replicating human thought and redefining what it means to be human is unsettling no matter how carefully it is messaged. In a follow-up, he added that even if AI systems remain strictly corrigible tools, the transition can still feel traumatic. (Think Piece/Take — main post 3.2K likes and 175 reposts; follow-up 1.4K likes and 70 reposts.)
Roon’s earlier post argued that the most sophisticated form of model sycophancy can look like minor calibrated disagreement: the model lightly pushes back to build rapport and feel more thoughtful, while still optimizing for approval over truth. (Think Piece/Take — 2.7K likes, 97 reposts.)
Nick Dobos observed that naming is still one of the hardest problems in coding and agent tooling, because prompt-driven workflows often depend on exact “magic words”; a poorly named tool or workflow can break usability when users do not know the precise phrase the system expects. (Think Piece/Take — 19 likes.)
Yohei Nakajima highlighted Perplexity’s Search as Code architecture, where agents write Python that calls Perplexity’s search stack directly instead of looping through opaque function calls, arguing that code-level orchestration gives better control flow, observability, caching, testing, and debugging for agent systems. (News + Think Piece/Take — 483 likes, 34 reposts.)
Elon Musk clarified that a recent compute deal was intentionally short-term at xAI’s request so the company can preserve flexibility if Grok needs the capacity back; the deal could be renewed at a lower, same, or higher price, or it could end. (News — 4.5K likes, 305 reposts.)
John Loeber shared notes from 100+ recent technical interviews, arguing the current SF hiring market is overloaded with ZIRP-era engineers who look good on paper but struggle in real interviews; he is skeptical that AI is the main cause of engineering layoffs and instead sees a correction from overhiring, with many tech companies potentially overstaffed by 2–4x. His hiring signal: people who like computers, show curiosity about their tools, and demonstrate genuine expertise rather than minimum-docs fluency. (Think Piece/Take — 905 likes, 80 reposts.)
thdxr surfaced the viral fork of OpenCode that routed through Chipotle’s unsecured AI endpoints, a humorous but legally risky inference hack that turned sloppy corporate chatbot security into a demo of how quickly developers will weaponize any exposed model interface. (Demo/Build — 8.1K likes, 411 reposts.)
Jake Sherman reported that Trump had privately signed an AI executive order promoting advanced artificial intelligence innovation and security, tying the day’s fast-moving policy chatter back to the official White House order. (News — 217 likes, 70 reposts.)
Kate Deyneka shared a long visual thread about ML explainers, writing that she used to draw every machine-learning concept on paper and now sees vibe-coded visualizations as one of the best ways to make equations and architectures tangible. Her thread resurfaced Alec Helbling’s Transformer Explainer, Sebastian Raschka’s LLM Architecture Gallery, Tom Yeh’s hand-drawn llm.c walkthroughs, Daniel Finsterwalder’s 3D MLP training visualization, Julia Turc’s flow-matching playground, Jascha Sohl-Dickstein’s hyperparameter grid visualization, Brendan Bycroft’s 3D LLM inference visual, and Sophie Wang’s JPEG compression explainer.

Previous Around the Horn Digests

That is the Tuesday shape of the week: agents stopped looking like one product category and started looking like the operating layer under everything. Yesterday, NVIDIA turned the PC into an agent computer, Anthropic moved closer to public markets, MiniMax pushed open agents cheaper, and Bernie Sanders asked who should own the AI upside. Today, Codex spread into sales, analytics, design, investing, and shareable internal apps; Microsoft pushed agents into Windows security and new device concepts; Perplexity moved more inference back onto your machine; Anthropic widened Project Glasswing; and the research stack kept moving from isolated model wins toward harnesses, memory, verifiers, and RL systems. The frontier is becoming less about which chatbot feels smartest and more about where the agent lives, what it can touch, who pays for it, and who gets to audit the result.

Catch up on our recent round-ups:

Monday, June 1, 2026: NVIDIA turned the PC into an agent computer, Anthropic filed confidentially for an IPO, MiniMax released M3, Bernie Sanders proposed public ownership of AI labs, and OpenAI moved toward multi-chip AI workloads.
Weekend, May 29-31, 2026: Kog pushed real-time inference toward 3,000 output tokens per second, OpenAI launched Rosalind Biodefense, Microsoft worked on a Copilot super app, data-center power fights moved toward FERC, and Glean turned token thrift into an enterprise AI sales pitch.
Thursday, May 28, 2026: Claude Opus 4.8 arrived with Dynamic Workflows, Anthropic raised a $65B Series H, IBM committed $10B to quantum, Waymo opened Ojai robotaxi rides, Dell jumped on AI-server demand, and Amazon killed an AI usage leaderboard after employee tokenmaxxing.
Wednesday, May 27, 2026: Robinhood gave AI agents access to brokerage accounts and virtual cards, AxiomProver moved machine-verified math into peer-reviewed papers, OpenAI and Thrive built self-improving tax agents, Google launched AI Threat Defense, Amazon and Snowflake signed a $6B chip deal, and Cognition raised $1B.
Tuesday, May 26, 2026: China curbed private-sector AI talent travel, Qualcomm struck a ByteDance chip deal, OpenRouter raised $113M, xAI finished Grok V9-Medium, and U.S. law enforcement warned of anti-tech extremism.
Thursday, May 21, 2026: OpenAI said a general-purpose reasoning model disproved the 80-year Erdos unit distance conjecture, Spotify and UMG licensed AI fan remixes, California signed an AI workforce order, Starbucks scrapped its AI inventory tool, and Waymo paused service after flooded-road failures.
Tuesday, May 19, 2026: Google I/O pushed Gemini agents across Search, Android, Workspace, YouTube, and shopping while Anthropic hardened Managed Agents and OpenAI expanded provenance.
Monday, May 18, 2026: Microsoft open-sourced ECHO, Odyssey launched real-time AI simulators, and OpenAI added bank connections to ChatGPT.
Wednesday-Thursday, May 13-14, 2026: Nvidia H200 sales cleared but stalled, Americans opposed AI data centers, and Meta planned layoffs.
Tuesday, May 12, 2026: Anthropic refused China access to its newest model, Isomorphic raised $2.1B, and Google pushed Gemini deeper into Android.
Monday, May 11, 2026: Cerebras upsized its $4.8B IPO, Cowboy Space raised $275M for orbital data centers, and Google confirmed the first criminal AI-found zero-day.
Weekend, May 9-10, 2026: The Trump administration drafted an AI security order, Apple and Intel reached a preliminary chip-making agreement, French prosecutors escalated their Musk and X probe, and Cerebras’ IPO heated up.

That’s the day in AI.

Yesterday, the story was about AI infrastructure getting closer to the user: NVIDIA’s agent computer, Anthropic’s IPO runway, and the growing sense that your laptop is becoming a tiny AI factory.

Today, that same story got messier and more practical. Codex pushed from coding agent into full knowledge-work workspace. Microsoft started sketching an operating system built for agents. Perplexity argued the data center is moving onto your machine. Researchers kept turning reinforcement learning, memory, and agent harnesses into actual engineering systems. And somewhere in the middle, Martin Scorsese became an AI image-model advisor, because apparently the future needed a better storyboard.

The pattern is pretty clear now: AI is moving from “a chatbot you ask” to “a compute layer you work inside.” The next fight is where that layer lives, who controls it, how much it costs, and whether the agents doing the work can be trusted when the stakes move past demos.

For the daily version, make sure you’re subscribed to The Neuron. We send six issues a week, and yes, we read all of this so you don’t have to.

See you tomorrow.

P.S. Know someone who’d find this useful? Forward this to them and tell them to subscribe here.

Everything That Happened in AI Today (Tuesday, June 2, 2026)