Just when you thought the AI news cycle might take a breath, Sam Altman kicked off the day announcing a major upgrade to ChatGPT: persistent memory across all your past conversations.
Yep, it can now theoretically recall details from chats you had months ago to personalize current interactions.
Sam tweeted he was super excited about today's launch, calling it a step towards “AI systems that get to know you over your life.” The feature rolls out today for Pro users, with Plus users getting it soon (unless you're in Europe, the UK, or a few other spots – awkward).
You can opt out or use temporary chat if the idea freaks you out. Reactions online range from "finally!" to privacy jokes ("please don't remember my daughter's high school philosophy class chats!") and EU user frustration.
This announcement came amidst heavy speculation, fueled by Sam's earlier "can't sleep, launching something exciting" tweet.
Haider (@slow_developer on X) summarized the rumors: were we getting o4-mini, o3, a 1M token context window, Jukebox 2 music generation, or something called "Quasar Alpha"?

See, while Memory is official, the real mystery release appears to be Quasar Alpha. This model quietly materialized on the AI model hub OpenRouter earlier this month.
OpenRouter calls it a “cloaked model” from an unnamed partner lab, provided free (for now) to gather feedback. It's described as a “powerful, all-purpose model supporting long-context tasks, including code generation.”
What makes Quasar stand out? A massive 1-million-token context window, zero cost (during the alpha), and reportedly serious speed. Think 4x faster than Claude 3.7 Sonnet, according to some early tests.
The Hacker News crowd immediately jumped on it, and the reactions are telling:
- It codes well: Popping up near the top ranks on coding benchmarks like Aider Polyglot, holding its own against DeepSeek V3 and Claude 3.5 Sonnet.
- Long context champ? Early tests on benchmarks like NoLiMa show it handling long-context tasks impressively well, potentially better than GPT-4o or Gemini 1.5 Pro at certain lengths.
- Speed Demon: Users consistently report it's noticeably quicker than comparable models.
- Who dis? The metadata signature screams OpenAI, the response style feels very GPT-4-ish, and even some clever bioinformatics-style analysis clusters it right next to known OpenAI models. While some users got it to admit its origins, that's often unreliable. Still, the evidence points heavily towards OpenAI (Sam's hint "quasars are very bright things!" adds fuel to the fire).
- Alpha Glitches: Some testers found it unreliable on complex, multi-step tasks (like iterative prompt optimization) and occasionally struggled with instruction following compared to top models like Gemini 2.5 Pro or Claude 3.7 Sonnet.
One caveat to all that free usage: OpenRouter is logging prompts and responses during the alpha to gather feedback, so don't send it anything sensitive.
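Since OpenRouter exposes its models through an OpenAI-compatible chat completions endpoint, trying Quasar Alpha yourself is a few lines of Python. A minimal sketch, with the caveat that the `openrouter/quasar-alpha` model slug and the free-tier behavior reflect the alpha listing at the time of writing and may change; check openrouter.ai for current details:

```python
# Sketch: querying Quasar Alpha via OpenRouter's OpenAI-compatible API.
# Assumptions: model slug "openrouter/quasar-alpha" (the cloaked alpha
# listing) and an API key in the OPENROUTER_API_KEY env var.
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "openrouter/quasar-alpha"  # assumed slug for the alpha model


def build_request(prompt: str, max_tokens: int = 512) -> urllib.request.Request:
    """Build the HTTP request (no key or network needed at this stage)."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    headers = {
        "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        OPENROUTER_URL, data=json.dumps(payload).encode(), headers=headers
    )


if __name__ == "__main__":
    req = build_request("Summarize the tradeoffs of a 1M-token context window.")
    # Uncomment to actually send (requires a valid OPENROUTER_API_KEY):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["choices"][0]["message"]["content"])
    print(req.full_url)
```

Remember the logging caveat above applies to anything you send during the alpha.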
But here's the twist: Quasar Alpha probably isn't the o3 or o4 model OpenAI recently said are coming “in a couple of weeks,” nor is it the delayed GPT-5. This appears to be something different: a long-context foundation model, possibly testing specific capabilities or architecture outside their main development track. It's an unexpected curveball given OpenAI just revised its roadmap, pushing GPT-5 back “a few months.”
The timing is fascinating because, let's be real, Google has been absolutely crushing it lately. Alberto Romero over at Algorithmic Bridge wasn't kidding when he said Google is winning hard right now. Just look at the recent onslaught from Google Cloud Next '25:
- Gemini 2.5 Pro: Widely considered the top model overall right now based on benchmarks and user vibes.
- Gemini 2.5 Flash: Super fast, cheap, great for mobile/edge applications, coming soon to Vertex AI.
- Vertex AI Powerhouse: Integrating world-class models for video (Veo 2), image (Imagen 3), music (Lyria), and speech (Chirp 3) all in one place.
- Agent Takeover: Doubling down on agents with Deep Research upgrades, Project Astra, Project Mariner, a new Agent Development Kit (ADK), and even an Agent2Agent (A2A) protocol.
- Hardware: Announcing Ironwood, their next-gen TPU (their high-powered AI chips to rival NVIDIA’s) optimized for inference (when you actually ask questions of your AI).
And Microsoft isn't sleeping either. Copilot's recent upgrade added persistent Memory, Actions to book tickets or reservations, Copilot Vision to see the real world, and improved Deep Research. Since Copilot typically (but not always) runs on OpenAI models, maybe that memory feature is powered by OpenAI?
Our Take: The new ChatGPT Memory is a significant personalization play, doubling down on the “AI companion” vision. Quasar Alpha, meanwhile, is an intriguing mystery. A free, fast, 1M context model focused on coding is a strong signal, even in alpha. It shows OpenAI (if it truly is the maker of Quasar) is still pushing boundaries, perhaps testing tech destined for GPT-5 or a specialized variant.
But is a memory upgrade and a stealthy alpha model enough to counter Google's current blitzkrieg across the entire AI stack? We're talking models, hardware, enterprise tools, agent ecosystems, and distribution on top of it all: Google's basically got it all. Quasar does feel like a bright spark, but Google's firing on all cylinders.
Even Microsoft's push for a deeply integrated personal AI is a threat, given how many people at Copilot's 2 million enterprise-customer companies have to use it for work.
The new memory keeps OpenAI relevant and demonstrates ongoing innovation, but Google currently holds the momentum across the broader landscape, and Microsoft is Google's #2 competition in the office. Perhaps that's why OpenAI is looking to scoop up as many students as it can, converting them into early adopters BEFORE they enter the workforce.