Remember when DeepSeek R1 shocked the market and wiped out $1 trillion in stock value? That was just the warm-up act.
Chinese AI startup DeepSeek is accelerating the release of its next-generation model, DeepSeek R2. Originally slated for May 2025, the model could now drop as early as April, sources indicate, pushed forward by competitive pressure and market momentum.

The leaked specs are mind-blowing:
- A massive 1.2 trillion-parameter model using a hybrid MoE (Mixture-of-Experts) architecture
- Only needs 78 billion active parameters per token (that's efficiency!)
- 97.3% cheaper to train and run than GPT-4
- Expected pricing of just $0.07 per million input tokens and $0.27 per million output tokens
- Trained on 5.2 petabytes of data across various domains
The most disruptive part? Analysts estimate DeepSeek's pricing could undercut OpenAI's comparable offerings by 20 to 40 times. That's not a typo: potentially 40x cheaper than what you're paying for GPT-4o right now. A rough check of that multiple is below.
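Here's a back-of-envelope sanity check on that 20-to-40x claim, in Python. The R2 prices are the leaked figures above; the GPT-4o rates (roughly $2.50 per million input tokens and $10 per million output tokens, their early-2025 list prices) are assumptions for illustration, and actual prices vary by tier and date:

```python
# Back-of-envelope cost comparison: leaked R2 prices vs assumed
# GPT-4o list prices (early 2025; check current rates before relying
# on these numbers).
r2_in, r2_out = 0.07, 0.27         # $ per million tokens (leaked R2)
gpt4o_in, gpt4o_out = 2.50, 10.00  # $ per million tokens (assumed GPT-4o)

print(f"input:  {gpt4o_in / r2_in:.0f}x cheaper")    # ~36x
print(f"output: {gpt4o_out / r2_out:.0f}x cheaper")  # ~37x

# Hypothetical workload: 100M input + 20M output tokens per month.
r2_cost = 100 * r2_in + 20 * r2_out
gpt_cost = 100 * gpt4o_in + 20 * gpt4o_out
print(f"monthly: ${r2_cost:.2f} (R2) vs ${gpt_cost:.2f} (GPT-4o)")
```

Under those assumptions, both ratios land squarely inside the rumored 20-to-40x band.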
Here's what makes DeepSeek R2 particularly fascinating:
Made in China—Hardware and All
Unlike most AI companies (even Chinese ones) that rely on Nvidia chips, DeepSeek R2 was reportedly trained using Huawei's Ascend 910B chips. If accurate, that would mark a decisive break from dependence on Western hardware.
When US export controls on advanced AI chips hit in 2022, DeepSeek already had the computing resources it needed. The company's founder, Liang Wenfeng, had been stockpiling Nvidia GPUs through his hedge fund, High-Flyer, years before DeepSeek even existed.
Secret Sauce: Architecture Innovations
DeepSeek isn't just winning on cost—they're innovating on architecture too. Two key technologies power its efficiency:
- Hybrid Mixture-of-Experts (MoE): Only the experts relevant to a given token are activated, so most of the model's parameters sit idle on any single forward pass, drastically cutting compute per token (see the routing sketch after this list).
- Multi-head Latent Attention (MLA): Compresses the attention key-value cache into a small latent vector per token, cutting the memory needed to serve long prompts (also sketched below).
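To make the MoE idea concrete, here is a minimal top-k routing sketch in Python. The sizes are toy values and this is not DeepSeek's code; the point is the mechanism: a gate scores every expert, but only the k highest-scoring experts actually run for a given token. At the leaked scale, 78B active out of 1.2T total means R2 would touch only about 6.5% of its weights per token.

```python
# Illustrative top-k MoE routing (toy sizes, not DeepSeek's implementation).
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 16   # production models use far more
TOP_K = 2          # experts actually executed per token
D_MODEL = 64       # toy hidden size

# Each "expert" stands in for a feed-forward sub-network.
experts = [rng.normal(size=(D_MODEL, D_MODEL)) for _ in range(NUM_EXPERTS)]
gate_w = rng.normal(size=(D_MODEL, NUM_EXPERTS))

def moe_layer(token: np.ndarray) -> np.ndarray:
    """Route one token through only TOP_K of NUM_EXPERTS experts."""
    scores = token @ gate_w               # gate scores every expert cheaply
    top = np.argsort(scores)[-TOP_K:]     # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()              # softmax over the chosen k only
    # Only the selected experts' weights are ever multiplied: this is
    # the "active parameters per token" saving from the leaked specs.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=D_MODEL)
out = moe_layer(token)
print(out.shape, f"active experts: {TOP_K}/{NUM_EXPERTS}")
```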
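And here is the memory arithmetic behind MLA, again as a toy sketch with made-up dimensions (D_LATENT in particular is an arbitrary choice): rather than caching full per-head keys and values for every past token, the model caches one small latent vector per token and re-expands it when attention is computed.

```python
# Simplified view of MLA's key-value cache compression (toy dims;
# the real design per DeepSeek's published papers also handles
# positional encodings separately).
import numpy as np

rng = np.random.default_rng(1)

D_MODEL, N_HEADS, D_HEAD = 64, 8, 8  # toy dims (D_MODEL = N_HEADS * D_HEAD)
D_LATENT = 16                        # compressed per-token cache entry
SEQ_LEN = 128                        # number of cached past tokens

w_down = rng.normal(size=(D_MODEL, D_LATENT))           # compress
w_up_k = rng.normal(size=(D_LATENT, N_HEADS * D_HEAD))  # re-expand to keys
w_up_v = rng.normal(size=(D_LATENT, N_HEADS * D_HEAD))  # re-expand to values

hidden = rng.normal(size=(SEQ_LEN, D_MODEL))

latent_cache = hidden @ w_down   # the only thing MLA needs to store
keys = latent_cache @ w_up_k     # reconstructed on the fly at attention time
values = latent_cache @ w_up_v

naive = SEQ_LEN * 2 * N_HEADS * D_HEAD  # floats in a standard KV cache
mla = SEQ_LEN * D_LATENT                # floats in the latent cache
print(f"cache size: {mla} vs {naive} floats ({naive / mla:.0f}x smaller)")
```

The production version adds per-head learned projections and decoupled positional keys, but the cache-size arithmetic above is the core of the efficiency win.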
Company Culture vs. "996"
Perhaps most surprising is how DeepSeek operates. Unlike Chinese tech giants known for rigid top-down management and "996" hours (9am-9pm, 6 days a week), DeepSeek offers a more collaborative environment.
One former employee, 26-year-old researcher Benjamin Liu, described it this way: "Liang gave us control and treated us as experts. He constantly asked questions and learned alongside us."
That approach seems to be working. DeepSeek R1 briefly overtook ChatGPT as the top free app on Apple's App Store earlier this year, and DeepSeek's models have outperformed Meta's Llama 3.1, OpenAI's GPT-4o, and Alibaba's Qwen 2.5 on third-party benchmarks, all at a fraction of the cost.
What to Expect From R2
The new model promises several key improvements:
- Enhanced coding capabilities across 30+ programming languages
- Multilingual reasoning beyond just English
- More advanced multimodal support
- A context window of 128K tokens, possibly more
Why This Matters
If DeepSeek delivers on its promises, R2 could force a fundamental reset of AI economics. When a player can deliver comparable (or better) performance at roughly 3% of the cost, the entire industry must respond.
Vijayasimha Alilughatta, COO of Indian tech firm Zensar, believes the new release could mark a turning point: "DeepSeek's success in building cost-effective AI models will likely push companies worldwide to accelerate their efforts, breaking the stranglehold of the few dominant players in the field."
The biggest question now: Will OpenAI, Anthropic, and Google be forced to drastically cut their prices to compete? With GPT-5 expected later this year, the AI race is heating up—and DeepSeek just turned up the temperature significantly.
Keep an eye on this space. The AI economics we've all gotten used to might be about to change dramatically.