Welcome, humans.
People are getting early access to Google Searchâs AI Mode, and itâs really interesting to watch in action.
Some are saying this is basically Googleâs Perplexity killer. And if you add this to the success of Gemini 2.5 Pro, which Google is giving away for free rn, it looks like Google is finally becoming the threat OpenAI was created to preventâŠ
Hereâs what you need to know about AI today:
- We break down how to pick the best AI model.
- OpenAI launched PaperBench to test AI research replication.
- Google released AGI Safety report predicting AGI by 2030.
- Wikimedia traffic rose 50% since January from AI scraping.

Here's how to pick the best AI model for what you actually need
Tired of playing AI model musical chairs? One week Claude's the best, then it's ChatGPT, then suddenly Gemini's crushing benchmarks (welcome to the wild world of AI, folksâor as we call it, âTuesdayâ).
With all the constant changes, how do you know which AI to use when? We actually just watched Tina Huang's hour long interview with Louie Peters (CEO of Towards AI) where they tackled this exact question, and the advice was pretty solid.
First things firstâit all depends on what you're trying to accomplish:
- Solving complex reasoning problems that need high accuracy?
- Processing massive documents with over 700K words?
- Just chatting casually and need something fast and cheap?
- Building enterprise solutions that need self-hosting?
These all require different AI strengths. Here's a few of Louie's pro tips on how to pick the right model for the job:
- Match functionality to your needs: Choose models with capabilities (images, audio, etc.) that fit your specific tasksâfor beginners, start with ChatGPT 4o.
- Check context window size: For long documents or complex instructions, models like Gemini 2.5 Pro offer up to 1M tokens.
- Use benchmarks wisely: Check the metrics most relevant to your use case (math, coding, writing)âmore on that below.
- Calculate your ROI: Expensive models (like o1 Pro) are worth it only when reliability saves more time than they cost.
- Experiment regularly: Build your own intuition by testing multiple modelsâLouie personally uses 5-6 different models 20-30 times a day.
Louie also shared his own breakdown of which models are his favorite atm:

So what tools can help you actually implement this advice?
You could test every model individually (time-consuming but thorough)âor you can use OpenRouter, which lets you test multiple models with the same prompt at once. Just sign up, add funds for premium models, and start comparing results side-by-side.
Another option is checking benchmarks like Live Bench, but remember that AI companies know how to game these tests.
Our favorite approach? Use the site Artificial Analysis, which puts every AI model through standardized tests covering intelligence, speed, cost, and specialized skills.
Their latest rankings show:
- Overall Intelligence: Gemini 2.5 Pro Experimental.
- Speed Champion: Nova Micro (322 tokens per second).
- Cost-Effective King: Gemini 2.0 Flash ($0.2 per million tokens).
- Coding All-Star: o3-mini (high).
- Math Reasoning: Gemini 2.5 Pro Experimental (94/100).
- Best Open-Weight Model: DeepSeek R1.
Fun fact: they also rank speech to text, image, and video models, too!
Now, the above ranking could change. Like, tomorrow. So ultimately, we recommend you go with whichever one consistently works the best for you.
You donât always need the smartest modelâyou just need the one that gets the job done.
After all, choosing an AI is surprisingly personalâitâs not unlike choosing your friends (or more appropriately, your cybernetic coworker). After all, if you're going to spend a good chunk of your day âchattingâ with something, the vibes do kinda matter.

FROM OUR PARTNERS
When your AI needs the best ears in the business... đ

Frustrated when voice AI constantly misunderstands you? Speechmatics fixed that.
While others rush to make AI talk, Speechmatics has solved what matters first: making it truly listen.
Their real-time speech tech delivers 90%+ accuracy in under one second across 55+ languages, diverse accents, and dialects â a full 25% more accurate than competitors, even in noisy environments.
Whether it's AI assistants, customer service, or medical transcription, Speechmatics ensures AI catches every word the first time.
No more âcan you repeat that?ââjust AI that keeps up, not catches up.
Try Speechmatics at no cost and hear the difference.

Prompt Tip of the Day
When youâre trying to condense something, try this prompt: âFirst, give me a shortened version, in <short version>, keeping all the same specificity and context of the original. When thatâs done, write an even shorter version, in <even shorter>.â
Itâs sort of like adding a built-in editor for your AI writing (demo).
Another helpful tip? If you want the AI to write more visually, try: âmake it more concrete (show, donât tell).â Also, you can ask it to use more âimage wordsââbut you might want to add something like: âDon't use metaphors, just use picture words that the user can see.â (demoâŠmaybe I probably shouldâve used that version, huh?).

Treats To Try.
- Claude for Education is a new resource for schools that helps you enhance teaching and learning with specialized features for Claude like Learning mode that guides student reasoning rather than giving answers outright (more).
- Actively AI researches, understands, and reasons about potential customers to maximize revenue quality and pipeline growth (raised $22M).
- GenSpark is a new agent out of China that completes tasks for you through a mixture-of-agents system with fewer hallucinations than competitorsâdemo.
- DeepSite is a totally free vibe-coding app you can use to help code a website (powered by DeepSeek)âwe used it to make this.
- Subscription Day tracks all your subscription payments in your menu bar, showing upcoming charges on a calendar and alerting you before payments are due (Mac only rn).
- Recall connects what you're currently reading with content you've previously saved, instantly showing you where you've seen similar information before.
- ElevenLabs now has a text to bark model for dogs⊠whereâs the cat one, huh??
See our top 51 AI Tools for Business here!

Around the Horn.
- Google replaced the current leader of its consumer AI apps with the leader of Google Labs and helped launch the viral AI research tool NotebookLM.
- OpenAI released PaperBench, a benchmark that evaluates AI agents' ability to replicate state-of-the-art AI research papersâso far, the best agent tested only achieved 21% replication accuracy.
- Google published a 145 page report on the companyâs approach to âAGI Safetyâ and predicts AGI could arrive by 2030.
- Wikimedia traffic surged 50% since January 2024 due to AI crawlers scraping content.
- Researchers from Hong Kong introduced Dream 7B, âthe most powerfulâ open diffusion model (which means it generates text sorta like painting).

FROM OUR PARTNERS
On-device AI. No cloud. No GPUs.

Mirai is building the infrastructure for on-device AI, enabling dev teams to run small language models directly on iOS.
Locally. Fast. Private.
Their engine supports a wide range of architectures, including Llama, Gemma, Qwen, VLMs, and RL over LLMsâmaking advanced AI capabilities accessible on mobile devices. Yeah, pretty cool.

Thursday Trivia
One is real, and one is AI. Which is which? (vote below!)
A.

B.

Which is AI?
The answer is below, but place your vote to see how your guess compares to everyone else (no cheating now!)
Here are the results from last weekâs trivia (A was AI):

Hereâs what you said:
- L.G. chose A: âA. is AI - it's almost perfect - but the logo isn't 100% on point.â
- T.R. chose B: âThe gibberish letters on the cap lead me to think B is AI-generated, given its historical difficulty with text in images.â
- D.F chose A: âShallow depth of field gives it away... It's very common in AI image generation.â

A Cat's Commentary.


Trivia answer: B is AIâŠ