😺 Gemini's new AI, tested

PLUS: ChatGPT's a better doc than 90% of doctors...
November 18, 2024

Welcome, humans.

If you want to advertise your product or service in front of the BEST readers in AI (that's y'all, Neuron readers!), fill out our partnership form ASAP.

With 475,000+ readers, a 42% open rate, and a new standalone secondary placement, it's a great time to advertise in The Neuron... and spots for Q1 are filling up fast!

The partnership form takes less than 60 seconds to fill out, so you're < 1 minute away from launching a killer ad campaign in 2025!

Advertise in The Neuron here.

Here's what you need to know about AI today:

  • We tried Gemini's new (experimental) top model.
  • Anthropic has a new prompt fixer.
  • Researchers found smaller models may beat shrunken big ones.
  • ChatGPT beat doctors at diagnosing patients.

We tried Gemini's new top model so you don't have to... (unless you want to!)

The AI world has a new champion (for now): Gemini-Exp-1114 just dethroned GPT-4o on multiple leaderboards, marking a win for Google (however fleeting it might be).

If you're wondering why we sound so ominous about Google's time to shine, keep reading.

According to the latest Chatbot Arena stats, Gemini-Exp-1114 is crushing it in areas like math, coding, and creative writing. It's currently ranked #1 in:

  1. Hard prompts.
  2. Math problems.
  3. Creative writing.
  4. Instruction following.
  5. Multi-turn conversations.

It currently ranks #3 in coding tasks (behind o1-preview and o1-mini), though.

Early feedback is in:

Some people even say 1114 could be considered "Gemini 2." If you're curious, you can try it in the AI Studio here (fun fact: it looks like AI Studio is getting a revamp soon!). Exp 1114 is also available via API, so developers can work with it inside other apps.

We tried it, and here's what we think:

Exp 1114 is very capable, and it explains itself well. When we gave it specific tasks, like coding a particular Chrome extension or analyzing a script Claude wrote about Sam Altman vs Elon Musk, it handled them easily.

It doesn't have the same magic as chatting w/ Claude, but there's no doubt this will integrate well inside Google's other applications.

Here's the thing, though: This is a big moment for Google, but timing is everything. There's a new ChatGPT-4o version already in testing (because of course OpenAI couldn't let Google have its moment), and it's looking like the new 4o will retake the #1 spot.

Plus, there's the recent controversy over Gemini's safety issues... and some ppl have pointed out Exp 1114's scores are lower on Livebench (another industry benchmark).

Gemini's success from here will all depend on how 1114 (and any future models) gets integrated into Google's other applications, and whether or not these capabilities significantly improve the experience of using Workspace or Search or Chrome.

FROM OUR PARTNERS

Here's why we like Attention for sales and meeting call summaries, and why you will, too.

We've been testing Attention for our sales + internal calls lately, and wow, we're actually obsessed.

Here's how it works:

  • Attention listens to your calls in real time, taking notes on topics you request.
  • You identify key data you want from calls, and Attention finds the relevant info.
  • Once it's done, you can go back and watch the relevant sections of your call.

Why do we love it? No more scrambling to remember key talking points or frantically taking notes mid-call. We can "be present" (as the gurus say) without worrying about recording every detail, because we know Attention's got our back.

And Attention can do much more, too.

Book a demo and try it out for yourself here.

Treats To Try.

  1. *Incogni removes your personal data from the open internet so scammers and identity thieves can't access it. Stay safe online: use code NEURON to get 58% off Incogni's annual plans with their Black Friday offer now.
  2. Stripe launched an agent toolkit that gives AI agents access to the Stripe API to handle invoices or purchase goods on your behalf (for the techies!).
  3. GenSpark Finance attempts to make financial reports easier to read with graphics and charts, and it's powered by Claude (raised $60M).
  4. Reforged Labs creates ads for mobile games using templates generated from a custom model trained on successful video game ads (raised $3.9M).
  5. Mikrotakt splits audio tracks and isolates certain elements like guitar, vocals, bass, and drums. It's free to try out, and the demo they provide is fun.
  6. AI Game Master is a text-based RPG tool inspired by D&D: just describe your actions, and the AI game master does the rest.
  7. Parafact fact-checks sources (human or AI) with citations; there's also Factiverse, which is a more B2B version (more about it here).

See our top 51 AI Tools for Business here!

*This is sponsored content. Advertise in The Neuron here.

Around the Horn.

Monday Meme.

FROM OUR PARTNERS

The fastest way to build AI apps

Writer is the full-stack generative AI platform for enterprises. Quickly and easily build and deploy AI apps with Writer AI Studio, a suite of developer tools fully integrated with our LLMs, graph-based RAG, AI guardrails, and more.

Use Writer Framework to build Python AI apps with drag-and-drop UI creation, our API and SDKs to integrate AI into your existing codebase, or intuitive no-code tools for business users.

Start building with AI Studio

A Cat's Commentary.

We stan a backhanded compliment; the more the merrier!

See you cool cats on X!

www.theneuron.ai/newsletter/geminis-new-ai-tested
