😺 Devs getting REPLACED??

PLUS: 3 hallucination studies you NEED to know...
February 16, 2025

Welcome, humans.

It looks like the Unitree G1 robot just got an update to its algorithm... and by algorithm, we mean algo-rhythm:

We really like that disclaimer at the end of the video: "We kindly request that all users refrain from any dangerous modifications or using the robot in a hazardous manner."

Okay, so when they said G1 can learn "any dance", apparently they meant EXCEPT slam dancing.

No spin kicks? No windmills? No Wall of Death?! How could I ever possibly go to a hardcore show with this dude?! He JUST learned how to hold his own in a mosh pit.

In all seriousness, what a polite suggestion for something fundamentally terrifying. Friendly reminder that it's much too easy to jailbreak these kinds of robots for nefarious purposes!

Here's what you need to know about AI today:

  • We go over the top AI coding tools (and what they mean for the industry).
  • The EU and UK pivoted (slightly) on AI safety.
  • Apptronik got $350M for its warehouse robot.
  • 3 new studies on AI hallucination dropped.

Advertise in The Neuron here.

Building apps with AI is getting easier than ever... here are the 6 tools to know.

Last Thursday, we hit a small roadblock: we wanted to share a long Deep Research prompt with y'all, but realized there wasn't a great way to do it online.

Sure, we could use Google Docs (which we did), but it felt... clunky. So we did what any reasonable AI newsletter would do: we built our own AI Prompt Manager (v0.1).

Here's how we did it (in only a few hours, entirely with AI):

  1. Defined our requirements with ChatGPT o3-mini to make the prompt.
  2. Spun up the concept in Lovable, a chat-based AI app builder.
  3. Connected Lovable to Supabase for the backend (where the prompts get stored). Lovable makes this process super easy, btw.
  4. Tweaked the design with some back-and-forth AI chats, and chef's kiss!
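
If you're curious what an app like this boils down to, here's a minimal Python sketch. To be clear, this is not the code Lovable generated for us: the `Prompt` fields, the `PromptStore` class, and the `/prompts/...` path are all hypothetical, and a plain in-memory dictionary stands in for the Supabase table.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional
from uuid import uuid4


@dataclass
class Prompt:
    """One saved prompt. Field names are assumptions, not our real schema."""
    title: str
    body: str
    # Short random id so a long prompt can be shared as a short link.
    id: str = field(default_factory=lambda: uuid4().hex[:8])
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class PromptStore:
    """In-memory stand-in for the backend table where prompts get stored."""

    def __init__(self) -> None:
        self._prompts: Dict[str, Prompt] = {}

    def save(self, title: str, body: str) -> Prompt:
        """Create a prompt record and keep it keyed by its id."""
        prompt = Prompt(title=title, body=body)
        self._prompts[prompt.id] = prompt
        return prompt

    def get(self, prompt_id: str) -> Optional[Prompt]:
        """Look a prompt up by id; returns None if it was never saved."""
        return self._prompts.get(prompt_id)

    def share_path(self, prompt_id: str) -> str:
        """Hypothetical URL path a reader would open to see the prompt."""
        return f"/prompts/{prompt_id}"
```

Save a prompt with `store.save(title, body)`, then hand out `store.share_path(prompt.id)` instead of pasting the whole thing into a doc.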

Lovable also just launched a new Visual Editor so you can now easily edit sizes, colors, content, and other stylings of any element on the page with a Figma-like experience.

There's also Lovify, which layers extra features on top, and 21st.Dev, a marketplace for assets to improve these tools.

Now get this: Lovable is just one tool that does this. There are actually a lot of tools that work similarly.

There's even Devin, which aims to replace software engineers entirely (OpenAI's working on something like this too).

Most devs will tell you Devin is too expensive, but that's the thing: Devin isn't meant for devs. It's meant for management.

Here's why this matters: Look at this chart of software developer job postings over the last five years. They've fallen off a cliff...

We now know (thanks to Anthropic) that mid-tier developers are embracing AI like crazy. At the same time, companies are laying off senior engineers, creating this weird vacuum where junior devs are using AI to code but missing the mentorship on why things work (or break).

John Collins argues there's now a huge gap between the expectations senior managers have for AI replacing software engineers and the reality on the ground.

C-suites are excited about AI reducing the need for expensive engineers, but engineering involves far more than writing code: it's managing stakeholders, debugging, writing tests, providing estimates. AI isn't close to handling all that.

One segment of the software engineer market that COULD get decimated by AI? The outsourcing of dev talent on sites like Fiverr and Upwork, which boomed during the pandemic.

Our take: These tools aren't replacing developers just yet (sorry, CEOs hoping to cut costs), but they ARE making it possible for everyone to build custom tools for their specific needs.

This means the golden age of personal software might finally be here. Instead of searching for the perfect tool, you might just... build it yourself. With AI as your coding buddy, of course. Just remember to check your buddy's work!

FROM OUR PARTNERS

Compliance for Startups: Download the SOC 2 Checklist

As a startup founder, finding product-market fit is your top priority. But landing bigger customers requires SOC 2 or ISO 27001 compliance, a time-consuming process that pulls you away from building and shipping.

That's where Vanta comes in.

Join over 9,000 companies, including hundreds of Y Combinator-backed startups like Supabase, Newfront, and Fern, who streamline compliance with Vanta's automation and trusted network of security experts.

Start with the SOC 2 compliance checklist, which breaks down the process into clear steps, so you can spend less time on compliance and more time growing your business.

Download the SOC 2 checklist

Treats To Try.

  1. Perplexity now has its own Deep Research tool that writes expert research reports by reading hundreds of sources while you wait; just select it from "Auto" (more here).
  2. Justpoint matches you with lawyers who take on your medical injury case (like medical malpractice and harmful drug cases) with zero upfront fees.
  3. Deskminder lets you drag to set desktop timers that show full-screen notifications you can't miss (Mac only rn).
  4. Rabbithole visualizes your curiosity by turning each question into a mind map of connected discoveries.
  5. Browser Use Cloud executes web tasks from your text commands: just type "order pizza from Dominos" and it handles the rest (free to use with the code here or $30/month with their cloud).
  6. Alice.tech transforms your course materials into custom flashcards and practice tests that pinpoint where you need to improve.
  7. FeedbackStream interviews your customers through AI voice calls, replacing time-consuming 1-on-1 meetings with automated conversations.

See our top 51 AI Tools for Business here!

Around the Horn.

  • Meta hired the ex-CEO of The RealReal to help boost its AI glasses and VR headset sales.
  • Apptronik has a robot called Apollo that loads trucks, stacks boxes, and moves heavy items so your workers don't have to (raised $350M).
  • Airbnb CEO Brian Chesky thinks it's too soon to deploy AI for trip planning, comparing where we are in the AI wave to the "mid-to-late" '90s for the internet; he expects AI to take a few more years to reach a "30% increase in technology and engineering productivity."
  • After this week's AI Summit in Europe, the UK renamed the AI Safety Institute to "the AI Security Institute" and signed a deal with Anthropic.
  • Separately, the EU published a new work program for a "bolder, simpler, faster" EU and abandoned a liability directive that would have made it easier for consumers to sue AI companies over their AI services.

Sunday Special

We want to take this Sunday to highlight some recent studies related to hallucination.

First, the BBC ranked how accurately the major AI search tools handle news content, and found even the best performer still struggled with factual accuracy. Here's each tool's version and performance, ranked from best to worst:

  • ChatGPT Enterprise (GPT-4) had 15% significant errors.
  • Perplexity Pro (default LLM) had 17%.
  • Microsoft Copilot Pro (LLM not specified) had 27%.
  • Google Gemini Standard (LLM not specified) had 34%.

This is helpful to know if you're using these tools and relying on the results without fact checking them. Now we know: don't do that.

Important note: it doesn't seem like any of those tools use the latest reasoning models, or at least didn't at the time they were tested.

Second, a slew of papers were just released that argue the following:

  1. Hallucination is inevitable: crazy paper that proves this with math.
  2. Here's what LLMs know, and what they don't: really worth diving deep into.
  3. Verifying AI outputs is often more mentally taxing than doing your own thinking: this one is super fascinating, but it's a small sample size and self-reported.

All three studies converge on one point: AI can be a game-changer, but human oversight is still the secret ingredient.

For the record, there's a "Hallucination Leaderboard" that scores how well each model stays factually aligned with a known source. But it doesn't measure open-ended truths or outside knowledge. So effectively, it's rating each model's ability to avoid making stuff up when the correct info is spelled out right in front of them.
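
To make "spelled out right in front of them" concrete, here's a toy Python check. It is our own illustration, not how the leaderboard actually scores models: the function name `unsupported_numbers` is made up, and all it does is flag numbers in an answer that never appear in the source text.

```python
import re
from typing import List

# Matches whole numbers and decimals such as 350 or 27.5.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def unsupported_numbers(source: str, answer: str) -> List[str]:
    """Return numbers in the answer that are absent from the source."""
    source_numbers = set(NUMBER.findall(source))
    return [n for n in NUMBER.findall(answer) if n not in source_numbers]


source = "Apptronik raised $350M for its Apollo robot."
faithful = "Apollo maker Apptronik raised $350M."
made_up = "Apptronik raised $530M for Apollo."

print(unsupported_numbers(source, faithful))  # []
print(unsupported_numbers(source, made_up))   # ['530']
```

Real grounding checks have to deal with names, dates, and paraphrases, so treat this as a mental model of the task rather than a working detector.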

None of this suggests we should banish AI from our workflows; rather, we need to make sure we remain engaged, inquisitive, and skeptical. If you're confident in yourself (and maybe just slightly less confident in the chatbot), you'll strike that sweet spot: reaping AI's productivity benefits without losing your own cognitive edge.

A Cat's Commentary.


See you cool cats on X!

Get your brand in front of 450,000+ professionals here
www.theneuron.ai/newsletter/devs-getting-replaced
