Welcome, humans.
If you haven’t checked out Skeleton Crew (the new Star Wars show) yet, it’s REAL good (even IMDb agrees). We just started it this weekend, and we’re basically obsessed. Keep in mind, it’s a kids’ show, but it’s V fun (and sorta retro 80s—imagine Goonies meets Star Wars).
Check out this clip of SM-33, the coolest robot we’ve seen introduced to the Star Wars universe in a looong time:
SM-33 Fights Space Pirates (Skeleton Crew) | Star Wars Clips
Speaking of cool robots, check out this robot rat that uses AI to befriend real-life rats (more about it here). TBH, can’t wait until I can buy a robot version of myself to help me befriend other humans…
Robo-rat reminded us of this story about Chinese companies starting to sell AI pets. With more AI + robo innovation, life-like robo-pets are DEFINITELY coming soon…
Here’s what you need to know about AI today:
- 12 Days of OpenAI, Day 2: a new reinforcement fine-tuning program.
- X.com released Grok for free and (sort of) released a new image generator.
- Meta planned a new $10B Louisiana data center for AI.
- Raven drone and MagicBots demonstrated new robotic capabilities.
OpenAI's new fine-tuning tool lets models become experts in your field.
Reinforcement Fine-Tuning—12 Days of OpenAI: Day 2
Remember when OpenAI dropped o1 and we were all amazed by its ability to think before answering? Well, they just announced something potentially even cooler: a way to make these models experts in specific domains with reinforcement fine-tuning (RFT).
Here's the TL;DR:
- OpenAI launched a research program for RFTing the o1 series of models.
- RFT lets you train models as domain experts with as few as a dozen examples.
- Alpha program now, but the technique will be publicly available in early 2025.
Here's a demo in action: A Berkeley computational biologist used RFT to take OpenAI's smaller, cheaper o1-mini model and make it outperform the full o1 model at identifying genes responsible for rare diseases. That's like teaching your Honda Civic to outrace a Ferrari on a specific track.
Instead of typical fine-tuning (where you're basically teaching the model to mimic examples), RFT lets the model think through problems and then grades its answers. The model learns which reasoning paths lead to correct answers, and which don't.
Take that rare disease example. o1 was given patient symptoms like “51-year-old woman with hyperthyroidism” and had to figure out which genes might be responsible. The fine-tuned o1-mini got the right gene 31% of the time—beating both regular o1-mini (17%) and full o1 (25%).
What's REALLY impressive is that there's no overlap between training and validation data—they used completely different genes for testing than training. According to OpenAI, that means the model actually learned to generalize rather than just memorize answers.
RFT results from the Berkeley test. 31% isn’t THAT good… but it’s 31% better than nothing, and as we’ve said before, this is the worst these models will EVER be.
Some other early uses include:
- Legal research and analysis with Thomson Reuters.
- Financial modeling and risk assessment.
- Engineering problem-solving.
- Insurance underwriting.
Sound interesting to you? You can apply for early access here. It's still in alpha, and spots in their research program are limited. But if you're working on complex tasks (something where experts could agree on what makes an answer “correct”) then apply away.
OpenAI provides the RL algorithms and training infrastructure, including pre-built graders that score responses between 0 and 1 (with plans to let users create custom Python graders in the future)—you just bring your data and scoring criteria. Don't expect instant results though—training can take anywhere from a couple hours to a few days.
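The grading idea is simple enough to sketch. Here’s a toy Python grader in the spirit of what OpenAI describes for the rare-disease demo—NOT their actual API (the function name, scoring scheme, and gene names are ours, for illustration only): full credit if the model’s top-ranked gene is the right one, partial credit if the right gene shows up lower in its ranked list, zero otherwise.

```python
def grade_gene_prediction(predicted_genes, correct_gene):
    """Toy RFT-style grader returning a score between 0 and 1.

    predicted_genes: the model's ranked list of candidate genes.
    correct_gene:    the known answer for this training example.
    """
    if not predicted_genes:
        return 0.0
    if predicted_genes[0] == correct_gene:
        return 1.0  # top pick is correct: full credit
    if correct_gene in predicted_genes:
        # Partial credit that decays with rank:
        # rank 2 -> 0.5, rank 3 -> 0.33, and so on.
        return 1.0 / (predicted_genes.index(correct_gene) + 1)
    return 0.0  # right answer not in the list at all

# Example: the model ranked the correct gene second
print(grade_gene_prediction(["TSHR", "FOXE1", "PAX8"], "FOXE1"))  # 0.5
```

During training, reasoning paths that earn higher scores get reinforced and low-scoring ones get discouraged—which is why the grader returning graded partial credit (instead of just pass/fail) matters: it gives the model a learning signal even when it’s only close.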
Our take: The AI landscape we know today, where everyone has access to general-purpose AI with more or less the same capabilities, is headed toward fragmentation, where companies build their own focused, expert systems that deeply understand their specific domain.
As the Berkeley researcher pointed out, the most powerful solutions will likely be “hybrid”—combining traditional domain-specific tools with customized AI models.
Think about it: A law firm's AI assistant that truly understands case law and precedent. A medical AI that speaks your hospital's specific protocols and procedures. A financial model that knows your company's risk tolerance… We're looking at a fundamental shift in how expertise scales across industries here. NBD, just OpenAI casually reinventing how knowledge work gets done... AGAIN.
FROM OUR PARTNERS
Want to build AI projects at lightning speed?
Join Dell Technologies experts and Speed Read AI on Dec 12th to see how Dell Precision workstations powered by NVIDIA RTX™ accelerate every stage of development. Real examples included. Save your spot!
Treats To Try.
- LM Studio runs open-source AI models like Llama and Pixtral on your computer without needing an internet connection. Another option is Ollama.
- Showrunner is a tool that lets you direct TV shows complete with digital actors and sets (waitlist rn, but you can sign up for Alpha access).
- ChainClarity explains crypto whitepapers with digestible summaries you can actually understand.
- ZenAdmin manages all your company's IT needs—from equipment procurement to employee support—in one centralized platform.
- Akira Docs combines genAI content generation and optimization with a modern, Notion-style editing experience (here’s a demo).
- Countless.dev compares different language models side-by-side, showing you key details like pricing and capabilities (vision support, input/output limits) so you can pick the right one for your needs (AMAAAZING resource).
See our top 51 AI Tools for Business here!
Around the Horn.
Watch this bird-inspired robotic drone leap into the air
- There’s been a lot of neat robot demos lately, like Raven, the bird-inspired drone that can jump and launch itself into flight (featured above), this demo of a group of MagicBots working together on a factory floor, and this real-life Disney robot meeting Pollen Robotics’ Reachy (who looks like Disney IRL).
- Meta announced a $10B investment to build its largest-ever data center in Louisiana's Richland Parish, designed for AI processing, and planned for 2030.
- X.com made Grok free to all users, and released a new image generator called Aurora, but took it down almost immediately; this comes after xAI raised $6B.
- OpenAI may remove the AGI clause (which protects its most powerful models from Microsoft if it achieves AGI) to get Microsoft to invest more.
Under the Hood
- Meta released Llama 3.3-70B-Instruct, which matches their largest model's performance (405B) but runs on just 17% of the parameters, making it more efficient while outperforming competitors on key benchmarks.
- Florence-VL outperforms other top multimodal AI models at understanding images, whether you're asking questions, reading text, or analyzing charts (demo it here).
- SmolVLM is a compact multimodal model for image-text tasks requiring only 5GB GPU RAM.
- Motion Prompting is a new method to generate videos by drawing motion paths that control how objects and cameras move in the scene.
- FSQ OS Places is a dataset of 100M business locations worldwide, covering everything from addresses to operating hours to social media profiles, built to power location-based services, market research, and mapping applications.
FROM OUR PARTNERS
Build AI projects in hours, not weeks.
See how Dell Precision workstations with NVIDIA RTX™ can unlock faster development at this Dec 12th webinar. Register here.
Sunday Special
More good stuff in the thread.
Here's a super helpful analysis about when to use Pro (and when not). The main takeaway is to use Claude 3.5 Sonnet for most practical needs at $20/month unless you specifically need:
- o1 Pro's vision capabilities.
- PhD-level mathematical analysis.
- That crucial extra 5-10% accuracy in specialized academic work (which would justify the $200/month price tag).