Intelligence with Everyone: RL @ MiniMax, with Olive Song, from AIE NYC & Inference by Turing Post

Intelligence with Everyone: RL @ MiniMax, with Olive Song, from AIE NYC & Inference by Turing Post

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis · 2026-02-22

Olive Song from MiniMax shares how her team trains the M series frontier open-weight models using reinforcement learning, tight product feedback loops, and systematic environment perturbations. This crossover episode weaves together her AI Engineer Conference talk and an in-depth interview from the Inference podcast. Listeners will learn about interleaved thinking for long-horizon agentic tasks, fighting reward hacking, and why they moved RL training to FP32 precision. Olive also offers a candid look at debugging real-world LLM failures and how MiniMax uses AI agents to track the fast-moving AI landscape.

Nathan uses Granola to uncover blind spots in conversations and AI research. Try it at ⁠granola.ai/tcr⁠ with code TCR — and if you’re already using it, test his blind spot recipe here: ⁠https://bit.ly/granolablindspot⁠

LINKS:

Conference Talk (AI Engineer, Dec 2025) – https://www.youtube.com/watch?v=lY1iFbDPRlwInterview (Turing Post, Jan 2026) – https://www.youtube.com/watch?v=GkUMqWeHn40

Sponsors:

Claude:

Claude is the AI collaborator that understands your entire workflow, from drafting and research to coding and complex problem-solving. Start tackling bigger problems with Claude and unlock Claude Pro’s full capabilities at https://claude.ai/tcr

Tasklet:

Tasklet is an AI agent that automates your work 24/7; just describe what you want in plain English and it gets the job done. Try it for free and use code COGREV for 50% off your first month at https://tasklet.ai

CHAPTERS:

(00:00) About the Episode

(04:15) Minimax M2 presentation (Part 1)

(17:59) Sponsors: Claude | Tasklet

(21:22) Minimax M2 presentation (Part 2)

(21:26) Research life and culture

(26:27) Alignment, safety and feedback

(32:01) Long-horizon coding agents

(35:57) Open models and evaluation

(43:29) M2.2 and researcher goals

(48:16) Continual learning and AGI

(52:58) Closing musical summary

(55:49) Outro

PRODUCED BY:

https://aipodcast.ing

SOCIAL LINKS:

Website: https://www.cognitiverevolution.ai

Twitter (Podcast): https://x.com/cogrev_podcast

Twitter (Nathan): https://x.com/labenz

LinkedIn: https://linkedin.com/in/nathanlabenz/

Youtube: https://youtube.com/@CognitiveRevolutionPodcast

Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431

Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

A biweekly podcast where hosts Nathan Labenz and Erik Torenberg interview the builders on the edge of AI and explore the dramatic shift it will unlock in the coming years.
The Cognitive Revolution is part of the Turpentine podcast network. To learn more: turpentine.co

Where can you listen?

Apple Podcasts Logo Spotify Logo Podtail Logo Google Podcasts Logo RSS

Episodes