#490 – State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI — Lex Fridman Podcast (AI Summary)
Key Topics
The DeepSeek Moment: Refers to the January 2025 release of DeepSeek R1, an open-weight Chinese model that rivaled top US proprietary models at a fraction of the training cost. It signifies a geopolitical shift where the "moat" of massive compute budgets is eroded by algorithmic efficiency and architectural tweaks. For listeners, this means high-performance local inference is now viable without relying on US tech giants.
RLVR (Reinforcement Learning with Verifiable Rewards): A post-training technique where models are trained on tasks with objective ground truths (math, code) rather than subjective human preference. Grading the model purely on accuracy lets reinforcement learning scale massively without the bottleneck of human labeling. This is the engine behind "reasoning" models that can self-correct.
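A minimal sketch of what a verifiable reward can look like for a math task; the answer-extraction heuristic and the function name are illustrative assumptions, not any lab's actual grader:

```python
import re

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Grade a completion against an objective answer; no human judgment."""
    # Heuristic: treat the last number in the completion as the final answer.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0  # no parseable answer -> zero reward
    return 1.0 if numbers[-1] == ground_truth else 0.0

# Grade two rollouts for a prompt whose known answer is "42"; the
# resulting scalar rewards are what the RL update would consume.
rollouts = ["... so the result is 42", "I think the answer is 7"]
print([verifiable_reward(r, "42") for r in rollouts])  # [1.0, 0.0]
```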
Inference-Time Scaling: The process of allowing a model to generate more tokens (hidden "thoughts") to reason through a problem before outputting the final answer. Unlike pre-training scaling, which is a one-time fixed cost, this shifts the compute load to the moment the user asks a question, turning compute into a variable cost that can be dialed up for complex queries.
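As a sketch of the idea in code: `generate` below is a stand-in for any LLM call that caps hidden reasoning tokens, and the parameter name and budget tiers are assumptions for illustration:

```python
def generate(prompt: str, max_thinking_tokens: int) -> str:
    """Stub for an LLM call; a real one would emit hidden 'thinking'
    tokens up to the budget before producing the visible answer."""
    return f"[answered {prompt!r} using up to {max_thinking_tokens} thinking tokens]"

def answer(prompt: str, difficulty: str) -> str:
    """Spend more inference-time compute on harder queries."""
    # Compute is now a variable cost: the budget scales with difficulty.
    budget = {"easy": 0, "medium": 2_000, "hard": 16_000}[difficulty]
    return generate(prompt, max_thinking_tokens=budget)

print(answer("What is 2 + 2?", "easy"))
print(answer("Prove the bound holds for all n.", "hard"))
```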
Mid-Training: A distinct phase between pre-training (raw knowledge) and post-training (fine-tuning) focused on specialized data ingestion. It involves training on high-quality long-context data or specific reasoning traces to prepare the model for complex tasks, and it is crucial for mitigating the "catastrophic forgetting" that occurs when models are over-optimized too quickly.
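One way to picture the three phases is as a staged data schedule; the stage mixtures and context lengths below are hypothetical, chosen only to show how mid-training sits between the other two:

```python
# Hypothetical three-stage schedule; weights and context lengths are
# illustrative, not figures from the episode.
STAGES = [
    {"stage": "pre-training",  "context_len": 4_096,
     "mix": {"web_text": 0.9, "code": 0.1}},
    {"stage": "mid-training",  "context_len": 131_072,
     "mix": {"long_documents": 0.5, "reasoning_traces": 0.3, "code": 0.2}},
    {"stage": "post-training", "context_len": 131_072,
     "mix": {"preference_pairs": 0.5, "verifiable_tasks": 0.5}},
]

for s in STAGES:
    assert abs(sum(s["mix"].values()) - 1.0) < 1e-9  # mixture weights sum to 1
    print(f"{s['stage']}: ctx={s['context_len']:,}, mix={s['mix']}")
```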
Key Takeaways
Build a model from scratch to truly understand the architecture.
Implement 'Extended Thinking' or 'Inference Scaling' for complex queries.
Curate synthetic data using OCR tools for model training.
Use specific models for specific modalities (Model Routing); a minimal routing sketch follows below.
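A minimal sketch of modality-based routing, assuming a simple dispatch table; the model names are placeholders, not recommendations from the episode:

```python
# Hypothetical modality -> model dispatch table.
MODALITY_MODELS = {
    "text":  "general-llm",
    "code":  "code-specialist",
    "image": "vision-model",
    "audio": "speech-model",
}

def route(modality: str) -> str:
    """Return the model best suited to the input's modality."""
    try:
        return MODALITY_MODELS[modality]
    except KeyError:
        raise ValueError(f"unsupported modality: {modality!r}")

print(route("code"))  # -> "code-specialist"
```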