
OpenAI’s co-founder just said the strategy that built ChatGPT won’t work anymore. The path forward looks surprisingly familiar to anyone in learning and development.
Last month, Ilya Sutskever—the former Chief Scientist at OpenAI who helped create GPT-4—gave a rare interview with Dwarkesh Patel that should reshape how we think about AI.
His core message: The era of "just make it bigger" is ending.
From 2020 to 2025, AI progress came from scaling: more data, more parameters, more computing power. That’s how we got from GPT-3 to GPT-4. But Sutskever says we’ve hit a wall: “We have but one internet.” The training data is finite. The scaling playbook is exhausted.
What replaces it matters enormously for anyone in learning and development. Because the solution isn’t more engineering—it’s better learning science.
The Wall: Why Scaling Can’t Continue
The current approach has two fundamental limits:
First, data scarcity. AI models have essentially consumed the public internet. There’s no second internet to train on. Synthetic data and self-play offer partial solutions, but they risk the model reinforcing its own mistakes.
Second, and more profound: the learning algorithm itself.
Current AI uses reinforcement learning (RL)—a system where an agent takes actions, receives rewards, and updates to maximize those rewards. It’s sophisticated pattern matching. And it produces what I’d call a “savant architecture.”
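To make that loop concrete, here’s a minimal, illustrative sketch of the idea in Python—a textbook multi-armed-bandit learner, not anything Sutskever or OpenAI specifically described. The action names, payoffs, and constants are all hypothetical; the point is simply that every piece of feedback collapses into one scalar reward that the agent chases.

```python
import random

# Toy reward-maximizing loop: the agent tries actions, receives a single
# scalar reward, and nudges its value estimates toward whatever paid off.
ACTIONS = ["a", "b", "c"]           # hypothetical choices the agent can make
values = {a: 0.0 for a in ACTIONS}  # running estimate of each action's worth
LEARNING_RATE = 0.1
EPSILON = 0.1                       # how often the agent explores at random

def reward(action: str) -> float:
    """Stand-in environment: 'b' quietly pays best; the agent must discover that."""
    payoffs = {"a": 0.2, "b": 0.8, "c": 0.5}
    return payoffs[action] + random.gauss(0, 0.05)

for step in range(1000):
    # Explore occasionally; otherwise exploit the current best estimate.
    if random.random() < EPSILON:
        action = random.choice(ACTIONS)
    else:
        action = max(values, key=values.get)
    r = reward(action)  # all feedback is compressed into this one number
    values[action] += LEARNING_RATE * (r - values[action])

print(values)  # estimates drift toward the true payoffs, e.g. 'b' near 0.8
```

Notice what’s missing from that loop: no curiosity, no confusion, no sense of what matters beyond the number. That gap is exactly where the article goes next.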
The Savant Problem
Here’s how Sutskever explains it: A teenager learns to drive in about 10 hours. Current AI systems need millions of examples to master far simpler tasks.
The difference isn't processing power. It's what is being learned.
Consider the recent ARC-AGI benchmark results. OpenAI’s newest model scored an impressive 75% on the original test—a major breakthrough. Then researchers released a slightly modified version with novel puzzles. The same model dropped to 4%. Humans solve essentially all of the puzzles in both versions.
This is the savant pattern:
- Extraordinary performance on familiar patterns
- Catastrophic failure on genuine novelty
- No transfer, no generalization, no “common sense”
It’s like a student who memorizes every practice test but can’t handle a word problem they haven’t seen before. They learned the answers. They didn’t learn the reasoning.
What’s Missing: A Constructivist Foundation
Sutskever pointed to something learning professionals will recognize immediately: the role of emotion in cognition.
He referenced research showing that patients who lose emotional processing can’t make basic decisions—not because they lack information, but because they lack the internal signals that mark what matters. He called emotions “value functions” that evolution hardcoded into us.
This isn't just neuroscience trivia. It's a direct challenge to how AI currently learns.
Reinforcement learning reduces all feedback to a single number: the reward signal. Good or bad. Higher or lower. But human learning doesn’t work that way. We have curiosity that drives exploration, fear that signals danger, satisfaction that marks progress, confusion that triggers deeper processing.
These aren’t obstacles to clear thinking. They’re the infrastructure that makes thinking possible.
Every experienced trainer knows this. Engagement isn’t a nice-to-have—it’s the mechanism by which learning actually happens. A disengaged learner doesn’t just learn slower; they learn differently, superficially, in ways that don’t transfer.
Current AI is learning like a disengaged student: absorbing information without the emotional and developmental infrastructure that turns information into understanding.
The Developmental Gap
Think about how human cognition actually develops:
A child spends years building foundational understanding—object permanence, cause and effect, basic physics intuitions. They learn that dropped things fall, that hidden objects still exist, that actions have consequences. This happens through embodied interaction, not instruction.
Only after these foundations are solid does abstract reasoning develop. You can’t do algebra without first understanding quantity. You can’t reason about psychology without first having a theory of mind.
Current AI skips all of this. It goes straight to reading PhD-level physics papers without ever having dropped a ball. It absorbs descriptions of emotions without ever feeling uncertain. It learns about learning without ever being confused and working through it.
The result: formal operations without foundations. A savant who can discuss anything but understand nothing in the way humans mean “understand.”
The Path Forward
Sutskever outlined three pillars for what comes next:
- Agents that learn through action. AI that doesn’t just respond but acts in the world, experiences consequences, and accumulates knowledge over time. This is closer to how humans learn—through doing, failing, adjusting.
- Inference-time computation. Instead of just making models bigger, give them more time to “think” at runtime. OpenAI’s new reasoning models do this—they pause, consider, explore before answering. It’s a start, but still within the RL paradigm.
- A new learning paradigm entirely. This is the one that matters most for us: Sutskever won’t say exactly what his new company is building, but he’s clear that the current approach—pattern matching plus reinforcement—isn’t sufficient.
Reading between the lines: the next breakthrough may require AI that learns like humans do. That builds schemas, accommodates when those schemas fail, develops through stages, learns through scaffolded interaction.
Sound familiar? This is Piaget. This is Vygotsky. This is what learning science has studied for a century.
Why This Matters for Learning Leaders
- AI tools will get much better at adaptive learning—eventually.
But “eventually” depends on solving the savant problem. In the meantime, be skeptical of AI that performs well on demos but fails when your learners do something slightly unexpected. That’s the 75% → 4% pattern showing up in your training programs.
- Human expertise in learning becomes more valuable, not less.
If the path forward for AI is genuinely better learning algorithms, then the people who understand how learning works—instructional designers, facilitators, coaches—aren’t being replaced. They’re becoming essential partners in the next phase of AI development.
- The questions change.
Instead of “Can AI replace our trainers?” the question becomes “What can learning science teach AI development?” The organizations that figure this out first will have a genuine competitive advantage.
The Bottom Line
Sutskever now leads a company called Safe Superintelligence, backed by $3 billion, focused on AI that reasons rather than just retrieves. He predicts AI systems will become “agentic”—making decisions, evaluating possibilities, adapting to context.
But here’s what struck me most from his interview: he believes the key to smarter AI isn’t more computing power. It’s understanding how efficient learning actually works. Why a teenager can learn in hours what AI takes millions of examples to approximate.
That’s a learning science question. And it might be the most important one in AI right now.
The age of scaling gave us impressive savants. The age of learning—if we get there—will transform how intelligence itself is built.
The people who understand learning have a seat at that table.
Are you taking it?
I’ve spent my career at the intersection of learning science and technology. I think we’re at an inflection point where these fields need to talk to each other more than ever. What are you seeing? Are the AI tools in your organization genuine learning systems, or sophisticated pattern matchers? I’d love to hear what’s working—and what isn’t.
*Allen Partridge, PhD | Director of Product Evangelism, Adobe Digital Learning Solutions*