
A teenager masters driving in 10 hours. AI needs millions of examples. The difference isn’t computing power—it’s something every good trainer already understands.
In my last article, I explored why the “just make it bigger” era of AI is ending. Today I want to dig into something Ilya Sutskever said that stopped me cold.
He was explaining why AI systems need millions of training examples while humans learn from dozens. His answer wasn’t about algorithms or architecture.
It was about emotions.
The Value Function Problem
Sutskever referenced fascinating neuroscience research: patients with damage to emotional processing centers can’t make basic decisions. Not because they lack information—they can still analyze options perfectly well. But they can’t choose. They’ll deliberate endlessly over where to eat lunch.
His interpretation: emotions aren’t noise that clouds rational thinking. They’re value signals that evolution hard-coded into us. Fear marks danger. Curiosity marks opportunity. Satisfaction marks progress. Without these signals, we have no basis for deciding what matters.
Current AI has none of this.
Reinforcement learning—the dominant paradigm—reduces all feedback to a single number. A reward score. Higher or lower. Good or bad. The system optimizes for that number.
But human learning doesn’t work that way.
What Great Trainers Already Know
Think about how your best employees actually learn.
They don’t just absorb information. They get curious about problems. They feel frustrated when something doesn’t work—and that frustration drives them to understand why. They experience satisfaction when pieces click together. They feel uncertain in ways that trigger deeper investigation.
These aren’t obstacles to learning. They’re the infrastructure that makes learning possible.
A disengaged learner doesn’t just learn slower. They learn differently—superficially, in ways that don’t transfer. Every experienced facilitator knows this. Engagement isn’t a nice-to-have metric. It’s the mechanism.
Current AI learns like a permanently disengaged student: processing information without the emotional architecture that turns information into understanding.
The Multi-Dimensional Gap
Here’s the technical version of the problem:
Human learning involves dozens of simultaneous feedback signals, among them:
- Curiosity (exploration drive)
- Confusion (triggers deeper processing)
- Fear (danger detection)
- Satisfaction (progress confirmation)
- Social signals (belonging, status, approval)
- Physical states (fatigue, alertness, comfort)
These signals interact. Curiosity plus confusion often leads to breakthroughs. Fear plus confusion leads to avoidance. The combination matters as much as the individual signals.
Reinforcement learning collapses all of this into: reward = 0.73
It’s like trying to navigate a city with only “warmer/colder” feedback. You might eventually get somewhere, but you’ll need millions of steps—and you’ll have no idea why you arrived.
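A toy sketch makes the contrast concrete. The signal names below are illustrative only (they are not any existing framework's API); the point is simply that a vector of named signals carries information a single scalar throws away.

```python
from dataclasses import dataclass

# Standard RL feedback: everything the learner "felt" is collapsed to one float.
scalar_reward = 0.73

# A hypothetical richer signal, sketching the multi-channel feedback described above.
# Field names (curiosity, confusion, ...) are illustrative assumptions, not a real API.
@dataclass
class FeedbackSignals:
    curiosity: float      # exploration drive
    confusion: float      # triggers deeper processing
    fear: float           # danger detection
    satisfaction: float   # progress confirmation
    social: float         # belonging, status, approval

    def collapse(self) -> float:
        """What single-reward training effectively does: average everything
        into one number, discarding which combination of signals produced it."""
        values = [self.curiosity, self.confusion, self.fear,
                  self.satisfaction, self.social]
        return sum(values) / len(values)

signals = FeedbackSignals(curiosity=0.9, confusion=0.8, fear=0.1,
                          satisfaction=0.4, social=0.45)
print(signals.collapse())  # 0.53 -- the telling "curiosity + confusion" pattern is gone
```

The collapsed number looks like useful feedback, but two very different learner states (curious-and-confused versus bored-and-comfortable) can produce the same 0.53.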
Why This Creates Savants
This explains the pattern I described last time: AI that scores 75% on a benchmark, then drops to 4% when the puzzles change slightly.
Without rich value signals, the system can only learn what worked, not why it worked. It memorizes patterns without building the underlying model that would let it generalize.
It’s the difference between a student who memorizes “the answer to problem 7 is 42” and one who understands the principle well enough to solve problems 7b and 7c, plus problems that look nothing like 7 but use the same reasoning.
The first student might ace the practice test. They’ll fail the final.
Honest Caveat
Here’s where I need to be careful.
We don’t actually know that emotional value signals are necessary for generalization. It’s possible that the attention mechanism in transformers—the way they weigh relationships between concepts—can simulate something functionally equivalent. Maybe the current architecture just needs different training, not different machinery.
And even if Sutskever is right that something like emotional feedback is essential, we don't know what that looks like in silicon. The biochemical cascade from your amygdala through your prefrontal cortex is staggeringly complex. The path to replicating its *function* (let alone its mechanism) might be decades away—or it might emerge from an approach nobody's tried yet.
What we do know:
- Current systems show the savant pattern (brilliant on familiar, brittle on novel)
- Scaling alone isn’t solving it
- The researchers at the frontier are talking about learning science, not just engineering
That’s not proof that L&D expertise holds the key. But it’s a strong signal that the conversation is changing.
What This Means for L&D
1. Your expertise is more relevant than ever.
If the path forward for AI involves richer value signals and more human-like learning, then the people who understand motivation, engagement, and developmental psychology aren’t being automated away. They’re becoming essential to the next phase of AI development.
2. Current AI tools have predictable limitations.
When evaluating AI for learning applications, ask: how does it handle novelty? If it performs beautifully on demos but struggles when learners do something unexpected, you’re seeing the single-reward-signal limitation in action.
3. The hybrid approach matters.
The most effective learning systems right now combine AI’s pattern-matching strengths with human facilitation’s adaptive, emotionally intelligent guidance. That’s not a temporary compromise—it’s recognition of what each does well.
The Frontier
Some researchers are already working on richer reward architectures. Systems with curiosity drives. Agents that model their own uncertainty. Multi-objective optimization that doesn’t collapse everything to a single number.
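As a rough illustration of what “richer” can mean, here is a minimal curiosity-bonus sketch: the agent earns extra reward for reaching states its own (toy) forward model predicts poorly. The linear model, function names, and weighting are assumptions for demonstration, not a description of any specific research system.

```python
import numpy as np

# Toy sketch of a curiosity-style intrinsic reward: reward "surprise," i.e.
# states the agent's own forward model fails to predict well.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)) * 0.1   # toy forward model: guesses the next state

def intrinsic_reward(state: np.ndarray, next_state: np.ndarray) -> float:
    predicted = W @ state                                   # model's prediction
    return float(np.mean((predicted - next_state) ** 2))    # prediction error = curiosity bonus

def combined_reward(extrinsic: float, state: np.ndarray, next_state: np.ndarray,
                    weights: tuple[float, float] = (1.0, 0.1)) -> float:
    # Multi-objective in spirit: task reward plus a curiosity term.
    # (Still reduced to one number here via a weighted sum, for simplicity.)
    w_ext, w_int = weights
    return w_ext * extrinsic + w_int * intrinsic_reward(state, next_state)

s, s_next = rng.normal(size=4), rng.normal(size=4)
print(combined_reward(extrinsic=0.5, state=s, next_state=s_next))
```

Even this crude version changes behavior: an agent paid for surprise will explore on its own, rather than waiting for the task reward to tell it where to look.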
But here’s what struck me most about Sutskever’s comments: the people who’ve studied human learning for decades already have deep knowledge of these value signals. They know which combinations drive growth. They know how to scaffold challenge and support.
That knowledge is about to become very valuable.
*The gap between "knowing" and "understanding" isn't just philosophical. It's the central challenge facing AI development right now. What are you seeing? Are your AI tools genuinely adaptive, or do they break when learners go off-script?*
Allen Partridge, PhD | Director of Product Evangelism, Adobe Digital Learning Solutions