A Game Theory Paradox
The Absent-Minded Driver Problem
You are driving on a highway with two exits, and you want the second one. But at each exit, you cannot remember whether you have already passed one.
Imagine you are driving home late at night on a highway with two exits. Your house is at Exit 2. Exit 1 leads somewhere terrible (0 points). If you miss both exits and stay on the highway, you end up at a mediocre destination (1 point). Getting home via Exit 2 is the best outcome (4 points).
Here is the twist: you have imperfect recall. When you reach an intersection, you cannot remember whether you have already passed an exit. Both intersections look identical to you.
Payoffs:
Exit 1 (Wrong): 0 points
Exit 2 (Home): 4 points
Stayed on Highway: 1 point
You must choose a probability p of exiting at any intersection you encounter. You cannot condition on which intersection it is because you cannot tell them apart.
What probability should you choose? The answer reveals a deep puzzle about when to evaluate decisions.
Experience the Problem
Try driving yourself. Remember: when you see an exit, you cannot know which one it is. Make the same type of decision each time and see what happens.
Notice how difficult it is to get home consistently without taking the wrong exit.
Find the Optimal Strategy
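If you exit with probability p at each intersection, you take Exit 1 with probability p, Exit 2 with probability (1 - p)p, and stay on the highway with probability (1 - p)^2. For example, at p = 0.33:
Exit 1 (0 pts): 33.0%
Exit 2 (4 pts): 22.1%
Stayed (1 pt): 44.9%
Expected payoff: 1.333
Planning-optimal payoff: 1.333 | Acting-optimal payoff: 1.000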
Move the slider to explore different strategies. Notice that the expected-payoff curve peaks at p = 1/3.
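The slider's numbers can be reproduced in a few lines of Python. This is a minimal sketch, not the page's own implementation: the function names `outcome_probabilities` and `expected_payoff` and the 1,001-point grid are illustrative assumptions. Only the payoffs (0, 4, 1) come from the problem statement.

```python
# Payoffs taken from the problem statement.
PAYOFF_EXIT_1 = 0.0  # wrong exit
PAYOFF_EXIT_2 = 4.0  # home
PAYOFF_STAYED = 1.0  # missed both exits


def outcome_probabilities(p: float) -> dict[str, float]:
    """Exact outcome distribution when exiting with probability p at each intersection."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    return {
        "exit_1": p,               # exit at the first intersection
        "exit_2": (1.0 - p) * p,   # continue once, then exit
        "stayed": (1.0 - p) ** 2,  # continue at both intersections
    }


def expected_payoff(p: float) -> float:
    """Expected payoff of the policy 'exit with probability p', evaluated before driving."""
    probs = outcome_probabilities(p)
    return (
        probs["exit_1"] * PAYOFF_EXIT_1
        + probs["exit_2"] * PAYOFF_EXIT_2
        + probs["stayed"] * PAYOFF_STAYED
    )


if __name__ == "__main__":
    # Coarse grid search over p; the grid resolution is an arbitrary choice.
    grid = [i / 1000 for i in range(1001)]
    best_p = max(grid, key=expected_payoff)
    print(f"best p on grid: {best_p:.3f}, payoff: {expected_payoff(best_p):.3f}")
```

On this grid the maximum falls at the grid point nearest 1/3, with a payoff of about 1.333. Evaluating `expected_payoff(2 / 3)` gives 1.000, matching the acting-optimal payoff quoted on this page.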
Planning vs Acting: Two Different Answers
Here is where it gets strange. Depending on when you evaluate the decision, you get different optimal strategies.
Planning Optimal
Before you start driving
p = 1/3
Expected payoff: 1.333
Acting Optimal
At the intersection
p = 2/3
Expected payoff: 1.000
The Paradox: Both answers are mathematically correct, but they give different strategies. Which should you use?
The Planning Perspective
Before you start driving, you maximize expected utility over all possible outcomes. This gives p = 1/3. You treat your future self as a different player who will mechanically follow the policy you set.
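The value p = 1/3 follows from a short calculation. The symbol E(p) for the expected payoff is notation introduced here. The payoffs 0, 4, and 1 are those from the problem statement.

```latex
\begin{aligned}
E(p) &= \underbrace{p \cdot 0}_{\text{Exit 1}}
      + \underbrace{(1-p)\,p \cdot 4}_{\text{Exit 2}}
      + \underbrace{(1-p)^2 \cdot 1}_{\text{stayed}} \\
     &= 1 + 2p - 3p^2, \\
E'(p) &= 2 - 6p = 0 \;\Longrightarrow\; p = \tfrac{1}{3}, \\
E\!\left(\tfrac{1}{3}\right) &= 1 + \tfrac{2}{3} - \tfrac{1}{3} = \tfrac{4}{3} \approx 1.333 .
\end{aligned}
```

Because E''(p) = -6 is negative, this stationary point is a maximum.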
The Acting Perspective
At the intersection, you reason: “I am equally likely to be at exit 1 or exit 2. What action maximizes my expected utility from this information state?” This gives p = 2/3. You update on the fact that you are at an intersection.
“Your past self and future self are playing a coordination game, but they cannot communicate.”
This is not just a curiosity. It reveals a fundamental tension in decision theory: should rational agents commit to policies, or re-optimize at each decision point?
Simulate 1,000 Drives
Test different strategies with a Monte Carlo simulation. Compare the planning-optimal strategy (p = 1/3) with the acting-optimal strategy (p = 2/3).
Run simulations at different probabilities to see how the empirical results match theory.
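For readers without access to the interactive simulator, a comparable experiment can be sketched in Python. This is an assumption-laden stand-in rather than the page's actual code: the names `drive` and `simulate`, the fixed seed, and the use of the standard `random` module are all illustrative choices.

```python
import random

# Payoffs from the problem statement.
PAYOFFS = {"exit_1": 0, "exit_2": 4, "stayed": 1}


def drive(p: float, rng: random.Random) -> str:
    """Simulate one drive; the same exit probability p applies at both intersections."""
    for outcome in ("exit_1", "exit_2"):
        # The driver cannot tell the intersections apart, so the rule is identical.
        if rng.random() < p:
            return outcome
    return "stayed"


def simulate(p: float, n_drives: int = 1000, seed: int = 0) -> float:
    """Average payoff over n_drives independent drives (seeded for reproducibility)."""
    rng = random.Random(seed)
    total = sum(PAYOFFS[drive(p, rng)] for _ in range(n_drives))
    return total / n_drives


if __name__ == "__main__":
    for p in (1 / 3, 2 / 3):
        print(f"p = {p:.3f}: average payoff over 1,000 drives = {simulate(p):.3f}")
```

With only 1,000 drives the averages carry sampling noise of roughly 0.05, so expect values near 1.333 and 1.000 rather than exact matches. Raising `n_drives` tightens the estimates.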
Why This Matters for AI
The absent-minded driver is not just an academic puzzle. It appears naturally in AI systems that involve multiple instances, interrupted processing, or forked agents.
Multiple AI Instances
When multiple copies of an AI are running, they can face a similar problem: an instance may have no way of knowing how many others have already acted.
Interrupted Processing
An AI that gets interrupted and restarted may not know whether it previously completed a task, much like the driver at an exit.
Forked Agents
When an agent is forked (copied), the copies must coordinate without communication. Each is "absent-minded" about the others.
Commitment Devices
The planning-vs-acting distinction matters for AI alignment: should an AI commit to a policy, or reason fresh at each decision?
The Core Question for AI Alignment
Should an AI commit to a policy and follow it mechanically (planning optimality)? Or should it re-evaluate at each decision point (acting optimality)?
In the absent-minded driver problem, these give different answers. Neither is obviously “more rational” than the other.
Key Concepts
Imperfect Recall
The player cannot distinguish between different game states based on their history, because they do not remember their own past actions. This violates the perfect-recall assumption that many standard game theory results rely on.
Mixed Strategy
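Since you cannot condition on which exit you are at, you must choose a single probability of exiting that applies to all intersections identically. Strictly speaking, this is a behavioral strategy: you randomize afresh each time you reach an intersection.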
Self-Coordination
Your decision at one intersection affects the probabilities your "other self" faces at the next intersection. You are playing a game against yourself.
Time Inconsistency
What looks optimal before you start driving differs from what looks optimal when you are at an intersection. Neither perspective is "wrong".
The absent-minded driver teaches us that rationality is more subtle than maximizing expected utility.
When you have imperfect recall, the question is not just “what should I do?” but “from whose perspective should I evaluate what I should do?”
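Reference: Piccione, M., & Rubinstein, A. (1997). On the Interpretation of Decision Problems with Imperfect Recall. Games and Economic Behavior, 20(1), 3-24.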