A driver is sitting in a pub planning his trip home. In order to get there he must take the highway and get off at the second exit. Unfortunately, the two exits look the same. If he mistakenly takes the first exit he’ll have to drive on a very hazardous road, and if he misses both exits then he’ll reach the end of the highway and have to spend the night at a hotel. Assign the following payoff values: 4 for getting home, 1 for reaching the hotel, and 0 for taking the first exit.
The man knows that he’s very absent-minded — when he reaches an intersection, he can’t tell whether it’s the first or the second intersection, and he can’t remember how many exits he’s passed. So he decides to make a plan now, in the pub, and follow it on the way home. This amounts to choosing between two policies: Exit when you reach an intersection, or continue. The exiting policy will lead him to the hazardous road, with a payoff of 0, and continuing will lead him to the hotel, with a payoff of 1, so he chooses the second policy.
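The pub-stage comparison can be checked with a minimal Python sketch. The payoffs come from the text; the constant names and the `drive` function are illustrative assumptions, and only the two policies named above are modeled.

```python
# Payoffs from the text; the names are illustrative, not from the source.
PAYOFF_FIRST_EXIT = 0   # hazardous road
PAYOFF_SECOND_EXIT = 4  # home
PAYOFF_NO_EXIT = 1      # hotel at the end of the highway


def drive(policy):
    """Return the payoff when one action is applied at every intersection.

    policy is "exit" or "continue". The driver cannot tell the two
    intersections apart, so the same action is used at both.
    """
    for payoff_if_exit in (PAYOFF_FIRST_EXIT, PAYOFF_SECOND_EXIT):
        if policy == "exit":
            return payoff_if_exit
    return PAYOFF_NO_EXIT


payoffs = {policy: drive(policy) for policy in ("exit", "continue")}
best = max(payoffs, key=payoffs.get)
print(payoffs, best)
```

Under these assumptions, always exiting stops at the first intersection and never reaches the second, which is why continuing wins at the planning stage.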
This seems optimal. But then, on the road, he finds himself approaching an intersection and reflects: This is either the first or the second intersection, each with probability 1/2. If he were to exit now, the expected payoff would be 1/2 × 0 + 1/2 × 4 = 2.
That’s twice the payoff of going straight! “There appear to be two contradictory optimal strategies, one at the planning stage and one at the action stage while driving,” writes Leonard M. Wapner in Unexpected Expectations. “At the pub, during the planning stage, it appears the driver should never exit. But once this plan is in place and he arrives at an exit, a recalculation with no new significant information shows that exiting yields twice the expectation of going straight.” What is the answer?
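The driver's on-road recalculation can be reproduced with a short Python sketch. It takes his belief of 1/2 for each intersection at face value, as the puzzle does, and it does not attempt to resolve the paradox. The variable names are illustrative assumptions.

```python
from fractions import Fraction

# The driver's stated belief at an intersection (taken from the puzzle).
belief_first = Fraction(1, 2)
belief_second = 1 - belief_first

# Exiting pays 0 at the first intersection and 4 at the second.
expected_exit = belief_first * 0 + belief_second * 4

# Payoff of the plan made in the pub: never exit, end up at the hotel.
payoff_continue_plan = 1

ratio = expected_exit / payoff_continue_plan
print(expected_exit, ratio)
```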
(Michele Piccione and Ariel Rubinstein, “On the Interpretation of Decision Problems with Imperfect Recall,” Games and Economic Behavior 20, 3-24.)