SFB 303 Discussion Paper No. B - 432
Author: Cressman, R., and K. H. Schlag
Title: Updating Strategies Through Observed Play - Optimization Under Bounded Rationality
Abstract: Individuals repeatedly face a multi-decision task with unknown payoff
distributions. They have minimal memory and update their strategy by
observing the previous play (but not the strategy) of someone else. We select
behavior rules that increase average payoffs as often as possible in a large
population where all use the same rule. Here imitation generalizes to a
pasting procedure. When decisions within the task are unrelated, individuals
eventually learn the efficient strategy but the underlying dynamic is not
monotone. However, when choices influence which decisions are subsequently
faced in the task, play may not be efficient in the long run as it
approaches a Nash equilibrium of the agent normal form.
Keywords: Multi-Armed Bandit, improving, undominated
behavioral rule, play-wise imitating, replicator dynamic, monotone
dynamics
JEL-Classification-Number: C72, C79
Creation-Date: April 1998
URL: ../1998/b/bonnsfb432.zip