The Rat in Your Slot Machine: Reinforcement Schedules

When gamblers tug at the lever of a slot machine, the machine is programmed to reward them just often enough, and in just the right amounts, to reinforce the lever-pulling behavior and keep them putting money in. The effect is so powerful that it overrides the conscious knowledge most players have that, in the long run, the machines are programmed to make a net profit from customers, not to give money out.

Slot machine designers know a lot about human behavior and how it is influenced by experience (learning). The machines are required by law to pay out, on average, a certain percentage of the amount put in over time (say, a 90% payout), but the schedule on which a slot machine delivers its reinforcement is very carefully planned and programmed (mainly small payoffs, interspersed somewhat randomly). Interestingly, this effective type of reinforcement schedule originally comes from studies with non-human animals.
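To make the payout percentage concrete, here is a minimal sketch of the arithmetic. The 90% figure is just the example number from above, and the function name and the bet sizes are made up for illustration; real machines and real regulations will differ.

```python
def expected_loss(bet, spins, payout_rate=0.90):
    """Average amount a player loses after `spins` bets of size `bet`.

    `payout_rate` is the long-run fraction of wagered money the machine
    returns (the illustrative 90% from the text, not a real regulation).
    """
    total_wagered = bet * spins
    return total_wagered * (1.0 - payout_rate)


# Hypothetical session: 500 pulls at one dollar each.
print(expected_loss(bet=1.00, spins=500))
```

The point of the sketch is that the long-run loss depends only on how much is wagered in total. The schedule of small wins changes how long people keep wagering, not the percentage the machine keeps.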

When you put rats in a box with a lever, you can set up various contingencies under which pressing the lever releases food. You could release food on a fixed ratio of lever presses (every 10th press drops some food), or on a fixed interval (fifteen seconds must elapse after the last food delivery before a lever press will release food again). Alternatively, you could release it on a variable ratio of presses (on average it takes 10 presses to get food, sometimes more, sometimes fewer), or on a variable interval (on average food becomes available for a lever press every 15 seconds, but sometimes the wait is longer and sometimes shorter).
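The four schedules can be sketched in a few lines of code. This is only an illustration under stated assumptions: the class names are invented here, the variable ratio is modeled as an independent 1-in-n chance on every press, the interval schedules time the wait from the last food delivery (the usual laboratory definition), and the variable waits are drawn from an exponential distribution.

```python
import random


class FixedRatio:
    """Food on every n-th press."""
    def __init__(self, n):
        self.n, self.count = n, 0

    def press(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Each press pays with probability 1/mean_n (assumed model)."""
    def __init__(self, mean_n, rng):
        self.p, self.rng = 1.0 / mean_n, rng

    def press(self, t):
        return self.rng.random() < self.p


class FixedInterval:
    """Food on the first press at least `seconds` after the last food."""
    def __init__(self, seconds):
        self.seconds, self.last_food = seconds, 0.0

    def press(self, t):
        if t - self.last_food >= self.seconds:
            self.last_food = t
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but the wait is redrawn after every food delivery."""
    def __init__(self, mean_seconds, rng):
        self.mean, self.rng, self.last_food = mean_seconds, rng, 0.0
        self.wait = rng.expovariate(1.0 / mean_seconds)

    def press(self, t):
        if t - self.last_food >= self.wait:
            self.last_food = t
            self.wait = self.rng.expovariate(1.0 / self.mean)
            return True
        return False


def run(schedule, presses, seconds_per_press=1.0):
    """Press the lever at a steady pace and count the food deliveries."""
    return sum(schedule.press(i * seconds_per_press)
               for i in range(1, presses + 1))


rng = random.Random(0)  # fixed seed so the run is repeatable
print("fixed ratio:      ", run(FixedRatio(10), 100))
print("variable ratio:   ", run(VariableRatio(10, rng), 100))
print("fixed interval:   ", run(FixedInterval(15), 100))
print("variable interval:", run(VariableInterval(15, rng), 100))
```

Note that the simulated "rat" here presses at a constant pace. The sketch shows how each contingency hands out food, not how a real animal's pressing changes in response to it.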

A variable ratio schedule is perhaps the most relevant to slot machines. If you make food available on a variable ratio, you can make sure food is given out often enough that the task remains interesting (i.e. the rat doesn't give up on pressing the lever entirely), and you can also make it impossible for the rat to guess exactly when the reward is coming (it cannot simply count out 10 lever presses and expect food, or sit and wait 15 seconds before pressing). Indeed, since the rat knows only roughly when a reward might come, and not exactly which press will deliver it, the rat ends up pressing the lever over and over quite steadily. Other reinforcement schedules do not produce as consistent a pattern of behavior (the response curve is not nearly as steep or consistent).

Slot machine designers learned that lesson well and applied it to humans, who respond to a given reward contingency in much the same way. By providing payoffs on a variable ratio schedule, they give out money just often enough that people keep playing, and because a payoff comes on average every X plays, rather than exactly every X plays, the players cannot anticipate when the reward is coming (if they could, they would not bother playing when it was not coming). Any given pull could be the one that is reinforced, so players are less likely to give up. This keeps them in the seat the longest, tugging that lever repeatedly, because it always feels like they are on the verge of getting paid off.
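One way to see why the payoff can never be anticipated is to simulate it. The sketch below assumes the simplest possible model of a variable ratio, in which every pull independently pays with probability 1 in 10; the function name and the numbers are illustrative and do not describe how any real machine is programmed. Under that assumption, the chance that the next pull pays is the same no matter how long the losing streak has been.

```python
import random
from collections import defaultdict


def hit_rate_by_streak(mean_n=10, pulls=200_000, seed=0, max_streak=15):
    """Estimate the chance the next pull pays, grouped by how many
    losing pulls have happened since the last payoff."""
    rng = random.Random(seed)
    p = 1.0 / mean_n              # assumed: independent 1-in-mean_n chance
    tries = defaultdict(int)      # pulls made at each streak length
    hits = defaultdict(int)       # payoffs received at each streak length
    streak = 0                    # losing pulls since the last payoff
    for _ in range(pulls):
        tries[streak] += 1
        if rng.random() < p:
            hits[streak] += 1
            streak = 0
        else:
            streak += 1
    return {k: hits[k] / tries[k] for k in range(max_streak) if tries[k]}


for streak, rate in hit_rate_by_streak().items():
    print(f"after {streak:2d} losing pulls: payoff rate {rate:.3f}")
```

Every row comes out close to 0.1. A long dry spell does not make a payoff "due", and a recent win does not make the next one less likely, which is exactly what leaves the player with no good moment to walk away.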

The lesson here is not just meant for gamblers. Our modern life is so full of coercive techniques aimed at controlling our behavior (based on principles of learning and conditioning like those mentioned above) that we have come to expect no less. We recognize that television commercials use tricks to convince us to buy products. We recognize that speech writers and marketing and PR firms perform careful studies to determine how language and word choice contribute to supporting or extinguishing a behavior. These things still affect our behavior, but recognizing coercive techniques is one of our few defenses against their invisible pull. And so it is worth it for all of us to pick up a little knowledge about the field of learning and behavior analysis, to better understand how our own behavior is conditioned, so that we might take back as much control as possible.

Originally Written: 01-25-07
Last Updated: 01-25-07