Strange Loops - Controlling Behavior: Reward and Punishment

It is common sense that if you give someone a reward for doing something, they are more likely to do it, and if you punish them, they are less likely. However, scientists have discovered that things are a little more complex than that. Indeed, if you want to shape behavior (or understand how behavior is shaped by reward or punishment contingencies already in place), there are four distinct effects an outcome can have on an actor's behavior.

Positive reinforcement is the most straightforward setup. When a particular behavior (or response to something) results in the presentation of a new stimulus which causes that behavior to increase, that is positive reinforcement. Positive because a new stimulus was added (you can call it a reward), and reinforcement because the behavior in question increased. If I give a dog treats for rolling over on command, it will more often roll over on command than if I never rewarded the behavior.

Positive punishment is the next most straightforward contingency to control behavior. When an action leads to the presentation of a new stimulus which causes the behavior to decrease, that is positive punishment (positive for adding a new stimulus, and punishment because the behavior went down). An example of this would be someone hitting a dog with a rolled up newspaper every time it barked at a doorbell; if that leads to decreased barking, then the bark->newspaper contingency is a form of positive punishment.

Less obvious but just as important contingencies are negative reinforcement and negative punishment. These involve situations where some behavior leads to removal of a stimulus, and that in turn causes the behavior to increase (reinforcement) or decrease (punishment). So if taking a pain pill (behavior) to alleviate a headache (removal of an aversive stimulus) leads to taking pain pills for future headaches, the behavior has been negatively reinforced. On the other hand, when a misbehaving child has their favorite toy taken away by their parent in response to the misbehavior, and in turn the child acts better, then the parent has used negative punishment (removed a desirable stimulus) to decrease the behavior.
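
The four contingencies come down to two yes/no questions: was a stimulus added or removed, and did the behavior increase or decrease? As a rough illustration only, here is a minimal Python sketch of that two-by-two classification. The names `Contingency` and `classify` and the example labels are made up for this sketch and are not standard terminology from the behavioral literature.

```python
from enum import Enum


class Contingency(Enum):
    POSITIVE_REINFORCEMENT = "positive reinforcement"
    NEGATIVE_REINFORCEMENT = "negative reinforcement"
    POSITIVE_PUNISHMENT = "positive punishment"
    NEGATIVE_PUNISHMENT = "negative punishment"


def classify(stimulus_added: bool, behavior_increases: bool) -> Contingency:
    """Name the contingency from the two yes/no questions (hypothetical helper)."""
    # "Positive" vs. "negative": was a stimulus added or removed?
    sign = "positive" if stimulus_added else "negative"
    # "Reinforcement" vs. "punishment": did the behavior go up or down?
    effect = "reinforcement" if behavior_increases else "punishment"
    return Contingency(f"{sign} {effect}")


# The four examples from the text, encoded as (stimulus_added, behavior_increases).
examples = {
    "treat for rolling over": (True, True),
    "newspaper swat for barking": (True, False),
    "pain pill removes headache": (False, True),
    "toy taken away for misbehaving": (False, False),
}

for label, (added, increases) in examples.items():
    print(f"{label}: {classify(added, increases).value}")
```

Note that the classification depends only on what happens to the stimulus and to the behavior, not on whether the stimulus seems pleasant or unpleasant.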

All of this is well and good, and has been accepted for many decades by scientists who study behavior in humans and animals. Generally the other three contingencies - everything except positive reinforcement - are labeled aversive control. They control behavior by dealing in some way with aversive stimuli. Yet aversive control, especially positive punishment, is known to work well only in certain restricted circumstances.

For example, for punishment to be effective in the long run it must be immediate, consistent, abruptly introduced, and moderate in intensity early on. Someone could deliver a horrible blow to a child for goofing off and the child will immediately stop goofing off, but such fear tactics will at best be ineffective (the child will learn not to misbehave when the threatening parent is around, but won't change their behavior otherwise), and they almost always entail significant side effects.

One such side effect falls on the punisher. The more punishment is used to control behavior (say, yelling at a child to stop crying), the more that punishing behavior (yelling) gets reinforced for the punisher (the parent), because the yelling may temporarily reduce the child's crying. In other words, when you train another's behavior with punishment, you are simultaneously training yourself to use punishment tactics in the future. Other side effects include aggression problems (a child or dog that is beaten often develops an aggressive personality) and spread of aversion to the context (i.e. the environment where the punishment happens becomes aversive). There is also a general difficulty in changing an avoidance behavior once it is learned (so if you train your puppy not to bark by hitting it for barking, it will be harder to teach the dog to bark on command later).

For these reasons, some behavior analysts in recent times have been suggesting the use of only positive reinforcement to control behavior, wherever possible. Hence the rise in non-aversive, reward-only clicker training among dog trainers, a reorientation of parenting advice away from the spankings of yesteryear, time off for good behavior for prisoners, and so on. Of course, it also makes people feel better to use positive reinforcement wherever they can, due perhaps to aesthetic preferences against punishment (empathy might play a role in that). People have listened to the results presented by behavioral scientists and started changing how they go about controlling the behavior of others.

Unfortunately, the picture presented above may not be as straightforward as it seems. In a 2003 peer-reviewed article ("Negative Effects of Positive Reinforcement", in The Behavior Analyst v. 26), Michael Perone argues against the positive reinforcement-only paradigm, claiming that positive reinforcement has its share of problems. But more fundamentally, he says that aversive control is unavoidable, and in some cases may be conceptually indistinguishable from positive reinforcement.

The first argument he makes is that environmental context is what makes a stimulus presentation aversive or not. That is, despite popular misconceptions, a given stimulus is not aversive in and of itself (say, because it involves pain). He cites an experiment by Scripture in which a frog was put in an easily-escaped water chamber which was heated imperceptibly slowly until the frog was boiled alive without trying to escape. Here the stimulus was lethal (certainly what common sense would label aversive), but it was not technically aversive since it did not induce the frog to escape. On the opposite side, an oncoming car is aversive by definition, since it prompts avoidance, yet getting out of its way is not something intuition would call 'bad' or harmful.

Thus Perone is initially seeking to correct the popular misconception of aversive control as involving 'bad' things, in contrast to the 'good', safe rewards of positive reinforcement. He also cites a pair of rat studies to support this. In one study, rats act to avoid shocks greater than 1 mA (negative reinforcement) but not shocks under 1 mA. However, in another experiment involving positive food reinforcement, shocks under 1 mA had an aversive effect (positive punishment - they stopped responding to the food-delivery schedule when shocks were added). So whether something is aversive or not certainly depends on the context, not just the stimulus itself.

Further, though, Perone seeks to undermine the fundamental positive/negative distinction (in reinforcement specifically). Consider positive reinforcement, where a behavior leads to a stimulus which presumably makes things better (insofar as it causes the behavior to increase). In this case, the original situation before this behavior (and its reward) was relatively worse, relatively lacking. Thus the situation of positive reinforcement is at the exact same time negative reinforcement in that the behavior in question increases because it removes the relatively worse conditions of before. The target behavior is still the same. It is not a case of looking at two different behaviors and saying one is positively reinforced while another is negatively reinforced; rather, the exact same stimulus presentation for the exact same behavior can legitimately be seen under either light without any inconsistency.

In addition, Perone repeats a point made by B.F. Skinner that positive reinforcement can have aversive consequences later on. The behavior of eating junk food is positively reinforced by the introduction of good-taste stimuli that make a person more likely to eat junk food again. Risky, temporally-myopic decision making (like gambling) can be positively reinforced in the short term, leading to more risky behavior in the future. These things, however, can have problematic consequences later on which are, hopefully, aversive, and at the very least what common sense would call 'bad' (e.g. becoming obese or losing one's savings).

So in the end Perone argues that people should not be so afraid of aversive control (which might have 'bad' connotations in popular conception), nor put such unquestioning stock in positive reinforcement. Rather, he concludes, the focus should be on the results of a given contingency - whether the setup leads to the desired behavior in the long run (as he puts it, "behavior in the long-term interest of the individual") - and not on the type of contingency used.

Aversive control has its place, and it need not be what we think of as 'bad'; on the flip side, positive reinforcement is neither all-powerful nor devoid of its own 'bad' effects. Still, the warnings against certain types of aversive control certainly stand. In general it tends to be more effective to reward your child for good behavior (as an alternative that replaces the bad behavior) than to strictly punish bad behavior, and time-outs (negative punishment) will have fewer side effects than beatings (positive punishment).

Originally Written: 01-25-07
Last Updated: 01-25-07