You’re probably reading this on a web browser written in C++. C++ (pronounced see-plus-plus) has 63 “keywords” and about 30 or so arithmetic and logical “operators.” Compare this to the roughly 600,000 words in the last edition of the Oxford English Dictionary! With fewer than 100 basic building blocks, C++ can produce Firefox (for better or worse), Chrome, Internet Explorer (definitely worse), and most of the software you use every day. You don’t need a large, complicated system to get things done. You’re better off with a smaller and simpler one. (As a matter of fact, many programmers, myself included, think that C++ could probably lose 20 or so keywords and be the better for it.)
The four basic building blocks of operant conditioning, and thus dog training, can take us a long way toward describing and changing behavior. There’s a lot more going on than they describe, and sometimes they tend to oversimplify, but they still tend to get the job done.
When we want to train a new behavior we use positive reinforcement (R+ for short). We are adding something (a treat, a toy, praise...) that reinforces the behavior.
This means our dog found whatever we added as a consequence of the behavior reinforcing or, as we mortal humans without white lab coats might say, rewarding. This is the first area where some people get tripped up with positive reinforcement. Just because you think it’s reinforcing, that doesn’t mean your dog will.
For example, 45 minutes into a one-hour class, that yummy dog treat might start getting boring. As one of my dog trainer friends is fond of saying, “I love cheesecake! But after an hour it gets a little old.” You have to be willing to reward honest work with appropriate pay.
But the biggest mistake people make with positive reinforcement has to do with a slightly more complicated concept: reinforcement schedules. How frequently someone is rewarded for a behavior is just as important as, if not more important than, the type of reward. These schedules can get very complicated, but for our purposes we can keep it pretty simple.
If you’ve ever had a job for pay, you have been on a fixed reinforcement schedule. If you were paid every week, every two weeks, or once a month, you were on a fixed interval schedule. After a set period of time, you were paid.
The pitfall with these fixed schedules is pretty simple. What would have happened (or did happen) when it came time to get paid and you were not? Chances are you would have stopped working. Immediately.
So what happens if you give your dog a treat every time she sits for a few weeks, and then stop because “she knows it now and should just do it”?
Now think about the slot machines in Las Vegas or Atlantic City. We all know that there is no way that those machines can give out more money than is put in. But people stand there for hours, emptying their pockets into them, because the machines operate on a variable reinforcement schedule. The players never know when the next reward is going to come. This is very compelling, compelling enough that some people literally cannot stop.
Most trainers will tell you, in simpler terms, to work on a fixed reinforcement schedule when you introduce a new behavior, usually rewarding your dog every time she performs the behavior (technically a “continuous” schedule). When the behavior is “reliable” (the dog does it whenever we ask), move to variable reinforcement. Many trainers will tell you, “Become a slot machine!”
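For the programmers in the audience, here is a rough sketch of the difference between the two stages. It is only an illustration: the function names are made up, and modeling "become a slot machine" as an independent random chance of a treat on each sit is my own simplifying assumption, not a formal definition of a variable schedule.

```python
import random


def continuous_schedule(n_sits):
    """Reward every sit: the 'treat every time' stage for a new behavior."""
    return [True] * n_sits


def variable_schedule(n_sits, mean_sits_per_treat, seed=None):
    """Reward sits unpredictably, averaging one treat per mean_sits_per_treat.

    Assumption: each sit is rewarded independently with probability
    1 / mean_sits_per_treat, a simple stand-in for a slot machine.
    """
    if mean_sits_per_treat < 1:
        raise ValueError("mean_sits_per_treat must be at least 1")
    rng = random.Random(seed)  # private generator so a seed gives repeatable runs
    chance = 1.0 / mean_sits_per_treat
    return [rng.random() < chance for _ in range(n_sits)]


def show(rewards):
    """Render a session as a string: 'T' for a treat, '.' for no treat."""
    return "".join("T" if rewarded else "." for rewarded in rewards)


if __name__ == "__main__":
    print("continuous:", show(continuous_schedule(20)))
    print("variable:  ", show(variable_schedule(20, mean_sits_per_treat=4)))
```

Run it a few times and the continuous line is always a solid row of treats, while the variable line changes on every run. The dog (or the gambler) cannot tell from the last sit whether the next one pays.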
Many owners never get out of the fixed reinforcement stage or wait too long, and end up with what I call a “Show me the money!” dog. The dog will not perform unless she can see the treat.
Next week: how to use positive reinforcement to stop a behavior. (Think that doesn’t make sense? Good! You’re paying attention.)



Great explanation. A lot of trainers lose people when explaining operant conditioning because they don’t use enough words to explain it (e.g. real life examples).
Look forward to the next posts.
Ha! It is very funny that we both wrote about variable reinforcement schedules on the same day. I’m positive that I’m guilty of not switching things up enough after the initial learning period.
Do you happen to know if variable fear situations would work the same way? Like the occasional scary thing reinforces the fear in a bad way?
I’ve never seen anything regarding schedules and classical conditioning. It seems to me that one of the more insidious aspects of fear is that it generalizes very well and very quickly.
But how does repeated exposure affect things? Does it really make things worse? Strong fear responses don’t seem to require any maintenance. Time to do some research, I guess.
This is a great point to be made and often missed and misunderstood!
Classical conditioning works best when the conditioned stimulus is followed by the unconditioned stimulus every time (the opposite of operant conditioning). However, negative conditioned emotional responses can be achieved with only a few pairings and are tough to change.
So I think the answer to Roxanne’s question is yes: the occasional exposure to a scary thing helps to maintain and strengthen the negative response. Bummer, isn’t it?