Terrierman's Daily Dose: The Three Parts of Operant Conditioning

Thursday, February 25, 2010

What we call "dog training" is also called "operant conditioning."

For all the mumbo-jumbo you hear about dog training, there are only three basic parts to it: positive reinforcement, aversive reinforcement, and extinction.

Positive reinforcement is any kind of consequence that causes a behavior to occur more often. Examples include food, praise, and play. In some situations, positive reinforcement can be the removal of an aversive reinforcement.

Aversive reinforcement is a consequence that causes a behavior to occur less often. Examples include a leash pop, a harsh sound, or any kind of nonverbal aversive communication made through body movement or positioning. In some situations, punishment can also be the removal of a (positive) reinforcement.

Extinction is simply a complete lack of response. The nonresponse should be total -- no eye contact, no noise or sound triggered by the dog, and no responsive body movement. The dog is invisible.

Watch the short animated clip above, and you will note that the cartoon Cesar Millan uses all three methods to train South Park's Eric Cartman after "Super Nanny" collapses and goes insane in the face of the trials and tribulations of this spoiled-rotten child.

Step one in the Cesar Millan bag of tricks is to extinguish Cartman's negative behavior.

What Millan is doing by ignoring Cartman is signaling that a "new sheriff" is in town -- one that will not be overly reactive.

When Millan talks about "calm, assertive energy" what he is really saying is that the owners have to react less.

A calm owner is not sending a lot of signals, and an assertive owner is not sending tentative or confusing signals.

Send fewer signals. Send clearer signals. Do not be drawn into the dog or the child's drama in a kind of call-and-response situation.

By ignoring young Eric Cartman at the beginning, Millan is creating a "silence" which forces Cartman to pay attention. Suddenly he is not running the show, which means he now needs to pay attention to see how (and if) he can regain control. Cartman is used to running the show and he thinks that is his job. Millan is teaching him something else.

Cesar Millan puts up with a certain amount of nonsense from young Eric, and then he sends a negative signal. The signal has two components; one is tactile, and the other is oral (but not verbal).

Even as he sends the "punishment" of an unambiguous negative signal, Millan is also maintaining his control by ignoring Cartman.

Cartman is not able to "lead" the group by acting out. In fact, both Millan and Cartman's mom are ignoring him! He has gotten a negative reaction, but he has not gotten an empowering response that makes him the center of attention.

At the end of this clip, Millan is seen walking Cartman.

Walking does several things simultaneously-- it gives Cartman something physical to do, and it helps to drain off "the jitters" that both kids and dogs naturally have if they are kept cooped up for too long.

Taking Cartman for a walk also forces the Mother to spend "alone time" with Cartman -- a major reward for Cartman (attention-seeking is one reason he may have been acting out).

The act of taking Cartman for a walk also puts the Mother in the role of initiating, leading and ending the activity.

In short, walking the child or the dog is at once a reward (time with mother), a remedy (activity soothes anxiety), and a recapitulation of the pack hierarchy (the Mother is reinforced as the pack leader).

Watch any episode of The Dog Whisperer, and you will see Millan use these same three techniques over and over again.

And to recap, he is using ALL of the tools of dog training:

Positive reinforcement (reward)

Aversive reinforcement (punishment)

Extinction (nonresponse to minor inappropriate behavior that is not self-reinforcing).

Is Cesar Millan using dog treats and a clicker for positive reinforcement? No, not generally. But yes, that too is a way of giving positive reinforcement. Contrary to what some dog-training faddists might have you believe, however, click-and-treat is not the only way to give positive reinforcement.

Is the punishment harsh? No. Cartman is not being spanked, much less whipped with a telephone cord. What is happening here is simple communication. The goal is to get the child or the animal to understand what is not wanted, as well as what is wanted. Aversives do not need to be harsh for either a human or an animal to want to avoid them.

You will note that Millan does not always use a leash to train. It shocks people that Millan actually touches a dog! Oh. My. God.

But Millan is no fool -- he knows dogs in houses do not (and cannot) spend their life on a leash, but mild corrections are still needed. The answer: a simple tap with his fingers and a harsh (but not loud or overly threatening) sound serves as a warning that the immediate behavior is improper.

Millan's timing is excellent. He generally corrects dogs in mid-action, and so there is no ambiguity as to what is being said. Sometimes he will "body block" by squaring up his body with the dog -- a way of punctuating his message.

For the record, your life is a product of the same kind of operant conditioning that is being practiced by Cesar Millan.

You get to work on time because of the prospect of positive reinforcement (praise, pay and promotion) and negative reinforcement (criticism, demotion or termination).

If you tell a racist joke at the water cooler, and your coworkers turn away and act as if you are invisible, your bad behavior will be extinguished pretty quickly.

Here's a question: Do you think people would stop at red lights if they did not get traffic tickets for running through them?

Should a store owner praise you and tell you what a wonderful person you are when you pay for your goods, but simply look the other way if you steal them? If you steal from the store, should the limit of the store owner's displeasure be to tell you "no" and not praise you?

How do you think society would work if there was only praise and no punishment?

How do you think society would work if there was only punishment and no praise?

Think both of those questions over.

You see, the world needs balance. And it needs balanced trainers who come at the job with a complete set of tools.

As I have noted in the past, I can build a house with only six tools, but I need every one of them to do a credible job.

The fact that I do not use a level and a square as often as a saw and hammer does not make these two tools expendable.

And so it is with dog training.

I can train a dog with only three tools, but I need all three to do a credible job.

I would no more salute a dog trainer who never used aversive reinforcement than I would hire a builder who never used a level and a square, and for much the same reason -- lining things up and keeping them tight makes the entire structure more durable under stress and in bad weather.

And really, isn't that when we need a good house most?

As for Eric Cartman, how did the rest of his training go? Well, let's see:

The entire episode can be seen here.

Notice that young Eric Cartman has settled down pretty quickly.

Is he happy that he is not the center of attention and leading everyone around? Not yet! But Mrs. Cartman is not at her wit's end here -- a glimmer of hope is revealed because for the first time ever, Cartman is getting clear and consistent communication. Part of that communication is that bad behavior has consequences, and that the agenda is no longer being set by the small annoyance at the end of the leash.

In the end, Eric Cartman is completely transformed. No longer angry and out of control, he is getting regular positive feedback for engaging in model behavior.

He has learned the most important rule of society: Do good, get good; do bad, get bad.

But of course, it turns out that young Cartman's needs are easier to fill than his mother's!

When Cesar Millan leaves, Mrs. Cartman finds that she is lonely again, and she reverts to her old ways of making Eric the center of the house, sending the wrong signals, and relinquishing all power to "the little monster".

Any question as to how that ends?

Now to restate a point I have made before: Cesar Millan's way is not the only way to train dogs.

That said, all successful training methods are based on only three components: positive reinforcement, aversive reinforcement, and extinction. Almost everything else is chaining, shaping, timing and repetition -- methods to put a point on the pencil.

Different trainers will have different mixes of positive to negative reinforcement, and some will use extinction to better effect than others.

Some trainers are better at timing and nonverbal communication than others.

Different trainers will have different preferences in terms of rewards and aversives, and most good trainers will change those rewards and aversives based on the type, temperament and preference of the animal.

That said, if a trainer does not ever use extinction and does not ever use aversives in training, you do not have a complete trainer or a complete training system.

Can a man with just a hammer and a saw build a house?

Sure.

But remember that the house will be slower to build, will leak when it rains, and will be hot in summer and cold in winter.

Some people are fine with that -- "Hey, it's just a little cabin in the woods. I'm almost never there."

Other folks demand a higher standard. They want a carpenter with a tape measure, a square and a level as well as a hammer, a saw, and a glass cutter.

Not only will the house that carpenter builds go up faster, it will also do the job better in the long term.

Yes, both carpenters will be working with just a saw and a hammer most of the time, but those four other tools, properly used, actually do make a world of difference.

11 comments:

Heather Houlahan said...: This is not only the greatest episode of South Park EVER, it is better dog-training education than watching hours of the live-action Cesar Millan.

That's because of the final 30 seconds of the episode, when Mrs. Cartman flees back into her destructive parenting habits.

I'm serious. I recommend this episode to my training clients.

Also, greatest homage to the movie Altered States ever.; 11:26 AM
Marie said...: I've never bought into the clicker training method. Mother dogs don't use positive reinforcement to keep their pups in line. A bit of "corporal" punishment goes a long way to introduce boundaries for a dog. Dogs without boundaries are a real PITA.

Love my dog trainer, she is a common sense trainer that will use tools that are available to teach boundaries and are mixed in with positive reinforcement. She helps you to train your dog without breaking the spirit of the dog.

I always laugh when people get all shook up about prong collars. One of the best tools out there in dog correction. That collar has probably saved untold numbers of dogs from ending up in a shelter or the pound or on craigslist.

I have personal experience using the prong collar on my highly aggressive rescue terrier. Most people would have given up on him and either put him down or dropped him at the nearest shelter and let them deal with him. We used the collar on him to stop him lunging at other dogs/people. It was highly effective on him. Even though it looks like some medieval torture device, it is nothing of the sort.

Anyway, I digress. I totally agree about dog training.; 1:11 PM
Retrieverman said...: I can't stand Eric Cartman.

I've tried two or three times to get into the show.

I must be too liberal to like it, because to me, this is probably the most reactionary cartoon ever made.

And I include the comic strip Mallard Fillmore.

Every episode I've watched has had an anti-environmental or anti-regulation message.

Just give me Family Guy.

At least the dog on there has brains.; 2:25 PM
Viatecio said...: I could never get into that show either. Doesn't mean the episode isn't awesome, but as a whole, I just never saw the humor. I've heard people say the same thing about FG, but that humor strikes the funny bone a little better.

I've never bought into "different methods for different dogs." I buy into the "Different dogs" thing, but aren't they all the same animal that can be taught using a VARIANCE of one main technique based on what's right for the individual personality, temperament and drive?

I've only ever worked with balanced trainers. For some reason, I've never met a positive-only or clicker-trainer who didn't act like the world was full of sunshine and bunnies. The ones I've had the pleasure of meeting were all like birds: they chirped when they spoke. Maybe I just need to meet different people...but the balanced trainers I've had contact with were some of the most realistic, down-to-earth trainers I've ever met.

In my forays and looks into the training industry in hopes to join their ranks, it completely blows my mind that I am judged by the tools I use rather than the technique or even the results of my work. The Litmus Test of "So, stranger, how do you get a dog to stop pulling on a leash" is getting to me, because the answer I give will be based on the TOOL I use, which will then garner the usual "Tsk" and the "Cruel Trainer" label. Anyone who I've ever talked to who's done this has never asked to watch my dog work, nor asked for further details...they just rip into me about how inhumane the "old school" ways are, and by the way, why not try a "Gentle" leader? *GAG*BARF*; 5:16 PM
Seahorse said...: Retrieverman, it's not your liberalism that makes you not like South Park, it's just whatever makes you laugh, or not. I'm as liberal as they come, and South Park makes me howl with laughter. And Cartman is supposed to make you hate him, so that much is workin' for you. When Martha Stewart had her little brush with the law and South Park had a parody on the air in ten DAYS, I was amazed. And I cried laughing. I admire the fact that there are no sacred cows for the writers; given enough time, everyone takes a shot to the gut. Different strokes...s'o.k.! ;)

Seahorse; 12:06 AM
Paula McCollum & the Blueticks said...: Bravo!! Excellent post and excellent breakdown of the episode. That is one of my all time favorite South Parks.; 1:08 PM
Kali said...: Great post. Just be aware that you're not using the official/academic definitions of operant conditioning. But then, neither do 99% of "positive reinforcement only" trainers.

The common mistake boils down to equating "positive" with "good" and "negative" with bad, and not making a distinction between "reinforcement" and "punishment".

"Positive" in behaviorism-speak is supposed to mean "added". "Negative" obviously means "taken away".

Reinforcement is anything that *increases* the likelihood of a behavior, and Punishment is anything that *decreases* the likelihood of a behavior.

One point of confusion lies in the fact that, in the case of corrections, you can look at it both ways--does the leash correction count as punishment, because it decreases the behavior of "disobedience", or does it count as reinforcement, because it increases the behavior of sitting? I believe it counts as REINFORCEMENT, because by definition a correction PUTS THE DOG RIGHT (i.e. in the position the dog should have assumed himself), and "not sitting" is not really a behavior.

This is to be distinguished from a pure punisher, which simply stops a behavior, but doesn't necessarily encourage anything else in its place.

Example of positive reinforcement: tossing a treat when the dog sits. Petting/praise when the dog sits.

Example of negative reinforcement: removal of upwards leash pressure when the dog sits. Or, the ear pinch stopping when the dog retrieves.

Example of positive punishment: an "OUT" correction as described in the Koehler method of dog training.

Example of negative punishment: Removing the dog from the dog park when he fails to come when called. In other words, you remove something (the fun of the dog park) to decrease the behavior (ignoring recall commands)

Any trainer who limits themselves to only one "quadrant" is doing themselves and the dog a disservice.; 3:11 PM
PBurns said...: Kali, your comments are technically correct, but also a good example of why I think so many people get confused and start to fight basic dog training.

Three points:

1. Dog training is NOT complicated. Anyone can do it.

2. You do NOT need to use terms of art to establish authority.

3. Education starts with communication, which means it helps to speak English as commonly understood, especially at the beginning.

If you are teaching sailing, for example, the front of a boat is the front of the boat, not "the bow." The back of the boat is the back of the boat, not "the stern." A railing is a railing, not a "stanchion." A rudder hangs from a hinge, not a "gudgeon and pintle."

YES, I can teach people how to sail using terms of art, but why would I want to do that? If I am trying to get folks to learn how to sail, I need them to understand me, not be impressed that I know a shroud from a halyard. Most of all, I do not need the folks I am talking to to spend 80 percent of their brains trying to learn a foreign language.

Your terms are technically correct if you are neck-deep in the literature, but I think they are also the WRONG ONES to use in the real world of dog training.

No one wants to "punish" their dog, and certainly not during training. It is the WRONG word to use most of the time, and I try not to use it, even if it has been copied from one book to another for 60 years.

When folks jerk on a collar chain or put their hand in front of a dog's face, they are using an aversive signal to communicate. It is not "punishment" -- it is communication. It is communication through an aversive, and it reinforces a desired behavior, just as positive reinforcement does the same thing using a different technique. Two sides to one coin.

Try using simple words that mean something to people, like "relax the leash" or "pop the leash" and people will generally get it. Always keep it simple and avoid jargon and people will get it.

Why do I present operant conditioning as having only three parts? Well, for one, it's true!

It's also correct communication.

One of the axioms of communication is that humans can generally remember only three things.

As a consequence, religion has "the father, son and holy ghost."

Music has Do-Re-Mi.

The alphabet is "the A-B-C's."

... And no one can remember the last of the "Four Freedoms."

The good news is that operant conditioning really does have only rewards, aversives and extinction.

Everything else is variations on a theme, and "a point on a pencil."

We can get to those later. But at the beginning, three things are all anyone's going to remember!

Hope that makes sense.

P; 6:32 PM
Kali said...: Not sure if my other response came through, something went wonky after I hit the "publish" button, but in any case, I will assume it did.

I also thought this point you made was interesting: "By ignoring young Eric Cartman at the beginning, Millan is creating a "silence" which forces Cartman to pay attention. Suddenly he is not running the show, which means he now needs to pay attention to see how (and if) he can regain control. Cartman is used to running the show and he thinks that is his job. Millan is teaching him something else."

I never quite thought of it as "creating a silence", but that's an interesting way to put it.

It very much resembles the early longe line work in Koehler training. The first two or three days are spent with the dog on a 15 foot longe line, given the full length of line to explore, while the trainer simply walks resolutely and silently from Point A to Point B to Point C to Point A etc, while making a point to not look at the dog, speak to the dog, interact with the dog, respond to anything that the dog does, allow the dog to stop their forward progress, pull them off line, or push them off their path of travel by getting in the way.

I am always shocked at the number of dogs who respond to this ridiculously undemanding exercise by frantically nipping, jumping, and otherwise harassing the trainer in a desperate attempt to get them to respond so they can regain control. This crap usually starts up after the first or second time the dog reaches the end of the line and the trainer doesn't stop walking--i.e. doesn't indulge the dog.; 12:06 AM
J.Deans said...: Great post, and great replies!
I was actually just talking to my trainer/behaviourist friend about the wording used to describe operant conditioning, and we both agreed that it is completely useless when training owners how to train their pets. The average dog owner doesn't need to know these terms, as they even confuse some of the experts. They are great and dandy in a text book, or for those who like to try and sound edu-macated (and I see a lot of these terms thrown around so often on some of the web boards by people who are trying to sound like they know something about dog training) but for Joe-average dog owner, they are as useless as calculus is to me.
It's best to stick to the basics, as you pointed out Patrick. Points are easier to get across that way, confusion is reduced and you enable people to retain what they've been told. Owner and dog can only benefit from keeping it simple stupid.; 8:44 AM
Kali said...: Looks like my first reply didn't make it through! I'll try again. This is in response to P's response to my original reply (when I gave the "official" definitions of OC).

Believe me, P, I wholeheartedly agree with you that the "official" behavioristic vocabulary is confusing, useless for the average pet owner, and even ugly. As you point out, dog training is not magic. The secrets to success can generally be distilled to a few short idioms, like "What you pet is what you get", "Train the dog, and don't let him do that", "Make the right thing easy and the wrong thing hard." Etc. I frequently wonder why we call it DOG TRAINING anyway, since it's the owners who need the "training" most of the time. The dogs are generally fine and quite willing to learn.

I just thought I'd point out how the official terms work. Since you didn't mention them in your original post, it was unclear as to whether you were revising them or misinterpreting them. I wanted to assume the former, but figured it would be interesting to talk about them for the benefit of those who might not be familiar, or who have been exposed to the confused version espoused by most dog trainers, which I think tends to reveal their own training biases.
The interesting way that they misunderstand them suggests that they don't actually know how to use those concepts successfully in training. It's hard for me to respect someone's opinion on useful techniques like negative reinforcement if they think it just means "applying pain".

Certainly, when I'm training a dog or teaching someone else to train their dog, you will never hear me talk about "behaviors" (ugh, as if it was pointless to refer to intention), stimulus, or "positive" or "negative" reinforcement anything. I find it an artificial, unimaginative, and incomplete way of talking about our interactions with dogs.

Instead, there are a lot of words like praise, correction, responsibility, authority, respect, language, "mean what you say, say what you mean", conversation, listening, etc being thrown around. The only time I talk about "punishment" is to distinguish it from "correction".

(Yup, that's a lot of Vicki Hearne, and thus Koehler, you're hearing there.)

Hope that makes my stance on the issue more clear.; 11:59 AM