B. F. Skinner

B. F. Skinner, 1904-1990
The major problems of the world today can be solved only if we improve our understanding of human behavior.

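American behaviorist psychologist, a leading figure of neo-behaviorism and the founder of operant conditioning theory. He devised the Skinner box, an apparatus for studying animal learning. He was elected to the National Academy of Sciences in 1950, received the Distinguished Scientific Contribution Award of the American Psychological Association in 1958, and in 1968 received the National Medal of Science, the highest scientific honor conferred by the President of the United States.

Skinner, Leader of Behaviorism

B. F. Skinner (1904-1990) was the best-known representative of the behaviorist school and one of the most famous psychologists in the history of the field. His ideas are still widely applied today in psychological research, education, and psychotherapy.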

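Skinner was born in a small town in Pennsylvania, where his father was a local lawyer. From childhood he loved building gadgets, and as a behaviorist psychologist he later invented and modified many devices for animal experiments. In high school and college he aspired to become a writer, and he won a special prize in Greek. He tried his hand at literary work, but soon found that both he and other writers understood pitifully little about human behavior. To understand human behavior more deeply, he turned to psychology.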

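While doing graduate work in psychology at Harvard University, he was drawn to behaviorism and became a thoroughgoing behaviorist, beginning his lifelong career as a psychologist. Building on Watson and others, he took a large step forward: he identified a kind of conditioned behavior distinct from Pavlov's conditioned reflex, distinguished the two, and on that basis proposed his own behaviorist theory, the theory of operant conditioning. He spent many years studying operant conditioning in pigeons and rats, introduced the concept of "immediate reinforcement" and the temporal regularities of reinforcement, and built a theoretical system of his own.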

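Skinner also applied operant conditioning theory to the study of human beings. In his view, people have neither dignity nor freedom: whether a person performs or refrains from an action depends on a single factor, the consequences of the behavior. People cannot freely choose their behavior; they act as rewards and punishments dictate. On this view, humans have neither the freedom to choose their actions nor any dignity, and are no different from animals.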

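Skinner also extended his reinforcement theory to educational psychology. He proposed a new model of education and designed new teaching machines. Under his leadership new teaching materials were written and teaching machines came into wide use in schools and colleges, setting off a vigorous programmed-instruction movement in education.

Skinner promoted operant conditioning theory in many fields. In psychotherapy he proposed behavior modification techniques for shaping behavior, which use rewards and punishments consistently to encourage desirable behavior and change undesirable behavior. The behavior modification techniques of the behaviorist school are still widely used in psychotherapy today.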

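Skinner also set out his vision of an ideal society. In his well-known book Walden Two he described a utopian society in which children receive strict behavioral training through reinforcement from the day they are born. They are trained to be cooperative and sociable, and all training serves the interests and happiness of every member of society. The book was highly acclaimed in the United States, especially among college students, and in Virginia a commune was actually founded on the model of Walden Two.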

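Skinner's reputation among the American public was far greater than his reputation within psychology. One admirer wrote: "(Skinner is) a famous figure out of myth ... a scientist-hero, a Promethean fire-bringer, a master technologist ... an iconoclast, a man unafraid of authority, who has liberated our thinking from its ancient limits." The praise is somewhat exaggerated, but Skinner's contributions to psychology remain indelible.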

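Life

Skinner (Burrhus Frederic Skinner, 1904-1990) was one of the founders of neo-behaviorist psychology. He was born on March 20, 1904, in a small railroad town in northeastern Pennsylvania. From childhood he loved inventing things and had an adventurous spirit. At 15 he and several friends paddled a canoe 300 miles down a river. He also tried to build a simple glider, and he once converted a discarded boiler into a steam cannon that shot potatoes and carrots onto the neighbors' roofs.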

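In 1922 Skinner entered Hamilton College, where he majored in English literature and began to write. Because he was deeply interested in animal and human behavior, he also took courses in biology, embryology, and cat anatomy. Guided by his biology teacher, he read scientific works such as Loeb's Comparative Physiology of the Brain and Comparative Psychology and Pavlov's Conditioned Reflexes, as well as a philosophical work by Russell (《哲学原理》) and Watson's Behaviorism. These works had a great influence on his later academic achievements.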

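In 1926 Skinner graduated from Hamilton College and went on to the psychology department at Harvard University. At Harvard he set himself an extremely strict schedule: from 6 in the morning until 9 at night, almost every minute went to the study of psychology and physiology. He saw no films or plays and declined all social engagements. The effort paid off: Skinner received his master's degree in psychology from Harvard in 1930 and his doctorate in psychology in 1931, after which he stayed on at the university as a research fellow. From 1936 to 1945 he taught psychology at the University of Minnesota, and from 1945 he chaired the psychology department at Indiana University. In 1948 he returned to Harvard as a tenured professor of psychology, where he pursued experimental research on behavior and its control.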

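Skinner's achievements in psychological research were outstanding. He extended the work of Pavlov and Thorndike and uncovered the laws of operant conditioning. The "Skinner box," the experimental apparatus he designed for studying operant conditioning, has been widely adopted by psychologists and biologists around the world. His pigeon laboratory at Harvard became famous. Drawing on his research on operant conditioning and reinforcement, he invented the "teaching machine" and designed "programmed instruction," which had a deep influence on American education and earned him the title "father of the teaching machine." In recognition of his major contributions to psychological science, the American Psychological Association gave him its Distinguished Scientific Contribution Award in 1958, and in 1968 he received the National Medal of Science, the highest scientific honor in the United States. In 1971 the American Psychological Foundation awarded him a gold medal. On August 10, 1990, the American Psychological Association presented him with a citation for outstanding lifetime contribution to psychology. Eight days later, on August 18, Skinner died.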

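Skinner wrote prolifically. From 1930 onward he published more than a hundred papers and 12 monographs. His major works include The Behavior of Organisms: An Experimental Analysis, Science and Human Behavior, Verbal Behavior, The Science of Learning and the Art of Teaching, and Teaching Machines. These works set out the theory of operant behaviorism and its application to teaching. He also applied operant behaviorism to questions of social life, publishing the novel Walden Two as well as Freedom and the Control of Men and Beyond Freedom and Dignity. These works provoked a strong response and heated debate in American society.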

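Major Works

1938
The Behavior of Organisms: An Experimental Analysis. Describes the laws of learning empirically, based on observations of white rats and pigeons, and thereby laid the foundation for the principles of operant conditioning.

1948
Walden Two. Drawing on scientific principles of human behavior, it attempts to depict an ideal society governed by methods of positive control.

1953
Science and Human Behavior. Examines important aspects of human behavior such as thinking, the self, and socialization.

1957
Schedules of Reinforcement

1957
Verbal Behavior

1959
Cumulative Record: Definitive Edition

1961
The Analysis of Behavior: A Program for Self-Instruction

1968
The Technology of Teaching. Examines the application of his basic principles to human learning.

1969
Contingencies of Reinforcement: A Theoretical Analysis

1971
Beyond Freedom and Dignity. A summary of his own views, with rebuttals of various criticisms.

1974
About Behaviorism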

1976
Particulars of My Life: Part One of an Autobiography

1978
Reflections on Behaviorism and Society

1979
The Shaping of a Behaviorist: Part Two of an Autobiography

1983
A Matter of Consequences: Part Three of an Autobiography

1983
Enjoy Old Age: A Program of Self-Management

1987
Upon Further Reflection

1989
Recent Issues in the Analysis of Behavior

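Operant Conditioning Experiments

Skinner began experimental research on animal learning in the late 1920s, as a graduate student at Harvard. His subjects were mostly animals such as white rats, pigeons, and cats. He built his own apparatus. The early apparatus was fairly simple and suited to white rats: a rectangular runway was mounted on a transverse axle and balanced like a seesaw on a fixed board. When the rat ran from one end of the runway to the other, the runway tilted. Each time it tilted, an arm engaged a wheel mounted beside it and advanced the wheel by one notch, so that the food held in that notch dropped through a funnel into the food tray. In this way the rat learned to run back and forth in the runway, earning food by its own actions (see the figure at right).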

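In the later 1930s, to study operant conditioning, Skinner carefully designed a special apparatus: a darkened, sound-attenuating box containing a switch (a small lever or board when the subject is a white rat, a key when it is a pigeon). The switch is connected to a recording system outside the box, which records as a line the number and timing of the animal's presses or pecks (see the figure at left). This apparatus is called the "Skinner box." In an experiment, food is not necessarily delivered for every lever press or key peck; the experimenter decides how food is released. On one wall of the box a bar is mounted just above a small food tray and a water spout. The rat moves about the cage, and when it happens to put its forepaws on the bar and press it down, a pellet of food drops automatically into the tray. Equipment connected outside the cage automatically draws a line on a moving paper tape, recording minute by minute how many times the bar is pressed and thus recording the rat's behavior. This was a great advance over Thorndike's puzzle box: data were easier to collect and the experimenter's work was simpler. The experimenter did not need to watch the rat constantly or hand over food the moment the bar was pressed, but only to read the record on the tape.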
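The minute-by-minute record described above is essentially a cumulative count of responses over time. As a rough illustration only, the Python sketch below turns a list of response timestamps into such a record. The function name, the one-minute bin width, and the sample timestamps are assumptions made for this example, not details of Skinner's recorder.

```python
def cumulative_record(response_times, duration, bin_width=60):
    """Return the cumulative number of responses at the end of each time bin.

    response_times: times (in seconds) at which the bar was pressed.
    duration: total session length in seconds (an integer).
    bin_width: width of each bin in seconds (one minute by default).
    """
    n_bins = -(-duration // bin_width)  # ceiling division
    counts = [0] * n_bins
    for t in response_times:
        if 0 <= t < duration:
            counts[int(t // bin_width)] += 1

    record = []
    total = 0
    for c in counts:
        total += c          # the pen only ever moves upward
        record.append(total)
    return record


# Hypothetical three-minute session with five bar presses.
presses = [5, 30, 61, 62, 130]
record = cumulative_record(presses, duration=180)
```

A steep rise between successive entries corresponds to a high response rate, and a flat stretch to no responding, which is how such records are read.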

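Operant Conditioning Theory

Respondent and operant behavior: classical conditioning and operant conditioning

Skinner's behaviorism differs markedly from Watson's in one respect. Watson held to the maxim "no stimulus, no response." Skinner considered this view incomplete and inaccurate. He proposed distinguishing "elicited responses" from "emitted responses," and on that basis distinguished two kinds of behavior: respondent behavior and operant behavior. The former is behavior elicited by a specific, observable stimulus; in Pavlov's laboratory, for example, a dog salivates at the sight of food or a light, and the food or light is the identifiable stimulus that elicits salivation. The latter is behavior that occurs without any observable external stimulus and appears to be spontaneous; for example, no obvious stimulus can be identified for a white rat's lever pressing in a Skinner box. Respondent behavior is relatively passive and is controlled by stimuli, whereas operant behavior represents the organism's active adaptation to its environment and is controlled by the consequences of the behavior. Most human behavior is operant, such as swimming, writing, and reading.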

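On this basis Skinner went on to distinguish two forms of learning: classical conditioning, which shapes an organism's respondent behavior, and operant conditioning, which shapes its operant behavior. Western scholars regard the two as different associative processes: classical conditioning is an S-R process, and operant conditioning is an R-S process. This supplemented and enriched the original behaviorist formula.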

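Skinner's operant conditioning compared with Thorndike's law of effect

Thorndike's law of effect holds, in essence, that if the occurrence of an operant is followed by a reinforcing stimulus, the strength of the response increases. Both accounts thus involve the notion of reinforcement. In Skinner's analysis of behavior, however, the role played by reinforcement changed significantly.

First, for Thorndike reinforcement was a main principle for explaining the strengthening of stimulus-response connections, whereas in Skinner's system reinforcement is only a term describing an increase in the probability of a response. What reinforcement increases is the probability that a response occurs, and the central question is how reinforcement is arranged.

Second, other learning theorists (such as Pavlov) regarded extinction as an active process of inhibition, whereas Skinner held that extinction cannot be treated as an independent process unrelated to reinforcement. Reinforcement also governs extinction: when reinforcement stops, the probability of the response declines. The course of extinction can be used to show how long the effect of reinforcement lasts.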

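Recalling Thorndike's puzzle-box experiment with cats and comparing it with Skinner's operant conditioning experiment, we find both differences and similarities.

The difference: in Thorndike's puzzle-box experiment the stimulus situation comes first and the chance response follows, whereas in Skinner's box experiment the emitted response comes first and the reinforcing stimulus follows.

The similarity: both kinds of learning depend on the animal emitting a response on its own.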
　　
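Conditioning Theory

1. Establishing operant conditioning

If the occurrence of an operant is followed by a reinforcing stimulus, its strength increases. The principle underlying Skinner's operant conditioning has been borne out in the learning of many animals and of humans. For example, a pigeon that happens to raise its head and is reinforced will go on raising its head; a baby happens to say "ma," the mother responds with a smile and a caress, and the child learns to say "mama." On this principle Skinner even succeeded in training two pigeons to play a kind of ping-pong. In practice, with a skillfully arranged program of reinforcement, animals can be trained to acquire many complex behaviors.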

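2. Extinction of operant conditioning

On extinction, Skinner summarized: "If the occurrence of an operant already strengthened through conditioning is not followed by the reinforcing stimulus, the strength is decreased." Thus, as with the formation of conditioning, the key to extinction also lies in reinforcement. For example, if a white rat's bar pressing is no longer reinforced, the pressing stops. If a student's good response does not receive enough attention and praise from the teacher, the student eventually gives up the effort to respond well.

Extinction, however, is a process. A learned behavior does not cease the moment reinforcement stops; responding continues for a while and eventually dies out. Skinner showed experimentally that a white rat that had learned to press the bar would still press it some 50 to 250 times after reinforcement was discontinued before finally stopping. The time to extinction increases with the strength of the learned response: the more firmly established the response, the longer extinction takes, and vice versa. In the experiment above, rats that had been reinforced many times pressed the bar about 250 times after reinforcement stopped, whereas rats that had been reinforced only once pressed about 50 times. The length of the extinction process is therefore also one of Skinner's measures of the strength of operant conditioning.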

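Skinner held that learning is a kind of behavior: when the subject learns, the rate of responding rises, and when it does not, the rate falls. He therefore defined learning as a change in the probability of response. In his view learning is a science, and the learning process is gradual and orderly, whereas teaching is an art: the art of bringing the student together with the curriculum and of arranging contingencies of reinforcement to promote learning, with the teacher acting as a supervisor or intermediary. Skinner sharply criticized traditional classroom teaching as inefficient and of low quality. On the basis of operant conditioning and positive reinforcement he set out to reform teaching and designed teaching machines and a scheme of programmed instruction.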

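A teaching machine is a device shaped like a small box and containing precise electronic and mechanical components. It consists of four parts: input, output, storage, and control. The teaching material is broken down into a program of several hundred or even several thousand question frames, linked to one another in a carefully graded sequence. Each step is one frame: a student who answers a frame's question correctly moves on to the next frame, and a student who answers incorrectly is shown the correct answer before moving on. The answer to the preceding frame is shown at the left of each frame, where it serves as a prompt for that frame's question. When one program is finished, the student goes on to the next. Skinner held that using teaching machines in the classroom has many advantages over traditional class teaching. First, the teaching machine reinforces correct answers immediately, and prompt feedback on results strengthens motivation; in class teaching the interval between behavior and reinforcement is long, which greatly weakens the reinforcement. Second, traditional teaching relies mainly on aversive stimuli to control students' behavior: students study in order to avoid low marks or humiliation by teachers, classmates, and parents, and so lose interest in learning. The teaching machine gives students positive reinforcement, so the wish to get the right answer becomes what drives their learning, and efficiency improves. Third, with teaching machines one teacher can supervise a whole class at once while each student completes as much work as possible. Fourth, the teaching machine lets students progress step by step at their own pace (a student who has been away from school can resume at the level reached before leaving), which makes for firmer mastery of the material and a greater sense of responsibility for learning. Fifth, with teaching machines the teacher can arrange very complex subject matter as a continuous sequence and design a series of reinforcement contingencies. Sixth, the teaching machine can record the number of errors, giving the teacher a basis for revising the program tape and so improving the results of teaching. Seventh, learning engages hand and mind together, which fosters the ability to study independently.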

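Machine teaching requires the content to be written as a program and loaded into the machine, so machine teaching is programmed instruction; programmed instruction, however, does not necessarily require a machine. Skinner's programmed instruction rests on five main principles.

First, active responding. Skinner held that in traditional classroom teaching the teacher talks and the students listen. Students play the part of a passive audience and have little chance to respond actively, widely, and often. Traditional textbooks likewise give students no opportunity to respond actively to each unit of information. Programmed instruction presents knowledge in the form of questions, and students respond actively throughout by writing, speaking, calculating, choosing, comparing, and so on, which raises the efficiency of learning.

Second, small steps. Skinner divided the material for programmed instruction into small, logically ordered units arranged as a program, each step slightly harder than the one before. Learning in small, ordered steps is one of the key principles of programmed instruction. The basic cycle is: present a question (step one), the student answers, the answer is confirmed, proceed to step two, and so on until the program is complete. Because knowledge is presented gradually, it is easy to understand, and the student can stay confident throughout.

Third, immediate feedback. Skinner held that every response a student makes should receive feedback at once: immediate reinforcement is the best way to control behavior and establishes the behavior firmly. The faster the feedback, the greater the reinforcing effect. The most common forms of reinforcement are learning the result immediately and moving from one frame to the next. Reinforcement of this kind effectively helps students gain confidence.

Fourth, self-pacing. In any class the students' levels of attainment typically range from high to low. Traditional teaching moves at a single pace, makes little allowance for individual differences, and so holds back students' free development. Programmed instruction is student-centered: it encourages students to learn at the pace that suits them best and, through continual reinforcement, gives them an incentive for steady progress.

Fifth, a minimal error rate. The teaching machine has a device that records errors. From that record the writer of the program can gauge students' actual level and revise the program to fit it better. And because the material runs from easy to hard and from the known to the unknown, students are likely to respond correctly each time, which keeps the error rate to a minimum. Skinner held that students should not be left to make mistakes and then learn to avoid them: errorless learning stimulates motivation, strengthens memory, and raises efficiency.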

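Teaching carried out with a teaching machine is called "programmed instruction," and teaching based on the same ideas without a machine may be called programmed instruction as well. Its basic requirements are:

(1) The teacher writes a series of stimulus (question) -> response (answer) frames that present the content in small steps from easy to hard.
(2) Students must learn actively, that is, respond to the content (question) presented in each frame.
(3) Each response (answer) receives immediate feedback (the correct answer is shown).
(4) Questions are arranged so that students can usually respond correctly and be reinforced promptly.
(5) Each student completes the whole program at his or her own pace.
(6) Diligent and successful students are given plenty of supportive reinforcers.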
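
The frame-by-frame cycle listed above (present a question, take the student's response, give immediate feedback, record errors, move on) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `Frame` type, the `run_program` function, the answer-matching rule, and the arithmetic frames are all hypothetical and invented for the example; they do not describe any actual teaching machine.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """One small step of the program: a question and its expected answer."""
    question: str
    answer: str


def run_program(frames, get_response):
    """Present frames in order and return the indices of frames answered wrongly.

    get_response is any callable that takes the question text and returns the
    student's answer; it blocks until the student replies, so the student sets
    the pace.
    """
    errors = []
    for index, frame in enumerate(frames):
        response = get_response(frame.question)        # active response
        if response.strip().lower() == frame.answer.lower():
            print("Correct.")                          # immediate feedback
        else:
            print(f"The answer is: {frame.answer}")    # show the correct answer
            errors.append(index)                       # error record for revision
    return errors


# Hypothetical three-frame program, graded from easy to slightly harder.
program = [
    Frame("2 + 3 = ?", "5"),
    Frame("5 + 4 = ?", "9"),
    Frame("9 + 6 = ?", "15"),
]
```

In interactive use one might call `run_program(program, input)`. The returned error list plays the role of the machine's error record, which the program writer can use to revise frames that students often miss.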

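Skinner moved with the times and opened the way for computer-assisted instruction in education. Since its introduction, programmed instruction has had considerable influence in the United States, Western Europe, and Japan, and has been widely used to teach subjects such as English, mathematics, statistics, geography, and science. Its strategy is too rigid, however: it concentrates on analyzing the material and breaks it into fragments, damaging the coherence and integrity of knowledge. Programmed instruction emphasizes transmitting knowledge and lacks exchange between teacher and students and discussion among students, which hinders the development of creative thinking. It can therefore serve only as an auxiliary means of teaching.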

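Evaluation of Skinner's Learning Theory

Skinner made major contributions to the study of learning theory, chiefly in the following respects:

1. Skinner identified the phenomenon of operant conditioning and studied it carefully, both experimentally and theoretically. This work enriched experimental research on conditioning, filled a gap in the typology of conditioning, and overturned the traditional behaviorist maxim "no stimulus, no response."

2. Skinner's experimental work on "errorless discrimination" learning is significant. It offers lessons both for training animal behavior and for shaping students' behavior, and it also provides guidance for classroom teaching.

3. Skinner's experimental research on schedules of reinforcement was thorough, concrete, and highly systematic, and the regularities it revealed are objective and reliable. They are required knowledge for animal trainers and are also a valuable reference for managing human behavior and for controlling and motivating students' learning.

4. The programmed-instruction movement that arose in the 1950s is clearly owed to Skinner. This work spurred deeper research on individualized forms of instruction.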

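Limitations of Skinner's learning theory

1. Skinner made the same mistake as the traditional behaviorists: he attended only to describing behavior, not to explaining it, and only to external responses and their outward consequences, not to internal psychological mechanisms. He treated internal processes as a "black box." He was therefore a radical behaviorist, and some have called his system "descriptive" behaviorism.

2. In his later years Skinner still held to his behaviorist views, opposing research in cognitive psychology and cognitive explanations of learning and of the shaping of behavior. From the standpoint of behaviorist psychology he was a steadfast behaviorist; from the standpoint of cognitive psychology and the development of the discipline, he was intransigent.

3. The programmed instruction that Skinner championed did not work as well in practice as he had expected. Classroom experience showed that it reduced opportunities for direct dialogue between teacher and students and impeded timely exchange between them, to the serious detriment of students' learning. Students working at teaching machines also tended to chase progress blindly, guess at answers, and settle for superficial understanding. These drawbacks kept the programmed-instruction movement from developing further, and it remains only as one of the individualized teaching methods in the history of education.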

A Brief Survey of Operant Behavior

It has long been known that behavior is affected by its consequences. We reward and punish people, for example, so that they will behave in different ways. A more specific effect of a consequence was first studied experimentally by Edward L. Thorndike in a well-known experiment. A cat enclosed in a box struggled to escape and eventually moved the latch which opened the door. When repeatedly enclosed in a box, the cat gradually ceased to do those things which had proved ineffective ("errors") and eventually made the successful response very quickly.

In operant conditioning, behavior is also affected by its consequences, but the process is not trial-and-error learning. It can best be explained with an example. A hungry rat is placed in a semi-soundproof box. For several days bits of food are occasionally delivered into a tray by an automatic dispenser. The rat soon goes to the tray immediately upon hearing the sound of the dispenser. A small horizontal section of a lever protruding from the wall has been resting in its lowest position, but it is now raised slightly so that when the rat touches it, it moves downward. In doing so it closes an electric circuit and operates the food dispenser. Immediately after eating the delivered food the rat begins to press the lever fairly rapidly. The behavior has been strengthened or reinforced by a single consequence. The rat was not "trying" to do anything when it first touched the lever and it did not learn from "errors."

To a hungry rat, food is a natural reinforcer, but the reinforcer in this example is the sound of the food dispenser, which was conditioned as a reinforcer when it was repeatedly followed by the delivery of food before the lever was pressed. In fact, the sound of that one operation of the dispenser would have had an observable effect even though no food was delivered on that occasion, but when food no longer follows pressing the lever, the rat eventually stops pressing. The behavior is said to have been extinguished.

An operant can come under the control of a stimulus. If pressing the lever is reinforced when a light is on but not when it is off, responses continue to be made in the light but seldom, if at all, in the dark. The rat has formed a discrimination between light and dark. When one turns on the light, a response occurs, but that is not a reflex response.

The lever can be pressed with different amounts of force, and if only strong responses are reinforced, the rat presses more and more forcefully. If only weak responses are reinforced, it eventually responds only very weakly. The process is called differentiation.

A response must first occur for other reasons before it is reinforced and becomes an operant. It may seem as if a very complex response would never occur to be reinforced, but complex responses can be shaped by reinforcing their component parts separately and putting them together in the final form of the operant.

Operant reinforcement not only shapes the topography of behavior, it maintains it in strength long after an operant has been formed. Schedules of reinforcement are important in maintaining behavior. If a response has been reinforced for some time only once every five minutes, for example, the rat soon stops responding immediately after reinforcement but responds more and more rapidly as the time for the next reinforcement approaches. (That is called a fixed-interval schedule of reinforcement.) If a response has been reinforced on the average every five minutes but unpredictably, the rat responds at a steady rate. (That is a variable-interval schedule of reinforcement.) If the average interval is short, the rate is high; if it is long, the rate is low.

If a response is reinforced when a given number of responses has been emitted, the rat responds more and more rapidly as the required number is approached. (That is a fixed-ratio schedule of reinforcement.) The number can be increased by easy stages up to a very high value; the rat will continue to respond even though a response is only very rarely reinforced. "Piece-rate pay" in industry is an example of a fixed-ratio schedule, and employers are sometimes tempted to "stretch" it by increasing the amount of work required for each unit of payment. When reinforcement occurs after an average number of responses but unpredictably, the schedule is called variable-ratio. It is familiar in gambling devices and systems which arrange occasional but unpredictable payoffs. The required number of responses can easily be stretched, and in a gambling enterprise such as a casino the average ratio must be such that the gambler loses in the long run if the casino is to make a profit.
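
The four schedules described above differ only in the rule that decides whether a given response is reinforced. The Python sketch below is an editorial illustration, not part of Skinner's text: the class names, the convention that time starts at 0, and the uniform random draws used for the two variable schedules are all assumptions chosen to keep the example short.

```python
import random


class FixedRatio:
    """Reinforce every n-th response (as in piece-rate pay)."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, now):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after an unpredictable number of responses averaging mean_n."""

    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2*mean_n - 1, whose mean is mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, now):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made once `interval` has elapsed."""

    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0  # time of the last reinforcement (session starts at 0)

    def respond(self, now):
        if now - self.last >= self.interval:
            self.last = now
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but the wait varies unpredictably around a mean."""

    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform on 0 .. 2*mean_interval, whose mean is mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, now):
        if now - self.last >= self.wait:
            self.last = now
            self.wait = self._draw()
            return True
        return False


def count_reinforcers(schedule, response_times):
    """Feed response times (in seconds) to a schedule and count reinforcements."""
    return sum(schedule.respond(t) for t in response_times)
```

Note that the ratio schedules ignore the clock and count responses, while the interval schedules ignore the count and look only at elapsed time. The sketch decides when a response pays off; it does not model how the animal's response rate changes under each schedule.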

Reinforcers may be positive or negative. A positive reinforcer reinforces when it is presented; a negative reinforcer reinforces when it is withdrawn. Negative reinforcement is not punishment. Reinforcers always strengthen behavior; that is what "reinforced" means. Punishment is used to suppress behavior. It consists of removing a positive reinforcer or presenting a negative one. It often seems to operate by conditioning negative reinforcers. The punished person henceforth acts in ways which reduce the threat of punishment and which are incompatible with, and hence take the place of, the behavior punished.
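
Because negative reinforcement and punishment are so often confused, the distinctions in the paragraph above can be laid out as a two-by-two lookup. The Python sketch below is only a restatement of that paragraph; the function name and the string labels are hypothetical choices made for this illustration.

```python
def classify(reinforcer, change):
    """Name the procedure, following the definitions in the paragraph above.

    reinforcer: "positive" or "negative" (the kind of reinforcer involved).
    change: "presented" or "withdrawn" (what happens to it after the response).
    """
    table = {
        ("positive", "presented"): "positive reinforcement",
        ("negative", "withdrawn"): "negative reinforcement",
        ("positive", "withdrawn"): "punishment",
        ("negative", "presented"): "punishment",
    }
    return table[(reinforcer, change)]
```

For example, `classify("negative", "withdrawn")` returns `"negative reinforcement"`, a procedure that strengthens behavior, whereas presenting the same negative reinforcer is punishment.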

The human species is distinguished by the fact that its vocal responses can be easily conditioned as operants. There are many kinds of verbal operants because the behavior must be reinforced only through the mediation of other people, and they do many different things. The reinforcing practices of a given culture compose what is called a language. The practices are responsible for most of the extraordinary achievements of the human species. Other species acquire behavior from each other through imitation and modelling (they show each other what to do), but they cannot tell each other what to do. We acquire most of our behavior with that kind of help. We take advice, heed warnings, observe rules, and obey laws, and our behavior then comes under the control of consequences which would otherwise not be effective. Most of our behavior is too complex to have occurred for the first time without such verbal help. By taking advice and following rules we acquire a much more extensive repertoire than would be possible through a solitary contact with the environment.

Responding because behavior has had reinforcing consequences is very different from responding by taking advice, following rules, or obeying laws. We do not take advice because of the particular consequence that will follow; we take it only when taking other advice from similar sources has already had reinforcing consequences. In general, we are much more strongly inclined to do things if they have had immediate reinforcing consequences than if we have been merely advised to do them.

The innate behavior studied by ethologists is shaped and maintained by its contribution to the survival of the individual and species. Operant behavior is shaped and maintained by its consequences for the individual. Both processes have controversial features. Neither one seems to have any place for a prior plan or purposes. In both, selection replaces creation.

Personal freedom also seems threatened. It is only the feeling of freedom, however, which is affected. Those who respond because their behavior has had positively reinforcing consequences usually feel free. They seem to be doing what they want to do. Those who respond because the reinforcement has been negative and who are therefore avoiding or escaping from punishment are doing what they have to do and do not feel free. These distinctions do not involve the fact of freedom.

The experimental analysis of operant behavior has led to a technology often called behavior modification. It usually consists of changing the consequences of behavior, removing consequences which have caused trouble, or arranging new consequences for behavior which has lacked strength. Historically, people have been controlled primarily through negative reinforcement; that is, they have been punished when they have not done what is reinforcing to those who could punish them. Positive reinforcement has been less often used, partly because its effect is slightly deferred, but it can be as effective as negative reinforcement and has many fewer unwanted byproducts. For example, students who are punished when they do not study may study, but they may also stay away from school (truancy), vandalize school property, attack teachers, or stubbornly do nothing. Redesigning school systems so that what students do is more often positively reinforced can make a great difference.

(For further details, see my The Behavior of Organisms, my Science and Human Behavior, and Schedules of Reinforcement by C. F. Ferster and me.)

-- B. F. Skinner

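'SUPERSTITION' IN THE PIGEON: an overview

Skinner was a psychologist of long-standing renown, often regarded as the leading figure of behaviorism. He invented the famous "Skinner box," and he also published more than 10 books and more than 70 scientific papers. His theory can be summarized simply: in any given situation, your behavior is likely to be followed by some consequence. If the consequence is something like praise, payment, or the satisfaction of having solved a problem, you are likely to repeat the behavior in similar situations; such consequences are called reinforcers. If the behavior is followed by a different kind of consequence, such as pain or embarrassment, you are less likely to repeat it in similar situations; such consequences are called punishers.

Among Skinner's many studies is a paper titled "'Superstition' in the Pigeon." The title is playful, and the study is an engaging one: it gives a clear view of Skinner's basic theory and of his approach and methods for studying behavior, and it offers a Skinnerian account of the familiar phenomenon of superstition.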

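Theoretical hypothesis

People engage in all sorts of superstitious behavior, such as avoiding walking under ladders or stepping on cracks. Many are reluctant to admit it, but at times people do things because of superstition. Skinner held that they do so because they believe or presume a connection between the superstitious behavior and some reinforcing outcome, even though in reality the two are unrelated. As Skinner put it, in effect: "If you think this is behavior peculiar to humans, I will give you a superstitious pigeon."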

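Method

Skinner used the "Skinner box" in this study, with one important change: to study superstitious behavior, the food dispenser was set to deliver food every 15 seconds regardless of what the animal was doing. This produced noncontingent reinforcement. In other words, whatever the animal did, it received a reward every 15 seconds.

The subjects were eight pigeons. For several days the pigeons were fed less than their normal amount of food so that they would be hungry during testing, which increased their motivation to seek food (and thus the effect of reinforcement). Each pigeon spent a few minutes a day in the experimental box, with no restriction on its behavior. During this time reinforcement was presented automatically every 15 seconds. After several days, two independent observers recorded the pigeons' behavior in the box.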

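Results

Skinner reported: "In six out of eight cases the resulting responses were so clearly defined that two observers could agree perfectly in counting instances. One bird was conditioned to turn counter-clockwise about the cage, making two or three turns between reinforcements. Another repeatedly thrust its head into one of the upper corners of the cage. A third developed a 'tossing' response, as if placing its head beneath an invisible bar and lifting it repeatedly. Two birds developed a pendulum motion of the head and body, in which the head was extended forward and swung from right to left with a sharp movement followed by a somewhat slower return. The body generally followed the movement and a few steps might be taken when it was extensive. Another bird was conditioned to make incomplete pecking or brushing movements directed toward but not touching the floor."

None of these behaviors had been observed before conditioning. The new behaviors in fact had nothing to do with the pigeons' getting food, yet the birds behaved as if the behavior produced the food; that is, they had become superstitious.

Next, Skinner wanted to know what would happen if the interval between reinforcements was lengthened. He took one of the head-swinging pigeons and gradually increased the interval between food deliveries to one minute. The pigeon's movements became more energetic, until eventually, during the minute between reinforcements, it appeared to be performing a kind of dance (a sort of "pigeon food dance").

Finally, the new behavior was extinguished: reinforcement in the test box was discontinued. The superstitious behavior gradually weakened until it disappeared altogether. Notably, however, the "dancing" pigeon made more than 10,000 responses before extinction was complete.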

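Discussion

In the experiment Skinner obtained six superstitious pigeons. He interpreted the result cautiously, however: "The experiment might be said to demonstrate a sort of superstition. The bird behaves as if there were a causal relation between its behavior and the presentation of food, although such a relation is lacking."

Why superstition is so hard to extinguish can be seen from the pigeon that danced more than 10,000 times before the behavior was extinguished. When a behavior is reinforced only once in a while, it becomes very hard to eliminate, because the expectation that the superstitious behavior will produce the reinforcing outcome remains high. If the connection held every time and then suddenly ceased, the behavior would stop quickly. For humans, however, accidental reinforcement usually comes only at long intervals, so superstitious behavior often persists for a lifetime.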

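Conclusion

Superstition is everywhere. Is it unhealthy from a psychological point of view? Many psychologists hold that although superstitious behaviors by definition do not produce the desired outcome, they still serve positive functions. When a person is in difficulty, superstitious behavior often gives a sense of strength and of not having lost control. People in dangerous occupations are more superstitious than others. The sense of power and control that superstitious behavior brings can sometimes reduce anxiety, build confidence, and improve performance.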

'SUPERSTITION' IN THE PIGEON

B. F. Skinner

Indiana University

First published in Journal of Experimental Psychology, 38, 168-172.

To say that a reinforcement is contingent upon a response may mean nothing more than that it follows the response. It may follow because of some mechanical connection or because of the mediation of another organism; but conditioning takes place presumably because of the temporal relation only, expressed in terms of the order and proximity of response and reinforcement. Whenever we present a state of affairs which is known to be reinforcing at a given drive, we must suppose that conditioning takes place, even though we have paid no attention to the behavior of the organism in making the presentation. A simple experiment demonstrates this to be the case.

A pigeon is brought to a stable state of hunger by reducing it to 75 percent of its weight when well fed. It is put into an experimental cage for a few minutes each day. A food hopper attached to the cage may be swung into place so that the pigeon can eat from it. A solenoid and a timing relay hold the hopper in place for five sec. at each reinforcement.

If a clock is now arranged to present the food hopper at regular intervals with no reference whatsoever to the bird's behavior, operant conditioning usually takes place. In six out of eight cases the resulting responses were so clearly defined that two observers could agree perfectly in counting instances. One bird was conditioned to turn counter-clockwise about the cage, making two or three turns between reinforcements. Another repeatedly thrust its head into one of the upper corners of the cage. A third developed a 'tossing' response, as if placing its head beneath an invisible bar and lifting it repeatedly. Two birds developed a pendulum motion of the head and body, in which the head was extended forward and swung from right to left with a sharp movement followed by a somewhat slower return. The body generally followed the movement and a few steps might be taken when it was extensive. Another bird was conditioned to make incomplete pecking or brushing movements directed toward but not touching the floor. None of these responses appeared in any noticeable strength during adaptation to the cage or until the food hopper was periodically presented. In the remaining two cases, conditioned responses were not clearly marked.

The conditioning process is usually obvious. The bird happens to be executing some response as the hopper appears; as a result it tends to repeat this response. If the interval before the next presentation is not so great that extinction takes place, a second 'contingency' is probable. This strengthens the response still further and subsequent reinforcement becomes more probable. It is true that some responses go unreinforced and some reinforcements appear when the response has not just been made, but the net result is the development of a considerable state of strength.
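
The feedback loop in this paragraph (a response that happens to precede food becomes more probable, which makes it more likely to precede food again) can be illustrated with a toy simulation. The Python sketch below is an editorial illustration and not Skinner's analysis: the weight-based choice rule, the size of the `boost`, the one-response-per-second clock, and the behavior labels are all assumptions, and extinction of unreinforced responses is deliberately left out.

```python
import random


def simulate(behaviors, seconds=600, interval=15, boost=1.0, seed=0):
    """Toy model of accidental reinforcement on a fixed-time food schedule.

    Each second the bird emits one behavior, chosen with probability
    proportional to its current weight. Every `interval` seconds food arrives
    regardless of behavior, and whatever the bird was doing at that moment
    gains `boost` in weight.
    """
    rng = random.Random(seed)
    weights = {b: 1.0 for b in behaviors}
    names = list(weights)
    for t in range(1, seconds + 1):
        current = rng.choices(names, weights=[weights[n] for n in names])[0]
        if t % interval == 0:          # the clock, not the bird, delivers food
            weights[current] += boost  # the coincident response is strengthened
    return weights


# Labels loosely based on the responses reported in the paper.
repertoire = ["turn", "head thrust", "toss", "pendulum", "peck"]
final_weights = simulate(repertoire)
```

Running the function with different seeds tends to leave different behaviors with the largest weight, which loosely parallels the variety of responses the different birds developed, although nothing in this toy model should be read as a quantitative claim about pigeons.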

With the exception of the counter-clockwise turn, each response was almost always repeated in the same part of the cage, and it generally involved an orientation toward some feature of the cage. The effect of the reinforcement was to condition the bird to respond to some aspect of the environment rather than merely to execute a series of movements. All responses came to be repeated rapidly between reinforcements -- typically five or six times in 15 sec.

The effect appears to depend upon the rate of reinforcement. In general, we should expect that the shorter the intervening interval, the speedier and more marked the conditioning. One reason is that the pigeon's behavior becomes more diverse as time passes after reinforcement. A hundred photographs, each taken two sec. after withdrawal of the hopper, would show fairly uniform behavior. The bird would be in the same part of the cage, near the hopper, and probably oriented toward the wall where the hopper has disappeared or turning to one side or the other. A hundred photographs taken after 10 sec., on the other hand, would find the bird in various parts of the cage responding to many different aspects of the environment. The sooner a second reinforcement appears, therefore, the more likely it is that the second reinforced response will be similar to the first, and also that they will both have one of a few standard forms. In the limiting case of a very brief interval the behavior to be expected would be holding the head toward the opening through which the magazine has disappeared.

Another reason for the greater effectiveness of short intervals is that the longer the interval, the greater the number of intervening responses emitted without reinforcement. The resulting extinction cancels the effect of an occasional reinforcement.

According to this interpretation the effective interval will depend upon the rate of conditioning and the rate of extinction, and will therefore vary with the drive and also presumably between species. Fifteen sec. is a very effective interval at the drive level indicated above. One min. is much less so. When a response has once been set up, however, the interval can be lengthened. In one case it was extended to two min., and a high rate of responding was maintained with no sign of weakening. In another case, many hours of responding were observed with an interval of one min. between reinforcements.

In the latter case, the response showed a noticeable drift in topography. It began as a sharp movement of the head from the middle position to the left. This movement became more energetic, and eventually the whole body of the bird turned in the same direction, and a step or two would be taken. After many hours, the stepping response became the predominant feature. The bird made a well defined hopping step from the right to the left foot, meanwhile turning its head and body to the left as before.

When the stepping response became strong, it was possible to obtain a mechanical record by putting the bird on a large tambour directly connected with a small tambour which made a delicate electric contact each time stepping took place. By watching the bird and listening to the sound of the recorder it was possible to confirm the fact that a fairly authentic record was being made. It was possible for the bird to hear the recorder at each step, but this was, of course, in no way correlated with feeding. The record obtained when the magazine was presented once every min. resembles in every respect the characteristic curve for the pigeon under periodic reinforcement of a standard selected response. A well marked temporal discrimination develops. The bird does not respond immediately after eating, but when 10 or 15 or even 20 sec. have elapsed it begins to respond rapidly and continues until the reinforcement is received.

(Figure omitted)

Fig. 1. 'Reconditioning' of a superstitious response after extinction. The response of hopping from right to left had been thoroughly extinguished just before the record was taken. The arrows indicate the automatic presentation of food at one-min. intervals without reference to the pigeon's behavior.

In this case it was possible to record the 'extinction' of the response when the clock was turned off and the magazine was no longer presented at any time. The bird continued to respond with its characteristic side to side hop. More than 10,000 responses were recorded before 'extinction' had reached the point at which few if any responses were made during a 10 or 15 min interval. When the clock was again started, the periodic presentation of the magazine (still without any connection whatsoever with the bird's behavior) brought out a typical curve for reconditioning after periodic reinforcement, shown in Fig. 1. The record had been essentially horizontal for 20 min. prior to the beginning of this curve. The first reinforcement had some slight effect and the second a greater effect. There is a smooth positive acceleration in rate as the bird returns to the rate of responding which prevailed when it was reinforced every min.

When the response was again extinguished and the periodic presentation of food then resumed, a different response was picked up. This consisted of a progressive walking response in which the bird moved about the cage. The response of hopping from side to side never reappeared and could not, of course, be obtained deliberately without making the reinforcement contingent upon the behavior.

The experiment might be said to demonstrate a sort of superstition. The bird behaves as if there were a causal relation between its behavior and the presentation of food, although such a relation is lacking. There are many analogies in human behavior. Rituals for changing one's luck at cards are good examples. A few accidental connections between a ritual and favorable consequences suffice to set up and maintain the behavior in spite of many unreinforced instances. The bowler who has released a ball down the alley but continues to behave as if he were controlling it by twisting and turning his arm and shoulder is another case in point. These behaviors have, of course, no real effect upon one's luck or upon a ball half way down an alley, just as in the present case the food would appear as often if the pigeon did nothing -- or, more strictly speaking, did something else.

It is perhaps not quite correct to say that conditioned behavior has been set up without any previously determined contingency whatsoever. We have appealed to a uniform sequence of responses in the behavior of the pigeon to obtain an over-all net contingency. When we arrange a clock to present food every 15 sec., we are in effect basing our reinforcement upon a limited set of responses which frequently occur 15 sec. after reinforcement. When a response has been strengthened (and this may result from one reinforcement), the setting of the clock implies an even more restricted contingency. Something of the same sort is true of the bowler. It is not quite correct to say that there is no connection between his twisting and turning and the course taken by the ball at the far end of the alley. The connection was established before the ball left the bowler's hand, but since both the path of the ball and the behavior of the bowler are determined, some relation survives. The subsequent behavior of the bowler may have no effect upon the ball, but the behavior of the ball has an effect upon the bowler. The contingency, though not perfect, is enough to maintain the behavior in strength. The particular form of the behavior adopted by the bowler is due to induction from responses in which there is actual contact with the ball. It is clearly a movement appropriate to changing the ball's direction. But this does not invalidate the comparison, since we are not concerned with what response is selected but with why it persists in strength. In rituals for changing luck the inductive strengthening of a particular form of behavior is generally absent. The behavior of the pigeon in this experiment is of the latter sort, as the variety of responses obtained from different pigeons indicates. Whether there is any unconditioned behavior in the pigeon appropriate to a given effect upon the environment is under investigation.

The results throw some light on incidental behavior observed in experiments in which a discriminative stimulus is frequently presented. Such a stimulus has reinforcing value and can set up superstitious behavior. A pigeon will often develop some response such as turning, twisting, pecking near the locus of the discriminative stimulus, flapping its wings, etc. In much of the work to date in this field the interval between presentations of the discriminative stimulus has been one min. and many of these superstitious responses are short-lived. Their appearance as the result of accidental correlations with the presentation of the stimulus is unmistakable.

(Manuscript received June 5, 1947)
