Marvin Minsky

Marvin Lee Minsky (August 9, 1927 – January 24, 2016) was an American scientist in the field of artificial intelligence (AI), co-founder of MIT's AI laboratory, author of several texts on AI and philosophy, and winner of the 1969 Turing Award.

Quotes

In today's computer science curricula … almost all their time is devoted to formal classification of syntactic language types, defeatist unsolvability theories, folklore about systems programming, and generally trivial fragments of "optimization of logic design" — the latter often in situations where the art of heuristic programming has far outreached the special-case "theories" so grimly taught and tested — and invocations about programming style almost sure to be outmoded before the student graduates.
- Turing Award Lecture "Form and Content in Computer Science" (1969), in Journal of the Association for Computing Machinery 17 (2) (April 1970)

Computer languages of the future will be more concerned with goals and less with procedures specified by the programmer.
- Turing Award Lecture "Form and Content in Computer Science" (1969), in Journal of the Association for Computing Machinery 17 (2) (April 1970)

In from three to eight years we will have a machine with the general intelligence of an average human being. I mean a machine that will be able to read Shakespeare, grease a car, play office politics, tell a joke, have a fight. At that point the machine will begin to educate itself with fantastic speed. In a few months it will be at genius level and a few months after that its powers will be incalculable... Once the computers got control, we might never get it back. We would survive at their sufferance. If we're lucky, they might decide to keep us as pets... I have warned [people in the Pentagon] again and again that we are getting into very dangerous country. They don’t seem to understand.
- "Meet Shakey, the First Electronic Person" by Brad Darrach, LIFE, November 20, 1970.

I had the naive idea that if one could build a big enough network, with enough memory loops, it might get lucky and acquire the ability to envision things in its head. This became a field of study later. It was called self-organizing random networks. Even today, I still get letters from young students who say, 'Why are you people trying to program intelligence? Why don't you try to find a way to build a nervous system that will just spontaneously create it?' Finally, I decided that either this was a bad idea or it would take thousands or millions of neurons to make it work, and I couldn't afford to try to build a machine like that.
- Bernstein, Jeremy (1981-12-06). "Marvin Minsky’s Vision of the Future". The New Yorker. ISSN 0028-792X.

Speed is what distinguishes intelligence. No bird discovers how to fly: evolution used a trillion bird-years to 'discover' that – where merely hundreds of person-years sufficed.
- "Communication with Alien Intelligence", in Extraterrestrials: Science and Alien Intelligence (1985), edited by Edward Regis; also published in Byte Magazine (April 1985)

Will robots inherit the earth? Yes, but they will be our children.
- Scientific American (October 1994)

When David Marr at MIT moved into computer vision, he generated a lot of excitement, but he hit up against the problem of knowledge representation; he had no good representations for knowledge in his vision systems.
- Marvin Minsky in: David G. Stork (1998). HAL's Legacy: 2001's Computer As Dream and Reality. p. 16

An ethicist is someone who sees something wrong with whatever you have in mind.
- TED talk (February 2003)

You don't understand anything until you learn it more than one way.
- In Managing an Information Security and Privacy Awareness and Training Program (2005) by Rebecca Herold, p. 101

If you like somebody's work -- just go and see them. However, don't ask for their autograph. A lot of people came and asked me for my autograph -- and it's creepy. What I did is read everything they published first... and correct them. That's what they really want. Every smart person wants to be corrected, not admired.
- In "The Society of Mind" MIT course, part 6, "Layers of Mental Activities" (25:40 -- 26:15). Fall 2011.

If there's something you like very much then you should regard this not as you feeling good but as a kind of brain cancer, because it means that some small part of your mind has figured out how to turn off all the other things.
- In "The Many Minds of Marvin Minsky (R.I.P.)" by John Horgan, Scientific American Blogs, 26 January 2016

Jokes and their Relation to the Cognitive Unconscious (1980)

"Jokes and their Relation to the Cognitive Unconscious", AI Memo No. 603 (November 1980); also published in Cognitive Constraints on Communication (1981), edited by Vaina and Hintikka

I am inclined to doubt that anything very resembling formal logic could be a good model for human reasoning. In particular, I doubt that any logic that prohibits self-reference can be adequate for psychology: no mind can have enough power — without the power to think about Thinking itself. Without Self-Reference it would seem immeasurably harder to achieve Self-Consciousness — which, so far as I can see, requires at least some capacity to reflect on what it does. If Russell shattered our hopes for making a completely reliable version of commonsense reasoning, still we can try to find the islands of "local consistency," in which naive reasoning remains correct.

Since we have no systematic way to avoid all the inconsistencies of commonsense logic, each person must find his own way by building a private collection of "cognitive censors" to suppress the kinds of mistakes he has discovered in the past.

Questioning one's own "top-level" goals always reveals the paradox-oscillation of ultimate purpose. How could one decide that a goal is worthwhile — unless one already knew what it is that is worthwhile?

For avoiding nonsense in general, we might accumulate millions of censors. For all we know, this "negative meta-knowledge" — about patterns of thought and inference that have been found defective or harmful — may be a large portion of all we know.

All intelligent persons also possess some larger-scale frame-systems whose members seemed at first impossibly different — like water with electricity, or poetry with music. Yet many such analogies — along with the knowledge of how to apply them — are among our most powerful tools of thought. They explain our ability sometimes to see one thing — or idea — as though it were another, and thus to apply knowledge and experience gathered in one domain to solve problems in another. It is thus that we transfer knowledge via the paradigms of Science. We learn to see gases and fluids as particles, particles as waves, and waves as envelopes of growing spheres.

Positive general principles need always to be supplemented by negative, anecdotal censors. For, it hardly ever pays to alter a general mechanism to correct a particular bug.

K-Lines: A Theory of Memory (1980)

"K-Lines: A Theory of Memory" in Cognitive Science 4 (1980), pp. 117–133

When you "get an idea," or "solve a problem," or have a "memorable experience," you create what we shall call a K-line. This K-line gets connected to those "mental agencies" that were actively involved in the memorable event. When that K-line is later "activated," it reactivates some of those mental agencies, creating a "partial mental state" resembling the original.

We usually say that one must first understand simpler things. But what if feelings and viewpoints are the simpler things?

We shall envision the mind (or brain) as composed of many partially autonomous "agents"—as a "Society" of smaller minds. ...It is easiest to think about partial states that constrain only agents within a single Division. ...(we suggest) the local mechanisms for resolving conflicts could be the precursors of what we know later as reasoning — useful ways to combine different fragments of knowledge.

Concrete concepts are not necessarily the simplest ones. A novice best remembers "being at" a concert. The amateur remembers more of what it "sounded like." Only the professional remembers the music itself, timbres, tones and textures.

I maintain that attitudes do really precede propositions, feelings come before facts.

Old answers never perfectly suit new questions, except in the most formal, logical circumstances.

Get the mind into the (partial) state that solved the old problem; then it might handle the new problem in the "same way."

Changing the states of many agents grossly alters behavior, while changing only a few just perturbs the overall disposition a little.

A memory should induce a state through which we see current reality as an instance of the remembered event — or equivalently, see the past as an instance of the present. ...the system can perform a computation analogous to one from the memorable past, but sensitive to present goals and circumstances.

It would seem that making unusual connections is unusually difficult and, often, rather "indirect"—be it via words, images, or whatever. The bizarre structures used by mnemonists (and, presumably unknowingly, by each of us) suggests that arbitrary connections require devious pathways.

Most theories of learning have been based on ideas of "reinforcement" of success. But all these theories postulate a single, centralized reward mechanism. I doubt this could suffice for human learning because the recognition of which events should be considered memorable cannot be a single, uniform process. It requires too much "intelligence." Instead I think such recognitions must be made, for each division of the mind, by some other agency that has engaged the present one for a purpose.

Each subsociety of mind must have its own internal epistemology and phenomenology, with most details private, not only from the central processes, but from one another.

Each part of the mind sees only a little of what happens in some others, and that little is swiftly refined, reformulated and "represented." We like to believe that these fragments have meanings in themselves — apart from the great webs of structure from which they emerge — and indeed this illusion is valuable to us qua thinkers — but not to us as psychologists — because it leads us to think that expressible knowledge is the first thing to study.

Music, Mind, and Meaning (1981)

"Music, Mind, and Meaning" (1981), a revised version of AI Memo No. 616, MIT; also published in the Computer Music Journal, Vol. 5, Number 3 (Fall 1981)

Only the surface of reason is rational. I don't mean that understanding emotion is easy, only that understanding reason is probably harder.

Our culture has a universal myth in which we see emotion as more complex and obscure than intellect. Indeed, emotion might be "deeper" in some sense of prior evolution, but this need not make it harder to understand; in fact, I think today we actually know much more about emotion than about reason.

If explaining minds seems harder than explaining songs, we should remember that sometimes enlarging problems makes them simpler! The theory of the roots of equations seemed hard for centuries within its little world of real numbers, but it suddenly seemed simple once Gauss exposed the larger world of so-called complex numbers. Similarly, music should make more sense once seen through listeners' minds.

We find things that do not fit into familiar frameworks hard to understand – such things seem meaningless.

What is the difference between merely knowing (or remembering, or memorizing) and understanding? ...A thing or idea seems meaningful only when we have several different ways to represent it — different perspectives and different associations. ...Then we can turn it around in our minds, so to speak: however it seems at the moment, we can see it another way and we never come to a full stop. In other words, we can 'think' about it. If there were only one way to represent this thing or idea, we would not call this representation thinking.

Of what use is musical knowledge? Here is one idea. Each child spends endless days in curious ways; we call this play. A child stacks and packs all kinds of blocks and boxes, lines them up, and knocks them down. … Clearly, the child is learning about space! ...how on earth does one learn about time? Can one time fit inside another? Can two of them go side by side? In music, we find out!

The way the mathematics game is played, most variations lie outside the rules, while music can insist on perfect canon or tolerate a casual accompaniment.

Most adults have some childlike fascination for making and arranging larger structures out of smaller ones.

Perhaps the music that some call 'background' music can tranquilize by turning under-thoughts from bad to neutral, leaving the surface thoughts free of affect by diverting the unconscious.

Theorems often tell us complex truths about the simple things, but only rarely tell us simple truths about the complex ones. To believe otherwise is wishful thinking or "mathematics envy."

Music... immerses us in seemingly stable worlds! How can this be, when there is so little of it present at each moment?

Hearing music is like viewing scenery and... when we hear good music our minds react in very much the same way they do when we see things.

Our eyes are always flashing sudden flicks of different pictures to our brains, yet none of that saccadic action leads to any sense of change or motion in the world; each thing reposes calmly in its "place"! ...What makes us such innate Copernicans?

How do both music and vision build things in our minds? Eye motions show us real objects; phrases show us musical objects. We "learn" a room with bodily motions; large musical sections show us musical "places." Walks and climbs move us from room to room; so do transitions between musical sections. Looking back in vision is like recapitulation in music; both give us time, at certain points, to reconfirm or change our conceptions of the whole.

Innate sentic detectors could help by teaching children about their own affective states. For if distinct signals arouse specific states, the child can associate those signals with those states. Just knowing that such states exist, that is, having symbols for them, is half the battle.

When no idea seems right, the right one must seem wrong.

The Society of Mind (1987)

We'll show you that you can build a mind from many little parts, each mindless by itself.
- Prologue

This book... too, is a society — of many small ideas. Each by itself is only common sense, yet when we join enough of them we explain the strangest mysteries of mind.
- Prologue

Unless we can explain the mind in terms of things that have no thoughts or feelings of their own, we'll only have gone around in a circle.
- Ch.1

How many processes are going on, to keep that teacup level in your grasp? There must be a hundred of them.
- Ch.1

The "laws of thought" depend not only on the property of brain cells, but also on how they are connected. And these connections are established not by the basic, "general" laws of physics... To be sure, "general" laws apply to everything. But, for that very reason, they can rarely explain anything in particular. ...Each higher level of description must add to our knowledge about lower levels.
- Ch.2

Questions about arts, traits, and styles of life are actually quite technical. They ask us to explain what happens among the agents of our minds. But this is a subject about which we have never learned very much... Such questions will be answered in time. But it will just prolong the wait if we keep using pseudo-explanation words like "holistic" and "gestalt." …It's harmful, when naming leads the mind to think that names alone bring meaning close.
- Ch.2

One's present personality cannot share all the thoughts of one's older personalities — and yet it has some sense that they exist. This is one reason why we feel that we possess an inner Self — a sort of ever-present person-friend, inside the mind, whom we can always ask for help.

We rarely recognize how wonderful it is that a person can traverse an entire lifetime without making a single really serious mistake — like putting a fork in one's eye or using a window instead of a door.

For generations, scientists and philosophers have tried to explain ordinary reasoning in terms of logical principles — with virtually no success. I suspect this enterprise failed because it was looking in the wrong direction: common sense works so well not because it is an approximation of logic; logic is only a small part of our great accumulation of different, useful ways to chain things together.
- p. 187

What magical trick makes us intelligent? The trick is that there is no trick. The power of intelligence stems from our vast diversity, not from any single, perfect principle.
- p. 308

Perceptrons (1988)

One popular version is that the publication of our book so discouraged research on learning in network machines that a promising line of research was interrupted. Our version is that progress had already come to a virtual halt because of the lack of adequate basic theories... Most theorists had tried to focus only on the mathematical structure of what was common to all learning, and the theories to which this had led were too general and too weak to explain which patterns perceptrons could learn to recognize... The trouble appeared when perceptrons had no way to represent the knowledge required for solving certain problems. The moral was that one simply cannot learn enough by studying learning by itself; one also has to understand the nature of what one wants to learn.
- pp. xii–xiii

Perceptrons have been widely publicized as “pattern recognition” or “learning” machines and as such have been discussed in a large number of books, journal articles, and voluminous “reports.” Most of this writing ... is without scientific value.
- p. 4

We do not see that any good can come of experiments which pay no attention to limiting factors that will assert themselves as soon as the small model is scaled up to a usable size.
- p. 18

[We] became involved with a somewhat therapeutic compulsion: to dispel what we feared to be the first shadows of a “holistic” or “Gestalt” misconception that would threaten to haunt the fields of engineering and artificial intelligence as it had earlier haunted biology and psychology. For this, and for a variety of more practical and theoretical goals, we set out to find something about the range and limitations of perceptrons.
- p. 20

Have you considered perceptrons with many layers? ... We have not found (by thinking or by studying the literature) any other really interesting class of multilayered machine, at least none whose principles seem to have a significant relation to those of the perceptron. To see the force of this qualification it is worth pondering the fact, trivial in itself, that a universal computer could be built entirely out of linear threshold modules. This does not in any sense reduce the theory of computation and programming to the theory of perceptrons.
- p. 231

More concretely, we would call the student's attention to the following considerations: 1. Multilayer machines with loops clearly open all the questions of the general theory of automata. 2. A system with no loops but with an order restriction at each layer can compute only predicates of finite order. 3. On the other hand, if there is no restriction except for the absence of loops, the monster of vacuous generality once more raises its head. The perceptron has shown itself worthy of study despite (and even because of!) its severe limitations. It has many features to attract attention: its linearity; its intriguing learning theorem; its clear paradigmatic simplicity as a kind of parallel computation. There is no reason to suppose that any of these virtues carry over to the many-layered version. Nevertheless, we consider it to be an important research problem to elucidate (or reject) our intuitive judgment that the extension is sterile. Perhaps some powerful convergence theorem will be discovered, or some profound reason for the failure to produce an interesting “learning theorem” for the multilayered machine will be found.
- pp. 231–232

We could extend them either by scaling up small connectionist models or by combining small-scale networks into some larger organization. In the first case, we would expect to encounter theoretical obstacles to maintaining [generalized delta rule]’s effectiveness on larger, deeper nets. And despite the reputed efficacy of other alleged remedies for the deficiencies of hill-climbing, such as “annealing,” we stay with our research conjecture that no such procedures will work very well on large-scale nets, except in the case of problems that turn out to be of low order in some appropriate sense. The second alternative is to employ a variety of smaller networks rather than try to scale up a single one... No single-method learning scheme can operate efficiently for every possible task; we cannot expect any one type of machine to account for any large portion of human psychology.
- pp. 267–268

The Emotion Machine (2006)

Perhaps it is no accident that one meaning of the word express is "to squeeze"—for when you try to "express yourself," your language resources will have to pick and choose among the descriptions your other resources construct—and then attempt to squeeze a few of these through your tiny channels of phrases and gestures.

I suspect our human "thinking processes" often "break down," but you rarely notice anything's wrong, because your systems so quickly switch you to think in different ways, while the systems that failed are repaired or replaced.

Most of our future attempts to build large, growing Artificial Intelligences will be subject to all sorts of mental disorders.

We still remain prone to doctrines, philosophies, faiths, and beliefs that spread through the populations of entire civilizations. It is hard to imagine any foolproof ways to protect ourselves from such infections. ...the best we can do is to try to educate our children to learn more skills of critical thinking and methods of scientific verification.

Every system that we build will surprise us with new kinds of flaws until those machines become clever enough to conceal their faults from us.

Attributed

I cannot articulate enough to express my dislike to people who think that understanding spoils your experience… How would they know?
- Mat Buckland, AI Techniques for Game Programming (2002), Cincinnati, OH: Premier Press, p. 36. ISBN 1-931841-08-X.

It would seem that Perceptrons has much the same role as The Necronomicon -- that is, often cited but never read.
- Berkeley, Istvan S. N. "A revisionist history of connectionism." (1997), attributing it to (Minsky, personal communication, 1994).

Quotes about Marvin Minsky

Frank Rosenblatt... invented a very simple single-layer device called a Perceptron. ...Unfortunately, its influence was damped by Marvin Minsky and Seymour Papert, who proved [in Perceptrons: An Introduction to Computational Geometry (1969)] that the Perceptron architecture and learning rule could not execute the "exclusive OR" and therefore could not learn. This killed interest in Perceptrons for a number of years... It is possible to construct multilayer networks of simple units that could easily execute the exclusive OR... Minsky and Papert would have contributed more if they had produced a solution to this problem rather than beating the Perceptron to death.
- Francis Crick, The Astonishing Hypothesis: The Scientific Search for the Soul (1994)

Although my own previous enthusiasm has been for syntactically rich languages like the Algol family, I now see clearly and concretely the force of Minsky's 1970 Turing lecture, in which he argued that Lisp's uniformity of structure and power of self reference gave the programmer capabilities whose content was well worth the sacrifice of visual form.
- Robert Floyd, The Paradigms of Programming, 1978 Turing Award Lecture, Communications of the ACM 22 (8), August 1979: pp. 455–460

When the Minsky and Papert book came out, entitled Perceptrons... I saw they'd done some serious work here, and there was some good mathematics in this book, but I said, "My God, what a hatchet job."... I felt that they had sufficiently narrowly defined what the perceptron was, that they were able to prove that it could do practically nothing... I couldn't understand what the point of it was, why the hell they did it. But I know how long it takes to write a book. I figured that they must have gotten inspired to write that book really early on to squelch the field, to do what they could to stick pins in the balloon. But by the time the book came out, the field was already gone. There was just about nobody doing it.
- Bernard Widrow, interview on March 29, 1994, in Anderson, James A., ed. (2000). Talking Nets: An Oral History of Neural Networks. The MIT Press. doi:10.7551/mitpress/6626.003.0004. ISBN 978-0-262-26715-1.

I was one of the people suffering from Minsky and Papert's book [Perceptrons] because it went roughly this way: you start telling somebody about your work, and this visitor or whoever you talk to says, "Don't you know that this area is dead?" It is something like what we experienced in the pattern recognition society when everything started to be structural and grammatical and semantic and so on. If somebody said, "I'm doing research on the statistical pattern recognition," then came this remark, "Hey, don't you know that is a dead idea already?"
- Teuvo Kohonen, interview in March 1993, in Anderson, James A., ed. (2000). Talking Nets: An Oral History of Neural Networks. The MIT Press. doi:10.7551/mitpress/6626.003.0004. ISBN 978-0-262-26715-1.

Many people see the book as what killed neural nets, but I really don't think that's true. I think that the funding priorities, the fashions in computer science departments, had shifted the emphasis away from neural nets to the more symbolic methods of AI by the time the book came out. I think it was more that a younger generation of computer scientists who didn't know the earlier work may have used the book as justification for sticking with "straight AI" and ignoring neural nets.
- Michael A. Arbib, interview in March 1993, in Anderson, James A., ed. (2000). Talking Nets: An Oral History of Neural Networks. The MIT Press. doi:10.7551/mitpress/6626.003.0004. ISBN 978-0-262-26715-1.

The wall-to-wall media coverage of Rosenblatt and his machine irked Minsky. One reason was that although Rosenblatt's training was in "soft science," his perceptron work was quite mathematical and quite sound--turf that Minsky, with his "hard science" Princeton mathematics PhD, didn't feel Rosenblatt belonged on. Perhaps an even greater problem was the fact that the heart of the perceptron machine was a clever motor-driven potentiometer adaptive element that had been pioneered in the world's first neurocomputer the "SNARC" which had been designed and built by Minsky several years earlier! In some ways, Minsky's early career was like that of Darth Vader. He started out as one of the earliest pioneers in neural networks, but was then turned to the dark side of the force (AI) and became the strongest and most effective foe of his original community. This view of his career history is not unknown to him. When he was invited to give the keynote address at a large neural network conference in the late 1980s to an absolutely rapt audience he began with the words: "I am not the Devil!"
- Robert Hecht-Nielsen, interview in July 1993, in Anderson, James A., ed. (2000). Talking Nets: An Oral History of Neural Networks. The MIT Press. doi:10.7551/mitpress/6626.003.0004. ISBN 978-0-262-26715-1.

In his summary talk at the end of the conference [The AI@50 conference (2006)], Marvin Minsky started out by saying how disappointed he was both by the talks and by where AI was going. He explained why: “You’re not working on the problem of general intelligence. You’re just working on applications.”... At the end of the dinner, the five returning members of the 1956 Dartmouth Summer Research Project on Artificial Intelligence made brief remarks about the conference and the future of AI. In the question and answer period, I stood up and, turning to Minsky, said: “There is a belief in the neural network community that you are the devil who was responsible for the neural network winter in the 1970s. Are you the devil?” Minsky launched into a tirade about how we didn’t understand the mathematical limitations of our networks. I interrupted him—“Dr. Minsky, I asked you a yes or no question. Are you, or are you not, the devil?” He hesitated for a moment, then shouted out, “Yes, I am the devil!”
- Terrence J. Sejnowski, The Deep Learning Revolution (2018), Chapter 17.

... there wasn’t anyone in the Mathematics Department who was qualified to assess his dissertation, so they sent it to the mathematicians at the Institute for Advanced Study in Princeton who, it was said, talked to God. The reply that came back was, “If this isn’t mathematics today, someday it will be,” which was good enough to earn Minsky his Ph.D.
- Terrence J. Sejnowski, The Deep Learning Revolution (2018), Chapter 17.

In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6. “What are you doing?”, asked Minsky. “I am training a randomly wired neural net to play Tic-Tac-Toe”, Sussman replied. “Why is the net wired randomly?”, asked Minsky. “I do not want it to have any preconceptions of how to play”, Sussman said. Minsky then shut his eyes. “Why do you close your eyes?”, Sussman asked his teacher. “So that the room will be empty.” At that moment, Sussman was enlightened.
- “Sussman attains enlightenment”, “AI Koans”, Jargon File
