Norvig vs. Chomsky and the Fight for the Future of AI

knowtheory · on July 25, 2012

It's a little bit frustrating to read a rehash of an argument that was cutting edge maybe back in the late 90s, especially one that is so poorly written, and framed as a battle between two intellectuals.

Chomsky's past his heyday. He has been seminal in his field, but he's no longer doing research which pushes at the boundaries of our understanding of language, how to model it, or what the fundamental nature of language understanding systems is. (as one might infer, I come from a non-chomskyian school of linguistics).

Given that we have actual data and research about large scale systems that do interesting things (including the massive artificial neural network that google built last month, see: http://www.wired.com/wiredscience/2012/06/google-x-neural-ne... ) reporting as substance free and obfuscating as this is, is a real frustration, when we could be talking about more interesting things, such as what a solid operational definition of meaning is, or how exactly heuristic/rule based systems actually differ from statistical mechanism, and whether or not all heuristic systems can (or should) be modeled with statistical systems.

The framing of this article is particularly galling because there are so many non-chomskian linguists out in the world who operate fruitfully in the statistical domain. Propping Chomsky up as somehow representative of all linguists is pretty specious and a bit irritating.

ntoshev · on July 25, 2012

It's a rehash of http://norvig.com/chomsky.html and the talk by Chomsky that led to it (both recent and relevant). I can't wait to read the continuation of that article that Norvig promised in comments.

mcguire · on July 26, 2012

You know, the discussion of Chomsky made me realize something. I would bet that Chomsky's take on Norvig's approach (to put specific names on the ideas) is that Norvig is a resurrection of Skinner. Specifically, behaviorism as "study texts, make statistical correlations, but don't make up any grand theories."

mcguire · on July 25, 2012

Non-Chomskian linguists? I was under the impression that, post-Chomsky, linguistics was defined as Chomskian; everyone else had left the field for whatever related discipline most closely matched what they wanted to do.

knowtheory · on July 25, 2012

I mean... there are core things at the basis of what everyone agrees on in linguistics (language is structured, learned, and there is a defined syllabary of sounds that humans make and use in language, etc), but Chomsky's grand unified theory of language, and his ideas about how those mechanics function are not universally agreed upon.

Even within syntax, which is really the sort of core of what Chomsky has been interested in, there are other formalisms to represent syntax which differ from Chomsky's theoretical framework. You can read more about Chomskian transformational grammar on wikipedia with links off to other sorts of formalisms as well: http://en.wikipedia.org/wiki/Transformational_grammar

mcguire · on July 26, 2012

"a defined syllabary of sounds"

Actually, that was one of the examples I had in mind. I was under the impression that the people interested in languages and sound had left linguistics in favor of phonetics, and as a result there was little interest in the interaction between the two.

My difficulty with the Chomskian method, as opposed to my ignorance about linguistics, is actually based on the "formalisms to represent syntax", since application of application of formalisms to natural language seems to be to be problematic. To quote the Wikipedia page you mentioned,

"Chomsky noted the obvious fact that people, when speaking in the real world, often make linguistic errors (e.g., starting a sentence and then abandoning it midway through). He argued that these errors in linguistic performance were irrelevant to the study of linguistic competence (the knowledge that allows people to construct and understand grammatical sentences). Consequently, the linguist can study an idealised version of language, greatly simplifying linguistic analysis...."

At the time, my impression from more neurological reading was that the errors were rather more interesting (http://en.wikipedia.org/wiki/Aphasia).

knowtheory · on July 26, 2012

Phonetics is most assuredly part of linguistics, unless you redefine linguistics just to mean "syntax" (No linguistics department i'm aware of makes such a distinction).

And yep, there's a whole sub-discipline of psycholinguistics which definitely learns from things like speech pathologies.

jholman · on July 26, 2012

Elaborating on what knowtheory posted, in modern linguistics, "Chomskian" doesn't usually refer to "everything Chomsky said". For example, Chomsky was probably the foremost advocate of the overthrow of behaviourist theory in linguistics, and now no-one takes extreme behaviourism a la BF Skinner serious (as an explanation of linguistic behaviour), but this rejection is not currently labelled as "Chomskian". It's just linguistics.

So, not that I'm 100% clear on the details, but Chomskian refers to more specific claims. For example, some claims about what aspects of language knowledge are innate (Chomsky claims a great language engine with a relatively small number of tuning parameters). As another example, at least at one point, I think Chomsky was rejecting syntax rulesets with productions that were not binary? I think?

mcguire · on July 26, 2012

"Chomsky was probably the foremost advocate of the overthrow of behaviourist theory in linguistics, and now no-one takes extreme behaviourism a la BF Skinner serious (as an explanation of linguistic behaviour), but this rejection is not currently labelled as 'Chomskian'. It's just linguistics."

My impression was that Chomsky had effectively replaced extreme behaviorism with extreme "Chomskianism", which had then assumed the name of linguistics. Extreme behaviorism is pretty goofy, but less extreme versions have some good points.

morsch · on July 26, 2012

Not sure how you got that impression, but a lot of stuff has happened in linguistics since Chomsky, some of it quite explicitly contrary to his arguments. None of my advanced linguistics classes dealt with Chomskian stuff.

mcguire · on July 26, 2012

Ok, I'm not anywhere close to a linguist. The only linguistics class I've ever taken was, strangely, an automata theory course.

But what I've read around linguistics lead me to believe that the field was mostly Chomskian theory with very little empirical evidence. But then, I've never been a real fan of Chomsky's work---sure, in that automata theory (and in other well-defined formal languages) it's great, but I don't see a good application to natural languages---and so I haven't actually tried to follow up on anything I've heard about it.

gbog · on July 26, 2012

> linguistics was defined as Chomskian

IOPOTW (In Other Parts Of The World) Chomsky may have a much lower relative influence on linguistics. For instance in France, you have Ferdinand de Saussure and André Martinet. I read a bit of the later, and it seem to me more solid on some grounds than Chomsky, because more humble (the first act of reason is to acknowledge its limits).

I'd bet in continental Europe the political Chomsky is deemed more interesting than the linguist.

davedx · on July 26, 2012

I don't know; "On Language" was a series of dialogs that took part in France, if I recall?

saurik · on July 26, 2012

West Coast Functionalism gave birth to various continuations that are still practiced fervently at various universities (including UCSB, as a specific example I can point to that I have personal experience with).

gbog · on July 25, 2012

> we could be talking about more interesting things

Please do.

phaedrus · on July 25, 2012

I spent about ten years working on Markov based chat programs. I gave up on themwhen I realized that no matter how sophisticated your statistical model it will never be more than a statistical analysis of text, unless it includes some rich rule based model of mental processes and mental objects. It may be that such a model of mental processes must itself be fuzzy and probabilistic, but it must exist. Therefore I come down firmly on the side of Chomsky in this debate: we should pursue theories of intelligence, and stastical models without any theory do not advance our scientific understanding of AI, however practical their application may be at the present time. This is not to say statistical methods do not work, of course they work, what I am saying is it is not a path that leads to true understanding of intelligence any more than spectral analysis of the EMF emissions of a running computer would lead to a theory of computation.

knowtheory · on July 25, 2012

Just because you chose Markov chains as your modeling mechanism doesn't mean that there is no statistical modeling method that is capable of developing something passing for what we'd call "meaning".

This is the same argument that was used against artificial neural networks. Neural network of type A can't do X, therefore neural networks will never do Y.

Language is immensely complex, and real human language involves things which are not encoded in text (and i'd remind you that you were trying to infer meaning from text specifically, not the full multi-channel robustness of humans communicating), we don't even have a full handle on what all of the cognitive processes and factors are that go into the production and understanding of language (although we've developed a lot of interesting work to those ends).

So hearing folks give up claim that Chomsky is correct because our current tools aren't up to the job is a bit puzzling, because we don't even have a complete understanding of what sort of thing language is or what sorts of things we are as systems which can use language.

Chomsky has opinions (and some facts) about what language is, and we are, but he does not have solid proof to confirm his specific conjectures. Is human language context free? context sensitive? Something else? (Chomsky's minimalist program uses movement along a tree to preserve referentiality and a bunch of junk, alternative syntactic frameworks such as HPSG uses directed graphs as the basis of their language modeling. Still others do weirder things like higher order combinatoric logics. And unfortunately none of the theoretical frameworks appear to be without their drawbacks)

neilk · on July 25, 2012

I am not a specialist, but as far as I know, Chomsky's argument here was that the existence of recursion showed that a Markov approach had to be wrong. Surely a similar argument can be made for statistical approaches? There is no way to represent a reference to some other part of the statement in a purely statistical method. If they work they happen to work basically by accident.

Just blue-skying here, but it seems to me that if I knew enough about how a statistical program worked, I could craft a sentence that would utterly confuse it, even though it was perfectly intelligible to a normal English speaker. A putative strong-AI program could not be fooled in this way.

knowtheory · on July 25, 2012

Except that his argument is somewhat moot as a practical matter, because there are no infinitely recursive sentences (given that all sentences are finite).

Long distance dependencies are an issue in language modeling that do need to be accounted for, but all that tells me is that Markov chains aren't the right structure to model language (unless, maybe you had a MASSIVE amount of data, and a markov chain of an order high enough that you account for the majority of sentences. maybe).

greendestiny · on July 26, 2012

You can statistically build a model that has recursion. It's just that such a model cannot be sure it has induced the right grammar - that's what Chomsky's argument was. I think the obvious counter argument is so what? Given any other constraints like parsimony you can certainly reliably induce a grammar.

JabavuAdams · on July 25, 2012

Two experiences have made it clear to me that humans don't understand language that well, without context:

(1) Raising a child. My dad often remarks that he's surprised that my daughter knows how to use a word in just the right context. I'm not, because this is a natural product of mimicry: if you copy what others say, you usually use the words in the correct context. As with computer-generated text the exceptions are often hilarious.

(2) Song lyrics. I had a very clear experience where I just could not understand the refrain of Gold Guns Girls. It sounded completely unintelligible until I read the lyrics. After that, it sounded crystal clear. Why would reading the lyrics make the song sound different? Context.

slurgfest · on July 25, 2012

There is no valid argument leading FROM your disenchantment with Markov based chat programs, TO a conclusion that machine learning is invalid. Markov chatters are a toy.

Did you produce a better chat bot based on UG? - No? Then on what basis are you junking machine learning?

Machine translation is far from a solved problem. But Chomsky's school claimed they were going to solve it in the 60s or perhaps the 70s. Do you know what is the basis for the most successful current approach to machine translation?

Analysis of text. (But not the kind of simplistic junk one does in a Markov chatter)

All you have done is suggest that some beautiful perfect text model exists natively in every person. (Presumably this evolved somehow - or if you are Chomsky, it just developed like a crystal for no apparent reason). This isn't an explanation of anything unless you actually find that model instantiated in the brain. But this is just not happening. So either our instruments are still too crude to detect it, or it's not really there.

Appealing to an as-yet unknown perfect universal text model does not build a better chat bot or a better explanation of human behavior.

True understanding of intelligence must incorporate an understanding of how learning occurs. Because anyone who watches children sees learning occurring, and only doctrinaire Chomskyists deny that it occurs (because it is not beautiful enough and some abductive argument is claimed to show that it is not sufficient).

bhickey · on July 25, 2012

The fact that Markov chains are by definition memoryless isn't an argument in favor Chomsky or magical thinking. Sure, if you want to improve your output you can use (n+1)grams instead of n-grams, but the curse of dimensionality is going to quickly catch up with you. Language smoothing will help for a little while. Over a long enough horizon all Markov chain output is jibberish. None of these obvious limitations are an argument against statistical models.

Where is the data that statistical methods don't 'advance our understanding'? What does an EEG tell us about the brain works?

slurgfest · on July 25, 2012

Running a Markov chain model as-is to generate text produces gibberish. You correctly point out that the gibberish can be much higher quality.

Fundamentally it is gibberish not for any simple algorithmic reason, but because generation is occurring without any respect to context or meaning beyond what randomly emerges from the graceful juxtaposition of randomly chosen words. It is purely about the combinations of words (in that sense, syntax). This shouldn't be surprising - who ever actually expected that generating a kind of syntax model would result in coherent thoughts? At most it can generate texts like weird dreams, it shouldn't be surprising that the result is not a cogent discussion of current events.

This does not mean that the same information cannot be used in more sophisticated ways. But these wouldn't be a Markov chatbot. The Markov model would effectively be a component in a larger system that needed to use words. It isn't at all clear that the Markov model is the best possible one, but it is just groundless dogma to insist that learning can't have anything to do with real performance.

bwanab · on July 25, 2012

It strikes me that the theories are not mutually incompatible at all. They have very different purposes. Chomsky is trying to understand meaning and intelligence at a deep level. Norvig is trying to build models that help people right now (and incidently, help his company to make more money). Any new insights from either path will help refine the other.

So it sounds like what you decided was that you wanted to explore one path and not the other. Nothing wrong with that, but it's a very different statement.

dave_sullivan · on July 25, 2012

As far as language modeling, this is a recent paper that models language on the character level rather than word level and can track long term dependencies and even generate plausible sounding non-words from time to time: http://www.cs.toronto.edu/~ilya/pubs/2011/LANG-RNN.pdf

The state of the art is improving a bit, although this method still knows nothing of meaning so it can often generate some strange sentences. Still, I wouldn't write off the whole field yet--just because something didnt work with tools years ago doesn't mean it isn't possible.

guscost · on July 25, 2012

There are two possible ways to invent this kind of theory, the way I see it:

1.) Measure data, then make an educated guess. 2.) Make an uneducated guess.

robg · on July 25, 2012

This is one of those rare moments in intellectual life where being in the room and now seeing the debate develop, it becomes clear that the resulting hype isn't (wasn't) loud enough.

This distinction marks the real turning point in AI from abstract, grand claims with highly restrictive evidence toward engineering that simply works. Who cares about the ontology when we can recreate? It's like saying airplanes don't properly explain flight because they don't replicate how birds do it. Who cares? We can fly (and translate and soon reason) artificially.

It's clear that Chomsky and Universal Syntax has held back the entire field of AI (and at MIT). There isn't one algorithm in the human mind to decode all of our mental capabilities. That's mistaking subjectivity for objective lessons. Trying to recreate that Phantom has led to rule tables in AI, constraints on how the mind must operate. Instead, by allowing those fuzzy boundaries to accumulate with evidence, statistical approaches win in the long-term of our lives and in this debate.

Kuhn knew what happens to dinosaurs.

_delirium · on July 25, 2012

I don't think I would take that strident battle-against-dinosaurs view, in part because I think intellectually understanding things is useful in and of itself, not some kind of "just build it and shut up" anti-intellectual view; and also because I don't think it's an accurate summary of the history of AI. There's no particular reason we can't both build and study things in various ways, and the history of AI has been full of people doing many takes on both.

In particular, statistical approaches have been used for a long time, but were not practical until fairly recently; it was the lack of "big data" computing power holding them back more than anything. Statistical machine translation and parsing experiments have been tried on and off for decades, but with 1950s-era data they produced total garbage as output, even worse than the (also bad) symbolic approaches. Hence why Shannon's work on text processing didn't produce practical NLP or NLG systems. It took Google-sized data to produce statistical translation that was actually usable.

What numerical approaches were possible on computers of the time were fairly extensively investigated when they became possible (e.g. the 1980s focus on "sub-symbolic AI", with perceptrons, neural networks, numerical regression methods, etc.). Some were shelved for years because they just didn't work as well, e.g. symbolic game-tree search massively outperformed machine learning in board games in the early experiments, which is why Samuels's 1950s ML-based checkers player was theoretically intriguing but not considered very practical.

jerf · on July 25, 2012

I often feel bad for computer science researchers in the 50s and 60s. It isn't that they were any less smart than today, it's just that where they had kilo-X, we have giga-X, if not tera-X or even peta-X.

One of my evolutionary computation professors said that he had a deck of cards that he carried from job to job, university to university, running bits of his evolutionary computation experiment on spare cycles, an experiment that runs in fractions of a second on hardware at the time he told the story in 2001. Presumably now it would run in even more fractional fractions of a second.

slurgfest · on July 25, 2012

Intellectually understanding things is useful. There's no anti-intellectualism here.

It's become increasingly clear that Chomsky's approach to language is not going to generalize to AI successes or explanations of many domains of human performance, as was promised. That approach has not yielded fruitfully anywhere except the narrow realm of syntax (in particular typologies of syntax). While machine learning has kicked butt left and right.

There is no reason people can't go on studying syntax at the same time machine learning expands. But it would be dishonest to ignore the extreme and exclusive claims laid down by Chomsky's school at the outset of the cognitive revolution. These claims were given a very liberal benefit of the doubt for decades. A lot of good work has been done in syntax. But now those claims about AI and psychology-in-general are clearly threadbare, since they have not yielded the promised fruits either in AI or in explaining human functions. They have not even yielded plausible IOUs. Remember, this was supposed to explain pretty much everything. Not even language acquisition has been explained.

That is empirical inadequacy. Who cares if Chomsky thinks it's beautiful?

And it's equally clear that learning of various kinds has to occur and also gives us the more parsimonious and elegant solutions to problems (no-solution and no-explanation is not elegant even if it is simpler).

Machine learning has a great deal of conceptual beauty - if you study it rather than pooh-poohing it because of some facile abductive argument to UG.

Computing power isn't a problem. The scale of the human brain is enormous and we are still nowhere near reaching it with our computational resources.

ddd571 · on July 25, 2012

What about research showing things like intermediate traces exist, we parse sentences in accordance with binding principles, we don't postulate gaps for fillers in sentences in island contexts unless there'll later be a potential filler gap? All of these are facts we know from abstract work in Chomskyan syntax, and yet they've been replicated in laboratories. So, your claim that there's no fruits is just simply empirically false

JabavuAdams · on July 25, 2012

> There isn't one algorithm in the human mind to decode all of our mental capabilities.

Careful here. There could be one algorithm, but we may not be able to express its parameters in the way we like to for simple models.

These statistical approaches are algorithms. It's the fact that we can't make sense of the parameters that seems to lead people to believe that they don't explain anything.

The fact that we can't categorize all life in an unambiguous way that makes sense to humans doesn't mean that the elegant and simple algorithm of evolution by natural selection is wrong.

Model vs. initial conditions. Knowledge vs. search.

slurgfest · on July 25, 2012

You are right, these statistical approaches are algorithms. And they themselves are very simple and do not necessarily require a lot of initial state. And you are also right that it is extra hard to reverse-engineer these.

But they amount to learning and search. And because of the politics of Chomsky's movement, it has always fought the concept of learning tooth and nail in favor of innate knowledge or ready-made models. This is why we talk about language "acquisition" and why we read Chomsky saying that contemporaries who studied learning were facilitating totalitarianism.

It's also worth mentioning that Chomsky has stated that he does not think the brain has evolved.

So however many algorithms there are (and however they are employed to get the vast array of intelligent human actions) many things you have said here are totally anathema to Chomsky's school.

It was pushed too hard, it overextended and only now has any challenge come to it - from the quarter of AI. (Up until a few years ago, fighting it in academic Psychology meant you were OK with killing your career, and anyone less respectable than Norvig would get torn apart - look at what they are saying about Norvig now).

JabavuAdams · on July 25, 2012

Nauseating. Trashing my academic career is probably the best "mistake" I've made.

pnathan · on July 25, 2012

"Who cares about the ontology when we can recreate? It's like saying airplanes don't properly explain flight because they don't replicate how birds do it. Who cares?"

Well, I do. Being able to understand something from first principle is very different than being able to model it and gain an approximation of it.

One might point out that this debate crops up in a different form in developer circles, but reframed as "Learn assembly & cs theory vs. use IDE & ignore assembly and theory".

fusiongyro · on July 25, 2012

> It's clear that Chomsky and Universal Syntax has held back the entire field of AI

That may be true, but there is value to understanding language for the sake of language outside the practical goals of improving AI. There is plenty of evidence for parameters playing a role in human language whether or not a parametric implementation of NLP is possible, desirable or necessary. It's certainly true that a statistical approach that bears plenty of fruit in AI applications is going to be very strained to provide anything of value back to linguistics or developmental psychology.

meric · on July 26, 2012

"It's like saying airplanes don't properly explain flight because they don't replicate how birds do it."

The analogy doesn't hold. We don't want planes to fly like birds, but we may want machines to thinking like humans.

eyko · on July 26, 2012

Also, the statistical approach would be to imitate flying creatures until we can fly, rather than understanding aerodynamics in order to create "vessels that float".

rm999 · on July 25, 2012

I've been in machine learning/AI for ten years now - from undergraduate research, to graduate school, to industry - and I find debate like this fascinating. My take on it is that our understanding of what we will be able to do in the future is very unclear, and what we will want to do is very open-ended. So the debate is worth having, but it won't really resolve anything.

Statistical models may (in my opinion probably will) end up being an "AI" dead-end, eventually falling into other fields such as algorithms, like game trees and logic-based agents did. That's not to say the current statistical approach is a bad idea; on the contrary, I think these techniques are useful and simple enough that they will become fairly ubiquitous in CS.

On the Chomsky side of the argument, AI researchers have consistently been frustrated in the past 50 years, to the point that studying AI today makes you sound like a joke. But their goal is a noble one. Anyone can understand how great it would be to have a human-level intelligence on a chip - this would fundamentally change the World. The fact that we haven't dented this problem doesn't mean the problem isn't worth solving, it just means our understanding of what it takes to build this kind of AI is in its infancy.

I almost feel like Norvig and Chomsky are arguing in parallel. They are both right, but their arguments are valid on different time scales. Today, the Norvig approach will easily win out; Chomsky has nothing and is largely irrelevant. But Chomsky is, IMO, correctly predicting what will need to happen to move beyond an eventual roadblock in a much grander AI.

debacle · on July 25, 2012

They have two different definitions of "artificial intelligence," which is where the schism seems to be arising from.

Chomsky takes the academic approach - artificial intelligence is the simulation of humanlike (or even possibly mammalian) intelligence.

Norvig is taking the engineering approach - artificial intelligence needs only to pass the Turing test.

They're both right, both approaches have value, and they both are bound by our limited technology at the moment.

In the end, though, Norvig will lose out. Sure, he'll make the finish line first - an AI capable of 'passing' the Turing test, but in order to have real intelligence you need an analytical engine (or brain, if you will) that can prioritize data without fiddling with bits. In the Norvig solution, someone will always have to be fiddling with the bits.

Chomsky's approach, on the other hand, will result in a 'true' artificial intelligence, the way neurologists understand it. It's just going to take a lot longer to get there.

phuff · on July 25, 2012

Having studied Chomsky a fair bit in grad school, and also studied cognitive linguistics a fair bit in grad school, I think the idea that Chomsky's models will ever win anything is just wrong.

Chomsky's central problem is that his modeling is not based on anything biological at all. His models don't correspond to reality. Some of them were based on some assumptions about how the brain works that were untestable in the 50s and 60s when all of his linguistic models were developed, and have since become testable and are not particularly evident in the way we currently understand the brain to work.

Given this fact, I think your current best bet is Norvig as a modern approach to AI or anything linguistic-y. But this is only because it is slightly more grounded in reality rather than being something that Chomsky (who is a very smart guy) came up with on his own without the benefit of actual biological models of the brain.

In the end, I think there will be (eventually, a long time from now) an actual model of how the brain processes language based on actual observations of working brains that throws away much of what Chomsky has proposed but probably uses some of it and that doesn't use huge Google-esque lookup tables but is highly influenced by statistics.

But until we get to that point, statistics are probably your best bet since at least they're grounded in reality (unlike much of Chomsky's work).

sp332 · on July 25, 2012

Children learn language much faster than it seems possible for a "blank" neural network to learn. It seems that there is some "circuitry" hard-wired into the human brain that helps learning language. So the question is: can a computer learn language as well as a human, without simply hard-coding language into it?

Jun8 · on July 25, 2012

This is a popular summary of Chomsky's thesis that was put down decades ago, when cognitive psychology was at its infancy. Now we know a lot more on how babies learn the world and language (do a Google search on "infant statistical learning") and most evidence points to the fact that they employ algorithms that are mainly statistical in nature for learning.

"Children learn language much faster than it seems possible for a "blank" neural network to learn." This is a very strong statement that has no mathematical or computational proof AFAIK. It had no proof when Chomksy first put down that thesis either, it was an axiom of his. BTW, belief in a specialized language faculty was not universal, even in the past, esp. some philosophers of language disagreed with this view.

sp332 · on July 25, 2012

It follows logically from these 3 premises https://en.wikipedia.org/wiki/Poverty_of_the_stimulus#Summar... Those premises could be wrong but it's not just an axiom.

Retric · on July 25, 2012

#2 is simply wrong. Children are corrected when their grammar is off so they do get to see incorrect sentence structure. It's questionable if children could learn language from only watching TV, but that's not the standard learning environment.

PS: I am far from the first person to point that out. At this point they are treating it as an axiom because they continue to believe it despite the disproof of their premises.

ddd571 · on July 25, 2012

Yet, they consistently ignore this input. This has also been pointed out dozens and dozens of times.

debacle · on July 25, 2012

Children are definitely not a 'blank' neural network. They spend ~6 months staring off into nowhere, looking, listening, and slowly developing the skills to respond to their environment.

cma · on July 25, 2012

A newborn baby will recoil away from a dark circle growing larger on a screen. Where did he learn to associate a growing share with an object approaching near enough to collide?

We have certain built in behaviors and reflexes.

ckg · on July 25, 2012

This strikes me as the old nature vs. nurture debate - trying to determine which human behaviors are hard wired and which are learned. Like most complex questions I don't think there is a single right answer, but my current theory is that humans have more hard wired behavior than most people like to admit. It is precisely because of our language skills that we can rationalize behavior that has it's root cause in the more animal regions of the brain.

To put it another way - most people think they are rational. Most people act irrationally. To me it is animal instinct that is cause of greed, war, social hierarchy, etc. and it is so ingrained in society that we don't question it's root cause which most likely boils down to atavistic tendencies.

agumonkey · on July 25, 2012

To me this fall in the reptilian reflexes box. Accelerating signal related to increasing proximity => act.

sp332 · on July 25, 2012

By "blank" of course I mean that they begin blank and immediately start learning from their environment. You say they develop skills "slowly" but it's still much faster than you would expect, unless children have some innate skill at language built in instead of being "blank."

Edit: sorry if this is vague, this is what I'm talking about https://en.wikipedia.org/wiki/Psychological_nativism

knowtheory · on July 25, 2012

You are being way too vague. What is setting your expectations of "slowly"? What rate would you expect children to learn language at? Even if that were to be the case, your argument is essentially a god of the gaps argument. Not P therefor Q is not sound reasoning.

The whole notion of the Universal Grammar and innate language faculties which instantiate subsets of the Universal Grammar is weird.

gnaritas · on July 25, 2012

No, he's right, check out Pinkers the language instinct for the full treatment on the issue. Children learn at a rate impossible from just what they hear and imitate. We are pre-wired for language.

Retric · on July 25, 2012

30,000+ hours using more computing power than the fastest computer on the planet to get the a 3 year olds grasp of language is hardly 'fast'.

PS: Language acquisition starts before birth, and continues 24/7 after that until we 'learn' it.

gnaritas · on July 25, 2012

"Pinker explains that a universal grammar represents specific structures in the human brain that recognize the general rules of other humans' speech, such as whether the local language places adjectives before or after nouns, and begin a specialized and very rapid learning process not explainable as reasoning from first principles or pure logic. This learning machinery exists only during a specific critical period of childhood and is then disassembled for thrift, freeing resources in an energy-hungry brain."

Having read the book, his arguments are far more convincing than your assertions.

Retric · on July 25, 2012

However, unlike his arguments I limited myself to using actual facts.

If you just want a compelling argument: Biology is a vary important component in the creation and evolution of language, because the fine motor control required to say "linguistic" vs "mommy" or "stop" has a lot to do with how languages are learned and evolve. As baby's practice how to say thing babble converts to simple words but in doing so there are pattern as to which sounds are easier to produce and therefore enable them to probe their environment by reproducing. This paralls the evolution of language where the most important and simplist ideas where the first to be communicated and therefore take up the 'root' address space in the language with more complex words and ideas like 'chemist' being tacked on over time.

PS: Sure, it sounds great. But how many assumptions am I stringing together in just those few sentences.

gnaritas · on July 25, 2012

Look, if you haven't read the book, I'm not interested in your opinions of his arguments; you don't know them. When someone recommends a book, you don't kill the messenger when your disagreement is with the author; don't be an ass.

Retric · on July 25, 2012

I have read most of the language instinct. I get why people find it compelling, but that has little to do with being accurate. My point was his style tends to be convincing vs. his actual evidence being compelling.

PS: Think of it like this A -> B, B -> C therefore A -> C is all well in good most of the time. A -> B .... Y -> Z therefore A -> Z only really works with math, build a chain that long and it's unlikely for all those steps to be accurate.

gnaritas · on July 25, 2012

And I don't find your point in the least bit compelling. Unless you're a leading expert in the field as he is, your points mean jack squat to me. And since I'm not making the argument, it's not argument from authority to say his book is far more convincing than your assertions without evidence. You're trying to debate me about a book I recommended; you're an ass. Goodbye.

Retric · on July 25, 2012

That seems overly rude. It's also a ridiculous appeal to authority at the same time.

If I where to attack the book I could say something like: "In chapter 2 'Chatterboxes' he states humans are the only animal that uses language which is complex issue by it's self. He goes further and says every group of humans in remote areas we encountered have had complex language. He then runs with that line of reasoning. However this ignores not just other animals that use simple forms of language but human ancestors that where close to us anatomically and probably also used language. If homo sapiens's ancestors also used language then you would expect the earliest humans to also use language therefore language would spread from it's origins to all those remote areas vs. being created from scratch in those remote areas."

Now, I could get 10 PHD's to say the same thing and use quotes etc, but what's important is the accuracy of the statement not who says it. http://lesswrong.com/lw/jl/what_is_evidence/

gnaritas · on July 25, 2012

Are you mentally handicapped, what part of "Goodbye" was unclear to you? And no, it's not an argument from authority, I specifically headed off that critique when I said I wasn't making any fucking argument. Learn to read.

Retric · on July 25, 2012

Saying you found his argument convincing is in no way important either it's a factual statement it it's not, what you believe is meaningless. Suggesting it matters in any way who made the argument is an appeal to authority even if your next sentience says it's not. Reality does not care what you think it just is.

PS: You clearly lack the courage of your convictions to actually leave an argument when you say 'Goodbye'. However, I realize trying to reason with a fool is a waste of time, so best of luck and 'Goodbye'.

gnaritas · on July 25, 2012

PS: Fuck you, I wasn't even talking to you, I recommended a book to someone. You lack the brains to know when someone isn't interested a debate, because you're an ass.

PS: You're still an ass, and you don't know what appeal to authority means. We have to be debating and me relying on him to make a point for it to be an appeal to authority. As I clearly indicated I wasn't interested in debating the subject, you can't accuse me of logical fallacies, a point I made previously but you failed to grok because you're an ass.

radarsat1 · on July 25, 2012

I'm not sure what you mean by "faster," (what are you comparing to exactly?), but I think something that speeds up human learning considerably as compared to computers is feedback. Children don't just blankly sit there taking in information and then "fitting it" to a model-- they perform actions and observe the consequences; it is empirical. The embodied action-perception loop is fundamental to how real-world learning works. A closer computer model is reinforcement learning, for example, which does exactly this, it wraps a neural network in an action-perception loop an uses online training to learn the reward function. The problem is of course that the reward function can be very hard to design except for fairly simple tasks.

bugsbunnyak · on July 25, 2012

> Chomsky's approach, on the other hand, will result in a 'true' artificial intelligence, the way neurologists understand it. It's just going to take a lot longer to get there.

High-level behavioral impressions taken by a neurologist are a convenient abstraction. That this high-level behavior is useful in monitoring mental state (outputs) says very little about the underlying 'hardware'. In fact, this is the `fundamental` debate in cognitive science: from whence does intelligence arise? Theories generally fall under two headings: 'top-down' and 'bottom-up', which roughly correspond to 'pre-programmed' and 'emergent'. The canonical bottom-up approach is the neural network, approximating cells with various equations that govern behavior (outputs) based on aggregate input (there are various levels at which this can be done). There are a variety of top-down approaches, a typical approach would take the form of logic engines (think Prolog), or generative rules (Chomsky)

Statistical modelling approaches are closer to bottom-up, but depending on the model they may still incorporate domain knowledge that is emergent from the model input.

Statistical approaches have momentum these days due to considerable success - thanks largely to Moore's law. However, they also have biological support: what is a neuron? It's an FPGA with a lot of electrical and chemical inputs. Small neural circuits can behave statistically, and it's an open question whether this gives rise to high-level behavior. A big reason it's an open question is that we don't yet have the spatial or temporal resolution to measure enough signals.

That said, there is plenty of room for what I consider a happy medium: locally statistical behavior, but globally (and generationally) top-down organization driven by genetics.

_delirium · on July 25, 2012

A significant proportion of the "academic approach" is actually a third one: artificial intelligence is the analysis and implementation of rational decision-making. That approach tends to care neither about biological accuracy, nor believability in a Turing-test sense. Rather, it cares about whether its decisions are correct based on evidence available to the decision-maker. That's the kind of attitude you most often find in both statistical and logic-based AI circles.

Actually, Russell & Norvig's AI textbook has a nice summary of these different approaches to AI in its intro chapter.

JabavuAdams · on July 25, 2012

Too bad humans aren't rational.

_delirium · on July 25, 2012

They aren't, but not all AI is aimed at mimicking human behavior. For example, if your goal is an AI system that can automatically control ship routing on the global Maersk shipping network (planning routes, rerouting to respond to contingencies, etc.), you might want "optimal" rather than "human-like" decision making.

JabavuAdams · on July 25, 2012

Good point.

netcan · on July 25, 2012

I'm going out on a limb because I don't really know much about neurology & I may be wrong about facts but.. I think an issue here is that we don't really know what "natural intelligence" is.

For a significant part of the Scientific age we knew about genes in some sense without knowing much about them. We called them traits, observed & measured them. We got to know some "rules" about their inheritance. But it wasn't until genetics got to be a little better understood that we got to know their physical manifestation. We can explain the diference between genetic & cultural (memetic?) inheritance in these terms. A descendant's cooking habits are memetic and her hair colour is genetic.

When it comes to neuroscience I think we're where we were a century ago in biology. Emotions, thoughts, memories. We don't know what their physical manifestation is. We dont know how they work. Since we don't know much about how natural intelligence works I think our common sense definition of intelligence is, to a certain extent: "stuff we can do that computers can't."

I think that if we had a definition that was more functional than observational, you wouldn't be hesitant at all to use "mammalian" in your definition. Whatever processes result in observed human intelligence are almost certainly shared with other species. We'd probably also know what species to draw the lines at: reptiles? invertebrates? fungi?

If apes have intelligence, goldfish don't but octopi do that suggests there multiple versions of natural intelligence.

jan_g · on July 25, 2012

But what exactly is true artificial intelligence? For example, I consider Google search and Wolfram alpha very intelligent. They can do math, answer questions, rank information, follow current events, ...

snowwrestler · on July 25, 2012

I think "artificial intelligence" is a total misnomer for we're talking about here.

There is a lot more involved than just language. It seems possible to solve language and still not create "strong AI." See: Watson kicking ass at Jeopardy. That is pretty sophisticated language comprehension; it even got clues that involved puns and wordplay.

From that it would seem totally possible to adapt this sort of intelligence to taking an IQ test. Just provide the right corpus to search and give the algorithms time to learn and be tuned. What if Watson tests out at a 150 IQ? Is he "intelligent"? After all it is an "intelligence quotient."

I think most people would say no, the issue is the Turing Test. That involves language, but I think the real point is an artificial personality: a computer that can hold a conversation and evince a discernible will and opinion. It seems to me that to do that, the machine must have emotions. I'm no expert in the field, but from my layman's position I don't see nearly as many stories about studying and replicating emotions as I do about human language.

Which is funny because any dog owner knows that dogs, despite having almost no formal language, are clearly willful, independent, emotional living beings. Can we even simulate a dog's emotional state with a computer? How about a bug's emotional state? I haven't heard much about that. But there is a ton of heat around language.

waterlesscloud · on July 25, 2012

I think statistical models are an 80% solution. They get a long way down the path very quickly, but then they hit a wall and don't advance much further. Search, translation, probably also autonomous vehicles. They get to a point with statistical approaches and then they stop advancing.

The rapid success at first may be leading to a dead end.

Having said that, there's a lot of value in these technologies as assistance to human intelligence, but I'm skeptical they're ever going to lead to full-on autonomous intelligence.

jan_g · on July 25, 2012

But when does your average human hit a wall? Pretty quickly, I'd say. Mostly due to laziness and parenting and getting old.

I think that you are also getting at the _sentience_. I think this is what most people refer to when speaking about true AI. You know, having consciousness, desires, social skills, etc.

debacle · on July 25, 2012

They're still computers in the traditional sense - they only do what someone told them to do.

jan_g · on July 25, 2012

I believe that you are then referring to sentience in addition to intelligence? Artificial sentient being is another thing entirely, I think. And tougher nut to crack :)

zumda · on July 25, 2012

You are right, but to solve most problems, do machines really HAVE to think like humans?

zumda · on July 25, 2012

(can't answer you directly, so I'm doing it here)

My point was, that, for example, language recognition doesn't need human intelligence, a statistical model is enough.

Or driving around a confined space only requires a particle filter, not human intelligence.

debacle · on July 25, 2012

For most problems, you don't need AI.

power · on July 25, 2012

“Norvig is taking the engineering approach - artificial intelligence needs only to pass the Turing test.”

Passing the Turing test and the simulation of intelligence are supposed to be the same thing. Turing came up with the test to sidestep the definition of intelligence.

I’m not sure what you mean by “but in order to have real intelligence you need an analytical engine (or brain, if you will) that can prioritize data without fiddling with bits. In the Norvig solution, someone will always have to be fiddling with the bits”. Do you mean bits as in pieces or as in ones and zeros ? If the former, haven’t Chomsky’s models required the addition of new parameters as exceptions to his rules are found ? If you mean the latter, how is a computer to prioritize data if it can’t look at bits ?

There seems to be the misconception that Norvig does not use simple models. He does, just ones that use statistics for training and to learn. His approach strikes me as elegant, simple and more robust to changes in language over time than Chomsky’s.

btilly · on July 25, 2012

I personally believe that it will be possible to simulate a working brain in sufficient detail that it is able to "think" like a human does before humans understand said brain.

In fact I doubt that the cognitive capacity of a human brain is enough to truly understand an operating human brain.

rm999 · on July 25, 2012

This is exactly my take on it. They are talking about AI in different contexts, and therefore aren't really arguing with each other as much as past each other. Anyone who has any interesting in building something today would take a Norvig approach, and anyone who pictures AI 100 years from now should hope that the Chomsky approach eventually won out.

SudarshanP · on July 25, 2012

Many confuse the Chomsky approach v/s Norvig approach as who is right and who is wrong. But that is an unnecessary distraction. Consider planes. The wright brothers studied planes without mastering Fluid dynamics. Their planes flew. That was awesome. But plane makers kept exploring and inventing. We achieved super sonic speeds and even made rockets. As time passed, we learnt more and more about fluid flows. While on the one hand atoms jiggling around with various velocities is all that we are studying. But the fluids demonstrate very rich and complex behaviours. Only a fool would stay contented that the aeroplane flew. Our understanding of fluid dynamics comes in handy to model wind turbines and weather patterns and a whole host of awesome areas. A lot of those great discoveries came from investments in aeronautical research made by profit hungry entrepreneurs. The science v/s technology debate is usually just funding politics. It is silly to argue whether the eye runs the marathon or the leg runs the marathon. Let us just accept that "the runner runs the marathon" and just move on.

Chomsky may or may not be right. Just like aristotle was not right about many things. He was a genius and he contributed as well. He may have stirred up some trouble also. Over time if we are fortunate enough we may get a clear understanding of the core ideas like we understand the behavior of a fluid. If we have to settle with flying planes but don't understand wind, we cannot go supersonic or reach the moon.

olalonde · on July 25, 2012

Isn't there a consensus that "passing the Turing test" <=> "true artificial intelligence"? Otherwise, what test do neurologists propose?

azakai · on July 25, 2012

First thing, please read the actual article by Norvig, it is excellent,

http://norvig.com/chomsky.html

Second: I found it astounding that the article never mentions Skinner. Surely this article is trying to do to Chomsky what Chomsky did to Skinner in 1959 ("A Review of B. F. Skinner's Verbal Behavior", http://www.chomsky.info/articles/1967----.htm ).

Chomsky basically marked the beginning of modern era of cognitive psychology with that essay, displaing the previous paradigm of behaviorism. Norvig's article has similar form in some ways to that article, and similar goals (to argue for a new paradigm over an older one). As I was reading it, I was sure Norvig had that context in mind. So I was surprised to read

> So how could Chomsky say that observations of language cannot be the subject-matter of linguistics? It seems to come from his viewpoint as a Platonist and a Rationalist and perhaps a bit of a Mystic

Well, no, Chomksy explained very well why he opposed observations being the subject matter of linguistics in his 1959 essay. Skinner's behaviorism looked only at observations and experience, and did away entirely with internal mental states. That might seem bizarre to us today, and the reason is in large part the shift heralded by Chomsky's article from behavioral psychology to cognitive psychology. In the latter, the goal is to understand the internal processes that are involved in psychology (or specifically language).

Statistical language models are not behaviorism. But they do share a lot with it, they are based primarily on raw empirical observations as opposed to deep models, so it is natural for Chomsky to oppose them on similar grounds (and not due to Platonism or Rationalism, although I suppose you can speculate that those motivated his 1959 essay too).

Side note, we can speculate that if Skinner had today's computers and statistical modelling methods, the shift from behaviorism to cognitivism might never have happened, seeing as the statistical approach is so successful.

orbitingpluto · on July 25, 2012

I know a card counter. I showed him how to condition probabilities to determine how to best play. He went for the full Monte Carlo method and he lets his simulation run for a week before he starts using it "just to make sure". It's frustrating because he doesn't get that his results are statistically significant after about 30 seconds of runtime. He still makes money doing it. The results are tangible, but he's still just mucking about.

'Quantum mechanics is certainly imposing. But an inner voice tells me that it is not yet the real thing. The theory says a lot, but does not really bring us any closer to the secret of the "old one." I, at any rate, am convinced that He does not throw dice.' --Einstein

Statistical methods can work but they are unsatisfying to the scientifically curious. You're not really a scientist if you create something that works and you don't really know why. (Not to say that the method doesn't have value. Sometimes you have to play with your Lego before you grow up.)

bluekeybox · on July 25, 2012

> You're not really a scientist if you create something that works and you don't really know why.

According to your logic, the only true "science" is mathematics. If you test the workings of your "creation" using scientific method, you're still a scientist. Scientific method is also about testing your claims empirically, and it has been successfully applied for more than a century to study of biological organisms, climate, and other complex systems that we do not "really" understand. Not to berate understanding of underlying mechanism which is always preferable, just to point out that there is more than one way to skin the cat.

orbitingpluto · on July 26, 2012

I notch you up a point good sir for discovering my bias. (mathematics)

bluekeybox · on July 26, 2012

I guess I know your bias because I'm biased towards mathematics as well :).

VikingCoder · on July 25, 2012

I picture Chomsky as Kepler, trying to build orbits out of Platonic solids.

Until Kepler had access to Brahe's data, he was not going to be able to come up with his theories of planetary motions.

Worse than that, the laws of planetary motion present a simplistic view of the universe: what happens when a bunch of small objects orbit a very massive object. I think they wouldn't help you out at all, in trying to understand planets moving in a binary star system.

There is no analytic solution to the N-body problem. We can only simulate the motions of a group of massive bodies by iteratively applying the laws of gravitation that we have deduced. Knowing the mathematical properties of how objects behave in a gravitational field, and actually understanding HOW GRAVITY WORKS are two enormously different things. Newton was frustrated with the theory of Gravity, because it was, as Norvig's models, just a model - with no explanation of why. But the model allows you to make falsifiable predictions, and understand how the universe will behave. Looking for the Higgs Boson is awesome - but there is potentially no equivalent in the linguistic world.

Chomsky asks us to ignore F = G * m1 * m2 / r^2, because there's no WHY attached to it.

PS - this understanding of the history of science is brought to you by Carl Sagan's Cosmos TV series. I have no deeper insight than that.

OmegaHN · on July 25, 2012

I think it is the other way around. Chomsky is trying to find the underlying structure of intelligence (just like gravity underlies planetary motion), and is saying that others are simply trying to generate a model of intelligence (through statistical methods) with no understanding of why the intelligence behaves that way. Gravity is the why, planetary motion is the model produced by data (acquired by Brahe).

VikingCoder · on July 25, 2012

What will prevent Chomsky from having an Earth-centric model of the solar system, with epicycles to explain all of those weird little ticks like dropped pronouns?

The only thing that could possibly break you out of that way of thinking is massive amounts of observational analysis to show you that your foundation is flawed.

Seeing the moons of Jupiter revolutionized physics. Chomsky says that observing the heavens is unnecessary, and a distraction from his studying of the motion of billiards balls.

He's got trigonometry down cold, but he'll never come up with calculus that way. And quantum mechanics would never fall into Chomsky's way of thinking, in my analogy.

jal278 · on July 25, 2012

But there is a WHY associated with F= G* m1 * m2 / r^2 -- an explanation for why gravity exists, which although isn't yet unified with quantum mechanics, is that mass deforms space. The statistical formula is cool on its own right though too of course -- because of its elegance.

However, a concise formula without an explanation is less interesting, and an enormous statistical model without any elegance or insight (perhaps driven by big-data) is less interesting still, although it may have practical applications.

There may be general principles to intelligence, how it evolved through natural selection, and what in general will make for an intelligent robot. These are some of the most interesting unmade discoveries. But it isn't a sharp dichotomy between "Chomsky" and "Norvig," although that is one way to frame the question.

We should explore all reasonable approaches to AI and make sure one method is not dominating the others, but otherwise let everyone explore what they will. Who really knows what actually will lead to AI?

VikingCoder · on July 25, 2012

First, Norvig is not rejecting Chomsky's approach - you can't rely only on stats and probabilities. Chomsky is absolutely rejecting Norvig's approach. So, you apparently agree with Norvig.

Second, you don't know WHY mass deforms space. You don't know why some particles have mass, and others don't. You don't know how space is deformed by mass.

You wouldn't even have that explanation, "mass deforms space," without starting from the mathematical models. The mathematical models came about by studying the observed data.

Remember, people thought that cannon balls moved in a straight line, until they ran out of energy, and then they fell straight down. That was a perfectly valid view of the universe, given the observations available at the time. It even allowed you to make predictions about how far a cannonball would go, if you gave it different amounts of energy. It was also wrong.

We then cleverly figured out that the cannon ball moves in a parabola, with gravity as a constant force, pulling the cannonball down. Also wrong.

Now we know that the gravitational force exerted on the cannonball by the Earth is inversely proportional to the square of the distance between them. Meaning that you can put a cannonball in orbit, or you can even get it to escape Earth's gravity well.

Knowing "mass deforms space" doesn't help you make any of those predictions. How it deforms space does. And we, in fact, figured out that it deforms space, before we had any idea WHY, or by which mechanism.

Again, my problem with Chomsky is that he thinks we will come up with "mass deforms space," and Kepler's laws of planetary motion, before we even make any observations about the natural world.

Or rather, given our ability to measure huge quantities of text, it seems absurd that Chomsky would have us ignore the corpus, and go back to first principles as the exclusive and only way to gain any insight, be able to make any predictions, before we build any applications.

mootothemax · on July 25, 2012

Isn't this basically an argument over John Searle's Chinese Room thought experiment?

It supposes that there is a program that gives a computer the ability to carry on an intelligent conversation in written Chinese. If the program is given to someone who speaks only English to execute the instructions of the program by hand, then in theory, the English speaker would also be able to carry on a conversation in written Chinese. However, the English speaker would not be able to understand the conversation. Similarly, Searle concludes, a computer executing the program would not understand the conversation either.

http://en.wikipedia.org/wiki/Chinese_room

sp332 · on July 25, 2012

The argument is whether a computer can learn language (well) from scratch, or whether some capacity for language must be built into the computer manually. https://en.wikipedia.org/wiki/Poverty_of_the_stimulus

mootothemax · on July 25, 2012

Ooh, I see, interesting, thanks! :)

knowtheory · on July 25, 2012

Just as a note, the space of possible responses to Searle's argument have been pretty well enumerated here: http://plato.stanford.edu/entries/chinese-room/

I'm of the opinion that the room has an understanding entity inside of it, in talking about infinitely-sized books with an infinitely-sized index allowing any mechanical process to map an input to a correct output, you've hypothesized something complicated enough that it should be said to be an entity capable of understanding/meaning.

debacle · on July 25, 2012

The problem with Searle's assertion is that he is making a distinction between the computer and the program. We, as human beings, are not our computers, we are our programs.

bluekeybox · on July 26, 2012

Absolutely. Also, according to Searle's argument, submarines cannot swim.

nessus42 · on July 25, 2012

> Isn't this basically an argument over John Searle's Chinese Room thought experiment?

No, this debate is completely and utterly different from Searle's Chinese Room argument! Searle's argument is a philosophical one for the assertion that a computer could never be a person. He concludes that this is true even if we were to eventually believe that we completely understand intelligence in the manner than Chomsky is lobbying for, and then fully implement that understanding in a computer.

For Searle, no amount of understanding of intelligence in any form will ever let us make an intelligent computer. For Searle, intelligent beings must be made out of flesh and bone. Or at least not out of anything digital and computer-like.

zeteo · on July 25, 2012

Is a robot capable of running? Let's say you had one, then take his legs and give them to an amputee. The amputee can operate the legs by pressing a button. However, pressing a button is not running. Similarly, the robot would not really be running either.

pessimizer · on July 25, 2012

>However, pressing a button is not running.

No True Scotsman here, which I'm assuming that you're implying exists in Searle's argument. But that puts you in a position of saying that the English speaking operator of the Chinese room understands Chinese because he responds in a convincing way to Chinese speakers, which is pretty explicitly false.

It also relies on the concept of "running" being monolithic instead of constructed. Under a part of most people's concepts of running lie things that a running robot does, other parts of the concept for some people involve particular types of clothing, and/or nipple chafe cream. I would think that most people would think of what the amputee does with the robot legs as running, yet they wouldn't want him in the Olympics.

Is Searle's room, or the assemblage of Searle's room and its operator, intelligent? Whether you can answer that question easily depends on which part of the concept of intelligence is important to you in that context. Is the designer and implementor of Searle's room intelligent? Unambiguously. Are the assemblage and the designer intelligent in exactly the same way?

brudgers · on July 25, 2012

Intellectually, there seems to be something as wrong with avoiding anthropomorphism when discussing human endeavors (such as language) as there is with anthropomorphic explanations of erosion or chemical reactions. Skinnarian approaches to language may leave people unsatisfied because there is no story, just clinical observation.

Norvig's approach (as characterized in the article) takes the the "Artificial" in "Artificial Intelligence" to include the mechanism by which an intelligence makes decisions. Chompsky's aesthetic of linguistics applied to AI would treat "Artificial" as a description of the platform in which an intelligence is embodied (i.e. non-biological) while requiring the platform to operate linguistically on the same principles as a "natural intelligence."

Norvig's approach (as characterized in the article) is essentially a better Eliza (or Ford's faster horse).

If one takes the Turing Test as scientifically meaningful rather than an engineering standard, then one falls in one camp or the other and the Norvig Chompsky debate is over a pseudo-problem. "Artificial Intelligence" is in that sense metaphysical jargon.

slurgfest · on July 25, 2012

Skinner's book Verbal Behavior was mostly unsatisfying because it didn't have a lot of data; it really just laid out a research program which had not been carried out in any significant way (and now, never will be). Of course it is also unsatisfying that Skinner does not appeal to our sense that we already understand everything important about psychology and language "from the inside" and don't really need any stinking data.

The reason most people are unsatisfied with Skinner's approach to language is that they did not read Verbal Behavior, but rather Chomsky's review; and because Chomsky chose it (as among Skinner's weakest work) and reviewed it in the most uncharitable way possible, without understanding any of the basic concepts or motivations to Skinner's approach.

So, for example, he successfully associates Skinner directly with Watson, and makes it out that "radical" behaviorism is radical not for its rejection of premises of classical behaviorism but for being even more crazy.

That review is a masterpiece of propaganda and it effectively prevented Skinner's basic ideas from even being seriously evaluated ever again.

brudgers · on July 25, 2012

Just to clarify, my reference to Skinner was to be behaviorism in general or even more generally, radical empiricism in regards to human activities.

Jun8 · on July 25, 2012

OK, let me start with two facts, one objective, one personal: (i) Noam Chomsky is a genius with many contributions to linguistics and computer science (ii) I think his overall influence had been damaging to linguistics.

Here's a summary of Chomsky's career in layman's terms: As everyone knows, Chomsky first came to prominence with his critique of Skinner (who, as everyone also knows, was a total psycho). He pretty much created linguistics as we know it (at least in the US, there were some numbskulls in Europe who still doubted the new order), starting from the main thesis of linguistic universals, which can be summarized as the fact that all humans possess the same language faculty, i.e. the wide range of linguistics differences between, say, English and Mandarin are just on the surface. This was a welcome relief against the Sapir-Whorf mumbo-jumbo which held that Eskimos had hundreds of words of snow and language constrained how we think. Chomsky has also been very active in politics (he's actually much better known to the general world by his political books), pointing out the evils especially of the American brand of capitalism (is there any other kind?) and its corrosive influence on the world, e.g. Iraq, Afghanistan, etc. He also points out errors in certain approaches in Economics, e.g. see http://en.wikiquote.org/wiki/Noam_Chomsky#Capitalism, without holding a degree in the field, but everybody does that.

Chomsky's greatly damaging influence to linguistics is due to the fact that his speculative and simplistic (at least originally) views on how the brain processes and learns language has stifled research in promising fields by decades. The main problem I have with him is that the cause of the shortcomings of his theory seems to be not lack of knowledge (very little was known about cognition in the 60s), which, of course handicaps all pioneers of science, but politics (I detest politically motivated scientific theories). AFAIK, his universalist views were motivated from his political beliefs.

Luckily, starting in the 90s, Chomsky's chokehold on linguistics has slipped somehow. Researchers, such as Leda Cosmides, have ventured into research on linguistic relativity (http://en.wikipedia.org/wiki/Linguistic_relativity). Skinner's theories are making a comeback in academic circles (http://www.theatlantic.com/magazine/archive/2012/06/the-perf...).

So, what does all this mean for the current debate? I think it's time to retire and the "old guard"! Let us acknowledge their breakthroughs, their contributions, but also their limitations and move on.

rdtsc · on July 25, 2012

> Chomsky's greatly damaging influence to linguistics is due to the fact that his speculative and simplistic ... views on how the brain processes and learns language has stifled research in promising fields by decades.

By decades. Really? I think those claims would need better substantiation. Looking back we have a 20/20 hindsight and maybe can say where we took the wrong turn. But can you point out to other budding and promising theories that Chomsky somehow squashed because of, I don't know, his great power or character.

As for politics. Can you explain better and provide some more information. So far it seems, this is just you disagreeing with his politics and then attacking his influence on linguistics as an obvious extension of his political views.

> I think it's time to retire and the "old guard"!

No. You retire, theories that have been proven wrong and bring about theories that explain the reality in a better, more complete, or universal way. You don't retire the "old guard" because you disagree with their food tastes or politics (which seems to be in this case)

forkandwait · on July 26, 2012

> ... of Skinner (who, as everyone also knows, was a total psycho)

Not so fast cowboy. Skinner's theories were far more limited than he or his disciples believed (ditto Chomsky and pretty much every psych or social theorist), but the behaviorists added a great deal to psychological understanding.

It is interesting to contrast the soft sciences (for lack of a better term) with the hard sciences and how they relate to their history. Nobody would say "Newton was an utter psycho" because his theories didn't account for thermodynamic behavior, or because his dynamics were proven to be approximate by Einstein, or because in his later years he spent a lot time analyzing word patterns in the bible. Rather, what he said that was true gets incorporated into the textbooks and we move on. Sometimes Skinnerian dynamics apply, and we should use them in those situations, and learn about them, and toss them when not appropriate. Duh.

I am not sure why the soft/social sciences are so dysfunctional in this respect.

In anthropology this is extremely evident, but they are all psychos there...

http://www.theatlantic.com/magazine/archive/2012/06/the-perf...

disgruntledphd2 · on July 26, 2012

No offence (I like most of your post), but BF Skinner was not a psycho. He was an amazing scientist and researcher. You should really read some of his books before you judge him (Beyond Freedom and Dignity is very good, and short). Skinner was totally wrong on language, yes, but he was right on a lot of other things. He essentially invented operant conditioning, which is the basis for slot machines and much of modern gamification. In addition, he's responisble for such signs as "thank you for not smoking" (that was in Beyond Freedom).

To summate, he was an extremely interesting thinker and a fine researcher (and really loved pigeons). His theories were wrong, but all theories tend towards wrong in the limit.

zzzeek · on July 25, 2012

I'm completely ignorant here - is it widely established that there's no link between how we think about things versus the languages we speak ? It seems intuitive that our ability to conceive of concepts would be dependent on having/creating language that can describe them, even in our own minds, but I have no idea how anyone could really know one way or the other.

Jun8 · on July 25, 2012

As you point out, this is a hard thing to test. Add to this the fact that the question may be a sensitive one, similar to differences between men and women.

How can one go about testing the effect of language on thinking? Consider this example: English has an explicit grammatical structure for counterfactuals (CFs), e.g. "If I were a rich man ..." whereas some languages, e.g. Chinese (Mandarin) do not (they do have some other means but not as overt). One can then think presenting stories containing complex CF situations to native English and Chinese speakers, and somehow test how quickly they grasp them. This exact experiment was performed by Alfred Bloom in 1981 and indeed showed some differences. Later researchers noted some points that might have affected the results. You can see why this research may be sensitive, it may be mistakenly used to argue that Chinese speakers are somehow linguistically deficient.

pyoung · on July 25, 2012

I am no expert in this area, but I believe this is a fairly good example of the issue you are describing. According to the study, it appears as if the gender system used in some languages appears to bias individuals perception of the object.

http://public.wsu.edu/~fournier/Teaching/psych592/Readings/G...

zzzeek · on July 25, 2012

so you're saying, it's not that there's no link, just that it's hard to pin down, and any real conclusions are inherently dangerous from a political/anthropological stance. that makes sense as I've heard from others that the field of anthropology has been in a semi-permanent state of apologism for eurocentric thought for the past thirty years.

I guess some forms of truth can't be released until society has fully prepared itself structurally for them (i.e., a society that could learn of inherent differences in thought based on language, without starting a new caste system based on that).

chj · on July 25, 2012

This is bullshit. Chinese language does not rely on "were" to make CF structure, but that doesn't mean it can't make CF statements. if you replace "were" with "am", the sentence still means the same thing.

morsch · on July 26, 2012

I think it's widely accepted in linguistics (IA kind of AL) that language influences conceptualisation (sometimes called the weak Sapir-Whorf hypothesis) but that it does not determine conceptualisation (the strong version).

See http://en.wikipedia.org/wiki/Linguistic_relativity

mcguire · on July 26, 2012

I've always had this suspicion that the "strong" form of any theory is a caricature made up by the theory's opponents.

zzzeek · on Aug 1, 2012

right, I'm familiar with sapir-whorf...but the parent comment here specifically refers to sapir-whorf as "mumbo-jumbo". Seems like the linguists still aren't keen on it.

guscost · on July 25, 2012

I think that his philosophy damaged linguistics (and science in general) rather than his theory or experiment. I think that he genuinely does not know where one should end and the other should begin, especially in his problem domain, especially in 2012.

fauigerzigerk · on July 26, 2012

I think you are wrong about Chomsky's theories being politically motivated. I don't have time right now to look up quotes but many years of debates with intellectuals on the left have made one thing very clear to me: The left hates talk of anything innate, particularly if it is even remotly connected to meaning.

I don't want to dumb this down because there are many different opinions on all sides of the political spectrum, but the left has a tendency to see everything as learned, learnable and socially induced - nurture not nature. Chomsky's theory is seen as a form of biological determinism. There are few things less popular on the left than that.

why-el · on July 25, 2012

You cant label something as a fact then proceed to say "I think". (ii) is poorly defended. You set out to outline Chomsky's long and engaging political life just so that you can say that it influenced his linguistic theory. Chomsky himself have said many times that there is no clear correlation between his linguistic work and political activism. This is not new. He has been asked this in a lot of talks.

with this said, there were attempts at bridging the creativity exhibited by linguistic behavior and that of political freedom, such as in his "Language and Freedom" book, but this happen way after he established his theory.

Jun8 · on July 25, 2012

Labeling (ii) as a fact was a bit of joke, but not overtly marked, which may lead to misunderstanding, so you are correct.

My gripe is not against Chomsky, the person, but against Chomsky-ism (although he does have a very aggressive, know-it-all tone when discussing many subjects, that I find at odds with scientific humility). This is common: A pioneer comes up with a new way of thinking, then it is accepted as dogma by the community. Although one may think otherwise, this is also very common in science. Scientific revolutions usually don't happen with paradigm shifts but through the demise of old age of the new theory's opponents.

What's shocking is that a field like linguistics, which some may argue is more scientifically rigorous than its sister humanities disciplines, operated on what, to me, are speculative assumptions for close to 40 years.

slurgfest · on July 25, 2012

I am interested in how you come to the conclusion that Skinner was a total psycho. You actually seem to suggest that Skinner was an evil, mentally ill person and I would be interested to know what evidence there ever was for that.

Jun8 · on July 25, 2012

Sorry for not being clear: The second paragraph was meant as a somewhat humorous summary what an informed non-expert would think about Chomsky and his work.

But even in academic circles his name had a bad vibe, like Anakin, who started out with so much promise but then went to the dark side. This is partly due to the huge backlash in the 70s (see page 2 of the Atlantic article).

icandoitbetter · on July 25, 2012

> I detest politically motivated scientific theories

Then you detest every scientific theory ever devised.

JabavuAdams · on July 25, 2012

This is a bizarre claim. My motivations for doing science are to understand, to create, and to stroke my own ego.

Jun8 · on July 25, 2012

If you comment was not in jest, can you elaborate on why you think so? One can argue that all scientific theories are motivated to a larger or lesser extent by the prevalent world view of their day. But arguing, e.g. the Standard Model in physics is influenced by the politics of its day (~70s) is indeed bizarre.

PaulHoule · on July 25, 2012

Well, in the big picture, Chomsky created an activity which keeps liguists very busy. His approach, however, has contributed very little to language engineering.

_delirium · on July 25, 2012

Do you mean specifically of human languages? Because Chomsky's approach has contributed pretty extensively to programming language engineering, as the foundation of parsing theory and the whole formal language hierarchy (context-free, context-sensitive, regular, etc.).

I do agree it's been less successful for its original intended purpose, but things often find new life, which seems okay.

slurgfest · on July 25, 2012

Yes: the Chomsky hierarchy is a fundamental of computer science, one of the great intellectual achievements of humanity. And in that respect also important to AI.

Chomsky is incredibly strong on anything that does not require empirical data.

But UG has no legs and Chomsky's analysis of syntax has very limited applicability and after many many IOUs, pretty much no empirical claims have panned out in any significant way.

If you take away the application of those basic computer science concepts to language, you unfortunately take away most of what Chomsky has written regarding linguistics, psychology and AI. Because of the sheer volume of output, that leaves a number of contributions. My point is that it is necessary to be discriminating rather than making Chomsky into the Pope, as certain fields have done for some time.

mcguire · on July 25, 2012

Historically, AI has been divided into two related but different approaches. "Strong" AI is interested in understanding and creating Minds; figuring out what intelligence is, how it works, how we do it, and how it could be done in general. "Weak" AI is interested in doing things that couldn't be done before; things that we do not have good algorithms for, or don't have any algorithms at all.

Those two are not opposed. Any advance on either side helps the other. In this argument, Norvig is representing an extreme version of weak AI since he seems to be arguing that it's possible that statistical methods are all there is. (I suspect that he isn't actually making that argument, though, but that strong AI's models are currently too simplistic to capture what statistical approaches can do.) Chomsky, on the other hand, seems to be caricaturing strong AI by saying that anything that doesn't directly shed light on the Grand Theory is worthless.

aidenn0 · on July 25, 2012

It's a question about engineering vs science. Before Kepler, people actually could predict the motion of the stars and planets through the sky; perhaps not as elegantly or accurately as after Kepler, but to a certain degree, so what?

The AI case is clearly a point where the theories from linguistics are insufficient for engineering purposes. Watson could not have been built today based off of Chomskian linguistics. Maybe the statistical models will advance the theory of linguistics, maybe not. Either way they will give us useful tools now which is better than elegant tools later.

frobbin · on July 25, 2012

AI research, including speech recognition and machine vision, are currently ENGINEERING disciplines trying to make artifacts that do interesting things. Success is an artifact that works.

Several basic science disciplines are trying to understand how brains work. There is mostly tremendous amounts of experimental facts, difficult to put together, and some theory and modelling to go with it.

Norvig would be confused if he thinks that engineering AI systems automatically counts as models useful for understanding the brain. If there is application to understanding brains it is a welcome accident. It happens that there are signals in basal ganglia that look like the temporal difference error signal from reinforcement learning. So maybe RL research can help understand some brain circuitry in that case.

But in general the engineers are trying to get stuff to work, and they are deluded if they think they are simultaneously making progress in understanding how brains work.

EDIT:

For example: why does speech recognition use hidden markov models and N-gram language models? Because they're the best model of how brains understand speech? No! Not at all. HMMs and N-gram models are above all computationally tractable. Easy to implement, not too slow to run.

We have algorithms (such as baum-welch and N-gram smoothing techniques) to get them work work well in engineering applications. Nothing more. Might they help us understand brains? Maybe, but not at all necessarily so.

aangjie · on July 25, 2012

Just for the record, i consider this a simple model. And it's from norvig. http://norvig.com/spell-correct.html

fat_clown · on July 25, 2012

It is an interesting debate, though I think it's being shone in the wrong light.

According to the article, it almost sounds like Chomsky believes a statistical approach to AI is a disservice to the field. The point he's missing is that research in statistical based AI is just that - statistics research.

Chomsky and Norvig deal in two different fields, which happen to have similar applications. Norvig does research in statistical and machine based learning. Success in this field comes from a new model that can make more accurate predictions, or a proof that it is impossible to make valid predictions about X with only Y as input. Applications of this field include technologies which rival AI systems as envisioned by Chomsky, but the essential point is that this field focuses on statistics research, not AI research.

Chomsky is wrong in dismissing this as a disservice. I do agree with his main point, that AI research and knowledge is not necessarily furthered by statistics research, but that is simply because they are different beasts entirely.

Maybe one day, when the biology has caught up with us and we have a solid understanding of the brain, will we be able to create a highly intelligent computer. Until then, statistics research is most likely to yield fruitful results.

slurgfest · on July 25, 2012

Chomsky sees a threat to a politics-based academic hegemony, so he responds politically. If he hadn't made so many broadly quantified claims (i.e. had stuck to syntax as Norvig sticks to AI and machine learning) then those claims would not be in such jeopardy from other fields. Chomsky has always been a top-class warrior and this is more of the same.

no_more_death · on July 26, 2012

One myth I want to debunk:

Copernicus's theory did NOT do away with epicycles. Search on Google for "copernicus epicycle" and the first article demonstrates my point. The one who did away with epicycles was Kepler. Copernicus believed orbits had to be perfectly circular; Kepler recognized that the data fit better into an elliptical model.

It's not 100% clear whether the author believed the "myth," but hopefully I can set some people straight in this forum.

mbq · on July 25, 2012

The main problem with Chomsky's approach is that it is quite likely that human intelligence mechanics are just incomprehensible for a human intelligence, and not because of some crazy construction tricks but simply plain old brute size and complexity it imposes. Judging from much simpler (thus deeper investigated) biological systems like some bacteria metabolisms we can see that there is no grand design there, only trivial primitive core and numerous layers of less or more subtle modifiers of modifiers. IMO there is no reason why the same can't work for the brain and thus the "transition to sentience" is way more continuous than we would like to expect.

stcredzero · on July 26, 2012

> If the solar system’s structure were open for debate today, AI algorithms could successfully predict the planets’ motion without ever discovering Kepler’s laws, and Google could just store all the recorded positions of the stars and planets in a giant database

I'm sorry, but this bit is half wrong and simply numerically illiterate. We can store all of the recorded positions of the planets and other bodies in the solar system, but we need models to predict their future positions. This is an important distinction, since we might use such models to save the human race one day.

sireat · on July 25, 2012

There must be some analogies made to the much smaller field of chess computer programs.

From 1950s to about 1980 or so it was thought that the best computer chess program would approximate the way a human would think about the game. Botvinnik in particular was adamant that such an approach would be the right one.

However, most of the progress was made through brute force. Modern chess programs select moves in a way that is far removed from the way a good chessplayer selects moves, yet they can now produce games that seem very "uncomputer" like and "human".

6ren · on July 26, 2012

It's true that Engineering at times leads Science. But, from a scientific view, what's the point of a model if you can't understand it? After all, we already know how to create intelligence without understanding it.

While it's conceivable that intelligence is too complex for a human to ever understand (e.g. if not amenable to hierarchical decomposition), that would be very sad news for science.

yters · on July 26, 2012

Norvig is only trivially right. Sure, with enough stats you can infer a lot of the structure of all the information we humans have created, and thus replicate the structure, as Google is doing with its suggest service. However, this does not explain how humans created the structure in the first place. Such a form of AI will forever being playing catch up to humans.

ecolak · on July 26, 2012

When Einstein heard about Quantum Mechanics and the idea that everything is a probability, he said: "God doesn't roll dice". He meant that even though Quantum Mechanics does give us many answers about the world of the tiny, it doesn't truly explain it. I believe that a similar analogy can be made to this case.

ilaksh · on July 25, 2012

I know that everyone has been careful not to mention Chomsky's political beliefs, but I am suspicious that this is actually partly about Chomsky's political beliefs, which I think are more in line with reality or at least more egalitarian than Norvig's must be, since Norvig has been running one of the hegemony's greatest tools recently. I see a parallel between the general derisive dismissal of Chomsky's academic views as being simplistic with the type of dismissal commonly given to a Chomskyish geopolitical viewpoint. I see this disagreement as a surrogate for the very different geopolitical worldviews.

I doubt that Chomsky is really so hard line about his old approaches to AI as we are led to believe, although he is probably farther behind the times than Norving.

I actually think that even Norvig is just applying recent contemporary AI to AI problems, but still is part of an old or establishment guard himself as far as AI goes. I think that the real cutting edge AI research is called AGI (artificial general intelligence) research.

The generation/category of AI research or machine learning that Norvig is tied into is much newer and steps beyond the earlier traditional AI that Chomsky might have been involved with, but the AGI researchers are a step beyond Norvig's clique. And the AGI researchers are, by the way, very optimistic about the Singularity or at least the likelihood of human-like and probably super-human artificial general intelligence in the short or medium term.

I mean the Norvigish machine-learning stuff isn't completely disconnected from the AGI stuff and completely behind and I assume it will result in extremely capable AIs relatively soon, but the AGI approaches will probably prove to be more powerful and more humanlike since they are closer to human models.

Take a look at what Brain Corporation is doing, or Numenta, or the OpenCog project. That stuff is beyond Norvig and friends' approaches.

psb · on July 25, 2012

Where is eyudkowski when we need him?

SlipperySlope · on July 25, 2012

I am a entrepreneur/researcher working to create artificial intelligence. My approach follows Turing's suggestion that one should create a child mind and proceed to educate it. I employ Construction Grammar in my English dialog system - not a statistical parser/generator. Operating on a smartphone, I use available statistical speech recognition engines to transform speech to text, but from that point onwards the server-side processing in Construction Grammar is symbolic, thus engineered from first principles. Likewise, for English generation, my discourse planner emits structured RDF that the bi-directional Construction Grammar generator transforms into a text utterance. That symbolic text is then input to an available statistical text-to-speech engine available on the smartphone, to speak to the user.

As an example of the power of symbolic approaches, my parser has a complete symbolic analysis of English auxillary verb constructions, producing unique, meaning-rich, RDF-compatible semantics for:

I am learning about computers.

We are learning about computers.

We will be learning about computers.

I could be learning about computers.

I have been learning about computers.

I better learn about computers.

I had better learn about computers.

I dare learn about computers.

I did learn about computers.

I do learn about computers.

He does learn about computers.

I had learned about computers.

He has learned about computers.

I have learned about computers.

He is learning about computers.

I need learning about computers.

I ought to learn about computers.

I ought to be learning about computers.

I used to learn about computers.

I was learning about computers.

We were learning about computers.

Because of the so-far limited success of my work, I am inclined to agree with Chomsky's AI argument despite using a modern grammar opposed to his linguistic principles.

An artificial intelligence will use both statistical techniques and symbolic, e.g. procedural techniques, I think. With the most useful intelligent behavior being symbolic. E.g. an AI designing, writing and testing software.