17 September 2006

Statistical NLP is not NLP but just Statistics?

bact' brings up an interesting point, perhaps more provocative than my original (intended-to-be-provocative) pseudo-question. To quote, he says:

and some also said,
statistical natural language processing is not language processing at all, only statistics :P

My impression is that this sentence is true only if you insist that what goes on inside the black box of statistical NLP must somehow explain what goes on inside our heads.  I see it as essentially parallel to the argument against "neural-style" machine learning.  Some neural networks people used to claim (some still do, I hear) that what happens in an artificial neural net is essentially the same as what goes on in our minds.  My impression (though this is now outside what I really know for sure) is that most cognitive scientists would strongly disagree with this claim.  I get the sense that the majority of people who use NNets in practice use them because they work well, not out of some desire to mimic what goes on in our heads.

I feel the same is probably true for most statistical NLP.  I don't know of anyone who would claim that when people parse sentences they do chart parsing (though I know some people claim that something more along the lines of incremental parsing actually does happen, and this seems somewhat plausible to me).  Or that when people translate sentences they apply IBM Model 4 :).

On the other hand, the alternative to statistical NLP is essentially rule-based NLP.  I have an equally hard time believing that we behave simply as rule-processing machines when parsing or translating, and that we efficiently store and search through millions of rules in order to do so.  In fact, I think I have a harder time believing this than believing the Model 4 story :P.

Taking a step back, it seems that there are several goals one can have when dealing with language on a computer.  One can be trying to carry out tasks that have to do with language, which I typically refer to as NLP.  Alternatively, one can be trying to model how humans work with language.  I would probably call this CogNLP or something like that.  One could instead try to use computers and language data to uncover "truths" about language.  This is typically considered computational linguistics (CL).  I don't think any of these goals is a priori better than the others, but they are very different.  My general feeling is that NLPers cannot solve all problems, CogNLPers don't really know what goes on in our minds and CLers are a long way from understanding how language functions.  Given this, I think it's usually best to confine a particular piece of work to one of the fields, since trying to solve two or three at a time is likely to be all but impossible.


Anonymous said...

On the other hand, the alternative to statistical NLP is essentially rule-based NLP.

This statement seems disingenuous. There really seems to be a continuum of approaches that use more and more linguistic information. Take, for example, the famous machine translation pyramid. At the bottom, MT approaches using a bag-of-words model or low-order HMMs really do seem to be "only statistics", while higher up, approaches such as syntax-based MT or interlingua-based MT may involve statistics but try to use actual features of natural language.

hal said...

Hrm, I guess this is up to how you interpret the original statement. "Modern" syntactic MT (à la David Chiang's Hiero, Dekai Wu's ITG, Dan Melamed's group's work and the work from ISI) I would consider statistical NLP, in exactly the same sense that bag-of-words or phrase-based models are statistical NLP. I would also consider things like example-based MT (EBMT) to be a form of statistical NLP (albeit one that doesn't use probabilities). On the other hand, I would consider things like the KANT project from CMU (interlingua MT) to not be statistical NLP.

If by statistical NLP you mean bag-of-words NLP, then yes, I would agree, that's not really language processing. But there's a lot more you can do with statistics than just HMMs.

Ignacio Nicolás Rodríguez said...

I don't think any nonstatistical (or statistical, then complemented by something else) method tries to be an imitation of the human process; I believe it's about trading words and the structures they build as "meaning:" that is, not a thing by itself but a pointer.

It's like statistical processing revolves around the pointers while other approaches can go investigate what's pointed to.

hal said...

Ignacio -- thanks for the comment...I think you're right that I was wrong about the goal. I agree that the typical goal is to get at "meaning." But I guess I don't see why the statistical/non-statistical debate has anything to do with this. My knowledge (which is admittedly slim) of non-statistical methods is that they too "just" push symbols around. I guess I don't see how the non-statistical symbols are any less pointers than the statistical ones.
