18 August 2006

Change of Notation

Ed Hovy has a particular label he assigns to a large variety of work in NLP: it performs a change of notation. The canonical example is POS tagging, where we are changing from one notation (words) to another notation (POS tags). He also applies this notion to most forms of parsing (though this is a bit more like adding notation), machine translation (one notation is Arabic, the other English), etc. My impression (though it's sometimes hard to tell where Ed really stands on such things) is that he considers solving a task with a change-of-notation technique to be suboptimal. I never fully grasped what Ed was trying to say specifically until recently (and I make no promises I didn't get it wrong, but this is my interpretation).

I think what Ed means by change of notation is that what is being done is pure symbol manipulation. That is, we have a bunch of symbols (typically words) and we just push them around and change them, but the symbols don't actually "mean" anything. There's no sense that the symbol "chair" is grounded to an actual chair. Or, as a logician might say, the manipulation is purely syntactic, not semantic. There is no model in which statements are evaluated. My interpretation is that by "notation" Ed simply means "syntax" (the mathematical notion of syntax, not the linguistic one).
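A minimal sketch of what "pure symbol manipulation" could look like, assuming a hypothetical toy lexicon (the names `TOY_LEXICON`, `change_notation`, and `opaque_renaming` are made up for illustration and are not from Ed or from any real tagger). The point it tries to show: if every word is consistently renamed to an opaque identifier, the tagger produces exactly the same output, so nothing in it depends on what "chair" refers to.

```python
# Hypothetical toy lexicon: word symbols mapped to POS-tag symbols.
TOY_LEXICON = {
    "the": "DT",
    "chair": "NN",
    "is": "VBZ",
    "red": "JJ",
}


def change_notation(words, lexicon=TOY_LEXICON, unknown="UNK"):
    """Rewrite a sequence of word symbols as a sequence of tag symbols."""
    return [lexicon.get(w.lower(), unknown) for w in words]


def opaque_renaming(vocabulary):
    """Assign each word an arbitrary identifier such as 'w0', 'w1', ..."""
    return {word: f"w{i}" for i, word in enumerate(sorted(vocabulary))}


words = ["The", "chair", "is", "red"]
tags = change_notation(words)

# Rename every symbol consistently, in both the lexicon and the input.
renaming = opaque_renaming(TOY_LEXICON)
renamed_lexicon = {renaming[w]: tag for w, tag in TOY_LEXICON.items()}
renamed_words = [renaming[w.lower()] for w in words]

# The output is unchanged: the mapping never consulted anything beyond
# the identity of the symbols themselves.
assert change_notation(renamed_words, renamed_lexicon) == tags
```

Real taggers are statistical rather than table lookups, but the same renaming argument would seem to apply to any system whose only contact with the world is its training data.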

On the other hand, it's hard not to say that everything we do, by definition, must be symbol manipulation. Computer science is all about playing with symbols. There may be a sense in which the word "chair" is grounded to an actual chair, but this actual chair, once inside the computer, will again be a symbol!

Despite this, I think that what one can think about is something like "how large is the connection between my model and the real world?" For most applications, the connection is probably limited to the specific form of training data. This is probably as close to the smallest connection possible. Sometimes people attempt to integrate ontological information, either handcrafted (via something like WordNet) or automatically derived. This is, in some sense, attempting to open the "program to world" pipe a bit more. And there is some work that explicitly focuses on learning grounding, but I'm not aware of many actual uses for this yet. Additionally, some applications cannot help but be tied more closely to the real world (e.g., robot navigation). But for the majority of tasks, the connection is quite small.

4 comments:

Anonymous said...

A Turing machine is a "change of notation" machine, and therefore, so is any program run on it. Skeptics, like Searle, then conclude that with "just" symbol manipulation, there's no possibility of meaning. The argument is that there's no way for "aboutness" to be inside the system.

Take a philosophy of mind class and spend countless hours discussing this issue and its brethren. The two-part book "Philosophy of Psychology" is a good, rigorous place to start. It includes the paper that made Chomsky famous, his review of Skinner's Verbal Behavior.

hal said...

I completely agree about the Turing machine issue. I'm not sure that Ed intends to make a Searlean argument, though I could be wrong. I suppose once you start talking about meaning, you almost have to start worrying about these things. Maybe I'm just of a more practical nature, but I think the "connection to real world" is the important characteristic. Whether you claim "meaning" or not, you certainly can have different levels of knowledge that go into a system. I think a major complaint made about "change of notation" systems is that typically they are almost exclusively lexically driven and don't use any extra knowledge (one could argue this is a good thing, though).

Incidentally, it seems to me that the real world is exactly what makes all of AI hard. CS disciplines that avoid the real world and work entirely within the machine (e.g., prog lang, compilers, thm provers, theory, db, etc.) tend to be quite successful. Those that attempt to interact with the real world in some way (speech, NLP, vision, robotics, etc.) all seem to be much harder, presumably because at some level, they're all about representations and models over which we do not have full control. (Interestingly, I almost added "graphics" to the first list, but subsequently removed it because in a sense graphics research interacts with our eyes and tries to make things that look real [or not] and this is often very difficult.)

Anonymous said...

Hi,

There is a closely related issue of whether syntactic regularities themselves tell us something about meaning. Arguably, syntax is there to facilitate expression of meaning, so it is organised in a convenient way -- for example, clustering verbs according to their subcategorisation patterns yields clusters of verbs that share elements of 'meaning'. The quotes are there because the system itself does not know that this is so, but the human observer does. The same goes for POS tagging: adjectives are an interesting class structure-wise because they have some common properties of meaning (typically, that is).

I guess this amounts to saying that there are better and worse symbol manipulations, and the best ones tell us something interesting about meaning as we humans understand it.

Another comment on 'mere' symbol manipulations: it could well be that it is too early to judge the hypothesis of underdetermination of meaning by structure, since we are not yet fully aware of the layers of structure that are out there. One nice example is 'information structure', the encoding of what is supposed to be new/old to the listener in the discourse. This is done by quantification and word order, but actually tells us something about the writer's assumptions regarding the reader's knowledge. There might be other 'semantic' things realized in subtle surface patterns which we haven't explicated so far.

So, it is theoretically possible that when constraints from multiple levels of structure are combined regarding a particular piece, we have effectively specified its meaning well enough for whatever the system is trying to do with it (like translating).
