I hope the students have found this exercise useful. It gets you thinking about language in a way that papers from the 2000s typically do not. It brings up a bunch of issues that we no longer think about frequently. Like language. (Joking.) (Sort of.)
One thing that's really stuck out for me is how much "old school" NLP comes across essentially as a study of representations. Perhaps this is a result of the fact that AI -- as a field -- was (and, to some degree, still is) enamored with knowledge representation problems. To be more concrete, let's look at a few examples. It's already been a while since I read these last (I had meant to write this post during the spring when things were fresh in my head), so please forgive me if I goof a few things up.
I'll start with one I know well: Mann and Thompson's rhetorical structure theory paper from 1988. This is basically "the" RST paper. I think that when many people think of RST, they think of it as a list of ways that sentences can be organized into hierarchies. E.g., this sentence provides background for that one, and together they argue in favor of yet a third. But this isn't really where RST begins. It begins by trying to understand the communicative role of text structure. That is, when I write, I am trying to communicate something. Everything that I write (if I'm writing "well") is toward that end. For instance, in this post, I'm trying to communicate that old school NLP views representation as the heart of the issue. This current paragraph is supporting that claim by providing a concrete example, which I am using to try to convince you of my claim.
As a more detailed example, take the "Evidence" relation from RST. M+T have the following characterization of "Evidence." Herein, "N" is the nucleus of the relation, "S" is the satellite (think of these as sentences), "R" is the reader and "W" is the writer:
relation name: Evidence
constraints on N: R might not believe N to a degree satisfactory to W
constraints on S: R believes S or will find it credible
constraints on N+S: R's comprehending S increases R's belief of N
the effect: R's belief of N is increased
locus of effect: N

The central claim of the RST paper is that one can think of texts as being organized into elementary discourse units, and these are connected into a tree structure by relations like the one above. (Or at least this is my reading of it.) That is, they have laid out a representation of text and claimed that this is how texts get put together.

This is a totally different way of thinking about things than I think we see nowadays. I kind of liken it to how I tell students not to program. If you're implementing something moderately complex (say, the forward/backward algorithm), first write down all the math, then start implementing. Don't start implementing first. I think nowadays (and sure, I'm guilty!) we see a lot of implementing without the math. Or rather, with plenty of math, but without a representational model of what it is that we're studying.
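To make the flavor of this representation concrete, here is a minimal sketch (my own, not code from the paper) of how an Evidence relation over two discourse units could be written down as a data structure. The names `EDU` and `Relation`, the `effect` helper, and the example text are all assumptions for illustration, not anything M+T propose.

```python
# A sketch of an RST-style analysis: EDUs as leaves, relations as internal
# nodes joining a nucleus and a satellite. Names are my own, not the paper's.
from dataclasses import dataclass
from typing import Union

@dataclass
class EDU:
    """An elementary discourse unit -- roughly, a clause."""
    text: str

@dataclass
class Relation:
    """A mononuclear RST relation with a nucleus and a satellite."""
    name: str                            # e.g. "Evidence", "Background"
    nucleus: Union[EDU, "Relation"]
    satellite: Union[EDU, "Relation"]

# The claim this post is making (nucleus) plus the example offered in
# support of it (satellite), analyzed as an Evidence relation.
analysis = Relation(
    name="Evidence",
    nucleus=EDU("Old school NLP treats representation as the heart of the issue."),
    satellite=EDU("RST starts from the communicative role of text structure."),
)

def effect(rel: Relation) -> str:
    """Paraphrase the intended effect on the reader, per the schema above."""
    if rel.name == "Evidence":
        return f"R's belief of N is increased: {rel.nucleus.text!r}"
    return "unknown effect"

if __name__ == "__main__":
    print(effect(analysis))
```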
As a second example (this will be shorter), take Wendy Lehnert's 1981 paper, "Plot units and narrative summarization." Here, the story is about how stories get put together. The most interesting thing about the plot units model to me is that it breaks from how one might naturally think about stories. That is, I would naively think of a story as a series of events. The claim that Lehnert makes is that this is not the right way to think about it. Rather, we should think about stories as sequences of affect states. Effectively, an affect state is how a character is feeling at any time. (This isn't quite right, but it's close enough.) For example, Lehnert presents the following story:
When John tried to start his car this morning, it wouldn't turn over. He asked his neighbor Paul for help. Paul did something to the carburetor and got it going. John thanked Paul and drove to work.

The representation put forward for this story is something like: (1) negative-for-John (the car won't start), which leads to (2) motivation-for-John (to get it started), which leads to (3) positive-for-John (it's started), which then links back and resolves (1). You can also analyze the story from Paul's perspective, and then add links that go between the two characters showing how things interact. The rest of the paper describes how these relations work, and how they can be put together into more complex event sequences (such as "promised request bungled"). Again, a high-level representation of how stories work from the perspective of the characters.
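And here is a similarly hedged sketch of John's side of the car story encoded as affect states and links. The class names and the link labels ("motivation", "actualization", "resolution") are my own shorthand for the analysis above, not Lehnert's exact notation.

```python
# A sketch of a plot-units-style analysis: affect states per character,
# connected by causal links. Labels are my shorthand, not Lehnert's notation.
from dataclasses import dataclass

@dataclass
class AffectState:
    character: str
    kind: str          # "+" positive, "-" negative, "M" mental/motivation
    description: str

@dataclass
class Link:
    kind: str          # e.g. "motivation", "actualization", "resolution"
    source: AffectState
    target: AffectState

# John's side of the story.
car_wont_start = AffectState("John", "-", "the car won't turn over")
wants_it_fixed = AffectState("John", "M", "wants to get the car started")
car_started    = AffectState("John", "+", "the car is running")

links = [
    Link("motivation",    car_wont_start, wants_it_fixed),
    Link("actualization", wants_it_fixed, car_started),
    Link("resolution",    car_started,    car_wont_start),  # (3) resolves (1)
]

for link in links:
    print(f"{link.source.kind} --{link.kind}--> {link.target.kind}: "
          f"{link.source.description} -> {link.target.description}")
```

One could extend the same structures with Paul's states and cross-character links, which is roughly how the more complex plot units get built up.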
So now I, W, hope that you, R, have an increased belief in the title of the post.
Why do I think this is interesting? Because at this point, we know a lot about how to deal with structure in language. From a machine learning perspective, if you give me a structure and some data (and some features!), I will learn something. It can even be unsupervised if it makes you feel better. So in a sense, I think we're getting to a point where we can go back, look at some really hard problems, use the deep linguistic insights from two decades (or more) ago, and start taking a crack at things that are really deep. Of course, features are a big problem; as a very wise man once said to me: "Language is hard. The fact that statistical association mining at the word level made it appear easy for the past decade doesn't alter the basic truth. :-)." We've got many of the ingredients to start making progress, but it's not going to be easy!
I agree. In a recent paper, I took some old-school AI from 1989 (The Structure Mapping Engine) and updated it with an unsupervised learning corpus-based feature-vector approach (The Latent Relation Mapping Engine). I think we're going to see a lot of work of this type in the coming decade.
I'd be interested in seeing your full list of topics.
This is interesting. Thanks for the post, Hal! I would love to go back and draw inspiration from the old classics. But the question for me is: "Where do I start?" I can barely keep up with all the recent proceedings. If there is a system like WhatToSee that can suggest older papers using current papers as "queries", that might be interesting! We might need some way to map new vocabularies to old ones, like Wang/McCallum's dynamic topic model evolving through time.
I would be interested in seeing the whole list of topics as well!
Here's the full list...
[Bahl et al., 1983] L.R. Bahl, F. Jelinek and R.L. Mercer. "A Maximum Likelihood Approach to Continuous Speech Recognition." IEEE Transactions on Pattern Analysis and Machine Intelligence.
[Charniak, 1983] Eugene Charniak. Passing Markers: A Theory of Contextual Influence in Language Comprehension, Cognitive Science, 7, pp. 171-190.
[Charniak, 1973] Jack and Janet in Search of a Theory of Knowledge. In Proceedings of the International Joint Conference on Artificial Intelligence (1973) + [Charniak, 1977] Eugene Charniak. Ms. Malaprop, A Language Comprehension Program. In Proceedings of the International Joint Conference on Artificial Intelligence (1977). (Each is short so let's cover both papers.)
[Cohen et al. 1982] Philip R. Cohen, C. Raymond Perrault, and James F. Allen. Beyond Question Answering. Strategies for Natural Language Processing, pp. 245-274.
[Grosz, Joshi, and Weinstein, 1995]. Centering: A Framework for Modeling the Local Coherence of Discourse. Computational Linguistics, 21 (2), pp. 203-226.
[Grosz and Sidner, 1986]. Attention, Intention, and the Structure of Discourse. Computational Linguistics, 12 (3), pp. 175-204, 1986.
[Hobbs et al., 1993]. Interpretation as Abduction. Artificial Intelligence, vol 63. pp. 69-142.
[Hobbs, 1979] Jerry Hobbs. Coherence and Coreference, Cognitive Science 3(1), pp. 67-90.
[Hovy, 1988] Hovy, E.H. 1988. Planning Coherent Multisentential Text. Proceedings of 26th ACL Conference. Buffalo, NY.
[Karttunen, 1969] Lauri Karttunen. 1969. Pronouns and variables. In CLS 5: Proceedings of the Fifth Regional Meeting, pages 108-116, Chicago, Illinois. Chicago Linguistic Society.
[Kay, 1986] Martin Kay. Parsing in functional unification grammar.
[Lakoff & Johnson, 1980] George Lakoff and Mark Johnson. Metaphors We Live By, Chapters 1-4. (short - a total of 21 pages).
[Lehnert, 1981] Wendy G. Lehnert. Plot units and narrative summarization. Cognitive Science, Volume 5, Issue 4, October-December 1981, Pages 293-331
[Lehnert, 1977] Wendy Lehnert. Human and Computational Question Answering. Cognitive Science, Vol. 1, No. 1, pp. 47-73.
[Mann and Thompson, 1988]. Rhetorical Structure Theory: Toward a functional theory of text organization. Text 8 (3), pp. 243-281, 1988.
[Martin et al., 1986] P. Martin, D. Appelt and F. Pereira. Transportability and generality in a natural-language interface system.
[McKeown 1986] Kathleen McKeown. Discourse strategies for generating natural-language text.
[Rosch and Mervis, 1975] Eleanor Rosch and Carolyn B. Mervis. Family Resemblances: Studies in the Internal Structure of Categories, Cognitive Psychology, 7, 573-605.
[Schank, 1986] Roger Schank. Language and memory.
[Schubert and Pelletier, 1986] L Schubert and F J Pelletier. From English to logic: context-free computation of "conventional" logical translations.
[Wilks, 1975] Yorick Wilks. An Intelligent Analyzer and Understander of English, CACM 1975.
[Woods, 1986] W.A. Woods. Semantics and quantification in natural language question answering.
Another self-serving comment by Mr. Web 2.0
Peter Turney, isn't a blog enough already?
Good morning, I didn't find a different way to write to you. NLP has recently bitten me, and I don't know how to start learning about it. Where do I get information for beginners? What should I learn first? Are there some books or web sites? I'd appreciate your help.
Thanks
Ciro Castro
(ciadcara@gmail.com)
I frequently start students who are new to NLP with the following:
Turing, A.M. (1950). Computing machinery and intelligence. Mind, 59, 433-460.
Weizenbaum, Joseph (January 1966), "ELIZA - A Computer Program For the Study of Natural Language Communication Between Man And Machine", Communications of the ACM 9 (1): 36–45.
For as famous as it is, the Turing paper often seems to go unread, and it has a lot of wrinkles that go beyond the popular understanding of the Turing Test (for example, clairvoyance is discussed in a serious way :)
The Eliza paper is interesting on many levels having to do with dialogue processing and our expectations of conversations (based on who we are talking with), but I also like to use it as an early example of published source code leading to many many reasonably accurate re-implementations.
Anyway, I have found these papers are often pretty motivating for students who may not have thought much about NLP. In the past I've started with more technical fare (POS tagging, word sense disambiguation and the like) but sometimes that just doesn't really fire up someone new-ish in the area.
With students interested in word sense disambiguation or related topics, I sometimes refer them to Karen Spärck Jones's book "Synonymy and Semantic Classification", published in 1986 but really written in about 1964 as her PhD dissertation.
You can see a review of this book (which includes publication details) here.
The Spärck Jones work strikes me as remarkable given that it takes on issues like clustering and finding semantic relations in an era of teeny-tiny punched card computers - there's something both humbling and motivating about that.
In any case, bravo for taking the longer view!
Cordially,
Ted
As a researcher in Natural Language Processing I have found this blog informative.
Regards
Robin
@Peter, I also went through that Falkenhainer-Forbus-Gentner paper. I would be interested in seeing the full text of your Latent Relation Mapping Engine paper. I am a college sophomore with a dual major in Physics and Mathematics at the University of California, Santa Barbara.
ReplyDeleteyeah, very good!
ReplyDeleteJust statistic is crazy for nlp.