I come from a strong lineage of discourse folks. Writing a parser for Rhetorical Structure Theory was one of the first class projects I had when I was a grad student. Recently, with the release of the Penn Discourse Treebank, there has been a bit of a flurry of interest in this problem (I had some snarky comments right after ACL about this). I've also talked about why this is a hard problem, but never really about why it is an interesting problem.
My thinking about discourse has changed a lot over the years. My current thinking about it is in an "interpretation as abduction" sense. (And I sincerely hope all readers know what that means... if not, go back and read some classic papers by Jerry Hobbs.) This is a view I've been rearing for a while, but I finally started putting it into words (probably mostly Jerry's words) in a conversation at ACL with Hoifung Poon and Joseph Turian (I think it was Joseph... my memory fades quickly these days :P).
This view is that discourse is that thing that gives you an interpretation above and beyond whatever interpretations you get from a sentence. Here's a slightly refined version of the example I came up with on the fly at ACL:
- I only like traveling to Europe. So I submitted a paper to ACL.
- I only like traveling to Europe. Nevertheless, I submitted a paper to ACL.
Pretty amazing stuff, huh? Replacing a "so" with a "nevertheless" completely changes this interpretation.
What does this have to do with interpretation as abduction? Well, we're going to assume that this discourse is coherent. Given that assumption, we have to ask ourselves: in (1), what do we have to assume about the world to make this discourse coherent? The answer is that you have to assume that ACL is in Europe. And similarly for (2).
Of course, there are other things you could assume that would make this discourse coherent. For (1), you could assume that I have a rich benefactor who likes ACL submissions and will send me to Europe every time I submit something to ACL. For (2), you could assume that I didn't want my paper to get in, but I wanted a submission to get reviews, and so I submitted a crappy paper. Or something. But these fail the Occam's Razor test. Or, perhaps they are a priori simply less likely (i.e., you have to assume more to get the same result).
Interestingly, I can change the interpretation of (2), for instance, by adding a third sentence to the discourse: "I figured that it would be easy to make my way to Europe after going to Israel." Here, we would abduce that ACL is in Israel, and that I'm willing to travel to Israel on my way to Europe. For you GOFAI folks, this would be something like non-monotonic reasoning.
Whenever I talk about discourse to people who don't know much about it, I always get this nagging sense of "yes, but why do I care that you can recognize that sentence 4 is background to sentence 3, unless I want to do summarization?" I hope that this view provides some alternative answer to that question. Namely, that there's some information you can get from sentences, but there is additional information in how those sentences are glued together.
Of course, one of the big problems we have is that we have no idea how to represent sentence-level interpretations, or at least some ideas but no way to get there in the general case. In the sentence-level case, we've seen some progress recently in terms of representing semantics in a sort of substitutability manner (ala paraphrasing), which is nice because the representation is still text. One could ask if something similar might be possible at a discourse level. Obviously you could paraphrase discourse connectives, but that's missing the point. What else could you do?