15 January 2006

NLP as Glorified Memorization

The view of NLP as essentially a hunt-and-return technology has been gathering momentum since the burgeoning of the web. Example-based MT applies this view to machine translation, and phrase-based statistical MT is essentially EBMT done with statistics. In question answering (factoid style), the situation is even more dramatic. Deepak's thesis was essentially devoted to the idea that the answer to any question can be found in huge corpora by relatively simple pattern-matching. To a somewhat lesser degree, information extraction technology is something like smoothed (or backed-off) memorization, and performance is largely driven by one's ability to obtain gazetteers relevant to one's task.

Pushing such memorization technology further will doubtless lead to continued success, and there are many open research questions here. I would love others to answer these questions, but I have little interest in answering them myself. Fortunately, I think there are many interesting real-world problems for which simple memorization techniques will not work, and deeper "analysis" or "understanding" is required.

Any QA/summarization task that focuses on something other than "general world knowledge" fits into this category. I might want to ask my email client questions about past emails I've received. The answer will likely exist only once, and probably not in the form in which I ask the question. I might want to ask questions about scientific research, either from PubMed or REXA. I might want to ask about the issues involved in the election of the Canadian PM (something I know nothing about) or the confirmation hearings of Samuel Alito (something I know comparatively more about). And I would want the answers tailored to me. If I owned a large corporation or were running a campaign, I would want to know what my supporters and detractors were saying about me, and who was listening to whom.

I could be proven wrong: maybe memorization techniques can solve some or all of these problems, but I doubt it. What other problems are people interested in that may not be solvable with memorization?
