15 July 2006

Where are Interesting Learning Problems in NLP?

I just spent a few days visiting Yee Whye and NUS (photos to prove it!). YW and I talked about many things, but one that stood out as a ripper is attempting to answer the question: as a machine learning person (i.e., YW), what problems are there in NLP that are likely to be amenable to interesting machine learning techniques (e.g., Bayesian methods or non-parametric methods)? This was a question we tried to answer at the workshop last year, but I don't think we reached a conclusion.

At first, we talked about some potential areas, mostly focusing on problems for which one really needs to perform some sort of inference at a global scale, rather than just locally. I think that this answer is something of a pearler, but not the one I want to dwell on.

Another potential partial answer arose, which I think bears consideration: it will not be on any problem that is popular right now. Why? Well, what we as NLPers usually do these days is use simple (often heuristic) techniques to solve problems. And we're doing a sick job at it, for the well-studied tasks (translation, factoid QA, ad hoc search, parsing, tagging, etc.). The hunch is that one of the reasons such problems are so popular these days is that such techniques work so bloody well. Given this, you'd have to be a flamin' galah to try to apply something really fancy to solve one of these tasks.

This answer isn't incompatible with the original answer (globalness).  After all, most current techniques only use local information.  There is a push toward "joint inference" problems and reducing our use of pipelining, but this tends to be at a fairly weak level.

This is not to say that Bayesian techniques (or fancy machine learning more generally) are not applicable to problems with only local information, but for such problems there seems to be little need to move toward integrating large amounts of global uncertainty. Of course, you may disagree and if you do, no wuckers.

p.s., I'm in Australia for ACL, so I'm trying to practice my Aussie English.

5 comments:

Kevin said...

I can't imagine you with an Australian accent, although it sure seems you've got the lexicon down quite well. :)

Regarding interesting machine learning problems in NLP: I agree that global/joint learning is one aspect. In general, I think the neat thing about NLP is that the inputs are words and sentences, and the outputs are sentences and trees. This makes it different from the simple binary classification problem commonly explored in machine learning (so I'm talking about structured outputs here).

I think there may be an additional interesting area, but it's something I haven't seen much of: figuring out the feature space of NLP tasks. Basically, we often define some sort of vectorial representation of words/sentences based on some linguistically motivated features. Then we run some learning algorithm on the resulting Euclidean space. It may be that this representation isn't so good, since words are discrete and don't really lie in some Euclidean space. So I'm thinking of smarter feature selection/induction algorithms, something along the lines of Jun Suzuki's "Convolution Kernels with Feature Selection" paper.

I think NLP is full of interesting problems for MLers. The catch is for MLers to use a new algorithm to beat the state of the art, which is often not so easy!
