14 February 2006

Tutorial: Bayesian Techniques for NLP

I'm giving a tutorial on Bayesian methods for NLP at HLT-NAACL 2006. I gave a similar tutorial about a year ago here at ISI. This gave me a pretty good idea of what I want to keep in and what I want to cut out. The topics I intend to cover are, roughly:

  1. Bayesian paradigm: priors, posteriors, normalization, etc.
  2. Graphical models, expectation maximization, non-Bayesian inference techniques
  3. Common statistical distributions: uniform, binomial/multinomial, beta/Dirichlet
  4. Simple inference: integration, summing, Monte Carlo
  5. Advanced inference: MCMC, Laplace, variational
  6. Survey of popular models: LDA, Topics and Syntax, Words and Pictures
  7. Pointers to literature
All of this is, of course, cast in the context of NLP problems (all discrete distributions, language applications, etc.), in a way that hopefully both NLP and IR people will find interesting (maybe even some speech people, too).
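As a toy illustration of items 3 and 4 above, here is a minimal sketch in Python of conjugate beta updating for binomial data, with a Monte Carlo check of the posterior mean. The coin-flip setting and all function names are my own invention for illustration, not material from the tutorial:

```python
import random

def posterior_params(alpha, beta, heads, tails):
    """Beta(alpha, beta) prior plus binomial counts gives a Beta posterior
    (conjugacy): just add the observed counts to the prior parameters."""
    return alpha + heads, beta + tails

def sample_beta(alpha, beta, rng):
    """Draw from Beta(alpha, beta) as a ratio of two gamma draws."""
    x = rng.gammavariate(alpha, 1.0)
    y = rng.gammavariate(beta, 1.0)
    return x / (x + y)

rng = random.Random(0)

# Uniform Beta(1, 1) prior, then observe 7 heads and 3 tails.
a, b = posterior_params(1.0, 1.0, heads=7, tails=3)

# Monte Carlo estimate of the posterior mean of the coin's bias.
draws = [sample_beta(a, b, rng) for _ in range(100_000)]
mc_mean = sum(draws) / len(draws)

# The exact posterior mean is a / (a + b) = 8/12; the Monte Carlo
# estimate should land very close to it.
print(mc_mean, a / (a + b))
```

The same add-the-counts pattern is what makes the Dirichlet prior so convenient for the multinomial distributions that dominate NLP.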

Does anyone have anything they'd really like to hear that's not on the list? Or anything that's on the list that they don't care about? Keep in mind several constraints: 3 hours (minus coffee time), generally accessible, focused on NLP applications, and something I know something about. (For instance, I covered expectation propagation in the tutorial last year, but decided to cut it for this to give more time to other issues.) Note that I am also preparing a written tutorial that covers roughly the same material.

11 comments:

Kevin said...

How about a discussion of NLP areas where Bayesian methods may work? This might encourage future work in Bayesian-NLP.

Deepak said...

That is a very good idea! But how will one know about it without trying it?
It is an open problem!

hal said...

indeed a difficult question!

i have mixed feelings. i tend to think that Bayesian techniques really shine in unsupervised NLP settings --- it's just so easy to get good discriminative methods to work well for the supervised problems.

one argument made at the workshop is that Bayesian techniques will work well whenever there is insufficient data. given the nlp mantra "there's no data like more data" it would seem that this is every problem. i don't believe this. at least not once you factor in the computation issue (a perceptron is just soooo fast).

PierreD. said...

As a newbie in NLP, I would be interested in such a question: what characteristics of an NLP problem make a Bayesian approach a good idea?
But what would be much more interesting is: what characteristics make a Bayesian approach a *bad* idea?
Is it just computation time?

hal said...

i think computation time is an issue, but perhaps not the biggest one (variational EM or EP or even well-implemented collapsed Gibbs/MH are often not that much slower than vanilla EM). i think (and this is essentially the message of the tutorial) that any time you're using EM, you should consider a Bayesian model instead. especially when it is difficult to exactly specify a model structure and you would like to allow more variability (i.e., a prior probability instead of a 0/1 decision), or when the space of models is enormous in comparison to the size of the data set, it's probably worth a try.
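The "prior probability instead of a 0/1 decision" point can be seen in a minimal sketch (my own illustration, not from the tutorial): the posterior mean under a symmetric Dirichlet prior gives unseen events nonzero probability, where the maximum likelihood estimate assigns them exactly zero. The tiny vocabulary and counts below are made up:

```python
from collections import Counter

def mle(counts, vocab):
    """Maximum likelihood unigram estimate: unseen words get probability 0."""
    total = sum(counts.values())
    return {w: counts.get(w, 0) / total for w in vocab}

def dirichlet_posterior_mean(counts, vocab, alpha=0.5):
    """Posterior mean under a symmetric Dirichlet(alpha) prior:
    every word, seen or not, gets some probability mass."""
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts.get(w, 0) + alpha) / total for w in vocab}

vocab = ["the", "cat", "sat", "mat"]
counts = Counter(["the", "cat", "the"])

p_mle = mle(counts, vocab)
p_bayes = dirichlet_posterior_mean(counts, vocab)

print(p_mle["mat"])    # 0.0: MLE makes a hard 0/1 decision about "mat"
print(p_bayes["mat"])  # (0 + 0.5) / (3 + 0.5*4) = 0.1: the prior keeps it alive
```

Choosing the posterior mean here is just the simplest Bayesian quantity to compute; a full treatment would keep the whole Dirichlet posterior rather than a point estimate.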

delip said...

Could you please share the slides of your HLT/NAACL tutorial?

Anonymous said...

A little knowledge is dangerous! Statistical NLP is like statistical cognitive neuroscience: both will achieve nothing, because the problem at hand is way beyond a couple of formulae.