- Bayesian paradigm: priors, posteriors, normalization, etc.
- Graphical models, expectation maximization, non-Bayesian inference techniques
- Common statistical distributions: uniform, binomial/multinomial, beta/Dirichlet
- Simple inference: integration, summation, Monte Carlo (see the sketch after this list)
- Advanced inference: MCMC, Laplace approximation, variational methods
- Survey of popular models: LDA, Topics and Syntax, Words and Pictures
- Pointers to literature
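To make the first few bullets concrete, here is a minimal sketch (an illustration of mine, not material from the tutorial itself): a conjugate beta/binomial posterior update done in closed form, then the same posterior mean recovered by simple Monte Carlo. The coin-flip counts and the Beta(2, 2) prior are made-up toy values.

```python
# A minimal sketch of the beta/binomial pairing in the topic list:
# a conjugate prior update done analytically, then the same posterior
# mean re-estimated by simple Monte Carlo.
import random

# Hypothetical data: 7 heads out of 10 coin flips.
heads, flips = 7, 10
a, b = 2.0, 2.0          # Beta(2, 2) prior on the coin's bias

# Conjugacy: Beta(a, b) prior + binomial likelihood
# -> Beta(a + heads, b + flips - heads) posterior, in closed form.
post_a, post_b = a + heads, b + (flips - heads)
analytic_mean = post_a / (post_a + post_b)

# "Simple inference" by Monte Carlo: sample from the prior, weight each
# sample by its likelihood, and take the weighted average (importance
# sampling with the prior as the proposal).
num_samples = 100_000
total_w, total_wx = 0.0, 0.0
for _ in range(num_samples):
    theta = random.betavariate(a, b)                     # draw from the prior
    w = theta ** heads * (1 - theta) ** (flips - heads)  # likelihood weight
    total_w += w
    total_wx += w * theta
mc_mean = total_wx / total_w

print(f"analytic posterior mean: {analytic_mean:.4f}")
print(f"Monte Carlo estimate:    {mc_mean:.4f}")
```

With enough samples the two numbers agree, which is the point of the "simple inference" bullet: when conjugacy gives a closed form, Monte Carlo is only a sanity check; when it does not, sampling is what you have.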
Does anyone have anything they'd really like to hear that's not on the list? Or anything that's on the list that they don't care about? Keep in mind several constraints: 3 hours (minus coffee time), generally accessible, focused on NLP applications, and something I know something about. (For instance, I covered expectation propagation in the tutorial last year, but decided to cut it for this to give more time to other issues.) Note that I am also preparing a written tutorial that covers roughly the same material.
How about a discussion of NLP areas where Bayesian methods may work? This might encourage future work in Bayesian-NLP.
indeed a difficult question!
i have mixed feelings. i tend to think that Bayesian techniques really shine in unsupervised NLP settings --- it's just so easy to get good discriminative methods to work well for the supervised problems.
one argument made at the workshop is that Bayesian techniques will work well whenever there is insufficient data. given the nlp mantra "there's no data like more data" it would seem that this is every problem. i don't believe this. at least not once you factor in the computation issue (a perceptron is just soooo fast).
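for concreteness, here is a minimal sketch of the perceptron update being alluded to (the toy features and labels are made up): one pass over the data, and each mistake costs a single sparse additive update, with no normalization constants or sampling anywhere.

```python
# A minimal perceptron sketch: one training pass, where each mistake
# triggers one sparse additive update -- which is why it is so fast.
def perceptron_epoch(examples, w):
    """One pass. examples: list of (feature dict, label in {-1, +1})."""
    for feats, y in examples:
        score = sum(w.get(f, 0.0) * v for f, v in feats.items())
        if y * score <= 0:                    # mistake: update weights
            for f, v in feats.items():
                w[f] = w.get(f, 0.0) + y * v  # touch only active features
    return w

# Hypothetical sentiment-flavored examples with sparse features.
data = [({"good": 1.0, "movie": 1.0}, +1),
        ({"bad": 1.0, "movie": 1.0}, -1)]
weights = perceptron_epoch(data, {})
print(weights)
```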
As a newbie in NLP, I would be interested in such a question: what characteristics of an NLP problem make a Bayesian approach a good idea?
But what would be much more interesting is: what characteristics make a Bayesian approach a *bad* idea?
Is it just computation time?
i think computation time is an issue, but perhaps not the biggest one (variational EM or EP or even well-implemented collapsed Gibbs/MH are often not that much slower than vanilla EM). i think (and this is essentially the message of the tutorial) that any time you're using EM, you should consider a Bayesian model instead. especially when it is difficult to specify a model structure exactly and you would like to allow more variability (i.e., a prior probability instead of a 0/1 decision), or when the space of models is enormous in comparison to the size of the data set, it's probably worth a try.
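as a toy illustration of the "prior instead of a 0/1 decision" point (my sketch, not tutorial material): maximum likelihood zeroes out anything unseen, while the posterior mean under a symmetric Dirichlet prior keeps unseen events alive via pseudocounts. the corpus, vocabulary, and alpha = 0.5 below are all made up.

```python
# MLE vs. Dirichlet posterior mean for a tiny unigram model:
# MLE assigns probability 0 to unseen words; the posterior mean under
# a symmetric Dirichlet(alpha) prior smooths them with pseudocounts.
from collections import Counter

corpus = "the cat sat on the mat".split()          # hypothetical tiny corpus
vocab = ["the", "cat", "sat", "on", "mat", "dog"]  # "dog" is unseen
counts = Counter(corpus)
n = len(corpus)

alpha = 0.5  # symmetric Dirichlet concentration (an assumed toy value)
for word in vocab:
    mle = counts[word] / n
    bayes = (counts[word] + alpha) / (n + alpha * len(vocab))
    print(f"{word:>4}  MLE={mle:.3f}  posterior mean={bayes:.3f}")
```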
Could you please share the slides of your HLT/NAACL tutorial?
Little knowledge is dangerous! Statistical NLP is like statistical cognitive neuroscience: both will achieve nothing, because the problem at hand is way beyond a couple of formulae.