21 July 2008

ICML/UAI/COLT 2008 Retrospective

I know it's a bit delayed, but better late than never, eh? I have to say: I thought ICML/UAI/COLT this year was fantastic. The organizers all did a great job. I can't possibly list everything I liked, but here are some highlights:

The tutorials were incredible. I'm reaching that point where I often skip out on tutorials because there's nothing new enough. This time, there was an embarrassment of riches: I couldn't go to everything I wanted to. In the end, I went to the Smola+Gretton+Fukumizu tutorial on distributional embeddings and the Krause+Guestrin tutorial on submodularity. Both were fantastic. My only complaint is that while all the plenary talks at ICML/UAI/COLT were videotaped, the tutorials were not! Anyway, I will blog separately about both of these tutorials in the near future; also check out their web pages: the slides are probably reasonably self-explanatory.

The first day began with an invited talk by John Winn, who does (at least used to do) probabilistic modeling for computer vision. I'm not sure if this was the intended take-away message, but the thing I liked best about the talk is that he listed about a dozen things that you would have to model, were you to model "vision." He then talked about problems and techniques in terms of which of these you model, which of them you fix, and which of them you treat as "noise." I think this is a great way to think about modeling any kind of complex real world phenomenon (hint hint!).

My favorite morning talk was Beam Sampling for the Infinite Hidden Markov Model by Jurgen Van Gael, Yunus Saatci, Yee Whye Teh, and Zoubin Ghahramani. This is a really clever use of slice sampling, which everyone should know about! The idea is to resample entire sequences of hidden states in one go. The trouble with this in an infinite model is that I don't know how to do forward-backward over an infinite state space. The trick is to slice the probability mass on the lattice and only sample over those state/observation pairs that stick out above the slice. This is effectively a fully Bayesian way to do beam-like methods, and is certainly not limited to infinite HMMs.
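
If it helps make the trick concrete, here is a rough sketch (mine, not the authors') of one resampling sweep over a truncated state space; in the real infinite model the set of instantiated states grows and shrinks with the slices, which I'm glossing over here:

```python
import numpy as np

def beam_sample_states(states, obs_lik, trans, init, rng):
    """One slice-truncated forward-filtering backward-sampling sweep.

    states  : current state sequence (the previous sample), length T
    obs_lik : obs_lik[t, k] = p(y_t | s_t = k) over a truncated set of K states
    trans   : trans[j, k] = p(s_t = k | s_{t-1} = j)
    init    : initial state distribution
    """
    T, K = obs_lik.shape

    # 1. Sample a slice variable at each step under the current trajectory.
    u = np.empty(T)
    for t in range(T):
        p = init[states[0]] if t == 0 else trans[states[t - 1], states[t]]
        u[t] = rng.uniform(0.0, p)

    # 2. Forward filtering: only transitions whose probability clears the
    #    slice survive, so every sum is over a finite set of states.
    alpha = np.zeros((T, K))
    alpha[0] = (init > u[0]) * obs_lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = ((trans > u[t]).T @ alpha[t - 1]) * obs_lik[t]
        alpha[t] /= alpha[t].sum()

    # 3. Backward-sample a fresh state sequence in one go.
    new_states = np.empty(T, dtype=int)
    new_states[-1] = rng.choice(K, p=alpha[-1])
    for t in range(T - 2, -1, -1):
        w = alpha[t] * (trans[:, new_states[t + 1]] > u[t + 1])
        new_states[t] = rng.choice(K, p=w / w.sum())
    return new_states
```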

That afternoon, there was the joint award session, with a "ten year" award (great idea!) going to Blum and Mitchell's original co-training paper from COLT 1998. This is clearly a very influential paper, and one I will blog about separately. I particularly liked two of the best paper award recipients:

  1. Percy Liang and Michael Jordan for An Asymptotic Analysis of Generative, Discriminative, and Pseudolikelihood Estimators. The idea here is to look at what various estimators are like in the limit of infinite data. The conclusion (as I read it) is that if your model is correct (i.e., the true distribution is realized by a certain set of parameters in your model family), then you're asymptotically better off using generative models. If the model is even a tiny tiny bit incorrect, then you're better off using discriminative. The key weakness of this analysis seems to be that, since we're looking only at asymptotics, there's no notion of a "sortof-wrong" model -- the wrongness gets blown up in the limit.
  2. Shai Shalev-Shwartz and Nati Srebro for SVM Optimization: Inverse Dependence on Training Set Size. The idea here is that having lots of data should be a good thing, not something that makes us say "ugh, now my SVM will take forever to run." The key to getting an inverse dependence on the size of the training set is to sort of change the problem. The idea is that if you have 1,000 training points, you may get 88% accuracy and it may take an hour. If you have 1,000,000 training points, we're not going to ask for something with 90% accuracy, but rather still aim for 88% accuracy and try to get there faster. The analysis is based on the idea that if we have lots more data, we can be a bit looser about finding the exact best model parameters, which is what usually takes so long. We can afford to have not exactly the best parameters, since we're only going for 88% accuracy, not 90% (a toy sketch of this "aim for 88%, just get there faster" reading follows this list).
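
Here is the toy sketch promised above: plain Pegasos-style SGD for a linear SVM that quits as soon as it hits a target training accuracy, rather than optimizing to the bitter end. The function and its parameters are mine, purely to illustrate the "be looser with optimization when data is plentiful" reading; it is not the analysis from the paper.

```python
import numpy as np

def pegasos_until_target(X, y, lam=0.01, target_acc=0.88, max_iters=1_000_000, seed=0):
    """Pegasos-style SGD that stops once training accuracy reaches target_acc."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, max_iters + 1):
        i = rng.integers(n)
        eta = 1.0 / (lam * t)
        if y[i] * (w @ X[i]) < 1:                    # hinge loss is active: step on it
            w = (1 - eta * lam) * w + eta * y[i] * X[i]
        else:                                        # only the regularizer contributes
            w = (1 - eta * lam) * w
        if t % 1000 == 0:                            # periodically check if we're "good enough"
            if np.mean(np.sign(X @ w) == y) >= target_acc:
                break
    return w, t
```
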
Although I didn't see the talk, I enjoyed the poster on Training Structural SVMs when Exact Inference is Intractable by Thomas Finley and Thorsten Joachims. They investigate the difference between approximate inference methods that are overgenerating versus undergenerating. The former is exemplified by a conversion from an IP to an LP; the latter by, say, search. I don't think there are many cut-and-dried situations where one is better than the other, but I think this is a really good step toward understanding these models better.

Tuesday morning, there was a great talk on Efficient Projections onto the L1-Ball for Learning in High Dimensions by John Duchi, Shai Shalev-Shwartz, Yoram Singer and Tushar Chandra. Here, they're trying to extend the Pegasos algorithm to do L1-regularized SVMs. The difficult step is, after taking a sub-gradient step, projecting yourself back onto the L1 ball. They have a very clever algorithm for doing this that makes use of red-black trees to maintain a notion of when a dimension was last updated (necessary in order to get complexity that depends on the sparseness of feature vectors). The only worry I have about this approach is that I'm very happy dealing with arrays for everything, but once someone tells me that my algorithm needs to use a tree structure, I start worrying. Not because of big-O issues, but just because trees mean pointers, pointers mean pointer chasing, and pointer chasing (especially out of cache) is a nightmare for actual efficiency.
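
For reference, the basic (tree-free) version of this projection fits in a few lines; here's a sketch of the standard O(n log n) sort-based projection onto the L1 ball. The red-black-tree machinery in the paper is about amortizing this when updates touch only a few features, which this sketch ignores.

```python
import numpy as np

def project_onto_l1_ball(v, z=1.0):
    """Euclidean projection of v onto the L1 ball of radius z (sort-based variant)."""
    if np.abs(v).sum() <= z:
        return v.copy()                               # already inside the ball
    u = np.sort(np.abs(v))[::-1]                      # magnitudes, descending
    cssv = np.cumsum(u)
    # largest index rho such that u[rho] > (cssv[rho] - z) / (rho + 1)
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > (cssv - z))[0][-1]
    theta = (cssv[rho] - z) / (rho + 1.0)             # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)
```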

All the papers in the Tuesday 10:40am online learning section I found interesting. Confidence-weighted linear classification (Mark Dredze, Koby Crammer and Fernando Pereira) presents a technique for keeping track of how confident you are about your weights for various features (if you squint it looks very PAC-Bayes) and they get very good online performance with only one or two passes over the data. Francesco Orabona, Joseph
Keshet and Barbara Caputo had a paper on memory-bounded online perceptrons called the Projectron. The idea is simple and effective; however, the big take-away I had from this paper has to do with the baseline they compare against. If you only allow your perceptron to maintain, say, 100 support vectors, just use the first 100 points that the perceptron updates on. This does pretty much as well as most of the past work on this problem! How did this not get noticed before? Finally, Sham Kakade, Shai Shalev-Shwartz and Ambuj Tewari talked about bandit problems; this is interesting work, but still seems far from practical. The idea of naively picking an unknown label with uniform probability over a gigantic set of labels is just not going to work in the real world. But I think it's a really good start.
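
To make that baseline concrete, here is roughly what I have in mind (a hypothetical sketch, not the Projectron and not necessarily the exact baseline from the paper): a kernel perceptron that simply stops adding support vectors once its budget is exhausted.

```python
import numpy as np

def budget_kernel_perceptron(X, y, kernel, budget=100):
    """Kernel perceptron that keeps only the first `budget` points it updates on."""
    sv_x, sv_a = [], []                                  # support vectors and their signed labels
    for x, label in zip(X, y):
        score = sum(a * kernel(s, x) for s, a in zip(sv_x, sv_a))
        if label * score <= 0 and len(sv_x) < budget:    # mistake, and budget not yet spent
            sv_x.append(x)
            sv_a.append(label)
    return sv_x, sv_a

# e.g. with an RBF kernel on toy data (purely illustrative):
rbf = lambda a, b: np.exp(-np.sum((a - b) ** 2))
```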

The entire NLP section on the last afternoon was good. David Chen and Ray Mooney talked about how to generate sportscasts from RoboCup trials; Ronan Collobert and Jason Weston talked about using neural nets to solve every NLP problem in the world (more on this in another post); Jason Wolfe, Aria Haghighi and Dan Klein talked about how to parallelize the M step in addition to the E step; and then Percy Liang, me, and Dan Klein talked about whether it's reasonable or not to get rid of features in structured prediction (more on this in another post).

The next day was workshop day. The two workshops I went to were the Prior Knowledge in Text workshop (that I co-organized with Marc Dymetman, Guillaume Bouchard and Yee Whye Teh), and the Nonparametric Bayes workshop (where I talked a tiny bit about HBC). I'll relate the news of our own workshop at a later date; you can read a bit about the NPBayes workshop here.

Then came UAI and COLT. I must admit by this time I was a bit conferenced out, so I missed a bunch of sessions. However, there were still some papers that stood out for me.

In UAI, David Mimno and Andrew McCallum presented Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression, essentially a conditional variant of LDA wherein the hyperparameters are given by a generalized linear model over arbitrary document features. One audience question that came up (David, if you're reading, maybe you can answer this!) had to do with putting the GLM on the "alpha" hyperparameter: the big improvement over the baseline suggests that LDA-like models are highly sensitive to the settings of their hyperparameters, and (like the questioner) I was a bit surprised that tweaking a hyperparameter could have such a big influence. Nevertheless, very cool.
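
As I understand the model, the key move is simply to make each document's Dirichlet hyperparameters a log-linear function of its features, roughly as below (the names and shapes are mine, for illustration only):

```python
import numpy as np

def doc_hyperparameters(x, lam):
    """Per-document Dirichlet hyperparameters: alpha[d, k] = exp(x[d] . lam[k])."""
    return np.exp(x @ lam.T)

x = np.random.randn(5, 3)              # 5 documents, 3 metadata features each
lam = np.random.randn(10, 3)           # regression weights for 10 topics
alpha = doc_hyperparameters(x, lam)    # shape (5, 10), strictly positive
```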

The best paper went to David Sontag, Talya Meltzer, Amir Globerson, Tommi Jaakkola and Yair Weiss for Tightening LP Relaxations for MAP using Message Passing. The idea is to take your marginal polytope, initially defined by simple single-node marginal constraints, and iteratively add pairwise, 3-wise, etc., constraints. There are some implementation details that allow you to do warm restarts, and they managed to scale amazingly well. This furthers my confidence that LPs are a really great way to do MAP inference in graphical models.
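
If you haven't played with this, even the vanilla pairwise relaxation (before any tightening) is only a few lines. Here's a toy two-node, two-state example with made-up potentials, just to show what the local-consistency constraints look like; the paper's actual contribution, iteratively adding higher-order cluster constraints, isn't shown.

```python
import numpy as np
from scipy.optimize import linprog

# Toy potentials (arbitrary numbers, for illustration).
theta0 = np.array([1.0, 0.2])            # node 0
theta1 = np.array([0.3, 0.8])            # node 1
theta01 = np.array([[0.0, 1.5],          # edge (0,1), indexed [x0, x1]
                    [0.5, 0.0]])

# Variable order: mu0(0), mu0(1), mu1(0), mu1(1), mu01(0,0), mu01(0,1), mu01(1,0), mu01(1,1)
c = -np.concatenate([theta0, theta1, theta01.ravel()])   # linprog minimizes, so negate

A_eq, b_eq = [], []
A_eq.append([1, 1, 0, 0, 0, 0, 0, 0]); b_eq.append(1)    # mu0 sums to 1
A_eq.append([0, 0, 1, 1, 0, 0, 0, 0]); b_eq.append(1)    # mu1 sums to 1
A_eq.append([-1, 0, 0, 0, 1, 1, 0, 0]); b_eq.append(0)   # sum_{x1} mu01(0,x1) = mu0(0)
A_eq.append([0, -1, 0, 0, 0, 0, 1, 1]); b_eq.append(0)   # sum_{x1} mu01(1,x1) = mu0(1)
A_eq.append([0, 0, -1, 0, 1, 0, 1, 0]); b_eq.append(0)   # sum_{x0} mu01(x0,0) = mu1(0)
A_eq.append([0, 0, 0, -1, 0, 1, 0, 1]); b_eq.append(0)   # sum_{x0} mu01(x0,1) = mu1(1)

res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=(0, 1))
print("LP value:", -res.fun)             # MAP value (tight here, since the graph is a tree)
print("node marginals:", res.x[:4])
```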

Kuzman Ganchev, Joao Graca, John Blitzer and Ben Taskar talked about Multi-View Learning over Structured and Non-Identical Outputs. I see this as a bit of an extension to the "alignment by agreement" and the "agreement-based learning" work, but now the difference is that the output spaces don't have to "match up."

Marina Meila and Le Bao had a poster on Estimation and clustering with infinite rankings, where they give a nonparametric model over rankings, focusing on properties of the distribution. Kurt Miller, Tom Griffiths and Mike Jordan had an extension to the Indian Buffet Process that intentionally removes exchangeability in order to model cases where we know the data is not exchangeable. Chong Wang, Dave Blei and David Heckerman had a nice result on how to do topic modeling over continuous time (modeled as Brownian motion). The cool thing is that by going continuous, they're able to get a more efficient algorithm than previous work that operated in discrete time steps.

At COLT, there were also a number of good papers. Shai Ben-David, Tyler Lu and David Pal ask: Does Unlabeled Data Provably Help? Worst-case Analysis of the Sample Complexity of Semi-Supervised Learning. I'll ruin the suspense: No, it does not. (At least not without strong assumptions about the label distribution.) Nina Balcan, Avrim Blum and Nati Srebro show that learning with similarity functions is even better than we thought.

One paper at COLT that I especially liked, because I wasn't aware of much of the previous work they cited, was by Liwei Wang, Masashi Sugiyama, Cheng Yang, Zhi-Hua Zhou and Jufu Feng: On the Margin Explanation of Boosting Algorithms. The history here is roughly like this. Boosting came along and worked pretty well. People started trying to explain it from the perspective of margin-based analysis. Then Breiman came along and described an algorithm, arc-gv, that provably generates a better margin than AdaBoost (and does so in practice as well), yet works worse! This paper attempts to re-analyze boosting from the perspective of an "Equilibrium Margin," which provides sharper bounds and results that actually agree with what is observed empirically.

18 comments:

Anonymous said...

Nice overview! I also noticed there was no ACL overview this year -- does this mean the interesting stuff is drifting from ACL to ICML?

hal said...

No... it means that I only attended like 5 talks at ACL this year other than the ones in the summarization summary, and didn't feel like I could give it a fair shake.

If someone else went to ACL and saw lots of talks and wanted to write a post, I'd be happy to "augment" it with what I saw!

Brendan said...

That was great, thanks! I'm pretty interested to hear your thoughts on the neural networks paper -- I can't imagine such a system being at all practical to develop, but it sure sounds fun...

Mark Dredze said...

Thanks for the shout-out. I didn't make it to ICML this year but I'm glad to see so much interesting stuff. I'll add some of these to my reading pile.
