07 September 2009

ACL and EMNLP retrospective, many days late

Well, ACL and EMNLP are long gone. And sadly I missed one day of each due either to travel or illness, so most of my comments are limited to Mon/Tue/Fri. C'est la vie. At any rate, here are the papers I saw or read that I really liked.
  • P09-1010 [bib]: S.R.K. Branavan; Harr Chen; Luke Zettlemoyer; Regina Barzilay
    Reinforcement Learning for Mapping Instructions to Actions

    and

    P09-1011 [bib]: Percy Liang; Michael Jordan; Dan Klein
    Learning Semantic Correspondences with Less Supervision

    These papers both address what might roughly be called the grounding problem, or at least the problem of trying to learn something about semantics by looking at data. I really like this direction of research, and both of these papers were really interesting. Since I liked both, and since I think the direction is great, I'll take this opportunity to say what I felt was a bit lacking in each. In the Branavan paper, the particular choice of reward was both clever and a bit of a kludge. I can easily imagine that it wouldn't generalize to other domains: thank goodness those Microsoft UI designers happened to call the Start Button something like UI_STARTBUTTON. In the Liang paper, I worry that it relies too heavily on things like lexical match and other very domain-specific properties. They also should have cited Fleischman and Roy, which Branavan et al. did, but which many people in this area seem to miss out on -- in fact, I feel like the Liang paper is in many ways a cleaner and more sophisticated version of the Fleischman paper.

  • P09-1054 [bib]: Yoshimasa Tsuruoka; Jun’ichi Tsujii; Sophia Ananiadou
    Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty

    This paper is kind of an extension of the truncated gradient approach to learning L1-regularized models that John, Lihong and Tong had last year at NIPS. The paper does a great job of motivating why optimizing with an L1 penalty is hard. The first observation is that L1-regularized objectives optimized by gradient steps like to "step over zero." This is essentially the observation behind truncated gradient and frankly kind of an obvious one (I always thought this is how everyone optimized these models, though of course John, Lihong and Tong actually proved something about it). The second observation, which is the contribution of this current paper, is that you often end up with a lot of non-zero weights simply because you haven't run enough gradient steps since the last time a weight was increased. They have a clever way of accumulating these penalties lazily and applying them all at once. It seems to do very well, is easy to implement, etc. But they haven't proved anything about it (and perhaps can't). A rough sketch of the cumulative-penalty idea appears after the paper list.

  • P09-1057 [bib]: Sujith Ravi; Kevin Knight
    Minimized Models for Unsupervised Part-of-Speech Tagging

    I didn't actually see this paper (I think I was chairing a session at the time), but I know about it from talking to Sujith. Anyone who considers themselves a Bayesian in the sense of "let me put a prior on that and it will solve all your ills" should read this paper. Basically, they show that sparse priors don't give you models that are sparse enough, and that by doing some ILP work to directly minimize the number of tag bigrams in the model, you can get tiny POS tagger models that do very well. (A toy sketch of that kind of integer program appears after the paper list.)

  • D09-1006 [bib]: Omar F. Zaidan; Chris Callison-Burch
    Feasibility of Human-in-the-loop Minimum Error Rate Training

    Chris told me about this stuff back in March when I visited JHU and I have to say I was totally intrigued. Adam already discussed this paper in an earlier post, so I won't go into more details, but it's definitely a fun paper.

  • D09-1011 [bib]: Markus Dreyer; Jason Eisner
    Graphical Models over Multiple Strings

    This paper is just fun from a technological perspective. The idea is to have graphical models in which the variables range over strings, and distributions over those strings are represented as finite state automata. You do message passing, where your messages are now automata, and you get to do all your favorite operations (or at least all of Jason's favorite operations) like intersection, composition, etc., to compute beliefs. Very cool results. (There's a toy illustration of the belief computation after the paper list.)

  • D09-1024 [bib]: Ulf Hermjakob
    Improved Word Alignment with Statistics and Linguistic Heuristics

    Like the Haghighi coreference paper below, here we see how to do word alignment without fancy math!

  • D09-1120 [bib]: Aria Haghighi; Dan Klein
    Simple Coreference Resolution with Rich Syntactic and Semantic Features

    How to do coreference without math! I didn't know you could still get papers accepted if they didn't have equations in them!
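
For the Tsuruoka et al. paper above, here is a minimal sketch of the cumulative-penalty idea as I understand it, applied to binary logistic regression trained by SGD. This is a toy reconstruction, not their code: the data, features, learning rate, and regularization strength are invented, but the core trick (track the total L1 penalty every weight should have paid so far, and lazily clip each weight toward zero by whatever it still owes, never stepping past zero) is the one described above.

```python
# Toy sketch of SGD with a cumulative L1 penalty; all data and hyperparameters invented.
import math
from collections import defaultdict

def sgd_l1_cumulative(data, n_epochs=10, eta=0.1, reg=0.01):
    """data: list of (features, label) pairs; features is a dict {name: value}, label is 0 or 1."""
    w = defaultdict(float)  # current weights
    q = defaultdict(float)  # signed penalty each weight has actually absorbed so far
    u = 0.0                 # total penalty every weight *should* have absorbed by now

    def apply_penalty(f):
        # Settle the L1 "debt" of w[f] lazily, clipping at zero so the weight
        # never steps over zero (the truncated-gradient observation).
        z = w[f]
        if w[f] > 0.0:
            w[f] = max(0.0, w[f] - (u + q[f]))
        elif w[f] < 0.0:
            w[f] = min(0.0, w[f] + (u - q[f]))
        q[f] += w[f] - z

    for _ in range(n_epochs):
        for feats, y in data:
            u += eta * reg  # everyone's cumulative penalty grows by one step's worth
            score = sum(w[f] * v for f, v in feats.items())
            p = 1.0 / (1.0 + math.exp(-score))
            for f, v in feats.items():
                w[f] += eta * (y - p) * v  # gradient step on the log-likelihood
                apply_penalty(f)           # then lazily apply the accumulated L1 penalty
    return {f: v for f, v in w.items() if v != 0.0}

# Invented toy data: "noise" fires on every example and should be driven to exactly zero.
toy = [({"good": 1.0, "noise": 1.0}, 1), ({"bad": 1.0, "noise": 1.0}, 0)] * 20
print(sgd_l1_cumulative(toy))
```
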
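For the Ravi and Knight paper, here is a toy sketch of the kind of integer program I have in mind when I say "directly minimize the model": pick the smallest set of tag-bigram types that still permits a tagging of the corpus consistent with the tag dictionary. This is a reconstruction for illustration only, not their formulation; the PuLP modeling, the three-word corpus, and the tag dictionary are all invented.

```python
# Toy ILP in the spirit of "minimized models": fewest tag-bigram types that
# can still explain the corpus given a tag dictionary. Reconstruction only.
from itertools import product
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

corpus = ["the", "dog", "runs"]                                   # invented
tag_dict = {"the": ["DT"], "dog": ["NN", "VB"], "runs": ["NN", "VB"]}
tags = sorted({t for ts in tag_dict.values() for t in ts})

prob = LpProblem("minimize_tag_bigrams", LpMinimize)

# y[i, t] = 1 iff position i is tagged t; b[t1, t2] = 1 iff bigram t1->t2 is in the model.
y = {(i, t): LpVariable(f"y_{i}_{t}", cat=LpBinary)
     for i, w in enumerate(corpus) for t in tag_dict[w]}
b = {(t1, t2): LpVariable(f"b_{t1}_{t2}", cat=LpBinary) for t1, t2 in product(tags, tags)}

prob += lpSum(b.values())  # objective: as few bigram types as possible

for i, w in enumerate(corpus):
    prob += lpSum(y[i, t] for t in tag_dict[w]) == 1  # exactly one tag per token

for i in range(len(corpus) - 1):
    for t1 in tag_dict[corpus[i]]:
        for t2 in tag_dict[corpus[i + 1]]:
            # using adjacent tags t1, t2 forces that bigram into the model
            prob += y[i, t1] + y[i + 1, t2] - 1 <= b[t1, t2]

prob.solve()
print([(w, t) for i, w in enumerate(corpus) for t in tag_dict[w] if y[i, t].value() > 0.5])
```
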
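For the Dreyer and Eisner paper, the piece that fits in a few lines is the belief computation at a single string-valued variable: the belief is the normalized pointwise product of the incoming messages. The toy below spells that out over an explicit handful of candidate strings; the point of the paper, of course, is that you never enumerate strings, because each message is a weighted finite state automaton and the pointwise product becomes automaton intersection. The candidate strings and weights here are invented.

```python
# Toy illustration of combining messages at a string-valued variable.
# In the paper each message is a weighted FSA, so this pointwise product
# is computed by automaton intersection over possibly infinite string sets.

def combine(*messages):
    """Belief at a string-valued variable: normalized pointwise product of messages."""
    support = set.intersection(*(set(m) for m in messages))
    belief = {}
    for s in support:
        p = 1.0
        for m in messages:
            p *= m[s]
        belief[s] = p
    z = sum(belief.values())
    return {s: v / z for s, v in belief.items()}

# Invented messages about an unobserved verb form: one from a factor that
# scores spellings, one from a factor tying it to a related observed form.
msg_spelling = {"brachte": 0.5, "bringte": 0.3, "brang": 0.2}
msg_related  = {"brachte": 0.6, "bringte": 0.1, "gebracht": 0.3}
print(combine(msg_spelling, msg_related))
```
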
In general, here's a trend I've seen in both ACL and EMNLP this year. It's the "I find a new data source and write a paper about it" trend. I don't think this trend is either good or bad: it simply is. A lot of these data sources are essentially Web 2.0 sources, though some are not. Some are Mechanical Turk'd sources. Some are the Penn Discourse Treebank (about which there were a ridiculous number of papers: it's totally unclear to me why everyone all of a sudden thinks discourse is cool just because there's a new data set -- what was wrong with the RST treebank that it turned everyone off from discourse for ten years?! Okay, that's being judgmental and I don't totally feel that way. But I partially feel that way.)
