24 July 2010

ACL 2010 Retrospective

ACL 2010 finished up in Sweden a week or so ago. Overall, I enjoyed my time there: the local organization was great, though we got hit with unexpected heat, so those of us who didn't feel like booking a room at the Best Western (hah! why would I have done that?!) had no A/C, and my room was about 28-30°C every night.

But you don't come here to hear about sweltering nights, you come to hear about papers. My list is actually pretty short this time. I'm not quite sure why that happened. Perhaps NAACL sucked up a lot of the really good stuff, or I went to the wrong sessions, or something. (Though my experience was echoed by a number of people (n=5) I spoke to after the conference.) Anyway, here are the things I found interesting.

  • Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates, by Matthew Gerber and Joyce Chai (this was the Best Long Paper award recipient). This was by far my favorite paper of the conference. For all you students out there (mine included!), pay attention to this one. It was great because they looked at a fairly novel problem, in a fairly novel way, put clear effort into doing something (they annotated a bunch of data by hand), developed features that were significantly more interesting than the usual off-the-shelf set, and got impressive results on what is clearly a very hard problem. Congratulations to Matthew and Joyce -- this was a great paper, and the award is highly deserved.

  • Challenge Paper: The Human Language Project: Building a Universal Corpus of the World’s Languages, by Steven Abney and Steven Bird. Basically this would be awesome if they can pull it off -- a giant structured database with stuff from tons of languages. Even just having tokenization in tons of languages would be useful for me.

  • Extracting Social Networks from Literary Fiction, by David Elson, Nicholas Dames and Kathleen McKeown. (This was the IBM best student paper.) Basically they construct networks of characters from British fiction and try to analyze some literary theories in terms of those networks, and find that there might be holes in the existing theories. My biggest question, as someone who's not a literary theorist, is why did those theories exist in the first place? The analysis was over 80 or so books, surely literary theorists have read and pondered all of them.

  • Syntax-to-Morphology Mapping in Factored Phrase-Based Statistical Machine Translation from English to Turkish, by Reyyan Yeniterzi and Kemal Oflazer. You probably know that I think translating morphology and translating out of English are both interesting topics, so it's perhaps no big surprise that I liked this paper. The other thing I liked about this paper is that they presented things that worked, as well as things that might well have worked but didn't.

  • Learning Common Grammar from Multilingual Corpus, by Tomoharu Iwata, Daichi Mochihashi and Hiroshi Sawada. I wouldn't go so far as to say that I thought this was a great paper, but I would say there is the beginning of something interesting here. They basically learn a coupled PCFG in Jenny Finkel hierarchical-Bayes style, over multiple languages. The obvious weakness is that languages don't all have the same structure. If only there were an area of linguistics that studied how they differ.... (Along similar lines, see
    Phylogenetic Grammar Induction, by Taylor Berg-Kirkpatrick and Dan Klein, which has a similar approach/goal.)

  • Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation, by Michael Bloodgood and Chris Callison-Burch. The "trend" referenced in the title is that active learning always asymptotes depressingly early. They have turkers translate bits of sentences in context (i.e., in a whole sentence, translate the highlighted phrase) and get a large bang-for-the-buck. Right now they're looking primarily at out-of-vocabulary stuff, but there's a lot more to do here.
A few papers that I didn't see, but other people told me good things about:
At any rate, I guess that's a reasonably long list. There were definitely good things, but with a fairly heavy tail. If you have anything you'd like to add, feel free to comment. (As an experiment, I've turned comment moderation on to try to stop the spam. I'm not sure I'll keep it on indefinitely; I hadn't turned it on before because I always thought/hoped that Google would just start doing spam detection and/or putting up hard captchas to stop spam, but sadly they don't seem interested.)

6 comments:

D_K said...

Lauri Karttunen, with whom we had a chat yesterday, told us that the hottest topic in Uppsala was sentiment detection. Did you have a similar impression, and have you seen anything worthwhile on that matter?

Because of this, the committee might have rejected papers on other, not-as-hot NLP topics, like machine translation.

hal said...

@D_K: I didn't see many (any?) sentiment papers, so it's hard to say... I wasn't actively avoiding them, but they just didn't pique my interest enough to drag me out of another session. I think there were fewer parsing and MT sessions than in the past, which I can unequivocally say is a good thing. Diversity is good.

Kevin Duh said...

Nice list, Hal! Just to add to it, I also enjoyed the following papers:

a) Fine-Grained Tree-to-String Translation Rule Extraction (Xianchao Wu; Takuya Matsuzaki; Jun’ichi Tsujii) - statistical machine translation with HPSG.

b) Bootstrapping Semantic Analyzers from Non-Contradictory Texts (Ivan Titov; Mikhail Kozhevnikov) - a very interesting and challenging unsupervised semantic parsing problem.

c) Dynamic Programming for Linear-Time Incremental Parsing (Liang Huang; Kenji Sagae) - the title says it all.

d) Combining Data and Mathematical Models of Language Change (Morgan Sonderegger; Partha Niyogi) - models the evolution of stress change in English noun/verb pairs (e.g. "contract", "protest").

Shrey Agarwal said...

Hello,
I am a student from India pursuing my undergraduate studies in Computer Science and plan to do my MS from a foreign university in the field of NLP. Could you please tell me some of the universities which provide an MS program in this field?
Hoping to get some guidance from you on this matter.
Thanks! :)

hal said...

@kevin: thanks, those sound interesting and i don't think i saw any of them!

@shrey: a good place to start might be here: http://aclweb.org/aclwiki/index.php?title=List_of_NLP/CL_courses ... of course, i have to say that obviously the best place to go is UMD :).

Shrey Agarwal said...

Thanks! :)
If it isn't a lot to ask, could you also brief me on what universities look for in an application? Is it just recommendations, projects and the statement of purpose, or something else too?