Shared tasks have been increasing in popularity over the past half decade. These are effectively competitions (though perhaps that word is rightfully disdained) for building systems that perform well on a given task, for a specific data set. Typically a lot of stuff is given to you for free: the data, various preprocessing steps, evaluation scripts, etc. Anywhere from a handful of people to dozens enter these shared tasks. Probably the most well known are the CoNLL shared tasks, but they have also taken place in other workshops (e.g., the two SMT workshops and many others). Government-run competitions (e.g., GALE, ACE, DUC (to some degree) and others) are somewhat similar, with the added bonus that money is often contingent on performance, but for the most part, I'll be talking about the community-driven shared tasks. (I'll note that shared tasks exist in other communities, but not to the extent that they exist in NLP, to my knowledge.)
I think there are both good and bad things about having these shared tasks, and a lot depends on how they are run. Perhaps some analysis (and discussion?) can serve to help future shared task organizers make decisions about how to run these things.
Many pros of shared tasks are perhaps obvious:
1. Increases community attention to the task.
2. Often leads to development or convergence of techniques by getting lots of people together to talk about the same problem.
3. Significantly reduces the barrier of entry to the task (via the freely available, preprocessed data and evaluation scripts).
4. (Potentially) enables us to learn what works and what doesn't work for the task.
5. Makes a standardized benchmark against which future algorithms can be compared.
In my opinion, (4) seems like it should be the real reason to do these things. I think the reason why people don't tend to learn as much as might be possible about what does and does not work is that there's very little systematization in the shared tasks. At the very least, almost everyone will use (A) a different learning algorithm and (B) a different feature set. This means that it's often very hard to tell -- when someone does well -- whether it was the learning or the features.
Unfortunately (were it not the case!) there are some cons associated with shared tasks, generally closely tied to corresponding pros.
1. May artificially bloat the attention given to one particular task.
2. Usefulness of results is sometimes obscured by multiple dimensions of variability.
3. Standardization can lead to inapplicability of certain options that might otherwise work well.
4. Leads to repeated testing on the same data.
(2) and (3) are, unfortunately, almost opposed. You can, for instance, fix the feature set and only allow people to vary the learning. Then we can see who does learning best. Aside from the obvious problem here, there's an additional problem: another learning algorithm might do better if it had different features. Alternatively, you could fix the learning and let people do feature engineering. I think this would actually be quite interesting. I've thought for a while about putting out a version of Searn for a particular task and just charging people with coming up with better features. This might be especially interesting if we did it for, say, both Searn and Mallet (the UMass CRF implementation) so we can get a few more points of comparison.
To be more concrete about (3), a simple example is in machine translation. The sort of preprocessing (e.g., tokenization) that is good for one MT system (e.g., a phrase-based system) may be very different from the preprocessing that is good for another (e.g., a syntax-based one). One solution here is to give multiple versions of the data (raw, preprocessed, etc.), but then this makes the (2) situation worse: how can we tell who is doing best, and is it just because they have a darn good tokenizer? (Don't underestimate the importance of this!)
(4) doesn't really need any extra discussion.
My personal take-away from putting some extra thought into this is that it can be very beneficial to have shared tasks, if we set out the goals at the beginning. If our goal is to understand what features are important, maybe we should consider fixing the learning to a small set of algorithms. If our goal is learning, do the opposite. If we want both, maybe ask people to do feature ablation and/or try with a few different learning techniques (this is perhaps too much burden, though). I think we should definitely keep pro (3), the low barrier of entry: to me, this is one of the biggest pros. I think the SMT workshops did a phenomenal job here, for a task as complex as MT. And, of course, we should choose the tasks carefully.
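To make the "feature ablation with a few different learning techniques" idea a bit more concrete, here is a minimal sketch of what a participant's report could be built from. Everything in it is an assumption made for illustration: the synthetic data, the feature group names (`lexical`, `syntactic`, `noise`), and the two toy learners (a perceptron and a nearest-centroid classifier) are stand-ins, not Searn, Mallet, or any real shared task setup.

```python
import random


def make_toy_data(n=200, seed=0):
    """Synthetic binary task; the feature groups are hypothetical."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.choice([0, 1])
        feats = {
            "lexical": [y + rng.gauss(0, 0.5)],    # strongly tied to the label
            "syntactic": [y + rng.gauss(0, 1.5)],  # weakly tied to the label
            "noise": [rng.gauss(0, 1.0)],          # unrelated to the label
        }
        data.append((feats, y))
    return data


def vectorize(data, groups):
    """Concatenate the chosen feature groups into flat vectors."""
    X = [[v for g in groups for v in feats[g]] for feats, _ in data]
    y = [label for _, label in data]
    return X, y


def train_perceptron(X, y, epochs=10):
    """Toy learner 1: a plain perceptron. Returns a predict function."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            if pred != t:
                sign = 1 if t == 1 else -1
                w = [wi + sign * xi for wi, xi in zip(w, x)]
                b += sign
    return lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def train_centroid(X, y):
    """Toy learner 2: assign the class whose mean vector is closer."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    c0 = mean([x for x, t in zip(X, y) if t == 0])
    c1 = mean([x for x, t in zip(X, y) if t == 1])
    return lambda x: 1 if dist(x, c1) < dist(x, c0) else 0


def ablation_grid(train, test, learners, groups):
    """Accuracy for every (learner, dropped feature group) pair."""
    results = {}
    for name, fit in learners.items():
        for dropped in ["<none>"] + list(groups):
            kept = [g for g in groups if g != dropped]
            X_train, y_train = vectorize(train, kept)
            X_test, y_test = vectorize(test, kept)
            predict = fit(X_train, y_train)
            correct = sum(predict(x) == t for x, t in zip(X_test, y_test))
            results[(name, dropped)] = correct / len(y_test)
    return results


data = make_toy_data()
train, test = data[:150], data[150:]
groups = ["lexical", "syntactic", "noise"]
learners = {"perceptron": train_perceptron, "centroid": train_centroid}
results = ablation_grid(train, test, learners, groups)
for (name, dropped), acc in results.items():
    print(f"{name:10s} dropped={dropped:10s} accuracy={acc:.2f}")
```

The point of the grid is that each number differs from its neighbours in only one dimension, so a drop in a row can be attributed to a feature group and a difference between rows to the learner. It also shows why this is a burden: the number of training runs grows as learners times feature groups.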
10 comments:
I totally agree about the pros and cons, but shared tasks mean something a bit different to me. They are the places where the tools don't matter, only your performance. Some people tune the features, others the learning; that isn't a problem (in my personal view). Shared tasks can highlight new topics that are worth the time to deal with (at least when you build an application), e.g., a simple system with a postprocessing step (containing a few expert rules) can beat the most sophisticated algorithms.
Obviously it is interesting only if the goal is to solve problems, not just to release theories.
Nice post Hal. Having participated in a couple of shared tasks in the past, I have often thought about this. The idea of fixing either the learning or the features and varying the other is a reasonable thought, but I am not sure how feasible it is. In my opinion, the most interesting part of these shared tasks is representation. For example, at last year's CoNLL task on dependency parsing, people used spanning tree, stack-based, and CFG representations, plus some other representations of the problem. In this case, it is not always true that you can fix the learning or the feature set, since each representation of the problem may be incompatible with a particular choice. For instance, let's say we want the learning to be discriminative max-likelihood (CRF). For the spanning tree parsing methods it is not clear to me that this can be done (i.e., I know how to do inference and normalization, but I do not know how to compute feature expectations. Just curious, does anyone?). I think leaving as many dimensions free as possible is beneficial. After the shared task, people can then go and create studies comparing various techniques in a more controlled setting. The only dimension I think is important to be strict about is resources, especially now that people are finding good ways to use unlabeled data. In this case one can have two tracks, "open" and "closed", to allow people to experiment.
Another serious drawback of shared tasks is that many people see them not as research, a shared opportunity to learn, but as a competitive event. This leads to behaviour that increases the idiosyncrasies of submitted systems (adding ad-hoc hacks to "rank" better, to improve the score) at the cost of control. This is why, often, no true learning experience is the result.
--
Jochen L Leidner
Linguit Ltd. (www.linguit.com)