Comments on natural language processing blog: EMNLP papers, tales from the trenches

Yay simultaneous posts. I have many thoughts along ...

2007-04-30T16:00:00.000-06:00

Yay simultaneous posts.

I have many thoughts along the lines of making review information public, but I think that will have to wait for another post. I think the short answer is that people fear "repercussions" from various forms of non-anonymity (nymity?).

FYI, Fernando Pereira has something to say. I've al...

2007-04-30T15:57:00.000-06:00

FYI, Fernando Pereira has something to say.

I've also wondered why they don't do what you suggest. I think that people psychologically don't like mechanized reviewing, but that's not a good excuse. I think that what ends up happening in practice is that the overall-recs are used to filter the top, then the ACs read/skim those top papers, and perhaps order them by some combination of weights that they deem appropriate. Why the first step couldn't happen automatically, I don't know. It would be an interesting multilevel regression problem to see how people actually behave wrt giving overall scores :).

So this is just shifting the bias to the area chair, so you'd better hope the area chairs have good biases, or that the higher-ups can instill a good bias. Importantly, though, the ACs get a much more global picture than any single reviewer, so the bias effect is probably somewhat mitigated.

Aha, I like the last paragraph of your reply. Looks...

2007-04-30T15:46:00.000-06:00

Aha, I like the last paragraph of your reply. Looks like we must accept living with some randomness in the process.

Here is a suggestion.

I always wondered why a reviewer needs to enter a final evaluation score if he/she has already entered a breakdown of the score along different criteria. Why don't we just have a given theme for the conference, like what you mentioned before about ACL: this year we would like to stress innovative ideas as opposed to empirical rigor! The chair puts a weight on each criterion, reviewers enter detailed scores, and that is it: the final recommendation is there as a weighted average.

I think this will enforce more overall consistency and would somewhat downweight the reviewer preference bias.
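The weighted-average idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the criteria names, the weights, and the 1-5 score scale are all hypothetical and not taken from any real conference review form.

```python
# Hypothetical criteria and chair-chosen weights (assumptions for illustration).
# Integer weights are normalized by their sum, so they need not add up to 1.
WEIGHTS = {"novelty": 5, "empirical_rigor": 2, "clarity": 3}


def overall_score(scores, weights=WEIGHTS):
    """Weighted average of one reviewer's per-criterion scores."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


def paper_score(reviews, weights=WEIGHTS):
    """Mean of the reviewers' weighted overall scores for one paper."""
    return sum(overall_score(r, weights) for r in reviews) / len(reviews)


# Two made-up reviews of the same paper on a 1-5 scale.
reviews = [
    {"novelty": 4, "empirical_rigor": 2, "clarity": 3},
    {"novelty": 5, "empirical_rigor": 3, "clarity": 4},
]
print(paper_score(reviews))
```

Changing the conference theme then amounts to changing `WEIGHTS`; the reviewers' detailed scores stay the same.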

agree/disagree??

There wasn't much variance in what I pushed forwar...

2007-04-30T06:47:00.000-06:00

There wasn't much variance in what I pushed forward as the top three papers. But that's somewhat of a chicken-and-egg statement. They might not have been the top three had they had variance :). For good-looking papers that did have high variance, I asked reviewers to discuss. Usually they did not change their scores. (Sometimes by 0.5 points.) But the discussion was quite helpful in understanding why there was high variance. Typically it was due to the fact that different reviewers actually *want* different things out of a paper.

I think it's important to separate out variance due to noise and variance due to reviewer preferences. I would venture that for the majority of the pretty good papers in the track (maybe 10ish), the primary variance that existed was due to reviewer preferences.

Of course, reviewer assignments introduce variance, too, due primarily to the fact that different reviewer biases can show up as variance. If you assign a theoretically strong but empirically weak paper to three empirically-minded reviewers, the variance will be artificially low.
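As a rough illustration of picking out well-scored papers whose reviewers disagree, here is a minimal sketch. The thresholds `min_mean` and `max_variance`, the paper names, and the scores are all made up; nothing here reflects how the track was actually run. Note that a score-only check like this cannot tell noise apart from differing reviewer preferences, which is what the discussion step is for.

```python
from statistics import mean, pvariance


def flag_for_discussion(scores_by_paper, min_mean=3.5, max_variance=0.5):
    """Return papers with a good mean score but high reviewer disagreement.

    Both thresholds are hypothetical and would need tuning per track.
    """
    flagged = []
    for paper, scores in scores_by_paper.items():
        if mean(scores) >= min_mean and pvariance(scores) > max_variance:
            flagged.append(paper)
    return flagged


# Made-up overall scores from three reviewers per paper.
scores_by_paper = {
    "paper_a": [4, 4, 4],  # good, reviewers agree
    "paper_b": [5, 2, 5],  # good on average, reviewers disagree
    "paper_c": [2, 1, 2],  # weak, reviewers agree
}
print(flag_for_discussion(scores_by_paper))
```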

In other words, whose role is it to smooth this vari...

2007-04-29T16:43:00.000-06:00

In other words, whose role is it to smooth this variance, and how did you handle this yourself? I just feel it is overwhelming to go and read every single borderline paper -- and the problem gets exacerbated as you move to the top of the reviewing hierarchy.

Thanks for the post, but what can you say about th...

2007-04-29T16:40:00.000-06:00

Thanks for the post, but what can you say about the effect the sample selection has on each reviewer's score?

I mean, obviously everyone is inclined (consciously or not) to have a quasi-normal distribution over his or her scores: a very strong paper usually raises the reviewer's standards when looking at the next one; you just can't help it! Of course there are clear accept or reject papers, and I believe what I am addressing here usually pertains to the in-between categories (you mentioned something related to this in your post).

Have you seen this effect in your area, perhaps by looking at how the reviewers of those top three papers assigned scores to other papers? Is it a problem, and how can it be solved?

It looks to me like a chicken-and-egg problem, as you need to have a rough idea about the paper before assigning it, and it supports your cascading approach. Do you think knowing the authors of the paper would help in this cascading approach (at least at the area chair level, in assigning papers), or would it just add to the randomness and increase the bias?