27 December 2007

Those Darn Biologists...

I've recently been talking to a handful of biologists. The lab/bench kind, not the computational kind. As an experiment in species observation, they intrigue me greatly. I'm going to stereotype, and I'm sure it doesn't hold universally, but I think it's pretty accurate, and several (non-biologist) friends have confirmed it (one of whom is actually married to a biologist). The reason they intrigue me is that they are wholly uninterested in techniques that allow you to do X better once they already have a solution for X.

This is a bit of an exaggeration, but not a huge one (I feel).

It is perhaps best served by example. Take dimensionality reduction. This is a very very well studied problem in the machine learning community. There are a bajillion ways to do it. What do biologists use? PCA. Still. And it's not that people have tried other things and they've failed. Other things have worked. Better. But no one cares. Everyone still uses PCA.

Part of this is probably psychological. After using PCA for a decade, people become comfortable with it and perhaps able to anticipate a bit of what it will do. Changing techniques would erase this.

Part of this is probably cultural. If you're trying to publish a bio paper (i.e., a paper in a bio journal) and you're doing dimensionality reduction and you aren't using PCA, you're probably going to really have to convince the reviewers that there's a good reason not to use PCA and that PCA just plain wouldn't work.

Now, that's not to say that biologists are Luddites. They seem perfectly happy to accept new solutions to new problems. For instance, still in the dimensionality reduction setting, non-negative matrix factorization techniques are starting to show up in microarray settings. But it's not because they're better at solving the same old problem. What was observed was that if you have not just one but a handful of microarray data sets, taken under slightly different experimental conditions, then the variance captured by PCA is the variance across experiments, not the variance that you actually care about. It has been (empirically) observed that non-negative matrix factorization techniques don't suffer from this problem. Note that it's only fairly recently that we've had such a surplus of microarray data that this has even become an issue. So here we have an example of a new problem needing a new solution.
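
To make the across-experiments point concrete, here is a small Python sketch (using numpy and scikit-learn; the simulated data, batch offsets, and all constants are invented for illustration and are not from any real microarray study). It fakes a few "experiments" whose batch offsets dominate the total variance, shows that the first principal component mostly tracks which experiment a sample came from, and fits an NMF in the same way so its factors can be inspected side by side.

    # Toy illustration only (invented numbers, not a real microarray analysis):
    # simulate a few "experiments" whose batch offsets dwarf the shared signal,
    # then look at what PCA's top component picks up, and fit an NMF the same
    # way so its factors can be inspected in the same manner.
    import numpy as np
    from sklearn.decomposition import NMF, PCA

    rng = np.random.default_rng(0)
    n_genes, per_batch, n_batches = 200, 30, 4

    # Shared "biological" signal: two latent pathways with nonnegative loadings.
    pathways = rng.gamma(2.0, 1.0, size=(2, n_genes))
    activity = rng.gamma(2.0, 1.0, size=(n_batches * per_batch, 2))
    X = activity @ pathways                      # (samples, genes), nonnegative

    # Batch effect: each experiment adds its own large per-gene offset.
    for b in range(n_batches):
        X[b * per_batch:(b + 1) * per_batch] += rng.gamma(5.0, 10.0, size=n_genes)
    X += rng.normal(scale=0.1, size=X.shape).clip(min=0)   # keep X nonnegative
    batch = np.repeat(np.arange(n_batches), per_batch)

    # PC1 mostly encodes "which experiment": per-batch means sit far apart
    # relative to the within-batch spread.
    pc1 = PCA(n_components=2).fit_transform(X)[:, 0]
    for b in range(n_batches):
        v = pc1[batch == b]
        print(f"batch {b}: PC1 mean {v.mean():9.1f}   within-batch sd {v.std():.1f}")

    # NMF factors, inspected the same way; whether they are also dominated by
    # the batch structure in a given data set is exactly the kind of empirical
    # question the post is talking about.
    W = NMF(n_components=2, init="nndsvda", max_iter=1000).fit_transform(X)
    for b in range(n_batches):
        w = W[batch == b, 0]
        print(f"batch {b}: NMF factor-1 mean {w.mean():7.2f}  within-batch sd {w.std():.2f}")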

Let's contrast this with the *ACL modus operandi.

There were 132 papers in ACL 2007. How many of them do you think introduced a new problem?

Now, I know I'm being a bit unfair. I'm considering dimensionality reduction across different platforms (or experimental conditions) as a different problem from dimensionality reduction on a single platform. You could argue that these really are the same problem. I just kind of disagree.

Just as biologists tend to be wary of new techniques, so we tend to be wary of new problems. I think anyone who has worked on a new problem and tried to get it published in a *ACL has probably gone through an experience that left a few scars. I know I have. It's just plain hard to introduce a new problem.

The issue is that if you work on an age-old problem, then all you have to do is introduce a new technique and then show that it works better than some old technique. And convince the reviewers that the technique is interesting (though this can even be done away with if the "works better" part is replaced with "works a lot better").

On the other hand, if you work on a new problem, you have to do a lot. (1) You have to introduce the new problem and convince us that it is interesting. (2) You have to introduce a new evaluation metric and convince us that it is reasonable. (3) You have to introduce a new technique and show that it's better than whatever existed before.

The problem is that (1) is highly subjective, so an uninterested reviewer can easily kill such a paper with a "this problem isn't interesting" remark. (2) is almost impossible to do right, especially the first time around. Again, this is something that's easy to pick on as a reviewer and instantly kill a paper. One might argue that (3) is either irrelevant (being a new problem, there isn't anything that existed before) or no harder than in the "age-old problem" case. I've actually found this not to be true. If you work on an age-old problem, everyone knows what the right baseline is. If you work on a new problem, not only do you have to come up with your own technique, but you also have to come up with reasonable baselines. I've gotten (on multiple occasions) reviews for "new problem" papers along the lines of "what about a baseline that does XXX." The irony is that while XXX is often creative and very interesting, it's not really a baseline and would probably warrant its own paper!

That said, I didn't write this post to complain about reviewing, but rather to point out a marked difference between how our community works and how another community (that is also successful) works. Personally, I don't think either is completely healthy. I feel like there is merit in figuring out how to do something better. But I also feel that there is merit in figuring out what new problems are and how to solve them. The key problem with the latter is that the common reviewer complaints I cited above (problem not interesting or metric not useful) often are real issues with "new problem" papers. And I'm not claiming that my cases are exceptions. And once you accept a "new problem" paper on a problem that really isn't interesting, you've opened Pandora's box and you can expect to see a handful of copycat papers next year (I could, but won't, give a handful of examples of this happening in the past five years). And it's really hard to reject the second paper on a topic as uninteresting, given that the precedent has been set.

Comments:

Suresh Venkatasubramanian said...

This is interesting. It would be a little tricky to get a paper into FOCS/STOC/SODA where the main contribution was to improve some bound on a problem (unless the improvement was substantial, came after a lot of effort, introduced a new technique, etc).

On the other hand, although introducing a new problem can be tricky, lots of new problems get introduced into algorithms conferences, if done the right way. So there's a slightly different sensibility even within CS (to the extent that *ACL is "CS").

Anonymous said...

Lab biologists care about solving biology problems, not machine learning, algorithms or statistics problems. They'll be hired, get tenure, get grants, get papers published, and win Nobel prizes for innovations in biology.

Time they spend learning about state-of-the-art machine learning and statistics is time they can't spend learning about biology.

Clustering and factorization techniques like PCA are exploratory data analysis tools used to formulate hypotheses. Biologists don't publish much based on data analysis; they have to go back into the lab and verify on the bench what they found in the statistics.

For an academic computational linguist, writing an ACL or NIPS paper is the end product. Parsing a section of the Penn Treebank or participating in CoNLL is considered real work.

If, like me, you have to solve real problems for real customers, you'll find yourself using bigram language models, HMMs, naive Bayes, and other robust techniques even if they're not "state of the art". That's largely because they work well without a lot of feature engineering and run fast in the field with small memory footprints.
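
For concreteness, a minimal sketch of the kind of small, robust model being described: a bigram language model with add-one smoothing in plain Python, a few dozen lines with a tiny memory footprint. The toy corpus and the choice of smoothing are purely illustrative, not anything from the comment above.

    # Minimal bigram language model with add-one smoothing.
    # Toy corpus and smoothing choice are illustrative only.
    import math
    from collections import Counter

    def train_bigram_lm(sentences):
        unigrams, bigrams = Counter(), Counter()
        for sent in sentences:
            tokens = ["<s>"] + sent.lower().split() + ["</s>"]
            unigrams.update(tokens[:-1])              # context counts
            bigrams.update(zip(tokens, tokens[1:]))
        vocab_size = len({w for s in sentences for w in s.lower().split()} | {"</s>"})
        return unigrams, bigrams, vocab_size

    def log_prob(sentence, unigrams, bigrams, vocab_size):
        """Add-one smoothed log probability of a sentence."""
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        return sum(math.log((bigrams[(prev, cur)] + 1) /
                            (unigrams[prev] + vocab_size))
                   for prev, cur in zip(tokens, tokens[1:]))

    model = train_bigram_lm(["the cat sat on the mat", "the dog sat on the log"])
    print(log_prob("the cat sat on the log", *model))   # fluent word order
    print(log_prob("log the on sat cat the", *model))   # scrambled: lower score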

Fernando Pereira said...

I've been working closely with lab biologists and I've started to publish with them. We should distinguish between "methods" papers and "results" papers. Our two main papers so far (one published, the other in press) are on new gene prediction methods. The reviews were pretty much what I would have expected from ACL reviewers, except that they were much more detailed, and required substantial work to answer in full. The first paper was published in a results-oriented journal (PLoS Comp Bio), the other has been accepted by a methods journal (Bioinformatics). Some of our most recent work may lead to specific, experimentally confirmed biological results, which we will submit to the appropriate results journals. Even methods papers may make it into results-oriented journals if the new method is seen by the editor and reviewers as changing the game in a particular area. This has been the case with recent work on discriminative structured gene prediction methods by the Rätsch lab at Max Planck Tübingen and by my group (publications in PLoS Comp Bio), and by the Batzoglou lab at Stanford and the Galagan lab at the Broad Institute (publications in Genome Research).

Suresh Venkatasubramanian said...

I should also add that in many areas of CS the same holds: certain techniques take hold and root themselves, even if newer techniques come along with improvements. The argument again is that the focus is on the problem rather than the technique, so you need either a dramatic improvement or a technique snuck in via a new problem.

More generally, this goes to the fact that large parts of ML, and large parts of algorithms for that matter, are driven by techniques rather than problems, whereas in the outside "problem-centric" world, there's a much bigger focus on generalist solutions.

mgr said...

I agree with Bob Carpenter.
People have different values. I mean they will attach more weight to a method or technique depending on how it fits into their overall scheme of things.
If the goal is a research paper, then you could try to do a 'mashup' of sorts to increase performance from 99 to 99.5.

In practice I would prefer to use Peter Drucker's value scheme: 'It is more important to do the right thing than to do the thing right.' So if this value is used, project delivery would get 8 out of 10. POS tagging of all varieties will probably get 3.
In one of your earlier posts you talked about 'confusion matrices' for POS tagging. The question is: how much marginal value (toward the goals) does a marginal improvement add?

Igor said...

Hal,

I'd have to agree with you, but you have to look at it from a different perspective. If you are not in machine learning, how can you figure out which of these 25 dimensionality reduction techniques ( http://www.cs.unimaas.nl/l.vandermaaten/Laurens_van_der_Maaten/Matlab_Toolbox_for_Dimensionality_Reduction.html ) is best for the type of problems you are most interested in? I can see why using PCA is a safe bet and would convince people in the field and reviewers that the finding to be published is not a by-product of the new algorithm used.

Igor.
http://nuit-blanche.blogspot.com

