08 March 2011

Some thoughts on supplementary materials

Letting authors submit supplementary materials is becoming popular in NLP/ML land.  NIPS was one of the first conferences I submit to that allowed this; I think ACL allowed it this past year, at least for specific types of materials (code, data), and EMNLP is thinking of allowing it at some point in the near future.

Here is a snippet of the NIPS call for papers (see section 5) that describes the role of supplementary materials:

In addition to the submitted PDF paper, authors can additionally submit supplementary material for their paper... Such extra material may include long technical proofs that do not fit into the paper, image, audio or video sample outputs from your algorithm, animations that describe your algorithm, details of experimental results, or even source code for running experiments.  Note that the reviewers and the program committee reserve the right to judge the paper solely on the basis of the 8 pages, 9 pages including citations, of the paper; looking at any extra material is up to the discretion of the reviewers and is not required.
(Emphasis mine.)  Now, before everyone goes misinterpreting what I'm about to say, let me make it clear that in general I like the idea of supplementary materials, given our current publishing model.

You can think of the emphasized part of the call as a form of reviewer protection.  It basically says: look, we know that reviewers are overloaded; if your paper isn't very interesting, the reviewers aren't required to read the supplement.  (As an aside, I feel the same thing happens with pages 2-8 given page 1 in a lot of cases :P.)

I think it's good to have such a form of reviewer protection.  What I wonder is whether it also makes sense to add a form of author protection.  In other words, the current policy -- which seems only explicitly stated in the case of NIPS, but seems to be generally understood elsewhere, too -- is that reviewers are protected from overzealous authors.  I think we need to have additional clauses that protect authors from overzealous reviewers.

Why?  Already I get annoyed with reviewers who seem to think that extra experiments, discussion, proofs or whatever can somehow magically fit in an already crammed 8 page paper.  A general suggestion to reviewers is that if you're suggesting things to add, you should also suggest things to cut.

This situation is exacerbated infinity-fold with the "option" of supplementary material.  There now is no length-limit reason why an author couldn't include everything under the sun.  And it's too easy for a reviewer just to say that XYZ should have been included because, well, it could just have gone in the supplementary material!

So what I'm proposing is that supplementary material clauses should have two forms of protection.  The first is the existing one, protecting reviewers from overzealous authors.  The second is the reverse, something like:
Authors are not obligated to include supplementary materials.  The paper should stand on its own, excluding any supplement.  Reviewers must take into account the strict 8 page limit when evaluating papers.
Or something like that: the wording isn't quite right.  But without this, I fear that supplementary materials will, in the limit, simply turn into an arms race.


Keith said...

Protection against reviewers seems like a good idea in general, but I'd hope most reviewers wouldn't abuse it even without. After all, it's supposed to be optional.

That brings me to a question I've had - what's the intended benefit of supplementary materials? Or rather, the benefit to the author. For example, software supplements were traditionally posted on the web and that's mostly fine these days, though some of the other resources no longer exist. Sorry if I'm just uninformed - is there a good writeup somewhere of the benefits to author, reader, and venue?

Paul said...

Sharing Detailed Research Data Is Associated with Increased Citation Rate.

Supplementary material makes it easier to build off of or understand a paper, increasing the value to the reader. The author and the venue benefit in reputation when readers view them as providing more value.

Suresh Venkatasubramanian said...

It's possible what you say could happen. However, in theory conferences we typically allow an appendix for extended proofs with a caveat that the paper must contain enough information in the regular page limit to get an idea of the proof. In other words you can't just shunt all proofs off to an appendix. This works out fairly well: reviewers can read the detailed proofs, and no one complains that the author DIDN'T include a detailed proof. To be fair though, authors usually have detailed proofs to begin with.

Kristy said...

FYI, EMNLP is in fact allowing supplementary materials now, with a clause similar to the NIPS one:

"The supplementary material should be supplementary (rather than central) to the paper. It may include explanations or details of proofs or derivations that do not fit into the paper, lists of features or feature templates, sample inputs and outputs for a system, pseudo-code or source code, and data. The paper should not rely on the supplementary material: while the paper may refer to and cite the supplementary material and the supplementary material will be available to reviewers, they will not be asked to review or even download the supplementary material."

decorr said...

I recently saw a refreshing policy at CICLing 2011, titled "Verifiability, reproducibility and open source policy". How does this fit with the "Supplementary Materials" paradigm? My personal take is that a policy like CICLing's is more rigorous and to a large extent guarantees great research and advances in science.

hal said...

@Keith: I think the general idea is that sometimes, as an author, you really have things you want to say that just don't fit. There are lots of reasons this could happen. One is long, tedious derivations or proofs, which, as Suresh pointed out, you often have anyway. Case in point: my JAIR 2006 paper (incidentally my second most cited paper) was rejected three times in a row from ACL, EMNLP and then ACL again, basically because 8 pages wasn't enough to include a derivation and reviewers couldn't verify it. (There were other complaints, but that was the consistent one.) Having supplementary material could have saved this.

Now, just because you _can_ say something doesn't mean that you _should_. I hope supplements don't turn into kitchen sinks for extra experiments, etc. But putting more details about a data set, or exactly how you ran experiments or whatever seems like a reasonable thing to do.

Keith said...

I can definitely see the benefit for detailed proofs, but it seems like the benefit would be really tied into the way the reader accesses the supplement. If it's a small, easy-to-miss link it might not help readers much. It also depends a bit on how people find articles - they may not even think to look for a supplement if they're just using Google Scholar (unless there's a note+link in the pdf).

In some cases it really feels like a hack to get around the page limit though. Without supplements, if you can't fit something in 8-10 pages, you revise and make it a journal article, like your JAIR paper. Or you publish a short version at a conference and add the detail in a journal followup article (in cases where the additional material is meaningful).

Mcenley makes a good point about reproducibility and such, which I would assume is more directed to source code and experimental design. At the same time, it's a lot of effort to get source code and dependencies packaged nicely enough so that the supplement is meaningful. And even then, it probably wouldn't affect acceptance/rejection unless conferences change significantly. So it seems like a lot of work for a little benefit, although if your software is easy enough to use it could lead to many citations.

Then you have textual supplements like extra descriptions or explanations, but I can picture that becoming "oh, I wrote this section and didn't include (or edit) it, so why not?"

With the exception of software/code, it just seems like an awkward way to have a variable page limit.

hal said...

@Keith: yes, I agree. I think that supplements should just end up as appendices at the end of the paper. Since we don't actually print proceedings any more, this is a non-issue.

One way that supplements were described that I like is that "paper is push, supplement is pull." I.e., as an author, the paper is what you want to tell someone who's interested. The supplement is where someone who's really interested can go to find more details.

The problem with the "just send it to a journal" answer is that (at least in NLP/ML), you often don't get the same visibility with a journal article as you would with a conference paper. Plus, in some ways, it's often more prestigious to have a conference paper under the model that _eventually_ anything reasonable will get into a journal. And publishing a short version doesn't always work even if you intend to follow up with a long version, the JAIR paper being a case in point.

But yeah, I agree that I wouldn't want to see this stuff overused.

Keith said...

@Hal I agree about journals vs conferences and it's a shame. The push/pull explanation seems right on.

I worry about the reviewing of extra materials though, mostly because it's optional. If a reviewer doesn't like a supplemental section, you can remove it instead of revising it. (Unless, as you're worried about, reviewers force "essential" extra material in there.)

On the other side of things, I'm concerned that essential material would go in the supplement and a paper would be accepted only because it's present. But I'd much rather read a longer paper where the additional part is coherently woven into the main paper.