A colleague recently asked me if I knew which paper was the first to use the bag of words model. I'm pretty certain it would be an IR paper, but I have no idea which one it would be. Manning+Schutze and Jurafsky+Martin don't have it. I know tf-idf is due to Spärck Jones, but presumably BOW existed before that. The vector space model is often credited to Salton, which is probably the earliest thing I know of, but my guess is that BOW predated even that. Anyone know a citation?