tag:blogger.com,1999:blog-19803222.post3864970775826105056..comments2024-03-18T01:45:45.724-06:00Comments on natural language processing blog: Bag of Words citationhalhttp://www.blogger.com/profile/02162908373916390369noreply@blogger.comBlogger5125
tag:blogger.com,1999:blog-19803222.post-21510520481363529332007-03-19T20:46:00.000-06:002007-03-19T20:46:00.000-06:00
The use of individual words to represent a document for retrieval purposes probably goes back to the advent of movable type. In Western civilization this means going back to the mid 15th century. One late 16th century work has a rather complete term index, unordered either alphabetically or even by order of appearance. In China movable type appeared in the 11th century.
It wouldn't be too difficult to imagine that indices were generated and employed in China in the wake of the invention of movable type there.

Cryptography may indeed be another route for exploring the early history of bag-of-words representations. To my (admittedly scant) understanding, most Western ciphers, bound as they are to alphabet-type texts, operated on individual characters. Hence there would be little in the way of representations utilizing word-to-word mappings rather than character-to-character mappings. However, pictogram-based languages may hold promise for finding some bag-of-words representation of a document that predates the invention of movable type.
Patrickhttps://www.blogger.com/profile/04539777192807636576noreply@blogger.com
tag:blogger.com,1999:blog-19803222.post-64893139813447069702007-02-08T13:55:00.000-07:002007-02-08T13:55:00.000-07:00
My guess is the early cryptographers. Shannon's 1948 paper A Mathematical Theory of Communication (http://cm.bell-labs.com/cm/ms/what/shannonday/shannon1948.pdf) lays out a "first-order word approximation", which is equivalent to a bag of words. Of course, he generalized to n-gram models. In the paper, he cites cryptographers for the word distributions.
Anonymousnoreply@blogger.com
tag:blogger.com,1999:blog-19803222.post-70116667064028162722007-02-06T01:14:00.000-07:002007-02-06T01:14:00.000-07:00
They don't use the term "bag-of-words", but I think Luhn (1957) and Maron & Kuhns (1959) deserve a look.
Luhn introduced a concept related to what we know as synsets, and the model described by Maron and Kuhns appears to me quite similar to BOW.

The URLs:

http://www.research.ibm.com/journal/rd/014/ibmrd0104D.pdf

http://www.doc.ic.ac.uk/~jmag/classic/1960.On%20Relevance,%20Probabilistic%20Indexing%20and%20Information%20Retrieval.pdf
Anonymousnoreply@blogger.com
tag:blogger.com,1999:blog-19803222.post-30021831867965752682007-02-05T21:32:00.000-07:002007-02-05T21:32:00.000-07:00
I don't have the book, but Mosteller and Wallace (1964) may use BOW.
david bleihttps://www.blogger.com/profile/06292909346075142113noreply@blogger.com