tag:blogger.com,1999:blog-19803222.post114590117947191852..comments2024-03-18T01:45:45.724-06:00Comments on natural language processing blog: Unsupervised Learning: Why?halhttp://www.blogger.com/profile/02162908373916390369noreply@blogger.comBlogger11125tag:blogger.com,1999:blog-19803222.post-1146002991303216272006-04-25T16:09:00.000-06:002006-04-25T16:09:00.000-06:00This seems to put the cart before the horse: you're saying that a problem is worth working on if, after I develop lots of solutions, I can see a difference. Isn't the point of deciding whether a problem is worth working on that you needn't waste time building all these systems? I also agree (to a certain extent -- this is a good topic for another post) that introducing artificial constraints to make humans agree is bad.<BR/><BR/>I'm still not sure what in my original statement you disagree with, though. As far as I can tell, we're in agreement about when one should use unsupervised learning: essentially, whenever we are forced to do human evaluations.halhttps://www.blogger.com/profile/02162908373916390369noreply@blogger.comtag:blogger.com,1999:blog-19803222.post-1146001821648807822006-04-25T15:50:00.000-06:002006-04-25T15:50:00.000-06:00Isn't this too permissive? I can almost always build two baselines with noticeably different performance, or a baseline that's noticeably worse than a human. And it seems odd to want a noticeable difference between two humans: usually we're fighting to get our humans to agree!halhttps://www.blogger.com/profile/02162908373916390369noreply@blogger.comtag:blogger.com,1999:blog-19803222.post-1146000767726680182006-04-25T15:32:00.000-06:002006-04-25T15:32:00.000-06:00Deepak -- <BR/><BR/>I don't think you and I disagree. (There's a small caveat that I don't really know how you distinguish a system from a problem. I can imagine having a system B and then I want to decide if I should try to improve on it? 
I don't know how I can decide that this is an interesting problem to work on without actually trying it.)<BR/><BR/>I'm actually not sure what you're disagreeing with. I'm essentially saying that so long as we can evaluate automatically, we should take a supervised approach. Only if we <I>cannot</I> evaluate automatically (e.g., for word alignment) should we do something unsupervised. Why? For exactly Alex's reason: if we don't know that the metric we're optimizing is good, then we might as well optimize anything and then show via human evaluations (or in Alex's case, BLEU evaluations) that we've done something useful.<BR/><BR/>I think the answer to your counterstatement is just that someone has gone through the effort of <A HREF="http://nlpers.blogspot.com/2006/02/evaluation-criterea-correlation-versus.html" REL="nofollow">showing that improvements in the white-box often lead (or are expected to lead) to improvements in the black-box</A>. I <A HREF="http://nlpers.blogspot.com/2006/02/art-of-loss-functions.html" REL="nofollow">completely sympathize with this</A>. Unfortunately, this doesn't happen all that much :).halhttps://www.blogger.com/profile/02162908373916390369noreply@blogger.comtag:blogger.com,1999:blog-19803222.post-1145976550390159002006-04-25T08:49:00.000-06:002006-04-25T08:49:00.000-06:00I think I can agree with anonymous: I was focusing on cases where the problem really looks supervised, but we just don't have data so we build an unsupervised model for it.<BR/><BR/>However, playing devil's advocate: can someone give me an NLP problem that really <I>is</I> unsupervised by nature?halhttps://www.blogger.com/profile/02162908373916390369noreply@blogger.comtag:blogger.com,1999:blog-19803222.post-1145957915480224152006-04-25T03:38:00.000-06:002006-04-25T03:38:00.000-06:00This is based on the assumption that, anyway, supervised learning is "better"... 
unsupervised learning being an alternative for those who cannot afford an annotated corpus.<BR/>Well, I'm not sure of that. Sup and unsup learning are very different. IMO, unsupervised learning really IS learning, while sup learning is more like a mapping process between two structures that must be quite similar.<BR/>To me, sup and unsup learning are not opposed; they complement each other. Unsup learning builds concepts from raw data (this step is often replaced by the use of 'features'), THEN sup learning builds an 'interface' between this set of concepts and another one (human, machine, ...).Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-19803222.post-1145936371728441742006-04-24T21:39:00.000-06:002006-04-24T21:39:00.000-06:00For NLP, at least we can test the performance of unsupervised learning in one language. If it is good, we can apply it to other languages to reduce the annotation labor.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-19803222.post-1145934894009131472006-04-24T21:14:00.000-06:002006-04-24T21:14:00.000-06:00I guess this runs into the difficulty that you need data to develop features on. 
Doing this requires held-out data, which means cross-validation really wouldn't work (at least if you don't want to cheat).<BR/><BR/>So I guess it also depends on whether you're developing a new system or just trying to port an existing one.halhttps://www.blogger.com/profile/02162908373916390369noreply@blogger.comtag:blogger.com,1999:blog-19803222.post-1145925292621279522006-04-24T18:34:00.000-06:002006-04-24T18:34:00.000-06:00Cross-validation?<BR/><BR/>I agree about "practical amount" though.halhttps://www.blogger.com/profile/02162908373916390369noreply@blogger.comtag:blogger.com,1999:blog-19803222.post-1145925129250599722006-04-24T18:32:00.000-06:002006-04-24T18:32:00.000-06:00Suppose you run out of money after annotating the test data. Then you can only do unsupervised learning on the training data. :) <BR/><BR/>More seriously, I think if unsupervised learning is to be used, it should demonstrate that it outperforms supervised learning with a practical amount of labeled data. "Practical amount" is a fuzzy term and depends on your resources. If you're a grad student, annotating 50 sentences with POS tags is practical; annotating the entire Wall Street Journal with syntax trees is not practical.Kevin Duhhttps://www.blogger.com/profile/07407894290644783502noreply@blogger.com