## 20 April 2010

### ((A => B) and (not B)) => (not A)

I remember back in middle school or high school -- added uncertainty so as not to date myself too much -- I first learned of the existence of file compression software. We used pkzip, mostly. I remember one of the first things I did was: compress myfile.exe to myfile.exe.zip. Then compress that to myfile.exe.zip.zip. Then myfile.exe.zip.zip.zip. I cannot tell you how disappointed I was when, at one stage, the file size went up after "compression."

We read papers all the time that demonstrate something roughly of the form "if A then B." The happens most obviously the closer you get to theory (when such statements can be made precise), but happens also in non-theoretical work. The point of this post is: if you believe A => B, then you have to ask yourself: which do I believe more? A, or not B?

The simplest case is the compression case. Let's say a weak compressor is one that always reduces a (non-empty) file's size by one bit. A strong compressor is one that cuts the file down to one bit. I can easily prove to you that if you give me a weak compressor, I can turn it into a strong compressor by running it N-1 times on files of size N. Trivial, right? But what do you conclude from this? You're certainly not happy, I don't think For what I've proved really is that weak compressors don't exist, not that strong compressors do. That is, you believe so strongly that a strong compressor is impossible, that you must conclude from (weak) => (strong) that (weak) cannot possibly exist.

Let's take the most obvious example from machine learning: boosting. The basic result of boosting is that I can turn a "weak classifier" into a "strong classifier." A strong classifier is one that (almost always) does a really good job at classification. A weak classifier is one that (always always) does a not completely crappy job at classification. That is, a strong classifier will get you 99.9% accuracy (most of the time) while I weak classifier will only get you 50.1% accuracy (most of the time). Boosting works by repeatedly applying the weak classifier to modified data sets in order ot get a strong classifier out.

In all the boosting papers, the results are presented as positive. That is: look, obviously we want strong classifiers. But weak classifiers look much more attainable. And voila, by boosting magic, we can turn the weak into the strong. This is an A => B setting: (existence of weak classifiers) => (existence of strong classifiers).

Sounds great, right? But let me ask you: do you believe more strongly that (weak classifiers exist) or more strongly that (strong classifiers do not exist)? For me, it's the latter. To some degree, no free lunch theorems apply here. This yields a totally different read of boosting results, namely: weak classifiers don't even exist!!!

More on the language side, one of the more classic experimentally results we have is that if you give me a really good semantic representation, I can do an awesome job doing natural language generation from those semantics. In order to do translation, for instance, I just have to generate the semantics of a French sentence and then I can generate a near-perfect English translation. (French semantics) => (Perfect translations). But do I believe mose strongly that we can get perfect French semantics or that we can not get perfect translation? Right now, almost certainly the latter.

My point is really one about critical reading. When you read a paper, things will be presented in one light. But that's often not the only light in which they can be interpreted. Apply your own interpretation.

Anonymous said...

My apology in advance for my comments because I'm pretty sure I didn't get your point:) But in the boosting case, the statement A=>B itself is false. Isn't there a limit in getting a strong classifier from weak classifiers?

Igor said...

I have a hard time following the definitions. Does "most of the time" mean "on average"? Otherwise, I don't understand what it means to have accuracy of "50.1% (most of the time)".

Also, a weak classifier is defined as "one that (always always) does a not completely crappy job at classification".

Is that supposed to be "almost always" instead of "always always"?

Thanks.

bj said...

You need to modify this to say that: many many different weak classifiers that when combined to produce a strong classifier all collectively and simultaneously do not exist. This doesn't mean that you can't find one (or a small number) of weak classifiers. I could create an unlimited number of weak classifiers if they were all identical, for example.

Suresh Venkatasubramanian said...

this sounds like the defn of an NP-hardness reduction :). I don't really believe I can solve SAT, so I MUST believe that this other problem is really hard.

hal said...

Well, I have to say that explaining the details of boosting wasn't really the point of this post (which is why I tried to be as hand-wavy as possible about it). But yes, it should be "almost always" (not "always always"). And of course it's not weak classifiers that I'm looking for, but a learning algorithm that's guaranteed to produce weak classifiers for (almost) any data set. The notion of almost is that it only has to do it (say) 90% of the time, and the other 10% of the time it can segfault for all I care. (Though note that you can boost this term, too.)

@suresh: I almost said something about complexity theory, which seems to kind of only have results of the form "if something almost certainly not true is true, then something else almost certainly not true is also true..." I think these are called something about pigs flying theorems or something...

Bob Carpenter said...

Your boosting and compression examples remind me of the (linear) speedup theorem. You can always improve the constant term of an algorithm's performance. (It doesn't work in practice, of course, because unfolding eventually hurts more than it helps due to program size and thus non-locality of instructions.)

Grae BG said...

I like your thinking. But I wonder...

The existence of weak classifiers implies the existence of strong classifiers only if we have assumed that the repeated usage of weak classification leads to strong classification. To be more mathematically precise, you have assumed that you can, through the repetition of weak classification, get arbitrarily close to strong classification. Under this assumption, you are correct, but I don't think that is a safe assumption to make.

A proof by contradiction in mathematics leads to showing an assumption cannot possibly be true; instead of concluding that weak classifiers don't exist - since they clearly do - we must simply question the assumption your implication is grounded upon.

Paul said...

I don't think you even need > 50% accuracy on the weak classifier, just better performance than a coin flip (not necessarily the same if the sizes of the categories are unequal). If I write a classifier that correctly identifies 10 examples it was trained on and flips a coin for everything else, with enough iterations we'll approach a case where every example has been used as a sample for some classifier. With enough additional iterations, the coin flips average out to 0, and you can detect faint deviations from the mean and figure out what the classifiers trained on that sample are reporting.

Absurdly large numbers of classifiers required. But it would work, wouldn't it? So when you don't like (not A) or (not B) you might just be interested in a more limited case.

Paul said...

Slightly more precisely, that algorithm would work on a finite set of classifiable values. If you're trying to classify the integers or reals, you're in no free lunch zone: even if you get 100% precision in some test set, generalizing that to performance on an infinite set will never come out to a measurable value about chance. So (strong classifiers) in the bounded set, unbounded resources, (no weak classifier) in the unbounded set, and the other cases are dependent on how you bound the set and resources.

Andy said...

Weak compression works for many but not all files. If it worked for all cases, then strong compression would be possible. Since weak compression does not work for all cases, the conditional is uninformative (denial of the antecedent).

Okay.

Now the other direction. Strong compression clearly doesn't exist. And yes, from this we can conclude by modus tollens that weak compression, as you defined it, does not exist.

However that's not a very useful definition of weak compression.

Weak compression works for many files.

That weak compression works for many files does not imply that strong compression works for all files.

So I think I have missed the point!

Unknown said...

MBT will not only change,MBT boots,the way you use your MTB Shoes,What are the benefits?christian dior,Free shipping and free return shippingdior shoes,on all diorIncluded with each pair ofdior handbags,an instructional DVDWhat is Dior?dior sunglasses,look at another good paoduct such as Dior totesDo not worryThe urge to buy these goodsnew balancerevolutionary fitness aid from Swiss Masai，new balance shoes,which may help reduce cellulite andnew balance outlet,transforms flat hard artificial surfaces Puma Shoes, with top quality and cheap price. puma outlet,innovative sole design includes thePuma Sneaker,Here you can buy wide rangewholesale cl high heel sandalsquality and cheap car GPS navigation systemsMoncler，Very Cool, Comfortable and lightmoncler jacketsoriginal packing you can rest assured.moncler coatsYou might say that thediscount moncler vestNo one ever thought bothmoncler outlet，As with everything that comes fmonmoncler t-shirtThe new store has been decorated.north faceWe are offering you a wide range ofnorth face outletSome color combinations seem to getnorthfaceBut within the same community north face jacketsbecause it features just the right amount of north face coats,A little of these are given below.ugg bootswas a very well-known French fashionable boot,cheap ugg bootsbecause of the wisdom featuresdiscount ugg bootswhich you are buying is uniqueclassic ugg bootsthey obtain materials from domestic suppliers ugg classic tall boots,A stroll around the park with the GHD IV Salon Styler，We are offering you coach outlet，our store has been decorated coach handbags，Nicecoach totes

Unknown said...

MBT will not only change,MBT boots,the way you use your MTB Shoes,What are the benefits?christian dior,Free shipping and free return shippingdior shoes,on all diorIncluded with each pair ofdior handbags,an instructional DVDWhat is Dior?dior sunglassesanother good paoduct Dior totesDo not worryThe urge to buy these goodsnew balancerevolutionary fitness aid from Swiss Masai，new balance shoeswhich may help reducenew balance outlet,transforms flat hardPuma Shoes, with top quality and cheap price. puma outlet,innovative sole design includes thePuma Sneaker,Here you can buy wide rangewholesale cl high heel sandalsquality and cheapMoncler，Very Cool, Comfortable and lightmoncler jacketsoriginal packingmoncler coatsYou might say that thediscount moncler vestNo one ever thought bothmoncler outletAs with everythingmoncler t-shirtThe new store north faceWe are offering you north face outletSome color combinations northfaceBut withinnorth face jacketsbecausediscount ugg bootsnorth face coats,cheap ugg bootsugg classic tall bootsugg boots,cheap ugg bootsclassic ugg bootsdiscount ugg bootswhich you are buying is uniqueclassic ugg bootsugg classic tall boots,ugg bootsGHD IV Salon Styler，We are offering you coach outlet，our store has been decorated coach handbags，Nicecoach totes

Unknown said...

MBT will not only change,MBT boots,the way you use your MTB Shoes,What are the benefits?christian dior,Free shipping and free return shippingdior shoes,on all diorIncluded with each pair ofdior handbags,an instructional DVDWhat is Dior?dior sunglassesanother good paoduct Dior totesDo not worryThe urge to buy these goodsnew balancerevolutionary fitness aid from Swiss Masai，new balance shoeswhich may help reducenew balance outlet,transforms flat hardPuma Shoes, with top quality and cheap price. puma outlet,innovative sole design includes thePuma Sneaker,Here you can buy wide rangewholesale cl high heel sandalsquality and cheapMoncler，Very Cool, Comfortable and lightmoncler jacketsoriginal packingmoncler coatsYou might say that thediscount moncler vestNo one ever thought bothmoncler outletAs with everythingmoncler t-shirtThe new store north faceWe are offering you north face outletSome color combinations northfaceBut withinnorth face jacketsbecausediscount ugg bootsnorth face coats,cheap ugg bootsugg classic tall bootsugg boots,cheap ugg bootsclassic ugg bootsdiscount ugg bootswhich you are buying is uniqueclassic ugg bootsugg classic tall boots,ugg bootsGHD IV Salon Styler，We are offering you coach outlet，our store has been decorated coach handbags，Nicecoach totes

Unknown said...

Teenagers in a track adidas shoes can go through shoes rate adidas outlet teenage runner well knows that adidas sneakers to wear the right shoes air jordan 2010 any cross country among world jordan shoes especially important to the sportsman discount jordan shoes a young person growing up Pure GHD Not only do they want to bourke dooney handbags grow out of their shoes dooney bourke bags but shape of their feet changes which means every time we dooney bourke outlet the time comes to get dooney bourke totes a new pair of shoes eveyone NIKE SHOX a good choice to get NIKE SHOX SHOES it with a high material SHOX SHOES that breathes easily on random moncler They should be extended runs moncler jackets give enough support to athlets moncler coats especially around the ankles area moncler vest to avoid injury your ankles discount moncler outlet when you make your trip discount moncler t-shirt something in store possible fit cheap ugg boots his is also the way to discount ugg boots that don't fit properly only ugg boots it focuses on your convenience classic ugg boots but they are not returnable ugg classic tall boots to move forward with quality discount ugg boots in the minds of the people classic ugg boots consumers all over the world ugg classic tall boots among public and new styles vibram five finger modern technology and best quality Five finger shoes all these efforts is consumers vibram 5 fingers for game and daily life vibram kso your daily life as well as vibram running shoes suitable for all category people

Anonymous said...

Andy: stop being logical. Relativism is the new cool. By demonstrating reality you only prove your unKooollNesssss dude.

Ann said...

Fashion trends change on daily basis, like Gold GHD. Now Burberry Sunglasses.Our ED Hardy Sunglasses will definitely satisfy your taste. ED Hardy T-shirt chain makes people enjoy the cheap ED Hardy Clothing range easily our ED Hardy Outlet. As we all know, in fact discount ED Hardy, is based. The reason is simple: fashion prohibited by polo boots, in other words, we can say it as polo shoes. Would you like to wear cheap ugg boots? Now welcome to our paul smith outlet. And these are different products, like Puma Shoes (or you can say Puma Sneaker, and you can buy them from our puma outlet), GHD, ED Hardy, UGG, Paul Smith and brand Sunglasses. For those who desire for a ugg boots but have to refrain themselves from buying one ugg boots, there are some company that offer discount ugg boots.

Unknown said...

NewStreetFashion
Ed Hardy
stylish design
Ed Hardy Wholesale
fashion excellent quality
wholesale Ed Hardy
ED Hardy clothing bring you a super surprise!
ed hardy wholesale clothing
The quality is so good
christian audigier
Young and creative style
abercrombie and fitch
You can have a look at it.
abercrombie & fitch
jordan 8
jordan 9
jordan 10

Unknown said...

Really trustworthy blog. Please keep updating with great posts like this one. I have booked marked your site and am about to email it to a few friends of mine that I know would enjoy reading..

sesli sohbet
seslisohbet
sesli chat
seslichat
sesli sohbet sitesi
sesli chat sitesi
sesli sohpet
kamerali sohbet
kamerali chat
webcam sohbet

combattery84 said...
combattery84 said...
combattery84 said...
Anonymous said...

Really trustworthy blog. Please keep updating with great posts like this one. I have booked marked your site and am about to email it

to a few friends of mine that I know would enjoy reading..
seslisohbet
seslichat
sesli sohbet
sesli chat
sesli
sesli site
gÃ¶rÃ¼nlÃ¼tÃ¼ sohbet
gÃ¶rÃ¼ntÃ¼lÃ¼ chat
kameralÄ± sohbet
kameralÄ± chat
sesli sohbet siteleri
sesli chat siteleri
gÃ¶rÃ¼ntÃ¼lÃ¼ sohbet siteleri
gÃ¶rÃ¼ntÃ¼lÃ¼ chat siteleri
kameralÄ± sohbet siteleri
canlÄ± sohbet
sesli muhabbet
gÃ¶rÃ¼ntÃ¼lÃ¼ muhabbet
kameralÄ± muhabbet
seslidunya
seslisehir
sesli sex

Unknown said...

Really trustworthy blog. Please keep updating with great posts like this one. I have booked marked your site and am about to email it

to a few friends of mine that I know would enjoy reading..
seslisohbet
seslichat
sesli sohbet
sesli chat
sesli
sesli site
gÃ¶rÃ¼nlÃ¼tÃ¼ sohbet
gÃ¶rÃ¼ntÃ¼lÃ¼ chat
kameralÄ± sohbet
kameralÄ± chat
sesli sohbet siteleri
sesli chat siteleri
sesli muhabbet siteleri
gÃ¶rÃ¼ntÃ¼lÃ¼ sohbet siteleri
gÃ¶rÃ¼ntÃ¼lÃ¼ chat siteleri
gÃ¶rÃ¼ntÃ¼lÃ¼ muhabbet siteleri
kameralÄ± sohbet siteleri
kameralÄ± chat siteleri
kameralÄ± muhabbet siteleri
canlÄ± sohbet
sesli muhabbet
gÃ¶rÃ¼ntÃ¼lÃ¼ muhabbet
kameralÄ± muhabbet
birsesver
birses
seslidunya
seslisehir
sesli sex