26 January 2010

A machine learner's apology

Andrew Gelman recently announced an upcoming talk by John Lafferty. This reminded me of a post I've been meaning to write for ages (years, really) but haven't gotten around to. Well, now I'm getting around to it.

A colleague from Utah (not in ML) went on a trip and spent some time talking to a computational statistician, who will remain anonymous. But let's call this person Alice. The two were talking about various topics and at one point machine learning came up. Alice commented:

"Machine learning is just non-rigorous computational statistics."
Or something to that effect.

A first reaction is to get defensive: no it's not! But Alice has a point. Some subset of machine learning, in particular the more Bayesian side, overlaps quite a bit with compstats, so much so that in some cases the two are probably not really distinguishable. (That is to say, there's a subset of ML that's very, very similar to a subset of compstats... though you could probably fairly easily find some antipodes that are amazingly different.)

And there's a clear intended interpretation to the comment: it's not that we're not rigorous, it's that we're not rigorous in the way that computational statisticians are. To that end, let me offer a glib retort:

"Computational statistics is just machine learning where you don't care about the computation."

In much the same way that I think Alice's claim is true, I think this claim is also true. The part of machine learning that's really strong on the CS side cares a lot about computation: how long, how much space, how many samples, etc., it will take to learn something. This is something that I've rarely seen in compstats, where the big questions really have to do with things like: is this distribution well defined, can I sample from it, etc., now let me run Metropolis-Hastings. (Okay, I'm still being glib.)

I saw a discussion on a theory blog recently that STOC/FOCS is about "THEORY of algorithms" while SODA is about "theory of ALGORITHMS" or something like that. (Given the capitalization, perhaps it was Bill Gasarch :)?) Likewise, I think it's fair to say that classic ML is "MACHINE learning" or "COMPUTATIONAL statistics" and classic compstats is "machine LEARNING" or "computational STATISTICS." We're really working on very similar problems, but the things that we value tend to be different.

Due to that, I've always found it odd that there's not more interaction between compstats and ML. Sure, there's some... but not very much. Maybe it's cultural, maybe it's institutional (conferences versus journals), maybe we really know everything we need to know about the other side and talking wouldn't really get us anywhere. But if it's just a sense of "I don't like you because you're treading on my field," then that's not productive (either direction), even if it is an initial gut reaction.

13 comments:

Bob Carpenter said...

I think the issue here is largely semantic.

Was the inventor of L-BFGS a computational statistician, a machine learnologist, or merely a numerical computationalist? What about all the heavy algorithmic lifting involved in sampling (e.g. Robert and Casella's book)? What about the folks who implemented BUGS? Or something like lmer or even the basic regression in R (or SAS or Stata)? Which field gets to claim the bootstrap or Metropolis-Hastings or Gibbs sampling?

Are all those convergence theorems in NIPS papers more computational stats than machine learning, given that they're not about algorithms per se?

PS. I really think one of the big cultural differences is one Lafferty mentioned, saying "the goal of machine learning is to develop computer programs that predict well, according to some measure of risk or accuracy". That seems to make it a branch of decision theory focusing on algorithms.

Statisticians seem more focused on the analysis of fixed data sets rather than generalization behavior to unseen data. They tend to care more about causality and significance for this reason, because they're trying, in some sense, to use statistics to draw scientific conclusions.

Fernando Pereira said...

To put Bob's and John's points in summary form: machine learning cares about cost and risk, statistics cares about truth (hypothesis testing). In "The Emergence of Probability," Ian Hacking wrote about the conceptual struggle to separate probability (truth) from expectation (risk/reward). My tongue-in-cheek cynical conclusion is that probability allowed statisticians to save their necks in politically perilous times by leaving the prediction and prescription business to economists and other risk-takers.

hal said...

I totally (mostly) agree with Bob and Fernando -- in fact, that's what I tell students on day one of my machine learning class!

But I don't think it's the whole story. There are whole parts of machine learning that don't care about predictions (a large subset of unsupervised learning, for instance). And similarly there are parts of statistics that care about predictions (as Bob points out, some parts of decision theory).

So I think it's a good first approximation, but it's not the whole story.

Anonymous said...

Although it seems to be unfashionable within some of the more statistically oriented parts of the community today, machine learning used to worry about some larger goals of 'intelligence' - something statisticians never really worried about or had much to say about...

Unknown said...

Interesting post! We've been trying to foster just such a collaboration here at UCL in the UK at our "Centre for Computational Statistics and Machine Learning":

http://www.csml.ucl.ac.uk/

Anonymous said...

This is very interesting information. I am doing some research for a class in school. and i liked the post. do you know where I can find other information regarding this? I am finding other information on this but nothing that I can use really in my paper for my final. do you have any suggestions?
