See the post on Hunch. Maybe that last 1.5% would benefit not from fancy ML, but from fancy (or even stupid) NLP. "But what data do we have to run the NLP on," my friends may ask. How about stuff like this, or if you're adventurous like this? (Had I enough time, I might give it a whirl, but alas...)
p.s., If any of the above links encourage copyright violations, then I'm not actually advocating their use.
Oddly enough, here at matchmine (a company matching personal interests with various media) I was just talking to one of our ML scientists about this.
We're using a mesh of statistical and linguistic NLP as part of our solution. While we're not (currently) diving as deep as the scripts, we're pulling in ancillary metadata and combining it with standard ML techniques.
Basically, like the BellKor guys, we've realized you need a cocktail of ideas mixed together. Time will tell how we fare, but it's promising to see others thinking along the same lines.
BTW - Regarding copyright issues, this is a great reason to support the open licensing models. As they become more ubiquitous (and semantically defined), ML and NLP work becomes that much more viable.
Oh, your blog is of great interest to me. I do know the Netflix company. It is the largest on-line DVD and video game rental service. Netflix is a market leader in on-line DVD rental service in the United States. That is a total rip off. I did have huge problems with the delivery of the goods. And know where the consolation is? www.pissedconsumer.com. This is an amazing site to express dissatisfaction and get the right advice.