25 March 2006

Yesterday was Structured Learning Day

I'm out in Chicago at the atomic learning workshop and yesterday was structured prediction day. There were a lot of cool talks on pretty much all the major SP techniques: Charles talked about efficient training of CRFs, Yasemin talked about semi-supervised learning in M3Ns and SVMISOs, Ben talked about efficient training of M3Ns for combinatorial-style problems (like word alignment), Drew talked about planning as a SP problem, Vasin talked about the tradeoffs of local vs. global inference in SP, and I of course talked about search.

I really enjoyed Drew's talk. He is considering the problem of robot navigation, which is typically handled using A* search and some very crafty hand-written heuristic rules for generating the cost map over which A* search runs. He wants to get rid of the crafty part and replace it with learning. The idea is simple and cute: use observed features the robot receives to learn to produce a cost map so that when A* runs it finds the right path. Here, he defines the right path by having a human remote control the robot and drive it to the correct place. He casts this as a max-margin problem and solves it using very fast subgradient methods. There's no paper on this yet, but if all goes well there will be soon. I think there's a lot of stuff here that could strongly influence how we think about SP.

There were many questions raised at the workshop that I deserve significant thought. These include:

  1. What is structured learning?
  2. How does structured learning related to multi-task learning (ala Rich Caruana and others)?
  3. What sort of techniques are there for dealing with un- or partially-labeled data for structured learning?
  4. What can be done about featuritis (i.e., throwing in all possible features)? Also: Is this a legitimate concern?
  5. How important is it what loss function we optimize?
  6. When is local learning enough (ala Dan Roth) and when must you do "global learning?"
  7. What training techniques scale sufficiently?
Almost all of these questions deserve separate posts, and of course they are non-independent. Modulo feedback, I'll probably start at the top and work my way down. (Of course, if John beats me to it, I'll just post on his blog and cross-link here.)


hal said...

I started a separate post on heuristics to discuss this. Please reply there.

Kevin Duh said...

I don't think that structured learning is such a complicated approach--people have worked on prediction problems with structured outputs for years. For example, speech recognition (predicting a sequence of words), communication theory (decoding streams of bits), etc. are all so-called "structured learning" problems. It's just that recently the term has been coined and people looked at things in a more general light.

Hal, regarding your post on the TTI atomic learning workshop--I'd appreciate if you could comment on each of the issues mentioned when you have time. :)