I remember when I took my first "real" Syntax class, where by "real" I mean "Chomskyan." It was at USC in Fall 2001, taught by Roumyana Pancheva. It was hard as hell but I loved it. However, as a computationally minded guy, I remember snickering to myself the whole time we were talking about movements that get you from deep structure to surface structure. This stuff was all computationally ridiculous.
But why was it computationally ridiculous? It was ridiculous because my mindset, and I think the mindset of most computational folks at the time, was that of n^3 CKY or Earley style parsing. Namely exact parsing in a context free manner. This whole idea of transformations would kill anything like that in a very bad way.
However, there's been a recent shift in attitudes. Sure, people still do their n^3 parsing, but of course none of it is exact anyway (due to pruning). But more than that, things like linear time parsing algorithms as popularized by people like Joakim Nivre and Kenji Sagae and Brian Roark and Joseph Turian, have proved very useful. They work well, are incredibly efficient, and are easy to implement. They're also a bit more psychologically plausible (as Eugene Charniak said recently "we don't know what people are doing, but they're definitely not doing CKY.").
So I'm led to wonder: could we actually do parsing in a transformational grammar using all the new stuff we know about (for instance) left-to-right parsing?
One thing that stands in our way, of course, is the stupid Penn Treebank, which was annotated only with very simple transformations (mostly noun phrase movements) and not really "deep" transformations as most Chomskyan linguists would recognize them.
But I think you could still do it. It would end up as being partially unsupervised, but at least from a minimum description length perspective, I can either spend weights learning more special cases, or I can learn general transformational rules. It would take some thought and effort to write it out and figure out how to actually optimize such a thing, but I bet it could be done in a semester.
So then the question is: aside from smaller models (potentially), is there any other reason to do it?
I can think of at least one: parsing non-declarative sentences. Since almost all sentences in the Treebank are declarative, parsers do pretty crappy when tested on other things. Slav Petrov had a paper at EMNLP 2010 on parsing questions. Here is the abstract, which says pretty much everything:
... We show that dependency parsers have more difficulty parsing questions than constituency parsers. In particular, deterministic shift-reduce dependency parsers ... drop to 60% labeled accuracy on a question test set. We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance.Now, at least in principle, if you can parse declarative sentences, you should be able to parse questions. At least if you know about some basic syntactic transformations in English. (As an aside, the "uptraining" idea is almost exactly the same as the structure compilation idea that Percy, Dan and I had at ICML 2008, though Slav and colleagues apply it to a domain adaptation problem, while we just did simple semi-supervised learning.)
We have observed similar effects in the parsing of commands, such as "Put your head in a noose" where parsers -- even constituency ones -- really really want "Put" to be a noun! Again, if you know simple transformations -- like subject dropping -- you should be able to parse commands if you can already parse declarations.
As with any generalization, the hope is that by realizing the generalization, you don't need to store so many specific cases. So if you can learn that commands and questions are simple transformation on declarative sentences, and you can learn to parse declaratives, you should be able to handle the other case.
(Anticipating comments: yes, I know you could try to pre-transform your data, like they do in MT, but that's quite inelegant. And yes, I know you could probably take the treebank and turn a lot of the sentences into commands or questions to create a new data set. But that's kind of missing the point: I don't want to just handle commands or questions... I want to handle anything, even things that I might not have anticipated.)