I'm currently visiting Microsoft Research Asia (in Beijing) for two weeks (thanks for having me, guys!). I speak basically no Chinese. I took one half of a semester about 6 years ago. I know much more Japanese; enough so that I can read signs that indicate direction, dates and times, but that's about it... the remainder is too divergent for me to make out at all (perhaps a native Japanese speaker would feel differently, but certainly not a gaijin like me).
My experience here has reminded me of a paper that Ken Church and Ed Hovy wrote almost 15 years ago now, Good Applications for Crummy Machine Translation. I'm not sure how many people have read it recently, but it essentially makes the following point: MT should enter the users world in small steps, only insofar as it is actually going to work. To say that MT quality has improved significantly in 15 years is probably already an understatement, but it is still certainly far from something that can even compare to translation quality of a human, even in the original training domain.
That said, I think that maybe we are a bit too modest as a community. MT output is actually relatively readable these days, especially for relatively short input sentences. The fact that "real world companies" such as Google and LanguageWeaver seem to anticipate making a profit off of MT shows that at least a few crazies out there believe that it is likely to work well enough to be useful.
At this point, rather than gleefully shouting the glories of MT, I would like to point out the difference between the title of this post and the title of the Church/Hovy paper. I want to know what to do with a crummy translation. They want to know what to do with crummy machine translation. This brings me back to the beginning of this post: my brief experience in Beijing. (Discourse parsers: I challenge you to get that dependency link!)
- This voucher can not be encashed and can be used for one sitting only.
- The management reserves the right of explanation.
- Office snack is forbidden to take away.
- Fizzwater bottles please recycle.
The question I have to ask myself is: are these human translations really better than something a machine could produce? My guess is that machine translation outputs would be less entertaining, but I have a hard time imagine that they would be less comprehensible. I guess I want to know: if we're holding ourselves to the standard of a level of human translation, what level is this? Clearly it's not the average translation level that large tourism companies in China hold themselves to. Can we already beat these translations? If so, why don't we relish in this fact?