19 July 2007

What's the Use of a Crummy Translation?

I'm currently visiting Microsoft Research Asia (in Beijing) for two weeks (thanks for having me, guys!). I speak basically no Chinese: I took half a semester of it about six years ago. I know much more Japanese, enough that I can read signs that indicate direction, dates and times, but that's about it... the remainder is too divergent for me to make out at all (perhaps a native Japanese speaker would feel differently, but certainly not a gaijin like me).

My experience here has reminded me of a paper that Ken Church and Ed Hovy wrote almost 15 years ago now, Good Applications for Crummy Machine Translation. I'm not sure how many people have read it recently, but it essentially makes the following point: MT should enter the user's world in small steps, only insofar as it is actually going to work. To say that MT quality has improved significantly in 15 years is probably already an understatement, but it is still certainly far from anything that can compare to the quality of a human translation, even in the original training domain.

That said, I think that maybe we are a bit too modest as a community. MT output is actually relatively readable these days, especially for relatively short input sentences. The fact that "real world companies" such as Google and LanguageWeaver seem to anticipate making a profit off of MT shows that at least a few crazies out there believe that it is likely to work well enough to be useful.

At this point, rather than gleefully shouting the glories of MT, I would like to point out the difference between the title of this post and the title of the Church/Hovy paper. I want to know what to do with a crummy translation. They want to know what to do with crummy machine translation. This brings me back to the beginning of this post: my brief experience in Beijing. (Discourse parsers: I challenge you to get that dependency link!)

  • This voucher can not be encashed and can be used for one sitting only.
  • The management reserves the right of explanation.
  • Office snack is forbidden to take away.
  • Fizzwater bottles please recycle.

The first two are at my hotel, which is quite upscale; the second two are here on the fridge at Microsoft. There are so many more examples (in subway stations, on the bus, in tourism brochures, on trains, at the airport) that I could go on collecting these forever. The interesting thing is that although two of these use words that aren't even in my vocabulary (encashed and fizzwater), one is grammatical but semantically nonsensical (what are they explaining?) and one is missing an indirect object (but if it had one, it would be semantically meaningless), I still know what they all mean. Yes, they're sometimes amusing and worth a short chuckle, but overall the important points get across: no cash value; you can be kicked out; don't steal snacks; recycle bottles.

The question I have to ask myself is: are these human translations really better than something a machine could produce? My guess is that machine translation outputs would be less entertaining, but I have a hard time imagining that they would be less comprehensible. I guess I want to know: if we're holding ourselves to the standard of a level of human translation, what level is this? Clearly it's not the average translation level that large tourism companies in China hold themselves to. Can we already beat these translations? If so, why don't we relish this fact?

16 comments:

David Novakovic said...

There is some very promising work being done on SMT for Asian languages; I wish I could say more. But needless to say there are leaps and bounds being made in the field. There are different ways of thinking about SMT, and Google's BLEU scores are just not quite there; there are ways to get much better translations using the same techniques. Obviously I don't know the internals of Google's SMT tech, but the differences run to a very fundamental level. I regret having to be so cryptic, but I respect my agreements :) It does have something to do with this though: http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html

John Wong said...

Let's see how Google Translate performs in these cases:

此優惠券不可兌換現金,而且只可使用一次。
This Coupon non-convertible cash, but can only be used once.

管理層保留最終解釋權。
Management retain the final power of interpretation.

(When the concept of "final" is removed, as in the original sign:
管理層保留解釋的權利。
Management explained to retain the rights.)

辦公室零食不可拿走。
Office snack foods can not take away.

請回收汽水瓶。
Please recall the bottle.

(In the word order of the original sign:
汽水瓶請回收。
Please bottle recycling.)

Hmm. It works okay.

This reminds me of the electronic English-Chinese dictionary that I used to have. It worked great for looking up words. It would be cool if it could do simple MT as well. Now some electronic dictionaries can already do simple MT, e.g. this one. I'm not sure about the technology underlying this or how well it performs, but I wonder what current statistical MT technologies can do in a hand-held device like this.

David Novakovic said...

John: SMT has long been overlooked as a viable MT method simply because of the prohibitively high cost of processing. I'm guessing that would be a limitation on embedded devices too. Hopefully not for long though.

Disclaimer: I don't work for these guys, but I'm very interested in and excited about what they are doing.
I got a little more info about the guys specialising in Asian SMT:
1. They are currently working on over 100 high-quality Asian language pairs, with many more to come.
2. They have support for domain-specific translations (thus resulting in much higher-quality translation).
3. They will have a Thai beta translation done soon. Thai is an extremely hard language to translate because of its lack of spaces, periods and attention :)
4. They are looking for computational linguists from Asian countries to help with pre- and post-processing of their translations. So if you have experience in the area, get in contact with me and I'll give you a contact. Or sign up for their mailing list at asiaonline.net.

malford385 said...

Machine versus people. Not sure if I get that. Whilst I agree with the concept of MT, I don't subscribe to its commercial use. Surely what we should all be looking at is contextual memory and the relationship of language to market, and of market language to brand, business or sector. Systems are already emerging that use artificial intelligence and are developing faster than MT. Ultimately the quality of our language as a communications medium is going to be the demonstrable proof that we can communicate with our fellow man or woman in his country and for his business.

Dion Wiggins said...

Excellent post, Hal. My company is the group that Dave Novakovic was referring to as developing Asian SMT systems, and what you present in this entry is exactly what we subscribe to.

I previously worked as VP and Research Director for Gartner in Asia Pacific, and one of the messages I often tried to get across to people looking for perfection was that it often was not necessary. Good enough for the current task is what is necessary. Many businesses are using wikis, blogs and other tools internally that are far from perfect but usable and good enough for the purpose they are being used for. Sure, they could be better and offer more, but they are good enough.

MT is the same and the examples you pointed out are as good as any. MT has improved greatly and new techniques and new resources are making MT better every day.

Some of the languages we are working on are difficult to deal with programmatically, such as Thai. Thai does not have spaces, punctuation, periods or anything useful to determine words, paragraphs, sentences etc. I have a sentence that runs 27 pages - try feeding that through an SMT system :) Research has been limited to date because the basic low-level tech is hard to master, and it is needed just to reach the point where most other languages start.

We will have our first Thai system up for demo in 2-3 weeks from now. It will be far from perfect, but it will be better than any of the other limited Thai translation systems today based on rules and it will be "good enough." Sure, there will be improvements over time and we are already working on some, but there is also a point of diminishing returns.

We are training on multiple domains with corpus sizes exceeding 10 million sentence pairs. It will be an interesting language to monitor to see how it stacks up in SMT and what the point of diminishing returns is for domains.

As I am sure you are aware, gathering a corpus is no fun: it is time consuming and expensive, so that point of being "good enough" is going to be key for us. At this stage, if key messages are clearly presented, even if the grammar is a little off, then that to me is "good enough".

Future enhancements such as syntax trees and applying morphological data into some of the processes will likely give us greater quality than going too far on corpus.

BTW, my personal favorite from China is "Passage of deformed man". The Chinese read "wheelchair ramp" - that one was not quite "good enough".

Regards

Dion

Bob Carpenter said...

It's always nice to have a low baseline. A particularly useful baseline for speech systems is call center attendant performance. The attendants aren't dumb, they're just stressed from too little time to make a decision (typically under 20 seconds), too little training about the business logic (typically hundreds of destinations, with half a day training and a 3-ring binder), and too little experience (typically under six months).

It really becomes a cost issue. How much are you willing to pay for a good translation? For the Chinese tour bus companies, not much. How much are companies willing to pay for telephone support? Again, not much.

Just don't confuse baselines with toplines. People can do call routing and hotel sign translation at near 100% accuracy.

The bigger issue is that all of NLP is crummy. 90% entity extraction precision means developing systems to deal with errors in 1/10 low level decisions. (Not to mention developing whole new systems to deal with the lack of recall.) 97% tagger accuracy still means one word in a sentence is likely wrong; not coincidentally that word's likely to be the most discriminative one in the sentence (in the TF/IDF sense), such as a noun or adjective, rather than a functional word like "the", which is (almost) always tagged correctly, padding the accuracy stats.
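The arithmetic behind the tagger remark can be checked with a rough sketch. It assumes tagging errors are independent across tokens and takes a 25-word sentence as a typical length; both are illustrative assumptions rather than anything stated in the comment, and the function names are hypothetical.

```python
def p_at_least_one_error(token_accuracy: float, sentence_length: int) -> float:
    """Chance that a sentence has one or more tagging errors.

    Assumes each token is tagged correctly with the same probability,
    independently of the others (a simplifying assumption).
    """
    return 1.0 - token_accuracy ** sentence_length


def expected_errors(token_accuracy: float, sentence_length: int) -> float:
    """Expected number of mis-tagged tokens in a sentence."""
    return (1.0 - token_accuracy) * sentence_length


if __name__ == "__main__":
    accuracy = 0.97  # per-token accuracy quoted in the comment
    for length in (10, 25, 40):  # sentence lengths chosen for illustration
        p = p_at_least_one_error(accuracy, length)
        e = expected_errors(accuracy, length)
        print(f"{length:>2} tokens: P(>=1 error) = {p:.3f}, expected errors = {e:.2f}")
```

Under these assumptions a 25-word sentence has a bit better than even odds of containing at least one wrong tag. The sketch says nothing about which word is wrong, which is the more important half of the point above.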

Anonymous said...

I love how SMT systems omit a negative every now and then, translating the phrase as the exact opposite of its original meaning. Idiomatic expressions and colloquialisms are often a great source of entertainment as well.

florian laws said...

Bob, do you think one should weight accuracy calculation by e.g. the TF/IDF score of the word, to get a more relevant accuracy measure?

Bob Carpenter said...

I'm not a big fan of complexly weighted utility metrics. If you can motivate one with a task, fair enough.

Weighting by IDF would be similar to macro-average results for classifiers (metric is average over types, not tokens).

What you need will depend on task. For "needle in a haystack" kinds of text mining, you need good recall on items not in the training set. For "what do people think of the iPhone", it's much easier to get an answer because the signal is hugely redundant. I'd want high recall for the former and high precision for the latter, most likely.
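The types-versus-tokens distinction can be made concrete with a small sketch. The toy sentence, tags and function names below are made up for illustration; the second function averages per-word-type accuracy, which is one simple way to read "average over types" and is not an IDF weighting as such.

```python
from collections import defaultdict
from typing import Sequence


def token_accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Micro-style accuracy: every token counts equally."""
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)


def type_averaged_accuracy(
    words: Sequence[str], gold: Sequence[str], pred: Sequence[str]
) -> float:
    """Macro-style accuracy: compute accuracy per word type, then average.

    Frequent words such as "the" count once, no matter how often they occur.
    """
    counts = defaultdict(lambda: [0, 0])  # word -> [correct, total]
    for w, g, p in zip(words, gold, pred):
        counts[w][0] += int(g == p)
        counts[w][1] += 1
    return sum(c / n for c, n in counts.values()) / len(counts)


if __name__ == "__main__":
    # Hypothetical toy data: four easy function words and one mis-tagged noun.
    words = ["the", "the", "the", "the", "duck"]
    gold = ["DT", "DT", "DT", "DT", "NN"]
    pred = ["DT", "DT", "DT", "DT", "VB"]
    print(token_accuracy(gold, pred))                 # 4 of 5 tokens correct
    print(type_averaged_accuracy(words, gold, pred))  # 1 of 2 types correct
```

On this toy input the token-level score is padded by the repeated function word, while the type-averaged score is not.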

Andres Dominguez said...

Some rumblings...

Are those texts translations? What is a translation for you? The work of translation is more often than not underestimated, thought of as a trivial task, and it is frequently miserably paid. People start to think about what a translation is only when something has to be done with the product apart from laughing.
Few people would say something is a car if it never moves. People are too used to just taking a look at translations and giving up and trying to understand by themselves through context or else...until they have to deal with longer messages that cannot be guessed from the look of a machine.
MT has been improving a lot. Still, many endeavors in this area would progress faster if people were humble enough to ask what AI people have so often failed to ask:
what is our general theory of this? (in this case, a general theory of translation)
The Turing test for AI was a bad premise. A parrot can talk and often fool people, and yet few people would say it is very intelligent. The Turing test and the reluctance to think about what intelligence really is have led to lots of nice gadgets but too few advances for the effort put into AI.
In the same way as many AI people in general have often failed to sit still for a moment and think of a theory of mind (remember On Intelligence, by Jeff Hawkins?), many people in NLP have failed to ask first what their theory of language is (and a theory of language needs to be more than a chosen formalism).

What do we expect from MT? To surpass the work of people who are doing their best at a job they are not capable of, but for which most companies do not want to pay enough? Do we want the system to do some kind of understanding? (very high level, and then define understanding)
Or do we take some average path and decide to go for a system that can render the intended meaning of most of the sentences? Say 70, 80%?
How robust? In what text fields?

Andrés


Crossminder
