Sunday, November 18, 2007

Machine translation

The other day I had to evaluate a bunch of machine-translated texts. The intent was to determine whether machine translation (MT, for short) was acceptable for highly technical, reasonably short, self-contained texts. My sentence-by-sentence analyses brought a number of interesting observations to light:
- there are elements that should not be translated. From one language to the next, some filler words (prepositions, for example) just go. They only weigh down the text and don't add meaning. An example is "then" in 'if-then' constructions. The poor machine translated "then" as a time construct instead of causality - the results were disastrous, as the added meaning was far from the original. Better to have skipped it altogether.
- some elements cannot be skipped. MT sometimes drops crucial elements, which has the unexpected consequence of making the sentence unreadable. Sometimes the verb or subject is missing. I did not stop to analyze why - I suspect it has to do with style (syntax, really). The machine cannot recognize that an element is important from its position in the sentence.
- common words have many meanings that a human being understands because of context or experience. The word "May" was translated as a form of 'can' instead of the month it stood for.
By the end of the 6-hour exercise, I was in awe of the human brain. Never before had I truly appreciated the complexity of the translation work. MT really demonstrates the enormous amount of weeding that goes into choosing the correct words to translate a thought. It also highlights the thought processes and decision-making abilities of good translators.

The page-by-page analyses showed that even though individual sentences can be properly translated, that is rarely enough to get the meaning across. If important elements are dropped (a negation in a warning, for example), then no matter how grammatical your sentence is, the results are still unacceptable. The texts were rated from poor to good, good being the category just above poor. I would have preferred categories such as laughable, miserable, poor, barely readable and understandable, with good as the highest possible rating. This was an eye-opening experience...

2 comments:

Anonymous said...

I have experienced these inconsistencies when using Babelfish to translate text. At times it was useful when translating Flemish to English, and at other times downright frustrating.
Reassuring that machines can't fully "take over" for the human mind.

Sleepwalker said...

They are not even close. The human mind is so much more nuanced; it is truly amazing.