Tribune

Monday, July 15, 2002
Feature

PCs can’t translate!

ILLUSTRATION BY SANDEEP JOSHI

Take this sentence, let translation tools on an Internet search engine work their magic to translate it into Korean and back again, and this is what you get: "It has this elder brother and boil, if magic they in order to translate it again at a Korean and after one, it makes the translation tool in the Internet search engine, this is what you get."

Clearly, computers still can't translate as accurately and artfully as people do.

Many experts doubt they ever will. But recently some researchers have stumbled on what could be a powerful new tool for translators: the World Wide Web.

"People say, look, possibly everything I've ever wanted to say is on the Web and probably already translated on the Web," Eduard Hovy, a researcher at the University of Southern California's Information Sciences Institute, said.

The Web is flooded with translations of everything from novels to corporate documents to personal pages. Some have been translated by people, some by translation software, some by a combination.

The software of one company, Systran (a simple version of which is found at Altavista.com and produced the above example), translates 6 million pages a day. Other players in the field include IBM Corp., SDL International and Bowne Global Solutions, which purchased intellectual property from bankrupt language software company Lernout and Hauspie.

Pooling the lessons suggested in Web pages may someday prove more effective at creating new translations than the current method.

For now, programmers generally assemble dictionaries of words and phrases likely to occur in the documents to be translated, along with rules to help figure out an unfamiliar phrase.
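
The dictionary-and-rules approach described above can be pictured with a small Python sketch. It is an illustration only, not how Systran or any other product named here works: the phrase table, the French renderings and the function name `translate` are all invented for this example. The sketch looks up the longest known phrase first and applies one fallback rule, which is to flag an unfamiliar word rather than guess at it.

```python
# Toy phrase table: every entry is invented for illustration.
PHRASES = {
    ("heavy", "rain"): "forte pluie",
    ("heavy",): "lourd",
    ("rain",): "pluie",
    ("expected",): "attendue",
    ("tomorrow",): "demain",
}
MAX_PHRASE_LEN = max(len(key) for key in PHRASES)


def translate(sentence):
    """Greedy longest-match lookup; unknown words are flagged, not guessed."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest phrase first so "heavy rain" beats "heavy" + "rain".
        for length in range(min(MAX_PHRASE_LEN, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + length])
            if chunk in PHRASES:
                output.append(PHRASES[chunk])
                i += length
                break
        else:
            # Fallback rule: pass the unfamiliar word through, marked.
            output.append("[" + words[i] + "]")
            i += 1
    return " ".join(output)


print(translate("Heavy rain expected tomorrow"))
print(translate("Mike expected tomorrow"))
```

A weather forecast stays inside the table and comes out cleanly. The name "Mike" does not, so the sketch marks it instead of translating it, which is about the best a fixed dictionary can do with a word it has never seen.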

It works well enough for texts with recurring vocabularies and style - weather forecasts or owner's manuals. But it's not a tool to be used on marketing literature or contracts. Dimensions, dates, local currencies, laws and proper nouns - an executive named "Mike" came out as "microphone" on one Web site - are too complex. In short, computers have little common sense. Even a child can tell from the context of a sentence whether the "bank" is a place to borrow money or to fish, but that still largely baffles machines. To deduce those rules, a computer needs millions of examples, laid out in perfectly aligned, translated text.
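
How aligned examples might teach a machine which "bank" is meant can be sketched in a few lines of Python. This is a toy under stated assumptions, not any researcher's actual method: the five example sentences, the French words "banque" and "rive", the stopword list and the function names `train` and `choose` are all made up for illustration, and a real system would need the millions of examples mentioned above rather than five.

```python
from collections import Counter, defaultdict

# Invented aligned examples: (English sentence, French word used for "bank").
ALIGNED = [
    ("she borrowed money from the bank", "banque"),
    ("the bank approved the loan", "banque"),
    ("he opened an account at the bank", "banque"),
    ("we sat on the bank of the river to fish", "rive"),
    ("the boat drifted toward the river bank", "rive"),
]

# Common words that say nothing about which sense is meant (hypothetical list).
STOPWORDS = {"the", "a", "an", "of", "to", "at", "on", "from",
             "we", "he", "she", "bank"}


def train(examples):
    """Count how often each context word appears alongside each translation."""
    counts = defaultdict(Counter)
    for sentence, translation in examples:
        for word in set(sentence.lower().split()) - STOPWORDS:
            counts[translation][word] += 1
    return counts


def choose(sentence, counts):
    """Pick the translation whose training contexts best match the sentence."""
    context = set(sentence.lower().split()) - STOPWORDS
    scores = {t: sum(c[w] for w in context) for t, c in counts.items()}
    return max(scores, key=scores.get)


counts = train(ALIGNED)
print(choose("I need to borrow money from the bank", counts))
print(choose("they fish from the bank of the river", counts))
```

With so few examples the sketch only works when a sentence shares a telling word such as "money" or "river" with the training data, which is the reason the volume of aligned text matters so much.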

So far, researchers have found such examples amid the proceedings of the Canadian Parliament, which are issued in French and English. And the University of Pennsylvania's Linguistic Data Consortium has formatted a Hong Kong legal archive and back issues of a Taiwanese magazine in Chinese and English. But those documents offer little help translating "Randy Johnson is a monster with a wicked fastball," for instance, or the lyrics to a '70s rock anthem. Yet those phrases - or enough parts of them to be useful - may well have been translated somewhere on the Web. The translations may be bad, but they're good enough to get the idea across. — AP