
TR06: Babelfish: Real-Time Machine Translation on the Internet
In his ironic science fiction thriller, The Hitchhiker’s Guide to the Galaxy,
Douglas Adams describes a creature called a Babel fish that enables its host
to understand anything said in any language.
You simply stick the fish in your ear and -- voila! you’re multilingual.
No more need for flash cards, language labs, or grammar books.
Just plug in and play the fish.
Ready or not, Adams’ fictional earpiece just made its virtual debut on the Internet.
It’s time to stop surfing and start fishing.
Real-Time "Debabelizer"
On December 9, 1997, Digital Equipment Corporation and SYSTRAN A.G.
launched AltaVista Translation Service,
the first European language translation service for Web content.
For the first time, non-English-speaking users can translate information
on the predominantly English-speaking Web in real time.
The new free service, which is hosted by Digital’s AltaVista Search site
(http://www.altavista.com),
also enables English-only users to understand information
in five European languages:
French, German, Italian, Portuguese, and Spanish.
Not surprisingly, the server itself is called "Babelfish."
Translating Web Pages On the Fly
To translate raw text is simple:
- Go to the AltaVista Translation Service site:
http://babelfish.altavista.com
- Paste the source text into the text box
- Select a target language and click Translate.
If you wish, you can copy and paste the translated text into any type of document.
Or you can reverse the process and translate English text into a foreign language.
To translate Web pages or search results is just as easy
(for more details see Digital’s AltaVista Search site).
Mind Over Machine
Idiomatic texts, such as the one you are reading,
do not lend themselves well to machine translation.
As Digital and SYSTRAN put it:
"The technology works best when the text is grammatically correct
and does not use too many idioms; however, users can usually understand the meaning
of even a poorly written document." This you can judge for yourself.
I find that reading text generated by the AltaVista Translation Service
is not unlike listening to "Voice of America" broadcasts
through heavy state-sponsored static.
The reception could be better, but you get a basic idea of what’s going
on outside your borders.
No doubt, and with good reason, professional translators will build bonfires for AltaVista.
But others -- particularly monolingual Americans -- will erect shrines
to this fast, free, and easy translation service,
no matter how obvious and odious its flaws.
Déjà Vu All Over Again
If all of this sounds vaguely familiar, it is. Machine translation,
like the Internet itself, is a remnant of the Cold War.
After World War II, the idea of decoding natural languages
through mathematical techniques became a reality.
Twenty years of military-industrial research culminated in SYSTRAN,
which was developed in 1968 by Peter Toma in La Jolla, California.
By the late 1980s, this system enabled loyal behemoth customers --
such as the Commission of the European Communities,
the U.S. Air Force, and Xerox Corporation -- to translate mountains of documents,
modify their own dictionaries,
and preserve original document formats during the translation process.
In the early 1990s, SYSTRAN retrofitted its mainframe-based technology to personal computers.
Now, together with Digital, they are back on the world stage,
this time offering free Web page translation to, of all things, individuals.
History of the Future
The history of translation in general says a lot
about the future of real-time machine translation in particular.
Essentially, there are three ways to translate documents:
- Human translation, done by humans who are fluent in the source language
and native to the target language,
is used for sensitive documents that do not contain much redundant material
and are not likely to be revised frequently.
- Computer-assisted translation, or interactive machine translation,
includes modifiable bilingual glossaries and "fuzzy memory"
that compares current texts with previous translations,
allowing humans to accept, reject, or edit those translations.
It requires almost no post-translation editing by humans,
and is used for polished retail publications that go through repeated revisions.
- Machine translation is automatic translation,
and as such requires the use of controlled language in original texts
and extensive post-translation editing by humans.
It is used by military and industrial organizations
that are large and disciplined enough to leverage economies of scale.
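The "fuzzy memory" mentioned under computer-assisted translation can be pictured with a small sketch. The Python below is only an illustration under stated assumptions: the stored sentence pairs, the function name suggest, and the 0.75 similarity threshold are all hypothetical, and the character-based comparison from Python's standard difflib module stands in for whatever matching a commercial tool actually uses.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: previously approved English -> German pairs.
MEMORY = {
    "Click the Save button.": "Klicken Sie auf die Schaltfläche Speichern.",
    "Select a target language.": "Wählen Sie eine Zielsprache aus.",
}


def suggest(sentence, memory=MEMORY, threshold=0.75):
    """Return (stored_source, stored_translation, score) for the closest
    stored sentence, or None if nothing is similar enough.

    The threshold is an assumed value chosen for illustration only.
    """
    best = None
    for source, translation in memory.items():
        # ratio() is 1.0 for identical strings and falls toward 0.0
        # as the two strings share fewer characters in sequence.
        score = SequenceMatcher(None, sentence.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, translation, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


# A new sentence that differs from a stored one by a single word.
match = suggest("Click the Cancel button.")
if match is not None:
    source, translation, score = match
    print(f"Closest stored sentence: {source}")
    print(f"Earlier translation to review: {translation}")
    print(f"Similarity: {score:.2f}")
```

The sketch only proposes an earlier translation. As in the description above, a human translator still decides whether to accept, reject, or edit the suggestion.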
Unlike the AltaVista Translation Service,
all three approaches involve human translators to a greater or lesser degree.
In fact, AltaVista is a translator’s nightmare:
unchangeable databases mechanically processing uncontrolled language worldwide in real time
and in a public space.
Despite its obvious flaws, however, this spectacular experiment is something
to keep your eye on, especially if you are directly involved
in international technical communication.
Remember 1993? In the beginning, the experts thought the Web was science fiction.
Then came the browser wars. In 1996 they thought it couldn’t turn a buck.
Then came electronic commerce. Now they say it has no content.
Enter real-time machine translation.
Each of these breakthroughs was market-driven, and each violated the conventional wisdom
of its time.
Note: This article appeared originally as
"Real-Time Machine Translation on the Internet" in the May 1998 issue of Intercom,
the magazine of the Society for Technical Communication.
It is Copyright 1998, Kurt Ament and STC.
For further copyright information, contact the editor of Intercom,
Maurice Martin (maurice@stc-va.org).
© TC Forum 1998-2001 - http://www.tc-forum.org - file last updated 13 Jan 00