The Internet is a global communications network, but until now its potential as a global business environment has been limited by language barriers. In the early days, before the Internet became open to business, English dominated. Many users were technical people who, regardless of their native language, had grown accustomed to reading computer manuals in English. Then, with the commercial expansion of the Internet, tens of millions of non-technical people from all walks of life connected. Many of these newcomers do not understand English, are uncomfortable using English, or as a matter of cultural pride would prefer to be addressed in their native tongue.

As a result, businesses of all kinds are rushing to cater to the needs of this growing audience -- providing local-language content, along with selling services and products on-line. This means that an increasing proportion of the content on the Internet is difficult for English-speaking people to decipher. It also means that an increasing number of non-English-speaking Web users cannot understand the content of Web pages written in English. The Web has therefore been fragmenting into local-language entities, and costs have been escalating for companies that want to reach a global audience: they have to have their pages translated into multiple languages, and they have to update those translated versions every time they change the content of a page.

The AltaVista Search site (http://www.altavista.digital.com) adapted to this multi-lingual environment by partnering with companies around the world to set up mirror sites, like Telia in Sweden and Telefonica in Spain, which provide instructions and help in nearly all European languages. AltaVista also lets users limit their searches to pages in a particular country (with the command domain: followed by the country identifier, e.g., domain:de for Germany) and to pages in a particular language using "language tags".
This approach helped global companies with multi-lingual capabilities to navigate through the increasingly language-fragmented Web, but ordinary users and small companies found themselves trapped on their respective language islands. Using the "language tag" capability at AltaVista (picking a particular language in Advanced Search, selecting "give me only a precise count of matches," and performing a search for *), you can see that the AltaVista index includes over 5 million pages in Japanese, 3 million in German, nearly 2 million in French, another million in Spanish, and another million in Italian. Yes, English still predominates (the total in the index today is about 88 million pages), but thousands of non-English pages are added every day -- including pages with information that could be valuable to you if you could both find it and make sense of it. Yes, automatic translation software is available for some languages, but it can be awkward cutting-and-pasting Web page text into the software and then bouncing back to the Web to follow links and explore further. You lose the convenience and speed of information access that makes the Web so compelling. For business communications, automatic translation can provide a good first approximation, and is very useful. But it is not good enough for Web site owners to rely on it as a means of providing their content in multiple languages. If the translation sits on their site, it reflects on their brand; and they need to go out of their way to avoid the glaring gaffes and cultural blunders that automation often generates. That means that they have to pay high prices and cope with long delay times for human experts to polish and correct the output of automated translation. And they also have to provide added disk space for the translated text and develop internal procedures to help manage the increased complexity of a multi-lingual Web site. 
But even if individual users had their own translation software and even if many businesses could afford to maintain multi-lingual sites, language would still be a barrier to search. With all the millions of pages out there, how could you find those pages that have information that is potentially valuable to you but that are only available in languages other than English? And how could non-English-speaking people find pages at English-language sites, even if those sites could (and would want to) provide automatic translations on the fly?

Amazingly, a free service recently added at the AltaVista Search site goes a long way toward providing practical solutions for all these problems, helping to make the Web truly global. This is one of those developments -- like the initial launch of AltaVista -- that change the direction of the Web. The availability of free instantaneous translation helps break down language barriers, opening new markets and business opportunities and fostering international understanding.

ALTAVISTA TRANSLATION FROM A USER PERSPECTIVE

Have you tried the translation service at AltaVista? Go to http://www.altavista.digital.com, submit any query, and click on "translate" next to any of the items in the results list. Or go straight to the translation page by clicking on the "Translations" tab above the query box or by going to http://babelfish.altavista.digital.com. There you can enter a URL, or type or cut-and-paste any text into the box, and you can translate from English to French, German, Italian, Spanish, or Portuguese, or from any of those languages to English. Unless the server is extraordinarily busy, you get the results almost immediately. And unless the text is idiomatic or laden with slang, you are likely to get remarkably good translations.

This service uses automated translation software from Systran. Hence it has the strengths and the limitations of automated translation. Don't expect perfection.
Don't expect it to understand and correct misspellings and grammar. Don't expect artistic and colloquial translations of poetry and rap lyrics. Do expect quick and useful renderings of business-related information. You can get some laughs from checking how it handles tricky figures of speech that would challenge a UN interpreter. Or you can use this as an aid to help you smoothly navigate through foreign pages.

Because the software runs on Digital's powerful Alpha computer systems at the AltaVista site, you get very fast results (though response time may slow somewhat when tens of thousands of users make requests at the same moment). And the translation service is intimately tied into the AltaVista search service, making translation part of your normal Web-navigating experience. Whenever you do a search, matches in your results list that are in any of the six languages now covered come with a "translate" link. Clicking on that link takes you to a page where you select the language you want to translate the page into. Clicking on "translate" again then gives you the page itself -- with all its graphic look and feel, including all its hyperlinks -- with the text in the language of your choice. From there you can continue to explore as you normally do in the Web environment.

This development seems natural for AltaVista, which is based on a massive, language-independent index. Search engines that are built around the syntax of any particular language lock themselves out of the rest of the world. But AltaVista understands nothing about any language. It just captures all the text it finds and treats it all equally. (Within a couple of weeks of the original AltaVista Search site going on-line, the developers got email from people in Korea who had typed in queries using their Korean keyboards and had gotten good results pointing to Korean pages.)

LIMITATIONS

There are limitations.
Today the service only provides translations between English and five European languages -- not among those European languages (e.g., French to German) and not any of the other major languages of the world (such as Arabic, Japanese, Chinese, and Russian).

Also, because of performance issues, the size of the text it will translate is limited -- normally to about 5-10 Kbytes (which is about one to two times the size of the text of the average Web page), but varying to balance performance based on usage. If a document is longer than about half a dozen paragraphs, only the beginning will be translated; then you will encounter the words "TRANSLATION ENDS HERE" and the balance of the target Web page will appear in the original language. If the balance of such a large document is important to you, you can cut-and-paste additional chunks of text from the original into the form at AltaVista's translation page, one piece at a time, by hand. Admittedly, that's awkward, but it solves your immediate problem and prevents one person's "need" to translate an entire book from slowing performance for millions of other people with less demanding requirements.

Also, keep in mind that this service only translates plain text. Words embedded in graphics remain unchanged. Words that appear in Java applets, inside frames, or inside databases do not get translated when you submit a URL for translation. And if you submit for translation a URL that is behind a firewall or on the other side of a password-protected registration page, AltaVista won't be able to fetch and translate the text. But you can cut-and-paste text from an applet, from a database query, from inside a frame, or from a page on your company's intranet. In fact, you can cut-and-paste text from any source at all -- from newsgroups or forums or chat sessions or your email or your own personal files that reside on your personal computer. Or you can simply type in whatever text you like.
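The chunk-by-chunk workaround for long documents can be partly prepared ahead of time. The following is a minimal sketch in Python, not anything AltaVista provides: the function name `split_into_chunks` and the 5000-byte default are assumptions chosen for illustration (the article only says the limit is "about 5-10 Kbytes" and varies with load). It groups whole paragraphs into pieces that stay under a byte limit, so that each piece can be pasted into the translation form in turn.

```python
def split_into_chunks(text: str, max_bytes: int = 5000) -> list[str]:
    """Group paragraphs into chunks whose UTF-8 size stays within max_bytes.

    Paragraphs are assumed to be separated by blank lines. The default
    limit is an illustrative guess, not a documented AltaVista value.
    """
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue  # skip empty runs between paragraphs
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate  # paragraph still fits in the current chunk
        else:
            if current:
                chunks.append(current)
            # Start a new chunk; a single paragraph that is larger than
            # the limit is kept whole and becomes its own oversized chunk.
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Splitting at paragraph boundaries rather than at a fixed byte offset keeps sentences intact, which matters because automated translation works on whole sentences.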
POTENTIAL APPLICATIONS

What can you do with this? Here are a few suggestions:

multi-lingual email correspondence -- Type your messages in the form at the AltaVista translation page. Then cut-and-paste the translated text into the email you are sending. And when you receive messages in a language that you don't understand but that AltaVista can handle, cut-and-paste the email into that same form.

newsgroup, forum, and chat participation -- Read and submit to newsgroups, forums, and even chat in foreign languages, once again by cutting-and-pasting text into the translation form. For convenience, you might want to keep the AltaVista translation page open in a separate window.

travel -- Check local-language Web pages from places where you intend to travel, learning about accommodations and entertainment and events.

news -- Read local-language news stories on the Web, getting a foreign perspective on events, and perhaps greater detail than that offered by global news sources.

games -- Play on-line strategy games (like Diplomacy) with participants all over the world who do not have a common language, but who can use the translation capability at AltaVista to go beyond that barrier.

research -- If you suspect that information that you need (for business, school, or whatever) is available in another language, enter your query words in the translation form; then cut-and-paste the translated text into the query box (adding English-language command words, e.g., link: or host:, if needed). The translated words will have the appropriate accent marks, even if you are unable to generate those accents with your keyboard and software. In addition, you could limit your search to the target language using language tags.

language study -- If you are reading a book in a foreign language, you might want to keep the AltaVista translation form on-line as you do so and type in unfamiliar words, as an alternative to looking them up in a dictionary. You might also benefit from experimenting with automatic translation back and forth to and from the language you are studying, probing to find the limitations, where human knowledge and experience are essential for understanding. Those are the aspects of language that you should focus your efforts on. Automatic translation will gradually transform language study, just as the ready availability of calculators transformed the study of mathematics.

distance education -- Already today, over 5 million people a year take courses at a distance, and many of those courses are delivered over the Internet. Many of these people reside outside the US and are taking courses at US institutions. The ability to rapidly and readily translate messages for email and forums should make it easier for students who are not fluent in English to actively participate in courses delivered in English, and for English-speaking students to participate in courses delivered in foreign languages as well.

K-12 education -- Arrange partnerships with schools and classes in other countries, using the AltaVista translation capability to break through language barriers, so kids with no language in common can carry on dialogues with one another through email and forums. This could be part of social studies programs intended to foster international understanding. It also could be part of after-school club activity or built into model UN exercises.

bilingual education and English as a second language -- This service could prove important to non-English-speaking students in predominantly English-speaking schools and to the teachers and administrators who serve them.

TRANSLATION FROM THE PERSPECTIVE OF THE INFORMATION PROVIDER

Today, some Web sites are required by law or by the charters of their organizations to provide all their content in more than one language. For instance, this is true of government sites in Canada and, in many instances, of sites in the vast and complex realm of the United Nations.
In other cases, while not officially required, multi-lingual capability is highly desirable both from a practical standpoint -- potentially opening new markets for businesses -- and as a matter of respect for the culture and heritage of people in the target audience. So how can companies and organizations take advantage of the AltaVista translation capability, getting maximum benefit at minimum expense?

First, make sure that your pages are in a format that can be translated. If much of your content is plain text, then you are in good shape. But if you are using sophisticated techniques that create pages dynamically on the fly, or are using frames, or the text is generated from databases or appears in Java applets, then you have locked yourself out from taking full advantage of this new capability. Perhaps you should consider creating a plain text version of your pages that will be translatable (and also be indexable by search engines like AltaVista).

If your pages do have translatable text, you could use AltaVista to translate them and save the resulting pages (even large pages created by cutting-and-pasting chunks) at your site, then offer visitors the choice of which language they would like to see. But in that case, you are vulnerable to the vagaries of automatic translation, and a horrendous blunder caused by the inability of the software to understand a colloquial phrase might damage your company's reputation among the very people you are trying to open your site to. You also take on a significant maintenance burden -- having to change your translated pages every time you change the originals -- and additional overhead in terms of disk space and Web site complexity.

But the underlying technology of AltaVista makes possible a very interesting alternative.
Every search at AltaVista generates a unique URL, which a user can bookmark and an information provider can cut-and-paste into Web pages, making hyperlinks that automatically generate particular AltaVista searches, providing up-to-date results whenever you want them. That same capability applies to the translation service. If you do an AltaVista search which yields a particular page in the match list and then click on the word "translate" next to that match, you arrive at the AltaVista translation page with the URL of the target page already in the form. Check the "location" of that translation page -- it is not just http://babelfish.altavista.digital.com. Rather, you see a unique URL that encodes the contents of the translation form -- with the URL you are interested in already entered. You can bookmark that page and get back to it whenever you want. And you can cut-and-paste that massive, complex URL and make a hyperlink from your own Web page to there.

In other words, if you have a Web page with about 5-10 Kbytes of text -- small enough so you can feel confident that in most circumstances AltaVista will translate the whole thing -- you can make a hyperlink from that page that will take a visitor to AltaVista's translation page with the URL of your page already in the box. So you could have a link at the top of your page that tells visitors (perhaps in more than one language), "for a rough translation of this page, click here." Once at the AltaVista translation page, the visitor then chooses the target language and gets the translation, created on the fly, at no cost and no hassle to you.

A simple explanation at your site can set user expectations appropriately. You are not responsible for the quality of the translation. You are providing this link as a convenience. If your pages are in English, this technique would open your site to visitors who do not understand English, but do read French, German, Spanish, Italian, or Portuguese.
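As an illustration of the "translate this page" link, here is a minimal Python sketch that builds the HTML for such a hyperlink. It rests on loud assumptions: the article does not spell out the format of the translation page's unique URL, so the query parameter name `url` and the helper name `translation_link` are hypothetical. In practice you would copy the real address from your browser's "location" field, as described above, rather than rely on this guess.

```python
from html import escape
from urllib.parse import urlencode

# Base address taken from the article. The "url" query parameter used
# below is a HYPOTHETICAL placeholder, not a documented AltaVista name.
TRANSLATION_PAGE = "http://babelfish.altavista.digital.com/"


def translation_link(
    page_url: str,
    label: str = "For a rough translation of this page, click here.",
) -> str:
    """Return an HTML anchor pointing at the translation page with
    page_url pre-filled (under the assumed parameter name)."""
    query = urlencode({"url": page_url})  # percent-encodes the page address
    href = f"{TRANSLATION_PAGE}?{query}"
    # Escape both the attribute value and the visible text for safe HTML.
    return f'<a href="{escape(href, quote=True)}">{escape(label)}</a>'


if __name__ == "__main__":
    print(translation_link("http://www.example.com/products.html"))
```

Generating the link from the page's own address means the same snippet can be reused on every page of a site, instead of looking up and pasting a separate translation URL by hand for each one.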
But how will those people find you in the first place? They'll never translate your pages if they don't know you exist.

Once again, a practical solution is readily available. AltaVista recognizes "keyword" metatags. If you really want to open your site to foreign visitors, make those keywords foreign words. Use keyword metatags on all your pages to provide translations of the words and phrases that potential visitors to your site are likely to search for. First decide what words and phrases are most important. Then use AltaVista to translate them into the target languages and cut-and-paste the translation results into your keyword metatag (with all the accents). Once you have completed your page, go to AltaVista, click on ADD/REMOVE URL, and enter the URL of that particular page. The new text for that page, including the foreign words that you just entered in your keyword metatag, should then be in the AltaVista index in a day or two. Someone searching for those foreign words will be able to find your pages and then, using your translation link to AltaVista, will be able to read the complete text in their native language.

In other words, with minimal effort, you can go a long way toward making your Web site and your Web-based business truly global.

By Richard Seltzer, email: seltzer@samizdat.com, Web: www.samizdat.com.
Last revised on 20 Jan., 1998
URL: http://www.euromktg.com/eng/ed/art/autotranslate.html