Google Translate has a beta version up for translating from Traditional Chinese (h/t to Fili’s World) into English and from English into both Traditional and Simplified Chinese. Simplified Chinese is commonly used in Mainland China, Singapore, and Malaysia, while Traditional Chinese is most common in Taiwan, Hong Kong, and Macau. According to Fili’s World, Google Translate is better than Babel Fish, and I found it surprisingly good in both directions, Chinese to English and English to Chinese.

I have used online translators to respond quickly to e-mails in Spanish, German and Russian when there is no time to have someone in my firm fluent in these languages help with the response. The translations generally are not bad, but certainly not anywhere near 100% accurate. The Russian translations tend to be the worst, and that makes sense since Russian is the least like English. A good way to test these translators is to input something in English, have it translated into the foreign language, and then translate it back to English. I did this with the Google beta, going from English to Simplified Chinese and then back to English, and I was stunned at how well it did.

It’s not going to replace human translation, but it might just work in a pinch.

  • Something else that is nice about the Google translator is that you can provide feedback for the Chinese language translations. If you find that a translation isn’t quite right, you can click “suggest a better translation” and contribute your own. Kind of like a spoon-fed AI learning technique.

  • youlamda

    Not as good as you described; however, for single words, it sometimes works.

  • Statistical Translation is something that should be on the radar for every international business when doing strategic planning. At some point it is going to do for language what the Internet did for distance. That’s not to say that the technology is anywhere close to perfect, or that there aren’t huge benefits to those who take the time to learn one or more foreign languages. However, ST will fundamentally change the way that businesses operate, and sooner than you may think!
    Links of interest:

  • Shaun —
    Thanks for checking in. Did not notice that. That is pretty cool. Will be interesting to see if it does improve over time.

  • bubu

    I am interested in your articles, and I will care for you in the future. I am Chinese.

  • youlamda —
    Thanks for checking in. It sure ain’t perfect, but it is much better than I would have expected.

  • Benjamin —
    Is statistical translation the same as computer translation? If so, I disagree, at least in the legal arena. Just yesterday, a client sent me a Russian language condo sales agreement he needed turned around really quickly. To speed things up, he provided me with a computer translation of it. The translation was completely incoherent. My client’s first question was whether the deposit was refundable or not. Here is the translation of that provision: “In the case of refuse Buyer from a transaction in the indicated term, advance to the last does not return.” Find me a lawyer who would give legal advice based on that!
    I gave it to our Russian paralegal and she instantly said it was nonrefundable.

  • bubu —
    Thanks for checking in and thanks for the kudos. You are welcome here any time and I encourage you to feel free to comment any time.

  • Dan,
    I’d like to start by saying that I am by no means an expert in this area. However, this technology is something Liwen Bianji is keeping an eye on because it will affect our editing business (aimed at Chinese/Japanese authors) before it is felt in other sectors.
    I’m not sure what your client used for the translation, but I’m guessing it was something freely available over the internet?
    Statistical Translation (ST) is a specific type of computer translation (Machine Translation). Instead of translating a text based on a set of rules developed specifically for the language pair, it uses a corpus of bilingual text. For example, to develop a French to German ST program, one can take a document that has already been translated from French to German, and add both language versions to the corpus. The program then uses statistical methods to analyze the bilingual corpus. For this to work, huge amounts of text are needed for the bilingual corpus. Therefore, Hansard, EU, or UN documents are widely used.
    ST is the new method for most MT under development, and Google Translate is being developed by leaders in the field. However, the problem is that ST does not work so well for general translation at the moment. If I am chatting online with someone in Russia it will be difficult for ST to translate accurately because too much is dependent on context. A certain word can have multiple meanings. However, in specific fields (law, chemistry, biology, etc.) the possibilities are much narrower. For example, an ST program developed specifically for the legal field is much more likely to provide a correct translation for the word “discovery” than a general ST program. This is because it knows that discovery likely only has one meaning in the legal context. If you and I are chatting then it could be a legal term, the name of a space shuttle, or the act of finding out something new.
    There are still many problems associated with ST, and as in your situation with the Russian client, it cannot replace a human. What it can do is aid them by providing a rough translation of a large number of documents so that human translators know what they need to focus their attention on.
    There are many other examples where ST will become useful. It is important to keep in mind that requirements for accuracy are different for each situation. In the customer service field, a client values help in their own language above proper grammar. In the future, a customer service rep in Dallas can use ST enabled chat to explain to a client in Beijing how to properly use a new product (although I’m sure this function will be more widely used for more amorous purposes). Because ST software can be integrated with other programs the possibilities are endless.
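The corpus-based approach described in the comment above can be illustrated with a toy sketch. This is not how Google Translate or any production system works; it only shows the basic idea of learning word correspondences from sentence pairs that have already been translated. The six-sentence French/German corpus, the function names `train` and `best_translation`, and the choice of the Dice coefficient as the association score are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Made-up, sentence-aligned French/German pairs standing in for a real
# bilingual corpus (which would hold millions of sentences).
CORPUS = [
    ("le chat boit", "die katze trinkt"),
    ("le chien boit", "der hund trinkt"),
    ("le chat mange", "die katze frisst"),
    ("le chien mange", "der hund frisst"),
    ("la souris boit", "die maus trinkt"),
    ("l'oiseau boit", "der vogel trinkt"),
]


def train(corpus):
    """Count how often each word, and each source/target word pair,
    appears across the aligned sentence pairs."""
    src_counts, tgt_counts, pair_counts = Counter(), Counter(), Counter()
    for src_sentence, tgt_sentence in corpus:
        src_words = set(src_sentence.split())
        tgt_words = set(tgt_sentence.split())
        src_counts.update(src_words)
        tgt_counts.update(tgt_words)
        pair_counts.update(product(src_words, tgt_words))
    return src_counts, tgt_counts, pair_counts


def best_translation(word, model):
    """Return the target word most strongly associated with `word`
    (Dice coefficient), or None if the word never appeared in training."""
    src_counts, tgt_counts, pair_counts = model
    scores = {
        tgt: 2 * count / (src_counts[word] + tgt_counts[tgt])
        for (src, tgt), count in pair_counts.items()
        if src == word
    }
    if not scores:
        return None
    return max(scores, key=scores.get)


model = train(CORPUS)
print(best_translation("chat", model))   # katze
print(best_translation("mange", model))  # frisst
```

Even in this toy version, the score for a word depends entirely on what the corpus contains, which is one way to see why a corpus restricted to a single field would narrow the candidate meanings of a word like “discovery”.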

  • Benjamin —
    Yes, it was a free internet translator. What you describe is very interesting.
    I was actually called the other day by a client of ours who has a really successful high tech business very much involved with localization and translation. She was heading off to give a speech at the Monterey Institute on translation standards in the legal profession and wanted to “pick my brain” before she left.
    I was dreading the call, thinking it would be terribly boring, but I ended up being fascinated. She told me that getting good legal translations becomes a huge problem when a company reaches a certain size and through our discussion, I began to realize why. My firm relies entirely on using lawyers (or in rare instances, paralegals) who either work for us or whom we have known and trusted forever. Once you get to a certain size, though, that is no longer possible. She asked me if my firm were to triple in size whether we could continue doing our translations as we do now and I realized the answer was no. The people we use are so rare and so valuable that we probably could not find two more of them.
    Hopefully by then we can just use a machine. “A just machine to make big decisions programmed by fellows with compassion and vision. We’ll be clean when their work is done, we’ll be eternally free yes . . . .”

  • Google Chinese Translate

    An interesting discussion is taking place at China Law Blog about Statistical Machine Translation. I think that it is interesting that Dan Harris uses statistical machine translation when a quick response is needed for communication at a legal practi…

  • From the machine translation point of view, legal documents should be fairly easy for the machine since you lawyers use very rigid (and strange?) sentence structure that we mere mortals have to pay the “translation” fee just to have it translated from legal English to plain English. 😉
    In this case, domain specific statistical translation would work the best. I am surprised that no firm has developed this already. Maybe it’s hard for the software firm to break into the market. I suspect there is tremendous inertia against large law firms adopting the software, as they normally deal with big companies and can charge hundreds an hour for their services. Letting the clients know they are using machine translation may be bad for business since it cuts out a lot of billable hours. Small law firms just don’t make enough of a market for the software companies.
    I suspect the lack of good translation software in legal field has less to do with available technologies and more to do with market opportunities.
    Dan, you can ask your client about the program they used to keep track of translation progress and the “translation memory” used in those programs. Translation memory isn’t sophisticated technology. Basically, it keeps whole translated sentences in a database and uses them to assist human translators. It’s amazingly efficient for translation because of the duplication of the same sentences across documents in similar fields. I think legal documents have an even higher occurrence of those sentences.
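The translation memory idea in the comment above, a database of whole sentences that have already been translated, can be sketched in a few lines. This is a minimal illustration and not the design of any particular commercial tool: the class name `TranslationMemory`, the in-memory dictionary used as the "database", and the fuzzy-match fallback with a 0.8 similarity cutoff are all assumptions. The fuzzy match uses `difflib.get_close_matches` from the Python standard library.

```python
import difflib


class TranslationMemory:
    """Stores previously translated sentences and suggests them again."""

    def __init__(self):
        # Maps a source sentence to the translation a human approved earlier.
        self._store = {}

    def add(self, source, target):
        self._store[source] = target

    def lookup(self, sentence, cutoff=0.8):
        """Return (matched_source, stored_translation) or None.

        An exact match is preferred; otherwise the closest stored sentence
        above the similarity cutoff is offered for a human to review."""
        if sentence in self._store:
            return sentence, self._store[sentence]
        close = difflib.get_close_matches(
            sentence, list(self._store), n=1, cutoff=cutoff
        )
        if close:
            return close[0], self._store[close[0]]
        return None


tm = TranslationMemory()
tm.add("This agreement is confidential.", "Este acuerdo es confidencial.")

# An exact repeat and a near repeat both surface the stored translation.
print(tm.lookup("This agreement is confidential."))
print(tm.lookup("This agreement is confidential"))
# An unrelated sentence finds nothing and would go to a human translator.
print(tm.lookup("See you tomorrow."))
```

The lookup only suggests a stored translation; whether a near match is acceptable is still left to the human translator, which matches the "assist" role described in the comment.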

  • Mr. Li —
    Well-written legal documents tend not to use much legalese (particularly international documents) and they tend not to be terribly repetitive of previous documents either. I actually think translating legal documents by machine will prove particularly tough because each word in a legal document can be of such critical importance. “Must”, “shall”, “should”, “may”, “necessary”, “optional”, “compulsory”, often are difficult to translate exactly. How is a machine to know?

  • Dan,
    Getting away from MT/ST, you previously mentioned getting a back translation done for documents that have been translated from English (or a person’s native language) into another language. I think it is worth stressing that this is a simple step that foreign managers in China should not skip for any translated material. It is often only necessary to get a very rough back translation. Foreign managers with even a basic reading level can handle this on their own, even for high-level documents, by supplementing their own skills with a program like JinShan that has a scroll-over character dictionary. There is also an extension called Chinesepera-kun that can be downloaded for Firefox. Chinesepera-kun is slightly better than JinShan in that it has pinyin, but it can only be used within a browser. The kicker is that both of these programs allow you to save new characters to a vocabulary list for later study.
    Jin Shan

  • David Li

    Hi Dan,
    I don’t mean to trivialize the difficulty. Machine translation is hard. It depends on the type of legal documents. Supreme Court opinions would be really hard but contracts would be easier. Given employment or business contracts from 100 different companies, there are a lot of repetitions or similarities among them. These help in training the machine to perform translation.
    The example of “must,” “shall” and “may” is perfect to illustrate why legal documents are easier for machines than general documents. These terms are more narrowly defined legally than in general usage, which helps machines in the face of ambiguities.
    Machine translation is a sexy field but it has never taken off as a business. It has been one of the things that sounds very useful but no field will be first to seriously apply it. Having one’s professional documents be easily translatable by machines is demeaning, at least from the professionals’ perspective. I encountered this firsthand while organizing an open source effort to translate 1,000 pages of technical documents into Chinese. I proposed to use machine translation as the base and humans to review. This was rejected by the group, who preferred to do the translation themselves. The resulting documents? The quality is only a bit higher than the machine translation, but the style isn’t consistent, since everyone writes in their own style, and the terminology is all messed up.

  • Is translating an English text into Chinese and then back again even a good test for statistical translation? For all I know, the statistical processes could be associating my text with another that is a bad translation, and then be switching them back and forth, e.g.,
    See you tomorrow -> ???? -> See you tomorrow.
    Or is this not an issue?
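The concern raised in the comment above can be made concrete with a small sketch. It uses a deliberately flawed, hypothetical English/French word mapping (not the output of any real translator) in which “tomorrow” and “yesterday” are swapped. Because the backward mapping is simply the inverse of the forward one, the round trip reproduces the input exactly even though the intermediate translation is wrong. All names here are made up for illustration.

```python
# Correct reference translations for two English words.
REFERENCE = {"tomorrow": "demain", "yesterday": "hier"}

# A flawed forward "translator" that swaps the two meanings, plus a
# backward "translator" built from the same flawed associations.
FLAWED_FORWARD = {"tomorrow": "hier", "yesterday": "demain"}
FLAWED_BACKWARD = {french: english for english, french in FLAWED_FORWARD.items()}


def round_trip_ok(word):
    """True if English -> French -> English returns the original word."""
    return FLAWED_BACKWARD[FLAWED_FORWARD[word]] == word


def forward_ok(word):
    """True if the forward translation matches the reference."""
    return FLAWED_FORWARD[word] == REFERENCE[word]


for word in REFERENCE:
    print(word, "round trip ok:", round_trip_ok(word), "forward ok:", forward_ok(word))
```

Both words pass the round-trip check and fail the forward check, so a clean round trip shows only that the two directions are consistent with each other, not that the intermediate text is a good translation. Whether real statistical systems behave this way in practice is a separate question that this sketch does not answer.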

  • Benjamin —
    Thanks for the links.

  • Mr. Li —
    Of course every situation is different, but let’s just take my own law firm’s website, which we write in English and then translate into German, Spanish and Chinese. Here’s how we handled each language:
    1. We used a law professor to translate into Chinese and then Steve reviewed her work. She did a fantastic job, very inexpensively. Not shockingly inexpensively because she is so top of the line, but very inexpensive compared to what it would have cost us to use someone in the States. It does not make economic sense for us to use Steve as a translator and so we almost always use really good translators, with Steve reviewing.
    2. We found a very good, quite inexpensive translator in Guatemala for the Spanish. Our Spain-licensed lawyer reviewed her work and it took her maybe 6 hours to bring everything up to “Spain standards.”
    3. German translators are expensive so I had the brilliant idea of using a machine to put it into German and then our Germany-licensed lawyer would revise and finalize. She did one page and told me it took LONGER to revise than if she had just translated it herself. At that point, I decided we absolutely needed to contract out and we ended up finding an ethnic German in South America who did an excellent translation, at a fairly low rate.

  • Micah —
    Really good point. I don’t know in general and I don’t know re Chinese. Damn, and I thought my test was so brilliant.

  • Dan-
    I should apologize for not acknowledging your Steely Dan reference…just haven’t been able to think of a clever reply.

  • Dan

    One tool I use religiously is adsotrans. It translates, but can also annotate Chinese text with pinyin and translations, and does a much better job than desktop dictionary apps. They encourage user additions to their database, so they have a lot of modern transliterations and slang, too.

  • Benjamin —
    Not exactly Steely Dan. Donald Fagen solo album, The Nightfly. Absolutely terrific and this is one of the best songs on it. B+ for getting close (of course, if I learn you searched it out on Google, I will flunk you!)

  • Dan —
    Thanks for checking in and (as my kids say) thanks for sharing.