Machine Translation Evaluation and Deployment

Machine Translation

Machine Translation (MT) has been adopted by technology companies. In 2005 there were few practitioners, and only a few more running pilot projects. Today, MT features in every localisation conference. EC-funded research projects are pushing the boundaries of the technology, and translators are getting to grips with the challenges of post-editing machine-translated material. On the web and in social networks, free (no-cost) MT facilities are available from a variety of companies, e.g. Yahoo!'s Babelfish and Microsoft's Translator. Commercial offerings are also now available, e.g. KantanMT.

Bringing MT into your company, whether as an efficiency for document content translation, as an aid in customer care, or as an internal gisting service, is a strategic decision. We would like to help you make that decision and guide you through the process.

Rule-Based Machine Translation (RBMT) has been around for over 20 years. It relies on coding a set of rules that implement the grammatical constructs of a language. These grammatical constructs are then populated from sets of dictionaries. The systems can be tuned by setting priorities within dictionaries and options within the rule sets. Like any other enterprise system, the out-of-the-box system requires optimisation to your company's characteristics. In translation terms, this means style guides and terminology for the target languages.
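
As a rough illustration of how rules, dictionaries and priorities fit together, here is a minimal toy sketch in Python for English to Spanish. Everything in it is an assumption made for illustration: the dictionary entries, the priority numbers, the word classes and the single reordering rule are hypothetical, and the sketch ignores agreement, morphology and everything else a real RBMT product handles.

```python
# Toy rule-based translation sketch (English -> Spanish).
# All entries, priorities and the single grammar rule are illustrative assumptions.

# Each source word maps to candidate translations with a priority.
# The highest priority wins, so tuning means changing these numbers.
DICTIONARY = {
    "the": [("el", 1)],
    "red": [("rojo", 1)],
    "car": [("coche", 2), ("carro", 1)],
    "book": [("libro", 1)],
}

# Minimal word classes used by the grammar rule below.
ADJECTIVES = {"red"}
NOUNS = {"car", "book"}


def lookup(word):
    """Return the highest-priority translation, or the word itself if unknown."""
    candidates = DICTIONARY.get(word)
    if not candidates:
        return word
    return max(candidates, key=lambda entry: entry[1])[0]


def reorder_adjective_noun(words):
    """Grammar rule: English 'adjective noun' becomes 'noun adjective'."""
    out = list(words)
    i = 0
    while i < len(out) - 1:
        if out[i] in ADJECTIVES and out[i + 1] in NOUNS:
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return out


def translate(sentence):
    """Apply the reordering rule, then translate word by word."""
    words = sentence.lower().split()
    words = reorder_adjective_noun(words)
    return " ".join(lookup(word) for word in words)


print(translate("the red car"))  # el coche rojo
```

Raising the priority of "carro" above that of "coche" in the dictionary would change the output without touching the rule, which is the kind of tuning a company terminology list would drive.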

Statistical Machine Translation (SMT) relies on training a statistical engine with large amounts of paired language resources, source and target, e.g. English and French. The trained engine is then used to translate new source material. These technologies are the subject of multiple research efforts across Europe.
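
To make the idea of training on paired material concrete, the sketch below estimates word translation probabilities from a three-sentence English and French corpus using expectation-maximisation, in the style of the well-known IBM Model 1. It is a toy under stated assumptions, not how any particular commercial engine works: the corpus is invented, the function names are hypothetical, and translation is done word by word with no language model and no reordering.

```python
from collections import defaultdict

# Tiny invented parallel corpus: (English tokens, French tokens).
CORPUS = [
    (["the", "house"], ["la", "maison"]),
    (["the", "car"], ["la", "voiture"]),
    (["a", "car"], ["une", "voiture"]),
]


def train_model1(pairs, iterations=10):
    """Estimate t(target_word | source_word) with EM, IBM Model 1 style."""
    src_vocab = {s for src, _ in pairs for s in src}
    tgt_vocab = {t for _, tgt in pairs for t in tgt}
    # Start from a uniform distribution over target words.
    t = {(tw, sw): 1.0 / len(tgt_vocab) for tw in tgt_vocab for sw in src_vocab}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word among the source words in its sentence.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    fraction = t[(tw, sw)] / norm
                    count[(tw, sw)] += fraction
                    total[sw] += fraction
        # M-step: re-estimate probabilities from the expected counts.
        t = {(tw, sw): c / total[sw] for (tw, sw), c in count.items()}
    return t


def translate(sentence, t):
    """Pick the most probable target word for each source word."""
    out = []
    for sw in sentence.lower().split():
        candidates = {tw: p for (tw, s), p in t.items() if s == sw}
        out.append(max(candidates, key=candidates.get) if candidates else sw)
    return " ".join(out)


model = train_model1(CORPUS)
print(translate("a house", model))  # une maison
```

Note that "a house" never appears in the training corpus. The engine translates it because it has learned the individual word pairings from the sentences it did see, which is also why the quality and consistency of the training material matter so much.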

Hybrid Machine Translation is the result of binding RBMT and SMT together in order to leverage the advantages of both. RBMT provides a level of certainty through being based on rules, but it is poor at capturing the idiomatic nature of language. SMT, being based on already translated content, is not constrained in this fashion, but it is limited by the variance in the training material and the effectiveness of the statistical engine. By combining the properties of both, it is hoped to achieve a better outcome.
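
One simple way to picture such a combination is sketched below: a rule-side dictionary takes precedence wherever it has an entry, and a statistical table fills in everything else. This is only one hypothetical arrangement among many, and the function name, the data structures and the probabilities are all invented for illustration.

```python
def hybrid_translate(sentence, rule_dictionary, stat_table):
    """Word-by-word sketch: rule entries win, statistics cover the rest."""
    out = []
    for word in sentence.lower().split():
        if word in rule_dictionary:
            # Rule side: a fixed, approved translation gives certainty.
            out.append(rule_dictionary[word])
        elif word in stat_table:
            # Statistical side: take the most probable learned candidate.
            candidates = stat_table[word]
            out.append(max(candidates, key=candidates.get))
        else:
            # Unknown to both sides: pass the word through unchanged.
            out.append(word)
    return " ".join(out)


# Hypothetical data: the company has fixed "driver" as "pilote" (software sense),
# while the statistical table on its own would prefer "conducteur".
RULES = {"driver": "pilote"}
STATS = {
    "the": {"le": 0.6, "la": 0.4},
    "driver": {"conducteur": 0.6, "pilote": 0.4},
    "cartridge": {"cartouche": 0.9, "chargeur": 0.1},
}

print(hybrid_translate("the driver", RULES, STATS))  # le pilote
```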

Terminology is a key constituent of a company's writing style. Users' understanding of products depends on consistent terminology. Terminology harvesting and consistency play a vital part in both RBMT and SMT technologies.
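
A terminology consistency check can be as simple as the sketch below, which flags translated segments where a source term appears but its approved target term does not. The term list, the sample segments and the function name are hypothetical, and the plain substring matching is an assumption that a real tool would replace with tokenisation and handling of inflected forms.

```python
def find_term_violations(termbase, segments):
    """Return (segment index, source term, expected target term) for each miss."""
    violations = []
    for index, (source, target) in enumerate(segments):
        source_lower = source.lower()
        target_lower = target.lower()
        for source_term, target_term in termbase.items():
            term_in_source = source_term.lower() in source_lower
            term_in_target = target_term.lower() in target_lower
            if term_in_source and not term_in_target:
                violations.append((index, source_term, target_term))
    return violations


# Hypothetical approved terminology (English -> French).
TERMBASE = {"printer": "imprimante"}

# Hypothetical translated segments: (source, target).
SEGMENTS = [
    ("Turn on the printer.", "Allumez l'imprimante."),
    ("The printer is offline.", "Le périphérique est hors ligne."),
]

print(find_term_violations(TERMBASE, SEGMENTS))
# [(1, 'printer', 'imprimante')]
```

The same check can be run over training material before it is fed to an SMT engine, or over the dictionaries of an RBMT system, so that inconsistent terms are caught before they reach the output.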