Name:TRAINING
Description:Training of statistical models for automatic translation
Abstract:The field of application is the training of statistical models for automatic translation, a topic in natural language processing within the area of pattern recognition and artificial intelligence. Training statistical models to translate from a source language to a target language requires large amounts of data in the form of bilingual sentence pairs (source-target), in which each target sentence is a translation of its source sentence.<BR/><BR/>The number of sentence pairs used to train the translation models often exceeds one million. Training is an iterative algorithm that passes over this data set repeatedly and adjusts the model parameters at each iteration.<BR/><BR/>Storing and efficiently accessing these parameters is a further computational challenge, because it involves statistical bilingual dictionaries in which each word in a target sentence is a potential translation of every word in the source sentence. This leads to sparse bilingual dictionaries that in some cases contain billions of entries.<BR/><BR/>Training these models typically takes several weeks of computation, depending on the number of iterations of the algorithm.<BR/>The result of training is the set of learned bilingual dictionaries, together with an alignment between the words of each bilingual sentence pair used to train the model.<BR/>
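The iterative training described in the abstract can be illustrated with a minimal sketch. The abstract does not name the algorithm, so this sketch assumes an expectation-maximization procedure in the style of IBM Model 1, without a NULL word. The function names `train_lexicon` and `align` and the three toy sentence pairs are hypothetical and serve only as illustration. The dictionary `t` is stored sparsely: it holds an entry only for word pairs that co-occur in at least one sentence pair.

```python
from collections import defaultdict


def train_lexicon(pairs, iterations=10):
    """Estimate t[(target_word, source_word)] by EM (assumed IBM Model 1 style)."""
    # Sparse dictionary: only co-occurring word pairs get an entry.
    # A constant start value acts as a uniform initialization, because
    # the E-step below renormalizes over the source words.
    t = {(f, e): 1.0 for src, tgt in pairs for f in tgt for e in src}
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of links per source word
        # E-step: share each target word among the source words of its pair.
        for src, tgt in pairs:
            for f in tgt:
                z = sum(t[(f, e)] for e in src)
                for e in src:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: re-estimate the dictionary from the expected counts.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}
    return t


def align(pair, t):
    """For each target word, return the index of its most probable source word."""
    src, tgt = pair
    return [max(range(len(src)), key=lambda i: t[(f, src[i])]) for f in tgt]


# Hypothetical toy corpus of (source sentence, target sentence) pairs.
pairs = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = train_lexicon(pairs)
alignments = [align(p, t) for p in pairs]
```

Each iteration visits every target word against every source word in every sentence pair, which suggests why both the running time and the dictionary size grow quickly on corpora of more than one million pairs.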

Created:2011-03-22
Last updated:2011-03-22