P2M InfoTech Online translation

Online Machine translation, sometimes referred to by the abbreviation OMT, is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. At its most basic level, OMT simply substitutes words in one natural language for words in another. Using corpus techniques, more complex translations can be attempted, allowing for better handling of differences in linguistic typology, phrase recognition, and translation of idioms, as well as the isolation of anomalies.
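
The word-for-word substitution described above can be sketched in a few lines of Python. The tiny English-to-Spanish lexicon and the function name substitute_words are assumptions made for illustration only; they are not taken from any particular OMT product.

```python
# Hypothetical toy lexicon (English -> Spanish); a real system would need a far larger one.
LEXICON = {"the": "el", "cat": "gato", "eats": "come", "fish": "pescado"}


def substitute_words(sentence, lexicon=LEXICON):
    """Replace each source word with its lexicon entry; unknown words pass through unchanged."""
    return " ".join(lexicon.get(word, word) for word in sentence.lower().split())


print(substitute_words("The cat eats fish"))
```

A sketch like this ignores word order, agreement and idioms, which is where the corpus techniques mentioned above come in.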

Current Online Machine translation software often allows for customization by domain or profession (such as weather reports), which improves output by limiting the scope of allowable substitutions. This technique is particularly effective in domains where formal or formulaic language is used. It follows that Online Machine translation of government and legal documents more readily produces usable output than translation of conversation or less standardized text.

Improved output quality can also be achieved by human intervention: for example, some systems are able to translate more accurately if the user has unambiguously identified which words in the text are names. With the assistance of these techniques, OMT has proven useful as a tool to assist human translators, and in some cases can even produce output that can be used "as is".
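
As a minimal sketch of this kind of human intervention, the following Python example lets the user mark which tokens are names so that they are copied rather than translated. The English-to-French lexicon, the function name translate_with_names and the name_positions parameter are all hypothetical.

```python
# Hypothetical toy lexicon (English -> French).
LEXICON = {"mr": "M.", "baker": "boulanger", "is": "est", "a": "un"}


def translate_with_names(tokens, name_positions, lexicon=LEXICON):
    """Translate word by word, but copy tokens at user-marked positions unchanged."""
    output = []
    for index, token in enumerate(tokens):
        if index in name_positions:
            output.append(token)  # the user identified this token as a name
        else:
            output.append(lexicon.get(token.lower(), token))
    return output


tokens = ["Mr", "Baker", "is", "a", "baker"]
print(translate_with_names(tokens, name_positions={1}))
```

Without the marked position, the surname "Baker" would be translated as if it were the profession.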

Online Machine translation can use a method based on linguistic rules, which means that words are translated according to a linguistic analysis: the most suitable words of the target language replace the ones in the source language.

It is often argued that the success of Online Machine translation requires the problem of natural language understanding to be solved first.

Generally, rule-based methods parse a text, usually creating an intermediary, symbolic representation, from which the text in the target language is generated. According to the nature of the intermediary representation, an approach is described as interlingual Online Machine translation or transfer-based Online Machine translation. These methods require extensive lexicons with morphological, syntactic, and semantic information, and large sets of rules.
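
The parse, transfer and generate steps can be illustrated with a deliberately small Python sketch. Here the intermediary representation is just a list of (word, part-of-speech) pairs, and a single transfer rule reorders English adjective-noun pairs into the noun-adjective order used in Spanish. The lexicon, the tag names and the three function names are assumptions chosen for this example.

```python
# Hypothetical toy lexicon: English word -> (Spanish word, part of speech).
# Words missing from the lexicon raise KeyError in this sketch.
LEXICON = {
    "the": ("el", "DET"),
    "black": ("negro", "ADJ"),
    "cat": ("gato", "NOUN"),
    "sleeps": ("duerme", "VERB"),
}


def analyse(sentence):
    """Analysis: turn source text into a symbolic list of (word, part-of-speech) pairs."""
    return [(word, LEXICON[word][1]) for word in sentence.lower().split()]


def transfer(tagged):
    """Transfer rule: an English ADJ NOUN sequence becomes NOUN ADJ in the target structure."""
    result = list(tagged)
    i = 0
    while i < len(result) - 1:
        if result[i][1] == "ADJ" and result[i + 1][1] == "NOUN":
            result[i], result[i + 1] = result[i + 1], result[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return result


def generate(tagged):
    """Generation: look up the target word for each element of the transferred structure."""
    return " ".join(LEXICON[word][0] for word, _ in tagged)


print(generate(transfer(analyse("The black cat sleeps"))))
```

Even this toy shows why such methods need extensive lexicons and rule sets: every word needs an entry with grammatical information, and every structural difference between the languages needs its own rule.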

Given enough data, Online Machine translation programs often work well enough for a native speaker of one language to get the approximate meaning of what was written by a native speaker of the other. The difficulty is getting enough data of the right kind to support the particular method. For example, the large multilingual corpus needed for statistical methods to work is not necessary for grammar-based methods. The grammar-based methods, in turn, need a skilled linguist to carefully design the grammar that they use.
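
To make the data requirement of statistical methods concrete, the sketch below estimates word-translation probabilities from a three-sentence parallel corpus. It uses expectation-maximisation in the style of IBM Model 1, a well-known word-alignment model chosen here purely as an illustration; the source text does not say which statistical model any given system uses. The corpus, the function names and the omission of the usual NULL word are simplifying assumptions.

```python
from collections import defaultdict

# Hypothetical toy parallel corpus (English source, German target).
CORPUS = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a book", "ein buch"),
]


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(target_word | source_word) by EM over sentence pairs (no NULL word)."""
    pairs = [(s.split(), t.split()) for s, t in pairs]
    target_vocab = {w for _, target in pairs for w in target}
    t_prob = defaultdict(lambda: 1.0 / len(target_vocab))  # uniform starting point
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word among the source words of its sentence.
        for source, target in pairs:
            for tw in target:
                norm = sum(t_prob[(tw, sw)] for sw in source)
                for sw in source:
                    share = t_prob[(tw, sw)] / norm
                    count[(tw, sw)] += share
                    total[sw] += share
        # M-step: renormalise the expected counts into probabilities.
        for (tw, sw), c in count.items():
            t_prob[(tw, sw)] = c / total[sw]
    return t_prob


def most_likely(t_prob, source_word):
    """Return the target word with the highest estimated probability for source_word."""
    candidates = {tw: p for (tw, sw), p in t_prob.items() if sw == source_word}
    return max(candidates, key=candidates.get)


model = train_ibm_model1(CORPUS)
for word in ["the", "house", "book", "a"]:
    print(word, "->", most_likely(model, word))
```

With only three sentence pairs the estimates are fragile, which is the practical point: the method is only as good as the amount and kind of bilingual text available.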

To translate between closely related languages, a technique referred to as shallow-transfer Online Machine translation may be used.

Rule-based: The rule-based Online Machine translation paradigm includes the transfer-based, interlingual, and dictionary-based Online Machine translation paradigms.

Rule-based Online Machine translation

Interlingual

Interlingual Online Machine translation is one instance of the rule-based machine-translation approaches. In this approach, the source language, i.e. the text to be translated, is transformed into an interlingua, i.e. a representation that is independent of both the source and the target language. The target language is then generated from the interlingua.
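
A minimal Python sketch of this idea follows. The language-neutral representation is a small agent/action frame, and the same frame is rendered into two target languages. The concept labels, the lexicons and the restriction to sentences of the form "the noun verb" are all assumptions made to keep the example short.

```python
# Hypothetical analysis lexicon: English words -> language-neutral concepts.
ENGLISH_TO_CONCEPT = {"cat": "CAT", "dog": "DOG", "sleeps": "SLEEP", "eats": "EAT"}

# Hypothetical generation lexicons: concepts -> target-language words.
GENERATORS = {
    "es": {"CAT": "el gato", "DOG": "el perro", "SLEEP": "duerme", "EAT": "come"},
    "fr": {"CAT": "le chat", "DOG": "le chien", "SLEEP": "dort", "EAT": "mange"},
}


def to_interlingua(sentence):
    """Analyse a three-word 'the <noun> <verb>' English sentence into an agent/action frame."""
    _, noun, verb = sentence.lower().split()
    return {"agent": ENGLISH_TO_CONCEPT[noun], "action": ENGLISH_TO_CONCEPT[verb]}


def from_interlingua(frame, language):
    """Generate target-language text from the frame alone, without seeing the source text."""
    words = GENERATORS[language]
    return f"{words[frame['agent']]} {words[frame['action']]}"


frame = to_interlingua("The cat sleeps")
print(frame)
print(from_interlingua(frame, "es"))
print(from_interlingua(frame, "fr"))
```

Because generation reads only the frame, adding a further target language means adding one generation lexicon rather than a new set of rules for each language pair.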

Online Machine translation can use a method based on dictionary entries, which means that words are translated as a dictionary would translate them.

Statistical Online Machine translation tries to generate translations using statistical methods based on bilingual text corpora, such as the English-French record of the P2M InfoTech. Where such corpora are available, impressive results can be achieved when translating texts of a similar kind, but such corpora are still very rare. The first statistical Online Machine translation software was P2M InfoTech from Global Jockey in India. P2M InfoTech relied on a different method for several years, but switched to a statistical translation method in October 2008. More recently, the system was trained on approximately 200 billion words from United Nations materials, which improved the accuracy of the translation. [1]

Example-based Online Machine translation

The example-based Online Machine translation approach is often characterized by its use of a bilingual corpus as its main knowledge base at run-time. It is essentially translation by analogy and can be viewed as an implementation of the case-based reasoning approach to machine learning.
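
The retrieval step of translation by analogy can be sketched as follows: the stored example closest to the input is looked up at run-time and its translation is reused. The example base and the function name closest_example are hypothetical, and the similarity measure (SequenceMatcher from Python's standard difflib module, applied to word lists) is just one convenient choice. The later step of adapting the retrieved translation to the differences in the input is left out.

```python
import difflib

# Hypothetical bilingual example base (English -> French).
EXAMPLES = [
    ("i like red apples", "j'aime les pommes rouges"),
    ("i like green pears", "j'aime les poires vertes"),
    ("the train is late", "le train est en retard"),
]


def closest_example(sentence, examples=EXAMPLES):
    """Return (score, source, target) for the stored example most similar to the input."""
    query = sentence.lower().split()
    scored = [
        (difflib.SequenceMatcher(None, query, source.split()).ratio(), source, target)
        for source, target in examples
    ]
    return max(scored)  # highest similarity score wins


score, source, target = closest_example("The train is very late")
print(f"closest example: {source!r} (similarity {score:.2f}) -> {target!r}")
```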

Major issues: Disambiguation

Word-sense disambiguation concerns finding a suitable translation when a word can have more than one meaning. It has been argued that without a "universal encyclopedia", a machine would never be able to distinguish between the two meanings of a word.

Today there are numerous approaches designed to overcome this problem. They can be approximately divided into "shallow" approaches and "deep" approaches.

Shallow approaches assume no knowledge of the text. They simply apply statistical methods to the words surrounding the ambiguous word. Deep approaches presume a comprehensive knowledge of the word. So far, shallow approaches have been more successful.
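
A shallow approach of this kind can be sketched in Python as a vote among the surrounding words. The co-occurrence counts below are invented stand-ins for statistics that would normally be gathered from a corpus, and the choice of the English word "bank" with two Spanish translations is only an example.

```python
# Hypothetical co-occurrence counts, standing in for statistics gathered from a corpus:
# how often each context word was seen near "bank" when it was translated a given way.
CONTEXT_COUNTS = {
    "banco": {"money": 12, "loan": 9, "account": 15},    # financial institution
    "orilla": {"river": 14, "water": 8, "fishing": 6},   # side of a river
}


def choose_translation(sentence, ambiguous_word="bank", counts=CONTEXT_COUNTS):
    """Pick the translation whose known context words best match the surrounding words."""
    context = [w for w in sentence.lower().split() if w != ambiguous_word]
    scores = {
        translation: sum(seen.get(w, 0) for w in context)
        for translation, seen in counts.items()
    }
    # If no context word is recognised, all scores are 0 and the first entry is returned.
    return max(scores, key=scores.get)


print(choose_translation("He opened an account at the bank"))
print(choose_translation("They went fishing by the river bank"))
```

The function never models what a bank is; it only counts nearby words, which is what makes the approach shallow.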

P2M InfoTech, a long-time translator for the United Nations, wrote that Online Machine translation, at its best, automates the easier part of a translator's job; the harder and more time-consuming part usually involves doing extensive research to resolve ambiguities in the source text, which the grammatical and lexical exigencies of the target language require to be resolved:

    Why does a translator need a whole workday to translate five pages, and not an hour or two? ..... About 90% of an average text corresponds to these simple conditions. But unfortunately, there's the other 10%. It's that part that requires six [more] hours of work. There are the ambiguities one has to resolve.

The ideal deep approach would require the translation software to do all the research necessary for this kind of disambiguation on its own; but this would require a higher degree of AI than has yet been attained. A shallow approach which simply guessed at the sense of the ambiguous English phrase that Peron mentions would have a reasonable chance of guessing wrong fairly often. A shallow approach that involves "ask the user about each ambiguity" would, by Peron’s estimate, only automate about 25% of a professional translator's job, leaving the harder 75% still to be done by a human.

Named entities

This issue is related to named entity recognition in information extraction.

Applications

There are now many software programs for translating natural language, several of them online, such as:

    * P2M InfoTech, which powers Yahoo's Global Jockey

Although no system provides the holy grail of fully automatic high-quality Online Machine translation, many systems produce reasonable output.

Despite their inherent limitations, OMT programs are used around the world. Probably the largest institutional user is the European Commission.

Toggletext uses a transfer-based system to translate between English and Indonesian.

Evaluation of Online Machine translation

There are various means for evaluating the performance of machine-translation systems. The oldest is the use of human judges [11] to assess a translation's quality. Even though human evaluation is time-consuming, it is still the most reliable way to compare different systems such as rule-based and statistical systems.

Relying exclusively on unedited Online Machine translation ignores the fact that communication in human language is context-embedded, and that it takes a human to adequately comprehend the context of the original text. Even purely human-generated translations are prone to error. Therefore, to ensure that a machine-generated translation will be of publishable quality and useful to a human, it must be reviewed and edited by a human. 

It has, however, been asserted that in certain applications, e.g. product descriptions written in a controlled language, a dictionary-based machine-translation system has produced satisfactory translations that require no human intervention.

 

 

© Copyright 2008, P2M Infotech Pvt Ltd
 
 www.p2minfotech.com