Example-based Machine Translation

Term Paper 2007 14 Pages

English Language and Literature Studies - Linguistics



0 Introduction

1 The idea of Example-based MT
1.1 History
1.2 EBMT: Definition
1.3 Comparison to Other Approaches

2 Procedure
2.1 How the Systems Work
2.2 Example Database
2.3 Examples of EBMT

3 Critical Evaluation
3.1 General Problems
3.2 Advantages
3.3 Practical Use

4 Conclusion

5 References

0 Introduction

“Now more than ever, the world looks to computers to perform the task of translation.” [INT3] Machine Translation has increasingly become an essential means of assisting or even replacing human translators [INT1]. The need for useful software that fulfils this task has grown because, in the age of the internet, people want to obtain information in their own language. Which approach is appropriate, and which technique works well enough to cope with this challenge?

This paper focuses on Example-based Machine Translation (EBMT), an approach that differs from traditional translation systems but has the advantage of requiring little knowledge and thus being applicable to a great number of languages [INT3].

First, the term Example-based Machine Translation will be defined, followed by a short introduction to the history of EBMT. A comparison of EBMT with other approaches will show the specific characteristics of each system. An overview of the procedure will then be given, and different examples will illustrate the technique. Furthermore, the advantages and disadvantages of EBMT will be evaluated, and the question of whether the technique is useful will be discussed. Finally, a conclusion will sum up the main results.

1 The idea of Example-based MT

Unlike rule-based MT systems, EBMT, as a corpus-based MT approach, draws on a bilingual database of example translations in order to find a suitable combination of previously translated phrases from which a new translation can be produced (Trujillo 1999:203).

1.1 History

Machine Translation in general first came to attention in the United States of America in the 1950s, when sixty Russian sentences were translated into English automatically in an experiment. Its success sparked interest in further machine translation research. Development remained slow until the 1980s, when, as a consequence of rapid progress in computing, statistical models for machine translation were introduced [INT1].

As an alternative to “rule-based” machine translation, “corpus-based” models like EBMT were then suggested [INT3]. In 1984 Nagao Makoto published his idea, which was put into practice from the 1990s onwards (Somers 2003:514).

1.2 EBMT: Definition

The principle of EBMT is to assume that translations of comparable sentences already exist and can help to produce an adequate new translation, which is generated by adapting and recombining earlier translations. This works as follows:

(1) He buys a book on international politics.

The system is supposed to translate example (1) into Japanese, provided that similar previous translations are available:

(2) a. He buys a notebook. Kare wa nōto o kau.
b. I read a book on international politics. Watashi wa kokusai seiji nitsuite kakareta hon o yomu.

Example (2) shows two English sentences with their corresponding Japanese translations. The fragment He buys in sentence a. (Kare wa ... o kau) matches the first part of example (1), and the fragment a book on international politics in sentence b. (kokusai seiji nitsuite kakareta hon) matches the second part. If both fragments, which together cover the sentence to be translated, are combined, the correct translation can be constructed as exemplified in (3):

(3) Kare wa kokusai seiji nitsuite kakareta hon o kau.

(Somers 1999:4-5)
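The recombination in examples (1) to (3) can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the sentence patterns and fragment pairs are aligned by hand from example (2), the slot marker `<X>` and the names `PATTERNS`, `FRAGMENTS` and `translate` are hypothetical, and a real EBMT system would have to find and align such fragments automatically.

```python
# Minimal sketch of EBMT-style recombination, assuming hand-aligned fragments.
# All names and the "<X>" slot marker are hypothetical, chosen for illustration.

SLOT = "<X>"

# Sentence patterns taken from example (2); the object position is left open.
PATTERNS = [
    ("He buys <X>", "Kare wa <X> o kau"),
    ("I read <X>", "Watashi wa <X> o yomu"),
]

# Object fragments taken from example (2), paired with their translations.
FRAGMENTS = {
    "a notebook": "nōto",
    "a book on international politics": "kokusai seiji nitsuite kakareta hon",
}


def translate(sentence):
    """Return a translation built from stored fragments, or None if none fit."""
    text = sentence.rstrip(".")
    for source_pattern, target_pattern in PATTERNS:
        prefix, _, suffix = source_pattern.partition(SLOT)
        if text.startswith(prefix) and text.endswith(suffix):
            # Whatever fills the slot must itself be a known fragment.
            filler = text[len(prefix):len(text) - len(suffix)]
            if filler in FRAGMENTS:
                return target_pattern.replace(SLOT, FRAGMENTS[filler]) + "."
    return None


print(translate("He buys a book on international politics."))
```

For the input sentence (1), the sketch combines the pattern from sentence a. with the fragment from sentence b. and prints the translation given in (3). A sentence that is not covered by the stored material yields `None`, which reflects the dependence of the approach on the example database.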

1.3 Comparison to Other Approaches

“Corpus-based” translation approaches can be divided into statistical machine translation (SMT) on the one hand and EBMT on the other. The main difference is that SMT concentrates on word frequency and word combinations, whereas EBMT focuses on the extraction and combination of phrases [INT4]. Statistical MT is based on the probabilities with which words and sentences occur in a given context in the source and in the target language (Arnold 1994:201-202).

EBMT can be compared to Translation Memory (TM), a technique that resembles EBMT insofar as it also uses equivalent pre-existing material to produce a modified translation. The difference is that in TM the human translator has to identify the suitable examples in the database that are to be used for the translation. In EBMT, this task has to be performed automatically (Somers 2003:514).
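The automatic selection of suitable examples can be illustrated with a small Python sketch. It is not the method of any particular system: it assumes that similarity is measured by word overlap using `difflib.SequenceMatcher` from the Python standard library, and the names `MEMORY`, `similarity` and `rank_examples` are hypothetical. The stored pairs are the two sentences from example (2).

```python
from difflib import SequenceMatcher

# Stored example pairs (source sentence, translation), taken from example (2).
MEMORY = [
    ("He buys a notebook.", "Kare wa nōto o kau."),
    ("I read a book on international politics.",
     "Watashi wa kokusai seiji nitsuite kakareta hon o yomu."),
]


def similarity(a, b):
    """Word-level similarity between two sentences, between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def rank_examples(sentence, memory=MEMORY):
    """Return (score, source, target) tuples, most similar example first."""
    scored = [(similarity(sentence, src), src, tgt) for src, tgt in memory]
    return sorted(scored, key=lambda item: item[0], reverse=True)


for score, src, tgt in rank_examples("He buys a book on international politics."):
    print(f"{score:.2f}  {src}  ->  {tgt}")
```

With this measure, sentence b. of example (2) ranks above sentence a. for the input sentence (1), because it shares the longer word sequence with it. In a TM setting such a ranked list would be shown to the translator, whereas an EBMT system would have to pick and adapt the examples without human help.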

Rule-based MT (RBMT) approaches use rules and features as a reference for the analysis of the input language and the generation of a corresponding translation, whereas EBMT employs a set of examples. In contrast to RBMT, which contains explicit information about syntax and morphology, EBMT holds this knowledge implicitly in the corpus [INT4].



Institution / College: University of Marburg – Fremdsprachliche Philologien