Extending machine translation evaluation metrics with lexical cohesion to document level

Billy T.M. Wong, Chunyu Kit

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

57 Citations (Scopus)

Abstract

This paper proposes using lexical cohesion to facilitate the evaluation of machine translation at the document level. As a linguistic means of achieving text coherence, lexical cohesion ties sentences together into a meaningfully interwoven structure through words with the same or related meanings. A comparison between machine and human translation illustrates one critical distinction between them: human translators tend to use more cohesion devices than machines do. Various ways of applying this feature to evaluate machine-translated documents are presented, including one that does not rely on reference translations. Experimental results show that incorporating this feature into sentence-level evaluation metrics can enhance their correlation with human judgements.

Original language: English
Title of host publication: EMNLP-CoNLL 2012 - 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Proceedings of the Conference
Pages: 1060-1068
Number of pages: 9
Publication status: Published - 2012
Event: 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL 2012 - Jeju Island, Korea, Republic of
Duration: 12 Jul 2012 to 14 Jul 2012

Publication series

Name: EMNLP-CoNLL 2012 - 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Proceedings of the Conference

Conference

Conference: 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL 2012
Country/Territory: Korea, Republic of
City: Jeju Island
Period: 12/07/12 to 14/07/12
