Aaron Li-Feng Han
Natural Language Processing & Portuguese-Chinese Machine Translation Laboratory
University of Macau, Macau S.A.R., China
Lexical similarity
Linguistic features
Metrics combination
Motivation
LEPOR Metrics Description
Performances on international ACL-WMT corpora
Publications and Open source tools
Other research interests and publications
Time-consuming
Expensive
Unrepeatable
Low agreement (Callison-Burch et al., 2011)
Precision-based
BLEU (Papineni et al., 2002 ACL)
Recall-based
ROUGE (Lin, 2004 WAS)
Precision and Recall
Meteor (Banerjee and Lavie, 2005 ACL)
Word-order based
NKT_NSR (Isozaki et al., 2010 EMNLP), Port (Chen et al., 2012 ACL), ATEC (Wong et al., 2008 AMTA)
Word-alignment based
AER (Och and Ney, 2003 J.CL)
Edit distance-based
WER (Su et al., 1992 Coling), PER (Tillmann et al., 1997 EUROSPEECH), TER (Snover et al., 2006 AMTA)
Language model
LM-SVM (Gamon et al., 2005 EAMT)
Shallow parsing
GLEU (Mutton et al., 2007 ACL), TerrorCat (Fishel et al., 2012 WMT)
Semantic roles
Named entity, morphological, synonymy,
paraphrasing, discourse representation, etc.
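Sub-factors:
Length penalty (c: length of the output sentence, r: length of the reference sentence):
$$LP = \begin{cases} \exp\left(1 - \frac{r}{c}\right) & : c < r \\ 1 & : c = r \\ \exp\left(1 - \frac{c}{r}\right) & : c > r \end{cases} \tag{1}$$
N-gram position difference penalty:
$$NPosPenal = \exp(-NPD) \tag{2}$$
$$NPD = \frac{1}{Length_{output}} \sum_{i=1}^{Length_{output}} |PD_i| \tag{3}$$
$$|PD_i| = |MatchN_{output} - MatchN_{ref}| \tag{4}$$
MatchN_output: position of matched token in output sentence
MatchN_ref: position of matched token in reference sentence
Precision and recall:
$$P_n = \frac{\#\,ngram\ matched}{\#\,ngram\ chunks\ in\ system\ output} \tag{5}$$
$$R_n = \frac{\#\,ngram\ matched}{\#\,ngram\ chunks\ in\ reference} \tag{6}$$
$$HPR = Harmonic(\alpha R_n, \beta P_n) = \frac{\alpha + \beta}{\frac{\alpha}{R_n} + \frac{\beta}{P_n}} \tag{7}$$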
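The sub-factors above can be sketched in a few lines of Python. This is a minimal illustration, not the released LEPOR tool: the function names are hypothetical, the word alignment is assumed to be computed elsewhere and passed in as 1-based (output position, reference position) pairs, positions are assumed to be normalized by sentence length, and unmatched output tokens are simply left out of the sum.

```python
import math

def length_penalty(c, r):
    """LP as in Eq. (1); c = output length, r = reference length (both > 0)."""
    if c < r:
        return math.exp(1 - r / c)
    if c > r:
        return math.exp(1 - c / r)
    return 1.0

def npos_penal(alignment, out_len, ref_len):
    """NPosPenal = exp(-NPD), following Eqs. (2)-(4).

    alignment: list of (output_pos, ref_pos) pairs, 1-based, assumed given.
    Positions are divided by sentence length (an assumption of this sketch).
    """
    total = sum(abs(o / out_len - r / ref_len) for o, r in alignment)
    npd = total / out_len
    return math.exp(-npd)

def precision_recall(matched, out_len, ref_len):
    """Precision and recall from the number of matched tokens."""
    return matched / out_len, matched / ref_len

def harmonic(alpha, recall, beta, precision):
    """Weighted harmonic mean of recall and precision, as in Eq. (7)."""
    if recall == 0 or precision == 0:
        return 0.0
    return (alpha + beta) / (alpha / recall + beta / precision)

if __name__ == "__main__":
    alignment = [(1, 1), (2, 2), (3, 4)]   # toy alignment, hypothetical
    p, r = precision_recall(len(alignment), out_len=4, ref_len=5)
    print(length_penalty(4, 5), npos_penal(alignment, 4, 5), harmonic(1.0, r, 1.0, p))
```

Keeping the alignment as an explicit input separates the scoring arithmetic from the matching strategy, which is where most of the design effort in such a metric lies.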
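LEPOR Metrics:
$$LEPOR = LP \times NPosPenal \times Harmonic(\alpha R, \beta P) \tag{8}$$
$$hLEPOR = Harmonic(w_{LP} LP,\ w_{NPosPenal} NPosPenal,\ w_{HPR} HPR) = \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{Factor_i}} = \frac{w_{LP} + w_{NPosPenal} + w_{HPR}}{\frac{w_{LP}}{LP} + \frac{w_{NPosPenal}}{NPosPenal} + \frac{w_{HPR}}{HPR}} \tag{9}$$
$$nLEPOR = LP \times NPosPenal \times \exp\left(\sum_{n=1}^{N} w_n \log HPR_n\right) \tag{10}$$
$$hLEPOR_{final} = \frac{1}{w_{hw} + w_{hp}} \left(w_{hw}\, hLEPOR_{word} + w_{hp}\, hLEPOR_{POS}\right) \tag{11}$$
R and P in (8) are the word-level recall and precision; HPR_n is HPR computed over n-grams of order n; hLEPOR_word and hLEPOR_POS are hLEPOR computed on the word sequence and on the POS sequence.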
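As a rough illustration of how the factors combine into the LEPOR variants, the sketch below takes already-computed factor values and applies the product form, the weighted harmonic form, and the n-gram geometric form. The function names and the default weights of 1.0 are assumptions made for this example; tuned weights differ per language pair and are not reproduced here.

```python
import math

def weighted_harmonic(weights, factors):
    """Weighted harmonic mean: sum(w_i) / sum(w_i / factor_i)."""
    if any(f <= 0 for f in factors):
        return 0.0  # a zero factor drives the harmonic mean to zero
    return sum(weights) / sum(w / f for w, f in zip(weights, factors))

def lepor(lp, npos_penal, precision, recall, alpha=1.0, beta=1.0):
    """Product form: LP * NPosPenal * Harmonic(alpha*R, beta*P)."""
    return lp * npos_penal * weighted_harmonic([alpha, beta], [recall, precision])

def hlepor(lp, npos_penal, hpr, w_lp=1.0, w_npp=1.0, w_hpr=1.0):
    """Harmonic form over the three factors."""
    return weighted_harmonic([w_lp, w_npp, w_hpr], [lp, npos_penal, hpr])

def nlepor(lp, npos_penal, hpr_by_n, w_by_n):
    """N-gram form: LP * NPosPenal * exp(sum_n w_n * log HPR_n)."""
    if any(h <= 0 for h in hpr_by_n):
        return 0.0
    log_sum = sum(w * math.log(h) for w, h in zip(w_by_n, hpr_by_n))
    return lp * npos_penal * math.exp(log_sum)

def hlepor_final(h_word, h_pos, w_hw=1.0, w_hp=1.0):
    """Weighted average of word-level and POS-level hLEPOR."""
    return (w_hw * h_word + w_hp * h_pos) / (w_hw + w_hp)

if __name__ == "__main__":
    print(lepor(0.8, 0.9, precision=0.75, recall=0.6))
    print(hlepor(0.8, 0.9, 0.66))
    print(nlepor(0.8, 0.9, [0.7, 0.5], [0.5, 0.5]))
```

The harmonic form penalizes a single weak factor more gently than the plain product does, which is one plausible reason to offer both variants.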
With multiple references:
Select the alignment that yields the minimum NPD score.
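A minimal sketch of this selection rule, assuming one candidate alignment per reference is already available (the alignment step itself is not shown) and that positions are normalized by sentence length. The helper names `npd` and `select_min_npd` are hypothetical.

```python
def npd(alignment, out_len, ref_len):
    """Mean normalized position difference over the output sentence."""
    total = sum(abs(o / out_len - r / ref_len) for o, r in alignment)
    return total / out_len

def select_min_npd(candidates, out_len):
    """Pick the candidate with the smallest NPD.

    candidates: list of (ref_len, alignment) tuples, one per reference,
    where alignment is a list of 1-based (output_pos, ref_pos) pairs.
    Returns (index_of_best_candidate, its_npd).
    """
    scores = [npd(alignment, out_len, ref_len) for ref_len, alignment in candidates]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

if __name__ == "__main__":
    candidates = [
        (3, [(1, 3), (2, 2), (3, 1)]),  # reversed word order
        (3, [(1, 1), (2, 2), (3, 3)]),  # identical word order
    ]
    print(select_min_npd(candidates, out_len=3))
```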
System-level correlation:
Spearman rank correlation coefficient:
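$$\rho_{XY} = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} \tag{12}$$
d_i: difference between the ranks of the i-th system under the two rankings; n: number of systems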
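$$X = \{x_1, \ldots, x_n\}, \quad Y = \{y_1, \ldots, y_n\} \tag{13}$$
Pearson correlation coefficient:
$$\rho_{XY} = \frac{\sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y)}{\sqrt{\sum_{i=1}^{n} (x_i - \mu_x)^2}\ \sqrt{\sum_{i=1}^{n} (y_i - \mu_y)^2}} \tag{14}$$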
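The Spearman rank correlation between metric scores and human judgments can be computed directly from the rank-difference formula. The sketch below is illustrative only: it assumes no tied scores (the closed form is exact only in that case), and the function names are hypothetical.

```python
def ranks(values):
    """Rank values from highest (rank 1) to lowest; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def spearman(x, y):
    """Spearman's rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

if __name__ == "__main__":
    metric_scores = [0.31, 0.28, 0.35, 0.22]   # toy system-level scores
    human_scores = [0.60, 0.62, 0.70, 0.40]    # toy human judgments
    print(spearman(metric_scores, human_scores))
```

For data with ties, a library routine that assigns average ranks is the safer choice.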
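System-level metrics:
$$LEPOR_A = \frac{1}{SentNum} \sum_{i=1}^{SentNum} LEPOR_i \tag{15}$$
$$LEPOR_B = LP \times NPosPenal \times Harmonic(\alpha R, \beta P) \tag{16}$$
$$Factor = \frac{1}{SentNum} \sum_{i=1}^{SentNum} Factor_i \tag{17}$$
$$hLEPOR_{final} = \frac{1}{w_{hw} + w_{hp}} \left(w_{hw}\, hLEPOR_{word} + w_{hp}\, hLEPOR_{POS}\right) \tag{18}$$
$$nLEPOR = LP \times NPosPenal \times \exp\left(\sum_{n=1}^{N} w_n \log HPR_n\right) \tag{19}$$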
My research interests:
Other publications:
A Description of Tunable Machine Translation Evaluation Systems in WMT13 Metrics Task
Aaron Li-Feng Han, Derek Wong, Lidia S. Chao, Yi Lu, Yervant Ho, Yiming Wang, Jiaji Zhou. Proceedings of the ACL 2013 Eighth Workshop on Statistical Machine Translation (ACL-WMT 2013), 8-9 August 2013, Sofia, Bulgaria.
Bayes rule:
$$p(c \mid x_1, x_2, \ldots, x_n) = \frac{p(x_1, x_2, \ldots, x_n \mid c)\, p(c)}{p(x_1, x_2, \ldots, x_n)}$$
Phrase Tagset Mapping for French and English Treebanks and Its
Application in Machine Translation Evaluation
Aaron Li-Feng Han, Derek F. Wong, Lidia S. Chao, Yervant Ho, Shuo
Li, Lynn Ling Zhu. In GSCL 2013. LNCS Vol. 8105, Volume Editors: Iryna
Gurevych, Chris Biemann and Torsten Zesch.
German Society for Computational Linguistics (oral presentation):
To facilitate future research in unsupervised induction of syntactic structures
We design French-English phrase tagset mapping
We propose a universal phrase tagset
Phrase tags extracted from French Treebank and English Penn Treebank
Explore the employment of the proposed mapping in unsupervised MT evaluation
information retrieval
question answering
Searching
text analysis
etc.
Q and A
Thanks for your attention!
12. Tillmann C., Stephan Vogel, Hermann Ney, Arkaitz Zubiaga, and Hassan Sawaf:
Accelerated DP Based Search For Statistical Translation. In Proceedings of the 5th
European Conference on Speech Communication and Technology (EUROSPEECH97)
(1997)
13. Papineni, K., Roukos, S., Ward, T. and Zhu, W. J.: BLEU: a method for automatic
evaluation of machine translation. In Proceedings of the (ACL 2002), pages 311-318,
Philadelphia, PA, USA (2002)
14. Doddington, G.: Automatic evaluation of machine translation quality using n-gram
co-occurrence statistics. In Proceedings of the second international conference
on Human Language Technology Research (HLT 2002), pages 138-145, San Diego,
California, USA (2002)
15. Turian, J. P., Shen, L. and Melamed, I. D.: Evaluation of machine translation
and its evaluation. In Proceedings of MT Summit IX, pages 386-393, New Orleans,
LA, USA (2003)
16. Banerjee, S. and Lavie, A.: Meteor: an automatic metric for MT evaluation with
improved correlation with human judgments. In Proceedings of the ACL Workshop on
Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization,
pages 65-72, Ann Arbor, Michigan, USA (2005)
17. Denkowski, M. and Lavie, A.: Meteor 1.3: Automatic metric for reliable optimization
and evaluation of machine translation systems. In Proceedings of (ACL-WMT),
pages 85-91, Edinburgh, Scotland, UK (2011)
18. Snover, M., Dorr, B., Schwartz, R., Micciulla, L. and Makhoul, J.: A study of
translation edit rate with targeted human annotation. In Proceedings of the Conference
of the Association for Machine Translation in the Americas (AMTA), pages
223-231, Boston, USA (2006)
19. Chen, B. and Kuhn, R.: Amber: A modified BLEU, enhanced ranking metric. In
Proceedings of (ACL-WMT), pages 71-77, Edinburgh, Scotland, UK (2011)
20. Bicici, E. and Yuret, D.: RegMT system for machine translation, system combination,
and evaluation. In Proceedings ACL-WMT, pages 323-329, Edinburgh,
Scotland, UK (2011)
21. Shawe-Taylor, J. and Cristianini, N.: Kernel Methods for Pattern Analysis. Cambridge
University Press (2004)
22. Wong, B. T-M and Kit, C.: Word choice and word position for automatic MT
evaluation. In Workshop: MetricsMATR of the Association for Machine Translation
in the Americas (AMTA), short paper, 3 pages, Waikiki, Hawai'i, USA (2008)
23. Isozaki, H., Hirao, T., Duh, K., Sudoh, K., and Tsukada, H.: Automatic evaluation
of translation quality for distant language pairs. In Proceedings of the 2010
Conference on (EMNLP), pages 944-952, Cambridge, MA (2010)
24. Talbot, D., Kazawa, H., Ichikawa, H., Katz-Brown, J., Seno, M. and Och, F.: A
Lightweight Evaluation Framework for Machine Translation Reordering. In Proceedings
of the Sixth (ACL-WMT), pages 12-21, Edinburgh, Scotland, UK (2011)
25. Song, X. and Cohn, T.: Regression and ranking based optimisation for sentence
level MT evaluation. In Proceedings of the (ACL-WMT), pages 123-129, Edinburgh,
Scotland, UK (2011)
26. Popovic, M.: Morphemes and POS tags for n-gram based evaluation metrics. In
Proceedings of (ACL-WMT), pages 104-107, Edinburgh, Scotland, UK (2011)
27. Popovic, M., Vilar, D., Avramidis, E. and Burchardt, A.: Evaluation without references:
IBM1 scores as evaluation metrics. In Proceedings of the (ACL-WMT),
pages 99-103, Edinburgh, Scotland, UK (2011)
28. Petrov S., Leon Barrett, Romain Thibaux, and Dan Klein: Learning accurate,
compact, and interpretable tree annotation. Proceedings of the 21st ACL, pages
433-440, Sydney, July (2006)
29. Callison-Burch, C., Koehn, P., Monz, C. and Zaidan, O. F.: Findings of the 2011
Workshop on Statistical Machine Translation. In Proceedings of (ACL-WMT), pages
22-64, Edinburgh, Scotland, UK (2011)
30. Callison-Burch, C., Koehn, P., Monz, C., Peterson, K., Przybocki, M. and Zaidan,
O. F.: Findings of the 2010 Joint Workshop on Statistical Machine Translation and
Metrics for Machine Translation. In Proceedings of (ACL-WMT), pages 17-53, Uppsala,
Sweden (2010)
31. Callison-Burch, C., Koehn, P., Monz, C. and Schroeder, J.: Findings of the 2009
Workshop on Statistical Machine Translation. In Proceedings of ACL-WMT, pages
1-28, Athens, Greece (2009)
32. Callison-Burch, C., Koehn, P., Monz, C. and Schroeder, J.: Further meta-evaluation
of machine translation. In Proceedings of (ACL-WMT), pages 70-106, Columbus,
Ohio, USA (2008)
33. Avramidis, E., Popovic, M., Vilar, D., Burchardt, A.: Evaluate with Confidence
Estimation: Machine ranking of translation outputs using grammatical features. In
Proceedings of the Sixth Workshop on Statistical Machine Translation, Association
for Computational Linguistics (ACL-WMT), pages 65-70, Edinburgh, Scotland, UK
(2011)