
Redefining Translation Quality: From User Data to Business Intelligence

TAUS Signature Editions

REDEFINING TRANSLATION QUALITY: FROM USER DATA TO BUSINESS INTELLIGENCE

Published by TAUS Signature Editions, Keizersgracht 74, 1015 CT Amsterdam, The Netherlands
Tel: +31 (0) 20 773 41 72
E-mail: memberservices@taus.net
www.taus.net

All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the author, except for the inclusion of brief quotations in a review.

Copyright 2016 by Attila Görög and TAUS Signature Editions

2
Redefining Translation Quality: From User Data to Business Intelligence

TABLE OF CONTENTS

Introduction

Translation Quality
  French Poetry and DQF
  Translation: A Nice-to-Have?
  Buy Translation Like You Book a Hotel Room!

Quality Evaluation
  Quality Evaluation on a Shoestring
  Translation Productivity Revisited

Business Intelligence
  Business Intelligence and Quality Evaluation Data
  Crowdshaping Translations

Further Reading and Reference

About
  Attila Görög
  TAUS

INTRODUCTION
Translation quality is one of the key concepts in the translation industry today. Measuring and tracking translation quality is essential for all players in the industry. More and more translation vendors offer different types and levels of quality, resulting in dynamic pricing. Translation buyers want to know whether their customized Machine Translation (MT) engine is improving, and they would like to compare different MT providers. Finally, translators need to set the threshold of TM/MT matches at optimal levels. These are just a few examples of where translation quality becomes central and increasingly tuned to user satisfaction.

The aim of this eBook is to redefine the way we look at translation quality and evaluate translations. We are not trying to achieve this with a collection of scientific papers or with heavy argumentation. What we offer is a selection of short articles in a reflective style on various aspects of the quality of translations. The first three articles are on the topic of Translation Quality. We will focus on past and new interpretations of the concept of translation quality. The uberization of the translation industry is near, and we'll investigate how approaches to translation quality reflect this new phenomenon.

The second topic, Quality Evaluation, includes two articles. I will explain how to save time and resources by applying new techniques and methods for translation quality evaluation. Also: how can we be fair to translators when measuring productivity? How do we take into account the quality of the available resources when comparing translators and post-editors? These are some of the questions we try to answer in this chapter.

Finally, we complete our journey from quality to intelligence with two more articles on the topic of Business Intelligence. We will consider evolving trends in translation technology and new methods of evaluation, and we even touch on hot topics like crowdshaping and big data.

Enjoy reading!

Attila Görög

TRANSLATION QUALITY
French Poetry and DQF

When I studied French at university in my homeland, Hungary, I remember attending a class on translating poetry. In this class we discussed the translation of a poem for two hours. Each time we worked on a different poem. Everyone prepared in advance and we would go line by line, word by word, reading out loud our versions of the poem.

Philipp Koehn once cleverly wrote: ten different translators will almost always produce ten different translations. And indeed, each one of us came up with a different masterpiece. Of course, not all of them were equally good. In some cases a translated line was semantically or stylistically incorrect, or it didn't fit into the context, didn't match the rhyme scheme or just sounded awful! These versions were ruled out without mercy by our teacher, Dr. Bárdos, or by some older, more experienced fellow students.

In the end, we created a pool of good, acceptable translations (two, three or four, depending on the difficulty of the poem). This is when quality evaluation comes into play: these different versions were all considered publishable quality. There were slight differences in word order, word choice, tone, style etc., but they were still good translations of the French poem. The ones outside the pool had serious errors, but the ones selected into the pool were all correct and well-written pieces.

What happened next was a vote. All of us, teacher included, had to vote for our favorite version to decide which one was the overall best translation of the poem. Our quest for the highest quality, through votes. This happened, quite inefficiently, around a table on a lazy afternoon somewhere in an old, historic building. We spent time and effort that we could have spent on real work. Well, I personally learnt a lot from it.

Since then, many things have changed. I moved to the Netherlands and, after a long career in linguistic research and the language industry, I became product manager of the TAUS Dynamic Quality Framework (DQF). DQF offers a set of tools that are made for a similar but slightly different exercise than the one above: to vote for different MT engines or human translations, to score translated segments, to count errors based on an error typology, and to measure post-editing productivity. DQF offers different tools for different purposes. It is made with the student, the developer, and the project manager in mind.

And the good news is that DQF is available for academics free of charge. It is perhaps the only workflow tool for quality evaluation that can be used in the translation classroom today.

Translation: A Nice-to-Have?

Today's fast-changing landscape of goods and services is filled with past must-haves that are evolving into nice-to-haves, or else becoming superfluous. At the other end of the spectrum, we find things we've never dreamed of that we desperately need today.

Technological innovation, coupled with an unstable economy, topped with globalization and increasing environmental awareness, awakens new needs in us and extinguishes old necessities. Pen and paper are becoming a nice-to-have; bookshelves too. In big cities, owning a car is being replaced by car-sharing concepts and app-based transportation pooling. Cash has made room for plastic money, but maybe not for long; mobile payment is going to take up more of that room. In a couple of months, I'm giving up preparing food on gas and will become the happy user of an induction cooker. I will actually say goodbye to gas altogether because I'm moving into a house with district heating.

Old must-haves are turning into nice-to-haves or no-need-to-haves. At the same time, in just a couple of years, Wi-Fi availability has become self-evident. Smartphones, tablets and smartwatches are fast becoming extensions of our bodies. We can't live without electricity anymore and we will die if we can't read our emails for a day.

Stuff and fluff (i.e. products and services) become indispensable or superfluous or just nice-to-haves. This new law of nature seems to influence our lives at an ever-accelerating tempo. The translation industry reflects the same trend. Translation used to be superfluous: tradesmen used a lingua franca. Then it became a must-have. If you launch your product globally, you should address your target audience in their mother tongue. You should translate your user guides, your website and your marketing content and, at the very least, you should subtitle your videos.

What is translation today? In a narrow sense:

"Translation transfers a written source text into a written target text of roughly equivalent length. Such a translation conveys all the source text's meaning, making only those adjustments necessary for cultural appropriateness without adding, omitting, condensing, or adapting anything else."

(Defining the Landscape of Translation, Melby, Fields, Hague, Koby & Lommel)

Is such a translation a must-have today? Sometimes. In a broader sense:

"Translation departs from or is inspired by source content in one language with the aim of providing fit-for-purpose, comprehensible content in a foreign language."

How do you define translation? Broadly or narrowly? To what extent are adequacy and fluency important to you?

In some areas or situations, translation (in the narrow sense) is evolving into a nice-to-have. An expensive nice-to-have, and sometimes a superfluous one, too. More and more companies are introducing preliminary content profiling in order to determine whether the source in question deserves a translation at all and, if yes, whether it should be in the narrow or the broader sense.

With the increase in crowdsourcing, post-editing, interactive and adaptive machine translation and learning management tools, new workflows and technologies are entering the market. Different quality requirements define different translation processes, giving each an equal right to exist. As Monica Guy cleverly articulates in her blog post:

"No type of translation is inherently better than another, but each is appropriate for a different and specific purpose. And all should work together in harmony to provide a powerful tool to reach target markets across the world."


In our age of hyper-globalization, translation in the narrow sense is becoming a nice-to-have. On the other hand, translation in the broader sense, encompassing transcreation, localization, gist translation, raw machine translation, summary/extract translation etc., is and always will be a must-have. Content profiling can help you select the right process, the right quality level and the right evaluation type. To learn more about content profiling, please watch this webinar or read the Dynamic Quality Framework report referenced at the end of this book.

Buy Translation Like You Book a Hotel Room!

In 2014, at the VViN conference (the Dutch version of ATA), I led two breakout sessions on translation quality. It was interesting to hear how Language Service Providers (LSPs) experience what I would call the quality paradox: most clients desire top quality but want to pay budget prices. Why is the translation industry so different from the well-known hospitality business, where you don't expect to get a cheap room in a 5-star hotel? And when a cheap room is offered through some campaign, you become suspicious. Some participants suggested that translation is a service and hotel ratings have to do with the features of an accommodation. I would say offering a hotel room is also a service. It's definitely not a product I'm buying when I'm staying at a hotel. When we evaluate a translation, don't we do that by focusing on some features of the text (style, terminology, accuracy, fluency, readability, usability etc.)? I personally don't see the difference.

The translation industry lacks an equivalent to the 5-star hotel rating. One of the main topics and aims of the 2014 Translation QE Summit in Vancouver was to try to come up with a similar system. A bold aim, isn't it? Unfortunately, it seemed too difficult a challenge for a break-out session of one hour. We want to reach this goal by first looking at the different processes and quality levels already offered in the translation and localization business. A first and very limited investigation showed that there are three camps today: one that openly promotes different quality levels and offers the choice to customers; one that offers the same but not openly (i.e. promoting premium quality on their websites but, if the customer wants it, also offering lower quality levels); and, finally, LSPs that are as yet unwilling to offer anything but premium quality.

The magic word that comes into play is benchmarking. No quality levels without comparison. No weights and no penalties either. One should have at least a vague idea of what makes premium quality a 5-star and what the minimum requirements are before we can call a translation a translation (i.e. 1-star). For the basic level of quality, I would say readability, usability and comprehensibility tests would be good yardsticks. First, define the purpose of the translation and, of course, your budget. For some purposes, 1-star quality is enough. For others, it's not. For the top level, obviously a top score based on a combination of adequacy, fluency and error-typology evaluations would be required. And this leads us to the following topic of the QE Summit in Vancouver: quality in the translation workflow. A different workflow will usually result in a different quality level: MT + light post-edit will rank below MT + full post-edit, which in turn will rank below human translation + review, and so on. Depending on the different steps, different results can be expected. I would be curious to know how quality control and evaluation are built into the subsequent steps of the translation workflow and how you train your personnel for evaluation.

Finally, what is the cost and benefit of quality evaluation? How much do you lose by delivering random quality? How do you calculate ROI? Let's face it: we rarely know the quality level of a translation we deliver. "But it has been made by my best translator and reviewed by my best proofreader!" Translators have bad days, reviewers too. We all do. Without evaluating the product, at least by using samples, you can't be sure you are delivering the required quality.

QUALITY EVALUATION
Quality Evaluation on a Shoestring

Quality is a hot topic today for all players in the translation industry: translation buyers want different types of quality and flexible ways of pricing; LSPs would like to know whether their customized MT solution is improving; and translators are keen on setting the threshold of fuzzy matches/MT suggestions at the optimal level. These are just a few examples where quality evaluation plays a crucial role.

Unfortunately, there's no such thing as a free lunch! Quality Evaluation (QE) can save money but it also costs money. Assessing the quality of a translation can sometimes cost you even more than producing the translation itself! Nonetheless, continuous monitoring of translation quality and sharing evaluation data are indispensable practices for developing metrics for automated QE. Without that, no advances will occur in the translation industry. As Maxim Khalilov of Booking.com very cleverly mentioned in one of TAUS's translation quality webinars:

"Improving automated metrics for QE will also improve the quality of existing MT solutions."

Better QE means better tools.

Obviously, a hassle-free improvement of MT output is a fairy tale. In order to develop and improve, we need to measure quality constantly. But how can we achieve that when the budgets and resources set aside for this purpose are so tight? How do we become efficient in QE?

Let's talk efficiency!

"Efficiency, in general, describes the extent to which time, effort or cost is well used for the intended task or purpose." (Wikipedia)

The main theme of the 2014 QE Summit held in Dublin was efficiency: how to save time and resources by applying new techniques and methods for translation QE. Speakers at the event elaborated on five topics, which are also five proposed ways of saving budget when it comes to QE. The outcomes of the break-out sessions and the presentations were bundled into the following five best practices:
Quality Estimation
Community Evaluation
Readability
Usability Evaluation
Sampling

Translation Productivity Revisited

Once upon a time in the Land of Translations...

... we wanted to know how many words we could produce per month, per day, per hour. How much time we needed to post-edit machine-translated segments. And we wanted to track the edit distance. Why on Earth?! Well, to find ways to profile translators and post-editors, to set prices, compare vendors, categorize content, evaluate MT engine performance... the list is endless. But are we doing it right?

Productivity tells you how fast a translation was completed. Due to many variables, however, it will never be a reliable measurement when it comes to profiling post-editors and translators, comparing vendors or evaluating MT output. And by the way, will it ever give us valid insights into the quality and difficulty of the specific content we receive from our customers?

Productivity defined
According to Wikipedia:
Productivity is an average measure of the efficiency of produc-
tion. It can be expressed as the ratio of output to inputs used in
the production process, i.e. output per unit of input.

This formula works well when all the variables on the input and output side are listed, well defined and measured consistently. Problems arise when only a limited number of variables are taken into account. Unfortunately, that's exactly what is happening in the translation industry today: we take time as the only input, and words as the only output. As a result, the more words produced in a shorter amount of time, the higher the productivity. This is just too simplistic, if you ask me. I'm wondering how our industry has gotten away with it for so long!

There is much more to productivity than the number of words per hour. Why not also take into account the number of (final) edits per hour? This means calculating one unique score based on the total number of words translated by the translator in an hour, combined with the number of final edits made in the whole process of producing the translation (and calculated from the character-based edit distance). This gives a more reliable productivity score. It's easier to translate fast when the translation memory gives many exact matches or context matches, and when the MT engine is in top shape and we hardly need to translate anything from scratch, as opposed to the situation where there are no available resources, or only ones of very poor quality.

For this reason, it is a good step forward to include the number of edits per hour in the productivity score (and I will talk about this later), but one should also take into account the following variables:

- Difficulty of the source content (using some measurement independent of language)
- Quality of the source content (based on human assessment by the translator or the reviewer)
- Available resources (also called the translation process): whether the translator did or didn't use an MT engine, a translation memory, a glossary etc.
- Quality of these resources (using fuzzy match and MT confidence information combined with edit distance)
- Number of corrections applied by the reviewer(s)
- Number of errors, weights and penalties applied by the reviewer(s) in the review cycle(s)

Now, I don't say this is all easy to measure, keep track of, or aggregate in one single score. But still, let's try and see what happens!

TAUS Efficiency Score


That is also what one of the developers (Nikos Argyropoulos) thought when he came up with a new metric to measure productivity called the TAUS Efficiency Score. This score replaces traditional productivity measurement because it can be applied to every form of translation: translation from scratch, translation with translation memory, PEMT or a mix of these three. More and more translation jobs have a mixed nature: one can post-edit MT suggestions, insert TM matches or translate segments from scratch in the very same translation job. There is no hard divide anymore between MT, TM and human translation. This should be reflected in a new metric measuring productivity.

In the TAUS Efficiency Score, time is measured for producing (and, if needed, updating) each segment regardless of the segment origin (MT, PE, glossary, scratch, etc.). The Efficiency Score is flexible in that the number of variables used to calculate it, and the ways the different measurements are taken into account, vary based on user requirements and available data. The score is also relative because it is calculated based on the data present in the underlying database at the moment of calculation.

Variables
The variables involved in producing the Efficiency Score are, in the first place, the two obligatory variables (core variables), plus any additional variables (optional variables) added to the calculation to increase precision and credibility. The score can be calculated to measure translator efficiency, but the focus can also be on CAT/TMS efficiency or MT engine efficiency. While edit distance and edits per hour are calculated in many translation tools, these measurements tend to be applied only to evaluate MT engines, and less so to evaluate post-editing productivity. This is simply because no one has come up with a method that combines a productivity score with edit distance information and normalizes the score in a dynamic way. This is exactly what the TAUS Efficiency Score does when it is based on the core variables.

In order to unify the two measurements (processed words per hour and final edits per hour, the latter based on edit distance), one needs to convert relative scores into absolute scores. The Efficiency Score is calculated on an ongoing basis using data from the DQF database, which is fed with data from real-life projects. The score is displayed in the TAUS Quality Dashboard. The more data, and the more homogeneous the data, used to calculate the score, the more precise and meaningful that score will be.

The Efficiency Score is not yet implemented in the TAUS Quality Dashboard. If you want to read more about the Quality Dashboard, please click here.

Use case
The Efficiency Score based on core variables is calculated using the following data:

1. The number of words that a translator processed. (Note: each time a translator returns to a segment, the extra time will be added to that segment.)
2. The edit distance, calculated using the Wagner & Fischer algorithm after the translation process.
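
The Wagner & Fischer edit distance in step 2 is standard dynamic programming over characters; a minimal sketch (the function name and the two-row memory optimization are illustrative choices, not prescribed by DQF):

```python
def edit_distance(source: str, target: str) -> int:
    """Character-based edit distance (Wagner & Fischer / Levenshtein):
    the minimum number of insertions, deletions and substitutions
    needed to turn `source` into `target`."""
    prev = list(range(len(target) + 1))  # distances from "" to each target prefix
    for i, s_ch in enumerate(source, 1):
        curr = [i]  # distance from source[:i] to ""
        for j, t_ch in enumerate(target, 1):
            cost = 0 if s_ch == t_ch else 1
            curr.append(min(prev[j] + 1,          # delete s_ch
                            curr[j - 1] + 1,      # insert t_ch
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]

print(edit_distance("kitten", "sitting"))  # 3
```

Aggregated over all segments of a job and divided by the editing time, this distance yields the edits-per-hour figure used in the example below.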

In the example below, four translators have been involved in similar translation projects. The table offers information on the actual number of words processed, the actual time spent, the speed expressed in words per hour, the aggregated edit distance based on all segments, and this figure normalized to the number of edits per hour.

Name          Number of Words  Time (seconds)  WPH   Edit Distance  Edits per Hour
Translator 1  100              120             3000  50             1500
Translator 2  150              140             3857  80             2057
Translator 3  80               120             2400  70             2100
Translator 4  120              130             3323  30             831

Table 1: Translator data - Example

The normalization of all the variables is calculated using Min-Max normalization (x' = (x - min) / (max - min)) because it is simple, it preserves all relationships in the data exactly, and it provides an easy way to compare values that are measured on different scales.

Using Min-Max normalization, the following scores are obtained.

With these results, it becomes clear where each translator sits in the distribution for the words-per-hour and edit-distance measurements, and the difference between them can be seen on a scale of [0.0, 1.0]. Both measurements have an equal share in assessing translators.

Based on the normalized scores above, the Efficiency Score can be calculated: the sum of the two normalized scores divided by 2.

Name          WPH   WPH Min-Max nor.  Edit Distance  Edits per Hour  Edit Distance Min-Max nor.
Translator 1  3000  0.411             50             1500            0.4
Translator 2  3857  1.0               80             2057            1.0
Translator 3  2400  0.0               70             2100            0.8
Translator 4  3323  0.633             30             831             0.0

Table 2: Translators - Normalized scores

Name          WPH normal.  Edit Distance normal.  Sum    Efficiency Score
Translator 1  0.411        0.4                    0.811  0.405
Translator 2  1.0          1.0                    2.0    1.0
Translator 3  0.0          0.8                    0.8    0.4
Translator 4  0.633        0.0                    0.633  0.316

Table 3: Composite indicator for producing the Efficiency Score
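
The whole core-variable calculation can be sketched in a few lines (variable names are mine; note that, following the worked numbers in Tables 2 and 3, it is the raw edit-distance column that gets Min-Max normalized):

```python
def min_max(values):
    """Min-Max normalization: map each value onto the [0.0, 1.0] scale."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# (words processed, time in seconds, edit distance) per translator, as in Table 1
data = [(100, 120, 50), (150, 140, 80), (80, 120, 70), (120, 130, 30)]

wph  = [round(w * 3600 / s) for w, s, _ in data]  # words per hour
eph  = [round(e * 3600 / s) for _, s, e in data]  # edits per hour
dist = [e for _, _, e in data]                     # raw edit distance

wph_norm  = min_max(wph)   # approx. [0.412, 1.0, 0.0, 0.633]
dist_norm = min_max(dist)  # [0.4, 1.0, 0.8, 0.0]

# Efficiency Score: the two normalized scores summed and divided by 2
efficiency = [(a + b) / 2 for a, b in zip(wph_norm, dist_norm)]
```

This yields approximately [0.406, 1.0, 0.4, 0.317]; Table 3 shows 0.405 and 0.316 because its normalized inputs were rounded to three decimals before averaging.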

Summary
For the Efficiency Score based on the core variables, we measure the time for processing segments while tracking the segment origin. Next, we measure the edit distance, calculate the edit distance per segment (the minimum number of edits needed to get from A to B) and produce the number of edits per hour. Finally, we normalize and unify the two measurements. For more precision and credibility, we can base the calculation of the score on additional (optional) variables.

There are a number of reasons for developing a composite indicator for productivity based on the words-per-hour measurement and the edit-distance scores:

1. It can offer a rounded assessment of performance.
2. It presents the big picture and can be understood more easily than trying to find an answer in two (or more) separate measurements.
3. It can help with the implementation of better analytical methods and better-quality data.

The two data points are used to generate a numerical score that shows the efficiency of the translator among other translators who worked on similar projects (technology, process and content). As I mentioned earlier, you can also use the score to compare technologies, processes etc. Before calculating the Efficiency Score, the data needs to be preprocessed and transformed to fall within a smaller, common range for all the metrics, such as [0.0, 1.0]. This way we give all data points equal weight.

Future work will involve adding the Efficiency Score to the TAUS
Quality Dashboard. Initially this score will be calculated based
on the core variables. In a later phase, the possibility of adding
quality and content difficulty scores is envisioned.

Let's see whether this will reform the way we look at translation productivity and determine our prices. In any case, one thing is for sure: the traditional way of measuring productivity is dead.

BUSINESS INTELLIGENCE
Business Intelligence and Quality Evaluation Data
Business Intelligence (BI) in the translation industry is about engineering an environment of answers by selecting, collecting and interpreting data derived at various stages of the translation process. In this webinar, Tom Shaw (Capita) explained how quality evaluation data from even a small sample can predict ROI and support business decisions when this type of data is recorded and interpreted correctly. Business Intelligence is starting to catch on in the translation industry, and with good reason: using smart ways to transform data of any type into actionable information yields business benefits and helps stakeholders make informed decisions.

Once the KPIs of a business or project are defined and ways to measure them are set, data collection and monitoring can begin. Today, high-quality data is abundant and we have many ways to harvest it. What's very often missing is the extra step of analyzing and interpreting this data. One example from our industry is the collection of post-editing productivity data, which is produced in ever-increasing volumes but is not being linked to post-editor profiles, quality levels or content difficulty levels and, as a result, doesn't find its way into the pricing cycle. One could use this type of data in a dynamic way to adjust prices. Unfortunately, dynamic pricing is still a neglected concept in the translation industry and would deserve an article of its own.

Let's take a look at some common areas where translation vendors and buyers can benefit from collecting and analyzing data. Zooming in on one of the components of the well-known translation pyramid (quality-speed-cost), we can track translation quality on an ongoing basis by collecting benchmarking data available in different stages of the process: from content authoring through actual translation or post-editing to publishing and post-publishing.

We can measure source-text quality using readability measures; we can evaluate the quality of MT output applying relative or absolute metrics; we can categorize post-editors based on productivity tests that involve time measurements and edit distance data. Once this data is linked to user profiles, types of CAT tools, content profiles, industry domain etc., it can offer powerful information to vendors and buyers of translation services. All this data helps to spot trends, improve efficiency, adjust translation workflows, enhance tools (CAT, MT, etc.) and select, discard or update resources (both TMs and human resources). A core capability of any BI environment is to isolate granular subsets of data. The key is the ability to conduct multidimensional analysis quickly, to test theories and to identify trends.

BI data also enables us to make adjustments and introduce more efficiency into our workflow processes. One example is when low post-editing productivity is attributed to poor MT output when actually it is the result of wrong choices made elsewhere in the process: missing guidelines, untrained post-editors, or environments that are unsuitable or inappropriately set up for the job.

While evaluation scores tracked internally can be of enormous value for safeguarding efficiency, they might become meaningless without absolute values to compare them with. Is a 30% productivity increase for English-Japanese statistical machine translation (SMT) good or bad? One of the main problems in the translation industry today is the lack of benchmarking. Translations cannot be compared to industry averages or standards because these are not yet available. At the same time, buyers of translation services are increasingly interested in translated content of different quality levels and different pricing models. They want to save resources on some content and invest more in others. As a result, several vendors today offer services and products tailored to various needs. But how can these customers specify the quality level they need? And how do vendors make sure the right quality is delivered?

The general consensus is that you shouldn't wait for your business to have big enough data or perfect data before tracking BI and benchmarking. Because of the elusive nature of data, you will never have enough of it and it will never be perfect, despite your best efforts. What's more, the reports and analytics that BI provides actually help expose the faults in your data. That being said, it's still important to understand that really good, actionable business intelligence in the translation industry depends on complete and accurate data. This is the old "garbage in, garbage out" axiom, and it's as true now as it ever was.

Here are some tips on how to get started with BI in your company.

1. Make sure you use the right tools and record the right data for each and every project in your translation workflow.
2. Once you have decided on collecting data and doing data analysis, connect the results to a limited number of KPIs. You can't possibly follow and interpret dozens of data points; it will get too complicated down the line, so keep it simple.
3. KPIs have to be linked to what you are trying to achieve. KPIs set and data measured in isolation from the bigger business picture, just for the sake of measuring, are meaningless. What are your objectives?


4. While it's important to track results and act on the new insights resulting from the data, the focus should remain on the business, not on improving scores.
5. Finally, it's important to train your team and explain in advance how results will be and should be interpreted. The maturity of the team around a data-centric approach is indispensable.

Clearly, there are myriad ways in which BI can support your translation and localization business. As BI becomes more commonplace in our industry, no doubt we will see more and more vendors and buyers create different use cases for it. TAUS is committed to sharing new insights, benchmarks and best practices on Machine Translation, Post-Editing, Data and Evaluation to help the industry move forward. Are you interested in learning more? Please visit the different TAUS shared services pages.


Crowdshaping Translations

Is your business...

- improving translation quality without spending a fortune on manual assessment?
- providing an awesome customer experience within the limits of your budget?
- monitoring translator performance?

Crowdshaping might be of help.

Crowdshaping versus crowdsourcing

Crowdshaping is a recent successor to crowdsourcing and is increasingly deployed in various settings: at dance events, in retail stores, in the construction of road systems and in football stadia. It has much in common with crowdsourcing, but differs from it in one important aspect: user participation. While crowdsourcing refers to people intentionally and actively sharing their opinions, preferences or ideas, crowdshaping is relatively passive, generally using technology that detects people's preferences and interests based on their actions.

Crowdsourcing is vulnerable to reporting errors, particularly when it comes to certain kinds of input: what people say about their preferences, feelings and future behavior doesn't always align with what they do. Crowdshaping overcomes this limitation by not asking how people would act or what they think, but by recording what people actually do, tapping input from them indirectly.

Of course, you need to own such technology and you also have to know how to interpret the harvested data and put it to work. That can be a drawback, as crowdshaping technology is still evolving and best practices are missing in various industries. You might also think: "Wait a minute. What about my privacy?" But is data privacy really an issue today? Evidence suggests that consumers are growing accustomed to a world in which data is a shared resource.


A new IBM study found that consumers are willing to share their personal information with retailers, particularly if they get good value in exchange. The percentage of consumers willing to share their current location via GPS nearly doubled year-over-year, to 36 percent; 38 percent of consumers would provide their mobile number for the purpose of receiving text messages, and 32 percent would share their social handles with retailers.

Adjusting party beats with biometric wristbands

One of my favourite examples of crowdshaping in action is Lightwave's biometric wristbands. These gadgets have been used at events where DJs exploited the real-time data to adjust their music selection. The wristbands have four sensors: an accelerometer to measure the wearer's movement; a microphone to detect decibel levels; a gauge to measure both body and room temperature; and a sensor that detects physiological and psychological arousal through the skin's sweat response. When the temperature of the crowd reached a set point, for example, the crowd unlocked a round of drinks. Leaderboards rated individual dancers for energy and, during a boys-versus-girls dance-off, both teams competed to see who could dance the most energetically.

Reshaping customer experience with in-store technology

Does this sound far-fetched? The possibilities of tracking user data to improve user experience are endless. Retailers, for instance, will soon start to employ in-store technology that can collect and use data to reshape the experience served to shoppers.


Crowdshaped advertising on digital screens. Crowdshaped in-store music. In fact, you could soon crowdshape the music at your next party if a prototype called Chêne (a small, smart jukebox made by design firm Clearleft) is launched. Chêne uses technology to pull music playlists from your smartphone and then plays a selection that aggregates your preferences.

Crowdshaping in the translation industry

While user data is omnipresent in the translation industry as well, it is not aggregated or used in a smart way. One example of where the translation industry could deploy crowdshaping is the real-time adjustment of translation quality based on visitors' preferences for published content. Machine-translated website content could receive a better-quality translation (light or full post-edit, crowdsourced translation, etc.) based on information such as page views, bounce rates, engagement and most popular pages, in order to offer a better user experience.
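A routing rule of this kind could be sketched as follows. The metric names and thresholds here are hypothetical and would need to be tuned against real analytics data per site and per language:

```python
def quality_tier(pageviews, bounce_rate):
    """Map a page's engagement signals to a translation quality tier.

    Thresholds are illustrative, not recommendations.
    """
    if pageviews > 10_000 and bounce_rate < 0.4:
        return "full post-edit"   # high-traffic, engaged pages earn human attention
    if pageviews > 1_000:
        return "light post-edit"  # moderately visited pages get a quick pass
    return "raw MT"               # long-tail pages stay machine-translated

# Route a batch of (hypothetical) pages to tiers:
pages = {"/pricing": (25_000, 0.30), "/faq": (4_000, 0.55), "/legacy": (120, 0.80)}
tiers = {path: quality_tier(*stats) for path, stats in pages.items()}
```

Run periodically against live analytics, such a rule would keep promoting the pages visitors actually engage with, which is exactly the passive, behavior-driven loop crowdshaping describes.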

Evaluating quality using indirect methods is an increasingly dominant topic in the industry today. How can we harvest user data to offer the right level of quality? How can we measure usability and user behavior without spending much effort on manual assessment? Here are some examples from the translation industry where crowdshaping could be, or already is, effectively applied:

- Tracking user behavior: evaluating the quality of translated content on multilingual sites by comparing user data (pageviews, geo-location, sessions, entrances, clicks, purchases and transactions) across the different languages.
- User-adaptive MT: improving the quality of an MT engine by constantly feeding post-editing data or user feedback back into the system.
- Interactive MT: the computer software assisting the human translator attempts to predict the text the user is going to enter, taking into account all the available information. Whenever a prediction is wrong and the user provides feedback to the system, a new prediction is made considering the new information. This process is repeated until the translation matches the user's expectations.
- Technology evaluation: evaluating translation technology, including CAT tools and MT engines, by tracking productivity data in real time.
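The interactive-MT cycle described above can be sketched as a loop. The "engine" below is a deliberately flawed stand-in that completes a fixed sentence; it exists purely to show the accept/correct/re-predict mechanics, not to translate anything:

```python
TARGET = ["the", "cat", "sat", "on", "the", "mat"]

def noisy_engine(prefix):
    """Stand-in MT engine: completes the sentence from a prefix,
    but keeps mistranslating one word until the user corrects it."""
    out = list(prefix) + TARGET[len(prefix):]
    if len(prefix) <= 2:
        out[2] = "sits"  # deliberate error to trigger user feedback
    return out

def interactive_translate(predict, wanted):
    """Accept matching tokens, correct the first mismatch, re-predict."""
    prefix, rounds = [], 0
    while prefix != wanted:
        suggestion = predict(prefix)
        i = len(prefix)
        while i < len(wanted) and i < len(suggestion) and suggestion[i] == wanted[i]:
            i += 1  # user accepts each correct token
        # User keeps the accepted tokens and types the next correct one (if any):
        prefix = wanted[:min(i + 1, len(wanted))]
        rounds += 1
    return prefix, rounds

translation, engine_calls = interactive_translate(noisy_engine, TARGET)
```

In a production interactive-MT setting the predictor would be a real decoder constrained by the accepted prefix, but the loop structure (predict, accept, correct, re-predict) stays the same.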

These are just a couple of examples, but I'm sure there are more out there and even more to come.

The TAUS Quality Dashboard

There are multiple ways to tap into translation data and user information. Just like other crowdshaping technologies, the TAUS Quality Dashboard will bring many benefits to users who are willing to share their translation data. The Dashboard is also the first vendor-independent application that supports the benchmarking and analysis of translation efficiency on a large scale, and it all happens without additional effort on your side.

The TAUS Quality Dashboard will tell users:

- which translator to choose for a certain job;
- what the origin is of translated segments;
- how efficient a vendor or a technology is in a given project;
- how to adjust pricing based on performance;
- how the quality delivered by a vendor compares to the industry average;

and the list can become almost endless.

The data is there, and yet we make so little use of it. The good news is: all you want to know will soon be available at your fingertips through an intelligent solution for aggregating data, the TAUS Quality Dashboard.

So, where will your first journey take you in a crowdshaped universe? Where will you encounter crowdshaping for the first time? At a dance event, in a hospital, in a football stadium or in the translation marketplace? With different translation quality levels offered by different vendors on the market, crowdshaping is already reshaping the way we approach translation quality.


FURTHER
READING &
REFERENCE
This list of reading material is far from complete. It can serve as a first step towards understanding the main topics in translation quality evaluation.

Attila Görög & Pilar Sanchez-Gijón (eds.): Revista Tradumàtica special issue on Translation and Quality, No. 12 (2014)
http://revistes.uab.cat/tradumatica/issue/view/5

Attila Görög: TAUS Dynamic Quality Framework Report 2015
https://www.taus.net/think-tank/reports/evaluate-reports/dynamic-quality-framework-report-2015

Lena Marg, Sharon O'Brien, Attila Görög, Miguel Gonzalez: TAUS Best Practices on Community Evaluation
https://evaluation.taus.net/resources-c/guidelines-c/community-evaluation-best-practices

Luigi Muzii: Quality Assessment and Economic Sustainability of Translation
http://www.openstarts.units.it/dspace/bitstream/10077/2891/1/ritt9_05muzii.pdf

Sharon O'Brien, Rahzeb Choudhury, Jaap van der Meer, Nora Aranberri Monasterio: TAUS Dynamic Quality Evaluation Framework: TAUS Labs report
https://www.taus.net/reports/translation-quality-evaluation-is-catching-up-with-the-times

Sharon O'Brien: Towards a Dynamic Quality Evaluation Model for Translation
http://www.jostrans.org/issue17/art_obrien.pdf

Sharon O'Brien: Translation Quality - It's time that we agree
https://taus.net/taus-ilf-dublin-3-june-2014-translation-quality-it-s-time-that-we-agree

TAUS Dynamic Quality Framework Report

TAUS Best Practices: Adequacy & Fluency


ABOUT
Attila Görög

Attila Görög is Director of Enterprise Member Services at TAUS. He previously held a position in the product development team for the TAUS Dynamic Quality Framework. Currently he is the key account holder for enterprise member companies. Attila has been involved in various national and international projects on language technology for more than ten years. He has a solid background in quality evaluation, post-editing and terminology management. One of his key tasks is to encourage and facilitate the discussion around translation quality by organizing international workshops and conferences on the topic; participants include vendors and buyers of translation services, governmental organizations and academic institutions. Attila is also the host of the annual QE Summits, which have translation technology and quality evaluation as their main focus. He has published various articles on quality evaluation, terminology management and computational linguistics in academic and industry journals.


TAUS

TAUS is a resource center for the global language and translation industries. Founded in 2004, TAUS provides insights, tools, metrics, benchmarking, data and knowledge for the translation industry through its Academy, Data Cloud and Quality Dashboard.

Working with partners and representatives globally, TAUS supports all translation operators (translation buyers, language service providers, individual translators and government agencies) with a comprehensive suite of online services, software and knowledge that helps them grow and innovate their business. Through sharing translation data and quality evaluation metrics, promoting innovation and encouraging positive change, TAUS extends the reach and growth of the translation industry.

To find out how we translate our mission into services, please write to memberservices@taus.net to schedule an introductory call.


TAUS Signature Editions

