21/06/2022

Does machine translation reinforce gender bias?

Although a machine learning model can be a powerful tool in the translation space, it can only be as good as the data it learns from. If there is a systematic error in the data used to train a machine learning algorithm, the resulting model will reflect this. These errors are the main reason that gender bias is present in machine translation. Some aspects of this are out of the control of the machine translation engine creators, but some others aren’t. Let’s examine how machine translation reinforces gender bias and how it can be fixed.

How Errors Can Occur

Wikipedia serves as a good example of how machine translation errors can occur and reinforce gender bias. Wikipedia’s entries tend to be geographically diverse, lengthy, and refer to subjects in the third person, which leads to the use of a lot of pronouns. Because of this, Wikipedia entries (particularly biographies) often have potential to cause machine translation errors related to gender, especially if an article refers to a person explicitly early in a sentence, but not later on. 

How Errors Can Be Resolved

Let’s look at Google as an example of a company aiming to resolve machine translation mistakes regarding gender. Google acknowledges that its translation tools struggle with errors that reinforce gender bias. The company believes it needs to advance translation techniques beyond single sentences. Doing this requires setting new metrics for measuring progress and creating datasets that contain the most commonly encountered context-related errors. It’s a significant challenge. Translation errors related to gender are especially sensitive, as they can refer to someone incorrectly and misrepresent how they self-identify.

Google is working towards long-term improvements to its machine learning systems so it can continuously improve how they translate pronouns and gender.
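To make the idea of a metric for gender-related errors more concrete, here is a minimal sketch of what such a measurement could look like. It is not Google’s method. It assumes a hypothetical test set of English translations, each paired with the gender the subject is known to have, and it only counts the passages that contain no pronoun of the opposite gender. The function name, the pronoun lists and the sample data are all illustrative assumptions.

```python
import re

# Assumed pronoun lists for English output; a real metric would need far more care.
PRONOUNS = {
    "female": {"she", "her", "hers", "herself"},
    "male": {"he", "him", "his", "himself"},
}
OPPOSITE = {"female": "male", "male": "female"}


def gender_accuracy(samples):
    """Return the share of passages with no pronoun of the opposite gender.

    samples: list of (translated_text, expected_gender) pairs, where
    expected_gender is "female" or "male".
    """
    if not samples:
        return 0.0
    correct = 0
    for text, expected in samples:
        words = set(re.findall(r"[a-z]+", text.lower()))
        wrong_pronouns = PRONOUNS[OPPOSITE[expected]]
        if not words & wrong_pronouns:
            correct += 1
    return correct / len(samples)


# Hypothetical MT outputs for a biography whose subject is a woman.
samples = [
    ("Marie Curie was a physicist. She won two Nobel Prizes.", "female"),
    ("Marie Curie was a physicist. He won two Nobel Prizes.", "female"),
]
print(gender_accuracy(samples))
```

This is deliberately crude: it would wrongly penalize a biography that mentions a second person of another gender, and it ignores non-binary pronouns, which is exactly why context-aware datasets and human review matter.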

The Takeaway

In recent years there’s been more awareness that these biases exist, and machine translation engineers are trying their best to resolve the issue quickly, but it’s no easy endeavor since gender works so differently from one language to another. Even though many advancements have been made in the machine translation industry, work still needs to be done. In reality, a human translator is much better equipped to handle sensitive issues such as gender.

It has taken many years to improve machine translation quality, and further improvements will take more time. However, this issue can’t wait that long to be addressed. Errors that reinforce gender bias are critical to work on now, considering the relevance gender inclusivity has gained in recent years. If a company wants to prioritize inclusive language, it’s not safe to use an automated solution. Gender is a sensitive topic; with a translation, you want to ensure your message is conveyed with care. Right now, human translators are researching and staying up to date with the latest trends in the languages they work with, which is necessary because usage is changing so fast. Companies should turn to these professionals to ensure their brand is not hurt by a careless machine translation mistake.

26/01/2021

Post-editing Highlights: What to Correct

The implementation of artificial intelligence provides new resources and possibilities to the localization industry. As a result, the translation workflows change. Because of that, language professionals perform additional tasks apart from translation or editing, such as pre-editing, post-editing or Machine Translation (MT) evaluation.

Post-editing means reviewing an MT output in order to improve it and to obtain a semantically and syntactically accurate target text. This service is a specialized task that requires a specific set of skills, expertise and competencies.

Trained post-editors are aware of the most common mistakes MT makes and quickly implement the changes needed. Let’s analyze some of the most common errors addressed in the post-editing stage.

Mistranslations and omissions

Whether a document or project needs deep or light post-editing, there are mistakes that post-editors always correct in the post-editing stage. They scan the output text for omitted or added words, phrases or segments. Additionally, they correct mistranslations and semantic and syntactic errors by applying quick and short changes. Correcting numerical and tag mismatches between source and target text is also a must during post-editing.

Furthermore, if specified for a project, reviewers evaluate if the output complies with stylistic guidelines and correct it accordingly.

With all these basic improvements, post-editing ensures that the target text is accurately translated and properly formatted.
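The check for numerical and tag mismatches mentioned above is one of the few post-editing steps that can be partly automated. The following is a minimal sketch, assuming inline tags that look like `<b>` or `</b>` and numbers written with digits; the function name `find_mismatches` and the sample segments are hypothetical, and real CAT tools have their own QA checks for this.

```python
import re
from collections import Counter

# Assumed patterns: inline tags such as <b> or </b>, and plain runs of digits.
TAG_RE = re.compile(r"</?[A-Za-z][^>]*>")
NUM_RE = re.compile(r"\d+")


def find_mismatches(source: str, target: str) -> dict:
    """Return the tags and digit runs whose counts differ between source and target."""
    report = {}
    for name, pattern in (("tags", TAG_RE), ("numbers", NUM_RE)):
        src = Counter(pattern.findall(source))
        tgt = Counter(pattern.findall(target))
        missing = sorted((src - tgt).elements())  # in the source, not in the target
        extra = sorted((tgt - src).elements())    # in the target, not in the source
        if missing or extra:
            report[name] = {"missing": missing, "extra": extra}
    return report


# Hypothetical segment pair with a dropped closing tag and a transposed number.
source = "Take <b>2</b> tablets every 12 hours."
target = "Tome <b>2 comprimidos cada 21 horas."
print(find_mismatches(source, target))
```

Comparing runs of digits rather than whole numbers is a deliberate choice: it tolerates locale differences such as 1,000 versus 1.000, although it will not catch a number that was correctly spelled out in words.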

Limits of AI

Mistranslations or omissions are common errors that can be found even in human translation. But other mistakes are related to the capabilities of the artificial intelligence engine. Some of them are the following:

  • Post-editors spot errors in the output that can be due to a spelling error in the source text. When the misspelled word or figure happens to be a valid word or number, the engine translates it, but the target text will convey the wrong meaning. Because vendors master specific domains, they are able to spot those errors.
  • If there are acronym preferences specified, post-editors will ensure they are properly translated into the target text. This is because the MT engine might accurately translate well-known acronyms (e.g., WHO>OMS), but unfamiliar ones can be left untranslated. Also, there might be inconsistencies in how they are translated or explained in the target text.
  • Depending on the engine (whether it is, for instance, example-based, rule-based or neural), some types tend to mirror the letter case of words. Post-editors correct any capitalization mistake caused by differences in the capitalization rules between source and target language.
  • Some projects specify that certain terms or phrases must be left untranslated, for example, web page code, proper names or institution names. While reviewing the output, the post-editor ensures the target text complies with that requirement.
  • Sometimes, the MT engine misreads punctuation by interpreting it wrongly or mirroring the source text’s punctuation. Post-editors must be aware of the most common punctuation mistakes (for instance, mistranslation of the long dash and colon in the English-to-Spanish language pair) and correct them accordingly.
  • The MT output can be grammatically and syntactically correct, but still not comply with, for example, the character limit specified for a project. Post-editors will bear in mind the specific requirements and apply the appropriate changes.
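Two of the requirements in the list above, character limits and terms that must stay untranslated, lend themselves to a simple automated pre-check before the post-editor’s review. Here is a minimal sketch; the function name `check_requirements`, the 20-character limit and the product name “Terra Dashboard” are hypothetical examples, not part of any real project specification.

```python
def check_requirements(source, target, max_chars, keep_terms):
    """Return a list of human-readable problems found in one target segment."""
    problems = []
    # Character limit: common for UI strings, subtitles and similar content.
    if len(target) > max_chars:
        problems.append(f"too long: {len(target)} > {max_chars} characters")
    # Protected terms: anything present in the source must survive unchanged.
    for term in keep_terms:
        if term in source and term not in target:
            problems.append(f"protected term missing: {term}")
    return problems


# Hypothetical UI string with a product name that must not be translated.
problems = check_requirements(
    source="Open Terra Dashboard",
    target="Abrir el panel de control de Terra",
    max_chars=20,
    keep_terms=["Terra Dashboard"],
)
for problem in problems:
    print(problem)
```

A check like this only flags the segment. Deciding how to shorten the text or restore the term without breaking the sentence is still the post-editor’s job.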

Leave it to the experts

Relying on expert post-editors ensures that providers with a specific background and know-how handle the MT workflows. Experience and expertise allow vendors to implement the required improvements in MT outputs without sacrificing time or productivity.

02/09/2020

Linguist Profiling: What Makes an Ideal Candidate for Post-editor

The implementation of machine translation (MT) impacts the localization workflow with increased rates of productivity because it reduces delivery times and costs. But it also has other consequences, like redefining the traditional roles that language professionals assume in the industry, such as editor, proofreader, or translator. 

One of the most requested tasks in the MT workflow is post-editing, the process of improving a machine translation output. Only certain professionals stand out in this task. They are a specific type of editor with the required technical, psychological, and linguistic skills. Let’s find out what you need to be an ideal post-editor.

Papers please

First and foremost, a certification in translation, language studies, or linguistics is a must in the profile of a post-editor. If not, as per ISO 18587:2017, the post-editor must have at least five years’ experience in translating or post-editing. These requirements are intended to guarantee that translation service providers work with top-quality professionals.

A whole lot of competences

A quality-oriented translation workflow is rooted in the proper selection of the professionals involved in a project. This is also the case for MT workflows. The following list summarizes the competencies that are part of the ideal post-editor profile. 

Post-editors are, like any other translation professional, proficient in both source and target language and culture. They know how to research terminology efficiently and manage the information they find. They also master specific domains, which gives them an expert understanding of the source text.

Lastly, post-editors must be skilled in IT resources, like CAT tools, but also be acquainted with MT systems. The post-editors that fulfill the required profile know MT models (neural, statistical, example-based, rule-based) and their differences. Furthermore, they are aware of the most common errors in each system. Thus, they can manage their attention more efficiently and spot mistakes quickly.

The two A’s: Aptitude and Attitude

There are differences between the profiles of MT post-editors and TEP editors. Both are detail-oriented linguists, but in addition, post-editors must be fast and efficient, implementing minor and quick changes in the short time provided for editing.

Moreover, a salient feature of post-editors is their flexible attitude. Sometimes language professionals are reluctant to adopt MT. But MT is just one solution in the fast-growing translation industry, whose core business remains the same regardless of its growth or of the implementation of MT. Successful post-editors are confident and creative, and they adapt willingly to the new roles the industry has to offer.

15/07/2020

Lead Linguist Bibiana Cirera’s View on Machine Translation

Machine translation has always caused controversy in the translation industry. According to Bibiana Cirera, Lead Linguist at Terra Translations, this is primarily because some translators and editors have firmly opposed the incorporation of machine translation into their work. Bibiana has witnessed this distrust of machine translation firsthand: “I have heard many translators express concern about machine translation taking their jobs or stifling their creativity.” To provide more insight into this topic, we asked Bibiana for her honest take on machine translation’s role in the translation industry.

Machine Translation is Here to Stay

Bibiana is aware that we live in a globalized and constantly evolving world. The adoption of machine translation is one change that she believes translators and linguists need to accept and not feel threatened by. “The truth is that a machine will never be able to completely replace human labor in the translation process, at least for now, and it will always take the touch of a translator or editor to deliver a verbose, meaningful, and error-free deliverable to the client,” Bibiana said.

Bibiana has found that machine translation is extremely effective in handling certain subject matters, such as those relating to the medical, technology, and engineering industries. For subject matter that requires more creativity, such as marketing and advertising, she doesn’t feel machine translation can hit the mark.

At the end of the day, one of the benefits of machine translation in Bibiana’s opinion is client satisfaction, at least in regard to saving time. Some clients require a fast turnaround, especially if they handle large volumes of text, and they may not have time to wait for a human translation. By using machine translation, and then utilizing human labor for the post-editing process, the client can have a deliverable of acceptable quality in a shorter time frame. The decision to use machine translation during the process depends on the quality expectations of the client and what their priorities are. “We have clients who have found these machine translation tools to be really high quality in the cases of highly technical translation projects. In many instances, we find practically no differences between what a person and machine translation can translate when there is little room for creativity during the process. This tool even recognizes the client’s translation memory and glossary, which guarantees the correct application of both,” Bibiana said.

Machine Translation Has Its Faults

Bibiana acknowledges that machine translation has its difficulties, which is why pairing it with a human translator, linguist, or editor can make all the difference. Four shortcomings that Bibiana is wary of include:

  • Complex formats. Most machine translation engines do not recognize formats such as bold, italics, underlined text, subscript and superscript, colors, and the tags that are generated in a conversion.
  • Table headings. Machine translation tools often split up the words of a heading. Sometimes the order of the words must be reversed in translation, but an automatic translation tool cannot recognize this and translates the words literally.
  • Segmentation. When a program processes a source file, it may cut a sentence in two. In that case, the machine translation engine doesn’t recognize the cut sentence and translates it as two separate and meaningless sentences.
  • Inconsistencies. Machine translation engines are usually inconsistent in how they translate the same term, and they often mix formal and informal tone or verb tenses.
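The inconsistency problem described in the list above is easy to surface once source and target segments sit side by side. As a minimal sketch, assuming the project is available as a list of (source, target) pairs, the hypothetical helper below reports every repeated source segment that received more than one translation:

```python
from collections import defaultdict


def find_inconsistent_segments(pairs):
    """Map each repeated source segment to its differing translations."""
    seen = defaultdict(set)
    for source, target in pairs:
        # Normalize the source lightly so trivial differences do not hide repeats.
        seen[source.strip().lower()].add(target.strip())
    return {src: sorted(tgts) for src, tgts in seen.items() if len(tgts) > 1}


# Hypothetical MT output: the same button label rendered in two different registers.
pairs = [
    ("Save changes", "Guardar cambios"),
    ("Cancel", "Cancelar"),
    ("Save changes", "Guarde los cambios"),
]
print(find_inconsistent_segments(pairs))
```

The report shows where the engine wavered. Choosing the right rendering for the client’s tone and glossary remains a human decision.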

Bibiana urges against solely utilizing machine translation tools for the sake of saving money, as some tools may leave much to be desired without human intervention and the work may end up needing a complete retranslation.

Working Together is Key

Machine translation cannot stand on its own, and that should provide some comfort to linguists and translators who feel their territory is being infringed upon. “As we see it, in one case or the other, human work is essential,” Bibiana said. She believes that we can no longer continue to ignore and oppose the implementation of machine translation. She urges that we must make sense of it and acknowledge the many benefits associated with pairing machine translation with a skilled human touch.

20/01/2020

4 Stages and 8 Rules for Successful Post-editing

Post-editing is the task of improving a machine translation (MT) output. This service is part of a wider workflow that may involve the preparation of the input, the implementation of MT and the evaluation of the obtained text. It’s a complex process that involves technology know-how, artificial intelligence and linguistic knowledge [1] in its various steps.

[Chart: 4 Stages and 8 Rules for Successful Post-editing]

1. Pre-editing

Some texts are more suitable for MT than others, so post-editors prepare the source text before it reaches the engine. Pre-editing is this process of preparing the source text before MT in order to obtain a better raw output. The most common actions required in this step are the following:

  • Manage terminology
  • Apply style guides
  • Shorten sentence length
  • Reduce long noun phrases
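Of the pre-editing actions listed above, shortening sentences is the easiest to support with a small script that points the linguist to the sentences worth rewriting. This is a minimal sketch: the function name `flag_long_sentences` and the default limit of 25 words are assumptions for illustration, and the sentence splitting is deliberately naive.

```python
import re


def flag_long_sentences(text: str, max_words: int = 25) -> list:
    """Return the sentences in text that contain more than max_words words."""
    # Naive split after ., ! or ? followed by whitespace; abbreviations will confuse it.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if len(s.split()) > max_words]


# Hypothetical source text checked against a very low limit to show the behavior.
text = "Short one. This sentence has exactly six words."
print(flag_long_sentences(text, max_words=5))
```

The right threshold depends on the language pair and the engine, so it is best agreed on per project rather than hard-coded.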

2. Machine Translation

At this stage, the MT engine translates the source text. The engine can be integrated in a CAT tool, or it can be a client’s engine or Google Translate, among other options. Depending on the project’s scope or requirements, a sample may be machine-translated to check the output. Based on the results, and if needed, the project’s team makes adjustments to the source text or the engine.

3. Post-editing

Depending on the client’s requests and needs, the translated output can be delivered without post-editing at all (raw output), or with light post-editing or deep post-editing. Regardless of which process is applied, there are certain rules that determine the post-editing process. According to the Translation Automation User Society (TAUS), during the post-editing task, the post-editor should bear in mind these rules:

  1. Do not retranslate the text
  2. Decide changes quickly (“2-second rule”)
  3. Translate the whole text, unless some phrases are classified as untranslatable
  4. Correct incomprehensible sentences
  5. Delete inaccurate sentences if they are irrelevant and difficult to correct
  6. Focus on semantic and syntactic mistakes
  7. Don’t correct stylistic errors (their correction is subject to prior agreement)
  8. Don’t replace recurring terms with synonyms

4. Feedback and Evaluation

When an MT engine is being developed, the post-editor not only corrects the text but also provides feedback to the engineers. Usually, the evaluation is made using standardized forms. This is a very important step that helps improve the MT engine. The MT team retrains the engine based on the feedback provided (for instance, by changing configurations or uploading new bilingual samples). With this step, the engine is “trained” so the quality of the MT output improves gradually.

The Zero Step

Like in any other localization project, there is a step that cannot be skipped. For a successful delivery, it’s essential to have a prior agreement with clients about what they expect of the MT workflow. Specifically, what kind of post-editing process will be applied (none, light, or deep), style preferences, proper nouns treatment, date format and untranslatable phrases, among others, are details that need to be specified before the project starts. This kind of agreement is the foundation of any localization task.

[1] As we can see in the chart, the skills and expertise of linguists play a key part in the MT workflow.

17/10/2019

What Is Post-Editing In The Translation Industry?

The development of artificial intelligence is not only changing industrial processes but also the translation workflow as we know it. In the localization industry, the use of powerful engines to produce machine translation (MT) output is becoming more frequent. Because of this, the role of the translator and other professionals in the industry is being redefined. MT creates new roles and needs while other tasks are no longer required. The most relevant change brought about by MT involves the translator, who in this new role is no longer in charge of the translation but of editing the output generated by the engine. This process of revision is called post-editing.

What exactly is post-editing?

When we talk about post-editing, we are referring to the process of improving an MT output. The text is modified with two main goals. The first is to enhance that particular text in order to get a readable and understandable output. The second is to improve the MT engine with the linguist’s work and feedback, a process that also involves localization or language engineers. It is important to note that post-editing requires specific skills. A post-editor should be very attentive to detail in order to detect errors and make pertinent corrections. He or she should also be efficient, since in this task time is everything. The goal of post-editing is to get a correct text with very quick and short modifications.

Furthermore, when dealing with MT projects, it is important to know which text types are a better fit for this technology. Technical or scientific texts are more suitable for MT and post-editing than marketing materials, video games or audiovisual products. This is because the latter are tied to other media (e.g., video, audio) and have more creative content as well as more complex sentences. Such factors result in harder input for MT.

Types of post-editing processes

There are two different processes of post-editing. First, light post-editing consists of quickly implementing a small number of changes so the MT output is considered acceptable. The expected corrections are the following:

  • Orthography
  • Mistranslation
  • Omissions and additions
  • Terminology

The text may still contain grammar or punctuation mistakes. The purpose of the light post-editing process is to obtain an acceptable and understandable text that makes information available to readers, not an output with human translation quality.

On the other hand, in the deep post-editing process more changes are expected since the quality of the output should be equivalent to the quality of a human translation. As a result, linguists involved in the task spend more time working on the MT product. In a deep post-editing project, the editor should implement the light post-editing changes, plus the following:

  • Grammar
  • Punctuation
  • Style
  • Tone

After editing, the text will be suitable for publication and distribution.
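The relationship between the two processes can be written down as a small data structure: deep post-editing covers everything light post-editing does, plus four more categories. The sketch below assumes a hypothetical list of flagged issues, each tagged with a category name; the category labels and the function `errors_to_fix` are illustrative, not an industry standard.

```python
# Categories corrected at each level, as listed in this article.
LIGHT = frozenset({"orthography", "mistranslation", "omission_or_addition", "terminology"})
DEEP = LIGHT | frozenset({"grammar", "punctuation", "style", "tone"})
LEVELS = {"light": LIGHT, "deep": DEEP}


def errors_to_fix(flagged, level):
    """Keep only the flagged issues that fall within the agreed post-editing level."""
    allowed = LEVELS[level]
    return [issue for issue in flagged if issue[0] in allowed]


# Hypothetical issues found in one MT segment: (category, note).
flagged = [
    ("terminology", "glossary term not used"),
    ("punctuation", "colon mirrored from the source"),
    ("tone", "too informal for the audience"),
]
print(errors_to_fix(flagged, "light"))
print(errors_to_fix(flagged, "deep"))
```

Under light post-editing only the terminology issue would be corrected, while deep post-editing would address all three.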

Types and needs

Knowing the difference between both post-editing types is key for a successful project. The process to be applied always depends on the clients’ needs. That way they can truly take advantage of MT technology.

13/05/2019

What is Machine Translation? Here’s What You Need to Know

The use of Machine Learning is growing at an extraordinary rate. In fact, according to a PwC study, business leaders believe Artificial Intelligence (AI) is going to be fundamental in the future, and 72 percent termed it a “business advantage.” There’s no denying the cost savings and efficiencies Machine Learning can provide. However, researchers still seek to perfect and appropriately apply data and technology. Professional translators also incorporate a form of Machine Learning into practice. This is known as machine translation.

What is machine translation?

Machine translation uses software to translate one language into another. At its most basic, the process substitutes words in one language for words in another with no human involvement. One of the best-known examples is Google Translate. Google CEO Sundar Pichai revealed that the service now translates 143 billion words a day. Although the service is highly popular, professional translators agree that it lacks accuracy. The translations it produces are not precise, especially when cultural references are involved.

What are the different types of machine translation?

There are three main types of machine translation. The first is rule-based: the translation relies on a collection of language “rules” developed by linguists. With countless linguistic guidelines to maintain, rule-based machine translation requires costly upkeep. It was the first kind to be available commercially, but it has since been replaced by more efficient software.

The second type is statistical machine translation. This more complex form uses algorithms to produce text selected from millions of possible permutations. In some situations, combining rule-based and statistical translation improves the quality of the result. Like rule-based translation, this form is used less frequently today because of the additional work needed to maintain the system.

The third and best-known type is neural machine translation. Brought into mainstream use by Google, neural machine translation uses artificial neural networks, loosely modeled on the human brain, to predict a sequence of words. Some engines of this kind are interactive: they allow translators to train the machine in real time as they rework and edit suggested phrases. The engine learns and remembers new terms in the correct context and tone, which improves quality in future translations. Sentences and phrases generated by a neural engine usually sound more natural and fluent.

When should I use machine translation?

While machine translation may improve speed, many projects require more attention. As mentioned earlier with Google Translate, machine translation lacks the ability to fully understand culture, context, and tone. Translation errors and fluency issues are still possible. Sales, legal, life science, safety, and marketing content should be handled by human translators. There are, however, certain contexts and situations where machine translation is most beneficial. One scenario is having large volumes of content to translate with short deadlines. Another is using machine translation as a placeholder while human translation is in progress.

Translators across the industry agree that machine-translated content should undergo human post-editing. Post-editing can be light, in which case the translator ensures the text is accurate and understandable. It can also be more in-depth or full, in which case the translator ensures the text is accurate, fluent, and consistent with the target language.

Machine translation has grown more sophisticated over the years. Nonetheless, it’s still imperative to have a human translator check for errors, especially if the translation is for professional use. Every translation mistake has the potential to drive away customers or worse…go viral.