21/06/2022

Does machine translation reinforce gender bias?

Although a machine learning model can be a powerful tool in the translation space, it can only be as good as the data it learns from. If the data used to train a machine learning algorithm contains a systematic error, the resulting model will reflect it. Such errors are the main reason gender bias appears in machine translation. Some of this is outside the control of the people who build machine translation engines, but some of it is not. Let’s examine how machine translation reinforces gender bias and how it can be fixed.

How Errors Can Occur

Wikipedia serves as a good example of how machine translation errors can occur and reinforce gender bias. Wikipedia’s entries tend to be geographically diverse and lengthy, and they refer to subjects in the third person, which leads to heavy use of pronouns. Because of this, Wikipedia entries (particularly biographies) often have the potential to cause gender-related machine translation errors, especially when an article names a person explicitly early in a passage but refers to them only by pronoun later on.

How Errors Can Be Resolved

Let’s look at Google as an example of a company aiming to resolve machine translation mistakes regarding gender. Google acknowledges that its translation tools struggle with errors that reinforce gender bias. The company believes it needs to advance translation techniques beyond the level of single sentences. Doing this requires setting new metrics for measuring progress and creating datasets that capture the most commonly encountered context-related errors. This is a significant challenge. Translation errors related to gender are especially sensitive, because they can misrepresent a person and how they self-identify.

Google is working towards long-term improvements to its machine learning systems so that it can keep improving how they translate pronouns and gender.
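To make the idea of a progress metric concrete, here is a minimal sketch of one way to score gendered pronouns in an English translation of a biography. It is not Google's metric or dataset: the function name `pronoun_accuracy`, the pronoun lists, and the sample text are all assumptions made for illustration, and the check only works when the subject's gender is known in advance and the passage is about a single person.

```python
import re

# Hypothetical pronoun lists for English output; a real evaluation
# would need to cover more forms and more languages.
PRONOUNS = {
    "female": {"she", "her", "hers", "herself"},
    "male": {"he", "him", "his", "himself"},
}


def pronoun_accuracy(translated_text, expected_gender):
    """Return the share of gendered pronouns that match the expected gender.

    Returns None when the text contains no gendered pronouns at all.
    """
    tokens = re.findall(r"[a-z]+", translated_text.lower())
    all_gendered = PRONOUNS["female"] | PRONOUNS["male"]
    gendered = [t for t in tokens if t in all_gendered]
    if not gendered:
        return None
    correct = sum(1 for t in gendered if t in PRONOUNS[expected_gender])
    return correct / len(gendered)


# Illustrative passage: the subject is named in the first sentence only,
# and the second sentence uses the wrong pronoun.
sample = "Marie Curie was a physicist. He won two Nobel Prizes. Her work changed science."
score = pronoun_accuracy(sample, "female")
```

In this sample, one of the two gendered pronouns matches the subject, so the score is 0.5. A score like this only flags passages for review; it says nothing about neutral pronouns or about people the passage mentions besides the subject.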

The Takeaway

In recent years, awareness of these biases has grown, and machine translation engineers are working to resolve the issue quickly, but it’s no easy endeavor, since gender works so differently from one language to another. Even though many advancements have been made in the machine translation industry, work still needs to be done. In reality, a human translator is much better equipped to handle sensitive issues such as gender.

It has taken many years to improve machine translation quality, and further improvements will take more time. However, this issue can’t wait that long to be addressed. Errors that reinforce gender bias are critical to work on now, given how relevant gender inclusivity has become. If a company wants to prioritize inclusive language, it’s not safe to use an automated solution. Gender is a sensitive topic, and a translation needs to convey your message tactfully. Right now, human translators are researching and staying up to date with the latest trends in the languages they work with. This is necessary, as usage is changing fast. Companies should turn to these professionals to ensure their brand is not hurt by a careless machine translation mistake.

14/06/2022

Rethinking Context in Localization

Nobody would deny that context matters in translation and, more broadly, in understanding language in use. On closer inspection, though, “context” is a complex notion that refers to different levels of texts and reality. Is context only what surrounds a word or an expression? How do we take into account the social practices in which texts are used? Is it always possible to consider every aspect of context? In this article, we will outline some possible answers to these questions and think about their relevance for localization projects.

Managing Context

When examining context, we have to think of it in a broad sense. It refers not only to the placement of words within a text, but also to the people who take part in a given communication act, the setting where it happens, and how, when, and why it happens. In localization projects, collecting this contextual information helps in analyzing and selecting the best workflows, procedures, services, or strategies for a task or project.

Job briefings are the documents that meet this need for situational context. They include a summary of who the client is, which texts will be processed and for what purpose, who the target audience is, what the expectations are, and so on. Sometimes, job briefings also incorporate the style guide or other linguistic preferences.

But how does all this information add value to a localization or translation project? For instance, knowing who the client is (e.g., a direct client or an LSP) tells Project Managers (PMs) what expectations to anticipate and how familiar the client is likely to be with industry processes. So, for example, a PM can suggest extra quality assurance steps to a newcomer client to reduce risks for sensitive documents. Likewise, knowing where the localized texts will appear (say, a mobile app vs. a desktop app, or a social media marketing campaign vs. a graphic media campaign) can help PMs choose the right linguistic team for the task, with experience not only in IT or Marketing but specifically in mobile apps or social media.
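As an illustration, the briefing fields and the two decisions described above could be captured in a small data structure. This is only a sketch: the class `JobBrief`, its field names, and the helper functions are hypothetical and do not describe any particular project management tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JobBrief:
    """Hypothetical job briefing holding the situational context of a project."""
    client_name: str
    client_type: str            # e.g. "direct" or "lsp"
    client_is_newcomer: bool
    text_purpose: str
    target_audience: str
    domain: str                 # e.g. "IT" or "Marketing"
    channel: str                # e.g. "mobile app" or "social media"
    sensitive_documents: bool = False
    style_guide: Optional[str] = None


def suggest_extra_qa(brief: JobBrief) -> bool:
    """Suggest extra quality assurance for newcomer clients with sensitive documents."""
    return brief.client_is_newcomer and brief.sensitive_documents


def required_experience(brief: JobBrief) -> List[str]:
    """List the experience a linguistic team should have: the domain and the channel."""
    return [brief.domain, brief.channel]


# Example with made-up values.
brief = JobBrief(
    client_name="Example Client",
    client_type="direct",
    client_is_newcomer=True,
    text_purpose="in-app onboarding copy",
    target_audience="first-time users",
    domain="IT",
    channel="mobile app",
    sensitive_documents=True,
)
```

With these made-up values, `suggest_extra_qa(brief)` is true and `required_experience(brief)` asks for a team with both IT and mobile app experience.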

Context Reliance

We know context is crucial because it sometimes draws the line between an inaccurate translation and an accurate one. A visual reference, a note, or a video can help linguists determine the gender of a character, the meaning of a sentence, the referent of a noun, and so on. This is why every project, no matter its topic or intention, can benefit from having references (e.g., videos, websites of the product or the client, related documents) or the source text (the original document). These materials help linguists maintain consistency, choose terminology, check format or typography, and so on.

Other dimensions of context play a role in translating humor, for example. In this case, the culture, age, and idiosyncrasies of the target audience are crucial for translating puns and jokes in a way that is funny but also appropriate. The same can be said about the reliance of subtitles on audiovisual materials, as in movies, series, or video games. Being able to see gestures and movements and hear tones of voice helps translators and editors contextualize dialogues and narratives.

Teamwork and Communication

As we can see, many resources can help contextualize texts, projects, and clients. However, a realistic approach to localization needs to accept that contexts, complex as they are, can never be captured completely. This means that sometimes, despite supporting documents or briefs, meanings, wordplay, and references can be difficult to pin down. It’s in these scenarios that a solid communication approach based on exchange and teamwork can boost creativity and problem-solving. Research, discussion, debate, and a comfortable framework for asking teammates and clients questions always help the team find the best option for each case together.

07/06/2022

Brazil: One Country, Many Variants and a Linguistic Rivalry

Brazil is a massive country with more than 211 million people living across 3,287,956 square miles. In fact, Brazil is the fifth largest country in the world by area and the largest in South America. While the official language of this sprawling country is Portuguese, the way it is spoken varies greatly from region to region. The two most recognizable accents are the Rio de Janeiro accent and the São Paulo accent.

While Rio and São Paulo are not very far from each other geographically, they have quite a language divide. Each region speaks a different version of Brazilian Portuguese, and their pronunciation differs greatly. Residents of Rio tend to be called Cariocas, whereas residents of São Paulo are usually called Paulistanos or Paulistanas.

These varieties stem from the European influences in Rio de Janeiro caused by colonialism. In São Paulo, more language influence came from the indigenous people and a variety of European languages. These differing influences have left a mark on Brazil that is still felt today. Let’s take a look at some of the differences.

Terminology 

Terminology can vary greatly throughout Brazil, even in popular songs. Take the “Happy Birthday” song, for instance. In São Paulo, people sing “é pique, é pique…”, whereas in Rio de Janeiro they sing “é big, é big…”. Even the names for party decorations vary between regions, and the word for “balloon” is a good example: a balloon is called “bexiga” in São Paulo (a word that also means “bladder” across Brazil) and “balão” in Rio de Janeiro (like a soccer ball).

The word for “traffic lights” also varies: “farol” in São Paulo and “sinal” in Rio de Janeiro. Sometimes, the same Portuguese word can have different meanings. The word “bolacha” refers to any kind of cookie or biscuit in São Paulo, whereas in Rio de Janeiro it refers only to cookies with filling.
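For localization work, regional differences like these are often recorded in a glossary keyed by market. The sketch below shows one possible shape for such a glossary, using only the examples from this article. The names `REGIONAL_TERMS` and `localize_term` and the region keys are hypothetical, chosen for illustration.

```python
# Hypothetical regional glossary built only from the examples in this article.
REGIONAL_TERMS = {
    "balloon": {"sao_paulo": "bexiga", "rio_de_janeiro": "balão"},
    "traffic lights": {"sao_paulo": "farol", "rio_de_janeiro": "sinal"},
}


def localize_term(concept, region, glossary=REGIONAL_TERMS):
    """Return the regional term for a concept, or None if it is not in the glossary."""
    variants = glossary.get(concept.lower())
    if variants is None:
        return None
    return variants.get(region)


sp_balloon = localize_term("balloon", "sao_paulo")
rio_lights = localize_term("Traffic lights", "rio_de_janeiro")
```

A lookup table like this handles one-to-one substitutions only. A word such as “bolacha”, whose meaning changes by region, still needs a linguist who knows the target market.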

Accents

When it comes to accents, Brazil does not have a standard accent or even a preferred one. While some TV and radio broadcasters do try to speak with a more “neutral accent”, what counts as neutral can sound different depending on where the content is being distributed.

Whether paulista, carioca, or from another region, most people in Brazil will be teased about their accent at some point in their lives, mostly in a friendly way, by people from other regions.

One key difference between the accents of São Paulo and Rio de Janeiro is how speakers pronounce an “s” sound before consonants. In Rio de Janeiro, this “s” is pronounced the same way as in Standard European Portuguese. Another example worth examining is the “r” sound. In Rio, “r” is pronounced similarly to the English “h”, whereas in São Paulo “r” is rolled, closer to the “r” spoken in Spanish.

One Language With Key Differences

While these differences may seem small at first glance, they can present challenges during the translation process. It can help to work with a localization expert who is very familiar with the specific market in Brazil that you’re creating a product or content for.