Although a machine learning model can be a powerful tool in the translation space, it can only be as good as the data it learns from. If there is a systematic error in the data used to train a machine learning algorithm, the resulting model will reflect it. These errors are the main reason that gender bias is present in machine translation. Some of the causes are outside the control of the people who build machine translation engines, but others are not. Let’s examine how machine translation reinforces gender bias and how it can be fixed.
How Errors Can Occur
Wikipedia serves as a good example of how machine translation errors can occur and reinforce gender bias. Wikipedia’s entries tend to be geographically diverse and lengthy, and they refer to their subjects in the third person, which leads to heavy use of pronouns. Because of this, Wikipedia entries (particularly biographies) are especially prone to machine translation errors related to gender, above all when an article names a person explicitly early in a passage but not in the sentences that follow. A system that translates one sentence at a time then has no explicit cue to the person’s gender and has to guess.
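To make the problem described above concrete, here is a minimal Python sketch. It assumes a toy biography written the way a language that routinely drops subject pronouns might phrase it, and it flags the sentences that never name the subject. The function names, the naive sentence splitter, and the sample text are illustrative assumptions only, not part of any real translation engine.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentences_without_explicit_subject(text: str, subject_names: list[str]) -> list[str]:
    """Return the sentences that do not mention the subject by name.

    A translator that sees only one sentence at a time has no explicit
    cue to the subject's gender in these sentences.
    """
    flagged = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        if not any(name.lower() in lowered for name in subject_names):
            flagged.append(sentence)
    return flagged


# Hypothetical input: an English gloss of a biography whose source
# language omits the subject after the first mention.
biography = (
    "Marie Curie was born in Warsaw. "
    "Moved to Paris in 1891. "
    "Won the Nobel Prize in Physics in 1903."
)

flagged = sentences_without_explicit_subject(biography, ["Marie", "Curie"])
for sentence in flagged:
    print("No explicit subject:", sentence)
```

Only the first sentence names the subject. For the other two, a sentence-by-sentence system would have to supply "he" or "she" without any evidence in the sentence itself.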
How Errors Can Be Resolved
Let’s look at Google as an example of a company aiming to resolve machine translation mistakes regarding gender. Google acknowledges that its translation tools struggle with errors that reinforce gender bias. The company believes that translation techniques need to advance beyond single sentences. Doing this requires setting new metrics for measuring progress and creating datasets that contain the most commonly encountered context-related errors. It is a significant challenge. Translation errors related to gender are particularly sensitive, because they can misrepresent a person and how they self-identify.
Google is working towards long-term improvements to its machine learning systems so that they keep getting better at translating pronouns and gender.
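As a rough illustration of what a metric for this kind of error could look like, the Python sketch below scores translated sentences by whether their pronouns match the gender recorded for the person in a reference dataset. This is an assumption for illustration only and is not Google’s actual metric. The pronoun lists, category names, and sample data are hypothetical, and matching on English pronouns alone is a deliberate simplification.

```python
import re

# Hypothetical pronoun lists for English output; real evaluation sets
# would need far more careful, language-specific handling.
PRONOUNS = {
    "feminine": {"she", "her", "hers", "herself"},
    "masculine": {"he", "him", "his", "himself"},
    "neutral": {"they", "them", "their", "theirs", "themself", "themselves"},
}


def pronoun_genders(text: str) -> set[str]:
    """Return the set of pronoun categories that appear in the text."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return {gender for gender, words in PRONOUNS.items() if tokens & words}


def gender_accuracy(examples: list[tuple[str, str]]) -> float:
    """Fraction of outputs whose pronouns all match the expected category.

    Each example is a pair: (expected category, translated sentence).
    """
    if not examples:
        return 0.0
    correct = sum(
        1 for expected, output in examples if pronoun_genders(output) == {expected}
    )
    return correct / len(examples)


# Hypothetical system outputs paired with the gender recorded for the subject.
examples = [
    ("feminine", "She moved to Paris in 1891."),
    ("feminine", "He won the Nobel Prize in 1903."),  # wrong pronoun
    ("masculine", "He published his first paper."),
]

score = gender_accuracy(examples)
print(f"Pronoun gender accuracy: {score:.2f}")
```

A score like this only makes sense on a dataset built from passages where context matters, such as biographies in which the subject is named once and then referred to indirectly.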
The Takeaway
In recent years, there has been more awareness that these biases exist, and machine translation engineers are trying their best to resolve the issue quickly. It is no easy endeavor, however, because gender works so differently across languages. Even though many advancements have been made in the machine translation industry, work still needs to be done. In reality, a human translator is much better equipped to handle sensitive issues such as gender.
It has taken many years to improve machine translation quality, and further improvements will take more time. However, this issue can’t wait that long to be addressed. Errors that reinforce gender bias need attention now, given how prominent gender inclusivity has become. If a company wants to prioritize inclusive language, it’s not safe to use an automated solution. Gender is a sensitive topic, and a translation needs to convey your message with care. Human translators research the languages they work with and stay up to date with the latest trends in them. This is necessary because usage is changing so fast. Companies should turn to these professionals to ensure their brand is not hurt by a careless machine translation mistake.