Although a machine learning model can be a powerful tool in the translation space, it can only be as good as the data it learns from. If the data used to train a machine learning algorithm contains a systematic error, the resulting model will reproduce it. Errors of this kind are the main reason gender bias shows up in machine translation. Some of the causes are outside the control of the people who build machine translation engines, but others are not. Let’s examine how machine translation reinforces gender bias and how it can be fixed.
How Errors Can Occur
Wikipedia is a good example of how machine translation errors can occur and reinforce gender bias. Wikipedia entries tend to be geographically diverse and lengthy, and they refer to their subjects in the third person, which means they use a lot of pronouns. As a result, Wikipedia entries (particularly biographies) are prone to gender-related translation errors, especially when an article names a person explicitly early on but refers to them only indirectly later. An engine that translates one sentence at a time no longer has the earlier mention to draw on, so it has to guess the gender.
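To make this concrete, here is a toy Python sketch. It is not how any real translation engine works. The training pairs are made up and deliberately skewed, and the names `TRAINING_PAIRS`, `train`, and `guess_pronoun` are hypothetical. It only illustrates the two ideas above: a model repeats the imbalance in its data, and it falls back on that imbalance when the sentence it is translating carries no information about the person's gender.

```python
from collections import Counter, defaultdict

# Made-up training pairs: (occupation word, pronoun seen with it).
# The skew is deliberate and stands in for a systematic imbalance in real data.
TRAINING_PAIRS = [
    ("doctor", "he"), ("doctor", "he"), ("doctor", "he"), ("doctor", "she"),
    ("nurse", "she"), ("nurse", "she"), ("nurse", "she"), ("nurse", "he"),
]


def train(pairs):
    """Count how often each pronoun appears with each word."""
    counts = defaultdict(Counter)
    for word, pronoun in pairs:
        counts[word][pronoun] += 1
    return counts


def guess_pronoun(counts, word, context_pronoun=None):
    """Choose a pronoun for a sentence whose source pronoun is gender-neutral.

    context_pronoun stands for what an earlier sentence already told us.
    A sentence-by-sentence translator never sees it, so it is None by default.
    """
    if context_pronoun is not None:
        return context_pronoun  # context from the wider document wins
    if word not in counts:
        return "they"  # no statistics at all: stay neutral
    # No context: fall back on the most frequent pronoun in the training data.
    return counts[word].most_common(1)[0][0]


model = train(TRAINING_PAIRS)
print(guess_pronoun(model, "doctor"))         # sentence in isolation
print(guess_pronoun(model, "doctor", "she"))  # earlier sentence said "she"
```

In isolation, the sketch picks whichever pronoun dominated the training pairs. Once the context from an earlier sentence is passed in, the guess is no longer needed.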
How Errors Can Be Resolved
Let’s look at Google as an example of a company working to resolve gender-related machine translation mistakes. Google acknowledges that its translation tools struggle with errors that reinforce gender bias. The company believes it needs to advance translation techniques beyond the level of the single sentence. Doing this requires new metrics for measuring progress, as well as datasets built around the most commonly encountered context-related errors. It is a significant challenge. Gender-related translation errors are especially sensitive because they can misrepresent a person and how they self-identify.
Google is working toward long-term improvements to its machine learning systems so that it can keep getting better at translating pronouns and gender.
In recent years there has been more awareness that these biases exist, and machine translation engineers are working to resolve the issue quickly. It is no easy endeavor, though, because gender works so differently from one language to the next. Even though the machine translation industry has made many advances, work still needs to be done. In reality, a human translator is much better equipped to handle sensitive issues such as gender.
It has taken many years to improve machine translation quality, and further improvements will take more time. However, this issue can’t wait that long. Errors that reinforce gender bias are especially important to address right now, given how much attention gender inclusivity has gained recently. If a company wants to prioritize inclusive language, relying on an automated solution alone is not safe. Gender is a sensitive topic, and a translation needs to convey your message with that sensitivity intact. Right now, human translators are the ones researching and staying up to date with the latest developments in the languages they work with, which is essential when usage is changing so fast. These professionals are the people companies should turn to in order to ensure their brand is not hurt by a careless machine translation mistake.