While machine translation has been around for years, OpenAI’s ChatGPT, GPT-4, and similar large language models have already begun disrupting the language industry in many ways, for instance by reducing human involvement in translation and localization.
GPT-4, OpenAI’s successor to the model behind ChatGPT, is a service based on large language model (LLM) technology. An LLM is a type of language model trained to predict the probability of the next word in a sequence given the words that precede it.
By contrast, solutions such as DeepL, Google NMT, or ModernMT are based on a technology called NMT (neural machine translation), which analyzes the words in a sentence and produces the most accurate equivalent in another language based on the surrounding words.
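To make the LLM training objective concrete, here is a minimal sketch of next-word probability estimation using simple bigram counts over a toy corpus. This is only an illustration of the objective described above; real LLMs learn these probabilities with deep neural networks trained on vast datasets, and the corpus and function names here are invented for the example.

```python
from collections import Counter, defaultdict

# Toy corpus; a real model would train on billions of words.
corpus = "the cat sat on the mat and the cat slept".split()

# Count how often each word follows each preceding word.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_probs(prev):
    """Return P(next word | prev) estimated from observed counts."""
    counts = bigram_counts[prev]
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

probs = next_word_probs("the")
# In this corpus, "the" is followed by "cat" twice and "mat" once,
# so the model assigns "cat" a probability of 2/3 and "mat" 1/3.
```

The same idea — predict the most probable continuation given the context — underlies LLM text generation, just at a vastly larger scale.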
However, are such large language models and the solutions built on them fully capable of taking over from human translators?
A 2020 study by Google deep learning researchers shows that NMT has come close to outperforming human-level translation, meaning that the technology may be approaching its peak.
The use of GPT-4 can bring down localization and translation costs by up to 50 percent compared to current machine translation (MT) solutions. Mass adoption of GPT-4 by language companies could also result in shorter project turnaround times on top of the lower costs, experts claim.
According to Milengo, a localization and translation service provider that works with over 400 companies globally and combines machine translation with human expertise, there are cases where the use of GPT-4 can reduce costs by up to 80 percent.
However, there are downsides to this technology as well.
One of them is translation quality: on technical texts, GPT-4 scores lower in quality analyses than neural machine translation (NMT) models, Milengo’s CEO Roman Kotzsch explains.
“While it is more fluent, it has a slightly higher percentage of mistranslations. But in combination with NMT, it can significantly boost translation productivity,” Kotzsch tells The Recursive.
Additionally, while customers don’t mind using machine translation solutions for their personal needs, when it comes to enterprise translation, quality expectations get much higher.
“In other words, the customer feels satisfied with the quality of Google Translate or DeepL when it comes to translating a personal email or a chat. But speaking of an app or a company website, requirements to the consistency of the translation, terminology aspects, and also being error-free get much higher. So, if you have low quality in your translations as a company, it will automatically harm the brand,” Kotzsch adds.
While both NMT and LLM types of models use neural network architectures and are trained on large amounts of data, their objectives and applications differ. For translation, LLMs still trail NMT solutions because their output is not always as accurate.
Therefore, translating first with NMT for accuracy and then polishing the text with an LLM is what can boost productivity, he emphasizes.
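The hybrid workflow described above can be sketched as a simple two-stage pipeline. The functions below are hypothetical stand-ins: in practice, `nmt_translate` would call a commercial NMT engine and `llm_polish` would call an LLM API with a fluency-editing prompt; neither name comes from the article.

```python
def nmt_translate(source_text: str) -> str:
    """Placeholder for stage 1: an accurate but sometimes stiff NMT draft."""
    return f"[NMT draft of: {source_text}]"

def llm_polish(draft: str) -> str:
    """Placeholder for stage 2: an LLM pass that smooths fluency
    without retranslating from scratch."""
    return f"[LLM-polished: {draft}]"

def hybrid_translate(source_text: str) -> str:
    # Accuracy first (NMT), fluency second (LLM) — the ordering
    # Kotzsch recommends for boosting productivity.
    return llm_polish(nmt_translate(source_text))

result = hybrid_translate("Hallo Welt")
```

A human reviewer would typically still check the polished output, in line with the quality concerns raised below.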
Human touch still needed to verify accuracy and ensure sensitivity
However, despite the potential of GPT-4 and other LLMs, human translators are still needed, as AI translations can be inaccurate, culturally insensitive, or biased, depending on the dataset used during training.
For seasoned translator Katarina Pavicevic, while ChatGPT and GPT-4 could eventually replace humans, right now this remains a distant possibility.
“Language is not just about translating words, but also about capturing cultural nuances, idiomatic expressions, and the context of the message. Human translators have a deep understanding of language and culture, which allows them to accurately deliver the intended meaning of the message,” Pavicevic tells The Recursive.
Another advantage that human translators have is that they can adapt to different writing styles and voices, whereas machine translation can sometimes produce awkward or unnatural-sounding translations. This is because AI language models are based on statistical patterns rather than true understanding of language.
For Skopje-based content writer and translator Maja Popovska, precisely these elements are a guarantee that AI won’t replace writers and translators any time soon.
“Although it seems like AI is taking over and sparking fear of potentially replacing writers and translators, realistically it’s not possible – at least not in the near future. We can’t deny that AI tools like ChatGPT do a decent job at translating text from one language to another, but one thing to bear in mind is the human factor that reflects the tone of the text itself, and evokes feelings with the reader,” Popovska tells The Recursive.
Lastly, human translators can make judgments and decisions based on their own knowledge and experience. They can identify potential errors or ambiguities in the source text and make corrections accordingly, while machine translation lacks this human touch and can sometimes produce errors or misinterpretations, both Popovska and Pavicevic point out.
Other issues that prevent companies from completely replacing human translation with GPT-4 include legal concerns, the need for fact-checking, and output bias, among other challenges.
“Human translators are still needed to verify accuracy and ensure cultural sensitivity of the final translation. Also, AI translations can be affected by bias, depending on the dataset used during the training process,” Kotzsch warns, adding that despite its high level of process automation, Milengo has always placed strong emphasis on human expertise.
What language companies in an industry worth more than $26 billion can do to benefit from GPT-4 and similar models, experts agree, is focus on making translation output culturally relevant and unique through human editing, fact-checking, and cultural adaptation.