The idea of automatic universal translation has long been the stuff of science fiction. A recent paper from researchers in artificial intelligence at Facebook’s parent company Meta is said to be a step in that direction.
The paper demonstrates that machine learning, the technology underlying AI, is capable of translating 204 languages, more than had ever been attempted, and at a higher level of quality than earlier systems achieved.
The translations cover languages with relatively few speakers, such as Acehnese, spoken in Indonesia, and Chokwe, spoken in Central and Southern Africa. Such languages have been difficult for computers to translate because little text in them is available online.
Mark Zuckerberg, CEO of Facebook and Meta, lauded the achievement and called AI translation a “superpower”. The researchers themselves were only slightly less ecstatic.
It is the latest breakthrough in artificial intelligence, a field that is no stranger to controversy. AI recently made headlines when Google engineer Blake Lemoine claimed that LaMDA, a Google language model, was sentient and was placed on leave after making the claim.
The aim of machine translation (MT) is for everyone in the world to be able to understand every language in real time, says MT pioneer Philipp Koehn, who adds that researchers are very close to achieving that.
He describes the paper as remarkable work that extends production-level translation quality to 200 languages. Koehn was one of the 38 academics and Meta researchers who worked on the project. In addition, a large set of resources will be released so that anyone can use the model and retrain it themselves, which should encourage further research in the field.
According to the paper, the work is a stepping stone toward a universal translation system. Computer scientists who were not involved in the project, however, emphasized that it is just one small step on a long and winding road with no clear destination.
A remarkable feat of engineering
According to Dr. Alexandra Birch-Mayne, a Reader in NLP (Natural Language Processing) at the University of Edinburgh, the paper’s fundamental machine learning technique, a model known as a Sparsely Gated Mixture of Experts, was not novel in itself.
Its most significant contribution, she claims, was gathering, cleaning, and presenting new data on languages that were not widely available on the internet, which is the primary source of data for machine translation.
“It’s a remarkable feat of engineering. It is not necessarily a fundamental scientific breakthrough,” Dr. Birch-Mayne told Sky News.
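For readers curious about what “sparsely gated” means, the general idea can be shown with a toy sketch: the model contains several sub-networks (“experts”), and a gate sends each input to only the few experts with the highest scores, then blends their outputs. This is an illustration of the general technique only, not Meta’s implementation. The function names, the toy experts, and the gate scores are all hypothetical; in a real system the gate scores are produced by a learned network rather than passed in by hand.

```python
import math


def softmax(values):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_moe(x, gate_scores, experts, k=2):
    """Route x to the k highest-scoring experts and blend their outputs.

    gate_scores is a hypothetical stand-in for the output of a learned
    gating network; experts is a list of plain Python callables.
    Returns the blended output and the indices of the experts used.
    """
    # Pick the k experts with the highest gate score; the rest are skipped.
    chosen = sorted(range(len(experts)),
                    key=lambda i: gate_scores[i],
                    reverse=True)[:k]
    # Normalise the scores of the chosen experts only.
    weights = softmax([gate_scores[i] for i in chosen])
    output = sum(w * experts[i](x) for w, i in zip(weights, chosen))
    return output, chosen


# Four toy "experts" (hypothetical); real experts are neural network layers.
toy_experts = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: -x,
    lambda x: x * x,
]

if __name__ == "__main__":
    result, used = sparse_moe(3.0, [0.0, 3.0, 1.0, -2.0], toy_experts, k=2)
    print("experts used:", used)
    print("blended output:", result)
```

Because only a few experts run for any given input, a model of this kind can hold a very large number of parameters without every one of them being used on every sentence.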
The paper says that, beyond covering rarely spoken languages, the work also focuses on the quality of the translations.
Data and algorithms will be made available to the public
Measuring progress in machine learning is difficult, but the Meta paper improved translation quality by 44 percent over the previous state of the art, as measured by a metric known as BLEU.
BLEU is an inexact metric, Dr. Diptesh Kanojia, Lecturer in AI for Natural Language Processing at the University of Surrey, explained. However, quoting BLEU scores is standard practice in natural language processing research.
44 percent is a remarkable improvement statistically.
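To make the figure concrete: BLEU scores a machine translation by how many word sequences (n-grams) it shares with a human reference translation, with a penalty for output that is too short. The sketch below is a deliberately simplified, single-reference version limited to one- and two-word sequences, whereas standard BLEU uses sequences of up to four words and is computed over a whole test set. The function names and the example scores of 20.0 and 28.8 are hypothetical, chosen only to show how a 44 percent relative improvement is calculated.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    """Simplified sentence-level BLEU against a single reference.

    Not the full metric: no smoothing, whitespace tokenisation only,
    and n-grams limited to max_n (2 by default).
    """
    cand = candidate.split()
    ref = reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped overlap: an n-gram is credited at most as many times
        # as it appears in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0
        precisions.append(overlap / total)
    # Brevity penalty: punish candidates shorter than the reference.
    if len(cand) >= len(ref):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity * math.exp(sum(math.log(p) for p in precisions) / max_n)


def relative_improvement(old_score, new_score):
    """Percentage gain of new_score over old_score."""
    return (new_score - old_score) / old_score * 100


if __name__ == "__main__":
    print(simple_bleu("the cat sat", "the cat sat on the mat"))
    # Hypothetical scores, for illustration only.
    print(round(relative_improvement(20.0, 28.8), 1))
```

On these made-up numbers, a rise from 20.0 to 28.8 BLEU points is a 44 percent relative gain. The paper’s actual scores are not reproduced here.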
Though the work will be used to improve Facebook’s software, the language data and the translation algorithms will be made available to the public. This means that, for the first time, authoritative datasets on languages such as Eastern Yiddish, Northern Kurdish, and Cape Verdean Creole will be available for other researchers to use.
The Meta researchers checked the quality of both the algorithm and the underlying language data by having native speakers inspect the translations, a laborious task.
“Engaging with the community is laudable. They are not necessarily the ones who started this trend, but they are following good practice,” said Dr. Birch-Mayne, who also noted the effort’s limitations, including its reliance on native speakers based in Europe and the United States rather than in the languages’ home countries.
Some researchers criticized Meta for releasing the paper without peer review, accusing it of “peer review by media.”
MT pioneer Koehn defended the approach, saying that it was “common practice in the field… for better or worse” and that it helped research results to be communicated quickly.
Machine learning advancements
The paper is one of several recent advances in machine learning, which is improving much faster than researchers anticipated. A Google model released last week solved one-third of MIT undergraduate math problems with 50% accuracy, representing a significant performance improvement.
Although each new breakthrough sparks speculation about new forms of consciousness, most experts believe that AI systems are neither sentient nor intelligent and that they do little more than mimic the data they’re given. A robot revolt is not in the cards.
The greater risk of AI systems is that they will lead to disaster by instilling in humans a false confidence in the systems’ still very limited abilities. That is a very real possibility given the sensitivity of the tasks that translation could be used for at Facebook, which has previously been criticized for lacking native-speaking moderators to spot calls for violence on its platform.
Mr. Zuckerberg said that “the advances here will permit more than 25 billion translations every day across our apps,” which Facebook said could be used for detecting harmful content, securing elections, and reducing online sexual exploitation.
Dr. Birch-Mayne, who recently completed a three-year project with the BBC on 17 languages in Africa and India, warned against using machine translation for anything where accuracy is critical.
“You can’t trust these systems,” she explained. “It could be correct, but it could also be incorrect.”