Microsoft and Nvidia create the most powerful language model to date

Language models have become a key ingredient in building ever more capable and accurate artificial intelligence. The new model developed by Microsoft and Nvidia has 530 billion parameters and achieves exceptional accuracy, especially in reading comprehension and in forming complex sentences.

The Megatron-Turing Natural Language Generation (MT-NLG) model from Nvidia and Microsoft sets a new record for language models. According to the two companies, it is the most powerful to date: with its 530 billion parameters it surpasses OpenAI's GPT-3 as well as Google's BERT, which specializes in natural language understanding. The model can read texts, reason about them, and draw conclusions in complete, precise sentences.

Language models are traditionally based on a statistical approach. Among the many methods, the n-gram model is the classic illustration: during the learning phase, a large number of texts are analyzed in order to estimate the probability that a word "fits" at a given point in a text. By the chain rule of probability, the probability of a sentence is the product of the conditional probabilities of each word given the words that precede it. From these probabilities, a model can assemble grammatical sentences; modern systems like MT-NLG replace raw counts with transformer neural networks, but the principle of predicting the next word remains.
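To make the idea concrete, here is a minimal sketch (my own illustration, not code from either company): a bigram model estimated from a toy corpus, where a sentence's probability is the product of each word's conditional probability given the word before it.

```python
from collections import Counter

# Toy corpus; real models are estimated from billions of tokens.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count unigrams and bigrams to estimate P(word | previous word).
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def bigram_prob(prev, word):
    # Maximum-likelihood estimate: count(prev, word) / count(prev).
    return bigrams[(prev, word)] / unigrams[prev]

def sentence_prob(words):
    # Chain rule: P(w1..wn) is approximated by the product of P(w_i | w_{i-1}).
    p = unigrams[words[0]] / len(corpus)  # P(w1)
    for prev, word in zip(words, words[1:]):
        p *= bigram_prob(prev, word)
    return p

print(sentence_prob("the cat sat on the mat .".split()))  # plausible, higher score
print(sentence_prob("the mat sat on the cat .".split()))  # implausible, scores 0
```

The goal is the same at any scale: assign high probability to fluent, well-formed text and low probability to text that is not.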

Biased algorithms still an issue

With 530 billion parameters, the MT-NLG model is particularly demanding. In machine learning, parameters are the values a model learns during training, and the parameter count is often used as a rough measure of a model's capacity. It has been shown repeatedly that models with more parameters, trained on larger datasets, ultimately perform better and produce more precise and nuanced language. These models are able to summarize books and texts and even write poems.
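A back-of-the-envelope calculation (my own illustration; apart from the published parameter count, the figures are assumptions) shows why that number is demanding: the weights alone do not fit on a single GPU.

```python
# Rough memory footprint of the published parameter count.
params = 530e9        # 530 billion parameters
bytes_per_param = 2   # assuming 16-bit (half-precision) floats

weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{weights_gb:,.0f} GB")  # ~1,060 GB

# Even an 80 GB GPU cannot hold the weights, let alone optimizer state,
# so training has to be sharded across many GPUs.
print(f"80 GB GPUs needed just to store the weights: ~{weights_gb / 80:.0f}")
```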

To train MT-NLG, Microsoft and Nvidia assembled their own dataset of approximately 270 billion "tokens" drawn from English-language websites. In natural language processing, tokens are the small pieces (words or word fragments) into which text is split before a model processes it. The sources include academic repositories such as arXiv and PubMed, reference and community sites such as Wikipedia and GitHub, as well as news articles and even social media posts.
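As a rough illustration of what tokenization means (a simplified sketch; real systems use learned subword vocabularies such as byte-pair encoding rather than a hand-written rule like this):

```python
import re

def simple_tokenize(text):
    # Illustrative only: lowercase the text, then split it into
    # runs of letters and individual punctuation marks.
    return re.findall(r"[a-z]+|[^\sa-z]", text.lower())

print(simple_tokenize("Language models read tokens, not sentences."))
# ['language', 'models', 'read', 'tokens', ',', 'not', 'sentences', '.']
```

Counting a corpus in tokens rather than words or pages is what makes a figure like "270 billion tokens" comparable across models.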

As always with language models, the main obstacle to widespread public use is algorithmic bias. The data used to train machine learning models contain the human stereotypes embedded in the texts; gender, racial, physical, and religious biases are common in these models, and they are especially difficult to remove.

For Microsoft and Nvidia, this is one of the biggest challenges with the model. Both companies say that the use of MT-NLG "should ensure that appropriate measures are taken to mitigate and minimize potential harm to users". The problem has to be tackled within these groundbreaking models themselves, and for now it does not look like it will be resolved any time soon.
