Discovering Multimodal Neurons With OpenAI

May 3, 2021

Multimodal neurons can respond to a cluster of abstract concepts centred around a common high-level theme rather than a specific visual feature.

In a major breakthrough, researchers at OpenAI have discovered neurons within an artificial neural network that resemble the multimodal neurons found in the human brain. The model in which they were found is one of the most advanced neural networks to date.

The researchers have found these advanced neurons can respond to a cluster of abstract concepts centred around a common high-level theme rather than a specific visual feature. Like their biological counterparts, these neurons can respond to a range of emotions, animals, photographs, drawings and famous people.

The researchers wrote that these neurons in CLIP respond to the same concept, whether it is presented literally, symbolically, or conceptually.

The multimodal neurons were discovered in CLIP, a model that connects text and images by learning visual concepts from natural language supervision. This general-purpose vision system matches the performance of a ResNet-50 but outperforms existing vision systems on some of the most challenging datasets. One example is the ‘Spider-Man’ neuron, which responds to an image of a spider, an image of the text ‘spider’, and the comic book character ‘Spider-Man’.

The Study

The researchers found multimodal neurons in several CLIP models of varying sizes, but they focused on studying the mid-sized RN50-x4 model. Researchers employed two tools to understand the activations of the model:

Feature visualisation, which maximises the neuron’s firing by doing gradient-based optimisation on the input.
Dataset examples, which look at the distribution of maximally activating images for a neuron from a dataset.
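
The two tools can be illustrated with a minimal sketch on a toy neuron. This is not the researchers' code: the tanh neuron, its weights, the random four-value "images", the step count and the learning rate are all assumptions chosen to keep the example small and self-contained. Feature visualisation is shown as gradient ascent on the input, and dataset examples as a simple ranking of inputs by activation.

```python
import math
import random

def activation(weights, x):
    """Toy 'neuron' (an assumption): tanh of a weighted sum of the input."""
    return math.tanh(sum(w * v for w, v in zip(weights, x)))

def normalise(x):
    """Scale a vector to unit length so the optimisation stays bounded."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]

def feature_visualisation(weights, steps=300, lr=1.0, seed=0):
    """Gradient ascent on the input to maximise the toy neuron's activation."""
    rng = random.Random(seed)
    x = normalise([rng.uniform(-1, 1) for _ in weights])
    for _ in range(steps):
        a = activation(weights, x)
        # Analytic gradient: d tanh(w.x) / dx_i = (1 - tanh(w.x)^2) * w_i
        grad = [(1 - a * a) * w for w in weights]
        # Take a step uphill, then project back onto the unit sphere.
        x = normalise([v + lr * g for v, g in zip(x, grad)])
    return x

def dataset_examples(weights, dataset, k=3):
    """Return the k dataset inputs that activate the neuron most strongly."""
    ranked = sorted(dataset, key=lambda x: activation(weights, x), reverse=True)
    return ranked[:k]

# Hypothetical neuron weights and a hypothetical dataset of random inputs.
weights = [0.5, -1.0, 2.0, 0.0]
best_input = feature_visualisation(weights)

rng = random.Random(1)
dataset = [normalise([rng.uniform(-1, 1) for _ in weights]) for _ in range(100)]
top_examples = dataset_examples(weights, dataset)
```

In this toy setting the optimised input ends up pointing in the same direction as the neuron's weights, while the dataset examples show which of the available inputs come closest to that ideal. Real feature visualisation works on full images and a deep network, and typically needs extra regularisation to produce readable pictures.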

The researchers carried out a series of carefully constructed experiments to probe these neurons’ capabilities in the convolutional layer. Each layer consists of thousands of neurons. “For our preliminary analysis, we looked at feature visualisations, the dataset examples that most activated the neuron, and the English words that most activated the neuron when rastered as images,” the researchers said. Many of these neurons deal with sensitive topics, from political figures to emotions.

The experiment revealed an incredible diversity of features such as region neurons, person neurons, emotion neurons, art style neurons, time neurons, abstract neurons, colour neurons and more.

Researchers found that a majority of neurons in CLIP are readily interpretable. “From an interpretability perspective, these neurons can be seen as extreme examples of ‘multi-faceted neurons’ which respond to multiple distinct cases. Looking to neuroscience, they might sound like ‘grandmother neurons,’ but their associative nature distinguishes them from how many neuroscientists interpret that term,” the researchers stated.

Researchers also studied how these multimodal neurons offer insight into how CLIP performs classification tasks, such as image and text classification.
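
As a rough illustration of classification in a model that connects text and images, the sketch below scores one image embedding against several text embeddings by cosine similarity and turns the scores into probabilities with a softmax. The three-value embeddings, the label prompts and the temperature value are made up for the example; a real model would produce much larger embeddings from its image and text encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(image_embedding, text_embeddings, temperature=0.07):
    """Softmax over image-text similarities; returns label -> probability."""
    scores = {
        label: cosine(image_embedding, emb) / temperature
        for label, emb in text_embeddings.items()
    }
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Hypothetical embeddings: stand-ins for encoder outputs, not real model values.
text_embeddings = {
    "a photo of a spider": [0.9, 0.1, 0.0],
    "a photo of a dog": [0.0, 1.0, 0.1],
    "a photo of a car": [0.1, 0.0, 1.0],
}
image_embedding = [0.8, 0.2, 0.1]

probs = zero_shot_classify(image_embedding, text_embeddings)
prediction = max(probs, key=probs.get)
```

Because the label set is just a list of text prompts, new classes can be added without retraining, which is what makes the approach "zero-shot".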

Not Fool-Proof

Artificial neural networks are loosely modelled on their biological counterparts in how they process data. The drawback is that it is difficult to understand why a network makes certain decisions and how it reaches a particular conclusion.

The researchers said that despite being trained on a curated subset of the internet, CLIP still inherits its many unchecked biases and associations. “…we have discovered several cases where CLIP holds associations that could result in representational harm, such as denigration of certain individuals or groups,” the researchers stated. For instance, a “Middle East” neuron was associated with terrorism, and an “immigration” neuron responded to Latin America.

Whether the model is fine-tuned or used zero-shot, the researchers said these biases and associations would likely remain in the system. The CLIP findings are still evolving, and much research remains to be done before multimodal systems are well understood. In a bid to advance the area, the researchers have shared their tools, dataset examples, text feature visualisations, and more with the community.

This article has been published from the source link without modifications to the text. Only the headline has been changed.

Source link
