Concept whitening has been introduced to explain the inner workings of deep neural networks
Over the past few decades, deep neural networks have reached unprecedented heights on a variety of tasks. They are typically complex, extremely large and multi-layered, and that very complexity makes them difficult, and sometimes impossible, to interpret. This lack of interpretability leads to situations where the mechanism is judged untrustworthy and unreliable. Instead of attempting to analyse a neural network post hoc, researchers have introduced a mechanism called 'concept whitening' that lets us understand the computation leading up to a given layer.
The inner mechanism of a neural network is often likened to a cobweb: nobody quite knows how everything is connected, yet it holds together and functions successfully. For almost a decade, starting around 2010 when deep learning became mainstream, researchers have been working on techniques to explain neural networks through their results and learned parameters, but those explanations have so far been unclear and often misleading. Interpretability in deep learning is undoubtedly important, yet the computations of neural networks remain hard to understand. Therefore, a new approach known as 'concept whitening' has been introduced to explain the inner workings of deep neural networks.
In deep learning, the latent space plays a pivotal role. The latent space is an abstract multi-dimensional space containing feature values that we cannot interpret directly, but it encodes a meaningful internal representation of externally observed events. It gives the computer a quantitative, spatial representation of a broad range of topics and events. Learning a latent space helps a deep learning model make better sense of observed data than the raw data itself, which is a far larger space to learn from, and a model with the right architecture should be able to discriminate between different types of input from that representation. In an AI model, the latent space corresponds to the layers of the deep learning model: each layer encodes features as a set of numerical values that the network learns to produce through its parameters. While the lower layers of a multilayered convolutional neural network learn basic features such as corners and edges, higher layers learn to detect more complex features such as faces, objects and full scenes.
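As a minimal sketch of what "latent space" means in practice, the snippet below registers forward hooks on an early and a late block of a torchvision ResNet-18 (the layer names layer1 and layer4 follow torchvision's implementation; any CNN with accessible intermediate layers would do) and prints the shapes of their activations. Each activation tensor is a point in that layer's latent space: early layers yield many fine-grained feature maps, late layers yield fewer, coarser, more abstract ones. It assumes a recent torchvision with the weights= keyword.

```python
# Sketch: inspecting a CNN's latent space via forward hooks (illustrative only).
import torch
from torchvision import models

model = models.resnet18(weights=None)   # untrained weights are enough to see the shapes
model.eval()

activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# layer1 = early residual block, layer4 = last residual block
model.layer1.register_forward_hook(save_activation("layer1"))
model.layer4.register_forward_hook(save_activation("layer4"))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))  # dummy image batch

for name, feat in activations.items():
    # each activation tensor is one layer's latent representation of the input
    print(name, tuple(feat.shape))
# layer1 -> (1, 64, 56, 56): many fine-grained, low-level feature maps
# layer4 -> (1, 512, 7, 7):  fewer, coarser, high-level feature maps
```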
A concept whitening module is added to a convolutional neural network so that the latent space becomes aligned with known concepts of interest. Concept whitening is a module that replaces batch normalisation: it constrains the latent space to represent target concepts and also provides a straightforward means to extract them. It does not force the concepts to be learned as an intermediate step; rather, it imposes an alignment of the latent space along the concept axes. Concept whitening therefore increases the interpretability of a neural network model and gives a much clearer picture of how the network gradually learns concepts over the layers. The module can be inserted into a neural network in place of a batch normalisation module.
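To make the "drop-in replacement for batch norm" idea concrete, here is a simplified sketch of the whitening step only. It is an assumed illustration, not the authors' ConceptWhitening code: like batch norm it standardises activations, but it also decorrelates the channels with a ZCA transform. The real module additionally learns an orthogonal rotation that aligns individual axes with labelled concepts, which is omitted here.

```python
# Simplified channel whitening that can sit where a BatchNorm2d layer would go.
import torch
import torch.nn as nn

class SimpleWhitening2d(nn.Module):
    def __init__(self, num_channels: int, eps: float = 1e-5):
        super().__init__()
        self.num_channels = num_channels
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        flat = x.permute(1, 0, 2, 3).reshape(c, -1)        # (C, N*H*W)
        mean = flat.mean(dim=1, keepdim=True)
        centered = flat - mean
        cov = centered @ centered.t() / centered.shape[1]  # (C, C) channel covariance
        eye = torch.eye(c, device=x.device, dtype=x.dtype)
        eigvals, eigvecs = torch.linalg.eigh(cov + self.eps * eye)
        # ZCA whitening matrix: V diag(1/sqrt(lambda)) V^T
        whitener = eigvecs @ torch.diag(eigvals.clamp_min(self.eps).rsqrt()) @ eigvecs.t()
        out = whitener @ centered                          # decorrelated, unit-variance channels
        return out.reshape(c, n, h, w).permute(1, 0, 2, 3)

# Dropping it in where a BatchNorm2d layer would normally sit:
layer = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), SimpleWhitening2d(16), nn.ReLU())
y = layer(torch.randn(8, 3, 32, 32))
print(y.shape)  # torch.Size([8, 16, 32, 32])
```

After whitening, the channel activations are uncorrelated and have unit variance, so rotating the basis so that particular axes line up with particular concepts does not change the information the layer carries; that rotation is the part concept whitening adds on top of this sketch.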
Applying concept whitening in a deep neural network model
Researchers from the Prediction Analysis Lab at Duke University, headed by Professor Cynthia Rudin, have published a paper in Nature Machine Intelligence on using concept whitening for deep neural network interpretability. Instead of relying on post hoc analysis, the researchers build the mechanism into the neural network itself to disentangle the latent space, making its axes align with known concepts. This disentanglement gives a much clearer understanding of how the network gradually learns concepts over the layers. At its core, the module performs a whitening transformation that resembles the way a signal is transformed into white noise. Through many rounds of experiments, the researchers found that concept whitening can be applied to any layer of a deep neural network to gain interpretability without hurting performance. Further directions of the research include organising concepts in hierarchies and disentangling clusters of concepts rather than individual concepts.