A team of Google researchers has developed a new artificial intelligence (AI) model that they claim could have a major impact on medical research and clinical applications. Led by Shekoofeh Azizi, an AI resident at Google Research, the team created a self-supervised deep neural network designed to improve the diagnostic efficiency of clinical algorithms.
Azizi and her team developed what is known as a “self-supervised learning model”, called Multi-Instance Contrastive Learning (MICLe). The central premise of self-supervised machine learning models is that they are trained on unlabeled data, which enables the use of AI in areas where collecting well-defined data sets can be difficult, such as cancer research.
In her paper, Azizi writes: “We conduct experiments on two distinct tasks: dermatology skin condition classification from digital camera images and multi-label chest X-ray classification, and demonstrate that self-supervised learning on ImageNet, followed by additional self-supervised learning on unlabeled domain-specific medical images, significantly improves the accuracy of medical image classifiers. We introduce the novel MICLe method that uses multiple images of the underlying pathology per patient case, when available, to construct more informative positive pairs for self-supervised learning.”
MICLe builds on Google’s existing research on self-supervised convolutional neural network models. At the International Conference on Machine Learning (ICML) 2020, Google researchers presented the Simple Framework for Contrastive Learning of Visual Representations (SimCLR), on which MICLe is based. Simply put, SimCLR generates multiple augmented views of the same image and trains the network to produce similar representations for views of the same image and dissimilar representations for views of different images, which makes the learned representations more robust and accurate.
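In rough terms, the contrastive objective at the heart of SimCLR can be pictured as follows. This is a minimal sketch in PyTorch, assuming a generic encoder and placeholder hyperparameters; it is an illustration of the technique, not Google’s implementation:

```python
# Minimal sketch of a SimCLR-style contrastive (NT-Xent) loss.
# Illustrative only; the encoder, temperature and batch handling
# are placeholder assumptions, not Google's code.
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """z1, z2: [batch, dim] embeddings of two augmented views of the
    same batch of images. Matching rows are positive pairs; every
    other embedding in the batch acts as a negative."""
    batch = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # [2B, dim]
    sim = z @ z.t() / temperature                       # cosine similarities
    sim.fill_diagonal_(float("-inf"))                   # never match with self
    # For row i, the positive view sits at i + B (mod 2B).
    idx = torch.arange(batch, device=z.device)
    targets = torch.cat([idx + batch, idx])
    return F.cross_entropy(sim, targets)

# Usage: loss = nt_xent_loss(encoder(augment(x)), encoder(augment(x)))
```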
For MICLe, the researchers drew on multiple images per patient that carried no labels. In a first stage, the network was pre-trained on ImageNet, a large general-purpose image repository, to give the model a broad visual starting point. Azizi said her team then applied a second stage of training on unlabeled medical images, constructing image pairs that allowed the neural network to learn multiple representations of the same underlying condition, which is fundamental in medical research.
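The key twist is how positive pairs are chosen. The following is a hypothetical sketch of that step, with illustrative names rather than the paper’s actual code: when a patient case contains several images, two different images of the same pathology form a positive pair; otherwise the method falls back to pairing augmented views of a single image, as in SimCLR:

```python
# Hypothetical sketch of MICLe-style positive-pair construction.
# Function and variable names are illustrative assumptions.
import random

def micle_pairs(cases):
    """cases: list of patient cases, each a list of unlabeled images
    (tensors or file paths) captured for that patient."""
    pairs = []
    for images in cases:
        if len(images) >= 2:
            # Two distinct images of the same underlying pathology
            # form a positive pair.
            a, b = random.sample(images, 2)
        else:
            # Fall back to SimCLR behaviour: pair the single image
            # with itself and let augmentation make the views differ.
            a = b = images[0]
        pairs.append((a, b))
    return pairs

# Each pair is then augmented and fed to the same contrastive loss
# as in SimCLR; only the choice of positives changes.
```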
In clinical practice, images of the same condition are often captured from different angles and under varying conditions, because medical imaging cannot be staged or choreographed. After the two pretraining stages, the researchers fine-tuned the model on a very limited set of labeled images for the target task. According to the researchers, besides increasing accuracy, this approach can also significantly reduce the cost and time required to develop artificial intelligence models for medical research.
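That final step can be pictured as a standard supervised fine-tuning loop over the small labeled set. The sketch below is a minimal illustration in PyTorch; the encoder, feature size, data loader, and hyperparameters are assumptions for the example, not the published configuration:

```python
# Minimal sketch of the final supervised fine-tuning stage on a
# small labeled set. Encoder, feature size, data loader and
# hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

def fine_tune(encoder, labeled_loader, feat_dim, num_classes, epochs=10):
    # Attach a lightweight classification head to the pretrained encoder.
    model = nn.Sequential(encoder, nn.Linear(feat_dim, num_classes))
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in labeled_loader:  # the limited labeled set
            opt.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            opt.step()
    return model
```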
“MICLe yields an improvement of 6.7% in top-1 accuracy in dermatology classification and of 1.1% in mean area under the curve (AUC) in chest X-ray classification, outperforming strong supervised baselines pretrained on ImageNet. The models are robust to distribution shift and can learn efficiently with a small number of labeled medical images,” Azizi summed up in her paper.