Deep neural networks are the most widely used models for computer vision applications, largely because of their scalability. They generally owe their strong performance to supervised learning.
Supervised learning is a machine learning approach that trains models on labelled datasets. While it delivers strong performance, it comes at a high cost, as labelling data requires human labour. The cost is significantly higher when the labelling has to be done by an expert, such as a medical practitioner.
In such scenarios, semi-supervised learning (SSL) proves to be a powerful alternative. In SSL, learning takes place with a small amount of labelled data and a relatively larger set of unlabelled data. This removes the need to label all the data, as supervised learning requires.
Recently, a paper accepted at the NeurIPS 2020 conference described an SSL method called FixMatch, which achieves state-of-the-art performance across various SSL benchmarks such as CIFAR-10, even with very few labelled examples.
What is the FixMatch Algorithm?
FixMatch is essentially an SSL method that combines two mechanisms to produce artificial labels for unlabelled data: consistency regularisation and pseudo labelling. It also relies on two separate sets of augmentations, one weak and one strong.
First, the model predicts class probabilities for a weakly augmented version of an unlabelled image. Weak augmentation means changing the image only slightly, by methods such as flipping and shifting. A prediction is kept as a pseudo label only if its confidence exceeds a certain threshold. Next, the same model generates predictions for a strongly augmented version of the same image, where larger changes, such as distorting its colours, are applied.
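The confidence check described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `pseudo_labels`, the sample probabilities and the threshold value of 0.95 are assumptions made for the example, not details taken from the article.

```python
from typing import List, Optional

CONFIDENCE_THRESHOLD = 0.95  # assumed value, chosen only for illustration


def pseudo_labels(weak_probs: List[List[float]],
                  threshold: float = CONFIDENCE_THRESHOLD) -> List[Optional[int]]:
    """Return the predicted class for each confident prediction, else None."""
    labels: List[Optional[int]] = []
    for probs in weak_probs:
        confidence = max(probs)
        if confidence >= threshold:
            labels.append(probs.index(confidence))  # keep as pseudo label
        else:
            labels.append(None)  # too uncertain: no pseudo label for this image
    return labels


# Hypothetical class probabilities for three weakly augmented unlabelled images.
weak_probs = [
    [0.97, 0.02, 0.01],  # confident, becomes pseudo label 0
    [0.40, 0.35, 0.25],  # not confident, discarded
    [0.01, 0.03, 0.96],  # confident, becomes pseudo label 2
]
labels = pseudo_labels(weak_probs)
print(labels)  # [0, None, 2]
```

Images whose prediction falls below the threshold simply receive no pseudo label, so they do not influence training on that step.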
The pseudo label obtained from the weakly augmented image then serves as the target, and the loss is calculated against the model’s output for the strongly augmented image. Training on this loss pushes the two predictions to agree, which is a form of consistency regularisation.
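As a rough sketch of how such a loss could be computed, the snippet below takes the model's probabilities for the weak and strong versions of each image and returns a cross-entropy that only counts confident images. The function name `unlabelled_loss`, the threshold of 0.95, the use of cross-entropy and the sample numbers are assumptions for illustration rather than details confirmed by the article.

```python
import math
from typing import List


def unlabelled_loss(weak_probs: List[List[float]],
                    strong_probs: List[List[float]],
                    threshold: float = 0.95) -> float:
    """Masked cross-entropy between weak-view labels and strong-view predictions."""
    total = 0.0
    for weak, strong in zip(weak_probs, strong_probs):
        confidence = max(weak)
        if confidence < threshold:
            continue  # low-confidence images contribute nothing to the loss
        label = weak.index(confidence)  # artificial label from the weak view
        # Cross-entropy against a one-hot target reduces to -log p(label).
        total += -math.log(max(strong[label], 1e-12))
    # Average over the whole batch, including the images that were skipped.
    return total / len(weak_probs)


# Hypothetical predictions for two unlabelled images.
weak_probs = [[0.97, 0.02, 0.01], [0.40, 0.35, 0.25]]
strong_probs = [[0.60, 0.30, 0.10], [0.20, 0.50, 0.30]]
loss = unlabelled_loss(weak_probs, strong_probs)
print(round(loss, 4))  # 0.2554
```

Only the first image passes the confidence check here, so the loss is the negative log of 0.60 divided by the batch size of two.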
Consistency regularisation uses the weakly and strongly augmented versions of an image separately, with the end goal of forcing the SSL model to produce the same output for different ‘versions’ of the same image. Restricting pseudo labels to confident predictions improves their accuracy, and the model learns to keep its prediction unchanged even when the image is transformed. The disagreement between the predictions for the two augmented images is incorporated into the objective function, so minimising it makes the predictions more reliable.
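One way to write such an objective down is shown below. The notation is an assumption introduced for this sketch and does not appear in the article: $u_b$ is an unlabelled image, $\alpha$ and $\mathcal{A}$ are the weak and strong augmentations, $p_m$ is the model's predicted class distribution, $\tau$ is the confidence threshold, $H$ is cross-entropy, $\mu B$ is the number of unlabelled images in a batch, $\ell_s$ is the ordinary supervised loss on the labelled images, and $\lambda_u$ weights the unlabelled term.

```latex
\begin{aligned}
q_b &= p_m\big(y \mid \alpha(u_b)\big), \qquad \hat{q}_b = \arg\max(q_b) \\
\ell_u &= \frac{1}{\mu B} \sum_{b=1}^{\mu B}
          \mathbb{1}\big(\max(q_b) \ge \tau\big)\,
          H\big(\hat{q}_b,\; p_m(y \mid \mathcal{A}(u_b))\big) \\
\ell &= \ell_s + \lambda_u\, \ell_u
\end{aligned}
```

The indicator term drops every image whose weak-view confidence is below $\tau$, and the cross-entropy term penalises disagreement between the weak-view label and the strong-view prediction.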
FixMatch’s Performance Against Its Counterparts
The paper (referenced above) showed that FixMatch performed well across standard benchmarks such as CIFAR-10 and CIFAR-100. For example, on CIFAR-10, FixMatch achieved 94.93% accuracy with 250 labels and 88.61% accuracy with 40 labels, that is, just four labels per class.
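To put those label budgets in perspective, the short calculation below converts a total label count into labels per class and into a share of the training set. It assumes the standard CIFAR-10 layout of 10 classes and 50,000 training images; the helper names are hypothetical.

```python
CIFAR10_CLASSES = 10          # standard CIFAR-10 class count
CIFAR10_TRAIN_IMAGES = 50_000  # standard CIFAR-10 training set size


def labels_per_class(total_labels: int, classes: int = CIFAR10_CLASSES) -> int:
    """Number of labelled images per class for an evenly split label budget."""
    return total_labels // classes


def labelled_share(total_labels: int, train_images: int = CIFAR10_TRAIN_IMAGES) -> float:
    """Percentage of the training set that carries a label."""
    return 100 * total_labels / train_images


for budget in (40, 250):
    print(f"{budget} labels: {labels_per_class(budget)} per class, "
          f"{labelled_share(budget):.2f}% of training images")
```

With 40 labels, each class has only four labelled images, which is well under one percent of the training data.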