Improving ML With Big Data Driven Algorithms

Researchers at the Massachusetts Institute of Technology examined the "shortcut" problem, in which a model latches onto superficial features of the data rather than meaningful ones, in a popular machine learning method, and developed a solution: forcing the model to use richer information in the data when making decisions, so that it avoids these pitfalls.

By removing the simpler features, the researchers can redirect the model's attention toward more complex features of the data that it had been overlooking. They then ask the model to solve the task in two ways: once using the simpler features and once using the complex features. According to the researchers, this reduced the occurrence of shortcut solutions and improved the model's performance.
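To make the idea concrete, here is a minimal, hypothetical sketch of such a two-part objective in PyTorch: the model is asked to solve the same task once on the original images and once with an easy feature (color) removed, so it cannot rely on that feature alone. The tiny model, the color ablation, and all names are illustrative assumptions, not the researchers' actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-in for the real model (illustrative only).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

def two_way_loss(images, labels):
    # Way 1: solve the task with all features available, simple ones included.
    loss_simple = F.cross_entropy(model(images), labels)
    # Way 2: remove an easy feature by collapsing RGB to grayscale
    # (replicated across channels so the input shape is unchanged).
    gray = images.mean(dim=1, keepdim=True).expand_as(images)
    loss_complex = F.cross_entropy(model(gray), labels)
    # The model must succeed both ways, discouraging color-only shortcuts.
    return loss_simple + loss_complex

images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
two_way_loss(images, labels).backward()
```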

With this work, researchers could improve the efficiency of machine learning models that identify diseases in medical images, reducing the number of false diagnoses.

"It's always difficult to say why deep networks make the decisions they make, and more specifically, what parts of the data these networks choose to focus on when making a decision," said Joshua Robinson, a PhD student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and lead author of the article.

"If we can understand in more detail how shortcuts work, we can go even further to answer some of the fundamental but very practical questions that are really important to people looking to implement these networks."

The researchers focused their study on contrastive learning, a powerful form of self-supervised machine learning. In this method, the model is trained on raw data that has no labeled descriptions from humans, and it can be used successfully across a wide variety of data types.

In contrastive learning models, an encoder algorithm is trained to distinguish between pairs of similar inputs and pairs of dissimilar inputs. This process encodes rich and complex data in a form that a downstream learning model can interpret.
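As a concrete reference point, encoders of this kind are commonly trained with an InfoNCE-style contrastive loss, in which matched pairs in a batch are treated as similar and all other pairings as dissimilar. The sketch below is a generic version of that loss, not the paper's specific implementation.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.5):
    """z1[i] and z2[i] embed two views of the same input (a similar
    pair); z1[i] with z2[j], j != i, form the dissimilar pairs."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    # Cosine-similarity logits between every embedding pair in the batch.
    logits = z1 @ z2.t() / temperature
    # Each row's matching column is the positive; the rest are negatives.
    targets = torch.arange(z1.size(0))
    return F.cross_entropy(logits, targets)

loss = info_nce_loss(torch.randn(8, 128), torch.randn(8, 128))
```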

The team tested these encoders on a series of images and found that they, too, fell into shortcut solutions. In response, the researchers made it harder for the encoder to distinguish between similar and dissimilar pairs, and found that doing so changed which features the encoder examines when making a decision.

"If you make the task of discriminating between similar and dissimilar items harder and harder, your system is forced to learn more meaningful information in the data, because without learning that it cannot solve the task," explained Stefanie Jegelka, the X-Consortium Career Development Associate Professor in EECS and a member of CSAIL and IDSS.
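One way to realize this idea, sketched below under stated assumptions, is to perturb the embeddings before computing the contrastive loss so that similar pairs look less similar and dissimilar pairs look more similar. The perturbation direction and the strength `eps` here are illustrative choices, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def harder_pairs(anchor, positive, negatives, eps=0.1):
    """Shift the positive away from the anchor and the negatives toward
    it, so the discrimination task gets harder as eps grows."""
    anchor = F.normalize(anchor, dim=-1)                    # (B, D)
    harder_pos = F.normalize(positive - eps * anchor, dim=-1)
    harder_negs = F.normalize(
        negatives + eps * anchor.unsqueeze(1), dim=-1)      # (B, K, D)
    return harder_pos, harder_negs

anchor, positive = torch.randn(8, 128), torch.randn(8, 128)
negatives = torch.randn(8, 16, 128)  # K = 16 dissimilar items per anchor
hard_pos, hard_negs = harder_pairs(anchor, positive, negatives)
```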

To test this method, the researchers used vehicle images, adjusting the color, orientation, and type of each vehicle to make it harder for encoders to discriminate between pairs of similar and dissimilar images. The encoder improved its accuracy on all three features.
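A common way to quantify accuracy "on all three features" is to train a small linear probe on the frozen encoder embeddings for each attribute and report its classification accuracy. The sketch below assumes that setup, with random placeholder data standing in for the real embeddings and labels.

```python
import torch
import torch.nn.functional as F

def probe_accuracy(embeddings, labels, num_classes, steps=200, lr=0.1):
    """Fit a linear classifier on frozen embeddings and report accuracy."""
    probe = torch.nn.Linear(embeddings.size(1), num_classes)
    opt = torch.optim.SGD(probe.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(probe(embeddings), labels).backward()
        opt.step()
    preds = probe(embeddings).argmax(dim=1)
    return (preds == labels).float().mean().item()

z = torch.randn(256, 128)  # placeholder for frozen encoder embeddings
for name, n_classes in [("color", 5), ("orientation", 4), ("type", 10)]:
    y = torch.randint(0, n_classes, (256,))  # placeholder labels
    print(name, probe_accuracy(z, y, n_classes))
```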

To see whether the method would hold up on more complex data, the researchers also tested it with samples from a medical image database of chronic obstructive pulmonary disease (COPD). Again, the method led to simultaneous improvements across all the features they assessed.

While the study is essential to understanding what causes shortcuts and how to address them, the researchers explained that continuing to refine these methods would pave the way for future advances.

"This ties in with some of the most important questions about deep learning systems, such as 'Why do they fail?' and 'Can we know in advance the situations in which a model will fail?' There is still a lot to do if you want to understand shortcut learning in full," said Robinson.
