Does this AI think like a human?

In machine learning, understanding why a model makes certain decisions is often just as important as whether those decisions are correct. For example, a machine-learning model may correctly predict that a skin lesion is cancerous, but it may have done so based on an unrelated blip in a clinical photo.

While tools exist to help experts make sense of a model’s reasoning, these methods typically provide insight into only one decision at a time, and each must be evaluated manually. Models are often trained on millions of data inputs, making it nearly impossible for humans to evaluate enough decisions to identify patterns.

Now, researchers at MIT and IBM Research have developed a technique that lets a user aggregate, sort, and rank these individual explanations to rapidly analyze the behavior of a machine-learning model. Their technique, called Shared Interest, includes quantifiable metrics that compare how well a model’s reasoning matches that of a human.

Shared Interest could assist a user in quickly identifying concerning trends in a model’s decision-making — for instance, perhaps the model is frequently confused by distracting, irrelevant features, such as background objects in photos. Aggregating these insights could assist the user in quickly and quantitatively determining whether a model is trustworthy and ready to be deployed in a real-world situation.

“Our goal in developing Shared Interest was to be able to scale up this analysis process so that one could understand on a more global level what their model’s behavior is,” says lead author Angie Boggust, a graduate student in the Visualization Group of the Computer Science and Artificial Intelligence Laboratory (CSAIL).

Boggust collaborated on the paper with her advisor, Arvind Satyanarayan, an assistant professor of computer science who leads the Visualization Group, as well as IBM Research’s Benjamin Hoover and senior author Hendrik Strobelt. The paper will be presented at the Conference on Human Factors in Computing Systems.

Boggust started working on this project during a summer internship at IBM, where she was mentored by Strobelt. When Boggust and Satyanarayan returned to MIT, they expanded the project and continued their collaboration with Strobelt and Hoover, who helped deploy case studies demonstrating how the technique could be used in practice.

Human-AI alignment

Shared Interest makes use of saliency methods, which are popular techniques for demonstrating how a machine-learning model made a specific decision. If the model is classifying images, saliency methods highlight areas of the image that the model considered important when making its decision. These areas are depicted as a type of heatmap known as a saliency map, which is frequently superimposed on the original image. If the model identified the image as a dog and highlighted the dog’s head, it means those pixels were important to the model when it determined the image contained a dog.
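
To make the idea concrete, here is a minimal sketch of a gradient-based saliency map in PyTorch. The model, the random tensor standing in for a preprocessed photo, and the per-pixel importance rule are illustrative assumptions, not necessarily the saliency methods the researchers used.

```python
# Minimal sketch of gradient-based saliency (illustrative assumption;
# not necessarily the method used in the Shared Interest work).
import torch
import torchvision.models as models

# Randomly initialized so the example is self-contained; in practice you
# would load a trained classifier here.
model = models.resnet18(weights=None)
model.eval()

# A random tensor stands in for a preprocessed 224x224 RGB photo.
image = torch.randn(1, 3, 224, 224, requires_grad=True)

scores = model(image)
predicted_class = scores.argmax(dim=1).item()
scores[0, predicted_class].backward()

# Per-pixel importance: largest gradient magnitude across the color channels.
# Overlaying this 224x224 map on the image gives the saliency heatmap.
saliency_map = image.grad.abs().max(dim=1).values.squeeze()
```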

Shared Interest compares saliency methods to ground-truth data. In an image dataset, ground-truth data are typically human-generated annotations, such as bounding boxes that surround the relevant parts of each image. In the previous example, the box would completely enclose the dog in the photo. When evaluating an image classification model, Shared Interest compares the model-generated saliency data and the human-generated ground-truth data for the same image to see how well they align.

The technique employs several metrics to quantify that alignment (or misalignment) and then categorizes a specific decision into one of eight groups. The categories range from perfectly human-aligned (the model predicts correctly and the highlighted area in the saliency map matches the human-generated box) to completely distracted (the model makes an inaccurate prediction and does not utilize any image features found in the human-generated box).
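
As a rough illustration (a simplified sketch, not the paper’s exact metric definitions), one could binarize the saliency map, compare it with the human-annotated mask, and compute overlap scores along these lines:

```python
# Simplified alignment scores between a binarized saliency mask and a
# human-annotated ground-truth mask (illustrative, not the paper's exact math).
import numpy as np

def alignment_metrics(saliency_mask, truth_mask):
    """Both inputs are boolean arrays of the same shape."""
    saliency = saliency_mask.astype(bool)
    truth = truth_mask.astype(bool)
    intersection = np.logical_and(saliency, truth).sum()
    union = np.logical_or(saliency, truth).sum()
    return {
        # Overlap relative to everything either the model or the human marked.
        "iou": intersection / union if union else 0.0,
        # How much of the human-annotated region the model attended to.
        "ground_truth_coverage": intersection / truth.sum() if truth.sum() else 0.0,
        # How much of the model's attention fell inside the annotated region.
        "saliency_coverage": intersection / saliency.sum() if saliency.sum() else 0.0,
    }

# Toy example: the model's salient pixels sit partly inside the annotated box.
truth = np.zeros((8, 8), dtype=bool); truth[2:6, 2:6] = True
salient = np.zeros((8, 8), dtype=bool); salient[3:5, 3:7] = True
print(alignment_metrics(salient, truth))
```

Roughly speaking, high overlap plus a correct prediction corresponds to the human-aligned end of the spectrum, while a wrong prediction that ignores the annotated region corresponds to the distracted end.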

“On one end of the spectrum, your model decided for the same reason that a human did, and on the other end, your model and the human are making this decision for completely different reasons. By quantifying that for all of the images in your dataset, you can sort through them,” Boggust explains.

The technique works similarly with text-based data, highlighting keywords rather than image regions.
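
The same comparison can be sketched for text by treating tokens like pixels. In this toy example (the flagged words and the annotation are invented for illustration), overlap is measured between the model’s highlighted tokens and a human-annotated rationale:

```python
# Toy sketch of the text analogue: compare model-highlighted tokens with a
# human-annotated rationale (all data here is invented for illustration).
model_salient = {"cold", "wonderful"}              # tokens a saliency method flagged
human_rationale = {"service", "was", "wonderful"}  # tokens a person marked as relevant

overlap = model_salient & human_rationale
print(f"ground-truth coverage: {len(overlap) / len(human_rationale):.2f}")
print(f"saliency coverage:     {len(overlap) / len(model_salient):.2f}")
```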

Rapid analysis

The researchers used three case studies to demonstrate how Shared Interest could be useful to both nonexperts and machine-learning researchers.

In the first case study, they used Shared Interest to help a dermatologist decide whether to trust a machine-learning model designed to detect cancer from photos of skin lesions. With Shared Interest, the dermatologist was able to quickly see examples of the model’s correct and incorrect predictions. In the end, the dermatologist decided he could not trust the model because it made too many predictions based on image artifacts rather than actual lesions.

“The value here is that, using Shared Interest, we can see these patterns emerge in our model’s behavior. In about half an hour, the dermatologist was able to make a confident decision about whether or not to trust the model and whether or not to deploy it,” Boggust says.

In the second case study, they collaborated with a machine-learning researcher to demonstrate how Shared Interest can evaluate a specific saliency method by revealing previously unknown flaws in the model. The researcher was able to analyze thousands of correct and incorrect decisions in a fraction of the time required by traditional manual methods.

They used Shared Interest in the third case study to delve deeper into a specific image classification example. They were able to conduct a what-if analysis by manipulating the image’s ground-truth area to determine which image features were most important for specific predictions.

The researchers were impressed with how well Shared Interest performed in these case studies, but Boggust warns that the technique is only as good as the saliency methods on which it is based. If those techniques are biased or incorrect, Shared Interest will inherit those limitations.

The researchers hope to apply Shared Interest to different types of data in the future, particularly tabular data used in medical records. They also intend to use Shared Interest to aid in the improvement of current saliency techniques. Boggust hopes that this research will spur further research into quantifying machine-learning model behavior in ways that make sense to humans.

This research is supported in part by the MIT-IBM Watson AI Lab, the US Air Force Research Laboratory, and the US Air Force Artificial Intelligence Accelerator.
