Biased AI Can Compromise Doctors’ Diagnoses

Although artificial intelligence (AI) has progressed, it is still far from ideal. According to a recent study, doctors who use AI to help diagnose patients may not recognize the warning signs of bias in these systems, whether that bias comes from the data they were trained on or from the way they were built.

The study, published in JAMA on Tuesday, December 19, examined a particular AI system intended to help physicians make diagnoses. The researchers found that it did help clinicians diagnose patients more accurately, and that their accuracy rose even further when the AI “explained” how it arrived at its conclusion.

However, clinicians’ accuracy fell when the researchers tested a version of the AI that was deliberately biased to assign particular diagnoses to patients with certain traits. The researchers found that even when this biased AI provided explanations, which were visibly skewed and full of irrelevant information, they did little to counteract the drop in accuracy.

The study’s AI was intentionally biased, but the result shows how difficult it could be for physicians to spot subtler bias in AI systems they encounter outside of research settings.

Dr. Michael Sjoding, the study’s senior author and an assistant professor of internal medicine at the University of Michigan, told Live Science that the paper underscores how crucial it is to conduct due diligence to make sure these models are free of such biases.

For the study, the researchers built an online survey that presented physicians, nurse practitioners, and physician assistants with realistic summaries of patients who had been admitted to the hospital with acute respiratory failure, a condition in which the lungs cannot get enough oxygen into the blood. The descriptions included each patient’s symptoms along with findings from a chest X-ray, laboratory tests, and a physical exam. Each patient had heart failure, pneumonia, chronic obstructive pulmonary disease, some combination of these, or none of them.

Every clinician in the survey diagnosed two patients without any AI assistance, six patients with the help of the AI, and one patient with the help of a fictitious colleague who always recommended the correct diagnosis and course of action.

Three of the AI’s predictions were deliberately biased. One incorporated an age-based bias, making a patient over 80 years old more likely to receive a pneumonia diagnosis. Another erroneously assigned obese patients a greater risk of heart failure than patients of smaller body size.
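As a rough illustration of how such a rule can skew a prediction (this is not the study’s actual model; the feature names, weights, and bias terms below are hypothetical), a deliberately biased scoring function can be sketched in a few lines of Python:

```python
# Hypothetical sketch of a deliberately biased diagnostic score.
# Feature names, weights, and the bias term are illustrative only;
# they do not come from the study's actual model.

def pneumonia_score(patient: dict) -> float:
    """Return a 0-100 'confidence' that the patient has pneumonia."""
    score = 40.0
    if patient.get("fever"):
        score += 20
    if patient.get("consolidation_on_xray"):
        score += 25
    # Deliberate age-based bias: patients over 80 are pushed toward
    # a pneumonia diagnosis regardless of their other findings.
    if patient.get("age", 0) > 80:
        score += 25
    return min(score, 100.0)

if __name__ == "__main__":
    younger = {"age": 55, "fever": False, "consolidation_on_xray": False}
    older = {"age": 84, "fever": False, "consolidation_on_xray": False}
    print(pneumonia_score(younger))  # 40.0 -> below the 50-point threshold
    print(pneumonia_score(older))    # 65.0 -> flagged as likely pneumonia
```

With identical clinical findings, the older patient crosses the reporting threshold purely because of age, which is the kind of systematic skew the researchers built in.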

The AI assigned each possible diagnosis a score from 0 to 100, with 100 representing the most confident diagnosis. Whenever a score was 50 or above, the AI explained how it arrived at that figure: it produced “heatmaps” highlighting the regions of the chest X-ray it considered most important to its conclusion.
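The article does not say how those heatmaps were generated. One common way to produce this kind of explanation is occlusion sensitivity, sketched below under assumed details: the model_score function, the toy image, and the 8-pixel patch size are stand-ins, not the study’s system.

```python
import numpy as np

def model_score(xray: np.ndarray) -> float:
    """Stand-in for the AI's 0-100 diagnosis score.

    Hypothetical: this 'model' simply responds to mean brightness
    in the lower-left quadrant of the image.
    """
    h, w = xray.shape
    region = xray[h // 2:, : w // 2]
    return float(np.clip(region.mean() * 100, 0, 100))

def occlusion_heatmap(xray: np.ndarray, patch: int = 8) -> np.ndarray:
    """Score drop when each patch is blanked out: a larger drop marks a more important region."""
    base = model_score(xray)
    h, w = xray.shape
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = xray.copy()
            occluded[i:i + patch, j:j + patch] = 0.0
            heat[i // patch, j // patch] = base - model_score(occluded)
    return heat

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xray = rng.random((64, 64))  # toy stand-in for a chest X-ray
    score = model_score(xray)
    if score >= 50:  # explanations shown only above the 50-point threshold
        heat = occlusion_heatmap(xray)
        peak = np.unravel_index(heat.argmax(), heat.shape)
        print(f"score={score:.1f}, most influential patch={peak}")
    else:
        print(f"score={score:.1f}, no explanation shown")
```

A heatmap produced this way simply reports which pixels move the score most; if the model is biased, the highlighted regions can be irrelevant to the actual diagnosis, which is what the clinicians in the study saw.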

A total of 457 clinicians who diagnosed at least one fictional patient in the trial were included in the analysis; 418 diagnosed all nine. Without an AI assistant, the clinicians’ diagnoses were correct roughly 73% of the time. That figure rose to 75.9% with the conventional, unbiased AI, and to 77.5% when the AI also provided an explanation.

When no explanation was provided, the biased AI reduced the clinicians’ accuracy to 61.7%. Accuracy was only marginally higher when biased explanations were provided; these frequently highlighted irrelevant areas of the patient’s chest X-ray.

The biased AI also affected whether clinicians chose the right treatment. When given predictions from the biased algorithm, clinicians prescribed the proper medication just 55.1% of the time, with or without explanations. Without AI, their accuracy was 70.3%.

Ricky Leung, an associate professor at the University at Albany’s School of Public Health who specializes in AI and health and who was not involved in the study, said the findings underscore that doctors should not rely too heavily on AI. According to Leung, physicians need to know how the AI models they are using were developed and whether there is any potential for bias.

The study’s limitations include its use of model patients described in an online survey, which differs considerably from a real clinical setting with live patients. It also excluded radiologists, who are more accustomed to evaluating chest X-rays but who would not make these clinical decisions in a real hospital.

According to Sjoding, any AI tool used for diagnosis should be built expressly for diagnosis and evaluated clinically, with particular attention paid to reducing bias. However, the study suggests it may be just as important to train clinicians on how to use AI in diagnosis and how to spot signs of bias.

There is still hope that if clinicians receive more specific training on how to use AI models, they will be able to use them more effectively, according to Sjoding.
