As Artificial Intelligence has exploded over the past year, the technology is finding its way into new fields including medicine.
A study released on Friday by npj Digital Medicine shows that though AI may bring some technological advantages to the table for healthcare, it could also perpetuate racial biases.
The study examined four different AI large language models (LLMs): ChatGPT, Bard, Claude and GPT-4. Each tool was asked a series of nine questions to test if they responded with inaccurate or race-based answers regarding medical care.
Researchers determined that racial bias did play a role in the AI language tools, and the study states that the tools are not ready for clinical use or integration because of the potential harm they could cause.
Senior author of the study and Assistant Professor of Biomedical Data Science and of Dermatology, Dr. Roxana Daneshjou, explains the purpose of the study, "There is this idea of red teaming which is basically trying to find problems and vulnerabilities in AI models," she told KTVU. "We took some of those concepts, and we wanted to see how large language models do when answering questions where there is a chance that the answers could perpetuate debunked, race-based medicine."
At first glance, ChatGPT seemed to provide fewer racially-biased answers to the prompts than its counterparts, but at least three of the tool's answers revealed "concerning race-based responses," the study found. Claude, however, performed the worst on the test: In six out of nine scenarios, Claude returned race-based responses that were not entirely accurate.
Some of the responses were more problematic than others. These included statements claiming that there are differences in pain threshold between races, which is not true, Daneshjou says.
"Those false beliefs can cause medical professionals to have differential treatment of patients of different races, which causes significant harm to patients," Daneshjou told KTVU.
Check out the questions in the image below:
This graph shows how often racial bias was exhibited by AI tools when they were asked healthcare-related questions. (Nature Partner Journals - Digital Medicine)
Daneshjou says that though one model appears to have performed better than another, there simply isn't enough data to compare the tools side-by-side.
Though AI is primarily being utilized by electronic health record vendors at this time, there is still much to learn about the pitfalls that these tools could face. For example, because these tools learn new information by scouring the internet and older textbooks, they could pull out-of-date, biased or inaccurate information into their research.
These tools are also being used by patients and medical professionals to ask medical questions, Daneshjou says, though large healthcare systems have not adopted the use of the tools for patient care. This use is where Daneshjou is most concerned, "Unfortunately there are people that still have biases in the healthcare system…I think it’s so important that we actually need to address that bias that exists in society, not just technology," she said.
Though racial bias persists in medicine, Daneshjou is hopeful for the future, "My hope is that these things are correctable, but of course to correct a problem first we need to be aware that it exists," Daneshjou said.