According to researchers, the ChatGPT artificial intelligence provides correct health information most of the time… but beware: in roughly 12% of cases, its answers are inaccurate or even fabricated.
- In February 2023, US researchers drew up a set of 25 questions about breast cancer screening advice and submitted them to the ChatGPT artificial intelligence. The tool answered 88% of them correctly.
- The chatbot gave three poor pieces of advice: one answer was based on outdated information, and the other two questions received inconsistent answers that varied from one attempt to the next.
- “ChatGPT sometimes invents fake journal articles or health consortia to support its claims,” the lead author warns.
While many students have had the idea of using ChatGPT to write their assignments, researchers at the University of Maryland School of Medicine (UMSOM) wanted to find out whether the chatbot’s health advice is reliable. Their conclusion: while the artificial intelligence provides correct information most of the time, it sometimes shares inaccurate, or even invented, data.
The work was published in the journal Radiology on April 4, 2023.
Breast cancer screening: nearly 9 out of 10 correct answers from ChatGPT
In February 2023, the UMSOM team compiled a list of 25 questions about breast cancer screening to put to ChatGPT. Each question was submitted to the program three times, since the tool, which is built on a learning system, is known to vary and refine its answers with repetition. Its advice was then evaluated by three radiologists specializing in mammography. The experts judged the answers appropriate for 22 of the 25 questions.
“We found that ChatGPT answered questions correctly about 88% of the time, which is pretty amazing,” explained the study’s author, Dr. Paul Yi, assistant professor of diagnostic radiology and nuclear medicine at UMSOM, in a press release from his institution. “It also has the added benefit of summarizing information in an easily digestible form for consumers to easily understand.” He notes that the chatbot correctly answered questions about breast cancer symptoms, who is at risk, and the cost, recommended age, and frequency of mammograms. The scientists noticed, however, that the answers were not always complete. “ChatGPT provided only one set of recommendations on breast cancer screening, issued by the American Cancer Society, but did not mention the differing recommendations issued by the Centers for Disease Control and Prevention (the US public health protection agency, Ed.) or the US Preventive Services Task Force.”
ChatGPT may present fake articles to support its answers
The team found further issues when it examined ChatGPT’s three errors more closely. The first answer gave outdated information about scheduling a mammogram around Covid-19 vaccination: the tool advised waiting four to six weeks after the injection, a recommendation the CDC withdrew in February 2022 after new research showed there was no need to delay the exam.
The other two problematic questions received inconsistent answers that varied significantly each time they were asked. More worryingly, the tool can present false evidence.
“We have seen in our experience that ChatGPT sometimes makes up fake journal articles or fake health consortia to support its claims,” warned Dr. Yi. “Consumers should be aware that these are new, unproven technologies and should always rely on their doctor, rather than ChatGPT, for advice,” he concluded.
The team plans to repeat the experiment with questions about lung cancer. It will also look for ways to improve ChatGPT’s recommendations so that they are more precise and easier to understand for people without a scientific background.
Dr. Mark T. Gladwin, Dean of the University of Maryland School of Medicine, commented on his researchers’ findings: “With the rapid evolution of ChatGPT and other large language models, we have a responsibility as a medical community to evaluate these technologies and protect our patients from potential harm that may come from incorrect screening recommendations or outdated preventive health guidance.”