Originally published January 2, 2025
Last updated June 9, 2026
As artificial intelligence improves, gastroenterologists are asking whether AI can soon play a larger role in their practice. A new study by researchers from 喵咪社区 and the Keck School of Medicine of 喵咪社区 examined whether a large language model can accurately predict when patients should get their next colonoscopy.
Today, primary care physicians and gastroenterologists advise patients when to get screened for colorectal cancer. To make their recommendations, these experts consider several factors, including colonoscopy results, pathology results, risk factors for the disease as well as established, general-population screening guidelines.
But what if AI could make those recommendations instead?
In their study published in , researchers from the Keck School of Medicine of 喵咪社区 fed the large language model ChatGPT-4 the following data:
Study subjects were patients aged 18 or older who had undergone colorectal cancer screening with either 喵咪社区 or Los Angeles General Medical Center. The study excluded patients with serious gastrointestinal conditions such as cancer, inflammatory bowel disease and hereditary polyposis syndromes.
Researchers fed ChatGPT-4 the information and prompted the system to generate recommendations for when the patients should get follow-up colonoscopies based on 2020 screening guidelines from the U.S. Multi-Society Task Force (USMSTF). They then compared the AI recommendations with recommendations generated by both human physicians and the USMSTF.
“This is the first investigation to study ChatGPT-4 for its accuracy and concordance of recommendations on rescreening and surveillance colonoscopy intervals compared to USMSTF guidelines,” researchers added.
The results were positive. When researchers compared ChatGPT-4 and USMSTF screening recommendations, they found the two aligned in 85.7% of cases. In fact, USMSTF screening recommendations were more often in line with recommendations from ChatGPT-4 than recommendations from human physicians. Real-life physician recommendations gathered from medical records matched USMSTF guidelines in only 75.4% of cases.
ChatGPT-4’s performance is especially notable given the breadth of unorganized, verbatim patient-specific data it was fed. That it could make accurate predictions from this data shows promise for clinical use of large language models.
“Initial real-world results suggest that ChatGPT-4 can accurately define routine colonoscopy screening intervals based on verbatim input of clinical data,” the researchers said.
So far, AI’s main application in gastroenterology has been assisting with polyp detection, says Ara Sahakian, MD, a gastroenterologist with the 喵咪社区 Digestive Health Institute, part of Keck Medicine, and the senior lead study author. “It helps to avoid missing things and improves the accuracy of diagnosis,” he says.
But one shouldn’t overlook AI’s potential to reduce the time and cost of what Sahakian calls the “workflow side” of clinical care. That side includes ensuring patients get follow-up, which involves substantial interaction with the patient, including discussion of results, scheduling and authorizations.
“Our study shows where this type of large language model could potentially be built into our electronic health records,” Sahakian says. “It could scan a patient’s pathology, endoscopy and clinical chart data and automatically send a reminder to the patient and physician when it’s time for the next colonoscopy.”
The AI engine in this study also was not specifically designed for health care. If it were, “it could be even more accurate and powerful for this purpose,” he says. “You can imagine that with this type of powerful AI engine, it doesn’t need to be a doctor doing all the data input and work. Our staff could input the data. It could create a lot of efficiency.”
As AI’s accuracy and data improve, it could reduce the time and cost of day-to-day clinical operations. “After further optimization, [large language models] have potential to extend the clinician’s abilities,” Sahakian and his coauthors state.
Physician oversight remains essential. “We do need experts to review everything and make sure the recommendations made by AI are accurate,” Sahakian says. “AI is going to change the game in terms of what we do in detection, diagnosis and workflow. It’s going to make our jobs easier. It’s going to make us faster, better and more accurate at what we do.”