Applying large language model artificial intelligence for retina International Classification of Diseases (ICD) coding

Ong, Joshua
Kedia, Nikita
Harihar, Sanjana
Vupparaboina, Sharat Chandra
Singh, Sumit Randhir
Venkatesh, Ramesh
Vupparaboina, Kiran
Bollepalli, Sandeep Chandra
Chhablani, Jay
Retina, Large language model (LLM), Artificial intelligence (AI), International Classification of Diseases (ICD)
Ong, J., Kedia, N., Harihar, S., Vupparaboina, S. C., Singh, S. R., Venkatesh, R., Vupparaboina, K., Bollepalli, S. C., & Chhablani, J. (2023). Applying large language model artificial intelligence for retina International Classification of Diseases (ICD) coding. Journal of Medical Artificial Intelligence, vol. 6, art. no. 21.

Background: Large language models (LLMs) such as ChatGPT have emerged as a potentially powerful application in medicine. One of their strengths is the ability to analyze text and perform certain tasks. International Classification of Diseases (ICD) codes are universally utilized in medicine and serve as a uniform platform for insurance and billing. However, coding ICDs after each patient encounter is time-consuming for physicians, particularly in fast-paced settings such as retina clinics. Additionally, searching for the most specific, correct ICD code takes additional time, which may lead providers to elect more general ICD codes. LLMs may help to relieve this burden by analyzing notes written by a provider and automatically generating an ICD code that can be used for the encounter.
Methods: In this study, we evaluate the ability of ChatGPT to analyze retina encounters and generate ICD codes for each encounter without any feedback. Text of mock retina clinic encounters covering various types of visits, including new patient visits, return visits, post-operative visits, and injection-only visits, was generated by three retina specialists.
Results: A total of 181 retina encounters were evaluated, comprising 84 right eyes and 97 left eyes. A total of 597 ICD codes were generated, of which 305 were retina codes (1.68 retina codes per eye). In total, 127/181 (70%) of responses were true positives, with at least one of the generated codes being a correct code for the encounter that a retina specialist could choose from. The remaining 54/181 (30%) responses did not generate any correct code from the text. Further analysis showed that ChatGPT generated a completely correct response in 106 of 181 encounters (59%), with the remaining 75 encounters (41%) containing some form of incorrect ICD code even when the response included the correct diagnosis.
Conclusions: This pilot study showcases the potential of ChatGPT, without feedback, to alleviate physician burden in coding ICDs by generating a selection of codes. Further research is required to validate this potential use of LLMs to reduce the burden on providers through ICD code generation.
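
The proportions reported in the Results can be rechecked from the raw counts. The following is a minimal sketch for that arithmetic only: the counts are taken from the abstract, while the variable names and the `pct` helper are illustrative assumptions and not part of the study.

```python
# Counts as reported in the abstract's Results section.
TOTAL_ENCOUNTERS = 181
RIGHT_EYES, LEFT_EYES = 84, 97
RETINA_CODES = 305
AT_LEAST_ONE_CORRECT = 127   # responses with at least one correct code
NO_CORRECT = 54              # responses with no correct code
COMPLETELY_CORRECT = 106     # responses that were entirely correct
SOME_INCORRECT = 75          # responses containing some incorrect code


def pct(count, total=TOTAL_ENCOUNTERS):
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


# Each pair of categories should partition the 181 encounters.
assert RIGHT_EYES + LEFT_EYES == TOTAL_ENCOUNTERS
assert AT_LEAST_ONE_CORRECT + NO_CORRECT == TOTAL_ENCOUNTERS
assert COMPLETELY_CORRECT + SOME_INCORRECT == TOTAL_ENCOUNTERS

# Retina codes per encounter (one eye per encounter in this dataset).
retina_codes_per_eye = RETINA_CODES / TOTAL_ENCOUNTERS

summary = {
    "at least one correct code (%)": pct(AT_LEAST_ONE_CORRECT),
    "no correct code (%)": pct(NO_CORRECT),
    "completely correct (%)": pct(COMPLETELY_CORRECT),
    "some incorrect code (%)": pct(SOME_INCORRECT),
}

for label, value in summary.items():
    print(f"{label}: {value}")
print(f"retina codes per eye: {retina_codes_per_eye:.3f}")
```

The four percentages reproduce the 70%, 30%, 59%, and 41% given in the abstract. The ratio 305/181 is approximately 1.685, which the abstract reports as 1.68.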

This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See:
AME Publishing Company
Journal of Medical Artificial Intelligence
vol. 6, art. no. 21