Applying large language model artificial intelligence for retina International Classification of Diseases (ICD) coding
Abstract
Background: Large language models (LLMs) such as ChatGPT have emerged as potentially powerful tools in medicine. One of their strengths is the ability to analyze text and perform tasks based on it. International Classification of Diseases (ICD) codes are used universally in medicine and serve as a uniform platform for insurance and billing. However, coding ICDs after each patient encounter is time-consuming for physicians, particularly in fast-paced settings such as retina clinics. Searching for the most specific correct ICD code takes additional time, which leads providers to choose more general ICD codes. LLMs may help relieve this burden by analyzing the notes written by a provider and automatically generating an ICD code that can be used for the encounter.
Methods: In this study, we evaluate the ability of ChatGPT to analyze retina encounters and generate ICD codes for them without any feedback. The text of mock retina clinic encounters covering several visit types, including new patient visits, return visits, post-operative visits, and injection-only visits, was generated by three retina specialists.
Results: A total of 181 retina encounters were evaluated, comprising 84 right eyes and 97 left eyes. A total of 597 ICD codes were generated, of which 305 were retina codes (1.68 retina codes per eye). In 127 of 181 encounters (70%), the response was a true positive: at least one of the codes provided was a correct code for the encounter that a retina specialist could choose. In the remaining 54 of 181 encounters (30%), the response did not contain any correct code. Further analysis showed that ChatGPT's response was completely correct in 106 of 181 encounters (59%); the remaining 75 encounters (41%) included some form of incorrect ICD code, even when the correct diagnosis was also included.
Conclusions: This pilot study shows that ChatGPT, without any feedback, has the potential to alleviate physician burden in ICD coding by generating a selection of codes to choose from. Further research is required to validate this use of LLMs to reduce the burden on providers through ICD code generation.
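The proportions reported in the Results can be rechecked from the raw counts. The following is a minimal sketch in Python, not the authors' analysis code: the counts are copied from the abstract, while the variable names and the `pct` helper are illustrative assumptions.

```python
# Counts as reported in the abstract. Variable names are hypothetical.
encounters = 181
right_eyes, left_eyes = 84, 97
retina_codes = 305
at_least_one_correct = 127   # responses containing at least one correct code
no_correct = 54              # responses containing no correct code
fully_correct = 106          # responses with only correct codes
some_incorrect = 75          # responses containing at least one incorrect code


def pct(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Each pair of categories should partition the 181 encounters.
assert right_eyes + left_eyes == encounters
assert at_least_one_correct + no_correct == encounters
assert fully_correct + some_incorrect == encounters

summary = {
    "at_least_one_correct_pct": pct(at_least_one_correct, encounters),
    "no_correct_pct": pct(no_correct, encounters),
    "fully_correct_pct": pct(fully_correct, encounters),
    "some_incorrect_pct": pct(some_incorrect, encounters),
}
retina_codes_per_eye = retina_codes / encounters

print(summary)
print(f"retina codes per eye: {retina_codes_per_eye:.3f}")
```

Under these assumptions the four percentages come out to 70, 30, 59, and 41, matching the abstract. The ratio of retina codes per eye is roughly 1.685, which the abstract gives as 1.68.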
Description
This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
Series
vol. 6, art. no. 21