Artificial Intelligence (AI)-Generated Healthcare Content: Understanding the Limitations

By Kaelin O’Reilly, ProAssurance communications specialist

Artificial intelligence (AI), including chatbot tools like the popular ChatGPT, has made possible many useful applications in the healthcare sphere. ChatGPT’s ability to generate human-like responses to natural language inputs has made it an attractive tool for professional and student writers.1 The application can help develop high-quality, informative content in the form of articles, reports, blogs, tweets, and emails.2 This content may be produced in less time than with traditional writing, and the burden of arduous research tasks can be reduced. In the fields of medicine and science, healthcare providers, researchers, and academics can access valuable medical education; supplement record documentation; and produce journal articles, clinical studies, and research papers with assistance from the tool.1

ChatGPT’s natural language processing model builds on older technologies of speech recognition, predictive text capability, and deep learning.3 It can function as a search engine, providing direct responses to user queries by applying specific criteria to locate appropriate resources. ChatGPT can aid in topic generation and provide translation for some medical and technical jargon. Because its algorithm is “trained” on a robust dataset of conversational text, the tool can address and generate practical written responses for a broad range of prompts, capturing many of the nuances and variations unique to human speech. It can also present language that is clear, easy to follow, often eloquent, and formatted in the appropriate, specified structure.1

While AI tools like ChatGPT present significant advantages for writers, these applications are not without shortcomings. AI-generated content raises the following concerns4:

  • Authorship and Accountability
  • Inaccuracies and Errors
  • Biases and Prejudices
  • Lack of Regulations; Privacy and Security
  • Dependence and Job Displacement

Moreover, developing and fine-tuning the ChatGPT algorithm necessitates the collection and analysis of huge volumes of text data from across the internet. Notably, these data collections have been relatively sporadic: earlier models were trained on information only through September 2021, and the newer model extends only through April 2023. As a result, the information generated by ChatGPT may be erroneous or out of date, or may perpetuate an incomplete or distorted picture of the subject matter.1,5 Misinformation may be overlooked or unknown, and inadvertently passed on in published work.2 As AI implementations become even more commonplace, both readers and writers should be mindful to question the validity and reliability of content and familiarize themselves with the functional limitations of chatbots like ChatGPT.1

The Limitations and Concerns of ChatGPT-Generated Content

Authorship and Accountability

AI-generated content invites questions about authorship and accountability and, specifically, whether tools like ChatGPT should be applied in research and writing, including healthcare works. Credit for published material has traditionally been given to the individual contributors for their work in applying intelligence to idea generation, research and analysis, design, and execution. It has been suggested that definitions of authorship may need to be revisited and specified, considering that use of ChatGPT and other AI tools in the healthcare ecosystem is only growing. However, most journals will not allow designation of ChatGPT as an author, suggesting that although the tool can mimic human thought progression and language and create a logical, well-developed piece of writing in an appropriate format, it may not have the capability to produce information that is 100% reliable. As AI is non-human, it cannot be held responsible for its content in the same way as individuals with intention and legal obligations.1,4

Supporting the argument for accountability is an acknowledgment of the continued need for human intervention when using these tools, despite their impressive capabilities. Specifically, processes like editing and applying reason and specialized expertise lie beyond the product’s scope of training but are nevertheless essential in writing. It may be acceptable, however, and even beneficial for writers to include references to such AI tools along with the other resources they have used in the development of their work. Doing so might establish greater transparency while allowing the author to claim appropriate responsibility for the validity of their content. Further, such citations may bring awareness to the merits of AI resources like ChatGPT as supplemental assistants to the research and writing processes.1,4

As AI algorithms evolve with new and expanding data collections, opportunities for misuse and plagiarism emerge. In one study, plagiarism detection software and a tool designed to identify AI-generated content (an “AI output detector”) were applied to 50 research abstracts generated solely by ChatGPT. ChatGPT had created these abstracts after reviewing excerpts from journals like JAMA and The New England Journal of Medicine. The plagiarism detection software found no plagiarism by ChatGPT, while the AI output detector recognized only 66% of the abstracts as AI-created. It is encouraging that ChatGPT was not found to have plagiarized the journal articles. However, because ChatGPT passed the AI output detector checks with relative ease, it may be deduced that an individual reader would be unable to distinguish the AI-generated abstracts from human-written ones.1

Inaccuracies and Errors

Accuracy and reliability of text generated by AI models depend on the quality of the data used in training the models. ChatGPT, like any AI model, may have errors or biases built into its core algorithm and, as a result, its output based on these inaccuracies will sometimes be incorrect.1 Language models are inherently intricate, complex, and potentially difficult to understand. A user may lack the foresight or knowledge necessary for gauging the correctness of an AI-generated answer or spotting specific errors, especially if the user is not aware of how the tool arrived at its conclusions.4 There may be ambiguities in the user’s prompt or question (i.e., vague wording, meandering, or unfocused speech), resulting in an answer that is, in turn, also ambiguous.1 In addition, using preset calculations to parse through data and select the “best” answer in mere fractions of a second, even when there is no clear or easy answer available, can result in incomplete, skewed information. These types of outputs, known as AI “hallucinations,” are presented as factual but are really an improvised best guess generated by the chatbot, and they have a high potential for inaccuracy.6

ChatGPT has a limited ability to apply deductive reasoning in its approach to answers, or to deconstruct and prioritize answers to layered questions. It can have trouble inferring underlying meanings or handling complex, “niche” topics. This weakness becomes even more challenging in detailed areas of science and medicine, which require subject matter expertise and an acute awareness of, and ability to analyze, the constant changes and developments characteristic of these fields. Though ChatGPT is skilled at performing some language translation and adjusting medical condition and treatment terminology to be more digestible for the average person, the tool may have a hard time interpreting or “understanding” certain medical phrases or jargon specific to a lesser-known subject or subspecialty.7

Biases and Prejudices

Data used in the development of AI algorithms may be limited in ways that over- or under-represent certain groups, genders, ages, races, and cultures.8 A close examination will reveal that such an overgeneralized and unbalanced dataset fails to properly include certain populations; as a result, the output of AI chatbots may be unreliable as applied to those groups. The potential biases and discriminatory attitudes that may be apparent in data collected across the web, and that inform the outputs generated by tools like AI chatbots, reflect not only society’s culture but also the culture of the technological innovators behind the AI-assisted product. A lack of diversity among these teams, as well as collective misconceptions or prejudices, can become “embedded” in product development, meaning the product may exclude sizable groups of the population. Likewise, an unintentional flaw in the product design or in the algorithm’s data input can yield such biases. These biases perpetuate when AI presents flawed conclusions to users, who may rely upon and pass along that skewed information. Large, varied groups and underrepresented communities should be included in research studies to create more diverse training sets for new algorithms. Doing so would allow ChatGPT and similar tools to generate more accurate, reliable, and inclusive results.9

Lack of Regulations; Privacy and Security

Training the algorithms behind ChatGPT and other chatbot tools requires access to extensive datasets, which may include health information, particularly if an AI tool is used across healthcare facilities through the sharing of patient information. A central concern with using health information is the privacy and security of the details within that gathered data, which may be vulnerable to hackers and data breaches. When the underlying data for an AI algorithm contains the health information of actual people, only properly de-identified data, stripped of any individual’s protected health information, should be used in order to avoid HIPAA violations and breaches of privacy. With no universal guidelines in place to govern the use, efficacy, implementation, and auditing of newer AI tools like ChatGPT in the healthcare sector, legal and ethical debates circulate around the handling and quality of data, patient consent, and confidentiality. A lack of clarity about data models and algorithms, combined with inadequate training on the user functions of AI tools, invites warranted skepticism and presents a need for greater transparency and education across healthcare organizations. It has been suggested that collaboration among AI innovators, security experts, and policymakers, as well as healthcare clinicians and providers, is necessary for the development and implementation of rules, regulations, and guidelines that address these novel issues of transparency and security and provide a smoother integration of AI into clinical practices. Such guidelines could restrict data usage and the sharing of information, and impose quality control measures for de-identification, encryption, and anonymization. These specifications would help ensure privacy and security while maintaining quality of patient care and compliance with existing national healthcare regulations.8,9

Dependence and Job Displacement

There is legitimate concern about dependence and overreliance on AI-assisted tools, especially if their algorithm models are flawed, contain biases, or are simply outdated. Leaning too heavily on these tools can result in missed errors and complacency around fact-checking and quality assurance for documentation and other important practical applications in healthcare. In the production of healthcare and scientific written content, creativity, personal experience, and an individual voice contribute to quality and originality, and overreliance on AI raises concern that these attributes may be lost when using a tool like ChatGPT. Content generated through a chatbot should be reviewed and edited for factual merit, quality, grammar, consistency, and timeliness. As AI technology advances in functionality and versatility, researchers and writers may fear job loss or a reduction of employment opportunities. However, the elements common to valuable written pieces illustrate integral contributions that can only come from individual authors: demonstrated depth of knowledge, critical and applied thinking, anecdotes, specific deductive reasoning, and a personal connection to the audience. These are human attributes that cannot be fully replicated or recreated by any technology. ChatGPT and other chatbot tools currently work best alongside humans, serving as resources that make the processes of writing and research smoother and more manageable.4,8

References

1. Tirth Dave, Sai Anirudh Athaluri, and Satyam Singh, “ChatGPT in Medicine: An Overview of Its Applications, Advantages, Limitations, Future Prospects, and Ethical Considerations,” Frontiers in Artificial Intelligence 6 (May 4, 2023), https://www.frontiersin.org/articles/10.3389/frai.2023.1169595/full.

2. Jodie Cook, “6 Giveaway Signs of ChatGPT-Generated Content,” Forbes, Dec. 6, 2023, https://www.forbes.com/sites/jodiecook/2023/12/06/6-giveaway-signs-of-chatgpt-generated-content/?sh=10b8c9181e7d.

3. “The Benefits of AI in Healthcare,” IBM Education, July 11, 2023, https://www.ibm.com/blog/the-benefits-of-ai-in-healthcare/.

4. Alexander S. Doyal et al., “ChatGPT and Artificial Intelligence in Medical Writing: Concerns and Ethical Considerations,” Cureus Journal of Medical Science 15(8) (August 10, 2023), https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10492634/#:~:text=Some%20suggested%20uses%20of%20ChatGPT,in%20the%20writing%20of%20medical.

5. Aaron Mok, “ChatGPT Is Getting an Upgrade That Will Make It More Up to Date,” Business Insider, Nov. 6, 2023, https://www.businessinsider.com/open-ai-chatgpt-training-up-to-date-gpt4-turbo-2023-11#:~:text=ChatGPT%20users%20will%20soon%20have,at%20its%20first%20developer%20day.

6. Sindhu Sundar and Aaron Mok, “How Do AI Chatbots Like ChatGPT Work? Here’s a Quick Explainer,” Business Insider, Oct. 14, 2023, https://www.businessinsider.com/how-ai-chatbots-like-chatgpt-work-explainer-2023-7.

7. Bernard Marr, “The Top Limitations of ChatGPT,” Forbes, Mar. 3, 2023, https://www.forbes.com/sites/bernardmarr/2023/03/03/the-top-10-limitations-of-chatgpt/?sh=5b49a2158f35.

8. Josh Nguyen and Christopher A. Pepping, “The Application of ChatGPT in Healthcare Progress Notes: A Commentary From a Clinical and Research Perspective,” Clinical and Translational Medicine 13(7) (July 2, 2023), https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10315641/.

9. Bangul Khan et al., “Drawbacks of Artificial Intelligence and Their Potential Solutions in the Healthcare Sector,” Biomedical Materials & Devices 1-8 (Feb. 8, 2023), https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9908503/.

Originally posted in ProVisions; reposted with permission.

Note: This article is for informational purposes only and should not be considered as insurance advice related to your specific policy or situation. Please consult with a qualified insurance advisor or professional before making any policy decisions.
