Quality of Information on Hypersensitivity Pneumonitis Generated by Artificial Intelligence Chatbots for Clinical and Patient Use

This is a preview and has not been published.

Authors

  • Derya Yenibertiz Department of Pulmonology, University of Health Sciences, Dr. Abdurrahman Yurtaslan Ankara Oncology Training and Research Hospital, Ankara, Türkiye https://orcid.org/0000-0002-1783-4015
  • Güzide Tomas Department of Pulmonology, University of Health Sciences, Sultan Abdülhamid Han Training and Research Hospital, İstanbul, Türkiye

Keywords:

Hypersensitivity pneumonitis, Artificial intelligence, Chatbots, Patient education

Abstract

Background and aim

Hypersensitivity pneumonitis (HP) is a complex immune-mediated interstitial lung disease that requires integration of clinical, radiological, and exposure-related data for accurate diagnosis and management. With the increasing public use of artificial intelligence (AI)–based chatbots for health-related information, concerns have emerged regarding the quality, reliability, readability, and patient usability of AI-generated medical content. This study aimed to evaluate AI chatbot–generated information on HP from a patient education perspective.

Materials and methods

Four frequently searched patient-oriented questions related to HP were identified using Google Trends. These questions addressed disease definition and causes, clinical features, treatment, and diagnosis. Each question was submitted verbatim to eight widely used AI chatbots, generating 32 responses. Four pulmonologists with expertise in interstitial lung diseases independently and blindly evaluated all responses. Content quality and reliability were assessed using the DISCERN instrument; understandability and actionability using the Patient Education Materials Assessment Tool for Print Materials (PEMAT-P); overall written clarity using the Written Readability Rating (WRR); and structural readability using the Flesch–Kincaid Grade Level (FKGL).
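As context for the FKGL metric used in this study, the grade level is a fixed linear function of average sentence length and average syllables per word. The sketch below is illustrative only: the vowel-group syllable counter is a simple heuristic and not the exact tool the authors used to score the chatbot responses.

```python
import re

def count_syllables(word: str) -> int:
    """Rough English syllable count via vowel groups (heuristic, not exact)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # A trailing silent 'e' usually does not add a syllable.
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)
```

Because the formula rewards short sentences and short words, a score above 20 (as reported for all chatbots here) implies long, polysyllabic sentences well beyond typical patient-education targets of grade 6–8.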

Results

All chatbot-generated responses required advanced literacy, with FKGL scores ranging from 20.17 to 29.07, corresponding to college-level or higher readability. None met recommended patient-appropriate readability thresholds. WRR scores declined as clinical complexity increased, with the lowest scores observed for diagnostic explanations. DISCERN scores varied across chatbots (35.00–57.10), indicating generally fair to good quality but incomplete information. PEMAT-P understandability scores were moderate, while actionability scores were consistently low across all models.

Conclusion

Although AI chatbots can generate clinically detailed information on HP, current outputs remain difficult for patients to understand and apply. Until improvements in readability, actionability, and reliability are achieved, AI-generated content should be used cautiously and under professional supervision in patient education.


How to Cite

Yenibertiz D, Tomas G. Quality of Information on Hypersensitivity Pneumonitis Generated by Artificial Intelligence Chatbots for Clinical and Patient Use. Sarcoidosis Vasc Diffuse Lung Dis [Internet]. [cited 2026 Apr. 17];43(2):18399. Available from: https://mail.mattioli1885journals.com/index.php/sarcoidosis/article/view/18399

Section

Original Articles: Clinical Research
