Assessment of information quality in contemporary artificial intelligence systems for digital smile design: A comparative analysis


Topdağı B., Kavaz T.

Journal of Prosthetic Dentistry, vol. 134, no. 4, 2025 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 134 Issue: 4
  • Publication Date: 2025
  • DOI: 10.1016/j.prosdent.2025.06.030
  • Journal Name: Journal of Prosthetic Dentistry
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, CINAHL
  • Affiliated with İstanbul Gelişim Üniversitesi: Yes

Abstract

Statement of problem: Although artificial intelligence (AI) chatbots have been increasingly used to obtain information about smile design, the accuracy, reliability, and readability of such information for laypersons remain unclear.

Purpose: The purpose of this study was to assess the accuracy, reliability, quality, and readability of responses about digital smile design provided by 4 artificial intelligence models: ChatGPT-3.5, ChatGPT-4, Gemini, and Copilot.

Material and methods: The most frequently searched questions regarding smile design were identified via Google search and presented to each AI model. Responses were independently evaluated using a 5-point Likert scale for accuracy, the modified DISCERN scale for reliability, the General Quality Scale (GQS) for quality, and the Flesch Reading Ease Score (FRES) for readability. Normality was assessed by the Kolmogorov-Smirnov test, and group differences by the Kruskal-Wallis test with the Dunn post hoc analysis; statistical significance was set at α=.05.

Results: ChatGPT-4 achieved the highest median accuracy score, 5 (4−5), with significant differences among models (P<.05). Copilot demonstrated the highest reliability and quality scores (P<.05), while ChatGPT-3.5 responses were the most readable (P<.05); however, all models produced output classified as difficult to read. Only Copilot and Gemini included source citations in their responses.

Conclusions: AI chatbots generally provided accurate and moderately reliable information about smile design, but limited readability and insufficient referencing restrict their value as patient education tools. Enhancements in transparency, scientific clarity, and source citation are needed to improve the clinical utility of chatbot systems. These findings are limited to the evaluated models and topic area, and further research is warranted for broader validation.
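
For readers unfamiliar with the readability measure named in the methods, the following is a minimal sketch of how a Flesch Reading Ease score can be computed. It uses the standard published formula, 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word), in which lower scores indicate harder text. The function names and the vowel-group syllable counter are illustrative assumptions and not the tool the authors used; dedicated readability software counts syllables more carefully, so scores from this sketch will differ somewhat.

```python
import re


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula computed from raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per run of vowels, at least one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def score_text(text: str) -> float:
    """Score a passage by splitting on sentence-ending punctuation and counting words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return flesch_reading_ease(len(words), len(sentences), syllables)


if __name__ == "__main__":
    # Hypothetical chatbot-style sentence, used only to exercise the function.
    sample = (
        "Digital smile design uses photographs and software to plan treatment. "
        "Your dentist can show you a preview before any work begins."
    )
    print(round(score_text(sample), 1))
```

Long sentences and polysyllabic vocabulary both pull the score down, which is consistent with chatbot answers that are accurate yet still rated difficult to read.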