The performance of artificial intelligence models in generating responses to general orthodontic questions: ChatGPT vs Google Bard

Authors: Daraqel, Baraa; Wafaie, Khaled; Mohammed, Hisham; Cao, Li; Mheissen, Samer; Liu, Yang; Zheng, Leilei

Author affiliations: [Daraqel, Baraa; Cao, Li; Liu, Yang; Zheng, Leilei] Chongqing Med Univ, Stomatol Hosp, Dept Orthodont, 426 Songshibei Rd, Chongqing 401147, Peoples R China; [Daraqel, Baraa; Cao, Li; Liu, Yang; Zheng, Leilei] Chongqing Med Univ, Chongqing Key Lab Oral Dis & Biomed Sci, Chongqing, Peoples R China; [Daraqel, Baraa; Cao, Li; Liu, Yang; Zheng, Leilei] Chongqing Med Univ, Chongqing Municipal Key Lab Oral Biomed Engn Highe, Chongqing, Peoples R China; [Daraqel, Baraa; Mohammed, Hisham] Al Quds Univ, Oral Hlth Res & Promot Unit, Jerusalem, Palestine; [Wafaie, Khaled; Mheissen, Samer] Zhengzhou Univ, Affiliated Hosp 1, Fac Dent, Dept Orthodont, Zhengzhou, Henan, Peoples R China

Corresponding authors: Daraqel, B; Zheng, LL, Chongqing Med Univ, Stomatol Hosp, Dept Orthodont, 426 Songshibei Rd, Chongqing 401147, Peoples R China.

Source: AMERICAN JOURNAL OF ORTHODONTICS AND DENTOFACIAL ORTHOPEDICS

ESI subject category: CLINICAL MEDICINE

WOS accession number: WOS:001247324400001

JCR quartile: Q1

Impact factor: 2.7

Year: 2024

Volume: 165

Issue: 6

Start page: 652

End page: 662

Document type: Article

Keywords:

Abstract: Introduction: This study aimed to evaluate and compare the performance of 2 artificial intelligence (AI) models, Chat Generative Pretrained Transformer-3.5 (ChatGPT-3.5; OpenAI, San Francisco, Calif) and Google Bidirectional Encoder Representations from Transformers (Google Bard; Bard Experiment, Google, Mountain View, Calif), in terms of response accuracy, completeness, generation time, and response length when answering general orthodontic questions. Methods: A team of orthodontic specialists developed a set of 100 questions in 10 orthodontic domains. One author submitted the questions to both ChatGPT and Google Bard. The AI-generated responses from both models were randomly assigned into 2 forms and sent to 5 blinded and independent assessors. The quality of AI-generated responses was evaluated using a newly developed tool for accuracy of information and completeness. In addition, response generation time and length were recorded. Results: The accuracy and completeness of responses were high in both AI models. The median accuracy score was 9 (interquartile range [IQR]: 8-9) for ChatGPT and 8 (IQR: 8-9) for Google Bard (median difference: 1; P < 0.001). The median completeness score was similar in both models, with 8 (IQR: 8-9) for ChatGPT and 8 (IQR: 7-9) for Google Bard. The odds of accuracy and completeness were higher by 31% and 23% in ChatGPT than in Google Bard. Google Bard's response generation time was significantly shorter than that of ChatGPT by 10.4 seconds per question. However, both models were similar in terms of response length. Conclusions: Responses generated by both ChatGPT and Google Bard were rated with a high level of accuracy and completeness for the posed general orthodontic questions. However, acquiring answers was generally faster using the Google Bard model. (Am J Orthod Dentofacial Orthop 2024;165:652-62)

Funding agency:

Funding text:
