West China Medical Publishers
Keyword search for "large language model": 2 results
  • The application of large language models in the field of evidence-based medicine

    Large language models (LLMs) are highly sophisticated deep learning models pre-trained on massive datasets, and ChatGPT is a prominent generative application of LLMs. Since the release of ChatGPT at the end of 2022, generative chatbots have become widely used across medical disciplines. As evidence-based medicine (EBM) is a crucial discipline guiding clinical practice, the use of generative chatbots such as ChatGPT in EBM is gradually increasing. However, the potential, challenges, and intricacies of their application in EBM remain unclear. Through a review of the relevant literature, this paper explores and discusses the prospects, challenges, and considerations of applying ChatGPT in EBM across four aspects: evidence generation, synthesis, assessment, and dissemination and implementation, providing researchers with insight into the latest developments and suggestions for future research.

  • Evaluation of the accuracy of the large language model for risk of bias assessment in analytical studies

    Objective To systematically review the accuracy and consistency of large language models (LLMs) in assessing risk of bias in analytical studies. Methods Cohort and case-control studies related to COVID-19 were included, drawn from the team's published systematic review of the clinical characteristics of COVID-19. Two researchers independently screened the studies, extracted data, and assessed the risk of bias of the included studies; the LLM-based BiasBee model (version Non-RCT) was used for automated evaluation. Kappa statistics and score differences were used to analyze agreement between LLM and human evaluations, with subgroup analyses for Chinese- and English-language studies. Results A total of 210 studies were included. Meta-analysis showed that LLM scores were generally higher than those of human evaluators, particularly for representativeness of exposed cohorts (Δ=0.764) and selection of external controls (Δ=0.109). Kappa analysis indicated slight agreement on items such as exposure assessment (κ=0.059) and adequacy of follow-up (κ=0.093), and notable discrepancies on more subjective items, such as control selection (κ=−0.112) and non-response rate (κ=−0.115). Subgroup analysis revealed higher scoring consistency for LLMs in English-language studies than in Chinese-language studies. Conclusion LLMs demonstrate potential for risk of bias assessment; however, notable differences remain on more subjective items. Future research should focus on optimizing prompt engineering and model fine-tuning to enhance LLM accuracy and consistency in complex tasks.
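    The agreement statistic reported above, Cohen's kappa, corrects raw rater agreement for agreement expected by chance. A minimal sketch of the computation is below; the item-level ratings are hypothetical illustrations, not data from the study.

    ```python
    from collections import Counter

    def cohen_kappa(rater_a, rater_b):
        """Cohen's kappa: chance-corrected agreement between two raters."""
        assert len(rater_a) == len(rater_b) and rater_a
        n = len(rater_a)
        # Observed agreement: fraction of items rated identically.
        p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        # Chance agreement: expected from each rater's marginal category frequencies.
        ca, cb = Counter(rater_a), Counter(rater_b)
        p_e = sum(ca[k] * cb[k] for k in ca) / (n * n)
        return (p_o - p_e) / (1 - p_e)

    # Hypothetical per-item judgements (e.g. "low"/"high" risk of bias)
    human = ["low", "low", "high", "low", "high", "low"]
    llm   = ["low", "high", "high", "low", "low", "low"]
    print(round(cohen_kappa(human, llm), 3))  # 0.25: slight agreement
    ```

    Values near 0 (as in several items above) mean the LLM agrees with human raters little better than chance, even when raw score agreement looks high.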
