Reporting Checklist for Foundation and Large Language Models in Medical Research (REFINE): An International Consensus Guideline

MCML Authors

Michael Ingrisch

Prof. Dr.

Principal Investigator

Clinical Data Science in Radiology

Abstract

PURPOSE: To develop the REporting checklist for FoundatIon and large laNguagE models (REFINE), an international reporting guideline for transparent and reproducible reporting of foundation model (FM) and large language model (LLM) studies in medical research, including imaging artificial intelligence (AI) applications.

METHODS: The protocol was prespecified and publicly archived. A modified Delphi process was conducted to establish reporting standards for unimodal and multimodal FM and LLM applications involving text, imaging, and structured data. The steering committee coordinated protocol development, expert recruitment, all Delphi rounds, and the harmonization phase. Decisions were made based on predefined consensus thresholds. In Rounds 1 and 2, structured ratings and free-text feedback informed iterative revisions. In the post-Delphi harmonization phase, terminology was standardized, and detailed reporting instructions were finalized.

RESULTS: The REFINE development group comprised 57 contributors from 17 countries, and 54 panelists from 16 countries completed Rounds 1 and 2. The harmonization phase was completed by three expert panelists and the steering committee. The entire process produced a 44-item, six-section framework with standardized terminology and detailed reporting instructions, supported by an online platform for practical use (https://refinechecklist.github.io/refine/checklist.html).

CONCLUSION: The REFINE provides a comprehensive, consensus-based reporting standard for medical FM and LLM research, including imaging AI studies. The online version facilitates practical implementation.

CLINICAL SIGNIFICANCE: The REFINE enables transparent, comparable, and reproducible reporting of FM and LLM studies, supporting reliable evidence synthesis in medical and imaging-focused AI studies.

Diagnostic and Interventional Radiology

Feb. 2026.

Authors

I. • T. A. D’Antonoli • C. Bluethgen • K. Bressem • R. Cuocolo • A. Chaudhari • A. S. Tejani • A. Isaac • A. Ponsiglione • A. Meddeb • B. Khosravi • B. Le Guellec • C. E. Kahn, Jr. • C. H. Suh • D. Pinto dos Santos • D.-M. Koh • E. Tzanis • E. Kotter • E. Colak • F. Kitamura • F. Busch • F. Nensa • G. Yang • H. Müller • J. N. Kather • J. Nawabi • J. Kleesiek • J. Zhong • J. Santinha • J. Haubold • J. Guilherme de Almeida • K. Lekadir • K. Marias • L. N. Reiner • L. Maier-Hein • L. Moy • L. C. Adams • L. Martí-Bonmatí • M. Paschali • M. Moassefi • M. Dietzel • M. Huisman • M. Ingrisch • M. E. Klontzas • N. Papanikolaou • O. Diaz • P. Kuriki • P. Seeböck • P. Rouzrokh • Q. D. Strotzer • S. H. Park • S. Faghani • S. T. Arasteh • S. H. Kim • V. K. Venugopal • W. Kim • B. Kocak