A research team led by Columbia University has developed an open-source framework aimed at accelerating artificial intelligence (AI) research in healthcare by addressing persistent challenges related to data standardization, reproducibility, and cross-institutional collaboration.
The framework, known as MEDS (Medical Event Data Standard), provides a standardized format for clinical data alongside a growing ecosystem of interoperable tools designed to support the development, evaluation, and benchmarking of machine learning models.
The framework was detailed in a study published in NEJM AI.
According to the researchers, MEDS could significantly reduce the technical barriers that often slow health AI research and make it difficult for scientists to reproduce results or compare models across different studies and institutions.
Standardizing Clinical Data for AI
Electronic health record (EHR) data are typically stored in institution-specific formats and require extensive preprocessing before they can be used for AI development. Because each team repeats that preprocessing on its own, the inconsistencies lead to duplicated effort, limited collaboration, and difficulty reproducing research findings.
MEDS seeks to overcome these obstacles by introducing a lightweight and extensible standard for representing longitudinal clinical data in machine learning workflows. The framework also includes open-source tools that support data transformation, preprocessing, model development, and performance benchmarking.
The researchers noted that MEDS is designed specifically for AI and machine learning applications and is intended to complement, not replace, existing clinical data standards.
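To make the idea of a shared event format concrete, here is a minimal Python sketch of how two institution-specific lab records might be mapped into one longitudinal timeline. It is an illustration only: the `Event` fields, the `LAB//` code prefix, the site record layouts, and the helper functions are all hypothetical and are not taken from the MEDS specification or its tools.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Event:
    """One clinical observation in a shared, institution-neutral shape.

    Field names are assumptions made for this sketch.
    """
    subject_id: int
    time: datetime
    code: str
    numeric_value: Optional[float] = None


def from_site_a(row: dict) -> Event:
    # Hypothetical site A export: ISO 8601 timestamps, short column names.
    return Event(
        subject_id=int(row["patient"]),
        time=datetime.fromisoformat(row["drawn_at"]),
        code=f"LAB//{row['test']}",
        numeric_value=float(row["result"]),
    )


def from_site_b(row: dict) -> Event:
    # Hypothetical site B export: day-first timestamps, lowercase test names.
    return Event(
        subject_id=int(row["mrn"]),
        time=datetime.strptime(row["ts"], "%d/%m/%Y %H:%M"),
        code=f"LAB//{row['lab_name'].upper()}",
        numeric_value=float(row["val"]),
    )


def build_timeline(events: Iterable[Event]) -> List[Event]:
    # Order events by subject, then chronologically within each subject.
    return sorted(events, key=lambda e: (e.subject_id, e.time))


site_a_row = {"patient": "7", "drawn_at": "2024-01-02T08:30:00",
              "test": "HGB", "result": "13.5"}
site_b_row = {"mrn": "7", "ts": "01/01/2024 09:00",
              "lab_name": "hgb", "val": "12.9"}

timeline = build_timeline([from_site_a(site_a_row), from_site_b(site_b_row)])
for event in timeline:
    print(event.subject_id, event.time.isoformat(), event.code, event.numeric_value)
```

Once both sources share one shape, downstream steps such as preprocessing or benchmarking can be written once against that shape rather than once per institution, which is the kind of reuse the article describes.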
Promoting Collaboration and Reproducibility
The framework supports a wide range of biomedical AI applications, including predictive modeling, representation learning, multimodal AI systems, and large-scale benchmarking studies. As an open-source ecosystem, MEDS allows researchers from academia, healthcare, and industry to contribute new tools and extensions.
“The biggest advances in AI have always been driven by communities working together in open-source environments,” said McDermott, one of the researchers. “MEDS demonstrates the benefits of sharing tools, standardizing common workflows, and building reusable resources that can scale across datasets and institutions.”
The study also underscores the growing importance of transparency and reproducibility as AI models move closer to real-world clinical deployment.
Researchers hope MEDS will encourage greater collaboration across institutions, accelerate innovation in clinical AI, and support more transparent and reproducible scientific research. The framework has already been adopted by 21 institutions across 12 countries.

