Creating a Modified Version of the Cambridge Multimorbidity Score (CMMS) to Predict Mortality in People Over 16 Years in the English Nationwide General Practice Extraction Service Data for Pandemic Planning and Research (GDPPR) Dataset: Model Development and Validation (Preprint)

Debasish Kar, Kathryn Suzann Taylor, Mark Joy, Sudhir Venkatesan, Wilhelmine Meeraus, Sylvia Taylor, Sneha Anand, Filipa Ferreira, Gavin Jamie, Xuejuan Fan, Simon de Lusignan

Research output: Working paper (Preprint)

Abstract

Background: No single multimorbidity measure is validated for use in NHS (National Health Service) England’s General Practice Extraction Service Data for Pandemic Planning and Research (GDPPR), the nationwide primary care data set created for COVID-19 pandemic research. The Cambridge Multimorbidity Score (CMMS) is a validated tool for predicting mortality risk, with 37 conditions defined by Read Codes. The GDPPR instead uses Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), which is more widely used internationally. We previously developed a modified version of the CMMS using SNOMED CT, but the number of terms available for the GDPPR data set is limited, making it impossible to use this version.

Objective: We aimed to develop and validate a modified version of the CMMS using the clinical terms available for the GDPPR.

Methods: We used pseudonymized data from the Oxford-Royal College of General Practitioners Research and Surveillance Centre (RSC), which has an extensive SNOMED CT list. From the 37 conditions in the original CMMS model, we selected conditions with either (1) a high prevalence ratio (≥85%), calculated as the prevalence in the RSC data set using the GDPPR set of SNOMED CT codes divided by the prevalence in the same data set using the RSC set of SNOMED CT codes, or (2) a lower prevalence ratio but high predictive value. The resulting set of conditions was included in Cox proportional hazards models to determine the 1-year mortality risk in a development data set (n=500,000) and to construct a new CMMS model. We followed the methods of the original CMMS study, with variable reduction and parsimony achieved by backward elimination using the Akaike information criterion as the stopping criterion. Model validation involved obtaining 1-year mortality estimates for a synchronous data set (n=250,000) and 1-year and 5-year mortality estimates for an asynchronous data set (n=250,000). We compared the performance with that of the original CMMS and the modified CMMS that we previously developed using RSC data.

Results: The initial model contained 22 conditions, and our final model included 17 conditions. The conditions overlapped with those of the modified CMMS using the more extensive SNOMED CT list. For 1-year mortality, discrimination was high in both the derivation and validation data sets (Harrell C=0.92); for 5-year mortality, it was slightly lower (Harrell C=0.90). Calibration was reasonable following an adjustment for overfitting. The performance was similar to that of both the original and previous modified CMMS models.

Conclusions: The new modified version of the CMMS can be used on the GDPPR, a nationwide primary care data set of 54 million people, to enable adjustment for multimorbidity when predicting mortality in real-world vaccine effectiveness, pandemic planning, and other research studies. It requires 17 variables to achieve performance comparable with that of our previous modification of the CMMS, which was developed to enable its use in routine data coded in SNOMED CT.
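
The prevalence-ratio criterion described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy patient records, and the code labels (A1, A2, and so on) are hypothetical placeholders rather than real SNOMED CT identifiers, and only the 85% threshold is taken from the abstract. The second selection route (lower ratio but high predictive value) is not modelled here.

```python
from typing import Dict, Set


def prevalence(patient_codes: Dict[str, Set[str]], condition_codes: Set[str]) -> float:
    """Fraction of patients who have at least one code from condition_codes."""
    if not patient_codes:
        return 0.0
    hits = sum(1 for codes in patient_codes.values() if codes & condition_codes)
    return hits / len(patient_codes)


def prevalence_ratio(
    patient_codes: Dict[str, Set[str]],
    restricted_codes: Set[str],
    full_codes: Set[str],
) -> float:
    """Prevalence found with the restricted code list divided by the
    prevalence found with the full code list, in the same population."""
    full = prevalence(patient_codes, full_codes)
    if full == 0:
        raise ValueError("condition not observed with the full code list")
    return prevalence(patient_codes, restricted_codes) / full


# Toy population: patient id -> set of recorded codes (all hypothetical).
patients = {
    "p1": {"A1", "B7"},
    "p2": {"A2"},
    "p3": {"A3"},
    "p4": {"C9"},
    "p5": set(),
}

full_codes = {"A1", "A2", "A3"}   # stands in for the extensive RSC list
restricted_codes = {"A1", "A2"}   # stands in for the limited GDPPR list

THRESHOLD = 0.85  # the >=85% cut-off stated in the abstract

ratio = prevalence_ratio(patients, restricted_codes, full_codes)
keep_on_ratio_alone = ratio >= THRESHOLD
print(f"prevalence ratio = {ratio:.2f}, keep on ratio alone: {keep_on_ratio_alone}")
```

In this toy example the restricted list identifies two of the three patients found by the full list, so the condition would not qualify on the ratio alone and would be kept only if it met the second criterion.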

Original language: English
Volume: 26
Publication status: Published - 26 Aug 2024
Externally published: Yes

Publication series

Name: Journal of Medical Internet Research
Publisher: JMIR Publications Inc.
ISSN (Print): 1439-4456

ASJC Scopus subject areas

  • Health Informatics

Keywords

  • COVID-19
  • calibration
  • computerized medical records
  • discrimination
  • multimorbidity
  • pandemics
  • predictive model
  • prevalence
  • systematized nomenclature of medicine
  • systems
  • Pandemics
  • Humans
  • Middle Aged
  • Male
  • Young Adult
  • SARS-CoV-2
  • Aged, 80 and over
  • Adult
  • Female
  • England/epidemiology
  • COVID-19/mortality
  • Multimorbidity
  • Systematized Nomenclature of Medicine
  • Adolescent
  • Aged
