Creating a Modified Version of the Cambridge Multimorbidity Score to Predict Mortality in People Older Than 16 Years: Model Development and Validation

Debasish Kar, Kathryn S. Taylor*, Mark Joy, Sudhir Venkatesan, Wilhelmine Meeraus, Sylvia Taylor, Sneha N. Anand, Filipa Ferreira, Gavin Jamie, Xuejuan Fan, Simon de Lusignan

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

Background: No single multimorbidity measure is validated for use in NHS (National Health Service) England’s General Practice Extraction Service Data for Pandemic Planning and Research (GDPPR), the nationwide primary care data set created for COVID-19 pandemic research. The Cambridge Multimorbidity Score (CMMS) is a validated tool for predicting mortality risk, with 37 conditions defined by Read Codes. The GDPPR uses the more internationally used Systematized Nomenclature of Medicine clinical terms (SNOMED CT). We previously developed a modified version of the CMMS using SNOMED CT, but the number of terms for the GDPPR data set is limited making it impossible to use this version. Objective: We aimed to develop and validate a modified version of CMMS using the clinical terms available for the GDPPR. Methods: We used pseudonymized data from the Oxford-Royal College of General Practitioners Research and Surveillance Centre (RSC), which has an extensive SNOMED CT list. From the 37 conditions in the original CMMS model, we selected conditions either with (1) high prevalence ratio (≥85%), calculated as the prevalence in the RSC data set but using the GDPPR set of SNOMED CT codes, divided by the prevalence included in the RSC SNOMED CT codes or (2) conditions with lower prevalence ratios but with high predictive value. The resulting set of conditions was included in Cox proportional hazard models to determine the 1-year mortality risk in a development data set (n=500,000) and construct a new CMMS model, following the methods for the original CMMS study, with variable reduction and parsimony, achieved by backward elimination and the Akaike information stopping criterion. Model validation involved obtaining 1-year mortality estimates for a synchronous data set (n=250,000) and 1-year and 5-year mortality estimates for an asynchronous data set (n=250,000). We compared the performance with that of the original CMMS and the modified CMMS that we previously developed using RSC data. Results: The initial model contained 22 conditions and our final model included 17 conditions. The conditions overlapped with those of the modified CMMS using the more extensive SNOMED CT list. For 1-year mortality, discrimination was high in both the derivation and validation data sets (Harrell C=0.92) and 5-year mortality was slightly lower (Harrell C=0.90). Calibration was reasonable following an adjustment for overfitting. The performance was similar to that of both the original and previous modified CMMS models. Conclusions: The new modified version of the CMMS can be used on the GDPPR, a nationwide primary care data set of 54 million people, to enable adjustment for multimorbidity in predicting mortality in people in real-world vaccine effectiveness, pandemic planning, and other research studies. It requires 17 variables to produce a comparable performance with our previous modification of CMMS to enable it to be used in routine data using SNOMED CT.

Original languageEnglish
Article numbere56042
JournalJournal of Medical Internet Research
Volume26
Issue number1
DOIs
Publication statusPublished - 26 Aug 2024

ASJC Scopus subject areas

  • Health Informatics

Keywords

  • calibration
  • computerized medical records
  • COVID-19
  • discrimination
  • multimorbidity
  • pandemics
  • predictive model
  • prevalence
  • systematized nomenclature of medicine
  • systems

Fingerprint

Dive into the research topics of 'Creating a Modified Version of the Cambridge Multimorbidity Score to Predict Mortality in People Older Than 16 Years: Model Development and Validation'. Together they form a unique fingerprint.

Cite this