TY - JOUR
T1 - Phenotype execution and modeling architecture to support disease surveillance and real-world evidence studies
T2 - English sentinel network evaluation
AU - Jamie, Gavin
AU - Elson, William
AU - Kar, Debasish
AU - Wimalaratna, Rashmi
AU - Hoang, Uy
AU - Meza-Torres, Bernardo
AU - Forbes, Anna
AU - Hinton, William
AU - Anand, Sneha
AU - Ferreira, Filipa
AU - Byford, Rachel
AU - Ordonez-Mena, Jose
AU - Agrawal, Utkarsh
AU - De Lusignan, Simon
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/7/1
Y1 - 2024/7/1
N2 - Objective: To evaluate Phenotype Execution and Modelling Architecture (PhEMA), to express sharable phenotypes using Clinical Quality Language (CQL) and intensional Systematised Nomenclature of Medicine (SNOMED) Clinical Terms (CT) Fast Healthcare Interoperability Resources (FHIR) valuesets, for exemplar chronic disease, sociodemographic risk factor, and surveillance phenotypes. Method: We curated 3 phenotypes: Type 2 diabetes mellitus (T2DM), excessive alcohol use, and incident influenza-like illness (ILI) using CQL to define clinical and administrative logic. We defined our phenotypes with valuesets, using SNOMED's hierarchy and expression constraint language, and CQL, combining valuesets and adding temporal elements where needed. We compared the count of cases found using PhEMA with our existing approach using convenience datasets. We assessed our new approach against published desiderata for phenotypes. Results: The T2DM phenotype could be defined as 2 intensionally defined SNOMED valuesets and a CQL script. It increased the prevalence from 7.2% to 7.3%. Excess alcohol phenotype was defined by valuesets that added qualitative clinical terms to the quantitative conceptual definitions we currently use; this change increased prevalence by 58%, from 1.2% to 1.9%. We created an ILI valueset with SNOMED concepts, adding a temporal element using CQL to differentiate new episodes. This increased the weekly incidence in our convenience sample (weeks 26-38) from 0.95 cases to 1.11 cases per 100 000 people. Conclusions: Phenotypes for surveillance and research can be described fully and comprehensibly using CQL and intensional FHIR valuesets. Our use case phenotypes identified a greater number of cases, whilst anticipated from excessive alcohol this was not for our other variable. This may have been due to our use of SNOMED CT hierarchy. Our new process fulfilled a greater number of phenotype desiderata than the one that we had used previously, mostly in the modeling domain. More work is needed to implement that sharing and warehousing domains.
AB - Objective: To evaluate Phenotype Execution and Modelling Architecture (PhEMA), to express sharable phenotypes using Clinical Quality Language (CQL) and intensional Systematised Nomenclature of Medicine (SNOMED) Clinical Terms (CT) Fast Healthcare Interoperability Resources (FHIR) valuesets, for exemplar chronic disease, sociodemographic risk factor, and surveillance phenotypes. Method: We curated 3 phenotypes: Type 2 diabetes mellitus (T2DM), excessive alcohol use, and incident influenza-like illness (ILI) using CQL to define clinical and administrative logic. We defined our phenotypes with valuesets, using SNOMED's hierarchy and expression constraint language, and CQL, combining valuesets and adding temporal elements where needed. We compared the count of cases found using PhEMA with our existing approach using convenience datasets. We assessed our new approach against published desiderata for phenotypes. Results: The T2DM phenotype could be defined as 2 intensionally defined SNOMED valuesets and a CQL script. It increased the prevalence from 7.2% to 7.3%. Excess alcohol phenotype was defined by valuesets that added qualitative clinical terms to the quantitative conceptual definitions we currently use; this change increased prevalence by 58%, from 1.2% to 1.9%. We created an ILI valueset with SNOMED concepts, adding a temporal element using CQL to differentiate new episodes. This increased the weekly incidence in our convenience sample (weeks 26-38) from 0.95 cases to 1.11 cases per 100 000 people. Conclusions: Phenotypes for surveillance and research can be described fully and comprehensibly using CQL and intensional FHIR valuesets. Our use case phenotypes identified a greater number of cases, whilst anticipated from excessive alcohol this was not for our other variable. This may have been due to our use of SNOMED CT hierarchy. Our new process fulfilled a greater number of phenotype desiderata than the one that we had used previously, mostly in the modeling domain. More work is needed to implement that sharing and warehousing domains.
KW - clinical coding
KW - computerized
KW - medical record systems
KW - phenotype
KW - routinely collected health data
KW - systematized nomenclature of medicine
UR - http://www.scopus.com/inward/record.url?scp=85193054382&partnerID=8YFLogxK
U2 - 10.1093/jamiaopen/ooae034
DO - 10.1093/jamiaopen/ooae034
M3 - Article
AN - SCOPUS:85193054382
SN - 2574-2531
VL - 7
JO - JAMIA Open
JF - JAMIA Open
IS - 2
M1 - ooae034
ER -