Health data phenotypes

Almost all Genes & Health phenotypes are derived from the electronic NHS health records of our volunteers. All raw (de-identified) data are available within our Trusted Research Environment. To curate phenotypes, we take data from every release from every NHS source (primary care, secondary care, national NHS England/Digital, and others), then merge and de-duplicate them. Our code for phenotype processing is on our GitHub page.

Quantitative traits:

We have curated ~120 quantitative traits (e.g. BMI, LDL-cholesterol). Because a test result can appear in multiple health record systems from different NHS providers, we de-duplicate: if the same result value is seen for the same volunteer within a ~2 week window, it is counted once. If required, we also provide further filtering around periods of acute severe illness, excluding results within 2 weeks either side of any known hospital admission lasting 2 days or longer (from NHS England Hospital Episode Statistics).
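
The two steps above can be sketched as follows. This is a minimal illustration and not our production code: the tuple layout, the function names, the exact 14-day window, and the reading of "2 days or longer" as a discharge date at least 2 days after the admission date are all assumptions made for the example.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=14)  # assumed reading of the "~2 week" window


def deduplicate(results):
    """Keep one copy of each repeated value seen within WINDOW.

    `results` holds (volunteer_id, trait, result_date, value) tuples
    (a hypothetical layout chosen for this sketch).
    """
    last_kept = {}  # (volunteer_id, trait, value) -> date of last kept result
    kept = []
    for row in sorted(results, key=lambda r: (r[0], r[1], r[2])):
        volunteer_id, trait, result_date, value = row
        key = (volunteer_id, trait, value)
        previous = last_kept.get(key)
        if previous is not None and result_date - previous <= WINDOW:
            continue  # assumed to be the same test reported by another system
        last_kept[key] = result_date
        kept.append(row)
    return kept


def outside_acute_illness(results, admissions, min_stay_days=2):
    """Drop results within WINDOW either side of a qualifying admission.

    `admissions` holds (volunteer_id, admit_date, discharge_date) tuples.
    """
    excluded = {}  # volunteer_id -> list of (start, end) exclusion periods
    for volunteer_id, admitted, discharged in admissions:
        if (discharged - admitted).days >= min_stay_days:
            period = (admitted - WINDOW, discharged + WINDOW)
            excluded.setdefault(volunteer_id, []).append(period)
    return [
        row for row in results
        if not any(start <= row[2] <= end
                   for start, end in excluded.get(row[0], []))
    ]
```

In this sketch each result is compared with the last kept copy of the same value, so a long run of repeats is thinned to roughly one result per window rather than collapsed to a single record.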

Binary traits:

We have curated ~300 custom binary traits. These use custom codelists containing defined ICD10, SNOMED or OPCS codes. There can be several different “flavours” of the same phenotype (e.g. coronary artery disease) with different sensitivity and specificity (e.g. a self-reported symptom of angina versus requiring evidence from an angiogram).
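
As a rough illustration of how two "flavours" of one phenotype might be represented, the sketch below treats a codelist as a set of (coding system, code) pairs. The codelist names and the `EXAMPLE_*` entries are placeholders invented for this example, not entries from our curated codelists.

```python
def has_trait(volunteer_codes, codelist):
    """True if any (system, code) pair recorded for the volunteer is in the codelist."""
    return not codelist.isdisjoint(volunteer_codes)


# Hypothetical codelists. "EXAMPLE_*" values are placeholders, not real codes.
CAD_BROAD = {
    ("ICD10", "I20"),                      # angina pectoris
    ("ICD10", "I25"),                      # chronic ischaemic heart disease
    ("SNOMED", "EXAMPLE_ANGINA_SYMPTOM"),  # self-reported symptom
    ("OPCS", "EXAMPLE_ANGIOGRAM"),         # angiogram evidence
}
CAD_STRICT = {
    ("OPCS", "EXAMPLE_ANGIOGRAM"),         # requires angiogram evidence only
}
```

A volunteer with only a symptom code would be a case under the broad list (higher sensitivity) but not under the strict list (higher specificity).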

We have also curated 3-digit ICD10 binary traits, using as far as possible the same methods as UK Biobank. There are ~1600 traits (ICD10 A01 to Q99). Unlike UK Biobank, we include cancer traits. The 3-digit ICD10 traits are intended for a wide and crude sweep across all human phenotypes. Some data is lost compared with the custom traits because 1) OPCS codes are not used and 2) some SNOMED codes map to multiple ICD10 codes and have to be discarded (otherwise you see implausible numbers of cases of e.g. Anthrax due to non-specific “infection” codes).
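
The handling of ambiguous SNOMED codes can be sketched as below. This is an assumption-laden illustration: the function names and the `SCT_*` identifiers are placeholders, and the sketch assumes a SNOMED code is kept when all of its ICD10 targets fall in a single 3-digit category, which may differ from the rule used in our pipeline.

```python
from collections import defaultdict


def three_digit(icd10_code):
    """Truncate an ICD10 code such as 'I20.9' to its 3-character category, 'I20'."""
    return icd10_code.replace(".", "")[:3].upper()


def unambiguous_map(snomed_to_icd10):
    """Keep SNOMED codes whose targets share exactly one 3-digit ICD10 category."""
    mapping = {}
    for snomed, targets in snomed_to_icd10.items():
        categories = {three_digit(t) for t in targets}
        if len(categories) == 1:
            mapping[snomed] = categories.pop()
        # Codes with several categories are discarded, not assigned to all of them.
    return mapping


def derive_traits(events, mapping):
    """Return {icd10_category: set of volunteer_ids with at least one mapped code}.

    `events` holds (volunteer_id, snomed_code) pairs.
    """
    traits = defaultdict(set)
    for volunteer_id, snomed in events:
        category = mapping.get(snomed)
        if category is not None:
            traits[category].add(volunteer_id)
    return dict(traits)
```

With a placeholder "infection" code that maps to both A22 (Anthrax) and another category, the code is dropped, so volunteers who carry only that code are not counted as Anthrax cases.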

Lists of releases for all traits, with counts, are in our Phenotype Counts Google Sheets.
