BigData@Heart latest publications

Homocysteine, B vitamins, and cardiovascular disease: a Mendelian randomization study  

 

Published: 23 April 2021, BMC Medicine

Authors: Shuai Yuan, Amy M Mason, Paul Carter, Stephen Burgess, Susanna C Larsson.  

Abstract 

Background

Whether a modestly elevated homocysteine level is causally associated with an increased risk of cardiovascular disease remains unestablished. We conducted a Mendelian randomization study to assess the associations of circulating total homocysteine (tHcy) and B vitamin levels with cardiovascular diseases in the general population. 

Methods

Independent single nucleotide polymorphisms associated with tHcy (n = 14), folate (n = 2), vitamin B6 (n = 1), and vitamin B12 (n = 14) at the genome-wide significance level were selected as instrumental variables. Summary-level data for 12 cardiovascular endpoints were obtained from genetic consortia, the UK Biobank study, and the FinnGen consortium. 

Results

Higher genetically predicted circulating tHcy levels were associated with an increased risk of stroke. For each one standard deviation (SD) increase in genetically predicted tHcy levels, the odds ratio (OR) was 1.11 (95% confidence interval (CI), 1.03, 1.21; p = 0.008) for any stroke, 1.26 (95% CI, 1.05, 1.51; p = 0.013) for subarachnoid hemorrhage, and 1.11 (95% CI, 1.03, 1.21; p = 0.011) for ischemic stroke. Higher genetically predicted folate levels were associated with decreased risk of coronary artery disease (OR per SD, 0.88; 95% CI, 0.78, 1.00; p = 0.049) and any stroke (OR per SD, 0.86; 95% CI, 0.76, 0.97; p = 0.012). Genetically predicted increased vitamin B6 levels were associated with a reduced risk of ischemic stroke (OR per SD, 0.88; 95% CI, 0.81, 0.97; p = 0.009). None of these associations persisted after multiple testing correction. There was no association between genetically predicted vitamin B12 and cardiovascular disease.
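To see why nominally significant p-values may not survive multiple testing correction, the sketch below applies a Bonferroni threshold. The abstract does not say which correction was used, so Bonferroni, the 0.05 family-wise error rate, and the count of 4 exposures × 12 endpoints = 48 tests are assumptions made for illustration only.

```python
# Illustrative only: the correction method and alpha are assumptions,
# not taken from the paper.
N_EXPOSURES = 4   # tHcy, folate, vitamin B6, vitamin B12 (from the abstract)
N_OUTCOMES = 12   # cardiovascular endpoints (from the abstract)
ALPHA = 0.05      # assumed family-wise error rate

# p-values as reported in the abstract
reported_p = {
    "tHcy -> any stroke": 0.008,
    "tHcy -> subarachnoid hemorrhage": 0.013,
    "tHcy -> ischemic stroke": 0.011,
    "folate -> coronary artery disease": 0.049,
    "folate -> any stroke": 0.012,
    "vitamin B6 -> ischemic stroke": 0.009,
}


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_EXPOSURES * N_OUTCOMES)
survivors = [name for name, p in reported_p.items() if p < threshold]

print(f"threshold = {threshold:.5f}")
print("associations below threshold:", survivors)
```

Under these assumptions the per-test threshold is roughly 0.001, and every reported p-value lies above it, which is consistent with the abstract's statement that no association persisted.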

Conclusions

This study reveals suggestive evidence that B vitamin therapy and lowering of tHcy may reduce the risk of stroke, particularly subarachnoid hemorrhage and ischemic stroke.

Read the full paper here: https://doi.org/10.1186/s12916-021-01977-8

 

Linked electronic health records for research on a nationwide cohort of more than 54 million people in England: data resource

 

Published: 7 April 2021, BMJ

Authors:  Angela Wood, Rachel Denholm, Sam Hollings, Jennifer Cooper, Samantha Ip, Venexia Walker, Spiros Denaxas, Ashley Akbari, Amitava Banerjee, William Whiteley, Alvina Lai, Jonathan Sterne, Cathie Sudlow.

Abstract

Objective

To describe a novel England-wide electronic health record (EHR) resource enabling whole population research on covid-19 and cardiovascular disease while ensuring data security and privacy and maintaining public trust.

Design 

Data resource comprising linked person level records from national healthcare settings for the English population, accessible within NHS Digital’s new trusted research environment.

Setting 

EHRs from primary care, hospital episodes, death registry, covid-19 laboratory test results, and community dispensing data, with further enrichment planned from specialist intensive care, cardiovascular, and covid-19 vaccination data.

Participants

54.4 million people alive on 1 January 2020 and registered with an NHS general practitioner in England.

Main measures of interest 

Confirmed and suspected covid-19 diagnoses, exemplar cardiovascular conditions (incident stroke or transient ischaemic attack and incident myocardial infarction) and all cause mortality between 1 January and 31 October 2020.

Results 

The linked cohort includes more than 96% of the English population. By combining person level data across national healthcare settings, data on age, sex, and ethnicity are complete for around 95% of the population. Among 53.3 million people with no previous diagnosis of stroke or transient ischaemic attack, 98 721 had a first ever incident stroke or transient ischaemic attack between 1 January and 31 October 2020, of which 30% were recorded only in primary care and 4% only in death registry records. Among 53.2 million people with no previous diagnosis of myocardial infarction, 62 966 had an incident myocardial infarction during follow-up, of which 8% were recorded only in primary care and 12% only in death registry records. A total of 959 470 people had a confirmed or suspected covid-19 diagnosis (714 162 in primary care data, 126 349 in hospital admission records, 776 503 in covid-19 laboratory test data, and 50 504 in death registry records). Although 58% of these were recorded in both primary care and covid-19 laboratory test data, 15% and 18%, respectively, were recorded in only one.
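The overlap figures above come down to set operations on person-level identifiers across linked sources. The sketch below shows that calculation on invented identifiers. The source names, the toy data, and the reading of "only one" as "in one of the two compared sources but not the other" are all assumptions, not details from the paper.

```python
def overlap_shares(sources, a, b):
    """Share of all recorded people found in both a and b, in a but not b,
    and in b but not a.

    sources maps a source name to a set of person identifiers. The
    denominator is everyone recorded in at least one source.
    """
    everyone = set().union(*sources.values())
    in_a, in_b = sources[a], sources[b]
    n = len(everyone)
    return {
        "both": len(in_a & in_b) / n,
        "only_" + a: len(in_a - in_b) / n,
        "only_" + b: len(in_b - in_a) / n,
    }


# Invented toy identifiers, for illustration only
toy_sources = {
    "primary_care": {1, 2, 3, 4, 5, 6},
    "lab_tests": {4, 5, 6, 7, 8},
    "hospital": {8, 9},
    "deaths": {10},
}

shares = overlap_shares(toy_sources, "primary_care", "lab_tests")
for key, value in shares.items():
    print(f"{key}: {value:.0%}")
```

Because the denominator is the union of all sources, the three shares need not sum to 100%: people recorded only in hospital or death records fall outside all three.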

Conclusions

This population-wide resource shows the importance of linking person level data across health settings to maximise completeness of key characteristics and to ascertain cardiovascular events and covid-19 diagnoses. Although this resource was initially established to support research on covid-19 and cardiovascular disease to benefit clinical care and public health and to inform healthcare policy, it can broaden further to enable a wide range of research.

Read the full paper here: https://doi.org/10.1136/bmj.n826

 

Machine learning for subtype definition and risk prediction in heart failure, acute coronary syndromes and atrial fibrillation: systematic review of validity and clinical utility

 

Published: 6 April 2021, BMC Medicine

Authors:  Amitava Banerjee, Suliang Chen, Ghazaleh Fatemifar, Mohamad Zeina, R. Thomas Lumbers, Johanna Mielke, Simrat Gill, Dipak Kotecha, Daniel F. Freitag, Spiros Denaxas, Harry Hemingway

Abstract

Background

Machine learning (ML) is increasingly used in research for subtype definition and risk prediction, particularly in cardiovascular diseases. No existing ML models are routinely used for cardiovascular disease management, and their phase of clinical utility is unknown, partly due to a lack of clear criteria. We evaluated ML for subtype definition and risk prediction in heart failure (HF), acute coronary syndromes (ACS) and atrial fibrillation (AF).

Methods

For ML studies of subtype definition and risk prediction, we conducted a systematic review in HF, ACS and AF, using PubMed, MEDLINE and Web of Science from January 2000 until December 2019. By adapting published criteria for diagnostic and prognostic studies, we developed a seven-domain, ML-specific checklist.

Results

Of 5918 studies identified, 97 were included. Across studies for subtype definition (n = 40) and risk prediction (n = 57), there was variation in data source, population size (median 606 and median 6769), clinical setting (outpatient, inpatient, different departments), number of covariates (median 19 and median 48) and ML methods. All studies were single disease, most were North American (n = 61/97) and only 14 studies combined definition and risk prediction. Subtype definition and risk prediction studies respectively had limitations in development (e.g. 15.0% and 78.9% of studies related to patient benefit; 15.0% and 15.8% had low patient selection bias), validation (12.5% and 5.3% externally validated) and impact (32.5% and 91.2% improved outcome prediction; no effectiveness or cost-effectiveness evaluations).

Conclusions

Studies of ML in HF, ACS and AF are limited by number and type of included covariates, ML methods, population size, country, clinical setting and focus on single diseases, not overlap or multimorbidity. Clinical utility and implementation rely on improvements in development, validation and impact, facilitated by simple checklists. We provide clear steps prior to safe implementation of machine learning in clinical practice for cardiovascular diseases and other disease areas.

Read the full paper here: https://doi.org/10.1186/s12916-021-01940-7

 

Identification of distinct phenotypic clusters in heart failure with preserved ejection fraction

 

Published: 29 March 2021, European Journal of Heart Failure

Authors: Alicia Uijl, Gianluigi Savarese, Ilonca Vaartjes, Ulf Dahlström, Jasper J. Brugts, Gerard C.M. Linssen, Vanessa van Empel, Hans-Peter Brunner-La Rocca, Folkert W. Asselbergs, Lars H. Lund, Arno W. Hoes, Stefan Koudstaal.

Abstract

Aims

We aimed to derive and validate clinically useful clusters of patients with heart failure with preserved ejection fraction (HFpEF; left ventricular ejection fraction ≥50%).

Methods and results

We derived a cluster model from 6909 HFpEF patients from the Swedish Heart Failure Registry (SwedeHF) and externally validated this in 2153 patients from the Chronic Heart Failure ESC-guideline based Cardiology practice Quality project (CHECK-HF) registry. In SwedeHF, the median age was 80 [interquartile range 72–86] years, 52% of patients were female and most frequent comorbidities were hypertension (82%), atrial fibrillation (68%), and ischaemic heart disease (48%). Latent class analysis identified five distinct clusters: cluster 1 (10% of patients) were young patients with a low comorbidity burden and the highest proportion of implantable devices; cluster 2 (30%) patients had atrial fibrillation, hypertension without diabetes; cluster 3 (25%) patients were the oldest with many cardiovascular comorbidities and hypertension; cluster 4 (15%) patients had obesity, diabetes and hypertension; and cluster 5 (20%) patients were older with ischaemic heart disease, hypertension and renal failure and were most frequently prescribed diuretics. The clusters were reproduced in the CHECK-HF cohort. Patients in cluster 1 had the best prognosis, while patients in clusters 3 and 5 had the worst age- and sex-adjusted prognosis.

Conclusions

Five distinct clusters of HFpEF patients were identified that differed in clinical characteristics, heart failure drug therapy and prognosis. These results confirm the heterogeneity of HFpEF and form a basis for tailoring trial design to individualized drug therapy in HFpEF patients.

Read the full paper here: https://doi.org/10.1002/ejhf.2169

 

Statistical integration of two omics datasets using GO2PLS

Published: 18 March 2021, BMC Bioinformatics

Authors: Zhujie Gu, Said el Bouhaddani, Jiayi Pei, Jeanine Houwing-Duistermaat, Hae-Won Uh

Abstract

Background

Nowadays, multiple omics data are measured on the same samples in the belief that these different omics datasets represent various aspects of the underlying biological systems. Integrating these omics datasets will facilitate the understanding of the systems. For this purpose, various methods have been proposed, such as Partial Least Squares (PLS), decomposing two datasets into joint and residual subspaces. Since omics data are heterogeneous, the joint components in PLS will contain variation specific to each dataset. To account for this, Two-way Orthogonal Partial Least Squares (O2PLS) captures the heterogeneity by introducing orthogonal subspaces and better estimates the joint subspaces. However, the latent components spanning the joint subspaces in O2PLS are linear combinations of all variables, while it might be of interest to identify a small subset relevant to the research question. To obtain sparsity, we extend O2PLS to Group Sparse O2PLS (GO2PLS) that utilizes biological information on group structures among variables and performs group selection in the joint subspace.
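Group selection of the kind described above is often implemented with group-wise soft thresholding, the proximal operator of the group lasso penalty: each group of weights is shrunk according to its Euclidean norm, and groups whose norm falls below the penalty are set to zero together. The sketch below illustrates that operator only. Whether GO2PLS uses exactly this form (for example, with or without scaling the penalty by group size) is an assumption, and the function name, group names, and weights are hypothetical.

```python
import math


def group_soft_threshold(weights, groups, lam):
    """Shrink each group of weights by its Euclidean norm.

    weights: list of floats (e.g. a loading vector).
    groups:  dict mapping a group name to the list of indices it contains.
    lam:     penalty; groups with norm <= lam are zeroed as a whole.
    """
    result = [0.0] * len(weights)
    for indices in groups.values():
        norm = math.sqrt(sum(weights[i] ** 2 for i in indices))
        if norm > lam:
            scale = 1.0 - lam / norm
            for i in indices:
                result[i] = scale * weights[i]
    return result


# Hypothetical loading vector: two CpG sites mapped to gene_A,
# three mapped to gene_B.
weights = [3.0, 4.0, 0.1, -0.2, 0.3]
groups = {"gene_A": [0, 1], "gene_B": [2, 3, 4]}

sparse_weights = group_soft_threshold(weights, groups, lam=1.0)
print(sparse_weights)
```

Here gene_A (norm 5) is kept and shrunk by a factor of 0.8, while gene_B (norm about 0.37) is removed as a whole, which is the all-or-nothing behaviour that makes the selected features interpretable at the group level.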

Results

The simulation study showed that introducing sparsity improved the feature selection performance. Furthermore, incorporating group structures increased robustness of the feature selection procedure. GO2PLS performed optimally in terms of accuracy of joint score estimation, joint loading estimation, and feature selection. We applied GO2PLS to datasets from two studies: TwinsUK (a population study) and CVON-DOSIS (a small case-control study). In the first, we incorporated biological information on the group structures of the methylation CpG sites when integrating the methylation dataset with the IgG glycomics data. The targeted genes of the selected methylation groups turned out to be relevant to the immune system, in which the IgG glycans play important roles. In the second, we selected regulatory regions and transcripts that explained the covariance between regulomics and transcriptomics data. The corresponding genes of the selected features appeared to be relevant to heart muscle disease.

Conclusions

GO2PLS integrates two omics datasets to help understand the underlying system that involves both omics levels. It incorporates external group information and performs group selection, resulting in a small subset of features that best explain the relationship between two omics datasets for better interpretability.

Read the full paper here: https://doi.org/10.1186/s12859-021-03958-3

 

Quantification of fibroblast growth factor 23 and N-terminal pro-B-type natriuretic peptide to identify patients with atrial fibrillation using a high-throughput platform: A validation study

Published: 3 February 2021, PLOS Medicine

Authors: Winnie Chua, Jonathan P. Law, Victor R. Cardoso, Yanish Purmah, Georgiana Neculau, Muhammad Jawad-Ul-Qamar, Kalisha Russell, Ashley Turner, Samantha P. Tull, Frantisek Nehaj, Paul Brady, Peter Kastner, André Ziegler, Georgios V. Gkoutos, Davor Pavlovic, Charles J. Ferro, Paulus Kirchhof, Larissa Fabritz.

Abstract

Background

Large-scale screening for atrial fibrillation (AF) requires reliable methods to identify at-risk populations. Using an experimental semi-quantitative biomarker assay, B-type natriuretic peptide (BNP) and fibroblast growth factor 23 (FGF23) were recently identified as the most suitable biomarkers for detecting AF in combination with simple morphometric parameters (age, sex, and body mass index [BMI]). In this study, we validated the AF model using standardised, high-throughput, high-sensitivity biomarker assays.

Methods and findings

For this study, 1,625 consecutive patients with either (1) diagnosed AF or (2) sinus rhythm with CHA2DS2-VASc score of 2 or more were recruited from a large teaching hospital in Birmingham, West Midlands, UK, between September 2014 and February 2018. Seven-day ambulatory ECG monitoring excluded silent AF. Patients with tachyarrhythmias apart from AF and incomplete cases were excluded. AF was diagnosed according to current clinical guidelines and confirmed by ECG. We developed a high-throughput, high-sensitivity assay for FGF23, quantified plasma N-terminal pro-B-type natriuretic peptide (NT-proBNP) and FGF23, and compared results to the previously used multibiomarker research assay. Data were fitted to the previously derived model, adjusting for differences in measurement platforms and known confounders (heart failure and chronic kidney disease). In 1,084 patients (46% with AF; median [Q1, Q3] age 70 [60, 78] years, median [Q1, Q3] BMI 28.8 [25.1, 32.8] kg/m², 59% males), patients with AF had higher concentrations of NT-proBNP (median [Q1, Q3] per 100 pg/ml: with AF 12.00 [4.19, 30.15], without AF 4.25 [1.17, 15.70]; p < 0.001) and FGF23 (median [Q1, Q3] per 100 pg/ml: with AF 1.93 [1.30, 4.16], without AF 1.55 [1.04, 2.62]; p < 0.001). Univariate associations remained after adjusting for heart failure and estimated glomerular filtration rate, known confounders of NT-proBNP and FGF23. The fitted model yielded a C-statistic of 0.688 (95% CI 0.656, 0.719), almost identical to that of the derived model (C-statistic 0.691; 95% CI 0.638, 0.744). The key limitation is that this validation was performed in a cohort that is very similar demographically to the one used in model development, calling for further external validation.
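For a binary outcome such as AF, the C-statistic is the probability that a randomly chosen patient with the condition receives a higher model score than a randomly chosen patient without it, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The scores and labels are invented for illustration and have no connection to the study data.

```python
def c_statistic(scores, labels):
    """Concordance for a binary outcome.

    Fraction of (case, non-case) pairs in which the case has the higher
    score; tied pairs contribute 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for other_score in non_cases:
            if case_score > other_score:
                concordant += 1.0
            elif case_score == other_score:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Invented model scores; label 1 = AF, 0 = no AF
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.4]
toy_labels = [1, 1, 1, 0, 0, 0]

toy_c = c_statistic(toy_scores, toy_labels)
print(f"C-statistic = {toy_c:.3f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in cohort size, so rank-based implementations are usually preferred for cohorts of the size reported here.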

Conclusions

Age, sex, and BMI combined with elevated NT-proBNP and elevated FGF23, quantified on a high-throughput platform, reliably identify patients with AF.

Read the full paper here: https://doi.org/10.1371/journal.pmed.1003405

 

Evaluation and improvement of the National Early Warning Score (NEWS2) for COVID-19: a multi-hospital study

 

Published: 21 January 2021, BMC Medicine

Authors: Ewan Carr, Rebecca Bendayan, Daniel Bean, Matt Stammers, Wenjuan Wang, Huayu Zhang, Thomas Searle, Zeljko Kraljevic, Anthony Shek, Hang T. T. Phan, Walter Muruet, Rishi K. Gupta, Anthony J. Shinton, Mike Wyatt, Ting Shi, Xin Zhang, Andrew Pickles, Daniel Stahl, Rosita Zakeri, Mahdad Noursadeghi, Kevin O’Gallagher, Matt Rogers, Amos Folarin, Andreas Karwath, Kristin E. Wickstrøm, Alvaro Köhn-Luque, Luke Slater, Victor Roth Cardoso, Christopher Bourdeaux, Aleksander Rygh Holten, Simon Ball, Chris McWilliams, Lukasz Roguski, Florina Borca, James Batchelor, Erik Koldberg Amundsen, Xiaodong Wu, Georgios V. Gkoutos, Jiaxing Sun, Ashwin Pinto, Bruce Guthrie, Cormac Breen, Abdel Douiri, Honghan Wu, Vasa Curcin, James T. Teo, Ajay M. Shah, Richard J. B. Dobson.

Abstract

Background

The National Early Warning Score (NEWS2) is currently recommended in the UK for the risk stratification of COVID-19 patients, but little is known about its ability to detect severe cases. We aimed to evaluate NEWS2 for the prediction of severe COVID-19 outcome and identify and validate a set of blood and physiological parameters routinely collected at hospital admission to improve upon the use of NEWS2 alone for medium-term risk stratification. 

Methods

Training cohorts comprised 1276 patients admitted to King’s College Hospital National Health Service (NHS) Foundation Trust with COVID-19 disease from 1 March to 30 April 2020. External validation cohorts included 6237 patients from five UK NHS Trusts (Guy’s and St Thomas’ Hospitals, University Hospitals Southampton, University Hospitals Bristol and Weston NHS Foundation Trust, University College London Hospitals, University Hospitals Birmingham), one hospital in Norway (Oslo University Hospital), and two hospitals in Wuhan, China (Wuhan Sixth Hospital and Taikang Tongji Hospital). The outcome was severe COVID-19 disease (transfer to intensive care unit (ICU) or death) at 14 days after hospital admission. Age, physiological measures, blood biomarkers, sex, ethnicity, and comorbidities (hypertension, diabetes, cardiovascular, respiratory and kidney diseases) measured at hospital admission were considered in the models.

Results

A baseline model of ‘NEWS2 + age’ had poor-to-moderate discrimination for severe COVID-19 infection at 14 days (area under receiver operating characteristic curve (AUC) in training cohort = 0.700, 95% confidence interval (CI) 0.680, 0.722; Brier score = 0.192, 95% CI 0.186, 0.197). A supplemented model adding eight routinely collected blood and physiological parameters (supplemental oxygen flow rate, urea, age, oxygen saturation, C-reactive protein, estimated glomerular filtration rate, neutrophil count, neutrophil/lymphocyte ratio) improved discrimination (AUC = 0.735; 95% CI 0.715, 0.757), and these improvements were replicated across seven UK and non-UK sites. However, there was evidence of miscalibration with the model tending to underestimate risks in most sites.
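Discrimination and calibration measure different things: a model can rank patients reasonably well yet still underestimate absolute risk. The sketch below computes the Brier score (mean squared difference between predicted probability and observed outcome) and a simple mean calibration gap (observed event rate minus mean predicted risk). The abstract does not say how calibration was assessed, so the gap measure is an assumption chosen for simplicity, and the predictions and outcomes are invented.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(outcomes)


def mean_calibration_gap(probabilities, outcomes):
    """Observed event rate minus mean predicted risk.

    A positive value means the model underestimates risk on average.
    """
    if len(probabilities) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    observed_rate = sum(outcomes) / len(outcomes)
    mean_predicted = sum(probabilities) / len(probabilities)
    return observed_rate - mean_predicted


# Invented predictions; outcome 1 = ICU transfer or death within 14 days
toy_probabilities = [0.1, 0.2, 0.6, 0.3]
toy_outcomes = [0, 1, 1, 1]

toy_brier = brier_score(toy_probabilities, toy_outcomes)
toy_gap = mean_calibration_gap(toy_probabilities, toy_outcomes)
print(f"Brier score = {toy_brier:.3f}")
print(f"calibration gap = {toy_gap:+.2f}")
```

In this toy example the model predicts an average risk of 30% while 75% of patients have the outcome, so the gap is positive, the same direction of miscalibration the abstract reports for most external sites.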

Conclusions

NEWS2 score had poor-to-moderate discrimination for medium-term COVID-19 outcome which raises questions about its use as a screening tool at hospital admission. Risk stratification was improved by including readily available blood and physiological parameters measured at hospital admission, but there was evidence of miscalibration in external sites. This highlights the need for a better understanding of the use of early warning scores for COVID.

Read the full paper here: https://doi.org/10.1186/s12916-020-01893-3

Published on: 06/30/2021