Version: 1.0.0 | Published: 10 Apr 2024 | Updated: 407 days ago
Documentation
Description:
Background: Well-constructed synthetic data provides an environment for health care algorithm development and experimentation that is free of governance risk. This includes the evaluation of new treatment models, care management systems, clinical decision support, and more. Synthetic data is particularly useful for rare diseases, where real data may be in short supply, and for replicating disease in less common patient demographics (such as certain ethnicities or co-morbid combinations).
Familial hypertrophic cardiomyopathy (HCM) is a rare genetic condition characterized by thickening (hypertrophy) of the cardiac muscle, usually of the interventricular septum. Many affected individuals have no symptoms. Others may experience chest pain, breathlessness and fainting. Arrhythmias can be life-threatening, and HCM is associated with an increased risk of sudden death. Some affected individuals develop potentially fatal heart failure, which may require heart transplantation. Approximately 130,000 people have HCM in the UK, but there is a significant burden of undiagnosed disease and diagnostic delay. Inheritance is autosomal dominant, with the most commonly involved genes being MYH7, MYBPC3, TNNT2, and TNNI3.
PIONEER geography: The West Midlands (WM) has a population of 5.9 million and includes a diverse ethnic and socio-economic mix. There is a higher than average percentage of minority ethnic groups and rare diseases including HCM, with a regional specialist HCM service.
Electronic Health Records (EHR): University Hospitals Birmingham NHS Foundation Trust (UHB) is one of the largest NHS Trusts in England, providing direct acute services and specialist care across four hospital sites, with 2.2 million patient episodes per year, 2750 beds and 100 ITU beds. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham Systems), a shared primary and secondary care record (Your Care Connected) and a patient portal “My Health”.
Scope: This synthetic dataset reflects the diversity of the WM and the granularity of the HCM service. It replicates longitudinal data including the care journey prior to diagnosis. It includes ECHO provocation, CPEX, ECG including evidence of pre-excitation, as well as symptoms, family history, genetic testing and HCM phenotype. Outcomes and procedures are included.
Available supplementary data: Synthetic ambulance data. Real HCM patient data, including from secondary and ambulance settings, so that the synthetic data can serve as an algorithm “build” dataset and the real data as a “test” dataset.
Available supplementary support: Analytics; model build, validation and refinement; A.I.; data partner support for the ETL (extract, transform and load) process; clinical expertise; patient and end-user access; purchaser access; regulatory requirements; data-driven trials; “fast screen” services.
Is Part Of:
NOT APPLICABLE
Coverage
Spatial:
United Kingdom, England, West Midlands
Typical Age Range:
0-150
Follow Up:
OTHER
Physical Sample Availability:
NOT AVAILABLE
Pathway:
The West Midlands (WM) has a population of 5.9 million and includes a diverse
ethnic and socio-economic mix. There is a higher than average percentage of
minority ethnic groups with Birmingham having a population which is more than
40% non-white. WM has a large number of elderly residents but Birmingham is one
of the youngest cities in the UK. There is social deprivation and Birmingham’s
population suffers from particularly high rates of illness, including
cerebrovascular disease, physical inactivity, obesity, smoking, hypertension,
ischaemic heart disease and diabetes. There are also high levels of rare
diseases, especially immunometabolic conditions. This in turn leads to high
levels of infections. The patients included in this dataset are representative
of this diverse population and also include a wide age range. University
Hospitals Birmingham NHS Foundation Trust (UHB) is one of the largest NHS Trusts
in England, providing direct acute services and specialist care across four
hospital sites, with 2.2 million patient episodes per year, 2750 beds and 100
ITU beds. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham
Systems), a shared primary and secondary care record (Your Care Connected) and a
patient portal “My Health”.
Provenance
Origin
Purposes:
OTHER
Sources:
MACHINE GENERATED
Collection Situations:
IN-PATIENTS
Temporal
Accrual Periodicity:
STATIC
Distribution Release Date:
22 February 2021
Time Lag:
NOT APPLICABLE
Accessibility
Access
Access Service:
Trusted Research Environments (TREs) are built using Microsoft Azure services and
hosted in the UK to provide research teams with a safe, secure and agile
environment in which users can quickly analyse, interpret and form an enriched
view of primary care information through a range of integrated datasets. Health
data collated from multiple sources is ingested into a secure data lake, from
which subsets of data are made available to research teams on approval of a data
request. Once a request is approved, a customer-specific TRE is made available
with a standard set of analytical tools from Microsoft, including Azure
Databricks, Azure Machine Learning, Azure SQL and Azure Synapse (for large-scale
data warehouses). Specific tools can be provided at an additional cost over the
standard platform data access charge, and the PIONEER team will work with you to
determine your exact needs. Access to the TRE is managed using virtual desktop
technology to provide a safe and secure end-user experience. This design allows
PIONEER to create TREs rapidly in response to customer requirements.
Access Request Cost:
www.pioneerdatahub.co.uk/data/data-services-costs/
Delivery Lead Time:
1-2 MONTHS
Jurisdictions:
GB
Data Controller:
University Hospitals Birmingham NHS Foundation Trust
Data Processor:
NOT APPLICABLE
Usage
Data Use Limitations:
GENERAL RESEARCH USE
Data Use Requirements:
PROJECT SPECIFIC RESTRICTIONS
Resource Creators:
- This publication uses data from PIONEER, an ethically approved database and analytical environment (East Midlands Derby Research Ethics 20/EM/0158)
Format and Standards
Vocabulary Encoding Schemes:
LOCAL
Conforms To:
LOCAL
Languages:
en
Formats:
CSV
Enrichment and Linkage
Derivations:
NOT AVAILABLE
Observations
Statistical Population:
Events
Population Description:
30,000 synthetically generated patient records, generated using a Naïve Bayes model
Population Size:
30000
Measured Property:
Count
Observation Date:
22 February 2021
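Illustrative note: this record states only that the 30,000 records were generated using a Naïve Bayes model; the generation code itself is not described here. The sketch below shows the general idea under stated assumptions: every field is categorical, one field is treated as the class, and the remaining fields are sampled independently given that class. The field names (`phenotype`, `chest_pain`, `family_history`), the toy input rows and the function names are hypothetical and are not taken from the dataset.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy input; these rows are invented for illustration only.
TOY_RECORDS = [
    {"phenotype": "obstructive", "chest_pain": "yes", "family_history": "yes"},
    {"phenotype": "obstructive", "chest_pain": "no", "family_history": "yes"},
    {"phenotype": "obstructive", "chest_pain": "yes", "family_history": "no"},
    {"phenotype": "non-obstructive", "chest_pain": "no", "family_history": "yes"},
    {"phenotype": "non-obstructive", "chest_pain": "no", "family_history": "no"},
]


def fit_naive_bayes(records, class_field):
    """Count class frequencies and, per class, the frequency of each field value."""
    class_counts = Counter(r[class_field] for r in records)
    feature_counts = defaultdict(Counter)  # (class, field) -> Counter of values
    for r in records:
        for field, value in r.items():
            if field != class_field:
                feature_counts[(r[class_field], field)][value] += 1
    return class_counts, feature_counts


def _draw(counter, rng):
    """Draw one value with probability proportional to its observed count."""
    values = list(counter)
    weights = [counter[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def sample_records(class_counts, feature_counts, class_field, n, seed=0):
    """Sample n records: the class first, then each field independently given the class."""
    rng = random.Random(seed)
    fields = sorted({field for (_, field) in feature_counts})
    synthetic = []
    for _ in range(n):
        cls = _draw(class_counts, rng)
        record = {class_field: cls}
        for field in fields:
            record[field] = _draw(feature_counts[(cls, field)], rng)
        synthetic.append(record)
    return synthetic


class_counts, feature_counts = fit_naive_bayes(TOY_RECORDS, "phenotype")
synthetic = sample_records(class_counts, feature_counts, "phenotype", n=10, seed=0)
```

Because each field is drawn independently given the class, a sketch like this reproduces per-class value frequencies but not correlations between fields within a class, which is worth keeping in mind when comparing synthetic “build” results against real “test” data.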