Version: 17.0.0 | Published: 30 Oct 2023 | Updated: 572 days ago

Genomics England - Common

Dataset

Documentation

Description:
Data views that are common to both the rare disease and the cancer domains. This data pertains to sample handling, genome sequencing, and participant data.

Data relating to participants:
  • participant: Data on each individual participant in the 100,000 Genomes Project, e.g. personal information (such as relatives or self-reported ethnicity); points of contact with the Project (e.g. handling Genomic Medicine Centre or Trust); and a record of the status of their clinical review.
  • death_details: Data on participant deaths submitted by Genomic Medicine Centres (GMCs), likely less complete than the data collected by ONS and NHSE.

Data relating to samples:
  • clinic_sample: Data describing the taking and handling of participant samples at the Genomic Medicine Centres, i.e. in the clinic, as well as the type of samples obtained. Because of the complexities of handling and managing tumour tissue samples in a clinical setting, many fields are cancer-specific.
  • clinic_sample_quality_check_result: Data describing the quality control of obtaining and handling participant samples at the Genomic Medicine Centres, i.e. in the clinic.
  • laboratory_sample: Data describing the handling of samples at the biorepository and in preparation for sequencing, as well as the type of sample.
  • plated_sample: Data describing the handling and QC of samples at Illumina (the sequencing provider).
  • laboratory_sample_omics_availability: Availability of samples collected from participants in the 100,000 Genomes Project for the purpose of omics research. Data includes: Participant ID, Sample Type (e.g. Serum, RNA Blood), the number of aliquots of that sample type for that participant, and the availability status, i.e. whether the sample has already been used for a research project.

Research proposals for the use of these samples can be submitted, via the GECIP team, to the Scientific Advisory Committee and Access Review Committee.
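As an illustration of how the laboratory_sample_omics_availability view might be used, the following minimal Python sketch filters for samples of a given type that are still available. It is an assumption-laden example, not the Research Environment's actual interface: the field names (participant_id, sample_type, aliquots, available) are hypothetical stand-ins for the four data items listed above, and the rows are invented placeholders rather than real data.

```python
from dataclasses import dataclass


@dataclass
class OmicsSample:
    # Hypothetical field names mirroring the four items the description lists;
    # the real column names in the view may differ.
    participant_id: str
    sample_type: str
    aliquots: int
    available: bool  # False if the sample has already been used for research


def available_samples(rows, sample_type):
    """Return rows of the requested type that are unused and have aliquots left."""
    return [
        r for r in rows
        if r.sample_type == sample_type and r.available and r.aliquots > 0
    ]


# Invented placeholder rows, for illustration only.
rows = [
    OmicsSample("P001", "Serum", 2, True),
    OmicsSample("P001", "RNA Blood", 1, False),
    OmicsSample("P002", "RNA Blood", 3, True),
]

matches = available_samples(rows, "RNA Blood")
print([r.participant_id for r in matches])
```

Any real selection of this kind would still need to go through the proposal route described above before samples are released.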
Is Part Of:
100K Primary Data

Coverage

Spatial:
UK
Typical Age Range:
0-150
Follow Up:
OTHER
Physical Sample Availability:
DNA
Pathway:
Linked datasets cover secondary care.

Provenance

Origin

Purposes:
  • CARE
  • DISEASE REGISTRY
  • OTHER
Sources:
  • ELECTRONIC SURVEY
  • EPR
  • LIMS
  • MACHINE GENERATED
Collection Situations:
  • CLINIC
  • IN-PATIENTS
  • OUTPATIENTS

Temporal

Accrual Periodicity:
QUARTERLY
Distribution Release Date:
30 March 2023
Start Date:
01 January 2012
End Date:
31 December 2022
Time Lag:
2-6 MONTHS

Accessibility

Access

Access Service:
More information about the Genomics England Research Environment can be found here:
  • https://www.genomicsengland.co.uk/about-genomics-england/research-environment/
  • https://research-help.genomicsengland.co.uk/display/GERE/1.+The+Genomics+England+Research+Environment
Genomics England 100k participants have consented to longitudinal lifetime follow-up and to being safely recontacted through our clinical network. BRST (Bioinformatics Research Services) is a team of bioinformaticians who know the dataset inside out and provide consultancy projects on a case-by-case basis. Our network of clinical and medical experts can be made available on a case-by-case basis. Researchers have the opportunity to work with, and access, the GeCIP network, a community of world-leading experts in specific cancers and rare diseases.
Access Request Cost:
Fees depend on the type of access required. Raw data is not eligible for export. Summary-level data may be exported provided that the export is approved through the Genomics England Airlock Process.
Delivery Lead Time:
2-6 MONTHS
Jurisdictions:
GB-GBN
Data Controller:
GENOMICS ENGLAND
Data Processor:
GENOMICS ENGLAND

Usage

Data Use Limitations:
GENERAL RESEARCH USE
Data Use Requirements:
  • ETHICS APPROVAL REQUIRED
  • PROJECT SPECIFIC RESTRICTIONS
  • PUBLICATION MORATORIUM
Resource Creators:
  • The 100,000 Genomes Project Protocol v3, Genomics England. doi:10.6084/m9.figshare.4530893.v3. 2017. Publications that use the Genomics England Database should include an author as: Genomics England Research Consortium. Please see publication policy.

Format and Standards

Vocabulary Encoding Schemes:
  • LOCAL
  • ICD10
  • NHS NATIONAL CODES
  • ODS
  • OPCS4
  • READ
  • SNOMED CT
  • OTHER
Languages:
en
Formats:
Multiple formats available

Enrichment and Linkage

Qualified Relations:
  • HES Accident and Emergency
  • HES Outpatient Care
  • Diagnostic Imaging Dataset (DID)
  • Patient Reported Outcome Measures (PROMs)
  • Cancer Registration (AV) tables
  • HES Admitted Patient Care
  • Cancer waiting times (CWT)
  • Lung Cancer Data Audit (LUCADA)
  • PHE Diagnostic Imaging Dataset (NCRAS_DID)
  • Systemic Anti-Cancer Therapy Data Set (SACT)
  • Office for National Statistics - Death details data (ONS)
  • National Radiotherapy Dataset (RTDS)
  • Mental Health Minimum Data Set (MHMDS)
Derivations:
Not Known

Observations

Statistical Population | Population Description | Population Size | Measured Property | Observation Date
Findings | Cancer Tumour - Number of genomes | 17,003 | Count | 30 March 2023
Findings | Cancer Germline - Number of genomes | 32,753 | Count | 30 March 2023
Findings | Rare Disease - Number of genomes | 73,517 | Count | 30 March 2023
Persons | Rare Disease Participants | 72,874 | Count | 30 March 2023
Persons | Cancer Participants | 15,624 | Count | 30 March 2023
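
The observation counts above can be combined with simple arithmetic. The short Python sketch below totals the three genome counts reported for 30 March 2023. The dictionary keys are illustrative labels, not field names from the dataset. Participant counts are deliberately not summed, since this record does not say whether a person can appear in both the rare disease and cancer domains.

```python
# Genome counts as reported in the Observations for 30 March 2023.
# Keys are illustrative labels, not dataset field names.
genome_counts = {
    "cancer_tumour": 17_003,
    "cancer_germline": 32_753,
    "rare_disease": 73_517,
}

# Total number of genomes across the three reported categories.
genome_total = sum(genome_counts.values())

print(f"Total genomes reported: {genome_total:,}")
# prints "Total genomes reported: 123,273"
```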