Version: 1.0.0 | Published: 4 Feb 2026 | Updated: 24 days ago

Genomics England - Transcriptomics

Dataset

Documentation

Description:
The Genomics England 100kGP Transcriptomics Pilot and Extension comprises RNA sequencing of a subset of rare disease probands from the 100,000 Genomes Project who did not receive a genetic diagnosis through the Genomics England Interpretation Pipeline (7840 samples from 7829 probands: 5546 samples in the initial Pilot project and 2294 samples in the Extension). We prioritised probands who were found to carry variants of unknown significance (VUS). Priorities were based on:
  • Variants highlighted through SpliceAI
  • Autosomal recessive disorders with only a single pathogenic variant identified
  • GMC-selected VUS AND contribution to phenotype partial / unknown AND variant type likely to affect RNA processing
  • Outcome questionnaire and a call to clinicians
  • VUS with a high Exomiser score AND variant likely to result in detectable abnormal RNA processing
  • Disorder category ranking by Genomics England on the basis of likely monogenic cause (ranks 1-5) for participants from 1.1 AND no diagnosis in outcome questionnaire
  • Call to GMCs / clinicians to propose cases based on a strong phenotype for a monogenic disorder with no lead from WGS
  • Review of whether an RNA sample is available or a fresh RNA sample is required
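
Several of the priorities above combine conditions with AND. As a reading aid only, the sketch below expresses four of them as boolean checks. It is not the Genomics England selection procedure: the `Candidate` class, all of its field names, and the criterion labels are hypothetical, and because the description gives no numeric Exomiser threshold, "high score" is modelled as a plain flag.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical per-proband flags; field names are illustrative assumptions."""
    spliceai_highlighted: bool = False
    recessive_single_pathogenic_variant: bool = False
    gmc_selected_vus: bool = False
    phenotype_contribution: str = "full"  # assumed values: "full", "partial", "unknown"
    variant_may_affect_rna_processing: bool = False
    vus_high_exomiser_score: bool = False  # no threshold is stated in the description
    detectable_abnormal_rna_expected: bool = False


def matched_criteria(c: Candidate) -> list[str]:
    """Return labels of the illustrative criteria that a candidate satisfies."""
    matched = []
    if c.spliceai_highlighted:
        matched.append("spliceai")
    if c.recessive_single_pathogenic_variant:
        matched.append("recessive_single_variant")
    # All three conditions must hold, mirroring the AND wording.
    if (c.gmc_selected_vus
            and c.phenotype_contribution in {"partial", "unknown"}
            and c.variant_may_affect_rna_processing):
        matched.append("gmc_selected_vus")
    if c.vus_high_exomiser_score and c.detectable_abnormal_rna_expected:
        matched.append("exomiser_vus")
    return matched
```

A candidate with a GMC-selected VUS, a partial contribution to phenotype, and a variant type likely to affect RNA processing would match `gmc_selected_vus`; changing the contribution to `"full"` would remove the match.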

Coverage

Spatial:
UK
Typical Age Range:
0-150
Follow Up:
Other
Pathway:
Linked datasets cover secondary care.

Provenance

Origin

Purposes:
  • Care
  • Disease registry
  • Study
Sources:
Machine generated
Collection Situations:
Clinic

Temporal

Accrual Periodicity:
Quarterly
Distribution Release Date:
11 September 2025
Start Date:
21 December 2023
Time Lag:
Other

Accessibility

Access

Access Service:
More information about the Genomics England Research Environment can be found at https://www.genomicsengland.co.uk/research and https://re-docs.genomicsengland.co.uk/welcome/. Genomics England 100k participants have consented to longitudinal lifetime follow-up and to safe recontact through our clinical network.
Access Request Cost:
Fees depend on the type of access required. Raw data is not eligible for export. Summary-level data may be exported provided that the export is approved through the Genomics England Airlock Process.
Delivery Lead Time:
2-6 months
Data Controller:
GENOMICS ENGLAND
Data Processor:
GENOMICS ENGLAND

Usage

Data Use Limitations:
General research use
Data Use Requirements:
  • Ethics approval required
  • Project-specific restrictions
  • Publication moratorium
Resource Creators:
The 100,000 Genomes Project Protocol v3, Genomics England. doi:10.6084/m9.figshare.4530893.v3. 2017. Publications that use the Genomics England Database should list the Genomics England Research Consortium as an author. Please see the publication policy.

Format and Standards

Vocabulary Encoding Schemes:
OTHER
Conforms To:
OTHER
Languages:
en
Formats:
  • DRAGEN output
  • RNA-Seq QC output

Observations

Statistical Population:
Persons
Population Description:
A subset of rare disease probands from the 100,000 Genomes Project who did not receive a genetic diagnosis through the Genomics England Interpretation Pipeline. 7840 samples from 7829 probands: 5546 samples in the initial Pilot project, 2294 samples in the Extension
Population Size:
7840
Measured Property:
RNA-Seq
Observation Date:
25 September 2025
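
The population size of 7840 matches the sample count in the population description, not the proband count of 7829. The short check below reconciles the quoted figures. It is a sketch that uses only numbers stated in this record; the constant and function names are illustrative.

```python
# Counts quoted in this record.
PILOT_SAMPLES = 5546
EXTENSION_SAMPLES = 2294
TOTAL_SAMPLES = 7840
TOTAL_PROBANDS = 7829


def reconcile_counts() -> dict:
    """Check that the Pilot and Extension counts add up and compare samples to probands."""
    return {
        # Pilot + Extension should equal the reported total number of samples.
        "parts_match_total": PILOT_SAMPLES + EXTENSION_SAMPLES == TOTAL_SAMPLES,
        # A positive difference means there are more samples than probands.
        "samples_minus_probands": TOTAL_SAMPLES - TOTAL_PROBANDS,
    }


print(reconcile_counts())
```

The two parts sum to the reported total, and there are 11 more samples than probands. The record does not say how those additional samples are distributed across probands.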