Version: 1.0.0 | Published: 8 Oct 2024 | Updated: 227 days ago

OPTIMAM Mammographic Image Database

Dataset

Documentation

Description:
The development of artificial intelligence software to improve the outcomes of breast screening relies on the availability of well-curated image databases. The OPTIMAM Mammography Image Database (OMI-DB) was created to provide a centralized, fully annotated dataset for research. It was originally built to support the projects OPTIMAM (2008–2013) and OPTIMAM2 (2013–2018), both funded by Cancer Research UK (CRUK), which evaluated how various factors affect breast cancer detection on mammograms.
The images are derived from screening centres in the United Kingdom and combined with systematically collected data on the current screening episode, as well as previous and subsequent episodes. In the United Kingdom, the National Health Service Breast Screening Programme (NHSBSP) invites women to attend breast screening every 3 years between the ages of 50 and 70 years. A screening episode is one attendance at screening by a woman and includes any immediate workup imaging (assessment) if she was recalled for further investigation of a suspicious region on the screening mammograms. Any pathologic finding is also included, and the episode ends with histologic diagnosis or treatment for all lesions.
Our objective was to collect mammograms for women with screen-detected cancers as well as representative samples of normal and benign screening cases. Since 2011, "for processing" and "for presentation" screening mammograms and prior mammograms have been collected for all screen-detected and interval cancers from several screening centres. All mammography images and data associated with the initial screening attendance, further assessment, and surgical outcomes were collected as a screening episode. In addition to the continuous collection of cancers, images and clinical data were collected for all women screened during 2014, and for a random selection of 25% of all women screened in 2012, 2013, and 2015, at two of the three sites.
Collection into the database is ongoing, and each case is updated with new information and further screening episodes. The associated data comprise radiologic, clinical, and pathologic information extracted from the National Breast Screening System (NBSS). Information on screening history, previous occurrences of cancer, biopsy results, and surgical procedures is also collected from NBSS. The exact radiologic locations of lesions are not stored in NBSS; this information, which is important for training and evaluating algorithms, is instead collected in OMI-DB. Experienced, UK-accredited mammography readers (radiologists and advanced practice radiographers) annotate the images at their own site, with reference to records made at the time of initial mammography interpretation and at further (assessment) workup (magnification views, ultrasound, and biopsy). This information is used to define rectangular regions of interest indicating the location and area of lesions, together with other attributes such as radiologic appearance and conspicuity.
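As an illustration of how rectangular regions of interest can be used when evaluating a detection algorithm, the following is a minimal Python sketch. It is not the OMI-DB annotation schema: the class name `RectROI`, the coordinate fields `x1`, `y1`, `x2`, `y2`, the `conspicuity` attribute, and the example pixel values are all assumptions made for illustration. It computes the area of a rectangle and the intersection over union (IoU) between an annotated region and a hypothetical predicted region, a common overlap measure for bounding boxes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RectROI:
    """Axis-aligned rectangle in pixel coordinates (hypothetical layout)."""
    x1: int  # left column, inclusive
    y1: int  # top row, inclusive
    x2: int  # right column, exclusive
    y2: int  # bottom row, exclusive
    conspicuity: str = "unknown"  # hypothetical attribute name and values

    def area(self) -> int:
        """Area in pixels; 0 for a degenerate rectangle."""
        return max(0, self.x2 - self.x1) * max(0, self.y2 - self.y1)

    def contains(self, x: int, y: int) -> bool:
        """True if pixel (x, y) lies inside the rectangle."""
        return self.x1 <= x < self.x2 and self.y1 <= y < self.y2


def iou(a: RectROI, b: RectROI) -> float:
    """Intersection over union of two rectangles; 0.0 if they do not overlap."""
    overlap_w = max(0, min(a.x2, b.x2) - max(a.x1, b.x1))
    overlap_h = max(0, min(a.y2, b.y2) - max(a.y1, b.y1))
    intersection = overlap_w * overlap_h
    union = a.area() + b.area() - intersection
    return intersection / union if union else 0.0


# Made-up example values: one reader annotation and one algorithm output.
annotated = RectROI(100, 200, 300, 500, conspicuity="subtle")
predicted = RectROI(150, 250, 350, 550)
overlap = iou(annotated, predicted)
```

A study would typically choose an IoU threshold above which a prediction counts as a hit; that threshold is a design choice and is not specified by the dataset description.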

Coverage

Spatial:
United Kingdom
Typical Age Range:
50-70
Follow Up:
Continuous
Pathway:
The images are derived from screening centres in the United Kingdom and combined with systematically collected data on the current screening episode, as well as previous and subsequent episodes. In the United Kingdom, the National Health Service Breast Screening Programme (NHSBSP) invites women to attend breast screening every 3 years between the ages of 50 and 70 years. A screening episode is one attendance at screening by a woman and includes any immediate workup imaging (assessment) if she was recalled for further investigation of a suspicious region on the screening mammograms. Any pathologic finding is also included, and the episode ends with histologic diagnosis or treatment for all lesions. At some screening centres, younger and older women are also invited for screening as part of the national age trial. Some women in high-risk groups receive annual invitations to screening. Our objective was to collect mammograms for women with screen-detected cancers as well as representative samples of normal and benign screening cases.

Provenance

Origin

Purposes:
Disease registry
Sources:
Other
Collection Situations:
Other

Temporal

Accrual Periodicity:
Continuous
Start Date:
01 January 2011
Time Lag:
Variable

Accessibility

Access

Delivery Lead Time:
Not applicable
Jurisdictions:
GB-ENG
Data Controller:
Cancer Research UK (CRUK) and Royal Surrey NHS Foundation Trust are joint data controllers. The OPTIMAM Data Access Committee (https://medphys.royalsurrey.nhs.uk/omidb/the-steering-committee/) administers and reviews data access requests. Please apply for access at https://medphys.royalsurrey.nhs.uk/omidb/apply-for-access/
Data Processor:
Data scientists from Royal Surrey NHS Foundation Trust manage the collection, storage, and distribution of the dataset.

Usage

Data Use Limitations:
Project-specific restrictions
Data Use Requirements:
Project-specific restrictions
Resource Creators:
Cancer Research UK; Royal Surrey NHS Foundation Trust

Format and Standards

Vocabulary Encoding Schemes:
LOCAL
Languages:
en
Formats:
  • DICOM
  • JSON

Observations

Statistical Population | Population Description | Population Size | Measured Property | Observation Date
Events | Interval cancer events | 3500 | Number of interval cancer events | 26 May 2023
Findings | Screen-detected cancer findings | 18000 | Count of screen-detected cancer findings | 26 May 2023
Persons | Total number of clients in the dataset | 540000 | Total count | 26 May 2023