Data Model

Each sequencing run produces log files, instrument health data, run metrics, base call information (*.bcl files), and other data. BaseSpace Sequence Hub demultiplexes base call information to create the FASTQ files used in secondary analysis.

Common Terms

Biosamples represent the source biological sample being sequenced. They are associated with data aggregated from multiple sequencing runs according to the sample name provided in the samplesheet of each run.
Samples represent a set of FASTQ files from a single sequencing run, according to the sample name provided in the samplesheet.
Libraries are produced when a biosample is prepped with a library prep kit.
Pools are aliquots of one or more libraries, pooled together in order to be placed in a flowcell lane.
Datasets are sets of files produced by a BaseSpace application. Some views will refer to them as "Other Datasets" to distinguish them from Datasets containing FASTQ files. These were formerly referred to as "App Results."
FASTQ Datasets are sets of FASTQ files produced by the FASTQ-Generation or BclConvert apps. Given their prominent place in the BaseSpace data model, they're often treated as a distinct type from "Other Datasets," which aids in data management tasks like filtering and sorting.
Projects are the containers for datasets and dataset files, which can include FASTQ, BAM, and VCF files. Projects can be associated with runs, analyses, and other entities in BaseSpace Sequence Hub. If a given project contains FASTQ files, it will also be associated with one or more Samples and Biosamples.
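
The relationships between these terms can be sketched in code. The following is a minimal illustration only, not the BaseSpace API: every class, field, and function name here (Dataset, Sample, Biosample, Project, group_into_biosamples) is a hypothetical name chosen for this example, and it assumes that Samples are grouped into a Biosample purely by matching sample name.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """A set of files produced by an app (hypothetical model)."""
    name: str
    files: List[str]
    is_fastq: bool = False  # True for FASTQ Datasets, False for "Other Datasets"


@dataclass
class Sample:
    """FASTQ files for one sample name from a single sequencing run."""
    name: str
    run_id: str
    fastq: Dataset


@dataclass
class Biosample:
    """The source biological sample, aggregating Samples across runs."""
    name: str
    samples: List[Sample] = field(default_factory=list)

    def fastq_files(self) -> List[str]:
        # Collect FASTQ files from every run that sequenced this biosample.
        return [f for s in self.samples for f in s.fastq.files]


@dataclass
class Project:
    """A container for datasets and their files."""
    name: str
    datasets: List[Dataset] = field(default_factory=list)

    def fastq_datasets(self) -> List[Dataset]:
        return [d for d in self.datasets if d.is_fastq]

    def other_datasets(self) -> List[Dataset]:
        return [d for d in self.datasets if not d.is_fastq]


def group_into_biosamples(samples: List[Sample]) -> Dict[str, Biosample]:
    """Assumption: Samples sharing a sample name belong to one Biosample."""
    by_name: Dict[str, Biosample] = {}
    for s in samples:
        by_name.setdefault(s.name, Biosample(s.name)).samples.append(s)
    return by_name
```

In this sketch, a Sample always points at exactly one run, while a Biosample can span several, which mirrors the distinction drawn in the definitions above.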

Analyzing FASTQ Data

BaseSpace apps that analyze FASTQ files can accept either Biosamples or Samples in the input form, and the system will use the proper set of FASTQ files in each situation.
BaseSpace users can select their preferred input at the top of the form, and BaseSpace will load the correct controls into the form.
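
The difference between the two input types can be illustrated with a short sketch. This is not how BaseSpace is implemented and does not use any BaseSpace API; FastqRecord and resolve_fastq are hypothetical names, and the sketch assumes a Biosample input selects FASTQ files from every run with a matching sample name, while a Sample input selects only those from one run.

```python
from typing import List, NamedTuple, Optional


class FastqRecord(NamedTuple):
    """One FASTQ file, tagged with its sample name and source run (hypothetical)."""
    sample_name: str
    run_id: str
    path: str


def resolve_fastq(
    records: List[FastqRecord],
    input_type: str,
    name: str,
    run_id: Optional[str] = None,
) -> List[str]:
    """Return the FASTQ paths an analysis would receive for the chosen input."""
    if input_type == "biosample":
        # Biosample: aggregate FASTQ files across all runs.
        return [r.path for r in records if r.sample_name == name]
    if input_type == "sample":
        # Sample: FASTQ files from a single run only.
        if run_id is None:
            raise ValueError("a Sample input requires a run_id")
        return [
            r.path
            for r in records
            if r.sample_name == name and r.run_id == run_id
        ]
    raise ValueError(f"unknown input type: {input_type!r}")


records = [
    FastqRecord("NA12878", "run1", "run1/NA12878_R1.fastq.gz"),
    FastqRecord("NA12878", "run2", "run2/NA12878_R1.fastq.gz"),
]
all_runs = resolve_fastq(records, "biosample", "NA12878")
one_run = resolve_fastq(records, "sample", "NA12878", run_id="run1")
```

Here all_runs holds the files from both runs, while one_run holds only the file from run1.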

What Happened to the Classic Data Model?

BaseSpace still has full support for classic data types like Samples (see term definitions above for more info). You can continue to use Samples and the associated set of features if that model is a good fit for your lab's needs, for example when launching an app with input FASTQ files from a single run.

A run's Samples tab will continue to list the samples containing FASTQ files produced from that run's sequencing data.

A project's Samples tab will continue to list the samples associated with FASTQ files that live in that project.
