Sample Details Page Components

The Sample Details Page shows the metrics for a single sample, as generated by the app that ran the analysis. The panes displayed on this page depend on the app.


Samples Table

The samples table contains general analysis information for the sample. Depending on the workflow, the following metrics can be shown:

Column | Description
Sample name | The sample name from the sample sheet
Sample ID | The sample ID from the sample sheet. Sample ID must always be a unique value
Genome | The name of the reference genome
Chr | The reference target or chromosome name
Cluster PF | The number of clusters passing filter for the sample that aligned to the reference genome
Mismatch | The percentage mismatch to reference averaged over cycles per read (Read 1/Read 2)
No Call | The percentage of bases that could not be called (no-call) for the sample averaged over cycles per read (Read 1/Read 2)
Coverage | Median coverage (number of bases aligned to a given reference position) averaged over all positions
Het SNPs | The number of heterozygous SNPs detected for the sample
Hom SNPs | The number of homozygous SNPs detected for the sample
Insertions | The number of insertions detected for the sample
Deletions | The number of deletions detected for the sample


The Small RNA and De Novo Assembly workflow apps have custom samples tables.


Amplicons Table

Column | Description
# | An ordinal identification number in the table
Amplicons | The amplicon name
Location | The position at which the variant was found
Variants # | The number of variants for this amplicon

Coverage Graph

Y axis | X axis | Description
Coverage | Position | The green curve is the number of aligned reads that cover each position in the reference. The red curve is the number of aligned reads that have a miscall at this position in the reference. SNPs and other variants show up as spikes in the red curve
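
As a rough illustration of how the two curves relate, the following sketch counts, for each reference position, the reads that cover it and the reads whose base differs from the reference. It assumes ungapped reads supplied as (start, sequence) pairs with 0-based start positions; the function name and input format are hypothetical and are not taken from the app.

```python
def coverage_and_miscalls(reference, reads):
    """Return per-position coverage and miscall counts.

    reference: reference sequence as a string.
    reads: iterable of (start, sequence) pairs, 0-based, ungapped
           (an assumed input format for this sketch).
    """
    coverage = [0] * len(reference)
    miscalls = [0] * len(reference)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                coverage[pos] += 1          # contributes to the green curve
                if base != reference[pos]:
                    miscalls[pos] += 1      # contributes to the red curve
    return coverage, miscalls


reference = "ACGTACGT"
reads = [(0, "ACGT"), (2, "GTTC"), (3, "TT")]
coverage, miscalls = coverage_and_miscalls(reference, reads)
print(coverage)  # [1, 1, 2, 3, 2, 1, 0, 0]
print(miscalls)  # [0, 0, 0, 0, 2, 0, 0, 0]
```

In this toy input, both reads covering position 4 disagree with the reference, which is the kind of spike in the red curve that a SNP would produce.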

Q-Score Graph

Y axis | X axis | Description
Q-Score | Position | The average quality score of bases at the given position of the reference

Variant Score Graph

Y axis | X axis | Description
Score | Position | Graphically depicts quality score and the position of SNPs and indels

Variants Table

The variants table shows variants for your sample per chromosome or amplicon.

Column | Description
# | An ordinal identification number in the table
Location | The position at which the variant was found
Score | The quality score for this variant
Type | The variant type, which can be either SNP or indel
Call | A string representing how the base or bases changed at this location in the reference
dbSNP | The dbSNP name of the variant, if applicable
RefGene | The gene according to RefGene in which this variant appears
Frequency | The fraction of reads for the sample that include the variant. For example, if the reference base is A, and sample 1 has 60 A reads and 40 T reads, then the SNP has a variant frequency of 0.4
Depth | The number of reads for a sample covering a particular position. The GATK variant caller subsamples data in regions of high coverage
Filter | The criteria for a filtered variant
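
The Frequency example can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes only two alleles are observed at the position, so that depth is the sum of reference and variant reads; the function and parameter names are illustrative.

```python
def variant_frequency(ref_reads: int, variant_reads: int) -> float:
    """Fraction of reads covering a position that carry the variant.

    Assumes every read at the position shows either the reference
    base or the variant base (a simplification for this sketch).
    """
    depth = ref_reads + variant_reads
    if depth == 0:
        raise ValueError("no reads cover this position")
    return variant_reads / depth


# 60 reads with the reference base A, 40 reads with the variant base T.
print(variant_frequency(ref_reads=60, variant_reads=40))  # 0.4
```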

Small RNA Pie Chart

The Small RNA pie chart provides a visualization of clusters identified as mature miRNA, other forms of RNA, genomic sequence, or contaminants. Common categories for the Small RNA pie chart are as follows:

  • Unaligned clusters that did not align against any reference
  • Genome clusters that aligned to the reference genome
  • miRNA clusters that aligned to the mature miRNA database

Hits to the mature miRNA database are counted only if the cluster aligned to the correct strand and position for the mature miRNA.

The remaining category names in the Small RNA pie chart are taken from the FASTA file names in the databases. For example, if the RNA database contains a file named rRNA.fa, then matches to this file are reported as the category rRNA.
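
The naming rule can be sketched as taking the file name without its extension. The helper below is hypothetical and only illustrates the mapping described above; how the app itself derives the name is not documented here.

```python
from pathlib import PurePath


def category_from_fasta(filename: str) -> str:
    """Derive a pie chart category name from a FASTA file name.

    Assumes the category is the file name with its final extension
    removed, as in the rRNA.fa example.
    """
    return PurePath(filename).stem


print(category_from_fasta("rRNA.fa"))  # rRNA
```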


Small RNA Graph

The Small RNA graph provides a plot of the common mature miRNA sequences for a sample and their abundances. The most common miRNA sequences for the selected sample (up to 10 records) are shown in proportion to the number of clusters matched.
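
A minimal sketch of this kind of ranking, assuming the input is one miRNA name per matched cluster. The function name, the input format, and the miRNA names in the example are illustrative only.

```python
from collections import Counter


def top_mirnas(cluster_hits, limit=10):
    """Return up to `limit` (name, clusters, proportion) records,
    most abundant first.

    cluster_hits: iterable with one miRNA name per matched cluster
                  (an assumed input format for this sketch).
    """
    counts = Counter(cluster_hits)
    total = sum(counts.values())
    return [(name, n, n / total) for name, n in counts.most_common(limit)]


hits = ["miR-21"] * 3 + ["let-7a"]
for name, clusters, proportion in top_mirnas(hits):
    print(name, clusters, proportion)
```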


Metagenomics Pie Chart

The Metagenomics pie chart provides a visualization of how many clusters from each sample were assigned to a category in each taxonomic level. Click another row in the taxonomy table to change the pie chart to that sample or taxonomic level.


Assembly Samples Graph

Contigs are arranged end-to-end along the X axis and the reference chromosomes are arranged bottom-to-top along the Y axis. Each pixel of the plot is colored according to how many short sequences of the corresponding contig have a match in the corresponding portion of the reference genome.

An assembly that is identical to the reference produces a diagonal line. A vertical gap in the plot might indicate a portion of the reference that is absent in the assembly, such as a plasmid, which is found in some bacterial populations.

Y axis | X axis | Description
Reference | Assembly Position | A syntenic plot of assembled contigs compared to a reference. A reference genome must be specified in the sample sheet
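
The idea behind such a plot can be sketched by counting shared short sequences (k-mers) between bins of the reference and bins of the assembly. The k-mer length, bin size, and function name below are assumptions chosen for illustration; the app's actual matching method and parameters are not described here.

```python
def synteny_counts(reference, assembly, k=4, bin_size=4):
    """Count k-mer matches per (reference bin, assembly bin) cell.

    Rows follow the reference (Y axis) and columns follow the
    assembly (X axis). Each cell plays the role of one pixel.
    """
    # Index every k-mer start position in the reference.
    ref_index = {}
    for i in range(len(reference) - k + 1):
        ref_index.setdefault(reference[i:i + k], []).append(i)

    rows = -(-len(reference) // bin_size)  # ceiling division
    cols = -(-len(assembly) // bin_size)
    grid = [[0] * cols for _ in range(rows)]

    # Look up each assembly k-mer and increment the matching cells.
    for j in range(len(assembly) - k + 1):
        for i in ref_index.get(assembly[j:j + k], ()):
            grid[i // bin_size][j // bin_size] += 1
    return grid


reference = "ACGTTGCAATCGGCTA"
for row in synteny_counts(reference, reference):
    print(row)
```

When the assembly equals the reference and no k-mer repeats, only the diagonal cells are non-zero. Reference bins with no match in the assembly stay at zero, which corresponds to a gap in the plot.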

Sample QC Table

Column | Description
Sample Name | The sample name from the sample sheet
Clusters Count | The number of clusters sequenced for this sample
Clusters Percentage | The percentage of the total cluster number matching the index for this sample
Pass Filter | The percentage of clusters passing filter for this sample
Alignment R1/R2 | The percentage of clusters successfully aligned in Read 1/Read 2
Length Median | The median fragment length for the sample
Length Min | The low percentile of fragment lengths for this sample, corresponding to three standard deviations below the median
Length Max | The high percentile of fragment lengths for this sample, corresponding to three standard deviations above the median
Mismatch R1/R2 | The percentage mismatch to reference averaged over cycles per read (Read 1/Read 2)
Estimated Diversity | An estimate of the total library diversity derived from the observed diversity and the number of apparent PCR duplicates. This calculation is available for paired-end runs unless PCR duplicate flagging was disabled in the sample sheet
Observed Diversity | The number of distinct aligned positions. Reads with the same aligned positions are assumed to be PCR duplicates. PCR duplicates are defined as sequences with identical Read 1 and Read 2 start sites
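
The Observed Diversity definition can be illustrated by counting distinct (Read 1 start, Read 2 start) pairs. This is a sketch under the assumption that each aligned read pair is represented by its two start sites; the function names and input format are hypothetical, and the Estimated Diversity calculation is not reproduced because its formula is not given here.

```python
def observed_diversity(read_pairs):
    """Number of distinct (Read 1 start, Read 2 start) pairs."""
    return len(set(read_pairs))


def apparent_duplicates(read_pairs):
    """Number of read pairs that repeat an already seen pair of
    start sites, and so are treated as PCR duplicates."""
    pairs = list(read_pairs)
    return len(pairs) - len(set(pairs))


# Two of these four pairs share identical Read 1 and Read 2 start sites.
pairs = [(100, 350), (100, 350), (100, 360), (420, 650)]
print(observed_diversity(pairs))   # 3
print(apparent_duplicates(pairs))  # 1
```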