Quality Scores

A quality score (or Q-score) expresses an error probability. In particular, it serves as a convenient and compact way to communicate very small error probabilities.

Given an assertion, A, the quality score, Q(A), expresses the probability that A is not true, P(~A), according to the relationship:

Q(A) = -10 log10(P(~A))

where P(~A) is the estimated probability of an assertion A being wrong.
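The relationship above can be sketched in a few lines of Python. This is a minimal illustration, not part of any Illumina software; the function names `error_probability_to_q` and `q_to_error_probability` are hypothetical and chosen only for this example.

```python
import math


def error_probability_to_q(p_error: float) -> float:
    """Convert an error probability P(~A) to a quality score Q(A)."""
    # The logarithm is undefined at 0, and a probability cannot exceed 1.
    if not 0.0 < p_error <= 1.0:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10.0 * math.log10(p_error)


def q_to_error_probability(q: float) -> float:
    """Invert the relationship: P(~A) = 10^(-Q(A) / 10)."""
    return 10.0 ** (-q / 10.0)


if __name__ == "__main__":
    # Print the error probability for a few commonly quoted quality scores.
    for q in (10, 20, 30, 40):
        print(f"Q{q}: P(~A) = {q_to_error_probability(q):g}")
```

Each increase of 10 in the quality score corresponds to a tenfold decrease in the error probability, which is why the score is a compact way to express very small probabilities.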

The following table demonstrates the relationship between the quality score and the error probability:

Quality Score, Q(A)    Error Probability, P(~A)
10                     0.1
20                     0.01
30                     0.001
40                     0.0001

On supported Illumina systems, Q-scores are automatically binned. The specific binning applied depends on the current Q-table. For more information, see the white paper Reducing Whole Genome Data Storage Footprint, available from the Illumina website.

Quality Score Encoding

In FASTQ files, quality scores are encoded in a compact form that uses only 1 byte per quality value. In this encoding, each quality score is represented as the character whose ASCII code equals the score plus 33. For example, a quality score of 30 is encoded as ? (ASCII code 63). The following table demonstrates the relationship between the encoding character, its ASCII code, and the quality score represented.

When Q-score binning is in use, the subset of Q-scores applied by the bins is displayed.
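The encoding described in this section can be sketched in Python as follows. This is an illustrative example only: the names `PHRED_OFFSET`, `encode_quality`, and `decode_quality` are hypothetical. The upper bound of 93 is an assumption of this sketch, chosen so that encoded characters stay within printable ASCII (codes 33 through 126).

```python
PHRED_OFFSET = 33  # offset added to each quality score, as described above
MAX_SCORE = 93     # assumption: keeps encoded characters at or below ASCII 126


def encode_quality(scores):
    """Encode a sequence of integer quality scores as a FASTQ quality string."""
    for q in scores:
        if not 0 <= q <= MAX_SCORE:
            raise ValueError(f"quality score {q} is outside 0..{MAX_SCORE}")
    # Each score becomes the single character with ASCII code score + 33.
    return "".join(chr(q + PHRED_OFFSET) for q in scores)


def decode_quality(quality_string):
    """Decode a FASTQ quality string back into a list of integer scores."""
    # Each character's ASCII code minus 33 recovers the original score.
    return [ord(ch) - PHRED_OFFSET for ch in quality_string]


if __name__ == "__main__":
    encoded = encode_quality([0, 30, 40])
    print(encoded)
    print(decode_quality(encoded))
```

Because every score maps to exactly one character, the quality string in a FASTQ record has the same length as the corresponding sequence of bases.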
