Quality Scores
A quality score (or Q-score) expresses an error probability. In particular, it serves as a convenient and compact way to communicate very small error probabilities.
Given an assertion, A, the quality score, Q(A), expresses the probability that A is not true, P(~A), according to the relationship:
Q(A) = -10 log10(P(~A))
where P(~A) is the estimated probability of an assertion A being wrong.
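As an illustration of the relationship above, the following is a minimal Python sketch that converts between a quality score and an error probability. The function names are hypothetical and chosen for this example only; they are not part of any Illumina software.

```python
import math


def q_to_error_probability(q: float) -> float:
    """Return P(~A) for a quality score, by inverting Q = -10 log10(P)."""
    return 10 ** (-q / 10)


def error_probability_to_q(p: float) -> float:
    """Return the quality score Q = -10 log10(P) for an error probability."""
    # A probability of 0 has no finite Q-score, so it is rejected here.
    if not 0 < p <= 1:
        raise ValueError("error probability must be in the range (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    for q in (10, 20, 30):
        print(f"Q{q}: error probability {q_to_error_probability(q):g}")
```

Each increase of 10 in the quality score corresponds to a tenfold decrease in the error probability, which is why the scale is compact for very small probabilities.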
The relationship between the quality score and error probability is demonstrated with the following table:
Quality Score Q(A)    Error Probability P(~A)
10                    0.1
20                    0.01
30                    0.001
On supported Illumina systems, Q-scores are automatically binned. The specific binning applied depends on the current Q-table. See the white paper Reducing Whole Genome Data Storage Footprint for more information, available from the Illumina website.
Quality Score Encoding
In FASTQ files, quality scores are encoded into a compact form, which uses only 1 byte per quality value. In this encoding, the quality score is represented as the character with an ASCII code equal to its value + 33. The following table demonstrates the relationship between the encoding character, its ASCII code, and the quality score represented.
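The encoding described above (quality value + 33, one ASCII character per score) can be sketched in Python as follows. This is a minimal example, not a FASTQ parser; the function names are hypothetical, and the 0 to 93 range check is an assumption based on the printable ASCII range (codes 33 to 126).

```python
OFFSET = 33  # ASCII offset used by the encoding described above


def encode_quality(scores):
    """Encode a sequence of integer Q-scores as a FASTQ quality string."""
    for q in scores:
        # Assumption: scores must map to printable ASCII (33..126).
        if not 0 <= q <= 93:
            raise ValueError(f"quality score out of range: {q}")
    return "".join(chr(q + OFFSET) for q in scores)


def decode_quality(quality_string):
    """Decode a FASTQ quality string into a list of integer Q-scores."""
    return [ord(ch) - OFFSET for ch in quality_string]


if __name__ == "__main__":
    print(decode_quality("!5I"))       # '!' is ASCII 33, '5' is 53, 'I' is 73
    print(encode_quality([0, 20, 40]))
```

For instance, the character `!` (ASCII 33) represents Q0 and `I` (ASCII 73) represents Q40.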
When Q-score binning is in use, the subset of Q-scores applied by the bins is displayed.