The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants
Cock, Peter J. A. and Fields, Christopher J. and Goto, Naohisa and Heuer, Michael L. and Rice, Peter M. (2010) The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research, 38 (6). pp. 1767-1771. ISSN 1362-4962 (https://doi.org/10.1093/nar/gkp1137)
Text. Filename: Cock-etal-2010-The_Sanger-FASTQ-file-format-for-sequences-with-quality-scores.pdf
Final Published Version (230kB)
Abstract
FASTQ has emerged as a common file format for sharing sequencing read data combining both the sequence and an associated per base quality score, despite lacking any formal definition to date, and existing in at least three incompatible variants. This article defines the FASTQ format, covering the original Sanger standard, the Solexa/Illumina variants and conversion between them, based on publicly available information such as the MAQ documentation and conventions recently agreed by the Open Bioinformatics Foundation projects Biopython, BioPerl, BioRuby, BioJava and EMBOSS. Being an open access publication, it is hoped that this description, with the example files provided as Supplementary Data, will serve in future as a reference for this important file format.
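The abstract names the three variants and the conversion between them without giving the encodings. The sketch below illustrates that conversion under the commonly documented conventions, which are assumptions here rather than text from this record: Sanger stores Phred scores at ASCII offset 33, Solexa stores Solexa-scaled scores at offset 64, and Illumina 1.3+ stores Phred scores at offset 64. The function and variable names are illustrative only and do not come from Biopython, BioPerl, BioRuby, BioJava or EMBOSS.

```python
import math

# Assumed ASCII offsets for the three FASTQ variants (not stated in the abstract).
OFFSETS = {"sanger": 33, "solexa": 64, "illumina": 64}


def solexa_to_phred(q_solexa):
    """Map a Solexa-scaled score to the Phred scale."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def decode_quality(qual, variant="sanger"):
    """Decode a FASTQ quality string into integer Phred scores."""
    offset = OFFSETS[variant]
    scores = [ord(ch) - offset for ch in qual]
    if variant == "solexa":
        # Solexa scores use a different scale, so rescale and round.
        scores = [round(solexa_to_phred(q)) for q in scores]
    return scores


def encode_sanger(phred_scores):
    """Encode Phred scores as a Sanger quality string (capped at 93)."""
    return "".join(chr(min(q, 93) + 33) for q in phred_scores)


# Re-encode an Illumina 1.3+ quality string in the Sanger variant.
sanger_qual = encode_sanger(decode_quality("hhB", "illumina"))
```

Decoding to integer Phred scores first and then re-encoding keeps each variant's offset and scale in one place, so supporting another variant would only need a new entry in `OFFSETS` plus any rescaling step.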
ORCID iDs
Cock, Peter J. A. ORCID: https://orcid.org/0000-0001-9513-9993, Fields, Christopher J., Goto, Naohisa, Heuer, Michael L. and Rice, Peter M.
Item type: Article
ID code: 90565
Dates: 1 April 2010 (Published)
Subjects: Medicine > Biomedical engineering. Electronics. Instrumentation
Department: UNSPECIFIED
Depositing user: Pure Administrator
Date deposited: 13 Sep 2024 15:12
Last modified: 14 Dec 2024 23:21
URI: https://strathprints.strath.ac.uk/id/eprint/90565