1 How many reads are in a fastq file?

The number of reads in a fastq file is often reported in the scientific literature and used, for example, to estimate genome coverage before and after data cleaning. The instructor tasks each group with developing pseudo-code that counts the number of reads in any fastq file.

We will then discuss your ideas and solutions and implement them in either an R or sh script, which you will apply to your own data analyzed in Chapter 4.

To help you design your pseudo-code, Figure 1.1 shows a screenshot of SRR5759389_pe12.fastq, the fastq file used in Chapter 4. The screenshot was obtained by applying the head command to this file.

Figure 1.1: Screenshot of SRR5759389_pe12.fastq showing the formatting of reads.
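
As a starting point for the discussion, here is one possible sh sketch of a read counter. It is not the instructor's solution. It assumes the file is uncompressed and follows the common four-line record layout (header starting with @, sequence, + separator, quality string), so the read count is the line count divided by four. The function name count_fastq_reads is hypothetical. Counting the lines that start with @ is deliberately avoided, because a quality string can also begin with that character.

```sh
#!/bin/sh
# Sketch: count reads in an uncompressed FASTQ file.
# ASSUMPTION: every record spans exactly 4 lines
# (header, sequence, "+" separator, quality string).
# count_fastq_reads is a hypothetical helper name.
count_fastq_reads() {
    # $1: path to the FASTQ file
    # NR holds the total number of lines once awk reaches END;
    # int() drops any remainder left by a truncated final record.
    awk 'END { print int(NR / 4) }' "$1"
}

# Example call with the Chapter 4 file (uncomment to run):
# count_fastq_reads SRR5759389_pe12.fastq
```

If sequences or quality strings were wrapped over several lines, the four-line assumption would no longer hold and the function would over-count, so it is worth checking the layout of your own file with head first.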