No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default)
Path to Cutadapt set as: 'cutadapt' (default)
Cutadapt seems to be working fine (tested command 'cutadapt --version')

AUTO-DETECTING ADAPTER TYPE
===========================
Attempting to auto-detect adapter type from the first 1 million sequences of the first file (>> /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz <<)

Found perfect matches for the following adapter sequences:
Adapter type    Count   Sequence        Sequences analysed      Percentage
Illumina        4057    AGATCGGAAGAGC   1000000                 0.41
Nextera         5       CTGTCTCTTATA    1000000                 0.00
smallRNA        1       TGGAATTCTCGG    1000000                 0.00
Using Illumina adapter for trimming (count: 4057). Second best hit was Nextera (count: 5)

Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications)
All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz <<<
10000000 sequences processed
20000000 sequences processed
This is cutadapt 1.16 with Python 3.6.4
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 498.41 s (20 us/read; 3.03 M reads/minute).

=== Summary ===

Total reads processed:              25,210,785
Reads with adapters:                 9,194,471 (36.5%)
Reads written (passing filters):    25,210,785 (100.0%)

Total basepairs processed: 2,521,078,500 bp
Quality-trimmed:              10,032,007 bp (0.4%)
Total written (filtered):  2,474,313,959 bp (98.1%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 9194471 times.

No.
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 35.0% C: 27.3% G: 16.0% T: 19.0% none/other: 2.7% Overview of removed sequences length count expect max.err error counts 1 6684335 6302696.2 0 6684335 2 1479305 1575674.1 0 1479305 3 420317 393918.5 0 420317 4 107721 98479.6 0 107721 5 37827 24619.9 0 37827 6 18772 6155.0 0 18772 7 15295 1538.7 0 15295 8 13361 384.7 0 13361 9 12614 96.2 0 12295 319 10 15293 24.0 1 11451 3842 11 10854 6.0 1 9836 1018 12 9874 1.5 1 9142 732 13 9347 0.4 1 8555 792 14 8409 0.4 1 7721 688 15 8060 0.4 1 7425 635 16 7503 0.4 1 6897 606 17 7287 0.4 1 6700 587 18 6915 0.4 1 6370 545 19 5891 0.4 1 5429 462 20 5215 0.4 1 4762 453 21 4677 0.4 1 4288 389 22 4013 0.4 1 3645 368 23 3586 0.4 1 3253 333 24 3219 0.4 1 2937 282 25 3013 0.4 1 2761 252 26 2600 0.4 1 2382 218 27 2507 0.4 1 2269 238 28 2441 0.4 1 2221 220 29 2246 0.4 1 2032 214 30 2214 0.4 1 1995 219 31 1929 0.4 1 1717 212 32 1802 0.4 1 1619 183 33 1477 0.4 1 1346 131 34 1348 0.4 1 1214 134 35 1266 0.4 1 1112 154 36 1276 0.4 1 1106 170 37 1063 0.4 1 942 121 38 1074 0.4 1 964 110 39 936 0.4 1 808 128 40 788 0.4 1 692 96 41 721 0.4 1 640 81 42 588 0.4 1 511 77 43 791 0.4 1 695 96 44 326 0.4 1 284 42 45 529 0.4 1 469 60 46 499 0.4 1 426 73 47 483 0.4 1 397 86 48 499 0.4 1 423 76 49 446 0.4 1 370 76 50 389 0.4 1 312 77 51 389 0.4 1 334 55 52 325 0.4 1 270 55 53 278 0.4 1 232 46 54 283 0.4 1 225 58 55 292 0.4 1 249 43 56 173 0.4 1 141 32 57 204 0.4 1 148 56 58 185 0.4 1 148 37 59 181 0.4 1 138 43 60 151 0.4 1 124 27 61 167 0.4 1 134 33 62 129 0.4 1 92 37 63 157 0.4 1 121 36 64 157 0.4 1 124 33 65 156 0.4 1 102 54 66 118 0.4 1 86 32 67 173 0.4 1 131 42 68 170 0.4 1 123 47 69 183 0.4 1 125 58 70 229 0.4 1 138 91 71 264 0.4 1 159 105 72 345 0.4 1 129 216 73 612 0.4 1 122 490 74 1571 0.4 1 146 1425 75 27933 0.4 1 208 27725 76 36713 0.4 1 768 35945 77 33962 0.4 1 977 32985 78 26997 0.4 1 610 26387 79 15817 0.4 1 508 15309 80 10920 0.4 1 355 10565 81 7136 0.4 1 227 6909 
82 4837 0.4 1 179 4658 83 2866 0.4 1 140 2726 84 1832 0.4 1 118 1714 85 1519 0.4 1 82 1437 86 1254 0.4 1 96 1158 87 975 0.4 1 106 869 88 738 0.4 1 75 663 89 713 0.4 1 63 650 90 550 0.4 1 60 490 91 568 0.4 1 79 489 92 477 0.4 1 51 426 93 551 0.4 1 43 508 94 651 0.4 1 37 614 95 848 0.4 1 66 782 96 1056 0.4 1 60 996 97 1512 0.4 1 117 1395 98 2112 0.4 1 51 2061 99 7505 0.4 1 27 7478 100 64596 0.4 1 46 64550 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz ============================================= 25210785 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 523.04 s (21 us/read; 2.89 M reads/minute). === Summary === Total reads processed: 25,210,785 Reads with adapters: 9,888,434 (39.2%) Reads written (passing filters): 25,210,785 (100.0%) Total basepairs processed: 2,521,078,500 bp Quality-trimmed: 18,588,718 bp (0.7%) Total written (filtered): 2,463,345,403 bp (97.7%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 9888434 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 38.4% C: 23.2% G: 18.3% T: 17.5% none/other: 2.6% Overview of removed sequences length count expect max.err error counts 1 6948743 6302696.2 0 6948743 2 1704518 1575674.1 0 1704518 3 572971 393918.5 0 572971 4 142831 98479.6 0 142831 5 43560 24619.9 0 43560 6 19916 6155.0 0 19916 7 15727 1538.7 0 15727 8 13555 384.7 0 13555 9 13490 96.2 0 12720 770 10 13548 24.0 1 11534 2014 11 11312 6.0 1 9992 1320 12 10372 1.5 1 9386 986 13 9299 0.4 1 8465 834 14 10071 0.4 1 9143 928 15 6989 0.4 1 6384 605 16 7644 0.4 1 6950 694 17 9080 0.4 1 8268 812 18 5315 0.4 1 4828 487 19 6906 0.4 1 6246 660 20 4602 0.4 1 4178 424 21 4705 0.4 1 4270 435 22 4067 0.4 1 3605 462 23 3843 0.4 1 3411 432 24 3689 0.4 1 3213 476 25 3045 0.4 1 2712 333 26 2851 0.4 1 2531 320 27 2752 0.4 1 2408 344 28 2747 0.4 1 2397 350 29 2401 0.4 1 2108 293 30 2834 0.4 1 2496 338 31 1702 0.4 1 1485 217 32 1966 0.4 1 1709 257 33 1647 0.4 1 1418 229 34 1551 0.4 1 1312 239 35 1420 0.4 1 1203 217 36 1507 0.4 1 1280 227 37 1218 0.4 1 1025 193 38 1253 0.4 1 1015 238 39 1008 0.4 1 859 149 40 900 0.4 1 767 133 41 879 0.4 1 692 187 42 924 0.4 1 743 181 43 651 0.4 1 518 133 44 650 0.4 1 503 147 45 856 0.4 1 687 169 46 517 0.4 1 384 133 47 593 0.4 1 442 151 48 692 0.4 1 544 148 49 568 0.4 1 405 163 50 539 0.4 1 417 122 51 721 0.4 1 544 177 52 450 0.4 1 312 138 53 403 0.4 1 306 97 54 383 0.4 1 288 95 55 354 0.4 1 285 69 56 318 0.4 1 238 80 57 290 0.4 1 206 84 58 266 0.4 1 171 95 59 299 0.4 1 217 82 60 265 0.4 1 197 68 61 324 0.4 1 215 109 62 298 0.4 1 170 128 63 300 0.4 1 200 100 64 290 0.4 1 182 108 65 294 0.4 1 169 125 66 490 0.4 1 137 353 67 15661 0.4 1 185 15476 68 18904 0.4 1 681 18223 69 18753 0.4 1 531 18222 70 19407 0.4 1 575 18832 71 10805 0.4 1 566 10239 72 7808 0.4 1 380 7428 73 5518 0.4 1 286 5232 74 3060 0.4 1 220 2840 75 1716 0.4 1 158 1558 76 1222 0.4 1 118 1104 77 1090 0.4 1 150 940 78 760 0.4 1 93 667 79 589 0.4 1 99 490 80 635 0.4 
1 88 547 81 453 0.4 1 97 356 82 444 0.4 1 78 366 83 425 0.4 1 82 343 84 384 0.4 1 74 310 85 337 0.4 1 50 287 86 325 0.4 1 53 272 87 371 0.4 1 68 303 88 375 0.4 1 58 317 89 381 0.4 1 43 338 90 448 0.4 1 55 393 91 568 0.4 1 64 504 92 552 0.4 1 27 525 93 608 0.4 1 16 592 94 805 0.4 1 21 784 95 1276 0.4 1 53 1223 96 2029 0.4 1 45 1984 97 3603 0.4 1 101 3502 98 5212 0.4 1 37 5175 99 15023 0.4 1 18 15005 100 118718 0.4 1 54 118664 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz ============================================= 25210785 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 25210785 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 363867 (1.44%) >>> Now running FastQC on the validated data 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 5% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 10% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 15% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 20% complete for 
4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 25% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 30% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 35% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 40% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 45% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 50% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 55% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 60% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 65% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 70% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 75% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 80% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 85% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 90% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz Approx 95% complete for 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_val_1.fq.gz >>> Now running FastQC on the validated data 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz<<< Started analysis of 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 5% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 10% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 15% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 20% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 25% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 30% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 35% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 40% complete for 
4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 45% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 50% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 55% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 60% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 65% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 70% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 75% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 80% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 85% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 90% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Approx 95% complete for 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_val_2.fq.gz Deleting both intermediate output files 4R041-L7-P01-AGTTCC-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P01-AGTTCC-READ2-Sequences.txt.gz_trimmed.fq.gz ==================================================================================================== Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases 
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 489.25 s (20 us/read; 2.96 M reads/minute). === Summary === Total reads processed: 24,157,901 Reads with adapters: 9,279,981 (38.4%) Reads written (passing filters): 24,157,901 (100.0%) Total basepairs processed: 2,415,790,100 bp Quality-trimmed: 7,463,230 bp (0.3%) Total written (filtered): 2,371,881,923 bp (98.2%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 9279981 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 35.9% C: 25.5% G: 17.4% T: 18.9% none/other: 2.3% Overview of removed sequences length count expect max.err error counts 1 6493053 6039475.2 0 6493053 2 1561462 1509868.8 0 1561462 3 492527 377467.2 0 492527 4 132747 94366.8 0 132747 5 47694 23591.7 0 47694 6 26810 5897.9 0 26810 7 22497 1474.5 0 22497 8 20285 368.6 0 20285 9 19122 92.2 0 18630 492 10 21534 23.0 1 16796 4738 11 15785 5.8 1 14653 1132 12 15144 1.4 1 14093 1051 13 13843 0.4 1 12881 962 14 13011 0.4 1 12183 828 15 12750 0.4 1 11919 831 16 11492 0.4 1 10739 753 17 11211 0.4 1 10492 719 18 10757 0.4 1 10076 681 19 9405 0.4 1 8777 628 20 8559 0.4 1 8010 549 21 7561 0.4 1 7070 491 22 6814 0.4 1 6330 484 23 5876 0.4 1 5455 421 24 5443 0.4 1 5128 315 25 4978 0.4 1 4608 370 26 4405 0.4 1 4092 313 27 4318 0.4 1 4014 304 28 4358 0.4 1 4070 288 29 3741 0.4 1 3422 319 30 3699 0.4 1 3427 272 31 3300 0.4 1 3054 246 32 3213 0.4 1 2946 267 33 2874 0.4 1 2600 274 34 2767 0.4 1 2509 258 35 2379 0.4 1 2168 211 36 2332 0.4 1 2161 171 37 2200 0.4 1 2012 188 38 2030 0.4 1 1830 200 39 1678 0.4 1 1525 153 40 1547 0.4 1 1405 142 41 1993 0.4 1 1816 177 42 675 0.4 1 604 71 43 936 0.4 1 847 89 44 769 0.4 1 675 94 45 869 0.4 1 774 95 46 931 0.4 1 852 79 47 921 0.4 1 807 114 48 841 0.4 1 759 82 49 890 0.4 1 792 98 50 837 0.4 1 750 87 51 715 0.4 1 638 77 52 606 0.4 1 513 93 53 740 0.4 1 648 92 54 443 0.4 1 400 43 55 500 0.4 1 439 61 56 397 0.4 1 357 40 57 398 0.4 1 336 62 58 378 0.4 1 331 47 59 341 0.4 1 279 62 60 356 0.4 1 296 60 61 355 0.4 1 317 38 62 304 0.4 1 251 53 63 292 0.4 1 238 54 64 344 0.4 1 280 64 65 339 0.4 1 276 63 66 355 0.4 1 291 64 67 359 0.4 1 287 72 68 345 0.4 1 253 92 69 414 0.4 1 309 105 70 480 0.4 1 283 197 71 682 0.4 1 298 384 72 1471 0.4 1 320 1151 73 22283 0.4 1 369 21914 74 39382 0.4 1 1090 38292 75 31882 0.4 1 1475 30407 76 18907 0.4 1 1069 17838 77 13189 0.4 1 795 12394 78 8690 0.4 1 506 8184 79 5430 0.4 1 362 5068 80 3977 
0.4 1 280 3697 81 2710 0.4 1 200 2510 82 1797 0.4 1 151 1646 83 1521 0.4 1 177 1344 84 1129 0.4 1 160 969 85 877 0.4 1 127 750 86 739 0.4 1 125 614 87 602 0.4 1 116 486 88 531 0.4 1 104 427 89 458 0.4 1 71 387 90 443 0.4 1 77 366 91 453 0.4 1 105 348 92 413 0.4 1 58 355 93 475 0.4 1 38 437 94 561 0.4 1 27 534 95 681 0.4 1 46 635 96 917 0.4 1 57 860 97 1283 0.4 1 85 1198 98 1819 0.4 1 53 1766 99 6327 0.4 1 34 6293 100 57028 0.4 1 53 56975 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz ============================================= 24157901 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 496.21 s (21 us/read; 2.92 M reads/minute). === Summary === Total reads processed: 24,157,901 Reads with adapters: 9,235,435 (38.2%) Reads written (passing filters): 24,157,901 (100.0%) Total basepairs processed: 2,415,790,100 bp Quality-trimmed: 13,197,068 bp (0.5%) Total written (filtered): 2,365,044,184 bp (97.9%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 9235435 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 36.3% C: 25.2% G: 17.1% T: 19.0% none/other: 2.4% Overview of removed sequences length count expect max.err error counts 1 6501476 6039475.2 0 6501476 2 1512450 1509868.8 0 1512450 3 487981 377467.2 0 487981 4 130644 94366.8 0 130644 5 47785 23591.7 0 47785 6 27089 5897.9 0 27089 7 22597 1474.5 0 22597 8 20394 368.6 0 20394 9 19602 92.2 0 19185 417 10 19926 23.0 1 16882 3044 11 15858 5.8 1 14779 1079 12 15498 1.4 1 14554 944 13 13371 0.4 1 12576 795 14 15079 0.4 1 14157 922 15 10995 0.4 1 10362 633 16 11511 0.4 1 10804 707 17 13445 0.4 1 12656 789 18 8538 0.4 1 8047 491 19 10528 0.4 1 9855 673 20 7617 0.4 1 7119 498 21 7447 0.4 1 6986 461 22 6767 0.4 1 6268 499 23 6050 0.4 1 5595 455 24 5793 0.4 1 5317 476 25 4781 0.4 1 4476 305 26 4588 0.4 1 4240 348 27 4427 0.4 1 4101 326 28 4422 0.4 1 4128 294 29 3765 0.4 1 3461 304 30 4287 0.4 1 3955 332 31 2920 0.4 1 2693 227 32 3335 0.4 1 3065 270 33 2824 0.4 1 2597 227 34 2875 0.4 1 2612 263 35 2433 0.4 1 2191 242 36 2389 0.4 1 2210 179 37 2308 0.4 1 2096 212 38 2002 0.4 1 1806 196 39 1822 0.4 1 1679 143 40 1492 0.4 1 1354 138 41 1454 0.4 1 1292 162 42 1482 0.4 1 1313 169 43 924 0.4 1 820 104 44 921 0.4 1 789 132 45 1215 0.4 1 1084 131 46 825 0.4 1 715 110 47 949 0.4 1 833 116 48 925 0.4 1 828 97 49 909 0.4 1 813 96 50 882 0.4 1 763 119 51 891 0.4 1 767 124 52 575 0.4 1 477 98 53 710 0.4 1 619 91 54 564 0.4 1 484 80 55 541 0.4 1 475 66 56 426 0.4 1 375 51 57 425 0.4 1 356 69 58 434 0.4 1 377 57 59 405 0.4 1 335 70 60 381 0.4 1 304 77 61 447 0.4 1 369 78 62 331 0.4 1 272 59 63 338 0.4 1 261 77 64 346 0.4 1 269 77 65 416 0.4 1 306 110 66 643 0.4 1 302 341 67 11749 0.4 1 332 11417 68 23402 0.4 1 1082 22320 69 16935 0.4 1 1393 15542 70 13577 0.4 1 1047 12530 71 8307 0.4 1 743 7564 72 4773 0.4 1 544 4229 73 2361 0.4 1 318 2043 74 1505 0.4 1 192 1313 75 958 0.4 1 173 785 76 644 0.4 1 132 512 77 563 0.4 1 138 425 78 525 0.4 1 108 417 79 468 0.4 1 132 336 
80 477 0.4 1 136 341 81 440 0.4 1 95 345 82 362 0.4 1 117 245 83 358 0.4 1 117 241 84 295 0.4 1 92 203 85 342 0.4 1 113 229 86 362 0.4 1 111 251 87 359 0.4 1 108 251 88 371 0.4 1 118 253 89 377 0.4 1 56 321 90 376 0.4 1 38 338 91 546 0.4 1 92 454 92 514 0.4 1 54 460 93 678 0.4 1 28 650 94 1215 0.4 1 19 1196 95 1779 0.4 1 33 1746 96 2795 0.4 1 34 2761 97 4522 0.4 1 71 4451 98 6593 0.4 1 33 6560 99 18937 0.4 1 24 18913 100 100500 0.4 1 61 100439 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz ============================================= 24157901 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 24157901 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 293514 (1.21%) >>> Now running FastQC on the validated data 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 5% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 10% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 15% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 20% complete for 
4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 25% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 30% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 35% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 40% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 45% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 50% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 55% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 60% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 65% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 70% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 75% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 80% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 85% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 90% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 95% complete for 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_val_1.fq.gz >>> Now running FastQC on the validated data 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz<<< Started analysis of 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 5% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 10% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 15% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 20% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 25% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 30% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 35% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 40% complete for 
4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 45% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 50% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 55% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 60% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 65% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 70% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 75% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 80% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 85% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 90% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 95% complete for 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_val_2.fq.gz Deleting both intermediate output files 4R041-L7-P02-ACAGTG-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P02-ACAGTG-READ2-Sequences.txt.gz_trimmed.fq.gz ==================================================================================================== Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases 
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 460.43 s (20 us/read; 3.03 M reads/minute). === Summary === Total reads processed: 23,215,719 Reads with adapters: 8,194,852 (35.3%) Reads written (passing filters): 23,215,719 (100.0%) Total basepairs processed: 2,321,571,900 bp Quality-trimmed: 5,207,744 bp (0.2%) Total written (filtered): 2,292,790,871 bp (98.8%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 8194852 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.6% C: 28.5% G: 16.5% T: 19.2% none/other: 1.3% Overview of removed sequences length count expect max.err error counts 1 6038992 5803929.8 0 6038992 2 1352531 1450982.4 0 1352531 3 379888 362745.6 0 379888 4 97837 90686.4 0 97837 5 31884 22671.6 0 31884 6 14491 5667.9 0 14491 7 11346 1417.0 0 11346 8 10207 354.2 0 10207 9 9872 88.6 0 9494 378 10 10524 22.1 1 8528 1996 11 8358 5.5 1 7760 598 12 7763 1.4 1 7297 466 13 7288 0.3 1 6847 441 14 6636 0.3 1 6272 364 15 6517 0.3 1 6143 374 16 6035 0.3 1 5741 294 17 5754 0.3 1 5469 285 18 5542 0.3 1 5220 322 19 4796 0.3 1 4557 239 20 4421 0.3 1 4156 265 21 4150 0.3 1 3875 275 22 3675 0.3 1 3448 227 23 3405 0.3 1 3197 208 24 3108 0.3 1 2926 182 25 2867 0.3 1 2705 162 26 2654 0.3 1 2502 152 27 2554 0.3 1 2380 174 28 2503 0.3 1 2347 156 29 2337 0.3 1 2189 148 30 2215 0.3 1 2064 151 31 1968 0.3 1 1857 111 32 1898 0.3 1 1766 132 33 1703 0.3 1 1606 97 34 1818 0.3 1 1699 119 35 1567 0.3 1 1441 126 36 1595 0.3 1 1457 138 37 1399 0.3 1 1272 127 38 1386 0.3 1 1285 101 39 1241 0.3 1 1148 93 40 1096 0.3 1 1019 77 41 1067 0.3 1 976 91 42 863 0.3 1 803 60 43 1117 0.3 1 1044 73 44 476 0.3 1 427 49 45 650 0.3 1 594 56 46 673 0.3 1 625 48 47 750 0.3 1 687 63 48 730 0.3 1 668 62 49 690 0.3 1 635 55 50 698 0.3 1 643 55 51 651 0.3 1 590 61 52 496 0.3 1 445 51 53 532 0.3 1 472 60 54 490 0.3 1 431 59 55 447 0.3 1 399 48 56 316 0.3 1 277 39 57 336 0.3 1 300 36 58 293 0.3 1 247 46 59 309 0.3 1 268 41 60 286 0.3 1 249 37 61 268 0.3 1 238 30 62 228 0.3 1 197 31 63 258 0.3 1 223 35 64 233 0.3 1 197 36 65 240 0.3 1 202 38 66 260 0.3 1 223 37 67 227 0.3 1 200 27 68 238 0.3 1 185 53 69 261 0.3 1 218 43 70 290 0.3 1 241 49 71 294 0.3 1 228 66 72 338 0.3 1 216 122 73 448 0.3 1 243 205 74 732 0.3 1 286 446 75 8248 0.3 1 346 7902 76 13400 0.3 1 917 12483 77 14730 0.3 1 1375 13355 78 12864 0.3 1 1066 11798 79 8187 0.3 1 984 7203 80 5871 0.3 1 637 5234 81 3977 0.3 1 469 3508 82 
2808 0.3 1 327 2481 83 1759 0.3 1 301 1458 84 1233 0.3 1 213 1020 85 992 0.3 1 176 816 86 862 0.3 1 201 661 87 608 0.3 1 167 441 88 499 0.3 1 117 382 89 433 0.3 1 122 311 90 408 0.3 1 114 294 91 371 0.3 1 133 238 92 341 0.3 1 94 247 93 328 0.3 1 83 245 94 392 0.3 1 65 327 95 418 0.3 1 75 343 96 560 0.3 1 78 482 97 851 0.3 1 146 705 98 1042 0.3 1 78 964 99 3418 0.3 1 43 3375 100 28246 0.3 1 60 28186 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz ============================================= 23215719 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 471.27 s (20 us/read; 2.96 M reads/minute). === Summary === Total reads processed: 23,215,719 Reads with adapters: 8,971,997 (38.6%) Reads written (passing filters): 23,215,719 (100.0%) Total basepairs processed: 2,321,571,900 bp Quality-trimmed: 14,419,093 bp (0.6%) Total written (filtered): 2,281,293,686 bp (98.3%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 8971997 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 37.7% C: 24.4% G: 19.2% T: 17.6% none/other: 1.2% Overview of removed sequences length count expect max.err error counts 1 6335931 5803929.8 0 6335931 2 1617726 1450982.4 0 1617726 3 535354 362745.6 0 535354 4 131792 90686.4 0 131792 5 36956 22671.6 0 36956 6 16035 5667.9 0 16035 7 11931 1417.0 0 11931 8 10462 354.2 0 10462 9 10644 88.6 0 9963 681 10 10404 22.1 1 8782 1622 11 9157 5.5 1 8086 1071 12 8484 1.4 1 7728 756 13 7358 0.3 1 6838 520 14 8168 0.3 1 7594 574 15 5862 0.3 1 5442 420 16 6195 0.3 1 5803 392 17 7310 0.3 1 6805 505 18 4569 0.3 1 4247 322 19 5715 0.3 1 5281 434 20 4085 0.3 1 3782 303 21 4292 0.3 1 3978 314 22 3930 0.3 1 3594 336 23 3812 0.3 1 3452 360 24 3703 0.3 1 3333 370 25 3016 0.3 1 2714 302 26 3034 0.3 1 2710 324 27 2854 0.3 1 2591 263 28 2915 0.3 1 2603 312 29 2604 0.3 1 2320 284 30 2978 0.3 1 2661 317 31 1898 0.3 1 1684 214 32 2196 0.3 1 1955 241 33 2003 0.3 1 1756 247 34 2122 0.3 1 1850 272 35 2017 0.3 1 1751 266 36 1960 0.3 1 1721 239 37 1641 0.3 1 1426 215 38 1651 0.3 1 1417 234 39 1442 0.3 1 1247 195 40 1303 0.3 1 1147 156 41 1317 0.3 1 1136 181 42 1363 0.3 1 1149 214 43 930 0.3 1 785 145 44 1011 0.3 1 834 177 45 1207 0.3 1 1003 204 46 832 0.3 1 659 173 47 905 0.3 1 767 138 48 996 0.3 1 837 159 49 843 0.3 1 704 139 50 879 0.3 1 744 135 51 1120 0.3 1 925 195 52 635 0.3 1 494 141 53 690 0.3 1 564 126 54 598 0.3 1 496 102 55 627 0.3 1 504 123 56 537 0.3 1 421 116 57 563 0.3 1 418 145 58 472 0.3 1 371 101 59 477 0.3 1 375 102 60 448 0.3 1 350 98 61 464 0.3 1 378 86 62 451 0.3 1 323 128 63 451 0.3 1 329 122 64 413 0.3 1 298 115 65 423 0.3 1 291 132 66 555 0.3 1 339 216 67 6522 0.3 1 337 6185 68 8691 0.3 1 1063 7628 69 9191 0.3 1 870 8321 70 9190 0.3 1 984 8206 71 5771 0.3 1 869 4902 72 4258 0.3 1 639 3619 73 3111 0.3 1 497 2614 74 1985 0.3 1 432 1553 75 1261 0.3 1 305 956 76 873 0.3 1 207 666 77 770 0.3 1 204 566 78 637 0.3 1 147 490 79 529 0.3 1 157 372 80 507 0.3 
1 135 372 81 406 0.3 1 135 271 82 440 0.3 1 139 301 83 458 0.3 1 157 301 84 359 0.3 1 137 222 85 296 0.3 1 113 183 86 307 0.3 1 104 203 87 350 0.3 1 131 219 88 292 0.3 1 106 186 89 305 0.3 1 74 231 90 318 0.3 1 65 253 91 393 0.3 1 77 316 92 306 0.3 1 50 256 93 369 0.3 1 30 339 94 427 0.3 1 31 396 95 574 0.3 1 37 537 96 885 0.3 1 56 829 97 1573 0.3 1 115 1458 98 2118 0.3 1 66 2052 99 5906 0.3 1 22 5884 100 47903 0.3 1 64 47839 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz ============================================= 23215719 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 23215719 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 190733 (0.82%)
>>> Now running FastQC on the validated data 4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz_val_1.fq.gz >>> Now running FastQC on the validated data 4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz_val_2.fq.gz<<< Started analysis of 4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz_val_2.fq.gz Deleting both intermediate output files 4R041-L7-P03-ACTGAT-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P03-ACTGAT-READ2-Sequences.txt.gz_trimmed.fq.gz
==================================================================================================== Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 466.14 s (20 us/read; 2.95 M reads/minute). === Summary === Total reads processed: 22,919,563 Reads with adapters: 8,662,070 (37.8%) Reads written (passing filters): 22,919,563 (100.0%) Total basepairs processed: 2,291,956,300 bp Quality-trimmed: 11,887,894 bp (0.5%) Total written (filtered): 2,233,341,045 bp (97.4%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 8662070 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.7% C: 26.7% G: 16.2% T: 18.5% none/other: 3.9% Overview of removed sequences length count expect max.err error counts 1 6057249 5729890.8 0 6057249 2 1350476 1432472.7 0 1350476 3 401413 358118.2 0 401413 4 110413 89529.5 0 110413 5 44565 22382.4 0 44565 6 25971 5595.6 0 25971 7 23216 1398.9 0 23216 8 20554 349.7 0 20554 9 19437 87.4 0 19108 329 10 20458 21.9 1 17702 2756 11 16357 5.5 1 15457 900 12 15655 1.4 1 14905 750 13 14516 0.3 1 13873 643 14 13675 0.3 1 13024 651 15 13036 0.3 1 12404 632 16 12080 0.3 1 11519 561 17 11870 0.3 1 11350 520 18 11003 0.3 1 10430 573 19 9853 0.3 1 9418 435 20 8799 0.3 1 8380 419 21 7992 0.3 1 7576 416 22 7360 0.3 1 6980 380 23 6398 0.3 1 6079 319 24 6098 0.3 1 5790 308 25 5605 0.3 1 5310 295 26 5103 0.3 1 4892 211 27 5058 0.3 1 4797 261 28 4732 0.3 1 4476 256 29 4441 0.3 1 4209 232 30 4224 0.3 1 4002 222 31 3862 0.3 1 3630 232 32 3655 0.3 1 3465 190 33 3359 0.3 1 3184 175 34 3072 0.3 1 2914 158 35 2946 0.3 1 2795 151 36 2851 0.3 1 2690 161 37 2733 0.3 1 2549 184 38 2292 0.3 1 2138 154 39 2236 0.3 1 2079 157 40 2058 0.3 1 1932 126 41 1689 0.3 1 1597 92 42 1605 0.3 1 1500 105 43 2012 0.3 1 1891 121 44 759 0.3 1 695 64 45 1043 0.3 1 974 69 46 1209 0.3 1 1130 79 47 1158 0.3 1 1054 104 48 1274 0.3 1 1203 71 49 1199 0.3 1 1113 86 50 1084 0.3 1 1002 82 51 1033 0.3 1 955 78 52 976 0.3 1 898 78 53 917 0.3 1 814 103 54 833 0.3 1 760 73 55 787 0.3 1 714 73 56 508 0.3 1 460 48 57 541 0.3 1 474 67 58 483 0.3 1 448 35 59 448 0.3 1 404 44 60 422 0.3 1 391 31 61 414 0.3 1 375 39 62 345 0.3 1 302 43 63 361 0.3 1 319 42 64 321 0.3 1 292 29 65 377 0.3 1 296 81 66 353 0.3 1 300 53 67 333 0.3 1 273 60 68 383 0.3 1 306 77 69 420 0.3 1 320 100 70 474 0.3 1 332 142 71 465 0.3 1 279 186 72 619 0.3 1 300 319 73 1032 0.3 1 322 710 74 2398 0.3 1 343 2055 75 35876 0.3 1 593 35283 76 52294 0.3 1 1314 50980 77 47044 0.3 1 1967 45077 78 36303 0.3 1 1245 35058 79 21851 0.3 1 959 
20892 80 14792 0.3 1 694 14098 81 9725 0.3 1 409 9316 82 6405 0.3 1 321 6084 83 3876 0.3 1 270 3606 84 2573 0.3 1 241 2332 85 2112 0.3 1 197 1915 86 1669 0.3 1 137 1532 87 1295 0.3 1 177 1118 88 1097 0.3 1 150 947 89 927 0.3 1 103 824 90 835 0.3 1 121 714 91 847 0.3 1 142 705 92 761 0.3 1 77 684 93 759 0.3 1 70 689 94 908 0.3 1 63 845 95 999 0.3 1 83 916 96 1478 0.3 1 76 1402 97 2171 0.3 1 167 2004 98 2803 0.3 1 93 2710 99 10194 0.3 1 44 10150 100 87530 0.3 1 102 87428 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz ============================================= 22919563 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 470.70 s (21 us/read; 2.92 M reads/minute). === Summary === Total reads processed: 22,919,563 Reads with adapters: 9,218,728 (40.2%) Reads written (passing filters): 22,919,563 (100.0%) Total basepairs processed: 2,291,956,300 bp Quality-trimmed: 15,523,382 bp (0.7%) Total written (filtered): 2,227,595,137 bp (97.2%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 9218728 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 37.6% C: 22.9% G: 18.5% T: 17.3% none/other: 3.7% Overview of removed sequences length count expect max.err error counts 1 6258815 5729890.8 0 6258815 2 1545045 1432472.7 0 1545045 3 525991 358118.2 0 525991 4 138997 89529.5 0 138997 5 47732 22382.4 0 47732 6 26936 5595.6 0 26936 7 23445 1398.9 0 23445 8 20708 349.7 0 20708 9 20126 87.4 0 19533 593 10 19305 21.9 1 17652 1653 11 16687 5.5 1 15581 1106 12 16127 1.4 1 15236 891 13 13989 0.3 1 13319 670 14 16093 0.3 1 15217 876 15 11058 0.3 1 10440 618 16 12056 0.3 1 11382 674 17 14565 0.3 1 13726 839 18 8105 0.3 1 7702 403 19 11272 0.3 1 10682 590 20 7605 0.3 1 7224 381 21 7917 0.3 1 7498 419 22 7347 0.3 1 6881 466 23 6564 0.3 1 6145 419 24 6535 0.3 1 6110 425 25 5287 0.3 1 4992 295 26 5247 0.3 1 4970 277 27 5181 0.3 1 4861 320 28 4859 0.3 1 4536 323 29 4455 0.3 1 4154 301 30 4918 0.3 1 4621 297 31 3319 0.3 1 3097 222 32 3833 0.3 1 3559 274 33 3318 0.3 1 3082 236 34 3203 0.3 1 2996 207 35 2881 0.3 1 2687 194 36 3043 0.3 1 2850 193 37 2667 0.3 1 2521 146 38 2366 0.3 1 2146 220 39 2198 0.3 1 2037 161 40 1969 0.3 1 1847 122 41 1846 0.3 1 1667 179 42 1826 0.3 1 1673 153 43 1273 0.3 1 1174 99 44 1377 0.3 1 1234 143 45 1496 0.3 1 1334 162 46 1133 0.3 1 992 141 47 1169 0.3 1 1063 106 48 1361 0.3 1 1246 115 49 1251 0.3 1 1129 122 50 1135 0.3 1 1036 99 51 1275 0.3 1 1150 125 52 955 0.3 1 834 121 53 912 0.3 1 810 102 54 834 0.3 1 767 67 55 717 0.3 1 631 86 56 670 0.3 1 584 86 57 580 0.3 1 486 94 58 559 0.3 1 479 80 59 489 0.3 1 414 75 60 479 0.3 1 403 76 61 525 0.3 1 460 65 62 444 0.3 1 342 102 63 450 0.3 1 355 95 64 427 0.3 1 316 111 65 511 0.3 1 320 191 66 986 0.3 1 293 693 67 25220 0.3 1 290 24930 68 27258 0.3 1 1560 25698 69 26824 0.3 1 1034 25790 70 27054 0.3 1 1202 25852 71 13895 0.3 1 1132 12763 72 9185 0.3 1 674 8511 73 6446 0.3 1 490 5956 74 3349 0.3 1 361 2988 75 1929 0.3 1 268 1661 76 1330 0.3 1 161 1169 77 1121 0.3 1 196 925 78 859 0.3 1 
154 705 79 730 0.3 1 155 575 80 700 0.3 1 146 554 81 559 0.3 1 141 418 82 528 0.3 1 134 394 83 628 0.3 1 136 492 84 459 0.3 1 135 324 85 464 0.3 1 120 344 86 397 0.3 1 101 296 87 464 0.3 1 106 358 88 503 0.3 1 123 380 89 504 0.3 1 86 418 90 590 0.3 1 106 484 91 748 0.3 1 112 636 92 681 0.3 1 54 627 93 835 0.3 1 43 792 94 1115 0.3 1 33 1082 95 1769 0.3 1 56 1713 96 2800 0.3 1 58 2742 97 4825 0.3 1 127 4698 98 7416 0.3 1 78 7338 99 20601 0.3 1 28 20573 100 160498 0.3 1 103 160395 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz ============================================= 22919563 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 22919563 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 422713 (1.84%)
>>> Now running FastQC on the validated data 4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz_val_1.fq.gz >>> Now running FastQC on the validated data 4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz_val_2.fq.gz<<< Started analysis of 4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz_val_2.fq.gz Deleting both intermediate output files 4R041-L7-P04-GTCCGC-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P04-GTCCGC-READ2-Sequences.txt.gz_trimmed.fq.gz
==================================================================================================== Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1
sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 536.88 s (20 us/read; 2.99 M reads/minute). === Summary === Total reads processed: 26,724,234 Reads with adapters: 10,486,555 (39.2%) Reads written (passing filters): 26,724,234 (100.0%) Total basepairs processed: 2,672,423,400 bp Quality-trimmed: 16,508,006 bp (0.6%) Total written (filtered): 2,587,883,920 bp (96.8%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 10486555 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 35.5% C: 25.2% G: 16.0% T: 18.2% none/other: 5.1% Overview of removed sequences length count expect max.err error counts 1 7211785 6681058.5 0 7211785 2 1603420 1670264.6 0 1603420 3 502061 417566.2 0 502061 4 137281 104391.5 0 137281 5 51475 26097.9 0 51475 6 30001 6524.5 0 30001 7 26169 1631.1 0 26169 8 23638 407.8 0 23638 9 22327 101.9 0 21914 413 10 23967 25.5 1 20695 3272 11 19521 6.4 1 18461 1060 12 17758 1.6 1 16909 849 13 17054 0.4 1 16229 825 14 15600 0.4 1 14887 713 15 15199 0.4 1 14522 677 16 14210 0.4 1 13608 602 17 13475 0.4 1 12803 672 18 13394 0.4 1 12772 622 19 11958 0.4 1 11423 535 20 11038 0.4 1 10528 510 21 9903 0.4 1 9388 515 22 9158 0.4 1 8681 477 23 8508 0.4 1 8074 434 24 7687 0.4 1 7236 451 25 6887 0.4 1 6540 347 26 6856 0.4 1 6486 370 27 6493 0.4 1 6182 311 28 6337 0.4 1 6029 308 29 5722 0.4 1 5446 276 30 5394 0.4 1 5103 291 31 5275 0.4 1 5019 256 32 4901 0.4 1 4608 293 33 4533 0.4 1 4298 235 34 4291 0.4 1 4077 214 35 4136 0.4 1 3877 259 36 3741 0.4 1 3486 255 37 4071 0.4 1 3631 440 38 3129 0.4 1 2937 192 39 3089 0.4 1 2849 240 40 2942 0.4 1 2744 198 41 2573 0.4 1 2390 183 42 2110 0.4 1 1983 127 43 2890 0.4 1 2745 145 44 1162 0.4 1 1077 85 45 1723 0.4 1 1612 111 46 1679 0.4 1 1582 97 47 1905 0.4 1 1761 144 48 1874 0.4 1 1764 110 49 1901 0.4 1 1795 106 50 1771 0.4 1 1659 112 51 1864 0.4 1 1750 114 52 1731 0.4 1 1582 149 53 1535 0.4 1 1416 119 54 1339 0.4 1 1204 135 55 1347 0.4 1 1231 116 56 888 0.4 1 808 80 57 996 0.4 1 908 88 58 888 0.4 1 821 67 59 786 0.4 1 727 59 60 727 0.4 1 659 68 61 815 0.4 1 752 63 62 681 0.4 1 616 65 63 581 0.4 1 515 66 64 676 0.4 1 590 86 65 716 0.4 1 514 202 66 605 0.4 1 515 90 67 613 0.4 1 529 84 68 587 0.4 1 488 99 69 628 0.4 1 496 132 70 750 0.4 1 536 214 71 734 0.4 1 459 275 72 907 0.4 1 476 431 73 1510 0.4 1 504 1006 74 3436 0.4 1 529 2907 75 53016 0.4 1 777 52239 76 76935 0.4 1 1889 75046 77 73954 0.4 1 2706 71248 78 59704 0.4 1 
2061 57643 79 35471 0.4 1 1514 33957 80 24120 0.4 1 1026 23094 81 15842 0.4 1 719 15123 82 10769 0.4 1 485 10284 83 6440 0.4 1 432 6008 84 4174 0.4 1 315 3859 85 3430 0.4 1 265 3165 86 2791 0.4 1 264 2527 87 2100 0.4 1 250 1850 88 1776 0.4 1 205 1571 89 1577 0.4 1 168 1409 90 1389 0.4 1 167 1222 91 1233 0.4 1 218 1015 92 1119 0.4 1 123 996 93 1196 0.4 1 99 1097 94 1438 0.4 1 94 1344 95 1639 0.4 1 114 1525 96 2274 0.4 1 108 2166 97 3385 0.4 1 228 3157 98 4572 0.4 1 130 4442 99 16559 0.4 1 162 16397 100 140340 0.4 1 501 139839 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz ============================================= 26724234 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 533.51 s (20 us/read; 3.01 M reads/minute). === Summary === Total reads processed: 26,724,234 Reads with adapters: 10,856,043 (40.6%) Reads written (passing filters): 26,724,234 (100.0%) Total basepairs processed: 2,672,423,400 bp Quality-trimmed: 19,068,208 bp (0.7%) Total written (filtered): 2,583,146,968 bp (96.7%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 10856043 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 37.6% C: 22.6% G: 17.3% T: 17.6% none/other: 5.0% Overview of removed sequences length count expect max.err error counts 1 7352268 6681058.5 0 7352268 2 1715986 1670264.6 0 1715986 3 593328 417566.2 0 593328 4 158365 104391.5 0 158365 5 54056 26097.9 0 54056 6 30719 6524.5 0 30719 7 26333 1631.1 0 26333 8 23750 407.8 0 23750 9 22939 101.9 0 22352 587 10 22299 25.5 1 20522 1777 11 19878 6.4 1 18686 1192 12 18171 1.6 1 17221 950 13 16628 0.4 1 15827 801 14 17823 0.4 1 16927 896 15 13425 0.4 1 12798 627 16 14117 0.4 1 13472 645 17 16127 0.4 1 15368 759 18 10739 0.4 1 10227 512 19 13260 0.4 1 12624 636 20 9850 0.4 1 9397 453 21 9747 0.4 1 9267 480 22 9093 0.4 1 8507 586 23 8678 0.4 1 8106 572 24 8056 0.4 1 7583 473 25 6705 0.4 1 6342 363 26 6990 0.4 1 6571 419 27 6651 0.4 1 6241 410 28 6412 0.4 1 6051 361 29 5743 0.4 1 5414 329 30 6255 0.4 1 5885 370 31 4587 0.4 1 4351 236 32 4987 0.4 1 4662 325 33 4544 0.4 1 4260 284 34 4419 0.4 1 4109 310 35 4036 0.4 1 3796 240 36 3908 0.4 1 3686 222 37 3692 0.4 1 3460 232 38 3402 0.4 1 3155 247 39 3092 0.4 1 2903 189 40 2741 0.4 1 2587 154 41 2501 0.4 1 2300 201 42 2608 0.4 1 2402 206 43 1907 0.4 1 1760 147 44 1969 0.4 1 1821 148 45 2295 0.4 1 2130 165 46 1609 0.4 1 1458 151 47 1913 0.4 1 1775 138 48 1954 0.4 1 1831 123 49 1949 0.4 1 1784 165 50 1831 0.4 1 1710 121 51 2235 0.4 1 2058 177 52 1551 0.4 1 1422 129 53 1573 0.4 1 1447 126 54 1307 0.4 1 1217 90 55 1192 0.4 1 1116 76 56 1051 0.4 1 962 89 57 1052 0.4 1 915 137 58 975 0.4 1 859 116 59 875 0.4 1 773 102 60 766 0.4 1 676 90 61 806 0.4 1 710 96 62 670 0.4 1 549 121 63 676 0.4 1 565 111 64 768 0.4 1 628 140 65 747 0.4 1 511 236 66 1185 0.4 1 507 678 67 32523 0.4 1 596 31927 68 39196 0.4 1 1933 37263 69 40116 0.4 1 1608 38508 70 41527 0.4 1 1839 39688 71 23648 0.4 1 1649 21999 72 17110 0.4 1 1164 15946 73 12063 0.4 1 877 11186 74 7075 0.4 1 675 6400 75 4034 0.4 1 492 3542 76 2841 0.4 1 307 2534 77 2392 
0.4 1 300 2092 78 1761 0.4 1 211 1550 79 1333 0.4 1 223 1110 80 1382 0.4 1 199 1183 81 1053 0.4 1 208 845 82 1008 0.4 1 218 790 83 991 0.4 1 235 756 84 778 0.4 1 198 580 85 716 0.4 1 150 566 86 782 0.4 1 154 628 87 813 0.4 1 172 641 88 835 0.4 1 182 653 89 829 0.4 1 109 720 90 958 0.4 1 142 816 91 1184 0.4 1 168 1016 92 1208 0.4 1 65 1143 93 1306 0.4 1 55 1251 94 1837 0.4 1 58 1779 95 2748 0.4 1 79 2669 96 4252 0.4 1 80 4172 97 7341 0.4 1 149 7192 98 11424 0.4 1 101 11323 99 32808 0.4 1 212 32596 100 254407 0.4 1 788 253619 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz ============================================= 26724234 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 26724234 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 633493 (2.37%) >>> Now running FastQC on the validated data 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 5% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 10% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 15% complete for 
4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 20% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 25% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 30% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 35% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 40% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 45% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 50% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 55% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 60% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 65% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 70% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 75% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 80% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 85% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 90% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 95% complete for 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_val_1.fq.gz
>>> Now running FastQC on the validated data 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz<<<
Started analysis of 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 5% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 10% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 15% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 20% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 25% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 30% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 35% complete for
4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 40% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 45% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 50% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 55% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 60% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 65% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 70% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 75% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 80% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 85% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 90% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 95% complete for 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_val_2.fq.gz
Deleting both intermediate output files 4R041-L7-P05-GAGTGG-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P05-GAGTGG-READ2-Sequences.txt.gz_trimmed.fq.gz
====================================================================================================
Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
All Read 1
sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz <<< 10000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 369.99 s (20 us/read; 3.01 M reads/minute). === Summary === Total reads processed: 18,561,500 Reads with adapters: 6,834,093 (36.8%) Reads written (passing filters): 18,561,500 (100.0%) Total basepairs processed: 1,856,150,000 bp Quality-trimmed: 4,879,709 bp (0.3%) Total written (filtered): 1,828,627,143 bp (98.5%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 6834093 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 35.6% C: 26.6% G: 16.1% T: 19.6% none/other: 2.0% Overview of removed sequences length count expect max.err error counts 1 5012807 4640375.0 0 5012807 2 1123453 1160093.8 0 1123453 3 334480 290023.4 0 334480 4 84832 72505.9 0 84832 5 24698 18126.5 0 24698 6 10272 4531.6 0 10272 7 7885 1132.9 0 7885 8 6908 283.2 0 6908 9 6843 70.8 0 6516 327 10 7837 17.7 1 5768 2069 11 5715 4.4 1 5196 519 12 5146 1.1 1 4880 266 13 4679 0.3 1 4438 241 14 4307 0.3 1 4093 214 15 3965 0.3 1 3790 175 16 3737 0.3 1 3555 182 17 3340 0.3 1 3149 191 18 3292 0.3 1 3106 186 19 2772 0.3 1 2635 137 20 2673 0.3 1 2487 186 21 2338 0.3 1 2201 137 22 2080 0.3 1 1961 119 23 1819 0.3 1 1733 86 24 1634 0.3 1 1522 112 25 1539 0.3 1 1446 93 26 1418 0.3 1 1346 72 27 1262 0.3 1 1189 73 28 1286 0.3 1 1192 94 29 1249 0.3 1 1167 82 30 1132 0.3 1 1075 57 31 1002 0.3 1 915 87 32 979 0.3 1 913 66 33 847 0.3 1 781 66 34 853 0.3 1 759 94 35 814 0.3 1 755 59 36 721 0.3 1 656 65 37 713 0.3 1 643 70 38 573 0.3 1 509 64 39 581 0.3 1 520 61 40 477 0.3 1 439 38 41 601 0.3 1 554 47 42 250 0.3 1 227 23 43 320 0.3 1 296 24 44 334 0.3 1 301 33 45 297 0.3 1 267 30 46 323 0.3 1 256 67 47 292 0.3 1 242 50 48 353 0.3 1 323 30 49 309 0.3 1 262 47 50 315 0.3 1 272 43 51 292 0.3 1 251 41 52 288 0.3 1 237 51 53 282 0.3 1 255 27 54 186 0.3 1 153 33 55 216 0.3 1 180 36 56 170 0.3 1 141 29 57 166 0.3 1 139 27 58 167 0.3 1 143 24 59 156 0.3 1 132 24 60 172 0.3 1 132 40 61 166 0.3 1 133 33 62 143 0.3 1 108 35 63 134 0.3 1 99 35 64 173 0.3 1 128 45 65 133 0.3 1 100 33 66 140 0.3 1 106 34 67 168 0.3 1 136 32 68 164 0.3 1 120 44 69 234 0.3 1 144 90 70 283 0.3 1 149 134 71 409 0.3 1 150 259 72 926 0.3 1 155 771 73 12487 0.3 1 166 12321 74 22317 0.3 1 498 21819 75 20727 0.3 1 718 20009 76 12784 0.3 1 551 12233 77 9024 0.3 1 422 8602 78 6155 0.3 1 261 5894 79 3908 0.3 1 225 3683 80 2907 0.3 1 152 2755 81 1945 0.3 1 115 1830 82 1287 0.3 1 79 1208 83 1076 0.3 1 101 
975 84 858 0.3 1 100 758 85 650 0.3 1 88 562 86 519 0.3 1 81 438 87 431 0.3 1 73 358 88 356 0.3 1 35 321 89 305 0.3 1 63 242 90 278 0.3 1 49 229 91 286 0.3 1 52 234 92 293 0.3 1 36 257 93 292 0.3 1 22 270 94 371 0.3 1 28 343 95 447 0.3 1 40 407 96 568 0.3 1 22 546 97 883 0.3 1 59 824 98 1097 0.3 1 25 1072 99 4062 0.3 1 22 4040 100 36260 0.3 1 33 36227 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz ============================================= 18561500 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz <<< 10000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 376.38 s (20 us/read; 2.96 M reads/minute). === Summary === Total reads processed: 18,561,500 Reads with adapters: 7,037,876 (37.9%) Reads written (passing filters): 18,561,500 (100.0%) Total basepairs processed: 1,856,150,000 bp Quality-trimmed: 10,302,737 bp (0.6%) Total written (filtered): 1,822,120,138 bp (98.2%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 7037876 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 37.6% C: 23.7% G: 17.7% T: 19.0% none/other: 2.0% Overview of removed sequences length count expect max.err error counts 1 5044369 4640375.0 0 5044369 2 1212187 1160093.8 0 1212187 3 396210 290023.4 0 396210 4 100150 72505.9 0 100150 5 26983 18126.5 0 26983 6 11393 4531.6 0 11393 7 8012 1132.9 0 8012 8 6949 283.2 0 6949 9 7096 70.8 0 6734 362 10 6851 17.7 1 5772 1079 11 5870 4.4 1 5312 558 12 5325 1.1 1 5015 310 13 4640 0.3 1 4382 258 14 5078 0.3 1 4770 308 15 3487 0.3 1 3291 196 16 3681 0.3 1 3479 202 17 4133 0.3 1 3902 231 18 2545 0.3 1 2404 141 19 3195 0.3 1 3027 168 20 2374 0.3 1 2266 108 21 2319 0.3 1 2180 139 22 2126 0.3 1 1970 156 23 1960 0.3 1 1779 181 24 1833 0.3 1 1661 172 25 1508 0.3 1 1374 134 26 1488 0.3 1 1389 99 27 1376 0.3 1 1280 96 28 1382 0.3 1 1255 127 29 1289 0.3 1 1193 96 30 1401 0.3 1 1269 132 31 893 0.3 1 803 90 32 1038 0.3 1 940 98 33 911 0.3 1 809 102 34 929 0.3 1 832 97 35 862 0.3 1 745 117 36 813 0.3 1 724 89 37 777 0.3 1 693 84 38 650 0.3 1 560 90 39 586 0.3 1 511 75 40 540 0.3 1 478 62 41 504 0.3 1 419 85 42 601 0.3 1 503 98 43 352 0.3 1 309 43 44 402 0.3 1 342 60 45 483 0.3 1 399 84 46 318 0.3 1 255 63 47 300 0.3 1 256 44 48 429 0.3 1 364 65 49 331 0.3 1 282 49 50 356 0.3 1 307 49 51 398 0.3 1 333 65 52 306 0.3 1 237 69 53 304 0.3 1 254 50 54 270 0.3 1 228 42 55 254 0.3 1 192 62 56 224 0.3 1 183 41 57 208 0.3 1 164 44 58 198 0.3 1 161 37 59 236 0.3 1 184 52 60 213 0.3 1 158 55 61 202 0.3 1 165 37 62 200 0.3 1 145 55 63 179 0.3 1 135 44 64 185 0.3 1 147 38 65 199 0.3 1 119 80 66 335 0.3 1 125 210 67 8406 0.3 1 172 8234 68 14840 0.3 1 481 14359 69 12417 0.3 1 601 11816 70 9851 0.3 1 539 9312 71 5930 0.3 1 431 5499 72 3693 0.3 1 308 3385 73 2087 0.3 1 232 1855 74 1201 0.3 1 127 1074 75 849 0.3 1 133 716 76 504 0.3 1 83 421 77 458 0.3 1 70 388 78 433 0.3 1 63 370 79 357 0.3 1 67 290 80 322 0.3 1 53 269 81 251 0.3 1 44 207 82 240 0.3 1 44 196 83 308 0.3 1 70 238 
84 255 0.3 1 51 204 85 243 0.3 1 76 167 86 226 0.3 1 47 179 87 226 0.3 1 42 184 88 255 0.3 1 52 203 89 252 0.3 1 28 224 90 273 0.3 1 22 251 91 356 0.3 1 36 320 92 351 0.3 1 27 324 93 430 0.3 1 26 404 94 630 0.3 1 16 614 95 976 0.3 1 19 957 96 1528 0.3 1 16 1512 97 2474 0.3 1 41 2433 98 3637 0.3 1 26 3611 99 9531 0.3 1 11 9520 100 60890 0.3 1 30 60860 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz ============================================= 18561500 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 18561500 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 194532 (1.05%) >>> Now running FastQC on the validated data 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 5% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 10% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 15% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 20% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 25% complete for 
4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 30% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 35% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 40% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 45% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 50% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 55% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 60% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 65% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 70% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 75% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 80% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 85% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 90% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 95% complete for 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_val_1.fq.gz
>>> Now running FastQC on the validated data 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz<<<
Started analysis of 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 5% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 10% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 15% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 20% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 25% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 30% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 35% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 40% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 45% complete for
4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 50% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 55% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 60% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 65% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 70% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 75% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 80% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 85% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 90% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 95% complete for 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_val_2.fq.gz
Deleting both intermediate output files 4R041-L7-P06-GCCAAT-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P06-GCCAAT-READ2-Sequences.txt.gz_trimmed.fq.gz
====================================================================================================
Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_trimming_report.txt'
SUMMARISING RUN PARAMETERS
==========================
Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor
qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 512.32 s (20 us/read; 3.05 M reads/minute). === Summary === Total reads processed: 26,060,601 Reads with adapters: 9,261,389 (35.5%) Reads written (passing filters): 26,060,601 (100.0%) Total basepairs processed: 2,606,060,100 bp Quality-trimmed: 3,634,254 bp (0.1%) Total written (filtered): 2,578,109,104 bp (98.9%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 9261389 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.1% C: 28.9% G: 16.5% T: 19.6% none/other: 1.0% Overview of removed sequences length count expect max.err error counts 1 6795262 6515150.2 0 6795262 2 1528861 1628787.6 0 1528861 3 438293 407196.9 0 438293 4 114850 101799.2 0 114850 5 41277 25449.8 0 41277 6 21398 6362.5 0 21398 7 17389 1590.6 0 17389 8 15815 397.7 0 15815 9 15057 99.4 0 14743 314 10 15852 24.9 1 13323 2529 11 12604 6.2 1 11900 704 12 11601 1.6 1 11150 451 13 10563 0.4 1 10152 411 14 9939 0.4 1 9547 392 15 9466 0.4 1 9098 368 16 8644 0.4 1 8295 349 17 8087 0.4 1 7804 283 18 7872 0.4 1 7557 315 19 6588 0.4 1 6333 255 20 6207 0.4 1 5972 235 21 5659 0.4 1 5409 250 22 4768 0.4 1 4554 214 23 4485 0.4 1 4313 172 24 4070 0.4 1 3871 199 25 3728 0.4 1 3560 168 26 3427 0.4 1 3274 153 27 3380 0.4 1 3240 140 28 3044 0.4 1 2900 144 29 2907 0.4 1 2778 129 30 2713 0.4 1 2604 109 31 2514 0.4 1 2417 97 32 2334 0.4 1 2201 133 33 2041 0.4 1 1966 75 34 1938 0.4 1 1832 106 35 1746 0.4 1 1654 92 36 1742 0.4 1 1647 95 37 1495 0.4 1 1406 89 38 1380 0.4 1 1297 83 39 1154 0.4 1 1093 61 40 1012 0.4 1 959 53 41 1016 0.4 1 952 64 42 956 0.4 1 889 67 43 1199 0.4 1 1141 58 44 420 0.4 1 388 32 45 575 0.4 1 541 34 46 573 0.4 1 539 34 47 612 0.4 1 576 36 48 603 0.4 1 560 43 49 552 0.4 1 517 35 50 546 0.4 1 508 38 51 471 0.4 1 431 40 52 465 0.4 1 422 43 53 444 0.4 1 407 37 54 363 0.4 1 330 33 55 387 0.4 1 341 46 56 276 0.4 1 252 24 57 269 0.4 1 246 23 58 242 0.4 1 208 34 59 229 0.4 1 203 26 60 246 0.4 1 228 18 61 195 0.4 1 175 20 62 207 0.4 1 177 30 63 162 0.4 1 141 21 64 196 0.4 1 170 26 65 192 0.4 1 151 41 66 193 0.4 1 157 36 67 216 0.4 1 191 25 68 200 0.4 1 175 25 69 215 0.4 1 181 34 70 230 0.4 1 181 49 71 262 0.4 1 195 67 72 267 0.4 1 186 81 73 356 0.4 1 203 153 74 628 0.4 1 219 409 75 8379 0.4 1 349 8030 76 11849 0.4 1 986 10863 77 12578 0.4 1 1301 11277 78 11049 0.4 1 1085 9964 79 6939 0.4 1 922 6017 80 4870 0.4 1 673 4197 81 3362 0.4 1 441 2921 
82 2313 0.4 1 364 1949 83 1392 0.4 1 279 1113 84 886 0.4 1 212 674 85 784 0.4 1 186 598 86 653 0.4 1 192 461 87 537 0.4 1 176 361 88 443 0.4 1 150 293 89 368 0.4 1 107 261 90 328 0.4 1 132 196 91 331 0.4 1 132 199 92 271 0.4 1 92 179 93 291 0.4 1 95 196 94 319 0.4 1 82 237 95 397 0.4 1 101 296 96 491 0.4 1 97 394 97 720 0.4 1 181 539 98 828 0.4 1 90 738 99 2815 0.4 1 59 2756 100 23071 0.4 1 65 23006 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz ============================================= 26060601 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 535.66 s (21 us/read; 2.92 M reads/minute). === Summary === Total reads processed: 26,060,601 Reads with adapters: 10,098,285 (38.7%) Reads written (passing filters): 26,060,601 (100.0%) Total basepairs processed: 2,606,060,100 bp Quality-trimmed: 11,087,610 bp (0.4%) Total written (filtered): 2,568,463,082 bp (98.6%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 10098285 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 37.6% C: 24.2% G: 19.3% T: 18.1% none/other: 0.9% Overview of removed sequences length count expect max.err error counts 1 7082217 6515150.2 0 7082217 2 1835190 1628787.6 0 1835190 3 619784 407196.9 0 619784 4 158005 101799.2 0 158005 5 48398 25449.8 0 48398 6 23278 6362.5 0 23278 7 17688 1590.6 0 17688 8 15946 397.7 0 15946 9 15791 99.4 0 15056 735 10 15002 24.9 1 13434 1568 11 13002 6.2 1 12082 920 12 12117 1.6 1 11476 641 13 10394 0.4 1 9881 513 14 11738 0.4 1 11191 547 15 8285 0.4 1 7889 396 16 8755 0.4 1 8320 435 17 10024 0.4 1 9579 445 18 6099 0.4 1 5848 251 19 7641 0.4 1 7257 384 20 5534 0.4 1 5266 268 21 5721 0.4 1 5439 282 22 4854 0.4 1 4526 328 23 4735 0.4 1 4422 313 24 4519 0.4 1 4189 330 25 3675 0.4 1 3474 201 26 3630 0.4 1 3379 251 27 3573 0.4 1 3341 232 28 3250 0.4 1 3015 235 29 3006 0.4 1 2798 208 30 3323 0.4 1 3074 249 31 2303 0.4 1 2106 197 32 2487 0.4 1 2305 182 33 2151 0.4 1 2000 151 34 2123 0.4 1 1946 177 35 1929 0.4 1 1761 168 36 1907 0.4 1 1735 172 37 1666 0.4 1 1508 158 38 1557 0.4 1 1388 169 39 1253 0.4 1 1138 115 40 1153 0.4 1 1039 114 41 1201 0.4 1 1057 144 42 1268 0.4 1 1113 155 43 796 0.4 1 697 99 44 936 0.4 1 800 136 45 964 0.4 1 828 136 46 646 0.4 1 545 101 47 712 0.4 1 615 97 48 730 0.4 1 632 98 49 654 0.4 1 559 95 50 694 0.4 1 602 92 51 785 0.4 1 660 125 52 528 0.4 1 429 99 53 523 0.4 1 452 71 54 428 0.4 1 376 52 55 421 0.4 1 346 75 56 449 0.4 1 369 80 57 395 0.4 1 307 88 58 345 0.4 1 270 75 59 325 0.4 1 258 67 60 397 0.4 1 296 101 61 339 0.4 1 257 82 62 349 0.4 1 252 97 63 272 0.4 1 202 70 64 316 0.4 1 240 76 65 319 0.4 1 229 90 66 388 0.4 1 191 197 67 5712 0.4 1 237 5475 68 7455 0.4 1 958 6497 69 7822 0.4 1 734 7088 70 8211 0.4 1 909 7302 71 4854 0.4 1 836 4018 72 3418 0.4 1 618 2800 73 2398 0.4 1 399 1999 74 1497 0.4 1 367 1130 75 954 0.4 1 262 692 76 636 0.4 1 155 481 77 595 0.4 1 168 427 78 467 0.4 1 132 335 79 384 0.4 1 145 239 80 358 0.4 1 133 225 
81 293 0.4 1 111 182 82 287 0.4 1 113 174 83 304 0.4 1 113 191 84 247 0.4 1 115 132 85 230 0.4 1 105 125 86 251 0.4 1 108 143 87 279 0.4 1 103 176 88 288 0.4 1 119 169 89 275 0.4 1 72 203 90 304 0.4 1 88 216 91 405 0.4 1 100 305 92 273 0.4 1 54 219 93 313 0.4 1 45 268 94 366 0.4 1 34 332 95 546 0.4 1 54 492 96 811 0.4 1 58 753 97 1428 0.4 1 132 1296 98 1865 0.4 1 67 1798 99 5154 0.4 1 27 5127 100 40722 0.4 1 73 40649 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz ============================================= 26060601 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 26060601 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 153969 (0.59%) >>> Now running FastQC on the validated data 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 5% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 10% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 15% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 20% complete for 
4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 25% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 30% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 35% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 40% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 45% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 50% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 55% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 60% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 65% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 70% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 75% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 80% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 85% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 90% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 95% complete for 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_val_1.fq.gz >>> Now running FastQC on the validated data 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz<<< Started analysis of 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 5% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 10% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 15% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 20% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 25% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 30% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 35% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 40% complete for 
4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 45% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 50% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 55% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 60% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 65% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 70% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 75% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 80% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 85% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 90% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 95% complete for 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_val_2.fq.gz Deleting both intermediate output files 4R041-L7-P07-AGTCAA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P07-AGTCAA-READ2-Sequences.txt.gz_trimmed.fq.gz ==================================================================================================== Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases 
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 565.91 s (20 us/read; 3.02 M reads/minute). === Summary === Total reads processed: 28,467,375 Reads with adapters: 10,107,610 (35.5%) Reads written (passing filters): 28,467,375 (100.0%) Total basepairs processed: 2,846,737,500 bp Quality-trimmed: 4,562,072 bp (0.2%) Total written (filtered): 2,818,843,445 bp (99.0%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 10107610 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.8% C: 28.5% G: 16.1% T: 19.9% none/other: 0.7% Overview of removed sequences length count expect max.err error counts 1 7493128 7116843.8 0 7493128 2 1672176 1779210.9 0 1672176 3 480110 444802.7 0 480110 4 121401 111200.7 0 121401 5 40846 27800.2 0 40846 6 18726 6950.0 0 18726 7 14938 1737.5 0 14938 8 13611 434.4 0 13611 9 13651 108.6 0 12419 1232 10 15021 27.1 1 11648 3373 11 11089 6.8 1 10406 683 12 10093 1.7 1 9595 498 13 9673 0.4 1 9249 424 14 8627 0.4 1 8234 393 15 8223 0.4 1 7806 417 16 7765 0.4 1 7419 346 17 7265 0.4 1 6932 333 18 6794 0.4 1 6475 319 19 6101 0.4 1 5852 249 20 5601 0.4 1 5368 233 21 5216 0.4 1 4963 253 22 4642 0.4 1 4408 234 23 4113 0.4 1 3874 239 24 3798 0.4 1 3596 202 25 3586 0.4 1 3402 184 26 3218 0.4 1 3060 158 27 3041 0.4 1 2890 151 28 2953 0.4 1 2788 165 29 2755 0.4 1 2596 159 30 2542 0.4 1 2387 155 31 2423 0.4 1 2311 112 32 2206 0.4 1 2061 145 33 1936 0.4 1 1834 102 34 1877 0.4 1 1758 119 35 1763 0.4 1 1652 111 36 1566 0.4 1 1490 76 37 1396 0.4 1 1317 79 38 1279 0.4 1 1205 74 39 1191 0.4 1 1100 91 40 1074 0.4 1 1010 64 41 1301 0.4 1 1227 74 42 615 0.4 1 552 63 43 712 0.4 1 668 44 44 680 0.4 1 634 46 45 625 0.4 1 579 46 46 603 0.4 1 556 47 47 626 0.4 1 553 73 48 593 0.4 1 535 58 49 572 0.4 1 518 54 50 541 0.4 1 502 39 51 454 0.4 1 412 42 52 407 0.4 1 377 30 53 503 0.4 1 463 40 54 296 0.4 1 253 43 55 308 0.4 1 284 24 56 228 0.4 1 203 25 57 278 0.4 1 244 34 58 224 0.4 1 193 31 59 219 0.4 1 183 36 60 196 0.4 1 171 25 61 218 0.4 1 191 27 62 171 0.4 1 156 15 63 194 0.4 1 160 34 64 217 0.4 1 187 30 65 184 0.4 1 161 23 66 183 0.4 1 152 31 67 197 0.4 1 175 22 68 226 0.4 1 184 42 69 226 0.4 1 171 55 70 278 0.4 1 185 93 71 302 0.4 1 171 131 72 592 0.4 1 196 396 73 6813 0.4 1 272 6541 74 11687 0.4 1 1162 10525 75 11397 0.4 1 1538 9859 76 7342 0.4 1 1303 6039 77 5260 0.4 1 1050 4210 78 3546 0.4 1 672 2874 79 2244 0.4 1 488 1756 80 1583 0.4 1 336 1247 81 1158 0.4 1 
253 905 82 759 0.4 1 185 574 83 686 0.4 1 213 473 84 515 0.4 1 176 339 85 428 0.4 1 170 258 86 355 0.4 1 164 191 87 339 0.4 1 175 164 88 292 0.4 1 163 129 89 240 0.4 1 121 119 90 214 0.4 1 102 112 91 255 0.4 1 147 108 92 229 0.4 1 114 115 93 192 0.4 1 68 124 94 271 0.4 1 89 182 95 296 0.4 1 106 190 96 377 0.4 1 105 272 97 556 0.4 1 189 367 98 674 0.4 1 128 546 99 2000 0.4 1 52 1948 100 17519 0.4 1 59 17460 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz ============================================= 28467375 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 583.70 s (21 us/read; 2.93 M reads/minute). === Summary === Total reads processed: 28,467,375 Reads with adapters: 11,050,629 (38.8%) Reads written (passing filters): 28,467,375 (100.0%) Total basepairs processed: 2,846,737,500 bp Quality-trimmed: 12,975,319 bp (0.5%) Total written (filtered): 2,807,712,233 bp (98.6%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 11050629 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 38.3% C: 23.6% G: 19.1% T: 18.4% none/other: 0.6% Overview of removed sequences length count expect max.err error counts 1 7843930 7116843.8 0 7843930 2 2002845 1779210.9 0 2002845 3 675101 444802.7 0 675101 4 164940 111200.7 0 164940 5 47683 27800.2 0 47683 6 20213 6950.0 0 20213 7 15510 1737.5 0 15510 8 13813 434.4 0 13813 9 13681 108.6 0 12978 703 10 13775 27.1 1 11820 1955 11 11846 6.8 1 10634 1212 12 10815 1.7 1 9950 865 13 9647 0.4 1 9143 504 14 10355 0.4 1 9718 637 15 7149 0.4 1 6720 429 16 7955 0.4 1 7504 451 17 9146 0.4 1 8552 594 18 5330 0.4 1 5003 327 19 7134 0.4 1 6689 445 20 5073 0.4 1 4773 300 21 5252 0.4 1 4971 281 22 4838 0.4 1 4482 356 23 4533 0.4 1 4113 420 24 4363 0.4 1 3970 393 25 3575 0.4 1 3321 254 26 3537 0.4 1 3247 290 27 3337 0.4 1 3080 257 28 3250 0.4 1 2969 281 29 3022 0.4 1 2734 288 30 3345 0.4 1 3031 314 31 2188 0.4 1 2002 186 32 2501 0.4 1 2254 247 33 2137 0.4 1 1902 235 34 2167 0.4 1 1926 241 35 1954 0.4 1 1732 222 36 1917 0.4 1 1679 238 37 1656 0.4 1 1476 180 38 1639 0.4 1 1364 275 39 1353 0.4 1 1180 173 40 1255 0.4 1 1105 150 41 1231 0.4 1 1056 175 42 1343 0.4 1 1146 197 43 798 0.4 1 693 105 44 971 0.4 1 811 160 45 1138 0.4 1 901 237 46 727 0.4 1 572 155 47 738 0.4 1 610 128 48 838 0.4 1 696 142 49 696 0.4 1 568 128 50 696 0.4 1 562 134 51 864 0.4 1 709 155 52 558 0.4 1 425 133 53 658 0.4 1 533 125 54 493 0.4 1 387 106 55 463 0.4 1 367 96 56 396 0.4 1 294 102 57 454 0.4 1 334 120 58 370 0.4 1 287 83 59 396 0.4 1 297 99 60 348 0.4 1 268 80 61 405 0.4 1 304 101 62 352 0.4 1 253 99 63 340 0.4 1 256 84 64 344 0.4 1 277 67 65 365 0.4 1 261 104 66 390 0.4 1 226 164 67 3662 0.4 1 286 3376 68 7361 0.4 1 936 6425 69 6755 0.4 1 1174 5581 70 5662 0.4 1 1029 4633 71 3537 0.4 1 806 2731 72 2329 0.4 1 613 1716 73 1288 0.4 1 368 920 74 791 0.4 1 216 575 75 608 0.4 1 195 413 76 362 0.4 1 135 227 77 356 0.4 1 141 215 78 314 0.4 1 124 190 79 331 0.4 1 141 190 80 290 0.4 1 
118 172 81 278 0.4 1 122 156 82 264 0.4 1 126 138 83 346 0.4 1 130 216 84 257 0.4 1 113 144 85 285 0.4 1 139 146 86 267 0.4 1 122 145 87 275 0.4 1 136 139 88 304 0.4 1 172 132 89 260 0.4 1 72 188 90 264 0.4 1 74 190 91 389 0.4 1 133 256 92 341 0.4 1 100 241 93 352 0.4 1 59 293 94 453 0.4 1 51 402 95 674 0.4 1 73 601 96 1087 0.4 1 85 1002 97 1655 0.4 1 151 1504 98 2148 0.4 1 78 2070 99 5527 0.4 1 28 5499 100 31425 0.4 1 64 31361 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz ============================================= 28467375 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 28467375 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 144330 (0.51%) >>> Now running FastQC on the validated data 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 5% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 10% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 15% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 20% complete for 
4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 25% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 30% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 35% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 40% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 45% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 50% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 55% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 60% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 65% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 70% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 75% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 80% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 85% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 90% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 95% complete for 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_val_1.fq.gz >>> Now running FastQC on the validated data 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz<<< Started analysis of 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 5% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 10% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 15% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 20% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 25% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 30% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 35% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 40% complete for 
4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 45% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 50% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 55% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 60% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 65% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 70% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 75% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 80% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 85% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 90% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 95% complete for 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_val_2.fq.gz Deleting both intermediate output files 4R041-L7-P08-CTTGTA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P08-CTTGTA-READ2-Sequences.txt.gz_trimmed.fq.gz ==================================================================================================== Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases 
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 438.18 s (20 us/read; 3.03 M reads/minute). === Summary === Total reads processed: 22,109,558 Reads with adapters: 8,247,591 (37.3%) Reads written (passing filters): 22,109,558 (100.0%) Total basepairs processed: 2,210,955,800 bp Quality-trimmed: 5,902,900 bp (0.3%) Total written (filtered): 2,173,134,278 bp (98.3%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 8247591 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 33.9% C: 27.9% G: 16.5% T: 19.5% none/other: 2.1% Overview of removed sequences length count expect max.err error counts 1 5955682 5527389.5 0 5955682 2 1269050 1381847.4 0 1269050 3 390312 345461.8 0 390312 4 103756 86365.5 0 103756 5 40343 21591.4 0 40343 6 24754 5397.8 0 24754 7 20960 1349.5 0 20960 8 19145 337.4 0 19145 9 18793 84.3 0 18534 259 10 17958 21.1 1 15344 2614 11 16459 5.3 1 15671 788 12 14566 1.3 1 14025 541 13 13253 0.3 1 12704 549 14 12626 0.3 1 12078 548 15 11394 0.3 1 10934 460 16 11099 0.3 1 10633 466 17 10373 0.3 1 9936 437 18 9797 0.3 1 9389 408 19 8527 0.3 1 8195 332 20 7784 0.3 1 7511 273 21 7409 0.3 1 7093 316 22 6111 0.3 1 5861 250 23 5553 0.3 1 5308 245 24 5685 0.3 1 5428 257 25 5055 0.3 1 4867 188 26 4473 0.3 1 4286 187 27 4373 0.3 1 4163 210 28 4116 0.3 1 3929 187 29 3837 0.3 1 3640 197 30 3485 0.3 1 3336 149 31 3342 0.3 1 3181 161 32 2943 0.3 1 2802 141 33 2706 0.3 1 2576 130 34 2634 0.3 1 2481 153 35 2398 0.3 1 2282 116 36 2191 0.3 1 2088 103 37 1970 0.3 1 1870 100 38 1835 0.3 1 1739 96 39 1581 0.3 1 1497 84 40 1568 0.3 1 1479 89 41 1194 0.3 1 1134 60 42 1142 0.3 1 1094 48 43 1924 0.3 1 1844 80 44 406 0.3 1 370 36 45 590 0.3 1 540 50 46 760 0.3 1 677 83 47 773 0.3 1 713 60 48 858 0.3 1 801 57 49 762 0.3 1 706 56 50 687 0.3 1 625 62 51 722 0.3 1 661 61 52 591 0.3 1 550 41 53 554 0.3 1 514 40 54 469 0.3 1 410 59 55 420 0.3 1 384 36 56 301 0.3 1 275 26 57 293 0.3 1 274 19 58 253 0.3 1 222 31 59 266 0.3 1 225 41 60 317 0.3 1 222 95 61 276 0.3 1 191 85 62 216 0.3 1 155 61 63 191 0.3 1 150 41 64 191 0.3 1 157 34 65 202 0.3 1 149 53 66 183 0.3 1 120 63 67 205 0.3 1 157 48 68 152 0.3 1 108 44 69 245 0.3 1 188 57 70 232 0.3 1 148 84 71 202 0.3 1 145 57 72 254 0.3 1 133 121 73 448 0.3 1 196 252 74 712 0.3 1 166 546 75 11435 0.3 1 392 11043 76 14350 0.3 1 533 13817 77 12804 0.3 1 693 12111 78 11532 0.3 1 499 11033 79 7997 0.3 1 547 7450 80 6300 0.3 1 398 5902 81 4898 
0.3 1 280 4618 82 3716 0.3 1 253 3463 83 2772 0.3 1 287 2485 84 2136 0.3 1 225 1911 85 1915 0.3 1 220 1695 86 1650 0.3 1 229 1421 87 1442 0.3 1 208 1234 88 1240 0.3 1 191 1049 89 1194 0.3 1 128 1066 90 1136 0.3 1 130 1006 91 1086 0.3 1 169 917 92 1039 0.3 1 127 912 93 1052 0.3 1 82 970 94 1329 0.3 1 89 1240 95 1437 0.3 1 88 1349 96 1907 0.3 1 110 1797 97 2695 0.3 1 184 2511 98 3788 0.3 1 85 3703 99 10510 0.3 1 67 10443 100 69344 0.3 1 105 69239 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz ============================================= 22109558 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 446.53 s (20 us/read; 2.97 M reads/minute). === Summary === Total reads processed: 22,109,558 Reads with adapters: 8,720,472 (39.4%) Reads written (passing filters): 22,109,558 (100.0%) Total basepairs processed: 2,210,955,800 bp Quality-trimmed: 10,968,521 bp (0.5%) Total written (filtered): 2,167,546,551 bp (98.0%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 8720472 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 37.2% C: 23.0% G: 19.2% T: 18.6% none/other: 2.0% Overview of removed sequences length count expect max.err error counts 1 6033527 5527389.5 0 6033527 2 1496891 1381847.4 0 1496891 3 514242 345461.8 0 514242 4 137411 86365.5 0 137411 5 45561 21591.4 0 45561 6 26223 5397.8 0 26223 7 21556 1349.5 0 21556 8 19173 337.4 0 19173 9 18870 84.3 0 18396 474 10 17347 21.1 1 15989 1358 11 16326 5.3 1 15386 940 12 15165 1.3 1 14436 729 13 13007 0.3 1 12441 566 14 14407 0.3 1 13708 699 15 10069 0.3 1 9608 461 16 10830 0.3 1 10332 498 17 12602 0.3 1 12011 591 18 7503 0.3 1 7150 353 19 9755 0.3 1 9294 461 20 6860 0.3 1 6555 305 21 7240 0.3 1 6940 300 22 6341 0.3 1 5929 412 23 5711 0.3 1 5322 389 24 5786 0.3 1 5444 342 25 4898 0.3 1 4616 282 26 4588 0.3 1 4327 261 27 4606 0.3 1 4347 259 28 4236 0.3 1 4009 227 29 3802 0.3 1 3569 233 30 4204 0.3 1 3967 237 31 2815 0.3 1 2670 145 32 3081 0.3 1 2910 171 33 2700 0.3 1 2491 209 34 2767 0.3 1 2581 186 35 2436 0.3 1 2289 147 36 2329 0.3 1 2163 166 37 2086 0.3 1 1944 142 38 1940 0.3 1 1774 166 39 1643 0.3 1 1506 137 40 1564 0.3 1 1454 110 41 1277 0.3 1 1167 110 42 1323 0.3 1 1219 104 43 962 0.3 1 893 69 44 1071 0.3 1 966 105 45 1069 0.3 1 955 114 46 795 0.3 1 696 99 47 787 0.3 1 694 93 48 988 0.3 1 893 95 49 825 0.3 1 739 86 50 728 0.3 1 647 81 51 924 0.3 1 822 102 52 568 0.3 1 471 97 53 627 0.3 1 563 64 54 490 0.3 1 421 69 55 429 0.3 1 390 39 56 394 0.3 1 344 50 57 357 0.3 1 306 51 58 313 0.3 1 265 48 59 317 0.3 1 254 63 60 298 0.3 1 250 48 61 311 0.3 1 246 65 62 270 0.3 1 204 66 63 252 0.3 1 197 55 64 269 0.3 1 200 69 65 296 0.3 1 202 94 66 387 0.3 1 135 252 67 10607 0.3 1 192 10415 68 14194 0.3 1 703 13491 69 16021 0.3 1 713 15308 70 14402 0.3 1 695 13707 71 8379 0.3 1 602 7777 72 5973 0.3 1 434 5539 73 4243 0.3 1 324 3919 74 2390 0.3 1 272 2118 75 1632 0.3 1 222 1410 76 1106 0.3 1 147 959 77 963 0.3 1 125 838 78 795 0.3 1 103 692 79 693 0.3 1 137 556 80 602 
0.3 1 103 499 81 468 0.3 1 85 383 82 370 0.3 1 74 296 83 505 0.3 1 101 404 84 363 0.3 1 91 272 85 342 0.3 1 75 267 86 317 0.3 1 89 228 87 354 0.3 1 95 259 88 361 0.3 1 73 288 89 331 0.3 1 45 286 90 356 0.3 1 59 297 91 428 0.3 1 71 357 92 394 0.3 1 38 356 93 453 0.3 1 15 438 94 630 0.3 1 26 604 95 931 0.3 1 32 899 96 1428 0.3 1 33 1395 97 2262 0.3 1 73 2189 98 3409 0.3 1 26 3383 99 9367 0.3 1 23 9344 100 76978 0.3 1 77 76901 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz ============================================= 22109558 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 22109558 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 227225 (1.03%) >>> Now running FastQC on the validated data 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 5% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 10% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 15% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 20% complete for 
4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 25% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 30% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 35% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 40% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 45% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 50% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 55% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 60% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 65% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 70% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 75% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 80% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 85% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 90% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 95% complete for 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_val_1.fq.gz >>> Now running FastQC on the validated data 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz<<< Started analysis of 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 5% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 10% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 15% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 20% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 25% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 30% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 35% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 40% complete for 
4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 45% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 50% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 55% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 60% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 65% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 70% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 75% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 80% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 85% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 90% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 95% complete for 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_val_2.fq.gz Deleting both intermediate output files 4R041-L7-P09-GTGAAA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P09-GTGAAA-READ2-Sequences.txt.gz_trimmed.fq.gz ==================================================================================================== Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases 
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed 30000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 654.23 s (20 us/read; 3.04 M reads/minute). === Summary === Total reads processed: 33,173,908 Reads with adapters: 11,627,476 (35.1%) Reads written (passing filters): 33,173,908 (100.0%) Total basepairs processed: 3,317,390,800 bp Quality-trimmed: 2,711,537 bp (0.1%) Total written (filtered): 3,292,802,685 bp (99.3%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 11627476 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.5% C: 28.6% G: 16.5% T: 20.2% none/other: 0.1% Overview of removed sequences length count expect max.err error counts 1 8663367 8293477.0 0 8663367 2 1946670 2073369.2 0 1946670 3 549811 518342.3 0 549811 4 143079 129585.6 0 143079 5 47056 32396.4 0 47056 6 21601 8099.1 0 21601 7 17442 2024.8 0 17442 8 15520 506.2 0 15520 9 15066 126.5 0 14259 807 10 15534 31.6 1 13424 2110 11 12279 7.9 1 11627 652 12 11505 2.0 1 11059 446 13 10851 0.5 1 10401 450 14 9671 0.5 1 9331 340 15 9509 0.5 1 9151 358 16 8581 0.5 1 8274 307 17 8162 0.5 1 7813 349 18 7520 0.5 1 7253 267 19 6895 0.5 1 6624 271 20 6211 0.5 1 5971 240 21 5570 0.5 1 5357 213 22 5020 0.5 1 4775 245 23 4543 0.5 1 4364 179 24 4138 0.5 1 3962 176 25 3922 0.5 1 3753 169 26 3416 0.5 1 3278 138 27 3333 0.5 1 3185 148 28 3095 0.5 1 2938 157 29 2879 0.5 1 2765 114 30 2785 0.5 1 2663 122 31 2481 0.5 1 2377 104 32 2376 0.5 1 2261 115 33 1978 0.5 1 1897 81 34 2011 0.5 1 1901 110 35 1823 0.5 1 1734 89 36 1664 0.5 1 1559 105 37 1504 0.5 1 1396 108 38 1406 0.5 1 1315 91 39 1228 0.5 1 1156 72 40 1148 0.5 1 1074 74 41 1216 0.5 1 1145 71 42 639 0.5 1 575 64 43 725 0.5 1 669 56 44 706 0.5 1 652 54 45 655 0.5 1 598 57 46 715 0.5 1 649 66 47 621 0.5 1 584 37 48 569 0.5 1 512 57 49 599 0.5 1 526 73 50 594 0.5 1 502 92 51 502 0.5 1 463 39 52 466 0.5 1 427 39 53 491 0.5 1 437 54 54 330 0.5 1 297 33 55 305 0.5 1 271 34 56 288 0.5 1 245 43 57 257 0.5 1 223 34 58 271 0.5 1 228 43 59 260 0.5 1 218 42 60 277 0.5 1 230 47 61 330 0.5 1 283 47 62 231 0.5 1 183 48 63 195 0.5 1 159 36 64 288 0.5 1 244 44 65 247 0.5 1 222 25 66 234 0.5 1 198 36 67 298 0.5 1 267 31 68 275 0.5 1 244 31 69 291 0.5 1 234 57 70 346 0.5 1 296 50 71 318 0.5 1 269 49 72 456 0.5 1 334 122 73 2025 0.5 1 470 1555 74 4325 0.5 1 1713 2612 75 4763 0.5 1 2382 2381 76 3311 0.5 1 1832 1479 77 2344 0.5 1 1376 968 78 1724 0.5 1 1016 708 79 1045 0.5 1 628 417 80 763 0.5 1 453 310 81 656 0.5 1 398 258 
82 441 0.5 1 287 154 83 449 0.5 1 300 149 84 422 0.5 1 318 104 85 329 0.5 1 257 72 86 319 0.5 1 242 77 87 293 0.5 1 241 52 88 308 0.5 1 256 52 89 230 0.5 1 172 58 90 218 0.5 1 165 53 91 299 0.5 1 241 58 92 164 0.5 1 135 29 93 157 0.5 1 117 40 94 171 0.5 1 115 56 95 234 0.5 1 154 80 96 206 0.5 1 128 78 97 392 0.5 1 261 131 98 350 0.5 1 188 162 99 553 0.5 1 70 483 100 4340 0.5 1 80 4260 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz ============================================= 33173908 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed 30000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 701.15 s (21 us/read; 2.84 M reads/minute). === Summary === Total reads processed: 33,173,908 Reads with adapters: 12,800,822 (38.6%) Reads written (passing filters): 33,173,908 (100.0%) Total basepairs processed: 3,317,390,800 bp Quality-trimmed: 15,758,799 bp (0.5%) Total written (filtered): 3,276,598,441 bp (98.8%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 12800822 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 37.7% C: 24.5% G: 19.3% T: 18.3% none/other: 0.1% Overview of removed sequences length count expect max.err error counts 1 9094554 8293477.0 0 9094554 2 2359553 2073369.2 0 2359553 3 794425 518342.3 0 794425 4 192832 129585.6 0 192832 5 55612 32396.4 0 55612 6 23497 8099.1 0 23497 7 18334 2024.8 0 18334 8 15826 506.2 0 15826 9 15871 126.5 0 14876 995 10 15840 31.6 1 13687 2153 11 13491 7.9 1 11983 1508 12 12505 2.0 1 11630 875 13 10922 0.5 1 10382 540 14 11928 0.5 1 11187 741 15 8506 0.5 1 7988 518 16 8895 0.5 1 8410 485 17 10375 0.5 1 9756 619 18 5965 0.5 1 5613 352 19 8182 0.5 1 7687 495 20 5666 0.5 1 5328 338 21 5744 0.5 1 5417 327 22 5308 0.5 1 4914 394 23 5022 0.5 1 4603 419 24 4944 0.5 1 4549 395 25 4126 0.5 1 3787 339 26 3896 0.5 1 3546 350 27 3770 0.5 1 3421 349 28 3572 0.5 1 3240 332 29 3251 0.5 1 2952 299 30 3787 0.5 1 3376 411 31 2328 0.5 1 2118 210 32 2852 0.5 1 2532 320 33 2280 0.5 1 2031 249 34 2364 0.5 1 2100 264 35 2157 0.5 1 1886 271 36 2115 0.5 1 1816 299 37 1789 0.5 1 1561 228 38 1791 0.5 1 1516 275 39 1491 0.5 1 1322 169 40 1401 0.5 1 1209 192 41 1290 0.5 1 1060 230 42 1562 0.5 1 1265 297 43 859 0.5 1 708 151 44 1064 0.5 1 868 196 45 1342 0.5 1 1069 273 46 829 0.5 1 655 174 47 806 0.5 1 654 152 48 917 0.5 1 731 186 49 843 0.5 1 662 181 50 819 0.5 1 651 168 51 1083 0.5 1 853 230 52 628 0.5 1 478 150 53 755 0.5 1 560 195 54 563 0.5 1 426 137 55 495 0.5 1 398 97 56 492 0.5 1 391 101 57 474 0.5 1 354 120 58 482 0.5 1 358 124 59 491 0.5 1 357 134 60 469 0.5 1 350 119 61 581 0.5 1 446 135 62 541 0.5 1 347 194 63 436 0.5 1 296 140 64 472 0.5 1 351 121 65 434 0.5 1 304 130 66 481 0.5 1 337 144 67 1270 0.5 1 383 887 68 3088 0.5 1 1312 1776 69 3479 0.5 1 1810 1669 70 3036 0.5 1 1588 1448 71 2234 0.5 1 1218 1016 72 1627 0.5 1 931 696 73 923 0.5 1 529 394 74 613 0.5 1 336 277 75 523 0.5 1 276 247 76 385 0.5 1 224 161 77 375 0.5 1 234 141 78 316 0.5 1 194 122 79 354 0.5 1 232 122 
80 295 0.5 1 195 100 81 307 0.5 1 191 116 82 292 0.5 1 186 106 83 331 0.5 1 185 146 84 320 0.5 1 195 125 85 350 0.5 1 188 162 86 326 0.5 1 199 127 87 288 0.5 1 158 130 88 383 0.5 1 247 136 89 290 0.5 1 122 168 90 257 0.5 1 109 148 91 363 0.5 1 168 195 92 267 0.5 1 124 143 93 263 0.5 1 84 179 94 274 0.5 1 78 196 95 424 0.5 1 110 314 96 426 0.5 1 93 333 97 805 0.5 1 227 578 98 799 0.5 1 165 634 99 1478 0.5 1 34 1444 100 8086 0.5 1 53 8033 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz ============================================= 33173908 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 33173908 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 116198 (0.35%) >>> Now running FastQC on the validated data 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 5% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 10% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 15% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 20% complete for 
4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 25% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 30% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 35% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 40% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 45% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 50% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 55% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 60% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 65% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 70% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 75% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 80% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 85% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 90% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz Approx 95% complete for 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_val_1.fq.gz >>> Now running FastQC on the validated data 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz<<< Started analysis of 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 5% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 10% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 15% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 20% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 25% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 30% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 35% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 40% complete for 
4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 45% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 50% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 55% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 60% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 65% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 70% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 75% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 80% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 85% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 90% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Approx 95% complete for 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_val_2.fq.gz Deleting both intermediate output files 4R041-L7-P10-TAGCTT-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P10-TAGCTT-READ2-Sequences.txt.gz_trimmed.fq.gz ==================================================================================================== Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases 
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 434.79 s (20 us/read; 3.05 M reads/minute). === Summary === Total reads processed: 22,073,550 Reads with adapters: 7,747,847 (35.1%) Reads written (passing filters): 22,073,550 (100.0%) Total basepairs processed: 2,207,355,000 bp Quality-trimmed: 3,920,662 bp (0.2%) Total written (filtered): 2,184,786,544 bp (99.0%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 7747847 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.3% C: 28.7% G: 16.4% T: 19.8% none/other: 0.8% Overview of removed sequences length count expect max.err error counts 1 5746935 5518387.5 0 5746935 2 1286626 1379596.9 0 1286626 3 363956 344899.2 0 363956 4 92834 86224.8 0 92834 5 29866 21556.2 0 29866 6 13220 5389.1 0 13220 7 10584 1347.3 0 10584 8 9189 336.8 0 9189 9 8925 84.2 0 8619 306 10 10456 21.1 1 8152 2304 11 7407 5.3 1 6977 430 12 7077 1.3 1 6748 329 13 6533 0.3 1 6239 294 14 5931 0.3 1 5674 257 15 5824 0.3 1 5588 236 16 5318 0.3 1 5081 237 17 5056 0.3 1 4821 235 18 4675 0.3 1 4462 213 19 4210 0.3 1 4015 195 20 3955 0.3 1 3753 202 21 3693 0.3 1 3498 195 22 3252 0.3 1 3083 169 23 2866 0.3 1 2740 126 24 2744 0.3 1 2612 132 25 2471 0.3 1 2346 125 26 2315 0.3 1 2186 129 27 2210 0.3 1 2099 111 28 2111 0.3 1 2005 106 29 1970 0.3 1 1856 114 30 1842 0.3 1 1756 86 31 1707 0.3 1 1625 82 32 1584 0.3 1 1496 88 33 1421 0.3 1 1338 83 34 1447 0.3 1 1341 106 35 1226 0.3 1 1166 60 36 1274 0.3 1 1185 89 37 1149 0.3 1 1085 64 38 1011 0.3 1 938 73 39 977 0.3 1 895 82 40 835 0.3 1 778 57 41 1075 0.3 1 1011 64 42 447 0.3 1 414 33 43 575 0.3 1 538 37 44 573 0.3 1 530 43 45 554 0.3 1 506 48 46 589 0.3 1 545 44 47 545 0.3 1 499 46 48 533 0.3 1 463 70 49 499 0.3 1 456 43 50 417 0.3 1 381 36 51 471 0.3 1 423 48 52 410 0.3 1 372 38 53 420 0.3 1 377 43 54 319 0.3 1 281 38 55 265 0.3 1 235 30 56 249 0.3 1 225 24 57 243 0.3 1 211 32 58 208 0.3 1 188 20 59 178 0.3 1 167 11 60 186 0.3 1 161 25 61 220 0.3 1 195 25 62 172 0.3 1 146 26 63 174 0.3 1 145 29 64 237 0.3 1 214 23 65 161 0.3 1 146 15 66 201 0.3 1 176 25 67 204 0.3 1 178 26 68 206 0.3 1 163 43 69 241 0.3 1 189 52 70 293 0.3 1 204 89 71 301 0.3 1 165 136 72 548 0.3 1 194 354 73 6284 0.3 1 272 6012 74 10360 0.3 1 982 9378 75 9987 0.3 1 1283 8704 76 6503 0.3 1 1011 5492 77 4613 0.3 1 786 3827 78 3146 0.3 1 548 2598 79 1972 0.3 1 386 1586 80 1499 0.3 1 276 1223 81 976 0.3 1 215 761 82 678 0.3 1 164 514 
83 581 0.3 1 173 408 84 478 0.3 1 147 331 85 380 0.3 1 164 216 86 326 0.3 1 133 193 87 333 0.3 1 165 168 88 248 0.3 1 113 135 89 206 0.3 1 83 123 90 200 0.3 1 96 104 91 231 0.3 1 126 105 92 172 0.3 1 72 100 93 172 0.3 1 54 118 94 222 0.3 1 65 157 95 256 0.3 1 80 176 96 287 0.3 1 70 217 97 516 0.3 1 151 365 98 576 0.3 1 77 499 99 1859 0.3 1 28 1831 100 16390 0.3 1 71 16319 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz ============================================= 22073550 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 458.47 s (21 us/read; 2.89 M reads/minute). === Summary === Total reads processed: 22,073,550 Reads with adapters: 8,546,750 (38.7%) Reads written (passing filters): 22,073,550 (100.0%) Total basepairs processed: 2,207,355,000 bp Quality-trimmed: 12,328,881 bp (0.6%) Total written (filtered): 2,174,060,281 bp (98.5%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 8546750 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 38.1% C: 23.9% G: 19.3% T: 18.0% none/other: 0.7% Overview of removed sequences length count expect max.err error counts 1 6062251 5518387.5 0 6062251 2 1555315 1379596.9 0 1555315 3 521554 344899.2 0 521554 4 128828 86224.8 0 128828 5 34979 21556.2 0 34979 6 14528 5389.1 0 14528 7 11013 1347.3 0 11013 8 9367 336.8 0 9367 9 9717 84.2 0 9037 680 10 9870 21.1 1 8339 1531 11 8114 5.3 1 7199 915 12 7712 1.3 1 7098 614 13 6545 0.3 1 6156 389 14 7336 0.3 1 6840 496 15 5222 0.3 1 4859 363 16 5471 0.3 1 5162 309 17 6452 0.3 1 6002 450 18 3743 0.3 1 3501 242 19 5064 0.3 1 4690 374 20 3573 0.3 1 3344 229 21 3797 0.3 1 3557 240 22 3463 0.3 1 3151 312 23 3239 0.3 1 2914 325 24 3191 0.3 1 2895 296 25 2607 0.3 1 2368 239 26 2618 0.3 1 2366 252 27 2480 0.3 1 2248 232 28 2392 0.3 1 2189 203 29 2165 0.3 1 1930 235 30 2545 0.3 1 2267 278 31 1611 0.3 1 1449 162 32 1816 0.3 1 1607 209 33 1603 0.3 1 1418 185 34 1651 0.3 1 1451 200 35 1567 0.3 1 1359 208 36 1604 0.3 1 1370 234 37 1365 0.3 1 1215 150 38 1230 0.3 1 1055 175 39 1103 0.3 1 942 161 40 976 0.3 1 841 135 41 950 0.3 1 794 156 42 1170 0.3 1 983 187 43 680 0.3 1 568 112 44 873 0.3 1 700 173 45 1018 0.3 1 822 196 46 660 0.3 1 558 102 47 669 0.3 1 541 128 48 741 0.3 1 605 136 49 662 0.3 1 542 120 50 568 0.3 1 468 100 51 820 0.3 1 659 161 52 494 0.3 1 392 102 53 557 0.3 1 457 100 54 449 0.3 1 370 79 55 410 0.3 1 325 85 56 392 0.3 1 292 100 57 357 0.3 1 288 69 58 353 0.3 1 254 99 59 355 0.3 1 252 103 60 318 0.3 1 253 65 61 406 0.3 1 302 104 62 331 0.3 1 234 97 63 317 0.3 1 228 89 64 353 0.3 1 275 78 65 339 0.3 1 241 98 66 428 0.3 1 264 164 67 3520 0.3 1 296 3224 68 6877 0.3 1 859 6018 69 6037 0.3 1 1138 4899 70 4921 0.3 1 942 3979 71 3163 0.3 1 676 2487 72 1949 0.3 1 482 1467 73 1172 0.3 1 319 853 74 695 0.3 1 189 506 75 569 0.3 1 191 378 76 332 0.3 1 119 213 77 312 0.3 1 126 186 78 298 0.3 1 111 187 79 283 0.3 1 131 152 80 229 0.3 1 99 130 81 219 0.3 1 96 
123 82 190 0.3 1 99 91 83 229 0.3 1 104 125 84 229 0.3 1 114 115 85 252 0.3 1 139 113 86 242 0.3 1 116 126 87 232 0.3 1 107 125 88 245 0.3 1 118 127 89 195 0.3 1 50 145 90 214 0.3 1 49 165 91 308 0.3 1 96 212 92 214 0.3 1 60 154 93 280 0.3 1 42 238 94 410 0.3 1 38 372 95 590 0.3 1 41 549 96 901 0.3 1 57 844 97 1400 0.3 1 101 1299 98 1951 0.3 1 77 1874 99 4993 0.3 1 24 4969 100 28752 0.3 1 78 28674 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz ============================================= 22073550 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 22073550 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 134382 (0.61%) >>> Now running FastQC on the validated data 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 5% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 10% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 15% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 20% complete for 
4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 25% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 30% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 35% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 40% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 45% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 50% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 55% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 60% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 65% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 70% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 75% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 80% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 85% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 90% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 95% complete for 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_val_1.fq.gz >>> Now running FastQC on the validated data 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz<<< Started analysis of 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 5% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 10% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 15% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 20% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 25% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 30% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 35% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 40% complete for 
4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 45% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 50% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 55% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 60% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 65% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 70% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 75% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 80% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 85% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 90% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 95% complete for 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_val_2.fq.gz Deleting both intermediate output files 4R041-L7-P11-ACTTGA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P11-ACTTGA-READ2-Sequences.txt.gz_trimmed.fq.gz ==================================================================================================== Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases 
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 522.50 s (20 us/read; 3.06 M reads/minute). === Summary === Total reads processed: 26,686,495 Reads with adapters: 9,470,009 (35.5%) Reads written (passing filters): 26,686,495 (100.0%) Total basepairs processed: 2,668,649,500 bp Quality-trimmed: 6,856,366 bp (0.3%) Total written (filtered): 2,633,995,216 bp (98.7%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 9470009 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.6% C: 28.1% G: 16.3% T: 19.5% none/other: 1.4% Overview of removed sequences length count expect max.err error counts 1 6983865 6671623.8 0 6983865 2 1555501 1667905.9 0 1555501 3 438779 416976.5 0 438779 4 111942 104244.1 0 111942 5 36171 26061.0 0 36171 6 16604 6515.3 0 16604 7 13077 1628.8 0 13077 8 11630 407.2 0 11630 9 11180 101.8 0 10956 224 10 12166 25.5 1 9856 2310 11 9658 6.4 1 9013 645 12 8771 1.6 1 8348 423 13 8370 0.4 1 7958 412 14 7482 0.4 1 7146 336 15 7346 0.4 1 7013 333 16 6690 0.4 1 6383 307 17 6450 0.4 1 6153 297 18 6111 0.4 1 5829 282 19 5450 0.4 1 5194 256 20 4892 0.4 1 4674 218 21 4448 0.4 1 4236 212 22 4255 0.4 1 4039 216 23 3565 0.4 1 3411 154 24 3280 0.4 1 3111 169 25 3313 0.4 1 3146 167 26 2971 0.4 1 2825 146 27 2788 0.4 1 2654 134 28 2517 0.4 1 2369 148 29 2423 0.4 1 2274 149 30 2414 0.4 1 2280 134 31 2205 0.4 1 2090 115 32 2060 0.4 1 1933 127 33 1902 0.4 1 1797 105 34 1859 0.4 1 1737 122 35 1736 0.4 1 1634 102 36 1637 0.4 1 1515 122 37 1354 0.4 1 1273 81 38 1273 0.4 1 1178 95 39 1301 0.4 1 1155 146 40 1097 0.4 1 1012 85 41 1331 0.4 1 1238 93 42 568 0.4 1 522 46 43 684 0.4 1 642 42 44 680 0.4 1 625 55 45 691 0.4 1 629 62 46 664 0.4 1 586 78 47 642 0.4 1 575 67 48 716 0.4 1 648 68 49 661 0.4 1 594 67 50 683 0.4 1 611 72 51 598 0.4 1 543 55 52 559 0.4 1 493 66 53 566 0.4 1 501 65 54 357 0.4 1 295 62 55 374 0.4 1 327 47 56 302 0.4 1 272 30 57 286 0.4 1 240 46 58 303 0.4 1 261 42 59 244 0.4 1 206 38 60 326 0.4 1 217 109 61 318 0.4 1 230 88 62 267 0.4 1 200 67 63 240 0.4 1 177 63 64 275 0.4 1 214 61 65 209 0.4 1 169 40 66 232 0.4 1 189 43 67 260 0.4 1 206 54 68 215 0.4 1 170 45 69 310 0.4 1 206 104 70 398 0.4 1 241 157 71 438 0.4 1 201 237 72 964 0.4 1 208 756 73 13649 0.4 1 319 13330 74 21479 0.4 1 1145 20334 75 20199 0.4 1 1469 18730 76 13212 0.4 1 1198 12014 77 9485 0.4 1 915 8570 78 6338 0.4 1 642 5696 79 4184 0.4 1 501 3683 80 3106 0.4 1 300 2806 81 2003 0.4 
1 233 1770 82 1357 0.4 1 161 1196 83 1188 0.4 1 189 999 84 966 0.4 1 198 768 85 661 0.4 1 170 491 86 576 0.4 1 138 438 87 544 0.4 1 187 357 88 459 0.4 1 165 294 89 337 0.4 1 79 258 90 325 0.4 1 84 241 91 371 0.4 1 118 253 92 292 0.4 1 62 230 93 328 0.4 1 61 267 94 395 0.4 1 59 336 95 408 0.4 1 64 344 96 643 0.4 1 94 549 97 947 0.4 1 167 780 98 1177 0.4 1 83 1094 99 4067 0.4 1 67 4000 100 35419 0.4 1 157 35262 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz ============================================= 26686495 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 555.02 s (21 us/read; 2.88 M reads/minute). === Summary === Total reads processed: 26,686,495 Reads with adapters: 10,347,910 (38.8%) Reads written (passing filters): 26,686,495 (100.0%) Total basepairs processed: 2,668,649,500 bp Quality-trimmed: 15,631,499 bp (0.6%) Total written (filtered): 2,622,269,897 bp (98.3%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 10347910 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 37.9% C: 23.7% G: 19.1% T: 18.0% none/other: 1.3% Overview of removed sequences length count expect max.err error counts 1 7315866 6671623.8 0 7315866 2 1852306 1667905.9 0 1852306 3 621196 416976.5 0 621196 4 152625 104244.1 0 152625 5 42728 26061.0 0 42728 6 17792 6515.3 0 17792 7 13674 1628.8 0 13674 8 11758 407.2 0 11758 9 12085 101.8 0 11412 673 10 11662 25.5 1 10008 1654 11 10451 6.4 1 9279 1172 12 9558 1.6 1 8764 794 13 8308 0.4 1 7840 468 14 9209 0.4 1 8560 649 15 6385 0.4 1 5983 402 16 6923 0.4 1 6482 441 17 8261 0.4 1 7697 564 18 4826 0.4 1 4530 296 19 6509 0.4 1 6008 501 20 4394 0.4 1 4115 279 21 4613 0.4 1 4295 318 22 4563 0.4 1 4147 416 23 4087 0.4 1 3661 426 24 3882 0.4 1 3498 384 25 3425 0.4 1 3116 309 26 3382 0.4 1 3042 340 27 3140 0.4 1 2848 292 28 2887 0.4 1 2596 291 29 2699 0.4 1 2385 314 30 3267 0.4 1 2900 367 31 2106 0.4 1 1886 220 32 2411 0.4 1 2143 268 33 2175 0.4 1 1907 268 34 2218 0.4 1 1947 271 35 2138 0.4 1 1856 282 36 2025 0.4 1 1744 281 37 1648 0.4 1 1446 202 38 1586 0.4 1 1317 269 39 1463 0.4 1 1281 182 40 1262 0.4 1 1106 156 41 1201 0.4 1 1026 175 42 1376 0.4 1 1179 197 43 809 0.4 1 697 112 44 1030 0.4 1 851 179 45 1236 0.4 1 987 249 46 737 0.4 1 605 132 47 769 0.4 1 629 140 48 1002 0.4 1 822 180 49 815 0.4 1 658 157 50 824 0.4 1 704 120 51 993 0.4 1 806 187 52 675 0.4 1 542 133 53 727 0.4 1 609 118 54 547 0.4 1 431 116 55 560 0.4 1 451 109 56 481 0.4 1 369 112 57 439 0.4 1 328 111 58 499 0.4 1 382 117 59 508 0.4 1 368 140 60 424 0.4 1 325 99 61 436 0.4 1 341 95 62 451 0.4 1 308 143 63 402 0.4 1 290 112 64 431 0.4 1 308 123 65 405 0.4 1 275 130 66 596 0.4 1 317 279 67 7367 0.4 1 335 7032 68 14539 0.4 1 956 13583 69 11969 0.4 1 1332 10637 70 9636 0.4 1 1108 8528 71 6161 0.4 1 776 5385 72 3468 0.4 1 547 2921 73 1815 0.4 1 368 1447 74 1215 0.4 1 220 995 75 820 0.4 1 198 622 76 533 0.4 1 146 387 77 501 0.4 1 166 335 78 437 0.4 1 88 349 79 461 0.4 1 168 293 80 
386 0.4 1 124 262 81 311 0.4 1 103 208 82 276 0.4 1 104 172 83 372 0.4 1 126 246 84 335 0.4 1 130 205 85 323 0.4 1 122 201 86 316 0.4 1 117 199 87 343 0.4 1 133 210 88 364 0.4 1 150 214 89 306 0.4 1 73 233 90 332 0.4 1 65 267 91 478 0.4 1 112 366 92 405 0.4 1 54 351 93 490 0.4 1 48 442 94 785 0.4 1 51 734 95 1120 0.4 1 64 1056 96 1800 0.4 1 66 1734 97 2919 0.4 1 119 2800 98 4028 0.4 1 75 3953 99 11201 0.4 1 57 11144 100 62232 0.4 1 292 61940 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz ============================================= 26686495 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 26686495 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 221849 (0.83%) >>> Now running FastQC on the validated data 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 5% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 10% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 15% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 20% complete for 
4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 25% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 30% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 35% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 40% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 45% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 50% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 55% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 60% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 65% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 70% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 75% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 80% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 85% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 90% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz Approx 95% complete for 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_val_1.fq.gz >>> Now running FastQC on the validated data 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz<<< Started analysis of 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 5% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 10% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 15% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 20% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 25% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 30% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 35% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 40% complete for 
4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 45% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 50% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 55% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 60% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 65% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 70% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 75% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 80% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 85% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 90% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Approx 95% complete for 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_val_2.fq.gz Deleting both intermediate output files 4R041-L7-P12-TGACCA-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P12-TGACCA-READ2-Sequences.txt.gz_trimmed.fq.gz ==================================================================================================== Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases 
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz <<< 10000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 360.61 s (20 us/read; 2.95 M reads/minute). === Summary === Total reads processed: 17,735,907 Reads with adapters: 6,770,690 (38.2%) Reads written (passing filters): 17,735,907 (100.0%) Total basepairs processed: 1,773,590,700 bp Quality-trimmed: 7,228,061 bp (0.4%) Total written (filtered): 1,718,415,065 bp (96.9%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 6770690 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.2% C: 26.2% G: 15.3% T: 18.4% none/other: 5.9% Overview of removed sequences length count expect max.err error counts 1 4836894 4433976.8 0 4836894 2 972166 1108494.2 0 972166 3 296994 277123.5 0 296994 4 76625 69280.9 0 76625 5 25029 17320.2 0 25029 6 13387 4330.1 0 13387 7 10968 1082.5 0 10968 8 9921 270.6 0 9921 9 9801 67.7 0 9621 180 10 10238 16.9 1 8124 2114 11 8593 4.2 1 8042 551 12 7477 1.1 1 7137 340 13 6881 0.3 1 6570 311 14 6757 0.3 1 6415 342 15 6097 0.3 1 5784 313 16 5942 0.3 1 5592 350 17 5393 0.3 1 5139 254 18 5333 0.3 1 5020 313 19 4602 0.3 1 4364 238 20 4158 0.3 1 3917 241 21 3771 0.3 1 3550 221 22 3115 0.3 1 2980 135 23 2913 0.3 1 2763 150 24 2839 0.3 1 2673 166 25 2645 0.3 1 2493 152 26 2304 0.3 1 2183 121 27 2154 0.3 1 2028 126 28 2011 0.3 1 1918 93 29 1870 0.3 1 1753 117 30 1747 0.3 1 1657 90 31 1606 0.3 1 1515 91 32 1348 0.3 1 1276 72 33 1373 0.3 1 1278 95 34 1231 0.3 1 1167 64 35 1078 0.3 1 1003 75 36 1014 0.3 1 951 63 37 917 0.3 1 860 57 38 871 0.3 1 791 80 39 921 0.3 1 795 126 40 714 0.3 1 648 66 41 1028 0.3 1 959 69 42 213 0.3 1 195 18 43 323 0.3 1 293 30 44 405 0.3 1 370 35 45 346 0.3 1 318 28 46 360 0.3 1 332 28 47 312 0.3 1 285 27 48 351 0.3 1 316 35 49 355 0.3 1 308 47 50 301 0.3 1 273 28 51 323 0.3 1 282 41 52 217 0.3 1 191 26 53 254 0.3 1 228 26 54 155 0.3 1 128 27 55 168 0.3 1 150 18 56 139 0.3 1 120 19 57 120 0.3 1 102 18 58 151 0.3 1 105 46 59 89 0.3 1 73 16 60 113 0.3 1 79 34 61 132 0.3 1 111 21 62 112 0.3 1 82 30 63 135 0.3 1 74 61 64 96 0.3 1 63 33 65 92 0.3 1 58 34 66 110 0.3 1 80 30 67 108 0.3 1 70 38 68 110 0.3 1 48 62 69 308 0.3 1 183 125 70 284 0.3 1 67 217 71 404 0.3 1 79 325 72 1001 0.3 1 48 953 73 14137 0.3 1 123 14014 74 18277 0.3 1 204 18073 75 15920 0.3 1 459 15461 76 12229 0.3 1 176 12053 77 10273 0.3 1 191 10082 78 9393 0.3 1 167 9226 79 7658 0.3 1 205 7453 80 7446 0.3 1 162 7284 81 6415 0.3 1 128 6287 82 5508 0.3 1 138 5370 83 5347 
0.3 1 143 5204 84 5052 0.3 1 121 4931 85 4394 0.3 1 117 4277 86 4125 0.3 1 101 4024 87 4001 0.3 1 150 3851 88 3733 0.3 1 98 3635 89 3627 0.3 1 83 3544 90 3531 0.3 1 68 3463 91 3503 0.3 1 84 3419 92 3333 0.3 1 59 3274 93 3864 0.3 1 42 3822 94 4529 0.3 1 47 4482 95 4736 0.3 1 57 4679 96 6120 0.3 1 53 6067 97 8095 0.3 1 96 7999 98 10574 0.3 1 56 10518 99 27067 0.3 1 48 27019 100 189490 0.3 1 106 189384 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz ============================================= 17735907 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz <<< 10000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 362.52 s (20 us/read; 2.94 M reads/minute). === Summary === Total reads processed: 17,735,907 Reads with adapters: 6,970,945 (39.3%) Reads written (passing filters): 17,735,907 (100.0%) Total basepairs processed: 1,773,590,700 bp Quality-trimmed: 15,221,319 bp (0.9%) Total written (filtered): 1,712,687,846 bp (96.6%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 6970945 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 37.3% C: 20.7% G: 18.4% T: 17.9% none/other: 5.8% Overview of removed sequences length count expect max.err error counts 1 4746186 4433976.8 0 4746186 2 1142863 1108494.2 0 1142863 3 392446 277123.5 0 392446 4 95516 69280.9 0 95516 5 28992 17320.2 0 28992 6 14655 4330.1 0 14655 7 11335 1082.5 0 11335 8 9910 270.6 0 9910 9 9936 67.7 0 9570 366 10 9287 16.9 1 8333 954 11 8520 4.2 1 7877 643 12 7715 1.1 1 7268 447 13 6860 0.3 1 6506 354 14 7704 0.3 1 7287 417 15 5367 0.3 1 5110 257 16 5726 0.3 1 5405 321 17 6678 0.3 1 6287 391 18 3964 0.3 1 3757 207 19 5257 0.3 1 4972 285 20 3599 0.3 1 3377 222 21 3720 0.3 1 3506 214 22 3214 0.3 1 2997 217 23 2953 0.3 1 2782 171 24 2870 0.3 1 2703 167 25 2582 0.3 1 2407 175 26 2358 0.3 1 2207 151 27 2210 0.3 1 2051 159 28 2124 0.3 1 2001 123 29 1829 0.3 1 1711 118 30 2052 0.3 1 1919 133 31 1343 0.3 1 1246 97 32 1461 0.3 1 1342 119 33 1297 0.3 1 1173 124 34 1346 0.3 1 1250 96 35 1046 0.3 1 961 85 36 1052 0.3 1 983 69 37 958 0.3 1 888 70 38 877 0.3 1 790 87 39 867 0.3 1 800 67 40 708 0.3 1 647 61 41 620 0.3 1 538 82 42 591 0.3 1 529 62 43 427 0.3 1 392 35 44 428 0.3 1 394 34 45 474 0.3 1 411 63 46 347 0.3 1 304 43 47 335 0.3 1 287 48 48 346 0.3 1 309 37 49 363 0.3 1 314 49 50 348 0.3 1 297 51 51 399 0.3 1 347 52 52 239 0.3 1 183 56 53 252 0.3 1 220 32 54 198 0.3 1 154 44 55 195 0.3 1 174 21 56 181 0.3 1 150 31 57 149 0.3 1 123 26 58 155 0.3 1 126 29 59 129 0.3 1 90 39 60 147 0.3 1 108 39 61 162 0.3 1 128 34 62 128 0.3 1 89 39 63 141 0.3 1 98 43 64 155 0.3 1 91 64 65 215 0.3 1 83 132 66 603 0.3 1 74 529 67 26660 0.3 1 92 26568 68 43423 0.3 1 334 43089 69 41446 0.3 1 537 40909 70 33636 0.3 1 381 33255 71 19663 0.3 1 288 19375 72 11661 0.3 1 211 11450 73 6845 0.3 1 138 6707 74 3976 0.3 1 85 3891 75 2724 0.3 1 96 2628 76 1658 0.3 1 43 1615 77 1696 0.3 1 61 1635 78 1640 0.3 1 31 1609 79 1305 0.3 1 74 1231 80 1195 0.3 1 45 1150 81 976 0.3 1 42 934 82 870 0.3 1 46 
824 83 769 0.3 1 54 715 84 655 0.3 1 36 619 85 601 0.3 1 43 558 86 609 0.3 1 44 565 87 605 0.3 1 49 556 88 639 0.3 1 43 596 89 698 0.3 1 14 684 90 719 0.3 1 25 694 91 837 0.3 1 32 805 92 925 0.3 1 24 901 93 1127 0.3 1 18 1109 94 1634 0.3 1 12 1622 95 2461 0.3 1 18 2443 96 3480 0.3 1 15 3465 97 5469 0.3 1 23 5446 98 8104 0.3 1 22 8082 99 21514 0.3 1 29 21485 100 153615 0.3 1 79 153536 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz ============================================= 17735907 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_trimmed.fq.gz file_1: 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<< Writing validated paired-end read 1 reads to 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Writing validated paired-end read 2 reads to 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Total number of sequences analysed: 17735907 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 450424 (2.54%) >>> Now running FastQC on the validated data 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz<<< Started analysis of 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 5% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 10% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 15% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 20% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 25% 
complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 30% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 35% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 40% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 45% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 50% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 55% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 60% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 65% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 70% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 75% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 80% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 85% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 90% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz Approx 95% complete for 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_val_1.fq.gz >>> Now running FastQC on the validated data 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz<<< Started analysis of 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 5% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 10% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 15% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 20% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 25% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 30% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 35% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 40% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 45% 
complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 50% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 55% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 60% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 65% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 70% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 75% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 80% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 85% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 90% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Approx 95% complete for 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_val_2.fq.gz Deleting both intermediate output files 4R041-L7-P13-ATCACG-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P13-ATCACG-READ2-Sequences.txt.gz_trimmed.fq.gz ==================================================================================================== Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end 
to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24' Output file(s) will be GZIP compressed Writing final adapter and quality trimmed output to 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_trimmed.fq.gz >>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz <<< 10000000 sequences processed 20000000 sequences processed 30000000 sequences processed This is cutadapt 1.16 with Python 3.6.4 Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz Running on 1 core Trimming 1 adapter with at most 10.0% errors in single-end mode ... Finished in 624.32 s (20 us/read; 3.05 M reads/minute). === Summary === Total reads processed: 31,778,548 Reads with adapters: 10,990,256 (34.6%) Reads written (passing filters): 31,778,548 (100.0%) Total basepairs processed: 3,177,854,800 bp Quality-trimmed: 2,902,895 bp (0.1%) Total written (filtered): 3,153,950,683 bp (99.2%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 10990256 times. No. 
of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 34.2% C: 28.9% G: 16.7% T: 20.0% none/other: 0.2% Overview of removed sequences length count expect max.err error counts 1 8239701 7944637.0 0 8239701 2 1838601 1986159.2 0 1838601 3 511440 496539.8 0 511440 4 133720 124135.0 0 133720 5 41126 31033.7 0 41126 6 16026 7758.4 0 16026 7 12186 1939.6 0 12186 8 10453 484.9 0 10453 9 10231 121.2 0 9721 510 10 10797 30.3 1 9228 1569 11 8654 7.6 1 8035 619 12 7980 1.9 1 7663 317 13 7630 0.5 1 7326 304 14 6948 0.5 1 6657 291 15 6658 0.5 1 6384 274 16 6216 0.5 1 5991 225 17 5860 0.5 1 5602 258 18 5786 0.5 1 5570 216 19 5014 0.5 1 4806 208 20 4669 0.5 1 4440 229 21 4176 0.5 1 3998 178 22 3887 0.5 1 3699 188 23 3524 0.5 1 3357 167 24 3302 0.5 1 3166 136 25 3112 0.5 1 2964 148 26 2854 0.5 1 2714 140 27 2729 0.5 1 2598 131 28 2581 0.5 1 2459 122 29 2393 0.5 1 2267 126 30 2404 0.5 1 2292 112 31 2189 0.5 1 2107 82 32 2031 0.5 1 1933 98 33 1751 0.5 1 1664 87 34 1747 0.5 1 1670 77 35 1614 0.5 1 1547 67 36 1483 0.5 1 1405 78 37 1430 0.5 1 1350 80 38 1368 0.5 1 1267 101 39 1152 0.5 1 1096 56 40 1109 0.5 1 1047 62 41 865 0.5 1 832 33 42 878 0.5 1 817 61 43 1070 0.5 1 1015 55 44 529 0.5 1 483 46 45 635 0.5 1 589 46 46 642 0.5 1 573 69 47 679 0.5 1 633 46 48 612 0.5 1 582 30 49 616 0.5 1 579 37 50 595 0.5 1 556 39 51 543 0.5 1 491 52 52 490 0.5 1 450 40 53 483 0.5 1 444 39 54 406 0.5 1 379 27 55 410 0.5 1 383 27 56 265 0.5 1 235 30 57 324 0.5 1 282 42 58 274 0.5 1 234 40 59 274 0.5 1 232 42 60 279 0.5 1 225 54 61 274 0.5 1 245 29 62 255 0.5 1 210 45 63 236 0.5 1 202 34 64 252 0.5 1 225 27 65 241 0.5 1 205 36 66 247 0.5 1 203 44 67 270 0.5 1 233 37 68 261 0.5 1 234 27 69 257 0.5 1 208 49 70 331 0.5 1 281 50 71 300 0.5 1 251 49 72 344 0.5 1 281 63 73 386 0.5 1 289 97 74 527 0.5 1 338 189 75 3132 0.5 1 521 2611 76 5554 0.5 1 1784 3770 77 5991 0.5 1 2454 3537 78 4359 0.5 1 1685 2674 79 2845 0.5 1 1274 1571 80 1946 0.5 1 850 1096 81 1319 0.5 1 603 716 82 929 0.5 1 
417 512 83 641 0.5 1 335 306 84 437 0.5 1 265 172 85 392 0.5 1 240 152 86 426 0.5 1 284 142 87 388 0.5 1 270 118 88 341 0.5 1 249 92 89 221 0.5 1 151 70 90 241 0.5 1 171 70 91 282 0.5 1 214 68 92 180 0.5 1 134 46 93 177 0.5 1 119 58 94 194 0.5 1 122 72 95 233 0.5 1 139 94 96 263 0.5 1 147 116 97 468 0.5 1 300 168 98 392 0.5 1 166 226 99 822 0.5 1 50 772 100 6501 0.5 1 80 6421 RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz ============================================= 31778548 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz Trimming mode: paired-end Trim Galore version: 0.4.4_dev Cutadapt version: 1.16 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. 
M-bias for BS-Seq applications)
All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
This is cutadapt 1.16 with Python 3.6.4
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 668.44 s (21 us/read; 2.85 M reads/minute).

=== Summary ===

Total reads processed: 31,778,548
Reads with adapters: 12,209,152 (38.4%)
Reads written (passing filters): 31,778,548 (100.0%)

Total basepairs processed: 3,177,854,800 bp
Quality-trimmed: 15,060,724 bp (0.5%)
Total written (filtered): 3,138,823,875 bp (98.8%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 12209152 times.

No.
of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 37.8%
  C: 24.8%
  G: 19.3%
  T: 17.9%
  none/other: 0.2%

Overview of removed sequences
length count expect max.err error counts
1 8724358 7944637.0 0 8724358
2 2259557 1986159.2 0 2259557
3 752730 496539.8 0 752730
4 177780 124135.0 0 177780
5 46891 31033.7 0 46891
6 18028 7758.4 0 18028
7 12788 1939.6 0 12788
8 10845 484.9 0 10845
9 11251 121.2 0 10156 1095
10 11414 30.3 1 9512 1902
11 9461 7.6 1 8318 1143
12 8806 1.9 1 8032 774
13 7715 0.5 1 7228 487
14 8638 0.5 1 8061 577
15 5911 0.5 1 5552 359
16 6552 0.5 1 6177 375
17 7620 0.5 1 7079 541
18 4543 0.5 1 4267 276
19 6100 0.5 1 5714 386
20 4265 0.5 1 3974 291
21 4365 0.5 1 4097 268
22 4188 0.5 1 3837 351
23 3943 0.5 1 3564 379
24 3939 0.5 1 3564 375
25 3240 0.5 1 2968 272
26 3225 0.5 1 2921 304
27 3089 0.5 1 2783 306
28 2978 0.5 1 2663 315
29 2698 0.5 1 2427 271
30 3236 0.5 1 2889 347
31 2048 0.5 1 1851 197
32 2417 0.5 1 2130 287
33 1979 0.5 1 1781 198
34 2079 0.5 1 1822 257
35 1879 0.5 1 1650 229
36 1847 0.5 1 1602 245
37 1708 0.5 1 1484 224
38 1594 0.5 1 1324 270
39 1358 0.5 1 1184 174
40 1243 0.5 1 1084 159
41 1200 0.5 1 1006 194
42 1383 0.5 1 1147 236
43 851 0.5 1 736 115
44 1015 0.5 1 847 168
45 1248 0.5 1 1036 212
46 740 0.5 1 596 144
47 797 0.5 1 675 122
48 904 0.5 1 774 130
49 811 0.5 1 653 158
50 825 0.5 1 681 144
51 1045 0.5 1 814 231
52 639 0.5 1 492 147
53 640 0.5 1 514 126
54 551 0.5 1 444 107
55 512 0.5 1 409 103
56 500 0.5 1 386 114
57 491 0.5 1 367 124
58 430 0.5 1 334 96
59 421 0.5 1 314 107
60 441 0.5 1 335 106
61 525 0.5 1 417 108
62 445 0.5 1 304 141
63 440 0.5 1 323 117
64 423 0.5 1 313 110
65 440 0.5 1 313 127
66 395 0.5 1 264 131
67 2087 0.5 1 342 1745
68 3800 0.5 1 1539 2261
69 3538 0.5 1 1222 2316
70 3501 0.5 1 1269 2232
71 2701 0.5 1 1204 1497
72 1899 0.5 1 878 1021
73 1379 0.5 1 671 708
74 1004 0.5 1 532 472
75 708 0.5 1 365 343
76 477 0.5 1 262 215
77 436 0.5 1 223 213
78 323 0.5 1 173 150
79 314 0.5 1 183 131
80
323 0.5 1 184 139
81 307 0.5 1 203 104
82 282 0.5 1 180 102
83 297 0.5 1 172 125
84 244 0.5 1 141 103
85 273 0.5 1 158 115
86 312 0.5 1 188 124
87 365 0.5 1 209 156
88 301 0.5 1 180 121
89 263 0.5 1 111 152
90 288 0.5 1 127 161
91 351 0.5 1 170 181
92 229 0.5 1 81 148
93 228 0.5 1 58 170
94 263 0.5 1 58 205
95 347 0.5 1 80 267
96 430 0.5 1 102 328
97 795 0.5 1 209 586
98 761 0.5 1 105 656
99 1681 0.5 1 20 1661
100 12227 0.5 1 48 12179

RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz
=============================================
31778548 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Validate paired-end files 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_trimmed.fq.gz
file_1: 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_trimmed.fq.gz

>>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Writing validated paired-end read 2 reads to 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz

Total number of sequences analysed: 31778548

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 121684 (0.38%)

>>> Now running FastQC on the validated data 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz<<<

Started analysis of 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 5% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 10% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 15% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 20% complete for
4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 25% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 30% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 35% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 40% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 45% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 50% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 55% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 60% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 65% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 70% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 75% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 80% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 85% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 90% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 95% complete for 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_val_1.fq.gz

>>> Now running FastQC on the validated data 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz<<<

Started analysis of 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 5% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 10% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 15% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 20% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 25% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 30% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 35% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 40% complete for
4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 45% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 50% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 55% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 60% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 65% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 70% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 75% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 80% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 85% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 90% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 95% complete for 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_val_2.fq.gz
Deleting both intermediate output files 4R041-L7-P14-GTGGCC-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P14-GTGGCC-READ2-Sequences.txt.gz_trimmed.fq.gz

====================================================================================================

Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications)
All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
This is cutadapt 1.16 with Python 3.6.4
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 686.87 s (20 us/read; 3.05 M reads/minute).

=== Summary ===

Total reads processed: 34,898,106
Reads with adapters: 12,716,168 (36.4%)
Reads written (passing filters): 34,898,106 (100.0%)

Total basepairs processed: 3,489,810,600 bp
Quality-trimmed: 9,506,829 bp (0.3%)
Total written (filtered): 3,432,271,892 bp (98.4%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 12716168 times.

No.
of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 34.3%
  C: 27.7%
  G: 16.4%
  T: 19.4%
  none/other: 2.2%

Overview of removed sequences
length count expect max.err error counts
1 9152187 8724526.5 0 9152187
2 2045192 2181131.6 0 2045192
3 594999 545282.9 0 594999
4 157439 136320.7 0 157439
5 59760 34080.2 0 59760
6 33489 8520.0 0 33489
7 28495 2130.0 0 28495
8 25135 532.5 0 25135
9 23862 133.1 0 23449 413
10 25782 33.3 1 21743 4039
11 20460 8.3 1 19179 1281
12 19150 2.1 1 18423 727
13 17756 0.5 1 17052 704
14 16358 0.5 1 15721 637
15 16051 0.5 1 15446 605
16 14490 0.5 1 13912 578
17 13798 0.5 1 13262 536
18 13173 0.5 1 12597 576
19 11672 0.5 1 11211 461
20 10894 0.5 1 10474 420
21 9789 0.5 1 9403 386
22 8813 0.5 1 8443 370
23 7723 0.5 1 7422 301
24 7264 0.5 1 6977 287
25 6412 0.5 1 6156 256
26 6195 0.5 1 5945 250
27 5920 0.5 1 5676 244
28 5710 0.5 1 5446 264
29 5064 0.5 1 4836 228
30 4875 0.5 1 4646 229
31 4428 0.5 1 4237 191
32 4325 0.5 1 4126 199
33 3860 0.5 1 3682 178
34 3658 0.5 1 3468 190
35 3504 0.5 1 3356 148
36 3061 0.5 1 2918 143
37 2800 0.5 1 2656 144
38 2518 0.5 1 2383 135
39 2216 0.5 1 2097 119
40 2134 0.5 1 1999 135
41 1928 0.5 1 1807 121
42 1555 0.5 1 1453 102
43 2100 0.5 1 1983 117
44 803 0.5 1 754 49
45 1082 0.5 1 1024 58
46 1135 0.5 1 1067 68
47 1135 0.5 1 1060 75
48 1230 0.5 1 1152 78
49 1138 0.5 1 1064 74
50 1041 0.5 1 975 66
51 1106 0.5 1 1017 89
52 912 0.5 1 848 64
53 825 0.5 1 770 55
54 778 0.5 1 680 98
55 702 0.5 1 627 75
56 448 0.5 1 415 33
57 506 0.5 1 452 54
58 424 0.5 1 388 36
59 371 0.5 1 336 35
60 330 0.5 1 293 37
61 288 0.5 1 263 25
62 294 0.5 1 247 47
63 271 0.5 1 236 35
64 278 0.5 1 247 31
65 284 0.5 1 221 63
66 291 0.5 1 238 53
67 290 0.5 1 241 49
68 288 0.5 1 231 57
69 381 0.5 1 285 96
70 324 0.5 1 213 111
71 384 0.5 1 236 148
72 510 0.5 1 245 265
73 810 0.5 1 275 535
74 1884 0.5 1 307 1577
75 29888 0.5 1 525 29363
76 38886 0.5 1 1519 37367
77 39150 0.5 1 1810 37340
78 32701 0.5 1 1461 31240
79 19239 0.5 1
1130 18109
80 13041 0.5 1 753 12288
81 8717 0.5 1 491 8226
82 5762 0.5 1 333 5429
83 3553 0.5 1 328 3225
84 2277 0.5 1 247 2030
85 2037 0.5 1 256 1781
86 1468 0.5 1 190 1278
87 1165 0.5 1 217 948
88 969 0.5 1 182 787
89 860 0.5 1 164 696
90 720 0.5 1 130 590
91 730 0.5 1 201 529
92 628 0.5 1 112 516
93 687 0.5 1 85 602
94 792 0.5 1 94 698
95 931 0.5 1 106 825
96 1274 0.5 1 140 1134
97 1766 0.5 1 218 1548
98 2381 0.5 1 118 2263
99 8437 0.5 1 77 8360
100 71572 0.5 1 119 71453

RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz
=============================================
34898106 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Writing report to '/mnt/data/coral_RNAseq_2017/porites/20180311_fastqc_trimming/trimmed/4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz
Trimming mode: paired-end
Trim Galore version: 0.4.4_dev
Cutadapt version: 1.16
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
All Read 1 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 15 bp from their 5' end to avoid poor qualities or biases (e.g.
M-bias for BS-Seq applications)
All Read 1 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 4 bp from their 3' end to avoid poor qualities or biases
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: '--outdir /mnt/data/coral_RNAseq_2017/porites/20180415_trimmed/fastqc --threads 24'
Output file(s) will be GZIP compressed

Writing final adapter and quality trimmed output to 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_trimmed.fq.gz

>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz <<<
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
This is cutadapt 1.16 with Python 3.6.4
Command line parameters: -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz
Running on 1 core
Trimming 1 adapter with at most 10.0% errors in single-end mode ...
Finished in 710.98 s (20 us/read; 2.95 M reads/minute).

=== Summary ===

Total reads processed: 34,898,106
Reads with adapters: 13,696,554 (39.2%)
Reads written (passing filters): 34,898,106 (100.0%)

Total basepairs processed: 3,489,810,600 bp
Quality-trimmed: 16,415,739 bp (0.5%)
Total written (filtered): 3,422,168,837 bp (98.1%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 13696554 times.

No.
of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 37.6%
  C: 23.3%
  G: 18.7%
  T: 18.3%
  none/other: 2.1%

Overview of removed sequences
length count expect max.err error counts
1 9508638 8724526.5 0 9508638
2 2388777 2181131.6 0 2388777
3 805242 545282.9 0 805242
4 206399 136320.7 0 206399
5 67746 34080.2 0 67746
6 34469 8520.0 0 34469
7 28973 2130.0 0 28973
8 25262 532.5 0 25262
9 24875 133.1 0 24119 756
10 24187 33.3 1 21782 2405
11 21083 8.3 1 19583 1500
12 19993 2.1 1 18884 1109
13 17334 0.5 1 16497 837
14 19446 0.5 1 18444 1002
15 13840 0.5 1 13188 652
16 14698 0.5 1 13992 706
17 16926 0.5 1 16105 821
18 10271 0.5 1 9773 498
19 13371 0.5 1 12709 662
20 9627 0.5 1 9163 464
21 9713 0.5 1 9246 467
22 8933 0.5 1 8348 585
23 8099 0.5 1 7498 601
24 7859 0.5 1 7380 479
25 6360 0.5 1 5947 413
26 6522 0.5 1 6100 422
27 6219 0.5 1 5808 411
28 5946 0.5 1 5583 363
29 5285 0.5 1 4910 375
30 5836 0.5 1 5432 404
31 3985 0.5 1 3710 275
32 4544 0.5 1 4233 311
33 4042 0.5 1 3741 301
34 3951 0.5 1 3655 296
35 3663 0.5 1 3377 286
36 3408 0.5 1 3137 271
37 2939 0.5 1 2701 238
38 2672 0.5 1 2439 233
39 2387 0.5 1 2162 225
40 2217 0.5 1 2030 187
41 2055 0.5 1 1808 247
42 2005 0.5 1 1813 192
43 1442 0.5 1 1289 153
44 1505 0.5 1 1344 161
45 1731 0.5 1 1517 214
46 1228 0.5 1 1044 184
47 1220 0.5 1 1063 157
48 1423 0.5 1 1265 158
49 1227 0.5 1 1089 138
50 1255 0.5 1 1086 169
51 1502 0.5 1 1297 205
52 937 0.5 1 815 122
53 949 0.5 1 808 141
54 842 0.5 1 720 122
55 753 0.5 1 622 131
56 673 0.5 1 565 108
57 607 0.5 1 495 112
58 554 0.5 1 458 96
59 481 0.5 1 371 110
60 454 0.5 1 371 83
61 438 0.5 1 350 88
62 440 0.5 1 330 110
63 383 0.5 1 279 104
64 454 0.5 1 317 137
65 464 0.5 1 285 179
66 741 0.5 1 278 463
67 18269 0.5 1 305 17964
68 21668 0.5 1 1344 20324
69 21419 0.5 1 1017 20402
70 21672 0.5 1 1043 20629
71 12349 0.5 1 1079 11270
72 8607 0.5 1 715 7892
73 6117 0.5 1 542 5575
74 3547 0.5 1 453 3094
75 2087 0.5 1 306 1781
76 1412 0.5 1 216 1196
77 1246 0.5 1 218
1028
78 908 0.5 1 168 740
79 744 0.5 1 171 573
80 690 0.5 1 132 558
81 601 0.5 1 153 448
82 515 0.5 1 115 400
83 560 0.5 1 151 409
84 468 0.5 1 109 359
85 454 0.5 1 144 310
86 444 0.5 1 123 321
87 483 0.5 1 150 333
88 546 0.5 1 152 394
89 511 0.5 1 111 400
90 557 0.5 1 98 459
91 743 0.5 1 131 612
92 675 0.5 1 73 602
93 756 0.5 1 43 713
94 1039 0.5 1 69 970
95 1568 0.5 1 67 1501
96 2481 0.5 1 81 2400
97 4175 0.5 1 146 4029
98 6009 0.5 1 85 5924
99 17494 0.5 1 37 17457
100 134240 0.5 1 103 134137

RUN STATISTICS FOR INPUT FILE: /mnt/data/coral_RNAseq_2017/porites/4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz
=============================================
34898106 sequences processed in total
The length threshold of paired-end sequences gets evaluated later on (in the validation step)

Validate paired-end files 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_trimmed.fq.gz
file_1: 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_trimmed.fq.gz, file_2: 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_trimmed.fq.gz

>>>>> Now validing the length of the 2 paired-end infiles: 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_trimmed.fq.gz <<<<<
Writing validated paired-end read 1 reads to 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Writing validated paired-end read 2 reads to 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz

Total number of sequences analysed: 34898106

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 365883 (1.05%)

>>> Now running FastQC on the validated data 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz<<<

Started analysis of 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 5% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 10% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 15% complete for
4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 20% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 25% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 30% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 35% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 40% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 45% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 50% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 55% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 60% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 65% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 70% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 75% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 80% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 85% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 90% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz
Approx 95% complete for 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_val_1.fq.gz

>>> Now running FastQC on the validated data 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz<<<

Started analysis of 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 5% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 10% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 15% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 20% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 25% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 30% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 35% complete for
4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 40% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 45% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 50% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 55% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 60% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 65% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 70% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 75% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 80% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 85% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 90% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Approx 95% complete for 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_val_2.fq.gz
Deleting both intermediate output files 4R041-L7-P15-GTTTCG-READ1-Sequences.txt.gz_trimmed.fq.gz and 4R041-L7-P15-GTTTCG-READ2-Sequences.txt.gz_trimmed.fq.gz

====================================================================================================