SUMMARISING RUN PARAMETERS
==========================
Input filename: /gscratch/scrubbed/strigg/analyses/20200320/FASTQS/EPI-209_S29_L004_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.4_dev
Cutadapt version: 2.4
Python version: could not detect
Number of cores used for trimming: 8
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Using Illumina adapter for trimming (count: 330202). Second best hit was smallRNA (count: 1)
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
All Read 1 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications)
All Read 1 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: --outdir /gscratch/scrubbed/strigg/analyses/20200320/TG_FASTQS/FastQC --threads 28
Output file will be GZIP compressed


This is cutadapt 2.4 with Python 3.7.6
Command line parameters: -j 8 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /gscratch/scrubbed/strigg/analyses/20200320/FASTQS/EPI-209_S29_L004_R2_001.fastq.gz
Processing reads on 8 cores in single-end mode ...
Finished in 87.43 s (3 us/read; 21.56 M reads/minute).

=== Summary ===

Total reads processed:              31,419,769
Reads with adapters:                21,614,014 (68.8%)
Reads written (passing filters):    31,419,769 (100.0%)

Total basepairs processed: 3,173,396,669 bp
Quality-trimmed:              52,846,192 bp (1.7%)
Total written (filtered):  2,628,782,596 bp (82.8%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 21614014 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 35.3%
  C: 22.2%
  G: 11.4%
  T: 31.1%
  none/other: 0.0%

Overview of removed sequences
length	count	expect	max.err	error counts
1	8734774	7854942.2	0	8734774
2	264168	1963735.6	0	264168
3	224610	490933.9	0	224610
4	192791	122733.5	0	192791
5	190150	30683.4	0	190150
6	194913	7670.8	0	194913
7	196982	1917.7	0	196982
8	209000	479.4	0	209000
9	196000	119.9	0	195173 827
10	203141	30.0	1	198412 4729
11	182865	7.5	1	177790 5075
12	191450	1.9	1	185766 5684
13	186738	0.5	1	181604 5134
14	205369	0.5	1	199330 6039
15	195656	0.5	1	190271 5385
16	196181	0.5	1	190746 5435
17	202165	0.5	1	196403 5762
18	181166	0.5	1	176280 4886
19	193370	0.5	1	187654 5716
20	190622	0.5	1	185238 5384
21	193328	0.5	1	187765 5563
22	198860	0.5	1	192907 5953
23	189427	0.5	1	183694 5733
24	199822	0.5	1	193557 6265
25	183895	0.5	1	178470 5425
26	182511	0.5	1	176453 6058
27	184136	0.5	1	177440 6696
28	194739	0.5	1	188979 5760
29	184983	0.5	1	179036 5947
30	204058	0.5	1	198106 5952
31	174604	0.5	1	169099 5505
32	189523	0.5	1	184209 5314
33	192826	0.5	1	186716 6110
34	187744	0.5	1	181492 6252
35	187013	0.5	1	182001 5012
36	179254	0.5	1	173687 5567
37	178941	0.5	1	173539 5402
38	164059	0.5	1	159187 4872
39	166796	0.5	1	161285 5511
40	164687	0.5	1	159412 5275
41	169686	0.5	1	164725 4961
42	168623	0.5	1	163921 4702
43	148078	0.5	1	143407 4671
44	153809	0.5	1	148976 4833
45	186123	0.5	1	180924 5199
46	146858	0.5	1	142559 4299
47	116142	0.5	1	112386 3756
48	155224	0.5	1	150892 4332
49	121186	0.5	1	117716 3470
50	121068	0.5	1	117322 3746
51	166414	0.5	1	162381 4033
52	107754	0.5	1	104617 3137
53	110077	0.5	1	107077 3000
54	97287	0.5	1	94592 2695
55	117805	0.5	1	114708 3097
56	112515	0.5	1	109320 3195
57	105922	0.5	1	102845 3077
58	102228	0.5	1	99390 2838
59	98462	0.5	1	95603 2859
60	93993	0.5	1	91182 2811
61	92745	0.5	1	89961 2784
62	94564	0.5	1	91720 2844
63	94361	0.5	1	91394 2967
64	92501	0.5	1	89409 3092
65	95404	0.5	1	92128 3276
66	99458	0.5	1	95616 3842
67	149612	0.5	1	136937 12675
68	755065	0.5	1	735966 19099
69	336790	0.5	1	326080 10710
70	177166	0.5	1	171002 6164
71	89780	0.5	1	86251 3529
72	56865	0.5	1	54449 2416
73	38055	0.5	1	36303 1752
74	28628	0.5	1	27218 1410
75	22781	0.5	1	21660 1121
76	19324	0.5	1	18326 998
77	17129	0.5	1	16193 936
78	14770	0.5	1	13913 857
79	12720	0.5	1	11998 722
80	11018	0.5	1	10351 667
81	9430	0.5	1	8848 582
82	7978	0.5	1	7469 509
83	6881	0.5	1	6440 441
84	5905	0.5	1	5462 443
85	4791	0.5	1	4440 351
86	4295	0.5	1	3927 368
87	4050	0.5	1	3700 350
88	4120	0.5	1	3775 345
89	4646	0.5	1	4212 434
90	5934	0.5	1	5409 525
91	8230	0.5	1	7513 717
92	12210	0.5	1	11133 1077
93	26572	0.5	1	24349 2223
94	77954	0.5	1	72243 5711
95	131539	0.5	1	122540 8999
96	55073	0.5	1	51211 3862
97	35038	0.5	1	32570 2468
98	14394	0.5	1	13380 1014
99	14041	0.5	1	13023 1018
100	16128	0.5	1	14975 1153
101	33498	0.5	1	30322 3176

RUN STATISTICS FOR INPUT FILE: /gscratch/scrubbed/strigg/analyses/20200320/FASTQS/EPI-209_S29_L004_R2_001.fastq.gz
=============================================
31419769 sequences processed in total

Total number of sequences analysed for the sequence pair length validation: 31419769

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 2541055 (8.09%)