SUMMARISING RUN PARAMETERS
==========================
Input filename: /gscratch/scrubbed/strigg/analyses/20200320/FASTQS/EPI-230_S37_L004_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.4_dev
Cutadapt version: 2.4
Python version: could not detect
Number of cores used for trimming: 8
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Using Illumina adapter for trimming (count: 393944). Second best hit was Nextera (count: 0)
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
All Read 1 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications)
All Read 1 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: --outdir /gscratch/scrubbed/strigg/analyses/20200320/TG_FASTQS/FastQC --threads 28
Output file will be GZIP compressed

This is cutadapt 2.4 with Python 3.7.6
Command line parameters: -j 8 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /gscratch/scrubbed/strigg/analyses/20200320/FASTQS/EPI-230_S37_L004_R2_001.fastq.gz
Processing reads on 8 cores in single-end mode ...
Finished in 85.83 s (3 us/read; 22.68 M reads/minute).

=== Summary ===

Total reads processed: 32,435,740
Reads with adapters: 23,367,538 (72.0%)
Reads written (passing filters): 32,435,740 (100.0%)

Total basepairs processed: 3,276,009,740 bp
Quality-trimmed: 77,027,024 bp (2.4%)
Total written (filtered): 2,564,694,682 bp (78.3%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 23367538 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 33.7%
  C: 22.2%
  G: 13.5%
  T: 30.6%
  none/other: 0.0%

Overview of removed sequences
length count expect max.err error counts
1 8087603 8108935.0 0 8087603
2 270197 2027233.8 0 270197
3 220972 506808.4 0 220972
4 194228 126702.1 0 194228
5 194483 31675.5 0 194483
6 200191 7918.9 0 200191
7 197050 1979.7 0 197050
8 204432 494.9 0 204432
9 194489 123.7 0 193639 850
10 198851 30.9 1 193895 4956
11 188599 7.7 1 182991 5608
12 198453 1.9 1 192343 6110
13 192095 0.5 1 186431 5664
14 208683 0.5 1 202046 6637
15 198617 0.5 1 192902 5715
16 200472 0.5 1 194701 5771
17 212338 0.5 1 206008 6330
18 189715 0.5 1 184250 5465
19 202977 0.5 1 196911 6066
20 201671 0.5 1 195724 5947
21 203189 0.5 1 197089 6100
22 209424 0.5 1 202876 6548
23 203950 0.5 1 197656 6294
24 216504 0.5 1 209421 7083
25 199984 0.5 1 193957 6027
26 201906 0.5 1 194945 6961
27 205931 0.5 1 198184 7747
28 215965 0.5 1 209211 6754
29 208099 0.5 1 200993 7106
30 230309 0.5 1 223235 7074
31 199303 0.5 1 192740 6563
32 216119 0.5 1 209851 6268
33 221690 0.5 1 214395 7295
34 219230 0.5 1 211468 7762
35 216591 0.5 1 210471 6120
36 210275 0.5 1 203556 6719
37 208917 0.5 1 202322 6595
38 194417 0.5 1 188464 5953
39 197704 0.5 1 191039 6665
40 196813 0.5 1 190116 6697
41 203055 0.5 1 196765 6290
42 200635 0.5 1 194816 5819
43 179402 0.5 1 173444 5958
44 186178 0.5 1 180015 6163
45 224480 0.5 1 217886 6594
46 177047 0.5 1 171714 5333
47 140588 0.5 1 135871 4717
48 187380 0.5 1 181842 5538
49 146092 0.5 1 141782 4310
50 147241 0.5 1 142591 4650
51 199837 0.5 1 194547 5290
52 131800 0.5 1 128035 3765
53 135152 0.5 1 131362 3790
54 118989 0.5 1 115474 3515
55 142303 0.5 1 138568 3735
56 135473 0.5 1 131473 4000
57 126612 0.5 1 123064 3548
58 121076 0.5 1 117446 3630
59 117987 0.5 1 114480 3507
60 113778 0.5 1 110309 3469
61 111889 0.5 1 108462 3427
62 113479 0.5 1 109779 3700
63 111018 0.5 1 107346 3672
64 108244 0.5 1 104546 3698
65 111740 0.5 1 107735 4005
66 120650 0.5 1 115628 5022
67 193920 0.5 1 175060 18860
68 1100049 0.5 1 1070752 29297
69 522269 0.5 1 505082 17187
70 284977 0.5 1 275090 9887
71 144072 0.5 1 138289 5783
72 88943 0.5 1 85291 3652
73 56509 0.5 1 53957 2552
74 41491 0.5 1 39483 2008
75 31896 0.5 1 30204 1692
76 26059 0.5 1 24667 1392
77 22885 0.5 1 21595 1290
78 19797 0.5 1 18680 1117
79 17069 0.5 1 16050 1019
80 14907 0.5 1 13973 934
81 12996 0.5 1 12186 810
82 11380 0.5 1 10596 784
83 9995 0.5 1 9325 670
84 9144 0.5 1 8421 723
85 8159 0.5 1 7521 638
86 7537 0.5 1 6900 637
87 7342 0.5 1 6698 644
88 7582 0.5 1 6925 657
89 8476 0.5 1 7746 730
90 10819 0.5 1 9916 903
91 15048 0.5 1 13821 1227
92 22365 0.5 1 20404 1961
93 47877 0.5 1 44173 3704
94 140737 0.5 1 130933 9804
95 239117 0.5 1 223461 15656
96 102535 0.5 1 95674 6861
97 65944 0.5 1 61526 4418
98 27320 0.5 1 25474 1846
99 26433 0.5 1 24567 1866
100 27015 0.5 1 25072 1943
101 50313 0.5 1 45639 4674

RUN STATISTICS FOR INPUT FILE: /gscratch/scrubbed/strigg/analyses/20200320/FASTQS/EPI-230_S37_L004_R2_001.fastq.gz
=============================================
32435740 sequences processed in total

Total number of sequences analysed for the sequence pair length validation: 32435740

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 3825438 (11.79%)