SUMMARISING RUN PARAMETERS
==========================
Input filename: /gscratch/scrubbed/strigg/analyses/20200320/FASTQS/EPI-208_S28_L004_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.4_dev
Cutadapt version: 2.4
Python version: could not detect
Number of cores used for trimming: 8
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Using Illumina adapter for trimming (count: 555734). Second best hit was smallRNA (count: 0)
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
All Read 1 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications)
All Read 1 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: --outdir /gscratch/scrubbed/strigg/analyses/20200320/TG_FASTQS/FastQC --threads 28
Output file will be GZIP compressed


This is cutadapt 2.4 with Python 3.7.6
Command line parameters: -j 8 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /gscratch/scrubbed/strigg/analyses/20200320/FASTQS/EPI-208_S28_L004_R2_001.fastq.gz
Processing reads on 8 cores in single-end mode ...
Finished in 87.71 s (2 us/read; 26.73 M reads/minute).

=== Summary ===

Total reads processed:              39,075,361
Reads with adapters:                31,424,594 (80.4%)
Reads written (passing filters):    39,075,361 (100.0%)

Total basepairs processed: 3,946,611,461 bp
Quality-trimmed:             219,369,543 bp (5.6%)
Total written (filtered):  2,423,314,615 bp (61.4%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 31424594 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 29.1%
  C: 22.4%
  G: 20.4%
  T: 28.1%
  none/other: 0.0%

Overview of removed sequences
length	count	expect	max.err	error counts
1	7102734	9768840.2	0	7102734
2	226443	2442210.1	0	226443
3	201993	610552.5	0	201993
4	176510	152638.1	0	176510
5	181718	38159.5	0	181718
6	192190	9539.9	0	192190
7	182018	2385.0	0	182018
8	208299	596.2	0	208299
9	180449	149.1	0	179695 754
10	179468	37.3	1	174992 4476
11	171773	9.3	1	166505 5268
12	177712	2.3	1	172210 5502
13	174508	0.6	1	169358 5150
14	189678	0.6	1	183873 5805
15	181978	0.6	1	176642 5336
16	184120	0.6	1	178943 5177
17	195493	0.6	1	189911 5582
18	176848	0.6	1	171962 4886
19	187194	0.6	1	181799 5395
20	187740	0.6	1	182259 5481
21	191496	0.6	1	185617 5879
22	197944	0.6	1	191756 6188
23	196750	0.6	1	190688 6062
24	204772	0.6	1	198019 6753
25	192863	0.6	1	186951 5912
26	195010	0.6	1	187922 7088
27	197596	0.6	1	189690 7906
28	207734	0.6	1	201144 6590
29	204528	0.6	1	197301 7227
30	227240	0.6	1	220150 7090
31	199620	0.6	1	192794 6826
32	216698	0.6	1	210139 6559
33	228844	0.6	1	220927 7917
34	229743	0.6	1	221194 8549
35	224369	0.6	1	217625 6744
36	223899	0.6	1	216280 7619
37	223405	0.6	1	216170 7235
38	208835	0.6	1	202145 6690
39	213256	0.6	1	205645 7611
40	217003	0.6	1	209432 7571
41	224810	0.6	1	217535 7275
42	225890	0.6	1	218871 7019
43	204868	0.6	1	198025 6843
44	217709	0.6	1	210293 7416
45	282381	0.6	1	273779 8602
46	219060	0.6	1	212241 6819
47	174686	0.6	1	168923 5763
48	242243	0.6	1	234890 7353
49	186939	0.6	1	181222 5717
50	192831	0.6	1	186499 6332
51	276471	0.6	1	269277 7194
52	183342	0.6	1	177943 5399
53	192935	0.6	1	187597 5338
54	174187	0.6	1	169015 5172
55	218102	0.6	1	212194 5908
56	212356	0.6	1	206326 6030
57	204151	0.6	1	198037 6114
58	201004	0.6	1	194984 6020
59	202836	0.6	1	196597 6239
60	200949	0.6	1	194916 6033
61	206931	0.6	1	200485 6446
62	218187	0.6	1	210977 7210
63	224730	0.6	1	216784 7946
64	232495	0.6	1	224201 8294
65	259856	0.6	1	249858 9998
66	309763	0.6	1	295643 14120
67	577646	0.6	1	518538 59108
68	3616892	0.6	1	3520578 96314
69	1644780	0.6	1	1589525 55255
70	889914	0.6	1	858600 31314
71	436231	0.6	1	418981 17250
72	263589	0.6	1	252378 11211
73	162487	0.6	1	154588 7899
74	118504	0.6	1	112365 6139
75	92100	0.6	1	87183 4917
76	73880	0.6	1	69668 4212
77	63422	0.6	1	59741 3681
78	55210	0.6	1	51956 3254
79	47394	0.6	1	44340 3054
80	41238	0.6	1	38586 2652
81	35175	0.6	1	32721 2454
82	30373	0.6	1	28081 2292
83	26620	0.6	1	24583 2037
84	24145	0.6	1	22153 1992
85	22090	0.6	1	20210 1880
86	20946	0.6	1	19066 1880
87	21525	0.6	1	19469 2056
88	23296	0.6	1	21102 2194
89	27364	0.6	1	24843 2521
90	36473	0.6	1	33056 3417
91	55370	0.6	1	50624 4746
92	82370	0.6	1	75294 7076
93	179378	0.6	1	165218 14160
94	531477	0.6	1	495237 36240
95	789733	0.6	1	737801 51932
96	323292	0.6	1	301561 21731
97	187333	0.6	1	174806 12527
98	81078	0.6	1	75707 5371
99	73380	0.6	1	68165 5215
100	77682	0.6	1	72215 5467
101	142024	0.6	1	129017 13007

RUN STATISTICS FOR INPUT FILE: /gscratch/scrubbed/strigg/analyses/20200320/FASTQS/EPI-208_S28_L004_R2_001.fastq.gz
=============================================
39075361 sequences processed in total

Total number of sequences analysed for the sequence pair length validation: 39075361

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 11883981 (30.41%)