SUMMARISING RUN PARAMETERS
==========================
Input filename: /gscratch/scrubbed/strigg/analyses/20200320/FASTQS/EPI-215_S31_L004_R2_001.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.4_dev
Cutadapt version: 2.4
Python version: could not detect
Number of cores used for trimming: 8
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Using Illumina adapter for trimming (count: 419482). Second best hit was Nextera (count: 0)
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
All Read 1 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications)
All Read 1 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases
All Read 2 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases
Running FastQC on the data once trimming has completed
Running FastQC with the following extra arguments: --outdir /gscratch/scrubbed/strigg/analyses/20200320/TG_FASTQS/FastQC --threads 28
Output file will be GZIP compressed


This is cutadapt 2.4 with Python 3.7.6
Command line parameters: -j 8 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /gscratch/scrubbed/strigg/analyses/20200320/FASTQS/EPI-215_S31_L004_R2_001.fastq.gz
Processing reads on 8 cores in single-end mode ...
Finished in 61.38 s (3 us/read; 23.66 M reads/minute).

=== Summary ===

Total reads processed:              24,211,042
Reads with adapters:                17,622,075 (72.8%)
Reads written (passing filters):    24,211,042 (100.0%)

Total basepairs processed: 2,445,315,242 bp
Quality-trimmed:              75,126,414 bp (3.1%)
Total written (filtered):  1,824,581,848 bp (74.6%)

=== Adapter 1 ===

Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 17622075 times.

No. of allowed errors:
0-9 bp: 0; 10-13 bp: 1

Bases preceding removed adapters:
  A: 32.3%
  C: 22.4%
  G: 15.5%
  T: 29.8%
  none/other: 0.0%

Overview of removed sequences
length count expect max.err error counts
1 5771944 6052760.5 0 5771944
2 183046 1513190.1 0 183046
3 158629 378297.5 0 158629
4 128020 94574.4 0 128020
5 127720 23643.6 0 127720
6 130531 5910.9 0 130531
7 130534 1477.7 0 130534
8 144560 369.4 0 144560
9 128764 92.4 0 128166 598
10 136920 23.1 1 133571 3349
11 120971 5.8 1 117415 3556
12 125870 1.4 1 121769 4101
13 122698 0.4 1 119151 3547
14 135250 0.4 1 130983 4267
15 129474 0.4 1 125800 3674
16 130700 0.4 1 126928 3772
17 134806 0.4 1 130579 4227
18 121590 0.4 1 118072 3518
19 130676 0.4 1 126655 4021
20 129833 0.4 1 125828 4005
21 130058 0.4 1 125998 4060
22 135996 0.4 1 131674 4322
23 130909 0.4 1 126703 4206
24 139596 0.4 1 134893 4703
25 128873 0.4 1 124962 3911
26 129948 0.4 1 125346 4602
27 130462 0.4 1 125317 5145
28 138202 0.4 1 133608 4594
29 132754 0.4 1 128236 4518
30 146808 0.4 1 142117 4691
31 126768 0.4 1 122526 4242
32 135883 0.4 1 131626 4257
33 140417 0.4 1 135653 4764
34 139939 0.4 1 134834 5105
35 141920 0.4 1 137827 4093
36 137077 0.4 1 132428 4649
37 137824 0.4 1 133493 4331
38 128249 0.4 1 124131 4118
39 129968 0.4 1 125394 4574
40 129955 0.4 1 125404 4551
41 135045 0.4 1 130790 4255
42 134525 0.4 1 130411 4114
43 120855 0.4 1 116878 3977
44 127502 0.4 1 123218 4284
45 159215 0.4 1 154359 4856
46 125795 0.4 1 121907 3888
47 99547 0.4 1 96102 3445
48 136525 0.4 1 132388 4137
49 105564 0.4 1 102287 3277
50 106984 0.4 1 103576 3408
51 151207 0.4 1 147093 4114
52 100212 0.4 1 97225 2987
53 102805 0.4 1 99868 2937
54 92255 0.4 1 89366 2889
55 115141 0.4 1 111838 3303
56 110186 0.4 1 107026 3160
57 106589 0.4 1 103407 3182
58 106848 0.4 1 103689 3159
59 101392 0.4 1 98330 3062
60 99030 0.4 1 95943 3087
61 100875 0.4 1 97682 3193
62 104368 0.4 1 100972 3396
63 107182 0.4 1 103524 3658
64 107851 0.4 1 103992 3859
65 116416 0.4 1 111947 4469
66 130943 0.4 1 125356 5587
67 219252 0.4 1 199178 20074
68 1217122 0.4 1 1185020 32102
69 537209 0.4 1 519603 17606
70 285367 0.4 1 275242 10125
71 143255 0.4 1 137321 5934
72 89887 0.4 1 85903 3984
73 59047 0.4 1 56254 2793
74 43854 0.4 1 41550 2304
75 35016 0.4 1 33074 1942
76 29682 0.4 1 27961 1721
77 26375 0.4 1 24869 1506
78 23383 0.4 1 21970 1413
79 20421 0.4 1 19150 1271
80 17439 0.4 1 16411 1028
81 14993 0.4 1 14031 962
82 12876 0.4 1 12010 866
83 11290 0.4 1 10446 844
84 9836 0.4 1 9119 717
85 8533 0.4 1 7828 705
86 7597 0.4 1 6921 676
87 7168 0.4 1 6517 651
88 7460 0.4 1 6745 715
89 8694 0.4 1 7825 869
90 11195 0.4 1 10168 1027
91 16561 0.4 1 15058 1503
92 24004 0.4 1 21925 2079
93 51848 0.4 1 47530 4318
94 150769 0.4 1 140122 10647
95 235976 0.4 1 220141 15835
96 95401 0.4 1 88674 6727
97 59081 0.4 1 54950 4131
98 25037 0.4 1 23328 1709
99 22979 0.4 1 21281 1698
100 25841 0.4 1 23912 1929
101 48628 0.4 1 44213 4415


RUN STATISTICS FOR INPUT FILE: /gscratch/scrubbed/strigg/analyses/20200320/FASTQS/EPI-215_S31_L004_R2_001.fastq.gz
=============================================
24211042 sequences processed in total

Total number of sequences analysed for the sequence pair length validation: 24211042

Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 4023307 (16.62%)