/var/spool/slurm/d/job2358892/slurm_script: line 20: fg: no job control Using an excessive number of cores has a diminishing return! It is recommended not to exceed 8 cores per trimming process (you asked for 8 cores). Please consider re-specifying Path to Cutadapt set as: '/gscratch/srlab/strigg/bin/anaconda3/bin/cutadapt' (user defined) Cutadapt seems to be working fine (tested command '/gscratch/srlab/strigg/bin/anaconda3/bin/cutadapt --version') Cutadapt version: 2.4 Could not detect version of Python used by Cutadapt from the first line of Cutadapt (but found this: >>>#!/bin/sh<<<) Letting the (modified) Cutadapt deal with the Python version instead Parallel gzip (pigz) detected. Proceeding with multicore (de)compression using 8 cores No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default) Output will be written into the directory: /gscratch/scrubbed/strigg/analyses/20200319/specify_a/ Writing report to '/gscratch/scrubbed/strigg/analyses/20200319/specify_a/EPI-167_S10_L002_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.6.4_dev Cutadapt version: 2.4 Python version: could not detect Number of cores used for trimming: 8 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGCACACGTCTGAAC' (user defined) Maximum trimming error rate: 0.1 (default) Optional adapter 2 sequence (only used for read 2 of paired-end files): 'AGATCGGAAGAGCGTCGTGTAGGGA' Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/strigg/analyses/20200319/specify_a --threads 28' Output file(s) will be GZIP compressed Cutadapt seems to be fairly up-to-date (version 2.4). Setting -j 8 Writing final adapter and quality trimmed output to EPI-167_S10_L002_R1_001_trimmed.fq.gz >>> Now performing quality (cutoff '-q 20') and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGCACACGTCTGAAC' from file /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R1_001.fastq.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 2.4 with Python 3.7.6 Command line parameters: -j 8 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGCACACGTCTGAAC /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R1_001.fastq.gz Processing reads on 8 cores in single-end mode ... Finished in 80.31 s (3 us/read; 18.57 M reads/minute). === Summary === Total reads processed: 24,859,230 Reads with adapters: 13,310,825 (53.5%) Reads written (passing filters): 24,859,230 (100.0%) Total basepairs processed: 2,510,782,230 bp Quality-trimmed: 11,488,232 bp (0.5%) Total written (filtered): 2,298,441,331 bp (91.5%) === Adapter 1 === Sequence: AGATCGGAAGAGCACACGTCTGAAC; Type: regular 3'; Length: 25; Trimmed: 13310825 times. No. of allowed errors: 0-9 bp: 0; 10-19 bp: 1; 20-25 bp: 2 Bases preceding removed adapters: A: 25.5% C: 9.5% G: 23.7% T: 41.3% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 1 5079588 6214807.5 0 5079588 2 1202251 1553701.9 0 1202251 3 456300 388425.5 0 456300 4 315396 97106.4 0 315396 5 162023 24276.6 0 162023 6 154952 6069.1 0 154952 7 141317 1517.3 0 141317 8 147815 379.3 0 147815 9 156397 94.8 0 154875 1522 10 144818 23.7 1 138784 6034 11 151610 5.9 1 144610 7000 12 144234 1.5 1 137849 6385 13 138279 0.4 1 132093 6186 14 149582 0.1 1 141472 8110 15 139617 0.0 1 132371 7246 16 149533 0.0 1 140718 8815 17 143290 0.0 1 134911 8379 18 132507 0.0 1 123777 8159 571 19 143599 0.0 1 132190 10635 774 20 131755 0.0 2 122046 8589 1120 21 146494 0.0 2 132987 11445 2062 22 135126 0.0 2 124303 9630 1193 23 126194 0.0 2 115663 9314 1217 24 131857 0.0 2 119870 10387 1600 25 122353 0.0 2 112116 9190 1047 26 132356 0.0 2 119736 10869 1751 27 119873 0.0 2 109836 8721 1316 28 113147 0.0 2 103784 8301 1062 29 123716 0.0 2 112992 9263 1461 30 113300 0.0 2 103872 8363 1065 31 120517 0.0 2 109240 9537 1740 32 111072 0.0 2 102015 7969 1088 33 114305 0.0 2 104447 8569 1289 34 108779 0.0 2 99492 8120 1167 35 104061 0.0 2 95648 7427 986 36 99353 0.0 2 91169 7175 1009 37 105972 0.0 2 96779 8018 1175 38 92166 0.0 2 84732 6433 1001 39 93699 0.0 2 85639 6965 1095 40 94449 0.0 2 85795 7662 992 41 129224 0.0 2 119820 8107 1297 42 78034 0.0 2 72578 4766 690 43 35611 0.0 2 31892 3292 427 44 72241 0.0 2 66560 4941 740 45 66871 0.0 2 61637 4601 633 46 63480 0.0 2 58567 4379 534 47 65222 0.0 2 59962 4547 713 48 58338 0.0 2 53531 4197 610 49 59861 0.0 2 54840 4380 641 50 53802 0.0 2 49655 3678 469 51 50397 0.0 2 46525 3380 492 52 46801 0.0 2 43097 3252 452 53 42870 0.0 2 39683 2782 405 54 41405 0.0 2 38238 2746 421 55 40756 0.0 2 37782 2611 363 56 36903 0.0 2 34142 2423 338 57 33898 0.0 2 31227 2360 311 58 31239 0.0 2 28980 2027 232 59 30217 0.0 2 28005 1966 246 60 26732 0.0 2 24878 1668 186 61 26427 0.0 2 24460 1787 180 62 25361 0.0 2 23508 1647 206 63 21705 0.0 2 20154 1396 155 64 19937 0.0 2 18653 1152 132 65 17966 0.0 2 16711 1137 118 66 16513 0.0 2 15317 1088 108 67 15408 0.0 2 14252 1052 104 68 14019 0.0 2 12960 970 89 69 13548 0.0 2 12518 939 91 70 12389 0.0 2 11493 818 78 71 12247 0.0 2 11250 892 105 72 14140 0.0 2 12722 1193 225 73 24750 0.0 2 21535 2947 268 74 80751 0.0 2 75790 4690 271 75 62956 0.0 2 59263 3517 176 76 34298 0.0 2 32139 2050 109 77 19980 0.0 2 18745 1172 63 78 11508 0.0 2 10781 686 41 79 6171 0.0 2 5742 410 19 80 3991 0.0 2 3687 279 25 81 2402 0.0 2 2222 166 14 82 1642 0.0 2 1490 140 12 83 1273 0.0 2 1174 88 11 84 1130 0.0 2 1031 91 8 85 989 0.0 2 903 78 8 86 959 0.0 2 877 76 6 87 835 0.0 2 750 78 7 88 716 0.0 2 643 63 10 89 798 0.0 2 727 62 9 90 871 0.0 2 802 64 5 91 1229 0.0 2 1108 109 12 92 1963 0.0 2 1790 160 13 93 4579 0.0 2 4247 306 26 94 13773 0.0 2 12652 1011 110 95 24682 0.0 2 22814 1730 138 96 11975 0.0 2 11015 885 75 97 8863 0.0 2 8123 684 56 98 3854 0.0 2 3544 284 26 99 4127 0.0 2 3800 310 17 100 4411 0.0 2 4002 383 26 101 8135 0.0 2 7184 899 52 RUN STATISTICS FOR INPUT FILE: /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R1_001.fastq.gz ============================================= 24859230 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/strigg/analyses/20200319/specify_a/EPI-167_S10_L002_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.6.4_dev Cutadapt version: 2.4 Python version: could not detect Number of cores used for trimming: 8 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGCACACGTCTGAAC' (user defined) Maximum trimming error rate: 0.1 (default) Optional adapter 2 sequence (only used for read 2 of paired-end files): 'AGATCGGAAGAGCGTCGTGTAGGGA' Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/strigg/analyses/20200319/specify_a --threads 28' Output file(s) will be GZIP compressed Cutadapt seems to be fairly up-to-date (version 2.4). Setting -j -j 8 Writing final adapter and quality trimmed output to EPI-167_S10_L002_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff '-q 20') and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGCGTCGTGTAGGGA' from file /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R2_001.fastq.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 2.4 with Python 3.7.6 Command line parameters: -j 8 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGCGTCGTGTAGGGA /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R2_001.fastq.gz Processing reads on 8 cores in single-end mode ... Finished in 82.98 s (3 us/read; 17.98 M reads/minute). === Summary === Total reads processed: 24,859,230 Reads with adapters: 15,363,536 (61.8%) Reads written (passing filters): 24,859,230 (100.0%) Total basepairs processed: 2,510,782,230 bp Quality-trimmed: 22,659,614 bp (0.9%) Total written (filtered): 2,293,251,313 bp (91.3%) === Adapter 1 === Sequence: AGATCGGAAGAGCGTCGTGTAGGGA; Type: regular 3'; Length: 25; Trimmed: 15363536 times. No. of allowed errors: 0-9 bp: 0; 10-19 bp: 1; 20-25 bp: 2 Bases preceding removed adapters: A: 40.3% C: 20.3% G: 7.3% T: 32.1% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 1 8554514 6214807.5 0 8554514 2 247131 1553701.9 0 247131 3 191193 388425.5 0 191193 4 164298 97106.4 0 164298 5 160566 24276.6 0 160566 6 159035 6069.1 0 159035 7 150319 1517.3 0 150319 8 152605 379.3 0 152605 9 152922 94.8 0 152285 637 10 151128 23.7 1 146689 4439 11 146920 5.9 1 141612 5308 12 148837 1.5 1 143363 5474 13 142341 0.4 1 137280 5061 14 151875 0.1 1 146116 5759 15 140621 0.0 1 135080 5541 16 140329 0.0 1 134360 5969 17 147590 0.0 1 141058 6532 18 131268 0.0 1 125180 5970 118 19 137242 0.0 1 130202 6888 152 20 134860 0.0 2 126296 7299 1265 21 136470 0.0 2 126594 8307 1569 22 137929 0.0 2 128046 8405 1478 23 132148 0.0 2 122951 7765 1432 24 137535 0.0 2 127293 8713 1529 25 121830 0.0 2 112500 7874 1456 26 122372 0.0 2 111493 9021 1858 27 122833 0.0 2 110779 9719 2335 28 125558 0.0 2 115435 8627 1496 29 121040 0.0 2 109875 9372 1793 30 127761 0.0 2 117798 8505 1458 31 110644 0.0 2 101253 7874 1517 32 113145 0.0 2 104687 7317 1141 33 117931 0.0 2 107886 8506 1539 34 119747 0.0 2 108853 9087 1807 35 109358 0.0 2 101564 6789 1005 36 102469 0.0 2 93985 7185 1299 37 101986 0.0 2 93931 6852 1203 38 88289 0.0 2 81349 5989 951 39 91384 0.0 2 83980 6298 1106 40 87968 0.0 2 81063 5902 1003 41 85810 0.0 2 79585 5436 789 42 82412 0.0 2 76871 4870 671 43 73026 0.0 2 67328 4920 778 44 72521 0.0 2 67133 4755 633 45 84963 0.0 2 79659 4649 655 46 65401 0.0 2 60883 3896 622 47 46340 0.0 2 42573 3316 451 48 62890 0.0 2 58972 3433 485 49 44178 0.0 2 41087 2712 379 50 45926 0.0 2 42356 3132 438 51 59803 0.0 2 56373 3012 418 52 36947 0.0 2 34103 2470 374 53 36693 0.0 2 33991 2310 392 54 32194 0.0 2 29695 2182 317 55 36861 0.0 2 34493 2101 267 56 34269 0.0 2 31729 2174 366 57 30720 0.0 2 28494 1961 265 58 28380 0.0 2 26339 1776 265 59 26631 0.0 2 24733 1671 227 60 25170 0.0 2 23184 1700 286 61 24593 0.0 2 22735 1596 262 62 24060 0.0 2 22223 1561 276 63 22601 0.0 2 20779 1597 225 64 21649 0.0 2 19897 1515 237 65 22204 0.0 2 20390 1572 242 66 24131 0.0 2 22040 1770 321 67 35937 0.0 2 31177 4455 305 68 129003 0.0 2 123606 5023 374 69 47313 0.0 2 44432 2606 275 70 24870 0.0 2 23330 1358 182 71 12976 0.0 2 11952 904 120 72 8650 0.0 2 7938 594 118 73 6006 0.0 2 5428 500 78 74 4673 0.0 2 4170 423 80 75 3587 0.0 2 3246 292 49 76 2967 0.0 2 2644 260 63 77 2595 0.0 2 2287 248 60 78 2277 0.0 2 1988 229 60 79 1974 0.0 2 1740 196 38 80 1588 0.0 2 1403 150 35 81 1399 0.0 2 1220 147 32 82 1178 0.0 2 1020 129 29 83 1067 0.0 2 929 111 27 84 908 0.0 2 767 111 30 85 844 0.0 2 702 108 34 86 765 0.0 2 629 110 26 87 784 0.0 2 659 101 24 88 797 0.0 2 673 96 28 89 913 0.0 2 759 123 31 90 1123 0.0 2 931 153 39 91 1402 0.0 2 1155 188 59 92 2147 0.0 2 1810 255 82 93 4579 0.0 2 3823 597 159 94 13431 0.0 2 11681 1388 362 95 23741 0.0 2 20778 2421 542 96 11598 0.0 2 10136 1204 258 97 8556 0.0 2 7512 866 178 98 3633 0.0 2 3223 345 65 99 3883 0.0 2 3382 409 92 100 4128 0.0 2 3619 404 105 101 7880 0.0 2 6828 858 194 RUN STATISTICS FOR INPUT FILE: /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R2_001.fastq.gz ============================================= 24859230 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files EPI-167_S10_L002_R1_001_trimmed.fq.gz and EPI-167_S10_L002_R2_001_trimmed.fq.gz file_1: EPI-167_S10_L002_R1_001_trimmed.fq.gz, file_2: EPI-167_S10_L002_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: EPI-167_S10_L002_R1_001_trimmed.fq.gz and EPI-167_S10_L002_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end Read 1 reads to EPI-167_S10_L002_R1_001_val_1.fq.gz Writing validated paired-end Read 2 reads to EPI-167_S10_L002_R2_001_val_2.fq.gz Total number of sequences analysed: 24859230 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 518493 (2.09%) >>> Now running FastQC on the validated data EPI-167_S10_L002_R1_001_val_1.fq.gz<<< Started analysis of EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 5% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 10% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 15% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 20% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 25% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 30% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 35% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 40% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 45% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 50% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 55% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 60% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 65% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 70% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 75% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 80% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 85% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 90% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 95% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Analysis complete for EPI-167_S10_L002_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data EPI-167_S10_L002_R2_001_val_2.fq.gz<<< Started analysis of EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 5% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 10% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 15% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 20% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 25% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 30% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 35% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 40% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 45% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 50% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 55% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 60% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 65% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 70% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 75% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 80% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 85% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 90% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 95% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Analysis complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Deleting both intermediate output files EPI-167_S10_L002_R1_001_trimmed.fq.gz and EPI-167_S10_L002_R2_001_trimmed.fq.gz ==================================================================================================== Using an excessive number of cores has a diminishing return! It is recommended not to exceed 8 cores per trimming process (you asked for 8 cores). Please consider re-specifying Path to Cutadapt set as: '/gscratch/srlab/strigg/bin/anaconda3/bin/cutadapt' (user defined) Cutadapt seems to be working fine (tested command '/gscratch/srlab/strigg/bin/anaconda3/bin/cutadapt --version') Cutadapt version: 2.4 Could not detect version of Python used by Cutadapt from the first line of Cutadapt (but found this: >>>#!/bin/sh<<<) Letting the (modified) Cutadapt deal with the Python version instead Parallel gzip (pigz) detected. Proceeding with multicore (de)compression using 8 cores No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default) Output will be written into the directory: /gscratch/scrubbed/strigg/analyses/20200319/ AUTO-DETECTING ADAPTER TYPE =========================== Attempting to auto-detect adapter type from the first 1 million sequences of the first file (>> /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R1_001.fastq.gz <<) Found perfect matches for the following adapter sequences: Adapter type Count Sequence Sequences analysed Percentage Illumina 195648 AGATCGGAAGAGC 1000000 19.56 Nextera 0 CTGTCTCTTATA 1000000 0.00 smallRNA 0 TGGAATTCTCGG 1000000 0.00 Using Illumina adapter for trimming (count: 195648). Second best hit was Nextera (count: 0) Writing report to '/gscratch/scrubbed/strigg/analyses/20200319/EPI-167_S10_L002_R1_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R1_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.6.4_dev Cutadapt version: 2.4 Python version: could not detect Number of cores used for trimming: 8 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/strigg/analyses/20200319 --threads 28' Output file(s) will be GZIP compressed Cutadapt seems to be fairly up-to-date (version 2.4). Setting -j 8 Writing final adapter and quality trimmed output to EPI-167_S10_L002_R1_001_trimmed.fq.gz >>> Now performing quality (cutoff '-q 20') and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R1_001.fastq.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 2.4 with Python 3.7.6 Command line parameters: -j 8 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R1_001.fastq.gz Processing reads on 8 cores in single-end mode ... Finished in 80.37 s (3 us/read; 18.56 M reads/minute). === Summary === Total reads processed: 24,859,230 Reads with adapters: 13,309,382 (53.5%) Reads written (passing filters): 24,859,230 (100.0%) Total basepairs processed: 2,510,782,230 bp Quality-trimmed: 11,488,232 bp (0.5%) Total written (filtered): 2,298,095,482 bp (91.5%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 13309382 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 25.5% C: 9.5% G: 23.7% T: 41.3% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 1 5079720 6214807.5 0 5079720 2 1201788 1553701.9 0 1201788 3 455870 388425.5 0 455870 4 314973 97106.4 0 314973 5 161437 24276.6 0 161437 6 154361 6069.1 0 154361 7 140803 1517.3 0 140803 8 147331 379.3 0 147331 9 155862 94.8 0 154346 1516 10 144320 23.7 1 138306 6014 11 151112 5.9 1 144135 6977 12 143788 1.5 1 137433 6355 13 137785 0.4 1 131615 6170 14 149147 0.4 1 141174 7973 15 139227 0.4 1 132332 6895 16 149312 0.4 1 141084 8228 17 143086 0.4 1 135505 7581 18 132310 0.4 1 126071 6239 19 143811 0.4 1 135287 8524 20 131290 0.4 1 124923 6367 21 145679 0.4 1 136781 8898 22 134890 0.4 1 128035 6855 23 126067 0.4 1 119640 6427 24 131665 0.4 1 124339 7326 25 122358 0.4 1 116289 6069 26 132225 0.4 1 124597 7628 27 119894 0.4 1 114131 5763 28 113340 0.4 1 107992 5348 29 123769 0.4 1 117559 6210 30 113297 0.4 1 108240 5057 31 120469 0.4 1 114168 6301 32 111183 0.4 1 106261 4922 33 114363 0.4 1 108986 5377 34 108844 0.4 1 103786 5058 35 104131 0.4 1 99505 4626 36 99526 0.4 1 95050 4476 37 106038 0.4 1 100981 5057 38 92323 0.4 1 88308 4015 39 93759 0.4 1 89399 4360 40 94519 0.4 1 89633 4886 41 129205 0.4 1 123873 5332 42 78110 0.4 1 74983 3127 43 35837 0.4 1 33746 2091 44 72367 0.4 1 69091 3276 45 67031 0.4 1 64106 2925 46 63610 0.4 1 60758 2852 47 65331 0.4 1 62355 2976 48 58472 0.4 1 55680 2792 49 59983 0.4 1 57145 2838 50 53971 0.4 1 51646 2325 51 50543 0.4 1 48295 2248 52 47005 0.4 1 44956 2049 53 43086 0.4 1 41326 1760 54 41670 0.4 1 39954 1716 55 40994 0.4 1 39370 1624 56 37127 0.4 1 35622 1505 57 34125 0.4 1 32652 1473 58 31505 0.4 1 30268 1237 59 30411 0.4 1 29237 1174 60 26925 0.4 1 25909 1016 61 26652 0.4 1 25578 1074 62 25545 0.4 1 24603 942 63 21912 0.4 1 21115 797 64 20166 0.4 1 19505 661 65 18118 0.4 1 17409 709 66 16596 0.4 1 15950 646 67 15516 0.4 1 14924 592 68 14125 0.4 1 13540 585 69 13654 0.4 1 13093 561 70 12496 0.4 1 12006 490 71 12383 0.4 1 11814 569 72 14176 0.4 1 13392 784 73 24770 0.4 1 22779 1991 74 80795 0.4 1 77743 3052 75 62980 0.4 1 60815 2165 76 34346 0.4 1 33049 1297 77 20015 0.4 1 19279 736 78 11561 0.4 1 11141 420 79 6228 0.4 1 5972 256 80 4035 0.4 1 3856 179 81 2455 0.4 1 2342 113 82 1682 0.4 1 1592 90 83 1305 0.4 1 1243 62 84 1169 0.4 1 1106 63 85 1034 0.4 1 975 59 86 988 0.4 1 940 48 87 884 0.4 1 824 60 88 755 0.4 1 703 52 89 848 0.4 1 797 51 90 941 0.4 1 890 51 91 1310 0.4 1 1224 86 92 2045 0.4 1 1916 129 93 4668 0.4 1 4448 220 94 13833 0.4 1 13196 637 95 24756 0.4 1 23645 1111 96 12039 0.4 1 11473 566 97 8920 0.4 1 8427 493 98 3872 0.4 1 3654 218 99 4141 0.4 1 3934 207 100 4419 0.4 1 4130 289 101 8269 0.4 1 7410 859 RUN STATISTICS FOR INPUT FILE: /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R1_001.fastq.gz ============================================= 24859230 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Writing report to '/gscratch/scrubbed/strigg/analyses/20200319/EPI-167_S10_L002_R2_001.fastq.gz_trimming_report.txt' SUMMARISING RUN PARAMETERS ========================== Input filename: /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R2_001.fastq.gz Trimming mode: paired-end Trim Galore version: 0.6.4_dev Cutadapt version: 2.4 Python version: could not detect Number of cores used for trimming: 8 Quality Phred score cutoff: 20 Quality encoding type selected: ASCII+33 Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected) Maximum trimming error rate: 0.1 (default) Minimum required adapter overlap (stringency): 1 bp Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp All Read 1 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 8 bp from their 5' end to avoid poor qualities or biases (e.g. M-bias for BS-Seq applications) All Read 1 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases All Read 2 sequences will be trimmed by 8 bp from their 3' end to avoid poor qualities or biases Running FastQC on the data once trimming has completed Running FastQC with the following extra arguments: '--outdir /gscratch/scrubbed/strigg/analyses/20200319 --threads 28' Output file(s) will be GZIP compressed Cutadapt seems to be fairly up-to-date (version 2.4). Setting -j -j 8 Writing final adapter and quality trimmed output to EPI-167_S10_L002_R2_001_trimmed.fq.gz >>> Now performing quality (cutoff '-q 20') and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R2_001.fastq.gz <<< 10000000 sequences processed 20000000 sequences processed This is cutadapt 2.4 with Python 3.7.6 Command line parameters: -j 8 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R2_001.fastq.gz Processing reads on 8 cores in single-end mode ... Finished in 82.12 s (3 us/read; 18.16 M reads/minute). === Summary === Total reads processed: 24,859,230 Reads with adapters: 15,369,391 (61.8%) Reads written (passing filters): 24,859,230 (100.0%) Total basepairs processed: 2,510,782,230 bp Quality-trimmed: 22,659,614 bp (0.9%) Total written (filtered): 2,292,740,401 bp (91.3%) === Adapter 1 === Sequence: AGATCGGAAGAGC; Type: regular 3'; Length: 13; Trimmed: 15369391 times. No. of allowed errors: 0-9 bp: 0; 10-13 bp: 1 Bases preceding removed adapters: A: 40.2% C: 20.3% G: 7.3% T: 32.1% none/other: 0.0% Overview of removed sequences length count expect max.err error counts 1 8551744 6214807.5 0 8551744 2 246148 1553701.9 0 246148 3 190565 388425.5 0 190565 4 163955 97106.4 0 163955 5 160213 24276.6 0 160213 6 158749 6069.1 0 158749 7 150000 1517.3 0 150000 8 152348 379.3 0 152348 9 152655 94.8 0 152020 635 10 150821 23.7 1 146396 4425 11 146653 5.9 1 141358 5295 12 148628 1.5 1 143175 5453 13 142195 0.4 1 137114 5081 14 151761 0.4 1 146219 5542 15 140774 0.4 1 135529 5245 16 140681 0.4 1 135637 5044 17 148111 0.4 1 142873 5238 18 131846 0.4 1 127220 4626 19 138041 0.4 1 133124 4917 20 134670 0.4 1 129536 5134 21 136338 0.4 1 130803 5535 22 137968 0.4 1 132680 5288 23 132322 0.4 1 127360 4962 24 137798 0.4 1 132572 5226 25 122164 0.4 1 117575 4589 26 122835 0.4 1 117672 5163 27 123407 0.4 1 117521 5886 28 125924 0.4 1 121369 4555 29 121485 0.4 1 116288 5197 30 128128 0.4 1 123665 4463 31 110876 0.4 1 106520 4356 32 113497 0.4 1 109642 3855 33 118175 0.4 1 113570 4605 34 120005 0.4 1 114905 5100 35 109616 0.4 1 106079 3537 36 102731 0.4 1 98816 3915 37 102246 0.4 1 98428 3818 38 88554 0.4 1 85388 3166 39 91579 0.4 1 88232 3347 40 88267 0.4 1 85088 3179 41 85974 0.4 1 83185 2789 42 82637 0.4 1 80194 2443 43 73211 0.4 1 70603 2608 44 72696 0.4 1 70279 2417 45 85138 0.4 1 82870 2268 46 65598 0.4 1 63595 2003 47 46515 0.4 1 44853 1662 48 63045 0.4 1 61242 1803 49 44347 0.4 1 42924 1423 50 46087 0.4 1 44500 1587 51 59933 0.4 1 58384 1549 52 37135 0.4 1 35811 1324 53 36863 0.4 1 35622 1241 54 32337 0.4 1 31205 1132 55 36994 0.4 1 35915 1079 56 34416 0.4 1 33218 1198 57 30866 0.4 1 29829 1037 58 28525 0.4 1 27569 956 59 26804 0.4 1 25894 910 60 25275 0.4 1 24330 945 61 24709 0.4 1 23797 912 62 24147 0.4 1 23270 877 63 22706 0.4 1 21820 886 64 21721 0.4 1 20890 831 65 22286 0.4 1 21358 928 66 24176 0.4 1 23085 1091 67 36009 0.4 1 33206 2803 68 129078 0.4 1 125717 3361 69 47386 0.4 1 45734 1652 70 24924 0.4 1 24012 912 71 13025 0.4 1 12410 615 72 8682 0.4 1 8278 404 73 6054 0.4 1 5733 321 74 4715 0.4 1 4403 312 75 3618 0.4 1 3425 193 76 2997 0.4 1 2803 194 77 2622 0.4 1 2433 189 78 2318 0.4 1 2156 162 79 1995 0.4 1 1863 132 80 1609 0.4 1 1513 96 81 1408 0.4 1 1302 106 82 1197 0.4 1 1109 88 83 1086 0.4 1 994 92 84 925 0.4 1 841 84 85 857 0.4 1 780 77 86 793 0.4 1 706 87 87 813 0.4 1 718 95 88 823 0.4 1 735 88 89 934 0.4 1 846 88 90 1158 0.4 1 1031 127 91 1444 0.4 1 1276 168 92 2210 0.4 1 1964 246 93 4655 0.4 1 4194 461 94 13593 0.4 1 12607 986 95 23945 0.4 1 22218 1727 96 11675 0.4 1 10828 847 97 8623 0.4 1 7964 659 98 3643 0.4 1 3391 252 99 3896 0.4 1 3601 295 100 4139 0.4 1 3843 296 101 7928 0.4 1 7181 747 RUN STATISTICS FOR INPUT FILE: /gscratch/srlab/strigg/data/Pgenr/FASTQS/raw/EPI-167_S10_L002_R2_001.fastq.gz ============================================= 24859230 sequences processed in total The length threshold of paired-end sequences gets evaluated later on (in the validation step) Validate paired-end files EPI-167_S10_L002_R1_001_trimmed.fq.gz and EPI-167_S10_L002_R2_001_trimmed.fq.gz file_1: EPI-167_S10_L002_R1_001_trimmed.fq.gz, file_2: EPI-167_S10_L002_R2_001_trimmed.fq.gz >>>>> Now validing the length of the 2 paired-end infiles: EPI-167_S10_L002_R1_001_trimmed.fq.gz and EPI-167_S10_L002_R2_001_trimmed.fq.gz <<<<< Writing validated paired-end Read 1 reads to EPI-167_S10_L002_R1_001_val_1.fq.gz Writing validated paired-end Read 2 reads to EPI-167_S10_L002_R2_001_val_2.fq.gz Total number of sequences analysed: 24859230 Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 522218 (2.10%) >>> Now running FastQC on the validated data EPI-167_S10_L002_R1_001_val_1.fq.gz<<< Started analysis of EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 5% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 10% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 15% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 20% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 25% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 30% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 35% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 40% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 45% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 50% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 55% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 60% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 65% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 70% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 75% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 80% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 85% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 90% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 95% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Approx 100% complete for EPI-167_S10_L002_R1_001_val_1.fq.gz Analysis complete for EPI-167_S10_L002_R1_001_val_1.fq.gz >>> Now running FastQC on the validated data EPI-167_S10_L002_R2_001_val_2.fq.gz<<< Started analysis of EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 5% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 10% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 15% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 20% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 25% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 30% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 35% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 40% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 45% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 50% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 55% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 60% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 65% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 70% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 75% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 80% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 85% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 90% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Approx 95% complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Analysis complete for EPI-167_S10_L002_R2_001_val_2.fq.gz Deleting both intermediate output files EPI-167_S10_L002_R1_001_trimmed.fq.gz and EPI-167_S10_L002_R2_001_trimmed.fq.gz ====================================================================================================