FastQ format assumed (by default)
Each Bowtie 2 instance is going to be run with 28 threads. Please monitor performance closely and tune down if needed!
chr NC_035780.1 (65668440 bp)
chr NC_035781.1 (61752955 bp)
chr NC_035782.1 (77061148 bp)
chr NC_035783.1 (59691872 bp)
chr NC_035784.1 (98698416 bp)
chr NC_035785.1 (51258098 bp)
chr NC_035786.1 (57830854 bp)
chr NC_035787.1 (75944018 bp)
chr NC_035788.1 (104168038 bp)
chr NC_035789.1 (32650045 bp)
chr NC_007175.2 (17244 bp)
Processed 100000 sequences in total
Sequences with no alignments under any condition: 71737
Sequences did not map uniquely: 9776
Sequences which were discarded because genomic sequence could not be extracted: 0
Number of sequences with unique best (first) alignment came from the bowtie output:
CT/CT: 9334 ((converted) top strand)
CT/GA: 9153 ((converted) bottom strand)
GA/CT: 0 (complementary to (converted) top strand)
GA/GA: 0 (complementary to (converted) bottom strand)
Number of alignments to (merely theoretical) complementary strands being rejected in total: 0
Processing single-end Bismark output file(s) (SAM format):
cvir_bsseq_all_R1_bismark_bt2.bam
If there are several alignments to a single position in the genome the first alignment will be chosen. Since the input files are not in any way sorted this is a near-enough random selection of reads.
Checking file >>cvir_bsseq_all_R1_bismark_bt2.bam<< for signs of file truncation...
Output file is: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam
Total number of alignments analysed in cvir_bsseq_all_R1_bismark_bt2.bam: 18487
Total number duplicated alignments removed: 29 (0.16%)
Duplicated alignments were found at: 26 different position(s)
Total count of deduplicated leftover sequences: 18458 (99.84% of total)
skipping header line: @HD VN:1.0 SO:unsorted
skipping header line: @SQ SN:NC_035780.1 LN:65668440
skipping header line: @SQ SN:NC_035781.1 LN:61752955
skipping header line: @SQ SN:NC_035782.1 LN:77061148
skipping header line: @SQ SN:NC_035783.1 LN:59691872
skipping header line: @SQ SN:NC_035784.1 LN:98698416
skipping header line: @SQ SN:NC_035785.1 LN:51258098
skipping header line: @SQ SN:NC_035786.1 LN:57830854
skipping header line: @SQ SN:NC_035787.1 LN:75944018
skipping header line: @SQ SN:NC_035788.1 LN:104168038
skipping header line: @SQ SN:NC_035789.1 LN:32650045
skipping header line: @SQ SN:NC_007175.2 LN:17244
skipping header line: @PG ID:Bismark VN:v0.19.0 CL:"bismark --path_to_bowtie /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/ --genome /gscratch/srlab/sam/data/C_virginica/genomes/ --score_min L,0,-0.6 -u 100000 -p 28 /gscratch/scrubbed/samwhite/data/C_virginica/BSseq/cvir_bsseq_all_R1.fastq.gz"
*** Bismark methylation extractor version v0.19.0 ***
Trying to determine the type of mapping from the SAM header line of file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam
Treating file(s) as single-end data (as extracted from @PG line)
Core usage currently set to more than 20 threads. Let's see how this goes...
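The alignment counts reported above for the 100,000-read subset can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the Bismark output: the variable names and the `percent` helper are hypothetical, and "mapping efficiency" is used here simply as a label for the share of processed reads with a unique best alignment.

```python
# Counts copied from the alignment summary above (100,000-read subset).
total_reads = 100_000
no_alignment = 71_737    # "Sequences with no alignments under any condition"
not_unique = 9_776       # "Sequences did not map uniquely"
discarded = 0            # genomic sequence could not be extracted
unique_top = 9_334       # CT/CT, (converted) top strand
unique_bottom = 9_153    # CT/GA, (converted) bottom strand

unique = unique_top + unique_bottom


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Every processed read should fall into exactly one category.
assert no_alignment + not_unique + discarded + unique == total_reads

mapping_efficiency = percent(unique, total_reads)
print(f"unique alignments:  {unique}")
print(f"mapping efficiency: {mapping_efficiency}%")
```

The sum of the two strand counts should equal the number of alignments that the deduplication step later reports as analysed.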
(set value: 28) Summarising Bismark methylation extractor parameters: =============================================================== Bismark single-end SAM format specified (default) Number of cores to be used: 28 Output will be written to the current directory ('/gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_100000') Summarising bedGraph parameters: =============================================================== Generating additional output in bedGraph and coverage format bedGraph format: coverage format: Using a cutoff of 1 read(s) to report cytosine positions Reporting and sorting cytosine methylation information in CpG context only (default) White spaces in read ID names will be removed prior to sorting The bedGraph UNIX sort command will use the following memory setting: '75%'. Temporary directory used for sorting is the output directory Checking file >>cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam<< for signs of file truncation... Writing result file containing methylation information for C in CpG context from the original top strand to CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the complementary to original top strand to CpG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the complementary to original bottom strand to CpG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the original bottom strand to CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the original top strand to CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the complementary to original top strand to 
CHG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the complementary to original bottom strand to CHG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the original bottom strand to CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the original top strand to CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the complementary to original top strand to CHH_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the complementary to original bottom strand to CHH_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the original bottom strand to CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam 
Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam skipping SAM header line: @HD VN:1.0 SO:unsorted skipping SAM header line: @SQ SN:NC_035780.1 LN:65668440 skipping SAM header line: @SQ SN:NC_035781.1 LN:61752955 skipping SAM header line: @SQ SN:NC_035782.1 LN:77061148 skipping SAM header line: @SQ SN:NC_035783.1 LN:59691872 skipping SAM header line: @SQ SN:NC_035784.1 LN:98698416 skipping SAM header line: @SQ SN:NC_035785.1 LN:51258098 skipping SAM header line: @SQ SN:NC_035786.1 LN:57830854 skipping SAM header line: @SQ SN:NC_035787.1 LN:75944018 skipping SAM header line: @SQ SN:NC_035788.1 LN:104168038 skipping SAM header line: @SQ 
SN:NC_035789.1 LN:32650045 skipping SAM header line: @SQ SN:NC_007175.2 LN:17244 skipping SAM header line: @PG ID:Bismark VN:v0.19.0 CL:"bismark --path_to_bowtie /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/ --genome /gscratch/srlab/sam/data/C_virginica/genomes/ --score_min L,0,-0.6 -u 100000 -p 28 /gscratch/scrubbed/samwhite/data/C_virginica/BSseq/cvir_bsseq_all_R1.fastq.gz" Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. 
Now waiting for all child processes to complete Merging individual splitting reports into overall report: 'cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt' Merging from these individual files: cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.1 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.2 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.3 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.4 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.5 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.6 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.7 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.8 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.9 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.10 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.11 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.12 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.13 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.14 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.15 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.16 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.17 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.18 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.19 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.20 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.21 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.22 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.23 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.24 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.25 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.26 
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.27
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.28
Processed 18458 lines in total
Total number of methylation call strings processed: 18458
Final Cytosine Methylation Report
=================================
Total number of C's analysed: 278472
Total methylated C's in CpG context: 29289
Total methylated C's in CHG context: 2395
Total methylated C's in CHH context: 18136
Total C to T conversions in CpG context: 9412
Total C to T conversions in CHG context: 55424
Total C to T conversions in CHH context: 163816
C methylated in CpG context: 75.7%
C methylated in CHG context: 4.1%
C methylated in CHH context: 10.0%
Merging individual M-bias reports into overall M-bias statistics from these 28 individual files:
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.1.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.2.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.3.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.4.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.5.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.6.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.7.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.8.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.9.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.10.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.11.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.12.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.13.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.14.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.15.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.16.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.17.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.18.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.19.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.20.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.21.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.22.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.23.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.24.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.25.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.26.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.27.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.28.mbias Determining maximum read length for M-Bias plot Maximum read length of Read 1: 76 Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table) Determining maximum read length for M-Bias plot Maximum read length of Read 1: 76 Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table) Deleting unused files ... 
CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CpG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CpG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHH_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHH_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept Using these input files: CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Summary of parameters for bismark2bedGraph conversion: ====================================================== bedGraph output: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz output directory: >< remove whitespaces: yes CX context: no (CpG context only, default) No-header selected: no Sorting method: Unix sort-based (smaller memory footprint, but slower) Sort buffer size: 75% Coverage threshold: 1 ============================================================================= Methylation information will now be written into a bedGraph and coverage file ============================================================================= Using the following files as Input: 
/gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_100000/CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_100000/CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing bedGraph to file: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz Also writing out a coverage file including counts methylated and unmethylated residues to file: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bismark.cov.gz The genome of interest was specified to contain gazillions of chromosomes or scaffolds. Merging all input files and sorting everything in memory instead of writing out individual chromosome files... Writing all merged methylation calls to temp file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_100000/CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt prior to bedGraph conversion Attempting to write to file CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt Finished writing methylation calls from CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt to merged temp file Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_100000/CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt prior to bedGraph conversion Attempting to write to file CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt Finished writing methylation calls from CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt to merged temp file Sorting input file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged by positions (using -S of 75%) Successfully deleted the temporary input file 
cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged
Finished BedGraph conversion ...
Found 1 alignment reports in current directory. Now trying to figure out whether there are corresponding optional reports
Writing Bismark HTML report to >> cvir_bsseq_all_R1_bismark_bt2_SE_report.html <<
==============================================================================================================
Using the following alignment report: > cvir_bsseq_all_R1_bismark_bt2_SE_report.txt <
Processing alignment report cvir_bsseq_all_R1_bismark_bt2_SE_report.txt ...
Complete
Using the following deduplication report: > cvir_bsseq_all_R1_bismark_bt2.deduplication_report.txt <
Processing deduplication report cvir_bsseq_all_R1_bismark_bt2.deduplication_report.txt ...
Complete
Using the following splitting report: > cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt <
Processing splitting report cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt ...
Complete
Using the following M-bias report: > cvir_bsseq_all_R1_bismark_bt2.deduplicated.M-bias.txt <
Processing M-bias report cvir_bsseq_all_R1_bismark_bt2.deduplicated.M-bias.txt ...
Complete
No nucleotide coverage report file specified, skipping this step
==============================================================================================================
Found Bismark/Bowtie2 single-end files
No Bismark/Bowtie2 paired-end BAM files detected
No Bismark/Bowtie single-end BAM files detected
No Bismark/Bowtie paired-end BAM files detected
Generating Bismark summary report from 1 Bismark BAM file(s)...
>> Reading from Bismark report: cvir_bsseq_all_R1_bismark_bt2_SE_report.txt
Wrote Bismark project summary to >> bismark_summary_report.html <<
[bam_sort_core] merging from 0 files and 28 in-memory blocks...
FastQ format assumed (by default)
Each Bowtie 2 instance is going to be run with 28 threads. Please monitor performance closely and tune down if needed!
chr NC_035780.1 (65668440 bp)
chr NC_035781.1 (61752955 bp)
chr NC_035782.1 (77061148 bp)
chr NC_035783.1 (59691872 bp)
chr NC_035784.1 (98698416 bp)
chr NC_035785.1 (51258098 bp)
chr NC_035786.1 (57830854 bp)
chr NC_035787.1 (75944018 bp)
chr NC_035788.1 (104168038 bp)
chr NC_035789.1 (32650045 bp)
chr NC_007175.2 (17244 bp)
Processed 500000 sequences in total
Sequences with no alignments under any condition: 357549
Sequences did not map uniquely: 49436
Sequences which were discarded because genomic sequence could not be extracted: 0
Number of sequences with unique best (first) alignment came from the bowtie output:
CT/CT: 46787 ((converted) top strand)
CT/GA: 46228 ((converted) bottom strand)
GA/CT: 0 (complementary to (converted) top strand)
GA/GA: 0 (complementary to (converted) bottom strand)
Number of alignments to (merely theoretical) complementary strands being rejected in total: 0
Processing single-end Bismark output file(s) (SAM format):
cvir_bsseq_all_R1_bismark_bt2.bam
If there are several alignments to a single position in the genome the first alignment will be chosen. Since the input files are not in any way sorted this is a near-enough random selection of reads.
Checking file >>cvir_bsseq_all_R1_bismark_bt2.bam<< for signs of file truncation...
Output file is: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam
Total number of alignments analysed in cvir_bsseq_all_R1_bismark_bt2.bam: 93015
Total number duplicated alignments removed: 531 (0.57%)
Duplicated alignments were found at: 454 different position(s)
Total count of deduplicated leftover sequences: 92484 (99.43% of total)
skipping header line: @HD VN:1.0 SO:unsorted
skipping header line: @SQ SN:NC_035780.1 LN:65668440
skipping header line: @SQ SN:NC_035781.1 LN:61752955
skipping header line: @SQ SN:NC_035782.1 LN:77061148
skipping header line: @SQ SN:NC_035783.1 LN:59691872
skipping header line: @SQ SN:NC_035784.1 LN:98698416
skipping header line: @SQ SN:NC_035785.1 LN:51258098
skipping header line: @SQ SN:NC_035786.1 LN:57830854
skipping header line: @SQ SN:NC_035787.1 LN:75944018
skipping header line: @SQ SN:NC_035788.1 LN:104168038
skipping header line: @SQ SN:NC_035789.1 LN:32650045
skipping header line: @SQ SN:NC_007175.2 LN:17244
skipping header line: @PG ID:Bismark VN:v0.19.0 CL:"bismark --path_to_bowtie /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/ --genome /gscratch/srlab/sam/data/C_virginica/genomes/ --score_min L,0,-0.6 -u 500000 -p 28 /gscratch/scrubbed/samwhite/data/C_virginica/BSseq/cvir_bsseq_all_R1.fastq.gz"
*** Bismark methylation extractor version v0.19.0 ***
Trying to determine the type of mapping from the SAM header line of file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam
Treating file(s) as single-end data (as extracted from @PG line)
Core usage currently set to more than 20 threads. Let's see how this goes...
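Rather than copying numbers by hand, the alignment summary for the 500,000-read subset could be pulled out of the log text programmatically. The following is a rough sketch under stated assumptions: the `parse_summary` function and both regular expressions are hypothetical helpers written for this log, not part of Bismark, and the `summary` string is a hand-trimmed excerpt of the lines above with one entry per line.

```python
import re

# Excerpt of the alignment summary above (500,000-read subset), one entry per line.
summary = """\
Processed 500000 sequences in total
Sequences with no alignments under any condition: 357549
Sequences did not map uniquely: 49436
CT/CT: 46787 ((converted) top strand)
CT/GA: 46228 ((converted) bottom strand)
"""

# "label: integer", optionally followed by a parenthesised note.
LABELLED = re.compile(r"^(?P<label>[^:]+):\s*(?P<count>\d+)\b")
# The one summary line that puts the number before its label.
PROCESSED = re.compile(r"^Processed (?P<count>\d+) sequences in total$")


def parse_summary(text):
    """Map each summary label to its integer count."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        match = PROCESSED.match(line)
        if match:
            counts["processed"] = int(match.group("count"))
            continue
        match = LABELLED.match(line)
        if match:
            counts[match.group("label").strip()] = int(match.group("count"))
    return counts


counts = parse_summary(summary)
unique = counts["CT/CT"] + counts["CT/GA"]
accounted = (
    unique
    + counts["Sequences with no alignments under any condition"]
    + counts["Sequences did not map uniquely"]
)
print(f"unique alignments: {unique}")
print(f"all reads accounted for: {accounted == counts['processed']}")
```

Lines that match neither pattern are ignored, so the same function can be pointed at a longer stretch of the log as long as each entry sits on its own line.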
(set value: 28) Summarising Bismark methylation extractor parameters: =============================================================== Bismark single-end SAM format specified (default) Number of cores to be used: 28 Output will be written to the current directory ('/gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_500000') Summarising bedGraph parameters: =============================================================== Generating additional output in bedGraph and coverage format bedGraph format: coverage format: Using a cutoff of 1 read(s) to report cytosine positions Reporting and sorting cytosine methylation information in CpG context only (default) White spaces in read ID names will be removed prior to sorting The bedGraph UNIX sort command will use the following memory setting: '75%'. Temporary directory used for sorting is the output directory Checking file >>cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam<< for signs of file truncation... Writing result file containing methylation information for C in CpG context from the original top strand to CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the complementary to original top strand to CpG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the complementary to original bottom strand to CpG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the original bottom strand to CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the original top strand to CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the complementary to original top strand to 
CHG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the complementary to original bottom strand to CHG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the original bottom strand to CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the original top strand to CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the complementary to original top strand to CHH_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the complementary to original bottom strand to CHH_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the original bottom strand to CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam 
Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam skipping SAM header line: @HD VN:1.0 SO:unsorted skipping SAM header line: @SQ SN:NC_035780.1 LN:65668440 skipping SAM header line: @SQ SN:NC_035781.1 LN:61752955 skipping SAM header line: @SQ SN:NC_035782.1 LN:77061148 skipping SAM header line: @SQ SN:NC_035783.1 LN:59691872 skipping SAM header line: @SQ SN:NC_035784.1 LN:98698416 skipping SAM header line: @SQ SN:NC_035785.1 LN:51258098 skipping SAM header line: @SQ SN:NC_035786.1 LN:57830854 skipping SAM header line: @SQ SN:NC_035787.1 LN:75944018 skipping SAM header line: @SQ SN:NC_035788.1 LN:104168038 skipping SAM header line: @SQ 
SN:NC_035789.1 LN:32650045 skipping SAM header line: @SQ SN:NC_007175.2 LN:17244 skipping SAM header line: @PG ID:Bismark VN:v0.19.0 CL:"bismark --path_to_bowtie /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/ --genome /gscratch/srlab/sam/data/C_virginica/genomes/ --score_min L,0,-0.6 -u 500000 -p 28 /gscratch/scrubbed/samwhite/data/C_virginica/BSseq/cvir_bsseq_all_R1.fastq.gz" Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Now waiting for all child processes to complete Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. 
Merging individual splitting reports into overall report: 'cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt' Merging from these individual files: cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.1 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.2 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.3 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.4 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.5 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.6 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.7 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.8 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.9 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.10 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.11 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.12 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.13 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.14 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.15 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.16 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.17 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.18 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.19 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.20 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.21 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.22 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.23 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.24 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.25 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.26 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.27 
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.28
Processed 92484 lines in total
Total number of methylation call strings processed: 92484

Final Cytosine Methylation Report
=================================
Total number of C's analysed: 1392276

Total methylated C's in CpG context: 146724
Total methylated C's in CHG context: 11989
Total methylated C's in CHH context: 90098

Total C to T conversions in CpG context: 47087
Total C to T conversions in CHG context: 278276
Total C to T conversions in CHH context: 818102

C methylated in CpG context: 75.7%
C methylated in CHG context: 4.1%
C methylated in CHH context: 9.9%

Merging individual M-bias reports into overall M-bias statistics from these 28 individual files:
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.1.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.2.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.3.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.4.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.5.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.6.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.7.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.8.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.9.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.10.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.11.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.12.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.13.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.14.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.15.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.16.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.17.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.18.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.19.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.20.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.21.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.22.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.23.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.24.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.25.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.26.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.27.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.28.mbias Determining maximum read length for M-Bias plot Maximum read length of Read 1: 76 Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table) Determining maximum read length for M-Bias plot Maximum read length of Read 1: 76 Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table) Deleting unused files ... 
CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CpG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CpG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHH_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHH_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept Using these input files: CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Summary of parameters for bismark2bedGraph conversion: ====================================================== bedGraph output: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz output directory: >< remove whitespaces: yes CX context: no (CpG context only, default) No-header selected: no Sorting method: Unix sort-based (smaller memory footprint, but slower) Sort buffer size: 75% Coverage threshold: 1 ============================================================================= Methylation information will now be written into a bedGraph and coverage file ============================================================================= Using the following files as Input: 
/gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_500000/CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_500000/CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing bedGraph to file: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz Also writing out a coverage file including counts methylated and unmethylated residues to file: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bismark.cov.gz The genome of interest was specified to contain gazillions of chromosomes or scaffolds. Merging all input files and sorting everything in memory instead of writing out individual chromosome files... Writing all merged methylation calls to temp file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_500000/CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt prior to bedGraph conversion Attempting to write to file CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt Finished writing methylation calls from CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt to merged temp file Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_500000/CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt prior to bedGraph conversion Attempting to write to file CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt Finished writing methylation calls from CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt to merged temp file Sorting input file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged by positions (using -S of 75%) Successfully deleted the temporary input file 
cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged Finished BedGraph conversion ... Found 1 alignment reports in current directory. Now trying to figure out whether there are corresponding optional reports Writing Bismark HTML report to >> cvir_bsseq_all_R1_bismark_bt2_SE_report.html << ============================================================================================================== Using the following alignment report: > cvir_bsseq_all_R1_bismark_bt2_SE_report.txt < Processing alignment report cvir_bsseq_all_R1_bismark_bt2_SE_report.txt ... Complete Using the following deduplication report: > cvir_bsseq_all_R1_bismark_bt2.deduplication_report.txt < Processing deduplication report cvir_bsseq_all_R1_bismark_bt2.deduplication_report.txt ... Complete Using the following splitting report: > cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt < Processing splitting report cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt ... Complete Using the following M-bias report: > cvir_bsseq_all_R1_bismark_bt2.deduplicated.M-bias.txt < Processing M-bias report cvir_bsseq_all_R1_bismark_bt2.deduplicated.M-bias.txt ... Complete No nucleotide coverage report file specified, skipping this step ============================================================================================================== Found Bismark/Bowtie2 single-end files No Bismark/Bowtie2 paired-end BAM files detected No Bismark/Bowtie single-end BAM files detected No Bismark/Bowtie paired-end BAM files detected Generating Bismark summary report from 1 Bismark BAM file(s)... >> Reading from Bismark report: cvir_bsseq_all_R1_bismark_bt2_SE_report.txt Wrote Bismark project summary to >> bismark_summary_report.html << [bam_sort_core] merging from 0 files and 28 in-memory blocks... FastQ format assumed (by default) Each Bowtie 2 instance is going to be run with 28 threads. Please monitor performance closely and tune down if needed! 
chr NC_035780.1 (65668440 bp)
chr NC_035781.1 (61752955 bp)
chr NC_035782.1 (77061148 bp)
chr NC_035783.1 (59691872 bp)
chr NC_035784.1 (98698416 bp)
chr NC_035785.1 (51258098 bp)
chr NC_035786.1 (57830854 bp)
chr NC_035787.1 (75944018 bp)
chr NC_035788.1 (104168038 bp)
chr NC_035789.1 (32650045 bp)
chr NC_007175.2 (17244 bp)

Processed 1000000 sequences in total
Sequences with no alignments under any condition: 705799
Sequences did not map uniquely: 102281
Sequences which were discarded because genomic sequence could not be extracted: 0

Number of sequences with unique best (first) alignment came from the bowtie output:
CT/CT: 96405 ((converted) top strand)
CT/GA: 95515 ((converted) bottom strand)
GA/CT: 0 (complementary to (converted) top strand)
GA/GA: 0 (complementary to (converted) bottom strand)

Number of alignments to (merely theoretical) complementary strands being rejected in total: 0

Processing single-end Bismark output file(s) (SAM format):
cvir_bsseq_all_R1_bismark_bt2.bam

If there are several alignments to a single position in the genome the first alignment will be chosen. Since the input files are not in any way sorted this is a near-enough random selection of reads.

Checking file >>cvir_bsseq_all_R1_bismark_bt2.bam<< for signs of file truncation...
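The alignment counts above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the Bismark output: the variable names are illustrative, and it assumes that "mapping efficiency" means uniquely aligned reads divided by reads processed. It confirms that every read falls into exactly one of the three reported categories and shows how evenly the unique alignments split between the two strands.

```python
# Counts copied from the alignment summary above (1,000,000-read subset).
total_reads = 1_000_000
no_alignment = 705_799
not_unique = 102_281
top_strand = 96_405      # CT/CT
bottom_strand = 95_515   # CT/GA

# Unique best alignments are the sum of the two directional strand counts.
unique = top_strand + bottom_strand

# Each read should be unique, unaligned, or ambiguously aligned.
accounted = unique + no_alignment + not_unique

# Assumption: efficiency = unique alignments / reads processed.
mapping_efficiency = 100 * unique / total_reads
top_fraction = 100 * top_strand / unique

print(f"unique alignments: {unique}")
print(f"reads accounted for: {accounted} of {total_reads}")
print(f"mapping efficiency: {mapping_efficiency:.1f}%")
print(f"top-strand share of unique alignments: {top_fraction:.1f}%")
```

The unique count computed here should match the "Total number of alignments analysed" figure that the deduplication step reports for the same BAM file.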
Output file is: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Total number of alignments analysed in cvir_bsseq_all_R1_bismark_bt2.bam: 191920 Total number duplicated alignments removed: 1523 (0.79%) Duplicated alignments were found at: 1240 different position(s) Total count of deduplicated leftover sequences: 190397 (99.21% of total) skipping header line: @HD VN:1.0 SO:unsorted skipping header line: @SQ SN:NC_035780.1 LN:65668440 skipping header line: @SQ SN:NC_035781.1 LN:61752955 skipping header line: @SQ SN:NC_035782.1 LN:77061148 skipping header line: @SQ SN:NC_035783.1 LN:59691872 skipping header line: @SQ SN:NC_035784.1 LN:98698416 skipping header line: @SQ SN:NC_035785.1 LN:51258098 skipping header line: @SQ SN:NC_035786.1 LN:57830854 skipping header line: @SQ SN:NC_035787.1 LN:75944018 skipping header line: @SQ SN:NC_035788.1 LN:104168038 skipping header line: @SQ SN:NC_035789.1 LN:32650045 skipping header line: @SQ SN:NC_007175.2 LN:17244 skipping header line: @PG ID:Bismark VN:v0.19.0 CL:"bismark --path_to_bowtie /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/ --genome /gscratch/srlab/sam/data/C_virginica/genomes/ --score_min L,0,-0.6 -u 1000000 -p 28 /gscratch/scrubbed/samwhite/data/C_virginica/BSseq/cvir_bsseq_all_R1.fastq.gz" *** Bismark methylation extractor version v0.19.0 *** Trying to determine the type of mapping from the SAM header line of file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Treating file(s) as single-end data (as extracted from @PG line) Core usage currently set to more than 20 threads. Let's see how this goes... 
(set value: 28) Summarising Bismark methylation extractor parameters: =============================================================== Bismark single-end SAM format specified (default) Number of cores to be used: 28 Output will be written to the current directory ('/gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_1000000') Summarising bedGraph parameters: =============================================================== Generating additional output in bedGraph and coverage format bedGraph format: coverage format: Using a cutoff of 1 read(s) to report cytosine positions Reporting and sorting cytosine methylation information in CpG context only (default) White spaces in read ID names will be removed prior to sorting The bedGraph UNIX sort command will use the following memory setting: '75%'. Temporary directory used for sorting is the output directory Checking file >>cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam<< for signs of file truncation... Writing result file containing methylation information for C in CpG context from the original top strand to CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the complementary to original top strand to CpG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the complementary to original bottom strand to CpG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the original bottom strand to CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the original top strand to CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the complementary to original top strand to 
CHG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the complementary to original bottom strand to CHG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the original bottom strand to CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the original top strand to CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the complementary to original top strand to CHH_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the complementary to original bottom strand to CHH_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the original bottom strand to CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam 
Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam skipping SAM header line: @HD VN:1.0 SO:unsorted skipping SAM header line: @SQ SN:NC_035780.1 LN:65668440 skipping SAM header line: @SQ SN:NC_035781.1 LN:61752955 skipping SAM header line: @SQ SN:NC_035782.1 LN:77061148 skipping SAM header line: @SQ SN:NC_035783.1 LN:59691872 skipping SAM header line: @SQ SN:NC_035784.1 LN:98698416 skipping SAM header line: @SQ SN:NC_035785.1 LN:51258098 skipping SAM header line: @SQ SN:NC_035786.1 LN:57830854 skipping SAM header line: @SQ SN:NC_035787.1 LN:75944018 skipping SAM header line: @SQ SN:NC_035788.1 LN:104168038 skipping SAM header line: @SQ 
SN:NC_035789.1 LN:32650045 skipping SAM header line: @SQ SN:NC_007175.2 LN:17244 skipping SAM header line: @PG ID:Bismark VN:v0.19.0 CL:"bismark --path_to_bowtie /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/ --genome /gscratch/srlab/sam/data/C_virginica/genomes/ --score_min L,0,-0.6 -u 1000000 -p 28 /gscratch/scrubbed/samwhite/data/C_virginica/BSseq/cvir_bsseq_all_R1.fastq.gz" Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Now waiting for all child processes to complete Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. 
Merging individual splitting reports into overall report: 'cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt' Merging from these individual files: cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.1 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.2 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.3 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.4 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.5 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.6 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.7 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.8 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.9 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.10 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.11 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.12 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.13 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.14 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.15 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.16 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.17 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.18 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.19 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.20 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.21 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.22 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.23 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.24 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.25 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.26 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.27 
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.28
Processed 190397 lines in total
Total number of methylation call strings processed: 190397

Final Cytosine Methylation Report
=================================
Total number of C's analysed: 2841113

Total methylated C's in CpG context: 295962
Total methylated C's in CHG context: 21717
Total methylated C's in CHH context: 164097

Total C to T conversions in CpG context: 96371
Total C to T conversions in CHG context: 569867
Total C to T conversions in CHH context: 1693099

C methylated in CpG context: 75.4%
C methylated in CHG context: 3.7%
C methylated in CHH context: 8.8%

Merging individual M-bias reports into overall M-bias statistics from these 28 individual files:
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.1.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.2.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.3.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.4.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.5.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.6.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.7.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.8.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.9.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.10.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.11.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.12.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.13.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.14.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.15.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.16.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.17.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.18.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.19.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.20.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.21.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.22.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.23.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.24.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.25.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.26.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.27.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.28.mbias Determining maximum read length for M-Bias plot Maximum read length of Read 1: 76 Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table) Determining maximum read length for M-Bias plot Maximum read length of Read 1: 76 Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table) Deleting unused files ... 
CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CpG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CpG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHH_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHH_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept Using these input files: CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Summary of parameters for bismark2bedGraph conversion: ====================================================== bedGraph output: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz output directory: >< remove whitespaces: yes CX context: no (CpG context only, default) No-header selected: no Sorting method: Unix sort-based (smaller memory footprint, but slower) Sort buffer size: 75% Coverage threshold: 1 ============================================================================= Methylation information will now be written into a bedGraph and coverage file ============================================================================= Using the following files as Input: 
/gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_1000000/CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_1000000/CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing bedGraph to file: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz Also writing out a coverage file including counts methylated and unmethylated residues to file: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bismark.cov.gz The genome of interest was specified to contain gazillions of chromosomes or scaffolds. Merging all input files and sorting everything in memory instead of writing out individual chromosome files... Writing all merged methylation calls to temp file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_1000000/CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt prior to bedGraph conversion Attempting to write to file CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt Finished writing methylation calls from CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt to merged temp file Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_1000000/CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt prior to bedGraph conversion Attempting to write to file CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt Finished writing methylation calls from CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt to merged temp file Sorting input file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged by positions (using -S of 75%) Successfully deleted the temporary input file 
cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged Finished BedGraph conversion ... Found 1 alignment reports in current directory. Now trying to figure out whether there are corresponding optional reports Writing Bismark HTML report to >> cvir_bsseq_all_R1_bismark_bt2_SE_report.html << ============================================================================================================== Using the following alignment report: > cvir_bsseq_all_R1_bismark_bt2_SE_report.txt < Processing alignment report cvir_bsseq_all_R1_bismark_bt2_SE_report.txt ... Complete Using the following deduplication report: > cvir_bsseq_all_R1_bismark_bt2.deduplication_report.txt < Processing deduplication report cvir_bsseq_all_R1_bismark_bt2.deduplication_report.txt ... Complete Using the following splitting report: > cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt < Processing splitting report cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt ... Complete Using the following M-bias report: > cvir_bsseq_all_R1_bismark_bt2.deduplicated.M-bias.txt < Processing M-bias report cvir_bsseq_all_R1_bismark_bt2.deduplicated.M-bias.txt ... Complete No nucleotide coverage report file specified, skipping this step ============================================================================================================== Found Bismark/Bowtie2 single-end files No Bismark/Bowtie2 paired-end BAM files detected No Bismark/Bowtie single-end BAM files detected No Bismark/Bowtie paired-end BAM files detected Generating Bismark summary report from 1 Bismark BAM file(s)... >> Reading from Bismark report: cvir_bsseq_all_R1_bismark_bt2_SE_report.txt Wrote Bismark project summary to >> bismark_summary_report.html << [bam_sort_core] merging from 0 files and 28 in-memory blocks... FastQ format assumed (by default) Each Bowtie 2 instance is going to be run with 28 threads. Please monitor performance closely and tune down if needed! 
chr NC_035780.1 (65668440 bp) chr NC_035781.1 (61752955 bp) chr NC_035782.1 (77061148 bp) chr NC_035783.1 (59691872 bp) chr NC_035784.1 (98698416 bp) chr NC_035785.1 (51258098 bp) chr NC_035786.1 (57830854 bp) chr NC_035787.1 (75944018 bp) chr NC_035788.1 (104168038 bp) chr NC_035789.1 (32650045 bp) chr NC_007175.2 (17244 bp) Processed 2000000 sequences in total Sequences with no alignments under any condition: 1400838 Sequences did not map uniquely: 208674 Sequences which were discarded because genomic sequence could not be extracted: 0 Number of sequences with unique best (first) alignment came from the bowtie output: CT/CT: 195707 ((converted) top strand) CT/GA: 194781 ((converted) bottom strand) GA/CT: 0 (complementary to (converted) top strand) GA/GA: 0 (complementary to (converted) bottom strand) Number of alignments to (merely theoretical) complementary strands being rejected in total: 0 Processing single-end Bismark output file(s) (SAM format): cvir_bsseq_all_R1_bismark_bt2.bam If there are several alignments to a single position in the genome the first alignment will be chosen. Since the input files are not in any way sorted this is a near-enough random selection of reads. Checking file >>cvir_bsseq_all_R1_bismark_bt2.bam<< for signs of file truncation... 
Output file is: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Total number of alignments analysed in cvir_bsseq_all_R1_bismark_bt2.bam: 390488 Total number duplicated alignments removed: 4757 (1.22%) Duplicated alignments were found at: 3800 different position(s) Total count of deduplicated leftover sequences: 385731 (98.78% of total) skipping header line: @HD VN:1.0 SO:unsorted skipping header line: @SQ SN:NC_035780.1 LN:65668440 skipping header line: @SQ SN:NC_035781.1 LN:61752955 skipping header line: @SQ SN:NC_035782.1 LN:77061148 skipping header line: @SQ SN:NC_035783.1 LN:59691872 skipping header line: @SQ SN:NC_035784.1 LN:98698416 skipping header line: @SQ SN:NC_035785.1 LN:51258098 skipping header line: @SQ SN:NC_035786.1 LN:57830854 skipping header line: @SQ SN:NC_035787.1 LN:75944018 skipping header line: @SQ SN:NC_035788.1 LN:104168038 skipping header line: @SQ SN:NC_035789.1 LN:32650045 skipping header line: @SQ SN:NC_007175.2 LN:17244 skipping header line: @PG ID:Bismark VN:v0.19.0 CL:"bismark --path_to_bowtie /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/ --genome /gscratch/srlab/sam/data/C_virginica/genomes/ --score_min L,0,-0.6 -u 2000000 -p 28 /gscratch/scrubbed/samwhite/data/C_virginica/BSseq/cvir_bsseq_all_R1.fastq.gz" *** Bismark methylation extractor version v0.19.0 *** Trying to determine the type of mapping from the SAM header line of file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Treating file(s) as single-end data (as extracted from @PG line) Core usage currently set to more than 20 threads. Let's see how this goes... 
(set value: 28) Summarising Bismark methylation extractor parameters: =============================================================== Bismark single-end SAM format specified (default) Number of cores to be used: 28 Output will be written to the current directory ('/gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_2000000') Summarising bedGraph parameters: =============================================================== Generating additional output in bedGraph and coverage format bedGraph format: coverage format: Using a cutoff of 1 read(s) to report cytosine positions Reporting and sorting cytosine methylation information in CpG context only (default) White spaces in read ID names will be removed prior to sorting The bedGraph UNIX sort command will use the following memory setting: '75%'. Temporary directory used for sorting is the output directory Checking file >>cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam<< for signs of file truncation... Writing result file containing methylation information for C in CpG context from the original top strand to CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the complementary to original top strand to CpG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the complementary to original bottom strand to CpG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the original bottom strand to CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the original top strand to CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the complementary to original top strand to 
CHG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the complementary to original bottom strand to CHG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the original bottom strand to CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the original top strand to CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the complementary to original top strand to CHH_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the complementary to original bottom strand to CHH_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the original bottom strand to CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam 
Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam skipping SAM header line: @HD VN:1.0 SO:unsorted skipping SAM header line: @SQ SN:NC_035780.1 LN:65668440 skipping SAM header line: @SQ SN:NC_035781.1 LN:61752955 skipping SAM header line: @SQ SN:NC_035782.1 LN:77061148 skipping SAM header line: @SQ SN:NC_035783.1 LN:59691872 skipping SAM header line: @SQ SN:NC_035784.1 LN:98698416 skipping SAM header line: @SQ SN:NC_035785.1 LN:51258098 skipping SAM header line: @SQ SN:NC_035786.1 LN:57830854 skipping SAM header line: @SQ SN:NC_035787.1 LN:75944018 skipping SAM header line: @SQ SN:NC_035788.1 LN:104168038 skipping SAM header line: @SQ 
SN:NC_035789.1 LN:32650045 skipping SAM header line: @SQ SN:NC_007175.2 LN:17244 skipping SAM header line: @PG ID:Bismark VN:v0.19.0 CL:"bismark --path_to_bowtie /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/ --genome /gscratch/srlab/sam/data/C_virginica/genomes/ --score_min L,0,-0.6 -u 2000000 -p 28 /gscratch/scrubbed/samwhite/data/C_virginica/BSseq/cvir_bsseq_all_R1.fastq.gz" Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Now waiting for all child processes to complete Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. 
Merging individual splitting reports into overall report: 'cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt' Merging from these individual files: cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.1 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.2 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.3 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.4 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.5 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.6 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.7 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.8 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.9 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.10 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.11 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.12 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.13 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.14 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.15 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.16 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.17 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.18 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.19 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.20 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.21 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.22 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.23 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.24 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.25 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.26 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.27 
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.28 Processed 385731 lines in total Total number of methylation call strings processed: 385731 Final Cytosine Methylation Report ================================= Total number of C's analysed: 5745638 Total methylated C's in CpG context: 600246 Total methylated C's in CHG context: 40711 Total methylated C's in CHH context: 312080 Total C to T conversions in CpG context: 195545 Total C to T conversions in CHG context: 1163367 Total C to T conversions in CHH context: 3433689 C methylated in CpG context: 75.4% C methylated in CHG context: 3.4% C methylated in CHH context: 8.3% Merging individual M-bias reports into overall M-bias statistics from these 28 individual files: cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.1.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.2.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.3.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.4.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.5.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.6.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.7.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.8.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.9.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.10.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.11.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.12.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.13.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.14.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.15.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.16.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.17.mbias 
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.18.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.19.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.20.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.21.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.22.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.23.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.24.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.25.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.26.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.27.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.28.mbias Determining maximum read length for M-Bias plot Maximum read length of Read 1: 76 Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table) Determining maximum read length for M-Bias plot Maximum read length of Read 1: 76 Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table) Deleting unused files ... 
CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CpG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CpG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHH_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHH_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept Using these input files: CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Summary of parameters for bismark2bedGraph conversion: ====================================================== bedGraph output: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz output directory: >< remove whitespaces: yes CX context: no (CpG context only, default) No-header selected: no Sorting method: Unix sort-based (smaller memory footprint, but slower) Sort buffer size: 75% Coverage threshold: 1 ============================================================================= Methylation information will now be written into a bedGraph and coverage file ============================================================================= Using the following files as Input: 
/gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_2000000/CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_2000000/CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing bedGraph to file: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz Also writing out a coverage file including counts methylated and unmethylated residues to file: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bismark.cov.gz The genome of interest was specified to contain gazillions of chromosomes or scaffolds. Merging all input files and sorting everything in memory instead of writing out individual chromosome files... Writing all merged methylation calls to temp file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_2000000/CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt prior to bedGraph conversion Attempting to write to file CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt Finished writing methylation calls from CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt to merged temp file Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_2000000/CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt prior to bedGraph conversion Attempting to write to file CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt Finished writing methylation calls from CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt to merged temp file Sorting input file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged by positions (using -S of 75%) Successfully deleted the temporary input file 
cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged Finished BedGraph conversion ... Found 1 alignment reports in current directory. Now trying to figure out whether there are corresponding optional reports Writing Bismark HTML report to >> cvir_bsseq_all_R1_bismark_bt2_SE_report.html << ============================================================================================================== Using the following alignment report: > cvir_bsseq_all_R1_bismark_bt2_SE_report.txt < Processing alignment report cvir_bsseq_all_R1_bismark_bt2_SE_report.txt ... Complete Using the following deduplication report: > cvir_bsseq_all_R1_bismark_bt2.deduplication_report.txt < Processing deduplication report cvir_bsseq_all_R1_bismark_bt2.deduplication_report.txt ... Complete Using the following splitting report: > cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt < Processing splitting report cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt ... Complete Using the following M-bias report: > cvir_bsseq_all_R1_bismark_bt2.deduplicated.M-bias.txt < Processing M-bias report cvir_bsseq_all_R1_bismark_bt2.deduplicated.M-bias.txt ... Complete No nucleotide coverage report file specified, skipping this step ============================================================================================================== Found Bismark/Bowtie2 single-end files No Bismark/Bowtie2 paired-end BAM files detected No Bismark/Bowtie single-end BAM files detected No Bismark/Bowtie paired-end BAM files detected Generating Bismark summary report from 1 Bismark BAM file(s)... >> Reading from Bismark report: cvir_bsseq_all_R1_bismark_bt2_SE_report.txt Wrote Bismark project summary to >> bismark_summary_report.html << [bam_sort_core] merging from 0 files and 28 in-memory blocks... FastQ format assumed (by default) Each Bowtie 2 instance is going to be run with 28 threads. Please monitor performance closely and tune down if needed! 
chr NC_035780.1 (65668440 bp) chr NC_035781.1 (61752955 bp) chr NC_035782.1 (77061148 bp) chr NC_035783.1 (59691872 bp) chr NC_035784.1 (98698416 bp) chr NC_035785.1 (51258098 bp) chr NC_035786.1 (57830854 bp) chr NC_035787.1 (75944018 bp) chr NC_035788.1 (104168038 bp) chr NC_035789.1 (32650045 bp) chr NC_007175.2 (17244 bp) Processed 5000000 sequences in total Sequences with no alignments under any condition: 3569125 Sequences did not map uniquely: 498819 Sequences which were discarded because genomic sequence could not be extracted: 0 Number of sequences with unique best (first) alignment came from the bowtie output: CT/CT: 465565 ((converted) top strand) CT/GA: 466491 ((converted) bottom strand) GA/CT: 0 (complementary to (converted) top strand) GA/GA: 0 (complementary to (converted) bottom strand) Number of alignments to (merely theoretical) complementary strands being rejected in total: 0 Processing single-end Bismark output file(s) (SAM format): cvir_bsseq_all_R1_bismark_bt2.bam If there are several alignments to a single position in the genome the first alignment will be chosen. Since the input files are not in any way sorted this is a near-enough random selection of reads. Checking file >>cvir_bsseq_all_R1_bismark_bt2.bam<< for signs of file truncation... 
Output file is: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Total number of alignments analysed in cvir_bsseq_all_R1_bismark_bt2.bam: 932056 Total number duplicated alignments removed: 22562 (2.42%) Duplicated alignments were found at: 17140 different position(s) Total count of deduplicated leftover sequences: 909494 (97.58% of total) skipping header line: @HD VN:1.0 SO:unsorted skipping header line: @SQ SN:NC_035780.1 LN:65668440 skipping header line: @SQ SN:NC_035781.1 LN:61752955 skipping header line: @SQ SN:NC_035782.1 LN:77061148 skipping header line: @SQ SN:NC_035783.1 LN:59691872 skipping header line: @SQ SN:NC_035784.1 LN:98698416 skipping header line: @SQ SN:NC_035785.1 LN:51258098 skipping header line: @SQ SN:NC_035786.1 LN:57830854 skipping header line: @SQ SN:NC_035787.1 LN:75944018 skipping header line: @SQ SN:NC_035788.1 LN:104168038 skipping header line: @SQ SN:NC_035789.1 LN:32650045 skipping header line: @SQ SN:NC_007175.2 LN:17244 skipping header line: @PG ID:Bismark VN:v0.19.0 CL:"bismark --path_to_bowtie /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/ --genome /gscratch/srlab/sam/data/C_virginica/genomes/ --score_min L,0,-0.6 -u 5000000 -p 28 /gscratch/scrubbed/samwhite/data/C_virginica/BSseq/cvir_bsseq_all_R1.fastq.gz" *** Bismark methylation extractor version v0.19.0 *** Trying to determine the type of mapping from the SAM header line of file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Treating file(s) as single-end data (as extracted from @PG line) Core usage currently set to more than 20 threads. Let's see how this goes... 
(set value: 28) Summarising Bismark methylation extractor parameters: =============================================================== Bismark single-end SAM format specified (default) Number of cores to be used: 28 Output will be written to the current directory ('/gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_5000000') Summarising bedGraph parameters: =============================================================== Generating additional output in bedGraph and coverage format bedGraph format: coverage format: Using a cutoff of 1 read(s) to report cytosine positions Reporting and sorting cytosine methylation information in CpG context only (default) White spaces in read ID names will be removed prior to sorting The bedGraph UNIX sort command will use the following memory setting: '75%'. Temporary directory used for sorting is the output directory Checking file >>cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam<< for signs of file truncation... Writing result file containing methylation information for C in CpG context from the original top strand to CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the complementary to original top strand to CpG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the complementary to original bottom strand to CpG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the original bottom strand to CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the original top strand to CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the complementary to original top strand to 
CHG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the complementary to original bottom strand to CHG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the original bottom strand to CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the original top strand to CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the complementary to original top strand to CHH_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the complementary to original bottom strand to CHH_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the original bottom strand to CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam 
Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam skipping SAM header line: @HD VN:1.0 SO:unsorted skipping SAM header line: @SQ SN:NC_035780.1 LN:65668440 skipping SAM header line: @SQ SN:NC_035781.1 LN:61752955 skipping SAM header line: @SQ SN:NC_035782.1 LN:77061148 skipping SAM header line: @SQ SN:NC_035783.1 LN:59691872 skipping SAM header line: @SQ SN:NC_035784.1 LN:98698416 skipping SAM header line: @SQ SN:NC_035785.1 LN:51258098 skipping SAM header line: @SQ SN:NC_035786.1 LN:57830854 skipping SAM header line: @SQ SN:NC_035787.1 LN:75944018 skipping SAM header line: @SQ SN:NC_035788.1 LN:104168038 skipping SAM header line: @SQ 
SN:NC_035789.1 LN:32650045 skipping SAM header line: @SQ SN:NC_007175.2 LN:17244 skipping SAM header line: @PG ID:Bismark VN:v0.19.0 CL:"bismark --path_to_bowtie /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/ --genome /gscratch/srlab/sam/data/C_virginica/genomes/ --score_min L,0,-0.6 -u 5000000 -p 28 /gscratch/scrubbed/samwhite/data/C_virginica/BSseq/cvir_bsseq_all_R1.fastq.gz" Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. 
Exiting.. Finished processing child process. Exiting.. Now waiting for all child processes to complete Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Merging individual splitting reports into overall report: 'cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt' Merging from these individual files: cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.1 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.2 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.3 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.4 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.5 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.6 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.7 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.8 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.9 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.10 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.11 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.12 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.13 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.14 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.15 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.16 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.17 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.18 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.19 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.20 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.21 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.22 
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.23
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.24
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.25
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.26
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.27
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.28

Processed 909494 lines in total
Total number of methylation call strings processed: 909494

Final Cytosine Methylation Report
=================================
Total number of C's analysed: 13496392

Total methylated C's in CpG context: 1360954
Total methylated C's in CHG context: 94075
Total methylated C's in CHH context: 737951

Total C to T conversions in CpG context: 527776
Total C to T conversions in CHG context: 2752388
Total C to T conversions in CHH context: 8023248

C methylated in CpG context: 72.1%
C methylated in CHG context: 3.3%
C methylated in CHH context: 8.4%

Merging individual M-bias reports into overall M-bias statistics from these 28 individual files:
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.1.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.2.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.3.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.4.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.5.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.6.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.7.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.8.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.9.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.10.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.11.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.12.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.13.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.14.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.15.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.16.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.17.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.18.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.19.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.20.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.21.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.22.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.23.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.24.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.25.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.26.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.27.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.28.mbias Determining maximum read length for M-Bias plot Maximum read length of Read 1: 76 Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table) Determining maximum read length for M-Bias plot Maximum read length of Read 1: 76 Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table) Deleting unused files ... 
CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CpG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CpG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHH_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHH_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept Using these input files: CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Summary of parameters for bismark2bedGraph conversion: ====================================================== bedGraph output: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz output directory: >< remove whitespaces: yes CX context: no (CpG context only, default) No-header selected: no Sorting method: Unix sort-based (smaller memory footprint, but slower) Sort buffer size: 75% Coverage threshold: 1 ============================================================================= Methylation information will now be written into a bedGraph and coverage file ============================================================================= Using the following files as Input: 
/gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_5000000/CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_5000000/CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing bedGraph to file: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz Also writing out a coverage file including counts methylated and unmethylated residues to file: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bismark.cov.gz The genome of interest was specified to contain gazillions of chromosomes or scaffolds. Merging all input files and sorting everything in memory instead of writing out individual chromosome files... Writing all merged methylation calls to temp file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_5000000/CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt prior to bedGraph conversion Attempting to write to file CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt Finished writing methylation calls from CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt to merged temp file Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_5000000/CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt prior to bedGraph conversion Attempting to write to file CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt Finished writing methylation calls from CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt to merged temp file Sorting input file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged by positions (using -S of 75%) Successfully deleted the temporary input file 
cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged Finished BedGraph conversion ... Found 1 alignment reports in current directory. Now trying to figure out whether there are corresponding optional reports Writing Bismark HTML report to >> cvir_bsseq_all_R1_bismark_bt2_SE_report.html << ============================================================================================================== Using the following alignment report: > cvir_bsseq_all_R1_bismark_bt2_SE_report.txt < Processing alignment report cvir_bsseq_all_R1_bismark_bt2_SE_report.txt ... Complete Using the following deduplication report: > cvir_bsseq_all_R1_bismark_bt2.deduplication_report.txt < Processing deduplication report cvir_bsseq_all_R1_bismark_bt2.deduplication_report.txt ... Complete Using the following splitting report: > cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt < Processing splitting report cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt ... Complete Using the following M-bias report: > cvir_bsseq_all_R1_bismark_bt2.deduplicated.M-bias.txt < Processing M-bias report cvir_bsseq_all_R1_bismark_bt2.deduplicated.M-bias.txt ... Complete No nucleotide coverage report file specified, skipping this step ============================================================================================================== Found Bismark/Bowtie2 single-end files No Bismark/Bowtie2 paired-end BAM files detected No Bismark/Bowtie single-end BAM files detected No Bismark/Bowtie paired-end BAM files detected Generating Bismark summary report from 1 Bismark BAM file(s)... >> Reading from Bismark report: cvir_bsseq_all_R1_bismark_bt2_SE_report.txt Wrote Bismark project summary to >> bismark_summary_report.html << [bam_sort_core] merging from 0 files and 28 in-memory blocks... FastQ format assumed (by default) Each Bowtie 2 instance is going to be run with 28 threads. Please monitor performance closely and tune down if needed! 
chr NC_035780.1 (65668440 bp)
chr NC_035781.1 (61752955 bp)
chr NC_035782.1 (77061148 bp)
chr NC_035783.1 (59691872 bp)
chr NC_035784.1 (98698416 bp)
chr NC_035785.1 (51258098 bp)
chr NC_035786.1 (57830854 bp)
chr NC_035787.1 (75944018 bp)
chr NC_035788.1 (104168038 bp)
chr NC_035789.1 (32650045 bp)
chr NC_007175.2 (17244 bp)

Processed 10000000 sequences in total
Sequences with no alignments under any condition: 7479240
Sequences did not map uniquely: 907664
Sequences which were discarded because genomic sequence could not be extracted: 2

Number of sequences with unique best (first) alignment came from the bowtie output:
CT/CT: 810107 ((converted) top strand)
CT/GA: 802987 ((converted) bottom strand)
GA/CT: 0 (complementary to (converted) top strand)
GA/GA: 0 (complementary to (converted) bottom strand)

Number of alignments to (merely theoretical) complementary strands being rejected in total: 0

Processing single-end Bismark output file(s) (SAM format):
cvir_bsseq_all_R1_bismark_bt2.bam

If there are several alignments to a single position in the genome the first alignment will be chosen. Since the input files are not in any way sorted this is a near-enough random selection of reads.

Checking file >>cvir_bsseq_all_R1_bismark_bt2.bam<< for signs of file truncation...
Output file is: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam

Total number of alignments analysed in cvir_bsseq_all_R1_bismark_bt2.bam: 1613094
Total number duplicated alignments removed: 104357 (6.47%)
Duplicated alignments were found at: 63570 different position(s)

Total count of deduplicated leftover sequences: 1508737 (93.53% of total)

skipping header line: @HD VN:1.0 SO:unsorted
skipping header line: @SQ SN:NC_035780.1 LN:65668440
skipping header line: @SQ SN:NC_035781.1 LN:61752955
skipping header line: @SQ SN:NC_035782.1 LN:77061148
skipping header line: @SQ SN:NC_035783.1 LN:59691872
skipping header line: @SQ SN:NC_035784.1 LN:98698416
skipping header line: @SQ SN:NC_035785.1 LN:51258098
skipping header line: @SQ SN:NC_035786.1 LN:57830854
skipping header line: @SQ SN:NC_035787.1 LN:75944018
skipping header line: @SQ SN:NC_035788.1 LN:104168038
skipping header line: @SQ SN:NC_035789.1 LN:32650045
skipping header line: @SQ SN:NC_007175.2 LN:17244
skipping header line: @PG ID:Bismark VN:v0.19.0 CL:"bismark --path_to_bowtie /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/ --genome /gscratch/srlab/sam/data/C_virginica/genomes/ --score_min L,0,-0.6 -u 10000000 -p 28 /gscratch/scrubbed/samwhite/data/C_virginica/BSseq/cvir_bsseq_all_R1.fastq.gz"

*** Bismark methylation extractor version v0.19.0 ***

Trying to determine the type of mapping from the SAM header line of file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam
Treating file(s) as single-end data (as extracted from @PG line)

Core usage currently set to more than 20 threads. Let's see how this goes...
(set value: 28) Summarising Bismark methylation extractor parameters: =============================================================== Bismark single-end SAM format specified (default) Number of cores to be used: 28 Output will be written to the current directory ('/gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_10000000') Summarising bedGraph parameters: =============================================================== Generating additional output in bedGraph and coverage format bedGraph format: coverage format: Using a cutoff of 1 read(s) to report cytosine positions Reporting and sorting cytosine methylation information in CpG context only (default) White spaces in read ID names will be removed prior to sorting The bedGraph UNIX sort command will use the following memory setting: '75%'. Temporary directory used for sorting is the output directory Checking file >>cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam<< for signs of file truncation... Writing result file containing methylation information for C in CpG context from the original top strand to CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the complementary to original top strand to CpG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the complementary to original bottom strand to CpG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CpG context from the original bottom strand to CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the original top strand to CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the complementary to original top strand to 
CHG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the complementary to original bottom strand to CHG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHG context from the original bottom strand to CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the original top strand to CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the complementary to original top strand to CHH_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the complementary to original bottom strand to CHH_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing result file containing methylation information for C in CHH context from the original bottom strand to CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam 
Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam Now reading in Bismark result file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bam skipping SAM header line: @HD VN:1.0 SO:unsorted skipping SAM header line: @SQ SN:NC_035780.1 LN:65668440 skipping SAM header line: @SQ SN:NC_035781.1 LN:61752955 skipping SAM header line: @SQ SN:NC_035782.1 LN:77061148 skipping SAM header line: @SQ SN:NC_035783.1 LN:59691872 skipping SAM header line: @SQ SN:NC_035784.1 LN:98698416 skipping SAM header line: @SQ SN:NC_035785.1 LN:51258098 skipping SAM header line: @SQ SN:NC_035786.1 LN:57830854 skipping SAM header line: @SQ SN:NC_035787.1 LN:75944018 skipping SAM header line: @SQ SN:NC_035788.1 LN:104168038 skipping SAM header line: @SQ 
SN:NC_035789.1 LN:32650045 skipping SAM header line: @SQ SN:NC_007175.2 LN:17244 skipping SAM header line: @PG ID:Bismark VN:v0.19.0 CL:"bismark --path_to_bowtie /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/ --genome /gscratch/srlab/sam/data/C_virginica/genomes/ --score_min L,0,-0.6 -u 10000000 -p 28 /gscratch/scrubbed/samwhite/data/C_virginica/BSseq/cvir_bsseq_all_R1.fastq.gz" Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 500000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1000000 Processed lines: 1500000 Finished processing child process. Exiting.. Processed lines: 1500000 Finished processing child process. Exiting.. 
Processed lines: 1500000 Processed lines: 1500000 Now waiting for all child processes to complete Processed lines: 1500000 Processed lines: 1500000 Finished processing child process. Exiting.. Finished processing child process. Exiting.. Processed lines: 1500000 Finished processing child process. Exiting.. Processed lines: 1500000 Finished processing child process. Exiting.. Processed lines: 1500000 Processed lines: 1500000 Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Processed lines: 1500000 Processed lines: 1500000 Finished processing child process. Exiting.. Processed lines: 1500000 Processed lines: 1500000 Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Processed lines: 1500000 Processed lines: 1500000 Processed lines: 1500000 Finished processing child process. Exiting.. Processed lines: 1500000 Processed lines: 1500000 Finished processing child process. Exiting.. Processed lines: 1500000 Finished processing child process. Exiting.. Finished processing child process. Exiting.. Processed lines: 1500000 Finished processing child process. Exiting.. Finished processing child process. Exiting.. Processed lines: 1500000 Processed lines: 1500000 Finished processing child process. Exiting.. Processed lines: 1500000 Finished processing child process. Exiting.. Processed lines: 1500000 Processed lines: 1500000 Finished processing child process. Exiting.. Processed lines: 1500000 Processed lines: 1500000 Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. Finished processing child process. Exiting.. 
Merging individual splitting reports into overall report: 'cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt' Merging from these individual files: cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.1 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.2 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.3 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.4 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.5 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.6 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.7 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.8 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.9 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.10 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.11 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.12 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.13 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.14 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.15 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.16 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.17 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.18 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.19 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.20 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.21 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.22 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.23 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.24 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.25 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.26 cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.27 
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.28

Processed 1508737 lines in total
Total number of methylation call strings processed: 1508737

Final Cytosine Methylation Report
=================================
Total number of C's analysed: 24639748

Total methylated C's in CpG context: 2709819
Total methylated C's in CHG context: 1260694
Total methylated C's in CHH context: 4840111

Total C to T conversions in CpG context: 956130
Total C to T conversions in CHG context: 3847573
Total C to T conversions in CHH context: 11025421

C methylated in CpG context: 73.9%
C methylated in CHG context: 24.7%
C methylated in CHH context: 30.5%

Merging individual M-bias reports into overall M-bias statistics from these 28 individual files:
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.1.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.2.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.3.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.4.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.5.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.6.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.7.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.8.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.9.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.10.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.11.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.12.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.13.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.14.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.15.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.16.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.17.mbias
cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.18.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.19.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.20.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.21.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.22.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.23.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.24.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.25.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.26.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.27.mbias cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt.28.mbias Determining maximum read length for M-Bias plot Maximum read length of Read 1: 101 Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table) Determining maximum read length for M-Bias plot Maximum read length of Read 1: 101 Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table) Deleting unused files ... 
CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CpG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CpG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHG_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHG_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept CHH_CTOT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHH_CTOB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt was empty -> deleted CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt contains data -> kept Using these input files: CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHH_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt CHH_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Summary of parameters for bismark2bedGraph conversion: ====================================================== bedGraph output: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz output directory: >< remove whitespaces: yes CX context: no (CpG context only, default) No-header selected: no Sorting method: Unix sort-based (smaller memory footprint, but slower) Sort buffer size: 75% Coverage threshold: 1 ============================================================================= Methylation information will now be written into a bedGraph and coverage file ============================================================================= Using the following files as Input: 
/gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_10000000/CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_10000000/CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt Writing bedGraph to file: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz Also writing out a coverage file including counts methylated and unmethylated residues to file: cvir_bsseq_all_R1_bismark_bt2.deduplicated.bismark.cov.gz The genome of interest was specified to contain gazillions of chromosomes or scaffolds. Merging all input files and sorting everything in memory instead of writing out individual chromosome files... Writing all merged methylation calls to temp file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_10000000/CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt prior to bedGraph conversion Attempting to write to file CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt Finished writing methylation calls from CpG_OT_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt to merged temp file Now replacing whitespaces in the sequence ID field of the Bismark methylation extractor output /gscratch/scrubbed/samwhite/outputs/20190222_cvirginica_se_bismark/subset_10000000/CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt prior to bedGraph conversion Attempting to write to file CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt Finished writing methylation calls from CpG_OB_cvir_bsseq_all_R1_bismark_bt2.deduplicated.txt.spaces_removed.txt to merged temp file Sorting input file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged by positions (using -S of 75%) Successfully deleted the temporary input 
file cvir_bsseq_all_R1_bismark_bt2.deduplicated.bedGraph.gz.methylation_calls.merged Finished BedGraph conversion ... Found 1 alignment reports in current directory. Now trying to figure out whether there are corresponding optional reports Writing Bismark HTML report to >> cvir_bsseq_all_R1_bismark_bt2_SE_report.html << ============================================================================================================== Using the following alignment report: > cvir_bsseq_all_R1_bismark_bt2_SE_report.txt < Processing alignment report cvir_bsseq_all_R1_bismark_bt2_SE_report.txt ... Complete Using the following deduplication report: > cvir_bsseq_all_R1_bismark_bt2.deduplication_report.txt < Processing deduplication report cvir_bsseq_all_R1_bismark_bt2.deduplication_report.txt ... Complete Using the following splitting report: > cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt < Processing splitting report cvir_bsseq_all_R1_bismark_bt2.deduplicated_splitting_report.txt ... Complete Using the following M-bias report: > cvir_bsseq_all_R1_bismark_bt2.deduplicated.M-bias.txt < Processing M-bias report cvir_bsseq_all_R1_bismark_bt2.deduplicated.M-bias.txt ... Complete No nucleotide coverage report file specified, skipping this step ============================================================================================================== Found Bismark/Bowtie2 single-end files No Bismark/Bowtie2 paired-end BAM files detected No Bismark/Bowtie single-end BAM files detected No Bismark/Bowtie paired-end BAM files detected Generating Bismark summary report from 1 Bismark BAM file(s)... >> Reading from Bismark report: cvir_bsseq_all_R1_bismark_bt2_SE_report.txt Wrote Bismark project summary to >> bismark_summary_report.html << [bam_sort_core] merging from 0 files and 28 in-memory blocks...
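
Note: the percentages printed for the 10000000-read subset can be cross-checked against the raw counts in the same log. The sketch below is only an illustration, not part of Bismark: the function and variable names are hypothetical, the numbers are copied by hand from the log above, and it assumes each context percentage is methylated calls divided by methylated plus C-to-T converted calls, which is consistent with the figures reported here.

```python
# Cross-check of summary figures for the 10000000-read subset.
# All counts are copied by hand from the log; names are hypothetical.

REPORT = {
    "CpG": {"methylated": 2709819, "converted": 956130},
    "CHG": {"methylated": 1260694, "converted": 3847573},
    "CHH": {"methylated": 4840111, "converted": 11025421},
}
TOTAL_CS_ANALYSED = 24639748

READS_PROCESSED = 10000000
UNIQUE_ALIGNMENTS = 810107 + 802987  # CT/CT + CT/GA
DUPLICATES_REMOVED = 104357


def percent(part, whole):
    """Return part as a percentage of whole (NaN if whole is zero)."""
    return 100.0 * part / whole if whole else float("nan")


def context_percentages(report):
    """Percent methylation per context: methylated / (methylated + converted)."""
    return {
        ctx: round(percent(c["methylated"], c["methylated"] + c["converted"]), 1)
        for ctx, c in report.items()
    }


def total_calls(report):
    """Sum of methylated and converted calls over all contexts."""
    return sum(c["methylated"] + c["converted"] for c in report.values())


if __name__ == "__main__":
    for ctx, pct in context_percentages(REPORT).items():
        print(f"C methylated in {ctx} context: {pct}%")
    print("Calls sum to total C's analysed:",
          total_calls(REPORT) == TOTAL_CS_ANALYSED)
    print("Duplicates removed (%):",
          round(percent(DUPLICATES_REMOVED, UNIQUE_ALIGNMENTS), 2))
    print("Uniquely aligned share of reads (%):",
          round(percent(UNIQUE_ALIGNMENTS, READS_PROCESSED), 1))
```

Swapping in the counts from another subset's report runs the same check for that run.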