--- author: Sam White toc-title: Contents toc-depth: 5 toc-location: left date: 2018-04-19 23:41:39+00:00 layout: post slug: kmer-estimation-kmergenie-tweaks-on-geoduck-sequence-data title: 'Kmer Estimation – Kmergenie Tweaks on Geoduck Sequence Data ' categories: - 2018 - Geoduck Genome Sequencing tags: - geoduck - kmer - kmer genie - mox - Panopea generosa --- Earlier today, [I ran kmergenie on our all of geoduck DNA sequencing data](https://robertslab.github.io/sams-notebook/posts/2018/2018-04-19-kmer-estimation-kmergenie-on-geoduck-sequence-data-default-settings/) to see what it would spit out for an ideal kmer setting, which I would then use in another assembly attempt using SparseAssembler; just to see how the assembly might change. The output from that kmergenie run suggested that the ideal kmer size exceeded the default maximum (k = 121), so I decided to run kmergenie a few more times, with some slight changes. All jobs were run on our Mox HPC node. ###### Run 1 * Diploid * Slurm script: [20180419_kmergenie_diploid_geoduck_slurm.sh](https://owl.fish.washington.edu/Athaliana/20180419_kmergenie_diploid_geoduck/20180419_kmergenie_diploid_geoduck_slurm.sh) ###### Run 2 * Diploid * k 301 * Slurm script: [20180419_kmergenie_diploid_k301_geoduck_slurm.sh](https://owl.fish.washington.edu/Athaliana/20180419_kmergenie_diploid_k301_geoduck/20180419_kmergenie_diploid_k301_geoduck_slurm.sh) ##### Results: Output folders: * [20180419_kmergenie_diploid_geoduck/](https://owl.fish.washington.edu/Athaliana/20180419_kmergenie_diploid_geoduck/) * [20180419_kmergenie_diploid_k301_geoduck/](https://owl.fish.washington.edu/Athaliana/20180419_kmergenie_diploid_k301_geoduck/) Slurm output files: * [20180419_kmergenie_diploid_geoduck/slurm-162003.out](https://owl.fish.washington.edu/Athaliana/20180419_kmergenie_diploid_geoduck/slurm-162003.out) * [20180419_kmergenie_diploid_k301_geoduck/slurm-162004.out](https://owl.fish.washington.edu/Athaliana/20180419_kmergenie_diploid_k301_geoduck/slurm-162004.out) Kmer histogram (HTML) reports: * [20180419_kmergenie_diploid_geoduck/histograms_report.html](https://owl.fish.washington.edu/Athaliana/20180419_kmergenie_diploid_geoduck/histograms_report.html) * [20180419_kmergenie_diploid_k301_geoduck/histograms_report.html](https://owl.fish.washington.edu/Athaliana/20180419_kmergenie_diploid_k301_geoduck/histograms_report.html) #### Diploid ![](https://owl.fish.washington.edu/Athaliana/20180419_kmergenie_diploid_geoduck/20180419_kmergenie_diploid_geoduck.png) #### Diploid, k 301 ![](https://owl.fish.washington.edu/Athaliana/20180419_kmergenie_diploid_k301_geoduck/20180419_kmergenie_diploid_k301_geoduck.png) Okay, well, these graphs clearly show that the diploid setting is no good. We should be getting a nice, smooth, concave curve. Will try running again, without diploid setting and just increasing the max kmer size.