--- author: Sam White toc-title: Contents toc-depth: 5 toc-location: left date: 2017-10-19 18:29:44+00:00 layout: post slug: software-installation-masurca-v3-2-3-assembler-on-mox-hyak title: Software Installation - MaSuRCA v3.2.3 Assembler on Mox (Hyak) categories: - 2017 - Miscellaneous tags: - hyak - MaSuRCA - mox --- Saw this tweet this morning and thought this would be good to try out for our Olympia oyster genome assemblies, as it will handle "hybrid" assemblies (i.e. short-reads and long reads):
> > The latest verison of the MaSuRCA assembler supports [@nanopore](https://twitter.com/nanopore?ref_src=twsrc%5Etfw) reads and it is super easy to use. Check it out! [https://t.co/ByKUCpAoIf](https://t.co/ByKUCpAoIf) [pic.twitter.com/KulviJhmMh](https://t.co/KulviJhmMh) > > — Darrin Schultz (@conchoecia) [October 18, 2017](https://twitter.com/conchoecia/status/920771453728854016?ref_src=twsrc%5Etfw)
Additionally, I was excited by the "...super easy to use." part in the description. As it turns out, that part of the Tweet is totally untrue. Here are some of the things that make it _difficult_ to use: * No pre-install readme file. Without readme file there are no instructions/info on: * Necessary dependencies * Install command(s) * Initial install attempt failed with error message. Message suggests trying: * BOOST_ROOT=install ./install.sh * No post-install readme file. How do I even get started without any documentation??!! I managed to track down the guide for this, but didn't get it via searching the internet. I noticed that [the link for the software in the original Tweet](ftp://ftp.genome.umd.edu/pub/MaSuRCA/latest/) had a parent directory, so I navigated there and spotted this: Quick start guide (PDF): [ftp://ftp.genome.umd.edu/pub/MaSuRCA/MaSuRCA_QuickStartGuide.pdf](ftp://ftp.genome.umd.edu/pub/MaSuRCA/MaSuRCA_QuickStartGuide.pdf) Although not a big deal, this quick start guide is for the previous version (v.3.2.2). So, is this where we get to the "super easy to use" part? Uh, no. Check it out: 1. Modify a config file (luckily, a template is created during install) * Illumina paired-end (PE) reads: Aribtrary two letter prefix, mean read length, and read length standard deviation need to be supplied (!!!) * Illumina mate-paired (MP) reads: Called "JUMP" in config file and needs the same type of info supplied as PE reads. * PacBio reads: Need to be in a single FASTA file (ugh)! * A bunch of other stuff that I can probably leave alone. 2. Run the `masurca` script (located here: `MaSuRCA-3.2.3/bin/masurca`). This will generate a new script (called `assemble.sh`). 3. Run the `assemble.sh` script that was created in the previous step. Although not specifically related to the MaSuRCA install, I did encounter problems trying to install this on our Mox (hyak) computing node. #### [Build node](https://github.com/RobertsLab/hyak_mox/wiki/Node-Types#build-node) fail (ironically, this is the specific type of node that's supposed to be used for compiling software): ![](https://owl.fish.washington.edu/Athaliana/20171019_mox_build_masurca_fail.png)(http://owl.fish.washington.edu/Athaliana/20171019_mox_build_masurca_fail.png) OK, so I decided to try compiling it using the [login node](https://github.com/RobertsLab/hyak_mox/wiki/Node-Types#login-node) (which is not what the login node is supposed to be used for): #### [Login node](https://github.com/RobertsLab/hyak_mox/wiki/Node-Types#login-node) fail: ![](https://owl.fish.washington.edu/Athaliana/20171019_mox_login_masurca_fail.png)(http://owl.fish.washington.edu/Athaliana/20171019_mox_login_masurca_fail.png) I really didn't want to have to put together [an SBATCH script](https://github.com/RobertsLab/hyak_mox/wiki/Running-a-Job) just to compile this software (which compiled without issue, except for that initial BOOST error thingy, on my local Ubuntu 16.04 LTS system), so I just tried running an [interactive node](https://github.com/RobertsLab/hyak_mox/wiki/Node-Types#interactive-node) and it worked! Now, on to trying to actually run this thing...