Sam’s Notebook - Data Received - Jay’s Coral epiRADseq

We received notice that Jay’s coral (Porites spp.) epiRADseq data was available from the Genomic Sequencing Laboratory at UC-Berkeley.

Downloaded the FASTQ files from the project directory to Owl/nightingales/Porites_spp:

<code>time wget -r -np -nc -A "*.gz" --ask-password ftp://gslftp@gslserver.qb3.berkeley.edu/160830_100PE_HS4KB/Roberts</code>

Generated MD5 checksums for each file:

<code>for i in *.gz; do md5 "$i" >> checksums.md5; done</code>
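To re-check the files later against recorded checksums, something like the following could work. This is only a sketch: it assumes GNU coreutils `md5sum` is available (the command above uses the macOS `md5` tool, which writes a different output format), and the function names `record_md5` and `verify_md5` and the file name `checksums_gnu.md5` are hypothetical, not part of the original workflow.

```shell
# Hypothetical helpers, assuming GNU coreutils md5sum (not the macOS md5 used above).

record_md5() {
  # Write one "hash  filename" line per gzipped file in the current directory.
  md5sum -- *.gz > checksums_gnu.md5
}

verify_md5() {
  # Recompute each hash and compare it with the recorded value.
  # Prints only failures; exit status is non-zero if any file does not match.
  md5sum --check --quiet checksums_gnu.md5
}
```

Running `record_md5` once after the download and `verify_md5` after any later copy or move would flag files that changed in transit.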

Calculated the total number of reads for this sequencing run:

<code>totalreads=0; for i in *.gz; do linecount=`gunzip -c "$i" | wc -l`; readcount=$((linecount/4)); totalreads=$((readcount+totalreads)); done; echo $totalreads</code>

Total reads: 573,378,864

Calculated read counts for each file and wrote the data to the readme.md file in the Owl/web/nightingales/Porites_spp directory:

<code>for i in *.gz; do linecount=`gunzip -c "$i" | wc -l`; readcount=$(($linecount/4)); printf "%s\t%s\n" "$i" "$readcount" >> readme.md; done</code>
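The two loops above each decompress every file, so the data is read twice. A single-pass variant might look like the sketch below. It assumes standard four-line FASTQ records (no wrapped sequence lines), the function name `count_fastq_reads` is hypothetical, and it prints to standard output instead of appending to readme.md directly.

```shell
# Hypothetical helper: print "filename<TAB>reads" for each file, then a total line.
# Assumes every FASTQ record is exactly four lines.
count_fastq_reads() {
  local total=0 f lines reads
  for f in "$@"; do
    # Count lines in the decompressed stream without writing it to disk.
    lines=$(gunzip -c "$f" | wc -l)
    reads=$((lines / 4))
    total=$((total + reads))
    printf '%s\t%s\n' "$f" "$reads"
  done
  printf 'total\t%s\n' "$total"
}
```

Called as `count_fastq_reads *.gz`, this reports the per-file counts and the run total together; the output can be redirected to a file if the final `total` line is wanted there as well.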

See this Jupyter notebook for code explanations.

Added sequencing info to [Next_Gen_Seq_Library_Database (Google Sheet)](https://docs.google.com/spreadsheets/d/1r4twxfBHpWfQoznbn2dAQhgMvmlZvQqW9I2_uVZX_aU/edit?usp=sharing), the Nightingales Spreadsheet (Google Sheet), and the Nightingales Fusion Table (Google Fusion Table).