{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Counting Reads\n", "\n", "In this notebook, I'll count the number of reads in both untrimmed and trimmed *C. virginica* gonad sequence data from Illumina.\n", "\n", "1. Untrimmed files\n", "2. Trimmed files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 0. Prepare for analyses" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 0a. Set working directory" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'/Users/yaamini/Documents/paper-gonad-meth/code'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pwd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/yaamini/Documents/paper-gonad-meth/data\n" ] } ], "source": [ "cd ../data/" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [], "source": [ "!mkdir 2019-03-17-Counting-Reads" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/yaamini/Documents/paper-gonad-meth/data/2019-03-17-Counting-Reads\n" ] } ], "source": [ "cd 2019-03-17-Counting-Reads/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Untrimmed files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The untrimmed files have FastQC reports I can use to get read counts, instead of downloading the full sequence files. The link to these files can be found in the Nightingales spreadsheet."
] }, { "cell_type": "code", "execution_count": 67, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#Create a new directory for downloading files and saving read counts\n", "!mkdir 2019-03-17-Untrimmed-Reads" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/yaamini/Documents/paper-gonad-meth/data/2019-03-17-Counting-Reads/2019-03-17-Untrimmed-Reads\n" ] } ], "source": [ "cd 2019-03-17-Untrimmed-Reads/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1a. Download files" ] }, { "cell_type": "code", "execution_count": 87, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--2019-03-18 14:16:10-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/\n", "Resolving owl.fish.washington.edu... 128.95.149.83\n", "Connecting to owl.fish.washington.edu|128.95.149.83|:80... connected.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html'\n", "\n", "owl.fish.washington [ <=> ] 10.27K --.-KB/s in 0.02s \n", "\n", "2019-03-18 14:16:10 (627 KB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html' saved [10512]\n", "\n", "Loading robots.txt; please ignore errors.\n", "--2019-03-18 14:16:10-- http://owl.fish.washington.edu/robots.txt\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
404 Not Found\n", "2019-03-18 14:16:23 ERROR 404: Not Found.\n", "\n", "Removing owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html since it should be rejected.\n", "\n", "--2019-03-18 14:16:23-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/?C=N;O=D\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html?C=N;O=D'\n", "\n", "owl.fish.washington [ <=> ] 10.27K --.-KB/s in 0s \n", "\n", "2019-03-18 14:16:24 (27.5 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html?C=N;O=D' saved [10512]\n", "\n", "Removing owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html?C=N;O=D since it should be rejected.\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/?C=M;O=A\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html?C=M;O=A'\n", "\n", "owl.fish.washington [ <=> ] 10.27K --.-KB/s in 0s \n", "\n", "2019-03-18 14:16:24 (34.3 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html?C=M;O=A' saved [10512]\n", "\n", "Removing owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html?C=M;O=A since it should be rejected.\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/?C=S;O=A\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html?C=S;O=A'\n", "\n", "owl.fish.washington [ <=> ] 10.27K --.-KB/s in 0s \n", "\n", "2019-03-18 14:16:24 (61.1 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html?C=S;O=A' saved [10512]\n", "\n", "Removing owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html?C=S;O=A since it should be rejected.\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/?C=D;O=A\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html?C=D;O=A'\n", "\n", "owl.fish.washington [ <=> ] 10.27K --.-KB/s in 0s \n", "\n", "2019-03-18 14:16:24 (61.5 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html?C=D;O=A' saved [10512]\n", "\n", "Removing owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/index.html?C=D;O=A since it should be rejected.\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/multiqc_data/\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 2131 (2.1K) [text/html]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/multiqc_data/index.html'\n", "\n", "owl.fish.washington 100%[===================>] 2.08K --.-KB/s in 0s \n", "\n", "2019-03-18 14:16:24 (226 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/multiqc_data/index.html' saved [2131/2131]\n", "\n", "Removing owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/multiqc_data/index.html since it should be rejected.\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_1_s1_R1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 312086 (305K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_1_s1_R1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 304.77K --.-KB/s in 0.007s \n", "\n", "2019-03-18 14:16:24 (42.2 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_1_s1_R1_fastqc.zip' saved [312086/312086]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_1_s1_R2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 313620 (306K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_1_s1_R2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 306.27K --.-KB/s in 0.004s \n", "\n", "2019-03-18 14:16:24 (66.7 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_1_s1_R2_fastqc.zip' saved [313620/313620]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_2_s1_R1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 300114 (293K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_2_s1_R1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 293.08K --.-KB/s in 0.004s \n", "\n", "2019-03-18 14:16:24 (69.2 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_2_s1_R1_fastqc.zip' saved [300114/300114]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_2_s1_R2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 304039 (297K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_2_s1_R2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 296.91K --.-KB/s in 0.009s \n", "\n", "2019-03-18 14:16:24 (32.1 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_2_s1_R2_fastqc.zip' saved [304039/304039]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_3_s1_R1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 298747 (292K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_3_s1_R1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 291.75K --.-KB/s in 0.003s \n", "\n", "2019-03-18 14:16:24 (91.4 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_3_s1_R1_fastqc.zip' saved [298747/298747]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_3_s1_R2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 297104 (290K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_3_s1_R2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 290.14K --.-KB/s in 0.003s \n", "\n", "2019-03-18 14:16:24 (88.8 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_3_s1_R2_fastqc.zip' saved [297104/297104]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_4_s1_R1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 296718 (290K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_4_s1_R1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 289.76K --.-KB/s in 0.004s \n", "\n", "2019-03-18 14:16:24 (76.5 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_4_s1_R1_fastqc.zip' saved [296718/296718]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_4_s1_R2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 296519 (290K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_4_s1_R2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 289.57K --.-KB/s in 0.003s \n", "\n", "2019-03-18 14:16:24 (102 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_4_s1_R2_fastqc.zip' saved [296519/296519]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_5_s1_R1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 298644 (292K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_5_s1_R1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 291.64K --.-KB/s in 0.003s \n", "\n", "2019-03-18 14:16:24 (101 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_5_s1_R1_fastqc.zip' saved [298644/298644]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_5_s1_R2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 297307 (290K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_5_s1_R2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 290.34K --.-KB/s in 0.002s \n", "\n", "2019-03-18 14:16:24 (129 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_5_s1_R2_fastqc.zip' saved [297307/297307]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_6_s1_R1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 299501 (292K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_6_s1_R1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 292.48K --.-KB/s in 0.002s \n", "\n", "2019-03-18 14:16:24 (131 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_6_s1_R1_fastqc.zip' saved [299501/299501]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_6_s1_R2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 294814 (288K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_6_s1_R2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 287.90K --.-KB/s in 0.002s \n", "\n", "2019-03-18 14:16:24 (115 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_6_s1_R2_fastqc.zip' saved [294814/294814]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_7_s1_R1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 297446 (290K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_7_s1_R1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 290.47K --.-KB/s in 0.01s \n", "\n", "2019-03-18 14:16:24 (28.9 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_7_s1_R1_fastqc.zip' saved [297446/297446]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_7_s1_R2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 296627 (290K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_7_s1_R2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 289.67K --.-KB/s in 0.002s \n", "\n", "2019-03-18 14:16:24 (119 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_7_s1_R2_fastqc.zip' saved [296627/296627]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_8_s1_R1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 301564 (294K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_8_s1_R1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 294.50K --.-KB/s in 0.008s \n", "\n", "2019-03-18 14:16:24 (35.0 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_8_s1_R1_fastqc.zip' saved [301564/301564]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_8_s1_R2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 300521 (293K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_8_s1_R2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 293.48K --.-KB/s in 0.003s \n", "\n", "2019-03-18 14:16:24 (110 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_8_s1_R2_fastqc.zip' saved [300521/300521]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_9_s1_R1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 306406 (299K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_9_s1_R1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 299.22K --.-KB/s in 0.02s \n", "\n", "2019-03-18 14:16:24 (18.2 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_9_s1_R1_fastqc.zip' saved [306406/306406]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_9_s1_R2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 303294 (296K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_9_s1_R2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 296.19K --.-KB/s in 0.003s \n", "\n", "2019-03-18 14:16:24 (106 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_9_s1_R2_fastqc.zip' saved [303294/303294]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_10_s1_R1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 299983 (293K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_10_s1_R1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 292.95K --.-KB/s in 0.003s \n", "\n", "2019-03-18 14:16:24 (87.3 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_10_s1_R1_fastqc.zip' saved [299983/299983]\n", "\n", "--2019-03-18 14:16:24-- http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_10_s1_R2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 298118 (291K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_10_s1_R2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 291.13K --.-KB/s in 0.003s \n", "\n", "2019-03-18 14:16:24 (98.4 MB/s) - 'owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/zr2096_10_s1_R2_fastqc.zip' saved [298118/298118]\n", "\n", "FINISHED --2019-03-18 14:16:24--\n", "Total wall clock time: 14s\n", "Downloaded: 26 files, 5.8M in 0.1s (51.5 MB/s)\n" ] } ], "source": [ "#Download files from owl. The files will be downloaded in the same directory structure they are in online.\n", "!wget -r -l1 --no-parent -A_s1_R1_fastqc.zip -A_s1_R2_fastqc.zip \\\n", "http://owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/" ] }, { "cell_type": "code", "execution_count": 88, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#Move all files from owl folder to the current directory\n", "!mv owl.fish.washington.edu/Athaliana/20180409_fastqc_Cvirginica_MBD/* ." 
] }, { "cell_type": "code", "execution_count": 89, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mmultiqc_data\u001b[m\u001b[m zr2096_4_s1_R2_fastqc.zip\r\n", "\u001b[34mowl.fish.washington.edu\u001b[m\u001b[m zr2096_5_s1_R1_fastqc.zip\r\n", "zr2096_10_s1_R1_fastqc.zip zr2096_5_s1_R2_fastqc.zip\r\n", "zr2096_10_s1_R2_fastqc.zip zr2096_6_s1_R1_fastqc.zip\r\n", "zr2096_1_s1_R1_fastqc.zip zr2096_6_s1_R2_fastqc.zip\r\n", "zr2096_1_s1_R2_fastqc.zip zr2096_7_s1_R1_fastqc.zip\r\n", "zr2096_2_s1_R1_fastqc.zip zr2096_7_s1_R2_fastqc.zip\r\n", "zr2096_2_s1_R2_fastqc.zip zr2096_8_s1_R1_fastqc.zip\r\n", "zr2096_3_s1_R1_fastqc.zip zr2096_8_s1_R2_fastqc.zip\r\n", "zr2096_3_s1_R2_fastqc.zip zr2096_9_s1_R1_fastqc.zip\r\n", "zr2096_4_s1_R1_fastqc.zip zr2096_9_s1_R2_fastqc.zip\r\n" ] } ], "source": [ "#Confirm all files were moved\n", "!ls" ] }, { "cell_type": "code", "execution_count": 90, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#Remove the empty owl directory\n", "!rm -r owl.fish.washington.edu" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1b. 
Count reads" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, I'll test a loop and ensure it identifies all of the files I want to use by having the loop print the filename of each file (`f`):" ] }, { "cell_type": "code", "execution_count": 91, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "zr2096_10_s1_R1_fastqc.zip\n", "zr2096_10_s1_R2_fastqc.zip\n", "zr2096_1_s1_R1_fastqc.zip\n", "zr2096_1_s1_R2_fastqc.zip\n", "zr2096_2_s1_R1_fastqc.zip\n", "zr2096_2_s1_R2_fastqc.zip\n", "zr2096_3_s1_R1_fastqc.zip\n", "zr2096_3_s1_R2_fastqc.zip\n", "zr2096_4_s1_R1_fastqc.zip\n", "zr2096_4_s1_R2_fastqc.zip\n", "zr2096_5_s1_R1_fastqc.zip\n", "zr2096_5_s1_R2_fastqc.zip\n", "zr2096_6_s1_R1_fastqc.zip\n", "zr2096_6_s1_R2_fastqc.zip\n", "zr2096_7_s1_R1_fastqc.zip\n", "zr2096_7_s1_R2_fastqc.zip\n", "zr2096_8_s1_R1_fastqc.zip\n", "zr2096_8_s1_R2_fastqc.zip\n", "zr2096_9_s1_R1_fastqc.zip\n", "zr2096_9_s1_R2_fastqc.zip\n" ] } ], "source": [ "%%bash\n", "for f in *zip\n", "do\n", " echo ${f}\n", "done" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that I know it works, I'm going to count the number of reads in each file. I will first unzip each file with `unzip`." 
] }, { "cell_type": "code", "execution_count": 92, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Archive: zr2096_10_s1_R1_fastqc.zip\n", " creating: zr2096_10_s1_R1_fastqc/\n", " creating: zr2096_10_s1_R1_fastqc/Icons/\n", " creating: zr2096_10_s1_R1_fastqc/Images/\n", " inflating: zr2096_10_s1_R1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_10_s1_R1_fastqc/Icons/warning.png \n", " inflating: zr2096_10_s1_R1_fastqc/Icons/error.png \n", " inflating: zr2096_10_s1_R1_fastqc/Icons/tick.png \n", " inflating: zr2096_10_s1_R1_fastqc/summary.txt \n", " inflating: zr2096_10_s1_R1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_10_s1_R1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_10_s1_R1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_10_s1_R1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_10_s1_R1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_10_s1_R1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_10_s1_R1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_10_s1_R1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_10_s1_R1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_10_s1_R1_fastqc/fastqc_report.html \n", " inflating: zr2096_10_s1_R1_fastqc/fastqc_data.txt \n", " inflating: zr2096_10_s1_R1_fastqc/fastqc.fo \n", "Archive: zr2096_10_s1_R2_fastqc.zip\n", " creating: zr2096_10_s1_R2_fastqc/\n", " creating: zr2096_10_s1_R2_fastqc/Icons/\n", " creating: zr2096_10_s1_R2_fastqc/Images/\n", " inflating: zr2096_10_s1_R2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_10_s1_R2_fastqc/Icons/warning.png \n", " inflating: zr2096_10_s1_R2_fastqc/Icons/error.png \n", " inflating: zr2096_10_s1_R2_fastqc/Icons/tick.png \n", " inflating: zr2096_10_s1_R2_fastqc/summary.txt \n", " inflating: zr2096_10_s1_R2_fastqc/Images/per_base_quality.png \n", " inflating: 
zr2096_10_s1_R2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_10_s1_R2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_10_s1_R2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_10_s1_R2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_10_s1_R2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_10_s1_R2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_10_s1_R2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_10_s1_R2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_10_s1_R2_fastqc/fastqc_report.html \n", " inflating: zr2096_10_s1_R2_fastqc/fastqc_data.txt \n", " inflating: zr2096_10_s1_R2_fastqc/fastqc.fo \n", "Archive: zr2096_1_s1_R1_fastqc.zip\n", " creating: zr2096_1_s1_R1_fastqc/\n", " creating: zr2096_1_s1_R1_fastqc/Icons/\n", " creating: zr2096_1_s1_R1_fastqc/Images/\n", " inflating: zr2096_1_s1_R1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_1_s1_R1_fastqc/Icons/warning.png \n", " inflating: zr2096_1_s1_R1_fastqc/Icons/error.png \n", " inflating: zr2096_1_s1_R1_fastqc/Icons/tick.png \n", " inflating: zr2096_1_s1_R1_fastqc/summary.txt \n", " inflating: zr2096_1_s1_R1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_1_s1_R1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_1_s1_R1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_1_s1_R1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_1_s1_R1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_1_s1_R1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_1_s1_R1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_1_s1_R1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_1_s1_R1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_1_s1_R1_fastqc/fastqc_report.html \n", " inflating: zr2096_1_s1_R1_fastqc/fastqc_data.txt \n", " inflating: 
zr2096_1_s1_R1_fastqc/fastqc.fo \n", "Archive: zr2096_1_s1_R2_fastqc.zip\n", " creating: zr2096_1_s1_R2_fastqc/\n", " creating: zr2096_1_s1_R2_fastqc/Icons/\n", " creating: zr2096_1_s1_R2_fastqc/Images/\n", " inflating: zr2096_1_s1_R2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_1_s1_R2_fastqc/Icons/warning.png \n", " inflating: zr2096_1_s1_R2_fastqc/Icons/error.png \n", " inflating: zr2096_1_s1_R2_fastqc/Icons/tick.png \n", " inflating: zr2096_1_s1_R2_fastqc/summary.txt \n", " inflating: zr2096_1_s1_R2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_1_s1_R2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_1_s1_R2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_1_s1_R2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_1_s1_R2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_1_s1_R2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_1_s1_R2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_1_s1_R2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_1_s1_R2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_1_s1_R2_fastqc/fastqc_report.html \n", " inflating: zr2096_1_s1_R2_fastqc/fastqc_data.txt \n", " inflating: zr2096_1_s1_R2_fastqc/fastqc.fo \n", "Archive: zr2096_2_s1_R1_fastqc.zip\n", " creating: zr2096_2_s1_R1_fastqc/\n", " creating: zr2096_2_s1_R1_fastqc/Icons/\n", " creating: zr2096_2_s1_R1_fastqc/Images/\n", " inflating: zr2096_2_s1_R1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_2_s1_R1_fastqc/Icons/warning.png \n", " inflating: zr2096_2_s1_R1_fastqc/Icons/error.png \n", " inflating: zr2096_2_s1_R1_fastqc/Icons/tick.png \n", " inflating: zr2096_2_s1_R1_fastqc/summary.txt \n", " inflating: zr2096_2_s1_R1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_2_s1_R1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_2_s1_R1_fastqc/Images/per_sequence_quality.png \n", " inflating: 
zr2096_2_s1_R1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_2_s1_R1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_2_s1_R1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_2_s1_R1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_2_s1_R1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_2_s1_R1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_2_s1_R1_fastqc/fastqc_report.html \n", " inflating: zr2096_2_s1_R1_fastqc/fastqc_data.txt \n", " inflating: zr2096_2_s1_R1_fastqc/fastqc.fo \n", "Archive: zr2096_2_s1_R2_fastqc.zip\n", " creating: zr2096_2_s1_R2_fastqc/\n", " creating: zr2096_2_s1_R2_fastqc/Icons/\n", " creating: zr2096_2_s1_R2_fastqc/Images/\n", " inflating: zr2096_2_s1_R2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_2_s1_R2_fastqc/Icons/warning.png \n", " inflating: zr2096_2_s1_R2_fastqc/Icons/error.png \n", " inflating: zr2096_2_s1_R2_fastqc/Icons/tick.png \n", " inflating: zr2096_2_s1_R2_fastqc/summary.txt \n", " inflating: zr2096_2_s1_R2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_2_s1_R2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_2_s1_R2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_2_s1_R2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_2_s1_R2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_2_s1_R2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_2_s1_R2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_2_s1_R2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_2_s1_R2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_2_s1_R2_fastqc/fastqc_report.html \n", " inflating: zr2096_2_s1_R2_fastqc/fastqc_data.txt \n", " inflating: zr2096_2_s1_R2_fastqc/fastqc.fo \n", "Archive: zr2096_3_s1_R1_fastqc.zip\n", " creating: zr2096_3_s1_R1_fastqc/\n", " creating: zr2096_3_s1_R1_fastqc/Icons/\n", " creating: 
zr2096_3_s1_R1_fastqc/Images/\n", " inflating: zr2096_3_s1_R1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_3_s1_R1_fastqc/Icons/warning.png \n", " inflating: zr2096_3_s1_R1_fastqc/Icons/error.png \n", " inflating: zr2096_3_s1_R1_fastqc/Icons/tick.png \n", " inflating: zr2096_3_s1_R1_fastqc/summary.txt \n", " inflating: zr2096_3_s1_R1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_3_s1_R1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_3_s1_R1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_3_s1_R1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_3_s1_R1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_3_s1_R1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_3_s1_R1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_3_s1_R1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_3_s1_R1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_3_s1_R1_fastqc/fastqc_report.html \n", " inflating: zr2096_3_s1_R1_fastqc/fastqc_data.txt \n", " inflating: zr2096_3_s1_R1_fastqc/fastqc.fo \n", "Archive: zr2096_3_s1_R2_fastqc.zip\n", " creating: zr2096_3_s1_R2_fastqc/\n", " creating: zr2096_3_s1_R2_fastqc/Icons/\n", " creating: zr2096_3_s1_R2_fastqc/Images/\n", " inflating: zr2096_3_s1_R2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_3_s1_R2_fastqc/Icons/warning.png \n", " inflating: zr2096_3_s1_R2_fastqc/Icons/error.png \n", " inflating: zr2096_3_s1_R2_fastqc/Icons/tick.png \n", " inflating: zr2096_3_s1_R2_fastqc/summary.txt \n", " inflating: zr2096_3_s1_R2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_3_s1_R2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_3_s1_R2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_3_s1_R2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_3_s1_R2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: 
zr2096_3_s1_R2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_3_s1_R2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_3_s1_R2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_3_s1_R2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_3_s1_R2_fastqc/fastqc_report.html \n", " inflating: zr2096_3_s1_R2_fastqc/fastqc_data.txt \n", " inflating: zr2096_3_s1_R2_fastqc/fastqc.fo \n", "Archive: zr2096_4_s1_R1_fastqc.zip\n", " creating: zr2096_4_s1_R1_fastqc/\n", " creating: zr2096_4_s1_R1_fastqc/Icons/\n", " creating: zr2096_4_s1_R1_fastqc/Images/\n", " inflating: zr2096_4_s1_R1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_4_s1_R1_fastqc/Icons/warning.png \n", " inflating: zr2096_4_s1_R1_fastqc/Icons/error.png \n", " inflating: zr2096_4_s1_R1_fastqc/Icons/tick.png \n", " inflating: zr2096_4_s1_R1_fastqc/summary.txt \n", " inflating: zr2096_4_s1_R1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_4_s1_R1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_4_s1_R1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_4_s1_R1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_4_s1_R1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_4_s1_R1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_4_s1_R1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_4_s1_R1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_4_s1_R1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_4_s1_R1_fastqc/fastqc_report.html \n", " inflating: zr2096_4_s1_R1_fastqc/fastqc_data.txt \n", " inflating: zr2096_4_s1_R1_fastqc/fastqc.fo \n", "Archive: zr2096_4_s1_R2_fastqc.zip\n", " creating: zr2096_4_s1_R2_fastqc/\n", " creating: zr2096_4_s1_R2_fastqc/Icons/\n", " creating: zr2096_4_s1_R2_fastqc/Images/\n", " inflating: zr2096_4_s1_R2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_4_s1_R2_fastqc/Icons/warning.png \n", " 
inflating: zr2096_4_s1_R2_fastqc/Icons/error.png \n", " inflating: zr2096_4_s1_R2_fastqc/Icons/tick.png \n", " inflating: zr2096_4_s1_R2_fastqc/summary.txt \n", " inflating: zr2096_4_s1_R2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_4_s1_R2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_4_s1_R2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_4_s1_R2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_4_s1_R2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_4_s1_R2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_4_s1_R2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_4_s1_R2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_4_s1_R2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_4_s1_R2_fastqc/fastqc_report.html \n", " inflating: zr2096_4_s1_R2_fastqc/fastqc_data.txt \n", " inflating: zr2096_4_s1_R2_fastqc/fastqc.fo \n", "Archive: zr2096_5_s1_R1_fastqc.zip\n", " creating: zr2096_5_s1_R1_fastqc/\n", " creating: zr2096_5_s1_R1_fastqc/Icons/\n", " creating: zr2096_5_s1_R1_fastqc/Images/\n", " inflating: zr2096_5_s1_R1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_5_s1_R1_fastqc/Icons/warning.png \n", " inflating: zr2096_5_s1_R1_fastqc/Icons/error.png \n", " inflating: zr2096_5_s1_R1_fastqc/Icons/tick.png \n", " inflating: zr2096_5_s1_R1_fastqc/summary.txt \n", " inflating: zr2096_5_s1_R1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_5_s1_R1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_5_s1_R1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_5_s1_R1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_5_s1_R1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_5_s1_R1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_5_s1_R1_fastqc/Images/sequence_length_distribution.png \n", " inflating: 
zr2096_5_s1_R1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_5_s1_R1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_5_s1_R1_fastqc/fastqc_report.html \n", " inflating: zr2096_5_s1_R1_fastqc/fastqc_data.txt \n", " inflating: zr2096_5_s1_R1_fastqc/fastqc.fo \n", "Archive: zr2096_5_s1_R2_fastqc.zip\n", " creating: zr2096_5_s1_R2_fastqc/\n", " creating: zr2096_5_s1_R2_fastqc/Icons/\n", " creating: zr2096_5_s1_R2_fastqc/Images/\n", " inflating: zr2096_5_s1_R2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_5_s1_R2_fastqc/Icons/warning.png \n", " inflating: zr2096_5_s1_R2_fastqc/Icons/error.png \n", " inflating: zr2096_5_s1_R2_fastqc/Icons/tick.png \n", " inflating: zr2096_5_s1_R2_fastqc/summary.txt \n", " inflating: zr2096_5_s1_R2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_5_s1_R2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_5_s1_R2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_5_s1_R2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_5_s1_R2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_5_s1_R2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_5_s1_R2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_5_s1_R2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_5_s1_R2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_5_s1_R2_fastqc/fastqc_report.html \n", " inflating: zr2096_5_s1_R2_fastqc/fastqc_data.txt \n", " inflating: zr2096_5_s1_R2_fastqc/fastqc.fo \n", "Archive: zr2096_6_s1_R1_fastqc.zip\n", " creating: zr2096_6_s1_R1_fastqc/\n", " creating: zr2096_6_s1_R1_fastqc/Icons/\n", " creating: zr2096_6_s1_R1_fastqc/Images/\n", " inflating: zr2096_6_s1_R1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_6_s1_R1_fastqc/Icons/warning.png \n", " inflating: zr2096_6_s1_R1_fastqc/Icons/error.png \n", " inflating: zr2096_6_s1_R1_fastqc/Icons/tick.png \n", " inflating: 
zr2096_6_s1_R1_fastqc/summary.txt \n", " inflating: zr2096_6_s1_R1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_6_s1_R1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_6_s1_R1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_6_s1_R1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_6_s1_R1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_6_s1_R1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_6_s1_R1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_6_s1_R1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_6_s1_R1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_6_s1_R1_fastqc/fastqc_report.html \n", " inflating: zr2096_6_s1_R1_fastqc/fastqc_data.txt \n", " inflating: zr2096_6_s1_R1_fastqc/fastqc.fo \n", "Archive: zr2096_6_s1_R2_fastqc.zip\n", " creating: zr2096_6_s1_R2_fastqc/\n", " creating: zr2096_6_s1_R2_fastqc/Icons/\n", " creating: zr2096_6_s1_R2_fastqc/Images/\n", " inflating: zr2096_6_s1_R2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_6_s1_R2_fastqc/Icons/warning.png \n", " inflating: zr2096_6_s1_R2_fastqc/Icons/error.png \n", " inflating: zr2096_6_s1_R2_fastqc/Icons/tick.png \n", " inflating: zr2096_6_s1_R2_fastqc/summary.txt \n", " inflating: zr2096_6_s1_R2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_6_s1_R2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_6_s1_R2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_6_s1_R2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_6_s1_R2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_6_s1_R2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_6_s1_R2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_6_s1_R2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_6_s1_R2_fastqc/Images/adapter_content.png \n", " inflating: 
zr2096_6_s1_R2_fastqc/fastqc_report.html \n", " inflating: zr2096_6_s1_R2_fastqc/fastqc_data.txt \n", " inflating: zr2096_6_s1_R2_fastqc/fastqc.fo \n", "Archive: zr2096_7_s1_R1_fastqc.zip\n", " creating: zr2096_7_s1_R1_fastqc/\n", " creating: zr2096_7_s1_R1_fastqc/Icons/\n", " creating: zr2096_7_s1_R1_fastqc/Images/\n", " inflating: zr2096_7_s1_R1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_7_s1_R1_fastqc/Icons/warning.png \n", " inflating: zr2096_7_s1_R1_fastqc/Icons/error.png \n", " inflating: zr2096_7_s1_R1_fastqc/Icons/tick.png \n", " inflating: zr2096_7_s1_R1_fastqc/summary.txt \n", " inflating: zr2096_7_s1_R1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_7_s1_R1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_7_s1_R1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_7_s1_R1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_7_s1_R1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_7_s1_R1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_7_s1_R1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_7_s1_R1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_7_s1_R1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_7_s1_R1_fastqc/fastqc_report.html \n", " inflating: zr2096_7_s1_R1_fastqc/fastqc_data.txt \n", " inflating: zr2096_7_s1_R1_fastqc/fastqc.fo \n", "Archive: zr2096_7_s1_R2_fastqc.zip\n", " creating: zr2096_7_s1_R2_fastqc/\n", " creating: zr2096_7_s1_R2_fastqc/Icons/\n", " creating: zr2096_7_s1_R2_fastqc/Images/\n", " inflating: zr2096_7_s1_R2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_7_s1_R2_fastqc/Icons/warning.png \n", " inflating: zr2096_7_s1_R2_fastqc/Icons/error.png \n", " inflating: zr2096_7_s1_R2_fastqc/Icons/tick.png \n", " inflating: zr2096_7_s1_R2_fastqc/summary.txt \n", " inflating: zr2096_7_s1_R2_fastqc/Images/per_base_quality.png \n", " inflating: 
zr2096_7_s1_R2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_7_s1_R2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_7_s1_R2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_7_s1_R2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_7_s1_R2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_7_s1_R2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_7_s1_R2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_7_s1_R2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_7_s1_R2_fastqc/fastqc_report.html \n", " inflating: zr2096_7_s1_R2_fastqc/fastqc_data.txt \n", " inflating: zr2096_7_s1_R2_fastqc/fastqc.fo \n", "Archive: zr2096_8_s1_R1_fastqc.zip\n", " creating: zr2096_8_s1_R1_fastqc/\n", " creating: zr2096_8_s1_R1_fastqc/Icons/\n", " creating: zr2096_8_s1_R1_fastqc/Images/\n", " inflating: zr2096_8_s1_R1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_8_s1_R1_fastqc/Icons/warning.png \n", " inflating: zr2096_8_s1_R1_fastqc/Icons/error.png \n", " inflating: zr2096_8_s1_R1_fastqc/Icons/tick.png \n", " inflating: zr2096_8_s1_R1_fastqc/summary.txt \n", " inflating: zr2096_8_s1_R1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_8_s1_R1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_8_s1_R1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_8_s1_R1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_8_s1_R1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_8_s1_R1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_8_s1_R1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_8_s1_R1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_8_s1_R1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_8_s1_R1_fastqc/fastqc_report.html \n", " inflating: zr2096_8_s1_R1_fastqc/fastqc_data.txt \n", " inflating: zr2096_8_s1_R1_fastqc/fastqc.fo \n", 
"Archive: zr2096_8_s1_R2_fastqc.zip\n", " creating: zr2096_8_s1_R2_fastqc/\n", " creating: zr2096_8_s1_R2_fastqc/Icons/\n", " creating: zr2096_8_s1_R2_fastqc/Images/\n", " inflating: zr2096_8_s1_R2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_8_s1_R2_fastqc/Icons/warning.png \n", " inflating: zr2096_8_s1_R2_fastqc/Icons/error.png \n", " inflating: zr2096_8_s1_R2_fastqc/Icons/tick.png \n", " inflating: zr2096_8_s1_R2_fastqc/summary.txt \n", " inflating: zr2096_8_s1_R2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_8_s1_R2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_8_s1_R2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_8_s1_R2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_8_s1_R2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_8_s1_R2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_8_s1_R2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_8_s1_R2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_8_s1_R2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_8_s1_R2_fastqc/fastqc_report.html \n", " inflating: zr2096_8_s1_R2_fastqc/fastqc_data.txt \n", " inflating: zr2096_8_s1_R2_fastqc/fastqc.fo \n", "Archive: zr2096_9_s1_R1_fastqc.zip\n", " creating: zr2096_9_s1_R1_fastqc/\n", " creating: zr2096_9_s1_R1_fastqc/Icons/\n", " creating: zr2096_9_s1_R1_fastqc/Images/\n", " inflating: zr2096_9_s1_R1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_9_s1_R1_fastqc/Icons/warning.png \n", " inflating: zr2096_9_s1_R1_fastqc/Icons/error.png \n", " inflating: zr2096_9_s1_R1_fastqc/Icons/tick.png \n", " inflating: zr2096_9_s1_R1_fastqc/summary.txt \n", " inflating: zr2096_9_s1_R1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_9_s1_R1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_9_s1_R1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_9_s1_R1_fastqc/Images/per_base_sequence_content.png 
\n", " inflating: zr2096_9_s1_R1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_9_s1_R1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_9_s1_R1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_9_s1_R1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_9_s1_R1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_9_s1_R1_fastqc/fastqc_report.html \n", " inflating: zr2096_9_s1_R1_fastqc/fastqc_data.txt \n", " inflating: zr2096_9_s1_R1_fastqc/fastqc.fo \n", "Archive: zr2096_9_s1_R2_fastqc.zip\n", " creating: zr2096_9_s1_R2_fastqc/\n", " creating: zr2096_9_s1_R2_fastqc/Icons/\n", " creating: zr2096_9_s1_R2_fastqc/Images/\n", " inflating: zr2096_9_s1_R2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_9_s1_R2_fastqc/Icons/warning.png \n", " inflating: zr2096_9_s1_R2_fastqc/Icons/error.png \n", " inflating: zr2096_9_s1_R2_fastqc/Icons/tick.png \n", " inflating: zr2096_9_s1_R2_fastqc/summary.txt \n", " inflating: zr2096_9_s1_R2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_9_s1_R2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_9_s1_R2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_9_s1_R2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_9_s1_R2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_9_s1_R2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_9_s1_R2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_9_s1_R2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_9_s1_R2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_9_s1_R2_fastqc/fastqc_report.html \n", " inflating: zr2096_9_s1_R2_fastqc/fastqc_data.txt \n", " inflating: zr2096_9_s1_R2_fastqc/fastqc.fo \n" ] } ], "source": [ "%%bash\n", "for f in *zip\n", "do\n", " unzip ${f}\n", "done" ] }, { "cell_type": "code", "execution_count": 93, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", 
"output_type": "stream", "text": [ "\u001b[34mmultiqc_data\u001b[m\u001b[m \u001b[34mzr2096_5_s1_R1_fastqc\u001b[m\u001b[m\r\n", "\u001b[34mzr2096_10_s1_R1_fastqc\u001b[m\u001b[m zr2096_5_s1_R1_fastqc.zip\r\n", "zr2096_10_s1_R1_fastqc.zip \u001b[34mzr2096_5_s1_R2_fastqc\u001b[m\u001b[m\r\n", "\u001b[34mzr2096_10_s1_R2_fastqc\u001b[m\u001b[m zr2096_5_s1_R2_fastqc.zip\r\n", "zr2096_10_s1_R2_fastqc.zip \u001b[34mzr2096_6_s1_R1_fastqc\u001b[m\u001b[m\r\n", "\u001b[34mzr2096_1_s1_R1_fastqc\u001b[m\u001b[m zr2096_6_s1_R1_fastqc.zip\r\n", "zr2096_1_s1_R1_fastqc.zip \u001b[34mzr2096_6_s1_R2_fastqc\u001b[m\u001b[m\r\n", "\u001b[34mzr2096_1_s1_R2_fastqc\u001b[m\u001b[m zr2096_6_s1_R2_fastqc.zip\r\n", "zr2096_1_s1_R2_fastqc.zip \u001b[34mzr2096_7_s1_R1_fastqc\u001b[m\u001b[m\r\n", "\u001b[34mzr2096_2_s1_R1_fastqc\u001b[m\u001b[m zr2096_7_s1_R1_fastqc.zip\r\n", "zr2096_2_s1_R1_fastqc.zip \u001b[34mzr2096_7_s1_R2_fastqc\u001b[m\u001b[m\r\n", "\u001b[34mzr2096_2_s1_R2_fastqc\u001b[m\u001b[m zr2096_7_s1_R2_fastqc.zip\r\n", "zr2096_2_s1_R2_fastqc.zip \u001b[34mzr2096_8_s1_R1_fastqc\u001b[m\u001b[m\r\n", "\u001b[34mzr2096_3_s1_R1_fastqc\u001b[m\u001b[m zr2096_8_s1_R1_fastqc.zip\r\n", "zr2096_3_s1_R1_fastqc.zip \u001b[34mzr2096_8_s1_R2_fastqc\u001b[m\u001b[m\r\n", "\u001b[34mzr2096_3_s1_R2_fastqc\u001b[m\u001b[m zr2096_8_s1_R2_fastqc.zip\r\n", "zr2096_3_s1_R2_fastqc.zip \u001b[34mzr2096_9_s1_R1_fastqc\u001b[m\u001b[m\r\n", "\u001b[34mzr2096_4_s1_R1_fastqc\u001b[m\u001b[m zr2096_9_s1_R1_fastqc.zip\r\n", "zr2096_4_s1_R1_fastqc.zip \u001b[34mzr2096_9_s1_R2_fastqc\u001b[m\u001b[m\r\n", "\u001b[34mzr2096_4_s1_R2_fastqc\u001b[m\u001b[m zr2096_9_s1_R2_fastqc.zip\r\n", "zr2096_4_s1_R2_fastqc.zip\r\n" ] } ], "source": [ "#Confirm files were unzipped\n", "!ls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, I'll use `grep` to identify \"Total Sequences\" within each sample file. 
Using `>>`, I append each match to the same output file as the loop runs, so all of the counts end up in one new file." ] }, { "cell_type": "code", "execution_count": 94, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%%bash\n", "#Append the Total Sequences line from each report. Delete the output file before re-running to avoid duplicate lines.\n", "for f in *fastqc\n", "do\n", " grep \"Total Sequences *\" \"${f}/fastqc_data.txt\" \\\n", " >> 2019-03-17-Untrimmed-Read-Counts.txt\n", "done" ] }, { "cell_type": "code", "execution_count": 95, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total Sequences\t17717127\r\n", "Total Sequences\t17717127\r\n", "Total Sequences\t28982766\r\n", "Total Sequences\t28982766\r\n", "Total Sequences\t30798582\r\n", "Total Sequences\t30798582\r\n", "Total Sequences\t29892002\r\n", "Total Sequences\t29892002\r\n", "Total Sequences\t24341968\r\n", "Total Sequences\t24341968\r\n" ] } ], "source": [ "#Confirm total sequences were counted. The first 2 lines correspond to sample 10.\n", "!head 2019-03-17-Untrimmed-Read-Counts.txt" ] }, { "cell_type": "code", "execution_count": 96, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "279681264\r\n" ] } ], "source": [ "#Sum the second column ($2), halving each value because the R1 and R2 reports for a sample list the same count. The result is the total number of read pairs.\n", "!awk -F\"\\t\" '{ sum += $2 / 2 } END { print sum }' 2019-03-17-Untrimmed-Read-Counts.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Trimmed files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since FastQC was run on my files after they were trimmed with TrimGalore, I can use the FastQC reports to get read counts for each file. In the Basic Statistics module, FastQC reports Total Sequences (i.e. total reads) after trimming." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2a. 
Download files" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/yaamini/Documents/paper-gonad-meth/data/2019-03-17-Counting-Reads\n" ] } ], "source": [ "cd .." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": true }, "outputs": [], "source": [ "!mkdir 2019-03-17-FastQC-Reports" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/yaamini/Documents/paper-gonad-meth/data/2019-03-17-Counting-Reads/2019-03-17-FastQC-Reports\n" ] } ], "source": [ "cd 2019-03-17-FastQC-Reports/" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--2019-03-18 09:39:10-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/\n", "Resolving owl.fish.washington.edu... 128.95.149.83\n", "Connecting to owl.fish.washington.edu|128.95.149.83|:80... connected.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html'\n", "\n", "owl.fish.washington [ <=> ] 10.61K --.-KB/s in 0s \n", "\n", "2019-03-18 09:39:10 (61.3 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html' saved [10864]\n", "\n", "Loading robots.txt; please ignore errors.\n", "--2019-03-18 09:39:10-- http://owl.fish.washington.edu/robots.txt\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
404 Not Found\n", "2019-03-18 09:39:10 ERROR 404: Not Found.\n", "\n", "Removing owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html since it should be rejected.\n", "\n", "--2019-03-18 09:39:10-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/?C=N;O=D\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html?C=N;O=D'\n", "\n", "owl.fish.washington [ <=> ] 10.61K --.-KB/s in 0s \n", "\n", "2019-03-18 09:39:11 (39.4 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html?C=N;O=D' saved [10864]\n", "\n", "Removing owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html?C=N;O=D since it should be rejected.\n", "\n", "--2019-03-18 09:39:11-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/?C=M;O=A\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html?C=M;O=A'\n", "\n", "owl.fish.washington [ <=> ] 10.61K --.-KB/s in 0s \n", "\n", "2019-03-18 09:39:11 (56.0 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html?C=M;O=A' saved [10864]\n", "\n", "Removing owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html?C=M;O=A since it should be rejected.\n", "\n", "--2019-03-18 09:39:11-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/?C=S;O=A\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html?C=S;O=A'\n", "\n", "owl.fish.washington [ <=> ] 10.61K --.-KB/s in 0s \n", "\n", "2019-03-18 09:39:11 (66.8 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html?C=S;O=A' saved [10864]\n", "\n", "Removing owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html?C=S;O=A since it should be rejected.\n", "\n", "--2019-03-18 09:39:11-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/?C=D;O=A\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html?C=D;O=A'\n", "\n", "owl.fish.washington [ <=> ] 10.61K --.-KB/s in 0s \n", "\n", "2019-03-18 09:39:11 (70.5 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html?C=D;O=A' saved [10864]\n", "\n", "Removing owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/index.html?C=D;O=A since it should be rejected.\n", "\n", "--2019-03-18 09:39:11-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/multiqc_data/\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 2492 (2.4K) [text/html]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/multiqc_data/index.html'\n", "\n", "owl.fish.washington 100%[===================>] 2.43K --.-KB/s in 0s \n", "\n", "2019-03-18 09:39:11 (88.0 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/multiqc_data/index.html' saved [2492/2492]\n", "\n", "Removing owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/multiqc_data/index.html since it should be rejected.\n", "\n", "--2019-03-18 09:39:11-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_1_s1_R1_val_1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 295534 (289K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_1_s1_R1_val_1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 288.61K 265KB/s in 24s \n", "\n", "2019-03-18 09:39:36 (11.8 KB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_1_s1_R1_val_1_fastqc.zip' saved [295534/295534]\n", "\n", "--2019-03-18 09:39:36-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_1_s1_R2_val_2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 284342 (278K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_1_s1_R2_val_2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 277.68K --.-KB/s in 0.005s \n", "\n", "2019-03-18 09:39:36 (49.4 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_1_s1_R2_val_2_fastqc.zip' saved [284342/284342]\n", "\n", "--2019-03-18 09:39:36-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_2_s1_R1_val_1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 282490 (276K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_2_s1_R1_val_1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 275.87K --.-KB/s in 0.004s \n", "\n", "2019-03-18 09:39:36 (60.8 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_2_s1_R1_val_1_fastqc.zip' saved [282490/282490]\n", "\n", "--2019-03-18 09:39:36-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_2_s1_R2_val_2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 277353 (271K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_2_s1_R2_val_2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 270.85K --.-KB/s in 0.009s \n", "\n", "2019-03-18 09:39:37 (28.6 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_2_s1_R2_val_2_fastqc.zip' saved [277353/277353]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_3_s1_R1_val_1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 281947 (275K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_3_s1_R1_val_1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 275.34K --.-KB/s in 0.02s \n", "\n", "2019-03-18 09:39:37 (17.4 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_3_s1_R1_val_1_fastqc.zip' saved [281947/281947]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_3_s1_R2_val_2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 274139 (268K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_3_s1_R2_val_2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 267.71K --.-KB/s in 0.02s \n", "\n", "2019-03-18 09:39:37 (13.4 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_3_s1_R2_val_2_fastqc.zip' saved [274139/274139]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_4_s1_R1_val_1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 277397 (271K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_4_s1_R1_val_1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 270.90K --.-KB/s in 0.003s \n", "\n", "2019-03-18 09:39:37 (89.4 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_4_s1_R1_val_1_fastqc.zip' saved [277397/277397]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_4_s1_R2_val_2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 269182 (263K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_4_s1_R2_val_2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 262.87K --.-KB/s in 0.003s \n", "\n", "2019-03-18 09:39:37 (94.2 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_4_s1_R2_val_2_fastqc.zip' saved [269182/269182]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_5_s1_R1_val_1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 277246 (271K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_5_s1_R1_val_1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 270.75K --.-KB/s in 0.03s \n", "\n", "2019-03-18 09:39:37 (9.42 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_5_s1_R1_val_1_fastqc.zip' saved [277246/277246]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_5_s1_R2_val_2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 271336 (265K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_5_s1_R2_val_2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 264.98K --.-KB/s in 0.003s \n", "\n", "2019-03-18 09:39:37 (75.6 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_5_s1_R2_val_2_fastqc.zip' saved [271336/271336]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_6_s1_R1_val_1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 277816 (271K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_6_s1_R1_val_1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 271.30K 1.59MB/s in 0.2s \n", "\n", "2019-03-18 09:39:37 (1.59 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_6_s1_R1_val_1_fastqc.zip' saved [277816/277816]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_6_s1_R2_val_2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 271808 (265K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_6_s1_R2_val_2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 265.44K --.-KB/s in 0.003s \n", "\n", "2019-03-18 09:39:37 (95.6 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_6_s1_R2_val_2_fastqc.zip' saved [271808/271808]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_7_s1_R1_val_1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 278596 (272K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_7_s1_R1_val_1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 272.07K --.-KB/s in 0.002s \n", "\n", "2019-03-18 09:39:37 (129 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_7_s1_R1_val_1_fastqc.zip' saved [278596/278596]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_7_s1_R2_val_2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 274113 (268K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_7_s1_R2_val_2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 267.69K --.-KB/s in 0.02s \n", "\n", "2019-03-18 09:39:37 (11.2 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_7_s1_R2_val_2_fastqc.zip' saved [274113/274113]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_8_s1_R1_val_1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 283261 (277K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_8_s1_R1_val_1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 276.62K --.-KB/s in 0.002s \n", "\n", "2019-03-18 09:39:37 (118 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_8_s1_R1_val_1_fastqc.zip' saved [283261/283261]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_8_s1_R2_val_2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 274615 (268K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_8_s1_R2_val_2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 268.18K --.-KB/s in 0.003s \n", "\n", "2019-03-18 09:39:37 (87.7 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_8_s1_R2_val_2_fastqc.zip' saved [274615/274615]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_9_s1_R1_val_1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 284972 (278K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_9_s1_R1_val_1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 278.29K --.-KB/s in 0.003s \n", "\n", "2019-03-18 09:39:37 (101 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_9_s1_R1_val_1_fastqc.zip' saved [284972/284972]\n", "\n", "--2019-03-18 09:39:37-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_9_s1_R2_val_2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 274089 (268K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_9_s1_R2_val_2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 267.67K --.-KB/s in 0.1s \n", "\n", "2019-03-18 09:39:38 (2.06 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_9_s1_R2_val_2_fastqc.zip' saved [274089/274089]\n", "\n", "--2019-03-18 09:39:38-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_10_s1_R1_val_1_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 286980 (280K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_10_s1_R1_val_1_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 280.25K --.-KB/s in 0.003s \n", "\n", "2019-03-18 09:39:38 (85.4 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_10_s1_R1_val_1_fastqc.zip' saved [286980/286980]\n", "\n", "--2019-03-18 09:39:38-- http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_10_s1_R2_val_2_fastqc.zip\n", "Reusing existing connection to owl.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 273009 (267K) [application/zip]\n", "Saving to: 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_10_s1_R2_val_2_fastqc.zip'\n", "\n", "owl.fish.washington 100%[===================>] 266.61K --.-KB/s in 0.003s \n", "\n", "2019-03-18 09:39:38 (76.3 MB/s) - 'owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/zr2096_10_s1_R2_val_2_fastqc.zip' saved [273009/273009]\n", "\n", "FINISHED --2019-03-18 09:39:38--\n", "Total wall clock time: 28s\n", "Downloaded: 26 files, 5.4M in 25s (221 KB/s)\n" ] } ], "source": [ "#Download files from owl. 
The files will be downloaded in the same directory structure they are in online.\n", "!wget -r -l1 --no-parent -A_fastqc.zip \\\n", "http://owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#Move all files from owl folder to the current directory\n", "!mv owl.fish.washington.edu/Athaliana/20180411_trimgalore_10bp_Cvirginica_MBD/20180411_fastqc_trim_10bp_Cvirginica_MBD/* ." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mmultiqc_data\u001b[m\u001b[m zr2096_4_s1_R2_val_2_fastqc.zip\r\n", "\u001b[34mowl.fish.washington.edu\u001b[m\u001b[m zr2096_5_s1_R1_val_1_fastqc.zip\r\n", "zr2096_10_s1_R1_val_1_fastqc.zip zr2096_5_s1_R2_val_2_fastqc.zip\r\n", "zr2096_10_s1_R2_val_2_fastqc.zip zr2096_6_s1_R1_val_1_fastqc.zip\r\n", "zr2096_1_s1_R1_val_1_fastqc.zip zr2096_6_s1_R2_val_2_fastqc.zip\r\n", "zr2096_1_s1_R2_val_2_fastqc.zip zr2096_7_s1_R1_val_1_fastqc.zip\r\n", "zr2096_2_s1_R1_val_1_fastqc.zip zr2096_7_s1_R2_val_2_fastqc.zip\r\n", "zr2096_2_s1_R2_val_2_fastqc.zip zr2096_8_s1_R1_val_1_fastqc.zip\r\n", "zr2096_3_s1_R1_val_1_fastqc.zip zr2096_8_s1_R2_val_2_fastqc.zip\r\n", "zr2096_3_s1_R2_val_2_fastqc.zip zr2096_9_s1_R1_val_1_fastqc.zip\r\n", "zr2096_4_s1_R1_val_1_fastqc.zip zr2096_9_s1_R2_val_2_fastqc.zip\r\n" ] } ], "source": [ "#Confirm all files were moved\n", "!ls" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#Remove the empty owl directory\n", "!rm -r owl.fish.washington.edu" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2b. 
Count reads" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "zr2096_10_s1_R1_val_1_fastqc.zip\n", "zr2096_10_s1_R2_val_2_fastqc.zip\n", "zr2096_1_s1_R1_val_1_fastqc.zip\n", "zr2096_1_s1_R2_val_2_fastqc.zip\n", "zr2096_2_s1_R1_val_1_fastqc.zip\n", "zr2096_2_s1_R2_val_2_fastqc.zip\n", "zr2096_3_s1_R1_val_1_fastqc.zip\n", "zr2096_3_s1_R2_val_2_fastqc.zip\n", "zr2096_4_s1_R1_val_1_fastqc.zip\n", "zr2096_4_s1_R2_val_2_fastqc.zip\n", "zr2096_5_s1_R1_val_1_fastqc.zip\n", "zr2096_5_s1_R2_val_2_fastqc.zip\n", "zr2096_6_s1_R1_val_1_fastqc.zip\n", "zr2096_6_s1_R2_val_2_fastqc.zip\n", "zr2096_7_s1_R1_val_1_fastqc.zip\n", "zr2096_7_s1_R2_val_2_fastqc.zip\n", "zr2096_8_s1_R1_val_1_fastqc.zip\n", "zr2096_8_s1_R2_val_2_fastqc.zip\n", "zr2096_9_s1_R1_val_1_fastqc.zip\n", "zr2096_9_s1_R2_val_2_fastqc.zip\n" ] } ], "source": [ "%%bash\n", "for f in *zip\n", "do\n", " echo ${f}\n", "done" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Archive: zr2096_10_s1_R1_val_1_fastqc.zip\n", " creating: zr2096_10_s1_R1_val_1_fastqc/\n", " creating: zr2096_10_s1_R1_val_1_fastqc/Icons/\n", " creating: zr2096_10_s1_R1_val_1_fastqc/Images/\n", " inflating: zr2096_10_s1_R1_val_1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/Icons/warning.png \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/Icons/error.png \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/Icons/tick.png \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/summary.txt \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/Images/per_base_sequence_content.png \n", " inflating: 
zr2096_10_s1_R1_val_1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/fastqc_report.html \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/fastqc_data.txt \n", " inflating: zr2096_10_s1_R1_val_1_fastqc/fastqc.fo \n", "Archive: zr2096_10_s1_R2_val_2_fastqc.zip\n", " creating: zr2096_10_s1_R2_val_2_fastqc/\n", " creating: zr2096_10_s1_R2_val_2_fastqc/Icons/\n", " creating: zr2096_10_s1_R2_val_2_fastqc/Images/\n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Icons/warning.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Icons/error.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Icons/tick.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/summary.txt \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/fastqc_report.html \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/fastqc_data.txt \n", " inflating: zr2096_10_s1_R2_val_2_fastqc/fastqc.fo \n", "Archive: 
zr2096_1_s1_R1_val_1_fastqc.zip\n", " creating: zr2096_1_s1_R1_val_1_fastqc/\n", " creating: zr2096_1_s1_R1_val_1_fastqc/Icons/\n", " creating: zr2096_1_s1_R1_val_1_fastqc/Images/\n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Icons/warning.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Icons/error.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Icons/tick.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/summary.txt \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/fastqc_report.html \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/fastqc_data.txt \n", " inflating: zr2096_1_s1_R1_val_1_fastqc/fastqc.fo \n", "Archive: zr2096_1_s1_R2_val_2_fastqc.zip\n", " creating: zr2096_1_s1_R2_val_2_fastqc/\n", " creating: zr2096_1_s1_R2_val_2_fastqc/Icons/\n", " creating: zr2096_1_s1_R2_val_2_fastqc/Images/\n", " inflating: zr2096_1_s1_R2_val_2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/Icons/warning.png \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/Icons/error.png \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/Icons/tick.png \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/summary.txt \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/Images/per_base_quality.png \n", " inflating: 
zr2096_1_s1_R2_val_2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/fastqc_report.html \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/fastqc_data.txt \n", " inflating: zr2096_1_s1_R2_val_2_fastqc/fastqc.fo \n", "Archive: zr2096_2_s1_R1_val_1_fastqc.zip\n", " creating: zr2096_2_s1_R1_val_1_fastqc/\n", " creating: zr2096_2_s1_R1_val_1_fastqc/Icons/\n", " creating: zr2096_2_s1_R1_val_1_fastqc/Images/\n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Icons/warning.png \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Icons/error.png \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Icons/tick.png \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/summary.txt \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/Images/adapter_content.png 
\n", " inflating: zr2096_2_s1_R1_val_1_fastqc/fastqc_report.html \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/fastqc_data.txt \n", " inflating: zr2096_2_s1_R1_val_1_fastqc/fastqc.fo \n", "Archive: zr2096_2_s1_R2_val_2_fastqc.zip\n", " creating: zr2096_2_s1_R2_val_2_fastqc/\n", " creating: zr2096_2_s1_R2_val_2_fastqc/Icons/\n", " creating: zr2096_2_s1_R2_val_2_fastqc/Images/\n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Icons/warning.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Icons/error.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Icons/tick.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/summary.txt \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/fastqc_report.html \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/fastqc_data.txt \n", " inflating: zr2096_2_s1_R2_val_2_fastqc/fastqc.fo \n", "Archive: zr2096_3_s1_R1_val_1_fastqc.zip\n", " creating: zr2096_3_s1_R1_val_1_fastqc/\n", " creating: zr2096_3_s1_R1_val_1_fastqc/Icons/\n", " creating: zr2096_3_s1_R1_val_1_fastqc/Images/\n", " inflating: zr2096_3_s1_R1_val_1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/Icons/warning.png \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/Icons/error.png \n", " inflating: 
zr2096_3_s1_R1_val_1_fastqc/Icons/tick.png \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/summary.txt \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/fastqc_report.html \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/fastqc_data.txt \n", " inflating: zr2096_3_s1_R1_val_1_fastqc/fastqc.fo \n", "Archive: zr2096_3_s1_R2_val_2_fastqc.zip\n", " creating: zr2096_3_s1_R2_val_2_fastqc/\n", " creating: zr2096_3_s1_R2_val_2_fastqc/Icons/\n", " creating: zr2096_3_s1_R2_val_2_fastqc/Images/\n", " inflating: zr2096_3_s1_R2_val_2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/Icons/warning.png \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/Icons/error.png \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/Icons/tick.png \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/summary.txt \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/Images/per_base_n_content.png \n", " inflating: 
zr2096_3_s1_R2_val_2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/fastqc_report.html \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/fastqc_data.txt \n", " inflating: zr2096_3_s1_R2_val_2_fastqc/fastqc.fo \n", "Archive: zr2096_4_s1_R1_val_1_fastqc.zip\n", " creating: zr2096_4_s1_R1_val_1_fastqc/\n", " creating: zr2096_4_s1_R1_val_1_fastqc/Icons/\n", " creating: zr2096_4_s1_R1_val_1_fastqc/Images/\n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Icons/warning.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Icons/error.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Icons/tick.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/summary.txt \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/fastqc_report.html \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/fastqc_data.txt \n", " inflating: zr2096_4_s1_R1_val_1_fastqc/fastqc.fo \n", "Archive: zr2096_4_s1_R2_val_2_fastqc.zip\n", " creating: zr2096_4_s1_R2_val_2_fastqc/\n", " creating: zr2096_4_s1_R2_val_2_fastqc/Icons/\n", " creating: zr2096_4_s1_R2_val_2_fastqc/Images/\n", " 
inflating: zr2096_4_s1_R2_val_2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/Icons/warning.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/Icons/error.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/Icons/tick.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/summary.txt \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/fastqc_report.html \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/fastqc_data.txt \n", " inflating: zr2096_4_s1_R2_val_2_fastqc/fastqc.fo \n", "Archive: zr2096_5_s1_R1_val_1_fastqc.zip\n", " creating: zr2096_5_s1_R1_val_1_fastqc/\n", " creating: zr2096_5_s1_R1_val_1_fastqc/Icons/\n", " creating: zr2096_5_s1_R1_val_1_fastqc/Images/\n", " inflating: zr2096_5_s1_R1_val_1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/Icons/warning.png \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/Icons/error.png \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/Icons/tick.png \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/summary.txt \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/Images/per_sequence_quality.png \n", " inflating: 
zr2096_5_s1_R1_val_1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/fastqc_report.html \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/fastqc_data.txt \n", " inflating: zr2096_5_s1_R1_val_1_fastqc/fastqc.fo \n", "Archive: zr2096_5_s1_R2_val_2_fastqc.zip\n", " creating: zr2096_5_s1_R2_val_2_fastqc/\n", " creating: zr2096_5_s1_R2_val_2_fastqc/Icons/\n", " creating: zr2096_5_s1_R2_val_2_fastqc/Images/\n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Icons/warning.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Icons/error.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Icons/tick.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/summary.txt \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/fastqc_report.html \n", " inflating: zr2096_5_s1_R2_val_2_fastqc/fastqc_data.txt \n", " inflating: 
zr2096_5_s1_R2_val_2_fastqc/fastqc.fo \n", "Archive: zr2096_6_s1_R1_val_1_fastqc.zip\n", " creating: zr2096_6_s1_R1_val_1_fastqc/\n", " creating: zr2096_6_s1_R1_val_1_fastqc/Icons/\n", " creating: zr2096_6_s1_R1_val_1_fastqc/Images/\n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Icons/warning.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Icons/error.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Icons/tick.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/summary.txt \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/fastqc_report.html \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/fastqc_data.txt \n", " inflating: zr2096_6_s1_R1_val_1_fastqc/fastqc.fo \n", "Archive: zr2096_6_s1_R2_val_2_fastqc.zip\n", " creating: zr2096_6_s1_R2_val_2_fastqc/\n", " creating: zr2096_6_s1_R2_val_2_fastqc/Icons/\n", " creating: zr2096_6_s1_R2_val_2_fastqc/Images/\n", " inflating: zr2096_6_s1_R2_val_2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/Icons/warning.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/Icons/error.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/Icons/tick.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/summary.txt \n", " inflating: 
zr2096_6_s1_R2_val_2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/fastqc_report.html \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/fastqc_data.txt \n", " inflating: zr2096_6_s1_R2_val_2_fastqc/fastqc.fo \n", "Archive: zr2096_7_s1_R1_val_1_fastqc.zip\n", " creating: zr2096_7_s1_R1_val_1_fastqc/\n", " creating: zr2096_7_s1_R1_val_1_fastqc/Icons/\n", " creating: zr2096_7_s1_R1_val_1_fastqc/Images/\n", " inflating: zr2096_7_s1_R1_val_1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/Icons/warning.png \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/Icons/error.png \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/Icons/tick.png \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/summary.txt \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/Images/sequence_length_distribution.png \n", " inflating: 
zr2096_7_s1_R1_val_1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/fastqc_report.html \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/fastqc_data.txt \n", " inflating: zr2096_7_s1_R1_val_1_fastqc/fastqc.fo \n", "Archive: zr2096_7_s1_R2_val_2_fastqc.zip\n", " creating: zr2096_7_s1_R2_val_2_fastqc/\n", " creating: zr2096_7_s1_R2_val_2_fastqc/Icons/\n", " creating: zr2096_7_s1_R2_val_2_fastqc/Images/\n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Icons/warning.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Icons/error.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Icons/tick.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/summary.txt \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/fastqc_report.html \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/fastqc_data.txt \n", " inflating: zr2096_7_s1_R2_val_2_fastqc/fastqc.fo \n", "Archive: zr2096_8_s1_R1_val_1_fastqc.zip\n", " creating: zr2096_8_s1_R1_val_1_fastqc/\n", " creating: zr2096_8_s1_R1_val_1_fastqc/Icons/\n", " creating: zr2096_8_s1_R1_val_1_fastqc/Images/\n", " inflating: zr2096_8_s1_R1_val_1_fastqc/Icons/fastqc_icon.png \n", " inflating: 
zr2096_8_s1_R1_val_1_fastqc/Icons/warning.png \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/Icons/error.png \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/Icons/tick.png \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/summary.txt \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/fastqc_report.html \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/fastqc_data.txt \n", " inflating: zr2096_8_s1_R1_val_1_fastqc/fastqc.fo \n", "Archive: zr2096_8_s1_R2_val_2_fastqc.zip\n", " creating: zr2096_8_s1_R2_val_2_fastqc/\n", " creating: zr2096_8_s1_R2_val_2_fastqc/Icons/\n", " creating: zr2096_8_s1_R2_val_2_fastqc/Images/\n", " inflating: zr2096_8_s1_R2_val_2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/Icons/warning.png \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/Icons/error.png \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/Icons/tick.png \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/summary.txt \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/Images/per_base_sequence_content.png \n", " inflating: 
zr2096_8_s1_R2_val_2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/fastqc_report.html \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/fastqc_data.txt \n", " inflating: zr2096_8_s1_R2_val_2_fastqc/fastqc.fo \n", "Archive: zr2096_9_s1_R1_val_1_fastqc.zip\n", " creating: zr2096_9_s1_R1_val_1_fastqc/\n", " creating: zr2096_9_s1_R1_val_1_fastqc/Icons/\n", " creating: zr2096_9_s1_R1_val_1_fastqc/Images/\n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Icons/warning.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Icons/error.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Icons/tick.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/summary.txt \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/Images/adapter_content.png \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/fastqc_report.html \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/fastqc_data.txt \n", " inflating: zr2096_9_s1_R1_val_1_fastqc/fastqc.fo \n", "Archive: zr2096_9_s1_R2_val_2_fastqc.zip\n", 
" creating: zr2096_9_s1_R2_val_2_fastqc/\n", " creating: zr2096_9_s1_R2_val_2_fastqc/Icons/\n", " creating: zr2096_9_s1_R2_val_2_fastqc/Images/\n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Icons/fastqc_icon.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Icons/warning.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Icons/error.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Icons/tick.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/summary.txt \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Images/per_base_quality.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Images/per_tile_quality.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Images/per_sequence_quality.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Images/per_base_sequence_content.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Images/per_sequence_gc_content.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Images/per_base_n_content.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Images/sequence_length_distribution.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Images/duplication_levels.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/Images/adapter_content.png \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/fastqc_report.html \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/fastqc_data.txt \n", " inflating: zr2096_9_s1_R2_val_2_fastqc/fastqc.fo \n" ] } ], "source": [ "%%bash\n", "for f in *zip\n", "do\n", " unzip ${f}\n", "done" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mzr2096_10_s1_R1_val_1_fastqc\u001b[m\u001b[m \u001b[34mzr2096_5_s1_R1_val_1_fastqc\u001b[m\u001b[m\r\n", "zr2096_10_s1_R1_val_1_fastqc.zip zr2096_5_s1_R1_val_1_fastqc.zip\r\n", "\u001b[34mzr2096_10_s1_R2_val_2_fastqc\u001b[m\u001b[m \u001b[34mzr2096_5_s1_R2_val_2_fastqc\u001b[m\u001b[m\r\n", "zr2096_10_s1_R2_val_2_fastqc.zip zr2096_5_s1_R2_val_2_fastqc.zip\r\n", 
"\u001b[34mzr2096_1_s1_R1_val_1_fastqc\u001b[m\u001b[m \u001b[34mzr2096_6_s1_R1_val_1_fastqc\u001b[m\u001b[m\r\n", "zr2096_1_s1_R1_val_1_fastqc.zip zr2096_6_s1_R1_val_1_fastqc.zip\r\n", "\u001b[34mzr2096_1_s1_R2_val_2_fastqc\u001b[m\u001b[m \u001b[34mzr2096_6_s1_R2_val_2_fastqc\u001b[m\u001b[m\r\n", "zr2096_1_s1_R2_val_2_fastqc.zip zr2096_6_s1_R2_val_2_fastqc.zip\r\n", "\u001b[34mzr2096_2_s1_R1_val_1_fastqc\u001b[m\u001b[m \u001b[34mzr2096_7_s1_R1_val_1_fastqc\u001b[m\u001b[m\r\n", "zr2096_2_s1_R1_val_1_fastqc.zip zr2096_7_s1_R1_val_1_fastqc.zip\r\n", "\u001b[34mzr2096_2_s1_R2_val_2_fastqc\u001b[m\u001b[m \u001b[34mzr2096_7_s1_R2_val_2_fastqc\u001b[m\u001b[m\r\n", "zr2096_2_s1_R2_val_2_fastqc.zip zr2096_7_s1_R2_val_2_fastqc.zip\r\n", "\u001b[34mzr2096_3_s1_R1_val_1_fastqc\u001b[m\u001b[m \u001b[34mzr2096_8_s1_R1_val_1_fastqc\u001b[m\u001b[m\r\n", "zr2096_3_s1_R1_val_1_fastqc.zip zr2096_8_s1_R1_val_1_fastqc.zip\r\n", "\u001b[34mzr2096_3_s1_R2_val_2_fastqc\u001b[m\u001b[m \u001b[34mzr2096_8_s1_R2_val_2_fastqc\u001b[m\u001b[m\r\n", "zr2096_3_s1_R2_val_2_fastqc.zip zr2096_8_s1_R2_val_2_fastqc.zip\r\n", "\u001b[34mzr2096_4_s1_R1_val_1_fastqc\u001b[m\u001b[m \u001b[34mzr2096_9_s1_R1_val_1_fastqc\u001b[m\u001b[m\r\n", "zr2096_4_s1_R1_val_1_fastqc.zip zr2096_9_s1_R1_val_1_fastqc.zip\r\n", "\u001b[34mzr2096_4_s1_R2_val_2_fastqc\u001b[m\u001b[m \u001b[34mzr2096_9_s1_R2_val_2_fastqc\u001b[m\u001b[m\r\n", "zr2096_4_s1_R2_val_2_fastqc.zip zr2096_9_s1_R2_val_2_fastqc.zip\r\n" ] } ], "source": [ "#Confirm files were unzipped\n", "!ls" ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%bash\n", "for f in *fastqc\n", "do\n", " grep \"Total Sequences *\" ${f}/fastqc_data.txt \\\n", " >> 2019-03-17-Trimmed-Read-Counts.txt\n", "done" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total Sequences\t17448883\r\n", 
"Total Sequences\t17448883\r\n", "Total Sequences\t28603346\r\n", "Total Sequences\t28603346\r\n", "Total Sequences\t30325606\r\n", "Total Sequences\t30325606\r\n", "Total Sequences\t29548753\r\n", "Total Sequences\t29548753\r\n", "Total Sequences\t23970516\r\n", "Total Sequences\t23970516\r\n" ] } ], "source": [ "#Confirm total sequences were counted. The first 2 lines correspond to sample 10.\n", "!head 2019-03-17-Trimmed-Read-Counts.txt" ] }, { "cell_type": "code", "execution_count": 150, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "cat: 2019-03-17-Trimmed-Read-Counts.txt: No such file or directory\r\n", "\r\n" ] } ], "source": [ "#Sum the contents of the second column ($2), then divide by 2 to obtain the total number of paired-end reads.\n", "!cat 2019-03-17-Trimmed-Read-Counts.txt | awk -F\"\\t\" '{ sum+=$2 / 2} END {print sum}'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Reads that mapped to genome" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, I need to count how many trimmed reads mapped back to the genome. I can do this by looking at `bismark` processing reports. Each processing report outlines how many paired-end reads did not map to the genome under any condition. I can extract this number and subtract it from the total trimmed paired-end reads per sample to obtain how many reads mapped back to the genome." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3a. Download files" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/yaamini/Documents/paper-gonad-meth/data/2019-03-17-Counting-Reads\n" ] } ], "source": [ "cd .." 
] }, { "cell_type": "code", "execution_count": 98, "metadata": { "collapsed": true }, "outputs": [], "source": [ "!mkdir 2019-03-17-Mapped-Reads" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/yaamini/Documents/paper-gonad-meth/data/2019-03-17-Counting-Reads/2019-03-17-Mapped-Reads\n" ] } ], "source": [ "cd 2019-03-17-Mapped-Reads/" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--2019-04-07 13:50:50-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/\n", "Resolving gannet.fish.washington.edu... 128.95.149.52\n", "Connecting to gannet.fish.washington.edu|128.95.149.52|:80... connected.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html'\n", "\n", "gannet.fish.washing [ <=> ] 61.14K --.-KB/s in 0.001s \n", "\n", "2019-04-07 13:50:52 (45.6 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html' saved [62605]\n", "\n", "Loading robots.txt; please ignore errors.\n", "--2019-04-07 13:50:52-- http://gannet.fish.washington.edu/robots.txt\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
404 Not Found\n", "2019-04-07 13:50:52 ERROR 404: Not Found.\n", "\n", "Removing gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html since it should be rejected.\n", "\n", "--2019-04-07 13:50:52-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/?C=N;O=D\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html?C=N;O=D'\n", "\n", "gannet.fish.washing [ <=> ] 61.14K --.-KB/s in 0.002s \n", "\n", "2019-04-07 13:50:53 (36.2 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html?C=N;O=D' saved [62605]\n", "\n", "Removing gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html?C=N;O=D since it should be rejected.\n", "\n", "--2019-04-07 13:50:53-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/?C=M;O=A\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html?C=M;O=A'\n", "\n", "gannet.fish.washing [ <=> ] 61.14K --.-KB/s in 0.001s \n", "\n", "2019-04-07 13:50:55 (42.9 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html?C=M;O=A' saved [62605]\n", "\n", "Removing gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html?C=M;O=A since it should be rejected.\n", "\n", "--2019-04-07 13:50:55-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/?C=S;O=A\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html?C=S;O=A'\n", "\n", "gannet.fish.washing [ <=> ] 61.14K --.-KB/s in 0.002s \n", "\n", "2019-04-07 13:50:56 (38.4 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html?C=S;O=A' saved [62605]\n", "\n", "Removing gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html?C=S;O=A since it should be rejected.\n", "\n", "--2019-04-07 13:50:56-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/?C=D;O=A\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html?C=D;O=A'\n", "\n", "gannet.fish.washing [ <=> ] 61.14K --.-KB/s in 0.001s \n", "\n", "2019-04-07 13:50:58 (45.6 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html?C=D;O=A' saved [62605]\n", "\n", "Removing gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/index.html?C=D;O=A since it should be rejected.\n", "\n", "--2019-04-07 13:50:58-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/@eaDir/\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: unspecified [text/html]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/@eaDir/index.html'\n", "\n", "gannet.fish.washing [ <=> ] 64.35K --.-KB/s in 0.001s \n", "\n", "2019-04-07 13:50:58 (43.1 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/@eaDir/index.html' saved [65897]\n", "\n", "Removing gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/@eaDir/index.html since it should be rejected.\n", "\n", "--2019-04-07 13:50:58-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_1_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 1982 (1.9K) [text/plain]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_1_s1_R1_val_1_bismark_bt2_PE_report.txt'\n", "\n", "gannet.fish.washing 100%[===================>] 1.94K --.-KB/s in 0s \n", "\n", "2019-04-07 13:50:58 (145 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_1_s1_R1_val_1_bismark_bt2_PE_report.txt' saved [1982/1982]\n", "\n", "--2019-04-07 13:50:58-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_2_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 1982 (1.9K) [text/plain]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_2_s1_R1_val_1_bismark_bt2_PE_report.txt'\n", "\n", "gannet.fish.washing 100%[===================>] 1.94K --.-KB/s in 0s \n", "\n", "2019-04-07 13:50:58 (78.8 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_2_s1_R1_val_1_bismark_bt2_PE_report.txt' saved [1982/1982]\n", "\n", "--2019-04-07 13:50:58-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_3_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 1983 (1.9K) [text/plain]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_3_s1_R1_val_1_bismark_bt2_PE_report.txt'\n", "\n", "gannet.fish.washing 100%[===================>] 1.94K --.-KB/s in 0s \n", "\n", "2019-04-07 13:50:58 (172 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_3_s1_R1_val_1_bismark_bt2_PE_report.txt' saved [1983/1983]\n", "\n", "--2019-04-07 13:50:58-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_4_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 1980 (1.9K) [text/plain]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_4_s1_R1_val_1_bismark_bt2_PE_report.txt'\n", "\n", "gannet.fish.washing 100%[===================>] 1.93K --.-KB/s in 0s \n", "\n", "2019-04-07 13:50:58 (94.4 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_4_s1_R1_val_1_bismark_bt2_PE_report.txt' saved [1980/1980]\n", "\n", "--2019-04-07 13:50:58-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_5_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 1983 (1.9K) [text/plain]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_5_s1_R1_val_1_bismark_bt2_PE_report.txt'\n", "\n", "gannet.fish.washing 100%[===================>] 1.94K --.-KB/s in 0s \n", "\n", "2019-04-07 13:50:58 (210 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_5_s1_R1_val_1_bismark_bt2_PE_report.txt' saved [1983/1983]\n", "\n", "--2019-04-07 13:50:58-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_6_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 1980 (1.9K) [text/plain]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_6_s1_R1_val_1_bismark_bt2_PE_report.txt'\n", "\n", "gannet.fish.washing 100%[===================>] 1.93K --.-KB/s in 0s \n", "\n", "2019-04-07 13:50:58 (189 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_6_s1_R1_val_1_bismark_bt2_PE_report.txt' saved [1980/1980]\n", "\n", "--2019-04-07 13:50:58-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_7_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 1982 (1.9K) [text/plain]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_7_s1_R1_val_1_bismark_bt2_PE_report.txt'\n", "\n", "gannet.fish.washing 100%[===================>] 1.94K --.-KB/s in 0s \n", "\n", "2019-04-07 13:50:58 (210 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_7_s1_R1_val_1_bismark_bt2_PE_report.txt' saved [1982/1982]\n", "\n", "--2019-04-07 13:50:58-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_8_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 1984 (1.9K) [text/plain]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_8_s1_R1_val_1_bismark_bt2_PE_report.txt'\n", "\n", "gannet.fish.washing 100%[===================>] 1.94K --.-KB/s in 0s \n", "\n", "2019-04-07 13:50:58 (189 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_8_s1_R1_val_1_bismark_bt2_PE_report.txt' saved [1984/1984]\n", "\n", "--2019-04-07 13:50:58-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_9_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 1982 (1.9K) [text/plain]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_9_s1_R1_val_1_bismark_bt2_PE_report.txt'\n", "\n", "gannet.fish.washing 100%[===================>] 1.94K --.-KB/s in 0s \n", "\n", "2019-04-07 13:50:58 (135 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_9_s1_R1_val_1_bismark_bt2_PE_report.txt' saved [1982/1982]\n", "\n", "--2019-04-07 13:50:58-- http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_10_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "Reusing existing connection to gannet.fish.washington.edu:80.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 1980 (1.9K) [text/plain]\n", "Saving to: 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_10_s1_R1_val_1_bismark_bt2_PE_report.txt'\n", "\n", "gannet.fish.washing 100%[===================>] 1.93K --.-KB/s in 0s \n", "\n", "2019-04-07 13:50:58 (94.4 MB/s) - 'gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/zr2096_10_s1_R1_val_1_bismark_bt2_PE_report.txt' saved [1980/1980]\n", "\n", "FINISHED --2019-04-07 13:50:58--\n", "Total wall clock time: 8.5s\n", "Downloaded: 16 files, 389K in 0.009s (43.2 MB/s)\n" ] } ], "source": [ "#Download files from gannet. 
The files will be downloaded in the same directory structure they are in online.\n", "!wget -r -l1 --no-parent -A_s1_R1_val_1_bismark_bt2_PE_report.txt \\\n", "http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#Move all files from the gannet folder to the current directory\n", "!mv gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2018-11-07-Bismark-Mox/* ." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2018-12-03-Mapping-Efficiency.csv\r\n", "\u001b[34m2019-03-17-Counting-Reads\u001b[m\u001b[m\r\n", "\u001b[34m@eaDir\u001b[m\u001b[m\r\n", "OysterTissueInfoSheet_GonadTestRoberts_20171002.xlsx\r\n", "README.md\r\n", "\u001b[34mgannet.fish.washington.edu\u001b[m\u001b[m\r\n", "zr2096_10_s1_R1_val_1_bismark_bt2_PE_report.txt\r\n", "zr2096_1_s1_R1_val_1_bismark_bt2_PE_report.txt\r\n", "zr2096_2_s1_R1_val_1_bismark_bt2_PE_report.txt\r\n", "zr2096_3_s1_R1_val_1_bismark_bt2_PE_report.txt\r\n", "zr2096_4_s1_R1_val_1_bismark_bt2_PE_report.txt\r\n", "zr2096_5_s1_R1_val_1_bismark_bt2_PE_report.txt\r\n", "zr2096_6_s1_R1_val_1_bismark_bt2_PE_report.txt\r\n", "zr2096_7_s1_R1_val_1_bismark_bt2_PE_report.txt\r\n", "zr2096_8_s1_R1_val_1_bismark_bt2_PE_report.txt\r\n", "zr2096_9_s1_R1_val_1_bismark_bt2_PE_report.txt\r\n" ] } ], "source": [ "#Confirm files were moved\n", "!ls" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#Remove empty folders\n", "!rm -r gannet.fish.washington.edu" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3b. 
Obtain mapping efficiency and sequences analyzed" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "zr2096_10_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "zr2096_1_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "zr2096_2_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "zr2096_3_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "zr2096_4_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "zr2096_5_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "zr2096_6_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "zr2096_7_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "zr2096_8_s1_R1_val_1_bismark_bt2_PE_report.txt\n", "zr2096_9_s1_R1_val_1_bismark_bt2_PE_report.txt\n" ] } ], "source": [ "%%bash\n", "for f in *txt\n", "do\n", " echo ${f}\n", "done" ] }, { "cell_type": "code", "execution_count": 111, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Bismark report for: /gscratch/scrubbed/yaamini/data/Virginica-MBD/2018-10-17-Trimmed-Files//zr2096_1_s1_R1_val_1.fq.gz and /gscratch/scrubbed/yaamini/data/Virginica-MBD/2018-10-17-Trimmed-Files//zr2096_1_s1_R2_val_2.fq.gz (version: v0.19.0)\r\n", "Bismark was run with Bowtie 2 against the bisulfite genome of /gscratch/scrubbed/yaamini/data/Virginica-MBD/2018-04-27-Bismark-Inputs/ with the specified options: -q --score-min L,0,-1.2 -p 28 --reorder --ignore-quals --no-mixed --no-discordant --dovetail --maxins 500\r\n", "Option '--non_directional' specified: alignments to all strands were being performed (OT, OB, CTOT, CTOB)\r\n", "\r\n", "Final Alignment report\r\n", "======================\r\n", "Sequence pairs analysed in total:\t28603346\r\n", "Number of paired-end alignments with a unique best hit:\t8273829\r\n", "Mapping efficiency:\t28.9% \r\n", "Sequence pairs with no alignments under any condition:\t17321484\r\n" ] } ], "source": [ "#Identify what information is needed from the report\n", "!head 
zr2096_1_s1_R1_val_1_bismark_bt2_PE_report.txt" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%%bash\n", "for f in *txt\n", "do\n", " grep \"Mapping efficiency *\" ${f} \\\n", " >> 2019-03-17-Mapping-Efficiency.txt\n", "done" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mapping efficiency:\t53.2% \r\n", "Mapping efficiency:\t28.9% \r\n", "Mapping efficiency:\t50.3% \r\n", "Mapping efficiency:\t52.6% \r\n", "Mapping efficiency:\t54.2% \r\n", "Mapping efficiency:\t52.0% \r\n", "Mapping efficiency:\t54.4% \r\n", "Mapping efficiency:\t51.7% \r\n", "Mapping efficiency:\t48.2% \r\n", "Mapping efficiency:\t50.6% \r\n" ] } ], "source": [ "#Confirm file was created. The first entry corresponds to sample 10.\n", "!head 2019-03-17-Mapping-Efficiency.txt" ] }, { "cell_type": "code", "execution_count": 87, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [], "source": [ "#Isolate mapping efficiency percentages (remove % from the end)\n", "#Save new document\n", "!cut -f2 2019-03-17-Mapping-Efficiency.txt | cut -c1-4 \\\n", "| paste 2019-03-17-Mapping-Efficiency.txt -> 2019-03-17-Mapping-Efficiency-Percents-Included.txt" ] }, { "cell_type": "code", "execution_count": 88, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mapping efficiency:\t53.2% \t53.2\r\n", "Mapping efficiency:\t28.9% \t28.9\r\n", "Mapping efficiency:\t50.3% \t50.3\r\n", "Mapping efficiency:\t52.6% \t52.6\r\n", "Mapping efficiency:\t54.2% \t54.2\r\n", "Mapping efficiency:\t52.0% \t52.0\r\n", "Mapping efficiency:\t54.4% \t54.4\r\n", "Mapping efficiency:\t51.7% \t51.7\r\n", "Mapping efficiency:\t48.2% \t48.2\r\n", "Mapping efficiency:\t50.6% \t50.6\r\n" ] } ], "source": [ "!head 2019-03-17-Mapping-Efficiency-Percents-Included.txt" ] }, { "cell_type": "code", 
"execution_count": 91, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#Divide percentages by 100\n", "#Save new document\n", "!awk -v c=100 '{ print $3/c}' 2019-03-17-Mapping-Efficiency-Percents-Included.txt \\\n", "| paste 2019-03-17-Mapping-Efficiency-Percents-Included.txt -> 2019-03-17-Mapping-Efficiency-Divided-Percents.txt" ] }, { "cell_type": "code", "execution_count": 92, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mapping efficiency:\t53.2% \t53.2\t0.532\r\n", "Mapping efficiency:\t28.9% \t28.9\t0.289\r\n", "Mapping efficiency:\t50.3% \t50.3\t0.503\r\n", "Mapping efficiency:\t52.6% \t52.6\t0.526\r\n", "Mapping efficiency:\t54.2% \t54.2\t0.542\r\n", "Mapping efficiency:\t52.0% \t52.0\t0.52\r\n", "Mapping efficiency:\t54.4% \t54.4\t0.544\r\n", "Mapping efficiency:\t51.7% \t51.7\t0.517\r\n", "Mapping efficiency:\t48.2% \t48.2\t0.482\r\n", "Mapping efficiency:\t50.6% \t50.6\t0.506\r\n" ] } ], "source": [ "!head 2019-03-17-Mapping-Efficiency-Divided-Percents.txt" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%%bash\n", "for f in *txt\n", "do\n", " grep \"Sequence pairs analysed in total *\" ${f} \\\n", " >> 2019-03-17-Pairs-Analyzed.txt\n", "done" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sequence pairs analysed in total:\t17448883\r\n", "Sequence pairs analysed in total:\t28603346\r\n", "Sequence pairs analysed in total:\t30325606\r\n", "Sequence pairs analysed in total:\t29548753\r\n", "Sequence pairs analysed in total:\t23970516\r\n", "Sequence pairs analysed in total:\t31503281\r\n", "Sequence pairs analysed in total:\t23909493\r\n", "Sequence pairs analysed in total:\t29273635\r\n", "Sequence pairs analysed in total:\t29483218\r\n", "Sequence pairs analysed in total:\t31847541\r\n" ] 
} ], "source": [ "!head 2019-03-17-Pairs-Analyzed.txt" ] }, { "cell_type": "code", "execution_count": 93, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#Isolate pairs analyzed\n", "#Save in new document with divided percents\n", "!cut -f2 2019-03-17-Pairs-Analyzed.txt \\\n", "|paste 2019-03-17-Mapping-Efficiency-Divided-Percents.txt -> 2019-03-17-Pairs-Analyzed-and-Mapping-Efficiency.txt" ] }, { "cell_type": "code", "execution_count": 94, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mapping efficiency:\t53.2% \t53.2\t0.532\t17448883\r\n", "Mapping efficiency:\t28.9% \t28.9\t0.289\t28603346\r\n", "Mapping efficiency:\t50.3% \t50.3\t0.503\t30325606\r\n", "Mapping efficiency:\t52.6% \t52.6\t0.526\t29548753\r\n", "Mapping efficiency:\t54.2% \t54.2\t0.542\t23970516\r\n", "Mapping efficiency:\t52.0% \t52.0\t0.52\t31503281\r\n", "Mapping efficiency:\t54.4% \t54.4\t0.544\t23909493\r\n", "Mapping efficiency:\t51.7% \t51.7\t0.517\t29273635\r\n", "Mapping efficiency:\t48.2% \t48.2\t0.482\t29483218\r\n", "Mapping efficiency:\t50.6% \t50.6\t0.506\t31847541\r\n" ] } ], "source": [ "#Confirm paste worked\n", "!head 2019-03-17-Pairs-Analyzed-and-Mapping-Efficiency.txt" ] }, { "cell_type": "code", "execution_count": 106, "metadata": { "collapsed": false }, "outputs": [], "source": [ "#Multiply the proportion mapped ($5) by the pairs analyzed ($6) to obtain mapped reads\n", "!awk '{ print $3, $6, $5*$6}' 2019-03-17-Pairs-Analyzed-and-Mapping-Efficiency.txt \\\n", "> 2019-03-17-Mapped-Reads.txt" ] }, { "cell_type": "code", "execution_count": 111, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "53.2% 17448883 9.28281e+06\r\n", "28.9% 28603346 8.26637e+06\r\n", "50.3% 30325606 1.52538e+07\r\n", "52.6% 29548753 1.55426e+07\r\n", "54.2% 23970516 1.2992e+07\r\n", "52.0% 31503281 1.63817e+07\r\n", "54.4% 23909493 1.30068e+07\r\n", "51.7% 29273635 1.51345e+07\r\n", 
"48.2% 29483218 1.42109e+07\r\n", "50.6% 31847541 1.61149e+07\r\n" ] } ], "source": [ "#Confirm multiplication\n", "!head 2019-03-17-Mapped-Reads.txt" ] }, { "cell_type": "code", "execution_count": 113, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "136186380\r\n" ] } ], "source": [ "#Sum the contents of the third column ($3) to obtain the total number of paired-end reads that mapped to the genome.\n", "!cat 2019-03-17-Mapped-Reads.txt | awk '{ sum+=$3} END {print sum}'" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 1 }