---
title: "Week 06 Questions"
format:
  html:
    code-fold: false
    code-tools: true
    code-copy: true
    highlight-style: github
    code-overflow: wrap
---

a) **What are SAM/BAM files? What is the difference between the two?**

Sequence Alignment/Map (SAM) and its binary analog (BAM) are two common formats for storing high-throughput sequencing reads that have been aligned to a reference genome or transcriptome index. SAM is plain, human-readable text. BAM is a compressed binary version generated from the SAM file; it holds the same information but is smaller and faster for a computer to process.

b) **`samtools` is a popular program for working with alignment data. What are three common tasks that this software is used for?**

Three common tasks are viewing alignments, sorting them, and converting SAM files into BAM files. In addition, `samtools flags` explains the attributes encoded in the FLAG field of SAM/BAM records, such as whether the read is paired, unmapped, or aligned in a proper pair, all of which tell us how the read was aligned.

c) **Why might you want to visualize alignment data and what are two programs that can be used for this?**

Looking at alignments directly makes it easier to check how reads line up against the reference at a region of interest. The `samtools tview` subcommand works on position-sorted, indexed BAM files and gives a quick look at alignment data in the terminal. You can also use the Integrative Genomics Viewer (IGV) for a more in-depth look. IGV must first be installed (I would use Homebrew), and then you can launch the app with the command `igv`.

d) **Describe what a VCF file is.**

Variant Call Format (VCF) files store the variants called from alignments, so they are typically the output of analyses run on BAM files. They contain the following three elements:

- Metadata header lines, which can span multiple lines, each preceded by two pound symbols (`##`)
- A single header line preceded by one pound symbol (`#`)
- Data lines
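
As a rough illustration of the three-part layout described above, the following Python sketch splits VCF text into its metadata lines, its column header, and its data lines. The function name `split_vcf`, the `hypothetical_caller` source tag, and the one-record example are made up for illustration; this is a minimal sketch that assumes tab-separated, uncompressed text, not a full VCF parser.

```python
def split_vcf(text):
    """Split VCF text into metadata lines, header columns, and data records."""
    meta, header, records = [], None, []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("##"):
            # Metadata line: strip the two leading pound symbols
            meta.append(line[2:])
        elif line.startswith("#"):
            # Single header line: strip the pound symbol and split into column names
            header = line[1:].split("\t")
        else:
            # Data line: pair each field with its column name
            if header is None:
                raise ValueError("data line found before the header line")
            records.append(dict(zip(header, line.split("\t"))))
    return meta, header, records


# Hypothetical example: two metadata lines, one header line, one data line
example_vcf = "\n".join([
    "##fileformat=VCFv4.2",
    "##source=hypothetical_caller",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\tDP=12",
])

meta, header, records = split_vcf(example_vcf)
print(len(meta), "metadata lines")
print("columns:", header)
print("first record:", records[0])
```

Each data line becomes a dictionary keyed by the column names from the header line, so a field such as the alternate allele can be read with `records[0]["ALT"]`.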