/gannet/Atumefaciens/20190228_pgen_maker_v070_annotation/

Use MAKER to annotate Panopea generosa genome assembly v070.

MAKER was run, followed by two rounds of SNAP ab initio gene predictions.

MAKER is a program in its own right and also functions as a "wrapper" for other programs (e.g. BLAST, SNAP).

See the SBATCH script and my notebook entry (URL under "Notebook" below) for specific details on input files, programs, and configurations used to run MAKER.

---

FILES

- 20190228_pgen_maker_v070_annotation.sh: SBATCH script to run on Mox.
- blastp_annotation: Output folder from BLASTp.
- combined_proteomes.fasta: FastA of the following proteomes:
  - Crassostrea gigas (NCBI)
  - Crassostrea virginica (NCBI)
  - Panopea generosa (from transcriptome assembly via Transdecoder)
- _Inline: ?
- interproscan_annotation: Output folder from InterProScan.
- maker_bopts.ctl: MAKER control file - generated by the "maker_opts.ctl" file.
- maker_exe.ctl: MAKER control file - generated by the "maker_opts.ctl" file.
- maker_opts.ctl: MAKER control file. Contains all configurations used to run MAKER. The adjustments made to it are detailed in the SBATCH script.
- Pgenerosa_v070.all.gff: Initial GFF generated by MAKER's gene predictions.
  - Can be used, but not as refined as the predictions made in conjunction with SNAP.
  - Contains corresponding sequences (in FastA format) at end of file!!!
- Pgenerosa_v070.all.maker.proteins.fasta: Proteins FastA file based on MAKER's gene predictions.
  - Can be used, but not as refined as the predictions made in conjunction with SNAP.
- Pgenerosa_v070.all.maker.transcripts.fasta: Transcripts FastA file based on MAKER's gene predictions.
  - Can be used, but not as refined as the predictions made in conjunction with SNAP.
- Pgenerosa_v070_genome_snap02.all.maker.proteins.renamed.putative_function.fasta:
  - Proteins FastA file based on MAKER and two rounds of SNAP gene predictions.
  - Sequence IDs have been renamed from original MAKER names to a GenBank-like format.
  - Putative function of each protein has been added to sequence descriptors.
- Pgenerosa_v070_genome_snap02.all.maker.transcripts.renamed.putative_function.fasta:
  - Transcripts FastA file based on MAKER and two rounds of SNAP gene predictions.
  - Sequence IDs have been renamed from original MAKER names to a GenBank-like format.
  - Putative function of each sequence has been added to sequence descriptors.
- Pgenerosa_v070_genome_snap02.all.renamed.putative_function.domain_added.gff:
  - Definitive GFF file based on MAKER and two rounds of SNAP gene predictions.
  - Sequence IDs have been renamed from original MAKER names to a GenBank-like format.
  - Putative function of each gene has been added to descriptors.
  - Putative functional domains of each gene have been added to descriptors.
  - Contains corresponding sequences (in FastA format) at end of file!!!
- Pgenerosa_v070_genome_snap02.all.renamed.putative_function.gff:
  - GFF file based on MAKER and two rounds of SNAP gene predictions.
  - Sequence IDs have been renamed from original MAKER names to a GenBank-like format.
  - Putative function of each gene has been added to descriptors.
  - Contains corresponding sequences (in FastA format) at end of file!!!
- Pgenerosa_v070_genome_snap02.all.renamed.visible_ips_domains.gff:
  - GFF file based on MAKER and two rounds of SNAP gene predictions.
  - Sequence IDs have been renamed from original MAKER names to a GenBank-like format.
  - InterProScan domains added.
  - Useful for genome browsers.
  - Contains corresponding sequences (in FastA format) at end of file!!!
- Pgenerosa_v070.maker.all.noseqs.est2genome.gff: GFF of initial MAKER transcriptome alignment.
- Pgenerosa_v070.maker.all.noseqs.gff: Initial GFF generated by MAKER's gene predictions.
  - Can be used, but not as refined as the predictions made in conjunction with SNAP.
- Pgenerosa_v070.maker.all.noseqs.protein2genome.gff: GFF of initial MAKER protein alignment.
- Pgenerosa_v070.maker.all.noseqs.repeats.gff: GFF of initial MAKER repeats alignment.
- Pgenerosa_v070.maker.output: Output directory for MAKER processes.
- slurm-*.out: Various SLURM outputs (i.e. stderr/stdout) from each time MAKER was started.
- snap01: Output directory for first round of SNAP gene predictions.
- snap02: Output directory for second round of SNAP gene predictions.
- system_path.log: Contents of Sam's system $PATH on Mox.

---

Notebook: https://robertslab.github.io/sams-notebook/2019/02/28/Genome-Annotation-Pgenerosa_v070-MAKER-on-Mox.html
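
---

Several of the GFF files listed under FILES carry their sequences in FastA format at the end of the file, which some downstream tools may not expect. Below is a minimal Python sketch for separating the two parts. It assumes the sequence section is introduced by the standard GFF3 `##FASTA` directive, which is worth confirming against the actual files. The function name `split_gff_fasta` and the example records are hypothetical and are not taken from this directory.

```python
from typing import Iterable, List, Tuple


def split_gff_fasta(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split GFF3 lines into (annotation_lines, fasta_lines).

    Assumption: the embedded sequences start after a line containing
    only the GFF3 "##FASTA" directive. The directive line itself is dropped.
    """
    annotations: List[str] = []
    fasta: List[str] = []
    in_fasta = False
    for line in lines:
        if not in_fasta and line.strip() == "##FASTA":
            in_fasta = True  # everything after this line is sequence data
            continue
        (fasta if in_fasta else annotations).append(line)
    return annotations, fasta


# Made-up records for illustration only (not real Pgenerosa_v070 content).
example = [
    "##gff-version 3\n",
    "contig1\tmaker\tgene\t1\t9\t.\t+\t.\tID=gene1\n",
    "##FASTA\n",
    ">contig1\n",
    "ACGTACGTA\n",
]
annotations, fasta = split_gff_fasta(example)
```

For a real file, an open file handle can be passed directly, e.g. `with open(path) as fh: annotations, fasta = split_gff_fasta(fh)`. Note that this keeps every line in memory, which may be impractical for a large genome GFF; writing lines to two output files inside the loop would avoid that.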