A comparative analysis of lean and siscowet lake trout ecotypes — integrating PacBio HiFi DNA methylation, presence–absence variation, and a new functional-annotation layer that links these differences to genes and plausible phenotypes.
Lake trout in the Great Lakes occur as divergent ecotypes that share water but not lifestyle. Lean trout are shallow-water, elongate, and low-lipid; siscowet trout are deep-water specialists with robust bodies and high lipid storage. We ask a focused question: which genes carry the epigenetic and structural differences between the ecotypes, and what phenotypes might they shape?
Earlier stages of this project produced the raw differences — differentially methylated regions and presence–absence variants. The work featured here adds the missing interpretive layer: a genome-wide functional annotation that turns coordinates into gene names, products, and Gene Ontology terms, then ranks candidates and reasons — carefully — about phenotype.
Of 2,036 annotated candidates, only 4 genes are hit by both a differentially methylated region and a high-confidence structural deletion, led by a znf883-like zinc-finger locus with an exonic DMR and an exonic deletion.
Exonic siscowet-specific deletions fall in the lipid genes angptl5 and mogat2, and a promoter DMR sits at epoxide hydrolase 1: the very axis separating the lean and high-lipid ecotypes. Suggestive at the gene level, not genome-wide.
The strongest enrichment in the deletion set is calcium-ion transport (FDR 3×10⁻³) and neuron-projection development — read with a gene-length caveat, but the most defensible enrichment we see.
Read this as hypothesis-generating. Every link below is an association on a single lean-background reference genome — no functional validation, and no single CpG survives genome-wide multiple-testing correction. The value is a ranked, annotated shortlist, not a causal claim. See interpretation guardrails.
Genes were ranked by convergence (methylation and deletion), promoter/exon placement, expression support, and methylation↔expression concordance. The four convergent loci — carrying both a DMR and a high-confidence siscowet deletion — are the strongest candidates.
| Gene | Product | Methylation | Deletion | Note |
|---|---|---|---|---|
| znf883-like (LOC120032414) | Zinc finger protein 883-like | exon · hyper | exonic | top convergent |
| XlCGF57.1-like (LOC120040411) | Gastrula zinc finger protein XlCGF57.1-like | intron · hypo | nearby | convergent |
| septin-9-like (LOC120043843) | Septin-9-like | intron · hyper | nearby | convergent |
| LOC120039781 | Uncharacterized locus | intron · hypo | nearby | convergent |
| angptl5 | Angiopoietin-related protein 5-like | - | exonic | lipid axis |
| mogat2 | 2-acylglycerol O-acyltransferase 2-A-like | - | exonic | lipid axis |
| ephx1-like | Epoxide hydrolase 1-like | promoter | - | lipid / xenobiotic |
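
The ranking criteria described above (convergence, promoter/exon placement, expression support, and methylation↔expression concordance) could be combined into a simple additive score. The sketch below is illustrative only: the `Candidate` fields, the weights, and the `score` and `rank` helpers are assumptions, not the project's actual scoring scheme, and the expression flags are placeholders set to `False` because the page does not report them per gene.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str
    has_dmr: bool        # gene overlaps a differentially methylated region
    has_deletion: bool   # gene overlaps or sits near a high-confidence deletion
    feature: str         # best placement: "exon", "promoter", "intron", "nearby"
    expressed: bool      # any expression support (placeholder here)
    concordant: bool     # methylation direction agrees with expression (placeholder)


# Hypothetical weights, chosen only to illustrate the idea.
CONVERGENCE_BONUS = 4
FEATURE_WEIGHT = {"exon": 2, "promoter": 2, "intron": 1, "nearby": 0}


def score(c: Candidate) -> int:
    """Additive score: convergence first, then placement, then expression."""
    s = CONVERGENCE_BONUS if (c.has_dmr and c.has_deletion) else 0
    s += FEATURE_WEIGHT.get(c.feature, 0)
    s += 1 if c.expressed else 0
    s += 1 if c.concordant else 0
    return s


def rank(candidates):
    """Sort by descending score; ties are broken alphabetically by gene name."""
    return sorted(candidates, key=lambda c: (-score(c), c.gene))


# Methylation/deletion placement is taken from the candidate table;
# expression flags are placeholders.
candidates = [
    Candidate("angptl5", False, True, "exon", False, False),
    Candidate("ephx1-like", True, False, "promoter", False, False),
    Candidate("septin-9-like", True, True, "intron", False, False),
    Candidate("znf883-like", True, True, "exon", False, False),
]
ranked = rank(candidates)
for c in ranked:
    print(c.gene, score(c))
```

Under these assumed weights, any convergent locus outranks any single-evidence locus, which mirrors the page's statement that the convergent loci are the strongest candidates.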
| GO term | Fold | FDR | Read as |
|---|---|---|---|
| Calcium ion transmembrane transport | 3.7 | 5.6×10⁻⁴ | most defensible signal |
| Neuron projection development | 2.4 | 2.6×10⁻³ | sensory / neural |
| Calcium channel complex | 4.6 | 3.0×10⁻³ | length-bias caveat |
| Calcium ion transport (phenotype-flagged) | 3.0 | 3.0×10⁻³ | ion homeostasis |
| Lipid / phospholipid binding | 1.5 | ns (0.3) | suggestive only |

Hypergeometric over-representation vs. all GO-annotated genes (BH-FDR). The DMR set's enrichment is dominated by a single histone cluster and adjacent znf883 paralogs: a tandem-cluster artifact, not broad convergence.
Full tables: PAV · DMR · union.
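
For readers unfamiliar with the test named above, here is a minimal sketch of hypergeometric over-representation with Benjamini-Hochberg correction, using only the Python standard library. The function names and the toy counts are assumptions for illustration; they are not the project's code or data.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance of seeing at least k term-annotated genes when
    n genes are drawn without replacement from a background of N genes,
    K of which carry the term."""
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 for impossible draws, so no extra bounds are needed.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def fold_enrichment(k, N, K, n):
    """Observed fraction of term genes in the set over the background fraction."""
    return (k / n) / (K / N)


def bh_fdr(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy numbers only: background of 1000 annotated genes, 40 carry the term,
# and 8 of the 50 genes in the candidate set carry it.
p = hypergeom_sf(8, 1000, 40, 50)
print(fold_enrichment(8, 1000, 40, 50), p)
print(bh_fdr([0.01, 0.04, 0.03, 0.2]))
```

This test treats every gene as equally likely to be hit, which is exactly the assumption that the gene-length caveat calls into question.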
Each axis pairs annotated candidate genes with a measured or known difference between the ecotypes. These are hypotheses anchored to morphometric data, not validated mechanisms.
Exonic siscowet deletions in angptl5 and mogat2, together with a promoter DMR at epoxide hydrolase 1, touch lipid handling, consistent with the defining high-lipid siscowet phenotype and its role in deep-water buoyancy.
Methylation-led candidates including rbm24b (muscle/cardiac splicing) and growth-associated GO terms align with the elongate-lean vs. robust-siscowet body-form contrast captured in the 17-landmark morphometric data.
The calcium-transport and neuron-projection GO enrichment in the deletion set hints at sensory or excitability differences relevant to a deep, dark, high-pressure habitat. It is flagged here, but read it with the gene-length bias in mind.
Exonic deletions hit immune/adhesion genes (alpha-2-macroglobulin, CEACAM, DMBT1). Some signal is real ecotype divergence; some reflects rapidly evolving, reference-divergent gene families — interpret comparatively.
Inspect methylation, PAV, and gene tracks directly across the SaNama_1.0 assembly.
This analysis is deliberately conservative. The constraints below shape every claim on this page and are baked into the candidate rankings.
The reference is a lean-background genome. SaNama_1.0 was built from a doubled-haploid Seneca Lake (lean-morphotype) fish. Siscowet diverges more from it, so siscowet reads map less completely — inflating apparent siscowet-specific deletions and reducing methylation power in the most divergent regions. Siscowet and lean variant counts are not magnitude-comparable.
No single CpG survives genome-wide correction. 0 DMCs at q < 0.1. Interpretation leads with DMR-level and stringent-PAV sets; single-CpG and lenient-PAV hits are hypothesis-generating only.
Expression support is weak by design. The liver RNA-seq is from a separate parasite study with different individuals, so it serves as orthogonal support — never confirmation.
Enrichment confounders. The PAV GO signal carries a gene-length bias (long calcium/ion-channel genes accumulate deletions by chance); the DMR GO signal is a tandem-cluster artifact. Associations, not causation — no functional validation was performed.
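
One way to probe the gene-length bias is to replace the uniform null with a length-weighted one, in which longer genes are proportionally more likely to be hit. The sketch below is an illustration under stated assumptions, not an analysis that was run for this page: the function names, the toy gene lengths, and the choice of weighted sampling without replacement (Efraimidis-Spirakis keys) are all hypothetical.

```python
import random


def weighted_sample(rng, items, weights, k):
    """Draw k distinct items with probability proportional to weight
    (Efraimidis-Spirakis: keep the k largest keys u ** (1 / w)).
    Weights must be positive."""
    keyed = sorted(
        zip(items, weights),
        key=lambda iw: rng.random() ** (1.0 / iw[1]),
        reverse=True,
    )
    return [item for item, _ in keyed[:k]]


def length_weighted_pvalue(gene_lengths, term_genes, n_hit, observed,
                           n_perm=2000, seed=0):
    """Empirical P(term count >= observed) when n_hit genes are drawn
    with probability proportional to gene length."""
    rng = random.Random(seed)
    genes = list(gene_lengths)
    weights = [gene_lengths[g] for g in genes]
    at_least = 0
    for _ in range(n_perm):
        drawn = weighted_sample(rng, genes, weights, n_hit)
        if sum(g in term_genes for g in drawn) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (at_least + 1) / (n_perm + 1)


# Toy data only: one very long "channel" gene among nine short genes.
toy_lengths = {"long_channel": 1_000_000}
toy_lengths.update({f"short_{i}": 1 for i in range(9)})
p_len = length_weighted_pvalue(
    toy_lengths, {"long_channel"}, n_hit=1, observed=1, n_perm=200
)
print(p_len)
```

In the toy case the long gene is drawn almost every time, so observing it in the hit set is unsurprising under the length-weighted null even though a uniform null would call it a 1-in-10 event. An enrichment that stays significant under such a null is harder to attribute to gene length alone.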
| Field | Value |
|---|---|
| Assembly | GCF_016432855.1 (SaNama_1.0) |
| Species | Salvelinus namaycush |
| BioProject | PRJNA674328 |
| Samples | Lean n=4 · Siscowet n=4 (PacBio HiFi) |
| Annotation | NCBI RefSeq GFF + GO (GAF) |