This notebook compiles the results to be included in the manuscript.
methdata.summary[methdata.summary$feature=="all", "all5x"] #all characterized loci
## [1] 2030624
methdata.summary[methdata.summary$feature=="all", "methylated"] #all methylated loci
## [1] 1839241
percent(methdata.summary[methdata.summary$feature=="all", "methylated"]/methdata.summary[methdata.summary$feature=="all", "all5x"], accuracy=0.1)
## [1] "90.6%"
# Where methylated loci are located
methdata.summary[methdata.summary$feature=="gene", "methylated"] # No. in genes
## [1] 633991
percent(methdata.summary[methdata.summary$feature=="gene", "methylated"]/methdata.summary[methdata.summary$feature=="all", "methylated"], accuracy=0.1) #% in genes
## [1] "34.5%"
percent(methdata.summary[methdata.summary$feature=="exon", "methylated"]/methdata.summary[methdata.summary$feature=="all", "methylated"], accuracy=0.1) #% in exons
## [1] "14.7%"
percent(methdata.summary[methdata.summary$feature=="intron", "methylated"]/methdata.summary[methdata.summary$feature=="all", "methylated"], accuracy=0.1) #% in introns
## [1] "19.8%"
methdata.summary[methdata.summary$feature=="2kbflank-up", "methylated"] # No. upstream genes
## [1] 85425
percent(methdata.summary[methdata.summary$feature=="2kbflank-up", "methylated"]/methdata.summary[methdata.summary$feature=="all", "methylated"], accuracy=0.1) # % upstream genes
## [1] "4.6%"
methdata.summary[methdata.summary$feature=="2kbflank-down", "methylated"] # No. downstream genes
## [1] 85795
percent(methdata.summary[methdata.summary$feature=="2kbflank-down", "methylated"]/methdata.summary[methdata.summary$feature=="all", "methylated"], accuracy=0.1) # % downstream genes
## [1] "4.7%"
methdata.summary[methdata.summary$feature=="TE", "methylated"] # No. transposable elements
## [1] 254363
percent(methdata.summary[methdata.summary$feature=="TE", "methylated"]/methdata.summary[methdata.summary$feature=="all", "methylated"], accuracy=0.1) # % in transposable elements
## [1] "13.8%"
methdata.summary[methdata.summary$feature=="ASV", "methylated"] # No. overlap w/ ASV
## [1] 1386721
methdata.summary[methdata.summary$feature=="unknown", "methylated"] # No. intergenic
## [1] 593224
percent(methdata.summary[methdata.summary$feature=="unknown", "methylated"]/methdata.summary[methdata.summary$feature=="all", "methylated"], accuracy=0.1) # % intergenic
## [1] "32.3%"
Of the 2,030,624 characterized loci, 1,839,241 were methylated (90.6%). Of the methylated loci, 633,991 were within known genes (34.5%; 14.7% in exons and 19.8% in introns), 85,425 and 85,795 were within 2kb upstream and downstream of known genes, respectively (4.6% and 4.7%), and 254,363 were within transposable elements (13.8%). There were 1,386,721 instances of overlap between methylated loci and alternative splice variants. A further 593,224 methylated loci were not associated with any known region (i.e. intergenic, beyond the 2kb gene-flanking regions; 32.3%).
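The repeated ratio calculations behind this summary can be collapsed into one helper. The following is a minimal sketch, assuming `methdata.summary` has one row per feature with `all5x` and `methylated` columns, as the calls above imply. The toy data frame reuses only counts reported above, and `pct_of_methylated()` is a hypothetical helper name. It formats with base R `sprintf()` rather than `percent()` so that it runs without extra packages.

```r
# Toy stand-in for methdata.summary, built only from counts reported in this notebook.
methdata.summary <- data.frame(
  feature    = c("all", "gene", "2kbflank-up", "2kbflank-down", "TE", "unknown"),
  all5x      = c(2030624, NA, NA, NA, NA, NA),  # only the "all" value is reported above
  methylated = c(1839241, 633991, 85425, 85795, 254363, 593224),
  stringsAsFactors = FALSE
)

# Hypothetical helper: share of all methylated loci that fall within each feature.
pct_of_methylated <- function(summary_df, features) {
  total  <- summary_df$methylated[summary_df$feature == "all"]
  counts <- summary_df$methylated[match(features, summary_df$feature)]
  setNames(sprintf("%.1f%%", 100 * counts / total), features)
}

pct <- pct_of_methylated(
  methdata.summary,
  c("gene", "2kbflank-up", "2kbflank-down", "TE", "unknown")
)
print(pct)
```

Looking features up with `match()` keeps the output in the order requested, so the result can be pasted into a sentence or table without reordering.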
In total, 33,738 loci were analyzed; 1,836,662 loci were discarded because they did not meet the filtering requirement of 10-100 reads in 7 of the 9 samples per population.
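The coverage filter can be sketched as follows. This is an illustration under stated assumptions, not the code used for the analysis: it assumes a loci-by-samples matrix of read counts, interprets the rule as "at least 7 samples per population", and uses two hypothetical populations of 9 samples each. `passes_coverage_filter()` is a hypothetical name.

```r
# Hypothetical helper: TRUE for loci with min_cov to max_cov reads in at least
# min_samples samples of every population ("at least" is an assumption).
passes_coverage_filter <- function(cov, pop, min_cov = 10, max_cov = 100,
                                   min_samples = 7) {
  # Loci x samples logical matrix; missing coverage counts as failing.
  ok <- !is.na(cov) & cov >= min_cov & cov <= max_cov
  # Number of passing samples per population (populations x loci).
  n_ok <- rowsum(t(ok) * 1L, group = pop)
  # Keep a locus only if every population reaches the threshold.
  colSums(n_ok >= min_samples) == nrow(n_ok)
}

# Toy data: 3 loci, 2 hypothetical populations ("A", "B") with 9 samples each.
pop <- rep(c("A", "B"), each = 9)
cov <- matrix(20, nrow = 3, ncol = 18)
cov[2, 1:3]   <- 5    # locus 2: only 6 of 9 samples in population A pass
cov[3, 10:11] <- 150  # locus 3: exactly 7 of 9 samples in population B pass

keep <- passes_coverage_filter(cov, pop)
print(keep)
```

Here locus 2 is dropped because one population falls below the threshold, even though the other population has full coverage.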
Overall, loci were highly methylated: across all samples, mean methylation was 89.6%.
Of all 33,738 evaluated loci, 18,688 were located within known genes (55.4%), 15,943 of which were located within exons (47.3%), 2,385 flanked known genes (within 2kb, 7.1%), 1,588 were found within transposable elements (4.7%), and 4,156 were not found in any known feature (12.3%).
There were 359 loci that were differentially methylated (DMLs) among populations. 219 loci were located within known genes (61.0%), 937 of which were within exons (261.0%), 36 DMLs flanked known genes (within 2kb, 10.0%), 9 were located within transposable elements (2.5%), and 25 were not found in any known feature (7.0%).
The GO MWU analysis did not identify any enriched biological functions. Enrichment analysis using the DAVID tool identified 7 enriched biological processes (Table 1).
GO Term | Biological Process | PValue | Fold Enrichment | Count |
---|---|---|---|---|
GO:0006513 | protein monoubiquitination | 0.010 | 8.0 | 4 |
GO:0006284 | base-excision repair | 0.015 | 7.1 | 4 |
GO:0048565 | digestive tract development | 0.033 | 9.6 | 3 |
GO:0006974 | cellular response to DNA damage stimulus | 0.059 | 2.4 | 7 |
GO:0042127 | regulation of cell proliferation | 0.071 | 4.0 | 4 |
GO:0000902 | cell morphogenesis | 0.083 | 6.0 | 3 |
GO:0055085 | transmembrane transport | 0.099 | 2.8 | 5 |
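As background for the Fold Enrichment column, here is a rough sketch of how such a value is commonly computed: the proportion of the gene list annotated with a term divided by the proportion in the background. The counts below are hypothetical, since the list and background sizes for this analysis are not reported here. The p-value is a plain one-sided hypergeometric test, which is related to, but not the same as, the modified Fisher exact test that DAVID reports. `enrichment()` is a hypothetical helper name.

```r
# Hypothetical helper: fold enrichment and one-sided hypergeometric p-value.
enrichment <- function(list_hits, list_total, bg_hits, bg_total) {
  fold <- (list_hits / list_total) / (bg_hits / bg_total)
  # P(X >= list_hits) when drawing list_total genes from the background.
  p <- phyper(list_hits - 1, bg_hits, bg_total - bg_hits, list_total,
              lower.tail = FALSE)
  c(fold_enrichment = fold, p_value = p)
}

# Hypothetical counts, chosen only for illustration.
res <- enrichment(list_hits = 4, list_total = 200, bg_hits = 50, bg_total = 20000)
print(res)
```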
Of the 1,393 gene regions (genes +/- 2kb) assessed, 279 were differentially methylated. Of these, 96 contained DMLs (determined via a separate analysis).
Biological Processes: regulation of protein kinase activity (P-Value=0.072)
Gene | UniProt accession | Species |
---|---|---|
GCN1, eIF2 alpha kinase activator homolog(GCN1) | Q92616 | Homo sapiens |
kinase D-interacting substrate 220kDa(KIDINS220) | Q9ULH0 | Homo sapiens |
titin(TTN) | Q8WZ42 | Homo sapiens |
titin(Ttn) | A2ASS6 | Mus musculus |
Molecular Functions: ligase activity (P-Value=0.087)
Gene | UniProt accession | Species |
---|---|---|
HECT domain containing 1(Hectd1) | Q69ZR2 | Mus musculus |
HECT, UBA and WWE domain containing 1, E3 ubiquitin protein ligase(HUWE1) | Q7Z6Z7 | Homo sapiens |
PYruvate Carboxylase(pyc-1) | O17732 | Caenorhabditis elegans |
nuclear transcription factor, X-box binding 1(NFX1) | Q12986 | Homo sapiens |
ring finger protein 103(Rnf103) | Q9R1W3 | Mus musculus |
ring finger protein 168(rnf168) | Q7T308 | Danio rerio |
ring finger protein 38(RNF38) | Q9H0F5 | Homo sapiens |
succinate-CoA ligase, alpha subunit(Suclg1) | P13086 | Rattus norvegicus |
tripartite motif containing 2(TRIM2) | A4IF63 | Bos taurus |
tripartite motif-containing 2(Trim2) | D3ZQG6 | Rattus norvegicus |
ubiquitin protein ligase E3 component n-recognin 5(Ubr5) | Q80TP3 | Mus musculus |
Molecular Functions: DNA binding (P-Value=0.091)
Gene | UniProt accession | Species |
---|---|---|
AE binding protein 2(aebp2) | Q7SXV2 | Danio rerio |
AT-hook containing transcription factor 1 L homeolog(ahctf1.L) | Q5U249 | Xenopus laevis |
AT-rich interaction domain 2(ARID2) | Q68CP9 | Homo sapiens |
E1A binding protein p400(Ep400) | Q8CHI8 | Mus musculus |
GLIS family zinc finger 3(Glis3) | Q6XP49 | Mus musculus |
HECT, UBA and WWE domain containing 1, E3 ubiquitin protein ligase(HUWE1) | Q7Z6Z7 | Homo sapiens |
JRK-like(JRKL) | Q9Y4A0 | Homo sapiens |
Nuclear Hormone Receptor family(nhr-41) | Q9N4B8 | Caenorhabditis elegans |
PR domain containing 1, with ZNF domain(Prdm1) | Q60636 | Mus musculus |
PYruvate Carboxylase(pyc-1) | O17732 | Caenorhabditis elegans |
Putative histone H1.6(hil-6) | O16277 | Caenorhabditis elegans |
RAB guanine nucleotide exchange factor (GEF) 1(Rabgef1) | Q9JM13 | Mus musculus |
SET domain, bifurcated 1 L homeolog(setdb1.L) | Q6INA9 | Xenopus laevis |
Zn finger homeodomain 1(zfh1) | P28166 | Drosophila melanogaster |
chromodomain helicase DNA binding protein 8(chd8) | B0R0I6 | Danio rerio |
conserved Plasmodium protein, unknown function(PF14_0175) | Q8ILR9 | Plasmodium falciparum 3D7 |
ligase I, DNA, ATP-dependent S homeolog(lig1.S) | P51892 | Xenopus laevis |
methyl-CpG binding domain protein 6(Mbd6) | Q3TY92 | Mus musculus |
orphan steroid hormone receptor 2(shr2) | Q26622 | Strongylocentrotus purpuratus |
regulatory factor X7(RFX7) | Q2KHR2 | Homo sapiens |
transcription factor B1, mitochondrial(tfb1m) | Q28HM1 | Xenopus tropicalis |
zinc finger and BTB domain containing 24(zbtb24) | Q52KB5 | Danio rerio |
zinc finger protein 236(ZNF236) | Q9UL36 | Homo sapiens |
zinc finger protein 471(ZNF471) | Q9BX82 | Homo sapiens |
zinc finger protein 525(ZNF525) | Q8N782 | Homo sapiens |
zinc finger protein interacting with K protein 1(Zik1) | Q80YP6 | Mus musculus |
zinc finger, MYM-type 4(Zmym4) | A2A791 | Mus musculus |
Cellular Component: midbody (P-Value=0.069)
Gene | UniProt accession | Species |
---|---|---|
CTD phosphatase subunit 1(CTDP1) | Q9Y5B0 | Homo sapiens |
phosphatidylinositol transfer protein, membrane-associated 1(Pitpnm1) | Q5U2N3 | Rattus norvegicus |
septin 7(SEPT7) | Q5R1W1 | Pan troglodytes |
supervillin(SVIL) | O95425 | Homo sapiens |
supervillin(SVIL) | O46385 | Bos taurus |
tetratricopeptide repeat domain 28(TTC28) | Q96AY4 | Homo sapiens |
To examine whether methylation plays a role in population-specific growth traits, we modeled methylation level for each locus using MACAU while controlling for relatedness. Of the 33,284 loci assessed, 20 were associated with oyster size (shell length, with whole wet weight as a covariate). Of these 20 loci, 17 were located within known gene bodies (16 in exons) and 1 flanked a gene (+/- 2kb). Only 1 size-associated locus was also differentially methylated among populations, which suggests that the associations were not primarily due to population structure.
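As a rough illustration of the kind of per-locus association test described above, the sketch below fits an ordinary binomial GLM of methylated versus unmethylated read counts on shell length, with wet weight as a covariate. This is a deliberately simplified stand-in, not MACAU: it has no random effect for relatedness and no overdispersion term. All data and variable names are simulated and hypothetical.

```r
set.seed(1)

# Simulated, hypothetical data for a single locus across 18 oysters.
n            <- 18
shell_length <- rnorm(n, mean = 30, sd = 5)              # hypothetical units
wet_weight   <- 0.3 * shell_length + rnorm(n)            # correlated covariate
total_reads  <- sample(10:100, n, replace = TRUE)        # coverage per sample
p_meth       <- plogis(-3 + 0.1 * shell_length)          # methylation rises with length
meth_reads   <- rbinom(n, size = total_reads, prob = p_meth)

# Binomial GLM on (methylated, unmethylated) counts.
# Simplification: no relatedness random effect, unlike the MACAU model.
fit <- glm(cbind(meth_reads, total_reads - meth_reads) ~ shell_length + wet_weight,
           family = binomial)

# Effect of shell length after adjusting for wet weight.
length_effect <- coef(summary(fit))["shell_length", c("Estimate", "Pr(>|z|)")]
print(length_effect)
```

In a real analysis this test would be repeated for every locus and the p-values corrected for multiple testing; ignoring relatedness, as this sketch does, can inflate associations when relatives share both methylation state and size.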
The GO MWU analysis did not identify any enriched biological functions. Enrichment analysis using the DAVID tool identified 4 enriched biological processes (Table 1).
GO Term | Biological Process | PValue | Fold Enrichment | Count |
---|---|---|---|---|
GO:0006607 | NLS-bearing protein import into nucleus | 5.85E-04 | 68.4 | 3 |
GO:0006610 | ribosomal protein import into nucleus | 0.02238718 | 79.8 | 2 |
GO:0000059 | protein import into nucleus, docking | 0.02791389 | 63.84 | 2 |
GO:0000060 | protein import into nucleus, translocation | 0.04432776 | 39.9 | 2 |
Across all genes that contain methylation data (n=3754), mean Pst was 29.4% +/- 24.7% (SD).
nrow(genes_2kbslop_Pst)
## [1] 3754
percent(mean(genes_2kbslop_Pst$Pst_Values), accuracy = .1)
## [1] "29.4%"
percent(sd(genes_2kbslop_Pst$Pst_Values), accuracy = .1)
## [1] "24.7%"
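How the Pst values were calculated is not shown in this notebook. For orientation only, one common definition is Pst = (c/h²)σ²B / ((c/h²)σ²B + 2σ²W), where σ²B and σ²W are the between- and within-population variances of the trait. The sketch below illustrates that definition under explicit assumptions: c/h² = 1, a balanced design, and variance components taken from a one-way ANOVA. `pst_sketch()` and the toy values are hypothetical and may not match the method used to produce `genes_2kbslop_Pst`.

```r
# Hypothetical helper: Pst for one trait (e.g. methylation of one gene region).
# Assumes equal sample sizes per population and c/h^2 = 1 by default.
pst_sketch <- function(values, pop, c_over_h2 = 1) {
  pop <- factor(pop)
  tab <- anova(lm(values ~ pop))
  ms_between <- tab["pop", "Mean Sq"]
  ms_within  <- tab["Residuals", "Mean Sq"]
  n_per_pop  <- length(values) / nlevels(pop)
  # Method-of-moments variance components, truncated at zero.
  var_between <- max(0, (ms_between - ms_within) / n_per_pop)
  var_within  <- ms_within
  (c_over_h2 * var_between) / (c_over_h2 * var_between + 2 * var_within)
}

pop <- rep(c("A", "B"), each = 3)                           # hypothetical populations
pst_same <- pst_sketch(c(1, 2, 3, 1, 2, 3), pop)            # identical populations
pst_diff <- pst_sketch(c(1, 2, 3, 101, 102, 103), pop)      # strongly diverged
print(c(same = pst_same, diverged = pst_diff))
```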