=== Log initialized: ../output/26.4-ElasticNet-permutation-sig4/ElasticNet_permutation_log_20251210_171051.txt ===
Timestamp: 2025-12-10 17:10:51.553572

Session info:
R version 4.5.2 (2025-10-31)
Platform: x86_64-conda-linux-gnu
Running under: Ubuntu 24.04.3 LTS

Matrix products: default
BLAS/LAPACK: /home/shared/8TB_HDD_02/shedurkin/.local/share/r-miniconda/envs/r_enet_rscript/lib/libopenblasp-r0.3.30.so; LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] pbapply_1.7-4       doParallel_1.0.17
 [3] iterators_1.0.14    foreach_1.5.2
 [5] scales_1.4.0        genefilter_1.80.3
 [7] ggfortify_0.4.18    vegan_2.7-1
 [9] permute_0.9-8       factoextra_1.0.7
[11] caret_7.0-1         lattice_0.22-7
[13] glmnet_4.1-10       Matrix_1.7-4
[15] pheatmap_1.0.13     rvest_1.0.5
[17] corrplot_0.95       ggcorrplot_0.1.4.1
[19] reshape2_1.4.4      edgeR_3.40.2
[21] limma_3.54.2        WGCNA_1.73
[23] fastcluster_1.3.0   dynamicTreeCut_1.63-1
[25] ggraph_2.2.1        tidygraph_1.3.1
[27] psych_2.5.6         igraph_2.1.4
[29] DESeq2_1.38.3       SummarizedExperiment_1.28.0
[31] Biobase_2.58.0      MatrixGenerics_1.10.0
[33] matrixStats_1.5.0   GenomicRanges_1.50.2
[35] GenomeInfoDb_1.34.9 IRanges_2.32.0
[37] S4Vectors_0.36.2    BiocGenerics_0.44.0
[39] lubridate_1.9.4     forcats_1.0.1
[41] stringr_1.6.0       dplyr_1.1.4
[43] purrr_1.2.0         readr_2.1.6
[45] tidyr_1.3.1         tibble_3.3.0
[47] ggplot2_3.5.2       tidyverse_2.0.0

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3     shape_1.4.6.1       rstudioapi_0.17.1
  [4] magrittr_2.0.4         farver_2.1.2        rmarkdown_2.29
  [7] zlibbioc_1.44.0        vctrs_0.6.5         memoise_2.0.1
 [10] RCurl_1.98-1.17        base64enc_0.1-3     htmltools_0.5.8.1
 [13] pROC_1.18.5            Formula_1.2-5       parallelly_1.45.0
 [16] htmlwidgets_1.6.4      plyr_1.8.9          impute_1.72.3
 [19] cachem_1.1.0           lifecycle_1.0.4     pkgconfig_2.0.3
 [22] R6_2.6.1               fastmap_1.2.0       future_1.58.0
 [25] GenomeInfoDbData_1.2.9 digest_0.6.39       colorspace_2.1-2
 [28] AnnotationDbi_1.60.2   geneplotter_1.76.0  Hmisc_5.2-3
 [31] RSQLite_2.4.2          timechange_0.3.0    mgcv_1.9-4
 [34] httr_1.4.7             polyclip_1.10-7     compiler_4.5.2
 [37] bit64_4.6.0-1          withr_3.0.2         htmlTable_2.4.3
 [40] backports_1.5.0        BiocParallel_1.32.6 viridis_0.6.5
 [43] DBI_1.2.3              ggforce_0.5.0       lava_1.8.1
 [46] MASS_7.3-65            DelayedArray_0.24.0 ModelMetrics_1.2.2.2
 [49] tools_4.5.2            foreign_0.8-90      future.apply_1.20.0
 [52] nnet_7.3-20            glue_1.8.0          nlme_3.1-168
 [55] grid_4.5.2             checkmate_2.3.2     cluster_2.1.8.1
 [58] recipes_1.3.1          generics_0.1.4      gtable_0.3.6
 [61] tzdb_0.5.0             class_7.3-23        preprocessCore_1.60.2
 [64] data.table_1.17.8      hms_1.1.4           xml2_1.5.0
 [67] XVector_0.38.0         ggrepel_0.9.6       pillar_1.11.1
 [70] splines_4.5.2          tweenr_2.0.3        survival_3.8-3
 [73] bit_4.6.0              annotate_1.76.0     tidyselect_1.2.1
 [76] GO.db_3.16.0           locfit_1.5-9.12     Biostrings_2.66.0
 [79] knitr_1.50             gridExtra_2.3       xfun_0.54
 [82] graphlayouts_1.2.2     hardhat_1.4.1       timeDate_4041.110
 [85] stringi_1.8.7          evaluate_1.0.5      codetools_0.2-20
 [88] cli_3.6.5              rpart_4.1.24        xtable_1.8-4
 [91] dichromat_2.0-0.1      Rcpp_1.1.0          globals_0.18.0
 [94] png_0.1-8              XML_3.99-0.18       gower_1.0.2
 [97] blob_1.2.4             bitops_1.0-9        listenv_0.9.1
[100] viridisLite_0.4.2      ipred_0.9-15        prodlim_2025.04.28
[103] crayon_1.5.3           rlang_1.1.6         KEGGREST_1.38.0
[106] mnormt_2.1.1

Start time: 2025-12-10 17:10:51.621944

Defining model functions
Loading gene counts
Loading miRNA counts
Loading lncRNA counts
Loading WGBS data
Loading metadata table
Species prefix and code: ACR , Apul
Number of raw WGBS sites: 21621
Number of WGBS sites retained after filtering: 10302
Retained 20 of 20 genes; 51 of 51 miRNA; and 15559 of 15559 lncRNA through presence filtering
Retained 20 of 20 genes; 47 of 51 miRNA; and 15559 of 15559 lncRNA through pOverA filtering
Predictor set dimensions: 39
Predictor set dimensions: 25908
Gene set dimensions: 39
Gene set dimensions: 20

=== PART 1: Bootstrapped Elastic Net Training ===

--- First Bootstrap Round ---
Running 5 bootstrap replicates using 10 core(s)
Bootstrap1: Running in parallel mode...
Bootstrap1: 0/5 (0.0%)
Bootstrap1: 5/5 (100.0%)
Finished first round of bootstrapping
On first round of bootstrapping, 9 genes are consistently well-predicted

--- Second Bootstrap Round ---
Number of well-predicted genes for second bootstrap: 9
Running 5 bootstrap replicates using 10 core(s)
Bootstrap2: Running in parallel mode...
Bootstrap2: 0/5 (0.0%)
Bootstrap2: 5/5 (100.0%)
On second round of bootstrapping, 8 genes are again consistently well-predicted
Bootstrapping completed in 6.39 minutes

=== PART 2: Permutation-Based Significance Testing ===

This step assesses statistical significance of each predictor coefficient
by comparing observed coefficients to a null distribution from permuted data.

Running permutation tests for 9 well-predicted genes
Calculating permutation p-values for 9 genes with 5000 permutations each
Total predictors per gene: 25908
Using 10 core(s) for gene-level parallelization
NOTE: Only non-zero coefficients will be tested (reduces multiple testing burden)
Permutation: Running in parallel mode across 9 genes...
Permutation: 0/9 (0.0%)
Permutation: 9/9 (100.0%)

=== Permutation Test Summary ===
Total predictor-gene combinations: 233172
Non-zero coefficients: 437 (0.19%)
Predictors tested: 437
Predictors skipped (zero coef): 232735
Permutation testing completed in 13.43 minutes

Applying FDR (Benjamini-Hochberg) correction for multiple testing
FDR correction applied to 437 tests across 9 genes
Average tests per gene: 48.6 (vs. 25908 total predictors)

=== Significant Predictor Summary ===
Number of significant predictor-gene associations (FDR < 0.05): 437
Number of genes with at least one significant predictor: 9

Breakdown of TESTED predictors by type:
# A tibble: 3 × 4
  Predictor_Type N_tested N_significant Pct_significant
1 Other                58            58             100
2 lncRNA              378           378             100
3 miRNA                 1             1             100

Breakdown of SIGNIFICANT predictors by type:
# A tibble: 3 × 4
  Predictor_Type N_significant N_unique_predictors N_genes_affected
1 Other                     58                  57                8
2 lncRNA                   378                 364                9
3 miRNA                      1                   1                1

=== PART 3: Generating Visualizations and Output Files ===

Creating summary plots...
Creating per-gene permutation test plots...
Saving permutation test results...
Enhancing top predictors table with significance information...

=== ANALYSIS COMPLETE ===

End time: 2025-12-10 17:32:07.457525

Timing summary:
  - Bootstrapping: 6.39 minutes
  - Permutation testing: 13.43 minutes
  - Total runtime: 19.82 minutes

Output files generated:
  1. Apul_permutation_results_full.csv - All predictor-gene p-values
  2. Apul_significant_predictors.csv - FDR < 0.05 associations only
  3. Apul_gene_summary.csv - Summary per gene
  4. Apul_predictor_type_summary.csv - Summary by predictor type
  5. Apul_top_predictors_with_significance.csv - Top predictors with p-values
  6. Various plots in ../output/26.4-ElasticNet-permutation-sig4/

Key statistics:
  - Total genes analyzed: 20
  - Well-predicted genes (R2 > 0.50): 9
  - Significant predictor-gene associations: 437
  - Bootstrap replicates: 5 (round 1), 5 (round 2)
  - Permutations per gene: 5000
  - Cores used: 10

Statistical notes:
  - P-values were calculated using permutation testing
  - Multiple testing correction: Benjamini-Hochberg FDR (within each gene)
  - Significance threshold: FDR < 0.05

Log file saved to: ../output/26.4-ElasticNet-permutation-sig4/ElasticNet_permutation_log_20251210_171051.txt

=== END OF SCRIPT ===

=== Log closed: 2025-12-10 17:32:07.511879 ===