{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# STDM Quick Start Tutorial\n", "\n", "This notebook demonstrates the basic usage of the STDM package for tensor decomposition of gene expression data.\n", "\n", "## Overview\n", "\n", "We'll cover:\n", "1. Loading and preprocessing gene expression data\n", "2. Building multi-dimensional tensors\n", "3. Fitting CP decomposition\n", "4. Visualizing results\n", "5. Interpreting biological patterns\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Import required packages\n", "import sys\n", "from pathlib import Path\n", "\n", "# Add src to path if running from examples directory\n", "sys.path.insert(0, str(Path.cwd().parent / \"src\"))\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "\n", "from stdm import (\n", " DataLoader,\n", " TensorBuilder,\n", " CPDecomposition,\n", " plot_components,\n", " plot_temporal_patterns,\n", " plot_species_comparison,\n", " plot_gene_loadings\n", ")\n", "from stdm.utils import compute_explained_variance\n", "\n", "%matplotlib inline\n", "sns.set_style('whitegrid')\n", "print(\"Imports successful!\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. 
Load Gene Expression Data\n", "\n", "First, load the merged gene expression data, which contains all species.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set up paths\n", "data_dir = Path.cwd().parent / \"data\" / \"merge-vst\"\n", "\n", "# Load data\n", "loader = DataLoader(data_dir)\n", "data = loader.load_merged_data()\n", "\n", "print(f\"Data shape: {data.shape}\")\n", "print(f\"Genes: {data.shape[0]}\")\n", "print(f\"Samples: {data.shape[1]}\")\n", "print(\"\\nFirst few rows:\")\n", "data.head()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "In this tutorial, we completed a full tensor decomposition analysis of gene expression data. The STDM package provides tools for:\n", "- Multi-species gene expression analysis\n", "- Temporal pattern discovery\n", "- Species comparison\n", "- Marker gene identification\n", "\n", "For more examples and advanced usage, see the `examples/` directory and the README.\n" ] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 2 }