sopa

sopa is an AI-ready, scalable, and technology-invariant pipeline for large-scale spatial omics analysis. AI agents can use it to process complex biological data autonomously.

SciencePedia AI Insight

This page describes sopa as AI for Science infrastructure for spatial omics analysis. It is provided through Docker images and standard cases, so it is machine-readable and ready to run on diverse data types. AI agents can call the pipeline to process large-scale spatial data, perform quality control, and extract biological insights, supporting research in systems biomedicine and developmental biology.

INFRASTRUCTURE STATUS:
Docker Verified
MCP Agent Ready

sopa is a technology-invariant pipeline for the scalable processing and analysis of large-scale spatial omics data. It handles datasets of millions of cells generated by platforms including Xenium, Visium, and MERSCOPE, which makes it suited to research that needs high-throughput, spatially resolved measurements.

sopa is used mainly in Systems Biomedicine, Developmental Biology, and Computational Biology, where it addresses the analysis of spatially resolved omics data and spatial transcriptomics. Researchers use it to study how cells and molecules are organized within tissues, going beyond bulk or single-cell analyses to capture context-dependent biological phenomena. Typical questions concern gene expression mapping in situ, cellular heterogeneity, and cell-cell interactions within complex tissue environments.

Practical applications of sopa fall into three groups. The first is mapping gene expression in situ with technologies such as 10x Visium, Slide-seqV2, MERFISH, and Xenium. The second is identifying and characterizing cellular microenvironments, defined as recurring spatial configurations of cell types and molecules. The third is analyzing cell-cell interactions based on spatial proximity: for instance, computing adjacency matrices for epsilon-radius graphs (two cells are connected when they lie within a distance epsilon of each other) to evaluate cellular connectivity as a function of distance within tissues. The tool also supports data preprocessing, such as defining and applying spot-level quality control metrics (e.g., total UMIs, mitochondrial fraction, number of detected genes) and preserving data integrity by linking expression matrices, coordinates, segmentation masks, and image pyramids into a minimal interoperable schema. Integrating genomic and spatial information in this way supports the study of tissue architecture, disease progression, and developmental processes across datasets of millions of cells.
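
As a rough illustration of two of the steps described above, the following Python sketch computes spot-level QC metrics and an epsilon-radius adjacency matrix. It is not sopa's own API: the function names (spot_qc, radius_adjacency, mean_degree), the "MT-" mitochondrial prefix (a human gene-naming convention), and the toy inputs are assumptions made for this example. It uses only the standard library and a quadratic pairwise loop, so it suits small inputs; at the scale of millions of cells a spatial index and sparse matrices would be needed instead.

```python
import math


def spot_qc(counts, genes, mito_prefix="MT-"):
    """Compute QC metrics for one spot from per-gene UMI counts.

    mito_prefix is an assumption: "MT-" matches human gene symbols only.
    """
    total = sum(counts)
    detected = sum(1 for c in counts if c > 0)
    mito = sum(c for c, g in zip(counts, genes) if g.startswith(mito_prefix))
    return {
        "total_umis": total,
        "n_genes": detected,
        "mito_fraction": mito / total if total else 0.0,
    }


def radius_adjacency(coords, eps):
    """Return a symmetric 0/1 adjacency matrix for an epsilon-radius graph.

    Two points are connected when their Euclidean distance is <= eps.
    The diagonal stays 0 (no self-loops).
    """
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= eps:
                adj[i][j] = adj[j][i] = 1
    return adj


def mean_degree(adj):
    """Average number of neighbors per node, a simple connectivity summary."""
    return sum(map(sum, adj)) / len(adj) if adj else 0.0


# Hypothetical toy data: four cell centroids and one spot's counts.
coords = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0), (10.0, 6.0)]
genes = ["MT-CO1", "ACTB", "GAPDH", "MT-ND1"]
counts = [5, 10, 0, 5]

qc = spot_qc(counts, genes)

# Connectivity as a function of distance: sweep the radius.
connectivity = {eps: mean_degree(radius_adjacency(coords, eps))
                for eps in (5.5, 6.5, 8.0, 12.0)}
```

Sweeping the radius, as in the last statement, is one simple way to see how connectivity grows with distance; thresholds on the QC dictionary (for example a maximum mitochondrial fraction) would then decide which spots to keep.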

Spatial Transcriptomics for Mapping Gene Expression in Situ
Analysis of Spatially Resolved Omics Data

Tool Build Parameters