Content is user-generated and unverified.

Creating Consensus Networks from Multiple Pathway Databases in Napistu

Overview

Napistu can merge multiple pathway models from different databases into a single consensus network. The merge resolves entity conflicts, combines identical components, and records which source model(s) each merged entity came from.

Key Components

1. Core Classes

  • SBML_dfs: Represents individual pathway models as collections of pandas DataFrames
  • PWIndex: Manages metadata about pathway files and their locations
  • Identifiers: Handles systematic identifiers for biological entities
  • Source: Tracks provenance information for merged entities

2. Main Function

The primary function for consensus building is construct_consensus_model() in the consensus.py module.

Step-by-Step Process

Step 1: Prepare Your Pathway Index

Create a pathway index that describes all the models you want to merge:

python
from napistu import indices

# Create pathway index from model definitions
pw_index_df = indices.create_pathway_index_df(
    model_keys={"human": "recon3", "mouse": "iMM1415"},
    model_urls={"human": "path/to/recon3.xml", "mouse": "path/to/iMM1415.xml"},
    model_species={"human": "Homo sapiens", "mouse": "Mus musculus"},
    base_path="./models/",
    source_name="metabolic_models",
    file_extension=".xml"
)

# Create PWIndex object
pw_index = indices.PWIndex(pw_index_df)

Step 2: Load Individual Models

Convert your pathway files into SBML_dfs objects:

python
from napistu import consensus

# Load all models from the pathway index
sbml_dfs_dict = consensus.construct_sbml_dfs_dict(
    pw_index=pw_index.index,
    strict=True  # Set to False to skip problematic files
)

Step 3: Build the Consensus Model

Create the consensus network by merging shared entities:

python
# Build consensus model
consensus_model = consensus.construct_consensus_model(
    sbml_dfs_dict=sbml_dfs_dict,
    pw_index=pw_index,
    dogmatic=True  # True: keep genes/transcripts/proteins separate
                   # False: merge them when possible
)

Entity Merging Strategy

Merging Rules

The consensus building process handles entity types in a fixed order, because later types reference earlier ones:

  1. Compartments: Merged based on unique identifiers
  2. Species: Merged using biological qualifier terms (BQB)
  3. Compartmentalized Species: Created from species × compartment combinations
  4. Reactions: Merged based on identical membership (same reactants/products)
  5. Reaction Species: Links between reactions and compartmentalized species
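
To make rule 4 concrete, the sketch below groups reactions whose substrates and products are identical, regardless of the order in which they are listed. It is a toy illustration in plain Python, not Napistu's implementation: the reaction IDs, species labels, and both helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: reaction ID -> (substrates, products), each a set of
# compartmentalized-species labels. None of these names come from Napistu.
reactions = {
    "modelA_R1": ({"glucose[c]", "ATP[c]"}, {"G6P[c]", "ADP[c]"}),
    "modelB_R7": ({"ATP[c]", "glucose[c]"}, {"ADP[c]", "G6P[c]"}),
    "modelB_R9": ({"G6P[c]"}, {"F6P[c]"}),
}


def membership_key(substrates, products):
    """Build an order-independent key describing a reaction's participants."""
    return (frozenset(substrates), frozenset(products))


def merge_by_membership(reactions):
    """Group reaction IDs that share exactly the same substrates and products."""
    groups = defaultdict(list)
    for rxn_id, (substrates, products) in reactions.items():
        groups[membership_key(substrates, products)].append(rxn_id)
    return sorted(sorted(ids) for ids in groups.values())


merged = merge_by_membership(reactions)
print(merged)  # [['modelA_R1', 'modelB_R7'], ['modelB_R9']]
```

The key only matches if the species were merged first, which is why reactions come after species and compartmentalized species in the order above.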

Biological Qualifiers (BQB)

Two modes control how entities are merged:

Dogmatic Mode (dogmatic=True):

  • Only merges entities with BQB_IS or BQB_IS_HOMOLOG_TO relationships
  • Keeps genes, transcripts, and proteins as separate entities
  • More conservative merging

Non-Dogmatic Mode (dogmatic=False):

  • Additionally merges entities with BQB_IS_ENCODED_BY and BQB_ENCODES
  • Allows merging of genes, transcripts, and proteins
  • More aggressive merging
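
The difference between the two modes can be sketched with a toy example. The code below assumes that entities are merged whenever they share an identifier through an allowed qualifier; the grouping uses a small union-find chosen for the illustration, not taken from Napistu. The annotations and the `group_entities` helper are hypothetical; only the qualifier names come from the lists above.

```python
# Qualifier sets for the two modes described above.
DOGMATIC_BQB = {"BQB_IS", "BQB_IS_HOMOLOG_TO"}
NON_DOGMATIC_BQB = DOGMATIC_BQB | {"BQB_IS_ENCODED_BY", "BQB_ENCODES"}

# Hypothetical annotations: entity name -> list of (qualifier, identifier).
annotations = {
    "TP53 protein": [
        ("BQB_IS", "uniprot:P04637"),
        ("BQB_IS_ENCODED_BY", "ensembl:ENSG00000141510"),
    ],
    "TP53 gene": [("BQB_IS", "ensembl:ENSG00000141510")],
    "p53 (other model)": [("BQB_IS", "uniprot:P04637")],
}


def group_entities(annotations, dogmatic=True):
    """Group entities that share an identifier via an allowed qualifier."""
    allowed = DOGMATIC_BQB if dogmatic else NON_DOGMATIC_BQB
    parent = {name: name for name in annotations}

    def find(x):
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # identifier -> first entity annotated with it
    for name, terms in annotations.items():
        for qualifier, identifier in terms:
            if qualifier not in allowed:
                continue
            if identifier in first_seen:
                parent[find(name)] = find(first_seen[identifier])
            else:
                first_seen[identifier] = name

    groups = {}
    for name in annotations:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


dogmatic_groups = group_entities(annotations, dogmatic=True)
loose_groups = group_entities(annotations, dogmatic=False)
print(len(dogmatic_groups), "groups in dogmatic mode")
print(len(loose_groups), "group(s) in non-dogmatic mode")
```

In dogmatic mode the gene stays separate from the two protein entries; in non-dogmatic mode the BQB_IS_ENCODED_BY link pulls all three into one group.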

Advanced Options

Pre-Consensus Analysis

Check ontology compatibility before merging:

python
# Check shared ontologies across models
shared_ontologies, ontology_summary = consensus.pre_consensus_ontology_check(
    sbml_dfs_dict, 
    tablename="species"
)

# Check shared compartments
shared_compartments, compartment_summary = consensus.pre_consensus_compartment_check(
    sbml_dfs_dict,
    tablename="compartments"
)
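
The idea behind an ontology check can be illustrated without Napistu: two models can only be merged on species if they annotate them with at least one common ontology. The sketch below computes pairwise overlaps on made-up data; the model names, ontology sets, and the `pairwise_shared` helper are hypothetical and do not reflect the return format of the functions above.

```python
from itertools import combinations

# Hypothetical input: ontologies used to annotate species in each model.
model_ontologies = {
    "recon3": {"chebi", "kegg.compound", "uniprot"},
    "iMM1415": {"chebi", "kegg.compound"},
    "custom": {"pubchem"},
}


def pairwise_shared(model_ontologies):
    """Return the ontologies shared by each pair of models."""
    return {
        (a, b): model_ontologies[a] & model_ontologies[b]
        for a, b in combinations(sorted(model_ontologies), 2)
    }


shared = pairwise_shared(model_ontologies)

# Pairs with no common ontology cannot be matched on identifiers at all.
unmergeable = [pair for pair, common in shared.items() if not common]

for pair, common in shared.items():
    print(pair, sorted(common))
print("No shared ontology:", unmergeable)
```

A model that shares no ontology with the others, like `custom` here, would pass through the merge without any of its species being matched.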

Post-Consensus Validation

Analyze the merged model:

python
# Check ontologies in consensus model
consensus_ontologies = consensus.post_consensus_species_ontology_check(consensus_model)

# Get network summary statistics
network_stats = consensus_model.get_network_summary()

# Check source coverage
reactions_sources = consensus.post_consensus_source_check(
    consensus_model, 
    table_name="reactions"
)
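
As a rough picture of what a source check summarizes, the sketch below counts how many source models support each consensus reaction and how many reactions each model contributes. The provenance mapping is hypothetical toy data, not the structure Napistu returns.

```python
from collections import Counter

# Hypothetical provenance: consensus reaction ID -> source models it came from.
reaction_sources = {
    "R00001": ["recon3", "iHuman"],
    "R00002": ["recon3"],
    "R00003": ["iHuman"],
}

# How many reactions are supported by 1, 2, ... source models.
support_counts = Counter(len(set(srcs)) for srcs in reaction_sources.values())

# How many consensus reactions each source model contributes to.
per_model = Counter(
    model for srcs in reaction_sources.values() for model in set(srcs)
)

print(dict(support_counts))
print(dict(per_model))
```

A consensus in which almost every reaction has a single source suggests the models barely overlapped, which is often a sign of mismatched identifiers rather than genuinely distinct biology.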

Handling Conflicts and Issues

Common Issues and Solutions

  1. Missing Compartment Information:
python
   consensus_model.infer_uncompartmentalized_species_location()
  2. Missing SBO Terms:
python
   consensus_model.infer_sbo_terms()
  3. Inconsistent Naming:
python
   consensus_model.name_compartmentalized_species()

Validation and Resolution

The consensus model includes automatic validation:

python
# Validate and attempt automatic fixes
consensus_model.validate_and_resolve()

# Validate only (raises errors instead of attempting fixes)
consensus_model.validate()

Export and Analysis

Export Consensus Model

python
# Export the consensus model's tables to the output directory
consensus_model.export_sbml_dfs(
    model_prefix="consensus_",
    outdir="./output/",
    overwrite=True,
    dogmatic=True
)

Query the Consensus Model

python
# Get reaction formulas
formulas = consensus_model.reaction_formulas()

# Search by species name
matching_species = consensus_model.search_by_name(
    name="glucose", 
    entity_type="species"
)

# Get species participation in reactions
species_status = consensus_model.species_status("SPEC_00001")

# Get characteristic identifiers
species_ids = consensus_model.get_characteristic_species_ids(dogmatic=True)

Example Complete Workflow

python
from napistu import consensus, indices

def create_consensus_network(model_configs, output_dir):
    """
    Complete workflow for creating a consensus network
    
    Parameters:
    - model_configs: dict with model metadata
    - output_dir: where to save results
    """
    
    # Step 1: Create pathway index
    pw_index_df = indices.create_pathway_index_df(**model_configs)
    pw_index = indices.PWIndex(pw_index_df)
    
    # Step 2: Load models
    print("Loading individual models...")
    sbml_dfs_dict = consensus.construct_sbml_dfs_dict(
        pw_index=pw_index.index,
        strict=False  # Skip problematic files
    )
    
    print(f"Loaded {len(sbml_dfs_dict)} models")
    
    # Step 3: Pre-consensus analysis
    print("Analyzing model compatibility...")
    species_ontologies, _ = consensus.pre_consensus_ontology_check(
        sbml_dfs_dict, "species"
    )
    print(f"Shared species ontologies: {species_ontologies}")
    
    # Step 4: Build consensus
    print("Building consensus model...")
    consensus_model = consensus.construct_consensus_model(
        sbml_dfs_dict=sbml_dfs_dict,
        pw_index=pw_index,
        dogmatic=True
    )
    
    # Step 5: Validate and export
    print("Validating consensus model...")
    consensus_model.validate_and_resolve()
    
    print("Exporting results...")
    consensus_model.export_sbml_dfs(
        model_prefix="consensus_",
        outdir=output_dir,
        overwrite=True
    )
    
    # Step 6: Summary statistics
    stats = consensus_model.get_network_summary()
    print("Consensus model contains:")
    print(f"  - {stats['n_species']} species")
    print(f"  - {stats['n_reactions']} reactions")
    print(f"  - {stats['n_compartments']} compartments")
    
    return consensus_model

# Usage example
model_config = {
    "model_keys": {"recon": "recon3", "bigg": "iHuman"},
    "model_urls": {"recon": "recon3.xml", "bigg": "iHuman.xml"},
    "model_species": {"recon": "Homo sapiens", "bigg": "Homo sapiens"},
    "base_path": "./models/",
    "source_name": "human_metabolism"
}

consensus_net = create_consensus_network(model_config, "./consensus_output/")

Key Considerations

Performance

  • Large models may require significant memory and processing time
  • Consider filtering models by species or pathway type before merging
  • Use strict=False to skip problematic files during loading

Data Quality

  • Ensure consistent identifier formats across input models
  • Validate source models individually before consensus building
  • Review merge conflicts in the log output

Biological Interpretation

  • Choose dogmatic mode based on your analysis goals
  • Consider the biological meaning of merged entities
  • Validate that consensus reactions are biochemically meaningful

Troubleshooting

Common Error Messages

  1. "Missing identifiers": Some entities lack systematic identifiers
    • Solution: Check input model quality or use non-strict loading
  2. "Foreign key violations": Inconsistent entity relationships
    • Solution: Validate individual models first
  3. "Duplicate primary keys": Conflicting entity IDs
    • Solution: Check for ID formatting issues in source models
  4. "Underspecified reactions": Reactions missing critical components
    • Solution: Review reaction definitions in source models
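
Errors 2 and 3 can be reproduced on toy tables, which may help when tracking them down in a source model. The sketch below assumes tables are simple lists of row dictionaries; the table contents, column names, and both helper functions are hypothetical and stand in for whatever checks Napistu runs internally.

```python
from collections import Counter

# Hypothetical toy tables. "S2" appears twice (duplicate primary key) and
# "SC2" points at a species "S3" that does not exist (foreign key violation).
species = [{"s_id": "S1"}, {"s_id": "S2"}, {"s_id": "S2"}]
compartmentalized_species = [
    {"sc_id": "SC1", "s_id": "S1"},
    {"sc_id": "SC2", "s_id": "S3"},
]


def duplicate_keys(rows, key):
    """Return primary-key values that occur more than once."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


def orphaned_foreign_keys(child_rows, fk, parent_rows, pk):
    """Return foreign-key values in the child table with no matching parent."""
    parent_keys = {row[pk] for row in parent_rows}
    return sorted({row[fk] for row in child_rows if row[fk] not in parent_keys})


dupes = duplicate_keys(species, "s_id")
orphans = orphaned_foreign_keys(compartmentalized_species, "s_id", species, "s_id")
print("Duplicate primary keys:", dupes)    # ['S2']
print("Orphaned foreign keys:", orphans)   # ['S3']
```

Running checks like these on each source model before merging narrows an error down to a single file instead of the whole consensus.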

The consensus building process preserves all source information, so you can trace any entity back to its original model(s) and see how the merge was performed.
