Running processing pipelines

Just like with the BIDS conversion pipelines, Nipoppy uses the Boutiques framework to run image processing pipelines. By default, new Nipoppy datasets (as created with nipoppy init) are populated with descriptor files and default invocation files for the following processing pipelines:

  • fMRIPrep, a pipeline for preprocessing anatomical and functional MRI data.

  • MRIQC, a pipeline for automated extraction of quality control (QC) metrics.

Note

Although fMRIPrep and MRIQC are both BIDS Apps, Nipoppy can also be used to run pipelines that are not BIDS Apps. Custom pipelines can be added by creating Boutiques descriptor and invocation files and modifying the global configuration file accordingly.

Summary

Prerequisites

  • A Nipoppy dataset with a valid global configuration file and an accurate manifest

  • Raw imaging data organized according to the BIDS standard in the <DATASET_ROOT>/bids directory

If using the default configuration file:

  • The Apptainer (formerly Singularity) container platform installed on your system

    • See the Apptainer documentation for installation instructions

    • Note: Apptainer is only natively supported on Linux systems

  • The container image file for the pipeline you wish to use

    • This can be downloaded by running, e.g., apptainer pull <URI> inside your container directory (see the configuration file for the URI and the example below). Make sure the file path matches what is specified in the configuration file!
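
For example, to pull an fMRIPrep image into your container directory (the URI pattern below is illustrative; use the exact URI from your configuration file):

$ cd <CONTAINER_DIRECTORY>
$ apptainer pull docker://nipreps/fmriprep:<VERSION>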

Caution

Although it is possible to use Nipoppy without containers by modifying the default invocation files, we highly recommend using containerized pipelines to make your workflow as reproducible as possible. Using containers instead of locally installed software can also help avoid conflicts or unwanted interactions between different software/versions.

Data directories

  • <DATASET_ROOT>/bids
    Input – Raw imaging data (NIfTIs) organized according to the BIDS standard

  • <DATASET_ROOT>/derivatives/<PIPELINE_NAME>/<PIPELINE_VERSION>/output
    Output – Derivative files produced by processing pipelines

Commands

  • nipoppy run --dataset <DATASET_ROOT> --pipeline <PIPELINE_NAME>

Workflow

  1. Nipoppy will loop over all participants/sessions that have BIDS data according to the curation status file but have not yet successfully completed the pipeline according to the processing status file

  2. For each participant-session pair:

    1. The pipeline’s invocation will be processed such that template strings related to the participant/session and dataset paths (e.g., [[NIPOPPY_PARTICIPANT_ID]]) are replaced by the appropriate values

    2. A PyBIDS database indexing the BIDS data for this participant and session is created in a subdirectory inside <DATASET_ROOT>/scratch/pybids_db

    3. The pipeline is launched using Boutiques, which combines the processed invocation with the pipeline’s descriptor file to produce and run a command-line expression
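
Conceptually, this last step is similar to launching the pipeline directly with the Boutiques command-line tool. The sketch below is a simplification (Nipoppy calls Boutiques through its Python API and adds container handling on top):

$ bosh exec launch <DESCRIPTOR_FILE> <PROCESSED_INVOCATION_FILE>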

Configuring processing pipelines

Just like with BIDS converters, pipeline and pipeline step configurations are set in the global configuration file (see here for a more complete guide on the fields in this file).

Pipeline step configurations point to several files that can be modified to further customize pipeline runs:

  • INVOCATION_FILE: a JSON file containing key-value pairs specifying runtime parameters. The keys correspond to entries in the pipeline’s descriptor file.

  • PYBIDS_IGNORE_FILE: a JSON file containing a list of file names or patterns to ignore when building the PyBIDS database
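
For example, a PYBIDS_IGNORE_FILE is simply a JSON list of file names or patterns; the entries below are hypothetical:

[
    "/code/",
    "/derivatives/",
    ".*_phasediff.*"
]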

Note

By default, pipeline files are stored in <DATASET_ROOT>/pipelines/<PIPELINE_NAME>-<PIPELINE_VERSION>.

Warning

Pipeline step configurations also have a DESCRIPTOR_FILE field, which points to the Boutiques descriptor of a pipeline. Although descriptor files can be modified, doing so is rarely necessary; we recommend that less advanced users keep the defaults.

Customizing pipeline invocations

The default pipeline invocation files (in <DATASET_ROOT>/pipelines/<PIPELINE_NAME>-<PIPELINE_VERSION>) can be modified by changing existing values or adding new key-value pairs.

Tip

Run the pipeline on a single participant and session with the --simulate flag to check/debug custom invocation files.
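
For example:

$ nipoppy run \
    --dataset <DATASET_ROOT> \
    --pipeline <PIPELINE_NAME> \
    --participant-id <PARTICIPANT_ID> \
    --session-id <SESSION_ID> \
    --simulate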

Note

Because invocations typically need to differ across participants and sessions (among other things), Nipoppy invocations are actually templates that are processed at runtime to replace template strings with actual values. Recognized template strings include:

  • [[NIPOPPY_PARTICIPANT_ID]]: the participant ID without the sub- prefix

  • [[NIPOPPY_SESSION_ID]]: the session ID without the ses- prefix

  • [[NIPOPPY_BIDS_PARTICIPANT_ID]]: the participant ID with the sub- prefix

  • [[NIPOPPY_BIDS_SESSION_ID]]: the session ID with the ses- prefix

  • [[NIPOPPY_<LAYOUT_PROPERTY>]], where <LAYOUT_PROPERTY> is a property in the Nipoppy dataset layout configuration file (all uppercase): any path defined in the Nipoppy dataset layout

  • [[NIPOPPY_DPATH_PIPELINE_OUTPUT]]: the output directory for this pipeline, i.e. <DATASET_ROOT>/derivatives/<PIPELINE_NAME>/<PIPELINE_VERSION>/output

  • [[NIPOPPY_DPATH_PIPELINE_WORK]]: the working directory for this pipeline run, which will be a subdirectory of <DATASET_ROOT>/derivatives/<PIPELINE_NAME>/<PIPELINE_VERSION>/work

  • [[NIPOPPY_DPATH_PIPELINE_BIDS_DB]]: the PyBIDS database for the participant and session
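
As an illustration (a simplified sketch, not the shipped default invocation), a BIDS App invocation could use these template strings as follows; the key names here are hypothetical and must match entries in the pipeline’s descriptor file:

{
    "bids_dir": "[[NIPOPPY_DPATH_BIDS]]",
    "output_dir": "[[NIPOPPY_DPATH_PIPELINE_OUTPUT]]",
    "analysis_level": "participant",
    "participant_label": "[[NIPOPPY_PARTICIPANT_ID]]",
    "work_dir": "[[NIPOPPY_DPATH_PIPELINE_WORK]]"
}

Here [[NIPOPPY_DPATH_BIDS]] follows the [[NIPOPPY_<LAYOUT_PROPERTY>]] pattern described above, assuming DPATH_BIDS is a property in the layout configuration file.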

Running a processing pipeline

Using the command-line interface

To process all participants and sessions in a dataset (sequentially), run:

$ nipoppy run \
    --dataset <DATASET_ROOT> \
    --pipeline <PIPELINE_NAME>

where <PIPELINE_NAME> corresponds to the pipeline name as specified in the global configuration file.

Note

If there are multiple versions for the same pipeline in the global configuration file, use --pipeline-version to specify the desired version. By default, the first version listed for the pipeline will be used.

Similarly, if --pipeline-step is not specified, the first step defined in the global configuration file will be used.
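
For example, to run a specific version and step of a pipeline:

$ nipoppy run \
    --dataset <DATASET_ROOT> \
    --pipeline <PIPELINE_NAME> \
    --pipeline-version <PIPELINE_VERSION> \
    --pipeline-step <PIPELINE_STEP>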

The pipeline can also be run on a single participant and/or session (useful for batching on clusters and testing pipelines/configurations):

$ nipoppy run \
    --dataset <DATASET_ROOT> \
    --pipeline <PIPELINE_NAME> \
    --participant-id <PARTICIPANT_ID> \
    --session-id <SESSION_ID>

Hint

The --simulate argument will make Nipoppy print out the command to be executed with Boutiques (instead of actually executing it). It can be useful for checking runtime parameters or debugging the invocation file.

See the CLI reference page for more information on additional optional arguments.

Note

Log files for this command will be written to <DATASET_ROOT>/logs/run

Using the Python API

from nipoppy.workflows import PipelineRunner

# replace with appropriate values
dpath_root = "<DATASET_ROOT>"
pipeline_name = "<PIPELINE_NAME>"

workflow = PipelineRunner(
    dpath_root=dpath_root,
    pipeline_name=pipeline_name,
)
workflow.run()
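
The runner can also be restricted to a single participant and/or session, or run in simulate mode. The keyword argument names below mirror the CLI options but should be confirmed against the API reference:

workflow = PipelineRunner(
    dpath_root=dpath_root,
    pipeline_name=pipeline_name,
    participant_id="<PARTICIPANT_ID>",  # without the sub- prefix
    session_id="<SESSION_ID>",  # without the ses- prefix
    simulate=True,  # print the command instead of running it
)
workflow.run()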

See the API reference for nipoppy.workflows.PipelineRunner for more information on optional arguments (they correspond to the ones for the CLI).

Next steps

Nipoppy trackers can be used to assess the status of processing pipelines being run on participants/sessions in a dataset.

Once the entire dataset has been processed with a pipeline, Nipoppy extractors can be used to obtain analysis-ready imaging-derived phenotypes (IDPs).