Running processing pipelines¶
Just like with the BIDS conversion pipelines, Nipoppy uses the Boutiques framework to run image processing pipelines. By default, new Nipoppy datasets (as created with nipoppy init) are populated with descriptor files and default invocation files for the following processing pipelines:
fMRIPrep, a pipeline for preprocessing anatomical and functional MRI data.
MRIQC, a pipeline for automated quality control (QC) metric extraction.
Note
Although fMRIPrep and MRIQC are both BIDS Apps, Nipoppy can also be used to run pipelines that are not BIDS Apps. Custom pipelines can be added by creating Boutiques descriptor and invocation files and modifying the global configuration file accordingly.
Summary¶
Prerequisites¶
A Nipoppy dataset with a valid global configuration file and an accurate manifest
See the Quickstart guide for instructions on how to set up a new dataset
Raw imaging data organized according to the BIDS standard in the
<DATASET_ROOT>/bids
directory
If using the default configuration file:
The Apptainer (formerly Singularity) container platform installed on your system
See here for installation instructions
Note: Apptainer is only natively supported on Linux systems
The container image file for the pipeline you wish to use
This can be downloaded by running, e.g., apptainer pull <URI> inside your container directory (see the configuration file for the URI). Make sure the file path matches the one specified in the configuration file!
Caution
Although it is possible to use Nipoppy without containers by modifying the default invocation files, we highly recommend using containerized pipelines to make your workflow as reproducible as possible. Using containers instead of locally installed software can also help avoid conflicts or unwanted interactions between different software/versions.
Data directories¶
Directory | Content description |
---|---|
<DATASET_ROOT>/bids | Input – Raw imaging data (NIfTIs) organized according to the BIDS standard |
<DATASET_ROOT>/derivatives | Output – Derivative files produced by processing pipelines |
Commands¶
Command-line interface:
nipoppy run
Python API:
nipoppy.workflows.PipelineRunner
Workflow¶
Nipoppy will loop over all participants/sessions that have BIDS data according to the curation status file but have not yet successfully completed the pipeline according to the processing status file
An existing, out-of-date curation status file can be updated with
nipoppy track-curation --regenerate
The processing status file can be updated with
nipoppy track
For each participant-session pair:
The pipeline’s invocation is processed so that template strings related to the participant/session and dataset paths (e.g., [[NIPOPPY_PARTICIPANT_ID]]) are replaced by the appropriate values
A PyBIDS database indexing the BIDS data for this participant and session is created in a subdirectory inside <DATASET_ROOT>/scratch/pybids_db
The pipeline is launched using Boutiques, which combines the processed invocation with the pipeline’s descriptor file to produce and run a command-line expression
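The participant/session selection described at the start of this workflow amounts to a set difference. Here is a minimal sketch, assuming the curation and processing status files have already been reduced to sets of (participant ID, session ID) pairs. The function name and the example IDs are hypothetical, and this is not Nipoppy's actual implementation:

```python
def select_pairs_to_run(in_bids, completed):
    """Return participant-session pairs that have BIDS data but have not completed the pipeline."""
    return sorted(set(in_bids) - set(completed))


# Hypothetical contents derived from the curation and processing status files
in_bids = {("01", "1"), ("01", "2"), ("02", "1")}
completed = {("01", "1")}

for participant_id, session_id in select_pairs_to_run(in_bids, completed):
    print(f"would run sub-{participant_id} ses-{session_id}")
```

Re-running after the processing status file has been updated would then skip any pair that has since completed.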
Configuring processing pipelines¶
Just like with BIDS converters, pipeline and pipeline step configurations are set in the global configuration file (see here for a more complete guide on the fields in this file).
There are several files in pipeline step configurations that can be further modified to customize pipeline runs:
INVOCATION_FILE: a JSON file containing key-value pairs specifying runtime parameters. The keys correspond to entries in the pipeline’s descriptor file.
PYBIDS_IGNORE_FILE: a JSON file containing a list of file names or patterns to ignore when building the PyBIDS database
Note
By default, pipeline files are stored in <DATASET_ROOT>/pipelines/<PIPELINE_NAME>-<PIPELINE_VERSION>.
Warning
Pipeline step configurations also have a DESCRIPTOR_FILE field, which points to the Boutiques descriptor of a pipeline. Descriptor files can be modified, but doing so is not required, and we recommend that less advanced users keep the default.
Customizing pipeline invocations¶
Understanding Boutiques descriptors and invocations
Boutiques descriptors have an inputs
field listing all available parameters for the tool being described. As a simple example, let’s use the following descriptor for a dummy “pipeline”:
{
"name": "example",
"description": "An example tool",
"tool-version": "0.1.0",
"schema-version": "0.5",
"command-line": "echo [PARAM1] [PARAM2] [FLAG1]",
"inputs": [
{
"name": "The first parameter",
"id": "basic_param1",
"type": "File",
"optional": true,
"value-key": "[PARAM1]"
},
{
"name": "The second parameter",
"id": "basic_param2",
"type": "String",
"optional": false,
"value-key": "[PARAM2]",
"value-choices": [
"choice1",
"choice2"
]
},
{
"name": "The first flag",
"id": "basic_flag1",
"type": "Flag",
"optional": true,
"command-line-flag": "-f",
"value-key": "[FLAG1]"
}
]
}
Each key in the invocation file should match the id
field in an input described in the descriptor file. The descriptor contains information about the input, such as its type (e.g., file, string, flag), whether it is required or not, etc.
Here is a valid invocation file for the above descriptor:
{
"basic_param1": ".",
"basic_param2": "choice1",
"basic_flag1": true
}
If we pass these two files to Boutiques (or rather, bosh, the Boutiques CLI tool), it will combine them into the following command (and run it):
echo . choice1 -f
Hence, Boutiques allows Nipoppy to abstract away pipeline-specific parameters into JSON text files, giving it the flexibility to run many different kinds of pipelines!
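To make the substitution concrete, here is a deliberately simplified sketch of how the descriptor's command-line template could be filled from the invocation. It is not how Boutiques is actually implemented: the function name is hypothetical, and real Boutiques additionally validates the invocation against the descriptor (required inputs, value choices, types, and so on).

```python
def build_command(descriptor, invocation):
    """Fill a Boutiques-style command-line template from an invocation (simplified)."""
    command = descriptor["command-line"]
    for spec in descriptor["inputs"]:
        value = invocation.get(spec["id"])
        if spec["type"] == "Flag":
            # Flags contribute their command-line flag only when set to true
            text = spec["command-line-flag"] if value else ""
        elif value is None:
            # Inputs missing from the invocation are dropped from the command
            text = ""
        else:
            text = str(value)
        command = command.replace(spec["value-key"], text)
    # Collapse extra whitespace left behind by omitted inputs
    return " ".join(command.split())


# Trimmed-down version of the example descriptor and invocation above
descriptor = {
    "command-line": "echo [PARAM1] [PARAM2] [FLAG1]",
    "inputs": [
        {"id": "basic_param1", "type": "File", "value-key": "[PARAM1]"},
        {"id": "basic_param2", "type": "String", "value-key": "[PARAM2]"},
        {
            "id": "basic_flag1",
            "type": "Flag",
            "command-line-flag": "-f",
            "value-key": "[FLAG1]",
        },
    ],
}
invocation = {"basic_param1": ".", "basic_param2": "choice1", "basic_flag1": True}

print(build_command(descriptor, invocation))  # echo . choice1 -f
```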
See also
See the Boutiques tutorial for a much more comprehensive overview of Boutiques.
The default pipeline invocation files (in <DATASET_ROOT>/pipelines/<PIPELINE_NAME>-<PIPELINE_VERSION>) can be modified by changing existing values or adding new key-value pairs.
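Invocation files are plain JSON, so they can be edited by hand or with a short script. The following sketch uses only the Python standard library; the helper name is hypothetical, and the keys are those of the dummy descriptor shown earlier rather than parameters of a real pipeline:

```python
import json
import tempfile
from pathlib import Path


def update_invocation(fpath, updates):
    """Merge key-value pairs into an invocation JSON file and return the result."""
    fpath = Path(fpath)
    invocation = json.loads(fpath.read_text())
    invocation.update(updates)  # overwrites existing keys, adds new ones
    fpath.write_text(json.dumps(invocation, indent=4) + "\n")
    return invocation


# Demonstration on a throwaway copy of the dummy invocation
with tempfile.TemporaryDirectory() as tmpdir:
    fpath = Path(tmpdir) / "invocation.json"
    fpath.write_text(json.dumps({"basic_param1": ".", "basic_param2": "choice1"}))
    updated = update_invocation(
        fpath, {"basic_param2": "choice2", "basic_flag1": True}
    )
    print(updated)
```

Every key added this way still needs to match the id of an input in the pipeline's descriptor file.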
Tip
Run the pipeline on a single participant and session with the --simulate
flag to check/debug custom invocation files.
Note
Because invocations need to differ across participants and sessions (among other things), Nipoppy invocations are actually templates that are processed at runtime to replace template strings with actual values. Recognized template strings include:
[[NIPOPPY_PARTICIPANT_ID]]: the participant ID without the sub- prefix
[[NIPOPPY_SESSION_ID]]: the session ID without the ses- prefix
[[NIPOPPY_BIDS_PARTICIPANT_ID]]: the participant ID with the sub- prefix
[[NIPOPPY_BIDS_SESSION_ID]]: the session ID with the ses- prefix
[[NIPOPPY_<LAYOUT_PROPERTY>]], where <LAYOUT_PROPERTY> is a property in the Nipoppy dataset layout configuration file (all uppercase): any path defined in the Nipoppy dataset layout
[[NIPOPPY_DPATH_PIPELINE_OUTPUT]]: the output directory for this pipeline, i.e., <DATASET_ROOT>/derivatives/<PIPELINE_NAME>/<PIPELINE_VERSION>/output
[[NIPOPPY_DPATH_PIPELINE_WORK]]: the working directory for this pipeline run, which will be a subdirectory of <DATASET_ROOT>/derivatives/<PIPELINE_NAME>/<PIPELINE_VERSION>/work
[[NIPOPPY_DPATH_PIPELINE_BIDS_DB]]: the PyBIDS database for the participant and session
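As an illustration of the template processing described above, the sketch below walks through an invocation and replaces each template string with a concrete value. It is an assumption about the general idea, not Nipoppy's actual code: the function name, the invocation keys, and the example paths are all hypothetical.

```python
def fill_templates(obj, values):
    """Recursively replace [[NIPOPPY_<NAME>]] template strings with actual values."""
    if isinstance(obj, str):
        for name, value in values.items():
            obj = obj.replace(f"[[NIPOPPY_{name}]]", str(value))
        return obj
    if isinstance(obj, list):
        return [fill_templates(item, values) for item in obj]
    if isinstance(obj, dict):
        return {key: fill_templates(item, values) for key, item in obj.items()}
    return obj  # numbers, booleans and null are left untouched


# Hypothetical invocation template and replacement values
template = {
    "participant_label": ["[[NIPOPPY_PARTICIPANT_ID]]"],
    "output_dir": "[[NIPOPPY_DPATH_PIPELINE_OUTPUT]]/[[NIPOPPY_BIDS_PARTICIPANT_ID]]",
    "verbose": True,
}
values = {
    "PARTICIPANT_ID": "01",
    "BIDS_PARTICIPANT_ID": "sub-01",
    "DPATH_PIPELINE_OUTPUT": "/data/my_study/derivatives/example/0.1.0/output",
}

processed = fill_templates(template, values)
print(processed)
```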
Running a processing pipeline¶
Using the command-line interface¶
To process all participants and sessions in a dataset (sequentially), run:
$ nipoppy run \
--dataset <DATASET_ROOT> \
--pipeline <PIPELINE_NAME>
where <PIPELINE_NAME> corresponds to the pipeline name as specified in the global configuration file.
Note
If there are multiple versions for the same pipeline in the global configuration file, use --pipeline-version
to specify the desired version. By default, the first version listed for the pipeline will be used.
Similarly, if --pipeline-step
is not specified, the first step defined in the global configuration file will be used.
The pipeline can also be run on a single participant and/or session (useful for batching on clusters and testing pipelines/configurations):
$ nipoppy run \
--dataset <DATASET_ROOT> \
--pipeline <PIPELINE_NAME> \
--participant-id <PARTICIPANT_ID> \
--session-id <SESSION_ID>
Hint
The --simulate
argument will make Nipoppy print out the command to be executed with Boutiques (instead of actually executing it). It can be useful for checking runtime parameters or debugging the invocation file.
See the CLI reference page for more information on additional optional arguments.
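When batching on a cluster, it can be convenient to generate one nipoppy run command per participant-session pair. The sketch below only builds and prints the commands, using the command-line arguments mentioned on this page. The helper name, dataset path, pipeline name, and IDs are hypothetical placeholders:

```python
import shlex


def build_run_command(
    dataset,
    pipeline,
    pipeline_version=None,
    pipeline_step=None,
    participant_id=None,
    session_id=None,
    simulate=False,
):
    """Build the argument list for a single `nipoppy run` call."""
    args = ["nipoppy", "run", "--dataset", str(dataset), "--pipeline", pipeline]
    optional = {
        "--pipeline-version": pipeline_version,
        "--pipeline-step": pipeline_step,
        "--participant-id": participant_id,
        "--session-id": session_id,
    }
    for flag, value in optional.items():
        if value is not None:
            args += [flag, str(value)]
    if simulate:
        args.append("--simulate")
    return args


# Hypothetical participant-session pairs to submit as separate jobs
pairs = [("01", "1"), ("02", "1")]
for participant_id, session_id in pairs:
    args = build_run_command(
        "/data/my_study",
        "fmriprep",
        participant_id=participant_id,
        session_id=session_id,
        simulate=True,
    )
    print(shlex.join(args))
```

Each printed line can be pasted into a job script, or the argument list can be passed to subprocess.run once the simulated output looks right.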
Note
Log files for this command will be written to <DATASET_ROOT>/logs/run
Using the Python API¶
from nipoppy.workflows import PipelineRunner
# replace by appropriate values
dpath_root = "<DATASET_ROOT>"
pipeline_name = "<PIPELINE_NAME>"
workflow = PipelineRunner(
    dpath_root=dpath_root,
    pipeline_name=pipeline_name,
)
workflow.run()
See the API reference for nipoppy.workflows.PipelineRunner
for more information on optional arguments (they correspond to the ones for the CLI).
Next steps¶
Nipoppy trackers can be used to assess the status of processing pipelines being run on participants/sessions in a dataset.
Once the entire dataset has been processed with a pipeline, Nipoppy extractors can be used to obtain analysis-ready imaging-derived phenotypes (IDPs).