Quickstart

Note

See the Installation instructions first if you have not yet installed Nipoppy.

Initializing a new dataset

A Nipoppy directory tree can be generated by running nipoppy init in an empty directory. You can also specify the directory path directly (Nipoppy will create it if it doesn’t exist):

$ nipoppy init --dataset <DATASET_ROOT>
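
For example, either of the following initializes a new dataset (my_dataset is a placeholder path):

$ mkdir my_dataset && cd my_dataset && nipoppy init    # run inside an empty directory
$ nipoppy init --dataset my_dataset                    # or pass the path directly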

Warning

Initializing inside a non-empty directory will result in an error.

The newly created directory tree follows the Nipoppy specification. Other Nipoppy commands expect all these directories to exist – they will throw an error if that is not the case.

Tip

Each subdirectory contains a README.md file that briefly describes the purpose of the subdirectory and the type of data that it should contain.

Creating/modifying required files

Nipoppy requires two user-provided files in each dataset: a JSON configuration file and a tabular manifest file. Commands will result in errors if either of these files is missing or invalid.

Note

The nipoppy init command copies examples of these files to the expected paths within a new dataset, but you will most likely have to modify/overwrite them.

Customizing the global configuration file

The global configuration file at <DATASET_ROOT>/global_config.json contains high-level essential configurations for running Nipoppy commands. It starts out like this:

{
    "SUBSTITUTIONS": {
        "[[NIPOPPY_DPATH_CONTAINERS]]": "[[NIPOPPY_DPATH_ROOT]]/containers",
        "[[HPC_ACCOUNT_NAME]]": ""
    },
    "DICOM_DIR_PARTICIPANT_FIRST": true,
    "CONTAINER_CONFIG": {
        "COMMAND": "apptainer",
        "ARGS": [
            "--cleanenv"
        ],
        "ENV_VARS": {
            "PYTHONUNBUFFERED": "1"
        }
    },
    "HPC_PREAMBLE": [
        "export PYTHONUNBUFFERED=1"
    ],
    "PIPELINE_VARIABLES": {
        "BIDSIFICATION": {},
        "PROCESSING": {},
        "EXTRACTION": {}
    },
    "CUSTOM": {}
}

By default, this file does not contain any pipeline-specific information, since the dataset does not have any pipelines installed yet. Still, there are fields that may need to be modified depending on your setup:

  • If you are on a system that still uses Singularity (which has been renamed to Apptainer), change CONTAINER_CONFIG -> COMMAND from "apptainer" to "singularity".

  • If your group uses a shared directory for storing container image files, you can replace "[[NIPOPPY_DPATH_ROOT]]/containers" with the full path to that shared directory.

    • Alternatively, you can create a symlink from <DATASET_ROOT>/containers to that directory (this line in the configuration can then be deleted); see the sketch after this list.
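
A minimal sketch of the symlink approach, assuming a hypothetical shared path /data/shared/containers:

$ rm -r <DATASET_ROOT>/containers    # remove the placeholder directory created by nipoppy init
$ ln -s /data/shared/containers <DATASET_ROOT>/containers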

As pipelines are installed into the dataset, this file may need to be modified to set pipeline-specific variables.
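
For illustration only, a populated PIPELINE_VARIABLES section could look like the sketch below. The pipeline name (fmriprep), version (24.1.1), and variable (TEMPLATEFLOW_HOME) are assumptions made for the sake of the example; the actual names and values are defined by each installed pipeline.

"PIPELINE_VARIABLES": {
    "BIDSIFICATION": {},
    "PROCESSING": {
        "fmriprep": {
            "24.1.1": {
                "TEMPLATEFLOW_HOME": "<PATH_TO_TEMPLATEFLOW_HOME>"
            }
        }
    },
    "EXTRACTION": {}
}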

Generating the manifest file

The manifest file at <DATASET_ROOT>/manifest.tsv contains ground truth information about the participants and visits/sessions available for a dataset.

There must be only one row per unique participant/visit combination.

The example manifest looks like this:

participant_id    visit_id    session_id    datatype
01                BL          BL            ['anat']
01                M06
01                M12         M12           ['anat']
02                BL          BL            ['anat','dwi']
02                M06
02                M12         M12           ['anat','dwi']

It is extremely unlikely that this example manifest matches your dataset, so you will have to generate one yourself. We recommend writing a script, both for reproducibility and for easy updates if more data is added to the dataset; a minimal sketch is shown below.
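
For illustration, here is a minimal sketch of such a script. It assumes a hypothetical recruitment file (recruitment.csv) with participant_id, visit_id, and has_imaging columns; the input format and the session/datatype logic will be different for your study.

import csv

# Hypothetical input: one row per participant/visit, e.g.
#   participant_id,visit_id,has_imaging
#   01,BL,yes
with open("recruitment.csv", newline="") as f:
    recruitment = list(csv.DictReader(f))

manifest_rows = []
for row in recruitment:
    has_imaging = row["has_imaging"] == "yes"
    manifest_rows.append({
        "participant_id": row["participant_id"],
        "visit_id": row["visit_id"],
        # session_id and datatype are left empty for visits without
        # imaging data, as in the example manifest above
        "session_id": row["visit_id"] if has_imaging else "",
        "datatype": "['anat']" if has_imaging else "",
    })

with open("manifest.tsv", "w", newline="") as f:
    writer = csv.DictWriter(
        f,
        fieldnames=["participant_id", "visit_id", "session_id", "datatype"],
        delimiter="\t",
    )
    writer.writeheader()
    writer.writerows(manifest_rows)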

Tip

See the schema reference for more information about each column.

Next steps

Note

The rest of this documentation is still a work in progress. If the information you are looking for is not available in the user guide, refer to the commands associated with the data organization or processing step(s) you wish to perform.

Nipoppy protocol

Once the Nipoppy directory tree for a study is created, the next steps are typically to populate it with raw data, move and organize the raw imaging data (typically DICOMs) into a regular structure, then convert the data to BIDS. Depending on the type of raw data you have, your workflow may differ slightly. See the User guide for all available documentation.
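
As a rough sketch, these steps map onto commands like the ones below. Exact command names and options may vary between Nipoppy versions, so check nipoppy --help for what your installation provides:

$ nipoppy reorg --dataset <DATASET_ROOT>    # organize raw DICOMs into a regular structure
$ nipoppy bidsify --dataset <DATASET_ROOT> --pipeline <PIPELINE_NAME>    # convert to BIDS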