The recruitment file

Participant IDs generated during the data capture/collection stage are often not ideal for subsequent data wrangling tasks: they may contain personal information (e.g. names) or special characters (e.g. spaces, dashes, or punctuation) not permitted by data standards such as BIDS. If imaging and clinical data are collected by separate people, a study may also end up with multiple sets of participant IDs that must be remapped at a later stage.

The Nipoppy recruitment file is an optional TSV file that aims to solve these issues as early as possible, i.e. before the data curation stage. It maps all sets of participant-related IDs generated during the data collection stage to the participant_id column, which serves as the primary ID in all subsequent tasks.

The participant_id must comply with the BIDS specification, which only allows alphanumeric characters (i.e. letters and numbers), and it should not include the sub- prefix. If one of the sets of participant IDs generated within the study already meets these conditions, that set can be used as the participant_id column. Otherwise, users need to create a new set that is BIDS-compliant.
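The rule above can be checked programmatically. The following is a minimal sketch, not part of Nipoppy itself: the function name is_valid_participant_id is hypothetical, and the check assumes that "alphanumeric" means ASCII letters and digits only.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
_ALPHANUMERIC = re.compile(r"[A-Za-z0-9]+")


def is_valid_participant_id(participant_id: str) -> bool:
    """Return True if the ID is non-empty and purely alphanumeric.

    An ID carrying the "sub-" prefix is rejected as well, because the
    hyphen is not an alphanumeric character.
    """
    return _ALPHANUMERIC.fullmatch(participant_id) is not None


# IDs taken from the example recruitment file in this section.
for candidate in ["01", "MNI-MR.POPPY", "MNI_POPPY", "sub-01"]:
    print(candidate, is_valid_participant_id(candidate))
```

In this sketch only 01 passes; the recruitment and clinical IDs fail because of the dash, dot, and underscore.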

Tip

  • Exclude special characters (e.g. -, _), spaces, and punctuation.

  • Use zero-padding for numerical labels to ensure correct alphabetical sorting (e.g. 01 and 10 instead of 1 and 10).
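To illustrate why zero-padding matters, here is a small sketch that pads numeric labels to a common width and compares the resulting sort order. The helper zero_pad_ids is a hypothetical name introduced for this example only.

```python
def zero_pad_ids(numbers, width=None):
    """Convert integers to zero-padded strings of equal width.

    If width is not given, the width of the longest number is used.
    """
    numbers = list(numbers)
    if not numbers:
        return []
    if width is None:
        width = max(len(str(n)) for n in numbers)
    return [str(n).zfill(width) for n in numbers]


# Without padding, string sorting places "10" before "2".
unpadded = sorted(str(n) for n in [1, 2, 10])
# With padding, alphabetical order matches numerical order.
padded = sorted(zero_pad_ids([1, 2, 10]))

print(unpadded)  # ['1', '10', '2']
print(padded)    # ['01', '02', '10']
```

If the cohort is expected to grow, passing an explicit width (e.g. width=3) avoids having to relabel participants later.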

This TSV file can also include a participant_dicom_dir column that lists each DICOM directory path relative to <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/pre_reorg. This handles cases where DICOM directory names and participant IDs are not the same (see the DICOM reorganization guide for details). Note that the participant_dicom_dir mapping can instead be specified in a separate DICOM_DIR_MAP_FILE in cases where the recruitment file is not needed.
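As a sketch of how a participant_dicom_dir value relates to the project layout, the snippet below joins it onto the pre_reorg directory. The function resolve_dicom_dir and the project root /data/my_study are hypothetical and only serve the illustration; this is not how Nipoppy necessarily resolves the path internally.

```python
from pathlib import Path


def resolve_dicom_dir(project_root, participant_dicom_dir):
    """Build the full path of a DICOM directory.

    participant_dicom_dir is interpreted relative to
    <project_root>/sourcedata/imaging/pre_reorg.
    """
    pre_reorg = Path(project_root) / "sourcedata" / "imaging" / "pre_reorg"
    return pre_reorg / participant_dicom_dir


# Hypothetical project root; the relative path comes from the example file.
dicom_dir = resolve_dicom_dir("/data/my_study", "POPPY-MRI-MAY-2024/ses-BL")
print(dicom_dir.as_posix())
```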

For sanity checks, this file can also be used to list the larger cohort of originally recruited participants. This can help avoid possible confusion created by drop-outs or exclusions during the subsequent study stages (i.e. curation, processing, extraction, and analysis).
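One way to run such a sanity check is to compare the recruited participant IDs against the IDs that remain at a later stage. The sketch below assumes both lists have already been read into memory; the function name and the ID 03 are made up for illustration.

```python
def find_missing_participants(recruited_ids, curated_ids):
    """Return recruited IDs that are absent from the curated list, sorted."""
    return sorted(set(recruited_ids) - set(curated_ids))


# Hypothetical IDs: participant 02 was recruited but is no longer present.
recruited = ["01", "02", "03"]
curated = ["01", "03"]
missing = find_missing_participants(recruited, curated)
print(missing)  # ['02']
```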

The recommended location for the recruitment.tsv file is the <NIPOPPY_PROJECT_ROOT>/sourcedata directory.

Here is an example recruitment file:

| participant_id | visit_id | session_id | recruitment_id | clinical_id | participant_dicom_dir |
|---|---|---|---|---|---|
| 01 | BL | BL | MNI-MR.POPPY | MNI_POPPY | POPPY-MRI-MAY-2024/ses-BL |
| 01 | M06 | | MNI-MR.POPPY | MNI_POPPY | |
| 01 | M12 | M12 | MNI-MR.POPPY | MNI_POPPY | POPPY-MRI-MAY-2025/ses-M12 |
| 02 | BL | BL | MNI-MS.SESAME | MNI_SESAME | SESAME-MRI-JUNE-2024/ses-BL |
| 02 | M06 | | MNI-MS.SESAME | MNI_SESAME | |
| 02 | M12 | M12 | MNI-MS.SESAME | MNI_SESAME | SESAME-MRI-JULY-2025/ses-M12 |
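A file like the example above can be used to remap another set of IDs to participant_id. The following sketch uses only the Python standard library and an in-memory excerpt of the example; the function build_id_map is hypothetical, and the M06 row is assumed to have empty session_id and participant_dicom_dir cells.

```python
import csv
import io

# Excerpt of the example recruitment file (tab-separated).
# Assumption: the M06 visit has no session_id or participant_dicom_dir.
RECRUITMENT_TSV = (
    "participant_id\tvisit_id\tsession_id\trecruitment_id\tclinical_id\tparticipant_dicom_dir\n"
    "01\tBL\tBL\tMNI-MR.POPPY\tMNI_POPPY\tPOPPY-MRI-MAY-2024/ses-BL\n"
    "01\tM06\t\tMNI-MR.POPPY\tMNI_POPPY\t\n"
    "02\tBL\tBL\tMNI-MS.SESAME\tMNI_SESAME\tSESAME-MRI-JUNE-2024/ses-BL\n"
)


def build_id_map(tsv_file, source_column):
    """Map each value of source_column to its participant_id.

    Raises ValueError if one source ID maps to two participant IDs.
    """
    id_map = {}
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        source_id = row[source_column]
        participant_id = row["participant_id"]
        # setdefault stores the first mapping and returns the stored value.
        if id_map.setdefault(source_id, participant_id) != participant_id:
            raise ValueError(f"Conflicting participant_id for {source_id!r}")
    return id_map


clinical_to_participant = build_id_map(io.StringIO(RECRUITMENT_TSV), "clinical_id")
print(clinical_to_participant)  # {'MNI_POPPY': '01', 'MNI_SESAME': '02'}
```

To read a file on disk instead, the same function can be given an open file handle, e.g. one opened with open(path, newline="").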

In the Nipoppy protocol

The Nipoppy protocol considers recruitment.tsv an optional helper file that is part of the “data capture” stage. It is typically created before the manifest file, which represents the curated view of the recruited participants and their IDs.