The recruitment file¶

Often, the participant IDs generated during the data capture/collection stage are not ideal for subsequent data wrangling tasks: they may contain personal information (e.g. names) or special characters (e.g. spaces, dashes, or punctuation) not permitted by data standards such as BIDS. If imaging and clinical data collection are handled by separate people, then such studies may also end up with multiple sets of participant IDs, which requires remapping at a later stage.

The Nipoppy recruitment file is an optional TSV file that aims to solve these issues at the earliest (i.e. before the data curation) stage. It maps all sets of participant-related IDs generated during the data collection stage to the participant_id column that serves as the primary ID in all subsequent tasks.

The participant_id needs to comply with the BIDS specification, which only allows alphanumeric characters (i.e. letters and numbers), and it should not include the sub- prefix. If any of the sets of participant IDs generated within the study meets these conditions, then that can become the designated participant_id column. Otherwise, users need to create a new set that is BIDS-compliant.

Tip

Exclude special characters (e.g. -, _), spaces and punctuation
Use zero-padding for numerical labels to ensure correct alphabetical sorting (e.g., 01 and 10 instead of 1 and 10).

This TSV file can also include a participant_dicom_dir column listing the relative DICOM directory path relative to <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/pre_reorg to handle cases where DICOM directory names and participant IDs are not the same (see the DICOM reorganization guide for details). Note that the participant_dicom_dir mapping can also be specified in a separate DICOM_DIR_MAP_FILE file in cases where this recruitment file is not needed.

For sanity checks, this file can also be used to list the larger cohort of originally recruited participants. This can help avoid possible confusion created by drop-outs or exclusions during the subsequent study stages (i.e. curation, processing, extraction, and analysis).

The recommended location for the recruitment.tsv file is the <NIPOPPY_PROJECT_ROOT>/sourcedata directory.

Here is an example recruitment file:

participant_id	visit_id	session_id	recruitment_id	clinical_id	participant_dicom_dir
01	BL	BL	MNI-MR.POPPY	MNI_POPPY	POPPY-MRI-MAY-2024/ses-BL
01	M06		MNI-MR.POPPY	MNI_POPPY
01	M12	M12	MNI-MR.POPPY	MNI_POPPY	POPPY-MRI-MAY-2025/ses-M12
02	BL	BL	MNI-MS.SESAME	MNI_SESAME	SESAME-MRI-JUNE-2024/ses-BL
02	M06		MNI-MS.SESAME	MNI_SESAME
02	M12	M12	MNI-MS.SESAME	MNI_SESAME	SESAME-MRI-JULY-2025/ses-M12

In the Nipoppy protocol¶

The Nipoppy protocol considers the recruitment.tsv as an optional helper file part of the “data capture” stage. It is typically created before the manifest file, which represents the curated view of the recruited participants and their IDs.