The recruitment file
Participant IDs generated during the data capture/collection stage are often not ideal for subsequent data wrangling tasks: they may contain personal information (e.g. names) or special characters (e.g. spaces, dashes, or punctuation) that data standards such as BIDS do not permit. If imaging and clinical data collection are handled by different people, a study may also end up with multiple sets of participant IDs, which then have to be remapped at a later stage.
The Nipoppy recruitment file is an optional TSV file that aims to solve these issues at the earliest stage, i.e. before data curation. It maps every set of participant-related IDs generated during data collection to the participant_id column, which serves as the primary ID in all subsequent tasks.
The participant_id needs to comply with the BIDS specification, which only allows alphanumeric characters (i.e. letters and numbers), and it should not include the sub- prefix. If any set of participant IDs generated within the study meets these conditions, that set can become the designated participant_id column. Otherwise, users need to create a new set that is BIDS-compliant.
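The two conditions above can be checked programmatically before an existing set of IDs is adopted. The following is a minimal sketch, not part of Nipoppy itself: the function name is_valid_participant_id is hypothetical, and the check only encodes the rules stated here (letters and digits only, no sub- prefix).

```python
import re

# Letters and digits only, at least one character.
_ALPHANUMERIC = re.compile(r"[A-Za-z0-9]+")


def is_valid_participant_id(participant_id: str) -> bool:
    """Return True if the ID is purely alphanumeric.

    Hypothetical helper for illustration. An ID that starts with "sub-"
    is rejected by the same pattern, because the dash is not alphanumeric.
    """
    return _ALPHANUMERIC.fullmatch(participant_id) is not None


# IDs taken from the example on this page, plus a prefixed variant.
for candidate in ["01", "sub-01", "MNI_POPPY", "MNI-MR.POPPY"]:
    print(candidate, is_valid_participant_id(candidate))
```

In this sketch, neither the recruitment_id nor the clinical_id values from the example would qualify as a participant_id, which is why a separate numeric ID is used there.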
Tip
- Exclude special characters (e.g. -, _), spaces, and punctuation.
- Use zero-padding for numerical labels to ensure correct alphabetical sorting (e.g. 01 and 10 instead of 1 and 10).
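The effect of zero-padding on sorting can be seen with a short illustration. This is only a sketch with made-up numeric labels; the padding width of 2 is an assumption and should be chosen to fit the largest expected label.

```python
# Made-up numeric labels for illustration.
labels = [1, 2, 10]

# Sorting the labels as plain strings puts "10" before "2".
unpadded = sorted(str(n) for n in labels)

# Zero-padding to a fixed width makes string order match numeric order.
padded = sorted(str(n).zfill(2) for n in labels)

print(unpadded)  # ['1', '10', '2']
print(padded)    # ['01', '02', '10']
```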
This TSV file can also include a participant_dicom_dir column listing each DICOM directory path relative to <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/pre_reorg, to handle cases where DICOM directory names and participant IDs are not the same (see the DICOM reorganization guide for details). Note that the participant_dicom_dir mapping can also be specified in a separate DICOM_DIR_MAP_FILE file in cases where this recruitment file is not needed.
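To make the relative-path convention concrete, the sketch below joins a participant_dicom_dir value onto the pre_reorg directory. The function resolve_dicom_dir and the project root /data/my_study are hypothetical and used only for illustration; this is not how Nipoppy necessarily resolves the path internally.

```python
from pathlib import Path


def resolve_dicom_dir(project_root: str, participant_dicom_dir: str) -> Path:
    """Join a participant_dicom_dir value onto the pre_reorg directory.

    Hypothetical helper: it only illustrates that the column holds a path
    relative to <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/pre_reorg.
    """
    pre_reorg = Path(project_root) / "sourcedata" / "imaging" / "pre_reorg"
    return pre_reorg / participant_dicom_dir


# "/data/my_study" is a made-up project root.
dicom_path = resolve_dicom_dir("/data/my_study", "POPPY-MRI-MAY-2024/ses-BL")
print(dicom_path.as_posix())
```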
For sanity checks, this file can also be used to list the larger cohort of originally recruited participants. This can help avoid possible confusion created by drop-outs or exclusions during the subsequent study stages (i.e. curation, processing, extraction, and analysis).
The recommended location for the recruitment.tsv file is the <NIPOPPY_PROJECT_ROOT>/sourcedata directory.
Here is an example recruitment file:
| participant_id | visit_id | session_id | recruitment_id | clinical_id | participant_dicom_dir |
|---|---|---|---|---|---|
| 01 | BL | BL | MNI-MR.POPPY | MNI_POPPY | POPPY-MRI-MAY-2024/ses-BL |
| 01 | M06 | | MNI-MR.POPPY | MNI_POPPY | |
| 01 | M12 | M12 | MNI-MR.POPPY | MNI_POPPY | POPPY-MRI-MAY-2025/ses-M12 |
| 02 | BL | BL | MNI-MS.SESAME | MNI_SESAME | SESAME-MRI-JUNE-2024/ses-BL |
| 02 | M06 | | MNI-MS.SESAME | MNI_SESAME | |
| 02 | M12 | M12 | MNI-MS.SESAME | MNI_SESAME | SESAME-MRI-JULY-2025/ses-M12 |
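As an illustration of how such a file might be used for remapping, the sketch below parses a few rows of the example with Python's standard csv module and builds a lookup from one of the original ID sets to participant_id. The function build_id_map is hypothetical and not part of Nipoppy; the TSV content is embedded as a string so that the example is self-contained.

```python
import csv
import io

# A subset of the example rows above, written as tab-separated text.
RECRUITMENT_TSV = (
    "participant_id\tvisit_id\tsession_id\trecruitment_id\tclinical_id\tparticipant_dicom_dir\n"
    "01\tBL\tBL\tMNI-MR.POPPY\tMNI_POPPY\tPOPPY-MRI-MAY-2024/ses-BL\n"
    "01\tM06\t\tMNI-MR.POPPY\tMNI_POPPY\t\n"
    "02\tBL\tBL\tMNI-MS.SESAME\tMNI_SESAME\tSESAME-MRI-JUNE-2024/ses-BL\n"
)


def build_id_map(tsv_text: str, source_column: str) -> dict:
    """Map each ID in source_column to its participant_id.

    Hypothetical helper. Raises ValueError if one source ID is associated
    with more than one participant_id.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    id_map = {}
    for row in reader:
        source_id = row[source_column]
        participant_id = row["participant_id"]
        # setdefault stores the first mapping and returns the stored value,
        # so a mismatch means the same source ID points to two participants.
        if id_map.setdefault(source_id, participant_id) != participant_id:
            raise ValueError(f"{source_id!r} maps to more than one participant_id")
    return id_map


recruitment_to_participant = build_id_map(RECRUITMENT_TSV, "recruitment_id")
print(recruitment_to_participant)
```

The same function can be called with "clinical_id" as the source column, so both original ID sets resolve to the same participant_id values.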
In the Nipoppy protocol
The Nipoppy protocol considers recruitment.tsv an optional helper file that is part of the “data capture” stage. It is typically created before the manifest file, which represents the curated view of the recruited participants and their IDs.