Batch Processing¶
Run csttool over a cohort of subjects with a single command. Every option is documented in the batch CLI reference; this page walks through the typical flow end to end.
Two input modes¶
csttool batch accepts either a BIDS dataset or a JSON manifest.
Step 1 — Dry-run first¶
Long batches deserve a sanity check. --dry-run lists planned subjects and inputs without touching disk:
Use --validate-only to additionally exit non-zero if any manifest entry or BIDS layout fails validation.
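When the batch is launched from a script, the exit status of --validate-only can serve as a gate before the real run. The following is a minimal sketch of that idea, not part of csttool: `validation_passes` and `input_args` are hypothetical names, and `input_args` stands in for whatever input options your invocation needs, since they are not listed here. Whether --validate-only should be combined with --dry-run is also left to the batch reference; the sketch passes it alone.

```python
import subprocess


def validation_passes(input_args, runner=subprocess.run):
    """Return True if `csttool batch <input_args> --validate-only` exits with 0.

    `input_args` is a placeholder list of input options (assumption: supplied
    by the caller; none are hard-coded here). `runner` is injectable so the
    gate can be exercised without csttool installed.
    """
    argv = ["csttool", "batch", *input_args, "--validate-only"]
    completed = runner(argv)
    # A non-zero status means a manifest entry or BIDS layout failed validation.
    return completed.returncode == 0
```

A wrapper script could call `validation_passes(...)` and stop early when it returns False, so a long run never starts on a cohort that fails validation.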
Step 2 — Launch¶
When you are happy with the plan, drop --dry-run:
Subjects are processed sequentially. Each subject prints the same CHECK → IMPORT → PREPROCESS → TRACK → EXTRACT → METRICS banners that you see in single-subject runs.
Use --timeout-minutes 120 so a stuck subject does not block the entire cohort.
Step 3 — Aggregate metrics¶
Each subject writes its summary row to a summary.csv file under its metrics directory. Concatenate these files into one cohort table:
# Take the header from the first file, then append the data rows of every file.
first=$(find ./derivatives -name 'summary.csv' | sort | head -n 1)
{
  head -n 1 "$first"
  find ./derivatives -name 'summary.csv' -exec tail -n +2 {} \;
} > ./derivatives/cohort_metrics.csv
For larger cohorts, prefer Python:
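import glob
import pandas as pd
# Collect every per-subject summary.csv; sort the paths for a stable row order.
paths = sorted(glob.glob("derivatives/**/summary.csv", recursive=True))
df = pd.concat((pd.read_csv(p) for p in paths), ignore_index=True)
df.to_csv("derivatives/cohort_metrics.csv", index=False)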
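Both approaches above assume that every summary.csv has the same header. If that is in doubt, for example after a csttool upgrade partway through a cohort, a stricter merge can refuse mismatched files. The following standard-library sketch shows one way to do that; `merge_summaries` is a hypothetical helper, not part of csttool, and it makes no assumption about which columns summary.csv contains.

```python
import csv
from pathlib import Path


def merge_summaries(root, out_path):
    """Concatenate every summary.csv under `root`; return the number of data rows.

    Raises ValueError if two files disagree on their header row.
    """
    paths = sorted(Path(root).rglob("summary.csv"))
    header, header_source, rows = None, None, []
    for path in paths:
        with path.open(newline="") as handle:
            reader = csv.reader(handle)
            file_header = next(reader, None)
            if file_header is None:
                continue  # empty file: nothing to merge
            if header is None:
                header, header_source = file_header, path
            elif file_header != header:
                raise ValueError(f"{path} and {header_source} have different headers")
            # Skip blank lines, which csv.reader yields as empty lists.
            rows.extend(row for row in reader if row)
    if header is None:
        raise FileNotFoundError(f"no non-empty summary.csv found under {root}")
    with Path(out_path).open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

Calling `merge_summaries("derivatives", "derivatives/cohort_metrics.csv")` would write the same file as the snippets above, but fail loudly instead of silently misaligning columns.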
Resuming and re-running¶
batch skips subjects whose output directories already exist. To force re-processing:
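Before resuming or forcing a re-run, it can help to see which subjects batch would skip. The sketch below mirrors the skip rule described above by checking for existing output directories. It is an illustration only: `split_by_existing_output` is a hypothetical helper, and the layout `<output_root>/<subject_id>/` is an assumption that may not match where your csttool outputs actually live.

```python
from pathlib import Path


def split_by_existing_output(subject_ids, output_root):
    """Split subject IDs into (would_skip, would_run) by output-directory presence."""
    root = Path(output_root)
    would_skip, would_run = [], []
    for subject_id in subject_ids:
        # Assumed layout (hypothetical): one directory per subject under output_root.
        if (root / subject_id).is_dir():
            would_skip.append(subject_id)
        else:
            would_run.append(subject_id)
    return would_skip, would_run
```

Note that an existing directory does not prove that a subject finished successfully; a subject interrupted mid-run may also leave one behind, so the `would_skip` list is worth reviewing.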
Parallelising across machines¶
batch runs subjects sequentially within a single process. For real parallelism, partition the cohort with --include and launch one csttool batch per host or scheduler slot. On SLURM:
with run_subjects.sh selecting its slice of subject IDs from $SLURM_ARRAY_TASK_ID.
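The slice selection itself can be sketched in a few lines. The example below is one possible approach, not the contents of run_subjects.sh: it deals subjects round-robin across array tasks using the SLURM_ARRAY_TASK_ID and SLURM_ARRAY_TASK_COUNT environment variables. It assumes the array was submitted with indices 0 to N-1, and the function names are hypothetical. How the resulting IDs are passed to --include depends on the syntax given in the batch reference.

```python
import os


def subjects_for_task(subject_ids, task_id, task_count):
    """Return the round-robin slice of subject IDs owned by one array task."""
    if not 0 <= task_id < task_count:
        raise ValueError("task_id must be in range(task_count)")
    # Sort first so every task sees the same ordering and slices never overlap.
    return sorted(subject_ids)[task_id::task_count]


def subjects_from_slurm_env(subject_ids, environ=os.environ):
    """Read the task index and count from SLURM's job-array environment variables.

    Assumes array indices start at 0 and are contiguous.
    """
    task_id = int(environ["SLURM_ARRAY_TASK_ID"])
    task_count = int(environ["SLURM_ARRAY_TASK_COUNT"])
    return subjects_for_task(subject_ids, task_id, task_count)
```

Round-robin slicing keeps the slices within one subject of each other in size, and because each task derives its slice from the same sorted list, no coordination between hosts is needed.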
Related¶
- Multiple subjects how-to — task-recipe focus.
- batch CLI reference.
- run CLI reference: the single-subject pipeline that batch invokes per subject.