feat: Snakemake workflow — per-tile GPU SLURM jobs #24

Merged: lguerard merged 3 commits into main from feat/snakemake-workflow on Jun 24, 2026

@lguerard (Contributor)

A SLURM-ready Snakemake pipeline that spreads the Cellpose step across many GPUs — one GPU job per tile.

convert ──▶ prepare (checkpoint) ──▶ segment {tile}  ──▶ merge
                                     one GPU job / tile
  • convert — any input (.ims/.czi/.lif/.nd2/OME-TIFF/.zarr) → pyramidal OME-ZARR (auto chunks; optional shard)
  • prepare (checkpoint) — plan tiles, skip empties, create the empty stage store, list tiles
  • segment {tile} — read tile+halo → Cellpose → trim → write a disjoint chunk of the stage (GPU job; scattered across the cluster)
  • merge — zarr-native boundary stitch + sequential relabel (default on) → written into the image as a calibrated labels/<name>/ pyramid
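
To make the prepare step concrete, here is a minimal 1-D sketch of tile planning with a halo and empty-tile skipping. It is not the project's implementation: `Tile`, `plan_tiles`, the halo width, and the "any non-zero value" emptiness test are all assumptions chosen for illustration.

```python
from typing import Iterator, List, NamedTuple, Tuple


class Tile(NamedTuple):
    """A 1-D tile: the core it owns and the padded window it reads."""
    core: Tuple[int, int]  # [start, stop) that this tile writes to the stage
    read: Tuple[int, int]  # core grown by the halo, clipped to the image


def plan_tiles(length: int, tile: int, halo: int) -> Iterator[Tile]:
    """Yield tiles whose cores partition [0, length) without overlap."""
    for start in range(0, length, tile):
        stop = min(start + tile, length)
        yield Tile(
            core=(start, stop),
            read=(max(start - halo, 0), min(stop + halo, length)),
        )


def is_empty(signal: List[int], t: Tile) -> bool:
    """Hypothetical emptiness test: the core contains no non-zero value."""
    return not any(signal[t.core[0]:t.core[1]])


signal = [0, 0, 0, 0, 5, 6, 0, 0, 0, 7]
tiles = list(plan_tiles(length=len(signal), tile=4, halo=1))
# Only non-empty tiles would become GPU jobs.
jobs = [t for t in tiles if not is_empty(signal, t)]
```

Because the cores never overlap, each job can write its result without coordinating with the others; the halo only affects what a job reads.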

New [workflow] extra installs snakemake + the SLURM executor plugin. GPU request lives in profile/slurm/config.yaml (set-resources: segment: --gres=gpu:1); raise jobs: to use more GPUs at once.

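Addresses the 'days on one GPU' problem: the segment step scales roughly linearly with the number of GPUs (N GPUs ≈ up to N× faster for that step), while convert, prepare, and merge still run as single jobs.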
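
A back-of-the-envelope model of that speedup, as a sketch only: it assumes every tile costs the same, tiles run in waves of one job per GPU, and queue wait is ignored. The function name and all numbers are made up for illustration, not measured.

```python
import math


def estimated_wall_time(n_tiles, n_gpus, tile_seconds, serial_seconds=0.0):
    """Rough wall-clock estimate: serial stages plus waves of tile jobs."""
    waves = math.ceil(n_tiles / n_gpus)  # each wave runs up to n_gpus tiles
    return serial_seconds + waves * tile_seconds


# Illustrative numbers: 100 tiles, 60 s per tile, 600 s of serial work
# (convert + prepare + merge).
one_gpu = estimated_wall_time(100, 1, 60, serial_seconds=600)
eight_gpus = estimated_wall_time(100, 8, 60, serial_seconds=600)
speedup = one_gpu / eight_gpus
```

Under these assumptions the speedup stays below the GPU count, because the serial stages and the last partially filled wave do not shrink.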

Validated: tiling+merge logic reproduces tile_process (cross-boundary object → 1 label); snakemake -n builds a valid DAG; ruff + markdownlint clean. (Inspired by sopa's patch-scatter pattern.)
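
The "cross-boundary object → 1 label" check can be illustrated with a toy boundary merge. This sketch assumes that labels are already unique across tiles and that two objects are merged when they touch directly across the seam; the project's actual merge criterion is not stated here, and `merge_seam` is a hypothetical name.

```python
def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def merge_seam(labels, seam_col):
    """Unify labels touching across the seam, then relabel as 1..K."""
    parent = {v: v for row in labels for v in row if v}
    for row in labels:
        a, b = row[seam_col - 1], row[seam_col]
        if a and b:
            ra, rb = find(parent, a), find(parent, b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)
    new_id = {}  # root label -> sequential id, in order of first appearance
    out = []
    for row in labels:
        out_row = []
        for v in row:
            if v == 0:
                out_row.append(0)
            else:
                root = find(parent, v)
                out_row.append(new_id.setdefault(root, len(new_id) + 1))
        out.append(out_row)
    return out


# Left tile owns columns 0-1, right tile owns columns 2-3.
stage = [
    [1, 1, 3, 0],
    [0, 1, 3, 0],
    [0, 0, 0, 4],
    [2, 0, 0, 4],
]
merged = merge_seam(stage, seam_col=2)
```

Labels 1 and 3 straddle the seam and collapse into one object; labels 2 and 4 never touch across it and stay separate, but are renumbered so the final ids are contiguous.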

🤖 Generated with Claude Code

lguerard and others added 3 commits June 24, 2026 08:57
Add workflow/ (Snakefile + scripts + SLURM profile + config) that runs
the full pipeline — convert → prepare (checkpoint) → segment → merge —
and spreads the expensive Cellpose step across many GPUs by submitting
one GPU SLURM job per (non-empty) tile. Each job writes a disjoint
chunk of a shared stage store; a final CPU job stitches labels across
tile boundaries and writes them into the image as a calibrated,
multi-scale labels/ group.

- convert: any input → pyramidal OME-ZARR (auto chunks, optional shard)
- prepare: plan tiles, skip empties, create the empty stage store
- segment {tile}: read tile+halo → Cellpose → trim → write to stage (GPU)
- merge: zarr-native boundary merge + sequential relabel (default on)
- new [workflow] extra: snakemake + slurm executor plugin

Inspired by the sopa workflow's patch-scatter pattern.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Address the 'why scripts if the API was fine' point: tile_process does
stage+merge in one process, so there was no public way to run a single
tile for distributed scatter. Add that to the public API and make the
workflow thin:

- new patchworks public API: spatial_tiles, create_stage, stage_tile
  (run fn on one tile → write a disjoint chunk of a shared stage), with
  tests
- workflow scripts now just glue config → public API (cellpose, or a
  'threshold' method for no-GPU testing)
- split the workflow into rules/*.smk (convert / segment / merge /
  common) included from the Snakefile, like sopa
- verified the whole pipeline runs end-to-end (snakemake, toy input):
  convert → prepare → segment ×N → merge, cross-boundary object stitched
  to a single label written into image.zarr/labels/
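
The idea behind running one tile and writing a disjoint chunk can be sketched in plain Python. This is not the patchworks API: `stage_tile_sketch`, its parameters, and the per-tile label offset are assumptions, and a simple threshold stands in for Cellpose in the spirit of the no-GPU 'threshold' method.

```python
def threshold_segment(window, cutoff):
    """Stand-in for Cellpose: mark every value above cutoff as object 1."""
    return [1 if v > cutoff else 0 for v in window]


def stage_tile_sketch(image, stage, core, halo, fn, label_offset):
    """Run fn on core+halo, trim the halo, write only the core to stage."""
    start, stop = core
    lo, hi = max(start - halo, 0), min(stop + halo, len(image))
    result = fn(image[lo:hi])
    trimmed = result[start - lo:start - lo + (stop - start)]
    # Assumption: offset non-zero labels so ids from different tiles
    # cannot collide before the merge step.
    stage[start:stop] = [v + label_offset if v else 0 for v in trimmed]


image = [0, 9, 9, 0, 0, 9, 9, 9]
stage = [0] * len(image)
for i, core in enumerate([(0, 4), (4, 8)]):
    stage_tile_sketch(
        image, stage, core, halo=1,
        fn=lambda w: threshold_segment(w, cutoff=5),
        label_offset=i * 1000,
    )
```

Each call touches only its own core range of `stage`, which is why the tile jobs could run in any order, or concurrently, without locking.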

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Add a thorough 'Cluster workflow (Snakemake + SLURM)' guide page
(install → configure every field → dry-run → local vs SLURM → monitor →
outputs → troubleshooting), wire it into the nav, and link it from the
workflow README.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@lguerard lguerard merged commit c1fb2a9 into main Jun 24, 2026
2 checks passed
@lguerard lguerard deleted the feat/snakemake-workflow branch June 24, 2026 07:23