A reusable GitHub Actions workflow system for Python projects with integrated agent automation (Codex keepalive, autofix, CI orchestration).
For a narrative of how the repo evolved through five development phases (bootstrap → v1.1.x → Feb consolidation → production quiet → re-engagement) and where it stands today, see docs/HISTORY.md.
✅ Production Ready - Actively used and maintained.
13 repos are currently registered as first-party consumers (synced via .github/workflows/maint-68-sync-consumer-repos.yml):
- Travel-Plan-Permission
- Template
- trip-planner
- Manager-Database
- Portable-Alpha-Extension-Model
- Trend_Model_Project
- Collab-Admin
- Counter_Risk
- Pension-Data
- Inv-Man-Intake
- Ready
- learning-management-system
- Fine-Art-Archive
The REGISTERED_CONSUMER_REPOS env var in .github/workflows/maint-68-sync-consumer-repos.yml is the authoritative list. The Workflows-Integration-Tests harness validates the consumer surface separately.
The agents:auto-pilot label triggers a fully automated issue-to-merge pipeline.
This is the primary automation system for both the Workflows repo and all consumer repos.
Issue created ──▶ Format ──▶ Optimize ──▶ Apply ──▶ Capability Check
                                                           │
                                                           ▼
                ◀── Verify ◀── Merge ◀── Keepalive ◀── Create PR
                      │
                ┌─────┴──────┐
                │ PASS       │ CONCERNS/FAIL
                ▼            ▼
              Done      Label-triggered follow-up
| Stage | What Happens |
|---|---|
| Format | issue_formatter.py + task_decomposer.py restructure the issue into Why / Scope / Tasks / Acceptance Criteria |
| Optimize | issue_optimizer.py analyzes the repo codebase and adds file paths, patterns, and conflict warnings |
| Apply | Enriches issue tasks with optimizer suggestions |
| Capability Check | Validates issue suitability for automation; assigns the registry-backed agent:<name> label (default agent:codex, overridable via runner:<name>) |
| Create PR | Creates codex/issue-* branch with issue context in PR body |
| Keepalive | Event-driven loop (Gate completion → task appendix → registry-backed agent dispatch → push → repeat) |
| Verify | LLM-based evaluation of PR against acceptance criteria (PASS / CONCERNS / FAIL) |
| Follow-up | On CONCERNS/FAIL, maintainers or automation apply verify:create-issue for an issue-only follow-up or verify:create-new-pr for a bootstrapped follow-up PR; the new-PR path enforces the chain-depth cap |
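The Capability Check label rule in the table can be sketched as below. This is an illustration only, not the repository's code: `resolve_agent_label`, the set-based registry, the `claude` registry entry, and the choice to raise on an unregistered runner are all assumptions made for the example. Only the `agent:<name>` / `runner:<name>` label shapes and the `agent:codex` default come from the table.

```python
DEFAULT_AGENT = "codex"  # default named in the stage table


def resolve_agent_label(issue_labels, registry):
    """Return the agent:<name> label for an issue (hypothetical helper).

    A runner:<name> label overrides the default when <name> is registered.
    Raising on an unknown runner is an assumption, not documented behavior.
    """
    for label in issue_labels:
        if label.startswith("runner:"):
            name = label.split(":", 1)[1]
            if name not in registry:
                raise ValueError(f"runner '{name}' is not in the agent registry")
            return f"agent:{name}"
    return f"agent:{DEFAULT_AGENT}"


# "claude" is a made-up registry entry used only for illustration.
registry = {"codex", "claude"}
default_label = resolve_agent_label(["agents:auto-pilot"], registry)
override_label = resolve_agent_label(["agents:auto-pilot", "runner:claude"], registry)
```

With these inputs, an issue without a runner label resolves to the default agent, and the runner label selects the registered alternative.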
Auto-pilot uses workflow_dispatch with a force_step input to chain stages sequentially.
This eliminates label-trigger race conditions that plagued earlier designs.
Each stage completes, then dispatches auto-pilot again with the next step name.
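The chaining idea can be sketched in a few lines of Python. The stage identifiers below are guesses derived from the stage table (the real force_step values may differ), and `next_step` and `run_pipeline` are hypothetical names. In the real system each hop is a separate workflow_dispatch run; the loop here only simulates that ordering.

```python
# Assumed step names, derived from the stage table; real values may differ.
STAGES = [
    "format",
    "optimize",
    "apply",
    "capability-check",
    "create-pr",
    "keepalive",
    "merge",
    "verify",
]


def next_step(current):
    """Return the stage that follows `current`, or None after the last one."""
    idx = STAGES.index(current)
    return STAGES[idx + 1] if idx + 1 < len(STAGES) else None


def run_pipeline(start, run_stage):
    """Simulate sequential chaining: run a stage, then hand off to the next.

    Each loop iteration stands in for one dispatch with force_step set.
    """
    executed = []
    step = start
    while step is not None:
        run_stage(step)
        executed.append(step)
        step = next_step(step)
    return executed


order = run_pipeline("apply", lambda step: None)
```

Because only one stage is in flight at a time and each stage names its successor explicitly, there is no window in which two label events can race each other.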
- Event-driven keepalive: Gate workflow_run completion triggers iteration, not polling
- Task appendix injection: Agent prompts include explicit, structured task context from the issue
- Token-aware retry: withRetry + token_load_balancer.js distributes API calls across PATs and GitHub App tokens
- Verification pipeline: Dual-model verify:compare catches quality gaps post-merge; see metrics below
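As a rough illustration of the token-aware retry item, the sketch below combines exponential backoff with token rotation. It is not a port of withRetry or token_load_balancer.js (both are JavaScript). The `with_retry` signature, the `RateLimited` exception, and the round-robin rotation are assumptions; the real balancer is described as choosing an optimal token rather than cycling.

```python
import itertools
import time


class RateLimited(Exception):
    """Hypothetical error raised by `call` when a token hits its rate limit."""


def with_retry(call, tokens, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `call(token)`, rotating tokens and backing off on rate limits.

    Round-robin rotation is a simplification chosen for this sketch.
    """
    if not tokens:
        raise ValueError("at least one token is required")
    token_cycle = itertools.cycle(tokens)
    for attempt in range(max_attempts):
        token = next(token_cycle)
        try:
            return call(token)
        except RateLimited:
            if attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** attempt)


# Usage with a fake API call that is rate limited on its first two attempts.
calls, delays = [], []


def fake_call(token):
    calls.append(token)
    if len(calls) < 3:
        raise RateLimited()
    return f"ok:{token}"


result = with_retry(fake_call, ["pat-a", "pat-b"], sleep=delays.append)
```

Passing `sleep` as a parameter keeps the example testable without real waiting.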
After PR merge, applying a verify:* label (typically verify:evaluate via auto-pilot, or verify:compare for dual-model mode) triggers the verifier. In compare mode, two LLM providers (gpt-5.4 + claude-sonnet-4-6) independently evaluate the diff against acceptance criteria with unanimous PASS required. On CONCERNS or FAIL, maintainers or automation can apply the verify:create-new-pr label to trigger a 4-round LLM pipeline that generates a follow-up issue (analyze -> tasks -> acceptance criteria -> format).
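The unanimous-PASS rule for compare mode can be sketched as follows. Only the requirement that every provider return PASS comes from the description above. The `combine_verdicts` name and the way mixed non-PASS results collapse (any FAIL wins, otherwise CONCERNS) are assumptions made for illustration.

```python
VALID_VERDICTS = {"PASS", "CONCERNS", "FAIL"}


def combine_verdicts(verdicts):
    """Combine per-provider verdicts into one overall verdict.

    `verdicts` maps a provider name to "PASS", "CONCERNS", or "FAIL".
    PASS requires unanimity; the FAIL-over-CONCERNS ordering is assumed.
    """
    if not verdicts:
        raise ValueError("at least one verdict is required")
    values = set(verdicts.values())
    unknown = values - VALID_VERDICTS
    if unknown:
        raise ValueError(f"unknown verdicts: {sorted(unknown)}")
    if values == {"PASS"}:
        return "PASS"
    if "FAIL" in values:
        return "FAIL"
    return "CONCERNS"


unanimous = combine_verdicts({"gpt-5.4": "PASS", "claude-sonnet-4-6": "PASS"})
split = combine_verdicts({"gpt-5.4": "PASS", "claude-sonnet-4-6": "CONCERNS"})
failed = combine_verdicts({"gpt-5.4": "FAIL", "claude-sonnet-4-6": "CONCERNS"})
```

A single dissenting provider is enough to send the PR down the follow-up path.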
Live verifier and pipeline metrics are surfaced through the weekly summary tracker — see issue #1796 (durable auto-bot tracker, posted Mondays at 06:00 UTC) and the LangSmith dashboard wired by maint-80-langsmith-metrics-dashboard.yml. The original Feb 2026 baseline (40-PR sample, first-fix 35%, avg chain depth 2.7) is preserved at docs/analysis/verify-compare-40pr-evaluation-feb-2026.md for historical comparison.
- CI: reusable-10-ci-python.yml, reusable-11-ci-node.yml, reusable-12-ci-docker.yml
- Agent automation: reusable-16-agents.yml, reusable-codex-run.yml, reusable-20-pr-meta.yml
- Orchestration: reusable-70-orchestrator-init.yml, reusable-70-orchestrator-main.yml
- Gate: pr-00-gate.yml (single PR-required check)
- Maintenance & health: maint-*, health-*
- Agents: agents-* (auto-pilot, verifier, keepalive, issue-intake, pr-meta)
- autofix/ - Formatting and hygiene automation
- signature-verify/ - Signature/manifest verification helpers
- codex-bootstrap-lite/ - Lightweight bootstrap utilities for agent runs
- setup-api-client/ - Token export and API client initialization
Validation (scripts/):
- check_branch.sh - Comprehensive branch validation
- validate_yaml.py - YAML syntax checking
- sync_tool_versions.py - Tool version management
CI Support (scripts/):
- ci_cosmetic_repair.py - Automated pytest repairs
- ci_coverage_delta.py - Coverage delta calculation
- ledger_validate.py - Ledger validation
Agent Pipeline (scripts/langchain/):
- issue_formatter.py - Issue restructuring for agent consumption
- task_decomposer.py - Multi-action task splitting
- issue_optimizer.py - Repo-aware issue enrichment
Rate Limiting (.github/scripts/):
- github-api-with-retry.js - Exponential backoff + token rotation
- token_load_balancer.js - Multi-token registry and optimal selection
Start with:
- docs/GLOSSARY.md - plain-language definitions of the system's key terms
- docs/QUICK_REFERENCE.md - two-page operator cheat-sheet (pause/resume, keepalive, labels)
- docs/USAGE.md
- docs/INTEGRATION_GUIDE.md
- docs/ci-workflow.md
- templates/consumer-repo/docs/SETUP_CHECKLIST.md
- docs/guides/ADD_NEW_AGENT.md - Checklist for onboarding new automation agents
- docs/analysis/verify-compare-40pr-evaluation-feb-2026.md - Verify:compare pipeline evaluation (Feb 2026)
Reference workflows in your repository:
# .github/workflows/ci.yaml
name: CI
on: [push, pull_request]
jobs:
  python-standard:
    # Current first-party consumer default
    uses: stranske/Workflows/.github/workflows/reusable-10-ci-python.yml@main
    with:
      python-version: "3.12"
  python-pinned:
    # Pin to an exact commit for reproducible builds
    # (abc123def4567890 is a placeholder; substitute the full commit SHA)
    uses: stranske/Workflows/.github/workflows/reusable-10-ci-python.yml@abc123def4567890
    with:
      python-version: "3.12"

- @main - current first-party consumer default. Consumer templates and first-party repos track the latest reusable workflow behavior directly.
- Pinned commit SHA - use this when you need strict reproducibility or a controlled rollout.
- Alternative refs - use release tags or feature branches only when you are intentionally testing or managing a separate distribution strategy.
- Clone the repository
- Open in VS Code (devcontainer recommended)
- Install pre-commit hooks:
pip install pre-commit
pre-commit install
# Fast local validation (syntax, workflows, lint, typecheck, keepalive JS tests)
./scripts/dev_check.sh
# Comprehensive validation
./scripts/check_branch.sh

.github/
  workflows/   # Workflows (reusable + first-party orchestration)
  actions/     # Composite actions
  scripts/     # JS helpers used by workflows (includes tests)
docs/          # Documentation
scripts/       # Standalone tooling and validation scripts
templates/     # Consumer templates and examples
tests/         # Python tests
- Fork the repository
- Create a feature branch
- Make your changes
- Run validation:
./scripts/check_branch.sh
- Submit a pull request
Pre-commit hooks will automatically:
- Format Python with Black
- Lint with Ruff
- Validate YAML syntax
- Run fast validation checks
MIT License - See LICENSE for details.
Links:
- Documentation: docs/README.md
- Glossary: docs/GLOSSARY.md
- Quick reference: docs/QUICK_REFERENCE.md
- Usage: docs/USAGE.md
- Integration guide: docs/INTEGRATION_GUIDE.md
- CI wiring: docs/ci-workflow.md
- Keepalive setup: templates/consumer-repo/docs/SETUP_CHECKLIST.md
- Workflow guide: docs/WORKFLOW_GUIDE.md
- Agent policy: docs/AGENTS_POLICY.md
- Compatibility: docs/COMPATIBILITY.md
- Contributing: docs/CONTRIBUTING.md