Upgrade eval workflow: pin latest action SHAs, global tool installs, COPILOT_GITHUB_TOKEN#5

Open
Copilot wants to merge 8 commits into main from copilot/add-skill-evaluations-using-vally

Conversation

Copilot AI commented Jun 25, 2026

Contributor

Modernizes .github/workflows/eval.yml to use the latest GitHub Actions, install vally and the Copilot CLI as global tools, and switch to COPILOT_GITHUB_TOKEN for evals.

Changes

  • Upgraded actions to latest versions, SHA-pinned per security policy:
    • actions/checkout → v7.0.0
    • actions/setup-node → v6.4.0
    • actions/upload-artifact → v7.0.1
  • Added "Install tools" step — installs @microsoft/vally-cli@0.6.0 and @github/copilot@1.0.65 globally so downstream steps invoke them directly
  • Replaced npx -y @microsoft/vally-cli@0.6.0 ... with vally ... in Lint and Run evals steps
  • Switched env var in Run evals from GITHUB_TOKEN to COPILOT_GITHUB_TOKEN: ${{ secrets.COPILOT_GITHUB_TOKEN }}
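
The SHA-pinning policy mentioned in the first bullet can be checked mechanically. The following is a minimal sketch, not part of this PR and not the repo's pin-github-actions skill: the function name `find_unpinned_actions` and the `SAMPLE` workflow text are hypothetical. It only checks that each `uses:` ref looks like a full 40-character commit SHA; it cannot tell whether that SHA actually exists in the action's repository.

```python
import re

# Matches a `uses:` entry (optionally starting a list item) and captures
# the action name and the ref that follows the "@".
USES_RE = re.compile(r"""^\s*(?:-\s*)?uses:\s*["']?([^\s@#"']+)@([^\s#"']+)""")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def find_unpinned_actions(workflow_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, action, ref) for every `uses:` ref that is not a full SHA."""
    unpinned = []
    for line_number, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if match is None:
            continue  # not a `uses:` line, or a local ./path reference without "@"
        action, ref = match.groups()
        if action.startswith("docker://"):
            continue  # container images are pinned by digest, not by commit SHA
        if not FULL_SHA_RE.match(ref):
            unpinned.append((line_number, action, ref))
    return unpinned


# Hypothetical workflow text used only for illustration.
SAMPLE = """\
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-node@49933ea5288caeca8642e84d8d5f7a3f7c7bd7e8
    with:
      node-version: "22"
  - name: Upload
    uses: actions/upload-artifact@v4
"""

for line_number, action, ref in find_unpinned_actions(SAMPLE):
    print(f"line {line_number}: {action}@{ref} is not pinned to a full commit SHA")
```

A format check like this would pass a well-formed SHA that the runner still cannot resolve, so it complements rather than replaces an actual workflow run.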

Comment thread evals/linting/eval.yaml Outdated
heaths marked this pull request as ready for review June 25, 2026 07:45

Copilot AI left a comment

Pull request overview

Adds Vally-based evaluation specs and fixtures for selected plugin skills and wires them into CI via a dedicated pull-request workflow, so skill behavior can be continuously validated as part of the plugin marketplace repo.

Changes:

  • Added .vally.yaml defining a pr suite that runs linting + security eval specs.
  • Added Vally eval specs + fixtures for check-spelling, lint-markdown, and pin-github-actions.
  • Added a dedicated pull_request GitHub Actions workflow to run Vally lint/evals and upload vally-results.

Show a summary per file

| File | Description |
| --- | --- |
| evals/security/fixtures/pin-github-actions/workflow.yml | Adds a workflow fixture used to validate action pinning behavior. |
| evals/security/eval.yaml | Defines the security plugin eval stimulus and graders for pin-github-actions. |
| evals/linting/fixtures/lint-markdown/input.md | Adds a markdown fixture with intentionally inconsistent bullet styles. |
| evals/linting/fixtures/check-spelling/input.md | Adds a spelling fixture containing a known typo and a project-specific word. |
| evals/linting/eval.yaml | Defines linting plugin eval stimuli and graders for check-spelling and lint-markdown. |
| .vally.yaml | Introduces the Vally project config and pr suite wiring to the eval specs. |
| .github/workflows/vally-pr-evals.yml | Adds a PR workflow that runs Vally lint + evals and uploads results. |

Review details

  • Files reviewed: 7/7 changed files
  • Comments generated: 4
  • Review effort level: Low

Comment thread evals/linting/eval.yaml Outdated
Comment thread evals/linting/eval.yaml Outdated
Comment thread evals/security/eval.yaml Outdated
Comment thread evals/security/eval.yaml Outdated

heaths left a comment

Owner

@copilot the eval prompts should be vague. Check that the skills are invoked and the assertions pass, but don't include implementation details of the skills in the prompts. That's not realistic.

Comment thread evals/linting/eval.yaml Outdated

Copilot AI commented Jun 25, 2026

Contributor Author

@copilot the eval prompts should be vague. Check that the skills are invoked and the assertions pass, but don't include implementation details of the skills in the prompts. That's not realistic.

Updated in 3a2b9d9. The prompts are now vague and task-oriented, while expect_skills and the graders still verify the intended skill behavior.

Copilot AI requested a review from heaths June 25, 2026 08:04

heaths commented Jun 25, 2026

Owner

@copilot resolve the active comments using the suggestions given.

Copilot AI commented Jun 25, 2026

Contributor Author

@copilot resolve the active comments using the suggestions given.

Updated in a5127a4.

heaths commented Jun 25, 2026

Owner

@copilot resolve the check error "Error: Unable to resolve action actions/setup-node@49933ea5288caeca8642e84d8d5f7a3f7c7bd7e8, unable to find version 49933ea5288caeca8642e84d8d5f7a3f7c7bd7e8" and fix the workflow. Pin the GitHub action using the pin-github-actions skill in this repo. Also rename the workflow to just "eval", both the file name (eval.yml) and the name in the file.

Copilot AI commented Jun 25, 2026

Contributor Author

@copilot resolve the check error "Error: Unable to resolve action actions/setup-node@49933ea5288caeca8642e84d8d5f7a3f7c7bd7e8, unable to find version 49933ea5288caeca8642e84d8d5f7a3f7c7bd7e8" and ...

Updated in 4a6d841.

Copilot AI requested a review from heaths June 25, 2026 08:27

heaths commented Jun 25, 2026

Owner

@copilot running evals fails with "Error: Authorization error, you may need to run /login (Request ID: DC33:24F025:3306A38:3777493:6A3CE6B2)". Fix the eval workflow. According to https://microsoft.github.io/vally/get-started/add-to-ci/, the GitHub token needs to be passed, e.g.:

name: Skill Eval
on:
  pull_request:
    paths:
      - "**/SKILL.md"
      - "**/eval.yaml"

jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: "22"

      - run: npm install -g @microsoft/vally-cli

      - name: Run evals
        run: |
          vally eval \
            --eval-spec eval.yaml \
            --skill-dir . \
            --output-dir ./results
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      - name: Upload trajectories
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: eval-results
          path: ./results/

Copilot AI commented Jun 25, 2026

Contributor Author

@copilot running evals fails with " Error: Authorization error, you may need to run /login (Request ID: DC33:24F025:3306A38:3777493:6A3CE6B2)". Fix the eval workflow. According to https://microsoft.g...

Updated in e9cfcc2.

Copilot AI requested a review from heaths June 25, 2026 08:35

heaths commented Jun 25, 2026

Owner

@copilot running evals is still failing with "Error: Authorization error, you may need to run /login (Request ID: 2818:77508:338EB17:38288BC:6A3CE88E)". Fix the workflow. If you cannot determine why, do not use trial and error to fix it. Terminate with an error and I'll ask colleagues who might know why.

heaths commented Jun 25, 2026

Owner

@chlowell I'm getting an auth error trying to run evals. Any idea why?

Error: Authorization error, you may need to run /login (Request ID: 2818:77508:338EB17:38288BC:6A3CE88E)

I followed the docs and can't find anything about token requirements.

/cc @LarryOsterman
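
The thread does not establish what caused the authorization error, so the following does not fix it. It is only a sketch of a preflight check that could run before the evals to rule out one common cause: in GitHub Actions, a secret that is not defined or not available to the run expands to an empty string, so the variable exists but carries no token. The helper name `missing_secrets` is hypothetical; the variable name `COPILOT_GITHUB_TOKEN` is taken from the PR description.

```python
import os
from collections.abc import Mapping


def missing_secrets(required: list[str], env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names in `required` that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]


# Illustration with explicit mappings instead of the real environment.
print(missing_secrets(["COPILOT_GITHUB_TOKEN"], {"COPILOT_GITHUB_TOKEN": ""}))
print(missing_secrets(["COPILOT_GITHUB_TOKEN"], {"COPILOT_GITHUB_TOKEN": "example-value"}))
```

In a workflow step, a non-empty result could be turned into an early failure with a message naming the missing variable, which is easier to act on than a generic authorization error. A token that is present but lacks the required permissions would still pass this check.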

Copilot AI changed the title from "Add per-plugin Vally skill evals and dedicated PR eval workflow" to "Upgrade eval workflow: pin latest action SHAs, global tool installs, COPILOT_GITHUB_TOKEN" Jun 25, 2026
Comment thread .github/workflows/eval.yml Outdated
Comment on lines +15 to +19
      - name: Install tools
        run: |
          npm install -g @microsoft/vally-cli@0.6.0
          npm install -g @github/copilot@1.0.65

Owner

@copilot use separate steps to install each tool for better diagnostics.

Contributor Author

Split into separate "Install vally" and "Install copilot CLI" steps in the latest commit.

Copilot AI requested a review from heaths June 25, 2026 19:04

3 participants