feat(sandbox,providers): add aws-bedrock as a recognized inference provider #1704

Open
st-gr wants to merge 14 commits into NVIDIA:main from st-gr:feat/aws-bedrock-provider

Conversation

@st-gr

st-gr commented Jun 3, 2026

Summary

Adds aws-bedrock as a recognized inference protocol in the supervisor's L7 router and the providers catalog so operators can register a Bedrock-shaped upstream as --type aws-bedrock and route Claude Code Bedrock-mode traffic (POST /model/{id}/invoke[-with-response-stream]) through inference.local. Without this, sandboxes hit 403 "connection not allowed by policy" because no L7 pattern matches Bedrock URLs. The canonical no-SigV4 use case is SAP AI Core deployed Bedrock models (Anthropic models behind a Bedrock-shape API with XSUAA bearer auth instead of SigV4); operators wanting real AWS Bedrock additionally need #1630's proxy-side SigV4 signing.

Related Issue

Complementary to #1630 ("Sigv4 credential signing"). #1630 adds proxy-side SigV4 re-signing as a credential_signing: sigv4 policy field. This PR is the URL-pattern half: the supervisor's L7 router needs to recognize Bedrock InvokeModel paths before anything can be signed, regardless of whether the upstream needs SigV4. The two patches don't touch the same files; they're complementary, not overlapping.

No upstream tracking issue filed — happy to file one if reviewers prefer.

Changes

  • crates/openshell-sandbox/src/l7/inference.rs:
    • Adds two patterns to default_patterns(): POST /model/*/invoke (aws_bedrock_invoke) and POST /model/*/invoke-with-response-stream (aws_bedrock_invoke_stream).
    • Extends detect_inference_pattern to support a single middle /*/ glob in addition to the existing trailing /*. The middle wildcard matches exactly one non-empty path segment containing no /, so /model//invoke and /model/a/b/invoke both fail to match.
  • providers/aws-bedrock.yaml: new YAML profile declaring four credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_REGION) and a default endpoint of bedrock-runtime.us-east-1.amazonaws.com:443. Operators in other regions or pointing at non-AWS Bedrock-compatible upstreams override per-deployment via the operator-supplied BEDROCK_BASE_URL config-key (mirroring how the anthropic provider accepts ANTHROPIC_BASE_URL).
  • crates/openshell-providers/src/providers/aws_bedrock.rs: adds the ProviderDiscoverySpec so --auto-providers picks up AWS_* env vars from local credentials.
  • crates/openshell-providers/src/{providers/mod.rs,lib.rs,profiles.rs}: register the new module + SPEC + YAML.
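
The middle-wildcard rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in detect_inference_pattern: the helper names `matches_middle_glob` and `is_bedrock_invoke` are hypothetical, and the real matcher may structure its patterns differently.

```rust
/// Hypothetical helper: match `path` against a pattern that contains at most
/// one middle `/*/` wildcard, for example "/model/*/invoke".
fn matches_middle_glob(pattern: &str, path: &str) -> bool {
    // Ignore any query string before matching.
    let path = path.split('?').next().unwrap_or(path);
    match pattern.split_once("/*/") {
        // No middle wildcard: fall back to an exact comparison.
        None => pattern == path,
        Some((prefix, suffix)) => {
            // Strip "<prefix>/" from the front of the path.
            let Some(rest) = path.strip_prefix(prefix).and_then(|r| r.strip_prefix('/')) else {
                return false;
            };
            // Strip "/<suffix>" from the back; what remains is the wildcard segment.
            let Some(segment) = rest.strip_suffix(suffix).and_then(|r| r.strip_suffix('/')) else {
                return false;
            };
            // The wildcard must cover exactly one non-empty segment.
            !segment.is_empty() && !segment.contains('/')
        }
    }
}

/// Hypothetical wrapper combining the method check with both Bedrock patterns.
fn is_bedrock_invoke(method: &str, path: &str) -> bool {
    method.eq_ignore_ascii_case("POST")
        && (matches_middle_glob("/model/*/invoke", path)
            || matches_middle_glob("/model/*/invoke-with-response-stream", path))
}
```

Under these assumptions, POST /model/some-model/invoke matches, while GET requests, /model//invoke, and /model/a/b/invoke do not.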

Testing

  • mise run pre-commit passes: not run end-to-end (mise is not installed in the author's dev environment), but the equivalent Rust pieces were verified independently: cargo fmt --all -- --check is clean; cargo clippy --no-deps -p openshell-providers -p openshell-sandbox --all-targets -- -D warnings is clean.
  • Unit tests added/updated — 7 new pattern-matcher tests in crates/openshell-sandbox/src/l7/inference.rs::tests cover positive path, query-string handling, GET rejection, empty-segment rejection, multi-segment rejection, unknown-action rejection. Provider-discovery test follows the existing test_discovers_env_credential! macro convention. cargo test -p openshell-sandbox --lib l7::inference: 40 passed; cargo test -p openshell-providers: 35 passed.
  • E2E tests added/updated — none — running an aws-bedrock provider end-to-end requires either real AWS Bedrock with SigV4 (covered by Sigv4 credential signing #1630) or a Bedrock-compatible stub backend. Deferring the E2E test to whichever PR lands second so it can exercise the full URL-pattern + auth path together.

Checklist

  • Follows Conventional Commits (feat(sandbox):, feat(providers):)
  • Commits are signed off (DCO)
  • Architecture docs updated — no surgical place to add a row that wouldn't pre-empt Sigv4 credential signing #1630's signing scope. The new patterns sit alongside the existing OpenAI/Anthropic ones in default_patterns(); the new YAML profile follows the same shape as claude-code.yaml / nvidia.yaml. Happy to add a paragraph to docs/sandboxes/manage-providers.mdx (or another spot reviewers prefer) in this PR rather than a doc-only follow-up.

Notes for reviewers

Use cases (which PRs you need for which upstream):

| Upstream | What you need | Why |
| --- | --- | --- |
| SAP AI Core deployed Bedrock (XSUAA bearer; no SigV4) | This PR alone | The bridge ignores inbound auth and mints XSUAA outbound; the supervisor's L7 router only needs to recognize Bedrock URL patterns, which this PR adds. |
| In-cluster translating bridge (LiteLLM in Bedrock-emulation mode, custom Bedrock-compatible proxy that authenticates separately) | This PR alone | Same shape as SAP: the operator's bridge handles upstream auth; the proxy just needs URL-pattern recognition. |
| Real AWS Bedrock (SigV4 enforced at AWS) | This PR plus #1630 | This PR adds the URL-pattern recognition; #1630 adds proxy-side SigV4 signing via the credential_signing: sigv4 policy field. The two are complementary; this PR is the prerequisite that makes #1630's signing applicable to Bedrock paths. |

In all three cases, provider create --type aws-bedrock requires --no-verify until a Bedrock-aware arm is added to validation_probe() in crates/openshell-router/src/backend.rs. That extension is left for a follow-up PR to keep this one focused on the URL-pattern + provider-registration changes.

Out of scope (intentional):

  • SigV4 signing. Already addressed in Sigv4 credential signing #1630.
  • {region} placeholder substitution in the YAML loader. Operators override per-deployment via BEDROCK_BASE_URL config-key the same way ANTHROPIC_BASE_URL works for the anthropic provider.
  • Body translation between Bedrock InvokeModel and other inference shapes. The router treats matched requests as opaque pass-through.
  • CLI / TUI surface updates. Operators can already create the provider via openshell provider create --type aws-bedrock because the registry recognizes the new id; surfacing it in the TUI's provider-type picker is a follow-up.

Operator context:

The st-gr/openshell-driver-kyma Helm chart currently registers its SAP AI Core ↔ Bedrock translation bridge as --type anthropic with /v1/messages on the inside, because aws-bedrock isn't a recognized provider type. The chart therefore carries a server-side Anthropic→Bedrock body translator and a denylist for Anthropic-API-only fields the SAP gateway rejects. After this PR, the bridge becomes a path-translating + auth-substituting pass-through with no body work — the chart's translator code goes away.

@st-gr st-gr requested review from a team, derekwaynecarr, maxamillion and mrunalp as code owners June 3, 2026 05:30
@copy-pr-bot

copy-pr-bot Bot commented Jun 3, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

github-actions Bot commented Jun 3, 2026

All contributors have signed the DCO ✍️ ✅
Posted by the DCO Assistant Lite bot.

@st-gr

st-gr commented Jun 3, 2026

Author

I have read the DCO document and I hereby sign the DCO.

@st-gr

st-gr commented Jun 3, 2026

Author

recheck

Comment thread crates/openshell-providers/src/providers/aws_bedrock.rs
Comment thread providers/aws-bedrock.yaml Outdated
@johntmyers

johntmyers commented Jun 3, 2026

Collaborator

gator-agent

PR Review Status

Validation: this PR remains project-valid as a concentrated Bedrock-compatible provider/profile and sandbox L7 routing update. The prior legacy ProviderDiscoverySpec blocker is resolved at head 508a4e6c1fb88c7c1550f2cebb50c24301116db6.

Head SHA: 508a4e6c1fb88c7c1550f2cebb50c24301116db6

Review findings:

  • Blocking: aws-bedrock still is not wired into the managed inference.local route registry. openshell_core::inference::profile_for only registers openai, anthropic, and nvidia, so openshell inference set --provider <aws-bedrock-provider> will reject this provider before the new sandbox L7 patterns can be used. Please add the aws-bedrock inference profile/route support or narrow the PR so it does not advertise managed inference.local support.
  • Major: providers/aws-bedrock.yaml declares AWS_SESSION_TOKEN, but discovery.credentials omits aws_session_token. Providers v2 discovery only scans listed credentials, so --from-existing will silently drop temporary AWS credentials. Please include aws_session_token in discovery.credentials.
  • Major: direct user-facing docs are still missing. If this PR exposes aws-bedrock as a provider/profile and adds Bedrock request patterns, please update the existing provider and inference routing docs. No new docs/index.yml entry appears necessary.
  • Test gap: add coverage that an aws-bedrock provider can be created from the built-in profile and either successfully used by upsert_cluster_inference_route or intentionally rejected with a clear documented error.

Docs: missing for direct provider and inference-routing behavior.

Checks: DCO is passing. Branch Checks and Helm Lint are still waiting for /ok to test; I am not posting /ok to test while blocking review feedback is outstanding.

Next state: gator:in-review

@johntmyers johntmyers added the gator:in-review Gator is reviewing or awaiting PR review feedback label Jun 3, 2026
st-gr added a commit to st-gr/OpenShell that referenced this pull request Jun 5, 2026
Addresses johntmyers's review on NVIDIA#1704: net-new
providers should land via the v2 YAML profile only and should NOT
require changes to the legacy `ProviderDiscoverySpec` registry.

- Delete `crates/openshell-providers/src/providers/aws_bedrock.rs`
  (the legacy SPEC + `test_discovers_env_credential!` invocation).
- Drop `pub mod aws_bedrock;` from `crates/openshell-providers/src/providers/mod.rs`.
- Drop `registry.register(providers::aws_bedrock::SPEC)` from
  `crates/openshell-providers/src/lib.rs`.

Kept:

- `providers/aws-bedrock.yaml` and the `include_str!` in
  `BUILT_IN_PROFILE_YAMLS` (`profiles.rs`) — the v2 path.
  `discover_from_profile()` (`crates/openshell-providers/src/discovery.rs`)
  picks up AWS_* env vars via `discovery.credentials` in the YAML.
- L7 router patterns in `crates/openshell-sandbox/src/l7/inference.rs`
  — orthogonal to the provider registry.

The discovery test in the deleted file goes with it; v2 doesn't have
an established per-provider env-var-pickup unit test pattern, and
other YAML-only registrations (none today, but this is the new
direction) won't carry one either.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
@johntmyers

Collaborator

gator-agent

Re-check After Reviewer Update

I re-evaluated latest head 508a4e6c1fb88c7c1550f2cebb50c24301116db6 after st-gr's June 5 responses on the legacy provider update and endpoint wildcard, and johntmyers's June 5 maintainer response accepting the fixed bedrock-runtime.us-east-1.amazonaws.com endpoint.

Disposition: partially resolved. The legacy ProviderDiscoverySpec update is removed, and the endpoint wildcard discussion is resolved. Blocking review feedback remains.

Remaining items:

  • Blocking: providers/aws-bedrock.yaml marks the profile inference_capable: true, but openshell_core::inference::profile_for still only registers openai, anthropic, and nvidia; openshell inference set --provider <aws-bedrock-provider> will still reject this provider before the new sandbox L7 patterns can be used. Please add the aws-bedrock inference profile/route support with the intended auth semantics, or narrow the PR so it does not advertise managed inference.local support.
  • Blocking/static test regression: crates/openshell-server/src/grpc/provider.rs still asserts the built-in profile list is exactly ['claude-code', 'github', 'nvidia']. Adding providers/aws-bedrock.yaml makes that assertion stale; please update the expected built-in profile list.
  • Major: providers/aws-bedrock.yaml declares aws_session_token, but discovery.credentials omits it. Providers v2 discovery only scans listed credentials, so --from-existing will silently drop temporary AWS credentials. Please include aws_session_token in discovery.credentials.
  • Major: direct user-facing docs are still missing for the new provider/profile and inference-routing behavior. No docs/index.yml navigation change appears necessary.
  • Test gap: please add coverage that an aws-bedrock provider can be created from the built-in profile and either successfully used by upsert_cluster_inference_route or intentionally rejected with a clear documented error.

Checks: DCO is passing. Branch Checks and Helm Lint are still waiting for /ok to test; I am not posting /ok to test while blocking review feedback is outstanding.

Next state: gator:in-review

st-gr added a commit to st-gr/OpenShell that referenced this pull request Jun 7, 2026
…le assertion

Two fixes from johntmyers's gator-agent re-check on NVIDIA#1704:

1. `providers/aws-bedrock.yaml`: add `aws_session_token` to
   `discovery.credentials`. The credential is declared in the profile
   but was missing from the discovery scan list, so Providers v2
   `--from-existing` would silently drop temporary AWS credentials
   (STS / IRSA scenarios).

2. `crates/openshell-server/src/grpc/provider.rs`: update the static
   `list_provider_profiles_returns_built_in_profile_categories`
   assertion to include `aws-bedrock` at alphabetical position 0.
   Adding `providers/aws-bedrock.yaml` to BUILT_IN_PROFILE_YAMLS made
   the prior `["claude-code", "github", "nvidia"]` expectation stale.

Remaining blockers from the same review (deferred to follow-up
commits): `inference::profile_for` registration for aws-bedrock,
user-facing provider + inference-routing docs, and an
`upsert_cluster_inference_route` integration test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
st-gr added a commit to st-gr/OpenShell that referenced this pull request Jun 7, 2026
Addresses johntmyers's blocking review feedback on PR NVIDIA#1704:
"aws-bedrock still is not wired into the managed inference.local route
registry. profile_for only registers openai, anthropic, and nvidia, so
inference set --provider <aws-bedrock-provider> will reject this
provider before the new sandbox L7 patterns can be used."

Approach: register aws-bedrock as a *bridge-fronted* upstream — the
router does not inject any auth header on outbound requests; the
configured BEDROCK_BASE_URL is expected to point at a translating
bridge / Bedrock-compatible proxy that handles auth in its own pod.
This is the shape the L7 patterns commit (8b30211) and the YAML
profile (6b51e1a) were designed for. SigV4 signing for direct AWS
Bedrock is a separate follow-up; see PR thread.

Changes:

- core::inference::AuthHeader: add `None` variant for upstreams that
  authenticate themselves.
- core::inference: add AWS_BEDROCK_PROFILE static + register in
  profile_for. Default base URL is bedrock-runtime.us-east-1, override
  via BEDROCK_BASE_URL config-key (mirrors ANTHROPIC_BASE_URL pattern).
  Empty credential_key_names + auth: None means no router-side
  credential lookup at route time.
- router::backend: handle AuthHeader::None as a no-op (skip auth
  injection).
- server::inference::resolve_provider_route: gate find_provider_api_key
  on auth != None. aws-bedrock providers with empty credentials now
  resolve cleanly. Updated the unsupported-type error message to
  include aws-bedrock in the supported list.
- server::inference tests: add positive
  upsert_cluster_route_succeeds_for_aws_bedrock_without_api_key test
  covering the new code path end-to-end (provider with empty creds +
  BEDROCK_BASE_URL config → upsert succeeds → resolved route has
  empty api_key + provider_type aws-bedrock + bridge URL).
- core::inference tests: profile_for_known_types covers aws-bedrock,
  case-insensitive lookup, plus three new aws-bedrock-specific tests
  (auth: None, no credential keys, bedrock-specific protocols).
- docs/sandboxes/inference-routing.mdx: header forwarding row
  mentions aws-bedrock has no passthrough headers; new tabs in
  Supported API Patterns (InvokeModel + InvokeModelWithResponseStream)
  and Create a Provider (with the bridge-fronted shape note + SigV4
  deferral).
- docs/sandboxes/manage-providers.mdx: new row in Supported Provider
  Types table; new row in Supported Inference Providers table.
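
The `AuthHeader::None` behavior listed above can be sketched roughly as below. This is an illustration under assumptions, not the actual core::inference or router::backend code: the `Bearer` and `Custom` variants and both function names are hypothetical stand-ins, and only the `None` variant is taken from this commit message.

```rust
/// Illustrative stand-in for the auth-header enum. Only the `None` variant is
/// described in the commit; the other variants are assumptions.
#[derive(Debug, Clone, PartialEq)]
enum AuthHeader {
    /// Send `authorization: Bearer <key>`.
    Bearer,
    /// Send the raw key under a custom header name such as "x-api-key".
    Custom(&'static str),
    /// The upstream authenticates itself; inject nothing.
    None,
}

/// Returns the (name, value) header to add to the outbound request, if any.
fn auth_header_for(auth: &AuthHeader, api_key: &str) -> Option<(String, String)> {
    match auth {
        AuthHeader::Bearer => Some(("authorization".to_string(), format!("Bearer {api_key}"))),
        AuthHeader::Custom(name) => Some((name.to_string(), api_key.to_string())),
        // No-op: skip auth injection entirely.
        AuthHeader::None => None,
    }
}

/// Route resolution only needs a stored API key when a header will be injected.
fn needs_api_key(auth: &AuthHeader) -> bool {
    !matches!(auth, AuthHeader::None)
}
```

Gating the key lookup on `needs_api_key` is what would let a provider with no usable credentials resolve cleanly in this sketch.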

Verification (in dev container):
- cargo check -p openshell-core -p openshell-router -p openshell-server: clean
- cargo test -p openshell-core --lib inference: 14/14 pass (incl. 3 new)
- cargo test -p openshell-server --lib inference::tests::upsert: 6/6 pass
  (incl. new aws-bedrock test)
- cargo fmt --check: clean
- cargo clippy --all-targets -D warnings: clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
@st-gr

st-gr commented Jun 7, 2026

Author

@johntmyers — pushed 4ab587f1 addressing the remaining review items.

This push:

  • Blocking (profile_for registration): registered aws-bedrock with a new AuthHeader::None variant — the router does not inject any auth header on outbound requests. The configured BEDROCK_BASE_URL is expected to point at a translating bridge or Bedrock-compatible proxy that handles auth in its own pod. This matches the pass-through architecture noted in the original commit message (8b30211a: "if the operator's upstream is real AWS Bedrock it speaks Bedrock natively, if it's a translating bridge the bridge does any conversion server-side"). openshell inference set --provider <aws-bedrock-provider> now resolves cleanly.
  • Major (aws_session_token in discovery): added in 3d0ad1c6.
  • Major (docs): updated docs/sandboxes/inference-routing.mdx (header forwarding row, new API-pattern tab, new provider-creation tab with the bridge-fronted note) and docs/sandboxes/manage-providers.mdx (Supported Provider Types and Supported Inference Providers tables).
  • Test gap: added upsert_cluster_route_succeeds_for_aws_bedrock_without_api_key in crates/openshell-server/src/inference.rs. Creates an aws-bedrock provider with empty credentials + a BEDROCK_BASE_URL config-key, upserts the cluster route, and asserts the resolved route carries provider_type: "aws-bedrock", the bridge URL, and an empty api_key.
  • Static profile assertion: updated in 3d0ad1c6 to include aws-bedrock.

Deferred to follow-up PRs (will file issues for tracking):

  1. SigV4 router-side signer. Adds an AuthHeader::SigV4 { ... } variant + outbound request signing using the aws-sigv4 crate so aws-bedrock providers can authenticate against real AWS Bedrock directly (instead of via a bridge). The current PR's auth: None shape is one of two intended shapes; SigV4 is the other and is genuinely separable.
  2. {region} placeholder substitution in YAML profile loader. So operators can use any AWS region without having to override the entire base URL via BEDROCK_BASE_URL. Today the YAML's host is a literal bedrock-runtime.us-east-1.amazonaws.com.

Verification (run locally in the rust-1.95 dev container): cargo check -p openshell-core -p openshell-router -p openshell-server, cargo test -p openshell-core --lib inference (14/14), cargo test -p openshell-server --lib inference::tests::upsert (6/6), cargo fmt --check, cargo clippy --all-targets -D warnings — all clean. Branch Checks / Helm Lint stand by for /ok to test whenever ready.

@johntmyers

Collaborator

gator-agent

Re-check After Author Update

I re-evaluated latest head 4ab587f1bdab2c2565462fe5ed9cc6bb3a5d0c12 after @st-gr's June 7 comment saying the remaining review items were addressed.

Disposition: partially resolved. The previous profile_for registration, built-in profile list, aws_session_token discovery entry, and Fern docs update requests are addressed. Blocking review feedback remains on the public setup path and Bedrock route behavior.

Remaining items:

  • Blocking: openshell inference set verifies routes by default, but the router validation probe does not support aws_bedrock_invoke / aws_bedrock_invoke_stream, so the documented Bedrock inference setup will fail unless users know to pass --no-verify. Please add a Bedrock validation strategy, or explicitly require/document --no-verify and add coverage for that public path.
  • Blocking/security: Bedrock carries the model id in /model/{modelId}/invoke, but the router currently forwards the caller's original path and only rewrites JSON body model. That lets sandbox code choose a different upstream model than the operator-configured route model, and may also mutate native Bedrock request bodies incorrectly. Please rewrite or reject mismatched Bedrock model paths and skip body model injection for Bedrock protocols, with buffered and streaming router coverage.
  • Blocking/docs or behavior mismatch: the docs show openshell provider create --type aws-bedrock --config BEDROCK_BASE_URL=... without credentials, but the CLI requires --from-existing or --credential and the server rejects empty credentials for non-refresh profiles. Please either make bridge-fronted aws-bedrock work as a config-only provider and test provider_create, or document the actual required command.
  • Major: until SigV4 support exists, the core profile should not silently fall back to the real AWS Bedrock endpoint with auth: None; require BEDROCK_BASE_URL for bridge-fronted mode or otherwise prevent saving a route that forwards prompt bodies to AWS without usable auth.
  • Major: the provider YAML marks AWS credentials required and discoverable even though bridge-fronted routing intentionally does not use them. Please avoid collecting required AWS credentials for the bridge-only shape, or explain and test why they must be stored before SigV4 lands.
  • Docs polish: docs/sandboxes/inference-routing.mdx still says provider support is NVIDIA, OpenAI-compatible, and Anthropic; please include AWS Bedrock in that summary row if this PR keeps the provider.

Checks: DCO is passing. Branch Checks and Helm Lint are still waiting for /ok to test; I am not posting /ok to test while blocking review feedback is outstanding.

Next state: gator:in-review

@johntmyers johntmyers added gator:blocked Gator is blocked by process or repository gates and removed gator:in-review Gator is reviewing or awaiting PR review feedback labels Jun 8, 2026
@johntmyers

Collaborator

gator-agent

Blocked

Gator is blocked by merge conflicts on the PR branch. GitHub currently reports mergeable=CONFLICTING and mergeStateStatus=DIRTY for head 4ab587f1bdab2c2565462fe5ed9cc6bb3a5d0c12.

Next action: @st-gr, please rebase or merge main into feat/aws-bedrock-provider and resolve the conflicts. After the branch is mergeable again, gator will re-check the outstanding review items before any /ok to test action.

st-gr added a commit to st-gr/OpenShell that referenced this pull request Jun 8, 2026
Resolves merge conflicts after main fast-forwarded 554 commits past
the branch base. Conflicting files and resolutions:

- crates/openshell-core/src/inference.rs: kept aws-bedrock additions
  (AWS_BEDROCK_PROTOCOLS const, AWS_BEDROCK_PROFILE static) alongside
  the new VERTEX_AI_PROTOCOLS const and normalize_inference_provider_type
  function. Added "aws-bedrock" to that normalize function so profile_for
  resolves it through the same canonicalization path as the other
  inference profiles.
- crates/openshell-server/src/grpc/provider.rs: merged the static
  built-in profile assertion to include both the upstream additions
  (codex, copilot, cursor, google-vertex-ai, pypi) and aws-bedrock,
  alphabetically.
- crates/openshell-server/src/inference.rs: kept the auth: None gating
  for aws-bedrock api-key lookup, applied around the new
  CredentialLookup-based call signature; preserved the vertex-ai
  dispatch added upstream. Updated the unsupported-type error message
  to list both google-vertex-ai and aws-bedrock.
- docs/sandboxes/inference-routing.mdx: combined the upstream Vertex
  Claude rawPredict header note with the aws-bedrock no-passthrough
  note in the header forwarding row.

Drive-by fix for one finding from gator-agent's NVIDIA#1704 re-check on
4ab587f: AWS_BEDROCK_PROFILE.default_base_url is now an empty string
rather than the real AWS Bedrock URL. Without BEDROCK_BASE_URL config,
resolve_provider_route's existing empty-base_url check now rejects
the provider rather than silently forwarding prompts to AWS with
auth: None. Once SigV4 lands, the default can revert.
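
A minimal sketch of the fail-closed behavior described in this drive-by fix, assuming a simplified string-map config. The function name `resolve_base_url` and the error wording are hypothetical; the real check lives in resolve_provider_route and returns a different error type.

```rust
use std::collections::HashMap;

/// Hypothetical helper: prefer the operator-supplied config key, fall back to
/// the profile default, and reject the route when neither yields a URL.
fn resolve_base_url(
    default_base_url: &str,
    config: &HashMap<String, String>,
    config_key: &str,
) -> Result<String, String> {
    let candidate = config
        .get(config_key)
        .map(|v| v.trim())
        .filter(|v| !v.is_empty())
        .unwrap_or(default_base_url.trim());
    if candidate.is_empty() {
        // With an empty profile default, a missing override fails closed
        // instead of forwarding prompts to an endpoint with no usable auth.
        return Err(format!("base_url is empty: set the {config_key} config key"));
    }
    Ok(candidate.to_string())
}
```

With `default_base_url` set to an empty string, a provider without BEDROCK_BASE_URL is rejected here, which mirrors the intent stated above.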

Verification (in dev container):
- cargo check -p openshell-core -p openshell-router -p openshell-server: clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
st-gr added a commit to st-gr/OpenShell that referenced this pull request Jun 8, 2026
Addresses four findings from gator-agent's NVIDIA#1704 re-check on 4ab587f:

- **Item 5** (YAML collects unused AWS creds): mark all four AWS
  credentials `required: false` and clear `discovery.credentials`.
  Bridge-fronted routing intentionally does not consume AWS
  credentials, so `--from-existing` no longer scans for them. The
  credentials remain in the schema (not deleted) so the SigV4
  follow-up can flip them back without a schema migration. Added a
  multi-line description that names the bridge-fronted shape and the
  SigV4 deferral so readers don't have to cross-reference the PR
  thread.

- **Item 3** (docs show command that the CLI rejects): rewrite the
  Create-a-Provider example for AWS Bedrock to use the actual
  required shape — placeholder `--credential AWS_ACCESS_KEY_ID=
  unused-bridge-fronted-shape` plus the `--config BEDROCK_BASE_URL`.
  The placeholder satisfies the gRPC handler's
  `provider.credentials.is_empty()` rejection without expanding
  server-side validation; the router ignores it on the outbound path
  because `auth: AuthHeader::None` skips header injection. Operators
  see a clearly-labeled placeholder in `provider get` output.

- **Item 1** (validator probe): document `--no-verify` as required
  for `openshell inference set --provider <aws-bedrock>` since the
  default validation probe doesn't recognize the
  `aws_bedrock_invoke` / `aws_bedrock_invoke_stream` protocols. Doc
  now shows the full `provider create` + `inference set --no-verify`
  flow with rationale for both decisions inline.

- **Item 6** (docs polish): `inference-routing.mdx` summary row now
  lists AWS Bedrock alongside NVIDIA, Anthropic, Vertex AI, and
  OpenAI-compatible providers, with the bridge-fronted caveat
  inline.

Test additions in `crates/openshell-server/src/inference.rs`:

- Renamed the existing aws-bedrock test from
  `..._without_api_key` to `..._with_bridge_url` and updated it to
  use a placeholder credential (mirroring the doc-recommended
  pattern operators will copy-paste). The `auth: None` path still
  produces an empty `api_key` on the resolved route — the test now
  documents that the credential is *stored* but not *used*.
- Added `upsert_cluster_route_rejects_aws_bedrock_without_bedrock_base_url`:
  the negative half of johntmyers' "successfully used by
  upsert_cluster_inference_route or intentionally rejected with a
  clear documented error" ask. With
  `default_base_url: ""` and no `BEDROCK_BASE_URL` config, route
  resolution returns `InvalidArgument` naming the missing base_url
  rather than silently forwarding prompts to AWS Bedrock with no
  usable auth.

Verification (in dev container):
- cargo test -p openshell-core --lib inference: 18/18 (incl. 3 new)
- cargo test -p openshell-server --lib inference::tests::upsert: 8/8
  (incl. 2 new aws-bedrock cases — positive + negative)
- cargo fmt --check: clean
- cargo clippy --all-targets -D warnings: clean

Item 2 (router-side enforcement of operator-configured Bedrock model
path, replacing the current verbatim path forwarding + body-only
model rewrite) is the remaining blocker and is genuinely separable —
it touches the L7 router with streaming-aware test coverage.
Deferring to its own commit so the security-critical change gets the
review attention it deserves.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
@johntmyers johntmyers added gator:in-review Gator is reviewing or awaiting PR review feedback and removed gator:blocked Gator is blocked by process or repository gates labels Jun 9, 2026
@johntmyers

Collaborator

gator-agent

Re-check After Author Update

I re-evaluated latest head 6373b4d2649f7fb36cbe75c34c96319f0120e439 after @st-gr pushed updates on June 8, 2026 to resolve the prior merge-conflict blocker and address the remaining review items.

Disposition: partially resolved. The merge-conflict blocker is resolved (mergeable=true), and the bridge-only Bedrock profile/docs/no-verify path is clearer. Blocking review feedback remains.

Remaining items:

  • Blocking/security: Bedrock still does not enforce the configured route model. aws-bedrock resolves with model_in_path: false and no path override, so sandbox code can call /model/other-model/invoke and the bridge receives that caller-selected model in the URL. The generic body rewrite also injects "model": route.model, which is not Bedrock's selector and can mutate native Bedrock bodies. Please either build/rewrite Bedrock upstream paths from route.model or reject mismatched caller model paths, and skip generic body model insertion for aws_bedrock_* protocols.
  • Blocking/scope: the PR currently carries unrelated fork infrastructure and external compute driver changes (.github/workflows/release-gateway.yml, .github/workflows/rust-lint.yml, .github/workflows/smoke-self-hosted-kyma.yml, Dockerfile.gateway, README-FORK.md, and external-driver hunks). Please remove those from this Bedrock provider PR or split them into maintainer-scoped PRs.
  • Warning: Bedrock InvokeModel should be buffered while InvokeModelWithResponseStream is streaming. Please add framing/coverage so /model/{id}/invoke cannot be corrupted by the streaming proxy's truncation/error-frame behavior.
  • Warning/docs: the bridge-only profile still declares AWS secret fields while discovery is empty and the router does not consume them. Please avoid encouraging real AWS secrets for the current bridge-only shape, or label them clearly as future SigV4 fields in the docs table.
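
The buffered-versus-streaming split requested in the warning above could be keyed off the matched protocol, roughly as sketched here. The `ResponseMode` enum and `response_mode_for` function are hypothetical names; how the router actually selects its proxy path is not shown in this thread.

```rust
/// Hypothetical response-handling mode for a matched Bedrock protocol.
#[derive(Debug, PartialEq)]
enum ResponseMode {
    /// Read the whole upstream response before replying (InvokeModel).
    Buffered,
    /// Relay upstream chunks as they arrive (InvokeModelWithResponseStream).
    Streaming,
}

/// Maps the protocol names used in this PR to a response mode.
/// Returns `None` for protocols this sketch does not cover.
fn response_mode_for(protocol: &str) -> Option<ResponseMode> {
    match protocol {
        "aws_bedrock_invoke" => Some(ResponseMode::Buffered),
        "aws_bedrock_invoke_stream" => Some(ResponseMode::Streaming),
        _ => None,
    }
}
```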

Checks: DCO is passing. Branch Checks and Helm Lint are still waiting for /ok to test; I am not posting /ok to test while blocking review feedback is outstanding.

Next state: gator:in-review

st-gr added a commit to st-gr/OpenShell that referenced this pull request Jun 9, 2026
Closes the security-blocking item from gator-agent's NVIDIA#1704 re-check
on 4ab587f: "Bedrock carries the model id in /model/{modelId}/invoke,
but the router currently forwards the caller's original path and only
rewrites JSON body model. That lets sandbox code choose a different
upstream model than the operator-configured route model, and may also
mutate native Bedrock request bodies incorrectly."

Two changes in `prepare_backend_request`:

1. **Path rewrite for Bedrock routes.** Before computing the upstream
   URL, parse the inbound path's `/model/<id>/invoke[-with-response-stream]`
   shape and substitute the operator-configured `route.model` for the
   caller-supplied model segment. Sandbox code that hardcodes a
   different model still works (we don't reject on mismatch), but the
   operator's configured model is what reaches the upstream / bridge.
   If the inbound path is somehow not a recognized Bedrock shape on a
   Bedrock route (the L7 pattern detector upstream of the router
   should never produce this combination), reject with
   RouterError::Internal naming the offending path rather than
   forwarding verbatim.

2. **Skip body-model injection for Bedrock routes.** The existing body
   rewriter unconditionally inserts `route.model` into the JSON body
   for non-Vertex routes. AWS Bedrock InvokeModel encodes the model
   in the URL path; the body is the raw provider-specific payload
   (Anthropic Messages for Claude, Mistral payload for Mistral, etc.)
   and must not be mutated. The branch ordering is now:
   needs_vertex_anthropic_version → strip body model + inject
   anthropic_version; route_is_bedrock → leave body alone; else →
   inject route.model (existing default).

New helpers, all in `crates/openshell-router/src/backend.rs`:

- `route_is_bedrock(route)` — true when route.protocols contains
  aws_bedrock_invoke or aws_bedrock_invoke_stream.
- `parse_bedrock_invocation_path(path)` — returns
  Some((model_id, "/invoke" | "/invoke-with-response-stream")) for
  paths matching the recognized Bedrock shapes. Strips query strings.
  Rejects empty model ids and multi-segment ids (defense-in-depth
  matching the L7 pattern detector's existing guards).
- `rewrite_bedrock_path(route, path)` — returns the path with the
  caller's model segment replaced by route.model.
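
The helpers above can be sketched roughly as follows. This is a minimal illustration, not the code in `backend.rs`: the `Route` struct is a hypothetical stand-in that carries only `model`, and the `Option` return types are assumptions (the real router rejects with `RouterError::Internal`).

```rust
/// Hypothetical stand-in for the router's route type; only `model` matters here.
pub struct Route {
    pub model: String,
}

/// Splits `/model/<id>/invoke[-with-response-stream]` into (model_id, action).
/// Any query string is dropped before matching, as the commit message describes.
pub fn parse_bedrock_invocation_path(path: &str) -> Option<(&str, &'static str)> {
    let path_only = path.split('?').next().unwrap_or(path);
    let rest = path_only.strip_prefix("/model/")?;
    // Splitting at the first '/' means a multi-segment id such as `a/b`
    // leaves `b/invoke` as the action, which matches no arm below.
    let (model_id, action) = rest.split_once('/')?;
    if model_id.is_empty() {
        return None;
    }
    match action {
        "invoke" => Some((model_id, "/invoke")),
        "invoke-with-response-stream" => Some((model_id, "/invoke-with-response-stream")),
        _ => None,
    }
}

/// Replaces the caller-supplied model segment with the operator-configured one.
/// Returns `None` for paths that are not a recognized Bedrock shape.
pub fn rewrite_bedrock_path(route: &Route, path: &str) -> Option<String> {
    let (_caller_model, action) = parse_bedrock_invocation_path(path)?;
    Some(format!("/model/{}{}", route.model, action))
}
```

With a route whose `model` is `operator-model`, a caller path of `/model/some-other-model/invoke` is rewritten to `/model/operator-model/invoke`, while a non-Bedrock path such as `/v1/chat/completions` yields `None`.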

Test coverage in the same file (9 new tests):

- parse_bedrock_invocation_path: positive cases for both invoke
  variants, query-string stripping; negative cases for empty model id,
  multi-segment id, unknown action, wrong prefix, missing slash.
- route_is_bedrock: matches both protocol variants singly and
  combined; rejects openai_chat_completions.
- rewrite_bedrock_path: substitutes operator model on both invoke
  variants; returns None for non-Bedrock paths.
- bedrock_route_rewrites_model_in_path_and_preserves_body
  (wiremock end-to-end): caller sends /model/some-other-model/invoke
  with a body containing model: "caller-supplied-model-name". Mock
  asserts the upstream receives /model/<operator-model>/invoke and the
  body's model field is the caller's value (NOT route.model) — proves
  both the path rewrite and the body preservation.
- bedrock_route_streaming_rewrites_model_in_path: same contract for
  invoke-with-response-stream.
- bedrock_route_rejects_non_bedrock_path: defense-in-depth coverage of
  the Internal-error path when a Bedrock route receives a path that
  doesn't match Bedrock shape.

Verification (in dev container):
- cargo test -p openshell-router --lib: 53/53 (incl. 9 new)
- cargo fmt --check: clean
- cargo clippy -p openshell-core -p openshell-router -p openshell-server
  --all-targets -- -D warnings: clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
@st-gr st-gr force-pushed the feat/aws-bedrock-provider branch from 2a8e23b to a1055e2 Compare June 9, 2026 05:17
@johntmyers

Collaborator

gator-agent

Re-check After Author Update

I re-evaluated latest head 36c6c20508e91c8fd4415dbb3fa1c1a875650a67 after @st-gr's June 16, 2026 comment asking whether the current gator findings should remain blockers. I also ran the required independent code-only review for the current diff and response.

Disposition: partially resolved.

Remaining items:

  • Blocking/security: the AWS credential schema/docs issue remains a blocker unless a maintainer explicitly accepts the risk. providers/aws-bedrock.yaml and docs/sandboxes/manage-providers.mdx still expose real AWS credential names, including AWS_SECRET_ACCESS_KEY, for a bridge-fronted provider shape that does not consume them today. Because the docs describe these as injected environment variables, this creates unnecessary sandbox-visible secret surface for no current runtime benefit. Recommended fix: use a neutral bridge placeholder credential such as BEDROCK_BRIDGE_PLACEHOLDER, keep discovery empty, and update the provider docs/table so real AWS secrets are not presented as injected for this bridge-only release. Security mapping: CWE-200 / CWE-269 / OWASP LLM06.
  • Non-blocking follow-up: the provider-create/gRPC path coverage request is valid test hardening, but I am downgrading it from a merge blocker. The current diff covers the route-resolution/router behavior, profile registration, BEDROCK_BASE_URL requirement, auth-none behavior, and Bedrock model-id validation. Public create-path coverage would still be useful to catch docs/profile drift.

Checks: DCO and the CI gate publisher are passing, but OpenShell / Branch Checks, OpenShell / Helm Lint, and OpenShell / E2E are still waiting for /ok to test on this head. I am not posting /ok to test while the credential-schema blocker remains unresolved or unwaived by a maintainer.

Next state: gator:in-review

@johntmyers

Collaborator
  1. We can keep as-is knowing there is a follow-up coming.

  2. File as a follow-up.

@st-gr looks like we need a rebase and then we can merge. Let's get the follow-ups ticketed and then we can keep going.

st-gr and others added 13 commits June 16, 2026 16:47
Adds two patterns to `default_patterns()` so the supervisor's L7
inference router recognizes the Bedrock InvokeModel URL shape and
forwards matched requests to the registered upstream:

- `POST /model/{modelId}/invoke`                       → aws_bedrock_invoke
- `POST /model/{modelId}/invoke-with-response-stream`  → aws_bedrock_invoke_stream

The `{modelId}` segment is wildcarded by extending `detect_inference_pattern`
to handle one middle `/*/` segment in addition to the existing trailing
`/*`. The wildcard is constrained to a single non-empty path segment to
avoid path-traversal liabilities — `/model//invoke` and `/model/a/b/invoke`
both no-match.
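
A minimal sketch of the single-segment middle wildcard, for illustration only. The function names and the `(method, glob, protocol)` table layout are hypothetical and are not taken from `inference.rs`; only the two path shapes and protocol names come from this commit.

```rust
/// Matches `path` against a pattern with at most one middle `/*/` glob.
/// The wildcard must cover exactly one non-empty segment with no `/` in it.
pub fn matches_middle_glob(pattern: &str, path: &str) -> bool {
    // The query string is not part of the route shape.
    let path = path.split('?').next().unwrap_or(path);
    match pattern.split_once("/*/") {
        // No middle wildcard: exact comparison in this sketch.
        None => pattern == path,
        Some((prefix, suffix)) => {
            // For "/model/*/invoke": prefix = "/model", suffix = "invoke".
            let Some(rest) = path.strip_prefix(prefix).and_then(|r| r.strip_prefix('/')) else {
                return false;
            };
            let Some(segment) = rest.strip_suffix(suffix).and_then(|r| r.strip_suffix('/')) else {
                return false;
            };
            !segment.is_empty() && !segment.contains('/')
        }
    }
}

/// Returns the protocol name for a request, or `None` when nothing matches.
pub fn detect_bedrock_protocol(method: &str, path: &str) -> Option<&'static str> {
    const PATTERNS: [(&str, &str, &str); 2] = [
        ("POST", "/model/*/invoke", "aws_bedrock_invoke"),
        ("POST", "/model/*/invoke-with-response-stream", "aws_bedrock_invoke_stream"),
    ];
    PATTERNS
        .iter()
        .find(|(m, pat, _)| *m == method && matches_middle_glob(pat, path))
        .map(|(_, _, proto)| *proto)
}
```

Stripping the literal prefix and suffix first and then checking what remains keeps the wildcard from ever spanning a `/`, which is why `/model//invoke` and `/model/a/b/invoke` fall through to no-match.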

Without this, sandboxes running Claude Code in its native Bedrock mode
(`CLAUDE_CODE_USE_BEDROCK=1`, `ANTHROPIC_BEDROCK_BASE_URL`, AWS-style
auth) hit the supervisor with `403 connection not allowed by policy`
because their URL doesn't match `/v1/*` shapes. The fix unblocks
operators wanting to register direct AWS Bedrock, an in-cluster
Bedrock-compatible bridge, or a Bedrock-emulating LiteLLM as
`--type aws-bedrock` providers.

Tests cover: positive matches for invoke + invoke-with-response-stream,
query-string handling, GET rejection, empty-segment rejection,
multi-segment rejection, and unknown-action rejection.

Companion changes (provider discovery spec + YAML profile) follow in
the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Adds `aws-bedrock` to the built-in provider catalog so operators can
run `openshell provider create --type aws-bedrock --credential ...`
and have the gateway treat it as a first-class inference provider
alongside `anthropic`, `openai`, etc.

- `providers/aws-bedrock.yaml`: YAML profile declaring four credentials
  (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_REGION).
  Default endpoint is `bedrock-runtime.us-east-1.amazonaws.com:443`;
  operators in other regions or running against a Bedrock-compatible
  proxy override via the operator-supplied `BEDROCK_BASE_URL` config-key
  (mirrors `ANTHROPIC_BASE_URL` for the `anthropic` provider).

- `crates/openshell-providers/src/providers/aws_bedrock.rs`: the
  `ProviderDiscoverySpec` so `openshell provider create --auto-providers`
  picks up AWS_* env vars from local credentials.

- `crates/openshell-providers/src/providers/mod.rs`: register the module.

- `crates/openshell-providers/src/lib.rs`: register the SPEC in the
  default registry alongside the other providers.

- `crates/openshell-providers/src/profiles.rs`: include the new YAML in
  `BUILT_IN_PROFILE_YAMLS`.

What this PR explicitly does NOT add (intentionally separated for
review-size reasons; will follow up):

- A SigV4 signer in `openshell-router`. The current change simply
  declares the protocol; a follow-up PR adds outbound SigV4 signing
  using the `aws-sigv4` crate and a new `auth_style: sigv4` validator
  branch in profiles.rs. Operators who don't need SigV4 (e.g. an
  in-cluster bridge that ignores it and authenticates separately to
  the upstream) can use this PR today.

- Body translation between Bedrock InvokeModel shape and other
  inference shapes. The router treats Bedrock requests as opaque
  pass-through; if the operator's upstream is real AWS Bedrock it
  speaks Bedrock natively, if it's a translating bridge the bridge
  does any conversion server-side.

- `BEDROCK_BASE_URL` placeholder substitution in the YAML loader.
  Today the YAML's `host` is a literal default; operators override
  with the config-key the same way `ANTHROPIC_BASE_URL` works.

Tested: `cargo test -p openshell-providers` (35 tests green) and
`cargo test -p openshell-sandbox --lib l7::inference` (40 tests green
including the seven new aws_bedrock cases from the previous commit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Addresses johntmyers's review on NVIDIA#1704: net-new
providers should land via the v2 YAML profile only and should NOT
require changes to the legacy `ProviderDiscoverySpec` registry.

- Delete `crates/openshell-providers/src/providers/aws_bedrock.rs`
  (the legacy SPEC + `test_discovers_env_credential!` invocation).
- Drop `pub mod aws_bedrock;` from `crates/openshell-providers/src/providers/mod.rs`.
- Drop `registry.register(providers::aws_bedrock::SPEC)` from
  `crates/openshell-providers/src/lib.rs`.

Kept:

- `providers/aws-bedrock.yaml` and the `include_str!` in
  `BUILT_IN_PROFILE_YAMLS` (`profiles.rs`) — the v2 path.
  `discover_from_profile()` (`crates/openshell-providers/src/discovery.rs`)
  picks up AWS_* env vars via `discovery.credentials` in the YAML.
- L7 router patterns in `crates/openshell-sandbox/src/l7/inference.rs`
  — orthogonal to the provider registry.

The discovery test in the deleted file goes with it; v2 doesn't have
an established per-provider env-var-pickup unit test pattern, and
other YAML-only registrations (none today, but this is the new
direction) won't carry one either.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
…le assertion

Two fixes from johntmyers's gator-agent re-check on NVIDIA#1704:

1. `providers/aws-bedrock.yaml`: add `aws_session_token` to
   `discovery.credentials`. The credential is declared in the profile
   but was missing from the discovery scan list, so Providers v2
   `--from-existing` would silently drop temporary AWS credentials
   (STS / IRSA scenarios).

2. `crates/openshell-server/src/grpc/provider.rs`: update the static
   `list_provider_profiles_returns_built_in_profile_categories`
   assertion to include `aws-bedrock` at alphabetical position 0.
   Adding `providers/aws-bedrock.yaml` to BUILT_IN_PROFILE_YAMLS made
   the prior `["claude-code", "github", "nvidia"]` expectation stale.

Remaining blockers from the same review (deferred to follow-up
commits): `inference::profile_for` registration for aws-bedrock,
user-facing provider + inference-routing docs, and an
`upsert_cluster_inference_route` integration test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Addresses johntmyers's blocking review feedback on PR NVIDIA#1704:
"aws-bedrock still is not wired into the managed inference.local route
registry. profile_for only registers openai, anthropic, and nvidia, so
inference set --provider <aws-bedrock-provider> will reject this
provider before the new sandbox L7 patterns can be used."

Approach: register aws-bedrock as a *bridge-fronted* upstream — the
router does not inject any auth header on outbound requests; the
configured BEDROCK_BASE_URL is expected to point at a translating
bridge / Bedrock-compatible proxy that handles auth in its own pod.
This is the shape the L7 patterns commit (8b30211) and the YAML
profile (6b51e1a) were designed for. SigV4 signing for direct AWS
Bedrock is a separate follow-up; see PR thread.

Changes:

- core::inference::AuthHeader: add `None` variant for upstreams that
  authenticate themselves.
- core::inference: add AWS_BEDROCK_PROFILE static + register in
  profile_for. Default base URL is bedrock-runtime.us-east-1, override
  via BEDROCK_BASE_URL config-key (mirrors ANTHROPIC_BASE_URL pattern).
  Empty credential_key_names + auth: None means no router-side
  credential lookup at route time.
- router::backend: handle AuthHeader::None as a no-op (skip auth
  injection).
- server::inference::resolve_provider_route: gate find_provider_api_key
  on auth != None. aws-bedrock providers with empty credentials now
  resolve cleanly. Updated the unsupported-type error message to
  include aws-bedrock in the supported list.
- server::inference tests: add positive
  upsert_cluster_route_succeeds_for_aws_bedrock_without_api_key test
  covering the new code path end-to-end (provider with empty creds +
  BEDROCK_BASE_URL config → upsert succeeds → resolved route has
  empty api_key + provider_type aws-bedrock + bridge URL).
- core::inference tests: profile_for_known_types covers aws-bedrock,
  case-insensitive lookup, plus three new aws-bedrock-specific tests
  (auth: None, no credential keys, bedrock-specific protocols).
- docs/sandboxes/inference-routing.mdx: header forwarding row
  mentions aws-bedrock has no passthrough headers; new tabs in
  Supported API Patterns (InvokeModel + InvokeModelWithResponseStream)
  and Create a Provider (with the bridge-fronted shape note + SigV4
  deferral).
- docs/sandboxes/manage-providers.mdx: new row in Supported Provider
  Types table; new row in Supported Inference Providers table.
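
To illustrate the `AuthHeader::None` no-op described above, here is a hedged sketch. Only the `None` variant is named in this commit; the `Bearer` and `Custom` variants, both function names, and the header-list return type are assumptions made for the example.

```rust
/// Sketch of the auth-style enum. `Bearer` and `Custom` are assumed variants;
/// `None` is the variant this commit adds for self-authenticating upstreams.
#[derive(Debug, Clone, PartialEq)]
pub enum AuthHeader {
    Bearer,
    Custom(&'static str),
    None,
}

/// Headers the router would add to an outbound request for a given auth style.
pub fn auth_headers(auth: &AuthHeader, api_key: &str) -> Vec<(String, String)> {
    match auth {
        AuthHeader::Bearer => vec![("authorization".to_string(), format!("Bearer {api_key}"))],
        AuthHeader::Custom(name) => vec![((*name).to_string(), api_key.to_string())],
        // Bridge-fronted upstreams authenticate themselves: inject nothing.
        AuthHeader::None => Vec::new(),
    }
}

/// Route resolution only needs to look up an API key when auth is not `None`.
pub fn needs_api_key(auth: &AuthHeader) -> bool {
    !matches!(auth, AuthHeader::None)
}
```

Gating the credential lookup on `needs_api_key` is what lets a provider with no usable credentials resolve cleanly while other provider types keep their existing behavior.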

Verification (in dev container):
- cargo check -p openshell-core -p openshell-router -p openshell-server: clean
- cargo test -p openshell-core --lib inference: 14/14 pass (incl. 3 new)
- cargo test -p openshell-server --lib inference::tests::upsert: 6/6 pass
  (incl. new aws-bedrock test)
- cargo fmt --check: clean
- cargo clippy --all-targets -D warnings: clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Addresses four findings from gator-agent's NVIDIA#1704 re-check on 4ab587f:

- **Item 5** (YAML collects unused AWS creds): mark all four AWS
  credentials `required: false` and clear `discovery.credentials`.
  Bridge-fronted routing intentionally does not consume AWS
  credentials, so `--from-existing` no longer scans for them. The
  credentials remain in the schema (not deleted) so the SigV4
  follow-up can flip them back without a schema migration. Added a
  multi-line description that names the bridge-fronted shape and the
  SigV4 deferral so readers don't have to cross-reference the PR
  thread.

- **Item 3** (docs show command that the CLI rejects): rewrite the
  Create-a-Provider example for AWS Bedrock to use the actual
  required shape — placeholder `--credential AWS_ACCESS_KEY_ID=
  unused-bridge-fronted-shape` plus the `--config BEDROCK_BASE_URL`.
  The placeholder satisfies the gRPC handler's
  `provider.credentials.is_empty()` rejection without expanding
  server-side validation; the router ignores it on the outbound path
  because `auth: AuthHeader::None` skips header injection. Operators
  see a clearly-labeled placeholder in `provider get` output.

- **Item 1** (validator probe): document `--no-verify` as required
  for `openshell inference set --provider <aws-bedrock>` since the
  default validation probe doesn't recognize the
  `aws_bedrock_invoke` / `aws_bedrock_invoke_stream` protocols. Doc
  now shows the full `provider create` + `inference set --no-verify`
  flow with rationale for both decisions inline.

- **Item 6** (docs polish): `inference-routing.mdx` summary row now
  lists AWS Bedrock alongside NVIDIA, Anthropic, Vertex AI, and
  OpenAI-compatible providers, with the bridge-fronted caveat
  inline.

Test additions in `crates/openshell-server/src/inference.rs`:

- Renamed the existing aws-bedrock test from
  `..._without_api_key` to `..._with_bridge_url` and updated it to
  use a placeholder credential (mirroring the doc-recommended
  pattern operators will copy-paste). The `auth: None` path still
  produces an empty `api_key` on the resolved route — the test now
  documents that the credential is *stored* but not *used*.
- Added `upsert_cluster_route_rejects_aws_bedrock_without_bedrock_base_url`:
  the negative half of johntmyers' "successfully used by
  upsert_cluster_inference_route or intentionally rejected with a
  clear documented error" ask. With
  `default_base_url: ""` and no `BEDROCK_BASE_URL` config, route
  resolution returns `InvalidArgument` naming the missing base_url
  rather than silently forwarding prompts to AWS Bedrock with no
  usable auth.
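
The base-URL rule exercised by the negative test can be sketched like this. It is an illustration under assumptions: `resolve_base_url` is a hypothetical name, the config is modeled as a plain `HashMap`, and a `String` error stands in for the `InvalidArgument` status.

```rust
use std::collections::HashMap;

/// Picks the operator override when present and non-blank, else the profile
/// default. An empty result is an error rather than a silent fallback.
pub fn resolve_base_url(
    default_base_url: &str,
    config: &HashMap<String, String>,
    override_key: &str,
) -> Result<String, String> {
    let candidate = config
        .get(override_key)
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .unwrap_or(default_base_url.trim());
    if candidate.is_empty() {
        // Stand-in for the InvalidArgument error that names the missing base_url.
        return Err(format!("missing base_url: set the {override_key} config key"));
    }
    Ok(candidate.to_string())
}
```

With `default_base_url: ""` and no `BEDROCK_BASE_URL` entry, the function returns the error branch, which mirrors the rejection the new test asserts.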

Verification (in dev container):
- cargo test -p openshell-core --lib inference: 18/18 (incl. 3 new)
- cargo test -p openshell-server --lib inference::tests::upsert: 8/8
  (incl. 2 new aws-bedrock cases — positive + negative)
- cargo fmt --check: clean
- cargo clippy --all-targets -D warnings: clean

Item 2 (router-side enforcement of operator-configured Bedrock model
path, replacing the current verbatim path forwarding + body-only
model rewrite) is the remaining blocker and is genuinely separable —
it touches the L7 router with streaming-aware test coverage.
Deferring to its own commit so the security-critical change gets the
review attention it deserves.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Closes the security-blocking item from gator-agent's NVIDIA#1704 re-check
on 4ab587f: "Bedrock carries the model id in /model/{modelId}/invoke,
but the router currently forwards the caller's original path and only
rewrites JSON body model. That lets sandbox code choose a different
upstream model than the operator-configured route model, and may also
mutate native Bedrock request bodies incorrectly."

Two changes in `prepare_backend_request`:

1. **Path rewrite for Bedrock routes.** Before computing the upstream
   URL, parse the inbound path's `/model/<id>/invoke[-with-response-stream]`
   shape and substitute the operator-configured `route.model` for the
   caller-supplied model segment. Sandbox code that hardcodes a
   different model still works (we don't reject on mismatch), but the
   operator's configured model is what reaches the upstream / bridge.
   If the inbound path is somehow not a recognized Bedrock shape on a
   Bedrock route (the L7 pattern detector upstream of the router
   should never produce this combination), reject with
   RouterError::Internal naming the offending path rather than
   forwarding verbatim.

2. **Skip body-model injection for Bedrock routes.** The existing body
   rewriter unconditionally inserts `route.model` into the JSON body
   for non-Vertex routes. AWS Bedrock InvokeModel encodes the model
   in the URL path; the body is the raw provider-specific payload
   (Anthropic Messages for Claude, Mistral payload for Mistral, etc.)
   and must not be mutated. The branch ordering is now:
   needs_vertex_anthropic_version → strip body model + inject
   anthropic_version; route_is_bedrock → leave body alone; else →
   inject route.model (existing default).

New helpers, all in `crates/openshell-router/src/backend.rs`:

- `route_is_bedrock(route)` — true when route.protocols contains
  aws_bedrock_invoke or aws_bedrock_invoke_stream.
- `parse_bedrock_invocation_path(path)` — returns
  Some((model_id, "/invoke" | "/invoke-with-response-stream")) for
  paths matching the recognized Bedrock shapes. Strips query strings.
  Rejects empty model ids and multi-segment ids (defense-in-depth
  matching the L7 pattern detector's existing guards).
- `rewrite_bedrock_path(route, path)` — returns the path with the
  caller's model segment replaced by route.model.

Test coverage in the same file (9 new tests):

- parse_bedrock_invocation_path: positive cases for both invoke
  variants, query-string stripping; negative cases for empty model id,
  multi-segment id, unknown action, wrong prefix, missing slash.
- route_is_bedrock: matches both protocol variants singly and
  combined; rejects openai_chat_completions.
- rewrite_bedrock_path: substitutes operator model on both invoke
  variants; returns None for non-Bedrock paths.
- bedrock_route_rewrites_model_in_path_and_preserves_body
  (wiremock end-to-end): caller sends /model/some-other-model/invoke
  with a body containing model: "caller-supplied-model-name". Mock
  asserts the upstream receives /model/<operator-model>/invoke and the
  body's model field is the caller's value (NOT route.model) — proves
  both the path rewrite and the body preservation.
- bedrock_route_streaming_rewrites_model_in_path: same contract for
  invoke-with-response-stream.
- bedrock_route_rejects_non_bedrock_path: defense-in-depth coverage of
  the Internal-error path when a Bedrock route receives a path that
  doesn't match Bedrock shape.

Verification (in dev container):
- cargo test -p openshell-router --lib: 53/53 (incl. 9 new)
- cargo fmt --check: clean
- cargo clippy -p openshell-core -p openshell-router -p openshell-server
  --all-targets -- -D warnings: clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Closes the buffered-vs-streaming framing warning from gator-agent's
re-check on 4ab587f: "Bedrock InvokeModel should be buffered while
InvokeModelWithResponseStream is streaming. Please add framing/coverage
so /model/{id}/invoke cannot be corrupted by the streaming proxy's
truncation/error-frame behavior."

The InferenceApiPattern struct gained a `framing: ResponseFraming`
field upstream after the original Bedrock-patterns commit (#22b78cff)
landed; the cherry-pick onto current upstream/main left the two
Bedrock entries without the new field. Fixed here:

- aws_bedrock_invoke (POST /model/{id}/invoke):
    framing = ResponseFraming::Buffered
  InvokeModel returns one JSON object the caller decodes whole. Sending
  it through the streaming proxy would risk a mid-body size-cap
  truncation or idle-timeout failure appending an SSE error event onto
  bytes the caller decodes as one JSON body — the same corruption mode
  that drove the existing embeddings + model-discovery to Buffered.
- aws_bedrock_invoke_stream (POST /model/{id}/invoke-with-response-stream):
    framing = ResponseFraming::Streaming
  InvokeModelWithResponseStream returns an AWS event-stream of binary
  chunks; the caller wants chunks incrementally, so the streaming proxy
  path is correct.
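
As a rough picture of the two entries after this fix, assuming a simplified pattern struct: the `framing: ResponseFraming` field and the protocol names come from this commit, while the other field names and both function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResponseFraming {
    /// The whole response is collected before it is returned to the caller.
    Buffered,
    /// Chunks are forwarded to the caller as they arrive.
    Streaming,
}

/// Simplified pattern entry; only `framing` is a field named in the commit.
pub struct InferenceApiPattern {
    pub method: &'static str,
    pub path_glob: &'static str,
    pub protocol: &'static str,
    pub framing: ResponseFraming,
}

pub fn bedrock_patterns() -> [InferenceApiPattern; 2] {
    [
        InferenceApiPattern {
            method: "POST",
            path_glob: "/model/*/invoke",
            protocol: "aws_bedrock_invoke",
            // One JSON object decoded whole, so it must not be streamed.
            framing: ResponseFraming::Buffered,
        },
        InferenceApiPattern {
            method: "POST",
            path_glob: "/model/*/invoke-with-response-stream",
            protocol: "aws_bedrock_invoke_stream",
            // AWS event-stream chunks are consumed incrementally.
            framing: ResponseFraming::Streaming,
        },
    ]
}

/// Looks up the framing for a protocol name, if any pattern declares it.
pub fn framing_for(protocol: &str) -> Option<ResponseFraming> {
    bedrock_patterns()
        .iter()
        .find(|p| p.protocol == protocol)
        .map(|p| p.framing)
}
```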

Two new tests in `crates/openshell-sandbox/src/l7/inference.rs` pin
down the contract:

- aws_bedrock_invoke_is_buffered — detect_inference_pattern returns a
  Buffered pattern for /model/<id>/invoke, with explanatory message
  naming the corruption mode being prevented.
- aws_bedrock_invoke_stream_is_streaming — same shape, asserting
  Streaming for /model/<id>/invoke-with-response-stream.

Verification (in dev container):
- cargo check -p openshell-sandbox: clean (was failing on missing
  `framing` field before this commit)
- cargo test -p openshell-sandbox --lib l7::inference::tests::aws_bedrock:
  7/7 (incl. 2 new framing tests)
- cargo fmt --check: clean
- cargo clippy --all-targets -- -D warnings: clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
…rrors land

Per PR NVIDIA#1704 review (johntmyers): defer Bedrock streaming to a follow-up
that also wires protocol-aware error framing for the AWS event-stream
shape. Until then, surfacing `/model/{id}/invoke-with-response-stream`
risks shipping responses the sandbox cannot interpret on failure.

- AWS_BEDROCK_PROTOCOLS no longer advertises `aws_bedrock_invoke_stream`.
- The L7 inference pattern table drops the streaming entry; only
  `aws_bedrock_invoke` (buffered) is recognized.
- Test `aws_bedrock_invoke_stream_pattern_is_deferred` asserts no
  pattern claims that protocol so the gap is visible.

Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
…serve query

Per PR NVIDIA#1704 review (johntmyers, gator agent): the operator-configured
Bedrock model_id flows verbatim into the upstream URL path; the previous
plumbing left no enforcement that the value was a single benign path
segment, opening a path-injection vector.

Defense in depth, both layers:

* `openshell-server::inference`: new `validate_aws_bedrock_model_id`
  rejects empty, leading/trailing whitespace, `/`, `\\`, `?`, `#`, `%`,
  `..`, control/whitespace characters. Wired into `resolve_provider_route`
  ahead of base_url resolution so the route store cannot persist a
  malformed model_id. Mirrors `validate_vertex_model_id` exactly.

* `openshell-router::backend`: `rewrite_bedrock_path` now refuses to
  construct the upstream URL unless `route.model` passes
  `is_valid_bedrock_model_id`, so even a stale or hand-edited route
  cannot reach the wire. The parser also drops the
  `/invoke-with-response-stream` arm to match the protocol catalog.

* `parse_bedrock_invocation_path` returns the `?`-prefixed query tail
  as a third element; `rewrite_bedrock_path` re-attaches it so any
  caller-supplied query string is preserved through the model rewrite.
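
A hedged sketch of the router-side half of this change: validate `route.model`, then rewrite the path while re-attaching the query tail. `rewrite_with_query` is a hypothetical name, the `Option` return stands in for the router's real error type, and the character rules follow the list in this commit message rather than the actual source.

```rust
/// Accepts only a single benign path segment: no separators, query or fragment
/// markers, percent signs, `..`, whitespace, or control characters.
pub fn is_valid_bedrock_model_id(model_id: &str) -> bool {
    !model_id.is_empty()
        && !model_id.contains("..")
        && !model_id.chars().any(|c| {
            matches!(c, '/' | '\\' | '?' | '#' | '%') || c.is_whitespace() || c.is_control()
        })
}

/// Rewrites `/model/<caller>/invoke[?query]` to use `route_model`, keeping the
/// caller's query tail. Returns `None` when the model id or the path is unsafe.
pub fn rewrite_with_query(route_model: &str, path: &str) -> Option<String> {
    if !is_valid_bedrock_model_id(route_model) {
        return None;
    }
    // `query_tail` keeps its leading '?', or is empty when there is no query.
    let (path_only, query_tail) = match path.find('?') {
        Some(idx) => path.split_at(idx),
        None => (path, ""),
    };
    let caller_model = path_only.strip_prefix("/model/")?.strip_suffix("/invoke")?;
    if caller_model.is_empty() || caller_model.contains('/') {
        return None;
    }
    Some(format!("/model/{route_model}/invoke{query_tail}"))
}
```

Checking `route_model` before any string formatting means a stale or hand-edited route fails closed instead of producing an upstream URL with extra segments.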

Tests: 5 unit tests for the validator, 1 integration test that
exercises every unsafe-model_id reject path through
`upsert_cluster_inference_route`, plus a router-side rewrite-rejects
test covering 11 unsafe `route.model` values. All 72 server inference
tests + router tests pass.

Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
…docs

Per PR NVIDIA#1704 review (johntmyers): a single profile that is "usable as
is" — not a profile that auto-grants direct AWS Bedrock egress to any
sandbox that selects it. The bridge IS the egress point and is
operator-managed; the profile must not implicitly punch a hole through
the cluster's network policy on its behalf.

* `providers/aws-bedrock.yaml`: clear `endpoints` and `binaries` to
  empty arrays. Rewrite the description to spell out that the profile
  is intentionally non-egress-granting, that operators are responsible
  for declaring their bridge's egress endpoint and binary attribution,
  and that the SigV4 follow-up will repopulate these fields once
  router-side signing exists.

* `docs/sandboxes/inference-routing.mdx`:
  - Drop the `InvokeModelWithResponseStream` row from the supported
    patterns table (matches the protocol-catalog + L7-pattern drop).
  - Update the `--no-verify` paragraph to reference only
    `aws_bedrock_invoke`.
  - Generalise the placeholder-credential rationale: any
    standalone-router profile registering `AuthHeader::None` will hit
    the same non-empty-credentials structural requirement.

Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
CI's rustfmt rejected the one-line `let (path_only, query_tail) = path.find('?').map_or(...)`
shape from commit c51160e and required the chained-method layout
instead. Functional behaviour is unchanged; tests still pass.

Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
`RouterError` is already imported at the top of the file, so the
two test-side `matches!(result, Err(crate::RouterError::UpstreamProtocol(_)))`
uses trip clippy's `unused_qualifications` under `-D warnings`. Drop
the prefix on both sites; functional behaviour is unchanged.

These warnings predate this PR (originated in 25abc9e on 2026-06-07)
but NVIDIA's `rust:lint` re-runs because this PR touches
`backend.rs`, so the lint regression surfaces here.

Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
@johntmyers

Collaborator

gator-agent

Re-check After Maintainer Update

I re-evaluated latest head 36c6c20508e91c8fd4415dbb3fa1c1a875650a67 after @johntmyers's June 16, 2026 maintainer comment accepting the AWS credential schema as-is with a follow-up and asking for the provider-create coverage to be filed as a follow-up.

Disposition: partially resolved. The remaining gator review feedback is waived or downgraded by maintainer decision, so those items no longer block gator review for this PR. The current blocker is mergeability: GitHub REST reports mergeable_state=dirty for this head.

Remaining items:

  • Blocking/process: the branch has merge conflicts against main and must be rebased or merged again before CI can be re-authorized and the PR can return to approval readiness.
  • Follow-up: file the provider-create/gRPC path coverage item separately as requested by the maintainer.
  • Follow-up: keep the AWS credential schema/SigV4/direct-Bedrock work in the planned follow-up scope accepted by the maintainer.

Next action: @st-gr, please rebase or merge main into feat/aws-bedrock-provider and resolve the conflicts. After the branch is mergeable again, gator will re-check the head, authorize CI if no new blockers appear, and continue from approval readiness.

Next state: gator:blocked

@johntmyers johntmyers added gator:blocked Gator is blocked by process or repository gates and removed gator:in-review Gator is reviewing or awaiting PR review feedback labels Jun 16, 2026
`crates/openshell-server/src/lib.rs` declares `(shutdown_tx, shutdown_rx)`
twice in `run_server` — the first pair (introduced by upstream NVIDIA#1577
reconciler-lease work) only consumes `shutdown_rx`, leaving the first
`shutdown_tx` unused. Clippy's `-D warnings` flags it on workspace
lint. Rebasing this Bedrock-scoped PR onto current upstream/main
exposes the regression because we re-touch the file and the lint
re-runs.

Prefix the unused half with an underscore so the workspace lint is
clean. Behaviour is unchanged — only the second shutdown_tx (line
~425) ever sends a shutdown signal today.
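
One detail worth illustrating is why an underscore-prefixed name is a safe fix where a bare `_` would not be. The sketch below uses `std::sync::mpsc` as a stand-in (the channel type used in `run_server` is not shown in this thread, so that choice is an assumption), and both function names are hypothetical.

```rust
use std::sync::mpsc::{channel, TryRecvError};

/// `_shutdown_tx` is still a real binding: the sender lives until the end of
/// the scope, so the receiver sees an open but empty channel.
pub fn receiver_state_with_named_placeholder() -> Result<(), TryRecvError> {
    let (_shutdown_tx, shutdown_rx) = channel::<()>();
    shutdown_rx.try_recv()
}

/// A bare `_` binds nothing: the sender is dropped at the end of the `let`
/// statement, so the receiver observes a disconnected channel.
pub fn receiver_state_with_bare_underscore() -> Result<(), TryRecvError> {
    let (_, shutdown_rx) = channel::<()>();
    shutdown_rx.try_recv()
}
```

The first function returns `Err(TryRecvError::Empty)` and the second returns `Err(TryRecvError::Disconnected)`, which is the behavioural difference that makes the prefixed name the conservative way to silence the lint.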

This is a drive-by upstream fix unrelated to the Bedrock provider
work; keeping it separate so it can be cherry-picked or reverted
independently of the Bedrock commits.

Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
@st-gr

st-gr commented Jun 17, 2026

Author

@johntmyers — done.

Rebase: rebased feat/aws-bedrock-provider onto current upstream/main (now at 36bb9e3e). Three new commits since the prior 36c6c205:

New HEAD: b1db5577. Diff vs upstream/main: same 9 Bedrock-scoped files plus the one-line lib.rs lint fix.

Verified inside rust:1.95-slim:

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets -- -D warnings
  • cargo test -p openshell-core -p openshell-router -p openshell-server -p openshell-supervisor-network --lib → 1,870/1,870 ✅

Follow-up issues, per your direction:

Ready for /ok to test b1db5577 and merge.

@johntmyers

Collaborator

gator-agent

Re-check After Author Update

I re-evaluated latest head b1db5577d0487f55004e755da4a31b7a847b3eb1 after @st-gr's June 17, 2026 comment saying the branch was rebased, the follow-up issues were filed, and the branch is ready for /ok to test.

Disposition: resolved for gator review. GitHub REST reports mergeable=true with mergeable_state=blocked, which is the normal review/check gate rather than a merge-conflict blocker. The required independent code-only review for the current diff found no blocking findings.

Remaining items:

Checks: DCO and the CI gate publisher are passing. OpenShell / Branch Checks, OpenShell / Helm Lint, and OpenShell / E2E are waiting for /ok to test on this head, so I am authorizing CI and moving gator back to pipeline watch.

Next state: gator:watch-pipeline

@johntmyers johntmyers added gator:watch-pipeline Gator is monitoring PR CI/CD status and removed gator:blocked Gator is blocked by process or repository gates labels Jun 18, 2026
@johntmyers

Collaborator

/ok to test b1db557

@johntmyers

Collaborator

gator-agent

Maintainer Approval Needed

Gator validation and PR monitoring are complete.

Validation: this PR remains project-valid as concentrated initial plumbing for bridge-fronted AWS Bedrock inference/provider support.
Review: no blocking gator review findings remain at head b1db5577d0487f55004e755da4a31b7a847b3eb1; follow-up #1940 tracks direct AWS Bedrock/SigV4 and non-bridge profile work, and follow-up #1941 tracks provider-create/gRPC path coverage.
Docs: updated for the new provider and inference-routing behavior.
Checks: OpenShell / Branch Checks, OpenShell / Helm Lint, and OpenShell / E2E are passing for the current head.
E2E: test:e2e is applied and the core E2E gate is passing; GPU and Kubernetes HA E2E are not required for this PR.

Human maintainer approval or merge decision is now required.

@johntmyers johntmyers added gator:approval-needed Gator completed review; maintainer approval needed and removed gator:watch-pipeline Gator is monitoring PR CI/CD status labels Jun 18, 2026

Labels

gator:approval-needed Gator completed review; maintainer approval needed test:e2e Requires end-to-end coverage

2 participants