Skip to content

[ExecuTorch][WebGPU] SymInt arithmetic ops (add/sub/mul/floordiv) for dynamic shapes#20573

Open
JulianCloudNTH wants to merge 3 commits into
gh/JulianCloudNTH/65/basefrom
gh/JulianCloudNTH/65/head
Open

[ExecuTorch][WebGPU] SymInt arithmetic ops (add/sub/mul/floordiv) for dynamic shapes#20573
JulianCloudNTH wants to merge 3 commits into
gh/JulianCloudNTH/65/basefrom
gh/JulianCloudNTH/65/head

Conversation

@JulianCloudNTH

@JulianCloudNTH JulianCloudNTH commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

Stack from ghstack (oldest at bottom):

Register scalar SymInt arithmetic so dynamic-shape graphs lower without an "unsupported op" failure.

Problem: A dynamic-shape exported program emits scalar SymInt arithmetic nodes (add/sub/mul/floordiv) to compute live sizes and positions (e.g. input_pos + S, seq_len // n). The WebGPU backend registered only et_vk.select_as_symint.default, so WebGPUGraph::build() threw unsupported op: add when loading any dynamic .pte.

Solution:

  • Before: only select_as_symint produced a live SymInt; any arithmetic on it was unsupported.
  • After: add/sub/mul/floordiv each recompute their output SymInt from the operands via a resize hook whenever a live operand changes.

Implementation:

  • register_sym_binary reads each operand (live SymInt via read_symint, else a static Int), seeds the build-time value, and registers a resize hook on any live operand (set_symint on recompute).
  • floordiv rounds toward negative infinity (Python semantics).
  • Mirrors Vulkan backends/vulkan/runtime/graph/ops/impl/SymIntOps.cpp (add/sub/floordiv/mul).
  • Registered under the bare targets add/sub/mul/floordiv — distinct registry keys from the tensor aten.add.Tensor/aten.mul.Tensor ops.

Constraints: An output that folded to a static Int is a no-op. No GPU kernel, no dispatch, no change to the static-shape path. sym_size.int is intentionally not added here (depends on the tensor-shape dim-source API in the following diff).

Co-authored-with: Claude Code.
@exported-using-ghexport

Differential Revision: D109906102

Differential Revision: D109906102

[ghstack-poisoned]
@pytorch-bot

pytorch-bot Bot commented Jun 28, 2026

Copy link
Copy Markdown

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20573

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 2 Pending

As of commit fe4f84f with merge base db5f1ba (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 28, 2026
@github-actions

Copy link
Copy Markdown

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@JulianCloudNTH

Copy link
Copy Markdown
Contributor Author

@claude review and check for any areas or opportunities for modularization

@claude

claude Bot commented Jun 29, 2026

Copy link
Copy Markdown

Claude finished @JulianCloudNTH's task in 2m 6s —— View job


Review: WebGPU SymInt arithmetic ops

  • Read changed file SelectAsSymint.cpp
  • Compare against Vulkan reference SymIntOps.cpp
  • Check WebGPUGraph API (read_symint/set_symint/add_resize_hook/propagate_resize)
  • Analyze correctness & modularization opportunities

Overall this is a clean, well-scoped change. The resize-hook approach is a genuine improvement over the Vulkan reference it mirrors, and the diff does exactly what the description claims. Notes below, ordered by importance.


Modularization (the main ask)

1. register_sym_binary is the right call — and better than Vulkan. The Vulkan reference (backends/vulkan/runtime/graph/ops/impl/SymIntOps.cpp:53-166) copy-pastes the full _impl + resize_*_node + registration boilerplate four times. Collapsing all four ops into one helper parameterized by a std::function (SelectAsSymint.cpp:51-74) is the more maintainable design. Good instinct.

2. File/directory naming no longer matches contents. This is the clearest modularization opportunity. The file now contains general SymInt arithmetic (add/sub/mul/floordiv), but lives in ops/select_as_symint/SelectAsSymint.cpp — a directory named after one specific op. The repo convention is strictly one directory per op/concern (ops/add/, ops/mul/, ops/select/, …). A reader looking for where floordiv is registered won't find it under a select_as_symint path.

Consider extracting the arithmetic ops into a sibling ops/sym_int/SymIntOps.cpp (mirroring the Vulkan filename), leaving select_as_symint to own only select_as_symint. The shared helper read_scalar would move with them. This keeps the grep-to-op mapping the codebase relies on. Fix this → (if you'd rather keep the diff small and land the stack first, this is fine as a fast-follow — just flagging the convention.)


Correctness

3. Division by zero in floordiv is UB (SelectAsSymint.cpp:90-91). x / y and x % y with y == 0 are undefined behavior. The Vulkan reference has the same gap, so this isn't a regression, and a floordiv node with a zero divisor shouldn't arise from a valid export — but a one-line guard (throw, or return 0) would make it fail-loud like the rest of this file (read_symint, set_symint both throw on misuse) instead of crashing.

4. INT_MIN % y / INT_MIN / -1 overflow — theoretical only, since these values are tensor sizes/positions. Not worth handling; noting for completeness.

5. Self-referential operands register the hook twice. If a == b (e.g. x + x), both if branches at :68-72 fire and recompute is registered twice on the same SymInt. Harmless (recompute is idempotent, and set_symint no-ops when the value is unchanged), just slightly redundant. An if (a_is_symint) add_hook(a); else if (b_is_symint)… won't work since both may legitimately be distinct live operands — a b != a guard on the second branch would be the precise fix if you want it.


Minor / style

6. read_scalar silently assumes non-SymInt ⇒ Int (:43-48). If an operand were a Double (or any non-Int constant), get_int reads an unrelated flat array and returns a wrong value rather than erroring. Scalar arithmetic operands should only ever be Int/SymInt, so this is safe in practice; a comment or a type assert would document the assumption.

7. Chained arithmetic relies on registration order. propagate_resize (WebGPUGraph.cpp:148-158) iterates resize_hooks_ once in registration order; a hook that recomputes out marks out dirty so a later-registered hook keyed on out still fires in the same pass. This works only because ops are registered in topological order during build(). That's a reasonable invariant, but it's load-bearing and currently undocumented — worth a one-line comment on add_resize_hook or propagate_resize so a future single-pass refactor doesn't silently break multi-level SymInt chains (a + b, then that * c).


Verification

I could not build/run the WebGPU backend in this environment, so the above is static analysis only. Worth confirming with an actual dynamic-shape .pte that exercises a chained expression (e.g. (input_pos + S) // n) end-to-end, which would validate both the floor rounding (:88-95) and the chaining behavior in note 7.
· branch gh/JulianCloudNTH/65/head

[ghstack-poisoned]
[ghstack-poisoned]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants