-
Notifications
You must be signed in to change notification settings - Fork 76
Drop codegen support of gather (but not takeAlongAxis) #5907
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
!test |
Description
|
| Relevant files | |||||||
|---|---|---|---|---|---|---|---|
| Enhancement |
| ||||||
| Tests |
|
PR Reviewer Guide
Here are some key observations to aid the review process:
| 🧪 PR contains tests |
| 🔒 No security concerns identified |
| ⚡ Recommended focus areas for review |
Gather validation logic
gather->exactSizes() method. Need to verify this method correctly identifies all cases where gather operations would have non-gathered indices with smaller output dimensions, as mentioned in the PR description. |
Test failures
-
(Medium, 6)
NVFuser internal assert (TensorIndexer isSupported) in PersistentBufferTest.BufferGatherLookupTv and ReductionTest.CrossEntropyGatherPatternTest Name A100 GB200 H100 Source PersistentBufferTest.BufferGatherLookupTv ❌ ❌ ❌ Link ReductionTest.CrossEntropyGatherPattern ❌ ❌ ❌ Link -
(Medium, 2)
Thunder nvFuser nanoGPT autograd returns zero scalar on CUDATest Name A100 GB200 Source thunder.tests.test_networks.test_nanogpt_complete_autograd_nvfuser_cuda_thunder.dtypes.float32 ❌ ❌
Gather allows non-gathered indices to have smaller output dimensions, which complicates indexing and is not yet supported by TensorIndexer. Note that takeAlongAxis, which is a limited case of gather, is supported.
One way to support it is to decompose it into a takeAlongAxis and slice. For now, this PR disables codegen of gather and delegates to ExprEval.