[WIP - NOT READY FOR REVIEW] Paged Attention: rocmlir-gen changes #2222
Open
justinrosner wants to merge 3 commits into 42-paged-attention-rocmlir from
Conversation
Pull request overview
This pull request adds paged attention support to rocmlir-gen, a code generation tool for MLIR-based ROCm kernels. Paged attention is an optimization technique that allows attention mechanisms to work with non-contiguous memory pages, improving memory efficiency for large language models.
Changes:
- Adds command-line options (`--paged-attention`, `--page-size`, `--num-pages`) to enable and configure paged attention mode
- Modifies attention kernel generation to use page tables (arrays of i64 pointers) instead of direct K/V tensor inputs
- Implements GPU kernel logic with `rock.deref` operations to dereference page tables and transform paged data into attention-compatible shapes
- Adds a CPU validation path that reconstructs regular K/V tensors from the paged cache for correctness verification (see the sketch after this list)
- Includes comprehensive test coverage with an MLIR test file and e2e test configurations
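To make the paged addressing concrete, here is a minimal CPU-side sketch of how a logical token index maps through a page table and how a contiguous K/V tensor can be rebuilt from the paged cache for validation. All names, the float element type, and the row-major page layout are illustrative assumptions, not rocmlir-gen's actual data structures:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical paged K/V cache: logically contiguous tokens live in
// non-contiguous physical pages, reached through a page table.
struct PagedCache {
  std::vector<const float *> pageTable; // physical base address per logical page
  int64_t pageSize;                     // tokens per page
  int64_t headDim;                      // elements per token row
};

// Split a logical token index into (page, offset) and dereference the
// page table to find that token's K or V row.
const float *tokenRow(const PagedCache &c, int64_t token) {
  int64_t page = token / c.pageSize;
  int64_t offset = token % c.pageSize;
  return c.pageTable[page] + offset * c.headDim;
}

// Validation-style reconstruction: gather the paged cache back into one
// contiguous tensor so results can be checked against a regular kernel.
std::vector<float> reconstructContiguous(const PagedCache &c, int64_t seqLen) {
  std::vector<float> out(static_cast<size_t>(seqLen * c.headDim));
  for (int64_t t = 0; t < seqLen; ++t) {
    const float *row = tokenRow(c, t);
    std::copy(row, row + c.headDim, out.begin() + t * c.headDim);
  }
  return out;
}
```

For example, with `pageSize = 16`, token 37 resolves to page 2, offset 5; the GPU path performs the same page-table dereference in-kernel via `rock.deref` views instead of copying.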
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| mlir/tools/rocmlir-gen/rocmlir-gen.cpp | Core implementation: adds paged attention command-line options, validation logic, GPU kernel generation with page table dereferencing and transforms, CPU validation with cache buffer management and shuffling, and host harness logic for page table population |
| mlir/test/rocmlir-gen/paged-attention-kernel.mlir | Comprehensive test file verifying paged attention kernel signature, rock.deref operations, transforms, and validation function with both single-head and GQA configurations |
| mlir/test/e2e/PrAttentionSchedule.toml | Adds e2e test case for paged attention with schedule version 2 |
| mlir/test/e2e/PrAttentionI8.toml | Adds e2e test case for paged attention with int8 quantization |
| mlir/test/e2e/PrAttentionF32.toml | Adds e2e test case for paged attention with f32 data type |
| mlir/test/e2e/PrAttentionF16.toml | Adds e2e test case for paged attention with f16 data type |
| mlir/test/e2e/PrAttentionDirectToLDS.toml | Adds e2e test case for paged attention with direct-to-LDS optimization |
| mlir/test/e2e/PrAttentionBF16.toml | Adds e2e test case for paged attention with bf16 data type |
| mlir/test/e2e/AttentionSchedule.toml | Adds e2e test case for paged attention with standard schedule |
| mlir/test/e2e/AttentionNonPowerOfTwoTileSize.toml | Adds e2e test case for paged attention with non-power-of-two tile sizes |
Motivation
This PR adds end-to-end testing infrastructure for paged attention in rocmlir-gen, enabling generation of both GPU kernels and CPU validation functions that properly handle paged K/V caches with shuffled page table addressing.
Implements: https://amd-hub.atlassian.net/browse/AIROCMLIR-439
Technical Details
New command line options:
- `--paged-attention`: enables paged attention mode
- `--page-size`: configures the size of each K/V cache page
- `--num-pages`: configures the number of pages in the K/V cache
Example Usage:
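What an invocation might look like; only `--paged-attention`, `--page-size`, and `--num-pages` are confirmed by this PR (with illustrative values), and the trailing placeholder stands in for whatever attention shape and type flags the tool already accepts:

```sh
# Hypothetical invocation sketch: the three paged-attention flags come from
# this PR; replace the placeholder with the usual rocmlir-gen attention options.
rocmlir-gen --paged-attention --page-size=16 --num-pages=64 <usual attention flags...>
```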
Key Changes:
- `rock.deref` ops to create virtual views of paged K/V data
- `rock.attention` with `keyAddresses`/`valueAddresses` pointing to deref outputs

Test Plan
Test Result
Submission Checklist