Align 'linalg-to-xegpu' pass with patched XeGPU dialect #201
kurapov-peter merged 13 commits into intel:main
- loc, vecLoadType, tile, vnniAxisAttr, transpose,
+ loc, vecLoadType, tile, packedAttr, transpose, transpose_bit,
vnniAxis -> packedAttr: instead of a VNNI axis (0, 1), specify a "packed" attribute that is equivalent to vnni_axis=0.
transpose_bit: allows transposing data while loading. It isn't used by this lowering pass.
// Load A sub-tiles.
SmallVector<Value> loadVecA =
loadNdDescTiles(rewriter, loc, tilesA, readCacheHint, vnniConfA);
vnniConfA can't be used during loading since vnniAxis=1 is no longer supported. However, we still need this config to compute proper tiles for xegpu.dpas later in the code.
Signed-off-by: dchigarev <dmitry.chigarev@intel.com>
// Create output initial value load tiles.
// CHECK: %[[rootC:.+]] = xegpu.create_nd_tdesc %[[C]]
- // CHECK: %[[tC:.+]] = xegpu.update_nd_offset %[[rootC]], [0, 0]
+ // CHECK: %[[tC:.+]] = xegpu.update_nd_offset %[[rootC]], [%c0, %c0]
IMEX doesn't support constant offsets (see intel/mlir-extensions#815), so the offsets are passed as SSA values (%c0) instead.
// Extract DPAS-sized chunks from larger loaded tile A.
// Tile B is already in the correct shape.
- // CHECK: %[[vA_flat:.+]] = vector.shape_cast %[[vA]] : vector<32x8x2xf16> to vector<512xf16>
+ // CHECK: %[[vA_flat:.+]] = vector.shape_cast %[[vA]] : vector<32x16xf16> to vector<512xf16>
We no longer load the A matrix with vnni_axis=1 (see packed_attr).
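The shape change in the CHECK line above can be sanity-checked with a small sketch. It assumes the usual VNNI convention that elements are packed into 32-bit groups, so the packing factor for f16 is 2; the helper names `vnniPackedShape` and `numElements` are hypothetical and are not part of the pass.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: shape of a 2D tile after VNNI packing along `axis`.
// Assumption: elements are packed into 32-bit groups, so the packing
// factor is 32 / elemBits (2 for f16).
std::vector<int64_t> vnniPackedShape(int64_t rows, int64_t cols, int axis,
                                     int elemBits) {
  if (elemBits <= 0 || 32 % elemBits != 0)
    throw std::invalid_argument("element width must divide 32");
  const int64_t factor = 32 / elemBits;
  if (axis == 0) {
    if (rows % factor != 0)
      throw std::invalid_argument("rows not divisible by packing factor");
    return {rows / factor, cols, factor};
  }
  if (axis == 1) {
    if (cols % factor != 0)
      throw std::invalid_argument("cols not divisible by packing factor");
    return {rows, cols / factor, factor};
  }
  throw std::invalid_argument("axis must be 0 or 1");
}

// Total number of elements in a shape.
int64_t numElements(const std::vector<int64_t> &shape) {
  int64_t n = 1;
  for (int64_t dim : shape)
    n *= dim;
  return n;
}
```

With these assumptions, a 32x16 f16 tile packed along axis 1 has shape 32x8x2, and both it and the plain 32x16 tile flatten to 512 elements, which is why only the source type of the `vector.shape_cast` changes in the test.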
The IMEX changes are merged in Menooker:dev.
# required functionality is merged.
- gc_fetch_content(imex 496b240093b5e132b60c5ee69878300fe69be300 https://github.com/Menooker/mlir-extensions
- SET IMEX_CHECK_LLVM_VERSION=ON IMEX_ENABLE_L0_RUNTIME=0
+ gc_fetch_content(imex d5bbd635dee500b8cff138686833bacfac5ade78 https://github.com/Menooker/mlir-extensions
Updated to the latest commit in the dev branch.
cl_platform_id platform; // OpenCL platform
cl_device_id device; // device ID
CL_SAFE_CALL(clGetPlatformIDs(1, &platform, NULL));
CL_SAFE_CALL(clGetDeviceIDs(platform, *devtype, 1, &device, NULL));
return device;
The old logic searched for a device of the requested type in only one platform (and couldn't find any GPU on my machine). I rewrote the logic to iterate over all available platforms and return the first suitable device.
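A minimal sketch of the search order described here, not the code from this PR. To stay compilable without the OpenCL headers it uses hypothetical stand-in types (`Platform`, `Device`, `DeviceType`) and a hypothetical `findFirstDevice` helper. In real code the outer loop would walk the array filled by `clGetPlatformIDs`, and the inner lookup would be a `clGetDeviceIDs` call that is allowed to fail with `CL_DEVICE_NOT_FOUND` for platforms without a matching device.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for cl_platform_id / cl_device_id so the sketch
// does not depend on the OpenCL headers.
enum class DeviceType { CPU, GPU };

struct Device {
  DeviceType type;
  std::string name;
};

struct Platform {
  std::string name;
  std::vector<Device> devices;
};

// Returns the first device of the requested type found on any platform,
// or std::nullopt if no platform exposes one. The old logic corresponds
// to looking only at platforms[0].
std::optional<Device> findFirstDevice(const std::vector<Platform> &platforms,
                                      DeviceType wanted) {
  for (const Platform &platform : platforms) {
    for (const Device &device : platform.devices) {
      if (device.type == wanted)
        return device;
    }
  }
  return std::nullopt;
}
```

On a machine where the first platform is CPU-only and a later platform exposes the GPU, this returns the GPU, whereas a lookup limited to the first platform would find nothing.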
Closes #192
This PR updates the linalg-to-xegpu pass to make it compatible with the xegpu-to-vc-func pass from IMEX. The PR also adds a simple e2e test for the linalg->xegpu->gpu exe pipeline.