Task Fusion, Constant Conversion Optimization, and 27pt stencil benchmark #150
shivsundram wants to merge 55 commits into nv-legate:branch-24.03 from
This reverts commit b698b33.
…y into shiv1/op_fusion
for more information, see https://pre-commit.ci
Thanks for the PR. I'll probably split this into a couple of independent PRs instead of merging it as-is, and I want to move all fusion-related code to the core.
)
temp._thunk.convert(
    two._thunk, stacklevel=(stacklevel + 1)
)
The code above is a scalar constant optimization. It avoids dispatching CONVERT operations for a scalar constant, since the constant's value is embedded in the code and is therefore already known.
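To make the idea concrete, here is a minimal toy sketch of it in plain Python. It is not the cuNumeric implementation: `ToyRuntime`, `materialize_constant`, and the `optimize` flag are hypothetical names invented for illustration, and the "dispatch" is just a list append standing in for launching a task.

```python
# Toy model only: none of these names are cuNumeric or Legate APIs.


class ToyRuntime:
    """Records every operation that would be dispatched as a task."""

    def __init__(self):
        self.dispatched = []

    def dispatch(self, op_name):
        self.dispatched.append(op_name)


def materialize_constant(runtime, value, target_type, optimize):
    """Return `value` converted to `target_type`.

    With `optimize=True`, a Python scalar literal is cast on the host,
    because its value is already known when the expression is built.
    Otherwise a CONVERT operation is recorded as dispatched.
    """
    if optimize and isinstance(value, (bool, int, float)):
        # Value is known up front: cast directly, no CONVERT dispatched.
        return target_type(value)
    # Baseline path: the conversion is issued as a separate operation.
    runtime.dispatch("CONVERT")
    return target_type(value)


baseline = ToyRuntime()
materialize_constant(baseline, 2, float, optimize=False)

optimized = ToyRuntime()
materialize_constant(optimized, 2, float, optimize=True)

print(baseline.dispatched, optimized.dispatched)
```

In this sketch both paths produce the same value; the only difference is whether a CONVERT operation is recorded, which is the overhead the optimization is meant to remove.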
@magnatelee @shivsundram What is the status of this PR?
@marcinz Same here. Part of this PR should really be in the core, so I'll do the porting in the near future.
@marcinz @magnatelee Yeah, this is/was a working PR (with some nice speedup results here if interested), but Wonchan will be porting/merging this functionality into the core. This PR is pretty stale right now.
Also take this opportunity to clean up a naming inconsistency; NumPy types are "dtypes", core types are "types".
PR for implementing Legate Task Fusion.
This is the cuNumeric companion PR to the Core's Task Fusion PR nv-legate/legate#113
This PR contains 3 primary changes:
cunumeric/array.py: This removes the need to issue expensive "convert" ops for scalar constants embedded in the code.