Skip to content

Conversation

@FlorianDeconinck
Copy link
Contributor

@FlorianDeconinck FlorianDeconinck commented Jun 9, 2025

Description

Current bridge goes from OIR to SDFG unexpanded, then expand to a full SDFG. The bridge has accumulated tech debt, which we have been cleaning up, but barriers to new feature implementation remains.

We have been thinking about leveraging the Schedule Tree representation of DaCe for some time. The idea behind that representation is that we should have an easier time to do schedule related optimizations (e.g. loop re-ordering based on CPU/GPU target and (same-axis) loop merges).

This PR refactors the GT4Py - DaCe bridge to OIR -> TreeIR -> ScheduleTree -> SDFG -> codegen, details in the ADR.

Note, we might might amend a commit or two to fix/change the data layout to map the current loop structure, but no big changes are planned to go into this branch anymore. So we marked it as ready for review.

Requirements

  • All fixes and/or new features come with corresponding tests.
  • Important design decisions have been documented in the appropriate ADR inside the docs/development/ADRs/ folder.

romanc and others added 27 commits June 3, 2025 18:25
This still needs proper symbols (and thus fails upon call time).
These are translated to `tn.ForScope` and then get mapped to state machine
loops in the resulting SDFG.

The problems were twofold

1. Wrong bounds in oir to tree IR visitor
2. Memlet propagation missing support for subsets.Indices()

Includes a bit of cleanup when creating the `tn.ForScope`.
This is a WIP commit adding support for variable K offset reads. The
FieldAccess visitor has been extended to resolve variable K offsets and
dace had to be updated to do proper subset intersection (between ranges
and indices) in MemletDict and MemletSet (iirc these are new in the
stree-to-sdfg branch, so not being super polished makes sense).

I've been testing with `test_variable_offsets` from
`test_code_generation.py` and this test currently fails SDFG validation
with an out of bounds memlet.
Verbose the gtscript func usage in stencil definitions
… for symbols pre-defined.

Added temporary stencil to stencil definition + Field2D capacity
oir to tree: introduce ContextPushPop to reduce boilerplate
Add a simple while stencil to stencil_definitions
romanc added 3 commits July 16, 2025 14:08
We don't know how to properly "inline" typed temporaries into a Tasklet.
Let's make sure that DDE isn't trying to remove access nodes of thins it
can't handle later on when patching the Tasklet.
@romanc romanc marked this pull request as ready for review July 16, 2025 12:51
@romanc
Copy link
Contributor

romanc commented Jul 16, 2025

Alright - the issue with temporaries is squashed (and worked around where we can't control it). I think this is ready for show. I've tested with pyFV3 and the AI2 data and this validates all major stencils over there in orchestration as well as stencil mode.

@romanc

This comment was marked as resolved.

FlorianDeconinck and others added 2 commits July 16, 2025 16:26
I went over the code of this PR "one last time" and after getting some
distance I found a couple small things:

- dace_backend: get ranges only for defined axes
- oir->tasklet: simplify tasklet generation
- oir->tasklet: add note on volume computation for VariableK offsets
- oir->tasklet: add doc strings for memlet generation functions
- oir->tasklet: fix VariableKOffset memlets (missing offset)
- Various typos / language improvements in comments and ADRs
Copy link
Contributor

@romanc romanc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did "one last code review" resulting in the last commit c34ccc2.

@FlorianDeconinck have a quick look at the changes to memlet range computation when adding fields to the wrapper sdfg (in dace_backend).

Also, I figured, we missed to offset the i/j axes for nodes with VariableKOffset. The K-axis was okay (we add the offset when we build the FieldAccess in the code generation of the Tasklet. This might help with problems that we have seen in physics codes.

@romanc
Copy link
Contributor

romanc commented Jul 25, 2025

cscs-ci run

1 similar comment
@romanc
Copy link
Contributor

romanc commented Jul 25, 2025

cscs-ci run

Copy link
Contributor

@twicki twicki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great first step!

@romanc
Copy link
Contributor

romanc commented Jul 28, 2025

cscs-ci run

2 similar comments
@romanc
Copy link
Contributor

romanc commented Jul 28, 2025

cscs-ci run

@romanc
Copy link
Contributor

romanc commented Jul 28, 2025

cscs-ci run

@romanc romanc merged commit 68eea74 into GridTools:main Jul 28, 2025
31 checks passed
romanc added a commit to NOAA-GFDL/NDSL that referenced this pull request Jul 29, 2025
PR GridTools/gt4py#2067 in GT4Py fundamentally changed how the `dace:*` backends behave. In that PR, we changed the strategy to make use of an upcoming DaCe feature called "Schedule Tree", which we will use for optimization purposes. This new "bridge" between GT4Py and DaCe, allows for a much cleaner design where both packages handle nothing more than what they need to. A drop in performance is to be expected (especially on CPU) as we have deactivated local caching for now. But the very next task is to re-use this new platform to allow for much more improved and aggressive merging capacities, local caching and hardware-driven tiling.

This PR updates GT4Py to a version that includes the above mentioned "Schedule tree bridge" and removes NDSL-level optimizations in the orchestration pipeline. As said, this will come with a temporary dip in performance, which we plan to restore with upcoming pull requests.

In addition to updating the GT4Py and DaCe versions, we include the following two changes in this PR

1. Expose compiler optimization level as `GT4PY_COMPILE_OPT_LEVEL`, defaults to `3` as before.
2. Minor change in import style in `ndsl/dsl/dace/orchestration.py`.

---------

Co-authored-by: Florian Deconinck <deconinck.florian@gmail.com>
Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
stubbiali pushed a commit to stubbiali/gt4py that referenced this pull request Aug 19, 2025
## Description

In the latest gt4py/dace bridge refactor (on the cartesian side)[^1], we
slipped in a bunch of type enhancements. This PR pull out what is
independent of the bridge refactor. In particular this PR brings

- generically typed base registry
- typed frontend and backend registries
- typed `BackendCodegen` constructors
- new-style type hints in touched files

## Requirements

- [x] All fixes and/or new features come with corresponding tests.
  Covered by existing tests.
- [ ] Important design decisions have been documented in the appropriate
ADR inside the [docs/development/ADRs/](docs/development/ADRs/README.md)
folder.
  N/A

[^1]: see GridTools#2067 if you are interested
stubbiali pushed a commit to stubbiali/gt4py that referenced this pull request Aug 19, 2025
…orkflow (GridTools#2098)

## Description

Refactors and cleanups from PR GridTools#2067, because that PR is way too large
and we shouldn't bury refactors in there.

## Requirements

- [x] All fixes and/or new features come with corresponding tests.
  New tests are added with this PR.
- [ ] Important design decisions have been documented in the appropriate
ADR inside the [docs/development/ADRs/](docs/development/ADRs/README.md)
folder.
  N/A

---------

Co-authored-by: Enrique González Paredes <enriqueg@cscs.ch>
stubbiali pushed a commit to stubbiali/gt4py that referenced this pull request Aug 19, 2025
…end (GridTools#2103)

## Description

This PR brings another two cleanup commits:

1. `gtir` doesn't need to be passed around when calling `BackendCodegen`
classes. That info can be derived from the backend (class), which is a
member of `BackendCodegen`.
2. Use type hints in `dace_backend` (only for the functions that we
aren't about to torch anyway with PR GridTools#2067)

## Requirements

- [ ] All fixes and/or new features come with corresponding tests.
  Covered by existing tests.
- [ ] Important design decisions have been documented in the appropriate
ADR inside the [docs/development/ADRs/](docs/development/ADRs/README.md)
folder.
  N/A
jjuyeonkim pushed a commit to jjuyeonkim/NDSL that referenced this pull request Sep 8, 2025
PR GridTools/gt4py#2067 in GT4Py fundamentally changed how the `dace:*` backends behave. In that PR, we changed the strategy to make use of an upcoming DaCe feature called "Schedule Tree", which we will use for optimization purposes. This new "bridge" between GT4Py and DaCe, allows for a much cleaner design where both packages handle nothing more than what they need to. A drop in performance is to be expected (especially on CPU) as we have deactivated local caching for now. But the very next task is to re-use this new platform to allow for much more improved and aggressive merging capacities, local caching and hardware-driven tiling.

This PR updates GT4Py to a version that includes the above mentioned "Schedule tree bridge" and removes NDSL-level optimizations in the orchestration pipeline. As said, this will come with a temporary dip in performance, which we plan to restore with upcoming pull requests.

In addition to updating the GT4Py and DaCe versions, we include the following two changes in this PR

1. Expose compiler optimization level as `GT4PY_COMPILE_OPT_LEVEL`, defaults to `3` as before.
2. Minor change in import style in `ndsl/dsl/dace/orchestration.py`.

---------

Co-authored-by: Florian Deconinck <deconinck.florian@gmail.com>
Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
FlorianDeconinck pushed a commit that referenced this pull request Sep 9, 2025
## Description

This is a small follow-up from when we moved to the schedule tree in PR
#2067. Looks like we left some
test utils. `mypy` isn't complaining about the undefined import paths
because we skip it for all of tests and python isn't complaining because
the code never runs.

## Requirements

- [ ] All fixes and/or new features come with corresponding tests.
  N/A
- [ ] Important design decisions have been documented in the appropriate
ADR inside the [docs/development/ADRs/](docs/development/ADRs/README.md)
folder.
  N/A

Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cartesian: dace backends missing support for casting in variable k offsets

6 participants