Use cumsum from flox #10987

Illviljan · 2025-12-06T13:44:27Z

Closes "cumsum drops index coordinates" #6528?
Tests added
User visible changes (including notable bug fixes) are documented in whats-new.rst
New functions/methods are listed in api.rst

The non-flox version reduces chunk sizes significantly (here to chunksize=(2,) for a 5-element array):

import xarray as xr  # .chunk() requires dask

x = xr.DataArray([1, 1, 1, 1, 1], name="x").chunk()
grp_idx = xr.DataArray([-1, 0, 0, -1, 1])
with xr.set_options(use_flox=False):
    print(x.groupby(grp_idx).cumsum())
<xarray.DataArray 'x' (dim_0: 5)> Size: 40B
dask.array<getitem, shape=(5,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
Dimensions without coordinates: dim_0

With flox, the chunk size is retained (chunksize=(5,)):

x = xr.DataArray([1, 1, 1, 1, 1], name="x").chunk()
grp_idx = xr.DataArray([-1, 0, 0, -1, 1])
with xr.set_options(use_flox=True):
    print(x.groupby(grp_idx).cumsum())
<xarray.DataArray 'x' (dim_0: 5)> Size: 40B
dask.array<_finalize_scan, shape=(5,), dtype=int64, chunksize=(5,), chunktype=numpy.ndarray>
Dimensions without coordinates: dim_0
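
As a reference for the values both code paths should agree on, a grouped cumulative sum restarts the running total within each group while keeping the original element order. Below is a minimal pure-Python sketch of that behaviour. The helper `grouped_cumsum` is hypothetical and is not part of xarray or flox, and it assumes that -1 is an ordinary group label, as in the example above.

```python
from collections import defaultdict


def grouped_cumsum(values, labels):
    """Running sum per group label, returned in the original element order."""
    totals = defaultdict(int)  # running total for each label seen so far
    result = []
    for value, label in zip(values, labels):
        totals[label] += value
        result.append(totals[label])
    return result


# Same inputs as the example above (hypothetical helper, plain lists).
x = [1, 1, 1, 1, 1]
grp_idx = [-1, 0, 0, -1, 1]
print(grouped_cumsum(x, grp_idx))  # [1, 1, 2, 2, 1]
```

This only describes the expected values; it says nothing about chunking, which is the difference the two outputs above illustrate.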


dcherian · 2025-12-06T15:19:34Z

xarray/core/groupby.py

+
+        # return result
+
+        actual = apply_ufunc(


Yes, this is the way. Eventually I'd like the apply_ufunc call for reductions to live in Xarray too, so feel free to move that over if it helps. We could put it in flox_compat.py.


Illviljan · 2025-12-07T12:14:28Z

xarray/core/_aggregations.py

+        keep_attrs: bool | None = None,
+        **kwargs: Any,
+    ) -> Dataset:
+        raise NotImplementedError()


I've made these changes manually now.
I can't get pytest-accept to fix the docstrings in _aggregations.py correctly; for example, it doesn't indent them properly. I'm not sure if this is just a Windows 10 thing.

Illviljan · 2025-12-07T12:40:06Z

xarray/tests/test_groupby.py

+    else:
+        # TODO: Remove drop_vars when GH6528 is fixed
+        # when Dataset.cumsum propagates indexes, and the group variable?
+        assert_identical(expected.drop_vars(["x", "group_id"]), actual)


Keeping this until the minimum supported flox version is at least 0.10.5.
Coordinates and docstrings might now differ depending on whether flox is used, though. Is that OK?

Not OK, in my opinion. I think it might be fixed by simply propagating coordinates in the non-flox branch of the templated code. That might be easy.

Illviljan · 2025-12-07T12:45:58Z

Tests are passing now! But there are a lot of deactivated options left, and there was quite a bit of extra code in _flox_reduce and xarray_reduce. I'm not sure how much of it was there only to deal with reduction operations.

dcherian · 2025-12-07T14:47:49Z

xarray/core/groupby.py

+            wrapper,
+            obj,
+            *codes,
+            # input_core_dims=input_core_dims,


I think we don't need this because we just want the full array forwarded

dcherian · 2025-12-07T14:48:04Z

xarray/core/groupby.py

+            kwargs=dict(
+                func=func,
+                skipna=skipna,
+                expected_groups=None,


This should be the same as in _flox_reduce. It is an important optimization.

dcherian · 2025-12-07T14:48:17Z

xarray/core/groupby.py

+                dtype=None,
+                method=None,
+                engine=None,


These we should grab from kwargs and forward just like _flox_reduce
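
One way to read this suggestion: pull the flox-specific options out of the caller's kwargs, fall back to defaults when they are absent, and pass everything else through unchanged. A rough sketch of that pattern follows. The helper name `split_flox_options` and the `None` defaults are assumptions for illustration; this is not the code in _flox_reduce.

```python
from typing import Any

# Option names taken from the diff above; the None defaults are assumed.
FLOX_OPTION_DEFAULTS: dict[str, Any] = {"dtype": None, "method": None, "engine": None}


def split_flox_options(
    kwargs: dict[str, Any],
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Separate the flox options from the remaining keyword arguments."""
    remaining = dict(kwargs)  # copy so the caller's dict is not mutated
    options = {
        name: remaining.pop(name, default)
        for name, default in FLOX_OPTION_DEFAULTS.items()
    }
    return options, remaining


options, remaining = split_flox_options({"engine": "numpy", "skipna": True})
print(options)    # {'dtype': None, 'method': None, 'engine': 'numpy'}
print(remaining)  # {'skipna': True}
```

The extracted options would then be forwarded in the `kwargs=dict(...)` argument instead of the hard-coded `None` values shown in the diff.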

dcherian · 2025-12-07T14:48:27Z

xarray/core/groupby.py

+            # exclude_dims=set(dim_tuple),
+            # output_core_dims=[output_core_dims],
+            dask="allowed",
+            # dask_gufunc_kwargs=dict(


please delete.

dcherian · 2025-12-07T14:48:34Z

xarray/core/groupby.py

+            # for xarray's test_groupby_duplicate_coordinate_labels
+            # exclude_dims=set(dim_tuple),
+            # output_core_dims=[output_core_dims],


please delete

dcherian · 2025-12-07T15:03:22Z

xarray/tests/test_groupby.py

        ("cumprod", [1.0, 2.0, 6.0, 6.0, 2.0, 2.0]),
    ],
 )
 def test_resample_cumsum(method: str, expected_array: list[float]) -> None:


Let's not use flox for the resample case. It could break catastrophically if we don't support method="blockwise" or "cohorts" in flox.

dcherian · 2025-12-07T15:05:23Z

xarray/tests/test_groupby.py

        assert_identical(expected, actual)


 def test_groupby_cumsum() -> None:


Testing here is pretty light. We'll want at least two more cases:

grouping by a single nD variable

grouping by multiple variables.
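
For the multiple-variable case, expected values could be derived by treating each combination of labels as one group. The sketch below does that in pure Python. The helper `cumsum_by_keys` and the sample data are hypothetical, and it does not exercise xarray or flox, so it only helps with building the `expected` arrays.

```python
from collections import defaultdict


def cumsum_by_keys(values, *label_arrays):
    """Cumulative sum where each distinct tuple of labels forms one group."""
    totals = defaultdict(float)  # running total per label combination
    out = []
    for value, key in zip(values, zip(*label_arrays)):
        totals[key] += value
        out.append(totals[key])
    return out


# Hypothetical sample data: two grouping variables over six elements.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
letters = ["a", "a", "b", "a", "b", "b"]
numbers = [0, 1, 0, 0, 0, 1]
print(cumsum_by_keys(values, letters, numbers))  # [1.0, 2.0, 3.0, 5.0, 8.0, 6.0]
```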

use cumsum from flox

776bc5a

github-actions bot added the topic-groupby label Dec 6, 2025

pre-commit-ci bot and others added 13 commits December 6, 2025 13:44

[pre-commit.ci] auto fixes from pre-commit.com hooks

ae27632

for more information, see https://pre-commit.ci

Update groupby.py

a5f9326

Update groupby.py

50ccca4

[pre-commit.ci] auto fixes from pre-commit.com hooks

f55531e

for more information, see https://pre-commit.ci

Update groupby.py

06ac372

Merge branch 'cumsum_flox' of https://github.com/Illviljan/xarray int…

31244e6

…o cumsum_flox

Update groupby.py

dd47536

[pre-commit.ci] auto fixes from pre-commit.com hooks

e867f12

for more information, see https://pre-commit.ci

Update groupby.py

88e0ebc

[pre-commit.ci] auto fixes from pre-commit.com hooks

181d4a3

for more information, see https://pre-commit.ci

use apply_ufunc for dataset and dataarray handling

a82ec39

Merge branch 'cumsum_flox' of https://github.com/Illviljan/xarray int…

6c6abed

…o cumsum_flox

[pre-commit.ci] auto fixes from pre-commit.com hooks

24c3f1d

for more information, see https://pre-commit.ci

dcherian reviewed Dec 6, 2025


Illviljan and others added 11 commits December 6, 2025 16:21

Update groupby.py

d8d0eaa

Merge branch 'cumsum_flox' of https://github.com/Illviljan/xarray int…

55ff46a

…o cumsum_flox

[pre-commit.ci] auto fixes from pre-commit.com hooks

33d1360

for more information, see https://pre-commit.ci

sync protocols with each other

c97ae98

Merge branch 'cumsum_flox' of https://github.com/Illviljan/xarray int…

06b52ae

…o cumsum_flox

typing

84f9b44

[pre-commit.ci] auto fixes from pre-commit.com hooks

2978877

for more information, see https://pre-commit.ci

add dataset and version requirement

0a9adee

Merge branch 'cumsum_flox' of https://github.com/Illviljan/xarray int…

ae9a3d8

…o cumsum_flox

[pre-commit.ci] auto fixes from pre-commit.com hooks

c056d1f

for more information, see https://pre-commit.ci

Update _aggregations.py

d4873b9

dcherian reviewed Dec 6, 2025


Update xarray/core/groupby.py

21cbde2

Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>

Illviljan and others added 9 commits December 6, 2025 21:14

Update groupby.py

4aebc47

Update groupby.py

f4cab24

Update groupby.py

23d9d50

Update generate_aggregations.py

9b64db2

Renove workaround in test

928b158

Update _aggregations.py

130f98e

Update _aggregations.py

5a3e754

Update test_groupby.py

d912cda

[pre-commit.ci] auto fixes from pre-commit.com hooks

3bc8dc7

for more information, see https://pre-commit.ci

Illviljan commented Dec 7, 2025


Illviljan marked this pull request as ready for review December 7, 2025 12:35

Illviljan commented Dec 7, 2025


Illviljan added 2 commits December 7, 2025 14:01

clean ups

ec8ffd6

Merge branch 'main' into cumsum_flox

b0cf8c4

dcherian reviewed Dec 7, 2025


2 participants
