
Conversation

@jhamman
Member

@jhamman jhamman commented Oct 20, 2025

(Marking this as a draft for now)

closes: #1595
replaces: #1483
xref: zarr-developers/zarr-extensions#25

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.md
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@github-actions github-actions bot added the needs release notes label Oct 20, 2025
@codecov

codecov bot commented Oct 20, 2025

Codecov Report

❌ Patch coverage is 79.56081% with 121 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.60%. Comparing base (34859c4) to head (b55cb8e).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/zarr/core/chunk_grids.py | 74.49% | 76 Missing ⚠️ |
| src/zarr/core/indexing.py | 85.07% | 20 Missing ⚠️ |
| src/zarr/core/array.py | 79.68% | 13 Missing ⚠️ |
| src/zarr/testing/strategies.py | 89.28% | 9 Missing ⚠️ |
| src/zarr/core/metadata/v3.py | 80.00% | 2 Missing ⚠️ |
| src/zarr/core/_info.py | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3534      +/-   ##
==========================================
+ Coverage   60.90%   61.60%   +0.70%     
==========================================
  Files          86       86              
  Lines       10174    10640     +466     
==========================================
+ Hits         6196     6555     +359     
- Misses       3978     4085     +107     

| Files with missing lines | Coverage Δ |
|---|---|
| src/zarr/api/asynchronous.py | 72.20% <100.00%> (ø) |
| src/zarr/api/synchronous.py | 36.61% <ø> (ø) |
| src/zarr/core/group.py | 70.27% <ø> (ø) |
| src/zarr/core/_info.py | 51.80% <0.00%> (ø) |
| src/zarr/core/metadata/v3.py | 59.74% <80.00%> (+1.73%) ⬆️ |
| src/zarr/testing/strategies.py | 94.18% <89.28%> (-3.66%) ⬇️ |
| src/zarr/core/array.py | 68.70% <79.68%> (+0.21%) ⬆️ |
| src/zarr/core/indexing.py | 70.19% <85.07%> (+0.73%) ⬆️ |
| src/zarr/core/chunk_grids.py | 70.70% <74.49%> (+8.40%) ⬆️ |

... and 1 file with indirect coverage changes

@github-actions github-actions bot removed the needs release notes label Oct 20, 2025
Comment on lines +614 to +615
With variable chunking, the standard `.chunks` property is not available since chunks
have different sizes. Instead, access chunk information through the chunk grid:
Contributor

I think it would be better if .chunks just had a different type (tuple of tuples of ints)



@dataclass(frozen=True)
class RectilinearChunkGrid(ChunkGrid):
Contributor

Thoughts on just calling this class Rectilinear and renaming RegularChunkGrid to Regular? We could keep a RegularChunkGrid class around for compatibility. But I feel like people know these are chunk grids when they import them.

Member Author

50/50. I think the more descriptive class name is useful when looking at tracebacks. Plus, this is currently in .core, so it's not meant to be used directly by users.

)

@cached_property
def _cumulative_sizes(self) -> tuple[tuple[int, ...], ...]:
Contributor

nice, this is cached
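
For readers wondering why caching matters here, the following is a rough sketch of how cached cumulative offsets can turn a coordinate lookup into a binary search. It is not the PR's implementation: `GridSketch`, `cumulative_sizes`, and `chunk_index` are hypothetical names, and the leading `0` offset is an assumption made for convenience.

```python
from bisect import bisect_right
from dataclasses import dataclass
from functools import cached_property
from itertools import accumulate


@dataclass(frozen=True)
class GridSketch:
    """Hypothetical stand-in for a rectilinear chunk grid (not zarr API)."""

    chunk_sizes: tuple[tuple[int, ...], ...]

    @cached_property
    def cumulative_sizes(self) -> tuple[tuple[int, ...], ...]:
        # Offsets of chunk boundaries per dimension, starting at 0.
        # Computed once; cached_property stores the result in the instance
        # __dict__, which also works on a frozen (non-slots) dataclass.
        return tuple(tuple(accumulate(dim, initial=0)) for dim in self.chunk_sizes)

    def chunk_index(self, dim: int, pos: int) -> int:
        """Return the index of the chunk containing coordinate `pos` along `dim`."""
        offsets = self.cumulative_sizes[dim]
        if not 0 <= pos < offsets[-1]:
            raise IndexError(f"position {pos} out of bounds for dimension {dim}")
        # Binary search over boundaries: O(log n) per lookup instead of O(n).
        return bisect_right(offsets, pos) - 1


grid = GridSketch(chunk_sizes=((10, 20, 30), (25, 25)))
```

Without the cache, every indexing operation would recompute the prefix sums, which adds up quickly for grids with many chunks.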

Comment on lines 584 to 588
    chunk_shapes_rle = [
        [[c, r] for c, r in zip(draw(dim_chunks), draw(repeats), strict=True)]
        for _ in range(ndim)
    ]
    return RectilinearChunkGrid(chunk_shapes=chunk_shapes_rle)
Contributor

This needs fixing
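
For context on the snippet above, here is a small sketch of run-length decoding for one dimension, assuming each `[c, r]` pair means "chunk size `c` repeated `r` times", as the variable names in the strategy suggest. The helper `expand_rle` is hypothetical, and the exact encoding accepted by `RectilinearChunkGrid` may differ.

```python
from collections.abc import Iterable, Sequence


def expand_rle(pairs: Iterable[Sequence[int]]) -> tuple[int, ...]:
    """Expand [[size, repeat], ...] into an explicit tuple of chunk sizes.

    Hypothetical helper: assumes every pair is (chunk size, repeat count).
    """
    sizes: list[int] = []
    for size, repeat in pairs:
        if size <= 0 or repeat <= 0:
            raise ValueError(f"size and repeat must be positive, got {size}, {repeat}")
        sizes.extend([size] * repeat)
    return tuple(sizes)


# Three chunks of 10 followed by one chunk of 5 along a single dimension.
dim_sizes = expand_rle([[10, 3], [5, 1]])
```

A property test could use a helper like this to check that the decoded sizes along each dimension sum to the array's extent.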

@given(data=st.data())
async def test_basic_indexing(data: st.DataObject) -> None:
zarray = data.draw(simple_arrays())
@given(data=st.data(), zarray=st.one_of([simple_arrays(), complex_chunked_arrays()]))
Contributor

@dcherian dcherian Oct 27, 2025

Because the search space for the standard arrays strategy is so large, I made a separate one, complex_chunked_arrays, that purely exercises different chunk grids. With simple_arrays() we only spend 10% of our time trying RectilinearChunkGrid, hence this approach. We should boost the number of examples too.

Comment on lines +668 to +669
2. **Not compatible with sharding**: You cannot use variable chunking together with
the sharding feature. Arrays must use either variable chunking or sharding, but not both.
Contributor

I hope this is a temporary limitation! There's a natural extension of rectilinear chunk grids to rectilinear shard grids.

Member Author

@jhamman jhamman marked this pull request as ready for review December 30, 2025 21:41

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement Variable Chunking (ZEP003) in V3

4 participants