Enable window_bits and gzip functions for zlib-rs backend #520

Copilot · 2025-12-07T08:08:18Z

Several functions were unnecessarily restricted to C-based zlib backends despite working with pure Rust backends like zlib-rs.

Changes

Changed feature gate from any_zlib to any_impl for:

Compress::new_with_window_bits() - Custom window size for compression
Compress::new_gzip() - Gzip-formatted compression
Decompress::new_with_window_bits() - Custom window size for decompression
Decompress::new_gzip() - Gzip-formatted decompression

These functions work across all backends because:

Backend trait make() methods already accept window_bits parameter
Gzip mode uses standard window_bits + 16 convention supported universally

Dictionary functions (set_dictionary, set_level) remain any_zlib-only as they require direct stream manipulation not exposed in zlib-rs stable API.

Tests

Added validation for newly enabled functions:

Window size variations (9, 12, 15)
Boundary validation (panics on invalid values)
Round-trip compression/decompression

// Now works with zlib-rs backend
let mut encoder = Compress::new_gzip(Compression::default(), 9);
encoder.compress_vec(data, &mut output, FlushCompress::Finish)?;

let mut decoder = Decompress::new_gzip(9);
decoder.decompress(&output, &mut decoded, FlushDecompress::Finish)?;

Original prompt

#517 See how the functions that rely on #[cfg(feature = "any_zlib")] can also be implemented when zlib-rs is set, which doesn't use the C FFI anymore. Add tests that validate that these functions are present with zlib-rs.
If one of these can't be implemented with zlib-rs, that's fine too.

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Co-authored-by: Byron <63622+Byron@users.noreply.github.com>

- cargo fmt

Byron · 2025-12-07T08:56:53Z

@folkertdev I'd need your help here 🙏.

I think this PR shows that set_dictionary and set_level currently aren't supported for zlib-rs because they need more access. Is this something you think zlib-rs will support (or does already?) so we can merge this PR and have a release that isn't considered breaking?
Thanks again for your help.

folkertdev · 2025-12-08T13:28:26Z

I would have expected this PR to break CI given that zlib-rs does not currently expose the API to implement those features?

I'll need to make a quick release exposing some additional things. We have the functionality, just not the API.

Byron · 2025-12-09T04:02:53Z

I would have expected this PR to break CI given that zlib-rs does not currently expose the API to implement those features?

Actually the test doesn't check for set_dictionary and set_level for that reason, but everything else is exposed.

Once we have access to these via the Rust API, the test can be adjusted to also validate them.

In theory, it could then be copy-pasted for each of the other implementations.

In any case, thanks for your help, it's much appreciated!

folkertdev · 2025-12-09T12:54:56Z

Some fun kind of undocumented behavior: the native zlib backends count the dictionary length towards total_in, but the flate2 bookkeeping does not (it only counts input to compress and decompress). Is that deviation deliberate?

Byron · 2025-12-09T16:49:58Z

Maybe @jongiddy has more context, as this question is way over my head.

Knowing that some models as built into the GH chat window have access to a lot of context and can do some thinking/planning, I thought maybe sharing their output here can be useful. Maybe not, I really can't even decide that 😅.

GPT 4.1 Mini

Short answer: probably a deliberate bookkeeping choice in flate2 rather than a spec bug in zlib. zlib's C implementation treats the preset dictionary as part of the stream history and increments its internal total_in/total_out counters to reflect that the stream now "contains" those bytes; flate2's counters reflect only the bytes that were actually passed to the compress()/decompress() calls by the caller (i.e., explicit input), and so do not add in the implicit dictionary length.

Why that difference exists (summary)

zlib C API semantics: when you give zlib a preset dictionary (deflateSetDictionary / inflateSetDictionary) the library internally takes that dictionary into account as stream history; the library’s total_in may be updated to include the dictionary length because the library views those bytes as part of the stream state.
flate2 bookkeeping: flate2 exposes total_in/total_out that count bytes passed through its API, which excludes any bytes that were not actually passed to compress()/decompress() (so a dictionary you hand in separately is excluded).
Result: two backends that both produce correct compressed output may still report different total_in values because they're answering slightly different questions about what "total input" means.

Is it deliberate?

It’s plausible and reasonable that flate2 authors intentionally count only the explicit API input (that keeps the meaning clear: “how many bytes did you hand me in these calls?”).
It could also be an implementation divergence (or an oversight) between backends — the important point is that the semantics are not identical across backends today.

What I recommend

If you need consistent semantics in your code, treat flate2’s total_in as “bytes passed to the API” and, when you need parity with zlib’s native behavior, add dict.len() yourself to the reported total_in when you supplied a preset dictionary to the native backend.
If you think flate2 should document or change this behavior (or at least make it consistent across backends), open an issue on the flate2 repository with a minimal repro and the exact backend/feature flags you used (native zlib vs pure-Rust backend).
When opening the issue, include:
- The flate2 version and the backend feature you used (native zlib vs miniz_oxide / pure Rust).
- A tiny reproducible program: compress or decompress with an explicit preset dictionary, and print total_in from both backends so maintainers can reproduce the discrepancy.
- The exact numbers you observed and what you expected.
- Why you care about total_in including/excluding the dictionary (e.g., for progress reporting, accounting, or compatibility with other tools).

Suggested issue title/body (you can paste/edit this):

Title:
total_in behavior differs from system zlib when using a preset dictionary

Body:
Summary:
When using a preset dictionary, the system zlib backend increases zlib's total_in by the dictionary length, but flate2's bookkeeping (the value returned from total_in()) does not include the dictionary length. This makes total_in differ across backends.

Repro:

flate2 version:
Backends tested: system zlib (feature X), pure-Rust backend (feature Y)
Minimal steps: (1) set a preset dictionary, (2) call compress/decompress on some data, (3) read total_in().

Observed:

system zlib backend: total_in == bytes_passed + dict.len()
flate2 bookkeeping: total_in == bytes_passed

Expected / question:
Should flate2:

document that total_in excludes preset dictionary length, or
change behavior to include the dictionary length for parity with system zlib, or
expose both values (explicit input vs stream-history-inclusive total)?

Workaround:
Add dict.len() to total_in yourself when you need the dictionary counted.

If you’d like, I can help you draft a minimal reproducible Rust snippet to include in the issue (or a short test) showing the differing totals for the two backends. Which backend(s) and flate2 version are you testing?

Copilot AI assigned Copilot and Byron Dec 7, 2025

Copilot started work on behalf of Byron December 7, 2025 08:08 View session

Copilot AI changed the title ~~[WIP] Add tests for functions with zlib-rs support~~ Enable window_bits and gzip functions for zlib-rs backend Dec 7, 2025

Copilot AI requested a review from Byron December 7, 2025 08:23

Copilot finished work on behalf of Byron December 7, 2025 08:23

Copilot AI and others added 2 commits December 7, 2025 09:37

Enable window_bits and gzip functions for zlib-rs backend

598cf63

Co-authored-by: Byron <63622+Byron@users.noreply.github.com>

refactor

b4993c4

- cargo fmt

Byron force-pushed the copilot/add-tests-for-zlib-rs-functionality branch from 8f2cdbb to b4993c4 Compare December 7, 2025 08:38

Add integration test for zlib-rs specifically to test exposed APIs.

d8a67e7

Byron force-pushed the copilot/add-tests-for-zlib-rs-functionality branch from 977d847 to d8a67e7 Compare December 7, 2025 08:53

Byron linked an issue Dec 7, 2025 that may be closed by this pull request

1.1.7 have breaking change about feature zlib-rs #517

Open

Byron added the question label Dec 7, 2025

youknowone mentioned this pull request Dec 7, 2025

Allow zlib_impl for more functions #521

Closed

Byron mentioned this pull request Dec 7, 2025

Document feature flags using document-features crate #519

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Enable window_bits and gzip functions for zlib-rs backend #520

Enable window_bits and gzip functions for zlib-rs backend #520

Copilot AI commented Dec 7, 2025 •

edited

Loading

Uh oh!

Byron commented Dec 7, 2025

Uh oh!

folkertdev commented Dec 8, 2025

Uh oh!

Byron commented Dec 9, 2025

Uh oh!

folkertdev commented Dec 9, 2025 •

edited

Loading

Uh oh!

Byron commented Dec 9, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Enable window_bits and gzip functions for zlib-rs backend #520

Are you sure you want to change the base?

Enable window_bits and gzip functions for zlib-rs backend #520

Conversation

Copilot AI commented Dec 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Tests

Uh oh!

Byron commented Dec 7, 2025

Uh oh!

folkertdev commented Dec 8, 2025

Uh oh!

Byron commented Dec 9, 2025

Uh oh!

folkertdev commented Dec 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Byron commented Dec 9, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Dec 7, 2025 •

edited

Loading

folkertdev commented Dec 9, 2025 •

edited

Loading