* Guillem Jover <guil...@debian.org> [2024-11-22 12:29]:
[...]
>   * There were concerns (from Fay) about whether given same input the
>     output changes per arch or hw setup, we'd need to check this; I'd
>     expect this not to be the case for different arches, but it might
>     be an issue with number of cores for example, but if either is true
>     this would be a serious blocker.
>   * There were concerns (from Fay) about the output stream changing due
>     to a potential implementation switch and that affecting external
>     reproducibility. Personally I think while I can see how this is
>     annoying for the involved parties, it's part of the "you need
>     the same tools to generate the same output" premise that we also
>     assume in Debian. I guess keeping both implementations around
>     indefinitely, I think, would make this less of an issue, with the
>     potential drawbacks mentioned in the previous point.
[...]
I did some more testing with zlib-ng.

With the original zlib, you always get an identical output stream given the same input stream and compressor parameters (compression level being the only one that's commonly varied in ZIP files). I expected that zlib-ng would often produce a different output stream than the original, but what I found was far less deterministic than that.

With zlib-ng, feeding the data into the compressor in e.g. 1024-byte chunks always gave me a different output stream than using 4096-byte chunks (at compression level 6). In fact, every chunk size I tried gave a different output. And that's with fixed-size chunks, which is not a given if you're handling e.g. a stream of input.

Even using the same buffer size, I can no longer get an identical compressed output stream from Python and Java, presumably because of subtle implementation differences (in the stdlib code that ends up calling zlib to do the compression) that do not affect zlib but clearly do affect zlib-ng.

This makes zlib-ng unsuitable for use cases where you need to be able to create an identical output stream without knowing exactly how the bytes were fed into the zlib compressor (or simply have no way to control this).

This fundamentally breaks my tooling in ways I can't fix by using the same build environment, because programs that used to produce identical and deterministic output with zlib no longer do with zlib-ng, despite using the exact same zlib-ng .so.

- Fay
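
For anyone who wants to reproduce the chunk-size test described above, here is a minimal sketch. It assumes Python's standard zlib module and exercises whichever library that module happens to be linked against. The helper names compress_in_chunks and digest_by_chunk_size and the sample input are made up for illustration; this is not the exact script behind the results above.

```python
import hashlib
import zlib


def compress_in_chunks(data: bytes, chunk_size: int, level: int = 6) -> bytes:
    """Deflate `data`, feeding the compressor `chunk_size` bytes at a time."""
    compressor = zlib.compressobj(level)
    parts = []
    for offset in range(0, len(data), chunk_size):
        parts.append(compressor.compress(data[offset:offset + chunk_size]))
    parts.append(compressor.flush())
    return b"".join(parts)


def digest_by_chunk_size(data: bytes, chunk_sizes=(1024, 4096, 65536)) -> dict:
    """Map each chunk size to the SHA-256 of the resulting compressed stream."""
    return {
        size: hashlib.sha256(compress_in_chunks(data, size)).hexdigest()
        for size in chunk_sizes
    }


if __name__ == "__main__":
    # Hypothetical sample input; any reasonably large, compressible data works.
    sample = b"".join(b"line %d of some sample text\n" % i for i in range(50000))
    digests = digest_by_chunk_size(sample)
    print("zlib runtime version:", zlib.ZLIB_RUNTIME_VERSION)
    for size, digest in digests.items():
        print(f"{size:>6} bytes/chunk: {digest}")
    print("identical across chunk sizes:", len(set(digests.values())) == 1)
```

With the original zlib the digests should all match. Going by the observations above, they may differ when the module is backed by zlib-ng, even though every stream still decompresses to the same input.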