On Mon, 29 Aug 2022 11:58:15 GMT, Volker Simonis <simo...@openjdk.org> wrote:
>> Add an API note to `InflaterInputStream::read(byte[] b, int off, int len)` >> to highlight that it might write more bytes than the returned number of >> inflated bytes into the buffer `b`. >> >> The superclass `java.io.InputStream` specifies that `read(byte[] b, int off, >> int len)` will leave the content beyond the last read byte in the read >> buffer `b` unaffected. However, the overridden `read` method in >> `InflaterInputStream` passes the read buffer `b` to >> `Inflater::inflate(byte[] b, int off, int len)` which doesn't provide this >> guarantee. Depending on implementation details, `Inflater::inflate` might >> write more than the returned number of inflated bytes into the buffer `b`. >> >> ### TL;DR >> >> `java.util.zip.Inflater` is the Java wrapper class for zlib's inflater >> functionality. `Inflater::inflate(byte[] output, int off, int len)` >> currently calls zlib's native `inflate(..)` function and passes the address >> of `output[off]` and `len` to it via JNI. >> >> The specification of zlib's `inflate(..)` function (i.e. the [API >> documentation in the original zlib >> implementation](https://github.com/madler/zlib/blob/cacf7f1d4e3d44d871b605da3b647f07d718623f/zlib.h#L400)) >> doesn't give any guarantees with regard to usage of the output buffer. It >> only states that upon completion the function will return the number of >> bytes that have been written (i.e. "inflated") into the output buffer. >> >> The original zlib implementation only wrote as many bytes into the output >> buffer as it inflated. However, this is not a hard requirement and newer, >> more performant implementations of the zlib library like >> [zlib-chromium](https://chromium.googlesource.com/chromium/src/third_party/zlib/) >> or [zlib-cloudflare](https://github.com/cloudflare/zlib) can use more bytes >> of the output buffer than they actually inflate as a scratch buffer. See >> https://github.com/simonis/zlib-chromium for a more detailed description of >> their approach and its performance benefit. >> >> These new zlib versions can still be used transparently from Java (e.g. by >> putting them into the `LD_LIBRARY_PATH` or by using `LD_PRELOAD`), because >> they still fully comply to specification of `Inflater::inflate(..)`. >> However, we might run into problems when using the `Inflater` functionality >> from the `InflaterInputStream` class. `InflaterInputStream` is derived from >> from `InputStream` and as such, its `read(byte[] b, int off, int len)` >> method is quite constrained. It specifically specifies that if *k* bytes >> have been read, then "these bytes will be stored in elements `b[off]` >> through `b[off+`*k*`-1]`, leaving elements `b[off+`*k*`]` through >> `b[off+len-1]` **unaffected**". But `InflaterInputStream::read(byte[] b, int >> off, int len)` (which is constrained by `InputStream::read(..)`'s >> specification) calls `Inflater::inflate(byte[] b, int off, int len)` and >> directly passes its output buffer down to the native zlib `inflate(..)` >> method which is free to change the bytes beyond `b[off+` *k*`]` (where *k* is the number of inflated bytes). >> >> From a practical point of view, I don't see this as a big problem, because >> callers of `InflaterInputStream::read(byte[] b, int off, int len)` can never >> know how many bytes will be written into the output buffer `b` (and in fact >> its content can always be completely overwritten). It therefore makes no >> sense to depend on any data there being untouched after the call. Also, >> having used zlib-cloudflare productively for about two years, we haven't >> seen real-world issues because of this behavior yet. However, from a >> specification point of view it is easy to artificially construct a program >> which violates `InflaterInputStream::read(..)`'s postcondition if using one >> of the alterantive zlib implementations. A recently integrated JTreg test >> (test/jdk/jdk/nio/zipfs/ZipFSOutputStreamTest.java) "unintentionally" fails >> with zlib-chromium but can fixed easily to run with alternative >> implementations as well (see >> [JDK-8283756](https://bugs.openjdk.java.net/browse/JDK-8283756)). > > Volker Simonis has updated the pull request incrementally with one additional > commit since the last revision: > > Updated JavaDoc according to @mbreinhold's suggestions Marked as reviewed by mr (Lead). ------------- PR: https://git.openjdk.org/jdk/pull/7986