Re: RFR: 8294696 - BufferedInputStream.transferTo should drain buffer when mark set [v5]

Markus KARG Mon, 31 Oct 2022 01:30:54 -0700

On Sat, 29 Oct 2022 12:31:41 GMT, Markus KARG <[email protected]> wrote:


>> This PR implements JDK-8294696.
>
> Markus KARG has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   No immediate buffer resize; in the rare case of read-past-EOF automatic 
> buffer grow will happen anyways

**TL;DR:** Your optimization is now contained in this PR and outperforms the 
previous code. Nevertheless I plead for Alternative C (double buffering) 
instead, which is even faster but much simpler.

**Justification**

I took some time and did intensive JMH testing (with two buffer sizes and with 
four percental start positions within the buffer to simulate non-pre-read, 
slightly-pre-read, mostly-pre-read and fully-pre-read buffer situations) with 
three variants: The current code with early buffer resize (which you say is not 
necessary), the current code without early buffer resize (which you propose 
instead), and the so-called Alternative C (which uses double-buffering instead 
of buffer-replacing).

* Alternative B: Optimization outperforms unoptimized code
The results are shown below, and as you can see, early buffer resize is only a 
problem in *some*, but not all, cases. Nevertheless I agree that should be 
avoided, so I already pushed it as commit 
https://github.com/openjdk/jdk/pull/10525/commits/9c2f502c06083694deefe6a9d4c8ee4b77a92c68
 and hope that you are fine with this latest piece of code.

* Alternative C: Outperforms Alternative B (optimized *and* unoptimized code) 
unless buffer is huge
Interesting to see BTW is that Alternative C (double-buffering instead of 
buffer-replacing) even outperforms that optimized code for the default buffer 
size (8K), and is only slightly slower in the case of large (1M) buffers 
(apparently `System.arraycopy` is faster than `Arrays.fill` unless the array is 
huge, which I did not expect originally). As the code is super-simple compared 
to the current code, and as the originally assumed risk of OOME does not exists 
(if it would, the `fill()` method would fail already, as it uses 
double-buffering, too, and is called by the `read` happening *before* 
`transferTo`) I would propose that we go with this code instead of the 
optimized Alternative B:

private long implTransferTo(OutputStream out) throws IOException {
    if (getClass() == BufferedInputStream.class && markpos == -1) {
        int avail = count - pos;
        if (avail > 0) {
            byte[] buffer = new byte[avail]; // Double buffering prevents 
poisoning and leaking
            System.arraycopy(getBufIfOpen(), pos, buffer, 0, avail);
            out.write(buffer);
            pos = count;
        }
        return avail + getInIfOpen().transferTo(out);
    } else {
        return super.transferTo(out);
    }
}


Here are the numbers in full length for your reference:

Alternative B optimized (no early buffer resize)

Benchmark                                            (bufSize)  (posPercent)   
Mode  Cnt        Score       Error  Units
BufferedInputStreamTransferToPerformance.transferTo       8192             0  
thrpt   25     4289,949 ┬▒   217,452  ops/s
BufferedInputStreamTransferToPerformance.transferTo       8192            10  
thrpt   25     3823,443 ┬▒   314,171  ops/s
BufferedInputStreamTransferToPerformance.transferTo       8192            90  
thrpt   25     4122,949 ┬▒    52,131  ops/s
BufferedInputStreamTransferToPerformance.transferTo       8192           100  
thrpt   25     4675,491 ┬▒    46,194  ops/s
BufferedInputStreamTransferToPerformance.transferTo    1048576             0  
thrpt   25     3477,497 ┬▒   111,229  ops/s
BufferedInputStreamTransferToPerformance.transferTo    1048576            10  
thrpt   25     4767,220 ┬▒    18,274  ops/s
BufferedInputStreamTransferToPerformance.transferTo    1048576            90  
thrpt   25    43900,923 ┬▒   799,471  ops/s
BufferedInputStreamTransferToPerformance.transferTo    1048576           100  
thrpt   25  1680163,414 ┬▒ 29823,617  ops/s

Alternative B original (with early buffer resize)

Benchmark                                            (bufSize)  (posPercent)   
Mode  Cnt        Score       Error  Units
BufferedInputStreamTransferToPerformance.transferTo       8192             0  
thrpt   25     4346,835 ┬▒   273,140  ops/s
BufferedInputStreamTransferToPerformance.transferTo       8192            10  
thrpt   25     3816,744 ┬▒   220,669  ops/s
BufferedInputStreamTransferToPerformance.transferTo       8192            90  
thrpt   25     3624,021 ┬▒   340,279  ops/s
BufferedInputStreamTransferToPerformance.transferTo       8192           100  
thrpt   25     4484,060 ┬▒    95,201  ops/s
BufferedInputStreamTransferToPerformance.transferTo    1048576             0  
thrpt   25     3507,334 ┬▒   140,604  ops/s
BufferedInputStreamTransferToPerformance.transferTo    1048576            10  
thrpt   25     4800,050 ┬▒    44,987  ops/s
BufferedInputStreamTransferToPerformance.transferTo    1048576            90  
thrpt   25    44327,841 ┬▒   256,322  ops/s
BufferedInputStreamTransferToPerformance.transferTo    1048576           100  
thrpt   25  1699685,163 ┬▒ 44257,421  ops/s

Alternative C: Double buffering

Benchmark                                            (bufSize)  (posPercent)   
Mode  Cnt        Score       Error  Units
BufferedInputStreamTransferToPerformance.transferTo       8192             0  
thrpt   25     4555,523 ┬▒    83,370  ops/s
BufferedInputStreamTransferToPerformance.transferTo       8192            10  
thrpt   25     3972,562 ┬▒   489,380  ops/s
BufferedInputStreamTransferToPerformance.transferTo       8192            90  
thrpt   25     4181,716 ┬▒    49,527  ops/s
BufferedInputStreamTransferToPerformance.transferTo       8192           100  
thrpt   25     4602,409 ┬▒   125,641  ops/s
BufferedInputStreamTransferToPerformance.transferTo    1048576             0  
thrpt   25     3373,734 ┬▒   161,989  ops/s
BufferedInputStreamTransferToPerformance.transferTo    1048576            10  
thrpt   25     4684,583 ┬▒    87,257  ops/s
BufferedInputStreamTransferToPerformance.transferTo    1048576            90  
thrpt   25    44082,232 ┬▒   364,210  ops/s
BufferedInputStreamTransferToPerformance.transferTo    1048576           100  
thrpt   25  1690855,778 ┬▒ 53885,439  ops/s

-------------

PR: https://git.openjdk.org/jdk/pull/10525

Re: RFR: 8294696 - BufferedInputStream.transferTo should drain buffer when mark set [v5]

Reply via email to