On Thu, 20 Jul 2023 16:58:44 GMT, Glavo <d...@openjdk.org> wrote:

> By the way, I ran `LoopOverNonConstantHeap` on the 3700x platform, and the
> performance of ByteBuffer was also poor:

I finally see it.

Benchmark                           (polluteProfile)  Mode  Cnt  Score   Error  Units
LoopOverNonConstantHeap.BB_get                 false  avgt   30  1.801 ± 0.020  ns/op
LoopOverNonConstantHeap.unsafe_get             false  avgt   30  0.567 ± 0.007

It seems that, between updating JMH and rebuilding the JDK from scratch, *something* did the trick. I knew that random access on a BB is slower than Unsafe (as there's an extra check), whereas looped access is as fast (as C2 is good at hoisting the checks outside the loop, as shown in the benchmark). Note also that we are in the nanosecond realm, so each instruction here counts.

Is there any benchmark for DataInput/Output stream that can be used? I mean, it would be interesting to understand how these numbers translate when running the stuff that is built on top.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14636#discussion_r1269763302