Thanks Sam for your investigation. I revisited the logs and confirmed that the JDK has never changed.
Running 'java -version' gives:

> openjdk version "11.0.19" 2023-04-18 LTS
> OpenJDK Runtime Environment (Red_Hat-11.0.19.0.7-2) (build 11.0.19+7-LTS)
> OpenJDK 64-Bit Server VM (Red_Hat-11.0.19.0.7-2) (build 11.0.19+7-LTS, mixed mode, sharing)

It is installed via yum; 'yum info' gives:

> Name         : java-11-openjdk
> Epoch        : 1
> Version      : 11.0.19.0.7
> Release      : 4.0.3.al8
> Architecture : x86_64
> Size         : 1.3 M
> Source       : java-11-openjdk-11.0.19.0.7-4.0.3.al8.src.rpm
> Repository   : @System
> From repo    : alinux3-updates
> Summary      : OpenJDK 11 Runtime Environment
> URL          : http://openjdk.java.net/
> License      : ASL 1.1 and ASL 2.0 and BSD and BSD with advertising and GPL+ and
>              : GPLv2 and GPLv2 with exceptions and IJG and LGPLv2+ and MIT and
>              : MPLv2.0 and Public Domain and W3C and zlib and ISC and FTL and
>              : RSA
> Description  : The OpenJDK 11 runtime environment.

The benchmark environment is hosted on Aliyun, and the OS and JVM are also
released by Aliyun.

And thanks for your PR, I will try to put it into our daily run soon.

Best,
Zakelly

On Mon, Jun 10, 2024 at 9:24 AM Sam Barker <s...@quadrocket.co.uk> wrote:

> After completing the side quest
> <https://github.com/apache/flink-benchmarks/pull/90>[1] of enabling the async
> profiler when running the JMH benchmarks, I've been unable to reproduce the
> performance change between the last known good run and the first run
> highlighted as a regression.
>
> Results from my Fedora 40 workstation using:
>
> # JMH version: 1.37
> # VM version: JDK 11.0.23, OpenJDK 64-Bit Server VM, 11.0.23+9
> # VM invoker: /home/sam/.sdkman/candidates/java/11.0.23-tem/bin/java
> # VM options: -Djava.rmi.server.hostname=127.0.0.1 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.ssl
> # Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
>
> ───────┬────────────────────────────────────────────────────────────────
>        │ File: /tmp/profile-results/163b9cca6d2/jmh-result.csv
> ───────┼────────────────────────────────────────────────────────────────
>    1   │ "Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"
>    2   │ "org.apache.flink.benchmark.SerializationFrameworkMiniBenchmarks.serializerHeavyString","thrpt",1,30,179.453066,5.725733,"ops/ms"
>    3   │ "org.apache.flink.benchmark.SerializationFrameworkMiniBenchmarks.serializerHeavyString:async","thrpt",1,1,NaN,NaN,"---"
> ───────┴────────────────────────────────────────────────────────────────
>
> ───────┬────────────────────────────────────────────────────────────────
>        │ File: /tmp/profile-results/f38d8ca43f6/jmh-result.csv
> ───────┼────────────────────────────────────────────────────────────────
>    1   │ "Benchmark","Mode","Threads","Samples","Score","Score Error (99.9%)","Unit"
>    2   │ "org.apache.flink.benchmark.SerializationFrameworkMiniBenchmarks.serializerHeavyString","thrpt",1,30,178.861842,6.711582,"ops/ms"
>    3   │ "org.apache.flink.benchmark.SerializationFrameworkMiniBenchmarks.serializerHeavyString:async","thrpt",1,1,NaN,NaN,"---"
> ───────┴────────────────────────────────────────────────────────────────
>
> Where f38d8ca43f6 is the last known good run and 163b9cca6d2 is the first
> regression.
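(A note for anyone repeating this locally: the ":async" secondary rows above come from
JMH's async-profiler integration, which emits flame graphs rather than a numeric score,
hence the NaN. A minimal sketch of wiring that up through the plain JMH API is below; the
class name, profiler options and output directory are illustrative assumptions, not taken
from Sam's PR, and they assume libasyncProfiler is discoverable on the library path.)

import org.openjdk.jmh.profile.AsyncProfiler;
import org.openjdk.jmh.results.format.ResultFormatType;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class ProfiledBenchmarkRunner {
    public static void main(String[] args) throws RunnerException {
        Options opts = new OptionsBuilder()
                // Only the benchmark under discussion.
                .include("org.apache.flink.benchmark.SerializationFrameworkMiniBenchmarks.serializerHeavyString")
                // Attach async-profiler in wall-clock mode and write flame graphs next to the results.
                // If libasyncProfiler.so is not on java.library.path, append ";libPath=/path/to/libasyncProfiler.so".
                .addProfiler(AsyncProfiler.class, "event=wall;output=flamegraph;dir=/tmp/profile-results")
                // Same CSV layout as the files shown above.
                .resultFormat(ResultFormatType.CSV)
                .result("/tmp/profile-results/jmh-result.csv")
                .build();
        new Runner(opts).run();
    }
}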
> One question I have from comparing my local results to those on flink-speed
> <https://flink-speed.xyz/timeline/#/?exe=6&ben=serializerHeavyString&extr=on&quarts=on&equid=off&env=3&revs=200>[2]:
> is it possible the JDK version changed between the runs? (I don't see the
> actual JDK build listed anywhere, so I can't check versions or distributions.)
>
> I've also tried comparing building Flink with the java11-target profile vs
> the default JDK 8 build, and that does not change the performance.
>
> Sam
>
> [1] https://github.com/apache/flink-benchmarks/pull/90
> [2] https://flink-speed.xyz/timeline/#/?exe=6&ben=serializerHeavyString&extr=on&quarts=on&equid=off&env=3&revs=200
>
> On Wed, 29 May 2024 at 16:53, Sam Barker <s...@quadrocket.co.uk> wrote:
>
> > > I guess that improvement is a fluctuation. You can double check the
> > > performance results[1] of the last few days. The performance isn't
> > > recovered.
> >
> > Hmm, yeah, the improvement was a fluctuation and smaller than I remembered
> > seeing (maybe I had zoomed into the timeline too much).
> >
> > > I fixed an issue related to kryo serialization in FLINK-35215. IIUC,
> > > serializerHeavyString doesn't use the kryo serialization. I tried to
> > > run the serializerHeavyString demo locally, and didn't see the
> > > kryo serialization related code being called.
> >
> > I don't see it either, but then again I don't see commons-io in the call
> > stacks either, despite the regression...
> >
> > I'm continuing to investigate the regression.
> >
> > On Mon, 27 May 2024 at 20:15, Rui Fan <1996fan...@gmail.com> wrote:
> >
> >> Thanks Sam for the comment!
> >>
> >> > It looks like the most recent run of JDK 11 saw a big improvement of
> >> > the performance of the test.
> >>
> >> I guess that improvement is a fluctuation. You can double check the
> >> performance results[1] of the last few days. The performance isn't
> >> recovered.
> >>
> >> > That improvement seems related to [2], which is a fix for FLINK-35215.
> >>
> >> I fixed an issue related to kryo serialization in FLINK-35215. IIUC,
> >> serializerHeavyString doesn't use the kryo serialization. I tried to
> >> run the serializerHeavyString demo locally, and didn't see the
> >> kryo serialization related code being called.
> >>
> >> Please correct me if I'm wrong, thanks~
> >>
> >> [1] http://flink-speed.xyz/timeline/#/?exe=6&ben=serializerHeavyString&extr=on&quarts=on&equid=off&env=3&revs=200
> >>
> >> Best,
> >> Rui
> >>
> >> On Thu, May 23, 2024 at 1:27 PM Sam Barker <s...@quadrocket.co.uk> wrote:
> >>
> >> > It looks like the most recent run of JDK 11 saw a big improvement[1]
> >> > of the performance of the test. That improvement seems related to [2],
> >> > which is a fix for FLINK-35215 [3]. That suggests to me that the test
> >> > isn't as isolated to the performance of the code it's trying to test
> >> > as would be ideal. However, I've only just started looking at the test
> >> > suite and trying to run it locally, so I'm not very well placed to judge.
> >> >
> >> > It does, however, suggest that this shouldn't be a blocker for the
> >> > release.
> >> >
> >> > [1] http://flink-speed.xyz/changes/?rev=c1baf07d76&exe=6&env=3
> >> > [2] https://github.com/apache/flink/commit/c1baf07d7601a683f42997dc35dfaef4e41bc928
> >> > [3] https://issues.apache.org/jira/browse/FLINK-35215
> >> >
> >> > On Wed, 22 May 2024 at 00:15, Piotr Nowojski <pnowoj...@apache.org> wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > Given what you wrote, that you have investigated the issue and couldn't
> >> > > find any easy explanation, I would suggest closing this ticket as "Won't
> >> > > do" or "Cannot reproduce" and ignoring the problem.
> >> > >
> >> > > In the past there have been quite a few cases where some benchmark
> >> > > detected a performance regression. Sometimes those cannot be reproduced;
> >> > > other times (as is the case here), some seemingly unrelated change is
> >> > > causing the regression. The same thing has happened in this benchmark
> >> > > many times in the past [1], [2], [3], [4]. Generally speaking, this
> >> > > benchmark has been in the spotlight a couple of times [5].
> >> > >
> >> > > Note that there have been cases where this benchmark did detect a
> >> > > performance regression :)
> >> > >
> >> > > My personal suspicion is that after that commons-io version bump,
> >> > > something poked the JVM/JIT into compiling the code a bit differently
> >> > > for string serialization, causing this regression. We have a couple of
> >> > > benchmarks that seem to be prone to such semi-intermittent issues. For
> >> > > example, the same benchmark was subject to this annoying pattern, which
> >> > > I've spotted in quite a few benchmarks over the years [6]:
> >> > >
> >> > > [image: image.png] (https://imgur.com/a/AoygmWS)
> >> > >
> >> > > Benchmark results are very stable within a single JVM fork, but between
> >> > > two forks they can reach two different "stable" levels. Here it looks
> >> > > like a 50% chance of getting a stable "200 records/ms" and a 50% chance
> >> > > of "250 records/ms".
> >> > >
> >> > > A small interlude: each of our benchmarks runs in 3 different JVM forks,
> >> > > with 10 warm-up iterations and 10 measurement iterations. Each iteration
> >> > > lasts/invokes the benchmarking method for at least one second. So by
> >> > > "very stable" results, I mean that, for example, after the 2nd or 3rd
> >> > > warm-up iteration the results stabilize to within +/-1% and stay at that
> >> > > level for the whole duration of the fork.
> >> > >
> >> > > Given that we are repeating the same benchmark in 3 different forks,
> >> > > by pure chance we can have:
> >> > > - 3 slow forks - total average 200 records/ms
> >> > > - 2 slow forks, 1 fast fork - average 216 r/ms
> >> > > - 1 slow fork, 2 fast forks - average 233 r/ms
> >> > > - 3 fast forks - average 250 r/ms
> >> > >
> >> > > So this benchmark is susceptible to entering different semi-stable
> >> > > states. As I wrote above, I guess something in the commons-io version
> >> > > bump just swayed it to a different semi-stable state :( I have never
> >> > > gotten desperate enough to actually dig into what exactly is causing
> >> > > this kind of issue.
> >> > >
> >> > > Best,
> >> > > Piotrek
> >> > >
> >> > > [1] https://issues.apache.org/jira/browse/FLINK-18684
> >> > > [2] https://issues.apache.org/jira/browse/FLINK-27133
> >> > > [3] https://issues.apache.org/jira/browse/FLINK-27165
> >> > > [4] https://issues.apache.org/jira/browse/FLINK-31745
> >> > > [5] https://issues.apache.org/jira/browse/FLINK-35040?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20Resolved%2C%20Closed)%20AND%20text%20~%20%22serializerHeavyString%22
> >> > > [6] http://flink-speed.xyz/timeline/#/?exe=1&ben=serializerHeavyString&extr=on&quarts=on&equid=off&env=2&revs=1000
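(For completeness: the arithmetic behind the four average levels Piotr lists is easy to
check. Below is a tiny, purely illustrative Java sketch with the two per-fork "stable"
levels hard-coded at 200 and 250 ops/ms, as in his example; it is not part of the
benchmark suite.)

// Illustrative only: how three forks that each settle at either ~200 or ~250 ops/ms
// yield the four distinct whole-run averages mentioned above.
public class ForkAverageSketch {
    public static void main(String[] args) {
        final double slow = 200.0; // "slow" semi-stable level, ops/ms
        final double fast = 250.0; // "fast" semi-stable level, ops/ms
        final int forks = 3;       // each benchmark runs in 3 JVM forks

        for (int fastForks = 0; fastForks <= forks; fastForks++) {
            double average = (fastForks * fast + (forks - fastForks) * slow) / forks;
            System.out.printf("%d fast / %d slow forks -> average %.1f ops/ms%n",
                    fastForks, forks - fastForks, average);
        }
        // Prints 200.0, 216.7, 233.3 and 250.0 ops/ms, matching the levels listed above.
    }
}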
> >> > > On Tue, 21 May 2024 at 12:50, Rui Fan <1996fan...@gmail.com> wrote:
> >> > >
> >> >> Hi devs,
> >> >>
> >> >> We (the release managers of Flink 1.20) want to report one performance
> >> >> regression to the Flink dev mailing list.
> >> >>
> >> >> # Background:
> >> >>
> >> >> The performance of serializerHeavyString started to regress on April 3,
> >> >> and we created FLINK-35040[1] to track it.
> >> >>
> >> >> In brief:
> >> >> - The performance only regresses on JDK 11; Java 8 and Java 17 are fine.
> >> >> - The regression is caused by upgrading the commons-io version from
> >> >>   2.11.0 to 2.15.1.
> >> >> - The upgrade was done in FLINK-34955[2].
> >> >> - The performance recovers after reverting the commons-io version to
> >> >>   2.11.0.
> >> >>
> >> >> You can get more details from FLINK-35040[1].
> >> >>
> >> >> # Problem
> >> >>
> >> >> We tried to generate flame graphs (wall mode) to analyze why upgrading
> >> >> the commons-io version affects the performance. These flame graphs can
> >> >> be found in FLINK-35040[1]. (Many thanks to Zakelly for generating these
> >> >> flame graphs on the benchmark server.)
> >> >>
> >> >> Unfortunately, we cannot find any place where commons-io code is called.
> >> >> We also analyzed whether any other dependencies changed along with the
> >> >> commons-io upgrade. They did not; the other dependencies are exactly the
> >> >> same.
> >> >>
> >> >> # Request
> >> >>
> >> >> After the above analysis, we cannot explain why the performance of
> >> >> serializerHeavyString regresses on JDK 11.
> >> >>
> >> >> We are looking forward to hearing valuable suggestions from the Flink
> >> >> community. Thanks everyone in advance.
> >> >>
> >> >> Note:
> >> >> 1. I cannot reproduce the regression on my Mac with JDK 11, and we
> >> >>    suspect this regression may be caused by the benchmark environment.
> >> >> 2. We will accept this regression if the issue still cannot be solved.
> >> >>
> >> >> [1] https://issues.apache.org/jira/browse/FLINK-35040
> >> >> [2] https://issues.apache.org/jira/browse/FLINK-34955
> >> >>
> >> >> Best,
> >> >> Weijie, Ufuk, Robert and Rui
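P.S. On Sam's question about the exact JDK build not being listed anywhere on
flink-speed: below is a minimal, hypothetical sketch of what the benchmark harness could
print at startup so that future regressions can be correlated with runtime changes
without digging through system logs. The class name and placement are illustrative (it is
not something flink-benchmarks does today); it relies only on standard JVM system
properties.

// Hypothetical helper: record the exact JDK and OS build alongside each benchmark run
// so that results published to flink-speed can be tied back to a runtime version.
public class RuntimeInfoLogger {
    public static void main(String[] args) {
        System.out.println("java.runtime.version = " + System.getProperty("java.runtime.version"));
        System.out.println("java.vm.name         = " + System.getProperty("java.vm.name"));
        System.out.println("java.vm.version      = " + System.getProperty("java.vm.version"));
        System.out.println("java.vendor          = " + System.getProperty("java.vendor"));
        System.out.println("os.name/version      = " + System.getProperty("os.name")
                + " " + System.getProperty("os.version"));
    }
}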