> I guess that improvement is a fluctuation. You can double-check the performance results[1] of the last few days. The performance hasn't recovered.
Hmm, yeah, the improvement was a fluctuation, and smaller than I remembered seeing (maybe I had zoomed into the timeline too much).

> I fixed an issue related to Kryo serialization in FLINK-35215. IIUC, serializerHeavyString doesn't use Kryo serialization. I tried to run the serializerHeavyString demo locally and didn't see any Kryo-serialization-related code being called.

I don't see it either, but then again I don't see commons-io in the call stacks despite the regression... I'm continuing to investigate the regression.
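As a first sanity check before digging into the JIT angle Piotr mentions below, I'm running a tiny throwaway probe (my own helper, not part of flink-benchmarks, and the class name is made up) to confirm which commons-io jar and version the benchmark JVM actually resolves; the 2.11.0 vs 2.15.1 difference only matters if the new jar really is the one on the classpath:

    // Throwaway probe: prints where the commons-io classes are loaded from and the
    // jar's declared version, to confirm the benchmark JVM really sees 2.15.1
    // (or 2.11.0 after the revert).
    import org.apache.commons.io.IOUtils;

    public class CommonsIoProbe {
        public static void main(String[] args) {
            System.out.println("commons-io location: "
                    + IOUtils.class.getProtectionDomain().getCodeSource().getLocation());
            System.out.println("commons-io version:  "
                    + IOUtils.class.getPackage().getImplementationVersion());
        }
    }

If that confirms 2.15.1 is on the classpath but commons-io still never shows up in the stacks, that points more towards an indirect effect like the JIT/code-layout one Piotr describes below.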
On Mon, 27 May 2024 at 20:15, Rui Fan <1996fan...@gmail.com> wrote:

> Thanks Sam for the comment!
>
> > It looks like the most recent run of JDK 11 saw a big improvement of the performance of the test.
>
> I guess that improvement is a fluctuation. You can double-check the performance results[1] of the last few days. The performance hasn't recovered.
>
> > That improvement seems related to [2] which is a fix for FLINK-35215.
>
> I fixed an issue related to Kryo serialization in FLINK-35215. IIUC, serializerHeavyString doesn't use Kryo serialization. I tried to run the serializerHeavyString demo locally and didn't see any Kryo-serialization-related code being called.
>
> Please correct me if I'm wrong, thanks~
>
> [1] http://flink-speed.xyz/timeline/#/?exe=6&ben=serializerHeavyString&extr=on&quarts=on&equid=off&env=3&revs=200
>
> Best,
> Rui
>
> On Thu, May 23, 2024 at 1:27 PM Sam Barker <s...@quadrocket.co.uk> wrote:
>
> > It looks like the most recent run of JDK 11 saw a big improvement[1] of the performance of the test. That improvement seems related to [2], which is a fix for FLINK-35215 [3]. That suggests to me that the test isn't as isolated to the performance of the code it's trying to test as would be ideal. However, I've only just started looking at the test suite and trying to run it locally, so I'm not very well placed to judge.
> >
> > It does however suggest that this shouldn't be a blocker for the release.
> >
> > [1] http://flink-speed.xyz/changes/?rev=c1baf07d76&exe=6&env=3
> > [2] https://github.com/apache/flink/commit/c1baf07d7601a683f42997dc35dfaef4e41bc928
> > [3] https://issues.apache.org/jira/browse/FLINK-35215
> >
> > On Wed, 22 May 2024 at 00:15, Piotr Nowojski <pnowoj...@apache.org> wrote:
> >
> > > Hi,
> > >
> > > Given what you wrote, that you have investigated the issue and couldn't find any easy explanation, I would suggest closing this ticket as "Won't do" or "Cannot reproduce" and ignoring the problem.
> > >
> > > In the past there have been quite a few cases where some benchmark detected a performance regression. Sometimes those cannot be reproduced; other times (as is the case here), some seemingly unrelated change is causing the regression. The same thing has happened in this benchmark many times in the past [1], [2], [3], [4]. Generally speaking, this benchmark has been in the spotlight a couple of times [5].
> > >
> > > Note that there have been cases where this benchmark did detect a performance regression :)
> > >
> > > My personal suspicion is that after that commons-io version bump, something poked the JVM/JIT into compiling the string serialization code a bit differently, causing this regression. We have a couple of benchmarks that seem to be prone to such semi-intermittent issues. For example, the same benchmark was subject to this annoying pattern, which I've spotted in quite a few benchmarks over the years [6]:
> > >
> > > [image: image.png]
> > > (https://imgur.com/a/AoygmWS)
> > >
> > > Benchmark results are very stable within a single JVM fork, but between two forks they can reach two different "stable" levels. Here it looks like a 50% chance of getting a stable "200 records/ms" and a 50% chance of "250 records/ms".
> > >
> > > A small interlude: each of our benchmarks runs in 3 different JVM forks, with 10 warm-up iterations and 10 measurement iterations. Each iteration lasts/invokes the benchmarking method for at least one second. So by "very stable" results, I mean that, for example, after the 2nd or 3rd warm-up iteration the results stabilize to within +/-1% and stay at that level for the whole duration of the fork.
> > >
> > > Given that we are repeating the same benchmark in 3 different forks, we can have by pure chance:
> > > - 3 slow forks - total average 200 records/ms
> > > - 2 slow forks, 1 fast fork - average 216 r/ms
> > > - 1 slow fork, 2 fast forks - average 233 r/ms
> > > - 3 fast forks - average 250 r/ms
> > >
> > > So this benchmark is susceptible to entering different semi-stable states. As I wrote above, I guess something with the commons-io version bump just swayed it to a different semi-stable state :( I have never gotten desperate enough to actually dig into what exactly causes this kind of issue.
> > >
> > > Best,
> > > Piotrek
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-18684
> > > [2] https://issues.apache.org/jira/browse/FLINK-27133
> > > [3] https://issues.apache.org/jira/browse/FLINK-27165
> > > [4] https://issues.apache.org/jira/browse/FLINK-31745
> > > [5] https://issues.apache.org/jira/browse/FLINK-35040?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20Resolved%2C%20Closed)%20AND%20text%20~%20%22serializerHeavyString%22
> > > [6] http://flink-speed.xyz/timeline/#/?exe=1&ben=serializerHeavyString&extr=on&quarts=on&equid=off&env=2&revs=1000
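For anyone who hasn't run the suite, the setup Piotr describes above corresponds roughly to the JMH configuration sketched below. The class and method names are made up, and the exact values in flink-benchmarks may be configured elsewhere (e.g. on the command line), so treat this only as an illustration of his description:

    import java.util.concurrent.TimeUnit;

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.BenchmarkMode;
    import org.openjdk.jmh.annotations.Fork;
    import org.openjdk.jmh.annotations.Measurement;
    import org.openjdk.jmh.annotations.Mode;
    import org.openjdk.jmh.annotations.OutputTimeUnit;
    import org.openjdk.jmh.annotations.Warmup;

    // Harness shape described above: 3 independent JVM forks, each with 10 warm-up
    // and 10 measurement iterations, every iteration running the benchmark method
    // for at least one second.
    @BenchmarkMode(Mode.Throughput)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    @Fork(3)
    @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
    @Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
    public class ForkSetupSketch {

        @Benchmark
        public void heavyStringLikeWorkload() {
            // Placeholder body; the point is the fork/iteration layout. With a per-fork
            // "slow" (200 records/ms) vs "fast" (250 records/ms) steady state, the
            // average over 3 forks lands on 200, ~216.7, ~233.3 or 250 records/ms,
            // matching the levels listed above.
        }
    }

So a shift between those reported levels only needs one fork to settle into the other steady state; nothing on the measured path has to change.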
> > > On Tue, 21 May 2024 at 12:50, Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > >> Hi devs:
> > >>
> > >> We (the release managers of Flink 1.20) want to report one performance regression to the Flink dev mailing list.
> > >>
> > >> # Background:
> > >>
> > >> The performance of serializerHeavyString started to regress on April 3, and we created FLINK-35040[1] to track it.
> > >>
> > >> In brief:
> > >> - The performance only regresses on JDK 11; Java 8 and Java 17 are fine.
> > >> - The cause of the regression is upgrading the commons-io version from 2.11.0 to 2.15.1.
> > >> - This upgrade was done in FLINK-34955[2].
> > >> - The performance recovers after reverting the commons-io version to 2.11.0.
> > >>
> > >> You can get more details from FLINK-35040[1].
> > >>
> > >> # Problem
> > >>
> > >> We generated flame graphs (wall mode) to analyze why upgrading the commons-io version affects the performance. These flame graphs can be found in FLINK-35040[1]. (Many thanks to Zakelly for generating them on the benchmark server.)
> > >>
> > >> Unfortunately, we cannot find any code from the commons-io dependency being called.
> > >>
> > >> Also, we analyzed whether any other dependencies changed as part of the commons-io upgrade. The result is no; all other dependencies are exactly the same.
> > >>
> > >> # Request
> > >>
> > >> After the above analysis, we cannot find out why the performance of serializerHeavyString started to regress on JDK 11.
> > >>
> > >> We are looking forward to hearing valuable suggestions from the Flink community. Thanks everyone in advance.
> > >>
> > >> Note:
> > >> 1. I cannot reproduce the regression on my Mac with JDK 11, and we suspect this regression may be caused by the benchmark environment.
> > >> 2. We will accept this regression if the issue still cannot be solved.
> > >>
> > >> [1] https://issues.apache.org/jira/browse/FLINK-35040
> > >> [2] https://issues.apache.org/jira/browse/FLINK-34955
> > >>
> > >> Best,
> > >> Weijie, Ufuk, Robert and Rui
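Lastly, since Rui notes above that he cannot reproduce the regression on his Mac: the standalone loop below is roughly what I've been timing on JDK 11 while poking at this. It is my own simplification of the heavy-string path (class name made up, sizes arbitrary), not the actual serializerHeavyString benchmark, and a single un-forked run is exactly the kind of measurement Piotr's fork discussion warns about, so any number it prints should be treated with suspicion:

    import org.apache.flink.api.common.typeutils.base.StringSerializer;
    import org.apache.flink.core.memory.DataOutputSerializer;

    // Minimal approximation of the heavy-string serialization path: repeatedly
    // serialize one large String with Flink's StringSerializer and print a rough
    // records/ms figure. Not the flink-benchmarks code and not a proper JMH benchmark.
    public class HeavyStringLoopSketch {
        public static void main(String[] args) throws Exception {
            // Build a large string so per-record serialization cost dominates.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append("flink-").append(i);
            }
            String heavy = sb.toString();

            StringSerializer serializer = StringSerializer.INSTANCE;
            DataOutputSerializer out = new DataOutputSerializer(32 * 1024 * 1024);

            int records = 2_000;
            long start = System.nanoTime();
            for (int i = 0; i < records; i++) {
                out.clear(); // reuse the output buffer between records
                serializer.serialize(heavy, out);
            }
            double elapsedMs = (System.nanoTime() - start) / 1_000_000.0;
            System.out.printf("~%.1f records/ms (single, un-forked run)%n", records / elapsedMs);
        }
    }

If something this small showed the same JDK 11 difference on the benchmark machine, that would at least narrow the problem down to the serializer path rather than the harness.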