Hi Piotr and Zakelly,

Thanks a lot for providing these historical JIRAs related to serializerHeavyString performance. It seems a similar issue has happened multiple times.
And thanks for clarifying the principle of the benchmark. I have encountered the semi-stable state for other benchmarks as well.

> Available information indicates this issue is environment- and JDK-specific, and I also failed to reproduce it on my Mac.

Thanks for checking on your Mac as well.

> Considering the historical context of this test provided by Piotr, I vote a "Won't fix" for this problem.

I will update FLINK-35040[1] to "Won't fix" and update the priority from Blocker to Major if there are no objections before next Monday. Of course, it can be picked up or reopened if we find any new clues.

Best,
Rui

[1] https://issues.apache.org/jira/browse/FLINK-35040

On Wed, May 22, 2024 at 12:41 PM Zakelly Lan <zakelly....@gmail.com> wrote:

> Hi Rui and RMs of Flink 1.20,
>
> Thanks for driving this!
>
> Available information indicates this issue is environment- and JDK-specific, and I also failed to reproduce it on my Mac. Thus I guess it is caused by JIT behavior, which is unpredictable and vulnerable to disturbances in the codebase. Considering the historical context of this test provided by Piotr, I vote a "Won't fix" for this problem.
>
> I can also offer some help if anyone wants to investigate the benchmark environment; please reach out to me. JDK version info:
>
>> openjdk version "11.0.19" 2023-04-18 LTS
>> OpenJDK Runtime Environment (Red_Hat-11.0.19.0.7-2) (build 11.0.19+7-LTS)
>> OpenJDK 64-Bit Server VM (Red_Hat-11.0.19.0.7-2) (build 11.0.19+7-LTS, mixed mode, sharing)
>
> The OS version is Alibaba Cloud Linux 3.2104 LTS 64-bit[1]. The Linux kernel version is 5.10.134-15.al8.x86_64.
>
> Best,
> Zakelly
>
> [1] https://www.alibabacloud.com/help/en/alinux/product-overview/release-notes-for-alibaba-cloud-linux
> (See: Alibaba Cloud Linux 3.2104 U8, image id: aliyun_3_x64_20G_alibase_20230727.vhd)
>
> On Tue, May 21, 2024 at 8:15 PM Piotr Nowojski <pnowoj...@apache.org> wrote:
>
>> Hi,
>>
>> Given what you wrote, that you have investigated the issue and couldn't find any easy explanation, I would suggest closing this ticket as "Won't do" or "Cannot reproduce" and ignoring the problem.
>>
>> In the past there have been quite a few cases where some benchmark detected a performance regression. Sometimes those cannot be reproduced; other times (as is the case here), some seemingly unrelated change is causing the regression. The same thing has happened in this benchmark many times in the past [1], [2], [3], [4]. Generally speaking, this benchmark has been in the spotlight a couple of times [5].
>>
>> Note that there have also been cases where this benchmark did detect a real performance regression :)
>>
>> My personal suspicion is that after the commons-io version bump, something poked the JVM/JIT into compiling the string serialization code a bit differently, causing this regression. We have a couple of benchmarks that seem to be prone to such semi-intermittent issues. For example, the same benchmark was subject to this annoying pattern, which I've spotted in quite a few benchmarks over the years [6]:
>>
>> [image: image.png]
>> (https://imgur.com/a/AoygmWS)
>>
>> Benchmark results are very stable within a single JVM fork, but between two forks they can reach two different "stable" levels. Here it looks like there is a 50% chance of getting a stable "200 records/ms" and a 50% chance of "250 records/ms".
>>
>> A small interlude. Each of our benchmarks runs in 3 different JVM forks, with 10 warm-up iterations and 10 measurement iterations. Each iteration invokes the benchmarked method for at least one second. So by "very stable" results, I mean that, for example, after the 2nd or 3rd warm-up iteration the results stabilize to within +/-1% and stay at that level for the whole duration of the fork.
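>>
>> (For illustration only, a minimal JMH-style sketch of roughly that setup; the class name and dummy workload below are made up, and the actual annotations and defaults in flink-benchmarks may differ:)
>>
>> import java.util.concurrent.TimeUnit;
>>
>> import org.openjdk.jmh.annotations.Benchmark;
>> import org.openjdk.jmh.annotations.BenchmarkMode;
>> import org.openjdk.jmh.annotations.Fork;
>> import org.openjdk.jmh.annotations.Measurement;
>> import org.openjdk.jmh.annotations.Mode;
>> import org.openjdk.jmh.annotations.OutputTimeUnit;
>> import org.openjdk.jmh.annotations.Warmup;
>>
>> // 3 JVM forks x (10 warm-up + 10 measurement) iterations, each lasting at least 1 second.
>> @Fork(3)
>> @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
>> @Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
>> @BenchmarkMode(Mode.Throughput)
>> @OutputTimeUnit(TimeUnit.MILLISECONDS)
>> public class ExampleBenchmark {
>>
>>     @Benchmark
>>     public long benchmarkedMethod() {
>>         // Placeholder workload only; the real benchmark serializes heavy strings.
>>         long sum = 0;
>>         for (int i = 0; i < 1_000; i++) {
>>             sum += i;
>>         }
>>         return sum;
>>     }
>> }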
>>
>> Given that we are repeating the same benchmark in 3 different forks, we can have by pure chance:
>> - 3 slow forks - total average 200 records/ms
>> - 2 slow forks, 1 fast fork - average 216 r/ms
>> - 1 slow fork, 2 fast forks - average 233 r/ms
>> - 3 fast forks - average 250 r/ms
>>
>> So this benchmark is susceptible to settling into different semi-stable states. As I wrote above, I guess the commons-io version bump just swayed it into a different semi-stable state :( I have never gotten desperate enough to dig further into what exactly causes this kind of issue.
>>
>> Best,
>> Piotrek
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-18684
>> [2] https://issues.apache.org/jira/browse/FLINK-27133
>> [3] https://issues.apache.org/jira/browse/FLINK-27165
>> [4] https://issues.apache.org/jira/browse/FLINK-31745
>> [5] https://issues.apache.org/jira/browse/FLINK-35040?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20Resolved%2C%20Closed)%20AND%20text%20~%20%22serializerHeavyString%22
>> [6] http://flink-speed.xyz/timeline/#/?exe=1&ben=serializerHeavyString&extr=on&quarts=on&equid=off&env=2&revs=1000
>>
>> On Tue, May 21, 2024 at 12:50 PM Rui Fan <1996fan...@gmail.com> wrote:
>>
>>> Hi devs,
>>>
>>> We (the release managers of Flink 1.20) want to bring one performance regression to the attention of the Flink dev mailing list.
>>>
>>> # Background:
>>>
>>> The performance of serializerHeavyString has regressed since April 3, and we created FLINK-35040[1] to track it.
>>>
>>> In brief:
>>> - The performance only regresses on JDK 11; Java 8 and Java 17 are fine.
>>> - The regression is caused by upgrading the commons-io version from 2.11.0 to 2.15.1.
>>> - This upgrade was done in FLINK-34955[2].
>>> - The performance recovers after reverting the commons-io version to 2.11.0.
>>>
>>> You can get more details from FLINK-35040[1].
>>>
>>> # Problem
>>>
>>> We generated flame graphs (wall mode) to analyze why upgrading the commons-io version affects the performance. These flame graphs can be found in FLINK-35040[1]. (Many thanks to Zakelly for generating them on the benchmark server.)
>>>
>>> Unfortunately, we cannot find any commons-io code being called. We also checked whether any other dependencies changed along with the commons-io upgrade; they did not, all other dependencies are exactly the same.
>>>
>>> # Request
>>>
>>> After the above analysis, we cannot find out why the performance of serializerHeavyString regresses on JDK 11.
>>>
>>> We are looking forward to hearing valuable suggestions from the Flink community. Thanks everyone in advance.
>>>
>>> Note:
>>> 1. I cannot reproduce the regression on my Mac with JDK 11, and we suspect this regression may be caused by the benchmark environment.
>>> 2. We will accept this regression if the issue still cannot be solved.
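>>>
>>> (In case anyone wants to poke at this locally: the benchmark presumably boils down to Flink's StringSerializer writing long strings. Below is a rough standalone sketch of that primitive only; it is not the actual benchmark code, and the string length, buffer size, and loop count are made-up numbers:)
>>>
>>> import java.io.IOException;
>>>
>>> import org.apache.flink.api.common.typeutils.base.StringSerializer;
>>> import org.apache.flink.core.memory.DataOutputSerializer;
>>>
>>> public class HeavyStringSerializationSketch {
>>>
>>>     public static void main(String[] args) throws IOException {
>>>         // Build a "heavy" string; the real benchmark's payload size may differ.
>>>         StringBuilder sb = new StringBuilder();
>>>         for (int i = 0; i < 10_000; i++) {
>>>             sb.append('x');
>>>         }
>>>         String heavyString = sb.toString();
>>>
>>>         DataOutputSerializer out = new DataOutputSerializer(64 * 1024);
>>>         int iterations = 100_000;
>>>         long start = System.nanoTime();
>>>         for (int i = 0; i < iterations; i++) {
>>>             out.clear();
>>>             StringSerializer.INSTANCE.serialize(heavyString, out);
>>>         }
>>>         long nanos = System.nanoTime() - start;
>>>         System.out.println("records/ms: " + (iterations / (nanos / 1_000_000.0)));
>>>     }
>>> }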
>>>
>>> [1] https://issues.apache.org/jira/browse/FLINK-35040
>>> [2] https://issues.apache.org/jira/browse/FLINK-34955
>>>
>>> Best,
>>> Weijie, Ufuk, Robert and Rui
>>