Thanks Piotr and Zakelly for your advice. I have also tested this on my local machine, and I cannot reproduce it even after switching to JDK 11. This must have something to do with the benchmark's testing environment.
> I will update FLINK-35040[1] to "Won't fix" and update the Priority from Blocker to Major if there are no objections before next Monday.

+1 from my side and thanks for your thorough investigation!

Best regards,
Weijie

On Wed, May 22, 2024 at 14:52, Rui Fan <1996fan...@gmail.com> wrote:

> Hi Piotr and Zakelly,
>
> Thanks a lot for providing these historical JIRAs related to serializerHeavyString performance. It seems a similar issue has happened multiple times.
>
> And thanks for clarifying how the benchmark works. I have encountered the semi-stable state for other benchmarks as well.
>
> > Available information indicates this issue is environment- and JDK-specific, and I also failed to reproduce it on my Mac.
>
> Thanks for checking on your Mac as well.
>
> > Considering the historical context of this test provided by Piotr, I vote a "Won't fix" for this problem.
>
> I will update FLINK-35040[1] to "Won't fix" and update the Priority from Blocker to Major if there are no objections before next Monday.
>
> Of course, it can be picked up or reopened if we have any new clues.
>
> Best,
> Rui
>
> [1] https://issues.apache.org/jira/browse/FLINK-35040
>
> On Wed, May 22, 2024 at 12:41 PM Zakelly Lan <zakelly....@gmail.com> wrote:
>
>> Hi Rui and RMs of Flink 1.20,
>>
>> Thanks for driving this!
>>
>> Available information indicates this issue is environment- and JDK-specific, and I also failed to reproduce it on my Mac. Thus I guess it is caused by JIT behavior, which is unpredictable and vulnerable to disturbances in the codebase. Considering the historical context of this test provided by Piotr, I vote a "Won't fix" for this problem.
>>
>> And I can offer some help if anyone wants to investigate the benchmark environment; please reach out to me. JDK version info:
>>
>>> openjdk version "11.0.19" 2023-04-18 LTS
>>> OpenJDK Runtime Environment (Red_Hat-11.0.19.0.7-2) (build 11.0.19+7-LTS)
>>> OpenJDK 64-Bit Server VM (Red_Hat-11.0.19.0.7-2) (build 11.0.19+7-LTS, mixed mode, sharing)
>>
>> The OS version is Alibaba Cloud Linux 3.2104 LTS 64-bit[1]. The Linux kernel version is 5.10.134-15.al8.x86_64.
>>
>> Best,
>> Zakelly
>>
>> [1] https://www.alibabacloud.com/help/en/alinux/product-overview/release-notes-for-alibaba-cloud-linux (See: Alibaba Cloud Linux 3.2104 U8, image id: aliyun_3_x64_20G_alibase_20230727.vhd)
>>
>> On Tue, May 21, 2024 at 8:15 PM Piotr Nowojski <pnowoj...@apache.org> wrote:
>>> Hi,
>>>
>>> Given what you wrote, that you have investigated the issue and couldn't find any easy explanation, I would suggest closing this ticket as "Won't do" or "Cannot reproduce" and ignoring the problem.
>>>
>>> In the past there have been quite a few cases where some benchmark detected a performance regression. Sometimes those could not be reproduced; other times (as is the case here), some seemingly unrelated change was causing the regression. The same thing has happened in this benchmark many times in the past [1], [2], [3], [4]. Generally speaking, this benchmark has been in the spotlight a couple of times [5].
>>>
>>> Note that there have been cases where this benchmark did detect a performance regression :)
>>>
>>> My personal suspicion is that after that commons-io version bump, something poked the JVM/JIT into compiling the string serialization code a bit differently, causing this regression. We have a couple of benchmarks that seem to be prone to such semi-intermittent issues. For example, this same benchmark was subject to the following annoying pattern, which I've spotted in quite a few benchmarks over the years [6]:
>>>
>>> [image: benchmark timeline, see https://imgur.com/a/AoygmWS]
>>>
>>> Benchmark results are very stable within a single JVM fork, but between two forks they can reach two different "stable" levels. Here it looks like a 50% chance of getting a stable "200 records/ms" and a 50% chance of "250 records/ms".
>>>
>>> A small interlude: each of our benchmarks runs in 3 different JVM forks, with 10 warm-up iterations and 10 measurement iterations. Each iteration lasts/invokes the benchmarking method for at least one second. So by "very stable" results I mean that, for example, after the 2nd or 3rd warm-up iteration the results stabilize to within +/-1% and stay at that level for the whole duration of the fork.
>>>
>>> Given that we are repeating the same benchmark in 3 different forks, we can have by pure chance:
>>> - 3 slow forks - total average 200 records/ms
>>> - 2 slow forks, 1 fast fork - average 216 r/ms
>>> - 1 slow fork, 2 fast forks - average 233 r/ms
>>> - 3 fast forks - average 250 r/ms
>>>
>>> So this benchmark is susceptible to entering different semi-stable states. As I wrote above, I guess something about the commons-io version bump just swayed it into a different semi-stable state :( I have never gotten desperate enough to actually dig into what exactly causes this kind of issue.
>>>
>>> Best,
>>> Piotrek
>>>
>>> [1] https://issues.apache.org/jira/browse/FLINK-18684
>>> [2] https://issues.apache.org/jira/browse/FLINK-27133
>>> [3] https://issues.apache.org/jira/browse/FLINK-27165
>>> [4] https://issues.apache.org/jira/browse/FLINK-31745
>>> [5] https://issues.apache.org/jira/browse/FLINK-35040?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20Resolved%2C%20Closed)%20AND%20text%20~%20%22serializerHeavyString%22
>>> [6] http://flink-speed.xyz/timeline/#/?exe=1&ben=serializerHeavyString&extr=on&quarts=on&equid=off&env=2&revs=1000
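The benchmark setup Piotr describes above corresponds roughly to the JMH configuration sketched below. This is only an illustrative sketch with hypothetical class and method names, not the actual flink-benchmarks code; it simply shows the 3 forks plus the 10 warm-up and 10 measurement iterations of at least one second each.

    import java.util.concurrent.TimeUnit;

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.BenchmarkMode;
    import org.openjdk.jmh.annotations.Fork;
    import org.openjdk.jmh.annotations.Measurement;
    import org.openjdk.jmh.annotations.Mode;
    import org.openjdk.jmh.annotations.OutputTimeUnit;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.State;
    import org.openjdk.jmh.annotations.Warmup;

    // Hypothetical sketch of the setup described above, not the real benchmark class.
    @State(Scope.Thread)
    @BenchmarkMode(Mode.Throughput)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)  // results reported as operations (records) per ms
    @Fork(3)                                // 3 independent JVM forks, averaged together
    @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
    @Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
    public class SerializerBenchmarkSketch {

        private final String payload = "heavy string payload ".repeat(100);

        @Benchmark
        public int serializeHeavyString() {
            // Placeholder for the real serialization work; a slower JIT compilation
            // of this path in one fork lowers that fork's records/ms and drags down
            // the 3-fork average.
            return payload.getBytes().length;
        }
    }

Because the reported number is the average over the 3 forks, a single fork that happens to settle on the slower semi-stable level is enough to pull the result down, which is exactly the 200/216/233/250 records/ms pattern listed above.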
>>> On Tue, May 21, 2024 at 12:50 PM Rui Fan <1996fan...@gmail.com> wrote:
>>>
>>>> Hi devs,
>>>>
>>>> We (the release managers of Flink 1.20) want to report one performance regression to the Flink dev mailing list.
>>>>
>>>> # Background:
>>>>
>>>> The performance of serializerHeavyString started to regress on April 3, and we created FLINK-35040[1] to follow it.
>>>>
>>>> In brief:
>>>> - The performance only regresses on JDK 11; Java 8 and Java 17 are fine.
>>>> - The regression is caused by upgrading the commons-io version from 2.11.0 to 2.15.1.
>>>> - This upgrade was done in FLINK-34955[2].
>>>> - The performance recovers after reverting the commons-io version to 2.11.0.
>>>>
>>>> You can get more details from FLINK-35040[1].
>>>>
>>>> # Problem
>>>>
>>>> We tried to generate flame graphs (wall mode) to analyze why upgrading the commons-io version affects the performance. These flame graphs can be found in FLINK-35040[1]. (Many thanks to Zakelly for generating these flame graphs on the benchmark server.)
>>>>
>>>> Unfortunately, we cannot find any code from the commons-io dependency being called. We also checked whether any other dependencies changed when upgrading the commons-io version; they did not, all other dependencies are exactly the same.
>>>>
>>>> # Request
>>>>
>>>> After the above analysis, we cannot find why the performance of serializerHeavyString started to regress on JDK 11.
>>>>
>>>> We are looking forward to hearing valuable suggestions from the Flink community. Thanks everyone in advance.
>>>>
>>>> Note:
>>>> 1. I cannot reproduce the regression on my Mac with JDK 11, and we suspect this regression may be caused by the benchmark environment.
>>>> 2. We will accept this regression if the issue still cannot be solved.
>>>>
>>>> [1] https://issues.apache.org/jira/browse/FLINK-35040
>>>> [2] https://issues.apache.org/jira/browse/FLINK-34955
>>>>
>>>> Best,
>>>> Weijie, Ufuk, Robert and Rui
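For anyone trying to reproduce the background above, one quick sanity check is to confirm which commons-io build is actually resolved on the benchmark classpath, for example after reverting to 2.11.0. The snippet below is only an illustrative sketch (the class name is hypothetical); it reads the Implementation-Version entry from the commons-io jar manifest and may print null if that entry is missing.

    import org.apache.commons.io.IOUtils;

    // Hypothetical diagnostic: print the commons-io version and jar location
    // actually loaded, to verify whether 2.15.1 or the 2.11.0 revert is in effect.
    public class CommonsIoVersionCheck {
        public static void main(String[] args) {
            System.out.println("commons-io version: "
                    + IOUtils.class.getPackage().getImplementationVersion());
            System.out.println("loaded from: "
                    + IOUtils.class.getProtectionDomain().getCodeSource().getLocation());
        }
    }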