Thanks Piotr and Zakelly for your advice. I have also tested this on my local machine, and I cannot reproduce it even after switching to JDK 11. This must have something to do with the benchmark's testing environment.
> I will update FLINK-35040[1] to "Won't fix" and update the Priority from Blocker to Major if there are no objections before next Monday.

+1 from my side and thanks for your thorough investigation!

Best regards,
Weijie

On Wed, May 22, 2024 at 14:52, Rui Fan <1996fan...@gmail.com> wrote:

> Hi Piotr and Zakelly,
>
> Thanks a lot for providing these historical JIRAs related to serializerHeavyString performance. It seems a similar issue has happened multiple times.
>
> And thanks for clarifying how the benchmark works. I have encountered the semi-stable state for other benchmarks as well.
>
> > Available information indicates this issue is environment- and JDK-specific, and I also failed to reproduce it on my Mac.
>
> Thanks for checking on your Mac as well.
>
> > Considering the historical context of this test provided by Piotr, I vote a "Won't fix" for this problem.
>
> I will update FLINK-35040[1] to "Won't fix" and update the Priority from Blocker to Major if there are no objections before next Monday.
>
> Of course, it can be picked up or reopened if we have any new clues.
>
> Best,
> Rui
>
> [1] https://issues.apache.org/jira/browse/FLINK-35040
>
> On Wed, May 22, 2024 at 12:41 PM Zakelly Lan <zakelly....@gmail.com> wrote:
>
>> Hi Rui and RMs of Flink 1.20,
>>
>> Thanks for driving this!
>>
>> Available information indicates this issue is environment- and JDK-specific, and I also failed to reproduce it on my Mac. Thus I guess it is caused by JIT behavior, which is unpredictable and vulnerable to disturbances in the codebase. Considering the historical context of this test provided by Piotr, I vote a "Won't fix" for this problem.
>>
>> And I can offer some help if anyone wants to investigate the benchmark environment; please reach out to me. JDK version info:
>>
>>> openjdk version "11.0.19" 2023-04-18 LTS
>>> OpenJDK Runtime Environment (Red_Hat-11.0.19.0.7-2) (build 11.0.19+7-LTS)
>>> OpenJDK 64-Bit Server VM (Red_Hat-11.0.19.0.7-2) (build 11.0.19+7-LTS, mixed mode, sharing)
>>
>> The OS version is Alibaba Cloud Linux 3.2104 LTS 64-bit[1]. The Linux kernel version is 5.10.134-15.al8.x86_64.
>>
>> Best,
>> Zakelly
>>
>> [1] https://www.alibabacloud.com/help/en/alinux/product-overview/release-notes-for-alibaba-cloud-linux (See: Alibaba Cloud Linux 3.2104 U8, image id: aliyun_3_x64_20G_alibase_20230727.vhd)
>>
>> On Tue, May 21, 2024 at 8:15 PM Piotr Nowojski <pnowoj...@apache.org> wrote:
>>> Hi,
>>>
>>> Given what you wrote, that you have investigated the issue and couldn't find any easy explanation, I would suggest closing this ticket as "Won't do" or "Cannot reproduce" and ignoring the problem.
>>>
>>> In the past there have been quite a few cases where some benchmark detected a performance regression. Sometimes those could not be reproduced; other times (as is the case here), some seemingly unrelated change was causing the regression. The same thing has happened in this benchmark many times in the past [1], [2], [3], [4]. Generally speaking, this benchmark has been in the spotlight a couple of times [5].
>>>
>>> Note that there have been cases where this benchmark did detect a performance regression :)
>>>
>>> My personal suspicion is that after that commons-io version bump, something poked the JVM/JIT into compiling the string serialization code a bit differently, causing this regression. We have a couple of benchmarks that seem to be prone to such semi-intermittent issues. For example, this same benchmark was subject to the following annoying pattern, which I've spotted in quite a few benchmarks over the years [6]:
>>>
>>> [image: benchmark timeline, see https://imgur.com/a/AoygmWS]
>>>
>>> Benchmark results are very stable within a single JVM fork, but between two forks they can reach two different "stable" levels. Here it looks like a 50% chance of getting a stable "200 records/ms" and a 50% chance of "250 records/ms".
>>>
>>> A small interlude: each of our benchmarks runs in 3 different JVM forks, with 10 warm-up iterations and 10 measurement iterations. Each iteration lasts/invokes the benchmarking method for at least one second. So by "very stable" results I mean that, for example, after the 2nd or 3rd warm-up iteration the results stabilize to within +/-1% and stay at that level for the whole duration of the fork.
>>>
>>> Given that we are repeating the same benchmark in 3 different forks, we can have by pure chance:
>>> - 3 slow forks - total average 200 records/ms
>>> - 2 slow forks, 1 fast fork - average 216 r/ms
>>> - 1 slow fork, 2 fast forks - average 233 r/ms
>>> - 3 fast forks - average 250 r/ms
>>>
>>> So this benchmark is susceptible to entering different semi-stable states. As I wrote above, I guess something about the commons-io version bump just swayed it into a different semi-stable state :( I have never gotten desperate enough to actually dig into what exactly causes this kind of issue.
>>>
>>> Best,
>>> Piotrek
>>>
>>> [1] https://issues.apache.org/jira/browse/FLINK-18684
>>> [2] https://issues.apache.org/jira/browse/FLINK-27133
>>> [3] https://issues.apache.org/jira/browse/FLINK-27165
>>> [4] https://issues.apache.org/jira/browse/FLINK-31745
>>> [5] https://issues.apache.org/jira/browse/FLINK-35040?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20Resolved%2C%20Closed)%20AND%20text%20~%20%22serializerHeavyString%22
>>> [6] http://flink-speed.xyz/timeline/#/?exe=1&ben=serializerHeavyString&extr=on&quarts=on&equid=off&env=2&revs=1000
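The benchmark setup Piotr describes above corresponds roughly to the JMH configuration sketched below. This is only an illustrative sketch with hypothetical class and method names, not the actual flink-benchmarks code; it simply shows the 3 forks plus the 10 warm-up and 10 measurement iterations of at least one second each.

    import java.util.concurrent.TimeUnit;

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.BenchmarkMode;
    import org.openjdk.jmh.annotations.Fork;
    import org.openjdk.jmh.annotations.Measurement;
    import org.openjdk.jmh.annotations.Mode;
    import org.openjdk.jmh.annotations.OutputTimeUnit;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.State;
    import org.openjdk.jmh.annotations.Warmup;

    // Hypothetical sketch of the setup described above, not the real benchmark class.
    @State(Scope.Thread)
    @BenchmarkMode(Mode.Throughput)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)  // results reported as operations (records) per ms
    @Fork(3)                                // 3 independent JVM forks, averaged together
    @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
    @Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
    public class SerializerBenchmarkSketch {

        private final String payload = "heavy string payload ".repeat(100);

        @Benchmark
        public int serializeHeavyString() {
            // Placeholder for the real serialization work; a slower JIT compilation
            // of this path in one fork lowers that fork's records/ms and drags down
            // the 3-fork average.
            return payload.getBytes().length;
        }
    }

Because the reported number is the average over the 3 forks, a single fork that happens to settle on the slower semi-stable level is enough to pull the result down, which is exactly the 200/216/233/250 records/ms pattern listed above.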
>>> On Tue, May 21, 2024 at 12:50 PM Rui Fan <1996fan...@gmail.com> wrote:
>>>
>>>> Hi devs,
>>>>
>>>> We (the release managers of Flink 1.20) want to report one performance regression to the Flink dev mailing list.
>>>>
>>>> # Background:
>>>>
>>>> The performance of serializerHeavyString started to regress on April 3, and we created FLINK-35040[1] to follow it.
>>>>
>>>> In brief:
>>>> - The performance only regresses on JDK 11; Java 8 and Java 17 are fine.
>>>> - The regression is caused by upgrading the commons-io version from 2.11.0 to 2.15.1.
>>>> - This upgrade was done in FLINK-34955[2].
>>>> - The performance recovers after reverting the commons-io version to 2.11.0.
>>>>
>>>> You can get more details from FLINK-35040[1].
>>>>
>>>> # Problem
>>>>
>>>> We tried to generate flame graphs (wall mode) to analyze why upgrading the commons-io version affects the performance. These flame graphs can be found in FLINK-35040[1]. (Many thanks to Zakelly for generating these flame graphs on the benchmark server.)
>>>>
>>>> Unfortunately, we cannot find any code from the commons-io dependency being called. We also checked whether any other dependencies changed when upgrading the commons-io version; they did not, all other dependencies are exactly the same.
>>>>
>>>> # Request
>>>>
>>>> After the above analysis, we cannot find why the performance of serializerHeavyString started to regress on JDK 11.
>>>>
>>>> We are looking forward to hearing valuable suggestions from the Flink community. Thanks everyone in advance.
>>>>
>>>> Note:
>>>> 1. I cannot reproduce the regression on my Mac with JDK 11, and we suspect this regression may be caused by the benchmark environment.
>>>> 2. We will accept this regression if the issue still cannot be solved.
>>>>
>>>> [1] https://issues.apache.org/jira/browse/FLINK-35040
>>>> [2] https://issues.apache.org/jira/browse/FLINK-34955
>>>>
>>>> Best,
>>>> Weijie, Ufuk, Robert and Rui
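For anyone trying to reproduce the background above, one quick sanity check is to confirm which commons-io build is actually resolved on the benchmark classpath, for example after reverting to 2.11.0. The snippet below is only an illustrative sketch (the class name is hypothetical); it reads the Implementation-Version entry from the commons-io jar manifest and may print null if that entry is missing.

    import org.apache.commons.io.IOUtils;

    // Hypothetical diagnostic: print the commons-io version and jar location
    // actually loaded, to verify whether 2.15.1 or the 2.11.0 revert is in effect.
    public class CommonsIoVersionCheck {
        public static void main(String[] args) {
            System.out.println("commons-io version: "
                    + IOUtils.class.getPackage().getImplementationVersion());
            System.out.println("loaded from: "
                    + IOUtils.class.getProtectionDomain().getCodeSource().getLocation());
        }
    }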