[Bug middle-end/107905] 2x slowdown versus CLANG and ICL

sanmayce at sanmayce dot com via Gcc-bugs Wed, 30 Nov 2022 07:01:18 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107905


--- Comment #7 from Georgi <sanmayce at sanmayce dot com> ---
(In reply to Alexander Monakov from comment #5)
> Not sure what you don't like about the inputs, they appear quite reasonable.
> Perhaps GCC's estimation of bb frequencies is off (with profile feedback we
> achieve good performance).
> 
> Georgi: you'll likely see better results with profile-guided optimization.
> You can first compile the benchmark with -O2 -fprofile-generate, run the
> output (it will generate *.gcda files), then compile again with -O2
> -fprofile-use. For Clang the options are spelled -fprofile-instr-generate
> and -fprofile-instr-use, respectively.

Thank you Alexander, your help is much appreciated.
This is the first time I have seen such a problem, and I have no idea what the
root cause is.
Regarding the inputs, I agree: they are not only okay, they form one DEFINITIVE
benchmark. Anyway, I am glad to see the problem resolved when using your
-fdisable-rtl-bbro:

Rerunning, on Zen2 4800H:

D:\Wildtest_2022-Nov-29>3

D:\Wildtest_2022-Nov-29>wildtest_CLANG_14.0.1.exe
WILDTEST, wildcard benchmark written by Dogan Kurt (dogan.k...@dodobyte.com),
modified by Kaze (twitter.com/Sanmayce), revision 2022-Nov-29.

Dogan Kurt's 'Antimalware', 2016, Iterative (wild_iterative):
70.441000 s
Dogan Kurt's 'Antimalware', 2016, Iterative Optimised (wild_iterative_opt):
61.099000 s
Tcheburaschka_r3, 2022, (Tcheburaschka_Wildcard_Iterative_Kaze_CaseSensitive):
72.853000 s
JackHandy_Iterative, 2005, (IterativeWildcards):
80.415000 s
Kirk J. Krauss, 2014, DrDobbs (FastWildCompare):
42.279000 s
Alessandro Cantatore, 2003, (szWildMatch7):
97.681000 s
Nondeterministic Finite Automaton (wild_nfa):
162.023000 s

D:\Wildtest_2022-Nov-29>wildtest_Intel_19.0.exe
WILDTEST, wildcard benchmark written by Dogan Kurt (dogan.k...@dodobyte.com),
modified by Kaze (twitter.com/Sanmayce), revision 2022-Nov-29.

Dogan Kurt's 'Antimalware', 2016, Iterative (wild_iterative):
98.859000 s
Dogan Kurt's 'Antimalware', 2016, Iterative Optimised (wild_iterative_opt):
72.175000 s
Tcheburaschka_r3, 2022, (Tcheburaschka_Wildcard_Iterative_Kaze_CaseSensitive):
70.278000 s
JackHandy_Iterative, 2005, (IterativeWildcards):
107.693000 s
Kirk J. Krauss, 2014, DrDobbs (FastWildCompare):
45.988000 s
Alessandro Cantatore, 2003, (szWildMatch7):
79.309000 s
Nondeterministic Finite Automaton (wild_nfa):
170.198000 s

D:\Wildtest_2022-Nov-29>wildtest_GCC_11.3.0.exe
WILDTEST, wildcard benchmark written by Dogan Kurt (dogan.k...@dodobyte.com),
modified by Kaze (twitter.com/Sanmayce), revision 2022-Nov-29.

Dogan Kurt's 'Antimalware', 2016, Iterative (wild_iterative):
113.664000 s
Dogan Kurt's 'Antimalware', 2016, Iterative Optimised (wild_iterative_opt):
72.522000 s
Tcheburaschka_r3, 2022, (Tcheburaschka_Wildcard_Iterative_Kaze_CaseSensitive):
78.138000 s
JackHandy_Iterative, 2005, (IterativeWildcards):
83.266000 s
Kirk J. Krauss, 2014, DrDobbs (FastWildCompare):
56.878000 s
Alessandro Cantatore, 2003, (szWildMatch7):
119.861000 s
Nondeterministic Finite Automaton (wild_nfa):
176.290000 s

I hope future GCC releases will fix this; the Tcheburaschka function is good in my view.
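
For readers who have not seen the kind of function being timed above, here is a
minimal sketch of an iterative wildcard matcher supporting '*' and '?'. It is
NOT the code of any function in the benchmark; the name wild_match_sketch and
the exact structure are my own assumptions for illustration. The point is only
that the hot loop consists of several data-dependent branches, which is the
kind of code where basic block layout can plausibly affect speed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical illustration, not taken from the benchmark sources.
 * Returns true if `text` matches `pattern`, where '*' matches any run of
 * characters (including none) and '?' matches exactly one character. */
bool wild_match_sketch(const char *pattern, const char *text)
{
    const char *star = NULL;   /* pattern position just after the last '*' */
    const char *retry = NULL;  /* text position to resume from on mismatch */

    while (*text != '\0') {
        if (*pattern == '*') {
            /* Remember where to backtrack to; '*' matches nothing so far. */
            star = ++pattern;
            retry = text;
        } else if (*pattern == '?' || *pattern == *text) {
            /* Single-character match: advance both. */
            ++pattern;
            ++text;
        } else if (star != NULL) {
            /* Mismatch after a '*': let that '*' absorb one more character. */
            pattern = star;
            text = ++retry;
        } else {
            return false;
        }
    }

    /* Text is exhausted; only trailing '*' may remain in the pattern. */
    while (*pattern == '*')
        ++pattern;
    return *pattern == '\0';
}
```

For example, wild_match_sketch("a*b", "aXXb") returns true, and
wild_match_sketch("a*b", "aXXc") returns false.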
