Hi Richard, Thanks a lot for your response! Another failure reported by the Linaro CI is as follows : (Note: I am planning to send a separate mail for each failure, as this will make the discussion easy to track)
FAIL: gcc.target/aarch64/sve/acle/general/cpy_1.c -march=armv8.2-a+sve -moverride=tune=none check-function-bodies dup_x0_m Expected code: ... add (x[0-9]+), x0, #?1 mov (p[0-7])\.b, p15\.b mov z0\.d, \2/m, \1 ... ret Code obtained w/o patch: addvl sp, sp, #-1 str p15, [sp] add x0, x0, 1 mov p3.b, p15.b mov z0.d, p3/m, x0 ldr p15, [sp] addvl sp, sp, #1 ret Code obtained w/ patch: addvl sp, sp, #-1 str p15, [sp] mov p3.b, p15.b add x0, x0, 1 mov z0.d, p3/m, x0 ldr p15, [sp] addvl sp, sp, #1 ret As we can see, with the patch, the following two instructions are interchanged: add x0, x0, 1 mov p3.b, p15.b I believe that this is fine and the test can be modified to allow it to pass on aarch64. Please let me know what you think. Regards, Surya On 24/11/23 4:18 pm, Richard Earnshaw wrote: > > > On 24/11/2023 08:09, Surya Kumari Jangala via Gcc wrote: >> Hi Richard, >> Ping. Please let me know if the test failure that I mentioned in the mail >> below can be handled by changing the expected generated code. I am not >> conversant with arm, and hence would appreciate your help. >> >> Regards, >> Surya >> >> On 03/11/23 4:58 pm, Surya Kumari Jangala wrote: >>> Hi Richard, >>> I had submitted a patch for review >>> (https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631849.html) >>> regarding scaling save/restore costs of callee save registers with block >>> frequency in the IRA pass (PR111673). >>> >>> This patch has been approved by VMakarov >>> (https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632089.html). >>> >>> With this patch, we are seeing performance improvements with spec on x86 >>> (exchange: 5%, xalancbmk: 2.5%) and on Power (perlbench: 5.57%). >>> >>> I received a mail from Linaro about some failures seen in the CI pipeline >>> with >>> this patch. I have analyzed the failures and I wish to discuss the analysis >>> with you. >>> >>> One failure reported by the Linaro CI is: >>> >>> FAIL: gcc.target/arm/pr111235.c scan-assembler-times ldrexd\tr[0-9]+, >>> r[0-9]+, \\[r[0-9]+\\] 2 >>> >>> The diff in the assembly between trunk and patch is: >>> >>> 93c93 >>> < push {r4, r5} >>> --- >>>> push {fp} >>> 95c95 >>> < ldrexd r4, r5, [r0] >>> --- >>>> ldrexd fp, ip, [r0] >>> 99c99 >>> < pop {r4, r5} >>> --- >>>> ldr fp, [sp], #4 >>> >>> >>> The test fails with patch because the ldrexd insn uses fp & ip registers >>> instead >>> of r[0-9]+ >>> >>> But the code produced by patch is better because it is pushing and >>> restoring only >>> one register (fp) instead of two registers (r4, r5). Hence, this test can be >>> modified to allow it to pass on arm. Please let me know what you think. >>> >>> If you need more information, please let me know. I will be sending >>> separate mails >>> for the other test failures. >>> > > Thanks for looking at this. > > > The key part of this test is that the compiler generates LDREXD. The > registers used for that are pretty much irrelevant as we don't match them to > any other operations within the test. So I'd recommend just testing for the > mnemonic and not for any of the operands (ie just match "ldrexd\t"). > > R. > >>> Regards, >>> Surya >>> >>> >>>