On 10/11/2023 14:46, Kyrylo Tkachov wrote:
-----Original Message-----
From: Richard Earnshaw <richard.earns...@foss.arm.com>
Sent: Friday, November 10, 2023 11:31 AM
To: Wilco Dijkstra <wilco.dijks...@arm.com>; Kyrylo Tkachov
<kyrylo.tkac...@arm.com>; GCC Patches <gcc-patches@gcc.gnu.org>
Cc: Richard Sandiford <richard.sandif...@arm.com>; Richard Earnshaw
<richard.earns...@arm.com>
Subject: Re: [PATCH] AArch64: Cleanup memset expansion
On 10/11/2023 10:17, Wilco Dijkstra wrote:
Hi Kyrill,
+ /* Reduce the maximum size with -Os. */
+ if (optimize_function_for_size_p (cfun))
+ max_set_size = 96;
+
.... This is a new "magic" number in this code. It looks sensible, but how
did you arrive at it?
We need 1 instruction to create the value to store (DUP or MOVI) and 1 STP
for every 32 bytes, so the 96 means 4 instructions for typical sizes
(sizes not
a multiple of 16 can add one extra instruction).
It would be useful to have that reasoning in the comment.
I checked codesize on SPECINT2017, and 96 had practically identical size.
Using 128 would also be a reasonable Os value with a very slight size
increase,
and 384 looks good for O2 - however I didn't want to tune these values
as this
is a cleanup patch.
Cheers,
Wilco
Shouldn't this be a param then? Also, manifest constants in the middle
of code are a potential nightmare, please move it to a #define (even if
that's then used as the default value for the param).
I agree on making this a #define but I wouldn't insist on a param.
Code size IMO has a much more consistent right or wrong answer as it's
statically determinable.
It this was a speed-related param then I'd expect the flexibility for the power
user to override such heuristics would be more widely useful.
But for code size the compiler should always be able to get it right.
If Richard would still like the param then I'm fine with having the param, but
I'd be okay with the comment above and making this a #define.
I don't immediately have a feel for how sensitive code would be to the
precise value here. Is this value something that might affect
individual benchmarks in different ways? Or something where a future
architecture might want a different value? For either of those reasons
a param might be useful, but if this is primarily a code size trade off
and the variation in performance is small, then it's probably not
worthwhile having an additional hook.
R.