https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
--- Comment #11 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> Loop body is likely going to simplify further, this is difficult
> to guess, we just decrease the result by 1/3. */
>
This was introduced by r0-68074-g91a01f21abfe19:
/* Estimate number of insns of completely unrolled loop. We assume
+ that the size of the unrolled loop is decreased in the
+ following way (the numbers of insns are based on what
+ estimate_num_insns returns for appropriate statements):
+
+ 1) exit condition gets removed (2 insns)
+ 2) increment of the control variable gets removed (2 insns)
+ 3) All remaining statements are likely to get simplified
+ due to constant propagation. Hard to estimate; just
+ as a heuristics we decrease the rest by 1/3.
+
+ NINSNS is the number of insns in the loop before unrolling.
+ NUNROLL is the number of times the loop is unrolled. */
+
+static unsigned HOST_WIDE_INT
+estimated_unrolled_size (unsigned HOST_WIDE_INT ninsns,
+ unsigned HOST_WIDE_INT nunroll)
+{
+ HOST_WIDE_INT unr_insns = 2 * ((HOST_WIDE_INT) ninsns - 4) / 3;
+ if (unr_insns <= 0)
+ unr_insns = 1;
+ unr_insns *= (nunroll + 1);
+
+ return unr_insns;
+}
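To make the arithmetic concrete, here is a minimal standalone sketch of that original estimate. It assumes `long long` as a stand-in for HOST_WIDE_INT, and the name `estimated_unrolled_size_v1` is made up for illustration; it is not the GCC source itself.

```c
/* Hypothetical stand-in for GCC's HOST_WIDE_INT (assumption for this sketch). */
typedef long long hwi;

/* Mirrors the arithmetic of the original heuristic: drop 4 insns for the
   exit condition and the IV increment, scale the rest by 2/3, clamp to at
   least 1, then multiply by the number of copies (NUNROLL + 1).  */
static unsigned long long
estimated_unrolled_size_v1 (unsigned long long ninsns,
                            unsigned long long nunroll)
{
  hwi unr_insns = 2 * ((hwi) ninsns - 4) / 3;
  if (unr_insns <= 0)
    unr_insns = 1;
  unr_insns *= (hwi) (nunroll + 1);
  return (unsigned long long) unr_insns;
}
```

For example, a 10-insn body unrolled 3 times gives 2 * (10 - 4) / 3 = 4 insns per copy and 4 * 4 = 16 insns in total.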
And r0-93444-g08f1af2ed022e0 tries to do it more accurately by marking
likely_eliminated stmts and subtracting them from the total insns, but the
2 / 3 factor is still kept:
+/* Estimate number of insns of completely unrolled loop.
+ It is (NUNROLL + 1) * size of loop body with taking into account
+ the fact that in last copy everything after exit conditional
+ is dead and that some instructions will be eliminated after
+ peeling.
- NINSNS is the number of insns in the loop before unrolling.
- NUNROLL is the number of times the loop is unrolled. */
+ Loop body is likely going to simplify futher, this is difficult
+ to guess, we just decrease the result by 1/3. */
static unsigned HOST_WIDE_INT
-estimated_unrolled_size (unsigned HOST_WIDE_INT ninsns,
+estimated_unrolled_size (struct loop_size *size,
unsigned HOST_WIDE_INT nunroll)
{
- HOST_WIDE_INT unr_insns = 2 * ((HOST_WIDE_INT) ninsns - 4) / 3;
+ HOST_WIDE_INT unr_insns = ((nunroll)
+ * (HOST_WIDE_INT) (size->overall
+ - size->eliminated_by_peeling));
+ if (!nunroll)
+ unr_insns = 0;
+ unr_insns += size->last_iteration - size->last_iteration_eliminated_by_peeling;
+
+ unr_insns = unr_insns * 2 / 3;
if (unr_insns <= 0)
unr_insns = 1;
- unr_insns *= (nunroll + 1);
It looks to me like the 1 / 3 reduction overestimates the instructions that can
be optimised away, especially since we've already subtracted
eliminated_by_peeling.
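
A rough sketch of the concern, with the size before and after the 2 / 3 scaling split into separate functions. The struct fields follow the names in the diff above. The typedef, the `_sketch` and `_v2` names, and the sample numbers are assumptions for illustration only, not measurements from a real loop.

```c
/* Hypothetical stand-in for GCC's HOST_WIDE_INT (assumption for this sketch). */
typedef long long hwi;

/* Simplified stand-in for struct loop_size; field names as in the diff.  */
struct loop_size_sketch
{
  int overall;
  int eliminated_by_peeling;
  int last_iteration;
  int last_iteration_eliminated_by_peeling;
};

/* Size after subtracting what peeling is already known to eliminate.
   When NUNROLL is 0 the first term is 0, matching the !nunroll case.  */
static hwi
unrolled_size_after_peeling (const struct loop_size_sketch *size,
                             unsigned long long nunroll)
{
  hwi unr_insns = (hwi) nunroll
                  * (size->overall - size->eliminated_by_peeling);
  unr_insns += size->last_iteration
               - size->last_iteration_eliminated_by_peeling;
  return unr_insns;
}

/* The same value with the extra 2/3 scaling applied on top.  */
static hwi
estimated_unrolled_size_v2 (const struct loop_size_sketch *size,
                            unsigned long long nunroll)
{
  hwi unr_insns = unrolled_size_after_peeling (size, nunroll) * 2 / 3;
  if (unr_insns <= 0)
    unr_insns = 1;
  return unr_insns;
}
```

With made-up sizes overall = 10, eliminated_by_peeling = 4, last_iteration = 6 and last_iteration_eliminated_by_peeling = 4, unrolling 3 times gives 3 * 6 + 2 = 20 insns after the explicit subtraction, and the 2 / 3 scaling then lowers the estimate to 13, i.e. a further reduction on insns that were already assumed to survive peeling.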