https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Tejas Belagod from comment #7)
> I tried this, but it still doesn't seem to fold for aarch64.
> 
> So, here is the DOM trace for aarch64:
> 
> Optimizing statement a = *.LC0;

Why do we get LC0 in the first place?  It seems like it is happening because
of some cost model issue with MOVECOST.

> LKUP STMT a = *.LC0 with .MEM_3(D)
> LKUP STMT *.LC0 = a with .MEM_3(D)
> Optimizing statement vectp_a.5_1 = &a;
> LKUP STMT vectp_a.5_1 = &a
> ==== ASGN vectp_a.5_1 = &a
> Optimizing statement vect__6.6_13 = MEM[(int *)vectp_a.5_1];
> Replaced 'vectp_a.5_1' with constant '&aD.2604'
> LKUP STMT vect__6.6_13 = MEM[(int *)&a] with .MEM_4
> 2>>> STMT vect__6.6_13 = MEM[(int *)&a] with .MEM_4
> Optimizing statement vect_sum_7.7_6 = vect__6.6_13;
> LKUP STMT vect_sum_7.7_6 = vect__6.6_13
> ==== ASGN vect_sum_7.7_6 = vect__6.6_13
> Optimizing statement vectp_a.4_7 = vectp_a.5_1 + 16;
> Replaced 'vectp_a.5_1' with constant '&aD.2604'
> LKUP STMT vectp_a.4_7 = &a pointer_plus_expr 16
> 2>>> STMT vectp_a.4_7 = &a pointer_plus_expr 16
> ==== ASGN vectp_a.4_7 = &MEM[(void *)&a + 16B]
> Optimizing statement ivtmp_8 = 1;
> LKUP STMT ivtmp_8 = 1
> ==== ASGN ivtmp_8 = 1
> Optimizing statement vect__6.6_10 = MEM[(int *)vectp_a.4_7];
> Replaced 'vectp_a.4_7' with constant '&MEM[(voidD.39 *)&aD.2604 + 16B]'
> Folded to: vect__6.6_10 = MEM[(int *)&a + 16B];
> LKUP STMT vect__6.6_10 = MEM[(int *)&a + 16B] with .MEM_4
> 2>>> STMT vect__6.6_10 = MEM[(int *)&a + 16B] with .MEM_4
> Optimizing statement vect_sum_7.7_17 = vect_sum_7.7_6 + vect__6.6_10;
> Replaced 'vect_sum_7.7_6' with variable 'vect__6.6_13'
> gimple_simplified to vect_sum_7.7_17 = vect__6.6_10 + vect__6.6_13;
> Folded to: vect_sum_7.7_17 = vect__6.6_10 + vect__6.6_13;
> LKUP STMT vect_sum_7.7_17 = vect__6.6_10 plus_expr vect__6.6_13
> 2>>> STMT vect_sum_7.7_17 = vect__6.6_10 plus_expr vect__6.6_13
> ...
> 
> In x86's case, by this time, the constant vectors have been propagated and
> folded into a constant vector:
> 
> Optimizing statement vect_cst_.12_23 = { 0, 1, 2, 3 };
> LKUP STMT vect_cst_.12_23 = { 0, 1, 2, 3 }
> ==== ASGN vect_cst_.12_23 = { 0, 1, 2, 3 }
> Optimizing statement vect_cst_.11_32 = { 4, 5, 6, 7 };
> LKUP STMT vect_cst_.11_32 = { 4, 5, 6, 7 }
> ==== ASGN vect_cst_.11_32 = { 4, 5, 6, 7 }
> Optimizing statement vectp.14_2 = &a[0];
> LKUP STMT vectp.14_2 = &a[0]
> ==== ASGN vectp.14_2 = &a[0]
> Optimizing statement MEM[(int *)vectp.14_2] = vect_cst_.12_23;
> Replaced 'vectp.14_2' with constant '&aD.1831[0]'
> Replaced 'vect_cst_.12_23' with constant '{ 0, 1, 2, 3 }'
> Folded to: MEM[(int *)&a] = { 0, 1, 2, 3 };
> LKUP STMT MEM[(int *)&a] = { 0, 1, 2, 3 } with .MEM_3(D)
> LKUP STMT { 0, 1, 2, 3 } = MEM[(int *)&a] with .MEM_3(D)
> LKUP STMT { 0, 1, 2, 3 } = MEM[(int *)&a] with .MEM_25
> 2>>> STMT { 0, 1, 2, 3 } = MEM[(int *)&a] with .MEM_25
> Optimizing statement vectp.14_21 = vectp.14_2 + 16;
> Replaced 'vectp.14_2' with constant '&aD.1831[0]'
> LKUP STMT vectp.14_21 = &a[0] pointer_plus_expr 16
> 2>>> STMT vectp.14_21 = &a[0] pointer_plus_expr 16
> ==== ASGN vectp.14_21 = &MEM[(void *)&a + 16B]
> Optimizing statement MEM[(int *)vectp.14_21] = vect_cst_.11_32;
> Replaced 'vectp.14_21' with constant '&MEM[(voidD.41 *)&aD.1831 + 16B]'
> Replaced 'vect_cst_.11_32' with constant '{ 4, 5, 6, 7 }'
> Folded to: MEM[(int *)&a + 16B] = { 4, 5, 6, 7 };
> LKUP STMT MEM[(int *)&a + 16B] = { 4, 5, 6, 7 } with .MEM_25
> LKUP STMT { 4, 5, 6, 7 } = MEM[(int *)&a + 16B] with .MEM_25
> LKUP STMT { 4, 5, 6, 7 } = MEM[(int *)&a + 16B] with .MEM_19
> 2>>> STMT { 4, 5, 6, 7 } = MEM[(int *)&a + 16B] with .MEM_19
> Optimizing statement vectp_a.5_22 = &a;
> LKUP STMT vectp_a.5_22 = &a
> ==== ASGN vectp_a.5_22 = &a
> Optimizing statement vect__13.6_20 = MEM[(int *)vectp_a.5_22];
> Replaced 'vectp_a.5_22' with constant '&aD.1831'
> LKUP STMT vect__13.6_20 = MEM[(int *)&a] with .MEM_19
> FIND: { 0, 1, 2, 3 }
> Replaced redundant expr '# VUSE <.MEM_19>
> MEM[(intD.6 *)&aD.1831]' with '{ 0, 1, 2, 3 }'
> ==== ASGN vect__13.6_20 = { 0, 1, 2, 3 }
> Optimizing statement vect_sum_14.7_13 = vect__13.6_20;
> Replaced 'vect__13.6_20' with constant '{ 0, 1, 2, 3 }'
> LKUP STMT vect_sum_14.7_13 = { 0, 1, 2, 3 }
> ==== ASGN vect_sum_14.7_13 = { 0, 1, 2, 3 }
> ....
> 
> While the MEM[vect_ptr + CST] gets replaced correctly by 'a', it doesn't
> seem to figure out that the literal pool load 'a = *LC0' is nothing but
> 
> vect_cst_.12_23 = { 0, 1, 2, 3 }; and vect_cst_.11_32 = { 4, 5, 6, 7 };
> 
> which is the only major difference between how the const vector is
> initialized in x86 and aarch64.  Is DOM not able to understand 'a = *LC0'?
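
For reference, a minimal sketch of the kind of source that could produce
dumps like the ones above.  This is an assumed shape for illustration only
(the function name and the exact initializer are guesses, not the actual
reproducer attached to this PR):

    /* Hypothetical reproducer: a local array with a constant initializer,
       summed in a loop that the vectorizer handles.  */
    int
    sum8 (void)
    {
      int a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
      int sum = 0;

      for (int i = 0; i < 8; i++)
        sum += a[i];

      /* On x86 the initializer reaches DOM as explicit vector-constant
         stores,
             MEM[(int *)&a]       = { 0, 1, 2, 3 };
             MEM[(int *)&a + 16B] = { 4, 5, 6, 7 };
         so the later vector loads are replaced with those constants (the
         "FIND: { 0, 1, 2, 3 }" step in the dump).  On aarch64 the
         initializer instead survives as an aggregate copy from a
         literal-pool object, a = *.LC0;, which DOM treats as an opaque
         load, so the loads are never folded.  */
      return sum;
    }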