https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110026
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ra

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to d_vampile from comment #2)
> O0 does miss a lot of optimizations. However, for the problem I mentioned,
> the GPRs used before and the FP registers after modification are used. When
> vectorization is not applicable, the X0 register is faster than the D0
> register. Is it appropriate to modify here?

Well the generic_tunings has:
  {
    4, /* load_int.  */
    4, /* store_int.  */
    4, /* load_fp.  */
    4, /* store_fp.  */
    4, /* load_pred.  */
    4 /* store_pred.  */
  }, /* memmov_cost.  */

Which says the load/store of fp has the same cost as ints (gprs) (this is the
same as a53's tuning).
If anything that should be changed ....
Or you should use -mcpu=* where applicable.
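
To illustrate the point about equal costs, here is a minimal standalone sketch in C. It is not GCC code: the struct only mirrors the six field names from the memmov_cost comments quoted above, and the names memmov_cost_sketch, generic_like, fp_penalized and cheaper_for_spill are hypothetical. The value 6 in fp_penalized is invented for contrast and is not taken from any real tuning table. The comparison is a toy and does not model how the register allocator actually weighs these costs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the six memmov_cost fields quoted in the comment.  */
struct memmov_cost_sketch {
  int load_int;
  int store_int;
  int load_fp;
  int store_fp;
  int load_pred;
  int store_pred;
};

/* Same values as the quoted generic_tunings entry: everything costs 4.  */
static const struct memmov_cost_sketch generic_like = { 4, 4, 4, 4, 4, 4 };

/* Invented table where FP loads/stores are more expensive (assumption).  */
static const struct memmov_cost_sketch fp_penalized = { 4, 4, 6, 6, 4, 4 };

enum reg_kind { KIND_GPR, KIND_FPR, KIND_TIE };

/* Toy comparison: sum load + store cost for each register file and
   report which one is cheaper, or a tie when the sums are equal.  */
static enum reg_kind
cheaper_for_spill (const struct memmov_cost_sketch *c)
{
  assert (c != NULL);
  int gpr = c->load_int + c->store_int;
  int fpr = c->load_fp + c->store_fp;
  if (gpr < fpr)
    return KIND_GPR;
  if (fpr < gpr)
    return KIND_FPR;
  return KIND_TIE;
}
```

With the generic-like table the two register files tie, so nothing in these numbers alone favours X0 over D0; only a table that prices FP memory moves higher would tip the comparison toward GPRs.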