On Feb 12, 2021, Richard Biener <richard.guent...@gmail.com> wrote:

>> +  if (TREE_CODE (mem) == SSA_NAME)
>> +    if (ptr_info_def *pi = get_ptr_info (mem))
>> +      {
>> +        unsigned al = get_pointer_alignment (builtin->dst_base);
>> +        if (al > pi->align || pi->misalign)

> We still might prefer pi->align == 64 and pi->misalign == 32 over al == 16
> so maybe factor that in, too.

Ugh, apologies, I somehow posted an incorrect and outdated version of
the patch.  The improved (propagates both alignments) and fixed (divides
by BITS_PER_UNIT, fixing a regression in gfortran.dg/sms-2.f90) version
had this alternate hunk as the only difference:

@@ -1155,6 +1156,16 @@ generate_memset_builtin (class loop *loop, partition *partition)
   mem = force_gimple_operand_gsi (&gsi, mem, true, NULL_TREE,
                                   false, GSI_CONTINUE_LINKING);
 
+  if (TREE_CODE (mem) == SSA_NAME)
+    if (ptr_info_def *pi = get_ptr_info (mem))
+      {
+        unsigned al;
+        unsigned HOST_WIDE_INT misal;
+        if (get_pointer_alignment_1 (builtin->dst_base, &al, &misal))
+          set_ptr_info_alignment (pi, al / BITS_PER_UNIT,
+                                  misal / BITS_PER_UNIT);
+      }
+
   /* This exactly matches the pattern recognition in classify_partition.  */
   val = gimple_assign_rhs1 (DR_STMT (builtin->dst_dr));
   /* Handle constants like 0x15151515 and similarly

> So I wonder whether we should instead re-run CCP after loop opts which
> computes nonzero bits as well instead of the above "hack".  Would
> nonzero bits be ready to compute in the above way from loop distribution?
> That is, you can do set_nonzero_bits just like you did
> set_ptr_info_alignment ...

> Since CCP also performs copy propagation an obvious candidate would be
> to replace the last pass_copy_prop with pass_ccp (with a comment noting
> to compute up-to-date nonzero bits info).

I'll look into these possibilities.

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist         GNU Toolchain Engineer
        Vim, Vi, Voltei pro Emacs -- GNUlius Caesar