Hi Richard:
> I do think that local variable layout probably doesn't belong in that IPA pass
> but elsewhere (way earlier). But my main complaint was that the diff
> doesn't show changes you made to the pass because it first and foremost
> shows moving all the code. That makes reviewing the change impossible.
>
> So at least split the patch into one moving the pass to a separate file and
> one doing the actual changes.
Oh, I got your point, I've added few code after moving, but the diff is
impossible to show which part is new added, I'll send v2 patch soon.
> static unsigned int
> align_local_variable (tree decl, bool really_expand)
> {
> unsigned int align;
>
> if (TREE_CODE (decl) == SSA_NAME)
> align = TYPE_ALIGN (TREE_TYPE (decl));
> else
> {
> align = LOCAL_DECL_ALIGNMENT (decl);
> /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
> That is done before IPA and could bump alignment based on host
> backend even for offloaded code which wants different
> LOCAL_DECL_ALIGNMENT. */
> if (really_expand)
> SET_DECL_ALIGN (decl, align);
>
> here we should assert that we never lower a decls alignment since earlier
> optimization already made use of the alignment of a decl (which also means
> doing layout as far as alignment is concerned earlier will improve
> generated code).
>
> The documentation of LOCAL_DECL_ALIGNMENT says the purpose is
> to increase alignment. Only x86 implements LOCAL_DECL_ALIGNMENT,
> others just LOCAL_ALIGNMENT (x86 also implements that). The difference
> must be quite subtle ... Oh, and LOCAL_ALIGNMENT seems completely
> unused (besides providing the default implementation for
> LOCAL_DECL_ALIGNMENT).
> Huh.
BTW, here is the full story for ARM and RISC-V, why they need this patch:
- simplify_builtin_call call can_store_by_pieces check it's OK to
store by pieces?
- can_store_by_pieces call targetm.use_by_pieces_infrastructure_p to
ask back-end
- targetm.use_by_pieces_infrastructure_p call by_pieces_ninsns to
calculate how many instruction needed, it was 1 for RV64*, because
alignment is changed to 8 bits from 64 bits for char array due to this
patch g:26d7a5e690169ac04acde90070b0092c41b71c7e.
- by_pieces_ninsns got 4 instead of 1, and large than MOVE_RATIO,
failed to simplify memset to load/store.
Thanks.