Jeff Law <l...@redhat.com> writes:
> On 10/23/2017 11:20 AM, Richard Sandiford wrote:
>> This patch changes various bits of state related to argument sizes so
>> that they have type poly_int64 rather than HOST_WIDE_INT.  This includes:
>> 
>> - incoming_args::pops_args and incoming_args::size
>> - rtl_data::outgoing_args_size
>> - pending_stack_adjust
>> - stack_pointer_delta
>> - stack_usage::pushed_stack_size
>> - args_size::constant
>> 
>> It also changes TARGET_RETURN_POPS_ARGS so that the size of the
>> arguments passed in and the size returned by the hook are both
>> poly_int64s.
>> 
>> 
>> 2017-10-23  Richard Sandiford  <richard.sandif...@linaro.org>
>>          Alan Hayward  <alan.hayw...@arm.com>
>>          David Sherwood  <david.sherw...@arm.com>
>> 
>> gcc/
>>      * target.def (return_pops_args): Treat both the input and output
>>      sizes as poly_int64s rather than HOST_WIDE_INTS.
>>      * targhooks.h (default_return_pops_args): Update accordingly.
>>      * targhooks.c (default_return_pops_args): Likewise.
>>      * doc/tm.texi: Regenerate.
>>      * emit-rtl.h (incoming_args): Change pops_args, size and
>>      outgoing_args_size from int to poly_int64_pod.
>>      * function.h (expr_status): Change x_pending_stack_adjust and
>>      x_stack_pointer_delta from int to poly_int64.
>>      (args_size::constant): Change from HOST_WIDE_INT to poly_int64.
>>      (ARGS_SIZE_RTX): Update accordingly.
>>      * calls.c (highest_outgoing_arg_in_use): Change from int to
>>      unsigned int.
>>      (stack_usage_watermark, stored_args_watermark): New variables.
>>      (stack_region_maybe_used_p, mark_stack_region_used): New functions.
>>      (emit_call_1): Change the stack_size and rounded_stack_size
>>      parameters from HOST_WIDE_INT to poly_int64.  Track n_popped
>>      as a poly_int64.
>>      (save_fixed_argument_area): Check stack_usage_watermark.
>>      (initialize_argument_information): Change old_pending_adj from
>>      a HOST_WIDE_INT * to a poly_int64_pod *.
>>      (compute_argument_block_size): Return the size as a poly_int64
>>      rather than an int.
>>      (finalize_must_preallocate): Track polynomial argument sizes.
>>      (compute_argument_addresses): Likewise.
>>      (internal_arg_pointer_based_exp): Track polynomial offsets.
>>      (mem_overlaps_already_clobbered_arg_p): Rename to...
>>      (mem_might_overlap_already_clobbered_arg_p): ...this and take the
>>      size as a poly_uint64 rather than an unsigned HOST_WIDE_INT.
>>      Check stored_args_used_watermark.
>>      (load_register_parameters): Update accordingly.
>>      (check_sibcall_argument_overlap_1): Likewise.
>>      (combine_pending_stack_adjustment_and_call): Take the unadjusted
>>      args size as a poly_int64 rather than an int.  Return a bool
>>      indicating whether the optimization was possible and return
>>      the new adjustment by reference.
>>      (check_sibcall_argument_overlap): Track polynomail argument sizes.
>>      Update stored_args_watermark.
>>      (can_implement_as_sibling_call_p): Handle polynomial argument sizes.
>>      (expand_call): Likewise.  Maintain stack_usage_watermark and
>>      stored_args_watermark.  Update calls to
>>      combine_pending_stack_adjustment_and_call.
>>      (emit_library_call_value_1): Handle polynomial argument sizes.
>>      Call stack_region_maybe_used_p and mark_stack_region_used.
>>      Maintain stack_usage_watermark.
>>      (store_one_arg): Likewise.  Update call to
>>      mem_overlaps_already_clobbered_arg_p.
>>      * config/arm/arm.c (arm_output_function_prologue): Add a cast to
>>      HOST_WIDE_INT.
>>      * config/avr/avr.c (avr_outgoing_args_size): Likewise.
>>      * config/microblaze/microblaze.c (microblaze_function_prologue):
>>      Likewise.
>>      * config/cr16/cr16.c (cr16_return_pops_args): Update for new
>>      TARGET_RETURN_POPS_ARGS interface.
>>      (cr16_compute_frame, cr16_initial_elimination_offset): Add casts
>>      to HOST_WIDE_INT.
>>      * config/ft32/ft32.c (ft32_compute_frame): Likewise.
>>      * config/i386/i386.c (ix86_return_pops_args): Update for new
>>      TARGET_RETURN_POPS_ARGS interface.
>>      (ix86_expand_split_stack_prologue): Add a cast to HOST_WIDE_INT.
>>      * config/moxie/moxie.c (moxie_compute_frame): Likewise.
>>      * config/m68k/m68k.c (m68k_return_pops_args): Update for new
>>      TARGET_RETURN_POPS_ARGS interface.
>>      * config/vax/vax.c (vax_return_pops_args): Likewise.
>>      * config/pa/pa.h (STACK_POINTER_OFFSET): Add a cast to poly_int64.
>>      (EXIT_IGNORE_STACK): Update reference to crtl->outgoing_args_size.
>>      * config/arm/arm.h (CALLER_INTERWORKING_SLOT_SIZE): Likewise.
>>      * config/powerpcspe/aix.h (STACK_DYNAMIC_OFFSET): Likewise.
>>      * config/powerpcspe/darwin.h (STACK_DYNAMIC_OFFSET): Likewise.
>>      * config/powerpcspe/powerpcspe.h (STACK_DYNAMIC_OFFSET): Likewise.
>>      * config/rs6000/aix.h (STACK_DYNAMIC_OFFSET): Likewise.
>>      * config/rs6000/darwin.h (STACK_DYNAMIC_OFFSET): Likewise.
>>      * config/rs6000/rs6000.h (STACK_DYNAMIC_OFFSET): Likewise.
>>      * dojump.h (saved_pending_stack_adjust): Change x_pending_stack_adjust
>>      and x_stack_pointer_delta from int to poly_int64.
>>      * dojump.c (do_pending_stack_adjust): Update accordingly.
>>      * explow.c (allocate_dynamic_stack_space): Handle polynomial
>>      stack_pointer_deltas.
>>      * function.c (STACK_DYNAMIC_OFFSET): Add a cast to poly_int64.
>>      (pad_to_arg_alignment): Track polynomial offsets.
>>      (assign_parm_find_stack_rtl): Likewise.
>>      (assign_parms, locate_and_pad_parm): Handle polynomial argument sizes.
>>      * toplev.c (output_stack_usage): Update reference to
>>      current_function_pushed_stack_size.
> I haven't been able to convince myself that the changes to the
> stack_usage_map are correct, particularly in the case where the
> UPPER_BOUND is not constant.  But I'm willing to let it go given the
> only target that could potentially be affected would be SVE and I'd
> expect that if you'd gotten it wrong it would have showed up in your
> testing.
>
> I'm also slightly worried about how we handle ARGS_GROW_DOWNWARD targets
> (pa, stormy16).  I couldn't convince myself those changes were correct
> either.  Again, I'm willing on fall back to lean on your extensive
> experience and testing here.
>
> Of all the patches to-date I've looked at, this one worries me the most
> (which is why it's next to last according to my records).  The potential
> for a goof in the argument setup, padding, stack save area, and the
> consequences for such a goof worry me.
>
> I'm going to conditionally ack this -- my condition is that you
> re-review the the calls.c/function.c changes as well.

I agree this is probably the least obvious of the changes.  I've had
another look and I still think the approach is OK.

For stack_usage_map: we're trying to detect whether an instruction
references a region of stack that has already been clobbered.
We can conservatively do this as a range from the lowest possible byte
offset in the region to the highest possible byte offset in the region.
This works both when recording a clobber and when testing for an overlap
with a previous clobber.

For polynomial lower bounds, the lowest possible offset is given by
constant_lower_bound.  For polynomial upper bounds, the highest
possible offset is HOST_WIDE_INT_M1U (since we track unsigned offsets
from a zero base).  But it's not possible to have the stack_usage_map
array itself go up to HOST_WIDE_INT_M1U, so instead, the watermark
records the lowest lower bound whose corresponding upper bound is
HOST_WIDE_INT_M1U, so that every index of stack_usage_map starting at
the watermark is conceptually set.

The patch doesn't really change the handling of ARGS_GROW_DOWNWARD,
it just rewrites the tests into a different form.  E.g. there were
two instances of code like:

                  if (ARGS_GROW_DOWNWARD)
                    highest_outgoing_arg_in_use
                      = MAX (initial_highest_arg_in_use, needed + 1);
                  else
                    highest_outgoing_arg_in_use
                      = MAX (initial_highest_arg_in_use, needed);

and the patch splits out the calculation of the second MAX argument:

                  poly_int64 limit = needed;
                  if (ARGS_GROW_DOWNWARD)
                    limit += 1;

The third ARGS_GROW_DOWNARD-related hunk is for pad_to_arg_alignment.
Previously we had:

          offset_ptr->constant = -sp_offset +
            (ARGS_GROW_DOWNWARD
            ? FLOOR_ROUND (offset_ptr->constant + sp_offset, boundary_in_bytes)
            : CEIL_ROUND (offset_ptr->constant + sp_offset, boundary_in_bytes));

Expanding FLOOR_ROUND, and using the fact that boundary_in_bytes is
a power of 2, the ARGS_GROW_DOWNARD case is equivalent to:

          offset_ptr->constant = -sp_offset
            + (offset_ptr->constant + sp_offset)
            - ((offset_ptr->constant + sp_offset) & (boundary_in_bytes - 1));

The -sp_offset + sp_offset cancels out, giving:

          offset_ptr->constant
            = offset_ptr->constant
            - ((offset_ptr->constant + sp_offset) & (boundary_in_bytes - 1));

or:

          offset_ptr->constant
            -= (offset_ptr->constant + sp_offset) & (boundary_in_bytes - 1);

But since (offset_ptr->constant + sp_offset) is polynomial, we can't
always calculate this value, so the code is conditional on a call
to known_misalignment:

      if (...
          || !known_misalignment (offset_ptr->constant + sp_offset,
                                  boundary_in_bytes, &misalign))
        ...
      else
        {
          if (ARGS_GROW_DOWNWARD)
            offset_ptr->constant -= misalign;
        
Thanks,
Richard

Reply via email to