Jeff Law <l...@redhat.com> writes: > On 10/23/2017 11:20 AM, Richard Sandiford wrote: >> This patch changes various bits of state related to argument sizes so >> that they have type poly_int64 rather than HOST_WIDE_INT. This includes: >> >> - incoming_args::pops_args and incoming_args::size >> - rtl_data::outgoing_args_size >> - pending_stack_adjust >> - stack_pointer_delta >> - stack_usage::pushed_stack_size >> - args_size::constant >> >> It also changes TARGET_RETURN_POPS_ARGS so that the size of the >> arguments passed in and the size returned by the hook are both >> poly_int64s. >> >> >> 2017-10-23 Richard Sandiford <richard.sandif...@linaro.org> >> Alan Hayward <alan.hayw...@arm.com> >> David Sherwood <david.sherw...@arm.com> >> >> gcc/ >> * target.def (return_pops_args): Treat both the input and output >> sizes as poly_int64s rather than HOST_WIDE_INTS. >> * targhooks.h (default_return_pops_args): Update accordingly. >> * targhooks.c (default_return_pops_args): Likewise. >> * doc/tm.texi: Regenerate. >> * emit-rtl.h (incoming_args): Change pops_args, size and >> outgoing_args_size from int to poly_int64_pod. >> * function.h (expr_status): Change x_pending_stack_adjust and >> x_stack_pointer_delta from int to poly_int64. >> (args_size::constant): Change from HOST_WIDE_INT to poly_int64. >> (ARGS_SIZE_RTX): Update accordingly. >> * calls.c (highest_outgoing_arg_in_use): Change from int to >> unsigned int. >> (stack_usage_watermark, stored_args_watermark): New variables. >> (stack_region_maybe_used_p, mark_stack_region_used): New functions. >> (emit_call_1): Change the stack_size and rounded_stack_size >> parameters from HOST_WIDE_INT to poly_int64. Track n_popped >> as a poly_int64. >> (save_fixed_argument_area): Check stack_usage_watermark. >> (initialize_argument_information): Change old_pending_adj from >> a HOST_WIDE_INT * to a poly_int64_pod *. >> (compute_argument_block_size): Return the size as a poly_int64 >> rather than an int. >> (finalize_must_preallocate): Track polynomial argument sizes. >> (compute_argument_addresses): Likewise. >> (internal_arg_pointer_based_exp): Track polynomial offsets. >> (mem_overlaps_already_clobbered_arg_p): Rename to... >> (mem_might_overlap_already_clobbered_arg_p): ...this and take the >> size as a poly_uint64 rather than an unsigned HOST_WIDE_INT. >> Check stored_args_used_watermark. >> (load_register_parameters): Update accordingly. >> (check_sibcall_argument_overlap_1): Likewise. >> (combine_pending_stack_adjustment_and_call): Take the unadjusted >> args size as a poly_int64 rather than an int. Return a bool >> indicating whether the optimization was possible and return >> the new adjustment by reference. >> (check_sibcall_argument_overlap): Track polynomail argument sizes. >> Update stored_args_watermark. >> (can_implement_as_sibling_call_p): Handle polynomial argument sizes. >> (expand_call): Likewise. Maintain stack_usage_watermark and >> stored_args_watermark. Update calls to >> combine_pending_stack_adjustment_and_call. >> (emit_library_call_value_1): Handle polynomial argument sizes. >> Call stack_region_maybe_used_p and mark_stack_region_used. >> Maintain stack_usage_watermark. >> (store_one_arg): Likewise. Update call to >> mem_overlaps_already_clobbered_arg_p. >> * config/arm/arm.c (arm_output_function_prologue): Add a cast to >> HOST_WIDE_INT. >> * config/avr/avr.c (avr_outgoing_args_size): Likewise. >> * config/microblaze/microblaze.c (microblaze_function_prologue): >> Likewise. >> * config/cr16/cr16.c (cr16_return_pops_args): Update for new >> TARGET_RETURN_POPS_ARGS interface. >> (cr16_compute_frame, cr16_initial_elimination_offset): Add casts >> to HOST_WIDE_INT. >> * config/ft32/ft32.c (ft32_compute_frame): Likewise. >> * config/i386/i386.c (ix86_return_pops_args): Update for new >> TARGET_RETURN_POPS_ARGS interface. >> (ix86_expand_split_stack_prologue): Add a cast to HOST_WIDE_INT. >> * config/moxie/moxie.c (moxie_compute_frame): Likewise. >> * config/m68k/m68k.c (m68k_return_pops_args): Update for new >> TARGET_RETURN_POPS_ARGS interface. >> * config/vax/vax.c (vax_return_pops_args): Likewise. >> * config/pa/pa.h (STACK_POINTER_OFFSET): Add a cast to poly_int64. >> (EXIT_IGNORE_STACK): Update reference to crtl->outgoing_args_size. >> * config/arm/arm.h (CALLER_INTERWORKING_SLOT_SIZE): Likewise. >> * config/powerpcspe/aix.h (STACK_DYNAMIC_OFFSET): Likewise. >> * config/powerpcspe/darwin.h (STACK_DYNAMIC_OFFSET): Likewise. >> * config/powerpcspe/powerpcspe.h (STACK_DYNAMIC_OFFSET): Likewise. >> * config/rs6000/aix.h (STACK_DYNAMIC_OFFSET): Likewise. >> * config/rs6000/darwin.h (STACK_DYNAMIC_OFFSET): Likewise. >> * config/rs6000/rs6000.h (STACK_DYNAMIC_OFFSET): Likewise. >> * dojump.h (saved_pending_stack_adjust): Change x_pending_stack_adjust >> and x_stack_pointer_delta from int to poly_int64. >> * dojump.c (do_pending_stack_adjust): Update accordingly. >> * explow.c (allocate_dynamic_stack_space): Handle polynomial >> stack_pointer_deltas. >> * function.c (STACK_DYNAMIC_OFFSET): Add a cast to poly_int64. >> (pad_to_arg_alignment): Track polynomial offsets. >> (assign_parm_find_stack_rtl): Likewise. >> (assign_parms, locate_and_pad_parm): Handle polynomial argument sizes. >> * toplev.c (output_stack_usage): Update reference to >> current_function_pushed_stack_size. > I haven't been able to convince myself that the changes to the > stack_usage_map are correct, particularly in the case where the > UPPER_BOUND is not constant. But I'm willing to let it go given the > only target that could potentially be affected would be SVE and I'd > expect that if you'd gotten it wrong it would have showed up in your > testing. > > I'm also slightly worried about how we handle ARGS_GROW_DOWNWARD targets > (pa, stormy16). I couldn't convince myself those changes were correct > either. Again, I'm willing on fall back to lean on your extensive > experience and testing here. > > Of all the patches to-date I've looked at, this one worries me the most > (which is why it's next to last according to my records). The potential > for a goof in the argument setup, padding, stack save area, and the > consequences for such a goof worry me. > > I'm going to conditionally ack this -- my condition is that you > re-review the the calls.c/function.c changes as well.
I agree this is probably the least obvious of the changes. I've had another look and I still think the approach is OK. For stack_usage_map: we're trying to detect whether an instruction references a region of stack that has already been clobbered. We can conservatively do this as a range from the lowest possible byte offset in the region to the highest possible byte offset in the region. This works both when recording a clobber and when testing for an overlap with a previous clobber. For polynomial lower bounds, the lowest possible offset is given by constant_lower_bound. For polynomial upper bounds, the highest possible offset is HOST_WIDE_INT_M1U (since we track unsigned offsets from a zero base). But it's not possible to have the stack_usage_map array itself go up to HOST_WIDE_INT_M1U, so instead, the watermark records the lowest lower bound whose corresponding upper bound is HOST_WIDE_INT_M1U, so that every index of stack_usage_map starting at the watermark is conceptually set. The patch doesn't really change the handling of ARGS_GROW_DOWNWARD, it just rewrites the tests into a different form. E.g. there were two instances of code like: if (ARGS_GROW_DOWNWARD) highest_outgoing_arg_in_use = MAX (initial_highest_arg_in_use, needed + 1); else highest_outgoing_arg_in_use = MAX (initial_highest_arg_in_use, needed); and the patch splits out the calculation of the second MAX argument: poly_int64 limit = needed; if (ARGS_GROW_DOWNWARD) limit += 1; The third ARGS_GROW_DOWNARD-related hunk is for pad_to_arg_alignment. Previously we had: offset_ptr->constant = -sp_offset + (ARGS_GROW_DOWNWARD ? FLOOR_ROUND (offset_ptr->constant + sp_offset, boundary_in_bytes) : CEIL_ROUND (offset_ptr->constant + sp_offset, boundary_in_bytes)); Expanding FLOOR_ROUND, and using the fact that boundary_in_bytes is a power of 2, the ARGS_GROW_DOWNARD case is equivalent to: offset_ptr->constant = -sp_offset + (offset_ptr->constant + sp_offset) - ((offset_ptr->constant + sp_offset) & (boundary_in_bytes - 1)); The -sp_offset + sp_offset cancels out, giving: offset_ptr->constant = offset_ptr->constant - ((offset_ptr->constant + sp_offset) & (boundary_in_bytes - 1)); or: offset_ptr->constant -= (offset_ptr->constant + sp_offset) & (boundary_in_bytes - 1); But since (offset_ptr->constant + sp_offset) is polynomial, we can't always calculate this value, so the code is conditional on a call to known_misalignment: if (... || !known_misalignment (offset_ptr->constant + sp_offset, boundary_in_bytes, &misalign)) ... else { if (ARGS_GROW_DOWNWARD) offset_ptr->constant -= misalign; Thanks, Richard