Hi Richard,

it is already a while ago, but I had not found time to continue
with this patch until now.

I think I have now a better solution, which properly addresses your
comments below.

On 3/25/19 9:41 AM, Richard Biener wrote:
> On Fri, 22 Mar 2019, Bernd Edlinger wrote:
> 
>> On 3/21/19 12:15 PM, Richard Biener wrote:
>>> On Sun, 10 Mar 2019, Bernd Edlinger wrote:
>>> Finally...
>>>
>>> Index: gcc/function.c
>>> ===================================================================
>>> --- gcc/function.c      (revision 269264)
>>> +++ gcc/function.c      (working copy)
>>> @@ -2210,6 +2210,12 @@ use_register_for_decl (const_tree decl)
>>>    if (DECL_MODE (decl) == BLKmode)
>>>      return false;
>>>
>>> +  if (STRICT_ALIGNMENT && TREE_CODE (decl) == PARM_DECL
>>> +      && DECL_INCOMING_RTL (decl) && MEM_P (DECL_INCOMING_RTL (decl))
>>> +      && GET_MODE_ALIGNMENT (DECL_MODE (decl))
>>> +        > MEM_ALIGN (DECL_INCOMING_RTL (decl)))
>>> +    return false;
>>> +
>>>    /* If -ffloat-store specified, don't put explicit float variables
>>>       into registers.  */
>>>    /* ??? This should be checked after DECL_ARTIFICIAL, but tree-ssa
>>>
>>> I wonder if it is necessary to look at DECL_INCOMING_RTL here
>>> and why such RTL may not exist?  That is, iff DECL_INCOMING_RTL
>>> doesn't exist then shouldn't we return false for safety reasons?
>>>

You are right, it is not possbile to return different results from
use_register_for_decl before vs. after incoming RTL is assigned.
That hits an assertion in set_rtl.

This hunk is gone now, instead I changed assign_parm_setup_reg
to use movmisalign optab and/or extract_bit_field if misaligned
entry_parm is to be assigned in a register.

I have no test coverage for the movmisalign optab though, so I
rely on your code review for that part.

>>> Similarly the very same issue should exist on x86_64 which is
>>> !STRICT_ALIGNMENT, it's just the ABI seems to provide the appropriate
>>> alignment on the caller side.  So the STRICT_ALIGNMENT check is
>>> a wrong one.
>>>
>>
>> I may be plain wrong here, but I thought that !STRICT_ALIGNMENT targets
>> just use MEM_ALIGN to select the right instructions.  MEM_ALIGN
>> is always 32-bit align on the DImode memory.  The x86_64 vector instructions
>> would look at MEM_ALIGN and do the right thing, yes?
> 
> No, they need to use the movmisalign optab and end up with UNSPECs
> for example.
Ah, thanks, now I see.

>> It seems to be the definition of STRICT_ALIGNMENT targets that all RTL
>> instructions need to have MEM_ALIGN >= GET_MODE_ALIGNMENT, so the target
>> does not even have to look at MEM_ALIGN except in the mov_misalign_optab,
>> right?
> 
> Yes, I think we never losened that.  Note that RTL expansion has to
> fix this up for them.  Note that strictly speaking SLOW_UNALIGNED_ACCESS
> specifies that x86 is strict-align wrt vector modes.
> 

Yes I agree, the code would be incorrect for x86 as well when the 
movmisalign_optab
is not used.  So I invoke the movmisalign optab if available and if not fall
back to extract_bit_field.  As in the assign_parm_setup_stack 
assign_parm_setup_reg
assumes that data->promoted_mode != data->nominal_mode does not happen with
misaligned stack slots.


Attached is the v3 if my patch.

Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf.

Is it OK for trunk?


Thanks
Bernd.
2019-07-30  Bernd Edlinger  <bernd.edlin...@hotmail.de>

	PR middle-end/89544
	* function.c (assign_param_data_one): Remove unused data members.
	(assign_parm_find_stack_rtl): Use larger alignment when possible.
	(assign_parm_adjust_stack_rtl): Revise STRICT_ALIGNMENT check.
	(assign_parm_setup_reg): Handle misaligned stack arguments.

testsuite:
2019-07-30  Bernd Edlinger  <bernd.edlin...@hotmail.de>

	PR middle-end/89544
	* gcc.target/arm/unaligned-argument-1.c: New test.
	* gcc.target/arm/unaligned-argument-2.c: New test.

Index: gcc/function.c
===================================================================
--- gcc/function.c	(revision 273767)
+++ gcc/function.c	(working copy)
@@ -2274,8 +2274,6 @@ struct assign_parm_data_one
   int partial;
   BOOL_BITFIELD named_arg : 1;
   BOOL_BITFIELD passed_pointer : 1;
-  BOOL_BITFIELD on_stack : 1;
-  BOOL_BITFIELD loaded_in_reg : 1;
 };
 
 /* A subroutine of assign_parms.  Initialize ALL.  */
@@ -2699,8 +2697,23 @@ assign_parm_find_stack_rtl (tree parm, struct assi
      intentionally forcing upward padding.  Otherwise we have to come
      up with a guess at the alignment based on OFFSET_RTX.  */
   poly_int64 offset;
-  if (data->locate.where_pad != PAD_DOWNWARD || data->entry_parm)
+  if (data->locate.where_pad == PAD_NONE || data->entry_parm)
     align = boundary;
+  else if (data->locate.where_pad == PAD_UPWARD)
+    {
+      align = boundary;
+      /* If the argument offset is actually more aligned than the nominal
+	 stack slot boundary, take advantage of that excess alignment.
+	 Don't make any assumptions if STACK_POINTER_OFFSET is in use.  */
+      if (poly_int_rtx_p (offset_rtx, &offset)
+	  && STACK_POINTER_OFFSET == 0)
+	{
+	  unsigned int offset_align = known_alignment (offset) * BITS_PER_UNIT;
+	  if (offset_align == 0 || offset_align > STACK_BOUNDARY)
+	    offset_align = STACK_BOUNDARY;
+	  align = MAX (align, offset_align);
+	}
+    }
   else if (poly_int_rtx_p (offset_rtx, &offset))
     {
       align = least_bit_hwi (boundary);
@@ -2813,8 +2826,9 @@ assign_parm_adjust_stack_rtl (struct assign_parm_d
      ultimate type, don't use that slot after entry.  We'll make another
      stack slot, if we need one.  */
   if (stack_parm
-      && ((STRICT_ALIGNMENT
-	   && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
+      && ((GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm)
+	   && targetm.slow_unaligned_access (data->nominal_mode,
+					     MEM_ALIGN (stack_parm)))
 	  || (data->nominal_type
 	      && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
 	      && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
@@ -3292,6 +3306,23 @@ assign_parm_setup_reg (struct assign_parm_data_all
 
       did_conversion = true;
     }
+  else if (MEM_P (data->entry_parm)
+	   && GET_MODE_ALIGNMENT (promoted_nominal_mode)
+	      > MEM_ALIGN (data->entry_parm)
+	   && targetm.slow_unaligned_access (promoted_nominal_mode,
+					     MEM_ALIGN (data->entry_parm)))
+    {
+      enum insn_code icode = optab_handler (movmisalign_optab,
+					    promoted_nominal_mode);
+
+      if (icode != CODE_FOR_nothing)
+	emit_insn (GEN_FCN (icode) (parmreg, validated_mem));
+      else
+	rtl = parmreg = extract_bit_field (validated_mem,
+			GET_MODE_BITSIZE (promoted_nominal_mode), 0,
+			unsignedp, parmreg,
+			promoted_nominal_mode, VOIDmode, false, NULL);
+    }
   else
     emit_move_insn (parmreg, validated_mem);
 
Index: gcc/testsuite/gcc.target/arm/unaligned-argument-1.c
===================================================================
--- gcc/testsuite/gcc.target/arm/unaligned-argument-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/arm/unaligned-argument-1.c	(working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-marm -march=armv6 -mno-unaligned-access -mfloat-abi=soft -mabi=aapcs -O3" } */
+
+struct s {
+  int a, b;
+} __attribute__((aligned(8)));
+
+struct s f0;
+
+void f(int a, int b, int c, int d, struct s f)
+{
+  f0 = f;
+}
+
+/* { dg-final { scan-assembler-times "ldrd" 1 } } */
+/* { dg-final { scan-assembler-times "strd" 1 } } */
+/* { dg-final { scan-assembler-times "stm" 0 } } */
Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c
===================================================================
--- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c	(working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-marm -march=armv6 -mno-unaligned-access -mfloat-abi=soft -mabi=aapcs -O3" } */
+
+struct s {
+  int a, b;
+} __attribute__((aligned(8)));
+
+struct s f0;
+
+void f(int a, int b, int c, int d, int e, struct s f)
+{
+  f0 = f;
+}
+
+/* { dg-final { scan-assembler-times "ldrd" 0 } } */
+/* { dg-final { scan-assembler-times "strd" 0 } } */
+/* { dg-final { scan-assembler-times "stm" 1 } } */

Reply via email to