On Wed, Jun 26, 2013 at 4:41 PM, Yufeng Zhang <yufeng.zh...@arm.com> wrote: > On 06/27/13 00:04, Andrew Pinski wrote: >> >> On Wed, Jun 26, 2013 at 3:39 PM, Yufeng Zhang<yufeng.zh...@arm.com> >> wrote: >>> >>> This patch updates assign_parm_find_data_types to assign passed_mode and >>> nominal_mode with the mode of the built pointer type instead of the >>> hard-coded Pmode in the case of pass-by-reference. This is in line with >>> the >>> assignment to passed_mode and nominal_mode in other cases inside the >>> function. >>> >>> assign_parm_find_data_types generally uses TYPE_MODE to calculate >>> passed_mode and nominal_mode: >>> >>> /* Find mode of arg as it is passed, and mode of arg as it should be >>> during execution of this function. */ >>> passed_mode = TYPE_MODE (passed_type); >>> nominal_mode = TYPE_MODE (nominal_type); >>> >>> this includes the case when the passed argument is a pointer by itself. >>> >>> However there is a discrepancy when it deals with argument passed by >>> invisible reference; it builds the argument's corresponding pointer type, >>> but sets passed_mode and nominal_mode with Pmode directly. >>> >>> This is OK for targets where Pmode == ptr_mode, but on AArch64 with ILP32 >>> they are different with Pmode as DImode and ptr_mode as SImode. When such >>> a >>> reference is passed on stack, the reference is prepared by the caller in >>> the >>> lower 4 bytes of an 8-byte slot but is fetched by the callee as an 8-byte >>> datum, of which the higher 4 bytes may contain junk. It is probably the >>> combination of Pmode != ptr_mode and the particular ABI specification >>> that >>> make the AArch64 ILP32 the first target on which the issue manifests >>> itself. >>> >>> Bootstrapped on x86_64-none-linux-gnu. >>> >>> OK for the trunk? >> >> >> >> IA64-hpux also uses Pmode != ptr_mode, can you provide the testcase >> which fails without this change? >> I used a powerpc64 target where Pmode != ptr_mode which did not hit >> this bug either. > > > The issue was firstly observed in one of the compat tests which passes a > large number of non-small structures. The following is a trimmed-down > reproducible code snippet (although not runnable but shall be easy to be > make runnable): > > struct s5 > { > double a; > double b; > double c; > double d; > double e; > } gS; > > double foo (struct s5 p1, struct s5 p2,struct s5 p3,struct s5 p4,struct s5 > p5,struct s5 p6,struct s5 p7,struct s5 p8, struct s5 p9) > { > return p9.c; > } > --------------- CUT --------------- > > The code-gen (-O2) without the patch is: > > .text > .align 2 > .global foo > .type foo, %function > foo: > ldr x0, [sp] <<=== here! > ldr d0, [x0,16] > ret > .size foo, .-foo > > Where the arrow points is the load of the pointer to 'p9' that is passed on > stack. The instruction really should be ldr w0, [sp], i.e. the pointer mode > is SImode rather than DImode. > > It needs a number of conditions for the issue to manifest: > > 1. pass-by-reference; on aarch64 one example is a struct that is larger than > 16 bytes. > 2. the reference is passed on stack; on aarch64, this usually only happens > after registers x0 - x7 are used. > 3. the size of stack slot for passing pointer is larger than the pointer > size; on aarch64, it is 8-byte vs. 4-byte > 4. the unused part of the stack slot is not zeroed out, i.e. undefined by > the ABI
This is the real issue. I think it is better if we change the ABI to say they are zero'd. It really makes things like this a mess. > 5. in the runtime, the unused part of such a stack slot contains junk. > > The runtime segmentation fault may only be generated when all the above > conditions are met. I'm not familiar with IA64-hpux or powerpc64 procedure > call ABIs, but I guess those targets are just being lucky? Or rather their ABIs all say are zero or sign extended for values less than 8 byte wide. Thanks, Andrew Pinski