On Thu, Dec 08, 2022 at 09:17:38PM +0800, Jiufu Guo wrote:
> Segher Boessenkool <seg...@kernel.crashing.org> writes:
> > On Wed, Dec 07, 2022 at 08:00:08PM +0800, Jiufu Guo wrote:
> >> typedef struct SA {double a[3];} A;
> >> A ret_arg_pt (A *a) {return *a;} // on ppc64le, expect only 3 lfd(s)
> >> A ret_arg (A a) {return a;} // just empty fun body
> >> void st_arg (A a, A *p) {*p = a;} //only 3 stfd(s)
> >
> > What is this like if you use [5] instead?  Or use an ABI without
> > homogeneous aggregates?
> Thanks for this question!  I also tested the cases on different array
> types or different sizes, or mixed field types.
> 
> If it is out of the number of registers for passing the param
> or return, it is treated as a mem block.
> For parameter, it is partially passed via registers, and partially
> passing via stack.
> For return, it is returned via a pointer (with one invisible pointer
> parameter). And the <retval> of the function is not with parallel code.
> 
> This patch does not cover these cases.

Understood, sure; but my point is, can it degrade code quality in such
cases?  I don't see anything in the patch that precludes that.

> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.target/powerpc/pr65421-1.c
> >> @@ -0,0 +1,15 @@
> >> +/* PR target/65421 */
> >> +/* { dg-options "-O2" } */
> >> +/* { dg-require-effective-target has_arch_ppc64 } */
> >> +
> >> +typedef struct SA
> >> +{
> >> +  double a[2];
> >> +  long l;
> >> +} A;
> >> +
> >> +/* std 3 param regs to return slot */
> >> +A ret_arg (A a) {return a;}
> >> +/* { dg-final { scan-assembler-times {\mstd 4,0\(3\)\s} 1 } } */
> >> +/* { dg-final { scan-assembler-times {\mstd 5,8\(3\)\s} 1 } } *
> >> +/* { dg-final { scan-assembler-times {\mstd 6,16\(3\)\s} 1 } } */
> >
> > This is only correct on certain ABIs, probably only ELFv2 even.
> Thanks for point out this!
> This is only correct if the ABI allows this struct to be passed
> through integer registers, and return through the mem block.

And it needs to be in those specific registers / at those specific
offsets as well.

Btw, please leave out the \s?

> In the previous version, I added a requirement on ELFv2. As tested on
> BE environments, this case also pass. So, I deleted the requirement.

BE for ELFv2 also exists, fwiw.

> (While on BE environments, there is another issue: some unnecessary
> memory stores are not deleted.)

Huh.  Does that happen with the current compiler as well?  Do you have
an example?

> But with more reading of the code 'rs6000_function_arg', as you said,
> I'm not sure if this behavior meets other ABIs (at least, it seems,
> this is not correct on darwin64).
> So, as you said, we may add a requirement on ELFv2; Or leave this
> case there, and add "! target" when hitting failure?

If you do !target the testcase won't test much at all anymore ;-)

> > We certainly can improve the homogeneous aggregates stuff, but please
> > make sure you don't degrade all other stuff?  Older, as well as when
> > things are not an homogeneous aggregate, for example too big.  Can you
> > please add tests for such cases?
> Sure, thanks!  I encounter one issue in this kind of case (large struct)
> on a previous version path.

Perhaps it would be better to have a hook so that every target (and
subtarget) can fine tune exactly when this is done.  Then again, perhaps
I worry too much.


Segher

Reply via email to