pengfei added inline comments.
================ Comment at: clang/lib/CodeGen/TargetInfo.cpp:1858-1859 } - return getIndirectResult(Ty, /*ByVal=*/false, State); + bool ByVal = IsVectorCall && Ty->isFloatingType(); + return getIndirectResult(Ty, ByVal, State); } ---------------- rnk wrote: > pengfei wrote: > > rnk wrote: > > > I would try to refactor this so that the vectorcall HFA that can't be > > > passed in SSE regs falls through to the following logic. I suspect that > > > it correctly handles each case that we care about: > > > - double: direct > > > - vector: indirect for alignment > > > - aggregate: indirect for alignment, any HFA will presumably be aligned > > > to more than 32bits > > > > > Not sure if I understand it correctly, the HFA is not a floating type, so > > it won't be affected. Add a test case for it. > > MSVC passes it indirectly too. https://gcc.godbolt.org/z/3qf4dTYfv > Thanks, I see what you mean. I thought the code for handling overaligned > aggregates would trigger, passing any HFA indirectly, but it does not for > plain FP HFAs. You can observe the difference by replacing `double` in HFA2 > with `__int64`, and see that HFA2 is passed underaligned on the stack: > https://gcc.godbolt.org/z/jqx4xcnjq > > I still think this code would benefit from separating the regcall and > vectorcall cases, something like: > bool IsInReg = State.FreeSSERegs >= NumElts; > if (IsInReg) > State.FreeSSERegs -= NumElts; > if (IsRegCall) { > // handle regcall > if (IsInReg) > ... > } else { > // handle vectorcall > if (IsInReg) > ... > } > > They seem to have pretty different rules both when SSE regs are available and > when not. I can see the reason is non plain FP HFAs are not `HomogeneousAggregate` (possible be passed by SSE) on X86. Anyway, I got your point now. Refactor the code seems better, thanks! ================ Comment at: clang/test/CodeGen/vectorcall.c:157 +// X86-SAME: <4 x float> inreg noundef %xmm5, +// X86-SAME: double* noundef byval(double) align 4 %0) #endif ---------------- rnk wrote: > pengfei wrote: > > rnk wrote: > > > Why not pass the double directly? That should be ABI compatible: > > > https://gcc.godbolt.org/z/W4rjn63b5 > > Sorry, I'm not sure what's your mean here. Do you mean I should use your > > example as the test case? Here the case mocked `vectorcall_indirect_vec`, > > which I think is intended to check if the type, `inreg` and `byval` etc. > > are generated correctly. > I mean that these two LLVM prototypes are ABI compatible at the binary level > for i686, but the second is much easier to optimize: > double @byval(double* byval(double) %p) { > %v = load double, double* %p > ret double %v > } > double @direct(double %v) { > ret double %v > } > https://gcc.godbolt.org/z/MjEvdEKbT > > Clang should generate the prototype. I see your point now, sounds good to me, thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D134797/new/ https://reviews.llvm.org/D134797 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits