rnk added inline comments.
================ Comment at: clang/lib/CodeGen/TargetInfo.cpp:1858-1859 } - return getIndirectResult(Ty, /*ByVal=*/false, State); + bool ByVal = IsVectorCall && Ty->isFloatingType(); + return getIndirectResult(Ty, ByVal, State); } ---------------- pengfei wrote: > rnk wrote: > > I would try to refactor this so that the vectorcall HFA that can't be > > passed in SSE regs falls through to the following logic. I suspect that it > > correctly handles each case that we care about: > > - double: direct > > - vector: indirect for alignment > > - aggregate: indirect for alignment, any HFA will presumably be aligned to > > more than 32bits > > > Not sure if I understand it correctly, the HFA is not a floating type, so it > won't be affected. Add a test case for it. > MSVC passes it indirectly too. https://gcc.godbolt.org/z/3qf4dTYfv Thanks, I see what you mean. I thought the code for handling overaligned aggregates would trigger, passing any HFA indirectly, but it does not for plain FP HFAs. You can observe the difference by replacing `double` in HFA2 with `__int64`, and see that HFA2 is passed underaligned on the stack: https://gcc.godbolt.org/z/jqx4xcnjq I still think this code would benefit from separating the regcall and vectorcall cases, something like: bool IsInReg = State.FreeSSERegs >= NumElts; if (IsInReg) State.FreeSSERegs -= NumElts; if (IsRegCall) { // handle regcall if (IsInReg) ... } else { // handle vectorcall if (IsInReg) ... } They seem to have pretty different rules both when SSE regs are available and when not. ================ Comment at: clang/test/CodeGen/vectorcall.c:157 +// X86-SAME: <4 x float> inreg noundef %xmm5, +// X86-SAME: double* noundef byval(double) align 4 %0) #endif ---------------- pengfei wrote: > rnk wrote: > > Why not pass the double directly? That should be ABI compatible: > > https://gcc.godbolt.org/z/W4rjn63b5 > Sorry, I'm not sure what's your mean here. Do you mean I should use your > example as the test case? Here the case mocked `vectorcall_indirect_vec`, > which I think is intended to check if the type, `inreg` and `byval` etc. are > generated correctly. I mean that these two LLVM prototypes are ABI compatible at the binary level for i686, but the second is much easier to optimize: double @byval(double* byval(double) %p) { %v = load double, double* %p ret double %v } double @direct(double %v) { ret double %v } https://gcc.godbolt.org/z/MjEvdEKbT Clang should generate the prototype. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D134797/new/ https://reviews.llvm.org/D134797 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits