[PATCH] D118084: [CUDA, NVPTX] Pass byval aggregates directly

Artem Belevich via Phabricator via cfe-commits Tue, 25 Jan 2022 16:20:55 -0800

tra planned changes to this revision.
tra added a comment.

Getting rid of byval helps getting rid of locals in quite a few places, but 
runs into a new problem. 😕

Looks like this change does have unexpected side-effects.
When we need to dynamically index into a struct passed directly, there's no 
easy way to do it as `extractvalue` only supports constant indices. 
With `byval` aggregates LLVM uses GEP which does allow using dynamic indexing. 
Alloca would only show up after `nvptx-lower-args` pass and that by that time 
IR would often be simple enough to eliminate that alloca. 
Now, clang generates  a local copy early on and, indexes into it dynamically 
with GEP... and then LLVM fails to eliminate the local copy because SROA fails 
to deal with dynamic indices and that in turn prevents IR optimizations that 
were possible without alloca.
https://github.com/llvm/llvm-project/issues/51734

That's rather unfortunate. This regression is serious enough to be a 
showstopper for my use case.

================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:7193
          EIT->getNumBits() > 64))
       return getNaturalAlignIndirect(Ty, /* byval */ true);
   }
----------------
jdoerfert wrote:
> When is this ever hit and should we not disable byval here too while we are 
> at it?
Basically it's saying "pass as byval pointer if it's an int that's larger than 
what we can lower".
Yes, I think passing such integer directly would make sense.

We may hit this if clang wants to pass `__i128` (do larger int types exist in 
clang?). 
I think (some of) this may be a leftover from the days when we didn't support 
i128 in CUDA/NVPTX. I think we do now.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118084/new/

https://reviews.llvm.org/D118084

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D118084: [CUDA, NVPTX] Pass byval aggregates directly

Reply via email to