PiotrFusik marked 2 inline comments as done.
PiotrFusik added inline comments.


================
Comment at: clang/lib/Headers/opencl-c.h:15594
+int     __ovld sub_group_elect(void);
+int     __ovld sub_group_non_uniform_all( int predicate );
+int     __ovld sub_group_non_uniform_any( int predicate );
----------------
Anastasia wrote:
> PiotrFusik wrote:
> > Anastasia wrote:
> > > PiotrFusik wrote:
> > > > Anastasia wrote:
> > > > > PiotrFusik wrote:
> > > > > > Anastasia wrote:
> > > > > > > PiotrFusik wrote:
> > > > > > > > Anastasia wrote:
> > > > > > > > > Ideally `convergent` attribute was intended to be used in the 
> > > > > > > > > non-divergent scenarios. So I don't know if it's going to do 
> > > > > > > > > what is needed here. Did you look into this already?
> > > > > > > > > 
> > > > > > > > > If we look at the Clang documentation it says:
> > > > > > > > > 
> > > > > > > > > > In languages designed for SPMD/SIMT programming model, e.g. 
> > > > > > > > > > OpenCL or CUDA, the call instructions of a function with 
> > > > > > > > > > this attribute must be executed by all work items or 
> > > > > > > > > > threads in a work group or sub group.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > I remember @nhaehnle  was looking at using `convergent` with 
> > > > > > > > > operations in the divergent control flow some time ago 
> > > > > > > > > https://reviews.llvm.org/D68994? I am not sure where this 
> > > > > > > > > thread ended up and whether we can expect this to work 
> > > > > > > > > currently?
> > > > > > > > Thanks for pointing this out! My understanding of the 
> > > > > > > > `convergent` attribute was that it's for uniform control flow, 
> > > > > > > > as per the documentation you cited.
> > > > > > > > A quick check shows that Intel Graphics Compiler doesn't suffer 
> > > > > > > > from this invalid optimization.
> > > > > > > > Yet I agree that the functions should be marked somehow. It is 
> > > > > > > > with `__conv` ?
> > > > > > > 
> > > > > > > > A quick check shows that Intel Graphics Compiler doesn't suffer 
> > > > > > > > from this invalid optimization.
> > > > > > > 
> > > > > > > Did you check on the examples that @nhaehnle provided in the 
> > > > > > > review?
> > > > > > > 
> > > > > > > > Yet I agree that the functions should be marked somehow. It is 
> > > > > > > > with __conv ?
> > > > > > > 
> > > > > > > I am not sure there is a solution to this in the upstream LLVM at 
> > > > > > > present. I am hoping @nhaehnle  can provide us more information. 
> > > > > > > Alternatively, we could run his examples against the latest LLVM 
> > > > > > > revision to see if the problem still remains or not. I am not 
> > > > > > > saying that it should block this review, I am ok that we commit 
> > > > > > > your patch but if we know already that this functionality can't 
> > > > > > > be fully supported it would be good to at least create a bug to 
> > > > > > > OpenCL component to record this.
> > > > > > The following example:
> > > > > > ```
> > > > > > kernel void test(global int *data)
> > > > > > {
> > > > > >     uint id = (uint) get_global_id(0);
> > > > > >     if (id < 4)
> > > > > >         data[id] = sub_group_elect();
> > > > > >     else
> > > > > >         data[id] = sub_group_elect();
> > > > > > }
> > > > > > ```
> > > > > > with `clang -S -emit-llvm` emits invalid code both with and without 
> > > > > > the `convergent` attribute.
> > > > > > 
> > > > > > We don't have this problem in our Intel Graphics Compiler though. 
> > > > > > That's because we replace the subgroup functions with internal 
> > > > > > intrinsics early enough. These intrinsics are marked with both the 
> > > > > > `convergent` and `inaccessiblememonly` attributes. Then in the 
> > > > > > `SimplifyCFG` pass we prevent optimization if both of these 
> > > > > > attributes are present: 
> > > > > > https://github.com/intel/intel-graphics-compiler/blob/master/IGC/Compiler/GenTTI.cpp#L397
> > > > > > 
> > > > > > I don't understand what can go wrong in the second example (with 
> > > > > > the jump threading pass) ?
> > > > > > 
> > > > > Thanks for checking. Do you mind creating a bug to clang for now 
> > > > > under OpenCL component 
> > > > > https://bugs.llvm.org/enter_bug.cgi?product=clang although we might 
> > > > > reclassify this for a wider scope later on.
> > > > > 
> > > > > 
> > > > > 
> > > > > >  Then in the SimplifyCFG pass we prevent optimization if both of 
> > > > > > these attributes are present: 
> > > > > > https://github.com/intel/intel-graphics-compiler/blob/master/IGC/Compiler/GenTTI.cpp#L397
> > > > > 
> > > > > I see. Perhaps we need to come up with a new semantic of `convergent` 
> > > > > and update LLVM passes... or maybe we have to introduce a new 
> > > > > attribute. Not sure. I suggest we add a comment to 
> > > > > https://reviews.llvm.org/D68994 and see if it gets picked up.
> > > > > 
> > > > > 
> > > > Do you mean a bug for the invalid optimizations of these new subgroup 
> > > > functions?
> > > > I added a comment to https://reviews.llvm.org/D68994.
> > > > Do you mean a bug for the invalid optimizations of these new subgroup 
> > > > functions?
> > > 
> > > yes, you can just copy/paste your test above:
> > > 
> > > 
> > > > The following example:
> > > > 
> > > > kernel void test(global int *data)
> > > > {
> > > >     uint id = (uint) get_global_id(0);
> > > >     if (id < 4)
> > > >         data[id] = sub_group_elect();
> > > >     else
> > > >         data[id] = sub_group_elect();
> > > > }
> > > > 
> > > > with clang -S -emit-llvm emits invalid code both with and without the 
> > > > convergent attribute.
> > > 
> > > If you can it would be good to highlight what is invalid in IR produced 
> > > by upstream LLVM now.
> > > 
> > > 
> > Shall I submit the bug before this change is merged?
> The order is not important as long as we have it eventually. :)
Submitted https://bugs.llvm.org/show_bug.cgi?id=46199


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79781/new/

https://reviews.llvm.org/D79781



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to