sameerds added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:566
+      return false;
+    };
+
----------------
sameerds wrote:
> jdoerfert wrote:
> > sameerds wrote:
> > > jdoerfert wrote:
> > > > sameerds wrote:
> > > > > jdoerfert wrote:
> > > > > > jdoerfert wrote:
> > > > > > > sameerds wrote:
> > > > > > > > jdoerfert wrote:
> > > > > > > > > You can use AAPointerInfo for the call site return 
> > > > > > > > > IRPosition. It will (through the iterations) gather all 
> > > > > > > > > accesses and put them into "bins" based on offset and size. 
> > > > > > > > > It deals with uses in calls, etc. and if there is stuff 
> > > > > > > > > missing it is better to add it in one place so we benefit 
> > > > > > > > > throughout. 
> > > > > > > > I am not following what you have in mind. "implicitarg_ptr" is 
> > > > > > > > a pointer returned by an intrinsic that reads an ABI-defined 
> > > > > > > > register. I need to check that for a given call-graph, a 
> > > > > > > > particular range of bytes relative to that base pointer are 
> > > > > > > > never accessed. The above DFS on the uses conservatively 
> > > > > > > > assumes that such a load exists unless it can conclusively 
> > > > > > > > trace every use of the base pointer. This may include the 
> > > > > > > > pointer being passed to an extern function or being stored into 
> > > > > > > > a different memory location (although we don't expect ABI 
> > > > > > > > registers being capture this way). I am not seeing how to 
> > > > > > > > construct this around AAPointerInfo. As far as I can see, the 
> > > > > > > > public interface only talks about uses that are recognized as 
> > > > > > > > loads and stores.
> > > > > > > Not actually tested, replaces the function body. Depends on 
> > > > > > > D119249.
> > > > > > > ```
> > > > > > > const auto PointerInfoAA = A.getAAFor<AAPointerInfo>(*this, 
> > > > > > > IRPosition::callback_returned(cast<CallBase>(Ptr)), 
> > > > > > > DepClassTy::Required);
> > > > > > > if (!PointerInfoAA.getState().isValidState())
> > > > > > >   return true; // Abort (which is weird as false is abort in the 
> > > > > > > other CB).
> > > > > > > AAPointerInfo::OffsetAndSize OAS(*Position, /* probably look 
> > > > > > > pointer width up in DL */ 8);
> > > > > > > return !forallInterferingAccesses(OAS, [](const 
> > > > > > > AAPointerInfo::Access &Acc, bool IsExact) {
> > > > > > >    return Acc.getRemoteInst()->isDroppable(); });
> > > > > > > ```
> > > > > > You don't actually need the state check.
> > > > > > And as I said, this will take care of following pointers passed 
> > > > > > into callees or through memory to other places, all while ignoring 
> > > > > > dead code, etc.
> > > > > I see now. forallInterferingAccesses() does check for valid state on 
> > > > > entry, which is sufficient to take care of all the opaque uses like a 
> > > > > call to an extern function or a complicated phi or a capturing store. 
> > > > > Thanks a ton ... this has been very educational!
> > > > Yes, all "forAll" functions will return `false` if we cannot visit 
> > > > "all". Though, these methods will utilize the smarts, e.g., ignore what 
> > > > is dead, look at loads if the value is stored in a way we can track it 
> > > > through memory, transfer accesses in a callee to the caller "space" if 
> > > > a pointer is passed to the callee,... etc.
> > > > Yes, all "forAll" functions will return `false` if we cannot visit 
> > > > "all". Though, these methods will utilize the smarts, e.g., ignore what 
> > > > is dead, look at loads if the value is stored in a way we can track it 
> > > > through memory, transfer accesses in a callee to the caller "space" if 
> > > > a pointer is passed to the callee,... etc.
> > > 
> > > @jdoerfert, do you see  D119249 landing soon? We are kinda on a short 
> > > runway here (less than a handful of days) and hoping to land this review 
> > > quickly because it solves an important issue. I would prefer to have the 
> > > check that you outlined, but the alternative is to let my version through 
> > > now, and then update it when the new interface becomes available.
> > Did you test my proposed code? I'll land my patch tomorrow, if yours works 
> > as you expect with the proposed AAPointerInfo use, great. If it doesn't I 
> > don't necessarily mind you merging something else with a clear intention to 
> > address issues and remove duplication as we go.
> > 
> > I can lift my commit block as we go and @arsenm can also give it a 
> > good-to-go if need be.
> Lifting your commit block would be useful in general. I certainly do not 
> intend to submit something in a hurry.
> 
> I applied your change and tested the proposed code. It gives more pessimistic 
> results than my original crude version, i.e., it marks some uses of the 
> `implicitarg_ptr` as accessing the target range, when I am pretty sure it 
> does not. See `test_kernel42` in the lit test `hsa-metadata-hostcall-v3.ll` 
> in this review. It is intended to access eight bytes from 16 onwards, which 
> clearly does not overlap with offset 24, which is the start of the target 
> range (hostcall pointer).
> 
> Strangely, in the debug messages, the `AAPointerInfoCallsiteReturned()` 
> constructor is called, but I don't see any messages marked [AAPointerInfo].
Good news. The PointerInfo works as expected ... it's just that 
AMDGPUAttributor has a very short Allowed list, and I needed to add PointerInfo 
there.

So @jdoerfert, D119249 works for me with the suggested changes. If that makes 
it to mainline, I'll update this change to use it.

@arsenm do you see any issues with enabling AAPointerInfo in the 
AMDGPUAttributor? Will there be concerns about it being "too heavy" or 
something?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119216/new/

https://reviews.llvm.org/D119216

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to