sameerds added inline comments.
================ Comment at: llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp:566 + return false; + }; + ---------------- sameerds wrote: > jdoerfert wrote: > > sameerds wrote: > > > jdoerfert wrote: > > > > sameerds wrote: > > > > > jdoerfert wrote: > > > > > > jdoerfert wrote: > > > > > > > sameerds wrote: > > > > > > > > jdoerfert wrote: > > > > > > > > > You can use AAPointerInfo for the call site return > > > > > > > > > IRPosition. It will (through the iterations) gather all > > > > > > > > > accesses and put them into "bins" based on offset and size. > > > > > > > > > It deals with uses in calls, etc. and if there is stuff > > > > > > > > > missing it is better to add it in one place so we benefit > > > > > > > > > throughout. > > > > > > > > I am not following what you have in mind. "implicitarg_ptr" is > > > > > > > > a pointer returned by an intrinsic that reads an ABI-defined > > > > > > > > register. I need to check that for a given call-graph, a > > > > > > > > particular range of bytes relative to that base pointer are > > > > > > > > never accessed. The above DFS on the uses conservatively > > > > > > > > assumes that such a load exists unless it can conclusively > > > > > > > > trace every use of the base pointer. This may include the > > > > > > > > pointer being passed to an extern function or being stored into > > > > > > > > a different memory location (although we don't expect ABI > > > > > > > > registers being capture this way). I am not seeing how to > > > > > > > > construct this around AAPointerInfo. As far as I can see, the > > > > > > > > public interface only talks about uses that are recognized as > > > > > > > > loads and stores. > > > > > > > Not actually tested, replaces the function body. Depends on > > > > > > > D119249. > > > > > > > ``` > > > > > > > const auto PointerInfoAA = A.getAAFor<AAPointerInfo>(*this, > > > > > > > IRPosition::callback_returned(cast<CallBase>(Ptr)), > > > > > > > DepClassTy::Required); > > > > > > > if (!PointerInfoAA.getState().isValidState()) > > > > > > > return true; // Abort (which is weird as false is abort in the > > > > > > > other CB). > > > > > > > AAPointerInfo::OffsetAndSize OAS(*Position, /* probably look > > > > > > > pointer width up in DL */ 8); > > > > > > > return !forallInterferingAccesses(OAS, [](const > > > > > > > AAPointerInfo::Access &Acc, bool IsExact) { > > > > > > > return Acc.getRemoteInst()->isDroppable(); }); > > > > > > > ``` > > > > > > You don't actually need the state check. > > > > > > And as I said, this will take care of following pointers passed > > > > > > into callees or through memory to other places, all while ignoring > > > > > > dead code, etc. > > > > > I see now. forallInterferingAccesses() does check for valid state on > > > > > entry, which is sufficient to take care of all the opaque uses like a > > > > > call to an extern function or a complicated phi or a capturing store. > > > > > Thanks a ton ... this has been very educational! > > > > Yes, all "forAll" functions will return `false` if we cannot visit > > > > "all". Though, these methods will utilize the smarts, e.g., ignore what > > > > is dead, look at loads if the value is stored in a way we can track it > > > > through memory, transfer accesses in a callee to the caller "space" if > > > > a pointer is passed to the callee,... etc. > > > > Yes, all "forAll" functions will return `false` if we cannot visit > > > > "all". Though, these methods will utilize the smarts, e.g., ignore what > > > > is dead, look at loads if the value is stored in a way we can track it > > > > through memory, transfer accesses in a callee to the caller "space" if > > > > a pointer is passed to the callee,... etc. > > > > > > @jdoerfert, do you see D119249 landing soon? We are kinda on a short > > > runway here (less than a handful of days) and hoping to land this review > > > quickly because it solves an important issue. I would prefer to have the > > > check that you outlined, but the alternative is to let my version through > > > now, and then update it when the new interface becomes available. > > Did you test my proposed code? I'll land my patch tomorrow, if yours works > > as you expect with the proposed AAPointerInfo use, great. If it doesn't I > > don't necessarily mind you merging something else with a clear intention to > > address issues and remove duplication as we go. > > > > I can lift my commit block as we go and @arsenm can also give it a > > good-to-go if need be. > Lifting your commit block would be useful in general. I certainly do not > intend to submit something in a hurry. > > I applied your change and tested the proposed code. It gives more pessimistic > results than my original crude version, i.e., it marks some uses of the > `implicitarg_ptr` as accessing the target range, when I am pretty sure it > does not. See `test_kernel42` in the lit test `hsa-metadata-hostcall-v3.ll` > in this review. It is intended to access eight bytes from 16 onwards, which > clearly does not overlap with offset 24, which is the start of the target > range (hostcall pointer). > > Strangely, in the debug messages, the `AAPointerInfoCallsiteReturned()` > constructor is called, but I don't see any messages marked [AAPointerInfo]. Good news. The PointerInfo works as expected ... it's just that AMDGPUAttributor has a very short Allowed list, and I needed to add PointerInfo there. So @jdoerfert, D119249 works for me with the suggested changes. If that makes it to mainline, I'll update this change to use it. @arsenm do you see any issues with enabling AAPointerInfo in the AMDGPUAttributor? Will there be concerns about it being "too heavy" or something? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D119216/new/ https://reviews.llvm.org/D119216 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits