ychen added a comment. In D97915#2861178 <https://reviews.llvm.org/D97915#2861178>, @ChuanqiXu wrote:
> In D97915#2861036 <https://reviews.llvm.org/D97915#2861036>, @ychen wrote: > >> In D97915#2860984 <https://reviews.llvm.org/D97915#2860984>, @ChuanqiXu >> wrote: >> >>> In D97915#2860916 <https://reviews.llvm.org/D97915#2860916>, @ychen wrote: >>> >>>> In D97915#2859237 <https://reviews.llvm.org/D97915#2859237>, @ChuanqiXu >>>> wrote: >>>> >>>>> In D97915#2848816 <https://reviews.llvm.org/D97915#2848816>, @ychen wrote: >>>>> >>>>>>> Thanks for clarifying. Let's solve the semantics problem first. >>>>>>> With the introduction about 'raw frame', I think it's necessary to >>>>>>> introduce this concept in the section 'Switched-Resume Lowering' or >>>>>>> even the section 'Introduction' in the document. Add a section to tell >>>>>>> the terminology is satisfied too. >>>>>> >>>>>> Done. >>>>>> >>>>>>> Then why we defined both 'llvm.coro.raw.frame.ptr.offset' and >>>>>>> 'llvm.coro.raw.frame.ptr.addr' together? It looks like refer to the >>>>>>> same value finally. It looks like 'llvm.coro.raw.frame.ptr.offset' are >>>>>>> trying to solve the problem about memory leak. But I think we could use >>>>>>> llvm.coro.raw.frame.ptr.addr directly instead of traversing the frame >>>>>>> (Maybe we need to add an intrinsic `llvm.coro.raw.size`). Then we can >>>>>>> omit a field in the frame to save space. >>>>>> >>>>>> ("llvm.coro.raw.frame.ptr.offset" is an offset from coroutine frame >>>>>> address instead of raw frame pointer) >>>>>> >>>>>> Apologies for the confusion. I've briefly explained it here >>>>>> https://reviews.llvm.org/D102145#2752445 I think it is not clear. >>>>>> "llvm.coro.raw.frame.ptr.addr" is conceptually "the address of a >>>>>> coroutine frame field storing the `raw frame pointer`" only after >>>>>> `insertSpills` in CoroFrame.cpp. Before that, >>>>>> "llvm.coro.raw.frame.ptr.addr" is actually an alloca storing the `raw >>>>>> frame pointer` (try grepping "alloc.frame.ptr" in this review page). >>>>>> Using "llvm.coro.raw.frame.ptr.offset" instead of >>>>>> "llvm.coro.raw.frame.ptr.addr" is doable which looks like below, please >>>>>> check line 31. The downside is that the write to coroutine frame is not >>>>>> through an alloca but a direct write. It is unusual because all fields >>>>>> in the frame are stored as 1. special/header fields 2. alloca 3. >>>>>> splills. Doing the write indirectly as Alloca makes me comfortable. The >>>>>> tradeoff is one extra intrinsic "llvm.coro.raw.frame.ptr.addr". What do >>>>>> you think? >>>>>> >>>>>> 19 coro.alloc.align: ; preds = >>>>>> %coro.alloc.check.align >>>>>> 20 %3 = sub nsw i64 64, 16 >>>>>> 21 %4 = add i64 128, %3 >>>>>> 22 %call1 = call noalias nonnull i8* @_Znwm(i64 %4) #13 >>>>>> 23 %mask = sub i64 64, 1 >>>>>> 24 %intptr = ptrtoint i8* %call1 to i64 >>>>>> 25 %over_boundary = add i64 %intptr, %mask >>>>>> 26 %inverted_mask = xor i64 %mask, -1 >>>>>> 27 %aligned_intptr = and i64 %over_boundary, %inverted_mask >>>>>> 28 %diff = sub i64 %aligned_intptr, %intptr >>>>>> 29 %aligned_result = getelementptr inbounds i8, i8* %call1, i64 %diff >>>>>> 30 call void @llvm.assume(i1 true) [ "align"(i8* %aligned_result, >>>>>> i64 64) ] >>>>>> 31 store i8* %call1, i8** %alloc.frame.ptr, align 8 >>>>>> >>>>>> >>>>>> ; Replace line 31 with below, and must makes sure line 46~line 48 >>>>>> is skipped. >>>>>> ; %poff = call i32 @llvm.coro.raw.frame.ptr.offset.i32() >>>>>> ; %addr = getelementptr inbounds i8, i8* %aligned_result, i32 >>>>>> %poff >>>>>> ; %addr1 = bitcast i8* %addr to i8** >>>>>> ; store i8* %call1, i8** %addr1, align 8 >>>>>> >>>>>> >>>>>> 32 br label %coro.init.from.coro.alloc.align >>>>>> 33 >>>>>> 34 coro.init.from.coro.alloc.align: ; preds = >>>>>> %coro.alloc.align >>>>>> 35 %aligned_result.coro.init = phi i8* [ %aligned_result, >>>>>> %coro.alloc.align ] >>>>>> 36 br label %coro.init >>>>>> 37 >>>>>> 38 coro.init: ; preds = >>>>>> %coro.init.from.entry, %coro.init.from.coro.alloc.align, %cor >>>>>> o.init.from.coro.alloc >>>>>> 39 %5 = phi i8* [ %.coro.init, %coro.init.from.entry ], [ >>>>>> %call.coro.init, %coro.init.from.coro.alloc ], [ %aligned_result >>>>>> .coro.init, %coro.init.from.coro.alloc.align ] >>>>>> 40 %FramePtr = bitcast i8* %5 to %f0.Frame* >>>>>> 41 %resume.addr = getelementptr inbounds %f0.Frame, %f0.Frame* >>>>>> %FramePtr, i32 0, i32 0 >>>>>> 42 store void (%f0.Frame*)* @f0.resume, void (%f0.Frame*)** >>>>>> %resume.addr, align 8 >>>>>> 43 %6 = select i1 true, void (%f0.Frame*)* @f0.destroy, void >>>>>> (%f0.Frame*)* @f0.cleanup >>>>>> 44 %destroy.addr = getelementptr inbounds %f0.Frame, %f0.Frame* >>>>>> %FramePtr, i32 0, i32 1 >>>>>> 45 store void (%f0.Frame*)* %6, void (%f0.Frame*)** %destroy.addr, >>>>>> align 8 >>>>>> 46 %7 = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, i32 >>>>>> 0, i32 2 >>>>>> 47 %8 = load i8*, i8** %alloc.frame.ptr, align 8 >>>>>> 48 store i8* %8, i8** %7, align 8 >>>>>> 49 br label %AllocaSpillBB >>>>>> 50 >>>>>> 51 AllocaSpillBB: ; preds = >>>>>> %coro.init >>>>>> 52 %.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* >>>>>> %FramePtr, i32 0, i32 4 >>>>>> 53 %ref.tmp.reload.addr = getelementptr inbounds %f0.Frame, >>>>>> %f0.Frame* %FramePtr, i32 0, i32 5 >>>>>> 54 %agg.tmp.reload.addr = getelementptr inbounds %f0.Frame, >>>>>> %f0.Frame* %FramePtr, i32 0, i32 6 >>>>>> 55 %ref.tmp5.reload.addr = getelementptr inbounds %f0.Frame, >>>>>> %f0.Frame* %FramePtr, i32 0, i32 7 >>>>>> 56 %agg.tmp8.reload.addr = getelementptr inbounds %f0.Frame, >>>>>> %f0.Frame* %FramePtr, i32 0, i32 8 >>>>>> 57 %__promise.reload.addr = getelementptr inbounds %f0.Frame, >>>>>> %f0.Frame* %FramePtr, i32 0, i32 10 >>>>>> 58 br label %PostSpill >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> Then I am a little confused for the design again, since we would treat >>>>>>> the value for CoroBegin as the address of coroutine frame in the past >>>>>>> and it looks like to be the raw frame now. Let me reconsider if it is >>>>>>> OK. >>>>>> >>>>>> The returned value of CoroBegin is still coroutine frame not a raw frame >>>>>> even if the frame is overaligned. You could check the above code. >>>>> >>>>> Thanks for clarifying! >>>>> >>>>> I don't understand why we need to store the address for coroutine raw >>>>> frame in the coroutine frame. For example, `%call1` in your example marks >>>>> the address for the raw frame. Then can we use the value `%call1` in >>>>> every place where we want to use the address for coroutine frame? >>>>> If yes, I think we could emit an intrinsic called 'llvm.coro.raw.frame' >>>>> in the frontend if we need to use the address for the raw frame. Then in >>>>> the middle end, we could replace `llvm.coro.raw.frame` with `%call1` >>>>> simply. Similarly, we could define intrinsic `llvm.coro.raw.frame.size`. >>>>> As far as I know from the codes, the address for the coroutine frame is >>>>> mainly used for deallocation. So it should be fine I guess. >>>>> >>>>> --- >>>>> >>>>> Then the code generated now looks roughly like: >>>>> >>>>> if (should over align) { >>>>> /// ... >>>>> mem = ... >>>>> } else { >>>>> /// ... >>>>> mem = ... >>>>> } >>>>> coro.begin(id, mem); >>>>> >>>>> It looks redundant since the `then` part and `else` part looks very >>>>> similar. I understand it would be eliminated in the middle end. But >>>>> another problem is that the redundant implementation in clang. Maybe we >>>>> could solve it by refactoring. >>>>> But I am wondering if it is possible to use another pattern (assume >>>>> `llvm.coro.alloc` returns true): >>>>> >>>>> %raw.frame.ptr = new(call @llvm.coro.raw.frame.size()) >>>>> %true.frame.ptr = call @llvm.coro.frame(%raw.frame.ptr, NEW_ALIGN) ; we >>>>> need a better name >>>>> call @llvm.coro.begin(coro.id, %true.frame.ptr) >>>>> >>>>> Then for `llvm.coro.frame`, we could return `@raw.frame.ptr ` simply if >>>>> the alignment could be satisfied (alignment needed is less than >>>>> NEW_ALIGN). Or we could do simply to align up for the coroutine frame. >>>>> There are many APIs in Align.h. >>>>> And for the destruction, we could emit: >>>>> >>>>> call @delete(%raw.frame.ptr, call @llvm.coro.raw.frame.size()) >>>>> >>>>> In this way, I guess we would get simpler implementation and generated >>>>> codes. >>>>> >>>>> BTW, if we choose to do so, the semantics for llvm.coro.raw.frame.ptr and >>>>> llvm.coro.size would change slightly. They would stands for the address >>>>> and size for the coroutine frame if we don't need over alignment. >>>>> >>>>> How do you think about this? >>>> >>>> I was confused by this and @rjmccall explained it here >>>> https://reviews.llvm.org/D97915/new/#2604871. Basically, we could not >>>> recover "raw frame pointer" (`%call1`) from coroutine frame pointer >>>> statically at deallocation time. >>> >>> Oh, I understand why we need to store the address for raw frame now. >>> Another question is that how do you think combine the pattern: >>> >>> if (should over align) { >>> /// ... >>> mem = ... >>> } else { >>> /// ... >>> mem = ... >>> } >>> coro.begin(id, mem); >>> >>> into this one: >>> >>> %true.frame.ptr = call @llvm.coro.create.frame(new(call >>> @llvm.coro.raw.frame.size()), NEW_ALIGN) ; we need a better name >>> >>> >>> ; It would be lowered to store the address of the raw >>> frame to the alloca in the middle end if needed >>> call @llvm.coro.begin(coro.id, %true.frame.ptr) >>> >>> and this one: >>> >>> call @delete(call @llvm.coro.raw.frame.ptr(), call >>> @llvm.coro.raw.frame.size()) ; Then >>> use `llvm.coro.raw.frame.ptr()` and `llvm.coro.raw.frame.size()` directly >>> whenever we want. >>> >>> It looks like we could generate the same code in the front for normal and >>> over aligned coroutines. >> >> Yeah, I think it works for this patch alone. It shifts the semantic lowering >> from Clang to LLVM but does not perform less work. For future language >> support like D102147 <https://reviews.llvm.org/D102147>, >> @llvm.coro.create.frame needs to be repurposed based on the new semantics >> and that seems a sign that it should be implemented in frontend. > > For language support, `::operator new(size_t, align_t)`, I think it could be > implemented like: > > %allocated = call @new(call @llvm.coro.raw.frame.size(), align_val) > %true.frame.ptr = call @llvm.coro.create.frame(%allocated, 0) ; if the > second argument is 0, it means `llvm.coro.create.frame` could be lowered to > `%allocated` simply. > call @llvm.coro.begin(coro.id, %true.frame.ptr) > > It looks not hard to implement. And we don't need to refactor the CodeGen > part a lot. In this way, I think the main effort to support `::operator > new(size_t, align_t)` would be in the Sema part and the works remained in > CodeGen part would be little. It wouldn't touch the middle end part neither. I agree that something like this is simpler implementation-wise. Although in spirit, it is similar to D100739 <https://reviews.llvm.org/D100739> which we decided not to pursue. What is the middle-end? is it LLVM? >> It shifts the semantic lowering from Clang to LLVM but does not perform less >> work. > > I think it would be simpler. I agree it would be simpler. > At least, we don't need to emit `getReturnStmtOnAllocFailure` twice and `getReturnStmtOnAllocFailure` is called twice but the code is emitted once. > we don't need to touch `CallCoroDelete` neither. If we don't touch `CallCoroDelete`, then we cannot do the equivalent work in LLVM since there is no concept of deallocation function concept there. > And we don't organize the basic blocks in the CodeGenCoroutineBody. That's true. It is delayed until CoroSplit. > And we could emit simpler AlignupTo (Although it could be simplified further, > I believe). D102147 <https://reviews.llvm.org/D102147> has a version of it. > And the extra work we need to do is to compare the alignment requirement for > the coroutine frame with the second argument of `llvm.coro.create.frame` to > see if we need to over align coroutine frame. > If yes, we need to lower the `llvm.coro.create.frame` to compute the true > address for the coroutine frame and store the raw frame address. > If no, we could return `%allocated` simply. Yes, it is similar to CoroFrame.cpp changes in D102147 <https://reviews.llvm.org/D102147>. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D97915/new/ https://reviews.llvm.org/D97915 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits