[PATCH] D97915: [Coroutines] Handle overaligned frame allocation

Yuanfang Chen via Phabricator via cfe-commits Tue, 29 Jun 2021 16:48:13 -0700

ychen added a comment.

> Thanks for clarifying. Let's solve the semantics problem first.
> With the introduction about 'raw frame', I think it's necessary to introduce 
> this concept in the section 'Switched-Resume Lowering' or even the section 
> 'Introduction' in the document. Add a section to tell the terminology is 
> satisfied too.


Done.

> Then why we defined both 'llvm.coro.raw.frame.ptr.offset' and 
> 'llvm.coro.raw.frame.ptr.addr' together? It looks like refer to the same 
> value finally. It looks like 'llvm.coro.raw.frame.ptr.offset' are trying to 
> solve the problem about memory leak. But I think we could use 
> llvm.coro.raw.frame.ptr.addr directly instead of traversing the frame (Maybe 
> we need to add an intrinsic `llvm.coro.raw.size`).  Then we can omit a field 
> in the frame to save space.

("llvm.coro.raw.frame.ptr.offset" is an offset from coroutine frame address 
instead of raw frame pointer)

Apologies for the confusion. I've briefly explained it here 
https://reviews.llvm.org/D102145#2752445 I think it is not clear. 
"llvm.coro.raw.frame.ptr.addr" is conceptually "the address of a coroutine 
frame field storing the `raw frame pointer`" only after `insertSpills` in 
CoroFrame.cpp. Before that, "llvm.coro.raw.frame.ptr.addr" is actually an 
alloca storing the `raw frame pointer` (try grepping "alloc.frame.ptr" in this 
review page). Using  "llvm.coro.raw.frame.ptr.offset" instead of  
"llvm.coro.raw.frame.ptr.addr" is doable which looks like below, please check 
line 31. The downside is that the write to coroutine frame is not through an 
alloca but a direct write. It is unusual because all fields in the frame are 
stored as 1. special/header fields 2. alloca 3. splills. Doing the write 
indirectly as Alloca makes me comfortable. The tradeoff is one extra intrinsic 
"llvm.coro.raw.frame.ptr.addr". What do you think?

  19 coro.alloc.align:                                 ; preds = 
%coro.alloc.check.align
  20   %3 = sub nsw i64 64, 16
  21   %4 = add i64 128, %3
  22   %call1 = call noalias nonnull i8* @_Znwm(i64 %4) #13
  23   %mask = sub i64 64, 1
  24   %intptr = ptrtoint i8* %call1 to i64
  25   %over_boundary = add i64 %intptr, %mask
  26   %inverted_mask = xor i64 %mask, -1
  27   %aligned_intptr = and i64 %over_boundary, %inverted_mask
  28   %diff = sub i64 %aligned_intptr, %intptr
  29   %aligned_result = getelementptr inbounds i8, i8* %call1, i64 %diff
  30   call void @llvm.assume(i1 true) [ "align"(i8* %aligned_result, i64 64) ]
  31   store i8* %call1, i8** %alloc.frame.ptr, align 8                     
  
       ; Replace line 31 with below, and must makes sure line 46~line 48 is 
skipped.
       ; %poff = call i32 @llvm.coro.raw.frame.ptr.offset.i32()
       ; %addr = getelementptr inbounds i8, i8* %aligned_result, i32 %poff
       ; %addr1 = bitcast i8* %addr to i8**
       ; store i8* %call1, i8** %addr1, align 8
  
  
  32   br label %coro.init.from.coro.alloc.align
  33
  34 coro.init.from.coro.alloc.align:                  ; preds = 
%coro.alloc.align
  35   %aligned_result.coro.init = phi i8* [ %aligned_result, %coro.alloc.align 
]
  36   br label %coro.init
  37
  38 coro.init:                                        ; preds = 
%coro.init.from.entry, %coro.init.from.coro.alloc.align, %cor
     o.init.from.coro.alloc
  39   %5 = phi i8* [ %.coro.init, %coro.init.from.entry ], [ %call.coro.init, 
%coro.init.from.coro.alloc ], [ %aligned_result
     .coro.init, %coro.init.from.coro.alloc.align ]
  40   %FramePtr = bitcast i8* %5 to %f0.Frame*
  41   %resume.addr = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, 
i32 0, i32 0
  42   store void (%f0.Frame*)* @f0.resume, void (%f0.Frame*)** %resume.addr, 
align 8
  43   %6 = select i1 true, void (%f0.Frame*)* @f0.destroy, void (%f0.Frame*)* 
@f0.cleanup
  44   %destroy.addr = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, 
i32 0, i32 1
  45   store void (%f0.Frame*)* %6, void (%f0.Frame*)** %destroy.addr, align 8
  46   %7 = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, i32 0, i32 2
  47   %8 = load i8*, i8** %alloc.frame.ptr, align 8
  48   store i8* %8, i8** %7, align 8
  49   br label %AllocaSpillBB
  50
  51 AllocaSpillBB:                                    ; preds = %coro.init
  52   %.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* %FramePtr, 
i32 0, i32 4
  53   %ref.tmp.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* 
%FramePtr, i32 0, i32 5
  54   %agg.tmp.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* 
%FramePtr, i32 0, i32 6
  55   %ref.tmp5.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* 
%FramePtr, i32 0, i32 7
  56   %agg.tmp8.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* 
%FramePtr, i32 0, i32 8
  57   %__promise.reload.addr = getelementptr inbounds %f0.Frame, %f0.Frame* 
%FramePtr, i32 0, i32 10
  58   br label %PostSpill




> Then I am a little confused for the design again, since we would treat the 
> value for CoroBegin as the address of coroutine frame in the past and it 
> looks like to be the raw frame now. Let me reconsider if it is OK.

The returned value of CoroBegin is still coroutine frame not a raw frame even 
if the frame is overaligned. You could check the above code.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97915/new/

https://reviews.llvm.org/D97915

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D97915: [Coroutines] Handle overaligned frame allocation

Reply via email to