[PATCH] D97915: [Coroutines] Handle overaligned frame allocation

Yuanfang Chen via Phabricator via cfe-commits Thu, 04 Mar 2021 17:22:27 -0800

ychen added inline comments.


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:4450
+    Value *Extra = Builder.CreateSelect(Cmp, Diff, Zero);
+    return RValue::get(Builder.CreateAdd(SizeCall, Extra));
   }
----------------
rjmccall wrote:
> ychen wrote:
> > ychen wrote:
> > > rjmccall wrote:
> > > > Okay, so you're implicitly increasing the coroutine size to allow you 
> > > > to round up to get an aligned frame.  But how do you round back down to 
> > > > get the actual pointer that you need to delete?  This just doesn't work.
> > > > 
> > > > You really ought to just be using the aligned `operator new` instead 
> > > > when the required alignment is too high.  If that means checking the 
> > > > alignment "dynamically" before calling `operator new` / `operator 
> > > > delete`, so be it.  In practice, it will not be dynamic because 
> > > > lowering will replace the `coro.align` call with a constant, at which 
> > > > point the branch will be foldable.
> > > > 
> > > > I don't know what to suggest if the aligned `operator new` isn't 
> > > > reliably available on the target OS.  You could outline a function to 
> > > > pick the best allocator/deallocator, I suppose.
> > > Thanks for the review!
> > > 
> > > > Okay, so you're implicitly increasing the coroutine size to allow you 
> > > > to round up to get an aligned frame. But how do you round back down to 
> > > > get the actual pointer that you need to delete? This just doesn't work.
> > > 
> > > Hmm, you're right that I missed the `delete` part, thanks. The adjusted 
> > > amount is constant, I could just dial it down in `coro.free`, right?
> > > 
> > > >  You really ought to just be using the aligned operator new instead 
> > > > when the required alignment is too high. If that means checking the 
> > > > alignment "dynamically" before calling operator new / operator delete, 
> > > > so be it. In practice, it will not be dynamic because lowering will 
> > > > replace the coro.align call with a constant, at which point the branch 
> > > > will be foldable.
> > >  
> > > That's my intuition at first. But spec is written in a way suggesting 
> > > (IMHO) that the aligned version should not be used? What if the user 
> > > specify their own allocator, then which one they should use?
> > Sorry, I meant the adjusted amount is coro.align - std::max_align_t,  I 
> > could subtract it in `coro.free` . I think it should work?
> No, because the adjustment you have to do in `coro.alloc` isn't just an 
> addition, it's an addition plus a mask, which isn't reversible.  Suppose the 
> frame needs to be 32-byte-aligned, but the allocator only promises 8-byte 
> alignment.  The problem is that when you go to free a frame pointer, and you 
> see that it's 32-byte-aligned (which, again, it always will be), the pointer 
> you got from the allocator might be the frame pointer minus any of 8, 16, or 
> 24 — or it might be exactly the same.  The only way to reverse that is to 
> store some sort of cookie, either the amount to subtract or even just the 
> original pointer.
> 
> Now, if you could change the entire coroutine ABI, you could make the frame 
> handle that you pass around be the unadjusted pointer and then just repeat 
> the adjustment every time you enter the coroutine.  But that doesn't work 
> because the ABI relies on things like the promise being at a reliable offset 
> from the frame handle.
> 
> I think the best solution would be to figure out a way to use an aligned 
> allocator, which at worst does this in a more systematic way and at best can 
> actually just satisfy your requirement directly without any overhead.  If you 
> can't do that, adding an offset to the frame would be best; if you can't do 
> that, doing it as a cookie is okay.
> 
> > That's my intuition at first. But spec is written in a way suggesting 
> > (IMHO) that the aligned version should not be used? What if the user 
> > specify their own allocator, then which one they should use?
> 
> It seems like a spec bug that this doesn't use aligned allocators even when 
> they're available.  If there's an aligned allocator available, I think this 
> should essentially do dynamically what it would normally do statically, i.e.:
> 
> ```
> void *allocation = alignment > __STDCPP_DEFAULT_NEW_ALIGNMENT__ ? operator 
> new(size, align_val_t(alignment)) : operator new(size);
> ```
> 
> This would ODR-use both allocation functions, of course.
> 
> Maybe it's right to do this cookie thing if we can't rely on an aligned 
> allocator, like if the promise class provides only an `operator new(size_t)`.
> No, because the adjustment you have to do in `coro.alloc` isn't just an 
> addition, it's an addition plus a mask, which isn't reversible.  Suppose the 
> frame needs to be 32-byte-aligned, but the allocator only promises 8-byte 
> alignment.  The problem is that when you go to free a frame pointer, and you 
> see that it's 32-byte-aligned (which, again, it always will be), the pointer 
> you got from the allocator might be the frame pointer minus any of 8, 16, or 
> 24 — or it might be exactly the same.  The only way to reverse that is to 
> store some sort of cookie, either the amount to subtract or even just the 
> original pointer.

I got myself confused. This makes perfect sense.

> 
> Now, if you could change the entire coroutine ABI, you could make the frame 
> handle that you pass around be the unadjusted pointer and then just repeat 
> the adjustment every time you enter the coroutine.  But that doesn't work 
> because the ABI relies on things like the promise being at a reliable offset 
> from the frame handle.
> 
> I think the best solution would be to figure out a way to use an aligned 
> allocator, which at worst does this in a more systematic way and at best can 
> actually just satisfy your requirement directly without any overhead.  If you 
> can't do that, adding an offset to the frame would be best; if you can't do 
> that, doing it as a cookie is okay.
> 

This is very helpful. I'll explore the adding offset to the frame option first. 
If it is not plausible, I'll use the cookie method. Thanks!



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97915/new/

https://reviews.llvm.org/D97915

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D97915: [Coroutines] Handle overaligned frame allocation

Reply via email to