ychen added a comment.

In D100739#2711698 <https://reviews.llvm.org/D100739#2711698>, @ChuanqiXu wrote:

>> This is an alternative to D97915 <https://reviews.llvm.org/D97915> which 
>> missed proper deallocation of the over-allocated frame. This patch handles 
>> both allocations and deallocations.
>
> If D97915 <https://reviews.llvm.org/D97915> is not needed, we should abandon 
> it.
>
> For the example shows in D97915 <https://reviews.llvm.org/D97915>, it says:
>
>   #include <experimental/coroutine>
>   #include <iostream>
>   #include <stdexcept>
>   #include <thread>
>   #include <cassert>
>   
>   struct task{
>     struct alignas(64) promise_type {
>       task get_return_object() { return {}; }
>       std::experimental::suspend_never initial_suspend() { return {}; }
>       std::experimental::suspend_never final_suspend() noexcept { return {}; }
>       void return_void() {}
>       void unhandled_exception() {}
>     };
>     using handle = std::experimental::coroutine_handle<promise_type>;
>   };
>   
>   auto switch_to_new_thread() {
>     struct awaitable {
>       bool await_ready() { return false; }
>       void await_suspend(task::handle h) {
>         auto i = reinterpret_cast<std::uintptr_t>(&h.promise());
>         std::cout << i << std::endl;
>         assert(i % 64 == 0);
>       }
>       void await_resume() {}
>     };
>     return awaitable{};
>   }
>   
>   task resuming_on_new_thread() {
>     co_await switch_to_new_thread();
>   }
>   
>   int main() {
>     resuming_on_new_thread();
>   }
>
> The assertion would fail. If this is the root problem, I think we could 
> adjust the align for the promise alloca like:

The problem is that any member of the coroutine frame could be overaligned 
including promise, local variables, spills. The problem is *not* specific to 
promise.

>   %promise = alloca %promise_type, align 8
>
> into
>
>   %promise = alloca %promise_type, align 128
>
> In other words, if this the problem we need to solve, I think we could make 
> it in a simpler way.

This may not fix the problem.

> Then I looked into the document you give in the summary. The issue#3 says the 
> frontend can't do some work in the process of template instantiation due to 
> the frontend doesn't know about the align and size of the coroutine. But from 
> the implementation, it looks like not the problem this patch wants to solve.

I meant to use that as a reference to help describe the problem (but not the 
solution). The document itself includes both problem statements (issue#3) and 
solutions (frontend-based) which are totally unrelated to this patch. It looks 
like it is not that useful in this case so please disregard that.

> I am really confused about the problem. Could you please restate your problem 
> more in more detail? For example, would it make the alignment incorrect like 
> the example above? Or does we want the frontend to get alignment information? 
> Then what would be affected? From the title, I can guess the size of frame 
> would get bigger. But how big would it be? Who would control and determine 
> the final size?

understood.

There are two kinds of alignments: the alignment of a type/object at 
compile-time (ABI specified or user-specified), and the alignment the object of 
that type actually gets during runtime. The compiler assumes that the alignment 
of a struct is the maximal alignment of all its members. However, that 
assumption may not be true at runtime where the memory allocator may return a 
memory block that has insufficient alignment which causes some members aligned 
incorrectly.

For C++ coroutine, right now the default memory allocator could only return 16 
bytes aligned memory block. When any member of the coroutine frame (promise, 
local variables, spills etc.) has alignment > 16, the frame becomes 
overaligned. This could only be fixed dynamically at runtime: by 
over-allocating memory and then adjust the frame start address so that it 
aligns correctly.

For example, suppose malloc returns 16 bytes aligned address 16, how do we make 
it 64 bytes aligned? align 16 up to an address that is 64 bytes aligned which 
is 64, so the adjustment amount is 64-16=48

Another similar example, suppose malloc returns 16 bytes aligned address 32, 
how do we make it 64 bytes aligned? align 32 up to an address that is 64 bytes 
aligned which is 64, so the adjustment amount is 64-32=32

Another similar example, suppose malloc returns 16 bytes aligned address 48, 
how do we make it 64 bytes aligned? align 48 up to an address that is 64 bytes 
aligned which is 64, so the adjustment amount is 64-48=16

Another similar example, suppose malloc returns 16 bytes aligned address 64, 
how do we make it 64 bytes aligned? align 64 up to an address that is 64 bytes 
aligned which is 64, so the adjustment amount is 64-64=0

So the mamximal adjustment amount is 64-16=48. We don't know until runtime if 
the malloc returned address X is (X % 64 == 0) or (X % 64 == 16) or (X % 64 == 
32) or (X % 64 == 48), so we must emit extra code to deal with all cases (by 
bitwise operations).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100739/new/

https://reviews.llvm.org/D100739

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to