ychen added a comment. In D100739#2715964 <https://reviews.llvm.org/D100739#2715964>, @ChuanqiXu wrote:
> In D100739#2713579 <https://reviews.llvm.org/D100739#2713579>, @ychen wrote: > >> In D100739#2711698 <https://reviews.llvm.org/D100739#2711698>, @ChuanqiXu >> wrote: >> >>>> This is an alternative to D97915 <https://reviews.llvm.org/D97915> which >>>> missed proper deallocation of the over-allocated frame. This patch handles >>>> both allocations and deallocations. >>> >>> If D97915 <https://reviews.llvm.org/D97915> is not needed, we should >>> abandon it. >>> >>> For the example shows in D97915 <https://reviews.llvm.org/D97915>, it says: >>> >>> #include <experimental/coroutine> >>> #include <iostream> >>> #include <stdexcept> >>> #include <thread> >>> #include <cassert> >>> >>> struct task{ >>> struct alignas(64) promise_type { >>> task get_return_object() { return {}; } >>> std::experimental::suspend_never initial_suspend() { return {}; } >>> std::experimental::suspend_never final_suspend() noexcept { return >>> {}; } >>> void return_void() {} >>> void unhandled_exception() {} >>> }; >>> using handle = std::experimental::coroutine_handle<promise_type>; >>> }; >>> >>> auto switch_to_new_thread() { >>> struct awaitable { >>> bool await_ready() { return false; } >>> void await_suspend(task::handle h) { >>> auto i = reinterpret_cast<std::uintptr_t>(&h.promise()); >>> std::cout << i << std::endl; >>> assert(i % 64 == 0); >>> } >>> void await_resume() {} >>> }; >>> return awaitable{}; >>> } >>> >>> task resuming_on_new_thread() { >>> co_await switch_to_new_thread(); >>> } >>> >>> int main() { >>> resuming_on_new_thread(); >>> } >>> >>> The assertion would fail. If this is the root problem, I think we could >>> adjust the align for the promise alloca like: >> >> The problem is that any member of the coroutine frame could be overaligned >> (thus make the frame overaligned) including promise, local variables, >> spills. The problem is *not* specific to promise. >> >>> %promise = alloca %promise_type, align 8 >>> >>> into >>> >>> %promise = alloca %promise_type, align 128 >>> >>> In other words, if this the problem we need to solve, I think we could make >>> it in a simpler way. >> >> This may not fix the problem. >> >>> Then I looked into the document you give in the summary. The issue#3 says >>> the frontend can't do some work in the process of template instantiation >>> due to the frontend doesn't know about the align and size of the coroutine. >>> But from the implementation, it looks like not the problem this patch wants >>> to solve. >> >> I meant to use that as a reference to help describe the problem (but not the >> solution). The document itself includes both problem statements (issue#3) >> and solutions (frontend-based) which are totally unrelated to this patch. It >> looks like it is not that useful in this case so please disregard that. >> >>> I am really confused about the problem. Could you please restate your >>> problem more in more detail? For example, would it make the alignment >>> incorrect like the example above? Or does we want the frontend to get >>> alignment information? Then what would be affected? From the title, I can >>> guess the size of frame would get bigger. But how big would it be? Who >>> would control and determine the final size? >> >> understood. >> >> There are two kinds of alignments: the alignment of a type/object at >> compile-time (ABI specified or user-specified), and the alignment the object >> of that type actually gets during runtime. The compiler assumes that the >> alignment of a struct is the maximal alignment of all its members. However, >> that assumption may not be true at runtime where the memory allocator may >> return a memory block that has insufficient alignment which causes some >> members aligned incorrectly. >> >> For C++ coroutine, right now the default memory allocator could only return >> 16 bytes aligned memory block. When any member of the coroutine frame >> (promise, local variables, spills etc.) has alignment > 16, the frame >> becomes overaligned. This could only be fixed dynamically at runtime: by >> over-allocating memory and then adjust the frame start address so that it >> aligns correctly. >> >> For example, suppose malloc returns 16 bytes aligned address 16, how do we >> make it 64 bytes aligned? align 16 up to an address that is 64 bytes aligned >> which is 64, so the adjustment amount is 64-16=48 >> >> Another similar example, suppose malloc returns 16 bytes aligned address 32, >> how do we make it 64 bytes aligned? align 32 up to an address that is 64 >> bytes aligned which is 64, so the adjustment amount is 64-32=32 >> >> Another similar example, suppose malloc returns 16 bytes aligned address 48, >> how do we make it 64 bytes aligned? align 48 up to an address that is 64 >> bytes aligned which is 64, so the adjustment amount is 64-48=16 >> >> Another similar example, suppose malloc returns 16 bytes aligned address 64, >> how do we make it 64 bytes aligned? align 64 up to an address that is 64 >> bytes aligned which is 64, so the adjustment amount is 64-64=0 >> >> So the mamximal adjustment amount is 64-16=48. We don't know until runtime >> if the malloc returned address X is (X % 64 == 0) or (X % 64 == 16) or (X % >> 64 == 32) or (X % 64 == 48), so we must emit extra code to deal with all >> cases (by bitwise operations). > > Thanks for the explanation. I think I got the problem now. And my > understanding for the solution of this patch is, if the align of the original > frame is 64, then we allocate (64+48) space to the new frame now. And the > original frame becomes an alloca with align 64 in the new frame. So the > actually frame gets the right alignment now. Do I get the problem and > solution right? > > I am wondering if there is a simpler solution. For example, after we > construct the frame, we can get the alignment requirement for the frame. Then > if the alignment is bigger than 16, we could lower the `coro.begin` to > aligned new instead of default new. It looks like that the implementation > would be much simpler. Oh, right now C++ coroutine standard is written in the way that the aligned new is not searched by the frontend. The limitation will be lifted in C++23 (hopefully). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D100739/new/ https://reviews.llvm.org/D100739 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits