Hi Michal,

> On 12 Jul 2022, at 16:14, Michal Jankovič <michal.jankovi...@gmail.com> wrote:

> One other related thing I would like to investigate is reducing the
> number of compiler generated variables in the frame, particularly
> _Coro_destroy_fn and _Coro_self_handle.  
> 
> As I understand it, _Coro_destroy_fn just sets a flag in
> _Coro_resume_index and calls _Coro_resume_fn; it should be possible to
> move this logic to __builtin_coro_destroy, so that only _Coro_resume_fn
> is stored in the frame;

That is a particular point about GCC’s implementation … (it is not neccesarily, 
or even
likely to be the same for other implementations) - see below.

I was intending to do experiment with making the ramp/resume/destroy value a 
parameter
to the actor function so that we would have something like -

ramp calls      actor(frame, 0)
resume calls  actor(frame, 1)
destroy calls  actor(frame, 2) 
- the token values are illustrative, not intended to be a final version.

I think that should allow for more inlining opportunites and possibly a way 
forward to
frame elision (a.k.a halo).

> this would however change the coroutine ABI - I don't know if that's a 
> problem.

The external ABI for the coroutine is the 
 resume,
 destroy pointers 
 and the promise 
 and that one can find each of these from the frame pointer.

This was agreed between the interested “vendors” so that one compiler could 
invoke
coroutines built by another.  So I do not think this is so much a useful area 
to explore.

Also the intent is that an indirect call through the frame pointer is the most 
frequent
operation so should be the most efficient.  
  resume() might be called many times, 
  destroy() just once thus it is a cold code path 
  - space can be important too - but interoperability was the goal here.

> The _Coro_self_handle should be constructible on-demand from the frame 
> address.

Yes, and in the header the relevant items are all constexpr - so that should 
happen in the
user’s code.  I elected to have that value in the frame to avoid recreating it 
each time - I
suppose that is a trade-off of one oiptimisation c.f. another … 
> 
> Do you have any advice / opinions on this before I try to implement it?

Hopefully, the notes above help.

I will rebase my latest code changes as soon as I have a chance and put them 
somewhere
for you to look at - basically, these are to try and address the correctness 
issues we face,

Iain


> 
> Michal
> 
> On Jul 12 2022, at 4:08 pm, Iain Sandoe <i...@sandoe.co.uk> wrote:
> 
>> Hi Michal,
>> 
>>> On 12 Jul 2022, at 14:35, Michal Jankovič via Gcc-patches
>>> <gcc-patches@gcc.gnu.org> wrote:
>>> 
>>> Currently, coroutine frames store all variables of a coroutine separately,
>>> even if their lifetime does not overlap (they are in distinct
>>> scopes). This
>>> patch implements overlapping distinct variable scopes in the
>>> coroutine frame,
>>> by storing the frame fields in nested unions of structs. This lowers
>>> the size
>>> of the frame for larger coroutines significantly, and makes them more usable
>>> on systems with limited memory.
>> 
>> not a review (I will try to take a look at the weekend).
>> 
>> but … this is one of the two main optimisations on my TODO - so cool
>> for doing it.
>> 
>> (the other related optimisation is to eliminate frame entries for
>> scopes without any suspend
>> points - which has the potential to save even more space for code with
>> sparse use of co_xxxx)
>> 
>> Iain
>> 
>>> Bootstrapped and regression tested on x86_64-pc-linux-gnu; new test fails
>>> before the patch and succeeds after with no regressions.
>>> 
>>>     PR c++/105989
>>> 
>>> gcc/cp/ChangeLog:
>>> 
>>>     * coroutines.cc (struct local_var_info): Add field_access_path.
>>>     (build_local_var_frame_access_expr): New.
>>>     (transform_local_var_uses): Use build_local_var_frame_access_expr.
>>>     (coro_make_frame_entry_id): New.
>>>     (coro_make_frame_entry): Delegate to coro_make_frame_entry_id.
>>>     (struct local_vars_frame_data): Add orig, field_access_path.
>>>     (register_local_var_uses): Generate new frame layout. Create access
>>>     paths to vars.
>>>     (morph_fn_to_coro): Set new fields in local_vars_frame_data.  
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>>     * g++.dg/coroutines/pr105989.C: New test.
>>> 
>>> <pr105989.patch>
>> 
>> 

Reply via email to