On Thu, 19 Oct 2023 12:58:39 GMT, Martin Doerr <mdo...@openjdk.org> wrote:

>>> I wonder if the native_invoker_size_per_arg thing still works good enough. 
>>> We may exceed the computed size, now, right?
>> 
>> Good point. I'll have a look at enhancing the test we have for this. 
>> 
>> Intuitively, I think it will be okay. It's true that we generate more code 
>> to add the oops and offsets together, but at the same time, we don't have 
>> any code to shuffle the offsets.
>
>> > I wonder if the native_invoker_size_per_arg thing still works good enough. 
>> > We may exceed the computed size, now, right?
>> 
>> Good point. I'll have a look at enhancing the test we have for this.
>> 
>> Intuitively, I think it will be okay. It's true that we generate more code 
>> to add the oops and offsets together, but at the same time, we don't have 
>> any code to shuffle the offsets.
> 
> Looks like we use 2 input regs per obj, so we reserve 2x 
> native_invoker_size_per_arg. However, the case in which `reg_oop` and 
> `reg_offset` are on stack together with `arg_shuffle` can produce more than 
> this size (depending on platform). If we pass lots of objects on stack, we 
> may exceed the computed size. Correct?

> @TheRealMDoerr Note that `reg_offset` is filtered out, and not handled by 
> `arg_shuffle`. So, it becomes the question whether shuffling a register, or 
> adding an offset to an oop takes more bytes. I think most of the 
> `native_invoker_size_per_arg` have some lenience built in though? (I did do 
> that for x64 and aarch64 at least). So, I think it will be okay.

Right.
The filtering happens after the size computation `code_size = 
native_invoker_code_base_size + (num_args * native_invoker_size_per_arg)`, so 
we reserve 2 * native_invoker_size_per_arg per heap segment which is 2 * 2 
instructions on PPC64.
The maximum actual code size is the size of the 1 * add_offset operation + 1 
arg shuffle, which is 4 + 2 instructions on PPC64.
So, we reserve space for 4 instructions, but emit 6 ones.
The reason why it still works is that `_needs_transition` must be false and 
that saves more space than we exceeded by the oversized argument handling code.
Not a very nice design, but I can live with it.
 
> I've added an additional test case to the existing test for this, which 
> should stress the new code gen. 
> ([dd9e974](https://github.com/openjdk/jdk/commit/dd9e9741de3ca07e6a4cc561002255f98e1e3330))

Thanks for adding it! Would you mind using `limit(84)` which is the maximum? 
We'd be on the safe side when testing this.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16201#issuecomment-1771757690

Reply via email to