On Thu, 19 Oct 2023 12:58:39 GMT, Martin Doerr <mdo...@openjdk.org> wrote:
>>> I wonder if the native_invoker_size_per_arg thing still works good enough. >>> We may exceed the computed size, now, right? >> >> Good point. I'll have a look at enhancing the test we have for this. >> >> Intuitively, I think it will be okay. It's true that we generate more code >> to add the oops and offsets together, but at the same time, we don't have >> any code to shuffle the offsets. > >> > I wonder if the native_invoker_size_per_arg thing still works good enough. >> > We may exceed the computed size, now, right? >> >> Good point. I'll have a look at enhancing the test we have for this. >> >> Intuitively, I think it will be okay. It's true that we generate more code >> to add the oops and offsets together, but at the same time, we don't have >> any code to shuffle the offsets. > > Looks like we use 2 input regs per obj, so we reserve 2x > native_invoker_size_per_arg. However, the case in which `reg_oop` and > `reg_offset` are on stack together with `arg_shuffle` can produce more than > this size (depending on platform). If we pass lots of objects on stack, we > may exceed the computed size. Correct? > @TheRealMDoerr Note that `reg_offset` is filtered out, and not handled by > `arg_shuffle`. So, it becomes the question whether shuffling a register, or > adding an offset to an oop takes more bytes. I think most of the > `native_invoker_size_per_arg` have some lenience built in though? (I did do > that for x64 and aarch64 at least). So, I think it will be okay. Right. The filtering happens after the size computation `code_size = native_invoker_code_base_size + (num_args * native_invoker_size_per_arg)`, so we reserve 2 * native_invoker_size_per_arg per heap segment which is 2 * 2 instructions on PPC64. The maximum actual code size is the size of the 1 * add_offset operation + 1 arg shuffle, which is 4 + 2 instructions on PPC64. So, we reserve space for 4 instructions, but emit 6 ones. The reason why it still works is that `_needs_transition` must be false and that saves more space than we exceeded by the oversized argument handling code. Not a very nice design, but I can live with it. > I've added an additional test case to the existing test for this, which > should stress the new code gen. > ([dd9e974](https://github.com/openjdk/jdk/commit/dd9e9741de3ca07e6a4cc561002255f98e1e3330)) Thanks for adding it! Would you mind using `limit(84)` which is the maximum? We'd be on the safe side when testing this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16201#issuecomment-1771757690