On Thu, 8 Aug 2024 10:51:59 GMT, Martin Doerr <mdo...@openjdk.org> wrote:

> Can't we do these nasty loads in C++ code and use set_vm_result_2 in 
> UpcallLinker::on_entry?

That's what I tried. I got a ~20% hit to execution time.

> I guess that upcalls are less performance critical

Why so? They are certainly much more rare than downcalls, but when they _are_ 
used, I think we'd like them to be fast.

> Maybe the C++ code can get optimized better, too.

I [tried](https://github.com/openjdk/jdk/commit/a2614ab77ef0ed493a819b970b31b939126c3da5)
optimizing things by moving the accessors to `javaClasses.inline.hpp`. That 
helped the generated code a bit, but it didn't really improve speed. I think 
the problem is that we don't know at C++ compile time which barrier we need to 
use, since the GC is selected at runtime, while we do know when generating the 
stub. So, if we use C++, there will always be an out-of-line dispatch to the 
`_load_at` function for the particular GC.
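To illustrate the difference, here is a minimal standalone sketch. It is not HotSpot code: `GcKind`, `select_gc`, `load_via_runtime_dispatch` and `load_specialized` are hypothetical names, and the "barrier" is just a bit mask. The function pointer stands in for the out-of-line `_load_at` dispatch, and the template stands in for a stub generator that already knows which GC is active when it emits the load.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for two GC barrier flavours (not HotSpot types).
enum class GcKind { Plain, Masking };

using LoadFn = std::uintptr_t (*)(const std::uintptr_t* base, int index);

static std::uintptr_t plain_load(const std::uintptr_t* base, int index) {
  return base[index];
}

static std::uintptr_t masking_load(const std::uintptr_t* base, int index) {
  // Pretend the low bit is GC metadata that the barrier must strip.
  return base[index] & ~std::uintptr_t{1};
}

// Chosen once at startup, like the GC selection; unknown at C++ compile time.
static LoadFn g_load_at = plain_load;

void select_gc(GcKind kind) {
  g_load_at = (kind == GcKind::Masking) ? masking_load : plain_load;
}

// "C++ path": every load goes through an indirect, out-of-line call.
std::uintptr_t load_via_runtime_dispatch(const std::uintptr_t* base, int index) {
  assert(g_load_at != nullptr);
  return g_load_at(base, index);
}

// "Stub path": the barrier kind is fixed when the code is produced, so the
// right load sequence can be emitted inline with no dispatch at all.
template <GcKind K>
std::uintptr_t load_specialized(const std::uintptr_t* base, int index) {
  if (K == GcKind::Masking) {  // constant per instantiation, folded away
    return base[index] & ~std::uintptr_t{1};
  }
  return base[index];
}
```

Both paths return the same value; the sketch only shows where the choice of barrier gets resolved, and says nothing about the actual cost in the VM.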

> Some of the DecoratorSet should be applicable and improve performance. If 
> that doesn't help enough, maybe we should implement a dedicated static stub? 
> There's no need to have the code replicated in each upcall stub.

That's a good idea. If we can make that work, I'm all for it.

P.S. giving that a try now.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20479#issuecomment-2275658037
PR Comment: https://git.openjdk.org/jdk/pull/20479#issuecomment-2275861261
