On Wed, 9 Apr 2025 11:34:09 GMT, Roberto Castañeda Lozano <rcastaned...@openjdk.org> wrote:
>> Thomas Schatzl has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 39 commits:
>>
>>  - * missing file from merge
>>  - Merge branch 'master' into 8342382-card-table-instead-of-dcq
>>  - Merge branch 'master' into 8342382-card-table-instead-of-dcq
>>  - Merge branch 'master' into 8342382-card-table-instead-of-dcq
>>  - Merge branch 'master' into submit/8342382-card-table-instead-of-dcq
>>  - * make young gen length revising independent of refinement thread
>>
>>    * use a service task
>>    * both refinement control thread and young gen length revising use the
>>      same infrastructure to get the number of available bytes and determine
>>      the time to the next update
>>  - * fix IR code generation tests that change due to barrier cost changes
>>  - * factor out card table and refinement table merging into a single
>>    method
>>  - Merge branch 'master' into 8342382-card-table-instead-of-dcq3
>>  - * obsolete G1UpdateBufferSize
>>
>>    G1UpdateBufferSize has previously been used to size the refinement
>>    buffers and impose a minimum limit on the number of cards per thread
>>    that need to be pending before refinement starts.
>>
>>    The former function is now obsolete with the removal of the dirty
>>    card queues; the latter functionality has been taken over by the new
>>    diagnostic option `G1PerThreadPendingCardThreshold`.
>>
>>    I prefer to make this a diagnostic option rather than a product option
>>    because it is something that is only necessary for some test cases to
>>    produce some otherwise unwanted behavior (continuous refinement).
>>
>>    CSR is pending.
>>  - ... and 29 more: https://git.openjdk.org/jdk/compare/41d4a0d7...1c5a669f
>
> src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp line 101:
>
>> 99: }
>> 100:
>> 101: void G1BarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssembler* masm, DecoratorSet decorators,
>
> Have you measured the performance impact of inlining this assembly code
> instead of resorting to a runtime call as done before? Is it worth the
> maintenance cost (for every platform), risk of introducing bugs, etc.?

I remember a significant impact in some microbenchmark. It's also inlined in
Parallel GC. I do not consider it a big issue with respect to maintenance:
these things never really change, and the method is small and contained.

I will try to redo the numbers.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r2035298557