On Thu, 4 May 2023 11:44:14 GMT, Stefan Karlsson <stef...@openjdk.org> wrote:
>> Hi all, >> >> Please review the implementation of Generational ZGC, which can be turned on >> by adding -XX:+ZGenerational in addition to using -XX:+UseZGC. Generational >> ZGC is a major rewrite of the non-generational ZGC version that exists in >> the openjdk/jdk repository. It splits the heap into two generations; the >> young generation where newly allocated objects are born, and the old >> generation where long-lived objects get promoted to. The motivation for >> introducing generations is to allow ZGC to reclaim memory faster by not >> having to walk the entire object graph every time a garbage collection is >> run. This should make Generational ZGC suitable for more workloads. In >> particular workloads that previously hit allocation stalls because of high >> allocation rates, large live sets, or limited spare machine resources, have >> the potential to work better with Generational ZGC. For an in-depth >> description of Generational ZGC, see https://openjdk.org/jeps/439. >> >> The development of Generational ZGC started around the same time as the >> development of JDK 17. At that point we forked off the Generational ZGC >> development into its own branch and let non-generational live unaffected in >> openjdk/jdk. This safe-guarded non-generational ZGC and allowed Generational >> ZGC to move unhindered, without the shackles of having to fit into another >> GC implementation's design and quirks. Since then, almost all of the ZGC >> files have been changed. Moving forward to today, when it's ready for us to >> upstream Generational ZGC, we now need to deliver Generational ZGC without >> disrupting our current user-base. We have therefore opted to initially >> include both versions of ZGC in the code base, but with the intention to >> deprecate non-generational ZGC in a future release. Existing users running >> with only -XX:+UseZGC will get the non-generational ZGC, and users that want >> the new Generational ZGC need to run with -XX:+ZGenerational in addition to >> -XX:+UseZGC. The intention is to give the users time to validate and deploy their workloads with the new GC implementation. >> >> Including both the new evolution of a GC and its legacy predecessor poses a >> few challenges for us GC developers. The first reaction could be to try to >> mash the two implementations together and sprinkle the GC code with >> conditional statements or dynamic dispatches. We have done similar >> experiments before. When ZGC was first born, we started an experiment where >> we converted G1 into getting the same features as the evolving ZGC. It was >> quite clear to us how time consuming and complex things end up being when we >> tried to keep both the original G1 working, and at the same time implemented >> the ZGC-alike G1. Given this experience, we don't see that as a viable >> solution to deliver a maintainable and evolving Generational ZGC. Our >> pragmatic suggestion to these challenges is to let Generational ZGC live >> under the current gc/z directories and let the legacy, non-generational ZGC >> be completely separated in its own directories. This way we can continue to >> move quickly with the continued develo pment of Generational ZGC and let the non-generational ZGC be mostly untouched until it gets deprecated, and eventually removed. The non-generational ZGC directory will be gc/x and all the classes of non-generational have been prefixed with X instead of Z. An alternative to this rename could be to namespace out non-generational ZGC. We experimented with that, but it was too easy to accidentally cross-compile Generational ZGC code into non-generational ZGC, so we didn't like that approach. >> >> Most of the stand-alone cleanups and enhancements outside of the ZGC code >> have already been upstreamed to openjdk/jdk. There are still a few patches >> that could/should be pushed separately, but they will be easier to >> understand by also looking at the Generational ZGC code, so they will be >> sent out after this PR has been published. The patches that could be >> published separately are: >> >> * 59d1e96af6a UPSTREAM: Introduce check_oop infrastructure to check oops in >> the oop class >> * ca9edf8aa79 UPSTREAM: RISCV tmp reg cleanup resolve_jobject >> * 4bec9c69b67 CLEANUP: barrierSetNMethod_aarch64.cpp >> * b67d03a3f04 UPSTREAM: Add relaxed add&fetch for aarch64 atomics >> * a2824734d23 UPSTREAM: lir_xchg >> * 36cd39c0126 UPSTREAM: assembler_ppc CMPLI >> * 447259cea42 UPSTREAM: assembler_ppc ANDI >> * 9417323499a UPSTREAM: Add VMErrorCallback infrastructure >> >> Regarding all the changesets you see in this PR, they form the history of >> the development of Generational ZGC. It might look a bit unconventional to >> what you are used to see in openjdk development. What we have done is to use >> merges with the 'ours' strategy to ignore the previous Generational ZGC >> patches, and then rebased and flattened the changes on top of the merge. >> This effectively gives us the upsides of having a rebased repository and the >> upsides of retaining the history in the repository. The downside could be >> that GitHub now lists all those changesets in the PR. Given that this patch >> is so big, and that you likely only want to see a part of it, I suggest that >> you pull down the PR branch and then compare it to the openjdk/jdk changeset >> this PR is based against: >> >> >> git fetch https://github.com/openjdk/zgc zgc_master >> git diff zgc_master... <sub-directory of interest> >> >> >> There have been many contributors of this patch over the years. I'll do my >> best to poke Skara into listing you all, but if you see that I've missed >> your name please reach out to me and I'll fix it. >> >> Testing: we have been continuously running Generational ZGC through Oracle's >> tier1-8 testing. > > Stefan Karlsson has updated the pull request incrementally with one > additional commit since the last revision: > > undefine glibc major/minor macros test/hotspot/gtest/gc/z/test_zForwarding.cpp line 68: > 66: > 67: bool reserved = > os::attempt_reserve_memory_at((char*)ZAddressHeapBase, ZGranuleSize, false /* > executable */); > 68: ASSERT_TRUE(reserved); Hi, Thanks for the great work! I have performed some tests on linux-riscv64 Hifive Unmatched board. So far, I only witnessed one gtest failure: $ make test TEST=gtest:ZForwardingTest Building target 'test' in configuration 'linux-riscv64-server-release' Test selection 'gtest:ZForwardingTest', will run: * gtest:ZForwardingTest/server Running test 'gtest:ZForwardingTest/server' Note: Google Test filter = ZForwardingTest* [==========] Running 4 tests from 1 test suite. [----------] Global test environment set-up. [----------] 4 tests from ZForwardingTest [ RUN ] ZForwardingTest.setup_vm test/hotspot/gtest/gc/z/test_zForwarding.cpp:68: Failure Value of: reserved Actual: false Expected: true [ FAILED ] ZForwardingTest.setup_vm (0 ms) [ RUN ] ZForwardingTest.find_empty_vm [ OK ] ZForwardingTest.find_empty_vm (1 ms) [ RUN ] ZForwardingTest.find_full_vm [ OK ] ZForwardingTest.find_full_vm (8 ms) [ RUN ] ZForwardingTest.find_every_other_vm [ OK ] ZForwardingTest.find_every_other_vm (0 ms) [----------] 4 tests from ZForwardingTest (761 ms total) [----------] Global test environment tear-down ERROR: RUN_ALL_TESTS() failed. Error 1 [==========] 4 tests from 1 test suite ran. (762 ms total) [ PASSED ] 3 tests. [ FAILED ] 1 test, listed below: [ FAILED ] ZForwardingTest.setup_vm 1 FAILED TEST Finished running test 'gtest:ZForwardingTest/server' Test report is stored in build/linux-riscv64-server-release/test-results/gtest_ZForwardingTest_server ============================== Test summary ============================== TEST TOTAL PASS FAIL ERROR >> gtest:ZForwardingTest/server 4 3 1 0 << ============================== TEST FAILURE The gtest failed this assertion where 'reserved' return by function os::attempt_reserve_memory_at is false. I find the reason is that the mmap call at the bottom returns a different address instead of the requested one (ZAddressHeapBase). I think that is possible since we are not sure if the requested address is available before the mmap call, right? So I guess we might need some changes here for this gtest. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13771#discussion_r1185645071