Re: RFR: 8345687: Improve the implementation of SegmentFactories::allocateSegment [v6]
On Fri, 7 Mar 2025 17:38:13 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch improves the performance of a typical `Arena::allocate` in >> several ways: >> >> - Delay the creation of the NativeMemorySegmentImpl. This avoids the merge >> of the instance with the one obtained from the call in the uncommon path, >> increasing the chance the object being scalar replaced. >> - Split the allocation of over-aligned memory to a slow-path method. >> - Align the memory to 8 bytes, allowing faster zeroing. >> - Use a dedicated method to zero the just-allocated native memory, reduce >> code size and make it more straightforward. >> - Make `VM.pageAlignDirectMemory` a `Boolean` instead of a `boolean` so that >> `false` value can be constant folded. >> >> Please take a look and leave your reviews, thanks a lot. > > Quan Anh Mai has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains nine additional > commits since the last revision: > > - Merge branch 'master' into segmentallocate > - revert changes to CallocArena > - Merge branch 'master' into segmentallocate > - copyright > - Merge branch 'master' into segmentallocate > - wrong init > - move segment instance creation to SegmentFactories > - address review > - improve the implementation of SegmentFactories::allocateSegment Looks good - just left a minor (optional) stylistic comment test/micro/org/openjdk/bench/java/lang/foreign/AllocTest.java line 26: > 24: package org.openjdk.bench.java.lang.foreign; > 25: > 26: import java.lang.foreign.*; I believe it would be better to leave the imports alone -- as all the other benchmarks also use the "more explicit" style. - Marked as reviewed by mcimadamore (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22610#pullrequestreview-2678029020 PR Review Comment: https://git.openjdk.org/jdk/pull/22610#discussion_r1991245773
Re: RFR: 8342382: Implementation of JEP G1: Improve Application Throughput with a More Efficient Write-Barrier [v17]
On Wed, 12 Mar 2025 12:23:50 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains 24 additional >> commits since the last revision: >> >> - Merge branch 'master' into 8342382-card-table-instead-of-dcq >> - * optimized RISCV gen_write_ref_array_post_barrier() implementation >> contributed by @RealFYang >> - * fix card table verification crashes: in the first refinement phase, >> when switching the global card tables, we need to re-check whether we are >> still in the same sweep epoch or not. It might have changed due to a GC >> interrupting acquiring the Heap_lock. Otherwise new threads will scribble on >> the refinement table. >>Cause are last-minute changes before making the PR ready to review. >> >> Testing: without the patch, occurs fairly frequently when continuously >>(1 in 20) starting refinement. Does not afterward. >> - * ayang review 3 >> * comments >> * minor refactorings >> - * iwalulya review >> * renaming >> * fix some includes, forward declaration >> - * fix whitespace >>* additional whitespace between log tags >>* rename G1ConcurrentRefineWorkTask -> ...SweepTask to conform to the >> other similar rename >> - ayang review >> * renamings >> * refactorings >> - iwalulya review >> * comments for variables tracking to-collection-set and just dirtied >> cards after GC/refinement >> * predicate for determining whether the refinement has been disabled >> * some other typos/comment improvements >> * renamed _has_xxx_ref to _has_ref_to_xxx to be more consistent with >> naming >> - * ayang review - fix comment >> - * iwalulya review 2 >> * G1ConcurrentRefineWorkState -> G1ConcurrentRefineSweepState >> * some additional documentation >> - ... and 14 more: https://git.openjdk.org/jdk/compare/5727f166...aec95051 > > src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp line 263: > >> 261: >> 262: SuspendibleThreadSetLeaver sts_leave; >> 263: VMThread::execute(&op); > > Can you elaborate what synchronization this VM op is trying to achieve? Memory visibility for refinement threads for the references written to the heap. Without them, they may not have received the most recent values. This is the same as the `StoreLoad` barriers synchronization between mutator and refinement threads imo. - PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r1991561707
RFR: 8351593: [JMH] test PhoneCode.Bulk reports NPE exception
the test output was cleanup for common execution. - Commit messages: - JDK-8351593 [JMH] test PhoneCode.Bulk reports NPE exception Changes: https://git.openjdk.org/jdk/pull/24011/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24011&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8351593 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/24011.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24011/head:pull/24011 PR: https://git.openjdk.org/jdk/pull/24011
Re: RFR: 8347712: IllegalStateException on multithreaded ZipFile access with non-UTF8 charset [v4]
> Can I please get a review of this change which proposes to fix an issue > `java.util.zip.ZipFile` which would cause failures when multiple instances of > `ZipFile` using non-UTF8 `Charset` were operating against the same underlying > ZIP file? This addresses https://bugs.openjdk.org/browse/JDK-8347712. > > ZIP file specification allows for ZIP entries to mark a `UTF-8` flag to > indicate that the entry name and comment are encoded using UTF8. A > `java.util.zip.ZipFile` can be constructed by passing it a `Charset`. This > `Charset` (which defaults to UTF-8) gets used for decoding entry names and > comments for non-UTF8 entries. > > The internal implementation of `ZipFile` uses a `ZipCoder` (backed by > `java.nio.charset.CharsetEncoder/CharsetDecoder` instance) for the given > `Charset`. Except for UTF8 `ZipCoder`, other `ZipCoder`s are not thread safe. > > The internal implementation of `ZipFile` maintains a cache of > `ZipFile$Source`. A `Source` corresponds to the underlying ZIP file and > during construction, uses a `ZipCoder` for parsing the ZIP entries and once > constructed holds on to the parsed ZIP structure. Multiple instances of a > `ZipFile` which all correspond to the same ZIP file on the filesystem, share > a single instance of `Source` (after the `Source` has been constructed and > cached). Although `ZipFile` instances aren't expected to be thread-safe, the > fact that multiple different instances of `ZipFile` could be sharing the same > instance of `Source` in concurrent threads, mandates that the `Source` must > be thread-safe. > > In Java 15, we did a performance optimization through > https://bugs.openjdk.org/browse/JDK-8243469. As part of that change, we > started holding on to the `ZipCoder` instance (corresponding to the `Charset` > provided during `ZipFile` construction) in the `Source`. This stored > `ZipCoder` was then used for `ZipFile` operations when working with the ZIP > entries. As noted previously, any non-UTF8 `ZipCoder` is not thread-safe and > as a result, any usages of `ZipCoder` in the `Source` makes `Source` not > thread-safe too. That effectively violates the requirement that `Source` must > be thread-safe to allow for its usage in multiple different `ZipFile` > instances concurrently. This then causes `ZipFile` usages to fail in > unexpected ways like the one shown in the linked > https://bugs.openjdk.org/browse/JDK-8347712. > > The commit in this PR addresses the issue by not maintaining `ZipCoder` as a > instance field of `Source`. Instead the `ZipCoder` is now maintained in the > `ZipFile`,... Jaikiran Pai has updated the pull request incrementally with one additional commit since the last revision: add code comment - Changes: - all: https://git.openjdk.org/jdk/pull/23986/files - new: https://git.openjdk.org/jdk/pull/23986/files/b6486e7a..935b04ed Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23986&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23986&range=02-03 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/23986.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23986/head:pull/23986 PR: https://git.openjdk.org/jdk/pull/23986
Withdrawn: 8347491: IllegalArgumentationException thrown by ThreadPoolExecutor doesn't have a useful message
On Mon, 13 Jan 2025 16:28:52 GMT, Viktor Klang wrote: > Seems sensible to improve this given that it can be tricky to figure out > which parameter caused the exception to be thrown. This pull request has been closed without being integrated. - PR: https://git.openjdk.org/jdk/pull/23081
Re: RFR: 8351435: Change JLine Console implementation back to opt-in [v2]
> JDK has been using JLine based Console implementation, in JDK20 as opt-in, > then in JDK22 as the default. While it has been providing rich functionality > for Console, it is increasingly difficult to maintain as a Console > implementation. In light of the on-ramp feature > (https://bugs.openjdk.org/browse/JDK-8344699), which proposes switching > `java.lang.IO` class to use `System.in` and `System.out` instead of Console, > reverting the default Console implementation to JDK's built-in one in the > java.base module is considered desirable. Some tests are modified along with > this change, among them test/jdk/java/io/Console/ConsolePromptTest.java > changes were contributed by @lahodaj (thanks!) Naoto Sato has updated the pull request incrementally with one additional commit since the last revision: Reflects review comment - Changes: - all: https://git.openjdk.org/jdk/pull/23993/files - new: https://git.openjdk.org/jdk/pull/23993/files/960e6dc5..92c9a344 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23993&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23993&range=00-01 Stats: 19 lines in 1 file changed: 10 ins; 3 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/23993.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23993/head:pull/23993 PR: https://git.openjdk.org/jdk/pull/23993
Re: RFR: 8342382: Implementation of JEP G1: Improve Application Throughput with a More Efficient Write-Barrier [v18]
> Hi all, > > please review this change that implements (currently Draft) JEP: G1: > Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP > process is already taking very long with no end in sight but we would like to > have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble > Parallel GC's as described in the JEP. The reason is that G1 lacks in > throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent > refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers > (dirty card queues - dcq) containing the location of dirtied cards. > Refinement threads pick up their contents to re-refine. The barrier needs to > enqueue card locations. > * For correctness dirty card updates requires fine-grained synchronization > between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), > to avoid executing the synchronization and the enqueuing as much as possible. > > These tasks require the current barrier to look as follows for an assignment > `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad;// synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total > instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, > but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease > throughput by 10-20% > ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is > corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization > between refinement and mutator threads, but coarse grained based on > atomically switching card tables. Mutators only work on the "primary" card > table, refinement threads on a se... Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: * ayang review * remove unnecessary STSleaver * some more documentation around to_collection_card card color - Changes: - all: https://git.openjdk.org/jdk/pull/23739/files - new: https://git.openjdk.org/jdk/pull/23739/files/aec95051..3766b76c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=17 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=16-17 Stats: 18 lines in 2 files changed: 5 ins; 4 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/head:pull/23739 PR: https://git.openjdk.org/jdk/pull/23739
Re: RFR: 8347712: IllegalStateException on multithreaded ZipFile access with non-UTF8 charset [v2]
On Wed, 12 Mar 2025 06:42:43 GMT, Jaikiran Pai wrote: >> src/java.base/share/classes/java/util/zip/ZipFile.java line 376: >> >>> 374: * @param pos the entry position >>> 375: */ >>> 376: private static boolean useUTF8Coder(final byte[] cen, final int >>> pos) { >> >> This seems mostly used to determine which `ZipCoder` to pick. Could we fold >> this into the method, calling it `zipCoderForPos` and make it just return >> the `ZipCoder`, like we had in `Source`? >> >> If it needs to be static, then the non-UTF8/fallback ZipCoder could be >> passed as a parameter? >> >> If a method to determine whether a CEN entry uses UTF-8 is still needed, >> then I think it would be more appropriately named `isUTF8Entry`. > >> This seems mostly used to determine which ZipCoder to pick. Could we fold >> this into the method, calling it zipCoderForPos and make it just return the >> ZipCoder, like we had in Source? > > I intentionally removed the `zipCoderForPos()` method and instead introduced > this static method to avoid the case where the code ends up calling this > method and then stores the returned `ZipCoder` (like was being done in > `Source`). Instead with this new method, it now allows the call sites to > determine if a UTF8 `ZipCoder` is needed or a non-UTF8 one and then decide > where to get the non-UTF8 `ZipCoder` from. If the call site is a instance > method in `ZipFile`, then it was use the `ZipFile`'s `zipCoder` field and if > the call site is within `Source`, when the `Source` is being instantiated > then it constructs a non-UTF8 `ZipCoder` afresh. That level of detail is > better dealt at the call site instead of a method like `zipCoderForPos()`. Hello Jaikiran, I'm not convinced dropping `zipCoderForPos` is a step forward: * `zipCoderForPos` included a short-circuit optimization logic which skipped reading the CENFLG field in the case where the opening charset was UTF-8 (which it commonly is, and always for JarFile). This optimization seems lost in the currrent state of this PR and could impact lookup performance. It would be strange to repeat this optimization at all call sites. * The only thing differing between callsites seems to be which ZipCoder to use as a "fallback" when it's not UTF-8. This ZipCoder instance can easilly be passed in as a parameter, and will de-duplicate logic at the call sites. * The only "cumbersome" call site seems to be `initCEN`, caused by the lazy initialization of the non-UTF-8 ZipCoder. But that lazy initialization seems questionable: First, ZipCoder is already lazy in the way it initializes encoders and decoders. An unused ZipCoder is just a thin wrapper around a Charset and should not incur significant cost until it gets used. Second, the `ZipFile` constructor actually constructs a `ZipCoder` instance, would it not be better to just pass this down to initCEN as a parameter? That would avoid the cost of initializing encoders and decoders twice for the common case of single-threaded `ZipFile` access, once for initCEN, then again on the first lookup. Here's a patch for a quick implementation of the above on top of your PR. (Code speeks better than words some times :-) This retains the CENFLG optimization, simplifies logic at call sites and prevents duplicated initialization of ZipCoder beteween initCEN and lookups: Index: src/java.base/share/classes/java/util/zip/ZipFile.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 === diff --git a/src/java.base/share/classes/java/util/zip/ZipFile.java b/src/java.base/share/classes/java/util/zip/ZipFile.java --- a/src/java.base/share/classes/java/util/zip/ZipFile.java(revision 935b04ed00522aa92105292baa0693c55b1356ae) +++ b/src/java.base/share/classes/java/util/zip/ZipFile.java(date 1741800197915) @@ -201,8 +201,8 @@ this.fileName = file.getName(); long t0 = System.nanoTime(); -this.res = new CleanableResource(this, charset, file, mode); this.zipCoder = ZipCoder.get(charset); +this.res = new CleanableResource(this, charset, zipCoder, file, mode); PerfCounter.getZipFileOpenTime().addElapsedTimeFrom(t0); PerfCounter.getZipFileCount().increment(); @@ -368,13 +368,19 @@ } /** - * {@return true if the ZIP entry corresponding to the given {@code pos} uses UTF-8 - * for entry name and comment field encoding, false otherwise} + * {@return a ZipCoder for decoding name and comment fields for the given CEN entry position} * @param cen the CEN * @param pos the entry position + * @param fallback the ZipCoder to return when the entry is not flagged as UTF-8 */ -private static boolean useUTF8Coder(final byte[] cen, final int pos) { -return (CENFLG(cen, pos) & USE_UTF8) != 0; +private static ZipCoder zipCoderFor(final byte[] cen, final int pos, Z
Re: RFR: 8351374: Improve comment about queue.remove timeout in CleanerImpl.run
On Mon, 10 Mar 2025 23:14:23 GMT, Brent Christian wrote: >> Please review this revision of a previously puzzling comment intending to >> provide the rationale for a bit of non-obvious code. > > src/java.base/share/classes/jdk/internal/ref/CleanerImpl.java line 142: > >> 140: // while there are registered cleanables for other >> objects. >> 141: // If the application cleans all remaining cleanables, >> there >> 142: // won't be any references enqueued to unblock this. >> Using a > > I agree that calling `queue.remove()` with a timeout is the right approach. > But I have a question: > In the case where the Cleaner's `CleanerCleanable` has already run, and we > get to processing the last registered cleanable on `activeList`: > When we do the `ref.clean()`, the `activeList` becomes empty, and we'll drop > out of the `while()` loop. So I'm not seeing how we would attempt another > `queue.remove()` in this case. > What am I missing? You are missing that this loop is not the only place that potentially removes references from `activeList`. The application may also do so, when directly cleaning (as part of a `close()` operation or the like). So the cleaner's cleanable may be dropped, removed, and cleaned but with live entries still in `activeList`, and this loop ends up blocked in `remove` because there's nothing for it to do. The application later closes all of the remaining entries in the `activeList`, which doesn't cause anything to be enqueued on the cleaner's queue, so the cleaner thread remains blocked in `remove`. But now `activeList` is empty and will never be added to, and the queue is also empty, and the thread is blocked in `remove` with nothing (other than the timeout) to break it out. - PR Review Comment: https://git.openjdk.org/jdk/pull/23952#discussion_r1992205243
Re: RFR: 8342382: Implementation of JEP G1: Improve Application Throughput with a More Efficient Write-Barrier [v17]
On Wed, 12 Mar 2025 11:58:45 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that implements (currently Draft) JEP: G1: >> Improve Application Throughput with a More Efficient Write-Barrier. >> >> The reason for posting this early is that this is a large change, and the >> JEP process is already taking very long with no end in sight but we would >> like to have this ready by JDK 25. >> >> ### Current situation >> >> With this change, G1 will reduce the post write barrier to much more >> resemble Parallel GC's as described in the JEP. The reason is that G1 lacks >> in throughput compared to Parallel/Serial GC due to larger barrier. >> >> The main reason for the current barrier is how g1 implements concurrent >> refinement: >> * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of >> buffers (dirty card queues - dcq) containing the location of dirtied cards. >> Refinement threads pick up their contents to re-refine. The barrier needs to >> enqueue card locations. >> * For correctness dirty card updates requires fine-grained synchronization >> between mutator and refinement threads, >> * Finally there is generic code to avoid dirtying cards altogether >> (filters), to avoid executing the synchronization and the enqueuing as much >> as possible. >> >> These tasks require the current barrier to look as follows for an assignment >> `x.a = y` in pseudo code: >> >> >> // Filtering >> if (region(@x.a) == region(y)) goto done; // same region check >> if (y == null) goto done; // null value check >> if (card(@x.a) == young_card) goto done; // write to young gen check >> StoreLoad;// synchronize >> if (card(@x.a) == dirty_card) goto done; >> >> *card(@x.a) = dirty >> >> // Card tracking >> enqueue(card-address(@x.a)) into thread-local-dcq; >> if (thread-local-dcq is not full) goto done; >> >> call runtime to move thread-local-dcq into dcqs >> >> done: >> >> >> Overall this post-write barrier alone is in the range of 40-50 total >> instructions, compared to three or four(!) for parallel and serial gc. >> >> The large size of the inlined barrier not only has a large code footprint, >> but also prevents some compiler optimizations like loop unrolling or >> inlining. >> >> There are several papers showing that this barrier alone can decrease >> throughput by 10-20% >> ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is >> corroborated by some benchmarks (see links). >> >> The main idea for this change is to not use fine-grained synchronization >> between refinement and mutator threads, but coarse grained based on >> atomically switching c... > > Thomas Schatzl has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains 24 additional > commits since the last revision: > > - Merge branch 'master' into 8342382-card-table-instead-of-dcq > - * optimized RISCV gen_write_ref_array_post_barrier() implementation > contributed by @RealFYang > - * fix card table verification crashes: in the first refinement phase, when > switching the global card tables, we need to re-check whether we are still in > the same sweep epoch or not. It might have changed due to a GC interrupting > acquiring the Heap_lock. Otherwise new threads will scribble on the > refinement table. >Cause are last-minute changes before making the PR ready to review. > > Testing: without the patch, occurs fairly frequently when continuously >(1 in 20) starting refinement. Does not afterward. > - * ayang review 3 > * comments > * minor refactorings > - * iwalulya review > * renaming > * fix some includes, forward declaration > - * fix whitespace >* additional whitespace between log tags >* rename G1ConcurrentRefineWorkTask -> ...SweepTask to conform to the > other similar rename > - ayang review > * renamings > * refactorings > - iwalulya review > * comments for variables tracking to-collection-set and just dirtied > cards after GC/refinement > * predicate for determining whether the refinement has been disabled > * some other typos/comment improvements > * renamed _has_xxx_ref to _has_ref_to_xxx to be more consistent with > naming > - * ayang review - fix comment > - * iwalulya review 2 > * G1ConcurrentRefineWorkState -> G1ConcurrentRefineSweepState > * some additional documentation > - ... and 14 more: https://git.openjdk.org/jdk/compare/53a66058...aec95051 src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp line 217: > 215: > 216: { > 217: SuspendibleThreadSetLeaver sts_leave; Can you add some comment on why leaving the set is required? It's not obvious to me why. I'd expect handshake to work out of the box... src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp line 263: > 261: > 262: SuspendibleThreadSetL
Re: RFR: 8342382: Implementation of JEP G1: Improve Application Throughput with a More Efficient Write-Barrier [v17]
> Hi all, > > please review this change that implements (currently Draft) JEP: G1: > Improve Application Throughput with a More Efficient Write-Barrier. > > The reason for posting this early is that this is a large change, and the JEP > process is already taking very long with no end in sight but we would like to > have this ready by JDK 25. > > ### Current situation > > With this change, G1 will reduce the post write barrier to much more resemble > Parallel GC's as described in the JEP. The reason is that G1 lacks in > throughput compared to Parallel/Serial GC due to larger barrier. > > The main reason for the current barrier is how g1 implements concurrent > refinement: > * g1 tracks dirtied cards using sets (dirty card queue set - dcqs) of buffers > (dirty card queues - dcq) containing the location of dirtied cards. > Refinement threads pick up their contents to re-refine. The barrier needs to > enqueue card locations. > * For correctness dirty card updates requires fine-grained synchronization > between mutator and refinement threads, > * Finally there is generic code to avoid dirtying cards altogether (filters), > to avoid executing the synchronization and the enqueuing as much as possible. > > These tasks require the current barrier to look as follows for an assignment > `x.a = y` in pseudo code: > > > // Filtering > if (region(@x.a) == region(y)) goto done; // same region check > if (y == null) goto done; // null value check > if (card(@x.a) == young_card) goto done; // write to young gen check > StoreLoad;// synchronize > if (card(@x.a) == dirty_card) goto done; > > *card(@x.a) = dirty > > // Card tracking > enqueue(card-address(@x.a)) into thread-local-dcq; > if (thread-local-dcq is not full) goto done; > > call runtime to move thread-local-dcq into dcqs > > done: > > > Overall this post-write barrier alone is in the range of 40-50 total > instructions, compared to three or four(!) for parallel and serial gc. > > The large size of the inlined barrier not only has a large code footprint, > but also prevents some compiler optimizations like loop unrolling or inlining. > > There are several papers showing that this barrier alone can decrease > throughput by 10-20% > ([Yang12](https://dl.acm.org/doi/10.1145/2426642.2259004)), which is > corroborated by some benchmarks (see links). > > The main idea for this change is to not use fine-grained synchronization > between refinement and mutator threads, but coarse grained based on > atomically switching card tables. Mutators only work on the "primary" card > table, refinement threads on a se... Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 24 additional commits since the last revision: - Merge branch 'master' into 8342382-card-table-instead-of-dcq - * optimized RISCV gen_write_ref_array_post_barrier() implementation contributed by @RealFYang - * fix card table verification crashes: in the first refinement phase, when switching the global card tables, we need to re-check whether we are still in the same sweep epoch or not. It might have changed due to a GC interrupting acquiring the Heap_lock. Otherwise new threads will scribble on the refinement table. Cause are last-minute changes before making the PR ready to review. Testing: without the patch, occurs fairly frequently when continuously (1 in 20) starting refinement. Does not afterward. - * ayang review 3 * comments * minor refactorings - * iwalulya review * renaming * fix some includes, forward declaration - * fix whitespace * additional whitespace between log tags * rename G1ConcurrentRefineWorkTask -> ...SweepTask to conform to the other similar rename - ayang review * renamings * refactorings - iwalulya review * comments for variables tracking to-collection-set and just dirtied cards after GC/refinement * predicate for determining whether the refinement has been disabled * some other typos/comment improvements * renamed _has_xxx_ref to _has_ref_to_xxx to be more consistent with naming - * ayang review - fix comment - * iwalulya review 2 * G1ConcurrentRefineWorkState -> G1ConcurrentRefineSweepState * some additional documentation - ... and 14 more: https://git.openjdk.org/jdk/compare/f77fa17b...aec95051 - Changes: - all: https://git.openjdk.org/jdk/pull/23739/files - new: https://git.openjdk.org/jdk/pull/23739/files/758fac01..aec95051 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=23739&range=15-16 Stats: 78123 lines in 1539 files changed: 36243 ins; 29177 del; 12703 mod Patch: https://git.openjdk.org/jdk/pull/23739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/23739/he
Re: RFR: 8342382: Implementation of JEP G1: Improve Application Throughput with a More Efficient Write-Barrier [v14]
On Fri, 7 Mar 2025 13:14:02 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one >> additional commit since the last revision: >> >> * iwalulya review >> * renaming >> * fix some includes, forward declaration > > src/hotspot/share/gc/g1/g1CardTable.hpp line 76: > >> 74: g1_card_already_scanned = 0x1, >> 75: g1_to_cset_card = 0x2, >> 76: g1_from_remset_card = 0x4 > > Could you outline the motivation for this more precise info? Is it for > optimization or essentially for correctness? OK, it's for better performance, not correctness. How much is the improvement? As I understand it, this more precise info is largely independent of the new barrier logic. I wonder if it makes sense to extract this out to its own ticket to better assess its impact on perf and impl complexity. - PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r1991375754
Re: RFR: 8342382: Implementation of JEP G1: Improve Application Throughput with a More Efficient Write-Barrier [v17]
On Wed, 12 Mar 2025 13:20:25 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains 24 additional >> commits since the last revision: >> >> - Merge branch 'master' into 8342382-card-table-instead-of-dcq >> - * optimized RISCV gen_write_ref_array_post_barrier() implementation >> contributed by @RealFYang >> - * fix card table verification crashes: in the first refinement phase, >> when switching the global card tables, we need to re-check whether we are >> still in the same sweep epoch or not. It might have changed due to a GC >> interrupting acquiring the Heap_lock. Otherwise new threads will scribble on >> the refinement table. >>Cause are last-minute changes before making the PR ready to review. >> >> Testing: without the patch, occurs fairly frequently when continuously >>(1 in 20) starting refinement. Does not afterward. >> - * ayang review 3 >> * comments >> * minor refactorings >> - * iwalulya review >> * renaming >> * fix some includes, forward declaration >> - * fix whitespace >>* additional whitespace between log tags >>* rename G1ConcurrentRefineWorkTask -> ...SweepTask to conform to the >> other similar rename >> - ayang review >> * renamings >> * refactorings >> - iwalulya review >> * comments for variables tracking to-collection-set and just dirtied >> cards after GC/refinement >> * predicate for determining whether the refinement has been disabled >> * some other typos/comment improvements >> * renamed _has_xxx_ref to _has_ref_to_xxx to be more consistent with >> naming >> - * ayang review - fix comment >> - * iwalulya review 2 >> * G1ConcurrentRefineWorkState -> G1ConcurrentRefineSweepState >> * some additional documentation >> - ... and 14 more: https://git.openjdk.org/jdk/compare/0c7b5abb...aec95051 > > src/hotspot/share/gc/g1/g1ConcurrentRefine.cpp line 217: > >> 215: >> 216: { >> 217: SuspendibleThreadSetLeaver sts_leave; > > Can you add some comment on why leaving the set is required? It's not obvious > to me why. I'd expect handshake to work out of the box... It isn't apparently. Removed. - PR Review Comment: https://git.openjdk.org/jdk/pull/23739#discussion_r1991999476
Re: RFR: 8351435: Change JLine Console implementation back to opt-in [v2]
On Wed, 12 Mar 2025 07:13:19 GMT, Jaikiran Pai wrote: >> Naoto Sato has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Reflects review comment > > test/jdk/java/io/Console/ConsolePromptTest.java line 77: > >> 75: var expect = Paths.get("/usr/bin/expect"); >> 76: if (!Files.exists(expect) || !Files.isExecutable(expect)) { >> 77: System.out.println("'expect' command not found. Test >> ignored."); > > Hello Naoto, I think throwing a `jtreg.SkippedException` might be better here > so that it's clear that the test was skipped. There have been recent > reporting improvements too which make it easier to notice such skipped tests. Good point. Modified the piece (w/ some other minor changes) - PR Review Comment: https://git.openjdk.org/jdk/pull/23993#discussion_r1991925846
Re: RFR: 8336881: [Linux] Support for hierarchical limits for Metrics [v16]
On Thu, 20 Feb 2025 14:22:20 GMT, Severin Gehwolf wrote: >> Please review this fix for cgroups-based metrics reporting in the >> `jdk.internal.platform` package. This fix is supposed to address wrong >> reporting of certain limits if the limits aren't set at the leaf nodes. >> >> For example, on cg v2, the memory limit interface file is `memory.max`. >> Consider a cgroup path of `/a/b/c/d`. The current code only reports the >> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. >> However, some systems - like a systemd slice - sets those limits further up >> the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` >> might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` >> would report the actual limit value (e.g. `1048576000`). >> >> This patch addresses this issue by: >> >> 1. Refactoring the interface lookup code to relevant controllers for >> cpu/memory. The CgroupSubsystem classes then delegate to those for the >> lookup. This facilitates having an API for the lookup of an updated limit in >> step 2. >> 2. Walking the full hierarchy of the cgroup path (if any), looking for a >> lower limit than at the leaf. Note that it's not possible to raise the limit >> set at a path closer to the root via the interface file at a >> further-to-the-leaf-level. The odd case out seems to be `max` values on some >> systems (which seems to be the default value). >> >> As an optimization this hierarchy walk is skipped on containerized systems >> (like K8S), where the limits are set in interface files at the leaf nodes of >> the hierarchy. Therefore there should be no change on those systems. >> >> This patch depends on the Hotspot change implementing the same for the JVM >> so that `Metrics.isContainerized()` works correctly on affected systems >> where `-XshowSettings:system` currently reports `System not containerized` >> due to the missing JVM fix. A test framework for such hierarchical systems >> has been added in >> [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds >> a test using that framework among some simpler unit tests. >> >> Thoughts? >> >> Testing: >> >> - [x] GHA >> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems >> - [x] Some manual testing using systemd slices > > Severin Gehwolf has updated the pull request with a new target base due to a > merge or a rebase. The pull request now contains 42 commits: > > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Fix missing imports > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Add exclusive access dirs directive for systemd tests > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - ... and 32 more: https://git.openjdk.org/jdk/compare/e1d0a9c8...74abae5b Keep open, please. - PR Comment: https://git.openjdk.org/jdk/pull/20280#issuecomment-2718491360
Re: RFR: 8351435: Change JLine Console implementation back to opt-in
On Tue, 11 Mar 2025 18:32:10 GMT, Naoto Sato wrote: > JDK has been using JLine based Console implementation, in JDK20 as opt-in, > then in JDK22 as the default. While it has been providing rich functionality > for Console, it is increasingly difficult to maintain as a Console > implementation. In light of the on-ramp feature > (https://bugs.openjdk.org/browse/JDK-8344699), which proposes switching > `java.lang.IO` class to use `System.in` and `System.out` instead of Console, > reverting the default Console implementation to JDK's built-in one in the > java.base module is considered desirable. Some tests are modified along with > this change, among them test/jdk/java/io/Console/ConsolePromptTest.java > changes were contributed by @lahodaj (thanks!) test/jdk/java/io/Console/ConsolePromptTest.java line 77: > 75: var expect = Paths.get("/usr/bin/expect"); > 76: if (!Files.exists(expect) || !Files.isExecutable(expect)) { > 77: System.out.println("'expect' command not found. Test > ignored."); Hello Naoto, I think throwing a `jtreg.SkippedException` might be better here so that it's clear that the test was skipped. There have been recent reporting improvements too which make it easier to notice such skipped tests. - PR Review Comment: https://git.openjdk.org/jdk/pull/23993#discussion_r1990775475
Re: RFR: 8347712: IllegalStateException on multithreaded ZipFile access with non-UTF8 charset [v2]
On Tue, 11 Mar 2025 15:31:09 GMT, Eirik Bjørsnøs wrote: > This seems mostly used to determine which ZipCoder to pick. Could we fold > this into the method, calling it zipCoderForPos and make it just return the > ZipCoder, like we had in Source? I intentionally removed the `zipCoderForPos()` method and instead introduced this static method to avoid the case where the code ends up calling this method and then stores the returned `ZipCoder` (like was being done in `Source`). Instead with this new method, it now allows the call sites to determine if a UTF8 `ZipCoder` is needed or a non-UTF8 one and then decide where to get the non-UTF8 `ZipCoder` from. If the call site is a instance method in `ZipFile`, then it was use the `ZipFile`'s `zipCoder` field and if the call site is within `Source`, when the `Source` is being instantiated then it constructs a non-UTF8 `ZipCoder` afresh. That level of detail is better dealt at the call site instead of a method like `zipCoderForPos()`. > src/java.base/share/classes/java/util/zip/ZipFile.java line 1744: > >> 1742: >> 1743: int entryPos = pos + CENHDR; >> 1744: final ZipCoder entryZipCoder; > > If you want to construct this lazily, then I think a comment documenting that > purpose would be useful here. Done. - PR Review Comment: https://git.openjdk.org/jdk/pull/23986#discussion_r1990711952 PR Review Comment: https://git.openjdk.org/jdk/pull/23986#discussion_r1990713465
Re: RFR: 8351897: Extra closing curly brace typos in Javadoc
On Thu, 13 Mar 2025 01:21:54 GMT, Archie Cobbs wrote: > This PR fixes minor Javadoc typos in a few `java.base` classes. Extra curly > braces have been sneaking in. src/java.base/share/classes/java/text/MessageFormat.java line 327: > 325: * {@code unit} > 326: * {@link ListFormat#getInstance(Locale, ListFormat.Type, > ListFormat.Style) > 327: * ListFormat.getInstance}{@code (getLocale()}, {@link > ListFormat.Type#UNIT}, {@link ListFormat.Style#FULL}) The new ')' at the end of the line. Where's the matching '('? I'm guessing that the IDE thinks it's the one immediately before the `getLocale()`, but I suspect that javadoc won't agree with that. - PR Review Comment: https://git.openjdk.org/jdk/pull/24022#discussion_r1992525288
Re: RFR: 8351897: Extra closing curly brace typos in Javadoc
On Thu, 13 Mar 2025 01:21:54 GMT, Archie Cobbs wrote: > This PR fixes minor Javadoc typos in a few `java.base` classes. Extra curly > braces have been sneaking in. The change in `MessageFormat.java` looks good - PR Review: https://git.openjdk.org/jdk/pull/24022#pullrequestreview-2680224832
Re: RFR: 8351897: Extra closing curly brace typos in Javadoc
On Thu, 13 Mar 2025 01:46:53 GMT, Iris Clark wrote: > Where's the matching `(`? I'm guessing that the IDE thinks it's the one > immediately before the `getLocale()`, but I suspect that javadoc won't agree > with that. You are correct - the matching `(` is the one preceding `getLocale()`. I'm not sure what you mean by "I suspect that javadoc won't agree with that". This change is simply making this row of the table consistent with the ones above it. You can see how they currently appear [here](https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/text/MessageFormat.html). In this row, the closing `}` should have been a `)`. Thanks for taking a look. - PR Review Comment: https://git.openjdk.org/jdk/pull/24022#discussion_r1992531755
Re: RFR: 8345687: Improve the implementation of SegmentFactories::allocateSegment [v7]
> Hi, > > This patch improves the performance of a typical `Arena::allocate` in several > ways: > > - Delay the creation of the NativeMemorySegmentImpl. This avoids the merge of > the instance with the one obtained from the call in the uncommon path, > increasing the chance the object being scalar replaced. > - Split the allocation of over-aligned memory to a slow-path method. > - Align the memory to 8 bytes, allowing faster zeroing. > - Use a dedicated method to zero the just-allocated native memory, reduce > code size and make it more straightforward. > - Make `VM.pageAlignDirectMemory` a `Boolean` instead of a `boolean` so that > `false` value can be constant folded. > > Please take a look and leave your reviews, thanks a lot. Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: expand wildcard imports - Changes: - all: https://git.openjdk.org/jdk/pull/22610/files - new: https://git.openjdk.org/jdk/pull/22610/files/e82cc493..dba7ae14 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=22610&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=22610&range=05-06 Stats: 17 lines in 1 file changed: 14 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/22610.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/22610/head:pull/22610 PR: https://git.openjdk.org/jdk/pull/22610
RFR: 8351897: Extra closing curly brace typos in Javadoc
This PR fixes minor Javadoc typos in a few `java.base` classes. Extra curly braces have been sneaking in. - Commit messages: - Fix minor Javadoc typos. Changes: https://git.openjdk.org/jdk/pull/24022/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24022&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8351897 Stats: 12 lines in 5 files changed: 0 ins; 0 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/24022.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24022/head:pull/24022 PR: https://git.openjdk.org/jdk/pull/24022