Re: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms

2025-04-26 Thread Andrew Haley
On Sun, 6 Apr 2025 03:48:22 GMT, Mohamed Issa wrote: > the built in **cbrt** micro-benchmark How should we run that benchmark? Thanks. - PR Comment: https://git.openjdk.org/jdk/pull/24470#issuecomment-2832035668

Re: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms

2025-04-26 Thread Andrew Haley
On Sun, 6 Apr 2025 03:48:22 GMT, Mohamed Issa wrote: > The goal of this PR is to implement an x86_64 intrinsic for > java.lang.Math.cbrt() using libm. > > The results of all tests posted below were captured with an [Intel® Xeon > 6761P](https://www.intel.com/content/www/us/en/products/sku/2418

Re: RFR: 8354242: VectorAPI: combine vector not operation with compare [v2]

2025-04-24 Thread Andrew Haley
On Fri, 18 Apr 2025 01:36:10 GMT, erifan wrote: >> This patch optimizes the following patterns: >> For integer types: >> >> (XorV (VectorMaskCmp src1 src2 cond) (Replicate -1)) >> => (VectorMaskCmp src1 src2 ncond) >> (XorVMask (VectorMaskCmp src1 src2 cond) (MaskAll m1)) >> => (VectorMa

Re: RFR: 8077587: BigInteger Roots [v19]

2025-04-21 Thread Andrew Haley
On Sun, 20 Apr 2025 16:07:56 GMT, fabioromano1 wrote: >> This PR implements nth root computation for `BigInteger`s using Newton >> method and optimizes `BigInteger.pow(int)` method. >> [Here is a proof of convergence of the recurrence >> used.](https://github.com/user-attachments/files/19785045

Re: RFR: 8077587: BigInteger Roots [v19]

2025-04-21 Thread Andrew Haley
On Sun, 20 Apr 2025 16:07:56 GMT, fabioromano1 wrote: >> This PR implements nth root computation for `BigInteger`s using Newton >> method and optimizes `BigInteger.pow(int)` method. >> [Here is a proof of convergence of the recurrence >> used.](https://github.com/user-attachments/files/19785045

Re: RFR: 8077587: BigInteger Roots [v19]

2025-04-21 Thread Andrew Haley
On Sun, 20 Apr 2025 16:07:56 GMT, fabioromano1 wrote: >> This PR implements nth root computation for `BigInteger`s using Newton >> method and optimizes `BigInteger.pow(int)` method. >> [Here is a proof of convergence of the recurrence >> used.](https://github.com/user-attachments/files/19785045
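
For readers unfamiliar with the technique discussed in this thread, the following is a minimal sketch of an integer nth root computed with Newton's method. It is not the code from the PR: the class name `NthRootSketch` and the method `floorRoot` are hypothetical, and the starting guess and stopping rule are the textbook ones, assumed here for illustration only.

```java
import java.math.BigInteger;

public class NthRootSketch {
    // Hypothetical helper: returns floor(x^(1/n)) for x >= 0 and n >= 1.
    static BigInteger floorRoot(BigInteger x, int n) {
        if (x.signum() < 0 || n < 1) {
            throw new IllegalArgumentException("x must be >= 0 and n >= 1");
        }
        if (x.signum() == 0 || n == 1) {
            return x;
        }
        BigInteger bigN = BigInteger.valueOf(n);
        BigInteger nMinus1 = BigInteger.valueOf(n - 1);
        // Start at a power of two that is guaranteed to be above the true root.
        BigInteger r = BigInteger.ONE.shiftLeft((x.bitLength() + n - 1) / n);
        while (true) {
            // Newton step: next = ((n - 1) * r + x / r^(n-1)) / n
            BigInteger next = nMinus1.multiply(r)
                    .add(x.divide(r.pow(n - 1)))
                    .divide(bigN);
            // The sequence decreases until it reaches the floor of the root.
            if (next.compareTo(r) >= 0) {
                return r;
            }
            r = next;
        }
    }

    public static void main(String[] args) {
        System.out.println(floorRoot(BigInteger.valueOf(27), 3));
        System.out.println(floorRoot(BigInteger.TEN.pow(30), 3));
    }
}
```

Starting above the root keeps the iterates monotonically decreasing, so the loop can stop at the first step that fails to decrease.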

Re: RFR: 8349184: [JMH] jdk.incubator.vector.ColumnFilterBenchmark.filterDoubleColumn fails on linux-aarch64 [v2]

2025-02-03 Thread Andrew Haley
On Mon, 3 Feb 2025 04:16:57 GMT, SendaoYan wrote: >> Hi all, >> Several JMH tests fail with "Unrecognized VM option 'UseAVX=2'" on >> linux-aarch64. The VM option '-XX:UseAVX=2' is only supported on the x86_64 platform. >> This PR adds the option '-XX:+IgnoreUnrecognizedVMOptions' to make the tests run >> normally o

Re: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v6]

2024-12-20 Thread Andrew Haley
On Fri, 20 Dec 2024 15:42:14 GMT, Fei Gao wrote: >> Galder Zamarreño has updated the pull request incrementally with five >> additional commits since the last revision: >> >> - Added comment around the assertions >> - Adjust min/max identity IR test expectations after changes >> - Fix style

Integrated: 8339916: AIOOBE due to Math.abs(Integer.MIN_VALUE) in tests

2024-11-22 Thread Andrew Haley
On Thu, 21 Nov 2024 17:21:07 GMT, Andrew Haley wrote: > Test bug. > > Another bug caused by the result of `Math.abs()` being negative. I also tried > `floorMod()`, which would have been cleaner, but it increased the runtime of > this extremely time-sensitive benchmark. This p

Re: RFR: 8339916: AIOOBE due to Math.abs(Integer.MIN_VALUE) in tests

2024-11-22 Thread Andrew Haley
On Fri, 22 Nov 2024 10:36:21 GMT, Rémi Forax wrote: > Let's say that the definition of Math.abs() is surprising If anything is surprising, it's the definition of `%`, which is compatible with `/` but otherwise useless. - PR Comment: https://git.openjdk.org/jdk/pull/22297#issuecom
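
The remark about `%` being "compatible with `/`" can be illustrated with a small sketch. Java's integer division truncates toward zero, and `%` is defined so that `(a / b) * b + (a % b) == a`; as a consequence the remainder takes the sign of the dividend. The class and method names below are hypothetical and exist only for this illustration.

```java
public class RemainderSketch {
    // Checks the identity that ties % to truncating division (b must be non-zero).
    static boolean identityHolds(int a, int b) {
        return (a / b) * b + (a % b) == a;
    }

    public static void main(String[] args) {
        System.out.println(-7 / 2);               // truncates toward zero: -3
        System.out.println(-7 % 2);               // sign follows the dividend: -1
        System.out.println(identityHolds(-7, 2)); // true
    }
}
```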

RFR: 8339916: AIOOBE due to Math.abs(Integer.MIN_VALUE) in tests

2024-11-21 Thread Andrew Haley
Test bug. Another bug caused by the result of `Math.abs()` being negative. I also tried `floorMod()`, which would have been cleaner, but it increased the runtime of this extremely time-sensitive benchmark. - Commit messages: - Fix integer overflow in abs() Changes: https://git.op
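
The failure mode described here can be reproduced in a few lines. This is a sketch under assumptions, not the code from the affected test: the names `absIndex` and `floorModIndex` are hypothetical. `Math.abs(Integer.MIN_VALUE)` overflows and returns `Integer.MIN_VALUE`, so an index derived from it can be negative, whereas `Math.floorMod` with a positive modulus always yields a value in `[0, length)`.

```java
public class AbsIndexSketch {
    // Buggy pattern: negative when hash == Integer.MIN_VALUE.
    static int absIndex(int hash, int length) {
        return Math.abs(hash) % length;
    }

    // Cleaner alternative mentioned in the message: never negative for length > 0.
    static int floorModIndex(int hash, int length) {
        return Math.floorMod(hash, length);
    }

    public static void main(String[] args) {
        int[] table = new int[10];
        System.out.println(Math.abs(Integer.MIN_VALUE) < 0); // true
        int bad = absIndex(Integer.MIN_VALUE, table.length);
        int good = floorModIndex(Integer.MIN_VALUE, table.length);
        System.out.println(bad);  // negative, would throw AIOOBE if used as an index
        System.out.println(good); // within 0..9
    }
}
```

As the message notes, `floorMod()` was measured to be slower in that particular benchmark, so the cleaner form is not automatically the right choice in time-sensitive code.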

Re: RFR: 8338383: Implement JEP 491: Synchronize Virtual Threads without Pinning

2024-11-06 Thread Andrew Haley
On Tue, 22 Oct 2024 15:48:43 GMT, Andrew Haley wrote: >> src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 60: >> >>> 58: >>> 59: assert(LockingMode != LM_LIGHTWEIGHT, "lightweight locking should use >>> fast_lock_lightweight"); &

Re: RFR: 8338383: Implement JEP 491: Synchronize Virtual Threads without Pinning

2024-11-06 Thread Andrew Haley
On Tue, 22 Oct 2024 15:37:23 GMT, Andrew Haley wrote: >> This is the implementation of JEP 491: Synchronize Virtual Threads without >> Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for >> further details. >> >> In order to make the code

Re: RFR: 8338383: Implement JEP 491: Synchronize Virtual Threads without Pinning

2024-11-06 Thread Andrew Haley
On Thu, 17 Oct 2024 14:28:30 GMT, Patricio Chilano Mateo wrote: > * We copy the oops stored in the LockStack of the carrier to the > stackChunk when freezing (and clear the LockStack). We copy the oops back to > the LockStack of the next carrier when thawing for the first time (and clear

Integrated: 8341903: Implementation of Scoped Values (Fourth Preview)

2024-11-05 Thread Andrew Haley
On Thu, 10 Oct 2024 16:16:51 GMT, Andrew Haley wrote: > The fourth preview of scoped values. This pull request has now been integrated. Changeset: 3fab8e37 Author: Andrew Haley URL: https://git.openjdk.org/jdk/commit/3fab8e37bbebbb3930108b2015efe488b1fa1e97 Stats: 161 lines i

Re: RFR: 8341903: Implementation of Scoped Values (Fourth Preview) [v7]

2024-11-01 Thread Andrew Haley
> The fourth preview of scoped values. Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revis

Re: RFR: 8341903: Implementation of Scoped Values (Fourth Preview) [v6]

2024-10-29 Thread Andrew Haley
> The fourth preview of scoped values. Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: Typo in comment. (Thanks, Roland!) - Changes: - all: https://git.openjdk.org/jdk/pull/21456/files - new: https://git.openjdk.

Re: RFR: 8341903: Implementation of Scoped Values (Fourth Preview) [v5]

2024-10-23 Thread Andrew Haley
> The fourth preview of scoped values. Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: API change - Changes: - all: https://git.openjdk.org/jdk/pull/21456/files - new: https://git.openjdk.org/jdk/pull/21456/fi

Re: RFR: 8338383: Implementation of Synchronize Virtual Threads without Pinning [v3]

2024-10-22 Thread Andrew Haley
On Tue, 22 Oct 2024 15:37:23 GMT, Andrew Haley wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with six >> additional commits since the last revision: >> >> - Fix comments in objectMonitor.hpp >> - Move frame::saved_thread_address

Re: RFR: 8338383: Implementation of Synchronize Virtual Threads without Pinning [v3]

2024-10-22 Thread Andrew Haley
On Tue, 22 Oct 2024 15:48:43 GMT, Andrew Haley wrote: >> src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 60: >> >>> 58: >>> 59: assert(LockingMode != LM_LIGHTWEIGHT, "lightweight locking should use >>> fast_lock_lightweight"); &

Re: RFR: 8338383: Implementation of Synchronize Virtual Threads without Pinning [v3]

2024-10-22 Thread Andrew Haley
On Tue, 22 Oct 2024 02:14:23 GMT, Patricio Chilano Mateo wrote: >> This is the implementation of JEP 491: Synchronize Virtual Threads without >> Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for >> further details. >> >> In order to make the code review easier the change

Re: RFR: 8338383: Implementation of Synchronize Virtual Threads without Pinning [v3]

2024-10-22 Thread Andrew Haley
On Tue, 22 Oct 2024 02:14:23 GMT, Patricio Chilano Mateo wrote: >> This is the implementation of JEP 491: Synchronize Virtual Threads without >> Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for >> further details. >> >> In order to make the code review easier the change

Re: RFR: 8338383: Implementation of Synchronize Virtual Threads without Pinning [v3]

2024-10-22 Thread Andrew Haley
On Tue, 22 Oct 2024 02:14:23 GMT, Patricio Chilano Mateo wrote: >> This is the implementation of JEP 491: Synchronize Virtual Threads without >> Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for >> further details. >> >> In order to make the code review easier the change

Re: RFR: 8341903: Implementation of Scoped Values (Fourth Preview) [v4]

2024-10-14 Thread Andrew Haley
> The fourth preview of scoped values. Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revis

Re: RFR: 8341903: Implementation of Scoped Values (Fourth Preview) [v3]

2024-10-14 Thread Andrew Haley
On Mon, 14 Oct 2024 13:40:50 GMT, Andrew Haley wrote: >> The fourth preview of scoped values. > > Andrew Haley has updated the pull request incrementally with two additional > commits since the last revision: > > - Scoped values > - Scoped values Tier1 and Tie

Re: RFR: 8341903: Implementation of Scoped Values (Fourth Preview) [v3]

2024-10-14 Thread Andrew Haley
> The fourth preview of scoped values. Andrew Haley has updated the pull request incrementally with two additional commits since the last revision: - Scoped values - Scoped values - Changes: - all: https://git.openjdk.org/jdk/pull/21456/files - new: https://git.openjdk.

Re: RFR: 8341903: Implementation of Scoped Values (Fourth Preview) [v2]

2024-10-10 Thread Andrew Haley
> The fourth preview of scoped values. Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: Fix javadoc - Changes: - all: https://git.openjdk.org/jdk/pull/21456/files - new: https://git.openjdk.org/jdk/pull/21456/fi

RFR: 8341903: Implementation of Scoped Values (Fourth Preview)

2024-10-10 Thread Andrew Haley
The fourth preview of scoped values. - Commit messages: - Scoped Values API changes - Scoped Values API changes - Scoped Values API changes Changes: https://git.openjdk.org/jdk/pull/21456/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=21456&range=00 Issue: https://bugs

Re: RFR: 8338526: Don't store abstract and interface Klasses in class metaspace

2024-10-09 Thread Andrew Haley
On Tue, 3 Sep 2024 15:50:13 GMT, Thomas Stuefe wrote: >>> > I don't think the costs for two address comparisons matter, not with the >>> > comparatively few deallocations that happen (few hundreds or few >>> > thousand). If deallocate is hot, we are using metaspace wrong. >>> >>> MethodData do

Re: RFR: 8338526: Don't store abstract and interface Klasses in class metaspace [v6]

2024-10-09 Thread Andrew Haley
On Wed, 9 Oct 2024 06:07:13 GMT, John R Rose wrote: > If the interfaces had a compact numbering, and there were a side table > mapping the compact numbers to interface Klass* pointers, then I think > Andrew's code would still work (with natural adjustments). Probably, yes. I can pack klass ID+

Re: RFR: 8338526: Don't store abstract and interface Klasses in class metaspace [v6]

2024-10-09 Thread Andrew Haley
On Wed, 9 Oct 2024 09:00:06 GMT, Andrew Haley wrote: >> Coleen Phillimore has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Replace Metaspace::is_compressed_klass_ptr with >> CompressedKlassPointers::is_in_

Re: RFR: 8338526: Don't store abstract and interface Klasses in class metaspace [v6]

2024-10-09 Thread Andrew Haley
yet making Klass pointers a table index. Another chained load in the path of method dispatch, at a time when I'm trying to get rid of chained loads, would be a Bad Thing for me. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 - PR Comment: https://git.openjdk.org/jdk/pull/19157#issuecomment-2401747594

Re: RFR: 8338526: Don't store abstract and interface Klasses in class metaspace [v6]

2024-10-08 Thread Andrew Haley
mmit since the last revision: > > Replace Metaspace::is_compressed_klass_ptr with > CompressedKlassPointers::is_in_encoding_range. On 8 Oct 2024, at 4:07, Andrew Haley wrote: > On 9/10/24 12:42, Coleen Phillimore wrote: >> Thanks for reviewing Ioi and Thomas, and thank you Thomas for the

Re: RFR: 8338526: Don't store abstract and interface Klasses in class metaspace [v6]

2024-10-08 Thread Andrew Haley
s a way to represent interface pointers in a compact way in lookup tables, and to be able to get from compressed class pointers to the address. As long as interface pointers are in a 32-bit range and there's a fast way to get from compressed class to address that's OK. -- Andrew Haley (he/

Re: RFR: 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long) [v3]

2024-09-27 Thread Andrew Haley
On Fri, 27 Sep 2024 14:15:04 GMT, Galder Zamarreño wrote: > The only situation where this PR is a regression compared to current code is > when one side of the branch is always taken. Bear in mind that's quite common. It's not very unusual to clip a range with something equivalent to `x =

Re: RFR: 8338694: x86_64 intrinsic for tanh using libm

2024-08-28 Thread Andrew Haley
On Tue, 27 Aug 2024 22:23:44 GMT, Srinivas Vamsi Parasa wrote: >> I agree, this is all rather obscure. Ideally the same names that are used in >> wherever this comes from. >> >> Where does the algorithm come from? What are its accuracy guarantees? >> >> In addition, given the rarity of hyperb

Re: RFR: 8338694: x86_64 intrinsic for tanh using libm

2024-08-27 Thread Andrew Haley
On 8/27/24 12:13, Jatin Bhateja wrote: Hi @vamsi-parasa , Kindly also add a JMH micro benchmark, I did a first run and see around 4% performance drop with attached micro on Sapphire Rapids. [test.txt](https://github.com/user-attachments/files/16761142/test.txt) If I had to guess, that's beca

Re: RFR: 8338694: x86_64 intrinsic for tanh using libm

2024-08-27 Thread Andrew Haley
On Tue, 27 Aug 2024 05:24:34 GMT, Jatin Bhateja wrote: >> The goal of this PR is to implement an x86_64 intrinsic for >> java.lang.Math.tanh() using libm >> >> Benchmark (ops/ms) | Stock JDK | Tanh intrinsic | Speedup >> -- | -- | -- | -- >> MathBench.tanhDouble | 70900 | 95618 | 1.35x > > src/

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v9]

2024-07-26 Thread Andrew Haley
On Fri, 26 Jul 2024 18:39:27 GMT, Vladimir Ivanov wrote: > Oh, it comes as a surprise to me... I was under impression that the first > thing hand-coded assembly variants do is check for `bitmap != > SECONDARY_SUPERS_BITMAP_FULL`. At least, it was my recollection from working > on [JDK-8180450]

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v8]

2024-07-26 Thread Andrew Haley
On Thu, 25 Jul 2024 23:31:21 GMT, Vladimir Ivanov wrote: > Thanks! The patch looks good, except there was one failure observed during > testing with the latest patch [1]. It does look related to the latest changes > you did in > [54050a5](https://github.com/openjdk/jdk/pull/19989/commits/54050

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v9]

2024-07-26 Thread Andrew Haley
this code is only used for generating stubs, and it seemed to me > ridiculous to have stubs calling other stubs. > > I've followed the guidance from @iwanowww not to obsess too much about > the performance of C1-compiled secondary supers lookups, and to prefer > sim

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v8]

2024-07-25 Thread Andrew Haley
On Thu, 25 Jul 2024 16:05:49 GMT, Andrew Haley wrote: >> This patch expands the use of a hash table for secondary superclasses >> to the interpreter, C1, and runtime. It also adds a C2 implementation >> of hashed lookup in cases where the superclass isn't known at compile

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v8]

2024-07-25 Thread Andrew Haley
this code is only used for generating stubs, and it seemed to me > ridiculous to have stubs calling other stubs. > > I've followed the guidance from @iwanowww not to obsess too much about > the performance of C1-compiled secondary supers lookups, and to prefer > simplicity over abso

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v7]

2024-07-25 Thread Andrew Haley
this code is only used for generating stubs, and it seemed to me > ridiculous to have stubs calling other stubs. > > I've followed the guidance from @iwanowww not to obsess too much about > the performance of C1-compiled secondary supers lookups, and to prefer > simplicity over absol

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v6]

2024-07-25 Thread Andrew Haley
On Wed, 24 Jul 2024 19:09:06 GMT, Vladimir Ivanov wrote: >>> > Also also, Klass::is_subtype_of() is used for C1 runtime. >>> >>> Can you elaborate, please? >> >> Sorry, that was rather vague. In C1-compiled code, the Java method >> `Class::isInstance(Object)` calls `Klass::is_subtype_of()`. >>

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v6]

2024-07-24 Thread Andrew Haley
On Wed, 24 Jul 2024 15:51:26 GMT, Andrew Haley wrote: >>> What happens when users include `klass.hpp`, but not `klass.inline.hpp`? >>> How does it affect generated code? >>> >>> I suspect that `Klass::search_secondary_supers()` won't be inlined in &g

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v6]

2024-07-24 Thread Andrew Haley
On Wed, 24 Jul 2024 14:29:09 GMT, Andrew Haley wrote: > I suspect that Klass::search_secondary_supers() won't be inlined in such > case. That's true, but it's true of every other function in that file. Is it not deliberate? - PR Review Comment: https://git

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v6]

2024-07-24 Thread Andrew Haley
On Tue, 23 Jul 2024 19:00:02 GMT, Vladimir Ivanov wrote: > What happens when users include `klass.hpp`, but not `klass.inline.hpp`? How > does it affect generated code? > > I suspect that `Klass::search_secondary_supers()` won't be inlined in such > case. That is true. I can't tell from thi

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v6]

2024-07-24 Thread Andrew Haley
this code is only used for generating stubs, and it seemed to me > ridiculous to have stubs calling other stubs. > > I've followed the guidance from @iwanowww not to obsess too much about > the performance of C1-compiled secondary supers lookups, and to prefer > si

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v5]

2024-07-24 Thread Andrew Haley
On Tue, 23 Jul 2024 19:14:57 GMT, Vladimir Ivanov wrote: > > Also also, Klass::is_subtype_of() is used for C1 runtime. > > Can you elaborate, please? Sorry, that was rather vague. In C1-compiled code, the Java method `Class::isInstance(Object)` calls `Klass::is_subtype_of()`. In general, I fi

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v5]

2024-07-22 Thread Andrew Haley
this code is only used for generating stubs, and it seemed to me > ridiculous to have stubs calling other stubs. > > I've followed the guidance from @iwanowww not to obsess too much about > the performance of C1-compiled secondary supers lookups, and to prefer > si

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v4]

2024-07-22 Thread Andrew Haley
On Mon, 22 Jul 2024 16:50:47 GMT, Andrew Haley wrote: >> This patch expands the use of a hash table for secondary superclasses >> to the interpreter, C1, and runtime. It also adds a C2 implementation >> of hashed lookup in cases where the superclass isn't known at compile

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v4]

2024-07-22 Thread Andrew Haley
On Thu, 11 Jul 2024 23:57:27 GMT, Vladimir Ivanov wrote: >> Andrew Haley has updated the pull request incrementally with two additional >> commits since the last revision: >> >> - Review comments >> - Review comments > > src/hotspot/cpu/x86/macroAss

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v4]

2024-07-22 Thread Andrew Haley
this code is only used for generating stubs, and it seemed to me > ridiculous to have stubs calling other stubs. > > I've followed the guidance from @iwanowww not to obsess too much about > the performance of C1-compiled secondary supers lookups, and to prefer > simplicity over absolut

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v3]

2024-07-22 Thread Andrew Haley
On Mon, 22 Jul 2024 15:03:12 GMT, Andrew Haley wrote: >> Alternatively, `Klass::is_subtype_of()` can unconditionally perform linear >> search over secondary_supers array. >> >> Even though I very much like to see table lookup written in C++ >> (accompany

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v3]

2024-07-22 Thread Andrew Haley
On Thu, 18 Jul 2024 20:11:03 GMT, Vladimir Ivanov wrote: > Alternatively, `Klass::is_subtype_of()` can unconditionally perform linear > search over secondary_supers array. > > Even though I very much like to see table lookup written in C++ (accompanying > heavily optimized platform-specific Ma

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v3]

2024-07-22 Thread Andrew Haley
On Thu, 11 Jul 2024 23:07:43 GMT, Vladimir Ivanov wrote: >> Andrew Haley has updated the pull request incrementally with four additional >> commits since the last revision: >> >> - Review feedback >> - Review feedback >> - Review feedback >> -

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v3]

2024-07-22 Thread Andrew Haley
On Fri, 5 Jul 2024 22:30:09 GMT, Vladimir Ivanov wrote: >> Andrew Haley has updated the pull request incrementally with four additional >> commits since the last revision: >> >> - Review feedback >> - Review feedback >> - Review feedback >> -

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v3]

2024-07-22 Thread Andrew Haley
On Thu, 11 Jul 2024 22:53:42 GMT, Vladimir Ivanov wrote: >> src/hotspot/share/oops/instanceKlass.cpp line 1410: >> >>> 1408: return nullptr; >>> 1409: } else if (num_extra_slots == 0) { >>> 1410: if (num_extra_slots == 0 && interfaces->length() <= 1) { >> >> Since `secondary_supers` a

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v3]

2024-07-19 Thread Andrew Haley
this code is only used for generating stubs, and it seemed to me > ridiculous to have stubs calling other stubs. > > I've followed the guidance from @iwanowww not to obsess too much about > the performance of C1-compiled secondary supers lookups, and to prefer > simplicity over

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter [v2]

2024-07-18 Thread Andrew Haley
this code is only used for generating stubs, and it seemed to me > ridiculous to have stubs calling other stubs. > > I've followed the guidance from @iwanowww not to obsess too much about > the performance of C1-compiled secondary supers lookups, and to prefer >

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter

2024-07-18 Thread Andrew Haley
On Thu, 11 Jul 2024 23:39:11 GMT, Vladimir Ivanov wrote: >> This patch expands the use of a hash table for secondary superclasses >> to the interpreter, C1, and runtime. It also adds a C2 implementation >> of hashed lookup in cases where the superclass isn't known at compile >> time. >> >> HotSp

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter

2024-07-18 Thread Andrew Haley
On Thu, 11 Jul 2024 23:22:19 GMT, Vladimir Ivanov wrote: >> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1433: >> >>> 1431: >>> 1432: // Don't check secondary_super_cache >>> 1433: if (super_check_offset.is_register() >> >> Do you see any effects from this particular change? >>

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter

2024-07-18 Thread Andrew Haley
On Wed, 17 Jul 2024 18:54:32 GMT, Vladimir Ivanov wrote: >> Now it starts to sound concerning... `Klass::set_secondary_supers()` >> initializes both `_secondary_supers` and `_bitmap` which implies that >> `Klass::is_subtype_of()` may be called on not yet initialized Klass. It >> that's the cas

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter

2024-07-17 Thread Andrew Haley
On Fri, 5 Jul 2024 22:37:34 GMT, Vladimir Ivanov wrote: >> This patch expands the use of a hash table for secondary superclasses >> to the interpreter, C1, and runtime. It also adds a C2 implementation >> of hashed lookup in cases where the superclass isn't known at compile >> time. >> >> HotSpo

Re: RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter

2024-07-17 Thread Andrew Haley
On Thu, 11 Jul 2024 23:47:51 GMT, Vladimir Ivanov wrote: >> src/hotspot/share/oops/klass.cpp line 284: >> >>> 282: // which doesn't zero out the memory before calling the constructor. >>> 283: Klass::Klass(KlassKind kind) : _kind(kind), >>> 284:_bitmap(SECONDARY_S

Re: RFR: 8315884: New Object to ObjectMonitor mapping [v6]

2024-07-12 Thread Andrew Haley
On Fri, 12 Jul 2024 09:40:45 GMT, Roman Kennke wrote: >> Axel Boldt-Christmas has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Update arguments.cpp > > src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 343: > >> 341: const R

RFR: 8331341: secondary_super_cache does not scale well: C1 and interpreter

2024-07-05 Thread Andrew Haley
This patch expands the use of a hash table for secondary superclasses to the interpreter, C1, and runtime. It also adds a C2 implementation of hashed lookup in cases where the superclass isn't known at compile time. HotSpot shared runtime -- Building hashed secondary tables is

Re: RFR: 8331189: Implementation of Scoped Values (Third Preview)

2024-05-21 Thread Andrew Haley
On Wed, 8 May 2024 09:40:38 GMT, Alan Bateman wrote: > JEP 481 proposes Scoped Values to continue to preview in JDK 23 with one > change. The type of the operation parameter of the callWhere method is > changed to a new functional interface to avoid having the API throw > Exception. With that

Re: RFR: 8325438: Add exhaustive tests for Math.round intrinsics [v12]

2024-04-28 Thread Andrew Haley
On Tue, 23 Apr 2024 09:46:08 GMT, Hamlin Li wrote: >> Hi, >> Can you have a look at this patch adding some tests for Math.round >> intrinsics? >> Thanks! >> >> ### FYI: >> During the development of RoundVF/RoundF, we faced the issues which were >> only spotted by running test exhaustively aga

Re: RFR: 8328066: WhiteBoxResizeTest failure on linux-x86: Could not reserve enough space for 2097152KB object heap [v3]

2024-03-15 Thread Andrew Haley
On Thu, 14 Mar 2024 14:04:51 GMT, Jaikiran Pai wrote: >> Can I please get a review of this test-only change which proposes to address >> https://bugs.openjdk.org/browse/JDK-8328066? >> >> The test launches a JVM with 2G heap (`-Xmx2G`) and as noted in that issue, >> the failure was observed on

RFR: JDK-8180450: secondary_super_cache does not scale well

2024-03-14 Thread Andrew Haley
This PR is a redesign of subtype checking. The implementation of subtype checking in the HotSpot JVM is now twenty years old. There have been some performance-related bugs reported, and the only way to fix them is a redesign of the way it works. So what's changed, so that the old design should

Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64

2023-12-06 Thread Andrew Haley
On Tue, 5 Dec 2023 09:08:59 GMT, Andrew Haley wrote: >> We've seen some rare failures of the CLQ Whitebox test on "less-strong" >> architectures, and the only thing which -- given my research -- could be the >> culprit is spuriously failing weakCAS

Re: RFR: 8318809: java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java shows intermittent failures on linux ppc64le and aarch64

2023-12-05 Thread Andrew Haley
On Wed, 22 Nov 2023 20:48:05 GMT, Viktor Klang wrote: > We've seen some rare failures of the CLQ Whitebox test on "less-strong" > architectures, and the only thing which -- given my research -- could be the > culprit is spuriously failing weakCAS (which is correct in terms of the > implementat
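
As background to the "spuriously failing weakCAS" mentioned above: the weak compare-and-set operations in `java.util.concurrent.atomic` are allowed to fail even when the current value matches the expected one, so callers normally wrap them in a retry loop. The sketch below shows that general pattern only; it is not the fix proposed in this PR, and the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class WeakCasSketch {
    // Increments the counter with a weak CAS, retrying on any failure,
    // whether caused by real contention or by a spurious failure.
    static int incrementWithWeakCas(AtomicInteger counter) {
        while (true) {
            int current = counter.get();
            if (counter.weakCompareAndSetVolatile(current, current + 1)) {
                return current + 1;
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(41);
        System.out.println(incrementWithWeakCas(counter));
    }
}
```

Code that calls a weak CAS once and treats a `false` result as proof that another thread changed the value can misbehave on architectures where spurious failures actually occur, which is consistent with failures appearing only on some platforms.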

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-01 Thread Andrew Haley
On Fri, 1 Dec 2023 10:19:01 GMT, Andrew Haley wrote: >> Xiaohong Gong has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains ten addi

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-01 Thread Andrew Haley
On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> optimized by C2 compiler on X8

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-01 Thread Andrew Haley
On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> optimized by C2 compiler on X8

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-12-01 Thread Andrew Haley
On Fri, 1 Dec 2023 01:13:37 GMT, Xiaohong Gong wrote: >> make/autoconf/lib-sleef.m4 line 56: >> >>> 54: AC_MSG_CHECKING([for the specified LIBSLEEF]) >>> 55: if test -e ${with_libsleef}/lib/libsleef.so && >>> 56:test -e ${with_libsleef}/include/sleef.h; then >> >> Th

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-12-01 Thread Andrew Haley
On Fri, 1 Dec 2023 01:19:12 GMT, Xiaohong Gong wrote: > > Not having a build time dependency on libsleef means you cannot really > > verify that the functions you want to call are correct, but maybe you feel > > secure that they will never change? > > I'm not sure. The main reason that we add

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-12-01 Thread Andrew Haley
On Fri, 1 Dec 2023 09:59:58 GMT, Andrew Haley wrote: > Not having a build time dependency on libsleef means you cannot really verify > that the functions you want to call are correct, but maybe you feel secure > that they will never change? We can still have SLEEF tests, but they wil

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-12-01 Thread Andrew Haley
On Thu, 30 Nov 2023 14:50:24 GMT, Andrew Haley wrote: >> Do this, but with the name vect_math.S. Don't use SLEEF headers in the >> build. I think you can do this with no build-time dependency on SLEEF at all >> if you load the library lazily at runtime. >>

Re: RFR: 8319872: AArch64: [vectorapi] Implementation of unsigned (zero extended) casts [v4]

2023-11-30 Thread Andrew Haley
On Wed, 22 Nov 2023 07:05:21 GMT, Eric Liu wrote: >> Vector API defines zero-extend operations [1], which are going to be >> intrinsified and generated to `VectorUCastNode` by C2. This patch adds >> backend implementation for `VectorUCastNode` on AArch64. >> >> The micro benchmark shows signif

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-11-30 Thread Andrew Haley
On Thu, 30 Nov 2023 11:46:58 GMT, Andrew Haley wrote: > [vect_math.S.txt](https://github.com/openjdk/jdk/files/13512306/vect_math.S.txt) I guess this will live only in os_linux and os_bsd because the Windows compiler won't like it AFAIK. - PR Comment: https://git.openjdk

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-11-30 Thread Andrew Haley
On Thu, 30 Nov 2023 06:39:43 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> optimized by C2 compiler on X

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-11-30 Thread Andrew Haley
On Thu, 30 Nov 2023 06:39:43 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> optimized by C2 compiler on X

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-11-30 Thread Andrew Haley
On Thu, 30 Nov 2023 09:35:04 GMT, Magnus Ihse Bursie wrote: > This version looks much better, thank you! I guess cflags/SVE_CFLAGS is an > okay-ish solution. > > I'm still not 100% happy though, but it might be due to my limited > understanding. Let me write down a few numbered statements and t

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]

2023-11-28 Thread Andrew Haley
On Tue, 28 Nov 2023 01:37:01 GMT, Xiaohong Gong wrote: >>> In fact, I am not even sure why it seems to the PR author to be a good idea >>> to let the default be dependent on the build machine at all. My personal >>> opinion is that it would be better to select either "SVE enabled" or "SVE >>>

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]

2023-11-27 Thread Andrew Haley
On Mon, 27 Nov 2023 15:22:32 GMT, Magnus Ihse Bursie wrote: > In fact, I am not even sure why it seems to the PR author to be a good idea > to let the default be dependent on the build machine at all. My personal > opinion is that it would be better to select either "SVE enabled" or "SVE > dis

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]

2023-11-27 Thread Andrew Haley
On Mon, 27 Nov 2023 14:59:23 GMT, Magnus Ihse Bursie wrote: > You still need to separate out the SVE detection from the libsleef code, and > provide a way to enable/disable it from the configure command line. Why? I don't think this should be a build-time option at all, because it puts the peo

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]

2023-11-27 Thread Andrew Haley
On Mon, 27 Nov 2023 01:06:21 GMT, Xiaohong Gong wrote: >> make/autoconf/lib-vmath.m4 line 94: >> >>> 92: # Check the ARM SVE feature >>> 93: SVE_CFLAGS="-march=armv8-a+sve" >>> 94: >> >> What's this about? We're building a standard binary, and all SVE detection >> is to be

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]

2023-11-23 Thread Andrew Haley
On Thu, 23 Nov 2023 08:57:23 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> optimized by C2 compiler on X

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]

2023-11-23 Thread Andrew Haley
On Thu, 23 Nov 2023 08:57:23 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> optimized by C2 compiler on X

Re: RFR: 8319872: AArch64: [vectorapi] Implementation of unsigned (zero extended) casts [v3]

2023-11-22 Thread Andrew Haley
On Wed, 22 Nov 2023 02:18:32 GMT, Eric Liu wrote: >> src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 1412: >> >>> 1410: _sve_xunpk(is_unsigned, /* is_high */ false, dst, S, dst); >>> 1411: _sve_xunpk(is_unsigned, /* is_high */ false, dst, D, dst); >>> 1412: break; >>

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-22 Thread Andrew Haley
On Wed, 22 Nov 2023 01:52:51 GMT, Xiaohong Gong wrote: > > Have you considered the possibility of copying the sleef source to the > > OpenJDK repository and thereby it becomes part of the build process? I > > don't know how straightforward that is technically and IANAL but I think > > it's wor

Re: RFR: 8319872: AArch64: [vectorapi] Implementation of unsigned (zero extended) casts [v3]

2023-11-21 Thread Andrew Haley
On Tue, 21 Nov 2023 13:24:34 GMT, Eric Liu wrote: >> Vector API defines zero-extend operations [1], which are going to be >> intrinsified and generated to `VectorUCastNode` by C2. This patch adds >> backend implementation for `VectorUCastNode` on AArch64. >> >> The micro benchmark shows signif

Re: RFR: 8319872: AArch64: [vectorapi] Implementation of unsigned (zero extended) casts [v3]

2023-11-21 Thread Andrew Haley
On Tue, 21 Nov 2023 13:24:34 GMT, Eric Liu wrote: >> Vector API defines zero-extend operations [1], which are going to be >> intrinsified and generated to `VectorUCastNode` by C2. This patch adds >> backend implementation for `VectorUCastNode` on AArch64. >> >> The micro benchmark shows signif

Re: RFR: 8310159: Bulk copy with Unsafe::arrayCopy is slower compared to memcpy

2023-11-16 Thread Andrew Haley
On Thu, 16 Nov 2023 05:38:30 GMT, Jatin Bhateja wrote: >> The results a concurrent reader sees could be different if the copy is using >> nt writes, but if the read of the destination is not synced with the copy >> operation, I think the reader would not see consistent state in either case. >>

Re: RFR: 8319872: AArch64: [vectorapi] Implementation of unsigned (zero extended) casts [v2]

2023-11-16 Thread Andrew Haley
On Thu, 16 Nov 2023 08:44:26 GMT, Eric Liu wrote: >> Vector API defines zero-extend operations [1], which are going to be >> intrinsified and generated to `VectorUCastNode` by C2. This patch adds >> backend implementation for `VectorUCastNode` on AArch64. >> >> The micro benchmark shows signif

Re: RFR: 8319872: AArch64: [vectorapi] Implementation of unsigned (zero extended) casts

2023-11-15 Thread Andrew Haley
On Wed, 15 Nov 2023 07:48:28 GMT, Eric Liu wrote: > Vector API defines zero-extend operations [1], which are going to be > intrinsified and generated to `VectorUCastNode` by C2. This patch adds > backend implementation for `VectorUCastNode` on AArch64. > > The micro benchmark shows significant

Integrated: JDK-8319120: Unbound ScopedValue.get() throws the wrong exception

2023-10-31 Thread Andrew Haley
On Mon, 30 Oct 2023 15:57:23 GMT, Andrew Haley wrote: > The bug here is a thinko in `ScopedValue.scopedValueBindings()`. > > If the JVM runs out of resources, we throw a `VirtualMachineError`. Running > out of resources can happen at almost any time, and can happen while >
