Re: RFR: 8350811: [JMH] test foreign.StrLenTest failed with StringIndexOutOfBoundsException for size=451 [v2]

2025-03-05 Thread Volodymyr Paprotski
On Thu, 6 Mar 2025 00:39:59 GMT, Vladimir Ivanov wrote: >> test/micro/org/openjdk/bench/java/lang/foreign/StrLenTest.java line 149: >> >>> 147: while (lorem.length() < size) { >>> 148: lorem += lorem; >>> 149: } >> >> This is matter of taste, but I would prefer Strin
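
The quoted loop doubles `lorem` until it is at least `size` characters long. A minimal sketch of one alternative follows, assuming the goal is a string of exactly the requested length. The reviewer's actual suggestion is cut off in this archive, so the class and method names here (`LoremPadding`, `padToSize`) are hypothetical and not taken from the PR.

```java
public class LoremPadding {
    // Hypothetical helper (not from the PR): repeat `seed` until the buffer
    // holds at least `size` characters, then trim it to exactly `size`.
    static String padToSize(String seed, int size) {
        if (seed.isEmpty()) {
            throw new IllegalArgumentException("seed must not be empty");
        }
        if (size < 0) {
            throw new IllegalArgumentException("size must not be negative");
        }
        StringBuilder sb = new StringBuilder(size);
        while (sb.length() < size) {
            sb.append(seed);
        }
        sb.setLength(size); // drop any overshoot from the last append
        return sb.toString();
    }

    public static void main(String[] args) {
        String data = padToSize("Lorem ipsum dolor sit amet, ", 451);
        System.out.println(data.length());
    }
}
```

Appending to a `StringBuilder` avoids allocating a new `String` on every iteration, and trimming at the end gives exactly `size` characters rather than the next doubling.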

Re: RFR: 8350811: [JMH] test foreign.StrLenTest failed with StringIndexOutOfBoundsException for size=451 [v2]

2025-03-05 Thread Volodymyr Paprotski
On Tue, 4 Mar 2025 19:37:32 GMT, Vladimir Ivanov wrote: >> test setup was updated to generate data of requested size. > > Vladimir Ivanov has updated the pull request incrementally with one > additional commit since the last revision: > > JDK-8350811 [JMH] test foreign.StrLenTest failed with

Re: RFR: 8350811: [JMH] test foreign.StrLenTest failed with StringIndexOutOfBoundsException for size=451 [v2]

2025-03-05 Thread Volodymyr Paprotski
On Tue, 4 Mar 2025 19:37:32 GMT, Vladimir Ivanov wrote: >> test setup was updated to generate data of requested size. > > Vladimir Ivanov has updated the pull request incrementally with one > additional commit since the last revision: > > JDK-8350811 [JMH] test foreign.StrLenTest failed with

Integrated: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni

2025-02-12 Thread Volodymyr Paprotski
On Tue, 10 Dec 2024 23:45:37 GMT, Volodymyr Paprotski wrote: > (Also see `8319429: Resetting MXCSR flags degrades ecore`) > > This PR fixes two issues: > - the original issue is a crash caused by `__ warn` corrupting the stack on > Windows only > - This issue also uncovere

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v7]

2025-02-12 Thread Volodymyr Paprotski
On Wed, 12 Feb 2025 15:47:04 GMT, Volodymyr Paprotski wrote: >> (Also see `8319429: Resetting MXCSR flags degrades ecore`) >> >> This PR fixes two issues: >> - the original issue is a crash caused by `__ warn` corrupting the stack on >> Windows only >

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v7]

2025-02-12 Thread Volodymyr Paprotski
On Wed, 12 Feb 2025 16:05:03 GMT, Vladimir Kozlov wrote: > I submitted our internal testing. Please wait results. Thanks! Deleted the integrate command - PR Comment: https://git.openjdk.org/jdk/pull/22673#issuecomment-2654227936

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v6]

2025-02-12 Thread Volodymyr Paprotski
On Tue, 11 Feb 2025 21:47:31 GMT, Volodymyr Paprotski wrote: >> (Also see `8319429: Resetting MXCSR flags degrades ecore`) >> >> This PR fixes two issues: >> - the original issue is a crash caused by `__ warn` corrupting the stack on >> Windows only >

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v6]

2025-02-12 Thread Volodymyr Paprotski
On Wed, 12 Feb 2025 15:38:34 GMT, Julian Waters wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one >> additional commit since the last revision: >> >> comments from Sandhya > > src/hotspot/os/windows/os_windows.cpp line 2757: >

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v7]

2025-02-12 Thread Volodymyr Paprotski
_bytes` to bump the stack pointer. > > --- > > I also kept the fix to `verify_mxcsr` since without it, `-Xcheck:jni` is > practically unusable when `-XX:+EnableX86ECoreOpts` are set (65k+ lines of > warnings) Volodymyr Paprotski has updated the pull request incrementally with one additi

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v6]

2025-02-11 Thread Volodymyr Paprotski
_bytes` to bump the stack pointer. > > --- > > I also kept the fix to `verify_mxcsr` since without it, `-Xcheck:jni` is > practically unusable when `-XX:+EnableX86ECoreOpts` are set (65k+ lines of > warnings) Volodymyr Paprotski has updated the pull request incrementally with one ad

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v5]

2025-02-03 Thread Volodymyr Paprotski
_bytes` to bump the stack pointer. > > --- > > I also kept the fix to `verify_mxcsr` since without it, `-Xcheck:jni` is > practically unusable when `-XX:+EnableX86ECoreOpts` are set (65k+ lines of > warnings) Volodymyr Paprotski has updated the pull request incrementally with one a

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v4]

2025-02-03 Thread Volodymyr Paprotski
CSR changed by native JNI code, use > -XX:+RestoreMXCSROnJNICall > > > **This in fact happens on both Windows _AND_ Linux.** However, _only_ on > Windows there is a crash. This PR fixes the crash but I have not been able to > track down the source of the crash (i.e. crash in the

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v3]

2025-01-23 Thread Volodymyr Paprotski
CSR changed by native JNI code, use > -XX:+RestoreMXCSROnJNICall > > > **This in fact happens on both Windows _AND_ Linux.** However, _only_ on > Windows there is a crash. This PR fixes the crash but I have not been able to > track down the source of the crash (i.e. crash in the

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v3]

2025-01-23 Thread Volodymyr Paprotski
On Thu, 23 Jan 2025 18:23:01 GMT, Volodymyr Paprotski wrote: >> (Also see `8319429: Resetting MXCSR flags degrades ecore`) >> >> For performance, signaling flags (bottom 6 bits) are set by default in >> MXCSR. This PR fixes the Xcheck:jni comparison that is pro
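
The bottom 6 bits of MXCSR mentioned above are the exception status flags. As a rough illustration only, one way a comparison could ignore those bits is to mask them out before comparing. This is a sketch in Java, not the HotSpot stub code, and the names `MxcsrCompare`, `STATUS_FLAGS_MASK` and `sameControlBits` are hypothetical.

```java
public class MxcsrCompare {
    // Bits 0..5 of MXCSR are the exception status flags (the "bottom 6 bits").
    static final int STATUS_FLAGS_MASK = 0x3F;

    // Hypothetical helper: true when two MXCSR values agree on everything
    // except the status flags.
    static boolean sameControlBits(int a, int b) {
        return (a & ~STATUS_FLAGS_MASK) == (b & ~STATUS_FLAGS_MASK);
    }

    public static void main(String[] args) {
        // 0x1F80 and 0x1FBF differ only in the status flags.
        System.out.println(sameControlBits(0x1F80, 0x1FBF)); // prints "true"
        // 0x9F80 additionally sets bit 15, which is a control bit.
        System.out.println(sameControlBits(0x1F80, 0x9F80)); // prints "false"
    }
}
```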

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v2]

2025-01-23 Thread Volodymyr Paprotski
007ff8` _seems_ like a valid mxcsr value, only way it should crash is if > top 2 bytes weren't zeroes, which they are. Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision: cleanup - Changes: - all: http

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni

2025-01-08 Thread Volodymyr Paprotski
On Tue, 10 Dec 2024 23:45:37 GMT, Volodymyr Paprotski wrote: > @TobiHartmann There are still some unanswered questions I have, but > committing this since we need to work around vacation schedules. > > **This in fact happens on both Windows _AND_ Linux.** However, _only_ on >

RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni

2024-12-10 Thread Volodymyr Paprotski
@TobiHartmann There are still some unanswered questions I have, but committing this since we need to work around vacation schedules. **This in fact happens on both Windows _AND_ Linux.** However, _only_ on Windows there is a crash. This fix fixes the crash but I don't understand entirely why t

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v48]

2024-05-30 Thread Volodymyr Paprotski
On Thu, 30 May 2024 13:56:30 GMT, Emanuel Peter wrote: >> Control question: Are we confident with this potentially going into JDK 23 >> or should we rather postpone to JDK 24? The fork is next week. > >> Control question: Are we confident with this potentially going into JDK 23 >> or should we

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v43]

2024-05-28 Thread Volodymyr Paprotski
On Tue, 28 May 2024 17:36:03 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 488: >> >>> 486: __ cmpq(r11, nMinusK); >>> 487: __ ja_b(L_return); >>> 488: __ movq(rax, r11); >> >> At places where we know that return value in r11 is correct, we don't

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v19]

2024-05-24 Thread Volodymyr Paprotski
On Wed, 22 May 2024 14:50:40 GMT, Scott Gibbons wrote: >> test/jdk/java/lang/StringBuffer/IndexOf.java line 284: >> >>> 282: >>> 283: // Note: it is possible although highly improbable that failCount >>> will >>> 284: // be > 0 even if everthing is working ok >> >> This sounds like either

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v37]

2024-05-24 Thread Volodymyr Paprotski
On Fri, 24 May 2024 15:32:26 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v19]

2024-05-24 Thread Volodymyr Paprotski
On Fri, 17 May 2024 23:59:05 GMT, Scott Gibbons wrote: >> test/jdk/java/lang/StringBuffer/IndexOf.java line 40: >> >>> 38: private static boolean failure = false; >>> 39: public static void main(String[] args) throws Exception { >>> 40: String testName = "IndexOf"; >> >> indentation > >

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v19]

2024-05-24 Thread Volodymyr Paprotski
On Wed, 22 May 2024 14:41:36 GMT, Scott Gibbons wrote: >> test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 132: >> >>> 130: @Benchmark >>> 131: public int searchHugeLargeSubstring() { >>> 132: return dataStringHuge.indexOf("B".repeat(30) + "X" + >>> "A".repeat(30), 7

Integrated: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic

2024-05-22 Thread Volodymyr Paprotski
On Tue, 2 Apr 2024 15:42:05 GMT, Volodymyr Paprotski wrote: > Performance. Before: > > Benchmark(algorithm) (dataSize) (keyLength) > (provider) Mode Cnt ScoreError Units > SignatureBench.ECDSA.signSHA256withECDSA10

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v12]

2024-05-22 Thread Volodymyr Paprotski
On Tue, 21 May 2024 17:41:46 GMT, Volodymyr Paprotski wrote: >> Performance. Before: >> >> Benchmark(algorithm) (dataSize) (keyLength) >> (provider) Mode Cnt ScoreError Units >> SignatureBench.ECDSA.signSHA256with

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v11]

2024-05-21 Thread Volodymyr Paprotski
On Tue, 21 May 2024 07:21:14 GMT, Tobias Hartmann wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one >> additional commit since the last revision: >> >> shenandoah verifier > > I'm getting some conflicts when trying to apply

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v12]

2024-05-21 Thread Volodymyr Paprotski
ntBench.EC.generateSecret ECDH 256 > EC thrpt3 1346.523 ± 28.722 ops/s > Benchmark (isMontBench) Mode Cnt Score > Error Units > PolynomialP256Bench.benchMultiply true thrpt3 1919.574 ± >

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v11]

2024-05-17 Thread Volodymyr Paprotski
On Fri, 17 May 2024 21:16:47 GMT, Volodymyr Paprotski wrote: >> Performance. Before: >> >> Benchmark(algorithm) (dataSize) (keyLength) >> (provider) Mode Cnt ScoreError Units >> SignatureBench.ECDSA.signSHA256with

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v11]

2024-05-17 Thread Volodymyr Paprotski
ntBench.EC.generateSecret ECDH 256 > EC thrpt3 1346.523 ± 28.722 ops/s > Benchmark (isMontBench) Mode Cnt Score > Error Units > PolynomialP256Bench.benchMultiply true thrpt3 1919.574

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v9]

2024-05-17 Thread Volodymyr Paprotski
On Thu, 16 May 2024 23:21:36 GMT, Sandhya Viswanathan wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one >> additional commit since the last revision: >> >> whitespace > > src/hotspot/cpu/x86/stubGenerator_x86_64_pol

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v10]

2024-05-17 Thread Volodymyr Paprotski
ntBench.EC.generateSecret ECDH 256 > EC thrpt3 1346.523 ± 28.722 ops/s > Benchmark (isMontBench) Mode Cnt Score > Error Units > PolynomialP256Bench.benchMultiply true thrpt3 1919.574

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v19]

2024-05-15 Thread Volodymyr Paprotski
On Wed, 15 May 2024 19:21:37 GMT, Volodymyr Paprotski wrote: >> Scott Gibbons has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Rearrange; add lambdas for clarity > > test/jdk/java/lang/StringBuffer/IndexOf.

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v19]

2024-05-15 Thread Volodymyr Paprotski
On Sat, 4 May 2024 19:35:21 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v19]

2024-05-13 Thread Volodymyr Paprotski
On Sat, 4 May 2024 19:35:21 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v9]

2024-05-09 Thread Volodymyr Paprotski
ntBench.EC.generateSecret ECDH 256 > EC thrpt3 1346.523 ± 28.722 ops/s > Benchmark (isMontBench) Mode Cnt Score > Error Units > PolynomialP256Bench.benchMultiply true thrpt3 1919.574

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v7]

2024-05-09 Thread Volodymyr Paprotski
On Thu, 9 May 2024 23:36:03 GMT, Anthony Scarpino wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one >> additional commit since the last revision: >> >> whitespace > > src/java.base/share/classes/sun/security/ec/ECOpera

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v8]

2024-05-09 Thread Volodymyr Paprotski
ntBench.EC.generateSecret ECDH 256 > EC thrpt3 1346.523 ± 28.722 ops/s > Benchmark (isMontBench) Mode Cnt Score > Error Units > PolynomialP256Bench.benchMultiply true thrpt3 1919.574

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v7]

2024-05-09 Thread Volodymyr Paprotski
ntBench.EC.generateSecret ECDH 256 > EC thrpt3 1346.523 ± 28.722 ops/s > Benchmark (isMontBench) Mode Cnt Score > Error Units > PolynomialP256Bench.benchMultiply true thrpt3 1919.574

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v6]

2024-05-06 Thread Volodymyr Paprotski
ntBench.EC.generateSecret ECDH 256 > EC thrpt3 1346.523 ± 28.722 ops/s > Benchmark (isMontBench) Mode Cnt Score > Error Units > PolynomialP256Bench.benchMultiply true thrpt3 1919.574

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v5]

2024-04-25 Thread Volodymyr Paprotski
ntBench.EC.generateSecret ECDH 256 > EC thrpt3 1346.523 ± 28.722 ops/s > Benchmark (isMontBench) Mode Cnt Score > Error Units > PolynomialP256Bench.benchMultiply true thrpt3 1919.574

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]

2024-04-24 Thread Volodymyr Paprotski
On Tue, 9 Apr 2024 02:01:36 GMT, Anthony Scarpino wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one >> additional commit since the last revision: >> >> remove use of jdk.crypto.ec > > src/java.base/share/classes/sun/security

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v3]

2024-04-24 Thread Volodymyr Paprotski
On Tue, 23 Apr 2024 19:55:57 GMT, Anthony Scarpino wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Comments from Jatin and Tony > > src/java.base/share/classes/sun/security

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]

2024-04-24 Thread Volodymyr Paprotski
On Tue, 16 Apr 2024 02:26:57 GMT, Jatin Bhateja wrote: >> Per above, this is a switch statement (`UNLIKELY`) fallback. I can still add >> alignment and loop rotation, but being a fallback figured it's more important >> to keep it small and readable... > > It's all part of intrinsic, no harm in polis

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v4]

2024-04-24 Thread Volodymyr Paprotski
ntBench.EC.generateSecret ECDH 256 > EC thrpt3 1346.523 ± 28.722 ops/s > Benchmark (isMontBench) Mode Cnt Score > Error Units > PolynomialP256Bench.benchMultiply true thrpt3 1919.574

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]

2024-04-15 Thread Volodymyr Paprotski
On Fri, 5 Apr 2024 07:19:28 GMT, Jatin Bhateja wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one >> additional commit since the last revision: >> >> remove use of jdk.crypto.ec > > src/hotspot/cpu/x86/stubGenerator_x86_64_p

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]

2024-04-15 Thread Volodymyr Paprotski
On Wed, 10 Apr 2024 23:56:52 GMT, Volodymyr Paprotski wrote: > Few early comments. > > Please update the copyright year of all the modified files. > > You can even consider splitting this into two patches, Java side changes in > one and x86 optimized intrinsic in ne

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]

2024-04-15 Thread Volodymyr Paprotski
On Thu, 11 Apr 2024 17:15:21 GMT, Anthony Scarpino wrote: >>> In `ECOperations.java`, if I understand this correctly, it is to replace >>> the existing `PointMultiplier` with Montgomery-based `PointMultiplier`. But >>> when I look at the code, I see both are still options. If I read this >>> cor

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v3]

2024-04-15 Thread Volodymyr Paprotski
ntBench.EC.generateSecret ECDH 256 > EC thrpt3 1346.523 ± 28.722 ops/s > Benchmark (isMontBench) Mode Cnt Score > Error Units > PolynomialP256Bench.benchMultiply true thrpt3 1919.574

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]

2024-04-10 Thread Volodymyr Paprotski
On Fri, 5 Apr 2024 09:17:18 GMT, Jatin Bhateja wrote: > Few early comments. > > Please update the copyright year of all the modified files. > > You can even consider splitting this into two patches, Java side changes in > one and x86 optimized intrinsic in next one. Thanks Jatin, will fix! -

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]

2024-04-10 Thread Volodymyr Paprotski
On Wed, 10 Apr 2024 17:18:55 GMT, Anthony Scarpino wrote: > In `ECOperations.java`, if I understand this correctly, it is to replace the > existing `PointMultiplier` with Montgomery-based `PointMultiplier`. But when I > look at the code, I see both are still options. If I read this correctly, it
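
For readers unfamiliar with the Montgomery form the thread refers to, here is a minimal textbook sketch of Montgomery multiplication using `BigInteger`. It is not the JDK implementation or the x86_64 intrinsic, and it makes no attempt at constant-time behaviour. The class `MontgomerySketch` and its method names are hypothetical.

```java
import java.math.BigInteger;

public class MontgomerySketch {
    final BigInteger n;      // odd modulus
    final int k;             // R = 2^k, with R > n
    final BigInteger rMask;  // R - 1, used for "mod R"
    final BigInteger nPrime; // -n^{-1} mod R

    MontgomerySketch(BigInteger n) {
        if (n.compareTo(BigInteger.ONE) <= 0 || !n.testBit(0)) {
            throw new IllegalArgumentException("modulus must be odd and > 1");
        }
        this.n = n;
        this.k = n.bitLength();
        BigInteger r = BigInteger.ONE.shiftLeft(k);
        this.rMask = r.subtract(BigInteger.ONE);
        this.nPrime = n.modInverse(r).negate().mod(r);
    }

    // Montgomery reduction: returns t * R^{-1} mod n for 0 <= t < R * n.
    BigInteger reduce(BigInteger t) {
        BigInteger m = t.and(rMask).multiply(nPrime).and(rMask);
        BigInteger u = t.add(m.multiply(n)).shiftRight(k);
        return u.compareTo(n) >= 0 ? u.subtract(n) : u;
    }

    // Convert a (0 <= a < n) into Montgomery form a * R mod n.
    BigInteger toMont(BigInteger a) { return a.shiftLeft(k).mod(n); }

    // Convert back from Montgomery form.
    BigInteger fromMont(BigInteger a) { return reduce(a); }

    // Multiply two values that are already in Montgomery form.
    BigInteger mul(BigInteger a, BigInteger b) { return reduce(a.multiply(b)); }

    public static void main(String[] args) {
        // P-256 field prime: 2^256 - 2^224 + 2^192 + 2^96 - 1
        BigInteger one = BigInteger.ONE;
        BigInteger p = one.shiftLeft(256).subtract(one.shiftLeft(224))
                .add(one.shiftLeft(192)).add(one.shiftLeft(96)).subtract(one);
        MontgomerySketch mont = new MontgomerySketch(p);
        BigInteger a = BigInteger.valueOf(123456789L);
        BigInteger b = BigInteger.valueOf(987654321L);
        BigInteger product = mont.fromMont(mont.mul(mont.toMont(a), mont.toMont(b)));
        System.out.println(product.equals(a.multiply(b).mod(p)));
    }
}
```

The point of the representation is that `reduce` needs only multiplications, masks and shifts, with no division by the modulus, which is what makes it attractive for a hand-written intrinsic.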

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]

2024-04-05 Thread Volodymyr Paprotski
On Tue, 2 Apr 2024 19:19:59 GMT, Volodymyr Paprotski wrote: >> Performance. Before: >> >> Benchmark(algorithm) (dataSize) (keyLength) >> (provider) Mode Cnt ScoreError Units >> SignatureBench.ECDSA.signSHA256with

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]

2024-04-02 Thread Volodymyr Paprotski
On Tue, 2 Apr 2024 16:29:07 GMT, Alan Bateman wrote: >> Volodymyr Paprotski has updated the pull request incrementally with one >> additional commit since the last revision: >> >> remove use of jdk.crypto.ec > > src/java.base/share/classes/module

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]

2024-04-02 Thread Volodymyr Paprotski
ntBench.EC.generateSecret ECDH 256 > EC thrpt3 1346.523 ± 28.722 ops/s > Benchmark (isMontBench) Mode Cnt Score > Error Units > PolynomialP256Bench.benchMultiply true thrpt3 1919.574

RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic

2024-04-02 Thread Volodymyr Paprotski
Performance. Before: Benchmark(algorithm) (dataSize) (keyLength) (provider) Mode Cnt ScoreError Units SignatureBench.ECDSA.signSHA256withECDSA1024 256 thrpt3 6443.934 ± 6.491 ops/s SignatureBench.ECDSA.signSHA256