On Fri, 25 Jul 2025 20:09:40 GMT, Jatin Bhateja wrote:
>> Patch optimizes Vector.slice operation with constant index using x86 ALIGNR
>> instruction.
>> It also adds a new hybrid call generator to facilitate lazy intrinsification
>> or else perform procedural inlining
> 1642.784 ops/ms
> VectorSliceBenchmark.longVectorSliceWithVariableIndex 1024 thrpt 2
> 1474.808 ops/ms
> VectorSliceBenchmark.shortVectorSliceWithConstantIndex1 1024 thrpt 2
> 10399.394 ops/ms
> VectorSliceBenchmark.shortVectorSliceWithConstantIndex2 1024 thrpt 2
> 10502.8
Patch optimizes Vector.slice operation with constant index using x86 ALIGNR
instruction.
It also adds a new hybrid call generator to facilitate lazy intrinsification or
else perform procedural inlining to prevent call overhead and boxing penalties
in case the fallback implementation expects to
On Tue, 18 Mar 2025 20:51:46 GMT, Jatin Bhateja wrote:
> Patch optimizes Vector.slice operation with constant index using x86 ALIGNR
> instruction.
> It also adds a new hybrid call generator to facilitate lazy intrinsification
> or else perform procedural inlining to prevent call
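A scalar sketch of what a constant-index slice computes may help here. It assumes the lane semantics of `Vector.slice(origin, other)`: lanes from `origin` onward in the first vector, followed by the leading lanes of the second. The class and method names are hypothetical, and this models the result only, not the ALIGNR-based code the patch emits.

```java
public class SliceSketch {
    // Scalar model (an assumption, not the patch's code): treat a and b as one
    // concatenated sequence and take a.length lanes starting at 'origin'.
    static int[] slice(int[] a, int[] b, int origin) {
        int n = a.length;
        if (b.length != n || origin < 0 || origin > n) {
            throw new IllegalArgumentException("bad lane count or origin");
        }
        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            int j = origin + i;
            // Lanes past the end of 'a' continue into the start of 'b'.
            result[i] = (j < n) ? a[j] : b[j - n];
        }
        return result;
    }
}
```

When `origin` is a compile-time constant, the lane selection is fixed, which is what would let a single byte-alignment instruction over the two inputs stand in for the loop.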
On Fri, 9 May 2025 07:35:41 GMT, Xiaohong Gong wrote:
> JDK-8318650 introduced hotspot intrinsification of subword gather load APIs
> for X86 platforms [1]. However, the current implementation is not optimal for
> AArch64 SVE platform, which natively supports vector instructions for subword
>
On Thu, 29 May 2025 18:56:11 GMT, Mohamed Issa wrote:
>> The goal of this PR is to implement an x86_64 intrinsic for
>> java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is
>> included to check the performance of specific input value ranges to help
>> prevent regression
On Thu, 29 May 2025 18:49:28 GMT, Mohamed Issa wrote:
>> test/micro/org/openjdk/bench/java/lang/CbrtPerf.java line 56:
>>
>>> 54: public static class CbrtPerfRanges {
>>> 55: public static int cbrtInputCount = 2048;
>>> 56:
>>
>> Please create separate CbrtPerfSpecialValues for +/-
On Wed, 28 May 2025 18:39:13 GMT, Mohamed Issa wrote:
>> The goal of this PR is to implement an x86_64 intrinsic for
>> java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is
>> included to check the performance of specific input value ranges to help
>> prevent regression
On Mon, 19 May 2025 03:10:46 GMT, Xiaohong Gong wrote:
>> JDK-8318650 introduced hotspot intrinsification of subword gather load APIs
>> for X86 platforms [1]. However, the current implementation is not optimal
>> for AArch64 SVE platform, which natively supports vector instructions for
>> sub
On Wed, 7 May 2025 02:10:56 GMT, erifan wrote:
>> This patch optimizes the following patterns:
>> For integer types:
>>
>> (XorV (VectorMaskCmp src1 src2 cond) (Replicate -1))
>> => (VectorMaskCmp src1 src2 ncond)
>> (XorVMask (VectorMaskCmp src1 src2 cond) (MaskAll m1))
>> => (VectorMas
_can_be_inverted() ||
>>
>> Do you plan to extend your testcase / matching logic to cover following
>> equivalent patterns:
>>
>> - compare.xor(maskAll(true))
>> - compare.xor(VectorMask.fromLong(SPECIES, -1L))
>
> Hi @jatin-bhateja It is fea
On Fri, 25 Apr 2025 09:17:02 GMT, Jatin Bhateja wrote:
>> Thanks for telling me this information. Another more important reason to
>> check outcnt here is to prevent this optimization when the uses of
>> VectorMaskCmp is greater than 1, because this optimization may not be
On Wed, 7 May 2025 02:10:56 GMT, erifan wrote:
>> This patch optimizes the following patterns:
>> For integer types:
>>
>> (XorV (VectorMaskCmp src1 src2 cond) (Replicate -1))
>> => (VectorMaskCmp src1 src2 ncond)
>> (XorVMask (VectorMaskCmp src1 src2 cond) (MaskAll m1))
>> => (VectorMas
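The integer-lane rewrite quoted above can be checked with a scalar model: XOR-ing a comparison mask with an all-true mask gives the same lanes as comparing with the negated condition. The sketch below is an illustration only; `Cond`, `compare`, and `xorAllTrue` are hypothetical stand-ins, not C2 node classes. It is limited to integers on purpose, since for floating-point lanes NaN inputs mean the opposite ordered comparison is not a plain negation.

```java
import java.util.Arrays;

public class MaskNotSketch {
    // Hypothetical stand-in for the condition carried by a VectorMaskCmp node.
    enum Cond {
        EQ, NE, LT, GE, GT, LE;

        Cond negate() {
            switch (this) {
                case EQ: return NE;
                case NE: return EQ;
                case LT: return GE;
                case GE: return LT;
                case GT: return LE;
                default: return GT; // LE
            }
        }

        boolean test(int a, int b) {
            switch (this) {
                case EQ: return a == b;
                case NE: return a != b;
                case LT: return a < b;
                case GE: return a >= b;
                case GT: return a > b;
                default: return a <= b; // LE
            }
        }
    }

    // Lane-wise comparison producing a boolean mask.
    static boolean[] compare(int[] a, int[] b, Cond c) {
        boolean[] m = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            m[i] = c.test(a[i], b[i]);
        }
        return m;
    }

    // Models XOR with an all-true mask, i.e. a lane-wise NOT.
    static boolean[] xorAllTrue(boolean[] m) {
        boolean[] r = new boolean[m.length];
        for (int i = 0; i < m.length; i++) {
            r[i] = m[i] ^ true;
        }
        return r;
    }

    // True when the two-step form and the negated-condition form agree.
    static boolean sameResult(int[] a, int[] b, Cond c) {
        return Arrays.equals(xorAllTrue(compare(a, b, c)), compare(a, b, c.negate()));
    }
}
```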
On Fri, 25 Apr 2025 05:26:35 GMT, Vladimir Ivanov wrote:
>> Migrate Vector API math library (SVML and SLEEF) linkage from native code
>> (in JVM) to Java FFM API.
>>
>> Since FFM API doesn't support vector calling conventions yet, migration
>> affects only symbol lookup for now. But it still e
On Thu, 24 Apr 2025 08:57:14 GMT, erifan wrote:
>> erifan has updated the pull request with a new target base due to a merge or
>> a rebase. The incremental webrev excludes the unrelated changes brought in
>> by the merge/rebase. The pull request contains two additional commits since
>> the la
On Thu, 24 Apr 2025 09:46:24 GMT, erifan wrote:
>> test/hotspot/jtreg/compiler/vectorapi/VectorMaskCompareNotTest.java line 38:
>>
>>> 36: * @summary test combining vector not operation with compare
>>> 37: * @modules jdk.incubator.vector
>>> 38: * @requires ((os.arch!="x86" & os.arch!="i386"
On Thu, 24 Apr 2025 23:03:09 GMT, Vladimir Ivanov wrote:
>> src/hotspot/share/opto/vectorIntrinsics.cpp line 563:
>>
>>> 561: debug_name =
>>> debug_name_oop->const_oop()->as_instance()->java_lang_String_str(buf,
>>> buflen);
>>> 562: }
>>> 563: Node* vcall = make_runtime_call(RC_VECTO
On Thu, 24 Apr 2025 09:37:07 GMT, erifan wrote:
>> src/hotspot/share/opto/vectornode.cpp line 2243:
>>
>>> 2241: in1 = in1->in(1);
>>> 2242: }
>>> 2243: if (in1->Opcode() != Op_VectorMaskCmp || in1->outcnt() > 1 ||
>>
>> Checks on outcnt on line 2243 and 2238 can be removed. Idealizatio
On Wed, 23 Apr 2025 23:54:01 GMT, Vladimir Ivanov wrote:
>> Migrate Vector API math library (SVML and SLEEF) linkage from native code
>> (in JVM) to Java FFM API.
>>
>> Since FFM API doesn't support vector calling conventions yet, migration
>> affects only symbol lookup for now. But it still e
On Fri, 18 Apr 2025 01:36:10 GMT, erifan wrote:
>> This patch optimizes the following patterns:
>> For integer types:
>>
>> (XorV (VectorMaskCmp src1 src2 cond) (Replicate -1))
>> => (VectorMaskCmp src1 src2 ncond)
>> (XorVMask (VectorMaskCmp src1 src2 cond) (MaskAll m1))
>> => (VectorMa
On Fri, 18 Apr 2025 01:36:10 GMT, erifan wrote:
>> This patch optimizes the following patterns:
>> For integer types:
>>
>> (XorV (VectorMaskCmp src1 src2 cond) (Replicate -1))
>> => (VectorMaskCmp src1 src2 ncond)
>> (XorVMask (VectorMaskCmp src1 src2 cond) (MaskAll m1))
>> => (VectorMa
On Mon, 14 Apr 2025 15:22:21 GMT, Vladimir Ivanov wrote:
>> The HashMap for caching was deleted. Now it uses only a ThreadLocal variable
>> without synchronization.
>> According to the specjvm2008::xml.transform workload the performance for low
>> thread counts was not affected and improved for h
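The general idea of a per-thread cache held in a `ThreadLocal`, with no shared map and no locking, can be sketched as follows. This is a generic illustration under assumed names (`PerThreadCacheSketch`, `acquire`, `release`); it is not the XMLReaderManager code from the patch.

```java
import java.util.function.Supplier;

public class PerThreadCacheSketch<T> {
    // One slot per thread; no other thread can see it, so no synchronization.
    private final ThreadLocal<T> slot = new ThreadLocal<>();
    private final Supplier<T> factory;

    public PerThreadCacheSketch(Supplier<T> factory) {
        this.factory = factory;
    }

    // Hand out this thread's cached instance if present, otherwise create one.
    // The slot is emptied while the instance is in use.
    public T acquire() {
        T cached = slot.get();
        if (cached != null) {
            slot.remove();
            return cached;
        }
        return factory.get();
    }

    // Return the instance to this thread's slot for later reuse.
    public void release(T value) {
        slot.set(value);
    }
}
```

Because each thread only ever touches its own slot, contention between threads disappears, at the cost of holding at most one cached instance per thread.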
On Mon, 14 Apr 2025 15:23:45 GMT, Vladimir Ivanov wrote:
>> src/java.xml/share/classes/com/sun/org/apache/xml/internal/utils/XMLReaderManager.java
>> line 148:
>>
>>> 146: // for this thread, remove it.
>>> 147: ReaderWrapper rw = m_readers.get();
>>> 148: if (rw != null
On Fri, 28 Mar 2025 23:48:34 GMT, Vladimir Ivanov wrote:
>> The HashMap for caching was deleted. Now it uses only a ThreadLocal variable
>> without synchronization.
>> According to the specjvm2008::xml.transform workload the performance for low
>> thread counts was not affected and improved for h
On Fri, 7 Mar 2025 16:13:05 GMT, Vladimir Ivanov wrote:
>> test setup was updated to generate data of requested size.
>
> Vladimir Ivanov has updated the pull request incrementally with one
> additional commit since the last revision:
>
> JDK-8350811 [JMH] test foreign.StrLenTest failed with
On Fri, 7 Mar 2025 15:51:36 GMT, Vladimir Ivanov wrote:
>> test setup was updated to generate data of requested size.
>
> Vladimir Ivanov has updated the pull request incrementally with one
> additional commit since the last revision:
>
> JDK-8350811 [JMH] test foreign.StrLenTest failed with
On Tue, 4 Mar 2025 19:37:32 GMT, Vladimir Ivanov wrote:
>> test setup was updated to generate data of requested size.
>
> Vladimir Ivanov has updated the pull request incrementally with one
> additional commit since the last revision:
>
> JDK-8350811 [JMH] test foreign.StrLenTest failed with
On Fri, 28 Feb 2025 01:20:44 GMT, Vladimir Ivanov wrote:
> The scope was updated to support multithreaded configurations (jmh option
> '-t 2'). No other changes needed.
LGTM.
-
Marked as reviewed by jbhateja (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/23834#pullrequestr
On Wed, 26 Feb 2025 07:04:58 GMT, Nicole Xu wrote:
>> Suite `MaskedLogicOpts.maskedLogicOperationsLong512()` failed on both x86
>> and AArch64 with the following error:
>>
>>
>> java.lang.IndexOutOfBoundsException: Index 252 out of bounds for length 249
>>
>>
>> The variable `long256_arr_idx
On Tue, 18 Feb 2025 02:36:13 GMT, Julian Waters wrote:
> Is anyone else getting compile failures after this was integrated? This
> weirdly seems to only happen on Linux
>
> ```
> * For target hotspot_variant-server_libjvm_objs_mulnode.o:
> /home/runner/work/jdk/jdk/src/hotspot/share/opto/mulnod
On Tue, 18 Feb 2025 03:50:19 GMT, Nicole Xu wrote:
> Sure. Since I am very new to OpenJDK, I asked my teammate for help to file
> the follow-up RFE.
>
> Here is the https://bugs.openjdk.org/browse/JDK-8350215 with description of
> the discussed issues.
Hi @xyyNicole ,
I have modified the be
On Tue, 18 Feb 2025 09:58:50 GMT, SendaoYan wrote:
>> Hi all,
>>
>> The newly added JMH test
>> 'org.openjdk.bench.jdk.incubator.vector.VectorMultiplyOptBenchmark' fails
>> with "java.lang.NoClassDefFoundError: jdk/incubator/vector/Float16" when run
>> with the test command below:
>>
>>
>> make test MICRO="FO
elp to review the
> patch? Thanks.
> @xyyNicole @jatin-bhateja I think it is reasonable to just fix the benchmark
> so that it still has the same behaviour, just without the out-of-bounds
> exception. @jatin-bhateja you originally wrote the benchmark, and it could
> make sense if
On Thu, 13 Feb 2025 12:06:09 GMT, Jatin Bhateja wrote:
>> Suite MaskedLogicOpts.maskedLogicOperationsLong512() failed on both x86 and
>> AArch64 with the following error:
>>
>>
>> java.lang.IndexOutOfBoundsException: Index 252 out of bounds for leng
On Wed, 8 Jan 2025 09:04:47 GMT, Nicole Xu wrote:
> Suite MaskedLogicOpts.maskedLogicOperationsLong512() failed on both x86 and
> AArch64 with the following error:
>
>
> java.lang.IndexOutOfBoundsException: Index 252 out of bounds for length 249
>
>
> The variable `long256_arr_idx` is misuse
On Sun, 15 Dec 2024 18:05:02 GMT, Jatin Bhateja wrote:
> Hi All,
>
> This patch adds C2 compiler support for various Float16 operations added by
> [PR#22128](https://github.com/openjdk/jdk/pull/22128)
>
> Following is the summary of changes included with this patch:-
>
On Wed, 12 Feb 2025 14:46:49 GMT, Paul Sandoz wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Review comments resolutions
>
> Looks good. I merged this PR with master, succ
On Mon, 10 Feb 2025 21:23:28 GMT, Paul Sandoz wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Fixing typos
>
> An impressive and substantial change. I focused on the Java code,
On Mon, 10 Feb 2025 20:43:19 GMT, Paul Sandoz wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Fixing typos
>
> test/jdk/jdk/incubator/vector/ScalarFloat16OperationsTest.java li
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pu
On Tue, 4 Feb 2025 10:05:09 GMT, Jatin Bhateja wrote:
>> Hi All,
>>
>> This patch adds C2 compiler support for various Float16 operations added by
>> [PR#22128](https://github.com/openjdk/jdk/pull/22128)
>>
>> Following is the summary of changes included w
On Tue, 4 Feb 2025 19:18:39 GMT, Chen Liang wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Fixing typos
>
> src/java.base/share/classes/jdk/internal/vm/vector/Float16Math.java
On Tue, 4 Feb 2025 09:03:09 GMT, Emanuel Peter wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Update
>> test/micro/org/openjdk/bench/jdk/incubator/vector/Float16OperationsBenchma
On Mon, 3 Feb 2025 18:11:11 GMT, Jatin Bhateja wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Update
>> test/micro/org/openjdk/bench/jdk/incubator/vector/Float16OperationsBenchma
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has
On Thu, 30 Jan 2025 11:03:43 GMT, Jatin Bhateja wrote:
>> Hi All,
>>
>> This patch adds C2 compiler support for various Float16 operations added by
>> [PR#22128](https://github.com/openjdk/jdk/pull/22128)
>>
>> Following is the summary of changes included w
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull
On Wed, 29 Jan 2025 00:36:54 GMT, Sandhya Viswanathan wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Updating typos in comments
>
> Some more minor comments below. Rest of the PR
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has update
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the p
On Mon, 27 Jan 2025 07:42:48 GMT, Emanuel Peter wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 14 commits:
>>
>> - Rebasing to jdk mainline
>> - Merge branch 'maste
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated t
On Fri, 17 Jan 2025 16:02:55 GMT, Jatin Bhateja wrote:
>> Hi All,
>>
>> This patch adds C2 compiler support for various Float16 operations added by
>> [PR#22128](https://github.com/openjdk/jdk/pull/22128)
>>
>> Following is the summary of changes included w
On Thu, 23 Jan 2025 10:54:54 GMT, Hamlin Li wrote:
>> @Hamlin-Li , Class types are passed as constant oop, this check is added for
>> argument validation.
>
> Thanks!
> Seems it could be an assert instead? Or maybe I could have misunderstood your
> above explanation.
Hi @Hamlin-Li, We intend t
On Thu, 23 Jan 2025 08:14:13 GMT, Hamlin Li wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Review suggestions incorporated.
>
> src/hotspot/share/opto/library_call.cpp line 8670:
On Wed, 15 Jan 2025 00:28:50 GMT, Paul Sandoz wrote:
>> Hi @PaulSandoz ,
>>
>> In the above code snippet, the return type 'short' of the intrinsic call does
>> not match the value being returned, which is of box type, thereby mandating
>> additional glue code.
>>
>> Regular primitive type boxing
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull r
On Mon, 13 Jan 2025 16:51:02 GMT, Paul Sandoz wrote:
>> Hi @PaulSandoz , In the current scheme we are passing unboxed carriers to
>> intrinsic entry point, in the fallback implementation carrier type is first
>> converted to floating point value using Float.float16ToFloat API which
>> expects
On Thu, 9 Jan 2025 19:22:35 GMT, Paul Sandoz wrote:
>> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16.java
>> line 1434:
>>
>>> 1432: return float16ToRawShortBits(valueOf(product +
>>> float16ToFloat(f16c)));
>>> 1433: });
>>> 1434:
On Thu, 9 Jan 2025 13:13:30 GMT, Emanuel Peter wrote:
>> Jatin Bhateja has refreshed the contents of this pull request, and previous
>> commits have been removed. The incremental views will show differences
>> compared to the previous content of the PR. The pull request co
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pu
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has refreshed the content
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pu
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has refreshed the content
On Tue, 17 Dec 2024 16:40:33 GMT, Emanuel Peter wrote:
>> Yes, there are multiple test points in newly added test which receive
>> floating-point constant which goes through following IR logic before being
>> constant folded
>> ConF -> ConvF2HF -> ReinterpretS2HF
>
> Please show me an example.
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the p
On Tue, 17 Dec 2024 16:39:58 GMT, Emanuel Peter wrote:
>> This is the core idealization logic which infers FP16 IR. Every test point
>> added in
>> test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java
>> verifies this.
>
> Picking a random line from
eing backported was authored by Paul Sandoz on 16 Dec 2024 and
> was reviewed by Quan Anh Mai and Jatin Bhateja.
Marked as reviewed by jbhateja (Reviewer).
-
PR Review: https://git.openjdk.org/jdk/pull/22777#pullrequestreview-2510724750
On Tue, 17 Dec 2024 07:15:35 GMT, Emanuel Peter wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Adding more test points
>
> src/hotspot/share/opto/convertnode.cpp line 28
On Tue, 17 Dec 2024 07:22:45 GMT, Emanuel Peter wrote:
>> src/hotspot/share/opto/convertnode.cpp line 960:
>>
>>> 958: }
>>> 959: return TypeInt::SHORT;
>>> 960: }
>>
>> Do we have tests for these constant folding operations?
>
> We would need all sorts of conversion with Float16 <-> short.
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the p
to the
>> reader explaining why the more obvious code was not being used.
>
> @jatin-bhateja could we change the intrinsic to declare the three Float16
> values as additional parameters which are only ever passed to the lambda? I
> believe when intrinsic we will just drop tho
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the
On Mon, 16 Dec 2024 09:03:38 GMT, Emanuel Peter wrote:
> > > Can you quickly summarize what tests you have, and what they test?
> >
> >
> > Patch includes functional and performance tests, as per your suggestions IR
> > framework-based tests now cover various special cases for constant folding
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated th
On Mon, 16 Dec 2024 07:22:04 GMT, Emanuel Peter wrote:
> Can you quickly summarize what tests you have, and what they test?
Patch includes functional and performance tests, as per your suggestions IR
framework-based tests now cover various special cases for constant folding
transformation. Le
einterpretation chains.
> HF2S + S2HF = HF
> 9. X86 backend implementation for all supported intrinsics.
> 10. Functional and Performance validation tests.
>
> Kindly review the patch and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull re
On Mon, 16 Dec 2024 08:13:48 GMT, Jatin Bhateja wrote:
>> Paul Sandoz has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Merge checks
>
> LGTM.
> @jatin-bhateja You may want to implement reduction i
On Fri, 13 Dec 2024 21:25:48 GMT, Paul Sandoz wrote:
>> Add functional support for unsigned min/max reductions on vectors.
>>
>> We also need to ensure that the `reductionCoerced` intrinsic bails out when
>> there is no reduction operation for the lanewise operation. When intrinsic
>> support
On Sun, 15 Dec 2024 18:05:02 GMT, Jatin Bhateja wrote:
> Hi All,
>
> This patch adds C2 compiler support for various Float16 operations added by
> [PR#22128](https://github.com/openjdk/jdk/pull/22128)
>
> Following is the summary of changes included with this patch:-
>
Hi All,
This patch adds C2 compiler support for various Float16 operations added by
[PR#22128](https://github.com/openjdk/jdk/pull/22128)
Following is the summary of changes included with this patch:-
1. Detection of various Float16 operations through inline expansion or pattern
folding ideali
On Mon, 14 Oct 2024 11:40:01 GMT, Jatin Bhateja wrote:
> Hi All,
>
> This patch adds C2 compiler support for various Float16 operations added by
> [PR#22128](https://github.com/openjdk/jdk/pull/22128)
>
> Following is the summary of changes included with this patch:-
>
On Tue, 10 Dec 2024 08:34:30 GMT, Jatin Bhateja wrote:
>> Quan Anh Mai has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Change wording on VectorLoadShuffleNode
>>
>> Co-authored-by: Jatin Bhateja
On Tue, 10 Dec 2024 16:10:09 GMT, Quan Anh Mai wrote:
>> Hi,
>>
>> This is just a redo of https://github.com/openjdk/jdk/pull/13093, mostly
>> just the revert of the backout.
>>
>> Regarding the related issues:
>>
>> - [JDK-8306008](https://bugs.openjdk.org/browse/JDK-8306008) and
>> [JDK-83
On Tue, 10 Dec 2024 08:18:36 GMT, Quan Anh Mai wrote:
>> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Byte512Vector.java
>> line 1046:
>>
>>> 1044: String msg = ("index "+si+"out of range
>>> ["+length+"] in "+
>>> 1045: java
0d08,%r10; {oop([I{0x0007515a0d08})}
>> vmovdqu 0x10(%r10),%xmm1
>> movabs $0x75158afb8,%r10; {oop([I{0x00075158afb8})}
>> vmovdqu 0x10(%r10),%xmm0
>> vpand -0xddc12(%rip),%xmm0,%xmm0# Stub::vector_int_to_byt...
>
On Tue, 10 Dec 2024 08:17:22 GMT, Quan Anh Mai wrote:
>> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Double128Vector.java
>> line 859:
>>
>>> 857: .reinterpretAsInts()
>>> 858: .intoArray(a, offset);
>>> 859: defaul
On Tue, 10 Dec 2024 07:09:33 GMT, Jatin Bhateja wrote:
>> Quan Anh Mai has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> adverb order
>
> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/
On Mon, 9 Dec 2024 14:12:19 GMT, Quan Anh Mai wrote:
>> Hi,
>>
>> This is just a redo of https://github.com/openjdk/jdk/pull/13093, mostly
>> just the revert of the backout.
>>
>> Regarding the related issues:
>>
>> - [JDK-8306008](https://bugs.openjdk.org/browse/JDK-8306008) and
>> [JDK-830
On Fri, 6 Dec 2024 17:22:33 GMT, Paul Sandoz wrote:
>> @sviswa7 @PaulSandoz @eme64 @jatin-bhateja Thanks for taking a look, I have
>> merged the PR with a more recent master and resolved the semantic difference
>> with newly added intrinsics, too.
>
> @merykitty do you
On Mon, 25 Nov 2024 20:04:09 GMT, Jatin Bhateja wrote:
>> Hi All,
>>
>> This patch adds C2 compiler support for various Float16 operations added by
>> [PR#22128](https://github.com/openjdk/jdk/pull/22128)
>>
>> Following is the summary of changes included w
On Mon, 25 Nov 2024 07:18:41 GMT, Emanuel Peter wrote:
>> src/hotspot/share/opto/connode.cpp line 49:
>>
>>> 47: switch( t->basic_type() ) {
>>> 48: case T_INT: return new ConINode( t->is_int() );
>>> 49: case T_SHORT: return new ConHNode( t->is_half_float_constant()
>>> );
On Mon, 25 Nov 2024 08:56:31 GMT, Emanuel Peter wrote:
> I heard no argument about why you did not split this up. Please do that in
> the future. It is hard to review well when there is this much code. If it is
> really necessary, then sure. Here it does not seem necessary to deliver all
> at
ze redundant reinterpretation chains.
> HF2S + S2HF = HF
> 6. Auto-vectorization of newly supported scalar operations.
> 7. X86 and AARCH64 backend implementation for all supported intrinsics.
> 9. Functional and Performance validation tests.
>
> Kindly review and share your f
On Mon, 25 Nov 2024 08:02:55 GMT, Emanuel Peter wrote:
> Wow, thanks for tackling this!
>
> Ok, lots of style comments.
>
> But again: I would have loved to see this split up into these parts:
>
> * Scalar
> * Scalar optimizations (value, ideal, identity)
> * Vector
>
> This will again take m
ze redundant reinterpretation chains.
> HF2S + S2HF = HF
> 6. Auto-vectorization of newly supported scalar operations.
> 7. X86 and AARCH64 backend implementation for all supported intrinsics.
> 9. Functional and Performance validation tests.
>
> **Missing Pieces:-**
> **- AAR
On Sun, 29 Sep 2024 04:21:19 GMT, Jatin Bhateja wrote:
> This patch optimizes LongVector multiplication by inferring the VPMUL[U]DQ
> instruction for the following IR patterns.
>
>
>MulVL ( AndV SRC1, 0x) ( AndV SRC2, 0x)
>MulVL (U
={})"
> > Phase "PrintIdeal":
>
>
>- counts: Graph contains wrong number of nodes:
>
>
oublewords of quadword lanes for multiplication, hence we can safely
>> save emitting redundant input masking instructions. We already have
>> specialized IR nodes like MulAddVS2VINode and I see these new IR nodes
>> similar to it.
>
> @jatin-bhateja in case when `AndV` is shared, it