RFR: 8318078: ADLC: pass ASSERT and PRODUCT flags

2023-10-13 Thread Emanuel Peter
@vnkozlov asked me to guard some debug AD file rules in `#ifdef ASSERT`. https://github.com/openjdk/jdk/pull/14785#discussion_r1349391130 We discovered that the `ASSERT` and `PRODUCT` flags are not yet passed to ADLC, and hence they are always considered `undefined`. Hence, all of these `ifdef` block

Re: RFR: 8318078: ADLC: pass ASSERT and PRODUCT flags [v2]

2023-10-16 Thread Emanuel Peter
dif > > When compiling, I get complaints for `yyy` on `linux-x64` and for `xxx` on > `linux-x64-debug`. But since `ASSERT` and `PRODUCT` never occur together, we > never get complaints about `control`. > > **Running tier1-3 and stress testing ...** Emanuel Peter has updated

Re: RFR: 8318078: ADLC: pass ASSERT and PRODUCT flags [v2]

2023-10-16 Thread Emanuel Peter
On Fri, 13 Oct 2023 17:23:32 GMT, Vladimir Kozlov wrote: >> Emanuel Peter has updated the pull request incrementally with one additional >> commit since the last revision: >> >> add comments like Vladimir requested > > make/hotspot/gensrc/GensrcAdlc.gmk line

Re: RFR: 8318078: ADLC: pass ASSERT and PRODUCT flags [v2]

2023-10-17 Thread Emanuel Peter
On Mon, 16 Oct 2023 16:03:39 GMT, Vladimir Kozlov wrote: >> Emanuel Peter has updated the pull request incrementally with one additional >> commit since the last revision: >> >> add comments like Vladimir requested > > Good Thanks @vnkozlov @TobiHartmann for t

Integrated: 8318078: ADLC: pass ASSERT and PRODUCT flags

2023-10-17 Thread Emanuel Peter
On Fri, 13 Oct 2023 09:49:48 GMT, Emanuel Peter wrote: > @vnkozlov asked me to guard some debug AD file rules in `#ifdef ASSERT`. > https://github.com/openjdk/jdk/pull/14785#discussion_r1349391130 > > We discovered that the `ASSERT` and `PRODUCT` are not yet passed to ADLC, and

Re: RFR: 8318446: C2: implement StoreNode::Ideal_merge_stores

2024-01-16 Thread Emanuel Peter
On Wed, 25 Oct 2023 03:11:12 GMT, Quan Anh Mai wrote: >> This is a feature requested by @RogerRiggs and @cl4es. >> >> **Idea** >> >> Merging multiple consecutive small stores (e.g. 8 byte stores) into larger >> stores (e.g. one long store) can lead to speedup. >> Recently, @cl4es and @RogerR

RFR: 8318446: C2: implement StoreNode::Ideal_merge_stores

2024-01-16 Thread Emanuel Peter
This is a feature requested by @RogerRiggs and @cl4es. **Idea** Merging multiple consecutive small stores (e.g. 8 byte stores) into larger stores (e.g. one long store) can lead to speedup. Recently, @cl4es and @RogerRiggs had to review a few PRs where people would try to get speedups by usin
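
To make the idea concrete, here is a minimal Java sketch of the two store shapes being compared: eight consecutive byte stores versus one long store covering the same bytes. The class name `MergeStoresSketch`, the method names, and the use of a little-endian `VarHandle` for the merged store are assumptions made for illustration only; the sketch does not show how `StoreNode::Ideal_merge_stores` is implemented in C2.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical illustration, not code from the PR.
public class MergeStoresSketch {
    // Views a byte[] as little-endian longs (standard java.lang.invoke API).
    private static final VarHandle LONG_LE =
        MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

    // Eight consecutive 1-byte stores: the pattern a compiler could merge.
    public static void storeBytewise(byte[] a, int offset, long v) {
        a[offset + 0] = (byte) (v >>> 0);
        a[offset + 1] = (byte) (v >>> 8);
        a[offset + 2] = (byte) (v >>> 16);
        a[offset + 3] = (byte) (v >>> 24);
        a[offset + 4] = (byte) (v >>> 32);
        a[offset + 5] = (byte) (v >>> 40);
        a[offset + 6] = (byte) (v >>> 48);
        a[offset + 7] = (byte) (v >>> 56);
    }

    // One 8-byte store writing the same bytes in little-endian order.
    public static void storeMerged(byte[] a, int offset, long v) {
        LONG_LE.set(a, offset, v);
    }

    public static void main(String[] args) {
        byte[] x = new byte[16];
        byte[] y = new byte[16];
        long v = 0x0102030405060708L;
        storeBytewise(x, 4, v);
        storeMerged(y, 4, v);
        // Both arrays should hold identical contents.
        System.out.println(Arrays.equals(x, y));
    }
}
```

Both methods leave the array in the same state, so a compiler that can prove the byte stores are adjacent and unobserved in between is free to emit the single wide store instead.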

Re: RFR: 8318446: C2: implement StoreNode::Ideal_merge_stores

2024-01-16 Thread Emanuel Peter
On Wed, 25 Oct 2023 14:59:07 GMT, Quan Anh Mai wrote: >> @merykitty do you have examples for both? Maybe stores to fields already >> works. Merging loads and stores may be out of scope. That sounds a little >> much like SLP. We can still try to do that in a future RFE. We could even >> try to

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v11]

2024-09-11 Thread Emanuel Peter
On Tue, 10 Sep 2024 19:11:30 GMT, Roman Kennke wrote: >> This is the main body of the JEP 450: Compact Object Headers (Experimental). >> >> It is also a follow-up to #20640, which now also includes (and supersedes) >> #20603 and #20605, plus the Tiny Class-Pointers parts that have been >> prev

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v11]

2024-09-11 Thread Emanuel Peter
On Wed, 11 Sep 2024 13:34:28 GMT, Roman Kennke wrote: > > @rkennke Can you please explain the changes in these tests: > > ``` > > test/hotspot/jtreg/compiler/c2/irTests/TestVectorizationMismatchedAccess.java > > test/hotspot/jtreg/compiler/c2/irTests/TestVectorizationNotRun.java > > test/hotspot/

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v11]

2024-09-12 Thread Emanuel Peter
On Thu, 12 Sep 2024 13:13:01 GMT, Roman Kennke wrote: > > > > @rkennke Can you please explain the changes in these tests: > > > > ``` > > > > test/hotspot/jtreg/compiler/c2/irTests/TestVectorizationMismatchedAccess.java > > > > test/hotspot/jtreg/compiler/c2/irTests/TestVectorizationNotRun.java >

Re: RFR: 8343345: Use -jvmArgsPrepend when running microbenchmarks in RunTests.gmk

2024-10-31 Thread Emanuel Peter
On Thu, 31 Oct 2024 08:53:07 GMT, Claes Redestad wrote: > Update RunTests.gmk to use `-jvmArgsPrepend` to avoid overwriting built-in > micro `jvmArgs` flags. Nice, thank you for fixing this! This fixes it for me, I checked this manually, this way https://github.com/openjdk/jdk/pull/21683#issue

Re: RFR: 8343345: Use -jvmArgsPrepend when running microbenchmarks in RunTests.gmk

2024-10-31 Thread Emanuel Peter
On Thu, 31 Oct 2024 12:12:21 GMT, Claes Redestad wrote: >> Update RunTests.gmk to use `-jvmArgsPrepend` to avoid overwriting built-in >> micro `jvmArgs` flags. > > I had hoped you'd exclaim "Trivial!" and review, but I'm sure someone will > step up soon enough. :-) @cl4es so what happens if so

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v19]

2024-11-18 Thread Emanuel Peter
On Fri, 8 Nov 2024 17:42:24 GMT, Roman Kennke wrote: >> Could you please cherry pick >> https://github.com/mur47x111/jdk/commit/c45ebc2a89d0b25a3dd8cc46386e37a635ff9af2 >> for the JVMCI support? > > @mur47x111 it's now integrated in jdk24. do your magic in Graal ;-) @rkennke I have now looked

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v19]

2024-11-18 Thread Emanuel Peter
On Fri, 8 Nov 2024 17:42:24 GMT, Roman Kennke wrote: >> Could you please cherry pick >> https://github.com/mur47x111/jdk/commit/c45ebc2a89d0b25a3dd8cc46386e37a635ff9af2 >> for the JVMCI support? > > @mur47x111 it's now integrated in jdk24. do your magic in Graal ;-) @rkennke How important is

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v19]

2024-11-18 Thread Emanuel Peter
On Mon, 18 Nov 2024 14:34:13 GMT, Quan Anh Mai wrote: >> @rkennke >>> BTW, this problem is not specific to UseCompactObjectHeaders - I think the >>> same problem would happen with -UseCompressedClassPointers. With >>> uncompressed class-pointers, byte[] would start at offset 20, while long[] >

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v19]

2024-11-18 Thread Emanuel Peter
On Mon, 18 Nov 2024 14:35:41 GMT, Roman Kennke wrote: >>> @rkennke Ok, fair enough. As far as I know, we at Oracle do not super care >>> about strict alignment `AlignVector`. But maybe other people care, and have >>> to make that tradeoff between vectorization and small object headers. >> >> B

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v19]

2024-11-18 Thread Emanuel Peter
On Mon, 18 Nov 2024 14:23:24 GMT, Roman Kennke wrote: >>> @rkennke How important is the 4-byte saving on `byte, char, short, int, >>> float` arrays? I'd assume they are not generally that small, at least a few >>> elements? So could we make an exception, and have a `16-byte` offset to the >>>

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v19]

2024-11-18 Thread Emanuel Peter
On Mon, 18 Nov 2024 14:13:17 GMT, Roman Kennke wrote: >> @mur47x111 it's now integrated in jdk24. do your magic in Graal ;-) > >> @rkennke How important is the 4-byte saving on `byte, char, short, int, >> float` arrays? I'd assume they are not generally that small, at least a few >> elements?

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v19]

2024-11-18 Thread Emanuel Peter
On Mon, 18 Nov 2024 15:20:17 GMT, Quan Anh Mai wrote: >> @merykitty I guess we can always use >> [vmovdqu](https://www.felixcloutier.com/x86/movdqu:vmovdqu8:vmovdqu16:vmovdqu32:vmovdqu64). >> >> And in fact that is exactly what we do: >> >> public class Test { >> static int RANGE = 1024*10

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v19]

2024-11-18 Thread Emanuel Peter
On Mon, 18 Nov 2024 15:00:51 GMT, Roman Kennke wrote: >>> @rkennke It just will (silently) not vectorize, thus running slower but >>> still correct. >> >> Ok, I think we can live with that for now. >> >> As said elsewhere, we are currently working on 4-byte-headers, which would >> make that p

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v57]

2024-11-18 Thread Emanuel Peter
On Thu, 7 Nov 2024 17:25:40 GMT, Roman Kennke wrote: >> This is the main body of the JEP 450: Compact Object Headers (Experimental). >> >> It is also a follow-up to #20640, which now also includes (and supersedes) >> #20603 and #20605, plus the Tiny Class-Pointers parts that have been >> previ

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v19]

2024-11-18 Thread Emanuel Peter
On Mon, 18 Nov 2024 14:38:20 GMT, Quan Anh Mai wrote: >> @merykitty the object base is always at least `8-byte` aligned, see >> `ObjectAlignmentInBytes` - this also holds for all arrays. But the issue is >> the offset from the object base to the array payload. >> >> @rkennke yes, working on fi

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v19]

2024-11-18 Thread Emanuel Peter
On Mon, 18 Nov 2024 14:43:48 GMT, Quan Anh Mai wrote: >> @merykitty >>> Please correct me if I'm wrong but the issue is you need the base to be >>> aligned at 32 bytes on AVX2 machines for any alignment for vector >>> instruction to be meaningful, so I don't see the value of vector alignment

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v19]

2024-11-18 Thread Emanuel Peter
On Mon, 18 Nov 2024 14:50:51 GMT, Quan Anh Mai wrote: >> @merykitty In `AlignmentSolver::solve` / >> `src/hotspot/share/opto/vectorization.cpp` you can see how I compute if >> vectors can be aligned. > > @eme64 If you load a 32-byte (256-bit) vector, then the load is aligned if > the address i

Re: RFR: 8305895: Implement JEP 450: Compact Object Headers (Experimental) [v19]

2024-11-18 Thread Emanuel Peter
On Mon, 18 Nov 2024 14:23:24 GMT, Roman Kennke wrote: >>> @rkennke How important is the 4-byte saving on `byte, char, short, int, >>> float` arrays? I'd assume they are not generally that small, at least a few >>> elements? So could we make an exception, and have a `16-byte` offset to the >>>