On Fri, 20 Dec 2024 16:08:51 GMT, Andrew Haley <a...@openjdk.org> wrote:

>> test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 135:
>>
>>> 133:     @IR(applyIf = {"SuperWordReductions", "true"},
>>> 134:         applyIfCPUFeatureOr = { "avx512", "true" },
>>> 135:         counts = {IRNode.MIN_REDUCTION_V, " > 0"})
>>
>>> @eme64 I've addressed all your comments except aarch64 testing. `asimd` is
>>> not enough, you need `sve` for this, but I'm yet to make it work even with
>>> `sve`, something's up and need to debug it further.
>>
>> Hi @galderz , may I ask if these long-reduction cases can't work even with
>> `sve`? It might be related with the limitation
>> [here](https://github.com/openjdk/jdk/blob/75420e9314c54adc5b45f9b274a87af54dd6b5a8/src/hotspot/share/opto/superword.cpp#L1564-L1566).
>> Some `sve` machines have only 128 bits.
>
> That's right. Neoverse V2 is 4 pipes of 128 bits, V1 is 2 pipes of 256 bits.
> That comment is "interesting". Maybe it should be tunable by the back end.
> Given that Neoverse V2 can issue 4 SVE operations per clock cycle, it might
> still be a win.
>
> Galder, how about you disable that line and give it another try?

FYI: I'm working on removing the line [here](https://github.com/openjdk/jdk/blob/75420e9314c54adc5b45f9b274a87af54dd6b5a8/src/hotspot/share/opto/superword.cpp#L1564-L1566). The issue is that on some platforms 2-element vectors are somehow much slower, and we need a cost model to give us a better heuristic, rather than the hard "no".

See my draft https://github.com/openjdk/jdk/pull/20964.

But yes: why don't you remove the line and see if that makes it work? If it does, then don't worry about this case for now, and maybe leave a comment in the test. We can then fix that later.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1901576209
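
For readers following along, here is a minimal plain-Java sketch of why the 128-bit case ends up at 2-element vectors. It is not the code from `MinMaxRed_Long.java`; the class name `LongMinReductionSketch` and the helper `longLanes` are made up for illustration, and the loop is only assumed to resemble the kind of `long` min reduction the test targets.

```java
public class LongMinReductionSketch {

    // Hypothetical helper: how many 64-bit long lanes fit in a vector
    // register of the given width in bits.
    static int longLanes(int vectorBits) {
        return vectorBits / Long.SIZE; // Long.SIZE is 64
    }

    // A scalar min reduction over a long array. This is the general loop
    // shape that an auto-vectorizer could turn into a vector min reduction;
    // whether it actually does depends on the platform and heuristics.
    static long minReduce(long[] a) {
        long result = Long.MAX_VALUE;
        for (int i = 0; i < a.length; i++) {
            result = Math.min(result, a[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        long[] data = new long[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = 1000L - i; // strictly decreasing, smallest value is last
        }
        System.out.println("min = " + minReduce(data));

        // 128-bit vectors (e.g. the Neoverse V2 case mentioned above) hold
        // only 2 longs, which is the 2-element case discussed in the reply.
        System.out.println("128-bit: " + longLanes(128) + " longs");
        System.out.println("256-bit: " + longLanes(256) + " longs");
        System.out.println("512-bit: " + longLanes(512) + " longs");
    }
}
```

With 256-bit or 512-bit vectors the same loop gets 4 or 8 lanes, so the 2-element limitation would not apply there.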