On Fri, 20 Dec 2024 16:08:51 GMT, Andrew Haley <a...@openjdk.org> wrote:

>> test/hotspot/jtreg/compiler/loopopts/superword/MinMaxRed_Long.java line 135:
>> 
>>> 133:     @IR(applyIf = {"SuperWordReductions", "true"},
>>> 134:         applyIfCPUFeatureOr = { "avx512", "true" },
>>> 135:         counts = {IRNode.MIN_REDUCTION_V, " > 0"})
>> 
>>> @eme64 I've addressed all your comments except aarch64 testing. `asimd` is 
>>> not enough, you need `sve` for this, but I have yet to make it work even 
>>> with `sve`. Something's up and I need to debug it further.
>> 
>> Hi @galderz, may I ask whether these long-reduction cases fail to work even 
>> with `sve`? It might be related to the limitation 
>> [here](https://github.com/openjdk/jdk/blob/75420e9314c54adc5b45f9b274a87af54dd6b5a8/src/hotspot/share/opto/superword.cpp#L1564-L1566).
>> Some `sve` machines have only 128-bit vectors.
>
> That's right. Neoverse V2 is 4 pipes of 128 bits, V1 is 2 pipes of 256 bits.
> That comment is "interesting". Maybe it should be tunable by the back end. 
> Given that Neoverse V2 can issue 4 SVE operations per clock cycle, it might 
> still be a win.
> 
> Galder, how about you disable that line and give it another try?

FYI: I'm working on removing the line 
[here](https://github.com/openjdk/jdk/blob/75420e9314c54adc5b45f9b274a87af54dd6b5a8/src/hotspot/share/opto/superword.cpp#L1564-L1566).

The issue is that on some platforms 2-element vectors really are slower, for 
reasons that are not yet clear, and we need a cost model to give us a better 
heuristic than the hard "no". See my draft https://github.com/openjdk/jdk/pull/20964.
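
To make the 2-element case concrete: a `long` is 64 bits, so a 128-bit vector holds exactly two of them, which is why long reductions on 128-bit `sve` machines would fall into this category. A minimal sketch of that arithmetic (the class and method names are hypothetical, this is not HotSpot code):

```java
// Hypothetical helper for illustration only, not part of HotSpot.
class LaneSketch {
    // Number of 64-bit long elements that fit in a vector of the given width.
    static int longLanes(int vectorBits) {
        return vectorBits / Long.SIZE; // Long.SIZE == 64
    }

    public static void main(String[] args) {
        for (int bits : new int[] {128, 256, 512}) {
            // 128 -> 2 lanes, 256 -> 4 lanes, 512 -> 8 lanes
            System.out.println(bits + "-bit vector: " + longLanes(bits) + " long lanes");
        }
    }
}
```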

But yes: why don't you remove the line and see if that makes it work? If so, 
then don't worry about this case for now, and maybe leave a comment in the 
test. We can then fix that later.
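
For reference, the kind of loop under discussion is a scalar min reduction over a `long` array, roughly as below. This is a minimal sketch under my own assumptions, not the actual code in `MinMaxRed_Long.java`; the class and method names are hypothetical, and whether the loop is actually vectorized depends on the platform and flags discussed above.

```java
// Hypothetical illustration, not the actual jtreg test.
class MinReductionSketch {
    // Scalar min reduction over a long array; a loop of this shape is a
    // candidate for the auto-vectorizer to turn into a vector min reduction.
    static long minReduce(long[] a) {
        long min = Long.MAX_VALUE;
        for (int i = 0; i < a.length; i++) {
            min = Math.min(min, a[i]);
        }
        return min;
    }

    public static void main(String[] args) {
        long[] data = {7L, -3L, 42L, 0L};
        System.out.println(minReduce(data)); // prints -3
    }
}
```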

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20098#discussion_r1901576209
