On Thu, 24 Apr 2025 09:37:07 GMT, erifan <d...@openjdk.org> wrote:

>> src/hotspot/share/opto/vectornode.cpp line 2243:
>>
>>> 2241:     in1 = in1->in(1);
>>> 2242:   }
>>> 2243:   if (in1->Opcode() != Op_VectorMaskCmp || in1->outcnt() > 1 ||
>>
>> Checks on outcnt on line 2243 and 2238 can be removed. Idealization looks
>> for a specific graph palette and replaces it with a new node whose inputs
>> are the same as the inputs of the palette. GVN will do the retention job if
>> any intermediate node has users beyond the pattern being replaced.
>
> Thanks for telling me this information. Another more important reason to
> check outcnt here is to prevent this optimization when the uses of
> VectorMaskCmp is greater than 1, because this optimization may not be
> worthwhile. For example:
>
> public static void testVectorMaskCmp() {
>     IntVector bv = IntVector.fromArray(I_SPECIES, ib, 0);
>     IntVector av = IntVector.fromArray(I_SPECIES, ia, 0);
>     VectorMask<Integer> m1 = av.compare(VectorOperators.NE, bv); // two uses
>     VectorMask<Integer> m2 = m1.not();
>     m1.intoArray(m, 0);
>     av.lanewise(VectorOperators.ABS, m2).intoArray(ia, 0);
> }
>
> If we do not check outcnt and still do this optimization, two VectorMaskCmp
> nodes will be generated, and finally two VectorMaskCmp instructions will be
> generated. This is unreasonable because VectorMaskCmp has much higher latency
> than xor instruction on aarch64.
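
To make the trade-off concrete, here is a minimal toy model of the two policies. It is only a sketch: `MaskNotFoldSketch`, its nested `Node` class, and both method names are hypothetical and are not the HotSpot `Node` API; the op strings simply label the nodes from the example above. `foldIfSingleUse` mirrors the outcnt check under discussion, and `foldIfAllUsersAreXor` is one possible relaxation, assuming every such Xor is a mask negation that would itself be folded so the original compare presumably ends up dead.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; names are hypothetical and unrelated to the real C2 classes.
final class MaskNotFoldSketch {

    // Minimal IR node that tracks its inputs and its users.
    static final class Node {
        final String op;
        final List<Node> inputs = new ArrayList<>();
        final List<Node> users = new ArrayList<>();

        Node(String op, Node... ins) {
            this.op = op;
            for (Node in : ins) {
                inputs.add(in);
                in.users.add(this);   // register this node as a user of its input
            }
        }

        int outcnt() {
            return users.size();
        }
    }

    // Mirrors the check under review: fold the negation into the compare
    // only when the compare has no user other than this Xor.
    static boolean foldIfSingleUse(Node xor) {
        if (!xor.op.equals("Xor") || xor.inputs.isEmpty()) {
            return false;
        }
        Node cmp = xor.inputs.get(0);
        return cmp.op.equals("VectorMaskCmp") && cmp.outcnt() == 1;
    }

    // Hypothetical relaxation: also fold when every user of the compare is
    // an Xor, on the assumption that all of them get folded as well.
    static boolean foldIfAllUsersAreXor(Node xor) {
        if (!xor.op.equals("Xor") || xor.inputs.isEmpty()) {
            return false;
        }
        Node cmp = xor.inputs.get(0);
        if (!cmp.op.equals("VectorMaskCmp")) {
            return false;
        }
        for (Node user : cmp.users) {
            if (!user.op.equals("Xor")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Shape of testVectorMaskCmp(): m1 feeds both m1.not() and a store.
        Node av = new Node("LoadVector");
        Node bv = new Node("LoadVector");
        Node m1 = new Node("VectorMaskCmp", av, bv);
        Node m2 = new Node("Xor", m1);
        new Node("StoreVector", m1);

        System.out.println("single-use policy folds: " + foldIfSingleUse(m2));
        System.out.println("all-users-Xor policy folds: " + foldIfAllUsersAreXor(m2));
    }
}
```

For the graph in the example, both policies decline to fold because the store keeps the original compare alive. They differ only when every user of the compare is an Xor.
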
Thanks, we can add this explanation as a comment in the code where we check
outcnt. What if all the other users are also XorNodes?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24674#discussion_r2059874975