On Fri, 27 Oct 2023 06:42:07 GMT, Xiaohong Gong <xg...@openjdk.org> wrote:
>> sub-word gather do not emit any predicated instructions, thus only
>> VectorMaskUseLoad is relevant in this context, however AVX512 and SVE does
>> have a direct predicated gather instructions for 32/64 bit types.
>
> I see, thanks! `VecMaskUsePred` is added to check
> `match_rule_supported_vector_masked` for normal vector ops. That's because we
> may add an additional mask input for those vector ops. But
> `Load|StoreVectorScatterMasked` are different. They point to the masked
> operations no matter how they are implemented. So just `VecMaskUseLoad` is
> fine for all these two ops for me.

I think it's better to align the masked sub-word gather implementation with the non-sub-word ones, i.e. support intrinsification only for predicated targets. The respective backends may then choose either to emit a predicated loop, like the one this patch emits, or to directly emit a predicated instruction if the target supports it. With this we may see some performance penalty for masked sub-word gathers on non-predicated targets, since the original Java implementation will now become the fallback code, but the same penalty exists for non-sub-word gathers today.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1375522151