On Mon, 20 Nov 2023 01:34:57 GMT, Xiaohong Gong <[email protected]> wrote:

> > > BTW, I have two questions:
> > > 
> > > 1. An intrinsic which should accept the vector as index like non-subword 
> > > gather is more beneficial in real applications. See: [8287289: 
> > > Gather/Scatter with Index Vector 
> > > panama-vector#201](https://github.com/openjdk/panama-vector/pull/201) 
> > > please.
> > > 2. Do you have the plan for adding such optimization for subword scatter 
> > > in future?
> > > 
> > > Thanks, Xiaohong
> > 
> > 
> > I agree, the proposal looks reasonable to me, but given that the x86 ISA 
> > does not have a direct sub-word gather instruction, we will always need to 
> > pass an index array to the inline expander. The existing interface provisions 
> > passing both an index array and a vector.
> 
> So in the x86 backend implementation, are the indexes finally stored into a 
> vector register? Per my understanding, it looks that way. If so, maybe an 
> alternative is 1) just making the intrinsics accept an index vector like 
> non-subword types, and 2) calling several times such load-gather intrinsics 
> in java implementation of the subword gather (e.g. 4 load-gather for byte 
> gather with int indexes). That means we can move the complex operations to 
> java side, and compiler should only cover a single load-gather operation. 
> This may make the subword unify with non-subword gathers in 
> compiler/intrinsics side.

Maybe it was not clear from my previous comments: for x86 and other targets 
which do not support a direct sub-word gather, the backend will need an index 
array. For other targets there are two options: either a target-specific 
lowering of the gather IR / extending the inline expander to emit a specialized 
IR, or accepting the penalty of multiple index vector loads if that still wins 
over the existing fallback implementation. In addition, the lane size 
incompatibility b/w the gather vector lanes and the index lanes will pose 
challenges for masked gather operations.
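
To make the lane size point concrete, here is a rough scalar model in plain 
Java. It is only a sketch under assumptions, not the Vector API and not the 
intrinsic from this patch: the class and method names are hypothetical, and it 
assumes 128-bit vectors, i.e. 16 byte lanes per gather vector and 4 int lanes 
per index vector. It shows that one byte gather consumes four index vectors and 
that the byte-lane mask has to be sliced per chunk.

```java
import java.util.Arrays;

// Scalar model only: hypothetical names, not the Vector API or the actual intrinsic.
final class SubwordGatherSketch {
    // Assumed shapes: a 128-bit byte vector has 16 lanes, a 128-bit int index vector has 4.
    static final int BYTE_LANES = 16;
    static final int INT_LANES = 4;

    // Models one load-gather that consumes a single int index vector (INT_LANES indexes)
    // and fills result lanes [laneBase, laneBase + INT_LANES).
    static void gatherChunk(byte[] src, int offset, int[] indexMap, int laneBase,
                            boolean[] mask, byte[] result) {
        for (int i = 0; i < INT_LANES; i++) {
            int lane = laneBase + i;
            // The mask is defined per byte lane, so every chunk needs its own slice of it.
            if (mask[lane]) {
                result[lane] = src[offset + indexMap[lane]];
            }
        }
    }

    // Models a full byte gather as BYTE_LANES / INT_LANES chunked gathers.
    static byte[] gatherBytes(byte[] src, int offset, int[] indexMap, boolean[] mask) {
        byte[] result = new byte[BYTE_LANES]; // lanes with an unset mask bit stay zero
        for (int laneBase = 0; laneBase < BYTE_LANES; laneBase += INT_LANES) {
            gatherChunk(src, offset, indexMap, laneBase, mask, result);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] src = new byte[64];
        for (int i = 0; i < src.length; i++) {
            src[i] = (byte) i;
        }
        int[] indexMap = new int[BYTE_LANES];
        boolean[] mask = new boolean[BYTE_LANES];
        for (int i = 0; i < BYTE_LANES; i++) {
            indexMap[i] = 3 * i;      // strided indexes
            mask[i] = (i % 2 == 0);   // gather even lanes only
        }
        System.out.println(Arrays.toString(gatherBytes(src, 1, indexMap, mask)));
    }
}
```

In this model each chunk needs its own index vector load and its own slice of 
the mask, which is the extra cost a backend would have to absorb.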

On the other hand, since the patch already demonstrates a performance gain, 
backends for other targets can be implemented along the same lines.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16354#issuecomment-1818256345
