On 12 December 2016 at 10:11, <srol...@vmware.com> wrote:
> From: Roland Scheidegger <srol...@vmware.com>
>
> By using a dst_type in the gather interface, gather has some more
> knowledge about how values should be fetched.
> E.g. if this is a 3x32bit fetch and dst_type is a 4x32bit vector, gather
> will no longer do a ZExt with a 96bit scalar value to 128bit, but
> just fetch the 96bit as a 3x32bit vector (this is still going to be
> 2 loads of course, but the loads can be done directly to a simd vector
> that way).
> Also, we can now try to use the right int/float type. This should
> make no difference really since there are typically no domain transition
> penalties for such simd loads, however it actually makes a difference
> since llvm will use different shuffle lowering afterwards, so the caller
> can use this to trick llvm into using sane shuffles afterwards (and yes
> llvm is really stupid there - nothing against using the shuffle
> instruction from the correct domain, but not at the cost of doing 3 times
> more shuffles; the case which actually matters is the refusal to use shufps
> for integer values).
> Also make some attempt to avoid things which look great on paper but llvm
> doesn't really handle (e.g. fetching 3-element 8 bit and 16 bit vectors,
> which is simply disastrous - I suspect the type legalizer is to blame, trying
> to extend these vectors to 128bit types somehow, so these are fetched with
> scalars like before, which is suboptimal due to the ZExt).
>
> Remove the ability for truncation (no point, this is gather, not conversion)
> as it is complex enough already.
>
> While here also implement not just the float, but also the 64bit avx2
> gathers (disabled though, since based on the theoretical numbers the benefit
> just isn't there at all until Skylake at least).

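
To make the 3x32bit case from the commit message concrete, here is a rough
sketch in plain C of what "two loads directly into the vector" amounts to.
This is not the gallivm code (which emits LLVM IR through the LLVM C API);
the vec4x32 type and the gather_3x32 function are hypothetical names made
up for illustration, and the struct only stands in for a 128bit simd
register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4 x 32bit destination, standing in for a 128bit simd register. */
typedef struct {
    uint32_t lane[4];
} vec4x32;

/*
 * Fetch three consecutive 32bit elements starting at base + offset.
 * The 96 bits are read as two loads (64 bits, then 32 bits) written
 * straight into the destination lanes, rather than being assembled as
 * one 96bit scalar and zero-extended to 128 bits. The unused fourth
 * lane is left at zero.
 */
static vec4x32 gather_3x32(const uint8_t *base, size_t offset)
{
    vec4x32 dst = { { 0, 0, 0, 0 } };

    assert(base != NULL);
    memcpy(&dst.lane[0], base + offset, 8);     /* first load: lanes 0 and 1 */
    memcpy(&dst.lane[2], base + offset + 8, 4); /* second load: lane 2 */
    return dst;
}
```

Calling gather_3x32(buf, 4) on a buffer holding the 32bit values
10, 20, 30, 40, 50 would give lanes 20, 30, 40, 0. The sketch says nothing
about int versus float domains or shuffle lowering, which only show up in
the generated machine code.
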
Hi Roland,

This breaks the build on big endian machines.

  CC       gallivm/lp_bld_gather.lo
  CC       gallivm/lp_bld_init.lo
gallivm/lp_bld_gather.c: In function 'lp_build_gather_elem_vec':
gallivm/lp_bld_gather.c:238:42: error: 'dst_elem_type' undeclared (first use in this function)
                            LLVMConstInt(dst_elem_type,
                                         ^
gallivm/lp_bld_gather.c:238:42: note: each undeclared identifier is reported only once for each function it appears in
gallivm/lp_bld_gather.c: In function 'lp_build_gather':

Dave.