This looks alright to me. There are quite a few places where this will increase the memory requirements a bit, namely wherever variables are sized or aligned by LP_MAX_VECTOR_WIDTH, but there's no way around that (well, short of deciding alignments/allocations at runtime, which is probably not worth the trouble).
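To put a rough number on the memory cost, here is a small sketch (not code from the tree). Only the two macro values come from the patch; the helper names static_lane_bytes and runtime_lane_bytes are made up for illustration, and the per-lane size of 32 bits mirrors the LP_MAX_VECTOR_WIDTH/32 expressions the patch uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as set by the patch under review. */
#define LP_MAX_VECTOR_WIDTH 512
#define LP_MIN_VECTOR_ALIGN 64

/* Hypothetical helper: bytes taken by a statically sized array that
 * holds one 32-bit element per lane of the widest allowed vector. */
static size_t
static_lane_bytes(unsigned max_vector_width)
{
   return (max_vector_width / 32) * sizeof(uint32_t);
}

/* Hypothetical helper: bytes that would be needed if the array were
 * sized at runtime from the native vector width instead. */
static size_t
runtime_lane_bytes(unsigned native_vector_width)
{
   assert(native_vector_width <= LP_MAX_VECTOR_WIDTH);
   return (native_vector_width / 32) * sizeof(uint32_t);
}
```

With the old maximum of 256, static_lane_bytes() gives 32 bytes per such array; with 512 it gives 64, even on a machine whose native width is still 256 and which would only need runtime_lane_bytes(256) == 32. That doubling is the "bit more memory" mentioned above.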
Reviewed-by: Roland Scheidegger <srol...@vmware.com>

On 11.10.2017 at 17:50, Tim Rowley wrote:
> Increase the max allowed vector size from 256 to 512.
>
> No piglit llvmpipe regressions running on avx2.
>
> Cc: Dave Airlie <airl...@redhat.com>
> Cc: Jose Fonseca <jfons...@vmware.com>
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 14 +++++++-------
>  src/gallium/auxiliary/gallivm/lp_bld_type.h     |  4 ++--
>  2 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> index de18f629cd..97efc3a399 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> @@ -1272,9 +1272,9 @@ emit_fetch_constant(
>  /**
>   * Fetch 64-bit values from two separate channels.
>   * 64-bit values are stored split across two channels, like xy and zw.
> - * This function creates a set of 16 floats,
> + * This function creates a set of vec_length*2 floats,
>   * extracts the values from the two channels,
> - * puts them in the correct place, then casts to 8 64-bits.
> + * puts them in the correct place, then casts to vec_length 64-bits.
>   */
>  static LLVMValueRef
>  emit_fetch_64bit(
> @@ -1289,9 +1289,9 @@ emit_fetch_64bit(
>     LLVMValueRef res;
>     struct lp_build_context *bld_fetch = stype_to_fetch(bld_base, stype);
>     int i;
> -   LLVMValueRef shuffles[16];
> +   LLVMValueRef shuffles[2 * (LP_MAX_VECTOR_WIDTH/32)];
>     int len = bld_base->base.type.length * 2;
> -   assert(len <= 16);
> +   assert(len <= (2 * (LP_MAX_VECTOR_WIDTH/32)));
>
>     for (i = 0; i < bld_base->base.type.length * 2; i+=2) {
>        shuffles[i] = lp_build_const_int32(gallivm, i / 2);
> @@ -1691,7 +1691,7 @@ emit_fetch_deriv(
>  }
>
>  /**
> - * store an array of 8 64-bit into two arrays of 8 floats
> + * store an array of vec-length 64-bit into two arrays of vec_length floats
>   * i.e.
>   * value is d0, d1, d2, d3 etc.
>   * each 64-bit has high and low pieces x, y
> @@ -1710,8 +1710,8 @@ emit_store_64bit_chan(struct lp_build_tgsi_context
> *bld_base,
>     struct lp_build_context *float_bld = &bld_base->base;
>     unsigned i;
>     LLVMValueRef temp, temp2;
> -   LLVMValueRef shuffles[8];
> -   LLVMValueRef shuffles2[8];
> +   LLVMValueRef shuffles[LP_MAX_VECTOR_WIDTH/32];
> +   LLVMValueRef shuffles2[LP_MAX_VECTOR_WIDTH/32];
>
>     for (i = 0; i < bld_base->base.type.length; i++) {
>        shuffles[i] = lp_build_const_int32(gallivm, i * 2);
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_type.h
> b/src/gallium/auxiliary/gallivm/lp_bld_type.h
> index afe8722b05..62f1f85461 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_type.h
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_type.h
> @@ -59,7 +59,7 @@ extern unsigned lp_native_vector_width;
>   * Should only be used when lp_native_vector_width isn't available,
>   * i.e. sizing/alignment of non-malloced variables.
>   */
> -#define LP_MAX_VECTOR_WIDTH 256
> +#define LP_MAX_VECTOR_WIDTH 512
>
>  /**
>   * Minimum vector alignment for static variable alignment
> @@ -67,7 +67,7 @@ extern unsigned lp_native_vector_width;
>   * It should always be a constant equal to LP_MAX_VECTOR_WIDTH/8. An
>   * expression is non-portable.
>   */
> -#define LP_MIN_VECTOR_ALIGN 32
> +#define LP_MIN_VECTOR_ALIGN 64
>
>  /**
>   * Several functions can only cope with vectors of length up to this value.
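For reference, the shuffle index pattern that the resized arrays have to hold can be sketched in plain C, without any LLVM calls. This is an illustration under assumptions, not the gallivm code: the quoted hunks only show the even-slot index (i / 2) for the fetch and the first array (i * 2) for the store, so the second index in each loop below is assumed from how a two-input vector shuffle selects lanes (indices 0..length-1 pick from the first input, length..2*length-1 from the second). The function names and MAX_LANES are made up.

```c
#include <assert.h>

/* Value as set by the patch; MAX_LANES is a hypothetical shorthand. */
#define LP_MAX_VECTOR_WIDTH 512
#define MAX_LANES (LP_MAX_VECTOR_WIDTH / 32)

/* Fetch direction: build the 2*length mask that interleaves two
 * length-wide channels into one vector of 2*length 32-bit pieces.
 * Even slots take lane i/2 of the first channel (as in the hunk);
 * odd slots take the same lane of the second channel (assumed). */
static void
fetch_64bit_mask(unsigned length, int mask[2 * MAX_LANES])
{
   unsigned i;
   assert(length * 2 <= 2 * MAX_LANES);
   for (i = 0; i < length * 2; i += 2) {
      mask[i] = (int)(i / 2);
      mask[i + 1] = (int)(i / 2 + length);
   }
}

/* Store direction: split 2*length 32-bit pieces back into two
 * length-wide channels. The first mask picks the even pieces (as in
 * the hunk), the second picks the odd pieces (assumed). */
static void
store_64bit_masks(unsigned length, int first[MAX_LANES], int second[MAX_LANES])
{
   unsigned i;
   assert(length <= MAX_LANES);
   for (i = 0; i < length; i++) {
      first[i] = (int)(i * 2);
      second[i] = (int)(i * 2 + 1);
   }
}
```

With length 16 (a 512-bit vector of 32-bit lanes) the fetch mask needs 32 entries and each store mask needs 16, which is exactly what 2 * (LP_MAX_VECTOR_WIDTH/32) and LP_MAX_VECTOR_WIDTH/32 evaluate to after the patch; the old hard-coded 16 and 8 would overflow.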