Normally this is automatic in the size restrictions that are placed on vector sizes coming from the implementation. However, for the legitimate size tuple [oprsz=8, maxsz=32], we need to clear the final 24 bytes of the vector register. Without this check, do_dup selects TCG_TYPE_V128 and clears only 16 bytes.
Signed-off-by: Richard Henderson <richard.hender...@linaro.org> --- tcg/tcg-op-gvec.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 22db1590d5..61c25f5784 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -287,8 +287,11 @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs, in units of LNSZ. This limits the expansion of inline code. */ static inline bool check_size_impl(uint32_t oprsz, uint32_t lnsz) { - uint32_t lnct = oprsz / lnsz; - return lnct >= 1 && lnct <= MAX_UNROLL; + if (oprsz % lnsz == 0) { + uint32_t lnct = oprsz / lnsz; + return lnct >= 1 && lnct <= MAX_UNROLL; + } + return false; } static void expand_clr(uint32_t dofs, uint32_t maxsz); -- 2.17.1