On Mon, Jul 7, 2014 at 6:40 PM, Richard Henderson <r...@redhat.com> wrote: > On 07/03/2014 02:53 AM, Evgeny Stupachenko wrote: >> -expand_vec_perm_palignr (struct expand_vec_perm_d *d) >> +expand_vec_perm_palignr (struct expand_vec_perm_d *d, int insn_num) > > insn_num might as well be "bool avx2", since it's only ever set to two values.
Agree. However: after the alignment, one operand permutation could be just move and take 2 instructions for AVX2 as well for AVX2 there could be other cases when the scheme takes 4 or 5 instructions we can leave it for potential avx512 extension > >> - /* Even with AVX, palignr only operates on 128-bit vectors. */ >> - if (!TARGET_SSSE3 || GET_MODE_SIZE (d->vmode) != 16) >> + /* SSSE3 is required to apply PALIGNR on 16 bytes operands. */ >> + if (GET_MODE_SIZE (d->vmode) == 16) >> + { >> + if (!TARGET_SSSE3) >> + return false; >> + } >> + /* AVX2 is required to apply PALIGNR on 32 bytes operands. */ >> + else if (GET_MODE_SIZE (d->vmode) == 32) >> + { >> + if (!TARGET_AVX2) >> + return false; >> + } >> + /* Other sizes are not supported. */ >> + else >> return false; > > And you'd better check it up here because... > Correct. The following should resolve the issue: /* For AVX2 we need more than 2 instructions when the alignment by itself does not produce the desired permutation. */ if (TARGET_AVX2 && insn_num <= 2) return false; >> + /* For SSSE3 we need 1 instruction for palignr plus 1 for one >> + operand permutaoin. */ >> + if (insn_num == 2) >> + { >> + ok = expand_vec_perm_1 (&dcopy); >> + gcc_assert (ok); >> + } >> + /* For AVX2 we need 2 instructions for the shift: vpalignr and >> + vperm plus 4 instructions for one operand permutation. */ >> + else if (insn_num == 6) >> + { >> + ok = expand_vec_perm_vpshufb2_vpermq (&dcopy); >> + gcc_assert (ok); >> + } >> + else >> + ok = false; >> return ok; > > ... down here you'll simply ICE from the gcc_assert. > > > r~