On Thu, Oct 17, 2013 at 12:47 PM, Kirill Yukhin <kirill.yuk...@gmail.com> wrote: > >> > I suspect gen_lowpart is bad turn when reload is completed, as >> > far as it can create new pseudo. gen_lowpart () may call >> > gen_reg_rtx (), which contain corresponging gcc_assert (). >> >> False. gen_lowpart is perfectly safe post-reload. >> Indeed, taking the subreg of a hard register should arrive >> >> x = gen_rtx_REG_offset (op, outermode, final_regno, final_offset); >> >> in simplify_subreg. >> >> Have you encountered some specific problem with gen_lowpart? > Yes. > > Patch [8/8] contains testsuite for AVX-512. > This pattern is covered as well. > > When trying to do so: > > (define_insn_and_split "vec_extract_lo_v32hi" > [(set (match_operand:V16HI 0 "nonimmediate_operand" "=v,m") > (vec_select:V16HI > (match_operand:V32HI 1 "nonimmediate_operand" "vm,v") > (parallel [(const_int 0) (const_int 1) > (const_int 2) (const_int 3) > (const_int 4) (const_int 5) > (const_int 6) (const_int 7) > (const_int 8) (const_int 9) > (const_int 10) (const_int 11) > (const_int 12) (const_int 13) > (const_int 14) (const_int 15)])))] > "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))" > "#" > "&& reload_completed" > [(const_int 0)] > { > rtx op1 = operands[1]; > op1 = gen_lowpart (V16HImode, op1); > emit_move_insn (operands[0], op1); > DONE; > }) > > I've got ICE, with following bt: > > #1 0x00000000006f28d6 in gen_reg_rtx (mode=V32HImode) at > /export/users/kyukhin/gcc/git/gcc/gcc/emit-rtl.c:866 > #2 0x000000000070243a in copy_to_reg (x=(reg:V32HI 21 xmm0 [163])) at > /export/users/kyukhin/gcc/git/gcc/gcc/explow.c\ > :606 > #3 0x000000000091dfb8 in gen_lowpart_general (mode=V16HImode, x=<optimized > out>) > at /export/users/kyukhin/gcc/git/gcc/gcc/rtlhooks.c:50 > #4 0x0000000000ce16e8 in gen_split_4943 (curr_insn=<optimized out>, > operands=0x16f6320) > at /export/users/kyukhin/gcc/git/gcc/gcc/config/i386/sse.md:6329 > #5 0x00000000006f7865 in try_split (pat=(set (reg:V16HI 23 xmm2 [164]) > (vec_select:V16HI (reg:V32HI 21 xmm0 [163]) > (parallel [ > (const_int 0 [0]) > (const_int 1 [0x1]) > (const_int 2 [0x2]) > (const_int 3 [0x3]) > (const_int 4 [0x4]) > (const_int 5 [0x5]) > (const_int 6 [0x6]) > (const_int 7 [0x7]) > (const_int 8 [0x8]) > (const_int 9 [0x9]) > (const_int 10 [0xa]) > (const_int 11 [0xb]) > (const_int 12 [0xc]) > (const_int 13 [0xd]) > (const_int 14 [0xe]) > (const_int 15 [0xf]) > ]))), trial=(insn 48 46 49 6 (set (reg:V16HI 23 xmm2 [164]) > (vec_select:V16HI (reg:V32HI 21 xmm0 [163]) > (parallel [ > (const_int 0 [0]) > (const_int 1 [0x1]) > (const_int 2 [0x2]) > (const_int 3 [0x3]) > (const_int 4 [0x4]) > (const_int 5 [0x5]) > (const_int 6 [0x6]) > (const_int 7 [0x7]) > (const_int 8 [0x8]) > (const_int 9 [0x9]) > (const_int 10 [0xa]) > (const_int 11 [0xb]) > (const_int 12 [0xc]) > (const_int 13 [0xd]) > (const_int 14 [0xe]) > (const_int 15 [0xf]) > ]))) > /export/users/kyukhin/gcc/git/gcc/gcc/testsuite/gcc.target/i386/avx512f-vec-unpack.c:24 > 2151 {ve\ > c_extract_lo_v32hi} > (nil)), last=<optimized out>) > at /export/users/kyukhin/gcc/git/gcc/gcc/emit-rtl.c:3467 > > So, we have: > [rtlhooks.c:50]gen_lowpart_general () -> > [explow.c:606]copy_to_reg () -> > [emit-rtl.c:866]gen_reg_rtx (): gcc_assert (can_create_pseudo_p ()); > > Maybe the code in the pattern is buggy? Or is it a gen_lowpart?
I think that original approach with gen_rtx_REG is correct and follows established practice in sse.md (please grep for gen_reg_RTX in sse.md). If this approach is necessary due to the deficiency of gen_lowpart, then the fix to gen_lowpart should be proposed in a follow-up patch. Uros.