Hello! 2012-11-04 Vladimir Yakovlev <vladimir.b.yakov...@intel.com>
* mode-switching.c (create_pre_exit): Added code for maybe_builtin_apply case. * config/i386/i386-protos.h (emit_i387_cw_initialization): Deleted. (emit_vzero): Added prototype. (ix86_mode_entry): Likewise. (ix86_mode_exit): Likewise. (ix86_emit_mode_set): Likewise. * config/i386/i386.c (VALID_AVX256_REG_OR_OI_MODE): New. (typedef struct block_info_def): Deleted. (define BLOCK_INFO): Deleted. (check_avx256_stores): Added checking for MEM_P. (move_or_delete_vzeroupper_2): Deleted. (move_or_delete_vzeroupper_1): Deleted. (move_or_delete_vzeroupper): Deleted. (ix86_maybe_emit_epilogue_vzeroupper): Deleted. (function_pass_avx256_p): Deleted. (ix86_function_ok_for_sibcall): Deleted disabling sibcall. (nit_cumulative_args): Deleted initialization of of avx256 fields of cfun->machine. (ix86_emit_restore_sse_regs_using_mov): Deleted vzeroupper generation. (ix86_expand_epilogue): Likewise. (is_vzeroupper): New. (is_vzeroall): New. (ix86_avx_u128_mode_needed): New. (ix86_i387_mode_needed): Renamed ix86_mode_needed. (ix86_mode_needed): New. (ix86_avx_u128_mode_after): New. (ix86_mode_after): New. (ix86_avx_u128_mode_entry): New. (ix86_mode_entry): New. (ix86_avx_u128_mode_exit): New. (ix86_mode_exit): New. (ix86_emit_vzeroupper): New. (ix86_emit_mode_set): New. (ix86_expand_call): Deleted vzeroupper generation. (ix86_split_call_vzeroupper): Deleted. (ix86_init_machine_status): Initialzed optimize_mode_switching. (ix86_expand_special_args_builtin): Changed. (ix86_reorg): Deleted a call of move_or_delete_vzeroupper. * config/i386/i386.h (AVX_U128): New. (avx_u128_state): New. (NUM_MODES_FOR_MODE_SWITCHING): Added AVX_U128_ANY. (MODE_AFTER): New. (MODE_ENTRY): New. (MODE_EXIT): New. (EMIT_MODE_SET): Changed. (machine_function): Deleted avx256 fields. * config/i386/i386.md (UNSPEC_CALL_NEEDS_VZEROUPPER): Deleted. (define_insn_and_split "*call_vzeroupper"): Deleted. (define_insn_and_split "*call_rex64_ms_sysv_vzeroupper"): Deleted. (define_insn_and_split "*sibcall_vzeroupper"): Deleted. (define_insn_and_split "*call_pop_vzeroupper"): Deleted. (define_insn_and_split "*sibcall_pop_vzeroupper"): Deleted. (define_insn_and_split "*call_value_vzeroupper"): Deleted. (define_insn_and_split "*sibcall_value_vzeroupper"): Deleted. (define_insn_and_split "*call_value_rex64_ms_sysv_vzeroupper"): Deleted. (define_insn_and_split "*call_value_pop_vzeroupper"): Deleted. (define_insn_and_split "*sibcall_value_pop_vzeroupper"): Deleted. (define_expand "return"): Deleted vzeroupper emitting. (define_expand "simple_return"): Deleted. 2012-11-04 Vladimir Yakovlev <vladimir.b.yakov...@intel.com> * gcc.target/i386/avx-vzeroupper-5.c: Changed scan-assembler-times. gcc.target/i386/avx-vzeroupper-8.c: Likewise. gcc.target/i386/avx-vzeroupper-9.c: Likewise. gcc.target/i386/avx-vzeroupper-10.c: Likewise. gcc.target/i386/avx-vzeroupper-11.c: Likewise. gcc.target/i386/avx-vzeroupper-12.c: Likewise. gcc.target/i386/avx-vzeroupper-19.c: Likewis. gcc.target/i386/avx-vzeroupper-27.c: New. Target part (without mode-switching.c change) is OK for mainline, with a few small changes below: +#define VALID_AVX256_REG_OR_OI_MODE(m) (VALID_AVX256_REG_MODE (m) || (m) == OImode) enum upper_128bits_state Put this definition in i386.h, after VALID_AVX256_REG_MODE. +static void +ix86_emit_vzeroupper (void) +{ + emit_insn (gen_avx_vzeroupper (GEN_INT (9))); +} No need to pass argument to vzeroupper anymore. We have only one vzeroupper type now, so following definition in sse.md could also be changed from: (define_insn "avx_vzeroupper" [(unspec_volatile [(match_operand 0 "const_int_operand")] UNSPECV_VZEROUPPER)] to: (define_insn "avx_vzeroupper" [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)] Please call gen_avx_vzeroupper directly, so ix86_emit_vzeroupper wrapper function can be simply deleted. +/* Check insn for vzeroupper intrinsic. */ + +static bool +is_vzeroupper (rtx pat) +{ + return pat + && GET_CODE (pat) == UNSPEC_VOLATILE + && XINT (pat, 1) == UNSPECV_VZEROUPPER; +} + +/* Check insn for vzeroall intrinsic. */ + +static bool +is_vzeroall (rtx pat) +{ + return pat + && GET_CODE (pat) == PARALLEL + && GET_CODE (XVECEXP (pat, 0, 0)) == UNSPEC_VOLATILE + && XINT (XVECEXP (pat, 0, 0), 1) == UNSPECV_VZEROALL; +} These should be put in predicates.md. This can be in a follow-up patch. case VOID_FTYPE_VOID: if (icode == CODE_FOR_avx_vzeroupper) - target = GEN_INT (vzeroupper_intrinsic); + target = GEN_INT (9); emit_insn (GEN_FCN (icode) (target)); return 0; Please use: case VOID_FTYPE_VOID: emit_insn (GEN_FCN (icode) ()); return 0; Otherwise other VOID_FTYPE_VOID patterns will get excessive argument. -/* { dg-final { scan-assembler-not "avx_vzeroupper" } } */ +/* { dg-final { scan-assembler-times "avx_vzeroupper" 3 } } */ (... and a couple of similar testsuite changes ...) These asm scans were put there for a reason. I assume you have looked at these differences and are correct (this also implies that current vzeroupper placement code is not optimal or even wrong). I will split out the mode-switching part and re-post it to mailing list with an explanation. After this change is approved, please commit the patch to mainline SVN with requested changes. Thanks, Uros.