On Tue, Jun 21, 2016 at 11:22 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Tue, Jun 21, 2016 at 2:40 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> On Mon, Jun 20, 2016 at 12:46 PM, Richard Sandiford
>> <rdsandif...@googlemail.com> wrote:
>>> Uros Bizjak <ubiz...@gmail.com> writes:
>>>> On Mon, Jun 20, 2016 at 9:19 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>>>>> On Mon, Jun 20, 2016 at 12:13 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
>>>>>> On Mon, Jun 20, 2016 at 7:05 PM, H.J. Lu <hongjiu...@intel.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> This patch implements the alternate code sequence recommended in
>>>>>>>
>>>>>>> https://groups.google.com/forum/#!topic/x86-64-abi/de5_KnLHxtI
>>>>>>>
>>>>>>> to load an external function address via its GOT slot with
>>>>>>>
>>>>>>>   movq func@GOTPCREL(%rip), %rax
>>>>>>>
>>>>>>> so that the linker won't create a PLT entry for the extern function
>>>>>>> address.
>>>>>>>
>>>>>>> Tested on x86-64.  OK for trunk?
>>>>>>
>>>>>>> +  else if (ix86_force_load_from_GOT_p (op1))
>>>>>>> +    {
>>>>>>> +      /* Load the external function address via the GOT slot to
>>>>>>> +	 avoid PLT.  */
>>>>>>> +      op1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op1),
>>>>>>> +			    (TARGET_64BIT
>>>>>>> +			     ? UNSPEC_GOTPCREL
>>>>>>> +			     : UNSPEC_GOT));
>>>>>>> +      op1 = gen_rtx_CONST (Pmode, op1);
>>>>>>> +      op1 = gen_const_mem (Pmode, op1);
>>>>>>> +      /* This symbol must be referenced via a load from the Global
>>>>>>> +	 Offset Table.  */
>>>>>>> +      set_mem_alias_set (op1, ix86_GOT_alias_set ());
>>>>>>> +      op1 = convert_to_mode (mode, op1, 1);
>>>>>>> +      op1 = force_reg (mode, op1);
>>>>>>> +      emit_insn (gen_rtx_SET (op0, op1));
>>>>>>> +      /* Generate a CLOBBER so that there will be no REG_EQUAL note
>>>>>>> +	 on the last insn to prevent cse and fwprop from replacing
>>>>>>> +	 a GOT load with a constant.  */
>>>>>>> +      rtx tmp = gen_reg_rtx (Pmode);
>>>>>>> +      emit_clobber (tmp);
>>>>>>> +      return;
>>>>>>
>>>>>> Jeff, is this the recommended way to prevent CSE, as far as RTL
>>>>>> infrastructure is concerned?  I didn't find any example of this
>>>>>> approach with other targets.
>>>>>>
>>>>>
>>>>> FWIW, a similar approach is used in ix86_expand_vector_move_misalign,
>>>>> ix86_expand_convert_uns_didf_sse and ix86_expand_vector_init_general,
>>>>> as well as in other targets:
>>>>>
>>>>> frv/frv.c:   emit_clobber (op0);
>>>>> frv/frv.c:   emit_clobber (op1);
>>>>> m32c/m32c.c: /* emit_clobber (gen_rtx_REG (HImode, R0L_REGNO)); */
>>>>> s390/s390.c: emit_clobber (addr);
>>>>> s390/s390.md: emit_clobber (reg0);
>>>>> s390/s390.md: emit_clobber (reg1);
>>>>> s390/s390.md: emit_clobber (reg0);
>>>>> s390/s390.md: emit_clobber (reg0);
>>>>> s390/s390.md: emit_clobber (reg1);
>>>>
>>>> These usages mark the whole register as being "clobbered"
>>>> (= undefined) before only a part of the register is written, e.g.:
>>>>
>>>>   emit_clobber (int_xmm);
>>>>   emit_move_insn (gen_lowpart (DImode, int_xmm), input);
>>>>
>>>> They aren't used to prevent unwanted CSE.
>>>
>>> Since it's being called in the move expander, I thought the normal
>>> way of preventing the constant being rematerialised would be to reject
>>> it in the move define_insn predicates.
>>>
>>> FWIW, I agree that using a clobber for this is going to be fragile.
>>>
>>
>> Here is the patch without the clobber.  Tested on x86-64.  OK for
>> trunk?
>
> No, your patch has multiple problems:
>
> 1. It won't work for e.g. &bar+1, since you have to legitimize the
> symbol in two places of ix86_expand_move.  Also, why use TARGET_64BIT
> in
>
> +      op1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op1),
> +			    (TARGET_64BIT
> +			     ? UNSPEC_GOTPCREL
> +			     : UNSPEC_GOT));
>
> when ix86_force_load_from_GOT_p rejects non-64-bit targets?
>
> 2. The new check should be added to ix86_legitimate_constant_p, not to
> the predicates of the move insn patterns.  Unfortunately, we still have
> to change x86_64_immediate_operand in two places.
>
> I have attached my version of the patch.  It handles all your
> testcases, plus the &foo+1 case.  Bootstrap is still running.
>
> Does the patch work for you?
It works.  Thanks.

-- 
H.J.