On Tue, Jun 21, 2016 at 11:22 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Tue, Jun 21, 2016 at 2:40 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> On Mon, Jun 20, 2016 at 12:46 PM, Richard Sandiford
>> <rdsandif...@googlemail.com> wrote:
>>> Uros Bizjak <ubiz...@gmail.com> writes:
>>>> On Mon, Jun 20, 2016 at 9:19 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>>>>> On Mon, Jun 20, 2016 at 12:13 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
>>>>>> On Mon, Jun 20, 2016 at 7:05 PM, H.J. Lu <hongjiu...@intel.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> This patch implements the alternate code sequence recommended in
>>>>>>>
>>>>>>> https://groups.google.com/forum/#!topic/x86-64-abi/de5_KnLHxtI
>>>>>>>
>>>>>>> to load external function address via GOT slot with
>>>>>>>
>>>>>>> movq func@GOTPCREL(%rip), %rax
>>>>>>>
>>>>>>> so that linker won't create an PLT entry for extern function
>>>>>>> address.
>>>>>>>
>>>>>>> Tested on x86-64.  OK for trunk?
>>>>>>
>>>>>>> +  else if (ix86_force_load_from_GOT_p (op1))
>>>>>>> +    {
>>>>>>> +      /* Load the external function address via the GOT slot to
>>>>>>> +        avoid PLT.  */
>>>>>>> +      op1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op1),
>>>>>>> +                           (TARGET_64BIT
>>>>>>> +                            ? UNSPEC_GOTPCREL
>>>>>>> +                            : UNSPEC_GOT));
>>>>>>> +      op1 = gen_rtx_CONST (Pmode, op1);
>>>>>>> +      op1 = gen_const_mem (Pmode, op1);
>>>>>>> +      /* This symbol must be referenced via a load from the Global
>>>>>>> +        Offset Table.  */
>>>>>>> +      set_mem_alias_set (op1, ix86_GOT_alias_set ());
>>>>>>> +      op1 = convert_to_mode (mode, op1, 1);
>>>>>>> +      op1 = force_reg (mode, op1);
>>>>>>> +      emit_insn (gen_rtx_SET (op0, op1));
>>>>>>> +      /* Generate a CLOBBER so that there will be no REG_EQUAL note
>>>>>>> +        on the last insn to prevent cse and fwprop from replacing
>>>>>>> +        a GOT load with a constant.  */
>>>>>>> +      rtx tmp = gen_reg_rtx (Pmode);
>>>>>>> +      emit_clobber (tmp);
>>>>>>> +      return;
>>>>>>
>>>>>> Jeff, is this the recommended way to prevent CSE, as far as RTL
>>>>>> infrastructure is concerned? I didn't find any example of this
>>>>>> approach with other targets.
>>>>>>
>>>>>
>>>>> FWIW, the similar approach is used in ix86_expand_vector_move_misalign,
>>>>> ix86_expand_convert_uns_didf_sse and ix86_expand_vector_init_general
>>>>> as well as other targets:
>>>>>
>>>>> frv/frv.c:  emit_clobber (op0);
>>>>> frv/frv.c:  emit_clobber (op1);
>>>>> im32c/m32c.c:  /*  emit_clobber (gen_rtx_REG (HImode, R0L_REGNO)); */
>>>>> s390/s390.c:  emit_clobber (addr);
>>>>> s390/s390.md:  emit_clobber (reg0);
>>>>> s390/s390.md:  emit_clobber (reg1);
>>>>> s390/s390.md:  emit_clobber (reg0);
>>>>> s390/s390.md:  emit_clobber (reg0);
>>>>> s390/s390.md:  emit_clobber (reg1);
>>>>
>>>> These usages mark the whole register as being "clobbered"
>>>> (=undefined), before only a part of register is written, e.g.:
>>>>
>>>>       emit_clobber (int_xmm);
>>>>       emit_move_insn (gen_lowpart (DImode, int_xmm), input);
>>>>
>>>> They aren't used to prevent unwanted CSE.
>>>
>>> Since it's being called in the move expander, I thought the normal
>>> way of preventing the constant being rematerialised would be to reject
>>> it in the move define_insn predicates.
>>>
>>> FWIW, I agree that using a clobber for this is going to be fragile.
>>>
>>
>> Here is the patch without clobber.  Tested on x86-64.  OK for
>> trunk?
>
> No, your patch has multiple problems:
>
> 1. It won't work for e.g. &bar+1, since you have to legitimize the
> symbol in two places of ix86_expand_move. Also, why use TARGET_64BIT
> in
>
> +      op1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op1),
> +    (TARGET_64BIT
> +     ? UNSPEC_GOTPCREL
> +     : UNSPEC_GOT));
>
> when ix86_force_load_from_GOT_p rejects non-64bit targets?
>
> 2. New check should be added to ix86_legitimate_constant_p, not to
> predicates of move insn patterns. Unfortunately, we still have to
> change x86_64_immediate_operand in two places.
>
> I have attached my version of the patch. It handles all your
> testcases, plus &foo+1 case. Bootstrap is still running.
>
> Does the patch work for you?

It works.

Thanks.

-- 
H.J.

Reply via email to