On Mon, Mar 19, 2012 at 8:54 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Mon, Mar 19, 2012 at 8:51 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> On Sun, Mar 18, 2012 at 1:55 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
>>> On Sun, Mar 18, 2012 at 5:01 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
>>>
>>>>> I am testing this patch.  OK for trunk if it passes all tests?
>>>>
>>>> No, force_reg will generate a pseudo, so this conversion is valid only
>>>> for !can_create_pseudo ().
>>>>
>>>> At least for *tls_initial_exec_x32_store, you will need a temporary to
>>>> split the pattern after reload.
>>
>> Here is the updated patch to add can_create_pseudo.  I also changed
>> tls_initial_exec_x32 to take an input register operand as thread pointer.
>>
>>> Please try attached patch. It simply throws away all recent
>>> complications w.r.t. to thread pointer and always handles TP in
>>> DImode.
>>>
>>> The testcase:
>>>
>>> --cut here--
>>> __thread int foo __attribute__ ((tls_model ("initial-exec")));
>>>
>>> void bar (int x)
>>> {
>>>  foo = x;
>>> }
>>>
>>> int baz (void)
>>> {
>>>  return foo;
>>> }
>>> --cut here--
>>>
>>> Now compiles to:
>>>
>>> bar:
>>>        movq    foo@gottpoff(%rip), %rax
>>>        movl    %edi, %fs:(%rax)
>>>        ret
>>>
>>> baz:
>>>        movq    foo@gottpoff(%rip), %rax
>>>        movl    %fs:(%rax), %eax
>>>        ret
>>>
>>> In effect, this always generates %fs(%rDI) and emits REX prefix before
>>> mov/add to satisfy brain-dead linkers.
>>>
>>> The patch is bootstrapping now on x86_64-pc-linux-gnu.
>>>
>>
>> For
>>
>> --
>> extern __thread char c;
>> extern char y;
>> void
>> ie (void)
>> {
>>  y = c;
>> }
>> --
>>
>> Your patch generates:
>>
>>        movl    %fs:0, %eax
>>        movq    c@gottpoff(%rip), %rdx
>>        movzbl  (%rax,%rdx), %edx
>>        movb    %dl, y(%rip)
>>        ret
>>
>> It can be optimized to:
>>
>>        movq    c@gottpoff(%rip), %rax
>>        movzbl  %fs:(%rax), %eax
>>        movb    %al, y(%rip)
>>        ret
>>
>
> Combine failed:
>
> (set (reg:QI 63 [ c ])
>    (mem/c:QI (plus:DI (zero_extend:DI (unspec:SI [
>                        (const_int 0 [0])
>                    ] UNSPEC_TP))
>            (mem/u/c:DI (const:DI (unspec:DI [
>                            (symbol_ref:SI ("c") [flags 0x60]
> <var_decl 0x7ffff19b8140 c>)
>                        ] UNSPEC_GOTNTPOFF)) [2 S8 A8])) [0 c+0 S1 A8]))
>
>

Wrong testcase.  It should be

--
extern __thread char c;
extern __thread short w;
extern char y;
extern short i;
void
ie (void)
{
  y = c;
  i = w;
}
--

I got

        movl    %fs:0, %eax
        movq    c@gottpoff(%rip), %rdx
        movzbl  (%rax,%rdx), %edx
        movb    %dl, y(%rip)
        movq    w@gottpoff(%rip), %rdx
        movzwl  (%rax,%rdx), %eax
        movw    %ax, i(%rip)
        ret

It can be

        movq    c@gottpoff(%rip), %rax
        movzbl  %fs:(%rax), %eax
        movb    %al, y(%rip)
        movq    w@gottpoff(%rip), %rax
        movzwl  %fs:(%rax), %eax
        movw    %ax, i(%rip)
        ret
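
To reproduce this outside the testsuite, a standalone variant of the
testcase may help.  This is only a sketch under a few assumptions: the
variables are defined in the same file (the testcase above declares them
extern) so that it links on its own, the tls_model attribute is added to
keep the initial-exec model despite the local definitions, and check_ie
is a hypothetical helper that is not part of the original testcase.

```c
#include <assert.h>

/* Assumption: defined here rather than declared extern, so the file
   links standalone.  The attribute requests the initial-exec model.  */
__thread char c __attribute__ ((tls_model ("initial-exec")));
__thread short w __attribute__ ((tls_model ("initial-exec")));
char y;
short i;

/* Same body as the testcase: two TLS loads of different widths.  */
void
ie (void)
{
  y = c;
  i = w;
}

/* Hypothetical helper: store known values in the TLS variables, run
   ie, and check that they were copied to the globals.  */
void
check_ie (void)
{
  c = 'A';
  w = 1234;
  ie ();
  assert (y == 'A');
  assert (i == 1234);
}
```

Compiling with something like "gcc -O2 -mx32 -S" (which assumes an
x32-capable toolchain) should show whether the thread pointer is still
loaded separately from %fs:0 or folded into each access as %fs:(%rax).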



-- 
H.J.
