https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80881

--- Comment #86 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Julian Waters from comment #83)

> Uros: I see, I'll try to do so. I was mainly avoiding that to break less
> code (I have a habit of doing that to anything I touch). Although, the
> resulting assembly (Barring the register selection) already seems to be as
> compact as possible for Windows, I'm not sure how using get_thread_pointer
> could make it any more optimal. This is a genuinely curious question, not
> placing doubt on whether get_thread_pointer can help optimize the resulting
> assembly

I can speak from the Linux perspective: when the thread pointer is modelled as
an UNSPEC, the generic part of the compiler can optimize access to the location,
as shown in Comment #81. Many optimizations are performed there, and following
the current implementation ensures that your target won't be left behind when a
new generic optimization is introduced.

That said, looking at your code in Comment #83, it seems that on Windows TLS
access can't use a gs:-prefixed address (similar to Linux with
-mno-tls-direct-seg-refs). If this is the case, then generating an UNSPEC via
get_thread_pointer is not beneficial, since the UNSPEC can't be combined into
an address.
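
For comparison on the Linux side, here is a minimal sketch of a TLS access.
The names counter and bump_counter are hypothetical and are not taken from any
patch in this PR. Compiling it on x86 Linux with -O2 -S, once with default
flags and once with -mno-tls-direct-seg-refs, should show the difference: in
the first case the access is typically folded into a segment-prefixed address,
while in the second the thread pointer is first loaded into a register and the
variable is then addressed relative to it.

```c
/* Thread-local counter: each thread has its own copy, which the
   compiler reaches through the thread pointer.  */
static __thread int counter;

/* Hypothetical helper for illustration only: increments the calling
   thread's copy of the counter and returns the new value.  */
int
bump_counter (void)
{
  counter += 1;
  return counter;
}
```

The exact instructions depend on the target, the TLS model and the compiler
version, so treat the generated assembly as illustrative only.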

Your thread pointer is generated with:

+  tp = gen_const_mem (Pmode, GEN_INT (TARGET_64BIT ? 88 : 44));
+  set_mem_addr_space (tp, DEFAULT_TLS_SEG_REG);

which is in fact what UNSPEC_TP will be split into in the split1 pass.
