On Tue, Apr 29, 2025 at 12:22 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Tue, Apr 29, 2025 at 5:30 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Tue, Apr 29, 2025 at 9:56 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > Don't expand UNSPEC_TLS_LD_BASE to a call so that the RTL local copy
> > > propagation pass can eliminate multiple __tls_get_addr calls.
> >
> > __tls_get_addr needs to be called with 16-byte aligned stack, I don't
> > think the compiler will correctly handle required call alignment if
> > you emit the call without emit_libcall_block.
>
> ix86_split_tls_local_dynamic_base_64 generates the same sequence
> as emit_libcall_block.  stack alignment is handled by
>
> (define_expand "@tls_local_dynamic_base_64_<mode>"
>   [(set (match_operand:P 0 "register_operand")
>         (unspec:P
>          [(match_operand 1 "constant_call_address_operand")
>           (reg:P SP_REG)]
>          UNSPEC_TLS_LD_BASE))]
>   "TARGET_64BIT"
>   "ix86_tls_descriptor_calls_expanded_in_cfun = true;")

The above is to align the initial %rsp at the beginning of the
function. When PUSH instructions in the function misalign %rsp, there
will be nothing to keep %rsp aligned before the call to
__tls_get_addr. We have been bitten by this in the past.

Uros.
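
For readers following the alignment argument, here is a minimal C sketch of the arithmetic only. It is not GCC code: the helper names and the example address are hypothetical, and it assumes the x86-64 System V convention, where %rsp must be a multiple of 16 at the call instruction, so that %rsp is 8 modulo 16 on entry to the callee. It models each PUSH as subtracting 8 from %rsp.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if the simulated %rsp is 16-byte aligned. */
static int is_aligned16(uint64_t rsp)
{
    return (rsp & 15) == 0;
}

/* Hypothetical helper: simulated %rsp after n 8-byte PUSH instructions. */
static uint64_t push_n(uint64_t rsp, unsigned n)
{
    return rsp - 8ull * n;
}

/* Hypothetical helper: bytes that would have to be subtracted from the
   simulated %rsp to restore 16-byte alignment before a call. */
static unsigned pad_before_call(uint64_t rsp)
{
    return (unsigned)(rsp & 15);
}
```

With an assumed entry value of 0x1008 (8 modulo 16), one PUSH leaves the simulated %rsp aligned, while a second PUSH misaligns it again; a call to __tls_get_addr emitted at that point would need 8 bytes of padding that nothing in this model supplies.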