[Bug ld/31795] ld.bfd makes ELFs of type ET_EXEC for static PIEs when load address is non-0

i at maskray dot me Tue, 28 May 2024 11:07:24 -0700

https://sourceware.org/bugzilla/show_bug.cgi?id=31795


--- Comment #65 from Fangrui Song <i at maskray dot me> ---
> > Changing ET_DYN to ET_EXEC fulfills the address requirement but disables
> > ASLR.
> > Is it intentional?
> 
> That's my understanding of reading the -Ttext-segment documentation.  The 
> question is whether we relax the semantic to have it as a minimum address or 
> define it as the expected address (thus disabling ASLR as a consequence). 
> 
> I don't have a strong opinion, but currently, Linux only enforces the former 
> (I think it is the main reason this makes some sense) so we will need to 
> discuss with kernel developers the expected semantics.

If there is a strong need for the address requirement (>=p_vaddr), the Linux
kernel and glibc could implement it.
However, this alone does not justify keeping the ld hack that sets ET_EXEC for
-pie -Ttext-segment=$non_zero.

> -Ttext-segment=0x600000000000 should create a binary which is guaranteed to be
> loaded at 0x600000000000.

-Ttext-segment sets the address of the first byte of the text segment, which
likely influences the p_vaddr member of a PT_LOAD segment.
When e_type is ET_EXEC, this address is also the virtual address of the first
memory area.
However, if e_type is ET_DYN, there's no guarantee of this address, and
fulfilling this request is left to the discretion of the loaders.
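
To make the e_type/p_vaddr distinction above concrete, here is a minimal
sketch that reads both fields from a file. It assumes a little-endian ELF64
input; the names describe_elf64 and address_is_fixed are made up for this
illustration and are not part of any existing tool.

```python
import struct
import sys

ET_EXEC, ET_DYN = 2, 3  # e_type values from the ELF specification
PT_LOAD = 1             # p_type of a loadable segment


def describe_elf64(data: bytes) -> dict:
    """Return e_type and the lowest PT_LOAD p_vaddr of a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    # Fixed offsets in the ELF64 header: e_type at 16, e_phoff at 32,
    # e_phentsize at 54, e_phnum at 56.
    (e_type,) = struct.unpack_from("<H", data, 16)
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    vaddrs = []
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # In an ELF64 program header, p_type is at 0 and p_vaddr at 16.
        (p_type,) = struct.unpack_from("<I", data, off)
        (p_vaddr,) = struct.unpack_from("<Q", data, off + 16)
        if p_type == PT_LOAD:
            vaddrs.append(p_vaddr)
    return {"e_type": e_type,
            "first_load_vaddr": min(vaddrs) if vaddrs else None}


def address_is_fixed(info: dict) -> bool:
    """Only ET_EXEC makes p_vaddr the actual run-time address."""
    return info["e_type"] == ET_EXEC


if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            info = describe_elf64(f.read())
        kind = {ET_EXEC: "ET_EXEC", ET_DYN: "ET_DYN"}.get(info["e_type"], "other")
        vaddr = info["first_load_vaddr"]
        shown = hex(vaddr) if vaddr is not None else "none"
        print(f"{path}: {kind}, first PT_LOAD p_vaddr={shown}, "
              f"fixed address: {address_is_fixed(info)}")
```

For an ET_DYN file the reported p_vaddr is only what the linker recorded; where
the image ends up is decided by the loader.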

Since ld offers the -no-pie flag, there's no need for a workaround to make -pie
behave similarly.
(In addition, DF_1_PIE with ET_EXEC is very odd.)

If a user desires both address>=0x600000000000 && ASLR, this could be achieved
if ET_DYN is used and loaders satisfy the address requirement.
However, retaining the ET_EXEC hack in ld would prevent the fulfillment of this
goal.
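
As an illustration of "address>=p_vaddr && ASLR", here is a sketch of one
hypothetical loader policy for ET_DYN: treat p_vaddr as a minimum and add a
random page-aligned bias. Neither the Linux kernel nor glibc is claimed to
work this way; pick_load_base, the page size, and the 28 bits of entropy are
all assumptions chosen for the example.

```python
import secrets

PAGE = 0x1000  # assumed page size


def pick_load_base(p_vaddr: int, entropy_bits: int = 28) -> int:
    """Hypothetical policy: p_vaddr is a lower bound, the bias is random."""
    if p_vaddr % PAGE:
        raise ValueError("p_vaddr must be page-aligned in this sketch")
    bias = secrets.randbits(entropy_bits) * PAGE  # page-aligned, >= 0
    return p_vaddr + bias


if __name__ == "__main__":
    for _ in range(3):
        print(hex(pick_load_base(0x600000000000)))
```

With ET_EXEC there is no bias to choose, which is why the ld hack rules this
combination out.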

> PIE is the only way to create a small mode executable loaded at 
> 0x600000000000.

This is an oversimplification.

The -mcmodel= flag imposes specific code generation restrictions that allow
relocatable files to be used in certain address space layouts after linking.
While it's (almost) a sufficient condition, it's not a necessary one.

Achieving high-address functionality doesn't necessitate -mcmodel=large. For
instance, you can use PIC symbol addressing (combining -mcmodel=small -fpie
with non-preemptible symbols) to achieve the same result.
If your code is larger than 2GiB, you can even use range extension thunks.

https://maskray.me/blog/2023-05-14-relocation-overflow-and-code-models#x86-64-code-models

    Similarly, for a function call, we no longer assume that the address of the
    function or its PLT entry is within the ±2GiB range from the program counter,
    so call callee cannot be used.

    Actually, call callee can still be used if we implement range extension
    thunks in the linker, unfortunately GCC/GNU ld did not pursue this direction.
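
The range argument can be checked with plain arithmetic. The sketch below
assumes x86-64, where a PC-relative operand is a signed 32-bit displacement
from the end of the instruction and a small-model absolute reference is a
sign-extended 32-bit value. The function names and the sample addresses are
made up for illustration.

```python
def fits_signed32(value: int) -> bool:
    """True if value is representable as a signed 32-bit integer."""
    return -(1 << 31) <= value < (1 << 31)


def pcrel32_reachable(pc_after_insn: int, target: int) -> bool:
    """PC-relative reference: only the distance to the target matters."""
    return fits_signed32(target - pc_after_insn)


def abs32s_reachable(target: int) -> bool:
    """Sign-extended 32-bit absolute reference: the address itself must fit."""
    return fits_signed32(target)


if __name__ == "__main__":
    text = 0x600000000000              # load address from the discussion
    caller = text + 0x1139             # sample code address inside the image
    near = text + 0x2000               # target in the same image
    far = text + (3 << 30)             # target 3GiB away (would need a thunk)

    print("absolute 32-bit reference to near:", abs32s_reachable(near))
    print("PC-relative reference to near:    ", pcrel32_reachable(caller, near))
    print("PC-relative reference to far:     ", pcrel32_reachable(caller, far))
```

The load address drops out of the PC-relative case entirely, which is why
-mcmodel=small -fpie code can run at a high address as long as the image
itself stays within the ±2GiB range.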

> There are 2 issues with -mcmodel=large:
> 
> 1. Since there are no -mcmodel=large run-time libraries, you can't use 
> -mcmodel=large
> to create any meaningful binaries.
> 2. -mcmodel=large performance is much slower.

True. (1) is an ecosystem issue with -mcmodel=large.
However, this point is unrelated to the ld ET_EXEC hack.

> I think BFD can use the emulation (-m) option for this, since, for instance, 
> gcc will pass -maarch64linux for aarch64-linux-gnu.

I don't think this is necessary.

`-pie -Wl,-no-pie` works today (lld doesn't even need `--no-relax`).
Therefore, there isn't a strong argument for retaining the ET_EXEC hack.

    % gcc -pie -Wl,-no-pie a.c -fuse-ld=bfd \
        -Wl,--no-relax,-Ttext-segment=0x600000000000 -o a
    % ./a
    0x600000001139
    % ./a
    0x600000001139  # no ASLR
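
The "run it twice and compare" check above can be automated. This is a minimal
sketch; looks_randomized is a hypothetical helper, and it assumes the program
under test prints an address (or anything else derived from its load address)
on stdout.

```python
import subprocess
import sys


def collect_outputs(cmd, runs=3):
    """Run cmd several times and return the stripped stdout of each run."""
    return [
        subprocess.run(cmd, capture_output=True, text=True, check=True).stdout.strip()
        for _ in range(runs)
    ]


def looks_randomized(cmd, runs=3):
    """True if at least two runs printed different output."""
    return len(set(collect_outputs(cmd, runs))) > 1


if __name__ == "__main__":
    cmd = sys.argv[1:] or ["./a"]
    verdict = "varies between runs" if looks_randomized(cmd) else "identical every run"
    print(" ".join(cmd) + ": output " + verdict)
```

Identical output over a few runs is only evidence of a fixed address, not a
proof; checking e_type in the file is the direct test.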

