Issue 138584
Summary [lld] Unsorted loadable program headers with `-Ttext=` argument
Labels
Assignees
Reporter pskrgag
The ELF spec says that loadable program headers must be sorted by `p_vaddr`:

> Loadable segment entries in the program header table appear in ascending order,
> sorted on the p_vaddr member

However, lld does not guarantee this ordering when the `-Ttext` argument is used.

Consider the following example:

```c
int main(void)
{
    while (1);
}
```

```bash
~/Documents/compiler_ws/lld
paskripkin > ~/Documents/git/llvm-project/build/bin/ld.lld -e main -Ttext=0x800000 test.o -o a.out

~/Documents/compiler_ws/lld
paskripkin > llvm-readelf -l a.out

Elf file type is EXEC (Executable file)
Entry point 0x800000
There are 5 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x000118 0x000118 R   0x8
  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x000158 0x000158 R   0x1000
  LOAD           0x001000 0x0000000000800000 0x0000000000800000 0x00000d 0x00000d R E 0x1000
  LOAD           0x001010 0x0000000000801010 0x0000000000801010 0x00003c 0x00003c R   0x1000
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x0

 Section to Segment mapping:
  Segment Sections...
   00
   01
   02     .text
   03     .eh_frame
   04
 None   .comment .symtab .shstrtab .strtab
```

Note that the first loadable segment has a `p_vaddr` less than the second one. (This still reproduces on current master c50cba6275271fba69be661b9ec0665b2be88dbc.)

## Sort of analysis

The first loadable segment is the one that contains the ELF header and the program headers. It is added in `Writer<ELFT>::createPhdrs`:
```cpp
    // Add the headers. We will remove them if they don't fit.
    // In the other partitions the headers are ordinary sections, so they don't
    // need to be added here.
    if (isMain) {
      load = addHdr(PT_LOAD, flags);
      load->add(ctx.out.elfHeader.get());
      load->add(ctx.out.programHeaders.get());
    }
```

In the normal case, it gets merged with the sections that follow into a single segment because of the following code:

```cpp
    bool sameLMARegion =
        load && !sec->lmaExpr && sec->lmaRegion == load->firstSec->lmaRegion;
    if (load && sec != relroEnd &&
        sec->memRegion == load->firstSec->memRegion &&
        (sameLMARegion || load->lastSec == ctx.out.programHeaders.get()) &&
        (ctx.script->hasSectionsCommand || sec->type == SHT_NOBITS ||
         load->lastSec->type != SHT_NOBITS)) {
      load->p_flags |= newFlags;
    } else {
      load = addHdr(PT_LOAD, newFlags);
      flags = newFlags;
    }
```

However, if `-Ttext` is specified, the predicate above may fail and a new segment will be inserted. Then, during `LinkerScript::assignAddresses`, the first `PT_LOAD` segment gets assigned to `ctx.target->getImageBase()`, which may be far from the base of `.text` (IIUC lld starts calculating from the base passed in `-Ttext`).

## Note
This is not a synthetic example. It was observed in real life with a very strict ELF loader.
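
For illustration, the check that such a strict loader might perform can be sketched roughly as follows. This is only a minimal sketch, not the code of the loader in question: the `Phdr` struct and the `loadSegmentsSorted` helper are hypothetical names, and only the fields relevant to ordering are modeled. The `PT_LOAD` value (1) comes from the ELF spec.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for an ELF program header.
// Only the fields needed for the ordering check are modeled.
struct Phdr {
  uint32_t type;  // p_type, e.g. PT_LOAD
  uint64_t vaddr; // p_vaddr
};

constexpr uint32_t kPtLoad = 1; // PT_LOAD as defined by the ELF spec

// Returns true if all PT_LOAD entries appear in ascending p_vaddr order.
// Non-loadable entries (PT_PHDR, PT_GNU_STACK, ...) are ignored.
bool loadSegmentsSorted(const std::vector<Phdr> &phdrs) {
  bool seenLoad = false;
  uint64_t prevVaddr = 0;
  for (const Phdr &p : phdrs) {
    if (p.type != kPtLoad)
      continue;
    if (seenLoad && p.vaddr < prevVaddr)
      return false; // a later PT_LOAD starts below an earlier one
    prevVaddr = p.vaddr;
    seenLoad = true;
  }
  return true;
}
```

A loader with a check like this would reject any image in which the headers' `PT_LOAD` segment ends up at a higher address than a segment that follows it in the table.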
