Re: CREL relocation format for ELF (was: RELLEB)

Fangrui Song via Gcc Thu, 28 Mar 2024 00:45:04 -0700

On Fri, Mar 22, 2024 at 6:51 PM Fangrui Song <mask...@gcc.gnu.org> wrote:
>
> On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song <mask...@gcc.gnu.org> wrote:
> >
> > The relocation formats REL and RELA for ELF are inefficient. In a
> > release build of Clang for x86-64, .rela.* sections consume a
> > significant portion (approximately 20.9%) of the file size.
> >
> > I propose RELLEB, a new format offering significant file size
> > reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
> >
> > Your thoughts on RELLEB are welcome!
> >
> > Detailed analysis:
> > https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
> > generic ABI (ELF specification):
> > https://groups.google.com/g/generic-abi/c/yb0rjw56ORw
> > binutils feature request: 
> > https://sourceware.org/bugzilla/show_bug.cgi?id=31475
> > LLVM: 
> > https://discourse.llvm.org/t/rfc-relleb-a-compact-relocation-format-for-elf/77600
> >
> > Implementation primarily involves binutils changes. Any volunteers?
> > For GCC, a driver option like -mrelleb in my Clang prototype would be
> > needed. The option instructs the assembler to use RELLEB.
>
> The format was tentatively named RELLEB. As I refine the original pure
> LEB-based format, “RELLEB” might not be the most fitting name.
>
> I have switched to SHT_CREL/DT_CREL/.crel and updated
> https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
> and
> https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/eiBcYxSfAQAJ
>
> The new format is simpler and better than RELLEB even in the absence
> of the shifted offset technique.
>
> Dynamic relocations using CREL are even smaller than Android's packed
> relocations.
>
> // encodeULEB128(uint64_t, raw_ostream &os);
> // encodeSLEB128(int64_t, raw_ostream &os);
>
> Elf_Addr offsetMask = 8, offset = 0, addend = 0;
> uint32_t symidx = 0, type = 0;
> for (const Reloc &rel : relocs)
>   offsetMask |= rel.r_offset;
> int shift = std::countr_zero(offsetMask);
> encodeULEB128(relocs.size() * 4 + shift, os);
> for (const Reloc &rel : relocs) {
>   Elf_Addr deltaOffset = (rel.r_offset - offset) >> shift;
>   offset = rel.r_offset;
>   uint8_t b = deltaOffset * 8 + (symidx != rel.r_symidx) +
>               (type != rel.r_type ? 2 : 0) + (addend != rel.r_addend ? 4 : 0);
>   if (deltaOffset < 0x10) {
>     os << char(b);
>   } else {
>     os << char(b | 0x80);
>     encodeULEB128(deltaOffset >> 4, os);
>   }
>   if (b & 1) {
>     encodeSLEB128(static_cast<int32_t>(rel.r_symidx - symidx), os);
>     symidx = rel.r_symidx;
>   }
>   if (b & 2) {
>     encodeSLEB128(static_cast<int32_t>(rel.r_type - type), os);
>     type = rel.r_type;
>   }
>   if (b & 4) {
>     encodeSLEB128(std::make_signed_t<Elf_Addr>(rel.r_addend - addend), os);
>     addend = rel.r_addend;
>   }
> }
>
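For reference, a decoder for the byte stream produced above could look
like the sketch below. It is not taken from any implementation: Reloc,
readULEB128, readSLEB128 and decodeCrel are illustrative names, and the
input is a plain std::vector<uint8_t> rather than an ELF section. It
assumes the encoder records each offset as a delta from the previous
relocation, and it follows the layout shown in this message, which may
differ from later revisions of the format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory record mirroring the fields the encoder reads.
struct Reloc {
  uint64_t r_offset;
  uint32_t r_symidx;
  uint32_t r_type;
  int64_t r_addend;
};

// Read one ULEB128 value starting at in[pos]; advances pos.
static uint64_t readULEB128(const std::vector<uint8_t> &in, size_t &pos) {
  uint64_t value = 0;
  for (unsigned shift = 0;; shift += 7) {
    assert(pos < in.size() && shift < 64);
    uint8_t byte = in[pos++];
    value |= uint64_t(byte & 0x7f) << shift;
    if (!(byte & 0x80))
      return value;
  }
}

// Read one SLEB128 value starting at in[pos]; advances pos.
static int64_t readSLEB128(const std::vector<uint8_t> &in, size_t &pos) {
  uint64_t value = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    assert(pos < in.size() && shift < 64);
    byte = in[pos++];
    value |= uint64_t(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  if (shift < 64 && (byte & 0x40))
    value |= ~uint64_t(0) << shift; // sign-extend from the last payload bit
  return static_cast<int64_t>(value);
}

std::vector<Reloc> decodeCrel(const std::vector<uint8_t> &in) {
  size_t pos = 0;
  uint64_t hdr = readULEB128(in, pos);
  uint64_t count = hdr / 4; // header is count * 4 + shift
  unsigned shift = hdr % 4;
  uint64_t offset = 0, addend = 0;
  uint32_t symidx = 0, type = 0;
  std::vector<Reloc> out;
  for (uint64_t i = 0; i < count; ++i) {
    assert(pos < in.size());
    uint8_t b = in[pos++];
    uint64_t delta = (b >> 3) & 0xf; // low 4 bits of the offset delta
    if (b & 0x80)                    // remaining bits follow as ULEB128
      delta |= readULEB128(in, pos) << 4;
    offset += delta << shift;
    if (b & 1)
      symidx += static_cast<uint32_t>(readSLEB128(in, pos));
    if (b & 2)
      type += static_cast<uint32_t>(readSLEB128(in, pos));
    if (b & 4)
      addend += static_cast<uint64_t>(readSLEB128(in, pos));
    out.push_back({offset, symidx, type, static_cast<int64_t>(addend)});
  }
  return out;
}
```

The symbol index, type and addend are kept as running values, so a
relocation that repeats the previous symbol, type and addend costs only
the single flag/offset byte.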
> ---
>
> While alternatives like PrefixVarInt (or a suffix-based variant) might
> excel when encoding larger integers, LEB128 offers advantages when
> most integers fit within one or two bytes, as it avoids the need for
> shift operations in the common one-byte representation.
>
> While we could utilize zigzag encoding (i>>31) ^ (i<<1) to convert
> SLEB128-encoded type/addend to use ULEB128 instead, the generated code
> is inferior to or on par with SLEB128 for one-byte encodings.



We can introduce a gas option --crel; users can then specify `gcc
-Wa,--crel a.c` (-Wa, options are also passed along with -flto).

I propose that we add another gas option --implicit-addends-for-data
(does the name look good?) to allow non-code sections to use implicit
addends to save space (https://sourceware.org/PR31567).
Using implicit addends primarily benefits debug sections such as
.debug_str_offsets, .debug_names, .debug_addr, .debug_line, but also
data sections such as .eh_frame, .data., .data.rel.ro, .init_array.
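As a reminder of what "implicit addend" means here, the sketch below
contrasts the two ways of resolving a 32-bit absolute relocation. The
function names and the use of a std::vector<uint8_t> as the section are
illustrative assumptions, and the field is read and written in host
byte order for simplicity.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Explicit addend (RELA-style): the addend travels in the relocation record.
inline void applyExplicit32(std::vector<uint8_t> &sec, uint64_t offset,
                            uint32_t symValue, int32_t addend) {
  assert(offset + 4 <= sec.size());
  uint32_t v = symValue + static_cast<uint32_t>(addend);
  std::memcpy(sec.data() + offset, &v, sizeof v);
}

// Implicit addend (REL-style): the assembler has already stored the addend
// in the bytes being relocated, so the record carries no addend field.
inline void applyImplicit32(std::vector<uint8_t> &sec, uint64_t offset,
                            uint32_t symValue) {
  assert(offset + 4 <= sec.size());
  uint32_t addend;
  std::memcpy(&addend, sec.data() + offset, sizeof addend);
  uint32_t v = symValue + addend;
  std::memcpy(sec.data() + offset, &v, sizeof v);
}
```

Both paths produce the same bytes; the difference is only where the
addend is stored before linking.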

-Wa,--implicit-addends-for-data can be used on its own (6.4% .o
reduction in a clang -g -g0 -gpubnames build) or together with
CREL to achieve an even larger size reduction: a single byte for
most .debug_* relocations!
With CREL, concerns about the size of debug section relocations will
become a thing of the past.
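To see where the single-byte figure comes from, the sketch below
computes the encoded size of a relocation list using the same layout as
the encoder quoted earlier. Reloc, ulebSize, slebSize and crelSize are
illustrative names. Implicit addends are modeled by setting r_addend to
0 in every record, and offsets are assumed to be sorted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory record mirroring the fields the encoder reads.
struct Reloc {
  uint64_t r_offset;
  uint32_t r_symidx;
  uint32_t r_type;
  int64_t r_addend;
};

// Number of bytes ULEB128 needs for v.
static size_t ulebSize(uint64_t v) {
  size_t n = 1;
  while (v >= 0x80) {
    v >>= 7;
    ++n;
  }
  return n;
}

// Number of bytes SLEB128 needs for v: one byte holds [-64, 63] and each
// further byte adds 7 bits (arithmetic right shift assumed).
static size_t slebSize(int64_t v) {
  size_t n = 1;
  while (v < -64 || v > 63) {
    v >>= 7;
    ++n;
  }
  return n;
}

// Size in bytes of the encoded stream for relocs.
size_t crelSize(const std::vector<Reloc> &relocs) {
  uint64_t offsetMask = 8, offset = 0, addend = 0;
  uint32_t symidx = 0, type = 0;
  for (const Reloc &rel : relocs)
    offsetMask |= rel.r_offset;
  int shift = 0; // count trailing zeros; bit 3 is always set, so shift <= 3
  while (!(offsetMask & 1)) {
    offsetMask >>= 1;
    ++shift;
  }
  size_t size = ulebSize(relocs.size() * 4 + shift);
  for (const Reloc &rel : relocs) {
    assert(rel.r_offset >= offset); // this sketch assumes sorted offsets
    uint64_t deltaOffset = (rel.r_offset - offset) >> shift;
    offset = rel.r_offset;
    size += 1; // flag byte carrying the low 4 bits of the delta
    if (deltaOffset >= 0x10)
      size += ulebSize(deltaOffset >> 4);
    if (symidx != rel.r_symidx) {
      size += slebSize(static_cast<int32_t>(rel.r_symidx - symidx));
      symidx = rel.r_symidx;
    }
    if (type != rel.r_type) {
      size += slebSize(static_cast<int32_t>(rel.r_type - type));
      type = rel.r_type;
    }
    if (addend != static_cast<uint64_t>(rel.r_addend)) {
      size += slebSize(
          static_cast<int64_t>(static_cast<uint64_t>(rel.r_addend) - addend));
      addend = static_cast<uint64_t>(rel.r_addend);
    }
  }
  return size;
}
```

When consecutive relocations reference the same symbol with the same
type and the addend field stays at 0, only the flag byte is emitted for
each of them, which is the common situation in sections such as
.debug_str_offsets once addends are implicit.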
