On Fri, Mar 22, 2024 at 6:51 PM Fangrui Song <mask...@gcc.gnu.org> wrote:
>
> On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song <mask...@gcc.gnu.org> wrote:
> >
> > The relocation formats REL and RELA for ELF are inefficient. In a
> > release build of Clang for x86-64, .rela.* sections consume a
> > significant portion (approximately 20.9%) of the file size.
> >
> > I propose RELLEB, a new format offering significant file size
> > reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
> >
> > Your thoughts on RELLEB are welcome!
> >
> > Detailed analysis:
> > https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
> > generic ABI (ELF specification):
> > https://groups.google.com/g/generic-abi/c/yb0rjw56ORw
> > binutils feature request:
> > https://sourceware.org/bugzilla/show_bug.cgi?id=31475
> > LLVM:
> > https://discourse.llvm.org/t/rfc-relleb-a-compact-relocation-format-for-elf/77600
> >
> > Implementation primarily involves binutils changes. Any volunteers?
> > For GCC, a driver option like -mrelleb in my Clang prototype would be
> > needed. The option instructs the assembler to use RELLEB.
>
> The format was tentatively named RELLEB. As I refine the original pure
> LEB-based format, “RELLEB” might not be the most fitting name.
>
> I have switched to SHT_CREL/DT_CREL/.crel and updated
> https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
> and
> https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/eiBcYxSfAQAJ
>
> The new format is simpler and better than RELLEB even in the absence
> of the shifted offset technique.
>
> Dynamic relocations using CREL are even smaller than Android's packed
> relocations.
>
> // encodeULEB128(uint64_t, raw_ostream &os);
> // encodeSLEB128(int64_t, raw_ostream &os);
>
> Elf_Addr offsetMask = 8, offset = 0, addend = 0;
> uint32_t symidx = 0, type = 0;
> for (const Reloc &rel : relocs)
>   offsetMask |= rel.r_offset;
> int shift = std::countr_zero(offsetMask);
> encodeULEB128(relocs.size() * 4 + shift, os);
> for (const Reloc &rel : relocs) {
>   Elf_Addr deltaOffset = (rel.r_offset - offset) >> shift;
>   offset = rel.r_offset;
>   uint8_t b = deltaOffset * 8 + (symidx != rel.r_symidx) +
>               (type != rel.r_type ? 2 : 0) + (addend != rel.r_addend ? 4 : 0);
>   if (deltaOffset < 0x10) {
>     os << char(b);
>   } else {
>     os << char(b | 0x80);
>     encodeULEB128(deltaOffset >> 4, os);
>   }
>   if (b & 1) {
>     encodeSLEB128(static_cast<int32_t>(rel.r_symidx - symidx), os);
>     symidx = rel.r_symidx;
>   }
>   if (b & 2) {
>     encodeSLEB128(static_cast<int32_t>(rel.r_type - type), os);
>     type = rel.r_type;
>   }
>   if (b & 4) {
>     encodeSLEB128(std::make_signed_t<Elf_Addr>(rel.r_addend - addend), os);
>     addend = rel.r_addend;
>   }
> }
>
> ---
>
> While alternatives like PrefixVarInt (or a suffix-based variant) might
> excel when encoding larger integers, LEB128 offers advantages when
> most integers fit within one or two bytes, as it avoids the need for
> shift operations in the common one-byte representation.
>
> While we could utilize zigzag encoding (i>>31) ^ (i<<1) to convert
> SLEB128-encoded type/addend to use ULEB128 instead, the generated code
> is inferior to or on par with SLEB128 for one-byte encodings.
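
To make the quoted encoder easier to experiment with, here is a minimal,
self-contained sketch that pairs it with a decoder and round-trips a few
relocations. It is not the binutils or LLVM implementation: Reloc, Bytes,
encodeCrel, decodeCrel, and decodeLEB128 are names made up for this
illustration, output goes to a std::vector<uint8_t> instead of a
raw_ostream, and the decoder trusts its input (no bounds or overflow
checks).

```cpp
// Sketch only; all type and function names here are illustrative.
#include <cassert>
#include <cstdint>
#include <vector>

struct Reloc {
  uint64_t r_offset;
  uint32_t r_symidx, r_type;
  int64_t r_addend;
};
bool operator==(const Reloc &a, const Reloc &b) {
  return a.r_offset == b.r_offset && a.r_symidx == b.r_symidx &&
         a.r_type == b.r_type && a.r_addend == b.r_addend;
}
using Bytes = std::vector<uint8_t>;

void encodeULEB128(uint64_t v, Bytes &os) {
  do {
    uint8_t byte = v & 0x7f;
    v >>= 7;
    os.push_back(v ? byte | 0x80 : byte);  // bit 7 set: more bytes follow
  } while (v);
}

void encodeSLEB128(int64_t v, Bytes &os) {
  for (bool more = true; more;) {
    uint8_t byte = v & 0x7f;
    v >>= 7;  // assumes an arithmetic shift (guaranteed since C++20)
    // Done once the remaining bits are pure sign extension of bit 6.
    more = !((v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40)));
    os.push_back(more ? byte | 0x80 : byte);
  }
}

// Reads one LEB128 value at p and advances p; sign-extends if isSigned.
uint64_t decodeLEB128(const uint8_t *&p, bool isSigned) {
  uint64_t v = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    byte = *p++;
    v |= uint64_t(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  if (isSigned && shift < 64 && (byte & 0x40))
    v |= ~uint64_t(0) << shift;
  return v;
}

Bytes encodeCrel(const std::vector<Reloc> &relocs) {
  Bytes os;
  uint64_t offsetMask = 8, offset = 0, addend = 0;
  uint32_t symidx = 0, type = 0;
  for (const Reloc &rel : relocs)
    offsetMask |= rel.r_offset;
  int shift = 0;  // trailing zero bits shared by all offsets, at most 3
  while (!((offsetMask >> shift) & 1))
    ++shift;
  encodeULEB128(relocs.size() * 4 + shift, os);
  for (const Reloc &rel : relocs) {
    uint64_t deltaOffset = (rel.r_offset - offset) >> shift;
    // Bits 0-2: symidx/type/addend changed; bits 3-6: low delta offset bits.
    uint8_t b = uint8_t(deltaOffset * 8 + (symidx != rel.r_symidx) +
                        (type != rel.r_type ? 2 : 0) +
                        (addend != uint64_t(rel.r_addend) ? 4 : 0));
    os.push_back(deltaOffset < 0x10 ? b : b | 0x80);  // bit 7: more delta bits
    if (deltaOffset >= 0x10) encodeULEB128(deltaOffset >> 4, os);
    if (b & 1) encodeSLEB128(int32_t(rel.r_symidx - symidx), os);
    if (b & 2) encodeSLEB128(int32_t(rel.r_type - type), os);
    if (b & 4) encodeSLEB128(int64_t(rel.r_addend - addend), os);
    offset = rel.r_offset, symidx = rel.r_symidx;
    type = rel.r_type, addend = rel.r_addend;
  }
  return os;
}

std::vector<Reloc> decodeCrel(const Bytes &in) {
  const uint8_t *p = in.data();
  uint64_t hdr = decodeLEB128(p, false);  // count * 4 + shift
  int shift = hdr % 4;
  uint64_t offset = 0, addend = 0;
  uint32_t symidx = 0, type = 0;
  std::vector<Reloc> out;
  for (uint64_t n = hdr / 4; n; --n) {
    uint8_t b = *p++;
    uint64_t delta = (b >> 3) & 0xf;
    if (b & 0x80) delta |= decodeLEB128(p, false) << 4;
    offset += delta << shift;
    if (b & 1) symidx += uint32_t(decodeLEB128(p, true));
    if (b & 2) type += uint32_t(decodeLEB128(p, true));
    if (b & 4) addend += decodeLEB128(p, true);
    out.push_back({offset, symidx, type, int64_t(addend)});
  }
  assert(p == in.data() + in.size());  // every input byte was consumed
  return out;
}
```

For example, relocations at offsets 0x10, 0x18, and 0x20 share three
trailing zero bits, so shift is 3, and a relocation that repeats the
previous symbol, type, and addend at the next 8-byte slot encodes as the
single byte 0x08.
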
We can introduce a gas option --crel, so that users can specify
`gcc -Wa,--crel a.c` (-flto also gets -Wa, options).

I propose that we add another gas option, --implicit-addends-for-data
(does the name look good?), to allow non-code sections to use implicit
addends to save space (https://sourceware.org/PR31567).
Using implicit addends primarily benefits debug sections such as
.debug_str_offsets, .debug_names, .debug_addr, and .debug_line, but also
data sections such as .eh_frame, .data., .data.rel.ro, and .init_array.

-Wa,--implicit-addends-for-data can be used on its own (6.4% .o
reduction in a clang -g -g0 -gpubnames build) or together with CREL to
achieve an even larger size reduction: a single byte for most
.debug_* relocations!

With CREL, concerns about debug section relocations will become a thing
of the past.
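
To illustrate what "implicit addends" means here: as with REL-style
relocations, the addend is stored in the bytes being relocated rather
than in the relocation record. The sketch below is only a model of that
idea, not gas code. Rela, Rel, toImplicitAddends, and readImplicitAddend
are hypothetical names, and it assumes every relocated field is a
4-byte little-endian value holding a signed 32-bit addend.

```cpp
// Sketch only; all type and function names here are illustrative.
#include <cassert>
#include <cstdint>
#include <vector>

struct Rela { uint64_t r_offset; uint32_t r_symidx, r_type; int64_t r_addend; };
struct Rel { uint64_t r_offset; uint32_t r_symidx, r_type; };

// Stores each addend in the 4 bytes being relocated (little-endian) and
// returns relocation entries that no longer carry an explicit addend.
std::vector<Rel> toImplicitAddends(const std::vector<Rela> &relas,
                                   std::vector<uint8_t> &section) {
  std::vector<Rel> rels;
  for (const Rela &r : relas) {
    assert(r.r_offset + 4 <= section.size());
    uint32_t v = uint32_t(r.r_addend);  // assumes the addend fits in 32 bits
    for (int i = 0; i < 4; ++i)
      section[r.r_offset + i] = uint8_t(v >> (8 * i));
    rels.push_back({r.r_offset, r.r_symidx, r.r_type});
  }
  return rels;
}

// What a consumer would do to recover the addend from the section bytes.
int64_t readImplicitAddend(const std::vector<uint8_t> &section,
                           uint64_t offset) {
  uint32_t v = 0;
  for (int i = 0; i < 4; ++i)
    v |= uint32_t(section[offset + i]) << (8 * i);
  return int32_t(v);  // reinterpret as a signed 32-bit addend
}
```

Once the addend lives in the section, the relocation record has nothing
left to say about it. In the encoding quoted above, an entry whose
symbol, type, and addend match the previous entry and whose shifted
offset delta is below 16 takes a single byte.
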