On Wed, Jan 29, 2025 at 04:02:34PM -0800, Andrii Nakryiko wrote:
> On Tue, Jan 28, 2025 at 6:02 PM Josh Poimboeuf <jpoim...@kernel.org> wrote:
> I'm not sure about this chunked lookup approach for arbitrary user
> space applications. Those executable sections can be a) big and b)
> discontiguous. E.g., one of the production binaries I looked at. Here
> are its three main executable sections:
>
> ...
> [17] .bolt.org.text PROGBITS 000000000b00e640 0ae0d640
>      0000000011ad621c 0000000000000000 AX 0 0 64
> ...
> [48] .text PROGBITS 000000001e600000 1ce00000
>      0000000000775dd8 0000000000000000 AX 0 0 2097152
> [49] .text.cold PROGBITS 000000001ed75e00 1d575e00
>      00000000007d3271 0000000000000000 AX 0 0 64
> ...
>
> Total text size is about 300MB:
> >>> 0x0000000000775dd8 + 0x00000000007d3271 + 0x0000000011ad621c
> 312603237
>
> Section #17 ends at:
>
> >>> hex(0x0000000011ad621c + 0x000000000b00e640)
> '0x1cae485c'
>
> While .text starts at 000000001e600000, so we have a gap of ~28MB:
>
> >>> 0x000000001e600000 - 0x1cae485c
> 28424100
>
> So unless we do something more clever to support multiple
> discontiguous chunks, this seems like a bad fit for user space.

Nothing clever needed, we could just have multiple sframe sections, each
one with a pointer to its text segment.  That would also have the
benefit of allowing the sframe data to be much more compact for the
noncontiguous cases.

> I think having all this just binary searchable is already a big win
> anyways and should be plenty fast, no?

Sframe is trying to compete with frame pointers which are MUCH faster.
3-4x faster in my testing, not including the page faults (which tend to
only affect performance in the very beginning).

--
Josh
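
As a rough illustration of the multiple-sections idea above, lookup could
be a binary search over section start addresses followed by a binary
search inside the matching section. This is only a sketch under
assumptions: SframeSection, find_section and find_func are hypothetical
names, not the sframe format or any kernel API. The section addresses are
the ones from the quoted section table; the function offsets are made up.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class SframeSection:
    """Hypothetical: one sframe section tied to one contiguous text range."""
    text_start: int
    text_size: int
    # Sorted function start offsets, relative to text_start. Keeping them
    # relative to the section's own text keeps the values small even when
    # the text ranges are far apart.
    func_offsets: list = field(default_factory=list)

    @property
    def text_end(self):
        return self.text_start + self.text_size


def find_section(sections, ip):
    """Return the section covering ip, or None if ip falls in a gap.

    sections must be sorted by text_start and must not overlap.
    """
    starts = [s.text_start for s in sections]
    i = bisect_right(starts, ip) - 1
    if i >= 0 and ip < sections[i].text_end:
        return sections[i]
    return None


def find_func(sections, ip):
    """Return the absolute start address of the function containing ip."""
    sec = find_section(sections, ip)
    if sec is None:
        return None
    j = bisect_right(sec.func_offsets, ip - sec.text_start) - 1
    if j < 0:
        return None
    return sec.text_start + sec.func_offsets[j]


# Text ranges from the quoted section table; func_offsets are invented.
SECTIONS = [
    SframeSection(0x0B00E640, 0x11AD621C, [0x0, 0x80]),    # .bolt.org.text
    SframeSection(0x1E600000, 0x00775DD8, [0x0, 0x40, 0x100]),  # .text
    SframeSection(0x1ED75E00, 0x007D3271, [0x0]),          # .text.cold
]
```

An address inside the gap between section #17 and .text, for example
0x1d000000, matches no section, so the gap itself costs nothing in the
lookup tables.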