Hi,
When dynamically linking, a fast enough machine hides the latency, but when
statically linking or on slower devices this change caused a 5x increase in
instruction count and a 2x increase in cycle count before getting to main.
I have looked at ways to fix that. The problem is that with static
linking, unwinding tables are registered dynamically, and with my patch
that registration triggers an eager sort of the FDE lists, whereas
previously the lists were sorted only when the first exception was
thrown. If an application throws at least one exception there is no
downside to eager sorting, but if the application never throws there is
overhead.
The obvious way to improve the situation is to make sorting faster.
Replacing the split+sort+merge logic with a radix sort (which can be
done without additional memory) gives the following timings for your

#include <cstdio>
int main() {}

example (measured with perf stat -r 200):
# pre-patch version, fast
0,06 msec task-clock
272.286 cycles
464.754 instructions
# post-patch version, slow
0,21 msec task-clock
972.876 cycles
3.079.515 instructions
# +radix sort, in the middle
0,13 msec task-clock
604.697 cycles
1.702.930 instructions
The radix sort clearly improves things, but it does not fully eliminate
the overhead.
The question is how much we care about that situation (i.e., static
linking, exceptions registered but never thrown). I could change the
code to recognize three states instead of two: no exceptions registered,
exceptions registered but never thrown, and full exception mode. But
that would increase code complexity, and it would pessimize applications
that do throw, as we then need more checks to guard against concurrent
changes. It makes the code more complex and a bit slower, which is why I
am hesitant. But I can implement that if you think we need it. Or we
just replace the sort, which is probably a good idea anyway.
Best
Thomas