It would be nice to have optimizer hints for critical sections -
sections that should be optimized at the expense of the code
surrounding them.

        pthread_mutex_lock(&m);
        critical section;
        pthread_mutex_unlock(&m);

Things like register spills, .p2align padding and jumps that the
compiler needs to insert should preferably end up outside the critical
section.

I imagine this could be done by giving pthread_mutex_lock() weak
variants of __attribute__((cold)) for the code before the call and
((hot)) for the code after it, and the inverse for
pthread_mutex_unlock().

Whether an exit point is hot or not can depend on the result code
though.  E.g.  pthread_mutex_trylock(&m) would be hot only when
returning 0.  Well, so is pthread_mutex_lock, but there it makes sense
to default to "hot" if the error code is not checked or if the compiler
cannot figure out how the error code is used.  Such a default is likely
wrong for trylock.
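
As a rough sketch of what can be expressed today (this is not the hint
asked for above), the failure path of trylock can be moved into a
function marked cold, so that only the "returned 0" path stays inline
next to the critical section.  The names try_increment, lock_busy,
counter and contended are made up for illustration; only the GCC
attributes and the pthread calls are real.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static long counter;    /* protected by m */
static long contended;  /* hypothetical statistic, updated on the cold path */

/* Hypothetical helper: failure handling kept out of line and marked cold. */
__attribute__((cold, noinline))
static int lock_busy(int err)
{
    __atomic_fetch_add(&contended, 1, __ATOMIC_RELAXED);
    return err;
}

/* Returns 0 if the counter was incremented, else the trylock error code. */
int try_increment(void)
{
    int err = pthread_mutex_trylock(&m);
    if (err != 0)
        return lock_busy(err);  /* cold: critical section not entered */

    counter++;                  /* critical section */
    err = pthread_mutex_unlock(&m);
    assert(err == 0);
    return 0;
}
```

This only tells the compiler which branch is unlikely.  It says nothing
about where spills or padding should go relative to the lock and unlock
calls, which is the part that still needs a new hint.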

-fprofile-use may not work right for these optimizations.  If anything,
it may pessimize a critical section which is not entered often because
the surrounding code works at avoiding entry into it.
__builtin_expect() may not help either, since it is not only branches at
the function call as such that need to be optimized, but also e.g. the
placement of an unconditional jump which has to go somewhere - inside or
outside the critical section, regardless of branches or no branches.

The compiler can already optimize entry into the critical section
somewhat if the program branches on whether entry succeeded.  That does
not help near the exit from a critical section though.  And it would be
nice to optimize this when there is no profile data or __builtin_expect.

These hints should not be too strong - they should not pessimize the
code outside the critical section too much, since critical sections may
be nested and the outside of one may be the inside of another.


There are some other cases which could use this too.  E.g.  execvp()
is likely "cold" on return but not on entry.   The manual suggests e.g.
perror could be "cold", but that's not right for the branch leading
up to execvp - since it's just the return which is unlikely.
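
A sketch of how close one can get with the existing attribute, assuming
GCC: the code reached only when execvp() returns is put in a cold
function, while the call itself stays on the normal path.  run_program
and exec_failed are hypothetical names, and the return value 127 just
follows the shell convention for "command not found".

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: reached only if execvp() returned, i.e. failed. */
__attribute__((cold, noinline))
static int exec_failed(const char *file)
{
    perror(file);
    return 127;
}

/* Replaces the process image; returns only on failure. */
int run_program(char *const argv[])
{
    execvp(argv[0], argv);        /* entry is on the normal path */
    return exec_failed(argv[0]);  /* cold: execvp only returns on error */
}
```

The branch leading up to run_program() is left alone here, which is the
point: only the return from execvp is unlikely.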

In that regard, it'd also be useful to have an attribute or something
which just cancels out __attribute__((cold)) but does not make anything
"hot".

-- 
Hallvard
