* Peter Dimov:

> Even on a P4, inlining may enable compiler optimizations. One case is when
> the compiler can see that the return value of __sync_fetch_and_or (for
> instance) isn't used. It's possible to use a wait-free "lock or" instead of
> a "lock cmpxchg" loop (MSVC 8 does this for _InterlockedOr.)

You don't need inlining to optimize these cases. The compiler only needs to know precisely what the library implementations do, and it needs a couple of variants to choose between. GCC can already optimize printf ("hello world\n"); to puts ("hello world"); even though no inlining takes place.
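
To make the two cases concrete, here is a minimal sketch in C. The names flags, set_flag, set_flag_and_get_old, greet and demo are hypothetical and exist only for this illustration; __sync_fetch_and_or is the GCC builtin under discussion. What code is actually emitted depends on the compiler version, target and optimization level, so the comments describe what a compiler may do, not what it is guaranteed to do.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical shared flag word, used only for this illustration. */
static unsigned int flags;

/* The old value is discarded, so the compiler is free to pick a
   cheaper variant of the operation (on x86, a single "lock or"). */
void set_flag(unsigned int mask)
{
    (void) __sync_fetch_and_or(&flags, mask);
}

/* The old value is returned, so it has to be produced; on x86 this
   generally means a "lock cmpxchg" loop. */
unsigned int set_flag_and_get_old(unsigned int mask)
{
    return __sync_fetch_and_or(&flags, mask);
}

/* Constant format string, no conversions, trailing newline, result
   unused: GCC may replace this call with puts("hello world"). */
void greet(void)
{
    printf("hello world\n");
}

/* Small self-check: both variants have the same effect on flags.
   Returns the final value of flags. */
unsigned int demo(void)
{
    flags = 0;
    set_flag(0x1u);
    unsigned int old = set_flag_and_get_old(0x2u);
    assert(old == 0x1u);
    greet();
    return flags;
}
```

In both cases the choice is made from what the compiler knows about the semantics of the call and from whether the result is used; neither function body needs to be visible to it. Comparing the output of gcc -O2 -S for set_flag and set_flag_and_get_old should show whether a given GCC version makes the distinction.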