On 09/12/2018 04:17 PM, Thomas Gleixner wrote:
> On Wed, 12 Sep 2018, Florian Weimer wrote:
>> On 09/09/2018 10:05 PM, Thomas Gleixner wrote:
>>> See the patch below. It's integrating TAI without slowing down everything
>>> and it definitely does not result in indirect calls.
>>> On a HSW it slows down clock_gettime() by ~0.5ns. On a SKL I get a speedup
>>> of ~0.5ns. On an AMD Epyc server it's a 1.2ns speedup. So it somehow depends
>>> on the uarch and I also observed compiler-version-dependent variations.
>> Does this mean glibc can keep using a single vDSO entrypoint, the one we
>> have today?
>
> We have no intention to change that.

Okay, I was wondering because Andy seemed to have proposed just that.

> But we surely could provide separate entry points as an extra to avoid a
> bunch of conditionals.

We could adjust to that, but the benefit would be long-term because it's
an ABI change for glibc, and they tend to take a long time to propagate.
But I must say that clock_gettime is an odd place to start. I would
have expected any of the type-polymorphic multiplexer interfaces (fcntl,
ioctl, ptrace, futex) to be a more natural starting point. 8-)
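
To make the distinction concrete, here is a minimal sketch in C of the two shapes under discussion: one multiplexed entry point that dispatches on the clock id, and separate per-clock entry points that need no conditionals. The demo_* names are hypothetical and are not vDSO symbols; both variants simply forward to libc's clock_gettime(), and the CLOCK_TAI fallback assumes the Linux value.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <time.h>

/* Assumption: on Linux CLOCK_TAI is 11; define it only if libc headers lack it. */
#ifndef CLOCK_TAI
#define CLOCK_TAI 11
#endif

/*
 * Hypothetical multiplexed entry point: a single symbol, with the clock
 * selected at run time. Every call pays for the dispatch on 'id'.
 */
int demo_clock_gettime(clockid_t id, struct timespec *ts)
{
    switch (id) {
    case CLOCK_REALTIME:
    case CLOCK_MONOTONIC:
    case CLOCK_TAI:
        return clock_gettime(id, ts);
    default:
        /* Unknown clock in this sketch: report it like the syscall would. */
        errno = EINVAL;
        return -1;
    }
}

/*
 * Hypothetical per-clock entry points: the clock is fixed by the symbol
 * the caller picks, so no conditional is needed inside the function.
 */
int demo_gettime_monotonic(struct timespec *ts)
{
    return clock_gettime(CLOCK_MONOTONIC, ts);
}

int demo_gettime_tai(struct timespec *ts)
{
    return clock_gettime(CLOCK_TAI, ts);
}
```

A caller that knows the clock at compile time could bind directly to demo_gettime_tai(), while existing callers would keep using the multiplexed demo_clock_gettime(); exposing the former from a library is what would make it an ABI addition.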
Thanks,
Florian