On Sat, 2 Feb 2008, Kris Kennaway wrote:
Alexander Motin wrote:
Robert Watson wrote:
Hence my request for drilling down a bit on profiling -- the question I'm
asking is whether profiling shows things running or taking time that
shouldn't be.
I have not yet understood why it happens, but hwpmc shows a huge number
of "p4-resource-stall"s in UMA functions:
So far I have come up with two possible explanations. One is that, due to
UMA's cyclic block allocation order, it does not fit in the CPU caches; the
other is that it is somehow related to critical_exit(), which can possibly
cause a context switch. Does anybody have a better explanation of how a
function that is so small and simple in this part can cause such results?
You can look at the raw output from pmcstat, which is a collection of
instruction pointers that you can feed to e.g. addr2line to find out exactly
where in those functions the events are occurring. This will often help to
track down the precise causes.
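
As a rough illustration of the step before addr2line: with thousands of
sampled instruction pointers it helps to collapse them into per-address hit
counts first, so only the hottest few addresses need resolving. The sketch
below assumes the samples have already been extracted into an array of
addresses; the names tally_samples and struct hot_addr are made up for this
example and are not part of pmcstat or hwpmc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One distinct sampled address and the number of samples that hit it. */
struct hot_addr {
    uintptr_t addr;
    size_t    hits;
};

/* Order raw sample addresses ascending so duplicates become adjacent. */
static int cmp_addr(const void *a, const void *b)
{
    uintptr_t x = *(const uintptr_t *)a, y = *(const uintptr_t *)b;
    return (x > y) - (x < y);
}

/* Order tallies by hit count descending, ties broken by address. */
static int cmp_hits_desc(const void *a, const void *b)
{
    const struct hot_addr *x = a, *y = b;
    if (x->hits != y->hits)
        return (x->hits < y->hits) - (x->hits > y->hits);
    return (x->addr > y->addr) - (x->addr < y->addr);
}

/*
 * Collapse n sampled addresses into (address, hits) pairs, hottest first.
 * Sorts samples[] in place; out[] must have room for n entries.
 * Returns the number of distinct addresses written to out[].
 */
size_t tally_samples(uintptr_t *samples, size_t n, struct hot_addr *out)
{
    size_t distinct = 0;

    assert(samples != NULL && out != NULL);
    qsort(samples, n, sizeof(samples[0]), cmp_addr);
    for (size_t i = 0; i < n; i++) {
        if (distinct > 0 && out[distinct - 1].addr == samples[i]) {
            out[distinct - 1].hits++;
        } else {
            out[distinct].addr = samples[i];
            out[distinct].hits = 1;
            distinct++;
        }
    }
    qsort(out, distinct, sizeof(out[0]), cmp_hits_desc);
    return distinct;
}
```

The first few entries of out[] are then the addresses worth passing to
addr2line against the matching kernel binary.
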
There was, FYI, a report a few years ago that there was a measurable
improvement from allocating off the free bucket rather than maintaining
separate alloc and free buckets. It sounded good at the time but I was never
able to reproduce the benefits in my test environment. Now might be a good
time to try to revalidate that. Basically, the goal would be to make the
per-CPU cache as LIFO as possible, since that maximizes the chances that the
newly allocated object already has lines in the cache. It's a fairly trivial tweak
to the UMA allocation code.
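
To make the difference concrete, here is a toy model of a per-CPU cache with
two buckets. It is only a sketch and not the UMA source: the structure names,
the bucket size, and both allocation functions are invented for illustration.
split_alloc() drains the alloc bucket and swaps buckets only when it runs dry,
while freefirst_alloc() hands back the most recently freed object whenever
there is one, which is the object most likely to still be in the CPU cache.

```c
#include <assert.h>
#include <stddef.h>

#define BUCKET_SIZE 8   /* arbitrary size chosen for this sketch */

struct bucket {
    void *items[BUCKET_SIZE];
    int   count;
};

/* Toy per-CPU cache holding one alloc bucket and one free bucket. */
struct pcpu_cache {
    struct bucket *alloc_bucket;
    struct bucket *free_bucket;
};

/* Pop the most recently pushed item, or NULL if the bucket is empty. */
static void *bucket_pop(struct bucket *b)
{
    return b->count > 0 ? b->items[--b->count] : NULL;
}

/* Push an item; returns 0 if the bucket is full, 1 on success. */
static int bucket_push(struct bucket *b, void *item)
{
    if (b->count == BUCKET_SIZE)
        return 0;
    b->items[b->count++] = item;
    return 1;
}

/* Frees always go to the free bucket in this model. */
int cache_free(struct pcpu_cache *c, void *item)
{
    return bucket_push(c->free_bucket, item);
}

/*
 * Separate buckets: allocate from the alloc bucket and swap in the free
 * bucket only once the alloc bucket is empty. A just-freed object may sit
 * unused for a long time. NULL means the caller must take a slower path.
 */
void *split_alloc(struct pcpu_cache *c)
{
    if (c->alloc_bucket->count == 0 && c->free_bucket->count > 0) {
        struct bucket *tmp = c->alloc_bucket;
        c->alloc_bucket = c->free_bucket;
        c->free_bucket = tmp;
    }
    return bucket_pop(c->alloc_bucket);
}

/*
 * Allocate off the free bucket first: the most recently freed object is
 * reused immediately, falling back to the alloc bucket when none exists.
 */
void *freefirst_alloc(struct pcpu_cache *c)
{
    void *item = bucket_pop(c->free_bucket);
    return item != NULL ? item : bucket_pop(c->alloc_bucket);
}
```

In this model, freeing an object and then allocating returns that same object
under freefirst_alloc(), but an older one under split_alloc() as long as the
alloc bucket is not empty.
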
Robert N M Watson
Computer Laboratory
University of Cambridge