On 28.06.2013 18:14, Adrian Chadd wrote:
.. i'd rather you narrow down _why_ it's performing better before committing it.
If you have good guesses, they are welcome. All those functions are so small that it is hard to imagine how congestion could happen there at all. I have a strong feeling that spinning on the lock there consumes far more CPU time than the locked region itself could.
Otherwise it may just creep up again after someone does another change in an unrelated part of the kernel.
Big win or small, TAILQ is still heavier than STAILQ, and it is not needed there at all.
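To make the "heavier" point concrete, here is a minimal userland sketch using the sys/queue.h macros. The element types snode and tnode and the helper functions are hypothetical names made up for illustration, not the structures from the patch, and it assumes the queue is only used as a FIFO (insert at tail, remove from head):

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical element types, for illustration only. */
struct snode {
	int value;
	STAILQ_ENTRY(snode) link;	/* one pointer: next */
};

struct tnode {
	int value;
	TAILQ_ENTRY(tnode) link;	/* two pointers: next and prev */
};

STAILQ_HEAD(shead, snode);

/* Bytes of linkage embedded in every element. */
size_t
stailq_link_size(void)
{
	return (sizeof(((struct snode *)0)->link));
}

size_t
tailq_link_size(void)
{
	return (sizeof(((struct tnode *)0)->link));
}

/*
 * FIFO use (assumed access pattern): insert at the tail, remove from
 * the head. Returns the dequeued values packed as decimal digits.
 */
int
stailq_fifo_demo(void)
{
	struct shead head = STAILQ_HEAD_INITIALIZER(head);
	struct snode n[3];
	int i, order = 0;

	for (i = 0; i < 3; i++) {
		n[i].value = i + 1;
		STAILQ_INSERT_TAIL(&head, &n[i], link);
	}
	while (!STAILQ_EMPTY(&head)) {
		struct snode *p = STAILQ_FIRST(&head);

		assert(p != NULL);
		STAILQ_REMOVE_HEAD(&head, link);
		order = order * 10 + p->value;
	}
	return (order);
}
```

Each TAILQ element carries a second (prev) pointer that has to be updated on every insert and remove. What that buys is O(1) removal of an arbitrary element; with STAILQ that removal is O(n), which does not matter if elements only ever leave from the head.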
You're using instructions-retired; how about using l1/l2 cache loads, stores, etc? There's a lot more CPU counters available.
I am using unhalted-cycles, which is more reasonable than instructions-retired. As for the other counters, there are indeed a lot of them, but it is not always easy to get something useful out of them.
You have a very cool problem to solve. If I could reproduce it locally I'd give you a hand.
You'd need a lot of hardware and patches to reproduce it in full. But if you would like to see this with your own eyes, I can give you SSH access to my test machine.
-- Alexander Motin