Krishna Yenduri wrote:
Hi All,
Our team is running some microbenchmarks on a multi-CPU machine
and found the following hot locks in lockstat(1M) output when using 32 threads.
The benchmark does an ioctl() on /dev/crypto in a while loop.
Adaptive mutex spin: 7123153 events in 30.966 seconds (230029 events/sec)
Count indv cuml rcnt spin Lock Caller
-------------------------------------------------------------------------------
1045435 15% 15% 0.00 9 0x60006cb0100 releasef+0x24
1029413 14% 29% 0.00 9 0x60006cb0100 getf+0x38
...
Adaptive mutex block: 57757 events in 30.966 seconds (1865 events/sec)
Count indv cuml rcnt nsec Lock Caller
-------------------------------------------------------------------------------
...
8027 14% 28% 0.00 83275 0x60006cb0100 releasef+0x24
...
5723 10% 62% 0.00 83270 0x60006cb0100 getf+0x38
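For reference, each benchmark thread does roughly the following (a simplified
sketch only; the real ioctl request and argument on /dev/crypto are not shown
here, and error handling is trimmed):

/*
 * Sketch of the microbenchmark: 32 threads issuing ioctl() in a loop
 * on a single shared fd.  The request code 0 is a hypothetical stand-in
 * for whatever the real test sends to /dev/crypto.
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define	NTHREADS	32

static int shared_fd;			/* one fd shared by all threads */
static volatile int stop;

static void *
worker(void *arg)
{
	(void) arg;
	while (!stop) {
		/*
		 * Every call goes through getf()/releasef() on the same
		 * file descriptor, hence the same uf_lock.
		 */
		(void) ioctl(shared_fd, 0 /* hypothetical request */, NULL);
	}
	return (NULL);
}

int
main(void)
{
	pthread_t tids[NTHREADS];
	int i;

	if ((shared_fd = open("/dev/crypto", O_RDWR)) < 0) {
		perror("open");
		return (1);
	}
	for (i = 0; i < NTHREADS; i++)
		(void) pthread_create(&tids[i], NULL, worker, NULL);
	sleep(30);			/* roughly matches the 30-second lockstat run */
	stop = 1;
	for (i = 0; i < NTHREADS; i++)
		(void) pthread_join(tids[i], NULL);
	(void) close(shared_fd);
	return (0);
}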
Another test, with a simple stub driver and a small test program that uses
32 threads to perform ioctl operations on the device, showed the same
problem.
This lock is the uf_lock of the file descriptor for the device on which the
ioctl is being issued. The contention is surprising to me because getf()
appears to be optimized for scaling.
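My (simplified) understanding of what every ioctl pays on entry and exit is
roughly the pattern below: both the hold and the release take the same
per-descriptor mutex. This is only an illustration of that pattern, not the
actual OpenSolaris source, and it uses a userland mutex to stay compilable:

/*
 * Illustration only -- not the real getf()/releasef().  It shows why one
 * shared fd serializes: the hold count for that fd is adjusted under a
 * single per-descriptor lock on every ioctl entry and exit.
 */
#include <pthread.h>

struct file;				/* opaque for the sketch */

typedef struct uf_entry {
	pthread_mutex_t	uf_lock;	/* stands in for the kernel's uf_lock */
	struct file	*uf_file;
	int		uf_refcnt;	/* outstanding holds on this fd */
} uf_entry_t;

static struct file *
getf_sketch(uf_entry_t *ufp)
{
	pthread_mutex_lock(&ufp->uf_lock);	/* all 32 threads queue here ... */
	ufp->uf_refcnt++;
	pthread_mutex_unlock(&ufp->uf_lock);
	return (ufp->uf_file);
}

static void
releasef_sketch(uf_entry_t *ufp)
{
	pthread_mutex_lock(&ufp->uf_lock);	/* ... and again on the way out */
	ufp->uf_refcnt--;
	pthread_mutex_unlock(&ufp->uf_lock);
}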
Any ideas on what kind of solution is possible here? We are not able to
scale beyond 5x, and this lock seems to be the main culprit.
Thanks,
-Krishna
Bart Smaalders wrote:
If all the threads share the same FD, you're going to have problems
if the actual IOCTL is very fast.
Can you open the device multiple times?
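For example, something along these lines in each worker, so every thread
holds its own descriptor (and therefore its own uf_entry/uf_lock). A sketch
only; the request code is a hypothetical placeholder:

/* Sketch: give each thread a private fd instead of one shared fd. */
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

static void *
worker_private_fd(void *arg)
{
	int fd = open("/dev/crypto", O_RDWR);	/* per-thread open */
	int i;

	(void) arg;
	if (fd < 0)
		return (NULL);
	for (i = 0; i < 1000000; i++) {
		/*
		 * getf()/releasef() now operate on this thread's own
		 * descriptor, so the shared uf_lock contention goes away.
		 */
		(void) ioctl(fd, 0 /* hypothetical request */, NULL);
	}
	(void) close(fd);
	return (NULL);
}

You'd pass this routine to pthread_create() in place of the shared-fd worker.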
- Bart
--
Bart Smaalders Solaris Kernel Performance
[EMAIL PROTECTED] http://blogs.sun.com/barts