On Tue, 7 Oct 2008, John Baldwin wrote:
On Tuesday 07 October 2008 07:44:00 am ???? wrote:
Hi, folks,
I did kernel profiling when a single thread client sends UDP packets to a
single thread server on the same machine.
In the output kernel profile, the first few kernel functions that consumes
the most CPU time are listed below:
granularity: each sample hit covers 16 byte(s) for 0.01% of 25.68 seconds
% cumulative self self total
time seconds seconds calls ms/call ms/call name
42.4 10.88 10.88 0 100.00% __mcount [1]
36.1 20.14 9.26 17937541 0.00 0.00 spinlock_exit [4]
4.2 21.22 1.08 3145728 0.00 0.00 in_cksum_skip [40]
1.8 21.68 0.45 7351987 0.00 0.00 generic_copyin [43]
1.1 21.96 0.29 3146028 0.00 0.00 generic_copyout [48]
1.0 22.21 0.24 2108904 0.00 0.00 Xint0x80_syscall [3]
0.8 22.42 0.21 6292131 0.00 0.00 uma_zalloc_arg [46]
0.8 22.62 0.20 1048576 0.00 0.00 soreceive_generic [9]
It is very strange that spinlock_exit consumes over 36% CPU time while it
seems a very simple function.
It's because the intr_restore() re-enables interrupts and the resulting time
spent executing the handlers for any pending interrupts are attributed to
spinlock_exit().
This is one of many defects that are not present in high resolution
kernel profiling (kgmon -B instead of kgmon -b; availaible on amd64
and i386). However, high resolution kernel profiling doesn't work
right with SMP, and was completely broken by gcc-4. Ordinary profiling
was less completely broken by gcc-4, and you can recover the old
behaviour by turning off new optimizations (mainly -funit-at-a-time
and/or -finline-functions-called-once and or all of -O2).
Bruce
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"