> On Fri, 3 Oct 2008, Danny Braniss wrote:
>
> >> On Fri, 3 Oct 2008, Danny Braniss wrote:
> >>
> >>> gladly, but have no idea how to do LOCK_PROFILING, so some pointers
> >>> would be helpful.
> >>
> >> The LOCK_PROFILING(9) man page isn't a bad starting point -- I find that
> >> the defaults work fine most of the time, so just use them.  Turn the
> >> enable sysctl on just before you begin a run, and turn it off immediately
> >> afterwards.  Make sure to reset between reruns (rebooting to a new kernel
> >> is fine too!).
> >
> > in ftp://ftp.cs.huji.ac.il/users/danny/lock.prof
> > there are 3 files:
> >     7.1-100     host connected at 100 running -prerelease
> >     7.1-1000    same but connected at 1000
> >     7.0-1000    -stable with your 'patch'
> > at 100 my benchmark didn't suffer from the profiling, average was about 9.
> > at 1000 the benchmark got really hit, average was around 12 for the
> > patched, and 4 for the unpatched (less than at 100).
>
> Interesting.  A bit of post-processing:
>
> [EMAIL PROTECTED]:/tmp> cat 7.1-1000 | awk -F' ' '{print $3" "$9}' | sort -n | tail -10
> 2413283 /r+d/7/sys/kern/kern_mutex.c:141
> 2470096 /r+d/7/sys/nfsclient/nfs_socket.c:1218
> 2676282 /r+d/7/sys/net/route.c:293
> 2754866 /r+d/7/sys/kern/vfs_bio.c:1468
> 3196298 /r+d/7/sys/nfsclient/nfs_bio.c:1664
> 3318742 /r+d/7/sys/net/route.c:1584
> 3711139 /r+d/7/sys/dev/bge/if_bge.c:3287
> 3753518 /r+d/7/sys/net/if_ethersubr.c:405
> 3961312 /r+d/7/sys/nfsclient/nfs_subs.c:1066
> 10688531 /r+d/7/sys/dev/bge/if_bge.c:3726
>
> [EMAIL PROTECTED]:/tmp> cat 7.0-1000 | awk -F' ' '{print $3" "$9}' | sort -n | tail -10
> 468631 /r+d/hunt/src/sys/nfsclient/nfs_nfsiod.c:286
> 501989 /r+d/hunt/src/sys/nfsclient/nfs_vnops.c:1148
> 631587 /r+d/hunt/src/sys/nfsclient/nfs_socket.c:1198
> 701155 /r+d/hunt/src/sys/nfsclient/nfs_socket.c:1258
> 718211 /r+d/hunt/src/sys/kern/kern_mutex.c:141
> 1118711 /r+d/hunt/src/sys/nfsclient/nfs_bio.c:1664
> 1169125 /r+d/hunt/src/sys/nfsclient/nfs_subs.c:1066
> 1222867 /r+d/hunt/src/sys/kern/vfs_bio.c:1468
> 3876072 /r+d/hunt/src/sys/netinet/udp_usrreq.c:545
> 5198927 /r+d/hunt/src/sys/netinet/udp_usrreq.c:864
>
> The first set above is with the unmodified 7-STABLE tree, the second with a
> reversion of read locking on the UDP inpcb.  The big blinking sign of
> interest is that the bge interface lock is massively contended in the first
> set of output, and basically doesn't appear in the second.  There are
> various reasons bge could stand out quite so much -- one possibility is
> that previously, the udp lock serialized all access to the interface from
> the send code, preventing the send and receive paths from contending.
>
> A few things to try:
>
> - Let's compare the context switch rates on the two benchmarks.  Could you
>   run vmstat and look at the cpu cs line during the benchmarks and see how
>   similar the two are as the benchmarks run?  You'll want to run it with
>   vmstat -w 1 and collect several samples per benchmark, since we're really
>   interested in the distribution rather than an individual sample.
>
> - Is there any chance you could drop an if_em card into the same box and
>   run the identical benchmarks with and without LOCK_PROFILING to see
>   whether it behaves differently than bge when the patch is applied?
>   if_em's interrupt handling is quite different, and may significantly
>   affect lock use, and hence contention.

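For anyone who wants to repeat the post-processing quoted above outside the shell, here is a rough Python sketch of the same awk | sort -n | tail -10 pipeline. It only assumes what the awk command does: fields are whitespace-separated, field 3 is a number and field 9 is the file:line of the lock. The names top_contended and strip_tree are my own, and stripping the tree prefix at "/sys/" is an assumption made so that the two source trees (/r+d/7 and /r+d/hunt/src) can be compared by path. Unlike sort -n, it drops lines whose third field is not a number instead of sorting them as zero.

```python
def top_contended(lines, n=10, key_col=3, name_col=9):
    """Return the n rows with the largest value in key_col, smallest first.

    Mirrors: awk '{print $3" "$9}' | sort -n | tail -10
    Columns are 1-based, as in awk. Lines that are too short, or whose
    key column is not an integer (e.g. a header row), are skipped.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < max(key_col, name_col):
            continue
        try:
            value = int(fields[key_col - 1])
        except ValueError:
            continue
        rows.append((value, fields[name_col - 1]))
    rows.sort(key=lambda row: row[0])
    return rows[-n:]


def strip_tree(name):
    """Drop everything before 'sys/' so paths from different trees compare.

    Assumption: every lock site lives under a directory called sys/.
    Names without '/sys/' are returned unchanged.
    """
    idx = name.find("/sys/")
    return name if idx < 0 else name[idx + 1:]


def only_in_first(first, second):
    """Files (line numbers removed) that appear in first but not in second."""
    def files(rows):
        return {strip_tree(name).rsplit(":", 1)[0] for _, name in rows}
    return sorted(files(first) - files(second))


# Usage, with the file names from the ftp directory:
#   with open("7.1-1000") as a, open("7.0-1000") as b:
#       print(only_in_first(top_contended(a), top_contended(b)))
```

Run against the two top-10 lists quoted above, only_in_first would be expected to list the bge, route and if_ethersubr files, since none of them appear in the second list.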
at the moment, the best I can do is run it on different hardware that has
if_em; the results are in
    ftp://ftp.cs.huji.ac.il/users/danny/lock.prof/7.1-1000.em
the benchmark ran better with the Intel NIC: it averaged UDP 54MB/s and
TCP 53MB/s (I get the same numbers with an older kernel).

danny

_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"