Re: [RFC] Un-staticise the toolchain
On Tuesday, May 01, 2012 6:46:04 pm Erik Cederstrand wrote:
> On 01/05/2012 at 15.55, Gary Palmer wrote:
> >
> > If you want a high-level view of what goes on run
> >
> > ldd `which ls`
> >
> > check that it has libraries to load and doesn't say "not a dynamic ELF
> > executable", and then run:
> >
> > ktrace ls
> > kdump | more
> >
> > All the system calls related to resolving and loading shared libraries
> > take time. I realise "ls" is not "make", but it should give you an
> > idea.
>
> Just for fun, I did some measurements with ktrace and a static vs dynamic
> make. I did 'ktrace make' and 'ktrace dynamicmake' ('dynamicmake' produced
> by running 'make NO_SHARED=no' in /usr/src/usr.bin/make), just to see where
> time is spent when loading and running make in the static and dynamic
> cases. I did this 100 times for each version (discarding the first run)
> and ran through ministat (numbers in msecs):
>
> x static.dat
> [ministat histogram]
>     N           Min           Max        Median           Avg        Stddev
> x 100         2.228         2.266         2.242       2.24333       0.00794
>
> x dynamic.dat
> [ministat histogram]
>     N           Min           Max        Median           Avg        Stddev
> x 100         2.648         2.691         2.666        2.6669       0.00873
>
> So the dynamic version is ca. 0.42 ms, or 16%, slower on my system.
>
> I then compared a trace from a static and a dynamic 'make'. Both selected
> traces had a running time matching the median of that run. I cut away all
> common system calls etc., which resulted in just the steps involved in
> loading 'make' to a state where it can start executing.
>
> In the static case, this took 0.26 ms. In the dynamic case, this took 0.57
> ms, or 0.31 ms (119%) more. The remaining ca. 0.11 ms (0.42 - 0.31) are
> mainly due to the dynamic version having slightly slower IO (read() and
> stat()). I have no idea why. The following are the notable steps in each
> version:
>
> Static version:
> * 0.09 ms spent execve'ing /usr/bin/make
> * The rest is mostly sysctl calls
>
> Dynamic version:
> * 0.09 ms spent execve'ing ./dynamicmake and /libexec/ld-elf.so.1
> * 0.18 ms spent loading libc.so.7 (incl. reading /etc/libmap.conf and
>   /var/run/ld-elf.so.hints)
> * 0.11 ms spent calling sysarch(0x81,0x7fff) (static version also
>   does this but only uses 0.01 ms. No idea why)
> * The rest is mostly sysctl and sigprocmask calls

Keep in mind that the first time each system call is invoked, the runtime
linker has to resolve the symbol. You can force the linker to do all that
when the object is loaded instead by setting LD_BIND_NOW=yes in the
environment of dynamicmake. If that makes the sysarch() call take 0.01 ms,
then you know that the extra 0.10 ms is because of rtld doing the lookup of
sysarch() on the first sysarch() call.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
fast bcopy...
as part of my netmap investigations, i was looking at how expensive
memory copies are, and here are a couple of findings (first one
is obvious, the second one less so)

1. especially on 64bit machines, always use multiples of at least
   8 bytes (possibly even larger units). The bcopy code in amd64 seems
   to waste an extra 20ns (on a 3.4 GHz machine) when processing
   blocks of size 8n + {4,5,6,7}. The difference is relevant, on
   that machine i have

	bcopy(src, dst, 1)	~12.9ns	(data in L1 cache)
	bcopy(src, dst, 3)	~12.9ns	(data in L1 cache)
	bcopy(src, dst, 4)	~33.4ns	(data in L1 cache)	<--- NOTE
	bcopy(src, dst, 32)	~12.9ns	(data in L1 cache)
	bcopy(src, dst, 63)	~33.4ns	(data in L1 cache)	<--- NOTE
	bcopy(src, dst, 64)	~12.9ns	(data in L1 cache)

   Note how the two marked lines are much slower than the others.
   Same thing happens with data not in L1

	bcopy(src, dst, 64)	~ 22ns	(not in L1)
	bcopy(src, dst, 63)	~ 44ns	(not in L1)
	...

   Continuing the tests on larger sizes, for the next item:

	bcopy(src, dst, 256)	~19.8ns	(data in L1 cache)
	bcopy(src, dst, 512)	~28.8ns	(data in L1 cache)
	bcopy(src, dst, 1K)	~39.6ns	(data in L1 cache)
	bcopy(src, dst, 4K)	~95.2ns	(data in L1 cache)

   On an older P4 running FreeBSD4/32 bit, the copy time seems
   less sensitive to odd sizes.

2. apparently, bcopy is not the fastest way to copy memory.
   For small blocks and multiples of 32-64 bytes, i noticed that
   the following is a lot faster (breaking even at about 1 KBytes)

	/* copies 32 bytes per iteration; l must be a multiple of 32 */
	static inline void
	fast_bcopy(void *_src, void *_dst, int l)
	{
		uint64_t *src = _src;
		uint64_t *dst = _dst;

		for (; l > 0; l -= 32) {
			*dst++ = *src++;
			*dst++ = *src++;
			*dst++ = *src++;
			*dst++ = *src++;
		}
	}

	fast_bcopy(src, dst, 32)	~ 1.8ns	(data in L1 cache)
	fast_bcopy(src, dst, 64)	~ 2.9ns	(data in L1 cache)
	fast_bcopy(src, dst, 256)	~10.1ns	(data in L1 cache)
	fast_bcopy(src, dst, 512)	~19.5ns	(data in L1 cache)
	fast_bcopy(src, dst, 1K)	~38.4ns	(data in L1 cache)
	fast_bcopy(src, dst, 4K)	~152.0ns	(data in L1 cache)

	fast_bcopy(src, dst, 32)	~15.3ns	(not in L1)
	fast_bcopy(src, dst, 256)	~38.7ns	(not in L1)
	...

   The old P4/32 bit also exhibits similar results.

Conclusion: if you have to copy packets you might be better off
padding the length to a multiple of 32, and using the following
function to get the best of both worlds.

Sprinkle some prefetch() for better taste.

	#define likely(x)	__builtin_expect(!!(x), 1)
	#define unlikely(x)	__builtin_expect(!!(x), 0)

	// XXX only for multiples of 32 bytes, non overlapped.
	static inline void
	good_bcopy(void *_src, void *_dst, int l)
	{
		uint64_t *src = _src;
		uint64_t *dst = _dst;

		/* large blocks: bcopy wins above about 1 KByte */
		if (unlikely(l >= 1024)) {
			bcopy(src, dst, l);
			return;
		}
		/* small blocks: unrolled 64-bit copy, 32 bytes per pass */
		for (; l > 0; l -= 32) {
			*dst++ = *src++;
			*dst++ = *src++;
			*dst++ = *src++;
			*dst++ = *src++;
		}
	}

cheers
luigi
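The "pad the length to a multiple of 32" advice in the conclusion above could be wired in roughly as follows. This is a sketch under assumptions, not code from the post: pad32() and padded_copy() are hypothetical names, memcpy() stands in for bcopy() on the large-block path, and the caller is assumed to provide 8-byte aligned, non-overlapping buffers that are at least pad32(len) bytes long.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round len up to the next multiple of 32. */
static inline size_t pad32(size_t len)
{
    return (len + 31) & ~(size_t)31;
}

/*
 * Hypothetical wrapper around the unrolled loop from the post.
 * Assumptions: src and dst are 8-byte aligned, do not overlap, and
 * both buffers have room for pad32(len) bytes, because the copy
 * always moves whole 32-byte chunks.
 */
static inline void padded_copy(const void *src, void *dst, size_t len)
{
    const uint64_t *s = src;
    uint64_t *d = dst;
    size_t n = pad32(len);

    /* Above roughly 1 KByte the post found the library routine faster. */
    if (n >= 1024) {
        memcpy(dst, src, n);
        return;
    }
    /* n is a multiple of 32, so this loop ends exactly at zero. */
    for (; n > 0; n -= 32) {
        *d++ = *s++;
        *d++ = *s++;
        *d++ = *s++;
        *d++ = *s++;
    }
}
```

A 33-byte packet, for example, is copied as 64 bytes, so the trailing 31 bytes of both buffers must be valid memory. The 1024-byte cutoff is taken from the break-even point reported above and would need re-measuring on other hardware.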
Re: fast bcopy...
Luigi Rizzo wrote:
> For small blocks and multiples of 32-64 bytes, i noticed that
> the following is a lot faster (breaking even at about 1 KBytes)
>
> static inline void
> fast_bcopy(void *_src, void *_dst, int l)
> {
>	uint64_t *src = _src;
>	uint64_t *dst = _dst;
>	for (; l > 0; l-=32) {
>		*dst++ = *src++;
>		*dst++ = *src++;
>		*dst++ = *src++;
>		*dst++ = *src++;
>	}
> }

DJB influence? :-)

-- 
Alex Dupre
Re: fast bcopy...
Luigi Rizzo wrote:
> 2. apparently, bcopy is not the fastest way to copy memory.

http://now.cs.berkeley.edu/Td/bcopy.html

Best Regards.

Steven.
ksh93 build failure
On a 10.0 Current system, cvsupped today, ksh93 fails to build. As best
I can determine, the failure is due to a problem of conflicting
includes.

In file included from /home/ports/usr/ports/shells/ksh93/work/arch/freebsd.amd64/include/ast/ast_wchar.h:113,
                 from /home/ports/usr/ports/shells/ksh93/work/arch/freebsd.amd64/include/ast/wchar.h:22,
                 from /home/ports/usr/ports/shells/ksh93/work/src/cmd/ksh93/include/lexstates.h:85,
                 from /home/ports/usr/ports/shells/ksh93/work/src/cmd/ksh93/include/shlex.h:32,
                 from /home/ports/usr/ports/shells/ksh93/work/src/cmd/ksh93/data/keywords.c:22:
/usr/include/../include/wchar.h:102: error: conflicting types for '_sfio_FILE'

Has anyone else run into this problem, and if so, what did you do about
it? ksh93 builds without error on 9.0 Stable.

-- 
Best regards,
Derek Tattersall
d...@mebtel.net dlt...@yahoo.com dtatt...@gmail.com
Re: fast bcopy...
It's highly chipset and processor dependent what works best. Intel now
has non-temporal loads and stores which work much better in some cases
but provide little benefit in others.

-Kip

On Wed, May 2, 2012 at 11:52 PM, Steven Atreju wrote:
> Luigi Rizzo wrote:
>> 2. apparently, bcopy is not the fastest way to copy memory.
>
> http://now.cs.berkeley.edu/Td/bcopy.html
>
> Best Regards.
>
> Steven.

-- 
“The real damage is done by those millions who want to 'get by.' The
ordinary men who just want to be left in peace. Those who don’t want
their little lives disturbed by anything bigger than themselves. Those
with no sides and no causes. Those who won’t take measure of their own
strength, for fear of antagonizing their own weakness. Those who don’t
like to make waves - or enemies. Those for whom freedom, honour, truth,
and principles are only literature. Those who live small, love small,
die small. It’s the reductionist approach to life: if you keep it small,
you’ll keep it under control. If you don’t make any noise, the bogeyman
won’t find you. But it’s all an illusion, because they die too, those
people who roll up their spirits into tiny little balls so as to be
safe. Safe?! From what? Life is always on the edge of death; narrow
streets lead to the same place as wide avenues, and a little candle
burns itself out just like a flaming torch does. I choose my own way to
burn.”
Sophie Scholl
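For readers unfamiliar with the non-temporal stores mentioned in this reply, here is a minimal sketch of what such a copy can look like with SSE2 intrinsics. It is an illustration only and not something posted in this thread: stream_copy() is a hypothetical name, it covers stores only (not non-temporal loads), it assumes dst is 16-byte aligned, and it falls back to memcpy() when the compiler does not target SSE2.

```c
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Hypothetical helper: copy len bytes from src to dst using
 * non-temporal (cache-bypassing) stores where available.
 * Assumption: dst is 16-byte aligned; src may be unaligned.
 */
static void stream_copy(const void *src, void *dst, size_t len)
{
#if defined(__SSE2__)
    const __m128i *s = (const __m128i *)src;
    __m128i *d = (__m128i *)dst;
    size_t blocks = len / 16;
    size_t i;

    /* Each store writes 16 bytes without pulling dst into the cache. */
    for (i = 0; i < blocks; i++)
        _mm_stream_si128(&d[i], _mm_loadu_si128(&s[i]));
    /* Make the streamed data globally visible before returning. */
    _mm_sfence();
    /* Copy the tail that does not fill a whole 16-byte block. */
    memcpy((char *)dst + blocks * 16, (const char *)src + blocks * 16,
        len % 16);
#else
    /* No SSE2 available: plain copy. */
    memcpy(dst, src, len);
#endif
}
```

Because the destination bypasses the cache, this style tends to pay off only for large buffers that will not be read again soon; for small, hot buffers such as the in-L1 cases measured earlier in the thread it may well be slower, which matches the "some cases but not others" caveat above.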
Re: fast bcopy...
Hi,

On Wed, May 2, 2012 at 5:52 PM, Steven Atreju wrote:
> Luigi Rizzo wrote:
>> 2. apparently, bcopy is not the fastest way to copy memory.
>
> http://now.cs.berkeley.edu/Td/bcopy.html
>
"Pentium 166, Triton Chipset, EDO memory"... ahem.

 - Arnaud

> Best Regards.
>
> Steven.
FreeBSD 10 prognostication...
Can't say that I wouldn't look forward to this, but it sounds a little off:

http://www.trollaxor.com/2012/05/freebsd-x-berkeley-unix-apple-quality.html

What do others think?

--Vance
Re: FreeBSD 10 prognostication...
On Thu, May 03, 2012 at 12:31:12AM -0400, Vance Siemens wrote:
> Can't say that I wouldn't look forward to this, but it sounds a little off:
>
> http://www.trollaxor.com/2012/05/freebsd-x-berkeley-unix-apple-quality.html
>
> What do others think?

Please leave trollaxor on Kuro5hin or slashdot where he belongs. I mean,
this isn't even adequacy level stuff.
Re: FreeBSD 10 prognostication...
On 05/03/12 06:31, Vance Siemens wrote:
> Can't say that I wouldn't look forward to this, but it sounds a little off:
>
> http://www.trollaxor.com/2012/05/freebsd-x-berkeley-unix-apple-quality.html
>
> What do others think?
>
> --Vance

OMG!
Re: ksh93 build failure
On Wed, May 02, 2012 at 06:52:21PM -0400, Derek Tattersall wrote:
> On a 10.0 Current system, cvsupped today, ksh93 fails to build. As best
> I can determine, the failure is due to a problem of conflicting
> includes.
>
> In file included from
> /home/ports/usr/ports/shells/ksh93/work/arch/freebsd.amd64/include/ast/ast_wchar.h:113,
> from
> /home/ports/usr/ports/shells/ksh93/work/arch/freebsd.amd64/include/ast/wchar.h:22,
> from
> /home/ports/usr/ports/shells/ksh93/work/src/cmd/ksh93/include/lexstates.h:85,
> from
> /home/ports/usr/ports/shells/ksh93/work/src/cmd/ksh93/include/shlex.h:32,
> from
> /home/ports/usr/ports/shells/ksh93/work/src/cmd/ksh93/data/keywords.c:22:
> /usr/include/../include/wchar.h:102: error: conflicting types for '_sfio_FILE'
> Has anyone else run into this problem, and if so, what did you do about
> it. ksh93 builds without error on 9.0 Stable.

Do you perhaps have devel/sfio installed on that machine? If so, can
you deinstall it and retry the build?

-- 
- (2^(N-1))