Re: [RFC] Un-staticise the toolchain

2012-05-02 Thread John Baldwin
On Tuesday, May 01, 2012 6:46:04 pm Erik Cederstrand wrote:
> On 01/05/2012 at 15:55, Gary Palmer wrote:
> > 
> > If you want a high-level view of what goes on, run
> > 
> > ldd `which ls`
> > 
> > check that it has libraries to load and doesn't say "not a dynamic ELF
> > executable", and then run: 
> > 
> > ktrace ls
> > kdump | more
> > 
> > All the system calls related to resolving and loading shared libraries
> > take time.  I realise "ls" is not "make", but it should give you an
> > idea.
> 
> Just for fun, I did some measurements with ktrace and a static vs dynamic
> make. I did 'ktrace make' and 'ktrace dynamicmake' ('dynamicmake'
> produced by running 'make NO_SHARED=no' in /usr/src/usr.bin/make), just to
> see where time is spent when loading and running make in the static and
> dynamic cases. I did this 100 times for each version (discarding the first
> run) and ran the results through ministat (numbers in msecs):
> 
> x static.dat
> [ministat ASCII histogram not recoverable from the archive]
>     N       Min       Max    Median       Avg    Stddev
> x 100     2.228     2.266     2.242   2.24333   0.00794
> 
> x dynamic.dat
> [ministat ASCII histogram not recoverable from the archive]
>     N       Min       Max    Median       Avg    Stddev
> x 100     2.648     2.691     2.666    2.6669   0.00873
> 
> 
> So the dynamic version is ca. 0.42 ms, or 16%, slower on my system.
> 
> I then compared a trace from a static and a dynamic 'make'. Both selected
> traces had a running time matching the median of that run. I cut away
> all common system calls etc., which resulted in just the steps involved in
> loading 'make' to a state where it can start executing.
> 
> In the static case, this took 0.26 ms. In the dynamic case, this took 0.57
> ms, or 0.31 ms (119%) more. The remaining ca. 0.11 ms (0.42 - 0.31) is
> mainly due to the dynamic version having slightly slower IO (read() and
> stat()). I have no idea why. The following are the notable steps in each
> version:
> 
> Static version:
>  * 0.09 ms spent execve'ing /usr/bin/make
>  * The rest is mostly sysctl calls
> 
> Dynamic version:
>  * 0.09 ms spent execve'ing ./dynamicmake and /libexec/ld-elf.so.1
>  * 0.18 ms spent loading libc.so.7 (incl. reading /etc/libmap.conf and 
> /var/run/ld-elf.so.hints)
>  * 0.11 ms spent calling sysarch(0x81,0x7fff) (static version also 
> does this but only uses 0.01 ms. No idea why)
>  * The rest is mostly sysctl and sigprocmask calls

Keep in mind that the first time each system call is invoked, the runtime
linker has to resolve the symbol.  You can force the linker to do all
that when the object is loaded instead by setting LD_BIND_NOW=yes in the
environment of dynamicmake.  If that makes the sysarch() call take 0.01
ms, then you know that the extra 0.10 ms is because of rtld doing the
lookup of sysarch() on the first sysarch() call.
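
A rough way to see the same effect from inside a program is to time the
first and second call of one libc function. The following is only a sketch
under stated assumptions: now_ns() and time_first_two_calls() are
hypothetical names, getppid() is an arbitrary stand-in for sysarch(), and
the binary is assumed to be dynamically linked with lazy binding enabled
(many toolchains now default to eager binding, in which case both numbers
should look alike).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Current value of the monotonic clock in nanoseconds. */
static int64_t
now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return ((int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec);
}

/*
 * Time two back-to-back calls of the same libc function.
 * getppid() is an arbitrary choice for illustration (assumption).
 * clock_gettime() itself is resolved before the first timestamp is
 * read, so its own symbol lookup is not counted in first_ns.
 */
static void
time_first_two_calls(int64_t *first_ns, int64_t *second_ns)
{
    int64_t t0, t1, t2;

    t0 = now_ns();
    (void)getppid();    /* first call: may include the rtld lookup */
    t1 = now_ns();
    (void)getppid();    /* second call: symbol already resolved */
    t2 = now_ns();

    *first_ns = t1 - t0;
    *second_ns = t2 - t1;
}
```

Running such a program once normally and once with LD_BIND_NOW=yes in the
environment should show whether the gap between the two numbers goes away
when rtld resolves everything at load time.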

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


fast bcopy...

2012-05-02 Thread Luigi Rizzo
as part of my netmap investigations, i was looking at how expensive
memory copies are, and here are a couple of findings
(the first one is obvious, the second one less so)

1. especially on 64-bit machines, always copy in multiples of at
   least 8 bytes (possibly even larger units). The bcopy code
   on amd64 seems to waste an extra 20ns (on a 3.4 GHz machine)
   when processing blocks of size 8n + {4,5,6,7}.
   The difference is significant; on that machine i measured

   bcopy(src, dst,  1) ~12.9ns (data in L1 cache)
   bcopy(src, dst,  3) ~12.9ns (data in L1 cache)
   bcopy(src, dst,  4) ~33.4ns (data in L1 cache) <--- NOTE
   bcopy(src, dst, 32) ~12.9ns (data in L1 cache)
   bcopy(src, dst, 63) ~33.4ns (data in L1 cache) <--- NOTE
   bcopy(src, dst, 64) ~12.9ns (data in L1 cache)

   Note how the two marked lines are much slower than the others.
   The same thing happens with data not in L1:

   bcopy(src, dst, 64) ~ 22ns  (not in L1)
   bcopy(src, dst, 63) ~ 44ns  (not in L1)
   ...

   Continuing the tests on larger sizes, for the next item:

   bcopy(src, dst,256) ~19.8ns (data in L1 cache)
   bcopy(src, dst,512) ~28.8ns (data in L1 cache)
   bcopy(src, dst,1K)  ~39.6ns (data in L1 cache)
   bcopy(src, dst,4K)  ~95.2ns (data in L1 cache)


   On an older P4 running 32-bit FreeBSD 4, the copy time seems less
   sensitive to odd sizes.

2. apparently, bcopy is not the fastest way to copy memory.
   For small blocks and multiples of 32-64 bytes, i noticed that
   the following is a lot faster (breaking even at about 1 KByte)

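/* Copies l bytes in 32-byte chunks; l must be a multiple of 32. */
static inline void
fast_bcopy(void *_src, void *_dst, int l)
{
        uint64_t *src = _src;
        uint64_t *dst = _dst;

        /* Four 8-byte moves (32 bytes) per iteration. */
        for (; l > 0; l -= 32) {
                *dst++ = *src++;
                *dst++ = *src++;
                *dst++ = *src++;
                *dst++ = *src++;
        }
}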

   fast_bcopy(src, dst, 32) ~  1.8ns (data in L1 cache)
   fast_bcopy(src, dst, 64) ~  2.9ns (data in L1 cache)
   fast_bcopy(src, dst,256) ~ 10.1ns (data in L1 cache)
   fast_bcopy(src, dst,512) ~ 19.5ns (data in L1 cache)
   fast_bcopy(src, dst,1K)  ~ 38.4ns (data in L1 cache)
   fast_bcopy(src, dst,4K)  ~152.0ns (data in L1 cache)

   fast_bcopy(src, dst, 32) ~ 15.3ns (not in L1)
   fast_bcopy(src, dst,256) ~ 38.7ns (not in L1)
   ...

   The old P4/32 bit also exhibits similar results.
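
Per-call figures like the ones above can be collected with a loop along
these lines. This is a minimal sketch and not the harness used for the
numbers in this message: copy_fn, libc_copy() and ns_per_copy() are
hypothetical names, memcpy() stands in for bcopy(), and the empty asm
statement is a GCC/Clang extension used as a compiler barrier. Because the
same buffers are reused on every iteration, it only approximates the
"data in L1 cache" case.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Same argument order as bcopy(): source first, then destination. */
typedef void (*copy_fn)(void *src, void *dst, int len);

/* Stand-in for bcopy() (assumption: libc memcpy behaves comparably). */
static void
libc_copy(void *src, void *dst, int len)
{
    memcpy(dst, src, (size_t)len);
}

/* Average wall-clock cost of one fn(src, dst, len) call, in nanoseconds. */
static double
ns_per_copy(copy_fn fn, void *src, void *dst, int len, long iters)
{
    struct timespec t0, t1;
    long i;

    assert(iters > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < iters; i++) {
        fn(src, dst, len);
        /* Compiler barrier so the copies are not optimised away. */
        __asm__ __volatile__("" ::: "memory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
        (double)(t1.tv_nsec - t0.tv_nsec)) / (double)iters);
}
```

Something like ns_per_copy(libc_copy, src, dst, 63, 10000000) versus the
same call with a length of 64 would be the comparison of interest; the
loop overhead and the indirect call are included in the result.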

Conclusion: if you have to copy packets you might be better off
padding the length to a multiple of 32, and using the following
function to get the best of both worlds.

Sprinkle some prefetch() for better taste.

// XXX only for multiples of 32 bytes, non overlapped.
static inline void
good_bcopy(void *_src, void *_dst, int l)
{
        uint64_t *src = _src;
        uint64_t *dst = _dst;
#define likely(x)       __builtin_expect(!!(x), 1)
#define unlikely(x)     __builtin_expect(!!(x), 0)
        /* Large copies: past the ~1 KByte break-even point bcopy wins. */
        if (unlikely(l >= 1024)) {
                bcopy(src, dst, l);
                return;
        }
        /* Small copies: four 8-byte moves (32 bytes) per iteration. */
        for (; l > 0; l -= 32) {
                *dst++ = *src++;
                *dst++ = *src++;
                *dst++ = *src++;
                *dst++ = *src++;
        }
}
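
Padding the length to a multiple of 32, as suggested above, could be done
with a helper like the following. It is a sketch only: COPY_CHUNK,
pad_copy_len() and padded_copy_fits() are hypothetical names, and the
caller is assumed to own buffers large enough to absorb the extra bytes on
both the source and the destination side.

```c
#include <assert.h>
#include <stddef.h>

#define COPY_CHUNK 32   /* hypothetical name: bytes moved per loop iteration */

/* Round len up to the next multiple of COPY_CHUNK (a power of two). */
static inline size_t
pad_copy_len(size_t len)
{
    assert(len <= (size_t)-1 - (COPY_CHUNK - 1));   /* no wrap-around */
    return ((len + (COPY_CHUNK - 1)) & ~(size_t)(COPY_CHUNK - 1));
}

/* Nonzero if a buffer of bufsize bytes can hold the padded copy of len bytes. */
static inline int
padded_copy_fits(size_t len, size_t bufsize)
{
    return (pad_copy_len(len) <= bufsize);
}
```

A 60-byte packet would then be copied as 64 bytes, which is only safe if
both buffers really have those 4 spare bytes.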

cheers
luigi


Re: fast bcopy...

2012-05-02 Thread Alex Dupre

Luigi Rizzo wrote:

> For small blocks and multiples of 32-64 bytes, i noticed that
> the following is a lot faster (breaking even at about 1 KBytes)
>
> static inline void
> fast_bcopy(void *_src, void *_dst, int l)
> {
>         uint64_t *src = _src;
>         uint64_t *dst = _dst;
>         for (; l > 0; l-=32) {
>                 *dst++ = *src++;
>                 *dst++ = *src++;
>                 *dst++ = *src++;
>                 *dst++ = *src++;
>         }
> }


DJB influence? :-)

--
Alex Dupre


Re: fast bcopy...

2012-05-02 Thread Steven Atreju
Luigi Rizzo wrote:
> 2. apparently, bcopy is not the fastest way to copy memory.

http://now.cs.berkeley.edu/Td/bcopy.html

Best Regards.

Steven.


ksh93 build failure

2012-05-02 Thread Derek Tattersall
On a 10.0 Current system, cvsupped today, ksh93 fails to build.  As best
I can determine, the failure is due to a problem of conflicting
includes.

In file included from /home/ports/usr/ports/shells/ksh93/work/arch/freebsd.amd64/include/ast/ast_wchar.h:113,
                 from /home/ports/usr/ports/shells/ksh93/work/arch/freebsd.amd64/include/ast/wchar.h:22,
                 from /home/ports/usr/ports/shells/ksh93/work/src/cmd/ksh93/include/lexstates.h:85,
                 from /home/ports/usr/ports/shells/ksh93/work/src/cmd/ksh93/include/shlex.h:32,
                 from /home/ports/usr/ports/shells/ksh93/work/src/cmd/ksh93/data/keywords.c:22:
/usr/include/../include/wchar.h:102: error: conflicting types for '_sfio_FILE'

Has anyone else run into this problem, and if so, what did you do about
it?  ksh93 builds without error on 9.0 Stable.
-- 
Best regards,
Derek Tattersall
d...@mebtel.net dlt...@yahoo.com dtatt...@gmail.com


Re: fast bcopy...

2012-05-02 Thread K. Macy
It's highly chipset and processor dependent what works best. Intel now
has non-temporal loads and stores which work much better in some cases
but provide little benefit in others.
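
For readers who have not used them, a copy built on non-temporal stores
might look roughly like this. It is a sketch under assumptions, not
anything from the FreeBSD tree: stream_copy() is a hypothetical name, it
uses the SSE2 intrinsic _mm_stream_si128() where the compiler defines
__SSE2__ and falls back to memcpy() elsewhere, it only covers the store
side, and it requires a 16-byte-aligned destination and a length that is a
multiple of 16. Whether it beats a plain copy has to be measured on the
machine in question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Copy len bytes from src to dst, bypassing the cache on the store side
 * where SSE2 is available.  Argument order follows bcopy(): source first.
 */
static void
stream_copy(const void *src, void *dst, size_t len)
{
    assert(len % 16 == 0);
    assert(((uintptr_t)dst & 15) == 0);
#if defined(__SSE2__)
    const __m128i *s = src;
    __m128i *d = dst;
    size_t i;

    for (i = 0; i < len / 16; i++) {
        /* Unaligned load, then a 16-byte non-temporal store. */
        _mm_stream_si128(&d[i], _mm_loadu_si128(&s[i]));
    }
    _mm_sfence();   /* order the streamed stores before later stores */
#else
    memcpy(dst, src, len);  /* portable fallback, ordinary stores */
#endif
}
```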

-Kip

On Wed, May 2, 2012 at 11:52 PM, Steven Atreju wrote:
> Luigi Rizzo wrote:
>> 2. apparently, bcopy is not the fastest way to copy memory.
>
> http://now.cs.berkeley.edu/Td/bcopy.html
>
> Best Regards.
>
> Steven.
> ___
> freebsd-...@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"



-- 
   “The real damage is done by those millions who want to 'get by.'
The ordinary men who just want to be left in peace. Those who don’t
want their little lives disturbed by anything bigger than themselves.
Those with no sides and no causes. Those who won’t take measure of
their own strength, for fear of antagonizing their own weakness. Those
who don’t like to make waves—or enemies.

   Those for whom freedom, honour, truth, and principles are only
literature. Those who live small, love small, die small. It’s the
reductionist approach to life: if you keep it small, you’ll keep it
under control. If you don’t make any noise, the bogeyman won’t find
you.

   But it’s all an illusion, because they die too, those people who
roll up their spirits into tiny little balls so as to be safe. Safe?!
From what? Life is always on the edge of death; narrow streets lead to
the same place as wide avenues, and a little candle burns itself out
just like a flaming torch does.

   I choose my own way to burn.”

   Sophie Scholl


Re: fast bcopy...

2012-05-02 Thread Arnaud Lacombe
Hi,

On Wed, May 2, 2012 at 5:52 PM, Steven Atreju wrote:
> Luigi Rizzo wrote:
>> 2. apparently, bcopy is not the fastest way to copy memory.
>
> http://now.cs.berkeley.edu/Td/bcopy.html
>
"Pentium 166, Triton Chipset, EDO memory"... ahem.

 - Arnaud

> Best Regards.
>
> Steven.


FreeBSD 10 prognostication...

2012-05-02 Thread Vance Siemens
Can't say that I wouldn't look forward to this, but it sounds a little off:

http://www.trollaxor.com/2012/05/freebsd-x-berkeley-unix-apple-quality.html

What do others think?

--Vance


Re: FreeBSD 10 prognostication...

2012-05-02 Thread Luke S. Crawford
On Thu, May 03, 2012 at 12:31:12AM -0400, Vance Siemens wrote:
> Can't say that I wouldn't look forward to this, but it sounds a little off:
> 
> http://www.trollaxor.com/2012/05/freebsd-x-berkeley-unix-apple-quality.html
> 
> What do others think?

Please leave trollaxor on Kuro5hin or slashdot where he belongs.   

I mean, this isn't even adequacy level stuff.  


Re: FreeBSD 10 prognostication...

2012-05-02 Thread O. Hartmann
On 05/03/12 06:31, Vance Siemens wrote:
> Can't say that I wouldn't look forward to this, but it sounds a little off:
> 
> http://www.trollaxor.com/2012/05/freebsd-x-berkeley-unix-apple-quality.html
> 
> What do others think?
> 
> --Vance

OMG!





Re: ksh93 build failure

2012-05-02 Thread Jason Hellenthal


On Wed, May 02, 2012 at 06:52:21PM -0400, Derek Tattersall wrote:
> On a 10.0 Current system, cvsupped today, ksh93 fails to build.  As best
> I can determine, the failure is due to a problem of conflicting
> includes.
> 
> In file included from /home/ports/usr/ports/shells/ksh93/work/arch/freebsd.amd64/include/ast/ast_wchar.h:113,
>                  from /home/ports/usr/ports/shells/ksh93/work/arch/freebsd.amd64/include/ast/wchar.h:22,
>                  from /home/ports/usr/ports/shells/ksh93/work/src/cmd/ksh93/include/lexstates.h:85,
>                  from /home/ports/usr/ports/shells/ksh93/work/src/cmd/ksh93/include/shlex.h:32,
>                  from /home/ports/usr/ports/shells/ksh93/work/src/cmd/ksh93/data/keywords.c:22:
> /usr/include/../include/wchar.h:102: error: conflicting types for '_sfio_FILE'
>
> Has anyone else run into this problem, and if so, what did you do about
> it?  ksh93 builds without error on 9.0 Stable.

Do you perhaps have devel/sfio installed on that machine? If so, can you
deinstall it and retry the build?

-- 

 - (2^(N-1))