On Sat, 7 May 2011, Hans Petter Selasky wrote:

On Saturday 07 May 2011 19:13:27 m...@freebsd.org wrote:
On Sat, May 7, 2011 at 9:36 AM, Hans Petter Selasky <hsela...@c2i.net>
wrote:
On Saturday 07 May 2011 18:28:24 Hans Petter Selasky wrote:
- Use memcpy() instead of bcopy().
- Use memset() instead of bzero().

Why?  It usually falls through to the same code in libc.  Is there
some standardization on memfoo versus bfoo here?

I thought that memset() was a compiler builtin and bzero() optimised for
larger amounts of data?

In the kernel, compiler builtins aren't used, memset() is slightly
pessimized, and bzero() is not optimized (except that in old versions
of FreeBSD on i386, attempts were made to optimize bzero() for large
data at a tiny cost to small data).  A better implementation would use
the compiler builtin for both.  My version does this, but the gains (or
losses) from using builtins for this and other things in the kernel
are insignificant.  Here it is for bzero():

#define bzero(p, n) ({                                          \
        if (__builtin_constant_p(n) && (n) <= 32)               \
                __builtin_memset((p), 0, (n));                  \
        else                                                    \
                (bzero)((p), (n));                              \
})

This hard-codes the limit of 32 for the builtin since some versions of
gcc use a worse limit.

In userland, on at least amd64 and i386, the extern bzero() and memset()
are unoptimized, but the compiler builtin is used for memset() only.  A
better implementation of bzero() would use the compiler builtin for it
too.  The above is not good enough for libc, since it evaluates args more
than once and has a hard-coded gccism.
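As a rough sketch of the multiple-evaluation point only: the variant
below copies the arguments into locals so that each is evaluated once.
The name bzero_once and the locals p_ and n_ are hypothetical and not
from FreeBSD.  It still relies on GNU C extensions (statement
expressions and __builtin_* calls), so it does nothing about the
gccism, and it assumes __builtin_constant_p() does not evaluate its
operand, as documented for gcc.

```c
#include <stddef.h>
#include <strings.h>

/*
 * Hypothetical single-evaluation variant of the macro above.
 * (p) and (n) are each evaluated exactly once, in the initializers.
 * __builtin_constant_p(n) only inspects n and does not evaluate it.
 * Caveat: an argument expression that itself mentions p_ or n_ would
 * be captured by the locals declared here.
 */
#define bzero_once(p, n) ({                                     \
        void *p_ = (p);                                         \
        size_t n_ = (n);                                        \
        if (__builtin_constant_p(n) && n_ <= 32)                \
                __builtin_memset(p_, 0, n_);                    \
        else                                                    \
                bzero(p_, n_);                                  \
})
```

With this form, a call such as bzero_once(q++, 16) advances q once,
whereas the macro above would expand q++ in more than one place.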

The correct optimizations for bzero() etc. are very machine-dependent
and context-dependent and are far too hard for anyone or the compiler
or the CPU to get right (but I believe newer Intel CPUs are closer to
making unoptimized stosb as fast as possible).  Context-dependent parts
include whether the data should go through cache(s) (it shouldn't iff
it won't be used soon and the memory system is such that not going
through caches is either faster or saves time later).

Bruce
_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
