On Fri, 20 Sep 2019 at 08:14, Bruce Evans <b...@optusnet.com.au> wrote:
>
> Optimizing this function [memchr] is especially unimportant,

Why?

Really, we should provide optimized assembly implementations of string
functions deemed important, but it will take time for that to happen
on all architectures.

> and this version has mounds of style bugs.

Yes, I've wondered about applying style(9) to the handful of imported
string routines. In general we don't really expect ongoing updates, so
it's probably no concern from a maintenance perspective.

> > -             do {
> > -                     if (*p++ == (unsigned char)c)
> > -                             return ((void *)(p - 1));
> > -             } while (--n != 0);
>
> Old KNF code.  In the !__GNUC__ #else clause , this is replaced by
> equivalent code with a style that is very far from KNF.

Functionally equivalent, although I compared the compiled output of
both cases and what's currently there is somewhat smaller. Note that
it's not an #else case, the equivalent loop is used in both cases -
handling the non-word-size residue in the __GNUC__ case.

> > +void *memchr(const void *src, int c, size_t n)
> > +{
> > +     const unsigned char *s = src;
> > +     c = (unsigned char)c;
> > +#ifdef __GNUC__
> > +     for (; ((uintptr_t)s & ALIGN) && n && *s != c; s++, n--);
> > +     if (n && *s != c) {
> > +             typedef size_t __attribute__((__may_alias__)) word;
>
> This seems to have no dependencies on __GNUC__ except the use of 
> anti-aliasing,

Yes, that is the reason.

> I checked that this optimization actually works.  For long strings, it is
> almost sizeof(size_t) times faster, by accessing memory with a size
> sizeof(size_t).  size_t is a not very good way of hard-coding the word size.

Indeed - I wouldn't have imported this change if I didn't observe a
reasonable benefit when trying it out. As for word size it could be
replaced using LONG_BIT I suppose (as strspn.c, strlen.c), if it's
getting reworked anyway.

> Related silly optimizations:
> - i386 has memchr() in asm.  This uses scasb, which was last optimal on the
>    8088, or perhaps the original i386.  On freefall, it is several times 
> slower
>    than the naive translation of the naive C code.

I posted a review (https://reviews.freebsd.org/D21785) to remove the
scasb implementation. I haven't investigated wmemchr.

> - i386 pessimizes several old string functions using scasb.

The only other one I see is in strcat.S, which should probably be
retired as well.  Are there others?

> - i386 also does dubious alignment optimizations

I don't think there's much value in putting a lot of time into i386,
but am happy to address straightforward issues (such as removing
pessimal MD implementations). I haven't looked at these alignment
optimizations.

> - amd64 pessimizes strcpy() by implementing it as a C wrapper for stpcpy().
>    The wrapper code is MI, but is only used for amd64.

And it appears the MD one is selected by Makefile .PATH ordering,
which seems fragile.
_______________________________________________
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

Reply via email to