On Mon, Mar 23, 2015 at 9:25 AM, David Miller <[email protected]> wrote:
>
> Ok, here is what I committed.

So I wonder - looking at that assembly, I get the feeling that it
isn't any better code than gcc could generate from simple C code.

Would it perhaps be better to turn memmove() into C?

That's particularly true because if I read this code right, it now
seems to seriously pessimise non-overlapping memmove's, in that it now
*always* uses that slow downward copy if the destination is below the
source.

Now, admittedly, the kernel doesn't use a lot of memmove's, but this
still falls back on the "byte at a time" model for a lot of cases (all
non-64-bit-aligned ones). I could imagine those existing. And some
people (reasonably) hate memcpy because they've been burnt by the
overlapping case and end up using memmove as a "safe alternative", so
it's not necessarily just the overlapping case that might trigger
this.

Maybe the code could be something like

    void *memmove(void *dst, const void *src, size_t n)
    {
        // non-overlapping cases
        if (src + n <= dst)
            return memcpy(dst, src, n);
        if (dst + n <= src)
            return memcpy(dst, src, n);

        // overlapping, but we know we
        //  (a) copy upwards
        //  (b) initialize the result in at most chunks of 64
        if (dst+64 <= src)
            return memcpy(dst, src, n);

        .. do the backwards thing ..
    }

(ok, maybe I got it wrong, but you get the idea).
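
For reference, the same idea can be spelled out as a self-contained
sketch in portable C. The names sketch_memmove(), copy_up() and
copy_down() are hypothetical, not kernel functions, and the helpers are
plain byte loops rather than anything tuned. Because calling the real
memcpy() on overlapping buffers is undefined in standard C, the sketch
uses the ascending byte loop for every overlapping case where the
destination is below the source; the 64-byte margin above only matters
for a memcpy() that stores in blocks, so it is left out here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: ascending byte copy. Stands in for a memcpy()
 * that is known to copy upwards. */
static void *copy_up(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* Hypothetical helper: descending byte copy ("the backwards thing"). */
static void *copy_down(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst + n;
    const unsigned char *s = (const unsigned char *)src + n;

    while (n--)
        *--d = *--s;
    return dst;
}

void *sketch_memmove(void *dst, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    /* Non-overlapping: the regions are at least n bytes apart,
     * so the regular memcpy() is fine. */
    if (d >= s ? d - s >= n : s - d >= n)
        return memcpy(dst, src, n);

    /* Overlapping, destination below source: an ascending copy
     * reads every source byte before it can be overwritten. */
    if (d < s)
        return copy_up(dst, src, n);

    /* Overlapping, destination above source: copy from the top down. */
    return copy_down(dst, src, n);
}
```

Only the last branch takes the slow backwards path; everything that
does not overlap still goes through memcpy().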

I *think* gcc should do ok on the above kind of code, and not generate
wildly different code from your handcoded version.

                            Linus