Bernd Schmidt wrote:
> On 09/15/2016 03:38 PM, Wilco Dijkstra wrote:
> > __rawmemchr is not the fastest on any target I tried, including x86,
>
> Interesting. Care to share your test program? I just looked at the libc
> sources and strlen/rawmemchr are practically identical code so I'd
> expect any difference to be lost in the noise. Of course there might be
> inlines interfering with the comparison.

It's glibc/benchtests/bench-strlen.c, slightly modified to compare strlen,
rawmemchr and strchr. Even if the code appears identical, the inner loop of
strlen is much faster than that of strchr and rawmemchr at larger sizes:

                             strchr       rawmemchr    strlen
Length 4096, alignment 12:   3.35132e+06  2.39842e+06  1.88962e+06

> > So the only reasonable optimization is to always emit a + strlen (a).
>
> Not sure about "only reasonable" but on the whole I'd agree that it's
> reasonable and we shouldn't let the perfect be the enemy of the good
> here. I'm sure we can come up with lots of different ways to do this but
> let's just pick one and if the one Wilco submitted looks decent let's
> just put it in.
>
> Out of curiosity, is there real-world code that this is intended to
> optimize?

I noticed rawmemchr taking non-trivial amounts of time in various profiles
despite no use of rawmemchr in any of the source code. It's apparently a
common idiom to use strchr (s, 0) to find the end of a string. Given that
strchr is slower than strlen, the GLIBC headers change it to rawmemchr.
However, this makes things even slower, since few targets have an optimized
rawmemchr, and on the targets that do, strlen is still faster.

So this is one of many improvements to ensure that GCC/GLIBC by default do
optimizations in the way that is best for most targets. If a particular
target wants to do something different, that is of course always possible.

Wilco
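
For readers unfamiliar with the idiom, here is a minimal sketch of the two equivalent forms discussed above: strchr (s, 0) as it tends to appear in source code, and the a + strlen (a) form proposed as the replacement. The function names end_via_strchr, end_via_strlen and check_equivalence are hypothetical, chosen only for illustration; they are not taken from the patch or from GLIBC.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: the idiom as written in source code.
   Searching for 0 makes strchr return a pointer to the terminating NUL. */
const char *end_via_strchr(const char *s)
{
    return strchr(s, '\0');
}

/* Hypothetical helper: the proposed replacement form, a + strlen (a).
   It yields the same pointer, computed from the string length. */
const char *end_via_strlen(const char *s)
{
    return s + strlen(s);
}

/* Hypothetical helper: both forms must point at the same terminator. */
void check_equivalence(const char *s)
{
    assert(end_via_strchr(s) == end_via_strlen(s));
    assert(*end_via_strlen(s) == '\0');
}
```

Both helpers return the same pointer for any NUL-terminated string, so the transformation does not change behaviour. Which form is faster depends on the target's strlen, strchr and rawmemchr implementations, as the numbers above show.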