On Thu, Nov 3, 2016 at 5:21 AM, Benjamin Coutu <ben.co...@zeyos.com> wrote: > Hello everyone, > > I think there are a few improvements that we could make to some fundamental > algorithms. > I'd like to point out one particular for now, to see if there is any interest > in these kind of low level code optimizations from the community. > > Please consider the following code snippet extracted from > > https://github.com/php/php-src/blob/49412756df244d94a217853395d15e96cb60e18f/Zend/zend_operators.h#L193 > > static zend_always_inline const void *zend_memrchr(const void *s, int c, > size_t n) > { > const unsigned char *e; > if (n <= 0) { > return NULL; > } > > for (e = (const unsigned char *)s + n - 1; e >= (const unsigned char *)s; > e--) { > if (*e == (const unsigned char)c) { > return (const void *)e; > } > } > return NULL; > } > > The issue with this is that on each loop we have to do two comparison tests, > one against the length boundary and one to check for the actual occurrence of > the search character. > > We can avoid checking the length boundary if we use a sentinel by setting the > last possible element to the character we are searching for, whereby we > guarantee that we will find it and no longer need the test inside the loop. > After the search we reset that character and then recheck the last iteration > step to see if we have aborted the loop because of reaching the end (false > positive) or because of an actual positive. > > A patch could look like this: > > static zend_always_inline const void *zend_memrchr(const void *s, int c, > size_t n) > { > const unsigned char *b; > const unsigned char *e; > const unsigned char t; > > if (n > 0) { > /* Initialize boundary pointers */ > b = (const unsigned char *)s; > e = b + n - 1; > > /* Set sentinel */ > t = *b; > *b = (const unsigned char)c; > > /* Search */ > while (*e != (const unsigned char)c) { > --e; > } > > /* Restore sentinel */ > *b = t; > > /* Recheck */ > if (e != b || t == (const unsigned char)c) { > return (const void *)e; > } > } > > return NULL; > } > > For the above example we could go one step further and test an entire > longword at a time with alignment on longword boundaries, just like some > modern C library implementations do. Which begs the question: Is there a > reason we are not using standard memrchr here? > > This kind of approach might be applicable to other string handling functions > as well. Should I look into this further? > > Cheers, > > Benjamin Coutu > > -- > > Bejamin Coutu > ben.co...@zeyos.com > > ZeyOS, Inc. > http://www.zeyos.com > > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: http://www.php.net/unsub.php
An obvious issue with this is that not all strings are writable. The signature of the function also specifies it is a pointer to a const so you shouldn't be able to do this anyway. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php