"Paul Watson" <[EMAIL PROTECTED]> writes:
> "Mike Meyer" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
>> "Paul Watson" <[EMAIL PROTECTED]> writes:
> ...
>> Did you do timings on it vs. mmap? Having to copy the data multiple
>> times to deal with the overlap - thanks to strings being immutable -
>> would seem to be a lose, and makes me wonder how it could be faster
>> than mmap in general.
>
> The only thing copied is a string one byte less than the search string for
> each block.

Um - you removed the code, but I could have *sworn* that it did
something like:

    buf = buf[testlen:] + f.read(bufsize - testlen)

which should cause the creation of three strings: the last few bytes
of the old buffer, a new bufferful from the read, then the sum of
those two - created by copying the first two into a new string. So you
wind up copying all the data.

Which, as you showed, doesn't take nearly as much time as using mmap.

        Thanks,
        <mike
-- 
Mike Meyer <[EMAIL PROTECTED]>          http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
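
For readers without the removed code in front of them, here is a rough sketch of the kind of chunked search under discussion. It is not Paul's original code: the name find_in_stream, the default bufsize, the use of Python 3 bytes, and the choice to keep exactly len(needle) - 1 trailing bytes are all assumptions made for illustration. The comments mark the three strings described above.

```python
def find_in_stream(f, needle, bufsize=64 * 1024):
    """Return the offset of the first occurrence of needle in the binary
    file object f, or -1 if it is absent. Hypothetical helper, not the
    code from the original post."""
    if not needle:
        return 0
    overlap = len(needle) - 1  # bytes carried over so a match can span two reads
    if bufsize <= overlap:
        raise ValueError("bufsize must be larger than len(needle) - 1")

    buf = f.read(bufsize)
    base = 0  # file offset of buf[0]
    while True:
        pos = buf.find(needle)
        if pos != -1:
            return base + pos
        chunk = f.read(bufsize - overlap)  # string 2: a fresh bufferful
        if not chunk:
            return -1
        tail = buf[len(buf) - overlap:]    # string 1: last few bytes of the old buffer
        base += len(buf) - len(tail)
        buf = tail + chunk                 # string 3: both copied into a new string
```

Each pass allocates the tail, the chunk, and their concatenation, so every byte that is read gets copied once more into the new buffer, which is the cost being pointed at here. Whether that beats mmap on a given platform is something only a timing run can settle.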