Hi,

On Wednesday, November 18, 2015 at 10:00:43 AM UTC+1, Nagy László Zsolt wrote:
> > Perhaps there is a size threshold? You could experiment with different
> > block sizes in the following f.read() replacement:
> >
> > def read_chunked(f, size=2**20):
> >     read = functools.partial(f.read, size)
> >     return "".join(iter(read, ""))
> >
> Under win32 platform, my experience is that the fastest way to read
> binary file from disk is the mmap module. You should try that too.

Thank you for your suggestion. I have tried that now, and my naive approach was this:

    start = time.time()
    fid = open(filename, 'r+b')
    strs = mmap.mmap(fid.fileno(), 0, access=mmap.ACCESS_READ)[:]
    end = time.time()
    print 'mmap.read time:', end-start

It takes about 2.7 seconds. Not a bad improvement :-)

Unfortunately, when the file is on a network drive, all the other approaches took around 25-30 seconds to load, while the mmap one clocks in at 110 seconds :-(

Andrea.
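
For anyone who wants to repeat the comparison, here is a minimal sketch that puts the two approaches from this thread side by side. It is written for Python 3 (the snippets above are Python 2), and the names read_mmap and timed are my own helpers, not anything from the thread. Note that on Python 3 the sentinel passed to iter() has to be b"" for a file opened in binary mode, otherwise the loop never ends. Also, mmap with length 0 raises ValueError on an empty file, which this sketch does not handle.

```python
import functools
import mmap
import time


def read_chunked(path, size=2**20):
    """Read the whole file in blocks of `size` bytes (binary mode)."""
    with open(path, "rb") as f:
        read = functools.partial(f.read, size)
        # f.read() returns b"" at end of file, so b"" is the stop sentinel.
        return b"".join(iter(read, b""))


def read_mmap(path):
    """Map the file read-only and copy its contents into a bytes object."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; slicing copies it out of the mapping.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:]


def timed(func, *args):
    """Call func(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

Something like `data, secs = timed(read_mmap, filename)` then gives a number comparable to the ones above. The timings depend heavily on the OS file cache and on whether the file is local or on a network drive, so it is worth running each variant several times and in different orders.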