Antoine Pitrou wrote:
> On Fri, 19 Mar 2010 17:18:17 +0000, djc wrote:
>> changing
>>     with open(filename, 'rU') as tabfile:
>> to
>>     with codecs.open(filename, 'rU', 'utf-8', 'backslashreplace') as tabfile:
>>
>> and
>>     with open(outfile, 'wt') as out_part:
>> to
>>     with codecs.open(outfile, 'w', 'utf-8') as out_part:
>>
>> causes a program that runs in
>> 43 seconds to take 4 minutes to process the same data.
>
> codecs.open() (and the object it returns) is slow as it is written in
> pure Python.
>
> Accelerated reading and writing of unicode files is available in Python
> 2.7 and 3.1, using the new `io` module.
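
For illustration, the `io`-based approach suggested in the quoted reply might look roughly like the sketch below. It is not taken from the original program: the function name copy_text_utf8 is hypothetical, and errors='replace' is an assumed stand-in for the 'backslashreplace' handler used in the codecs.open() call. io.open() accepts the encoding directly, and in Python 3 the built-in open() is the same function.

```python
import io


def copy_text_utf8(src_path, dst_path):
    """Copy a UTF-8 text file line by line and return the number of lines.

    Hypothetical helper for illustration only; the real program's
    per-line processing would replace the plain write below.
    """
    line_count = 0
    # Text mode with the default newline handling translates '\r\n' and
    # '\r' to '\n' on input, similar in spirit to the old 'rU' mode.
    # errors='replace' is an assumption, not the original handler.
    with io.open(src_path, 'r', encoding='utf-8', errors='replace') as src:
        with io.open(dst_path, 'w', encoding='utf-8') as dst:
            for line in src:
                dst.write(line)  # 'line' is a unicode string
                line_count += 1
    return line_count
```

Whether this recovers the original 43-second run time would need to be measured on the actual data; the sketch only shows the shape of the change.
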
Thank you for a clear and to-the-point explanation. I shall concentrate on
finding an optimal time to upgrade from Python 2.6.

-- 
David Clark, MSc, PhD.
UCL Centre for Publishing
Gower Str London WCIE 6BT