rusi:
Can you please try one more experiment, Neil? Knock off all non-ASCII strings (paths) from your dataset and try again.
The results are the same: 0.40 for Python 3.2 (well, 0.001 less, but I don't think the timer is that accurate) and 0.78 for Python 3.3.
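For anyone who wants to repeat the experiment, here is a minimal sketch of the filtering step. It assumes the dataset is a plain list of str paths; the names ascii_only and sample are hypothetical and the sample data is made up, since the real dataset and the timed operation are not shown here. It avoids str.isascii(), which does not exist in Python 3.2 or 3.3.

```python
def ascii_only(paths):
    """Return only the strings whose characters are all in the ASCII range."""
    # ord(ch) < 128 works on every Python 3 release, including 3.2 and 3.3.
    return [p for p in paths if all(ord(ch) < 128 for ch in p)]


# Hypothetical sample data standing in for the real list of paths.
sample = ["/usr/lib/python3.3", "/home/user/caf\u00e9.txt", "C:\\temp\\data.csv"]
filtered = ascii_only(sample)
```

The benchmark would then be rerun on filtered instead of the full list, under both interpreter versions.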
Neil