On 29 July 2013 07:25, Serhiy Storchaka <storch...@gmail.com> wrote:
> 28.07.13 22:59, Roy Smith wrote:
>> The input is an 8.8 Mbyte file containing about 570,000 lines (11,000
>> unique strings).
>
> Repeat your tests with totally unique lines.
Counter is about ½ the speed of defaultdict in that case (as opposed to ⅓).

>> The full profiler dump is at the end of this message, but the gist of
>> it is:
>
> Profiler affects execution time. In particular it slows down the Counter
> implementation, which uses more function calls. For real-world
> measurements use a different approach.

Doing some re-times, it seems that his originals for defaultdict,
exception and Counter were about right. I haven't timed the other.

>> Why is count() [i.e. collections.Counter] so slow?
>
> Feel free to contribute a patch which fixes this "wart". Note that
> Counter shouldn't be slowed down on mostly unique data.

I find it hard to agree that Counter should be optimised for the
unique-data case, as surely it's used far more often when there is a
point to counting? Also, couldn't Counter just extend defaultdict?
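For what it's worth, a rough micro-benchmark along these lines (not from
the original thread; the data sizes mirror Roy's description, and the
timings will vary by machine and Python version) might look like:

```python
# Sketch of a Counter vs defaultdict(int) comparison on mostly-repeated
# vs all-unique data. Data shapes are assumptions modelled on the thread:
# ~570,000 lines with ~11,000 unique strings.
import timeit
from collections import Counter, defaultdict

repeated = [str(i % 11_000) for i in range(570_000)]  # ~11k unique strings
unique = [str(i) for i in range(570_000)]             # every line unique

def count_with_counter(lines):
    return Counter(lines)

def count_with_defaultdict(lines):
    counts = defaultdict(int)
    for line in lines:
        counts[line] += 1
    return counts

for name, data in (("repeated", repeated), ("unique", unique)):
    for func in (count_with_counter, count_with_defaultdict):
        t = timeit.timeit(lambda: func(data), number=1)
        print(f"{name:8s} {func.__name__:22s} {t:.3f}s")
```

Both functions produce identical counts; the question in the thread is
only about how fast each gets there on the two data shapes.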
-- http://mail.python.org/mailman/listinfo/python-list