On 29 July 2013 07:25, Serhiy Storchaka <storch...@gmail.com> wrote:
> 28.07.13 22:59, Roy Smith wrote:
>> The input is an 8.8 Mbyte file containing about 570,000 lines (11,000
>> unique strings).
>
> Repeat your tests with totally unique lines.
Counter is about ½ the speed of defaultdict in that case (as opposed to ⅓).

>> The full profiler dump is at the end of this message, but the gist of
>> it is:
>
> Profiler affects execution time. In particular it slows down the Counter
> implementation, which uses more function calls. For real-world
> measurements use a different approach.

Doing some re-times, it seems that his originals for defaultdict,
exception and Counter were about right. I haven't timed the other.

>> Why is count() [i.e. collections.Counter] so slow?
>
> Feel free to contribute a patch which fixes this "wart". Note that
> Counter shouldn't be slowed down on mostly unique data.

I find it hard to agree that Counter should be optimised for the
unique-data case, as surely it's used far more often when there is a
point to counting? Also, couldn't Counter just extend defaultdict?
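For what it's worth, a rough micro-benchmark along these lines (not from
the original thread; the data sizes mirror Roy's description, and the
timings will vary by machine and Python version) might look like:

```python
# Sketch of a Counter vs defaultdict(int) comparison on mostly-repeated
# vs all-unique data. Data shapes are assumptions modelled on the thread:
# ~570,000 lines with ~11,000 unique strings.
import timeit
from collections import Counter, defaultdict

repeated = [str(i % 11_000) for i in range(570_000)]  # ~11k unique strings
unique = [str(i) for i in range(570_000)]             # every line unique

def count_with_counter(lines):
    return Counter(lines)

def count_with_defaultdict(lines):
    counts = defaultdict(int)
    for line in lines:
        counts[line] += 1
    return counts

for name, data in (("repeated", repeated), ("unique", unique)):
    for func in (count_with_counter, count_with_defaultdict):
        t = timeit.timeit(lambda: func(data), number=1)
        print(f"{name:8s} {func.__name__:22s} {t:.3f}s")
```

Both functions produce identical counts; the question in the thread is
only about how fast each gets there on the two data shapes.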
-- http://mail.python.org/mailman/listinfo/python-list