[issue26904] Difflib quick_ratio() could use Counter()

2017-09-07 Thread Michael Cuthbert
Michael Cuthbert added the comment: I've tried to keep the system from being slower on small sets by not creating a Counter for fewer than 60 items, and managed to get within 10% of the speed for small sequences while maintaining the 3-3.6x speedup for big comparisons and testing that the results
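A minimal sketch of the small-input cutoff described in this comment, not the actual patch. The names SMALL_CUTOFF, count_elements and overlap are hypothetical, and applying the cutoff to each sequence's own length is an assumption; only the figure of 60 items comes from the comment.

```python
from collections import Counter

# Cutoff quoted in the comment; how the patch applies it is an assumption here.
SMALL_CUTOFF = 60


def count_elements(seq):
    """Count occurrences, skipping Counter for short sequences (hypothetical helper)."""
    if len(seq) < SMALL_CUTOFF:
        # Plain dict loop, similar in spirit to the existing difflib code.
        counts = {}
        for elt in seq:
            counts[elt] = counts.get(elt, 0) + 1
        return counts
    # Counter is a dict subclass, so callers can treat both results alike.
    return Counter(seq)


def overlap(a, b):
    """Size of the multiset intersection of a and b (hypothetical helper)."""
    acount = count_elements(a)
    bcount = count_elements(b)
    return sum(min(n, bcount.get(elt, 0)) for elt, n in acount.items())
```

Both branches return a mapping from element to count, so the overlap computation does not need to know which one was taken.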

[issue26904] Difflib quick_ratio() could use Counter()

2017-05-15 Thread Michael Cuthbert
Michael Cuthbert added the comment: Poking to see if there's still interest in getting this into 3.7. Thanks! -- versions: +Python 3.7 -Python 3.6

[issue26904] Difflib quick_ratio() could use Counter()

2016-05-03 Thread Michael Cuthbert
Michael Cuthbert added the comment: Here are the results I obtained, along with the test code I used to get them. The test code also includes a "hybrid" version which I did not propose, but maybe I should have, which uses the old code for very short (but not degenerate) tests and then the new c

[issue26904] Difflib quick_ratio() could use Counter()

2016-05-03 Thread Michael Cuthbert
Michael Cuthbert added the comment: @wolma -- you're right that the in-place __iand__ version of Counter is substantially faster -- it is still slower than the current code (since it is still basically a superset of it). However, testing shows that it is close enough to the current code as to

[issue26904] Difflib quick_ratio() could use Counter()

2016-05-02 Thread Wolfgang Maier
Wolfgang Maier added the comment: Given your comment about sum((fullacount & fullbcount).values()), why not use its in-place version:

    fullacount &= fullbcount
    matches = sum(fullacount.values())

? -- nosy: +wolma
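To make the suggestion concrete, here is a small sketch comparing the two spellings. The function names are hypothetical; the only library behaviour relied on is that Counter supports both `&` and the in-place `&=`, each keeping the minimum count per element.

```python
from collections import Counter


def matches_copy(a, b):
    """Intersection via '&', which builds a new Counter (hypothetical helper)."""
    return sum((Counter(a) & Counter(b)).values())


def matches_inplace(a, b):
    """Intersection via '&=', which updates the left-hand Counter in place."""
    fullacount = Counter(a)
    fullbcount = Counter(b)
    fullacount &= fullbcount  # keeps min(count_a, count_b) for each element
    return sum(fullacount.values())
```

The in-place form destroys the original counts of the first sequence, so it only fits where those counts are not needed again afterwards.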

[issue26904] Difflib quick_ratio() could use Counter()

2016-05-01 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Could you provide the script or data you used for benchmarking, so we can reproduce this? -- nosy: +rhettinger, serhiy.storchaka stage: -> patch review

[issue26904] Difflib quick_ratio() could use Counter()

2016-05-01 Thread Michael Cuthbert
New submission from Michael Cuthbert: The implementation used in difflib.SequenceMatcher().quick_ratio() counts how often each member of the sequence (character, list entry, etc.) appears in order to calculate its upper bound on ratio(). Counting how often an entry appears in an iterable has been sped up
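
As a rough illustration of the idea in this submission (not the proposed patch itself), the bound that quick_ratio() returns can be written with Counter as twice the multiset overlap divided by the combined length. The function name counter_quick_ratio is hypothetical, and the empty-input result of 1.0 is chosen to mirror what difflib returns.

```python
from collections import Counter
from difflib import SequenceMatcher


def counter_quick_ratio(a, b):
    """Counter-based version of the quick_ratio() bound (hypothetical helper)."""
    total = len(a) + len(b)
    if total == 0:
        # difflib reports 1.0 when both sequences are empty.
        return 1.0
    # Elements shared by both sequences, counted with multiplicity.
    matches = sum((Counter(a) & Counter(b)).values())
    return 2.0 * matches / total


if __name__ == "__main__":
    a, b = "abcd", "bcde"
    print(counter_quick_ratio(a, b))
    print(SequenceMatcher(None, a, b).quick_ratio())
```

The two printed values should agree, since both count the elements the sequences have in common regardless of order.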