[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-20 Thread Markus
Markus added the comment: Eliminating duplicates before processing is faster once the overhead of the set operation is less than the time required to sort the larger dataset with duplicates. So we are basically comparing sort(data) to sort(set(data)). The optimum depends on the input data. py
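The trade-off described above can be measured directly. The following is only a sketch: the helper names make_addresses and time_sorts and the chosen data sizes are assumptions for illustration and do not come from the tracker issue. It times sorted(data) against sorted(set(data)) for inputs with different amounts of duplication.

```python
import ipaddress
import timeit


def make_addresses(unique, copies):
    """Return `unique` distinct IPv6 addresses, with the whole list repeated `copies` times."""
    base = int(ipaddress.IPv6Address("2001:db8::"))
    return [ipaddress.IPv6Address(base + i) for i in range(unique)] * copies


def time_sorts(data, number=20):
    """Return (seconds for sorted(data), seconds for sorted(set(data)))."""
    plain = timeit.timeit(lambda: sorted(data), number=number)
    deduped = timeit.timeit(lambda: sorted(set(data)), number=number)
    return plain, deduped


if __name__ == "__main__":
    # Hypothetical mixes: no duplicates, some duplicates, only duplicates.
    for unique, copies in [(1000, 1), (100, 10), (1, 1000)]:
        plain, deduped = time_sorts(make_addresses(unique, copies))
        print("unique=%d copies=%d sort=%.4fs sort(set)=%.4fs"
              % (unique, copies, plain, deduped))
```

Which variant wins is expected to vary with the input, as the comment says, so the printed timings are better read as a local measurement than as a general result.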

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-20 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: A single duplicated address is a degenerate case. When there are many duplicated addresses in the range, the patch causes a regression. $ ./python -m timeit -s "import ipaddress; ips = [ipaddress.ip_address('2001:db8::%x' % (i%100)) for i in range(10)]" -- "

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-20 Thread Markus
Markus added the comment: My initial patch was wrong with respect to _find_address_range: it did not loop over equal addresses. That is why performance with many equal addresses degraded when the set() was dropped. Here is a patch to fix _find_address_range, drop the set, and improve performance again. p
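To illustrate what "looping over equal addresses" means, here is a minimal sketch of a run finder that tolerates duplicates in sorted input. It is not the CPython _find_address_range code; the function name iter_address_runs and the use of plain integers in place of address objects are assumptions made for the example.

```python
def iter_address_runs(sorted_ints):
    """Yield (first, last) for each run of consecutive integers.

    The input must be sorted. Equal neighbouring values are treated as
    part of the current run instead of ending it.
    """
    it = iter(sorted_ints)
    try:
        first = last = next(it)
    except StopIteration:
        return  # empty input: no runs
    for value in it:
        if value == last:
            continue  # duplicate of the current end: stay in the run
        if value == last + 1:
            last = value  # consecutive: extend the run
            continue
        yield first, last  # gap found: close the run and start a new one
        first = last = value
    yield first, last


print(list(iter_address_runs([1, 1, 2, 3, 3, 5, 7, 7, 8])))
```

Without the duplicate check, each repeated value would close the current run early, so an input with many equal addresses would produce many one-element ranges.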

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: resolution: -> fixed; stage: patch review -> resolved; status: open -> closed

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Roundup Robot
Roundup Robot added the comment: New changeset 021b23a40f9f by Serhiy Storchaka in branch 'default': Issue #23266: Restore the performance of ipaddress.collapse_addresses() with https://hg.python.org/cpython/rev/021b23a40f9f

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Antoine Pitrou
Antoine Pitrou added the comment: Then +1. The patch looks fine to me.

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > What is the performance on the benchmark posted here? The same as with current code.

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Antoine Pitrou
Antoine Pitrou added the comment: Good catch. What is the performance on the benchmark posted here?

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Deduplication should not be omitted. Omitting it slowed down the collapsing of duplicated addresses. $ ./python -m timeit -s "import ipaddress; ips = [ipaddress.ip_address('2001:db8::1000') for i in range(1000)]" -- "ipaddress.collapse_addresses(ips)" Before f7508a17
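The command-line benchmark above can also be run from a script. This is a sketch of an equivalent measurement using the timeit module; the helper name collapse_all and the run count are assumptions. It wraps the call in list() so that the returned iterator is fully consumed, which the one-liner above does not do.

```python
import ipaddress
import timeit

# Same input as the benchmark above: 1000 copies of a single address.
ips = [ipaddress.ip_address("2001:db8::1000") for _ in range(1000)]


def collapse_all():
    """Collapse the duplicated addresses and consume the resulting iterator."""
    return list(ipaddress.collapse_addresses(ips))


if __name__ == "__main__":
    runs = 100  # arbitrary repeat count chosen for this sketch
    total = timeit.timeit(collapse_all, number=runs)
    print("%.1f usec per call" % (total / runs * 1e6))
```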

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Antoine Pitrou
Antoine Pitrou added the comment: Ok, I've committed the patch. Thank you! -- resolution: -> fixed; stage: patch review -> resolved; status: open -> closed

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Roundup Robot
Roundup Robot added the comment: New changeset f7508a176a09 by Antoine Pitrou in branch 'default': Issue #23266: Much faster implementation of ipaddress.collapse_addresses() when there are many non-consecutive addresses. https://hg.python.org/cpython/rev/f7508a176a09 -- nosy: +python-de

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Markus
Markus added the comment: I just signed the agreement, ewa@ is processing it.

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Antoine Pitrou
Antoine Pitrou added the comment: Here is an updated patch with a fix to the tests and docstrings. -- Added file: http://bugs.python.org/file37764/collapse.patch

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Antoine Pitrou
Antoine Pitrou added the comment: This is great, thank you. Can you sign the contributor's agreement? https://www.python.org/psf/contrib/contrib-form/ -- nosy: +pitrou, pmoody

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Antoine Pitrou
Changes by Antoine Pitrou: stage: -> patch review

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Markus
Markus added the comment: Added the testrig. -- Added file: http://bugs.python.org/file37763/testrig.tar.gz

[issue23266] Faster implementation to collapse non-consecutive ip-addresses

2015-01-18 Thread Markus
New submission from Markus: I found the code used to collapse addresses to be very slow on a large number (64k) of island addresses which are not collapsible. The code at https://github.com/python/cpython/blob/0f164ccc85ff055a32d11ad00017eff768a79625/Lib/ipaddress.py#L349 was found to be guilt
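For readers unfamiliar with the term, the sketch below shows what "island" addresses look like to ipaddress.collapse_addresses(). The helper make_islands, the 10.0.0.0 base and the small counts are assumptions for illustration; the report itself used about 64k addresses and a separate test rig.

```python
import ipaddress


def make_islands(count, start="10.0.0.0"):
    """Return `count` IPv4 addresses spaced two apart, so no two are adjacent."""
    base = int(ipaddress.IPv4Address(start))
    return [ipaddress.IPv4Address(base + 2 * i) for i in range(count)]


# Islands: every address stays a separate /32 network after collapsing.
islands = make_islands(8)
collapsed_islands = list(ipaddress.collapse_addresses(islands))

# Neighbours: four consecutive addresses merge into a single network.
neighbours = [ipaddress.IPv4Address("10.0.0.%d" % i) for i in range(4)]
collapsed_neighbours = list(ipaddress.collapse_addresses(neighbours))

print(len(collapsed_islands), collapsed_neighbours)
```

Raising the count passed to make_islands toward 65536 gives an input of the kind described in the report, where nothing can be merged and all the time goes into finding that out.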