[issue33827] Generators with lru_cache can be non-intuituve
New submission from Michel Albert : Consider the following code: # filename: foo.py from functools import lru_cache @lru_cache(10) def bar(): yield 10 yield 20 yield 30 # This loop will work as expected for row in bar(): print(row) # This loop will not loop over anything. # The cache will return an already consumed generator. for row in bar(): print(row) This behaviour is natural, but it is almost invisible to the caller of "foo". The main issue is one of "surprise". When inspecting the output of "foo" it is clear that the output is a generator: >>> import foo >>> foo.bar() **Very** careful inspection will reveal that each call will return the same generator instance. So to an observant user the following is an expected behaviour: >>> result = foo.bar() >>> for row in result: ...print(row) ... 10 20 30 >>> for row in result: ... print(row) ... >>> However, the following is not: >>> import foo >>> result = foo.bar() >>> for row in result: ... print(row) ... 10 20 30 >>> result = foo.bar() >>> for row in result: ... print(row) ... >>> Would it make sense to emit a warning (or even raise an exception) in `lru_cache` if the return value of the cached function is a generator? I can think of situation where it makes sense to combine the two. For example the situation I am currently in: I have a piece of code which loops several times over the same SNMP table. Having a generator makes the application far more responsive. And having the cache makes it even faster on subsequent calls. But the gain I get from the cache is bigger than the gain from the generator. So I would be okay with converting the result to a list before storing it in the cache. What is your opinion on this issue? Would it make sense to add a warning? -- messages: 319279 nosy: exhuma priority: normal severity: normal status: open title: Generators with lru_cache can be non-intuituve type: behavior ___ Python tracker <https://bugs.python.org/issue33827> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20815] ipaddress unit tests PEP8
Michel Albert added the comment: It seems the contributor agreement form has been processed. As I understand it, the asterisk on my name confirms this. I also verified that this patch cleanly applies to the most recent revision. -- ___ Python tracker <http://bugs.python.org/issue20815> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
Michel Albert added the comment: Hi again, The contribution agreement has been processed, and the patch still cleanly applies to the latest revision of branch `default`. -- ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20826] Faster implementation to collapse consecutive ip-networks
Michel Albert added the comment: Sorry for the late reply. I wanted to take some time and give a more detailed explanation. At least to the best of my abilities :) I attached a zip-file with my quick-and-dirty test-rig. The zip contains: * gendata.py -- The script I used to generate test-data * testdata.lst -- My test-data set (for reproducability) * tester.py -- A simple script using ``timeit.timeit``. I am not sure how sensitive the data is I am working with, so I prefer not to put any of the real data on a public forum. Instead, I wrote a small script which generates a data-set which makes the performance difference visible (``gendata.py``). The data which I processed actually created an even worse case, but it's difficult to reproduce. In my case, the data-set I used is in the file named ``testdata.lst``. I then ran the operation 5 times using ``timeit`` (tester.py). Let me also outline an explanation to what it happening: It is possible, that through one "merge" operation, a second may become possible. For the sake of readability, let's use IPv4 addresses, and consider the following list: [a_1, a_2, ..., a_n, 192.168.1.0/31, 192.168.1.2/32, 192.168.1.3/32, b_1, b_2, ..., b_n] This can be reduced to [a_1, a_2, ..., a_n, 192.168.1.0/31, 192.168.1.2/31, b_1, b_2, ..., b_n] Which in turn can then be reduced to: [a_1, a_2, ..., a_n, 192.168.1.0/30, b_1, b_2, ..., b_n] The current implementation, sets a boolean (``optimized``) to ``True`` if any merge has been performed. If yes, it re-runs through the whole list until no optimisation is done. Those re-runs also include [a1..an] and [b1..bn], which is unnecessary. With the given data-set, this gives the following result: Execution time: 48.27790632040014 seconds ./python tester.py 244.29s user 0.06s system 99% cpu 4:04.51 total With the shift/reduce approach, only as many nodes are visited as necessary. If a "reduce" is made, it "backtracks" as much as possible, but not further. So in the above example, nodes [a1..an] will only be visited once, and [b1..bn] will only be visited once the complete optimisation of the example addresses has been performed. With the given data-set, this gives the following result: Execution time: 20.298685277199912 seconds ./python tester.py 104.20s user 0.14s system 99% cpu 1:44.58 total If my thoughts are correct, both implementations should have a similar "best-case", but the "worst-case" differs significantly. I am not well-versed with the Big-O notation, especially the best/average/worst case difference. Neither am I good at math. But I believe both are strictly speaking O(n). But more precisely, O(k*n) where k is proportional the number of reconciliation steps needed (that is, if one merger makes another merger possible). But it is much smaller in the shift/reduce approach as only as many elements need to be revisited as necessary, instead of all of them. -- Added file: http://bugs.python.org/file34583/testrig.zip ___ Python tracker <http://bugs.python.org/issue20826> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
Michel Albert added the comment: I made the changes mentioned by r.david.murray I am not sure if the modifications in ``Doc/whatsnew/3.5.rst`` are correct. I tried to follow the notes at the top of the file, but it's not clear to me if it should have gone into ``News/Misc`` or into ``Doc/whatsnew/3.5.rst``. On another side-note: I attached this as an ``-r3`` file, but I could have replaced the existing patch as well. Which method is preferred? Replacing existing patches on the issue or adding new revisions? -- Added file: http://bugs.python.org/file34588/net-in-net-r3.patch ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20815] ipaddress unit tests PEP8
New submission from Michel Albert: While I was looking at the source of the ipaddress unit-tests, I noticed a couple of PEP8 violations. This patch fixes these (verified using the ``pep8`` tool). There are no behavioural changes. Only white-space. Test-cases ran successfully before, and after the change was made. -- components: Tests files: test_ipaddress_pep8.patch keywords: patch messages: 212497 nosy: exhuma, ncoghlan, pmoody priority: normal severity: normal status: open title: ipaddress unit tests PEP8 versions: Python 3.5 Added file: http://bugs.python.org/file34257/test_ipaddress_pep8.patch ___ Python tracker <http://bugs.python.org/issue20815> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20815] ipaddress unit tests PEP8
Michel Albert added the comment: Thanks for the quick reply! I did not know the pep8 tool added it's own rules :( I have read PEP8 a long while ago and have since relied on the tool to do "the right thing". Many of it's idiosyncrasies have probably made their way into my blood since :( And you're right: The PEP is actually explicitly saying that it's okay to leave out the white-space to make operator precedence more visible (for reference: http://legacy.python.org/dev/peps/pep-0008/#other-recommendations). I will undo those changes. Is there anything else that immediately caught your eye so I can address it in the update? -- ___ Python tracker <http://bugs.python.org/issue20815> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20815] ipaddress unit tests PEP8
Michel Albert added the comment: Here's a new patch which addresses white-space issues without touching the old tests. -- Added file: http://bugs.python.org/file34265/test_ipaddress_pep8-r3.patch ___ Python tracker <http://bugs.python.org/issue20815> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
New submission from Michel Albert: The ipaddress module always returns ``False`` when testing if a network is contained in another network. However, I feel this should be a valid test. No? Is there any reason why this is fixed to ``False``? In case not, here's a patch which implements this test. Note that by design, IP addresses networks can never overlap "half-way". In cases where this should return ``False``, you either have a network that lies completely "to the left", or completely "to the right". In the case it should return ``True`` the smaller network is always completely bounded by the larger network's network- and broadcast address. I needed to change two containment tests as they were in conflict with this change. These tests were ``self.v6net not in self.v6net`` and ``self.v4net not in self.v4net``. The reason these two failed is that the new containment test checks the target range *including* broadcast and network address. So ``a in a`` always returns true. This could be changed by excluding one of the two boundaries, and by that forcing the "containee" to be smaller than the "container". But it would make the check a bit more complex, as you would need to add an exception for the case that both are identical. Backwards compatibility is a good question. Strictly put, this would break it. However, I can't think of any reason why anyone would expect ``a in a`` to be false in the case of IP-Addresses. Just as a side-note, I currently work at our national network provider and am currently implementing a tool dealing with a lot of IP-Addresses. We have run into the need to test ``net in net`` a couple of times and ran into bugs because the stdlib returns ``False`` where you technically expect it to be ``True``. -- components: Library (Lib) files: net-in-net.patch keywords: patch messages: 212550 nosy: exhuma, ncoghlan, pmoody priority: normal severity: normal status: open title: containment test for "ip_network in ip_network" versions: Python 3.5 Added file: http://bugs.python.org/file34266/net-in-net.patch ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
Michel Albert added the comment: Hmm... after thinking about this, I kind of agree. I was about to state something about the fact that you could consider networks like an "ordered set". And use that to justify my addition :) But the more I think about it, the more I am okay with your point. I quickly tested the following: >>> a = ip_network('10.0.0.0/24') >>> b = ip_network('10.0.0.0/30') >>> a <= b True >>> b <= a False Which is wrong when considering "containement". What about an instance-method? Something like ``b.contained_in(a)``? At least that would be explicit and avoids confusion. Because the existing ``__lt__`` implementation makes sense in the way it's already used. -- ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20826] Faster implementation to collapse consecutive ip-networks
New submission from Michel Albert: This alternative implementation runs over the ``addresses`` collection only once, and "backtracks" only if necessary. Inspired by a "shift-reduce" approach. Technically both are O(n), so the best case is always the same. But the old implementation runs over the *complete* list multiple times until it cannot make any more optimisations. The new implementation only repeats the optimisation on elements which require reconciliation. Tests on a local machine have shown a considerable increase in speed on large collections of elements (iirc about twice as fast on average). -- components: Library (Lib) files: faster-collapse-addresses.patch keywords: patch messages: 212553 nosy: exhuma, ncoghlan, pmoody priority: normal severity: normal status: open title: Faster implementation to collapse consecutive ip-networks type: performance versions: Python 3.5 Added file: http://bugs.python.org/file34267/faster-collapse-addresses.patch ___ Python tracker <http://bugs.python.org/issue20826> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20815] ipaddress unit tests PEP8
Michel Albert added the comment: I strongly agree with Raymond's points! They are all valid. I should note, that I submitted this patch to - as mentioned by Nick - familiarise myself with the patch submission process. I decided to make harmless changes which won't risk braking anything. -- ___ Python tracker <http://bugs.python.org/issue20815> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
Michel Albert added the comment: I second "supernet_of" and "subnet_of". I'll implement it as soon as I get around it. I have been thinking about using ``in`` and ``<=`` and, while I initially liked the idea for tests, I find both operators too ambiguous. With ``in`` there's the already mentioned ambiguity of containment/inclusion. And ``<=`` could mean is a smaller size (has less individual hosts), but could also mean that it is a subnet, or even that it is "located to the left". Naming it ``subnet_of`` makes it 100% clear what it does. Currently, ``a <= b`` returns ``True`` if a comes before/lies on the left of b. -- ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
Michel Albert added the comment: Here's a new patch implementing both ``subnet_of`` and ``supernet_of``. It also contains the relevant docs and unit-tests. -- Added file: http://bugs.python.org/file34292/net-in-net-r2.patch ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
Michel Albert added the comment: Yes. I signed it last Friday if I recall correctly. As I understood it, the way for you to tell if it's done, is that my username will be followed by an asterisk. But I'm not in a hurry. Once I get the confirmation, I can just ping you again via a comment here, so you don't need to monitor it yourself. -- ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20815] ipaddress unit tests PEP8
Michel Albert added the comment: Did so already last weekend. I suppose it will take some time to be processed. I can ping you via a message here once I receive the confirmation. -- ___ Python tracker <http://bugs.python.org/issue20815> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
Michel Albert added the comment: I just realised that the latest patch on this no longer applies properly. I have fixed the issue and I am currently in the process of running the unit-tests which takes a while. Once those pass, I'll update some metadata and resubmit. -- ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
Michel Albert added the comment: Test pass properly. Is there anything else left to do? Here's the fixed patch (net-in-net-r4.patch) -- Added file: http://bugs.python.org/file43534/net-in-net-r4.patch ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
Michel Albert added the comment: Updated patch, taking into account notes from the previous patch-reviews -- Added file: http://bugs.python.org/file43535/net-in-net-r5.patch ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
Michel Albert added the comment: I don't quite see how the operator module could help. I don't have much experience with it though, so I might be missing something... I don't see how I can relate one check to the other. The example I gave in the patch review was the following: With the existing implementation: '192.168.1.0/25' subnet of '192.168.1.128/25' -> False '192.168.1.0/25' supernet of '192.168.1.128/25' -> False With the proposal to simply return "not subnet_of(...)" it would become: '192.168.1.0/25' subnet of '192.168.1.128/25' -> False '192.168.1.0/25' supernet of '192.168.1.128/25' -> True which would be wrong. I have now added the new test-cases for the TypeError and removed the code-duplication by introducing a new "private" function. Let me know what you think. I am running all test cases again and I'll uploaded it once they finished. -- ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
Michel Albert added the comment: New patch with proposed changes. -- Added file: http://bugs.python.org/file43537/net-in-net-r6.patch ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20825] containment test for "ip_network in ip_network"
Michel Albert added the comment: Are there any updates on this? Not sure if it's too late again to get it applied for the next Python (3.6) release? -- ___ Python tracker <http://bugs.python.org/issue20825> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com