Steve Howell <showel...@yahoo.com> writes: > I haven't profiled deque vs. list, but I think you are correct about > pop() possibly being a red herring.... > For really large lists, I suppose memmove() would eventually start to > become a bottleneck, but it's brutally fast when it just moves a > couple kilobytes of data around.
One way to think of Python is as a scripting wrapper around a bunch of C functions, rather than as a full-fledged programming language. Viewed that way, list operations like pop(0) are essentially constant time unless the list is quite large. By that I mean you can implement classic structures like doubly-linked lists using Python tuples, but even though inserting into the middle of them is theoretically O(1), the memmove's of the native list operations will be much faster in practice. Programs dealing with large lists (more than a few thousand elements) are obviously different and if your program is using such large lists, you have to plan a little differently when writing the code. >> I've often run applications with millions of lists > I bet even in your application, the amount of memory consumed by the > PyListObjects themselves is greatly dwarfed by other objects, notably > the list elements themselves Such lists often would just one element or even be empty. For example, you might have a dictionary mapping names to addresses. Most people have just one address, but some might have no address, and a few might have more than one address, so you would have a list of addresses for each name. Of course the dictionary slots in that example would also use space. > Well, I am not trying to solve problems wastefully here. CPU cycles > are also scarce, so it seems wasteful to do an O(N) memmove that could > be avoided by storing an extra pointer per list. Realistically the CPython interpreter is so slow that the memmove is unnoticable, and Python (at least CPython) just isn't all that conductive to writing fast code. It makes up for this in programmer productivity for the many sorts of problems in which moderate speed is acceptable. > Thanks for your patience in responding to me, despite the needlessly > abrasive tone of my earlier postings. I wondered whether you might have come over from the Lisp newsgroups, which are pretty brutal. We try to be friendlier here (not that we're always successful). Anyway, welcome. > 1) Summarize all this discussion and my lessons learned in some kind > of document. It does not have to be a PEP per se, but I could provide > a useful service to the community by listing pros/cons/etc. I suppose that can't hurt, but there are probably other areas (multicore parallelism is a perennial one) of much higher community interest. http://wiki.python.org/moin/ is probably a good place to put such a document. > 2) I would still advocate for removing the warning against list.pop > (0) from the tutorial. I agree with Steven D'Aprano that docs really > should avoid describing implementation details in many instances On general principles I agree with Alex Stepanov that the running time of a function should be part of its interface (nobody wants to use a stack of popping an element takes quadratic time) and therefore should be stated in the docs. Python just has a weird incongruence between the interpreter layer and the C layer, combined with a library well-evolved for everyday problem sizes, so the traditional asymptotic approach to algorithm selection often doesn't give the best practical choice. I don't feel like looking up what the tutorial says about pop(0), but if it just warns against it without qualification, it should probably be updated. -- http://mail.python.org/mailman/listinfo/python-list