On 07May2012 11:02, Chris Angelico <ros...@gmail.com> wrote: | On Mon, May 7, 2012 at 10:31 AM, Cameron Simpson <c...@zip.com.au> wrote: | > I didn't mean per .append() call (which I'd expect to be O(n) for large | > n), I meant overall for the completed list. | > | > Don't the realloc()s make it O(n^2) overall for large n? The list | > must get copied when the underlying space fills. I confess to being | > surprised at how quick it went for me though. I suppose reallocing in | > chunks (eg to double the available size) might conceal this for a while. | > It should still be well over O(n) overall (not amortised per .append() | > call). | | I haven't checked the CPython code, but the usual way to do | reallocations is to double the size
My example above:-) | (or add 50% or something) each | time, meaning that as n increases, the frequency of reallocations | decreases - hence the O(1) amortized time. Hmm, yes. But it is only O(1) for doubling. If one went with a smaller increment (to waste less space in the end case where one stops growing the array) then there are more copies and less .append efficiency, trading potential memory bloat for compute time in .append(). If you went all the way to adding only one item the cost goes to O(n) for .append() and O(n^2) overall, with varying costs in between. | In any case, it's the | recommended way to build large strings: | | s = "" | while condition: | s += "some_value" | | versus: | | s = [] | while condition: | s.append("some_value") | s = "".join(s) | | so I would be very much surprised if this common wisdom is merely | replacing one O(n*n) operation with another O(n*n) operation. It sort of is (except that list.append may be specially tuned to realloc in big chunks). The join-lots-of-strings standard example is based on the length of the strings in characters/bytes being far more than the number of strings. So you do a lot of tiny listappends instead, copying nothing, then a big allocate-the-correct-size-and-copy-once with s.join. Also, this: s += "some_value" is not a list extension. Because strings are immutable (and anyway in general for "+" in python) you're making a new string and copying two other string contents into it. Inherently a copy of the whole of "s" on every "+" operation. Compared with list.append() or list.extend(), which are in-place modifications to the original list. So these are really apples and oranges here. -- Cameron Simpson <c...@zip.com.au> DoD#743 http://www.cskk.ezoshosting.com/cs/ We are in great haste to construct a magnetic telegraph from Maine to Texas; but Maine and Texas, it may be, have nothing important to communicate. - H. D. Thoreau -- http://mail.python.org/mailman/listinfo/python-list