On Thu, 13 Jul 2017 10:10 pm, Rustom Mody wrote:

> Yeah I know append method is supposedly O(1).
> I find that surprising...
> More so when the article
> https://wiki.python.org/moin/TimeComplexity
> talks of average case Vs amortized-worst case(!) Whatever does that mean?
"Average case" refers to the average over the possible data values you are dealing with. For example, if you ask for the average cost of inserting into a list: mylist.insert(i, obj) you average over i=0, i=1, i=2, ..., i=len(mylist). "Worst case" refers to the most expensive case, which in the case of inserting into a list, will be mylist.insert(0, obj). "Amortized" refers to averaging over many repeated operations. For example, start with an empty list, and keep inserting over and over again: mylist = [] for i in range(10000000): # ideally, N -> ∞ mylist.insert(0, obj) That's equivalent to averaging over: [].insert(0, obj) [1].insert(0, obj) [1, 1].insert(0, obj) [1, 1, 1].insert(0, obj) etc. The average cost of each of those insertions is the amortized cost. Amortized costs are relevant where the cost of an operation is usually cheap, but occasionally you hit a special case which makes it expensive. In the case of list.append, lists are over-allocated: mylist = [1, 2, 3] # allocates (let's say) 50 slots, but only 3 are in use Appending to an over-allocated list takes constant time, because that's just writing to a slot in an array. So we can append to mylist 47 times before the array is full, then we get an expensive resize: mylist 50 slots -> 100 slots That's very costly, proportional to the size of the list, but it doesn't happen every append. After the resize, we can do another 50 appends before needing to do it again. CPython's allocation strategy for lists doubles[1] the list on each resize, so the worst case is that your list is 50% (plus one item) full, and you're using twice as much memory as needed. But the advantage is that resizes happen less and less frequently: after each resize, it takes twice as many appends before you need to do the next. If you do the calculations, each resize is twice as expensive as the last, but happens half as frequently, so the cost of those resizes are amortized out over all the appends to be a constant (and relatively small) cost. [1] Actually, CPython's lists initially quadruple the size of the array, up to a certain point, and then switch to doubling. This ensures that small lists have even fewer expensive resizes, at the cost of wasting a bit more memory, but its only a small array so who cares? -- Steve “Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse. -- https://mail.python.org/mailman/listinfo/python-list