[Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
I am interested in creating a patch to make deleting elements from the front of 
Python list work in O(1) time by advancing the ob_item pointer.

The patch will probably be rejected, but I would like to try it anyway as an 
exercise in digging into the CPython source, and working through the process.  
My goal is to accompany the proposed patch with a PEP (which I also expect to 
be initially rejected, but which will hopefully be a useful contribution in 
terms of documenting the decision.)

The reason I am posting here is that there appears to be some history behind 
similar patches that I am not aware of, so if anybody can refer me to earlier 
patches, PEPS, discussion threads, etc., I would be grateful.  

I am aware of PEP 3128, which has similar goals to what I'm trying to achieve, 
but there are also some differences.

The blist implementation described in PEP 3128 achieves the goal of reducing 
time complexity for some operations, but necessarily at the expense of other 
operations, notably list access.

The patch that I am considering would not affect time complexity for any other 
operations, nor memory complexity, but it would, of course, have marginal costs 
in certain operations, notably any operation that eventually calls 
list_resize().  I am confident that the patch would not impact performance of 
list accesses at all.  The memory cost for the list itself would be an 
additional pointer or integer, which I think should be considered negligible 
compared to the cost of the list itself [O(N)] and the elements in the list 
[O(N)].  I haven't completely worked out the best strategy to eventually 
release the memory taken up by the pointers of the unreleased elements, but the 
worst case scenario is that the unused memory only gets wasted until the time 
that the list itself gets garbage collected.  I think I can do better than 
that, at some cost of complicating list_resize.  From a memory standpoint, the 
benefits of encouraging deleting items from the front
 of the list should outweigh any disadvantages with respect to lazily releasing 
memory from the pointers at the front of the list, since in deleting elements, 
you allow the elements themselves to be garbage collected earlier, as well as 
objects that might be referenced by those elements.

My goal would be to target the patch at 3.x, and if I was lucky enough to get 
it accepted, I think it could eventually be backported to 2.x.  The proposal 
does not affect the definition of the language itself, of course; it merely 
attempts to improve the performance of the CPython implementation.

The instructions that I have found for setting up your development environment 
seemed to be targeted at 2.x trunk, which is fine with me.  I will attempt the 
patch off  the 2.x trunk to get an initial feel for the issues involved, unless 
somebody counsels me to work off 3.x right from the start.

http://www.python.org/dev/setup/

I have not been able to locate the source code for 3.x.  Is the implementation 
of list more or less the same there?

There is a long thread entitled "list.pop(0) vs. collections.dequeue" on 
comp.lang.python that addresses alternatives to list.pop(0), but none of them 
really fit my use case.

Here is a sketch of the PEP that I would propose:


Proposal: 


Improve list's implementation so that deleting elements from 
the front of the list does not require an O(N) memmove operation. 


Rationale: 


Some Python programs that process lists have multiple 
methods that consume the first element of the list and pop it off. 
The pattern comes up with parsers in particular, but there are other 
examples.  It is possible now, of course, to use a data structure in 
Python that has O(1) for deleting off the top of the list, but none of 
the alternatives fully replicate the benefits of list itself. 


Specification: 


Improving CPython's performance does not affect the 
language itself, so there are no bikeshed arguments to be had with 
respect to syntax, etc.  Any patch would, of course, affect the 
performance of nearly every Python program in existence, so any patch 
would have to, at a bare minimum: 


  1) Not increase the time or memory complexity of any other list 
operation. 
  2) Not affect list access at all. 
  3) Minimally affect list operations that mutate the list. 
  4) Be reasonably simple within CPython itself. 
  5) Not be grossly wasteful of memory. 


Backwards Compatibility: 


See above.  An implementation of this PEP would not change the 
definition of the language in any way, but it would have to minimally 
impact the performance of lists for the normal use cases. 


Implementation: 


There are two ways to make deleting the first item of the list run 
more efficiently. 


The most ambitious proposal is to fix the memory manager itself to 
allow the release of memory from the start of the chunk.  The 
advantage of this proposal is that it would simplify the changes to 
list itself, and possibly have collateral b

Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Daniel Stutzbach  wrote:

> FWIW, for a long-running FIFO queue, it's critical to
> release some of the memory along the way, otherwise the
> amount of wasted memory is unbounded.
> 

Somebody implementing a long-running FIFO queue should actually be using deque 
instead of list, but I agree with your greater point that waiting until the 
list gets garbage-collected to release memory would probably be unacceptable, 
so I'll see what I can do.

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
From: Raymond Hettinger 
> On Jan 25, 2010, at 11:22 AM, Steve Howell
> wrote:
> I
> am interested in creating a patch to make deleting elements
> from the front of Python list work in O(1) time by advancing
> the ob_item pointer.
>
> +1 on doing whatever experiments you feel like
> doing-1 on putting something like this in the
> core
> 1) To many things in the Python world rely on
> the current implementation of lists.  It's not
> worth breaking third-party extensions, tools like psyco,
> work on unladen swallow, and other implementations of Python
> such as PyPy and Jython.

I don't understand how changing the implementation of CPython would impact PyPy 
and Jython, unless you are just referring to the fact that CPython is treated 
as a reference implementation, so its simplicity is a virtue for other ports.  
Am I missing something else?


> 2). The use of lists pervades the language and
> it doesn't make sense to have the language as a whole
> pay a price (in terms of speed and space) for every list
> that gets created.  The corner case isn't worth
> it.

I understand the tradeoff.

> 3).  We already got one.  The
> collections.deque() class was introduced specifically to
> handle inserting and popping from the front of a list
> efficiently.

I understand that deque solves some problems that list does not.  Obviously, it 
allows you to delete elements off the front in O(1) time, but it also has other 
advantages, such as allowing you to rotate elements efficiently.  It's 
designed, I am guessing, for FIFO queues, and it's a perfectly good data 
structure, just not one that is well suited for all use cases.

> 4).  In the comp.lang.python thread on this
> subject, you've gotten nearly zero support for your
> position and have managed to offend many of the
> developers.

Fair enough.  I think the idea is at least worthy of a PEP that puts forward 
the strongest case possible.

Terry Jan Reedy wrote:

'''
I am not opposed to a possible change, just hasty, ill-informed
criticism. If there is not a PEP on this issue, it would be good to have
one that recorded the proposal and the pros and cons, regardless of the
outcome, so there would be something to refer people to. If that had
been already done, it would have shortened this thread considerably.
'''

Do you agree that it is at least worthwhile to write a PEP here?  It could be 
fairly short and quickly rejected, and down the road, if more people get behind 
it, it could always be strengthened and revisited.

There seems to be at least some precedence for PEPs that only pertain to 
internal implementation details, such as this one:

http://www.python.org/dev/peps/pep-0267/

(Raymond, apologies for my double reply...the first one was meant to go to the 
list, not directly to you.)


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Benjamin Peterson  wrote:

> 2010/1/25 Steve Howell :
> > I am interested in creating a patch to make deleting
> elements from the front
> > of Python list work in O(1) time by advancing the
> ob_item pointer.
>
> How about just using a deque?

Deque does not support all the operations that list does.  It is also roughly 
twice as slow for accessing elements (I've measured it).

>>> lst = ['foo', 'bar', 'baz']
>>> lst[1:]
['bar', 'baz']
>>> lst.insert(0, 'spam')
>>> lst
['spam', 'foo', 'bar', 'baz']


>>> d = deque(lst)
>>> d[2:]
Traceback (most recent call last):
  File "", line 1, in 
TypeError: sequence index must be integer, not 'slice'
>>> d.insert(0, 'eggs')
Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'collections.deque' object has no attribute 'insert'


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
> From: Raymond Hettinger 
>
> On Jan 25, 2010, at 12:36 PM, Steve Howell wrote:
>
> >
> > Deque does not support all the operations that list
> does.  It is also roughly twice as slow for accessing
> elements (I've measured it).
>
>
> ISTM that apps that want to insert or pop from the front of
> list are also apps that don't care about accessing arbitrary
> elements in the middle using the position index.  When
> lists are growing or shrinking from the front, the meaning
> of the i-th element changes.   So, it doesn't
> make sense for an application to track indices of objects in
> such a list.
>
>i = s.find('abc')
>s.pop(0)
>print s[i]# i no longer
> points at 'abc'
>

I am not going to directly address your point, but I'd like to give a examples 
of code that uses pop(0) from the standard library.

If you look at the code for multiprocessing/connection.py, you will see that 
PipeListener creates _handle_queue as an ordinary Python list, and in line 317 
it uses pop(0) to pop the first handle off the top of the queue.

Why does that code not use a deque?  In hindsight, it probably should.  But to 
make the change now, it's not a simple matter of fixing just PipeListener, 
because PipeListener passes off _handle_queue to Finalize, which also expects a 
list.

In order to understand why Finalize expects a list, you need to look at how it 
uses args, and here is one example usage:

res = self._callback(*self._args, **self._kwargs)

Ok, so now you need to know what self._callback is doing, so now you have to 
trace through all callers of Finalize are passing in for their args.

So what seems like a trivial matter--switching over a list to a deque--actually 
requires a lot of thinking.

It turns out that the callback for PipeListener just iterates through the 
remaining handles and closes them.  So a deque would not break that code. 

If you look at difflib.py, it also does pop(0) in a loop.  Why doesn't it use a 
deque?  Simplicity, maybe?

codecs.py also deletes from the top of the list:

del self.linebuffer[0]
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Mike Klaas  wrote:

> From: Mike Klaas 
> > On Mon, Jan 25, 2010 at 1:22 PM, Steve Howell 
> wrote:
> >>
> >> I haven't completely worked out the best strategy
> to eventually release
> >> the memory taken up by the pointers of the
> unreleased elements, but the
> >> worst case scenario is that the unused memory only
> gets wasted until the
> >> time that the list itself gets garbage collected.
> >
> > FWIW, for a long-running FIFO queue, it's critical to
> release some of the
> > memory along the way, otherwise the amount of wasted
> memory is unbounded.
> >
> > Good luck :)
> 
> It seems to me that the best way to do this is invert
> .append() logic:
> leave at most X amount of wasted space at the beginning of
> the list,
> where X is a constant fraction of the list size.

That's roughly what I intend to do.  The problems are not entirely symmetric.  
For appends, the wasted space exists in the sense that CPython optimistically 
get extra chunks of memory for future appends, to save CPU at the possible cost 
of needlessly allocating memory.

For pops, the bargain would be that you optimistically defer releasing memory 
to save CPU cycles in the hope that memory is not scarce.  Of course, if you 
have just popped an element off the list, then you have just made memory less 
scarce by virtue of removing the list elements themselves.

> Whether it is worth adding a extra pointer to the data
> stored by a
> list is another story.

That is definitely a point of contention.  It would certainly bloat a program 
that had millions of empty lists.  I think in most real-world programs, though, 
the amount of memory taken by PyListObjects will always be greatly exceeded by 
the amount of memory used by the list elements, or even just the pointers to 
the list elements.  It's the difference between the number of elements in a 
list, O(N), and the number of structures that define a list's state, O(1).

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Benjamin Peterson  wrote:

> From: Benjamin Peterson 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Steve Howell" 
> Cc: [email protected]
> Date: Monday, January 25, 2010, 3:15 PM
> 2010/1/25 Steve Howell :
> >> From: Raymond Hettinger 
> >>
> >> On Jan 25, 2010, at 12:36 PM, Steve Howell wrote:
> >>
> >> >
> >> > Deque does not support all the operations
> that list
> >> does.  It is also roughly twice as slow for
> accessing
> >> elements (I've measured it).
> >>
> >>
> >> ISTM that apps that want to insert or pop from the
> front of
> >> list are also apps that don't care about accessing
> arbitrary
> >> elements in the middle using the position index.
>  When
> >> lists are growing or shrinking from the front, the
> meaning
> >> of the i-th element changes.   So, it doesn't
> >> make sense for an application to track indices of
> objects in
> >> such a list.
> >>
> >>    i = s.find('abc')
> >>    s.pop(0)
> >>    print s[i]    # i no longer
> >> points at 'abc'
> >>
> >
> > I am not going to directly address your point, but I'd
> like to give a examples of code that uses pop(0) from the
> standard library.
> >
> > If you look at the code for
> multiprocessing/connection.py, you will see that
> PipeListener creates _handle_queue as an ordinary Python
> list, and in line 317 it uses pop(0) to pop the first handle
> off the top of the queue.
> >
> > Why does that code not use a deque?  In hindsight, it
> probably should.  But to make the change now, it's not a
> simple matter of fixing just PipeListener, because
> PipeListener passes off _handle_queue to Finalize, which
> also expects a list.
> >
> > In order to understand why Finalize expects a list,
> you need to look at how it uses args, and here is one
> example usage:
> >
> > res = self._callback(*self._args, **self._kwargs)
> >
> > Ok, so now you need to know what self._callback is
> doing, so now you have to trace through all callers of
> Finalize are passing in for their args.
> >
> > So what seems like a trivial matter--switching over a
> list to a deque--actually requires a lot of thinking.
> >
> > It turns out that the callback for PipeListener just
> iterates through the remaining handles and closes them.  So
> a deque would not break that code.
> >
> > If you look at difflib.py, it also does pop(0) in a
> loop.  Why doesn't it use a deque?  Simplicity, maybe?
> >
> > codecs.py also deletes from the top of the list:
> >
> > del self.linebuffer[0]
> 
> Yes, but in either of these cases is there an excellent
> performance
> improvement to be gained and is it worth the complexity of
> your
> optimization? I think not.
> 

I am starting to think that the optimization would be drowned out by the cost 
of processing each line, unless you had some combination of the following:
 
 1) a pretty large list (plausible)
 2) a very inexpensive operation that you were applying to each line (rare)
 3) a really slow memmove implementation (extremely doubtful)

The complexity of the optimization does not phase me for some reason.  If 
ruthless simplicity were the only goal, then I'd also simplify/remove some of 
this code:

/* Bypass realloc() when a previous overallocation is large enough
   to accommodate the newsize.  If the newsize falls lower than half
   the allocated size, then proceed with the realloc() to shrink the list.
*/
if (allocated >= newsize && newsize >= (allocated >> 1)) {
assert(self->ob_item != NULL || newsize == 0);
Py_SIZE(self) = newsize;
return 0;
}

/* This over-allocates proportional to the list size, making room
 * for additional growth.  The over-allocation is mild, but is
 * enough to give linear-time amortized behavior over a long
 * sequence of appends() in the presence of a poorly-performing
 * system realloc().
 * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
 */
new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

In my mind, though, the complexity within CPython does not have to leak up to 
the Python level.



___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Christian Heimes  wrote:
> From: Christian Heimes 
> Michael Foord wrote:
> > How great is the complication? Making list.pop(0)
> efficient sounds like 
> > a worthy goal, particularly given that the reason you
> don't use it is 
> > because you *know* it is inefficient (so the fact that
> you don't use it 
> > isn't evidence that it isn't wanted - merely evidence
> that you had to 
> > work around the known inefficiency).
> 
> The implementation must be changed in at least four
> places:
> 
> * The PyListObject struct gets an additional pointer that
> stores a
> reference to the head. I would keep the head (element 0) of
> the list in
> **ob_item and the reference to the malloc()ed array in a
> new pointer
> *ob_allocated.
> 
> * PyList_New() stores the pointer to the allocated memory
> in
> op->ob_allocated and sets op->ob_item =
> op->ob_allocated
> 
> * listpop() moves the op->ob_item pointer by one 
> for the special case
> of pop(0)
> 
> * list_resize() should occasionally compact the free space
> before the
> head with memcpy() if it gets too large.
> 
> listinsert() could be optimized for 0 if the list has some
> free space in
> front of the header, too.
> 
> I favor this approach over an integer offset because
> doesn't change the
> semantic of ob_item.
> 

The approach that Christian outlines is exactly what I intend to accomplish, 
even if the patch does get permanently or temporarily rejected.

I am pretty confident about what needs to be done within list_ass_slice, 
including the listinsert() optimization.  I also see where I need to add the 
new pointer (ob_allocated seems like a good name to me) within the PyListObject 
struct.

Still wrestling with the other details, though.  My C is pretty rusty, and of 
course I have the extreme versatility of Python to blame for that! :)




___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Nick Coghlan  wrote:

> From: Nick Coghlan 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Michael Foord" 
> Cc: "Christian Heimes" , [email protected]
> Date: Monday, January 25, 2010, 6:32 PM
> Michael Foord wrote:
> > On 26/01/2010 00:28, Christian Heimes wrote:
> >> I favor this approach over an integer offset
> because doesn't change the
> >> semantic of ob_item.
> >>    
> > Well, on the face of it this doesn't sound like a huge
> increase in
> > complexity. Not that I'm qualified to judge.
> 
> Christian's approach is a good one for minimising the
> semantic changes,
> and compared to an offset based approach actually has a
> decent chance of
> working without breaking too much C code (you'd have to
> change the way
> sys.getsizeof() worked for lists, but I can't think of
> anything else off
> the top of my head that would definitely break).

As I'm diving into this, it is clear that you want to preserve the semantics of 
ob_item and ob_size, as they are used in a whole bunch of places.  For now I am 
tracking a var called orphans, which subtly changes one invariant:

  0 <= ob_size + orphans <= allocated

I think Christian covered most of the places that would need to change, and 
list_dealloc would also need to change.
 
> However, as Cameron pointed out, the O() value for an
> operation is an
> important characteristic of containers, and having people
> get used to an
> O(1) list.pop(0) in CPython could create problems not only
> for other
> current Python implementations but also for future versions
> of CPython
> itself.

I hadn't thought of that.

Here are the objections that I've heard or thought of myself:

 * The simplicity of the current implementation is important beyond the normal 
benefits of simplicity, since it is also a reference implementation for other 
ports of Python.
 * People who got used to O(1) in one version of Python might have unpleasant 
surprises when they went to other versions.
 * Alternatives to list already exist, such as deque and blist 
 * An O(1) solution would increase the size of PyListObject.
 * An O(1) solution would postpone the release of the memory from the orphaned 
pointers.
 * An O(1) solution would slow down calls to list_resize, PyList_new, and 
list_dealloc.
 * For small and medium sized lists, memmove()'s penalty is usually drowned out 
by other operations on the list elements.
 * The use case of popping elements off a large list is not that common 
(although this might be somewhat driven by the documented slowness)
 * There may be third party code that relies on the current internal 
implementation

Did I leave anything out?

Here are the benefits of an O(1) implementation.

 * O(1) is faster than O(N) for some, presumably quite small, value of N.
 * Performance benefits tend to compound.  If you have P processes doing pop(0) 
in a loop on an N-element list, you are saving P * N memmoves of size kN.
 * The technique required to make O(1) work is simple at its core--advance a 
pointer forward instead of moving the memory backward.
 * Encouraging the use of pop(0) will lead to leaner Python programs that 
release memory earlier in the process.
 * While not super common, there do exist programs today that pop from the top, 
either using pop itself or del, including programs in the standard library
 * The language moratorium provides a good window for performance improvements 
in general (even if this one does not pass the litmus test for other reasons)

Did I leave anything out?


 


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
I made enough of a patch to at least get a preliminary benchmark.

The program toward the bottom of this email runs over 100 times faster with my 
patch.  The patch still has a ways to go--I use a very primitive scheme to 
reclaim orphan pointers (1000 at a time) and I am still segfaulting when 
removing the last element of the list.  But the initial results at least 
confirm that the intended benefit is achievable.

I've attached the diff, in case anyone wants to try it out or help me figure 
out what else needs to change.

The core piece of the patch is this--everything else is memory management 
related.

+if (ilow == 0) {
+a->orphans += 1;
+a->ob_item += (-1 * d);
+}
+else {
+memmove(&item[ihigh+d], &item[ihigh],
+(Py_SIZE(a) - ihigh)*sizeof(PyObject *));
+}


import time

n = 8

lst = []
for i in range(n):
lst.append(i)

t = time.time()
for i in range(n-1):
del lst[0]

print('time = ' + str(time.time() - t))
print(len(lst))
print('got here at least')


show...@showell-laptop:~/PYTHON/py3k$ cat BEFORE 
0
2.52699589729

show...@showell-laptop:~/PYTHON/py3k$ cat AFTER 
time = 0.0216660499573
1
got here at least


Python 3.2a0 (py3k:77751M, Jan 25 2010, 20:25:21) 
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 2.526996 / 0.021666
116.63417335918028
>>>

DIFF
Description: Binary data
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-25 Thread Steve Howell
--- On Mon, 1/25/10, Steve Howell  wrote:

> From: Steve Howell 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Michael Foord" , "Nick Coghlan" 
> 
> Cc: "Christian Heimes" , [email protected]
> Date: Monday, January 25, 2010, 8:33 PM
> I made enough of a patch to at least
> get a preliminary benchmark.
> 
> The program toward the bottom of this email runs over 100
> times faster with my patch.  The patch still has a ways
> to go--I use a very primitive scheme to reclaim orphan
> pointers (1000 at a time) and I am still segfaulting when
> removing the last element of the list.  But the initial
> results at least confirm that the intended benefit is
> achievable.
> 

Ok, I fixed the obvious segfaults, and I am now able to pass all the tests on 
my debug build.  A new diff is attached.

There is still at least one bug in my code in listextend, which the tests do 
not seem to expose, so I will try to at least beef up the test suite a bit.

I really like listobject.c.  Very clean code, very easy to understand.  I guess 
I shouldn't be surprised.





DIFF
Description: Binary data
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-26 Thread Steve Howell
--- On Tue, 1/26/10, Stefan Behnel  wrote:
> From: Stefan Behnel 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: [email protected]
> Date: Tuesday, January 26, 2010, 1:27 AM
> Michael Foord, 26.01.2010 01:14:
> > How great is the complication? Making list.pop(0)
> efficient sounds like
> > a worthy goal, particularly given that the reason you
> don't use it is
> > because you *know* it is inefficient
> 
> I agree. Giving a programmer the insight that lists are
> implemented as
> arrays in CPython is essentially saying "don't use
> list.pop(0), it's
> SLOW!". So they won't use it, and even avoid it for small
> lists where it
> doesn't really matter. It basically stinks.
> 
> Making list.pop(0) fast removes another edge where
> programmers must
> prematurely concentrate on implementation specifics of the
> interpreter they
> use, rather than the functionality they want to implement.
> I think that's
> an improvement to the simplicity that Python provides. It's
> basically
> saying: "care about performance when you have to, but
> otherwise believe in
> the core developers to make your code fast enough in most
> common cases".
> 

Thanks! I agree with you 100%, of course.

In fairness, I think my patch will only negligibly affect the performance of 
almost every Python program in existence.  I don't see any likelihood of 
negative effects for real world programs, but I also concede that the positive 
effects will usually be drowned out by other performance bottlenecks.

So I'd like to push for the patch on the grounds that it simply relieves Python 
programmers from worrying about a needless implementation detail.  But I am 
also prepared to have it rejected on the simple principle of keeping the 
CPython implementation less complex.  I've strived to make the changes as 
uninvasive as possible, but there is no getting around 1) increasing the size 
of PyListObject by an int, and 2) touching every major function in listobject.c 
related to memory management (w/list_resize getting the most surgery).  So 
there are certainly tradeoffs, unless I am overlooking some more clever way to 
implement this.

The core piece of the change that I made was to make deleting elements off the 
top of the list more efficient, by advancing a single pointer forward instead 
of moving an entire chunk of memory backward.  An incidental change was then to 
make inserting elements at the top of the list try to reclaim empty slots.  I 
think the former use case is valid and reasonably common (for some definition 
of "common").  The latter use case is probably a lot more rare, but I 
implemented the insert optimization to avoid the spiky double-memmove that the 
delete optimization would have otherwise introduced to prepends that succeeded 
pops.

My current strategy for giving memory back to the OS is overly simple and needs 
refinement, but it basically says that the number of orphaned pointers cannot 
exceed the number of active elements in the list.  This might sound wasteful, 
but the reality is that N list elements consume a lot more memory than N 
orphaned pointers, and the greater attractiveness of list.pop(0) would of 
course lead to earlier garbage collection of the popped elements in real world 
programs.


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-26 Thread Steve Howell
> From: Nick Coghlan 
> 
> Steve, I suggest creating a new issue at bugs.python.org to
> track your
> proposal rather than sending diffs to the list. 

Agreed.  After sending out the second, still slightly broken diff, I realized 
that I probably needed to read up on the process.  You can find the issue 
tracked here:

http://bugs.python.org/issue7784

Florent Xicluna suggested applying my patch to 2.x, instead of 3.x, and was 
kind enough to fix up my tab/spacing issues.

> Putting the
> patch code
> up on Rietveld is also a good step in helping potential
> reviewers (I
> believe there is some info on using Rietveld in the dev
> FAQ).

Ok, I will look into Rietveld tomorrow.

> The patch may still end up being rejected, but I know I at
> least am lot
> less opposed to the idea than I was when this thread
> started.
> 

Thanks for the encouragement.  Regardless of the outcome, this has been a good 
learning experience for me.  I haven't touched gdb in over a decade; it was 
actually kind of fun to debug some C code.  As I mentioned earlier, Python has 
totally spoiled me for other languages, including C!


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-26 Thread Steve Howell

--- On Tue, 1/26/10, Daniel Stutzbach  wrote:


> Just to be clear, when you say "all tests" do you
> mean "all of the list tests" or "the full
> Python test suite"?
> 

The full suite, minus some tests that skipped on my platform.  The last patch I 
posted was broken on test_sys.py, but I have since fixed that.


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-26 Thread Steve Howell

--- On Tue, 1/26/10, Antoine Pitrou  wrote:

> From: Antoine Pitrou 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: [email protected]
> Date: Tuesday, January 26, 2010, 12:50 AM
> Terry Reedy 
> udel.edu> writes:
> > 
> > The idea that CPython should not be improved because
> it would spoil 
> > programmers strikes me as a thin, even desparate
> objection. One could 
> > say that same thing about the recent optimization of
> string += string so 
> > that repeated concats are O(n) instead of O(n*n).
> 
> Note that it's not even generally true. It only works if
> your platform's
> realloc() is sufficiently smart. That's why we once have
> had some
> Windows-specific performance problems in the py3k IO
> module.
> 


One thing that has struck me in working on the O(1) patch is that almost all of 
my code would be unneeded if C's memory manager were just smart enough to allow 
you to release memory off the top of a chunk equally as efficiently as 
releasing it off the bottom of a chunk.  It's not like memory is asymmetrical; 
it's just the 1980's C interface for memory management that makes us think we 
can only extend and shrink in one direction.  The memory management paradigm 
that CPython builds on top of is at least a quarter century outdated, as far as 
I am concerned.  It might even go back to the 1970s.

It's not Python's fault that OSes or compilers have not modernized their 
approach to memory management, but I do think there will come a time, maybe 25 
years from now, when CPython says enough is enough, it's time to treat memory 
like it's extendible and shrinkable in two directions.




___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-26 Thread Steve Howell
The patch is now on Rietveld.

http://codereview.appspot.com/194083/show

I wsa getting HTTP errors for certain operations, like trying to publish 
comments, but I am able to see the patch there.

Here is the issue tracker link:

http://bugs.python.org/issue7784

Here is a rough draft of a proposal to include the patch, which includes 
discussion of possible objections:

http://wiki.python.org/moin/ProposalToSpeedUpListOperations


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-26 Thread Steve Howell
--- On Tue, 1/26/10, Guido van Rossum  wrote:

> From: Guido van Rossum 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Steve Howell" 
> Cc: "Nick Coghlan" , [email protected]
> Date: Tuesday, January 26, 2010, 11:17 AM
> On Tue, Jan 26, 2010 at 10:01 AM,
> Steve Howell 
> wrote:
> > The patch is now on Rietveld.
> >
> > http://codereview.appspot.com/194083/show
> >
> > I wsa getting HTTP errors for certain operations, like
> trying to publish comments, but I am able to see the patch
> there.
> 
> Hey Steve,
> 
> You seem to be using Rietveld in a slightly odd fashion
> which prevents
> the side-by-side diff and commenting feature fro working.
> 
> Try downloading Rietveld's "upload.py" script
> (http://codereview.appspot.com/static/upload.py) and
> uploading your
> patch again from your SVN client, using
> 
>   upload.py -i 194083
> 
> (This will require the email and password you used to
> upload the issue
> in the first place.)
>

Ok, I just ran upload.py with my patch.  It shows up on Rietveld, but I am 
still getting 500 errors when I attempt to publish my own comments, so I am not 
sure what I'm doing wrong.

The version that I just uploaded was my working copy, which was off of the 3.x 
trunk.

The version that I attempted to upload earlier, via the URL feature, was 
slightly different--Florent had taken my changes, applied them to 2.x, cleaned 
up tabs/spaces, and posted his diff to the issue tracker:

http://bugs.python.org/issue7784

The content of both patches is the same otherwise.

I suspect the patch has lots of minor cleanup that needs to be done, but I 
should point out one major design decision that probably needs to be addressed 
first.  I chose to store the number of orphans vs. storing the pointer to the 
originally allocated memory.  There was no rhyme or reason for my decision, 
other than it was just how I initially got my head around the problem.  The 
tradeoffs should be pretty obvious--there are places where it's convenient to 
work with the count, but it comes at the cost of pointer arithmetic in other 
places.



___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-26 Thread Steve Howell
From: Guido van Rossum 
> 
> So here's how you can fix it: go to "Edit Issue" and change
> the
> "Base:" field to the following:
> 
> http://svn.python.org/view/*checkout*/python/trunk/
>

I just deleted the issue altogether for now, since the preferred approach is to 
use a pointer, and that's gonna change the patch in lots of little places, so I 
think it would be wasteful to review it in its current form.
 
 
> I'd like you to revisit this design decision. It would make
> life
> harder if we ever switched to a conservative garbage
> collector. I feel
> much more comfortable having a hard pointer to the actual
> memory block
> in my hands rather than having to compute where that block
> starts
> using pointer arithmetic.

Ok, will do.
 
> Note that I'm not endorsing your patch -- I expect that the
> number of
> (real) use cases that aren't already served better or
> equally well by
> deques is very small, and I fear that the real cost of
> adding
> sizeof(ssize_t) bytes to every list object is real (unless
> there are 8
> bytes unused given the block size rounding happening in
> obmalloc). I
> don't believe a microbenchmark proves much of anything.
> 

Fair enough.

It seems to me that the goal of keeping lists super-compact from a memory 
standpoint is contradicted by the code in list_resize that optimistically 
preallocates extra memory on appends.  I'm not arguing against the tradeoff 
there, as I think that optimization has a way more compelling CPU gain for most 
applications than the optimizations I'm proposing.  What I'm really saying is 
that there is some precedent to waste a little memory to make CPython lists run 
faster.  My gut says that the trend of the future is more CPU scarcity than 
memory scarcity, but that's just a WAG.

Just more food for thought--doing the memory management within listobject.c is 
really a workaround to the limitations of the underlying memory manager.  It 
seems conceptually possible to me to design a memory manager that allows you to 
shrink and extend memory chunks on both sides, without any extra memory 
overhead for bookkeeping.  I'm sure the devils in the details, of course.

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-26 Thread Steve Howell
> From: Guido van Rossum 
> Steve Howell 
> wrote:
> > It seems to me that the goal of keeping lists
> > super-compact from a memory standpoint is contradicted by
> > the code in list_resize that optimistically preallocates
> > extra memory on appends.
> 
> Ah, but that applies for *large* lists. Adding 8 bytes to
> each list
> object applies to *all* lists. I betcha that many programs
> use many
> tiny lists.
>

Even most tiny-ish lists are wasting memory, though, according to this sequence:

4, 8, 16, 25, ...


 
> If I had a program that was growing and shrinking a list at
> both ends,
> I'd consider a deque. 

I'd use a deque on current Python, but I'd use list with the proposed patch. :)

> If I had a program that was growing
> and
> shrinking a list at the front, I'd consider maintaining the
> data
> backwards. Extensive use of pop(0) (and insert(0, x) I
> suppose) is
> only defensible if you also need to access the data by
> index and
> negative indexes are not an option. Think about that.
> 

Maintaining the data backwards creates complications if you are popping off 
small chunks of the list and then subsequently operating on them.  A parser 
would be a good example.  If you reverse the original list, then grab the first 
four lines, for example, and then want to concatenate the first four lines in 
the original order, you will need to reverse them again.

Having to reverse a list to change its performance just seems backward to me.  

> > [...] My
> > gut says that the trend of the future is more CPU scarcity
> > than memory scarcity, but that's just a WAG.
> 
> I'm no expert in this area, but considering the cost of
> cache misses,
> I'm not so sure about that.

Ok, good point.
 
> > Just more food for thought--doing the memory
> management within listobject.c is really a workaround to the
> limitations of the underlying memory manager.  It seems
> conceptually possible to me to design a memory manager that
> allows you to shrink and extend memory chunks on both sides,
> without any extra memory overhead for bookkeeping.  I'm
> sure the devils in the details, of course.

I asked Doug Lea about this, and he just replied to me that the devil is indeed 
in the details--it would be hard to make memory managers do what I'm proposing 
in an efficient way.


 
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell
--- On Wed, 1/27/10, Glenn Linderman  wrote:
> As a newcomer to python, I must say that I wouldn't expect
> a list to be like an array.  I'd expect it more to be
> like a list... many implementations of lists (linked lists,
> in particular) make it O(1) to add to the front or
> back.  An array can be used to represent a list, but
> there are known inefficiencies that result when doing so,
> one of which the Subject patch is working to address. 
> I guess I would have expected something called an array to
> be more like an array, rather than something called a
> list.  But one has to read the documentation to find
> out what things really mean, in a new environment.
> 

My concept of Python lists is that they should have at least the same 
performance characteristics as an ordinary to-do list that you make with 
pencil, paper, and an eraser.

When you complete the first task on your to-do list, you can just erase it; no 
need to recopy the whole list.   When you complete all the elements on the 
first page, throw away the paper.  As you find new tasks to complete, add them 
on the end.  When you fill up a page, get a new sheet of paper.

If you complete the first task, erase it, but then a new urgent task comes in, 
go ahead and write the new urgent task on the first line of the first page.

If, for some reason, you keep getting bombarded with tasks that are more urgent 
than the plan you originally set out for yourself, you need a more powerful 
tool than a simple pencil/paper tool list, and by analogy, you would need a 
more powerful tool than a Python list to do it electronically.

But if you are just working your way through a paper to-do list, then erasing 
elements from the top works in O(1) time with occasional compacting of paper 
when you finish a whole page of tasks.

In my mind Python's lists should have the same performance characteristics as 
the paper list (or better).


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell
--- On Tue, 1/26/10, Cameron Simpson  wrote:

> From: Cameron Simpson 
> | The idea that CPython should not be improved because it
> would spoil
> | programmers strikes me as a thin, even desparate
> objection.
> 
> Hey, I even used the word "thin" to describe my concern!
> 
> My point was that I look on python builtins like list and
> dict as highly
> optimised, highly efficient facilities. That means that I
> expect a "list"
> to be very very much like a linear array as one would find
> in C-like
> languages, with realloc() managment behind the scenes to
> handle the
> resizing requirements on append/extend.
> 
> It also means that the O() cost of operations should be
> pretty obvious.
> pop(0) "obviously" involves shuffling everything down one
> slot.

In my previous posting, though, I mention another plausible metaphor that 
programmers would have about lists--ordinary to-do lists where you can cross 
out or erase the first item in the list in O(1) time.
 
> 
> The proposed change to make pop(0) O(1) involves adding a
> base offset
> to the internal list implementation. Doesn't that incur a
> (small)
> overhead to _every_ list operation? Doesn't that weaken
> "list" as the
> "as efficient as possible" implementation of choice for
> "array"-like
> things?
>

The patch advances the pointer itself during prepops, so accesses continue to 
work as quickly as before.   The offset between the originally allocated 
pointer and the pointer to the new first element of the list only gets 
calculated and used during list_resize and list_dealloc.  List_resize gets 
called by any operation that changes the size of the list, including inserts, 
deletes, pops, etc.  

Because I have to increase the size of PyListObject, you could argue that I 
even affect the performance of access to the extent that the greater size of 
PyListObject increases the likelihood of cache misses.  I would consider that 
also to be a very "thin" objection, although not completely implausible. 

> 
> Yes, optimisation is nice. Higher seed is nice. But there
> are
> optimisations that simple fix inefficiencies or provide a
> special case
> for a common operation, and there are those with side
> effects elsewhere
> in the implementation (i.e. the base offset this pop(0) opt
> would incur,
> which adds a (small) cost to everything).
>

Your general point about optimizations having tradeoffs is valid, but I am not 
making the particular tradeoff that you mention.
 
> The other aspect, mentioned elsewhere in this thread, is
> that
> programmers should know the O() cost of these operations.
> 

The tutorial does mention that the current list implementation has to move all 
the elements forward when you remove off the top.

(Cameron: Quick aside, for some reason your emails keep going to my spam 
folder.  I am sure it's just my mail client being stupid, but I thought I'd let 
you know in case it's something on your side.)


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell
--- On Wed, 1/27/10, Daniel Stutzbach  wrote:

> From: Daniel Stutzbach 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Steve Howell" 
> Cc: [email protected]
> Date: Wednesday, January 27, 2010, 5:32 AM
> On Wed, Jan 27,
> 2010 at 7:13 AM, Steve Howell 
> wrote:
> 
> My concept of Python lists is that they should have at
> least the same performance characteristics as an ordinary
> to-do list that you make with pencil, paper, and an eraser.
> 
> 
> 
> When you complete the first task on your to-do list, you
> can just erase it; no need to recopy the whole list.   
> 
> I don't think your analogy works, unless you recopy
> your to-do lists whenever you complete a task in the middle
> of the list. ;-)
> 

The bunch of stickies on my desk, and scribbled notes on the back of envelopes, 
etc. does indeed suggest a jumbled data structure that I would never want to 
reproduce electronically! :)  


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell
--- On Tue, 1/26/10, Guido van Rossum  wrote:

> From: Guido van Rossum 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Steve Howell" 
> Cc: "Nick Coghlan" , [email protected]
> Date: Tuesday, January 26, 2010, 12:57 PM
> On Tue, Jan 26, 2010 at 12:46 PM,
> Steve Howell 
> wrote:
> > It seems to me that the goal of keeping lists
> super-compact from a memory standpoint is contradicted by
> the code in list_resize that optimistically preallocates
> extra memory on appends.
> 
> Ah, but that applies for *large* lists. Adding 8 bytes to
> each list
> object applies to *all* lists. I betcha that many programs
> use many
> tiny lists.
> 

I think that some of the large programs that you mention with many tiny lists 
have some subset of lists still in memory only due to the fact that they cannot 
be garbage collected due to dangling references.

One of the ways to eliminate dangling references is to aggressively delete 
objects after you are done processing them.

Right now the Python programmer looking to aggressively delete elements from 
the top of a list has to consider the tradeoff that the operation takes O(N) 
time and would possibly churn his memory caches with the O(N) memmove 
operation.  In some cases, the Python programmer would only have himself to 
blame for not using a deque in the first place.  But maybe he's a maintenance 
programmer, so it's not his fault, and maybe the code he inherits uses lists in 
a pervasive way that makes it hard to swap in deque after the fact.  What 
advice do you give him?

Of course, this scenario is mostly speculative.  A concrete example of a large 
Python program that keeps lots of tiny lists around would probably shed more 
light on the matter.



___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell
--- On Wed, 1/27/10, John Arbash Meinel  wrote:

> From: John Arbash Meinel 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Steve Howell" 
> Cc: "Guido van Rossum" , "Nick Coghlan" 
> , [email protected]
> Date: Wednesday, January 27, 2010, 7:45 AM
> 
> > Right now the Python programmer looking to
> aggressively delete elements from the top of a list has to
> consider the tradeoff that the operation takes O(N) time and
> would possibly churn his memory caches with the O(N) memmove
> operation.  In some cases, the Python programmer would
> only have himself to blame for not using a deque in the
> first place.  But maybe he's a maintenance programmer,
> so it's not his fault, and maybe the code he inherits uses
> lists in a pervasive way that makes it hard to swap in deque
> after the fact.  What advice do you give him?
> > 
> 
> Or he could just set them to None.

Fair enough, but that's still wasteful of memory, keeping around a bunch of 
None elements because you can't inexpensively delete them.  I concede that you 
can break the dangling references, though, and that's often where large 
programs waste a lot of memory, so your point is well taken.


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell
--- On Wed, 1/27/10, Daniel Stutzbach  wrote:

> From: Daniel Stutzbach 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Steve Howell" 
> Cc: "John Arbash Meinel" , [email protected]
> Date: Wednesday, January 27, 2010, 8:20 AM
> On Wed, Jan 27,
> 2010 at 9:55 AM, Steve Howell 
> wrote:
> 
> 
> Fair enough, but that's still wasteful of memory,
> keeping around a bunch of None elements because you
> can't inexpensively delete them.  
> 
> Even if there are many references to it, there is only one
> None element.
> 

I should have been more precise and said the pointers to None, which could be 
reclaimed.  But that's a pretty minor savings--I concede on the greater point, 
you do have the alternative to break dangling references with None, so the 
expense of avoiding remove operations is only local to list itself.






___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell
--- On Wed, 1/27/10, Virgil Dupras  wrote:
> 
> Why is this thread still going? It seems to me that the
> case for this
> change is very weak. Lists, like dicts and tuples, are
> used
> *everywhere* and in *vast* quantities. Making them grow by
> 4 or 8
> bytes each for such a corner case can't be an option.
> 
> I'm sure your new list class has a lot of uses, but it
> should be an
> external class. If it stays close in behavior to the lists'
> behavior,
> then we could even add a notice in pop()'s documentation
> that
> recommends to use your new class in case they want a
> painless way to
> replace list usage (to make the life of those poor
> developers
> maintaining other people's code easier), maybe even add it
> in stdlib's
> "collections" unit.
> 


Lists are indeed used *everywhere* and *vast* quantities, which is why 
optimizations on list operations are worthwhile to pursue.

The particular optimization makes a few tradeoffs, and the consensus here 
appears to be that the ugliest tradeoff is adding the pointer to PyListObject.

There is at least some irony in opposing an optimization to a remove() 
operation on the basis of compacting memory, which isn't to say that the 
argument isn't valid.

There is also the possibility that my initial patch can be refined by somebody 
smarter than myself to eliminate the particular tradeoff.  In fact, Antoine 
Pitrou already suggested an approach, although I agree that it kind of pushes 
the boundary of sanity. :)



___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell
--- On Wed, 1/27/10, Antoine Pitrou  wrote:

> From: Antoine Pitrou 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: [email protected]
> Date: Wednesday, January 27, 2010, 6:15 AM
> Nick Coghlan 
> gmail.com> writes:
> > 
> > The big practical concern is actually the memory cost
> of adding yet
> > another pointer (potentially 8 bytes of data) to every
> list object, even
> > empty ones.
> 
> It needn't be, actually. Why?
> You only need to store this other pointer when you have an
> orphaned area in the
> first place. But then you can store the pointer *in* the
> orphaned area :-)
> 
> So let's say for example:
> - when ob_size >= 0, there are no orphans
> - when ob_size < 0, the pointer to the orphaned area is
> given by 
>   (PyObject **) self->ob_items[-1]
> 
> This makes a couple of micro-operations slightly slower
> (you need to take
> ob_size's absolute value to get the length (*)); and it
> kills code which uses
> Py_SIZE() on lists. But PyList_GET_SIZE() exists for a
> reason.
> 
> (*) or you could use ob_size's MSB, and then a simple
> binary AND gets you the
> length; or even ob_items's LSB, since the pointer will
> always be aligned on more
> than 1 byte
> 
> (and, yes, this is a bit insane)
> 

A slightly more sane alternative would be to leave ob_size and ob_item alone 
with their current semantics, and then replace allocated with self->excess, 
where

  self->excess == excess_above * 256 + excess_below.

Right now self->allocated is only used in a few places:
  
  list_resize
  listextend
  listsort
  list_init (but only in an assert)
  list_sizeof

Those places would need to compute:

  allocated = self->ob_size + self->excess / 256 + self->excess % 256

The right hand side could be done with shifting/bitmasking inside a macro.

Excess_left would obviously not be allowed to exceed 256; excess_right would 
max out at PY_SIZE_MAX / 256, instead of PY_SIZE_MAX / 8 + 6, which is probably 
the right thing anyway.

Here is the current algorithm for allocating extra bytes on the right:

new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

On a 32-bit platform, excess_right would max out at 16M instead of the current 
512M.

Any method that wanted to find the originally allocated pointer would compute

  self->ob_item - (self->excess % 256)

To the extent that most of the ugly details could be encapsulated in macros, I 
do not think this would complicate the list code itself greatly, but it would 
complicate debugging.  Guido has expressed a strong desire to keep a hard 
pointer to the originally allocated chunk, especially with regard to future 
garbage collection changes.


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell
--- On Wed, 1/27/10, Antoine Pitrou  wrote:

> From: Antoine Pitrou 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: [email protected]
> Date: Wednesday, January 27, 2010, 12:41 PM
> Le mercredi 27 janvier 2010 à 11:49
> -0800, Steve Howell a écrit :
> > A slightly more sane alternative would be to leave
> ob_size and ob_item
> > alone with their current semantics, and then replace
> allocated with
> > self->excess, where
> > 
> >   self->excess == excess_above * 256
> + excess_below.
> 
> Or we could use allocated's sign bit in order to store the
> flag (of
> whether there is an orphaned area or not) and then store
> the orphaned
> pointer as ob_items[-1]. Thus we don't have to limit the
> magnitude of
> anything. And since allocated itself isn't used in any
> really critical
> path, it doesn't slow down anything.
> 

Ok, I will try that technique and benchmark it, as it would address one strong 
objection to the proposal. 

I think I've summarized most of the other major objections here:

http://wiki.python.org/moin/ProposalToSpeedUpListOperations


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell

--- On Wed, 1/27/10, Raymond Hettinger  wrote:

> From: Raymond Hettinger 
> Subject: Re: [Python-Dev] patch to make list.pop(0) work in O(1) time
> To: "Martin v. Löwis" 
> Cc: "Steve Howell" , [email protected]
> Date: Wednesday, January 27, 2010, 1:49 PM
> 
> On Jan 27, 2010, at 11:38 AM, Martin v. Löwis wrote:
> 
> >> Lists are indeed used *everywhere* and *vast*
> quantities, which is
> >> why optimizations on list operations are
> worthwhile to pursue.
> > 
> > Unfortunately, the proposed change is a pessimisation,
> not an
> > optimisation. We probably shouldn't discuss it
> further, at least not
> > until a PEP gets written by its proponents.
> > 
> 
> I concur (on both points).
> 
> AFAICT, the performance of the proposal:
> 
> * increases space requirements by a small fixed amount
> 
> * s.append() performance slightly degraded.
> 
> * the s.insert(0, value) case isn't helped -- still O(n)
> 
> * repeated s.pop(0) have amortized O(1) but either
>    needs to waste space indefinitely (likely
> not what
>    the programmer intended by popping off
> values)
>    or needs to do occasional memmoves (which
> isn't
>    as good as a deque which never moves the
> data).
>  
> * the resize performance doesn't work well with the
>    memory allocator -- while a series of
> appends can often
>    resize in-place without a need to do an
> O(n) memmove,
>    but a series of pop(0) calls doesn't have
> a resize in-place
>    option.
> 
> What currently unknown is the effect on third-party
> extensions
> and debugging tools that access the structure directly.
> Also, am not sure if this affects psyco or the other 
> implementations such as Jython which may implement
> lists in terms of existing Java containers.
> 
> ISTM, the *only* benefit is that an occasional s.pop(0)
> will perform better than it does now but not as well
> as a deque which never has to move data).
> 
> 

I agree with most of what's said above, with a few clarifications.

The speedup is not limited to pop(0); it also considerably speeds up code like 
below:

n = 80
for i in range(n):
x = lst[:10]
del lst[:10]

Understood that it is a contrived benchmark, and real code like that could be 
replaced with a technique that Nones out the used up elements and advances a 
start pointer.  

Prepends are still O(N) for most use cases, but prepends will reclaim space at 
the front if it's there in O(1).

The biggest performance tradeoff, the extra space requirement in PyListObject, 
can be eliminated, albeit with a pretty ugly hack.

Since there is a lot of discussion about tradeoffs, I would remind folks that 
asking somebody to use a deque instead of a list also forces tradeoffs; you 
lose the comfort and familiarity of lists (not to be underestimated) as well as 
some features (you can't spell d[:10] as d[:10] for example).


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell
> From: Antoine Pitrou 
>  gmail.com> writes:
> > 
> > AFAICT, the performance of the proposal:
> > 
> > * increases space requirements by a small fixed
> amount
> 
> Well, given my proposal (assuming it turns out ok) it
> doesn't.
> 
> > * s.append() performance slightly degraded.
> 
> Why?
> 

A slight degradation is unavoidable AFAICT.  The overhead is roughly this on 
every call:


if (self->orphans >= newsize) { // SKIP ALMOST ALWAYS
needed = newsize + self->orphans;

And if it's time for a realloc, you have to pay down a little more bookkeeping:

items = self->ob_item - self->orphans;
self->ob_item = items + self->orphans;
op->orphans = 0;


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell


--- On Wed, 1/27/10, Raymond Hettinger  wrote:

> From: Raymond Hettinger 

> * the current design encourages people to use
> the right data structure for a given need.  the
> proposed change makes the trades-off murky and
> implementation dependent.  

Are you saying that the current slowness of list for prepops helps people to 
choose more appropriate data structures?  Really

> you have to know a lot more
> in order to make the right choices.  that's not
> good for usability.  we want tools that are easy to use
> correctly/well.

If you want tools that are easy to use correctly, make them bug-free and 
document their behavior.  If you want tools that are easy to use well, then 
make them perform better.  I am not sure how my patch contradicts either of 
these goals.

You keep making the argument that deque is a better alternative to list in many 
situations.  I actually agree with you.  Most programming problems are best 
modelled by a queue.  I am not sure why Python lists get all the syntax sugar 
and promotion over deque, when in reality, Python lists implement a pretty 
useless data structure.  Python lists are a glorification of a C array built on 
top of a memory-upward-biased memory allocator.  As such, they optimize list 
appends (good) but fail awfully on list prepops (bad).  They are much better as 
stacks than queues, even though queues are more useful for the most common 
programming known to man--work through a work queue and delete tasks when they 
are done.

It is not surprising that Python lists are starting to show their lack of 
versatility in 2010.  They're based on 1970's technology.  Python lists are 
really just a thin encapsulation of C arrays built on top of an asymmetrical 
memory manager.

In 2010 you could improve Python lists by releasing from the constraints of 
1970s semantics.  But I am starting to think a more modern approach would be to 
take more useful data structures like deques and give them more sugar.



___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-27 Thread Steve Howell

--- On Wed, 1/27/10, Alex Gaynor  wrote:

> 
> "Python lists implement a pretty useless data structure"
> 
> It's very difficult for ideas to gain traction when they
> contain such
> useless, and obviously wrong, rhetoric.  There's an
> enormous body of
> code out there that begs to disagree with this assertion.
> 

The statement was meant 99% tongue in cheek.  Like probably 99.99% of Python 
programmers, I use lists all the time; that's why I want them to be more 
versatile.

There's also a 1% element of truth to the statement.  To the extent that people 
are arguing for alternative data structures to list, particularly deque, I 
wonder if there actually is some merit in discouraging the use of lists in 
favor of better alternatives.





___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-28 Thread Steve Howell
--- On Thu, 1/28/10, Josiah Carlson  wrote:
> [...] in the decade+ that I've been using
> Python and
> needed an ordered sequence; lists were the right solution
> 99% of the
> time [...]

What do you think of LISP, and "car" in particular (apart from the stupidly 
cryptic name)?




___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] patch to make list.pop(0) work in O(1) time

2010-01-30 Thread Steve Howell
--- On Fri, 1/29/10, Stephen J. Turnbull  wrote:

> 
>  > Lisp lists are really stacks
> 
> No, they're really (ie, concretely) singly-linked
> lists.  
> 
> Now, stacks are an abstract data type, and singly-linked
> lists provide
> an efficient implementation of stacks.  But that's not
> what linked
> lists "really are".  For example, singly-linked lists
> are also a
> reasonable way to implement inverted trees (ie, the node
> knows its
> parent, but not its children), which is surely not a
> stack.

I like your distinction between abstract data types and concrete 
implementations.
 
>From a mutability perspective, the concrete implementation of Python lists 
>shares a performance characteristic with most concrete implementations of 
>stacks, in that inserts/pops at the top are cheap.

Unlike most stacks, Python lists do at least semantically allow queue-like 
behavior for removing elements from the bottom, but I don't think it's unfair 
of me to say that removes from the bottom are discouraged under the current 
implementation.  (I can cite the tutorial, for example).  So, to the extent 
that removes from the bottom are frowned upon, Python again has the same 
mutability characteristics as a stack.

The abstract data type "stack" does not allow for random access of elements 
AFAIK, so Python lists are definitely more than a stack, especially since 
random accesses are not only possible, they are quite efficient.

So I guess they are an array.

I don't know whether or not "arrays" are considered to be an abstract data type 
or not, but my de facto concept of an array is something that supports fast 
random access, cheap mutation at the top, and no guarantees at the bottom.  I 
am guessing that from a big-O perspective, Python lists have the exact same 
performance characteristics as the data structures that Perl, Ruby, and 
Javascript all call "array."  Also, Python lists are built on top of a C array, 
and while it would be a bit of an overstatement to say that lists are just a 
nicely sugared encapsulation of C arrays, I think it would be a fair statement 
to say that Python lists only give O(1) performance for the same operations as 
the underlying C array; all the other operations are there just for convenience 
where performance is not a driving concern.

Also, we can go back to the example of LISP, the one language that I know of 
that shares the term "list."  Whatever a "list" denotes from an abstract 
perspective, Python and LISP do not agree upon the definition.  Python lists 
are more like right-side-up stacks with fast random access, while LISP lists 
are more like an upside-down stack without iteration.

> The Python use of "list" to denote what is concretely a
> dynamically
> extensible one-dimensional array confused me a bit. 
> But what the
> heck, Guido needed a four-letter word to denote a concrete
> type used
> to implement a mutable sequence ADT, and he wasn't going to
> borrow one
> from that French guy on the ramparts, right?  No big
> deal.  Ahem...

Probably the same reason he didn't call dictionaries "hashes", right? :)


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com