Forwarded to python list:

-------- Original Message --------
Subject:        Re: Negative array indicies and slice()
Date:   Sat, 03 Nov 2012 15:32:04 -0700
From:   Andrew Robinson
Reply-To:       andr...@r3dsolutions.com
To:     Ian Kelly <>



On 11/01/2012 05:32 PM, Ian Kelly wrote:
> On Thu, Nov 1, 2012 at 4:25 PM, Andrew Robinson
>> The bottom line is:  __getitem__ must always *PASS* len( seq ) to slice()
>> each *time* the slice() object is used.  Since this is the case, it would
>> have been better to have list, itself, have a default member which takes the
>> raw slice indices and does the conversion itself.  The size would not need
>> to be duplicated or passed -- memory savings & speed savings...
> And then tuple would need to duplicate the same code.  As would deque.
> And str.  And numpy.array, and anything else that can be sliced,
> including custom sequence classes.
I don't think that's true.  A generic function can be shared among
different objects -- without being embedded in an external index data
structure, to boot!

If *self* were passed to an index conversion function (as would
naturally happen anyway if it were a method), then the method could take
len( self ) without knowing what the object is; if the object is
sliceable, len() will definitely return the required piece of
information.
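
To make that concrete, here is a minimal sketch (the class names are mine,
purely illustrative) of one conversion method that relies on nothing but
len( self ), so any sliceable class could share it without storing or
passing its size separately:

class SliceNormalizer(object):
    """Hypothetical mixin: index conversion needing nothing but len(self)."""
    def _normalize(self, aSlice):
        # slice.indices() clips the negative / None fields against the
        # current length, which the object itself supplies.
        start, stop, step = aSlice.indices( len(self) )
        return range(start, stop, step)

class SharedList(SliceNormalizer, list):
    def __getitem__(self, key):
        if isinstance(key, slice):
            return [ list.__getitem__(self, i) for i in self._normalize(key) ]
        return list.__getitem__(self, key)

# SharedList([1,2,3,4,5])[::2]  ->  [1, 3, 5]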

> Numpy arrays are very different internally from lists.
Of course!  (Although, lists do allow nested lists.)

> I'm not understanding what this is meant to demonstrate.  Is "MyClass"
> a find-replace error of "ThirdParty"?  Why do you have __getitem__
> returning slice objects instead of items or subsequences?  What does
> this example have to do with numpy?
Here's a very cleaned up example file, cut and pastable:
#!/bin/env python
# File: sliceIt.py  --- a pre PEP357 hypothesis test skeleton

class Float16():
    """
    Numpy creates a float type, with very limited precision -- float16
    Rather than force you to install np for this test, I'm just making a
    faux object.  normally we'd just "import np"
    """

    def __init__(self,value): self.value = value
    def AltPEP357Solution(self):
        """ This is doing exactly what __index__ would be doing. """
        return None if self.value is None else int( self.value )

class ThirdParty( list ):
    """
    A simple class to implement a list wrapper, having all the
properties of
    a normal list -- but explicitly showing portions of the interface.
    """
    def __init__(self, aList): self.aList = aList

    def __getitem__(self, aSlice):
        print( "__getitems__", aSlice )
        temp=[]
        edges = aSlice.indices( len( self.aList ) ) # *unavoidable* call
        for i in range( *edges ): temp.append( self.aList[ i ] )
        return temp

def Inject_FloatSliceFilter( theClass ):
    """
    This is a courtesy function to allow injecting (duck punching)
    a float index filter into a user object.
    """
    def Filter_FloatSlice( self, aSlice ):

        # Single index retrieval filter
        try: start=aSlice.AltPEP357Solution()
        except AttributeError: pass
        else: return self.aList[ start ]

        # slice retrieval filter
        try: start=aSlice.start.AltPEP357Solution()
        except AttributeError: start=aSlice.start
        try: stop=aSlice.stop.AltPEP357Solution()
        except AttributeError: stop=aSlice.stop
        try: step=aSlice.step.AltPEP357Solution()
        except AttributeError: step=aSlice.step
        print( "Filter To",start,stop,step )
        return self.super_FloatSlice__getitem__( slice(start,stop,step) )

    theClass.super_FloatSlice__getitem__ = theClass.__getitem__
    theClass.__getitem__ = Filter_FloatSlice

# EOF: sliceIt.py

--------------------------------------------------------
Example run:

>>> from sliceIt import *
>>> test = ThirdParty( [1,2,3,4,5,6,7,8,9] )
>>> test[0:6:3]
('__getitems__', slice(0, 6, 3))
[1, 4]
>>> f16=Float16(8.3)
>>> test[0:f16:2]
('__getitems__', slice(0, <sliceIt.Float16 instance at 0xb74baaac>, 2))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sliceIt.py", line 26, in __getitem__
    edges = aSlice.indices( len( self.aList ) ) # *unavoidable* call
TypeError: object cannot be interpreted as an index
>>> Inject_FloatSliceFilter( ThirdParty )
>>> test[0:f16:2]
('Filter To', 0, 8, 2)
('__getitems__', slice(0, 8, 2))
[1, 3, 5, 7]
>>> test[f16]
9
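
For comparison, here is (as I understand it) what the route PEP 357
actually took looks like: give the faux float an __index__ method and the
built-in list machinery accepts it directly, with no filter injection at
all.  The class name here is mine, but the mechanism is the standard one:

class IndexableFloat16(object):
    def __init__(self, value): self.value = value
    def __index__(self): return int( self.value )   # what PEP 357 standardized

data = [1,2,3,4,5,6,7,8,9]
print( data[ IndexableFloat16(8.3) ] )      # 9
print( data[ 0:IndexableFloat16(8.3):2 ] )  # [1, 3, 5, 7]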

> We could also require the user to explicitly declare when they're
> performing arithmetic on variables that might not be floats. Then we
> can turn off run-time type checking unless the user explicitly
> requests it, all in the name of micro-optimization and explicitness.
:) None of those would help micro-optimization that I can see.
> Seriously, whether x is usable as a sequence index is a property of x,
> not a property of the sequence.
Yes, but the *LENGTH* of the sequence is a function of the *sequence*.
> Users shouldn't need to pick and choose *which* particular sequence
> index types their custom sequences are willing to accept. They should
> even be able to accept sequence index types that haven't been written
> yet.

I disagree, and "float" is a good example.  Besides, personally, I
don't have a problem with subclassing for a custom sequence, in spite of
what D'Aprano thinks.  It's the generic sequences that irritate me.

OK then, in your opinion, what's the unspoken reason that PEP 357
happened, when in fact people already could have just said: myList[
int(firstItem) : int(secondItem) : int(thirdItem) ]  ?

>> Most importantly, normal programs not using Numpy wouldn't have had to carry
>> around an extra API check for __index__() *every* single time the heavily
>> used [::] happened.  Memory & speed both.
> The O(1) __index__ check is probably rather inconsequential compared
> to the O(n) cost of actually performing the slicing.
I'm sure that's true; at least, I'm sure that an O(1) index check done at
the *C* level is probably inconsequential compared to slicing at the *C*
level.  When the index checking has to happen at the Python interpreter
level, I'm not so sure... I'm trying to learn how to profile that.
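
For what it's worth, one rough way to get numbers at the Python level is
timeit; this is only a sketch of how I'd measure it, not a claim about
what the result would be:

import timeit

setup = "seq = list(range(1000))"

# plain slice: whatever index handling happens, happens at the C level
print( timeit.timeit("seq[10:900:3]", setup=setup, number=100000) )

# same slice with explicit Python-level int() conversions layered on top
print( timeit.timeit("seq[int(10):int(900):int(3)]", setup=setup, number=100000) )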

> <snip>
> Such a change would only affect numpy floats, not all floats, so it
> would not be a monkey-patch.
Users of Python generally don't bother checking types.  The ability to
"type" objects this way is, I think, a rather new development.  When a
function accepts a float, it often returns a "float"; so there is no
reason one might not mix a Python float and a third-party "float" as
function parameters -- and then use a return value from that function
which could be *either* kind of float.

Since this is typical behavior, variables which have traditionally been
Python floats can become another type without explicit warning, and then
may (all of a sudden) index any list, anywhere, at any time.

Besides, PEP 357 doesn't DISTINGUISH between numpy floats and Python
floats as indices.  The writers clearly believed that NEITHER of them
was acceptable as an index.  That alone makes being able to turn *some*
floats "ON" as indices unexpected behavior.  ( I think the PEP writers
did the best they could with the limited tools available to them at
the time. )
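
A quick check shows the assumption the PEP's authors were working from: a
plain Python float is rejected as an index outright and provides no
__index__ at all, so a float-like type that suddenly grows one behaves
unlike every float a user has met before:

lst = [10, 20, 30]
try:
    lst[1.0]                           # plain float: rejected outright
except TypeError as e:
    print( "rejected:", e )

print( hasattr(1.0, "__index__") )     # False -- float deliberately opts out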

As an aside: I am treating this as a postmortem, gathering information
and looking for what was *good* as well as what was bad about an
implementation.  I have, for example, noticed that immutable objects
can't be made to participate in reference loops later; hence any object
built strictly out of immutables at every step needs no cyclic garbage
collection, and that can be used to get rid of GC overhead on any object
obeying that property.
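
A tiny illustration of why that holds (my example, not anything from the
thread): a reference loop can only be created by mutating a container
after it exists, so a structure built purely from immutables can never
point back at itself, and plain reference counting reclaims it without
the cycle collector:

import gc

cycle = []
cycle.append(cycle)        # only possible because lists are mutable
del cycle
print( gc.collect() )      # the cycle collector has to find this one

frozen = (1, (2, 3), "abc")  # built bottom-up from immutables: it cannot
                             # refer to itself, so refcounting suffices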



-- 
http://mail.python.org/mailman/listinfo/python-list
