Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Steven D'Aprano

Nick Coghlan wrote:

On Sun, Jul 15, 2012 at 9:18 AM, Benjamin Peterson  wrote:

Open questions
==============

There are two open questions for this PEP:

* Should ``list`` expose a kwarg in its constructor for supplying a length
  hint?
* Should a function be added either to ``builtins`` or some other module which
  calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``?

Let's try to keep this as limited as possible for a public API.


Length hints are very useful for *any* container implementation,
whether those containers are in the standard library or not. Just as
we exposed operator.index when __index__ was added, we should expose
an "operator.length_hint" function with the following semantics:

[...]

As given, length_hint gives no way of distinguishing between iterables and 
non-iterables:


py> length_hint([])
0
py> length_hint(42)
0

nor does it give iterable objects a way to indicate that either they don't 
know their length, or that they are infinite.


I suggest:

* object (and hence all other types that don't explicitly override it)
  should have a __length_hint__ that raises TypeError;

* __length_hint__ should be allowed to return None to indicate "don't know"
  or -1 to indicate "infinite".

Presumably anything that wishes to create a list or other sequence from an 
object with a hint of -1 could then raise an exception immediately.





--
Steven
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Nick Coghlan
On Sun, Jul 15, 2012 at 6:21 PM, Steven D'Aprano  wrote:
> I suggest:
>
> * object (and hence all other types that don't explicitly override it)
>   should have a __length_hint__ that raises TypeError;

We can keep it simpler than that just by changing the order of the checks.

> * __length_hint__ should be allowed to return None to indicate "don't know"
>   or -1 to indicate "infinite".
>
> Presumably anything that wishes to create a list or other sequence from an
> object with a hint of -1 could then raise an exception immediately.

I'm not seeing the value in returning None over 0 for the don't know
case - it just makes the API harder to use. Declaring negative results
as meaning "I'm infinite" sounds reasonable, though:

def length_hint(obj):
    """Return an estimate of the number of items in obj.

    This is useful for presizing containers when building from an iterable.

    If the object supports len(), the result will be exact. Otherwise,
    it may over or underestimate by an arbitrary amount.
    """
    try:
        get_hint = obj.__length_hint__
    except AttributeError:
        return len(obj)
    hint = get_hint()
    if not isinstance(hint, int):
        msg = "Length hint must be an integer, not %r"
        raise TypeError(msg % type(hint))
    if hint < 0:
        raise ValueError("%r is an infinite iterator" % (obj,))
    return hint

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Stefan Behnel
Alex Gaynor, 15.07.2012 07:20:
> there's no way for the __length_hint__ to specify that
> that particular instance can't have a length hint computed.  e.g. imagine
> some sort of lazy stream that cached itself, and only wanted to offer a
> length hint if it had already been evaluated.  Without an exception to
> raise, it has to return whatever the magic value for length_hint is (in
> your impl it appears to be 0, the current _PyObject_LengthHint method in
> CPython has a required `default` parameter).  The PEP proposes using
> TypeError for that.

Yes, that's a major issue. I've been planning to add a length hint to
Cython's generator expressions for a while, but the problem is really that
in most cases it is only known at runtime if the underlying iterable has a
length hint, so propagating it needs a way to say "sorry, I thought I might
know, but I don't". It would be even better if this way was efficient.
Since we're at a point of making this an official protocol, why not change
the current behaviour and return -1 (or even just 0) to explicitly state
that "we don't know"?
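
A minimal sketch of the propagation problem described above, not Cython's actual implementation. The wrapper class `MappedIterator` is hypothetical, and the sketch assumes today's `operator.length_hint(obj, default)`, which postdates this thread; it takes the "or even just 0" option for the "don't know" value.

```python
import operator


class MappedIterator:
    """Hypothetical lazy wrapper: applies func to each item of a source."""

    def __init__(self, func, iterable):
        self._func = func
        self._source = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return self._func(next(self._source))

    def __length_hint__(self):
        # Whether the source has a hint is only known at runtime.
        # The second argument is the value returned when it has none.
        return operator.length_hint(self._source, 0)


known = MappedIterator(str, [1, 2, 3])                 # list iterators carry a hint
unknown = MappedIterator(str, (n for n in range(3)))   # generators do not
```

Here `known` can forward a useful estimate, while `unknown` can only report the fallback value, which is exactly the "I thought I might know, but I don't" case.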

The problem with an exception here is that it might have been raised
accidentally inside of the __length_hint__() implementation that is being
asked. Swallowing it just because it happened to be a TypeError rather than
something else may end up covering bugs. We had a similar issue with
hasattr() in the past.
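
To illustrate the concern, here is a small sketch with hypothetical names (`BuggyStream`, `hint_or_default`). It assumes a caller that treats any TypeError from __length_hint__() as "cannot estimate", which is one reading of the PEP draft rather than an existing API.

```python
class BuggyStream:
    """Hypothetical iterable whose __length_hint__ contains a bug."""

    def __init__(self):
        self._cache = []

    def __iter__(self):
        return iter(self._cache)

    def __length_hint__(self):
        # Bug: int + str raises TypeError, unrelated to "no hint available".
        return len(self._cache) + "1"


def hint_or_default(obj, default=0):
    """Hypothetical caller that reads TypeError as 'cannot estimate'."""
    try:
        return obj.__length_hint__()
    except TypeError:
        return default


# The programming error is silently turned into the default value.
swallowed = hint_or_default(BuggyStream())
```

The caller cannot tell the accidental TypeError from a deliberate one, so the bug in `BuggyStream` goes unnoticed.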

Also, it would be nice if this became a type slot rather than requiring a
dict lookup and Python function call.

Stefan



Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Antoine Pitrou
On Sun, 15 Jul 2012 18:47:38 +1000
Nick Coghlan  wrote:
> 
> > * __length_hint__ should be allowed to return None to indicate "don't know"
> >   or -1 to indicate "infinite".
> >
> > Presumably anything that wishes to create a list or other sequence from an
> > object with a hint of -1 could then raise an exception immediately.
> 
> I'm not seeing the value in returning None over 0 for the don't know
> case - it just makes the API harder to use.

The point is that 0 is a legitimate value for a length hint. Simple
implementations of __length_hint__ will start returning 0 as a
legitimate value and you will wrongly interpret that as "don't know",
which kind of defeats the purpose of __length_hint__ ;)

That said, I don't think a special value for "is infinite" is useful.
Just make -1 mean "I don't know".

Regards

Antoine.




Re: [Python-Dev] cpython: Take the first step in resolving the messy pkgutil vs importlib edge cases by

2012-07-15 Thread Antoine Pitrou
On Sun, 15 Jul 2012 10:10:07 +0200 (CEST)
nick.coghlan  wrote:
>  
>  int
> +set_main_loader(PyObject *d, const char *filename, const char *loader_name)
> +{

This function should be static.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net




Re: [Python-Dev] cpython: Actually initialize __main__.__loader__ with loader instances, not the

2012-07-15 Thread Antoine Pitrou
On Sun, 15 Jul 2012 11:10:50 +0200 (CEST)
nick.coghlan  wrote:
>  tstate = PyThreadState_GET();
>  interp = tstate->interp;
> -loader = PyObject_GetAttrString(interp->importlib, loader_name);
> +loader_type = PyObject_GetAttrString(interp->importlib, loader_name);
> +if (loader_type == NULL) {
> +return -1;
> +}
> +loader = PyObject_CallFunction(loader_type, "ss", "__main__", filename);

I think you may have a refleak on loader_type here.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net




Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Steven D'Aprano

Antoine Pitrou wrote:


The point is that 0 is a legitimate value for a length hint. Simple
implementations of __length_hint__ will start returning 0 as a
legitimate value and you will wrongly interpret that as "don't know",
which kind of defeats the purpose of __length_hint__ ;)



That said, I don't think a special value for "is infinite" is useful.
Just make -1 mean "I don't know".


You've obviously never accidentally called list on an infinite iterator *wink*

It's not the (eventual) MemoryError that is the problem. On some systems, this 
can cause the PC  to become unresponsive as the OS tries to free an 
ever-increasing amount of memory. Been there, done that, on a production 
system. I had to do a hard reboot to fix it.


I think having a hint that says "there's no way this can succeed, fail 
immediately" is more useful than caring about the difference between a hint of 
0 and a hint of 1.




--
Steven


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Nick Coghlan
Right, I agree on the value in being able to return something to say "this
cannot be converted to a concrete container".

I still haven't seen a use case where the appropriate response to "I don't
know" differs from the appropriate response to a hint of zero - that is,
you don't preallocate, you just start iterating.

Cheers,
Nick.

--
Sent from my phone, thus the relative brevity :)


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Antoine Pitrou
On Mon, 16 Jul 2012 00:08:41 +1000
Nick Coghlan  wrote:
> Right, I agree on the value in being able to return something to say "this
> cannot be converted to a concrete container".

Who would be able to return that, apart from trivial cases like
itertools.cycle()?

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net




Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Christian Heimes
Am 15.07.2012 16:22, schrieb Antoine Pitrou:
> On Mon, 16 Jul 2012 00:08:41 +1000
> Nick Coghlan  wrote:
>> Right, I agree on the value in being able to return something to say "this
>> cannot be converted to a concrete container".
> 
> Who would be able to return that, apart from trivial cases like
> itertools.cycle()?

For example most numerical sequence iterators like Fibonacci generator,
prime number sequence generator and even trivial cases like even natural
number generator. IMO it's a good idea to have a notation for infinite
iterators that can't be materialized as finite containers.

+1

Christian



Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Mark Shannon

Nick Coghlan wrote:
Right, I agree on the value in being able to return something to say 
"this cannot be converted to a concrete container".


I still haven't seen a use case where the appropriate response to "I 
don't know" differs from the appropriate response to a hint of zero - 
that is, you don't preallocate, you just start iterating.




There seem to be 5 possible classes of values of __length_hint__ that an
iterator object can provide:

1. Don't implement it at all.

2. Implement __length_hint__() but don't want to return any value.
   Either raise an exception (TypeError) -- As suggested in the PEP.
   or return NotImplemented -- my preferred option.

3. Return a "don't know" value:
   Returning 0 would be fine for this, but the VM might want to respond
   differently to "don't know" and 0.
   __length_hint__() == 0          container should be minimum size.
   __length_hint__() == "unknown"  container starts at default size.

4. Infinite iterator:
   Could return float('inf'), but given this is a "hint" then
   returning sys.maxsize or sys.maxsize + 1 might be OK.
   Alternatively raise an OverflowError

5. A meaningful length. No problem :)

Also, what are the allowable return types?

1. int only
2. Any number (ie any type with a __int__() method)?
3. Or any integer-like object (ie a type with a __index__() method)?

My suggestion:

a) Don't want to return any value or "don't know": return NotImplemented
b) For infinite iterators: raise an OverflowError
c) All other cases: return an int or a type with a __index__() method.
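
The three suggestions above can be sketched as a helper function. This is an illustration of (a) to (c) only, not the PEP's text; the helper and the example classes (`Shy`, `Endless`, `Seven`, `IndexHint`) are hypothetical names.

```python
import operator


def length_hint(obj, default=0):
    """Hypothetical helper following suggestions (a) to (c) above."""
    cls = type(obj)
    if hasattr(cls, "__len__"):
        return len(obj)
    get_hint = getattr(cls, "__length_hint__", None)
    if get_hint is None:
        return default
    hint = get_hint(obj)         # (b) an infinite iterator raises OverflowError here
    if hint is NotImplemented:   # (a) no value, or "don't know"
        return default
    return operator.index(hint)  # (c) int or any type with __index__()


class Shy:
    def __length_hint__(self):
        return NotImplemented


class Endless:
    def __length_hint__(self):
        raise OverflowError("infinite iterator")


class Seven:
    def __index__(self):
        return 7


class IndexHint:
    def __length_hint__(self):
        return Seven()
```

With this shape, the caller chooses what "don't know" maps to through `default`, and only the infinite case surfaces as an exception.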

Cheers,
Mark.



Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Mark Shannon

Alex Gaynor wrote:

Hi all,

I've just submitted a PEP proposing making __length_hint__ a public API for 
users to define and other VMs to implement:


This seems back-to-front.
__length_hint__ is *used* by the VM, not provided by it.
It should be part of the object model, rather than the API.



PEP: 424
Title: A method for exposing a length hint
Version: $Revision$
Last-Modified: $Date
Author: Alex Gaynor 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 14-July-2012
Python-Version: 3.4

Abstract
========

CPython currently defines an ``__length_hint__`` method on several types, such
as various iterators. This method is then used by various other functions (such
as ``map``) to presize lists based on the estimate returned by


Don't use "map" as an example.
map returns an iterator so it doesn't need __length_hint__


``__length_hint__``. Types can then define ``__length_hint__`` which are not
sized, and thus should not define ``__len__``, but can estimate or compute a
size (such as many iterators).

Proposal
========

This PEP proposes formally documenting ``__length_hint__`` for other
interpreters and non-standard-library Python code to implement.

``__length_hint__`` must return an integer, and is not required to be accurate.
It may return a value that is either larger or smaller than the actual size of
the container. It may raise a ``TypeError`` if a specific instance cannot have
its length estimated. It may not return a negative value.


Rather than raising a TypeError, why not return NotImplemented?



Rationale
=========

Being able to pre-allocate lists based on the expected size, as estimated by
``__length_hint__``, can be a significant optimization. CPython has been
observed to run some code faster than PyPy, purely because of this optimization
being present.

Open questions
==============

There are two open questions for this PEP:

* Should ``list`` expose a kwarg in its constructor for supplying a length
  hint?
* Should a function be added either to ``builtins`` or some other module which
  calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``?

Copyright
=========

This document has been placed into the public domain.

..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8




Alex



Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Brett Cannon
On Sun, Jul 15, 2012 at 10:39 AM, Mark Shannon  wrote:

> Nick Coghlan wrote:
>
>> Right, I agree on the value in being able to return something to say
>> "this cannot be converted to a concrete container".
>>
>> I still haven't seen a use case where the appropriate response to "I
>> don't know" differs from the appropriate response to a hint of zero - that
>> is, you don't preallocate, you just start iterating.
>>
>>
> There seem to be 5 possible classes of values of __length_hint__ that an
> iterator object can provide:
>
> 1. Don't implement it at all.
>
> 2. Implement __length_hint__() but don't want to return any value.
>    Either raise an exception (TypeError) -- As suggested in the PEP.
>    or return NotImplemented -- my preferred option.
>
> 3. Return a "don't know" value:
>    Returning 0 would be fine for this, but the VM might want to respond
>    differently to "don't know" and 0.
>    __length_hint__() == 0          container should be minimum size.
>    __length_hint__() == "unknown"  container starts at default size.


> 4. Infinite iterator:
>    Could return float('inf'), but given this is a "hint" then
>    returning sys.maxsize or sys.maxsize + 1 might be OK.
>    Alternatively raise an OverflowError
>

I am really having a hard time differentiating infinity from "I don't know"
since they are both accurate from the point of view of __length_hint__ and
its typical purpose of allocation. You have no clue how many values will be
grabbed from an infinite iterator, so it's the same as just not knowing
upfront how long the iterator will be, infinite or not, and thus not worth
distinguishing.


>
> 5. A meaningful length. No problem :)
>
> Also, what are the allowable return types?
>
> 1. int only
> 2. Any number (ie any type with a __int__() method)?
> 3. Or any integer-like object (ie a type with a __index__() method)?
>
> My suggestion:
>
> a) Don't want to return any value or "don't know": return NotImplemented
> b) For infinite iterators: raise an OverflowError
> c) All other cases: return an int or a type with a __index__() method.
>

I'm fine with (a), drop (b), and for (c) use what we allow for __len__()
since, as Nick's operator.length_hint pseudo-code suggests, people will
call this as a fallback if __len__ isn't defined.

-Brett



>
> Cheers,
> Mark.
>


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Christian Heimes
Am 15.07.2012 16:39, schrieb Mark Shannon:
> 1. Don't implement it at all.
> 
> 2. Implement __length_hint__() but don't want to return any value.
>    Either raise an exception (TypeError) -- As suggested in the PEP.
>    or return NotImplemented -- my preferred option.

How is this different from "don't know"? What's the use case for knowing
that the object doesn't want to say anything or doesn't know its
possible length.

> 3. Return a "don't know" value:
>    Returning 0 would be fine for this, but the VM might want to respond
>    differently to "don't know" and 0.

How about None? It's the logical choice, simple and easy to test for in
Python and C code.

0 is a valid number for "I know that I'll return nothing".

> 4. Infinite iterator:
>    Could return float('inf'), but given this is a "hint" then
>    returning sys.maxsize or sys.maxsize + 1 might be OK.
>    Alternatively raise an OverflowError

Too complex, hard to remember and even harder to check for. Since a
length is always positive or zero, -1 is a good return value for infinite.

> a) Don't want to return any value or "don't know": return NotImplemented

+1

> b) For infinite iterators: raise an OverflowError

-1, I'm for -1. ;) I'm not a fan of using exceptions for valid and
correct return values.

> c) All other cases: return an int or a type with a __index__() method.

+1

Christian



Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Mark Shannon

Brett Cannon wrote:



On Sun, Jul 15, 2012 at 10:39 AM, Mark Shannon wrote:


Nick Coghlan wrote:

Right, I agree on the value in being able to return something to
say "this cannot be converted to a concrete container".

I still haven't seen a use case where the appropriate response
to "I don't know" differs from the appropriate response to a
hint of zero - that is, you don't preallocate, you just start
iterating.


There seem to be 5 possible classes of values of __length_hint__ that an
iterator object can provide:

1. Don't implement it at all.

2. Implement __length_hint__() but don't want to return any value.
   Either raise an exception (TypeError) -- As suggested in the PEP.
   or return NotImplemented -- my preferred option.

3. Return a "don't know" value:
   Returning 0 would be fine for this, but the VM might want to respond
   differently to "don't know" and 0.
   __length_hint__() == 0          container should be minimum size.
   __length_hint__() == "unknown"  container starts at default size.


4. Infinite iterator:
   Could return float('inf'), but given this is a "hint" then
   returning sys.maxsize or sys.maxsize + 1 might be OK.
   Alternatively raise an OverflowError


I am really having a hard time differentiating infinity from "I don't 
know" since they are both accurate from the point of view of 
__length_hint__ and its typical purpose of allocation. You have no clue 
how many values will be grabbed from an infinite iterator, so it's the 
same as just not knowing upfront how long the iterator will be, infinite 
or not, and thus not worth distinguishing.
 



5. A meaningful length. No problem :)

Also, what are the allowable return types?

1. int only
2. Any number (ie any type with a __int__() method)?
3. Or any integer-like object (ie a type with a __index__() method)?

My suggestion:

a) Don't want to return any value or "don't know": return NotImplemented
b) For infinite iterators: raise an OverflowError
c) All other cases: return an int or a type with a __index__() method.


I'm fine with (a), drop (b), and for (c) use what we allow for __len__() 
since, as Nick's operator.length_hint pseudo-code suggests, people will 
call this as a fallback if __len__ isn't defined.


So how does an iterator express infinite length?

What should happen if I am silly enough to do this:
>>> list(itertools.count())

This will fail; it should fail quickly.

Cheers,
Mark.


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Antoine Pitrou
On Sun, 15 Jul 2012 16:33:23 +0200
Christian Heimes  wrote:
> Am 15.07.2012 16:22, schrieb Antoine Pitrou:
> > On Mon, 16 Jul 2012 00:08:41 +1000
> > Nick Coghlan  wrote:
> >> Right, I agree on the value in being able to return something to say "this
> >> cannot be converted to a concrete container".
> > 
> > Who would be able to return that, apart from trivial cases like
> > itertools.cycle()?
> 
> For example most numerical sequence iterators like Fibonacci generator,
> prime number sequence generator and even trivial cases like even natural
> number generator.

First, you can't implement __length_hint__ for a generator, which is the
preferred (the most practical) way of writing iterators in pure Python.
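
A small sketch of that first point, using hypothetical names: the generator function `countdown` yields an object with no __length_hint__ attribute in CPython, while the equivalent hand-written iterator class `Countdown` is free to define one.

```python
def countdown(n):
    """Generator version: there is nowhere to attach a length hint."""
    while n > 0:
        yield n
        n -= 1


class Countdown:
    """Class-based version of the same iterator, with a hint."""

    def __init__(self, n):
        self._n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self._n <= 0:
            raise StopIteration
        value = self._n
        self._n -= 1
        return value

    def __length_hint__(self):
        # Number of items still to be produced.
        return max(self._n, 0)
```

The price of the hint is the extra boilerplate of a full iterator class.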

Second, not all iterators will implement __length_hint__ (because it's
optional and, really, of rather little use). So, as a user, you cannot
hope that `list(some_iterator)` will always raise instead of filling
your memory with an infinite stream of values: you have to be careful
anyway.

Even if __length_hint__ is implemented, its result may be wrong.
That's the whole point: it's a *hint*; an iterator might tell you it's
finite while it's infinite, or the reverse.


My conclusion is that an infinite iterator is a documentation issue.
Just tell the user that it doesn't stop, and let them shoot themselves
in the foot if they want to.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net




Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Antoine Pitrou
On Sun, 15 Jul 2012 16:08:00 +0100
Mark Shannon  wrote:
> 
> What should happen if I am silly enough to do this:
>  >>> list(itertools.count())
> 
> This will fail; it should fail quickly.

Why should it? AFAIK it's not a common complaint. You said it yourself:
it's a silly thing to do.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net




Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Alexandre Zani
On Sun, Jul 15, 2012 at 8:08 AM, Mark Shannon  wrote:
> Brett Cannon wrote:
>
>>
>>
>> On Sun, Jul 15, 2012 at 10:39 AM, Mark Shannon wrote:
>>
>> Nick Coghlan wrote:
>>
>> Right, I agree on the value in being able to return something to
>> say "this cannot be converted to a concrete container".
>>
>> I still haven't seen a use case where the appropriate response
>> to "I don't know" differs from the appropriate response to a
>> hint of zero - that is, you don't preallocate, you just start
>> iterating.
>>
>>
>> There seem to be 5 possible classes of values of __length_hint__ that an
>> iterator object can provide:
>>
>> 1. Don't implement it at all.
>>
>> 2. Implement __length_hint__() but don't want to return any value.
>>    Either raise an exception (TypeError) -- As suggested in the PEP.
>>    or return NotImplemented -- my preferred option.
>>
>> 3. Return a "don't know" value:
>>    Returning 0 would be fine for this, but the VM might want to respond
>>    differently to "don't know" and 0.
>>    __length_hint__() == 0          container should be minimum size.
>>    __length_hint__() == "unknown"  container starts at default size.
>>
>>
>> 4. Infinite iterator:
>>    Could return float('inf'), but given this is a "hint" then
>>    returning sys.maxsize or sys.maxsize + 1 might be OK.
>>    Alternatively raise an OverflowError
>>
>>
>> I am really having a hard time differentiating infinity from "I don't
>> know" since they are both accurate from the point of view of __length_hint__
>> and its typical purpose of allocation. You have no clue how many values will
>> be grabbed from an infinite iterator, so it's the same as just not knowing
>> upfront how long the iterator will be, infinite or not, and thus not worth
>> distinguishing.
>>
>>
>> 5. A meaningful length. No problem :)
>>
>> Also, what are the allowable return types?
>>
>> 1. int only
>> 2. Any number (ie any type with a __int__() method)?
>> 3. Or any integer-like object (ie a type with a __index__() method)?
>>
>> My suggestion:
>>
>> a) Don't want to return any value or "don't know": return
>> NotImplemented
>> b) For infinite iterators: raise an OverflowError
>> c) All other cases: return an int or a type with a __index__() method.
>>
>>
>> I'm fine with (a), drop (b), and for (c) use what we allow for __len__()
>> since, as Nick's operator.length_hint pseudo-code suggests, people will call
>> this as a fallback if __len__ isn't defined.
>
>
> So how does an iterator express infinite length?
>
> What should happen if I am silly enough to do this:
>  >>> list(itertools.count())
>
> This will fail; it should fail quickly.
>
>
> Cheers,
> Mark.

The PEP so far says: "It may raise a ``TypeError`` if a specific
instance cannot have its length estimated." In many ways, "I don't know"
is the same as this "specific instance cannot have its length estimated".
Why not just raise a TypeError?

Also, regarding the code Nick posted above, I'm a little concerned
about calling len as the first thing to try. That means that if I
implement both __len__ and __length_hint__ (perhaps because __len__ is
very expensive) __length_hint__ will never be used. It's relatively easy
to say:

try:
  hint = len_hint(l)
except TypeError:
  hint = len(l)
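
A runnable sketch of the same fallback order. Here `len_hint` is a hypothetical helper (assumed to raise TypeError when an object has no hint), and `SlowSized` is a made-up container whose __len__ stands in for an expensive count.

```python
def len_hint(obj):
    """Hypothetical helper: return obj's hint, or raise TypeError if it has none."""
    get_hint = getattr(type(obj), "__length_hint__", None)
    if get_hint is None:
        raise TypeError("%r has no length hint" % (obj,))
    return get_hint(obj)


class SlowSized:
    """Hypothetical container with an expensive __len__ and a cheap hint."""

    def __init__(self, items):
        self._items = list(items)
        self._estimate = len(self._items)
        self.len_calls = 0

    def __len__(self):
        self.len_calls += 1
        return sum(1 for _ in self._items)  # stands in for a costly count

    def __length_hint__(self):
        return self._estimate


def estimate(obj):
    # Prefer the hint; fall back to len() only when no hint exists.
    try:
        hint = len_hint(obj)
    except TypeError:
        hint = len(obj)
    return hint


s = SlowSized(range(5))
```

With this order, `estimate(s)` never touches the expensive __len__, while plain lists still get an exact answer through the fallback.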


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Steven D'Aprano

Mark Shannon wrote:


So how does an iterator express infinite length?


The suggestion was it should return -1.



What should happen if I am silly enough to do this:
 >>> list(itertools.count())

This will fail; it should fail quickly.


That depends on your OS. I've just tested it now on Linux Mint, and the Python 
process was terminated within seconds.


I've also inadvertently done it on a Fedora system, which became completely 
unresponsive to user-input (including ctrl-alt-delete) within a few minutes. I 
let it run overnight (16 hours) before literally pulling the plug.


(I expect the difference in behaviour is due to the default ulimit under 
Debian/Mint and RedHat/Fedora systems.)


Ignoring OS-specific features, the promise[1] of the language is that list 
will try to allocate enough space for every item yielded by the iterator, or 
fail with a MemoryError. No promise is made as to how long that will take: it 
could take hours, or days, depending on how badly memory allocation 
performance drops when faced with unreasonably large requests. You can't 
expect it to fail either quickly or with an exception.


With a length hint, we could strengthen that promise:

"if __length_hint__ returns a negative number, list, tuple and set will fail 
immediately with MemoryError"


which I think is a good safety feature for some things which cannot possibly 
succeed, but risk DOSing your system. Does it prevent every possible failure 
mode? No, of course not. But just because you can't prevent *every* problem 
doesn't mean you shouldn't prevent the ones which you can.
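
A minimal sketch of the strengthened promise, written as a stand-alone function rather than a change to list itself. The names `checked_list` and `Endless` are hypothetical, and the negative-hint convention is the one proposed in this thread, not existing behaviour.

```python
def checked_list(iterable):
    """Hypothetical list builder that refuses iterables hinting 'infinite'."""
    get_hint = getattr(type(iterable), "__length_hint__", None)
    if get_hint is not None:
        hint = get_hint(iterable)
        if isinstance(hint, int) and hint < 0:
            # Fail immediately instead of exhausting memory first.
            raise MemoryError("iterable reports itself as infinite")
    result = []
    for item in iterable:
        result.append(item)
    return result


class Endless:
    """Hypothetical infinite iterator using the proposed -1 convention."""

    def __iter__(self):
        return self

    def __next__(self):
        return 0

    def __length_hint__(self):
        return -1
```

Iterables without a hint, or with a non-negative one, are consumed as usual; only the self-declared infinite case is rejected up front.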



[1] I think. I'm sure I read this somewhere in the docs, but I can't find it 
now.


--
Steven


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Chris Angelico
On Mon, Jul 16, 2012 at 1:55 AM, Steven D'Aprano  wrote:
> (I expect the difference in behaviour is due to the default ulimit under
> Debian/Mint and RedHat/Fedora systems.)

Possibly also virtual memory settings. Allocating gobs of memory with
a huge page file slows everything down without raising an error.

And since it's possible to have non-infinite but ridiculous-sized
iterators, I'd not bother putting too much effort into protecting
infinite iterators - although the "huge but not infinite" case is,
admittedly, rather rarer than either "reasonable-sized" or "actually
infinite".

ChrisA


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Steven D'Aprano

Antoine Pitrou wrote:


> First, you can't implement __length_hint__ for a generator, which is the
> preferred (the most practical) way of writing iterators in pure Python.


Limitations of generators are no reason for not improving iterators which are 
not generators. __length_hint__ already exists; this proposal simply proposes 
making it documented and officially supported.


py> iter([]).__length_hint__
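
A short sketch of what the existing method reports may help here. It assumes
Python 3.4 or later, where operator.length_hint is available (it was added
after this discussion); the variable names are only for illustration.

```python
from operator import length_hint

it = iter([10, 20, 30])
before = it.__length_hint__()   # number of items the list iterator still has
next(it)                        # consume one item
after = length_hint(it)         # the hint shrinks as the iterator advances

gen = (x for x in range(3))
has_hint = hasattr(gen, "__length_hint__")   # generators provide no hint
fallback = length_hint(gen, 5)               # so the supplied default comes back

print(before, after, has_hint, fallback)
```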




> Even if __length_hint__ is implemented, its result may be wrong.
> That's the whole point: it's a *hint*; an iterator might tell you it's
> finite while it's infinite, or the reverse.


If it claims to be infinite, I see no reason to disbelieve it on the 
off-chance that it is actually both finite and small enough to fit into 
memory without crashing my system. If it claims to be finite, but is actually 
infinite, well that's not much of a hint, is it? There's an implied promise 
that the hint will be close to the real value, not infinitely distant.




> My conclusion is that an infinite iterator is a documentation issue.
> Just tell the user that it doesn't stop, and let them shoot themselves
> in the foot if they want to.


Buffer overflows are a documentation issue. Just tell the user not to 
overwrite memory they don't mean to, and let them shoot themselves in the foot 
if they want.


*wink*



--
Steven



Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Antoine Pitrou
On Mon, 16 Jul 2012 02:00:58 +1000
Chris Angelico  wrote:
> On Mon, Jul 16, 2012 at 1:55 AM, Steven D'Aprano  wrote:
> > (I expect the difference in behaviour is due to the default ulimit under
> > Debian/Mint and RedHat/Fedora systems.)
> 
> Possibly also virtual memory settings. Allocating gobs of memory with
> a huge page file slows everything down without raising an error.
> 
> And since it's possible to have non-infinite but ridiculous-sized
> iterators, I'd not bother putting too much effort into protecting
> infinite iterators - although the "huge but not infinite" case is,
> admittedly, rather rarer than either "reasonable-sized" or "actually
> infinite".

In the real world, I'm sure "huge but not infinite" is much more
frequent than "actually infinite". Trying to list() an infinite
iterator is a programming error, so it shouldn't end up in production
code. However, data that grows bigger than expected (or that gets
disposed of too late) is quite a common thing.


When hg.python.org died of OOM two weeks ago, it wasn't because of an
infinite iterator:
http://mail.python.org/pipermail/python-committers/2012-July/002084.html


Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net




Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Antoine Pitrou
On Mon, 16 Jul 2012 02:21:20 +1000
Steven D'Aprano  wrote:
> 
> > My conclusion is that an infinite iterator is a documentation issue.
> > Just tell the user that it doesn't stop, and let them shoot themselves
> > in the foot if they want to.
> 
> Buffer overflows are a documentation issue. Just tell the user not to 
> overwrite memory they don't mean to, and let them shoot themselves in the 
> foot if they want.

No, buffer overflows are bugs and they get fixed.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net




Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Steven D'Aprano

Steven D'Aprano wrote:


> With a length hint, we could strengthen that promise:
>
> "if __length_hint__ returns a negative number, list, tuple and set will 
> fail immediately with MemoryError"
>
> which I think is a good safety feature for some things which cannot 
> possibly succeed, but risk DOSing your system. Does it prevent every 
> possible failure mode? No, of course not. But just because you can't 
> prevent *every* problem doesn't mean you should prevent the ones which 
> you can.


Gah, I messed that last sentence up. It should read:

just because you can't prevent *every* problem doesn't mean you SHOULDN'T 
prevent the ones which you can.



--
Steven



Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Stephen J. Turnbull
Nick Coghlan writes:

 > Right, I agree on the value in being able to return something to say "this
 > cannot be converted to a concrete container".
 > 
 > I still haven't seen a use case where the appropriate response to "I don't
 > know" differs from the appropriate response to a hint of zero - that is,
 > you don't preallocate, you just start iterating.

Why wouldn't one just believe the hint and jump past the iteration?

What about an alternative API such as length_hint(iter, bound)
returning 'cannot say' (if no hint is available), 'small' (if the
estimated length is less than bound), and 'large' (if it's greater
than the bound or infinite)?  (Or None, True, False which would give
the boolean interpretation "do I know I'm small enough to be converted
to a concrete container?")
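
A minimal sketch of the three-valued variant, under stated assumptions: the
helper name fits_within is hypothetical, and it is built on
operator.length_hint from Python 3.4 or later, using -1 as a "no hint
available" sentinel since a real hint is never negative there.

```python
from operator import length_hint


def fits_within(iterable, bound):
    """Return None if no hint is available, True if the estimated length
    is below bound, and False otherwise."""
    hint = length_hint(iterable, -1)   # -1 comes back only when there is no hint
    if hint < 0:
        return None
    return hint < bound


small = fits_within(iter(range(10)), 100)      # hint of 10, below the bound
large = fits_within(iter(range(10)), 5)        # hint of 10, not below the bound
unknown = fits_within((x for x in range(10)), 5)   # generators give no hint
```

The caller then only has to test the result for truth to decide whether
building a concrete container is known to be safe.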

The point is that I don't really see the value in returning a precise
estimate that cannot be relied on to be accurate.  OK, Python is a
"consenting adults" language, but returning an integer here seems like
invitation to abuse.


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Nick Coghlan
On Jul 16, 2012 1:52 PM, "Stephen J. Turnbull"  wrote:
> The point is that I don't really see the value in returning a precise
> estimate that cannot be relied on to be accurate.  OK, Python is a
> "consenting adults" language, but returning an integer here seems like
> invitation to abuse.

Because preallocating memory is ridiculously faster than doing multiple
resizes. That's all this API is for: how many objects should a container
constructor preallocate space for when building from an iterable. It's an
important optimisation in CPython when using itertools, and PyPy is
planning to adopt it as well. Alex is doing the right thing in attempting
to standardise it rather than risk the two implementations using subtly
incompatible definitions.

Skipping the iteration in the zero case is a pointless micro-optimisation
that just makes the API more complex for no good reason. Allowing a
negative hint to mean "infinite", on the other hand, avoids certain
categories of errors without making the API any harder to use (since
negatives have to be rejected anyway).
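
As a rough pure-Python model of that use: the sketch below presizes a list from
the hint and then corrects for a hint that turns out too high or too low. The
function name build_list is hypothetical, operator.length_hint assumes Python
3.4 or later, and the real optimisation happens in C inside the container
constructors, not in code like this.

```python
from operator import length_hint


def build_list(iterable):
    """Fill a list from iterable, preallocating space according to the hint."""
    hint = length_hint(iterable, 0)    # 0 means "no idea": just start iterating
    result = [None] * hint             # one allocation instead of repeated resizes
    count = 0
    for item in iterable:
        if count < len(result):
            result[count] = item       # the hint covered this slot
        else:
            result.append(item)        # the hint was too low: grow as usual
        count += 1
    del result[count:]                 # the hint was too high: drop unused slots
    return result


from_hinted = build_list(iter(range(5)))           # range iterators give a hint
from_unhinted = build_list(x * 2 for x in range(3))   # generators do not
```

Because the result is always corrected afterwards, a wrong hint costs some
time or memory but never changes the contents.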

--
Sent from my phone, thus the relative brevity :)


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Steven D'Aprano

Stephen J. Turnbull wrote:


> The point is that I don't really see the value in returning a precise
> estimate that cannot be relied on to be accurate.  OK, Python is a
> "consenting adults" language, but returning an integer here seems like
> invitation to abuse.


Since __length_hint__ already exists and is already used, we should probably 
hear from somebody who knows how it is used and what problems and/or benefits 
it leads to.




--
Steven


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-15 Thread Stefan Behnel
Mark Shannon, 15.07.2012 16:14:
> Alex Gaynor wrote:
>> CPython currently defines an ``__length_hint__`` method on several types,
>> such as various iterators. This method is then used by various other
>> functions (such as ``map``) to presize lists based on the estimated returned by
> 
> Don't use "map" as an example.
> map returns an iterator so it doesn't need __length_hint__

Right. It's a good example for something else, though. As I mentioned
before, iterators should be able to propagate the length hint of an
underlying iterator, e.g. in generator expressions or map(). I consider
that an important feature that the protocol must support.
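
One way such propagation could look, as a sketch only: the class name
HintedMap is hypothetical and stands in for a map()-like iterator, and
operator.length_hint assumes Python 3.4 or later. Returning NotImplemented
when the underlying iterator has no hint is an assumption about how "unknown"
would be signalled.

```python
from operator import length_hint


class HintedMap:
    """A map()-like iterator that forwards the length hint of its source."""

    def __init__(self, func, iterable):
        self._func = func
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return self._func(next(self._it))

    def __length_hint__(self):
        # One output per input, so the source's hint is also our hint.
        hint = length_hint(self._it, -1)
        return NotImplemented if hint < 0 else hint


mapped = HintedMap(str, [1, 2, 3])
hint_before = length_hint(mapped)      # forwarded from the list iterator
first = next(mapped)
hint_after = length_hint(mapped)       # tracks the source as it is consumed

no_hint = length_hint(HintedMap(str, (x for x in range(3))), 7)
```

A consumer such as list(mapped) can then presize from the forwarded hint just
as it would for the underlying iterator.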

Stefan
