[Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Larry Hastings



Argument Clinic "converters" specify how to convert an individual 
argument to the function you're defining.  Although a converter could 
theoretically represent any sort of conversion, most of the time they 
directly represent types like "int" or "double" or "str".


Because there's such variety in argument parsing, the converters are 
customizable with parameters.  Many of these are common enough that 
Argument Clinic suggests some standard names.  Examples: "zeroes=True" 
for strings and buffers means "permit internal \0 characters", and 
"bitwise=True" for unsigned integers means "copy the bits over, even if 
there's overflow/underflow, and even if the original is negative".
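
As a rough illustration of the "bitwise=True" behaviour described above, here is a small Python sketch that mimics copying the low bits of an arbitrary Python int into a fixed-width unsigned C integer. The helper name to_unsigned_bitwise and the 32-bit default width are assumptions made for this example; neither is part of Argument Clinic.

```python
def to_unsigned_bitwise(value, bits=32):
    """Keep only the low `bits` bits of value, much as a C cast to an
    unsigned type of that width would (hypothetical helper)."""
    mask = (1 << bits) - 1
    return value & mask


print(to_unsigned_bitwise(-1))         # 4294967295
print(to_unsigned_bitwise(2**32 + 5))  # 5
```

Negative and oversized inputs are accepted silently here, which is the point of the flag: no OverflowError is raised, the bits are simply copied.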


A third example is "nullable=True", which means "also accept None for 
this parameter".  This was originally intended for use with strings 
(compare the "s" and "z" format units for PyArg_ParseTuple), however it 
looks like we'll have a use for "nullable ints" in the ongoing Argument 
Clinic conversion work.
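
To make "also accept None for this parameter" concrete, here is a minimal pure-Python sketch of the idea. The function convert_int and its keyword are hypothetical stand-ins; the real converters are generated as C code and are not shown in this message.

```python
import operator


def convert_int(arg, nullable=False):
    """Hypothetical converter: return an int, or None when None is allowed."""
    if arg is None and nullable:
        return None
    # operator.index() raises TypeError for None and other non-integers.
    return operator.index(arg)


print(convert_int(7))                    # 7
print(convert_int(None, nullable=True))  # None
```

Without nullable=True, passing None fails with a TypeError, just as it would for any other non-integer argument.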


Several people have said they found the name "nullable" surprising, 
suggesting I use another name like "allow_none" or "noneable".  I, in 
turn, find their surprise surprising; "nullable" is a term long 
associated with exactly this concept.  It's used in C# and SQL, and the 
term even has its own Wikipedia page:


   http://en.wikipedia.org/wiki/Nullable_type

Most amusingly, Vala *used* to have an annotation called "(allow-none)", 
but they've broken it out into two annotations, "(nullable)" and 
"(optional)".


   
http://blogs.gnome.org/desrt/2014/05/27/allow-none-is-dead-long-live-nullable/


Before you say "the term 'nullable' will confuse end users", let me 
remind you: this is not user-facing.  This is a parameter for an 
Argument Clinic converter, and will only ever be seen by CPython core 
developers.  A group which I hope is not so easily confused.


It's my contention that "nullable" is the correct name.  But I've been 
asked to bring up the topic for discussion, to see if a consensus forms 
around this or around some other name.


Let the bike-shedding begin,


//arry/
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Stephen Hansen
On Mon, Aug 4, 2014 at 12:12 AM, Larry Hastings  wrote:

>
> Several people have said they found the name "nullable" surprising,
> suggesting I use another name like "allow_none" or "noneable".  I, in turn,
> find their surprise surprising; "nullable" is a term long associated with
> exactly this concept.  It's used in C# and SQL, and the term even has its
> own Wikipedia page:
>

The thing is, "null" in these languages is not the same thing. If you look
at the various database wrappers, there's a lot of controversy about just
how to map the SQL NULL to Python: simply mapping it to Python's None
becomes strange because the semantics of a SQL NULL or NULL pointer and
Python None don't exactly match. Not all that long ago someone was making
an argument on this list to add a SQLNULL type object to better map SQL
NULL semantics (with regard to sorting, as I recall -- but it's been a while).

Python has None. Its definition and understanding in a Python context is
clear. Why introduce some other concept? In Python it's very common to pass
None instead of another argument.
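
One way to see the mismatch between SQL NULL and Python None is to compare how each behaves under equality. The sketch below uses the standard-library sqlite3 module purely because it ships with Python; SQLite is an assumption of this example and is not mentioned in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# In SQL, comparing NULL with NULL yields NULL rather than true.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
# The dedicated IS NULL test is what yields a true value.
null_is_null = conn.execute("SELECT NULL IS NULL").fetchone()[0]
conn.close()

print(null_equals_null)  # None
print(null_is_null)      # 1
print(None == None)      # True
```

The driver maps the NULL result to None, yet None compares equal to itself in Python while NULL does not in SQL.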


> Before you say "the term 'nullable' will confuse end users", let me remind
> you: this is not user-facing.  This is a parameter for an Argument Clinic
> converter, and will only ever be seen by CPython core developers.  A group
> which I hope is not so easily confused
>

Yet, my lurking observation of Argument Clinic is that it is all about clearly
defining the C side of how things are done in Python APIs. It may not
confuse 'end users', but it may confuse possible contributors, and simply
add a lack of clarity to the situation.

Passing None in place of another argument is a very Pythonic thing to do;
why confuse that by using other words which imply other semantics? None is
a Python thing with clear semantics in Python; allow_none quite accurately
describes the Pythonic thing described here, while 'nullable' expects
domain knowledge beyond Python and makes assumptions of semantics.

/re-lurk

--S


Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Glenn Linderman

On 8/4/2014 12:35 AM, Stephen Hansen wrote:
> On Mon, Aug 4, 2014 at 12:12 AM, Larry Hastings wrote:
>
>> Several people have said they found the name "nullable"
>> surprising, suggesting I use another name like "allow_none" or
>> "noneable".  I, in turn, find their surprise surprising;
>> "nullable" is a term long associated with exactly this concept.
>> It's used in C# and SQL, and the term even has its own Wikipedia page:
>
> The thing is, "null" in these languages are not the same thing. If you
> look to the various database wrappers there's a lot of controversy
> about just how to map the SQL NULL to Python: simply mapping it to
> Python's None becomes strange because the semantics of a SQL NULL or
> NULL pointer and Python None don't exactly match. Not all that long
> ago someone was making an argument on this list to add a SQLNULL type
> object to better map SQL NULL semantics (regards to sorting, as I
> recall -- but its been awhile)
>
> Python has None. Its definition and understanding in a Python context
> is clear. Why introduce some other concept? In Python its very common
> you pass None instead of an other argument.
>
>> Before you say "the term 'nullable' will confuse end users", let
>> me remind you: this is not user-facing.  This is a parameter for
>> an Argument Clinic converter, and will only ever be seen by
>> CPython core developers.  A group which I hope is not so easily
>> confused
>
> Yet, my lurking observation of argument clinic is it is all about
> clearly defining the C-side of how things are done in Python API's. It
> may not confuse 'end users', but it may confuse possible contributors,
> and simply add a lack of clarity to the situation.
>
> Passing None in place of another argument is a very Pythonic thing to
> do; why confuse that by using other words which imply other semantics?
> None is a Python thing with clear semantics in Python; allow_none
> quite accurately describes the Pythonic thing described here, while
> 'nullable' expects for domain knowledge beyond Python and makes
> assumptions of semantics.
>
> /re-lurk
>
> --S


Thanks, Stephen.  +1 to all you wrote.

There remains, of course, one potential justification for using
"nullable" that you didn't make 100% clear. Because Argument Clinic "is
all about clearly defining the C-side of how things are done in Python
API's", one could argue that C uses NULL (but it is only a convention,
not a language feature) for missing reference parameters on occasion.
But I think it is much more clear that if C NULL gets mapped to Python
None, and we are talking about Python parameters, then a NULLable C
parameter should map to an "allow_none" Python parameter.

The concepts of C NULL, C# NULL, SQL NULL, and Python None are all
slightly different. Even the brilliant people on python-dev could better
spend their energies on new features and bug fixes rather than being
slowed by the need to remember yet another unclear and inconsistent
terminology issue, of which there are already too many.


Glenn


Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Oleg Broytman
Hi!

On Mon, Aug 04, 2014 at 05:12:47PM +1000, Larry Hastings wrote:
> "nullable=True", which means "also accept None
> for this parameter".  This was originally intended for use with
> strings (compare the "s" and "z" format units for PyArg_ParseTuple),
> however it looks like we'll have a use for "nullable ints" in the
> ongoing Argument Clinic conversion work.
> 
> Several people have said they found the name "nullable" surprising,
> suggesting I use another name like "allow_none" or "noneable".  I,
> in turn, find their surprise surprising; "nullable" is a term long
> associated with exactly this concept.  It's used in C# and SQL, and
> the term even has its own Wikipedia page:
> 
>    http://en.wikipedia.org/wiki/Nullable_type

   In my very humble opinion, "nullable" is ok, but "allow_none" is
better.

Oleg.
-- 
 Oleg Broytman            http://phdru.name/            [email protected]
   Programmers don't die, they just GOSUB without RETURN.


Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Nick Coghlan
On 4 Aug 2014 18:16, "Oleg Broytman"  wrote:
>
> Hi!
>
> On Mon, Aug 04, 2014 at 05:12:47PM +1000, Larry Hastings <[email protected]> wrote:
> > "nullable=True", which means "also accept None
> > for this parameter".  This was originally intended for use with
> > strings (compare the "s" and "z" format units for PyArg_ParseTuple),
> > however it looks like we'll have a use for "nullable ints" in the
> > ongoing Argument Clinic conversion work.
> >
> > Several people have said they found the name "nullable" surprising,
> > suggesting I use another name like "allow_none" or "noneable".  I,
> > in turn, find their surprise surprising; "nullable" is a term long
> > associated with exactly this concept.  It's used in C# and SQL, and
> > the term even has its own Wikipedia page:
> >
> >    http://en.wikipedia.org/wiki/Nullable_type
>
>    In my very humble opinion, "nullable" is ok, but "allow_none" is
> better.

Yup, this is where I stand as well. The main concern I have with nullable
is that we *are* writing C code when dealing with Argument Clinic, and
"nullable" may make me think of a C NULL rather than Python's None.

Cheers,
Nick.

>
> Oleg.
> --
>  Oleg Broytmanhttp://phdru.name/[email protected]
>Programmers don't die, they just GOSUB without RETURN.


Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Antoine Pitrou

On 04/08/2014 03:35, Stephen Hansen wrote:

>> Before you say "the term 'nullable' will confuse end users", let me
>> remind you: this is not user-facing.  This is a parameter for an
>> Argument Clinic converter, and will only ever be seen by CPython
>> core developers.  A group which I hope is not so easily confused
>
> Yet, my lurking observation of argument clinic is it is all about
> clearly defining the C-side of how things are done in Python API's. It
> may not confuse 'end users', but it may confuse possible contributors,
> and simply add a lack of clarity to the situation.


That's a rather good point, and I agree with Stephen here. Even core 
contributors can deserve clarity and the occasional non-confusing 
notation :-)


Regards

Antoine.




Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Nathaniel Smith
I admit I spent the first half of the email scratching my head and trying
to figure out what NULL had to do with argument clinic specs. (Maybe it
would mean that if the argument is "not given" in some appropriate way then
we set the corresponding C variable to NULL?) Finding out you were talking
about None came as a surprising twist.

-n
On 4 Aug 2014 08:13, "Larry Hastings"  wrote:

>
>
> Argument Clinic "converters" specify how to convert an individual argument
> to the function you're defining.  Although a converter could theoretically
> represent any sort of conversion, most of the time they directly represent
> types like "int" or "double" or "str".
>
> Because there's such variety in argument parsing, the converters are
> customizable with parameters.  Many of these are common enough that
> Argument Clinic suggests some standard names.  Examples: "zeroes=True" for
> strings and buffers means "permit internal \0 characters", and
> "bitwise=True" for unsigned integers means "copy the bits over, even if
> there's overflow/underflow, and even if the original is negative".
>
> A third example is "nullable=True", which means "also accept None for this
> parameter".  This was originally intended for use with strings (compare the
> "s" and "z" format units for PyArg_ParseTuple), however it looks like we'll
> have a use for "nullable ints" in the ongoing Argument Clinic conversion
> work.
>
> Several people have said they found the name "nullable" surprising,
> suggesting I use another name like "allow_none" or "noneable".  I, in turn,
> find their surprise surprising; "nullable" is a term long associated with
> exactly this concept.  It's used in C# and SQL, and the term even has its
> own Wikipedia page:
>
> http://en.wikipedia.org/wiki/Nullable_type
>
> Most amusingly, Vala *used* to have an annotation called "(allow-none)",
> but they've broken it out into two annotations, "(nullable)" and
> "(optional)".
>
>
> http://blogs.gnome.org/desrt/2014/05/27/allow-none-is-dead-long-live-nullable/
>
>
> Before you say "the term 'nullable' will confuse end users", let me remind
> you: this is not user-facing.  This is a parameter for an Argument Clinic
> converter, and will only ever be seen by CPython core developers.  A group
> which I hope is not so easily confused.
>
> It's my contention that "nullable" is the correct name.  But I've been
> asked to bring up the topic for discussion, to see if a consensus forms
> around this or around some other name.
>
> Let the bike-shedding begin,
>
>
> */arry*
>


Re: [Python-Dev] sum(...) limitation

2014-08-04 Thread Chris Barker
On Sat, Aug 2, 2014 at 1:35 PM, David Wilson  wrote:

> > Repeated list and str concatenation both have quadratic O(N**2)
> > performance, but people frequently build up strings with +
>


> join() isn't preferable in cases where it damages readability while
> simultaneously providing zero or negative performance benefit, such as
> when concatenating a few short strings, e.g. while adding a prefix to a
> filename.
>

Good point -- I was trying to make the point about .join() vs + for strings
in an intro python class last year, and made the mistake of having the
students test the performance.

You need to concatenate a LOT of strings to see any difference at all -- I
know that the O() of algorithms is unavoidable, but between efficient Python
optimizations and an apparently good memory allocator, it's really a
practical non-issue.
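
A classroom-style comparison along these lines might look like the sketch below. The function names and the list size are arbitrary choices for this example, and the timings are deliberately not quoted because they depend on the interpreter and the machine.

```python
from timeit import timeit


def concat_plus(parts):
    """Build the result with repeated += concatenation."""
    result = ""
    for part in parts:
        result += part
    return result


def concat_join(parts):
    """Build the result with a single str.join call."""
    return "".join(parts)


parts = ["x"] * 10000  # arbitrary size chosen for the example
assert concat_plus(parts) == concat_join(parts)

for func in (concat_plus, concat_join):
    seconds = timeit(lambda: func(parts), number=100)
    print(func.__name__, seconds)
```

Varying the length of parts is the interesting experiment: it shows how large the input has to get before any gap between the two becomes visible.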


> Although it's true that join() is automatically the safer option, and
> especially when dealing with user supplied data, the net harm caused by
> teaching rote and ceremony seems far less desirable compared to fixing a
> trivial slowdown in a script, if that slowdown ever became apparent.
>

and it rarely would.

Blocking sum( some_strings) because it _might_ have poor performance seems
awfully pedantic.

As a long-time numpy user, I think sum(a_long_list_of_numbers) has
pathetically bad performance, but I wouldn't block it!

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[email protected]


Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Larry Hastings


On 08/04/2014 05:46 PM, Glenn Linderman wrote:
> There remains, of course, one potential justification for using
> "nullable", that you didn't make 100% clear. Because "argument clinic
> is it is all about clearly defining the C-side of how things are done
> in Python API's." and that is that C uses NULL (but it is only a
> convention, not a language feature) for missing reference parameters
> on occasion. But I think it is much more clear that if C NULL gets
> mapped to Python None, and we are talking about Python parameters,
> then a NULLable C parameter should map to an "allow_none" Python
> parameter.


Argument Clinic defines *both* sides of how things are done in builtins, 
both C and Python.  So it's a bit messier than that. Currently the 
"nullable" flag is only applicable to certain converters which output 
pointer types in C, so if it gets a None for that argument it does 
provide a NULL as the C equivalent.  But in the "nullable int" patch 
obviously I can't do that.  Instead you get a structure containing 
either an int or a flag specifying "you got a None", currently named 
"is_null".  So I don't think your proposed additional justification helps.


Of course, in my opinion I don't need this additional justification.  
Python's "None" is its null object.  And we already have the concept of 
"nullable types" in computer science, for exactly, *exactly!*, this 
concept.  As the Zen says, "special cases aren't special enough to break 
the rules".  Just because Python is silly enough to name its null object 
"None" doesn't mean we have to warp all our other names around it.



//arry/


Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Ethan Furman

On 08/04/2014 12:12 AM, Larry Hastings wrote:


> It's my contention that "nullable" is the correct name.  But I've been
> asked to bring up the topic for discussion, to see if a consensus forms
> around this or around some other name.
>
> Let the bike-shedding begin,


I think the original name is okay, but 'allow_none' is definitely clearer.

--
~Ethan~


Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Alexander Belopolsky
On Mon, Aug 4, 2014 at 12:57 PM, Ethan Furman  wrote:

> 'allow_none' is definitely clearer.


I disagree. Unlike "nullable", "allow_none" does not tell me what happens
on the C side when I pass in None.  If the receiving type is PyObject*,
either NULL or Py_None is a valid choice.


Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Antoine Pitrou

On 04/08/2014 13:36, Alexander Belopolsky wrote:

> On Mon, Aug 4, 2014 at 12:57 PM, Ethan Furman wrote:
>
>> 'allow_none' is definitely clearer.
>
> I disagree. Unlike "nullable", "allow_none" does not tell me what
> happens on the C side when I pass in None.  If the receiving type is
> PyObject*, either NULL or Py_None is a valid choice.


But here the receiving type can be an int.

Regards

Antoine.




Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Alexander Belopolsky
On Mon, Aug 4, 2014 at 1:53 PM, Antoine Pitrou  wrote:

>> I disagree. Unlike "nullable", "allow_none" does not tell me what
>> happens on the C side when I pass in None.  If the receiving type is
>> PyObject*, either NULL or Py_None is a valid choice.
>
> But here the receiving type can be an int.


We cannot "allow None" when the receiving type is C int.  In this case, we
need a way to implement "nullable int" type in C.  We can use int * or a
pair of int and _Bool or anything else.  Whatever the implementation, the
concept that is implemented is "nullable int."  The advantage of using the
term "nullable" is that it is language and implementation neutral.
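
The "pair of int and _Bool" option can be modelled from Python with ctypes, which lays out fields the way a C struct would. This is only a sketch: the field name is_null follows the flag name mentioned earlier in this thread, while the class name, the field name value and the helper function are hypothetical and do not come from the actual patch.

```python
import ctypes


class NullableInt(ctypes.Structure):
    """Hypothetical C-style layout: an int plus a 'got None' flag."""
    _fields_ = [
        ("value", ctypes.c_int),     # meaningful only when is_null is false
        ("is_null", ctypes.c_bool),  # c_bool corresponds to C's _Bool
    ]


def to_nullable_int(arg):
    """Return a NullableInt describing arg, where arg is an int or None."""
    if arg is None:
        return NullableInt(0, True)
    return NullableInt(arg, False)


got_none = to_nullable_int(None)
got_int = to_nullable_int(42)
print(got_none.is_null, got_none.value)  # True 0
print(got_int.is_null, got_int.value)    # False 42
```

Whichever representation is chosen, the caller has to check the flag before trusting the integer.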


Re: [Python-Dev] sum(...) limitation

2014-08-04 Thread Steven D'Aprano
On Mon, Aug 04, 2014 at 09:25:12AM -0700, Chris Barker wrote:

> Good point -- I was trying to make the point about .join() vs + for strings
> in an intro python class last year, and made the mistake of having the
> students test the performance.
> 
> You need to concatenate a LOT of strings to see any difference at all --  I
> know that O() of algorithms is unavoidable, but between efficient python
> optimizations and a an apparently good memory allocator, it's really a
> practical non-issue.

If only that were the case, but it isn't. Here's a cautionary tale for 
how using string concatenation can blow up in your face:

Chris Withers asks for help debugging HTTP slowness:
https://mail.python.org/pipermail/python-dev/2009-August/091125.html

and publishes some times:
https://mail.python.org/pipermail/python-dev/2009-September/091581.html

(notice that Python was SIX HUNDRED times slower than wget or IE)

and Simon Cross identified the problem:
https://mail.python.org/pipermail/python-dev/2009-September/091582.html

leading Guido to describe the offending code as an embarrassment.

It shouldn't be hard to demonstrate the difference between repeated
string concatenation and join: all you need do is defeat sum()'s
prohibition against strings. Run this bit of code, and you'll see a
significant difference in performance, even with CPython's optimized
concatenation:

# --- cut ---
class Faker:
    # Returning the right-hand operand lets sum() start from a non-string
    # object, which sidesteps its check for a str start value.
    def __add__(self, other):
        return other

x = Faker()
strings = list("Hello World!")
assert ''.join(strings) == sum(strings, x)

from timeit import Timer
setup = "from __main__ import x, strings"
t1 = Timer("''.join(strings)", setup)
t2 = Timer("sum(strings, x)", setup)

print(min(t1.repeat()))
print(min(t2.repeat()))
# --- cut ---


On my computer, using Python 2.7, I find the version using sum is nearly 
4.5 times slower, and with 3.3 about 4.2 times slower. That's with a 
mere twelve substrings, hardly "a lot". I tried running it on IronPython 
with a slightly larger list of substrings, but I got sick of waiting for 
it to finish.

If you want to argue that microbenchmarks aren't important, well, I 
might agree with you in general, but in the specific case of string 
concatenation there's that pesky factor of 600 slowdown in real world 
code to argue with.


> Blocking sum( some_strings) because it _might_ have poor performance seems
> awfully pedantic.

The rationale for explicitly prohibiting strings while merely implicitly 
discouraging other non-numeric types is that beginners, who are least 
likely to understand why their code occasionally and unpredictably 
becomes catastrophically slow, are far more likely to sum strings than 
sum tuples or lists.

(I don't entirely agree with this rationale; I'd prefer a warning rather
than an exception.)
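
For readers who have not run into the prohibition, this short sketch shows the difference in treatment: sum() raises TypeError for a str start value, while lists are accepted without complaint. The variable names are just for the example.

```python
words = ["spam", "eggs", "ham"]

try:
    sum(words, "")
    rejected = False
except TypeError as exc:
    rejected = True
    print("sum() refused:", exc)

joined = "".join(words)
flattened = sum([[1, 2], [3]], [])  # allowed, though it copies repeatedly

print(joined)     # spameggsham
print(flattened)  # [1, 2, 3]
```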



-- 
Steven


Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Larry Hastings


On 08/05/2014 03:53 AM, Antoine Pitrou wrote:

> On 04/08/2014 13:36, Alexander Belopolsky wrote:
>> If the receiving type is PyObject*, either NULL or Py_None is a valid
>> choice.
>
> But here the receiving type can be an int.


Just to be precise: in the case where the receiving type *would* have 
been an int, and "nullable=True", the receiving type is actually a 
structure containing an int and a "you got a None" flag. I can't stick a 
magic value in the int and say "that represents you getting a None" 
because any integer value may be valid.
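
The problem with a magic value can be shown in a few lines of Python. Both parsing functions and the sentinel value -1 are hypothetical; they only illustrate why an explicit flag is needed when every integer is a legitimate argument.

```python
SENTINEL = -1  # hypothetical magic value standing in for "got None"


def parse_with_sentinel(arg):
    """Encode None as SENTINEL; ambiguous if SENTINEL is also a valid input."""
    return SENTINEL if arg is None else arg


def parse_with_flag(arg):
    """Encode the result as a (value, is_null) pair; never ambiguous."""
    return (0, True) if arg is None else (arg, False)


# The caller cannot tell these two inputs apart:
print(parse_with_sentinel(None) == parse_with_sentinel(-1))  # True
# With a separate flag the two cases stay distinct:
print(parse_with_flag(None) == parse_with_flag(-1))          # False
```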


Also, I'm pretty sure there are places in builtin argument parsing that 
accept either NULL or Py_None, and I *think* maybe in one or two of them 
they actually mean different things.  What fun!



For small values of "fun",


//arry/


Re: [Python-Dev] Surely "nullable" is a reasonable name?

2014-08-04 Thread Antoine Pitrou

On 04/08/2014 14:18, Larry Hastings wrote:

> On 08/05/2014 03:53 AM, Antoine Pitrou wrote:
>> On 04/08/2014 13:36, Alexander Belopolsky wrote:
>>> If the receiving type is PyObject*, either NULL or Py_None is a valid
>>> choice.
>>
>> But here the receiving type can be an int.
>
> Just to be precise: in the case where the receiving type *would* have
> been an int, and "nullable=True", the receiving type is actually a
> structure containing an int and a "you got a None" flag. I can't stick a
> magic value in the int and say "that represents you getting a None"
> because any integer value may be valid.
>
> Also, I'm pretty sure there are places in builtin argument parsing that
> accept either NULL or Py_None, and I *think* maybe in one or two of them
> they actually mean different things.  What fun!
>
> For small values of "fun",


Is -909 too large a value to be fun?

Regards

Antoine.




Re: [Python-Dev] sum(...) limitation

2014-08-04 Thread Stefan Behnel
Steven D'Aprano wrote on 04.08.2014 at 20:10:
> On Mon, Aug 04, 2014 at 09:25:12AM -0700, Chris Barker wrote:
> 
>> Good point -- I was trying to make the point about .join() vs + for strings
>> in an intro python class last year, and made the mistake of having the
>> students test the performance.
>>
>> You need to concatenate a LOT of strings to see any difference at all --  I
>> know that O() of algorithms is unavoidable, but between efficient python
>> optimizations and a an apparently good memory allocator, it's really a
>> practical non-issue.
> 
> If only that were the case, but it isn't. Here's a cautionary tale for 
> how using string concatenation can blow up in your face:
> 
> Chris Withers asks for help debugging HTTP slowness:
> https://mail.python.org/pipermail/python-dev/2009-August/091125.html
> 
> and publishes some times:
> https://mail.python.org/pipermail/python-dev/2009-September/091581.html
> 
> (notice that Python was SIX HUNDRED times slower than wget or IE)
> 
> and Simon Cross identified the problem:
> https://mail.python.org/pipermail/python-dev/2009-September/091582.html
> 
> leading Guido to describe the offending code as an embarrassment.

Thanks for digging up that story.


>> Blocking sum( some_strings) because it _might_ have poor performance seems
>> awfully pedantic.
> 
> The rationale for explicitly prohibiting strings while merely implicitly 
> discouraging other non-numeric types is that beginners, who are least 
> likely to understand why their code occasionally and unpredictably 
> becomes catastrophically slow, are far more likely to sum strings than 
> sum tuples or lists.

Well, the obvious difference between strings and lists (not tuples) is that
strings are immutable, so it would seem more obvious at first sight to
concatenate strings than to do the same thing with lists, which can easily
be extended (they are clearly designed for that). This rationale may not
apply as much to beginners as to more experienced programmers, but it
should still explain why this is so often discussed in the context of
string concatenation and pretty much never for lists.

As for tuples, their most common use case is to represent a fixed length
sequence of semantically different values. That renders their concatenation
a sufficiently uncommon use case to make no-one ask loudly for "large
scale" sum(tuples) support.

Basically, extending lists is an obvious thing, but getting multiple
strings joined without using "+"-concatenating them isn't.
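
To illustrate why list concatenation rarely prompts the same discussion, here are three ways of flattening the nested list used elsewhere in this thread. This is a sketch with made-up variable names; only extend() and itertools.chain avoid building a new intermediate list at every step.

```python
from itertools import chain

lists = [[1, 2, 3], [4], [], [5, 6]]

# The "obvious" approach for a mutable sequence: grow one list in place.
flat_extend = []
for item in lists:
    flat_extend.extend(item)

# A lazy alternative from the standard library.
flat_chain = list(chain.from_iterable(lists))

# Works too, but creates a fresh list for every addition.
flat_sum = sum(lists, [])

print(flat_extend)  # [1, 2, 3, 4, 5, 6]
print(flat_extend == flat_chain == flat_sum)  # True
```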

Stefan




Re: [Python-Dev] sum(...) limitation

2014-08-04 Thread Jim J. Jewett



Sat Aug 2 12:11:54 CEST 2014, Julian Taylor wrote (in
https://mail.python.org/pipermail/python-dev/2014-August/135623.html ):


> Andrea Griffini  wrote:

>>However sum([[1,2,3],[4],[],[5,6]], []) concatenates the lists.

> hm could this be a pure python case that would profit from temporary
> elision [ https://mail.python.org/pipermail/python-dev/2014-June/134826.html 
> ]?

> lists could declare the tp_can_elide slot and call list.extend on the
> temporary during its tp_add slot instead of creating a new temporary.
> extend/realloc can avoid the copy if there is free memory available
> after the block.

Yes, with all the same problems.

When dealing with a complex object, how can you be sure that __add__
won't need access to the original values during the entire computation?
It works with matrix addition, but not with matrix multiplication.
Depending on the details of the implementation, it could even fail for
a sort of sliding-neighbor addition similar to the original justification.

Of course, then those tricky implementations should not define an
_eliding_add_, but maybe the builtin objects still should?  After all,
a plain old list is OK to re-use.  Unless the first evaluation to create
it ends up evaluating an item that has side effects...
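
One hazard of reusing an operand can be seen without any C-level machinery. The sketch below is not the elision proposal itself, which would only reuse true temporaries; sum_in_place is a hypothetical variant that blindly reuses its start value, shown next to the builtin sum() to make the aliasing visible.

```python
def sum_in_place(items, start):
    """Hypothetical variant that reuses `start` instead of making new lists."""
    total = start
    for item in items:
        total += item  # list += mutates the object that `start` refers to
    return total


lists = [[1, 2, 3], [4], [], [5, 6]]

start_a = []
result_a = sum(lists, start_a)           # builtin: start_a is left alone
start_b = []
result_b = sum_in_place(lists, start_b)  # variant: start_b is modified

print(result_a == result_b)  # True
print(start_a)               # []
print(start_b)               # [1, 2, 3, 4, 5, 6]
```

Both calls return the same list contents, but only the in-place variant changes an object the caller still holds, which is the kind of surprise the extra checks are meant to rule out.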

In the end, it looks like a lot of machinery (and extra checks that may
slow down the normal small-object case) for something that won't be used
all that often.

Though it is really tempting to consider a compilation mode that assumes
objects and builtins will be "normal", and lets you replace the entire
above expression with compile-time [1, 2, 3, 4, 5, 6].  Would writing
objects to that stricter standard and encouraging its use (and maybe
offering a few AST transforms to auto-generate the out-parameters?) work
as well for those who do need the speed?

-jJ

--

If there are still threading problems with my replies, please
email me with details, so that I can try to resolve them.  -jJ
