On 11/06/2012 05:55 PM, Steven D'Aprano wrote:
On Tue, 06 Nov 2012 14:41:24 -0800, Andrew Robinson wrote:
Yes. But this isn't going to cost any more time than figuring out
whether or not the list multiplication is going to cause quirks, itself.
Human psychology *tends* (it's a FAQ!) to automatically assume the
purpose of the list multiplication is to pre-allocate memory for the
equivalent (using lists) of a multi-dimensional array. Note the OP even
said "4d array".
I'm not entirely sure what your point is here. The OP screwed up -- he
didn't generate a 4-dimensional array. He generated a 2-dimensional
array. If his intuition about the number of dimensions is so poor, why
should his intuition about list multiplication be treated as sacrosanct?
Yes he did screw up.
There is a great deal of value in studying how people screw up, and
designing interfaces which tend to discourage it. "Candy machine
interfaces".
As they say, the only truly intuitive interface is the nipple.
No it's not -- that interface really sucks. :)
Have you ever seen a cat trying to suck a human nipple -- ?
Or, have you ever asked a young child who was weaned early and doesn't
remember nursing -- what a breast is for ? Once the oral stage is left,
remaining behavior must be re-learned.
There are
many places where people's intuition about programming fail. And many
places where Fred's intuition is the opposite of Barney's intuition.
OK. But that doesn't mean that *all* places have opposite intuition;
Nor does it mean that one intuition which is statistically *always*
wrong shouldn't be discouraged, or re-routed into useful behavior.
Take the candy machine, if the items being sold are listed by number --
and the prices are also numbers; it's very easy to type in the price
instead of the object number because one *forgets* that the numbers have
different meaning and the machine can't always tell from the price,
which object a person wanted (duplicate prices...); Hence a common
mistake... people get the wrong item, by typing in the price.
By merely avoiding a numeric keypad -- the user is re-routed into
choosing the correct item by not being able to make the mistake.
For this reason, Python tends to *like* things such as named parameters
and occasionally enforces their use. etc.
Even more exciting, there are places where people's intuition is
*inconsistent*, where they expect a line of code to behave differently
depending on their intention, rather than on the code. And intuition is
often sub-optimal: e.g. isn't it intuitively obvious that "42" + 1 should
give 43? (Unless it is intuitively obvious that it should give 421.)
I agree, and in places where an *exception* can be raised; it's
appropriate to do so.
Ambiguity, like the candy machine, is *bad*.
So while I prefer intuitively obvious behaviour where possible, it is not
the holy grail, and I am quite happy to give it up.
"where possible"; OK, fine -- I agree. I'm not "happy" to give it up;
but I am willing.
I don't like the man hours wasted on ambiguous behavior; and I don't
ever think that should make someone "happy".
The OP's original construction was simple, elegant, easy to read and
very commonly done by newbies learning the language because it's
*intuitive*. His second try was still intuitive, but less easy to read,
and not as elegant.
Yes. And list multiplication is one of those areas where intuition is
suboptimal -- it produces a worse outcome overall, even if one minor use-
case gets a better outcome.
I'm not disputing that [[0]*n]*m is intuitively obvious and easy. I'm
disputing that this matters. Python would be worse off if list
multiplication behaved intuitively.
How would it be worse off?
I can agree, for example, that in "C" -- realloc -- is too general.
One can't look at the line where realloc is being used, and decide if it is:
1) mallocing
2) deleting
3) resizing
Number (3) is the only non-redundant behavior the function provides.
There is, perhaps, a very clear reason that I haven't discovered why the
extra functionality in list multiplication would be bad; That reason is
*not* because list multiplication is unable to solve all the copying
problems in the word; (realloc is bad, precisely because of that); But
a function ought to do at least *one* thing well.
Draw up some use cases for the multiplication operator (I'm calling on
your experience, let's not trust mine, right?); What are all the
Typical ways people *Do* to use it now?
If those use cases do not *primarily* center around *wanting* an effect
explicitly caused by reference duplication -- then it may be better to
abolish list multiplication all together; and rather, improve the list
comprehensions to overcome the memory, clarity, and speed pitfalls in
the most common case of initializing a list.
For example, in initialization use cases; often the variable of a for
loop isn't needed and all the initializers have parameters which only
need to be evaluated *once* (no side effects).
Hence, there is an opportunity for speed and memory gains,while
maintaining clarity and *consistency*.
Some ideas of use cases:
[ (0) in xrange(10) ] # The function to create a tuple cache's the
parameter '0', makes 10 (0)'s
[ dict.__new__(dict) in xrange(10) ] # dict.__new__, The dict parameter
is cached -- makes 10 dicts.
[ lambda x:(0) in xrange(10) ] # lambda caches (0), returns a
*reference* to it multiple times.
An analogy: the intuitively obvious thing to do with a screw is to bang
it in with a hammer. It's long, thin, has a point at the end, and a flat
head that just screams "hit me". But if you do the intuitive thing, your
carpentry will be *much worse* than the alternatives.
:)
I agree. Good point and Good "thin point".
Having list multiplication copy has consequences beyond 2D arrays. Those
consequences make the intuitive behaviour you are requesting a negative
rather than a positive. If that means that newbie programmers have to
learn not to hammer screws in, so be it. It might be harder, slower, and
less elegant to drill a pilot hole and then screw the screw in, but the
overall result is better.
no, the overall result is still bad. If the answer is *don't* hammer
nails, then it's better to raise an exception when it's tried. There's
no way to do that with list multiplication.
* Consistency of semantics is better than a plethora of special
cases. Python has a very simple and useful rule: objects should not
be copied unless explicitly requested to be copied. This is much
better than having to remember whether this operation or that
operation makes a copy. The answer is consistent:
Bull. Even in the last thread I noted the range() object produces
special cases.
>>> range(0,5)[1]
1
>>> range(0,5)[1:3]
range(1, 3)
What's the special case here? What do you think is copied?
You take a slice of a range object, you get a new range object.
You were'nt paying attention, OCCASIONALLY, get an integer, or a list.
>>> range(3)[2]
2
LOOOOK! That's not a range object, that's an integer. Use Python 3.2
and try it.
I'm honestly not getting what you think is inconsistent about this.
How about now?
Two-dimensional arrays in Python using lists are quite rare. Anyone
who is doing serious numeric work where they need 2D arrays is using
numpy, not lists.
Game programmers routinely use 2D lists to represent the screen layout;
For example, they might use 'b' to represent a brick tile, and 'w' to
represent a water tile.
This is quite common in simple games; I have seen several use 2D lists
(or tuples) to do this.
Serious numeric work is not needed in most simple games; especially if
motion is not involved.
There are *many* non serious uses of matrix mathematics and 2D lists.
Numpy isn't desired even if it would work. Cost benefit analysis....
Crossword puzzles, periodic table of the elements with different sources
of weights listed under each element, etc. (That can also be done with
a dict, but I've seen an implementation do it the other way.) etc.
There are millions of people using Python, so it's hardly surprising
that once or twice a year some newbie trips over this. But it's not
something that people tend to trip over again and again and again,
like C's "assignment is an expression" misfeature.
Good point. I don't have a statistic -- except the handful of times I
searched for some other topic -- and I have seen it three times already....
I read some of the documentation on why Python 3 chose to implement it
this way.
What documentation is this?
The documentation for range() -- which I just studied read because of
another thread we both were in. You're misconstruing the subject --
which was "inconsistency" of Python is allowed; not "is list
multiplication inconsistent."
Q: What about [[]]*10?
A: No, the elements are never copied.
YES! For the obvious reason that such a construction is making mutable
lists that the user wants to populate later. If they *didn't* want to
populate them later, they ought to have used tuples -- which take less
overhead. Who even does this thing you are suggesting?!
Who knows? Who cares? Nobody does:
exactly !!!! But I do care, even though I don't do it (because it
doesn't *work*)
n -= n
instead of just n=0, but that doesn't mean that we should give it some
sort of special meaning different from n -= m. If it turns out that the
definition of list multiplication is such that NOBODY, EVER, uses [[]]*n,
that is *still* not a good reason for special-casing it.
Ahh... but some people *DO* try to use it for another purpose.
Your example is a bad analogy.
Special cases aren't special enough to break the rules.
finish the sentence: ALTHOUGH practicality beats purity.
There are perfectly good ways to generate a 2D array out of lists, and
even better reasons not to use lists for that in the first place. (Numpy
arrays are much better suited for serious work.)
Duh... I answered that....
I'm afraid you've just lost an awful lot of credibility there.
py> x = [{}]*5
py> x
[{}, {}, {}, {}, {}]
No, I showed what happed when you do {}*3;
That *DOESN'T* work; You aren't multiplying the dictionary, you are
multiplying the LIST of dictionaries. Very different things.
You were complaining that my method doesn't multiply them -- well, gee
-- either mine DOES or python DOESN'T. Double standards are *crap*.
py> x[0]['key'] = 1
py> x
[{'key': 1}, {'key': 1}, {'key': 1}, {'key': 1}, {'key': 1}]
And similarly for any other mutable object.
If you don't understand that lists can contain other mutable objects
apart from lists, then you really shouldn't be discussing this issue.
I do; that's why I DEMONSTRATED this issue in my own replies.
Your proposal throws away consistency for a trivial benefit on a rare
use- case, and replaces it with a bunch of special cases:
RARE!!!! You are NUTS!!!!
Yes, rare. I base that on about 15 years of Python coding and many
thousands (tens of thousands?) of hours on Python forums like this one.
What's your opinion based on?
Which opinion?
2D lists are NOT rare; I've seen them in dozens of python programs not
written by me.
As to my other opinion regarding why change it, there are two separate
issues:
One was that a poster asked if it would be difficult to do without
introducing bugs; That's the question I answered affirmatively. The
_OP_ problem can be removed using a simple fix which isn't going to
break any major number of programs. That's a fact by your admission as
well.
The second issue is would it be consistent to make the change (and NOTE
I wasn't asked that, and wasn't answering that) You and others brought
it up tangentially. I agree, There is a problem, as I noted, with
subclassing. I also will note that ([],[],[]) is effectively the same
question as sub-classing -- () are merely lists that are immutable.
As to how often people make the mistake -- There's a big difference
between how often someone makes the mistake, and how often it shows up
on a forum. The issue shows up in a forum when someone can't figure out
what they did wrong after a long debugging session.
There is more than one way to resolve such an issue, even if a person
doesn't know why their construction is wrong. If it is easy to see the
construction produces the wrong result, one can simply try list
comprehensions which will work correctly. Then, either the person
learns why they made the mistake -- or they don't. If they are able to
get it to work sometimes, but sometimes they can't -- they may or not
stop using it. I cite Java programming where the API is notorious for
that kind of inconsistent behavior; Yet Java programmers feel compelled
to still use those constructions. etc.
values = [None]*n # or 0 is another popular starting value
Using it twice to generate a 2D array is even rarer.
Sure, and if it is used that way -- I doubt it is ever used like [ {}
]*n, because that object will have a side effect. SO again, if this is
the *main* use case, the default behavior is not the main reason they
use it -- but they can often work around the default behavior by using
it with specially thought about data.
Q: How about if I use delegation to proxy a list? A: Oh no, they
definitely won't be copied.
Give an example usage of why someone would want to do this. Then we can
discuss it.
Proxying objects is hardly a rare scenario. Delegation is less common
since you can subclass built-ins, but it is still used. It is a standard
design pattern.
I was not judging you; I was asking for an example to discuss.
Python is a twenty year old language. Do you really think this is the
first time somebody has noticed it? It's hard to search for
discussions on the dev list, because the obvious search terms bring up
many false positives.
No, I don't think it's the first time.
But here are a couple of bug reports closed as "won't fix":
http://bugs.python.org/issue1408 http://bugs.python.org/issue12597 I
suspect it is long past time for a PEP so this can be rejected once
and for all.
Yeah, that'd be good -- and perhaps they'd abolish it. :)
The copy speed will be the same or *faster*, and the typing less -- and
the psychological mistakes *less*, the elegance more.
You think that it is *faster* to copy a list than to make a new pointer
to it? Your credibility is not looking too good here.
YES -- WHEN copying by reference is a BUG, then copying is NOT by reference.
That's the use case I spelled out. You're changing the subject to make
me look dumb? or being purposely facile to hide being destroyed on the
substance of the argument ?
When true copying is desired, then doing it at the "C" level is better
than the interpreter Level.
ergo: You're not looking too intelligent either.
It's hardly going to confuse anyone to say that lists are copied with
list multiplication, but the elements are not.
Well, that confuses me. What about a list where the elements are lists?
Are they copied?
YES! It's a LIST copy; the all lists, and only the lists, are copied;
the rest are referenced.
What about other mutable objects? Are they copied?
No.
It's a list copy, not random mutable object copy.
What about mutable objects which are uncopyable, like file objects?
No.
It's a list copy, not a file copy.
Every time someone passes a list to a function, they *know* that the
list is passed by value -- and the elements are passed by reference.
And there goes the last of your credibility. *You* might "know" this, but
that doesn't make it so.
No, it's not gone. You saying so, doesn't make it gone.
People aren't ignorant of the passing mechanism just because they didn't
transfer from another language like "C", or "Pascal", "ADA", "Fortran",
etc. But when they do transfer from a language which makes a
distinction -- then, yes, it's weird.
Python's calling behaviour is identical to that used by languages
including Java (excluding unboxed primitives) and Ruby, to mention
only two. You're starting to shout and yell, so perhaps it's best if I
finish this here.
Huh?
I'm not yelling any more than you are. Are ???YOU??? yelling?
:-\
--
http://mail.python.org/mailman/listinfo/python-list