Re: Multi-dimensional list initialization

Andrew Robinson Wed, 07 Nov 2012 14:13:39 -0800

On 11/06/2012 05:55 PM, Steven D'Aprano wrote:

On Tue, 06 Nov 2012 14:41:24 -0800, Andrew Robinson wrote:

Yes.  But this isn't going to cost any more time than figuring out
whether or not the list multiplication is going to cause quirks, itself.
  Human psychology *tends* (it's a FAQ!) to automatically assume the
purpose of the list multiplication is to pre-allocate memory for the
equivalent (using lists) of a multi-dimensional array.  Note the OP even
said "4d array".

I'm not entirely sure what your point is here. The OP screwed up -- he
didn't generate a 4-dimensional array. He generated a 2-dimensional
array. If his intuition about the number of dimensions is so poor, why
should his intuition about list multiplication be treated as sacrosanct?

Yes he did screw up.

There is a great deal of value in studying how people screw up, and
designing interfaces which tend to discourage it. "Candy machine
interfaces".

As they say, the only truly intuitive interface is the nipple.

No it's not -- that interface really sucks.  :)
Have you ever seen a cat trying to suck a human nipple -- ?

Or, have you ever asked a young child who was weaned early and doesn't
remember nursing -- what a breast is for ? Once the oral stage is left,
remaining behavior must be re-learned.

  There are
many places where people's intuition about programming fail. And many
places where Fred's intuition is the opposite of Barney's intuition.

OK. But that doesn't mean that *all* places have opposite intuition;
nor does it mean that one intuition which is statistically *always*
wrong shouldn't be discouraged, or re-routed into useful behavior.

Take the candy machine, if the items being sold are listed by number --
and the prices are also numbers; it's very easy to type in the price
instead of the object number because one *forgets* that the numbers have
different meaning and the machine can't always tell from the price,
which object a person wanted (duplicate prices...); Hence a common
mistake... people get the wrong item, by typing in the price.

By merely avoiding a numeric keypad -- the user is re-routed into
choosing the correct item by not being able to make the mistake.

For this reason, Python tends to *like* things such as named parameters
and occasionally enforces their use. etc.
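
One way to picture the "enforced named parameters" point is a keyword-only
signature in Python 3. This is only a sketch: the function vend and its
parameter names are hypothetical, made up for illustration.

```python
# Hypothetical vending function; the names are illustrative only.
def vend(*, item, price_cents):
    """Both arguments must be passed by name because of the bare '*'."""
    return "item {} for {} cents".format(item, price_cents)

ok = vend(item=12, price_cents=150)

try:
    vend(150, 12)            # positional call: which number is the price?
except TypeError as exc:
    error = str(exc)         # Python refuses the call instead of guessing
```

The positional call fails before any code in vend runs, which is the
"no numeric keypad" idea applied to a function signature.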

Even more exciting, there are places where people's intuition is
*inconsistent*, where they expect a line of code to behave differently
depending on their intention, rather than on the code. And intuition is
often sub-optimal: e.g. isn't it intuitively obvious that "42" + 1 should
give 43? (Unless it is intuitively obvious that it should give 421.)

I agree, and in places where an *exception* can be raised; it's
appropriate to do so.

Ambiguity, like the candy machine, is *bad*.
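
For the "42" + 1 case, Python takes the exception route rather than
picking one of the two intuitions. A small sketch of that behaviour, with
the two explicit spellings next to it:

```python
try:
    result = "42" + 1        # str + int is ambiguous, so Python raises
except TypeError:
    result = None

as_number = int("42") + 1    # explicit numeric intent -> 43
as_text = "42" + str(1)      # explicit text intent -> "421"
```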

So while I prefer intuitively obvious behaviour where possible, it is not
the holy grail, and I am quite happy to give it up.

"where possible"; OK, fine -- I agree. I'm not "happy" to give it up;
but I am willing. I don't like the man hours wasted on ambiguous
behavior; and I don't ever think that should make someone "happy".

The OP's original construction was simple, elegant, easy to read and
very commonly done by newbies learning the language because it's
*intuitive*.  His second try was still intuitive, but less easy to read,
and not as elegant.

Yes. And list multiplication is one of those areas where intuition is
suboptimal -- it produces a worse outcome overall, even if one minor use-
case gets a better outcome.

I'm not disputing that [[0]*n]*m is intuitively obvious and easy. I'm
disputing that this matters. Python would be worse off if list
multiplication behaved intuitively.
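
For readers who have not hit this yet, a minimal sketch of what
[[0]*n]*m does today, next to the usual list-comprehension spelling.
The variable names are just for illustration.

```python
n, m = 3, 2

# The outer multiplication repeats a reference to ONE inner list.
shared = [[0] * n] * m
shared[0][0] = 1
# Both rows change, because both rows are the same list object.

# A comprehension evaluates [0] * n once per row, giving separate lists.
independent = [[0] * n for _ in range(m)]
independent[0][0] = 1
```

After this runs, shared is [[1, 0, 0], [1, 0, 0]] while independent is
[[1, 0, 0], [0, 0, 0]].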

How would it be worse off?

I can agree, for example, that in "C" -- realloc -- is too general.
One can't look at the line where realloc is being used, and decide if it is:
1) mallocing
2) deleting
3) resizing

Number (3) is the only non-redundant behavior the function provides.

There is, perhaps, a very clear reason that I haven't discovered why the
extra functionality in list multiplication would be bad; That reason is
*not* because list multiplication is unable to solve all the copying
problems in the world; (realloc is bad, precisely because of that); But
a function ought to do at least *one* thing well.

Draw up some use cases for the multiplication operator (I'm calling on
your experience, let's not trust mine, right?); What are all the
typical ways people *do* use it now?

If those use cases do not *primarily* center around *wanting* an effect
explicitly caused by reference duplication -- then it may be better to
abolish list multiplication altogether; and rather, improve the list
comprehensions to overcome the memory, clarity, and speed pitfalls in
the most common case of initializing a list.

For example, in initialization use cases; often the variable of a for
loop isn't needed and all the initializers have parameters which only
need to be evaluated *once* (no side effects).

Hence, there is an opportunity for speed and memory gains, while
maintaining clarity and *consistency*.


Some ideas of use cases:

[ (0) in xrange(10) ]   # The function to create a tuple caches the
                        # parameter '0', makes 10 (0)'s
[ dict.__new__(dict) in xrange(10) ]   # dict.__new__: the dict parameter
                                       # is cached -- makes 10 dicts.
[ lambda x:(0) in xrange(10) ]   # lambda caches (0), returns a
                                 # *reference* to it multiple times.
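
The forms above are proposed syntax, not something Python accepts as
written. For comparison, this is roughly how the same initializations are
spelled with today's list comprehensions (Python 3 shown, so range rather
than xrange; the throwaway loop variable is conventionally named _):

```python
# Ten independent dicts: dict() is evaluated once per iteration.
dicts = [dict() for _ in range(10)]

# Ten zeros: the loop variable is unused, as the post points out.
zeros = [0 for _ in range(10)]

distinct_dicts = len({id(d) for d in dicts})   # counts separate objects
```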

An analogy: the intuitively obvious thing to do with a screw is to bang
it in with a hammer. It's long, thin, has a point at the end, and a flat
head that just screams "hit me". But if you do the intuitive thing, your
carpentry will be *much worse* than the alternatives.

:)
I agree.  Good point and Good "thin point".

Having list multiplication copy has consequences beyond 2D arrays. Those
consequences make the intuitive behaviour you are requesting a negative
rather than a positive. If that means that newbie programmers have to
learn not to hammer screws in, so be it. It might be harder, slower, and
less elegant to drill a pilot hole and then screw the screw in, but the
overall result is better.

no, the overall result is still bad. If the answer is *don't* hammer
nails, then it's better to raise an exception when it's tried. There's
no way to do that with list multiplication.

* Consistency of semantics is better than a plethora of special
    cases. Python has a very simple and useful rule: objects should not
    be copied unless explicitly requested to be copied. This is much
    better than having to remember whether this operation or that
    operation makes a copy. The answer is consistent:

Bull.  Even in the last thread I noted the range() object produces
special cases.
  >>>  range(0,5)[1]
1
  >>>  range(0,5)[1:3]
range(1, 3)

What's the special case here? What do you think is copied?


You take a slice of a range object, you get a new range object.

You weren't paying attention; OCCASIONALLY, you get an integer, or a list.
>>> range(3)[2]
2

LOOOOK! That's not a range object, that's an integer. Use Python 3.2
and try it.
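
To make the two results being argued about concrete, here is a small
sketch for Python 3 that records the type of each. The comparison with a
plain list is added only for reference.

```python
r = range(0, 5)

item = r[1]          # indexing yields the element itself: an int
piece = r[1:3]       # slicing yields another range object

# The same two operations on an ordinary list, for reference.
numbers = [0, 1, 2, 3, 4]
list_item = numbers[1]     # an int
list_piece = numbers[1:3]  # a new list
```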

I'm honestly not getting what you think is inconsistent about this.

How about now?

Two-dimensional arrays in Python using lists are quite rare. Anyone
who is doing serious numeric work where they need 2D arrays is using
numpy, not lists.

Game programmers routinely use 2D lists to represent the screen layout;

For example, they might use 'b' to represent a brick tile, and 'w' to
represent a water tile. This is quite common in simple games; I have
seen several use 2D lists (or tuples) to do this. Serious numeric work
is not needed in most simple games; especially if motion is not
involved.
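
A minimal sketch of the kind of tile map described here, assuming a small
grid of water with one brick placed on it. The sizes and names are made
up for illustration.

```python
WIDTH, HEIGHT = 4, 3

# One independent row per line of the screen, all water to start with.
tiles = [['w'] * WIDTH for _ in range(HEIGHT)]

# Put a single brick tile at row 1, column 2.
tiles[1][2] = 'b'

# Join each row into a string for a quick text rendering.
rows = [''.join(row) for row in tiles]
```

Here rows comes out as ['wwww', 'wwbw', 'wwww']. Had the grid been built
as [['w'] * WIDTH] * HEIGHT, the brick would show up in every row.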

There are *many* non serious uses of matrix mathematics and 2D lists.
Numpy isn't desired even if it would work. Cost benefit analysis....

Crossword puzzles, periodic table of the elements with different sources
of weights listed under each element, etc. (That can also be done with
a dict, but I've seen an implementation do it the other way.) etc.

There are millions of people using Python, so it's hardly surprising
that once or twice a year some newbie trips over this. But it's not
something that people tend to trip over again and again and again,
like C's "assignment is an expression" misfeature.

Good point. I don't have a statistic -- except the handful of times I
searched for some other topic -- and I have seen it three times already....

I read some of the documentation on why Python 3 chose to implement it
this way.

What documentation is this?

The documentation for range() -- which I just studied because of
another thread we both were in. You're misconstruing the subject --
which was that "inconsistency" in Python is allowed; not "is list
multiplication inconsistent."

      Q: What about [[]]*10?
      A: No, the elements are never copied.

YES! For the obvious reason that such a construction is making mutable
lists that the user wants to populate later.  If they *didn't* want to
populate them later, they ought to have used tuples -- which take less
overhead.  Who even does this thing you are suggesting?!

Who knows? Who cares? Nobody does:

exactly !!!! But I do care, even though I don't do it (because it
doesn't *work*)


n -= n

instead of just n=0, but that doesn't mean that we should give it some
sort of special meaning different from n -= m. If it turns out that the
definition of list multiplication is such that NOBODY, EVER, uses [[]]*n,
that is *still* not a good reason for special-casing it.

Ahh... but some people *DO* try to use it for another purpose.
Your example is a bad analogy.


Special cases aren't special enough to break the rules.

finish the sentence: ALTHOUGH practicality beats purity.

There are perfectly good ways to generate a 2D array out of lists, and
even better reasons not to use lists for that in the first place. (Numpy
arrays are much better suited for serious work.)

Duh... I answered that....

I'm afraid you've just lost an awful lot of credibility there.

py>  x = [{}]*5
py>  x
[{}, {}, {}, {}, {}]

No, I showed what happened when you do {}*3;

That *DOESN'T* work; You aren't multiplying the dictionary, you are
multiplying the LIST of dictionaries. Very different things. You were
complaining that my method doesn't multiply them -- well, gee -- either
mine DOES or python DOESN'T. Double standards are *crap*.

py>  x[0]['key'] = 1
py>  x
[{'key': 1}, {'key': 1}, {'key': 1}, {'key': 1}, {'key': 1}]

And similarly for any other mutable object.
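
The two operations being contrasted in this exchange can be checked side
by side. A small sketch: multiplying a dict itself is an error, while
multiplying a list that contains a dict repeats a reference to that one
dict.

```python
try:
    {} * 3                       # dict does not support multiplication
    dict_times_int = "no error"
except TypeError:
    dict_times_int = "TypeError"

x = [{}] * 3                     # list multiplication: three references
x[0]['key'] = 1                  # mutate through the first reference

# Every slot refers to the same dict object.
same_object = all(d is x[0] for d in x)
```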

If you don't understand that lists can contain other mutable objects
apart from lists, then you really shouldn't be discussing this issue.


I do; that's why I DEMONSTRATED this issue in my own replies.

Your proposal throws away consistency for a trivial benefit on a rare
use- case, and replaces it with a bunch of special cases:

RARE!!!! You are NUTS!!!!

Yes, rare. I base that on about 15 years of Python coding and many
thousands (tens of thousands?) of hours on Python forums like this one.
What's your opinion based on?

Which opinion?

2D lists are NOT rare; I've seen them in dozens of python programs not
written by me.

As to my other opinion regarding why change it, there are two separate
issues:

One was that a poster asked if it would be difficult to do without
introducing bugs; That's the question I answered affirmatively. The
_OP_ problem can be removed using a simple fix which isn't going to
break any major number of programs. That's a fact by your admission as
well.

The second issue is would it be consistent to make the change (and NOTE
I wasn't asked that, and wasn't answering that) You and others brought
it up tangentially. I agree, There is a problem, as I noted, with
subclassing. I also will note that ([],[],[]) is effectively the same
question as sub-classing -- () are merely lists that are immutable.

As to how often people make the mistake -- There's a big difference
between how often someone makes the mistake, and how often it shows up
on a forum. The issue shows up in a forum when someone can't figure out
what they did wrong after a long debugging session.

There is more than one way to resolve such an issue, even if a person
doesn't know why their construction is wrong. If it is easy to see the
construction produces the wrong result, one can simply try list
comprehensions which will work correctly. Then, either the person
learns why they made the mistake -- or they don't. If they are able to
get it to work sometimes, but sometimes they can't -- they may or may
not stop using it. I cite Java programming where the API is notorious
for that kind of inconsistent behavior; Yet Java programmers feel
compelled to still use those constructions. etc.

values = [None]*n  # or 0 is another popular starting value

Using it twice to generate a 2D array is even rarer.

Sure, and if it is used that way -- I doubt it is ever used like
[{}]*n, because that object will have a side effect. SO again, if this
is the *main* use case, the default behavior is not the main reason
they use it -- but they can often work around the default behavior by
using it with specially thought about data.
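
A short sketch of why the [None]*n form causes no trouble in practice:
every slot starts out referring to the same None object, but None cannot
be mutated, and assigning to a slot rebinds only that slot.

```python
n = 4
values = [None] * n      # n references to the single None object

values[0] = 'first'      # rebinds slot 0; the other slots are untouched
```

After the assignment, values is ['first', None, None, None].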

      Q: How about if I use delegation to proxy a list?
      A: Oh no, they definitely won't be copied.

Give an example usage of why someone would want to do this.  Then we can
discuss it.

Proxying objects is hardly a rare scenario. Delegation is less common
since you can subclass built-ins, but it is still used. It is a standard
design pattern.

I was not judging you; I was asking for an example to discuss.
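
Since an example was asked for, here is one possible shape of a
delegating proxy, offered only as a sketch. The class ListProxy is
hypothetical and forwards just enough operations to make the point; the
behaviour shown is what list multiplication does with it today.

```python
class ListProxy:
    """Hypothetical proxy that forwards a few operations to a wrapped list."""

    def __init__(self, target):
        self._target = target

    def __getitem__(self, index):
        return self._target[index]

    def __setitem__(self, index, value):
        self._target[index] = value

    def __len__(self):
        return len(self._target)


# The proxy is not a list, so the outer list just repeats a reference to it.
grid = [ListProxy([0, 0])] * 3
grid[0][0] = 1
last_row_first_cell = grid[2][0]   # same proxy, so this sees the change
```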

Python is a twenty year old language. Do you really think this is the
first time somebody has noticed it? It's hard to search for
discussions on the dev list, because the obvious search terms bring up
many false positives.

No, I don't think it's the first time.

But here are a couple of bug reports closed as "won't fix":
http://bugs.python.org/issue1408
http://bugs.python.org/issue12597
I suspect it is long past time for a PEP so this can be rejected once
and for all.

Yeah, that'd be good -- and perhaps they'd abolish it. :)

The copy speed will be the same or *faster*, and the typing less -- and
the psychological mistakes *less*, the elegance more.

You think that it is *faster* to copy a list than to make a new pointer
to it? Your credibility is not looking too good here.

YES -- WHEN copying by reference is a BUG, then copying is NOT by reference.

That's the use case I spelled out. You're changing the subject to make
me look dumb? or being purposely facile to hide being destroyed on the
substance of the argument ?

When true copying is desired, then doing it at the "C" level is better
than the interpreter Level.

ergo: You're not looking too intelligent either.

It's hardly going to confuse anyone to say that lists are copied with
list multiplication, but the elements are not.

Well, that confuses me. What about a list where the elements are lists?
Are they copied?

YES! It's a LIST copy; all the lists, and only the lists, are copied;
the rest are referenced.

What about other mutable objects? Are they copied?

No.
It's a list copy, not random mutable object copy.

What about mutable objects which are uncopyable, like file objects?

No.
It's a list copy, not a file copy.
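
One possible reading of the rule stated in these answers, written as an
ordinary function so it can be tried out. This is a sketch under
assumptions: the name repeat_copying_lists is hypothetical, nested lists
are copied recursively, and only objects whose type is exactly list are
copied (subclasses are left as references, which is the open question
acknowledged earlier in the post).

```python
def repeat_copying_lists(seq, count):
    """Hypothetical variant of seq * count that copies nested plain lists."""

    def copy_item(item):
        # Assumption: only exact list instances are copied, recursively.
        if type(item) is list:
            return [copy_item(sub) for sub in item]
        return item              # everything else stays a shared reference

    result = []
    for _ in range(count):
        # Each repetition gets its own fresh copies of the inner lists.
        result.extend(copy_item(item) for item in seq)
    return result


rows = repeat_copying_lists([[0, 0]], 3)
rows[0][0] = 1                   # affects only the first row

dicts = repeat_copying_lists([{}], 3)
dicts[0]['key'] = 1              # dicts are not lists, so still shared
```

Under these assumptions rows ends up as [[1, 0], [0, 0], [0, 0]], while
all three entries of dicts are the same {'key': 1} object.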

Every time someone passes a list to a function, they *know* that the
list is passed by value -- and the elements are passed by reference.

And there goes the last of your credibility. *You* might "know" this, but
that doesn't make it so.

No, it's not gone.  You saying so, doesn't make it gone.

People aren't ignorant of the passing mechanism just because they didn't
transfer from another language like "C", or "Pascal", "ADA", "Fortran",
etc. But when they do transfer from a language which makes a
distinction -- then, yes, it's weird.
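
What Python observably does when a list is passed to a function can be
checked directly. A small sketch with two hypothetical helper functions:
one mutates the list it receives, the other only rebinds its parameter.

```python
def append_marker(items):
    items.append('marker')       # in-place change to the caller's list


def rebind(items):
    items = ['replaced']         # rebinds the local name only
    return items


data = [1, 2]
append_marker(data)              # the caller sees the appended element
replaced = rebind(data)          # the caller's list is left alone
```

After both calls, data is [1, 2, 'marker'] and replaced is a separate
list, ['replaced'].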

Python's calling behaviour is identical to that used by languages
including Java (excluding unboxed primitives) and Ruby, to mention
only two. You're starting to shout and yell, so perhaps it's best if I
finish this here.

Huh?
I'm not yelling any more than you are.  Are ???YOU??? yelling?

:-\

-- 
http://mail.python.org/mailman/listinfo/python-list
