Terry Reedy wrote:
> On 2/18/2013 6:47 AM, John Reid wrote:
>
>> I was hoping namedtuples could be used as replacements for tuples
>> in all instances.
>
> This is a mistake in the following two senses. First, tuple is a class
> with instances while namedtuple is a class factory that produces
> classes. (One could think of namedtuple as a metaclass, but it was not
> implemented that way.)

I think you have misunderstood. I don't believe that John wants to use the namedtuple factory instead of tuple. He wants to use a namedtuple type instead of tuple.

That is, given:

    Point3D = namedtuple('Point3D', 'x y z')

he wants to use a Point3D instead of a tuple. Since:

    issubclass(Point3D, tuple)

holds true, the Liskov Substitution Principle (LSP) tells us that anything that is true for a tuple should also be true for a Point3D. That is, given that instance x might be either a builtin tuple or a Point3D, all of the following hold:

- isinstance(x, tuple) returns True
- len(x) returns the length of x
- hash(x) returns the hash of x
- x[i] returns item i of x, or raises IndexError
- del x[i] raises TypeError
- x + a_tuple returns a new tuple
- x.count(y) returns the number of items equal to y

etc. Basically, any code expecting a tuple should continue to work if you pass it a Point3D instead (or any other namedtuple).

There is one conspicuous exception to this, the constructor:

    type(x)(args)

behaves differently depending on whether x is a builtin tuple or a Point3D.

The LSP is about *interfaces* and the contracts we make about those interfaces, rather than directly about inheritance. Inheritance is just a mechanism for allowing types to automatically get the same interface as another type. Another way to put this: LSP is about duck-typing. In this case, if we have two instances:

    x = (1, 2, 3)
    y = Point3D(4, 5, 6)

then x and y:

- quack like tuples
- swim like tuples
- fly like tuples
- walk like tuples
- eat the same things as tuples
- taste very nice cooked with orange sauce like tuples

etc., but y does not lay eggs like x. The x constructor requires a single argument, the y constructor requires multiple arguments.

You can read more about LSP here:

http://en.wikipedia.org/wiki/Liskov_substitution_principle

although I don't think this is the most readable Wikipedia article, and the discussion of mutability is a red herring.
Or you can try this:

http://c2.com/cgi/wiki?LiskovSubstitutionPrinciple

although even by c2 wiki standards, it's a bit of a mess. These might help more:

http://blog.thecodewhisperer.com/2013/01/08/liskov-substitution-principle-demystified/

http://lassala.net/2010/11/04/a-good-example-of-liskov-substitution-principle/


> Second, a tuple instance can have any length and
> different instances can have different lengths. On the other hand, all
> instances of a particular namedtuple class have a fixed length.

This is a subtle point. If your contract is, "I must be able to construct an instance with a variable number of items", then namedtuples are not substitutable for builtin tuples. But I think this is an *acceptable* violation of LSP, since we're deliberately restricting a namedtuple to a fixed length. But within the constraints of that fixed length, we should be able to substitute a namedtuple for any tuple of that same length.


> This
> affects their initialization. So does the fact that Oscar mentioned,
> that fields can be initialized by name.

Constructing namedtuples by name is not a violation, since it *adds* behaviour, it doesn't take it away. If you expect a tuple, you cannot construct it with:

    t = tuple(spam=a, ham=b, eggs=c)

since that doesn't work. You have to construct it from an iterable, or more likely a literal:

    t = (a, b, c)

Literals are special, since they are a property of the *interpreter*, not the tuple type. To put it another way, the interpreter understands (a,b,c) as syntax for constructing a tuple, the tuple type does not. So we cannot expect to use (a,b,c) syntax to construct a MyTuple instance, or a Point3D instance instead.

If we hope to substitute a subclass, we have to use the tuple constructor directly:

    type_to_use = tuple
    t = type_to_use([a, b, c])

Duck-typing, and the LSP, tells us that we should be able to substitute a Point3D for this:

    type_to_use = namedtuple('Point3D', 'x y z')
    t = type_to_use([a, b, c])

but we can't.
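
To make the failure concrete, here is a minimal sketch, assuming Python 3. The values of a, b and c are placeholders picked for illustration, and the names t, p1, p2 and failure are mine, not from the thread:

```python
from collections import namedtuple

Point3D = namedtuple('Point3D', 'x y z')
a, b, c = 1, 2, 3  # placeholder values, chosen only for illustration

# Builtin tuple: the constructor takes a single iterable.
t = tuple([a, b, c])

# Namedtuple: the same call shape fails, because the whole list is bound
# to the field 'x' and the fields 'y' and 'z' are left without values.
failure = None
try:
    Point3D([a, b, c])
except TypeError as err:
    failure = err

# Both of these work, but neither has the same signature as tuple().
p1 = Point3D(a, b, c)
p2 = Point3D._make([a, b, c])  # classmethod that accepts an iterable
```

The _make classmethod is part of the documented namedtuple API, so the capability is there. It just isn't spelled the same way as the tuple constructor, which is why code written against tuple(iterable) cannot pick it up automatically.
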
And that is an important violation of LSP.

There could be three fixes to this, none of them practical:

1) tuple could accept multiple arguments, tuple(a, b, c) => (a, b, c), but that conflicts with the use tuple(iterable). If Python had had * argument unpacking back in the early days, it might have been better to give tuples the signature tuple(*args), but it didn't and so it doesn't and we can't change that now.

2) namedtuples could accept a single iterable argument like tuple does, but that conflicts with the desired signature pt = Point3D(1, 2, 3).

3) namedtuples should not claim to be tuples, which is probably the least-worst fix. Backwards-compatibility rules out making this change, but even if it didn't, namedtuples quack like tuples, swim like tuples, and walk like tuples, so even if they aren't a subclass of tuple it would still be reasonable to want them to lay eggs like tuples.

So I don't believe there is any good solution to this, except the ad-hoc one of overriding the __new__ constructor when needed.


>> There seem to be some differences between how tuples and namedtuples
>> are created. For example with a tuple I can do:
>>
>> a=tuple([1,2,3])
>
> But no sensible person would ever do that, since it creates an
> unnecessary list and is equivalent to
>
> a = 1,2,3

Well, no, not as given. But it should be read as just an illustration. In practice, code like this is not uncommon:

    a = tuple(some_iterable)


[...]
> It is much less common to change tuple(iterable) to B(iterable).

Less common or not, duck-typing and the LSP tell us we should be able to do so. We cannot.


>> Is this a problem with namedtuples, ipython or just a feature?
>
> With canSequence. If isinstance was available and the above were written
> before list and tuple could be subclassed, canSequence was sensible when
> written. But as Oscar said, it is now a mistake for canSequence to
> assume that all subclasses of list and tuple have the same
> initialization api.

No, it is not a mistake.
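
As a rough sketch of the ad-hoc __new__ fix mentioned earlier, assuming Python 3: the subclass below gives a namedtuple the same single-iterable constructor as tuple. The rebuild helper is hypothetical and only stands in for code that reconstructs a sequence from its own type. It is not canSequence's actual implementation.

```python
from collections import namedtuple


class Point3D(namedtuple('Point3D', 'x y z')):
    """Hypothetical namedtuple subclass with a tuple-style constructor."""
    __slots__ = ()

    def __new__(cls, iterable):
        # Unpack the single iterable into the positional field arguments.
        return super().__new__(cls, *iterable)


def rebuild(seq, func):
    """Hypothetical helper: apply func to each item, keeping seq's type."""
    return type(seq)(func(item) for item in seq)


p = Point3D([1, 2, 3])
doubled_point = rebuild(p, lambda v: v * 2)          # still a Point3D
doubled_tuple = rebuild((1, 2, 3), lambda v: v * 2)  # a plain tuple
```

The price is the one described in fix 2 above: with this override, Point3D(1, 2, 3) no longer works, so the convenience is traded away for the tuple-compatible signature.
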
Rather, it is a problem with namedtuples that they violate the expectation that they should have the same constructor signature as other tuples. After all, namedtuples *are* tuples, so they should be constructed the same way. But they aren't, so that violates a reasonable expectation.

Is the convenience of being able to write Point3D(1, 2, 3) more important than LSP-purity? Perhaps. I suspect that will be the answer Raymond Hettinger might give. I'm 85% inclined to agree with this answer.


> In fact, one reason to subclass a class is to change the initialization
> api.

That might be a reason that people give, but it's a bad reason from the perspective of interface contracts, duck-typing and the LSP.

Of course, these are not the *only* perspectives. There is no rule that states that one must always obey the interface contracts of one's parent class. But if you don't, you will be considered an "ill-behaved" subclass for violating the promises made by your type.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list