This is a follow-up discussion on my earlier PEP suggestion. I've integrated the insights collected during the previous discussion and tried to regroup my arguments for a second round of feedback. Thanks to everybody who gave useful feedback last time.

PEP Proposal: Pythonification of the asterisk-based collection packing/unpacking syntax.

This proposal intends to expand upon the existing collection packing and unpacking syntax. By that we mean the following related Python constructs:

    head, *tail = somesequence   # pack the remainder of the unpacking of somesequence into a list called tail
    def foo(*args): pass         # pack the unbound positional arguments into a tuple called args
    def foo(**kwargs): pass      # pack the unbound keyword arguments into a dict called kwargs
    foo(*args)                   # unpack the sequence args into positional arguments
    foo(**kwargs)                # unpack the mapping kwargs into keyword arguments

We suggest that these constructs have the following shortcomings that could be remedied.

The syntax is unnecessarily cryptic, and out of line with Python's preference for explicit syntax. One cannot state in a single line what the asterisk operator does; this is highly context dependent, and devoid of that 'for line in file' Pythonic obviousness. From the perspective of a Python outsider, the only hint as to what *args means is a loose analogy with the C way of handling variable arguments.

The current syntax, in its terseness, leaves something to be desired in terms of flexibility. While a tuple might be the logical choice for packing positional arguments in the vast majority of cases, a list is not necessarily the preferred choice for repacking an unpacked sequence, for instance.

Type constraints:

When the asterisk is used to signal packing rather than unpacking, its semantics is essentially that of a type constraint. The statement:

    head, tail = sequence

signifies regular unpacking. However, if we add an asterisk, as in:

    head, *tail = sequence

we demand that tail not be just any Python object, but rather a list. This changes the semantics from normal unpacking to unpacking and then repacking all but the head into a list.
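
For reference, the current behaviour described above can be checked in a Python 3 interpreter. This is only a minimal sketch; the names describe_call, add and source are made up for illustration and do not come from the proposal. It shows which collection type each asterisk form produces, and the extra conversion step needed today when a type other than the default is wanted.

```python
def describe_call(*args, **kwargs):
    # Extra positional arguments arrive packed in a tuple,
    # extra keyword arguments arrive packed in a dict.
    return type(args), type(kwargs)


def add(a, b):
    return a + b


args_type, kwargs_type = describe_call(1, 2, key="value")
print(args_type.__name__, kwargs_type.__name__)  # tuple dict

# Starred assignment always repacks the remainder into a list,
# even though the source here is a tuple.
source = (1, 2, 3)
head, *tail = source
print(type(tail).__name__, tail)  # list [2, 3]

# If another collection type is wanted, a separate conversion is needed.
tail_as_tuple = tuple(tail)

# Unpacking in a call: * spreads a sequence, ** spreads a mapping.
total_from_sequence = add(*[1, 2])
total_from_mapping = add(**{"a": 1, "b": 2})
```

Note that the type of tail does not depend on the type of source; that asymmetry is what the rest of this proposal is about.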

It may be somewhat counter-intuitive to think of this as a type constraint, since Python is after all a dynamically typed language. But the current usage of asterisks is an exception to that rule. For those who are unconvinced, please consider the analogy to the following simple C# code:

    var foo = 3;

An 'untyped' object foo is created (actually, its type will be inferred from its right-hand side as an integer).

    float foo = 3;

By giving foo a type constraint of float instead, the semantics are modified; foo is no longer the integer 3, but gets silently cast to 3.0. This is a simple example, but conceptually entirely analogous to what happens when one places an asterisk before an lvalue in Python. It means 'be a list, and adjust your behavior accordingly', versus 'be a float, and adjust your behavior accordingly'.

The aim of this PEP is to expand upon this type-constraint syntax. We should be careful here to distinguish this from providing optional type constraints throughout Python as a whole; that is not our aim. That concept has been considered before, but the benefits have not been found to outweigh the costs:

    http://www.artima.com/weblogs/viewpost.jsp?thread=86641

Our primary aim is the niche of collection packing/unpacking, but if further generalizations can be made without increasing the cost, those are most welcome. To reiterate: what is proposed is nothing radical; merely to replace the asterisk-based type constraints with a more explicit type constraint.

Currently favored alternative syntax:

Both for the sake of explicitness and flexibility, we consider it desirable that the name of the collection type is used directly in any collection packing statement. Annotating a variable declaration with a collection type name should signal collection packing. This association between a collection type name and a variable declaration can be accomplished in many ways; for now, we suggest collectionname::collectiontype for packing, and ::collectionname for unpacking.

Examples of use:

    head, tail::tuple = ::sequence
    def foo(args::list, kwargs::dict): pass
    foo(::args, ::kwargs)

The central idea is to replace annotations with asterisks by annotations with collection type names, but note that we have opted for several other minor alterations of the existing syntax that seem natural given the proposed changes.

First of all, explicitly mentioning the type of the collection involved eliminates the need to have two symbols, * and **. Which variable captures the positional arguments and which captures the keyword arguments can be inferred from the kind of collection type they model, mapping or sequence. The rare case of collections that model both a sequence and a mapping can either be excluded or handled by assigning precedence to one type or the other.

A double colon before a collection signals unpacking. As with declarations, there is no genuine need to have a different operator for sequence and mapping types, although if such a demand exists, it would not be hard to accommodate. A double colon in front of the collection is congruent with the asterisk syntax, and nicely emphasizes that this unpacking operation is the symmetric counterpart of the packing operation, which is signalled by the same symbols to the right of the identifier.

Since we are going to make the double colon (or whatever the symbol) a general collection packing/unpacking marker, we feel it makes sense to allow it to be used to explicitly signify unpacking, even when as much is implied by the syntax on the left-hand side, to preserve symmetry with the syntax inside function calls.

Summarizing, what this syntax achieves, in loose order of perceived importance:

Simplicity: we have reduced a set of rather arbitrary rules concerning the syntax and semantics of the asterisk (does it construct a list or a tuple?) to a single general symbol: the double colon is the collection packing/unpacking annotation symbol, and that is all there is to know about it.

Readability: the proposed syntax reads like a book: args-list and kwargs-dict, unlike the more cryptic asterisk syntax. We also avoid extra lines of code when a sequence or mapping type other than the one returned by default is required.

Efficiency: by declaring the desired collection type, it can be constructed in the optimal way from the given input, rather than requiring a conversion after the default collection type is constructed.

A double colon is suggested, since the single colon is already taken by the function annotation syntax in Python 3. This is somewhat unfortunate: programming should come before meta-programming, so ideally the assignment of the two symbols would be the other way around. On the one hand, having both : and :: as variable declaration annotation symbols is a nice unification; on the other hand, a syntax that is more easily distinguished visually from function annotations can be defended. For increased backwards compatibility the asterisk could be used, but sandwiched between two identifiers it looks like a multiplication. Many other symbols would do as well, such as @ or !.
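
The flexibility and efficiency points can be approximated in today's Python without any new syntax. The following is only a sketch under stated assumptions: split_head is a hypothetical helper, not part of this proposal or of the standard library, and it only emulates the packing half of something like `head, tail::tuple = ::sequence`. It builds the tail directly in the requested collection type from an iterator, instead of first building a list and then converting it.

```python
from collections import deque


def split_head(iterable, tail_type=list):
    """Return (head, tail), with the tail built directly as tail_type.

    Hypothetical helper for illustration only; tail_type is assumed to be
    a callable that accepts an iterator, such as list, tuple or deque.
    """
    iterator = iter(iterable)
    try:
        head = next(iterator)
    except StopIteration:
        raise ValueError("need at least one value to unpack") from None
    # The constructor consumes the remaining items itself,
    # so no intermediate list is created.
    return head, tail_type(iterator)


head, tail = split_head(range(5), tuple)
print(head, tail)  # 0 (1, 2, 3, 4)

first_char, remaining_chars = split_head("abc", deque)
```

Whether dedicated syntax would actually be faster than such a helper depends on the implementation; the sketch only shows that the intermediate list is not logically required.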