On Tue, Oct 22, 2019 at 04:11:45PM -0400, Todd wrote:
> On Tue, Oct 22, 2019 at 3:54 PM Steve Jorgensen <[email protected]> wrote:
>
> > See
> > https://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Literals#The_%_Notation
> > for what Ruby offers.
> >
> > For me, the arrays are the most useful aspect.
> >
> > %w{one two three}
> > => ["one", "two", "three"]
I would expect %w{ ... } to return a set, not a list:
%w[ ... ] # list
%w{ ... } # set
%w( ... ) # tuple
and I would describe them as list/set/tuple "word literals". Unlike
list (etc.) displays such as [spam, eggs, cheese], these would actually
be true literals that can be determined entirely at compile-time.
> I am not seeing the advantage of this. Can you provide some specific
> examples that you think would benefit from this syntax?
I would use this feature, or something like it, a lot, especially in
doctests where there is a premium in being able to keep examples short
and on one line.
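To give a rough idea of what I mean, here is a made-up doctest (the function "longest" is invented purely for illustration, it is not from my code). The first example uses the split idiom, the second spells out every word by hand:

```python
import doctest

def longest(words):
    """Return the longest word (hypothetical example function).

    >>> longest('aa bbbb c ddd'.split())
    'bbbb'
    >>> longest(['aa', 'bbbb', 'c', 'ddd'])
    'bbbb'
    """
    return max(words, key=len)

# Run just this docstring's examples; prints nothing when they all pass.
doctest.run_docstring_examples(longest, {'longest': longest})
```

Both examples test the same thing, but the first is shorter and easier to keep on one line.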
Here is a small selection of examples from my code that would be
improved by something like the suggested syntax. I have trimmed some of
them for brevity, and to keep them on one line. (Anything with an
ellipsis ... has been trimmed.) I have dozens more, but they're all
pretty similar and I don't want to bore you.
__slots__ = ('key', 'value', 'prev', 'next', 'count')
__all__ = ["Mode_Estimators", "Location", "mfv", ...]
The "string literal".split() idiom is especially common, particularly for
data tables of strings. Here are some examples:
NUMBERS = ('zero one two three ... twenty-eight twenty-nine').split()
_TOKENS = set("indent assign addassign subassign ...".split())
__all__ = 'loopup loopdown reduce whileloop recursive product'.split()
for i, colour in enumerate('Black Red Green Yellow Blue Magenta Cyan White'.split()):
for methodname in 'pow add sub mul truediv'.split():
attrs = "__doc__ __version__ __date__ __author__ __all__".split()
names = 'meta private dunder ignorecase invert'.split()
unsorted = "The quick brown Fox jumps over the lazy Dog".split()
blocks = chaff.pad('flee to south'.split(), key='george')
minmax('aa bbbb c ddd eeeee f ggggg'.split(), key=len)
My estimate is that I would use this "string literal".split() idiom:
- about 60-70% in doctests;
- about 5-10% in other tests;
- about 25% in non-test code.
Anyone who has had to write out a large, or even not-so-large, list of
words could benefit from this. Why quote each word individually like a
drudge, when the compiler could do it for you at compile-time?
Specifically as a convenience for this "list of words" use-case,
namedtuple splits a single string into words, e.g.
namedtuple('Parameter', 'name alias default')
I do the same in some of my functions as well, to make it easier to pass
lists of words.
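A minimal sketch of that convention follows. The namedtuple call is the real stdlib behaviour; the helper "as_words" is a hypothetical name invented for this example, standing in for the sort of thing I do in my own functions:

```python
from collections import namedtuple

# namedtuple accepts the field names as one whitespace-separated string.
Parameter = namedtuple('Parameter', 'name alias default')
fields = Parameter._fields

def as_words(names):
    """Hypothetical helper: accept one string of words, or an iterable of words."""
    if isinstance(names, str):
        return names.split()
    return list(names)

words_from_str = as_words('meta private dunder')
words_from_list = as_words(['meta', 'private', 'dunder'])
```

Either calling style gives the same list of words, so the caller can pick whichever is less tedious to type.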
Similarly, support for keyword arguments in the dict constructor was
specifically added to ease the case where your keys were single words:
# {'spam': 1, 'eggs': 2}
dict(spam=1, eggs=2)
Don't underestimate the annoyance factor of having to write out things
by hand when the compiler could do it for you. Analogy: we have list
displays to make it easy to construct a list:
mylist = [2, 7, -1]
but that's strictly unnecessary, since we could construct it like
this:
mylist = list()
mylist.append(2)
mylist.append(7)
mylist.append(-1)
If you think I'm being facetious about the list example, you've probably
never used standard Pascal, which had arrays but no syntax to initialise
them except via a sequence of assignments. That wasn't too bad if you
could put the assignments in a loop, but was painful if the initial
entries were strings or floats.
> For the example you gave, besides saving a few characters I don't see the
> advantage over the existing way we have to do that:
>
> 'one two three'.split()
One of the reasons why Python is "slow" is that lots of things that can
be done at compile-time are deferred to run-time. I doubt that splitting
short strings will often be a bottleneck, but idioms like this cannot
help but contribute (even if only a little bit) to the extra work the
Python interpreter does at run-time:
load a pre-allocated string constant
look up the "split" attribute in the instance (not found)
look up the "split" attribute in the class
call the descriptor protocol which returns a method
call the method
build and return a list
garbage collect the string constant
versus:
build and return a list from pre-allocated strings
(Or something like this, I'm not really an expert on the Python
internals, I just pretend to know what I'm talking about.)
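For anyone who wants to check rather than take my word for it, here is one way to look at what the compiler generates. This is only a sketch: I am using an ordinary list display as a stand-in for what a word literal might compile to, which is an assumption since no such syntax exists, and the exact opcodes vary between CPython versions.

```python
import dis

split_code = compile("'one two three'.split()", "<split>", "eval")
literal_code = compile("['one', 'two', 'three']", "<literal>", "eval")

# Names that have to be looked up at run-time: the split version needs
# the "split" attribute, the list display needs no name lookups at all.
print(split_code.co_names)
print(literal_code.co_names)

# Full bytecode listings; the opcodes differ between CPython versions.
dis.dis(split_code)
dis.dis(literal_code)

# Both expressions evaluate to the same list.
same_result = eval(split_code) == eval(literal_code)
```

The listings show the attribute lookup and method call in the first version, and nothing of the sort in the second.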
> Python usually uses [ ] for list creation or indexing. Co-opting it for a
> substantially different purpose of string processing like this doesn't
> strike me as a good idea, especially since we have two string identifiers
> already, ' and ".
I'm not sure why you describe this as "string processing". The result
you get is a list, not a string. This would be pure syntactic sugar for:
%w[words] # "words".split()
%w{words} # set("words".split())
%w(words) # tuple("words".split())
except done by the compiler, at compile-time, not runtime.
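Since %w is not valid Python today, the run-time spellings are all we can actually run. A small sketch of the three equivalents (the variable names are just for illustration):

```python
words = "spam eggs spam cheese"

as_list = words.split()           # what %w[...] would stand for
as_set = set(words.split())       # what %w{...} would stand for
as_tuple = tuple(words.split())   # what %w(...) would stand for

# The list and tuple keep order and duplicates; the set drops the repeat.
print(as_list)
print(len(as_set))
print(as_tuple)
```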
--
Steven