On Mar 3, 2020, at 01:09, M.-A. Lemburg <[email protected]> wrote:
>
> The main reason for not having characters and strings is
> reducing complexity. Why try to add this now for no apparent
> net benefit ?
I don’t think the benefit is worth the (as far as I can tell insurmountable)
backward compatibility cost, but you can’t argue that there is no benefit.
An object whose first element is itself is a valid idea, but it’s a
pathological case; you have to write something like `lst=[]; lst.append(lst)`
to get one. So code like this is fine:
from collections.abc import Iterable

def flatten(xs):
    for x in xs:
        if isinstance(x, Iterable):
            yield from flatten(x)
        else:
            yield x
… in that it only infinitely recurses if you go out of your way to give it an
infinitely recursive value.
… except that every string is an infinitely recursive value, so all you have to
do is give it 'A'.
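To make that concrete, here is a minimal sketch of the failure. The flatten definition is repeated so the snippet runs on its own, and the flag name recursed_forever is just something made up for the illustration:

```python
from collections.abc import Iterable

def flatten(xs):
    for x in xs:
        if isinstance(x, Iterable):
            yield from flatten(x)
        else:
            yield x

# A one-character string is its own first (and only) element.
s = 'A'
assert s[0] == s and list(s) == [s]

# Nested lists of numbers flatten as expected.
assert list(flatten([1, [2, [3, 4]]])) == [1, 2, 3, 4]

# A string never bottoms out, so the recursion limit is eventually hit.
try:
    list(flatten('A'))
    recursed_forever = False
except RecursionError:
    recursed_forever = True
```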
Which is not just weird in theory; it breaks perfectly sensible code like
flatten. And it’s why we have to have idioms like endswith taking a
str|Tuple[str] rather than any Iterable: forcing people to write
s.endswith(tuple(suffixes)) when suffixes is a set is the only reasonable way
to avoid confusion when suffixes is an arbitrary iterable.
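A short sketch of the endswith situation, with made-up example values (the file name and suffix set are hypothetical). The last two lines show the ambiguity that would arise if any iterable were accepted, since a string is itself an iterable of one-character strings:

```python
suffixes = {'.py', '.pyi'}  # hypothetical set of suffixes

# str.endswith accepts a str or a tuple of str, not an arbitrary iterable.
try:
    'setup.py'.endswith(suffixes)
    accepted_set = True
except TypeError:
    accepted_set = False

# Converting to a tuple first is the usual idiom.
matches = 'setup.py'.endswith(tuple(suffixes))

# The same argument means different things as a suffix and as an iterable.
as_suffix = 'abc'.endswith('xc')                      # 'xc' as one suffix
as_iterable = any('abc'.endswith(ch) for ch in 'xc')  # 'x' or 'c' as suffixes
```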
And, because it comes up all the time, and many other languages don’t have this
problem, it has to be explained to new students and people coming from other
languages, and painfully remembered or relearned by people who usually work in
Java or whatever but occasionally have to do Python.
Of course regular Python developers have this drummed into their heads, and
usually remember to check for str and handle it specially, and we’ve all
learned to deal with the tuple-special idiom, and so on. But that doesn’t mean
it’s an ideal design, just that we’ve all gotten used to it.
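For reference, the special-casing in question usually looks something like the sketch below. Treating bytes the same way as str is my own assumption here; whether bytearray or other types should also count as atoms depends on the application:

```python
from collections.abc import Iterable

def flatten(xs):
    for x in xs:
        # Treat str and bytes as atoms instead of descending into them.
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            yield from flatten(x)
        else:
            yield x

result = list(flatten(['ab', ['cd', [1, 2]], b'ef']))
single = list(flatten('A'))
```

With the check in place, strings come out whole and flatten('A') terminates.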
> I think the situation with bytes (iteration returning integers
> instead of bytes) has shown that this is not a very user friendly
> nor intuitive approach:
Well, it shows that using integers is confusing.
In fact, it’s even worse than C, where char is an integral type but at least
not the same type as int. (A char is a single byte, ranging from 0 to 255 when
unsigned or typically -128 to 127 when signed; C++ streams read and write it
as a character rather than as a number by default, and printf and scanf have a
dedicated %c conversion for it; there are a bunch of character-related
functions that take char but not int, although using them with an int is
usually just a warning rather than an error; etc.)
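For comparison, this is what bytes does today, shown next to str. It is only a small sketch of existing behavior; the variable names are arbitrary:

```python
b = bytes((1, 2, 3, 4))

# Slicing gives bytes, but indexing and iteration give plain ints.
assert b[:1] == b'\x01'
assert b[0] == 1
assert list(b) == [1, 2, 3, 4]
assert b[0] != b[:1]

# str, by contrast, gives a length-1 str in both cases.
s = 'abcd'
assert s[0] == s[:1] == 'a'
```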
That doesn’t mean a new type would be confusing:
>>> b = bytes((1,2,3,4))
>>> b
b'\x01\x02\x03\x04'
>>> b[:2]
b'\x01\x02'
>>> b[:1]
b'\x01'
>>> b[0]
byte(b'\x01')
In fact, it would make bytes consistent with other sequences of byte:
>>> s = list(b)
>>> s[:1]
[byte(b'\x01')]
>>> s[0]
byte(b'\x01')
… without adding any new inconsistencies:
>>> assert tuple(b[:2]) == tuple(s[:2])
>>> assert b[0] == s[0]
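To check that the behavior sketched in those sessions is at least self-consistent, here is a rough emulation in today's Python. Both classes are hypothetical stand-ins: byte is not a real builtin, and Bytes exists only so that indexing and iteration can return byte objects. A real design would have many more interface questions to settle.

```python
class byte(bytes):
    """Hypothetical single-byte type; not part of Python."""

    def __new__(cls, value):
        self = super().__new__(cls, value)
        if len(self) != 1:
            raise ValueError('byte must have length 1')
        return self

    def __repr__(self):
        return f'byte({bytes(self)!r})'


class Bytes(bytes):
    """Hypothetical bytes variant whose elements are byte objects."""

    def __getitem__(self, index):
        item = super().__getitem__(index)
        if isinstance(index, slice):
            return Bytes(item)           # slices stay sequences
        return byte(bytes([item]))       # single items become byte

    def __iter__(self):
        return (byte(bytes([i])) for i in super().__iter__())


b = Bytes((1, 2, 3, 4))
s = list(b)

assert b[:1] == b'\x01'
assert repr(b[0]) == "byte(b'\\x01')"
assert repr(s[:1]) == "[byte(b'\\x01')]"
assert tuple(b[:2]) == tuple(s[:2])
assert b[0] == s[0]
```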
The downside, of course, is having one more builtin type. But that’s not an
instant disqualifier; it’s a cost to trade off with the benefits. I think if it
weren’t for backward compatibility, chr might turn out to be useful enough to
qualify (byte I’m much less confident of—it comes up less often, and also once
you start bikeshedding the interface there’s a lot more vagueness in the
concept), or at least worth having a PEP to explain why it’s rejected. (But of
course “if not for backward compatibility” isn’t realistic.)
Message archived at
https://mail.python.org/archives/list/[email protected]/message/UGGSZRM7YT7OOWHWLFMLCNGEMTWCLWAW/