On 06/07/2018 12:45 AM, Chris Angelico wrote:
> On Thu, Jun 7, 2018 at 1:55 PM, Steven D'Aprano
> <steve+comp.lang.pyt...@pearwood.info> wrote:
>> On Tue, 05 Jun 2018 23:27:16 +1000, Chris Angelico wrote:
>>> And an ASCIIZ string cannot contain a byte value of zero. The parallel
>>> is exact.
>> Why should we, as Python programmers, care one whit about ASCIIZ strings?
>> They're not relevant. You might as well say that file names cannot
>> contain the character "π" because ASCIIZ strings don't support it.
>> No they don't, and yet nevertheless file names can and do contain
>> characters outside of the ASCIIZ range.
> Under Linux, a file name contains bytes, most commonly representing
> UTF-8 sequences. So... an ASCIIZ string *can* contain that character,
> or at least a representation of it. Yet it cannot contain "\0".
> ChrisA
This seems like an argument for allowing byte strings to be used as file
names, not for altering text strings. If file names are allowed to
contain values that are illegal in text strings, then they shouldn't
necessarily be considered text strings.
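As a rough illustration of that point (a sketch only, assuming Python 3.5 or later; the temporary directory, the file name, and the helper null_is_rejected are all made up for the example), the os functions already accept byte strings as file names and hand bytes back when given a bytes path, while both the text and the bytes form refuse an embedded zero:

```python
import os
import tempfile


def null_is_rejected(name):
    """Return True if open() refuses the name because of an embedded zero."""
    try:
        open(name).close()
    except ValueError:
        # Python raises this itself, before the OS is ever asked.
        return True
    except OSError:
        # For example "file not found": the name itself was acceptable.
        return False
    return False


with tempfile.TemporaryDirectory() as tmp:
    tmp_bytes = os.fsencode(tmp)            # directory path as bytes
    name = "π.txt".encode("utf-8")          # file name as raw bytes
    path = os.path.join(tmp_bytes, name)
    with open(path, "wb") as f:
        f.write(b"hello")
    # A bytes argument makes listdir return bytes, with no decoding step.
    listed = os.listdir(tmp_bytes)

print(listed)
print(null_is_rejected("a\0b"), null_is_rejected(b"a\0b"))
```

Nothing here requires the name to be treated as text at any point; the str form is a convenience layered on top.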
The Unicode group sets one set of rules, and their rules should apply in
their area. The Linux group sets another set of rules, and their rules
should apply in their area. Just because there is a large area of
overlap doesn't mean that the two areas are congruent. Byte strings are
designed to handle any byte pattern, but text strings are designed to
handle a subset of those patterns. Most byte strings are readable as
text, but not all of them.
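To make that last point concrete, here is a small sketch (the sample bytes are invented for illustration). A byte string that is not valid UTF-8 cannot be read as text directly. Python's surrogateescape error handler (PEP 383) can carry it through a str and back without loss, but the intermediate str is not clean text either:

```python
raw = b"caf\xe9.txt"    # Latin-1 bytes; \xe9 followed by "." is not valid UTF-8

# A strict decode refuses the bytes outright.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape smuggles the bad byte through as a lone surrogate...
text = raw.decode("utf-8", errors="surrogateescape")
# ...and the same handler turns it back into the original byte.
round_trip = text.encode("utf-8", errors="surrogateescape")

# The smuggled str is not well-formed text: a strict encode fails on it.
try:
    text.encode("utf-8")
    is_clean_text = True
except UnicodeEncodeError:
    is_clean_text = False

print(strict_ok, round_trip == raw, is_clean_text)
```

So the bytes survive the trip, but only because the str is being used as a container rather than as text in the Unicode sense.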