On 02/22/2015 09:41 PM, Ben Finney wrote:
Chris Angelico <ros...@gmail.com> writes:
In Python, unrecognized escape sequences are treated literally,
without (as far as I can tell) any sort of warning or anything.
Right. Text string literals are documented to work that way
<URL:https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str>,
which refers the reader to the language reference
<URL:https://docs.python.org/3/reference/lexical_analysis.html#strings>.
Why is it that Python interprets them this way, and doesn't even give
a warning?
Because the interpretation of those literals is unambiguous and correct.
Correct according to a misguided language definition.
It's unfortunate that MS Windows inherited the incompatible “backslash
is a path separator” convention, long after backslash was already
established in many programming languages as the escape character.
Windows "inherited" it from DOS. But since Windows was nothing but a
DOS shell for several years, that's not surprising. The historical
problem came from CP/M's use of the forward slash for a
switch-character. Since MSDOS/PCDOS/QDOS was trying to permit
transliterated CP/M programs, and because subdirectories were an
afterthought (version 2.0), they felt they needed to pick a different
character. At one time, the switch-character could be set by the user,
but most programs ignored that, so it died.
Is there a way to enable such warnings/errors?
A warning or error for a correctly formatted literal with an unambiguous
meaning would be an un-Pythonic thing to have.
I can see the motivation, but really the best solution is to learn that
the backslash is an escape character in Python text string literals.
This has the advantage that it's the same escape character used for text
string literals in virtually every other programming language, so you're
not needing to learn anything unusual.
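To make the "learn the escape character" advice concrete, here is a minimal sketch of the usual ways to spell a Windows path without tripping over escapes. The path itself is made up for illustration, and ntpath is used only so the example behaves the same on any platform.

```python
import ntpath  # Windows path rules, importable on any platform

# Doubling each backslash gives one literal backslash in the string.
doubled = "C:\\temp\\new"

# A raw string leaves backslashes alone, so \t and \n are not escapes here.
raw = r"C:\temp\new"

# Forward slashes need no escaping; normpath converts them to backslashes.
from_slashes = ntpath.normpath("C:/temp/new")

print(doubled == raw == from_slashes)
```

All three spellings produce the same 11-character string, so which one to use is mostly a matter of taste.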
I might be able to buy that argument if it were done the same way, but as
it says in:
https://docs.python.org/3/reference/lexical_analysis.html#strings
"""Unlike Standard C, all unrecognized escape sequences are left in the
string unchanged, i.e., the backslash is left in the result. (This
behavior is useful when debugging: if an escape sequence is mistyped,
the resulting output is more easily recognized as broken.)
"""
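A small sketch of the documented behavior, for anyone who wants to see it. The helper name parse_literal is my own invention, not anything from the standard library; it evaluates a literal given as source text, and it silences warnings on the assumption that some interpreter versions may complain about the unknown escape.

```python
import ast
import warnings

def parse_literal(source):
    """Evaluate one string literal given as source text (hypothetical helper)."""
    with warnings.catch_warnings():
        # Assumption: some interpreter versions warn about unknown escapes.
        # Ignore that here so only the resulting value is of interest.
        warnings.simplefilter("ignore")
        return ast.literal_eval(source)

# The raw strings below hold the *source text* of each literal.
unknown = parse_literal(r'"C:\data"')  # \d is not a recognized escape
known = parse_literal(r'"C:\new"')     # \n is the newline escape

print(repr(unknown))  # backslash and 'd' both survive
print(repr(known))    # the backslash-n pair became a single newline
```

The first literal keeps its backslash, the second silently turns into a string with an embedded newline, and neither one stops the program.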
The word "broken" is an admission that this was a flawed approach. If
it's broken, it should be an error.
I'm not suggesting that the implementation should falsely trigger an
error, but that the language definition should be changed to define it
as an error.
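For anyone who wants to check what their own interpreter does, here is a rough sketch that compiles a literal and collects whatever warnings come out. The helper escape_warnings is hypothetical. As far as I know, CPython 3.6 and later emit a DeprecationWarning for an unrecognized escape and 3.12 and later a SyntaxWarning, while older versions say nothing; treat that as my assumption and verify it on your version.

```python
import sys
import warnings

def escape_warnings(source):
    """Compile source text and return any warnings it raised (hypothetical helper)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including categories hidden by default.
        warnings.simplefilter("always")
        compile(source, "<example>", "eval")
    return [(w.category.__name__, str(w.message)) for w in caught]

print(sys.version_info[:2])
print(escape_warnings(r'"\n"'))  # recognized escape
print(escape_warnings(r'"\d"'))  # unrecognized escape; result depends on version
```

Where a warning is emitted, a warnings filter with the "error" action should turn it into a hard failure, which is about as close to the behavior I'm asking for as one can get without a change to the language definition.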
--
DaveA