On Fri, 9 Nov 2018 at 23:56, Chris Angelico <[email protected]> wrote:
>
> >>> list("\797")
> ['\x07', '9', '7']
>
> The octal escape grabs as many digits as it can, and when it finds a
> character in the literal that isn't a valid octal digit (same whether
> it's a '9' or a 'q'), it stops. The remaining characters have no
> special meaning; this does not become four hex digits. A "\xNN" escape
> in Python must be exactly two digits, no more and no less.

Yes, I had just figured this out before going to sleep, and was coming
back to say that, although strange, this is no reason to break things.
Thank you for the lengthy reply!

>
> On Sat, Nov 10, 2018 at 12:42 PM Joao S. O. Bueno <[email protected]>
> wrote:
> >
> > I just saw some document which reminded me of strings with a
> > backslash followed by 3 octal digits. When a backslash is followed by
> > 3 octal digits, that means a character with the corresponding
> > codepoint and all is well.
> >
> > The "valid scenario":
> >
> > In [42]: "\777"
> > Out[42]: 'ǿ'
> >
> > The problem is when you have just two valid octal digits
> >
> > In [40]: "\778"
> > Out[40]: '?8'
> >
> > Which is ambiguous at least -- why is this not "\x07" "77" for
> > example? (octal 77 actually corresponds to the "?" (63) character)
>
> Not ambiguous. It takes as many valid octal digits as it can.
>
> https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals
>
> \ooo ==> Character with octal value ooo
> Note 1: As in Standard C, up to three octal digits are accepted.
>
> "Up to" means that one or two digits can also define a character. For
> obvious reasons, it has to take digits greedily (otherwise "\777"
> would be "\x07" followed by "77"), and it's not an error to have fewer
> digits. Permitting a single digit means that "\0" means the NUL
> character, which is often convenient.
>
> > And then when the second digit is not valid octal:
> > In [43]: "\797"
> > Out[43]: '\x0797'
> > WAT?
> >
> > So, between the possibly ambiguous scenario with two octal digits
> > followed by a non-octal digit, and the completely unexpected expansion
> > to a 4-hexadecimal digit codepoint in the last case
>
> You may possibly be misinterpreting the last result. It's exactly the
> same as the previous ones.
>
> >>> list("\797")
> ['\x07', '9', '7']
>
> The octal escape grabs as many digits as it can, and when it finds a
> character in the literal that isn't a valid octal digit (same whether
> it's a '9' or a 'q'), it stops. The remaining characters have no
> special meaning; this does not become four hex digits. A "\xNN" escape
> in Python must be exactly two digits, no more and no less.
>
> > what do you say
> > of deprecating any r"\[0-9]{1,3}" sequence that doesn't match full 3
> > octal digits, and yield a syntax error for that from Python 3.9 (or
> > 3.10) on?
>
> Nope. Would break code for no good reason.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
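
For reference, the greedy rule discussed in this thread ("up to three
octal digits, stop at the first character that is not one") can be
sketched in a few lines. This is only an illustration, not how the
Python tokenizer is implemented: the helper name decode_octal_escapes is
made up for this example, it works on raw strings so nothing is
interpreted before the helper sees it, and it handles octal escapes only
(no "\x", "\n", or escaped backslashes).

```python
import re

# Hypothetical helper, for illustration only: a backslash followed by
# one to three octal digits. The {1,3} quantifier is greedy, so it
# consumes as many octal digits as are available, up to three.
_OCTAL_ESCAPE = re.compile(r"\\([0-7]{1,3})")


def decode_octal_escapes(raw: str) -> str:
    """Replace each octal escape in a raw string with its character."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), raw)


if __name__ == "__main__":
    # The cases from the thread, written as raw strings.
    for raw in (r"\777", r"\778", r"\797", r"\0"):
        decoded = decode_octal_escapes(raw)
        print(raw, "->", [ascii(ch) for ch in decoded])
```

With this sketch, r"\797" decodes to three characters (BEL, '9', '7'),
which is what list("\797") shows above, and r"\778" decodes to "?8".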
