On Mon, Feb 13, 2017 at 10:56 AM, Erik <pyt...@lucidity.plus.com> wrote:
> Actually, while contriving those examples, I noticed that sometimes when
> using string literal concatenation, the 'rawness' of the initial string is
> sometimes applied to the following string and sometimes not:
>
> >>> "hello \the" r"worl\d"
> 'hello \theworl\\d'
>
> Makes sense - the initial string is not raw, the concatenated string is.
>
> >>> r"hello \the" "worl\d"
> 'hello \\theworl\\d'
>
> Slightly surprising. The concatenated string adopts the initial string's
> 'rawness'.
>
> >>> "hello \the" r"worl\d" "\t"
> 'hello \theworl\\d\t'
>
> The initial string is not raw, the following string is. The string following
> _that_ becomes raw too.
>
> >>> r"hello \the" "worl\d" "\t"
> 'hello \\theworl\\d\t'
>
> The initial string is raw. The following string adopts that (same as the
> second example), but the _next_ string does not!
>
> >>> r"hello \the" "worl\d" r"\t"
> 'hello \\theworl\\d\\t'
>
> ... and this example is the same as before, but makes the third string "raw"
> again by explicitly declaring it as such.
>
> Presumably (I haven't checked), this also applies to u-strings and f-strings
> - is this a documented and known "wart"/edge-case or is it something that
> should be defined and fixed?

Firstly, be aware that there's no such thing as a "raw string" - what
you have is a "raw string literal". It's a purely syntactic feature.
The repr of a string indicates the contents of the string object, and
raw literals and non-raw literals can produce the same resulting
string objects.

The string "\t" gets shown in the repr as "\t". It is a string
consisting of one character, U+0009, a tab. The string r"\t" is shown
as "\\t" and consists of two characters, REVERSE SOLIDUS and LATIN
SMALL LETTER T. That might be why you think there's confusing stuff
happening :)

ChrisA
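
To make the point concrete, here is a small sketch that inspects the
characters each literal actually produces. The helper names() is
hypothetical, written only for this illustration; everything else is
plain standard-library Python. It avoids the invalid escape "\d" in
non-raw literals, since recent Python versions warn about those.

```python
import unicodedata

def names(s):
    """Return the Unicode name of each character (hypothetical helper).

    Control characters such as TAB have no name, so fall back to U+XXXX.
    """
    return [unicodedata.name(ch, f"U+{ord(ch):04X}") for ch in s]

tab = "\t"    # non-raw literal: the escape \t becomes a single tab
raw = r"\t"   # raw literal: the backslash and the 't' are both kept

print(len(tab), names(tab))   # 1 ['U+0009']
print(len(raw), names(raw))   # 2 ['REVERSE SOLIDUS', 'LATIN SMALL LETTER T']

# A raw and a non-raw literal can spell the same string object.
print(raw == "\\t")           # True

# Adjacent literals are each interpreted by their own prefix, then joined.
# The first piece is raw (backslash kept); the second is not (real tab).
mixed = r"hello \the" "\t"
print(repr(mixed))            # 'hello \\the\t'
print(len(mixed))             # 11
```

In the repr, a doubled backslash stands for one literal backslash
character, while \t stands for one tab, whichever kind of literal
produced it.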