On Mon, 02 Sep 2013 13:22:37 -0700, Ethan Furman wrote:

> In a raw string, the backslash is buggy (IMNSHO) when it's the last
> character. Given the above error, you might think that to get a
> single-quote in a string delimited by single-quotes that you would use
> r'\'', but no:
>
>    --> r'\''
>    "\\'"

You get exactly what you asked for. It's a raw string, right, so backslash
has no special powers, and "backslash C" should give you exactly backslash
followed by C, for any character C. Which is exactly what you do get. So
that's working correctly, as far as it goes.

> you get a backslash and a single-quote. And if you try to escape the
> backslash to get only one?
>
>    --> r'\\'
>    '\\\\'
>
> You get two. Grrrr.

Again, working as expected. Since backslash has no special powers, if you
enter a string with backslash backslash, you ought to get two backslashes.
Just as you do.

The *real* mystery is how the first example r'\'' succeeds in the first
place, and that gives you a clue as to why r'\' doesn't. The answer is
discussed in this bug report:

http://bugs.python.org/issue1271

Summarising, the parser understands backslash as an escape character, and
when it scans the string r'\'' the backslash escapes the inner quote, but
then when Python generates the string it skips the backslash escape
mechanism. Since the parser knows that backslash escapes, in r'\' it
treats the closing quote as escaped, never finds the end of the string,
and you get a SyntaxError. If you stick stuff at the end of the line, you
get the SyntaxError at another place:

py> s = r'\'[:]  # and more
  File "<stdin>", line 1
    s = r'\'[:]  # and more
                          ^
SyntaxError: EOL while scanning string literal

So the real bug is with the parser.

It is likely that nobody noticed this bug in the first place because the
current behaviour doesn't matter for regexes, which is the primary purpose
of raw strings. You can't end a regex with an unescaped backslash, so a
pattern such as abc\ is an illegal regex, and it doesn't matter that you
can't write it as r'abc\'.

-- 
Steven
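
P.S. For anyone who wants to check the behaviour discussed above without typing at the prompt, here is a minimal sketch. The helper name compiles() is made up for this example, and the last part shows two common workarounds for getting a trailing backslash, which are my addition rather than anything from the bug report.

```python
import re

# In a raw string the backslash is kept in the value, even though the
# scanner still uses it to skip over the following quote character.
s1 = r'\''
assert s1 == "\\'" and len(s1) == 2     # backslash, single-quote

s2 = r'\\'
assert s2 == '\\\\' and len(s2) == 2    # two backslashes


def compiles(source):
    """Hypothetical helper: True if `source` compiles as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# The source text r'\'' is accepted; the source text r'\' is not,
# because the scanner treats the closing quote as escaped.
assert compiles(r"r'\''")
assert not compiles("r'\\'")

# A regex cannot end with an unescaped backslash anyway.
try:
    re.compile('abc\\')                 # the pattern text is: abc\
    trailing_backslash_ok = True
except re.error:
    trailing_backslash_ok = False
assert not trailing_backslash_ok

# Workarounds (assumption: you really do need a trailing backslash):
# use a normal string, or let adjacent literals be concatenated.
one_backslash = '\\'
path_like = r'abc' '\\'
assert len(one_backslash) == 1 and path_like == 'abc\\'
```

Going through compile() rather than writing the bad literal directly keeps the file itself importable while still exercising the same scanner.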