On Oct 26, 2:47 pm, Dave Angel <d...@davea.name> wrote:
> On 10/26/2011 03:48 PM, Ross Boylan wrote:
> > I want to replace every \ and " (the two characters for backslash and
> > double quotes) with a \ and the same character, i.e.,
> > \ -> \\
> > " -> \"
> >
> > I have not been able to figure out how to do that. The documentation
> > for re.sub says "repl can be a string or a function; if it is a string,
> > any backslash escapes in it are processed. That is, \n is converted to a
> > single newline character, \r is converted to a carriage return, and so
> > forth. Unknown escapes such as \j are left alone."
> >
> > \\ is apparently unknown, and so is left as is. So I'm unable to get a
> > single \.
> >
> > Here are some tries in Python 2.5.2. The document suggested the result
> > of a function might not be subject to the same problem, but it seems to
> > be.
> > >>> def f(m):
> > ...     return "\\"+m.group(1)
> > ...
> > >>> re.sub(r"([\\\"])", f, 'Silly " quote')
> > 'Silly \\" quote'
> > <SNIP>
> > >>> re.sub(r"([\\\"])", "\\\\\\1", 'Silly " quote')
> > 'Silly \\" quote'
> >
> > Or perhaps I'm confused about what the displayed results mean. If a
> > string has a literal \, does it get shown as \\?
> >
> > I'd appreciate it if you cc me on the reply.
> >
> > Thanks.
> > Ross Boylan
>
> I can't really help on the regex aspect of your code, but I can tell you
> a little about backslashes, quote literals, the interpreter, and python.
>
> Now, one way to cheat on the string if you know you'll want to put
> actual backslashes is to use the raw string. That works quite well
> unless you want the string to end with a backslash. There isn't a way
> to enter that as a single raw literal. You'd have to do something
> string like
>     a = r"strange\literal\with\some\stuff" + "\\"
>
> My understanding is that no valid regex ends with a backslash, so this
> may not affect you.
>
> --
>
> DaveA

Dave's answer is excellent background. I've snipped everything except the
part I want to emphasize, which is to use raw strings. They were put into
Python specifically for your problem: that is, how to avoid the double and
triple backslashes while writing regexes.

John Roth
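
For what it's worth, here is a minimal sketch of the raw-string approach applied to the original question. It assumes Python 3 (the thread used 2.5.2, where print is a statement), and escape_specials is just a name made up for illustration. It also suggests an answer to the "does a literal \ get shown as \\?" question: the interactive prompt displays repr() of the string, which doubles each backslash, so the results in the original post appear to have been correct all along.

```python
import re

def escape_specials(text):
    """Prefix every backslash and double quote in text with a backslash.

    escape_specials is a hypothetical helper name, not from the thread.
    """
    # Pattern r'([\\"])': a character class matching one backslash or one
    # double quote, captured as group 1.
    # Replacement r'\\\1': an escaped backslash (\\) followed by a
    # back-reference to group 1 (\1).
    return re.sub(r'([\\"])', r'\\\1', text)

result = escape_specials('Silly " quote')

print(repr(result))  # 'Silly \\" quote'  (what the interactive prompt shows)
print(result)        # Silly \" quote     (the actual characters)
print(len(result))   # 14: one character longer than the 13-character input

# A raw literal cannot end in a backslash, so Dave's workaround is to
# append an ordinary (non-raw) string containing one escaped backslash.
a = r"strange\literal\with\some\stuff" + "\\"
print(a)             # strange\literal\with\some\stuff\
```

Using print (or checking len) rather than relying on the echoed value at the prompt is the quickest way to see how many backslashes a string really contains.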