On Mar 22, 12:07 pm, Benjamin Kaplan <benjamin.kap...@case.edu> wrote:
> On Tue, Mar 22, 2011 at 2:40 PM, John Harrington
> <beartiger....@gmail.com> wrote:
> > On Mar 22, 11:16 am, John Bokma <j...@castleamber.com> wrote:
> >> John Harrington <beartiger....@gmail.com> writes:
> >> > I'm trying to use the following substitution,
> >>
> >> > lineList[i]=re.sub(r'(\\begin{document})([^$])',r'\1\n\n\2',lineList[i])
> >>
> >> > I intend this to match any string "\begin{document}" that doesn't end
> >> > in a line ending. If there's no line ending, then, I want to place
> >> > two carriage returns between the string and the non-line end
> >> > character.
> >>
> >> > However, this places carriage returns even when the string is followed
> >> > directly by a line ending. Can someone explain to me why this
> >> > match is not behaving as I intend it to, especially the ([^$])?
> >>
> >> [^$] matches: not a $ character
> >>
> >> You might want [^\n]
> >
> > Thank you, John.
> >
> > I thought that when you use "r" before the regex, $ matches an end of
> > line. But, in any case, if I use "[^\n]" as you suggest I get the
> > same result.
>
> r before a string has nothing to do with regexes. It signals a raw
> string: escape sequences won't be processed.
>
> >>> print 'a\tb'
> a       b
> >>> print r'a\tb'
> a\tb
>
> We use raw strings for regexes because otherwise, you'd have to
> remember to double up all your backslashes. And double up your doubled up
> backslashes when you really want a backslash.
>
> > Here's a script that illustrates the problem. Any help would be
> > appreciated!:
> >
> > #BEGIN SCRIPT
> > import re
> >
> > outlist = []
> > myfile = "raw.tex"
> >
> > fin = open(myfile, "r")
> > lineList = fin.readlines()
> > fin.close()
> >
> > for i in range(0,len(lineList)):
> >
> >     lineList[i]=re.sub(r'(\\begin{document})([^\n])',r'\1\n\n\2',lineList[i])
> >
> >     outlist.append(lineList[i])
> >
> > fou = open(myfile, "w")
> > for i in range(len(outlist)):
> >     fou.write(outlist[i])
> > fou.close()
> > #END SCRIPT
> >
> > And the file raw.tex:
> >
> > %BEGIN TeX FILE
> > \begin{document}
> > This line should remain right after the above line in the output, but
> > doesn't
> >
> > \begin{document}Extra stuff here should appear below the begin line
> > and does in the output.
> > %END TeX FILE
>
> Works for me. Do you have a space after the \begin{document} or
> something? Because that would get moved. You might want to check for
> non-whitespace characters in the regex instead of just non-newlines.
>
> > --
> > http://mail.python.org/mailman/listinfo/python-list

Matching the non-whitespace works, but I'm troubled I can't match a
non-end-of-line. No, there was no space after the string.

Thank you for your help, Ben
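
For anyone following along, here is a minimal sketch of how the three character classes discussed in this thread behave. The helper name `insert_break` and the sample strings are made up for illustration; they are not from the script above. The "\r" case is only a guess at what could be hiding in raw.tex (for example, a file with Windows line endings read on a Unix machine). Nothing in the thread confirms that this is the cause.

```python
import re

# Pattern for the literal text \begin{document}; braces escaped for clarity.
BEGIN = r'\\begin\{document\}'


def insert_break(line, tail_class):
    """Put a blank line after the begin-document marker whenever the
    character that follows it matches tail_class (a regex fragment).

    Hypothetical helper, written only to compare the character classes.
    """
    pattern = '(' + BEGIN + ')(' + tail_class + ')'
    return re.sub(pattern, r'\1\n\n\2', line)


# Sample lines (assumptions, not taken from the real raw.tex).
samples = [
    '\\begin{document}\n',        # marker followed directly by a newline
    '\\begin{document}\r\n',      # marker followed by a carriage return
    '\\begin{document}Extra\n',   # marker followed by ordinary text
]

# [^$]  : inside brackets $ is a literal dollar sign, so this matches "\n" too.
# [^\n] : anything except a newline, which still includes "\r" and spaces.
# \S    : any non-whitespace character.
for tail_class in (r'[^$]', r'[^\n]', r'\S'):
    for line in samples:
        print(tail_class, repr(line), '->', repr(insert_break(line, tail_class)))
```

With these samples, `[^$]` rewrites all three lines, `[^\n]` leaves the first alone but still rewrites the "\r\n" one, and `\S` rewrites only the line that has real text after the marker. That would be consistent with `\S` working where `[^\n]` did not if some invisible non-newline character follows the marker, but that is an assumption about the file, not a diagnosis.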