Matt wrote:
> I am attempting to reformat a string, inserting newlines before certain
> phrases. For example, in formatting SQL, I want to start a new line at
> each JOIN condition. Noting that strings are immutable, I thought it
> best to split the string at the key points, then join with '\n'.
>
> Regexps seem the best way to identify the points in the string
> ('LEFT.*JOIN' to cover 'LEFT OUTER JOIN' and 'LEFT JOIN'), since I need
> to identify multiple locations in the string. However, the re.split
> method returns the list without the split phrases, and re.findall does
> not seem useful for this operation.
>
> Suggestions?

Matt,

You may want to try this solution:

>>> import SE
>>> Formatter = SE.SE (' "~(?i)(left|inner|right|outer).*join~=\n=" ')   # Details explained below the dotted line
>>> print Formatter ('select id, people.* from ids left outer join people where ...\nSELECT name, job from people INNER JOIN jobs WHERE ...;\n')
select id, people.* from ids
left outer join people where ...
SELECT name, job from people
INNER JOIN jobs WHERE ...;

You may add other substitutions as required, one by one, interactively tweaking each one until it does what it is supposed to do:

>>> Formatter = SE.SE ('''
    "~(?i)(left|inner|right|outer).*join~=\n ="   # Add an indentation
    "where=\n where" "WHERE=\n WHERE"             # Add a newline also before 'where'
    ";\n=;\n\n"                                   # Add an extra line feed
    "\n=;\n\n"                                    # And add any missing semicolon
    # etc.
''')
>>> print Formatter ('select id, people.* from ids left outer join people where ...\nSELECT name, job from people INNER JOIN jobs WHERE ...;\n')
select id, people.* from ids
 left outer join people
 where ...;

SELECT name, job from people
 INNER JOIN jobs
 WHERE ...;

http://cheeseshop.python.org/pypi?:action=display&name=SE&version=2.3

Frederic

----------------------------------------------------------------------------------------------------------------------

The anatomy of a replacement definition

>>> Formatter = SE.SE (' "~(?i)(left|inner|right|outer).*join~=\n=" ')
                                                        target=substitute   (first '=')

>>> Formatter = SE.SE (' "~(?i)(left|inner|right|outer).*join~=\n=" ')
                                                                 =   (each following '=' stands for matched target)

>>> Formatter = SE.SE (' "~(?i)(left|inner|right|outer).*join~=\n=" ')
                          ~                                  ~   (contain regular expression)

>>> Formatter = SE.SE (' "~(?i)(left|inner|right|outer).*join~=\n=" ')
                         "                                        "   (contain definition containing white space)
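
P.S. For anyone who would rather stay within the standard library, the same idea can be sketched with the re module alone. This is only a rough equivalent, not how SE works internally. The pattern (my own guess at the JOIN spellings that matter) and the names break_before_joins and split_keeping_joins are made up for illustration. Note that re.split does return the split phrases when the pattern contains a capturing group, which may take care of the objection in the original question.

```python
import re

# Assumed list of JOIN spellings; extend it to suit your SQL dialect.
# Only the outer parentheses capture; the inner groups are non-capturing.
JOIN = re.compile(
    r"\s*\b((?:(?:left|right|full)\s+(?:outer\s+)?|inner\s+)?join)\b",
    re.IGNORECASE,
)


def break_before_joins(sql, indent="  "):
    """Return sql with a newline and indent inserted before each JOIN phrase."""
    # Group 1 is the matched phrase itself, so nothing is lost;
    # the leading \s* swallows the whitespace that preceded it.
    return JOIN.sub(lambda m: "\n" + indent + m.group(1), sql)


def split_keeping_joins(sql):
    """Split sql at each JOIN phrase, keeping the phrase in the result."""
    # re.split returns the text of capturing groups along with the pieces.
    return JOIN.split(sql)


query = "select id, people.* from ids left outer join people where ..."
print(break_before_joins(query))
print(split_keeping_joins(query))
```

The pattern spells out the JOIN variants instead of using '.*' because a greedy '.*' runs to the last 'join' on the line, which would swallow everything between the first and last JOIN when a line holds several of them.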