Peter Otten wrote:
> James Stroud wrote:
>
>> James Stroud wrote:
>>> John Pye wrote:
>>>> Hi all
>>>>
>>>> I have a file with a bunch of perl regular expressions like so:
>>>>
>>>> /(^|[\s\(])\*([^ ].*?[^ ])\*([\s\)\.\,\:\;\!\?]|$)/$1'''$2'''$3/ # bold
>>>> /(^|[\s\(])\_\_([^ ].*?[^ ])\_\_([\s\)\.\,\:\;\!\?]|$)/$1''<b>$2<\/b>''$3/ # italic bold
>>>> /(^|[\s\(])\_([^ ].*?[^ ])\_([\s\)\.\,\:\;\!\?]|$)/$1''$2''$3/ # italic
>>>>
>>>> These are all find/replace expressions delimited as
>>>> '/search/replace/ # comment' where 'search' is the regular expression
>>>> we're searching for and 'replace' is the replacement expression.
>>>>
>>>> Is there an easy and general way that I can split these perl-style
>>>> find-and-replace expressions into something I can use with Python, eg
>>>> re.sub('search','replace',str) ?
>>>>
>>>> I thought generally it would be good enough to split on '/' but as you
>>>> see the <\/b> messes that up. I really don't want to learn perl
>>>> here :-)
>>>>
>>>> Cheers
>>>> JP
>>>>
>>> This could be more general: in principle a perl regex could end with a
>>> "\", e.g. "\\/", but I'm guessing that won't happen here.
>>>
>>> py> for p in perlish:
>>> ...   print(p)
>>> ...
>>> /(^|[\s\(])\*([^ ].*?[^ ])\*([\s\)\.\,\:\;\!\?]|$)/$1'''$2'''$3/
>>> /(^|[\s\(])\_\_([^ ].*?[^ ])\_\_([\s\)\.\,\:\;\!\?]|$)/$1''<b>$2<\/b>''$3/
>>> /(^|[\s\(])\_([^ ].*?[^ ])\_([\s\)\.\,\:\;\!\?]|$)/$1''$2''$3/
>>> py> import re
>>> py> splitter = re.compile(r'[^\\]/')
>>> py> for p in perlish:
>>> ...   print(splitter.split(p))
>>> ...
>>> ['/(^|[\\s\\(])\\*([^ ].*?[^ ])\\*([\\s\\)\\.\\,\\:\\;\\!\\?]|$', "$1'''$2'''$", '']
>>> ['/(^|[\\s\\(])\\_\\_([^ ].*?[^ ])\\_\\_([\\s\\)\\.\\,\\:\\;\\!\\?]|$', "$1''<b>$2<\\/b>''$", '']
>>> ['/(^|[\\s\\(])\\_([^ ].*?[^ ])\\_([\\s\\)\\.\\,\\:\\;\\!\\?]|$', "$1''$2''$", '']
>>>
>>> (I'm hoping this doesn't wrap!)
>>>
>>> James
>>
>> I realized that threw away the closing parentheses. This is the correct
>> version:
>>
>> py> splitter = re.compile(r'(?<!\\)/')
>> py> for p in perlish:
>> ...   print(splitter.split(p))
>> ...
>> ['', '(^|[\\s\\(])\\*([^ ].*?[^ ])\\*([\\s\\)\\.\\,\\:\\;\\!\\?]|$)', "$1'''$2'''$3", '']
>> ['', '(^|[\\s\\(])\\_\\_([^ ].*?[^ ])\\_\\_([\\s\\)\\.\\,\\:\\;\\!\\?]|$)', "$1''<b>$2<\\/b>''$3", '']
>> ['', '(^|[\\s\\(])\\_([^ ].*?[^ ])\\_([\\s\\)\\.\\,\\:\\;\\!\\?]|$)', "$1''$2''$3", '']
>
> There is another problem with escaped backslashes:
>
> >>> re.compile(r'(?<!\\)/').split(r"/abc\\/def/")
> ['', 'abc\\\\/def', '']
>
> Peter

Yes, this would be a case of the expression (left side) ending with a "\" as I mentioned above.

James
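
One possible way around the escaped-backslash case Peter points out is to stop using a lookbehind and instead consume escape pairs (a backslash plus whatever follows it) as units, so that "\\" followed by "/" is read as an escaped backslash and then a real delimiter. The following is only a sketch under some assumptions: the names split_perl_sub and perl_replacement_to_python are made up for illustration, each line is assumed to have exactly the '/search/replace/ # comment' shape from the original question, and the replacement conversion handles only $1-style group references and \/ (nothing else Perl allows in a replacement).

```python
import re

# A field is a run of escape pairs (backslash + any character) or of
# characters that are neither a backslash nor the "/" delimiter.
FIELD = r'((?:\\.|[^\\/])*)'
PERL_SUB = re.compile(r'\s*/' + FIELD + '/' + FIELD + r'/\s*(?:#.*)?$')


def split_perl_sub(line):
    # Hypothetical helper: returns (search, replace) for one
    # '/search/replace/ # comment' line, or raises ValueError.
    m = PERL_SUB.match(line)
    if m is None:
        raise ValueError('not a /search/replace/ expression: %r' % line)
    return m.group(1), m.group(2)


def perl_replacement_to_python(repl):
    # Hypothetical, simplified conversion (an assumption, not full Perl):
    # "$1" becomes "\g<1>", "\/" becomes "/", other escapes are kept as-is.
    def convert(m):
        if m.group(2) is not None:
            return r'\g<%s>' % m.group(2)
        return '/' if m.group(1) == '/' else m.group(0)
    return re.sub(r'\\(.)|\$(\d+)', convert, repl)


line = (r"/(^|[\s\(])\_\_([^ ].*?[^ ])\_\_([\s\)\.\,\:\;\!\?]|$)"
        r"/$1''<b>$2<\/b>''$3/ # italic bold")
search, replace = split_perl_sub(line)
print(re.sub(search, perl_replacement_to_python(replace), "say __hi there__."))
# prints: say ''<b>hi there</b>''.

print(split_perl_sub(r"/abc\\/def/"))
# prints: ('abc\\\\', 'def')
```

The search half is passed to re.sub unchanged here, which happens to work for the three expressions in this thread; Perl-only regex syntax would still need translating by hand.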