On Sep 29, 8:32 pm, [EMAIL PROTECTED] wrote: > It think he's saying it should look like this: > > # File: masseditor.py > > import re > import os > import time > > p1= re.compile('(href=|HREF=)+(.*)(#)+(.*)(\w\'\?-<:)+(.*)(">)+') > p2= re.compile('(name=")+(.*)(\w\'\?-<:)+(.*)(">)+') > p100= re.compile('(a name=)+(.*)(-)+(.*)(></a>)+') > q1= r"\1\2\3\4_\6\7" > q2= r"\1\2_\4\5" > > def massreplace(): > editfile = open("C:\Program Files\Credit Risk Management\Masseditor > \editfile.txt") > filestring = editfile.read() > filelist = filestring.splitlines() > > for i in range(len(filelist)): > source = open(filelist[i]) > starttext = source.read() > > for i in range (13): > interimtext = p1.sub(q1, starttext) > interimtext= p2.sub(q2, interimtext) > interimtext= p100.sub(q2, interimtext) > source.close() > source = open(filelist[i],"w") > source.write(finaltext) > source.close() > > massreplace() > > I'll try that and see how it works...
Ok, if you want a single RE... How about: test = ''' <a href="Web_Sites.htm#A Web Sites"> <a name="A Web Sites"></a> <a href="Web_Sites.htm#A Web Sites"> <a name="A Web Sites"></a> <a HREF="Web_Sites.htm#A Web Sites"> <a name=Quoteless></a> <a name = "oo ps"></a> ''' import re r = re.compile(r''' (?:href=['"][^#]+[#]([^"']+)["']) | (?:name=['"]?([^'">]+)) ''', re.IGNORECASE | re.MULTILINE | re.DOTALL | re.VERBOSE) def zap_space(m): return m.group(0).replace(' ', '_') print r.sub(zap_space, test) It prints out <a href="Web_Sites.htm#A_Web_Sites"> <a name="A_Web_Sites"></a> <a href="Web_Sites.htm#A_Web_Sites"> <a name="A_Web_Sites"></a> <a HREF="Web_Sites.htm#A_Web_____________________________Sites"> <a name=Quoteless></a> <a name = "oo ps"></a> -- bjorn -- http://mail.python.org/mailman/listinfo/python-list