Hi all, I am writing a script to display (and print) the web references hidden in html files in the form: '<a href="web reference">underlined reference</a>'. While optimizing my code, I found that an essential step is splitting on a word (in this case 'href').
I am asking if there is some alternative (more pythonic...):

# SplitMultichar.py
import re

# string s simulating an html file
s='ffy: ytrty <a href="www.python.org">python</a> fyt <A HREF="wwwx">wx</A> dtrtf'
p=re.compile(r'\bhref\b',re.I)
lHref=p.findall(s)   # lHref=['href','HREF']
# for normal html files the lHref list has more elements
# (more web references)

c='~'          # char to be used as delimiter
# c=chr(127)   # char to be used as delimiter
for i in lHref:
    s=s.replace(i,c)
# s ='ffy: ytrty <a ~="www.python.org">python</a> fyt <A ~="wwwx">wx</A> dtrtf'

list=s.split(c)
# list=['ffy: ytrty <a ', '="www.python.org">python</a> fyt <A ', '="wwwx">wx</A> dtrtf']
#=-----------------------------------------------------

If you save the original s string to xxx.html, any browser can display it.
To be safe, chr(127) can be used as the delimiter (the commented-out line above), since it should not be present in the html file.

Bye.
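For comparison, here is a minimal sketch of one possible alternative, not tested on real html files: a compiled pattern has its own split() method, so the findall/replace/split chain and the delimiter character may not be needed at all. The names parts and urls are only illustrative, and the second pattern assumes that every href value is enclosed in double quotes, which real pages do not guarantee.

```python
import re

# Same sample string as in the script above, simulating an html file
s = 'ffy: ytrty <a href="www.python.org">python</a> fyt <A HREF="wwwx">wx</A> dtrtf'

# Compile once; re.I makes the match case-insensitive
p = re.compile(r'\bhref\b', re.I)

# Split directly on the word, with no intermediate delimiter character
parts = p.split(s)
# parts == ['ffy: ytrty <a ', '="www.python.org">python</a> fyt <A ', '="wwwx">wx</A> dtrtf']

# If only the references themselves are wanted, a capturing group may be enough
# (assumption: href values are always double-quoted)
urls = re.findall(r'\bhref="([^"]*)"', s, re.I)
# urls == ['www.python.org', 'wwwx']
```

For arbitrary html found in the wild, a real parser such as the standard library's html.parser is probably more robust than any regular expression.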