On Apr 29, 5:31 am, Cameron Simpson <c...@zip.com.au> wrote:
> On 28Apr2010 22:03, Daniel Fetchinson <fetchin...@googlemail.com> wrote:
> | > Any idea how I can replace words in a html file? Meaning only the
> | > content will get replace while the html tags, javascript, & css are
> | > remain untouch.
> |
> | I'm not sure what you tried and what you haven't but as a first trial
> | you might want to
> |
> | <untested>
> |
> | f = open( 'new.html', 'w' )
> | f.write( open( 'index.html' ).read( ).replace( 'replace-this', 'with-that' ) )
> | f.close( )
> |
> | </untested>
>
> If 'replace-this' occurs inside the javascript etc or happens to be an
> HTML tag name, it will get mangled. The OP didn't want that.
>
> The only way to get this right is to parse the file, then walk the doc
> tree editing only the text parts.
>
> The BeautifulSoup module (3rd party, but a single .py file and trivial to
> fetch and use, though it has some dependencies) does a good job of this,
> coping even with typical not quite right HTML. It gives you a parse
> tree you can easily walk, and you can modify it in place and write it
> straight back out.
>
> Cheers,
> --
> Cameron Simpson <c...@zip.com.au> DoD#743
> http://www.cskk.ezoshosting.com/cs/
>
> The Web site you seek
> cannot be located but
> endless others exist
> - Haiku Error Messages
> http://www.salonmagazine.com/21st/chal/1998/02/10chal2.html

Hi all,

Thanks for all your input. Cameron Simpson has the right idea of what I am trying to do. I've been looking at BeautifulSoup, but so far I don't know how to perform a search and replace within it. Can anyone suggest a good read?

Thanks all,
James
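
For what it's worth, the "parse the file, then edit only the text parts" idea from the quoted reply can be sketched roughly as below. Note that this is not BeautifulSoup: it swaps in the standard library's html.parser so it runs without installing anything, and it assumes Python 3. The names TextReplacer and replace_text are made up for illustration. Tags, attributes, comments, and the contents of script and style blocks are written back out untouched; only ordinary text is passed through str.replace.

```python
from html.parser import HTMLParser


class TextReplacer(HTMLParser):
    """Re-emit an HTML document, replacing a word only in visible text.

    Hypothetical helper for illustration; not part of any library.
    """

    # Elements whose contents are code, not page text.
    SKIP = {"script", "style"}

    def __init__(self, old, new):
        # convert_charrefs=False keeps entities such as &amp; as separate
        # events, so they can be written back exactly as they appeared.
        super().__init__(convert_charrefs=False)
        self.old = old
        self.new = new
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        # Emit the raw tag text so attribute values are never modified.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through verbatim.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Only text outside <script>/<style> gets the replacement.
        if not self.skip_depth:
            data = data.replace(self.old, self.new)
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

    def handle_pi(self, data):
        self.out.append("<?%s>" % data)


def replace_text(html, old, new):
    """Return html with old replaced by new in text content only."""
    parser = TextReplacer(old, new)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling replace_text(open('index.html').read(), 'replace-this', 'with-that') and writing the result to a new file would be the equivalent of the one-liner quoted above, minus the mangling. Two caveats: a phrase that straddles an entity or an inline tag will not match, because each text chunk is handled on its own, and end tags are re-emitted in lowercase. With BeautifulSoup the shape is the same (walk the text nodes of the parse tree and replace each one in place), with the added benefit that it copes better with badly formed HTML.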