On Aug 1, 4:08 pm, Paul McGuire <[EMAIL PROTECTED]> wrote:
> On Aug 1, 1:31 pm, [EMAIL PROTECTED] wrote:
> <snip>
>
> > I'm thinking maybe somehow have HTMLParser append each character it
> > reads except for data inside tags in some kind of buffer? This way I
> > can have the HTML contents read into a buffer, then when I do my own
> > handle_ overrides, I can also append to that buffer with the
> > transformed data. Once the HTML page is finished parsing, ideally I
> > would be able to print the contents of the buffer and the HTML would
> > be identical except for the string transformations.
> >
> > I also need to make sure that all newlines, tags, spacing, etc are
> > kept intact -- this part is a requirement for other reasons.
> >
> > Thanks!
>
> What you describe is almost exactly how pyparsing implements
> transformString. See below:
>
> from pyparsing import *
>
> boldStart,boldEnd = makeHTMLTags("B")
>
> # convert <B> to <div class="emphatic"> and </B> to </div>
> boldStart.setParseAction(replaceWith('<div class="emphatic">'))
> boldEnd.setParseAction(replaceWith('</div>'))
> converter = boldStart | boldEnd
>
> html = "Display this in <b>bold</b>"
> print converter.transformString(html)
>
> Prints:
>
> Display this in <div class="emphatic">bold</div>
>
> All text not matched by a pattern in the converter is left as-is. (My
> CSS style/form may not be up to date, but I hope you get the idea.)
>
> -- Paul

Hello,

Sorry for the delay in replying, and thank you for the info. However, I think either I am misunderstanding your post or it's not the solution I'm looking for. How does this fit into what I'm looking to do with HTMLParser?

Thanks!

-- http://mail.python.org/mailman/listinfo/python-list
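
For the HTMLParser side of the question, the buffer idea from the quoted post might look roughly like the sketch below. It is only a sketch under a few assumptions: it targets Python 3's html.parser module (the thread itself uses Python 2, where the module is named HTMLParser), and the names BufferingTransformer and transform_html are made up for illustration. Start tags are copied back verbatim via get_starttag_text(), but end tags, comments, and declarations are rebuilt from the parsed pieces, so unusual casing or spacing inside them may not survive exactly. Text inside script and style elements also goes through the transform.

```python
from html.parser import HTMLParser


class BufferingTransformer(HTMLParser):
    """Hypothetical helper: re-emit parsed markup, transforming only text."""

    def __init__(self, transform):
        # convert_charrefs=False reports entities such as &amp; as separate
        # events, so they can be written back to the buffer unchanged.
        super().__init__(convert_charrefs=False)
        self.transform = transform
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        # Raw text of the start tag exactly as it appeared in the input.
        self.buffer.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>; also copied verbatim.
        self.buffer.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # Rebuilt from the (lowercased) tag name, not copied from the input.
        self.buffer.append("</%s>" % tag)

    def handle_data(self, data):
        # The only place where the text is changed.
        self.buffer.append(self.transform(data))

    def handle_entityref(self, name):
        self.buffer.append("&%s;" % name)

    def handle_charref(self, name):
        self.buffer.append("&#%s;" % name)

    def handle_comment(self, data):
        self.buffer.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.buffer.append("<!%s>" % decl)

    def handle_pi(self, data):
        self.buffer.append("<?%s>" % data)

    def result(self):
        return "".join(self.buffer)


def transform_html(html, transform):
    """Parse html and return it with transform applied to text nodes only."""
    parser = BufferingTransformer(transform)
    parser.feed(html)
    parser.close()
    return parser.result()


html = "Display this in <b>bold</b>\n"
print(transform_html(html, str.upper), end="")
# DISPLAY THIS IN <b>BOLD</b>
```

The difference from transformString is mostly where the pass-through happens: pyparsing leaves unmatched text alone by default, whereas here every handle_ method has to append its own piece to the buffer, and anything without an override is dropped.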