On Tue, Jan 27, 2009 at 9:02 PM, Caius Durling <[email protected]> wrote:
>
> On 27 Jan 2009, at 17:04, Adam Holt wrote:
>
> > I'm having some trouble figuring out how to do some regular expression
> > foo and thought maybe someone on here could help,
> >
> > Basically i am trying to parse out special tags from html returned
> > from TinyMCE, and replace them with content for doing html to pdf
> > generation.
> >
> > I've written a little gist page demonstrating what i need it to do:
> > http://gist.github.com/53383
>
> I'd probably not try and do this all in one regex. I'd run through line by
> line, if it matches the start line, save everything until it matches the
> ending line. Then you have your entire string.
>
> C

I know this might be unhelpful, but I say it every time anyone asks this
question on a mailing list: regular expressions are almost certainly the
wrong approach to take :)

Is it possible to solve your problem using XSL? It is far, far better
suited to removing or modifying elements in SGML-like documents.

If the HTML isn't XHTML etc. then running it through some sort of tidier
first (I use TagSoup in Java; I'm not sure what is available in Ruby off
the top of my head, but there will be something!) *will* make your life
considerably easier, honest :)

- cj.

You received this message because you are subscribed to the Google Groups
"NWRUG" group.
For more options, visit this group at
http://groups.google.com/group/nwrug-members?hl=en
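For what it's worth, the line-by-line approach Caius describes could be sketched in Ruby roughly like this. The marker patterns below are hypothetical placeholders (the real tag format lives in the gist and isn't reproduced here), `extract_blocks` is just an illustrative name, and the sketch assumes each marker sits on a line of its own.

```ruby
# Hypothetical start/end markers; swap in whatever the real tags look like.
START_MARKER = /<!--\s*pdf:start\s*-->/
END_MARKER   = /<!--\s*pdf:end\s*-->/

# Walk the input line by line and collect the text between each
# start/end marker pair. Returns an array of captured blocks (strings).
# A block that is opened but never closed is silently dropped.
def extract_blocks(html)
  blocks  = []
  current = nil # nil means "not currently inside a block"

  html.each_line do |line|
    if current.nil?
      # Outside a block: only a start marker changes state.
      current = [] if line =~ START_MARKER
    elsif line =~ END_MARKER
      # End marker: the block is complete, so store it and reset.
      blocks << current.join
      current = nil
    else
      # Inside a block: keep the line as-is.
      current << line
    end
  end

  blocks
end

sample = <<~HTML
  <p>Intro</p>
  <!-- pdf:start -->
  <p>First</p>
  <p>Second</p>
  <!-- pdf:end -->
  <p>Outro</p>
HTML

p extract_blocks(sample)
```

Once each block is in hand as a plain string, replacing it with the generated content is an ordinary string substitution, with no multi-line regex needed. This only handles the "markers on their own lines" case; anything messier is where the tidy-then-parse route suggested above starts to pay off.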
