Peter TB Brett wrote:
> Unfortunately XHTML and HTML are not regular languages, which means that
> they cannot be processed correctly with regular expressions.
>
> Indeed, "Implement an HTML parser using regular expressions" is a
> well-known prank project to suggest for inexperienced developers to
> waste their time on...
>
> So, your approach is sadly not workable.
>
> If you're processing XHTML, I recommend using revXML.
>
> If you need to process arbitrary HTML, then unfortunately the only
> sensible option is to use a browser...

Bummer. Not only are XHTML and HTML not regular languages, but their use in ePubs is even more irregular (if that is possible). I have some texts which include both forms, and others where every tag ('h', 'p', etc.) has an id attribute.

A browser is not an option, as I will need to use LiveCode's chunking and text selection features. I am using the htmlText of a field, and given that the htmlText function ignores most of what I was trying to remove, it probably doesn't matter in the end. Just a bit untidy.

Thanks Peter.
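
For anyone following the thread, Peter's point about regular expressions can be illustrated with a small sketch. It is written in Python rather than LiveCode, purely for illustration, and it assumes the goal is to drop id attributes from markup. The names `IdStripper`, `strip_ids` and the sample string are made up for this example; only the standard-library `html.parser`, `html.escape` and `re.sub` calls are real. It compares a naive pattern with a tokenizing parser on the same input.

```python
import html
import re
from html.parser import HTMLParser


class IdStripper(HTMLParser):
    """Re-emit tags and text, dropping id attributes (hypothetical helper).

    Comments, doctypes and processing instructions are not re-emitted,
    so this is a sketch, not a lossless round trip.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _tag(self, tag, attrs, end=">"):
        # Keep every attribute except id; re-escape values on the way out.
        kept = "".join(
            f" {k}" if v is None else f' {k}="{html.escape(v, quote=True)}"'
            for k, v in attrs
            if k != "id"
        )
        self.out.append(f"<{tag}{kept}{end}")

    def handle_starttag(self, tag, attrs):
        self._tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._tag(tag, attrs, end=" />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text content is passed through; only <, > and & are re-escaped.
        self.out.append(html.escape(data, quote=False))


def strip_ids(markup):
    parser = IdStripper()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


# Made-up sample: one double-quoted id, one single-quoted id, and text
# that merely looks like an id attribute.
sample = '<h2 id="c1">One</h2><p id=\'p1\' class="x">Use id="top" here</p>'

# The naive pattern misses the single-quoted id and also edits the text.
naive = re.sub(r'\s+id="[^"]*"', "", sample)

# The parser knows which id= sits inside a tag and which is plain text.
clean = strip_ids(sample)

print(naive)
print(clean)
```

The pattern could be patched for single quotes, but each patch exposes another case (unquoted values, `>` inside attribute values, comments), which is the practical side of HTML not being a regular language.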