On Thu, Jul 25, 2013 at 9:53 PM, Geoff Canyon <gcan...@gmail.com> wrote:
> regex is notoriously unable to handle recursion. To see endless heated
> debate, search the web for how to parse HTML using regex.

Hi Geoff,

You piqued my interest, and indeed there are endless debates about parsing HTML! I also came across the (?R) item, which allows a regex to recurse. There's an example of using it at http://php.net/manual/en/regexp.reference.recursive.php that is specifically aimed at capturing the text between nested parens. That's not quite what the original problem was, but you just need to subtract 1 from each start char and add 1 to each end char to get the char positions of the parens. It still doesn't handle Mark's earlier example of two completely separate sets of parens, but I'll bet someone with more regex skills than I have could modify it to do so.

However, it may still not be usable within LC, because matchChunk requires you to know in advance how many capture groups will be found and to specify the requisite number of start/end variable pairs, which has always seemed strange to me. I wonder if it might ever be changed to return an array with one numeric key for each match, containing the comma-separated start and end chars, or even a line-delimited list of start/end positions.

Pete
lcSQL Software <http://www.lcsql.com>
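
For illustration only, here is a rough sketch of the kind of line-delimited start/end list wished for above. It is written in Python rather than LiveCode, and it uses a plain stack scan instead of regex recursion, so it is not the (?R) approach from the PHP manual. The function name paren_spans and the sample string are made up for this example. Positions are 1-based, to match how LC counts chars.

```python
def paren_spans(text):
    """Return the 1-based (start, end) positions of every matched pair of parens."""
    stack = []  # positions of "(" characters still waiting for their ")"
    spans = []
    for pos, ch in enumerate(text, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")" and stack:
            # Pair this ")" with the most recent unmatched "(".
            spans.append((stack.pop(), pos))
        # A ")" with nothing open is simply ignored.
    return sorted(spans)


# Hypothetical sample with one nested pair and one completely separate pair.
sample = "a (b (c) d) e (f)"
for start, end in paren_spans(sample):
    print(f"{start},{end}")
# prints:
# 3,11
# 6,8
# 15,17
```

Because the scan keeps its own stack, it handles both nested parens and completely separate sets, and the caller does not need to know in advance how many pairs will be found.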