On Jun 11, 2011, at 8:45 AM, John Patten wrote:
Good thing I read this explanation of regex Saturday morning ;-) ... I think? Seems like matchtext/regex would be useful for HTML scraping strategies too.
John Patten - SUSD


How you scrape HTML depends on your goals and the complexity of the pages.

I use matchChunk( txt,(?U), char1, char2) in a repeat loop to parse nested tables and extract the header-row (column-row) data cells. The (?U) flag makes the match ungreedy, so each pass stops at the nearest closing delimiter.
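
A rough sketch of that kind of loop, written in Python rather than LiveCode purely as an illustration. Python's re module has no (?U) flag, so the lazy quantifier .*? stands in for it here. The function name chunks_between and the sample row are made up for the example, and the positions are 0-based, unlike LiveCode's 1-based chars.

```python
import re

def chunks_between(text, open_pat, close_pat):
    """Return (start, end, content) for each shortest span between two delimiters."""
    # .*? is lazy; it stands in for PCRE's (?U) ungreedy flag (assumption:
    # the two behave the same for a simple pattern like this one).
    # DOTALL lets a cell span several lines; IGNORECASE accepts <TD> as well.
    pattern = re.compile(open_pat + r"(.*?)" + close_pat, re.DOTALL | re.IGNORECASE)
    found = []
    pos = 0
    while True:
        m = pattern.search(text, pos)
        if m is None:
            break
        # Record where the captured content starts and ends, plus the content.
        found.append((m.start(1), m.end(1), m.group(1)))
        # Resume scanning just past the closing delimiter.
        pos = m.end()
    return found

# Hypothetical sample row, not taken from any real page.
row = "<tr><td>Name</td><td class='n'>42</td><td></td></tr>"
cells = chunks_between(row, r"<td[^>]*>", r"</td>")
```

Each entry in cells carries the character positions of one cell along with its text, which is roughly the information matchChunk hands back in its position variables.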

It is better to use LiveCode chunking first to reduce the size of the text block you hand to the regex. Otherwise the regex scan can take exponentially longer on blocks of repeating patterns, such as HTML tables.
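
One way to picture the chunk-first idea, again as a Python sketch rather than LiveCode: use plain substring search to cut the page down to a single table, then run the regex only on that slice. The helper name first_table and the sample page are hypothetical, and the sketch assumes the table is not nested inside another one.

```python
import re

def first_table(page):
    """Slice out the first <table>...</table> block using plain substring search."""
    # Assumes lowercasing does not change the string length, which holds
    # for ASCII markup, so offsets found in `lower` are valid in `page`.
    lower = page.lower()
    start = lower.find("<table")
    if start == -1:
        return ""
    end = lower.find("</table>", start)
    if end == -1:
        return ""
    return page[start:end + len("</table>")]

# Hypothetical sample page, not taken from any real site.
page = ("<html><p>intro</p>"
        "<table><tr><th>Qty</th><th>Price</th></tr></table>"
        "<p>footer</p></html>")

block = first_table(page)
# The regex now scans only the small table block, not the whole page.
headers = re.findall(r"<th[^>]*>(.*?)</th>", block, re.DOTALL | re.IGNORECASE)
```

The substring search is a single linear pass, so the regex never sees the markup before or after the table.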

Glad you understood the explanation.


Jim Ault
Las Vegas



