Wow. I would not have expected such a significant difference. Regex has been around a long time, and lots of smart computer science types have spent time coming up with ways to optimize its performance for pattern matching. I assumed (falsely) that regex-based filters in LC would be on par with or even superior to a custom function using chunks. This leads me to:
1) wondering if LC's hooks to whatever regex tool they are using under the hood are as good as they should be, AND
2) planning on rewriting my code to use chunks.

Thanks for the post.

On 1/30/2016 6:45 PM, Richard Gaskin wrote:
> Regex is wonderfully compact to write relative to equivalent routines
> using chunk expressions, but sometimes paid for in execution time.
>
> When I come across a good regex example like the one you provided, if
> I have a moment I like to test things out to see where regex is faster
> and where it isn't. It's really great for many things, but carries
> quite a bit of overhead.
>
> Of course for this test to be relevant it assumes that most of the
> specifiers in the regex expression are merely to identify the elements
> you're looking for, and that the data is expected to fit the
> definition you provided.
>
> Given that, it's possible to make the regex a bit simpler (see foo2
> below), but only with a modest boost to performance. It can probably
> be simplified more, but the chunk-based alternative performed so well
> I didn't bother exploring the regex side any further.
>
> Writing a lengthier handler that uses chunk expressions seems to yield
> the same results you reported, running between 12 and 60 times faster
> (depending on the percentage of lines tested that match the criteria
> being looked for).
>
> For one-offs like validating email addresses regex can be an excellent
> fit, and even some larger tasks depending on the specifics.
>
> But for iterating across lists I've often been delightfully surprised
> by LiveCode's gracefully efficient chunk handling.
>
> Testing your original data replicated to become 250 lines long, and
> looking for page 1 among them, the script below yields:
>
> Regex: 9261 ms
> RegexLite: 7958 ms
> Chunks: 197 ms
> Chunks faster than orig regex by: 47.01 times
> Chunks faster than lite regex by: 40.4 times
> Same result? true
>
>
> on mouseUp
>    put fld 1 into tList
>    put 1 into tPage --< change this for different tests
>    put 1000 into n
>    --
>    -- Test 1: original regex
>    put the millisecs into t
>    repeat n
>       put foo1(tPage, tList) into r1
>    end repeat
>    put the millisecs - t into t1
>    --
>    -- Test 2: lighter regex
>    put the millisecs into t
>    repeat n
>       put foo2(tPage, tList) into r2
>    end repeat
>    put the millisecs - t into t2
>    --
>    -- Test 3: chunks
>    put the millisecs into t
>    repeat n
>       put foo3(tPage, tList) into r3
>    end repeat
>    put the millisecs - t into t3
>    --
>    -- Display results:
>    set the numberformat to "0.##"
>    put "Regex: "&t1 &" ms"&cr \
>          &"RegexLite: "&t2 &" ms"&cr \
>          &"Chunks: "& t3 &" ms"&cr \
>          &"Chunks faster than orig regex by: "&(t1 / t3)&" times" &cr \
>          &"Chunks faster than lite regex by: "&(t2 / t3)&" times" &cr \
>          &"Same result? "& (r1=r3) &cr&cr& r1 &cr&cr& r3
> end mouseUp
>
>
> function foo1 pPage, tList
>    put "(.+\t"&pPage&",\d+,\d+,\d+)|(.+\t\d+,\d+,"&pPage&",\d+)|(.+\t"&pPage&",\d*\.?\d*,\d*\.?\d*,\d*\.?\d*,\d*\.?\d*)" into tMatchPattern
>    filter lines of tList with regex pattern tMatchPattern
>    return tList
> end foo1
>
>
> function foo2 pPage, tList
>    put "(.+\t"&pPage&",*)|(.+\t\d+,\d+,"&pPage&",*)|(.+\t"&pPage&",*)" into tMatchPattern
>    filter lines of tList with regex pattern tMatchPattern
>    return tList
> end foo2
>
>
> function foo3 pPage, tList
>    repeat for each line tLine in tList
>       set the itemdel to tab
>       put item 3 of tLine into t1
>       put pPage &"," into tPageMarker
>       if "." is in t1 then
>          if (t1 begins with tPageMarker) then
>             put tLine &cr after tNuList
>          end if
>       else
>          if ( t1 begins with tPageMarker) OR (item 4 of tLine begins with tPageMarker) then
>             put tLine &cr after tNuList
>          end if
>       end if
>    end repeat
>    delete last char of tNuList
>    return tNuList
> end foo3

_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode