On Jun 15, Aaron Craig said:
>references for making regexes go faster
It's all about brains, really. Optimization is done to some extent by the
Perl regex engine, but you have to make sure you're not being silly and
giving it something it'll spend a year on.
Take the example for (?>) from the perlre documentation -- Perl knows how
to optimize
"aaaaaaaaaaaaaaaaaaaaaaa" =~ /(a*)*b/;
but it doesn't know how to optimize
"aaaaaaaaaaaaaaaaaaaaaaa" =~ /(a*)*[b]/;
which can backtrack exponentially before it gives up. The solution is to
forbid backtracking into a* with the cut operator (?>).
"aaaaaaaaaaaaaaaaaaaaaaa" =~ /(?>a*)*[b]/;
That fails a whole lot faster.
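To see what "no backtracking" means in isolation, here is a minimal sketch. The string and variable names are mine, not from perlre, and the pattern is deliberately smaller than the one above so the effect is easy to follow.

```perl
use strict;
use warnings;

my $str = "aaab";

# Plain greedy a* grabs "aaa", then gives one "a" back so "ab" can match.
my $plain = $str =~ /a*ab/ ? 1 : 0;

# (?>a*) also grabs "aaa", but the group is atomic: it never gives
# anything back, so the following "a" cannot match and the pattern fails.
my $atomic = $str =~ /(?>a*)ab/ ? 1 : 0;

print "plain: $plain, atomic: $atomic\n";   # plain: 1, atomic: 0
```

The same rule is what helps the long string of a's: once (?>a*) has eaten them, the engine has no alternative ways of splitting them up to try.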
The other thing you can do is KNOW YOUR REGEX. Are you wasting
effort? Are you doing something needlessly? Maybe a construct like
/"(.*?)"/
could be written as
/"([^"]*)"/
which is more sensible -- why creep forward and look for a '"' every time,
when you can just fly through until you find a '"'?
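A quick sketch comparing the two forms, using sample strings I made up for illustration. On a single-line string they capture the same thing, but they are not strictly interchangeable: without /s the dot will not cross a newline, while [^"] will.

```perl
use strict;
use warnings;

my $line = 'say "hello world" and "bye"';

# Both forms capture the contents of the first quoted string.
my ($lazy)    = $line =~ /"(.*?)"/;
my ($negated) = $line =~ /"([^"]*)"/;

print "lazy:    $lazy\n";      # lazy:    hello world
print "negated: $negated\n";   # negated: hello world

# Where they differ: a quoted string that spans a newline.
my $multi = qq{"two\nlines"};
my $lazy_multi    = $multi =~ /"(.*?)"/   ? 1 : 0;   # 0, . stops at \n
my $negated_multi = $multi =~ /"([^"]*)"/ ? 1 : 0;   # 1, [^"] matches \n

print "lazy on multi-line: $lazy_multi, negated: $negated_multi\n";
```

If you swap one form for the other in existing code, check whether your data can contain newlines between the quotes.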
My book will have a chapter on optimization of regexes -- I'm not sure yet
how in-depth it will be, but it will be there, and it will be helpful.
In the meantime, I leave you all with this delightful regex chicanery,
before I go down to the YAPC building (today I give my regex talk, which
you can find at http://www.pobox.com/~japhy/TPC5.0/).
$list = join "," => (1,2,3,4,7,9,10,11,12,15,16);
($range = $list) =~ s/(\d+)(?:,((??{ 1 + $+ })))+/$1-$+/g;
print $range; # "1-4,7,9-12,15-16"
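For anyone who wants to check what the substitution is doing, here is a plain-loop sketch of the same idea: start a run at each number and extend it while the next number is exactly one more. The collapse_ranges name is hypothetical, and it assumes the list is already sorted, as it is above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the regex above): collapse runs of
# consecutive integers in a sorted list into "first-last" ranges.
sub collapse_ranges {
    my @nums = @_;
    my @out;
    while (@nums) {
        my $first = my $last = shift @nums;
        # Extend the run while the next number is exactly one more.
        $last = shift @nums while @nums && $nums[0] == $last + 1;
        push @out, $first == $last ? $first : "$first-$last";
    }
    return join ",", @out;
}

print collapse_ranges(1,2,3,4,7,9,10,11,12,15,16), "\n";   # 1-4,7,9-12,15-16
```

In the regex, (??{ 1 + $+ }) plays the part of the "$last + 1" test: it builds the next expected number from the most recently matched group.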
--
Jeff "japhy" Pinyan [EMAIL PROTECTED] http://www.pobox.com/~japhy/
I am Marillion, the wielder of Ringril, known as Hesinaur, the Winter-Sun.
Are you a Monk? http://www.perlmonks.com/ http://forums.perlguru.com/
Perl Programmer at RiskMetrics Group, Inc. http://www.riskmetrics.com/
Acacia Fraternity, Rensselaer Chapter. Brother #734
** Manning Publications, Co, is publishing my Perl Regex book **