I'm using a regular expression to extract all URLs from a file.
On the web server (PIII 350, 128 MB RAM, Linux 2.2.13-7mdk, PHP 3.0.12),
it processes a 25 KB HTML file in about 0.25 seconds.
When I run the same code on my own machine (Celeron 300A, 192 MB RAM),
it takes about 14-17 seconds, which is about 60-70 times slower.
I tested it on my machine with:
Red Hat 7.2, PHP 4.1.1 - 16 seconds
Windows 98, PHP 4.1.1 - 14 seconds
Windows 98, PHP 3.0.12 - 17 seconds
The difference in machine speed isn't that big, especially considering
that other (simpler) regexps run at almost the same speed on both
machines (20-30% difference).
Where can this enormous speed difference come from?
The regexp I'm using is the following:
while (eregi("(<frame[^>]*src[[:blank:]]*=|href[[:blank:]]*=|http-equiv=['\"]refresh['\"] *content=['\"][0-9]+;url[[:blank:]]*=|window[.]location[[:blank:]]*=|window[.]open[[:blank:]]*[(])[[:blank:]]*[\'\"]?(([[a-z]{3,5}://(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\,._a-zA-Z0-9-]*))(#[.a-zA-Z0-9-]*)?[\'\" ]?", $file, $regexp))
{
    print $regexp[2]."\n";
    $file = str_replace($regexp[0], "", $file);
}
Would anyone care to test it on their machine? The full code (20 lines)
with a timing function would be:
<?
// Return the current Unix time in seconds, including the microsecond part.
function getmicrotime()
{
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}

// Change to some HTML file around 25 KB in size.
$lines = @file('test.html');
$file = "";
if (is_array($lines))
{
    // Join the lines into a single string.
    while (list($id, $line) = each($lines))
        $file .= $line;
}

$start_time = getmicrotime();
// Print each matched URL, then strip the matched text so the next pass finds the next one.
while (eregi("(<frame[^>]*src[[:blank:]]*=|href[[:blank:]]*=|http-equiv=['\"]refresh['\"] *content=['\"][0-9]+;url[[:blank:]]*=|window[.]location[[:blank:]]*=|window[.]open[[:blank:]]*[(])[[:blank:]]*[\'\"]?(([[a-z]{3,5}://(([.a-zA-Z0-9-])+(:[0-9]+)*))*([:%/?=&;\\,._a-zA-Z0-9-]*))(#[.a-zA-Z0-9-]*)?[\'\" ]?", $file, $regexp))
{
    print $regexp[2]."\n";
    $file = str_replace($regexp[0], "", $file);
}
$end_time = getmicrotime() - $start_time;
print "Completed in ".$end_time." seconds";
?>