Jeff Peng wrote:

Can the code (especially the regex) below be optimized to run faster?

#!/usr/bin/perl
for ($i=0; $i<1000; $i+=1) {

 open HD,"index.html" or die $!;
 while(<HD>) {
   print $1,"\n" if /href="http:\/\/(.*?)\/.*" target="_blank"/;
 }
 close HD;
}

Let me first "normalize" the code.

  #!/usr/bin/perl
  use strict;
  use warnings;

  my $fname = "index.html";

  for my $i ( 0 .. 999 ) {

      open my $fh, "<", $fname or die $!;

      while( <$fh> ) {
          print $1,"\n"
            if m{href="http://(.*?)/.*" target="_blank"};
      }
      close $fh;
  }

So it captures host names out of href/target strings
(but only out of the first one on each line).

I would add a question mark after the second ".*" to minimize backtracking, but that changes the meaning.
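
To make the greedy versus non-greedy difference concrete, here is a small sketch. The sample line and the helper match_info are made up for illustration and are not taken from the real index.html. It reports the captured host and how many characters the whole pattern consumed for each variant.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line with two links (not from the real index.html).
my $line = '<a href="http://a.example/x" target="_blank">A</a> '
         . '<a href="http://b.example/y" target="_blank">B</a>';

# Hypothetical helper: return the captured host and the length of the
# whole match, or an empty list if the pattern does not match.
sub match_info {
    my ( $re, $text ) = @_;
    return $text =~ $re ? ( $1, $+[0] - $-[0] ) : ();
}

# Original pattern: the second ".*" is greedy.
my ( $greedy_host, $greedy_len ) =
    match_info( qr{href="http://(.*?)/.*" target="_blank"}, $line );

# Variant with a question mark after the second ".*".
my ( $lazy_host, $lazy_len ) =
    match_info( qr{href="http://(.*?)/.*?" target="_blank"}, $line );

print "greedy: host=$greedy_host, matched $greedy_len chars\n";
print "lazy:   host=$lazy_host, matched $lazy_len chars\n";
```

On this sample both variants capture the same host, but the greedy one runs on to the last target="_blank" on the line while the non-greedy one stops at the first.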

Furthermore, there is no need to open the file 1000 times; see perldoc -f seek.
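
A minimal sketch of that idea, assuming the file is a regular, seekable file: open it once and rewind with seek before each pass. The helper name print_hosts and its parameters are hypothetical, and whether this is measurably faster depends on the system, so it is worth benchmarking.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: scan $fname $passes times and print each captured
# host name to the filehandle $out. The file is opened only once and
# rewound between passes.
sub print_hosts {
    my ( $fname, $passes, $out ) = @_;

    open my $fh, "<", $fname or die "Cannot open $fname: $!";

    for my $pass ( 1 .. $passes ) {
        # Rewind to byte 0; a whence of 0 means "from the start of the file".
        seek $fh, 0, 0 or die "Cannot seek in $fname: $!";

        while ( my $line = <$fh> ) {
            print {$out} $1, "\n"
              if $line =~ m{href="http://(.*?)/.*" target="_blank"};
        }
    }
    close $fh;
    return;
}

# Usage: perl hosts.pl index.html
print_hosts( $_, 1000, \*STDOUT ) for @ARGV;
```

Note that seek does not reset the line counter $. , which does not matter here because the loop never uses it.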

--
Ruud
