Wow, thank you to everyone who responded on this one. It's great to get
really good responses.

Rob, you have some really good points in your email and your code works
great. Thank you. But there is one part that I don't fully understand.

In this piece of the code,
while (<LOG>) {

  if (/user ([\w.]+)\.\.\.$/) {
    my $name = $1;
    $_ = <LOG>;
    if (/$date $time: \Q$msg/) {
      $users{$name}++;
    }
  }

I don't understand why or how it knows to look for /$date $time: \Q$msg/
only on the line after it matches /user ([\w.]+)\.\.\.$/, and not
continually throughout the rest of the log file once it has matched a
single /user ([\w.]+)\.\.\.$/.

Thanks.

Romeo




On 9/7/06, Rob Dixon <[EMAIL PROTECTED]> wrote:

Romeo Theriault wrote:
>
> Hello, I'm trying to match this line (or more than one) starting from
> the words "user picard..."
>
> 8/28/2006 1:04:41 PM: Retrieving mail from host mail.maine.edu [130.111.32.22], user picard...
> 8/28/2006 1:04:45 PM: Mail retrieval failed, reason: POP3 Host did  not acknowlege password and returned following error: -ERR [AUTH]  Invalid login
>
>
> Using this regular expression:
>
> user (\w+(\.\w+)?(\.\w+)?)\.\.\.\n\d{1,2}\/\d{1,2}\/\d{4} \d{1,2}:\d{2}:\d{2} (A|P)M: Mail retrieval failed, reason: POP3 Host did not  acknowlege password and returned following error: -ERR \[AUTH\]  Invalid login
>
> I seem to be matching it in ActiveState's regular expression toolkit,
> but when I try running the code it doesn't match the lines. I've
> tracked it down to the new line character after the three dots at the
> end of the first line. But I can't figure out how to get past it. The
> \n doesn't work. I've tried using chomp and then removing the new line
> character but it still doesn't match.
>
> Below is the code that I'm trying to get working; I can't seem to get
> that regex right. Any pointers that anyone has would be appreciated.
> Thank you.
>
> Romeo
>
>
>
> #!/usr/bin/perl -w
>
> use strict;
> my %saw;
> my @dup_array;
>
> # Open the file for reading and if cannot die and give error message.
> open LOG, "c:\\Documents and Settings\\romeok\\desktop\\perl\\CopyofPOPconSrv.log"
>     or die "Cannot open file for reading: $!";
>
>
> # While the file is open read through it and remove new line characters.
> # Look for the memory variable $1 and push it into the
> # @dup_array.
>
> while (<LOG>) {
>     if (/user (\w+(\.\w+)?(\.\w+)?)\.\.\.\n\d{1,2}\/\d{1,2}\/\d{4} \d{1,2}:\d{2}:\d{2} (A|P)M: Mail retrieval failed, reason: POP3 Host  did not acknowlege password and returned following error: -ERR \[AUTH \] Invalid login/) {
>     push @dup_array, $1;
>     }
> }
>
> # When done close the file.
> close LOG;
>
>
>
> # These two lines remove duplicates from the array.
> undef %saw;
> my @no_dup_array = grep(!$saw{$_}++, @dup_array);
>
>
>
> # For each item in the array print it out.
> # Actually use this to send an email instead.
> foreach (@no_dup_array) { #Uses $_ by default
>     print "$_\n";
> }

Hello Romeo

The main reason your regex is failing is that your while loop reads only a
single line at a time, and so there are no characters to match after the
'user picard...' line. Solve this by matching this first line and
extracting the user name, then reading another line and checking for the
error condition.
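
A minimal sketch of that point, assuming a shortened, made-up two-line
sample and an in-memory filehandle standing in for the real log file. The
same pattern is tried once per line, as the while loop does, and once
against the whole text, as a regex tester would:

```perl
use strict;
use warnings;

# Two-line sample in the same shape as the log (shortened, hypothetical text).
my $data = "8/28/2006 1:04:41 PM: Retrieving mail, user picard...\n"
         . "8/28/2006 1:04:45 PM: Mail retrieval failed\n";

# A pattern that spans the line break, like the original regex.
my $spanning = qr/user (\w+)\.\.\.\n\d/;

# Read line by line from an in-memory filehandle.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $line_hits = 0;
while (my $line = <$fh>) {
    # Each $line ends at its own newline, so nothing follows the \n.
    $line_hits++ if $line =~ $spanning;
}
close $fh;

# Match against the whole text at once.
my $whole_hit = ($data =~ $spanning) ? 1 : 0;

print "matches per line: $line_hits, match on whole text: $whole_hit\n";
```

This should print "matches per line: 0, match on whole text: 1", which is
the difference between the loop and the regex tester.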

I also think you're doing too much checking of the log file format.
There's no need to make sure the logging software output the date and time
correctly at the beginning of each line - you can just assume it's there,
as it always has been for the last million or so years. I've left it in my
solution though, just in case you were adamant about needing it.

When writing extended regexes like this it's better to split them into
chunks so that you can more easily see what's going on. I've written
separate regexes to match the date field and the time field, and stored the
exact log error message you're looking for in a further variable. Each of
these is substituted into the final regex, with \Q being used to escape
non-word characters in the message text like the square brackets.
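
A small sketch of what \Q changes, assuming a shortened, hypothetical
message and log line. Without it the square brackets are read as a
character class; with it they match literally:

```perl
use strict;
use warnings;

# Shortened, hypothetical message and log line for illustration.
my $msg  = "-ERR [AUTH]  Invalid login";
my $line = "8/28/2006 1:04:45 PM: -ERR [AUTH]  Invalid login";

# Reusable sub-patterns compiled with qr.
my $date = qr(\d?\d/\d?\d/\d\d\d\d);
my $time = qr(\d?\d:\d\d:\d\d (?:AM|PM));

# Without \Q, [AUTH] is a character class matching one of A, U, T or H.
my $unquoted = ($line =~ /$date $time: $msg/)   ? 1 : 0;

# With \Q, every non-word character in $msg is escaped and matches literally.
my $quoted   = ($line =~ /$date $time: \Q$msg/) ? 1 : 0;

print "unquoted: $unquoted, quoted: $quoted\n";
```

This should print "unquoted: 0, quoted: 1".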

Finally, there's no point in storing the user names in an array and then
copying them to a unique list via a temporary hash: you may as well just
store the whole lot in a hash in the first place. The only reason this may
not work for you is if the order the names appeared in the log is
important; otherwise the code below will work for you.
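
If the order does matter, one possible variation (a sketch with made-up
user names, not part of the solution below) is to keep the hash for the
"seen" test and push each name onto an array the first time it appears:

```perl
use strict;
use warnings;

# Hypothetical user names in the order they might appear in a log.
my @seen_in_log = qw(picard riker picard data riker);

# Hash only: unique names with counts, but keys come back in no fixed order.
my %users;
$users{$_}++ for @seen_in_log;

# Hash plus array: push a name only the first time it is seen,
# which keeps first-appearance order.
my (%seen, @ordered);
for my $name (@seen_in_log) {
    push @ordered, $name unless $seen{$name}++;
}

print join(',', sort keys %users), "\n";   # sorted only for a stable display
print join(',', @ordered), "\n";
```

The first line printed should be "data,picard,riker" and the second
"picard,riker,data".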

HTH,

Rob


use strict;
use warnings;

my $file = 'C:\Documents and Settings\romeok\desktop\perl\CopyofPOPconSrv.log';

# Open the file for reading and if cannot die and give error message.
open LOG, $file or die "Cannot open file for reading: $!";

my %users;

# Set up regexes for standard parts of the pattern
#
my $msg = "Mail retrieval failed, reason: POP3 Host did  not acknowlege password and returned following error: -ERR [AUTH]  Invalid login";
my $date = qr(\d?\d/\d?\d/\d\d\d\d);
my $time = qr(\d?\d:\d\d:\d\d (?:AM|PM));

# While there is data left in the file read through it
# Look for the memory variable $1 and save it in the %users hash
#
while (<LOG>) {

   if (/user ([\w.]+)\.\.\.$/) {
     my $name = $1;
     $_ = <LOG>;
     if (/$date $time: \Q$msg/) {
       $users{$name}++;
     }
   }
}

# When done close the file.
#
close LOG;

# For each user name in the hash print it out.
# Actually use this to send an email instead.
#
foreach (keys %users) { # Uses $_ by default
   print "$_\n";
}
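
The failure check is applied only to the line directly after a "user"
line because the inner read and the inner match both sit inside the outer
if block. A minimal trace, assuming a shortened, made-up log held in an
in-memory filehandle, where only picard is followed directly by a failure:

```perl
use strict;
use warnings;

# Hypothetical, shortened log: only picard is followed directly by a failure.
my $data = join '', map { "$_\n" } (
    'Retrieving mail, user picard...',
    'Mail retrieval failed',
    'Retrieving mail, user riker...',
    'Mail retrieved OK',
    'Mail retrieval failed',
);

open my $log, '<', \$data or die "Cannot open in-memory log: $!";

my %users;
while (<$log>) {
    if (/user ([\w.]+)\.\.\.$/) {
        my $name = $1;
        # This read advances the same filehandle by exactly one line.
        $_ = <$log>;
        # The check runs once, against that one line, then the block ends.
        $users{$name}++ if defined($_) && /Mail retrieval failed/;
    }
    # Lines without a user match never reach the failure check at all.
}
close $log;

print join(',', sort keys %users), "\n";
```

This should print just "picard": riker's next line is not a failure, and
the final failure line is read by the outer loop, where no user match
precedes it.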




