Simple regex problem has me baffled

Bill Harpley Mon, 26 Jan 2009 07:21:09 -0800

Hello,

I have a simple regex problem that is driving me crazy.


I am writing a script to analyse a log file. It contains Java-related
information about requests and responses.

Each pair of Request (REQ) and Response (RES) calls has a unique
Request ID. This is a 5-digit hex number contained in square brackets
(e.g. "[81c2d]").

Using timestamps in each log entry, I need to calculate the time
difference between the start of the Request and the end of the Response.

As a first step, I thought I would identify the matching REQ/RES pairs
in the log and then set about extracting the timestamp information and
doing the calculations.
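
To make the end goal concrete, the timestamp arithmetic could be sketched roughly as follows. This is only an illustration under assumptions: the helper name timestamp_ms is made up, the entries are shortened, the response entry and its timestamp are invented, and it assumes every entry starts with a bracketed timestamp of the form "[2009-01-23 09:20:48,719]".

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: convert the leading "[YYYY-MM-DD hh:mm:ss,mmm]"
# of a log entry into milliseconds since the epoch.
# Returns undef when the entry does not start with such a timestamp.
sub timestamp_ms {
    my ($entry) = @_;
    return undef
        unless $entry =~ /^\[(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d),(\d{3})\]/;
    my ($year, $mon, $day, $hour, $min, $sec, $ms) = ($1, $2, $3, $4, $5, $6, $7);
    # timegm treats the fields as UTC; only the difference matters here.
    return timegm($sec, $min, $hour, $day, $mon - 1, $year) * 1000 + $ms;
}

# Shortened request entry, plus an invented response entry (assumption).
my $req = '[2009-01-23 09:20:48,719]TRACE [server-1] RequestId [81e80] SetCallForwardStatus.REQ';
my $res = '[2009-01-23 09:20:49,102]TRACE [server-1] RequestId [81e80] SetCallForwardStatus.RES';

my $elapsed = timestamp_ms($res) - timestamp_ms($req);
print "elapsed: $elapsed ms\n";
```

Working in whole milliseconds keeps the subtraction exact and avoids floating-point seconds.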

I started with a simple script to extract the Request IDs from each log
entry. Here is what one looks like (names have been changed to protect
the innocent).


[2009-01-23 09:20:48,719]TRACE [server-1] [http-80-5] a...@mydomain.net
:090123-092048567:f5825 (SetCallForwardStatusImpl.java:call:54) -
RequestId [81e80] SetCallForwardStatus.REQ { accountNumber:=W12345,
phoneNumber:=12121212121, onBusyStatus:=true, busyCurrent:=voicemail,
onNoAnswerStatus:=false, noAnswerCurent:=voicemail,
onUncondStatus:=false, uncondCurrent:=voicemail }

So I need to extract the 5 hex digits in "RequestId [81e80]". Sounds
simple, eh?

Here is a fragment of my initial script:

open ( DATA, "< $INBOX/sample.log") || die "Cannot open source file: $!";
open ( FILE, "> $INBOX/request.dat") || die "Cannot open request file: $!";

chomp(@list=<DATA>);

foreach $entry(@list)
{

        $entry =~ /\[([a-z0-9]{5})\]/;

        print "$1\n";           # print to screen

        # print FILE "$1\n";            # print to file
}
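
For comparison, a variant of the same loop that tests the result of the match before using $1 could be sketched like this. $1 is only set by a successful match, so the sketch checks the match first. The helper name request_id and the two sample entries are invented for illustration, and anchoring on the literal text "RequestId" is an assumption based on the sample entry above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the 5-digit hex Request ID from one log
# entry, or undef when the entry has no "RequestId [xxxxx]" field.
sub request_id {
    my ($entry) = @_;
    # Test the match result before touching $1.
    if ($entry =~ /RequestId \[([0-9a-f]{5})\]/) {
        return $1;
    }
    return undef;
}

# Made-up entries: one with a Request ID, one continuation line without.
my @list = (
    '[2009-01-23 09:20:48,719]TRACE [server-1] [http-80-5] RequestId [81e80] SetCallForwardStatus.REQ',
    'onNoAnswerStatus:=false, noAnswerCurent:=voicemail,',
);

foreach my $entry (@list) {
    my $id = request_id($entry);
    if (defined $id) {
        print "$id\n";
    } else {
        print "no Request ID in this entry\n";
    }
}
```

The character class [0-9a-f] is narrower than [a-z0-9] and only accepts lower-case hex digits, which is how the IDs appear in the sample.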

I have spent quite a bit of time refining this expression and it looks
OK to me. I basically just need to extract the 5-digit hex string and
then write it to a file (or to screen).

This is what I get when I run the script:

Use of uninitialized value in concatenation (.) or string at ./magic.pl
line 16, <DATA> line 1044.

8252c
Use of uninitialized value in concatenation (.) or string at ./magic.pl
line 16, <DATA> line 1044.

8252c
Use of uninitialized value in concatenation (.) or string at ./magic.pl
line 16, <DATA> line 1044.

8252d
Use of uninitialized value in concatenation (.) or string at ./magic.pl
line 16, <DATA> line 1044.

8252d
Use of uninitialized value in concatenation (.) or string at ./magic.pl
line 16, <DATA> line 1044.

82534
82534
Use of uninitialized value in concatenation (.) or string at ./magic.pl
line 16, <DATA> line 1044.

82535
Use of uninitialized value in concatenation (.) or string at ./magic.pl
line 16, <DATA> line 1044.

82534
82534
82534
Use of uninitialized value in concatenation (.) or string at ./magic.pl
line 16, <DATA> line 1044.

8253c
8253c
8253c
Use of uninitialized value in concatenation (.) or string at ./magic.pl
line 16, <DATA> line 1044


< --- Big long list --note that RequestIDs from REQ/RES pairs need not
be adjacent in the list -- >

The first thing that puzzles me is that, although it is obviously
extracting the RequestId substring correctly, it seems to complain about
the "$1\n" expression in line 16.
This looks quite OK to me and I am baffled as to why I am getting this
message.

The other thing that puzzles me is that there can only be a single
REQ/RES pair in the file with a given ID. So the RequestID should not
appear more than twice in the output list. Yet there are many instances
where the RequestID appears more than twice.

Any help you guys can provide would be much appreciated. The Perl
version is 5.8.4 on Solaris 10.


Regards,

Bill Harpley
