David Shere wrote:
I'm seeking critique of the following work-in-progress:
My task is to read in a binary data file and prepare it for use. The
"use" involves answering a query on whether some data is in the file,
and if it is, return the rest of the data associated with it.
Each line/record in the file is a line item from a sales order. The two
fields in the record that I care about (right now) are invoiceNumber and
transactionNumber. I must be able to tell the "user" if their
transaction number is present in the file, and if it is, what invoice
number is associated with it. There may be multiple records with the
same invoice number and transaction number. Presumably the case does
not exist where two records have the same transaction number but
different invoice numbers, or vice-versa. (Gosh I hope that's true...)
My solution is this: create a hash of arrays of hashes. Here is a
(shortened) example of one of the top hash elements:
'801017215' => [
{
'invoiceNumber' => '060712067',
'transactionId' => '801017215',
'partNumber' => '00-0001-00'
},
{
'invoiceNumber' => '060712067',
'transactionId' => '801017215',
'partNumber' => '70-0773-84'
}
],
That top hash key matches the transaction number of its member hashes.
My plan is to search for the transaction number in top hash, and if it
exists, return the array of the two line items from that sales order
(the two lower hashes).
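The lookup described above could be sketched roughly like this. It assumes the data set is a hash reference keyed by transaction ID, as in the example; the helper name findLineItems and the sample data are hypothetical, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical sample data in the hash-of-arrays-of-hashes shape shown above.
my $dataSet = {
    '801017215' => [
        { invoiceNumber => '060712067', transactionId => '801017215', partNumber => '00-0001-00' },
        { invoiceNumber => '060712067', transactionId => '801017215', partNumber => '70-0773-84' },
    ],
};

# Return the line items for a transaction ID, or an empty list if absent.
# Checking exists() first means we never dereference a missing entry.
sub findLineItems {
    my ($data, $transactionId) = @_;
    return unless exists $data->{$transactionId};
    return @{ $data->{$transactionId} };
}

for my $id ('801017215', '999999999') {
    my @items = findLineItems($dataSet, $id);
    if (@items) {
        # Collapse duplicate invoice numbers across the line items.
        my %seen;
        my @invoices = grep { !$seen{$_}++ } map { $_->{invoiceNumber} } @items;
        print "$id: invoice(s) @invoices, ", scalar(@items), " line item(s)\n";
    }
    else {
        print "$id: not found\n";
    }
}
```

If two records ever did share a transaction number but carry different invoice numbers, the de-duplicated invoice list would show more than one entry, which gives a cheap way to test that assumption against real data.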
Here is the code:
sub processFile {
my $sourceFile = shift;
my $buffer;
my $dataSet;
open (INF, $sourceFile) or die "Infile: $!\n";
binmode INF;
while (read (INF, $buffer, 189)) {
$record = parseDataLine(substr($buffer,20));
push (@{$dataSet->{$record->{'transactionId'}}}, $record)
unless !$record->{'transactionId'};
"unless !" could be shortened to "if".
}
close INF or die "Can't close $sourceFile: $!\n";
print Dumper ($dataSet);
}
sub parseDataLine {
my $line = shift;
my $return;
$return->{'reportStat'} = substr($line,0,1);
$return->{'shipStat'} = substr($line,1,1);
$return->{'auctionId'} = substr($line,2,30);
$return->{'transactionId'} = substr($line,32,30);
$return->{'invoiceNumber'} = substr($line,62,9);
$return->{'partNumber'} = substr($line,71,10);
$return->{'shipDate'} = substr($line,81,6);
$return->{'VIN'} = substr($line,87,17);
$return->{'froogleId'} = substr($line,104,14);
$return->{'paypalTran'} = substr($line,119,50);
foreach $element (keys %{$return}) {
$return->{$element} =~ s/\s//gi;
Your substitution uses the /i option, but there are no alphabetic
characters in your pattern for /i to affect. Your pattern would be
more efficient with the '+' quantifier:
$return->{$element} =~ s/\s+//g;
}
Since you are only (and can only) modifying the values there is no
reason to access the keys:
s/\s+//g foreach values %$return;
return $return;
}
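The values-based form works because values() hands back aliases to the stored values rather than copies, so a substitution on $_ changes the hash itself. A small sketch with a made-up record (the field contents are invented for illustration):

```perl
use strict;
use warnings;

# Made-up record with padded fixed-width fields, for illustration only.
my $return = {
    invoiceNumber => '060712067 ',
    partNumber    => ' 00-0001-00',
    shipDate      => "0607 12\t",
};

# values() yields aliases to the stored values, so s/// edits the hash in place.
s/\s+//g foreach values %$return;

print "$_ => '$return->{$_}'\n" for sort keys %$return;
```

Note that this removes all whitespace, including any inside a field, which is the same behavior as the original loop over the keys.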
I might do it like this (UNTESTED):
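use Fcntl;
use Data::Dumper;
use constant BUF_LEN => 189;

sub processFile {
    my $sourceFile = shift;
    sysopen my $INF, $sourceFile, O_RDONLY
        or die "Can't open $sourceFile: $!\n";
    binmode $INF;    # as in your version; matters on platforms like Windows
    my %dataSet;
    {   my $len = sysread $INF, my $buffer, BUF_LEN;
        # Check for an error first: an undefined $len would compare
        # equal to 0 and be mistaken for EOF.
        die "Error reading from $sourceFile: $!\n" unless defined $len;
        last if $len == 0;    # reached EOF
        die "Error: read $len bytes, expecting ", BUF_LEN, " bytes.\n"
            if $len < BUF_LEN;
        my %record;
        # Note: this template reads paypalTran from offset 118, where your
        # substr() version uses 119. If that one-byte gap is real, put an
        # 'x' before the final A50.
        @record{ qw/reportStat shipStat auctionId transactionId
                    invoiceNumber partNumber shipDate VIN froogleId paypalTran/ }
            = map { s/\s+//g; $_ } unpack
                'x20 A A A30 A30 A9 A10 A6 A17 A14 A50', $buffer;
        # Store a reference to a copy of the record hash.
        push @{ $dataSet{ $record{ transactionId } } }, { %record }
            if $record{ transactionId };
        redo;
    }
    close $INF or die "Can't close $sourceFile: $!\n";
    print Dumper \%dataSet;
}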
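One way to sanity-check the unpack template without the real file is to build a synthetic record with pack and parse it back. This is only a sketch: the field values are invented, and the 20-byte prefix and the single trailing byte are assumed to be filler, inferred from the 189-byte record length.

```perl
use strict;
use warnings;

my @fields   = qw/reportStat shipStat auctionId transactionId invoiceNumber
                  partNumber shipDate VIN froogleId paypalTran/;
my $template = 'x20 A A A30 A30 A9 A10 A6 A17 A14 A50';

# Invented field values; pack's 'A' pads each one with spaces to its width.
# The trailing 'x' adds one filler byte to reach the 189-byte record length.
my $buffer = pack "$template x",
    'Y', 'N', '12345', '801017215', '060712067',
    '00-0001-00', '060712', '1HGCM82633A004352', 'F123', 'PP-42';

# Same parsing step as in processFile: unpack, strip whitespace, fill the slice.
my %record;
@record{@fields} = map { s/\s+//g; $_ } unpack $template, $buffer;

printf "record length: %d bytes\n", length $buffer;
print "$_ => '$record{$_}'\n" for @fields;
```

The same approach can be used to write a small test file of several records and run processFile against it before trusting it with production data.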
John
--
Those people who think they know everything are a great
annoyance to those of us who do. -- Isaac Asimov