Hi All,

I'm trying to parse data from a report file and I'm having trouble producing 
the desired results. 

Here is a data example from the report:

    PONumber   Line  InvoicedQty   UnitCost  Amount Curr  Extended Amount
Fr    Date   Company Department   Account  ItemNum       ItemDescription
--------- ------------------------- ------ --------------------- 
------------------ 
     1023112-0000    1   1.0000   102.3419   102.34 USD    102.34
03A  10/13/10  213 31000   - 10810-   138328        ARMBD ART LN
Vendor:     1288 ALIMED                          Buyer:A02  VALERIE BAGALA

     1026244-0000    1   1.0000   284.2525   284.25 USD       284.25
03A  10/29/10  213 31000  - 10810-  279784         BAGS CRUSHER
Vendor:     1338 SHARPE LINES INC                Buyer:A02  VALERIE BAGALA

     1024877-0000    1  4.0000   140.4800   561.92 USD   561.92
03A  10/26/10  213 31000  - 10810-  235228         SYR 1ML AMBER W/ TIP CAP
Vendor:     2472 BAXA CORP                       Buyer:A02  VALERIE BAGALA

     1000066-0000     1  .9000  241.6845    217.52 USD     217.52
03A  05/19/10  213 41000 - 10810-  145155   NDL JAMSHIDI 11GA DISP STR
Vendor:     2686 CARDINAL HEALTH 200 INC         Buyer:A04  DEAN SCHUMACHER

--------- ------------------------- ------ --------------------- 
------------------ 


A complete record in the report spans more than one line.
Each record begins with a line starting with 4 to 7 digits followed by -0000,
and each record ends with a line that contains the word 'Vendor:'.
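
To make the two boundary rules above concrete, here is a minimal sketch of how
each report line could be classified. The helper name line_kind is made up for
illustration, and the patterns assume that leading whitespace may precede the
PO number or the word Vendor:, as in the sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of my script): classify one report line
# according to the record boundary rules described above.
sub line_kind {
    my ($line) = @_;
    # A record starts with 4 to 7 digits followed by -0000.
    return 'start' if $line =~ /^\s*\d{4,7}-0000\b/;
    # A record ends with the line that begins with "Vendor:".
    return 'end'   if $line =~ /^\s*Vendor:/;
    # Anything else is either the middle line of a record or report noise.
    return 'other';
}

# Sample lines taken from the report excerpt.
my @samples = (
    '     1023112-0000    1   1.0000   102.3419   102.34 USD    102.34',
    '03A  10/13/10  213 31000   - 10810-   138328        ARMBD ART LN',
    'Vendor:     1288 ALIMED                          Buyer:A02  VALERIE BAGALA',
);

print line_kind($_), "\n" for @samples;
```

With these samples the first line is classified as a start line, the second as
neither boundary, and the third as an end line.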

I need to extract some elements from each record and combine the extracted data 
in one new record (1 line per record)
so that later I can bcp the new data into a table/database.

Here is my code:

#!/usr/bin/perl  -w
use strict;

my $PO_file = "/home/sybase/scripts125/pl/test/simple_SH135.dat";
open(IN, '<', $PO_file) || die "Failed to open $PO_file: $!";

my ($line, @part1, @part2, @part3, $rec_part1, $rec_part2, $rec_part3, 
$complete_record);

print "Need to extract the following data\n";
print 
"PONumber|Quantity|UnitCost|ExtAmt|Date|Company|ItemNumber|Description|VendorID\n";

while ($line=<IN>) {
  chomp $line;
  $line =~ s/^\s+//go;
  $line =~ s/\s+/ /go;
  $line =~ s/,//go;

  # Part1
  # Data Example:      1023112-0000    1   1.0000   102.3419   102.34 USD    102.34
  ######################################################################
  if ($line =~ /-0000/){  
  #if ($line =~ /^\d{7}-0000/){    
     my @part1 = split(/\s+/,$line);
     my $PONumber = $part1[0];
     my $Quantity = $part1[2];
     my $UnitCost = $part1[3];
     my $ExtAmt = $part1[6];
     $rec_part1 = join "|",($PONumber, $Quantity, $UnitCost, $ExtAmt);
  }# end part 1  
  
  
  ### Part2:
  #Data Example: 03A  10/13/10  213 31000   - 10810-   138328        ARMBD ART LN
  ######################################################################
  if ($line =~ /(\d{2}\/\d{2}\/\d{2})/){ 
  #if ($line =~ /^.{5}(\d{2}\/\d{2}\/\d{2})/){ #Data Eg: 03A  10/13/10  213 31000 #why does this not work?
      my @part2 = split(/\s+/,$line,8);  # the last field (the description) can contain multiple words
      my $PurchFr = $part2[0] ;
      my $Date = $part2[1] ;
      my $Company = $part2[2] ;
      my $Dept = $part2[3] ;
      my $Acct = $part2[5] ;
      my $ItemNumber = $part2[6] ;
      my $Desc = $part2[7];
      #$rec_part2 = join '|',($part2[1],$part2[2] ,$part2[6],$part2[7]);
      $rec_part2 = join '|',($Date,$Company ,$ItemNumber,$Desc);
      #print "rec_part2: $rec_part2 \n"; 
  }# end Part2


  ##Part3: VendorID
  # Data Example: Vendor:     1288 ALIMED              Buyer:A02  VALERIE BAGALA
  ###################################################################### 
  if ($line =~ /^Vendor:/){
    my @part3 = split(/\s+/,$line);
    my $VendorID = $part3[1];
    $rec_part3 = $VendorID;
  } #end part3

  $complete_record = "$rec_part1".'|'."$rec_part2".'|'."$rec_part3"; 
  print "$complete_record\n";

}#end while 

My questions:

I expect my program to produce 4 records, like:

PONumber|Quantity|UnitCost|ExtAmt|Date|Company|ItemNumber|Description|VendorID
1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD ART LN|1288
1026244-0000|1.0000|284.2525|284.25|10/29/10|213|279784|BAGS CRUSHER|1338
1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR 1ML AMBER W/ TIP CAP|2472
1000066-0000|.9000|241.6845|217.52|05/19/10|213|145155|NDL JAMSHIDI 11GA DISP STR|2686

However, it produces unwanted results like those below. Any pointers would be 
greatly appreciated.

1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD ART LN|
1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD ART LN|1288
1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD ART LN|1288
1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD ART LN|1288
1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD ART LN|1288
1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD ART LN|1288
1026244-0000|1.0000|284.2525|284.25|10/13/10|213|138328|ARMBD ART LN|1288
1026244-0000|1.0000|284.2525|284.25|10/29/10|213|279784|BAGS CRUSHER|1288
1026244-0000|1.0000|284.2525|284.25|10/29/10|213|279784|BAGS CRUSHER|1338
1026244-0000|1.0000|284.2525|284.25|10/29/10|213|279784|BAGS CRUSHER|1338
1026244-0000|1.0000|284.2525|284.25|10/29/10|213|279784|BAGS CRUSHER|1338
1024877-0000|4.0000|140.4800|561.92|10/29/10|213|279784|BAGS CRUSHER|1338
1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR 1ML AMBER W/ TIP CAP|1338
1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR 1ML AMBER W/ TIP CAP|2472
1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR 1ML AMBER W/ TIP CAP|2472
1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR 1ML AMBER W/ TIP CAP|2472
1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR 1ML AMBER W/ TIP CAP|2472
1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR 1ML AMBER W/ TIP CAP|2472
1000066-0000|.9000|241.6845|217.52|10/26/10|213|235228|SYR 1ML AMBER W/ TIP CAP|2472
1000066-0000|.9000|241.6845|217.52|05/19/10|213|145155|NDL JAMSHIDI 11GA DISP STR|2472
1000066-0000|.9000|241.6845|217.52|05/19/10|213|145155|NDL JAMSHIDI 11GA DISP STR|2686
1000066-0000|.9000|241.6845|217.52|05/19/10|213|145155|NDL JAMSHIDI 11GA DISP STR|2686




      
