John and Rob,
Thanks a lot for the pointers.  Very helpfull. 

--- On Thu, 1/27/11, Rob Dixon <rob.di...@gmx.com> wrote:

> From: Rob Dixon <rob.di...@gmx.com>
> Subject: Re: parse data from a report file
> To: beginners@perl.org
> Cc: "loan tran" <loan...@yahoo.com>
> Date: Thursday, January 27, 2011, 11:33 AM
> On 27/01/2011 06:58, loan tran
> wrote:
> > Hi All,
> >
> > I'm trying to parse data from a report file and I'm
> having trouble producing desired results.
> >
> > Here is a data example from the report:
> >
> >     
> PONumber   Line 
> InvoicedQty   UnitCost  Amount
> Curr  Extended Amount
> > Fr    Date   Company
> Department   Account  ItemNum 
>      ItemDescription
> > --------- ------------------------- ------
> --------------------- ------------------
> >       1023112-0000 
>  
> 1   1.0000   102.3419   102.34
> USD    102.34
> > 03A  10/13/10  213 31000   -
> 10810-   138328       
> ARMBD ART LN
> > Vendor:     1288 ALIMED 
>                
>         Buyer:A02  VALERIE BAGALA
> >
> >       1026244-0000 
>  
> 1   1.0000   284.2525   284.25
> USD       284.25
> > 03A  10/29/10  213 31000  -
> 10810-  279784     
>    BAGS CRUSHER
> > Vendor:     1338 SHARPE LINES
> INC               
> Buyer:A02  VALERIE BAGALA
> >
> >       1024877-0000 
>   1 
> 4.0000   140.4800   561.92
> USD   561.92
> > 03A  10/26/10  213 31000  -
> 10810-  235228     
>    SYR 1ML AMBER W/ TIP CAP
> > Vendor:     2472 BAXA CORP 
>                
>      Buyer:A02  VALERIE BAGALA
> >
> >       1000066-0000 
>    1  .9000  241.6845   
> 217.52 USD     217.52
> > 03A  05/19/10  213 41000 - 10810- 
> 145155   NDL JAMSHIDI 11GA DISP STR
> > Vendor:     2686 CARDINAL HEALTH
> 200 INC     
>    Buyer:A04  DEAN SCHUMACHER
> >
> > --------- ------------------------- ------
> --------------------- ------------------
> >
> >
> > A complete record in the report expands to more than 1
> line.
> > Each record begin with a line starting with 4 to 7
> digits and -0000
> > each record ends with a line contains the word
> Vendor:'
> >
> > I need to extract some elements from each record and
> combine the extracted data in one new record (1 line per
> record)
> > so that later I can bcp the new data into a
> table/database.
> >
> > Here is my code:
> >
> > #!/usr/bin/perl  -w
> > use strict;
> >
> > my $PO_file =
> "/home/sybase/scripts125/pl/test/simple_SH135.dat";
> > open(IN,"$PO_file") || die "Fail open $PO_file";
> >
> > my ($line, @part1, @part2, @part3, $rec_part1,
> $rec_part2, $rec_part3, $complete_record);
> >
> > print "Need to extract the following data\n";
> > print
> "PONumber|Quantity|UnitCost|ExtAmt|Date|Company|ItemNumber|Description|VendorID\n";
> >
> > while ($line=<IN>) {
> >    chomp $line;
> >    $line =~ s/^\s+//go;
> >    $line =~ s/\s+/ /go;
> >    $line =~ s/,//go;
> >
> >    # Part1
> >    # Data Example:     
> 1023112-0000   
> 1   1.0000   102.3419   102.34
> USD    102.34
> >   
> ########################################################################Part1:
> >    if ($line =~ /-0000/){
> >    #if ($line =~ /^\d{7}-0000/){
> >       my @part1 =
> split(/\s+/,$line);
> >       my $PONumber =
> $part1[0];
> >       my $Quantity =
> $part1[2];
> >       my $UnitCost =
> $part1[3];
> >       my $ExtAmt =
> $part1[6];
> >       $rec_part1 = join
> "|",($PONumber, $Quantity, $UnitCost, $ExtAmt);
> >    }# end part 1
> >
> >
> >    ### Part2:
> >    #Data Example: 03A  10/13/10 
> 213 31000   -
> 10810-   138328       
> ARMBD ART LN
> >   
> ######################################################################
> >    if ($line =~ /(\d{2}\/\d{2}\/\d{2})/){
> >    #if ($line =~
> /^.{5}(\d{2}\/\d{2}\/\d{2})/){ #Data Eg: 03A 
> 10/13/10  213 31000 #why not work?
> >        my @part2 =
> split(/\s+/,$line,8);  # last group has multiple words
> is descriptions
> >        my $PurchFr = $part2[0] ;
> >        my $Date = $part2[1] ;
> >        my $Company = $part2[2] ;
> >        my $Dept = $part2[3] ;
> >        my $Acct = $part2[5] ;
> >        my $ItemNumber = $part2[6]
> ;
> >        my $Desc = $part2[7];
> >        #$rec_part2 = join
> '|',($part2[1],$part2[2] ,$part2[6],$part2[7]);
> >        $rec_part2 = join
> '|',($Date,$Company ,$ItemNumber,$Desc);
> >        #print "rec_part2:
> $rec_part2 \n";
> >    }# end Part2
> >
> >
> >    ##Part3: VendorID
> >    # Data Example: Vendor: 
>    1288 ALIMED       
>       Buyer:A02  VALERIE BAGALA
> >   
> ######################################################################
> >    if ($line =~ /^Vendor:/){
> >      my @part3 = split(/\s+/,$line);
> >      my $VendorID = $part3[1];
> >      $rec_part3 = $VendorID;
> >    } #end part3
> >
> >    $complete_record =
> "$rec_part1".'|'."$rec_part2".'|'."$rec_part3";
> >    print "$complete_record\n";
> >
> > }#end while
> >
> > My questions:
> >
> > I expect my program to produce 4 records, like:
> >
> >
> PONumber|Quantity|UnitCost|ExtAmt|Date|Company|ItemNumber|Description|VendorID
> >
> 1026244-0000|1.0000|284.2525|284.25|10/13/10|213|138328|ARMBD
> ART LN|1288
> >
> 1026244-0000|1.0000|284.2525|284.25|10/29/10|213|279784|BAGS
> CRUSHER|1338
> >
> 1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR
> 1ML AMBER W/ TIP CAP|2472
> >
> 1000066-0000|.9000|241.6845|217.52|05/19/10|213|145155|NDL
> JAMSHIDI 11GA DISP STR|2686
> >
> > However it produces unwanted results, like below. Any
> pointers would be greatly appreciated.
> >
> >
> 1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD
> ART LN|
> >
> 1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD
> ART LN|1288
> >
> 1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD
> ART LN|1288
> >
> 1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD
> ART LN|1288
> >
> 1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD
> ART LN|1288
> >
> 1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD
> ART LN|1288
> >
> 1026244-0000|1.0000|284.2525|284.25|10/13/10|213|138328|ARMBD
> ART LN|1288
> >
> 1026244-0000|1.0000|284.2525|284.25|10/29/10|213|279784|BAGS
> CRUSHER|1288
> >
> 1026244-0000|1.0000|284.2525|284.25|10/29/10|213|279784|BAGS
> CRUSHER|1338
> >
> 1026244-0000|1.0000|284.2525|284.25|10/29/10|213|279784|BAGS
> CRUSHER|1338
> >
> 1026244-0000|1.0000|284.2525|284.25|10/29/10|213|279784|BAGS
> CRUSHER|1338
> >
> 1024877-0000|4.0000|140.4800|561.92|10/29/10|213|279784|BAGS
> CRUSHER|1338
> >
> 1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR
> 1ML AMBER W/ TIP CAP|1338
> >
> 1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR
> 1ML AMBER W/ TIP CAP|2472
> >
> 1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR
> 1ML AMBER W/ TIP CAP|2472
> >
> 1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR
> 1ML AMBER W/ TIP CAP|2472
> >
> 1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR
> 1ML AMBER W/ TIP CAP|2472
> >
> 1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR
> 1ML AMBER W/ TIP CAP|2472
> >
> 1000066-0000|.9000|241.6845|217.52|10/26/10|213|235228|SYR
> 1ML AMBER W/ TIP CAP|2472
> >
> 1000066-0000|.9000|241.6845|217.52|05/19/10|213|145155|NDL
> JAMSHIDI 11GA DISP STR|2472
> >
> 1000066-0000|.9000|241.6845|217.52|05/19/10|213|145155|NDL
> JAMSHIDI 11GA DISP STR|2686
> >
> 1000066-0000|.9000|241.6845|217.52|05/19/10|213|145155|NDL
> JAMSHIDI 11GA DISP STR|2686
> 
> My thoughts are of a much simpler solution. I hope the
> program below helps.
> 
> Rob
> 
> use strict;
> use warnings;
> 
> my $po_file =
> '/home/sybase/scripts125/pl/test/simple_SH135.dat';
> 
> while (<$in>) {
> 
>    chomp;
>    next unless /-0000/;
> 
>    my @record = (split)[0, 2, 3, 6];
> 
>    $_ = <DATA>;
>    push @record, (split)[1, 2, 6, 7];
> 
>    $_ = <DATA>;
>    die "No 'Vendor' found in record" unless
> /Vendor:/;
>    push @record, (split)[1];
> 
>    print join('|', @record), "\n";
> }
> 
> **OUTPUT**
> 
> 1023112-0000|1.0000|102.3419|102.34|10/13/10|213|138328|ARMBD|1288
> 1026244-0000|1.0000|284.2525|284.25|10/29/10|213|279784|BAGS|1338
> 1024877-0000|4.0000|140.4800|561.92|10/26/10|213|235228|SYR|2472
> 1000066-0000|.9000|241.6845|217.52|05/19/10|213|145155|NDL|2686
> 
> -- 
> To unsubscribe, e-mail: beginners-unsubscr...@perl.org
> For additional commands, e-mail: beginners-h...@perl.org
> http://learn.perl.org/
> 
> 
> 




--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/


Reply via email to