Re: regexp and parsing assistance

Jim Gibson Sun, 09 Jun 2013 09:01:02 -0700

On Jun 8, 2013, at 8:06 PM, Noah wrote:

> Hi there,
> 
> I am attempting to parse the following output and am not quite sure how to do
> it. The text is laid out in fixed columns, whether or not a given column
> (say Col5 or Col6) contains a value. If a column has an entry, I want to
> save it to a variable; if there is no entry, that variable should be set to
> 'blank'.
> 
> The first line is a header and can be ignored.
> 
> 
> C Col2               C Col4      Col5       Col6  Col7            Col8 
> <<<new_line>>>
> * 123.456.789.101/85 A 803                        Reject  <<<new_line>>>
>                     B 804         76         10 >800.99.999.0    98765 78910 
> I <<<new_line>>>
>                     O 805       1234          1 >800.9.999.1     98765 78910 
> I <<<new_line>>>


If your data consists of constant-width fields, then the best approach is to 
use the unpack function. See 'perldoc -f unpack' for how to use it and 'perldoc 
-f pack' for the template parameters that describe your data.

This statement will unpack the second and third data lines you have shown, 
presuming that you have read the lines into the variable $line:

  my @fields = unpack('A2 A19 A2 A3 A11 A11 A18 A5 A5 A1',$line);
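As a small illustration of how the 'A' template behaves, here is a sketch that uses a made-up three-field layout (widths 2, 6 and 5) rather than your real one. The sample line is built with sprintf so that the widths are known to match the template. Note that 'A' strips trailing spaces but leaves leading ones, so right-aligned values need an extra trim.

```perl
use strict;
use warnings;

# Hypothetical layout for illustration only: widths 2, 6 and 5.
my $template = 'A2 A6 A5';

# Build a sample record with known widths so the template is sure to match.
my $line = sprintf('%-2s%6s%-5s', 'B', 76, 'abc');

my @fields = unpack($template, $line);

# 'A' drops trailing spaces but keeps leading ones, so trim those here.
s/^\s+// for @fields;

print join('|', @fields), "\n";    # prints B|76|abc
```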

However, your data as shown has variable data in the first or second column. If 
that is really the case, then you will have to look at the first twenty columns 
of your data and determine where column three starts. Then you can use the 
unpack function to parse the rest of the columns. Maybe something like this:

  if( $line =~ /^\s{20}/ ) {
    # no data in first 20 columns, unpack remainder
    $line = substr($line,20);
  }else{
    # data in first 20 columns -- remove first two fields
    $line =~ s/\S+\s\S+\s//;
  }
  my @fields = unpack('A2 A3 A11 A11 A18 A5 A5 A1',$line);

Exactly what you need to do depends upon the exact nature of your data and how 
much it varies from line to line.
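To cover the requirement that an empty column ends up as 'blank', the same idea might be wrapped in a helper along these lines. This is only a sketch: parse_line, the template 'A2 A4 A6', and the sample lines are all invented for the example, and the real widths have to come from your data.

```perl
use strict;
use warnings;

# Hypothetical layout (not the poster's real widths): an optional prefix in
# the first 20 columns, then three fields of widths 2, 4 and 6.
my $TEMPLATE = 'A2 A4 A6';
my $WIDTH    = 12;    # 2 + 4 + 6

sub parse_line {
    my ($line) = @_;
    chomp $line;

    if ($line =~ /^\s{20}/) {
        $line = substr($line, 20);     # blank prefix: skip it
    } else {
        $line =~ s/^\S+\s+\S+\s+//;    # filled prefix: drop its two fields
    }

    # Pad short lines so every field in the template has characters to read.
    $line = sprintf('%-*s', $WIDTH, $line);

    my @fields = unpack($TEMPLATE, $line);
    for my $field (@fields) {
        $field =~ s/^\s+//;                  # 'A' already removed trailing spaces
        $field = 'blank' if $field eq '';    # empty column becomes 'blank'
    }
    return @fields;
}

print join('|', parse_line('* 10.0.0.1/8 A 803')), "\n";              # A|803|blank
print join('|', parse_line((' ' x 20) . 'B 804     76')), "\n";       # B|804|76
```

Padding the line before unpacking is a design choice for lines whose trailing spaces have been stripped, so that every field in the template still has characters to read.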

Good luck!


