Been playing around with this. It should be a start for you. No doubt you
will have questions and others will improve on it...

---
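use strict;
use warnings;

my $file = "test.txt";
my ($line_start, $line_end, $letters, @stretches);

# Record the current stretch if it spans at least 9 rows, then reset it.
# Assumes column 1 holds consecutive line numbers, so 9 rows means
# end - start >= 8.
sub flush_stretch {
        if (defined $line_start && $line_end - $line_start >= 8) {
                push @stretches, ["$line_start to $line_end", $letters];
        }
        undef $line_start;
        undef $letters;
}

open my $fh, '<', $file or die "Can't open $file: $!";

while (<$fh>) {
        # Columns: line number, value, letter
        my ($line, $number, $letter) = split ' ', $_;
        next unless defined $letter;    # skip blank or short lines

        if ($number < 1) {
                $line_start = $line unless defined $line_start;
                $letters .= $letter;
                $line_end = $line;
        } else {
                flush_stretch();
        }
}
flush_stretch();        # a stretch may run to the end of the file

close $fh;

# Stretches are kept in file order rather than sorted as strings.
foreach my $stretch (@stretches) {
        print "$stretch->[0] => $stretch->[1]\n";
}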
---
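
If you want the output in the "stretchN-M" form from your message, something along these lines might do. It is only a sketch: find_stretches is a name I made up, the sample rows are invented for illustration, and it counts rows in a run instead of subtracting line numbers, so it does not rely on column 1 being consecutive.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a minimum run length and rows of
# "line value letter", and returns "stretchSTART-END    LETTERS"
# strings for every run of at least $min consecutive rows whose
# value is below 1.
sub find_stretches {
    my ($min, @rows) = @_;
    my (@found, @run);

    # A sentinel row with a high value flushes a run that reaches
    # the end of the input.
    for my $row (@rows, "0 99 -") {
        my ($line, $value, $letter) = split ' ', $row;
        next unless defined $letter;    # skip blank or short rows

        if ($value < 1) {
            push @run, [$line, $letter];
            next;
        }
        if (@run >= $min) {
            my $letters = join '', map { $_->[1] } @run;
            push @found, sprintf "stretch%s-%s    %s",
                $run[0][0], $run[-1][0], $letters;
        }
        @run = ();
    }
    return @found;
}

# Made-up sample rows: lines 1-9 are below 1, line 10 is not, and
# lines 11-12 are below 1 but too short a run to count.
my @sample = map { "$_ 0.5 " . chr(64 + $_) } 1 .. 9;
push @sample, "10 2.0 X", "11 0.1 Y", "12 0.2 Z";

print "$_\n" for find_stretches(9, @sample);
# prints: stretch1-9    ABCDEFGHI
```

To run it on a real file, read the lines into an array and pass that in place of @sample.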

John

-----Original Message-----
From: Pedro A Reche Gallardo [mailto:[EMAIL PROTECTED]]
Sent: 02 July 2001 15:48
To: [EMAIL PROTECTED]
Subject: PARSING A COLUMN FILE


Hi all, I have a file that looks as follows:


The number in the second column goes from 0 to 4.3, and the files are
really long (around 1000 lines). What I would like to do is find the
stretches of the file that contain at least 9 consecutive rows with a
value in the second column lower than 1, and store the characters from
the third column.
Eventually I would like to be able to print out the data as follows:

stretch128-137    ACDNTCSSD
stretch190-205    SCTSEET+WTSpSSDS


With the above example the program would only find the following stretch:

stretch1-10  MRVKGI++Na


Please help.

Cheers,

Pedro Reche

***************************************************************************
PEDRO a. RECHE gallardo, pHD            TL: 617 632 3824
Scientist, Mol.Immnunol.Foundation,     FX: 617 632 3351
Dana-Farber Cancer Institute,           EM: [EMAIL PROTECTED]
Harvard Medical School,                 URL: http://www.reche.org
44 Binney Street, D610C,
Boston, MA 02115
***************************************************************************





