On Thu, May 12, 2011 at 6:23 AM, Nathalie Conte <n...@sanger.ac.uk> wrote:

>  I have this script to split and iterate over each line, but I don't know
> how to group 2 lines together, and take the start of the first line and the
> end of the second line. Could you please advise? Thanks
>
>
There are two main ways to handle two lines at a time. Either process one
line per iteration, keeping the previous line in memory and using a
conditional inside the loop to decide what to do, or fetch both lines you
need at once, which keeps the loop body simpler. What you are doing right
now, reading the entire file into an array, makes the latter easier, but
with a big enough file it just won't be viable and you'll have to modify the
script. Still, here's the nice and easy way to do it with what you already
have, using splice[0] (and, just for completeness, I also did a small
mock-up of the body of the loop, so the program also uses autodie[1],
chomp LIST[2], hash slices[3], and smart-matching[4]):

use strict;
use warnings;
use 5.010;
use autodie;

my $header = <DATA>;
my @list = <DATA>;

open my $OUTFILE, '>', "grouped.txt";

chomp @list;
while ( my ($one, $two) = splice @list, 0, 2 ) {
    # With an odd number of data lines, the last splice returns one element
    die "Odd number of data lines" unless defined $two;

    my (%first_line, %second_line);
    @first_line{ qw/ chromosome start end strand / } = split /\s+/, $one;
    @second_line{ qw/ chromosome start end strand / } = split /\s+/, $two;

    # Both lines of a pair must be on the same chromosome and strand
    if ( @first_line{ qw/ chromosome strand / } ~~ @second_line{ qw/ chromosome strand / } ) {
        say { $OUTFILE } join "\t", $first_line{chromosome}, $first_line{start},
            $second_line{end}, $first_line{strand};
    } else {
        die "Something weird is going on";
    }
}

__DATA__
chr start end strand
x 12 24 1
x 24 48 1
1 100 124 -1
1 124 148 -1

(Without gmail's screwy indentation: http://ideone.com/SnRKp)
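
A side note on the smart-match in that script: the ~~ operator was marked experimental in Perl 5.18 and deprecated in 5.38, so newer perls warn about it. Below is a minimal sketch of the same check done with plain string comparison. The same_fields helper is a hypothetical name of mine, not something from the script above, and it compares with "ne", which is slightly stricter than smart-matching for numeric-looking values.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): true when two
# records agree on every field named in @keys, compared as strings.
sub same_fields {
    my ( $first, $second, @keys ) = @_;
    for my $key (@keys) {
        return 0 if $first->{$key} ne $second->{$key};
    }
    return 1;
}

# Two records shaped like the hashes built in the loop body
my %first_line  = ( chromosome => 'x', start => 12, end => 24, strand => 1 );
my %second_line = ( chromosome => 'x', start => 24, end => 48, strand => 1 );

if ( same_fields( \%first_line, \%second_line, qw/ chromosome strand / ) ) {
    # Prints the merged row: x, 12, 48, 1 separated by tabs
    print join( "\t", @first_line{ qw/ chromosome start / },
        $second_line{end}, $first_line{strand} ), "\n";
}
```

Passing the field names as a list keeps the call site close to the original hash-slice comparison.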

But reading an entire file into memory isn't advisable. So you either need
to drop that @list array and change the condition in the while loop to
something like this:

# Or put that inside a function. Each read is tested with defined() so
# that the loop actually stops at end of file.
while ( defined( my $one = <DATA> ) and defined( my $two = <DATA> ) ) {
   ...
}
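
To expand on the "put that inside a function" remark, here is a rough sketch. A function that returns an empty list at end of file lets the list-assignment form of the loop condition terminate cleanly, because a list assignment in boolean context is false when the right-hand side is empty. The name read_pair and the in-memory filehandle are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: read the next two lines from a filehandle.
# Returns an empty list at end of file, or when a trailing line has no
# partner, so the while condition below becomes false.
sub read_pair {
    my ($fh) = @_;
    my $one = <$fh>;
    my $two = <$fh>;
    return unless defined $one && defined $two;
    chomp( $one, $two );
    return ( $one, $two );
}

# In-memory stand-in for the real input file (same sample data as above)
my $data = join '', map { "$_\n" }
    'chr start end strand', 'x 12 24 1', 'x 24 48 1',
    '1 100 124 -1', '1 124 148 -1';
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $header = <$fh>;    # skip the header line
while ( my ( $one, $two ) = read_pair($fh) ) {
    print "$one | $two\n";
}
```

Only two lines are held in memory at any time, whatever the size of the file.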

Or you keep the @list array, but do some magic with Tie::File[5]:

use Tie::File;
tie @list, 'Tie::File', $file or die ...;

my $index = 1; #To skip the header
while ( $index < $#list ) {  # Or use a traditional for loop.
    my $one = $list[$index++];
    my $two = $list[$index++];
    ...
}
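
For completeness, the first option mentioned at the top (one line per iteration, remembering the previous line) might look roughly like the sketch below. The merge_pairs name and the in-memory filehandle are my own assumptions for illustration, and the chromosome/strand check uses plain string comparison rather than smart-matching.

```perl
use strict;
use warnings;

# Hypothetical helper: read data lines one at a time from a filehandle,
# remember the first line of each pair, and emit a merged row when the
# second line arrives.
sub merge_pairs {
    my ($fh) = @_;
    my ( $previous, @merged );
    while ( my $line = <$fh> ) {
        chomp $line;
        my %current;
        @current{ qw/ chromosome start end strand / } = split ' ', $line;

        if ( !defined $previous ) {
            $previous = \%current;    # first line of a pair: just remember it
            next;
        }

        die "Lines do not pair up: $line\n"
            if $previous->{chromosome} ne $current{chromosome}
            || $previous->{strand} ne $current{strand};

        # Start of the first line, end of the second line
        push @merged, join "\t", $previous->{chromosome}, $previous->{start},
            $current{end}, $previous->{strand};
        undef $previous;
    }
    warn "Ignoring unpaired last line\n" if defined $previous;
    return @merged;
}

# In-memory stand-in for the real input file
my $data = join '', map { "$_\n" }
    'chr start end strand', 'x 12 24 1', 'x 24 48 1',
    '1 100 124 -1', '1 124 148 -1';
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $header = <$fh>;    # skip the header line
print "$_\n" for merge_pairs($fh);
```

The loop body is a little busier than in the two-lines-at-once versions, which is the trade-off described at the start.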

Brian.

[0] http://perldoc.perl.org/functions/splice.html
[1] http://perldoc.perl.org/autodie.html
[2] http://perldoc.perl.org/functions/chomp.html
[3] http://perldoc.perl.org/perldata.html#Slices
[4] http://perldoc.perl.org/perlsyn.html#Switch-statements
[5] http://perldoc.perl.org/Tie/File.html
