On Thu, May 12, 2011 at 6:23 AM, Nathalie Conte <n...@sanger.ac.uk> wrote:
> I have this script to split and iterate over each line, but I don't know
> how to group 2 lines together, and take the start of the first line and the
> end of the second line? Could you please advise? Thanks
>

There are two main options for handling two lines at a time: process one line
at a time, keeping the previous line in memory and using a conditional inside
the loop to decide what to do, or fetch both lines you need up front so the
loop body stays simple.

What you are doing right now (reading the entire file into an array) makes the
latter somewhat simpler, but with a big enough file it won't be viable and
you'll have to modify the script. Still, here's the nice and easy way to do it
with what you already have, using splice[0]. Just for completeness, I also did
a small mock-up of the body of the loop, so the proggy also uses autodie[1],
chomp LIST[2], hash slices[3], and smart-matching[4]:

    use strict;
    use warnings;
    use 5.010;
    use autodie;

    my $header = <DATA>;    # Discard the column names
    my @list   = <DATA>;    # Slurp the remaining lines

    open my $OUTFILE, '>', "grouped.txt";

    chomp @list;

    # splice removes two lines from the front of @list on every pass;
    # the loop ends once @list is empty.
    while ( my ($one, $two) = splice @list, 0, 2 ) {
        my (%first_line, %second_line);

        @first_line{  qw/ chromosome start end strand / } = split /\s+/, $one;
        @second_line{ qw/ chromosome start end strand / } = split /\s+/, $two;

        # Both lines of a pair must share chromosome and strand
        if ( @first_line{ qw/ chromosome strand / } ~~ @second_line{ qw/ chromosome strand / } ) {
            say { $OUTFILE } join "\t", $first_line{chromosome},
                $first_line{start}, $second_line{end}, $first_line{strand};
        }
        else {
            die "Something weird is going on";
        }
    }

    __DATA__
    chr start end strand
    x 12 24 1
    x 24 48 1
    1 100 124 -1
    1 124 148 -1

(Without gmail's screwy indentation: http://ideone.com/SnRKp)

But reading an entire file into memory isn't advisable. So you either need to
drop that @list array and change the condition in the while loop to something
like this:

    # Read two lines per pass and stop at end of file. The defined() tests
    # matter: a plain list assignment of two reads would always be true.
    while ( defined( my $one = <DATA> ) and defined( my $two = <DATA> ) ) { # Or put that inside a function
        ...
    }

Or you keep the @list array, but do some magic with Tie::File[5]:

    use Tie::File;

    tie my @list, 'Tie::File', $file or die ...;

    my $index = 1; # To skip the header
    while ( $index < $#list ) { # Or use a traditional for loop.
        my $one = $list[$index++];
        my $two = $list[$index++];
        ...
    }

Brian.

[0] http://perldoc.perl.org/functions/splice.html
[1] http://perldoc.perl.org/autodie.html
[2] http://perldoc.perl.org/functions/chomp.html
[3] http://perldoc.perl.org/perldata.html#Slices
[4] http://perldoc.perl.org/perlsyn.html#Switch-statements
[5] http://perldoc.perl.org/Tie/File.html
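
The first option mentioned above (one line at a time, remembering the previous
line) is described but not shown in code. The following is only a rough sketch
of how it might look, not the approach from the script above. The sub name
group_pairs and the in-memory sample are made up for illustration, and the
column layout (chromosome, start, end, strand) is assumed to match the sample
data in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): read a filehandle one
# line at a time and merge each consecutive pair of rows into a single row.
sub group_pairs {
    my ($fh) = @_;
    my @merged;
    my $previous;    # first row of the current pair, undef between pairs

    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line =~ /^\s*$/;    # ignore blank lines

        my %row;
        @row{ qw/ chromosome start end strand / } = split ' ', $line;

        if ( !defined $previous ) {
            $previous = \%row;       # remember it and wait for its partner
            next;
        }

        die "Line $.: chromosome or strand differs from the previous line\n"
            if $previous->{chromosome} ne $row{chromosome}
            or $previous->{strand} ne $row{strand};

        # Start of the first row, end of the second row
        push @merged, join "\t", $previous->{chromosome},
            $previous->{start}, $row{end}, $previous->{strand};
        undef $previous;
    }

    warn "Last line has no partner and was skipped\n" if defined $previous;
    return @merged;
}

# Usage: an in-memory filehandle stands in for the real input file.
my $sample = "chr start end strand\n"
           . "x 12 24 1\nx 24 48 1\n1 100 124 -1\n1 124 148 -1\n";
open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
my $header = <$in>;    # discard the column names
print "$_\n" for group_pairs($in);
```

Only one parsed row is held between iterations, so memory use does not grow
with the size of the input (apart from the collected output, which could be
printed directly instead of pushed onto an array).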