Dear list:

I wrote a script that takes a list of ids from an input file and stores them in an array as pairs (for a list of n ids this gives up to n(n+1)/2 pairs, since each id is also paired with itself). For each pair of ids I need to extract a certain value from a huge file that contains the pair of ids and the value (format of the file: col1 col2 id1 id2 value).

The script works, but it takes too long, especially because the second file is very big (more than 600 MB). I would like to speed the script up, but I haven't quite worked out the best way to do it. Any tips?

Thanks in advance,
L. Pardo

PS: I am attaching the script.
#!/usr/local/bin/perl
use strict;
use warnings;

my %nuc = ();
my @arr = ();

# Read the id list: field 3 is the id, field 1 is the value compared below.
open(SN, "file_22.txt") || die "cannot open file: $!\n";
my @a2 = (<SN>);
close SN;
chomp(@a2);
for (my $m = 0; $m <= $#a2; $m++) {
    my @temp0 = split /\s+/, $a2[$m];
    $nuc{$temp0[3]} = $temp0[1];
    push(@arr, $temp0[3]);
}
print "$#arr\n";

# Write every pair (i, n) with n >= i among the first 51 ids
# that passes the 200000 test.
my @couple = ();
open(OUT, ">>couple.txt") || die "can not write on it: $!\n";
for (my $i = 0; $i <= 50; $i++) {
    for (my $n = $i; $n <= 50; $n++) {
        if ($nuc{$arr[$i]} >= $nuc{$arr[$n]} - 200000) {
            push(@couple, $arr[$i], $arr[$n]);
            print OUT "@couple\n";
        }
        @couple = ();
    }
}
# Flush the pairs to disk before couple.txt is read back below.
close OUT;

###### SECOND_PART #############
open(LL, "top_22.txt") or die "cannot do it : $!\n";
print `date`;
my @a3 = (<LL>);    # slurps the whole file into memory
print `date`;
open(COUPLE, "couple.txt") or die " can not open file :$!\n";
my @a4 = (<COUPLE>);
open(OUT1, ">Not_found.txt") or die "cannot write Not_found.txt: $!\n";
open(OUT2, ">Pairwise.txt") or die "cannot write Pairwise.txt: $!\n";    # opened but never written to
for (my $y = 0; $y <= $#a4; $y = $y + 1) {
    my @temp1 = split /\s+/, $a4[$y];
    print "OK $temp1[0],$temp1[1]\n";
    my $found = 0;
    # Scan every line of the big file for this pair, in either order.
    for (my $x = 0; $x <= $#a3; $x++) {
        my @temp2 = split /\s+/, $a3[$x];
        if (   $temp1[0] eq $temp2[3] && $temp1[1] eq $temp2[4]
            || $temp1[0] eq $temp2[4] && $temp1[1] eq $temp2[3]) {
            print "$temp2[3], $temp2[4],$temp2[5],$temp2[6]\n";
            $found = 1;
        }
    }
    # Record the pair as missing only if no line matched at all.
    print OUT1 "$temp1[0], $temp1[1]\n" unless $found;
}
#system("gzip Not_found.txt");
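
One possible direction for the speed problem, offered as a sketch and not as a drop-in replacement: the second part of the script rescans and re-splits the whole 600 MB file once for every pair. Loading the much smaller pair list into a hash and reading the big file a single time, line by line, avoids both the repeated scans and holding the big file in memory. The sub names (pair_key, load_pairs, scan_values) and the two command-line arguments are hypothetical. The sketch also assumes that both files are plain whitespace-separated columns with no leading empty field, so that id1, id2 and value are fields 2, 3 and 4 of "col1 col2 id1 id2 value"; the field numbers would need adjusting if the real files differ.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Build an order-independent key so (a,b) and (b,a) hit the same hash entry.
sub pair_key {
    my ($id1, $id2) = @_;
    return join "\t", sort { $a cmp $b } ($id1, $id2);
}

# Read the wanted pairs ("id1 id2" per line) into a hash.
# The value counts how many matching lines were seen (0 = not found yet).
sub load_pairs {
    my ($fh) = @_;
    my %wanted;
    while (my $line = <$fh>) {
        my ($id1, $id2) = split ' ', $line;
        next unless defined $id2;    # skip blank or short lines
        $wanted{ pair_key($id1, $id2) } = 0;
    }
    return \%wanted;
}

# Stream the big file once. ASSUMPTION: columns are "col1 col2 id1 id2 value".
sub scan_values {
    my ($fh, $wanted) = @_;
    my @found;
    while (my $line = <$fh>) {
        my (undef, undef, $id1, $id2, $value) = split ' ', $line;
        next unless defined $value;    # skip malformed lines
        my $key = pair_key($id1, $id2);
        next unless exists $wanted->{$key};
        $wanted->{$key}++;
        push @found, [$id1, $id2, $value];
    }
    return \@found;
}

# Only runs when called with two file names: pair list, then big file.
if (@ARGV == 2) {
    my ($pairs_file, $big_file) = @ARGV;

    open(my $pairs_fh, '<', $pairs_file) or die "cannot open $pairs_file: $!\n";
    my $wanted = load_pairs($pairs_fh);
    close $pairs_fh;

    open(my $big_fh, '<', $big_file) or die "cannot open $big_file: $!\n";
    my $found = scan_values($big_fh, $wanted);
    close $big_fh;

    print join(", ", @$_), "\n" for @$found;

    # Pairs whose counter is still 0 never appeared in the big file.
    print STDERR "not found: $_\n"
        for grep { $wanted->{$_} == 0 } sort keys %$wanted;
}
```

Saved under a hypothetical name such as lookup.pl, this would be run as "perl lookup.pl couple.txt top_22.txt". The work becomes one hash lookup per line of the big file instead of one full scan per pair, and only the pair list has to fit in memory.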