On 21/10/2011 00:32, nac wrote:
> HI again,
> I have corrected myself a bit, I think the script is now giving me what I
> want, having said that, I guess it is not the best way ( even if there is
> more than one way), again any pointer are welcome.
> many thanks
> Nat

Hi Nat

> #!/usr/bin/perl
> use strict;
> use warnings;
> my @seq;
> @seq=qw{^TGGCAGTGGAGG ^TGTCTGGCAGTG ^TG....GCAGTG TCTGTCTG TCTGGCAG
> GCAGTGGA TGTCTGGC ^TGTCTGGC ^..TCTGGCAGTG ^TGTCTGGCAGTG ^TGCATGGC};
>
> open (IN,"</Users/nac/Desktop/example.fastq") or die
> "can't open in:$!";
> open
> (OUT,">>//Users/nac/Desktop/example.fastq\_class_COUNTED2.txt")
> or die "can't open out: $!";

It is better to use lexical filehandles and the three-parameter form of
open, so:

    open my $in, '<', '/Users/nac/Desktop/example.fastq'
        or die "can't open in: $!";
    open my $out, '>>', '//Users/nac/Desktop/example.fastq_class_COUNTED2.txt'
        or die "can't open out: $!";

(Is the double-slash in the output filename correct? I have also dropped
the backslash before the underscore, since inside single quotes it would
be kept as a literal backslash in the filename.)

> my %final_hash;
> while (<IN>) {
>
>
>     if (/^A|T|G|C/){

This matches an A at the start of the string, or a T, G, or C anywhere in
the string, which is presumably not what you want. Also, you can save an
indentation level using next:

    next unless /^(?:A|T|G|C)/;

>         print my $seq_line=$_;
>         foreach my $ff (@seq){
>             if ($seq_line =~ /$ff/g){

The /g is unnecessary and will give you incorrect results. In scalar
context, as here, it makes each search on $seq_line start where the
previous successful match left off, even though the pattern has changed,
so later patterns can miss matches nearer the start of the line. Just

    if ($seq_line =~ /$ff/) {

is correct.

>                 if (!exists $final_hash{$ff}) {
>                     $final_hash{$ff}=1;
>                 } else {
>                     $final_hash {$ff}++;
>                 }

As John described, all you need here is

    $final_hash{$ff}++;

as Perl will autovivify a non-existent hash element, treating it as zero,
before incrementing it.

>             }
>         }
>     }
> }
> for my $key (sort {$final_hash {$b}<=> $final_hash {$a}}keys
> %final_hash){
>     my $value=$final_hash{$key};
>     print OUT $key,"\t",$value, "\n";

More easily written as

    print $out "$key\t$value\n";

> }

The changes as a whole look like the program below.

HTH,

Rob

    use strict;
    use warnings;

    my @seq;
    @seq = qw{
      ^TGGCAGTGGAGG ^TGTCTGGCAGTG ^TG....GCAGTG TCTGTCTG TCTGGCAG GCAGTGGA
      TGTCTGGC ^TGTCTGGC ^..TCTGGCAGTG ^TGTCTGGCAGTG ^TGCATGGC
    };

    open my $in, '<', '/Users/nac/Desktop/example.fastq'
        or die "can't open in: $!";
    open my $out, '>>', '//Users/nac/Desktop/example.fastq_class_COUNTED2.txt'
        or die "can't open out: $!";

    my %final_hash;

    while (<$in>) {
      # Only look at lines that start with a base
      next unless /^(?:A|T|G|C)/;
      print my $seq_line = $_;
      foreach my $ff (@seq) {
        $final_hash{$ff}++ if $seq_line =~ /$ff/;
      }
    }

    # Write the counts, most frequent pattern first
    for my $key (sort { $final_hash{$b} <=> $final_hash{$a} } keys %final_hash) {
      my $value = $final_hash{$key};
      print $out "$key\t$value\n";
    }
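
P.S. If it helps to see the two regex points in isolation, here is a small sketch. The read, the pattern list, the header line, and the count_hits helper are all made up for illustration and are not part of your script. It shows how /g in scalar context makes a later pattern miss a match nearer the start of the string, and how the unanchored alternation lets a non-sequence line through.

```perl
use strict;
use warnings;

# Hypothetical read and patterns, invented for this illustration only.
my $read     = 'TGTCTGGCAGTG';
my @patterns = qw( GCAGTG TGTCTGGC );

# Count how many patterns match $line, with or without the /g modifier.
sub count_hits {
    my ($line, $use_g, @pats) = @_;
    my $hits = 0;
    for my $pat (@pats) {
        if ($use_g) {
            # Scalar-context /g resumes from pos($line), which the
            # previous successful match left at the end of that match.
            $hits++ if $line =~ /$pat/g;
        }
        else {
            # Without /g every search starts at the beginning again.
            $hits++ if $line =~ /$pat/;
        }
    }
    return $hits;
}

print "with /g:    ", count_hits($read, 1, @patterns), "\n";   # 1
print "without /g: ", count_hits($read, 0, @patterns), "\n";   # 2

# A made-up non-sequence line that happens to contain a capital T.
my $header = '+read1 quality header with a T in it';

# /^A|T|G|C/ means (^A) or T or G or C, so the T anywhere is enough.
my $loose  = $header =~ /^A|T|G|C/     ? 1 : 0;
# Grouping the alternation applies the anchor to all four letters.
my $strict = $header =~ /^(?:A|T|G|C)/ ? 1 : 0;

print "loose pattern matches header:  $loose\n";    # 1
print "strict pattern matches header: $strict\n";   # 0
```

Both patterns occur in the sample read, but with /g the second search starts after the end of the first match and so finds nothing. A character class, /^[ATGC]/, would be an equivalent and slightly shorter way to write the anchored test.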