On 21/10/2011 00:32, nac wrote:
> HI again,
> I have corrected myself a bit, I think the script is now giving me what I
> want, having said that, I guess it is not the best way ( even if there is
> more than one way), again any pointer are welcome.
> many thanks
> Nat

Hi Nat

> #!/usr/bin/perl
> use strict;
> use warnings;
> my @seq;
> @seq=qw{^TGGCAGTGGAGG ^TGTCTGGCAGTG ^TG....GCAGTG TCTGTCTG TCTGGCAG
> GCAGTGGA TGTCTGGC ^TGTCTGGC ^..TCTGGCAGTG ^TGTCTGGCAGTG ^TGCATGGC};
> 
> open (IN,"</Users/nac/Desktop/example.fastq") or die
> "can't open in:$!";
> open
> (OUT,">>//Users/nac/Desktop/example.fastq\_class_COUNTED2.txt")
> or die "can't open out: $!";

It is better to use lexical filehandles, and the three-parameter form 
of open, so:

open my $in, '<', '/Users/nac/Desktop/example.fastq' or die "can't open in: $!";
open my $out, '>>', '//Users/nac/Desktop/example.fastq_class_COUNTED2.txt' or
die "can't open out: $!";

(Is the double-slash in the output filename correct?)
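
To show the pattern in isolation, here is a minimal sketch of lexical filehandles with three-argument open. It assumes a scratch file created with File::Temp, which is not part of your script and is only there so the example runs without touching your data.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file; UNLINK => 1 removes it when the program exits.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Three-argument open keeps the mode separate from the filename, so
# characters such as '<' or '>' in a name cannot change how it is opened.
open my $out, '>', $path or die "can't open $path for writing: $!";
print $out "TGGCAGTGGAGG\n";
close $out or die "can't close $path: $!";

# A lexical handle such as $in is closed automatically when it goes out
# of scope, unlike a bareword handle such as IN, which is global.
open my $in, '<', $path or die "can't open $path for reading: $!";
my $first_line = <$in>;
close $in;

print $first_line;
```

Note that "or" is used rather than "||". With "||" the die would bind to the filename string instead of to the result of open.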

> my %final_hash;
>        while (<IN>) {
> 
> 
>     if (/^A|T|G|C/){

This will search for A at the start of the string, or T, G, or C
anywhere in the string, which is presumably not what you want. Also you
can save an indentation level using next:

  next unless /^(?:A|T|G|C)/;
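
A small sketch of the difference, using made-up lines in the style of a FASTQ record (the data is invented for illustration). The character class /^[ATGC]/ is an equivalent and shorter way to write the grouped alternation.

```perl
use strict;
use warnings;

# Invented sample lines: a header, a sequence, a separator, a quality
# string, and a line that contains T and G but does not start with them.
my @lines = ('@read1 GC', 'TGGCAGTG', '+', 'IIIIIIII', 'NNTG');

my @loose  = grep { /^A|T|G|C/ }     @lines;  # ^ binds to A only
my @strict = grep { /^(?:A|T|G|C)/ } @lines;  # ^ applies to every alternative
my @class  = grep { /^[ATGC]/ }      @lines;  # equivalent character class

print "loose:  @loose\n";
print "strict: @strict\n";
print "class:  @class\n";
```

The ungrouped pattern also accepts the header and the NNTG line, because each contains a T, G or C somewhere; the grouped pattern and the character class accept only the line that starts with one of the four letters.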

> print my $seq_line=$_;
> foreach my $ff (@seq){
>       if ($seq_line =~ /$ff/g){

The /g is unnecessary and will give you incorrect results. In scalar
context, as here, it makes each search of $seq_line start where the
previous successful match on that string left off, so a later pattern
can miss a match that occurs earlier in the line. Just

  if ($seq_line =~ /$ff/) {

is correct.
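
Here is a minimal sketch of the effect, with a sequence and two patterns chosen for illustration so that both occur in the string. The variable names are mine, not from your script.

```perl
use strict;
use warnings;

my $seq_line = "TGTCTGGCAGTG";
my @patterns = ('GGCAG', '^TGTC');   # both occur in $seq_line

# With /g in scalar context, a successful match records pos($seq_line),
# and the next /g match on the same string resumes from there.
my @with_g;
for my $p (@patterns) {
    push @with_g, $p if $seq_line =~ /$p/g;
}

pos($seq_line) = undef;   # make sure the second run starts clean

# Without /g, every match starts at the beginning of the string.
my @without_g;
for my $p (@patterns) {
    push @without_g, $p if $seq_line =~ /$p/;
}

print "with /g:    @with_g\n";
print "without /g: @without_g\n";
```

With /g, only GGCAG is counted: the search for ^TGTC resumes after the end of the first match, where the start-of-string anchor can no longer succeed. Without /g, both patterns are counted.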

>          if (!exists $final_hash{$ff}) {
>              $final_hash{$ff}=1;
>          } else {
>          $final_hash {$ff}++;
>                        }

As John described, all you need here is

  $final_hash {$ff}++;

as Perl will create a non-existent hash element on first use and treat
its undefined value as zero when incrementing it, without a warning.
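
A short sketch of counting this way, with an invented list of motifs standing in for your matches:

```perl
use strict;
use warnings;

my %count;

# Invented input: one motif appears twice, one appears once.
for my $motif (qw(TCTGGCAG GCAGTGGA TCTGGCAG)) {
    $count{$motif}++;   # no exists() check needed, even under warnings
}

print "$_\t$count{$_}\n" for sort keys %count;
```

The increment operator is exempt from the "uninitialized value" warning, so this runs cleanly under "use warnings".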

>      }
>      }
> }
> }
> for my $key (sort {$final_hash {$b}<=>  $final_hash {$a}}keys
> %final_hash){
>      my $value=$final_hash{$key};
>       print OUT $key,"\t",$value, "\n";

More easily written as

  print $out "$key\t$value\n";

> }

The changes as a whole look like the program below.

HTH,

Rob


use strict;
use warnings;

my @seq;
@seq = qw{
  ^TGGCAGTGGAGG   ^TGTCTGGCAGTG   ^TG....GCAGTG   TCTGTCTG
  TCTGGCAG        GCAGTGGA        TGTCTGGC        ^TGTCTGGC
  ^..TCTGGCAGTG   ^TGTCTGGCAGTG   ^TGCATGGC
};

open my $in, '<', '/Users/nac/Desktop/example.fastq' or die "can't open in: $!";
open my $out, '>>', '//Users/nac/Desktop/example.fastq_class_COUNTED2.txt' or
die "can't open out: $!";

my %final_hash;

while (<$in>) {
  next unless /^(?:A|T|G|C)/;
  print my $seq_line = $_;
  foreach my $ff (@seq){
    $final_hash {$ff}++ if $seq_line =~ /$ff/;
  }
}

for my $key (sort {$final_hash {$b} <=> $final_hash {$a}}keys %final_hash) { 
  my $value=$final_hash{$key};
  print $out "$key\t$value\n";
}


