Re: Position Weight Matrix of Set of Strings with Perl

Mumia W. Wed, 06 Sep 2006 04:05:33 -0700

On 09/06/2006 04:02 AM, Wijaya Edward wrote:

Dear Experts,
I am looking for a really efficient way to compute a position weight matrix (PWM) from a set of strings. Within each set, all strings have the same length. Basically, a PWM gives the frequency (or probability) with which each base [ATCG] occurs at each position/column of the strings. For example, the set of strings below (note that the strings in a set may be longer than 3):

AAA
ATG
TTT
GTC

would give the following result:

$VAR1 = {
            'A' => [2,1,1],
            'T' => [1,3,1],
            'C' => [0,0,1],
            'G' => [1,0,1]
         };
So the size of each array is the same as the length of the strings. In my case I need a variation of it, namely the probability of each base occurring at a particular position:
$VAR =     {
            'A' => ['0.5','0.25','0.25'],
            'T' => ['0.25','0.75','0.25'],
            'C' => ['0','0','0.25'],
            'G' => ['0.25','0','0.25']
          }
At the link below you can find my incredibly naive and inefficient code. Can anybody suggest a better and faster solution than this:

http://www.rafb.net/paste/results/c6T7B629.html

Thanks and Regards,
Edward WIJAYA
SINGAPORE

Although I'm sure that smarter posters than I will turn this into a one-liner, I think that my solution is not so atrocious:


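use strict;
use warnings;
use Data::Dumper;
local our @deep;   # package array, aliased to each row below via its glob
local $; = ','; # A vestige of a previous version

my @data = qw(AAA ATG TTT GTC);
# Split each string into an array of single characters.
my @d2 = map [ split // ], @data;

my (%hash);
for my $entry (@d2) {
    *deep = $entry;   # make @deep an alias for the current row
    for my $nx (0..$#deep) {
        # Count the base found at position $nx.
        $hash{$deep[$nx]}[$nx]++;
    }
}
# Replace the undef gaps left by autovivification with 0.
foreach my $entry (values %hash) {
    $entry = [ map defined $_ ? $_ : 0, @$entry ];
}
print Dumper(\%hash);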
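
The loop above stops at raw counts, and each row is only as long as the last position at which its base occurs. For the probabilities asked for in the original question, a minimal sketch might look like the following. The sub name pwm_probabilities is made up for illustration, and the sketch assumes the input contains only A, C, G and T (it dies on anything else) and that all strings have the same length.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns a hash ref
# mapping each base to an array of per-position probabilities.
sub pwm_probabilities {
    my @seqs = @_;
    return {} unless @seqs;
    my $len = length $seqs[0];

    # Pre-fill every base with zeros so each row has exactly $len entries.
    my %pwm = map { $_ => [ (0) x $len ] } qw(A C G T);

    for my $seq (@seqs) {
        die "sequence '$seq' is not $len characters long\n"
            unless length($seq) == $len;
        for my $i (0 .. $len - 1) {
            my $base = substr $seq, $i, 1;
            die "unexpected character '$base'\n" unless exists $pwm{$base};
            $pwm{$base}[$i]++;
        }
    }

    # Turn counts into probabilities by dividing by the number of sequences.
    my $n = @seqs;
    for my $row (values %pwm) {
        $_ /= $n for @$row;
    }
    return \%pwm;
}

my $pwm = pwm_probabilities(qw(AAA ATG TTT GTC));
printf "%s: %s\n", $_, join ' ', @{ $pwm->{$_} } for sort keys %$pwm;
# prints:
# A: 0.5 0.25 0.25
# C: 0 0 0.25
# G: 0.25 0 0.25
# T: 0.25 0.75 0.25
```

Pre-filling the rows avoids the separate pass that replaces undef with 0, and substr avoids building an intermediate array per string; whether this is actually faster on large inputs would need to be measured.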

__HTH__


