Wijaya Edward wrote:
Hi,
I have two strings and I want to count the mismatches between them. The two strings are of the "same" size. Let's call them the 'source' string and the 'target' string. The problem is that the 'source' and 'target' strings may come in ambiguous form, meaning that a single position may contain more than 1 (up to 4) characters. An ambiguous position is marked with a square-bracketed region such as [ATCG]. The examples are as follows:

Example 1 (where the source is ambiguous):
my $source1 = '[TCG]GGGG[AT]'; # ambiguous
my $target1 = 'AGGGGC';        # mismatches = 2, at positions 1 and 6
my $target2 = 'TGGGGC';        # mismatches = 1, at position 6 only


Example 2 (where the source is NOT ambiguous):
my $source2 = 'TGGGGT'; # not ambiguous
my $target1 = 'AGGGGC'; # mismatches = 2, at positions 1 and 6
my $target3 = 'TGGGGT'; # mismatches = 0, all positions match


Example 3 (where both source and target are ambiguous):
my $source1 = '[TCG]GGGG[AT]'; # ambiguous
my $target1 = 'AGGGG[CT]';     # ambiguous, mismatches = 1, at position 1 only

I could, for example, use a bitwise operator to do it.
I have no problem dealing with Examples 1 and 2 above.
But I'm stuck on Example 3, where both source and target are ambiguous.
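
The bitwise idea mentioned above could look roughly like the following. This is only a sketch under assumptions that are not in the original post: the bit values in %BIT and the helper names to_masks and mismatches_by_mask are made up for illustration, the alphabet is limited to A, C, G and T, and both strings are assumed to have the same number of positions. Each position becomes a small bitmask, and two positions mismatch when their masks share no bit.

```perl
use strict;
use warnings;

# Hypothetical encoding: one bit per base.
my %BIT = (A => 1, C => 2, G => 4, T => 8);

# Turn a string such as 'AGGGG[CT]' into one bitmask per position.
sub to_masks {
    my ($seq) = @_;
    my @masks;
    for my $part ($seq =~ /(\[[ACGT]+\]|[ACGT])/g) {
        my $mask = 0;
        $mask |= $BIT{$_} for $part =~ /[ACGT]/g;   # OR in every base of this position
        push @masks, $mask;
    }
    return @masks;
}

# A position is a mismatch when the two masks have no base in common.
sub mismatches_by_mask {
    my ($source, $target) = @_;
    my @s = to_masks($source);
    my @t = to_masks($target);
    return scalar grep { !($s[$_] & $t[$_]) } 0 .. $#s;
}

print mismatches_by_mask('[TCG]GGGG[AT]', 'AGGGGC'), "\n";     # Example 1
print mismatches_by_mask('[TCG]GGGG[AT]', 'AGGGG[CT]'), "\n";  # Example 3
```

Because both sides are reduced to masks first, it makes no difference which of the two strings is ambiguous.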


Here is the current snippet I have, which doesn't do the job:

__BEGIN__
sub mismatches {
    my($source, $target) = @_;
    my @sparts = ($source =~ /(\[.*?\]|.)/g);
    my @tparts = ($target =~ /(\[.*?\]|.)/g);
    scalar grep $tparts[$_] !~ /^$sparts[$_]/, 0 .. $#sparts;
}
__END__

Where did I go wrong? I humbly seek advice.

Regards
Edward WIJAYA


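Drop the ^ and keep only the !~ negation. Outside the brackets the ^ is an anchor, so an ambiguous target part such as '[CT]' is tested at its leading '[' and never matches: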

scalar grep $tparts[$_] !~ /$sparts[$_]/, 0 .. $#sparts;
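
For reference, here is the one-line change dropped into the original sub and run against the three examples from the question. It is a sketch that assumes the parts contain only the letters A, C, G, T and square brackets, and that both strings have the same number of positions. Without the anchor, a source part such as [AT] can match any character of the target part, including a base inside the target's own brackets.

```perl
use strict;
use warnings;

sub mismatches {
    my ($source, $target) = @_;
    # Split each string into positions: a bracketed group or a single character.
    my @sparts = ($source =~ /(\[.*?\]|.)/g);
    my @tparts = ($target =~ /(\[.*?\]|.)/g);
    # Unanchored match: count positions where no base of the target part
    # is accepted by the source part.
    return scalar grep { $tparts[$_] !~ /$sparts[$_]/ } 0 .. $#sparts;
}

print mismatches('[TCG]GGGG[AT]', 'AGGGGC'), "\n";     # Example 1, target1: 2
print mismatches('[TCG]GGGG[AT]', 'TGGGGC'), "\n";     # Example 1, target2: 1
print mismatches('TGGGGT', 'AGGGGC'), "\n";            # Example 2, target1: 2
print mismatches('TGGGGT', 'TGGGGT'), "\n";            # Example 2, target3: 0
print mismatches('[TCG]GGGG[AT]', 'AGGGG[CT]'), "\n";  # Example 3: 1
```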

. . . from a grass shack on the island of Tongatapu

