I am attempting to do a "fuzzy match" with the String::Approx (v.3) module, with very limited success. I am working with a biological genome sequence: a 30136242-character string (which I load into $seq) in which each character is A, T, G or C (or, more rarely, N to denote that it could be any of A, T, G or C). I then want to match a 15-20 character string against this 30136242-character string.
I have written the code below, but it generally stops after finding only one hit, when I know there are more in there. The aindex and aslice functions do not seem to take an offset, so I am having to alter the search string myself in order to progress along it. From the documentation I expected aslice to return a two-element list, which would be placed into $index and $size; instead I seem to get an array reference returned into $index, and $size is left undefined.

Any help / advice on this would be greatly appreciated.

Cheers
Paul

    #!/usr/bin/perl -w
    use String::Approx qw(amatch aindex aslice);    #Fuzzy matching

    die "Syntax: primerSearch Chromosome_number, Number_Point_mutations, Primer_Sequence" if (@ARGV != 3);

    open (CHR,"<chromo$ARGV[0]_pseudo_v080501.seq");
    $seq = <CHR>;
    close (CHR);

    $a = $ARGV[2];

    # Reverse sequence
    my ($ra) =&rev($a);

    my $addf = 0;
    my $indx;
    my $flag;

    do {
        undef $indx;
        undef $flag;
        my ($index, $size) = aslice($a, ["$ARGV[1]"], $seq);
        while ( $indx = shift(@$index)) {
            $flag = 1;
            my $sizx = shift(@$index);
            my $sq = substr($seq,$indx,$sizx);
            print ("\t" , $indx+$addf , "\t($sizx)\tSeq: $sq\n");
            $addf += ($indx + 1);
            $seq = substr($seq,$indx,length($seq));
        }
    } while ( defined $flag );

    sub rev {
        my $reversed_seq = reverse $_[0];
        $reversed_seq =~ tr/ATGC/TACG/;
        return $reversed_seq;
    }
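For illustration, the intended behaviour (report every hit together with its offset in the full sequence) can be sketched without String::Approx at all. The sketch below is a swapped-in technique, not the module's API: a naive sliding-window scan that counts substitutions only, so insertions and deletions are not detected, and an N in the sequence simply counts as a mismatch. The sub name find_hits and the short test sequence are hypothetical, and a pure-Perl loop like this may be slow on a 30136242-character string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (NOT part of String::Approx): return every position
# at which $primer matches $seq with at most $max_mismatch substitutions.
# Each hit is an array ref: [ offset, matched text, mismatch count ].
sub find_hits {
    my ($seq, $primer, $max_mismatch) = @_;
    my $len = length $primer;
    my @hits;
    for my $pos (0 .. length($seq) - $len) {
        my $window = substr($seq, $pos, $len);
        # XOR of two equal-length strings gives "\0" wherever the
        # characters agree, so counting the non-NUL bytes gives the
        # number of mismatching positions.
        my $mismatches = (($window ^ $primer) =~ tr/\0//c);
        push @hits, [ $pos, $window, $mismatches ]
            if $mismatches <= $max_mismatch;
    }
    return @hits;
}

# Toy data for demonstration only; the real input would be the
# chromosome sequence and the primer from the command line.
my $demo_seq = "ACGTACGTTTACGAACGT";
for my $hit (find_hits($demo_seq, "ACGT", 1)) {
    my ($pos, $match, $mm) = @$hit;
    print "\t$pos\t(", length($match), ")\tSeq: $match\tmismatches: $mm\n";
}
```

Because the scan keeps the sequence intact and walks an explicit position counter, the offsets it reports are already relative to the start of the full string, so no running correction such as $addf is needed.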