Henry Todd wrote:
>
> I'm having trouble counting the number of specific substrings within a
> string. I'm working on a bioinformatics coursework at the moment, so my
> string looks like this:
>
> $sequence = "caggaacttcccctcggaagaccatgta";
>
> I want to count the number of occurrences of each pair of letters, for example:
>
> Number of occurrences of "aa"
> Number of occurrences of "gt"
> Number of occurrences of "cc"
>
> This is how I'm counting the number of "cc" pairs at the moment ($cc is
> my counter variable):
>
> $cc++ while $sequence =~ /cc/gi;
>
> But this only matches the literal string "cc", so if, as it scans
> $sequence, it finds "cccc" it's only counting it once instead of three
> times.
>
> What pattern do I need to be looking for in the $sequence if I want to
> count *all* occurrences of "cc" -- even if they overlap?

Use a zero-width positive look-ahead assertion for the second repeated
character. The look-ahead checks that a "c" follows without consuming it,
so the next match can start on that same character.

$ perl -le'
my $string = q/cccc/;
my $count = () = $string =~ /c(?=c)/g;
print $count;
'
3

John
-- 
use Perl;
program fulfillment
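
Since the question asks for the count of each pair of letters, the same look-ahead idea can be stretched to tally every pair in one pass. What follows is only a rough sketch, not part of the reply above: count_pairs is a made-up helper name, and folding the pairs to lowercase is an assumption based on the /i flag in the original attempt.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): returns a hash
# reference mapping each two-letter pair to its number of occurrences,
# overlapping occurrences included.
sub count_pairs {
    my ($sequence) = @_;
    my %count;
    # (?=(..)) matches at a position without consuming anything and
    # captures the next two characters. After each zero-length match
    # the regex engine moves on one position, so every overlapping
    # pair is visited exactly once.
    $count{ lc $1 }++ while $sequence =~ /(?=(..))/g;
    return \%count;
}

my $pairs = count_pairs("caggaacttcccctcggaagaccatgta");
printf "%s: %d\n", $_, $pairs->{$_} for sort keys %$pairs;
```

For the string "cccc" this gives a count of 3 for "cc", which agrees with the one-liner. Pairs that never occur are simply absent from the hash, so look them up with something like ($pairs->{aa} // 0) if a zero is wanted.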