On Jun 14, Praedor Atrebates said:
On Tuesday 14 June 2005 17:34, Jeff 'japhy' Pinyan wrote:
The [...] construct is a character class -- it represents a set of
characters, any of which can match. Thus, [KRH] matches a 'K', an 'R', or
an 'H'. But [A{3,}B{3,}] is really just the same as [AB3,{}] -- that is,
an 'A', a 'B', a '3', a ',', a '{', or a '}'. What you want is
$dnakmotif = qr/[KRH](?:L{3,}|V{3,}|I{3,}|F{3,}|Y{3,}|A{3,})[KRH]/;
That sounds like it should match what you're looking for.
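To make the character-class point concrete, here is a minimal sketch. The variable names $class and $runs are made up for illustration, and the patterns are anchored with ^ and $ only so that each test string has to match in full.

```perl
use strict;
use warnings;

# Inside [...], the quantifier syntax is just a list of literal characters.
my $class = qr/^[A{3,}B{3,}]$/;

# Each of these single characters is a member of the class.
for my $ch ('A', 'B', '3', ',', '{', '}') {
    printf "%s matches the class\n", $ch if $ch =~ $class;
}

# The class consumes exactly one character, so a run of A's fails here.
print "AAA does not match the class\n" unless 'AAA' =~ $class;

# Alternation inside a non-capturing group expresses "3 or more A's, or 3 or more B's".
my $runs = qr/^(?:A{3,}|B{3,})$/;
print "AAA matches the alternation\n" if 'AAA' =~ $runs;
```

The alternation form is the same idea as the (?:L{3,}|V{3,}|...) portion of the regex above, cut down to two letters.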
I may add one or two more amino acids to the middle portion as possibilities, but
for now that is the rule I am trying to get working. In what you offer, what
does the leading "qr/" mean, and what about the "?:L..."? None of the internal
letters MUST repeat, but they CAN. The middle could be 3, 4, or 5 different
characters from the list with no repeats, it could be entirely one character
repeated anywhere between 3 and 5 times, or it could be a combination of
repeats and singles.
The qr/.../ construct *creates* a compiled regex that you can then use
later. The inside of it is parsed like a regex (not like a normal quoted
string).
my $rx = qr/[KRH][LVIFAY]{3,5}[KRH]/;
if ($str =~ /$rx/) { print "matched\n" }
The (?:...) part of the regex quoted above is a grouping construct that
does not capture to a $DIGIT variable.
/abc(def|ghi)jkl/
captures 'def' or 'ghi' (whichever matched) to $1, but
/abc(?:def|ghi)jkl/
does not capture anything.
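A small sketch of the difference, using the two patterns above. The test string and the variable names $captured and $not_captured are invented for the example.

```perl
use strict;
use warnings;

my $text = 'abcdefjkl';

# Capturing group: the alternative that matched is stored in $1.
my $captured = $text =~ /abc(def|ghi)jkl/ ? $1 : undef;

# Non-capturing group: the match still succeeds, but the pattern has no
# capture groups, so $1 is undefined afterwards.
my $not_captured = $text =~ /abc(?:def|ghi)jkl/ ? $1 : undef;

print "capturing:     ", (defined $captured     ? $captured     : 'nothing'), "\n";
print "non-capturing: ", (defined $not_captured ? $not_captured : 'nothing'), "\n";
```

Use (?:...) when you only need grouping (for alternation or a quantifier) and do not want to shift the numbering of the groups you do care about.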
I'd say you want to go ahead and use
qr/[KRH][LVIFAY]{3,5}[KRH]/;
for now, until you can come up with a more complex definition of the
interior of your sequence.
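As a rough sketch of how that might be used, the following tries the regex against a few sequences. The helper find_motif and the sample strings are made up for illustration and are not real data.

```perl
use strict;
use warnings;

my $dnakmotif = qr/[KRH][LVIFAY]{3,5}[KRH]/;

# Hypothetical helper: returns the first matching stretch, or undef if none.
sub find_motif {
    my ($seq) = @_;
    return $seq =~ /($dnakmotif)/ ? $1 : undef;
}

# Invented test strings.
my @samples = (
    'MKLLLRT',     # one residue repeated three times between K and R
    'GHLVIFAKQ',   # five different residues between H and K
    'AKLVRT',      # only two interior residues
    'RLLLLLLK',    # six interior residues
);

for my $seq (@samples) {
    my $hit = find_motif($seq);
    printf "%-10s %s\n", $seq, defined $hit ? "motif: $hit" : 'no motif';
}
```

Because [LVIFAY]{3,5} accepts any mix of those letters, it covers repeats, singles, and combinations alike. The last two samples fail because the interior is shorter than 3 or longer than 5 residues.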
--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
http://japhy.perlmonk.org/ % have long ago been overpaid?
http://www.perlmonks.org/ % -- Meister Eckhart