Hi, I have a long sequence of letters (an amino acid sequence). I want to extract the 4 letters on either side of each S and get them into an array, e.g.,
ADFGTREDSWQACVDFRSSSGHYT would give TREDSWQAC, VDFRSSSGH, DFRSSSGHY, etc. I have worked out how to do this using substr(), but wondered if there is a more elegant way using regexps. I tried:

@peptides = $sequence =~ /(\w{4}S\w{4})/g;

This works up to a point, but if there are two adjacent 'S' the second one is not extracted, I guess because the regexp engine continues from the end of the previous match, i.e., it doesn't extract DFRSSSGHY above. Is it possible to start the next match attempt from within the previous match to remedy this?

Thanks for any tips or flashes of inspiration,
Richard
--
Dr Richard Adams
University of Edinburgh
Kings Buildings, Mayfield Rd, Edinburgh
Email [EMAIL PROTECTED]
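
For reference, here is a minimal sketch of the behaviour described above, set beside one possible workaround. It assumes that a zero-width lookahead with a capture group inside is acceptable here. The variable name @consumed is made up for illustration; $sequence and @peptides are the names used in the post.

```perl
use strict;
use warnings;

my $sequence = 'ADFGTREDSWQACVDFRSSSGHYT';

# Attempt from the post: each match consumes nine characters, so the next
# search starts after the previous match and overlapping windows are skipped.
my @consumed = $sequence =~ /(\w{4}S\w{4})/g;

# Lookahead variant: (?=...) matches without consuming any characters, so
# the engine retries at every position, while the capture group inside the
# lookahead still records the nine-letter window.
my @peptides = $sequence =~ /(?=(\w{4}S\w{4}))/g;

print "consuming: @consumed\n";   # TREDSWQAC VDFRSSSGH
print "lookahead: @peptides\n";   # TREDSWQAC VDFRSSSGH DFRSSSGHY FRSSSGHYT
```

Both forms skip any S with fewer than four letters on either side, since \w{4} has to match in full before and after it.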