Note that the multi-digit version above tolerates missing digits: if the number is missing after the '+/-', it doesn't skip any letters.
Brief explanation of the multi-digit version: +/- are converted to spaces and used to split the string into sections. The split process effectively swallows the +/- characters.

The complication of multi-digits is that you need to skip the (possibly multiple) digits, which adds another stage to the calculation.

In: +3ACG. -> .
you skip 1 + 3 characters: 1 for the digit, 3 for the following letters as specified by the digit 3.

In: -11ACGACGACGACG. -> G.
you skip 2 + 11 characters: the 2 digits in "11" and the 11 letters following.

And incidentally in: +ACG. -> ACG.
there's no digit, so you skip 0 digits + 0 letters.

Having split on +/- using .translate() and .split(), I use takewhile to separate the zero or more digits from the following letters. If takewhile doesn't find any digits at the start of the section, list(takewhile(...)) gives the empty list []. The "if r else 0" test maps an empty list to a number of 0, so takewhile and that test cover the no-digit case between them. If a lack of digits is a data error then it would be easy to test for: just look for an empty list in 'digits'.

I was pleasantly surprised to find that using list comprehensions, zip, join (all highly optimised in Python) and several intermediate lists still works at a fairly decent speed, despite using more stages to handle multi-digits. But it is about 4x slower than the less flexible 1-digit version on my hardware (about 25,000 per second).
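To make the skip arithmetic concrete, here is a minimal sketch of the same stages in Python 3 (the quoted code is Python 2). It assumes str.maketrans in place of string.maketrans, and the names SIGNS_TO_SPACES and skip_count are mine for illustration, not part of the original class.

```python
from itertools import takewhile

# Python 3 assumption: str.maketrans replaces Python 2's string.maketrans.
# Both signs map to a space so that split() can swallow them.
SIGNS_TO_SPACES = str.maketrans('+-', '  ')


def skip_count(section):
    """Return how many leading characters of one section to drop.

    skip_count is a hypothetical helper name. The result is the number of
    leading digits plus the number those digits spell out (0 if none).
    """
    digits = ''.join(takewhile(str.isdigit, section))
    number = int(digits) if digits else 0  # no digits: skip nothing
    return len(digits) + number


def reduce_plusminus_multi_digit(s):
    """Remove each sign, the number after it, and that many characters."""
    sections = s.translate(SIGNS_TO_SPACES).split()
    return ''.join(sec[skip_count(sec):] for sec in sections)


if __name__ == "__main__":
    print(skip_count('3ACG.'))            # 1 digit + 3 letters = 4
    print(skip_count('11ACGACGACGACG.'))  # 2 digits + 11 letters = 13
    print(skip_count('ACG.'))             # no digits = 0
    print(reduce_plusminus_multi_digit('-11ACGACGACGACG.'))   # G.
    print(reduce_plusminus_multi_digit('tA.-2AG.-2AG,-2ag'))  # tA..,
```

Like the original, this treats the text before the first sign as just another section, so input that starts with a digit (or contains whitespace) would need extra handling.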
Nick

On Monday, 7 January 2013 14:40:02 UTC+11, Nick Mellor wrote:
> Hi Sia,
>
> Find a multi-digit method in this version:
>
> from string import maketrans
> from itertools import takewhile
>
> def is_digit(s): return s.isdigit()
>
> class redux:
>
>     def __init__(self):
>         intab = '+-'
>         outtab = '  '
>         self.trantab = maketrans(intab, outtab)
>
>     def reduce_plusminus(self, s):
>         list_form = [r[int(r[0]) + 1:] if r[0].isdigit() else r
>                      for r
>                      in s.translate(self.trantab).split()]
>         return ''.join(list_form)
>
>     def reduce_plusminus_multi_digit(self, s):
>         spl = s.translate(self.trantab).split()
>         digits = [list(takewhile(is_digit, r))
>                   for r
>                   in spl]
>         numbers = [int(''.join(r)) if r else 0
>                    for r
>                    in digits]
>         skips = [len(dig) + num for dig, num in zip(digits, numbers)]
>         return ''.join([s[r:] for r, s in zip(skips, spl)])
>
> if __name__ == "__main__":
>     p = redux()
>     print p.reduce_plusminus(".+3ACG.+5CAACG.+3ACG.+3ACG")
>     print p.reduce_plusminus("tA.-2AG.-2AG,-2ag")
>     print 'multi-digit...'
>     print p.reduce_plusminus_multi_digit(".+3ACG.+5CAACG.+3ACG.+3ACG")
>     print p.reduce_plusminus_multi_digit(".+12ACGACGACGACG.+5CAACG.+3ACG.+3ACG")
>
> HTH,
>
> Nick
>
> On Saturday, 5 January 2013 19:35:26 UTC+11, Sia wrote:
> > I have strings such as:
> >
> > tA.-2AG.-2AG,-2ag
> >
> > or
> >
> > .+3ACG.+5CAACG.+3ACG.+3ACG
> >
> > The plus and minus signs are always followed by a number (say, i). I want
> > python to find each single plus or minus, remove the sign, the number after
> > it and remove i characters after that. So the two strings above become:
> >
> > tA..,
> >
> > and
> >
> > ...
> >
> > How can I do that?
> >
> > Thanks.