James Stroud said unto the world upon 2005-03-27 17:39:
Hello,

I have strings represented as a combination of an alphabet (AGCT) and an operator "/" that signifies degeneracy. I want to split these strings into lists of lists, where the degeneracies are members of the same list and non-degenerates are members of single-item lists. An example will clarify this:

"ATT/GATA/G"

gets split to

[['A'], ['T'], ['T', 'G'], ['A'], ['T'], ['A', 'G']]

I have written a very ugly function to do this (listed below for the curious), but intuitively I think this should only take a couple of lines for one skilled in regex and/or listcomp. Any takers?

James

p.s. Here is the ugly function I wrote:

def build_consensus(astr):

  consensus = []       # the lol that will be returned
  possibilities = []   # one element of consensus
  consecutives = 0     # keeps track of how many in a row

  for achar in astr:
    if (achar == "/"):
      consecutives = 0
      continue
    else:
      consecutives += 1
    if (consecutives > 1):
      consensus.append(possibilities)
      possibilities = [achar]
    else:
      possibilities.append(achar)
  if possibilities:
    consensus.append(possibilities)
  return consensus
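
For what it's worth, the regex-plus-listcomp version asked about above could be sketched roughly as follows. The name build_consensus_re is made up for illustration, and the pattern assumes the alphabet is exactly AGCT with "/" only ever appearing between two letters; anything that does not fit is silently skipped rather than reported.

```python
import re

def build_consensus_re(astr):
    # Hypothetical name. Each match is one base optionally followed by
    # any number of "/base" alternatives, e.g. "T/G"; splitting the
    # match on "/" gives the members of that position.
    groups = re.findall(r'[AGCT](?:/[AGCT])*', astr)
    return [group.split('/') for group in groups]

print(build_consensus_re("ATT/GATA/G"))
# [['A'], ['T'], ['T', 'G'], ['A'], ['T'], ['A', 'G']]
```

Because the group in the pattern is non-capturing, re.findall returns whole matches, which is what makes the one-line split possible.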

Hi,

In the spirit of "Now I have two problems", I like to avoid regular expressions when I can. I don't think mine avoids a bit of ugly either, but I, at least, find it easier to grok (YMMV):

def build_consensus(string):

    result = [[string[0]]]   # starts list with a list of first char
    accumulate = False

    for char in string[1:]:

        if char == '/':
            accumulate = True

        else:
            if accumulate:
                # The pop removes the last list appended, and we use
                # its single item to build the new list to append.
                result.append([result.pop()[0], char])
                accumulate = False

            else:
                result.append([char])

    return result


(Since list.append returns None, this could use
accumulate = result.append([result.pop()[0], char])
in place of the two lines in the if accumulate block, but I don't think that is a gain worth paying for.)
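
One caveat on build_consensus as posted: it assumes a non-empty string, and because pop()[0] keeps only the first item of the previous list, a three-way degeneracy such as "A/G/T" would come out as [['A', 'T']]. If those cases matter, a variant along these lines might do. build_consensus_general is a hypothetical name, and treating a leading "/" as a no-op is my assumption, not something the original question specifies.

```python
def build_consensus_general(string):
    # Hypothetical variant: tolerates '' and n-way degeneracies.
    result = []
    accumulate = False

    for char in string:
        if char == '/':
            accumulate = True
        elif accumulate and result:
            # Extend the most recent group in place rather than
            # rebuilding it, so no earlier alternative is lost.
            result[-1].append(char)
            accumulate = False
        else:
            # Ordinary character (or a '/' with nothing before it,
            # which is assumed to be ignorable).
            result.append([char])
            accumulate = False

    return result

print(build_consensus_general("ATT/GATA/G"))
# [['A'], ['T'], ['T', 'G'], ['A'], ['T'], ['A', 'G']]
print(build_consensus_general("A/G/T"))
# [['A', 'G', 'T']]
```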


HTH,

Brian vdB
