On Mar 26, 5:28 pm, Sean Davis <[EMAIL PROTECTED]> wrote:
> I am working with genomic data. Basically, it consists of many tuples
> of (start,end) on a line. I would like to convert these tuples of
> (start,end) to a string of bits where a bit is 1 if it is covered by
> any of the regions described by the (start,end) tuples and 0 if it is
> not. I then want to do set operations on multiple bit strings (AND,
> OR, NOT, etc.). Any suggestions on how to (1) set up the bit string
> and (2) operate on 1 or more of them? Java has a BitSet class that
> keeps this kind of thing pretty clean and high-level, but I haven't
> seen anything like it for python.

The solution depends on the size of the genomes you want to work with.

There is a BitVector class that could probably do what you want, although it may not scale well because it is pure Python:

http://cobweb.ecn.purdue.edu/~kak/dist/BitVector-1.2.html

If you want high-speed code (implemented in C and Pyrex) that works for large-scale genomic data analysis, the bx-python package might do what you need (and even things that you don't yet know you want to do):

http://bx-python.trac.bx.psu.edu/

Of course, this one is a lot more complicated.

i.
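
For small inputs, the two steps in the question (build the bit string, then combine several of them) can also be sketched with nothing but Python's built-in arbitrary-precision integers. This is only an illustration and is not the API of BitVector or bx-python. The helper names (intervals_to_bits, complement, to_string) are made up for this example, and it assumes 0-based, half-open (start, end) intervals, which may not match your data.

```python
# Sketch only: a plain Python int used as an arbitrary-length bit string.
# Assumption: intervals are 0-based and half-open, i.e. (start, end) covers
# positions start .. end-1. Adjust if your coordinates are 1-based/closed.

def intervals_to_bits(intervals):
    """Return an int whose bit i is 1 if position i is covered by any interval."""
    bits = 0
    for start, end in intervals:
        if end > start:
            # (1 << n) - 1 is a run of n ones; shift it up to position start.
            bits |= ((1 << (end - start)) - 1) << start
    return bits

def complement(bits, length):
    """NOT, restricted to the first `length` positions."""
    return ~bits & ((1 << length) - 1)

def to_string(bits, length):
    """Render the bits with position 0 first (left to right along the sequence)."""
    return format(bits, "0{}b".format(length))[::-1]

length = 12
a = intervals_to_bits([(2, 5), (4, 8)])    # covers positions 2..7
b = intervals_to_bits([(0, 3), (7, 10)])   # covers positions 0..2 and 7..9

print(to_string(a, length))                      # 001111110000
print(to_string(a & b, length))                  # AND: 001000010000
print(to_string(a | b, length))                  # OR:  111111111100
print(to_string(complement(a, length), length))  # NOT: 110000001111
```

AND, OR and XOR are the ordinary &, | and ^ operators on ints. NOT needs an explicit length because a Python int has no fixed width. Whether this is fast enough at genome scale is something you would have to measure.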