Ann: CoreBio 0.4
Announcing CoreBio 0.4

CoreBio home page: http://code.google.com/p/corebio/
Download: http://corebio.googlecode.com/svn/dist/CoreBio-0.4.1.tar.gz

CoreBio is an open source Python library for bioinformatics and computational biology, designed to be fast, compact, reliable and easy to use. Currently, CoreBio includes code to store and manipulate protein and DNA sequences, read and write many common biological sequence formats, read BLAST reports and access other computational and database resources. The CoreBio project welcomes additional suggestions, code and participants.

This release includes the following modules:

- data: Standard information used in computational biology.
- matrix: Arrays indexed by alphabetic strings.
- moremath: Various bits of useful math not in the standard Python library.
- resource: Access to programs, complex file formats and databases.
    - astral: ASTRAL dataset IO.
    - scop: SCOP: Structural Classification of Proteins IO.
    - stride: STRIDE: Protein secondary structure assignment from atomic coordinates.
- seq: Alphabetic sequences and associated tools and data.
- seq_io: Sequence file reading and writing.
    - array_io: Read and write arrays of sequence data.
    - clustal_io: Read the CLUSTAL sequence file format.
    - fasta_io: Read and write FASTA format.
    - genbank_io: Read GenBank flat files.
    - intelligenetics_io: Read IntelliGenetics format.
    - msf_io: Read sequence information in MSF format.
    - nbrf_io: Sequence IO for NBRF/PIR format.
    - nexus_io: Read the sequence data from a NEXUS file.
    - null_io: Null sequence IO.
    - phylip_io: Read sequences in interleaved PHYLIP format.
    - plain_io: Read and write raw, unformatted sequence data.
    - stockholm_io: Read STOCKHOLM format.
    - table_io: Read tab-delimited format.
- ssearch_io: Parse sequence search analysis reports.
    - blastxml: Read BLAST XML output.
    - fasta: Read the output of a FASTA similarity search.
- transform: Transformations of Seqs (alphabetic sequences), including translation with a full suite of GeneticCode objects.

Gavin Crooks and John Gilman
--
http://mail.python.org/mailman/listinfo/python-list
Re: Ann: CoreBio 0.4
km wrote:
> Hi,
> why are u reinventing the wheel when Biopython[1] is already existing ? is
> there any specific reason u wanted to develop this CoreBio ? why dont u just
> extend the existing BioPython package itself ?
> regards,
> KM
> [1]http://biopython.org

Biopython is a fine project which I have used and contributed to in the past. Unfortunately, Biopython suffers from a lack of focus. It is a huge heap of code, some of which is well written, but a lot of which is not, and a significant portion doesn't work as advertised. There is no consistency, the code base is idiosyncratic, the documentation is spotty, and it is very hard to actually discover and use the functionality that you need. For some common tasks there are three ways of doing things, only one of which is supported.

CoreBio is intended to be a high-quality, easy-to-use collection of the core functionality needed for bioinformatics and computational biology. Compared to Biopython, we are taking a narrow, quality-first rather than breadth-first approach, with simple APIs that hide as much complexity as reasonably possible.

As an example of a simple API, consider the common task of reading a file of protein sequences:

    >>> from corebio import seq_io
    >>> afile = open("human.fa")
    >>> list_of_sequences = seq_io.read(afile)

CoreBio will figure out the file format, so 'seq_io.read()' will parse sequence data from fasta, clustal, genbank, intelligenetics, msf, nbrf/pir, nexus or phylip formatted files.

Gavin Crooks
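For readers wondering how a single read() call can cope with several file formats, one common approach is to try each parser in turn and keep the first that accepts the input. The sketch below illustrates that idea only. It does not use CoreBio and is not CoreBio's actual implementation; the names read_fasta, read_plain, PARSERS and read_any are hypothetical, and only two toy formats are handled.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into (name, sequence) pairs; raise ValueError if not FASTA."""
    records = []
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is None:
            raise ValueError("not FASTA: sequence data before first header")
        else:
            chunks.append(line)
    if name is None:
        raise ValueError("not FASTA: no header line found")
    records.append((name, "".join(chunks)))
    return records


def read_plain(handle):
    """Treat the whole input as one raw, unnamed sequence of letters."""
    text = "".join(line.strip() for line in handle)
    if not text.isalpha():
        raise ValueError("not plain sequence data")
    return [("", text)]


# Hypothetical parser table: stricter formats first, most permissive last.
PARSERS = (read_fasta, read_plain)


def read_any(handle):
    """Return the records from the first parser that accepts the input."""
    data = handle.read()
    for parser in PARSERS:
        try:
            # Each parser gets a fresh in-memory copy of the data.
            return parser(io.StringIO(data))
        except ValueError:
            continue
    raise ValueError("unrecognised sequence format")
```

Ordering the parsers from strictest to most permissive matters here: a raw-sequence parser placed first would happily swallow many inputs that belong to a more structured format.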
Extended slices and indices
The indices method of slice doesn't seem to work quite how I would expect when reversing a sequence. For example:

    >>> s = '01234'
    >>> s[::-1]
    '43210'
    >>> s[slice(None,None,-1)]
    '43210'

So a slice with a negative step (and nothing else) reverses the sequence. But what are the corresponding indices?

    >>> slice(None,None,-1).indices(len(s))
    (4, -1, -1)

That looks O.K. The start is the last item in the sequence, and the stop is one before the beginning of the sequence. But these indices don't reverse the string:

    >>> s[4:-1:-1]
    ''

Although they give the correct range:

    >>> range(4, -1, -1)
    [4, 3, 2, 1, 0]

It would appear that there is no set of indices that will both reverse the string and produce the correct range! Is this a bug or a feature?

GEC

See also: http://www.python.org/doc/2.3.5/whatsnew/section-slices.html
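The behaviour described in the post above can be checked in one short script. This is a sketch written for Python 3, where range() returns a lazy object rather than a list, so it is wrapped in list(); the variable names are only illustrative.

```python
s = "01234"
sl = slice(None, None, -1)

# indices() resolves the None fields against a concrete length.
start, stop, step = sl.indices(len(s))

# The resolved tuple is meant to be fed to range(), where -1 is a plain integer.
positions = list(range(start, stop, step))

# Three ways of "applying" the same slice:
via_slice = s[sl]                              # the slice object itself
via_range = "".join(s[i] for i in positions)   # explicit positions from range()
via_tuple = s[start:stop:step]                 # the resolved tuple used as a slice

print((start, stop, step))  # (4, -1, -1)
print(positions)            # [4, 3, 2, 1, 0]
print(repr(via_slice))      # '43210'
print(repr(via_range))      # '43210'
print(repr(via_tuple))      # ''
```

The last line is empty because, inside a subscript, a stop of -1 is read as "the last element" rather than "one before the first element".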
Re: Extended slices and indices
Robert Kern wrote:
> I'd say bug in the .indices() method. The meaning of [4:-1:-1] is unavoidable
> different than [::-1] since the index -1 points to the last element, not the
> imaginary element before the first element. Unfortunately, there *is* no
> concrete (start, stop, step) tuple that will emulate [::-1].

After some more experimenting, it seems that [L-1:-L-1:-1] will reverse a sequence of length L. But slice(L-1,-L-1,-1).indices(L) gives (L-1,-1,-1), which will not reverse the sequence. And range(L-1,-L-1,-1) is totally off, but range(L-1,-1,-1) is correct.

Seems like a bug (or an odd feature) of extended slicing of strings and other built-in sequences.

GEC
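The observations in the reply above can be verified mechanically. The following is a sketch for Python 3; reversing_slice is a hypothetical helper name introduced only for this check, not something from the standard library.

```python
def reversing_slice(length):
    """Concrete integer slice (L-1, -L-1, -1) for a sequence of length L."""
    return slice(length - 1, -length - 1, -1)


# The all-integer slice reverses strings and lists of various lengths.
for seq in ("", "a", "01234", [1, 2, 3]):
    assert seq[reversing_slice(len(seq))] == seq[::-1]

L = 5
rev = reversing_slice(L)

# indices() normalises the out-of-range stop (-L-1) to -1 ...
resolved = rev.indices(L)

# ... and range() over the resolved tuple visits the right positions.
good_range = list(range(*resolved))

# Feeding the unresolved slice fields straight to range() overshoots,
# because range() does no clamping to the sequence length.
bad_range = list(range(rev.start, rev.stop, rev.step))

print(resolved)    # (4, -1, -1)
print(good_range)  # [4, 3, 2, 1, 0]
print(bad_range)   # [4, 3, 2, 1, 0, -1, -2, -3, -4, -5]
```

So the tuple that works as a subscript and the tuple that works with range() really are different: subscripts interpret negative integers relative to the end of the sequence, while range() takes them literally.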