On Mon, May 12, 2014 at 11:41 AM, Hervé Pagès <hpa...@fhcrc.org> wrote:
> Hi Michael,
>
> On 05/09/2014 04:39 PM, Michael Lawrence wrote:
>
>> What would be the fastest way to do this with a DNAString? Just an
>> alphabetFrequency?
>
> That would do it.
>
> A couple of other issues I ran into with the 2bit code:
>
> (1) It fails on empty sequences:
>
>   > export(DNAStringSet(c("AA", "", "CC")), "ww.2bit")
>   Warning message:
>   In (function (object, seqname) :
>     needLargeMem: trying to allocate 0 bytes (limit: 17179869184)
>   Error in sapply(object, function(x) typeof(x) == "externalptr" && is(x, :
>     error in evaluating the argument 'X' in selecting a method for
>     function 'sapply': Error in (function (object, seqname) : UCSC
>     library operation failed

Thanks for catching this one.

> (2) Could be that internal helper rtracklayer:::.DNAString_to_twoBit()
>     is introducing a memory leak as it doesn't seem that the memory
>     the returned external pointer is pointing to (a struct twoBit) is
>     ever released. The memory leak is minor if the sequence passed via
>     'object' has no masks but can be important if there are masks and
>     if the masks are made of hundreds of thousands of ranges.

Right now it is the responsibility of the caller to free that memory.
Probably should have used a finalizer on the externalptr, but the way it
works now is that the write function frees the object. So it's not leaking
(as far as I know), but the design could be improved.

> Thanks,
> H.
>
>> On Fri, May 9, 2014 at 4:07 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:
>>
>> Hi Michael,
>>
>>   library(rtracklayer)
>>   library(Biostrings)
>>   x <- DNAStringSet("AAA-CCC-GGG-TTT-NNN-KKK")
>>
>> Then:
>>
>>   > x
>>     A DNAStringSet instance of length 1
>>       width seq
>>   [1]    23 AAA-CCC-GGG-TTT-NNN-KKK
>>
>>   > export(x, "x.2bit")
>>
>>   > import("x.2bit")
>>     A DNAStringSet instance of length 1
>>       width seq                          names
>>   [1]    23 AAATCCCTGGGTTTTTNNNTTTT      1
>>
>> What about having the "export" method for TwoBitFile raise an error
>> (or at least issue a warning) instead of silently turning everything
>> that is not A, C, G, T, or N into a T?
>>
>> Thanks,
>> H.
>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpa...@fhcrc.org
>> Phone: (206) 667-5791
>> Fax: (206) 667-1319
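
For illustration, the check requested in the thread (refuse to export when a sequence contains anything other than A, C, G, T, or N) could look roughly like the base R sketch below. The function names `find_unsupported_letters` and `check_before_export` are hypothetical and are not part of rtracklayer or Biostrings; the sketch works on plain character vectors and takes the supported alphabet from the message above as an assumption.

```r
# Hypothetical helper (not an rtracklayer or Biostrings function):
# return the distinct letters in 'seqs' that are outside 'supported'.
# Assumption: the 2bit export can only represent A, C, G, T and N,
# as stated in the thread.
find_unsupported_letters <- function(seqs,
                                     supported = c("A", "C", "G", "T", "N")) {
  # Split every sequence into single characters and pool them.
  letters_seen <- unique(as.character(unlist(strsplit(seqs, ""))))
  # Empty sequences contribute no letters; drop any empty strings defensively.
  letters_seen <- letters_seen[nzchar(letters_seen)]
  setdiff(letters_seen, supported)
}

# Hypothetical wrapper: stop with an informative error instead of letting
# unsupported letters be silently converted on export.
check_before_export <- function(seqs) {
  bad <- find_unsupported_letters(seqs)
  if (length(bad) > 0) {
    stop("sequences contain letters that cannot be stored: ",
         paste(bad, collapse = ", "))
  }
  invisible(TRUE)
}

# The sequence from the thread contains '-' and 'K', so this call errors.
result <- try(check_before_export("AAA-CCC-GGG-TTT-NNN-KKK"), silent = TRUE)
print(inherits(result, "try-error"))
```

For an actual DNAString or DNAStringSet, counting letters with alphabetFrequency, as suggested in the thread, would likely be the faster route; the sketch only shows the shape of the check.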
_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel