On Mon, May 12, 2014 at 11:41 AM, Hervé Pagès <hpa...@fhcrc.org> wrote:

> Hi Michael,
>
>
> On 05/09/2014 04:39 PM, Michael Lawrence wrote:
>
>> What would be the fastest way to do this with a DNAString?  Just an
>> alphabetFrequency?
>>
>
> That would do it.
>
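
For reference, a minimal sketch of what such an alphabetFrequency()-based
check could look like on the R side. The helper name
has_unsupported_letters() is hypothetical (it is not part of rtracklayer
or Biostrings), and the sketch assumes a DNAStringSet input, for which
alphabetFrequency() returns a matrix with one row per sequence.

```r
library(Biostrings)

# Hypothetical helper, for illustration only: flags the sequences that
# contain any letter other than A, C, G, T, or N.
# Assumes 'x' is a DNAStringSet (a single DNAString would give a vector,
# not a matrix, and would need different indexing).
has_unsupported_letters <- function(x, supported = c("A", "C", "G", "T", "N")) {
  af <- alphabetFrequency(x)                  # one column per DNA alphabet letter
  other <- setdiff(colnames(af), supported)   # IUPAC ambiguity codes, "-", "+", "."
  rowSums(af[, other, drop = FALSE]) > 0
}

x <- DNAStringSet(c("ACGTN", "AAA-CCC-GGG-TTT-NNN-KKK"))
bad <- has_unsupported_letters(x)
if (any(bad))
  warning(sum(bad), " sequence(s) contain letters other than A, C, G, T, N")
```

Whether the export method should warn or stop on such input is a design
decision this sketch does not settle.
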
> A couple of other issues I ran into with the 2bit code:
>
> (1) It fails on empty sequences:
>
>     > export(DNAStringSet(c("AA", "", "CC")), "ww.2bit")
>     Warning message:
>     In (function (object, seqname)  :
>       needLargeMem: trying to allocate 0 bytes (limit: 17179869184)
>     Error in sapply(object, function(x) typeof(x) == "externalptr" && is(x,  :
>       error in evaluating the argument 'X' in selecting a method for
>       function 'sapply': Error in (function (object, seqname)  : UCSC
>       library operation failed
>
>
Thanks for catching this one.


> (2) Could be that internal helper rtracklayer:::.DNAString_to_twoBit()
>     is introducing a memory leak as it doesn't seem that the memory
>     the returned external pointer is pointing to (a struct twoBit) is
>     ever released. The memory leak is minor if the sequence passed via
>     'object' has no masks but can be important if there are masks and
>     if the masks are made of hundreds of thousands of ranges.
>
>
Right now it is the responsibility of the caller to free that memory.
Probably should have used a finalizer on the externalptr, but the way it
works now is that the write function frees the object. So it's not leaking
(as far as I know), but the design could be improved.
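
To illustrate the finalizer idea, here is a rough R-level sketch. It is
not what rtracklayer currently does. It registers a finalizer with
reg.finalizer() on a stand-in external pointer created with
new("externalptr"); in the C code the analogous call would be
R_RegisterCFinalizerEx(), with the finalizer freeing the struct twoBit.
The 'released' flag exists only for the demonstration.

```r
# Stand-in for the external pointer returned by the C code. A nil external
# pointer created from R is enough to demonstrate the mechanism.
ptr <- new("externalptr")

released <- FALSE   # demonstration flag, not part of any real API

reg.finalizer(ptr, function(p) {
  # In the real C finalizer, this is where the struct twoBit would be freed.
  released <<- TRUE
}, onexit = TRUE)   # also run the finalizer when the R session exits

# Once the last reference is dropped, the finalizer becomes eligible to run
# at a later garbage collection.
rm(ptr)
invisible(gc())
```

Finalizers run at garbage collection, so the exact moment of release is
not guaranteed. The advantage is that callers no longer have to remember
to free the object themselves.
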



> Thanks,
> H.
>
>
>>
>> On Fri, May 9, 2014 at 4:07 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:
>>
>>     Hi Michael,
>>
>>        library(rtracklayer)
>>        library(Biostrings)
>>        x <- DNAStringSet("AAA-CCC-GGG-TTT-NNN-KKK")
>>
>>
>>     Then:
>>
>>        > x
>>          A DNAStringSet instance of length 1
>>            width seq
>>        [1]    23 AAA-CCC-GGG-TTT-NNN-KKK
>>
>>        > export(x, "x.2bit")
>>
>>        > import("x.2bit")
>>          A DNAStringSet instance of length 1
>>            width seq                                               names
>>        [1]    23 AAATCCCTGGGTTTTTNNNTTTT                           1
>>
>>     What about having the "export" method for TwoBitFile raise an error
>>     (or at least issue a warning) instead of silently turning everything
>>     that is not A, C, G, T, or N into a T?
>>
>>     Thanks,
>>     H.
>>
>>     --
>>     Hervé Pagès
>>
>>     Program in Computational Biology
>>     Division of Public Health Sciences
>>     Fred Hutchinson Cancer Research Center
>>     1100 Fairview Ave. N, M1-B514
>>     P.O. Box 19024
>>     Seattle, WA 98109-1024
>>
>>     E-mail: hpa...@fhcrc.org
>>     Phone: (206) 667-5791
>>     Fax: (206) 667-1319
>>
>>     _________________________________________________
>>     Bioc-devel@r-project.org mailing list
>>     https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>>
>


