Re: [Bioc-devel] Combining Ordinary List of GRanges Optimisation

Michael Lawrence Tue, 08 Jan 2013 14:06:13 -0800

That GRanges only had one column, so I'm hoping that's not a lot of
overhead. The merging of the thousands of Seqinfo objects is probably the
issue. Any way to make that n-ary instead of a Reduce() over a binary merge?
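
A minimal sketch of what avoiding the pairwise merge might look like, assuming
only the sequence names matter (seqlengths and genome are dropped here) and
that meta columns can be ignored. The toy list 'blockRanges' below is made up
for illustration. The idea is to collect the seqlevels once and build a single
GRanges from the concatenated components:

```r
library(GenomicRanges)

# Toy input standing in for the real list of GRanges (hypothetical data).
blockRanges <- list(
  GRanges("chr1", IRanges(1, 10), strand = "+"),
  GRanges("chr2", IRanges(5, 20), strand = "-"),
  GRanges("chr1", IRanges(30, 40), strand = "*")
)

# Resolve the sequence names once, instead of merging Seqinfo pairwise.
allLevels <- unique(unlist(lapply(blockRanges, seqlevels)))

# Concatenate each component across the whole list in a single pass.
allSeqnames <- unlist(lapply(blockRanges, function(gr) as.character(seqnames(gr))))
allStarts   <- unlist(lapply(blockRanges, start))
allEnds     <- unlist(lapply(blockRanges, end))
allStrands  <- unlist(lapply(blockRanges, function(gr) as.character(strand(gr))))

# Build one GRanges; seqlengths and meta columns are not carried over.
allRanges <- GRanges(
  seqnames = factor(allSeqnames, levels = allLevels),
  ranges   = IRanges(start = allStarts, end = allEnds),
  strand   = allStrands
)
```

This is not what c() does internally; it only sidesteps the repeated Seqinfo
merging by resolving the seqlevels in one pass.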


Michael


On Tue, Jan 8, 2013 at 10:44 AM, Hervé Pagès <hpa...@fhcrc.org> wrote:

> Hi Dario,
>
>
> On 01/06/2013 07:00 PM, Dario Strbenac wrote:
>
>>> Are you asking if you can rewrite your code to work faster, or are you
>>> asking if the BioC devs need to improve the code to be faster?
>>>
>>
>> I was suggesting that maybe the c function for GRanges could be optimised.
>>
>>> Another would be manually splitting each GRanges object into its
>>> components: seqnames, IRanges, strand, and metadata. Then concatenate these
>>> components and build one big GRanges object.
>>>
>>
>> This approach gives:
>>
>>     user  system elapsed
>>   63.488  11.092  74.786
>>
>
> I think this is more or less what 'do.call(c, blockRanges)' would give
> you if all your GRanges objects were naked i.e. if they had no meta
> columns.
>
>
>
>> which by using c was previously:
>>
>>     user  system elapsed
>> 935.770  23.657 961.952
>>
>
> By default c() will also combine the meta columns which can be
> expensive if you have a lot of them and/or if some of them are
> complicated objects. You can call c() with 'ignore.mcols=TRUE'
> if you don't need to propagate the meta columns. Which, in the
> context of do.call(), translates to something like:
>
>   allRanges <- do.call(c, c(blockRanges, list(ignore.mcols=TRUE)))
>
> IMPORTANT NOTE, related to this thread on the Bioconductor list:
>
>   https://stat.ethz.ch/pipermail/bioconductor/2012-November/049567.html
>
> In short: if we ask the R core guys to change the implicit c() generic,
> my understanding is that it won't be possible to support additional
> args in "c" methods anymore, like the 'ignore.mcols' arg of the method
> for GenomicRanges objects. Should we take the time to discuss this before
> I proceed?
>
> Thanks,
> H.
>
>
>
>> Thanks for the tip. I now remember using this approach at some time in
>> the past.
>>
>> _______________________________________________
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fhcrc.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
>