Thanks.  While I was beating a bit on Biobase, I understand if we don't
want to revisit the design now.  However, we might want to do so for the
multi-assay stuff.  I have some additional thoughts on that.

Kasper


On Mon, Feb 10, 2014 at 2:17 AM, Martin Morgan <mtmor...@fhcrc.org> wrote:

> On 02/09/2014 02:38 PM, Kasper Daniel Hansen wrote:
>
>> Memory usage is a common bottleneck.
>>
>> For people interested in profiling their memory usage I want to recommend
>> the lineprof package by Hadley Wickham which I have had great success with
>> so far.  There are some details in his 'Advanced R programming' at
>>    http://adv-r.had.co.nz/memory.html
>> I see this package as a real game changer.
>>
>> I have written an example debugging session on a real use case
>> (minfi::preprocessRaw) at
>>    http://www.hansenlab.org/rstats/2014/01/30/lineprof/
>> where I end up having to work around using new() for Biobase classes (an
>> eSet-derived class in minfi).
>>
>
> Thanks Kasper for the pointer.  This is a bit brutal:
>
> > m <- matrix(0, 0, 0)
> > tracemem(m)
> [1] "<0xe048d80>"
> > ExpressionSet(m)
>
> tracemem[0xe048d80 -> 0xeb10530]: eapply sampleNames<- sampleNames<-
> .local .nextMethod eval eval callNextMethod .local initialize initialize
> new .ExpressionSet ExpressionSet ExpressionSet
>
> ... 15 copies later...
>
> tracemem[0xf93b2b8 -> 0xf93bc00]: colnames<- sampleNames<- sampleNames<-
> .harmonizeDimnames .local initialize initialize new .ExpressionSet
> ExpressionSet ExpressionSet
>
> Much of this is avoidable... copyEnv(), eapply(), and rownames<-, used
> when making row and column names of the assayData consistent with feature
> and sample names, all seem to duplicate elements unnecessarily:
>
>     e <- new.env(); m <- matrix(1); tracemem(m)
>     ## [1] "<0x1810d650>"
>     e[["m"]] <- m
>     x <- copyEnv(e)
>     ## tracemem[0x1810d650 -> 0x1810e0d8]: .Call copyEnv
>     x <- eapply(e, dim)
>     ## tracemem[0x1810d650 -> 0x1810e9f8]: eapply
>     dimnames(e[["m"]]) <- list("a", "A")
>     ## tracemem[0x1810d650 -> 0x1810fab0]:
>     rownames(e[["m"]]) <- "a"
>     ## tracemem[0x1810fab0 -> 0x18110de8]:
>     ## tracemem[0x18110de8 -> 0x18111730]: rownames<-
>
> I've updated the C code for copyEnv in Biobase, and avoided eapply and
> row/colnames, so that there are usually only one or two copies for the
> simplest constructor. I'll look out for bugs in downstream packages, and
> would be happy to hear of other easily reproducible examples of apparently
> unnecessary duplication.
>
> Martin
>
>
>> Best,
>> Kasper
>>
>> _______________________________________________
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>
> --
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793
>
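
As a follow-up to the trace quoted above, where dimnames<- reported one copy and rownames<- reported two, here is a minimal sketch of setting row and column names with a single dimnames<- call per element of an environment. It uses base R only. set_dimnames_once is a hypothetical helper name, not a Biobase function, and this is not necessarily how the Biobase fix was implemented.

```r
# Hypothetical helper (not part of Biobase): give every matrix stored in an
# environment the same row and column names, using one dimnames<- call per
# element instead of separate rownames<- and colnames<- calls.
set_dimnames_once <- function(env, rn, cn) {
    for (nm in ls(env, all.names = TRUE)) {
        x <- env[[nm]]               # fetch the element once
        dimnames(x) <- list(rn, cn)  # set both sets of names in one step
        env[[nm]] <- x               # store the modified element back
    }
    invisible(env)
}

e <- new.env()
e[["m"]] <- matrix(0, 1, 1)
set_dimnames_once(e, "a", "A")
dimnames(e[["m"]])
```

Calling tracemem() on the matrix before running the helper should show whether the number of duplications actually drops on a given R build; the exact count depends on the R version.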
