Re: [Bioc-devel] any interest in a BiocMatrix core package?

Michael Lawrence Fri, 03 Mar 2017 10:04:35 -0800

After reading the original post again, it seems that maybe Rcpp could solve
the problem (if it hasn't already) by implementing their matrix API on top
of dispatch to a few functions like [() and dim(). Both of those are
internally generic, so it would not touch R until the actual method call,
and wouldn't need to "see" any R-level S4 generics.
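
To make that concrete, here is a rough C++ sketch of the idea, combined with
the chunked iteration quoted further down. It is only an illustration under
stated assumptions, not Rcpp's or any package's actual API: the names
MatrixBackend, DenseBackend, get_columns and chunked_col_sums are all
hypothetical, and the in-memory backend merely stands in for HDF5, bigmemory,
or whatever storage would really sit behind dim() and [().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical minimal interface: a backend only has to answer "what are
// your dimensions?" and "give me these columns as a dense block".
class MatrixBackend {
public:
    virtual ~MatrixBackend() = default;
    virtual std::size_t nrow() const = 0;
    virtual std::size_t ncol() const = 0;
    // Returns columns [first, first + count) as a column-major buffer.
    virtual std::vector<double> get_columns(std::size_t first,
                                            std::size_t count) const = 0;
};

// Toy in-memory backend (assumption: stands in for a real on-disk store).
class DenseBackend : public MatrixBackend {
    std::size_t nrow_, ncol_;
    std::vector<double> data_;  // column-major storage
public:
    DenseBackend(std::size_t nrow, std::size_t ncol, std::vector<double> data)
        : nrow_(nrow), ncol_(ncol), data_(std::move(data)) {
        assert(data_.size() == nrow_ * ncol_);
    }
    std::size_t nrow() const override { return nrow_; }
    std::size_t ncol() const override { return ncol_; }
    std::vector<double> get_columns(std::size_t first,
                                    std::size_t count) const override {
        assert(first + count <= ncol_);
        auto begin = data_.begin() + static_cast<std::ptrdiff_t>(first * nrow_);
        auto end = begin + static_cast<std::ptrdiff_t>(count * nrow_);
        return std::vector<double>(begin, end);
    }
};

// Generic algorithm written only against the interface: column sums,
// holding at most chunk_cols columns in memory at a time.
std::vector<double> chunked_col_sums(const MatrixBackend& m,
                                     std::size_t chunk_cols) {
    if (chunk_cols == 0) chunk_cols = 1;  // avoid an infinite loop
    std::vector<double> sums(m.ncol(), 0.0);
    for (std::size_t first = 0; first < m.ncol(); first += chunk_cols) {
        const std::size_t count = std::min(chunk_cols, m.ncol() - first);
        const std::vector<double> block = m.get_columns(first, count);
        for (std::size_t j = 0; j < count; ++j) {
            for (std::size_t i = 0; i < m.nrow(); ++i) {
                sums[first + j] += block[j * m.nrow() + i];
            }
        }
    }
    return sums;
}
```

The algorithm never sees how the data is stored, so a different backend only
needs to supply the same three methods. In a real Rcpp setting those methods
would presumably call back into R's dim() and [() rather than index a vector.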


On Fri, Mar 3, 2017 at 9:29 AM, Michael Lawrence <micha...@gene.com> wrote:

> This is along the lines of what I suggested on the board phone call. If
> there is already a C++ library like Armadillo for doing the heavy lifting,
> it would be easy to implement an R-level abstraction on top of it, just as
> was done for HDF5, bigmemory, etc.
>
> On Fri, Mar 3, 2017 at 8:32 AM, McDavid, Andrew
> <Andrew_Mcdavid@urmc.rochester.edu> wrote:
>
>> On the C++ side, Armadillo can be passed a pointer to memory for the
>> backing store of its objects, so it can use memory mapping.  On the R side, package
>> bigmemory provides R access and initialization of memory-mapped arrays.
>> See https://www.r-bloggers.com/using-rcpparmadillo-with-bigmemory/.
>> This doesn’t provide language or platform interchange of the backing store,
>> but would be an easy-ish solution.
>>
>> On Mar 3, 2017, at 10:23 AM, bioc-devel-requ...@r-project.org wrote:
>>
>> Some comments on Aaron's stuff.
>>
>> One possibility for doing things like this is if your code can be done in
>> C++ using a subset of rows or columns.  That can sometimes give the
>> necessary speedup.  What I mean is this:
>>
>> Say you can safely process 1000 cells (not matrix cells, but biological
>> cells, aka columns) at a time in RAM.
>>
>> iterate in R:
>>  get chunk i containing 1000 cells from the backend data storage
>>  do something on this submatrix, where everything is in a normal matrix
>> and you just use C++
>>  write results out to whatever backend you're using
>>
>> Then, with a million cells you iterate over 1000 chunks in R.  And you
>> don't need to "touch" the full dataset which can be stored on an arbitrary
>> backend.  And this approach could be run even (potentially) with different
>> chunks on different nodes.
>>
>>
>>         [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>
>
