Sounds cool. I agree about the need for range* functions, if only for simplicity and convenience. Are you able to attach the patch?

Thanks,
Michael

On Sun, Jun 1, 2014 at 7:58 PM, Peter Haverty <haverty.pe...@gene.com> wrote:

> I think viewMedians would be great. While you have the hood up, there are
> some opportunities for some speedups and code simplification, I believe.
>
> I did some experimentation with view* in the genoset package. I made an
> alternate version of the C for viewMeans and found about a 10X speedup. I
> hoisted the branching for the different types and did the NA handling with
> arithmetic rather than branching. The search for the Rle runs covered by
> each view is now done with findInterval. There are quite a few code
> sections that differ only in the type of the NA value and the pointers to
> the input/output vectors. I think it would be worth considering C++
> templates.
>
> On the R side, each view* function is pretty similar too. In
> genoset/R/RleDataFrame-views.R I tried to factor out all the shared pieces.
>
> While we're on the topic, I think the view* functions should have range*
> equivalents that skip the View object and work on an Rle and an IRanges.
> If you already have a Views object around, view* are perfect. Otherwise,
> making the Views objects uses time that could be saved.
>
> Overall I found about a 90X speedup over viewMeans(RleViewsList).
>
> I hope there is some useful food for thought in these experiments. I have a
> vignette that shows some of the timings if anyone is interested.
>
> Regards,
> Pete
>
> ____________________
> Peter M. Haverty, Ph.D.
> Genentech, Inc.
> phave...@gene.com
>
> _______________________________________________
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
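
For readers following the thread, the two ideas in the quoted message (finding the Rle runs covered by each range with a findInterval-style search, and handling NA with arithmetic rather than branches) might look roughly like the C sketch below. This is only an illustration, not the genoset or IRanges code: the names range_means_int, find_run and NA_INT are hypothetical, the Rle is assumed to be passed as integer values plus cumulative 1-based run ends, NA is assumed to be encoded as INT_MIN as in R's integer vectors, and ranges are assumed to lie within the Rle.

```c
#include <limits.h>
#include <math.h>
#include <stddef.h>

/* Assumption: integer NA is encoded as INT_MIN, mirroring R's convention. */
#define NA_INT INT_MIN

/* Index of the first run whose 1-based, inclusive end position is >= pos.
   A binary search over cumulative run ends, playing the role that
   findInterval plays on the R side. */
static size_t find_run(const int *run_end, size_t nrun, int pos)
{
    size_t lo = 0, hi = nrun;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (run_end[mid] < pos)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Mean of an integer Rle over each range, skipping NA values.
   values[r] is the value of run r and run_end[r] its last position;
   start/width are 1-based ranges. out[i] is NAN when range i contains
   no non-NA values. */
void range_means_int(const int *values, const int *run_end, size_t nrun,
                     const int *start, const int *width, size_t nrange,
                     double *out)
{
    for (size_t i = 0; i < nrange; i++) {
        int pos = start[i];
        int end = start[i] + width[i] - 1;
        long long sum = 0, n_ok = 0;

        for (size_t r = find_run(run_end, nrun, pos);
             r < nrun && pos <= end; r++) {
            int last = run_end[r] < end ? run_end[r] : end;
            long long n = (long long)last - pos + 1; /* positions covered in run r */
            long long ok = values[r] != NA_INT;      /* 1 if usable, 0 if NA */

            /* NA runs contribute zero to both totals via multiplication,
               so no per-run NA branch is needed. */
            sum  += ok * n * values[r];
            n_ok += ok * n;
            pos = last + 1;
        }
        out[i] = n_ok ? (double)sum / (double)n_ok : NAN;
    }
}
```

Because the function takes the Rle and the ranges directly, it also corresponds to the range* idea of skipping the Views object. A double version would differ mainly in the NA test and the pointer types, which is the kind of duplication the quoted message suggests C++ templates could remove.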