Getting there... Thanks for the report Kasper!
On Sun, Jul 7, 2013 at 2:10 PM, Michael Lawrence <lawrence.mich...@gene.com> wrote:

> Awesome. If Hector is finished cleaning up, I'd be glad to merge it.
>
> Michael
>
>
> On Sat, Jul 6, 2013 at 6:18 PM, Kasper Daniel Hansen <kasperdanielhan...@gmail.com> wrote:
>
>> A little late, I can report that this speeds up my "many seqlevels"
>> problem, by 3 orders of magnitude.
>>
>> library(IRanges, lib.loc = "library")
>> library(GenomicRanges, lib.loc = "library")
>> library(BSgenome.Amellifera.BeeBase.assembly4)
>> Un <- Amellifera$GroupUn
>> gr <- GRanges(seqnames = names(Un),
>>               ranges = IRanges(start = 1, width = width(Un)))
>>
>> ## gr has a length of 9244, but each interval is in a new seqname.
>> ## this makes traditional findOverlaps extremely slow
>>
>> system.time({
>>   findOverlaps(gr, gr)
>> }) ## roughly 240 secs
>>
>> system.time({
>>   grF <- as(gr, "GIntervalTree")
>> })
>> system.time({
>>   findOverlaps(grF, grF)
>> }) ## roughly 0.1 secs
>>
>> ## speedup (for this example): 2400x fold !!!
>>
>> Kasper
>>
>>
>> On Thu, May 30, 2013 at 6:51 AM, Hector Corrada Bravo <hcorr...@umiacs.umd.edu> wrote:
>>
>>> Great. I already have unit tests there for IntervalForest and
>>> GIntervalTree.
>>> Hector
>>>
>>>
>>> On Wed, May 29, 2013 at 8:31 PM, Vincent Carey <st...@channing.harvard.edu> wrote:
>>>
>>> > Fine with me, as long as he is acquainted with the build/test before commit
>>> > practices that we are supposed to follow. Breaking IRanges can have severe
>>> > repercussions.
>>> >
>>> > On Wed, May 29, 2013 at 6:36 PM, Michael Lawrence <lawrence.mich...@gene.com> wrote:
>>> >
>>> > > Would it be feasible/acceptable to give Hector permission to commit?
>>> > >
>>> > > Michael
>>> > >
>>> > >
>>> > > On Wed, May 29, 2013 at 2:12 PM, Hector Corrada Bravo <hcorr...@gmail.com> wrote:
>>> > >
>>> > > > That's great! There's some cleaning up to do there how should we do this
>>> > > > post-merge?
>>> > > >
>>> > > >
>>> > > > On Wed, May 29, 2013 at 4:19 PM, Valerie Obenchain <voben...@fhcrc.org> wrote:
>>> > > >
>>> > > >> Hi Hector, Michael,
>>> > > >>
>>> > > >> This sounds great. Bringing these into svn is fine with us. Michael, do
>>> > > >> you want to merge these in?
>>> > > >>
>>> > > >> Val
>>> > > >>
>>> > > >> On 05/24/2013 07:30 AM, Hector Corrada Bravo wrote:
>>> > > >> > Thanks Michael,
>>> > > >> >
>>> > > >> > It has made significant difference for our visualization project. I would
>>> > > >> > like to merge this into svn asap. Can I get a ruling from the rest of the
>>> > > >> > core group? Please let me know if/when/how to proceed.
>>> > > >> >
>>> > > >> > Cheers,
>>> > > >> > Hector
>>> > > >> >
>>> > > >> >
>>> > > >> > On Wed, May 22, 2013 at 1:00 PM, Michael Lawrence <lawrence.mich...@gene.com> wrote:
>>> > > >> >
>>> > > >> >> *Added bioc-devel; hope you don't mind*
>>> > > >> >>
>>> > > >> >> Hector,
>>> > > >> >>
>>> > > >> >> This is great stuff. The overall design is on the right track. As you
>>> > > >> >> said, there's a bit of cleaning to do, but I think we should merge this
>>> > > >> >> into svn and work the rest out from there. This will really benefit
>>> > > >> >> performance, especially for visualization. Of course, I can't speak for
>>> > > >> >> the others.
>>> > > >> >>
>>> > > >> >> Michael
>>> > > >> >>
>>> > > >> >>
>>> > > >> >>
>>> > > >> >> On Tue, May 21, 2013 at 11:52 AM, Hector Corrada Bravo <hcorr...@umiacs.umd.edu> wrote:
>>> > > >> >>
>>> > > >> >>> Since the semester is over I finally finished this...
>>> > > >> >>>
>>> > > >> >>> Recall that I wanted a persistent set of IntervalTrees for GRanges
>>> > > >> >>> objects for repeated querying.
>>> > > >> >>> (The application is this:
>>> > > >> >>> http://epiviz.cbcb.umd.edu/help/?page_id=62 which I hope to get out
>>> > > >> >>> soon). Folding this into IRanges and GenomicRanges would make our life
>>> > > >> >>> easier come installation time.
>>> > > >> >>>
>>> > > >> >>> I've implemented class 'IntervalForest' within IRanges following
>>> > > >> >>> Michael's suggestion of storing this as an array of rbTree on the C side.
>>> > > >> >>> I've implemented findOverlaps that operates with this array in C. There is
>>> > > >> >>> code duplication in IntervalTree.c that could be reduced but that's if this
>>> > > >> >>> makes it into the package.
>>> > > >> >>>
>>> > > >> >>> I've also implemented a 'GIntervalTree' that uses 'IntervalForest'
>>> > > >> >>> underneath. findOverlaps-GenomicRanges-GIntervalTree-method is implemented
>>> > > >> >>> for this class. I didn't touch the existing
>>> > > >> >>> findOverlaps-GenomicRanges-GenomicRanges-method.
>>> > > >> >>>
>>> > > >> >>> You can pull these here:
>>> > > >> >>> http://github.com/hcorrada/IRanges
>>> > > >> >>> http://github.com/hcorrada/GenomicRanges
>>> > > >> >>>
>>> > > >> >>> These track the devel branch of the two packages. Let me know the best
>>> > > >> >>> way to propagate to svn if you guys want this. It needs documentation, but
>>> > > >> >>> I'll add that once implementation is settled.
>>> > > >> >>>
>>> > > >> >>> Kasper, I'm not sure if this would help with the 'too many seqlevels'
>>> > > >> >>> problem but I'd be curious to know if you try it.
>>> > > >> >>>
>>> > > >> >>> Cheers,
>>> > > >> >>> Hector
>>> > > >> >>>
>>> > > >> >
>>> > > >> > [[alternative HTML version deleted]]
>>> > > >> >
>>> > > >> > _______________________________________________
>>> > > >> > Bioc-devel@r-project.org mailing list
>>> > > >> > https://stat.ethz.ch/mailman/listinfo/bioc-devel