Thanks everyone for their comments. I think the easiest (non-hack)
implementation would do the following:
1) For now, use the newly created IntervalTreeList class to implement
Michael's hash idea. I'm weary of pushing this into the IntervalTree class
since it would make this class depart from the
On 2013-04-03 19:28, Kasper Daniel Hansen wrote:
Making IntervalTree chromosome would also be a great addition for organisms
with many sequences, like bee (due to an incomplete genome; 10,000s of
sequences). It does not matter for humans, but findOverlaps is
excruciatingly slow for bee's. I hav
Yes, we could use this offset-based approach. Though we would want to just
use the seqlengths for calculating the offsets, so that every query will be
compatible with the index. This would usually mean doubles, instead of
integers. The hash approach might be a tiny bit faster, as we wouldn't need
t
On 04/03/2013 10:28 AM, Kasper Daniel Hansen wrote:
Making IntervalTree chromosome would also be a great addition for organisms
with many sequences, like bee (due to an incomplete genome; 10,000s of
sequences). It does not matter for humans, but findOverlaps is
excruciatingly slow for bee's. I
Making IntervalTree chromosome would also be a great addition for organisms
with many sequences, like bee (due to an incomplete genome; 10,000s of
sequences). It does not matter for humans, but findOverlaps is
excruciatingly slow for bee's. I have a couple of posts on this in the
archive.
I am p
Some ideas:
- Turn the IntervalTree into a list/array of nodes that can be
subset/reordered with shallow copying (just copy the pointers to the
nodes), and the index would be secondary. The index in the array could be
stored in each node, for lookup during overlap queries. Right now, as far
as I c
Hi Hector,
That's interesting, thanks for passing this along. I'm still wishing that
somehow GRanges itself could abstract the way it stores ranges. I know that
Herve/Patrick had some reasons for depending specifically on GRanges. One
reason was probably convenience at the C level, but it wouldn't
Yep, I didn't comment on that, but I agree that abstracting how
GRanges stores ranges would make this more elegant. Right now
ranges(GRanges) is specified to be of IRanges class instead of the
abstract Ranges class.
If it were the latter then GIntervalTree can be a subclass of
GenomicRanges, in a