i forgot to add, my naive_find is: def naive_find(intervals, start, stop): results = [] for interval in intervals: if interval.start >= start and interval.stop <= stop: results.append(interval) return results
On Jan 12, 11:55 pm, Per Freem <perfr...@yahoo.com> wrote: > On Jan 12, 10:58 pm, Steven D'Aprano > > > > <ste...@remove.this.cybersource.com.au> wrote: > > On Mon, 12 Jan 2009 14:49:43 -0800, Per Freem wrote: > > > thanks for your replies -- a few clarifications and questions. the > > > is_within operation is containment, i.e. (a,b) is within (c,d) iff a > > >>= c and b <= d. Note that I am not looking for intervals that > > > overlap... this is why interval trees seem to me to not be relevant, as > > > the overlapping interval problem is way harder than what I am trying to > > > do. Please correct me if I'm wrong on this... > > > To test for contained intervals: > > a >= c and b <= d > > > To test for overlapping intervals: > > > not (b < c or a > d) > > > Not exactly what I would call "way harder". > > > -- > > Steven > > hi Steven, > > i found an implementation (which is exactly how i'd write it based on > the description) > here:http://hackmap.blogspot.com/2008/11/python-interval-tree.html > > when i use this however, it comes out either significantly slower or > equal to a naive search. my naive search just iterates through a > smallish list of intervals and for each one says whether they overlap > with each of a large set of intervals. > > here is the exact code i used to make the comparison, plus the code at > the link i have above: > > class Interval(): > def __init__(self, start, stop): > self.start = start > self.stop = stop > > import random > import time > num_ints = 30000 > init_intervals = [] > for n in range(0, > num_ints): > start = int(round(random.random() > *1000)) > end = start + int(round(random.random()*500+1)) > init_intervals.append(Interval(start, end)) > num_ranges = 900 > ranges = [] > for n in range(0, num_ranges): > start = int(round(random.random() > *1000)) > end = start + int(round(random.random()*500+1)) > ranges.append((start, end)) > #print init_intervals > tree = IntervalTree(init_intervals) > t1 = time.time() > for r in ranges: > tree.find(r[0], r[1]) > t2 = time.time() > print "interval tree: %.3f" %((t2-t1)*1000.0) > t1 = time.time() > for r in ranges: > naive_find(init_intervals, r[0], r[1]) > t2 = time.time() > print "brute force: %.3f" %((t2-t1)*1000.0) > > on one run, i get: > interval tree: 8584.682 > brute force: 8201.644 > > is there anything wrong with this implementation? it seems very right > to me but i am no expert. any help on this would be relly helpful. -- http://mail.python.org/mailman/listinfo/python-list