On Sun, Apr 29, 2007 at 12:54:36AM -0700, William Lee Irwin III wrote: >> Common code for rbtree-based priority queues can be factored out of >> cfq, cfs, and hrtimers.
On Sun, Apr 29, 2007 at 10:13:17AM +0200, Willy Tarreau wrote: > In my experience, rbtrees are painfully slow. Yesterday, I spent the > day replacing them in haproxy with other trees I developped a few > years ago, which look like radix trees. They are about 2-3 times as > fast to insert 64-bit data, and you walk through them in O(1). I have > many changes to apply to them before they could be used in kernel, but > at least I think we already have code available for other types of trees. Dynamic allocation of auxiliary indexing structures is problematic for the scheduler, which significantly constrains the algorithms one may use for this purpose. rbtrees are not my favorite either. Faster alternatives to rbtrees exist even among binary trees; for instance, it's not so difficult to implement a heap-ordered tree maintaining the red-black invariant with looser constraints on the tree structure and hence less rebalancing. One could always try implementing a van Emde Boas queue, if he felt particularly brave. Some explanation of the structure may be found at: http://courses.csail.mit.edu/6.897/spring03/scribe_notes/L1/lecture1.pdf According to that, y-trees use less space, and exponential trees are asymptotically faster with a worst-case asymptotic running time of O(min(lg(lg(u))*lg(lg(n))/lg(lg(lg(u))), sqrt(lg(n)/lg(lg(n))))) for all operations, so van Emde Boas is not the ultimate algorithm by any means at O(lg(lg(u))); in these estimates, u is the size of the "universe," or otherwise the range of the key data type. Not to say that any of those are appropriate for the kernel; it's rather likely we'll have to settle for something less interesting, if we bother ditching rbtrees at all, on account of the constraints of the kernel environment. I'll see what I can do about a userspace test harness for priority queues more comprehensive than smart-queue.c. I have in mind the ability to replay traces obtained from queues in the kernel and loading priority queue implementations via dlopen()/dlsym() et al. valgrind can do most of the dirty work. Otherwise running a trace for some period of time and emitting the number of operations it got through should serve as a benchmark. With that in hand, people can grind out priority queue implementations and see how they compare on real operation sequences logged from live kernels. -- wli - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/