SECOND ATTEMPT AT POST. Web mailer appears to have eaten first one. I apologize in advance if anyone gets two versions of this post. =r
>From: Tom Lane <[EMAIL PROTECTED]> >Sent: Sep 26, 2005 9:42 PM >Subject: Re: [HACKERS] [PERFORM] A Better External Sort? > >So far, you've blithely assumed that you know the size of a cache line, >the sizes of L1 and L2 cache, > NO. I used exact values only as examples. Realistic examples drawn from an extensive survey of past, present, and what I could find out about future systems; but only examples nonetheless. For instance, Hennessy and Patterson 3ed points out that 64B cache lines are optimally performing for caches between 16KB and 256KB. The same source as well as sources specifically on CPU memory hierarchy design points out that we are not likely to see L1 caches larger than 256KB in the forseeable future. The important point was the idea of an efficient Key, rather than Record, sort using a CPU cache friendly data structure with provably good space and IO characteristics based on a reasonable model of current and likely future single box computer architecture (although it would be fairly easy to extend it to include the effects of networking.) No apriori exact or known values are required for the method to work. >and that you are working with sort keys that you can efficiently pack >into cache lines. > Not "pack". "map". n items can not take on more than n values. n values can be represented in lgn bits. Less efficient mappings can also work. Either way I demonstrated that we have plenty of space in a likely and common cache line size. Creating a mapping function to represent m values in lgm bits is a well known hack, and if we keep track of minimum and maximum values for fields during insert and delete operations, we can even create mapping functions fairly easily. (IIRC, Oracle does keep track of minimum and maximum field values.) >And that you know the relative access speeds of the caches and >memory so that you can schedule transfers, > Again, no. I created a reasonable model of a computer system that holds remarkably well over a _very_ wide range of examples. I don't need the numbers to be exactly right to justify my approach to this problem or understand why other approaches may have downsides. I just have to get the relative performance of the system components and the relative performance gap between them reasonably correct. The stated model does that very well. Please don't take my word for it. Go grab some random box: laptop, desktop, unix server, etc and try it for yourself. Part of the reason I published the model was so that others could examine it. >and that the hardware lets you get at that transfer timing. > Never said anything about this, and in fact I do not need any such. >And that the number of distinct key values isn't very large. > Quite the opposite in fact. I went out of my way to show that the method still works well even if every Key is distinct. It is _more efficient_ when the number of distinct keys is small compared to the number of data items, but it works as well as any other Btree would when all n of the Keys are distinct. This is just a CPU cache and more IO friendly Btree, not some magical and unheard of technique. It's just as general purpose as Btrees usually are. I'm simply looking at the current and likely future state of computer systems architecture and coming up with a slight twist on how to use already well known and characterized techniques. not trying to start a revolution. I'm trying very hard NOT to waste anyone's time around here. Including my own Ron ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings