Greg Stark <[EMAIL PROTECTED]> writes:
> The approach they take is to have a function which calculates an
> abstract "distance" between any two entries. There's an algorithm that
> they use to pick the split based on this distance function.
> If you abandoned "PickSplit" and instead exposed this distance
> function as the external API then the behaviour for multi-column
> indexes is clear. You calculate the distance along all the axes and
> calculate the diagonal distance.

Hmm ... the problem with that is the assumption that different opclasses
will compute similarly-scaled distances. If opclass A generates
distances in the range (0,1e6) while B generates in the range (0,1),
combining them with Euclidean distance won't work well at all. OTOH you
can't blindly normalize, because in some cases maybe the data is such
that a massive difference in distances is truly appropriate.

I'm also a bit leery of the assumption that every GiST application can
reduce its PickSplit logic to Euclidean distances.

regards, tom lane
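
The scaling concern can be illustrated with a minimal Python sketch. It is not from the thread and is not GiST code: the function name `combined_distance` and the sample per-column distances are hypothetical, chosen only to match the (0,1e6) and (0,1) ranges mentioned above.

```python
import math

def combined_distance(per_column):
    """Euclidean ("diagonal") combination of per-column distances.

    Hypothetical helper for illustration; not part of any GiST API.
    """
    return math.sqrt(sum(d * d for d in per_column))

# Assumed sample values: column A's opclass reports distances in (0, 1e6),
# column B's opclass reports distances in (0, 1).
near_in_b = combined_distance([500_000.0, 0.01])  # entries close on column B
far_in_b = combined_distance([500_000.0, 0.99])   # entries far apart on column B

# Relative change in the combined distance when column B goes from
# "almost identical" to "almost maximally different".
relative_change = (far_in_b - near_in_b) / near_in_b
print(relative_change)
```

With these assumed inputs the relative change is vanishingly small, so column B would have essentially no influence on a split chosen from the combined distance. Rescaling each column to a common range would restore B's influence, but as noted above that can be wrong when the large difference in scale is a real property of the data.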