Hi all, attached a patch to do seq scan + sorting instead of index scan
on CLUSTER (when that's supposed to be faster). As I've already said, the patch is based on: http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php Of course, the code isn't supposed to be ready to be merged: I would like to write more comments and add some test cases to cluster.sql (plus change all the things you are going to tell me I have to change...) I would like your opinions on code correctness and the decisions I took, especially: 1) function names ("cost_index_scan_vs_seqscansort" I guess it's awful...) 2) the fact that I put in Tuplesortstate an EState variable, so that MakeSingleTupleTableSlot wouldn't have to be called for every row in the expression indexes case 3) the expression index case is not "optimized": I preferred to call FormIndexDatum once for the first key value in copytup_rawheap and another time to get all the remaining values in comparetup_rawheap. I liked the idea of re-using FormIndexDatum in that case, instead of copying&pasting only the relevant code: but FormIndexDatum returns all the values, even when I might need only the first one 4) I refactored the code to deform and rewrite tuple into the function "deform_and_rewrite_tuple", because now that part can be called by the regular index scan or by the new seq-scan + sort (but I could copy&paste those lines instead of refactoring them into a new function) Suggestions and comments are not just welcome, but needed! Leonardo
sorted_cluster.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers