> In my original example, a sequential scan of the 1TB of 2KB or 4KB
> records, => 250M or 500M records of data, being sorted on a binary
> value key will take ~1000x more time than reading in the ~1GB Btree
> I described that used a Key+RID (plus node pointers) representation
> of the data.
IMHO you seem to ignore the final step your algorithm needs: collecting the data rows. After you have sorted the keys, the collect step will effectively access the tuples in random order (given a sufficiently large key range). This random access is expensive: in the time it takes, a competing algorithm can read the whole data set at least 40 times sequentially, or write it 20 times sequentially. (Those are the random/sequential throughput ratios of modern disks.)

Andreas
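[Editor's note: to make the "at least 40 times" figure concrete, here is a back-of-envelope sketch in Python. The device numbers (50 MB/s sequential bandwidth, ~5 ms per random 4KB read) are assumptions for a disk of that era, not figures taken from the thread.]

    # Back-of-envelope cost model: random-order collect step vs.
    # repeated sequential passes over the same 1TB data set.
    # All device numbers below are assumptions, not measurements.

    SEQ_BANDWIDTH  = 50e6   # bytes/s sequential read (assumed)
    RANDOM_IO_TIME = 5e-3   # seconds per random 4KB read (assumed)

    DATA_BYTES   = 1e12     # 1TB table, as in the quoted example
    RECORD_BYTES = 4e3      # 4KB records => 250M records
    N_RECORDS    = DATA_BYTES / RECORD_BYTES

    one_seq_pass   = DATA_BYTES / SEQ_BANDWIDTH   # seconds per full scan
    random_collect = N_RECORDS * RANDOM_IO_TIME   # seconds to fetch all rows

    print("one sequential pass: %5.1f h" % (one_seq_pass / 3600))
    print("random collect step: %5.1f h" % (random_collect / 3600))
    print("collect ~= %.0f sequential passes" % (random_collect / one_seq_pass))

With these assumed figures, one sequential pass takes about 5.6 hours while the random-order collect step takes about 347 hours, i.e. roughly 60 sequential passes, consistent with the "at least 40 times" claim above.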