On Sun, Apr 21, 2024 at 9:35 PM David Rowley <dgrowle...@gmail.com> wrote:
> On Mon, 22 Apr 2024 at 12:16, Ron Johnson <ronljohnso...@gmail.com> wrote:
> >
> > On Sun, Apr 21, 2024 at 6:45 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> >>
> >> Ron Johnson <ronljohnso...@gmail.com> writes:
> >> > Why is VACUUM FULL recommended for compressing a table, when CLUSTER does
> >> > the same thing (similarly doubling disk space), and apparently runs just as
> >> > fast?
> >>
> >> CLUSTER makes the additional effort to sort the data per the ordering
> >> of the specified index. I'm surprised that's not noticeable in your
> >> test case.
> >
> > Clustering on a completely different index was also 44 seconds.
>
> Both VACUUM FULL and CLUSTER go through a very similar code path. Both
> use cluster_rel(). VACUUM FULL just won't make use of an existing
> index to provide presorted input or perform a sort, whereas CLUSTER
> will attempt to choose the cheapest out of these two to get sorted
> results.
>
> If the timing for each is similar, it just means that using an index
> scan or sorting isn't very expensive compared to the other work that's
> being done. Both CLUSTER and VACUUM FULL require reading every heap
> page and writing out new pages into a new heap and maintaining all
> indexes on the new heap. That's quite an effort.
>

My original CLUSTER command didn't have to change the order of the data
very much, so the sort didn't have to do much work. CLUSTER on a
different index was indeed much slower than VACUUM FULL.
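
A minimal psql sketch for repeating this kind of comparison. The table,
column, and index names (big_table, id, other_col, big_table_pkey,
big_table_other_idx) and the 'public' schema are placeholders, not names
from this thread. The correlation column of pg_stats gives a rough idea of
how closely the heap already follows a column's order before any rewrite.

```sql
-- All object names below are hypothetical placeholders.
\timing on

-- Refresh statistics, then check how well the physical row order
-- already matches each column (values near 1 mean nearly sorted).
ANALYZE big_table;
SELECT attname, correlation
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename = 'big_table'
  AND attname IN ('id', 'other_col');

-- Rewrite the heap with no sorting at all.
VACUUM FULL big_table;

-- Rewrite in the order of an index the heap already roughly follows.
CLUSTER big_table USING big_table_pkey;

-- Rewrite in the order of an unrelated index (more reordering work).
CLUSTER big_table USING big_table_other_idx;
```

Each of these rewrites takes an ACCESS EXCLUSIVE lock on the table and
needs roughly double the table's disk space while it runs, so this is best
tried on a test copy.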