On Fri, May 25, 2018 at 6:33 AM, Asim Praveen <aprav...@pivotal.io> wrote:
> Hello
>
> We are evaluating the use of shared buffers for temporary tables.  The
> advantage being queries involving temporary tables can make use of parallel
> workers.
>
This is one way, but I think there are other choices as well.  We can
identify and flush all the dirty (local) buffers for the relation being
accessed in parallel.  Once the parallel operation has started, we won't
allow any write operations on them.  This could be expensive if we have a
lot of dirty local buffers for a particular relation.

If we are worried about the cost of those writes, then we can try a
different way to parallelize a temporary table scan.  At the beginning of
the scan, the leader backend remembers the dirty blocks present in its
local buffers and shares that list with the parallel workers, which skip
scanning those blocks; at the end, the leader ensures that all the skipped
blocks get scanned by the leader itself.  This shouldn't incur much
additional cost, as the skipped blocks are already present in the leader's
local buffers.

I understand that none of these alternatives are straightforward, but I
think it is worth considering whether we have a better way to allow
parallel temporary table scans.

> Challenges:
> 1. We lose the performance benefit of local buffers.

Yeah, I think cases where we need to drop temp relations will become
costlier, as they would have to traverse all the shared buffers instead of
just the local buffers.  Using shared buffers for temp relations would
also add some overhead for other backends, especially when those backends
need to evict buffers.  And if the relation stays in local buffers, we
might never write it to disk at all, whereas moving it to shared buffers
increases the probability of it being written.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
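P.S. To make the skipped-blocks idea above concrete, here is a toy
simulation in Python of how the block numbers might be partitioned: the
leader records which blocks are dirty in its local buffers, workers divide
the clean blocks among themselves, and the leader scans the dirty blocks
itself.  All names here are invented for illustration; this is not
PostgreSQL code.

```python
def plan_parallel_temp_scan(total_blocks, dirty_blocks, num_workers):
    """Toy model: partition block numbers so that workers scan only
    clean blocks and the leader scans the dirty ones (which are
    already sitting in its local buffers)."""
    dirty = set(dirty_blocks)
    clean = [b for b in range(total_blocks) if b not in dirty]
    # Round-robin the clean blocks across the workers.
    worker_assignments = [clean[w::num_workers] for w in range(num_workers)]
    leader_blocks = sorted(dirty)
    return worker_assignments, leader_blocks

workers, leader = plan_parallel_temp_scan(total_blocks=10,
                                          dirty_blocks=[2, 7],
                                          num_workers=3)
# Every block is scanned exactly once, and the dirty blocks (2 and 7)
# are scanned only by the leader.
scanned = sorted(b for ws in workers for b in ws) + leader
assert sorted(scanned) == list(range(10))
```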