Hi! Thank you for the feedback.
On Sun, Aug 26, 2018 at 4:09 AM Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, Aug 21, 2018 at 9:10 AM, Alexander Korotkov
> <a.korot...@postgrespro.ru> wrote:
> > After heap truncation using this algorithm, shared buffers may contain
> > past-EOF buffers. But those buffers are empty (no used items) and
> > clean. So, read-only queries shouldn't hint those buffers dirty
> > because there are no used items. Normally, these buffers will be just
> > evicted away from the shared buffer arena. If relation extension will
> > happen short after heap truncation then some of those buffers could be
> > found after relation extension. I think this situation could be
> > handled. For instance, we can teach vacuum to claim page as new once
> > all the tuples were gone.
>
> I think this all sounds pretty dangerous and fragile, especially in
> view of the pluggable storage work. If we start to add new storage
> formats, deductions based on the specifics of the current heap's
> hint-bit behavior may turn out not to be valid. Now maybe you could
> speculate that it won't matter because perhaps truncation will work
> differently in other storage formats too, but it doesn't sound to me
> like we'd be wise to bet on it working out that way.

Hmm, I'm not especially concerned about pluggable storage here. Each
pluggable storage decides for itself how it manages vacuum, including
relation truncation if needed. It may or may not reuse the relation
truncation function we have for heap. What we need to provide for that
function is an understandable and predictable interface. So, if the
relation truncation function cuts off trailing pages of the relation
that were previously cleaned as new, that looks fair enough to me.

The aspect I'm more concerned about here is whether we lose the ability
to detect some I/O errors if we don't distinguish new pages from pages
whose tuples were removed by vacuum.
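To make the interface idea above a bit more concrete, here is a minimal
sketch in C. It is a toy model under stated assumptions, not PostgreSQL
code: the names ToyPage, ToyPageState, toy_vacuum_page and
toy_truncation_point are all hypothetical. Vacuum claims a page as new
once its last used item is gone, and the truncation function only ever
cuts trailing pages that are in the new state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page states for this toy model; not PostgreSQL's page format. */
typedef enum ToyPageState
{
    TOY_PAGE_NEW,               /* never initialized, or claimed as new by vacuum */
    TOY_PAGE_EMPTY,             /* initialized, but holds no used items */
    TOY_PAGE_USED               /* holds at least one used item */
} ToyPageState;

typedef struct ToyPage
{
    ToyPageState state;
    int          nused;         /* number of used items on the page */
} ToyPage;

/*
 * Vacuum side (assumption): remove ndead items from the page and claim the
 * page as new once no used items remain.
 */
static void
toy_vacuum_page(ToyPage *page, int ndead)
{
    assert(ndead >= 0 && ndead <= page->nused);

    page->nused -= ndead;
    if (page->nused == 0)
        page->state = TOY_PAGE_NEW;
}

/*
 * Truncation side (assumption): return the relation length in blocks after
 * cutting trailing pages that are in the new state.  Pages that are merely
 * empty are not cut; that is the predictable part of the interface.
 */
static size_t
toy_truncation_point(const ToyPage *pages, size_t nblocks)
{
    while (nblocks > 0 && pages[nblocks - 1].state == TOY_PAGE_NEW)
        nblocks--;

    return nblocks;
}
```

Note that in this model a page emptied by vacuum becomes
indistinguishable from a page that was never initialized, which is
exactly the trade-off around I/O error detection mentioned above.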
> IIRC, Andres had some patches revising shared buffer management that
> allowed the ending block number to be maintained in shared memory.
> I'm waving my hands here, but with that kind of a system you can
> imagine that maybe there could also be a flag bit indicating whether a
> truncation is in progress. So, reduce the number of pages and set the
> bit; then zap all the pages above that value that are still present in
> shared_buffers; then clear the bit. Or maybe we don't even need the
> bit, but I think we do need some kind of race-free mechanism to make
> sure that we never try to read pages that either have been truncated
> away or in the process of being truncated away.

If we had some per-relation information in shared memory, then we could
make this work even without a special write barrier bit. Instead, we
could place a mark on the whole relation saying "please hold off on
writes past the following pending truncation point". Also, having the
ending block number in shared memory would save us from trying to read
blocks past EOF.

So, I'm sure that once Andres's work on revising shared buffer
management is complete, we will be able to solve these problems better.
But there is also a question of time. As I understand it, the shared
buffer management revision cannot realistically get committed to
PostgreSQL 12. And we have a pretty nasty set of problems here. It would
be nice to do something about them during this release cycle. But for
sure, we should keep in mind how this solution would have to be revised
once we have the new shared buffer management.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company