Hello David,

Thank you for the valuable inputs. We will test these scenarios.
Regards,
Ameya

On Tue, Aug 2, 2022 at 12:16 PM David Rowley <dgrowle...@gmail.com> wrote:
> On Tue, 2 Aug 2022 at 19:48, Ameya Bidwalkar
> <bidwalkar.amey...@gmail.com> wrote:
> > We have a PostgreSQL 13 database with a single table of several
> > million rows. We plan to partition it based on timestamp.
> > We have been seeking advice on best practices for building this.
> > This table will get lots of updates to the same rows during a short
> > period of time. During this time the rows would be in a single
> > partition. After this short time these rows would move to another
> > partition, where no more updates take place on these rows, but some
> > SELECT queries might still run against them.
> > We plan to have partitions based on months, then roll them up into a
> > year, and then archive these older partitions.
> > One consultant we talked with told us this row movement between the
> > partitions will have huge complications. But this was an issue in the
> > Postgres 10 version.
>
> Define "huge complications"?
>
> The capabilities of partitioned tables have changed quite a bit since
> the feature was added. It's very easy for knowledge to get outdated
> in this area. I did quite a bit of work on them and I struggle to
> remember off the top of my head which versions saw which improvements.
> PG12 saw lots. See [1], search for "partition".
>
> One possible complication is what is mentioned in [2] about the
> "serialization failure error". UPDATEs that cause a tuple to move to
> another partition can cause a serialization failure at any transaction
> isolation level, not just in serializable transactions. If it doesn't
> already, you might want to have your application retry transactions on
> SQL:40001 errors.
>
> Apart from that, assuming there's a comparatively small number of rows
> in the partition being updated compared to the partition with the
> static rows, then it sounds fairly efficient. As you describe it, the
> larger static partition is effectively INSERT-only and auto-vacuum
> will need to touch it only for tuple freezing work. The smaller of
> the two tables will receive more churn but will be faster to vacuum.
> PG13 got a new feature that makes sure auto-vacuum does the rounds on
> INSERT-only tables too, so the static partition is not going to be
> neglected until anti-wraparound autovacuums trigger, as it would have
> been in PG12 and earlier.
>
> Another thing to consider is that an UPDATE of a non-partitioned table
> has a chance of being a HOT update. That's possible if the new tuple
> can fit on the same page and none of the indexed columns are updated.
> A HOT update means no indexes need to be updated, so these perform
> faster and require less space in WAL than a non-HOT update. An UPDATE
> that moves a tuple to another partition can never be a HOT update.
> That's something you might want to consider. If you're updating
> indexed columns already then it's not a factor. There's also overhead
> to Postgres having to find the partition for the newly updated version
> of the tuple. That's not hugely expensive, but it's generally
> measurable. RANGE partitioned tables with a large number of partitions
> will have the most overhead for this; HASH partitioned tables, the
> least.
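For reference, a minimal sketch of the kind of monthly RANGE layout being
discussed; the table and column names here are hypothetical, not taken from
the thread, and the pgbench sketch at the end of this message assumes the
same schema:

    -- Parent table, range-partitioned on the timestamp column.
    CREATE TABLE events (
        id         bigint      NOT NULL,
        created_at timestamptz NOT NULL,
        payload    text
    ) PARTITION BY RANGE (created_at);

    -- One partition per month. The current month takes the heavy UPDATE
    -- traffic; older months become effectively static.
    CREATE TABLE events_2022_07 PARTITION OF events
        FOR VALUES FROM ('2022-07-01') TO ('2022-08-01');
    CREATE TABLE events_2022_08 PARTITION OF events
        FOR VALUES FROM ('2022-08-01') TO ('2022-09-01');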
> The best thing you can likely do is set up a scenario with pgbench and
> compare the performance. pgbench is a pretty flexible tool that will
> allow you to run certain queries X% of the time and even throttle the
> workload to what you expect your production server to experience. You
> could then run it overnight on a test server, or even for weeks, and
> see how auto-vacuum keeps up when compared to the non-partitioned
> case. You can also check how much extra WAL is generated vs the
> non-partitioned case.
>
> > So we are seeking advice from the performance perspective and on the
> > things we should take care of, along with manual vacuums on a regular
> > schedule and indexing.
> > Are there any tunables I should experiment with in particular?
>
> Perhaps if you want to keep a small high-churn table in check you might
> want to consider if autovacuum_naptime is set low enough. You may not
> care if the space consumed during the standard 1min autovacuum_naptime
> is small enough not to be of concern.
>
> David
>
> [1] https://www.postgresql.org/docs/release/12.0/
> [2] https://www.postgresql.org/docs/13/sql-update.html
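To make the pgbench suggestion above concrete, here is a sketch of two
custom scripts that could be fed to pgbench against the hypothetical
events table from the earlier sketch; the script names, id range, weights
and rate are illustrative only, not something from the thread:

    -- hot_update.sql: exercise the current (hot) partition with UPDATEs.
    \set id random(1, 1000000)
    UPDATE events
       SET payload = md5(random()::text)
     WHERE id = :id
       AND created_at >= date_trunc('month', now());

    -- old_select.sql: read-only queries against the static partitions.
    \set id random(1, 1000000)
    SELECT payload
      FROM events
     WHERE id = :id
       AND created_at < date_trunc('month', now());

    -- Mix the scripts 80/20 and throttle to an expected production rate,
    -- for example:
    --   pgbench -n -c 8 -T 3600 --rate=500 \
    --       -f hot_update.sql@80 -f old_select.sql@20 mydb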