Any reply on this, please?

On Fri, Nov 15, 2019 at 9:10 AM github kran <githubk...@gmail.com> wrote:
> On Thu, Nov 14, 2019 at 11:42 PM Pavel Stehule <pavel.steh...@gmail.com> wrote:
>
>> These numbers look crazy high - how much memory does your server have - more than 1 TB?
>
> The cluster has 244 GB of RAM and 64 TB of storage capacity.
>
>> On Fri, Nov 15, 2019 at 6:26 AM github kran <githubk...@gmail.com> wrote:
>>
>>> Hello PostgreSQL community,
>>>
>>> Hope everyone is doing great!
>>>
>>> *Background*
>>>
>>> We use PostgreSQL 10.6 and rely on it heavily for our day-to-day writes and reads. We run 2 clusters: cluster-1 keeps data for up to 60 days, and cluster-2 retains data beyond 1 year. The data is partitioned roughly weekly (a partition about every 5 days), which gives around 5 partitions per month per table; we have 2 primary tables, so about 10 partitions a month. Cluster-1 currently holds around 20 partitions and cluster-2 around 160 partitions (data from 2018). We want to keep up to 2 years of data in cluster-2 to serve our customers' needs, and so far we have reached 1 year.
>>>
>>> *Current activity*
>>>
>>> We have a custom weekly migration job that moves data from one cluster to the other. It does the following:
>>>
>>> 1) Uses COPY to export the data from cluster-1, splitting it into binary files
>>>
>>> 2) Writes the binary data into the cluster-2 table
>>>
>>> 3) Creates indexes after the data is copied
>>>
>>> *Problem we have right now*
>>>
>>> The last 2 times the weekly migration ran, the cluster's read-replica instance restarted because it had fallen behind the master (writer instance). Everything afterward worked seamlessly, but we want to avoid the replica restarting. To that end we started producing smaller binary files and copying those to cluster-2 instead of writing 1 big file of 450 million records. The recent migration was successful in that the reader instance did not restart after we split the 1 big file into multiple files, but it did restart after the indexes were created on the new table, presumably because index creation is write-intensive.
>>>
>>> *DB parameters set for the migration job*
>>>
>>> work_mem = 8 GB, maintenance_work_mem = 32 GB
>>>
>>> Indexes per table = 3
>>>
>>> Total indexes for the 2 tables = 5
>>>
>>> *DB size*
>>>
>>> Cluster-2 = 8.6 TB
>>>
>>> Cluster-1 = 3.6 TB
>>>
>>> Peak table rows = 400-480 million
>>>
>>> Average table rows = 300-350 million
>>>
>>> Per-table size = 90-95 GB; per-table index size is about 45 GB
>>>
>>> *Questions*
>>>
>>> 1) Can we decrease maintenance_work_mem to 16 GB, and would that slow down writes to the cluster enough for the reader instance to keep the data in sync?
>>>
>>> 2) Given the use case above, what do you recommend for keeping the data for up to 2 years?
>>>
>>> 3) What other recommendations do you have?
>>>
>>> Appreciate your replies.
>>>
>>> Thanks,
>>> githubkran
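For reference, the weekly move described under *Current activity* is essentially the following (a minimal sketch: the table name events_w46, the id split column, and the /tmp/chunk_01.bin path are placeholders, and the real job writes many smaller chunks rather than one ~450-million-row file):

    -- On cluster-1: export one slice of a partition to a binary file.
    -- Splitting on a key range keeps each file small.
    \copy (SELECT * FROM events_w46 WHERE id BETWEEN 1 AND 50000000) TO '/tmp/chunk_01.bin' WITH (FORMAT binary)

    -- On cluster-2: load each chunk into the matching table.
    \copy events_w46 FROM '/tmp/chunk_01.bin' WITH (FORMAT binary)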
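On question 1: maintenance_work_mem can be set per session, so the migration job can try 16 GB without touching the cluster-wide setting; whether less sort memory actually slows the index builds enough for the reader to keep up is something to measure rather than assume. A sketch of step 3, with a hypothetical index definition:

    -- Applies only to this session; other sessions keep the server default.
    SET maintenance_work_mem = '16GB';

    -- Hypothetical index; each real table carries 3 of these.
    CREATE INDEX events_w46_event_ts_idx ON events_w46 (event_ts);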
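To watch how far the reader falls behind while the job runs, stock PostgreSQL 10 streaming replication exposes lag on the writer as below; a managed cluster reports replica lag through its own metrics instead, so treat this as a sketch for the self-hosted case only:

    -- Byte and time lag of each replica, as seen from the primary.
    SELECT application_name,
           pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes,
           replay_lag
    FROM pg_stat_replication;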