Sorry, changing the subject line.

On Thu, Nov 14, 2019 at 11:21 AM github kran <githubk...@gmail.com> wrote:
> Hello Team,
>
> Hope everyone is doing great!
>
> *Background*
>
> We use PostgreSQL version 10.6 and rely on it heavily for our day-to-day
> reads and writes. We have 2 clusters running the PostgreSQL engine: one
> cluster keeps data up to 60 days and the other retains data beyond 1 year.
> The data is partitioned roughly by week (~every 5 days a new partition), so
> we have around 5 partitions per month for each table. We primarily have 2
> tables, so that is around 10 partitions per month. Cluster-1 currently has
> around 20 partitions and cluster-2 has around 160 partitions (data going
> back to 2018). We also want to keep up to 2 years of data in cluster-2 to
> serve customer needs; so far we have reached 1 year of retained data.
>
> *Current activity*
>
> We have a custom weekly migration DB script job that moves data from one
> cluster to the other. It does the following:
>
> 1) Uses the COPY command to export the data from cluster-1, splitting it
> into binary files
>
> 2) Writes the binary data into the cluster-2 table
>
> 3) Creates indexes after the data is copied
>
> *Problem we have right now*
>
> The last 2 times the weekly migration ran, the cluster's read replica
> instance restarted because it had fallen behind the master (writer
> instance). Everything afterwards worked seamlessly, but we want to avoid
> the replica restarting. To avoid the restart we started producing smaller
> binary files and copying those to cluster-2 instead of writing 1 big file
> of 450 million records. In the most recent migration this worked: the
> reader instance did not restart after we split the 1 big file into multiple
> files, but it did restart after the indexes were created on the new table,
> as that step can be write intensive.
>
> *DB parameters set on the migration job*
>
> work_mem set to 8 GB and maintenance_work_mem = 32 GB
>
> Indexes per table = 3
>
> Total indexes for the 2 tables = 5
>
> *DB size*
>
> Cluster-2 = 8.6 TB
>
> Cluster-1 = 3.6 TB
>
> Peak table rows = 400 - 480 million
>
> Average table rows = 300 - 350 million
>
> Per-table size = 90 - 95 GB; per-table index size is about 45 GB
>
> *Questions*
>
> 1) Can we decrease maintenance_work_mem to 16 GB, and will that slow down
> the writes to the cluster so the reader instance can sync the data more
> gradually?
>
> 2) Based on the above use case, what are your recommendations for keeping
> the data for up to 2 years?
>
> 3) What other recommendations do you have?
>
> Appreciate your replies.
>
> Thanks
> githubkran
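
For illustration, here is a minimal sketch of the three migration steps described in the quoted message, as they might look in psql; the table name events_2019_w46, the file path, and the indexed column event_time are placeholders, not the actual script:

  -- On cluster-1: export one partition's rows to a binary file
  \copy (SELECT * FROM events_2019_w46) TO '/tmp/events_2019_w46.bin' WITH (FORMAT binary)

  -- On cluster-2: load the binary file into the matching table
  \copy events_2019_w46 FROM '/tmp/events_2019_w46.bin' WITH (FORMAT binary)

  -- Step 3: build indexes after the load; this is the write-heavy step,
  -- since index builds generate a large amount of WAL for the replica to replay
  CREATE INDEX events_2019_w46_event_time_idx ON events_2019_w46 (event_time);

Splitting the export into smaller binary files, as described above, simply means running the first \copy once per key range and loading each resulting file separately.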