On Thu, Nov 14, 2019 at 11:42 PM Pavel Stehule <pavel.steh...@gmail.com> wrote:
> these numbers look crazy high - how much memory has your server - more
> than 1 TB?

> The cluster has 244 GB of RAM, and its storage capacity is 64 TB.
>
> On Fri, Nov 15, 2019 at 6:26 AM github kran <githubk...@gmail.com> wrote:
>
>> Hello PostgreSQL community,
>>
>> Hope everyone is doing great!
>>
>> *Background*
>>
>> We run PostgreSQL 10.6 and use it heavily for our day-to-day reads and
>> writes. We have 2 clusters running the PostgreSQL engine: one cluster
>> keeps data for up to 60 days, and the other retains data beyond 1 year.
>> The data is partitioned roughly weekly (a partition every ~5 days), so we
>> have around 5 partitions per month per table. We have 2 main tables, so
>> that comes to about 10 partitions per month. Cluster-1 currently has
>> around 20 partitions and cluster-2 around 160 partitions (data from
>> 2018). We also want to keep up to 2 years of data in cluster-2 to serve
>> our customers' data needs; so far we have reached 1 year.
>>
>> *Current activity*
>>
>> We have a custom weekly migration DB script job that moves data from one
>> cluster to the other. It does the following:
>>
>> 1) Uses COPY to export the data from cluster-1, splitting it into binary
>> files
>>
>> 2) Writes the binary data into the cluster-2 table
>>
>> 3) Creates the indexes after the data is copied
>>
>> *Problem we have right now*
>>
>> During the last 2 weekly migration runs, the cluster's read-replica
>> instance restarted because it fell behind the master (writer) instance.
>> Everything after that worked seamlessly, but we want to avoid the replica
>> getting restarted.
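The three migration steps above can be sketched in plain SQL. Table names, file paths, column names, and key ranges below are hypothetical placeholders, not the poster's actual schema; the real job would derive the chunk boundaries from its own splitting logic:

```sql
-- Step 1 (on cluster-1): export one key range per binary file.
-- Table name (events), path, and range bounds are illustrative only.
COPY (SELECT * FROM events WHERE event_id BETWEEN 1 AND 50000000)
    TO '/staging/events_chunk_01.bin' WITH (FORMAT binary);

-- Step 2 (on cluster-2): load each chunk into the target table.
COPY events_archive FROM '/staging/events_chunk_01.bin' WITH (FORMAT binary);

-- Step 3 (on cluster-2): build indexes only after all chunks are loaded.
CREATE INDEX events_archive_created_at_idx
    ON events_archive (created_at);
```

Loading first and indexing afterwards, as the job already does, avoids maintaining indexes row by row during the COPY; the index build itself is still one large burst of WAL, which is exactly what the replica must replay.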
>> To avoid the restart, we switched to producing smaller binary files and
>> copying those to cluster-2 instead of writing 1 big file of 450 million
>> records. The recent migration was successful in that the reader instance
>> did not restart after we split the 1 big file into multiple files, but it
>> did restart after the indexes were created on the new table, presumably
>> because index creation is write-intensive.
>>
>> *DB parameters set for the migration job*
>>
>> work_mem = 8 GB and maintenance_work_mem = 32 GB
>>
>> Indexes per table = 3
>>
>> Total indexes for the 2 tables = 5
>>
>> *DB size*
>>
>> Cluster-2 = 8.6 TB
>>
>> Cluster-1 = 3.6 TB
>>
>> Peak table rows = 400-480 million
>>
>> Average table rows = 300-350 million
>>
>> Per-table size = 90-95 GB; per-table index size is about 45 GB
>>
>> *Questions*
>>
>> 1) Can we decrease maintenance_work_mem to 16 GB? Will that slow down the
>> writes to the cluster so the reader instance can sync the data more
>> gradually?
>>
>> 2) Based on the above use case, what are your recommendations for keeping
>> the data for up to 2 years?
>>
>> 3) What other recommendations do you have?
>>
>> Appreciate your replies.
>>
>> Thanks,
>> githubkran
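On question 1, one point worth noting: maintenance_work_mem can be lowered for the migration session alone rather than cluster-wide, but it mainly governs how much memory the sort in an index build may use, not how much WAL the build writes, so the replica still has to replay roughly the same write volume. A minimal sketch, with illustrative values and a hypothetical index name:

```sql
-- Override only in the migration job's session; the cluster-wide
-- setting is untouched and reverts when the session ends.
SET maintenance_work_mem = '16GB';

-- Building the indexes one at a time, with pauses between them, spreads
-- the WAL burst out and gives the replica time to catch up.
CREATE INDEX events_archive_account_idx ON events_archive (account_id);
```

On PostgreSQL 10 there are no parallel index builds (those arrived in version 11), so a single CREATE INDEX uses at most one maintenance_work_mem allocation.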