Any reply on this, please?

On Fri, Nov 15, 2019 at 9:10 AM github kran <githubk...@gmail.com> wrote:
> On Thu, Nov 14, 2019 at 11:42 PM Pavel Stehule <pavel.steh...@gmail.com> wrote:
>
>> These numbers look crazy high - how much memory does your server have - more than 1 TB?
>
> The cluster has 244 GB of RAM and 64 TB of storage capacity.
>
>> On Fri, Nov 15, 2019 at 6:26 AM github kran <githubk...@gmail.com> wrote:
>>
>>> Hello PostgreSQL community,
>>>
>>> Hope everyone is doing great!
>>>
>>> *Background*
>>>
>>> We use PostgreSQL 10.6 and rely on it heavily for our day-to-day writes and reads. We run 2 clusters: cluster-1 keeps data for up to 60 days, and cluster-2 retains data beyond 1 year. The data is partitioned roughly weekly (a partition about every 5 days), which gives around 5 partitions per month per table; we have 2 primary tables, so about 10 partitions a month. Cluster-1 currently holds around 20 partitions and cluster-2 around 160 partitions (data from 2018). We want to keep up to 2 years of data in cluster-2 to serve our customers' needs, and so far we have reached 1 year.
>>>
>>> *Current activity*
>>>
>>> We have a custom weekly migration job that moves data from one cluster to the other. It does the following:
>>>
>>> 1) Uses COPY to export the data from cluster-1, splitting it into binary files
>>>
>>> 2) Writes the binary data into the cluster-2 table
>>>
>>> 3) Creates indexes after the data is copied
>>>
>>> *Problem we have right now*
>>>
>>> The last 2 times the weekly migration ran, the cluster's read-replica instance restarted because it had fallen behind the master (writer instance). Everything afterward worked seamlessly, but we want to avoid the replica restarting. To that end we started producing smaller binary files and copying those to cluster-2 instead of writing 1 big file of 450 million records. The recent migration was successful in that the reader instance did not restart after we split the 1 big file into multiple files, but it did restart after the indexes were created on the new table, presumably because index creation is write-intensive.
>>>
>>> *DB parameters set for the migration job*
>>>
>>> work_mem = 8 GB, maintenance_work_mem = 32 GB
>>>
>>> Indexes per table = 3
>>>
>>> Total indexes for the 2 tables = 5
>>>
>>> *DB size*
>>>
>>> Cluster-2 = 8.6 TB
>>>
>>> Cluster-1 = 3.6 TB
>>>
>>> Peak table rows = 400-480 million
>>>
>>> Average table rows = 300-350 million
>>>
>>> Per-table size = 90-95 GB; per-table index size is about 45 GB
>>>
>>> *Questions*
>>>
>>> 1) Can we decrease maintenance_work_mem to 16 GB, and would that slow down writes to the cluster enough for the reader instance to keep the data in sync?
>>>
>>> 2) Given the use case above, what do you recommend for keeping the data for up to 2 years?
>>>
>>> 3) What other recommendations do you have?
>>>
>>> Appreciate your replies.
>>>
>>> Thanks,
>>> githubkran
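For reference, the weekly move described under *Current activity* is essentially the following (a minimal sketch: the table name events_w46, the id split column, and the /tmp/chunk_01.bin path are placeholders, and the real job writes many smaller chunks rather than one ~450-million-row file):

    -- On cluster-1: export one slice of a partition to a binary file.
    -- Splitting on a key range keeps each file small.
    \copy (SELECT * FROM events_w46 WHERE id BETWEEN 1 AND 50000000) TO '/tmp/chunk_01.bin' WITH (FORMAT binary)

    -- On cluster-2: load each chunk into the matching table.
    \copy events_w46 FROM '/tmp/chunk_01.bin' WITH (FORMAT binary)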
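On question 1: maintenance_work_mem can be set per session, so the migration job can try 16 GB without touching the cluster-wide setting; whether less sort memory actually slows the index builds enough for the reader to keep up is something to measure rather than assume. A sketch of step 3, with a hypothetical index definition:

    -- Applies only to this session; other sessions keep the server default.
    SET maintenance_work_mem = '16GB';

    -- Hypothetical index; each real table carries 3 of these.
    CREATE INDEX events_w46_event_ts_idx ON events_w46 (event_ts);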
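To watch how far the reader falls behind while the job runs, stock PostgreSQL 10 streaming replication exposes lag on the writer as below; a managed cluster reports replica lag through its own metrics instead, so treat this as a sketch for the self-hosted case only:

    -- Byte and time lag of each replica, as seen from the primary.
    SELECT application_name,
           pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes,
           replay_lag
    FROM pg_stat_replication;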