On Tue, Apr 14, 2020 at 7:37 PM Asif Rehman <asifr.reh...@gmail.com> wrote:
>
> On Tue, Apr 14, 2020 at 6:32 PM Kashif Zeeshan <kashif.zees...@enterprisedb.com> wrote:
>
>> Hi Asif
>>
>> I am getting the following error on a parallel backup when the
>> --no-manifest option is used.
>>
>> [edb@localhost bin]$ ./pg_basebackup -v -j 5 -D /home/edb/Desktop/backup/ --no-manifest
>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>> pg_basebackup: checkpoint completed
>> pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
>> pg_basebackup: starting background WAL receiver
>> pg_basebackup: created temporary replication slot "pg_basebackup_10223"
>> pg_basebackup: backup worker (0) created
>> pg_basebackup: backup worker (1) created
>> pg_basebackup: backup worker (2) created
>> pg_basebackup: backup worker (3) created
>> pg_basebackup: backup worker (4) created
>> pg_basebackup: write-ahead log end point: 0/2000100
>> pg_basebackup: error: could not get data for 'BUILD_MANIFEST': ERROR:
>> could not open file "base/pgsql_tmp/pgsql_tmp_b4ef5ac0fd150b2a28caf626bbb1bef2.1":
>> No such file or directory
>> pg_basebackup: removing contents of data directory "/home/edb/Desktop/backup/"
>> [edb@localhost bin]$
>>
>
> I forgot to add a check for --no-manifest. Fixed. Attached is the updated
> patch.
>

Hi Asif

Verified the fix, thanks.

[edb@localhost bin]$ ./pg_basebackup -v -j 5 -D /home/edb/Desktop/backup --no-manifest
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/4000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_27407"
pg_basebackup: backup worker (0) created
pg_basebackup: backup worker (1) created
pg_basebackup: backup worker (2) created
pg_basebackup: backup worker (3) created
pg_basebackup: backup worker (4) created
pg_basebackup: write-ahead log end point: 0/4000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed

[edb@localhost bin]$ ls /home/edb/Desktop/backup
backup_label  pg_commit_ts  pg_ident.conf  pg_notify    pg_snapshots  pg_subtrans  PG_VERSION  postgresql.auto.conf
base          pg_dynshmem   pg_logical     pg_replslot  pg_stat       pg_tblspc    pg_wal      postgresql.conf
global        pg_hba.conf   pg_multixact   pg_serial    pg_stat_tmp   pg_twophase  pg_xact

Regards
Kashif Zeeshan

>
>> Thanks
>>
>> On Tue, Apr 14, 2020 at 5:33 PM Asif Rehman <asifr.reh...@gmail.com> wrote:
>>
>>> On Wed, Apr 8, 2020 at 6:53 PM Kashif Zeeshan <kashif.zees...@enterprisedb.com> wrote:
>>>
>>>> On Tue, Apr 7, 2020 at 9:44 PM Asif Rehman <asifr.reh...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Thanks, Kashif and Rajkumar. I have fixed the reported issues.
>>>>>
>>>>> I have added the shared state as previously described. The new
>>>>> grammar changes are as follows:
>>>>>
>>>>> START_BACKUP [LABEL '<label>'] [FAST] [MAX_RATE %d]
>>>>> - This will generate a unique backupid using pg_strong_random(16),
>>>>>   hex-encode it, and return it in the result set.
>>>>> - It will also create a shared state and add it to the hashtable.
>>>>>   The hash table size is set to BACKUP_HASH_SIZE=10; since the
>>>>>   hashtable can expand dynamically, I think that is a sufficient
>>>>>   initial size. max_wal_senders is not used, because it can be set
>>>>>   to quite a large value.
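>>>>>
>>>>> As a rough sketch (the helper names here are only illustrative, and I
>>>>> am assuming the hash table was created with ShmemInitHash()), the id
>>>>> generation and registration could look like this:
>>>>>
>>>>> #include "postgres.h"
>>>>> #include "access/xlogdefs.h"
>>>>> #include "storage/spin.h"
>>>>> #include "utils/hsearch.h"
>>>>>
>>>>> typedef struct BackupSharedState
>>>>> {
>>>>>     char        backupid[NAMEDATALEN];  /* hash key: hex-encoded id */
>>>>>     XLogRecPtr  startptr;
>>>>>     slock_t     lock;
>>>>>     int64       throttling_counter;
>>>>>     bool        backup_started_in_recovery;
>>>>> } BackupSharedState;
>>>>>
>>>>> static HTAB *BackupStateHash;   /* assumed: set up via ShmemInitHash() */
>>>>>
>>>>> /* Build a 32-character backup id: 16 strong random bytes, hex-encoded. */
>>>>> static void
>>>>> generate_backup_id(char *dst)
>>>>> {
>>>>>     static const char hextbl[] = "0123456789abcdef";
>>>>>     uint8       raw[16];
>>>>>
>>>>>     if (!pg_strong_random(raw, sizeof(raw)))
>>>>>         elog(ERROR, "could not generate random backup id");
>>>>>
>>>>>     for (int i = 0; i < 16; i++)
>>>>>     {
>>>>>         dst[i * 2] = hextbl[raw[i] >> 4];
>>>>>         dst[i * 2 + 1] = hextbl[raw[i] & 0xF];
>>>>>     }
>>>>>     dst[32] = '\0';
>>>>> }
>>>>>
>>>>> /* START_BACKUP: create and initialize the shared state entry. */
>>>>> static BackupSharedState *
>>>>> register_backup(void)
>>>>> {
>>>>>     char        backupid[NAMEDATALEN] = {0};
>>>>>     BackupSharedState *state;
>>>>>     bool        found;
>>>>>
>>>>>     generate_backup_id(backupid);
>>>>>     state = (BackupSharedState *)
>>>>>         hash_search(BackupStateHash, backupid, HASH_ENTER, &found);
>>>>>     Assert(!found);     /* 128 random bits: collisions are negligible */
>>>>>     SpinLockInit(&state->lock);
>>>>>     state->throttling_counter = 0;
>>>>>     return state;
>>>>> }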
>>>>>
>>>>> JOIN_BACKUP 'backup_id'
>>>>> - Finds 'backup_id' in the hashtable and attaches it to the server
>>>>>   process.
>>>>>
>>>>> SEND_FILE '(' 'FILE' ')' [NOVERIFY_CHECKSUMS]
>>>>> - Renamed SEND_FILES to SEND_FILE.
>>>>> - Removed START_WAL_LOCATION from this command, because 'startptr' is
>>>>>   now accessible through the shared state.
>>>>>
>>>>> There is no change in the other commands:
>>>>> STOP_BACKUP [NOWAIT]
>>>>> LIST_TABLESPACES [PROGRESS]
>>>>> LIST_FILES [TABLESPACE]
>>>>> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
>>>>>
>>>>> The current patches (v11) have been rebased to the latest master. The
>>>>> backup manifest is enabled by default, so I have disabled it for
>>>>> parallel backup mode and generate a warning, so that the user is aware
>>>>> of it and does not expect it in the backup.
>>>>>
>>>> Hi Asif
>>>>
>>>> I have verified the bug fixes; one bug is fixed and now works as
>>>> expected.
>>>>
>>>> While verifying the other bug fixes I faced the following issues;
>>>> please have a look.
>>>>
>>>> 1) The bug fixes mentioned below are generating a segmentation fault.
>>>>
>>>> Please note that for reference I have added only a description of each
>>>> bug, as the steps were given in previous emails. A backtrace is also
>>>> added for each case; both point to the same bug.
>>>>
>>>> a) The backup failed with the error "could not look up local user ID
>>>> 1000: Too many open files" when max_wal_senders was set to 2000.
>>>>
>>>> [edb@localhost bin]$ ./pg_basebackup -v -j 1990 -D /home/edb/Desktop/backup/
>>>> pg_basebackup: warning: backup manifest is disabled in parallel backup mode
>>>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>>>> pg_basebackup: checkpoint completed
>>>> pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
>>>> pg_basebackup: starting background WAL receiver
>>>> pg_basebackup: created temporary replication slot "pg_basebackup_9925"
>>>> pg_basebackup: backup worker (0) created
>>>> pg_basebackup: backup worker (1) created
>>>> pg_basebackup: backup worker (2) created
>>>> pg_basebackup: backup worker (3) created
>>>> ….
>>>> ….
>>>> pg_basebackup: backup worker (1014) created
>>>> pg_basebackup: backup worker (1015) created
>>>> pg_basebackup: backup worker (1016) created
>>>> pg_basebackup: backup worker (1017) created
>>>> pg_basebackup: error: could not connect to server: could not look up local user ID 1000: Too many open files
>>>> Segmentation fault
>>>> [edb@localhost bin]$
>>>>
>>>> [edb@localhost bin]$ gdb pg_basebackup /tmp/cores/core.pg_basebackup.13219.localhost.localdomain.1586349551
>>>> Reading symbols from /home/edb/Communtiy_Parallel_backup/postgresql/inst/bin/pg_basebackup...done.
>>>> [New LWP 13219]
>>>> [New LWP 13222]
>>>> [Thread debugging using libthread_db enabled]
>>>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>>> Core was generated by `./pg_basebackup -v -j 1990 -D /home/edb/Desktop/backup/'.
>>>> Program terminated with signal 11, Segmentation fault.
>>>> #0  pthread_join (threadid=0, thread_return=0x0) at pthread_join.c:47
>>>> 47        if (INVALID_NOT_TERMINATED_TD_P (pd))
>>>> (gdb) bt
>>>> #0  pthread_join (threadid=0, thread_return=0x0) at pthread_join.c:47
>>>> #1  0x000000000040904a in cleanup_workers () at pg_basebackup.c:2978
>>>> #2  0x0000000000403806 in disconnect_atexit () at pg_basebackup.c:332
>>>> #3  0x00007f2226f76a49 in __run_exit_handlers (status=1, listp=0x7f22272f86c8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:77
>>>> #4  0x00007f2226f76a95 in __GI_exit (status=<optimized out>) at exit.c:99
>>>> #5  0x0000000000408c54 in create_parallel_workers (backupinfo=0x952ca0) at pg_basebackup.c:2811
>>>> #6  0x000000000040798f in BaseBackup () at pg_basebackup.c:2211
>>>> #7  0x0000000000408b4d in main (argc=6, argv=0x7ffe3dabc718) at pg_basebackup.c:2765
>>>> (gdb)
>>>>
>>>> b) When executing two backups at the same time, a FATAL error is raised
>>>> due to max_wal_senders and, instead of exiting, the backup completed.
>>>>
>>>> [edb@localhost bin]$ ./pg_basebackup -v -j 8 -D /home/edb/Desktop/backup1/
>>>> pg_basebackup: warning: backup manifest is disabled in parallel backup mode
>>>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>>>> pg_basebackup: checkpoint completed
>>>> pg_basebackup: write-ahead log start point: 1/DA000028 on timeline 1
>>>> pg_basebackup: starting background WAL receiver
>>>> pg_basebackup: created temporary replication slot "pg_basebackup_17066"
>>>> pg_basebackup: backup worker (0) created
>>>> pg_basebackup: backup worker (1) created
>>>> pg_basebackup: backup worker (2) created
>>>> pg_basebackup: backup worker (3) created
>>>> pg_basebackup: backup worker (4) created
>>>> pg_basebackup: backup worker (5) created
>>>> pg_basebackup: backup worker (6) created
>>>> pg_basebackup: error: could not connect to server: FATAL: number of requested standby connections exceeds max_wal_senders (currently 10)
>>>> Segmentation fault (core dumped)
>>>> [edb@localhost bin]$
>>>>
>>>> [edb@localhost bin]$ gdb pg_basebackup /tmp/cores/core.pg_basebackup.17041.localhost.localdomain.1586353696
>>>> Reading symbols from /home/edb/Communtiy_Parallel_backup/postgresql/inst/bin/pg_basebackup...done.
>>>> [New LWP 17041]
>>>> [New LWP 17067]
>>>> [Thread debugging using libthread_db enabled]
>>>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>>> Core was generated by `./pg_basebackup -v -j 8 -D /home/edb/Desktop/backup1/'.
>>>> Program terminated with signal 11, Segmentation fault.
>>>> #0  pthread_join (threadid=0, thread_return=0x0) at pthread_join.c:47
>>>> 47        if (INVALID_NOT_TERMINATED_TD_P (pd))
>>>> (gdb) bt
>>>> #0  pthread_join (threadid=0, thread_return=0x0) at pthread_join.c:47
>>>> #1  0x000000000040904a in cleanup_workers () at pg_basebackup.c:2978
>>>> #2  0x0000000000403806 in disconnect_atexit () at pg_basebackup.c:332
>>>> #3  0x00007f051edc1a49 in __run_exit_handlers (status=1, listp=0x7f051f1436c8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:77
>>>> #4  0x00007f051edc1a95 in __GI_exit (status=<optimized out>) at exit.c:99
>>>> #5  0x0000000000408c54 in create_parallel_workers (backupinfo=0x1c6dca0) at pg_basebackup.c:2811
>>>> #6  0x000000000040798f in BaseBackup () at pg_basebackup.c:2211
>>>> #7  0x0000000000408b4d in main (argc=6, argv=0x7ffdb76a6d68) at pg_basebackup.c:2765
>>>> (gdb)
>>>>
>>>> 2) The following bug is not fixed yet.
>>>>
>>>> A similar case is when the DB server is shut down while the parallel
>>>> backup is in progress: the correct error is displayed, but the backup
>>>> folder is not cleaned up and a corrupt backup is left behind.
>>>>
>>>> [edb@localhost bin]$ ./pg_basebackup -v -D /home/edb/Desktop/backup/ -j 8
>>>> pg_basebackup: warning: backup manifest is disabled in parallel backup mode
>>>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>>>> pg_basebackup: checkpoint completed
>>>> pg_basebackup: write-ahead log start point: 0/A0000028 on timeline 1
>>>> pg_basebackup: starting background WAL receiver
>>>> pg_basebackup: created temporary replication slot "pg_basebackup_16235"
>>>> pg_basebackup: backup worker (0) created
>>>> pg_basebackup: backup worker (1) created
>>>> pg_basebackup: backup worker (2) created
>>>> pg_basebackup: backup worker (3) created
>>>> pg_basebackup: backup worker (4) created
>>>> pg_basebackup: backup worker (5) created
>>>> pg_basebackup: backup worker (6) created
>>>> pg_basebackup: backup worker (7) created
>>>> pg_basebackup: error: could not read COPY data: server closed the connection unexpectedly
>>>> This probably means the server terminated abnormally
>>>> before or while processing the request.
>>>> pg_basebackup: error: could not read COPY data: server closed the connection unexpectedly
>>>> This probably means the server terminated abnormally
>>>> before or while processing the request.
>>>> pg_basebackup: removing contents of data directory "/home/edb/Desktop/backup/"
>>>> pg_basebackup: error: could not read COPY data: server closed the connection unexpectedly
>>>> This probably means the server terminated abnormally
>>>> before or while processing the request.
>>>> [edb@localhost bin]$
>>>>
>>>> [edb@localhost bin]$ ls /home/edb/Desktop/backup
>>>> base         pg_hba.conf    pg_logical    pg_notify    pg_serial     pg_stat      pg_subtrans  pg_twophase  pg_xact  postgresql.conf
>>>> pg_dynshmem  pg_ident.conf  pg_multixact  pg_replslot  pg_snapshots  pg_stat_tmp  pg_tblspc    PG_VERSION   postgresql.auto.conf
>>>>
>>>> Thanks
>>>> Kashif Zeeshan
>>>>
>>>>> On Tue, Apr 7, 2020 at 4:03 PM Kashif Zeeshan <kashif.zees...@enterprisedb.com> wrote:
>>>>>
>>>>>> On Fri, Apr 3, 2020 at 3:01 PM Kashif Zeeshan <kashif.zees...@enterprisedb.com> wrote:
>>>>>>
>>>>>>> Hi Asif
>>>>>>>
>>>>>>> When a non-existent slot is used with a tablespace, the correct error
>>>>>>> is displayed, but then the backup folder is not cleaned up and a
>>>>>>> corrupt backup is left behind.
>>>>>>>
>>>>>>> Steps
>>>>>>> =======
>>>>>>>
>>>>>>> [edb@localhost bin]$ mkdir /home/edb/tbl1
>>>>>>> [edb@localhost bin]$ mkdir /home/edb/tbl_res
>>>>>>>
>>>>>>> postgres=# create tablespace tbl1 location '/home/edb/tbl1';
>>>>>>> CREATE TABLESPACE
>>>>>>> postgres=# create table t1 (a int) tablespace tbl1;
>>>>>>> CREATE TABLE
>>>>>>> postgres=# insert into t1 values(100);
>>>>>>> INSERT 0 1
>>>>>>> postgres=# insert into t1 values(200);
>>>>>>> INSERT 0 1
>>>>>>> postgres=# insert into t1 values(300);
>>>>>>> INSERT 0 1
>>>>>>>
>>>>>>> [edb@localhost bin]$ ./pg_basebackup -v -j 2 -D /home/edb/Desktop/backup/ -T /home/edb/tbl1=/home/edb/tbl_res -S test
>>>>>>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>>>>>>> pg_basebackup: checkpoint completed
>>>>>>> pg_basebackup: write-ahead log start point: 0/2E000028 on timeline 1
>>>>>>> pg_basebackup: starting background WAL receiver
>>>>>>> pg_basebackup: error: could not send replication command "START_REPLICATION": ERROR: replication slot "test" does not exist
>>>>>>> pg_basebackup: backup worker (0) created
>>>>>>> pg_basebackup: backup worker (1) created
>>>>>>> pg_basebackup: write-ahead log end point: 0/2E000100
>>>>>>> pg_basebackup: waiting for background process to finish streaming ...
>>>>>>> pg_basebackup: error: child thread exited with error 1
>>>>>>> [edb@localhost bin]$
>>>>>>>
>>>>>>> The backup folder is not cleaned:
>>>>>>>
>>>>>>> [edb@localhost bin]$ ls /home/edb/Desktop/backup
>>>>>>> backup_label  global        pg_dynshmem  pg_ident.conf  pg_multixact  pg_replslot  pg_snapshots  pg_stat_tmp  pg_tblspc    PG_VERSION  pg_xact  postgresql.conf
>>>>>>> base          pg_commit_ts  pg_hba.conf  pg_logical     pg_notify     pg_serial    pg_stat       pg_subtrans  pg_twophase  pg_wal      postgresql.auto.conf
>>>>>>>
>>>>>>> If the same case is executed without the parallel backup patch, the
>>>>>>> backup folder is cleaned up after the error is displayed:
>>>>>>>
>>>>>>> [edb@localhost bin]$ ./pg_basebackup -v -D /home/edb/Desktop/backup/ -T /home/edb/tbl1=/home/edb/tbl_res -S test999
>>>>>>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>>>>>>> pg_basebackup: checkpoint completed
>>>>>>> pg_basebackup: write-ahead log start point: 0/2B000028 on timeline 1
>>>>>>> pg_basebackup: starting background WAL receiver
>>>>>>> pg_basebackup: error: could not send replication command "START_REPLICATION": ERROR: replication slot "test999" does not exist
>>>>>>> pg_basebackup: write-ahead log end point: 0/2B000100
>>>>>>> pg_basebackup: waiting for background process to finish streaming ...
>>>>>>> pg_basebackup: error: child process exited with exit code 1
>>>>>>> pg_basebackup: removing data directory "/home/edb/Desktop/backup"
>>>>>>> pg_basebackup: changes to tablespace directories will not be undone
>>>>>>
>>>>>> Hi Asif
>>>>>>
>>>>>> A similar case is when the DB server is shut down while the parallel
>>>>>> backup is in progress: the correct error is displayed, but the backup
>>>>>> folder is not cleaned up and a corrupt backup is left behind. I think
>>>>>> one bug fix will solve all these cases where cleanup is not done when
>>>>>> a parallel backup fails.
>>>>>>
>>>>>> [edb@localhost bin]$ ./pg_basebackup -v -D /home/edb/Desktop/backup/ -j 8
>>>>>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>>>>>> pg_basebackup: checkpoint completed
>>>>>> pg_basebackup: write-ahead log start point: 0/C1000028 on timeline 1
>>>>>> pg_basebackup: starting background WAL receiver
>>>>>> pg_basebackup: created temporary replication slot "pg_basebackup_57337"
>>>>>> pg_basebackup: backup worker (0) created
>>>>>> pg_basebackup: backup worker (1) created
>>>>>> pg_basebackup: backup worker (2) created
>>>>>> pg_basebackup: backup worker (3) created
>>>>>> pg_basebackup: backup worker (4) created
>>>>>> pg_basebackup: backup worker (5) created
>>>>>> pg_basebackup: backup worker (6) created
>>>>>> pg_basebackup: backup worker (7) created
>>>>>> pg_basebackup: error: could not read COPY data: server closed the connection unexpectedly
>>>>>> This probably means the server terminated abnormally
>>>>>> before or while processing the request.
>>>>>> pg_basebackup: error: could not read COPY data: server closed the connection unexpectedly
>>>>>> This probably means the server terminated abnormally
>>>>>> before or while processing the request.
>>>>>> [edb@localhost bin]$
>>>>>>
>>>>>> The same case, when executed on pg_basebackup without the parallel
>>>>>> backup patch, does proper cleanup:
>>>>>>
>>>>>> [edb@localhost bin]$ ./pg_basebackup -v -D /home/edb/Desktop/backup/
>>>>>> pg_basebackup: initiating base backup, waiting for checkpoint to complete
>>>>>> pg_basebackup: checkpoint completed
>>>>>> pg_basebackup: write-ahead log start point: 0/C5000028 on timeline 1
>>>>>> pg_basebackup: starting background WAL receiver
>>>>>> pg_basebackup: created temporary replication slot "pg_basebackup_5590"
>>>>>> pg_basebackup: error: could not read COPY data: server closed the connection unexpectedly
>>>>>> This probably means the server terminated abnormally
>>>>>> before or while processing the request.
>>>>>> pg_basebackup: removing contents of data directory "/home/edb/Desktop/backup/"
>>>>>> [edb@localhost bin]$
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>> On Fri, Apr 3, 2020 at 1:46 PM Asif Rehman <asifr.reh...@gmail.com> wrote:
>>>>>>>
>>>>>>>> On Thu, Apr 2, 2020 at 8:45 PM Robert Haas <robertmh...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> On Thu, Apr 2, 2020 at 11:17 AM Asif Rehman <asifr.reh...@gmail.com> wrote:
>>>>>>>>> >> Why would you need to do that? As long as the process where
>>>>>>>>> >> STOP_BACKUP can do the check, that seems good enough.
>>>>>>>>> >
>>>>>>>>> > Yes, but the user will get the error only after STOP_BACKUP, not
>>>>>>>>> > while the backup is in progress. So if the backup is a large one,
>>>>>>>>> > early error detection would be much more beneficial. This is the
>>>>>>>>> > current behavior of non-parallel backup as well.
>>>>>>>>>
>>>>>>>>> Because non-parallel backup does not feature early detection of this
>>>>>>>>> error, it is not necessary to make parallel backup do so. Indeed, it
>>>>>>>>> is undesirable. If you want to fix that problem, do it on a separate
>>>>>>>>> thread in a separate patch. A patch proposing to make parallel backup
>>>>>>>>> inconsistent in behavior with non-parallel backup will be rejected,
>>>>>>>>> at least if I have anything to say about it.
>>>>>>>>>
>>>>>>>>> TBH, fixing this doesn't seem like an urgent problem to me. The
>>>>>>>>> current situation is not great, but promotions ought to be relatively
>>>>>>>>> infrequent, so I'm not sure it's a huge problem in practice. It is
>>>>>>>>> also worth considering whether the right fix is to figure out how to
>>>>>>>>> make that case actually work, rather than just making it fail
>>>>>>>>> quicker. I don't currently understand the reason for the prohibition,
>>>>>>>>> so I can't express an intelligent opinion on what the right answer is
>>>>>>>>> here, but it seems like it ought to be investigated before somebody
>>>>>>>>> goes and builds a bunch of infrastructure to make the error more
>>>>>>>>> timely.
>>>>>>>>
>>>>>>>> Non-parallel backup already does the early error checking; I only
>>>>>>>> intended to make parallel backup behave the same here. So I agree
>>>>>>>> with you that the behavior of parallel backup should be consistent
>>>>>>>> with the non-parallel one. Please see the code snippet below from
>>>>>>>> basebackup.c:sendDir():
>>>>>>>>
>>>>>>>>> /*
>>>>>>>>>  * Check if the postmaster has signaled us to exit, and abort with an
>>>>>>>>>  * error in that case. The error handler further up will call
>>>>>>>>>  * do_pg_abort_backup() for us. Also check that if the backup was
>>>>>>>>>  * started while still in recovery, the server wasn't promoted.
>>>>>>>>>  * do_pg_stop_backup() will check that too, but it's better to stop
>>>>>>>>>  * the backup early than continue to the end and fail there.
>>>>>>>>>  */
>>>>>>>>> CHECK_FOR_INTERRUPTS();
>>>>>>>>> if (RecoveryInProgress() != backup_started_in_recovery)
>>>>>>>>>     ereport(ERROR,
>>>>>>>>>             (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>>>>>>>>              errmsg("the standby was promoted during online backup"),
>>>>>>>>>              errhint("This means that the backup being taken is corrupt "
>>>>>>>>>                      "and should not be used. "
>>>>>>>>>                      "Try taking another online backup.")));
>>>>>>>>>
>>>>>>>>> > Okay, then I will add the shared state. And since we are adding
>>>>>>>>> > the shared state, we can use that for throttling,
>>>>>>>>> > progress-reporting and standby early error checking.
>>>>>>>>>
>>>>>>>>> Please propose a grammar here for all the new replication commands
>>>>>>>>> you plan to add before going and implementing everything. That will
>>>>>>>>> make it easier to hash out the design without forcing you to keep
>>>>>>>>> changing the code. Your design should include a sketch of how several
>>>>>>>>> sets of coordinating backends taking several concurrent parallel
>>>>>>>>> backups will end up with one shared state per parallel backup.
>>>>>>>>>
>>>>>>>>> > There are two possible options:
>>>>>>>>> >
>>>>>>>>> > (1) Server may generate a unique ID i.e. BackupID=<unique_string> OR
>>>>>>>>> > (2) (Preferred Option) Use the WAL start location as the BackupID.
>>>>>>>>> >
>>>>>>>>> > This BackupID should be given back as a response to the start
>>>>>>>>> > backup command. All client workers must append this ID to all
>>>>>>>>> > parallel backup replication commands, so that we can use this
>>>>>>>>> > identifier to search for that particular backup. Does that sound
>>>>>>>>> > good?
>>>>>>>>>
>>>>>>>>> Using the WAL start location as the backup ID seems like it might be
>>>>>>>>> problematic -- could a single checkpoint not end up as the start
>>>>>>>>> location for multiple backups started at the same time? Whether
>>>>>>>>> that's possible now or not, it seems unwise to hard-wire that
>>>>>>>>> assumption into the wire protocol.
>>>>>>>>>
>>>>>>>>> I was thinking that perhaps the client should generate a unique
>>>>>>>>> backup ID, e.g. the leader does:
>>>>>>>>>
>>>>>>>>> START_BACKUP unique_backup_id [options]...
>>>>>>>>>
>>>>>>>>> And then the others do:
>>>>>>>>>
>>>>>>>>> JOIN_BACKUP unique_backup_id
>>>>>>>>>
>>>>>>>>> My thought is that you will have a number of shared memory structures
>>>>>>>>> equal to max_wal_senders, each one large enough to hold the shared
>>>>>>>>> state for one backup. The shared state will include a
>>>>>>>>> char[NAMEDATALEN-or-something] which will be used to hold the backup
>>>>>>>>> ID. START_BACKUP would allocate one and copy the name into it;
>>>>>>>>> JOIN_BACKUP would search for one by name.
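>>>>>>>>>
>>>>>>>>> Roughly like this, say (every name here is invented for the sake of
>>>>>>>>> illustration, including the lock, which I assume would be set up
>>>>>>>>> during shared memory initialization):
>>>>>>>>>
>>>>>>>>> #include "postgres.h"
>>>>>>>>> #include "replication/walsender.h"  /* max_wal_senders */
>>>>>>>>> #include "storage/lwlock.h"
>>>>>>>>>
>>>>>>>>> typedef struct BackupSlot
>>>>>>>>> {
>>>>>>>>>     bool    in_use;
>>>>>>>>>     char    backupid[NAMEDATALEN];
>>>>>>>>>     /* ... the rest of the per-backup shared state ... */
>>>>>>>>> } BackupSlot;
>>>>>>>>>
>>>>>>>>> static BackupSlot *BackupSlots;  /* max_wal_senders slots in shmem */
>>>>>>>>> static LWLock *BackupSlotLock;   /* assumed: protects the slot array */
>>>>>>>>>
>>>>>>>>> /* START_BACKUP: claim a free slot and copy the backup id into it. */
>>>>>>>>> static BackupSlot *
>>>>>>>>> allocate_backup_slot(const char *backupid)
>>>>>>>>> {
>>>>>>>>>     BackupSlot *slot = NULL;
>>>>>>>>>
>>>>>>>>>     LWLockAcquire(BackupSlotLock, LW_EXCLUSIVE);
>>>>>>>>>     for (int i = 0; i < max_wal_senders; i++)
>>>>>>>>>     {
>>>>>>>>>         if (!BackupSlots[i].in_use)
>>>>>>>>>         {
>>>>>>>>>             BackupSlots[i].in_use = true;
>>>>>>>>>             strlcpy(BackupSlots[i].backupid, backupid, NAMEDATALEN);
>>>>>>>>>             slot = &BackupSlots[i];
>>>>>>>>>             break;
>>>>>>>>>         }
>>>>>>>>>     }
>>>>>>>>>     LWLockRelease(BackupSlotLock);
>>>>>>>>>     return slot;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> /* JOIN_BACKUP: look the leader's slot up by backup id. */
>>>>>>>>> static BackupSlot *
>>>>>>>>> find_backup_slot(const char *backupid)
>>>>>>>>> {
>>>>>>>>>     BackupSlot *slot = NULL;
>>>>>>>>>
>>>>>>>>>     LWLockAcquire(BackupSlotLock, LW_SHARED);
>>>>>>>>>     for (int i = 0; i < max_wal_senders; i++)
>>>>>>>>>     {
>>>>>>>>>         if (BackupSlots[i].in_use &&
>>>>>>>>>             strcmp(BackupSlots[i].backupid, backupid) == 0)
>>>>>>>>>         {
>>>>>>>>>             slot = &BackupSlots[i];
>>>>>>>>>             break;
>>>>>>>>>         }
>>>>>>>>>     }
>>>>>>>>>     LWLockRelease(BackupSlotLock);
>>>>>>>>>     return slot;
>>>>>>>>> }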
>>>>>>>>> If you want to generate the name on the server side, then I suppose
>>>>>>>>> START_BACKUP would return a result set that includes the backup ID,
>>>>>>>>> and clients would have to specify that same backup ID when invoking
>>>>>>>>> JOIN_BACKUP. The rest would stay the same. I am not sure which way is
>>>>>>>>> better. Either way, the backup ID should be something long and hard
>>>>>>>>> to guess, not e.g. the leader process's PID. I think we should
>>>>>>>>> generate it using pg_strong_random, say 8 or 16 bytes, and then
>>>>>>>>> hex-encode the result to get a string. That way there's almost no
>>>>>>>>> risk of two backup IDs colliding accidentally, and even if we somehow
>>>>>>>>> had a malicious user trying to screw up somebody else's parallel
>>>>>>>>> backup by choosing a colliding backup ID, it would be pretty hard to
>>>>>>>>> have any success. A user with enough access to do that sort of thing
>>>>>>>>> can probably cause a lot worse problems anyway, but it seems pretty
>>>>>>>>> easy to guard against intentional collisions robustly here, so I
>>>>>>>>> think we should.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Okay. So if we add another replication command 'JOIN_BACKUP
>>>>>>>> unique_backup_id' to make workers find the relevant shared state,
>>>>>>>> there won't be any need to change the grammar for any other command.
>>>>>>>> START_BACKUP can return the unique_backup_id in the result set.
>>>>>>>>
>>>>>>>> I am thinking of the following struct for the shared state:
>>>>>>>>
>>>>>>>> typedef struct
>>>>>>>> {
>>>>>>>>     char        backupid[NAMEDATALEN];
>>>>>>>>     XLogRecPtr  startptr;
>>>>>>>>
>>>>>>>>     slock_t     lock;
>>>>>>>>     int64       throttling_counter;
>>>>>>>>     bool        backup_started_in_recovery;
>>>>>>>> } BackupSharedState;
>>>>>>>>
>>>>>>>> The shared state structure entries would be maintained by a shared
>>>>>>>> hash table. There will be one structure per parallel backup. Since a
>>>>>>>> single parallel backup can engage more than one walsender, I think
>>>>>>>> max_wal_senders might be a little too much; perhaps max_wal_senders/2,
>>>>>>>> since there will be at least 2 connections per parallel backup?
>>>>>>>> Alternatively, we can add a new GUC that defines the maximum number
>>>>>>>> of concurrent parallel backups, e.g. 'max_concurent_backups_allowed = 10'.
>>>>>>>>
>>>>>>>> The key would be "backupid=hex_encode(pg_strong_random(16))".
>>>>>>>>
>>>>>>>> Checking for Standby Promotion:
>>>>>>>> At the START_BACKUP command, we initialize
>>>>>>>> BackupSharedState.backup_started_in_recovery and keep checking it
>>>>>>>> whenever send_file() is called to send a new file.
>>>>>>>>
>>>>>>>> Throttling:
>>>>>>>> BackupSharedState.throttling_counter - the throttling logic remains
>>>>>>>> the same as for non-parallel backup, with the exception that multiple
>>>>>>>> threads will now be updating it. So in parallel backup it represents
>>>>>>>> the overall bytes that have been transferred, and the workers sleep
>>>>>>>> if they have exceeded the limit. Hence, the shared state carries a
>>>>>>>> lock to safely update the throttling value atomically.
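>>>>>>>>
>>>>>>>> A rough sketch of that update (the helper name is invented here, and
>>>>>>>> the actual sleep-time computation would mirror the existing
>>>>>>>> throttle() in basebackup.c):
>>>>>>>>
>>>>>>>> /* Bytes allowed per sampling interval, as in basebackup.c. */
>>>>>>>> static int64 throttling_sample;
>>>>>>>>
>>>>>>>> /* Account for 'increment' bytes sent by one worker; sleep if over. */
>>>>>>>> static void
>>>>>>>> parallel_throttle(BackupSharedState *state, size_t increment)
>>>>>>>> {
>>>>>>>>     bool        exceeded;
>>>>>>>>
>>>>>>>>     SpinLockAcquire(&state->lock);
>>>>>>>>     state->throttling_counter += increment;
>>>>>>>>     exceeded = (state->throttling_counter >= throttling_sample);
>>>>>>>>     if (exceeded)
>>>>>>>>         state->throttling_counter %= throttling_sample;
>>>>>>>>     SpinLockRelease(&state->lock);
>>>>>>>>
>>>>>>>>     /* Sleep outside the spinlock, as throttle() does. */
>>>>>>>>     if (exceeded)
>>>>>>>>         pg_usleep(1000L);   /* placeholder wait */
>>>>>>>> }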
>>>>>>>> Progress Reporting:
>>>>>>>> I think we should add progress reporting for parallel backup as a
>>>>>>>> separate patch. The relevant entries for progress reporting, such as
>>>>>>>> 'backup_total' and 'backup_streamed', would then be added to this
>>>>>>>> structure as well.
>>>>>>>>
>>>>>>>> Grammar:
>>>>>>>> There is a change in the result set returned by the START_BACKUP
>>>>>>>> command: unique_backup_id is added. Additionally, the JOIN_BACKUP
>>>>>>>> replication command is added, and SEND_FILES has been renamed to
>>>>>>>> SEND_FILE. There are no other changes to the grammar.
>>>>>>>>
>>>>>>>> START_BACKUP [LABEL '<label>'] [FAST]
>>>>>>>> - returns startptr, tli, backup_label, unique_backup_id
>>>>>>>> STOP_BACKUP [NOWAIT]
>>>>>>>> - returns startptr, tli, backup_label
>>>>>>>> JOIN_BACKUP 'unique_backup_id'
>>>>>>>> - attaches a shared state identified by 'unique_backup_id' to a
>>>>>>>>   backend process
>>>>>>>>
>>>>>>>> LIST_TABLESPACES [PROGRESS]
>>>>>>>> LIST_FILES [TABLESPACE]
>>>>>>>> LIST_WAL_FILES [START_WAL_LOCATION 'X/X'] [END_WAL_LOCATION 'X/X']
>>>>>>>> SEND_FILE '(' FILE ')' [NOVERIFY_CHECKSUMS]
>>>>>>>>
>>>
>>> Hi,
>>>
>>> rebased and updated to the current master (8128b0c1). v13 is attached.
>>>
>>> - Fixes the above reported issues.
>>>
>>> - Added progress-reporting support for parallel backup:
>>> For this, 'backup_streamed' is moved to a shared structure (BackupState)
>>> as a pg_atomic_uint64 variable. The worker processes keep incrementing
>>> this variable. While files are being transferred from server to client,
>>> the main process remains idle, so after each increment the worker
>>> process signals the master to update the stats in the
>>> pg_stat_progress_basebackup view.
>>>
>>> The 'tablespace_streamed' column is not updated and will remain empty,
>>> because multiple workers may be copying files from different
>>> tablespaces.
>>>
>>> - Added backup manifest:
>>> The backend workers maintain their own manifest files, each containing
>>> the list of files transferred by that worker. Once all backup files are
>>> transferred, each worker creates a temp file
>>> ('pg_tempdir/temp_file_prefix_backupid.workerid') and writes the
>>> contents of its manifest there from a BufFile. The workers add neither
>>> the header nor the WAL information to their manifests; these two are
>>> added by the main process while merging all the worker manifest files.
>>>
>>> The main process reads these individual files and concatenates them
>>> into a single file, which is then sent back to the client.
>>>
>>> The manifest file is created when the following command is received:
>>>
>>>     BUILD_MANIFEST 'backupid'
>>>
>>> This is a new replication command. It is sent when pg_basebackup has
>>> copied all the $PGDATA files, including WAL files.
>>>
>>> --
>>> Asif Rehman
>>> Highgo Software (Canada/China/Pakistan)
>>> URL : www.highgo.ca
>>>
>>
>> --
>> Regards
>> ====================================
>> Kashif Zeeshan
>> Lead Quality Assurance Engineer / Manager
>>
>> EnterpriseDB Corporation
>> The Enterprise Postgres Company
>>
>
> --
> Asif Rehman
> Highgo Software (Canada/China/Pakistan)
> URL : www.highgo.ca
>

--
Regards
====================================
Kashif Zeeshan
Lead Quality Assurance Engineer / Manager

EnterpriseDB Corporation
The Enterprise Postgres Company