Hi, I found an issue while executing a backup use case(please see [1] for queries) on postgres version 12.
Firstly, pg_start_backup registers nonexclusive_base_backup_cleanup as on_shmem_exit call back which will add this function to the before_shmem_exit_list array which is supposed to be removed on pg_stop_backup so that we can do the pending cleanup and issue a warning for each pg_start_backup for which we did not call the pg_stop backup. Now, I executed a query for which JIT is picked up, then the the llvm compiler inserts it's own exit callback i.e. llvm_shutdown at the end of before_shmem_exit_list array. Now, I executed pg_stop_backup and call to cancel_before_shmem_exit() is made with the expectation that the nonexclusive_base_backup_cleanup callback is removed from before_shmem_exit_list array. Since the cancel_before_shmem_exit() only checks the last entry in the before_shmem_exit_list array, which is llvm compiler's exit callback, so we exit the cancel_before_shmem_exit() without removing the intended nonexclusive_base_backup_cleanup callback which remains still the before_shmem_exit_list and gets executed during the session exit throwing a warning "aborting backup due to backend exiting before pg_stop_backup was called", which is unintended. Attached is the patch that fixes the above problem by making cancel_before_shmem_exit() to look for the given function(and for given args) in the entire before_shmem_exit_list array, not just the last entry, starting from the last entry. Request the community take this patch for review for v12. Having said that, abovementioned problem for backup use case does not occur for v13 and latest versions of postgres (please below description[2]), but these kinds of issues can come, if the cancel_before_shmem_exit() is left to just look at the last array entry while removing a registered callback. There's also a comment in cancel_before_shmem_exit() function description "For simplicity, only the latest entry can be removed. (We could work harder but there is no need for current uses.) Since we start to find use cases now, there is a need to enhance cancel_before_shmem_exit(), so I also propose to have the same attached patch for v13 and latest versions. Thoughts? [1] CREATE TABLE t1 (id SERIAL); INSERT INTO t1 (id) SELECT * FROM generate_series(1, 20000000); SELECT * FROM pg_start_backup('label', false, false); /*JIT is enabled in my session and it is being picked by below query*/ EXPLAIN (ANALYZE, VERBOSE, BUFFERS) SELECT COUNT(*) FROM t1; SELECT * FROM pg_stop_backup(false, true); [2] for v13 and latest versions, start_backup first registers do_pg_abort_backup, and then pg_start_backup_callback, performs startup backup operations and unregisters only pg_start_backup_callback from before_shmem_exit_list, retaining do_pg_abort_backup still in the list, which is to be called on session's exit JIT compiler inserts it's own exit call back at the end of before_shmem_exit_list array. stop_backup registers pg_stop_backup_callback, performs stop operations, unregisters pg_stop_backup_callback from before_shmem_exit_list, and sets the sessionBackupState = SESSION_BACKUP_NONE, note that the callback do_pg_abort_backup registered by start_backup command still exists in the before_shmem_exit_list and will not be removed by stop_backup. On session exit, do_pg_abort_backup gets called but returns without performing any operations(not even throwing a warning), by checking sessionBackupState which was set to SESSION_BACKUP_NONE by stop_backup. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
v1-0001-Modify-cancel_before_shmem_exit-to-search-all-reg.patch
Description: Binary data