On Wed, Aug 16, 2017 at 2:08 AM, Robert Haas <robertmh...@gmail.com> wrote: > On Fri, Jul 14, 2017 at 8:17 AM, Mithun Cy <mithun...@enterprisedb.com> wrote: >> [ new patch ] > It's quite possible that in making all of these changes I've > introduced some bugs, so I think this needs some testing and review. > It's also possible that some of the changes that I made are actually > not improvements and should be reverted, but it's always hard to tell > that about your own code. Anyway, please see the attached version.
Sorry, Robert, I was on vacation so could not pick this immediately. I have been testing and reviewing the patch and I found following issues. 1. Hang in apw_detach_shmem. +/* + * Clear our PID from autoprewarm shared state. + */ +static void +apw_detach_shmem(int code, Datum arg) +{ + LWLockAcquire(&apw_state->lock, LW_EXCLUSIVE); + if (apw_state->pid_using_dumpfile == MyProcPid) + apw_state->pid_using_dumpfile = InvalidPid; + if (apw_state->bgworker_pid == MyProcPid) + apw_state->bgworker_pid = InvalidPid; + LWLockRelease(&apw_state->lock); +} The reason is that we might already be under the apw_state->lock when we error out and jump to apw_detach_shmem. So we should not be trying to take the lock again. For example, in autoprewarm_dump_now(), apw_dump_now() will error out under the lock if bgworker is already using dump file. ======= +autoprewarm_dump_now(PG_FUNCTION_ARGS) +{ + int64 num_blocks; + + apw_init_shmem(); + + PG_TRY(); + { + num_blocks = apw_dump_now(false, true); + } + PG_CATCH(); + { + apw_detach_shmem(0, 0); + PG_RE_THROW(); + } + PG_END_TRY(); + + PG_RETURN_INT64(num_blocks); +} ======= + LWLockAcquire(&apw_state->lock, LW_EXCLUSIVE); + if (apw_state->pid_using_dumpfile == InvalidPid) + apw_state->pid_using_dumpfile = MyProcPid; + else + { + if (!is_bgworker) + ereport(ERROR, + (errmsg("could not perform block dump because dump file is being used by PID %d", + apw_state->pid_using_dumpfile))); This attempt to take lock again hangs the autoprewarm module. I think there is no need to take lock while we reset those variables as we reset only if we have set it ourselves. 2) I also found one issue which was my own mistake in my previous patch 19. In "apw_dump_now" I missed calling FreeFile() on first write error, whereas on othercases I am already calling the same. ret = fprintf(file, "<<" INT64_FORMAT ">>\n", num_blocks); + if (ret < 0) + { + int save_errno = errno; + + unlink(transient_dump_file_path); Other than this, the patch is working as it was previously doing. If you think my presumed fix(not to take lock) to hang issue is right I will produce a patch for the same. -- Thanks and Regards Mithun C Y EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers