Re: [HACKERS] Proposal : For Auto-Prewarm.

Mithun Cy Thu, 17 Aug 2017 23:24:19 -0700

On Wed, Aug 16, 2017 at 2:08 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Jul 14, 2017 at 8:17 AM, Mithun Cy <mithun...@enterprisedb.com> wrote:
>> [ new patch ]
> It's quite possible that in making all of these changes I've
> introduced some bugs, so I think this needs some testing and review.
> It's also possible that some of the changes that I made are actually
> not improvements and should be reverted, but it's always hard to tell
> that about your own code.  Anyway, please see the attached version.


Sorry, Robert, I was on vacation so could not pick this immediately. I
have been testing and reviewing the patch and I found following
issues.

1. Hang in apw_detach_shmem.
+/*
+ * Clear our PID from autoprewarm shared state.
+ */
+static void
+apw_detach_shmem(int code, Datum arg)
+{
+   LWLockAcquire(&apw_state->lock, LW_EXCLUSIVE);
+   if (apw_state->pid_using_dumpfile == MyProcPid)
+   apw_state->pid_using_dumpfile = InvalidPid;
+   if (apw_state->bgworker_pid == MyProcPid)
+   apw_state->bgworker_pid = InvalidPid;
+   LWLockRelease(&apw_state->lock);
+}

The reason is that we might already be under the apw_state->lock when
we error out and jump to apw_detach_shmem. So we should not be trying
to take the lock again. For example, in autoprewarm_dump_now(),
apw_dump_now() will error out under the lock if bgworker is already
using dump file.

=======
+autoprewarm_dump_now(PG_FUNCTION_ARGS)
+{
+    int64 num_blocks;
+
+    apw_init_shmem();
+
+    PG_TRY();
+    {
+         num_blocks = apw_dump_now(false, true);
+    }
+    PG_CATCH();
+    {
+         apw_detach_shmem(0, 0);
+         PG_RE_THROW();
+    }
+    PG_END_TRY();
+
+    PG_RETURN_INT64(num_blocks);
+}

=======
+ LWLockAcquire(&apw_state->lock, LW_EXCLUSIVE);
+ if (apw_state->pid_using_dumpfile == InvalidPid)
+ apw_state->pid_using_dumpfile = MyProcPid;
+ else
+ {
+     if (!is_bgworker)
+         ereport(ERROR,
+                    (errmsg("could not perform block dump because
dump file is being used by PID %d",
+                     apw_state->pid_using_dumpfile)));

This attempt to take lock again hangs the autoprewarm module. I think
there is no need to take lock while we reset those variables as we
reset only if we have set it ourselves.

2) I also found one issue which was my own mistake in my previous patch 19.
In "apw_dump_now" I missed calling FreeFile() on first write error,
whereas on othercases I am already calling the same.
ret = fprintf(file, "<<" INT64_FORMAT ">>\n", num_blocks);
+ if (ret < 0)
+ {
+ int save_errno = errno;
+
+ unlink(transient_dump_file_path);

Other than this, the patch is working as it was previously doing. If
you think my presumed fix(not to take lock) to hang issue is right I
will produce a patch for the same.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal : For Auto-Prewarm.

Reply via email to