Re: [PATCH] More docs on what to do and not do in extension code

Bharath Rupireddy Tue, 19 Jan 2021 06:34:16 -0800

On Mon, Jan 18, 2021 at 1:27 PM Craig Ringer
<craig.rin...@enterprisedb.com> wrote:
> Hi folks
>
> The attached patch expands the xfunc docs and bgworker docs a little, 
> providing a starting point for developers to learn how to do some common 
> tasks the Postgres Way.
>
> It mentions in brief these topics:
>
> * longjmp() based exception handling with elog(ERROR), PG_CATCH() and 
> PG_RE_THROW() etc
> * Latches, spinlocks, LWLocks, heavyweight locks, condition variables
> * shm, DSM, DSA, shm_mq
> * syscache, relcache, relation_open(), invalidations
> * deferred signal handling, CHECK_FOR_INTERRUPTS()
> * Resource cleanup hooks and callbacks like on_exit, before_shmem_exit, the 
> resowner callbacks, etc
> * signal handling in bgworkers
>
> All very superficial, but all things I really wish I'd known a little about, 
> or even that I needed to learn about, when I started working on postgres.
>
> I'm not sure it's in quite the right place. I wonder if there should be a 
> separate part of xfunc.sgml that covers the slightly more advanced bits of 
> postgres backend and function coding like this, lists relevant README files 
> in the source tree, etc.
>
> I avoided going into details like how resource owners work. I don't want the 
> docs to have to cover all that in detail; what I hope to do is start 
> providing people with clear references to the right place in the code, 
> READMEs, etc to look when they need to understand specific topics.


Thanks for the patch.

Here are some comments:

[1]
   background worker's main function, and must be unblocked by it; this is to
    allow the process to customize its signal handlers, if necessary.
-   Signals can be unblocked in the new process by calling
-   <function>BackgroundWorkerUnblockSignals</function> and blocked by calling
-   <function>BackgroundWorkerBlockSignals</function>.
+   It is important that all background workers set up and unblock signal
+   handling before they enter their main loops. Signal handling in background
+   workers is discussed separately in <xref linkend="bgworker-signals"/>.
   </para>

IMO, we can retain the statement about BackgroundWorkerUnblockSignals
and BackgroundWorkerBlockSignals, but mention the link to
"bgworker-signals" for more details and move the statement "it's
important to unblock signals before enter their main loop" to
"bgworker-signals" section and we can also reason there the
consequences if not done.

[2]
+   interupt-aware APIs</link> for the purpose. Do not
<function>usleep()</function>,
+   <function>system()</function>, make blocking system calls, etc.
+  </para>

Is it "Do not use <function>usleep()</function>,
<function>system()</function> or make blocking system calls etc." ?

[3] IMO, we can remove following from "bgworker-signals" if we retain
it where currently it is, as discussed in [1].
+    Signals can be unblocked in the new process by calling
+    <function>BackgroundWorkerUnblockSignals</function> and blocked by calling
+    <function>BackgroundWorkerBlockSignals</function>.

[4] Can we say
+    The default signal handlers set up for background workers <emphasis>do
+    default background worker signal handlers, it should call

instead of
+    The default signal handlers installed for background workers <emphasis>do
+    default background worker signal handling it should call

[5] IMO, we can have something like below
+    request, etc. Set up these handlers before unblocking signals as
+    shown below:

instead of
+    request, etc. To install these handlers, before unblocking interrupts
+    run:

[6] I think logs and errors either elog() or ereport can be used, so how about
+        Use <function>elog()</function> or <function>ereport()</function> for
+        logging output or raising errors instead of any direct stdio calls.

instead of
+        Use <function>elog()</function> and <function>ereport()</function> for
+        logging output and raising errors instead of any direct stdio calls.

[7] Can we use child processes instead of subprocess ? If okay in
other places in the patch as well.
+        and should only use the main thread. PostgreSQL generally
uses child processes
+        that coordinate over shared memory instead of threads - for
instance, see
+        <xref linkend="bgworker"/>.

instead of
+        and should only use the main thread. PostgreSQL generally
uses subprocesses
+        that coordinate over shared memory instead of threads - see
+        <xref linkend="bgworker"/>.

[8] Why should file descriptor manager API be used to execute
subprocesses/child processes?
+        To execute subprocesses and open files, use the routines provided by
+        the file descriptor manager like <function>BasicOpenFile</function>
+        and <function>OpenPipeStream</function> instead of a direct

[9] "should always be"? even if it's a blocking extesion, does it
work? If our intention is to recommend the developers, maybe we should
avoid using the term "should" in the patch in other places as well.
+        Extension code should always be structured as a non-blocking

[10] I think it is
+        you should avoid using <function>sleep()</function> or
<function>usleep()</function>

instead of
+        you should <function>sleep()</function> or
<function>usleep()</function>


[11] I think it is
+        block if this happens. So cleanup of resources is not
entirely managed by PostgreSQL, it
+       must be handled using appropriate callbacks provided by PostgreSQL

instead of
+        block if this happens. So all cleanup of resources not already
+        managed by the PostgreSQL runtime must be handled using appropriate

[12] I think it is
+        found in corresponding PostgreSQL header and source files.

instead of
+        found in the PostgreSQL headers and sources.

[13] I think it is
+        Use PostgreSQL runtime concurrency and synchronisation primitives

+        between the PostgreSQL processes. These include signals and
ProcSignal multiplexed

instead of
+        Use the PostgreSQL runtime's concurrency and synchronisation primitives

+        between PostgreSQL processes. These include signals and
ProcSignal multiplexed

[14] Is it "relation/table based state management"?
+        Sometimes relation-based state management for extensions is not

[15] I think it is
+        use PostgreSQL shared-memory based inter-process communication

instead of
+        use PostgreSQL's shared-memory based inter-process communication

[16] I think it is
+        or shared memory message queues (<acronym>shm_mq</acronym>). Examples
+        usage of some of these features can be found in the
+        <filename>src/test/modules/test_shm_mq/</filename> sample
extension. Others

instead of
+        or shared memory message queues (<acronym>shm_mq</acronym>). Examples
+        of the use of some these features can be found in the
+        <filename>src/test/modules/test_shm_mq/</filename> example
extension. Others

[17] I think it is
+        syscache entries, as this can cause subtle bugs. See

instead of
+        syscache cache entries, as this can cause subtle bugs. See

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: [PATCH] More docs on what to do and not do in extension code

Reply via email to