On Mon, Jan 18, 2021 at 1:27 PM Craig Ringer <craig.rin...@enterprisedb.com> wrote: > Hi folks > > The attached patch expands the xfunc docs and bgworker docs a little, > providing a starting point for developers to learn how to do some common > tasks the Postgres Way. > > It mentions in brief these topics: > > * longjmp() based exception handling with elog(ERROR), PG_CATCH() and > PG_RE_THROW() etc > * Latches, spinlocks, LWLocks, heavyweight locks, condition variables > * shm, DSM, DSA, shm_mq > * syscache, relcache, relation_open(), invalidations > * deferred signal handling, CHECK_FOR_INTERRUPTS() > * Resource cleanup hooks and callbacks like on_exit, before_shmem_exit, the > resowner callbacks, etc > * signal handling in bgworkers > > All very superficial, but all things I really wish I'd known a little about, > or even that I needed to learn about, when I started working on postgres. > > I'm not sure it's in quite the right place. I wonder if there should be a > separate part of xfunc.sgml that covers the slightly more advanced bits of > postgres backend and function coding like this, lists relevant README files > in the source tree, etc. > > I avoided going into details like how resource owners work. I don't want the > docs to have to cover all that in detail; what I hope to do is start > providing people with clear references to the right place in the code, > READMEs, etc to look when they need to understand specific topics.
Thanks for the patch. Here are some comments: [1] background worker's main function, and must be unblocked by it; this is to allow the process to customize its signal handlers, if necessary. - Signals can be unblocked in the new process by calling - <function>BackgroundWorkerUnblockSignals</function> and blocked by calling - <function>BackgroundWorkerBlockSignals</function>. + It is important that all background workers set up and unblock signal + handling before they enter their main loops. Signal handling in background + workers is discussed separately in <xref linkend="bgworker-signals"/>. </para> IMO, we can retain the statement about BackgroundWorkerUnblockSignals and BackgroundWorkerBlockSignals, but mention the link to "bgworker-signals" for more details and move the statement "it's important to unblock signals before enter their main loop" to "bgworker-signals" section and we can also reason there the consequences if not done. [2] + interupt-aware APIs</link> for the purpose. Do not <function>usleep()</function>, + <function>system()</function>, make blocking system calls, etc. + </para> Is it "Do not use <function>usleep()</function>, <function>system()</function> or make blocking system calls etc." ? [3] IMO, we can remove following from "bgworker-signals" if we retain it where currently it is, as discussed in [1]. + Signals can be unblocked in the new process by calling + <function>BackgroundWorkerUnblockSignals</function> and blocked by calling + <function>BackgroundWorkerBlockSignals</function>. [4] Can we say + The default signal handlers set up for background workers <emphasis>do + default background worker signal handlers, it should call instead of + The default signal handlers installed for background workers <emphasis>do + default background worker signal handling it should call [5] IMO, we can have something like below + request, etc. Set up these handlers before unblocking signals as + shown below: instead of + request, etc. To install these handlers, before unblocking interrupts + run: [6] I think logs and errors either elog() or ereport can be used, so how about + Use <function>elog()</function> or <function>ereport()</function> for + logging output or raising errors instead of any direct stdio calls. instead of + Use <function>elog()</function> and <function>ereport()</function> for + logging output and raising errors instead of any direct stdio calls. [7] Can we use child processes instead of subprocess ? If okay in other places in the patch as well. + and should only use the main thread. PostgreSQL generally uses child processes + that coordinate over shared memory instead of threads - for instance, see + <xref linkend="bgworker"/>. instead of + and should only use the main thread. PostgreSQL generally uses subprocesses + that coordinate over shared memory instead of threads - see + <xref linkend="bgworker"/>. [8] Why should file descriptor manager API be used to execute subprocesses/child processes? + To execute subprocesses and open files, use the routines provided by + the file descriptor manager like <function>BasicOpenFile</function> + and <function>OpenPipeStream</function> instead of a direct [9] "should always be"? even if it's a blocking extesion, does it work? If our intention is to recommend the developers, maybe we should avoid using the term "should" in the patch in other places as well. + Extension code should always be structured as a non-blocking [10] I think it is + you should avoid using <function>sleep()</function> or <function>usleep()</function> instead of + you should <function>sleep()</function> or <function>usleep()</function> [11] I think it is + block if this happens. So cleanup of resources is not entirely managed by PostgreSQL, it + must be handled using appropriate callbacks provided by PostgreSQL instead of + block if this happens. So all cleanup of resources not already + managed by the PostgreSQL runtime must be handled using appropriate [12] I think it is + found in corresponding PostgreSQL header and source files. instead of + found in the PostgreSQL headers and sources. [13] I think it is + Use PostgreSQL runtime concurrency and synchronisation primitives + between the PostgreSQL processes. These include signals and ProcSignal multiplexed instead of + Use the PostgreSQL runtime's concurrency and synchronisation primitives + between PostgreSQL processes. These include signals and ProcSignal multiplexed [14] Is it "relation/table based state management"? + Sometimes relation-based state management for extensions is not [15] I think it is + use PostgreSQL shared-memory based inter-process communication instead of + use PostgreSQL's shared-memory based inter-process communication [16] I think it is + or shared memory message queues (<acronym>shm_mq</acronym>). Examples + usage of some of these features can be found in the + <filename>src/test/modules/test_shm_mq/</filename> sample extension. Others instead of + or shared memory message queues (<acronym>shm_mq</acronym>). Examples + of the use of some these features can be found in the + <filename>src/test/modules/test_shm_mq/</filename> example extension. Others [17] I think it is + syscache entries, as this can cause subtle bugs. See instead of + syscache cache entries, as this can cause subtle bugs. See With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com