Re: thread-safety: getpwuid_r()

2024-09-02 Thread Peter Eisentraut

On 26.08.24 19:54, Heikki Linnakangas wrote:

On 26/08/2024 20:38, Peter Eisentraut wrote:

On 24.08.24 15:55, Heikki Linnakangas wrote:
Come to think of it, the pg_get_user_name() function is just a thin 
wrapper around getpwuid_r(). It doesn't provide a lot of value. 
Perhaps we should remove pg_get_user_name() and 
pg_get_user_home_dir() altogether and call getpwuid_r() directly.


Yeah, that seems better.  These functions are somewhat strangely 
designed and as you described have faulty error handling.  By calling 
getpwuid_r() directly, we can handle the errors better and the code 
becomes more transparent.  (There used to be a lot more interesting 
portability complications in that file, but those are long gone.)


New patch looks good to me, thanks!


committed





Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-09-02 Thread Daniel Gustafsson
> On 23 Aug 2024, at 01:56, Michael Paquier  wrote:
> 
> On Thu, Aug 22, 2024 at 11:13:15PM +0200, Daniel Gustafsson wrote:
>> On 22 Aug 2024, at 02:31, Michael Paquier  wrote:
>>> Just do it :)
>> 
>> That's my plan, I wanted to wait a bit to see if anyone else chimed in with
>> concerns.
> 
> Cool, thanks!

Attached is a rebased v15 (only changes are commit-message changes noted by
Peter upthread) for the sake of archives, and for a green-check run in the
CFBot.  Assuming this builds green I intend to push this.

--
Daniel Gustafsson



v15-0002-Only-perform-pg_strong_random-init-when-required.patch
Description: Binary data


v15-0001-Remove-support-for-OpenSSL-older-than-1.1.0.patch
Description: Binary data


Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-09-02 Thread Peter Smith
Hi. Thanks for addressing my previous review comments.

Here are some review comments for v44-0001.

==
Commit message.

1.
Because such synced slots are typically considered not
active (for them to be later considered as inactive) as they don't
perform logical decoding to produce the changes.

~

This sentence is bad grammar. The docs have the same wording, so
please see my doc review comment #4 suggestion below.

==
doc/src/sgml/config.sgml

2.
+   
+Invalidates replication slots that are inactive for longer than
+specified amount of time. If this value is specified without units,
+it is taken as seconds. A value of zero (which is default) disables
+the timeout mechanism. This parameter can only be set in
+the postgresql.conf file or on the server
+command line.
+   
+

nit - This is OK as-is, but OTOH why not make the wording consistent
with the previous GUC description? (e.g. see my v43 [1] #2 review
comment)

~~~

3.
+   
+This invalidation check happens either when the slot is acquired
+for use or during checkpoint. The time since the slot has become
+inactive is known from its
+inactive_since value using which the
+timeout is measured.
+   
+

I felt this is slightly misleading because slot acquiring has nothing
to do with setting the slot invalidation anymore. Furthermore, the 2nd
sentence is bad grammar.

nit - IMO something simple like the following rewording can address
both of those points:

Slot invalidation due to inactivity timeout occurs during checkpoint.
The duration of slot inactivity is calculated using the slot's
inactive_since field value.

~

4.
+Because such synced slots are typically considered not active
+(for them to be later considered as inactive) as they don't perform
+logical decoding to produce the changes.

That sentence has bad grammar.

nit – suggest a much simpler replacement:
Synced slots are always considered to be inactive because they don't
perform logical decoding to produce changes.

==
src/backend/replication/slot.c

5.
+#define IsInactiveTimeoutSlotInvalidationApplicable(s) \
+ (replication_slot_inactive_timeout > 0 && \
+ s->inactive_since > 0 && \
+ !RecoveryInProgress() && \
+ !s->data.synced)
+

5a.
I felt this would be better implemented as an inline function. Then it
can be commented on properly to explain the parts of the condition.
e.g. the large comment currently in InvalidatePossiblyObsoleteSlot()
would be more appropriate in this function.

~

5b.
The name is very long. Can't it be something shorter/simpler like:
'IsSlotATimeoutCandidate()'
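
For illustration, here is a rough sketch of how 5a and 5b might combine
(just a sketch, not code from the patch; it assumes the same GUC and
ReplicationSlot fields that the v44-0001 macro relies on):

static inline bool
IsSlotATimeoutCandidate(ReplicationSlot *s)
{
	/* Inactivity-timeout invalidation is disabled when the GUC is zero. */
	if (replication_slot_inactive_timeout <= 0)
		return false;

	/* The slot must have been marked inactive at some point. */
	if (s->inactive_since <= 0)
		return false;

	/* The mechanism is not applicable during recovery... */
	if (RecoveryInProgress())
		return false;

	/* ...nor for slots on a standby that are synced from the primary. */
	if (s->data.synced)
		return false;

	return true;
}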

~~~

6. ReplicationSlotAcquire

-ReplicationSlotAcquire(const char *name, bool nowait)
+ReplicationSlotAcquire(const char *name, bool nowait,
+bool check_for_invalidation)

nit - Previously this new parameter really did mean to "check" for
[and set the slot] invalidation. But now I suggest renaming it to
'error_if_invalid' to properly reflect the new usage. And also in the
slot.h.

~

7.
+ /*
+ * Error out if the slot has been invalidated previously. Because there's
+ * no use in acquiring the invalidated slot.
+ */

nit - The comment is contrary to the code. If there was no reason to
skip this error, then you would not have the new parameter allowing
you to skip this error. I suggest just repeating the same comment as
in the function header.

~~~

8. ReportSlotInvalidation

nit - Added some blank lines for consistency.

~~~

9. InvalidatePossiblyObsoleteSlot

+ /*
+ * Quick exit if inactive timeout invalidation mechanism
+ * is disabled or slot is currently being used or the
+ * server is in recovery mode or the slot on standby is
+ * currently being synced from the primary.
+ *
+ * Note that the inactive timeout invalidation mechanism
+ * is not applicable for slots on the standby server that
+ * are being synced from primary server. Because such
+ * synced slots are typically considered not active (for
+ * them to be later considered as inactive) as they don't
+ * perform logical decoding to produce the changes.
+ */
+ if (!IsInactiveTimeoutSlotInvalidationApplicable(s))
+ break;

9a.
Consistency is good (commit message, docs and code comments for this),
but the added sentence has bad grammar. Please see the docs review
comment #4 above for some alternate phrasing.

~

9b.
Now that this logic is moved into a macro (I suggested it should be an
inline function) IMO this comment does not belong here anymore because
it is commenting code that you cannot see. Instead, this comment (or
something like it) should be as comments within the new function.

==
src/include/replication/slot.h

10.
+extern void ReplicationSlotAcquire(const char *name, bool nowait,
+bool check_for_invalidation);

Change the new param name as described in the earlier review comment.

==
src/test/recovery/t/050_invalidate_slots.pl

~~~

Please refer to the attached file which implements some o

Re: pgsql: Add more SQL/JSON constructor functions

2024-09-02 Thread Amit Langote
On Fri, Aug 30, 2024 at 4:32 PM Amit Langote  wrote:
> On Thu, Aug 22, 2024 at 12:44 PM Amit Langote  wrote:
> > On Thu, Aug 22, 2024 at 11:02 jian he  wrote:
> >> On Tue, Jul 30, 2024 at 12:59 PM Amit Langote  
> >> wrote:
> >> > On Fri, Jul 26, 2024 at 11:19 PM jian he  
> >> > wrote:
> >> > > {
> >> > > ...
> >> > > /*
> >> > >  * For expression nodes that support soft errors.  Should be set 
> >> > > to NULL
> >> > >  * before calling ExecInitExprRec() if the caller wants errors 
> >> > > thrown.
> >> > >  */
> >> > > ErrorSaveContext *escontext;
> >> > > } ExprState;
> >> > >
> >> > > i believe by default makeNode will set escontext to NULL.
> >> > > So the comment should be, if you want to catch the soft errors, make
> >> > > sure the escontext pointing to an allocated ErrorSaveContext.
> >> > > or maybe just say, default is NULL.
> >> > >
> >> > > Otherwise, the original comment's meaning feels like: we need to
> >> > > explicitly set it to NULL
> >> > > for certain operations, which I believe is false?
> >> >
> >> > OK, I'll look into updating this.
>
> See 0001.
>
> >> > > json_behavior_type:
> >> > > ERROR_P{ $$ = JSON_BEHAVIOR_ERROR; }
> >> > > | NULL_P{ $$ = JSON_BEHAVIOR_NULL; }
> >> > > | TRUE_P{ $$ = JSON_BEHAVIOR_TRUE; }
> >> > > | FALSE_P{ $$ = JSON_BEHAVIOR_FALSE; }
> >> > > | UNKNOWN{ $$ = JSON_BEHAVIOR_UNKNOWN; }
> >> > > | EMPTY_P ARRAY{ $$ = JSON_BEHAVIOR_EMPTY_ARRAY; }
> >> > > | EMPTY_P OBJECT_P{ $$ = JSON_BEHAVIOR_EMPTY_OBJECT; }
> >> > > /* non-standard, for Oracle compatibility only */
> >> > > | EMPTY_P{ $$ = JSON_BEHAVIOR_EMPTY_ARRAY; }
> >> > > ;
> >> > >
> >> > > EMPTY_P behaves the same as EMPTY_P ARRAY
> >> > > so for function GetJsonBehaviorConst, the following "case
> >> > > JSON_BEHAVIOR_EMPTY:" is wrong?
> >> > >
> >> > > case JSON_BEHAVIOR_NULL:
> >> > > case JSON_BEHAVIOR_UNKNOWN:
> >> > > case JSON_BEHAVIOR_EMPTY:
> >> > > val = (Datum) 0;
> >> > > isnull = true;
> >> > > typid = INT4OID;
> >> > > len = sizeof(int32);
> >> > > isbyval = true;
> >> > > break;
> >> > >
> >> > > also src/backend/utils/adt/ruleutils.c
> >> > > if (jexpr->on_error->btype != JSON_BEHAVIOR_EMPTY)
> >> > > get_json_behavior(jexpr->on_error, context, "ERROR");
> >> >
> >> > Something like the attached makes sense?  While this meaningfully
> >> > changes the deparsing output, there is no behavior change for
> >> > JsonTable top-level path execution.  That's because the behavior when
> >> > there's an error in the execution of the top-level path is to throw it
> >> > or return an empty set, which is handled in jsonpath_exec.c, not
> >> > execExprInterp.c.
>
> See 0002.
>
> I'm also attaching 0003 to fix a minor annoyance that JSON_TABLE()
> columns' default ON ERROR, ON EMPTY behaviors are unnecessarily
> emitted in the deparsed output when the top-level ON ERROR behavior is
> ERROR.
>
> Will push these on Monday.

Didn't push, as there's a release freeze in effect for the v17 branch.  Will
push to both master and v17 once the freeze is over.

> I haven't had a chance to take a closer look at your patch to optimize
> the code in ExecInitJsonExpr() yet.

I've simplified your patch a bit and attached it as 0004.

-- 
Thanks, Amit Langote


v2-0002-SQL-JSON-Fix-default-ON-ERROR-behavior-for-JSON_T.patch
Description: Binary data


v2-0004-SQL-JSON-Avoid-initializing-unnecessary-ON-ERROR-.patch
Description: Binary data


v2-0003-SQL-JSON-Fix-JSON_TABLE-column-deparsing.patch
Description: Binary data


v2-0001-Update-comment-about-ExprState.escontext.patch
Description: Binary data


Re: generic plans and "initial" pruning

2024-09-02 Thread Amit Langote
On Sat, Aug 31, 2024 at 9:30 PM Junwang Zhao  wrote:
> @@ -1241,7 +1244,7 @@ GetCachedPlan(CachedPlanSource *plansource,
> ParamListInfo boundParams,
>   if (customplan)
>   {
>   /* Build a custom plan */
> - plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
> + plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, true);
>
> Is the *true* here a typo? Seems it should be *false* for custom plan?

That's correct, thanks for catching that.  Will fix.

-- 
Thanks, Amit Langote




Re: More performance improvements for pg_dump in binary upgrade mode

2024-09-02 Thread Daniel Gustafsson
> On 5 Jun 2024, at 04:39, Nathan Bossart  wrote:
> 
> On Wed, May 15, 2024 at 03:21:36PM -0500, Nathan Bossart wrote:
>> Nice!  I'll plan on taking a closer look at this one.
> 
> LGTM.  I've marked the commitfest entry as ready-for-committer.

Thanks for review, committed.

--
Daniel Gustafsson





Windows socket problems, interesting connection to AIO

2024-09-02 Thread Thomas Munro
There's a category[1] of random build farm/CI failures where Windows
behaves differently and our stuff breaks, which also affects end
users.  A recent BF failure[2] that looks like one of those jangled my
nerves when I pushed a commit, so I looked into a new theory on how to
fix it.  First, let me restate my understanding of the two categories
of known message loss on Windows, since the information is scattered
far and wide across many threads:

1.  When a backend exits without closing the socket gracefully, which
was briefly fixed[3] but later reverted because it broke something
else, a Windows server's network stack might fail to send data that it
had buffered but not yet physically sent[4].

The reason we reverted that and went back to abortive socket shutdown
(ie just exit()) is that our WL_SOCKET_READABLE was buggy, and could
miss FD_CLOSE events from graceful but not abortive shutdowns (which
keep reporting themselves repeatedly, something to do with being an
error state (?)).  Sometimes a libpq socket we're waiting for with
WaitLatchOrSocket() on the client end of the socket could hang
forever.  Concretely: a replication connection or postgres_fdw running
inside another PostgreSQL server.  We fixed that event loss, albeit in
a gross kludgy way[5], because other ideas seemed too complicated (to
wit, various ways to manage extra state associated with each socket,
really hard to retro-fit in a satisfying way).  Graceful shutdown
should fix the race cases where the next thing the client calls is
recv(), as far as I know.

2.  If a Windows client tries to send() and gets an ECONNRESET/EPIPE
error, then the network stack seems to drop already received data, so
a following recv() will never see it.  In other words, it depends on
whether the application-level protocol is strictly request/response
based, or has sequence points at which both ends might send().  AFAIK
the main consequence for real users is that FATAL recovery conflict,
idle termination, etc messages are not delivered to clients, leaving
just "server closed the connection unexpectedly".

I have wondered whether it might help to kludgify the Windows TCP code
even more by doing an extra poll() for POLLRD before every single
send().  "Hey network stack, before I try to send this message, is
there anything the server wanted to tell me?", but I guess that must
be racy because the goodbye message could arrive between poll() and
send().  Annoyingly, I suspect it would *mostly* work.

The new thought I had about the second category of problem is: if you
use asynchronous networking APIs, then the kernel *can't* throw your
data out, because it doesn't even have it.  If the server's FATAL
message arrives before the client calls send(), then the data is
already written to user space memory and the I/O is marked as
complete.  If it arrives after, then there's no issue, because
computers can't see into the future yet.  That's my hypothesis,
anyway.  To try that, I started with a very simple program[6] on my
local FreeBSD system that does a failing send, and tries synchronous
and asynchronous recv():

=== synchronous ===
send -> -1, error = 32
recv -> "FATAL: flux capacitor failed", error = 0
=== posix aio ===
send -> -1, error = 32
async recv -> "FATAL: flux capacitor failed", error = 0

... and then googled enough Windows-fu to translate it and run it on
CI, and saw the known category 2 failure with the plain old
synchronous version.  The good news is that the async version sees the
goodbye message:

=== synchronous ===
send -> 14, error = 0
recv -> "", error = 10054
=== windows overlapped ===
send -> 14, error = 0
async recv -> "FATAL: flux capacitor failed", error = 0

That's not the same as a torture test for weird timings, and I have
zero knowledge of the implementation of this stuff, but I currently
can't imagine how it could possibly be implemented in any way that
could give a different answer.
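
For reference, here is a rough sketch of the POSIX AIO variant described
above (this is not the actual program from [6]; the socket setup, the
failing send(), and most error handling are elided, and the exact ordering
is an assumption):

#include <aio.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void
async_recv_demo(int sock)
{
	char		buf[128];
	struct aiocb cb;

	memset(buf, 0, sizeof(buf));
	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = sock;
	cb.aio_buf = buf;
	cb.aio_nbytes = sizeof(buf) - 1;

	/* Queue the read; completed data lands directly in user memory. */
	if (aio_read(&cb) != 0)
	{
		perror("aio_read");
		return;
	}

	/* ... the send() that fails with EPIPE/ECONNRESET would go here ... */

	/* Wait for the asynchronous read to complete. */
	while (aio_error(&cb) == EINPROGRESS)
		usleep(1000);

	if (aio_error(&cb) == 0)
		printf("async recv -> \"%s\", error = 0\n", buf);
	else
		printf("async recv -> \"\", error = %d\n", aio_error(&cb));
}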

Perhaps we could figure out a way to use that API to simulate
synchronous recv() built on top of that stuff, but I think a more
satisfying use of our time and energy would be to redesign all our
networking code to do cross-platform AIO.  I think that will mostly
come down to a bunch of network buffer management redesign work.
Anyway, I don't have anything concrete there, I just wanted to share
this observation.

[1] 
https://wiki.postgresql.org/wiki/Known_Buildfarm_Test_Failures#Miscellaneous_tests_fail_on_Windows_due_to_a_connection_closed_before_receiving_a_final_error_message
[2] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2024-08-31%2007%3A54%3A58
[3] 
https://github.com/postgres/postgres/commit/6051857fc953a62db318329c4ceec5f9668fd42a
[4] 
https://learn.microsoft.com/en-us/windows/win32/winsock/graceful-shutdown-linger-options-and-socket-closure-2
[5] 
https://github.com/postgres/postgres/commit/a8458f508a7a441242e148f008293128676df003
[6] https://github.com/macdice/hello-windows/blob/socket-hacking/test.c




pg_stats_subscription_stats order of the '*_count' columns

2024-09-02 Thread Peter Smith
Hi,

While reviewing another thread I was looking at the
'pg_stats_subscription_stats' view. In particular, I was looking at
the order of the "*_count" columns of that view.

IMO there is an intuitive/natural ordering for the logical replication (LR)
operations being counted. For example, LR "initial tablesync" always comes
before LR "apply".

I propose that the columns of the view should also be in this same
intuitive order: Specifically, "sync_error_count" should come before
"apply_error_count" (left-to-right in view, top-to-bottom in docs).

Currently, they are not arranged that way.

The view today has only 2 count columns in HEAD, so this proposal
seems trivial, but there is another patch [2] soon to be pushed, which
will add more conflict count columns. As the number of columns
increases IMO it becomes more important that each column is where you
would intuitively expect to find it.

~

Changes would be needed in several places:
- docs (doc/src/sgml/monitoring.sgml)
- function pg_stat_get_subscription_stats (pg_proc.dat)
- view pg_stat_subscription_stats (src/backend/catalog/system_views.sql)
- TAP test SELECTs (test/subscription/t/026_stats.pl)

~

Thoughts?

==
[1] docs - 
https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-SUBSCRIPTION-STATS
[2] stats for conflicts -
https://www.postgresql.org/message-id/flat/OS0PR01MB57160A07BD575773045FC214948F2%40OS0PR01MB5716.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Collect statistics about conflicts in logical replication

2024-09-02 Thread Peter Smith
On Mon, Sep 2, 2024 at 1:28 PM shveta malik  wrote:
>
> On Mon, Sep 2, 2024 at 4:20 AM Peter Smith  wrote:
> >
> > On Fri, Aug 30, 2024 at 4:24 PM shveta malik  wrote:
> > >
> > > On Fri, Aug 30, 2024 at 10:53 AM Peter Smith  
> > > wrote:
> > > >
> > ...
> > > > 2. Arrange all the counts into an intuitive/natural order
> > > >
> > > > There is an intuitive/natural ordering for these counts. For example,
> > > > the 'confl_*' count fields are in the order insert -> update ->
> > > > delete, which LGTM.
> > > >
> > > > Meanwhile, the 'apply_error_count' and the 'sync_error_count' are not
> > > > in a good order.
> > > >
> > > > IMO it makes more sense if everything is ordered as:
> > > > 'sync_error_count', then 'apply_error_count', then all the 'confl_*'
> > > > counts.
> > > >
> > > > This comment applies to lots of places, e.g.:
> > > > - docs (doc/src/sgml/monitoring.sgml)
> > > > - function pg_stat_get_subscription_stats (pg_proc.dat)
> > > > - view pg_stat_subscription_stats (src/backend/catalog/system_views.sql)
> > > > - TAP test SELECTs (test/subscription/t/026_stats.pl)
> > > >
> > > > As all those places are already impacted by this patch, I think it
> > > > would be good if (in passing) we (if possible) also swapped the
> > > > sync/apply counts so everything  is ordered intuitively top-to-bottom
> > > > or left-to-right.
> > >
> > > Not sure about this though. It does not seem to belong to the current 
> > > patch.
> > >
> >
> > Fair enough. But, besides being inappropriate to include in the
> > current patch, do you think the suggestion to reorder them made sense?
> > If it has some merit, then I will propose it again as a separate
> > thread.
> >
>
>  Yes, I think it makes sense. With respect to internal code, it might
> still be okay as is, but when it comes to pg_stat_subscription_stats,
> I think it is better if the user finds it in the below order:
>  subid | subname | sync_error_count | apply_error_count | confl_*
>
>  rather than the existing one:
>  subid | subname | apply_error_count | sync_error_count | confl_*
>

Hi Shveta, Thanks. FYI - I created a new thread for this here [1].

==
[1] 
https://www.postgresql.org/message-id/CAHut+PvbOw90wgGF4aV1HyYtX=6pjWc+pn8_fep7L=alxwx...@mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-09-02 Thread Amit Kapila
On Sat, Aug 31, 2024 at 1:45 PM Bharath Rupireddy
 wrote:
>
> Please find the attached v44 patch with the above changes. I will
> include the 0002 xid_age based invalidation patch later.
>

It is better to get the 0001 reviewed and committed first. We can
discuss about 0002 afterwards as 0001 is in itself a complete and
separate patch that can be committed.

-- 
With Regards,
Amit Kapila.




Re: Invalid Assert while validating REPLICA IDENTITY?

2024-09-02 Thread Amit Kapila
On Mon, Sep 2, 2024 at 11:21 AM Dilip Kumar  wrote:
>
> While working on some other code I noticed that in
> FindReplTupleInLocalRel() there is an assert [1] that seems to be
> passing IndexRelation to GetRelationIdentityOrPK() whereas it should
> be passing normal relation.
>

Agreed. But this should lead to assertion failure. Did you try testing it?

-- 
With Regards,
Amit Kapila.




Re: AIO v2.0

2024-09-02 Thread Heikki Linnakangas

On 01/09/2024 09:27, Andres Freund wrote:

The main reason I had previously implemented WAL AIO etc was to know the
design implications - but now that they're somewhat understood, I'm planning
to keep the patchset much smaller, with the goal of making it upstreamable.


+1 on that approach.


To solve the issue with an unbounded number of AIO references there are few
changes compared to the prior approach:

1) Only one AIO handle can be "handed out" to a backend, without being
defined. Previously the process of getting an AIO handle wasn't super
lightweight, which made it appealing to cache AIO handles - which was one
part of the problem for running out of AIO handles.

2) Nothing in a backend can force a "defined" AIO handle (i.e. one that is a
valid operation) to stay around, it's always possible to execute the AIO
operation and then reuse the handle.  This provides a forward guarantee, by
ensuring that completing AIOs can free up handles (previously they couldn't
be reused until the backend local reference was released).

3) Callbacks on AIOs are not allowed to error out anymore, unless it's ok to
take the server down.

4) Obviously some code needs to know the result of AIO operation and be able
to error out. To allow for that the issuer of an AIO can provide a pointer
to local memory that'll receive the result of an AIO, including details
about what kind of errors occurred (possible errors are e.g. a read failing
or a buffer's checksum validation failing).


In the next few days I'll add a bunch more documentation and comments as well
as some better perf numbers (assuming my workstation survived...).


Yeah, a high-level README would be nice. Without that, it's hard to
follow what "handed out" and "defined" above mean, for example.


A few quick comments on the patches:

v2.0-0001-bufmgr-Return-early-in-ScheduleBufferTagForWrit.patch

+1, this seems ready to be committed right away.

v2.0-0002-Allow-lwlocks-to-be-unowned.patch

With LOCK_DEBUG, LWLock->owner will point to the backend that acquired 
the lock, but it doesn't own it anymore. That's reasonable, but maybe 
add a boolean to the LWLock to mark whether the lock is currently owned 
or not.


The LWLockReleaseOwnership() name is a bit confusing together with 
LWLockReleaseUnowned() and LWLockRelease(). From the names, you might 
think that they all release the lock, but LWLockReleaseOwnership() just 
disassociates it from the current process. Rename it to LWLockDisown() 
perhaps.


v2.0-0003-Use-aux-process-resource-owner-in-walsender.patch

+1. The old comment "We don't currently need any ResourceOwner in a 
walsender process" was a bit misleading, because the walsender did 
create the short-lived "base backup" resource owner, so it's nice to get 
that fixed.


v2.0-0008-aio-Skeleton-IO-worker-infrastructure.patch

My refactoring around postmaster.c child process handling will conflict 
with this [1]. Not in any fundamental way, but can I ask you to review 
those patch, please? After those patches, AIO workers should also have 
PMChild slots (formerly known as Backend structs).


[1] 
https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec...@iki.fi


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: not null constraints, again

2024-09-02 Thread Tender Wang
Alvaro Herrera  wrote on Saturday, 31 Aug 2024 at 11:59:

> Hello
>
> Here I present another attempt at making not-null constraints be
> catalogued.  This is largely based on the code reverted at 9ce04b50e120,
> except that we now have a not-null constraint automatically created for
> every column of a primary key, and such constraint cannot be removed
> while the PK exists.  Thanks to this, a lot of rather ugly code is gone,
> both in pg_dump and in backend -- in particular the handling of NO
> INHERIT, which was needed for pg_dump.
>
> Noteworthy psql difference: because there are now even more not-null
> constraints than before, the \d+ display would be far too noisy if we
> just let it grow.  So here I've made it omit any constraints that
> underlie the primary key.  This should be OK since you can't do much
> with those constraints while the PK is still there.  If you drop the PK,
> the next \d+ will show those constraints.
>
> One thing that regretfully I haven't yet had time for, is paring down
> the original test code: a lot of it is verifying the old semantics,
> particularly for NO INHERIT constraints, which had grown ugly special
> cases.  It now mostly raises errors; or the tests are simply redundant.
> I'm going to remove that stuff as soon as I'm back on my normal work
> timezone.
>
> sepgsql is untested.
>
> I'm adding this to the September commitfest.
>

The attached patch adds List *nnconstraints, which stores the not-null
definitions, to struct CreateStmt.
This makes me a little confused about List *constraints in struct
CreateStmt: that list actually stores check constraints, and it would be
better if the comments reflected that. Renaming it to List *ckconstraints
seems more reasonable, but a lot of code that uses stmt->constraints would
have to change.

Since AddRelationNewConstraints() can now add not-null column constraints,
the comments for AddRelationNewConstraints() should be tweaked a little:
"All entries in newColDefaults will be processed.  Entries in newConstraints
will be processed only if they are CONSTR_CHECK type."
Now the type of a new constraint may also be a not-null constraint.

If the column already has a not-null constraint and we add the same
not-null constraint again, the code will call AdjustNotNullInheritance1()
in AddRelationNewConstraints(). The comments before that call to
AdjustNotNullInheritance1() look confusing to me, because the constraint
is not inherited.

-- 
Tender Wang


Re: Jargon and acronyms on this mailing list

2024-09-02 Thread Dagfinn Ilmari Mannsåker
Greg Sabino Mullane  writes:

> I normally wouldn't mention my blog entries here, but this one was about
> the hackers mailing list, so wanted to let people know about it in case you
> don't follow Planet Postgres. I scanned the last year's worth of posts and
> gathered the most used acronyms and jargon. The most commonly used acronym
> was IMO (in my opinion), followed by FWIW (for what it's worth), and IIUC
> (if I understand correctly). The complete list can be found in the post
> below, I'll refrain from copying everything here.
>
> https://www.crunchydata.com/blog/understanding-the-postgres-hackers-mailing-list

Nice write-up! Might it also be worth linking to the acronyms and
glossary sections of the docs?

https://www.postgresql.org/docs/current/acronyms.html
https://www.postgresql.org/docs/current/glossary.html

> Cheers,
> Greg

- ilmari




Re: Avoiding superfluous buffer locking during nbtree backwards scans

2024-09-02 Thread Matthias van de Meent
On Fri, 30 Aug 2024 at 21:43, Matthias van de Meent
 wrote:
>
> On Mon, 19 Aug 2024 at 13:43, Matthias van de Meent
>  wrote:
> >
> > On Sun, 11 Aug 2024 at 21:44, Peter Geoghegan  wrote:
> > >
> > > On Tue, Aug 6, 2024 at 6:31 PM Matthias van de Meent
> > >  wrote:
> > > > +1, LGTM.
> > > >
> > > > This changes the backward scan code in _bt_readpage to have an
> > > > approximately equivalent handling as the forward scan case for
> > > > end-of-scan cases, which is an improvement IMO.
> >
> > Here's a new patch that further improves the situation, so that we
> > don't try to re-lock the buffer we just accessed when we're stepping
> > backward in index scans, reducing buffer lock operations in the common
> > case by 1/2.
>
> Attached is an updated version of the patch, now v2, which fixes some
> assertion failures for parallel plans by passing the correct
> parameters to _bt_parallel_release for forward scans.

I noticed I attached an older version of the patch which still had 1
assertion failure case remaining (thanks cfbot), so here's v3 which
solves that problem.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)


v3-0001-Avoid-unneeded-nbtree-backwards-scan-buffer-locks.patch
Description: Binary data


Re: Jargon and acronyms on this mailing list

2024-09-02 Thread Daniel Gustafsson
> On 2 Sep 2024, at 13:06, Dagfinn Ilmari Mannsåker  wrote:
> 
> Greg Sabino Mullane  writes:
> 
>> I normally wouldn't mention my blog entries here, but this one was about
>> the hackers mailing list, so wanted to let people know about it in case you
>> don't follow Planet Postgres. I scanned the last year's worth of posts and
>> gathered the most used acronyms and jargon. The most commonly used acronym
>> was IMO (in my opinion), followed by FWIW (for what it's worth), and IIUC
>> (if I understand correctly). The complete list can be found in the post
>> below, I'll refrain from copying everything here.
>> 
>> https://www.crunchydata.com/blog/understanding-the-postgres-hackers-mailing-list
> 
> Nice write-up!

+1

> Might it also be worth linking to the acronyms and
> glossary sections of the docs?

Or maybe on the site under https://www.postgresql.org/list/ in some way?

--
Daniel Gustafsson





Re: Use read streams in pg_visibility

2024-09-02 Thread Nazir Bilal Yavuz
Hi,

On Sat, 31 Aug 2024 at 02:51, Noah Misch  wrote:
>
> To read blocks 10 and 11, I would expect to initialize the struct with one of:
>
> { .first=10, .nblocks=2 }
> { .first=10, .last_inclusive=11 }
> { .first=10, .last_exclusive=12 }
>
> With the patch's API, I would need {.first=10,.nblocks=12}.  The struct field
> named "nblocks" behaves like a last_block_exclusive.  Please either make the
> behavior an "nblocks" behavior or change the field name to replace the term
> "nblocks" with something matching the behavior.  (I used longer field names in
> my examples here, to disambiguate those examples.  It's okay if the final
> field names aren't those, as long as the field names and the behavior align.)

I decided to use 'current_blocknum' and 'last_exclusive'. I think
these are easier to understand and use.

> > Thanks for the information, I will check these. What I still do not
> > understand is how to make sure that only the second block is processed
> > and the first one is skipped. pg_check_visible() and pg_check_frozen()
> > returns TIDs that cause corruption in the visibility map, there is no
> > information about block numbers.
>
> I see what you're saying.  collect_corrupt_items() needs a corrupt table to
> report anything; all corruption-free tables get the same output.  Testing this
> would need extra C code or techniques like corrupt_page_checksum() to create
> the corrupt state.  That wouldn't be a bad thing to have, but it's big enough
> that I'll consider it out of scope for $SUBJECT.  With the callback change
> above, I'll be ready to push all this.

Thanks, updated patches are attached.

--
Regards,
Nazir Bilal Yavuz
Microsoft
From 1dbdeacc54c3575a2cb82e95aab5b335b0fa1d2f Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Mon, 26 Aug 2024 12:12:52 +0300
Subject: [PATCH v4 1/5] Add general-use struct and callback to read stream

The number of callbacks that just iterate over a block range in a read stream
is increasing, so add a general-use struct and callback to the read stream code.
---
 src/include/storage/read_stream.h | 13 +
 src/backend/storage/aio/read_stream.c | 21 +++--
 src/tools/pgindent/typedefs.list  |  1 +
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/src/include/storage/read_stream.h b/src/include/storage/read_stream.h
index 4e599904f26..0a2398fd7df 100644
--- a/src/include/storage/read_stream.h
+++ b/src/include/storage/read_stream.h
@@ -45,11 +45,24 @@
 struct ReadStream;
 typedef struct ReadStream ReadStream;
 
+/*
+ * General-use struct to use in callback functions for block range scans.
+ * Callback loops between current_blocknum (inclusive) and last_exclusive.
+ */
+typedef struct BlockRangeReadStreamPrivate
+{
+	BlockNumber current_blocknum;
+	BlockNumber last_exclusive;
+} BlockRangeReadStreamPrivate;
+
 /* Callback that returns the next block number to read. */
 typedef BlockNumber (*ReadStreamBlockNumberCB) (ReadStream *stream,
 void *callback_private_data,
 void *per_buffer_data);
 
+extern BlockNumber block_range_read_stream_cb(ReadStream *stream,
+			  void *callback_private_data,
+			  void *per_buffer_data);
 extern ReadStream *read_stream_begin_relation(int flags,
 			  BufferAccessStrategy strategy,
 			  Relation rel,
diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index 93cdd35fea0..8a449bab8a0 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -164,8 +164,25 @@ get_per_buffer_data(ReadStream *stream, int16 buffer_index)
 }
 
 /*
- * Ask the callback which block it would like us to read next, with a one block
- * buffer in front to allow read_stream_unget_block() to work.
+ * General-use callback function for block range scans.
+ */
+BlockNumber
+block_range_read_stream_cb(ReadStream *stream,
+		   void *callback_private_data,
+		   void *per_buffer_data)
+{
+	BlockRangeReadStreamPrivate *p = callback_private_data;
+
+	if (p->current_blocknum < p->last_exclusive)
+		return p->current_blocknum++;
+
+	return InvalidBlockNumber;
+}
+
+/*
+ * Ask the callback which block it would like us to read next, with a small
+ * buffer in front to allow read_stream_unget_block() to work and to allow the
+ * fast path to skip this function and work directly from the array.
  */
 static inline BlockNumber
 read_stream_get_block(ReadStream *stream, void *per_buffer_data)
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9e951a9e6f3..df3f336bec0 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -275,6 +275,7 @@ BlockId
 BlockIdData
 BlockInfoRecord
 BlockNumber
+BlockRangeReadStreamPrivate
 BlockRefTable
 BlockRefTableBuffer
 BlockRefTableChunk
-- 
2.45.2
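
For context, here is a hypothetical usage sketch of the new callback (this is
not part of the attached patches; "rel", "forknum", "start" and "end" stand in
for whatever the caller already has at hand):

	BlockRangeReadStreamPrivate p;
	ReadStream *stream;
	Buffer		buf;

	p.current_blocknum = start;
	p.last_exclusive = end;

	stream = read_stream_begin_relation(READ_STREAM_FULL, NULL, rel, forknum,
										block_range_read_stream_cb, &p, 0);
	while ((buf = read_stream_next_buffer(stream, NULL)) != InvalidBuffer)
	{
		/* ... inspect the page held in "buf" ... */
		ReleaseBuffer(buf);
	}
	read_stream_end(stream);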

From 84bc78cc36814309634c3686827b8a222bdfbfef Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Mon, 26 Aug 2024 12:16:06 +0300
Subject: [PATCH v4

Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-09-02 Thread Daniel Gustafsson
> On 2 Sep 2024, at 10:03, Daniel Gustafsson  wrote:
> 
>> On 23 Aug 2024, at 01:56, Michael Paquier  wrote:
>> 
>> On Thu, Aug 22, 2024 at 11:13:15PM +0200, Daniel Gustafsson wrote:
>>> On 22 Aug 2024, at 02:31, Michael Paquier  wrote:
 Just do it :)
>>> 
>>> That's my plan, I wanted to wait a bit to see if anyone else chimed in with
>>> concerns.
>> 
>> Cool, thanks!
> 
> Attached is a rebased v15 (only changes are commit-message changes noted by
> Peter upthread) for the sake of archives, and for a green-check run in the
> CFBot.  Assuming this builds green I intend to push this.

And pushed.  All BF owners with animals using 1.0.2 have been notified but not
all have been updated (or modified to skip SSL) so there will be some failing.

--
Daniel Gustafsson





Re: Invalid Assert while validating REPLICA IDENTITY?

2024-09-02 Thread Dilip Kumar
On Mon, Sep 2, 2024 at 3:32 PM Amit Kapila  wrote:
>
> On Mon, Sep 2, 2024 at 11:21 AM Dilip Kumar  wrote:
> >
> > While working on some other code I noticed that in
> > FindReplTupleInLocalRel() there is an assert [1] that seems to be
> > passing IndexRelation to GetRelationIdentityOrPK() whereas it should
> > be passing normal relation.
> >
>
> Agreed. But this should lead to assertion failure. Did you try testing it?

No, I did not test this particular case.  It affected me in some other code
I was adding, where an index relation was passed as input to the
RelationGetIndexList() function and my local changes were impacted by
that.  I will write a test for this stand-alone case so that it hits
the assert.  Thanks for looking into this.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Proposal for implementing OCSP Stapling in PostgreSQL

2024-09-02 Thread Daniel Gustafsson
> On 15 Aug 2024, at 00:42, Jacob Champion  
> wrote:

> It's pretty frustrating to hear about a "transition" when there is
> nothing to transition to.

I guess they prefer that orgs transition back to just using CRLs.

> Anyways, I look forward to seeing how broken my crystal ball is this
> time. The timing is awful for this patchset in particular.

It is indeed.  If it ends up deprecated server-side among the big providers,
then support for it risks being very hard to use.  Not sure what the best
course of action here is.

--
Daniel Gustafsson





Re: Virtual generated columns

2024-09-02 Thread jian he
On Wed, Aug 21, 2024 at 6:52 PM Dean Rasheed  wrote:
>
> On Wed, 21 Aug 2024 at 08:00, Peter Eisentraut  wrote:
> >
> > On 08.08.24 20:22, Dean Rasheed wrote:
> > > Looking at the rewriter changes, it occurred to me that it could
> > > perhaps be done more simply using ReplaceVarsFromTargetList() for each
> > > RTE with virtual generated columns. That function already has the
> > > required wholerow handling code, so there'd be less code duplication.
> >
> > Hmm, I don't quite see how ReplaceVarsFromTargetList() could be used
> > here.  It does have the wholerow logic that we need somehow, but other
> > than that it seems to target something different?
> >
>


> Well what I was thinking was that (in fireRIRrules()'s final loop over
> relations in the rtable), if the relation had any virtual generated
> columns, you'd build a targetlist containing a TLE for each one,
> containing the generated expression. Then you could just call
> ReplaceVarsFromTargetList() to replace any Vars in the query with the
> corresponding generated expressions. That takes care of descending
> into subqueries, adjusting varlevelsup, and expanding wholerow Vars
> that might refer to the generated expression.
>
> I also have half an eye on how this patch will interact with my patch
> to support RETURNING OLD/NEW values. If you use
> ReplaceVarsFromTargetList(), it should just do the right thing for
> RETURNING OLD/NEW generated expressions.
>
> > > I think it might be better to do this from within fireRIRrules(), just
> > > after RLS policies are applied, so it wouldn't need to worry about
> > > CTEs and sublink subqueries. That would also make the
> > > hasGeneratedVirtual flags unnecessary, since we'd already only be
> > > doing the extra work for tables with virtual generated columns. That
> > > would eliminate possible bugs caused by failing to set those flags.
> >
> > Yes, ideally, we'd piggy-back this into fireRIRrules().  One thing I'm
> > missing is that if you're descending into subqueries, there is no link
> > to the upper levels' range tables, which we need to lookup the
> > pg_attribute entries of column referencing Vars.  That's why there is
> > this whole custom walk with its own context data.  Maybe there is a way
> > to do this already that I missed?
> >
>
> That link to the upper levels' range tables wouldn't be needed because
> essentially using ReplaceVarsFromTargetList() flips the whole thing
> round: instead of traversing the tree looking for Var nodes that need
> to be replaced (possibly from upper query levels), you build a list of
> replacement expressions to be applied and apply them from the top,
> descending into subqueries as needed.
>

CREATE TABLE gtest1 (a int, b int GENERATED ALWAYS AS (a * 2) VIRTUAL);
INSERT INTO gtest1 VALUES (1,default), (2, DEFAULT);

select b from  (SELECT b FROM gtest1) sub;
Here we only need to translate the second "b" to (a * 2), not the first one,
but the query tree representations of these two "b" Vars are almost the same
(varno, varattno, varlevelsup).

I am not sure how ReplaceVarsFromTargetList() can disambiguate this.
Currently, v4-0001-Virtual-generated-columns.patch works, because v4 properly
sets the main query's hasGeneratedVirtual to false and the subquery's
hasGeneratedVirtual to true.




Re: Virtual generated columns

2024-09-02 Thread Nazir Bilal Yavuz
Hi,

On Thu, 29 Aug 2024 at 15:16, Peter Eisentraut  wrote:
>
> I also committed the two patches that renamed the existing test files,
> so those are not included here anymore.
>
> The new patch does some rebasing and contains various fixes to the
> issues you presented.  As I mentioned, I'll look into improving the
> rewriting.

The xid_wraparound test started to fail after edee0c621d. It seems the
error message used in xid_wraparound/002_limits was updated. A patch
that applies the same update to the test file is attached.

-- 
Regards,
Nazir Bilal Yavuz
Microsoft
From 748721898e8171d35d54ffe2b6edb38b9f5b020d Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz 
Date: Mon, 2 Sep 2024 16:18:57 +0300
Subject: [PATCH v1] Fix xid_wraparound/002_limits test

Error message used in xid_wraparound/002_limits is updated in edee0c621d.
Apply the same update in the test file as well.
---
 src/test/modules/xid_wraparound/t/002_limits.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/test/modules/xid_wraparound/t/002_limits.pl b/src/test/modules/xid_wraparound/t/002_limits.pl
index aca3fa15149..889689d3bde 100644
--- a/src/test/modules/xid_wraparound/t/002_limits.pl
+++ b/src/test/modules/xid_wraparound/t/002_limits.pl
@@ -103,7 +103,7 @@ $ret = $node->psql(
 	stderr => \$stderr);
 like(
 	$stderr,
-	qr/ERROR:  database is not accepting commands that assign new XIDs to avoid wraparound data loss in database "postgres"/,
+	qr/ERROR:  database is not accepting commands that assign new transaction IDs to avoid wraparound data loss in database "postgres"/,
 	"stop-limit");
 
 # Finish the old transaction, to allow vacuum freezing to advance
-- 
2.45.2



Re: Use XLOG_CONTROL_FILE macro everywhere?

2024-09-02 Thread Daniel Gustafsson
> On 27 Apr 2024, at 11:12, Peter Eisentraut  wrote:
> 
> On 26.04.24 22:51, Tom Lane wrote:
>> Robert Haas  writes:
>>> On Wed, Apr 24, 2024 at 8:04 PM Michael Paquier  wrote:
 Not sure that I would bother with a second one.  But, well, why not if
 people want to rename it, as long as you keep compatibility.
>>> I vote for just standardizing on XLOG_CONTROL_FILE. That name seems
>>> sufficiently intuitive to me, and I'd rather have one identifier for
>>> this than two. It's simpler that way.
>> +1.  Back when we did the great xlog-to-wal renaming, we explicitly
>> agreed that we wouldn't change internal symbols referring to xlog.
>> It might or might not be appropriate to revisit that decision,
>> but I sure don't want to do it piecemeal, one symbol at a time.
>> Also, if we did rename this one, the logical choice would be
>> WAL_CONTROL_FILE not PG_CONTROL_FILE.
> 
> My reasoning was mainly that I don't see pg_control as controlling just the 
> WAL.  But I don't feel strongly about instigating a great renaming here or 
> something.

Summarizing the thread, the consensus seems to be to use XLOG_CONTROL_FILE
consistently, as per the original patch.

A few comments on the patch though:

- * reads the data from $PGDATA/global/pg_control
+ * reads the data from $PGDATA/

I don't think this is an improvement, I'd leave that one as the filename
spelled out.

- "the \".old\" suffix from %s/global/pg_control.old.\n"
+ "the \".old\" suffix from %s/%s.old.\n"

Same with that change; I'm not sure it makes reading the error message
code any easier.

--
Daniel Gustafsson





Re: proposal: schema variables

2024-09-02 Thread Laurenz Albe
On Thu, 2024-08-29 at 19:33 +0200, Pavel Stehule wrote:
> > > > >   > +   /*
> > > > >   > +    * Although svar is freshly validated in this point, the 
> > > > > svar->is_valid can
> > > > >   > +    * be false, due possible accepting invalidation message 
> > > > > inside domain
> > > > >   > +    * check. Now, the validation is done after lock, that can 
> > > > > also accept
> > > > >   > +    * invalidation message, so validation should be trustful.
> > > > >   > +    *
> > > > >   > +    * For now, we don't need to repeat validation. Only svar 
> > > > > should be valid
> > > > >   > +    * pointer.
> > > > >   > +    */
> > > > 
> > > > This comment is related to assertions. Before I had there 
> > > > `Assert(svar->is_valid)`,
> > > > because I expected it. But it was not always true. And although it is 
> > > > true,
> > > > we don't need to validate a variable, because at this moment, the 
> > > > variable
> > > > should be locked, and then we can return content safely.
> > > 
> > > I guess my main problem is the word "trustful".  I don't recognize that 
> > > word.
> > > Perhaps you can reword the comment along the lines of your above 
> > > explanation.
> > 
> > 
> > I'll try to change it
> 
> is this better
> 
>     /*
>      * Although svar is freshly validated in this point, the svar->is_valid can
>      * be false, due possible accepting invalidation message inside domain
>      * check. But now, the variable, and all dependencies are locked, so we
>      * don't need to repeat validation.
>      */

Much better.

Here is an improved version:

  Although "svar" is freshly validated in this point, svar->is_valid can
  be false, if an invalidation message ws processed during the domain check.
  But the variable and all its dependencies are locked now, so we don't need
  to repeat the validation.

Yours,
Laurenz Albe




Re: Flush pgstats file during checkpoints

2024-09-02 Thread Heikki Linnakangas

It'd not be such an issue if we updated stats during recovery, but I
don't think we're doing that. Perhaps we should, which might also help
on replicas - no idea if it's feasible, though.


Stats on replicas are considered an independent thing AFAIU (scans are
counted for example in read-only queries).  If we were to do that we
may want to split stats handling between nodes in standby state and
crash recovery.  Not sure if that's worth the complication.  First,
the stats exist at node-level.


Hmm, I'm a bit disappointed this doesn't address replication. It makes 
sense that scans are counted separately on a standby, but it would be 
nice if stats like last_vacuum were propagated from primary to standbys. 
 I guess that can be handled separately later.



Reviewing v7-0001-Flush-pgstats-file-during-checkpoints.patch:

There are various race conditions where a stats entry can be leaked in
the pgstats file, i.e. a relation is dropped, but its stats entry is
retained in the stats file after a crash. In the worst case, such leaked
entries can accumulate until the stats file is manually removed, which
resets all stats again. Perhaps that's acceptable - it's still possible to
leak the actual relation file for a new relation on crash, after all,
which is much worse (I'm glad Horiguchi-san is working on that [1]).


For example:
1. BEGIN; CREATE TABLE foo (); ANALYZE foo;
2. CHECKPOINT;
3. pg_ctl restart -m immediate

This is the same scenario where we leak the relfile, but now you can 
have it with e.g. function statistics too.


Until 5891c7a8ed, there was a mechanism to garbage collect such orphaned 
entries (pgstat_vacuum()). How bad would it be to re-introduce that? Or 
can we make it more watertight so that there are no leaks?



If you do this:

pg_ctl start -D data
pg_ctl stop -D data -m immediate
pg_ctl start -D data
pg_ctl stop -D data -m immediate

You get this in the log:

2024-09-02 16:28:37.874 EEST [1397281] WARNING:  found incorrect redo 
LSN 0/160A3C8 (expected 0/160A440)


I think it's failing to flush the stats file at the end of recovery 
checkpoint.



[1] 
https://www.postgresql.org/message-id/20240901.010925.656452225144636594.horikyota.ntt%40gmail.com


--
Heikki Linnakangas
Neon (https://neon.tech)





per backend I/O statistics

2024-09-02 Thread Bertrand Drouvot
Hi hackers,

Please find attached a patch to implement $SUBJECT.

While pg_stat_io provides cluster-wide I/O statistics, this patch adds a new
pg_my_stat_io view to display "my" backend I/O statistics and a new
pg_stat_get_backend_io() function to retrieve the I/O statistics for a given
backend pid.

With this per-backend level of granularity, one could for example identify
which running backend is responsible for most of the reads, most of the extends,
and so on. The pg_my_stat_io view could also be useful to check the I/O impact
of some operations, queries, etc. in the current session.

Some remarks:

- it is split into 2 sub-patches: 0001 introduces the necessary changes to provide
the pg_my_stat_io view and 0002 adds the pg_stat_get_backend_io() function.
- the idea of having per backend I/O statistics has already been mentioned in
[1] by Andres.

Some implementation choices:

- The KIND_IO stats are still "fixed amount" ones as the maximum number of
backends is fixed.
- The statistics snapshot is made for the global stats (the aggregated ones) and
for my backend stats. The snapshot is not built for all the backend stats (that
could be memory expensive depending on the number of max connections and given
the fact that PgStat_IO is 16KB long).
- The above point means that pg_stat_get_backend_io() behaves as if
stats_fetch_consistency is set to none (each execution re-fetches counters
from shared memory).
- The above 2 points are also the reasons why the pg_my_stat_io view has been
added (its results take care of the stats_fetch_consistency setting). I think
it makes sense to rely on it in that case, while I'm not sure it would make
a lot of sense to retrieve another backend's I/O stats while honoring
stats_fetch_consistency.


[1]: 
https://www.postgresql.org/message-id/20230309003438.rectf7xo7pw5t5cj%40awork3.anarazel.de

Looking forward to your feedback,

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
>From 04c0a9ed462850a620df61e5d0dc5053717f2b26 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Thu, 22 Aug 2024 15:16:50 +
Subject: [PATCH v1 1/2] per backend I/O statistics

While pg_stat_io provides cluster-wide I/O statistics, this commit adds a new
pg_my_stat_io view to display "my" backend I/O statistics. The KIND_IO stats
are still "fixed amount" ones as the maximum number of backend is fixed.

The statistics snapshot is made for the global stats (the aggregated ones) and
for my backend stats. The snapshot is not build for all the backend stats (that
could be memory expensive depending of the number of max connections and given
the fact that PgStat_IO is 16K bytes long).

A subsequent commit will add a new pg_stat_get_backend_io() function to be
able to retrieve the I/O statistics for a given backend pid.

Bump catalog version.
---
 doc/src/sgml/config.sgml  |  4 +-
 doc/src/sgml/monitoring.sgml  | 28 +
 src/backend/catalog/system_views.sql  | 24 +++-
 src/backend/utils/activity/pgstat.c   | 10 ++-
 src/backend/utils/activity/pgstat_io.c| 75 +++
 src/backend/utils/activity/pgstat_shmem.c |  4 +-
 src/backend/utils/adt/pgstatfuncs.c   | 18 +-
 src/include/catalog/catversion.h  |  2 +-
 src/include/catalog/pg_proc.dat   |  8 +--
 src/include/pgstat.h  |  8 ++-
 src/include/utils/pgstat_internal.h   | 13 ++--
 src/test/regress/expected/rules.out   | 21 ++-
 src/test/regress/expected/stats.out   | 44 -
 src/test/regress/sql/stats.sql| 25 +++-
 14 files changed, 245 insertions(+), 39 deletions(-)
  10.7% doc/src/sgml/
   4.1% src/backend/catalog/
  30.9% src/backend/utils/activity/
   5.2% src/backend/utils/adt/
   7.9% src/include/catalog/
   3.4% src/include/utils/
  22.0% src/test/regress/expected/
  12.7% src/test/regress/sql/

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0aec11f443..2a59b97093 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8331,7 +8331,9 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
 displayed in 
 pg_stat_database,
 
-pg_stat_io, in the output of
+pg_stat_io,
+
+pg_my_stat_io, in the output of
  when the BUFFERS option
 is used, in the output of  when
 the VERBOSE option is used, by autovacuum
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 55417a6fa9..27d2548d61 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -488,6 +488,16 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
  
  
 
+ 
+  pg_my_stat_iopg_my_stat_io
+  
+   One row for each combination of context and target object containing
+   my backend I/O statistics.
+   See 
+ 

Re: Track IO times in pg_stat_io

2024-09-02 Thread Bertrand Drouvot
Hi,

On Fri, Aug 23, 2024 at 07:32:16AM +, Bertrand Drouvot wrote:
> Hi,
> 
> On Wed, Mar 08, 2023 at 04:34:38PM -0800, Andres Freund wrote:
> > On 2023-03-08 12:55:34 +0100, Drouvot, Bertrand wrote:
> > > - pg_stat_io is "global" across all sessions. So, even if one session is 
> > > doing some "testing" and needs to turn track_io_timing on, then it
> > > is even not sure it's only reflecting its own testing (as other sessions 
> > > may have turned it on too).
> > 
> > I think for 17 we should provide access to per-existing-connection 
> > pg_stat_io
> > stats, and also provide a database aggregated version. Neither should be
> > particularly hard.
> 
> FWIW, I think that would be great and plan to have a look at this (unless 
> someone
> beats me to it).

FWIW, here is the patch proposal for per backend I/O statistics [1].

[1]: 
https://www.postgresql.org/message-id/ZtXR%2BCtkEVVE/LHF%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: PG_TEST_EXTRA and meson

2024-09-02 Thread Nazir Bilal Yavuz
Hi,

On Fri, 30 Aug 2024 at 21:36, Jacob Champion
 wrote:
>
> On Wed, Aug 28, 2024 at 8:21 AM Nazir Bilal Yavuz  wrote:
> > I do not exactly remember the reason but I think I copied the same
> > behavior as before, PG_TEST_EXTRA variable was checked in the
> > src/test/Makefile so I exported it there.
>
> Okay, give v3 a try then. This exports directly from Makefile.global.
> Since that gets pulled into a bunch of places, the scope is a lot
> wider than it used to be; I've disabled it for PGXS so it doesn't end
> up poisoning other extensions.

Patch looks good and it passes all the test cases in Ashutosh's test script.

--
Regards,
Nazir Bilal Yavuz
Microsoft




Re: long-standing data loss bug in initial sync of logical replication

2024-09-02 Thread Nitin Motiani
On Tue, Aug 20, 2024 at 4:10 PM Amit Kapila  wrote:
>
> On Thu, Aug 15, 2024 at 9:31 PM vignesh C  wrote:
> > Since we are applying invalidations to all in-progress transactions,
> > the publisher will only replicate half of the transaction data up to
> > the point of invalidation, while the remaining half will not be
> > replicated.
> > Ex:
> > Session1:
> > BEGIN;
> > INSERT INTO tab_conc VALUES (1);
> >
> > Session2:
> > ALTER PUBLICATION regress_pub1 DROP TABLE tab_conc;
> >
> > Session1:
> > INSERT INTO tab_conc VALUES (2);
> > INSERT INTO tab_conc VALUES (3);
> > COMMIT;
> >
> > After the above the subscriber data looks like:
> > postgres=# select * from tab_conc ;
> >  a
> > ---
> >  1
> > (1 row)
> >
> > You can reproduce the issue using the attached test.
> > I'm not sure if this behavior is ok. At present, we’ve replicated the
> > first record within the same transaction, but the second and third
> > records are being skipped.
> >
>
> This can happen even without a concurrent DDL if some of the tables in
> the database are part of the publication and others are not. In such a
> case inserts for publicized tables will be replicated but other
> inserts won't. Sending the partial data of the transaction isn't a
> problem to me. Do you have any other concerns that I am missing?
>

Hi,

I think that partial data replication for one table is a bigger
issue than data being sent for only a subset of the tables in
the transaction. It can lead to inconsistent data if the same row is
updated multiple times or deleted in the same transaction: if only
some of the transaction's updates are sent to the subscriber, it
might end up with data that was never visible on the publisher side.

Here is an example I tried with the patch v8-001 :

I created following 2 tables on the publisher and the subscriber :

CREATE TABLE delete_test(id int primary key, name varchar(100));
CREATE TABLE update_test(id int primary key, name varchar(100));

I added both the tables to the publication p on the publisher and
created a subscription s on the subscriber.

I run 2 sessions on the publisher and do the following :

Session 1 :
BEGIN;
INSERT INTO delete_test VALUES(0, 'Nitin');

Session 2 :
ALTER PUBLICATION p DROP TABLE delete_test;

Session 1 :
DELETE FROM delete_test WHERE id=0;
COMMIT;

After the commit there should be no new row created on the publisher.
But because the partial data was replicated, this is what the select
on the subscriber shows :

SELECT * FROM delete_test;
 id |   name
+---
  0 | Nitin
(1 row)

I don't think the above is a common use case. But it is still an
issue because the subscriber ends up with data that never existed on
the publisher.

Similar issue can be seen with an update command.

Session 1 :
BEGIN;
INSERT INTO update_test VALUES(1, 'Chiranjiv');

Session 2 :
ALTER PUBLICATION p DROP TABLE update_test;

Session 1:
UPDATE update_test SET name='Eeshan' where id=1;
COMMIT;

After the commit, this is the state on the publisher :
SELECT * FROM update_test;
  1 | Eeshan
(1 row)

While this is the state on the subscriber :
SELECT * FROM update_test;
  1 | Chiranjiv
(1 row)

I think the update during a transaction scenario might be more common
than deletion right after insertion. But both of these seem like real
issues to consider. Please let me know if I'm missing something.

Thanks & Regards
Nitin Motiani
Google




Re: Improving tracking/processing of buildfarm test failures

2024-09-02 Thread Andrew Dunstan


On 2024-09-01 Su 2:46 PM, sia kc wrote:

Hello everyone

I am a developer interested in this project. I had a little involvement
with MariaDB and now I would like to work on Postgres. I have never worked
with mailing lists, so I am not sure if this is the way I should interact.
I would like to be pointed to some tasks and documents to get started.



Do you mean you want to be involved with $subject, or that you just want 
to be involved in Postgres development generally? If the latter, then 
replying to a specific email thread is not the way to go, and the first 
thing to do is look at this wiki page 




cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: Track IO times in pg_stat_io

2024-09-02 Thread Michael Paquier
On Mon, Sep 02, 2024 at 03:00:32PM +, Bertrand Drouvot wrote:
> On Fri, Aug 23, 2024 at 07:32:16AM +, Bertrand Drouvot wrote:
>> FWIW, I think that would be great and plan to have a look at this (unless 
>> someone
>> beats me to it).
> 
> FWIW, here is the patch proposal for per backend I/O statistics [1].
> 
> [1]: 
> https://www.postgresql.org/message-id/ZtXR%2BCtkEVVE/LHF%40ip-10-97-1-34.eu-west-3.compute.internal

Cool, thanks!
--
Michael




Re: define PG_REPLSLOT_DIR

2024-09-02 Thread Michael Paquier
On Fri, Aug 30, 2024 at 12:21:29PM +, Bertrand Drouvot wrote:
> That said, I don't have a strong opinion on this one, I think that also makes
> sense to leave it as it is. Please find attached v4 doing so.

The changes in astreamer_file.c are actually wrong regarding the fact
that should_allow_existing_directory() needs to be able to work with
the branch where this code is located as well as back-branches,
because pg_basebackup from version N supports ~(N-1) versions down to
a certain version, so changing it is not right.  This is why pg_xlog
and pg_wal are both listed there.

Perhaps we should do more for the two entries in basebackup.c with the
relative paths, but I'm not sure that's worth bothering, either.  At
the end, I got no objections about the remaining pieces, so applied.

How do people feel about the suggestions to update the comments at the
end?  With the comment in relpath.h suggesting to not change that, the
current state of HEAD is fine by me.
--
Michael




Re: scalability bottlenecks with (many) partitions (and more)

2024-09-02 Thread Tomas Vondra
On 9/2/24 01:53, Robert Haas wrote:
> On Sun, Sep 1, 2024 at 3:30 PM Tomas Vondra  wrote:
>> I don't think that's possible with hard-coded size of the array - that
>> allocates the memory for everyone. We'd need to make it variable-length,
>> and while doing those benchmarks I think we actually already have a GUC
>> for that - max_locks_per_transaction tells us exactly what we need to
>> know, right? I mean, if I know I'll need ~1000 locks, why not to make
>> the fast-path array large enough for that?
> 
> I really like this idea. I'm not sure about exactly how many fast path
> slots you should get for what value of max_locks_per_transaction, but
> coupling the two things together in some way sounds smart.
> 

I think we should keep that simple and make the cache large enough for
max_locks_per_transaction locks. That's the best information about
expected number of locks we have. If the GUC is left at the default
value, that probably means the backends need that many locks on
average. Yes, maybe there's an occasional spike in one of the backends,
but then that means other backends need fewer locks, and so there's less
contention for the shared lock table.

Of course, it's possible to construct counter-examples to this. Say a
single backend that needs a lot of these locks. But how's that different
from every other fixed-size cache with eviction?

The one argument to not tie this to max_locks_per_transaction is the
vastly different "per element" memory requirements. If you add one entry
to max_locks_per_transaction, that adds LOCK which is a whopping 152B.
OTOH one fast-path entry is ~5B, give or take. That's a pretty big
difference, and if the locks fit into the shared lock table but
you'd like to allow more fast-path locks, having to increase
max_locks_per_transaction is not great - pretty wasteful.

OTOH I'd really hate to just add another GUC and hope the users will
magically know how to set it correctly. That's pretty unlikely, IMO. I
myself wouldn't know what a good value is, I think.

But say we add a GUC and set it to -1 by default, in which case it just
inherits the max_locks_per_transaction value. And then also provide some
basic metric about this fast-path cache, so that people can tune this?

I think just knowing the "hit ratio" would be enough, i.e. counters for
how often it fits into the fast-path array, and how often we had to
promote it to the shared lock table would be enough, no?
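
To make that concrete, here is a minimal sketch (not taken from the
attached patches; FastPathLockSlotCount, fastpath_hits and
fastpath_promotions are made-up names, while max_locks_per_xact is the
existing C variable behind the GUC):

/*
 * Minimal sketch only, not from the attached patches.
 */
#include "postgres.h"
#include "port/pg_bitutils.h"
#include "storage/lock.h"

#define FP_LOCK_SLOTS_MIN	16	/* never shrink below the historic size */

static uint64 fastpath_hits;		/* lock fit into the fast-path array */
static uint64 fastpath_promotions;	/* had to fall back to the shared table */

static int
FastPathLockSlotCount(void)
{
	int			slots = Max(max_locks_per_xact, FP_LOCK_SLOTS_MIN);

	/* round up to a power of two so slot/group masking stays cheap */
	return (int) pg_nextpower2_32((uint32) slots);
}

Exposing fastpath_hits / (fastpath_hits + fastpath_promotions) somewhere
would give users the hit ratio they need to decide whether the default
sizing is enough.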

>> Of course, the consequence of this would be making PGPROC variable
>> length, or having to point to a memory allocated separately (I prefer
>> the latter option, I think). I haven't done any experiments, but it
>> seems fairly doable - of course, not sure if it might be more expensive
>> compared to compile-time constants.
> 
> I agree that this is a potential problem but it sounds like the idea
> works well enough that we'd probably still come out quite far ahead
> even with a bit more overhead.
> 

OK, I did some quick tests on this, and I don't see any regressions.

Attached are 4 patches:

1) 0001 - original patch, with some minor fixes (remove init, which is
   not necessary, that sort of thing)

2) 0002 - a bit of reworks, improving comments, structuring the macros a
   little bit better, etc. But still compile-time constants.

3) 0003 - dynamic sizing, based on max_locks_per_transaction. It's a bit
   ugly, because the size is calculated during shmem allocation - it
   should happen earlier, but good enough for PoC.

4) 0004 - introduce a separate GUC, this is mostly to allow testing of
   different values without changing max_locks_per_transaction


I've only did that on my smaller 32-core machine, but for three simple
tests it looks like this (throughput using 16 clients):

mode  test master12   34

prepared count   1460 1477 1488 14901491
  join  15556244512604425026   24237
   pgbench 148187   151192   151688   150389  152681

simple   count   1341 1351 1373 13741370
  join   4643 5439 5459 53935345
   pgbench 139763   141267   142796   141207  142600

Those are some simple benchmarks on 100 partitions, where the regular
pgbench and count(*) are expected to not be improved, and the join is
the partitioned join this thread started with. 1-4 are the attached
patches, to see the impact for each of them.

Translated to results relative to master:

mode test   1   2   3   4
-
preparedcount101%102%102%102%
 join157%167%161%156%
  pgbench102%102%101%103%
-
simple  

Re: Typos in the code and README

2024-09-02 Thread Alexander Lakhin

Hello,

12.08.2024 14:59, David Rowley wrote:

(I know Daniel mentioned he'd get to these, but the ScanDirection one
was my fault and I needed to clear that off my mind. I did a few
others while on this topic.)


Thank you, David, for working on that!

I've gathered another bunch of defects with the possible substitutions.
Please take a look:
adapated -> adapted

becasue -> because

cancelled -> canceled (introduced by 90f517821, but see 8c9da1441)

cange -> change

comand -> command

CommitTSSLRU -> CommitTsSLRU (introduced by 53c2a97a9; maybe the fix
 should be back-patched...)

connectOptions2 -> pqConnectOptions2 (see 774bcffe4)

Injections points -> Injection points

jsetate -> jsestate

LockShmemSize -> remove the sentence? (added by ec0baf949, outdated with
 a794fb068)

MaybeStartSlotSyncWorker -> LaunchMissingBackgroundProcesses (the logic to
 start B_SLOTSYNC_WORKER moved from the former to the latter function with
 3354f8528)

multixact_member_buffer -> multixact_member_buffers

per_data_data -> per_buffer_data (see code below the comment; introduced by
 b5a9b18cd)

per_buffer_private -> remove the function declaration? (the duplicate
 declaration was added by a858be17c)

performancewise -> performance-wise? (coined by a7f107df2)

pgstat_add_kind -> pgstat_register_kind (see 7949d9594)

pg_signal_autovacuum -> pg_signal_autovacuum_worker (see d2b74882c)

recoveery -> recovery

RegisteredWorker -> RegisteredBgWorker

RUNNING_XACT -> RUNNING_XACTS

sanpshot -> snapshot

TypeEntry -> TypeCacheEntry (align with AttoptCacheEntry, from the same
 commit 40064a8ee)

The corresponding patch is attached for your convenience.

Best regards,
Alexander
diff --git a/contrib/test_decoding/specs/skip_snapshot_restore.spec b/contrib/test_decoding/specs/skip_snapshot_restore.spec
index 3f1fb6f02c7..7b35dbcc9f3 100644
--- a/contrib/test_decoding/specs/skip_snapshot_restore.spec
+++ b/contrib/test_decoding/specs/skip_snapshot_restore.spec
@@ -39,7 +39,7 @@ step "s2_get_changes_slot0" { SELECT data FROM pg_logical_slot_get_changes('slot
 # serializes consistent snapshots to the disk at LSNs where are before
 # s0-transaction's commit. After s0-transaction commits, "s1_init" resumes but
 # must not restore any serialized snapshots and will reach the consistent state
-# when decoding a RUNNING_XACT record generated after s0-transaction's commit.
+# when decoding a RUNNING_XACTS record generated after s0-transaction's commit.
 # We check if the get_changes on 'slot1' will not return any s0-transaction's
 # changes as its confirmed_flush_lsn will be after the s0-transaction's commit
 # record.
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 9bc23a9a938..af7864a1b5b 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3891,8 +3891,8 @@ static const PgStat_KindInfo custom_stats = {
  it with pgstat_register_kind and a unique ID used to
  store the entries related to this type of statistics:
 
-extern PgStat_Kind pgstat_add_kind(PgStat_Kind kind,
-   const PgStat_KindInfo *kind_info);
+extern PgStat_Kind pgstat_register_kind(PgStat_Kind kind,
+const PgStat_KindInfo *kind_info);
 
  While developing a new extension, use
  PGSTAT_KIND_EXPERIMENTAL for
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index a03d56541d0..8c37d7eba76 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -2017,7 +2017,7 @@ check_multixact_offset_buffers(int *newval, void **extra, GucSource source)
 }
 
 /*
- * GUC check_hook for multixact_member_buffer
+ * GUC check_hook for multixact_member_buffers
  */
 bool
 check_multixact_member_buffers(int *newval, void **extra, GucSource source)
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 91f0fd6ea3e..b2457f121a7 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -382,7 +382,7 @@ RefreshMatViewByOid(Oid matviewOid, bool is_create, bool skipData,
 	 * command tag is left false in cmdtaglist.h. Otherwise, the change of
 	 * completion tag output might break applications using it.
 	 *
-	 * When called from CREATE MATERIALIZED VIEW comand, the rowcount is
+	 * When called from CREATE MATERIALIZED VIEW command, the rowcount is
 	 * displayed with the command tag CMDTAG_SELECT.
 	 */
 	if (qc)
diff --git a/src/backend/commands/waitlsn.c b/src/backend/commands/waitlsn.c
index d9cf9e7d75e..d7065726749 100644
--- a/src/backend/commands/waitlsn.c
+++ b/src/backend/commands/waitlsn.c
@@ -369,7 +369,7 @@ pg_wal_replay_wait(PG_FUNCTION_ARGS)
 	 */
 	InvalidateCatalogSnapshot();
 
-	/* Give up if there is still an active or registered sanpshot. */
+	/* Give up if there is still an active or registered snapshot. */
 	if (GetOldestSnapshot())
 		ereport(ERROR,
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
diff --git a/src/backe

Re: [PATCH] Add roman support for to_number function

2024-09-02 Thread Maciek Sakrejda
Thanks for the contribution.

I took a look at the patch, and it works as advertised. It's too late
for the September commitfest, but I took the liberty of registering
your patch for the November CF [1]. In the course of that, I found an
older thread proposing this feature seven years ago [2]. That patch
was returned with feedback and (as far as I can tell) was not
followed up on by the author. You may want to review that thread for
feedback; I won't repeat it here.

On Fri, Aug 30, 2024 at 12:22 AM Hunaid Sohail  wrote:
>While looking at formatting.c file, I noticed a TODO about "add support for 
>roman number to standard number conversion"  
>(https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/formatting.c#L52)

Your patch should also remove the TODO =)

> - Is it okay for the function to handle Roman numerals in a case-insensitive 
> way? (e.g., 'XIV', 'xiv', and 'Xiv' are all seen as 14).

The patch in the thread I linked also took a case-insensitive
approach. I did not see any objections to that, and it seems
reasonable to me as well.

> - How should we handle Roman numerals with leading or trailing spaces, like ' 
> XIV' or 'MC '? Should we trim the spaces, or would it be better to throw an 
> error in such cases?

I thought we could reference existing to_number behavior here, but
after playing with it a bit, I'm not really sure what that is:

-- single leading space
maciek=# select to_number(' 1', '9');
 to_number
---
 1
(1 row)

-- two leading spaces
maciek=# select to_number('  1', '9');
ERROR:  invalid input syntax for type numeric: " "
-- two leading spaces and template pattern with a decimal
maciek=# select to_number('  1', '9D9');
 to_number
---
 1
(1 row)

Separately, I also noticed some unusual Roman representations work
with your patch:

postgres=# select to_number('viv', 'RN');
 to_number
---
 9
(1 row)

Is this expected? In contrast, some fairly common alternative
representations don't work:

postgres=# select to_number('', 'RN');
ERROR:  invalid roman numeral

I know this is expected, but is this the behavior we want? If so, we
probably want to reject the former case, too. If not, maybe that one
is okay, too.
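
One cheap way to pin down "canonical only" semantics is to re-encode the
parsed value and compare it with the input. The standalone sketch below is
only an illustration of that acceptance rule, not code from the patch; it
stays case-insensitive and rejects 'viv':

/* Standalone sketch: accept a Roman numeral only if it is in canonical form. */
#include <stdio.h>
#include <string.h>
#include <ctype.h>

/* write the canonical Roman spelling of n (1..3999) into buf */
static void
to_canonical_roman(int n, char *buf)
{
	static const int	vals[] = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
	static const char  *syms[] = {"M", "CM", "D", "CD", "C", "XC", "L", "XL",
								  "X", "IX", "V", "IV", "I"};

	buf[0] = '\0';
	for (int i = 0; i < 13; i++)
		while (n >= vals[i])
		{
			strcat(buf, syms[i]);
			n -= vals[i];
		}
}

/* return the value 1..3999, or -1 if the input is not canonical */
static int
parse_strict_roman(const char *input)
{
	char		upper[32];
	char		canon[32];
	size_t		len = strlen(input);

	if (len == 0 || len >= sizeof(upper))
		return -1;
	for (size_t i = 0; i < len; i++)
		upper[i] = toupper((unsigned char) input[i]);
	upper[len] = '\0';

	for (int n = 1; n <= 3999; n++)
	{
		to_canonical_roman(n, canon);
		if (strcmp(canon, upper) == 0)
			return n;
	}
	return -1;
}

int
main(void)
{
	printf("XIV -> %d\n", parse_strict_roman("XIV"));	/* 14 */
	printf("xiv -> %d\n", parse_strict_roman("xiv"));	/* 14, case-insensitive */
	printf("viv -> %d\n", parse_strict_roman("viv"));	/* -1, rejected */
	return 0;
}

Brute-forcing all 3999 values is of course not what formatting.c should
do, but it makes the accepted set unambiguous for this discussion.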

I know I've probably offered more questions than answers, but I hope
finding the old thread here is useful.

Thanks,
Maciek

[1]: https://commitfest.postgresql.org/50/5233/
[2]: https://www.postgresql.org/message-id/flat/CAGMVOduAJ9wKqJXBYnmFmEetKxapJxrG3afUwpbOZ6n_dWaUnA%40mail.gmail.com




Re: JIT: Remove some unnecessary instructions.

2024-09-02 Thread Andreas Karlsson

On 9/2/24 4:23 AM, Xing Guo wrote:

Thanks for testing it! I spotted another unnecessary store instruction
and added it in my V2 patch.


Another well-spotted unnecessary store. Nice!

I think this patch is ready for committer. It is simple and pretty 
obviously correct.


Andreas





Re: JIT: Remove some unnecessary instructions.

2024-09-02 Thread Andreas Karlsson

On 9/2/24 9:06 PM, Andreas Karlsson wrote:

On 9/2/24 4:23 AM, Xing Guo wrote:

Thanks for testing it! I spotted another unnecessary store instruction
and added it in my V2 patch.


Another well-spotted unnecessary store. Nice!

I think this patch is ready for committer. It is simple and pretty 
obviously correct.


Oh, and please add your patch to the commitfest app so it is not lost.

https://commitfest.postgresql.org/50/

Andreas





Re: query_id, pg_stat_activity, extended query protocol

2024-09-02 Thread Andrei Lepikhov

On 14/8/2024 23:05, Imseih (AWS), Sami wrote:

I think the testing discussion should be moved to a different thread.
What do you think?

See v4.

0001 deals with reporting queryId in exec_execute_message and 
exec_bind_message.

0002 deals with reporting queryId after a cache invalidation.

There are no tests as this requires more discussion in a separate thread(?)

At first glance, these patches look good.
But something seems slightly muddled here:
queryId should be initialised at the top-level query. At the same time,
the RevalidateCachedQuery routine can change this value when the query
tree is re-validated.
One could argue that this routine can't be called from a non-top-level query
right now, except via SPI. Yes, but what about extensions or future usage?


--
regards, Andrei Lepikhov





thread-safety: strerror_r()

2024-09-02 Thread Peter Eisentraut
There are only a few (not necessarily thread-safe) strerror() calls in 
the backend; most other potential users use %m in a format string.


In two cases, the reason for using strerror() was that we needed to 
print the error message twice, and so errno has to be reset for the 
second time.  And/or some of this code is from before snprintf() gained 
%m support.  This can easily be simplified now.


The other is a workaround for OpenSSL that we have already handled in an 
equivalent way in libpq.


(And there is one in postmaster.c, but that one is before forking.)

I think we can apply these patches now to check this off the list of 
not-thread-safe functions to check.From c2ce542d61d5e86ab138b72e2e0d74fdac589f04 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 2 Sep 2024 11:02:22 +0200
Subject: [PATCH 1/2] Remove a couple of strerror() calls

Change to using %m in the error message string.  We need to be a bit
careful here to preserve errno until we need to print it.

This change avoids the use of not-thread-safe strerror() and unifies
some error message strings, and maybe makes the code appear more
consistent.
---
 src/backend/libpq/hba.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index 75d588e36a1..2fd96a71294 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -624,8 +624,11 @@ open_auth_file(const char *filename, int elevel, int depth,
 errmsg("could not open file \"%s\": %m",
filename)));
if (err_msg)
-   *err_msg = psprintf("could not open file \"%s\": %s",
-   filename, strerror(save_errno));
+   {
+   errno = save_errno;
+   *err_msg = psprintf("could not open file \"%s\": %m",
+   filename);
+   }
/* the caller may care about some specific errno */
errno = save_errno;
return NULL;
@@ -762,8 +765,9 @@ tokenize_auth_file(const char *filename, FILE *file, List **tok_lines,
ereport(elevel,
(errcode_for_file_access(),
 errmsg("could not read file \"%s\": %m", filename)));
-   err_msg = psprintf("could not read file \"%s\": %s",
-  filename, strerror(save_errno));
+   errno = save_errno;
+   err_msg = psprintf("could not read file \"%s\": %m",
+  filename);
break;
}
 
-- 
2.46.0

From a2ad11452ac8c8036981bcfa5777ad7c5068aa4a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 2 Sep 2024 11:08:13 +0200
Subject: [PATCH 2/2] Avoid strerror()

Replace one strerror() with strerror_r(), mirroring the equivalent
code in frontend libpq.
---
 src/backend/libpq/be-secure-openssl.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 60cf5d16e74..a04a514bff9 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -1456,7 +1456,7 @@ static const char *
 SSLerrmessage(unsigned long ecode)
 {
const char *errreason;
-   static char errbuf[36];
+   static char errbuf[128];
 
if (ecode == 0)
return _("no SSL error reported");
@@ -1473,7 +1473,10 @@ SSLerrmessage(unsigned long ecode)
 */
 #ifdef ERR_SYSTEM_ERROR
if (ERR_SYSTEM_ERROR(ecode))
-   return strerror(ERR_GET_REASON(ecode));
+   {
+   strerror_r(ERR_GET_REASON(ecode), errbuf, sizeof(errbuf));
+   return errbuf;
+   }
 #endif
 
/* No choice but to report the numeric ecode */
-- 
2.46.0



Re: thread-safety: strerror_r()

2024-09-02 Thread Tom Lane
Peter Eisentraut  writes:
> I think we can apply these patches now to check this off the list of 
> not-thread-safe functions to check.

+1 for the first patch.  I'm less happy with

-   static char errbuf[36];
+   static char errbuf[128];

As a minor point, shouldn't this be

+   static char errbuf[PG_STRERROR_R_BUFLEN];

But the bigger issue is that the use of a static buffer makes
this not thread-safe, so having it use strerror_r to fill that
buffer is just putting lipstick on a pig.  If we really want
to make this thread-ready, we need to adopt the approach used
in libpq's fe-secure-openssl.c, where callers have to free the
buffer later.  Or maybe we could just palloc the result, and
trust that it's not in a long-lived context?
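
For illustration, a palloc-based variant could look like the sketch below
(assumptions: the same fallback messages as the current function and
PG_STRERROR_R_BUFLEN for the buffer size; this is not a finished patch):

/* Sketch only: return a freshly allocated string, no static state. */
static char *
SSLerrmessage_palloc(unsigned long ecode)
{
	const char *errreason;

	if (ecode == 0)
		return pstrdup(_("no SSL error reported"));

	errreason = ERR_reason_error_string(ecode);
	if (errreason != NULL)
		return pstrdup(errreason);

#ifdef ERR_SYSTEM_ERROR
	if (ERR_SYSTEM_ERROR(ecode))
	{
		char		buf[PG_STRERROR_R_BUFLEN];

		strerror_r(ERR_GET_REASON(ecode), buf, sizeof(buf));
		return pstrdup(buf);
	}
#endif

	return psprintf(_("SSL error code %lu"), ecode);
}

That keeps the function free of shared state, at the cost of trusting the
caller's memory context, as noted above.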

regards, tom lane




Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

2024-09-02 Thread Jehan-Guillaume de Rorthais
Hi,

On Tue, 20 Aug 2024 23:09:27 -0400
Alvaro Herrera  wrote:

> On 2024-Aug-20, Jehan-Guillaume de Rorthais wrote:
> 
> > I'm back on this issue as well. I start poking at this patch to review it,
> > test it, challenge it and then report here.
> > 
> > I'll try to check if some other issues might have lost/forgot on they way as
> > well.
> 
> Thanks, much appreciated, looking forward to your feedback.

Sorry, it took me a while to come back to you on this topic. It has been hard to
untangle subjects, reproductions and patch…

There's three distinct issues/thread:

* Constraint & trigger catalog cleanup [1] (this thread)
* FK broken after DETACH [2]
* Maintenance consideration about self referencing FK between partitions [3]

0. Splitting in two commits

  Your patch addresses two bugs:

* one for the constraint & trigger catalog cleanup;
* one for the FK broken after DETACH.

  These issues are unrelated, therefore I am wondering if it would be better
  to split their resolution into two different patches.

  Last year, I reported them in two different threads [1][2]. The first with
  implementation consideration, the second with a demo/proposal/draft fix.

  Unfortunately, this discussion about the first bug slipped to the second one
  when Tender stumbled on this bug as well and reported it. But, both bugs can
  be triggered independently, and have distinct fixes.

  Finally, splitting the patch might help setting finer patch co-authoring. I
  know my patch for [2] was a draft and somewhat trivial, but I spent a fair
  amount of time reporting it and producing a draft patch, so I was wondering if
  it would be a candidate for a co-author flag on this (small, humble and
  refactored by you) patch?

  I'm definitely not involved (yet) in the second part though.

1. Constraint & trigger catalog cleanup [1]

  I have been focusing on the current master branch and haven't taken into
  consideration backpatching related issues yet.

  When I first studied this bug and reported it, I held on writing a patch
  because it seemed it would duplicate some existing code. I wrote:

  > I poked around DetachPartitionFinalize() to try to find a way to fix this,
  > but it looks like it would duplicate a bunch of code from other code path
  > (eg. from CloneFkReferenced).

  My proposal was to clean everything related to the old FK and use some
  existing code path to create a fresh and cleaner one. This requires some
  refactoring in existing code, but we would win a common path of code between
  create/attach/detach, a cleaner catalog and easier code maintenance.

  I've finally been able to write a PoC that implement this by calling
  addFkRecurseReferenced() from DetachPartitionFinalize(). I can't join
  it here because it is currently an ugly draft and I still have some work
  to do. But I would really like to have a little more time (one or two days) to
  explore this avenue further before you commit yours, if you don't mind? Or
  maybe you already have considered this avenue and rejected it?


2. FK broken after DETACH [2]

  Comparing your patch to my draft from [2], I just have a question about the
  refactoring.

  Fencing the constraint/trigger removal inside a conditional
  RELKIND_PARTITIONED_TABLE block of code was obvious. It avoids some useless
  catalog scan compared to my draft patch.

  Also, the "contype == CONSTRAINT_FOREIGN" I had sounds safe to remove.

  However, is it clean/light enough to add the "conparentid == fk->conoid" in
  the scan key as I did? I'm not sure it saves anything else but the small
  conditional block you inserted inside the loop, but I wonder if there's a
  serious concern about this anyway?

  Last, considering the tests, I think we should add some rows in the tables,
  to make sure the FK is correctly enforced after DETACH. Something like:

CREATE SCHEMA fkpart12
  CREATE TABLE fk_p ( id bigint PRIMARY KEY ) PARTITION BY list (id)
  CREATE TABLE fk_p_1 PARTITION OF fk_p FOR VALUES IN (1)
  CREATE TABLE fk_p_2 PARTITION OF fk_p FOR VALUES IN (2)
  CREATE TABLE fk_r_1 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
  CREATE TABLE fk_r_2 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
  CREATE TABLE fk_r   ( id bigint PRIMARY KEY, p_id bigint NOT NULL,
 FOREIGN KEY (p_id) REFERENCES fk_p (id)
  ) PARTITION BY list (id);
SET search_path TO fkpart12;

INSERT INTO fk_p VALUES (1);

ALTER TABLE fk_r ATTACH PARTITION fk_r_2 FOR VALUES IN (2);

ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
\d fk_r_1

INSERT INTO fk_r VALUES (1,1);

ALTER TABLE fk_r DETACH PARTITION fk_r_1;
\d fk_r_1

INSERT INTO fk_r_1 VALUES (2,2); -- fails as EXPECTED
DELETE FROM fk_p; -- should fail but was buggy

ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
\d fk_r_1


3. Self referencing FK between partitions [3]

  You added to your commit message:

verify: 20230707175859.17c91538@karst

  I'm n

Re: Partitioned tables and [un]loggedness

2024-09-02 Thread Michael Paquier
On Thu, Aug 29, 2024 at 09:49:44AM -0500, Nathan Bossart wrote:
> IMHO continuing to allow partitioned tables to be marked UNLOGGED just
> preserves the illusion that it does something.  An ERROR could help dispel
> that misconception.

Okay.  This is going to be disruptive if we do nothing about pg_dump,
unfortunately.  How about tweaking dumpTableSchema() so as we'd never
issue UNLOGGED for a partitioned table?  We could filter that out as
there is tbinfo->relkind.
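
Roughly something like this in dumpTableSchema() (a sketch only;
qualified_name stands in for the existing name-building logic):

/*
 * Rough sketch.  Never emit UNLOGGED for a partitioned table, even if
 * relpersistence says 'u' in the source database.
 */
const char *persistence = "";

if (tbinfo->relpersistence == RELPERSISTENCE_UNLOGGED &&
	tbinfo->relkind != RELKIND_PARTITIONED_TABLE)
	persistence = "UNLOGGED ";

appendPQExpBuffer(q, "CREATE %sTABLE %s",
				  persistence,
				  qualified_name);	/* placeholder for the existing logic */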
--
Michael




Re: Flush pgstats file during checkpoints

2024-09-02 Thread Michael Paquier
On Mon, Sep 02, 2024 at 05:08:03PM +0300, Heikki Linnakangas wrote:
> Hmm, I'm a bit disappointed this doesn't address replication. It makes sense
> that scans are counted separately on a standby, but it would be nice if
> stats like last_vacuum were propagated from primary to standbys.  I guess
> that can be handled separately later.

Yes, it's not something that I'm planning to tackle for this thread.
Speaking of which, the design I had in mind for this area was not
"that" complicated:
- Add a new RMGR for all the stats.
- Add a first callback for stats kinds for WAL inserts, giving to each
stats the possibility to pass down data inserted to the record, as we
want to replicate a portion of the data depending on the kind dealt
with.
- Add a second callback for recovery, called depending on the kind ID.

I have not looked into the details yet, but stats to replicate should
be grouped in a single record on transaction commit or depending on
the flush timing for fixed-numbered stats.  Or we should just add them
in commit records?
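
Just to sketch the shape of that idea (nothing below exists in the tree;
the names are invented for discussion):

/* Invented callback shapes for the replication design described above. */

/* let a stats kind append its data to the record being built */
typedef void (*pgstat_build_replication_data_cb) (StringInfo buf);

/* apply one replicated entry of this kind during recovery */
typedef void (*pgstat_redo_cb) (XLogReaderState *record);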

> Reviewing v7-0001-Flush-pgstats-file-during-checkpoints.patch:
> 
> There are various race conditions where a stats entry can be leaked in the
> pgstats file. I.e. relation is dropped, but its stats entry is retained in
> the stats file after crash. In the worst case, such leaked entries can
> accumulate until the stats file is manually removed, which resets all stats
> again. Perhaps that's acceptable - it's still possible leak the actual
> relation file for a new relation on crash, after all, which is much worse
> (I'm glad Horiguchi-san is working on that [1]).

Yeah, that's not an easy issue.  We don't really have any protection
against that now, either.  Backends can also refer to stats entries
in their shutdown callback that have been dropped concurrently.  See
some details about that at https://commitfest.postgresql.org/49/5045/.

> Until 5891c7a8ed, there was a mechanism to garbage collect such orphaned
> entries (pgstat_vacuum()). How bad would it be to re-introduce that? Or can
> we make it more watertight so that there are no leaks?

Not sure about this part, TBH.  Doing that again in autovacuum does
not excite me much as it has a cost.

> I think it's failing to flush the stats file at the end of recovery
> checkpoint.

Missed that, oops.  I'll double-check this area.
--
Michael




Re: Partitioned tables and [un]loggedness

2024-09-02 Thread Nathan Bossart
On Tue, Sep 03, 2024 at 09:22:58AM +0900, Michael Paquier wrote:
> On Thu, Aug 29, 2024 at 09:49:44AM -0500, Nathan Bossart wrote:
>> IMHO continuing to allow partitioned tables to be marked UNLOGGED just
>> preserves the illusion that it does something.  An ERROR could help dispel
>> that misconception.
> 
> Okay.  This is going to be disruptive if we do nothing about pg_dump,
> unfortunately.  How about tweaking dumpTableSchema() so as we'd never
> issue UNLOGGED for a partitioned table?  We could filter that out as
> there is tbinfo->relkind.

That's roughly what I had in mind.

-- 
nathan




Re: macOS prefetching support

2024-09-02 Thread Thomas Munro
On Mon, Aug 19, 2024 at 1:35 AM Peter Eisentraut  wrote:
> On 17.08.24 00:01, Thomas Munro wrote:
> > I think that's fine.  I don't really like the word "prefetch", could
> > mean many different things.  What about "requires OS support for
> > issuing read-ahead advice", which uses a word that appears in both of
> > those interfaces?
>
> I like that term.

A couple of other places still use the old specific terminology.  PSA.
From 287a56f8d177e56983d68cdd7201866ba119c130 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 3 Sep 2024 13:30:22 +1200
Subject: [PATCH] Standardize "read-ahead advice" terminology.

Commit 6654bb920 added macOS's equivalent of POSIX_FADV_WILLNEED,
and adjusted a few explicit references to posix_fadvise to use a more
general name for the concept.  Update some remaining references.

Reviewed-by:
Discussion: https://postgr.es/m/0827edec-1317-4917-a186-035eb1e3241d%40eisentraut.org
---
 src/backend/access/transam/xlogprefetcher.c |  2 +-
 src/backend/storage/aio/read_stream.c   | 17 +
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index 3cb698d3bcb..3acaaea5b70 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -1083,7 +1083,7 @@ check_recovery_prefetch(int *new_value, void **extra, GucSource source)
 #ifndef USE_PREFETCH
 	if (*new_value == RECOVERY_PREFETCH_ON)
 	{
-		GUC_check_errdetail("\"recovery_prefetch\" is not supported on platforms that lack posix_fadvise().");
+		GUC_check_errdetail("\"recovery_prefetch\" is not supported on platforms that lack support for issuing read-ahead advice.");
 		return false;
 	}
 #endif
diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index 93cdd35fea0..aae92e6b1f8 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -24,16 +24,17 @@
  * already.  There is no benefit to looking ahead more than one block, so
  * distance is 1.  This is the default initial assumption.
  *
- * B) I/O is necessary, but fadvise is undesirable because the access is
- * sequential, or impossible because direct I/O is enabled or the system
- * doesn't support fadvise.  There is no benefit in looking ahead more than
+ * B) I/O is necessary, but read-ahead advice is undesirable because the access
+ * is sequential and we can rely on the kernel's read-ahead heuristics, or
+ * impossible because direct I/O is enabled, or the system doesn't support
+ * read-ahead advice.  There is no benefit in looking ahead more than
  * io_combine_limit, because in this case the only goal is larger read system
- * calls.  Looking further ahead would pin many buffers and perform
- * speculative work looking ahead for no benefit.
+ * calls.  Looking further ahead would pin many buffers and perform speculative
+ * work for no benefit.
  *
- * C) I/O is necessary, it appears random, and this system supports fadvise.
- * We'll look further ahead in order to reach the configured level of I/O
- * concurrency.
+ * C) I/O is necessary, it appears to be random, and this system supports
+ * read-ahead advice.  We'll look further ahead in order to reach the
+ * configured level of I/O concurrency.
  *
  * The distance increases rapidly and decays slowly, so that it moves towards
  * those levels as different I/O patterns are discovered.  For example, a
-- 
2.46.0



Add callback in pgstats for backend initialization

2024-09-02 Thread Michael Paquier
Hi all,

Currently, the backend-level initialization of pgstats happens in
pgstat_initialize(), where we are using a shortcut for the WAL stats,
with pgstat_init_wal().

I'd like to propose to add a callback for that, so as in-core or
custom pgstats kinds can assign custom actions when initializing a
backend.  The use-case I had for this one are pretty close to what we
do for WAL, where we would rely on a difference of state depending on
what may have existed before reaching the initialization path.  So
this can offer more precision.  Another case, perhaps less
interesting, is to be able to initialize some static backend state.

I wanted to get that sent for the current commit fest, but did not get
back to it in time.  Anyway, here it is.  This gives the simple patch
attached.
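
As an illustration, a custom fixed-numbered kind could use the callback to
capture a per-backend baseline at startup. Everything named mystats_*
below is hypothetical, and most required PgStat_KindInfo fields are
omitted:

/* Hypothetical example, not part of the patch. */
static PgStat_MyStats mystats_baseline;		/* made-up local baseline */

static void
mystats_init_backend_cb(void)
{
	/* remember the state that later flushes will be diffed against */
	mystats_baseline = mystats_read_current();	/* made-up helper */
}

static const PgStat_KindInfo mystats_kind_info = {
	.name = "mystats",
	.fixed_amount = true,
	/* shared_size / shared_data_off / shared_data_len elided */
	.init_backend_cb = mystats_init_backend_cb,
	/* init_shmem_cb, reset_all_cb, snapshot_cb also needed */
};

Such a kind would be registered with pgstat_register_kind() from an
extension's _PG_init(), like any custom statistics kind.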

Thanks,
--
Michael
From 1e41a5155fb689f1d4ae8a5cfb244b824e913bae Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 3 Sep 2024 10:38:15 +0900
Subject: [PATCH] Add callback for backend initialization in pgstats

This is currently used by the WAL stats, and is now made available for
others as well.
---
 src/include/utils/pgstat_internal.h |  7 +++
 src/backend/utils/activity/pgstat.c | 14 --
 src/backend/utils/activity/pgstat_wal.c |  2 +-
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index fb132e439d..53f7d5c98c 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -229,6 +229,12 @@ typedef struct PgStat_KindInfo
 	 */
 	uint32		pending_size;
 
+	/*
+	 * Perform custom actions when initializing a backend (standalone or under
+	 * postmaster). Optional.
+	 */
+	void		(*init_backend_cb) (void);
+
 	/*
 	 * For variable-numbered stats: flush pending stats. Required if pending
 	 * data is used.
@@ -676,6 +682,7 @@ extern bool pgstat_flush_wal(bool nowait);
 extern void pgstat_init_wal(void);
 extern bool pgstat_have_pending_wal(void);
 
+extern void pgstat_wal_init_backend_cb(void);
 extern void pgstat_wal_init_shmem_cb(void *stats);
 extern void pgstat_wal_reset_all_cb(TimestampTz ts);
 extern void pgstat_wal_snapshot_cb(void);
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index b2ca3f39b7..3b6983ff5d 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -441,6 +441,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
 		.shared_data_off = offsetof(PgStatShared_Wal, stats),
 		.shared_data_len = sizeof(((PgStatShared_Wal *) 0)->stats),
 
+		.init_backend_cb = pgstat_wal_init_backend_cb,
 		.init_shmem_cb = pgstat_wal_init_shmem_cb,
 		.reset_all_cb = pgstat_wal_reset_all_cb,
 		.snapshot_cb = pgstat_wal_snapshot_cb,
@@ -604,10 +605,19 @@ pgstat_initialize(void)
 
 	pgstat_attach_shmem();
 
-	pgstat_init_wal();
-
 	pgstat_init_snapshot_fixed();
 
+	/* Backend initialization callbacks */
+	for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+	{
+		const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+
+		if (kind_info == NULL || kind_info->init_backend_cb == NULL)
+			continue;
+
+		kind_info->init_backend_cb();
+	}
+
 	/* Set up a process-exit hook to clean up */
 	before_shmem_exit(pgstat_shutdown_hook, 0);
 
diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index e2a3f6b865..8c19c3f2fd 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -138,7 +138,7 @@ pgstat_flush_wal(bool nowait)
 }
 
 void
-pgstat_init_wal(void)
+pgstat_wal_init_backend_cb(void)
 {
 	/*
 	 * Initialize prevWalUsage with pgWalUsage so that pgstat_flush_wal() can
-- 
2.45.2





Re: GUC names in messages

2024-09-02 Thread Peter Smith
Hi.

The cfbot was reporting my patches needed to be rebased.

Here is the rebased patch set v10*. Everything is the same as before
except now there are only 7 patches instead of 8. The previous v9-0001
("bool") patch no longer exists because those changes are now already
present in HEAD.

I hope these might be pushed soon to avoid further rebasing.

==
Kind Regards,
Peter Smith.
Fujitsu Australia


v10-0001-Add-quotes-for-GUCs-int.patch
Description: Binary data


v10-0004-Add-quotes-for-GUCs-enum.patch
Description: Binary data


v10-0003-Add-quotes-for-GUCs-string.patch
Description: Binary data


v10-0002-Add-quotes-for-GUCs-real.patch
Description: Binary data


v10-0005-GUC-names-fix-case-intervalstyle.patch
Description: Binary data


v10-0006-GUC-names-fix-case-datestyle.patch
Description: Binary data


v10-0007-GUC-names-make-common-translatable-messages.patch
Description: Binary data


Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

2024-09-02 Thread Tender Wang
Jehan-Guillaume de Rorthais  wrote on Tue, Sep 3, 2024 at 05:02:

> Hi,
>
> On Tue, 20 Aug 2024 23:09:27 -0400
> Alvaro Herrera  wrote:
>
> > On 2024-Aug-20, Jehan-Guillaume de Rorthais wrote:
> >
> > > I'm back on this issue as well. I start poking at this patch to review
> it,
> > > test it, challenge it and then report here.
> > >
> > > I'll try to check if some other issues might have lost/forgot on they
> way as
> > > well.
> >
> > Thanks, much appreciated, looking forward to your feedback.
>
> Sorry, it took me a while to come back to you on this topic. It has been
> hard to
> untangle subjects, reproductions and patch…
>
> There's three distinct issues/thread:
>
> * Constraint & trigger catalog cleanup [1] (this thread)
> * FK broken after DETACH [2]
> * Maintenance consideration about self referencing FK between partitions
> [3]
>

The third issue has been fixed, and the code has been pushed.  Because of my
misunderstanding, it should not be listed here.


> 0. Splitting in two commits
>
>   Your patch addresses two bugs:
>
> * one for the constraint & trigger catalog cleanup;
> * one for the FK broken after DETACH.
>
>   These issues are unrelated, therefore I am wondering if it would be
> better
>   to split their resolution in two different patches.
>
>   Last year, I reported them in two different threads [1][2]. The first
> with
>   implementation consideration, the second with a demo/proposal/draft fix.
>
>   Unfortunately, this discussion about the first bug slipped to the second
> one
>   when Tender stumbled on this bug as well and reported it. But, both bugs
> can
>   be triggered independently, and have distinct fixes.
>

It's OK that these two issues are fixed together, because the current code
doesn't handle this case well
when the referenced side is a partitioned table.


>   Finally, splitting the patch might help setting finer patch
> co-authoring. I
>   know my patch for [2] was a draft and somewhat trivial, but I spend a
> fair
>   amount of time to report, then produce a draft patch, so I was wondering
> if
>   it would be candidate to a co-author flag on this (small, humble and
>   refactored by you) patch?
>
>   I'm definitely not involved (yet) in the second part though.
>
> 1. Constraint & trigger catalog cleanup [1]
>
>   I have been focusing on the current master branch and haven't taken into
>   consideration backpatching related issues yet.
>
>   When I first studied this bug and reported it, I held on writing a patch
>   because it seemed it would duplicate some existing code. I wrote:
>
>   > I poked around DetachPartitionFinalize() to try to find a way to fix
> this,
>   > but it looks like it would duplicate a bunch of code from other code
> path
>   > (eg. from CloneFkReferenced).
>
>   My proposal was to clean everything related to the old FK and use some
>   existing code path to create a fresh and cleaner one. This requires some
>   refactoring in existing code, but we would win a common path of code
> between
>   create/attach/detach, a cleaner catalog and easier code maintenance.
>
>   I've finally been able to write a PoC that implement this by calling
>   addFkRecurseReferenced() from DetachPartitionFinalize(). I can't join
>   it here because it is currently an ugly draft and I still have some work
>   to do. But I would really like to have a little more time (one or two
> days) to
>   explore this avenue further before you commit yours, if you don't mind?
> Or
>   maybe you already have considered this avenue and rejected it?
>
>
> 2. FK broken after DETACH [2]
>
>   Comparing your patch to my draft from [2], I just have a question about
> the
>   refactoring.
>
>   Fencing the constraint/trigger removal inside a conditional
>   RELKIND_PARTITIONED_TABLE block of code was obvious. It avoids some
> useless
>   catalog scan compared to my draft patch.
>
>   Also, the "contype == CONSTRAINT_FOREIGN" I had sounds safe to remove.
>
>   However, is it clean/light enough to add the "conparentid == fk->conoid"
> in
>   the scan key as I did? I'm not sure it saves anything else but the small
>   conditional block you inserted inside the loop, but I wonder if there's a
>   serious concern about this anyway?
>
>   Last, considering the tests, I think we should add some rows in the
> tables,
>   to make sure the FK is correctly enforced after DETACH. Something like:
>
> CREATE SCHEMA fkpart12
>   CREATE TABLE fk_p ( id bigint PRIMARY KEY ) PARTITION BY list (id)
>   CREATE TABLE fk_p_1 PARTITION OF fk_p FOR VALUES IN (1)
>   CREATE TABLE fk_p_2 PARTITION OF fk_p FOR VALUES IN (2)
>   CREATE TABLE fk_r_1 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
>   CREATE TABLE fk_r_2 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
>   CREATE TABLE fk_r   ( id bigint PRIMARY KEY, p_id bigint NOT NULL,
>  FOREIGN KEY (p_id) REFERENCES fk_p (id)
>   ) PARTITION BY list (id);
> SET search_path TO fkpart12;
>
> INSERT INTO fk_p VALUES (1);
>
> ALTER TABLE

Remove no-op PlaceHolderVars

2024-09-02 Thread Richard Guo
In [1] there was a short discussion about removing no-op
PlaceHolderVars.  This thread is for continuing that discussion.

We may insert PlaceHolderVars when pulling up a subquery that is
within the nullable side of an outer join: if subquery pullup needs to
replace a subquery-referencing Var that has nonempty varnullingrels
with an expression that is not simply a Var, we may need to wrap the
replacement expression into a PlaceHolderVar.  However, if the outer
join above is later reduced to an inner join, the PHVs would become
no-ops with no phnullingrels bits.

I think it's always desirable to remove these no-op PlaceHolderVars
because they can constrain optimization opportunities.  The issue
reported in [1] shows that they can block subexpression folding, which
can prevent an expression from being matched to an index.  I provided
another example in that thread which shows that no-op PlaceHolderVars
can force join orders, potentially forcing us to resort to nestloop
join in some cases.

As explained in the comment of remove_nulling_relids_mutator, PHVs are
used in some cases to enforce separate identity of subexpressions.  In
other cases, however, it should be safe to remove a PHV if its
phnullingrels becomes empty.
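
In code terms, the branch under discussion looks roughly like the sketch
below. It is simplified, field names are approximate, and "phkeep" is a
hypothetical marker for PHVs that must be kept to isolate their
subexpression:

/* Simplified sketch only; the real mutator has more bookkeeping. */
if (IsA(node, PlaceHolderVar))
{
	PlaceHolderVar *phv = (PlaceHolderVar *)
		expression_tree_mutator(node, remove_nulling_relids_mutator,
								(void *) context);

	phv->phnullingrels = bms_difference(phv->phnullingrels,
										context->removable_relids);

	/*
	 * If no nulling rels remain and the PHV is not needed to keep its
	 * contained expression's identity separate, drop the wrapper and
	 * return the contained expression instead.
	 */
	if (bms_is_empty(phv->phnullingrels) && !phv->phkeep)
		return (Node *) phv->phexpr;

	return (Node *) phv;
}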

Attached is a WIP patch that marks PHVs that need to be kept because
they are serving to isolate subexpressions, and removes all other PHVs
in remove_nulling_relids_mutator if their phnullingrels bits become
empty.  One problem with it is that a PHV's contained expression may
not have been fully preprocessed.  For example if the PHV is used as a
qual clause, its contained expression is not processed for AND/OR
flattening, which could trigger the Assert in make_restrictinfo that
the clause shouldn't be an AND clause.

For example:

create table t (a boolean, b boolean);

select * from t t1 left join
lateral (select (t1.a and t1.b) as x, * from t t2) s on true
where s.x;

The other problem with this is that it breaks 17 test cases.  We need
to modify them one by one to ensure that they still test what they are
supposed to, which is not a trivial task.

Before delving into these two problems, I'd like to know whether this
optimization is worthwhile, and whether I'm going in the right
direction.

[1] https://postgr.es/m/cambws4-9dyrf44pkpkfrpolxc_yh15dl8xt8j-ohggp_wvs...@mail.gmail.com

Thanks
Richard


v1-0001-Remove-no-op-PlaceHolderVars.patch
Description: Binary data


Re: Remove no-op PlaceHolderVars

2024-09-02 Thread Tom Lane
Richard Guo  writes:
> Attached is a WIP patch that marks PHVs that need to be kept because
> they are serving to isolate subexpressions, and removes all other PHVs
> in remove_nulling_relids_mutator if their phnullingrels bits become
> empty.  One problem with it is that a PHV's contained expression may
> not have been fully preprocessed.

Yeah.  I've been mulling over how we could do this, and the real
problem is that the expression containing the PHV *has* been fully
preprocessed by the time we get to outer join strength reduction
(cf. file header comment in prepjointree.c).  Simply dropping the PHV
can break various invariants that expression preprocessing is supposed
to establish, such as "no RelabelType directly above another" or "no
AND clause directly above another".  I haven't thought of a reliable
way to fix that short of re-running eval_const_expressions afterwards,
which seems like a mighty expensive answer.  We could try to make
remove_nulling_relids_mutator preserve all these invariants, but
keeping it in sync with what eval_const_expressions does seems like
a maintenance nightmare.

> The other problem with this is that it breaks 17 test cases.

I've not looked into that, but yeah, it would need some tedious
analysis to verify whether the changes are OK.

> Before delving into these two problems, I'd like to know whether this
> optimization is worthwhile, and whether I'm going in the right
> direction.

I believe this is an area worth working on.  I've been wondering
whether it'd be better to handle the expression-identity problems by
introducing an "expression wrapper" node type that is distinct from
PHV and has the sole purpose of demarcating a subexpression we don't
want to be folded into its parent.  I think that doesn't really move
the football in terms of fixing the problems mentioned above, but
perhaps it's conceptually cleaner than adding a bool field to PHV.

Another thought is that grouping sets provide one of the big reasons
why we need an "expression wrapper" or equivalent functionality.
So maybe we should try to move forward on your other patch to change
the representation of those before we spend too much effort here.

regards, tom lane




Re: Improving the latch handling between logical replication launcher and worker processes.

2024-09-02 Thread vignesh C
On Fri, 5 Jul 2024 at 18:38, Heikki Linnakangas  wrote:
>
> On 05/07/2024 14:07, vignesh C wrote:
> > On Thu, 4 Jul 2024 at 16:52, Heikki Linnakangas  wrote:
> >>
> >> I'm don't quite understand the problem we're trying to fix:
> >>
> >>> Currently the launcher's latch is used for the following: a) worker
> >>> process attach b) worker process exit and c) subscription creation.
> >>> Since this same latch is used for multiple cases, the launcher process
> >>> is not able to handle concurrent scenarios like: a) Launcher started a
> >>> new apply worker and waiting for apply worker to attach and b) create
> >>> subscription sub2 sending launcher wake up signal. In this scenario,
> >>> both of them will set latch of the launcher process, the launcher
> >>> process is not able to identify that both operations have occurred 1)
> >>> worker is attached 2) subscription is created and apply worker should
> >>> be started. As a result the apply worker does not get started for the
> >>> new subscription created immediately and gets started after the
> >>> timeout of 180 seconds.
> >>
> >> I don't see how we could miss a notification. Yes, both events will set
> >> the same latch. Why is that a problem?
> >
> > The launcher process waits for the apply worker to attach at
> > WaitForReplicationWorkerAttach function. The launcher's latch is
> > getting set concurrently for another create subscription and apply
> > worker attached. The launcher now detects the latch is set while
> > waiting at WaitForReplicationWorkerAttach, it resets the latch and
> > proceed to the main loop and waits for DEFAULT_NAPTIME_PER_CYCLE(as
> > the latch has already been reset). Further details are provided below.
> >
> > The loop will see that the new
> >> worker has attached, and that the new subscription has been created, and
> >> process both events. Right?
> >
> > Since the latch is reset at WaitForReplicationWorkerAttach, it skips
> > processing the create subscription event.
> >
> > Slightly detailing further:
> > In the scenario when we execute two concurrent create subscription
> > commands, first CREATE SUBSCRIPTION sub1, followed immediately by
> > CREATE SUBSCRIPTION sub2.
> > In a few random scenarios, the following events may unfold:
> > After the first create subscription command(sub1), the backend will
> > set the launcher's latch because of ApplyLauncherWakeupAtCommit.
> > Subsequently, the launcher process will reset the latch and identify
> > the addition of a new subscription in the pg_subscription list. The
> > launcher process will proceed to request the postmaster to start the
> > apply worker background process (sub1) and request the postmaster to
> > notify the launcher once the apply worker(sub1) has been started.
> > Launcher will then wait for the apply worker(sub1) to attach in the
> > WaitForReplicationWorkerAttach function.
> > Meanwhile, the second CREATE SUBSCRIPTION command (sub2) which was
> > executed concurrently, also set the launcher's latch(because of
> > ApplyLauncherWakeupAtCommit).
> > Simultaneously when the launcher remains in the
> > WaitForReplicationWorkerAttach function waiting for apply worker of
> > sub1 to start, the postmaster will start the apply worker for
> > subscription sub1 and send a SIGUSR1 signal to the launcher process
> > via ReportBackgroundWorkerPID. Upon receiving this signal, the
> > launcher process will then set its latch.
> >
> > Now, the launcher's latch has been set concurrently after the apply
> > worker for sub1 is started and the execution of the CREATE
> > SUBSCRIPTION sub2 command.
> >
> > At this juncture, the launcher, which had been awaiting the attachment
> > of the apply worker, detects that the latch is set and proceeds to
> > reset it. The reset operation of the latch occurs within the following
> > section of code in WaitForReplicationWorkerAttach:
> > ...
> > rc = WaitLatch(MyLatch,
> > WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
> > 10L, WAIT_EVENT_BGWORKER_STARTUP);
> >
> > if (rc & WL_LATCH_SET)
> > {
> > ResetLatch(MyLatch);
> > CHECK_FOR_INTERRUPTS();
> > }
> > ...
> >
> > After resetting the latch here, the activation signal intended to
> > start the apply worker for subscription sub2 is no longer present. The
> > launcher will return to the ApplyLauncherMain function, where it will
> > await the DEFAULT_NAPTIME_PER_CYCLE, which is 180 seconds, before
> > processing the create subscription request (i.e., creating a new apply
> > worker for sub2).
> > The issue arises from the latch being reset in
> > WaitForReplicationWorkerAttach, which can occasionally delay the
> > synchronization of table data for the subscription.
>
> Ok, I see it now. Thanks for the explanation. So the problem isn't using
> the same latch for different purposes per se. It's that we're trying to
> use it in a nested fashion, resetting it in the inner loop.
>
> Looking at the proposed patch more closely:
>
> > @@ -221,13 +224,13 @@ WaitForReplicationWorkerAttach(Logi

Re: define PG_REPLSLOT_DIR

2024-09-02 Thread Bertrand Drouvot
Hi,

On Tue, Sep 03, 2024 at 09:15:50AM +0900, Michael Paquier wrote:
> On Fri, Aug 30, 2024 at 12:21:29PM +, Bertrand Drouvot wrote:
> > That said, I don't have a strong opinion on this one, I think that also 
> > makes
> > sense to leave it as it is. Please find attached v4 doing so.
> 
> The changes in astreamer_file.c are actually wrong regarding the fact
> that should_allow_existing_directory() needs to be able to work with
> the branch where this code is located as well as back-branches,
> because pg_basebackup from version N supports ~(N-1) versions down to
> a certain version, so changing it is not right.  This is why pg_xlog
> and pg_wal are both listed there.

I understand why pg_xlog and pg_wal both need to be listed here, but I don't
get why the proposed changes were "wrong". Or, are you saying that if for any
reason PG_TBLSPC_DIR needs to be changed that would not work anymore? If
that's the case, then I guess we'd have to add a new define and test like:

 strcmp(filename, PG_TBLSPC_DIR) == 0 || strcmp(filename, NEW_PG_TBLSPC_DIR) == 0

, no? 

The question is more out of curiosity, not saying the changes should be applied
in astreamer_file.c though.

> Perhaps we should to more for the two entries in basebackup.c with the
> relative paths, but I'm not sure that's worth bothering, either.

I don't have a strong opinion on those ones.

> At
> the end, I got no objections about the remaining pieces, so applied.

Thanks!

> How do people feel about the suggestions to update the comments at the
> end?  With the comment in relpath.h suggesting to not change that, the
> current state of HEAD is fine by me.

Yeah, I think that's fine, especially because there are still some places
(outside of the comments) that don't rely on the define(s).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Add callbacks for fixed-numbered stats flush in pgstats

2024-09-02 Thread Michael Paquier
Hi all,

The last TODO item I had in my bucket about the generalization of
pgstats is the option of better controlling the flush of the stats
depending on the kind, for fixed-numbered stats.  Currently, this is
controlled by pgstat_report_stat(), that includes special handling for
WAL, IO and SLRU stats, with two generic concepts:
- Check if there are pending entries, allowing a fast-path exit.
- Do the actual flush, with a recheck on pending entries.

SLRU and IO control that with one variable each, and WAL uses a
routine for the same called pgstat_have_pending_wal().  Please find
attached a patch to generalize the concept, with two new callbacks
that can be used for fixed-numbered stats.  SLRU, IO and WAL are
switched to use these (the two pgstat_flush_* routines have been kept
on purpose).  This brings some clarity in the code, by making
have_iostats and have_slrustats static in their respective files.  The
two pgstat_flush_* wrappers do not need a boolean as return result.
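
To illustrate the idea, here is a minimal sketch (mine, not an excerpt of
the attached patch) of how a generic flush loop could drive the two new
callbacks for every fixed-numbered kind; the kind-iteration bounds and the
helper used to fetch the kind info are assumptions made only for the
example:

static bool
flush_fixed_numbered_stats(bool nowait)
{
	bool		pending = false;

	for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
	{
		const PgStat_KindInfo *info = pgstat_get_kind_info(kind);

		if (info == NULL || !info->fixed_amount)
			continue;

		/* Fast-path exit if this kind has nothing pending. */
		if (info->have_fixed_pending_cb && !info->have_fixed_pending_cb())
			continue;

		/* Do the flush, remembering if some data could not be written out. */
		if (info->flush_fixed_cb && info->flush_fixed_cb(nowait))
			pending = true;
	}

	return pending;
}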

Running Postgres on scissors with a read-only workload that does not
trigger stats, I was not able to see a difference in runtime, but that
was on my own laptop, and I am planning to do more measurements on a
bigger machine.

This is along the same line of thought as the recent thread about the
backend init callback, generalizing the whole facility further:
https://www.postgresql.org/message-id/ztzr1k4pldewc...@paquier.xyz

Like the other one, I wanted to send that a few days ago, but well,
life likes going its own ways sometimes.

Thanks,
--
Michael
From e98f8d50c17cd8fc521bffb4bdc73ef58b7fa430 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 3 Sep 2024 13:36:10 +0900
Subject: [PATCH] Add callbacks to control flush of fixed-numbered stats

This commit adds two callbacks in pgstats:
- have_fixed_pending_cb, to check if a stats kind has any pending stuff
waiting for a flush.
- flush_fixed_cb, to do the flush.

This is used by the SLRU, WAL and IO statistics, generalizing the
concept for all stats kinds (builtin and custom).
---
 src/include/utils/pgstat_internal.h  | 42 +---
 src/backend/utils/activity/pgstat.c  | 63 +++-
 src/backend/utils/activity/pgstat_io.c   | 22 -
 src/backend/utils/activity/pgstat_slru.c | 13 -
 src/backend/utils/activity/pgstat_wal.c  | 19 +--
 5 files changed, 119 insertions(+), 40 deletions(-)

diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index fb132e439d..d69aa05b1c 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -231,7 +231,7 @@ typedef struct PgStat_KindInfo
 
 	/*
 	 * For variable-numbered stats: flush pending stats. Required if pending
-	 * data is used.
+	 * data is used.  See flush_fixed_cb for fixed-numbered stats.
 	 */
 	bool		(*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait);
 
@@ -259,6 +259,19 @@ typedef struct PgStat_KindInfo
 	 */
 	void		(*init_shmem_cb) (void *stats);
 
+	/*
+	 * For fixed-numbered statistics: Flush pending stats. Returns true if
+	 * some of the stats could not be flushed, due to lock contention for
+	 * example. Optional.
+	 */
+	bool		(*flush_fixed_cb) (bool nowait);
+
+	/*
+	 * For fixed-numbered statistics: Check for pending stats in need of
+	 * flush. Returns true if there are any stats pending for flush. Optional.
+	 */
+	bool		(*have_fixed_pending_cb) (void);
+
 	/*
 	 * For fixed-numbered statistics: Reset All.
 	 */
@@ -603,7 +616,10 @@ extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
  * Functions in pgstat_io.c
  */
 
-extern bool pgstat_flush_io(bool nowait);
+extern void pgstat_flush_io(bool nowait);
+
+extern bool pgstat_io_have_pending_cb(void);
+extern bool pgstat_io_flush_cb(bool nowait);
 extern void pgstat_io_init_shmem_cb(void *stats);
 extern void pgstat_io_reset_all_cb(TimestampTz ts);
 extern void pgstat_io_snapshot_cb(void);
@@ -662,7 +678,8 @@ extern PgStatShared_Common *pgstat_init_entry(PgStat_Kind kind,
  * Functions in pgstat_slru.c
  */
 
-extern bool pgstat_slru_flush(bool nowait);
+extern bool pgstat_slru_have_pending_cb(void);
+extern bool pgstat_slru_flush_cb(bool nowait);
 extern void pgstat_slru_init_shmem_cb(void *stats);
 extern void pgstat_slru_reset_all_cb(TimestampTz ts);
 extern void pgstat_slru_snapshot_cb(void);
@@ -672,10 +689,11 @@ extern void pgstat_slru_snapshot_cb(void);
  * Functions in pgstat_wal.c
  */
 
-extern bool pgstat_flush_wal(bool nowait);
 extern void pgstat_init_wal(void);
-extern bool pgstat_have_pending_wal(void);
+extern void pgstat_flush_wal(bool nowait);
 
+extern bool pgstat_wal_have_pending_cb(void);
+extern bool pgstat_wal_flush_cb(bool nowait);
 extern void pgstat_wal_init_shmem_cb(void *stats);
 extern void pgstat_wal_reset_all_cb(TimestampTz ts);
 extern void pgstat_wal_snapshot_cb(void);
@@ -705,20 +723,6 @@ extern void pgstat_create_transactional(PgStat_Kind kind, Oid dboid, Oid objoid)
 extern PGDLLIMPORT PgStat_Loc

Re: Virtual generated columns

2024-09-02 Thread jian he
On Thu, Aug 29, 2024 at 9:35 PM jian he  wrote:
>
> On Thu, Aug 29, 2024 at 8:15 PM Peter Eisentraut  wrote:
> >


> >
> > The new patch does some rebasing and contains various fixes to the
> > issues you presented.  As I mentioned, I'll look into improving the
> > rewriting.
>
>
> based on your latest patch (v4-0001-Virtual-generated-columns.patch),
> I did some minor cosmetic code change
> and tried to address get_attgenerated overhead.
>
> basically in expand_generated_columns_in_query
> and expand_generated_columns_in_expr preliminary collect (reloid,attnum)
> that have generated_virtual flag into expand_generated_context.
> later in expand_generated_columns_mutator use the collected information.
>
> deal with wholerow within the expand_generated_columns_mutator seems
> tricky, will try later.


Please just ignore v4-0001-Virtual-generated-columns_minorchange.no-cfbot,
in which I made some mistakes (though the tests still passed).

Please check the attachment to this mail instead:
v5-0001-Virtual-generated-wholerow-var-and-virtual-che.no-cfbot

It covers:
1. Minor cosmetic changes.
2. Whole-row Var references to virtual generated columns, with tests added.
3. Reducing the get_attgenerated overhead: instead of calling
get_attgenerated for each Var, walk the query tree once, collect the
relation OID and attribute number of every virtual generated column,
and use that information later (see the sketch below).
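
As an illustration only, a minimal sketch of what such a pre-collected
context could look like (struct and function names below are invented for
the example and are not taken from the patch; the backend's List API is
assumed):

typedef struct VirtualColumnKey
{
	Oid			reloid;		/* relation containing the column */
	AttrNumber	attnum;		/* attribute number of the virtual column */
} VirtualColumnKey;

typedef struct expand_generated_context
{
	List	   *virtual_cols;	/* collected VirtualColumnKey entries */
} expand_generated_context;

/* Remember one virtual generated column found while walking the query. */
static void
record_virtual_column(expand_generated_context *context,
					  Oid reloid, AttrNumber attnum)
{
	VirtualColumnKey *key = palloc(sizeof(VirtualColumnKey));

	key->reloid = reloid;
	key->attnum = attnum;
	context->virtual_cols = lappend(context->virtual_cols, key);
}

/* The mutator can then consult the list instead of hitting the catalogs. */
static bool
is_virtual_column(expand_generated_context *context,
				  Oid reloid, AttrNumber attnum)
{
	ListCell   *lc;

	foreach(lc, context->virtual_cols)
	{
		VirtualColumnKey *key = lfirst(lc);

		if (key->reloid == reloid && key->attnum == attnum)
			return true;
	}
	return false;
}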


I will check the view insert case later.


v5-0001-Virtual-generated-wholerow-var-and-virtual-che.no-cfbot
Description: Binary data


Re: v17 vs v16 performance comparison

2024-09-02 Thread Alexander Lakhin

Hello Thomas,

02.08.2024 12:00, Alexander Lakhin wrote:







I had paid attention to:
Best pg-src-17--.* worse than pg-src-16--.* by 57.9 percents (225.11 > 142.52): 
pg_tpcds.query15
Average pg-src-17--.* worse than pg-src-16--.* by 55.5 percents (230.57 > 
148.29): pg_tpcds.query15
in May, performed `git bisect` for this degradation, that led me to commit
b7b0f3f27 [1].

This time I bisected the following anomaly:
Best pg-src-17--.* worse than pg-src-16--.* by 23.6 percents (192.25 > 155.58): 
pg_tpcds.query21
Average pg-src-17--.* worse than pg-src-16--.* by 25.1 percents (196.19 > 
156.85): pg_tpcds.query21
and to my surprise I got "b7b0f3f27 is the first bad commit".

Moreover, bisecting of another anomaly:
Best pg-src-17--.* worse than pg-src-16--.* by 24.2 percents (24269.21 > 
19539.89): pg_tpcds.query72
Average pg-src-17--.* worse than pg-src-16--.* by 24.2 percents (24517.66 > 
19740.12): pg_tpcds.query72
pointed at the same commit again.

...

But beside that, I've found a separate regression. Bisecting for this 
degradation:
Best pg-src-17--.* worse than pg-src-16--.* by 105.0 percents (356.63 > 
173.96): s64da_tpcds.query95
Average pg-src-17--.* worse than pg-src-16--.* by 105.2 percents (357.79 > 
174.38): s64da_tpcds.query95
pointed at f7816aec2.




Meanwhile I've bisected another degradation:
Best pg-src-17--.* worse than pg-src-16--.* by 11.3 percents (7.17 > 6.44): 
job.query6f
and came to the commit b7b0f3f27 again.


Now that the unfairness in all-cached parallel seq scan has been eliminated
(with 3ed3683618), I've re-run the same performance tests and got new results
(please see attached). As we can see, the aforementioned pg_tpcds.query72
got better:
  2024-05-15  2024-07-30 2024-09-03
pg-src-16--1   20492.58    19669.34    19913.32
pg-src-17--1   25286.10    24269.21    20654.95
pg-src-16--2   20769.88    19539.89    20429.72
pg-src-17--2   25771.90    24530.51    21244.92
pg-src-17--3   25978.55    24753.25    20904.09
pg-src-16--3   20943.10    20011.13    20086.61

We can also see the improvement of pg_tpcds.query16, but not on all runs:
  2024-05-15  2024-07-30 2024-09-03
pg-src-16--1 105.36   94.31   97.74
pg-src-17--1 145.74  145.53  145.51
pg-src-16--2 101.82   98.36   96.63
pg-src-17--2 140.07  146.90   96.93
pg-src-17--3 154.89  148.11  106.18
pg-src-16--3 101.03  100.94   93.44

So it looks like now we see the same instability, that we observed before
([1]).

Unfortunately, the troublesome tpcds.query15 hasn't produced good numbers
this time either:
  2024-05-15  2024-07-30 2024-09-03
pg-src-16--1 153.41  142.52  142.54
pg-src-17--1 229.84  225.11  212.51
pg-src-16--2 153.47  150.13  149.37
pg-src-17--2 236.34  227.15  232.73
pg-src-17--3 236.43  239.46  233.77
pg-src-16--3 151.03  152.23  144.90

From a bird's-eye view, the new v17-vs-v16 comparison has only 87 "worse"
entries, while the previous one had 115 (it requires deeper analysis, of
course, but still...).

[1] 
https://www.postgresql.org/message-id/d1fb5c09-dd03-2540-9ec2-86dbfdfa2c65%40gmail.com

Best regards,
Alexander

[Attachment: "Postgres benchmarking results" (HTML report). Recoverable
summary: benchmarking run from 2024-09-02T09:38:29 to 2024-09-03T03:22:21,
comparing three PostgreSQL 16.4 (REL_16_STABLE/2015dd5c90) instances with
three PostgreSQL 17beta3 (REL_17_STABLE/3ed3683618) instances across the
pgbench_native, pgbench_reference, pg_tpch and further suites; the full
result tables are not reproduced here.]

Re: Add callback in pgstats for backend initialization

2024-09-02 Thread Bertrand Drouvot
Hi,

On Tue, Sep 03, 2024 at 10:52:20AM +0900, Michael Paquier wrote:
> Hi all,
> 
> Currently, the backend-level initialization of pgstats happens in
> pgstat_initialize(), where we are using a shortcut for the WAL stats,
> with pgstat_init_wal().
> 
> I'd like to propose adding a callback for that, so that in-core or
> custom pgstats kinds can assign custom actions when initializing a
> backend.  The use case I had for this one is pretty close to what we
> do for WAL, where we would rely on a difference of state depending on
> what may have existed before reaching the initialization path.  So
> this can offer more precision.  Another case, perhaps less
> interesting, is to be able to initialize some static backend state.

I think the proposal makes sense and I can see the use cases too, so +1.

> This gives the simple patch
> attached.

Should we add a few words about this new callback (and the others in
PgStat_KindInfo while at it) in the "Custom Cumulative Statistics" section of
xfunc.sgml?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: DOCS - pg_replication_slot . Fix the 'inactive_since' description

2024-09-02 Thread Amit Kapila
On Mon, Sep 2, 2024 at 9:14 AM shveta malik  wrote:
>
> On Mon, Sep 2, 2024 at 5:47 AM Peter Smith  wrote:
> > 
> >
> > To summarize, the current description wrongly describes the field as a
> > time duration:
> > "The time since the slot has become inactive."
> >
> > I suggest replacing it with:
> > "The slot has been inactive since this time."
> >
>
> +1 for the change. If I had read the document without knowing about
> the patch, I too would have interpreted it as a duration.
>

The suggested change looks good to me as well. I'll wait for a day or
two before pushing to see if anyone thinks otherwise.

-- 
With Regards,
Amit Kapila.




Re: Typos in the code and README

2024-09-02 Thread Michael Paquier
On Mon, Sep 02, 2024 at 09:00:00PM +0300, Alexander Lakhin wrote:
> I've gathered another bunch of defects with the possible substitutions.
> Please take a look:
> pgstat_add_kind -> pgstat_register_kind (see 7949d9594)

And here I thought I took care of these inconsistencies.  This one is
on me so I'll go fix that in a bit, with the rest while on it.
--
Michael


signature.asc
Description: PGP signature


Re: Use XLOG_CONTROL_FILE macro everywhere?

2024-09-02 Thread Kyotaro Horiguchi
At Mon, 2 Sep 2024 15:44:45 +0200, Daniel Gustafsson  wrote in 
> Summarizing the thread it seems consensus is using XLOG_CONTROL_FILE
> consistently as per the original patch.
> 
> A few comments on the patch though:
> 
> - * reads the data from $PGDATA/global/pg_control
> + * reads the data from $PGDATA/
> 
> I don't think this is an improvement, I'd leave that one as the filename
> spelled out.
> 
> - "the \".old\" suffix from %s/global/pg_control.old.\n"
> + "the \".old\" suffix from %s/%s.old.\n"
> 
> Same with that change, not sure I think that makes reading the error message
> code any easier.

I agree with the first point. In fact, I think it might be even better
to just write something like "reads the data from the control file" in
plain language rather than using the actual file name. As for the
second point, I'm fine either way, but if the main goal is to ensure
resilience against future changes to the value of XLOG_CONTROL_FILE,
then changing it makes sense. On the other hand, I don't see any
strong reasons not to change it. That said, I don't really expect the
value to change in the first place.
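
To make the contrast concrete, here is a standalone snippet of the two
styles under discussion (this is only an illustration; the macro value is
duplicated so the snippet compiles on its own, the real definition lives in
xlog_internal.h):

#include <stdio.h>

#define XLOG_CONTROL_FILE	"global/pg_control"	/* mirrors xlog_internal.h */

int
main(void)
{
	const char *datadir = "/pgdata";

	/* spelled out: easy to read, but tied to the literal file name */
	printf("remove the \".old\" suffix from %s/global/pg_control.old\n",
		   datadir);

	/* macro-based: resilient if the control file name ever changes */
	printf("remove the \".old\" suffix from %s/%s.old\n",
		   datadir, XLOG_CONTROL_FILE);

	return 0;
}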

The following point caught my attention.

> +++ b/src/backend/postmaster/postmaster.c
...
> +#include "access/xlog_internal.h"

The name xlog_internal suggests that the file should only be included
by modules dealing with XLOG details, and postmaster.c doesn't seem to
fit that category. If the macro is used more broadly, it might be
better to move it to a more public location. However, following the
current discussion, if we decide to keep the macro's name as it is, it
would make more sense to keep it in its current location.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Typos in the code and README

2024-09-02 Thread Michael Paquier
On Tue, Sep 03, 2024 at 02:24:32PM +0900, Michael Paquier wrote:
> On Mon, Sep 02, 2024 at 09:00:00PM +0300, Alexander Lakhin wrote:
> > I've gathered another bunch of defects with the possible substitutions.
> > Please take a look:
> > pgstat_add_kind -> pgstat_register_kind (see 7949d9594)
> 
> And here I thought I took care of these inconsistencies.  This one is
> on me so I'll go fix that in a bit, with the rest while on it.

And done that.

The bit about CommitTSSLRU -> CommitTsSLRU in lwlock.c should be
backpatched down to 17, indeed, but the branch is frozen until the RC
tag lands in the tree, so I have left it out for now. The tag should
show up tomorrow or so.  Good thing that you have noticed this issue
before the release.
--
Michael


signature.asc
Description: PGP signature


Re: Add callback in pgstats for backend initialization

2024-09-02 Thread Michael Paquier
On Tue, Sep 03, 2024 at 05:00:54AM +, Bertrand Drouvot wrote:
> Should we add a few words about this new callback (and the others in
> PgStat_KindInfo while at it) in the "Custom Cumulative Statistics" section of
> xfunc.sgml?

Not sure if it is worth having.  This adds extra maintenance and
there's already a mention of src/include/utils/pgstat_internal.h
telling where the callbacks are described.
--
Michael


signature.asc
Description: PGP signature


Re: DOCS - pg_replication_slot . Fix the 'inactive_since' description

2024-09-02 Thread shveta malik
On Tue, Sep 3, 2024 at 10:43 AM Amit Kapila  wrote:
>
> > >
> > > To summarize, the current description wrongly describes the field as a
> > > time duration:
> > > "The time since the slot has become inactive."
> > >
> > > I suggest replacing it with:
> > > "The slot has been inactive since this time."
> > >
> >
> > +1 for the change. If I had read the document without knowing about
> > the patch, I too would have interpreted it as a duration.
> >
>
> The suggested change looks good to me as well. I'll wait for a day or
> two before pushing to see if anyone thinks otherwise.

Shall we make the change in code-comment as well:

typedef struct ReplicationSlot
{
...
/* The time since the slot has become inactive */
TimestampTz inactive_since;
}

thanks
Shveta




Re: Add const qualifiers to XLogRegister*() functions

2024-09-02 Thread Peter Eisentraut

On 28.08.24 12:04, Aleksander Alekseev wrote:

Hi,


On 04.10.23 16:37, Peter Eisentraut wrote:

On 03.10.23 13:28, Aleksander Alekseev wrote:

While examining the code for similar places I noticed that the
following functions can also be const'ified:



- XLogRegisterData (?)


I don't think this would work, at least without further work elsewhere,
because the data is stored in XLogRecData, which has no const handling.


I got around to fixing this.  Here is a patch.  It allows removing a few
unconstify() calls, which is nice.


LGTM.


committed


Note that this may affect third-party code. IMO this is not a big deal
in this particular case.


I don't think this will impact any third-party code.  Only maybe for the 
better, by being able to remove some casts.



Also by randomly checking one of the affected non-static functions I
found a bunch of calls like this:

XLogRegisterData((char *) msgs, ...)

... where the first argument is going to become (const char *). It
looks like the compilers are OK with implicitly casting (char*) to a
more restrictive (const char*) though.


Yes, that's ok.
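
For readers following along, a tiny standalone example (mine, not part of
the patch) showing why the implicit conversion is fine, while the reverse
direction would need a cast:

#include <stddef.h>

/* Stand-in for the new, const-qualified signature. */
static void
register_data(const char *data, size_t len)
{
	(void) data;
	(void) len;
}

int
main(void)
{
	char		buf[] = "abc";

	/* Fine: a char * argument converts implicitly to const char *. */
	register_data(buf, sizeof(buf));

	return 0;
}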





Re: v17 vs v16 performance comparison

2024-09-02 Thread Thomas Munro
On Tue, Sep 3, 2024 at 5:00 PM Alexander Lakhin  wrote:
>  From a bird's eye view, new v17-vs-v16 comparison has only 87 "worse",
> while the previous one had 115 (it requires deeper analysis, of course, but
> still...).

Any chance you could share that whole pgdata dir with me, assuming it
compresses to a manageable size?  Perhaps we could discuss that
off-list?




Re: Expand applicability of aggregate's sortop optimization

2024-09-02 Thread Andrei Lepikhov

On 18/7/2024 14:49, Matthias van de Meent wrote:

Aside: Arguably, checking for commutator operators would not be
incorrect when looking at it from "applied operators" point of view,
but if that commutative operator isn't registered as opposite ordering
of the same btree opclass, then we'd probably break some assumptions
of some aggregate's sortop - it could be registered with another
opclass, and that could cause us to select a different btree opclass
(thus: ordering) than is indicated to be required by the aggregate;
the thing we're trying to protect against here.

Hi,
This thread has gone idle. At the same time, the general idea of this
patch and the idea of utilising prosupport functions look promising. Are
you going to develop this feature further?


--
regards, Andrei Lepikhov





Re: define PG_REPLSLOT_DIR

2024-09-02 Thread Michael Paquier
On Tue, Sep 03, 2024 at 04:35:28AM +, Bertrand Drouvot wrote:
> I understand why pg_xlog and pg_wal both need to be listed here, but I don't
> get why the proposed changes were "wrong". Or, are you saying that if for any
> reason PG_TBLSPC_DIR needs to be changed that would not work
> anymore?

Yes.  Folks are not going to do that, but changing PG_TBLSPC_DIR would
change the decision-making when streaming from versions older than
v17 with client tools from v18~.
--
Michael


signature.asc
Description: PGP signature


Re: DOCS - pg_replication_slot . Fix the 'inactive_since' description

2024-09-02 Thread Bertrand Drouvot
Hi,

On Tue, Sep 03, 2024 at 10:43:14AM +0530, Amit Kapila wrote:
> On Mon, Sep 2, 2024 at 9:14 AM shveta malik  wrote:
> >
> > On Mon, Sep 2, 2024 at 5:47 AM Peter Smith  wrote:
> > > 
> > >
> > > To summarize, the current description wrongly describes the field as a
> > > time duration:
> > > "The time since the slot has become inactive."
> > >
> > > I suggest replacing it with:
> > > "The slot has been inactive since this time."
> > >
> >
> > +1 for the change. If I had read the document without knowing about
> > the patch, I too would have interpreted it as a duration.
> >
> 
> The suggested change looks good to me as well. I'll wait for a day or
> two before pushing to see if anyone thinks otherwise.

I'm not 100% convinced the current wording is confusing because:

- the field type is described as a "timestamptz".
- there is no duration unit in the wording (if we were to describe a duration,
we would probably add a unit to it, like "The time (in s)...").

That said, if we want to highlight that this is not a duration, what about?

"The time (not duration) since the slot has become inactive."

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: GUC names in messages

2024-09-02 Thread Michael Paquier
On Tue, Sep 03, 2024 at 12:00:19PM +1000, Peter Smith wrote:
> Here is the rebased patch set v10*. Everything is the same as before
> except now there are only 7 patches instead of 8. The previous v9-0001
> ("bool") patch no longer exists because those changes are now already
> present in HEAD.
> 
> I hope these might be pushed soon to avoid further rebasing.

0001~0004 could just be merged, they're the same thing, for different
GUC types.  The consensus mentioned in 17974ec25946 makes that clear.

0007 is a good thing for translators, indeed..  I'll see about doing
something here, at least.
--
Michael


signature.asc
Description: PGP signature


Re: per backend I/O statistics

2024-09-02 Thread Kyotaro Horiguchi
At Mon, 2 Sep 2024 14:55:52 +, Bertrand Drouvot 
 wrote in 
> Hi hackers,
> 
> Please find attached a patch to implement $SUBJECT.
> 
> While pg_stat_io provides cluster-wide I/O statistics, this patch adds a new
> pg_my_stat_io view to display "my" backend I/O statistics and a new
> pg_stat_get_backend_io() function to retrieve the I/O statistics for a given
> backend pid.
> 
> By having the per backend level of granularity, one could for example identify
> which running backend is responsible for most of the reads, most of the 
> extends
> and so on... The pg_my_stat_io view could also be useful to check the
> impact on the I/O made by some operations, queries,... in the current session.
> 
> Some remarks:
> 
> - it is split in 2 sub patches: 0001 introducing the necessary changes to 
> provide
> the pg_my_stat_io view and 0002 to add the pg_stat_get_backend_io() function.
> - the idea of having per backend I/O statistics has already been mentioned in
> [1] by Andres.
> 
> Some implementation choices:
> 
> - The KIND_IO stats are still "fixed amount" ones as the maximum number of
> backend is fixed.
> - The statistics snapshot is made for the global stats (the aggregated ones) 
> and
> for my backend stats. The snapshot is not build for all the backend stats 
> (that
> could be memory expensive depending on the number of max connections and given
> the fact that PgStat_IO is 16KB long).
> - The above point means that pg_stat_get_backend_io() behaves as if
> stats_fetch_consistency is set to none (each execution re-fetches counters
> from shared memory).
> - The above 2 points are also the reasons why the pg_my_stat_io view has been
> added (as its results takes care of the stats_fetch_consistency setting). I 
> think
> that makes sense to rely on it in that case, while I'm not sure that would 
> make
> a lot of sense to retrieve other's backend I/O stats and taking care of
> stats_fetch_consistency.
> 
> 
> [1]: 
> https://www.postgresql.org/message-id/20230309003438.rectf7xo7pw5t5cj%40awork3.anarazel.de

I'm not sure about the usefulness of having the stats only available
from the current session. Since they are stored in shared memory,
shouldn't we make them accessible to all backends? However, this would
introduce permission considerations and could become complex.

When I first looked at this patch, my initial thought was whether we
should let these stats stay "fixed." The reason why the current
PGSTAT_KIND_IO is fixed is that there is only one global statistics
storage for the entire database. If we have stats for a flexible
number of backends, it would need to be non-fixed, perhaps with the
entry for INVALID_PROC_NUMBER storing the global I/O stats, I
suppose. However, one concern with that approach would be the impact
on performance due to the frequent creation and deletion of stats
entries caused by high turnover of backends.

Just to be clear, the above comments are not meant to oppose the
current implementation approach. They are purely for the sake of
discussing comparisons with other possible approaches.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-09-02 Thread Peter Smith
Hi, my previous review posts did not cover the test code.

Here are my review comments for the v44-0001 test code

==
TEST CASE #1

1.
+# Wait for the inactive replication slot to be invalidated.
+$standby1->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = 'lsub1_sync_slot' AND
+ invalidation_reason = 'inactive_timeout';
+])
+  or die
+  "Timed out while waiting for lsub1_sync_slot invalidation to be
synced on standby";
+

Is that comment correct? IIUC the synced slot should *already* be
invalidated from the primary, so here we are not really "waiting" for
it to be invalidated; instead, we are just "confirming" that the
synchronized slot is already invalidated with the correct reason as
expected.

~~~

2.
+# Synced slot mustn't get invalidated on the standby even after a checkpoint,
+# it must sync invalidation from the primary. So, we must not see the slot's
+# invalidation message in server log.
+$standby1->safe_psql('postgres', "CHECKPOINT");
+ok( !$standby1->log_contains(
+ "invalidating obsolete replication slot \"lsub1_sync_slot\"",
+ $standby1_logstart),
+ 'check that syned lsub1_sync_slot has not been invalidated on the standby'
+);
+

This test case seemed bogus, for a couple of reasons:

2a. IIUC this 'lsub1_sync_slot' is the same one that is already
invalid (from the primary), so nobody should be surprised that an
already invalid slot doesn't get flagged as invalid again. i.e.
Shouldn't your test scenario here be done using a valid synced slot?

2b. AFAICT it was only moments above this CHECKPOINT where you
assigned the standby inactivity timeout to 2s. So even if there was
some bug invalidating synced slots I don't think you gave it enough
time to happen -- e.g. I doubt 2s has elapsed yet.

~

3.
+# Stop standby to make the standby's replication slot on the primary inactive
+$standby1->stop;
+
+# Wait for the standby's replication slot to become inactive
+wait_for_slot_invalidation($primary, 'sb1_slot', $logstart,
+ $inactive_timeout);

This seems a bit tricky. Both these (the stop and the wait) seem to
belong together, so I think maybe a single bigger explanatory comment
covering both parts would help for understanding.

==
TEST CASE #2

4.
+# Stop subscriber to make the replication slot on publisher inactive
+$subscriber->stop;
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# timeout.
+wait_for_slot_invalidation($publisher, 'lsub1_slot', $logstart,
+ $inactive_timeout);

IIUC, this is just like comment #3 above. Both these (the stop and the
wait) seem to belong together, so I think maybe a single bigger
explanatory comment covering both parts would help for understanding.

~~~

5.
+# Testcase end: Invalidate logical subscriber's slot due to
+# replication_slot_inactive_timeout.
+# =


IMO the rest of the comment after "Testcase end" isn't very useful.

==
sub wait_for_slot_invalidation

6.
+sub wait_for_slot_invalidation
+{

An explanatory header comment for this subroutine would be helpful.

~~~

7.
+ # Wait for the replication slot to become inactive
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND active = 'f';
+ ])
+   or die
+   "Timed out while waiting for slot $slot_name to become inactive on
node $name";
+
+ # Wait for the replication slot info to be updated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE inactive_since IS NOT NULL
+ AND slot_name = '$slot_name' AND active = 'f';
+ ])
+   or die
+   "Timed out while waiting for info of slot $slot_name to be updated
on node $name";
+

Why are there 2 separate poll_query_until's here? Can't those be
combined into just one?

~~~

8.
+ # Sleep at least $inactive_timeout duration to avoid multiple checkpoints
+ # for the slot to get invalidated.
+ sleep($inactive_timeout);
+

Maybe this special sleep to prevent too many CHECKPOINTs should be
moved inside the other subroutine, which is actually doing those
CHECKPOINTs.

~~~

9.
+ # Wait for the inactive replication slot to be invalidated
+ $node->poll_query_until(
+ 'postgres', qq[
+ SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+ WHERE slot_name = '$slot_name' AND
+ invalidation_reason = 'inactive_timeout';
+ ])
+   or die
+   "Timed out while waiting for inactive slot $slot_name to be
invalidated on node $name";
+

The comment seems misleading. IIUC you are not "waiting" for the
invalidation here, because it is the other subroutine doing the
waiting for the invalidation message in the logs. Instead, here I
think you are just confirming the 'invalidation_reason' got set
correctly. The comment should say what it is really doing.

==
sub check_for_slot_invalidation_in_server_log

10.
+# Check for invalidation of slot in server log
+sub che