Re: New standby_slot_names GUC in PG 17

2024-06-25 Thread Masahiko Sawada
On Tue, Jun 25, 2024 at 1:54 PM Amit Kapila  wrote:
>
> On Tue, Jun 25, 2024 at 8:20 AM Masahiko Sawada  wrote:
> >
> > On Tue, Jun 25, 2024 at 11:21 AM Zhijie Hou (Fujitsu)
> >  wrote:
> > >
> > > I agree the current name seems too generic and the suggested
> > > 'synchronized_standby_slots' is better than the current one.
> > >
> > > Some other ideas could be:
> > >
> > > synchronize_slots_on_standbys: it indicates that the standbys that enabled
> > > slot sync should be listed in this GUC.
> > >
> > > logical_replication_wait_slots: it means the logical replication(logical
> > > Walsender process) will wait for these slots to advance the confirm flush
> > > lsn before proceeding.
> >
> > I feel that the name that has some connection to "logical replication"
> > also sounds good. Let me add some ideas:
> >
> > - logical_replication_synchronous_standby_slots (might be too long)
> > - logical_replication_synchronous_slots
> >
>
> I see your point about keeping logical_replication in the name but
> that could also lead one to think that this list can contain logical
> slots.

Right.

>  OTOH, there is some value in keeping '_standby_' in the name as
> that is more closely associated with physical standbys, and this list
> contains physical slots corresponding to physical standbys.

Agreed.

> So, my
> preference is in order as follows: synchronized_standby_slots,
> wait_for_standby_slots, logical_replication_wait_slots,
> logical_replication_synchronous_slots, and
> logical_replication_synchronous_standby_slots.

I also prefer synchronized_standby_slots.

From a different angle, just for discussion: is it worth considering
the term 'failover', since the purpose of this feature is to ensure
that a standby is ready for failover in terms of logical replication?
For example, failover_standby_slot_names?

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Add pg_get_acl() function get the ACL for a database object

2024-06-25 Thread Joel Jacobson
On Tue, Jun 25, 2024, at 08:42, Michael Paquier wrote:
> On Tue, Jun 25, 2024 at 08:06:41AM +0200, Joel Jacobson wrote:
>> Not sure if I see how to implement it for pg_get_acl() though.
>> 
>> I've had a look at how pg_describe_object() works for this case:
>> 
>> SELECT pg_describe_object(0,'t'::regclass::oid,0);
>> ERROR:  unsupported object class: 0
>> 
>> I suppose this is the error message we want in pg_get_acl() when
>> the class ID is invalid?
>
> Ah, and here I thought that this was also returning NULL.  My previous
> work in this area only focused on the object OIDs, not their classes.
> At the end, I'm OK to keep your patch as it is, checking only for the
> case of pinned dependencies in pg_depend as we do for
> pg_describe_object().
>
> It's still a bit confusing, but we've been living with that for years
> now without anybody complaining except me, so perhaps that's fine at
> the end to keep that as this is still useful.  If we change that,
> applying the same rules across the board would make the most sense.

OK, cool.

New version attached that fixes the indentation of the example,
and adjusts the markup of NULL in the doc.

/Joel

v8-0001-Add-pg_get_acl.patch
Description: Binary data


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-06-25 Thread Jelte Fennema-Nio
On Tue, 25 Jun 2024 at 07:00, Alexander Lakhin  wrote:
> I'd just like to add that that one original query assumes several "remote"
> queries (see the attached excerpt from postmaster.log with verbose logging
> enabled).

Nice catch! All those EXPLAIN queries are definitely not intentional,
and likely greatly increase the chance of this flakiness.

Attached is a patch that fixes that by moving the test before enabling
use_remote_estimate on any of the foreign tables, as well as
increasing the statement_timeout to 100ms.

My expectation is that that should remove all failure cases. If it
doesn't, I think our best bet is removing the test again.


v2-0001-Make-postgres_fdw-cancel-test-not-flaky-anymore.patch
Description: Binary data


v2-0002-Do-not-reset-statement_timeout-indicator-outside-.patch
Description: Binary data


Re: Track the amount of time waiting due to cost_delay

2024-06-25 Thread Bertrand Drouvot
Hi,

On Tue, Jun 25, 2024 at 01:12:16AM +, Imseih (AWS), Sami wrote:

Thanks for the feedback!

> >> 2. the leader being interrupted while waiting is also already
> >> happening on master due to the pgstat_progress_parallel_incr_param()
> >> calls in parallel_vacuum_process_one_index() (that have been added in
> >> 46ebdfe164). It has been the case "only" 36 times during my test case.
> 
> 46ebdfe164 will interrupt the leader's sleep every time a parallel
> worker reports progress, and we currently don't handle interrupts by
> restarting the sleep with the remaining time. nanosleep does provide
> the ability to restart with the remaining time [1], but I don't think
> it's worth the effort to ensure more accurate vacuum delays for the
> leader process.

+1. I don't think it's necessary to have a 100% accurate delay every
time the delay is involved. It's a heuristic parameter (along with the
cost limit). What matters in the end is by how much you've been able to
pause the whole vacuum, not the accuracy of each individual sleep.

> > 1. Having a time based only approach to throttle 
> 
> I do agree with a time based approach overall.
> 
> 
> > 1.1) the more parallel workers is used, the less the impact of the leader on
> > the vacuum index phase duration/workload is (because the repartition is done
> > on more processes).
> 
> Did you mean " because the vacuum is done on more processes"? 

Yes.

> When a leader is operating on a large index(s) during the entirety
> of the vacuum operation, wouldn't more parallel workers end up
> interrupting the leader more often?

That's right, but my point was about the impact on the "whole" duration
and "whole" workload (leader and workers included), not about the number
of times the leader is interrupted. If there are, say, 100 workers, then
interrupting the leader (1 process out of 101) is probably less of an
issue, as it means there is a lot of work to be done to keep those 100
workers busy. I don't think the size of the index the leader is
vacuuming has an impact: having the leader vacuum one 100 GB index or
100 x 1 GB indexes is the same (as long as all the other workers are
active during all that time).

> > 3. A 1 second reporting "throttling" looks a reasonable threshold as:
> 
> > 3.1 the idea is to have a significant impact when the leader could have been
> > interrupted say hundred/thousand times per second.
> 
> > 3.2 it does not make that much sense for any tools to sample 
> > pg_stat_progress_vacuum
> > multiple times per second (so a one second reporting granularity seems ok).
> 
> I feel 1 second may still be too frequent. 

Maybe we'll need more measurements, but this is what my test case was made of:

vacuum_cost_delay = 1
vacuum_cost_limit = 10
8 parallel workers, 1 leader
21 indexes (about 1GB each, one 40MB), all in memory

lead to:

With a 1-second reporting frequency, the leader was interrupted about
2500 times over 8m39s, leading to about the same total time as on
master (8m43s).

> What about 10 seconds ( or 30 seconds )? 

I'm not sure (may need more measurements) but it would probably complicate the
reporting a bit (as with the current v3 we'd miss reporting the indexes that
take less time than the threshold to complete).

> I think this metric in particular will be mainly useful for vacuum
> runs that are running for minutes or more, making reporting every 10
> or 30 seconds still useful.

Agree. OTOH, one could be interested in diagnosing what happened during,
say, a 5-second peak in I/O resource consumption/latency. Sampling
pg_stat_progress_vacuum at a 1-second interval and seeing by how much
the vacuum has been paused during that time could help too (especially
if it is made of a lot of parallel workers that could lead to a lot of
I/O). But we would miss data if we report at a coarser interval.

> It just occurred to me also that pgstat_progress_parallel_incr_param 
> should have a code comment that it will interrupt a leader process and
> cause activity such as a sleep to end early.

Good point, I'll add a comment for it.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: New standby_slot_names GUC in PG 17

2024-06-25 Thread Amit Kapila
On Tue, Jun 25, 2024 at 12:30 PM Masahiko Sawada  wrote:
>
> On Tue, Jun 25, 2024 at 1:54 PM Amit Kapila  wrote:
> >
>
> > So, my
> > preference is in order as follows: synchronized_standby_slots,
> > wait_for_standby_slots, logical_replication_wait_slots,
> > logical_replication_synchronous_slots, and
> > logical_replication_synchronous_standby_slots.
>
> I also prefer synchronized_standby_slots.
>
> From a different angle just for discussion, is it worth considering
> the term 'failover' since the purpose of this feature is to ensure a
> standby to be ready for failover in terms of logical replication? For
> example, failover_standby_slot_names?
>

I feel synchronized better indicates the purpose because we ensure
such slots are synchronized before we process changes for logical
failover slots. We already have a 'failover' option for logical slots
which could make things confusing if we add 'failover' where physical
slots need to be specified.

-- 
With Regards,
Amit Kapila.




add a new explain option including_query for include query string inside the json plan output

2024-06-25 Thread jian he
For the json format, add a new option to let the explain json output
also include the actual query string.

This can make json usage more convenient: you only need to grab the
json output, with no need to run another explain statement and extract
the actual query from it.

"including_query" is the best name I have come up with so far; if
people have better ideas, we can change it.

example:
explain (analyze,including_query on, format json) select 1;
 QUERY PLAN
-
 [  +
   {"Query": "select 1"},   +
   {+
 "Plan": {  +
   "Node Type": "Result",   +
   "Parallel Aware": false, +
   "Async Capable": false,  +
   "Startup Cost": 0.00,+
   "Total Cost": 0.01,  +
   "Plan Rows": 1,  +
   "Plan Width": 4, +
   "Actual Startup Time": 0.001,+
   "Actual Total Time": 0.001,  +
   "Actual Rows": 1,+
   "Actual Loops": 1+
 }, +
 "Planning Time": 0.119,+
 "Triggers": [  +
 ], +
 "Execution Time": 0.033+
   }+
 ]
(1 row)




RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-06-25 Thread Hayato Kuroda (Fujitsu)
> Dear hackers,
> 
> I found that v12 patch set could not be accepted by the cfbot. PSA new 
> version.

To make this easier for others to track, I'm sharing the changes just in
case. All the failures occurred in the pg_dump code. I added an
attribute to pg_subscription and modified the pg_dump code, but the
modification was wrong: the constructed SQL was incomplete. I.e., in [1]:

```
pg_dump: error: query failed: ERROR:  syntax error at or near "."
LINE 15:  s.subforcealter
   ^
pg_dump: detail: Query was: SELECT s.tableoid, s.oid, s.subname,
 s.subowner,
 s.subconninfo, s.subslotname, s.subsynccommit,
 s.subpublications,
 s.subbinary,
 s.substream,
 s.subtwophasestate,
 s.subdisableonerr,
 s.subpasswordrequired,
 s.subrunasowner,
 s.suborigin,
 NULL AS suboriginremotelsn,
 false AS subenabled,
 s.subfailover
 s.subforcealter
FROM pg_subscription s
WHERE s.subdbid = (SELECT oid FROM pg_database
   WHERE datname = current_database())
```

Based on that, I just added a comma in the 0004 patch.

[1]: https://cirrus-ci.com/task/6710166165389312

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



Re: RFC: Additional Directory for Extensions

2024-06-25 Thread Christoph Berg
Re: Nathan Bossart
> At first glance, the general idea seems reasonable to me.  I'm wondering
> whether there is a requirement for this directory to be prepended or if it
> could be appended to the end.  That way, the existing ones would take
> priority, which might be desirable from a security standpoint.

My use case for this is to test things at compile time (where I can't
write to /usr/share/postgresql/). If installed things would take
priority over the things that I'm trying to test, I'd be disappointed.

Christoph




Re: Partial aggregates pushdown

2024-06-25 Thread Jelte Fennema-Nio
On Tue, 25 Jun 2024 at 08:33, fujii.y...@df.mitsubishielectric.co.jp
 wrote:
> Actually, I have other tasks for the "PARTIAL_AGGREGATE" keyword
> to respond to Requirement1 and Requirement2 in the following mail.
> https://www.postgresql.org/message-id/TYAPR01MB3088755F2281D41F5EEF06D495F92%40TYAPR01MB3088.jpnprd01.prod.outlook.com

No problem. I totally think it makes sense to focus on basic
PARTIAL_AGGREGATE first. Which is also why I suggested splitting the
patchset up in multiple patches. That way it's easier to get everyone
aligned on PARTIAL_AGGREGATE behaviour for non-internal transtypes,
which would already be a huge improvement over the current situation
in my opinion.

> After those tasks, I plan to compare your proposal with mine seriously,
> with additional POC patches if necessary.

Sounds great! To be clear, I'm not sure which proposal is best. I
mainly thought mine seemed interesting because it doesn't require
additional columns. But maybe the benefits that the extra columns in
your proposal brings are worth adding those extra columns.

> I see. It seems that adding new native types might make it easier to
> transmit the state values between a local and a remote server that have
> different major versions.
> However, in my opinion, we should be careful about supporting the case
> in which local and remote have different major versions, because the
> transtype of an aggregate function may change in a future major version
> due to something related to the implementation.
> Actually, something like that occurred recently, see
> https://github.com/postgres/postgres/commit/519fc1bd9e9d7b408903e44f55f83f6db30742b7
> I think the transtype of an aggregate function is quite a bit more
> changeable than the return type.
> Consequently, so far, I want to support only the cases in which local
> and remote have the same major version.
> If we try to remove that limitation, it seems to need more additional code.

Hmm, that's a very good point. Indeed cross-major-version partial
aggregates pushdown would not be fully solved with this yet.

> And I'm afraid that requiring typinput/typoutput bothers developers:
> they would also have to create new native types in addition to creating
> their new aggregate functions.
> I wonder if this concern might outweigh the benefits for debugging.
> And if we skip send/receive, they can only use the text representation
> for transmitting the state value, which I think is too limiting.

I kinda agree with this argument. But really this same argument
applies just as well for regular CREATE TYPE. Developers are forced to
implement typinput/typoutput, even though send/receive might really be
enough for their usecase. So in a sense with your proposal, you give
transtypes a special status over regular types: i.e. transtypes are
the only types where only send/receive is necessary.

So that leaves me with two questions:
1. Maybe CREATE TYPE should allow types without input/output functions
as long as send/receive are defined. For these types, the text
representation could fall back to the hex representation of bytea.
2. If for some reason 1 is undesired, then why are transtypes so
special? Why is it fine for them to only have send/receive functions
but not for other types?




Re: Conflict Detection and Resolution

2024-06-25 Thread Amit Kapila
On Mon, Jun 24, 2024 at 1:47 PM shveta malik  wrote:
>
> On Thu, Jun 20, 2024 at 6:41 PM Amit Kapila  wrote:
> >
> > >> In the second patch, we can implement simple built-in resolution
> > >> strategies like apply and skip (which can be named as remote_apply and
> > >> keep_local, see [3][4] for details on these strategies) with ERROR or
> > >> LOG being the default strategy. We can allow these strategies to be
> > >> configured at the global and table level.
>
> Before we implement resolvers, we need a way to configure them. Please
> find the patch002 which attempts to implement Global Level Conflict
> Resolvers Configuration.  Note that patch002 is dependent upon
> Conflict-Detection patch001 which is reviewed in another thread [1].
> I have attached patch001 here for convenience and to avoid CFBot
> failures. But please use [1] if you have any comments on patch001.
>
> New DDL commands in patch002 are:
>
> To set global resolver for a given conflict_type:
> SET CONFLICT RESOLVER 'conflict_resolver' FOR 'conflict_type'
>
> To reset to default resolver:
> RESET CONFLICT RESOLVER FOR 'conflict_type'
>

Does setting up resolvers have any meaning without subscriptions? I am
wondering whether we should allow setting up resolvers at the
subscription level. One benefit is that users don't need to use a
different DDL to set up resolvers. The first patch gives a conflict
detection option at the subscription level, so it would be symmetrical
to provide a resolver at the subscription level. Yet another benefit
could be that it gives users the facility to configure different
resolvers for a set of tables belonging to a particular
publication/node.

>
> 
>
> As suggested in [2] and above, it seems logical to have table-specific
> resolvers configuration along with global one.
>
> Here is the proposal for table level resolvers:
>
> 1) We can provide support for table level resolvers using ALTER TABLE:
>
> ALTER TABLE <table_name> SET CONFLICT RESOLVER <resolver> ON <conflict_type>,
>             SET CONFLICT RESOLVER <resolver> ON <conflict_type>, ...;
>
> Reset can be done using:
> ALTER TABLE <table_name> RESET CONFLICT RESOLVER ON <conflict_type>,
>             RESET CONFLICT RESOLVER ON <conflict_type>, ...;
>
> Above commands will save/remove configuration in/from the new system
> catalog pg_conflict_rel.
>
> 2) Table level configuration (if any) will be given preference over
> global ones. The tables not having table-specific resolvers will use
> global configured ones.
>
> 3) If the table is a partition table, then resolvers created for the
> parent will be inherited by all child partition tables. Multiple
> resolver entries will be created, one for each child partition in the
> system catalog (similar to constraints).
>
> 4) Users can also configure explicit resolvers for child partitions.
> In such a case, child's resolvers will override inherited resolvers
> (if any).
>
> 5) Any attempt to RESET (remove) inherited resolvers on the child
> partition table *alone* will result in error:  "cannot reset inherited
> resolvers" (similar to constraints). But RESET of explicit created
> resolvers (non-inherited ones) will be permitted for child partitions.
> On RESET, the resolver configuration will not fallback to the
> inherited resolver again. Users need to explicitly configure new
> resolvers for the child partition tables (after RESET) if needed.
>

Why so? If we allow the RESET command to fall back to the inherited
resolver, it would make the behavior consistent with a child table on
which we have not performed SET.

> 6) Removal/Reset of resolvers on parent will remove corresponding
> "inherited" resolvers on all the child partitions as well. If any
> child has overridden inherited resolvers earlier, those will stay.
>
> 7) For 'ALTER TABLE parent ATTACH PARTITION child'; if 'child' has its
> own resolvers set, those will not be overridden. But if it does not
> have resolvers set, it will inherit from the parent table. This will
> mean, for say out of 5 conflict_types, if the child table has
> resolvers configured for any 2, 'attach' will retain those; for the
> rest 3, it will inherit from the parent (if any).
>
> 8) Detach partition will not remove inherited resolvers, it will just
> mark them 'non inherited' (similar to constraints).
>

BTW, to keep the initial patch simple, can we prohibit setting
resolvers at the child table level? If we follow this, then we can
give an ERROR if the user tries to attach the table (with configured
resolvers) to an existing partitioned table.

-- 
With Regards,
Amit Kapila.




Re: scalability bottlenecks with (many) partitions (and more)

2024-06-25 Thread Tomas Vondra



On 6/24/24 17:05, Robert Haas wrote:
> On Sun, Jan 28, 2024 at 4:57 PM Tomas Vondra
>  wrote:
>> For NUM_LOCK_PARTITIONS this is pretty simple (see 0001 patch). The
>> LWLock table has 16 partitions by default - it's quite possible that on
>> machine with many cores and/or many partitions, we can easily hit this.
>> So I bumped this 4x to 64 partitions.
> 
> I think this probably makes sense. I'm a little worried that we're
> just kicking the can down the road here where maybe we should be
> solving the problem in some more fundamental way, and I'm also a
> little worried that we might be reducing single-core performance. But
> it's probably fine.
> 

Yeah, I haven't seen this causing any regressions - the sensitive paths
typically lock only one partition, so having more of them does not
affect that. Or if it does, it's likely a reasonable trade off as it
reduces the risk of lock contention.

That being said, I don't recall benchmarking this patch in isolation,
only with the other patches. Maybe I should do that ...

>> What I ended up doing is having a hash table of 16-element arrays. There
>> are 64 "pieces", each essentially the (16 x OID + UINT64 bitmap) that we
>> have now. Each OID is mapped to exactly one of these parts as if in a
>> hash table, and in each of those 16-element parts we do exactly the same
>> thing we do now (linear search, removal, etc.). This works great, the
>> locality is great, etc. The one disadvantage is this makes PGPROC
>> larger, but I did a lot of benchmarks and I haven't seen any regression
>> that I could attribute to this. (More about this later.)
> 
> I think this is a reasonable approach. Some comments:
> 
> - FastPathLocalUseInitialized seems unnecessary to me; the contents of
> an uninitialized local variable are undefined, but an uninitialized
> global variable always starts out zeroed.
> 

OK. I didn't realize global variables start as zero.

> - You need comments in various places, including here, where someone
> is certain to have questions about the algorithm and choice of
> constants:
> 
> +#define FAST_PATH_LOCK_REL_GROUP(rel) (((uint64) (rel) * 7883 + 4481)
> % FP_LOCK_GROUPS_PER_BACKEND)
> 

Yeah, it definitely needs a comment explaining this.

I admit those numbers are pretty arbitrary primes, to implement a
trivial hash function. That was good enough for a PoC patch, but maybe
a "proper" version should use a better hash function. It needs to be
fast, and maybe it doesn't matter that much if it's not perfect.

> When I originally coded up the fast-path locking stuff, I supposed
> that we couldn't make the number of slots too big because the
> algorithm requires a linear search of the whole array. But with this
> one trick (a partially-associative cache), there's no real reason that
> I can think of why you can't make the number of slots almost
> arbitrarily large. At some point you're going to use too much memory,
> and probably before that point you're going to make the cache big
> enough that it doesn't fit in the CPU cache of an individual core, at
> which point possibly it will stop working as well. But honestly ... I
> don't quite see why this approach couldn't be scaled quite far.
> 

I don't think I've heard the term "partially-associative cache" before,
but now that I look at the approach again, it very much reminds me how
set-associative caches work (e.g. with cachelines in CPU caches). It's a
16-way associative cache, assigning each entry into one of 16 slots.

I must have been reading some papers in this area shortly before the PoC
patch, and the idea came from there, probably. Which is good, because it
means it's a well-understood and widely-used approach.

> Like, if we raised FP_LOCK_GROUPS_PER_BACKEND from your proposed value
> of 64 to say 65536, would that still perform well? I'm not saying we
> should do that, because that's probably a silly amount of memory to
> use for this, but the point is that when you start to have enough
> partitions that you run out of lock slots, performance is going to
> degrade, so you can imagine wanting to try to have enough lock groups
> to make that unlikely. Which leads me to wonder if there's any
> particular number of lock groups that is clearly "too many" or whether
> it's just about how much memory we want to use.
> 

That's an excellent question. I don't know.

I agree 64 groups is pretty arbitrary, and having 1024 may not be enough
even with a modest number of partitions. When I was thinking about using
a higher value, my main concern was that it'd make the PGPROC entry
larger. Each "fast-path" group is ~72B, so 64 groups is ~4.5kB, and that
felt like quite a bit.

But maybe it's fine and we could make it much larger - L3 caches tend to
be many MBs these days, although AFAIK they're shared by the threads
running on the CPU.

I'll see if I can do some more testing of this, and see if there's a
value where it stops improving / starts degrading, etc.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.e

Re: Conflict Detection and Resolution

2024-06-25 Thread shveta malik
On Tue, Jun 25, 2024 at 3:12 PM Amit Kapila  wrote:
>
> On Mon, Jun 24, 2024 at 1:47 PM shveta malik  wrote:
> >
> > On Thu, Jun 20, 2024 at 6:41 PM Amit Kapila  wrote:
> > >
> > > >> In the second patch, we can implement simple built-in resolution
> > > >> strategies like apply and skip (which can be named as remote_apply and
> > > >> keep_local, see [3][4] for details on these strategies) with ERROR or
> > > >> LOG being the default strategy. We can allow these strategies to be
> > > >> configured at the global and table level.
> >
> > Before we implement resolvers, we need a way to configure them. Please
> > find the patch002 which attempts to implement Global Level Conflict
> > Resolvers Configuration.  Note that patch002 is dependent upon
> > Conflict-Detection patch001 which is reviewed in another thread [1].
> > I have attached patch001 here for convenience and to avoid CFBot
> > failures. But please use [1] if you have any comments on patch001.
> >
> > New DDL commands in patch002 are:
> >
> > To set global resolver for a given conflict_type:
> > SET CONFLICT RESOLVER 'conflict_resolver' FOR 'conflict_type'
> >
> > To reset to default resolver:
> > RESET CONFLICT RESOLVER FOR 'conflict_type'
> >
>
> Does setting up resolvers have any meaning without subscriptions? I am
> wondering whether we should allow setting up resolvers at the
> subscription level. One benefit is that users don't need to use a
> different DDL to set up resolvers. The first patch gives a conflict
> detection option at the subscription level, so it would be symmetrical
> to provide a resolver at the subscription level. Yet another benefit
> could be that it gives users the facility to configure different
> resolvers for a set of tables belonging to a particular
> publication/node.

There can be multiple tables included in a publication, with varying
business use-cases, and thus they may need different resolvers even
though they are all part of the same publication.

> >
> > 
> >
> > As suggested in [2] and above, it seems logical to have table-specific
> > resolvers configuration along with global one.
> >
> > Here is the proposal for table level resolvers:
> >
> > 1) We can provide support for table level resolvers using ALTER TABLE:
> >
> > ALTER TABLE <table_name> SET CONFLICT RESOLVER <resolver> ON <conflict_type>,
> >             SET CONFLICT RESOLVER <resolver> ON <conflict_type>, ...;
> >
> > Reset can be done using:
> > ALTER TABLE <table_name> RESET CONFLICT RESOLVER ON <conflict_type>,
> >             RESET CONFLICT RESOLVER ON <conflict_type>, ...;
> >
> > Above commands will save/remove configuration in/from the new system
> > catalog pg_conflict_rel.
> >
> > 2) Table level configuration (if any) will be given preference over
> > global ones. The tables not having table-specific resolvers will use
> > global configured ones.
> >
> > 3) If the table is a partition table, then resolvers created for the
> > parent will be inherited by all child partition tables. Multiple
> > resolver entries will be created, one for each child partition in the
> > system catalog (similar to constraints).
> >
> > 4) Users can also configure explicit resolvers for child partitions.
> > In such a case, child's resolvers will override inherited resolvers
> > (if any).
> >
> > 5) Any attempt to RESET (remove) inherited resolvers on the child
> > partition table *alone* will result in error:  "cannot reset inherited
> > resolvers" (similar to constraints). But RESET of explicit created
> > resolvers (non-inherited ones) will be permitted for child partitions.
> > On RESET, the resolver configuration will not fallback to the
> > inherited resolver again. Users need to explicitly configure new
> > resolvers for the child partition tables (after RESET) if needed.
> >
>
> Why so? If we allow the RESET command to fall back to the inherited
> resolver, it would make the behavior consistent with a child table on
> which we have not performed SET.

The thought behind not falling back is that since the user has done
'RESET', they may want to remove the resolver completely. We don't know
whether they really want to go back to the previous one. If they do, it
is easy to set it again. But if they do not, and we restore the
inherited resolver during 'RESET', there is no way they can drop that
inherited resolver alone on the child partition.

> > 6) Removal/Reset of resolvers on parent will remove corresponding
> > "inherited" resolvers on all the child partitions as well. If any
> > child has overridden inherited resolvers earlier, those will stay.
> >
> > 7) For 'ALTER TABLE parent ATTACH PARTITION child'; if 'child' has its
> > own resolvers set, those will not be overridden. But if it does not
> > have resolvers set, it will inherit from the parent table. This will
> > mean, for say out of 5 conflict_types, if the child table has
> > resolvers configured for any 2, 'attach' will retain those; for the
> > rest 3, it will inherit from the parent (if any).
> >
> > 8) Detach partition will not remove inherited resolvers, it will just
> > mark them 'non inherited' (similar to constraints).

Re: RFC: Additional Directory for Extensions

2024-06-25 Thread Alvaro Herrera
On 2024-Jun-24, Robert Haas wrote:

> Is "tighten up what the superuser can do" on our list of objectives?
> Personally, I think we should be focusing mostly, and maybe entirely,
> on letting non-superusers do more things, with appropriate security
> controls. The superuser can ultimately do anything, since they can
> cause shell commands to be run and load arbitrary code into the
> backend and write code in untrusted procedural languages and mutilate
> the system catalogs and lots of other terrible things.

I don't agree that we should focus _solely_ on allowing non-superusers
to do more things.  Sure, it's a good thing to do -- but we shouldn't
completely close off the option of securing superuser itself.  I think
it's not completely impossible to have a future where superuser is
confined to the database, i.e. it can't escape to the operating system.
I'm sure that would be useful in many environments.  On this list, many
people frequently argue that it is impossible to secure, but I'm not
convinced.

They can mutilate the system catalogs: yes, they can TRUNCATE pg_type.
So what?  They've just destroyed their own ability to do anything else.
The real issue here is that they can edit pg_proc to cause SQL function
calls to call arbitrary code.  But what if we limit functions so that
the C code that they can call is located in specific places that are
known to only contain secure code?  This is easy: make sure the
OS-installation only contains safe code in $libdir.

I hear you say: ah, but they can modify dynamic_library_path, which is a
GUC, to load code from anywhere -- especially /tmp, where the newest
bitcoin-mining library was just written.  This is true.  I suggest, to
solve this problem, that we should make dynamic_library_path no longer a
GUC.  It should be a setting that comes from a different origin, one
that even superuser cannot write to.  Only the OS-installation can
modify that file; that way, superuser cannot load arbitrary code that
way.

This is where the new GUC setting being proposed in this thread rubs me
the wrong way: it's adding yet another avenue for this to be exploited.
I would like this new directory not to be a GUC either, just like
dynamic_library_path.

I hear you argue: ah, but they can use COPY to write a new file to
$libdir.  Yes, they can, and I think that's foolish.  We could have
another non-GUC setting which takes a list of directories where COPY can
write files into.  Problem solved.  Do people really need the ability to
write files in arbitrary locations?

Untrusted extensions: well, just don't have those in the OS-installation
and you'll be fine.  I'm okay with saying that a superuser-restricted
system is incompatible with plpython.

archive_command and so on: we could disable these too.  Nathan did some
work to implement those using dynamic libraries, so it shouldn't be too
much of a loss; anything that is done with a shell script can also be
done with a small library.  Those libraries can be made safe.
If there are other ways to invoke shell commands from GUCs, let's add
the ability to use libraries for those too.

What other exploits do we know about?  How can we close them?


Now, I'm not saying that this is an easy journey.  But if we don't
start, we're not going to get there.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem."  (Tom Lane)




Backporting BackgroundPsql

2024-06-25 Thread Heikki Linnakangas
While fixing a recent bug on visibility on a standby [1], I wrote a 
regression test that uses BackgroundPsql to run some queries in a 
long-running psql session. The problem is that that was refactored in 
v17, commit 664d757531. The test I wrote for v17 doesn't work as it is 
on backbranches. Options:


1. Write the new test differently on backbranches. Before 664d757531, 
the test needs to work a lot harder to use the background psql session, 
calling pump() etc. That's doable, but as noted in the discussion that 
led to 664d757531, it's laborious and error-prone.


2. Backport commit 664d757531. This might break out-of-tree perl tests 
that use the background_psql() function. I don't know if any such tests 
exist, and they would need to be changed for v17 anyway, so that seems 
acceptable. Anyone aware of any extensions using the perl test modules?


3. Backport commit 664d757531, but keep the existing background_psql() 
function unchanged. Add a different constructor to get the v17-style 
BackgroundPsql session, something like "$node->background_psql_new()".


I'm leaning towards 3. We might need to backport more perl tests that
use background_psql() in the future; backporting the test module will
make that easier.


Thoughts?

[1] 
https://www.postgresql.org/message-id/6b852e98-2d49-4ca1-9e95-db419a269...@iki.fi


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: add a new explain option including_query for include query string inside the json plan output

2024-06-25 Thread Matthias van de Meent
On Tue, 25 Jun 2024 at 10:55, jian he  wrote:
>
> for json format, add a new option to let the explain json output also
> include the actual query string.

How would this cooperate with e.g. EXPLAIN (...) EXECUTE
my_prepared_statement? Would this query be the prepared statement's
query, or the top-level EXECUTE statement?

> it can make json usage more convenient.
> Now you only need to grab the json output, no need to
> collect another explain statement and extract the actual query from
> the explain statement.

Wouldn't the user be able to keep track of the query they wanted
explained by themselves? If not, why?

> example:
> explain (analyze,including_query on, format json) select 1;
>  QUERY PLAN
> -
>  [  +
>{"Query": "select 1"},   +
>{+
>  "Plan": {  +

If we were to add the query to the explain output, I think it should
be a top-level key in the same JSON object that holds the "Plan",
"Triggers", and "Execution Time" keys.
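A toy sketch of what that layout could look like, and how a consumer would read it (the "Query" key and its exact placement here are assumptions for illustration, not a committed design):

```python
import json

# Hypothetical EXPLAIN (FORMAT JSON) output with the query string as a
# top-level key in the same object as "Plan" and "Execution Time",
# rather than in a separate object preceding the plan.  This shape is
# an assumption, not existing PostgreSQL syntax.
explain_json = """
[
  {
    "Query": "select 1",
    "Plan": {"Node Type": "Result", "Total Cost": 0.01},
    "Execution Time": 0.012
  }
]
"""

parsed = json.loads(explain_json)
top = parsed[0]
# A consumer can now pick up the query and the plan from one object.
print(top["Query"])   # -> select 1
print("Plan" in top)  # -> True
```

With the key at the top level, tooling that already walks the plan object needs no extra unwrapping step to find the query.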

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)




Re: Backporting BackgroundPsql

2024-06-25 Thread Heikki Linnakangas

On 25/06/2024 13:26, Heikki Linnakangas wrote:
While fixing a recent bug on visibility on a standby [1], I wrote a 
regression test that uses BackgroundPsql to run some queries in a 
long-running psql session. The problem is that that was refactored in 
v17, commit 664d757531. The test I wrote for v17 doesn't work as it is 
on backbranches. Options:


Sorry, I meant v16. The BackgroundPsql refactorings went into v16. The 
backporting question remains for v15 and below.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Meson far from ready on Windows

2024-06-25 Thread Andres Freund
Hi,

On 2024-06-21 12:20:49 +0100, Dave Page wrote:
> > I'm confused - the old build system wasn't flexible around this stuff *at
> > all*. Everyone had to patch it to get dependencies to work, unless you
> > chose
> > exactly the right source to download from - which was often not documented
> > or
> > outdated.
> >
> 
> As I noted above - as the "owner" of the official packages, I never did
> despite using a variety of upstream sources.

For reference, with 16 and src/tools/msvc:
- upstream zstd build doesn't work, wrong filename (libzstd.dll.a instead of 
libzstd.lib)
- upstream lz4 build doesn't work, wrong filename (liblz4.dll.a instead of 
liblz4.lib)
- openssl, from https://slproweb.com/products/Win32OpenSSL.htm , as our
  docs suggest: doesn't work, wrong filenames (openssl.lib instead of
  lib*64.lib, works if you delete lib/VC/sslcrypto64MD.lib)
- iconv/intl: mismatching library names (lib*.dll.a lib*.lib)

- zlib, at least from some of the sources (it's hard to tell, because
  everything available is so outdated), wrong filenames

Upstream ICU works.

I gave up at this point, so I don't know if libxml, xslt and uuid work without
patching the sources.

Greetings,

Andres Freund




Re: Meson far from ready on Windows

2024-06-25 Thread Dave Page
On Tue, 25 Jun 2024 at 11:41, Andres Freund  wrote:

> Hi,
>
> On 2024-06-21 12:20:49 +0100, Dave Page wrote:
> > > I'm confused - the old build system wasn't flexible around this stuff
> *at
> > > all*. Everyone had to patch it to get dependencies to work, unless you
> > > chose
> > > exactly the right source to download from - which was often not
> documented
> > > or
> > > outdated.
> > >
> >
> > As I noted above - as the "owner" of the official packages, I never did
> > despite using a variety of upstream sources.
>
> For reference, with 16 and src/tools/msvc:
> - upstream zstd build doesn't work, wrong filename (libzstd.dll.a instead
> of libzstd.lib)
> - upstream lz4 build doesn't work, wrong filename (liblz4.dll.a instead of
> liblz4.lib)
> - openssl, from https://slproweb.com/products/Win32OpenSSL.htm , as our
>   docs suggest: doesn't work, wrong filenames (openssl.lib instead of
>   lib*64.lib, works if you delete lib/VC/sslcrypto64MD.lib)
> - iconv/intl: mismatching library names (lib*.dll.a lib*.lib)
>
> - zlib, at least from some of the sources (it's hard to tell,
> because
>   everything available is so outdated), wrong filenames
>

https://github.com/dpage/winpgbuild proves that the hacks above are not
required *if* you build the dependencies in the recommended way for use
with MSVC++ (where documented), or otherwise as native Windows builds.

If you, for example, build a dependency using Mingw/Msys, then you may get
different filenames than if you build the same thing using its VC++
solution or makefile. That's where most, if not all, of these issues come
from.

It's probably worth noting that "back in the day" when most of this stuff
was built, there was no UCRT32 compiler option, and it really was a
potential problem to mix VC++ and Mingw compiled binaries so there was a
heavy focus on making sure everything was designed around the MSVC++ builds
wherever they existed.

-- 
Dave Page
pgAdmin: https://www.pgadmin.org
PostgreSQL: https://www.postgresql.org
EDB: https://www.enterprisedb.com


Re: Meson far from ready on Windows

2024-06-25 Thread Andres Freund
Hi,

On 2024-06-24 09:44:57 +0100, Dave Page wrote:
> To address Andres' concerns around mis-detection of dependencies, or other
> oddities such as required compiler flags not being included, I would
> suggest that a) that's happened very rarely, if ever, in the past, and b)
> we can always spit out an obvious warning if we've not been able to use
> cmake or pkgconfig for any particular dependencies.

I personally spent quite a few days hunting down issues related to this. Not
because I wanted to, but because it was causing breakage and nobody else was
looking.  For several years postgres didn't build against a modern perl, for
example, see the stuff leading up to
  http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ccc59a83cd97
but nobody seemed to care for a prolonged amount of time.

We have evidence of random build hackery all over the tree - often
entirely outdated, sometimes even *breaking* builds these days ([1]):

https://github.com/postgres/postgres/blob/master/src/interfaces/libpq/Makefile#L80-L88
(wrong library names for kerberos on 64bit systems, wrong ssl libnames for )

https://github.com/postgres/postgres/blob/master/contrib/bool_plperl/Makefile#L30
https://github.com/postgres/postgres/blob/master/src/pl/plpython/Makefile#L59-L72
https://github.com/postgres/postgres/blob/master/src/pl/tcl/Makefile#L35-L51
https://github.com/postgres/postgres/blob/master/config/python.m4#L62-L64

There's plenty more, some of the more complicated cases are a bit less trivial
to search for.

Greetings,

Andres Freund

[1] 
https://postgr.es/m/CAGPVpCSKS9E0An4%3De7ZDnme%2By%3DWOcQFJYJegKO8kE9%3Dgh8NJKQ%40mail.gmail.com




Re: Proposal: Document ABI Compatibility

2024-06-25 Thread Peter Eisentraut

On 24.06.24 22:26, David E. Wheeler wrote:

But now we're talking about API.  That might be subject of another document or 
another section in this one, but it seems confusing to mix this with the ABI 
discussion.

Hrm. They’re super closely-related in my mind, as an extension developer. I 
need to know both! I guess I’m thinking of this policy as what I can expect 
may be changed (and how to adapt to it) and what won’t.

That said, I’m fine to remove the API stuff if there’s consensus objecting to 
it, to be defined in a separate policy (perhaps on the same doc page).


I took a stab at this, using some of your text, but discussing API 
and ABI separately.



# Server API and ABI guidance

This section contains guidance to authors of extensions and other
server plugins about ABI and API stability in the PostgreSQL server.

## General

The PostgreSQL server contains several well-delimited APIs for server
plugins, such as the function manager (fmgr), SPI, and various hooks
specifically designed for extensions.  These interfaces are carefully
managed for long-term stability and compatibility.  However, the
entire set of global functions and variables in the server effectively
constitutes the publicly usable API, but most parts of that were not
designed with extensibility and long-term stability in mind.  That
means, while taking advantage of these interfaces is valid, the
further one strays from the well-trodden path, the likelier it will be
that one might encounter ABI or API compatibility issues at some
point.  Extension authors are also encouraged to provide feedback
about their requirements, so that over time, as new use patterns
arise, certain interfaces can be considered more stabilized or new
better-designed interfaces for new uses can be added.

## API compatibility

(API = application programming interface, meaning the interface used
at compile time)

### Major versions

There is _no_ promise of API compatibility between PostgreSQL major
versions.  That means, extension code might require source code
changes to work with multiple major versions.  These can usually be
managed with preprocessor conditions like `#if PG_VERSION_NUM >=
160000`.  Sophisticated extensions that use interfaces beyond the
well-delimited ones usually require a few such changes for each major
server version.

### Minor versions

PostgreSQL makes an effort to avoid server API breaks in minor
releases.  In general, extension code that compiles and works with
some minor release should also compile and work with any other minor
release, past or future.

When a change *is* required, this will be carefully managed, taking
the requirements of extensions into account.  Such changes will be
communicated in the release notes.

## ABI compatibility

(ABI = application binary interface, meaning the interface used at run
time)

### Major versions

Servers of different major versions have intentionally incompatible
ABIs.  That means, extensions that use server APIs must be re-compiled
for each major release.  The inclusion of `PG_MODULE_MAGIC` ensures
that code compiled for one major version will be rejected by other
major versions.

### Minor versions

PostgreSQL makes an effort to avoid server ABI breaks in minor
releases.  In general, an extension compiled against any minor release
should work with any other minor release, past or future.

When a change *is* required, PostgreSQL will choose the least invasive
change possible, for example by squeezing a new field into padding
space or appending it to the end of a struct.  These sorts of changes
should not impact extensions unless they use very unusual code
patterns.
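The padding trick can be illustrated with a small model, using Python's ctypes to stand in for C struct layout (the struct and field names are invented, and the layout assumes a typical 64-bit platform where pointers are 8-byte aligned):

```python
import ctypes

class NodeV1(ctypes.Structure):
    # Original struct: on a typical 64-bit ABI the pointer is 8-byte
    # aligned, leaving a 4-byte padding hole after 'flags'.
    _fields_ = [("flags", ctypes.c_int32),
                ("data", ctypes.c_void_p)]

class NodeV2(ctypes.Structure):
    # Minor-release change: a new int32 squeezed into that padding hole.
    _fields_ = [("flags", ctypes.c_int32),
                ("new_counter", ctypes.c_int32),
                ("data", ctypes.c_void_p)]

# Existing fields keep their offsets and the struct keeps its size, so
# a binary compiled against the V1 layout keeps working at run time.
assert ctypes.sizeof(NodeV1) == ctypes.sizeof(NodeV2)
assert NodeV2.flags.offset == NodeV1.flags.offset
assert NodeV2.data.offset == NodeV1.data.offset
print(ctypes.sizeof(NodeV2))  # 16 on a typical 64-bit platform
```

Appending a field at the end of a struct works similarly for structs that extensions never embed by value or allocate themselves, since only the server then depends on the grown size.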

In rare cases, however, even such non-invasive changes may be
impractical or impossible.  In such an event, the change will be
carefully managed, taking the requirements of extensions into account.
Such changes will also be documented in the release notes.

Note, however, again that many parts of the server are not designed or
maintained as publicly-consumable APIs (and that, in most cases, the
actual boundary is also not well-defined).  If urgent needs arise,
changes in those parts will naturally be done with less consideration
for extension code than changes in well-defined and widely used
interfaces.

Also, in the absence of automated detection of such changes, this is
not a guarantee, but historically such breaking changes have been
extremely rare.





Re: Meson far from ready on Windows

2024-06-25 Thread Andres Freund
Hi,

On 2024-06-25 11:54:56 +0100, Dave Page wrote:
> https://github.com/dpage/winpgbuild proves that the hacks above are not
> required *if* you build the dependencies in the recommended way for use
> with MSVC++ (where documented), otherwise just native Windows.

Partially it just means that some of the hacks are now located in the "build
dependencies" script.  E.g. you're renaming libintl.dll.a, libiconv.dll.a,
libuuid.a to something that's expected by the buildmethod. And the scripts
change the directory structure for several other dependencies (e.g. zstd, krb).


> If you, for example, build a dependency using Mingw/Msys, then you may get
> different filenames than if you build the same thing using its VC++
> solution or makefile. That's where most, if not all, of these issues come
> from.

Yes, that's precisely my point. The set of correct names / flags depends on
things outside of postgres control. Hence they should be handled outside of
postgres, not as part of postgres. Particularly because several of the
dependencies can be built in multiple ways, resulting in multiple library
names. And it doesn't even just differ by compiler, there's ways to get
different library names for some of the deps even with the same compiler!


> It's probably worth noting that "back in the day" when most of this stuff
> was built, there was no UCRT32 compiler option, and it really was a
> potential problem to mix VC++ and Mingw compiled binaries so there was a
> heavy focus on making sure everything was designed around the MSVC++ builds
> wherever they existed.

Agreed, this luckily got easier. But it also increased the variety of expected
library names / flags. It's entirely reasonable to build postgres with msvc
against a gcc-built ICU or whatnot.

Greetings,

Andres Freund




Re: Backporting BackgroundPsql

2024-06-25 Thread Andres Freund
Hi,

On 2024-06-25 13:26:23 +0300, Heikki Linnakangas wrote:
> While fixing a recent bug on visibility on a standby [1], I wrote a
> regression test that uses BackgroundPsql to run some queries in a
> long-running psql session. The problem is that that was refactored in v17,
> commit 664d757531. The test I wrote for v17 doesn't work as it is on
> backbranches. Options:
> 
> 1. Write the new test differently on backbranches. Before 664d757531, the
> test needs to work a lot harder to use the background psql session, calling
> pump() etc. That's doable, but as noted in the discussion that led to
> 664d757531, it's laborious and error-prone.
> 
> 2. Backport commit 664d757531. This might break out-of-tree perl tests that
> use the background_psql() function. I don't know if any such tests exist,
> and they would need to be changed for v17 anyway, so that seems acceptable.
> Anyone aware of any extensions using the perl test modules?
> 
> 3. Backport commit 664d757531, but keep the existing background_psql()
> function unchanged. Add a different constructor to get the v17-style
> BackgroundPsql session, something like "$node->background_psql_new()".
> 
> I'm leaning towards 3. We might need to backport more perl tests that use
> background_psql() in the future, backporting the test module will make that
> easier.
> 
> Thoughts?

Yes, I've wished for this a couple times. I think 2 or 3 would be reasonable.
I think 1) often just leads to either tests not being written or being
fragile...

Greetings,

Andres Freund




Re: RFC: Additional Directory for Extensions

2024-06-25 Thread Jelte Fennema-Nio
On Tue, 25 Jun 2024 at 12:12, Alvaro Herrera  wrote:
> They can mutilate the system catalogs: yes, they can TRUNCATE pg_type.
> So what?  They've just destroyed their own ability to do anything else.
> The real issue here is that they can edit pg_proc to cause SQL function
> calls to call arbitrary code.  But what if we limit functions so that
> the C code that they can call is located in specific places that are
> known to only contain secure code?  This is easy: make sure the
> OS-installation only contains safe code in $libdir.

I wouldn't call it "easy" but I totally agree that changing pg_proc is
the main security issue that we have no easy way to tackle.

> I hear you say: ah, but they can modify dynamic_library_path, which is a
> GUC, to load code from anywhere -- especially /tmp, where the newest
> bitcoin-mining library was just written.  This is true.  I suggest, to
> solve this problem, that we should make dynamic_library_path no longer a
> GUC.  It should be a setting that comes from a different origin, one
> that even superuser cannot write to.  Only the OS-installation can
> modify that file; that way, superuser cannot load arbitrary code that
> way.

I don't think that needs a whole new file. Making this GUC be
PGC_SIGHUP/PGC_POSTMASTER + GUC_DISALLOW_IN_AUTO_FILE should be
enough. Just like was done for the new allow_alter_system GUC in PG17.
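As a rough sketch of the mechanism being referenced (the flag and context names mirror the real GUC machinery, but the table and checks below are a simplified toy model, not PostgreSQL code):

```python
# Toy model: a GUC marked GUC_DISALLOW_IN_AUTO_FILE cannot be written
# via ALTER SYSTEM (i.e. into postgresql.auto.conf), and a PGC_SIGHUP
# context means it only changes via the on-disk config file plus a
# reload -- so an in-database superuser cannot repoint it at /tmp.
GUC_DISALLOW_IN_AUTO_FILE = 0x1

gucs = {
    "dynamic_library_path": {"context": "sighup",
                             "flags": GUC_DISALLOW_IN_AUTO_FILE},
    "work_mem":             {"context": "user", "flags": 0},
}

def alter_system_allowed(name: str) -> bool:
    return not (gucs[name]["flags"] & GUC_DISALLOW_IN_AUTO_FILE)

def set_in_session_allowed(name: str) -> bool:
    return gucs[name]["context"] == "user"

assert alter_system_allowed("work_mem")
assert not alter_system_allowed("dynamic_library_path")
assert not set_in_session_allowed("dynamic_library_path")
```

Under this model, the only remaining write path for the setting is the configuration file itself, which is what the OS-installation controls.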

> This is where the new GUC setting being proposed in this thread rubs me
> the wrong way: it's adding yet another avenue for this to be exploited.
> I would like this new directory not to be a GUC either, just like
> dynamic_library_path.

We can make it PGC_SIGHUP/PGC_POSTMASTER + GUC_DISALLOW_IN_AUTO_FILE
too, either now or in the future.

> Now, I'm not saying that this is an easy journey.  But if we don't
> start, we're not going to get there.

Sure, but it sounds like you're suggesting that you want to "start" by
not adding new features that have equivalent security holes as the
ones that we already have. I don't think that is a very helpful way to
get to a better place. It seems much more useful to tackle the current
problems that we have first, and almost certainly the same solutions
to those problems can be applied to any new features with security
issues.

It at least definitely seems the case for the proposal in this thread:
i.e. we already have a GUC that allows loading libraries from an
arbitrary location. This proposal adds another such GUC. If we solve
the security problem in that first GUC, either by
GUC_DISALLOW_IN_AUTO_FILE, or by creating a whole new mechanism for
the setting, then I see no reason why we cannot use that exact same
solution for the newly proposed GUC. So the required work to secure
postgres will not be meaningfully harder by adding this GUC.




Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY

2024-06-25 Thread Michail Nikolaev
Hello, Michael!

> As far as I can see, it depends on what kind of query semantics and
> the amount of transparency you are looking for here in your
> application.  An error in the query itself can also be defined as
> useful so as your application is aware of what happens as an effect of
> the concurrent index build (reindex or CIC/DIC), and it is not really
> clear to me why silently falling back to a re-selection of the arbiter
> indexes would be always better.

From my point of view, INSERT ON CONFLICT UPDATE should never fail with
"ERROR:  duplicate key value violates unique constraint" because the main
idea of upsert is to avoid such situations.
So, this behavior is expected by the majority of users and, probably, is even documented.

On the other side, REINDEX CONCURRENTLY should not cause any queries to
fail accidentally without any clear reason.

Also, as you can see from the topic starter letter, we could see errors
like this:

* ERROR:  duplicate key value violates unique constraint "tbl_pkey"
* ERROR:  duplicate key value violates unique constraint "tbl_pkey_ccnew"
* ERROR:  duplicate key value violates unique constraint "tbl_pkey_ccold"

So, the first error message does not provide any clue for the developer to
understand what happened.

> - The planner ignores indexes with !indisvalid.
> - The executor ignores indexes with !indislive.

Yes, and it feels like we need one more flag here to distinguish
!indisvalid indexes which are going to become valid from those which are
going to become !indislive.

For example, let us name it indiscorrect (meaning the index contains all
the data). In that case, we may use the following logic:

1) !indisvalid && !indiscorrect - the index is probably in the validation
phase; do not use it as an arbiter because it does not contain all the
data yet
2) !indisvalid && indiscorrect - the index will most likely be dropped. Do
not plan new queries with it, but it still may be used by other queries
(including upserts). So, we still need to include it in the arbiters.

And, during the reindex concurrently:

1) begin; mark new index as indisvalid and indiscorrect; mark old one as
!indisvalid but still indiscorrect. invalidate relcache; commit;

Currently, some queries are still using the old one as arbiter, some
queries use both.

2) WaitForLockersMultiple

Now all queries use both indexes as arbiter.

3) begin; mark old index as !indiscorrect, additionally to !indisvalid;
invalidate cache; commit;

Now, some queries use only the new index, but some still use both.

4)  WaitForLockersMultiple;

Now, all queries use only the new index - we are safe to mark the old
one as !indislive.
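The rule from cases 1 and 2 above condenses into a small predicate; this is a sketch only, and 'indiscorrect' is the proposed flag, not an existing catalog column:

```python
def include_in_arbiters(indisvalid: bool, indiscorrect: bool) -> bool:
    """Decide whether an index may serve as an ON CONFLICT arbiter
    under the proposed scheme.  'indiscorrect' means the index already
    contains all the data."""
    if indisvalid:
        return True  # an ordinary valid index
    # !indisvalid: either still being validated (skip it), or on its
    # way out after the swap (keep it until no query can rely on it).
    return indiscorrect

# During validation: the new index does not yet contain all the data,
# so it must not be an arbiter.
assert include_in_arbiters(indisvalid=False, indiscorrect=False) is False
# After the swap: the old index is complete and must stay an arbiter
# until WaitForLockersMultiple shows no query uses it any more.
assert include_in_arbiters(indisvalid=False, indiscorrect=True) is True
```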

> It should, but I'm wondering if that's necessary for two reasons.
In that case, it becomes:

Assert(indexRelation->rd_index->indiscorrect);
Assert(indexRelation->rd_index->indislive);

and it is always the valid check.

Best regards,
Mikhail.


Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-06-25 Thread Andres Freund
Hi,

On 2024-06-24 16:35:50 -0400, Robert Haas wrote:
> On Mon, Jun 24, 2024 at 3:23 PM Melanie Plageman
>  wrote:
> > Are you more concerned about having a single horizon for pruning or
> > about having a horizon that does not move backwards after being
> > established at the beginning of vacuuming the relation?
> 
> I'm not sure I understand. The most important thing here is fixing the
> bug. But if we have a choice of how to fix the bug, I'd prefer to do
> it by having the pruning code test one horizon that is always correct,
> rather than (as I think the patch does) having it test against two
> horizons as a way of covering possible discrepancies between
> those values.

I think that's going in the wrong direction. We *want* to prune more
aggressively if we can (*), the necessary state is represented by the
vistest. That's a different thing than *having* to prune tuples beyond a
certain xmin (the cutoff determined by vacuum.c/vacuumlazy.c). The problem
we're having here is that the two states can get out of sync due to the
vistest "moving backwards", because of hot_standby_feedback (and perhaps also
an issue around aborts).

To prevent that we can
a) make sure that we always take the hard cutoff into account
b) prevent vistest from going backwards

(*) we really ought to become more aggressive, by removing intermediary row
versions when they're not visible to anyone, but not yet old enough to be
below the horizon. But that realistically will only be possible in *some*
cases, e.g. when the predecessor row version is on the same page.
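Option (b) amounts to a monotonic clamp on the horizon. A minimal sketch, with XIDs modeled as plain integers (real code must use wraparound-aware comparisons such as TransactionIdFollows, and the class below is an invented stand-in for GlobalVisState):

```python
class VisTest:
    """Toy visibility-horizon holder that refuses to move backwards."""

    def __init__(self, horizon: int):
        self.maybe_needed = horizon

    def update(self, recomputed: int) -> None:
        # Accept a newer horizon; silently ignore an older one, e.g.
        # one pushed backwards by hot_standby_feedback arriving
        # mid-vacuum.
        if recomputed > self.maybe_needed:
            self.maybe_needed = recomputed

vistest = VisTest(horizon=1000)
vistest.update(1200)  # horizon advanced: prune more aggressively
vistest.update(900)   # feedback moved it back: do NOT regress
print(vistest.maybe_needed)  # -> 1200
```

This keeps pruning decisions consistent with the cutoff established at the start of the vacuum, without giving up the ability to prune more aggressively when the horizon genuinely advances.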



> > I had always thought it was because the vacuuming backend's
> > GlobalVisState will get updated periodically throughout vacuum and so,
> > assuming the oldest running transaction changes, our horizon for
> > vacuum would change. But, in writing this repro, it is actually quite
> > hard to get GlobalVisState to update. Our backend's RecentXmin needs
> > to have changed. And there aren't very many places where we take a new
> > snapshot after starting to vacuum a relation. One of those is at the
> > end of index vacuuming, but that can only affect the pruning horizon
> > if we have to do multiple rounds of index vacuuming. Is that really
> > the case we are thinking of when we say we want the pruning horizon to
> > move forward during vacuum?
> 
> I thought the idea was that the GlobalVisTest stuff would force a
> recalculation now and then, but maybe it doesn't actually do that?

It forces an accurate horizon to be determined the first time it would require
it to determine visibility. The "first time" is determined by RecentXmin not
having changed.

The main goal of the vistest stuff was to not have to determine an accurate
horizon in GetSnapshotData(). Determining an accurate horizon requires
accessing each backend's ->xmin, which causes things to scale badly, as ->xmin
changes so frequently.

The cost of determining the accurate horizon is irrelevant for vacuums, but
it's not at all irrelevant for on-access pruning.


> Suppose process A begins a transaction, acquires an XID, and then goes
> idle. Process B now begins a giant vacuum. At some point in the middle
> of the vacuum, A ends the transaction. Are you saying that B's
> GlobalVisTest never really notices that this has happened?

Not at the moment, but we should add heuristics like that.


Greetings,

Andres Freund




Re: Proposal: Document ABI Compatibility

2024-06-25 Thread Dagfinn Ilmari Mannsåker
Peter Eisentraut  writes:

> On 24.06.24 22:26, David E. Wheeler wrote:
>>> But now we're talking about API.  That might be subject of another
>> document or another section in this one, but it seems confusing to mix
>> this with the ABI discussion.
>> Hrm. They’re super closely-related in my mind, as an extension
>> developer. I need to know both! I guess I’m thinking of this policy as
>> what I can expect may be changed (and how to adapt to it) and what
>> won’t.
>> That said, I’m fine to remove the API stuff if there’s consensus
>> objecting to it, to be defined in a separate policy (perhaps on the
>> same doc page).
>
> I took a stab at this, using some of your text, but discussing API
> and ABI separately.

This looks good to me, just one minor nitpick:

> ### Minor versions
>
> PostgreSQL makes an effort to avoid server API breaks in minor
> releases.  In general, extension code that compiles and works with
> some minor release should also compile and work with any other minor
> release, past or future.

I think this should explicitly say "any other minor release within [or
"from" or "of"?] the same major version" (and ditto in the ABI section).

- ilmari




Re: PostgreSQL does not compile on macOS SDK 15.0

2024-06-25 Thread Aleksander Alekseev
Hi,

> I've reproduced this issue by:
>
> 1. Download the XCode 16 beta 2 ZIP file:
> https://developer.apple.com/services-account/download?path=/Developer_Tools/Xcode_16_beta/Xcode_16_beta.xip
> 2. Extract this to `/tmp`.
> 3. Then I ran:
>
> export 
> PATH=/tmp/Xcode-beta.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin:$PATH
> export 
> SDKROOT=/tmp/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.0.sdk
> export 
> XCODE_DIR=/tmp/Xcode-beta.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain
> export CC="$XCODE_DIR/usr/bin/clang" export CXX="$XCODE_DIR/usr/bin/clang++"
>
> ./configure CC="$CC" CXX="$CXX"
> make

Does it work if you do the same with XCode 15?

Perhaps I'm missing something but to me it doesn't look like the
right/supported way of compiling PostgreSQL on this platform [1]. I
tried to figure out what version of Xcode I'm using right now, but it
seems to be none:

$ /usr/bin/xcodebuild -version
xcode-select: error: tool 'xcodebuild' requires Xcode, but active
developer directory '/Library/Developer/CommandLineTools' is a command
line tools instance

Clang I'm using doesn't seem to be part of XCode distribution either:

$ clang --version
Homebrew clang version 18.1.6
Target: x86_64-apple-darwin23.5.0
Thread model: posix
InstalledDir: /usr/local/opt/llvm/bin

It's been a while since I installed all the dependencies on my laptop,
but I'm pretty confident I followed the documentation back then.

IMO the right way to test PostgreSQL against the recent beta version
of MacOS SDK would be replacing (via a symlink perhaps) the SDK
provided by the "Command Line Tools for Xcode" package
(/Library/Developer/CommandLineTools/SDKs/). Or alternatively finding
the official way of installing the beta version of this package.

[1]: 
https://www.postgresql.org/docs/current/installation-platform-notes.html#INSTALLATION-NOTES-MACOS

-- 
Best regards,
Aleksander Alekseev




Re: Logical Replication of sequences

2024-06-25 Thread Shlok Kyal
On Thu, 20 Jun 2024 at 18:24, vignesh C  wrote:
>
> On Wed, 19 Jun 2024 at 20:33, vignesh C  wrote:
> >
> > On Tue, 18 Jun 2024 at 16:10, Amit Kapila  wrote:
> > >
> > >
> > > Agreed and I am not sure which is better because there is a value in
> > > keeping the state name the same for both sequences and tables. We
> > > probably need more comments in code and doc updates to make the
> > > behavior clear. We can start with the sequence state as 'init' for
> > > 'needs-to-be-synced' and 'ready' for 'synced' and can change if others
> > > feel so during the review.
> >
> > Here is a patch which does the sequence synchronization in the
> > following lines from the above discussion:
> > This commit introduces sequence synchronization during 1) creation of
> > subscription for initial sync of sequences 2) refresh publication to
> > synchronize the sequences for the newly created sequences 3) refresh
> > publication sequences for synchronizing all the sequences.
> > 1) During subscription creation with CREATE SUBSCRIPTION (no syntax change):
> >- The subscriber retrieves sequences associated with publications.
> >- Sequences  are added in the 'init' state to the pg_subscription_rel 
> > table.
> >- Sequence synchronization worker will be started if there are any
> > sequences to be synchronized
> >- A new sequence synchronization worker handles synchronization in
> > batches of 100 sequences:
> >  a) Retrieves sequence values using pg_sequence_state from the 
> > publisher.
> >  b) Sets sequence values accordingly.
> >  c) Updates sequence state to 'READY' in pg_subscription_rel
> >  d) Commits batches of 100 synchronized sequences.
> > 2) Refreshing sequences with ALTER SUBSCRIPTION ... REFRESH
> > PUBLICATION (no syntax change):
> >- Stale sequences are removed from pg_subscription_rel.
> >- Newly added sequences in the publisher are added in 'init' state
> > to pg_subscription_rel.
> >- Sequence synchronization will be done by sequence sync worker as
> > listed in subscription creation process.
> >- Sequence synchronization occurs for newly added sequences only.
> > 3) Introduce new command ALTER SUBSCRIPTION ... REFRESH PUBLICATION
> > SEQUENCES  for refreshing all sequences:
> >- Removes stale sequences and adds newly added sequences from the
> > publisher to pg_subscription_rel.
> >- Resets all sequences in pg_subscription_rel to 'init' state.
> >- Initiates sequence synchronization for all sequences by sequence
> > sync worker as listed in subscription creation process.
>
> Here is an updated patch with a few fixes to remove an unused
> function, changed a few references of table to sequence and added one
> CHECK_FOR_INTERRUPTS in the sequence sync worker loop.

Hi Vignesh,

I have reviewed the patches and I have following comments:

= tablesync.c ==
1. process_syncing_sequences_for_apply can crash with:
2024-06-21 15:25:17.208 IST [3681269] LOG:  logical replication apply
worker for subscription "test1" has started
2024-06-21 15:28:10.127 IST [3682329] LOG:  logical replication
sequences synchronization worker for subscription "test1" has started
2024-06-21 15:28:10.146 IST [3682329] LOG:  logical replication
synchronization for subscription "test1", sequence "s1" has finished
2024-06-21 15:28:10.149 IST [3682329] LOG:  logical replication
synchronization for subscription "test1", sequence "s2" has finished
2024-06-21 15:28:10.149 IST [3682329] LOG:  logical replication
sequences synchronization worker for subscription "test1" has finished
2024-06-21 15:29:53.535 IST [3682767] LOG:  logical replication
sequences synchronization worker for subscription "test1" has started
TRAP: failed Assert("nestLevel > 0 && (nestLevel <= GUCNestLevel ||
(nestLevel == GUCNestLevel + 1 && !isCommit))"), File: "guc.c", Line:
2273, PID: 3682767
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (ExceptionalCondition+0xbb)[0x5b2a61861c99]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (AtEOXact_GUC+0x7b)[0x5b2a618bddfa]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (RestoreUserContext+0xc7)[0x5b2a618a6937]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (+0x1ff7dfa)[0x5b2a61115dfa]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (+0x1ff7eb4)[0x5b2a61115eb4]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (SequencesyncWorkerMain+0x33)[0x5b2a61115fe7]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (BackgroundWorkerMain+0x4ad)[0x5b2a61029cae]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (postmaster_child_launch+0x236)[0x5b2a6102fb36]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (+0x1f1d12a)[0x5b2a6103b12a]
postgres: logical replication sequencesync worker for subscription

Re: Meson far from ready on Windows

2024-06-25 Thread Dave Page
Hi

On Tue, 25 Jun 2024 at 12:39, Andres Freund  wrote:

> Hi,
>
> On 2024-06-25 11:54:56 +0100, Dave Page wrote:
> > https://github.com/dpage/winpgbuild proves that the hacks above are not
> > required *if* you build the dependencies in the recommended way for use
> > with MSVC++ (where documented), otherwise just native Windows.
>
> Partially it just means that some of the hacks are now located in the
> "build
> dependencies" script.  E.g. you're renaming libintl.dll.a, libiconv.dll.a,
> libuuid.a to something that's expected by the buildmethod. And the scripts
> change the directory structure for several other dependencies (e.g. zstd,
> krb).
>
>
> > If you, for example, build a dependency using Mingw/Msys, then you may
> get
> > different filenames than if you build the same thing using its VC++
> > solution or makefile. That's where most, if not all, of these issues come
> > from.
>
> Yes, that's precisely my point. The set of correct names / flags depends on
> things outside of postgres control. Hence they should be handled outside of
> postgres, not as part of postgres. Particularly because several of the
> dependencies can be built in multiple ways, resulting in multiple library
> names. And it doesn't even just differ by compiler, there's ways to get
> different library names for some of the deps even with the same compiler!
>

Replying to this and your previous email.

I think we're in violent agreement here as to how things *should* be in an
ideal world. The issue for me is that this isn't an ideal world and the
current solution potentially makes it much harder to get a working
PostgreSQL build on Windows - not only that, but everyone doing those
builds potentially has to figure out how to get things to work for
themselves because we're pushing the knowledge outside of our build system.

I've been building Postgres on Windows for years (and like to think I'm
reasonably competent), and despite reading the docs I still ended up
starting multiple threads on the hackers list to try to understand how to
get v17 to build. Would we accept the current Meson setup on Linux if
people had to hand-craft .pc files, or install dependencies using multiple
third-party package managers to get it to work?

As I previously noted, I think we should default to pkgconfig/cmake
detection, but then fall back to what we did previously (with suitably
obnoxious warnings). Then at least a build environment that worked in the
past should work in the future.

-- 
Dave Page
pgAdmin: https://www.pgadmin.org
PostgreSQL: https://www.postgresql.org
EDB: https://www.enterprisedb.com


Re: improve ssl error code, 2147483650

2024-06-25 Thread Peter Eisentraut

On 21.06.24 16:53, Tom Lane wrote:

Peter Eisentraut  writes:

-   strlcpy(errbuf, strerror(ERR_GET_REASON(ecode)), SSL_ERR_LEN);
+   strerror_r(ERR_GET_REASON(ecode), errbuf, SSL_ERR_LEN);


Most of libpq gets at strerror_r via SOCK_STRERROR for Windows
portability.  Is that relevant here?


Looking inside the OpenSSL code, it makes no efforts to translate 
between winsock error codes and standard error codes, so I don't think 
our workaround/replacement code needs to do that either.


Btw., our source code comments say something like 
"ERR_reason_error_string randomly refuses to map system errno values." 
The reason it doesn't is exactly that it can't do it while maintaining 
thread-safety.  Here is the relevant commit: 
https://github.com/openssl/openssl/commit/71f2994b15







Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-06-25 Thread Robert Haas
On Tue, Jun 25, 2024 at 8:03 AM Andres Freund  wrote:
> I think that's going in the wrong direction. We *want* to prune more
> aggressively if we can (*), the necessary state is represented by the
> vistest. That's a different thing than *having* to prune tuples beyond a
> certain xmin (the cutoff determined by vacuum.c/vacuumlazy.c). The problem
> we're having here is that the two states can get out of sync due to the
> vistest "moving backwards", because of hot_standby_feedback (and perhaps also
> an issue around aborts).

I agree that we want to prune more aggressively if we can. I think
that fixing this by preventing vistest from going backward is
reasonable, and I like it better than what Melanie proposed, although
I like what Melanie proposed much better than not fixing it! I'm not
sure how to do that cleanly, but one of you may have an idea.

I do think that having a bunch of different XID values that function
as horizons and a vistest object that holds some more XID horizons
floating around in vacuum makes the code hard to understand. The
relationships between the various values are not well-documented. For
instance, the vistest has to be after vacrel->cutoffs.OldestXmin for
correctness, but I don't think there's a single comment anywhere
saying that; meanwhile, the comments for VacuumCutoffs say "OldestXmin
is the Xid below which tuples deleted by any xact (that committed)
should be considered DEAD, not just RECENTLY_DEAD." Surely the reader
can be forgiven for thinking that this is the cutoff that will
actually be used by pruning, but it isn't.

And more generally, it seems like a fairly big problem to me that
LVRelState directly stores NewRelfrozenXid; contains a VacuumCutoffs
object that stores relfrozenxid, OldestXmin, and FreezeLimit; and also
points to a GlobalVisState object that contains definitely_needed and
maybe_needed. That is six different XID cutoffs for one vacuum
operation. That's a lot. I can't describe how they're all different
from each other or what the necessary relationships between them are
off-hand, and I bet nobody else could either, at least until recently,
else we might not have this bug. I feel like if it were possible to
have fewer of them and still have things work, we'd be better off. I'm
not sure that's doable. But six seems like a lot.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-06-25 Thread Andres Freund
On 2024-06-25 08:42:02 -0400, Robert Haas wrote:
> On Tue, Jun 25, 2024 at 8:03 AM Andres Freund  wrote:
> > I think that's going in the wrong direction. We *want* to prune more
> > aggressively if we can (*), the necessary state is represented by the
> > vistest. That's a different thing than *having* to prune tuples beyond a
> > certain xmin (the cutoff determined by vacuum.c/vacuumlazy.c). The problem
> > we're having here is that the two states can get out of sync due to the
> > vistest "moving backwards", because of hot_standby_feedback (and perhaps 
> > also
> > an issue around aborts).
> 
> I agree that we want to prune more aggressively if we can. I think
> that fixing this by preventing vistest from going backward is
> reasonable, and I like it better than what Melanie proposed, although
> I like what Melanie proposed much better than not fixing it! I'm not
> sure how to do that cleanly, but one of you may have an idea.

It's not hard - but it has downsides. It'll mean that - outside of vacuum -
we'll much more often not react to horizons going backwards due to
hot_standby_feedback. Which means that hot_standby_feedback, when used without
slots, will prevent fewer conflicts.


> I do think that having a bunch of different XID values that function
> as horizons and a vistest object that holds some more XID horizons
> floating around in vacuum makes the code hard to understand. The
> relationships between the various values are not well-documented. For
> instance, the vistest has to be after vacrel->cutoffs.OldestXmin for
> correctness, but I don't think there's a single comment anywhere
> saying that;

It is somewhat documented:

 * Note: the approximate horizons (see definition of GlobalVisState) are
 * updated by the computations done here. That's currently required for
 * correctness and a small optimization. Without doing so it's possible that
 * heap vacuum's call to heap_page_prune_and_freeze() uses a more conservative
 * horizon than later when deciding which tuples can be removed - which the
 * code doesn't expect (breaking HOT).


> And more generally, it seems like a fairly big problem to me that
> LVRelState directly stores NewRelfrozenXid; contains a VacuumCutoffs
> object that stores relfrozenxid, OldestXmin, and FreezeLimit; and also
> points to a GlobalVisState object that contains definitely_needed and
> maybe_needed. That is six different XID cutoffs for one vacuum
> operation. That's a lot. I can't describe how they're all different
> from each other or what the necessary relationships between them are
> off-hand, and I bet nobody else could either, at least until recently,
> else we might not have this bug. I feel like if it were possible to
> have fewer of them and still have things work, we'd be better off. I'm
> not sure that's doable. But six seems like a lot.

Agreed. I don't think you can just unify things though, they actually are all
different for good, or at least decent, reasons.  I think improving the naming
alone could help a good bit though.

Greetings,

Andres Freund




RE: Pgoutput not capturing the generated columns

2024-06-25 Thread Hayato Kuroda (Fujitsu)
Dear Shlok,

Thanks for updating patches! Below are my comments, maybe only for 0002.

01. General

IIUC, we have not discussed why ALTER SUBSCRIPTION ... SET include_generated_columns
is prohibited. Previously, that seemed okay because there were exclusive options. But
now such restrictions are gone. Do you have a reason in mind, or has it just not been
considered yet?

02. General

According to the docs, a column can be altered to a non-generated one with the ALTER
TABLE ... ALTER COLUMN ... DROP EXPRESSION command. I'm not sure what should happen
when that command is executed on the subscriber while the data is being copied. Should
we continue the copy or restart it? What do you think?

03. Test code

IIUC, REFRESH PUBLICATION can also trigger table synchronization. Can you add
a test for that?

04. Test code (maybe for 0001)

Please test the combination with the ALTER TABLE ... ALTER COLUMN ... DROP EXPRESSION
command.

05. logicalrep_rel_open

```
+/*
+ * In case 'include_generated_columns' is 'false', we should skip the
+ * check of missing attrs for generated columns.
+ * In case 'include_generated_columns' is 'true', we should check if
+ * the corresponding column for the generated column in the publication
+ * column list is present in the subscription table.
+ */
+if (!MySubscription->includegencols && attr->attgenerated)
+{
+entry->attrmap->attnums[i] = -1;
+continue;
+}
```

This comment is not very clear to me, because here we do not skip anything.
Can you clarify why attnums[i] is set to -1 and how it will be used?

06. make_copy_attnamelist

```
+gencollist = palloc0(MaxTupleAttributeNumber * sizeof(bool));
```

I think this array is too large. Can we reduce its size to (desc->natts *
sizeof(bool))? Also, the array should be freed afterwards.

07. make_copy_attnamelist

```
+/* Loop to handle subscription table generated columns. */
+for (int i = 0; i < desc->natts; i++)
```

IIUC, the loop is needed to find generated columns on the subscriber side,
right? Can you clarify that in a comment?

08. copy_table

```
+/*
+ * Regular table with no row filter and 'include_generated_columns'
+ * specified as 'false' during creation of subscription.
+ */
```

I think this comment is not correct. After patching, every tablesync command
becomes COPY (SELECT ...) if include_generated_columns is set to true. Is that
right? Can we restrict that to tables that actually have generated columns?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



Re: pg_combinebackup --clone doesn't work

2024-06-25 Thread Peter Eisentraut

On 21.06.24 18:10, Tomas Vondra wrote:

On 6/21/24 00:07, Tomas Vondra wrote:

Here's a fix adding the missing headers to pg_combinebackup, and fixing
some compile-time issues in the ifdef-ed block.

I've done some basic manual testing today - I plan to test this a bit
more tomorrow, and I'll also look at integrating this into the existing
tests.



Here's a bit more complete / cleaned patch, adding the testing changes
in separate parts.

0001 adds the missing headers / fixes the now-accessible code a bit

0002 adds the --copy option for consistency with pg_upgrade


This looks good.


0003 adds the PG_TEST_PG_COMBINEBACKUP_MODE, so that we can override the
copy method for tests


I had imagined that we combine PG_TEST_PG_UPGRADE_MODE and this new one 
into one setting.  But maybe that's something to consider with less time 
pressure for PG18.


> I believe 0001-0003 are likely non-controversial, although if someone
> could take a look at the Perl in 0003 that'd be nice. Also, 0002 seems
> nice not only because of consistency with pg_upgrade, but it also makes
> 0003 easier as we don't need to special-case the default mode etc.

Right, that was one of the reasons.


0004 tweaks two of the Cirrus CI tasks to use --clone/--copy-file-range



I'm not sure about 0004 - I initially did this mostly to check we have
the right headers on other platforms, but not sure we want to actually
do this. Or maybe we want to test a different combination (e.g. also
test the --clone on Linux)?


It's tricky to find the right balance here.  We had to figure this out 
for pg_upgrade as well.  I think your solution is good, and we should 
also add test coverage for pg_upgrade --copy-file-range in the same 
place.






Re: Direct SSL connection and ALPN loose ends

2024-06-25 Thread Vladimir Sitnikov
I reviewed the documentation for "direct ALPN connections", and it looks
like it could be improved.
Here's the link:
https://www.postgresql.org/docs/17/protocol-flow.html#PROTOCOL-FLOW-SSL

The currently suggested values for "sslnegotiation" are "direct" and
"postgres".
The project name is PostgreSQL and the ALPN name is postgresql. Is there a
reason why property value uses "postgres"?
Can the value be renamed to postgresql for consistency?

Regarding "SSL": technically, the proper term is TLS, and even the document
refers to "IANA TLS ALPN Protocol IDs" (TLS, not SSL).
I would not die on that hill; however, going for tlsnegotiation would look
better than sslnegotiation.

Vladimir


Re: PostgreSQL does not compile on macOS SDK 15.0

2024-06-25 Thread Aleksander Alekseev
Hi,

> IMO the right way to test PostgreSQL against the recent beta version
> of MacOS SDK would be replacing (via a symlink perhaps) the SDK
> provided by the "Command Line Tools for Xcode" package
> (/Library/Developer/CommandLineTools/SDKs/). Or alternatively finding
> the official way of installing the beta version of this package.

As it turned out there is a Command_Line_Tools_for_Xcode_16_beta.dmg
package available. It can be downloaded from
https://developer.apple.com/ after logging in. I installed it and also
did:

```
cd /Library/Developer/CommandLineTools/SDKs
sudo mkdir __ignore__
sudo mv MacOSX14.* __ignore__
```

... to make sure Postgres will not find the older version of SDK (it
did until I made this step).

Now I get the following output from `meson setup ...`:

```
The Meson build system
Version: 0.61.2
Source dir: /Users/eax/projects/c/postgresql
Build dir: /Users/eax/projects/c/postgresql/build
Build type: native build
Project name: postgresql
Project version: 17beta2
C compiler for the host machine: cc (clang 16.0.0 "Apple clang version
16.0.0 (clang-1600.0.20.10)")
C linker for the host machine: cc ld64 1115.5.3
Host machine cpu family: x86_64
Host machine cpu: x86_64
Run-time dependency threads found: YES
Message: darwin sysroot: /Library/Developer/CommandLineTools/SDKs/MacOSX15.0.sdk
...
```

... and get the error reported by Stan. Also I can confirm that the
proposed workaround fixes it. Attached is the result of `git
format-patch` for convenience.

Personally I'm not extremely happy with this workaround though. An
alternative solution would be adding the "pg_" prefix to our type
declarations.

Another question is whether we should fix this while the SDK is in
beta or only after it is released.

Thoughts?

I added the patch to the nearest commitfest so that it wouldn't be lost [1].

[1]: https://commitfest.postgresql.org/48/5073/

-- 
Best regards,
Aleksander Alekseev


v1-0001-Dirty-fix.patch
Description: Binary data


Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-06-25 Thread Robert Haas
On Mon, Jun 24, 2024 at 9:19 AM Jelte Fennema-Nio  wrote:
> > > Patch 3: Similar to 1 & 2 in that it has no actual effect yet. But
> > > after bumping the version this would be a user visible API change, so
> > > I expect it requires a bit more discussion.
> >
> > I don't know if this is the right idea or not. An alternative would be
> > to leave this alone and add PQprotocolMinorVersion().
>
> I considered that, but that makes the API a lot more annoying to use
> for end users when they want to compare to a version, especially when
> they want to include the major version too.
>
> PQprotocolVersion() >= 30001
>
> vs
>
> PQprotocolVersion() > 3 || (PQprotocolVersion() == 3 &&
> PQprotocolVersionMinor() >= 1)
>
> Given that PQprotocolVersion currently has no practical use, because
> it always returns 3 in practice. I personally think that changing the
> behaviour of API in the way I suggested is the best option here.

Mmm, I was thinking of defining the new function to return the major
and minor version in one number, while the existing function could
continue to work as now. I see your point too, but I'd like to hear
some other opinions before we decide, because I think it's unclear
what is actually best here. And also: surely this is not the hill to
die on. If it makes others a lot happier to do it one way or the
other, I'm severely disinclined to spend energy arguing with those
people. And, on the basis of previous experience, this is exactly the
sort of question about which opinions sometimes run quite strongly. I
would much rather swim with the current than get what I would myself
prefer.

> > > Patch 5: Starts using the new connection option from Patch 4
> >
> > Not sure yet whether I like this approach.
>
> Could you explain a bit more what you don't like about it?

I don't dislike it; I'm just not sure whether it is the best approach
to testing the remainder of the patch series. Perhaps the commit
message could explain more why this approach was chosen and what value
we get from it.

> Fair enough. I changed them a bit now, do you think they are good now?

I'll try to re-review the patch set when I have a bit more time than I
do at this exact moment.

> > > Patch 7: Bump the protocol version to 3.2 (see commit message for why
> > > not bumping to 3.1)
> >
> > Good commit message. The point is arguable, so putting forth your best
> > argument is important.
>
> Just to clarify: do you agree with the point now, after that argument?

Well, here again, I would like to know what other people think. It
doesn't intrinsically matter to me that much what we do here, but it
matters to me a lot that extensive recriminations don't ensue
afterwards.

> I did consider an error message (and I think that is what we discussed
> in-person too). But during implementation a WARNING together with a
> simple error indicator seemed nicer since that could hook into the
> existing infrastructure for reporting warnings (both server and client
> side). e.g. you can now provide detail/context/errorcode in the
> warning, without having to add all those to the
> SetProtocolParameterComplete message. I don't feel super strongly
> either way though, so If you prefer the error message to be part of
> the SetProtocolParameterComplete message then I'm happy to change
> that.

My general programming experience has been that any time I decide to
include an error flag rather than an error message, I end up
regretting it. It's possible that this case is, for some reason, an
exception to that principle, but I feel like we need to nail down
exactly what the protocol flow *and* the libpq API for these new
messages is going to be before I can really have an intelligent
opinion about that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Recommended books for admin

2024-06-25 Thread Tom Browder
I’m a complete novice, although I’ve dipped my toes in Admin waters a
couple of times in my many years of using Linux.

Can anyone recommend a good book on installing Postgres on multiple,
connected multiuser systems, tuning it, managing users, backups, updates,
etc.

A cookbook/checklist approach would be great. I’ve bought several books
over the years but a more current one is desirable.

Thanks for any help.

Best regards,

-Tom


Re: Recommended books for admin

2024-06-25 Thread Kashif Zeeshan
Hi Tom

There is a lot of stuff available online; you just need to find it. The
official PG documentation is extensive too.

Regards
Kashif Zeeshan

On Tue, Jun 25, 2024 at 7:04 PM Tom Browder  wrote:

> I’m a complete novice, although I’ve dipped my toes in Admin waters a
> couple of times in my many years of using Linux.
>
> Can anyone recommend a good book on installing Postgres on multiple,
> connected multiuser systems, tuning it, managing users, backups, updated,
> etc.
>
> A cookbook/checklist approach would be great. I’ve bought several books
> over the years but a more current one is desirable.
>
> Thanks for any help.
>
> Best regards,
>
> -Tom
>


Re: Recommended books for admin

2024-06-25 Thread Bill Smith
Check out the left side bar

https://www.postgresql.org/docs/

> On Jun 25, 2024, at 10:04 AM, Tom Browder  wrote:
> 
> I’m a complete novice, although I’ve dipped my toes in Admin waters a couple 
> of times in my many years of using Linux.
> 
> Can anyone recommend a good book on installing Postgres on multiple, 
> connected multiuser systems, tuning it, managing users, backups, updated, etc.
> 
> A cookbook/checklist approach would be great. I’ve bought several books over 
> the years but a more current one is desirable.
> 
> Thanks for any help.
> 
> Best regards,
> 
> -Tom



Re: Recommended books for admin

2024-06-25 Thread Muhammad Ikram
Hi ,

Here is a lately published book

https://www.amazon.com/PostgreSQL-Administration-Cookbook-real-world-challenges-ebook/dp/B0CP5PPSTQ


Muhammad Ikram

Bitnine Global

On Tue, 25 Jun 2024 at 19:09, Bill Smith 
wrote:

> Check out the left side bar
>
> https://www.postgresql.org/docs/
>
>
> On Jun 25, 2024, at 10:04 AM, Tom Browder  wrote:
>
> I’m a complete novice, although I’ve dipped my toes in Admin waters a
> couple of times in my many years of using Linux.
>
> Can anyone recommend a good book on installing Postgres on multiple,
> connected multiuser systems, tuning it, managing users, backups, updated,
> etc.
>
> A cookbook/checklist approach would be great. I’ve bought several books
> over the years but a more current one is desirable.
>
> Thanks for any help.
>
> Best regards,
>
> -Tom
>
>
>




Re: Direct SSL connection and ALPN loose ends

2024-06-25 Thread Dave Cramer
On Tue, 25 Jun 2024 at 09:37, Vladimir Sitnikov 
wrote:

> I reviewed the documentation for "direct ALPN connections", and it looks
> like it could be improved.
> Here's the link:
> https://www.postgresql.org/docs/17/protocol-flow.html#PROTOCOL-FLOW-SSL
>
> The currently suggested values for "sslnegotiation" are "direct" and
> "postgres".
> The project name is PostgreSQL and the ALPN name is postgresql. Is there a
> reason why property value uses "postgres"?
> Can the value be renamed to postgresql for consistency?
>

+1 I found it strange that we are not using postgresql

>
> "SSL". Technically, the proper term is TLS, and even the document refers
> to "IANA TLS ALPN Protocol IDs" (TLS, not SSL).
> I would not die on that hill, however, going for tlsnegotiation would look
> better than sslnegotiation.
>

+1 again, unusual to use SSL when this really is TLS.

Dave


Re: improve ssl error code, 2147483650

2024-06-25 Thread Tom Lane
Peter Eisentraut  writes:
> On 21.06.24 16:53, Tom Lane wrote:
>> Most of libpq gets at strerror_r via SOCK_STRERROR for Windows
>> portability.  Is that relevant here?

> Looking inside the OpenSSL code, it makes no efforts to translate 
> between winsock error codes and standard error codes, so I don't think 
> our workaround/replacement code needs to do that either.

Fair enough.

> Btw., our source code comments say something like 
> "ERR_reason_error_string randomly refuses to map system errno values." 
> The reason it doesn't is exactly that it can't do it while maintaining 
> thread-safety.

Ah.  Do you want to improve that comment while you're at it?

regards, tom lane




Re: add a new explain option including_query for include query string inside the json plan output

2024-06-25 Thread Tom Lane
Matthias van de Meent  writes:
> On Tue, 25 Jun 2024 at 10:55, jian he  wrote:
>> for json format, add a new option to let the explain json output also
>> include the actual query string.

> Wouldn't the user be able to keep track of the query they wanted
> explained by themselves? If not, why?

Indeed.  I do not think this is a good idea at all, even if the
question of "where did you get the query string from" could be
resolved satisfactorily.

regards, tom lane




Re: Backporting BackgroundPsql

2024-06-25 Thread Tom Lane
Andres Freund  writes:
> On 2024-06-25 13:26:23 +0300, Heikki Linnakangas wrote:
>> 1. Write the new test differently on backbranches. Before 664d757531, the
>> test needs to work a lot harder to use the background psql session, calling
>> pump() etc. That's doable, but as noted in the discussion that led to
>> 664d757531, it's laborious and error-prone.
>> 
>> 2. Backport commit 664d757531. This might break out-of-tree perl tests that
>> use the background_psql() function. I don't know if any such tests exist,
>> and they would need to be changed for v17 anyway, so that seems acceptable.
>> Anyone aware of any extensions using the perl test modules?
>> 
>> 3. Backport commit 664d757531, but keep the existing background_psql()
>> function unchanged. Add a different constructor to get the v17-style
>> BackgroundPsql session, something like "$node->background_psql_new()".

> Yes, I've wished for this a couple times. I think 2 or 3 would be reasonable.
> I think 1) often just leads to either tests not being written or being
> fragile...

I'd vote for (2).  (3) is just leaving a foot-gun for people to
hurt themselves with.

regards, tom lane




Re: PostgreSQL does not compile on macOS SDK 15.0

2024-06-25 Thread Andres Freund
Hi,

On 2024-06-25 16:49:32 +0300, Aleksander Alekseev wrote:
> ... to make sure Postgres will not find the older version of SDK (it
> did until I made this step).

You should be able to influence that by specifying -Ddarwin_sysroot=...


> ... and get the error reported by Stan. Also I can confirm that the
> proposed workaround fixes it. Attached is the result of `git
> format-patch` for convenience.
> 
> Personally I'm not extremely happy with this workaround though.

Yea, it seems decidedly not great.


> An alternative solution would be adding the "pg_" prefix to our type
> declarations.

A third approach would be to make sure we don't include xlocale.h from
pg_locale.h.  IMO pg_locale currently exposes too many implementation details,
neither xlocale.h nor ucol.h should be included in it, that should be in a C
file.


> Another question is whether we should fix this while the SDK is in
> beta or only after it is released.

Yea.

Greetings,

Andres Freund




Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-06-25 Thread Robert Haas
On Tue, Jun 25, 2024 at 9:07 AM Andres Freund  wrote:
> It's not hard - but it has downsides. It'll mean that - outside of vacuum -
> we'll much more often not react to horizons going backwards due to
> hot_standby_feedback. Which means that hot_standby_feedback, when used without
> slots, will prevent fewer conflicts.

Can you explain this in more detail?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg_combinebackup --clone doesn't work

2024-06-25 Thread Tomas Vondra



On 6/25/24 15:21, Peter Eisentraut wrote:
> On 21.06.24 18:10, Tomas Vondra wrote:
>> On 6/21/24 00:07, Tomas Vondra wrote:
>>> Here's a fix adding the missing headers to pg_combinebackup, and fixing
>>> some compile-time issues in the ifdef-ed block.
>>>
>>> I've done some basic manual testing today - I plan to test this a bit
>>> more tomorrow, and I'll also look at integrating this into the existing
>>> tests.
>>>
>>
>> Here's a bit more complete / cleaned patch, adding the testing changes
>> in separate parts.
>>
>> 0001 adds the missing headers / fixes the now-accessible code a bit
>>
>> 0002 adds the --copy option for consistency with pg_upgrade
> 
> This looks good.
> 
>> 0003 adds the PG_TEST_PG_COMBINEBACKUP_MODE, so that we can override the
>> copy method for tests
> 
> I had imagined that we combine PG_TEST_PG_UPGRADE_MODE and this new one
> into one setting.  But maybe that's something to consider with less time
> pressure for PG18.
> 

Yeah. I initially planned to combine those options into a single one,
because it seems like it's not very useful to have them set differently,
and because it's easier to not have different options between releases.
But then I realized PG_TEST_PG_UPGRADE_MODE was added in 16, so this
ship already sailed - so no reason to rush this into 18.

>> I believe 0001-0003 are likely non-controversial, although if someone
>> could take a look at the Perl in 0003 that'd be nice. Also, 0002 seems
>> nice not only because of consistency with pg_upgrade, but it also makes
>> 0003 easier as we don't need to special-case the default mode etc.
> 
> Right, that was one of the reasons.
> 
>> 0004 tweaks two of the Cirrus CI tasks to use --clone/--copy-file-range
> 
>> I'm not sure about 0004 - I initially did this mostly to check we have
>> the right headers on other platforms, but not sure we want to actually
>> do this. Or maybe we want to test a different combination (e.g. also
>> test the --clone on Linux)?
> 
> It's tricky to find the right balance here.  We had to figure this out
> for pg_upgrade as well.  I think your solution is good, and we should
> also add test coverage for pg_upgrade --copy-file-range in the same
> place, I think.
> 

Yeah. I'm not sure if we need to decide this now, or if we can tweak the
testing even for released branches.

IMHO the main challenge is to decide which combinations we actually want
to test on CI. It'd be nice to test each platform with all modes it
supports (I guess for backups that wouldn't be a bad thing). But that'd
require expanding the number of combinations, and I don't think that's
likely.

Maybe it'd be possible to have a second CI config, with additional
combinations, but not triggered after each push?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-06-25 Thread Alena Rybakina

On 24.06.2024 17:37, Melanie Plageman wrote:

On Mon, Jun 24, 2024 at 4:10 AM Alena Rybakina  wrote:

We can fix this by always removing tuples considered dead before
VacuumCutoffs->OldestXmin. This is okay even if a reconnected standby
has a transaction that sees that tuple as alive, because it will
simply wait to replay the removal until it would be correct to do so
or recovery conflict handling will cancel the transaction that sees
the tuple as alive and allow replay to continue.

This is an interesting and difficult case) I noticed that when initializing the 
cluster we, in my opinion, do excessive freezing. Initialization takes a 
long time, which can lead, for example, to longer test execution. I got rid of 
this by adding a check that OldestMxact is not FirstMultiXactId, and it works 
fine.


if (prstate->cutoffs &&
    TransactionIdIsValid(prstate->cutoffs->OldestXmin) &&
    prstate->cutoffs->OldestMxact != FirstMultiXactId &&
    NormalTransactionIdPrecedes(dead_after, prstate->cutoffs->OldestXmin))
        return HEAPTUPLE_DEAD;

Can I keep it?

This looks like an addition to the new criteria I added to
heap_prune_satisfies_vacuum(). Is that what you are suggesting? If so,
it looks like it would only return HEAPTUPLE_DEAD (and thus only
remove) a subset of the tuples my original criteria would remove. When
vacuum calculates OldestMxact as FirstMultiXactId, it would not remove
those tuples deleted before OldestXmin. It seems like OldestMxact will
equal FirstMultiXactID sometimes right after initdb and after
transaction ID wraparound. I'm not sure I totally understand the
criteria.

One thing I find confusing about this is that this would actually
remove less tuples than with my criteria -- which could lead to more
freezing. When vacuum calculates OldestMxact == FirstMultiXactID, we
would not remove tuples deleted before OldestXmin and thus return
HEAPTUPLE_RECENTLY_DEAD for those tuples. Then we would consider
freezing them. So, it seems like we would do more freezing by adding
this criteria.

Could you explain more about how the criteria you are suggesting
works? Are you saying it does less freezing than master or less
freezing than with my patch?


At first, I noticed that with this patch, vacuum fouls the nodes more 
often than before, and it seemed to me that more time was spent initializing the 
cluster with this patch than before, so I suggested considering this condition. After checking again, 
I found that the problem was with my laptop. So, sorry for the noise.



Attached is the suggested fix for master plus a repro. I wrote it as a
recovery suite TAP test, but I am _not_ proposing we add it to the
ongoing test suite. It is, amongst other things, definitely prone to
flaking. I also had to use loads of data to force two index vacuuming
passes now that we have TIDStore, so it is a slow test.

-- snip --

I have a modified version of this that repros the infinite loop on
14-16 with substantially less data. See it here [2]. Also, the repro
attached to this mail won't work on 14 and 15 because of changes to
background_psql.

I couldn't understand why the replica is necessary here. Now I am digging into 
why I got similar behavior without a replica, with only one instance. I'm 
still checking this in my test, but I believe this patch fixes the original 
problem because the symptoms were the same.

Did you get similar behavior on master or on back branches? Was the
behavior you observed the infinite loop or the error during
heap_prepare_freeze_tuple()?

In my examples, the replica is needed because something has to move
the horizon on the primary backwards. When a standby reconnects with
an older oldest running transaction ID than any of the running
transactions on the primary and the vacuuming backend recomputes its
RecentXmin, the horizon may move backwards when compared to the
horizon calculated at the beginning of the vacuum. Vacuum does not
recompute cutoffs->OldestXmin during vacuuming a relation but it may
recompute the values in the GlobalVisState it uses for pruning.

We knew of only one other way that the horizon could move backwards
which Matthias describes here [1]. However, this is thought to be its
own concurrency-related bug in the commit-abort path that should be
fixed -- as opposed to the standby reconnecting with an older oldest
running transaction ID which can be expected.

Do you know if you were seeing the effects of the scenario Matthias describes?


[1] https://www.postgresql.org/message-id/CAEze2WjMTh4KS0%3DQEQB-Jq%2BtDLPR%2B0%2BzVBMfVwSPK5A%3DWZa95Q%40mail.gmail.com
I'm sorry, I need a little more time to figure this out. I will answer 
this question later.


--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: PostgreSQL does not compile on macOS SDK 15.0

2024-06-25 Thread Tom Lane
Andres Freund  writes:
> On 2024-06-25 16:49:32 +0300, Aleksander Alekseev wrote:
>> Another question is whether we should fix this while the SDK is in
>> beta or only after it is released.

> Yea.

Stan has started multiple threads about this, which is not doing
anyone any favors, but that issue was already brought up in

https://www.postgresql.org/message-id/flat/4edd2d3c30429c4445cc805ae9a788c489856eb7.1719265762.git.stanhu%40gmail.com

I think the immediate action item should be to push back on the
change and see if we can get Apple to undo it.  If we have to
fix it on our side, it is likely to involve API-breaking changes
that will cause trouble for extensions.  The more so because
we'll have to change stable branches too.

I tend to agree with the idea that not including <xlocale.h>
so widely might be the least-bad fix; but that still risks
breaking code that was dependent on that inclusion.

regards, tom lane




Re: RFC: Additional Directory for Extensions

2024-06-25 Thread Robert Haas
On Tue, Jun 25, 2024 at 6:12 AM Alvaro Herrera  wrote:
> Now, I'm not saying that this is an easy journey.  But if we don't
> start, we're not going to get there.

I actually kind of agree with you. I think I've said something similar
in a previous email to the list somewhere. But I don't agree that this
patch should be burdened with taking the first step. We seem to often
find reasons why patches that packagers for prominent distributions
are carrying shouldn't be put into core, and I think that's a bad
habit. They're not going to stop applying those packages because we
refuse to put suitable functionality in core; we're just creating a
situation where lots of people are running slightly patched versions
of PostgreSQL instead of straight-up PostgreSQL. That's not improving
anything. If we want to work on making the sorts of changes that
you're proposing, let's do it on a separate thread. It's not going to
be meaningfully harder to move in that direction after some patch like
this than it is today.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Fix possible overflow of pg_stat DSA's refcnt

2024-06-25 Thread Anthonin Bonnefoy
Hi,

During backend initialisation, pgStat DSA is attached using
dsa_attach_in_place with a NULL segment. The NULL segment means that
there's no callback to release the DSA when the process exits.
pgstat_detach_shmem only calls dsa_detach which, as mentioned in the
function's comment, doesn't include releasing and doesn't decrement the
reference count of pgStat DSA.

Thus, every time a backend is created, pgStat DSA's refcnt is incremented,
but it is never decremented when the backend shuts down. It will eventually
overflow and wrap around to 0, triggering the "could not attach to dynamic
shared area" error on all newly created backends. When this state is reached,
the only way to recover is to restart the db to reset the counter.

The issue can be observed by calling dsa_dump in pgstat_detach_shmem and
checking that refcnt's value continuously increases as new backends are
created. It is also possible to reach the state where all connections are
refused by editing the refcnt manually with lldb/gdb (the alternative,
creating enough backends to reach 0, exists but can take some time). Setting
it to -10 and then opening 10 connections will eventually generate the
"could not attach" error.

This patch fixes this issue by releasing pgStat DSA with
dsa_release_in_place during pgStat shutdown to correctly decrement the
refcnt.

Regards,
Anthonin


v1-0001-Fix-possible-overflow-of-pg_stat-DSA-s-refcnt.patch
Description: Binary data


Re: Backporting BackgroundPsql

2024-06-25 Thread Melanie Plageman
On Tue, Jun 25, 2024 at 7:40 AM Andres Freund  wrote:
>
> Hi,
>
> On 2024-06-25 13:26:23 +0300, Heikki Linnakangas wrote:
> > While fixing a recent bug on visibility on a standby [1], I wrote a
> > regression test that uses BackgroundPsql to run some queries in a
> > long-running psql session. The problem is that that was refactored in v17,
> > commit 664d757531. The test I wrote for v17 doesn't work as it is on
> > backbranches. Options:
> >
> > 1. Write the new test differently on backbranches. Before 664d757531, the
> > test needs to work a lot harder to use the background psql session, calling
> > pump() etc. That's doable, but as noted in the discussion that led to
> > 664d757531, it's laborious and error-prone.
> >
> > 2. Backport commit 664d757531. This might break out-of-tree perl tests that
> > use the background_psql() function. I don't know if any such tests exist,
> > and they would need to be changed for v17 anyway, so that seems acceptable.
> > Anyone aware of any extensions using the perl test modules?
> >
> > 3. Backport commit 664d757531, but keep the existing background_psql()
> > function unchanged. Add a different constructor to get the v17-style
> > BackgroundPsql session, something like "$node->background_psql_new()".
> >
> > I'm leaning towards 3. We might need to backport more perl tests that use
> > background_psql() in the future, backporting the test module will make that
> > easier.
> >
> > Thoughts?
>
> Yes, I've wished for this a couple times. I think 2 or 3 would be reasonable.
> I think 1) often just leads to either tests not being written or being
> fragile...

+1 to backporting background psql!

I'm also okay with 2 or 3. But note that for 2, there are several
tests with skip_all and at least one of them uses background_psql
(031_recovery_conflict.pl), so we'll just have to remember to update
those. I assume that is easy enough to do if you grep for
background_psql -- but, just in case you were going to be 100%
test-driven :)

- Melanie




Re: improve predefined roles documentation

2024-06-25 Thread Nathan Bossart
On Mon, Jun 24, 2024 at 03:53:46PM -0700, David G. Johnston wrote:
> pg_database_owner owns the initially created public schema and has an
> implicit membership list of one - the role owning the connected-to database.
> It exists to encourage and facilitate best practices regarding database
> administration.  The primary rule being to avoid using superuser to own or
> do things.

This part restates much of the existing text in a slightly different order,
but I'm not sure it's an improvement.  I like that it emphasizes the intent
of the role, but the basic description of the role is kind-of buried in the
first sentence.  IMO the way this role works is confusing enough that we
ought to keep the basic facts at the very top.  I might even add a bit of
fluff in an attempt to make things clearer:

The pg_database_owner role always has exactly one implicit,
situation-dependent member, namely the owner of the current database.

One other thing I like about your proposal is that it moves the bit about
the role initially owning the public schema much earlier.  That seems like
possibly the most important practical piece of information to convey to
administrators.  Perhaps that could be the very next thing after the basic
description of the role.

> The bootstrap superuser thus should connect to the postgres
> database and create a login role, with the createdb attribute, and then use
> that role to create and administer additional databases.  In that context,
> this feature allows the creator of the new database to log into it and
> immediately begin working in the public schema.

IMHO the majority of this is too prescriptive, even if it's generally good
advice.

> As a result, in version 14, PostgreSQL no longer initially grants create
> and usage privileges, on the public schema, to the public pseudo-role.

IME we tend to shy away from adding too many historical details in the
documentation, and I'm not sure this information is directly related enough
to the role to include here.

> For technical reasons, pg_database_owner may not participate in explicitly
> granted role memberships.  This is an easily mitigated limitation since the
> role that owns the database may be a group and any inheriting members of
> that group will be considered owners as well.

IIUC the intent of this is to expand on the following sentence in the
existing docs:

pg_database_owner cannot be a member of any role, and it cannot have
non-implicit members.

My instinct would be to do something like this:

pg_database_owner cannot be granted membership in any role, and no role
may be granted non-implicit membership in pg_database_owner.

IMHO the part about mitigating this limitation via groups is again too
prescriptive.

-- 
nathan




Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-06-25 Thread Melanie Plageman
On Tue, Jun 25, 2024 at 10:31 AM Robert Haas  wrote:
>
> On Tue, Jun 25, 2024 at 9:07 AM Andres Freund  wrote:
> > It's not hard - but it has downsides. It'll mean that - outside of vacuum -
> > we'll much more often not react to horizons going backwards due to
> > hot_standby_feedback. Which means that hot_standby_feedback, when used 
> > without
> > slots, will prevent fewer conflicts.
>
> Can you explain this in more detail?

If we prevent GlobalVisState from moving backward, then we would less
frequently be pushing the horizon backward on the primary in response
to hot standby feedback. Then, the primary would do more things that
would not be safely replayable on the standby -- so the standby could
end up encountering more recovery conflicts.

- Melanie




Re: Direct SSL connection and ALPN loose ends

2024-06-25 Thread Jacob Champion
On Thu, Jun 20, 2024 at 4:32 PM Jacob Champion
 wrote:
> Thanks,
> --Jacob

Hey Heikki,

[sending this to the list in case it's not just me]

I cannot for the life of me get GMail to deliver your latest message,
even though I see it on postgresql.org. It's not in spam; it's just
gone. I wonder if it's possibly the Perl server script causing
virus-scanner issues?

--Jacob




Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY

2024-06-25 Thread Michail Nikolaev
Hello, Noah!

Answering
https://www.postgresql.org/message-id/flat/20240612194857.1c.nmisch%40google.com#684361ba86bad11f4e9fd84dfa8e0084

> On your other thread, it would be useful to see stack traces from the
high-CPU
> processes once the live lock has ended all query completion.

I was wrong, it is not a livelock, it is a deadlock, actually. I missed it
because pgbench retries deadlocks automatically.

It looks like this:

2024-06-25 17:16:17.447 CEST [711743] 007_concurrently_unique_stuck.pl
ERROR:  deadlock detected
2024-06-25 17:16:17.447 CEST [711743] 007_concurrently_unique_stuck.pl
DETAIL:  Process 711743 waits for ShareLock on transaction 3633; blocked by
process 711749.
Process 711749 waits for ShareLock on speculative token 2 of transaction
3622; blocked by process 711743.
Process 711743: INSERT INTO tbl VALUES(13,89318) on conflict(i) do update
set n = tbl.n + 1 RETURNING n
Process 711749: INSERT INTO tbl VALUES(13,41011) on conflict(i) do update
set n = tbl.n + 1 RETURNING n
2024-06-25 17:16:17.447 CEST [711743] 007_concurrently_unique_stuck.pl
HINT:  See server log for query details.
2024-06-25 17:16:17.447 CEST [711743] 007_concurrently_unique_stuck.pl
CONTEXT:  while inserting index tuple (15,145) in relation "tbl_pkey_ccnew"
2024-06-25 17:16:17.447 CEST [711743] 007_concurrently_unique_stuck.pl
STATEMENT:  INSERT INTO tbl VALUES(13,89318) on conflict(i) do update set n
= tbl.n + 1 RETURNING n

Stacktraces:

-

INSERT INTO tbl VALUES(13,41011) on conflict(i) do update set n = tbl.n + 1
RETURNING n

#0  in epoll_wait (epfd=5, events=0x1203328, maxevents=1, timeout=-1) at
../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  in WaitEventSetWaitBlock (set=0x12032c0, cur_timeout=-1,
occurred_events=0x7ffcc4e38e30, nevents=1) at latch.c:1570
#2  in WaitEventSetWait (set=0x12032c0, timeout=-1,
occurred_events=0x7ffcc4e38e30, nevents=1, wait_event_info=50331655) at
latch.c:1516
#3  in WaitLatch (latch=0x7acb2a2f5f14, wakeEvents=33, timeout=0,
wait_event_info=50331655) at latch.c:538
#4  in ProcSleep (locallock=0x122f778, lockMethodTable=0x1037340
, dontWait=false) at proc.c:1355
#5  in WaitOnLock (locallock=0x122f778, owner=0x1247408, dontWait=false) at
lock.c:1833
#6  in LockAcquireExtended (locktag=0x7ffcc4e39220, lockmode=5,
sessionLock=false, dontWait=false, reportMemoryError=true, locallockp=0x0)
at lock.c:1046
#7  in LockAcquire (locktag=0x7ffcc4e39220, lockmode=5, sessionLock=false,
dontWait=false) at lock.c:739
#8  in SpeculativeInsertionWait (xid=3622, token=2) at lmgr.c:833
#9  in _bt_doinsert (rel=0x7acb2dbb12e8, itup=0x12f1308,
checkUnique=UNIQUE_CHECK_YES, indexUnchanged=true, heapRel=0x7acb2dbb0f08)
at nbtinsert.c:225
#10 in btinsert (rel=0x7acb2dbb12e8, values=0x7ffcc4e39440,
isnull=0x7ffcc4e39420, ht_ctid=0x12ebe20, heapRel=0x7acb2dbb0f08,
checkUnique=UNIQUE_CHECK_YES, indexUnchanged=true, indexInfo=0x12f08a8) at
nbtree.c:195
#11 in index_insert (indexRelation=0x7acb2dbb12e8, values=0x7ffcc4e39440,
isnull=0x7ffcc4e39420, heap_t_ctid=0x12ebe20, heapRelation=0x7acb2dbb0f08,
checkUnique=UNIQUE_CHECK_YES, indexUnchanged=true, indexInfo=0x12f08a8) at
indexam.c:230
#12 in ExecInsertIndexTuples (resultRelInfo=0x12eaa00, slot=0x12ebdf0,
estate=0x12ea560, update=true, noDupErr=false, specConflict=0x0,
arbiterIndexes=0x0, onlySummarizing=false) at execIndexing.c:438
#13 in ExecUpdateEpilogue (context=0x7ffcc4e39870,
updateCxt=0x7ffcc4e3962c, resultRelInfo=0x12eaa00, tupleid=0x7ffcc4e39732,
oldtuple=0x0, slot=0x12ebdf0) at nodeModifyTable.c:2130
#14 in ExecUpdate (context=0x7ffcc4e39870, resultRelInfo=0x12eaa00,
tupleid=0x7ffcc4e39732, oldtuple=0x0, slot=0x12ebdf0, canSetTag=true) at
nodeModifyTable.c:2478
#15 in ExecOnConflictUpdate (context=0x7ffcc4e39870,
resultRelInfo=0x12eaa00, conflictTid=0x7ffcc4e39732,
excludedSlot=0x12f05b8, canSetTag=true, returning=0x7ffcc4e39738) at
nodeModifyTable.c:2694
#16 in ExecInsert (context=0x7ffcc4e39870, resultRelInfo=0x12eaa00,
slot=0x12f05b8, canSetTag=true, inserted_tuple=0x0, insert_destrel=0x0) at
nodeModifyTable.c:1048
#17 in ExecModifyTable (pstate=0x12ea7f0) at nodeModifyTable.c:4059
#18 in ExecProcNodeFirst (node=0x12ea7f0) at execProcnode.c:464
#19 in ExecProcNode (node=0x12ea7f0) at
../../../src/include/executor/executor.h:274
#20 in ExecutePlan (estate=0x12ea560, planstate=0x12ea7f0,
use_parallel_mode=false, operation=CMD_INSERT, sendTuples=true,
numberTuples=0, direction=ForwardScanDirection, dest=0x12daac8,
execute_once=true) at execMain.c:1646
#21 in standard_ExecutorRun (queryDesc=0x12dab58,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:363
#22 in ExecutorRun (queryDesc=0x12dab58, direction=ForwardScanDirection,
count=0, execute_once=true) at execMain.c:304
#23 in ProcessQuery (plan=0x12e1360, sourceText=0x12083b0 "INSERT INTO tbl
VALUES(13,41011) on conflict(i) do update set n = tbl.n + 1 RETURNING n ",
params=0x0, queryEnv=0x0, dest=0x12daac8, qc=0x7ffcc4e39ae0) at pquery.c:160
#24 in Portal

Re: Backporting BackgroundPsql

2024-06-25 Thread Andrew Dunstan


On 2024-06-25 Tu 10:26 AM, Tom Lane wrote:

Andres Freund  writes:

On 2024-06-25 13:26:23 +0300, Heikki Linnakangas wrote:

1. Write the new test differently on backbranches. Before 664d757531, the
test needs to work a lot harder to use the background psql session, calling
pump() etc. That's doable, but as noted in the discussion that led to
664d757531, it's laborious and error-prone.

2. Backport commit 664d757531. This might break out-of-tree perl tests that
use the background_psql() function. I don't know if any such tests exist,
and they would need to be changed for v17 anyway, so that seems acceptable.
Anyone aware of any extensions using the perl test modules?

3. Backport commit 664d757531, but keep the existing background_psql()
function unchanged. Add a different constructor to get the v17-style
BackgroundPsql session, something like "$node->background_psql_new()".

Yes, I've wished for this a couple times. I think 2 or 3 would be reasonable.
I think 1) often just leads to either tests not being written or being
fragile...

I'd vote for (2).  (3) is just leaving a foot-gun for people to
hurt themselves with.





+1


I'd like to get rid of it in its current form at least. Just about all 
the uses I'm aware of could be transformed to use the Session object 
I've been working on, based either on FFI or a small XS wrapper for some 
of libpq.


cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: Direct SSL connection and ALPN loose ends

2024-06-25 Thread Jacob Champion
On Tue, Jun 25, 2024 at 7:20 AM Dave Cramer  wrote:
>
> On Tue, 25 Jun 2024 at 09:37, Vladimir Sitnikov  
> wrote:
>>
>> "SSL". Technically, the proper term is TLS, and even the document refers to 
>> "IANA TLS ALPN Protocol IDs" (TLS, not SSL).
>> I would not die on that hill, however, going for tlsnegotiation would look 
>> better than sslnegotiation.
>
> +1 again, unusual to use SSL when this really is TLS.

This was sort of litigated last ye-(checks notes) oh no, three years ago:


https://www.postgresql.org/message-id/flat/CE12DD5C-4BB3-4166-BC9A-39779568734C%40yesql.se

I'm on your side when it comes to the use of the TLS acronym, personally,
but I think introducing a brand new option that interfaces with
sslmode and sslrootcert and etc. while not being named like them would
be outright unhelpful. And the idea of switching everything to use TLS
in docs seemed to be met with a solid "meh" on the other thread.

--Jacob




Re: Injection point locking

2024-06-25 Thread Noah Misch
On Tue, Jun 25, 2024 at 11:14:57AM +0900, Michael Paquier wrote:
> On Mon, Jun 24, 2024 at 01:29:38PM +0300, Heikki Linnakangas wrote:
> > InjectionPointRun() acquires InjectionPointLock, looks up the hash entry,
> > and releases the lock:
> > 
> > >   LWLockAcquire(InjectionPointLock, LW_SHARED);
> > >   entry_by_name = (InjectionPointEntry *)
> > >   hash_search(InjectionPointHash, name,
> > >   HASH_FIND, &found);
> > >   LWLockRelease(InjectionPointLock);
> > 
> > Later, it reads fields from the entry it looked up:
> > 
> > >   /* not found in local cache, so load and register */
> > >   snprintf(path, MAXPGPATH, "%s/%s%s", pkglib_path,
> > >entry_by_name->library, DLSUFFIX);
> > 
> > Isn't that a straightforward race condition, if the injection point is
> > detached in between?
> 
> This is a feature, not a bug :)
> 
> Jokes apart, this is a behavior that Noah was looking for so as it is
> possible to detach a point to emulate what a debugger would do with a
> breakpoint for some of his tests with concurrent DDL bugs, so not
> taking a lock while running a point is important.  It's true, though,
> that we could always delay the LWLock release once the local cache is
> loaded, but would it really matter?

I think your last sentence is what Heikki is saying should happen, and I
agree.  Yes, it matters.  As written, InjectionPointRun() could cache an
entry_by_name->function belonging to a different injection point.




Re: improve predefined roles documentation

2024-06-25 Thread Robert Haas
On Tue, Jun 25, 2024 at 11:35 AM Nathan Bossart
 wrote:
> IIUC the intent of this is to expand on the following sentence in the
> existing docs:
>
> pg_database_owner cannot be a member of any role, and it cannot have
> non-implicit members.
>
> My instinct would be to do something like this:
>
> pg_database_owner cannot be granted membership in any role, and no 
> role
> may be granted non-implicit membership in pg_database_owner.

But you couldn't grant someone implicit membership either, because
then it wouldn't be implicit. So maybe something like this:

pg_database_owner is a predefined role for which membership consists,
implicitly, of the current database owner. It cannot be granted
membership in any role, and no role can be granted membership in
pg_database_owner. However, like any role, it can own objects or
receive grants of access privileges. Consequently, once
pg_database_owner has rights within a template database, each owner of
a database instantiated from that template will exercise those rights.
Initially, this role owns the public schema, so each database owner
governs local use of the schema.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: improve predefined roles documentation

2024-06-25 Thread Nathan Bossart
On Tue, Jun 25, 2024 at 12:16:30PM -0400, Robert Haas wrote:
> pg_database_owner is a predefined role for which membership consists,
> implicitly, of the current database owner. It cannot be granted
> membership in any role, and no role can be granted membership in
> pg_database_owner. However, like any role, it can own objects or
> receive grants of access privileges. Consequently, once
> pg_database_owner has rights within a template database, each owner of
> a database instantiated from that template will exercise those rights.
> Initially, this role owns the public schema, so each database owner
> governs local use of the schema.

The main difference between this and the existing documentation is that the
sentence on membership has been rephrased and moved to earlier in the
paragraph.  I think this helps the logical flow a bit.  We first talk about
implicit membership, then explicit membership, then we talk about
privileges and the consequences of those privileges, and finally we talk
about the default privileges.  So, WFM.

-- 
nathan




Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-06-25 Thread Robert Haas
On Tue, Jun 25, 2024 at 11:39 AM Melanie Plageman
 wrote:
> On Tue, Jun 25, 2024 at 10:31 AM Robert Haas  wrote:
> > On Tue, Jun 25, 2024 at 9:07 AM Andres Freund  wrote:
> > > It's not hard - but it has downsides. It'll mean that - outside of vacuum 
> > > -
> > > we'll much more often not react to horizons going backwards due to
> > > hot_standby_feedback. Which means that hot_standby_feedback, when used 
> > > without
> > > slots, will prevent fewer conflicts.
> >
> > Can you explain this in more detail?
>
> If we prevent GlobalVisState from moving backward, then we would less
> frequently be pushing the horizon backward on the primary in response
> to hot standby feedback. Then, the primary would do more things that
> would not be safely replayable on the standby -- so the standby could
> end up encountering more recovery conflicts.

I don't get it. hot_standby_feedback only moves horizons backward on
the primary, AFAIK, when it first connects, or when it reconnects.
Which I guess could be frequent for some users with flaky networks,
but does that really rise to the level of "much more often"?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Add ACL (Access Control List) acronym

2024-06-25 Thread Nathan Bossart
On Tue, Jun 25, 2024 at 08:10:24AM +0200, Joel Jacobson wrote:
> On Tue, Jun 25, 2024, at 07:11, Michael Paquier wrote:
>> v1 is fine without the "privileges list" part mentioned by Nathan in
>> the first reply.
> 
> v2 is exactly that, but renamed and attached, so we have an entry this
> was the last version.

LGTM

-- 
nathan




Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-06-25 Thread Andres Freund
Hi,

On 2024-06-25 12:31:11 -0400, Robert Haas wrote:
> On Tue, Jun 25, 2024 at 11:39 AM Melanie Plageman
>  wrote:
> > On Tue, Jun 25, 2024 at 10:31 AM Robert Haas  wrote:
> > > On Tue, Jun 25, 2024 at 9:07 AM Andres Freund  wrote:
> > > > It's not hard - but it has downsides. It'll mean that - outside of 
> > > > vacuum -
> > > > we'll much more often not react to horizons going backwards due to
> > > > hot_standby_feedback. Which means that hot_standby_feedback, when used 
> > > > without
> > > > slots, will prevent fewer conflicts.
> > >
> > > Can you explain this in more detail?
> >
> > If we prevent GlobalVisState from moving backward, then we would less
> > frequently be pushing the horizon backward on the primary in response
> > to hot standby feedback. Then, the primary would do more things that
> > would not be safely replayable on the standby -- so the standby could
> > end up encountering more recovery conflicts.
> 
> I don't get it. hot_standby_feedback only moves horizons backward on
> the primary, AFAIK, when it first connects, or when it reconnects.
> Which I guess could be frequent for some users with flaky networks,
> but does that really rise to the level of "much more often"?

Well, the thing is that with the "prevent it from going backwards" approach,
once the horizon is set to something recent in a backend, it's "sticky".  If a
replica is a bit behind, or there's a long-lived snapshot, and the replica
disconnects, the vistest state will advance beyond where the replica needs it
to be.  Even
if the standby later reconnects, the vistest in long-lived sessions will still
have the more advanced state. So all future pruning these backends do runs
into the risk of performing pruning that removes rows the standby still deems
visible and thus causes recovery conflicts.

I.e. you don't even need frequent disconnects, you just need one disconnect
and sessions that aren't short-lived.
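
To make the failure mode concrete, here is a toy model in Python (invented
names, not the actual GlobalVisState code) of a horizon that, once advanced,
is never allowed to move backward:

```python
class StickyVisTest:
    """Toy 'sticky' visibility horizon: update() never lets the horizon
    move backward, even if hot-standby feedback later reports an older
    xmin after a reconnect."""

    def __init__(self):
        self.horizon = 0

    def update(self, computed_horizon):
        # "Prevent it from going backwards": keep the maximum ever seen.
        self.horizon = max(self.horizon, computed_horizon)
        return self.horizon


vt = StickyVisTest()
vt.update(100)  # standby connected; feedback holds the horizon at xid 100
vt.update(500)  # standby disconnects; this session's horizon advances to 500
vt.update(120)  # standby reconnects wanting 120, but the horizon stays at 500
assert vt.horizon == 500  # pruning may now remove rows the standby still sees
```

Any long-lived session holding such state keeps pruning with the advanced
horizon, which is the recovery-conflict risk described above.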

That said, obviously there will be plenty of setups where this won't cause an
issue. I don't really have a handle on how often it'd be a problem.


Greetings,

Andres Freund




Re: RFC: Additional Directory for Extensions

2024-06-25 Thread David E. Wheeler
On Jun 24, 2024, at 5:32 PM, Jelte Fennema-Nio  wrote:

> Still, for the sake of completeness it might make sense to support
> this whole list in extension_destdir. (assuming it's easy to do)

It should be with the current patch, which just uses a prefix to paths in 
`pg_config`. So if SHAREDIR is set to /usr/share/postgresql/16 and 
extension_destdir is set to /mount/ext, then Postgres will look for files in 
/mount/ext/usr/share/postgresql/16. The same rule applies (or should apply) for 
all other pg_config directory configs and where the postmaster looks for 
specific files. And PGXS already supports installing files in these locations, 
thanks to its DESTDIR param.

(I don’t know how it works on Windows, though.)

That said, this is very much a pattern designed for RPM and Debian package 
management patterns, and not for actually installing and managing extensions. 
And maybe that’s fine for now, as it can still be used to address the 
immutability problems described in the original post in this thread.

Ultimately, I’d like to figure out a way to more tidily organize installed 
extension files, but I think that, too, might be a separate thread.

Best,

David





Re: New standby_slot_names GUC in PG 17

2024-06-25 Thread Nathan Bossart
On Tue, Jun 25, 2024 at 02:02:09PM +0530, Amit Kapila wrote:
> On Tue, Jun 25, 2024 at 12:30 PM Masahiko Sawada  
> wrote:
>> On Tue, Jun 25, 2024 at 1:54 PM Amit Kapila  wrote:
>> > So, my
>> > preference is in order as follows: synchronized_standby_slots,
>> > wait_for_standby_slots, logical_replication_wait_slots,
>> > logical_replication_synchronous_slots, and
>> > logical_replication_synchronous_standby_slots.
>>
>> I also prefer synchronized_standby_slots.
>>
>> From a different angle just for discussion, is it worth considering
>> the term 'failover' since the purpose of this feature is to ensure a
>> standby to be ready for failover in terms of logical replication? For
>> example, failover_standby_slot_names?
> 
> I feel synchronized better indicates the purpose because we ensure
> such slots are synchronized before we process changes for logical
> failover slots. We already have a 'failover' option for logical slots
> which could make things confusing if we add 'failover' where physical
> slots need to be specified.

I'm fine with synchronized_standby_slots.

-- 
nathan




Re: PostgreSQL does not compile on macOS SDK 15.0

2024-06-25 Thread Stan Hu
Thanks, everyone. Sorry to create multiple threads on this.

As I mentioned in the other thread, I've already submitted a bug
report to Apple (FB14047412). My colleagues know a key macOS engineer,
and they have reached out to him to review that thread and bug report.
I'll update if we hear anything.

On Tue, Jun 25, 2024 at 7:39 AM Tom Lane  wrote:
>
> Andres Freund  writes:
> > On 2024-06-25 16:49:32 +0300, Aleksander Alekseev wrote:
> >> Another question is whether we should fix this while the SDK is in
> >> beta or only after it is released.
>
> > Yea.
>
> Stan has started multiple threads about this, which is not doing
> anyone any favors, but that issue was already brought up in
>
> https://www.postgresql.org/message-id/flat/4edd2d3c30429c4445cc805ae9a788c489856eb7.1719265762.git.stanhu%40gmail.com
>
> I think the immediate action item should be to push back on the
> change and see if we can get Apple to undo it.  If we have to
> fix it on our side, it is likely to involve API-breaking changes
> that will cause trouble for extensions.  The more so because
> we'll have to change stable branches too.
>
> I tend to agree with the idea that not including 
> so widely might be the least-bad fix; but that still risks
> breaking code that was dependent on that inclusion.
>
> regards, tom lane




RE: Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-06-25 Thread Amonson, Paul D
> It would be good to know exactly what, if any, changes the Intel lawyers want
> us to make to our license if we accept this patch.

I asked about this, and there is nothing Intel requires here license-wise. They 
believe there is nothing wrong with including 3-Clause BSD-like licenses 
under the PostgreSQL license. They only specified that, for the source file, the 
applicable license needs to be present either as a link (which was previously 
discouraged in this thread) or as the full text. Please note that I checked, and 
for this specific Chromium license there is no SPDX identifier, so the entire 
text is required.

Thanks,
Paul





Re: Proposal for Updating CRC32C with AVX-512 Algorithm.

2024-06-25 Thread Bruce Momjian
On Tue, Jun 25, 2024 at 05:41:12PM +, Amonson, Paul D wrote:
> > It would be good to know exactly what, if any, changes the Intel
> > lawyers want us to make to our license if we accept this patch.
>
> I asked about this, and there is nothing Intel requires here
> license-wise. They believe there is nothing wrong with including
> 3-Clause BSD-like licenses under the PostgreSQL license. They only
> specified that, for the source file, the applicable license needs to
> be present either as a link (which was previously discouraged in
> this thread) or as the full text. Please note that I checked, and
> for this specific Chromium license there is no SPDX identifier, so
> the entire text is required.

Okay, that is very interesting.  Yes, we will have no problem
reproducing the exact license text in the source code.  I think we can
remove the license issue as a blocker for this patch.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Patch bug: Fix jsonpath .* on Arrays

2024-06-25 Thread David E. Wheeler
On Jun 25, 2024, at 12:46 AM, Степан Неретин  wrote:

> Hi! Looks good to me, but I have several comments.

Thanks for your review!

> Your patch improves tests, but why did you change formatting in 
> jsonpath_exec.c? What's the motivation?

It’s not just formatting. From the commit message:

> While at it, teach `executeAnyItem()` to return `jperOk` when `found`
> exist, not because it will be used (the result and `found` are inspected
> by different functions), but because it seems like the proper thing to
> return from `executeAnyItem()` when considered in isolation.


I have since realized it’s not a complete fix for the issue, and hacked around 
it in my Go version. Would be fine to remove that bit, but IIRC this was the 
only execution function that would return `jperNotFound` when it in fact adds 
items to the `found` list. The current implementation only looks at one or the 
other, so it’s not super important, but I found the inconsistency annoying and 
sometimes confusing.

>  [1] select jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', 'strict $.*');
> I propose adding a similar test with explicitly specified lax mode: select 
> jsonb_path_query('[1,2,3,{"b": [3,4,5]}]', 'lax $.*'); to show what lax mode 
> is set by default.

Very few of the other tests do so; I can add it if it’s important for this 
case, though.

> Odd empty result for the test: select jsonb '[1,2,3,{"b": [3,4,5]}]' @? 
> 'strict $.*';
> I expected an error like in test [1]. This behavior is not obvious to me.

@? suppresses a number of errors. Perhaps I should add a variant of the 
error-raising query that passes the silent arg, which would also suppress the 
error.

> Everything else is cool. Thanks to the patch and the discussion above, I 
> began to understand better how wildcards in JSON work.

Yeah, it’s kind of wild TBH.

Best,

David





Re: Proposal: Document ABI Compatibility

2024-06-25 Thread David E. Wheeler
On Jun 25, 2024, at 7:33 AM, Peter Eisentraut  wrote:

> I took a stab at this, using some of your text, but discussing API and ABI
> separately.

Oh man this is fantastic, thank you! I’d be more than happy to just turn this 
into a patch. But where should it go? Upthread I assumed xfunc.sgml, and still 
think that’s a likely candidate. Perhaps I’ll just start there --- unless 
someone thinks it should go somewhere other than the docs.

Best,

David





Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-06-25 Thread Robert Haas
On Tue, Jun 25, 2024 at 1:10 PM Andres Freund  wrote:
> That said, obviously there will be plenty setups where this won't cause an
> issue. I don't really have a handle on how often it'd be a problem.

Fair enough. Even if it's not super-common, it doesn't seem like a
great idea to regress such scenarios in the back-branches.

Is there any way that we could instead tweak things so that we adjust
the visibility test object itself? Like can we have a GlobalVisTest API
where we can supply the OldestXmin from the VacuumCutoffs and have it
... do something useful with that?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: improve predefined roles documentation

2024-06-25 Thread Nathan Bossart
On Tue, Jun 25, 2024 at 11:28:18AM -0500, Nathan Bossart wrote:
> On Tue, Jun 25, 2024 at 12:16:30PM -0400, Robert Haas wrote:
>> pg_database_owner is a predefined role for which membership consists,
>> implicitly, of the current database owner. It cannot be granted
>> membership in any role, and no role can be granted membership in
>> pg_database_owner. However, like any role, it can own objects or
>> receive grants of access privileges. Consequently, once
>> pg_database_owner has rights within a template database, each owner of
>> a database instantiated from that template will exercise those rights.
>> Initially, this role owns the public schema, so each database owner
>> governs local use of the schema.
> 
> The main difference between this and the existing documentation is that the
> sentence on membership has been rephrased and moved to earlier in the
> paragraph.  I think this helps the logical flow a bit.  We first talk about
> implicit membership, then explicit membership, then we talk about
> privileges and the consequences of those privileges, and finally we talk
> about the default privileges.  So, WFM.

I used this in v4 (with some minor changes).  I've copied it here to ease
review.

pg_database_owner always has exactly one implicit member: the current
database owner. It cannot be granted membership in any role, and no
role can be granted membership in pg_database_owner. However, like any
other role, it can own objects and receive grants of access privileges.
Consequently, once pg_database_owner has rights within a template
database, each owner of a database instantiated from that template will
possess those rights. Initially, this role owns the public schema, so
each database owner governs local use of that schema.

-- 
nathan
>From 515d8b8f17c9cc6b5fe53e2dedcac2c282537315 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 18 Jun 2024 11:38:40 -0500
Subject: [PATCH v4 1/1] revamp predefined roles documentation

---
 doc/src/sgml/config.sgml |   2 +-
 doc/src/sgml/monitoring.sgml |   4 +-
 doc/src/sgml/ref/checkpoint.sgml |   2 +-
 doc/src/sgml/ref/reindex.sgml|   2 +-
 doc/src/sgml/user-manag.sgml | 340 ---
 5 files changed, 186 insertions(+), 164 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0c7a9082c5..03e37209e6 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -731,7 +731,7 @@ include_dir 'conf.d'

 Determines the number of connection slots that are
 reserved for connections by roles with privileges of the
-pg_use_reserved_connections
+
 role.  Whenever the number of free connection slots is greater than
  but less than or
 equal to the sum of superuser_reserved_connections
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b2ad9b446f..f30c1e53fa 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -286,8 +286,8 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34 
  0:00 postgres: ser
other sessions, many columns will be null.  Note, however, that the
existence of a session and its general properties such as its sessions user
and database are visible to all users.  Superusers and roles with 
privileges of
-   built-in role pg_read_all_stats (see also ) can see all the information about all 
sessions.
+   built-in role pg_read_all_stats
+   can see all the information about all sessions.
   
 
   
diff --git a/doc/src/sgml/ref/checkpoint.sgml b/doc/src/sgml/ref/checkpoint.sgml
index 28a1d717b8..db011a47d0 100644
--- a/doc/src/sgml/ref/checkpoint.sgml
+++ b/doc/src/sgml/ref/checkpoint.sgml
@@ -53,7 +53,7 @@ CHECKPOINT
 
   
Only superusers or users with the privileges of
-   the pg_checkpoint
+   the 
role can call CHECKPOINT.
   
  
diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index 2942dccf1e..dcf70d14bc 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -305,7 +305,7 @@ REINDEX [ ( option [, ...] ) ] { DA
partitioned table, such commands skip the privilege checks when processing
the individual partitions.  Reindexing a schema or database requires being 
the
owner of that schema or database or having privileges of the
-   pg_maintain
+   
role.  Note specifically that it's thus
possible for non-superusers to rebuild indexes of tables owned by
other users.  However, as a special exception,
diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml
index 07a16247d7..6fc4464519 100644
--- a/doc/src/sgml/user-manag.sgml
+++ b/doc/src/sgml/user-manag.sgml
@@ -590,101 +590,73 @@ DROP ROLE doomed_role;
and information.  Administrators (including roles that have the
CREATEROLE privilege) can GRANT these
roles to users and/or other roles in their environment, prov

Re: Apparent bug in WAL summarizer process (hung state)

2024-06-25 Thread Robert Haas
On Mon, Jun 24, 2024 at 1:56 PM Israel Barth Rubio
 wrote:
> I've been playing a bit with the incremental backup feature which might come 
> as
> part of the 17 release, and I think I hit a possible bug in the WAL summarizer
> process.
>
> The issue that I face refers to the summarizer process getting into a hung 
> state.
> When the issue is triggered, it keeps in an infinite loop trying to process a 
> WAL
> file that no longer exists.  It apparently comes up only when I perform 
> changes to
> `wal_summarize` GUC and reload Postgres, while there is some load in Postgres
> which makes it recycle WAL files.

Yeah, this is a bug. It seems that the WAL summarizer process, when
restarted, wants to restart from wherever it was previously
summarizing WAL, which is correct if that WAL is still around, but if
summarize_wal has been turned off in the meanwhile, it might not be
correct. Here's a patch to fix that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v1-0001-Prevent-summarizer-hang-when-summarize_wal-turn-o.patch
Description: Binary data


Re: Apparent bug in WAL summarizer process (hung state)

2024-06-25 Thread Tom Lane
Robert Haas  writes:
> Yeah, this is a bug. It seems that the WAL summarizer process, when
> restarted, wants to restart from wherever it was previously
> summarizing WAL, which is correct if that WAL is still around, but if
> summarize_wal has been turned off in the meanwhile, it might not be
> correct. Here's a patch to fix that.

This comment seems to be truncated:

+/*
+ * If we're the WAL summarizer, we always want to store the values we
+ * just computed into shared memory, because those are the values we're
+ * going to use to drive our operation, and so they are the authoritative
+ * values. Otherwise, we only store values into shared memory if they are
+ */
+LWLockAcquire(WALSummarizerLock, LW_EXCLUSIVE);
+if (am_wal_summarizer || !WalSummarizerCtl->initialized)
+{

regards, tom lane




Re: improve predefined roles documentation

2024-06-25 Thread Robert Haas
On Tue, Jun 25, 2024 at 3:26 PM Nathan Bossart  wrote:
> I used this in v4 (with some minor changes).

Looking at this again, how happy are you with the way you've got
several roles per  instead of one for each? I realize
that was probably part of the intent of the change, to move the data
from below the table into the table, and I see the merit of that. But
one of your other complaints was the entries in the table were
unordered, and it's hard for them to really be ordered if you have
groups like this, since you can't alphabetize, for example, unless you
have just a single entry per .

I don't have a problem with doing it the way you have here if you
think that's good. I'm just asking.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Apparent bug in WAL summarizer process (hung state)

2024-06-25 Thread Robert Haas
On Tue, Jun 25, 2024 at 3:51 PM Tom Lane  wrote:
> This comment seems to be truncated:

Thanks. New version attached.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v2-0001-Prevent-summarizer-hang-when-summarize_wal-turned.patch
Description: Binary data


Re: scalability bottlenecks with (many) partitions (and more)

2024-06-25 Thread Robert Haas
On Tue, Jun 25, 2024 at 6:04 AM Tomas Vondra
 wrote:
> Yeah, definitely needs comment explaining this.
>
> I admit those numbers are pretty arbitrary primes, to implement a
> trivial hash function. That was good enough for a PoC patch, but maybe
> for a "proper" version this should use a better hash function. It needs
> to be fast, and maybe it doesn't matter that much if it's not perfect.

Right. My guess is that if we try too hard to make the hash function
good, there will be a performance hit. Unlike, say, strings that come
from the user, we have no reason to believe that relfilenumbers will
have any particular structure or pattern to them, so a low-quality,
fast function seems like a good trade-off to me. But I'm *far* from a
hashing expert, so I'm prepared for someone who is to tell me that I'm
full of garbage.
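
For illustration, a cheap multiplicative (Fibonacci-style) hash of the sort
being discussed might look like this (Python sketch with an arbitrary odd
multiplier; not the patch's actual code):

```python
def fastpath_group(relfilenumber, num_groups=64):
    """Map a relfilenumber to one of num_groups fast-path lock groups
    using a single multiply, shift, and modulo. Quality matters less
    than speed here, since relfilenumbers have no adversarial
    structure."""
    return ((relfilenumber * 2654435761) >> 16) % num_groups

fastpath_group(16384)  # some group in 0..63; cheap and deterministic
```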

> I don't think I've heard the term "partially-associative cache" before.
> That's an excellent question. I don't know.
>
> I agree 64 groups is pretty arbitrary, and having 1024 may not be enough
> even with a modest number of partitions. When I was thinking about using
> a higher value, my main concern was that it'd made the PGPROC entry
> larger. Each "fast-path" group is ~72B, so 64 groups is ~4.5kB, and that
> felt like quite a bit.
>
> But maybe it's fine and we could make it much larger - L3 caches tend to
> be many MBs these days, although AFAIK it's shared by threads running on
> the CPU.
>
> I'll see if I can do some more testing of this, and see if there's a
> value where it stops improving / starts degrading, etc.

Sounds good.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: improve predefined roles documentation

2024-06-25 Thread Nathan Bossart
On Tue, Jun 25, 2024 at 04:04:03PM -0400, Robert Haas wrote:
> Looking at this again, how happy are you with the way you've got
> several roles per  instead of one for each? I realize
> that was probably part of the intent of the change, to move the data
> from below the table into the table, and I see the merit of that. But
> one of your other complaints was the entries in the table were
> unordered, and it's hard for them to really be ordered if you have
> groups like this, since you can't alphabetize, for example, unless you
> have just a single entry per .

Yeah, my options were to either separate the roles or to weaken the
ordering, and I guess I felt like the weaker ordering was slightly less
bad.  The extra context in some of the groups seemed worth keeping, and
this probably isn't the only page of our docs that might require ctrl+f.
But I'll yield to the majority opinion here.

-- 
nathan




Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-06-25 Thread Andres Freund
On 2024-06-25 14:35:00 -0400, Robert Haas wrote:
> Is there any way that we could instead tweak things so that we adjust
> the visibility test object itself? Like can we have a GlobalVisTest API
> where we can supply the OldestXmin from the VacuumCutoffs and have it
> ... do something useful with that?

I doubt that's doable in the back branches. And even on HEAD, I don't think
it's a particularly attractive option - there's just a global vistest for each
of the types of objects with a specific horizon (they need to be updated
occasionally, e.g. when taking snapshots). So there's not really a spot to put
an associated OldestXmin. We could put it there and remove it at the end of
vacuum / in an exception handler, but that seems substantially worse.




Re: Backporting BackgroundPsql

2024-06-25 Thread Daniel Gustafsson
> On 25 Jun 2024, at 16:26, Tom Lane  wrote:
> 
> Andres Freund  writes:
>> On 2024-06-25 13:26:23 +0300, Heikki Linnakangas wrote:
>>> 1. Write the new test differently on backbranches. Before 664d757531, the
>>> test needs to work a lot harder to use the background psql session, calling
>>> pump() etc. That's doable, but as noted in the discussion that led to
>>> 664d757531, it's laborious and error-prone.
>>> 
>>> 2. Backport commit 664d757531. This might break out-of-tree perl tests that
>>> use the background_psql() function. I don't know if any such tests exist,
>>> and they would need to be changed for v17 anyway, so that seems acceptable.
>>> Anyone aware of any extensions using the perl test modules?
>>> 
>>> 3. Backport commit 664d757531, but keep the existing background_psql()
>>> function unchanged. Add a different constructor to get the v17-style
>>> BackgroundPsql session, something like "$node->background_psql_new()".
> 
>> Yes, I've wished for this a couple times. I think 2 or 3 would be reasonable.
>> I think 1) often just leads to either tests not being written or being
>> fragile...
> 
> I'd vote for (2).  (3) is just leaving a foot-gun for people to
> hurt themselves with.

I agree with this; if we're backporting we should opt for 2.  The only
out-of-tree user of background_psql() that I could find was check_pgactivity,
but they seem to have vendored an old copy of our lib rather than use anything
in our tree, so we should be fine there.

Before pulling any triggers I think https://commitfest.postgresql.org/48/4959/
should be considered, since Tom found some flaws in the current code around how
timers and timeouts are used.

However, since Andrew is actively aiming to replace all of this shortly, should
we wait and see where that lands to avoid having to backport another library
change?

--
Daniel Gustafsson





Should we document how column DEFAULT expressions work?

2024-06-25 Thread James Coleman
Hello,

It's possible I'm the only one who's been in this situation, but I've
multiple times found myself explaining to a user how column DEFAULT
expressions work: namely how the quoting on an expression following
the keyword DEFAULT controls whether or not the expression is
evaluated at the time of the DDL statement or at the time of an
insertion.

In my experience this is non-obvious to users, and the quoting makes a
big difference.

Is this something that we should document explicitly? I don't see it
called out in the CREATE TABLE reference page, but it's possible I'm
missing something.

Thanks,
James Coleman




Re: Backporting BackgroundPsql

2024-06-25 Thread Tom Lane
Daniel Gustafsson  writes:
> Before pulling any triggers I think https://commitfest.postgresql.org/48/4959/
> should be considered, since Tom found some flaws in the current code around 
> how
> timers and timeouts are used.

That's certainly another issue to consider, but is it really a blocker
for this one?

> However, since Andrew is actively aiming to replace all of this shortly, 
> should
>> we wait and see where that lands to avoid having to backport another library
> change?

I would like to see what he comes up with ... but is it likely to
be something we'd risk back-patching?

regards, tom lane




Re: Should we document how column DEFAULT expressions work?

2024-06-25 Thread Tom Lane
James Coleman  writes:
> It's possible I'm the only one who's been in this situation, but I've
> multiple times found myself explaining to a user how column DEFAULT
> expressions work: namely how the quoting on an expression following
> the keyword DEFAULT controls whether or not the expression is
> evaluated at the time of the DDL statement or at the time of an
> insertion.

Uh ... what?  I recall something about that with respect to certain
features such as nextval(), but you're making it sound like there
is something generic going on with DEFAULT.

regards, tom lane




Re: Backporting BackgroundPsql

2024-06-25 Thread Daniel Gustafsson
> On 25 Jun 2024, at 22:57, Tom Lane  wrote:
> 
> Daniel Gustafsson  writes:
>> Before pulling any triggers I think 
>> https://commitfest.postgresql.org/48/4959/
>> should be considered, since Tom found some flaws in the current code around 
>> how
>> timers and timeouts are used.
> 
> That's certainly another issue to consider, but is it really a blocker
> for this one?

It's not a blocker, but when poking at the code it seems useful to consider the
open items around it.

>> However, since Andrew is actively aiming to replace all of this shortly, 
>> should
>> we wait a see where that lands to avoid having to backport another library
>> change?
> 
> I would like to see what he comes up with ... but is it likely to
> be something we'd risk back-patching?

Maybe it'll be a stretch given that it's likely to introduce new dependencies.

--
Daniel Gustafsson





Zero-downtime FULL VACUUM/clustering/defragmentation with no extra disk space

2024-06-25 Thread Ahmed Yarub Hani Al Nuaimi
Hi guys,

This is inspired by this TODO list
https://wiki.postgresql.org/wiki/Todo#CLUSTER and by pg_repack and
pg_freeze projects.
My final goal is to create an extension that does direct data-file to
data-file transfer (no intermediate tables, no triggers) with no blocking
at all in order to simulate a zero-downtime FULL VACUUM and work on
continuously maintained clustering (which I'm a big fan of). This of course
should involve fixing indexes, etc. There are multiple steps to have a
final product but the first thing I would do is have two pointers: one
iterates from the beginning till it finds enough space, and the other
iterates from the end and adds this row to the space pointed by the first
pointer.
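
The two-pointer pass described above might be sketched like this (a toy
Python model over a list of fixed-size slots; purely illustrative, nothing
like the real page/tuple machinery):

```python
def compact(slots):
    """Two-pointer compaction: move rows from the tail into free slots
    near the front, so all used slots end up contiguous. None marks
    free space; anything else is a row. Returns the number of moves."""
    front, back = 0, len(slots) - 1
    moved = 0
    while front < back:
        if slots[front] is not None:   # front is occupied: advance
            front += 1
        elif slots[back] is None:      # back is already free: retreat
            back -= 1
        else:                          # free slot at front, row at back
            slots[front], slots[back] = slots[back], None
            moved += 1
    return moved

pages = ["r1", None, "r2", None, None, "r3", "r4"]
compact(pages)  # pages becomes ["r1", "r4", "r2", "r3", None, None, None]
```

Note this simple version does not preserve row order, so a CLUSTER-style
variant would need a different fill strategy.
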
I would like to first know whether this is useful (in my previous companies
this was a complete game changer), and whether there are any alternative
algorithms that you would suggest.
Of course, there are many essential features that would follow later like
running automatically when system is under light load, adding a threshold
for when to do this "online FULL-VACUUM", and utilizing index CLUSTERing
and/or a set of predefined columns.

Any and all comments are welcome,
Ahmed


Re: RFC: Additional Directory for Extensions

2024-06-25 Thread David E . Wheeler
On Jun 25, 2024, at 10:43 AM, Robert Haas  wrote:

> If we want to work on making the sorts of changes that
> you're proposing, let's do it on a separate thread. It's not going to
> be meaningfully harder to move in that direction after some patch like
> this than it is today.

I appreciate this separation of concerns, Robert.

In other news, here’s an updated patch that expands the documentation to record 
that the destination directory is a prefix, and full paths should be used under 
it. Also take the opportunity to document the PGXS DESTDIR variable as the 
thing to use to install files under the destination directory.

It still requires a server restart; I can change it back to superuser-only if 
that’s the consensus.

For those who prefer a GitHub patch review experience, see this PR:

  https://github.com/theory/postgres/pull/3/files

Best,

David



v4-0001-Add-extension_destdir-GUC.patch
Description: Binary data




Re: Should we document how column DEFAULT expressions work?

2024-06-25 Thread James Coleman
On Tue, Jun 25, 2024 at 4:59 PM Tom Lane  wrote:
>
> James Coleman  writes:
> > It's possible I'm the only one who's been in this situation, but I've
> > multiple times found myself explaining to a user how column DEFAULT
> > expressions work: namely how the quoting on an expression following
> > the keyword DEFAULT controls whether or not the expression is
> > evaluated at the time of the DDL statement or at the time of an
> > insertion.
>
> Uh ... what?  I recall something about that with respect to certain
> features such as nextval(), but you're making it sound like there
> is something generic going on with DEFAULT.

Hmm, I guess I'd never considered anything besides cases like
nextval() and now(), but I see now that now() must also be special
cased (when quoted) since 'date_trunc(day, now())'::timestamp doesn't
work but 'now()'::timestamp does.

So I guess what I'm asking about would be limited to those cases (I
assume there are a few others...but I haven't gone digging through the
source yet).
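
For instance, a sketch of the distinction (untested here; this reflects the
documented handling of the special 'now' input string):

```sql
-- 'now' is a special timestamp input string: it is coerced to a constant
-- when the DDL runs, so every row gets the table-creation time.
CREATE TABLE t_frozen (created timestamptz DEFAULT 'now');

-- now() is a function call stored in the default expression, so it is
-- evaluated afresh at each INSERT.
CREATE TABLE t_live (created timestamptz DEFAULT now());
```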

Regards,
James Coleman




Re: IPC::Run accepts bug reports

2024-06-25 Thread Noah Misch
On Mon, Jun 17, 2024 at 01:56:46PM -0400, Robert Haas wrote:
> On Sat, Jun 15, 2024 at 7:48 PM Noah Misch  wrote:
> > Separating this from the pytest thread:
> >
> > On Sat, Jun 15, 2024 at 01:26:57PM -0400, Robert Haas wrote:
> > > The one
> > > thing I know about that *I* think is a pretty big problem about Perl
> > > is that IPC::Run is not really maintained.
> >
> > I don't see in https://github.com/cpan-authors/IPC-Run/issues anything
> > affecting PostgreSQL.  If you know of IPC::Run defects, please report them.
> > If I knew of an IPC::Run defect affecting PostgreSQL, I likely would work on
> it before absurdity like https://github.com/cpan-authors/IPC-Run/issues/175
> (NetBSD-10-specific behavior coping).
> 
> I'm not concerned about any specific open issue; my concern is about
> the health of that project. https://metacpan.org/pod/IPC::Run says
> that this module is seeking new maintainers, and it looks like the
> people listed as current maintainers are mostly inactive. Instead,
> you're fixing stuff. That's great, but we ideally want PostgreSQL's
> dependencies to be things that are used widely enough that we don't
> end up maintaining them ourselves.

That's reasonable to want.

> I apologize if my comment came across as disparaging your efforts;
> that was not my intent.

It did not come across as disparaging.




Re: Should we document how column DEFAULT expressions work?

2024-06-25 Thread Tom Lane
James Coleman  writes:
> On Tue, Jun 25, 2024 at 4:59 PM Tom Lane  wrote:
>> Uh ... what?  I recall something about that with respect to certain
>> features such as nextval(), but you're making it sound like there
>> is something generic going on with DEFAULT.

> Hmm, I guess I'd never considered anything besides cases like
> nextval() and now(), but I see now that now() must also be special
> cased (when quoted) since 'date_trunc(day, now())'::timestamp doesn't
> work but 'now()'::timestamp does.

Hmm, both of those behaviors are documented, but not in the same place
and possibly not anywhere near where you looked for info about
DEFAULT.  For instance, the Tip at the bottom of section 9.9.5

https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT

explains about how 'now'::timestamp isn't what to use in DEFAULT.
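To illustrate that Tip (a sketch paraphrasing the documented behavior, not a quote from the docs):

```sql
-- 'now'::timestamp is coerced as soon as the constant is parsed, so the
-- default gets frozen at table-creation time; now() is evaluated per insert.
CREATE TABLE events (
    ts_frozen   timestamp DEFAULT 'now'::timestamp,  -- not what you want
    ts_correct  timestamp DEFAULT now()              -- evaluated at insert time
);
```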

regards, tom lane




Re: POC, WIP: OR-clause support for indexes

2024-06-25 Thread Alena Rybakina

On 24.06.2024 18:28, Robert Haas wrote:

On Fri, Jun 21, 2024 at 6:52 PM Alena Rybakina
  wrote:

It's hard to tell, but I think it might be one of the good places to apply 
transformation. Let me describe a brief conclusion on the two approaches.

This explanation is somewhat difficult for me to follow. For example:


In the first approach, we definitely did not process the extra "OR" expressions 
in the first approach, since they were packaged as an Array. It could lead to the fact 
that less planning time would be spent on the optimizer.

I don't know what the "first approach" refers to, or what processing
the extra "OR" expressions means, or what it would mean to package OR
expressions as an array. If you made them into an SAOP then you'd have
an array *instead of* OR expressions, not OR expressions "packaged as
an array"; but even then, they'd still be processed somewhere, unless
the patch was just wrong.

I think you should try writing this summary again and see if you can
make it a lot clearer and more precise.

I'm suspicious that we should actually be postponing the
transformation even further. If, for example, the transformation is
advantageous for index scans and disadvantageous for bitmap scans, or
the other way around, then this approach can't help much: it either
does the transformation and all scan types are affected, or it doesn't
do it and no scan types are affected. But if you decided for each scan
whether to transform the quals, then you could handle that. Against
that, there might be increased planning cost. But, perhaps that could
be avoided somehow.


Sorry, you are right and I'll try to explain more precisely. The first 
approach is the first part of the patch, where we made "OR" expressions 
into an SAOP at an early stage of plan generation [0]; the second one was 
the one proposed by A. Korotkov [1].


So, when we made "OR" expressions into an SAOP at the post-parsing stage 
of plan generation [0], we definitely did not process the redundant "OR" 
expressions there (for example, duplicates), since they were transformed 
to an SAOP expression. Furthermore, the list of OR expressions can be 
significantly reduced, since constants belonging to the same predicate 
will already be converted into an SAOP expression. I assume this may 
reduce planning time, as I know several places in the optimizer where 
these lists of "OR" expressions are scanned several times.

Also the selectivity for SAOP expressions is estimated better, which 
could lead to the generation of a more optimal plan, but, to be honest, 
this is just an observation from changes in regression tests and, in 
general, from how the process of calculating the selectivity of a complex 
expression works. And I think it needs further consideration:

SELECT * FROM check_estimated_rows('SELECT * FROM functional_dependencies WHERE (a = 1 OR a = 51) AND b = ''1''');
 estimated | actual
-----------+--------
-        99 |    100
+       100 |    100
(1 row)

SELECT * FROM check_estimated_rows('SELECT * FROM functional_dependencies WHERE (a = 1 OR a = 51) AND (b = ''1'' OR b = ''2'')');
 estimated | actual
-----------+--------
-        99 |    100
+       100 |    100
(1 row)

SELECT * FROM check_estimated_rows('SELECT * FROM functional_dependencies WHERE (a = 1 OR a = 2 OR a = 51 OR a = 52) AND (b = ''1'' OR b = ''2'')');
 estimated | actual
-----------+--------
-       197 |    200
+       200 |    200

Speaking of the main disadvantages, we do not give the optimizer the 
opportunity to generate a plan using BitmapScan, which can lead to the 
generation of a suboptimal plan, but in the current approach the same 
thing happens [2]. And you mentioned it before:


On 24.06.2024 18:28, Robert Haas wrote:

I'm suspicious that we should actually be postponing the
transformation even further. If, for example, the transformation is
advantageous for index scans and disadvantageous for bitmap scans, or
the other way around, then this approach can't help much: it either
does the transformation and all scan types are affected, or it doesn't
do it and no scan types are affected. But if you decided for each scan
whether to transform the quals, then you could handle that. Against
that, there might be increased planning cost. But, perhaps that could
be avoided somehow.


Andrei mentioned the problem that might be caused by applying the 
transformation at a later stage of optimization: updating references to 
expressions in RestrictInfo [3] lists, because they can be used in 
different parts during the formation of the query plan. As the practice 
of self-join removal [4] has shown, this can be expensive, but feasible. 
By applying the transformation at the analysis stage [0], no such links 
had been created yet, so we did not encounter those problems; that made 
this approach more suitable than the others.


If some things were not clear enough, let me know.
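To make the transformation itself concrete, here is a sketch of the 
rewrite being discussed (illustration only, not output from the patch):

```sql
-- A chain of equalities over one column ...
SELECT * FROM functional_dependencies
WHERE (a = 1 OR a = 2 OR a = 51 OR a = 52) AND b = '1';

-- ... is rewritten into a single scalar-array operation (SAOP):
SELECT * FROM functional_dependencies
WHERE a = ANY (ARRAY[1, 2, 51, 52]) AND b = '1';
```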

[0] 
https://www.postgresql.org/message-id/attachment/156971/v21-0001-Transform-OR-clauses-to-ANY-expression.patch 
[1] 
https://www.postgresql.org/message-i

Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-06-25 Thread Robert Haas
On Tue, Jun 25, 2024 at 4:41 PM Andres Freund  wrote:
> I doubt that's doable in the back branches. And even on HEAD, I don't think
> it's a particularly attractive option - there's just a global vistest for each
> of the types of objects with a specific horizon (they need to be updated
> occasionally, e.g. when taking snapshots). So there's not really a spot to put
> an associated OldestXmin. We could put it there and remove it at the end of
> vacuum / in an exception handler, but that seems substantially worse.

Oh, right: I forgot that the visibility test objects were just
pointers to global variables.

Well, I don't know. I guess that doesn't leave any real options but to
fix it as Melanie proposed. But I still don't like it very much. I
feel like having to test against two different thresholds in the
pruning code is surely a sign that we're doing something wrong.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




CI, macports, darwin version problems

2024-06-25 Thread Thomas Munro
Hi,

Problem #1: we're still using Ventura, but Cirrus has started doing this:

Only ghcr.io/cirruslabs/macos-runner:sonoma is allowed. Automatically upgraded.

It doesn't do it to cfbot, which runs macOS stuff on PGDG-hosted Mac
Minis, but it does it to regular users who use free compute minutes
tagged "instance:OSXCommunityInstance".  This causes them to fail,
because:

[11:17:42.711] Error: Current platform "darwin 23" does not match
expected platform "darwin 22"

Sure enough, the sysinfo task shows "... Darwin Kernel Version
23.5.0...", but for cfbot it's still 22.y.z.  So probably it's time to
change to macOS 14 AKA sonoma AKA darwin 23.

Problem #2:

Once you do that with a simple s/ventura/sonoma/, it still "upgrades"
to macos-runner:sonoma, which is not the same as
macos-sonoma-base:latest.  It has more versions of xcode installed?
Not sure what else will break with that because I haven't successfully
run it yet due to the next problem, but blind patch attached.

Problem #3:

If you have a macports installation cached (eg for CI in your github
account), then the pre-existing macports installation will be for the
wrong darwin version (error shown above).  So I think we need to teach
src/tools/ci/ci_macports_packages.sh to detect that condition and do a
clean install.  I can look into that, but does anyone already know how
to do it?

I know how to find out which darwin version is running: uname -r | sed
's/\..*//'.  What I don't know is how to find the darwin version for a
macports installation.  I have found a terrible way to deduce it:

sqlite3 /opt/local/var/macports/registry/registry.db "select
max(os_major) from ports where os_major != 'any'"

But that's stupid.  There must be a way to ask it what version it was
installed for ... I think it's the variable macports::os_major[2]
(which is written in TCL, a language I can't follow too well), but I
can't figure out where it's reading it from ...  I hope there is a
text file under /opt/local or at worst a SQLite database, or a way to
ask the port command to spit that number out or ask it if it thinks
migration is necessary...

[1] 
https://github.com/cirruslabs/macos-image-templates/pkgs/container/macos-ventura-xcode
[2] https://github.com/macports/macports-base/blob/bf27e0c98c7443877e081d5f6b6
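For problem #3, the detection could look something like this sketch 
(assumptions: the registry path and the os_major column are inferred from 
the sqlite3 query above, not documented MacPorts interfaces):

```shell
#!/bin/sh
# Hypothetical sketch for ci_macports_packages.sh: detect a darwin-version
# mismatch between the host and a cached MacPorts installation.

registry_db="/opt/local/var/macports/registry/registry.db"

host_darwin_major() {
    uname -r | sed 's/\..*//'
}

cached_darwin_major() {
    # Prints nothing if there is no cached installation.
    [ -f "$registry_db" ] || return 0
    sqlite3 "$registry_db" \
        "select max(os_major) from ports where os_major != 'any'"
}

# Pure helper so the decision is easy to test: a clean install is needed
# only when a cached version exists and differs from the host version.
needs_clean_install() {
    host="$1"
    cached="$2"
    [ -n "$cached" ] && [ "$host" != "$cached" ]
}

if needs_clean_install "$(host_darwin_major)" "$(cached_darwin_major)"; then
    echo "cached macports was built for another darwin release; reinstalling"
fi
```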
From f40f7020e1316aa3fb4142af3dcf174eb51f4f02 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Wed, 26 Jun 2024 10:43:50 +1200
Subject: [PATCH] ci: Upgrade macOS version from 13 to 14.

---
 .cirrus.tasks.yml| 6 --
 src/tools/ci/ci_macports_packages.sh | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 33646faeadf..a1e133b5a2f 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -411,7 +411,7 @@ task:
 
 
 task:
-  name: macOS - Ventura - Meson
+  name: macOS - Sonoma - Meson
 
   env:
 CPUS: 4 # always get that much for cirrusci macOS instances
@@ -420,7 +420,9 @@ task:
 # work OK. See
 # https://postgr.es/m/20220927040208.l3shfcidovpzqxfh%40awork3.anarazel.de
 TEST_JOBS: 8
-IMAGE: ghcr.io/cirruslabs/macos-ventura-base:latest
+# If changing the macOS release, you probably also need to change
+# macports_url in src/tools/ci/ci_macports_packages.sh
+IMAGE: ghcr.io/cirruslabs/macos-runner:sonoma
 
 CIRRUS_WORKING_DIR: ${HOME}/pgsql/
 CCACHE_DIR: ${HOME}/ccache
diff --git a/src/tools/ci/ci_macports_packages.sh b/src/tools/ci/ci_macports_packages.sh
index f87256e0908..db811a28c8e 100755
--- a/src/tools/ci/ci_macports_packages.sh
+++ b/src/tools/ci/ci_macports_packages.sh
@@ -13,7 +13,7 @@ set -e
 
 packages="$@"
 
-macports_url="https://github.com/macports/macports-base/releases/download/v2.8.1/MacPorts-2.8.1-13-Ventura.pkg";
+macports_url="https://github.com/macports/macports-base/releases/download/v2.9.3/MacPorts-2.9.3-14-Sonoma.pkg";
 cache_dmg="macports.hfs.dmg"
 
 if [ "$CIRRUS_CI" != "true" ]; then
-- 
2.45.2



Re: CI, macports, darwin version problems

2024-06-25 Thread Tom Lane
Thomas Munro  writes:
> I know how to find out which darwin version is running: uname -r | sed
> 's/\..*//'.  What I don't know is how to find the darwin version for a
> macports installation.

"port platform"?

regards, tom lane




Re: Backporting BackgroundPsql

2024-06-25 Thread Alvaro Herrera
On 2024-Jun-25, Tom Lane wrote:

> Daniel Gustafsson  writes:

> > However, since Andrew is actively aiming to replace all of this shortly, 
> > should
> > we wait and see where that lands to avoid having to backport another library
> > change?
> 
> I would like to see what he comes up with ... but is it likely to
> be something we'd risk back-patching?

FWIW I successfully used the preliminary PqFFI stuff Andrew posted to
write a test program for bug #18377, which I think ended up being better
than with BackgroundPsql, so I think it's a good way forward.  As for
back-patching it, I suspect we're going to end up backpatching the
framework anyway just because we'll want to have it available for
backpatching future tests, even if we keep a backpatch minimal by doing
only the framework and not existing tests.

I also backpatched the PqFFI and PostgreSQL::Session modules to older PG
branches, to run my test program there.  This required only removing
some lines from PqFFI.pm that were about importing libpq functions that
older libpq didn't have.

Of course, the PostgreSQL::Session stuff is not ready yet, so if we want
this test in the tree soon, I don't think we should wait.


I'll note, though, that Test::More doesn't work terribly nicely with
perl threads, but that only relates to my test program and not to PqFFI.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"La gente vulgar sólo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"
("Common people think only of passing the time;
those with talent, of making good use of it")




Re: Backporting BackgroundPsql

2024-06-25 Thread Michael Paquier
On Wed, Jun 26, 2024 at 02:12:42AM +0200, Alvaro Herrera wrote:
> FWIW I successfully used the preliminary PqFFI stuff Andrew posted to
> write a test program for bug #18377, which I think ended up being better
> than with BackgroundPsql, so I think it's a good way forward.  As for
> back-patching it, I suspect we're going to end up backpatching the
> framework anyway just because we'll want to have it available for
> backpatching future tests, even if we keep a backpatch minimal by doing
> only the framework and not existing tests.
> 
> I also backpatched the PqFFI and PostgreSQL::Session modules to older PG
> branches, to run my test program there.  This required only removing
> some lines from PqFFI.pm that were about importing libpq functions that
> older libpq didn't have.

Nice!  I definitely +1 the backpatching of the testing bits.  This
stuff can make validating bugs so much easier, particularly when there
are conflicting parts in the backend after a cherry-pick.
--
Michael


signature.asc
Description: PGP signature


Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2024-06-25 Thread Noah Misch
On Mon, Nov 27, 2023 at 01:43:26AM +0200, Alexander Korotkov wrote:
> v61 looks good to me.  I'm going to push it as long as there are no 
> objections.

This yielded commit 4ed8f09 "Index SLRUs by 64-bit integers rather than by
32-bit integers" and left some expressions coercing SLRU page numbers to int.
Two sources:

  grep -i 'int\b.*page' $(git grep -l SimpleLruInit)
  make && touch $(git grep -l SimpleLruInit) && make PROFILE=-Wconversion 2>&1 
| less -p '.int. from .int64. may alter its value'

(Not every match needs to change.)

> --- a/src/include/access/slru.h
> +++ b/src/include/access/slru.h

> @@ -127,7 +127,15 @@ typedef struct SlruCtlData
>* the behavior of this callback has no functional implications.)  Use
>* SlruPagePrecedesUnitTests() in SLRUs meeting its criteria.
>*/
> - bool(*PagePrecedes) (int, int);
> + bool(*PagePrecedes) (int64, int64);
> +
> + /*
> +  * If true, use long segment filenames formed from lower 48 bits of the
> +  * segment number, e.g. pg_xact/1234. Otherwise, use short
> +  * filenames formed from lower 16 bits of the segment number e.g.
> +  * pg_xact/1234.
> +  */
> + boollong_segment_names;

SlruFileName() makes 15-character (60-bit) file names.  Where does the 48-bit
limit arise?  How does the SlruFileName() comment about a 24-bit limit for
short names relate this comment's 16-bit limit?

nm




Re: [PATCH] Add ACL (Access Control List) acronym

2024-06-25 Thread Michael Paquier
On Tue, Jun 25, 2024 at 11:55:03AM -0500, Nathan Bossart wrote:
> On Tue, Jun 25, 2024 at 08:10:24AM +0200, Joel Jacobson wrote:
> > On Tue, Jun 25, 2024, at 07:11, Michael Paquier wrote:
> >> v1 is fine without the "privileges list" part mentioned by Nathan in
> >> the first reply.
> > 
> > v2 is exactly that, but renamed and attached, so we have an entry this
> > was the last version.
> 
> LGTM

Fine by me as well.  I guess I'll just apply that once v18 opens up.
--
Michael


signature.asc
Description: PGP signature


psql (PostgreSQL) 17beta2 (Debian 17~beta2-1.pgdg+~20240625.1534.g23c5a0e) Failed to retrieve data from the server..

2024-06-25 Thread André Verwijs

psql (PostgreSQL) 17beta2 (Debian 17~beta2-1.pgdg+~20240625.1534.g23c5a0e)

Failed to retrieve data from the server..


retrieve information from database:

column "daticulocale" does not exist
LINE 5: datconnlimit, daticulocale, daticurules, datcollversion,
^
HINT: Perhaps you meant to reference the column "db.datlocale".
--
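(For reference, the HINT suggests the column was renamed in 17, so a 
query along these lines presumably works on 17beta2; this is a sketch, 
not the actual pgAdmin query:

```sql
-- Replacing daticulocale with datlocale, as the HINT suggests:
SELECT datname, datconnlimit, datlocale, datcollversion
FROM pg_database;
```
)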


retrieve information from tables:

'>' not supported between instances of 'NoneType' and 'int'
--

retrieve information from schemas has no issues.



INFO PGADMIN:

version 8.8
Application Mode: Desktop
NW.js Version: 0.77.0
Browser: Chromium 114.0.5735.91
Operating System: Kubuntu 24.04 - Linux-6.8.0-35-generic-x86_64-with-glibc2.39




