Re: Re: Removing unneeded self joins

2024-07-15 Thread jian he
On Mon, Jul 15, 2024 at 2:08 PM Andrei Lepikhov  wrote:
>
> On 7/15/24 12:31, jian he wrote:
> > hi.
> > Here is the latest patch (v6),
> > I've made the following changes.
> >
> > * disallow original Query->resultRelation participate in SJE.
> > for SELECT, nothing is changed. for UPDATE/DELETE/MERGE
> > we can do:
> > EXPLAIN (COSTS OFF)
> > UPDATE sj sq SET b = sq.b + sz.a FROM (select s1.* from sj s1 join sj
> > s2 on s1.a = s2.a) as sz
> > WHERE sz.a = sq.a;
> >
> > here, only "(select s1.* from sj s1 join sj s2 on s1.a = s2.a)" can
> > apply to SJE.
> >
> > but for now we cannot apply SJE to
> > EXPLAIN (COSTS OFF)
> > UPDATE sj sq SET b = sq.b + sz.a FROM sj as sz WHERE sz.a = sq.a;
> >
> > so the EPQ abnormality issue[1] won't happen.
> >
> >
> > * add a new function: ChangeVarNodesExtended for
> > address concerns in  [2]
> I see you still stay with the code line:
> if (omark && imark && omark->markType != imark->markType)
>
> It is definitely an error. What if omark is NULL, but imark is not? Why
> not skip this pair of relids? Or, at least, insert an assertion to
> check that you filtered it earlier.
>

I think the "omark is NULL, but imark is not" case won't reach
remove_self_joins_one_group.
In that case, the RangeTblEntry associated with omark will have rtekind
RTE_SUBQUERY and will have been skipped earlier in remove_self_joins_recurse.


Still, do you think the following code is the right way to go?

if ((omark == NULL && imark != NULL) ||
    (omark != NULL && imark == NULL) ||
    (omark && imark && omark->markType != imark->markType))
    continue;




Re: Extension for PostgreSQL cast jsonb to hstore WIP

2024-07-15 Thread Stepan Neretin
On Mon, Jul 15, 2024 at 12:44 PM Антуан Виолин 
wrote:

> On 2024-04-03 We 04:21, Andrew Dunstan
>
>> I don't think a cast that doesn't cater for all the forms json can take
>> is going to work very well. At the very least you would need to error out
>> in cases you didn't want to cover, and have tests for all of those errors.
>> But the above is only a tiny fraction of those. If the error cases are
>> going to be so much more than the cases that work it seems a bit pointless.
>>
> Hi everyone
> I changed my mail account so it is officially displayed in the correspondence.
> I also added an error report for the case where we are given an incorrect
> value. I believe that such a cast is needed by PostgreSQL since we already
> have several incomplete casts, but they perform their duties well and help in
> the right situations.
>
> cheers
> Antoine Violin
>
> Antoine
>
> On Mon, Jul 15, 2024 at 12:42 PM Andrew Dunstan 
> wrote:
>
>>
>> On 2024-04-02 Tu 11:43, ShadowGhost wrote:
>>
>> At the moment, this cast supports only these structures, as it was enough
>> for my tasks:
>> {str:numeric}
>> {str:str}
>> {str:bool}
>> {str:null}
>> But it's a great idea and I'll think about implementing it.
>>
>>
>> Please don't top-post on the PostgreSQL lists. See
>> 
>> 
>>
>> I don't think a cast that doesn't cater for all the forms json can take
>> is going to work very well. At the very least you would need to error out
>> in cases you didn't want to cover, and have tests for all of those errors.
>> But the above is only a tiny fraction of those. If the error cases are
>> going to be so much more than the cases that work it seems a bit pointless.
>>
>>
>> cheers
>>
>>
>> andrew
>>
>> --
>> Andrew Dunstan
>> EDB: https://www.enterprisedb.com
>>
>>

Hi! I agree that in some cases this cast can be useful.
I have several comments about the patch:
1) I think we should call pfree on pairs (we currently palloc it but never pfree it).
2) I think we should add error handling for load_external_function, or maybe
rewrite it using DirectFunctionCall.
3) I think we need to replace all strdup occurrences with pstrdup.
4) Why such a complex scheme? You first set up global variables holding
pointers to the functions, and then wrap those pointers in a #define?
5) postgres=# SELECT '{"aaa": "first_value", "aaa":
"second_value"}'::jsonb::hstore;
hstore
---
 "aaa"=>"second_value"
(1 row)
Is this documented behaviour?
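
For what it's worth, the duplicate-key behaviour in 5) comes from jsonb input
parsing itself, not from the cast; jsonb already keeps only the last value for
a repeated key:

SELECT '{"aaa": "first_value", "aaa": "second_value"}'::jsonb;
-- returns {"aaa": "second_value"} even before any cast to hstore is involved

So if that needs documenting, it is a property of jsonb itself rather than of
the new cast.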


Re: Injection point locking

2024-07-15 Thread Heikki Linnakangas

On 10/07/2024 06:44, Michael Paquier wrote:

On Tue, Jul 09, 2024 at 12:12:04PM +0300, Heikki Linnakangas wrote:

I thought about it, but no. If the generation number doesn't match, there
are a few possibilities:

1. The entry was what we were looking for, but it was concurrently detached.
Return NULL is correct in that case.

2. The entry was what we were looking for, but it was concurrently detached,
and was then immediately reattached. NULL is a fine return value in that
case too. When Run runs concurrently with Detach+Attach, you don't get any
guarantee whether the actual apparent order is "Detach, Attach, Run",
"Detach, Run, Attach", or "Run, Detach, Attach". NULL result corresponds to
the "Detach, Run, Attach" ordering.

3. The entry was not actually what we were looking for. The name comparison
falsely matched just because the slot was concurrently detached and recycled
for a different injection point. We must continue the search in that case.

I added a comment to the top of the loop to explain scenario 2, and a
comment to the "continue" to explain scenario 3, because that's a bit subtle.


Okay.  I am fine with your arguments here.  There is still an argument,
IMO, for looping back to the beginning of the ActiveInjectionPoints
entries if we find an entry with a matching name whose generation does
not match the local copy (the concurrent detach-attach case), but just
moving on to the follow-up entries is also OK by me.

The new comments in InjectionPointCacheRefresh() are nice
improvements.  Thanks for that.


Ok, committed this.

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Removing unneeded self joins

2024-07-15 Thread Andrei Lepikhov

On 7/15/24 14:35, jian he wrote:

On Mon, Jul 15, 2024 at 2:08 PM Andrei Lepikhov  wrote:


On 7/15/24 12:31, jian he wrote:

hi.
Here is the latest patch (v6),
I've made the following changes.

* disallow original Query->resultRelation participate in SJE.
for SELECT, nothing is changed. for UPDATE/DELETE/MERGE
we can do:
EXPLAIN (COSTS OFF)
UPDATE sj sq SET b = sq.b + sz.a FROM (select s1.* from sj s1 join sj
s2 on s1.a = s2.a) as sz
WHERE sz.a = sq.a;

here, only "(select s1.* from sj s1 join sj s2 on s1.a = s2.a)" can
apply to SJE.

but for now we cannot apply SJE to
EXPLAIN (COSTS OFF)
UPDATE sj sq SET b = sq.b + sz.a FROM sj as sz WHERE sz.a = sq.a;

so the EPQ abnormality issue[1] won't happen.


* add a new function: ChangeVarNodesExtended for
address concerns in  [2]

I see you still stay with the code line:
if (omark && imark && omark->markType != imark->markType)

It is definitely an error. What if omark is NULL, but imark is not? Why
not skip this pair of relids? Or, at least, insert an assertion to
check that you filtered it earlier.



I think the "omark is NULL, but imark is not" case won't reach
remove_self_joins_one_group.
In that case, the RangeTblEntry associated with omark will have rtekind
RTE_SUBQUERY and will have been skipped earlier in remove_self_joins_recurse.


Still, do you think the following code is the right way to go?

if ((omark == NULL && imark != NULL) ||
    (omark != NULL && imark == NULL) ||
    (omark && imark && omark->markType != imark->markType))
    continue;
Sure, if the query block needs a RowMark, it applies the proper RowMark to each
base relation. All pull-up transformations execute before this code.
But it is worth adding an Assert at that point to check that nothing has changed
in the code above and that the patch works correctly, am I wrong?



--
regards, Andrei Lepikhov





Re: Wrong results with grouping sets

2024-07-15 Thread Sutou Kouhei
Hi,

I'm reviewing patches in Commitfest 2024-07 from top to bottom:
https://commitfest.postgresql.org/48/

This is the 3rd patch:
https://commitfest.postgresql.org/48/4583/

FYI: https://commitfest.postgresql.org/48/4681/ is my patch.

In 
  "Re: Wrong results with grouping sets" on Wed, 10 Jul 2024 09:22:54 +0800,
  Richard Guo  wrote:

> Here is an updated version of this patchset.  I've added some comments
> according to the review feedback, and also tweaked the commit messages
> a bit more.

I'm not familiar with the related code, but here are my
comments:

0001:

---
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 85a62b538e5..8055f4b2b9e 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1242,6 +1245,12 @@ typedef struct RangeTblEntry
/* estimated or actual from caller */
Cardinality enrtuples pg_node_attr(query_jumble_ignore);
 
+   /*
+* Fields valid for a GROUP RTE (else NULL/zero):
+*/
+   /* list of expressions grouped on */
+   List   *groupexprs pg_node_attr(query_jumble_ignore);
+
/*
 * Fields valid in all RTEs:
 */


+* Fields valid for a GROUP RTE (else NULL/zero):

There is only one field and it's a List. So how about using
the following?

* A field valid for a GROUP RTE (else NIL):



diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c
index 844fc30978b..0982f873a42 100644
--- a/src/backend/optimizer/util/var.c
+++ b/src/backend/optimizer/util/var.c
@@ -902,6 +915,141 @@ flatten_join_alias_vars_mutator(Node *node,
...
+Node *
+flatten_group_exprs(PlannerInfo *root, Query *query, Node *node)
+{
+   flatten_join_alias_vars_context context;
...
---

If we want to reuse flatten_join_alias_vars_context for
flatten_group_exprs(), how about renaming it?
Only flatten_join_alias_vars() uses flatten_join_alias_vars_context
for now, so its name is meaningful. But if we want to use
flatten_join_alias_vars_context for flatten_group_exprs() too,
the name flatten_join_alias_vars_context becomes strange.



diff --git a/src/backend/parser/parse_relation.c 
b/src/backend/parser/parse_relation.c
index 2f64eaf0e37..69476384252 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -2557,6 +2557,79 @@ addRangeTableEntryForENR(ParseState *pstate,
...
+   char   *colname = te->resname ? pstrdup(te->resname) : "unamed_col";
...


Can the "te->resname == NULL" case be happen? If so, how
about adding a new test for the case?

(BTW, is "unamed_col" intentional name? Is it a typo of
"unnamed_col"?)


Thanks,
-- 
kou




Re: gcc 13 warnings

2024-07-15 Thread Aleksander Alekseev
Hi,

> /*
>  * Prevent spurious warning due to compiler not realizing
>  * VARATT_IS_EXTERNAL_NON_EXPANDED() branch in assign_simple_var() 
> isn't
>  * reachable due to "found" being byvalue.
>  */
> if (var->datatype->typlen != 1)
> pg_unreachable();
>
> I'm somewhat inclined to think it'd be worth adding something along those
> lines to avoid this warning ([1]).

IMO we shouldn't allow warnings to appear in release builds, even
harmless ones. Otherwise we start ignoring them and will miss
something important one day. So I think we should do this.

-- 
Best regards,
Aleksander Alekseev




Re: Things I don't like about \du's "Attributes" column

2024-07-15 Thread Rafia Sabih
On Sat, 13 Jul 2024 at 14:21, Pavel Luzanov  wrote:
>
> On 12.07.2024 12:22, Rafia Sabih wrote:
>
> Another thought that came to my mind: should we have this column in \du+
> instead, and maybe the connection limit too?
> I know in the current version we have all this in \du itself, but then it is
> all clubbed into one column. Now that our table has grown wider, it might be
> more aesthetic to have it in the extended version. Also, usage-wise these
> might not be the first thing to be looked at for a user/role.
>
> In the first version of the patch, "Valid until" (named "Password expire 
> time")
> and "Connection limit" ("Max connections") columns were in extended mode. [1]
>
> Later we decided to place each attribute in the "Attributes" column on a 
> separate
> line. This allowed us to significantly reduce the overall width of the output.
> So, I decided to move "Valid until" and "Connection limit" from extended mode
> to normal mode.
>
> Just compare output from patched \du and popular \list command:
>
> postgres@demo(17.0)=# \du
>                                  List of roles
>  Role name | Login | Attributes  |      Valid until       | Connection limit
> -----------+-------+-------------+------------------------+------------------
>  alice     | yes   | Inherit     | 2024-06-30 00:00:00+03 |
>  bob       | yes   | Inherit     | infinity               |
>  charlie   | yes   | Inherit     |                        |               11
>  postgres  | yes   | Superuser  +|                        |
>            |       | Create DB  +|                        |
>            |       | Create role+|                        |
>            |       | Inherit    +|                        |
>            |       | Replication+|                        |
>            |       | Bypass RLS  |                        |
> (4 rows)
>
> postgres@demo(17.0)=# \list
>                                                  List of databases
>    Name    |  Owner   | Encoding | Locale Provider |   Collate   |    Ctype    | Locale | ICU Rules |   Access privileges
> -----------+----------+----------+-----------------+-------------+-------------+--------+-----------+-----------------------
>  demo      | postgres | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |        |           |
>  postgres  | postgres | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |        |           |
>  template0 | postgres | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |        |           | =c/postgres          +
>            |          |          |                 |             |             |        |           | postgres=CTc/postgres
>  template1 | postgres | UTF8     | libc            | en_US.UTF-8 | en_US.UTF-8 |        |           | =c/postgres          +
>            |          |          |                 |             |             |        |           | postgres=CTc/postgres
> (4 rows)
>
> If we decide to move "Valid until" and "Connection limit" to extended mode,
> then the role attributes should be returned to their previous form by placing
> them on one line separated by commas.
>
> 1. 
> https://www.postgresql.org/message-id/27f87cb9-229b-478b-81b2-157f94239d55%40postgrespro.ru
>
> --
> Pavel Luzanov
> Postgres Professional: https://postgrespro.com

Well, it was just my opinion of how I would have liked it better; of
course you may decide against it, there are no strong feelings around
it.
And if you are on the fence about having them in normal or extended
mode, then maybe we can ask more people to chip in.

I certainly would not want you to update the patch back and forth for
almost no reason.
-- 
Regards,
Rafia Sabih




Re: Sort functions with specialized comparators

2024-07-15 Thread Stepan Neretin
On Mon, Jul 15, 2024 at 12:23 PM Антуан Виолин 
wrote:

> Hello all.
>>
>> I have decided to explore more areas in which I can optimize and have
>> added
>> two new benchmarks. Do you have any thoughts on this?
>>
>> postgres=# select bench_int16_sort(100);
>> bench_int16_sort
>>
>>
>> -
>> Time taken by usual sort: 66354981 ns, Time taken by optimized sort:
>> 52151523 ns, Percentage difference: 21.41%
>> (1 row)
>>
>> postgres=# select bench_float8_sort(100);
>> bench_float8_sort
>>
>>
>> --
>> Time taken by usual sort: 121475231 ns, Time taken by optimized sort:
>> 74458545 ns, Percentage difference: 38.70%
>> (1 row)
>>
>
>  Hello all
> We would like to see the relationship between the length of the sorted
> array and the performance gain, perhaps some graphs. We would also like
> to see a "worst case" test: sorting into ascending order an array that
> is initially in descending order.
>
> Best, regards, Antoine Violin
>
> postgres=#
>>
>
> On Mon, Jul 15, 2024 at 10:32 AM Stepan Neretin  wrote:
>
>>
>>
>> On Sat, Jun 8, 2024 at 1:50 AM Stepan Neretin  wrote:
>>
>>> Hello all.
>>>
>>> I am interested in the proposed patch and would like to propose some
>>> additional changes that would complement it. My changes would introduce
>>> similar optimizations when working with a list of integers or object
>>> identifiers. Additionally, my patch includes an extension for
>>> benchmarking, which shows an average speedup of 30-40%.
>>>
>>> postgres=# SELECT bench_oid_sort(100);
>>>  bench_oid_sort
>>>
>>>
>>> 
>>>  Time taken by list_sort: 116990848 ns, Time taken by list_oid_sort:
>>> 80446640 ns, Percentage difference: 31.24%
>>> (1 row)
>>>
>>> postgres=# SELECT bench_int_sort(100);
>>>  bench_int_sort
>>>
>>>
>>> 
>>>  Time taken by list_sort: 118168506 ns, Time taken by list_int_sort:
>>> 80523373 ns, Percentage difference: 31.86%
>>> (1 row)
>>>
>>> What do you think about these changes?
>>>
>>> Best regards, Stepan Neretin.
>>>
>>> On Fri, Jun 7, 2024 at 11:08 PM Andrey M. Borodin 
>>> wrote:
>>>
 Hi!

 In a thread about sorting comparators[0] Andres noted that we have
 infrastructure to help compiler optimize sorting. PFA attached PoC
 implementation. I've checked that it indeed works on the benchmark from
 that thread.

 postgres=# CREATE TABLE arrays_to_sort AS
SELECT array_shuffle(a) arr
FROM
(SELECT ARRAY(SELECT generate_series(1, 100)) a),
generate_series(1, 10);

 postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- original
 Time: 990.199 ms
 postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- patched
 Time: 696.156 ms

 The benefit seems to be on the order of magnitude with 30% speedup.

 There's plenty of sorting by TransactionId, BlockNumber, OffsetNumber,
 Oid etc. But this sorting routines never show up in perf top or something
 like that.

 Seems like in most cases we do not spend much time in sorting. But
 specialization does not cost us much too, only some CPU cycles of a
 compiler. I think we can further improve speedup by converting inline
 comparator to value extractor: more compilers will see what is actually
 going on. But I have no proofs for this reasoning.

 What do you think?


 Best regards, Andrey Borodin.

 [0]
 https://www.postgresql.org/message-id/flat/20240209184014.sobshkcsfjix6u4r%40awork3.anarazel.de#fc23df2cf314bef35095b632380b4a59
>>>
>>>
>> Hello all.
>>
>> I have decided to explore more areas in which I can optimize and have
>> added two new benchmarks. Do you have any thoughts on this?
>>
>> postgres=# select bench_int16_sort(100);
>> bench_int16_sort
>>
>>
>> -
>>  Time taken by usual sort: 66354981 ns, Time taken by optimized sort:
>> 52151523 ns, Percentage difference: 21.41%
>> (1 row)
>>
>> postgres=# select bench_float8_sort(100);
>> bench_float8_sort
>>
>>
>> --
>>  Time taken by usual sort: 121475231 ns, Time taken by optimized sort:
>> 74458545 ns, Percentage difference: 38.70%
>> (1 row)
>>
>> postgres=#
>>
>> B

Re: broken JIT support on Fedora 40

2024-07-15 Thread Thomas Munro
On Fri, May 17, 2024 at 4:26 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> Thanks, that's correct. I think the only thing left is to add a verifier
> pass, which everybody seems to be agreed is nice to have. The plan is to
> add it only to master without back-patching. I assume Thomas did not
> have time for that yet, so I've added the latest suggestions into his
> patch, and going to open a CF item to not forget about it.

Pushed, and closed.  Thanks!




Re: long-standing data loss bug in initial sync of logical replication

2024-07-15 Thread Amit Kapila
On Thu, Jul 11, 2024 at 6:19 PM Nitin Motiani  wrote:
>
> On Wed, Jul 10, 2024 at 11:22 PM Nitin Motiani  
> wrote:
> >
> > On Wed, Jul 10, 2024 at 10:39 PM vignesh C  wrote:
> > >
> > > On Wed, 10 Jul 2024 at 12:28, Amit Kapila  wrote:
> > > > The patch missed to use the ShareRowExclusiveLock for partitions, see
> > > > attached. I haven't tested it but they should also face the same
> > > > problem. Apart from that, I have changed the comments in a few places
> > > > in the patch.
> > >
> > > I could not hit the updated ShareRowExclusiveLock changes through the
> > > partition table, instead I could verify it using the inheritance
> > > table. Added a test for the same and also attaching the backbranch
> > > patch.
> > >
> >
> > Hi,
> >
> > I tested alternative-experimental-fix-lock.patch provided by Tomas
> > (replaces SUE with SRE in OpenTableList). I believe there are a couple
> > of scenarios the patch does not cover.
> >
> > 1. It doesn't handle the case of "ALTER PUBLICATION  ADD TABLES
> > IN SCHEMA  ".
> >
> > I took crash-test.sh provided by Tomas and modified it to add all
> > tables in the schema to publication using the following command :
> >
> >ALTER PUBLICATION p ADD TABLES IN SCHEMA  public
> >
> > The modified script is attached (crash-test-with-schema.sh). With this
> > script, I can reproduce the issue even with the patch applied. This is
> > because the code path to add a schema to the publication doesn't go
> > through OpenTableList.
> >
> > I have also attached a script run-test-with-schema.sh to run
> > crash-test-with-schema.sh in a loop with randomly generated parameters
> > (modified from run.sh provided by Tomas).
> >
> > 2.  The second issue is a deadlock which happens when the alter
> > publication command is run for a comma separated list of tables.
> >
> > I created another script create-test-tables-order-reverse.sh. This
> > script runs a command like the following :
> >
> > ALTER PUBLICATION p ADD TABLE test_2,test_1
> >
> > Running the above script, I was able to get a deadlock error (the
> > output is attached in deadlock.txt). In the alter publication command,
> > I added the tables in the reverse order to increase the probability of
> > the deadlock. But it should happen with any order of tables.
> >
> > I am not sure if the deadlock is a major issue because detecting the
> > deadlock is better than data loss.
> >

The deadlock reported in this case is expected behavior. This is no
different from locking tables or rows in reverse order.
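
For example, the classic two-session pattern produces exactly the same deadlock
with plain LOCK TABLE (just a sketch to illustrate the point, using the table
names from your script):

-- session 1:
BEGIN;
LOCK TABLE test_1 IN SHARE ROW EXCLUSIVE MODE;

-- session 2:
BEGIN;
LOCK TABLE test_2 IN SHARE ROW EXCLUSIVE MODE;

-- session 1 (blocks, waiting for session 2):
LOCK TABLE test_2 IN SHARE ROW EXCLUSIVE MODE;

-- session 2 (the lock manager detects the cycle and aborts one side):
LOCK TABLE test_1 IN SHARE ROW EXCLUSIVE MODE;
ERROR:  deadlock detected

The only way to avoid it is to acquire the locks in a consistent order.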

>
> I looked further into the scenario of adding the tables in schema to
> the publication. Since in that case, the entry is added to
> pg_publication_namespace instead of pg_publication_rel, the codepaths
> for 'add table' and 'add tables in schema' are different. And in the
> 'add tables in schema' scenario, the OpenTableList function is not
> called to get the relation ids. Therefore even with the proposed
> patch, the data loss issue still persists in that case.
>
> To validate this idea, I tried locking all the affected tables in the
> schema just before the invalidation for those relations (in
> ShareRowExclusiveLock mode).
>

This sounds like a reasonable approach to fix the issue. However, we
should check SET publication_object as well, especially its drop part.
It should not happen that while we miss sending the data for ADD, for
DROP we send data when we shouldn't have sent it.


-- 
With Regards,
Amit Kapila.




Re: Parent/child context relation in pg_get_backend_memory_contexts()

2024-07-15 Thread David Rowley
On Fri, 12 Jul 2024 at 08:09, Robert Haas  wrote:
> Do we ever have contexts with the same name at the same level? Could
> we just make the path an array of strings, so that you could then say
> something like this...
>
> SELECT * FROM pg_backend_memory_contexts where path[2] = 'CacheMemoryContext'
>
> ...and get all the things with that in the path?

Unfortunately, this wouldn't align with the goals of the patch. Going
back to Melih's opening paragraph in the initial email, he mentions
that there's currently no *reliable* way to determine the parent/child
relationship in this view.

There's been a few different approaches to making this reliable. The
first patch had "parent_id" and "id" columns.  That required a WITH
RECURSIVE query.  To get away from having to write such complex
queries, the "path" column was born.  I'm now trying to massage that
into something that's as easy to use and intuitive as possible. I've
gotta admit, I don't love the patch. That's not Melih's fault,
however. It's just the nature of what we're working with.
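
Just to illustrate the usability gap: with the original "parent_id"/"id" design
(those column names are from that early patch version, they don't exist in the
committed view), summing a context and all of its children needed something
like this:

WITH RECURSIVE ctx AS (
    SELECT id, total_bytes
    FROM pg_backend_memory_contexts
    WHERE name = 'CacheMemoryContext'
    UNION ALL
    SELECT c.id, c.total_bytes
    FROM pg_backend_memory_contexts c
    JOIN ctx ON c.parent_id = ctx.id
)
SELECT sum(total_bytes) FROM ctx;

The "path" column is there so the same question can be answered without the
recursive CTE.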

> I'm doubtful about this because nothing prevents the set of memory
> contexts from changing between one query and the next. We should try
> to make it so that it's easy to get what you want in a single query.

I don't think it's ideal that the context's ID changes in ad-hoc
queries, but I don't know how to make that foolproof.  The
breadth-first ID assignment helps, but it could certainly still catch
people out when the memory context of interest is nested at some deep
level.  The breadth-first assignment certainly helped me with the
CacheMemoryContext that I'd been testing with. It allowed me to run my
aggregate query to sum the bytes without the context created in
nodeAgg.c causing the IDs to change.

I'm open to better ideas on how to make this work, but it must meet
the spec of it being a reliable way to determine the context
relationship. If names were unique at each level having those instead
of IDs might be nice, but as Melih demonstrated, they're not. I think
even if Melih's query didn't return results, it would be a bad move to
make it work the way you mentioned if we have nothing to enforce the
uniqueness of names.

David




Re: Parent/child context relation in pg_get_backend_memory_contexts()

2024-07-15 Thread David Rowley
On Sat, 13 Jul 2024 at 10:33, Melih Mutlu  wrote:
> I've also been thinking about whether we should still have the parent column,
> as the parent can also be found by looking at the path. What do you think?

I think we should probably consider removing it. Let's think about
that later. I don't think its existence is blocking us from
progressing here.

David




Re: New function normal_rand_array function to contrib/tablefunc.

2024-07-15 Thread Dean Rasheed
On Tue, 2 Jul 2024 at 11:18, Jim Jones  wrote:
>
> When either minval or maxval exceeds int4 the function cannot be
> executed/found
>
> SELECT * FROM normal_rand_array(5, 10, 8, 42::bigint);
>
> ERROR:  function normal_rand_array(integer, integer, integer, bigint)
> does not exist
> LINE 1: SELECT * FROM normal_rand_array(5, 10, 8, 42::bigint);
>   ^
> HINT:  No function matches the given name and argument types. You might
> need to add explicit type casts.
>

This could be solved by defining separate functions for each supported
type, rather than one function with type anyelement. Internally, they
could continue to share common code, of course.

> In some cases the function returns an empty array. Is it also expected?
>

Perhaps it would be useful to have separate minimum and maximum array
length arguments, rather than a mean array length argument.

Actually, I find the current behaviour somewhat counterintuitive. Only
after reading the source code did I realise what it's actually doing,
which is this:

Row 1: array of random length in range [0, meanarraylen]
Row 2: array of length 2*meanarraylen - length of previous array
Row 3: array of random length in range [0, meanarraylen]
Row 4: array of length 2*meanarraylen - length of previous array
...

That's far from obvious (it's certainly not documented) and I don't
think it's a particularly good way of achieving a specified mean array
length, because only half the lengths are random.
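
To make that concrete, here's a rough simulation of the length sequence it
produces (assuming meanarraylen = 4; this just mimics the description above,
it is not the patch's code):

SELECT i,
       CASE WHEN i % 2 = 1
            THEN len                                  -- odd rows: random in [0, 4]
            ELSE 2 * 4 - lag(len) OVER (ORDER BY i)   -- even rows: forced complement
       END AS arraylen
FROM (SELECT i, (random() * 4)::int AS len
      FROM generate_series(1, 10) AS i) AS s;

Only the odd rows contribute any randomness; each even row is fully determined
by its predecessor, which is why the distribution of lengths looks so strange.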

One thing that's definitely needed is more documentation. It should
have its own new subsection, like the other tablefunc functions.

Something else that confused me is why this function is called
normal_rand_array(). The underlying random functions that it's calling
return random values from a uniform distribution, not a normal
distribution. Arrays of normally distributed random values might also
be useful, but that's not what this patch is doing.

Also, the function accepts float8 minval and maxval arguments, and
then simply ignores them and returns random float8 values in the range
[0,1), which is highly counterintuitive.

My suggestion would be to mirror the signatures of the core random()
functions more closely, and have this:

1). rand_array(numvals int, minlen int, maxlen int)
returns setof float8[]

2). rand_array(numvals int, minlen int, maxlen int,
   minval int, maxval int)
returns setof int[]

3). rand_array(numvals int, minlen int, maxlen int,
   minval bigint, maxval bigint)
returns setof bigint[]

4). rand_array(numvals int, minlen int, maxlen int,
   minval numeric, maxval numeric)
returns setof numeric[]

Also, I'd recommend giving the function arguments names in SQL, like
this, since that makes them easier to use.
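
For example, with signature (2) above, a call might look like this (purely
hypothetical -- these functions don't exist yet, it's just what the proposal
would allow):

SELECT * FROM rand_array(numvals => 3, minlen => 2, maxlen => 5,
                         minval => 10, maxval => 20);
-- 3 rows, each an int[] of 2 to 5 elements with values between 10 and 20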

Something else that's not obvious is that this patch is relying on the
core random functions, which means that it's using the same PRNG state
as those functions. That's probably OK, but it should be documented,
because it's different from tablefunc's normal_rand() function, which
uses pg_global_prng_state. In particular, what this means is that
calling setseed() will affect the output of these new functions, but
not normal_rand(). Incidentally, that makes writing tests much simpler
-- just call setseed() first and the output will be repeatable.

Regards,
Dean




Re: Sort functions with specialized comparators

2024-07-15 Thread Stepan Neretin
On Mon, Jul 15, 2024 at 4:52 PM Stepan Neretin  wrote:

>
>
> On Mon, Jul 15, 2024 at 12:23 PM Антуан Виолин 
> wrote:
>
>> Hello all.
>>>
>>> I have decided to explore more areas in which I can optimize and have
>>> added
>>> two new benchmarks. Do you have any thoughts on this?
>>>
>>> postgres=# select bench_int16_sort(100);
>>> bench_int16_sort
>>>
>>>
>>> -
>>> Time taken by usual sort: 66354981 ns, Time taken by optimized sort:
>>> 52151523 ns, Percentage difference: 21.41%
>>> (1 row)
>>>
>>> postgres=# select bench_float8_sort(100);
>>> bench_float8_sort
>>>
>>>
>>> --
>>> Time taken by usual sort: 121475231 ns, Time taken by optimized sort:
>>> 74458545 ns, Percentage difference: 38.70%
>>> (1 row)
>>>
>>
>>  Hello all
>> We would like to see the relationship between the length of the sorted
>> array and the performance gain, perhaps some graphs. We would also like
>> to see a "worst case" test: sorting into ascending order an array that
>> is initially in descending order.
>>
>> Best, regards, Antoine Violin
>>
>> postgres=#
>>>
>>
>> On Mon, Jul 15, 2024 at 10:32 AM Stepan Neretin 
>> wrote:
>>
>>>
>>>
>>> On Sat, Jun 8, 2024 at 1:50 AM Stepan Neretin  wrote:
>>>
 Hello all.

 I am interested in the proposed patch and would like to propose some
 additional changes that would complement it. My changes would introduce
 similar optimizations when working with a list of integers or object
 identifiers. Additionally, my patch includes an extension for
 benchmarking, which shows an average speedup of 30-40%.

 postgres=# SELECT bench_oid_sort(100);
  bench_oid_sort


 
  Time taken by list_sort: 116990848 ns, Time taken by list_oid_sort:
 80446640 ns, Percentage difference: 31.24%
 (1 row)

 postgres=# SELECT bench_int_sort(100);
  bench_int_sort


 
  Time taken by list_sort: 118168506 ns, Time taken by list_int_sort:
 80523373 ns, Percentage difference: 31.86%
 (1 row)

 What do you think about these changes?

 Best regards, Stepan Neretin.

 On Fri, Jun 7, 2024 at 11:08 PM Andrey M. Borodin 
 wrote:

> Hi!
>
> In a thread about sorting comparators[0] Andres noted that we have
> infrastructure to help compiler optimize sorting. PFA attached PoC
> implementation. I've checked that it indeed works on the benchmark from
> that thread.
>
> postgres=# CREATE TABLE arrays_to_sort AS
>SELECT array_shuffle(a) arr
>FROM
>(SELECT ARRAY(SELECT generate_series(1, 100)) a),
>generate_series(1, 10);
>
> postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- original
> Time: 990.199 ms
> postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- patched
> Time: 696.156 ms
>
> The benefit seems to be on the order of magnitude with 30% speedup.
>
> There's plenty of sorting by TransactionId, BlockNumber, OffsetNumber,
> Oid etc. But this sorting routines never show up in perf top or something
> like that.
>
> Seems like in most cases we do not spend much time in sorting. But
> specialization does not cost us much too, only some CPU cycles of a
> compiler. I think we can further improve speedup by converting inline
> comparator to value extractor: more compilers will see what is actually
> going on. But I have no proofs for this reasoning.
>
> What do you think?
>
>
> Best regards, Andrey Borodin.
>
> [0]
> https://www.postgresql.org/message-id/flat/20240209184014.sobshkcsfjix6u4r%40awork3.anarazel.de#fc23df2cf314bef35095b632380b4a59


>>> Hello all.
>>>
>>> I have decided to explore more areas in which I can optimize and have
>>> added two new benchmarks. Do you have any thoughts on this?
>>>
>>> postgres=# select bench_int16_sort(100);
>>> bench_int16_sort
>>>
>>>
>>> -
>>>  Time taken by usual sort: 66354981 ns, Time taken by optimized sort:
>>> 52151523 ns, Percentage difference: 21.41%
>>> (1 row)
>>>
>>> postgres=# select bench_float8_sort(100);
>>> bench_float8_sort
>>>
>>>
>>> -

Re: [PATCH] TODO “Allow LISTEN on patterns”

2024-07-15 Thread Alexander Cheshev
Hi Emanuel,

I have changed the implementation of the function Exec_UnlistenCommit. v2 of the
patch contained a bug in Exec_UnlistenCommit (I added a test case for that), and
it also was not implemented in a form natural to C, using pointers.
Now it looks fine and works as expected.

In the previous email I forgot to mention that the new implementation of
the function Exec_UnlistenCommit has the same space and time complexities
as the original implementation (which doesn't support wildcards).
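
To recap the semantics in this version (the channels are taken from the
examples in the patch description below; this is a sketch of the intended
behaviour, not output from an actual session):

LISTEN a.b%.c;    -- '%' matches to the end of one level: a.b.c, a.bc.c, ...
LISTEN a.b*;      -- '*' matches everything to the right: a.b, a.bc.d, ...
UNLISTEN a*;      -- cancels registrations on all channels starting with "a"
UNLISTEN *;       -- the existing command, now just the most general case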

Regards,
Alexander Cheshev


On Sat, 13 Jul 2024 at 13:26, Alexander Cheshev 
wrote:

> Hi Emanuel,
>
> I did a test over the "UNLISTEN >" behavior, and I'm not sure if this is
>> expected.
>> This command I assume should free all the listening channels, however, it
>> doesn't
>> seem to do so:
>
>
> TODO “Allow LISTEN on patterns” [1] is a bit vague about that feature. So
> I didn't implement it in the first version of the patch. Also I see that I
> made a mistake in the documentation and mentioned that it is actually
> supported. Sorry for the confusion.
>
> Besides obvious reasons I think that your finding is especially attractive
> for the following reason. We have an UNLISTEN * command. If we replace >
> with * in the patch (which I actually did in the new version of the patch)
> then we have a generalisation of the above command. For example, UNLISTEN
> a* cancels registration on all channels which start with a.
>
> I attached to the email the new version of the patch which supports the
> requested feature. Instead of > I use * for the reason which I mentioned
> above. Also I added test cases, changed documentation, etc.
>
> I appreciate your work, Emanuel! If you have any further findings I will
> be glad to adjust the patch accordingly.
>
> [1]
> https://www.postgresql.org/message-id/flat/52693FC5.7070507%40gmail.com
>
> Regards,
> Alexander Cheshev
>
> Regards,
> Alexander Cheshev
>
>
> On Tue, 9 Jul 2024 at 11:01, Emanuel Calvo <3man...@gmail.com> wrote:
>
>>
>> Hello there,
>>
>>
>> El vie, 15 mar 2024 a las 9:01, Alexander Cheshev (<
>> alex.ches...@gmail.com>) escribió:
>>
>>> Hello Hackers,
>>>
>>> I have implemented TODO “Allow LISTEN on patterns” [1] and attached
>>> the patch to the email. The patch basically consists of the following
>>> two parts.
>>>
>>> 1. Support wildcards in LISTEN command
>>>
>>> Notification channels can be composed of multiple levels in the form
>>> ‘a.b.c’ where ‘a’, ‘b’ and ‘c’ are identifiers.
>>>
>>> Listen channels can be composed of multiple levels in the form ‘a.b.c’
>>> where ‘a’, ‘b’ and ‘c’ are identifiers which can contain the following
>>> wildcards:
>>>  *  ‘%’ matches everything until the end of a level. Can only appear
>>> at the end of a level. For example, the notification channels ‘a.b.c’,
>>> ‘a.bc.c’ match against the listen channel ‘a.b%.c’.
>>>  * ‘>’ matches everything to the right. Can only appear at the end of
>>> the last level. For example, the notification channels ‘a.b’, ‘a.bc.d’
>>> match against the listen channel ‘a.b>’.
>>>
>>>
>> I did a test over the "UNLISTEN >" behavior, and I'm not sure if this is
>> expected.
>> This command I assume should free all the listening channels, however, it
>> doesn't
>> seem to do so:
>>
>> postgres=# LISTEN device1.alerts.%;
>> LISTEN
>> postgres=# ;
>> Asynchronous notification "device1.alerts.temp" with payload "80"
>> received from server process with PID 237.
>> postgres=# UNLISTEN >;
>> UNLISTEN
>> postgres=# ; -- Here I send a notification over the same channel
>> Asynchronous notification "device1.alerts.temp" with payload "80"
>> received from server process with PID 237.
>>
>> The same happens with "UNLISTEN %;", although I'm not sure if this should
>> have
>> the same behavior.
>>
>> It stops listening correctly if I do explicit UNLISTEN (exact channel
>> matching).
>>
>> I'll be glad to conduct more tests or checks on this.
>>
>> Cheers,
>>
>>
>> --
>> --
>> Emanuel Calvo
>> Database Engineering
>> https://tr3s.ma/aobut
>>
>>
From 403a432c129084ac4f17a92e29aa5398de0125f0 Mon Sep 17 00:00:00 2001
From: Alexander Cheshev 
Date: Thu, 14 Mar 2024 21:53:29 +0100
Subject: [PATCH v3] Support wildcards in LISTEN command
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Notification channels can be composed of multiple levels in the form ‘a.b.c’ where ‘a’, ‘b’ and ‘c’ are identifiers.

Listen and unlisten channels can be composed of multiple levels in the form ‘a.b.c’ where ‘a’, ‘b’ and ‘c’ are identifiers which can contain the following wildcards:
* The wildcard ‘%’ matches everything until the end of a level. Can only appear at the end of a level. For example, the notification channels ‘a.b.c’, ‘a.bc.c’ match against the notification channel ‘a.b%.c’.
* The wildcard ‘*’ matches everything to the right. Can only appear at the end of the last level. For example, the notification channels ‘a.b’, ‘a.bc.d’ match against the notification channel ‘a.b*’.

Use binary trie to

Re: [HACKERS] make async slave to wait for lsn to be replayed

2024-07-15 Thread Alexander Korotkov
On Mon, Jul 15, 2024 at 4:24 AM Alexander Korotkov  wrote:
> Thanks to Kyotaro for the review.  And thanks to Ivan for the patch
> revision.  I made another revision of the patch.

I've noticed failures on cfbot.  The attached revision addressed docs
build failure.  Also it adds some "quits" for background psql sessions
for tests.  Probably this will address test hangs on windows.

--
Regards,
Alexander Korotkov
Supabase


v22-0001-Implement-pg_wal_replay_wait-stored-procedure.patch
Description: Binary data


Re: Patch bug: Fix jsonpath .* on Arrays

2024-07-15 Thread Stepan Neretin
On Mon, Jul 8, 2024 at 11:09 PM David E. Wheeler 
wrote:

> On Jun 27, 2024, at 04:17, Michael Paquier  wrote:
>
> > The tests of jsonb_jsonpath.sql include a lot of patterns for @? and
> > jsonb_path_query with the lax and strict modes, so shouldn't these
> > additions be grouped closer to the existing tests rather than added at
> > the end of the file?
>
> I’ve moved them closer to other tests for unwrapping behavior in the
> attached updated and rebased v3 patch.
>
> Best,
>
> David
>
>
>

Hi! Looks good to me now! Please register the patch in the CF.
Best regards, Stepan Neretin.


Re: Parent/child context relation in pg_get_backend_memory_contexts()

2024-07-15 Thread David Rowley
On Sat, 13 Jul 2024 at 10:12, Melih Mutlu  wrote:
> I updated documentation for path and level columns and also fixed the tests 
> as level starts from 1.

Thanks for updating.

+   The path column can be useful to build
+   parent/child relation between memory contexts. For example, the following
+   query calculates the total number of bytes used by a memory context and its
+   child contexts:

"a memory context" doesn't quite sound specific enough. Let's say what
the query is doing exactly.

+
+WITH memory_contexts AS (
+SELECT *
+FROM pg_backend_memory_contexts
+)
+SELECT SUM(total_bytes)
+FROM memory_contexts
+WHERE ARRAY[(SELECT path[array_length(path, 1)] FROM memory_contexts
WHERE name = 'CacheMemoryContext')] <@ path;

I don't think that example query is the most simple example. Isn't it
better to use the most simple form possible to express that?

I think it would be nice to give an example of using "level" as an
index into "path"

WITH c AS (SELECT * FROM pg_backend_memory_contexts)
SELECT sum(c1.total_bytes)
FROM c c1, c c2
WHERE c2.name = 'CacheMemoryContext'
AND c1.path[c2.level] = c2.path[c2.level];

I think the regression test query could be done using the same method.

>> + while (queue != NIL)
>> + {
>> + List *nextQueue = NIL;
>> + ListCell *lc;
>> +
>> + foreach(lc, queue)
>> + {
>
>
> I don't think we need this outer while loop. Appending to the end of a queue 
> naturally results in top-to-bottom order anyway, keeping two lists, "queue" 
> and "nextQueue", might not be necessary. I believe that it's safe to append 
> to a list while iterating over that list in a foreach loop. v9 removes 
> nextQueue and appends directly into queue.

The foreach() macro seems to be ok with that. I am too.  The following
comment will need to be updated:

+ /*
+ * Queue up all the child contexts of this level for the next
+ * iteration of the outer loop.
+ */

That outer loop is gone.

Also, this was due to my hasty writing of the patch. I named the
function get_memory_context_name_and_indent. I meant to write "ident".
If we did get rid of the "parent" column, I'd not see any need to keep
that function. The logic could just be put in
PutMemoryContextsStatsTupleStore(). I just did it that way to avoid
the repeat.

David




Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-07-15 Thread Amit Kapila
On Tue, Jul 9, 2024 at 6:23 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> The previous patch set could not be accepted due to a missed initialization.
> PSA new version.
>

Few minor comments:
=
0001-patch
1.
.git/rebase-apply/patch:253: space before tab in indent.

errmsg("slot_name and two_phase cannot be altered at the same
time")));
warning: 1 line adds whitespace errors.

White space issue as shown by git am command.

2.
+/*
+ * Common checks for altering failover and two_phase option
+ */
+static void
+CommonChecksForFailoverAndTwophase(Subscription *sub, const char *option,
+bool isTopLevel)

The function name looks odd to me. How about something along the lines
of CheckAlterSubOption()?

3.
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot disable two_phase when uncommitted prepared
transactions present"),

We can slightly change the above error message to: "cannot disable
two_phase when prepared transactions are present".

0003-patch
Alter the altering from
+  true to false, the publisher will
+  replicate transactions again when they are committed.

The beginning of the sentence sounds awkward.


-- 
With Regards,
Amit Kapila.




Re: add function argument names to regex* functions.

2024-07-15 Thread jian he
On Thu, May 16, 2024 at 3:10 AM Tom Lane  wrote:
>
> Robert Haas  writes:
> > On Thu, Apr 4, 2024 at 9:55 AM jian he  wrote:
> >> changing "N" to lower-case would be misleading for regexp_replace?
> >> so I choose "count".
>
> > I don't see why that would be confusing for regexp_replace
> > specifically, but I think N => count is a reasonable change to make.
> > However, I don't think this quite works:
> > + then the count'th match of the pattern
>
> I think the origin of the problem here is not wanting to use "N"
> as the actual name of the parameter, because then users would have
> to double-quote it to write "regexp_replace(..., "N" => 42, ...)".
>
> However ... is that really so awful?  It's still fewer keystrokes
> than "count".  It's certainly a potential gotcha for users who've
> not internalized when they need double quotes, but I think we
> could largely address that problem just by making sure to provide
> a documentation example that shows use of "N".

done it this way. patch attached.

The last example changed from

regexp_replace('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 3, 'i')
   A PostgrXSQL function

to

regexp_replace(string=>'A PostgreSQL function', pattern=>'a|e|i|o|u',
replacement=>'X', start=>1, "N"=>3, flags=>'i');
   A PostgrXSQL function

but I am not 100% sure the result
A PostgrXSQL function
is rendered in the right position.


This also addresses Chapman Flack's point: correct me if I am wrong, but I don't
think the ISO standard mandates function argument names.
So we can choose the best argument names for our purpose?
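
A tiny illustration of the double-quoting gotcha Tom mentions above (param_demo
is a throwaway function created only for this example, it is not part of the
patch):

CREATE FUNCTION param_demo("N" int) RETURNS int
    LANGUAGE sql AS 'SELECT "N"';

SELECT param_demo("N" => 42);   -- OK
SELECT param_demo(N => 42);     -- fails: unquoted N is folded to "n",
                                -- which matches no parameter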
From 168dd2d06ce441958ac3ba2c70864da658540b1c Mon Sep 17 00:00:00 2001
From: jian he 
Date: Mon, 15 Jul 2024 18:51:18 +0800
Subject: [PATCH v5 1/1] add regex functions argument names to pg_proc

Specifically, add function argument names to the following functions:
regexp_replace,
regexp_match,
regexp_matches,
regexp_count,
regexp_instr,
regexp_like,
regexp_substr,
regexp_split_to_table,
regexp_split_to_array
So it would be easier to understand these functions in psql via \df.
Now these functions can be called in different notations.
---
 doc/src/sgml/func.sgml  |  8 ++--
 src/include/catalog/pg_proc.dat | 71 ++---
 2 files changed, 61 insertions(+), 18 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 785886af..9cdb3cea 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6109,7 +6109,7 @@ SELECT col1, (SELECT regexp_matches(col2, '(bar)(beque)')) FROM tab;
  The regexp_replace function provides substitution of
  new text for substrings that match POSIX regular expression patterns.
  It has the syntax
- regexp_replace(source,
+ regexp_replace(string,
  pattern, replacement
  , start
  , N
@@ -6118,9 +6118,9 @@ SELECT col1, (SELECT regexp_matches(col2, '(bar)(beque)')) FROM tab;
  (Notice that N cannot be specified
  unless start is,
  but flags can be given in any case.)
- The source string is returned unchanged if
+ The string is returned unchanged if
  there is no match to the pattern.  If there is a
- match, the source string is returned with the
+ match, the string is returned with the
  replacement string substituted for the matching
  substring.  The replacement string can contain
  \n, where n is 1
@@ -6161,7 +6161,7 @@ regexp_replace('foobarbaz', 'b(..)', 'X\1Y', 'g')
fooXarYXazY
 regexp_replace('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 0, 'i')
X PXstgrXSQL fXnctXXn
-regexp_replace('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 3, 'i')
+regexp_replace(string=>'A PostgreSQL function', pattern=>'a|e|i|o|u', replacement=>'X',start=>1, "N"=>3, flags=>'i');
A PostgrXSQL function
 

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 73d9cf85..e4ead68f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3623,105 +3623,148 @@
   prosrc => 'replace_text' },
 { oid => '2284', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
-  proargtypes => 'text text text', prosrc => 'textregexreplace_noopt' },
+  proargtypes => 'text text text',
+  proargnames => '{string, pattern, replacement}',
+  prosrc => 'textregexreplace_noopt' },
 { oid => '2285', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
-  proargtypes => 'text text text text', prosrc => 'textregexreplace' },
+  proargtypes => 'text text text text',
+  proargnames => '{string, pattern, replacement, flags}',
+  prosrc => 'textregexreplace' },
 { oid => '6251', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
   proargtypes => 'text text text

Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-07-15 Thread Ashutosh Sharma
Hi Robert,

On Fri, Jul 12, 2024 at 9:02 PM Robert Haas  wrote:
>
> On Mon, Jun 24, 2024 at 3:10 PM Jeff Davis  wrote:
> > A special search_path variable "$extension_schema" would be a better
> > solution to this problem. We need something like that anyway, in case
> > the extension is relocated, so that we don't have to dig through the
> > catalog and update all the settings.
>
> +1. I think we need that in order for this proposal to make any progress.
>
> And it seems like the patch has something like that, but I don't
> really understand exactly what's going on here, because it's also got
> hunks like this in a bunch of places:
>
> + if (strcmp($2, "$extension_schema") == 0)
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("search_path cannot be set to $extension_schema"),
> + parser_errposition(@2)));
>
> So it seems like the patch is trying to support $extension_schema in
> some cases but not others, which seems like an odd choice that, as far
> as I can see, is not explained anywhere: not in the commit message,
> not in the comments (which are pretty minimally updated by the patch),
> and not in the documentation (which the patch doesn't update at all).
>
> Ashutosh, can we please get some of that stuff updated, or at least a
> clearer explanation of what's going on with $extension_schema here?
>

I've added these changes to restrict users from explicitly setting the
$extension_schema in the search_path. This ensures that
$extension_schema can only be set implicitly for functions created by
the extension when the "protected" flag is enabled.
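
For context, the status-quo pattern this is meant to improve on looks roughly
like the following ("ext" is just a placeholder schema name, and the function
body is a dummy):

CREATE FUNCTION ext.do_stuff() RETURNS int
    LANGUAGE sql
    SECURITY DEFINER
    SET search_path = ext, pg_temp   -- schema name hard-coded today; has to be
                                     -- updated if the extension is relocated
AS 'SELECT 1';

With $extension_schema the SET clause would not need to name the schema
explicitly, so relocating the extension would not require rewriting every
function's setting.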

I apologize for not commenting on this change initially. I'll review
the patch, add comments where needed, and submit an updated version.

--
With Regards,
Ashutosh Sharma.




Re: Make tuple deformation faster

2024-07-15 Thread Matthias van de Meent
On Tue, 2 Jul 2024 at 02:23, David Rowley  wrote:
>
> On Mon, 1 Jul 2024 at 23:42, Matthias van de Meent
>  wrote:
> >
> > On Mon, 1 Jul 2024 at 12:49, David Rowley  wrote:
> > >
> > > On Mon, 1 Jul 2024 at 22:07, Matthias van de Meent
> > >  wrote:
> > > > One thing I'm slightly concerned about is that this allocates another
> > > > 8 bytes for each attribute in the tuple descriptor. While that's not a
> > > > lot when compared with the ->attrs array, it's still quite a lot when
> > > > we might not care at all about this data; e.g. in temporary tuple
> > > > descriptors during execution, in intermediate planner nodes.
> > >
> > > I've not done it in the patch, but one way to get some of that back is
> > > to ditch pg_attribute.attcacheoff. There's no need for it after this
> > > patch.  That's only 4 out of 8 bytes, however.
> >
> > FormData_pg_attribute as a C struct has 4-byte alignment; AFAIK it
> > doesn't have any fields that require 8-byte alignment? Only on 64-bit
> > systems we align the tuples on pages with 8-byte alignment, but
> > in-memory arrays of the struct wouldn't have to deal with that, AFAIK.
>
> Yeah, 4-byte alignment.  "out of 8 bytes" I was talking about is
> sizeof(TupleDescDeformAttr), which I believe is the same "another 8
> bytes" you had mentioned. What I meant was that deleting attcacheoff
> only reduces FormData_pg_attribute by 4 bytes per column and adding
> TupleDescDeformAttr adds 8 per column, so we still use 4 more bytes
> per column with the patch.

I see I was confused, thank you for clarifying. As I said, the
concerns were only small; 4 more bytes only in memory shouldn't matter
much in the grand scheme of things.

> I'm happy to keep going with this version of the patch

+1, go for it.


Kind regards,

Matthias van de Meent
Neon (https://neon.tech)




Re: Showing applied extended statistics in explain Part 2

2024-07-15 Thread Tomas Vondra



On 6/26/24 11:06, Tatsuro Yamada wrote:
> Hi Tomas!
> 
> Thanks for the comments!
> 
> 1) The patch is not added to the CF app, which I think is a mistake. Can
>> you please add it to the 2024-07 commitfest? Otherwise people may not be
>> aware of it, won't do reviews etc. It'll require posting a rebased
>> patch, but should not be a big deal.
>>
> 
> I added the patch to the 2024-07 commitfest today.
> 
> 
> 2) Not having the patch in a CF also means cfbot is not running tests on
>> it. Which is unfortunate, because the patch actually has an a bug cfbot
>> would find - I've noticed it after running the tests through the github
>> CI, see [2].
>> 3) The bug has this symptom:
>>   ERROR:  unrecognized node type: 268
>>   CONTEXT:  PL/pgSQL function check_estimated_rows(text) line 7 ...
>>   STATEMENT:  SELECT * FROM check_estimated_rows('SELECT a, b FROM ...
>> 4) I can think of two basic ways to fix this issue - either allow
>> copying of the StatisticExtInto node, or represent the information in a
>> different way (e.g. add a new node for that purpose, or use existing
>> nodes to do that).
>>
> 
> Thanks for the info. I'll investigate using cfbot.
> To fix the problem, I understand we need to create a new struct like
> (statistics OID, list of clauses, flag is_or).
> 

Yes, something like that, in the plannodes.h.

> 
> 5) In [3] Tom raised two possible issues with doing this - cost of
>> copying the information, and locking. For the performance concerns, I
>> think the first thing we should do is measuring how expensive it is. I
>> suggest measuring the overhead for about three basic cases:
>>
> 
> Okay, I'll measure it once the patch is completed and check the overhead.
> I read [3][4] and in my opinion I agree with Robert.
> As with indexes, there should be a mechanism for determining whether
> extended statistics are used or not. If it were available, users would be
> able to
> tune using extended statistics and get better execution plans.
> 

I do agree with that, but I also understand Tom's concerns about the
costs. His concern is that to make this work, we have to keep/copy the
information for all queries, even if the user never runs EXPLAIN.

Yes, we do the same thing (copy of some pieces) for indexes, and from
this point of view it's equally reasonable. But there's the difference
that for indexes it's always been done this way, hence it's considered
"the baseline", while for extended stats we've not copied the data until
this patch, so it'd be seen as a regression.

I think there are two ways to deal with this - ideally, we'd show that
the overhead is negligible (~noise). And if it's measurable, we'd need
to argue that it's worth it - but that's much harder, IMHO.

So I'd suggest you try to measure the overhead on a couple cases (simple
query with 0 or more statistics applied).
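
Something as small as this would do for a first number (a sketch; the idea is
to compare planning time with and without the extended statistics defined):

CREATE TABLE t (a int, b int);
INSERT INTO t SELECT i % 100, i % 100 FROM generate_series(1, 100000) AS i;
CREATE STATISTICS t_ab ON a, b FROM t;
ANALYZE t;

-- EXPLAIN (SUMMARY) reports "Planning Time"; run this in a loop (or via
-- pgbench -f) once with t_ab defined and once after dropping it.
EXPLAIN (SUMMARY) SELECT * FROM t WHERE a = 1 AND b = 1;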

> 
> 
>> 6) I'm not sure we want to have this under EXPLAIN (VERBOSE). It's what
>> I did in the initial PoC patch, but maybe we should invent a new flag
>> for this purpose, otherwise VERBOSE will cover too much stuff? I'm
>> thinking about "STATS" for example.
>>
>> This would probably mean the patch should also add a new auto_explain
>> "log_" flag to enable/disable this.
>>
> 
> I thought it might be better to do this, so I'll fix it.
> 

OK

> 
> 
>> 7) The patch really needs some docs - I'd mention this in the EXPLAIN
>> docs, probably. There's also a chapter about estimates, maybe that
>> should mention this too? Try searching for places in the SGML docs
>> mentioning extended stats and/or explain, I guess.
>>
> 
> I plan to create documentation after the specifications are finalized.
> 

I'm not sure that's a good approach. Maybe it doesn't need to be
mentioned in the section explaining how estimates work, but it'd be good
to have it at least in the EXPLAIN command docs. The thing is - docs are
a nice way for reviewers to learn about how the feature is expected to
work / be used. Yes, they may need to be adjusted if the patch changes,
but that's likely much easier than changing the code.

> 
> 
>> For tests, I guess stats_ext is the right place to test this. I'm not
>> sure what's the best way to do this, though. If it's covered by VERBOSE,
>> that seems it might be unstable - and that would be an issue. But maybe
>> we might add a function similar to check_estimated_rows(), but verifying
>>  the query used the expected statistics to estimate expected clauses.
>>
> 
> As for testing, I think it's more convenient for reviewers to include it in
> the patch,
> so I'm thinking of including it in the next patch.
> 

I'm not sure I understand what you mean - what is more convenient to
include in the patch & you plan to include in the next patch version?

My opinion is that there clearly need to be some regression tests, be it
in stats_ext.sql or in some other script. But to make it easier, we
might have a function similar to check_estimated_rows() which would
extract just the interesting part of the plan.

> 
> So ther

Re: Showing applied extended statistics in explain Part 2

2024-07-15 Thread Tomas Vondra
On 6/28/24 13:16, Tatsuro Yamada wrote:
> Hi Tomas and All,
> 
> Attached file is a new patch including:
> 6) Add stats option to explain command
> 7) The patch really needs some docs (partly)
> 
>  >4) Add new node (resolve errors in cfbot and prepared statement)
> 
> I tried adding a new node in pathnode.h, but it doesn't work well.
> So, it needs more time to implement it successfully because this is
> the first time to add a new node in it.
> 

I'm not sure why it didn't work well, and I haven't tried adding the
struct myself so I might be missing something important, but my
assumption was that the new struct would go into plannodes.h. The planning
works in phases:

  parse -> build Path nodes -> pick cheapest Path -> create Plan

and it's the Plan that is printed by EXPLAIN. The pathnodes.h and
plannodes.h match this, so if it's expected to be in Plan it should go
to plannodes.h I think.

> 
>> 8) Add regression test (stats_ext.sql)
> 
> 
> Actually, I am not yet able to add new test cases to stats_ext.sql.

Why is that not possible? Can you explain?

> Instead, I created a simple test (test.sql) and have attached it.
> Also, output.txt is the test result.
> 
> To add new test cases to stats_ext.sql,
> I'd like to decide on a strategy for modifying it. In particular, there are
> 381 places where the check_estimated_rows function is used, so should I
> include the same number of tests, or should we include the bare minimum
> of tests that cover the code path? I think only the latter would be fine.
> Any advice is appreciated. :-D
> 

I don't understand. My suggestion was to create a new function, similar
to check_estimated_rows(), that's get a query, do EXPLAIN and extract
the list of applied statistics. Similar to what check_estimated_rows()
does for number of rows.

I did not mean to suggest you modify check_estimated_rows() to extract
both the number of rows and statistics, nor to modify the existing tests
(that's not very useful, because there's only one extended statistics object in
each of those tests, and by testing the estimate we implicitly test that
it's applied).

My suggestion is to add a couple new queries, with multiple statistics
and multiple clauses etc. And then test the patch on those. You could do
simple EXPLAIN (COSTS OFF), or add the new function to make it a bit
more stable (but maybe it's not worth it).
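
As a rough illustration of that idea (the function name and the exact
"Ext Stats" line format it scrapes are assumptions on my part, and it
relies on the STATS option proposed here):

    create function check_applied_stats(text) returns setof text
    language plpgsql as
    $$
    declare
        ln text;
    begin
        for ln in execute format('explain (stats, costs off) %s', $1)
        loop
            if ln like '%Ext Stats:%' then
                return next trim(ln);
            end if;
        end loop;
    end;
    $$;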


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Patch bug: Fix jsonpath .* on Arrays

2024-07-15 Thread David E. Wheeler
On Jul 15, 2024, at 07:07, Stepan Neretin  wrote:

> Hi! Looks good to me now! Please, register a patch in CF.
> Best regards, Stepan Neretin.

It’s here:

  https://commitfest.postgresql.org/48/5017/

Best,

David





Re: Showing applied extended statistics in explain Part 2

2024-07-15 Thread Tomas Vondra
Hi,

Let me share my opinion on those questions ...


On 7/12/24 12:09, masahiro.ik...@nttdata.com wrote:
> Hi,
> 
> Thanks for working the feature. As a user, I find it useful, and I'd like to 
> use
> it in v18! Although I've just started start looking into it, I have a few 
> questions.
> 
> 
> (1)
> 
> Is it better to make the order of output consistent? For example, even
> though there are three clauses shown in the below case, the order does not
> match.
> * "Filter" shows that "id1" is first.
> * "Ext Stats" shows that "id2" is first.
> 
> -- An example
> DROP TABLE IF EXISTS test;
> CREATE TABLE test (id1 int2, id2 int4, id3 int8, value varchar(32));
> INSERT INTO test (SELECT i%11, i%103, i%1009, 'hello' FROM 
> generate_series(1,100) s(i));
> create statistics test_s1 on id1, id2 from test; analyze;
> 
> =# EXPLAIN (STATS) SELECT * FROM test WHERE id1 = 1 AND (id2 = 2 OR id2 > 10);
>   QUERY PLAN  
>  
> ---
>  Gather  (cost=1000.00..23092.77 rows=84311 width=20)
>Workers Planned: 2
>->  Parallel Seq Scan on test  (cost=0.00..13661.67 rows=35130 width=20)
>  Filter: ((id1 = 1) AND ((id2 = 2) OR (id2 > 10)))
> -- here
>  Ext Stats: public.test_s1  Clauses: (((id2 = 2) OR (id2 > 10)) AND 
> (id1 = 1))-- here
> (5 rows)
> 

I don't think we need to make the order consistent. It probably wouldn't
hurt, but I'm not sure it's even possible for all scan types - for
example in an index scan, the clauses might be split between index
conditions and filters, etc.

> 
> 
> (2)
> 
> Do we really need the schema names without VERBOSE option? As in the above 
> case,
> "Ext Stats" shows schema name "public", even though the table name "test" 
> isn't
> shown with its schema name.
> 
> Additionally, if the VERBOSE option is specified, should the column names 
> also be
> printed with namespace?
> 
> =# EXPLAIN (VERBOSE, STATS) SELECT * FROM test WHERE id1 = 1 AND (id2 = 2 OR 
> id2 > 10);
>   QUERY PLAN  
>  
> ---
>  Gather  (cost=1000.00..22947.37 rows=82857 width=20)
>Output: id1, id2, id3, value
>Workers Planned: 2
>->  Parallel Seq Scan on public.test  (cost=0.00..13661.67 rows=34524 
> width=20)
>  Output: id1, id2, id3, value
>  Filter: ((test.id1 = 1) AND ((test.id2 = 2) OR (test.id2 > 10)))
>  Ext Stats: public.test_s1  Clauses: (((id2 = 2) OR (id2 > 10)) AND 
> (id1 = 1))   -- here
> (7 rows)
> 

Yeah, I don't think there's a good reason to force printing schema for
the statistics, if it's not needed for the table. The rules should be
the same, I think.

> 
> 
> (3)
> 
> I might be misunderstanding something, but do we need the clauses? Is there 
> any
> case where users would want to know the clauses? For example, wouldn't the
> following be sufficient?
> 
>> Ext Stats: id1, id2 using test_s1
> 

The stats may overlap, and some clauses may match multiple of
them. And some statistics do not support all clause types (e.g.
functional dependencies work only with equality conditions). Yes, you
might deduce which statistics are used for which clause, but it's not
trivial - interpreting explain is already not trivial, let's not make it
harder.

(If tracking the exact clauses turns out to be expensive, we might
revisit this - it might make it cheaper).
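
A made-up example of why the clause list helps (table and statistics names
are hypothetical):

    CREATE STATISTICS s_ab ON a, b FROM t;
    CREATE STATISTICS s_bc ON b, c FROM t;

    -- For WHERE a = 1 AND b = 2 AND c = 3 both objects may be applied,
    -- and only the per-object clause list tells you which clauses each
    -- of them actually estimated.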

> 
> 
> (4)
> 
> The extended statistics with "dependencies" or "ndistinct" option don't seem 
> to
> be shown in EXPLAIN output. Am I missing something? (Is this expected?)
> 
> I tested the examples in the documentation. Although it might work with
> "mcv" option, I can't confirm that it works because "unrecognized node type"
> error occurred in my environment.
> https://www.postgresql.org/docs/current/sql-createstatistics.html
> 
> (It might be wrong since I'm beginner with extended stats codes.)
> IIUC, the reason is that the patch only handles 
> statext_mcv_clauselist_selectivity(),
> and doesn't handle dependencies_clauselist_selectivity() and 
> estimate_multivariate_ndistinct().
> 
> 
> -- doesn't work with "dependencies" option?
> =# \dX
> List of extended statistics
>  Schema |  Name   | Definition | Ndistinct | Dependencies |   MCV   
> +-++---+--+-
>  public | s1  | a, b FROM t1   | (null)| defined  | (null)
> (2 rows)
> 
> =# EXPLAIN (STATS, ANALYZE) SELECT * FROM t1 WHERE (a = 1) AND (b = 0);
> QUERY PLAN
>  
> --

Re: add function argument names to regex* functions.

2024-07-15 Thread Chapman Flack
On 07/15/24 08:02, jian he wrote:
> also address Chapman Flack point:
> correct me if i am wrong, but i don't think the ISO standard mandates
> function argument names.
> So we can choose the best function argument name for our purpose?

Ah, I may have mistaken which functions the patch meant to apply to.

These being the non-ISO regexp_* functions using POSIX expressions,
the ISO standard indeed says nothing about them.

In the ISO standard *_regex "functions", there are not really "function
argument names" mandated, because, like so many things in ISO SQL, they
have their own special syntax instead of being generic function calls:

TRANSLATE_REGEX('a|e|i|o|u' FLAG 'i' IN 'A PostgreSQL function'
 WITH 'X' FROM 1 OCCURRENCE 3);

Any choice to use similar argument names in the regexp_* functions would
be a matter of consistency with the analogous ISO functions, not anything
mandated.

Regards,
-Chap




Re: add function argument names to regex* functions.

2024-07-15 Thread Chapman Flack
On 07/15/24 10:46, Chapman Flack wrote:
> Ah, I may have mistaken which functions the patch meant to apply to.
> ...
> Any choice to use similar argument names in the regexp_* functions would
> be a matter of consistency with the analogous ISO functions, not anything
> mandated.

Or, looking back, I might have realized these were the non-ISO regexp_*
functions, but seen there was bikeshedding happening over the best name
to use for the occurrence argument, and merely suggested ISO's choice
OCCURRENCE for the analogous ISO functions, as a possible bikeshed
accelerator.

Regards,
-Chap




Re: Incremental backup from a streaming replication standby fails

2024-07-15 Thread Laurenz Albe
On Sat, 2024-06-29 at 07:01 +0200, Laurenz Albe wrote:
> I played around with incremental backup yesterday and tried $subject
> 
> The WAL summarizer is running on the standby server, but when I try  
> to take an incremental backup, I get an error that I understand to mean  
> that WAL summarizing hasn't caught up yet.
> 
> I am not sure if that is working as designed, but if it is, I think it  
> should be documented.

I played with this some more.  Here is the exact error message:

ERROR:  manifest requires WAL from final timeline 1 ending at 0/1967C260, but 
this backup starts at 0/1967C190

By trial and error I found that when I run a CHECKPOINT on the primary,
taking an incremental backup on the standby works.

I couldn't fathom the cause of that, but I think that that should either
be addressed or documented before v17 comes out.

Yours,
Laurenz Albe




Re: Remove dependence on integer wrapping

2024-07-15 Thread Nathan Bossart
I took a closer look at 0002.

+if (unlikely(isinf(f) || isnan(f)))
+ereport(ERROR,
+(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("invalid float value")));
+
+fresult = rint(f * c);

+if (unlikely(f == 0.0))
+ereport(ERROR,
+(errcode(ERRCODE_DIVISION_BY_ZERO),
+ errmsg("division by zero")));
+if (unlikely(isinf(f) || isnan(f)))
+ereport(ERROR,
+(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("invalid float value")));
+
+fresult = rint(c / f);

I'm curious why you aren't using float8_mul/float8_div here, i.e.,

fresult = rint(float8_mul((float8) c, f));
fresult = rint(float8_div((float8) c, f));

nitpick: I'd name the functions something like "cash_mul_float8" and
"cash_div_float8".  Perhaps we could also add functions like
"cash_mul_int64" and "cash_sub_int64" so that we don't need several copies
of the same "money out of range" ERROR.
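
Roughly what I'm imagining (just a sketch, the helper name and error
wording are illustrative):

    static Cash
    cash_mul_float8(Cash c, float8 f)
    {
        float8      res = rint(float8_mul((float8) c, f));

        if (unlikely(isnan(res) || !FLOAT8_FITS_IN_INT64(res)))
            ereport(ERROR,
                    (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                     errmsg("money out of range")));

        return (Cash) res;
    }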

-- 
nathan




Re: gcc 12.1.0 warning

2024-07-15 Thread Andres Freund
Hi,

On 2024-05-10 12:13:21 +0300, Nazir Bilal Yavuz wrote:
> On Tue, 23 Apr 2024 at 19:59, Andres Freund  wrote:
> >
> >
> > Which seems entirely legitimate. ISTM that guc_var_compare() ought to only
> > cast the pointers to the key type, i.e. char *.  And incidentally that does
> > prevent the warning.
> >
> > The reason it doesn't happen in newer versions of postgres is that we aren't
> > using guc_var_compare() in the relevant places anymore...
> 
> The fix is attached. It cleanly applies from REL_15_STABLE to
> REL_12_STABLE, fixes the warnings and the tests pass.

Thanks! I've applied it to all branches - while it's not required to avoid a
warning in newer versions, it's still not correct as it was...

Greetings,

Andres




Re: Upgrade Debian CI images to Bookworm

2024-07-15 Thread Andres Freund
Hi,

On 2024-05-24 10:30:00 -0700, Andres Freund wrote:
> On 2024-05-24 16:17:37 +0200, Peter Eisentraut wrote:
> > I'm not sure what the backpatching expectations of this kind of thing is.
> > The history of this CI setup is relatively short, so this hasn't been
> > stressed too much.  I see that we once backpatched the macOS update, but
> > that might have been all.
> 
> I've backpatched a few other changes too.
> 
> 
> > If we start backpatching this kind of thing, then this will grow as a job
> > over time.  We'll have 5 or 6 branches to keep up to date, with several
> > operating systems.  And once in a while we'll have to make additional
> > changes like this warning fix you mention here.  I'm not sure how much we
> > want to take this on.  Is there ongoing value in the CI setup in
> > backbranches?
> 
> I find it extremely useful to run CI on backbranches before
> backpatching. Enough so that I've thought about proposing backpatching CI all
> the way.
> 
> I don't think it's that much work to fix this kind of thing in the
> backbranches. We don't need to backpatch new tasks or such. Just enough stuff
> to keep e.g. the base image the same - otherwise we end up running CI on
> unsupported distros, which doesn't help anybody.
> 
> 
> > With these patches, we could do either of the following:
> > 5) We update master, PG16, and PG15, but we hold all of them until the
> > warning in PG15 is fixed.
> 
> I think we should apply the fix in <= 15 - IMO it's a correct compiler
> warning, what we do right now is wrong.

I've now applied the guc fix to all branches and the CI changes to 15+.

Thanks Bilal!

Greetings,

Andres




[PATCH v1] Fix parsing of a complex type that has an array of complex types

2024-07-15 Thread Arjan Marku

Hi hackers,

I ran into an issue today when I was trying to insert a complex type
where one of its attributes is itself an array of complex types.


As an example:

CREATE TYPE inventory_item AS
(
    name    text,
    supplier_id integer,
    price   numeric
);
CREATE TYPE item_2d AS (id int, items inventory_item[][]);

CREATE TABLE item_2d_table (id int, item item_2d);

INSERT INTO item_2d_table VALUES(1, '(1,{{("inv a",42,1.99),("inv 
b",42,1.99)},{("inv c",42,1.99),("inv d",42,2)}})');


The INSERT statement fails due to how complex types are parsed; I have
included a patch in this email to support this scenario.


The reason this fails is that record_in cannot detect an array string
when one of the attributes is of an array type. Because of this, it stops
processing the column value prematurely, which results in a corrupted
value for that particular column. As a result, array_in receives a
malformed string and is bound to error out.


To fix this, record_in can detect columns that are of an array type and
in such cases leave the input intact. array_in will then attempt to
extract the elements one by one. When it is dealing with unquoted
elements, the logic needs to change slightly, since if the element is a
record, quotes can be allowed, e.g. {{("test field")}}.


There are some adjustments that could be made to the patch. For example,
we can detect the number of dimensions of the array in record_in: do we
want to error out when the string has more dimensions than MAXDIM in
array.h (to prevent the number from over/underflowing), or when the
number of dimensions does not match the number of dimensions the
attribute is supposed to have, or both?


Regards,
Arjan Marku
From 4524a944e47c81fb6e738103441ddc202aca5a59 Mon Sep 17 00:00:00 2001
From: binoverfl0w 
Date: Sat, 13 Jul 2024 19:01:25 +0200
Subject: [PATCH v1] [PATCH v1] Fix parsing of a complex type that has an array
 of complex types

record_in lacks support of detecting an array string when one of the attributes is of type array.
Due to this, it will stop processing the column value prematurely, which results in a corrupted value for that particular column.
As a result array_in will receive a malformed string which is bound to error.

To fix this, record_in can detect columns that are of type array and in such cases leave the input intact.
array_in will attempt to extract the elements one by one. In case it is dealing with unquoted elements, the logic needs to slightly change, since if the element is a record, quotes can be allowed, ex: {{("test field")}
---
 src/backend/utils/adt/arrayfuncs.c | 26 +---
 src/backend/utils/adt/rowtypes.c   | 32 --
 2 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index d6641b570d..6925b6a4ff 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -101,7 +101,7 @@ static bool ReadArrayStr(char **srcptr,
 		 int *nitems_p,
 		 Datum **values_p, bool **nulls_p,
 		 const char *origStr, Node *escontext);
-static ArrayToken ReadArrayToken(char **srcptr, StringInfo elembuf, char typdelim,
+static ArrayToken ReadArrayToken(char **srcptr, StringInfo elembuf, char typdelim, bool is_record,
  const char *origStr, Node *escontext);
 static void ReadArrayBinary(StringInfo buf, int nitems,
 			FmgrInfo *receiveproc, Oid typioparam, int32 typmod,
@@ -603,6 +603,7 @@ ReadArrayStr(char **srcptr,
 	bool		ndim_frozen;
 	bool		expect_delim;
 	int			nelems[MAXDIM];
+	bool		is_record;
 
 	/* Allocate some starting output workspace; we'll enlarge as needed */
 	maxitems = 16;
@@ -615,6 +616,9 @@ ReadArrayStr(char **srcptr,
 	/* Loop below assumes first token is ATOK_LEVEL_START */
 	Assert(**srcptr == '{');
 
+	/* Whether '(' should be treated as a special character that denotes the start of a record */
+	is_record = (inputproc->fn_oid == F_RECORD_IN);
+
 	/* Parse tokens until we reach the matching right brace */
 	nest_level = 0;
 	nitems = 0;
@@ -624,7 +628,7 @@ ReadArrayStr(char **srcptr,
 	{
 		ArrayToken	tok;
 
-		tok = ReadArrayToken(srcptr, &elembuf, typdelim, origStr, escontext);
+		tok = ReadArrayToken(srcptr, &elembuf, typdelim, is_record, origStr, escontext);
 
 		switch (tok)
 		{
@@ -793,7 +797,7 @@ dimension_error:
  * If the token is ATOK_ELEM, the de-escaped string is returned in elembuf.
  */
 static ArrayToken
-ReadArrayToken(char **srcptr, StringInfo elembuf, char typdelim,
+ReadArrayToken(char **srcptr, StringInfo elembuf, char typdelim, bool is_record,
 			   const char *origStr, Node *escontext)
 {
 	char	   *p = *srcptr;
@@ -912,6 +916,22 @@ unquoted_element:
 dstlen = elembuf->len;	/* treat it as non-whitespace */
 has_escapes = true;
 break;
+			case '(':
+if (is_record) {
+	bool in_quote = false;
+	while (*p && (in_quote || *p != 

Re: gcc 12.1.0 warning

2024-07-15 Thread Nathan Bossart
On Mon, Jul 15, 2024 at 09:41:55AM -0700, Andres Freund wrote:
> On 2024-05-10 12:13:21 +0300, Nazir Bilal Yavuz wrote:
>> The fix is attached. It cleanly applies from REL_15_STABLE to
>> REL_12_STABLE, fixes the warnings and the tests pass.
> 
> Thanks! I've applied it to all branches - while it's not required to avoid a
> warning in newer versions, it's still not correct as it was...

nitpick: pgindent thinks one of the spaces is unnecessary.

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index a043d529ef..b0947a4cf1 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1289,8 +1289,8 @@ find_option(const char *name, bool create_placeholders, 
bool skip_errors,
 static int
 guc_var_compare(const void *a, const void *b)
 {
-const char *namea = **(const char ** const *) a;
-const char *nameb = **(const char ** const *) b;
+const char *namea = **(const char **const *) a;
+const char *nameb = **(const char **const *) b;

 return guc_name_compare(namea, nameb);
 }

-- 
nathan




Re: Restart pg_usleep when interrupted

2024-07-15 Thread Nathan Bossart
On Fri, Jul 12, 2024 at 03:39:57PM -0500, Sami Imseih wrote:
> but Bertrand found long drifts [2[ which I could not reproduce.
> To safeguard the long drifts, continue to use the &remain time with an 
> additional safeguard to make sure the actual sleep does not exceed the 
> requested sleep time.

Bertrand, does this safeguard fix the long drifts you saw?

-- 
nathan




Re: Converting tab-complete.c's else-if chain to a switch

2024-07-15 Thread Andres Freund
Hi,

On 2024-07-13 13:16:14 -0400, Tom Lane wrote:
> I wrote:
> > Per discussion elsewhere [1], I've been looking at $SUBJECT.
>
> Here's a finished patchset to perform this change.

Awesome!

> With gcc 8.5.0 I see the time drop from about 3 seconds to about 0.7.  The
> win is less noticeable with clang version 15.0 on a Mac M1: from about 0.7s
> to 0.45s.

Nice!


With gcc 15 I see:

 before   after
-O0  1.2s 0.6s
-O2  10.5s2.6s
-O3  10.8s2.8s

Quite a nice win.


> Of course the main point is the hope that it will work at all with MSVC, but
> I'm not in a position to test that.

It does work with just your patches applied - very nice.

The only obvious thing that doesn't is that ctrl-c without a running query
terminates psql - interrupting a running query works without terminating psql,
which is a bit weird. But that seems independent of this series.


> Subject: [PATCH v1 3/5] Install preprocessor infrastructure for
>  tab-complete.c.

Ah - I was wondering how the above would actually work when I read your intro :)


> +tab-complete.c: gen_tabcomplete.pl tab-complete.in.c
> + $(PERL) $^ --outfile $@
> +

When I first built this with make, I got this error:
make: *** No rule to make target 
'/home/andres/src/postgresql/src/bin/psql/tab-complete.c', needed by 
'tab-complete.o'.  Stop.

but that's just a consequence of the make dependency handling... Removing
the .deps directory fixed it.



> +# The input is a C file that contains some case labels that are not
> +# constants, but look like function calls, for example:
> +#case Matches("A", "B"):
> +# The function name can be any one of Matches, HeadMatches, TailMatches,
> +# MatchesCS, HeadMatchesCS, or TailMatchesCS.  The argument(s) must be
> +# string literals or macros that expand to string literals or NULL.

Hm, the fact that we are continuing to use the same macros in the switch makes
it a bit painful to edit the .in.c in an editor with compiler-warnings
integration - I see a lot of reported errors ("Expression is not an integer
constant expression") due to case statements not being something that the
normal C switch can handle.

Perhaps we could use a distinct macro for use in the switch, which generates
valid (but nonsensical) code, so we avoid warnings?
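
For example, something like this (purely illustrative; the name and
mechanism are hypothetical):

    /*
     * Expands to a unique integer constant, so plain C tooling accepts
     * "case SwitchMatches(...):" while gen_tabcomplete.pl rewrites these
     * labels the same way as the existing ones.
     */
    #define SwitchMatches(...)  (__LINE__)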


> +# These lines are replaced by "case N:" where N is a unique number
> +# for each such case label.  (For convenience, we use the line number
> +# of the case label, although other values would work just as well.)

Hm, using the line number seems to make it a bit harder to compare the output
of the script after making changes to the input. Not the end of the world, but
...



> +tabcomplete = custom_target('tabcomplete',
> +  input: 'tab-complete.in.c',
> +  output: 'tab-complete.c',
> +  command: [
> +perl, files('gen_tabcomplete.pl'), files('tab-complete.in.c'),
> +'--outfile', '@OUTPUT@',
> +'@INPUT@',
> +  ],
> +  install: true,
> +  install_dir: dir_include_server / 'utils',
> +)

I don't think we want to install tab-complete.c?



Greetings,

Andres Freund




Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-07-15 Thread Robert Haas
On Mon, Jul 15, 2024 at 8:05 AM Ashutosh Sharma  wrote:
> I've added these changes to restrict users from explicitly setting the
> $extension_schema in the search_path. This ensures that
> $extension_schema can only be set implicitly for functions created by
> the extension when the "protected" flag is enabled.

But ... why? I mean, what's the point of prohibiting that? In fact,
maybe we should have *that* and forget about the protected flag in the
control file.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Parent/child context relation in pg_get_backend_memory_contexts()

2024-07-15 Thread Robert Haas
On Fri, Jul 12, 2024 at 6:33 PM Melih Mutlu  wrote:
> I just ran the below  to see if we have any context with the same level and 
> name.
>
> postgres=# select level, name, count(*) from pg_backend_memory_contexts group 
> by level, name having count(*)>1;
>  level |name | count
> ---+-+---
>  3 | index info  |90
>  5 | ExprContext | 5
>
> Seems like it's a possible case. But those contexts might not be the most 
> interesting ones. I guess the contexts that most users would be interested in 
> will likely be unique on their levels and with their name. So we might not be 
> concerned with the contexts, like those two from the above result, and chose 
> using names instead of transient IDs. But I think that we can't guarantee 
> name-based path column would be completely reliable in all cases.

Maybe we should just fix it so that doesn't happen. I think it's only
an issue if the whole path is the same, and I'm not sure whether
that's the case here. But notice that we have this:

const char *name;   /* context name (just
for debugging) */
const char *ident;  /* context ID if any
(just for debugging) */

I think this arrangement dates to
442accc3fe0cd556de40d9d6c776449e82254763, and the discussion thread
begins like this:

"It does look like a 182KiB has been spent for some SQL, however
there's no clear way to tell which SQL is to blame."

So the point of that commit was to find better ways of distinguishing
between similar contexts. It sounds like perhaps we're not all the way
there yet, but if we agree on the goal, maybe we can figure out how to
reach it.
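
For instance, one way to check whether ident already disambiguates the
duplicates from Melih's example (just a suggestion):

    SELECT name, ident, count(*)
    FROM pg_backend_memory_contexts
    GROUP BY name, ident
    HAVING count(*) > 1;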

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: long-standing data loss bug in initial sync of logical replication

2024-07-15 Thread vignesh C
On Mon, 15 Jul 2024 at 15:31, Amit Kapila  wrote:
>
> On Thu, Jul 11, 2024 at 6:19 PM Nitin Motiani  wrote:
> >
> > On Wed, Jul 10, 2024 at 11:22 PM Nitin Motiani  
> > wrote:
> > >
> > > On Wed, Jul 10, 2024 at 10:39 PM vignesh C  wrote:
> > > >
> > > > On Wed, 10 Jul 2024 at 12:28, Amit Kapila  
> > > > wrote:
> > > > > The patch missed to use the ShareRowExclusiveLock for partitions, see
> > > > > attached. I haven't tested it but they should also face the same
> > > > > problem. Apart from that, I have changed the comments in a few places
> > > > > in the patch.
> > > >
> > > > I could not hit the updated ShareRowExclusiveLock changes through the
> > > > partition table, instead I could verify it using the inheritance
> > > > table. Added a test for the same and also attaching the backbranch
> > > > patch.
> > > >
> > >
> > > Hi,
> > >
> > > I tested alternative-experimental-fix-lock.patch provided by Tomas
> > > (replaces SUE with SRE in OpenTableList). I believe there are a couple
> > > of scenarios the patch does not cover.
> > >
> > > 1. It doesn't handle the case of "ALTER PUBLICATION  ADD TABLES
> > > IN SCHEMA  ".
> > >
> > > I took crash-test.sh provided by Tomas and modified it to add all
> > > tables in the schema to publication using the following command :
> > >
> > >ALTER PUBLICATION p ADD TABLES IN SCHEMA  public
> > >
> > > The modified script is attached (crash-test-with-schema.sh). With this
> > > script, I can reproduce the issue even with the patch applied. This is
> > > because the code path to add a schema to the publication doesn't go
> > > through OpenTableList.
> > >
> > > I have also attached a script run-test-with-schema.sh to run
> > > crash-test-with-schema.sh in a loop with randomly generated parameters
> > > (modified from run.sh provided by Tomas).
> > >
> > > 2.  The second issue is a deadlock which happens when the alter
> > > publication command is run for a comma separated list of tables.
> > >
> > > I created another script create-test-tables-order-reverse.sh. This
> > > script runs a command like the following :
> > >
> > > ALTER PUBLICATION p ADD TABLE test_2,test_1
> > >
> > > Running the above script, I was able to get a deadlock error (the
> > > output is attached in deadlock.txt). In the alter publication command,
> > > I added the tables in the reverse order to increase the probability of
> > > the deadlock. But it should happen with any order of tables.
> > >
> > > I am not sure if the deadlock is a major issue because detecting the
> > > deadlock is better than data loss.
> > >
>
> The deadlock reported in this case is an expected behavior. This is no
> different that locking tables or rows in reverse order.
>
> >
> > I looked further into the scenario of adding the tables in schema to
> > the publication. Since in that case, the entry is added to
> > pg_publication_namespace instead of pg_publication_rel, the codepaths
> > for 'add table' and 'add tables in schema' are different. And in the
> > 'add tables in schema' scenario, the OpenTableList function is not
> > called to get the relation ids. Therefore even with the proposed
> > patch, the data loss issue still persists in that case.
> >
> > To validate this idea, I tried locking all the affected tables in the
> > schema just before the invalidation for those relations (in
> > ShareRowExclusiveLock mode).
> >
>
> This sounds like a reasonable approach to fix the issue. However, we
> should check SET publication_object as well, especially the drop part
> in it. It should not happen that we miss sending the data for ADD but
> for DROP, we send data when we shouldn't have sent it.

There were few other scenarios, similar to the one you mentioned,
where the issue occurred. For example: a) When specifying a subset of
existing tables in the ALTER PUBLICATION ... SET TABLE command, the
tables that were supposed to be removed from the publication were not
locked in ShareRowExclusiveLock mode. b) The ALTER PUBLICATION ...
DROP TABLES IN SCHEMA command did not lock the relations that will be
removed from the publication in ShareRowExclusiveLock mode. Both of
these scenarios resulted in data inconsistency due to inadequate
locking. The attached patch addresses these issues.

Regards,
Vignesh
From e1e79bcf24cacf4f8291692f7815dd323e7b4ab5 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Tue, 9 Jul 2024 19:23:10 +0530
Subject: [PATCH v5] Fix data loss during initial sync in logical replication.

Previously, when adding tables to a publication in PostgreSQL, they were
locked using ShareUpdateExclusiveLock mode. This mode allowed the lock to
succeed even if there were ongoing DML transactions on that table. As a
consequence, the ALTER PUBLICATION command could be completed before these
transactions, leading to a scenario where the catalog snapshot used for
replication did not include changes from transactions initiated before the
alteration.

To fix this issue, tables are now locked using ShareRow

Re: Parent/child context relation in pg_get_backend_memory_contexts()

2024-07-15 Thread Robert Haas
On Mon, Jul 15, 2024 at 6:44 AM David Rowley  wrote:
> Unfortunately, this wouldn't align with the goals of the patch. Going
> back to Melih's opening paragraph in the initial email, he mentions
> that there's currently no *reliable* way to determine the parent/child
> relationship in this view.
>
> There's been a few different approaches to making this reliable. The
> first patch had "parent_id" and "id" columns.  That required a WITH
> RECURSIVE query.  To get away from having to write such complex
> queries, the "path" column was born.  I'm now trying to massage that
> into something that's as easy to use and intuitive as possible. I've
> gotta admit, I don't love the patch. That's not Melih's fault,
> however. It's just the nature of what we're working with.

I'm not against what you're trying to do here, but I feel like you
might be over-engineering it. I don't think there was anything really
wrong with what Melih was doing, and I don't think there's anything
really wrong with converting the path to an array of strings, either.
Sure, it might not be perfect, but future patches could always remove
the name duplication. This is a debugging facility that will be used
by a tiny minority of users, and if some non-uniqueness gets
reintroduced in the future, it's not a critical defect and can just be
fixed when it's noticed. That said, if you want to go with the integer
IDs and want to spend more time massaging it, I also think that's
fine. I simply don't believe it's the only way forward here. YMMV, but
my opinion is that none of these approaches have such critical flaws
that we need to get stressed about it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add a GUC check hook to ensure summarize_wal cannot be enabled when wal_level is minimal

2024-07-15 Thread Robert Haas
On Sun, Jul 14, 2024 at 10:56 PM Fujii Masao
 wrote:
> I don't think it's a rare scenario since summarize_wal can be enabled
> after starting the server with wal_level=minimal. Therefore, I believe
> such a configuration should be prohibited using a GUC check hook,
> as my patch does.

I guess I'm in the group of people who doesn't understand how this can
possibly work. There's no guarantee about the order in which GUC check
hooks are called, so you don't know if the value of the other variable
has already been set to the final value or not, which seems like a
fatal problem even if the code happens to work correctly as of today.
Even if you have such a guarantee, you can't prohibit a configuration
change at pg_ctl reload time: the server can refuse to start in case
of an invalid configuration, but a running server can't decide to shut
down or stop working at reload time.

> Alternatively, we should at least report or
> log something when summarize_wal is enabled but fast_forward is also
> enabled, so users can easily detect or investigate this unexpected situation.
> I'm not sure if exposing fast_forward is necessary for that or not...

To be honest, I'm uncomfortable with how much time is passing while we
debate these details. I feel like we need to get these WAL format
changes done sooner rather than later considering beta2 is already out
the door. Other changes like logging messages or not can be debated
once the basic fix is in. Even if they don't happen, nobody will have
a corrupted backup once the basic fix is done. At most they will be
confused about why some backup is failing.

> Regarding pg_get_wal_summarizer_state(), it is documented that
> summarized_lsn indicates the ending LSN of the last WAL summary file
> written to disk. However, with the patch, summarized_lsn advances
> even when fast_forward is enabled. The documentation should be updated,
> or summarized_lsn should be changed so it doesn't advance while
> fast_forward is enabled.

I think we need to advance summarized_lsn to get the proper behavior.
I can update the documentation.

> I have one small comment:
>
> +# This test aims to validate that takeing an incremental backup fails when
>
> "takeing" should be "taking"?

Will fix, thanks.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Upgrade Debian CI images to Bookworm

2024-07-15 Thread Andres Freund
Hi,

On 2024-07-15 09:43:26 -0700, Andres Freund wrote:
> I've now applied the guc fix to all branches and the CI changes to 15+.

Ugh. I see that this fails on master, because of

commit 0c3930d0768
Author: Peter Eisentraut 
Date:   2024-07-01 07:30:38 +0200
 
Apply COPT to CXXFLAGS as well

I hadn't seen that because of an independent failure (the macos stuff, I'll
send an email about it in a bit).

Not sure what the best real fix here is, this is outside of our code. I'm
inclined to just disable llvm for the compiler warning task for now.

Greetings,

Andres Freund




Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-07-15 Thread Jeff Davis
On Mon, 2024-07-15 at 13:44 -0400, Robert Haas wrote:
> But ... why? I mean, what's the point of prohibiting that?

Agreed. We ignore all kinds of stuff in search_path that doesn't make
sense, like non-existent schemas. Simpler is better.

Regards,
Jeff Davis





Re: Sort functions with specialized comparators

2024-07-15 Thread Andrey M. Borodin



> On 15 Jul 2024, at 12:52, Stepan Neretin  wrote:
> 
> 
> I run benchmark with my patches:
> ./pgbench -c 10 -j2 -t1000 -d postgres
> 
> pgbench (18devel)
> starting vacuum...end.
> transaction type: 
> scaling factor: 10
> query mode: simple
> number of clients: 10
> number of threads: 2
> maximum number of tries: 1
> number of transactions per client: 1000
> number of transactions actually processed: 1/1
> number of failed transactions: 0 (0.000%)
> latency average = 1.609 ms
> initial connection time = 24.080 ms
> tps = 6214.244789 (without initial connection time)
> 
> and without:
> ./pgbench -c 10 -j2 -t1000 -d postgres
> 
> pgbench (18devel)
> starting vacuum...end.
> transaction type: 
> scaling factor: 10
> query mode: simple
> number of clients: 10
> number of threads: 2
> maximum number of tries: 1
> number of transactions per client: 1000
> number of transactions actually processed: 1/1
> number of failed transactions: 0 (0.000%)
> latency average = 1.731 ms
> initial connection time = 15.177 ms
> tps = 5776.173285 (without initial connection time)
> 
> tps with my patches increase. What do you think?


Hi Stepan!

Thank you for implementing specialized sorting and doing these benchmarks.
I believe it's a promising direction for improvement.
However, I doubt the correctness of your benchmarks.
Increasing TPC-B performance from 5776 TPS to 6214 TPS seems too good to be 
true. 


Best regards, Andrey Borodin.



Re: Add a GUC check hook to ensure summarize_wal cannot be enabled when wal_level is minimal

2024-07-15 Thread Nathan Bossart
On Mon, Jul 15, 2024 at 02:30:42PM -0400, Robert Haas wrote:
> On Sun, Jul 14, 2024 at 10:56 PM Fujii Masao
>  wrote:
>> I don't think it's a rare scenario since summarize_wal can be enabled
>> after starting the server with wal_level=minimal. Therefore, I believe
>> such a configuration should be prohibited using a GUC check hook,
>> as my patch does.
> 
> I guess I'm in the group of people who doesn't understand how this can
> possibly work. There's no guarantee about the order in which GUC check
> hooks are called, so you don't know if the value of the other variable
> has already been set to the final value or not, which seems like a
> fatal problem even if the code happens to work correctly as of today.
> Even if you have such a guarantee, you can't prohibit a configuration
> change at pg_ctl reload time: the server can refuse to start in case
> of an invalid configuration, but a running server can't decide to shut
> down or stop working at reload time.

My understanding is that the correctness of this GUC check hook depends on
wal_level being a PGC_POSTMASTER GUC.  The check hook would always return
true during startup, and there'd be an additional cross-check in
PostmasterMain() that would fail startup if necessary.  After that point,
we know that wal_level cannot change, so the GUC check hook for
summarize_wal can depend on wal_level.  If it fails, my expectation would
be that the server would just ignore that change and continue.

-- 
nathan




Re: Add a GUC check hook to ensure summarize_wal cannot be enabled when wal_level is minimal

2024-07-15 Thread Nathan Bossart
On Mon, Jul 15, 2024 at 01:47:14PM -0500, Nathan Bossart wrote:
> On Mon, Jul 15, 2024 at 02:30:42PM -0400, Robert Haas wrote:
>> I guess I'm in the group of people who doesn't understand how this can
>> possibly work. There's no guarantee about the order in which GUC check
>> hooks are called, so you don't know if the value of the other variable
>> has already been set to the final value or not, which seems like a
>> fatal problem even if the code happens to work correctly as of today.
>> Even if you have such a guarantee, you can't prohibit a configuration
>> change at pg_ctl reload time: the server can refuse to start in case
>> of an invalid configuration, but a running server can't decide to shut
>> down or stop working at reload time.
> 
> My understanding is that the correctness of this GUC check hook depends on
> wal_level being a PGC_POSTMASTER GUC.  The check hook would always return
> true during startup, and there'd be an additional cross-check in
> PostmasterMain() that would fail startup if necessary.  After that point,
> we know that wal_level cannot change, so the GUC check hook for
> summarize_wal can depend on wal_level.  If it fails, my expectation would
> be that the server would just ignore that change and continue.

I should also note that since wal_level defaults to "replica", I don't
think we even need any extra "always return true on startup" logic.  If
wal_level is set prior to summarize_wal, the check hook will fail startup
as needed.  If summarize_wal is set first, the check hook will return true,
and we'll fall back on the PostmasterMain() check (that already exists).
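
In other words, something like this (a minimal sketch of the idea, not the
actual patch):

    bool
    check_summarize_wal(bool *newval, void **extra, GucSource source)
    {
        if (*newval && wal_level < WAL_LEVEL_REPLICA)
        {
            GUC_check_errdetail("WAL cannot be summarized when \"wal_level\" is \"minimal\".");
            return false;
        }
        return true;
    }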

In short, the original patch [0] seems like it should work in this
particular scenario, barring some corner case I haven't discovered.  That
being said, it's admittedly fragile and probably not a great precedent to
set.  I've been thinking about some ideas for more generic GUC dependency
tooling, but I don't have anything to share yet.

[0] 
https://postgr.es/m/attachment/161852/v1-0001-Prevent-summarize_wal-from-enabling-when-wal_leve.patch

-- 
nathan




Re: [PATCH v1] Fix parsing of a complex type that has an array of complex types

2024-07-15 Thread Tom Lane
Arjan Marku  writes:
> INSERT INTO item_2d_table VALUES(1, '(1,{{("inv a",42,1.99),("inv 
> b",42,1.99)},{("inv c",42,1.99),("inv d",42,2)}})');

> The INSERT statement will fail due to how complex types are parsed, I 
> have included a patch in this email to support this scenario.

The actual problem with this input is that it's inadequately quoted.
The record fields need to be quoted according to the rules in

https://www.postgresql.org/docs/current/rowtypes.html#ROWTYPES-IO-SYNTAX

and then in the field that is an array, the array elements need to be
quoted according to the rules in

https://www.postgresql.org/docs/current/arrays.html#ARRAYS-IO

and then the innermost record fields need yet another level of quoting
(well, you might be able to skip that given that there are no special
characters in this particular data, but in general you'd have to).

It looks to me like your patch is trying to make array_in and
record_in sufficiently aware of each other's rules that the outer
quoting could be omitted, but I believe that that's a seriously bad
idea.  In the first place, it's far from clear that that's even
possible without ambiguity (much less that this specific patch does
it correctly).  In the second place, this will make it even more
difficult for these functions to issue on-point error messages for
incorrect input.  (They're already struggling with that, see
e.g. [1].)  And in the third place, these functions are nearly
unmaintainable spaghetti already.  (There have also been complaints
that they're too slow, which this won't improve.)  We don't need
another big dollop of complexity here.

Our normal recommendation for handwritten input is to not try to deal
with the complications of correctly quoting nested array/record data.
Instead use row() and array[] constructors.  So you could write
something like

INSERT INTO item_2d_table
VALUES(1,
   row(1, array[[row('inv a',42,1.99), row('inv b',42,1.99)],
[row('inv c',42,1.99), row('inv 
d',42,2)]]::inventory_item[]));

In this particular example we need an explicit cast to cue the
parser about the type of the array elements, but then it can
cope with casting the outer row() construct automatically.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/flat/CACJufxExAcpvrkbLGrZGdZ%3DbFAuj7OVp1mOhk%2BfsBzeUbOGuHQ%40mail.gmail.com




Re: long-standing data loss bug in initial sync of logical replication

2024-07-15 Thread Nitin Motiani
On Mon, Jul 15, 2024 at 11:42 PM vignesh C  wrote:
>
> On Mon, 15 Jul 2024 at 15:31, Amit Kapila  wrote:
> >
> > On Thu, Jul 11, 2024 at 6:19 PM Nitin Motiani  
> > wrote:
> > > I looked further into the scenario of adding the tables in schema to
> > > the publication. Since in that case, the entry is added to
> > > pg_publication_namespace instead of pg_publication_rel, the codepaths
> > > for 'add table' and 'add tables in schema' are different. And in the
> > > 'add tables in schema' scenario, the OpenTableList function is not
> > > called to get the relation ids. Therefore even with the proposed
> > > patch, the data loss issue still persists in that case.
> > >
> > > To validate this idea, I tried locking all the affected tables in the
> > > schema just before the invalidation for those relations (in
> > > ShareRowExclusiveLock mode).
> > >
> >
> > This sounds like a reasonable approach to fix the issue. However, we
> > should check SET publication_object as well, especially the drop part
> > in it. It should not happen that we miss sending the data for ADD but
> > for DROP, we send data when we shouldn't have sent it.
>
> There were few other scenarios, similar to the one you mentioned,
> where the issue occurred. For example: a) When specifying a subset of
> existing tables in the ALTER PUBLICATION ... SET TABLE command, the
> tables that were supposed to be removed from the publication were not
> locked in ShareRowExclusiveLock mode. b) The ALTER PUBLICATION ...
> DROP TABLES IN SCHEMA command did not lock the relations that will be
> removed from the publication in ShareRowExclusiveLock mode. Both of
> these scenarios resulted in data inconsistency due to inadequate
> locking. The attached patch addresses these issues.
>

Hi,

A couple of questions on the latest patch :

1. I see there is this logic in PublicationDropSchemas to first check
if there is a valid entry for the schema in pg_publication_namespace

psid = GetSysCacheOid2(PUBLICATIONNAMESPACEMAP,

Anum_pg_publication_namespace_oid,

ObjectIdGetDatum(schemaid),

ObjectIdGetDatum(pubid));
if (!OidIsValid(psid))
{
if (missing_ok)
continue;

ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
 errmsg("tables from schema
\"%s\" are not part of the publication",

get_namespace_name(schemaid;
}

Your proposed change locks the schemaRels before this code block.
Would it be better to lock the schemaRels after the error check, so that
if the schema is no longer part of the publication, the lock is not held
unnecessarily on all its tables?

2. The function publication_add_schema explicitly invalidates the cache by
calling InvalidatePublicationRels(schemaRels). That is not present in
the current PublicationDropSchemas code. Is that something which
should be added in the drop scenario also? Please let me know if there
is some context that I'm missing regarding why this was not added
originally for the drop scenario.

Thanks & Regards,
Nitin Motiani
Google




Re: Compress ReorderBuffer spill files using LZ4

2024-07-15 Thread Tomas Vondra
On 7/15/24 20:50, Julien Tachoires wrote:
> Hi,
> 
> Le ven. 7 juin 2024 à 06:18, Julien Tachoires  a écrit :
>>
>> Le ven. 7 juin 2024 à 05:59, Tomas Vondra
>>  a écrit :
>>>
>>> On 6/6/24 12:58, Julien Tachoires wrote:
 ...

 When compiled with LZ4 support (--with-lz4), this patch enables data
 compression/decompression of these temporary files. Each transaction
 change that must be written on disk (ReorderBufferDiskChange) is now
 compressed and encapsulated in a new structure.

>>>
>>> I'm a bit confused, but why tie this to having lz4? Why shouldn't this
>>> be supported even for pglz, or whatever algorithms we add in the future?
>>
>> That's right, reworking this patch in that sense.
> 
> Please find a new version of this patch adding support for LZ4, pglz
> and ZSTD. It introduces the new GUC logical_decoding_spill_compression
> which is used to set the compression method. In order to stay aligned
> with the other server side GUCs related to compression methods
> (wal_compression, default_toast_compression), the compression level is
> not exposed to users.
> 

Sounds reasonable. I wonder if it might be useful to allow specifying
the compression level in those places, but that's clearly not something
this patch needs to do.

> The last patch of this set is still in WIP, it adds the machinery
> required for setting the compression methods as a subscription option:
> CREATE SUBSCRIPTION ... WITH (spill_compression = ...);
> I think there is a major problem with this approach: the logical
> decoding context is tied to one replication slot, but multiple
> subscriptions can use the same replication slot. How should this work
> if 2 subscriptions want to use the same replication slot but different
> compression methods?
> 

Do we really support multiple subscriptions sharing the same slot? I
don't think we do, but maybe I'm missing something.

> At this point, compression is only available for the changes spilled
> on disk. It is still not clear to me if the compression of data
> transiting through the streaming protocol should be addressed by this
> patch set or by another one. Thought ?
> 

I'd stick to only compressing the data spilled to disk. It might be
useful to compress the streamed data too, but why shouldn't we compress
the regular (non-streamed) transactions too? Yeah, it's more efficient
to compress larger chunks, but we can fit quite large transactions into
logical_decoding_work_mem without spilling.

FWIW I'd expect that to be handled at the libpq level - there's already
a patch for that, but I haven't checked if it would handle this. But
maybe more importantly, I think compressing streamed data might need to
handle some sort of negotiation of the compression algorithm, which
seems fairly complex.

To conclude, I'd leave this out of scope for this patch.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Upgrade Debian CI images to Bookworm

2024-07-15 Thread Andres Freund
Hi,

On 2024-07-15 11:30:59 -0700, Andres Freund wrote:
> On 2024-07-15 09:43:26 -0700, Andres Freund wrote:
> > I've now applied the guc fix to all branches and the CI changes to 15+.
> 
> Ugh. I see that this fails on master, because of
> 
> commit 0c3930d0768
> Author: Peter Eisentraut 
> Date:   2024-07-01 07:30:38 +0200
>  
> Apply COPT to CXXFLAGS as well
> 
> I hadn't seen that because of an independent failure (the macos stuff, I'll
> send an email about it in a bit).
> 
> Not sure what the best real fix here is, this is outside of our code. I'm
> inclined to just disable llvm for the compiler warning task for now.

Oh - there's a better fix: Turns out bookworm does have llvm 16, where the
warning has been fixed. Upgrading the CI image to install llvm 16 should fix
this.   Any arguments against that approach?

Greetings,

Andres




Re: Converting tab-complete.c's else-if chain to a switch

2024-07-15 Thread Tom Lane
Andres Freund  writes:
> On 2024-07-13 13:16:14 -0400, Tom Lane wrote:
>> Of course the main point is the hope that it will work at all with MSVC, but
>> I'm not in a position to test that.

> It does work with just your patches applied - very nice.

Thanks for testing!

> The only obvious thing that doesn't is that ctrl-c without a running query
> terminates psql - interrupting a running query works without terminating psql,
> which is a bit weird. But that seems independent of this series.

Yeah, whatever's going on there is surely orthogonal to this.

>> +# The input is a C file that contains some case labels that are not
>> +# constants, but look like function calls, for example:
>> +#   case Matches("A", "B"):

> Hm, the fact that we are continuing to use the same macros in the switch makes
> it a bit painful to edit the .in.c in an editor with compiler-warnings
> integration - I see a lot of reported errors ("Expression is not an integer
> constant expression") due to case statements not being something that the
> normal C switch can handle.

Ugh, good point.

> Perhaps we could use a distinct macro for use in the switch, which generates
> valid (but nonsensical) code, so we avoid warnings?

Dunno.  I'd explicitly wanted to not have different notations for
patterns implemented in switch labels and those used in ad-hoc tests.

Thinking a bit further outside the box ... maybe the original source
code could be like it is now (or more likely, like it is after 0002,
with the switch-to-be stuff all in a separate function), and we
could make the preprocessor perform the else-to-switch conversion
every time instead of viewing that as a one-time conversion?
That would make it a bit more fragile, but perhaps not impossibly so.

>> +# These lines are replaced by "case N:" where N is a unique number
>> +# for each such case label.  (For convenience, we use the line number
>> +# of the case label, although other values would work just as well.)

> Hm, using the line number seems to make it a bit harder to compare the output
> of the script after making changes to the input.

OK.  I'd thought this choice might be helpful for debugging, but I'm
not wedded to it.  The obvious alternative is to use sequential
numbers for case labels, but wouldn't that also be a bit problematic
if "making changes" includes adding or removing rules?

> I don't think we want to install tab-complete.c?

Good point --- I borrowed that rule from somewhere else, and
failed to study it closely enough.

regards, tom lane




Re: [PATCH v1] Fix parsing of a complex type that has an array of complex types

2024-07-15 Thread Arjan Marku
I agree with all your points as I encountered them myself, especially 
ambiguity and error handling.


The dependency the patch introduces between those functions also seems
like a bad idea to me, but it looked like the only way to support that
use case. It turns out my assumption that the array literal shouldn't be
quoted was wrong.


I managed to execute this query fine:
INSERT INTO item_2d_table VALUES(1, '(1,"{{""(\\""inv 
a\\"",42,1.99)"",""(\\"inv b\\",42,1.99)""},{""(\\""inv 
c\\"",42,1.99)"",""(\\""inv d\\"",42,2)""}}")');


Thanks for your insights on this,

Kind regards,
Arjan Marku

On 7/15/24 9:15 PM, Tom Lane wrote:

INSERT INTO item_2d_table VALUES(1, '(1,{{("inv a",42,1.99),("inv
b",42,1.99)},{("inv c",42,1.99),("inv d",42,2)}})');

Re: Add a GUC check hook to ensure summarize_wal cannot be enabled when wal_level is minimal

2024-07-15 Thread Robert Haas
On Mon, Jul 15, 2024 at 2:47 PM Nathan Bossart  wrote:
> My understanding is that the correctness of this GUC check hook depends on
> wal_level being a PGC_POSTMASTER GUC.  The check hook would always return
> true during startup, and there'd be an additional cross-check in
> PostmasterMain() that would fail startup if necessary.  After that point,
> we know that wal_level cannot change, so the GUC check hook for
> summarize_wal can depend on wal_level.  If it fails, my expectation would
> be that the server would just ignore that change and continue.

But how do you know that, during startup, the setting for
summarize_wal is processed after the setting for wal_level?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-07-15 Thread Robert Haas
On Mon, Jul 15, 2024 at 2:33 PM Jeff Davis  wrote:
> On Mon, 2024-07-15 at 13:44 -0400, Robert Haas wrote:
> > But ... why? I mean, what's the point of prohibiting that?
>
> Agreed. We ignore all kinds of stuff in search_path that doesn't make
> sense, like non-existent schemas. Simpler is better.

Oh, I had the opposite idea: I wasn't proposing ignoring it. I was
proposing making it work.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add a GUC check hook to ensure summarize_wal cannot be enabled when wal_level is minimal

2024-07-15 Thread Nathan Bossart
On Mon, Jul 15, 2024 at 04:03:13PM -0400, Robert Haas wrote:
> On Mon, Jul 15, 2024 at 2:47 PM Nathan Bossart  
> wrote:
>> My understanding is that the correctness of this GUC check hook depends on
>> wal_level being a PGC_POSTMASTER GUC.  The check hook would always return
>> true during startup, and there'd be an additional cross-check in
>> PostmasterMain() that would fail startup if necessary.  After that point,
>> we know that wal_level cannot change, so the GUC check hook for
>> summarize_wal can depend on wal_level.  If it fails, my expectation would
>> be that the server would just ignore that change and continue.
> 
> But how do you know that, during startup, the setting for
> summarize_wal is processed after the setting for wal_level?

You don't, but the GUC check hook should always return true when
summarize_wal is processed first.  We'd rely on the PostmasterMain() check
to fail in that case.

-- 
nathan




Re: Sort functions with specialized comparators

2024-07-15 Thread Stepan Neretin
On Tue, Jul 16, 2024 at 1:47 AM Andrey M. Borodin 
wrote:

>
>
> > On 15 Jul 2024, at 12:52, Stepan Neretin  wrote:
> >
> >
> > I run benchmark with my patches:
> > ./pgbench -c 10 -j2 -t1000 -d postgres
> >
> > pgbench (18devel)
> > starting vacuum...end.
> > transaction type: 
> > scaling factor: 10
> > query mode: simple
> > number of clients: 10
> > number of threads: 2
> > maximum number of tries: 1
> > number of transactions per client: 1000
> > number of transactions actually processed: 1/1
> > number of failed transactions: 0 (0.000%)
> > latency average = 1.609 ms
> > initial connection time = 24.080 ms
> > tps = 6214.244789 (without initial connection time)
> >
> > and without:
> > ./pgbench -c 10 -j2 -t1000 -d postgres
> >
> > pgbench (18devel)
> > starting vacuum...end.
> > transaction type: 
> > scaling factor: 10
> > query mode: simple
> > number of clients: 10
> > number of threads: 2
> > maximum number of tries: 1
> > number of transactions per client: 1000
> > number of transactions actually processed: 1/1
> > number of failed transactions: 0 (0.000%)
> > latency average = 1.731 ms
> > initial connection time = 15.177 ms
> > tps = 5776.173285 (without initial connection time)
> >
> > tps with my patches increase. What do you think?
>
>
> Hi Stepan!
>
> Thank you for implementing specialized sorting and doing this benchmarks.
> I believe it's a possible direction for good improvement.
> However, I doubt in correctness of your benchmarks.
> Increasing TPC-B performance from 5776 TPS to 6214 TPS seems too good to
> be true.
>
>
> Best regards, Andrey Borodin.


Yes... I agree, it's very strange. I restarted the tps measurement and see
this:

tps = 14291.893460 (without initial connection time)  not patched
tps = 14669.624075 (without initial connection time)  patched

What do you think about these measurements?
Best regards, Stepan Neretin.


Re: gcc 12.1.0 warning

2024-07-15 Thread Andres Freund
On 2024-07-15 12:14:47 -0500, Nathan Bossart wrote:
> On Mon, Jul 15, 2024 at 09:41:55AM -0700, Andres Freund wrote:
> > On 2024-05-10 12:13:21 +0300, Nazir Bilal Yavuz wrote:
> >> The fix is attached. It cleanly applies from REL_15_STABLE to
> >> REL_12_STABLE, fixes the warnings and the tests pass.
> > 
> > Thanks! I've applied it to all branches - while it's not required to avoid a
> > warning in newer versions, it's still not correct as it was...
> 
> nitpick: pgindent thinks one of the spaces is unnecessary.

Ugh. Sorry for that, will take a look at fixing that :(




Re: ALTER TABLE uses a bistate but not for toast tables

2024-07-15 Thread Justin Pryzby
@cfbot: rebased
From bc92a05ba35115ba0df278d553a5c0e4e303fe23 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 21 Jun 2022 22:28:06 -0500
Subject: [PATCH] WIP: use BulkInsertState for toast tuples, too

DONE: ALTER, CLUSTER
TODO: copyto, copyfrom?

slot_getsomeattrs slot_deform_heap_tuple fetchatt
heap_getnextslot => heapgettup => heapgetpage => ReadBufferExtended
initscan
table_beginscan table_scan_getnextslot
RelationCopyStorageUsingBuffer ReadBufferWithoutRelcache

(gdb) bt
 #0  table_open (relationId=relationId@entry=16390, lockmode=lockmode@entry=1) at table.c:40
 #1  0x56444cb23d3c in toast_fetch_datum (attr=attr@entry=0x7f67933cc6cc) at detoast.c:372
 #2  0x56444cb24217 in detoast_attr (attr=attr@entry=0x7f67933cc6cc) at detoast.c:123
 #3  0x56444d07a4c8 in pg_detoast_datum_packed (datum=datum@entry=0x7f67933cc6cc) at fmgr.c:1743
 #4  0x56444d042c8d in text_to_cstring (t=0x7f67933cc6cc) at varlena.c:224
 #5  0x56444d0434f9 in textout (fcinfo=) at varlena.c:573
 #6  0x56444d078f10 in FunctionCall1Coll (flinfo=flinfo@entry=0x56444e4706b0, collation=collation@entry=0, arg1=arg1@entry=140082828592844) at fmgr.c:1124
 #7  0x56444d079d7f in OutputFunctionCall (flinfo=flinfo@entry=0x56444e4706b0, val=val@entry=140082828592844) at fmgr.c:1561
 #8  0x56444ccb1665 in CopyOneRowTo (cstate=cstate@entry=0x56444e470898, slot=slot@entry=0x56444e396d20) at copyto.c:975
 #9  0x56444ccb2c7d in DoCopyTo (cstate=cstate@entry=0x56444e470898) at copyto.c:891
 #10 0x56444ccab4c2 in DoCopy (pstate=pstate@entry=0x56444e396bb0, stmt=stmt@entry=0x56444e3759b0, stmt_location=0, stmt_len=48, processed=processed@entry=0x7ffc212a6310) at copy.c:308

cluster:
heapam_relation_copy_for_cluster
reform_and_rewrite_tuple
rewrite_heap_tuple
raw_heap_insert

ci-os-only: linux
---
 src/backend/access/common/toast_internals.c |  6 --
 src/backend/access/heap/heapam.c| 24 +++--
 src/backend/access/heap/heaptoast.c | 11 ++
 src/backend/access/heap/rewriteheap.c   |  8 +--
 src/backend/access/table/toast_helper.c |  6 --
 src/include/access/heaptoast.h  |  4 +++-
 src/include/access/hio.h|  2 ++
 src/include/access/toast_helper.h   |  3 ++-
 src/include/access/toast_internals.h|  4 +++-
 9 files changed, 49 insertions(+), 19 deletions(-)

diff --git a/src/backend/access/common/toast_internals.c b/src/backend/access/common/toast_internals.c
index 90d0654e629..b66d32a26e3 100644
--- a/src/backend/access/common/toast_internals.c
+++ b/src/backend/access/common/toast_internals.c
@@ -117,7 +117,8 @@ toast_compress_datum(Datum value, char cmethod)
  */
 Datum
 toast_save_datum(Relation rel, Datum value,
- struct varlena *oldexternal, int options)
+ struct varlena *oldexternal, int options,
+ BulkInsertState bistate)
 {
 	Relation	toastrel;
 	Relation   *toastidxs;
@@ -320,7 +321,8 @@ toast_save_datum(Relation rel, Datum value,
 		memcpy(VARDATA(&chunk_data), data_p, chunk_size);
 		toasttup = heap_form_tuple(toasttupDesc, t_values, t_isnull);
 
-		heap_insert(toastrel, toasttup, mycid, options, NULL);
+		heap_insert(toastrel, toasttup, mycid, options, bistate ?
+	bistate->toast_state : NULL);
 
 		/*
 		 * Create the index entry.  We cheat a little here by not using
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 91b20147a00..18b05d46735 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -71,7 +71,8 @@
 
 
 static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup,
-	 TransactionId xid, CommandId cid, int options);
+	 TransactionId xid, CommandId cid, int options,
+	 BulkInsertState bistate);
 static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf,
   Buffer newbuf, HeapTuple oldtup,
   HeapTuple newtup, HeapTuple old_key_tuple,
@@ -1931,9 +1932,15 @@ GetBulkInsertState(void)
 	bistate = (BulkInsertState) palloc(sizeof(BulkInsertStateData));
 	bistate->strategy = GetAccessStrategy(BAS_BULKWRITE);
 	bistate->current_buf = InvalidBuffer;
+
 	bistate->next_free = InvalidBlockNumber;
 	bistate->last_free = InvalidBlockNumber;
 	bistate->already_extended_by = 0;
+
+	bistate->toast_state = (BulkInsertState) palloc(sizeof(BulkInsertStateData));
+	bistate->toast_state->strategy = GetAccessStrategy(BAS_BULKWRITE);
+	bistate->toast_state->current_buf = InvalidBuffer;
+
 	return bistate;
 }
 
@@ -1945,6 +1952,10 @@ FreeBulkInsertState(BulkInsertState bistate)
 {
 	if (bistate->current_buf != InvalidBuffer)
 		ReleaseBuffer(bistate->current_buf);
+	if (bistate->toast_state->current_buf != InvalidBuffer)
+		ReleaseBuffer(bistate->toast_state->current_buf);
+
+	FreeAccessStrategy(bistate->toast_state->strategy);
 	FreeAccessStrategy(bistate->strategy);
 	pfree(bistate);
 }
@@ -2010,7 +2021,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,

Re: Upgrade Debian CI images to Bookworm

2024-07-15 Thread Andres Freund
Hi,

On 2024-07-15 12:37:54 -0700, Andres Freund wrote:
> On 2024-07-15 11:30:59 -0700, Andres Freund wrote:
> > On 2024-07-15 09:43:26 -0700, Andres Freund wrote:
> > > I've now applied the guc fix to all branches and the CI changes to 15+.
> >
> > Ugh. I see that this fails on master, because of
> >
> > commit 0c3930d0768
> > Author: Peter Eisentraut 
> > Date:   2024-07-01 07:30:38 +0200
> >
> > Apply COPT to CXXFLAGS as well
> >
> > I hadn't seen that because of an independent failure (the macos stuff, I'll
> > send an email about it in a bit).
> >
> > Not sure what the best real fix here is, this is outside of our code. I'm
> > inclined to just disable llvm for the compiler warning task for now.
>
> Oh - there's a better fix: Turns out bookworm does have llvm 16, where the
> warning has been fixed. Upgrading the CI image to install llvm 16 should fix
> this.   Any arguments against that approach?

Specifically, something like the attached.

Due to the CI failure this is causing, I'm planning to apply this soon...

Arguably we could backpatch this, the warning are present on older branches
too. Except that they don't cause errors, as 0c3930d0768 is only on master.

Greetings,

Andres
From e696e8ded144095c4c4435d2e729fa37ed9e6903 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Mon, 15 Jul 2024 14:08:38 -0700
Subject: [PATCH v1 1/4] ci: Use newer llvm version with gcc

gcc emits a warning for LLVM 14 code outside of our control. To avoid that,
update to a newer LLVM version. Do so both in the CompilerWarnings and normal
tasks - the latter don't fail, but the warnings make it more likely that we'd
miss other warnings.

The warning has been fixed in newer LLVM versions.

Discussion: https://postgr.es/m/20240715193754.awdxgrzurxnww...@awork3.anarazel.de
---
 .cirrus.tasks.yml | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index b14fe91cdb7..df476a1d65a 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -272,6 +272,8 @@ task:
 LDFLAGS: $SANITIZER_FLAGS
 CC: ccache gcc
 CXX: ccache g++
+# GCC emits a warning for llvm-14, so switch to a newer one.
+LLVM_CONFIG: llvm-config-16
 
 LINUX_CONFIGURE_FEATURES: *LINUX_CONFIGURE_FEATURES
 LINUX_MESON_FEATURES: *LINUX_MESON_FEATURES
@@ -334,7 +336,7 @@ task:
 \
 ${LINUX_CONFIGURE_FEATURES} \
 \
-CLANG="ccache clang"
+CLANG="ccache clang-16"
 EOF
   build_script: su postgres -c "make -s -j${BUILD_JOBS} world-bin"
   upload_caches: ccache
@@ -689,6 +691,8 @@ task:
   # different compilers to build with different combinations of dtrace on/off
   # and cassert on/off.
 
+  # GCC emits a warning for llvm-14, so switch to a newer one.
+
   # gcc, cassert off, dtrace on
   always:
 gcc_warning_script: |
@@ -696,7 +700,8 @@ task:
 --cache gcc.cache \
 --enable-dtrace \
 ${LINUX_CONFIGURE_FEATURES} \
-CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang"
+CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang-16" \
+LLVM_CONFIG=llvm-config-16
   make -s -j${BUILD_JOBS} clean
   time make -s -j${BUILD_JOBS} world-bin
 
@@ -707,7 +712,8 @@ task:
 --cache gcc.cache \
 --enable-cassert \
 ${LINUX_CONFIGURE_FEATURES} \
-CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang"
+CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang-16" \
+LLVM_CONFIG=llvm-config-16
   make -s -j${BUILD_JOBS} clean
   time make -s -j${BUILD_JOBS} world-bin
 
@@ -755,7 +761,7 @@ task:
 --cache gcc.cache \
 CC="ccache gcc" \
 CXX="ccache g++" \
-CLANG="ccache clang"
+CLANG="ccache clang-16"
   make -s -j${BUILD_JOBS} clean
   time make -s -j${BUILD_JOBS} -C doc
 
-- 
2.45.2.746.g06e570c0df.dirty



Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-07-15 Thread Daniel Gustafsson
> On 14 Jul 2024, at 14:03, Peter Eisentraut  wrote:
> 
> On 12.07.24 21:42, Daniel Gustafsson wrote:
>>> On 11 Jul 2024, at 23:22, Peter Eisentraut  wrote:
>>> The 0001 patch removes the functions pgtls_init_library() and pgtls_init() 
>>> but keeps the declarations in libpq-int.h.  This should be fixed.
>> Ah, nice catch. Done in the attached rebase.
> 
> This patch version looks good to me.

Thanks for review, I will go ahead with this once back from vacation at the
tail end of July when I can properly handle the BF.

> Small comments on the commit message of 0002: Typo "checkig".  Also, maybe 
> the commit message title can be qualified a little, since we're not really 
> doing "Remove pg_strong_random initialization." but something like "Remove 
> unnecessary ..."?

Good points, will address before pushing.

--
Daniel Gustafsson





Re: Upgrade Debian CI images to Bookworm

2024-07-15 Thread Andres Freund
Hi,

On 2024-07-15 14:35:14 -0700, Andres Freund wrote:
> Specifically, something like the attached.
>
> Due to the CI failure this is causing, I'm planning to apply this soon...
>
> Arguably we could backpatch this, the warning are present on older branches
> too. Except that they don't cause errors, as 0c3930d0768 is only on master.

Here's a v2, to address two things:
- there was an error in the docs build, because LLVM_CONFIG changed
- there were also deprecation warnings in headerscheck/cpluspluscheck

So I just made the change apply a bit more widely.

Greetings,

Andres Freund
From a331747fd237c3b859b1da24ffc7768067c18d98 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Mon, 15 Jul 2024 14:08:38 -0700
Subject: [PATCH v2 1/3] ci: Use newer llvm version with gcc

gcc emits a warning for LLVM 14 code outside of our control. To avoid that,
update to a newer LLVM version. Do so both in the CompilerWarnings and normal
tasks - the latter don't fail, but the warnings make it more likely that we'd
miss other warnings.

The warning has been fixed in newer LLVM versions.

Discussion: https://postgr.es/m/20240715193754.awdxgrzurxnww...@awork3.anarazel.de
---
 .cirrus.tasks.yml | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index b14fe91cdb7..99ca74d5133 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -272,6 +272,8 @@ task:
 LDFLAGS: $SANITIZER_FLAGS
 CC: ccache gcc
 CXX: ccache g++
+# GCC emits a warning for llvm-14, so switch to a newer one.
+LLVM_CONFIG: llvm-config-16
 
 LINUX_CONFIGURE_FEATURES: *LINUX_CONFIGURE_FEATURES
 LINUX_MESON_FEATURES: *LINUX_MESON_FEATURES
@@ -334,7 +336,7 @@ task:
 \
 ${LINUX_CONFIGURE_FEATURES} \
 \
-CLANG="ccache clang"
+CLANG="ccache clang-16"
 EOF
   build_script: su postgres -c "make -s -j${BUILD_JOBS} world-bin"
   upload_caches: ccache
@@ -661,6 +663,9 @@ task:
 LINUX_CONFIGURE_FEATURES: *LINUX_CONFIGURE_FEATURES
 LINUX_MESON_FEATURES: *LINUX_MESON_FEATURES
 
+# GCC emits a warning for llvm-14, so switch to a newer one.
+LLVM_CONFIG: llvm-config-16
+
   <<: *linux_task_template
 
   sysinfo_script: |
@@ -696,7 +701,7 @@ task:
 --cache gcc.cache \
 --enable-dtrace \
 ${LINUX_CONFIGURE_FEATURES} \
-CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang"
+CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang-16"
   make -s -j${BUILD_JOBS} clean
   time make -s -j${BUILD_JOBS} world-bin
 
@@ -707,7 +712,7 @@ task:
 --cache gcc.cache \
 --enable-cassert \
 ${LINUX_CONFIGURE_FEATURES} \
-CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang"
+CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang-16"
   make -s -j${BUILD_JOBS} clean
   time make -s -j${BUILD_JOBS} world-bin
 
@@ -717,7 +722,7 @@ task:
   time ./configure \
 --cache clang.cache \
 ${LINUX_CONFIGURE_FEATURES} \
-CC="ccache clang" CXX="ccache clang++" CLANG="ccache clang"
+CC="ccache clang" CXX="ccache clang++-16" CLANG="ccache clang-16"
   make -s -j${BUILD_JOBS} clean
   time make -s -j${BUILD_JOBS} world-bin
 
@@ -729,7 +734,7 @@ task:
 --enable-cassert \
 --enable-dtrace \
 ${LINUX_CONFIGURE_FEATURES} \
-CC="ccache clang" CXX="ccache clang++" CLANG="ccache clang"
+CC="ccache clang" CXX="ccache clang++-16" CLANG="ccache clang-16"
   make -s -j${BUILD_JOBS} clean
   time make -s -j${BUILD_JOBS} world-bin
 
@@ -753,9 +758,7 @@ task:
 docs_build_script: |
   time ./configure \
 --cache gcc.cache \
-CC="ccache gcc" \
-CXX="ccache g++" \
-CLANG="ccache clang"
+CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang-16"
   make -s -j${BUILD_JOBS} clean
   time make -s -j${BUILD_JOBS} -C doc
 
@@ -774,7 +777,7 @@ task:
 ${LINUX_CONFIGURE_FEATURES} \
 --without-icu \
 --quiet \
-CC="gcc" CXX"=g++" CLANG="clang"
+CC="gcc" CXX"=g++" CLANG="clang-16"
   make -s -j${BUILD_JOBS} clean
   time make -s headerscheck EXTRAFLAGS='-fmax-errors=10'
 headers_cpluspluscheck_script: |
-- 
2.45.2.746.g06e570c0df.dirty



Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-07-15 Thread Peter Geoghegan
On Mon, Jul 8, 2024 at 2:25 PM Melanie Plageman
 wrote:
> Attached v3 has one important additional component in the test -- I
> use pg_stat_progress_vacuum to confirm that we actually do more than
> one pass of index vacuuming. Otherwise, it would have been trivial for
> the test to incorrectly pass.

That's a good idea.

> I could still use another pair of eyes on the test (looking out for
> stability enhancing measures I could take).

First, the basics: I found that your test failed reliably without your
fix, and passed reliably with your fix.

Next, performance: the total test runtime (as indicated by "time meson
test -q recovery/043_vacuum_horizon_floor") was comparable to other
recovery/* TAP tests. The new vacuum_horizon_floor test is a little on
the long running side, as these tests go, but I think that's fine.

Minor nitpicking about the comments in your TAP test:

* It is necessary but not sufficient for your test to "skewer"
maybe_needed, relative to OldestXmin. Obviously, it is not sufficient
because the test can only fail when VACUUM prunes a heap page after
the backend's horizons have been "skewered" in this sense.

Pruning is when we get stuck, and if there's no more pruning then
there's no opportunity for VACUUM to get stuck. Perhaps this point
should be noted directly in the comments. You could add a sentence
immediately after the existing sentence "Then vacuum's first pass will
continue and pruning...". This new sentence would then add commentary
such as "Finally, vacuum's second pass over the heap...".

* Perhaps you should point out that you're using VACUUM FREEZE for
this because it'll force the backend to always get a cleanup lock.
This is something you rely on to make the repro reliable, but that's
it.

In other words, point out to the reader that this bug has nothing to
do with freezing; it just so happens to be convenient to use VACUUM
FREEZE this way, due to implementation details.

* The sentence "VACUUM proceeds with pruning and does a visibility
check on each tuple..." describes the bug in terms of the current
state of things on Postgres 17, but Postgres 17 hasn't been released
just yet. Does that really make sense?

If you want to describe the invariant that caused
heap_pre_freeze_checks() to error-out on HEAD/Postgres 17, then the
commit message of your fix seems like the right place for that. You
could reference these errors in passing. The errors seem fairly
incidental to the real problem, at least to me.

I think that there is some chance that this test will break the build
farm in whatever way, since there is a long history of VACUUM not
quite behaving as expected with these sorts of tests. I think that you
should commit the test case separately, first thing in the morning,
and then keep an eye on the build farm for the rest of the day. I
don't think that it's sensible to bend over backwards, just to avoid
breaking the build farm in this way.

Thanks for working on this
-- 
Peter Geoghegan




Re: CI, macports, darwin version problems

2024-07-15 Thread Andres Freund
Hi,

On 2024-07-03 09:39:06 +1200, Thomas Munro wrote:
> On Thu, Jun 27, 2024 at 6:32 PM Thomas Munro  wrote:
> > So I think we should request
> > ghcr.io/cirruslabs/macos-sonoma-base:latest.  Personal github accounts
> > will use macos-runner:sonoma instead, but at least it's the same OS
> > release.  Here's a new version like that, to see if cfbot likes it.
> 
> The first cfbot run of v3 was successful, but a couple of days later
> when retested it failed with the dreaded "Error:
> ShouldBeAtLeastOneLayer".  (It also failed on Windows, just because
> master was temporarily broken, unrelated to any of this.  Note also
> that the commit message created by cfbot now includes the patch
> version, making the test history easier to grok, thanks Jelte!)
> 
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf/5076
> 
> One difference that jumps out is that the successful v3 run has label
> worker:jc-m2-1 (Mac hosted by Joe), and the failure has
> worker:pgx-m2-1 (Mac hosted by Christophe P).  Is this a software
> version issue, ie need newer Tart to use that image, or could be a
> difficulty fetching the image?  CCing our Mac Mini pool attendants.
> 
> Temporary options include disabling pgx-m2-1 from the pool, or
> teaching .cirrus.task.yml to use Ventura for cfbot but Sonoma for
> anyone else's github account, but ideally we'd figure out why it's not
> working...

Yep, I think we'll have to do that, unless it has been fixed by now.


> This new information also invalidates my previous hypothesis, that the
> new "macos-runner:sonoma" image can't work on self-hosted Macs,
> because that was also on pgx-m2-1.

Besides the base-os-version issue, another theory is that the newer image is
just very large (141GB) and that we've seen some other issues related to
Christophe's internet connection not being the fastest.

WRT your patches:
- I think we ought to switch to the -runner image, otherwise we'll just
  continue to get that "upgraded" warning

- With a fingerprint_script specified, we need to add
  reupload_on_changes: true
  otherwise it'll not be updated.

- I think the fingerprint_script should use sw_vers, just as the script
  does. I see no reason to differ?

- We could just sw_vers -productVersion | sed 's/\..*//g' instead of the more
  complicated version you used, I doubt that they're going to go away from
  numerical major versions...

Greetings,

Andres Freund




Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-07-15 Thread Jeff Davis
On Mon, 2024-07-15 at 16:04 -0400, Robert Haas wrote:
> Oh, I had the opposite idea: I wasn't proposing ignoring it. I was
> proposing making it work.

I meant: ignore $extension_schema if the search_path has nothing to do
with an extension. In other words, if it's in a search_path for the
session, or on a function that's not part of an extension.

On re-reading, I see that you mean it should work if they explicitly
set it as a part of a function that *is* part of an extension. And I
agree with that -- just make it work.
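
For concreteness, the case I mean is roughly this (illustrative only;
$extension_schema is still just a proposal in this thread, and the function
name is made up):

-- in an extension script: the function explicitly pins its search_path
-- to whatever schema the extension ends up installed in
CREATE FUNCTION secdef_func() RETURNS int
LANGUAGE sql SECURITY DEFINER
SET search_path = "$extension_schema", pg_temp
AS 'SELECT 1';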

Regards,
Jeff Davis





Re: Remove dependence on integer wrapping

2024-07-15 Thread Joseph Koshakow
Thanks for the review!

On Mon, Jul 15, 2024 at 11:31 AM Nathan Bossart 
wrote:
>
>I took a closer look at 0002.
>
>I'm curious why you aren't using float8_mul/float8_div here, i.e.,
>
>fresult = rint(float8_mul((float8) c, f));
>fresult = rint(float8_div((float8) c, f));

I wrongly assumed that it was only meant to be used to implement
multiplication and division for the built-in float types. I've updated
the patch to use these functions.

>nitpick: I'd name the functions something like "cash_mul_float8" and
>"cash_div_float8". Perhaps we could also add functions like
>"cash_mul_int64"

Done in the updated patch.
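
So the helper ends up looking roughly like this (a sketch; the actual error
handling in the updated patch may differ):

static Cash
cash_mul_float8(Cash c, float8 f)
{
    float8      res = rint(float8_mul((float8) c, f));

    if (unlikely(isnan(res) || !FLOAT8_FITS_IN_INT64(res)))
        ereport(ERROR,
                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                 errmsg("money out of range")));

    return (Cash) res;
}

cash_div_float8() is the same shape with float8_div(), plus a divide-by-zero
check.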

>and "cash_sub_int64"

Did you mean "cash_div_int64"? There's only a single function that
subtracts cash and an integer, but there's multiple functions that
divide cash by an integer. I've added a "cash_div_int64" in the updated
patch.

The other patches, 0001, 0003, and 0004 are unchanged but have their
version number incremented.

Thanks,
Joe Koshakow
From 018d952a44d51fb9e0a186003556aebb69a66217 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sat, 6 Jul 2024 14:35:00 -0400
Subject: [PATCH 3/4] Remove overflow from array_set_slice

This commit removes an overflow from array_set_slice that allows setting
absurd slice ranges.
---
 src/backend/utils/adt/arrayfuncs.c   | 8 +++-
 src/test/regress/expected/arrays.out | 8 
 src/test/regress/sql/arrays.sql  | 6 ++
 3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index d6641b570d..95e027e9ea 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -2887,7 +2887,13 @@ array_set_slice(Datum arraydatum,
 		 errdetail("When assigning to a slice of an empty array value,"
    " slice boundaries must be fully specified.")));
 
-			dim[i] = 1 + upperIndx[i] - lowerIndx[i];
+			/* dim[i] = 1 + upperIndx[i] - lowerIndx[i]; */
+			if (pg_add_s32_overflow(1, upperIndx[i], &dim[i]) ||
+pg_sub_s32_overflow(dim[i], lowerIndx[i], &dim[i]))
+ereport(ERROR,
+		(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+		 errmsg("array size exceeds the maximum allowed (%d)",
+(int) MaxArraySize)));
 			lb[i] = lowerIndx[i];
 		}
 
diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out
index 23404982f7..a2382387e2 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -2699,3 +2699,11 @@ SELECT array_sample('{1,2,3,4,5,6}'::int[], -1); -- fail
 ERROR:  sample size must be between 0 and 6
 SELECT array_sample('{1,2,3,4,5,6}'::int[], 7); --fail
 ERROR:  sample size must be between 0 and 6
+-- Test for overflow in array slicing
+CREATE temp table arroverflowtest (i int[]);
+INSERT INTO arroverflowtest(i[-2147483648:2147483647]) VALUES ('{}');
+ERROR:  array size exceeds the maximum allowed (134217727)
+INSERT INTO arroverflowtest(i[1:2147483647]) VALUES ('{}');
+ERROR:  array size exceeds the maximum allowed (134217727)
+INSERT INTO arroverflowtest(i[2147483647:2147483647]) VALUES ('{}');
+ERROR:  array size exceeds the maximum allowed (134217727)
diff --git a/src/test/regress/sql/arrays.sql b/src/test/regress/sql/arrays.sql
index 50aa539fdc..e9d6737117 100644
--- a/src/test/regress/sql/arrays.sql
+++ b/src/test/regress/sql/arrays.sql
@@ -825,3 +825,9 @@ SELECT array_dims(array_sample('[-1:2][2:3]={{1,2},{3,NULL},{5,6},{7,8}}'::int[]
 SELECT array_dims(array_sample('{{{1,2},{3,NULL}},{{5,6},{7,8}},{{9,10},{11,12}}}'::int[], 2));
 SELECT array_sample('{1,2,3,4,5,6}'::int[], -1); -- fail
 SELECT array_sample('{1,2,3,4,5,6}'::int[], 7); --fail
+
+-- Test for overflow in array slicing
+CREATE temp table arroverflowtest (i int[]);
+INSERT INTO arroverflowtest(i[-2147483648:2147483647]) VALUES ('{}');
+INSERT INTO arroverflowtest(i[1:2147483647]) VALUES ('{}');
+INSERT INTO arroverflowtest(i[2147483647:2147483647]) VALUES ('{}');
-- 
2.34.1

From 950da2c4ce85632f6085680c3ed7b75fb1f780f7 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sat, 8 Jun 2024 22:16:46 -0400
Subject: [PATCH 1/4] Remove dependence on integer wrapping

This commit updates various parts of the code to no longer rely on
integer wrapping for correctness. Not all compilers support -fwrapv, so
it's best not to rely on it.
---
 src/backend/utils/adt/cash.c   |   7 +-
 src/backend/utils/adt/numeric.c|   5 +-
 src/backend/utils/adt/numutils.c   |  34 ---
 src/backend/utils/adt/timestamp.c  |  28 +-
 src/include/common/int.h   | 105 +
 src/interfaces/ecpg/pgtypeslib/timestamp.c |  11 +--
 src/test/regress/expected/timestamp.out|  13 +++
 src/test/regress/expected/timestamptz.out  |  13 +++
 src/test/regress/sql/timestamp.sql |   4 +
 src/test/regress/sql/timestamptz.sql   |   4 +
 10 files changed, 169 insertions(+), 55 deletions(-)

Re: Proposal for implementing OCSP Stapling in PostgreSQL

2024-07-15 Thread Jacob Champion
On Tue, Mar 5, 2024 at 4:12 PM David Zhang  wrote:
> This is the third version patch for "Certificate status check using OCSP
> Stapling" with ssl regression test cases added.

Hi David,

Thanks again for working on this! So far I've taken a look at the
design and tests. I've only skimmed the callback implementations; I
plan to review those in more detail once the architecture is nailed
down.

= Design =

It looks like this design relies on the DBA to manually prefetch OCSP
responses for their cert chain, and cache them in the local
ssl_ocsp_file. This is similar to Nginx's ssl_stapling_file directive
[1]. I think this may make sense for a v1 (much less code!), but it's
going to take a lot of good documentation on what, exactly, an admin
has to do to make sure their response file is fresh. IIUC it will
depend on the CA's policy and how they operate their responder.

We should probably be prepared for complaints that we don't run the
client ourselves. A v2 could maybe include an extension for running
the OCSP client in a background worker? Or maybe that's overkill, and
an existing job scheduler extension could do it, but having a custom
worker that could periodically check the response file and provide
warnings when it gets close to expiring might be helpful for admins.

I am worried about the multi-stapling that is being done in the tests,
for example the server-ca-ocsp-good response file. I think those tests
are cheating a bit. Yes, for this particular case, we can issue a
single signed response for both the intermediate and the leaf -- but
that's only because we issued both of them. What if your intermediate
and your root use different responders [2]? What if your issuer's
responder doesn't even support multiple certs per request [3]?

(Note that OpenSSL still doesn't support multi-stapling [4], even in
TLS 1.3 where we were supposed to get it "for free".)

I think it would be good for the sslocspstapling directive to 1) maybe
have a shorter name (cue bikeshed!) and 2) support more than a binary
on/off. For example, the current implementation could use
"disable/require" options, and then a v2 could add "prefer" which
simply sends the status request and honors must-staple extensions on
the certificate. (That should be safe to default to, I think, and it
would let DBAs control the stapling rollout more easily.)
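
To make that concrete, client usage under such a scheme could look like this
(the option name and values are the ones under discussion here, not a settled
interface):

# v1 semantics: fail unless the server staples a valid OCSP response
psql "host=db.example.com sslmode=verify-full sslocspstapling=require"

# possible v2 semantics: request stapling and honor must-staple, but
# tolerate servers that have nothing to staple
psql "host=db.example.com sslmode=verify-full sslocspstapling=prefer"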

A question from ignorance: how does the client decide that the
signature on the OCSP response is actually valid for the specific
chain in use?

= Tests =

I think the tests should record the expected_stderr messages for
failures (see the Code section below for why).

If it turns out that multi-stapling is safe, then IMO the tests should
explicitly test both TLSv1.2 and v1.3, since those protocols differ in
how they handle per-certificate status.

If it's okay with you, I'd like to volunteer to refactor out the
duplicated recipes in sslfiles.mk. I have some particular ideas in
mind and don't want to make you play fetch-a-rock. (No pressure to use
what I come up with, if you don't like it.)

Because a CRL is a valid fallback for OCSP in practice, I think there
should be some tests for their interaction. (For v1, maybe that's as
simple as "you're not allowed to use both yet", but it should be
explicit.)

= Code =

(this is a very shallow review)

+#define OCSP_CERT_STATUS_OK1
+#define OCSP_CERT_STATUS_NOK   (-1)

Returning -1 from the callback indicates an internal error, so we're
currently sending the wrong alerts for OCSP failures ("internal error"
rather than "bad certificate status response") and getting unhelpful
error messages from libpq. For cases where the handshake proceeds
correctly but we don't accept the OCSP response status, I think we
should be returning zero.

Also, we should not stomp on the OCSP_ namespace (I thought these
macros were part of the official API at first).

+   /* set up OCSP stapling callback */
+   SSL_CTX_set_tlsext_status_cb(SSL_context, ocsp_stapling_cb);

It seems like this should only be done if ssl_ocsp_file is set?
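
In other words, something like this sketch:

/* install the stapling callback only when a response file is configured */
if (ssl_ocsp_file != NULL && ssl_ocsp_file[0] != '\0')
    SSL_CTX_set_tlsext_status_cb(SSL_context, ocsp_stapling_cb);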

Thanks again!
--Jacob

[1] https://nginx.org/en/docs/http/ngx_http_ssl_module.html#ssl_stapling_file
[2] as appears to be the case for postgresql.org
[3] https://community.letsencrypt.org/t/bulk-ocsp-requests/168156/2
[4] https://github.com/openssl/openssl/pull/20945




Re: Parent/child context relation in pg_get_backend_memory_contexts()

2024-07-15 Thread David Rowley
On Tue, 16 Jul 2024 at 06:19, Robert Haas  wrote:
> I'm not against what you're trying to do here, but I feel like you
> might be over-engineering it. I don't think there was anything really
> wrong with what Melih was doing, and I don't think there's anything
> really wrong with converting the path to an array of strings, either.
> Sure, it might not be perfect, but future patches could always remove
> the name duplication. This is a debugging facility that will be used
> by a tiny minority of users, and if some non-uniqueness gets
> reintroduced in the future, it's not a critical defect and can just be
> fixed when it's noticed.

I'm just not on board with the
query-returns-correct-results-most-of-the-time attitude and I'm
surprised you are. You can get that today if you like, just write a
WITH RECURSIVE query joining "name" to "parent". If the consensus is
that's fine because it works most of the time, then I don't see any
reason to invent a new way to get equally correct-most-of-the-time
results.

> That said, if you want to go with the integer
> IDs and want to spend more time massaging it, I also think that's
> fine. I simply don't believe it's the only way forward here. YMMV, but
> my opinion is that none of these approaches have such critical flaws
> that we need to get stressed about it.

If there are other ways forward that match the goal of having a
reliable way to determine the parent of a MemoryContext, then I'm
interested in hearing more.  I know you've mentioned about having
unique names, but I don't know how to do that. Do you have any ideas
on how we could enforce the uniqueness? I don't really like your idea
of renaming contexts when we find duplicate names as bug fixes. The
nature of our code wouldn't make it easy to determine as some reusable
code might create a context as a child of CurrentMemoryContext and
multiple callers might call that code within a different
CurrentMemoryContext.

One problem is that, if you look at MemoryContextCreate(), we require
that the name is statically allocated. We don't have the flexibility
to assign unique names when we find a conflict.  If we were to come up
with a solution that assigned a unique name, then I'd call that
"over-engineered" for the use case we need it for. I think if we did
something like that, it would undo some of the work Tom did in
442accc3f. Also, I think it was you that came up with the idea of
MemoryContext reuse (9fa6f00b1)? Going by that commit message, it
seems to be done for performance reasons. If MemoryContext.name was
dynamic, there'd be more allocation work to do when reusing a context.
That might undo some of the performance gains seen in 9fa6f00b1. I
don't really want to go through the process of verifying there's no
performance regress for a patch that aims to make
pg_backend_memory_contexts more useful.

David




Re: tests fail on windows with default git settings

2024-07-15 Thread Thomas Munro
On Sun, Jul 14, 2024 at 10:00 AM Thomas Munro  wrote:
> On Fri, Jul 12, 2024 at 3:49 AM Dave Page  wrote:
> > # doesn't match '(?^:pg_dump: error: connection to server .* failed: 
> > FATAL:  database "qqq" does not exist)'
>
> Does it help if you revert 29992a6?

FWIW I just happened to notice the same failure on Cirrus, in the
github.com/postgres/postgres master branch:

https://cirrus-ci.com/task/5382396705505280

Your failure mentions GSSAPI and the above doesn't, but that'd be
because for Cirrus CI we have PG_TEST_USE_UNIX_SOCKETS so it's using
AF_UNIX.  At one point I proposed deleting that weird GSAPPI stuff and
using AF_UNIX always on Windows[1], but the feedback was that I should
instead teach the whole test suite to be able to use AF_UNIX or
AF_INET* on all OSes and I never got back to it.

The error does seem be the never-ending saga from this and other threads:

https://www.postgresql.org/message-id/flat/90b34057-4176-7bb0-0dbb-9822a5f6425b%40greiz-reinsdorf.de

My uninformed impression is that graceful socket shutdowns would very
likely fix the class of lost-final-message problem where the client
does recv() next, including this case IIUC.  It's only a partial
improvement though: if the client calls send() next, I think it can
still drop buffered received data, so this graceful shutdown stuff
doesn't quite get us to the same situation as Unix at all points in the
protocol.  The real world case where that second case comes up is
where the client sends a new query and on Unix gets a buffered error
message saying the backend has exited due to idle timeout, but on
Windows gets a connection reset message.  I've wondered before if you
could fix (or narrow to almost zero?) that by giving libpq a mode
where it calls poll() to check for buffered readable data every single
time it's about to send.
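
A minimal sketch of that idea (illustrative only, not actual libpq code):

#include <stdbool.h>
#include <poll.h>

/* true if the socket already has data (or EOF/reset) buffered for us */
static bool
socket_has_pending_input(int sock)
{
    struct pollfd pfd = {.fd = sock, .events = POLLIN};

    return poll(&pfd, 1, 0) > 0 &&
        (pfd.revents & (POLLIN | POLLHUP | POLLERR)) != 0;
}

libpq would call something like this before each send() and, if it returns
true, read and surface any pending error message before transmitting the new
query.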

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKGK30uLx9dpgkYwomgH0WVLUHytkChDgf3iUM2zp0pf_nA%40mail.gmail.com




Re: Logical Replication of sequences

2024-07-15 Thread Peter Smith
Hi,

I was reading back through this thread to find out how the proposed new
command for refreshing sequences,  came about. The patch 0705 introduces a
new command syntax for ALTER SUBSCRIPTION ... REFRESH SEQUENCES

So now there are 2 forms of subscription refresh.

#1. ALTER SUBSCRIPTION name REFRESH PUBLICATION [ WITH ( refresh_option [=
value] [, ... ] ) ]

#2. ALTER SUBSCRIPTION name REFRESH SEQUENCES



IMO, that separation seems complicated. It leaves many questions like:
* It causes a bit of initial confusion. e.g. When I saw the REFRESH
SEQUENCES I first assumed that was needed because sequences were
not covered by the existing REFRESH PUBLICATION
* Why wasn't command #2 called ALTER SUBSCRIPTION REFRESH PUBLICATION
SEQUENCES? E.g. missing keyword PUBLICATION. It seems inconsistent.
* I expect sequence values can become stale pretty much immediately after
command #1, so the user will want to use command #2 anyway...
* ... but if command #2 also does add/remove changed sequences same as
command #1 then what benefit was there of having the command #1 for
sequences?
* There is a separation of sequences (from tables) in command #2 but there
is no separation currently possible in command #1. It seemed inconsistent.

~~~

IIUC some of the goals I saw in the thread are to:
* provide a way to fetch and refresh sequences that also keeps behaviors
(e.g. copy_data etc.) consistent with the refresh of subscription tables
* provide a way to fetch and refresh *only* sequences

I felt you could just enhance the existing refresh command syntax (command
#1), instead of introducing a new one it would be simpler and it would
still meet those same objectives.

Synopsis:
ALTER SUBSCRIPTION name REFRESH PUBLICATION [TABLES | SEQUENCES | ALL] [
WITH ( refresh_option [= value] [, ... ] ) ]

My only change is the introduction of the optional "[TABLES | SEQUENCES |
ALL]" clause.

I believe that can do everything your current patch does, plus more:
* Can refresh *only* TABLES if that is what you want (current patch 0705
cannot do this)
* Can refresh *only* SEQUENCES (same as current patch 0705 command #2)
* Has better integration with refresh options like "copy_data" (current
patch 0705 command #2 doesn't have options)
* Existing REFRESH PUBLICATION syntax still works as-is. You can decide
later what is PG18 default if the "[TABLES | SEQUENCES | ALL]" is omitted.

~~~

More examples using proposed syntax.

ex1.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION TABLES WITH (copy_data = false)
- same as PG17 functionality for ALTER SUBSCRIPTION sub REFRESH PUBLICATION
WITH (copy_data = false)

ex2.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION TABLES WITH (copy_data = true)
- same as PG17 functionality for ALTER SUBSCRIPTION sub REFRESH PUBLICATION
WITH (copy_data = true)

ex3.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION SEQUENCES WITH (copy data =
false)
- this adds/removes only sequences to pg_subscription_rel but doesn't
update their sequence values

ex4.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION SEQUENCES WITH (copy data = true)
- this adds/removes only sequences to pg_subscription_rel and also updates
all sequence values.
- this is equivalent behaviour of what your current 0705 patch is doing for
command #2, ALTER SUBSCRIPTION sub REFRESH SEQUENCES

ex5.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION ALL WITH (copy_data = false)
- this is equivalent behaviour of what your current 0705 patch is doing for
command #1, ALTER SUBSCRIPTION sub REFRESH PUBLICATION WITH (copy_data =
false)

ex6.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION ALL WITH (copy_data = true)
- this adds/removes tables and sequences, and updates all table initial data
and sequence values.
- I think it is equivalent to your current 0705 patch doing command #1 ALTER
SUBSCRIPTION sub REFRESH PUBLICATION WITH (copy_data = true), followed by
another command #2 ALTER SUBSCRIPTION sub REFRESH SEQUENCES

ex7.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION SEQUENCES
- Because default copy_data is true you do not need to specify options, so
this is the same behaviour as your current 0705 patch command #2, ALTER
SUBSCRIPTION sub REFRESH SEQUENCES.

~~~

I hope this post was able to demonstrate that by enhancing the existing
command:
- it is less tricky to understand the separate command distinctions
- there is more functionality/flexibility possible
- there is better integration with the refresh options like copy_data
- behaviour for tables/sequences is more consistent

Anyway, it is just my opinion. Maybe there are some pitfalls I'm unaware of.

Thoughts?

==
Kind Regards,
Peter Smith.
Fujitsu Australia


Duplicate unique key values in inheritance tables

2024-07-15 Thread Richard Guo
I came across a query that returned incorrect results and I traced it
down to being caused by duplicate unique key values in an inheritance
table.  As a simple example, consider

create table p (a int primary key, b int);
create table c () inherits (p);

insert into p select 1, 1;
insert into c select 1, 2;

select a, b from p;
 a | b
---+---
 1 | 1
 1 | 2
(2 rows)

explain (verbose, costs off)
select a, b from p group by a;
  QUERY PLAN
--
 HashAggregate
   Output: p.a, p.b
   Group Key: p.a
   ->  Append
 ->  Seq Scan on public.p p_1
   Output: p_1.a, p_1.b
 ->  Seq Scan on public.c p_2
   Output: p_2.a, p_2.b
(8 rows)

The parser considers 'p.b' functionally dependent on the group by
column 'p.a' because 'p.a' is identified as the primary key for table
'p'.  However, this causes confusion for the executor when determining
which 'p.b' value should be returned for each group.  In my case, I
observed that sorted and hashed aggregation produce different results
for the same query.

Reading the docs, it seems to be a documented limitation of the
inheritance feature that we can have duplicate unique key values across
inheritance tables.  Even adding a unique constraint to the children
does not prevent duplication compared to the parent.

As a workaround for this issue, I'm considering whether we can skip
checking functional dependency on primary keys for inheritance
parents, given that we cannot guarantee uniqueness on the keys in this
case.  Maybe something like below.

@@ -1421,7 +1427,9 @@ check_ungrouped_columns_walker(Node *node,
Assert(var->varno > 0 &&
   (int) var->varno <= list_length(context->pstate->p_rtable));
rte = rt_fetch(var->varno, context->pstate->p_rtable);
-   if (rte->rtekind == RTE_RELATION)
+   if (rte->rtekind == RTE_RELATION &&
+   !(rte->relkind == RELKIND_RELATION &&
+ rte->inh && has_subclass(rte->relid)))
{
if (check_functional_grouping(rte->relid,

Any thoughts?

Thanks
Richard




Re: Injection point locking

2024-07-15 Thread Michael Paquier
On Mon, Jul 15, 2024 at 10:55:26AM +0300, Heikki Linnakangas wrote:
> Ok, committed this.

Okidoki.  Thanks!
--
Michael




Re: Duplicate unique key values in inheritance tables

2024-07-15 Thread David Rowley
On Tue, 16 Jul 2024 at 12:45, Richard Guo  wrote:
> As a workaround for this issue, I'm considering whether we can skip
> checking functional dependency on primary keys for inheritance
> parents, given that we cannot guarantee uniqueness on the keys in this
> case.  Maybe something like below.
>
> @@ -1421,7 +1427,9 @@ check_ungrouped_columns_walker(Node *node,
> Assert(var->varno > 0 &&
>(int) var->varno <= list_length(context->pstate->p_rtable));
> rte = rt_fetch(var->varno, context->pstate->p_rtable);
> -   if (rte->rtekind == RTE_RELATION)
> +   if (rte->rtekind == RTE_RELATION &&
> +   !(rte->relkind == RELKIND_RELATION &&
> + rte->inh && has_subclass(rte->relid)))
> {
> if (check_functional_grouping(rte->relid,
>
> Any thoughts?

The problem with doing that is that it might mean queries that used to
work no longer work.  CREATE VIEW could also fail where it used to
work which could render pg_dumps unrestorable.

Because it's a parser issue, I don't think we can fix it the same way
as a5be4062f was fixed.

I don't have any ideas on what we can do about this right now, but
thought it was worth sharing the above.

David




Re: Allow non-superuser to cancel superuser tasks.

2024-07-15 Thread Michael Paquier
On Mon, Jul 15, 2024 at 09:54:43AM +0900, Michael Paquier wrote:
> Thanks.  I'll see about stressing the buildfarm tomorrow or so, after
> looking at how the CI reacts.

There were a few more things here:
1) The new test was missing from test_misc/meson.build.
2) With 1) fixed, the CI has been complaining about the test
stability, when retrieving the PID of a worker with this query:
SELECT pid FROM pg_stat_activity WHERE backend_type = 'autovacuum worker'

And it's annoying to have missed what's wrong here:
- We don't check that the PID comes from a worker waiting on an
injection point, so it could be a PID of something running, still gone
once the signals are processed.
- No limit check, so we could finish with a list of PIDs while only
one is necessary.  Windows was slow enough to spot that, spawning
multiple autovacuum workers waiting on the injection point.
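
Addressing both points, the query ends up being something like this (the
wait_event value shown is illustrative; it is whichever injection point the
test attaches to):

SELECT pid FROM pg_stat_activity
  WHERE backend_type = 'autovacuum worker'
    AND wait_event = 'autovacuum-worker-start'
  LIMIT 1;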

After improving all that, I have checked again the CI and it was
happy, so applied on HEAD.  Let's see now how the buildfarm reacts.
--
Michael




Re: Pluggable cumulative statistics

2024-07-15 Thread Michael Paquier
On Fri, Jul 12, 2024 at 03:44:26PM +0200, Dmitry Dolgov wrote:
> I think it's fine. Although this solution feels a bit uncomfortable,
> after thinking back and forth I don't see any significantly better
> option. Worth noting that since the main goal is to maintain uniqueness,
> fixing the kind IDs could be accomplished in more than one way, with
> varying amount of control over the list of custom IDs:
> 
> * One could say "let's keep it in the wiki and let the community organize
>   itself somehow", and it's done.
> * Another way would be to keep it in wiki, and introduce some
>   maintenance rules, e.g. once per release someone is going to clean up
>   the list from old unmaintained extensions, correct errors if needed,
>   etc. Not sure if such cleanup would be needed, but it's not impossible
>   to imagine.
> * Even more closed option would be to keep the kind IDs in some separate
>   git repository, and let committers add new records on demand,
>   expressed via some request form.

RMGRs have been taking the wiki page approach to control the source of
truth, and that still sounds like the simplest option to me.  I'm OK to be
outvoted, but this simplifies the read/write pgstats paths a lot, and
these would get more complicated if we added more options because of new
entry types (more things like serialized names that I cannot think of
yet, etc.).  An extra point is that this makes future extensibility a bit
easier to work on.

> As far as I understand the current proposal is about the first option,
> on one side of the spectrum.

Yes.

>> - The fixed-numbered custom stats kinds are stored in an array in
>> PgStat_Snapshot and PgStat_ShmemControl, so as we have something
>> consistent with the built-in kinds.  This makes the tracking of the
>> validity of the data in the snapshots split into parts of the
>> structure for builtin and custom kinds.  Perhaps there are better
>> ideas than that?  The built-in fixed-numbered kinds have no
>> redirection.
> 
> Are you talking about this pattern?
> 
>if (pgstat_is_kind_builtin(kind))
>ptr = // get something from a snapshot/shmem by offset
>else
>ptr = // get something from a custom_data by kind
> 
> Maybe it would be possible to hide it behind some macros or an inlinable
> function with the offset and kind as arguments (and one of them will not
> be used)?

Kind of.  All the code paths calling pgstat_is_kind_builtin() in the
patch manipulate different areas of the snapshot and/or the shmem
control structures, so a macro makes little sense.

Perhaps we should have a few more inline functions like
pgstat_get_entry_len() to retrieve the parts of the custom data in the
snapshot and shmem control structures for fixed-numbered stats.  That
would limit what extensions need to know about
pgStatLocal.shmem->custom_data[] and
pgStatLocal.snapshot.custom_data[], which is easy to use incorrectly.
They don't need to know about pgStatLocal at all, either.
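
For instance, something along these lines (a sketch; the helper names are
invented here):

/* hide the custom_data[] indirection from code outside pgstat internals */
static inline void *
pgstat_get_custom_shmem_data(PgStat_Kind kind)
{
    Assert(pgstat_is_kind_custom(kind));
    return pgStatLocal.shmem->custom_data[kind - PGSTAT_KIND_CUSTOM_MIN];
}

static inline void *
pgstat_get_custom_snapshot_data(PgStat_Kind kind)
{
    Assert(pgstat_is_kind_custom(kind));
    return pgStatLocal.snapshot.custom_data[kind - PGSTAT_KIND_CUSTOM_MIN];
}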

Thinking over the weekend on this patch, splitting injection_stats.c
into two separate files to act as two templates for the variable and
fixed-numbered cases would be more friendly to developers, as well.
--
Michael




Re: Duplicate unique key values in inheritance tables

2024-07-15 Thread David G. Johnston
On Monday, July 15, 2024, David Rowley  wrote:

> On Tue, 16 Jul 2024 at 12:45, Richard Guo  wrote:
> > As a workaround for this issue, I'm considering whether we can skip
> > checking functional dependency on primary keys for inheritance
> > parents, given that we cannot guarantee uniqueness on the keys in this
> > case.
>
> Because it's a parser issue, I don't think we can fix it the same way
> as a5be4062f was fixed.
>
> I don't have any ideas on what we can do about this right now, but
> thought it was worth sharing the above.
>

Add another note to caveats in the docs and call it a feature.  We produce
a valid answer for the data model encountered.  The non-determinism isn’t
wrong, it’s just a poorly written query/model with non-deterministic
results. Since v15 we have an any_value aggregate - we basically are
applying this to the dependent columns implicitly.  A bit of revisionist
history but I’d rather do that than break said queries.  Especially at
parse time; I’d be a bit more open to execution-time enforcement if
functional dependency on the id turns out to have actually been violated.
But people want, and in other products have, any_value implicit aggregation
in this situation so it’s hard to say it is wrong even if we otherwise take
the position that we will not accept it.
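
Written out explicitly, that is:

-- make the implicit pick explicit instead of relying on the
-- functional-dependency detection on the inherited primary key
SELECT a, any_value(b) FROM p GROUP BY a;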

David J.


Re: Internal error codes triggered by tests

2024-07-15 Thread Michael Paquier
On Sun, Jul 14, 2024 at 07:00:00PM +0300, Alexander Lakhin wrote:
> I've filed a bug report about the "WITH RECURSIVE" anomaly: [1], but what
> I wanted to understand when presenting different error kinds is what
> definition XX000 errors could have in principle?

Cool, thanks!  I can see that Tom has already committed a fix.

I'm going to start a new thread for ERRCODE_NAME_TOO_LONG.  It would
be confusing to solve the issue in the middle of this thread.

> If the next thing to do is to get backtrace_on_internal_error back and
> that GUC is mainly intended for developers, then maybe having clean (or
> containing expected backtraces only) regression test logs is a final goal
> and we should stop here. But if it's expected that that GUC could be
> helpful for users to analyze such errors in production and thus pay extra
> attention to them, maybe having XX000 status for presumably
> unreachable conditions only is desirable...

Perhaps.  Let's see where it leads if we have this discussion again.
Some internal errors cannot be avoided because some tests expect such
cases (failures with on-disk file manipulation are one example).
--
Michael




Re: Flush pgstats file during checkpoints

2024-07-15 Thread Michael Paquier
On Fri, Jul 12, 2024 at 01:01:19PM +, Bertrand Drouvot wrote:
> Instead of removing the stat file, should we keep it around until the first 
> call
> to pgstat_write_statsfile()?

Oops.  You are right, I have somewhat missed the unlink() once we are
done reading the stats file with a correct redo location.
--
Michael




Re: Confine vacuum skip logic to lazy_scan_skip

2024-07-15 Thread Noah Misch
On Mon, Jul 15, 2024 at 03:26:32PM +1200, Thomas Munro wrote:
> On Mon, Jul 8, 2024 at 2:49 AM Noah Misch  wrote:
> > what is the scope of the review you seek?
> 
> The patch "Refactor tidstore.c memory management." could definitely
> use some review.

That's reasonable.  radixtree already forbids mutations concurrent with
iteration, so there's no new concurrency hazard.  One alternative is
per_buffer_data big enough for MaxOffsetNumber, but that might thrash caches
measurably.  That patch is good to go apart from these trivialities:

> - return &(iter->output);
> + return &iter->output;

This cosmetic change is orthogonal to the patch's mission.

> - for (wordnum = 0; wordnum < page->header.nwords; wordnum++)
> + for (int wordnum = 0; wordnum < page->header.nwords; wordnum++)

Likewise.




Re: CI, macports, darwin version problems

2024-07-15 Thread Thomas Munro
On Tue, Jul 16, 2024 at 10:48 AM Andres Freund  wrote:
> WRT your patches:
> - I think we ought to switch to the -runner image, otherwise we'll just
>   continue to get that "upgraded" warning

Right, let's try it.

> - With a fingerprint_script specified, we need to add
>   reupload_on_changes: true
>   otherwise it'll not be updated.

Ahh, I see.

> - I think the fingerprint_script should use sw_vers, just as the script
>   does. I see no reason to differ?

Yeah might as well.  I started with Darwin versions because that is
what MacPorts complains about, but they move in lockstep.

> - We could just sw_vers -productVersion | sed 's/\..*//g' instead of the more
>   complicated version you used, I doubt that they're going to go away from
>   numerical major versions...

Yep.

I've attached a new version like that.  Let's see which runner machine
gets it and how it turns out...
From 2f45f282b3d690ca0398628f6fb747681687c52f Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Wed, 26 Jun 2024 10:43:50 +1200
Subject: [PATCH v4] ci: Upgrade macOS version from 13 to 14.

1.  Previously we were using ghcr.io/cirruslabs/macos-XXX-base:latest
images, but Cirrus has started ignoring that and using a particular
image, currently ghcr.io/cirruslabs/macos-runner:sonoma, for github
accounts using free CI resources (as opposed to dedicated runner
machines, as cfbot uses).  Let's just ask for that image anyway, to stay
in sync.

2.  Instead of hard-coding a MacPorts installation URL, deduce it from
the running macOS version and the available releases.  This removes the
need to keep the ci_macports_packages.sh in sync with .cirrus.task.yml,
and to advance the MacPorts version from time to time.

3.  Change the cache key we use to cache the whole macports installation
across builds to include the OS major version, to trigger a fresh
installation when appropriate.

Back-patch to 15 where CI began.

Reviewed-by: Andres Freund 
Discussion: https://postgr.es/m/CA%2BhUKGLqJdv6RcwyZ_0H7khxtLTNJyuK%2BvDFzv3uwYbn8hKH6A%40mail.gmail.com

ci: Find appropriate MacPorts package automatically.

Discussion: https://postgr.es/m/CA%2BhUKGLqJdv6RcwyZ_0H7khxtLTNJyuK%2BvDFzv3uwYbn8hKH6A%40mail.gmail.com
---
 .cirrus.tasks.yml|  9 +++--
 src/tools/ci/ci_macports_packages.sh | 14 +-
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index 33646faeadf..a6800ecde9c 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -411,7 +411,7 @@ task:
 
 
 task:
-  name: macOS - Ventura - Meson
+  name: macOS - Sonoma - Meson
 
   env:
 CPUS: 4 # always get that much for cirrusci macOS instances
@@ -420,7 +420,7 @@ task:
 # work OK. See
 # https://postgr.es/m/20220927040208.l3shfcidovpzqxfh%40awork3.anarazel.de
 TEST_JOBS: 8
-IMAGE: ghcr.io/cirruslabs/macos-ventura-base:latest
+IMAGE: ghcr.io/cirruslabs/macos-runner:sonoma
 
 CIRRUS_WORKING_DIR: ${HOME}/pgsql/
 CCACHE_DIR: ${HOME}/ccache
@@ -460,6 +460,11 @@ task:
   # updates macports every time.
   macports_cache:
 folder: ${MACPORTS_CACHE}
+fingerprint_script: |
+  # Include the OS major version in the cache key.  If the OS image changes
+  # to a different major version, we need to reinstall.
+  sw_vers -productVersion | sed 's/\..*//'
+reupload_on_changes: true
   setup_additional_packages_script: |
 sh src/tools/ci/ci_macports_packages.sh \
   ccache \
diff --git a/src/tools/ci/ci_macports_packages.sh b/src/tools/ci/ci_macports_packages.sh
index f87256e0908..03cbe48a5d6 100755
--- a/src/tools/ci/ci_macports_packages.sh
+++ b/src/tools/ci/ci_macports_packages.sh
@@ -13,7 +13,19 @@ set -e
 
 packages="$@"
 
-macports_url="https://github.com/macports/macports-base/releases/download/v2.8.1/MacPorts-2.8.1-13-Ventura.pkg"
+macos_major_version="` sw_vers -productVersion | sed 's/\..*//' `"
+echo "macOS major version: $macos_major_version"
+
+# Scan the available MacPorts releases to find the latest one for the
+# running macOS release.  By default we assume the first match is the most
+# recent MacPorts version but that can be changed below if it turns out to be
+# problematic or a particular MacPorts release turns out to be broken.
+macports_release_list_url="https://api.github.com/repos/macports/macports-base/releases"
+macports_version_pattern=".*"
+#macports_version_pattern="2\.9\.3"
+macports_url="$( curl -s $macports_release_list_url | grep "\"https://github.com/macports/macports-base/releases/download/v$macports_version_pattern/MacPorts-$macports_version_pattern-$macos_major_version-[A-Za-z]*\.pkg\"" | sed 's/.*: "//;s/".*//' | head -1 )"
+echo "MacPorts package URL: $macports_url"
+
 cache_dmg="macports.hfs.dmg"
 
 if [ "$CIRRUS_CI" != "true" ]; then
-- 
2.45.2



Re: long-standing data loss bug in initial sync of logical replication

2024-07-15 Thread Amit Kapila
On Tue, Jul 16, 2024 at 12:48 AM Nitin Motiani  wrote:
>
> A couple of questions on the latest patch :
>
> 1. I see there is this logic in PublicationDropSchemas to first check
> if there is a valid entry for the schema in pg_publication_namespace
>
> psid = GetSysCacheOid2(PUBLICATIONNAMESPACEMAP,
>                        Anum_pg_publication_namespace_oid,
>                        ObjectIdGetDatum(schemaid),
>                        ObjectIdGetDatum(pubid));
> if (!OidIsValid(psid))
> {
>     if (missing_ok)
>         continue;
>
>     ereport(ERROR,
>             (errcode(ERRCODE_UNDEFINED_OBJECT),
>              errmsg("tables from schema \"%s\" are not part of the publication",
>                     get_namespace_name(schemaid))));
> }
>
> Your proposed change locks the schemaRels before this code block.
> Would it be better to lock the schemaRels after the error check? That
> way, if the publication no longer covers the schema, the lock is not
> held unnecessarily on all its tables.
>

Good point. It is better to lock the relations in
RemovePublicationSchemaById(), where we are also invalidating the
relcache. See the response to your next point as well.

> 2. The function publication_add_schema explicitly invalidates the cache
> by calling InvalidatePublicationRels(schemaRels). That is not present in
> the current PublicationDropSchemas code. Is that something that should
> also be added in the drop scenario? Please let me know if there is some
> context I'm missing about why this was not added originally for the drop
> scenario.
>

The required invalidation happens in the function
RemovePublicationSchemaById(). So, we should take the locks in
RemovePublicationSchemaById(), as that would avoid calling
GetSchemaPublicationRelations() multiple times.
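
For illustration only, here is a rough sketch (my sketch, not the actual
patch) of what taking the locks inside RemovePublicationSchemaById() could
look like, reusing the existing GetSchemaPublicationRelations(),
LockRelationOid() and InvalidatePublicationRels() helpers:

List       *schemaRels;
ListCell   *lc;

/* Collect all relations published via this schema, including partitions. */
schemaRels = GetSchemaPublicationRelations(schemaid, PUBLICATION_PART_ALL);

/* Lock each relation before invalidating its relcache entry. */
foreach(lc, schemaRels)
    LockRelationOid(lfirst_oid(lc), ShareRowExclusiveLock);

InvalidatePublicationRels(schemaRels);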

One related comment:
@@ -1219,8 +1219,14 @@ AlterPublicationTables(AlterPublicationStmt
*stmt, HeapTuple tup,
  oldrel = palloc(sizeof(PublicationRelInfo));
  oldrel->whereClause = NULL;
  oldrel->columns = NIL;
+
+ /*
+ * Data loss due to concurrency issues are avoided by locking
+ * the relation in ShareRowExclusiveLock as described atop
+ * OpenTableList.
+ */
  oldrel->relation = table_open(oldrelid,
-   ShareUpdateExclusiveLock);
+   ShareRowExclusiveLock);

Isn't it better to lock the required relations in RemovePublicationRelById()?

-- 
With Regards,
Amit Kapila.




Re: Injection points: preloading and runtime arguments

2024-07-15 Thread Michael Paquier
On Wed, Jul 10, 2024 at 01:16:15PM +0900, Michael Paquier wrote:
> You mean with something that does a injection_point_cache_get()
> followed by a callback run if anything is found in the local cache?
> Why not.  Based on what you have posted at [1], it looks like this had
> better check the contents of the cache's generation with what's in
> shmem, as well as destroying InjectionPointCache if there is nothing
> else, so there's a possible dependency here depending on how much
> maintenance this should do with the cache to keep it consistent.

Now that 86db52a5062a is in the tree, this could be done with a
shortcut in InjectionPointCacheRefresh().  What do you think about
something like the attached, with your suggested naming?
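
As an illustration of the intended use (a hypothetical sketch, not part of
the attached patch; the point name is invented), the pattern would be to
load the point while it is still safe to touch shared memory, then run it
from the local cache once inside the critical section:

/* Preload the injection point outside the critical section. */
INJECTION_POINT_LOAD("transaction-critical-step");

START_CRIT_SECTION();
/* ... work during which the shmem injection point array must not be touched ... */
INJECTION_POINT_CACHED("transaction-critical-step");
END_CRIT_SECTION();
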
--
Michael
From e33ab4202cf5e6ee79eef1927978d45e1cfa6034 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 16 Jul 2024 13:07:58 +0900
Subject: [PATCH] Add INJECTION_POINT_CACHED()

This works in combination with INJECTION_POINT_LOAD(), ensuring that an
injection point can be run without touching the shared memory area
related to injection points.
---
 src/include/utils/injection_point.h   |  3 ++
 src/backend/utils/misc/injection_point.c  | 33 ---
 .../expected/injection_points.out | 13 
 .../injection_points--1.0.sql | 10 ++
 .../injection_points/injection_points.c   | 14 
 .../injection_points/sql/injection_points.sql |  2 ++
 doc/src/sgml/xfunc.sgml   |  6 +++-
 7 files changed, 76 insertions(+), 5 deletions(-)

diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index bd3a62425c..a385e3df64 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -17,9 +17,11 @@
 #ifdef USE_INJECTION_POINTS
 #define INJECTION_POINT_LOAD(name) InjectionPointLoad(name)
 #define INJECTION_POINT(name) InjectionPointRun(name)
+#define INJECTION_POINT_CACHED(name) InjectionPointCached(name)
 #else
 #define INJECTION_POINT_LOAD(name) ((void) name)
 #define INJECTION_POINT(name) ((void) name)
+#define INJECTION_POINT_CACHED(name) ((void) name)
 #endif
 
 /*
@@ -38,6 +40,7 @@ extern void InjectionPointAttach(const char *name,
  int private_data_size);
 extern void InjectionPointLoad(const char *name);
 extern void InjectionPointRun(const char *name);
+extern void InjectionPointCached(const char *name);
 extern bool InjectionPointDetach(const char *name);
 
 #endif			/* INJECTION_POINT_H */
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index 84ad5e470d..9dd6575a9a 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -415,10 +415,11 @@ InjectionPointDetach(const char *name)
  * Common workhorse of InjectionPointRun() and InjectionPointLoad()
  *
  * Checks if an injection point exists in shared memory, and update
- * the local cache entry accordingly.
+ * the local cache entry accordingly.  If "direct" is true, return the
+ * point after looking it up only in the local cache.
  */
 static InjectionPointCacheEntry *
-InjectionPointCacheRefresh(const char *name)
+InjectionPointCacheRefresh(const char *name, bool direct)
 {
 	uint32		max_inuse;
 	int			namelen;
@@ -461,6 +462,13 @@ InjectionPointCacheRefresh(const char *name)
 		cached = NULL;
 	}
 
+	/*
+	 * If using a direct lookup, forget about the shared memory array
+	 * and return immediately with the data found in the cache.
+	 */
+	if (direct)
+		return cached;
+
 	/*
 	 * Search the shared memory array.
 	 *
@@ -531,7 +539,7 @@ void
 InjectionPointLoad(const char *name)
 {
 #ifdef USE_INJECTION_POINTS
-	InjectionPointCacheRefresh(name);
+	InjectionPointCacheRefresh(name, false);
 #else
 	elog(ERROR, "Injection points are not supported by this build");
 #endif
@@ -546,7 +554,24 @@ InjectionPointRun(const char *name)
 #ifdef USE_INJECTION_POINTS
 	InjectionPointCacheEntry *cache_entry;
 
-	cache_entry = InjectionPointCacheRefresh(name);
+	cache_entry = InjectionPointCacheRefresh(name, false);
+	if (cache_entry)
+		cache_entry->callback(name, cache_entry->private_data);
+#else
+	elog(ERROR, "Injection points are not supported by this build");
+#endif
+}
+
+/*
+ * Execute an injection point directly from the cache, if defined.
+ */
+void
+InjectionPointCached(const char *name)
+{
+#ifdef USE_INJECTION_POINTS
+	InjectionPointCacheEntry *cache_entry;
+
+	cache_entry = InjectionPointCacheRefresh(name, true);
 	if (cache_entry)
 		cache_entry->callback(name, cache_entry->private_data);
 #else
diff --git a/src/test/modules/injection_points/expected/injection_points.out b/src/test/modules/injection_points/expected/injection_points.out
index 2f60da900b..f25bbe4966 100644
--- a/src/test/modules/injection_points/expected/injection_points.out
+++ b/src/test/modules/injection_points/expected/injection_points.out
@@ -129,6 +129,12 @@ SELECT injection_points_detach('TestInjection

Re: CI, macports, darwin version problems

2024-07-15 Thread Thomas Munro
On Tue, Jul 16, 2024 at 3:19 PM Thomas Munro  wrote:
> I've attached a new version like that.  Let's see which runner machine
> gets it and how it turns out...

It failed[1] on pgx-m2-1: "Error: ShouldBeAtLeastOneLayer".  So I
temporarily removed that machine from the pool and clicked the re-run
button, and it failed[2] on jc-m2-1: "Error: The operation couldn’t be
completed. No space left on device" after a long period during which
it was presumably trying to download that image.  I could try this
experiment again if Joe could see a way to free up some disk space.
I've reenabled pgx-m2-1 for now.

[1] https://cirrus-ci.com/task/5127256689868800
[2] https://cirrus-ci.com/task/6446688024395776




Re: Restart pg_usleep when interrupted

2024-07-15 Thread Bertrand Drouvot
Hi,

On Mon, Jul 15, 2024 at 12:20:29PM -0500, Nathan Bossart wrote:
> On Fri, Jul 12, 2024 at 03:39:57PM -0500, Sami Imseih wrote:
> > but Bertrand found long drifts [2] which I could not reproduce.
> > To guard against the long drifts, continue to use the &remain time with an
> > additional safeguard to make sure the actual sleep does not exceed the
> > requested sleep time.
> 
> Bertrand, does this safeguard fix the long drifts you saw?

Yeah, it was the case with the first version using the safeguard (see [1]) and
it's also the case with the last one shared in [2].

[1]: 
https://www.postgresql.org/message-id/Zo0UdeE3i9d0Wt5E%40ip-10-97-1-34.eu-west-3.compute.internal
[2]: 
https://www.postgresql.org/message-id/C18017A1-EDFD-4F2F-BCA1-0572D4CCC92B%40gmail.com
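
For context, here is a minimal standalone sketch of the idea under
discussion (my illustration, not the proposed patch): restart nanosleep()
after EINTR, but recompute the remaining time from a fixed deadline so the
total sleep can never exceed the requested duration.

#include <errno.h>
#include <time.h>

static void
sleep_with_deadline(long microseconds)
{
    struct timespec deadline;
    struct timespec now;
    struct timespec delay;

    /* Compute an absolute deadline from the requested duration. */
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += microseconds / 1000000L;
    deadline.tv_nsec += (microseconds % 1000000L) * 1000L;
    if (deadline.tv_nsec >= 1000000000L)
    {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    for (;;)
    {
        clock_gettime(CLOCK_MONOTONIC, &now);
        if (now.tv_sec > deadline.tv_sec ||
            (now.tv_sec == deadline.tv_sec && now.tv_nsec >= deadline.tv_nsec))
            break;              /* requested duration has elapsed */

        delay.tv_sec = deadline.tv_sec - now.tv_sec;
        delay.tv_nsec = deadline.tv_nsec - now.tv_nsec;
        if (delay.tv_nsec < 0)
        {
            delay.tv_sec -= 1;
            delay.tv_nsec += 1000000000L;
        }

        if (nanosleep(&delay, NULL) == 0)
            break;              /* slept the full remaining time */
        if (errno != EINTR)
            break;              /* unexpected failure: give up */
        /* interrupted: loop and recompute the remaining time */
    }
}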
 

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: in parentesis is not usual on DOCs

2024-07-15 Thread Amit Langote
On Wed, May 22, 2024 at 8:22 PM jian he  wrote:
> On Wed, May 22, 2024 at 7:14 PM Peter Eisentraut  wrote:
> >
> > On 20.05.24 02:00, jian he wrote:
> > >> removing parentheses means we need to rephrase this sentence?
> > >> So I come up with the following rephrase:
> > >>
> > >> The context_item specifies the input data to query, the
> > >> path_expression is a JSON path expression defining the query,
> > >> json_path_name is an optional name for the path_expression. The
> > >> optional PASSING clause can provide data values to the
> > >> path_expression.
> > >
> > > Based on this, write a simple patch.
> >
> > Your patch kind of messes up the indentation of the text you are
> > changing.  Please check that.
>
> please check attached.

Sorry about not noticing this earlier.

Thanks for the patch and the reviews.  I've pushed it now after minor changes.

-- 
Thanks, Amit Langote




RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-07-15 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Hou,

Thanks for giving comments! PSA new versions.
What's new:

0001: included Hou's patch [1] not to overwrite slot options.
  Some other comments were also addressed.
0002: mostly unchanged, just rebased.
0003: Typo was fixed, s/Alter/After/.

[1]: 
https://www.postgresql.org/message-id/OS3PR01MB57184E0995521300AC06CB4B94A72%40OS3PR01MB5718.jpnprd01.prod.outlook.com
 

Best regards,
Hayato Kuroda
FUJITSU LIMITED



v18-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch
Description:  v18-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch


v18-0002-Alter-slot-option-two_phase-only-when-altering-t.patch
Description:  v18-0002-Alter-slot-option-two_phase-only-when-altering-t.patch


v18-0003-Notify-users-to-roll-back-prepared-transactions.patch
Description:  v18-0003-Notify-users-to-roll-back-prepared-transactions.patch


Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-07-15 Thread Ashutosh Sharma
Hi Robert.

On Mon, Jul 15, 2024 at 11:15 PM Robert Haas  wrote:
>
> On Mon, Jul 15, 2024 at 8:05 AM Ashutosh Sharma  wrote:
> > I've added these changes to restrict users from explicitly setting the
> > $extension_schema in the search_path. This ensures that
> > $extension_schema can only be set implicitly for functions created by
> > the extension when the "protected" flag is enabled.
>
> But ... why? I mean, what's the point of prohibiting that? In fact,
> maybe we should have *that* and forget about the protected flag in the
> control file.
>

Just to confirm, are you suggesting removing the protected flag and
setting the default search_path (as $extension_schema,) for all functions
within an extension that have no explicit search_path set? And, in
addition, allowing users to explicitly set $extension_schema as the
search_path, while bypassing resolution of $extension_schema for objects
outside the extension?

--
With Regards,
Ashutosh Sharma.




Re: long-standing data loss bug in initial sync of logical replication

2024-07-15 Thread Amit Kapila
On Tue, Jul 16, 2024 at 9:29 AM Amit Kapila  wrote:
>
> One related comment:
> @@ -1219,8 +1219,14 @@ AlterPublicationTables(AlterPublicationStmt
> *stmt, HeapTuple tup,
>   oldrel = palloc(sizeof(PublicationRelInfo));
>   oldrel->whereClause = NULL;
>   oldrel->columns = NIL;
> +
> + /*
> + * Data loss due to concurrency issues are avoided by locking
> + * the relation in ShareRowExclusiveLock as described atop
> + * OpenTableList.
> + */
>   oldrel->relation = table_open(oldrelid,
> -   ShareUpdateExclusiveLock);
> +   ShareRowExclusiveLock);
>
> Isn't it better to lock the required relations in RemovePublicationRelById()?
>

On my CentOS VM, the test file '100_bugs.pl' takes ~11s without the
patch and ~13.3s with the patch, so the newly added tests cost roughly
2 to 2.3s. It isn't worth adding this much extra time for one bug fix.
Can we combine the table and schema tests into a single test and drop
the inheritance table tests, as the code for those will mostly follow
the same path as for a regular table?

-- 
With Regards,
Amit Kapila.




[ pg_ctl ] Review Request for Adding Pre-check User Script Feature

2024-07-15 Thread 김명준
Hello,

I have been considering adding a user script that performs pre-checks
before executing the start, stop, and restart operations in pg_ctl. I
believe it is necessary for pg_ctl to support an extension that can prevent
various issues that might occur when using start and stop. To this end, I
have sought a way for users to define and use their own logic. The existing
behavior remains unchanged, and the feature can be used optionally when
needed.
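
To make the idea more concrete, here is a rough sketch of how such a hook
might look inside pg_ctl (purely hypothetical, not taken from the attached
patch; the helper name and the way the script path is supplied are invented
for illustration):

/* Hypothetical sketch only: run a user-supplied pre-check script and
 * refuse to proceed if it does not exit successfully. */
static void
run_precheck_script(const char *script_path, const char *action)
{
    char        cmd[MAXPGPATH * 2];

    if (script_path == NULL)
        return;                 /* no script configured: keep current behavior */

    snprintf(cmd, sizeof(cmd), "\"%s\" %s", script_path, action);
    if (system(cmd) != 0)
    {
        write_stderr(_("%s: pre-check script failed for \"%s\", aborting\n"),
                     progname, action);
        exit(1);
    }
}

/* e.g. run_precheck_script(precheck_script, "start") just before do_start() */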

The verification of the code was carried out using the methods described
below, and I would like to request additional opinions or feedback. Tests
were conducted using make check and through direct testing under various
scenarios. As this is my first contribution, there might be aspects I
missed or incorrectly designed.

I would appreciate it if you could review this.

Thank you.


Myoungjun Kim / South Korea


pg_ctl_user_script.patch
Description: Binary data


Re: [ pg_ctl ] Review Request for Adding Pre-check User Script Feature

2024-07-15 Thread Zaid Shabbir
Hello,

Can you briefly explain what issues you are trying to address with this patch?


On Tue, 16 Jul 2024 at 11:40 AM, 김명준  wrote:

> Hello,
>
> I have been considering adding a user script that performs pre-checks
> before executing the start, stop, and restart operations in pg_ctl. I
> believe it is necessary for pg_ctl to support an extension that can prevent
> various issues that might occur when using start and stop. To this end, I
> have sought a way for users to define and use their own logic. The existing
> behavior remains unchanged, and the feature can be used optionally when
> needed.
>
> The verification of the code was carried out using the methods described
> below, and I would like to request additional opinions or feedback. Tests
> were conducted using make check and through direct testing under various
> scenarios. As this is my first contribution, there might be aspects I
> missed or incorrectly designed.
>
> I would appreciate it if you could review this.
>
> Thank you.
>
>
> Myoungjun Kim / South Korea
>


Re: Direct SSL connection and ALPN loose ends

2024-07-15 Thread Michael Paquier
On Mon, Jun 24, 2024 at 11:30:53PM +0300, Heikki Linnakangas wrote:
> Given that, I think it is a good thing to fail the connection completely on
> receiving a V2 error.
> 
> Attached is a patch to fix the other issue, with falling back from SSL to
> plaintext. And some tests and comment fixes I spotted while at it.
> 
> 0001: A small comment fix

Already committed as of cc68ca6d420e.

> 0002: This is the main patch that fixes the SSL fallback issue

+ conn->failed_enc_methods |= conn->allowed_enc_methods & (~conn->current_enc_method);

Sounds reasonable to me.

It's a bit annoying to have to guess that current_enc_method is
tracking only one method at a time (aka these three fields are not
documented in libpq-int.h), while allowed_enc_methods and
failed_enc_methods are bitwise combinations of the methods that are
still allowed or that have already failed.

> 0003: This adds fault injection tests to exercise these early error
> codepaths. It is not ready to be merged, as it contains a hack to skip
> locking. See thread at
> https://www.postgresql.org/message-id/e1ffb822-054e-4006-ac06-50532767f75b%40iki.fi.

Locking when running an injection point has been replaced by some
atomics in 86db52a5062a.

+    if (IsInjectionPointAttached("backend-initialize-v2-error"))
+    {
+        FrontendProtocol = PG_PROTOCOL(2,0);
+        elog(FATAL, "protocol version 2 error triggered");
+    }

This is an attempt to do stack manipulation with an injection point
set.  FrontendProtocol is a global variable, so you could have a new
callback setting up this global variable directly, then FATAL (I
really don't mind if modules/injection_points finishes with a library
of callbacks).
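
A sketch of what such a callback could look like in the injection_points
module (my illustration of the suggestion, not an existing function):

/* Hypothetical callback: set the global, then error out, so the core
 * backend would not need IsInjectionPointAttached() at all. */
static void
injection_error_protocol_v2(const char *name, const void *private_data)
{
    FrontendProtocol = PG_PROTOCOL(2, 0);
    elog(FATAL, "protocol version 2 error triggered");
}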

I'm not sure I like this new IsInjectionPointAttached() much, though; it
does a search in the existing injection point pool.  This adds more code
footprint to the core backend, which I'm trying to minimize.  Not
everybody agrees with this view, I'd guess, which is also fine.

> 0004: More tests, for what happens if the server sends an error after
> responding "yes" to the SSLRequest or GSSRequest, but before performing the
> SSL/GSS handshake.

No objections to these two additions.

> Attached is also a little stand-alone perl program that listens on a socket,
> and when you connect to it, it immediately sends a V2 or V3 error, depending
> on the argument. That's useful for testing. It could be used as an
> alternative strategy to the injection points I used in the 0003-0004
> patches, but for now I just used it for manual testing.

Nice toy.
--
Michael


signature.asc
Description: PGP signature