Re: A problem in deconstruct_distribute_oj_quals

2023-02-07 Thread Richard Guo
On Tue, Feb 7, 2023 at 2:12 PM Tom Lane  wrote:

> Richard Guo  writes:
> > I noticed this code because I came across a problem with a query as
> > below.
>
> > create table t (a int);
>
> > select t1.a from (t t1 left join t t2 on true) left join (t t3 left join
> t
> > t4 on t3.a = t4.a) on t2.a = t3.a;
>
> > When we deal with qual 't2.a = t3.a', deconstruct_distribute_oj_quals
> > would always add the OJ relid of t3/t4 into its required_relids, due to
> > the code above, which I think is wrong.  The direct consequence is that
> > we would miss the plan that joins t2 and t3 directly.
>
> I don't see any change in this query plan when I remove that code, so
> I'm not sure you're explaining your point very well.


Sorry I didn't make myself clear.  The plan change may not be obvious
except when the cheapest path happens to be joining t2 and t3 first and
then joining with t4 afterwards.  Currently HEAD would not generate such
a path because the joinqual of t2/t3 always has the OJ relid of t3/t4 in
its required_relids.

To observe an obvious plan change, we can add a unique constraint on 'a'
and look at how outer-join removal works.

alter table t add unique (a);

-- with that code
# explain (costs off)
select t1.a from (t t1 left join t t2 on true) left join (t t3 left join t
t4 on t3.a = t4.a) on t2.a = t3.a;
                      QUERY PLAN
------------------------------------------------------
 Nested Loop Left Join
   ->  Seq Scan on t t1
   ->  Nested Loop Left Join
         ->  Seq Scan on t t2
         ->  Index Only Scan using t_a_key on t t3
               Index Cond: (a = t2.a)
(6 rows)


-- without that code
# explain (costs off)
select t1.a from (t t1 left join t t2 on true) left join (t t3 left join t
t4 on t3.a = t4.a) on t2.a = t3.a;
          QUERY PLAN
------------------------------
 Nested Loop Left Join
   ->  Seq Scan on t t1
   ->  Materialize
         ->  Seq Scan on t t2
(4 rows)

This is another side-effect of that code.  The joinqual of t2/t3 is
treated as being pushed down when we try to remove t2/t3, because its
required_relids, which incorrectly include the OJ relid of t3/t4,
exceed the scope of the join.  This is not right.

Thanks
Richard


Re: make_ctags: use -I option to ignore pg_node_attr macro

2023-02-07 Thread Tatsuo Ishii
>> Since this commit, make_etags has started failing to generate
>> tags files with the following error messages, on my MacOS.
>> 
>> $ src/tools/make_etags
>> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ctags:
>> illegal option -- e
>> usage: ctags [-BFadtuwvx] [-f tagsfile] file ...
>> sort: No such file or directory
>> 
>> 
>> In my MacOS, non-Exuberant ctags is installed and doesn't support
>> -e option. But the commit changed make_etags so that it always
>> calls ctags with -e option via make_ctags. This seems the cause of
>> the above failure.
>> 
>> IS_EXUBERANT=""
>> ctags --version 2>&1 | grep Exuberant && IS_EXUBERANT="Y"
>> 
>> make_ctags has the above code and seems to support non-Exuberant
>> ctags.
>> If so, we should revert the changes of make_etags by the commit and
>> make make_etags work with that ctags? Or, we should support
>> only Exuberant-type ctags (btw, I'm ok with this) and get rid of
>> something like the above code?
> 
> Thanks for the report. I will look into this.

Previous make_etags relied on etags command:

#!/bin/sh

# src/tools/make_etags

command -v etags >/dev/null || \
{ echo "'etags' program not found" 1>&2; exit 1; }
:
:

My Mac (M1 Mac running macOS 12.6) does not have etags. Thus before
the commit make_etags on Mac failed anyway. Do we want make_etags to
restore the previous behavior? i.e.  'etags' program not found

>> If so, we should revert the changes of make_etags by the commit and
>> make make_etags work with that ctags?

I think ctags on Mac cannot produce tags file for emacs.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Regarding TPCC benchmarking of postgresql for streaming

2023-02-07 Thread chandan kunal
Hi All,

I want to do TPC-C benchmarking of PostgreSQL with streaming data.
I am guessing the COPY command can be used for this purpose, or is there
any other option for this?

Can someone point me towards the best option for doing this?

Regards,
Chandan


Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-07 Thread Amit Kapila
On Tue, Feb 7, 2023 at 8:22 AM Hayato Kuroda (Fujitsu)
 wrote:
>
> Thank you for reviewing! PSA new version.
>

Few comments:
=
1.
@@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId)
BKI_SHARED_RELATION BKI_ROW

  Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */

+ int32 subminapplydelay; /* Replication apply delay (ms) */
+
  bool subenabled; /* True if the subscription is enabled (the
  * worker should be running) */

@@ -120,6 +122,7 @@ typedef struct Subscription
  * in */
  XLogRecPtr skiplsn; /* All changes finished at this LSN are
  * skipped */
+ int32 minapplydelay; /* Replication apply delay (ms) */
  char*name; /* Name of the subscription */
  Oid owner; /* Oid of the subscription owner */

Why is the new parameter placed at different locations in the above two
structures? I think it should be after owner in both cases, and
accordingly its order should be changed in GetSubscription() or any
other place it is used.

2. A minor comment change suggestion:
 /*
  * Common spoolfile processing.
  *
- * The commit/prepare time (finish_ts) for streamed transactions is required
- * for time-delayed logical replication.
+ * The commit/prepare time (finish_ts) is required for time-delayed logical
+ * replication.
  */

3. I find the newly added tests take about 8s on my machine, which is
close to the highest in the subscription folder. I understand that it can't
be less than 3s because of the delay, but checking multiple cases makes
it take that long. I think we can skip the tests for streaming and for
disabling the subscription. Also, after removing those, I think it would
be better to add the remaining test to 001_rep_changes to save set-up
and tear-down costs as well.

4.
+$node_publisher->append_conf('postgresql.conf',
+ 'logical_decoding_work_mem = 64kB');

I think this setting is also not required.

-- 
With Regards,
Amit Kapila.




Re: make_ctags: use -I option to ignore pg_node_attr macro

2023-02-07 Thread Yugo NAGATA
On Tue, 07 Feb 2023 17:19:37 +0900 (JST)
Tatsuo Ishii  wrote:

> >> Since this commit, make_etags has started failing to generate
> >> tags files with the following error messages, on my MacOS.
> >> 
> >> $ src/tools/make_etags
> >> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ctags:
> >> illegal option -- e
> >> usage: ctags [-BFadtuwvx] [-f tagsfile] file ...
> >> sort: No such file or directory
> >> 
> >> 
> >> In my MacOS, non-Exuberant ctags is installed and doesn't support
> >> -e option. But the commit changed make_etags so that it always
> >> calls ctags with -e option via make_ctags. This seems the cause of
> >> the above failure.
> >> 
> >> IS_EXUBERANT=""
> >> ctags --version 2>&1 | grep Exuberant && IS_EXUBERANT="Y"
> >> 
> >> make_ctags has the above code and seems to support non-Exuberant
> >> ctags.
> >> If so, we should revert the changes of make_etags by the commit and
> >> make make_etags work with that ctags? Or, we should support
> >> only Exuberant-type ctags (btw, I'm ok with this) and get rid of
> >> something like the above code?
> > 
> > Thanks for the report. I will look into this.
> 
> Previous make_etags relied on etags command:
> 
> #!/bin/sh
> 
> # src/tools/make_etags
> 
> command -v etags >/dev/null || \
>   { echo "'etags' program not found" 1>&2; exit 1; }
> :
> :
> 
> My Mac (M1 Mac running macOS 12.6) does not have etags. Thus before
> the commit make_etags on Mac failed anyway. Do we want make_etags to
> restore the previous behavior? i.e.  'etags' program not found
> 
> >> If so, we should revert the changes of make_etags by the commit and
> >> make make_etags work with that ctags?
> 
> I think ctags on Mac cannot produce tags file for emacs.

Does it make sense to change make_etags as the attached patch does?
This allows make_etags to fall back to etags if Exuberant-type ctags is
not available, so users can use make_etags if they have either
Exuberant-type ctags or etags.

Regards,
Yugo Nagata

> 
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS LLC
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp


-- 
Yugo NAGATA 
diff --git a/src/tools/make_etags b/src/tools/make_etags
index afc57e3e89..d0cfdb23d8 100755
--- a/src/tools/make_etags
+++ b/src/tools/make_etags
@@ -1,6 +1,25 @@
 #!/bin/sh
 # src/tools/make_etags
 
-cdir=`dirname $0`
-dir=`(cd $cdir && pwd)`
-exec $dir/make_ctags -e $*
+IS_EXUBERANT=""
+command -v ctags >/dev/null && \
+ctags --version 2>&1 | grep Exuberant && IS_EXUBERANT="Y"
+
+if [ "$IS_EXUBERANT" ]; then
+cdir=`dirname $0`
+dir=`(cd $cdir && pwd)`
+exec $dir/make_ctags -e $*
+else
+command -v etags >/dev/null || \
+{ echo "'etags' program not found" 1>&2; exit 1; }
+
+rm -f ./TAGS
+
+find `pwd`/ -type f -name '*.[chyl]' -print |
+xargs etags --append -o TAGS
+
+find . \( -name CVS -prune \) -o \( -name .git -prune \) -o -type d -print |
+while read DIR
+do[ "$DIR" != "." ] && ln -f -s `pwd`/TAGS "$DIR"
+done
+fi


Re: Pluggable toaster

2023-02-07 Thread Aleksander Alekseev
Hi,

> I believe that is a misrepresentation of the situation. ZSON had
> (has?) several systemic issues and could not be accepted to /contrib/
> in the way it was implemented, and it was commented that it would make
> sense that the feature of compression assisted by dictionaries would
> be implemented in core. Yet still, that feature is only closely
> related to pluggable TOAST, but it is not the same as making TOAST
> pluggable.
>
> > * The community wants the feature to have a simple implementation. You
> > said yourself that the idea of type-aware TOASTers is very invasive,
> > and I completely agree.
>
> I'm not sure that this is correct either. Compression (and TOAST) is
> inherently complex, and I don't think we should reject improvements
> because they are complex.
> The problem that I see being raised here, is that there was little
> discussion and no observed community consensus about the design of
> this complex feature *before* this patch with high complexity was
> provided.

Strictly speaking there is no such thing as "community opinion". There
are different people, and everyone has their own opinion. To make things
more interesting, the opinions change over time.

I did my best to make a brief summary of 100+ messages from different
people in something like 4 threads. These are things that were
requested and/or no one disagrees with (at least no one said "no, put
all this out of the core! and make it complicated too!"). Focusing on
something (almost) no one disagrees with seems to be more productive
than arguing about something everyone disagrees with.

As I see it, the goal is not to be right, but rather to find a
consensus most of us will not be unhappy with.

> The next action that was requested is to take a step back and decide
> how we would want to implement type-aware TOASTing (and the associated
> patch compression dictionaries) *before* we look into the type-aware
> toasting.

Yes, I thought we already agreed to forget about type-aware TOASTing
and compression dictionaries, and are looking for a consensus now.

To clarify, I don't think that pluggable TOASTing is an absolutely bad
idea. We are just not discussing this particular idea anymore, at
least for now.

> > * People also want this to be simple from the user perspective, as
> > simple as just CREATE COMPRESSED TABLE ... [USING lz4|zstd];
>
> Could you provide a reference to this? I have yet to see a COMPRESSED
> TABLE feature or syntax, let alone users asking for TOAST to be as
> easily usable as that feature or syntax.

I was referring to the recent discussion of the new RFC. Please see
[1] and below.

[1]: 
https://www.postgresql.org/message-id/flat/20230203095540.zutul5vmsbmantbm%40alvherre.pgsql#7cce6acef0cb7eb2490715ec9d835e74

-- 
Best regards,
Aleksander Alekseev




Re: Add a test case related to the error "cannot fetch toast data without an active snapshot"

2023-02-07 Thread Nitin Jadhav
> if a procedure fetches a toasted value into a local variable, commits,
> and then tries to detoast the value.

I spent some time and tried to reproduce this error by using the
queries in [1], but the error did not occur. I am not sure whether I
followed what is mentioned in the above comment. Please correct me if I
am wrong.

[1]:
CREATE TABLE toasted(id serial primary key, data text);
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,
':') FROM generate_series(1, 1000)));
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,
':') FROM generate_series(1, 1000)));

DO $$
DECLARE v_r record;
DECLARE vref_cursor REFCURSOR;
BEGIN
OPEN vref_cursor FOR SELECT data FROM toasted;
LOOP
fetch vref_cursor into v_r;
INSERT INTO toasted(data) VALUES(v_r.data);
COMMIT;
END LOOP;
END;$$;


Thanks & Regards,
Nitin Jadhav

On Fri, Jan 27, 2023 at 6:26 PM Nitin Jadhav
 wrote:
>
> Hi,
>
> I was going through the comments [1] mentioned in
> init_toast_snapshot() and based on the comments understood that the
> error "cannot fetch toast data without an active snapshot" will occur
> if a procedure fetches a toasted value into a local variable, commits,
> and then tries to detoast the value. I would like to know the sample
> query which causes such behaviour. I checked the test cases. Looks
> like such a case is not present in the regression suit. It is better
> to add one.
>
>
> [1]:
> /*
>  * GetOldestSnapshot returns NULL if the session has no active snapshots.
>  * We can get that if, for example, a procedure fetches a toasted value
>  * into a local variable, commits, and then tries to detoast the value.
>  * Such coding is unsafe, because once we commit there is nothing to
>  * prevent the toast data from being deleted.  Detoasting *must* happen in
>  * the same transaction that originally fetched the toast pointer.  Hence,
>  * rather than trying to band-aid over the problem, throw an error.  (This
>  * is not very much protection, because in many scenarios the procedure
>  * would have already created a new transaction snapshot, preventing us
>  * from detecting the problem.  But it's better than nothing, and for sure
>  * we shouldn't expend code on masking the problem more.)
>  */
>
> Thanks & Regards,
> Nitin Jadhav




Too coarse predicate locks granularity for B+ tree indexes

2023-02-07 Thread Rinat Shigapov
Hi,

TL;DR: this email describes a serialization failure that happens (as I
understand it) due to too-coarse predicate lock granularity on the primary
key index.

I have a concurrent testsuite that runs 14 test cases. Each test case
operates on a disjoint set of records, doesn't retry transactions and is
run under 'serializable' isolation level. The test data is small and likely
fits within a single tuple page.

When I finished the test suite I was surprised that PostgreSQL 14.5 returned
a serialization failure on every test suite run. I was even more surprised
when I tested the suite against the current CockroachDB and didn't get
serialization failures. Actually, I was able to reproduce RETRY_SERIALIZABLE
errors a couple of times on CockroachDB, but it required me to run the test
suite in a loop for more than half an hour.

I started to investigate the test behavior with PostgreSQL using more
simplified and reduced code, and found a serialization failure between two
concurrent `update_user` operations.

The test defines the following `Users` table:

CREATE TABLE Users (
> id UUID,
> title VARCHAR(255),
> first_name VARCHAR(40),
> last_name VARCHAR(80) NOT NULL,
> email VARCHAR(255) NOT NULL,
> lower_email VARCHAR(255) GENERATED ALWAYS AS (lower(email)) STORED,
> marketing_optin BOOLEAN,
> mobile_phone VARCHAR(50),
> phone VARCHAR(50),
> phone_ext VARCHAR(40),
> is_contact BOOLEAN DEFAULT false NOT NULL,
> unlinked_link_ids UUID[],


> CONSTRAINT unique_user_email UNIQUE(lower_email),
> PRIMARY KEY (id)
> );


Concurrent `update_user` operations run the UPDATE query to change the user
email to a unique value:

UPDATE Users
> SET
> title = CASE WHEN false= true THEN 'foo' ELSE title END,
> first_name = CASE WHEN false= true THEN 'foo' ELSE first_name END,
> last_name = CASE WHEN false= true THEN 'foo' ELSE last_name END,
> email = CASE WHEN true = true THEN 'email2' ELSE email END,
> marketing_optin = CASE WHEN false = true THEN true ELSE
> marketing_optin END,
> mobile_phone = CASE WHEN false = true THEN 'foo' ELSE mobile_phone END,
> phone = CASE WHEN false = true THEN 'foo' ELSE phone END,
> phone_ext = CASE WHEN false = true THEN 'foo' ELSE phone_ext END
> WHERE id = '018629fd-7b28-743c-8647-b6321c166d46';
>

I use the following helper view to monitor locks:

> CREATE VIEW locks_v AS
> SELECT pid,
> virtualtransaction,
>locktype,
>CASE locktype
>  WHEN 'relation' THEN relation::regclass::text
>  WHEN 'virtualxid' THEN virtualxid::text
>  WHEN 'transactionid' THEN transactionid::text
>  WHEN 'tuple' THEN
> relation::regclass::text||':'||page::text||':'||tuple::text
>  WHEN 'page' THEN relation::regclass::text||':'||page::text
>END AS lockid,
>mode,
>granted
> FROM pg_locks;


When the test Users table has only a few records, the query uses a
sequential scan and the serialization failure is reproducible without
inserting sleeps before the `update_user` transaction commits.

This is caused by relation-level predicate locks on the Users table:

> select * from locks_v;
>  pid  | virtualtransaction |   locktype    |      lockid       |       mode       | granted
> ------+--------------------+---------------+-------------------+------------------+---------
>  3676 | 5/2444             | relation      | unique_user_email | RowExclusiveLock | t
>  3676 | 5/2444             | relation      | users_pkey        | RowExclusiveLock | t
>  3676 | 5/2444             | relation      | users             | RowExclusiveLock | t
>  3676 | 5/2444             | virtualxid    | 5/2444            | ExclusiveLock    | t
>  3737 | 4/13470            | relation      | pg_locks          | AccessShareLock  | t
>  3737 | 4/13470            | relation      | locks_v           | AccessShareLock  | t
>  3737 | 4/13470            | virtualxid    | 4/13470           | ExclusiveLock    | t
>  3669 | 3/17334            | relation      | unique_user_email | RowExclusiveLock | t
>  3669 | 3/17334            | relation      | users_pkey        | RowExclusiveLock | t
>  3669 | 3/17334            | relation      | users             | RowExclusiveLock | t
>  3669 | 3/17334            | virtualxid    | 3/17334           | ExclusiveLock    | t
>  3676 | 5/2444             | transactionid | 6571              | ExclusiveLock    | t
>  3669 | 3/17334            | transactionid | 6570              | ExclusiveLock    | t
>  3676 | 5/2444             | relation      | users             | SIReadLock       | t
>  3669 | 3/17334            | relation      | users             | SIReadLock       | t
> (15 rows)
>
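
A quick way to see at which granularity the SIRead (predicate) locks are
being taken is to group pg_locks by lock type. A minimal sketch in the same
spirit as the locks_v view above (add a relation filter as needed):

SELECT relation::regclass AS rel,
       locktype,          -- 'relation', 'page' or 'tuple' for predicate locks
       count(*) AS n_locks
FROM pg_locks
WHERE mode = 'SIReadLock'
GROUP BY 1, 2
ORDER BY 1, 2;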

If I add ballast data to the Users table (1000 records), the cost optimizer
switches to an index scan and it's hard to reproduce the issue for two
concurrent `update_user` operations without sleeps. After adding long
sleeps after the UPDATE query and before commit I could see page-level
predic

Re: MERGE ... WHEN NOT MATCHED BY SOURCE

2023-02-07 Thread Dean Rasheed
On Sat, 21 Jan 2023 at 14:18, Ted Yu  wrote:
>
> On Sat, Jan 21, 2023 at 3:05 AM Dean Rasheed  wrote:
>>
>> Rebased version, following 8eba3e3f02 and 5d29d525ff.
>>

Another rebased version attached.

> In transform_MERGE_to_join :
>
> +   if (action->matchKind == 
> MERGE_WHEN_NOT_MATCHED_BY_SOURCE)
> +   tgt_only_tuples = true;
> +   if (action->matchKind == 
> MERGE_WHEN_NOT_MATCHED_BY_TARGET)
>
> There should be an `else` in front of the second `if`.
> When tgt_only_tuples and src_only_tuples are both true, we can come out of 
> the loop.
>

I decided not to do that. Adding an "else" doesn't change the code
that the compiler generates, and IMO it's slightly more readable
without it, since it keeps the line length shorter, and the test
conditions aligned, but that's a matter of opinion / personal
preference.

I think adding extra logic to exit the loop early if both
tgt_only_tuples and src_only_tuples are true would be a premature
optimisation, increasing the code size for no real benefit. In
practice, there are unlikely to be more than a few merge actions in
the list.
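
For anyone skimming the thread, here is a minimal sketch of the syntax the
patch adds (the table and column names are made up for illustration):

MERGE INTO target t
USING source s ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET val = s.val
WHEN NOT MATCHED BY SOURCE THEN    -- target row with no matching source row
    DELETE
WHEN NOT MATCHED BY TARGET THEN    -- source row with no matching target row
    INSERT (id, val) VALUES (s.id, s.val);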

Regards,
Dean
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
new file mode 100644
index b87ad5c..1482ede
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -396,8 +396,8 @@
 originally matched appears later in the list of actions.
 On the other hand, if the row is concurrently updated or deleted so
 that the join condition fails, then MERGE will
-evaluate the condition's NOT MATCHED actions next,
-and execute the first one that succeeds.
+evaluate the condition's NOT MATCHED [BY TARGET]
+actions next, and execute the first one that succeeds.
 If MERGE attempts an INSERT
 and a unique index is present and a duplicate row is concurrently
 inserted, then a uniqueness violation error is raised;
diff --git a/doc/src/sgml/ref/merge.sgml b/doc/src/sgml/ref/merge.sgml
new file mode 100644
index 0995fe0..8ef121a
--- a/doc/src/sgml/ref/merge.sgml
+++ b/doc/src/sgml/ref/merge.sgml
@@ -33,7 +33,8 @@ USING dat
 and when_clause is:
 
 { WHEN MATCHED [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
-  WHEN NOT MATCHED [ AND condition ] THEN { merge_insert | DO NOTHING } }
+  WHEN NOT MATCHED BY SOURCE [ AND condition ] THEN { merge_update | merge_delete | DO NOTHING } |
+  WHEN NOT MATCHED [BY TARGET] [ AND condition ] THEN { merge_insert | DO NOTHING } }
 
 and merge_insert is:
 
@@ -70,7 +71,9 @@ DELETE
from data_source to
target_table_name
producing zero or more candidate change rows.  For each candidate change
-   row, the status of MATCHED or NOT MATCHED
+   row, the status of MATCHED,
+   NOT MATCHED BY SOURCE,
+   or NOT MATCHED [BY TARGET]
is set just once, after which WHEN clauses are evaluated
in the order specified.  For each candidate change row, the first clause to
evaluate as true is executed.  No more than one WHEN
@@ -226,16 +229,37 @@ DELETE
   At least one WHEN clause is required.
  
  
+  The WHEN clause may specify WHEN MATCHED,
+  WHEN NOT MATCHED BY SOURCE, or
+  WHEN NOT MATCHED [BY TARGET].
+  Note that the SQL standard only defines
+  WHEN MATCHED and WHEN NOT MATCHED
+  (which is defined to mean no matching target row).
+  WHEN NOT MATCHED BY SOURCE is an extension to the
+  SQL standard, as is the option to append
+  BY TARGET to WHEN NOT MATCHED, to
+  make its meaning more explicit.
+ 
+ 
   If the WHEN clause specifies WHEN MATCHED
-  and the candidate change row matches a row in the
+  and the candidate change row matches a row in the source to a row in the
   target_table_name,
   the WHEN clause is executed if the
   condition is
   absent or it evaluates to true.
  
  
-  Conversely, if the WHEN clause specifies
-  WHEN NOT MATCHED
+  If the WHEN clause specifies
+  WHEN NOT MATCHED BY SOURCE and the candidate change
+  row represents a row in the
+  target_table_name that does
+  not match a source row, the WHEN clause is executed
+  if the condition is
+  absent or it evaluates to true.
+ 
+ 
+  If the WHEN clause specifies
+  WHEN NOT MATCHED [BY TARGET]
   and the candidate change row does not match a row in the
   target_table_name,
   the WHEN clause is executed if the
@@ -257,7 +281,10 @@ DELETE
  
   A condition on a WHEN MATCHED clause can refer to columns
   in both the source and the target relations. A condition on a
-  WHEN NOT MATCHED clause can only refer to columns from
+  WHEN NOT MATCHED BY SOURCE clause can only refer to
+  columns from the target relation, since by definition there is no matching
+  source row. A condition on a WHEN NOT MATCHED [BY TARGET]
+  clause can only refer to columns from
   the source relation, since by defi

Re: Pluggable toaster

2023-02-07 Thread Pavel Borisov
Hi, hackers!

Maybe I've read the thread too superficially, but to me it seems
like more of a discussion of what TOAST should NOT be. Maybe someone
more familiar with the topic could explain what the consensus is on what
we require and what we would like to have in a new TOAST?

For me, a good TOAST should be chunk-updatable, so that we don't need
to rewrite the whole TOAST value and WAL-replicate the whole thing on
every small attribute modification. But obviously, that's just another
opinion.

Kind regards,
Pavel Borisov




Re: Improve WALRead() to suck data directly from WAL buffers when possible

2023-02-07 Thread Dilip Kumar
On Mon, Dec 26, 2022 at 2:20 PM Bharath Rupireddy
 wrote:
>

I have gone through this patch, and I have some comments (mostly cosmetic,
and about code comments).

1.
+ /*
+ * We have found the WAL buffer page holding the given LSN. Read from a
+ * pointer to the right offset within the page.
+ */
+ memcpy(page, (XLogCtl->pages + idx * (Size) XLOG_BLCKSZ),
+(Size) XLOG_BLCKSZ);


From the above comments, it appears that we are reading from the exact
pointer we are interested in, but actually we are reading
the complete page.  I think this comment needs to be fixed, and we can
also explain why we read the complete page here.

2.
+static char *
+GetXLogBufferForRead(XLogRecPtr ptr, TimeLineID tli, char *page)
+{
+ XLogRecPtr expectedEndPtr;
+ XLogRecPtr endptr;
+ int idx;
+ char*recptr = NULL;

Generally, we use the name 'recptr' to represent an XLogRecPtr-type
variable, but in your case it is actually the data at that recptr, so
better to use some other name like 'buf' or 'buffer'.


3.
+ if ((recptr + nbytes) <= (page + XLOG_BLCKSZ))
+ {
+ /* All the bytes are in one page. */
+ memcpy(dst, recptr, nbytes);
+ dst += nbytes;
+ *read_bytes += nbytes;
+ ptr += nbytes;
+ nbytes = 0;
+ }
+ else if ((recptr + nbytes) > (page + XLOG_BLCKSZ))
+ {
+ /* All the bytes are not in one page. */
+ Size bytes_remaining;

Why do you have this 'else if ((recptr + nbytes) > (page +
XLOG_BLCKSZ))' check in the else part? Why is it not simply an 'else'
without a condition?

4.
+XLogReadFromBuffers(XLogRecPtr startptr,
+ TimeLineID tli,
+ Size count,
+ char *buf,
+ Size *read_bytes)

I think we do not need two separate variables 'count' and '*read_bytes';
just one variable for input/output is sufficient.  The original value
can always be stored in some temp variable
instead of the function argument.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] Compression dictionaries for JSONB

2023-02-07 Thread Alvaro Herrera
On 2023-Feb-05, Aleksander Alekseev wrote:

> Since PostgreSQL is not a specified document-oriented DBMS I think we
> better focus our (far from being infinite) resources on something more
> people would benefit from: AIO/DIO [1] or perhaps getting rid of
> freezing [2], to name a few examples.

For what it's worth -- one of the reasons Postgres is successful, at
least in my opinion, is that each developer does more or less what they
see fit (or what their employer sees fit), without following any sort of
grand plan or roadmap.  This has allowed us to expand in many directions
simultaneously.  There's a group working on AIO; others are interested
in improving partitioning, or logical replication, adding new SQL
features, and so on.  I don't think we should stop thinking about TOAST
(or more precisely JSON compression) just because we want to have all
these other things.  Not being a document database didn't stop us from
adding JSON many years back and JSONB stuff later.  When we did, it was
an enormous enabler of new use cases.

Everyone, from customers of large Postgres support companies, to those
of small or one-man Postgres support shops, to individual users doing
stuff on their free time, benefits from everything that happens in the
Postgres development group.  Let's keep improving Postgres for everyone.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La espina, desde que nace, ya pincha" (Proverbio africano)




Re: MERGE ... RETURNING

2023-02-07 Thread Dean Rasheed
On Mon, 23 Jan 2023 at 16:54, Dean Rasheed  wrote:
>
> On Sun, 22 Jan 2023 at 19:08, Alvaro Herrera  wrote:
> >
> > Regarding mas_action_idx, I would have thought that it belongs in
> > MergeAction rather than MergeActionState.  After all, you determine it
> > once at parse time, and it is a constant from there onwards, right?
>
> Oh, yes that makes sense (and removes the need for a couple of the
> executor changes). Thanks for looking.
>

Attached is a more complete patch, with that change and more doc and
test updates.
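
For reference, a minimal sketch of what the patch allows, reusing the
products/new_products example from the docs; pg_merge_action() and
pg_merge_when_clause() are the merge support functions the patch adds:

MERGE INTO products p USING new_products n ON p.product_no = n.product_no
  WHEN NOT MATCHED THEN INSERT VALUES (n.product_no, n.name, n.price)
  WHEN MATCHED THEN UPDATE SET name = n.name, price = n.price
  RETURNING pg_merge_action(),       -- 'INSERT', 'UPDATE' or 'DELETE'
            pg_merge_when_clause(),  -- 1-based index of the executed WHEN clause
            p.*;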

A couple of the latest changes in this patch look like they represent
pre-existing issues with MERGE that should really be extracted from
this patch and applied to HEAD+v15. I'll take a closer look at that,
and start new threads for those.

Regards,
Dean
diff --git a/doc/src/sgml/dml.sgml b/doc/src/sgml/dml.sgml
new file mode 100644
index cbbc5e2..ff2a827
--- a/doc/src/sgml/dml.sgml
+++ b/doc/src/sgml/dml.sgml
@@ -283,10 +283,15 @@ DELETE FROM products;
RETURNING
   
 
+  
+   MERGE
+   RETURNING
+  
+
   
Sometimes it is useful to obtain data from modified rows while they are
being manipulated.  The INSERT, UPDATE,
-   and DELETE commands all have an
+   DELETE, and MERGE commands all have an
optional RETURNING clause that supports this.  Use
of RETURNING avoids performing an extra database query to
collect the data, and is especially valuable when it would otherwise be
@@ -339,6 +344,24 @@ DELETE FROM products
 
   
 
+  
+   In a MERGE, the data available to RETURNING is
+   the content of the source row plus the content of the inserted, updated, or
+   deleted target row.  In addition, the merge support functions
+and 
+   may be used to return information about which merge action was executed for
+   each row.  Since it is quite common for the source and target to have many
+   of the same columns, specifying RETURNING * can lead to a
+   lot of duplicated columns, so it is often more useful to just return the
+   target row.  For example:
+
+MERGE INTO products p USING new_products n ON p.product_no = n.product_no
+  WHEN NOT MATCHED THEN INSERT VALUES (n.product_no, n.name, n.price)
+  WHEN MATCHED THEN UPDATE SET name = n.name, price = n.price
+  RETURNING pg_merge_action(), p.*;
+
+  
+
   
If there are triggers () on the target table,
the data available to RETURNING is the row as modified by
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index e09e289..6927228
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -21202,6 +21202,99 @@ SELECT count(*) FROM sometable;
 
  
 
+ 
+  Merge Support Functions
+
+  
+   MERGE
+   RETURNING
+  
+
+  
+   The merge support functions shown in
+may be used in the
+   RETURNING list of a  command
+   to return additional information about the action taken for each row.
+  
+
+  
+Merge Support Functions
+
+
+ 
+  
+   
+Function
+   
+   
+Description
+   
+  
+ 
+
+ 
+  
+   
+
+ pg_merge_action
+
+pg_merge_action ( )
+text
+   
+   
+Returns the action performed on the current row ('INSERT',
+'UPDATE', or 'DELETE').
+   
+  
+
+  
+   
+
+ pg_merge_when_clause
+
+pg_merge_when_clause ( )
+integer
+   
+   
+Returns the 1-based index of the WHEN clause executed
+for the current row.
+   
+  
+ 
+
+  
+
+  
+   Example:
+
+  
+
+  
+   Note that it is an error to use these functions anywhere other than the
+   RETURNING list of a MERGE command.
+  
+
+ 
+
  
   Subquery Expressions
 
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
new file mode 100644
index 7c01a54..4317cfc
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -1323,9 +1323,9 @@
  to a client upon the
  completion of an SQL command, usually a
  SELECT but it can be an
- INSERT, UPDATE, or
- DELETE command if the RETURNING
- clause is specified.
+ INSERT, UPDATE,
+ DELETE, or MERGE command if the
+ RETURNING clause is specified.
 
 
  The fact that a result set is a relation means that a query can be used
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
new file mode 100644
index 8897a54..3249e9c
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1020,8 +1020,8 @@ INSERT INTO mytable VALUES (1,'one'), (2
 
 
 
- If the command does return rows (for example SELECT,
- or INSERT/UPDATE/DELETE
+ If the command does return rows (for example SELECT, or
+ INSERT/UPDATE/DELETE/MERGE
  with RETURNING), there are two ways to proceed.
  When the command will return at most one row, or you only care about
  the first row of output, write the command as usual but add
@@ -1044,7 +1044,8 @@ INSERT INTO mytable VALUES (1,'one'), (2
 

Re: Too coarse predicate locks granularity for B+ tree indexes

2023-02-07 Thread Thomas Munro
On Tue, Feb 7, 2023 at 11:24 PM Rinat Shigapov  wrote:
> Does the current PostgreSQL release support B+ tree index predicate locks 
> more granular then page-level locks?

No.  I tried to follow some breadcrumbs left by Kevin and Dan that
should allow unique index scans that find a match to skip the btree
page lock, though, and p-lock just the heap tuple.  If you like
half-baked experimental code, see the v4-0002 patch in this thread,
where I took some shortcuts (jamming stuff that should be in the
planner down into the executor) for a proof-of-concept:

https://www.postgresql.org/message-id/flat/CAEepm%3D2GK3FVdnt5V3d%2Bh9njWipCv_fNL%3DwjxyUhzsF%3D0PcbNg%40mail.gmail.com

With that approach, if it *doesn't* find a match, then you're back to
having to p-lock the whole index page to represent the "gap", so that
you can conflict with anyone who tries to insert a matching value
later.  I believe the next-key approach would allow for finer grained
gap-locks (haven't studied that myself), but that's a secondary
problem; the primary problem (it seems to me) is getting rid of index
locks completely in the (common?) case that you have a qualifying
match.




Re: Assertion failure in SnapBuildInitialSnapshot()

2023-02-07 Thread Amit Kapila
On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada  wrote:
>
> Attached updated patches.
>

In back-branch patches, the change is as below:
+ *
+ * NB: the caller must hold ProcArrayLock in an exclusive mode regardless of
+ * already_locked which is unused now but kept for ABI compatibility.
  */
 void
 ProcArraySetReplicationSlotXmin(TransactionId xmin, TransactionId catalog_xmin,
  bool already_locked)
 {
- Assert(!already_locked || LWLockHeldByMe(ProcArrayLock));
-
- if (!already_locked)
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ Assert(LWLockHeldByMeInMode(ProcArrayLock, LW_EXCLUSIVE));

This change looks odd to me. I think it would be better to pass
'already_locked' as true from the caller.

-- 
With Regards,
Amit Kapila.




Hi i am Intrested to contribute

2023-02-07 Thread Shivam Ardeshna
Respected Sir/Mam

  I am Shivam Ardeshna, a Computer Science undergraduate (2nd
year). I was looking for something to contribute to, and then I saw pgsql and
wanted to contribute to it. Even if I don't know some of the languages, I will
try to learn and solve issues, so can you allow me to do something?
I am hoping to hear from you soon.
Regards
Shivam


Missing TAG for FEB (current) Minor Version Release

2023-02-07 Thread sujit.rat...@fujitsu.com
Hi OSS Community,
We just wanted to confirm when the tag will be created for the current February
minor release, as we could not find the tag for any of the minor versions.
Below is a screenshot for some of the minor versions.

[screenshot showing some of the minor versions]
Could you please confirm when we can expect the tags to be created for all minor
versions?

Thanks and Regards,
Sujit Rathod
Sr. Application Developer
FUJITSU CONSULTING INDIA
Mobile: +91 9730487531



Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Andrew Dunstan


On 2023-02-06 Mo 23:43, Noah Misch wrote:



Well, we did talk about adding a pre-commit hook to the repository, with
instructions for how to enable it. And I don't see a problem with adding the
pre-receive we're discussing here to src/tools/something.

Yeah.  I don't think we are seriously considering putting any restrictions
in place on gitmaster

I could have sworn that was exactly what we were discussing, a pre-receive
hook on gitmaster.



That's one idea that's been put forward, but it seems clear that some 
people are nervous about it.


Maybe a better course would be to continue improving the toolset and get 
more people comfortable with using it locally and then talk about 
integrating it upstream.



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Hi i am Intrested to contribute

2023-02-07 Thread Pavel Borisov
Hi, Shivam!

> Respected Sir/Mam
>
>   I am Shivam Ardeshna, A Computer Science Undergraduate 2nd 
> Year. I was looking for a contribution and then I saw pgsql and wanted to 
> contribute to it even if I don't know about some languages I will try to 
> learn and solve the issues so can you allow me to do something?
> I am hoping to hear from you soon.

You may find the guide on how to contribute [1] useful. You can freely
choose what you want to work on (from the list of TODOs linked there, or
anything else); no permission from anyone is necessary.
The downside is that it's not easy to tell what is useful at
first, so I'd recommend first joining the review of existing patches
on the commitfest page [2] and/or trying to do some bugfixes from the
pgsql-bugs mailing list. Then, over time, you can gather context
and choose more and more complicated things.

I wish you success, and I hope you enjoy this activity!

Kind regards,
Pavel Borisov

[1] https://wiki.postgresql.org/wiki/So,_you_want_to_be_a_developer%3F
[2] https://commitfest.postgresql.org/42/




Re: Understanding years part of Interval

2023-02-07 Thread Marcos Pegoraro
>
> The years are converted to months and the fractional month is rounded half
>> up:
>>
>> 1.05 year = 12.6 month
>> => 1 year 0.6 month
>> => 1 year 1 month(after rounding)
>>
>> Compare that to 12.5 months to see when the rounding occurs:
>>
>> 12.5 month / 12 month
>> => 1.0416... years
>>
>> Plug 1.0416 and 1.0417 into the interval to observe the rounding:
>>
>> =# select '1.0416 year'::interval, '1.0417 year'::interval;
>>  interval |   interval
>> --+--
>>  1 year   | 1 year 1 mon
>>
>> I understood what you explained, but cannot agree that it's correct.
> Run these and you'll see the first and second select are fine, the third
> ... why ?
>
> select distinct current_date + ((random()::numeric) * '1 year'::interval)
> from generate_series(1,100) order by 1;
> select distinct current_date + ((random()::numeric) * '12
> month'::interval) from generate_series(1,100) order by 1;
> select distinct current_date + ((random()::numeric) || 'year')::interval
> from generate_series(1,100) order by 1;
>
> So, I have to think ... never use fractional parts on years, right ?
>

It only needs to be written differently, if somebody has to work with fractional parts of years.

This way works
select distinct (random()::numeric) * ('1 year'::interval) from
generate_series(1,100) order by 1;

This way doesn't
select distinct ((random()::numeric) || 'year')::interval from
generate_series(1,100) order by 1;
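
As far as I can tell, the difference is that multiplying an interval by a
number is carried out with day/microsecond precision, while a fractional
number of years in an interval literal is resolved to whole months, as
explained above. A small illustration (the exact output depends on the
rounding described earlier):

select 0.37 * '1 year'::interval as multiplied,  -- keeps a days/time part
       '0.37 year'::interval     as parsed;      -- whole months only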


Re: Too coarse predicate locks granularity for B+ tree indexes

2023-02-07 Thread Rinat Shigapov
Thomas, thank you for the details!

Have you kept the branch that you used to generate the patch? Which commit
should the patch apply to?

With kindest regards, Rinat Shigapov


вт, 7 февр. 2023 г. в 17:11, Thomas Munro :

> On Tue, Feb 7, 2023 at 11:24 PM Rinat Shigapov 
> wrote:
> > Does the current PostgreSQL release support B+ tree index predicate
> locks more granular then page-level locks?
>
> No.  I tried to follow some breadcrumbs left by Kevin and Dan that
> should allow unique index scans that find a match to skip the btree
> page lock, though, and p-lock just the heap tuple.  If you like
> half-baked experimental code, see the v4-0002 patch in this thread,
> where I took some shortcuts (jamming stuff that should be in the
> planner down into the executor) for a proof-of-concept:
>
>
> https://www.postgresql.org/message-id/flat/CAEepm%3D2GK3FVdnt5V3d%2Bh9njWipCv_fNL%3DwjxyUhzsF%3D0PcbNg%40mail.gmail.com
>
> With that approach, if it *doesn't* find a match, then you're back to
> having to p-lock the whole index page to represent the "gap", so that
> you can conflict with anyone who tries to insert a matching value
> later.  I believe the next-key approach would allow for finer grained
> gap-locks (haven't studied that myself), but that's a secondary
> problem; the primary problem (it seems to me) is getting rid of index
> locks completely in the (common?) case that you have a qualifying
> match.
>


Re: Support logical replication of DDLs

2023-02-07 Thread vignesh C
On Mon, 6 Feb 2023 at 17:02, vignesh C  wrote:
>
> On Mon, 6 Feb 2023 at 06:47, Peter Smith  wrote:
> >
> > Here are some comments for patch v63-0002.
> >
> > This is a WIP because I have not yet looked at the large file - 
> > ddl_deparse.c.
> >
> > ==
> > Commit Message
> >
> > 1.
> > This patch provides JSON blobs representing DDL commands, which can
> > later be re-processed into plain strings by well-defined sprintf-like
> > expansion. These JSON objects are intended to allow for machine-editing of
> > the commands, by replacing certain nodes within the objects.
> >
> > ~
> >
> > "This patch provides JSON blobs" --> "This patch constructs JSON blobs"
> >
> > ==
> > src/backend/commands/ddl_json.
>
> Modified
>

I found a few issues while testing:
Issue 1: core dump
Steps to reproduce:
CREATE TABLE lock_tbl1 (a BIGINT);
CREATE TABLE lock_tbl1a (a BIGINT);
CREATE VIEW lock_view1 AS SELECT * FROM lock_tbl1;
CREATE VIEW lock_view2(a,b) AS SELECT * FROM lock_tbl1, lock_tbl1a;
CREATE VIEW lock_view3 AS SELECT * from lock_view2;
CREATE OR REPLACE VIEW lock_view2 AS SELECT * from lock_view3;

Stack trace for the same:
#5  0x5573000128ac in ExceptionalCondition
(conditionName=0x5573001830a3 "IsA(stmt, AlterTableStmt)",
fileName=0x5573001821de "ddl_deparse.c", lineNumber=2840) at
assert.c:66
#6  0x5572ffa8ddef in deparse_AlterRelation (cmd=0x557301038ec8)
at ddl_deparse.c:2840
#7  0x5572ffaa1895 in deparse_utility_command (cmd=0x557301038ec8,
verbose_mode=false) at ddl_deparse.c:9820
#8  0x5572ffd6daeb in publication_deparse_ddl_command_end
(fcinfo=0x7fff3eba50b0) at ddltrigger.c:203
#9  0x5572ffaa7f87 in EventTriggerInvoke
(fn_oid_list=0x557301033d80, trigdata=0x7fff3eba5110) at
event_trigger.c:1047
#10 0x5572ffaa7769 in EventTriggerDDLCommandEnd
(parsetree=0x557300f5b548) at event_trigger.c:719
#11 0x5572ffe33a22 in ProcessUtilitySlow (pstate=0x5573010458b8,
pstmt=0x557300f5b618, queryString=0x557300f5a7c8 "CREATE OR REPLACE
VIEW lock_view2 AS SELECT * from lock_view3;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x557300f5b8d8, qc=0x7fff3eba5910) at utility.c:1933

Issue 2: unsupported object type error
Steps to reproduce:
create table t1(c1 int);
ALTER TABLE t1 ADD CONSTRAINT onek_check_constraint CHECK (c1 >= 0);
ALTER TABLE t1 RENAME CONSTRAINT onek_check_constraint TO
onek_check_constraint_foo;

Issue 3: object name not found error
Steps to reproduce:
create type xfloat4;
create function xfloat4in(cstring) returns xfloat4 immutable strict
  language internal as 'int4in';
create function xfloat4out(xfloat4) returns cstring immutable strict
  language internal as 'int4out';
CREATE TYPE xfloat4 (
   internallength = 16,
   input = xfloat4in,
   output = xfloat4out,
   element = int4,
   category = 'x',   -- just to verify the system will take it
   preferred = true  -- ditto
);

Issue 4: unsupported alter table subtype 18
Steps to reproduce:
create type comptype as (r float8, i float8);
create domain dcomptype as comptype;
alter domain dcomptype add constraint c1 check ((value).r > 0);
alter type comptype alter attribute r type bigint;

Issue 5: unsupported object type 13
Steps to reproduce:
create domain testdomain1 as int constraint unsigned check (value > 0);
alter domain testdomain1 rename constraint unsigned to unsigned_foo;

Issue 6: invalid ObjTree element type
Steps to reproduce:
create extension file_fdw;
CREATE FOREIGN DATA WRAPPER foo;
alter foreign data wrapper foo HANDLER file_fdw_handler;
WARNING:  changing the foreign-data wrapper handler can change
behavior of existing foreign tables
ERROR:  invalid ObjTree element type 1693984336

Issue 7: no owned sequence found
Steps to reproduce:
CREATE TABLE itest13 (a int);
ALTER TABLE itest13 ADD COLUMN b int GENERATED BY DEFAULT AS IDENTITY;

Issue 8: could not find tuple for constraint 0
Steps to reproduce:
create table p1(f1 int);
create table p1_c1() inherits(p1);
alter table p1 add constraint inh_check_constraint1 check (f1 > 0);
alter table p1_c1 add constraint inh_check_constraint1 check (f1 > 0);

Issue 9: unsupported object type 38
Steps to reproduce:
CREATE SUBSCRIPTION regress_testsub CONNECTION
'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect =
false);
COMMENT ON SUBSCRIPTION regress_testsub IS 'test subscription';

Regards,
Vignesh




Re: OpenSSL 3.0.0 vs old branches

2023-02-07 Thread Andrew Dunstan


On 2023-02-07 Tu 02:18, Peter Eisentraut wrote:

On 06.02.23 16:56, Andrew Dunstan wrote:
I recently moved crake to a new machine running Fedora 36, which has 
OpenSSL 3.0.0. This causes the SSL tests to fail on branches earlier 
than release 13, so I propose to backpatch commit f0d2c65f17 to the 
release 11 and 12 branches.


This is not the only patch that we did to support OpenSSL 3.0.0. There 
was a very lengthy discussion that resulted in various patches.  
Unless we have a complete analysis of what was done and how it affects 
various branches, I would not do this.  Notably, we did actually 
consider what to backpatch, and the current state is the result of 
that.  So let's not throw that away without considering that 
carefully.  Even if it gets it to compile, I personally would not 
*trust* it without that analysis.  I think we should just leave it 
alone and consider OpenSSL 3.0.0 unsupported in the branches were it 
is now unsupported.  OpenSSL 1.1.1 is still supported upstream to 
serve those releases.



The only thing this commit does is replace a DES encrypted key file with 
one encrypted with AES-256. It doesn't affect compilation at all, and 
shouldn't affect tests run with 1.1.1.


I guess the alternatives are a) disable the SSL tests on branches <= 12 
or b) completely disable building with SSL for branches <= 12. I would 
probably opt for a). I bet this crops up a few more times as OpenSSL 
3.0.0 becomes more widespread, until release 12 goes EOL.



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Missing TAG for FEB (current) Minor Version Release

2023-02-07 Thread Alvaro Herrera
On 2023-Feb-07, sujit.rat...@fujitsu.com wrote:

> Hi OSS Community,
> We just wanted to confirm when the tag will be created for the current February
> minor release, as we could not find the tag for any of the minor versions.
> Below is a screenshot for some of the minor versions.

Yes, it will be created sometime this week, most likely today.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: make_ctags: use -I option to ignore pg_node_attr macro

2023-02-07 Thread Tatsuo Ishii
> Does it make sense to change make_etags as the attached patch does?
> This allows make_etags to fall back to etags if Exuberant-type ctags is
> not available, so users can use make_etags if they have either
> Exuberant-type ctags or etags.

The patch drops support for "-n" option :-<

Attached is the patch by fixing make_ctags (make_etags is not
touched).

If Exuberant-type ctags is available, use it (not changed).
  If Exuberant-type ctags is not available, try old ctags (not changed).
If the old ctags does not support "-e" option, try etags (new).
  If etags is not available, give up (new).

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
diff --git a/src/tools/make_ctags b/src/tools/make_ctags
index 102881667b..8fb9991db0 100755
--- a/src/tools/make_ctags
+++ b/src/tools/make_ctags
@@ -12,12 +12,15 @@ fi
 MODE=
 NO_SYMLINK=
 TAGS_FILE="tags"
+ETAGS_EXISTS=
+PROG=ctags
 
 while [ $# -gt 0 ]
 do
 	if [ $1 = "-e" ]
 	then	MODE="-e"
 		TAGS_FILE="TAGS"
+		command -v etags >/dev/null && ETAGS_EXISTS="Y"
 	elif [ $1 = "-n" ]
 	then	NO_SYMLINK="Y"
 	else
@@ -27,8 +30,10 @@ do
 	shift
 done
 
-command -v ctags >/dev/null || \
+if test -z "$MODE"
+then	(command -v ctags >/dev/null) || \
 	{ echo "'ctags' program not found" 1>&2; exit 1; }
+fi
 
 trap "ret=$?; rm -rf /tmp/$$; exit $ret" 0 1 2 3 15
 rm -f ./$TAGS_FILE
@@ -36,6 +41,20 @@ rm -f ./$TAGS_FILE
 IS_EXUBERANT=""
 ctags --version 2>&1 | grep Exuberant && IS_EXUBERANT="Y"
 
+if test -z "$IS_EXUBERANT"
+then
+	if test -n "$MODE"
+	then	ctags --version 2>&1 | grep -- -e >/dev/null
+		if [ $? != 0 ]
+		then
+			if [ -z $ETAGS_EXISTS ]
+			then	echo "'ctags' does not support emacs mode and etags does not exist" 1>&2; exit 1
+			else	PROG=etags
+			fi
+		fi
+	fi
+fi
+
 # List of kinds supported by Exuberant Ctags 5.8
 # generated by ctags --list-kinds
 # --c-kinds was called --c-types before 2003
@@ -65,11 +84,16 @@ then	IGNORE_IDENTIFIES="-I pg_node_attr+"
 else	IGNORE_IDENTIFIES=
 fi
 
+if [ $PROG = "ctags" ]
+then	TAGS_OPT="-a -f"
+else	TAGS_OPT="-a -o"
+fi
+
 # this is outputting the tags into the file 'tags', and appending
 find `pwd`/ \( -name tmp_install -prune -o -name tmp_check -prune \) \
 	-o \( -name "*.[chly]" -o -iname "*makefile*" -o -name "*.mk" -o -name "*.in" \
 	-o -name "*.sql" -o -name "*.p[lm]" \) -type f -print |
-	xargs ctags $MODE -a -f $TAGS_FILE "$FLAGS" "$IGNORE_IDENTIFIES"
+	xargs $PROG $MODE $TAGS_OPT $TAGS_FILE "$FLAGS" "$IGNORE_IDENTIFIES"
 
 # Exuberant tags has a header that we cannot sort in with the other entries
 # so we skip the sort step


Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Amit Kapila
On Tue, Feb 7, 2023 at 5:16 PM Andrew Dunstan  wrote:
>
> On 2023-02-06 Mo 23:43, Noah Misch wrote:
>
>
> Well, we did talk about adding a pre-commit hook to the repository, with
> instructions for how to enable it. And I don't see a problem with adding the
> pre-receive we're discussing here to src/tools/something.
>
> Yeah.  I don't think we are seriously considering putting any restrictions
> in place on gitmaster
>
> I could have sworn that was exactly what we were discussing, a pre-receive
> hook on gitmaster.
>
>
> That's one idea that's been put forward, but it seems clear that some people 
> are nervous about it.
>
> Maybe a better course would be to continue improving the toolset and get more 
> people comfortable with using it locally and then talk about integrating it 
> upstream.
>

Yeah, that sounds more reasonable to me as well.

-- 
With Regards,
Amit Kapila.




Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Magnus Hagander
On Tue, Feb 7, 2023 at 1:56 PM Amit Kapila  wrote:

> On Tue, Feb 7, 2023 at 5:16 PM Andrew Dunstan  wrote:
> >
> > On 2023-02-06 Mo 23:43, Noah Misch wrote:
> >
> >
> > Well, we did talk about adding a pre-commit hook to the repository, with
> > instructions for how to enable it. And I don't see a problem with adding
> the
> > pre-receive we're discussing here to src/tools/something.
> >
> > Yeah.  I don't think we are seriously considering putting any
> restrictions
> > in place on gitmaster
> >
> > I could have sworn that was exactly what we were discussing, a
> pre-receive
> > hook on gitmaster.
> >
> >
> > That's one idea that's been put forward, but it seems clear that some
> people are nervous about it.
> >
> > Maybe a better course would be to continue improving the toolset and get
> more people comfortable with using it locally and then talk about
> integrating it upstream.
> >
>
> Yeah, that sounds more reasonable to me as well.
>

If we wanted something "in between" we could perhaps also have an async CI
job that runs after each commit and sends an email to the committer if the
commit doesn't match up, instead of rejecting it hard, while still getting
some relatively fast feedback.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Allow auto_explain to log plan duration and buffer usage

2023-02-07 Thread torikoshia

Hi,

-- commit 9d2d9728b8d546434aade4f9667a59666588edd6

Author: Michael Paquier 
Date:   Thu Jan 26 12:23:16 2023 +0900

Make auto_explain print the query identifier in verbose mode
..(snip)..
While looking at the area, I have noticed that more consolidation
between EXPLAIN and auto_explain would be in order for the logging of
the plan duration and the buffer usage.  This refactoring is left as a
future change.


I'm working on this now.
Attached is a PoC patch which enables auto_explain to log the plan duration
and buffer usage of the planning phase.

The last 3 lines below are added by this patch:

 ```
=# set auto_explain.log_min_duration = 0;
=# set auto_explain.log_verbose = on;
=# set auto_explain.log_analyze = on;
=# select * from pg_class;

LOG:  0: duration: 6.774 ms  plan:
Query Text: select * from pg_class;
Seq Scan on pg_catalog.pg_class  (cost=0.00..18.12 rows=412 
width=273) (actual time=0.009..0.231 rows=412 loops=1)
  Output: oid, relname, relnamespace, reltype, reloftype, 
relowner, relam, relfilenode, reltablespace, relpages, reltuples, 
relallvisible, reltoastrelid, relhasindex, relisshared, relpersistence, 
relkind, relnatts, relchecks, relhasrules, relhastriggers, 
relhassubclass, relrowsecurity, relforcerowsecurity, relispopulated, 
relreplident, relispartition, relrewrite, relfrozenxid, relminmxid, 
relacl, reloptions, relpartbound

  Buffers: shared hit=14
Query Identifier: 8034096446570639838
Planning
  Buffers: shared hit=120
Planning Time: 3.908 ms
 ```

It adds a planner hook to track the plan duration and buffer usage for 
planning.

I'm considering the following points and any comments are welcome:

- Plan duration and buffer usage are saved on PlannedStmt. Judging from
totaltime in QueryDesc, adding elements for extensions is not always
prohibited, but I'm wondering whether it's OK to add them in this case.
- Just as pg_stat_statements made it possible to add planner information
in v13, it may be useful for auto_explain to log planning-phase
information, especially the plan duration. However, I am not sure to what
extent information about the buffers used in the planning phase would be
useful.
- Plan duration and buffer usage may differ from the output of the EXPLAIN
command, since the EXPLAIN command includes pg_plan_query() but the planner
hook doesn't.

pg_plan_query() does things for log_planner_stats, debugging and tracing.
- (Future work) Log output concerning buffers should be toggled on/off
by auto_explain.log_buffers. Log output concerning planning should be
toggled on/off by a new GUC, something like auto_explain.track_planning.



What do you think?

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATIONFrom a0997f139cd964ca0368fe3650e5ad1ce1f5e16e Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Tue, 7 Feb 2023 19:49:45 +0900
Subject: [PATCH v1] [PoC] Enable auto_explain to track planning time and
 buffer usage

---
 contrib/auto_explain/auto_explain.c   | 82 ++-
 src/backend/commands/explain.c| 36 ++--
 src/backend/nodes/gen_node_support.pl |  4 ++
 src/backend/optimizer/plan/planner.c  |  2 +
 src/include/commands/explain.h|  4 +-
 src/include/nodes/plannodes.h |  6 ++
 6 files changed, 114 insertions(+), 20 deletions(-)

diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index c3ac27ae99..ea5c2949d0 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -20,6 +20,7 @@
 #include "executor/instrument.h"
 #include "jit/jit.h"
 #include "nodes/params.h"
+#include "optimizer/planner.h"
 #include "utils/guc.h"
 
 PG_MODULE_MAGIC;
@@ -64,20 +65,28 @@ static const struct config_enum_entry loglevel_options[] = {
 /* Current nesting depth of ExecutorRun calls */
 static int	nesting_level = 0;
 
+/* Current nesting depth of planner calls */
+static int	plan_nesting_level = 0;
+
 /* Is the current top-level query to be sampled? */
 static bool current_query_sampled = false;
 
 #define auto_explain_enabled() \
 	(auto_explain_log_min_duration >= 0 && \
-	 (nesting_level == 0 || auto_explain_log_nested_statements) && \
+	 ((nesting_level == 0 || plan_nesting_level == 0) || \
+	   auto_explain_log_nested_statements) && \
 	 current_query_sampled)
 
 /* Saved hook values in case of unload */
+static planner_hook_type prev_planner_hook = NULL;
 static ExecutorStart_hook_type prev_ExecutorStart = NULL;
 static ExecutorRun_hook_type prev_ExecutorRun = NULL;
 static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
 static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
 
+static PlannedStmt *explain_Planner(Query *parse, const char *query_string,
+int cursorOptions,
+ ParamListInfo boundParams);
 static void explain_ExecutorStart(QueryDesc *queryDesc, int eflags);
 static void explain_ExecutorRun(QueryDesc *queryDesc,
 ScanDirection direction,
@@ -245,6 +254,8 @@ _PG_init(void)
 	Mar

Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Andrew Dunstan


On 2023-02-06 Mo 09:40, Robert Haas wrote:

2. I'd like an easy way to indent the unstaged files in the current
directory (e.g. pgindent --dirty) or the files that have been queued
up for commit (e.g. pgindent --cached).



My git-fu is probably not all that it should be. I think we could 
possibly get at this list of files by running


  git status --porcelain --untracked-files=no --ignored=no -- .

And then your --dirty list would be lines beginning with ' M' while your 
--cached list would be lines beginning with 'A[ M]'


Does that seem plausible?


cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Andrew Dunstan


On 2023-02-07 Tu 07:59, Magnus Hagander wrote:



On Tue, Feb 7, 2023 at 1:56 PM Amit Kapila  
wrote:


On Tue, Feb 7, 2023 at 5:16 PM Andrew Dunstan
 wrote:
>
> On 2023-02-06 Mo 23:43, Noah Misch wrote:
>
>
> Well, we did talk about adding a pre-commit hook to the
repository, with
> instructions for how to enable it. And I don't see a problem
with adding the
> pre-receive we're discussing here to src/tools/something.
>
> Yeah.  I don't think we are seriously considering putting any
restrictions
> in place on gitmaster
>
> I could have sworn that was exactly what we were discussing, a
pre-receive
> hook on gitmaster.
>
>
> That's one idea that's been put forward, but it seems clear that
some people are nervous about it.
>
> Maybe a better course would be to continue improving the toolset
and get more people comfortable with using it locally and then
talk about integrating it upstream.
>

Yeah, that sounds more reasonable to me as well.


If we wanted something "in between" we could perhaps also have an async 
CI job that runs after each commit and sends an email to the committer 
if the commit doesn't match up, instead of rejecting it hard but still 
getting some relatively fast feedback.



Sure, worth trying. We can always turn it off and no harm done if it 
doesn't suit. I'd probably start by having it email a couple of guinea 
pigs like you and me before turning it loose on committers generally. 
LMK if you need help with it.



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: make_ctags: use -I option to ignore pg_node_attr macro

2023-02-07 Thread Yugo NAGATA
On Tue, 07 Feb 2023 21:29:04 +0900 (JST)
Tatsuo Ishii  wrote:

> > Does is make sense to change make_etags as the attached patch does?
> > This allows make_etags to use etags if Exuberant-type ctags is not
> > available. This allows users to use make_etags if hey has either
> > Exuberant-type ctags or etags.
> 
> The patch drops support for "-n" option :-<
> 
> Attached is the patch by fixing make_ctags (make_etags is not
> touched).
> 
> If Exuberant-type ctags is available, use it (not changed).
>   If Exuberant-type ctags is not available, try old ctags (not changed).
> If the old ctags does not support "-e" option, try etags (new).

I am not sure if this is a good way to check whether ctags supports "-e" or not.

+   thenctags --version 2>&1 | grep -- -e >/dev/null

Perhaps "--help" was intended rather than "--version" to check the supported
options? Even so, ctags may have other options whose names contain "-e"
besides the Emacs support, so this check could end up with a wrong result.
Therefore, it seems better to check directly whether etags is available in the
case where we don't have an Exuberant-type ctags.

Regards,
Yugo Nagata

>   If etags is not available, give up (new).
> 
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS LLC
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp


-- 
Yugo NAGATA 




Re: [PATCH] Compression dictionaries for JSONB

2023-02-07 Thread Aleksander Alekseev
Hi,

> > The complexity of page-level compression is significant, as pages are
> > currently a base primitive of our persistency and consistency scheme.
>
> +many
>
> It's also not all a panacea performance-wise, datum-level decompression can
> often be deferred much longer than page level decompression. For things like
> json[b], you'd hopefully normally have some "pre-filtering" based on proper
> columns, before you need to dig into the json datum.

This is actually a good point.

> It's also not necessarily that good, compression ratio wise. Particularly for
> wider datums you're not going to be able to remove much duplication, because
> there's only a handful of tuples. Consider the case of json keys - the
> dictionary will often do better than page level compression, because it'll
> have the common keys in the dictionary, which means the "full" keys never will
> have to appear on a page, whereas page-level compression will have the keys on
> it, at least once.

To clarify, what I meant was applying an idea of compression with
shared dictionaries to the pages instead of tuples. Just to make sure
we are on the same page.

> Page-level compression can not compress patterns that have a length of
> more than 1 page. TOAST is often used to store values larger than 8kB,
> which we'd prefer to compress to the greatest extent possible. So, a
> value-level compression method specialized to the type of the value
> does make a lot of sense, too.

Let's not forget that a TOAST table is a table too. Page-level
compression applies to it as well as to a regular one.

> Of course you can use a dictionary for page-level compression too, but the
> gains when it works well will often be limited, because in most OLTP usable
> page-compression schemes I'm aware of, you can't compress a page all that far
> down, because you need a small number of possible "compressed page sizes".

That's true. However, compressing an 8 KB page to, let's say, 1 KB is
not a bad result either.

In any case, there seems to be advantages and disadvantages of either
approach. Personally I don't care that much which one to choose. In
fact, although my own patch proposed attribute-level compression, not
tuple-level one, it is arguably closer to tuple-level approach than
page-level one. So to a certain extent I would be contradicting myself
by trying to prove that page-level compression is the way to go. Also
Matthias has a reasonable concern that page-level compression may have
implications for the WAL size. (Maybe it will not but I'm not ready to
prove it right now, nor am I convinced this is necessarily true.)

So, let's focus on tuple-level compression then.

> > > More similar data you compress the more space and disk I/O you save.
> > > Additionally you don't have to compress/decompress the data every time
> > > you access it. Everything that's in shared buffers is uncompressed.
> > > Not to mention the fact that you don't care what's in pg_attribute,
> > > the fact that schema may change, etc. There is a table and a
> > > dictionary for this table that you refresh from time to time. Very
> > > simple.
> >
> > You cannot "just" refresh a dictionary used once to compress an
> > object, because you need it to decompress the object too.
>
> Right. That's what I was trying to refer to when mentioning that we might need
> to add a bit of additional information to the varlena header for datums
> compressed with a dictionary.

> [...]
> and when you have many - updating an existing dictionary requires
> going through all objects compressed with it in the whole database.
> It's a very tricky question how to implement this feature correctly.

Yep, that's one of the challenges.

One approach would be to extend the existing dictionary. I'm not sure
whether ZSTD / LZ4 support this; they probably don't. In any case this is a
sub-optimal approach because the dictionary will grow indefinitely.

We could create a dictionary once per table and forbid modifying it.
Users would then have to re-create and refill a table manually, using
`INSERT INTO .. SELECT ..`, whenever they want to update the dictionary.
Although this is a possible solution, I don't think this is what Andres
meant above by being invisible to the user. It would also mean that the
new dictionary has to be trained on the old table before creating the new
one with the new dictionary, which is awkward.

This is why we need something like dictionary versions. A dictionary
can't be erased as long as there is data that uses this version of a
dictionary. The old data should be decompressed and compressed again
with the most recent dictionary, e.g. during VACUUM or perhaps VACUUM
FULL. This is an idea I ended up using in ZSON.
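
Just to illustrate what I mean by versions (purely a sketch, not from any
patch; the field names are made up), the compressed datum would need to carry
enough information to find the dictionary version it was compressed with:

```c
/*
 * Illustrative sketch only: a compressed datum that records which
 * dictionary version was used, so old versions can be kept around
 * until no datum references them (e.g. until VACUUM recompresses).
 */
typedef struct DictCompressedDatum
{
	int32	vl_len_;		/* varlena header, set via SET_VARSIZE() */
	Oid		dictid;			/* which dictionary this datum belongs to */
	uint32	dictversion;	/* dictionary version used for compression */
	char	data[FLEXIBLE_ARRAY_MEMBER];	/* compressed payload */
} DictCompressedDatum;
```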

There may be alternative solutions, but I'm not aware of any. (There are
JSON Schema, Protobuf etc, but they don't work for general-purpose
compression algorithms and/or arbitrary data types.)

> Let's keep improving Postgres for everyone.

Amen.

-- 
Best regards,
Aleksander Alekseev




RE: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-07 Thread Takamichi Osumi (Fujitsu)
Hi,


On Tuesday, February 7, 2023 6:56 PM Amit Kapila  
wrote:
> On Tue, Feb 7, 2023 at 8:22 AM Hayato Kuroda (Fujitsu)
>  wrote:
> >
> > Thank you for reviewing! PSA new version.
> >
> 
> Few comments:
> =
Thanks for your comments !

> 1.
> @@ -74,6 +74,8 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId)
> BKI_SHARED_RELATION BKI_ROW
> 
>   Oid subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
> 
> + int32 subminapplydelay; /* Replication apply delay (ms) */
> +
>   bool subenabled; /* True if the subscription is enabled (the
>   * worker should be running) */
> 
> @@ -120,6 +122,7 @@ typedef struct Subscription
>   * in */
>   XLogRecPtr skiplsn; /* All changes finished at this LSN are
>   * skipped */
> + int32 minapplydelay; /* Replication apply delay (ms) */
>   char*name; /* Name of the subscription */
>   Oid owner; /* Oid of the subscription owner */
> 
> Why the new parameter is placed at different locations in above two
> structures? I think it should be after owner in both cases and accordingly its
> order should be changed in GetSubscription() or any other place it is used.
Fixed.


> 
> 2. A minor comment change suggestion:
>  /*
>   * Common spoolfile processing.
>   *
> - * The commit/prepare time (finish_ts) for streamed transactions is required
> - * for time-delayed logical replication.
> + * The commit/prepare time (finish_ts) is required for time-delayed
> + logical
> + * replication.
>   */
Fixed.

 
> 3. I find the newly added tests take about 8s on my machine which is close
> highest in the subscription folder. I understand that it can't be less than 3s
> because of the delay but checking multiple cases makes it take that long. I
> think we can avoid the tests for streaming and disable the subscription. Also,
> after removing those, I think it would be better to add the remaining test in
> 001_rep_changes to save set-up and tear-down costs as well.
Sounds good to me. Moved the test to 001_rep_changes.pl.


> 4.
> +$node_publisher->append_conf('postgresql.conf',
> + 'logical_decoding_work_mem = 64kB');
> 
> I think this setting is also not required.
Yes. And, in the process to move the test, removed.

Attached the v31 patch.

Note that regarding the translator style, I chose to move the parameters
out of the errmsg at this stage. If there is a need to change it, I'll
follow up.

Other changes are minor alignment fixes to fold 'if' conditions that
exceeded 80 characters so that they look nicer.

I also ran pgindent and pgperltidy.


Best Regards,
Takamichi Osumi



v31-0001-Time-delayed-logical-replication-subscriber.patch
Description:  v31-0001-Time-delayed-logical-replication-subscriber.patch


RE: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-07 Thread Takamichi Osumi (Fujitsu)
Hi,


On Tuesday, February 7, 2023 2:26 PM Amit Kapila  
wrote:
> On Tue, Feb 7, 2023 at 10:13 AM Kyotaro Horiguchi 
> wrote:
> >
> > At Mon, 6 Feb 2023 13:10:01 +, "Takamichi Osumi (Fujitsu)"
> >  wrote in
> > > The attached patch v29 has included your changes.
> >
> > catalogs.sgml
> >
> > +  
> > +   The minimum delay (ms) for applying changes.
> > +  
> >
> > I think we don't use unit symbols that way. Namely I think we would
> > write it as "The minimum delay for applying changes in milliseconds"
> >
> 
> Okay, if we prefer to use milliseconds, then how about: "The minimum delay, in
> milliseconds, for applying changes"?
This looks good to me. Adopted.

> 
> >
> > alter_subscription.sgml
> >
> >are slot_name,
> >synchronous_commit,
> >binary, streaming,
> > -  disable_on_error, and
> > -  origin.
> > +  disable_on_error,
> > +  origin, and
> > +  min_apply_delay.
> >   
> >
> > By the way, is there any rule for the order among the words?
> >
> 
> Currently, it is in the order in which the corresponding features are added.
Yes. So, I keep it as it is.

> 
> > They
> > don't seem in alphabetical order nor in the same order to the
> > create_subscription page.
> >
> 
> In create_subscription page also, it appears to be in the order in which those
> are added with a difference that they are divided into two categories
> (parameters that control what happens during subscription creation and
> parameters that control the subscription's replication behavior after it has 
> been
> created)
Same as here. The current order should be fine.

> 
> >  (I seems like in the order of SUBOPT_* symbols, but I'm not sure it's
> > a good idea..)
> >
> >
> > subscriptioncmds.c
> >
> > +   if (opts.streaming ==
> LOGICALREP_STREAM_PARALLEL &&
> > +
> > + !IsSet(opts.specified_opts, SUBOPT_MIN_APPLY_DELAY) &&
> > + sub->minapplydelay > 0)
> > ..
> > +   if (opts.min_apply_delay > 0 &&
> > +
> > + !IsSet(opts.specified_opts, SUBOPT_STREAMING) && sub->stream ==
> > + LOGICALREP_STREAM_PARALLEL)
> >
> > Don't we wrap the lines?
> >
> >
> > worker.c
> >
> > +   if (wal_receiver_status_interval > 0 &&
> > +   diffms > wal_receiver_status_interval * 1000L)
> > +   {
> > +   WaitLatch(MyLatch,
> > + WL_LATCH_SET |
> WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
> > + wal_receiver_status_interval *
> 1000L,
> > +
> WAIT_EVENT_RECOVERY_APPLY_DELAY);
> > +   send_feedback(last_received, true, false, true);
> > +   }
> > +   else
> > +   WaitLatch(MyLatch,
> > + WL_LATCH_SET |
> WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
> > + diffms,
> > +
> > + WAIT_EVENT_RECOVERY_APPLY_DELAY);
> >
> > send_feedback always handles the case where
> > wal_receiver_status_interval == 0.
> >
> 
> It only handles when force is false but here we are using that as true. So, 
> not
> sure, if what you said would be an improvement.
Agreed. So, I keep it as it is.

> 
> >  thus we can simply wait for
> > min(wal_receiver_status_interval, diffms) then call send_feedback()
> > unconditionally.
> >
> >
> > -start_apply(XLogRecPtr origin_startpos)
> > +start_apply(void)
> >
> > -LogicalRepApplyLoop(XLogRecPtr last_received)
> > +LogicalRepApplyLoop(void)
> >
> > Does this patch requires this change?
> >
> 
> I think this is because the scope of last_received has been changed so that it
> can be used to pass in send_feedback() during the delay.
Yes, that's our intention.


Kindly have a look at the latest patch v31 shared in [1].

[1] - 
https://www.postgresql.org/message-id/TYCPR01MB8373BA483A6D2C924C600968EDDB9%40TYCPR01MB8373.jpnprd01.prod.outlook.com

Best Regards,
Takamichi Osumi





RE: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-07 Thread Takamichi Osumi (Fujitsu)
Hi, Horiguchi-san


Thanks for your review !
On Tuesday, February 7, 2023 1:43 PM From: Kyotaro Horiguchi 
 wrote:
> At Mon, 6 Feb 2023 13:10:01 +, "Takamichi Osumi (Fujitsu)"
>  wrote in
> subscriptioncmds.c
> 
> + if (opts.streaming ==
> LOGICALREP_STREAM_PARALLEL &&
> + !IsSet(opts.specified_opts,
> SUBOPT_MIN_APPLY_DELAY) &&
> +sub->minapplydelay > 0)
> ..
> + if (opts.min_apply_delay > 0 &&
> + !IsSet(opts.specified_opts,
> SUBOPT_STREAMING) && sub->stream ==
> +LOGICALREP_STREAM_PARALLEL)
> 
> Don't we wrap the lines?
Yes, those lines should have looked nicer.
Updated. Kindly have a look at the latest patch v31 in [1].
There are also some other changes in the patch.

[1] - 
https://www.postgresql.org/message-id/TYCPR01MB8373BA483A6D2C924C600968EDDB9%40TYCPR01MB8373.jpnprd01.prod.outlook.com


Best Regards,
Takamichi Osumi





Track Oldest Initialized WAL Buffer Page

2023-02-07 Thread Bharath Rupireddy
Hi,

While working on [1], I was looking for a quick way to tell whether a WAL
record is present in the WAL buffers array without scanning it, but I
couldn't find one. Hence, I put up a patch that tracks the oldest
initialized WAL buffer page, named OldestInitializedPage, in XLogCtl. With
OldestInitializedPage, we can easily state the following properties of the
WAL buffers array:

1) At any given point of time, pages in the WAL buffers array are
sorted in an ascending order from OldestInitializedPage till
InitializedUpTo. Note that we verify this property for assert-only
builds, see IsXLogBuffersArraySorted() in the patch for more details.

2) OldestInitializedPage is monotonically increasing (by virtue of how
postgres generates WAL records), that is, its value never decreases.
This property lets someone read its value without a lock. There's no
problem even if its value is slightly stale i.e. concurrently being
updated. One can still use it for finding if a given WAL record is
available in WAL buffers. At worst, one might get false positives
(i.e. OldestInitializedPage may tell that the WAL record is available
in WAL buffers, but when one actually looks at it, it isn't really
available). This is more efficient and performant than acquiring a
lock for reading. Note that we may not need a lock to read
OldestInitializedPage but we need to update it holding
WALBufMappingLock.

3) One can start traversing WAL buffers from OldestInitializedPage
till InitializedUpTo to list out all valid WAL records and stats, and
expose them via SQL-callable functions to users, for instance, as
pg_walinspect functions.

4) The WAL buffers array is inherently organized as a circular, sorted and
rotated array with OldestInitializedPage as the pivot (first element), with
the property that the LSNs of both the previous buffer page (if valid) and
the next buffer page (if valid) are greater than OldestInitializedPage.

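To make property 2 concrete, here is a rough sketch (not part of the attached
patch) of the kind of lock-free pre-check it enables; a positive answer is
only a hint and must be re-verified under the appropriate lock before
actually reading the page:

```c
/* Rough sketch only: may a record starting at "lsn" still be in WAL buffers? */
static bool
MaybeInXLogBuffers(XLogRecPtr lsn)
{
	XLogRecPtr	oldest = XLogCtl->OldestInitializedPage;
	XLogRecPtr	newest = XLogCtl->InitializedUpTo;

	if (XLogRecPtrIsInvalid(oldest))
		return false;			/* nothing initialized yet */

	return lsn >= oldest && lsn < newest;
}
```
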
Thoughts?

[1] 
https://www.postgresql.org/message-id/CALj2ACXKKK=wbiG5_t6dGao5GoecMwRkhr7GjVBM_jg54+Na=q...@mail.gmail.com

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 683f8f2764970ca5737039577d64c91e491908d6 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Mon, 6 Feb 2023 06:58:54 +
Subject: [PATCH v1] Track Oldest Initialized WAL Buffer Page

---
 src/backend/access/transam/xlog.c | 170 ++
 src/include/access/xlog.h |   1 +
 2 files changed, 171 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f9f0f6db8d..7625907542 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -504,6 +504,44 @@ typedef struct XLogCtlData
 	XLogRecPtr *xlblocks;		/* 1st byte ptr-s + XLOG_BLCKSZ */
 	int			XLogCacheBlck;	/* highest allocated xlog buffer index */
 
+	/*
+	 * Start address of oldest initialized page in XLog buffers.
+	 *
+	 * We mainly track oldest initialized page explicitly to quickly tell if a
+	 * given WAL record is available in XLog buffers. It also can be used for
+	 * other purposes, see notes below.
+	 *
+	 * OldestInitializedPage gives XLog buffers following properties:
+	 *
+	 * 1) At any given point of time, pages in XLog buffers array are sorted in
+	 * an ascending order from OldestInitializedPage till InitializedUpTo.
+	 * Note that we verify this property for assert-only builds, see
+	 * IsXLogBuffersArraySorted() for more details.
+	 *
+	 * 2) OldestInitializedPage is monotonically increasing (by virtue of how
+	 * postgres generates WAL records), that is, its value never decreases.
+	 * This property lets someone read its value without a lock. There's no
+	 * problem even if its value is slightly stale i.e. concurrently being
+	 * updated. One can still use it for finding if a given WAL record is
+	 * available in XLog buffers. At worst, one might get false positives (i.e.
+	 * OldestInitializedPage may tell that the WAL record is available in XLog
+	 * buffers, but when one actually looks at it, it isn't really available).
+	 * This is more efficient and performant than acquiring a lock for reading.
+	 * Note that we may not need a lock to read OldestInitializedPage but we
+	 * need to update it holding WALBufMappingLock.
+	 *
+	 * 3) One can start traversing XLog buffers from OldestInitializedPage till
+	 * InitializedUpTo to list out all valid WAL records and stats, and expose
+	 * them via SQL-callable functions to users.
+	 *
+	 * 4) XLog buffers array is inherently organized as a circular, sorted and
+	 * rotated array with OldestInitializedPage as pivot with the property
+	 * where LSN of previous buffer page (if valid) is greater than
+	 * OldestInitializedPage and LSN of next buffer page (if valid) is greater
+	 * than OldestInitializedPage.
+	 */
+	XLogRecPtr	OldestInitializedPage;
+
 	/*
 	 * InsertTimeLineID is the timeline into which new WAL is being inserted
 	 *

How to solve "too many Lwlocks taken"?

2023-02-07 Thread jack...@gmail.com

I'm trying to write a table_am extension, but I get "too many Lwlocks taken"
after I insert too many tuples. I tried using UnLockBuffers() everywhere, but
it still gives me "too many Lwlocks taken". So how should I release all the
locks?

--



jack...@gmail.com




RE: Exit walsender before confirming remote flush in logical replication

2023-02-07 Thread Hayato Kuroda (Fujitsu)
Dear Andres, Amit,

> On 2023-02-07 09:00:13 +0530, Amit Kapila wrote:
> > On Tue, Feb 7, 2023 at 2:04 AM Andres Freund  wrote:
> > > How about we make it an option in START_REPLICATION? Delayed logical
> rep can
> > > toggle that on by default.
> 
> > Works for me. So, when this option is set in START_REPLICATION
> > message, walsender will set some flag and allow itself to exit at
> > shutdown without waiting for WAL to be sent?
> 
> Yes. I think that might be useful in other situations as well, but we don't
> need to make those configurable initially. But I imagine it'd be useful to set
> things up so that non-HA physical replicas don't delay shutdown, particularly
> if they're geographically far away.

Based on the discussion, I made a patch that adds a walsender option,
exit_before_confirming, to the START_REPLICATION command. It can be used for
both physical and logical replication. I made the patch with extensibility
in mind - it allows adding further options.
Better names are very welcome.

For physical replication, the grammar was changed slightly, like the logical
one: it can now accept options, but currently only one option is allowed, and
it is not used in normal streaming replication. For logical replication, the
option is combined with the options for the output plugin. Of course, we can
change the API to a better style.

The 0001 patch was ported from the time-delayed logical replication
thread [1] and uses the added option. When the min_apply_delay option is
specified and the publisher appears to be PG16 or later, the apply worker
sends a START_REPLICATION query with exit_before_confirming = true. The
worker will restart and send START_REPLICATION again when min_apply_delay
is changed from zero to a non-zero value or from non-zero to zero.

Note that I removed the version number because the approach has completely changed.

[1]: 
https://www.postgresql.org/message-id/tycpr01mb8373ba483a6d2c924c600968ed...@tycpr01mb8373.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



0001-Time-delayed-logical-replication-subscriber.patch
Description: 0001-Time-delayed-logical-replication-subscriber.patch


0002-Extend-START_REPLICATION-command-to-accept-walsender.patch
Description:  0002-Extend-START_REPLICATION-command-to-accept-walsender.patch


Re: daitch_mokotoff module

2023-02-07 Thread Dag Lem
Hi Paul,

I just went by to check the status of the patch, and I noticed that
you've added yourself as reviewer earlier - great!

Please tell me if there is anything I can do to help bring this across
the finish line.

Best regards,

Dag Lem




SQLFunctionCache and generic plans

2023-02-07 Thread Ronan Dunklau
Hello,

It has been brought to my attention that SQL functions always use generic 
plans.

Take this function for example:

create or replace function test_plpgsql(p1 oid) returns text as $$
BEGIN
   RETURN (SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1);  
 
END;
$$ language plpgsql;

As expected, the PlanCache takes care of generating parameter-specific plans,
and correctly prunes the redundant OR depending on whether we call the
function with a NULL value or not:

ro=# select test_plpgsql(NULL);
LOG:  duration: 0.030 ms  plan:
Query Text: (SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 
1)
Result  (cost=0.04..0.05 rows=1 width=64)
  InitPlan 1 (returns $0)
->  Limit  (cost=0.00..0.04 rows=1 width=64)
  ->  Seq Scan on pg_class  (cost=0.00..18.12 rows=412 width=64)
LOG:  duration: 0.662 ms  plan:
Query Text: select test_plpgsql(NULL);
Result  (cost=0.00..0.26 rows=1 width=32)

ro=# select test_plpgsql(1);
LOG:  duration: 0.075 ms  plan:
Query Text: (SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 
1)
Result  (cost=8.29..8.30 rows=1 width=64)
  InitPlan 1 (returns $0)
->  Limit  (cost=0.27..8.29 rows=1 width=64)
  ->  Index Scan using pg_class_oid_index on pg_class  
(cost=0.27..8.29 rows=1 width=64)
Index Cond: (oid = '1'::oid)
LOG:  duration: 0.675 ms  plan:
Query Text: select test_plpgsql(1);
Result  (cost=0.00..0.26 rows=1 width=32)


But writing the same function in SQL:
create or replace function test_sql(p1 oid) returns text as $$
SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1
$$ language sql;

we end up with a generic plan:

ro=# select test_sql(1);
LOG:  duration: 0.287 ms  plan:
Query Text:  SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1
Query Parameters: $1 = '1'
Limit  (cost=0.00..6.39 rows=1 width=32)
  ->  Seq Scan on pg_class  (cost=0.00..19.16 rows=3 width=32)
Filter: ((oid = $1) OR ($1 IS NULL))

This is due to the fact that SQL functions are planned once for the whole 
query using a specific SQLFunctionCache instead of using the whole PlanCache 
machinery. 

The following comment can be found in functions.c, about the SQLFunctionCache:

 * Note that currently this has only the lifespan of the calling query.
 * Someday we should rewrite this code to use plancache.c to save parse/plan
 * results for longer than that.

I would be interested in working on this, primarily to avoid this problem of
generic query plans for SQL functions, but a longer-lived cache would also be
nice to have.

Is there any reason not to, or pitfalls we would like to avoid?

Best regards,

--
Ronan Dunklau







Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Tom Lane
Amit Kapila  writes:
> On Tue, Feb 7, 2023 at 5:16 PM Andrew Dunstan  wrote:
>> On 2023-02-06 Mo 23:43, Noah Misch wrote:
 Yeah.  I don't think we are seriously considering putting any restrictions
 in place on gitmaster

>>> I could have sworn that was exactly what we were discussing, a pre-receive
>>> hook on gitmaster.

>> That's one idea that's been put forward, but it seems clear that some people 
>> are nervous about it.
>> Maybe a better course would be to continue improving the toolset and get 
>> more people comfortable with using it locally and then talk about 
>> integrating it upstream.

> Yeah, that sounds more reasonable to me as well.

+1.  Even if we end up with such a hook, we need to walk before we
can run.  The tooling of which we speak doesn't even exist today,
so it's unlikely to be bug-free, fast, and convenient to use
just two months from now.  Let's have some people use whatever
is proposed for awhile locally, and see what their experience is.

regards, tom lane




Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Justin Pryzby
On Sat, Feb 04, 2023 at 12:37:11PM -0500, Tom Lane wrote:
> Justin Pryzby  writes:

> Hmmm ... inserting all of those as the default options would likely
> make it impossible to update pg_bsd_indent itself with anything like
> its current indent style (not that it's terribly consistent about
> that).  I could see inventing a --postgresql shortcut switch perhaps.

Or you could add ./.indent.pro, or ./src/tools/indent.profile for it to
read.

> > Would you want to make those the default options of the in-tree indent ?
> > Or provide a shortcut like --postgresql ?
> 
> But it's not clear to me why you're allergic to the perl wrapper?

My allergy is to the totality of the process, not to the perl component.
It's a bit weird to enforce a coding style that no upstream indent tool
supports.  But what's weirder is that, *having forked the indent tool*,
it still doesn't implement the desired style, and the perl wrapper tries
to work around that.

It would be more reasonable if the forked C program knew how to handle
the stuff for which the perl script currently has kludges to munge the
source code before indenting and then un-munging afterwards.

Or if the indentation were handled by the (or a) perl script itself.

Or if the perl script handled everything that an unpatched "indent"
didn't handle, rather than some things but not others, demanding use of
a patched indent as well as a perl wrapper (not just for looping around
files and fancy high-level shortcuts like indenting staged files).

On the one hand, "indent" ought to handle all the source-munging stuff.
On the other hand, it'd be better to use an unpatched indent tool.  The
current way is the worst of both worlds.

Currently, the perl wrapper supports the "/* -"-style comments that
postgres wants to use (why?) by munging the source code.  That could be
supported in pg-bsd-indent with a one line change.  I think an even
better option would be to change postgres' C files to use "/*-" without
a space, which requires neither perl munging nor patching indent.

On a less critical note, I wonder if it's a good idea to import
pgbsdindent as a git "submodule".

-- 
Justin




Re: SQLFunctionCache and generic plans

2023-02-07 Thread Tom Lane
Ronan Dunklau  writes:
> The following comment can be found in functions.c, about the SQLFunctionCache:

>  * Note that currently this has only the lifespan of the calling query.
>  * Someday we should rewrite this code to use plancache.c to save parse/plan
>  * results for longer than that.

> I would be interested in working on this, primarily to avoid this problem of 
> having generic query plans for SQL functions but maybe having a longer lived 
> cache as well would be nice to have.
> Is there any reason not too, or pitfalls we would like to avoid ?

AFAIR it's just lack of round tuits.  There would probably be some
semantic side-effects, though if you pay attention you could likely
make things better while you are at it.  The existing behavior of
parsing and planning all the statements at once is not very desirable
--- for instance, it doesn't work to do
CREATE TABLE foo AS ...;
SELECT * FROM foo;
I think if we're going to nuke this code and start over, we should
try to make that sort of case work.

regards, tom lane




Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Andrew Dunstan


On 2023-02-07 Tu 10:25, Justin Pryzby wrote:

On Sat, Feb 04, 2023 at 12:37:11PM -0500, Tom Lane wrote:

Justin Pryzby  writes:
Hmmm ... inserting all of those as the default options would likely
make it impossible to update pg_bsd_indent itself with anything like
its current indent style (not that it's terribly consistent about
that).  I could see inventing a --postgresql shortcut switch perhaps.

Or you could add ./.indent.pro, or ./src/tools/indent.profile for it to
read.


Would you want to make those the default options of the in-tree indent ?
Or provide a shortcut like --postgresql ?

But it's not clear to me why you're allergic to the perl wrapper?

My allergy is to the totality of the process, not to the perl component.
It's a bit weird to enforce a coding style that no upstream indent tool
supports.  But what's weirder is that, *having forked the indent tool*,
it still doesn't implement the desired style, and the perl wrapper tries
to work around that.

It would be more reasonable if the forked C program knew how to handle
the stuff for which the perl script currently has kludges to munge the
source code before indenting and then un-munging afterwards.

Or if the indentation were handled by the (or a) perl script itself.

Or if the perl script handled everything that an unpatched "indent"
didn't handle, rather than some things but not others, demanding use of
a patched indent as well as a perl wrapper (not just for looping around
files and fancy high-level shortcuts like indenting staged files).

On the one hand, "indent" ought to handle all the source-munging stuff.
On the other hand, it'd be better to use an unpatched indent tool.  The
current way is the worst of both worlds.

Currently, the perl wrapper supports the "/* -"-style comments that
postgres wants to use (why?) by munging the source code.  That could be
supported in pg-bsd-indent with a one line change.  I think an even
better option would be to change postgres' C files to use "/*-" without
a space, which requires neither perl munging nor patching indent.



Historically we used to do a heck of a lot more in pgindent than is
currently done in the pre_indent and post_indent functions. If you want
to spend time implementing that logic in pg_bsd_indent so we can remove
the remaining bits of that processing then go for it.




On a less critical note, I wonder if it's a good idea to import
pgbsdindent as a git "submodule".



Meh, git submodules can be a pain in the neck in my limited experience. 
I'd rather steer clear.



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Tom Lane
Justin Pryzby  writes:
> On Sat, Feb 04, 2023 at 12:37:11PM -0500, Tom Lane wrote:
>> But it's not clear to me why you're allergic to the perl wrapper?

> My allergy is to the totality of the process, not to the perl component.
> It's a bit weird to enforce a coding style that no upstream indent tool
> supports.  But what's weirder is that, *having forked the indent tool*,
> it still doesn't implement the desired style, and the perl wrapper tries
> to work around that.

> It would be more reasonable if the forked C program knew how to handle
> the stuff for which the perl script currently has kludges to munge the
> source code before indenting and then un-munging afterwards.

[ shrug... ]  If you want to put cycles into that, nobody is stopping
you.  For me, it sounds like make-work.

regards, tom lane




RE: Exit walsender before confirming remote flush in logical replication

2023-02-07 Thread Hayato Kuroda (Fujitsu)
> Dear Andres, Amit,
> 
> > On 2023-02-07 09:00:13 +0530, Amit Kapila wrote:
> > > On Tue, Feb 7, 2023 at 2:04 AM Andres Freund  wrote:
> > > > How about we make it an option in START_REPLICATION? Delayed logical
> > rep can
> > > > toggle that on by default.
> >
> > > Works for me. So, when this option is set in START_REPLICATION
> > > message, walsender will set some flag and allow itself to exit at
> > > shutdown without waiting for WAL to be sent?
> >
> > Yes. I think that might be useful in other situations as well, but we don't
> > need to make those configurable initially. But I imagine it'd be useful to 
> > set
> > things up so that non-HA physical replicas don't delay shutdown, 
> > particularly
> > if they're geographically far away.
> 
> Based on the discussion, I made a patch for adding a walsender option
> exit_before_confirming to the START_STREAMING replication command. It can
> be
> used for both physical and logical replication. I made the patch with
> extendibility - it allows adding further options.
> And better naming are very welcome.
> 
> For physical replication, the grammar was slightly changed like a logical one.
> It can now accept options but currently, only one option is allowed. And it is
> not used in normal streaming replication. For logical replication, the option 
> is
> combined with options for the output plugin. Of course, we can modify the API 
> to
> better style.
> 
> 0001 patch was ported from time-delayed logical replication thread[1], which
> uses
> the added option. When the min_apply_delay option is specified and publisher
> seems
> to be PG16 or later, the apply worker sends a START_REPLICATION query with
> exit_before_confirming = true. And the worker will reboot and send
> START_REPLICATION
> again when min_apply_delay is changed from zero to a non-zero value or 
> non-zero
> to zero.
> 
> Note that I removed version number because the approach is completely changed.
> 
> [1]:
> https://www.postgresql.org/message-id/TYCPR01MB8373BA483A6D2C924C60
> 0968ed...@tycpr01mb8373.jpnprd01.prod.outlook.com

I noticed that the previous ones were rejected by cfbot, even though they
passed in my environment...
PSA a fixed version.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v2-0001-Time-delayed-logical-replication-subscriber.patch
Description: v2-0001-Time-delayed-logical-replication-subscriber.patch


v2-0002-Extend-START_REPLICATION-command-to-accept-walsen.patch
Description:  v2-0002-Extend-START_REPLICATION-command-to-accept-walsen.patch


Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Robert Haas
On Tue, Feb 7, 2023 at 8:17 AM Andrew Dunstan  wrote:
> My git-fu is probably not all that it should be. I think we could possibly 
> get at this list of files by running
>
>   git status --porcelain --untracked-files=no --ignored=no -- .
>
> And then your --dirty list would be lines beginning with ' M' while your 
> --cached list would be lines beginning with 'A[ M]'
>
> Does that seem plausible?

I don't know if that works or not, but it does seem plausible, at
least. My idea would have been to use the --name-status option, which
works for both git diff and git show. You just look and see which
lines in the output start with M or A and then take the file names
from those lines.

So to indent files that are dirty, you would look at:

git diff --name-status

For what's cached:

git diff --name-status --cached

For the combination of the two:

git diff --name-status HEAD

For a prior commit:

git show --name-status $COMMITID

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Too coarse predicate locks granularity for B+ tree indexes

2023-02-07 Thread Andrey Borodin
On Tue, Feb 7, 2023 at 4:01 AM Rinat Shigapov  wrote:
>
> Thomas, thank you for the details!
>
> Have you kept the branch that you used to generate the patch? Which commit 
> should the patch apply to?
>

You can try something like
git checkout 'master@{2018-05-13 13:37:00}'
to get a commit by date from rev-parse.

Best regards, Andrey Borodin.




Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Jelte Fennema
On Tue, 7 Feb 2023 at 17:11, Robert Haas  wrote:
> I don't know if that works or not, but it does seem plausible, at
> least. My idea would have been to use the --name-status option, which
> works for both git diff and git show. You just look and see which
> lines in the output start with M or A and then take the file names
> from those lines.

If you add `--diff-filter=ACMR`, then git diff/show will only show
Added, Copied, Modified, and Renamed files.

The pre-commit hook that Andrew added to the wiki uses that in
combination with --name-only to get the list of files that you want to
check on commit:
https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks




Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Robert Haas
On Tue, Feb 7, 2023 at 11:32 AM Jelte Fennema  wrote:
> On Tue, 7 Feb 2023 at 17:11, Robert Haas  wrote:
> > I don't know if that works or not, but it does seem plausible, at
> > least. My idea would have been to use the --name-status option, which
> > works for both git diff and git show. You just look and see which
> > lines in the output start with M or A and then take the file names
> > from those lines.
>
> If you add `--diff-filter=ACMR`, then git diff/show will only show
> Added, Copied, Modified, and Renamed files.
>
> The pre-commit hook that Andrew added to the wiki uses that in
> combination with --name-only to get the list of files that you want to
> check on commit:
> https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks

Thanks, that sounds nicer than what I suggested.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: A problem in deconstruct_distribute_oj_quals

2023-02-07 Thread Tom Lane
Richard Guo  writes:
> On Tue, Feb 7, 2023 at 2:12 PM Tom Lane  wrote:
>> I don't see any change in this query plan when I remove that code, so
>> I'm not sure you're explaining your point very well.

> To observe an obvious plan change, we can add unique constraint for 'a'
> and look how outer-join removal works.

Ah.  Yeah, that's pretty convincing, especially since v15 manages to
find that optimization.  Pushed with a test case.

regards, tom lane




Re: daitch_mokotoff module

2023-02-07 Thread Paul Ramsey



> On Feb 7, 2023, at 6:47 AM, Dag Lem  wrote:
> 
> I just went by to check the status of the patch, and I noticed that
> you've added yourself as reviewer earlier - great!
> 
> Please tell me if there is anything I can do to help bring this across
> the finish line.

Honestly, I had set it to Ready for Committer, but then I went to run 
regression one more time and my regression blew up. I found I couldn't enable 
the UTF tests without things failing. And I don't blame you! I think my 
installation is probably out-of-alignment in some way, but I didn't want to 
flip the Ready flag without having run everything through to completion, so I 
flipped it back. Also, are the UTF tests enabled by default? It wasn't clear to 
me that they were?

P



RE: Exit walsender before confirming remote flush in logical replication

2023-02-07 Thread Hayato Kuroda (Fujitsu)
> I noticed that previous ones are rejected by cfbot, even if they passed on my
> environment...
> PSA fixed version.

While analyzing further, I found another bug: a missing initialization.
PSA a new version that passes the automated tests in my GitHub repository.
Sorry for the noise.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v3-0001-Time-delayed-logical-replication-subscriber.patch
Description: v3-0001-Time-delayed-logical-replication-subscriber.patch


v3-0002-Extend-START_REPLICATION-command-to-accept-walsen.patch
Description:  v3-0002-Extend-START_REPLICATION-command-to-accept-walsen.patch


Re: run pgindent on a regular basis / scripted manner

2023-02-07 Thread Jelte Fennema
> On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan  wrote:
>
> Here's a quick patch for 1 and 3. Would also need to adjust the docco.
>
>
>
> This time with patch.

When supplying the --commit flag it still formats all files for me. I
was able to fix that by replacing:
# no non-option arguments given. so do everything in the current directory
$code_base ||= '.' unless @ARGV;

with:
# no files, dirs or commits given. so do everything in the current directory
$code_base ||= '.' unless @ARGV || @commits;

Does the code-base flag still make sense if you can simply pass a
directory as regular args now?




Re: Speed up transaction completion faster after many relations are accessed in a transaction

2023-02-07 Thread Andres Freund
Hi,

On 2023-01-24 16:57:37 +1300, David Rowley wrote:
> I've attached a rebased patch.

Looks like there's some issue causing tests to fail probabilistically:

https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F42%2F3501

Several failures are when testing a 32bit build.



> While reading over this again, I wondered if instead of allocating the
> memory for the LOCALLOCKOWNER in TopMemoryContext, maybe we should
> create a Slab context as a child of TopMemoryContext and perform the
> allocations there.

Yes, that does make sense.
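
For illustration, a dedicated slab context for those allocations might look
roughly like this (a sketch only; LOCALLOCKOWNER is the struct introduced by
the patch, and the helper/context names are made up):

```c
static MemoryContext LocalLockOwnerContext = NULL;

static LOCALLOCKOWNER *
alloc_locallockowner(void)
{
	/* create the slab lazily, as a child of TopMemoryContext */
	if (LocalLockOwnerContext == NULL)
		LocalLockOwnerContext =
			SlabContextCreate(TopMemoryContext,
							  "LOCALLOCKOWNER slab",
							  SLAB_DEFAULT_BLOCK_SIZE,
							  sizeof(LOCALLOCKOWNER));

	return (LOCALLOCKOWNER *) MemoryContextAlloc(LocalLockOwnerContext,
												 sizeof(LOCALLOCKOWNER));
}
```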


> I would like to get this LockReleaseAll problem finally fixed in PG16,
> but I'd feel much better about this patch if it had some review from
> someone who has more in-depth knowledge of the locking code.

I feel my review wouldn't be independent, but I'll give it a shot if nobody
else does.

Greetings,

Andres Freund




Re: generic plans and "initial" pruning

2023-02-07 Thread Andres Freund
Hi,

On 2023-02-03 22:01:09 +0900, Amit Langote wrote:
> I've added a test case under src/modules/delay_execution by adding a
> new ExecutorStart_hook that works similarly as
> delay_execution_planner().  The test works by allowing a concurrent
> session to drop an object being referenced in a cached plan being
> initialized while the ExecutorStart_hook waits to get an advisory
> lock.  The concurrent drop of the referenced object is detected during
> ExecInitNode() and thus triggers replanning of the cached plan.
> 
> I also fixed a bug in the ExplainExecuteQuery() while testing and some 
> comments.

The tests seem to frequently hang on freebsd:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F42%2F3478

Greetings,

Andres Freund




Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?

2023-02-07 Thread Andres Freund
Hi,

On 2023-01-24 14:30:34 +0100, David Geier wrote:
> Attached is v7 of the patch:
> 
> - Rebased on latest master (most importantly on top of the int64 instr_time
>   commits).
> - Includes two commits from Andres which introduce INSTR_TIME_SET_SECONDS(),
>   INSTR_TIME_IS_LT() and WIP to report pg_test_timing output in nanoseconds.
> - Converts ticks to nanoseconds only with integer math, while accounting for
>   overflow.
> - Supports RDTSCP via INSTR_TIME_SET_CURRENT() and introduced
>   INSTR_TIME_SET_CURRENT_FAST() which uses RDTSC.
> 
> I haven't gotten to the following:
> 
> - Looking through all calls to INSTR_TIME_SET_CURRENT() and check if they
>   should be replaced by INSTR_TIME_SET_CURRENT_FAST().
> - Reviewing Andres commits. Potentially improving on pg_test_timing's output.
> - Looking at enabling RDTSC on more platforms. Is there a minimum set of
>   platforms we would like support for? Windows should be easy. That would
>   also allow to unify the code a little more.
> - Add more documentation and do more testing around the calls to CPUID.
> - Profiling and optimizing the code. A quick test showed about 10% improvement
>   over master with TIMING ON vs TIMING OFF, when using the test-case from
>   Andres' e-mail that started this thread.
> 
> I hope I'll find time to work on these points during the next days.

This fails to build on several platforms:

https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F42%2F3751

Greetings,

Andres Freund




Re: Can we let extensions change their dumped catalog schemas?

2023-02-07 Thread Jacob Champion
On Tue, Jan 17, 2023 at 3:18 PM Jacob Champion  wrote:
> As a concrete example, Timescale's extension control file could look
> like this:
>
> default_version = '2.x.y'
> module_pathname = '$libdir/timescaledb-2.x.y'
> ...
> dump_version = true
>
> which would then cause pg_dump to issue a VERSION for its CREATE
> EXTENSION line. Other extensions would remain with the default
> (dump_version = false), so they'd be dumped without an explicit VERSION.

Even more concretely, here's a draft patch. There's no nuance --
setting dump_version affects all past versions of the extension, and
it can't be turned off at restore time. So extensions opting in would
have to be written to be side-by-side installable. (Ours is, in its
own way, but the PGDG installers don't allow it -- which maybe
highlights a weakness in this approach.) I'm still not sure if this
addresses Tom's concerns, or just adds new ones.

We could maybe give the user more control by overriding the default
version for an extension in a different TOC entry, and then add
options to ignore or include those version numbers during restore.
That doesn't address the side-by-side problem directly but gives an
escape hatch.

Earlier I wrote:

> I'm curious about your
> opinion on option 3, since it would naively seem to make pg_dump do
> _less_ work for a given extension.

This was definitely naive :D We can't just make use of the
binary-upgrade machinery to dump extension internals, because it pins
OIDs. So that might still be a valid approach, but it's not "less
work."

Thanks,
--Jacob
From 1a79eacbc6bdad8bb5e05418756d6ab35e9bab9e Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Fri, 27 Jan 2023 10:42:43 -0800
Subject: [PATCH] WIP: implement dump_version control option

The intent is to allow extensions to tie their internal catalog version
to a dump. This allows any dumped catalog tables to be upgraded after a
restore. The downside is that both the old and new extension versions
must be available for installation on the target.

This passes manual sanity checks, but there is no test yet.
---
 src/backend/commands/extension.c | 29 +
 src/bin/pg_dump/pg_dump.c| 25 ++---
 src/bin/pg_dump/pg_dump.h|  1 +
 src/include/catalog/pg_proc.dat  |  4 
 4 files changed, 56 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index b1509cc505..b58c1d27a1 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -88,6 +88,7 @@ typedef struct ExtensionControlFile
 	bool		relocatable;	/* is ALTER EXTENSION SET SCHEMA supported? */
 	bool		superuser;		/* must be superuser to install? */
 	bool		trusted;		/* allow becoming superuser on the fly? */
+	bool		dump_version;	/* record the VERSION during pg_dump? */
 	int			encoding;		/* encoding of the script file, or -1 */
 	List	   *requires;		/* names of prerequisite extensions */
 } ExtensionControlFile;
@@ -582,6 +583,14 @@ parse_extension_control_file(ExtensionControlFile *control,
 		 errmsg("parameter \"%s\" requires a Boolean value",
 item->name)));
 		}
+		else if (strcmp(item->name, "dump_version") == 0)
+		{
+			if (!parse_bool(item->value, &control->dump_version))
+ereport(ERROR,
+		(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+		 errmsg("parameter \"%s\" requires a Boolean value",
+item->name)));
+		}
 		else if (strcmp(item->name, "encoding") == 0)
 		{
 			control->encoding = pg_valid_server_encoding(item->value);
@@ -639,6 +648,7 @@ read_extension_control_file(const char *extname)
 	control->relocatable = false;
 	control->superuser = true;
 	control->trusted = false;
+	control->dump_version = false;
 	control->encoding = -1;
 
 	/*
@@ -2532,6 +2542,25 @@ pg_extension_config_dump(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
+/*
+ * This function reports whether an extension's version should be written
+ * explicitly by pg_dump.
+ */
+Datum
+pg_extension_version_is_dumpable(PG_FUNCTION_ARGS)
+{
+	Name		extname = PG_GETARG_NAME(0);
+	ExtensionControlFile *control;
+
+	/* Check extension name validity before any filesystem access */
+	check_valid_extension_name(NameStr(*extname));
+
+	/* Read the extension's control file */
+	control = read_extension_control_file(NameStr(*extname));
+
+	PG_RETURN_BOOL(control->dump_version);
+}
+
 /*
  * extension_config_remove
  *
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 527c7651ab..177ffbede5 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -5209,11 +5209,21 @@ getExtensions(Archive *fout, int *numExtensions)
 	int			i_extversion;
 	int			i_extconfig;
 	int			i_extcondition;
+	int			i_dump_version;
 
 	query = createPQExpBuffer();
 
 	appendPQExpBufferStr(query, "SELECT x.tableoid, x.oid, "
-		 "x.extname, n.nspname, x.extrelocatable, x.extversion, x.extconfig, x.extcondition "
+		 "x.extname, n.nspname, x.extrelocata

Re: How to solve "too many Lwlocks taken"?

2023-02-07 Thread Andres Freund
Hi,

On 2023-02-07 22:16:36 +0800, jack...@gmail.com wrote:
> 
> I'm trying to write am table_am extension. But I get "too many Lwlocks taken" 
> after I insert 
> too many tuples. So I try to use UnLockBuffers() everywhere; but it still 
> give me "too many Lwlocks taken",
> So how should I release All locks?

This indicates that you aren't actually releasing all the lwlocks. You can
inspect
static int  num_held_lwlocks = 0;
static LWLockHandle held_lwlocks[MAX_SIMUL_LWLOCKS];
in a debugger to see which locks you didn't release.


You're currently starting multiple threads with questions a week. Could you at
least keep them in one thread?

Greetings,

Andres Freund




Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

2023-02-07 Thread Andres Freund
Hi,

On 2023-01-26 13:07:08 +0300, Aleksander Alekseev wrote:
> > It *certainly* can't be right to just continue with the update in 
> > heap_update,
> 
> I see no reason why. What makes this case so different from updating a
> tuple created by the previous command?

To me it's a pretty fundamental violation of how heap visibility works. I'm
quite sure that there will be problems, but I don't feel like investing the
time to find a reproducer for something that I'm ready to reject on principle.


> > as you've done. You'd have to skip the update, not execute it. What am I
> > missing here?
> 
> Simply skipping updates in a statement that literally says DO UPDATE
> doesn't seem to be the behavior a user would expect.

Given that we skip the update in "UPDATE", your argument doesn't hold much
water.


> > I think this'd completely break triggers, for example, because they won't be
> > able to get the prior row version, since it won't actually be a row ever
> > visible (due to cmin=cmax).
> >
> > I suspect it might break unique constraints as well, because we'd end up 
> > with
> > an invisible row in part of the ctid chain.
> 
> That's a reasonable concern, however I was unable to break unique
> constraints or triggers so far:

I think you'd have to do a careful analysis of a lot of code for that to hold
any water.


I continue to think that we should just reject this behavioural change.

Greetings,

Andres Freund




Re: OpenSSL 3.0.0 vs old branches

2023-02-07 Thread Tom Lane
Andrew Dunstan  writes:
> On 2023-02-07 Tu 02:18, Peter Eisentraut wrote:
>> This is not the only patch that we did to support OpenSSL 3.0.0. There 
>> was a very lengthy discussion that resulted in various patches.  
>> Unless we have a complete analysis of what was done and how it affects 
>> various branches, I would not do this.  Notably, we did actually 
>> consider what to backpatch, and the current state is the result of 
>> that.  So let's not throw that away without considering that 
>> carefully.  Even if it gets it to compile, I personally would not 
>> *trust* it without that analysis.  I think we should just leave it 
>> alone and consider OpenSSL 3.0.0 unsupported in the branches were it 
>> is now unsupported.  OpenSSL 1.1.1 is still supported upstream to 
>> serve those releases.

AFAICT we did back-patch those changes into the branches at issue.
I find this in the 12.9 and 11.14 release notes, for example:



 
  Support OpenSSL 3.0.0
  (Peter Eisentraut, Daniel Gustafsson, Michael Paquier)
 


> The only thing this commit does is replace a DES encrypted key file with 
> one encrypted with AES-256. It doesn't affect compilation at all, and 
> shouldn't affect tests run with 1.1.1.

I double-checked this on Fedora 37 (openssl 3.0.5).  v11 and v12
do build --with-openssl.  There are an annoyingly large number of
-Wdeprecated-declarations warnings, but those are there in v13 too.
I confirm that back-patching f0d2c65f17 is required and sufficient
to make the ssl test pass.

I think Peter's misremembering the history, and OpenSSL 3 *is*
supported in these branches.  There could be an argument for
not back-patching f0d2c65f17 on the grounds that pre-1.1.1 is
also supported there.  On the whole though, it seems more useful
today for that test to pass with 3.x than for it to pass with 0.9.8.
And I can't see investing effort to make it do both (but if Peter
wants to, I won't stand in the way).

regards, tom lane




possible proposal plpgsql GET DIAGNOSTICS oid = PG_ROUTINE_OID

2023-02-07 Thread Pavel Stehule
Hi

I have a question about the possibility of simply getting the name of the
currently executed function. The reason for this request is simplification
of writing debug messages.

GET DIAGNOSTICS _oid = PG_ROUTINE_OID;
RAISE NOTICE '... % ... %', _oid, _oid::regproc::text;
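
In a complete function the intended usage would look something like this (just
a sketch; PG_ROUTINE_OID is the proposed item and does not exist today):

CREATE FUNCTION demo() RETURNS void AS $$
DECLARE
    _oid oid;
BEGIN
    GET DIAGNOSTICS _oid = PG_ROUTINE_OID;
    -- report both the oid and the name derived from it
    RAISE NOTICE 'running % (oid %)', _oid::regproc, _oid;
END;
$$ LANGUAGE plpgsql;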

The advantage of this dynamic access to the function name is that the value
is always valid and not sensitive to renaming the function or moving it
between schemas.

I am able to separate a name from the context, but it can be hard to write
this separation really robustly. It would be very easy to enhance the GET
DIAGNOSTICS statement to return the oid of the currently executed function.

Do you think it could be a useful feature?

The implementation should be trivial.

Comments, notes?

Regards

Pavel


Re: Assertion failure in SnapBuildInitialSnapshot()

2023-02-07 Thread Andres Freund
Hi,

On 2023-02-01 11:23:57 +0530, Amit Kapila wrote:
> On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada  wrote:
> >
> > Attached updated patches.
> >
>
> Thanks, Andres, others, do you see a better way to fix this problem? I
> have reproduced it manually and the steps are shared at [1] and
> Sawada-San also reproduced it, see [2].
>
> [1] - 
> https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com
> [2] - 
> https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com

Hm. It's worrisome to now hold ProcArrayLock exclusively while iterating over
the slots. ReplicationSlotsComputeRequiredXmin() can be called at a
non-negligible frequency.  Callers like CreateInitDecodingContext(), which pass
already_locked=true, worry me a lot less, because obviously that's not a very
frequent operation.

This is particularly not great because we need to acquire
ReplicationSlotControlLock while already holding ProcArrayLock.


But clearly there's a pretty large hole in the lock protection right now. I'm
a bit confused about why we (Robert and I, or just I) thought it's ok to do it
this way.


I wonder if we could instead invert the locks, and hold
ReplicationSlotControlLock until after ProcArraySetReplicationSlotXmin(), and
acquire ProcArrayLock just for ProcArraySetReplicationSlotXmin().  That'd mean
that already_locked = true callers have to do a bit more work (we have to be
sure the locks are always acquired in the same order, or we end up in
unresolved deadlock land), but I think we can live with that.
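
Very roughly, the inverted ordering might look like the following (nothing more
than a sketch to illustrate the idea, using today's function names; details are
not worked out):

	LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
	/* walk the slots and compute agg_xmin / agg_catalog_xmin here */
	/* let the setter take ProcArrayLock just for the update itself */
	ProcArraySetReplicationSlotXmin(agg_xmin, agg_catalog_xmin, false);
	LWLockRelease(ReplicationSlotControlLock);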


This would still allow concurrent invocations of
ReplicationSlotsComputeRequiredXmin() to come up with slightly different values,
but that's possible with the proposed patch as well, as effective_xmin is
updated without any of the other locks.  But I don't see a problem with that.

Greetings,

Andres Freund




Re: Assertion failure in SnapBuildInitialSnapshot()

2023-02-07 Thread Andres Freund
Hi,

On 2023-02-07 11:49:03 -0800, Andres Freund wrote:
> On 2023-02-01 11:23:57 +0530, Amit Kapila wrote:
> > On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada  
> > wrote:
> > >
> > > Attached updated patches.
> > >
> >
> > Thanks, Andres, others, do you see a better way to fix this problem? I
> > have reproduced it manually and the steps are shared at [1] and
> > Sawada-San also reproduced it, see [2].
> >
> > [1] - 
> > https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com
> > [2] - 
> > https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com
> 
> Hm. It's worrisome to now hold ProcArrayLock exclusively while iterating over
> the slots. ReplicationSlotsComputeRequiredXmin() can be called at a
> non-negligible frequency.  Callers like CreateInitDecodingContext(), which pass
> already_locked=true, worry me a lot less, because obviously that's not a very
> frequent operation.

Separately from this change:

I wonder if we ought to change the setup in CreateInitDecodingContext() to be a
bit less intricate. One idea:

Instead of having GetOldestSafeDecodingTransactionId() compute a value that
we then enter into a slot, from which the global horizon is then computed via
ReplicationSlotsComputeRequiredXmin(), we could have a successor to
GetOldestSafeDecodingTransactionId() change procArray->replication_slot_xmin
directly (if needed).

As long as CreateInitDecodingContext() prevents a concurrent
ReplicationSlotsComputeRequiredXmin(), by holding ReplicationSlotControlLock
exclusively, that should suffice to ensure that no "wrong" horizon was
determined / no needed rows have been removed. And we'd not need a lock nested
inside ProcArrayLock anymore.


Not sure if it's sufficiently better to be worth bothering with though :(

Greetings,

Andres Freund




Re: improving user.c error messages

2023-02-07 Thread Nathan Bossart
On Fri, Jan 27, 2023 at 03:15:07PM -0800, Nathan Bossart wrote:
> One thing that feels a bit odd is how some of the DETAILs mention the
> operation being attempted while others do not.  For example, we have
> 
>   ERROR:  permission denied to drop role
>   DETAIL: You must have SUPERUSER privilege to drop roles with SUPERUSER.
> 
> In this case, the DETAIL explains the action that is prohibited.  In other
> cases, we have something like
> 
>   ERROR:  permission denied to alter role
>   DETAIL: You must have CREATEROLE privilege and ADMIN OPTION on role 
> "myrole".
> 
> which does not.  I think this is okay because adding "to alter the role" to
> the end of the DETAIL seems kind of awkward.  But in other cases, such as
> 
>   ERROR:  permission denied to use replication slots
>   DETAIL:  You must have REPLICATION privilege.
> 
> adding the operation to the end seems less awkward (i.e., "You must have
> REPLICATION privilege to use replication slots.").  I don't think there's
> any information lost by omitting the action in the DETAIL, so perhaps this
> is just a stylistic choice.  I think I'm inclined to add the action to the
> DETAIL whenever it doesn't make the message lengthy and awkward, and leave
> it out otherwise.  Thoughts?

Here is a new patch set with this change and some other light editing.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 196c52841d19d507d695122ac5e9ecab45a908f2 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Thu, 26 Jan 2023 11:05:13 -0800
Subject: [PATCH v5 1/2] Improve user.c error messages.

---
 src/backend/commands/user.c   | 169 --
 src/test/regress/expected/create_role.out |  77 ++
 src/test/regress/expected/dependency.out  |   4 +
 src/test/regress/expected/privileges.out  |  23 ++-
 4 files changed, 198 insertions(+), 75 deletions(-)

diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
index 3a92e930c0..3d0b8b9ea6 100644
--- a/src/backend/commands/user.c
+++ b/src/backend/commands/user.c
@@ -316,23 +316,33 @@ CreateRole(ParseState *pstate, CreateRoleStmt *stmt)
 		if (!has_createrole_privilege(currentUserId))
 			ereport(ERROR,
 	(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-	 errmsg("permission denied to create role")));
+	 errmsg("permission denied to create role"),
+	 errdetail("You must have %s privilege to create roles.",
+			   "CREATEROLE")));
 		if (issuper)
 			ereport(ERROR,
 	(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-	 errmsg("must be superuser to create superusers")));
+	 errmsg("permission denied to create role"),
+	 errdetail("You must have %s privilege to create roles with %s.",
+			   "SUPERUSER", "SUPERUSER")));
 		if (createdb && !have_createdb_privilege())
 			ereport(ERROR,
 	(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-	 errmsg("must have createdb permission to create createdb users")));
+	 errmsg("permission denied to create role"),
+	 errdetail("You must have %s privilege to create roles with %s.",
+			   "CREATEDB", "CREATEDB")));
 		if (isreplication && !has_rolreplication(currentUserId))
 			ereport(ERROR,
 	(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-	 errmsg("must have replication permission to create replication users")));
+	 errmsg("permission denied to create role"),
+	 errdetail("You must have %s privilege to create roles with %s.",
+			   "REPLICATION", "REPLICATION")));
 		if (bypassrls && !has_bypassrls_privilege(currentUserId))
 			ereport(ERROR,
 	(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-	 errmsg("must have bypassrls to create bypassrls users")));
+	 errmsg("permission denied to create role"),
+	 errdetail("You must have %s privilege to create roles with %s.",
+			   "BYPASSRLS", "BYPASSRLS")));
 	}
 
 	/*
@@ -744,10 +754,18 @@ AlterRole(ParseState *pstate, AlterRoleStmt *stmt)
 	roleid = authform->oid;
 
 	/* To mess with a superuser in any way you gotta be superuser. */
-	if (!superuser() && (authform->rolsuper || dissuper))
+	if (!superuser() && authform->rolsuper)
 		ereport(ERROR,
 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("must be superuser to alter superuser roles or change superuser attribute")));
+ errmsg("permission denied to alter role"),
+ errdetail("You must have %s privilege to alter roles with %s.",
+		   "SUPERUSER", "SUPERUSER")));
+	if (!superuser() && dissuper)
+		ereport(ERROR,
+(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("permission denied to alter role"),
+ errdetail("You must have %s privilege to change the %s attribute.",
+		   "SUPERUSER", "SUPERUSER")));
 
 	/*
 	 * Most changes to a role require that you both have CREATEROLE privileges
@@ -761,13 +779,17 @@ AlterRole(ParseState *pstate, AlterRoleStmt *stmt)
 			dvalidUntil || disreplication || dbypassRLS)
 			ereport(ERROR,
 	(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-	 errmsg("permission den

Re:pg_stat_statements and "IN" conditions

2023-02-07 Thread Sergei Kornilov
Hello!

Unfortunately, a rebase is needed again due to recent changes in queryjumblefuncs
(9ba37b2cb6a174b37fc51d0649ef73e56eae27fc).

It seems a little strange to me that with const_merge_threshold = 1, such a 
test case gives the same result as with const_merge_threshold = 2:

select pg_stat_statements_reset();
set const_merge_threshold to 1;
select * from test where i in (1,2,3);
select * from test where i in (1,2);
select * from test where i in (1);
select query, calls from pg_stat_statements order by query;

query| calls 
-+---
 select * from test where i in (...) | 2
 select * from test where i in ($1)  | 1

Probably const_merge_threshold = 1 should produce only "i in (...)"?

const_merge_threshold is "the minimal length of an array" (more or equal) or 
"array .. length is larger than" (not equals)? I think the documentation is 
ambiguous in this regard.

I also noticed a typo in guc_tables.c: "Sets the minimal numer of constants in 
an array" -> number

regards, Sergei




Re: A bug in make_outerjoininfo

2023-02-07 Thread Tom Lane
Richard Guo  writes:
> In cases where we have any clauses between two outer joins, these
> clauses should be treated as degenerate clauses in the upper OJ, and
> they may prevent us from re-ordering the two outer joins.  Previously we
> have the flag 'delay_upper_joins' to help avoid the re-ordering in such
> cases.

> In b448f1c8 remove_useless_result_rtes will remove useless FromExprs and
> merge its quals up to parent.  This makes flag 'delay_upper_joins' not
> necessary any more if the clauses between the two outer joins come from
> FromExprs.  However, if the clauses between the two outer joins come
> from JoinExpr of an inner join, it seems we have holes in preserving
> ordering.

Hmm ... we'll preserve the ordering all right, but if we set commute_below
or commute_above_x bits that don't match reality then we'll have trouble
later with mis-marked varnullingrels, the same as we saw in b2d0e13a0.
I don't think you need a JoinExpr, an intermediate multi-member FromExpr
should have the same effect.

This possibility was bugging me a little bit while working on b2d0e13a0,
but I didn't have a test case showing that it was an issue, and it
doesn't seem that easy to incorporate into make_outerjoininfo's
SpecialJoinInfo-based logic.  I wonder if we could do something based on
insisting that the upper OJ's relevant "min_xxxside" relids exactly match
the lower OJ's min scope, thereby proving that there's no relevant join
of any kind between them.

The main question there is whether it'd break optimization of any cases
where we need to apply multiple OJ identities to get to the most favorable
plan.  I think not, as long as we compare the "min" relid sets not the
syntactic relid sets, but I've not done a careful analysis.

If that doesn't work, another idea could be to reformulate
make_outerjoininfo's loop as a re-traversal of the jointree, allowing
it to see intermediate plain joins directly.  However, that still leaves
me wondering what we *do* about the intermediate joins.  I don't think
we want to fail immediately on seeing one, because we could possibly
apply OJ identity 1 to get the inner join out of the way.  That is:

((A leftjoin B on (Pab)) innerjoin C on (Pac)) leftjoin D on (Pbd)

initially looks like identity 3 can't apply, but apply identity 1
first:

((A innerjoin C on (Pac)) leftjoin B on (Pab)) leftjoin D on (Pbd)

and now it works (insert usual caveat about strictness):

(A innerjoin C on (Pac)) leftjoin (B leftjoin D on (Pbd)) on (Pab)

and you can even go back the other way:

(A leftjoin (B leftjoin D on (Pbd)) on (Pab)) innerjoin C on (Pac)

So it's actually possible to push an innerjoin out of the identity-3
nest in either direction, and we don't want to lose that.
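
For anyone who wants to experiment, that starting shape corresponds to a query
along these lines (table and column names are just placeholders):

select *
from (a left join b on a.x = b.x)
     join c on a.y = c.y
     left join d on b.z = d.z;

With the explicit parentheses and SQL's left-associative join syntax, this
gives exactly the nesting shown above.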

regards, tom lane




Re: pglz compression performance, take two

2023-02-07 Thread Andres Freund
Hi,

On 2023-02-05 10:36:39 -0800, Andrey Borodin wrote:
> On Fri, Jan 6, 2023 at 10:02 PM Andrey Borodin  wrote:
> >
> > Hello! Please find attached v8.
>
> I got some interesting feedback from some patch users.
> There was an oversight that frequently yielded results that are 1,2 or
> 3 bytes longer than expected.
> Looking closer I found that the correctness of the last 3-byte tail is
> checked in two places. PFA fix for this. Previously compressed data
> was correct, however in some cases few bytes longer than the result of
> current pglz implementation.

This version fails on cfbot, due to address sanitizer:

https://cirrus-ci.com/task/4921632586727424
https://api.cirrus-ci.com/v1/artifact/task/4921632586727424/log/src/test/regress/log/initdb.log


performing post-bootstrap initialization ... 
=
==15991==ERROR: AddressSanitizer: heap-buffer-overflow on address 
0x61e02ee0 at pc 0x558e1b847b16 bp 0x7ffd35782f70 sp 0x7ffd35782f68
READ of size 1 at 0x61e02ee0 thread T0
#0 0x558e1b847b15 in pglz_hist_add 
/tmp/cirrus-ci-build/src/common/pg_lzcompress.c:310
#1 0x558e1b847b15 in pglz_compress 
/tmp/cirrus-ci-build/src/common/pg_lzcompress.c:680
#2 0x558e1aa86ef0 in pglz_compress_datum 
/tmp/cirrus-ci-build/src/backend/access/common/toast_compression.c:65
#3 0x558e1aa87af2 in toast_compress_datum 
/tmp/cirrus-ci-build/src/backend/access/common/toast_internals.c:68
#4 0x558e1ac22989 in toast_tuple_try_compression 
/tmp/cirrus-ci-build/src/backend/access/table/toast_helper.c:234
#5 0x558e1ab6af24 in heap_toast_insert_or_update 
/tmp/cirrus-ci-build/src/backend/access/heap/heaptoast.c:197
#6 0x558e1ab4a2a6 in heap_update 
/tmp/cirrus-ci-build/src/backend/access/heap/heapam.c:3533
...



Independent of this failure, I'm worried about the cost/benefit analysis of a
pglz change that changes this much at once. It's quite hard to review.


Andres




Re: SLRUs in the main buffer pool - Page Header definitions

2023-02-07 Thread Andres Freund
Hi,

On 2023-02-06 19:12:47 +, Bagga, Rishu wrote:
> Rebased patch as per latest community changes since last email. 

This version doesn't actually build.

https://cirrus-ci.com/task/4512310190931968

[19:43:20.131] FAILED: src/test/modules/test_slru/test_slru.so.p/test_slru.c.o 
[19:43:20.131] ccache cc -Isrc/test/modules/test_slru/test_slru.so.p 
-Isrc/include -I../src/include -Isrc/include/catalog -Isrc/include/nodes 
-Isrc/include/utils -Isrc/include/storage -fdiagnostics-color=always -pipe 
-D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -g -fno-strict-aliasing -fwrapv 
-fexcess-precision=standard -D_GNU_SOURCE -Wmissing-prototypes -Wpointer-arith 
-Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 
-Wcast-function-type -Wshadow=compatible-local -Wformat-security 
-Wdeclaration-after-statement -Wno-format-truncation -Wno-stringop-truncation 
-fPIC -pthread -fvisibility=hidden -MD -MQ 
src/test/modules/test_slru/test_slru.so.p/test_slru.c.o -MF 
src/test/modules/test_slru/test_slru.so.p/test_slru.c.o.d -o 
src/test/modules/test_slru/test_slru.so.p/test_slru.c.o -c 
../src/test/modules/test_slru/test_slru.c
[19:43:20.131] ../src/test/modules/test_slru/test_slru.c:47:8: error: unknown 
type name ‘SlruCtlData’
[19:43:20.131]47 | static SlruCtlData TestSlruCtlData;
[19:43:20.131]   |^~~
[19:43:20.131] ../src/test/modules/test_slru/test_slru.c:57:19: error: unknown 
type name ‘SlruCtl’
[19:43:20.131]57 | test_slru_scan_cb(SlruCtl ctl, char *filename, int 
segpage, void *data)
[19:43:20.131]   |   ^~~

...

Andres




Re: Data is copied twice when specifying both child and parent table in publication

2023-02-07 Thread Andres Freund
Hi,

On 2022-11-16 08:58:31 +, wangw.f...@fujitsu.com wrote:
> Attach the new patch set.

This patch causes several of the tests to fail. See e.g.:

https://cirrus-ci.com/task/6587624765259776

Most of the failures appear to be due to the main regression tests failing:
https://api.cirrus-ci.com/v1/artifact/task/6587624765259776/testrun/build/testrun/regress/regress/regression.diffs

diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/publication.out 
/tmp/cirrus-ci-build/build/testrun/regress/regress/results/publication.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/publication.out  
2023-02-07 20:19:34.318018729 +
+++ /tmp/cirrus-ci-build/build/testrun/regress/regress/results/publication.out  
2023-02-07 20:22:53.545223026 +
@@ -1657,7 +1657,7 @@
 SELECT * FROM pg_publication_tables;
  pubname | schemaname | tablename  | attnames | rowfilter 
 -+++--+---
- pub | sch2   | tbl1_part1 | {a}  | 
+ pub | sch2   | tbl1_part1 |  | 
 (1 row)
 
 DROP PUBLICATION pub;





Re: heapgettup refactoring

2023-02-07 Thread Melanie Plageman
On Tue, Feb 07, 2023 at 05:40:13PM +1300, David Rowley wrote:
> On Fri, 3 Feb 2023 at 15:26, David Rowley  wrote:
> > I've pushed all but the final 2 patches now.
> 
> I just pushed the final patch in the series.

Cool!

> I held back on moving the setting of rs_inited back into the
> heapgettup_initial_block() helper function as I wondered if we should
> even keep that field.
> 
> It seems that rs_cblock == InvalidBlockNumber in all cases where
> rs_inited == false, so maybe it's better just to use that as a
> condition to check if the scan has started or not. I've attached a
> patch which does that.
> 
> I ended up adjusting HeapScanDescData more than what is minimally
> required to remove rs_inited. I wondered if rs_cindex should be closer
> to rs_cblock in the struct so that we're more likely to be adjusting
> the same cache line than if that field were closer to the end of the
> struct.  We don't need rs_coffset and rs_cindex at the same time, so I
> made it a union. I see that the struct is still 712 bytes before and
> after this change. I've not yet tested to see if there are any
> performance gains to this change.
> 

I like the idea of using a union.

> diff --git a/src/backend/access/heap/heapam.c 
> b/src/backend/access/heap/heapam.c
> index 7eb79cee58..e171d6e38b 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -321,13 +321,15 @@ initscan(HeapScanDesc scan, ScanKey key, bool 
> keep_startblock)
>   }
>  
>   scan->rs_numblocks = InvalidBlockNumber;
> - scan->rs_inited = false;
>   scan->rs_ctup.t_data = NULL;
>   ItemPointerSetInvalid(&scan->rs_ctup.t_self);
>   scan->rs_cbuf = InvalidBuffer;
>   scan->rs_cblock = InvalidBlockNumber;
>  
> - /* page-at-a-time fields are always invalid when not rs_inited */
> + /*
> +  * page-at-a-time fields are always invalid when
> +  * rs_cblock == InvalidBlockNumber
> +  */

So, I was wondering what we should do about initializing rs_coffset here
since it doesn't fall under "don't initialize it because it is only used
for page-at-a-time mode". It might not be required for us to initialize
it in initscan, but we do bother to initialize other "current scan
state" fields. We could check if pagemode is enabled and initialize
rs_coffset or rs_cindex depending on that.
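
Concretely, I'm thinking of something along these lines in initscan() (nothing
more than a sketch of the idea, not a tested change):

    if (scan->rs_base.rs_flags & SO_ALLOW_PAGEMODE)
        scan->rs_cindex = 0;
    else
        scan->rs_coffset = InvalidOffsetNumber;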

Then maybe the comment should call out the specific page-at-a-time
fields that are automatically invalid? (e.g. rs_ntuples, rs_vistuples)

I presume the point of the comment is to explain why those fields are
not being initialized here, which was a question I had when I looked at
initscan(), so it seems like we should make sure it explains that.

> @@ -717,9 +720,9 @@ heapgettup_advance_block(HeapScanDesc scan, BlockNumber 
> block, ScanDirection dir
>   * the scankeys.
>   *
>   * Note: when we fall off the end of the scan in either direction, we
> - * reset rs_inited.  This means that a further request with the same
> - * scan direction will restart the scan, which is a bit odd, but a
> - * request with the opposite scan direction will start a fresh scan
> + * reset rs_cblock to InvalidBlockNumber.  This means that a further request
> + * with the same scan direction will restart the scan, which is a bit odd, 
> but
> + * a request with the opposite scan direction will start a fresh scan
>   * in the proper direction.  The latter is required behavior for cursors,
>   * while the former case is generally undefined behavior in Postgres
>   * so we don't care too much.

Not the fault of this patch, but I am having trouble parsing this
comment. What does restart the scan mean? I get that it is undefined
behavior, but it is confusing because it kind of sounds like a rescan
which is not what it is (right?). And what exactly is a fresh scan? It
is probably correct, I just don't really understand what it is saying.
Feel free to ignore this aside, as I think your change is correctly
updating the comment.

> @@ -2321,13 +2316,12 @@ heapam_scan_sample_next_block(TableScanDesc scan, 
> SampleScanState *scanstate)
>   ReleaseBuffer(hscan->rs_cbuf);
>   hscan->rs_cbuf = InvalidBuffer;
>   hscan->rs_cblock = InvalidBlockNumber;
> - hscan->rs_inited = false;
>  
>   return false;
>   }
>  
>   heapgetpage(scan, blockno);
> - hscan->rs_inited = true;
> + Assert(hscan->rs_cblock != InvalidBlockNumber);

Quite nice to have this assert.

> diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
> index 8d28bc93ef..c6efd59eb5 100644
> --- a/src/include/access/heapam.h
> +++ b/src/include/access/heapam.h
> @@ -56,9 +56,18 @@ typedef struct HeapScanDescData
>   /* rs_numblocks is usually InvalidBlockNumber, meaning "scan whole rel" 
> */
>  
>   /* scan current state */
> - boolrs_inited;  /* false = scan not init'd yet 
> */
> - OffsetNumber rs_coffset;/* current offset # in 
> non

Re: possible proposal plpgsql GET DIAGNOSTICS oid = PG_ROUTINE_OID

2023-02-07 Thread Kirk Wolak
On Tue, Feb 7, 2023 at 2:49 PM Pavel Stehule 
wrote:

> Hi
>
> I have a question about the possibility of simply getting the name of the
> currently executed function. The reason for this request is simplification
> of writing debug messages.
>
> GET DIAGNOSTICS _oid = PG_ROUTINE_OID;
> RAISE NOTICE '... % ... %', _oid, _oid::regproc::text;
>
> The advantage of this dynamic access to the function name is that the value
> is always valid and not sensitive to renaming the function or moving it
> between schemas.
>
> I am able to separate a name from the context, but it can be hard to write
> this separation really robustly. It would be very easy to enhance the GET
> DIAGNOSTICS statement to return the oid of the currently executed function.
>
> Do you think it could be a useful feature?
>

I was hoping it could be a CONSTANT like TG_OP (so the extra GET DIAGNOSTICS
isn't needed), but I have no idea of the weight of that code change.

Regardless, this concept is what we are looking for.  We prefer to leave
some debugging scaffolding in our DB Procedures, but disable it by default.
We are looking for a way to add something like this as a filter on the
level of output.

Our Current USE CASE is
  CALL LOGGING('Msg');  -- And by default nothing happens, unless we set
some session variables appropriately

We are looking for
  CALL LOGGING('Msg', __PG_ROUTINE_OID );  -- Now we can enable logging by
the routine we are interested in!

The LOGGING routine currently checks a session variable to see if logging
is even desired; if not, it exits (e.g. in production).

Now we can add a single-line check: if p_funcoid is in my list of routines
I am debugging, send the output.
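
A rough sketch of the kind of LOGGING helper meant here (every name below, the
procedure and the custom settings, is ours and nothing standard):

CREATE PROCEDURE logging(p_msg text, p_funcoid oid DEFAULT NULL)
LANGUAGE plpgsql AS $$
BEGIN
    -- do nothing unless logging was switched on for this session
    IF current_setting('app.debug_enabled', true) IS DISTINCT FROM 'on' THEN
        RETURN;
    END IF;
    -- optional per-routine filter: only log oids listed in app.debug_oids
    -- (a naive substring match, good enough for a debugging aid)
    IF p_funcoid IS NOT NULL AND
       strpos(coalesce(current_setting('app.debug_oids', true), ''),
              p_funcoid::text) = 0 THEN
        RETURN;
    END IF;
    RAISE NOTICE '[%] %', coalesce(p_funcoid::regproc::text, '?'), p_msg;
END;
$$;

-- called as, e.g.:  CALL logging('entering expensive branch', _oid);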

I will gladly work on the documentation side to help this happen!

+10




>
> The implementation should be trivial.
>
> Comments, notes?
>
> Regards
>
> Pavel
>
>
>


Re: Too coarse predicate locks granularity for B+ tree indexes

2023-02-07 Thread Thomas Munro
On Wed, Feb 8, 2023 at 5:25 AM Andrey Borodin  wrote:
> On Tue, Feb 7, 2023 at 4:01 AM Rinat Shigapov  wrote:
> > Thomas, thank you for the details!
> >
> > Have you kept the branch that you used to generate the patch? Which commit 
> > should the patch apply to?
>
> You can try something like
> git checkout 'master@{2018-05-13 13:37:00}'
> to get a commit by date from rev-parse.

I don't have time to work on this currently but if Rinat or others
want to look into it... maybe I should rebase that experiment on top
of current master.  Here's the branch:

https://github.com/macdice/postgres/tree/ssi-index-locking-refinements




Re: Too coarse predicate locks granularity for B+ tree indexes

2023-02-07 Thread Thomas Munro
On Wed, Feb 8, 2023 at 10:44 AM Thomas Munro  wrote:
> On Wed, Feb 8, 2023 at 5:25 AM Andrey Borodin  wrote:
> > On Tue, Feb 7, 2023 at 4:01 AM Rinat Shigapov  
> > wrote:
> > > Thomas, thank you for the details!
> > >
> > > Have you kept the branch that you used to generate the patch? Which 
> > > commit should the patch apply to?
> >
> > You can try something like
> > git checkout 'master@{2018-05-13 13:37:00}'
> > to get a commit by date from rev-parse.
>
> I don't have time to work on this currently but if Rinat or others
> want to look into it... maybe I should rebase that experiment on top
> of current master.  Here's the branch:
>
> https://github.com/macdice/postgres/tree/ssi-index-locking-refinements

Erm, I guess I should also post the rebased patches here, for the
mailing list archives.
From 299e2dfb54ffabe45d73e3296856132169a7419b Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Thu, 10 Jun 2021 23:35:16 +1200
Subject: [PATCH v5 1/2] Use tuple-level SIREAD locks for index-only scans.

When index-only scans manage to avoid fetching heap tuples,
previously we'd predicate-lock the whole heap page (since commit
cdf91edb).  Commits c01262a8 and 6f38d4da made it possible to lock the
tuple instead with only a TID, to match the behavior of regular index
scans and avoid some false conflicts.

Discussion: https://postgr.es/m/CAEepm%3D2GK3FVdnt5V3d%2Bh9njWipCv_fNL%3DwjxyUhzsF%3D0PcbNg%40mail.gmail.com
---
 src/backend/executor/nodeIndexonlyscan.c  | 12 --
 .../isolation/expected/index-only-scan-2.out  | 19 +
 .../isolation/expected/index-only-scan-3.out  | 20 ++
 src/test/isolation/isolation_schedule |  2 +
 .../isolation/specs/index-only-scan-2.spec| 39 +++
 .../isolation/specs/index-only-scan-3.spec| 33 
 6 files changed, 121 insertions(+), 4 deletions(-)
 create mode 100644 src/test/isolation/expected/index-only-scan-2.out
 create mode 100644 src/test/isolation/expected/index-only-scan-3.out
 create mode 100644 src/test/isolation/specs/index-only-scan-2.spec
 create mode 100644 src/test/isolation/specs/index-only-scan-3.spec

diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index 0b43a9b969..237f17b07a 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -239,12 +239,16 @@ IndexOnlyNext(IndexOnlyScanState *node)
 
 		/*
 		 * If we didn't access the heap, then we'll need to take a predicate
-		 * lock explicitly, as if we had.  For now we do that at page level.
+		 * lock explicitly, as if we had.  We don't have the inserter's xid,
+		 * but that's only used by PredicateLockTID to check if it matches the
+		 * caller's top level xid, and it can't possibly have been inserted by
+		 * us or the page wouldn't be all visible.
 		 */
 		if (!tuple_from_heap)
-			PredicateLockPage(scandesc->heapRelation,
-			  ItemPointerGetBlockNumber(tid),
-			  estate->es_snapshot);
+			PredicateLockTID(scandesc->heapRelation,
+			 tid,
+			 estate->es_snapshot,
+			 InvalidTransactionId);
 
 		return slot;
 	}
diff --git a/src/test/isolation/expected/index-only-scan-2.out b/src/test/isolation/expected/index-only-scan-2.out
new file mode 100644
index 00..c541eaeb43
--- /dev/null
+++ b/src/test/isolation/expected/index-only-scan-2.out
@@ -0,0 +1,19 @@
+Parsed test spec with 2 sessions
+
+starting permutation: r1 r2 w1 w2 c1 c2
+step r1: SELECT id1 FROM account WHERE id1 = 1;
+id1
+---
+  1
+(1 row)
+
+step r2: SELECT id2 FROM account WHERE id2 = 2;
+id2
+---
+  2
+(1 row)
+
+step w1: UPDATE account SET balance = 200 WHERE id1 = 1;
+step w2: UPDATE account SET balance = 200 WHERE id2 = 2;
+step c1: COMMIT;
+step c2: COMMIT;
diff --git a/src/test/isolation/expected/index-only-scan-3.out b/src/test/isolation/expected/index-only-scan-3.out
new file mode 100644
index 00..0a7c5955d2
--- /dev/null
+++ b/src/test/isolation/expected/index-only-scan-3.out
@@ -0,0 +1,20 @@
+Parsed test spec with 2 sessions
+
+starting permutation: r1 r2 w1 w2 c1 c2
+step r1: SELECT id1 FROM account WHERE id1 = 2;
+id1
+---
+  2
+(1 row)
+
+step r2: SELECT id2 FROM account WHERE id2 = 1;
+id2
+---
+  1
+(1 row)
+
+step w1: UPDATE account SET balance = 200 WHERE id1 = 1;
+step w2: UPDATE account SET balance = 200 WHERE id2 = 2;
+step c1: COMMIT;
+step c2: COMMIT;
+ERROR:  could not serialize access due to read/write dependencies among transactions
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index c11dc9a420..e1574ae9f7 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -17,6 +17,8 @@ test: partial-index
 test: two-ids
 test: multiple-row-versions
 test: index-only-scan
+test: index-only-scan-2
+test: index-only-scan-3
 test: predicate-lock-hot-tuple
 test: update-conflict-out
 test: deadlock-simple
diff --git a/src/test/isolation/specs/index-only-scan-2.s

Re: Non-superuser subscription owners

2023-02-07 Thread Robert Haas
On Wed, Feb 1, 2023 at 4:02 PM Andres Freund  wrote:
> On 2023-01-30 15:32:34 -0500, Robert Haas wrote:
> > I had a long think about what to do with ALTER SUBSCRIPTION ... OWNER
> > TO in terms of permissions checks.
>
> As long as owner and run-as are the same, I think it's strongly
> preferrable to *not* require pg_create_subscription.

OK.

> > Another question around ALTER SUBSCRIPTION ... OWNER TO and also ALTER
> > SUBSCRIPTION .. RENAME is whether they ought to fail if you're not a
> > superuser and password_required false is set.
>
> I don't really see a benefit in allowing it, so I'm inclined to go for
> the more restrictive option. But this is a really weakly held opinion.

I went back and forth on this and ended up with what you propose here.
It's simpler to explain this way.

> > + /* Is the use of a password mandatory? */
> > + must_use_password = MySubscription->passwordrequired &&
> > + !superuser_arg(MySubscription->owner);
>
> There's a few repetitions of this - perhaps worth putting into a helper?

I don't think so. It's slightly different each time, because it's
pulling data out of different data structures.

> This still leaks the connection on error, no?

I've attempted to fix this in v4, attached.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v4-0001-Add-new-predefined-role-pg_create_subscriptions.patch
Description: Binary data


Re: daitch_mokotoff module

2023-02-07 Thread Tomas Vondra
On 2/7/23 18:08, Paul Ramsey wrote:
> 
> 
>> On Feb 7, 2023, at 6:47 AM, Dag Lem  wrote:
>>
>> I just went by to check the status of the patch, and I noticed that
>> you've added yourself as reviewer earlier - great!
>>
>> Please tell me if there is anything I can do to help bring this across
>> the finish line.
> 
> Honestly, I had set it to Ready for Committer, but then I went to run 
> regression one more time and my regression blew up. I found I couldn't enable 
> the UTF tests without things failing. And I don't blame you! I think my 
> installation is probably out-of-alignment in some way, but I didn't want to 
> flip the Ready flag without having run everything through to completion, so I 
> flipped it back. Also, are the UTF tests enabled by default? It wasn't clear 
> to me that they were?
> 
The utf8 tests are enabled depending on the encoding returned by
getdatabaseencoding(). Systems with other encodings will simply use the
alternate .out file. And it works perfectly fine for me.

IMHO it's ready for committer.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Generating code for query jumbling through gen_node_support.pl

2023-02-07 Thread Tom Lane
Michael Paquier  writes:
> With all that in mind, I have spent my day polishing that and doing a
> close lookup, and the patch has been applied.  Thanks a lot!

I have just noticed that this patch is generating useless jumbling
code for node types such as Path nodes and other planner infrastructure
nodes.  That no doubt contributes to the miserable code coverage rating
for queryjumblefuncs.*.c, which have enough dead lines to drag down the
overall rating for all of backend/nodes/.  Shouldn't a little more
attention have been paid to excluding entire node classes if they can
never appear in Query?

regards, tom lane




Re: A bug in make_outerjoininfo

2023-02-07 Thread Tom Lane
I wrote:
> Richard Guo  writes:
>> In b448f1c8 remove_useless_result_rtes will remove useless FromExprs and
>> merge its quals up to parent.  This makes flag 'delay_upper_joins' not
>> necessary any more if the clauses between the two outer joins come from
>> FromExprs.  However, if the clauses between the two outer joins come
>> from JoinExpr of an inner join, it seems we have holes in preserving
>> ordering.

> Hmm ... we'll preserve the ordering all right, but if we set commute_below
> or commute_above_x bits that don't match reality then we'll have trouble
> later with mis-marked varnullingrels, the same as we saw in b2d0e13a0.
> I don't think you need a JoinExpr, an intermediate multi-member FromExpr
> should have the same effect.

BTW, the presented test case doesn't fail anymore after the fix
for bug #17781.  That's because build_joinrel_tlist() doesn't use
commute_above_l anymore at all, and is a bit more wary in its use of
commute_above_r.  I'm not sure that that completely eliminates this
problem, but it at least makes it a lot harder to reach.

We might want to see if we can devise a new example (or wait for
Robins to break it ;-)) before expending a lot of effort on making
the commute_xxx bits more precise.

regards, tom lane




Re: make_ctags: use -I option to ignore pg_node_attr macro

2023-02-07 Thread Tatsuo Ishii
>> The patch drops support for "-n" option :-<
>> 
>> Attached is the patch by fixing make_ctags (make_etags is not
>> touched).
>> 
>> If Exuberant-type ctags is available, use it (not changed).
>>   If Exuberant-type ctags is not available, try old ctags (not changed).
>> If the old ctags does not support "-e" option, try etags (new).
> 
>> I am not sure if this is a good way to check if ctags supports "-e" or not. 
> 
> + thenctags --version 2>&1 | grep -- -e >/dev/null
> 
> Perhaps, "--help" might be intended rather than "--version" to check
> supported options?

Yeah, that was my mistake.

>  Even so, ctags could have some other option whose name contains "-e"
> besides Emacs support, so this check could end in a wrong result.  Therefore,
> it seems to me that it is better to check immediately whether etags is
> available in case we don't have Exuberant-type ctags.

That makes sense.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: make_ctags: use -I option to ignore pg_node_attr macro

2023-02-07 Thread Tatsuo Ishii
>> I am not sure if this is a good way to check if ctags supports "-e" or not. 
>> 
>> +thenctags --version 2>&1 | grep -- -e >/dev/null
>> 
>> Perhaps, "--help" might be intended rather than "--version" to check
>> supported options?
> 
> Yeah, that was my mistake.
> 
>>  Even so, ctags could have some other option whose name contains "-e"
>> besides Emacs support, so this check could end in a wrong result.  Therefore,
>> it seems to me that it is better to check immediately whether etags is
>> available in case we don't have Exuberant-type ctags.
> 
> That makes sense.

Attached is the v2 patch.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
diff --git a/src/tools/make_ctags b/src/tools/make_ctags
index 102881667b..14089c5a7c 100755
--- a/src/tools/make_ctags
+++ b/src/tools/make_ctags
@@ -12,12 +12,15 @@ fi
 MODE=
 NO_SYMLINK=
 TAGS_FILE="tags"
+ETAGS_EXISTS=
+PROG=ctags
 
 while [ $# -gt 0 ]
 do
 	if [ $1 = "-e" ]
 	then	MODE="-e"
 		TAGS_FILE="TAGS"
+		command -v etags >/dev/null && ETAGS_EXISTS="Y"
 	elif [ $1 = "-n" ]
 	then	NO_SYMLINK="Y"
 	else
@@ -27,8 +30,10 @@ do
 	shift
 done
 
-command -v ctags >/dev/null || \
+if test -z "$MODE"
+then	(command -v ctags >/dev/null) || \
 	{ echo "'ctags' program not found" 1>&2; exit 1; }
+fi
 
 trap "ret=$?; rm -rf /tmp/$$; exit $ret" 0 1 2 3 15
 rm -f ./$TAGS_FILE
@@ -36,6 +41,22 @@ rm -f ./$TAGS_FILE
 IS_EXUBERANT=""
 ctags --version 2>&1 | grep Exuberant && IS_EXUBERANT="Y"
 
+# ctags is not exuberant and emacs mode is specified
+if [ -z "$IS_EXUBERANT" -a -n "$MODE" ]
+then
+	if [ -n "$ETAGS_EXISTS" ]
+	then
+		# etags exists. Use it.
+		PROG=etags
+	else	ctags --help 2>&1 | grep -- -e >/dev/null
+		# Note that "ctags --help" does not always work. Even certain ctags does not have the option.
+		# In that case we assume that the ctags does not support emacs mode.
+		if [ $? != 0 -a -z "$ETAGS_EXISTS" ]
+		then	echo "'ctags' does not support emacs mode and etags does not exist" 1>&2; exit 1
+		fi
+	fi
+fi
+
 # List of kinds supported by Exuberant Ctags 5.8
 # generated by ctags --list-kinds
 # --c-kinds was called --c-types before 2003
@@ -65,11 +86,16 @@ then	IGNORE_IDENTIFIES="-I pg_node_attr+"
 else	IGNORE_IDENTIFIES=
 fi
 
+if [ $PROG = "ctags" ]
+then	TAGS_OPT="-a -f"
+else	TAGS_OPT="-a -o"
+fi
+
 # this is outputting the tags into the file 'tags', and appending
 find `pwd`/ \( -name tmp_install -prune -o -name tmp_check -prune \) \
 	-o \( -name "*.[chly]" -o -iname "*makefile*" -o -name "*.mk" -o -name "*.in" \
 	-o -name "*.sql" -o -name "*.p[lm]" \) -type f -print |
-	xargs ctags $MODE -a -f $TAGS_FILE "$FLAGS" "$IGNORE_IDENTIFIES"
+	xargs $PROG $MODE $TAGS_OPT $TAGS_FILE "$FLAGS" "$IGNORE_IDENTIFIES"
 
 # Exuberant tags has a header that we cannot sort in with the other entries
 # so we skip the sort step


Re: RLS makes COPY TO process child tables

2023-02-07 Thread Stephen Frost
Greetings,

* Yugo NAGATA (nag...@sraoss.co.jp) wrote:
> On Wed, 01 Feb 2023 11:47:23 -0500
> Tom Lane  wrote:
> 
> > Yugo NAGATA  writes:
> > > Antonin Houska  wrote:
> > >> While working on [1] I noticed that if RLS gets enabled, the COPY TO 
> > >> command
> > >> includes the contents of child table into the result, although the
> > >> documentation says it should not:
> > 
> > > I think this is a bug because the current behaviour is different from
> > > the documentation. 
> > 
> > I agree, it shouldn't do that.

Yeah, I agree based on what the COPY table TO docs say should be
happening.
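
For reference, a minimal way to see the behavior under discussion (all object
names here are made up):

create table parent (a int);
create table child () inherits (parent);
insert into child values (1);
alter table parent enable row level security;
create policy p on parent for select using (true);
create role copier;
grant select on parent to copier;
set role copier;
-- per the COPY docs this should not include the row stored in child, but
-- with RLS applying it currently does, because the command gets rewritten
-- to COPY (SELECT * FROM parent) TO ...
copy parent to stdout;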

> > > When RLS is enabled on a table in `COPY ... TO ...`, the query is 
> > > converted
> > > to `COPY (SELECT * FROM ...) TO ...` to allow the rewriter to add in RLS
> > > clauses. This causes to dump the rows of child tables.
> > 
> > Do we actually say that in so many words, either in the code or docs?
> > If so, it ought to read `COPY (SELECT * FROM ONLY ...) TO ...`
> > instead.  (If we say that in the docs, then arguably the code *does*
> > conform to the docs.  But I don't see it in the COPY ref page at least.)
> 
> The documentation does not say that, but the current code actually does that.
> Also, there is the following comment in BeginCopyTo().
> 
>  * With row-level security and a user using "COPY relation TO", we
>  * have to convert the "COPY relation TO" to a query-based COPY (eg:
>  * "COPY (SELECT * FROM relation) TO"), to allow the rewriter to add
>  * in any RLS clauses.
> 
> Maybe it would be better to change the description in the comment to
> "COPY (SELECT * FROM ONLY relation) TO" when fixing the bug.

Yeah, that should also be updated.  Perhaps you'd send an updated patch
which includes fixing that too and maybe adds clarifying documentation
to COPY which mentions what happens when RLS is enabled on the relation?

I'm not sure if this makes good sense to back-patch.

Thanks,

Stephen


signature.asc
Description: PGP signature


deadlock-hard flakiness

2023-02-07 Thread Andres Freund
Hi,

On cfbot / CI, we've recently seen a lot of spurious test failures due to
src/test/isolation/specs/deadlock-hard.spec changing output. Always on
freebsd, when running tests against a pre-existing instance.

I'm fairly sure I've seen this failure on the buildfarm as well, but I'm too
impatient to wait for the buildfarm database query (it really should be
updated to use lz4 toast compression).

Example failures:

1)
https://cirrus-ci.com/task/5307793230528512?logs=test_running#L211
https://api.cirrus-ci.com/v1/artifact/task/5307793230528512/testrun/build/testrun/isolation-running/isolation/regression.diffs
https://api.cirrus-ci.com/v1/artifact/task/5307793230528512/testrun/build/testrun/runningcheck.log

2)
https://cirrus-ci.com/task/6137098198056960?logs=test_running#L212
https://api.cirrus-ci.com/v1/artifact/task/6137098198056960/testrun/build/testrun/isolation-running/isolation/regression.diffs
https://api.cirrus-ci.com/v1/artifact/task/6137098198056960/testrun/build/testrun/runningcheck.log

So far the diff always is:

diff -U3 /tmp/cirrus-ci-build/src/test/isolation/expected/deadlock-hard.out 
/tmp/cirrus-ci-build/build/testrun/isolation-running/isolation/results/deadlock-hard.out
--- /tmp/cirrus-ci-build/src/test/isolation/expected/deadlock-hard.out  
2023-02-07 05:32:34.536429000 +
+++ 
/tmp/cirrus-ci-build/build/testrun/isolation-running/isolation/results/deadlock-hard.out
2023-02-07 05:40:33.833908000 +
@@ -25,10 +25,11 @@
 step s6a7: <... completed>
 step s6c: COMMIT;
 step s5a6: <... completed>
-step s5c: COMMIT;
+step s5c: COMMIT; 
 step s4a5: <... completed>
 step s4c: COMMIT;
 step s3a4: <... completed>
+step s5c: <... completed>
 step s3c: COMMIT;
 step s2a3: <... completed>
 step s2c: COMMIT;


Commit 741d7f1047f fixed a similar issue in deadlock-hard. But it looks like
we need something more. But perhaps this isn't an output ordering issue:

How can we end up with s5c getting reported as waiting? I don't see how s5c
could end up blocking on anything?

Greetings,

Andres Freund




windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED

2023-02-07 Thread Andres Freund
Hi,

A recent cfbot run caused CI on windows to crash - on a patch that could not
conceivably cause this issue:
  https://cirrus-ci.com/task/564602116576
the patch is just:
  
https://github.com/postgresql-cfbot/postgresql/commit/dbd4afa6e7583c036b86abe2e3d27b508d335c2b

regression.diffs: 
https://api.cirrus-ci.com/v1/artifact/task/564602116576/testrun/build/testrun/regress/regress/regression.diffs
postmaster.log: 
https://api.cirrus-ci.com/v1/artifact/task/564602116576/testrun/build/testrun/regress/regress/log/postmaster.log
crash info: 
https://api.cirrus-ci.com/v1/artifact/task/564602116576/crashlog/crashlog-postgres.exe_1af0_2023-02-08_00-53-23-997.txt

0085`f03ffa40 7ff6`fd89faa8 ucrtbased!abort(void)+0x5a 
[minkernel\crts\ucrt\src\appcrt\startup\abort.cpp @ 77]
0085`f03ffa80 7ff6`fd6474dc postgres!ExceptionalCondition(
char * conditionName = 0x7ff6`fdd03ca8 
"PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED",
char * fileName = 0x7ff6`fdd03c80 
"../src/backend/storage/ipc/pmsignal.c",
int lineNumber = 0n329)+0x78 
[c:\cirrus\src\backend\utils\error\assert.c @ 67]
0085`f03ffac0 7ff6`fd676eff 
postgres!MarkPostmasterChildActive(void)+0x7c 
[c:\cirrus\src\backend\storage\ipc\pmsignal.c @ 329]
0085`f03ffb00 7ff6`fd59aa3a postgres!InitProcess(void)+0x2ef 
[c:\cirrus\src\backend\storage\lmgr\proc.c @ 375]
0085`f03ffb60 7ff6`fd467689 postgres!SubPostmasterMain(
int argc = 0n3,
char ** argv = 0x01c6`f3814e80)+0x33a 
[c:\cirrus\src\backend\postmaster\postmaster.c @ 4962]
0085`f03ffd90 7ff6`fda0e1c9 postgres!main(
int argc = 0n3,
char ** argv = 0x01c6`f3814e80)+0x2f9 
[c:\cirrus\src\backend\main\main.c @ 192]

So, somehow we ended up with a pmsignal slot for a new backend that's not
currently in PM_CHILD_ASSIGNED state.


Obviously the first idea is to wonder whether this is a problem introduced as
part of the recent postmaster-latchification work.


At first I thought we were failing to terminate running processes, due to the
following output:

parallel group (20 tests):  name char txid text varchar enum float8 regproc 
int2 boolean bit oid pg_lsn int8 int4 float4 uuid rangetypes numeric money
 boolean  ... ok  684 ms
 char ... ok  517 ms
 name ... ok  354 ms
 varchar  ... ok  604 ms
 text ... ok  603 ms
 int2 ... ok  676 ms
 int4 ... ok  818 ms
 int8 ... ok  779 ms
 oid  ... ok  720 ms
 float4   ... ok  823 ms
 float8   ... ok  628 ms
 bit  ... ok  666 ms
 numeric  ... ok 1132 ms
 txid ... ok  497 ms
 uuid ... ok  818 ms
 enum ... ok  619 ms
 money... FAILED (test process exited with exit 
code 2) 7337 ms
 rangetypes   ... ok  813 ms
 pg_lsn   ... ok  762 ms
 regproc  ... ok  632 ms


But now I realize the reason none of the other tests failed is that the
crash took a long time, presumably due to the debugger creating the above
information.


2023-02-08 00:53:20.257 GMT client backend[4584] pg_regress/rangetypes 
STATEMENT:  select '-[a,z)'::textrange;
TRAP: failed Assert("PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED"), 
File: "../src/backend/storage/ipc/pmsignal.c", Line: 329, PID: 5948
[ quite a few lines ]
2023-02-08 00:53:27.420 GMT postmaster[872] LOG:  server process (PID 5948) was 
terminated by exception 0xC354
2023-02-08 00:53:27.420 GMT postmaster[872] HINT:  See C include file 
"ntstatus.h" for a description of the hexadecimal value.
2023-02-08 00:53:27.420 GMT postmaster[872] LOG:  terminating any other active 
server processes
2023-02-08 00:53:27.434 GMT postmaster[872] LOG:  all server processes 
terminated; reinitializing
2023-02-08 00:53:27.459 GMT startup[5800] LOG:  database system was 
interrupted; last known up at 2023-02-08 00:53:19 GMT
2023-02-08 00:53:27.459 GMT startup[5800] LOG:  database system was not 
properly shut down; automatic recovery in progress
2023-02-08 00:53:27.462 GMT startup[5800] LOG:  redo starts at 0/20DCF08
2023-02-08 00:53:27.484 GMT startup[5800] LOG:  could not stat file 
"pg_tblspc/16502": No such file or directory
2023-02-08 00:53:27.484 GMT startup[5800] CONTEXT:  WAL redo at 0/

Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED

2023-02-07 Thread Thomas Munro
On Wed, Feb 8, 2023 at 2:28 PM Andres Freund  wrote:
> 2023-02-08 00:53:20.257 GMT client backend[4584] pg_regress/rangetypes 
> STATEMENT:  select '-[a,z)'::textrange;
> TRAP: failed Assert("PMSignalState->PMChildFlags[slot] == 
> PM_CHILD_ASSIGNED"), File: "../src/backend/storage/ipc/pmsignal.c", Line: 
> 329, PID: 5948

No idea what's going on yet, but this assertion failure is very
familiar to me, as one of the ways that lorikeet fails/failed (though
it hasn't failed like that since the postmaster latchification).
There it was because Cygwin's signal blocking is unreliable, so the
postmaster could start a backend, while already being in the middle of
starting a backend.  That particular problem shouldn't be possible
anymore; now we can only start backends from inside the main event
loop.  Hmm.  (State machine bug?  Some confusion about processes
caused by the fact that PID was recycled?)




Re: Add index scan progress to pg_stat_progress_vacuum

2023-02-07 Thread Imseih (AWS), Sami
Hi,

Thanks for your reply!

I addressed the latest comments in v23.

1/ cleaned up the asserts as discussed.
2/ used pq_putmessage to send the message on index scan completion.

Thanks

--
Sami Imseih
Amazon Web Services (AWS)



v23-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch
Description: v23-0001-Add-2-new-columns-to-pg_stat_progress_vacuum.-Th.patch


Re: heapgettup refactoring

2023-02-07 Thread David Rowley
On Wed, 8 Feb 2023 at 09:41, Melanie Plageman  wrote:
>
> On Tue, Feb 07, 2023 at 05:40:13PM +1300, David Rowley wrote:
> > I ended up adjusting HeapScanDescData more than what is minimally
> > required to remove rs_inited. I wondered if rs_cindex should be closer
> > to rs_cblock in the struct so that we're more likely to be adjusting
> > the same cache line than if that field were closer to the end of the
> > struct.  We don't need rs_coffset and rs_cindex at the same time, so I
> > made it a union. I see that the struct is still 712 bytes before and
> > after this change. I've not yet tested to see if there are any
> > performance gains to this change.
> >
>
> I like the idea of using a union.

Using the tests mentioned in [1], I tested out
remove_HeapScanDescData_rs_inited_field.patch. It's not looking very
promising at all.

seqscan.sql test:

master (e2c78e7ab)
tps = 27.769076 (without initial connection time)
tps = 28.155233 (without initial connection time)
tps = 26.990389 (without initial connection time)

master + remove_HeapScanDescData_rs_inited_field.patch
tps = 23.990490 (without initial connection time)
tps = 23.450662 (without initial connection time)
tps = 23.600194 (without initial connection time)

master + remove_HeapScanDescData_rs_inited_field.patch without union
HeapScanDescData change (just remove rs_inited field)
tps = 24.419007 (without initial connection time)
tps = 24.221389 (without initial connection time)
tps = 24.187756 (without initial connection time)


countall.sql test:

master (e2c78e7ab)

tps = 33.999408 (without initial connection time)
tps = 33.664292 (without initial connection time)
tps = 33.869115 (without initial connection time)

master + remove_HeapScanDescData_rs_inited_field.patch
tps = 31.194316 (without initial connection time)
tps = 30.804987 (without initial connection time)
tps = 30.770236 (without initial connection time)

master + remove_HeapScanDescData_rs_inited_field.patch without union
HeapScanDescData change (just remove rs_inited field)
tps = 32.626187 (without initial connection time)
tps = 32.876362 (without initial connection time)
tps = 32.481729 (without initial connection time)

I don't really have any explanation for why this slows performance so
much. My thoughts are that if the performance of scans is this
sensitive to the order of the fields in the struct then it's an
independent project to find out why and what we can realistically
change to get the best performance here.

> So, I was wondering what we should do about initializing rs_coffset here
> since it doesn't fall under "don't initialize it because it is only used
> for page-at-a-time mode". It might not be required for us to initialize
> it in initscan, but we do bother to initialize other "current scan
> state" fields. We could check if pagemode is enabled and initialize
> rs_coffset or rs_cindex depending on that.

Maybe master should be initialising this field already. I didn't quite
see it as important as it's never used before rs_inited is set to
true. Maybe setting it to InvalidOffsetNumber is a good idea just in
case something tries to use it before it gets set.

> >   * Note: when we fall off the end of the scan in either direction, we
> > - * reset rs_inited.  This means that a further request with the same
> > - * scan direction will restart the scan, which is a bit odd, but a
> > - * request with the opposite scan direction will start a fresh scan
> > + * reset rs_cblock to InvalidBlockNumber.  This means that a further 
> > request
> > + * with the same scan direction will restart the scan, which is a bit odd, 
> > but
> > + * a request with the opposite scan direction will start a fresh scan
> >   * in the proper direction.  The latter is required behavior for cursors,
> >   * while the former case is generally undefined behavior in Postgres
> >   * so we don't care too much.
>
> Not the fault of this patch, but I am having trouble parsing this
> comment. What does restart the scan mean? I get that it is undefined
> behavior, but it is confusing because it kind of sounds like a rescan
> which is not what it is (right?). And what exactly is a fresh scan? It
> is probably correct, I just don't really understand what it is saying.
> Feel free to ignore this aside, as I think your change is correctly
> updating the comment.

I struggled with this too. It just looks incorrect. As far as I see
it, once the scan ends we do the same thing regardless of what the
scan direction is. Maybe it's worth looking back at when that comment
was added and seeing if it was true when it was written and then see
what changed. I think it's worth improving that independently.

I think I'd like to focus on the cleanup stuff before looking into
getting rid of rs_inited. I'm not sure I'm going to get time to do the
required performance tests to look into why removing rs_inited slows
things down so much.

> Also, a few random other complaints about the comments in
> HeapScanDescData that are a

Re: Missing TAG for FEB (current) Minor Version Release

2023-02-07 Thread Justin Pryzby
On Tue, Feb 07, 2023 at 11:14:48AM +, sujit.rat...@fujitsu.com wrote:
> Hi OSS Community,
> We just wanted to confirm when the TAG will be created for the current FEB 
> minor release as we could not find the TAG for none of the minor versions,
> below is the screen shot for the some of the minor versions.
> 
> [cid:image001.png@01D93B13.7BF82E20]
> Could you please confirm when we can expect the TAG created for all minor 
> versions?

You might be interested to read this earlier question:
https://www.postgresql.org/message-id/flat/2e5676ba-e579-09a5-6f3a-d68208052654%40captnemo.in

-- 
Justin




Re: Exit walsender before confirming remote flush in logical replication

2023-02-07 Thread Kyotaro Horiguchi
I agree with the direction, and thanks for the patch.

At Tue, 7 Feb 2023 17:08:54 +, "Hayato Kuroda (Fujitsu)" 
 wrote in 
> > I noticed that previous ones are rejected by cfbot, even if they passed on 
> > my
> > environment...
> > PSA fixed version.
> 
> While analyzing further, I found another bug: a forgotten initialization.
> PSA a new version that passes the automated tests on my github repository.
> Sorry for the noise.

0002:

This patch doesn't seem to offer a means to change the default
walsender behavior.  We need a subscription option named like
"walsender_exit_mode" to do that.


+ConsumeWalsenderOptions(List *options, WalSndData *data)

I wonder if it is the right design to put options for different things
into a single list. I would rather embed the walsender option in the
syntax than need this function.

K_START_REPLICATION opt_slot opt_physical RECPTR opt_timeline opt_shutdown_mode

K_START_REPLICATION K_SLOTIDENT K_LOGICAL RECPTR opt_shutdown_mode 
plugin_options

where opt_shutdown_mode would be like "SHUTDOWN_MODE immediate".


==
If we go with the current design, I think it is better to share the
option list rule between the logical and physical START_REPLICATION
commands.

I'm not sure I like the option syntax
"exit_before_confirming=". I imagine that other options may
come in the future. Thus, how about "walsender_shutdown_mode=",
where the mode is one of "wait_flush" (default) and "immediate"?


+typedef struct
+{
+   boolexit_before_confirming;
+} WalSndData;

"Data" doesn't seem to describe what the struct holds. Why not WalSndOptions?


-   !equal(newsub->publications, MySubscription->publications))
+   !equal(newsub->publications, MySubscription->publications) ||
+   (newsub->minapplydelay > 0 && MySubscription->minapplydelay == 0) ||
+   (newsub->minapplydelay == 0 && MySubscription->minapplydelay > 0))

 I slightly prefer the following expression (Others may disagree:p):

  ((newsub->minapplydelay == 0) != (MySubscription->minapplydelay == 0))

 And I think we need a comment for the term. For example,

  /* minapplydelay affects START_REPLICATION option exit_before_confirming */


+ * Reads all entrly of the list and consume if needed.
s/entrly/entries/ ?
...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Add LZ4 compression in pg_dump

2023-02-07 Thread Justin Pryzby
On Tue, Jan 31, 2023 at 09:00:56AM +, gkokola...@pm.me wrote:
> > In my mind, three things here are misleading, because it doesn't use
> > gzip headers:
> > 
> > | GzipCompressorState, DeflateCompressorGzip, "gzip compressed".
> > 
> > This comment is about exactly that:
> > 
> > * underlying stream. The second API is a wrapper around fopen/gzopen and
> > * friends, providing an interface similar to those, but abstracts away
> > * the possible compression. Both APIs use libz for the compression, but
> > * the second API uses gzip headers, so the resulting files can be easily
> > * manipulated with the gzip utility.
> > 
> > AIUI, Michael says that it's fine that the user-facing command-line
> > options use "-Z gzip" (even though the "custom" format doesn't use gzip
> > headers). I'm okay with that, as long as that's discussed/understood.
> 
> Thank you for the input Justin. I am currently waiting for input from a
> third person to get some conclusion. I thought that it should be stated
> before my inactiveness is considered as indifference, which is not.

I'm not sure what there is to lose by making the names more accurate -
especially since they're private/internal-only.

Tomas marked himself as a committer, so maybe he could comment.

It'd be nice to also come to some conclusion about whether -Fc -Z gzip
is confusing (due to not actually using gzip).
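
To make that concrete (just a sketch; exact option spellings depend on the
patch as committed):

    pg_dump -Fc -Z gzip -f dump.custom mydb   # members compressed with libz, no gzip headers
    pg_dump -Fd -Z gzip -f dumpdir mydb       # per-file *.gz output, real gzip headers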

BTW, do you intend to merge this for v16?  I verified in earlier patch
versions that all tests pass with lz4 as the default compression method.
And checked that gzip output is compatible with before, and that old
dumps restore correctly, and there's no memory leaks or other errors.

-- 
Justin




Re: tests against running server occasionally fail, postgres_fdw & tenk1

2023-02-07 Thread Andres Freund
Hi,

On 2023-02-06 17:53:00 -0800, Andres Freund wrote:
> Another run hit an issue we've been fighting repeatedly on the buildfarm / CI:
> https://cirrus-ci.com/task/5527490404286464
> https://api.cirrus-ci.com/v1/artifact/task/5527490404286464/testrun/build/testrun/regress-running/regress/regression.diffs
>
> diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/create_index.out 
> /tmp/cirrus-ci-build/build/testrun/regress-running/regress/results/create_index.out
> --- /tmp/cirrus-ci-build/src/test/regress/expected/create_index.out   
> 2023-02-06 23:52:43.627604000 +
> +++ 
> /tmp/cirrus-ci-build/build/testrun/regress-running/regress/results/create_index.out
>2023-02-07 00:03:04.535232000 +
> @@ -1930,12 +1930,13 @@
>  SELECT thousand, tenthous FROM tenk1
>  WHERE thousand < 2 AND tenthous IN (1001,3000)
>  ORDER BY thousand;
> -  QUERY PLAN
> -------------------------------------------------------
> - Index Only Scan using tenk1_thous_tenthous on tenk1
> -   Index Cond: (thousand < 2)
> -   Filter: (tenthous = ANY ('{1001,3000}'::integer[]))
> -(3 rows)
> +  QUERY PLAN
> +------------------------------------------------------------------------------
> + Sort
> +   Sort Key: thousand
> +   ->  Index Only Scan using tenk1_thous_tenthous on tenk1
> +         Index Cond: ((thousand < 2) AND (tenthous = ANY ('{1001,3000}'::integer[])))
> +(4 rows)
>
>  SELECT thousand, tenthous FROM tenk1
>  WHERE thousand < 2 AND tenthous IN (1001,3000)
>
>
> I'd be tempted to disable the test, but it also found genuine issues in a
> bunch of CF entries, and all these test instabilities seem like ones we'd also
> see, with a lower occurrence on the buildfarm.

The last occasion we hit this was at: 
https://www.postgresql.org/message-id/1346227.1649887693%40sss.pgh.pa.us

I'm working on cleaning up the patch used for debugging in that thread, to
make VACUUM log to the server log if VERBOSE isn't used.

One thing I'm not quite sure what to do about is that we atm use a hardcoded
DEBUG2 (not controlled by VERBOSE) in a bunch of places:

ereport(DEBUG2,
        (errmsg("table \"%s\": removed %lld dead item identifiers in %u pages",
                vacrel->relname, (long long) index, vacuumed_pages)));

ivinfo.message_level = DEBUG2;

I find DEBUG2 hard to use when running the entire regression tests; it
results in a lot of output, much of it far less important than these kinds of
details here. So I'd like to use a different log level for them - but without
further complications that'd mean they'd show up in VACUUM VERBOSE.

I made them part of VERBOSE for now, not because I think that's necessarily
the right answer, but because it could be useful for debugging this stupid
flapping test.


Right now I set instrument = true when
message_level_is_interesting(DEBUG1). But that probably should be false? I set
it to true because of starttime, but it'd probably be better to just move that
out of the if (instrument). That would also require re-jiggering the condition
of the "main block" doing the logging.


FWIW, running all regression tests that support doing so against a running
server with DEBUG1 results in 8.1MB, DEBUG2 in 17MB.


Greetings,

Andres Freund
>From d627c0692e8ddf6a7759f101f30d8d1c54583778 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 7 Feb 2023 18:12:48 -0800
Subject: [PATCH v2 1/2] wip: Log VACUUM information as DEBUG1 when VERBOSE is
 not used

Otherwise it's very hard to debug some kinds of issues, as one cannot use
VACUUM VERBOSE in regression tests.

Author:
Reviewed-by:
Discussion: https://postgr.es/m/
Backpatch:
---
 src/backend/access/heap/vacuumlazy.c | 57 ++--
 1 file changed, 45 insertions(+), 12 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f38..f6f3bb4c021 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -180,6 +180,7 @@ typedef struct LVRelState
 	OffsetNumber offnum;		/* used only for heap operations */
 	VacErrPhase phase;
 	bool		verbose;		/* VACUUM VERBOSE? */
+	int			elevel;
 
 	/*
 	 * dead_items stores TIDs whose index tuples are deleted by index
@@ -324,10 +325,33 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 StartPageDirty = VacuumPageDirty;
 	ErrorContextCallback errcallback;
 	char	  **indnames = NULL;
+	int			elevel;
+
+	if (params->options & VACOPT_VERBOSE)
+	{
+		verbose = true;
+		elevel = INFO;
+		instrument = true;
+	}
+	else if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0)
+	{
+		verbose = true;
+		elevel = LOG;
+		instrument = true;
+	}
+	else if (message_level_is_interesting(DEBUG1))
+	{
+		verbose = true;
+		elevel = DEBUG1;
+		instrument = true;
+	}
+	else
+	{
+		verbose = false;
+		elevel = DEBUG5; /* avoid compiler warning, not reached */
+		instrument = false;
+	}
 
-	

Difference of ecpg japanese translation in PG15

2023-02-07 Thread Sho Kato (Fujitsu)
Hello

In PG15, the ecpg Japanese translations are different from those in the other
branches. Is there a reason for this?
If not, I think it would be better to make them the same as the other
branches, as in the attached patch.

regards,
sho kato


update-japanese-translation.patch
Description: update-japanese-translation.patch


Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED

2023-02-07 Thread Thomas Munro
On Wed, Feb 8, 2023 at 2:52 PM Thomas Munro  wrote:
> On Wed, Feb 8, 2023 at 2:28 PM Andres Freund  wrote:
> > TRAP: failed Assert("PMSignalState->PMChildFlags[slot] == 
> > PM_CHILD_ASSIGNED"), File: "../src/backend/storage/ipc/pmsignal.c", Line: 
> > 329, PID: 5948

I was wondering if commit 18a4a620 might be relevant, as it touched
the management of those slots a few months ago, but then I found a
couple of matches from 2021 in my database of build farm assertion
failures:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-08-12%2010:38:56
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-09-28%2016:40:49

Fairywren is a Windows + msys2 system, so it doesn't share Cygwin's
signal system; it's running the pure Windows code (though it's GCC
instead of MSVC and has a different libc, it uses our Windows-native
code paths and defines WIN32).




RE: Perform streaming logical transactions by background workers and parallel apply

2023-02-07 Thread wangw.f...@fujitsu.com
On Tue, Feb 7, 2023 15:37 PM Amit Kapila  wrote:
> On Tue, Feb 7, 2023 at 12:41 PM Masahiko Sawada 
> wrote:
> >
> > On Fri, Feb 3, 2023 at 6:44 PM Amit Kapila  wrote:
> >
> > > We need to think of a predictable
> > > way to test this path which may not be difficult. But I guess it would
> > > be better to wait for some feedback from the field about this feature
> > > before adding more to it and anyway it shouldn't be a big deal to add
> > > this later as well.
> >
> > Agreed to hear some feedback before adding it. It's not an urgent feature.
> >
> 
> Okay, Thanks! AFAIK, there is no pending patch left in this proposal.
> If so, I think it is better to close the corresponding CF entry.

Yes, I think so.
Closed this CF entry.

Regards,
Wang Wei


Normalization of utility queries in pg_stat_statements

2023-02-07 Thread Michael Paquier
Hi all,
(Adding Bertrand in CC.)

$subject is a follow-up of the automation of query jumbling for
utilities and DDLs, and attached is a set of patches that apply
normalization to DDL queries across the board, for all utilities.

This relies on tracking the location of A_Const nodes while removing
the values attached to the node from the query jumbling computation,
so that utility queries can be stored as normalized in
pg_stat_statements with some $N parameters.  The main case behind
doing that is of course monitoring, where we have seen some user
instances willing to get more information but seeing pg_stat_statements
as a bottleneck, because the query IDs of utility queries are based on
the computation of their strings and are value-sensitive.  That's the
case mentioned by Bertrand Drouvot for CALL and SET, where workloads
full of these easily bloat pg_stat_statements, and where we concluded
that more automation was needed in this area (so here it is):
https://www.postgresql.org/message-id/36e5bffe-e989-194f-85c8-06e7bc88e6f7%40amazon.com

For example, this makes possible the following grouping:
- CALL func(1,2); CALL func(1,3); => CALL func($1,$2)
- EXPLAIN SELECT 1; EXPLAIN SELECT 1; => EXPLAIN SELECT $1;
- CREATE MATERIALIZED VIEW aam AS SELECT 1; becomes "CREATE
MATERIALIZED VIEW aam AS SELECT $1".
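
As a rough illustration of the effect (a sketch only; the function name and
counts are made up), after the two CALL statements above one could expect
something like:

    SELECT query, calls FROM pg_stat_statements WHERE query LIKE 'CALL%';
           query       | calls
    -------------------+-------
     CALL func($1,$2)  |     2
    (1 row)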

Query jumbling for DDLs and utilities now happens automatically, but
these queries are still not represented correctly in pg_stat_statements
(one bit of documentation I missed previously refers to the fact that
these depend on their query strings, which is not the case anymore).

By the way, while looking at all that, I had really underestimated
the use of Const nodes in utilities: some queries can finish with
the same query ID even if different values are stored in the query,
yet still don't show up as normalized in pg_stat_statements, so the
current state of HEAD is not good, though you would need to use the
same object name to get a conflict for most of them.  So that's my
mistake here with 3db72eb.  If folks think that we'd better have a
revert of this automated query jumbling facility based on this
argument, that would be fine for me, as well.  The main case I have
noticed in this area is EXPLAIN, by the way.  Note that it is actually
easy to move back to the ~15 approach of having a query ID depending
on the Const node values for DDLs, by having a custom implementation
in queryjumblefuncs.c for Const nodes that applies the constant value
and doesn't store a location for normalization if the query is a
utility, once this information is stored in a JumbleState.

This rule influences various DDLs, as well, once it gets applied
across the board, and it's been some work to identify all of them, but
I think that I have caught them all as the regression database offers
all the possible patterns:
- CREATE VIEW, CTAS, CREATE MATERIALIZED VIEW which have Const nodes
depending on their attached queries, for various clauses.
- ALTER TABLE/INDEX/FOREIGN with DEFAULT, SET components.
- CREATE TABLE with partition bounds.
- BEGIN and ABORT, with transaction commands getting grouped
together.

The attached patch set includes a set of regression tests for
pg_stat_statements covering *all* the utility queries that have either
Const or A_Const nodes, so one can see the effect that all this
stuff has.  This is based on a diff of the contents of
pg_stat_statements on the regression database once all these
normalization rules are applied.

Compilation of a Const could also be made to depend on the node type.
However, none of that matters if the same normalization rules are
applied to all the queries across the board, because all the queries
would follow the same rules.  That's the critical bit IMO.  From what
I get, reducing the bloat of pg_stat_statements for utilities is
something that would be helpful for all such queries, still different
things could be done on a per-node basis.  Perhaps this is too
aggressive as it is and people don't like it, though, so feedback is
welcome.  I'd like to think that maximizing grouping is nice though,
because it leads to no actual loss of information on the workload
pattern for the queries involved, AFAIU.  This sentence may be
overoptimistic.

So, attached is a patch set, that does the following:
- 0001 is a refactoring of the regression tests of
pg_stat_statements, splitting the tests a bit.  I bumped into that
while getting confused at how the tests are now when it comes to the
handling of utilities and track_planning, where these tests silently
rely on other parts of the same file with different GUC settings.
This refactoring is useful on its own, IMO, and the tests show the
same output as previously.
- 0002 is the addition of tests in pg_stat_statements for all the DDL
and utility patterns that make use of Const and A_Const nodes.  Whether
query jumbling of utilities is done through their text string or through
their nodes, this is useful either way.
- 0003 is the code of the feature, that switches pg_stat_statements to
properly normalize utility qu

Re: windows CI failing PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED

2023-02-07 Thread Thomas Munro
On Wed, Feb 8, 2023 at 4:00 PM Thomas Munro  wrote:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-08-12%2010:38:56
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2021-09-28%2016:40:49

These were a bit different though.  They also logged "could not
reserve shared memory region".  And they don't have a user of the same
PID logging stuff immediately preceding the failure.




Re: Missing TAG for FEB (current) Minor Version Release

2023-02-07 Thread Tom Lane
Justin Pryzby  writes:
> On Tue, Feb 07, 2023 at 11:14:48AM +, sujit.rat...@fujitsu.com wrote:
>> Could you please confirm when we can expect the TAG created for all minor 
>> versions?

> You might be interested to read this earlier question:
> https://www.postgresql.org/message-id/flat/2e5676ba-e579-09a5-6f3a-d68208052654%40captnemo.in

FYI, I pushed the tags about four hours ago, following our customary
schedule.

regards, tom lane




RE: Data is copied twice when specifying both child and parent table in publication

2023-02-07 Thread wangw.f...@fujitsu.com
On Wed, Feb 8, 2023 4:29 AM Andres Freund  wrote:
> Hi,
> 
> On 2022-11-16 08:58:31 +, wangw.f...@fujitsu.com wrote:
> > Attach the new patch set.
> 
> This patch causes several of the tests to fail. See e.g.:
> 
> https://cirrus-ci.com/task/6587624765259776
> 
> Most of the failures appear to be due to the main regression tests failing:
> https://api.cirrus-
> ci.com/v1/artifact/task/6587624765259776/testrun/build/testrun/regress/regres
> s/regression.diffs
> 
> diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/publication.out
> /tmp/cirrus-ci-build/build/testrun/regress/regress/results/publication.out
> --- /tmp/cirrus-ci-build/src/test/regress/expected/publication.out2023-02-
> 07 20:19:34.318018729 +
> +++ /tmp/cirrus-ci-build/build/testrun/regress/regress/results/publication.out
>   2023-02-07 20:22:53.545223026 +
> @@ -1657,7 +1657,7 @@
>  SELECT * FROM pg_publication_tables;
>   pubname | schemaname | tablename  | attnames | rowfilter
>  -+++--+---
> - pub | sch2   | tbl1_part1 | {a}  |
> + pub | sch2   | tbl1_part1 |  |
>  (1 row)
> 
>  DROP PUBLICATION pub;

Thanks for your kind reminder and analysis.

I think this failure is caused by the recent commit (b7ae039) on the current
HEAD. Rebased the patch set and attached it.

Regards,
Wang Wei


HEAD_v16-0001-Fix-data-replicated-twice-when-specifying-publis.patch
Description:  HEAD_v16-0001-Fix-data-replicated-twice-when-specifying-publis.patch


HEAD_v16-0002-Add-clarification-for-the-behaviour-of-the-publi.patch
Description:  HEAD_v16-0002-Add-clarification-for-the-behaviour-of-the-publi.patch


REL15_v16-0001-Fix-data-replicated-twice-when-specifying-publis_patch
Description:  REL15_v16-0001-Fix-data-replicated-twice-when-specifying-publis_patch


REL14_v16-0001-Fix-data-replicated-twice-when-specifying-publis_patch
Description:  REL14_v16-0001-Fix-data-replicated-twice-when-specifying-publis_patch


Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2023-02-07 Thread Michael Paquier
On Thu, Feb 02, 2023 at 11:14:39AM +0900, Michael Paquier wrote:
> Actually, I completely forgot to take into account that there is a
> minor release planned for next week:
> https://www.postgresql.org/developer/roadmap/
> 
> So I'll hold on a bit longer here, until the next versions get their
> tags.

So, the backpatch down to 12 is straightforward, with some conflicts
in ./configure.  ~12 also handles its CFLAGS differently, with
pg_config.h.win32.  11 is more annoying because it lacks
HAVE_SYS_PROCCTL_H and would need a partial backport of f98b847.  I
am not completely sure if this could have more side effects, though,
even if the pmsignal.c part is left out.  It should not..

In the end, I have just done this stuff down to ~12; 11 does not seem
worth the trouble as the next stable version to go out of support.
I'll reduce gokiburi's script a bit as a result, until the oldest
supported version is v12.
--
Michael


signature.asc
Description: PGP signature


Re: Assertion failure in SnapBuildInitialSnapshot()

2023-02-07 Thread Amit Kapila
On Wed, Feb 8, 2023 at 1:19 AM Andres Freund  wrote:
>
> On 2023-02-01 11:23:57 +0530, Amit Kapila wrote:
> > On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada  
> > wrote:
> > >
> > > Attached updated patches.
> > >
> >
> > Thanks, Andres, others, do you see a better way to fix this problem? I
> > have reproduced it manually and the steps are shared at [1] and
> > Sawada-San also reproduced it, see [2].
> >
> > [1] - 
> > https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com
> > [2] - 
> > https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com
>
> Hm. It's worrisome to now hold ProcArrayLock exclusively while iterating over
> the slots. ReplicationSlotsComputeRequiredXmin() can be called at a
> non-negligible frequency.  Callers like CreateInitDecodingContext(), that pass
> already_locked=true worry me a lot less, because obviously that's not a very
> frequent operation.
>
> This is particularly not great because we need to acquire
> ReplicationSlotControlLock while already holding ProcArrayLock.
>
>
> But clearly there's a pretty large hole in the lock protection right now. I'm
> a bit confused about why we (Robert and I, or just I) thought it's ok to do it
> this way.
>
>
> I wonder if we could instead invert the locks, and hold
> ReplicationSlotControlLock until after ProcArraySetReplicationSlotXmin(), and
> acquire ProcArrayLock just for ProcArraySetReplicationSlotXmin().
>

Along with inverting, doesn't this mean that we need to acquire
ReplicationSlotControlLock in Exclusive mode instead of acquiring it
in shared mode? My understanding of the above locking scheme is that
in CreateInitDecodingContext(), we acquire ReplicationSlotControlLock
in Exclusive mode before acquiring ProcArrayLock in Exclusive mode and
release it after releasing ProcArrayLock. Then,
ReplicationSlotsComputeRequiredXmin() acquires
ReplicationSlotControlLock in Exclusive mode only when already_locked
is false and releases it after a call to
ProcArraySetReplicationSlotXmin(). ProcArraySetReplicationSlotXmin()
won't change.

I don't think just inverting the order without changing the lock mode
will solve the problem because still apply worker will be able to
override the replication_slot_xmin value.
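
To sketch what I mean (simplified; this is not the actual function body,
and the helper that aggregates the xmins over all slots is invented here):

    /* sketch: take the slot lock exclusively and hold it across the update */
    LWLockAcquire(ReplicationSlotControlLock, LW_EXCLUSIVE);
    compute_required_xmins_from_slots(&agg_xmin, &agg_catalog_xmin); /* invented helper */
    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
    ProcArraySetReplicationSlotXmin(agg_xmin, agg_catalog_xmin, true);
    LWLockRelease(ProcArrayLock);
    LWLockRelease(ReplicationSlotControlLock);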

-- 
With Regards,
Amit Kapila.




Re: OpenSSL 3.0.0 vs old branches

2023-02-07 Thread Michael Paquier
On Tue, Feb 07, 2023 at 01:28:26PM -0500, Tom Lane wrote:
> I double-checked this on Fedora 37 (openssl 3.0.5).  v11 and v12
> do build --with-openssl.  There are an annoyingly large number of
> -Wdeprecated-declarations warnings, but those are there in v13 too.
> I confirm that back-patching f0d2c65f17 is required and sufficient
> to make the ssl test pass.

+1.  (I am annoyed by that for any backpatch that involves v11 and
v12.)

> I think Peter's misremembering the history, and OpenSSL 3 *is*
> supported in these branches.  There could be an argument for
> not back-patching f0d2c65f17 on the grounds that pre-1.1.1 is
> also supported there.  On the whole though, it seems more useful
> today for that test to pass with 3.x than for it to pass with 0.9.8.
> And I can't see investing effort to make it do both (but if Peter
> wants to, I won't stand in the way).

Cutting support for 0.9.8 in the oldest branches would be a very risky
move, but as you say, if that only involves a failure in the SSL
tests while everything else we have still works, fine by me to
live with that.

That said, not being able to run these tests when working on an
SSL-specific patch adds an extra cost to back-patching.  There are not
many of those lately, so that may be OK, still it would mean applying
a revert of f0d2c65.  If things were to work for all the versions of
OpenSSL supported on 11 and 12, would it mean that the tests need to
store both -des and -aes256 data, with the tests switching from one to
the other depending on the version of OpenSSL being built with?
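
If it came to that, a rough sketch (not the actual test recipe; file names
and passphrase are invented) could be to keep both encodings of the
protected client key and pick one at run time:

    # encrypt the throwaway test key twice, once per cipher
    openssl rsa -in client.key -des    -passout pass:secret1 -out client-des.key
    openssl rsa -in client.key -aes256 -passout pass:secret1 -out client-aes256.key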
--
Michael


signature.asc
Description: PGP signature


  1   2   >