RE: Conflict detection and logging in logical replication

2024-08-19 Thread Zhijie Hou (Fujitsu)
On Monday, August 19, 2024 2:40 PM Amit Kapila  wrote:
> 
> On Mon, Aug 19, 2024 at 11:54 AM shveta malik 
> wrote:
> >
> > On Mon, Aug 19, 2024 at 11:37 AM Amit Kapila 
> wrote:
> > >
> > > On Mon, Aug 19, 2024 at 9:08 AM shveta malik 
> wrote:
> > > >
> > > > On Sun, Aug 18, 2024 at 2:27 PM Zhijie Hou (Fujitsu)
> > > >  wrote:
> > > > >
> > > > > Attached is the V16 patch, which addresses the comments we agreed on.
> > > > > I will add a doc patch to explain the log format after the 0001 is
> > > > > RFC.
> > > > >
> > > >
> > > > Thank you for addressing the comments. Please see this scenario:
> > > >
> > > > create table tab1(pk int primary key, val1 int unique, val2 int);
> > > >
> > > > pub: insert into tab1 values(1,1,1);
> > > > sub: insert into tab1 values(2,2,3);
> > > > pub: update tab1 set val1=2 where pk=1;
> > > >
> > > > Wrong 'replica identity' column logged? shouldn't it be pk?
> > > >
> > > > ERROR:  conflict detected on relation "public.tab1":
> > > > conflict=update_exists
> > > > DETAIL:  Key already exists in unique index "tab1_val1_key",
> > > > modified locally in transaction 801 at 2024-08-19 08:50:47.974815+05:30.
> > > > Key (val1)=(2); existing local tuple (2, 2, 3); remote tuple (1,
> > > > 2, 1); replica identity (val1)=(1).
> > > >
> > >
> > > The docs say that by default replica identity is primary_key [1]
> > > (see REPLICA IDENTITY),
> >
> > Yes, I agree. But here the importance of dumping it was to know the
> > value of the RI as well, since it is used to identify the row being
> > updated rather than the row in conflict. The value is logged
> > correctly.
> >
> 
> Agreed, sorry, I misunderstood the problem reported. I thought the suggestion
> was to use 'primary key' instead of 'replica identity' but you pointed out
> that the column used in 'replica identity' was wrong. We should fix this one.

Thanks for reporting the bug. I have fixed it and run pgindent in the V17 patch.
I also adjusted a few comments and fixed a typo.

Best Regards,
Hou zj


v17-0001-Detect-and-log-conflicts-in-logical-replication.patch
Description:  v17-0001-Detect-and-log-conflicts-in-logical-replication.patch


Re: Conflict detection and logging in logical replication

2024-08-19 Thread shveta malik
On Mon, Aug 19, 2024 at 12:09 PM Amit Kapila  wrote:
>
> On Mon, Aug 19, 2024 at 11:54 AM shveta malik  wrote:
> >
> > On Mon, Aug 19, 2024 at 11:37 AM Amit Kapila  
> > wrote:
> > >
> > > On Mon, Aug 19, 2024 at 9:08 AM shveta malik  
> > > wrote:
> > > >
> > > > On Sun, Aug 18, 2024 at 2:27 PM Zhijie Hou (Fujitsu)
> > > >  wrote:
> > > > >
> > > > > Attached is the V16 patch, which addresses the comments we agreed on.
> > > > > I will add a doc patch to explain the log format after the 0001 is
> > > > > RFC.
> > > > >
> > > >
> > > > Thank you for addressing the comments. Please see this scenario:
> > > >
> > > > create table tab1(pk int primary key, val1 int unique, val2 int);
> > > >
> > > > pub: insert into tab1 values(1,1,1);
> > > > sub: insert into tab1 values(2,2,3);
> > > > pub: update tab1 set val1=2 where pk=1;
> > > >
> > > > Wrong 'replica identity' column logged? shouldn't it be pk?
> > > >
> > > > ERROR:  conflict detected on relation "public.tab1": 
> > > > conflict=update_exists
> > > > DETAIL:  Key already exists in unique index "tab1_val1_key", modified
> > > > locally in transaction 801 at 2024-08-19 08:50:47.974815+05:30.
> > > > Key (val1)=(2); existing local tuple (2, 2, 3); remote tuple (1, 2,
> > > > 1); replica identity (val1)=(1).
> > > >
> > >
> > > The docs say that by default replica identity is primary_key [1] (see
> > > REPLICA IDENTITY),
> >
> > Yes, I agree. But here the importance of dumping it was to know the
> > value of the RI as well, since it is used to identify the row being
> > updated rather than the row in conflict. The value is logged
> > correctly.
> >
>
> Agreed, sorry, I misunderstood the problem reported.

no problem.

>I thought the
> suggestion was to use 'primary key' instead of 'replica identity' but
> you pointed out that the column used in 'replica identity' was wrong.
> We should fix this one.
>

Yes, that is what I pointed out. But let me remind you that this logic
to display both 'Key' and 'RI' was added in the latest patch. Earlier,
either 'Key' or 'RI' was logged. But since for 'update_exists' both
make sense, I suggested dumping both. The 'RI' column is dumped
correctly in all other cases except this new one. So if fixing this
wrong column name for update_exists adds more complexity, then I am
okay with skipping the 'RI' dump for this case. We're fine with just
'Key' for now, and we can address this later.

thanks
Shveta




Re: [BUG FIX]: invalid meson version detection

2024-08-19 Thread Heikki Linnakangas

On 14/08/2024 17:02, Sergey Solovev wrote:

I looked at the meson.build file and found an incorrectly used function
to determine the postgres version.


  > if pg_version.endswith('devel')
  >   pg_version_arr = [pg_version.split('devel')[0], '0']


There should be `pg_version.contains('devel')`, not `endswith`. Like this:

-if pg_version.endswith('devel')
+if pg_version.contains('devel')


I believe it's correct as it is. With "beta" and "rc" versions, the 
version strings look like "17beta1" or "16rc2", i.e. there's a number at 
the end of the string. But the "devel" version strings never have that, 
e.g. "17devel" or "18devel".


See also src/tools/version_stamp.pl

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Pgoutput not capturing the generated columns

2024-08-19 Thread Peter Smith
Hi Shubham, here are my review comments for the TAP tests patch v27-0002

==
Commit message

Tap tests for 'include-generated-columns'

~

But it's more than that -- these are the TAP tests for all
combinations of replication related to generated columns, i.e. both
with and without the 'include_generated_columns' option enabled.

==
src/test/subscription/t/011_generated.pl

I was mistaken in thinking that v27-0002 had already been refactored
according to Vignesh's last review, but it is not done yet, so I am not
going to post detailed review comments until the restructuring is
completed.

~

OTOH, there are some problems I felt have crept into v26-0001 (the TAP
test is the same as in v27-0002), so maybe try to also take care of them
(see below) in v28-0002.

In no particular order:

* I felt it is almost useless now to have the "combo"
("regress_pub_combo") publication. It used to have many tables when
you first created it, but with every version posted it is publishing
less and less, so now there are only 2 tables in it. Better to have a
specific publication for each table now and forget about "combos".

* The "TEST tab_gen_to_gen initial sync" seems not to be checking the
table data at all. Why not? Even if you expect no data, you should
test for it.

* The "TEST tab_gen_to_gen replication" seems not to be checking the
table data at all. Why not?

* Multiple XXX comments like "... it needs more study to determine if
the above result was actually correct, or a PG17 bug..." should be
removed. AFAIK we should understand the expected results for all
combinations well by now.

* The "TEST tab_order replication" is now getting an error. Now, that
may be the correct error for this situation, but in that case I think
the test is no longer testing what it was intended to test (i.e. that
column order does not matter). Probably the table definition needs
adjusting to make sure we are testing what we want to test, and not
just making some random scenario "PASS".

* The expected empty result of the "TEST tab_alter" test also seems
unhelpful. It might be related to the previous bullet.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-08-19 Thread Jelte Fennema-Nio
On Mon, 19 Aug 2024 at 05:44, Robert Haas  wrote:
> I feel like what you're really complaining about here is that libpq is
> not properly versioned. We've just been calling it libpq.so.5 forever
> instead of bumping the version number when we change stuff.

There is PQlibVersion() that can be used for this. It has existed
since 2010, so people can assume it exists.

> I just don't see why this particular change is special.

I didn't mean to say that it was, and I don't think the problem is
enormous either. I mainly meant to say that there is not just a cost
to Postgres maintainers when we introduce a new API. There's
definitely a cost to users and client authors too.

> Also, I kind of wish you had brought this argument up earlier. Maybe
> you did and I missed it, but I was under the impression that you were
> just arguing that "nobody will notice or care," which is a quite
> different argument than what you make here.

"nobody will notice or care" was definitely my argument before Jacob
responded. Since Jacob his response I realize there are two valid use
cases for PQprotocolVersion():

1. Feature detection. For this my argument still is: people won't
notice. Many people won't have bothered to use the function and
everyone else will have used >= 3 here.
2. Pinning the protocol version, because they care that the exact
protocol details are the same. Here people will have used == 3, and
thus their check will fail when we start to return a different version
from PQprotocolVersion(). But that's actually what this usecase
desires. By creating a new function, we actually break the expectation
of these people: i.e. they want the PQprotocolVersion() to return a
different version when the protocol changes.

Before Jacob responded I only considered the first case. So my
argument was indeed basically: Let's reuse this currently useless
function with the nice name, because no-one will care. But if people
thought that the risk was too high, I didn't see huge downsides to
introducing a new API either.

But **now I actually feel much more strongly about reusing the same
function**, because by introducing a new function we would actually
break the users of the second use case.
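
To make the two use cases concrete, here is a minimal sketch of both
checks in client code (illustrative only; what to print or do on a
mismatch is my own invention):

#include <stdio.h>
#include <libpq-fe.h>

static void
check_protocol(PGconn *conn)
{
    int     pv = PQprotocolVersion(conn);

    /*
     * Use case 1: feature detection -- ">= 3" keeps working even if
     * PQprotocolVersion() starts returning something newer than 3.
     */
    if (pv >= 3)
        printf("protocol 3 or newer, extended query protocol available\n");

    /*
     * Use case 2: pinning -- "== 3" intentionally starts failing the
     * moment the reported protocol version changes, which is exactly
     * what these callers want.
     */
    if (pv != 3)
        printf("exact protocol changed (now %d), refusing to proceed\n", pv);

    /*
     * PQlibVersion() (available since 2010) is the tool for detecting
     * libpq library features, as opposed to protocol features.
     */
    printf("libpq version: %d\n", PQlibVersion());
}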

P.S. The docs for PQprotocolVersion [1] have never said that this
function returns only the major protocol version. And by using the
word "Currently" they have always suggested that new return values could
be introduced later, and thus that for feature detection you should
use >= 3.

[1]: 
https://www.postgresql.org/docs/13/libpq-status.html#LIBPQ-PQPROTOCOLVERSION




Re: MultiXact\SLRU buffers configuration

2024-08-19 Thread Andrey M. Borodin



> On 5 Jul 2024, at 23:18, Andrey M. Borodin  wrote:
> 
> Alvaro, please find attached the test.
> I’ve addressed some of Michael's comments in a nearby thread: removed extra
> load, made injection point names lowercase, and fixed some grammar issues.

I’ve made several runs on GitHub to test stability [0, 1, 2, 3]. CI seems to be
stable.

Thanks!


Best regards, Andrey Borodin.

[0] https://github.com/x4m/postgres_g/commit/c9c362679f244
[1] https://github.com/x4m/postgres_g/commit/9d7e43cc1
[2] https://github.com/x4m/postgres_g/commit/18cf186617
[3] https://github.com/x4m/postgres_g/commit/4fbce73997



debug_palloc_context_threshold - dump code paths leading to memory leaks

2024-08-19 Thread Jakub Wartak
Hi -hackers,

From time to time we hit some memory leaks in PostgreSQL, and on one
occasion Tomas wrote: "I certainly agree it's annoying that when OOM
hits, we end up with no information about what used the memory. Maybe
we could have a threshold triggering a call to MemoryContextStats?" -
see [1]. I have had this on my memory/TODO list for quite some time
now, and attached is a simplistic patch providing a new
"debug_palloc_context_threshold" GUC, with the aim of helping to
investigate such memory leaks in contexts (think: mostly on the
customer side of things) and of starting a discussion about possible
better alternatives. The aim is to see which code path contributes to
the case of using e.g. more than 5GB in a single context (it will then
start dumping stats/stack traces every 100ms). The patch assumes that
most memory leaks happen within a single context, so it dumps the
stats into the log, which can later be analyzed to build a stack trace
profile. This patch is just to start a discussion, as I found that:
- pg_backend_memory_contexts view: we often cannot modify an
application to issue it from time to time.
- Scripting gdb to call MemoryContextStats(), or using the 14+
pg_log_backend_memory_contexts(), often seems impossible too, as we do
not know the PID which is going to cause it (the usual pattern is:
some app starts some SQL, sometimes OOMs, rinse and repeat).
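
To illustrate the mechanism, a minimal sketch of the idea (the GUC is
from this proposal; the helper below is my simplification of what the
attached patch does, not the patch itself):

/* sketch: called from the allocator's block-allocation path */
static TimestampTz last_dump_time = 0;

static void
maybe_dump_context_stats(MemoryContext context)
{
    TimestampTz now;

    if (debug_palloc_context_threshold <= 0)    /* GUC, in kB */
        return;

    if (context->mem_allocated <
        (Size) debug_palloc_context_threshold * 1024)
        return;

    /* rate-limit the dumps to one per 100ms */
    now = GetCurrentTimestamp();
    if (!TimestampDifferenceExceeds(last_dump_time, now, 100))
        return;
    last_dump_time = now;

    MemoryContextStats(context);    /* writes the stats to stderr/log */
}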

My doubts are the following:

1. Is dumping on a per-context threshold enough? That means it is not
a per-backend-RSS threshold. My problem is that palloc() would have to
recursively collect context->mem_allocated for every context out there
starting from TopMemoryContext on every run, right? (Or some global
form of memory accounting would have to be introduced to avoid such a
slowdown of palloc(). What's worse, there seems to be no standardized
OS-agnostic way of getting RSS in C/POSIX.)

2. Is such stack trace reporting - see example in [2] - enough? Right
now it is multi-line, but maybe we could have our own version of
certain routines so that it would report all in one line? Like, e.g.:
2024-08-19 09:11:04.664 CEST [9317] LOG:  dumping memory context stats
for context "Caller tuples": Grand total: 25165712 bytes in 13 blocks;
1705672 free (0 chunks); 23460040 used; errbacktrace+0x46 <- +0x66c7e5
<- palloc+0x4a <- datumCopy+0x30 <- tuplesort_putdatum+0x8f <-
+0x37bf26 <- [..long list..]

The advantage here would be that, with that sampling every XXX ms, we
could extract and statistically investigate which code path requests
most of the memory, even by using simple grep(1) / sort(1) / uniq(1).

3. In an off-list discussion, Tomas pointed out to me that "it's not very
clear the place where we exceed the limit is necessarily the place
where we leak memory" (that palloc() might later be pfree()'d).

4. Another idea (by Tomas) is related to the static threshold. Maybe
we want it to be dynamic, e.g. adding +1..50% on every such call, so as
to catch only growing memory usage, not just anything above
debug_palloc_context_threshold.

Anything else? What other experiences may be relevant here?

-J.

[1]
https://www.postgresql.org/message-id/9574e4b6-3334-777b-4287-29d81151963a%40enterprisedb.com

[2] - sample use and log:

create table zz as select 'AAA' || (i % 100) as id from
generate_series(100, 300) i;
set work_mem to '1GB';
set max_parallel_workers_per_gather to 0;
set enable_hashagg to off;
set debug_palloc_context_threshold to '1 MB';
select id, count(*) from zz group by id  having count(*) > 1;


2024-08-19 09:11:04.664 CEST [9317] LOG:  dumping memory context stats
2024-08-19 09:11:04.664 CEST [9317] DETAIL:  Context "Caller tuples".
2024-08-19 09:11:04.664 CEST [9317] BACKTRACE:
postgres: postgres postgres [local] SELECT(errbacktrace+0x46)
[0x55ac9ab06e56]
postgres: postgres postgres [local] SELECT(+0x66c7e5) [0x55ac9ab2f7e5]
postgres: postgres postgres [local] SELECT(palloc+0x4a) [0x55ac9ab2fd3a]
postgres: postgres postgres [local] SELECT(datumCopy+0x30)
[0x55ac9aa10cf0]
postgres: postgres postgres [local]
SELECT(tuplesort_putdatum+0x8f) [0x55ac9ab4032f]
postgres: postgres postgres [local] SELECT(+0x37bf26) [0x55ac9a83ef26]
[..]
2024-08-19 09:11:04.664 CEST [9317] STATEMENT:  select id, count(*)
from zz group by id  having count(*) > 1;
2024-08-19 09:11:04.664 CEST [9317] LOG:  level: 0; Caller tuples:
25165712 total in 13 blocks; 1705672 free; 23460040 used
2024-08-19 09:11:04.664 CEST [9317] LOG:  Grand total: 25165712 bytes
in 13 blocks; 1705672 free (0 chunks); 23460040 used
//100ms later:
2024-08-19 09:11:04.764 CEST [9317] LOG:  dumping memory context stats
2024-08-19 09:11:04.764 CEST [9317] DETAIL:  Context "Caller tuples".
2024-08-19 09:11:04.764 CEST [9317] BACKTRACE:
postgres: postgres postgres [local] SELECT(errbacktrace+0x46)
[0x55ac9ab06e56]
postgres: postgres postgres [local] SELECT(+0x66c7e5) [0x55ac9ab2f7e5]
postgres: postgres postgres [local] SELECT(palloc+0x4a) [0x55ac9ab

[Bug Fix]standby may crash when switching-over in certain special cases

2024-08-19 Thread px shi
Hi, hackers,

I've recently encountered an issue where a standby crashes when
reconnecting to a new primary after a switchover under certain conditions.
Here's the procedure of the crash scenario:


1) We have three instances: one primary and two standbys (s1 and s2, both
using streaming replication).


2) The primary crashed when the standby’s `flushed_lsn` was at the
beginning of a WAL segment (e.g., `0/2200`). Both s1 and s2 logged the
following:

   ```

   FATAL: could not connect to the primary server...

   LOG: waiting for WAL to become available at 0/22ED

   ```


3) s1 was promoted to the new primary; s1 logged the following:

   ```

   LOG: received promote request

   LOG: redo done at 0/21FFFEE8

   LOG: selected new timeline ID: 2

   ```


4) s2's `primary_conninfo` was updated to point to s1; s2 logged the
following:

   ```

   LOG: received SIGHUP, reloading configuration files

   LOG: parameter "primary_conninfo" changed to ...

   ```


5) s2 began replication with s1 and attempted to fetch `0/2200` on
timeline 2; s2 logged the following:

   ```

   LOG: fetching timeline history file for timeline 2 from primary server

   FATAL: could not start WAL streaming: ERROR: requested starting point
0/2200 on timeline 1 is not this server's history

   DETAIL: This server's history forked from timeline 1 at 0/21E0.

   LOG: started streaming WAL from primary at 0/2200 on timeline 2

   ```


6) A WAL record mismatch caused the walreceiver process to terminate; s2
logged the following:

   ```

   LOG: invalid contrecord length 10 (expected 213) at 0/21E0

   FATAL: terminating walreceiver process due to administrator command

   ```


7) s2 then attempted to fetch `0/2100` on timeline 2. However, the
startup process tried to open the WAL file before it had been created,
leading to a crash:

   ```

   PANIC: could not open file "pg_wal/00020021": No such
file or directory

   LOG: startup process was terminated by signal 6: Aborted

   ```


In this scenario, s2 attempts replication twice. First, it starts from
`0/2200` on timeline 2, setting `walrcv->flushedUpto` to `0/2200`.
But when a mismatch occurs, the walreceiver process is terminated. On the
second attempt, replication starts from `0/2100` on timeline 2. The
startup process expects the WAL file to exist because WalRcvFlushRecPtr()
returns `0/2200`, but the file is not found, causing s2's startup
process to crash.

I think we should check recptr against walrcv->flushedUpto when
requesting XLOG streaming, so I've created a patch for it.
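
A minimal sketch of the idea, relative to the existing initialization
logic in RequestXLogStreaming() in walreceiverfuncs.c (the actual patch
may differ in details):

/*
 * Sketch only: besides the first-startup and timeline-change cases,
 * also reinitialize flushedUpto when the requested start point is
 * older than what a previous (failed) attempt left behind, so that
 * WalRcvFlushRecPtr() cannot report WAL that was never flushed.
 */
if (walrcv->receiveStart == 0 ||
    walrcv->receivedTLI != tli ||
    recptr < walrcv->flushedUpto)
{
    walrcv->flushedUpto = recptr;
    walrcv->receivedTLI = tli;
    walrcv->latestChunkStart = recptr;
}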


Best regards,
pixian shi


0001-Fix-walreceiver-set-incorrect-flushedUpto-when-switc.patch
Description: Binary data


Re: Asymmetric partition-wise JOIN

2024-08-19 Thread Andrei Lepikhov

On 1/8/2024 20:56, Alexander Korotkov wrote:

On Tue, Apr 2, 2024 at 6:07 AM Andrei Lepikhov
 wrote:
Actually, the idea I tried to express is the combination of #1 and #2:
to build an individual plan for every partition, but consider the 'Common
Resources'.  Let me explain this a bit more.

Thanks for keeping your eye on it!

My idea is to introduce a new property for path selection.
3) Usage of common resources.  The common resource can be: a hash
representation of a relation, a memoize node over a relation scan, etc.
We can exclude the cost of common resource generation from the path
cost, but keep the reference to the common resource with its generation
cost.  If one path uses more common resources than another path, it
could cost-dominate the other only if it's cheaper together with its
extra common resources cost.  If one path uses fewer or equal common
resources than another, it could normally cost-dominate the other.
The most challenging part for me is the cost calculation, which is 
bound up with the estimations of other paths. To correctly estimate the 
effect, we need to remember at least the whole number of paths sharing 
resources.
Also, I wonder if it can cause some corner cases where a prediction error 
on a shared resource will cause an even worse situation upstream.
I think we could start here from an example and a counter-example, 
but I still can't find them.
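
For what it's worth, a rough sketch of the dominance rule quoted above,
with entirely hypothetical names (none of these structures or helpers
exist in the planner today):

typedef struct SharedResource
{
    Cost        generation_cost;    /* cost to build the hash, memoize, ... */
} SharedResource;

/*
 * Would a path with cost a_cost, using the shared resources in
 * a_resources, cost-dominate a path with cost b_cost using b_resources?
 */
static bool
path_dominates(Cost a_cost, List *a_resources,
               Cost b_cost, List *b_resources)
{
    Cost        a_extra = 0;
    ListCell   *lc;

    /* Charge A for the shared resources it uses that B does not. */
    foreach(lc, a_resources)
    {
        SharedResource *res = (SharedResource *) lfirst(lc);

        if (!list_member_ptr(b_resources, res))
            a_extra += res->generation_cost;
    }

    return a_cost + a_extra <= b_cost;
}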



However, I understand this is a huge amount of work given we have to
introduce new basic optimizer concepts.  I get that the main
application of this patch is sharding.  If we have global tables
residing on each shard, we can push down any joins with them.  Given this
patch gives some optimization for the non-sharded case, I think we
*probably* can accept its concept even though this optimization is
obviously not perfect.
Yes, right now sharding is the most profitable case. We can push down 
parts of the plan which reference only some common resources: 
FunctionScan, ValuesScan, tables which can be proven to exist 
everywhere and provide the same output. But for now it is too far from 
the core code, IMO. So, I am searching for cases that can be helpful for 
a single instance.


--
regards,
Andrei Lepikhov
Postgres Professional





Re: Drop database command will raise "wrong tuple length" if pg_database tuple contains toast attribute.

2024-08-19 Thread Yugo Nagata
On Mon, 19 Aug 2024 00:35:39 +0200
Tomas Vondra  wrote:

> On 8/16/24 13:26, Tomas Vondra wrote:
> > Hi Ayush,
> > 
> > ...
> > 
> > So this fix seems reasonable.
> > 
> 
> I've pushed this to all affected branches, except for 11 which is EOL.
> 
> I thought about adding a test, but I couldn't think of a TAP test where
> this would really fit, and it didn't seem very practical to have a test
> creating hundreds of roles. So I abandoned the idea.

I tried to add an Assert in heap_inplace_update() to prevent possible similar
failures, but I gave up because I could not find a good way to determine
whether a tuple is detoasted or not.

By the way, I found a comment in vac_update_datfrozenxid() and
EventTriggerOnLogin() that explains why we could not use tuples from the
syscache for heap_inplace_update(). I think it is better to add the same
comment in dropdb(). I attached a trivial patch for it.


Regards,
Yugo Nagata

-- 
Yugo Nagata 
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 7a7e2c6e3e..d00ae40e19 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -1790,7 +1790,9 @@ dropdb(const char *dbname, bool missing_ok, bool force)
 	pgstat_drop_database(db_id);
 
 	/*
-	 * Update the database's pg_database tuple
+	 * Get the pg_database tuple to scribble on.  Note that this does not
+	 * directly rely on the syscache to avoid issues with flattened toast
+	 * values for the in-place update.
 	 */
 	ScanKeyInit(&scankey,
 Anum_pg_database_datname,


Re: gitmaster server problem?

2024-08-19 Thread Jelte Fennema-Nio
On Sat, 17 Aug 2024 at 22:05, Joe Conway  wrote:
> Just to close this out -- problem found and fixed about an hour ago.
> Sorry for the interruption.

Whatever the problem was, it seems to have returned.




Re: Apply PGDLLIMPORT markings to some GUC variables

2024-08-19 Thread Peter Eisentraut

On 19.08.24 08:18, Michael Paquier wrote:

On Tue, Aug 13, 2024 at 03:00:00PM -0400, Robert Haas wrote:

This seems correct to me.


It is not the first time that this happens in recent history.  Would
it be worth automating that?  I would guess a TAP test that takes a
copy of the headers, applies the changes while filtering the few
exceptions, then compares it to the origin in the tree?


Probably worth thinking about, but it would at least require some kind 
of exclude list, because there are exceptions like 
src/include/common/logging.h.






Re: Vacuum statistics

2024-08-19 Thread jian he
In pg_stats_vacuum:

if (type == PGSTAT_EXTVAC_INDEX || type == PGSTAT_EXTVAC_HEAP)
{
    Oid         relid = PG_GETARG_OID(1);

    /* Load table statistics for specified database. */
    if (OidIsValid(relid))
    {
        tabentry = fetch_dbstat_tabentry(dbid, relid);
        if (tabentry == NULL || tabentry->vacuum_ext.type != type)
            /* Table doesn't exist or isn't a heap relation. */
            PG_RETURN_NULL();

        tuplestore_put_for_relation(relid, rsinfo, tabentry);
    }
    else
    {
        ...
    }


So for the functions pg_stat_vacuum_indexes and pg_stat_vacuum_tables,
it seems you didn't check "relid"'s relkind; you may need to use
get_rel_relkind().
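
Something along these lines, perhaps (a sketch; which relkinds to accept
is up to the patch author; get_rel_relkind() itself is the existing
helper from lsyscache.h):

char        relkind = get_rel_relkind(relid);

/* e.g. for pg_stat_vacuum_tables, accept only plain heap tables */
if (relkind != RELKIND_RELATION)
    PG_RETURN_NULL();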




Re: Conflict detection and logging in logical replication

2024-08-19 Thread shveta malik
On Mon, Aug 19, 2024 at 12:32 PM Zhijie Hou (Fujitsu)
 wrote:
>
>
> Thanks for reporting the bug. I have fixed it and run pgindent in the V17 patch.
> I also adjusted a few comments and fixed a typo.
>

Thanks for the patch. Re-tested it, all scenarios seem to work well now.

I see that this version has a new header inclusion in conflict.c, due to
which I think the "catalog/index.h" inclusion is now redundant. Please
recheck and remove it if so.
Also, there are a few long lines in conflict.c (see lines 408, 410).

Rest looks good.

thanks
Shveta




Remove redundant NULL check in clause_selectivity_ext() function

2024-08-19 Thread Ilia Evdokimov

Hi everyone,


In the `clause_selectivity_ext()` function, there’s a comment regarding 
a NULL clause check that asks, "can this still happen?". I decided to 
investigate whether this scenario is still valid.


Here’s what I found after reviewing the relevant cases:

- approx_tuple_count(): The function iterates over a loop of other clauses.
- get_foreign_key_join_selectivity(): The function is invoked within an 
`if (clause)` condition.
- consider_new_or_clause(): The clause is generated by 
`make_restrictinfo()`, which never returns NULL.
- clauselist_selectivity_ext(): This function either checks 
`list_length(clauses) == 1` before being called, or it is called within 
a loop of clauses.


In other cases, the function is called recursively from 
`clause_selectivity_ext()`.


If you are aware of any situations where a NULL clause could be passed 
or if I’ve missed anything, please let me know. I’m also open to any 
other suggestions.


--
Regards,
Ilia Evdokimov,
Tantor Labs LLC.
From fab0575ef1350cb700b6fc079230397ecb5ca19d Mon Sep 17 00:00:00 2001
From: Ilia Evdokimov 
Date: Mon, 19 Aug 2024 12:08:53 +0300
Subject: [PATCH] Remove redundant NULL check for clause during selectivity
 calculation

The function clause_selectivity_ext() takes a Node *clause parameter,
which is either iterated over in a loop with other clauses or passed
as a RestrictInfo from other functions.
Since these functions guarantee that clause cannot be NULL,
the NULL check is unnecessary and has been removed.
---
 src/backend/optimizer/path/clausesel.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 0ab021c1e8..5992f96ed2 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -692,9 +692,6 @@ clause_selectivity_ext(PlannerInfo *root,
 	RestrictInfo *rinfo = NULL;
 	bool		cacheable = false;
 
-	if (clause == NULL)			/* can this still happen? */
-		return s1;
-
 	if (IsA(clause, RestrictInfo))
 	{
 		rinfo = (RestrictInfo *) clause;
-- 
2.34.1



Re: thread-safety: gmtime_r(), localtime_r()

2024-08-19 Thread Peter Eisentraut

On 16.08.24 23:01, Thomas Munro wrote:

On Sat, Aug 17, 2024 at 3:43 AM Peter Eisentraut  wrote:

I moved the _POSIX_C_SOURCE definition for MinGW from the header file to
a command-line option (-D_POSIX_C_SOURCE).  This matches the treatment
of _GNU_SOURCE and similar.

I was trying to figure out what else -D_POSIX_C_SOURCE does to MinGW.
Enables __USE_MINGW_ANSI_STDIO, apparently, but I don't know if we
were using that already, or if it matters.  I suppose if it ever shows
up as a problem, we can explicitly disable it.

. o O ( MinGW is a strange beast.  Do we want to try to keep the code
it runs as close as possible to what is used by MSVC?  I thought so,
but we can't always do that due to missing interfaces (though I
suspect that many #ifdef _MSC_VER tests are based on ancient versions
and now bogus).  But it also offers ways to be more POSIX-y if we
want, and then we have to decide whether to take them, and make it
more like a separate platform with different quirks... )


Yeah, ideally we'd keep it aligned with MSVC.  But a problem here is 
that if _POSIX_C_SOURCE (or _GNU_SOURCE or something like that) gets 
defined for other reasons, then there would be conflicts between the 
system headers and our workaround #define's.  At least plpython triggers 
such a conflict in my testing.  This is the usual problem that we also 
have with _GNU_SOURCE in other contexts.






Re: Drop database command will raise "wrong tuple length" if pg_database tuple contains toast attribute.

2024-08-19 Thread Tomas Vondra
On 8/19/24 11:01, Yugo Nagata wrote:
> On Mon, 19 Aug 2024 00:35:39 +0200
> Tomas Vondra  wrote:
> 
>> On 8/16/24 13:26, Tomas Vondra wrote:
>>> Hi Ayush,
>>>
>>> ...
>>>
>>> So this fix seems reasonable.
>>>
>>
>> I've pushed this to all affected branches, except for 11 which is EOL.
>>
>> I thought about adding a test, but I couldn't think of a TAP test where
>> this would really fit, and it didn't seem very practical to have a test
>> creating hundreds of roles. So I abandoned the idea.
> 
> I tried to add an Assert in heap_inplace_update() to prevent possible similar
> failures, but I gave up because I could not find a good way to determine
> whether a tuple is detoasted or not.
> 

Right, not sure there's a good way to check for that.

> By the way, I found a comment in vac_update_datfrozenxid() and
> EventTriggerOnLogin() that explains why we could not use tuples from the
> syscache for heap_inplace_update(). I think it is better to add the same
> comment in dropdb(). I attached a trivial patch for it.
> 

Agreed. That seems like a nice improvement to the comment.


regards

-- 
Tomas Vondra




Re: define PG_REPLSLOT_DIR

2024-08-19 Thread Ashutosh Bapat
On Wed, Aug 14, 2024 at 5:02 PM Bertrand Drouvot
 wrote:
>
> Hi hackers,
>
> while working on a replication slot tool (idea is to put it in contrib, not
> shared yet), I realized that "pg_replslot" is being used > 25 times in
> .c files.
>
> I think it would make sense to replace those occurrences with $SUBJECT;
> attached is a patch doing so.

Many of these places are slot-specific directory/file names within
pg_replslot. I think we can further improve the code by creating a macro
along the lines of LSN_FORMAT_ARGS:
#define SLOT_DIRNAME_ARGS(slotname) PG_REPLSLOT_DIR, (slotname)
This way we "codify" the method to construct the slot directory name
everywhere. We may add another macro
#define SLOT_DIRNAME_FORMAT "%s/%s"
to further enforce the same. But I didn't find a similar LSN_FORMAT
macro defined as "%X/%X". I don't remember exactly why we didn't add it.
Maybe because of trouble with translations.
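
A quick sketch of how the two suggested macros would fit together (note
that SLOT_DIRNAME_ARGS must not be wrapped in an extra pair of outer
parentheses, since it has to expand to two separate function arguments):

#define SLOT_DIRNAME_FORMAT "%s/%s"
#define SLOT_DIRNAME_ARGS(slotname) PG_REPLSLOT_DIR, (slotname)

/* call sites then become, e.g.: */
snprintf(path, sizeof(path), SLOT_DIRNAME_FORMAT,
         SLOT_DIRNAME_ARGS(slotname));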

-- 
Best Wishes,
Ashutosh Bapat




Re: Conflict detection and logging in logical replication

2024-08-19 Thread Amit Kapila
On Mon, Aug 19, 2024 at 3:03 PM shveta malik  wrote:
>
> On Mon, Aug 19, 2024 at 12:32 PM Zhijie Hou (Fujitsu)
>  wrote:
> >
> >
> > Thanks for reporting the bug. I have fixed it and run pgindent in the V17 patch.
> > I also adjusted a few comments and fixed a typo.
> >
>
> Thanks for the patch. Re-tested it, all scenarios seem to work well now.
>
> I see that this version has a new header inclusion in conflict.c, due to
> which I think the "catalog/index.h" inclusion is now redundant. Please
> recheck and remove it if so.
>

This is an extra include, so removed in the attached. Additionally, I
have modified a few comments and the commit message.

> Also, there are a few long lines in conflict.c (see lines 408, 410).
>

I have left these as they are because pgindent doesn't complain about them.

> Rest looks good.
>

Thanks for the review and testing.

-- 
With Regards,
Amit Kapila.


v18-0001-Log-the-conflicts-while-applying-changes-in-logi.patch
Description: Binary data


Re: [PROPOSAL] : Disallow use of empty column name in (column_name '') in ALTER or CREATE of foreign table.

2024-08-19 Thread Ashutosh Bapat
On Fri, Aug 16, 2024 at 8:26 PM Tom Lane  wrote:
>
> Nishant Sharma  writes:
> > Actual column names used while creation of foreign table are not allowed to
> > be an
> > empty string, but when we use column_name as an empty string in OPTIONS
> > during
> > CREATE or ALTER of foreign tables, it is allowed.
>
> Is this really a bug?  The valid remote names are determined by
> whatever underlies the FDW, and I doubt we should assume that
> SQL syntax restrictions apply to every FDW.  Perhaps it would
> be reasonable to apply such checks locally in SQL-based FDWs,
> but I object to assuming such things at the level of
> ATExecAlterColumnGenericOptions.

I agree.

>
> More generally, I don't see any meaningful difference between
> this mistake and the more common one of misspelling the remote
> column name, which is something we're not going to be able
> to check for (at least not in anything like this way).  If
> you wanted to move the ease-of-use goalposts materially,
> you should be looking for a way to do that.

I think this check should be delegated to an FDW validator.

-- 
Best Wishes,
Ashutosh Bapat




Re: Use read streams in pg_visibility

2024-08-19 Thread Nazir Bilal Yavuz
Hi,

On Mon, 19 Aug 2024 at 09:30, Michael Paquier  wrote:
>
> On Tue, Aug 13, 2024 at 03:22:27PM +0300, Nazir Bilal Yavuz wrote:
> > Hi,
> >
> > I am working on using the read stream in pg_visibility. There are two
> > places to use it:
> >
> > 1- collect_visibility_data()
> >
> > This one is straightforward. I created a read stream object if
> > 'include_pd' is true because read I/O is done when 'include_pd' is
> > true. There is ~4% timing improvement with this change. I started the
> > server with the default settings and created a 6 GB table. Then I ran
> > pg_visibility() 100 times, clearing the OS cache between each run.
> > --
>
> Mind sharing a script for reproducibility?  Except for the drop_caches
> part, of course..

Sure, the script is attached.
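
For context, the collect_visibility_data() conversion follows the
general read stream pattern along these lines (a sketch against the
PG17 read stream API; the callback and its state are illustrative, not
the actual patch):

typedef struct SketchState
{
    BlockNumber next;
    BlockNumber nblocks;
} SketchState;

/* return block numbers 0..nblocks-1, then InvalidBlockNumber */
static BlockNumber
sketch_cb(ReadStream *stream, void *callback_private_data,
          void *per_buffer_data)
{
    SketchState *s = (SketchState *) callback_private_data;

    if (s->next >= s->nblocks)
        return InvalidBlockNumber;
    return s->next++;
}

...
stream = read_stream_begin_relation(READ_STREAM_FULL, bstrategy, rel,
                                    MAIN_FORKNUM, sketch_cb, &state, 0);
while ((buffer = read_stream_next_buffer(stream, NULL)) != InvalidBuffer)
{
    /* ... examine the page, as the per-block loop did before ... */
    ReleaseBuffer(buffer);
}
read_stream_end(stream);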

-- 
Regards,
Nazir Bilal Yavuz
Microsoft


setup.sh
Description: application/shellscript


Re: possible issue in postgres_fdw batching

2024-08-19 Thread Tomas Vondra
On 8/19/24 01:40, Tom Lane wrote:
> Tomas Vondra  writes:
>> Consider this simplified example:
> 
>>   CREATE TABLE t (a INT);
>>   CREATE FOREIGN TABLE f (a INT) SERVER loopback
>>  OPTIONS (table_name 't');
>>   CREATE FUNCTION counter() RETURNS int LANGUAGE sql AS
>>   $$ SELECT COUNT(*) FROM f $$;
> 
>> And now do
> 
>>   INSERT INTO f SELECT counter() FROM generate_series(1,100);
> 
>> With batch_size=1 this produces a nice sequence, with batching we start
>> to get duplicate values - which is not surprising, as the function is
>> oblivious to the data still in the buffer.
> 
>> The first question is if this is a bug.
> 
> I'd say no; this query is unduly chummy with implementation details.
> Even with no foreign table in the picture at all, we would be fully
> within our rights to execute the SELECT to completion before any
> of the insertions became visible.  (Arguably, it's the current
> behavior that is surprising, not that one.)
> 

Thanks, that's a great point. It didn't occur to me that we're entitled to
just run the SELECT to completion before proceeding to the INSERT. That
indeed means this function is fundamentally unsafe.


regards

-- 
Tomas Vondra




Re: gitmaster server problem?

2024-08-19 Thread Joe Conway

On 8/19/24 05:04, Jelte Fennema-Nio wrote:

On Sat, 17 Aug 2024 at 22:05, Joe Conway  wrote:

Just to close this out -- problem found and fixed about an hour ago.
Sorry for the interruption.


Whatever the problem was it seems to have returned



What specifically? I am not seeing anything at the moment.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Avoiding superfluous buffer locking during nbtree backwards scans

2024-08-19 Thread Matthias van de Meent
On Sun, 11 Aug 2024 at 21:44, Peter Geoghegan  wrote:
>
> On Tue, Aug 6, 2024 at 6:31 PM Matthias van de Meent
>  wrote:
> > +1, LGTM.
> >
> > This changes the backward scan code in _bt_readpage to have an
> > approximately equivalent handling as the forward scan case for
> > end-of-scan cases, which is an improvement IMO.

Here's a new patch that further improves the situation, so that we
don't try to re-lock the buffer we just accessed when we're stepping
backward in index scans, reducing buffer lock operations in the common
case by 1/2.

It also further decreases the number of code differences between
forward and backward scans in _bt_steppage and _bt_readnextpage, with
mostly only small differences remaining in the code paths shared
between the two scan modes.

The change in lwlock.c is to silence a warning when LWLOCK_STATS is enabled.


Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

I've validated my results by compiling with LWLOCK_STATS enabled (e.g.
#define LWLOCK_STATS), and testing backward index scans, e.g.

CREATE TABLE test AS SELECT generate_series(1, 100) as num;
CREATE INDEX ON test (num);
VACUUM (FREEZE) test;
\c -- reconnect to get fresh lwlock stats
SELECT COUNT(num ORDER BY num DESC) FROM test;
\c -- reconnect to dump stats of previous session

Before this patch I consistently got `BufferContent 0x: shacq
2` in the logs, but with this patch that has been decreased to
`BufferContent 0x: shacq 1`.


v1-0001-Avoid-unneeded-nbtree-backwards-scan-buffer-locks.patch
Description: Binary data


Re: gitmaster server problem?

2024-08-19 Thread Jelte Fennema-Nio
On Mon, 19 Aug 2024 at 13:39, Joe Conway  wrote:
> What specifically? I am not seeing anything at the moment.

It seems to be working fine again. But before, https://git.postgresql.org
was throwing 502 and 503 errors, produced by either Varnish or nginx
depending on the request.




Re: why is pg_upgrade's regression run so slow?

2024-08-19 Thread Alexander Lakhin

Hello Andrew,

29.07.2024 13:54, Andrew Dunstan wrote:


On 2024-07-29 Mo 4:00 AM, Alexander Lakhin wrote:


And another interesting fact is that TEMP_CONFIG is apparently ignored by
`meson test regress/regress`. For example, with temp.config containing
invalid settings, `meson test pg_upgrade/002_pg_upgrade` fails, but
`meson test regress/regress` passes just fine.




Well, that last fact explains the discrepancy I originally complained about. How the heck did that happen? It looks 
like we just ignored its use in Makefile.global.in :-(


Please look at the attached patch (partially based on ideas from [1]) for
meson.build, which aligns it with `make` in regard to the use of TEMP_CONFIG.

Maybe it could be implemented via a separate meson option, but that would
also require modifying at least the buildfarm client...

[1] 
https://www.postgresql.org/message-id/CAN55FZ304Kp%2B510-iU5-Nx6hh32ny9jgGn%2BOB5uqPExEMK1gQQ%40mail.gmail.com

Best regards,
Alexander

diff --git a/meson.build b/meson.build
index cd711c6d01..90f4e6da04 100644
--- a/meson.build
+++ b/meson.build
@@ -3388,6 +3388,12 @@ else
   runningcheck = false
 endif
 
+temp_config_env = run_command([
+  python,
+  '-c',
+  'import os; print(os.getenv("TEMP_CONFIG", ""))'
+], check: true).stdout().strip()
+
 testwrap = files('src/tools/testwrap')
 
 foreach test_dir : tests
@@ -3451,6 +3457,10 @@ foreach test_dir : tests
   env = test_env
   env.prepend('PATH', temp_install_bindir, test_dir['bd'])
 
+  if temp_config_env != ''
+test_command_base += [ '--temp-config=' + temp_config_env ]
+  endif
+
   test_kwargs = {
 'protocol': 'tap',
 'priority': 10,


Re: Drop database command will raise "wrong tuple length" if pg_database tuple contains toast attribute.

2024-08-19 Thread Tomas Vondra
On 8/19/24 12:16, Tomas Vondra wrote:
> On 8/19/24 11:01, Yugo Nagata wrote:
>
> ...
> 
>> By the way, I found a comment in vac_update_datfrozenxid() and
>> EventTriggerOnLogin() that explains why we could not use tuples from the
>> syscache for heap_inplace_update(). I think it is better to add the same
>> comment in dropdb(). I attached a trivial patch for it.
>>
> 
> Agreed. That seems like a nice improvement to the comment.
> 

Done, thanks for the suggestion / patch.


regards

-- 
Tomas Vondra




Re: gitmaster server problem?

2024-08-19 Thread Robert Haas
On Mon, Aug 19, 2024 at 7:57 AM Jelte Fennema-Nio  wrote:
> On Mon, 19 Aug 2024 at 13:39, Joe Conway  wrote:
> > What specifically? I am not seeing anything at the moment.
>
> It seems to be working again fine. But before
> https://git.postgresql.org was throwing 502 and 503 errors. Depending
> on the request either by Varnish or nginx.

I'm still unable to access any https://git.postgresql.org/ URL.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: gitmaster server problem?

2024-08-19 Thread Jelte Fennema-Nio
On Mon, 19 Aug 2024 at 14:04, Robert Haas  wrote:
> I'm still unable to access any https://git.postgresql.org/ URL.

Yeah it's broken for me again too.




Re: gitmaster server problem?

2024-08-19 Thread Jelte Fennema-Nio
On Mon, 19 Aug 2024 at 14:05, Jelte Fennema-Nio  wrote:
>
> On Mon, 19 Aug 2024 at 14:04, Robert Haas  wrote:
> > I'm still unable to access any https://git.postgresql.org/ URL.
>
> Yeah it's broken for me again too.

In case the actual error is helpful (although I doubt it):

Error 503 Backend fetch failed

Backend fetch failed

Guru Meditation:

XID: 1030914118



Varnish cache server




Re: gitmaster server problem?

2024-08-19 Thread Joe Conway

On 8/19/24 08:06, Jelte Fennema-Nio wrote:

On Mon, 19 Aug 2024 at 14:05, Jelte Fennema-Nio  wrote:


On Mon, 19 Aug 2024 at 14:04, Robert Haas  wrote:
> I'm still unable to access any https://git.postgresql.org/ URL.

Yeah it's broken for me again too.


In case the actual error is helpful (although I doubt it):

Error 503 Backend fetch failed

Backend fetch failed

Guru Meditation:

XID: 1030914118



I am not seeing any errors here. Nagios monitoring is showing an IPv6 
error though. Not the same issue as yesterday.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Opinion poll: Sending an automated email to a thread when it gets added to the commitfest

2024-08-19 Thread Robert Haas
On Fri, Aug 16, 2024 at 3:13 AM Peter Eisentraut  wrote:
> Yeah, I'd rather not open the can of worms that we send automated emails
> to this list at all.

+1.

> If people want to get emails from
> the commitfest app, it should be that you subscribe there and it sends
> those emails to those who want them.

+1.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning

2024-08-19 Thread Ashutosh Bapat
Sorry for the delay in my response.

On Wed, May 29, 2024 at 9:34 AM Ashutosh Bapat
 wrote:
>
> Documenting some comments from todays' patch review session

I forgot to mention back then that both of the suggestions below came
from Tom Lane.

> 1. Instead of a nested hash table, it might be better to use a flat hash 
> table to save more memory.

Done. It indeed saves memory without impacting planning time.
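
For reference, the flat table amounts to something like this (a
hypothetical sketch of the key/entry layout only; the actual structures
live at the end of restrictinfo.c in the patch and may differ):

/* one table, keyed directly on (parent clause, child relids) */
typedef struct ChildRinfoKey
{
    RestrictInfo *parent_rinfo;     /* untranslated parent clause */
    Relids        child_relids;     /* relids of the child (join) rel */
} ChildRinfoKey;

typedef struct ChildRinfoEntry
{
    ChildRinfoKey key;
    RestrictInfo *child_rinfo;      /* translated clause, built once */
} ChildRinfoEntry;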

> 2. new comm_rinfo member in RestrictInfo may have problems when copying 
> RestrictInfo or translating it. Instead commuted versions may be tracked 
> outside RestrictInfo

After commit 767c598954bbf72e0535f667e2e0667765604b2a,
reparameterization of parent paths for a child relation happens at the
time of creating the plan. This reduces the number of child paths
produced by reparameterization and also reduces the number of
RestrictInfos that get translated during reparameterization. During
reparameterization, commuted parent RestrictInfos are required to be
translated to child RestrictInfos. Before the commit, this led to
translating the same commuted parent RestrictInfo multiple times.
After the commit, only one path gets reparameterized for a given
parent and child pair, hence we do not produce multiple copies of the
same commuted child RestrictInfo and don't need to keep track of
commuted child RestrictInfos anymore. Removed that portion of code
from the patches.

I made detailed memory consumption measurements with this patch for
the number of partitions varying from 0 (unpartitioned) to 1000 and for
2- to 5-way joins. They are available in the spreadsheet at [1]. The raw
measurement data is in the first sheet, named "pwj_mem_measurements raw
numbers". The averages over multiple runs are in the second sheet, named
"avg_numbers". The rest of the sheets present the averages in a more
consumable manner. Please note that the averages make sense only for
planning time, since the memory consumption remains the same with every
run. Also note that EXPLAIN now reports planning memory consumption in
kB, so any change in memory consumption below 1kB is not reported and
hence not noticed. Here are some observations.

1. When partitionwise join is not enabled, no changes to the planner's
memory consumption are observed. See the sheet named "pwj disabled,
planning memory".

2. When partitionwise join is enabled, up to 17% (206MB) of memory is
saved by the patch in the case of a 5-way self-join with 1000 partitions.
This is the maximum memory saving observed. The amount of memory saved
increases with the number of joins and the number of partitions. See the
sheet named "pwj enabled, planning memory".

3. After commit 767c598954bbf72e0535f667e2e0667765604b2a, we do not
translate a parent RestrictInfo multiple times for the same
parent-child pair in the case of a *2-way partitionwise join*. But we
still consume memory saving the child RestrictInfo in the hash
table. Hence, in the case of a 2-way join, we see increased memory
consumption with the patch compared to master. The memory consumption
increases by 13kB, 23kB, 76kB and 146kB for 10, 100, 500 and 1000
partitions respectively. This increase is small compared to the overall
memory saving. In order to avoid this memory increase, we would need to
avoid using the hash table for a 2-way join. We would need to know
whether there will be more than one partitionwise join before
translating the RestrictInfos for the first partitionwise join. This is
hard to achieve in all cases, since the decision to use partitionwise
join happens at the time of creating paths for a given join relation,
which itself is computed on the fly. We may choose some heuristics which
take into account the number of partitioned tables in the query, their
partition schemes, and the quals in the query to decide whether or not
to track the translated child RestrictInfos. But that would make the
code more complex, and more importantly the heuristics may not be able
to keep up if we start using partitionwise join as an optimization
strategy for more cases (e.g. asymmetric partitionwise join [2]). The
attached patch looks like a good tradeoff to me, but other opinions
might vary. Suggestions are welcome.

4. There is no noticeable change in planning time. I ran the same
experiment multiple times; the planning time variations from each
experiment do not show any noticeable pattern suggesting an increase or
decrease in planning time with the patch.

A note about the code: I have added all the structures and functions
dealing with the RestrictInfo hash table at the end of restrictinfo.c.
I have not come across a C file in the PostgreSQL code base where private
structures are defined in the middle of the file; usually they are
defined at the beginning of the file. But I have chosen it that way
here since it makes it easy to document the hash table and the
functions in one place at the beginning of this code section. I am
open to suggestions which make the documentation easy while placing
the structures at the beginning of the file.

[1] 
https://docs.google.com/s

Re: Improve pg_re_throw: check if sigjmp_buf is valid and report error

2024-08-19 Thread Robert Haas
On Mon, Aug 19, 2024 at 2:17 AM Xiaoran Wang  wrote:
> If the code in PG_TRY contains any non-local control flow other than
> ereport(ERROR), like goto, break etc., then PG_CATCH or PG_END_TRY cannot
> be called, and PG_exception_stack will point to memory whose
> stack frame has been released. So after that, when pg_re_throw is
> called, __longjmp() will crash and report a segmentation fault.
>
>  In addition to the sigjmp_buf, add another field 'int magic' which is next to
>  the sigjmp_buf in the local stack frame memory. The magic's value is always
>  'PG_exception_magic 0x12345678'. And in the 'pg_re_throw' routine, check if
>  the magic's value is still '0x12345678'; if not, that means the memory
>  the 'PG_exception_stack' points to has been released, and the 'sigjmp_buf'
>  must be invalid.
>
>   The related code is in patch 0001

This is an interesting idea. I suspect if we do this we want a
different magic number and a different error message than what you
chose here, but those are minor details.

I'm not sure how reliable this would be at actually finding problems,
though. It doesn't seem guaranteed that the magic number will get
overwritten if you do something wrong; it's just a possibility. Maybe
that's still useful enough that we should adopt this, but I'm not
sure.

Personally, I don't think I've ever made this particular mistake. I
think a much more common usage error is exiting the catch-block
without either throwing an error or rolling back a subtransaction. But
YMMV, of course.
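
(For concreteness, the kind of non-local control flow described at the
top of the thread looks like this -- a minimal sketch, not code from
the patch:)

static void
broken_usage(void)
{
    PG_TRY();
    {
        return;     /* WRONG: skips PG_END_TRY(), leaving
                     * PG_exception_stack pointing into this
                     * soon-to-be-dead stack frame */
    }
    PG_CATCH();
    {
        PG_RE_THROW();
    }
    PG_END_TRY();
}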

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Opinion poll: Sending an automated email to a thread when it gets added to the commitfest

2024-08-19 Thread Jelte Fennema-Nio
On Fri, 16 Aug 2024 at 16:43, Tom Lane  wrote:
> However, there are other ways to accomplish that.  I liked the
> suggestion of extending the CF webapp with a way to search for entries
> mentioning a particular mail message ID.  I dunno how hard it'd be to
> get it to recognize *any* message-ID from a thread, but surely at
> least the head message ought not be too hard to match.

I sent a patch to support this on the commitfest app to Magnus
off-list. It was pretty easy to implement, even for *any* message-ID.




Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning

2024-08-19 Thread Ashutosh Bapat
On Mon, Aug 19, 2024 at 6:43 PM Ashutosh Bapat
 wrote:
>
> Sorry for the delay in my response.
>
> On Wed, May 29, 2024 at 9:34 AM Ashutosh Bapat
>  wrote:
> >
> > Documenting some comments from todays' patch review session
>
> I forgot to mention back then that both of the suggestions below came
> from Tom Lane.
>
> > 1. Instead of a nested hash table, it might be better to use a flat hash 
> > table to save more memory.
>
> Done. It indeed saves memory without impacting planning time.
>
> > 2. new comm_rinfo member in RestrictInfo may have problems when copying 
> > RestrictInfo or translating it. Instead commuted versions may be tracked 
> > outside RestrictInfo
>
> After commit 767c598954bbf72e0535f667e2e0667765604b2a,
> reparameterization of parent paths for a child relation happens at the
> time of creating the plan. This reduces the number of child paths
> produced by reparameterization and also reduces the number of
> RestrictInfos that get translated during reparameterization. During
> reparameterization, commuted parent RestrictInfos are required to be
> translated to child RestrictInfos. Before the commit, this led to
> translating the same commuted parent RestrictInfo multiple times.
> After the commit, only one path gets reparameterized for a given
> parent and child pair, hence we do not produce multiple copies of the
> same commuted child RestrictInfo and don't need to keep track of
> commuted child RestrictInfos anymore. Removed that portion of code
> from the patches.
>
> I made detailed memory consumption measurements with this patch for
> the number of partitions varying from 0 (unpartitioned) to 1000 and for
> 2- to 5-way joins. They are available in the spreadsheet at [1]. The raw
> measurement data is in the first sheet, named "pwj_mem_measurements raw
> numbers". The averages over multiple runs are in the second sheet, named
> "avg_numbers". The rest of the sheets present the averages in a more
> consumable manner. Please note that the averages make sense only for
> planning time, since the memory consumption remains the same with every
> run. Also note that EXPLAIN now reports planning memory consumption in
> kB, so any change in memory consumption below 1kB is not reported and
> hence not noticed. Here are some observations.
>
> 1. When partitionwise join is not enabled, no changes to the planner's
> memory consumption are observed. See the sheet named "pwj disabled,
> planning memory".
>
> 2. When partitionwise join is enabled, up to 17% (206MB) of memory is
> saved by the patch in the case of a 5-way self-join with 1000 partitions.
> This is the maximum memory saving observed. The amount of memory saved
> increases with the number of joins and the number of partitions. See the
> sheet named "pwj enabled, planning memory".
>
> 3. After commit 767c598954bbf72e0535f667e2e0667765604b2a, we do not
> translate a parent RestrictInfo multiple times for the same
> parent-child pair in the case of a *2-way partitionwise join*. But we
> still consume memory saving the child RestrictInfo in the hash
> table. Hence, in the case of a 2-way join, we see increased memory
> consumption with the patch compared to master. The memory consumption
> increases by 13kB, 23kB, 76kB and 146kB for 10, 100, 500 and 1000
> partitions respectively. This increase is small compared to the overall
> memory saving. In order to avoid this memory increase, we would need to
> avoid using the hash table for a 2-way join. We would need to know
> whether there will be more than one partitionwise join before
> translating the RestrictInfos for the first partitionwise join. This is
> hard to achieve in all cases, since the decision to use partitionwise
> join happens at the time of creating paths for a given join relation,
> which itself is computed on the fly. We may choose some heuristics which
> take into account the number of partitioned tables in the query, their
> partition schemes, and the quals in the query to decide whether or not
> to track the translated child RestrictInfos. But that would make the
> code more complex, and more importantly the heuristics may not be able
> to keep up if we start using partitionwise join as an optimization
> strategy for more cases (e.g. asymmetric partitionwise join [2]). The
> attached patch looks like a good tradeoff to me, but other opinions
> might vary. Suggestions are welcome.
>
> 4. There is no noticeable change in planning time. I ran the same
> experiment multiple times; the planning time variations from each
> experiment do not show any noticeable pattern suggesting an increase or
> decrease in planning time with the patch.
>
> A note about the code: I have added all the structures and functions
> dealing with the RestrictInfo hash table at the end of restrictinfo.c.
> I have not come across a C file in the PostgreSQL code base where private
> structures are defined in the middle of the file; usually they are
> defined at the beginning of the file. But I have chosen it that way
> here since it makes it easy to document the hash 

Re: gitmaster server problem?

2024-08-19 Thread Tom Lane
Matthias van de Meent  writes:
> Additional infra that doesn't seem to work right now: mailing list
> archives [0] seem to fail at a 503 produced by Varnish Cache server:
> Error 503 Backend fetch failed. Maybe more of the infra is down, or
> otherwise under maintenance?

Yeah, I'm seeing that too; not at the top level of the site but
when you try to drill down to individual messages.

commitfest.postgresql.org isn't responding either,
nor wiki.postgresql.org.

regards, tom lane




Re: gitmaster server problem?

2024-08-19 Thread Bruce Momjian
On Sat, Aug 17, 2024 at 02:15:49PM -0400, Tom Lane wrote:
> Matthias van de Meent  writes:
> > Additional infra that doesn't seem to work right now: mailing list
> > archives [0] seem to fail at a 503 produced by Varnish Cache server:
> > Error 503 Backend fetch failed. Maybe more of the infra is down, or
> > otherwise under maintenance?
> 
> Yeah, I'm seeing that too; not at the top level of the site but
> when you try to drill down to individual messages.
> 
> commitfest.postgresql.org isn't responding either,
> nor wiki.postgresql.org.

Okay, thanks for the confirmation.  I emailed pginfra at the same time I
posted the first email, so they are aware.  I have not received a reply
from them yet.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: [PATCH] Add get_bytes() and set_bytes() functions

2024-08-19 Thread Aleksander Alekseev
Hi Peter,

Thanks for the feedback.

> I think these functions do about three things at once, and I don't think
> they address the originally requested purpose very well.

The amount of things that the function does is a matter of
interpretation. I can claim that it does one thing ("extracts an
integer from a bytea"), or as many things as there are lines of code.
IMO the actual question is whether this is a good user interface or
not. Since we already have get_byte() / set_byte() the interface is
arguably OK.

> Converting between integers and byte arrays of matching size seems like
> reasonable functionality.  (You can already do one half of that by
> calling int2send(), int4send(), and int8send(), but the other direction
> (intXrecv()) is not user-callable).
>
> The other things are extracting that byte array from a larger byte array
> and sticking it back into a larger byte array; those seem like separate
> operations.  There is already substr() for bytea for the first part, and
> there might be another string-like operation for the second part, or
> maybe we could add one.

If I understand correctly, you propose doing (1):

```
SELECT substr('\x1122334455667788'::bytea, 2, 2) :: int2;
```

... instead of:

```
SELECT get_bytes('\x1122334455667788'::bytea, 1, 2)
```

... and (2):

```
WITH vals AS (
SELECT '\x1122334455667788'::bytea AS x
) SELECT substr(x, 1, 1) || int2send(1234::int2) || substr(x, 4, 5) FROM vals;
```

... instead of:

```
SELECT set_bytes('\x1122334455667788'::bytea, 1, 2, 0xAABB);
```

There is nothing to do for query (2), it already works. It's not much
better than the query from my first email though.
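Spelled out on the example value, the proposed functions would behave like this (expected outputs computed by hand, assuming 0-based offsets as in get_byte()/set_byte()):

```sql
-- get_bytes(bytea, offset, count): bytes 0x22 0x33 read as an integer
SELECT get_bytes('\x1122334455667788'::bytea, 1, 2);   -- 8755, i.e. 0x2233

-- set_bytes(bytea, offset, count, value): overwrite bytes at offsets 1..2
SELECT set_bytes('\x1122334455667788'::bytea, 1, 2, 0xAABB);
-- \x11aabb4455667788
```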

To clarify, supporting bytea<->integer (and/or bytea<->numeric) casts
doesn't strike me as a terrible idea but it doesn't address the issue
I'm proposing to solve.

-- 
Best regards,
Aleksander Alekseev




Re: define PG_REPLSLOT_DIR

2024-08-19 Thread Bertrand Drouvot
Hi,

On Thu, Aug 15, 2024 at 08:40:42PM -0400, Alvaro Herrera wrote:
> On 2024-Aug-14, Bertrand Drouvot wrote:
> 
> > Out of curiosity, checking the sizes of affected files (O2, no debug):
> > 
> > with patch:
> > 
> >textdata bss dec hex filename
> >   30304   0   8   303127668 
> > src/backend/replication/logical/reorderbuffer.o
> 
> > without patch:
> >   30286   0   8   302947656 
> > src/backend/replication/logical/reorderbuffer.o
> 
> Hmm, I suppose this delta can be explained because the literal
> string is used inside larger snprintf() format strings or similar, so
> things like
> 
> snprintf(path, sizeof(path),
> -"pg_replslot/%s/%s", slotname,
> +"%s/%s/%s", PG_REPLSLOT_DIR, slotname,
>  spill_de->d_name);
> 
> and
> 
> ereport(ERROR,
> (errcode_for_file_access(),
> -errmsg("could not remove file \"%s\" during removal of pg_replslot/%s/xid*: %m",
> -   path, slotname)));
> +errmsg("could not remove file \"%s\" during removal of %s/%s/xid*: %m",
> +   path, PG_REPLSLOT_DIR, slotname)));
> 

I did not look at the details, but yeah, that could probably be explained that way.

> I don't disagree with making this change, but changing some of the other
> directory names you suggest, such as
> 
> > pg_notify
> > pg_serial
> > pg_subtrans
> > pg_multixact
> > pg_wal
> 
> would probably make no difference, since there are no literal strings
> that contain that as a substring(*).  It might make some sense to handle
> pg_tblspc similarly.  As for pg_logical, I think you'll want separate
> defines for pg_logical/mappings, pg_logical/snapshots and so on.
> 
> 
> (*) I hope you're not going to suggest this kind of change (word-diff):
> 
> if (GetOldestSnapshot())
> ereport(ERROR,
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>  errmsg("[-pg_wal-]{+%s+}_replay_wait() must be only called 
> without an active or registered snapshot"{+, PG_WAL_DIR+}),
>  errdetail("Make sure [-pg_wal-]{+%s+}_replay_wait() isn't 
> called within a transaction with an isolation level higher than READ 
> COMMITTED, another procedure, or a function."{+, PG_WAL_DIR+})));
> 

Yeah, that would not cross my mind ;-). I might have been tempted to do the change
in pg_combinebackup.c though.

Having said that, I agree to focus only on:

pg_replslot
pg_tblspc
pg_logical/*

I made the changes for pg_tblspc in pg_combinebackup.c as the number of
occurrences is greater than the "pg_wal" ones, and PG_TBLSPC_DIR had to be
defined in any case.

Please find attached the related patches.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 5057933a8a4f49b57dfa50fe792d9f8162abf11a Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Wed, 14 Aug 2024 09:16:21 +
Subject: [PATCH v2 1/5] Define PG_REPLSLOT_DIR

Replace most of the places where "pg_replslot" is used in .c files with a new
PG_REPLSLOT_DIR define. The places where it is not done are left that way for
consistency with the existing PG_STAT_TMP_DIR define.
---
 src/backend/backup/basebackup.c   |  3 +-
 .../replication/logical/reorderbuffer.c   | 15 
 src/backend/replication/slot.c| 38 +--
 src/backend/utils/adt/genfile.c   |  5 ++-
 src/bin/initdb/initdb.c   |  2 +-
 src/bin/pg_rewind/filemap.c   |  2 +-
 src/include/replication/slot.h|  2 +
 7 files changed, 36 insertions(+), 31 deletions(-)
  24.0% src/backend/replication/logical/
  59.8% src/backend/replication/
   8.7% src/backend/utils/adt/
   4.1% src/bin/
   3.2% src/

diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 01b35e26bd..de16afac74 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -36,6 +36,7 @@
 #include "port.h"
 #include "postmaster/syslogger.h"
 #include "postmaster/walsummarizer.h"
+#include "replication/slot.h"
 #include "replication/walsender.h"
 #include "replication/walsender_private.h"
 #include "storage/bufpage.h"
@@ -161,7 +162,7 @@ static const char *const excludeDirContents[] =
 	 * even if the intention is to restore to another primary. See backup.sgml
 	 * for a more detailed description.
 	 */
-	"pg_replslot",
+	PG_REPLSLOT_DIR,
 
 	/* Contents removed on startup, see dsm_cleanup_for_mmap(). */
 	PG_DYNSHMEM_DIR,
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 00a8327e77..6f98115d1e 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -4573,7 +4573,7 @@ ReorderBufferCleanupSerializedTXNs(const char

Re: Improve pg_re_throw: check if sigjmp_buf is valid and report error

2024-08-19 Thread Tom Lane
Robert Haas  writes:
> On Mon, Aug 19, 2024 at 2:17 AM Xiaoran Wang  wrote:
>> If the code in PG_TRY contains any non-local control flow other than
>> ereport(ERROR), like goto, break, etc., then PG_CATCH or PG_END_TRY cannot
>> be called, and PG_exception_stack will point to memory whose
>> stack frame has been released. So after that, when pg_re_throw is
>> called, __longjmp() will crash and report a segmentation fault.
>> 
>> In addition to the sigjmp_buf, add another field 'int magic' which is next to
>> the sigjmp_buf in the local stack frame memory. The magic's value is always
>> 'PG_exception_magic 0x12345678'. And in the 'pg_re_throw' routine, check if
>> the magic's value is still '0x12345678'; if not, that means the memory
>> that 'PG_exception_stack' points to has been released, and the 'sigjmp_buf'
>> must be invalid.
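A minimal sketch of the proposed guard (names, the magic value's define, and the Assert placement are assumed here, not taken from the patch; PG_exception_stack is assumed to point at such a frame):

```c
#define PG_EXCEPTION_MAGIC 0x12345678	/* assumed name/value for illustration */

typedef struct
{
	sigjmp_buf	buf;
	int			magic;		/* PG_EXCEPTION_MAGIC while the frame is live */
} pg_exception_frame;

/* in pg_re_throw(), an Assert rather than an ereport, per the reply below */
Assert(PG_exception_stack->magic == PG_EXCEPTION_MAGIC);
```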

> This is an interesting idea. I suspect if we do this we want a
> different magic number and a different error message than what you
> chose here, but those are minor details.

I would suggest just adding an Assert; I doubt this check would be
useful in production.

> I'm not sure how reliable this would be at actually finding problems,
> though.

Yeah, that's the big problem.  I don't have any confidence at all
that this would detect misuse.  It'd require that the old stack
frame gets overwritten, which might not happen for a long time,
and it'd require that somebody eventually do a longjmp, which again
might not happen for a long time --- and when those did happen, the
error would be detected in someplace far away from the actual bug,
with little evidence remaining to help you localize it.

Also, as soon as some outer level of PG_TRY is exited in the proper
way, the dangling stack pointer will be fixed up.  That means there's
a fairly narrow time frame in which the overwrite and longjmp must
happen for this to catch a bug.

So on the whole I doubt this'd be terribly useful in this form;
and I don't like the amount of code churn required.

> Personally, I don't think I've ever made this particular mistake. I
> think a much more common usage error is exiting the catch-block
> without either throwing an error or rolling back a subtransaction. But
> YMMV, of course.

We have had multiple instances of code "return"ing out of a PG_TRY,
so I fully agree that some better way to detect that would be good.
But maybe we ought to think about static analysis for that.

regards, tom lane




Re: define PG_REPLSLOT_DIR

2024-08-19 Thread Bertrand Drouvot
Hi,

On Fri, Aug 16, 2024 at 01:31:11PM +0900, Yugo Nagata wrote:
> On Wed, 14 Aug 2024 11:32:14 +
> Bertrand Drouvot  wrote:
> > Looking forward to your feedback,
> 
> /* restore all slots by iterating over all on-disk entries */
> -   replication_dir = AllocateDir("pg_replslot");
> -   while ((replication_de = ReadDir(replication_dir, "pg_replslot")) != NULL)
> +   replication_dir = AllocateDir(PG_REPLSLOT_DIR);
> +   while ((replication_de = ReadDir(replication_dir, PG_REPLSLOT_DIR)) != NULL)
> {
> char        path[MAXPGPATH + 12];
> PGFileType  de_type;
> 
> I think the size of path can be rewritten to
> "MAXPGPATH + sizeof(PG_REPLSLOT_DIR)", which makes it easier to understand
> why this size is used.

Thanks for the feedback!

Yeah, fully agree, it has been done that way in v2 that I just shared up-thread.
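For illustration, the agreed form; sizeof() of the define's string literal counts the terminating NUL, which conveniently leaves room for the '/' separator (snippet assumes the surrounding slot.c context):

```c
char		path[MAXPGPATH + sizeof(PG_REPLSLOT_DIR)];

snprintf(path, sizeof(path), "%s/%s", PG_REPLSLOT_DIR, slotname);
```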

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Opinion poll: Sending an automated email to a thread when it gets added to the commitfest

2024-08-19 Thread Tom Lane
Jelte Fennema-Nio  writes:
> On Fri, 16 Aug 2024 at 16:43, Tom Lane  wrote:
>> However, there are other ways to accomplish that.  I liked the
>> suggestion of extending the CF webapp with a way to search for entries
>> mentioning a particular mail message ID.  I dunno how hard it'd be to
>> get it to recognize *any* message-ID from a thread, but surely at
>> least the head message ought not be too hard to match.

> I sent a patch to support this on the commitfest app to Magnus
> off-list. It was pretty easy to implement, even for *any* message-ID.

Cool, thank you!

regards, tom lane




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-08-19 Thread Robert Haas
On Mon, Aug 19, 2024 at 3:30 AM Jelte Fennema-Nio  wrote:
> But **now I actually feel much more strongly about reusing the same
> function**. Because by introducing a new function we actually break
> the users of the second use-case.
>
> P.S. The docs for PQprotocolVersion[1] have never said that this
> function only returns the major protocol version. And by using the
> word "Currently" it has always suggested that new return values could
> be introduced later, and thus for feature detection you should use >=
> 3

If somebody is using PQprotocolVersion() to detect the arrival of a
new protocol version, it stands to reason that they only care about
new major protocol versions, because that's what the function is
defined to tell you about. Anyone who has done a moderate amount of
looking into this area will understand that the protocol has a major
version number and a minor version number and that this function only
returns the former. Therefore, they should expect that the arrival of
a new minor protocol version won't change the return value of this
function.
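For reference, the kind of check under discussion (a sketch, with minimal error handling):

```c
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

/* Existing idiom: only the major protocol version is visible here, so a
 * minor bump (e.g. 3.1) is expected to leave this test unaffected. */
static void
require_protocol3(PGconn *conn)
{
	if (PQprotocolVersion(conn) < 3)
	{
		fprintf(stderr, "server is using a pre-3.0 protocol\n");
		PQfinish(conn);
		exit(EXIT_FAILURE);
	}
}
```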

I really don't understand why we're still arguing about this. It seems
to me that we've established that there is some usage of the existing
function, and that changing the return value will break something.
Sure, so far as we know that something is "only" regression tests, but
there's no guarantee that there couldn't be other code that we don't
know about that breaks worse, and even there isn't, who wants to break
regression tests when there's nothing actually wrong? Now we could
decide we're going to do it anyway because of whatever reason we might
have, but it doesn't seem like that's what most people want to do.

I feel like we're finally in a position to get some things done here
and this doesn't seem like the point to get stuck on. YMMV, of course.

--
Robert Haas
EDB: http://www.enterprisedb.com




Re: define PG_REPLSLOT_DIR

2024-08-19 Thread Bertrand Drouvot
Hi,

On Mon, Aug 19, 2024 at 04:11:31PM +0530, Ashutosh Bapat wrote:
> On Wed, Aug 14, 2024 at 5:02 PM Bertrand Drouvot
>  wrote:
> >
> > Hi hackers,
> >
> > while working on a replication slot tool (idea is to put it in contrib, not
> > shared yet), I realized that "pg_replslot" is being used > 25 times in
> > .c files.
> >
> > I think it would make sense to replace those occurrences with a $SUBJECT, 
> > attached
> > a patch doing so.
> 
> Many of these places are slot specific directory/file names within
> pg_replslot. I think we can further improve the code by creating macro
> on the lines of LSN_FORMAT_ARGS
> #define SLOT_DIRNAME_ARGS(slotname) (PG_REPLSLOT_DIR, slotname)
> this way we "codify" method to construct the slot directory name
> everywhere.

Thanks for the feedback!

I think that could make sense. As the already proposed mechanical changes are
error prone (from my point of view), I would suggest taking a look at your
proposal once the currently proposed changes go in.
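A sketch of what such a macro could look like (exact shape assumed; note it has to expand to a bare argument list, not a parenthesized comma expression, so it can be spliced into snprintf() arguments):

```c
/* hypothetical, by analogy with LSN_FORMAT_ARGS */
#define SLOT_DIRNAME_FORMAT "%s/%s"
#define SLOT_DIRNAME_ARGS(slotname) PG_REPLSLOT_DIR, (slotname)

/* usage sketch; path and spill_de come from the surrounding code */
snprintf(path, sizeof(path), SLOT_DIRNAME_FORMAT "/%s",
		 SLOT_DIRNAME_ARGS(slotname), spill_de->d_name);
```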

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: gitmaster server problem?

2024-08-19 Thread Bruce Momjian
On Mon, Aug 19, 2024 at 08:11:36AM -0400, Joe Conway wrote:
> On 8/19/24 08:06, Jelte Fennema-Nio wrote:
> > On Mon, 19 Aug 2024 at 14:05, Jelte Fennema-Nio  wrote:
> > > 
> > > On Mon, 19 Aug 2024 at 14:04, Robert Haas  wrote:
> > > > I'm still unable to access any https://git.postgresql.org/ URL.
> > > 
> > > Yeah it's broken for me again too.
> > 
> > In case the actual error is helpful (although I doubt it):
> > 
> > Error 503 Backend fetch failed
> > 
> > Backend fetch failed
> > 
> > Guru Meditation:
> > 
> > XID: 1030914118
> 
> I am not seeing any errors here. Nagios monitoring is showing an IPv6 error
> though. Not the same issue as yesterday.

gitmaster.postgresql.org is working fine for me right now.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: gitmaster server problem?

2024-08-19 Thread jian he
On Mon, Aug 19, 2024 at 9:56 PM Bruce Momjian  wrote:
>

as of now.
cgit working
https://git.postgresql.org/cgit/postgresql.git/


git not working
( 403 Forbidden nginx)
https://git.postgresql.org/git/postgresql.git/




Re: gitmaster server problem?

2024-08-19 Thread Dave Page
On Mon, 19 Aug 2024 at 14:56, Bruce Momjian  wrote:

> On Sat, Aug 17, 2024 at 02:15:49PM -0400, Tom Lane wrote:
> > Matthias van de Meent  writes:
> > > Additional infra that doesn't seem to work right now: mailing list
> > > archives [0] seem to fail at a 503 produced by Varnish Cache server:
> > > Error 503 Backend fetch failed. Maybe more of infra is down, or
> > > otherwise under maintenance?
> >
> > Yeah, I'm seeing that too; not at the top level of the site but
> > when you try to drill down to individual messages.
> >
> > commitfest.postgresql.org isn't responding either,
> > nor wiki.postgresql.org.
>
> Okay, thanks for the confirmation.  I emailed pginfra at the same time I
> posted the first email, so they are aware.  I have not received a reply
> from them yet.
>

Joe replied.

I've just checked again myself and I don't see any problems with the sites
mentioned from a user perspective, but there are some non-obvious issues
being reported by Nagios on git.postgresql.org (which is not the same as
gitmaster). We'll dig in some more.

-- 
Dave Page
pgAdmin: https://www.pgadmin.org
PostgreSQL: https://www.postgresql.org
EDB: https://www.enterprisedb.com

PGDay UK 2024, 11th September, London: https://2024.pgday.uk/


Re: gitmaster server problem?

2024-08-19 Thread Magnus Hagander
On Mon, Aug 19, 2024 at 3:56 PM Tom Lane  wrote:

> Matthias van de Meent  writes:
> > Additional infra that doesn't seem to work right now: mailing list
> > archives [0] seem to fail at a 503 produced by Varnish Cache server:
> > Error 503 Backend fetch failed. Maybe more of infra is down, or
> > otherwise under maintenance?
>
> Yeah, I'm seeing that too; not at the top level of the site but
> when you try to drill down to individual messages.
>
> commitfest.postgresql.org isn't responding either,
> nor wiki.postgresql.org.
>

Are you still seeing these issues? It works fine both from here and from
our monitoring, but there could be something source-network-specific
maybe

The git.postgresql.org one specifically looks a lot like a DoS. It may not
be an intentional DoS, but the git server side is very inefficient and can
rapidly turn valid attempts into DoSes if they parallelize too much...

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: gitmaster server problem?

2024-08-19 Thread Tom Lane
Magnus Hagander  writes:
> On Mon, Aug 19, 2024 at 3:56 PM Tom Lane  wrote:
>> Yeah, I'm seeing that too; not at the top level of the site but
>> when you try to drill down to individual messages.
>> commitfest.postgresql.org isn't responding either,
>> nor wiki.postgresql.org.

> Are you still seeing these issues? It works fine both from here and from
> our monitoring, but there could be something source-network-specific
> maybe

Right at the moment,

https://git.postgresql.org/gitweb/?p=postgresql.git

is failing for me with "Backend fetch failed".  However, the mail
archives, commitfest.p.o, and wiki.p.o seem to be responding normally,
as is gitmaster.

regards, tom lane




Re: Restart pg_usleep when interrupted

2024-08-19 Thread Bertrand Drouvot
Hi,

On Thu, Aug 15, 2024 at 04:13:29PM -0500, Nathan Bossart wrote:
> On Wed, Aug 14, 2024 at 06:00:06AM +, Bertrand Drouvot wrote:
> > I gave it more thought and I don't think we have to choose between the two.
> > The 1 Hz approach reduces the number of interrupts and Sami's patch 
> > provides a
> > way to get "accurate" delay in case of interrupts. I think both have their 
> > own
> > benefit. 
> 
> Is it really that important to delay with that level of accuracy?  In most
> cases, the chances of actually interrupting a given vacuum delay point are
> pretty small.  Even in the extreme scenario you tested with ~350K
> interrupts in a 19 minute vacuum, you only saw a 10-15% difference in total
> time.  I wouldn't say I'm diametrically opposed to this patch, but I do
> think we need to carefully consider whether it's worth the extra code.
>

I'm not 100% sure that it is worth it, but OTOH I'm thinking that could be the
case for someone who cares enough to change the cost delay from, say, 2ms to 3ms.

I mean, given the granularity we expose in the cost delay, one could expect to
get "accurate" delay. The doc is cautious enough to mention that "such delays 
may
not be measured accurately on older platforms" which makes me think that could
be worth to implement it.
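A sketch of the idea being weighed here (not the actual patch): restart the sleep with the remaining time whenever a signal interrupts it:

```c
#include <errno.h>
#include <time.h>

static void
sleep_full_delay(long usec)
{
	struct timespec delay;
	struct timespec remain;

	delay.tv_sec = usec / 1000000L;
	delay.tv_nsec = (usec % 1000000L) * 1000L;

	/* on EINTR, continue sleeping for whatever time is left */
	while (nanosleep(&delay, &remain) == -1 && errno == EINTR)
		delay = remain;
}
```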

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: gitmaster server problem?

2024-08-19 Thread Dave Page
On Mon, 19 Aug 2024 at 15:27, jian he  wrote:

> On Mon, Aug 19, 2024 at 9:56 PM Bruce Momjian  wrote:
> >
>
> as of now.
> cgit working
> https://git.postgresql.org/cgit/postgresql.git/


Yes.


>
>
>
> git not working
> ( 403 Forbidden nginx)
> https://git.postgresql.org/git/postgresql.git/


From a browser, that's what I'd expect. You should be able to clone from it
though:

git clone https://git.postgresql.org/git/postgresql.git
Cloning into 'postgresql'...
remote: Enumerating objects: 42814, done.
remote: Counting objects: 100% (42814/42814), done.
remote: Compressing objects: 100% (15585/15585), done.
remote: Total 1014138 (delta 34190), reused 33980 (delta 26978),
pack-reused 971324
Receiving objects: 100% (1014138/1014138), 343.01 MiB | 15.49 MiB/s, done.
Resolving deltas: 100% (873517/873517), done.

-- 
Dave Page
pgAdmin: https://www.pgadmin.org
PostgreSQL: https://www.postgresql.org
EDB: https://www.enterprisedb.com

PGDay UK 2024, 11th September, London: https://2024.pgday.uk/


Re: gitmaster server problem?

2024-08-19 Thread Dave Page
On Mon, 19 Aug 2024 at 15:29, Tom Lane  wrote:

> Magnus Hagander  writes:
> > On Mon, Aug 19, 2024 at 3:56 PM Tom Lane  wrote:
> >> Yeah, I'm seeing that too; not at the top level of the site but
> >> when you try to drill down to individual messages.
> >> commitfest.postgresql.org isn't responding either,
> >> nor wiki.postgresql.org.
>
> > Are you still seeing these issues? It works fine both from here and from
> > our monitoring, but there could be something source-network-specific
> > maybe
>
> Right at the moment,
>
> https://git.postgresql.org/gitweb/?p=postgresql.git
>
> is failing for me with "Backend fetch failed".


That's working fine for me now, however Magnus did just give the box some
more CPUs.

We're also considering rate limiting by IP on that server, as it seems to
be getting hammered with requests from China.

-- 
Dave Page
pgAdmin: https://www.pgadmin.org
PostgreSQL: https://www.postgresql.org
EDB: https://www.enterprisedb.com

PGDay UK 2024, 11th September, London: https://2024.pgday.uk/


Re: gitmaster server problem?

2024-08-19 Thread Magnus Hagander
On Mon, Aug 19, 2024 at 4:28 PM Dave Page  wrote:

>
>
> On Mon, 19 Aug 2024 at 14:56, Bruce Momjian  wrote:
>
>> On Sat, Aug 17, 2024 at 02:15:49PM -0400, Tom Lane wrote:
>> > Matthias van de Meent  writes:
>> > > Additional infra that doesn't seem to work right now: mailing list
>> > > archives [0] seem to fail at a 503 produced by Varnish Cache server:
>> > > Error 503 Backend fetch failed. Maybe more of infra is down, or
>> > > otherwise under maintenance?
>> >
>> > Yeah, I'm seeing that too; not at the top level of the site but
>> > when you try to drill down to individual messages.
>> >
>> > commitfest.postgresql.org isn't responding either,
>> > nor wiki.postgresql.org.
>>
>> Okay, thanks for the confirmation.  I emailed pginfra at the same time I
>> posted the first email, so they are aware.  I have not received a reply
>> from them yet.
>>
>
> Joe replied.

In the context of replying, it's also worth noting that CCing the same
email to both -hackers and -www ensures it gets caught in moderation, and
will take longer to reach someone.

/Magnus


Re: gitmaster server problem?

2024-08-19 Thread Adrian Klaver

On 8/19/24 07:27, Tom Lane wrote:
> Magnus Hagander  writes:
>
> Right at the moment,
>
> https://git.postgresql.org/gitweb/?p=postgresql.git
>
> is failing for me with "Backend fetch failed".

I just tried it and it worked.

> However, the mail
> archives, commitfest.p.o, and wiki.p.o seem to be responding normally,
> as is gitmaster.
>
> regards, tom lane




--
Adrian Klaver
adrian.kla...@aklaver.com





Test 041_checkpoint_at_promote.pl failed in installcheck due to missing injection_points

2024-08-19 Thread Maxim Orlov
Hi!

After rebasing one of my old patches, I hit a problem with the
installcheck test for 041_checkpoint_at_promote.pl.
At first, I thought it was something wrong with my patch, although it
doesn't touch this part of Postgres.
Then I did the same run on current master and got the same
result.

Here is my configure:
SRC="../postgres"
TRG=`pwd`

LINUX_CONFIGURE_FEATURES="
--without-llvm
--with-tcl --with-tclconfig=/usr/lib/tcl8.6/ --with-perl
--with-python --with-gssapi --with-pam --with-ldap --with-selinux
--with-systemd --with-uuid=ossp --with-libxml --with-libxslt
--with-zstd
--with-ssl=openssl
"

$SRC/configure \
-C \
--prefix=$TRG/"pgsql" \
--enable-debug --enable-tap-tests --enable-depend --enable-cassert \
--enable-injection-points --enable-nls \
$LINUX_CONFIGURE_FEATURES \
CC="ccache clang" CXX="ccache clang++" \
CFLAGS="-Og -ggdb -fsanitize-recover=all" \
CXXFLAGS="-Og -ggdb -fsanitize-recover=all"

And here is my run:
$ time make PROVE_TESTS=t/041_checkpoint_at_promote.pl installcheck -C
src/test/recovery
...
# Postmaster PID for node "standby1" is 820439
error running SQL: 'psql:<stdin>:1: ERROR:  extension "injection_points" is
not available
DETAIL:  Could not open extension control file
"/home/omg/proj/build/pgsql/share/extension/injection_points.control": No
such file or directory.
HINT:  The extension must first be installed on the system where PostgreSQL
is running.'
while running 'psql -XAtq -d port=17154 host=/tmp/xkTLcw1tDb
dbname='postgres' -f - -v ON_ERROR_STOP=1' with sql 'CREATE EXTENSION
injection_points;' at
/home/omg/proj/build/../postgres/src/test/perl/PostgreSQL/Test/Cluster.pm
line 2140.
# Postmaster PID for node "master" is 820423
...

Clearly, Postgres can't find the injection_points extension.
Am I doing something wrong, or is it a problem with the injection_points
extension itself?

-- 
Best regards,
Maxim Orlov.


Re: Remove redundant NULL check in clause_selectivity_ext() function

2024-08-19 Thread Tom Lane
Ilia Evdokimov  writes:
> In the `clause_selectivity_ext()` function, there’s a comment regarding 
> a NULL clause check that asks, "can this still happen?". I decided to 
> investigate whether this scenario is still valid.

> Here’s what I found after reviewing the relevant cases:

> - approx_tuple_count(): The function iterates over a loop of other clauses.
> - get_foreign_key_join_selectivity(): The function is invoked within an 
> `if (clause)` condition.
> - consider_new_or_clause(): The clause is generated by 
> `make_restrictinfo()`, which never returns NULL.
> - clauselist_selectivity_ext(): This function either checks 
> `list_length(clauses) == 1` before being called, or it is called within 
> a loop of clauses.

That list_length check doesn't prove anything about whether the list
element is NULL, though.

While I suspect that you may be right that the case doesn't occur
anymore (if it ever did), I'm inclined to leave this test in place.
It's cheap enough compared to what the rest of the function will do,
and more importantly we can't assume that all interesting call sites
are within Postgres core.  There are definitely extensions calling
clauselist_selectivity and related functions.  It's possible that
some of them rely on clause_selectivity not crashing on a NULL.
Certainly, such an assumption could be argued to be a bug they
should fix; but I'm disinclined to make them jump through that
hoop for a vanishingly small performance improvement.

Also, there are boatloads of other places where the planner has
possibly-redundant checks for null clause pointers.  It's likely
that some of the other ones are more performance-critical than
this.  But I wouldn't be in favor of retail removal of the others
either.  Maybe with a more holistic approach we could remove a
whole lot of them and make a measurable improvement; but it
would require some careful thought about what invariants we
want to assume.  There's not really any design principles
about this right now, and where we've ended up is that most
functions operating on expression trees assume they've got to
defend against NULL inputs.  To remove those checks, we'd
need a clear understanding of where caller-side checks need
to be added instead.

regards, tom lane




Re: Report search_path value back to the client.

2024-08-19 Thread Tomas Vondra
On 8/14/24 18:30, Jelte Fennema-Nio wrote:
> On Wed, 14 Aug 2024 at 18:22, Tomas Vondra  wrote:
>> Here's the patch with a somewhat expanded / improved commit message.
>> Jelte, can you take a look there's no silly mistake?
> 
> Looks good to me.

Pushed, after rewording the commit message a bit.
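With search_path now in the reported set, a client can read the value the server last announced without issuing a query. A small fragment (assuming an established PGconn *conn):

```c
/* Returns the most recently reported value, or NULL if the server
 * never reported this parameter (e.g. an older server). */
const char *sp = PQparameterStatus(conn, "search_path");

if (sp != NULL)
	printf("search_path is now: %s\n", sp);
```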

Now let's see how the other protocol thread goes ...


regards

-- 
Tomas Vondra




Re: Avoid orphaned objects dependencies, take 3

2024-08-19 Thread Bertrand Drouvot
Hi,

On Wed, Jul 10, 2024 at 07:31:06AM +, Bertrand Drouvot wrote:
> So, to sum up:
> 
> A. Locking is now done exclusively with LockNotPinnedObject(Oid classid, Oid 
> objid)
> so that it's now always clear what object we want to acquire a lock for. It 
> means
> we are not manipulating directly an object address or a list of objects 
> address
> as it was the case when the locking was done "directly" within the dependency 
> code.
> 
> B. A special case is done for objects that belong to the RelationRelationId 
> class.
> For those, we should be in one of the two following cases that would already
> prevent the relation from being dropped:
> 
>  1. The relation is already locked (could be an existing relation or a 
> relation
>  that we are creating).
> 
>  2. The relation is protected indirectly (i.e. an index protected by a lock on
>  its table, a table protected by a lock on a function that depends on the
> table...)
> 
> To avoid any risks for the RelationRelationId class case, we acquire a lock if
> there is none. That may add unnecessary lock for 2. but that seems worth it. 
> 

Please find attached v16, mandatory rebase due to 80ffcb8427.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 79c81071b061466ef51ca863d1b0189298430984 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Fri, 29 Mar 2024 15:43:26 +
Subject: [PATCH v16] Avoid orphaned objects dependencies

It's currently possible to create orphaned objects dependencies, for example:

Scenario 1:

session 1: begin; drop schema schem;
session 2: create a function in the schema schem
session 1: commit;

With the above, the function created in session 2 would be linked to a
non-existing schema.

Scenario 2:

session 1: begin; create a function in the schema schem
session 2: drop schema schem;
session 1: commit;

With the above, the function created in session 1 would be linked to a
non-existing schema.

To avoid those scenarios, a new lock (that conflicts with a lock taken by DROP)
has been put in place before the dependencies are being recorded. With this in
place, the drop schema in scenario 2 would be locked.

Also, after the new lock attempt, the patch checks that the object still exists:
with this in place session 2 in scenario 1 would be locked and would report an
error once session 1 commits (that would not be the case should session 1 abort
the transaction).

If the object is dropped before the new lock attempt is triggered then the patch
would also report an error (but with less details).

The patch takes into account any type of objects except the ones that are pinned
(they are not droppable because the system requires it).

A special case is done for objects that belong to the RelationRelationId class.
For those, we should be in one of the two following cases that would already
prevent the relation from being dropped:

1. The relation is already locked (could be an existing relation or a relation
that we are creating).

2. The relation is protected indirectly (i.e. an index protected by a lock on
its table, a table protected by a lock on a function that depends on the table...)

To avoid any risks for the RelationRelationId class case, we acquire a lock if
there is none. That may add unnecessary lock for 2. but that's worth it.

The patch adds a few tests for some dependency cases (that would currently produce
orphaned objects):

- schema and function (as the above scenarios)
- alter a dependency (function and schema)
- function and arg type
- function and return type
- function and function
- domain and domain
- table and type
- server and foreign data wrapper
---
 src/backend/catalog/aclchk.c  |   1 +
 src/backend/catalog/dependency.c  | 212 ++
 src/backend/catalog/heap.c|   7 +
 src/backend/catalog/index.c   |  26 +++
 src/backend/catalog/objectaddress.c   |  57 +
 src/backend/catalog/pg_aggregate.c|   9 +
 src/backend/catalog/pg_attrdef.c  |   1 +
 src/backend/catalog/pg_cast.c |   5 +
 src/backend/catalog/pg_collation.c|   1 +
 src/backend/catalog/pg_constraint.c   |  26 +++
 src/backend/catalog/pg_conversion.c   |   2 +
 src/backend/catalog/pg_depend.c   |  40 +++-
 src/backend/catalog/pg_operator.c |  19 ++
 src/backend/catalog/pg_proc.c |  17 +-
 src/backend/catalog/pg_publication.c  |  11 +
 src/backend/catalog/pg_range.c|   6 +
 src/backend/catalog/pg_type.c |  39 
 src/backend/catalog/toasting.c|   1 +
 src/backend/commands/alter.c  |   4 +
 src/backend/commands/amcmds.c |   1 +
 src/backend/commands/cluster.c|   7 +
 src/backend/commands/event_trigger.c  |   1 +
 src/backend/commands/extension.c  |   5 +
 src/backend/co

Re: Remove redundant NULL check in clause_selectivity_ext() function

2024-08-19 Thread Ilia Evdokimov


On 19.8.24 18:02, Tom Lane wrote:
> Ilia Evdokimov  writes:
>> In the `clause_selectivity_ext()` function, there’s a comment regarding
>> a NULL clause check that asks, "can this still happen?". I decided to
>> investigate whether this scenario is still valid.
>> Here’s what I found after reviewing the relevant cases:
>> - approx_tuple_count(): The function iterates over a loop of other clauses.
>> - get_foreign_key_join_selectivity(): The function is invoked within an
>> `if (clause)` condition.
>> - consider_new_or_clause(): The clause is generated by
>> `make_restrictinfo()`, which never returns NULL.
>> - clauselist_selectivity_ext(): This function either checks
>> `list_length(clauses) == 1` before being called, or it is called within
>> a loop of clauses.
>
> That list_length check doesn't prove anything about whether the list
> element is NULL, though.
>
> While I suspect that you may be right that the case doesn't occur
> anymore (if it ever did), I'm inclined to leave this test in place.
> It's cheap enough compared to what the rest of the function will do,
> and more importantly we can't assume that all interesting call sites
> are within Postgres core.  There are definitely extensions calling
> clauselist_selectivity and related functions.  It's possible that
> some of them rely on clause_selectivity not crashing on a NULL.
> Certainly, such an assumption could be argued to be a bug they
> should fix; but I'm disinclined to make them jump through that
> hoop for a vanishingly small performance improvement.
>
> Also, there are boatloads of other places where the planner has
> possibly-redundant checks for null clause pointers.  It's likely
> that some of the other ones are more performance-critical than
> this.  But I wouldn't be in favor of retail removal of the others
> either.  Maybe with a more holistic approach we could remove a
> whole lot of them and make a measurable improvement; but it
> would require some careful thought about what invariants we
> want to assume.  There's not really any design principles
> about this right now, and where we've ended up is that most
> functions operating on expression trees assume they've got to
> defend against NULL inputs.  To remove those checks, we'd
> need a clear understanding of where caller-side checks need
> to be added instead.
>
> regards, tom lane


Let's assume that this check needs to remain, and the length check
doesn't guarantee anything. However, I'm a bit concerned that there's a
NULL check here, but it's missing in the clauselist_selectivity_ext()
function. For the reasons mentioned above, I would suggest the
following: either we perform the NULL check in both places, or we don't
perform it in either.
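For readers following along, the check in question looks roughly like this in clause_selectivity_ext() (paraphrased, not copied from the source):

```c
Selectivity s1 = 0.5;		/* default for any unhandled clause type */

if (clause == NULL)			/* can this still happen? */
	return s1;
```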


--
Regards,
Ilia Evdokimov,
Tantor Labs LCC.


Re: Report search_path value back to the client.

2024-08-19 Thread Tom Lane
Tomas Vondra  writes:
> On 8/14/24 18:30, Jelte Fennema-Nio wrote:
>> Looks good to me.

> Pushed, after rewording the commit message a bit.

This patch does not appear to have updated any of the relevant
documentation.

regards, tom lane




Re: Add support for (Var op Var) clause in extended MCV statistics

2024-08-19 Thread Ilia Evdokimov

On 12.8.24 19:25, Tomas Vondra wrote:
> Is TPC-B really interesting/useful for this patch? The queries are super
> simple, with only a single clause (so it may not even get to the code
> handling extended statistics). Did you create any extended stats?

No, it's not the case. I simply wanted to verify that other queries are
not slowed down after applying my patch.

> I think you'll need to construct a custom test, with queries that have
> multiple (var op var) clauses, extended stats created, etc. And
> benchmark that.

I used the test generator from a previous thread [1] and ran it with
default_statistics_target = 1000 to achieve more accurate estimates
for 3000 rows. It would also be beneficial to run tests with 10,000 and
100,000 rows for a broader perspective. I've attached the python test.
Here’s a breakdown of the issues:

1. (A op A) Clause: Before applying my patch, there were poor estimates
   for expressions like (A op A). Currently, we only have correct
   estimates for the (A = A) clause, which transforms into A IS NOT
   NULL. Should I address this in this thread? I believe we should
   extend the same correction to clauses like (A != A), (A < A),
   and similar conditions. However, this issue is not for the current thread.
2. AND Clauses: The estimates for AND clauses were inaccurate before my
   patch. I noticed code segments where I could add something specific
   for the (Var op Var) clause, but I'm unsure if I'm missing
   anything crucial. If my understanding is incorrect, I'd appreciate
   any guidance or corrections.

> FWIW I don't think it makes sense to benchmark the query execution - if
> the estimate improves, it's possible to get arbitrary speedup, but
> that's expected and mostly irrelevant I think.
>
> What I'd focus on is benchmarking just the query planning - we need the
> overhead to be negligible (or at least small) so that it does not hurt
> people with already good plans.
>
> BTW can you elaborate why you are interested in this patch? Do you just
> think it's interesting/useful, or do you have a workload where it would
> actually help? I'm asking because me being uncertain how beneficial this
> is in practice (not just being nice in theory) was one of the reasons
> why I didn't do more work on this in 2021.
>
> regards



I have two reasons for pursuing this. Firstly, I've encountered some of 
these queries in practice, although they are quite rare. While it might 
be easy to dismiss these cases due to their infrequency, I believe that 
we shouldn't overlook the opportunity to develop better handling for 
them, regardless of how seldom they occur.


Secondly, I see that you're working on improving estimates for JOIN
clauses in thread [2]. I believe that enhancing estimates for these rare
cases could also benefit future work on JOIN queries, particularly those
with multiple ON (T1.column = T2.column) conditions, which are
essentially (Var op Var) clauses. My idea is to start with non-JOIN
queries, and then apply the same approach to improve JOIN estimates. Of
course, I might be wrong, but I think this approach has potential.



P.S. If I sent this mail twice, I'm sorry. I wanted to send the results of
the test, but they were not sent to hackers because of the big size of the
attached file. Now I'm sending only the test.


[1]: 
https://www.postgresql.org/message-id/ecc0b08a-518d-7ad6-17ed-a5e962fc4f5f%40enterprisedb.com


[2]: 
https://www.postgresql.org/message-id/flat/c8c0ff31-3a8a-7562-bbd3-78b2ec65f16c%40enterprisedb.com 



--
Regards,
Ilia Evdokimov,
Tantor Labs LCC.
#!/usr/bin/python3

import psycopg2
import random
import select
import hashlib

def generate_conditions(nclauses, attributes = ['a', 'b', 'c', 'd'], operators = ['<', '<=', '=', '!=', '>=', '>']):

	if nclauses == 1:
		cols = [random.choice(attributes), random.choice(attributes)]
		oper = ' ' + random.choice(operators) + ' '

		clause = oper.join(cols)

		if random.randint(0,100) < 50:
			clause = 'NOT ' + clause

		return clause


	nparts = random.randint(2, nclauses)

	# distribute the clauses between query parts
	nclauses_parts = [1 for p in range(0, nparts)]

	for x in range(0, nclauses - nparts):
		nclauses_parts[random.randint(0, nparts) - 1] += 1

	parts = []
	for p in range(0, nparts):
		parts.append('(' + generate_conditions(nclauses_parts[p], attributes, operators) + ')')

	c = random.choice([' AND ', ' OR '])

	return c.join(parts)

def generate_data(nrows, attributes = ['a', 'b', 'c', 'd']):

	sql = 'insert into t (' + ','.join(attributes) + ') select '

	attrs = []
	for attr in attributes:

		x = random.choice([-1, 2, 5, 10, 20, 30])

		if x == -1:
			x = random.randint(5, 20)
			expr = '(random() * ' + str(x) + ')::int'
		else:
			expr = 'mod(i,' + str(x) + ')'

		if random.randint(0,100) < 50:
			x = random.choice([2, 5, 10, 20, 30])
			attrs.append('case when mod(i,' + str(x) + ') = 0 then null else ' + expr + ' end')
		else:
			attrs.append(expr)

	sql += ', '.join(attrs) + ' from generate_series(1,' + str(nrows) + ') i'
	return sql

Re: Test 041_checkpoint_at_promote.pl failed in installcheck due to missing injection_points

2024-08-19 Thread Tom Lane
Maxim Orlov  writes:
> After rebasing one of my old patches, I'm hit to a problem with the
> installcheck test for 041_checkpoint_at_promote.pl.

src/test/recovery/README points out that

  Also, to use "make installcheck", you must have built and installed
  contrib/pg_prewarm, contrib/pg_stat_statements and contrib/test_decoding
  in addition to the core code.

I suspect this needs some additional verbiage about also installing
src/test/modules/injection_points if you've enabled injection points.

(I think we haven't noticed because most people just use "make check"
instead.)
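Until the README gets that verbiage, installing the module by hand before
"make installcheck" should do it (assumed invocation; adjust to your build
tree):

make -C src/test/modules/injection_points install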

regards, tom lane




Re: Report search_path value back to the client.

2024-08-19 Thread Tomas Vondra
On 8/19/24 18:02, Tom Lane wrote:
> Tomas Vondra  writes:
>> On 8/14/24 18:30, Jelte Fennema-Nio wrote:
>>> Looks good to me.
> 
>> Pushed, after rewording the commit message a bit.
> 
> This patch does not appear to have updated any of the relevant
> documentation.
> 

Oh, I hadn't realized we explicitly list which parameters are reported.

I see there are two places in libpq.sgml and protocol.sgml that should
list search_path - will fix. I haven't found any other place in the docs
that would need an update, or did I miss something?


regards

-- 
Tomas Vondra




Re: Report search_path value back to the client.

2024-08-19 Thread Tom Lane
Tomas Vondra  writes:
> I see there are two places in libpq.sgml and protocol.sgml that should
> list search_path - will fix. I haven't found any other place in the docs
> that would need an update, or did I miss something?

I recall there being two places, so that's probably the extent of it
... yeah, the last similar patch was d16f8c8e4, and that's what
it did.

regards, tom lane




Re: Vacuum statistics

2024-08-19 Thread Ilia Evdokimov
Are you certain that all tables are included in `pg_stat_vacuum_tables`? 
I'm asking because of the following:



SELECT count(*) FROM pg_stat_all_tables ;
 count
---
   108
(1 row)

SELECT count(*) FROM pg_stat_vacuum_tables ;
 count
---
    20
(1 row)

--
Regards,
Ilia Evdokimov,
Tantor Labs LCC.





Re: generic plans and "initial" pruning

2024-08-19 Thread Robert Haas
On Fri, Aug 16, 2024 at 8:36 AM Amit Langote  wrote:
> So it is possible for the executor to try to run a plan that has
> become invalid since it was created, so...

I'm not sure what the "so what" here is.

> One perhaps crazy idea [1]:
>
> What if we remove AcquireExecutorLocks() and move the responsibility
> of taking the remaining necessary locks into the executor (those on
> any inheritance children that are added during planning and thus not
> accounted for by AcquirePlannerLocks()), like the patch already does,
> but don't make it also check if the plan has become invalid, which it
> can't do anyway unless it's from a CachedPlan.  That means we instead
> let the executor throw any errors that occur when trying to either
> initialize the plan because of the changes that have occurred to the
> objects referenced in the plan, like what is happening in the above
> example.  If that case is going to be rare anyway, why spend energy on
> checking the validity and replan, especially if that's not an easy
> thing to do as we're finding out.  In the above example, we could say
> that it's a user error to create a rule like that, so it should not
> happen in practice, but when it does, the executor seems to deal with
> it correctly by refusing to execute a broken plan.  Perhaps it's more
> worthwhile to make the executor behave correctly in face of plan
> invalidation than teach the rest of the system to deal with the
> executor throwing its hands up when it runs into an invalid plan?
> Again, I think this may be a crazy line of thinking but just wanted to
> get it out there.

I don't know whether this is crazy or not. I think there are two
issues. One, the set of checks that we have right now might not be
complete, and we might just not have realized that because it happens
infrequently enough that we haven't found all the bugs. If that's so,
then a change like this could be a good thing, because it might force
us to fix stuff we should be fixing anyway. I have a feeling that some
of the checks you hit there were added as bug fixes long after the
code was written originally, so my confidence that we don't have more
bugs isn't especially high.

And two, it matters a lot how frequent the errors will be in practice.
I think we normally try to replan rather than let a stale plan be used
because we want to not fail, because users don't like failure. If the
design you propose here would make failures more (or less) frequent,
then that's a problem (or awesome).

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: generic plans and "initial" pruning

2024-08-19 Thread Tom Lane
Robert Haas  writes:
> On Fri, Aug 16, 2024 at 8:36 AM Amit Langote  wrote:
>> So it is possible for the executor to try to run a plan that has
>> become invalid since it was created, so...

> I'm not sure what the "so what" here is.

The fact that there are holes in our protections against that doesn't
make it a good idea to walk away from the protections.  That path
leads to crashes and data corruption and unhappy users.

What the examples here are showing is that AcquireExecutorLocks
is incomplete because it only provides defenses against DDL
initiated by other sessions, not by our own session.  We have
CheckTableNotInUse but I'm not sure if it could be applied here.
We certainly aren't calling that in anywhere near as systematic
a way as we have for acquiring locks.

Maybe we should rethink the principle that a session's locks
never conflict against itself, although I fear that might be
a nasty can of worms.

Could it work to do CheckTableNotInUse when acquiring an
exclusive table lock?  I don't doubt that we'd have to fix some
code paths, but if the damage isn't extensive then that
might offer a more nearly bulletproof approach.

regards, tom lane




Re: generic plans and "initial" pruning

2024-08-19 Thread Robert Haas
On Mon, Aug 19, 2024 at 12:54 PM Tom Lane  wrote:
> What the examples here are showing is that AcquireExecutorLocks
> is incomplete because it only provides defenses against DDL
> initiated by other sessions, not by our own session.  We have
> CheckTableNotInUse but I'm not sure if it could be applied here.
> We certainly aren't calling that in anywhere near as systematic
> a way as we have for acquiring locks.
>
> Maybe we should rethink the principle that a session's locks
> never conflict against itself, although I fear that might be
> a nasty can of worms.

It might not be that bad. It could replace the CheckTableNotInUse()
protections that we have today but maybe cover more cases, and it
could do so without needing any changes to the shared lock manager.
Say every time you start a query you give that query an ID number, and
all locks taken by that query are tagged with that ID number in the
local lock table, and maybe some flags indicating why the lock was
taken. When a new lock acquisition comes along you can say "oh, this
lock was previously taken so that we could do thus-and-so" and then
use that to fail with the appropriate error message. That seems like
it might be more powerful than the refcnt check within
CheckTableNotInUse().
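A rough sketch of that bookkeeping (every name below is invented for
illustration):

```c
/* Hypothetical per-lock bookkeeping in the local lock table. */
typedef enum LockTakenReason
{
	LOCK_TAKEN_FOR_PLANNING,
	LOCK_TAKEN_FOR_EXECUTION,
	LOCK_TAKEN_FOR_DDL
} LockTakenReason;

typedef struct LocalLockUsage
{
	uint64			query_id;	/* assigned when the query starts */
	LockTakenReason reason;		/* why that query took the lock */
} LocalLockUsage;

/*
 * On a new acquisition, an existing entry tagged with a different
 * query_id (and an incompatible reason) could produce a targeted
 * "table is in use" error instead of relying only on refcnt checks.
 */
```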

But that seems somewhat incidental to what this thread is about. IIUC,
Amit's original design involved having the plan cache call some new
executor function to do partition pruning before lock acquisition, and
then passing that data structure around, including back to the
executor, so that we didn't repeat the pruning we already did, which
would be a bad thing to do not only because it would incur CPU cost
but also because really bad things would happen if we got a different
answer the second time. IIUC, you didn't think that was going to work
out nicely, and suggested instead moving the pruning+locking to
ExecutorStart() time. But now Amit is finding problems with that
approach, because by the time we reach PortalRun() for the
PORTAL_MULTI_QUERY case, it's too late to replan, because we can't ask
the plancache to replan just one query from the list; and if we try to
fix that by moving ExecutorStart() to PortalStart(), then there are
other problems. Do you have a view on what the way forward might be?

This thread has gotten a tad depressing, honestly. All of the opinions
about what we ought to do seem to be based on the firm conviction that
X or Y or Z will not work, rather than on the confidence that A or B
or C will work. Yet I'm inclined to believe this problem is solvable.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: generic plans and "initial" pruning

2024-08-19 Thread Tom Lane
Robert Haas  writes:
> But that seems somewhat incidental to what this thread is about.

Perhaps.  But if we're running into issues related to that, it might
be good to set aside the long-term goal for a bit and come up with
a cleaner answer for intra-session locking.  That could allow the
pruning problem to be solved more cleanly in turn, and it'd be
an improvement even if not.

> Do you have a view on what the way forward might be?

I'm fresh out of ideas at the moment, other than having a hope that
divide-and-conquer (ie, solving subproblems first) might pay off.

> This thread has gotten a tad depressing, honestly. All of the opinions
> about what we ought to do seem to be based on the firm conviction that
> X or Y or Z will not work, rather than on the confidence that A or B
> or C will work. Yet I'm inclined to believe this problem is solvable.

Yeah.  We are working in an extremely not-green field here, which
means it's a lot easier to see pre-existing reasons why X will not
work than to have confidence that it will work.  But hey, if this
were easy then we'd have done it already.

regards, tom lane




Re: Report search_path value back to the client.

2024-08-19 Thread Tomas Vondra
On 8/19/24 18:19, Tom Lane wrote:
> Tomas Vondra  writes:
>> I see there are two places in libpq.sgml and protocol.sgml that should
>> list search_path - will fix. I haven't found any other place in the docs
>> that would need an update, or did I miss something?
> 
> I recall there being two places, so that's probably the extent of it
> ... yeah, the last similar patch was d16f8c8e4, and that's what
> it did.
> 

Thanks, updated in a similar way.

-- 
Tomas Vondra




Re: Report search_path value back to the client.

2024-08-19 Thread Tom Lane
Tomas Vondra  writes:
> On 8/19/24 18:19, Tom Lane wrote:
>> I recall there being two places, so that's probably the extent of it
>> ... yeah, the last similar patch was d16f8c8e4, and that's what
>> it did.

> Thanks, updated in a similar way.

LGTM, thanks.

regards, tom lane




Re: generic plans and "initial" pruning

2024-08-19 Thread Robert Haas
On Mon, Aug 19, 2024 at 1:52 PM Tom Lane  wrote:
> Robert Haas  writes:
> > But that seems somewhat incidental to what this thread is about.
>
> Perhaps.  But if we're running into issues related to that, it might
> be good to set aside the long-term goal for a bit and come up with
> a cleaner answer for intra-session locking.  That could allow the
> pruning problem to be solved more cleanly in turn, and it'd be
> an improvement even if not.

Maybe, but the pieces aren't quite coming together for me. Solving
this would mean that if we execute a stale plan, we'd be more likely
to get a good error and less likely to get a bad, nasty-looking
internal error, or a crash. That's good on its own terms, but we don't
really want user queries to produce errors at all, so I don't think
we'd feel any more free to rearrange the order of operations than we
do today.

> > Do you have a view on what the way forward might be?
>
> I'm fresh out of ideas at the moment, other than having a hope that
> divide-and-conquer (ie, solving subproblems first) might pay off.

Fair enough, but why do you think that the original approach of
creating a data structure from within the plan cache mechanism
(probably via a call into some new executor entrypoint) and then
feeding that through to ExecutorRun() time can't work? Is it possible
you latched onto some non-optimal decisions that the early versions of
the patch made, rather than there being a fundamental problem with the
concept?

I actually thought the do-it-at-executorstart-time approach sounded
pretty good, even though we might have to abandon planstate tree
initialization partway through, right up until Amit started talking
about moving ExecutorStart() from PortalRun() to PortalStart(), which
I have a feeling is going to create a bigger problem than we can
solve. I think if we want to save that approach, we should try to
figure out if we can teach the plancache to replan one query from a
list without replanning the others, which seems like it might allow us
to keep the order of major operations unchanged. Otherwise, it makes
sense to me to have another go at the other approach, at least to make
sure we understand clearly why it can't work.

> Yeah.  We are working in an extremely not-green field here, which
> means it's a lot easier to see pre-existing reasons why X will not
> work than to have confidence that it will work.  But hey, if this
> were easy then we'd have done it already.

Yeah, true.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-08-19 Thread Jelte Fennema-Nio
On Mon, 19 Aug 2024 at 16:16, Robert Haas  wrote:
> If somebody is using PQprotocolVersion() to detect the arrival of a
> new protocol version, it stands to reason that they only care about
> new major protocol versions, because that's what the function is
> defined to tell you about. Anyone who has done a moderate amount of
> looking into this area will understand that the protocol has a major
> version number and a minor version number and that this function only
> returns the former. Therefore, they should expect that the arrival of
> a new minor protocol version won't change the return value of this
> function.

What I'm trying to say is: I don't think there's any usecase where
people would care about a major bump, but not a minor bump. Especially
keeping in mind that a minor bump had never occurred when originally
creating this function. And because we never did one, there has so far
been no definition of the actual difference between a major
and a minor bump.

> I really don't understand why we're still arguing about this. It seems
> to me that we've established that there is some usage of the existing
> function, and that changing the return value will break something.
> Sure, so far as we know that something is "only" regression tests, but
> there's no guarantee that there couldn't be other code that we don't
> know about that breaks worse

My point is that the code that breaks actually wants to be broken in this case.

> and even there isn't, who wants to break
> regression tests when there's nothing actually wrong?

Updating the regression test would be less work than adding support
for a new API. So if the main problem is

> Now we could
> decide we're going to do it anyway because of whatever reason we might
> have, but it doesn't seem like that's what most people want to do.
>
> I feel like we're finally in a position to get some things done here
> and this doesn't seem like the point to get stuck on. YMMV, of course.

I'd love to hear a response from Jacob and Heikki on my arguments
after their last response. But if after reading those arguments they
still think we should add a new function, I'll update the patchset to
include a new function.




Re: Incremental View Maintenance, take 2

2024-08-19 Thread Kirill Reshke
On Tue, 30 Jul 2024 at 10:24, Yugo NAGATA  wrote:
>
> Hi,
>
> On Tue, 30 Jul 2024 03:32:19 +0500
> Kirill Reshke  wrote:
>
> > On Sat, 27 Jul 2024 at 13:26, Kirill Reshke  wrote:
> > >
> > > Hi!
> > > Cloudberry DB (Greenplum fork) uses IMMV feature for AQUMV (auto query
> > > use matview) feature, so i got interested in how it is implemented.
>
> Thank you so much for a lot of comments!
> I will respond to the comments soon.
>
> > >
> > > On Thu, 11 Jul 2024 at 09:24, Yugo NAGATA  wrote:
> > > >
> > > > I updated the patch to bump up the version numbers in psql and pg_dump 
> > > > codes
> > > > from 17 to 18.
> > >
> > > Few suggestions:
> > >
> > > 1) `Add-relisivm-column-to-pg_class-system-catalog` commit message
> > > should be fixed, there is "isimmv" in the last line.
> > > 2) I don't get why `Add-Incremental-View-Maintenance-support.patch`
> > > goes after 0005 & 0004. Shouldn't we first implement the feature
> > > server-side, and only then the client (psql & pg_dump) side?
> > > 3) Can we provide regression tests for each function separately? Test
> > > for main feature in main patch, test for DISTINCT support in
> > > v34-0007-Add-DISTINCT-support-for-IVM.patch etc? This way the patchset
> > > will be easier to review, and can be committed separately.
> > > 4) v34-0006-Add-Incremental-View-Maintenance-support.patch no longer
> > > applies due to 4b74ebf726d444ba820830cad986a1f92f724649. After
> > > resolving issues manually, it does not compile, because
> > > 4b74ebf726d444ba820830cad986a1f92f724649 also removes
> > > save_userid/save_sec_context fields from ExecCreateTableAs.
> > >
> > > >   if (RelationIsIVM(matviewRel) && stmt->skipData)
> > > Now this function accepts a skipData param.
> > >
> > > 5) For DISTINCT support, the patch uses hidden __ivm* columns. Is this
> > > design discussed anywhere? I wonder if this is a necessity (the only
> > > solution) or if there are alternatives.
> > > 6)
> > > What are the caveats of supporting some simple cases for aggregation
> > > funcs like in the example?
> > > ```
> > > regress=# CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm_2 AS SELECT
> > > sum(j) + sum(i) from mv_base_a;
> > > ERROR:  expression containing an aggregate in it is not supported on
> > > incrementally maintainable materialized view
> > > ```
> > > I can see some difficulties with division, CREATE IMMV AS SELECT
> > > 1/sum(i) from mv_base_a;  (the sum(i) == 0 case), but addition &
> > > multiplication should be OK, shouldn't they?
> > >
> > >
> > > Overall, the patchset looks mature; however, it is far from being
> > > committable due to lack of testing/feedback/discussion. There is only
> > > one way to fix this... Test and discuss it!
> > >
> > >
> > > [1] https://github.com/cloudberrydb/cloudberrydb
> >
> > Hi! Small update: I tried to run the regression tests and all
> > IMMV-related tests failed on my VM. Maybe I'm doing something wrong; I
> > will try to investigate.
> >
> > Another suggestion: support for \d and \d+ commands in psql. With the
> > v34 patchset applied, psql does not show anything IMMV-related in \d mode.
> >
> > ```
> > reshke=# \d m1
> >Materialized view "public.m1"
> >  Column |  Type   | Collation | Nullable | Default
> > +-+---+--+-
> >  i  | integer |   |  |
> > Distributed by: (i)
> >
> >
> > reshke=# \d+ m1
> >  Materialized view "public.m1"
> >  Column |  Type   | Collation | Nullable | Default | Storage |
> > Compression | Stats target | Description
> > +-+---+--+-+-+-+--+-
> >  i  | integer |   |  | | plain   |
> > |  |
> > View definition:
> >  SELECT t1.i
> >FROM t1;
> > Distributed by: (i)
> > Access method: heap
> >
> > ```
> >
> > Output should be 'Incrementally materialized view "public.m1"' IMO.
>
>
> --
> Yugo NAGATA 


So, I spent another 2 weeks on this patch. I have read the whole
'Incremental View Maintenance' thread (from 2018), this thread, and
some related threads, and have studied some papers on this topic. I
now have a better understanding of the theory this work is built on.
Still, I can add my 2c.


== Major suggestions.

1) At first glance, when working with this IVM/IMMV infrastructure it
is really unintuitive what the server actually does during query
execution. I do think it would be a much better user experience to add
EXPLAIN coverage of the IVM work done inside the IVM triggers. That
way it is much clearer which part is slow, and hence which index
should be created, etc.

2) The kernel code for IVM lacks the ability to be extended with
further IVM optimizations. One example is the foreign-key optimization
described in [1]. I'm not saying we should implement this within this
patchset, but we surely should pave the way for it. I don't have any
good suggestions for how to do that, though.

3) I don't really think the SQL design is good. CREATE [INCREMENTAL] MATERIALIZED VIEW ...

Re: Thread-safe nl_langinfo() and localeconv()

2024-08-19 Thread Thomas Munro
On Tue, Aug 13, 2024 at 5:45 PM Thomas Munro  wrote:
> Over on the discussion thread about remaining setlocale() work[1], I
> wrote out some long boring theories about $SUBJECT.

Just as an FYI/curiosity, I converted my frustrations with
localeconv() into a request[1] that POSIX consider standardising one
of the interfaces that doesn't suck, and the reaction so far seems
positive.  Of course that doesn't really have any actionable
consequences on any relevant time frame, even if eventually
successful, and we can already use the saner interfaces on the systems
most of us really care about, but still...  it's nice to think that
the pessimistic fallback code (really only used by Solaris and maybe
AIX) could eventually be redundant if it goes somewhere...

[1] https://www.mail-archive.com/austin-group-l@opengroup.org/msg12850.html
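
For anyone who hasn't used the saner interfaces mentioned above, here is a minimal sketch (illustrative only, assuming a POSIX.1-2008 system; glibc may need _GNU_SOURCE) of reading locale data through a locale_t object, so no thread ever has to touch the global locale that plain nl_langinfo()/localeconv() depend on:

```
#define _GNU_SOURCE				/* for newlocale()/nl_langinfo_l() on glibc */
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

int
main(void)
{
	/* build a locale object instead of mutating the process-global locale */
	locale_t	loc = newlocale(LC_ALL_MASK, "en_US.UTF-8", (locale_t) 0);

	if (loc == (locale_t) 0)
	{
		perror("newlocale");
		return 1;
	}

	/* nl_langinfo_l() reads from loc, not from the setlocale() state */
	printf("decimal point: %s\n", nl_langinfo_l(RADIXCHAR, loc));
	printf("thousands sep: %s\n", nl_langinfo_l(THOUSEP, loc));

	freelocale(loc);
	return 0;
}
```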




Re: Detailed release notes

2024-08-19 Thread Bruce Momjian
On Fri, Jul 26, 2024 at 10:11:22AM -0400, Tom Lane wrote:
> Daniel Gustafsson  writes:
> >> On 26 Jul 2024, at 15:00, Marcos Pegoraro  wrote:
> >> But why is that just a hidden comment and not a visible link for us ?
> 
> > That's likely the wrong level of detail for the overwhelming majority of
> > release notes readers.  I have a feeling this was discussed not too long ago
> > but (if so) I fail to find that discussion now.
> 
> Yeah, I too recall some discussion of surfacing the commit links
> somehow, perhaps as a popup tooltip.  Nobody's got round to it yet.
> It's not real clear how to handle multiple links per item, which
> happens from time to time in major release notes and just about
> everyplace in minor release notes.

Yes, some kind of popup is what I remember.  Looking at the HTML output
for the docs, the SGML comments don't appear in the HTML, so I think we
need to write a Perl or shell script to match all the SGML comment text
with the HTML text, and insert tooltip text with links for every one of
them.  Should I work on this?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Cross-version Compatibility of postgres_fdw

2024-08-19 Thread Bruce Momjian
On Wed, Aug 14, 2024 at 03:25:48AM +0900, Etsuro Fujita wrote:
> On Tue, Aug 13, 2024 at 12:15 PM Fujii Masao
>  wrote:
> > On 2024/08/09 17:49, Etsuro Fujita wrote:
> > > I just thought consolidating the information to one place would make
> > > the documentation more readable.
> >
> > Yes, so I think that adding a note about the required remote server version
> > to the cross-version compatibility section would be more intuitive.
> > This section already discusses the necessary server versions and 
> > limitations.
> > Patch attached.
> 
> Thanks for the patch!
> 
> I noticed that we already have mentioned such a version limitation on
> the analyze_sampling option outside that section (see the description
> for that option in the “Cost Estimation Options” section).  So my
> concern is that it might be inconsistent to state only the INSERT
> limitation there.  Maybe I am too worried about that.

I think it is an improvement, so applied to master.  Thanks.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: on_error table, saving error info to a table

2024-08-19 Thread jian he
On Mon, Jul 15, 2024 at 1:42 PM Nishant Sharma
 wrote:
>
> Thanks for the patch!
>
> I was not able to understand why we need a special error table for COPY.
> Can't we just log the errors in a new error log file for COPY, in a proper
> format? With the table approach, PG would unnecessarily be using its
> resources, like extra CPU to format the data into pages for its table
> format, and writing and keeping the table in its storage (which may or may
> not be required); the user would also be required to make sure the error
> table is created with the proper columns to be used in COPY, etc.
>
>
> Meanwhile, please find some quick review comments:-
>
> 1) Patch needs re-base.
>
> 2) If the columns of the error table are fixed, then why not create it 
> internally using
> some function or something instead of making the user create the table 
> correctly
> with all the columns?

I'll think about it more.
Previously, I tried to use SPI to create the table, but at that time I
thought that was kind of excessive: you need to create the table, check
whether the table name is unique, and check the privileges.
Now we quickly error out if the error-saving table definition does not
meet the criteria. I guess that's less work to do.


> 3) I think the error messages can be improved; they look big to me.
>
> 4) I think there is no need for the pstrdup below, as CStringGetTextDatum
> creates a text copy of the same:-
> err_code = 
> pstrdup(unpack_sql_state(cstate->escontext->error_data->sqlerrcode));
>
> t_values[9] = CStringGetTextDatum(err_code);
>
> 5) Shouldn't 'on_error_rel' be checked for not-NULL along with the below,
> as I can see it is passed as NULL from two locations?
> +   if (cstate->opts.on_error == COPY_ON_ERROR_TABLE)
> +   {
>
> 6) The declarations below can be shifted into the if block where they are
> used, instead of keeping them function-wide:
> +   HeapTuple   on_error_tup;
> +   TupleDesc   on_error_tupDesc;
>
> +   if (cstate->opts.on_error == COPY_ON_ERROR_TABLE)
> +   {
>
> 7) The comment below goes beyond 80-char width:-
> * if on_error is specified with 'table', then on_error_rel is the error 
> saving table
>
> 8) Need space after 'false'
> err_detail ? false: true;
>
> 9) The call below can fit on a single line. There is no need to keep the
> 2nd and 3rd parameters on the next lines:
> +   on_error_tup = heap_form_tuple(on_error_tupDesc,
> +   t_values,
> +   t_isnull);
>
> 10) Variable declarations Tab Spacing issue at multiple places.
>
> 11) Comments in the patch are not matched to PG comment style.
>
>
> Kindly note I have not tested the patch properly yet, only checked it with
> a positive test case, as I will wait for a conclusion on my opinion of the
> proposed patch.
>
Almost all of these issues have been addressed.
The error messages are still not so good; I need to polish them.
From f344314bfab06dfba3fd60c7cef16dd2d4d41803 Mon Sep 17 00:00:00 2001
From: jian he 
Date: Mon, 19 Aug 2024 21:58:46 +0800
Subject: [PATCH v2 1/1] new on_error option: "table" saving error info to
 table

introduce on_error table option for COPY FROM.
the syntax is {on_error table, table 'error_saving_tbl'}.

We first check table error_saving_tbl's existence and data definition.
If it does not meet our criteria, we quickly abort the COPY operation.
We also do a preliminary check of the lock on the error-saving table,
so that COPY can insert tuples into it.

Once an error happens, we save the error metadata
and insert it into the error_saving_table.

discussion: https://postgr.es/m/CACJufxH_OJpVra%3D0c4ow8fbxHj7heMcVaTNEPa5vAurSeNA-6Q%40mail.gmail.com
---
 doc/src/sgml/ref/copy.sgml   | 111 +-
 src/backend/commands/copy.c  |  20 
 src/backend/commands/copyfrom.c  | 139 +++
 src/backend/commands/copyfromparse.c |  46 
 src/backend/parser/gram.y|   4 +
 src/include/commands/copy.h  |   2 +
 src/include/commands/copyfrom_internal.h |   1 +
 src/test/regress/expected/copy2.out  |  96 
 src/test/regress/sql/copy2.sql   |  83 ++
 9 files changed, 501 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 1518af8a04..80c8a1be41 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -44,6 +44,7 @@ COPY { table_name [ ( column_name [, ...] ) | * }
 FORCE_NULL { ( column_name [, ...] ) | * }
 ON_ERROR error_action
+TABLE 'error_saving_tbl'
 ENCODING 'encoding_name'
 LOG_VERBOSITY verbosity
 
@@ -393,7 +394,9 @@ COPY { table_name [ ( error_action value of
   stop means fail the command, while
-  ignore means discard the input row and continue with the next one.
+  ignore means discard the input row and continue with the next one,
+  TABLE means COPY skips malformed rows and saves their error information to the specified error-saving table.

Re: PG docs - Sequence CYCLE clause

2024-08-19 Thread Bruce Momjian
On Wed, Aug 14, 2024 at 01:46:23PM +0200, Peter Eisentraut wrote:
> On 13.08.24 07:46, Peter Smith wrote:
> > While reviewing another thread I had cause to be looking more
> > carefully at the SEQUENCE documentation.
> > 
> > I found it curious that, unlike other clauses, the syntax of the CYCLE
> > clause was not displayed on a line by itself.
> > 
> > I have changed that, and at the same time I have moved the CYCLE
> > syntax (plus its description) to be adjacent to MINVALUE/MAXVALUE,
> > which IMO is where it naturally belongs.
> > 
> > Please see the attached patch.
> 
> I agree, it makes a bit more sense with your patches.

Great, patch applied to master.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Speed up Hash Join by teaching ExprState about hashing

2024-08-19 Thread David Rowley
On Mon, 19 Aug 2024 at 18:41, David Rowley  wrote:
> The attached v5 patch includes this change.

I made a few more tweaks to the comments and pushed the result.

Thank you to both of you for having a look at this.

David




Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

2024-08-19 Thread Tender Wang
Alvaro Herrera  于2024年8月8日周四 06:50写道:

> On 2024-Jul-26, Tender Wang wrote:
>
> > Junwang Zhao  于2024年7月26日周五 14:57写道:
> >
> > > There is a bug report[0] that Tender commented might be the same issue
> > > as this one, but I tried Alvaro's patch and mine, and neither could
> > > solve that problem; I did not try Tender's earlier patch though. I post
> > > the test script below in case you are interested.
> >
> > My earlier patch should handle the case Alexander reported, but I did
> > not do more testing. I'm not sure whether or not there is a mismatch
> > between pg_constraint and pg_trigger.
> >
> > I agreed with what Alvaro said, that this "requires a much more invasive
> > solution".
>
> Here's the patch which, as far as I can tell, fixes all the reported
> problems (other than the one in bug 18541, for which I proposed an
> unrelated fix in that thread[1]).  If you can double-check, I would very
> much appreciate that.  Also, I think the test cases the patch adds
> reflect the provided examples sufficiently, but if we're still failing
> to cover some, please let me know.
>
>
> As I understand, this fix needs to be applied all the way back to 12,
> because the overall functionality is that old.  However, in branches 14
> and back, the patch doesn't apply cleanly, because of the changes we
> made in commit f4566345cf40 :-(  I'm tempted to fix it in branches 15,
> 16, 17, master now and potentially backpatch later, to avoid dragging
> things along further.  It's taken long enough already.
>

I haven't seen this patch on master.  Is there something that we forgot to
handle?


-- 
Tender Wang


Re: Logical Replication of sequences

2024-08-19 Thread Peter Smith
Hi Vignesh, Here are my review comments for the latest patchset

v20240819-0001. No changes. No comments.
v20240819-0002. No changes. No comments.
v20240819-0003. See below.
v20240819-0004. See below.
v20240819-0005. No changes. No comments.

///

PATCH v20240819-0003

==
src/backend/replication/logical/syncutils.c

3.1.
+typedef enum
+{
+ SYNC_RELATION_STATE_NEEDS_REBUILD,
+ SYNC_RELATION_STATE_REBUILD_STARTED,
+ SYNC_RELATION_STATE_VALID,
+} SyncingRelationsState;
+
+static SyncingRelationsState relation_states_validity =
SYNC_RELATION_STATE_NEEDS_REBUILD;

There is some muddle of singular/plural names here. The
typedef/values/var should all match:

e.g. It could be like:
SYNC_RELATION_STATE_xxx --> SYNC_RELATION_STATES_xxx
SyncingRelationsState --> SyncRelationStates

But, a more radical change might be better.

typedef enum
{
RELATION_STATES_SYNC_NEEDED,
RELATION_STATES_SYNC_STARTED,
RELATION_STATES_SYNCED,
} SyncRelationStates;

~~~

3.2. GENERAL refactoring

I don't think all of the functions moved into syncutils.c truly belong there.

This new module was introduced to be for common/util functions for
tablesync and sequencesync, but with each patchset, it has been
sucking in more and more functions that maybe do not quite belong
here.

For example, AFAIK these below have logic that is *solely* for TABLES
(not for SEQUENCES). Perhaps it was convenient to dump them here
because they are statically called, but I felt they still logically
belong in tablesync.c:
- process_syncing_tables_for_sync(XLogRecPtr current_lsn)
- process_syncing_tables_for_apply(XLogRecPtr current_lsn)
- AllTablesyncsReady(void)

~~~

3.3.
+static bool
+FetchRelationStates(bool *started_tx)
+{

If this function can remain static then the name should change to be
like fetch_table_states, right?

==
src/include/replication/worker_internal.h

3.4.
+extern bool wait_for_relation_state_change(Oid relid, char expected_state);

If this previously static function will be exposed now (it may not
need to be if some other functions are returned to tablesync.c) then the
function name should also be changed, right?



PATCH v20240819-0004

==
src/backend/replication/logical/syncutils.c

4.1 GENERAL refactoring

(this is similar to review comment #3.2 above)

Functions like below have logic that is *solely* for SEQUENCES (not
for TABLES). I felt they logically belong in sequencesync.c, not here.
- process_syncing_sequences_for_apply(void)

~~~

FetchRelationStates:
nit - the comment change about "not-READY tables" (instead of
"relations") should already be in patch 0003.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

2024-08-19 Thread Alvaro Herrera
On 2024-Aug-20, Tender Wang wrote:

> > As I understand, this fix needs to be applied all the way back to 12,
> > because the overall functionality is that old.  However, in branches 14
> > and back, the patch doesn't apply cleanly, because of the changes we
> > made in commit f4566345cf40 :-(  I'm tempted to fix it in branches 15,
> > 16, 17, master now and potentially backpatch later, to avoid dragging
> > things along further.  It's taken long enough already.
> 
> I haven't seen this patch on master.  Is there something that we forgot to
> handle?

I haven't pushed it yet, mostly because of being unsure about not doing
anything for the oldest branches (14 and back).

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Java is clearly an example of money oriented programming"  (A. Stepanov)




Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

2024-08-19 Thread Tender Wang
Alvaro Herrera  于2024年8月20日周二 10:25写道:

> On 2024-Aug-20, Tender Wang wrote:
>
> > > As I understand, this fix needs to be applied all the way back to 12,
> > > because the overall functionality is that old.  However, in branches 14
> > > and back, the patch doesn't apply cleanly, because of the changes we
> > > made in commit f4566345cf40 :-(  I'm tempted to fix it in branches 15,
> > > 16, 17, master now and potentially backpatch later, to avoid dragging
> > > things along further.  It's taken long enough already.
> >
> > I haven't seen this patch on master.  Is there something that we forgot
> > to handle?
>
> I haven't pushed it yet, mostly because of being unsure about not doing
> anything for the oldest branches (14 and back).
>

I only wrote the code and tests on master. I'm not sure how complicated
it would be to fix this issue on branches 14 and back.


-- 
Tender Wang


Re: CREATE SUBSCRIPTION - add missing test case

2024-08-19 Thread Peter Smith
On Fri, Aug 16, 2024 at 2:15 PM vignesh C  wrote:
>

Thanks for the review.

>
> I agree that currently there is no test hitting this code. I'm not sure
> if this is the correct location for the test; should it be included in
> the 008_diff_schema.pl file?

Yes, that is a better home for this test. Done as suggested in
attached patch v2.

==
Kind Regards,
Peter Smith.
Fujitsu Australia


v2-0001-Add-missing-test-case.patch
Description: Binary data


Re: PG docs - Sequence CYCLE clause

2024-08-19 Thread Peter Smith
On Tue, Aug 20, 2024 at 10:18 AM Bruce Momjian  wrote:
>
> Great, patch applied to master.
>

Thanks for pushing!

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: documentation structure

2024-08-19 Thread jian he
On Mon, Jul 22, 2024 at 10:42 AM jian he  wrote:
>
Hi. This time everything should be OK!


step1. python3  split_func_sgml.py
step2. git apply
v6-0001-all-filelist-for-directory-doc-src-sgml-func.patch (step2,
cannot use "git am").


What v7_split_func_sgml.py does:
1. Each new file only has 2 occurrences of "sect1".
2. Each new sgml file has its own unique identifier, e.g. for
func-logical.sgml the unique string is "".
3. The sed copy and in-place delete command strings will be printed out;
you can check the line numbers in func.sgml to verify the sed commands.
From d123a7c9ef6ad45e3b697aa20bcfc831f594b45d Mon Sep 17 00:00:00 2001
From: jian he 
Date: Sun, 21 Jul 2024 20:43:45 +0800
Subject: [PATCH v6 1/1] all filelist for directory doc/src/sgml/func

---
 doc/src/sgml/filelist.sgml  |  5 -
 doc/src/sgml/func/allfiles.sgml | 40 +
 2 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 doc/src/sgml/func/allfiles.sgml

diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index a7ff5f82..d9f36933 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -17,7 +17,10 @@
 
 
 
-
+
+
+%allfiles_func;
+
 
 
 
diff --git a/doc/src/sgml/func/allfiles.sgml b/doc/src/sgml/func/allfiles.sgml
new file mode 100644
index ..34eec608
--- /dev/null
+++ b/doc/src/sgml/func/allfiles.sgml
@@ -0,0 +1,40 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
\ No newline at end of file
-- 
2.34.1

import subprocess
import os
import re
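# Split doc/src/sgml/func.sgml into per-topic files under doc/src/sgml/func/:
# record the begin/end line numbers of each section, then print the sed
# copy / in-place delete commands so they can be checked against func.sgml.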
top_dir = subprocess.check_output(["git", "rev-parse", "--show-toplevel"])
top_dir = top_dir.decode().rstrip('\n')
top_src_dir = top_dir + "/doc/src/sgml"
os.chdir(top_src_dir)
#
func_logical="func-logical.sgml"
func_comparison="func-comparison.sgml"
func_math="func-math.sgml"
func_string="func-string.sgml"
func_binarystring="func-binarystring.sgml"
func_bitstring="func-bitstring.sgml"
func_matching="func-matching.sgml"
func_formatting="func-formatting.sgml"
func_datetime="func-datetime.sgml"
func_enum="func-enum.sgml"
func_geometry="func-geometry.sgml"
func_net="func-net.sgml"
func_textsearch="func-textsearch.sgml"
func_uuid="func-uuid.sgml"
func_xml="func-xml.sgml"
func_json="func-json.sgml"
func_sequence="func-sequence.sgml"
func_conditional="func-conditional.sgml"
func_array="func-array.sgml"
func_range="func-range.sgml"
func_aggregate="func-aggregate.sgml"
func_window="func-window.sgml"
func_merge_support ="func-merge-support.sgml"
func_subquery="func-subquery.sgml"
func_comparisons="func-comparisons.sgml"
func_srf="func-srf.sgml"
func_info="func-info.sgml"
func_admin="func-admin.sgml"
func_trigger="func-trigger.sgml"
func_event_triggers="func-event-triggers.sgml"
func_statistics="func-statistics.sgml"
#
func_filenames = list()
func_filenames.append(func_logical)
func_filenames.append(func_comparison)
func_filenames.append(func_math)
func_filenames.append(func_string)
func_filenames.append(func_binarystring)
func_filenames.append(func_bitstring)
func_filenames.append(func_matching)
func_filenames.append(func_formatting)
func_filenames.append(func_datetime)
func_filenames.append(func_enum)
func_filenames.append(func_geometry)
func_filenames.append(func_net)
func_filenames.append(func_textsearch)
func_filenames.append(func_uuid)
func_filenames.append(func_xml)
func_filenames.append(func_json)
func_filenames.append(func_sequence)
func_filenames.append(func_conditional)
func_filenames.append(func_array)
func_filenames.append(func_range)
func_filenames.append(func_aggregate)
func_filenames.append(func_window)
func_filenames.append(func_merge_support )
func_filenames.append(func_subquery)
func_filenames.append(func_comparisons)
func_filenames.append(func_srf)
func_filenames.append(func_info)
func_filenames.append(func_admin)
func_filenames.append(func_trigger)
func_filenames.append(func_event_triggers)
func_filenames.append(func_statistics)
#
func_logical_begin_lineno=-1
func_comparison_begin_lineno=-1
func_math_begin_lineno=-1
func_string_begin_lineno=-1
func_binarystring_begin_lineno=-1
func_bitstring_begin_lineno=-1
func_matching_begin_lineno=-1
func_formatting_begin_lineno=-1
func_datetime_begin_lineno=-1
func_enum_begin_lineno=-1
func_geometry_begin_lineno=-1
func_net_begin_lineno=-1
func_textsearch_begin_lineno=-1
func_uuid_begin_lineno=-1
func_xml_begin_lineno=-1
func_json_begin_lineno=-1
func_sequence_begin_lineno=-1
func_conditional_begin_lineno=-1
func_array_begin_lineno=-1
func_range_begin_lineno=-1
func_aggregate_begin_lineno=-1
func_window_begin_lineno=-1
func_merge_support_begin_lineno=-1
func_subquery_begin_lineno=-1
func_comparisons_begin_lineno=-1
func_srf_begin_lineno=-1
func_info_begin_lineno=-1
func_admin_begin_lineno=-1
func_trigger_begin_lineno=-1
func_event_triggers_begin_lineno=-1
func_statistics_begin_lineno=-1
#
func_logical_end_lineno=-1
func_comparison_end_lineno=-1
func_math_end_lineno=-1
func_string_end_lineno=-1
func_binarystring_end_lineno=-1

Re: Improve pg_re_throw: check if sigjmp_buf is valid and report error

2024-08-19 Thread Xiaoran Wang
Tom Lane  于2024年8月19日周一 22:12写道:

> Robert Haas  writes:
> > On Mon, Aug 19, 2024 at 2:17 AM Xiaoran Wang 
> wrote:
> >> If the code in PG_TRY contains any non local control flow other than
> >> ereport(ERROR) like goto, break etc., the PG_CATCH or PG_END_TRY cannot
> >> be called, then the PG_exception_stack will point to the memory whose
> >> stack frame has been released. So after that, when the pg_re_throw
> >> called, __longjmp() will crash and report Segmentation fault error.
> >>
> >> In addition to sigjmp_buf, add another field 'int magic' which is next
> >> to the sigjmp_buf in the local stack frame memory. The magic's value is
> >> always 'PG_exception_magic 0x12345678'. And in the 'pg_re_throw'
> >> routine, check if the magic's value is still '0x12345678'; if not, that
> >> means the memory where the 'PG_exception_stack' points to has been
> >> released, and the 'sigjmp_buf' must be invalid.
>
> > This is an interesting idea. I suspect if we do this we want a
> > different magic number and a different error message than what you
> > chose here, but those are minor details.
>
> I would suggest just adding an Assert; I doubt this check would be
> useful in production.
>

Agree, an Assert is enough.

>
> > I'm not sure how reliable this would be at actually finding problems,
> > though.
>
> Yeah, that's the big problem.  I don't have any confidence at all
> that this would detect misuse.  It'd require that the old stack
> frame gets overwritten, which might not happen for a long time,
> and it'd require that somebody eventually do a longjmp, which again
> might not happen for a long time --- and when those did happen, the
> error would be detected in someplace far away from the actual bug,
> with little evidence remaining to help you localize it.
>

Exactly, it cannot tell you which PG_TRY left the invalid sigjmp_buf,
but implementing that is easy, I think; recording the line number, maybe.

I think this is still useful; at least it tells you that the error is due
to the PG_TRY.
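
To make the idea under discussion concrete, here is a toy, self-contained sketch of the canary check (illustrative only; the real elog.h macros and pg_re_throw() are more involved):

```
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

#define FRAME_MAGIC 0x12345678

typedef struct TryFrame
{
	sigjmp_buf	buf;
	int			magic;			/* canary living next to the jump buffer */
} TryFrame;

static TryFrame *exception_stack = NULL;

static void
re_throw(void)
{
	/*
	 * If the frame's stack memory was released (say, somebody left the
	 * "try" block via goto or return) and later overwritten, the canary is
	 * likely gone, and we can stop with a diagnosable message instead of
	 * crashing inside siglongjmp().
	 */
	if (exception_stack == NULL || exception_stack->magic != FRAME_MAGIC)
	{
		fprintf(stderr, "re_throw: jump buffer is invalid (canary smashed)\n");
		abort();
	}
	siglongjmp(exception_stack->buf, 1);
}

int
main(void)
{
	TryFrame	frame;

	frame.magic = FRAME_MAGIC;
	if (sigsetjmp(frame.buf, 0) == 0)
	{
		exception_stack = &frame;	/* the "PG_TRY" part */
		re_throw();					/* the "ereport(ERROR)" part */
	}
	else
		printf("caught\n");			/* the "PG_CATCH" part */
	return 0;
}
```

As Tom notes above, the check only fires if the stale frame actually gets overwritten before the next longjmp, so it is best thought of as an Assert-grade diagnostic, not a guarantee.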

>
> Also, as soon as some outer level of PG_TRY is exited in the proper
> way, the dangling stack pointer will be fixed up.  That means there's
> a fairly narrow time frame in which the overwrite and longjmp must
> happen for this to catch a bug.
>
>
Yes, if the outer-level PG_TRY calls pg_re_throw instead of ereport,
the dangling stack pointer will be fixed up. It would be great to fix
that up in any case, but I have no idea how to implement that now.

In pg_re_throw, if we could use '_local_sigjmp_buf' instead of the
global var PG_exception_stack, that would be great. We wouldn't
need to worry about an invalid sigjmp_buf.

> So on the whole I doubt this'd be terribly useful in this form;
> and I don't like the amount of code churn required.
>
> > Personally, I don't think I've ever made this particular mistake. I
> > think a much more common usage error is exiting the catch-block
> > without either throwing an error or rolling back a subtransaction. But
> > YMMV, of course.
>
> We have had multiple instances of code "return"ing out of a PG_TRY,
> so I fully agree that some better way to detect that would be good.
> But maybe we ought to think about static analysis for that.
>
> regards, tom lane
>


-- 
Best regards !
Xiaoran Wang


Re: documentation structure

2024-08-19 Thread jian he
On Tue, Aug 20, 2024 at 11:18 AM jian he  wrote:
>
> On Mon, Jul 22, 2024 at 10:42 AM jian he  wrote:
> >
> hi. this time everything should be ok!
>
>
> step1. python3  split_func_sgml.py
> step2. git apply
> v6-0001-all-filelist-for-directory-doc-src-sgml-func.patch (step2,
> cannot use "git am").
>

Sorry, I meant the file attached in the previous mail.
step1. python3  v7_split_func_sgml.py
step2. git apply v7-0001-all-filelist-for-directory-doc-src-sgml-func.patch
(step2, cannot use "git am").




Re: Improve pg_re_throw: check if sigjmp_buf is valid and report error

2024-08-19 Thread Xiaoran Wang
Xiaoran Wang  于2024年8月20日周二 11:32写道:

>
>
> Tom Lane  于2024年8月19日周一 22:12写道:
>
>> Robert Haas  writes:
>> > On Mon, Aug 19, 2024 at 2:17 AM Xiaoran Wang 
>> wrote:
>> >> If the code in PG_TRY contains any non local control flow other than
>> >> ereport(ERROR) like goto, break etc., the PG_CATCH or PG_END_TRY cannot
>> >> be called, then the PG_exception_stack will point to the memory whose
>> >> stack frame has been released. So after that, when the pg_re_throw
>> >> called, __longjmp() will crash and report Segmentation fault error.
>> >>
>> >> In addition to sigjmp_buf, add another field 'int magic' which is next
>> >> to the sigjmp_buf in the local stack frame memory. The magic's value is
>> >> always 'PG_exception_magic 0x12345678'. And in the 'pg_re_throw'
>> >> routine, check if the magic's value is still '0x12345678'; if not, that
>> >> means the memory where the 'PG_exception_stack' points to has been
>> >> released, and the 'sigjmp_buf' must be invalid.
>>
>> > This is an interesting idea. I suspect if we do this we want a
>> > different magic number and a different error message than what you
>> > chose here, but those are minor details.
>>
>> I would suggest just adding an Assert; I doubt this check would be
>> useful in production.
>>
>
> Agree, an Assert is enough.
>
>>
>> > I'm not sure how reliable this would be at actually finding problems,
>> > though.
>>
>> Yeah, that's the big problem.  I don't have any confidence at all
>> that this would detect misuse.  It'd require that the old stack
>> frame gets overwritten, which might not happen for a long time,
>> and it'd require that somebody eventually do a longjmp, which again
>> might not happen for a long time --- and when those did happen, the
>> error would be detected in someplace far away from the actual bug,
>> with little evidence remaining to help you localize it.
>>
>
> Exactly, it cannot tell you which PG_TRY left the invalid sigjmp_buf,
> but implementing that is easy, I think; recording the line number, maybe.
>
> I think this is still useful; at least it tells you that the error is due
> to the PG_TRY.
>
>>
>> Also, as soon as some outer level of PG_TRY is exited in the proper
>> way, the dangling stack pointer will be fixed up.  That means there's
>> a fairly narrow time frame in which the overwrite and longjmp must
>> happen for this to catch a bug.
>>
>>
> Yes, if the outer-level PG_TRY calls pg_re_throw instead of ereport,
> the dangling stack pointer will be fixed up. It would be great to fix
> that up in any case, but I have no idea how to implement that now.
>
>
Sorry, "Yes, if the outer-level PG_TRY calls pg_re_throw instead of
ereport," should
be "Yes, if the outer-level PG_TRY resets the PG_exception_stack."

> In pg_re_throw, if we could use '_local_sigjmp_buf' instead of the
> global var PG_exception_stack, that would be great. We wouldn't
> need to worry about an invalid sigjmp_buf.
>
>> So on the whole I doubt this'd be terribly useful in this form;
>> and I don't like the amount of code churn required.
>>
>> > Personally, I don't think I've ever made this particular mistake. I
>> > think a much more common usage error is exiting the catch-block
>> > without either throwing an error or rolling back a subtransaction. But
>> > YMMV, of course.
>>
>> We have had multiple instances of code "return"ing out of a PG_TRY,
>> so I fully agree that some better way to detect that would be good.
>> But maybe we ought to think about static analysis for that.
>>
>> regards, tom lane
>>
>
>
> --
> Best regards !
> Xiaoran Wang
>


-- 
Best regards !
Xiaoran Wang


Re: Improve pg_re_throw: check if sigjmp_buf is valid and report error

2024-08-19 Thread Tom Lane
Xiaoran Wang  writes:
>> Yeah, that's the big problem.  I don't have any confidence at all
>> that this would detect misuse.  It'd require that the old stack
>> frame gets overwritten, which might not happen for a long time,
>> and it'd require that somebody eventually do a longjmp, which again
>> might not happen for a long time --- and when those did happen, the
>> error would be detected in someplace far away from the actual bug,
>> with little evidence remaining to help you localize it.

> Exactly, it cannot tell you which PG_TRY left the invalid sigjmp_buf,
> but implementing that is easy, I think; recording the line number, maybe.

I don't think you get to assume that the canary word gets overwritten
but debug data a few bytes away survives.

regards, tom lane




Re: define PG_REPLSLOT_DIR

2024-08-19 Thread Ashutosh Bapat
On Mon, Aug 19, 2024 at 7:47 PM Bertrand Drouvot
 wrote:
>
> Hi,
>
> On Mon, Aug 19, 2024 at 04:11:31PM +0530, Ashutosh Bapat wrote:
> > On Wed, Aug 14, 2024 at 5:02 PM Bertrand Drouvot
> >  wrote:
> > >
> > > Hi hackers,
> > >
> > > while working on a replication slot tool (idea is to put it in contrib, 
> > > not
> > > shared yet), I realized that "pg_replslot" is being used > 25 times in
> > > .c files.
> > >
> > > I think it would make sense to replace those occurrences with a $SUBJECT, 
> > > attached
> > > a patch doing so.
> >
> > Many of these places are slot specific directory/file names within
> > pg_replslot. I think we can further improve the code by creating macro
> > on the lines of LSN_FORMAT_ARGS
> > #define SLOT_DIRNAME_ARGS(slotname) (PG_REPLSLOT_DIR, slotname)
> > this way we "codify" method to construct the slot directory name
> > everywhere.
>
> Thanks for the feedback!
>
> I think that could make sense. As the already proposed mechanical changes
> are error prone (from my point of view), I would suggest having a look at
> your proposal once the proposed changes go in.

What do you mean by error prone?

-- 
Best Wishes,
Ashutosh Bapat
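
To make the SLOT_DIRNAME_ARGS idea quoted above concrete, here is a minimal compilable sketch (the macro spelling follows the quoted proposal; the constants and call site are illustrative only):

```
#include <stdio.h>

#define MAXPGPATH 1024
#define PG_REPLSLOT_DIR "pg_replslot"

/* hypothetical helper in the spirit of LSN_FORMAT_ARGS */
#define SLOT_DIRNAME_ARGS(slotname) PG_REPLSLOT_DIR, (slotname)

int
main(void)
{
	char		path[MAXPGPATH];

	/* every call site spells the slot directory path the same way */
	snprintf(path, sizeof(path), "%s/%s", SLOT_DIRNAME_ARGS("my_slot"));
	printf("%s\n", path);
	return 0;
}
```

Compiled and run, this prints pg_replslot/my_slot; the point of the macro is that the directory-plus-slot pair is spliced into every format string identically.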




Re: Conflict detection and logging in logical replication

2024-08-19 Thread Amit Kapila
On Mon, Aug 19, 2024 at 4:16 PM Amit Kapila  wrote:
>
> > Rest looks good.
> >
>
> Thanks for the review and testing.
>

Pushed.

-- 
With Regards,
Amit Kapila.




Remaining reference to _PG_fini() in ldap_password_func

2024-08-19 Thread Michael Paquier
Hi all,

While hacking on a different thing, I've noticed that
ldap_password_func had the idea to define _PG_fini().  This is
pointless, as library unloading is not supported in the backend, and
it is something that has been cleaned up from the tree in ab02d702ef08.
It's also not something to encourage, perhaps.

How about removing it, as in the attached, to be consistent?

Thanks,
--
Michael
diff --git a/src/test/modules/ldap_password_func/ldap_password_func.c b/src/test/modules/ldap_password_func/ldap_password_func.c
index 99c18a8f1c..24d9c63781 100644
--- a/src/test/modules/ldap_password_func/ldap_password_func.c
+++ b/src/test/modules/ldap_password_func/ldap_password_func.c
@@ -23,7 +23,6 @@
 PG_MODULE_MAGIC;
 
 void		_PG_init(void);
-void		_PG_fini(void);
 
 /* hook function */
 static char *rot13_passphrase(char *password);
@@ -37,12 +36,6 @@ _PG_init(void)
 	ldap_password_hook = rot13_passphrase;
 }
 
-void
-_PG_fini(void)
-{
-	/* do  nothing yet */
-}
-
 static char *
 rot13_passphrase(char *pw)
 {


signature.asc
Description: PGP signature


Re: CI cpluspluscheck failures

2024-08-19 Thread Amul Sul
On Mon, Aug 19, 2024 at 1:30 AM Thomas Munro  wrote:
>
> On Mon, Aug 19, 2024 at 12:10 AM Thomas Munro  wrote:
> > I'm not sure why the first one started happening at the commit
> > (aa2d6b15) that caused the second one, I feel like I'm missing
> > something...

Thanks, Thomas, for addressing the missing include.

Adding logging.h to the pg_verifybackup.h file makes it redundant in
pg_verifybackup.c. The attached patch proposes removing this redundant
include from pg_verifybackup.c, along with a few other unnecessary
includes.

Regards,
Amul
From 33ccca65daacb44cfb526bdabcd3a3d9c06443f1 Mon Sep 17 00:00:00 2001
From: Amul Sul 
Date: Tue, 20 Aug 2024 09:44:56 +0530
Subject: [PATCH] Remove unnecessary #include statements.

---
 src/bin/pg_verifybackup/pg_verifybackup.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 3fcfb167217..2a461079a38 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -18,9 +18,6 @@
 #include 
 #include 
 
-#include "common/logging.h"
-#include "common/parse_manifest.h"
-#include "fe_utils/simple_list.h"
 #include "getopt_long.h"
 #include "pg_verifybackup.h"
 #include "pgtime.h"
-- 
2.18.0



Re: define PG_REPLSLOT_DIR

2024-08-19 Thread Bertrand Drouvot
Hi,

On Tue, Aug 20, 2024 at 09:26:59AM +0530, Ashutosh Bapat wrote:
> On Mon, Aug 19, 2024 at 7:47 PM Bertrand Drouvot
>  wrote:
> >
> > Hi,
> >
> > On Mon, Aug 19, 2024 at 04:11:31PM +0530, Ashutosh Bapat wrote:
> > > On Wed, Aug 14, 2024 at 5:02 PM Bertrand Drouvot
> > >  wrote:
> > > >
> > > > Hi hackers,
> > > >
> > > > while working on a replication slot tool (idea is to put it in contrib, 
> > > > not
> > > > shared yet), I realized that "pg_replslot" is being used > 25 times in
> > > > .c files.
> > > >
> > > > I think it would make sense to replace those occurrences with a 
> > > > $SUBJECT, attached
> > > > a patch doing so.
> > >
> > > Many of these places are slot specific directory/file names within
> > > pg_replslot. I think we can further improve the code by creating macro
> > > on the lines of LSN_FORMAT_ARGS
> > > #define SLOT_DIRNAME_ARGS(slotname) (PG_REPLSLOT_DIR, slotname)
> > > this way we "codify" method to construct the slot directory name
> > > everywhere.
> >
> > Thanks for the feedback!
> >
> > I think that could make sense. As the already proposed mechanical
> > changes are error prone (from my point of view), I would suggest having
> > a look at your proposal once the proposed changes go in.
> 
> What do you mean by error prone?

I meant to say that it is "easy" to make mistakes when doing those manual
mechanical changes. Also, I think it's not that fun/easy to review, which is
why I think it's better to do one change at a time. Does that make sense to
you?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: define PG_REPLSLOT_DIR

2024-08-19 Thread Ashutosh Bapat
On Tue, Aug 20, 2024 at 10:49 AM Bertrand Drouvot
 wrote:
>
> Hi,
>
> On Tue, Aug 20, 2024 at 09:26:59AM +0530, Ashutosh Bapat wrote:
> > On Mon, Aug 19, 2024 at 7:47 PM Bertrand Drouvot
> >  wrote:
> > >
> > > Hi,
> > >
> > > On Mon, Aug 19, 2024 at 04:11:31PM +0530, Ashutosh Bapat wrote:
> > > > On Wed, Aug 14, 2024 at 5:02 PM Bertrand Drouvot
> > > >  wrote:
> > > > >
> > > > > Hi hackers,
> > > > >
> > > > > while working on a replication slot tool (idea is to put it in 
> > > > > contrib, not
> > > > > shared yet), I realized that "pg_replslot" is being used > 25 times in
> > > > > .c files.
> > > > >
> > > > > I think it would make sense to replace those occurrences with a 
> > > > > $SUBJECT, attached
> > > > > a patch doing so.
> > > >
> > > > Many of these places are slot specific directory/file names within
> > > > pg_replslot. I think we can further improve the code by creating macro
> > > > on the lines of LSN_FORMAT_ARGS
> > > > #define SLOT_DIRNAME_ARGS(slotname) (PG_REPLSLOT_DIR, slotname)
> > > > this way we "codify" method to construct the slot directory name
> > > > everywhere.
> > >
> > > Thanks for the feedback!
> > >
> > > I think that could make sense. As the already proposed mechanical
> > > changes are error prone (from my point of view), I would suggest
> > > having a look at your proposal once the proposed changes go in.
> >
> > What do you mean by error prone?
>
> I meant to say that it is "easy" to make mistakes when doing those manual
> mechanical changes. Also, I think it's not that fun/easy to review, which
> is why I think it's better to do one change at a time. Does that make
> sense to you?

Since these are all related changes, doing them at once might make it
faster. You may use multiple commits (one for each change) to
"contain" errors.

-- 
Best Wishes,
Ashutosh Bapat




ANALYZE ONLY

2024-08-19 Thread Michael Harris
Hello Postgres Hackers

An application that I am developing uses PostgreSQL, and includes a fairly large
number of partitioned tables which are used to store time series data.

The tables are partitioned by time, and typically there is only one partition
at a time - the current partition - that is actually being updated.
Older partitions
are available for query and eventually dropped.

As per the documentation, partitioned tables are not analyzed by the autovacuum
workers, although their partitions are. Statistics are needed on the partitioned
table level for at least some query planning activities.

The problem is that giving an ANALYZE command targeting a partitioned table
causes it to update statistics for the partitioned table AND all the individual
partitions. There is currently no option to prevent it from including the
partitions.

This is wasteful for our application: for one thing the autovacuum
has already analyzed the individual partitions; for another most of
the partitions
will have had no changes, so they don't need to be analyzed repeatedly.

I took some measurements when running ANALYZE on one of our tables. It
took approx
4 minutes to analyze the partitioned table, then 29 minutes to analyze the
partitions. We have hundreds of these tables, so the cost is very significant.

For my use case at least it would be fantastic if we could add an ONLY option
to ANALYZE, which would cause it to analyze the named table only and not
descend into the partitions.

I took a look at the source and it looks doable, but before launching into it
I thought I would ask a few questions here.

  1. Would such a feature be welcomed? Are there any traps I might not
have thought of?

  2. The existing ANALYZE command has the following structure:

 ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]

 It would be easiest to add ONLY as another option, but that
doesn't look quite
 right to me - surely the ONLY should be attached to the table name?
 An alternative would be:

 ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]

Any feedback or advice would be great.

Regards
Mike.




Re: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber

2024-08-19 Thread Amit Kapila
On Wed, Aug 14, 2024 at 10:26 AM shveta malik  wrote:
>
> >
> > Thanks for the detailed analysis. I agree with your analysis that we
> > need to reset the origin information for the shutdown path to avoid it
> > being advanced incorrectly. However, the patch doesn't have sufficient
> > comments to explain why we need to reset it for both the ERROR and
> > Shutdown paths. Can we improve the comments in the patch?
> >
> > Also, for the ERROR path, can we reset the origin information in
> > apply_error_callback()?
>
> Please find v4 attached. Addressed comments in that.
>

The patch looks mostly good to me. I have slightly changed a few of
the comments in the attached version. What do you think?

-- 
With Regards,
Amit Kapila.


v5-0001-Don-t-advance-origin-during-apply-failure.patch
Description: Binary data

