Re: [PATCH] Support % wildcard in extension upgrade filenames

2022-06-04 Thread Sandro Santilli
On Sat, May 28, 2022 at 04:50:20PM +0200, Laurenz Albe wrote:
> On Fri, 2022-05-27 at 17:37 -0400, Regina Obe wrote:
> >
> > https://lists.osgeo.org/pipermail/postgis-devel/2022-February/029500.html
> > 
> > Does anyone think this is such a horrible idea that we should abandon all
> > hope?
> 
> I don't think this idea is fundamentally wrong, but I have two worries:
> 
> 1. It would be a good idea to make sure that there is not both
>    "extension--%--2.0.sql" and "extension--1.0--2.0.sql" present.
>    Otherwise the behavior might be nondeterministic.

I'd make sure to use extension--1.0--2.0.sql in that case (more
specific first).
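
Just to illustrate the rule, here is a minimal C sketch of it (the
helper and its flat directory lookup are hypothetical, not part of any
posted patch; the real script lookup lives in
src/backend/commands/extension.c):

  /*
   * Hypothetical helper: when resolving an upgrade step from
   * "from_version" to "to_version", prefer the exact script name over
   * the wildcard one, so "extension--1.0--2.0.sql" always wins over
   * "extension--%--2.0.sql" when both files are present.
   */
  #include "postgres.h"

  #include <unistd.h>

  static char *
  choose_upgrade_script(const char *scriptdir, const char *extname,
                        const char *from_version, const char *to_version)
  {
      char        exact[MAXPGPATH];
      char        wildcard[MAXPGPATH];

      snprintf(exact, sizeof(exact), "%s/%s--%s--%s.sql",
               scriptdir, extname, from_version, to_version);
      snprintf(wildcard, sizeof(wildcard), "%s/%s--%%--%s.sql",
               scriptdir, extname, to_version);

      if (access(exact, R_OK) == 0)
          return pstrdup(exact);       /* more specific first */
      if (access(wildcard, R_OK) == 0)
          return pstrdup(wildcard);
      return NULL;                     /* no applicable script */
  }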

> 2. What if you have a "postgis--%--3.3.sql", and somebody tries to upgrade
>their PostGIS 1.1 installation with it?  Would that work?

For PostGIS in particular it will NOT work as the PostGIS upgrade
script checks for the older version and decides if the upgrade is
valid or not. This is the same upgrade code used for non-extension
installs.

>Having a lower bound for a matching version might be a good idea,
>although I have no idea how to do that.

I was thinking of broader pattern-matching support, like:

  postgis--3.%--3.3.sql

But it would be better to start simple and only increase the
complexity later if needed.

Another option could be to enable this behavior via something in the
control file, which would also have the advantage of still allowing
extensions to use a literal '%' in the version string (for example).

--strk; 

  Libre GIS consultant/developer
  https://strk.kbt.io/services.html




Re: [PATCH] Support % wildcard in extension upgrade filenames

2022-06-04 Thread Sandro Santilli
On Sat, May 28, 2022 at 05:26:05PM +0200, Daniel Gustafsson wrote:
> > On 28 May 2022, at 16:50, Laurenz Albe  wrote:
> 
> > I don't think this idea is fundamentally wrong, but I have two worries:
> > 
> > 1. It would be a good idea to make sure that there is not both
> > "extension--%--2.0.sql" and "extension--1.0--2.0.sql" present.
> > Otherwise the behavior might be nondeterministic.
> > 
> > 2. What if you have a "postgis--%--3.3.sql", and somebody tries to upgrade
> > their PostGIS 1.1 installation with it? Would that work?
> > Having a lower bound for a matching version might be a good idea,
> > although I have no idea how to do that.
> 
> Following that reasoning, couldn't a rogue actor inject a fake file (perhaps
> bundled with another innocent looking extension) which takes precedence in
> wildcard matching?

I think whoever can write into the PostgreSQL extension folder will
be able to inject anything anyway.

--strk;




Re: [PATCH] Support % wildcard in extension upgrade filenames

2022-06-04 Thread Sandro Santilli
On Sat, May 28, 2022 at 11:37:30AM -0400, Tom Lane wrote:
> Laurenz Albe  writes:

> > 2. What if you have a "postgis--%--3.3.sql", and somebody tries to upgrade
> >their PostGIS 1.1 installation with it?  Would that work?
> >Having a lower bound for a matching version might be a good idea,
> >although I have no idea how to do that.
> 
> The lack of upper bound is a problem too: what stops the system from
> trying to use this to get from (say) 4.2 to 3.3, and if it does try that,
> will the script produce a sane result?

This is a very old problem we had before EXTENSION was even available
in PostgreSQL, and so we solved this internally. The upgrade script
for PostGIS checks the version of the existing code and refuses to
downgrade (and refuses to upgrade if a dump/restore is required).

> I'm frankly skeptical that this is a good idea at all.  It seems
> to have come out of someone's willful refusal to use the system as
> designed, ie as a series of small upgrade scripts that each do just
> one step.  I don't feel an urgent need to cater to the
> one-monster-script-that-handles-all-cases approach, because no one
> has offered any evidence that that's really a better way.  How would
> you even write the conditional logic needed ... plpgsql DO blocks?
> Testing what?  IIRC we don't expose any explicit knowledge of the
> old extension version number to the script.

We (PostGIS) do expose explicit knowledge of the old extension version
to the upgrade script, and for this reason I think the pattern-based
logic should be enabled explicitly in the postgis.control file. It
could be even less generic and more specific to a given extension's
needs, if done completely inside the control file. For PostGIS, all we
need at the moment is something like (in the control file):

  one_monster_upgrade_script = postgis--ANY--3.3.0.sql

--strk;




Re: [v15 beta] pg_upgrade failed if earlier executed with -c switch

2022-06-04 Thread Michael Paquier
On Fri, Jun 03, 2022 at 10:32:27PM -0500, Justin Pryzby wrote:
> On Sat, Jun 04, 2022 at 12:13:19PM +0900, Michael Paquier wrote:
>> I am not so sure.  My first reaction was actually to bypass the
>> creation of the directories on EEXIST.
> 
> But that breaks the original motive behind the patch I wrote - logfiles are
> appended to, even if they're full of errors that were fixed before re-running
> pg_upgrade.

Yep.

>> But, isn't the problem different and actually older here?  In the set of
>> commands given by Tushar, he uses the --check option without --retain, but
>> the logs are kept around after the execution of the command.  It seems to me
>> that there is an argument for also removing the logs if the caller of the
>> command does not want to retain them.
> 
> You're right that --check is a bit inconsistent, but I don't think we could
> backpatch any change to fix it.  It wouldn't matter much anyway, since the
> usual workflow would be "pg_upgrade --check && pg_upgrade".  In which case the
> logs would end up being removed anyway.

Exactly.  The inconsistency in the log handling is annoying, and
cleaning up the logs when --retain is not used makes sense to me when
the --check run succeeds, but we should keep them if the --check run
fails.  I don't see an argument for backpatching that either.

> Hmm .. maybe what you mean is that the *tap test* should first run with
> --check?

Sorry for the confusion.  I meant to add an extra command in the TAP
test itself.

I would suggest the attached patch then, to add a --check command in
the test suite, with a change to clean up the logs when --check is
used without --retain.
--
Michael
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 6114303b52..a8b73b6c5e 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -210,6 +210,11 @@ report_clusters_compatible(void)
 		pg_log(PG_REPORT, "\n*Clusters are compatible*\n");
 		/* stops new cluster */
 		stop_postmaster(false);
+
+		/* Remove log files? */
+		if (!log_opts.retain)
+			(void) rmtree(log_opts.basedir, true);
+
 		exit(0);
 	}
 
diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 55c7354ba2..23a4190abb 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -213,6 +213,14 @@ chdir ${PostgreSQL::Test::Utils::tmp_check};
 
 # Upgrade the instance.
 $oldnode->stop;
+command_ok(
+	[
+		'pg_upgrade', '--no-sync','-d', $oldnode->data_dir,
+		'-D', $newnode->data_dir, '-b', $oldbindir,
+		'-B', $newbindir, '-p', $oldnode->port,
+		'-P', $newnode->port, '--check'
+	],
+	'run of pg_upgrade --check for new instance');
 command_ok(
 	[
 		'pg_upgrade', '--no-sync','-d', $oldnode->data_dir,




Re: Multi-Master Logical Replication

2022-06-04 Thread Amit Kapila
On Fri, Jun 3, 2022 at 7:12 AM Bruce Momjian  wrote:
>
> On Thu, Jun  2, 2022 at 05:12:49PM +1000, Peter Smith wrote:
> > On Thu, Jun 2, 2022 at 12:03 AM Bruce Momjian  wrote:
> > >
> > > On Wed, Jun  1, 2022 at 10:27:27AM +0530, Amit Kapila wrote:
> > ...
> >
> > > My big point is that you should not be showing up with a patch but
> > > rather have these discussions to get agreement that this is the
> > > direction the community wants to go.
> >
> > The purpose of posting the POC patch was certainly not to present a
> > fait accompli design/implementation.
> >
> > We wanted to solicit some community feedback about the desirability of
> > the feature, but because LRG is complicated to describe we felt that
> > having a basic functional POC might help to better understand the
> > proposal. Also, we thought the ability to experiment with the proposed
> > API could help people to decide whether LRG is something worth
> > pursuing or not.
>
> I don't think the POC is helping, and I am not sure we really want to
> support this style of architecture due to its complexity vs other
> options.
>

None of the other options discussed on this thread appears to be
better or to serve the intent as well. What other options do you have
in mind, and how are they simpler than this? As far as I can
understand, this provides a simple way to set up n-way replication
among nodes.

I see that other databases provide similar ways to set up n-way
replication. See [1]; in particular, [2][3][4] provide a way to set
up n-way replication via APIs. Yet another way is via configuration,
as seems to be provided by MySQL [5] (Group Replication Settings).
Most of the advantages have already been shared, but let me summarize
again the benefits it brings: (a) more localized database access for
geographically distributed databases, (b) continuous availability in
case the primary site becomes unavailable due to a system or network
outage or a natural disaster at the site, (c) environments that
require a fluid replication infrastructure, where the number of
servers has to grow or shrink dynamically and with as few
side-effects as possible (for instance, database services for the
cloud), and (d) load balancing. Some of these can probably be served
in other ways, but not everything.

I see your point about the POC not helping here, and it can also
sometimes discourage the OP if we decide not to do this feature or to
do it in an entirely different way. But OTOH, I don't see it stopping
us from discussing the desirability or design of this feature.

[1] - https://docs.oracle.com/cd/E18283_01/server.112/e10707/rarrcatpac.htm
[2] - https://docs.oracle.com/cd/E18283_01/server.112/e10707/rarrcatpac.htm#i96251
[3] - https://docs.oracle.com/cd/E18283_01/server.112/e10707/rarrcatpac.htm#i94500
[4] - https://docs.oracle.com/cd/E18283_01/server.112/e10707/rarrcatpac.htm#i97185
[5] - https://dev.mysql.com/doc/refman/8.0/en/group-replication-configuring-instances.html

-- 
With Regards,
Amit Kapila.




Re: Count output lines automatically in psql/help.c

2022-06-04 Thread Robert Haas
On Fri, Jun 3, 2022 at 4:51 PM Tom Lane  wrote:
> Thoughts?

+1 from me. Wish we'd done this years ago.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




pg_rewind: warn when checkpoint hasn't happened after promotion

2022-06-04 Thread James Coleman
A few weeks back I sent a bug report [1] directly to the -bugs mailing
list, and I haven't seen any activity on it (maybe this is because I
emailed directly instead of using the form?), but I got some time to
take a look and concluded that a first-level fix is pretty simple.

A quick background refresher: after promoting a standby, rewinding the
former primary requires that a checkpoint have been completed on the
new primary after promotion. This is correctly documented. However,
until that checkpoint happens, pg_rewind incorrectly reports to the
user that a rewind isn't necessary because the source and target are
on the same timeline.

Specifically, this happens when the control file on the newly promoted
server looks like:

Latest checkpoint's TimeLineID:   4
Latest checkpoint's PrevTimeLineID:   4
...
Min recovery ending loc's timeline:   5

Attached is a patch that detects this condition and reports it as an
error to the user.
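
For reference, a minimal sketch of what such a check can look like
(illustration only, not necessarily how the attached patch does it; it
is a fragment meant to sit in pg_rewind.c, assuming the existing
ControlFile_source global and the field names from pg_control.h, and
the function name is made up):

  /*
   * If the source's minimum recovery point is already on a newer
   * timeline than its latest checkpoint, the server was promoted but
   * has not yet completed a post-promotion checkpoint, so comparing
   * timelines between source and target would be misleading.
   */
  static void
  check_source_checkpointed_after_promotion(void)
  {
      TimeLineID  ckpt_tli = ControlFile_source.checkPointCopy.ThisTimeLineID;
      TimeLineID  minrec_tli = ControlFile_source.minRecoveryPointTLI;

      if (minrec_tli != 0 && minrec_tli != ckpt_tli)
          pg_fatal("source server was promoted to timeline %u but has not "
                   "completed a checkpoint since promotion; run CHECKPOINT "
                   "on it and retry",
                   minrec_tli);
  }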

In the spirit of the new-ish "ensure shutdown" functionality I could
imagine extending this to automatically issue a checkpoint when this
situation is detected. I haven't started to code that up, however,
wanting to first get buy-in on that.

Thanks,
James Coleman

1: 
https://www.postgresql.org/message-id/CAAaqYe8b2DBbooTprY4v=bized9qbqvlq+fd9j617eqfjk1...@mail.gmail.com


v1-0001-pg_rewind-warn-if-we-haven-t-checkpointed-after-p.patch
Description: Binary data


Re: pg_rewind: warn when checkpoint hasn't happened after promotion

2022-06-04 Thread Bharath Rupireddy
On Sat, Jun 4, 2022 at 6:29 PM James Coleman  wrote:
>
> A few weeks back I sent a bug report [1] directly to the -bugs mailing
> list, and I haven't seen any activity on it (maybe this is because I
> emailed directly instead of using the form?), but I got some time to
> take a look and concluded that a first-level fix is pretty simple.
>
> A quick background refresher: after promoting a standby, rewinding the
> former primary requires that a checkpoint have been completed on the
> new primary after promotion. This is correctly documented. However,
> until that checkpoint happens, pg_rewind incorrectly reports to the
> user that a rewind isn't necessary because the source and target are
> on the same timeline.
>
> Specifically, this happens when the control file on the newly promoted
> server looks like:
>
> Latest checkpoint's TimeLineID:   4
> Latest checkpoint's PrevTimeLineID:   4
> ...
> Min recovery ending loc's timeline:   5
>
> Attached is a patch that detects this condition and reports it as an
> error to the user.
>
> In the spirit of the new-ish "ensure shutdown" functionality I could
> imagine extending this to automatically issue a checkpoint when this
> situation is detected. I haven't started to code that up, however,
> wanting to first get buy-in on that.
>
> 1: 
> https://www.postgresql.org/message-id/CAAaqYe8b2DBbooTprY4v=bized9qbqvlq+fd9j617eqfjk1...@mail.gmail.com

Thanks. I had a quick look over the issue and the patch. Just a
thought: can't we let pg_rewind issue a checkpoint on the new primary
instead of erroring out, maybe optionally? It might sound like too
much, but it helps pg_rewind be self-reliant, i.e. it avoids needing
an external actor to detect the error and issue a checkpoint on the
new primary before pg_rewind can successfully be run on the old
primary to repair it for use as a new standby.
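
For what it's worth, a minimal sketch of what that could look like
(the function and any opt-in option name are hypothetical; it assumes
the already-open libpq connection to the source, pg_rewind's frontend
pg_fatal(), and a connection role that is allowed to run CHECKPOINT):

  #include "libpq-fe.h"

  static void
  force_source_checkpoint(PGconn *conn)
  {
      /* Run CHECKPOINT on the source; its control file can then be re-read. */
      PGresult   *res = PQexec(conn, "CHECKPOINT");

      if (PQresultStatus(res) != PGRES_COMMAND_OK)
          pg_fatal("CHECKPOINT on source server failed: %s",
                   PQresultErrorMessage(res));
      PQclear(res);
  }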

Regards,
Bharath Rupireddy.




Re: [v15 beta] pg_upgrade failed if earlier executed with -c switch

2022-06-04 Thread Justin Pryzby
On Sat, Jun 04, 2022 at 06:48:19PM +0900, Michael Paquier wrote:
> I would suggest the attached patch then, to add a --check command in
> the test suite, with a change to clean up the logs when --check is
> used without --retain.

This doesn't address one of the problems that I already enumerated.

./tmp_install/usr/local/pgsql/bin/initdb -D pgsql15.dat
./tmp_install/usr/local/pgsql/bin/initdb -D pgsql15.dat-2

$ ./tmp_install/usr/local/pgsql/bin/pg_upgrade -b ./tmp_install/usr/local/pgsql/bin/bad -d pgsql15.dat-2 -D pgsql15.dat-2
check for "tmp_install/usr/local/pgsql/bin/bad" failed: No such file or directory
Failure, exiting

$ ./tmp_install/usr/local/pgsql/bin/pg_upgrade -b ./tmp_install/usr/local/pgsql/bin/bad -d pgsql15.dat-2 -D pgsql15.dat-2
could not create directory "pgsql15.dat-2/pg_upgrade_output.d": File exists
Failure, exiting

..failing the 2nd time because it failed the 1st time (even if I fix the bad
argument).

Maybe that's easy enough to fix just by rearranging verify_directories() or
make_outputdirs().

But actually it seems annoying to have to remove the failed outputdir.
It's true that those logs *can* be useful to fix whatever underlying problem,
but I'm afraid the *requirement* to remove the failed outputdir is a nuisance,
even outside of check mode.

-- 
Justin




Re: Count output lines automatically in psql/help.c

2022-06-04 Thread Tom Lane
Robert Haas  writes:
> On Fri, Jun 3, 2022 at 4:51 PM Tom Lane  wrote:
>> Thoughts?

> +1 from me. Wish we'd done this years ago.

Pushed, thanks for looking at it.

regards, tom lane




Re: Count output lines automatically in psql/help.c

2022-06-04 Thread Alvaro Herrera
On 2022-Jun-03, Tom Lane wrote:

> So, attached is a patch to remove that maintenance chore by
> constructing the output in a PQExpBuffer and then counting the
> lines automatically.  While I was at it, I introduced a couple of
> macros to make the code shorter rather than longer.

What about adding stringInfoCountLines or something like that?

-- 
Álvaro Herrera




Re: Count output lines automatically in psql/help.c

2022-06-04 Thread Tom Lane
Alvaro Herrera  writes:
> What about adding stringInfoCountLines or something like that?

If we have other use-cases, maybe that'd be worthwhile.

(In the committed patch, I dumbed it down to a plain per-char
loop without the strchr() complication.  So it's very little code.
I'm not real sure that strchr would make it faster.)

regards, tom lane




Error from the foreign RDBMS on a foreign table I have no privilege on

2022-06-04 Thread Phil Florent
Hi,
I opened an issue with attached code on the oracle_fdw GitHub page:
https://github.com/laurenz/oracle_fdw/issues/534
Basically, I expected to obtain a "no privilege" error from PostgreSQL when I
have no read privilege on the foreign table, but I obtained an Oracle
error instead.
Laurenz investigated and closed the issue, but he suggested that perhaps I
should post it on the hackers list since it also occurs with postgres_fdw on
some occasions (I have investigated some more, and postgres_fdw does the same
thing when you turn on use_remote_estimate). Hence I do...

Best regards,
Phil


Re: pgcon unconference / impact of block size on performance

2022-06-04 Thread Roberto Mello
On Sat, Jun 4, 2022 at 5:23 PM Tomas Vondra wrote:

> Hi,
>
> At on of the pgcon unconference sessions a couple days ago, I presented
> a bunch of benchmark results comparing performance with different
> data/WAL block size. Most of the OLTP results showed significant gains
> (up to 50%) with smaller (4k) data pages.


Thanks for sharing this, Tomas.

We’ve been doing similar tests with different storage classes in Kubernetes
clusters.

Roberto



Re: [v15 beta] pg_upgrade failed if earlier executed with -c switch

2022-06-04 Thread Michael Paquier
On Sat, Jun 04, 2022 at 09:13:46AM -0500, Justin Pryzby wrote:
> Maybe that's easy enough to fix just by rearranging verify_directories() or
> make_outputdirs().

For the case I mentioned, yes.

> But actually it seems annoying to have to remove the failed outputdir.
> It's true that those logs *can* be useful to fix whatever underlying problem,
> but I'm afraid the *requirement* to remove the failed outputdir is a nuisance,
> even outside of check mode.

Well, another error that could happen in the early code paths is
EACCES when a custom socket directory is specified, and we'd still
face the same problem on a follow-up restart.  Using a sub-directory
structure as Daniel and Tom mention would address all that (ignoring
EEXIST for the BASE_OUTPUTDIR), removing any existing content from
the base path when not using --retain.  This comes with the
disadvantage of bloating the disk on repeated errors, but that last
bit would not really be a huge problem, I guess, as it could be more
useful to keep the error information around.
--
Michael




Re: should check interrupts in BuildRelationExtStatistics ?

2022-06-04 Thread Justin Pryzby
On Fri, Jun 03, 2022 at 10:28:37AM -0500, Justin Pryzby wrote:
> Maybe this should actually call vacuum_delay_point(), like
> compute_scalar_stats().

I think vacuum_delay_point() would be wrong for these cases, since they don't
call "fetchfunc()" like the other places which use vacuum_delay_point() do.

> For MCV, there seems to be no issue, since those
> functions are being called (but only for expressional stats).  But maybe I've
> just failed to make a large enough, non-expressional MCV list for the problem
> to be apparent.

I reproduced the issue with MCV like this:

DROP TABLE IF EXISTS t;
CREATE TABLE t AS
  SELECT a::text, b::text, c::text, d::text, e::text, f::text, g::text
  FROM generate_series(101,106) a, generate_series(101,106) b,
       generate_series(101,106) c, generate_series(101,106) d,
       generate_series(101,106) e, generate_series(101,106) f,
       generate_series(101,106) g, generate_series(101,106) h;
VACUUM t;
DROP STATISTICS IF EXISTS stxx;
CREATE STATISTICS stxx (mcv) ON a,b,c,d,e,f FROM t;
ALTER STATISTICS stxx SET STATISTICS ;
ANALYZE VERBOSE t;

This is slow (25 seconds) inside qsort:

(gdb) bt
#0  __memcmp_sse4_1 () at ../sysdeps/x86_64/multiarch/memcmp-sse4.S:1020
#1  0x5653d8686fac in varstrfastcmp_locale (a1p=0x5653dce67d54 "104~", 
len1=7, a2p=0x5653e895ffa4 "104~", len2=7, ssup=ssup@entry=0x5653d98c37b8) 
at varlena.c:2444
#2  0x5653d8687161 in varlenafastcmp_locale (x=94918188367184, 
y=94918384418720, ssup=0x5653d98c37b8) at varlena.c:2270
#3  0x5653d85134d8 in ApplySortComparator (ssup=0x5653d98c37b8, 
isNull2=, datum2=, isNull1=, 
datum1=) at ../../../src/include/utils/sortsupport.h:224
#4  multi_sort_compare (a=0x7fa587b44e58, b=0x7fa5875f0dd0, arg=0x5653d98c37b0) 
at extended_stats.c:903
#5  0x5653d8712eed in qsort_arg (data=data@entry=0x7fa5875f0050, 
n=, n@entry=1679616, element_size=element_size@entry=24, 
compare=compare@entry=0x5653d8513483 , 
arg=arg@entry=0x5653d98c37b0) at ../../src/include/lib/sort_template.h:349
#6  0x5653d851415f in build_sorted_items (data=data@entry=0x7fa58f2e1050, 
nitems=nitems@entry=0x7ffe4f764e5c, mss=mss@entry=0x5653d98c37b0, numattrs=6, 
attnums=0x7fa58f2e1078) at extended_stats.c:1134
#7  0x5653d8515d84 in statext_mcv_build (data=data@entry=0x7fa58f2e1050, 
totalrows=totalrows@entry=1679616, stattarget=stattarget@entry=) at 
mcv.c:204
#8  0x5653d8513819 in BuildRelationExtStatistics 
(onerel=onerel@entry=0x7fa5b26ef658, inh=inh@entry=false, totalrows=1679616, 
numrows=numrows@entry=1679616, rows=rows@entry=0x7fa5a4103050, 
natts=natts@entry=7, 
vacattrstats=vacattrstats@entry=0x5653d98b76b0) at extended_stats.c:213

The fix seems to be to call CHECK_FOR_INTERRUPTS() within multi_sort_compare().
That would supersede the other two CHECK_FOR_INTERRUPTS I'd proposed, and
handle mcv, dependencies, and ndistinct all at once.

Does that sound right?
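
To make it concrete, a simplified sketch (the real multi_sort_compare()
in extended_stats.c loops over the columns with ApplySortComparator();
the only change being proposed is the added CHECK_FOR_INTERRUPTS() at
the top, shown here in a stand-in function):

  #include "postgres.h"

  #include "miscadmin.h"                    /* CHECK_FOR_INTERRUPTS() */
  #include "statistics/extended_stats_internal.h"
  #include "utils/sortsupport.h"

  static int
  multi_sort_compare_sketch(const void *a, const void *b, void *arg)
  {
      MultiSortSupport mss = (MultiSortSupport) arg;
      SortItem   *ia = (SortItem *) a;
      SortItem   *ib = (SortItem *) b;
      int         i;

      /*
       * qsort_arg() calls the comparator O(n log n) times, so checking
       * here makes the whole extended-stats sort cancellable, and it is
       * cheap when no interrupt is pending.
       */
      CHECK_FOR_INTERRUPTS();

      for (i = 0; i < mss->ndims; i++)
      {
          int         compare;

          compare = ApplySortComparator(ia->values[i], ia->isnull[i],
                                        ib->values[i], ib->isnull[i],
                                        &mss->ssup[i]);
          if (compare != 0)
              return compare;
      }

      return 0;                             /* all dimensions equal */
  }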

For MCV, there's also ~0.6 sec spent in build_column_frequencies(), which (if
needed) would be addressed by adding a CHECK_FOR_INTERRUPTS() call in
sort_item_compare().