Proposal: Shared Work Mem Area

2023-04-06 Thread Marco Fortina
Hello there!

I would like to propose to developers a new feature: replacing the private, 
per-process management of work_mem with management through shared memory. 
Allocated at PostgreSQL startup, this area would be shared by all worker 
processes, much as Oracle Database's PGA does.

This would allow optimal use of this memory area: only its global maximum 
would need to be limited, with no configuration/allocation at the process 
level.

What do you think about my proposal?

Cheers,
Marco



Re: Proposal: Shared Work Mem Area

2023-04-06 Thread Stephen Frost
Greetings,

* Marco Fortina (marco_fort...@hotmail.it) wrote:
> I would like to propose to developers a new feature: replacing the private, 
> per-process management of work_mem with management through shared memory. 
> Allocated at PostgreSQL startup, this area would be shared by all worker 
> processes, much as Oracle Database's PGA does.
> 
> This would allow optimal use of this memory area: only its global maximum 
> would need to be limited, with no configuration/allocation at the process 
> level.
> 
> What do you think about my proposal?

There's ongoing work to provide a way to have a global maximum limit
which doesn't involve entirely reworking how work_mem works today.  The
commitfest entry for that work is here:

https://commitfest.postgresql.org/42/3867/

If you're interested in that, getting additional reviews and comments on
the work would be helpful in moving it forward.

Thanks,

Stephen




R: Proposal: Shared Work Mem Area

2023-04-06 Thread Marco Fortina
Thanks Stephen.

https://commitfest.postgresql.org/42/3867/ is not exactly the feature I 
proposed to developers.

If I'm not mistaken, almost all of the main memory areas have a fixed size:

shared_buffers
effective_cache_size
wal_buffers

Instead, work_mem is allocated dynamically per process, up to the defined size limit.

What I suggested is to change work_mem from a per-process allocation to a 
global, fixed-size allocation (see pga_aggregate_target in Oracle) shared by 
the worker processes.

Let's assume the new parameter is named worker_mem_area and is set to 8GB: 
with my proposed method, each worker process would not use its own dedicated 
work_mem but the shared area.

In this way each worker is also able to take free pages from the 
worker_mem_area if needed.
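
To make this concrete, a postgresql.conf sketch (worker_mem_area is the 
hypothetical parameter proposed here; it does not exist in PostgreSQL):

    # today: a per-operation limit, allocated privately by each process
    work_mem = 64MB

    # proposed: one fixed-size area, allocated at startup and shared by
    # all worker processes (hypothetical parameter)
    worker_mem_area = 8GB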

Regards,
Marco





From: Stephen Frost
Sent: Thursday, April 6, 2023 14:43
To: Marco Fortina
Cc: pgsql-gene...@postgresql.org
Subject: Re: Proposal: Shared Work Mem Area

Greetings,

* Marco Fortina (marco_fort...@hotmail.it) wrote:
> I would like to propose to developers a new feature: replacing the private, 
> per-process management of work_mem with management through shared memory. 
> Allocated at PostgreSQL startup, this area would be shared by all worker 
> processes, much as Oracle Database's PGA does.
>
> This would allow optimal use of this memory area: only its global maximum 
> would need to be limited, with no configuration/allocation at the process 
> level.
>
> What do you think about my proposal?

There's ongoing work to provide a way to have a global maximum limit
which doesn't involve entirely reworking how work_mem works today.  The
commitfest entry for that work is here:

https://commitfest.postgresql.org/42/3867/

If you're interested in that, getting additional reviews and comments on
the work would be helpful in moving it forward.

Thanks,

Stephen


Re: R: Proposal: Shared Work Mem Area

2023-04-06 Thread Stephen Frost
Greetings,

Please don't top-post on the PG mailing lists, it makes it harder to
follow the discussion.

* Marco Fortina (marco_fort...@hotmail.it) wrote:
> https://commitfest.postgresql.org/42/3867/ is not exactly the feature I 
> proposed to developers.

I understood what you were proposing.

> If I'm not mistaken, almost all of the main memory areas have a fixed size:
> 
> shared_buffers
> effective_cache_size
> wal_buffers
> 
> Instead, work_mem is allocated dynamically per process, up to the defined 
> size limit.

That's not how work_mem works actually.  It's a per-node amount and it's
not a per-process overall limit, nor is it really a hard limit though
some nodes will do their best to respect the amount configured.
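
As a quick illustration (hypothetical tables t1 and t2), each memory-hungry
node in a single plan may use up to work_mem on its own, so one query can
consume a multiple of the setting:

    SET work_mem = '64MB';
    EXPLAIN (ANALYZE)
    SELECT t1.id, count(*)
    FROM t1 JOIN t2 USING (id)  -- Hash node: up to work_mem
    GROUP BY t1.id              -- HashAggregate node: up to work_mem
    ORDER BY count(*) DESC;     -- Sort node: up to work_mem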

> What I suggested is to change work_mem from a per-process allocation to a 
> global, fixed-size allocation (see pga_aggregate_target in Oracle) shared 
> by the worker processes.

I understood the suggestion and it's a lot of work for an unclear gain.
You noted that having it be pulled from a single area would allow
administrators to configure an overall memory usage limit, but that's
not the only way to do that, and there's an existing effort to do exactly
that, already underway, which is a lot simpler than what you're proposing.
While there might be other advantages to having a shared memory segment
be used for work_mem, you've not outlined any.

> Let's assume the new parameter is named worker_mem_area and is set to 
> 8GB: with my proposed method, each worker process would not use its own 
> dedicated work_mem but the shared area.

I understand your suggestion, but making such a large change just for its
own sake isn't sensible; there should be reasoning behind why it's better
than what we're doing already or proposing to do.

> In this way each worker is also able to take free pages from the 
> worker_mem_area if needed.

This can be done with the existing approach and doesn't require a shared
memory segment for work_mem.  We are pretty far from having an actual
acceptance system for queries, though I do agree that would be a useful
thing to work towards.  I don't know that it requires work_mem being in
shared memory though.

Thanks,

Stephen




Re: Patroni vs pgpool II

2023-04-06 Thread Jehan-Guillaume de Rorthais
On Wed, 05 Apr 2023 16:50:15 +0900 (JST)
Tatsuo Ishii  wrote:

> > But, I heard PgPool is still affected by Split brain syndrome.  
> 
> Can you elaborate more? If more than 3 pgpool watchdog nodes (the
> number of nodes must be odd) are configured, a split brain can be
> avoided.

Split brain is a hard situation to avoid. I suppose the OP is talking about
a PostgreSQL split-brain situation. I'm not sure how PgPool's watchdog would
avoid that.

To avoid split brain, you need to implement a combination of quorum and
(self-)fencing.

Patroni's quorum is in the DCS's hands. Patroni's self-fencing can be achieved
with the (hardware) watchdog. You can also implement node fencing through the
"pre_promote" script to fence the old primary node before promoting the new one.

If you need HA with a high level of anti-split-brain security, you'll not be
able to avoid some sort of fencing, no matter what.

Good luck.




"PANIC: could not open critical system index 2662" - twice

2023-04-06 Thread Evgeny Morozov
Our PostgreSQL 15.2 instance running on Ubuntu 18.04 has crashed with
this error:

2023-04-05 09:24:03.448 UTC [15227] ERROR:  index "pg_class_oid_index"
contains unexpected zero page at block 0
2023-04-05 09:24:03.448 UTC [15227] HINT:  Please REINDEX it.
...
2023-04-05 13:05:25.018 UTC [15437]
root@test_behavior_638162834106895162 FATAL:  index "pg_class_oid_index"
contains unexpected zero page at block 0
2023-04-05 13:05:25.018 UTC [15437]
root@test_behavior_638162834106895162 HINT:  Please REINDEX it.
... (same error for a few more DBs)
2023-04-05 13:05:25.144 UTC [16965]
root@test_behavior_638162855458823077 FATAL:  index "pg_class_oid_index"
contains unexpected zero page at block 0
2023-04-05 13:05:25.144 UTC [16965]
root@test_behavior_638162855458823077 HINT:  Please REINDEX it.
...
2023-04-05 13:05:25.404 UTC [17309]
root@test_behavior_638162881641031612 PANIC:  could not open critical
system index 2662
2023-04-05 13:05:25.405 UTC [9372] LOG:  server process (PID 17309) was
terminated by signal 6: Aborted
2023-04-05 13:05:25.405 UTC [9372] LOG:  terminating any other active
server processes

We had the same thing happen about a month ago on a different database
on the same cluster. For a while PG actually ran OK as long as you
didn't access that specific DB, but when trying to back up that DB with
pg_dump it would crash every time. At that time one of the disks hosting
the ZFS dataset with the PG data directory on it was reporting errors,
so we thought it was likely due to that.

Unfortunately, before we could replace the disks, PG crashed completely
and would not start again at all, so I had to rebuild the cluster from
scratch and restore from pg_dump backups (still onto the old, bad
disks). Once the disks were replaced (all of them) I just copied the
data to them using zfs send | zfs receive and didn't bother restoring
pg_dump backups again - which was perhaps foolish in hindsight.

Well, yesterday it happened again. The server still restarted OK, so I
took fresh pg_dump backups of the databases we care about (which ran
fine), rebuilt the cluster and restored the pg_dump backups again - now
onto the new disks, which are not reporting any problems.

So while everything is up and running now this error has me rather
concerned. Could the error we're seeing now have been caused by some
corruption in the PG data that's been there for a month (so it could
still be attributed to the bad disk), which should now be fixed by
having restored from backups onto good disks? Could this be a PG bug?
What can I do to figure out why this is happening and prevent it from
happening again? Advice appreciated!


PostgreSQL Mailing list public archives : search not working ...

2023-04-06 Thread Imre Samu
Hi,

Theory: Maybe the search doesn't always work ...
Is this a known problem?

My use case:
in https://www.postgresql.org/list/  ->  "Search the Archives"
search for the "large database backup"
->
https://www.postgresql.org/search/?m=1&q=large+database+backup&l=&d=365&s=r
-> "Your search for large database backup returned no hits."

But at least one big thread exists:

To: pgsql-general 
Subject: postgres *large database backup*
Date: 2022-11-30 15:40:23
https://www.postgresql.org/message-id/flat/CA%2BONtZ7CdiJ2X8x_8sPHR%2Btz-vv7mR1uoDzLia3fzWVZA9k0JQ%40mail.gmail.com


*Changing the "Sort by" ..*

- sort by: rank: no hits
https://www.postgresql.org/search/?m=1&q=large+database+backup&l=&d=365&s=r

- sort by: date:  OK ; 1 result
https://www.postgresql.org/search/?m=1&q=large+database+backup&l=&d=365&s=d

- sort by: reverse date : no hits.
https://www.postgresql.org/search/?m=1&q=large+database+backup&l=&d=365&s=i

Regards,
 Imre


UPSERT in Postgres

2023-04-06 Thread Louis Tian
This is a question/feature request. 

Virtually all references on the internet today suggest one can "upsert" in 
PostgreSQL using the "insert ... on conflict do update ..." statement. 
But this is not completely true. 

PostgreSQL's own wiki page 
(https://wiki.postgresql.org/wiki/UPSERT#.22UPSERT.22_definition) defines 
UPSERT as:

    "UPSERT" is a DBMS feature that allows a DML statement's author to 
    atomically either insert a row, or on the basis of the row already 
    existing, UPDATE that existing row instead.

I believe this definition is correct and consistent with definitions elsewhere 
(https://en.wiktionary.org/wiki/upsert).

An implicit assumption behind this definition is that the table must have a 
primary key for the upsert operation to make sense, since it's the primary key 
that uniquely identifies a row. 
Let's say we have a very simple table:

```
create table person (
    id int primary key,
    name text not null,
    is_active boolean
);
```

Given the definition of upsert, I'd expect an upsert command to do the 
following.
- `upsert into person (id, name) values (0, 'foo')` to insert a new row 
- `upsert into person (id, is_active) values (0, true)` updates the is_active 
column for the row inserted above

Naturally, since there isn't a real upsert command in PostgreSQL, this won't 
work today. 
But can we achieve the same effect with "INSERT ... ON CONFLICT DO UPDATE", as 
a lot of references on the internet seem to suggest? 

insert into person (id, name) values (0, 'foo') on conflict ("id") do update 
set id=excluded.id, name=excluded.name
insert into person (id, is_active) values (0, true) on conflict ("id") do 
update set id=excluded.id, is_active=excluded.is_active 

Unfortunately, the second statement will fail due to violation of the not null 
constraint on the "name" column. 
PostgreSQL will always try to insert the row into the table first, and only 
fall back to update when the uniqueness constraint is violated. 
Is this behavior wrong? Maybe not; I think it is doing what it reads quite 
literally. 
That being said, I have never had a need for the ON CONFLICT DO UPDATE 
statement other than where I need upsert. 
But using it as "upsert" is only valid when the table is free of NOT NULL 
constraints on its non-primary-key columns. 
So, if my experience/use case is typical (meaning the main purpose/use case 
for ON CONFLICT DO UPDATE is to support upsert), then it can be argued the 
current behavior is incorrect? 

This has been a source of confusion, to say the least. 
https://stackoverflow.com/questions/48816629/on-conflict-do-nothing-in-postgres-with-a-not-null-constraint
 
https://www.postgresql.org/message-id/flat/1996438.1604952901%40sss.pgh.pa.us#ace8adc1354265aca5672028faea0c0f

The MERGE command introduced in PG15 can in theory be used to do UPSERT 
properly, free of the aforementioned limitation. 
The downside is that it is rather verbose. 
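
For illustration, a sketch of a MERGE equivalent of the second statement 
above (same person table). Because the UPDATE arm runs whenever the row 
already exists, the NOT NULL constraint on "name" is never tested on that 
path:

```
merge into person p
using (values (0, true)) as v (id, is_active)
on p.id = v.id
when matched then
    update set is_active = v.is_active
when not matched then
    -- this arm would still fail on the NOT NULL "name" column,
    -- but it is only taken when no matching row exists
    insert (id, is_active) values (v.id, v.is_active);
```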

*Question*
Is there a way to do a proper upsert prior to PG15?

*Feature Request*
Given that UPSERT is an *idempotent* operator, it is extremely useful. 
I would love to see an UPSERT command in PostgreSQL so one can 'upsert' 
properly and easily. 


Regards,
Louis Tian


Re: UPSERT in Postgres

2023-04-06 Thread Peter Geoghegan
On Thu, Apr 6, 2023 at 1:21 PM Louis Tian  wrote:
> An implicit assumption behind this definition is that the table must have a 
> primary key for the upsert operation to make sense, since it's the primary key 
> that uniquely identifies a row.

It could just be a unique index or a unique constraint. So you can
upsert on any individual unique constraint/index, or the primary key.
Of course there might be several on a given table, but you can only
use one as the "conflict arbiter" per statement.
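
As a sketch (hypothetical table), arbitrating on a unique constraint other
than the primary key:

    create table account (
        id int primary key,
        email text unique,
        visits int not null default 0
    );

    insert into account (id, email) values (1, 'a@example.com')
    on conflict (email) do update
    set visits = account.visits + 1;

Note that this "upsert" is not idempotent: every run increments visits.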

> But can we achieve the same effect with "INSERT ... ON CONFLICT DO UPDATE", 
> as a lot of references on the internet seem to suggest?
>
> insert into person (id, name) values (0, 'foo') on conflict ("id") do update 
> set id=excluded.id, name=excluded.name
> insert into person (id, is_active) values (0, true) on conflict ("id") do 
> update set id=excluded.id, is_active=excluded.is_active
>
> Unfortunately, the second statement will fail due to violation of the not 
> null constraint on the "name" column.
> PostgreSQL will always try to insert the row into the table first, and only 
> fall back to update when the uniqueness constraint is violated.
> Is this behavior wrong? Maybe not; I think it is doing what it reads quite 
> literally.

It sort of has to work that way, though. In general your example might
*not* fail, due to a row-level before trigger in the insert path.

Why doesn't your proposed upsert syntax have the same problem? I mean,
how could it not? I guess it doesn't if you assume that it'll never
take the insert path with your not NULL constraint example? But if you
know that for sure, why not just use a regular update statement? On
the other hand, if you're not sure if the insert path can be taken,
then why is it actually helpful to not just throw an error at the
earliest opportunity?

Surely upsert means "update or insert", so why wouldn't the user expect
to see an error like this, independent of the specifics of the row in question?
Isn't the user tacitly saying "I don't specifically know if the update or insert
path will be taken in respect of any given row" by using ON CONFLICT
DO UPDATE in the first place?

> That being said, I have never had a need for the ON CONFLICT DO UPDATE 
> statement other than where I need upsert.
> But using it as "upsert" is only valid when the table is free of NOT 
> NULL constraints on its non-primary-key columns.

I don't know what you mean by that. "Valid"?

> The MERGE command introduced in PG15 can in theory be used to do UPSERT 
> properly, free of the aforementioned limitation.
> The downside is that it is rather verbose.

The MERGE command has various race conditions that are particularly
relevant to UPSERT type use cases. See the wiki page you referenced
for a huge amount of information on this.

> *Feature Request*
> Given that UPSERT is an *idempotent* operator, it is extremely useful.

In general UPSERT (or any definition of it that I can think of) does
not imply idempotency.


--
Peter Geoghegan




Re: UPSERT in Postgres

2023-04-06 Thread Adrian Klaver

On 4/5/23 23:21, Louis Tian wrote:

This is a question/feature request.




Given the definition of upsert, I'd expect an upsert command to do the 
following.
- `upsert into person (id, name) values (0, 'foo')` to insert a new row
- `upsert into person (id, is_active) values (0, true)` updates the is_active 
column for the row inserted above

Naturally, since there isn't a real upsert command in PostgreSQL, this won't 
work today.
But can we achieve the same effect with "INSERT ... ON CONFLICT DO UPDATE", as 
a lot of references on the internet seem to suggest?

insert into person (id, name) values (0, 'foo') on conflict ("id") do update 
set id=excluded.id, name=excluded.name
insert into person (id, is_active) values (0, true) on conflict ("id") do 
update set id=excluded.id, is_active=excluded.is_active


insert into person (id, name, is_active) values (0, '', true) on 
conflict ("id") do update set id=excluded.id, name=person.name, 
is_active=excluded.is_active ;

INSERT 0 1

select * from person;
 id | name | is_active
----+------+-----------
  0 | foo  | t



Unfortunately, the second statement will fail due to violation of the not null 
constraint on the "name" column.
PostgreSQL will always try to insert the row into the table first, and only 
fall back to update when the uniqueness constraint is violated.
Is this behavior wrong? Maybe not; I think it is doing what it reads quite 
literally.
That being said, I have never had a need for the ON CONFLICT DO UPDATE 
statement other than where I need upsert.
But using it as "upsert" is only valid when the table is free of NOT NULL 
constraints on its non-primary-key columns.
So, if my experience/use case is typical (meaning the main purpose/use case 
for ON CONFLICT DO UPDATE is to support upsert), then it can be argued the 
current behavior is incorrect?

This has been a source of confusion, to say the least.
https://stackoverflow.com/questions/48816629/on-conflict-do-nothing-in-postgres-with-a-not-null-constraint
https://www.postgresql.org/message-id/flat/1996438.1604952901%40sss.pgh.pa.us#ace8adc1354265aca5672028faea0c0f

The MERGE command introduced in PG15 can in theory be used to do UPSERT 
properly, free of the aforementioned limitation.
The downside is that it is rather verbose.

*Question*
Is there a way to do a proper upsert prior to PG15?

*Feature Request*
Given that UPSERT is an *idempotent* operator, it is extremely useful.
I would love to see an UPSERT command in PostgreSQL so one can 'upsert' 
properly and easily.


Regards,
Louis Tian

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: UPSERT in Postgres

2023-04-06 Thread Benedict Holland
That is the answer. PostgreSQL can upsert easily via triggers and ON
CONFLICT.

Thanks,
Ben

On Thu, Apr 6, 2023, 5:01 PM Adrian Klaver 
wrote:

> On 4/5/23 23:21, Louis Tian wrote:
> > This is a question/feature request.
> >
>
> > Given the definition of upsert, I'd expect an upsert command to do the
> following.
> > - `upsert into person (id, name) values (0, 'foo')` to insert a new row
> > - `upsert into person (id, is_active) values (0, true)` updates the
> is_active column for the row inserted above
> >
> > Naturally, since there isn't a real upsert command in PostgreSQL, this
> won't work today.
> > But can we achieve the same effect with "INSERT ... ON CONFLICT DO
> UPDATE", as a lot of references on the internet seem to suggest?
> >
> > insert into person (id, name) values (0, 'foo') on conflict ("id") do
> update set id=excluded.id, name=excluded.name
> > insert into person (id, is_active) values (0, true) on conflict
> ("id") do update set id=excluded.id, is_active=excluded.is_active
>
> insert into person (id, name, is_active) values (0, '', true) on
> conflict ("id") do update set id=excluded.id, name=person.name,
> is_active=excluded.is_active ;
> INSERT 0 1
>
> select * from person;
>   id | name | is_active
> -----+------+-----------
>    0 | foo  | t
>
> >
> > Unfortunately, the second statement will fail due to violation of the
> not null constraint on the "name" column.
> > PostgreSQL will always try to insert the row into the table first, and
> only fall back to update when the uniqueness constraint is violated.
> > Is this behavior wrong? Maybe not; I think it is doing what it reads
> quite literally.
> > That being said, I have never had a need for the ON CONFLICT DO UPDATE
> statement other than where I need upsert.
> > But using it as "upsert" is only valid when the table is free of NOT
> NULL constraints on its non-primary-key columns.
> > So, if my experience/use case is typical (meaning the main purpose/use
> case for ON CONFLICT DO UPDATE is to support upsert), then it can be argued
> the current behavior is incorrect?
> >
> > This has been a source of confusion, to say the least.
> >
> https://stackoverflow.com/questions/48816629/on-conflict-do-nothing-in-postgres-with-a-not-null-constraint
> >
> https://www.postgresql.org/message-id/flat/1996438.1604952901%40sss.pgh.pa.us#ace8adc1354265aca5672028faea0c0f
> >
> > The MERGE command introduced in PG15 can in theory be used to do UPSERT
> properly, free of the aforementioned limitation.
> > The downside is that it is rather verbose.
> >
> > *Question*
> > Is there a way to do a proper upsert prior to PG15?
> >
> > *Feature Request*
> > Given that UPSERT is an *idempotent* operator, it is extremely useful.
> > I would love to see an UPSERT command in PostgreSQL so one can 'upsert'
> properly and easily.
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>
>


Re: PostgreSQL Mailing list public archives : search not working ...

2023-04-06 Thread Stephen Frost
Greetings,

* Imre Samu (pella.s...@gmail.com) wrote:
> Theory: Maybe the search doesn't always work ...

It does ... but perhaps not as expected (or as ideal).

> Is this a known problem?

Yes and no.

> My use case:
> in https://www.postgresql.org/list/  ->  "Search the Archives"
> search for the "large database backup"
> ->
> https://www.postgresql.org/search/?m=1&q=large+database+backup&l=&d=365&s=r
> -> "Your search for large database backup returned no hits."

Note that the default is "within the last year".  That may not be ideal
though and perhaps we should change it.

> But at least one big thread exists:

"big" is perhaps over-stating it, and most of it is actually about ZFS
anyway, though a few of the initial emails on the thread are
informative.

> To: pgsql-general 
> Subject: postgres *large database backup*
> Date: 2022-11-30 15:40:23
> https://www.postgresql.org/message-id/flat/CA%2BONtZ7CdiJ2X8x_8sPHR%2Btz-vv7mR1uoDzLia3fzWVZA9k0JQ%40mail.gmail.com
> 
> *Changing the "Sort by" ..*

Changing date to 'anytime', sorting by 'date' and using 'database
backup' instead turns this thread up pretty quickly (number 7):

https://www.postgresql.org/search/?m=1&q=database+backup&l=&d=-1&s=d

That said, we should really be including the Subject line in the search
and I don't think we are today (looks like we just search the body).
The code is available for review here:

https://git.postgresql.org/gitweb/?p=pgarchives.git;a=summary

Patches to improve it are certainly welcome.

Thanks,

Stephen




Backup schema without data

2023-04-06 Thread Atul Kumar
Hi,

Could you please tell me how I can take a backup of a single schema, without
its data, using the pg_dump utility?


So far, I could not find anything suitable for doing so.

Regards,
Atul


Re: Backup schema without data

2023-04-06 Thread Michael Loftis
From the man page….

“

-s
--schema-only

Dump only the object definitions (schema), not data.

…..”

On Thu, Apr 6, 2023 at 18:40 Atul Kumar  wrote:

> Hi,
>
> Could you please tell me how I can take a backup of a single schema,
> without its data, using the pg_dump utility?
>
>
> So far, I could not find anything suitable for doing so.
>
> Regards,
> Atul
>
> --

"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds."
-- Samuel Butler


Re: Backup schema without data

2023-04-06 Thread Tom Lane
Atul Kumar  writes:
> Please help me in telling that how I can take the backup of one single
> schema without its data using pg_dump utility ?

Doesn't "pg_dump -s -n  ..." do what you want?

regards, tom lane




Re: Backup schema without data

2023-04-06 Thread Brent Wood
Looking at the result of:

pg_dump --help
Usage:
  pg_dump [OPTION]... [DBNAME]
...
  -n, --schema=PATTERN         dump the specified schema(s) only
  -s, --schema-only            dump only the schema, no data
...


So: pg_dump -s -n  

It works for me...
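
For example, a concrete invocation might look like this (schema and
database names are placeholders):

pg_dump -s -n myschema mydb > myschema_schema.sql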

Cheers
Brent Wood


On Friday, April 7, 2023 at 12:40:58 PM GMT+12, Atul Kumar 
 wrote:

Hi,

Could you please tell me how I can take a backup of a single schema, without 
its data, using the pg_dump utility?

So far, I could not find anything suitable for doing so.

Regards,
Atul

Re: Patroni vs pgpool II

2023-04-06 Thread Tatsuo Ishii
>> > But, I heard PgPool is still affected by Split brain syndrome.  
>> 
>> Can you elaborate more? If more than 3 pgpool watchdog nodes (the
>> number of nodes must be odd) are configured, a split brain can be
>> avoided.
> 
> Split brain is a hard situation to avoid. I suppose OP is talking about
> PostgreSQL split brain situation. I'm not sure how PgPool's watchdog would
> avoid that.

OK, "split brain" here means that two or more PostgreSQL
primary servers exist.

Pgpool-II's watchdog has a feature called "quorum failover" to avoid
the situation. To make this work, you need to configure 3 or more
Pgpool-II nodes. Suppose they are w0, w1 and w2. Also suppose there
are two PostgreSQL servers, pg0 (primary) and pg1 (standby). The goal
is to prevent both pg0 and pg1 from becoming primary servers.

Pgpool-II periodically monitors PostgreSQL health by checking
whether it can reach the PostgreSQL servers. Suppose w0 and w1
detect that pg0 is healthy but pg1 is not, while w2 sees the opposite,
i.e. pg0 is unhealthy but pg1 is healthy (this could happen if w0, w1,
and pg0 are in network A, while w2 and pg1 are in a different network B,
and A and B cannot reach each other).

In this situation if w2 promotes pg1 because w0 seems to be down, then
the system ends up with two primary servers: split brain.

With quorum failover enabled, w0, w1, and w2 communicate with each other
to vote on who is correct (if a watchdog cannot communicate, it regards
the other watchdogs as down). In the case above, w0 and w1 are the
majority and will win. Thus w0 and w1 just detach pg1 and keep on using
pg0 as the primary. On the other hand, since w2 loses, it gives up
promoting pg1, and the split brain is avoided.

Note that in the configuration above, clients access the cluster via a
VIP. The VIP is always controlled by the majority watchdogs, so clients
will not access pg1, because it is set to down status by w0 and w1.
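
As a sketch, the pgpool.conf settings involved might look like the
following (parameter names as in recent Pgpool-II watchdog documentation;
please verify against your version):

failover_when_quorum_exists = on
failover_require_consensus = on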

> To avoid split brain, you need to implement a combinaison of quorum and
> (self-)fencing.
> 
> Patroni quorum is in the DCS's hands. Patroni's self-fencing can be achieved
> with the (hardware) watchdog. You can also implement node fencing through the
> "pre_promote" script to fence the old primary node before promoting the new 
> one.
> 
> If you need HA with a high level of anti-split-brain security, you'll not be
> able to avoid some sort of fencing, no matter what.
> 
> Good luck.

Well, if you define fencing as STONITH (Shoot The Other Node In The
Head), Pgpool-II does not have that feature. However, I am not sure
STONITH is always mandatory; that depends on what you want to avoid by
using fencing. If the purpose is to avoid having two primary servers
at the same time, Pgpool-II achieves that as described above.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: Patroni vs pgpool II

2023-04-06 Thread Ron

On 4/6/23 23:16, Tatsuo Ishii wrote:

But, I heard PgPool is still affected by Split brain syndrome.

Can you elaborate more? If more than 3 pgpool watchdog nodes (the
number of nodes must be odd) are configured, a split brain can be
avoided.

Split brain is a hard situation to avoid. I suppose the OP is talking about
a PostgreSQL split-brain situation. I'm not sure how PgPool's watchdog would
avoid that.

OK, "split brain" here means that two or more PostgreSQL
primary servers exist.

Pgpool-II's watchdog has a feature called "quorum failover" to avoid
the situation. To make this work, you need to configure 3 or more
Pgpool-II nodes. Suppose they are w0, w1 and w2. Also suppose there
are two PostgreSQL servers, pg0 (primary) and pg1 (standby). The goal
is to prevent both pg0 and pg1 from becoming primary servers.

Pgpool-II periodically monitors PostgreSQL health by checking
whether it can reach the PostgreSQL servers. Suppose w0 and w1
detect that pg0 is healthy but pg1 is not, while w2 sees the opposite,
i.e. pg0 is unhealthy but pg1 is healthy (this could happen if w0, w1,
and pg0 are in network A, while w2 and pg1 are in a different network B,
and A and B cannot reach each other).

In this situation if w2 promotes pg1 because w0 seems to be down, then
the system ends up with two primary servers: split brain.

With quorum failover enabled, w0, w1, and w2 communicate with each other
to vote on who is correct (if a watchdog cannot communicate, it regards
the other watchdogs as down). In the case above, w0 and w1 are the
majority and will win. Thus w0 and w1 just detach pg1 and keep on using
pg0 as the primary. On the other hand, since w2 loses, it gives up
promoting pg1, and the split brain is avoided.

Note that in the configuration above, clients access the cluster via a
VIP. The VIP is always controlled by the majority watchdogs, so clients
will not access pg1, because it is set to down status by w0 and w1.


And this concept is quite old.  (It's also what Windows clustering uses.)

--
Born in Arizona, moved to Babylonia.




Re: Patroni vs pgpool II

2023-04-06 Thread Nikolay Samokhvalov
On Thu, Apr 6, 2023 at 9:17 PM Tatsuo Ishii  wrote:
> With quorum failover is enabled, w0, w1, and w2 communicate each other
> to vote who is correct (if it cannot communicate, it regards other
> watchdog is down). In the case above w0 and w1 are majority and will
> win.

Communication takes time – network latencies. What if during this
communication, the situation becomes different?

What if some of them cannot communicate with each other due to network issues?

What if pg1 is currently primary, pg0 is standby, both are healthy, but
due to network issues, both pg1 and w2 are unreachable from the other
nodes? Will pg1 remain primary, and will w0 and w1 decide to promote pg0?




Re: Patroni vs pgpool II

2023-04-06 Thread Tatsuo Ishii
> Communication takes time – network latencies. What if during this
> communication, the situation becomes different?

We have to accept it (and do our best to mitigate any consequences of
the problem). I think there is no system that presupposes zero
communication latency.

> What if some of them cannot communicate with each other due to network issues?

Can you elaborate more? There are many scenarios for communication
breakdown. I hesitate to discuss all of them on this forum, since it
is for discussions on PostgreSQL, not Pgpool-II. You are welcome to
join and continue the discussion on the pgpool mailing list.

> What if pg1 is currently primary, pg0 is standby, both are healthy, but
> due to network issues, both pg1 and w2 are unreachable from the other
> nodes? Will pg1 remain primary, and will w0 and w1 decide to promote pg0?

pg1 will remain primary but it is set to "quarantine" state from
pgpool's point of view, which means clients cannot access pg1 via
pgpool.

w0 and w1 will decide to promote pg0.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Patroni vs pgpool II

2023-04-06 Thread Nikolay Samokhvalov
On Thu, Apr 6, 2023 at 11:13 PM Tatsuo Ishii  wrote:
> You are welcome to
> join and continue the discussion on the pgpool mailing list.

I truly believe that this problem – HA – is PostgreSQL's, not 3rd
party's. And it's a shame that Postgres itself doesn't solve this. So
we're discussing it here.


> > What if pg1 is currently primary, pg0 is standby, both are healthy, but
> > due to network issues, both pg1 and w2 are unreachable from the other
> > nodes? Will pg1 remain primary, and will w0 and w1 decide to promote pg0?
>
> pg1 will remain primary but it is set to "quarantine" state from
> pgpool's point of view, which means clients cannot access pg1 via
> pgpool.

So we have a split brain here – two primaries. Especially if some
clients communicate with PG directly. And even if there are no such
clients, archive_command is going to run on both nodes; monitoring will
show two primaries, confusing humans (e.g., SREs) and various systems;
if we have many standby nodes, some of them might continue replicating
from the old primary if they happen to be in the same network partition;
and so on. I don't see how all these things can be solved with this
approach.