RE: Adding percentile metrics to pg_stat_statements module

2023-06-05 Thread benoit
Hello


While looking at slow queries in pg_stat_statements, I was looking for
percentile fields.


If we are worried about CPU cost, maybe it could be useful to turn it on
only when a query has a high stddev_exec_time?


Regards,


From: Adrien Nayrat
Sent: Saturday, November 2, 2019 10:23:49
To: Tomas Vondra; Igor Calabria
Cc: pgsql-hack...@postgresql.org
Subject: Re: Adding percentile metrics to pg_stat_statements module

On 10/31/19 8:32 PM, Tomas Vondra wrote:
> IMO having some sort of CDF approximation (being a q-digest or t-digest)
> would be useful, although it'd probably need to be optional (mostly
> because of memory consumption).

+1, I like this idea. If we are afraid of CPU cost we could imagine some kind of
sampling or add the possibility to collect only for a specific queryid.

I have dreamed of this kind of feature for PoWA.  It would make it possible
to compare CDFs between two days, for example before and after introducing a
change.
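As an illustration of the kind of bounded-memory distribution tracking being discussed (everything below is my own sketch using plain reservoir sampling as a much simpler stand-in for a q-digest or t-digest; it is not pg_stat_statements code), percentiles can be approximated from a fixed-size sample of execution times:

```python
import random

class ReservoirPercentiles:
    """Fixed-memory percentile approximation via reservoir sampling.

    Illustrative only: a q-digest/t-digest would give tighter tail
    accuracy, but the memory/CPU trade-off is the same in spirit.
    """

    def __init__(self, capacity=1000, seed=42):
        self.capacity = capacity
        self.samples = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, exec_time_ms):
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(exec_time_ms)
        else:
            # Classic reservoir sampling: each observed value ends up in
            # the reservoir with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.samples[j] = exec_time_ms

    def percentile(self, p):
        ordered = sorted(self.samples)
        idx = min(len(ordered) - 1, int(p / 100.0 * len(ordered)))
        return ordered[idx]
```

With 10,000 uniformly distributed execution times, the p50 and p95 estimates land close to the true quantiles while only 1,000 samples are kept in memory, which is the trade-off the thread is weighing against exact per-query histograms.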

Regards,

--
Adrien NAYRAT







pg_shmem_allocations & documentation

2020-12-10 Thread Benoit Lobréau
Hi,

While reading the documentation of pg_shmem_allocations, I noticed that the
off column is described as follows:

"The offset at which the allocation starts. NULL for anonymous allocations
and unused memory."

However, the view returns a value for unused memory:

[local]:5433 postgres@postgres=# SELECT * FROM pg_shmem_allocations WHERE
name IS NULL;
 name |    off    |  size   | allocated_size
------+-----------+---------+----------------
 ¤    | 178095232 | 1923968 |        1923968
(1 row)

From what I understand, the doc is wrong.
Am I right?

Benoit

[1] https://www.postgresql.org/docs/13/view-pg-shmem-allocations.html
[2]
https://www.postgresql.org/message-id/flat/20140504114417.GM12715%40awork2.anarazel.de
(original thread)


Re: pg_shmem_allocations & documentation

2020-12-11 Thread Benoit Lobréau
Would "NULL for anonymous allocations, since details related to them are
not known." be ok?


On Fri, Dec 11, 2020 at 09:29, Kyotaro Horiguchi wrote:

> At Fri, 11 Dec 2020 14:42:45 +0900, Michael Paquier wrote in
> > On Fri, Dec 11, 2020 at 11:00:58AM +0900, Kyotaro Horiguchi wrote:
> > > Although we could just rip some words off, I'd like to propose instead
> > > to add an explanation why it is not exposed for anonymous allocations,
> > > like the column allocated_size.
> >
> > Indeed, there is a hiccup between what the code does and what the docs
> > tell: the offset is not NULL for unused memory.
> >
> > > -   The offset at which the allocation starts. NULL for anonymous
> > > -   allocations and unused memory.
> > > +   The offset at which the allocation starts. For anonymous
> allocations,
> > > +   no information about individual allocations is available, so
> the column
> > > +   will be NULL in that case.
> >
> > I'd say: let's be simple and just remove "and unused memory" because
> > anonymous allocations are...  Anonymous so you cannot know details
> > related to them.  That's something easy to reason about, and the docs
> > were written originally to remain simple.
>
> Hmm. I don't object to that.  However, isn't the description for
> allocated_size too verbose in that sense?
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>


Re: pg_shmem_allocations & documentation

2020-12-14 Thread Benoit Lobréau
Here's a proposal patch.

On Fri, Dec 11, 2020 at 09:58, Benoit Lobréau wrote:

> Would "NULL for anonymous allocations, since details related to them are
> not known." be ok?
>
>
> On Fri, Dec 11, 2020 at 09:29, Kyotaro Horiguchi wrote:
>
>> At Fri, 11 Dec 2020 14:42:45 +0900, Michael Paquier wrote in
>> > On Fri, Dec 11, 2020 at 11:00:58AM +0900, Kyotaro Horiguchi wrote:
>> > > Although we could just rip some words off, I'd like to propose instead
>> > > to add an explanation why it is not exposed for anonymous allocations,
>> > > like the column allocated_size.
>> >
>> > Indeed, there is a hiccup between what the code does and what the docs
>> > tell: the offset is not NULL for unused memory.
>> >
>> > > -   The offset at which the allocation starts. NULL for anonymous
>> > > -   allocations and unused memory.
>> > > +   The offset at which the allocation starts. For anonymous
>> allocations,
>> > > +   no information about individual allocations is available, so
>> the column
>> > > +   will be NULL in that case.
>> >
>> > I'd say: let's be simple and just remove "and unused memory" because
>> > anonymous allocations are...  Anonymous so you cannot know details
>> > related to them.  That's something easy to reason about, and the docs
>> > were written originally to remain simple.
>>
>> Hmm. I don't object to that.  However, isn't the description for
>> allocated_size too verbose in that sense?
>>
>> regards.
>>
>> --
>> Kyotaro Horiguchi
>> NTT Open Source Software Center
>>
>
From 711f6b60098e67c23a97458ce56893f0ac1afcb6 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Fri, 11 Dec 2020 09:44:51 +0100
Subject: [PATCH] Fix pg_shmem_allocation

---
 doc/src/sgml/catalogs.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 62711ee83f..89ca59b92b 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -12493,7 +12493,7 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
   
   
The offset at which the allocation starts. NULL for anonymous
-   allocations and unused memory.
+   allocations, since details related to them are not known.
   
  
 
-- 
2.25.4



recovery_target_timeline & documentation

2021-01-05 Thread Benoit Lobréau
Hi,

It seems like this part of the documentation was not updated after the
default value of recovery_target_timeline was changed to latest in v12.

"The default behavior of recovery is to recover along the same timeline
that was current when the base backup was taken. If you wish to recover
into some child timeline (that is, you want to return to some state that
was itself generated after a recovery attempt), you need to specify the
target timeline ID in recovery_target_timeline
<https://www.postgresql.org/docs/13/runtime-config-wal.html#GUC-RECOVERY-TARGET-TIMELINE>.
You cannot recover into timelines that branched off earlier than the base
backup." [1][2]

Here is an attempt to fix that.

regards,
Benoit

[1]
https://www.postgresql.org/docs/13/continuous-archiving.html#BACKUP-TIMELINES
[2]
https://www.postgresql.org/docs/12/continuous-archiving.html#BACKUP-TIMELINES
From 21bbc9badfd5bd2eeb2c702295ef9e8233ed9552 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Tue, 5 Jan 2021 11:00:07 +0100
Subject: [PATCH] Fix recovery_target_timeline default in backup.sgml

---
 doc/src/sgml/backup.sgml | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 42a8ed328d..3c8aaed0b6 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1437,12 +1437,13 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'

 

-The default behavior of recovery is to recover along the same timeline
-that was current when the base backup was taken.  If you wish to recover
-into some child timeline (that is, you want to return to some state that
-was itself generated after a recovery attempt), you need to specify the
-target timeline ID in . You cannot recover into
-timelines that branched off earlier than the base backup.
+The default behavior of recovery is to recover to the latest timeline found
+in the archive. If you wish to recover to the timeline that was current
+when the base backup was taken or into a specific child timeline (that
+is, you want to return to some state that was itself generated after a
+recovery attempt), you need to specify current or the
+target timeline ID in . You
+cannot recover into timelines that branched off earlier than the base backup.

   
 
-- 
2.25.4



Re: recovery_target_timeline & documentation

2021-01-06 Thread Benoit Lobréau
> Pushed. Thanks!
>

Thank you !


Re: archive_command / pg_stat_archiver & documentation

2021-02-25 Thread Benoit Lobréau
On Wed, Feb 24, 2021 at 14:52, Julien Rouhaud wrote:

> Hi,
>
> On Wed, Feb 24, 2021 at 8:21 PM talk to ben  wrote:
> >
> > The documentation describes how a return code > 125 on the
> > restore_command would prevent the server from starting [1]:
> >
> > "
> > It is important that the command return nonzero exit status on failure.
> > The command will be called requesting files that are not present in the
> > archive; it must return nonzero when so asked. This is not an error
> > condition. An exception is that if the command was terminated by a signal
> > (other than SIGTERM, which is used as part of a database server shutdown)
> > or an error by the shell (such as command not found), then recovery will
> > abort and the server will not start up.
> > "
> >
> > But I don't see such a note on the archive_command side of things. [2]
> >
> > It could happen if the archive command is not checked beforehand
> > or if the archive command becomes unavailable while PostgreSQL is running.
> > rsync can also return 255 in some cases (bad ssh configuration or typos).
> > In this case a fatal error is emitted, and the archiver stops and is
> > restarted by the postmaster.
> >
> > The view pg_stat_archiver is also not updated in this case. Is it on
> > purpose? It could be problematic if someone uses it to check the archiver
> > process health.
>
> That's on purpose, see for instance that discussion:
> https://www.postgresql.org/message-id/flat/55731BB8.1050605%40dalibo.com
>

Thanks for pointing that out, I should have checked.


> > Should we document this ? (I can make a patch)
>
> I thought that this behavior was documented, especially for the lack
> of update of pg_stat_archiver.  If it's not the case then we should
> definitely fix that!
>

I tried to do it in the attached patch.
Building the doc worked fine on my computer.
From 350cd7c47d09754ae21f30f260a86e187054257f Mon Sep 17 00:00:00 2001
From: benoit 
Date: Thu, 25 Feb 2021 12:08:03 +0100
Subject: [PATCH] Document archive_command failures in more details

Document that, if the command was terminated by a signal (other than SIGTERM, which
is used as part of a database server shutdown) or an error by the shell with an exit
status greater than 125 (such as command not found), then the archiver process will
abort and the postmaster will restart it. In such cases, the failure will not be
reported in pg_stat_archiver.
---
 doc/src/sgml/backup.sgml | 8 +++-
 doc/src/sgml/monitoring.sgml | 3 ++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 3c8aaed0b6..94d5dcbdf0 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -636,7 +636,13 @@ test ! -f /mnt/server/archivedir/000100A90065 && cp pg_wal/0
 PostgreSQL will assume that the file has been
 successfully archived, and will remove or recycle it.  However, a nonzero
 status tells PostgreSQL that the file was not archived;
-it will try again periodically until it succeeds.
+it will try again periodically until it succeeds. 
+An exception is that if the command was terminated by
+a signal (other than SIGTERM, which is used as
+part of a database server shutdown) or an error by the shell with an exit
+status greater than 125 (such as command not found), then the archiver
+process will abort and the postmaster will restart it. In such cases,
+the failure will not be reported in .

 

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3513e127b7..391df3055b 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3251,7 +3251,8 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
failed_count bigint
   
   
-   Number of failed attempts for archiving WAL files
+  Number of failed attempts for archiving WAL files (See )
   
  
 
-- 
2.25.4
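The signal and exit-status rules documented in the patch above can be sketched as follows. This is illustrative Python that models the documented behavior; the function and return labels are my own, not PostgreSQL source:

```python
import os
import signal

def classify_archive_result(wait_status):
    """Classify a raw wait()-style status the way the docs above describe
    the archiver reacting. Names are illustrative, not PostgreSQL's."""
    if os.WIFSIGNALED(wait_status):
        if os.WTERMSIG(wait_status) == signal.SIGTERM:
            return "shutdown"   # part of a normal database server shutdown
        return "fatal"          # archiver aborts; postmaster restarts it;
                                # the failure is NOT counted in pg_stat_archiver
    code = os.WEXITSTATUS(wait_status)
    if code == 0:
        return "archived"       # file assumed successfully archived
    if code > 125:
        return "fatal"          # e.g. shell "command not found" returns 127
    return "retry"              # ordinary failure: retried periodically and
                                # counted in pg_stat_archiver.failed_count
```

The status encoding follows the POSIX convention (signal number in the low bits, exit code shifted into the high byte), which is what `os.WIFSIGNALED`/`os.WEXITSTATUS` decode.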



Re: archive_command / pg_stat_archiver & documentation

2021-02-26 Thread Benoit Lobréau
Done here : https://commitfest.postgresql.org/32/3012/

On Thu, Feb 25, 2021 at 15:34, Julien Rouhaud wrote:

> On Thu, Feb 25, 2021 at 7:25 PM Benoit Lobréau wrote:
> >
> > On Wed, Feb 24, 2021 at 14:52, Julien Rouhaud wrote:
> >>
> >> I thought that this behavior was documented, especially for the lack
> >> of update of pg_stat_archiver.  If it's not the case then we should
> >> definitely fix that!
> >
> > I tried to do it in the attached patch.
> > Building the doc worked fine on my computer.
>
> Great, thanks!  Can you register it in the next commitfest to make
> sure it won't be forgotten?
>


Re: archive_command / pg_stat_archiver & documentation

2021-03-01 Thread Benoit Lobréau
On Mon, Mar 1, 2021 at 08:36, Michael Paquier wrote:

> On Fri, Feb 26, 2021 at 10:03:05AM +0100, Benoit Lobréau wrote:
> > Done here : https://commitfest.postgresql.org/32/3012/
>
> Documenting that properly for the archive command, as already done for
> restore_command, sounds good to me.  I am not sure that there is much
> point in doing a cross-reference to the archiving section for one
> specific field of pg_stat_archiver.
>

I wanted to add a warning, in the view documentation itself, that using
pg_stat_archiver to monitor the health of the archiver comes with a caveat.
But I couldn't find a concise way to do it, so I added a link.

If you think it's unnecessary, that's ok.


> For the second paragraph, I would recommend to move that to a
> different  to outline this special case, leading to the
> attached.
>

Good.


Re: archive_command / pg_stat_archiver & documentation

2021-03-01 Thread Benoit Lobréau
I like the idea !

If it's not too complicated, I'd like to take a stab at it.

On Mon, Mar 1, 2021 at 10:16, Julien Rouhaud wrote:

> On Mon, Mar 1, 2021 at 4:33 PM Benoit Lobréau wrote:
> >
> > On Mon, Mar 1, 2021 at 08:36, Michael Paquier wrote:
> >>
> >> On Fri, Feb 26, 2021 at 10:03:05AM +0100, Benoit Lobréau wrote:
> >> > Done here : https://commitfest.postgresql.org/32/3012/
> >>
> >> Documenting that properly for the archive command, as already done for
> >> restore_command, sounds good to me.  I am not sure that there is much
> >> point in doing a cross-reference to the archiving section for one
> >> specific field of pg_stat_archiver.
> >
> >
> > I wanted to add a warning that using pg_stat_archiver to monitor the
> good health of the
> > archiver comes with a caveat in the view documentation itself. But
> couldn't find a concise
> > way to do it. So I added a link.
> >
> > If you think it's unnecessary, that's ok.
>
> Maybe this can be better addressed than with a link in the
> documentation.  The final outcome is that it can be difficult to
> monitor the archiver state in such case.  That's orthogonal to this
> patch but maybe we can add a new "archiver_start" timestamptz column
> in pg_stat_archiver, so monitoring tools can detect a problem if it's
> too far away from pg_postmaster_start_time() for instance?
>


Re: archive_command / pg_stat_archiver & documentation

2021-03-02 Thread Benoit Lobréau
Thanks !

On Tue, Mar 2, 2021 at 04:10, Julien Rouhaud wrote:

> On Tue, Mar 2, 2021 at 9:29 AM Michael Paquier wrote:
> >
> > On Mon, Mar 01, 2021 at 05:17:06PM +0800, Julien Rouhaud wrote:
> > > Maybe this can be better addressed than with a link in the
> > > documentation.  The final outcome is that it can be difficult to
> > > monitor the archiver state in such case.  That's orthogonal to this
> > > patch but maybe we can add a new "archiver_start" timestamptz column
> > > in pg_stat_archiver, so monitoring tools can detect a problem if it's
> > > too far away from pg_postmaster_start_time() for instance?
> >
> > There may be other solutions as well.  I have applied the doc patch
> > for now.
>
> Thanks!
>


Re: Probable memory leak with ECPG and AIX

2021-12-15 Thread Benoit Lobréau
Our client confirmed that the more he fetches, the more memory is consumed.
The segfault was indeed caused by the absence of LDR_CNTRL.

The tests show that:

* without LDR_CNTRL, we reach 256MB and segfault;
* with LDR_CNTRL=MAXDATA=0x1000, we reach 256MB but there is no
segfault, the program just continues running;
* with LDR_CNTRL=MAXDATA=0x8000, we reach 2GB and there is no segfault
either, the program just continues running.


Re: Logging parallel worker draught

2024-04-08 Thread Benoit Lobréau

On 4/8/24 10:05, Andrey M. Borodin wrote:

> Hi Benoit!
>
> This is a kind reminder that this thread is waiting for your response.
> CF entry [0] is in "Waiting on Author"; I'll move it to the July CF.


Hi, thanks for the reminder.

The past month has been hectic for me.
It should calm down by next week, at which point I'll have time to go back
to this. Sorry for the delay.


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Logging parallel worker draught

2024-02-27 Thread Benoit Lobréau

On 2/25/24 20:13, Tomas Vondra wrote:
> 1) name of the GUC
...
> 2) logging just the failures provides an incomplete view
> log_parallel_workers = {none | failures | all}>
> where "failures" only logs when at least one worker fails to start, and
> "all" logs everything.
>
> AFAIK Sami made the same observation/argument in his last message [1].

I like the name and the different levels you propose. I was initially
thinking that the overall picture would be better summarized (and easier
to implement) with pg_stat_statements. But with the granularity you
propose, the choice is in the hands of the DBA, which is great and
provides more options when we don't have full control of the installation.


> 3) node-level or query-level?
...
> There's no value in having information about every individual temp file,
> because we don't even know which node produced it. It'd be perfectly
> sufficient to log some summary statistics (number of files, total space,
> ...), either for the query as a whole, or perhaps per node (because
> that's what matters for work_mem). So I don't think we should mimic this
> just because log_temp_files does this.

I must admit that, given my (poor) technical level, I went for the 
easiest way.


If we go for the three levels you proposed, "node-level" makes even less 
sense (and has great "potential" for spam).


I can see one downside to the "query-level" approach: it might be more
difficult to give information in cases where the query doesn't end
normally. It's sometimes useful to have hints about what was going wrong
before a query was cancelled or crashed, which log_temp_files kinda does.


> I don't know how difficult would it be to track/collect this information
> for the whole query.

I am worried this will be a little too much for me, given the time and
the knowledge gap I have (both in C and PostgreSQL internals). I'll try
to look anyway.


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Logging parallel worker draught

2024-02-27 Thread Benoit Lobréau




On 2/25/24 23:32, Peter Smith wrote:

> Also, I don't understand how the word "draught" (aka "draft") makes
> sense here -- I assume the intended word was "drought" (???).


yes, that was the intent, sorry about that. English is not my native
language and I was convinced the spelling was correct.


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Logging parallel worker draught

2024-02-29 Thread Benoit Lobréau
On 2/27/24 15:09, Tomas Vondra wrote:
> That is certainly true, but it's not a new thing, I believe. IIRC we may

> not report statistics until the end of the transaction, so no progress
> updates, I'm not sure what happens if the query doesn't end correctly (e.g.
> backend dies, ...). Similarly for the temporary files, we don't report
> those until the temporary file gets closed, so I'm not sure you'd get a
> lot of info about the progress either.
>
> I admit I haven't tried and maybe I remember the details wrong.
> Might be useful to experiment with this first a little bit ...

Ah, yes, temp file usage is reported on temp file closure, or on deletion
for shared file sets.

> I certainly understand that, and I realize it may feel daunting and even
> discouraging. What I can promise is that I'm willing to help, either by
> suggesting ways to approach the problems, doing reviews, and so on.
> Would that be sufficient for you to continue working on this patch?

Yes, thanks for the proposal. I'll work on it and report here. I am
otherwise booked on company projects, so I cannot promise quick progress.

> Some random thoughts/ideas about how to approach this:
>
> - For parallel workers I think this might be as simple as adding some
> counters into QueryDesc, and update those during query exec (instead of
> just logging stuff directly). I'm not sure if it should be added to
> "Instrumentation" or separately.

I will start here to see how it works.

--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

2024-03-07 Thread Benoit Tigeot
Hello,

I am not up to date with the latest version of the patch, but I did a regular
benchmark with version 11 on a typical issue we have at the moment, and the
results are still very, very good. [1]

In terms of performance improvement, the latest proposals could be a real game
changer for two of our biggest databases. We hope that Postgres 17 will
contain those improvements.

Kind regards,

Benoit

[1] -
https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04d?permalink_comment_id=4972955#gistcomment-4972955
--
Benoit Tigeot



On Thu, Mar 7, 2024 at 15:36, Peter Geoghegan wrote:

> On Tue, Jan 23, 2024 at 3:22 PM Peter Geoghegan  wrote:
> > I could include something less verbose, mentioning a theoretical risk
> > to out-of-core amcanorder routines that coevolved with nbtree,
> > inherited the same SAOP limitations, and then never got the same set
> > of fixes.
>
> Attached is v11, which now says something like that in the commit
> message. Other changes:
>
> * Fixed buggy sorting of arrays using cross-type ORDER procs, by
> recognizing that we need to consistently use same-type ORDER procs for
> sorting and merging the arrays during array preprocessing.
>
> Obviously, when we sort, we compare array elements to other array
> elements (all of the same type). This is true independent of whether
> the query itself happens to use a cross type operator/ORDER proc, so
> we will need to do two separate ORDER proc lookups in cross-type
> scenarios.
>
> * No longer subscript the ORDER proc used for array binary searches
> using a scankey subscript. Now there is an additional indirection that
> works even in the presence of multiple redundant scan keys that cannot
> be detected as such due to a lack of appropriate cross-type support
> within an opfamily.
>
> This was subtly buggy before now. Requires a little more coordination
> between array preprocessing and standard/primitive index scan
> preprocessing, which isn't ideal but seems unavoidable.
>
> * Lots of code polishing, especially within _bt_advance_array_keys().
>
> While _bt_advance_array_keys() still works in pretty much exactly the
> same way as it did back in v10, there are now better comments.
> Including something about why its recursive call to itself is
> guaranteed to use a low, fixed amount of stack space, verified using
> an assertion. That addresses a concern held by Matthias.
>
> Outlook
> ===
>
> This patch is approaching being committable now. Current plan is to
> commit this within the next few weeks.
>
> All that really remains now is to research how we might integrate this
> work with the recently added continuescanPrechecked/haveFirstMatch
> stuff from Alexander Korotkov, if at all. I've put that off until now
> because it isn't particularly fundamental to what I'm doing here, and
> seems optional.
>
> I would also like to do more performance validation. Things like the
> parallel index scan code could stand to be revisited once again. Plus
> I should think about the overhead of array preprocessing when
> btrescan() is called many times, from a nested loop join -- I should
> have something to say about that concern (raised by Heikki at one
> point) before too long.
>
> --
> Peter Geoghegan
>


Parallel Query Stats

2023-04-05 Thread Benoit Lobréau


# pg_stat_all_tables and pg_stat_all_indexes

We could add a parallel_seq_scan counter to pg_stat_all_tables. The column
would be incremented for each worker participating in a scan. The leader
would also increment the counter if it is participating.

The same thing could be done to pg_stat_all_indexes with a 
parallel_index_scan

column.

These metrics could be used in relation to system stats and other PostgreSQL
metrics such as pg_statio_* in tools like POWA.

# Workflow

An overview of the background worker usage could be viewed via the
pg_stat_bgworker view. It could help detect, and in some cases explain,
parallel worker draughts. It would also help adapt the size of the worker
pools and prompt us to look into the logs or pg_stat_statements.

The statistics gathered in pg_stat_statements can be used the usual way:
* have an idea of the parallel query usage on the server;
* detect queries that starve from lack of parallel workers;
* compare snapshots to see the impact of parameter modifications;
* combine the statistics with other sources to know:
  * if the decrease in parallel workers had an impact on the average
    execution duration;
  * if the increase in parallel worker allocation had an impact on the
    system time;
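The snapshot comparison mentioned above can be sketched as follows. This is a hypothetical helper: counters in pg_stat_statements are cumulative, so subtracting two snapshots gives the activity in the interval (the field names used here are illustrative, not the view's actual columns):

```python
def snapshot_delta(before, after):
    """Per-query deltas between two cumulative counter snapshots,
    keyed by queryid. Field names are illustrative."""
    delta = {}
    for queryid, new_counters in after.items():
        # Queries first seen after the first snapshot start from zero.
        old_counters = before.get(queryid, {k: 0 for k in new_counters})
        delta[queryid] = {k: new_counters[k] - old_counters[k]
                          for k in new_counters}
    return delta
```

Taking one snapshot before and one after a parameter change (say, raising max_parallel_workers) and diffing them is the kind of workflow described above.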

The logs can be used to pinpoint specific queries with their parameters, or
to get global statistics when pg_stat_statements is not available or can't
be used.

Once a query is singled out, it can be analysed as usual with EXPLAIN to
determine:
* if the lack of workers is a problem;
* how parallelism helps in this particular case.

Finally, the per-relation statistics could be combined with system and other
PostgreSQL metrics to identify why the storage is stressed.


If you reached this point, thank you for reading!

Many thanks to Melanie Plageman for the pointers she shared with us around
the pgsessions in Paris, and for her time in general.

--
Benoit Lobréau
Consultant
http://dalibo.com




Logging parallel worker draught

2023-04-21 Thread Benoit Lobréau

Hi hackers,

Following my previous mail about adding stats on parallelism [1], this
patch introduces the log_parallel_worker_draught parameter, which
controls whether a log message is produced when a backend attempts to
spawn a parallel worker but fails due to insufficient worker slots. The
shortage can stem from insufficient settings for max_worker_processes,
max_parallel_workers or max_parallel_maintenance_workers. It could also
be caused by another pool (max_logical_replication_workers) or an
extension using bgworker slots. This new parameter can help database
administrators and developers diagnose performance issues related to
parallelism and optimize the configuration of the system accordingly.


Here is a test script:

```
psql << _EOF_

SET log_parallel_worker_draught TO on;

-- Index creation
DROP TABLE IF EXISTS test_pql;
CREATE TABLE test_pql(i int, j int);
INSERT INTO test_pql SELECT x,x FROM generate_series(1,100) as F(x);

SET max_parallel_workers TO 0;

CREATE INDEX ON test_pql(i);
REINDEX TABLE test_pql;

RESET max_parallel_workers;

-- VACUUM
CREATE INDEX ON test_pql(j);
CREATE INDEX ON test_pql(i,j);
ALTER TABLE test_pql SET (autovacuum_enabled = off);
DELETE FROM test_pql WHERE i BETWEEN 1000 AND 2000;

SET max_parallel_workers TO 1;

VACUUM (PARALLEL 2, VERBOSE) test_pql;

RESET max_parallel_workers;

-- SELECT
SET min_parallel_table_scan_size TO 0;
SET parallel_setup_cost TO 0;
SET max_parallel_workers TO 1;

EXPLAIN (ANALYZE) SELECT i, avg(j) FROM test_pql GROUP BY i;

_EOF_
```

Which produces the following logs:

```
LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 1
STATEMENT:  CREATE INDEX ON test_pql(i);

LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 1
STATEMENT:  REINDEX TABLE test_pql;

LOG:  Parallel Worker draught during statement execution: workers spawned 1, requested 2
CONTEXT:  while scanning relation "public.test_pql"
STATEMENT:  VACUUM (PARALLEL 2, VERBOSE) test_pql;

LOG:  Parallel Worker draught during statement execution: workers spawned 1, requested 2
STATEMENT:  EXPLAIN (ANALYZE) SELECT i, avg(j) FROM test_pql GROUP BY i;
```
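As a sketch of how such log entries could be exploited afterwards (a hypothetical helper I wrote for illustration, not part of the patch), the worker shortfall can be tallied from the server log:

```python
import re

# Matches the log message emitted by the patch above.
DRAUGHT_RE = re.compile(
    r"Parallel Worker draught during statement execution: "
    r"workers spawned (\d+), requested (\d+)"
)

def tally_draughts(log_lines):
    """Return (number of draught events, total missing workers)."""
    events = 0
    missing = 0
    for line in log_lines:
        m = DRAUGHT_RE.search(line)
        if m:
            spawned, requested = int(m.group(1)), int(m.group(2))
            events += 1
            missing += requested - spawned
    return events, missing
```

Run over the log excerpt above, this would report four events with a total shortfall of four workers, which is the kind of global statistic the thread suggests extracting when pg_stat_statements is not available.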

[1] 
https://www.postgresql.org/message-id/d657df20-c4bf-63f6-e74c-cb85a81d0...@dalibo.com


--
Benoit Lobréau
Consultant
http://dalibo.comFrom f7d3dae918df2dbbe7fd18cf48f7ca39d5716c73 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Fri, 3 Mar 2023 10:47:50 +0100
Subject: [PATCH] Add logging for exceeded parallel worker slot limits

Introduce the log_parallel_worker_draught parameter, which controls
whether a log message is produced when a backend attempts to spawn a
parallel worker but fails due to insufficient worker slots. The shortage
can stem from max_worker_processes, max_parallel_worker, or
max_parallel_maintenance_workers. This new parameter can help database
administrators and developers diagnose performance issues related to
parallelism and optimize the configuration of the system accordingly.
---
 doc/src/sgml/config.sgml  | 16 
 src/backend/access/transam/parallel.c |  6 ++
 src/backend/utils/misc/guc_tables.c   | 10 ++
 src/backend/utils/misc/postgresql.conf.sample |  2 ++
 src/include/access/parallel.h |  1 +
 5 files changed, 35 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bcc49aec45..0fe0c7796e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7499,6 +7499,22 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
  
 
+ 
+  log_parallel_worker_draught (boolean)
+  
+   log_parallel_worker_draught configuration parameter
+  
+  
+  
+   
+Controls whether a log message is produced when a backend tries to
+spawn a parallel worker but is not successful because either
+max_worker_processes, max_parallel_worker
+or max_parallel_maintenance_workers is reached.
+   
+  
+ 
+
  
   log_parameter_max_length (integer)
   
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index b26f2a64fb..d86ba5a838 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -630,6 +630,12 @@ LaunchParallelWorkers(ParallelContext *pcxt)
pcxt->nknown_attached_workers = 0;
}
 
+   if (log_parallel_worker_draught &&
+   pcxt->nworkers_launched < pcxt->nworkers_to_launch)
+   ereport(LOG,
+   (errmsg("Parallel Worker draught during statement execution: workers spawned %d, requested %d",
+   pcxt->nworkers_launched, pcxt->nworkers_to_launch)));
+
/* Restore previous memory context. */
MemoryContextSwitchTo(oldcontext);
 }
diff --git a/src/backend/utils/

Re: Logging parallel worker draught

2023-04-28 Thread Benoit Lobréau

On 4/22/23 13:06, Amit Kapila wrote:

> I don't think introducing a GUC for this is a good idea. We can
> directly output this message in the server log either at LOG or DEBUG1
> level.


Hi,

Sorry for the delayed answer, I was away from my computer for a few
days. I don't mind removing the GUC, but I would like to keep it at the
LOG level. When I do audits, most clients are at that level, and setting it
to DEBUG1 would add too much logging for them in the long run.


I'll post the corresponding patch asap.

--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Logging parallel worker draught

2023-05-02 Thread Benoit Lobréau

On 5/1/23 18:33, Robert Haas wrote:
> Why not? It seems like something some people might want to log and
> others not. Running the whole server at DEBUG1 to get this information
> doesn't seem like a suitable answer.

Since the statement is also logged, it could spam the log with huge
queries, which might also be a reason to stop logging this kind of
problem until the issue is fixed.


I attached the patch without the guc anyway just in case.


What I was wondering was whether we would be better off putting this
into the statistics collector, vs. doing it via logging. Both
approaches seem to have pros and cons.


We tried to explore different options with my colleagues in another 
thread [1]. I feel like both solutions are worthwhile and would be 
helpful. I plan to take a look at the pg_stat_statements patch [2] next.


Since it's my first patch, I elected to choose the easiest solution to 
implement first. I also proposed this because I think logging can help 
pinpoint a lot of problems at a minimal cost, since it is easy to set up 
and exploit for everyone, everywhere.
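For reference, enabling the behaviour with the first version of the patch would be a one-line configuration change (log_parallel_worker_draught is the GUC proposed in that version of the patch, not a stock PostgreSQL parameter):

```
# postgresql.conf sketch -- assumes a server built with the patch from this thread
log_parallel_worker_draught = on
```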


[1] 
https://www.postgresql.org/message-id/d657df20-c4bf-63f6-e74c-cb85a81d0...@dalibo.com


[2] 
https://www.postgresql.org/message-id/flat/6acbe570-068e-bd8e-95d5-00c737b865e8%40gmail.com


--
Benoit Lobréau
Consultant
http://dalibo.comFrom fc71ef40f33f94b0506a092fb5b3dcde6de6d60a Mon Sep 17 00:00:00 2001
From: benoit 
Date: Tue, 2 May 2023 10:08:00 +0200
Subject: [PATCH] Add logging for exceeded parallel worker slot limits

Produce a log message when a backend attempts to spawn a parallel worker
but fails due to insufficient worker slots. The shortage can stem from
max_worker_processes, max_parallel_worker, or
max_parallel_maintenance_workers. The log message can help database
administrators and developers diagnose performance issues related to
parallelism and optimize the configuration of the system accordingly.
---
 src/backend/access/transam/parallel.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/backend/access/transam/parallel.c 
b/src/backend/access/transam/parallel.c
index b26f2a64fb..c60d1bd739 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -630,6 +630,11 @@ LaunchParallelWorkers(ParallelContext *pcxt)
pcxt->nknown_attached_workers = 0;
}
 
+   if (pcxt->nworkers_launched < pcxt->nworkers_to_launch)
+   ereport(LOG,
+   (errmsg("Parallel Worker draught during 
statement execution: workers spawned %d, requested %d",
+   pcxt->nworkers_launched, 
pcxt->nworkers_to_launch)));
+
/* Restore previous memory context. */
MemoryContextSwitchTo(oldcontext);
 }
-- 
2.39.2



Questions about the new subscription parameter: password_required

2023-09-21 Thread Benoit Lobréau

Hi,

I am confused about the new subscription parameter: password_required.

I have two instances. The publisher's pg_hba is configured to allow 
connections without authentication. On the subscriber, I have an 
unprivileged user with pg_create_subscription and CREATE on the database.


I tried using a superuser to create a subscription without setting the 
password_required parameter (the default is true). Then I changed the 
owner to the unprivileged user.


This user can use the subscription without limitation (including ALTER 
SUBSCRIPTION ENABLE / DISABLE). The \dRs+ metacommand shows that a 
password is required, which is not the case (or it is, but it's not 
enforced).


Is this normal? I was expecting the ALTER SUBSCRIPTION .. OWNER to fail.

When I try to drop the subscription with the unprivileged user or a 
superuser, I get an error:


ERROR:  password is required
DETAIL:  Non-superuser cannot connect if the server does not request a 
password.
HINT:  Target server's authentication method must be changed, or set 
password_required=false in the subscription parameters.


I have to re-change the subscription owner to the superuser, to be able 
to drop it.
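A minimal sketch of that workaround, using the object names from the attached session log:

```sql
-- Run as a superuser: revert ownership, then drop. password_required is
-- ignored for superuser-owned subscriptions, so the drop succeeds.
ALTER SUBSCRIPTION sub_pg16_renamed OWNER TO postgres;
DROP SUBSCRIPTION sub_pg16_renamed;
```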


(See password_required.sql and password_required.log)

I tried the same setup and changed the connection string to add an 
application_name with the unprivileged user. In this case, I am reminded 
that I need a password. I tried modifying password_required to false 
with the superuser and modifying the connection string with the unprivileged 
user again. It fails with:


HINT:  Subscriptions with the password_required option set to false may 
only be created or modified by the superuser.


I think that this part works as intended.

I tried dropping the subscription with the unprivileged user: it works. 
Is that normal (given the previous message)?


(see password_required2.sql and password_required2.log)

--
Benoit Lobréau
Consultant
http://dalibo.com

--
\c tests_pg16 postgres
You are now connected to database "tests_pg16" as user "postgres".
--
SELECT version();
 version  
--
 PostgreSQL 16.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 12.3.1 20230508 (Red Hat 12.3.1-1), 64-bit
(1 row)

--
CREATE SUBSCRIPTION sub_pg16   
   CONNECTION 'host=/var/run/postgresql port=5437 user=user_pub_pg16 dbname=tests_pg16'
   PUBLICATION pub_pg16;
psql:/home/benoit/tmp/password_required.sql:8: NOTICE:  created replication slot "sub_pg16" on publisher
CREATE SUBSCRIPTION
--
ALTER SUBSCRIPTION sub_pg16 OWNER TO sub_owner ;
ALTER SUBSCRIPTION
--
\x
Expanded display is on.
\dRs+
List of subscriptions
-[ RECORD 1 ]--+
Name   | sub_pg16
Owner  | sub_owner
Enabled| t
Publication| {pub_pg16}
Binary | f
Streaming  | off
Two-phase commit   | d
Disable on error   | f
Origin | any
Password required  | t
Run as owner?  | f
Synchronous commit | off
Conninfo   | host=/var/run/postgresql port=5437 user=user_pub_pg16 dbname=tests_pg16
Skip LSN   | 0/0

\du+
List of roles
-[ RECORD 1 ]---
Role name   | postgres
Attributes  | Superuser, Create role, Create DB, Replication, Bypass RLS
Description | 
-[ RECORD 2 ]---
Role name   | sub_owner
Attributes  | 
Description | 

\l tests_pg16
List of databases
-[ RECORD 1 ]-+--
Name  | tests_pg16
Owner | postgres
Encoding  | UTF8
Locale Provider   | libc
Collate   | C
Ctype | C
ICU Locale| 
ICU Rules | 
Access privileges | =Tc/postgres +
  | postgres=CTc/postgres+
  | sub_owner=C/postgres

\x
Expanded display is off.
--
\c - sub_owner 
You are now connected to database "tests_pg16" as user "sub_owner".
--
ALTER SUBSCRIPTION sub_pg16 DISABLE;
ALTER SUBSCRIPTION
--
ALTER SUBSCRIPTION sub_pg16 ENABLE;
ALTER SUBSCRIPTION
--
ALTER SUBSCRIPTION sub_pg16 RENAME TO sub_pg16_renamed;
ALTER SUBSCRIPTION
--
DROP SUBSCRIPTION sub_pg16_renamed ;
psql:/home/benoit/tmp/password_required.sql:26: ERROR:  password is required
DETAIL:  Non-superuser cannot connect if the server does not request a password.
HINT:  Target server's authentication method must be changed, or set password_required=false in the subscription parameters.
--
\c - postgres
You are now connected to database "tests_pg16" as user "postgres".
--
DROP SUBSCRIPTION sub_pg16_renamed;
psql:/home/benoit/tmp/password_required.sql:30: ERROR:  password is required
DETAIL:  Non-superuser cannot connect if the server does not request a password.

Re: Questions about the new subscription parameter: password_required

2023-09-22 Thread Benoit Lobréau

On 9/21/23 20:29, Robert Haas wrote:

Which one? I see 2 ALTER SUBSCRIPTION ... OWNER commands in
password_required.log and 1 more in password_required2.log, but
they're all performed by the superuser, who is entitled to do anything
they want.


Thank you for taking the time to respond!

I expected the ALTER SUBSCRIPTION ... OWNER command in 
password_required.log to fail because the end result of the command is a 
non-superuser owning a subscription with password_required=true, but the 
connection string has no password keyword, and the authentication scheme 
used doesn't require one anyway.


The description of the password_required parameter doesn't clearly state 
what will fail or when the configuration is enforced (during CREATE 
SUBSCRIPTION and ALTER SUBSCRIPTION .. CONNECTION):


""" https://www.postgresql.org/docs/16/sql-createsubscription.html
Specifies whether connections to the publisher made as a result of this 
subscription must use password authentication. This setting is ignored 
when the subscription is owned by a superuser. The default is true. Only 
superusers can set this value to false.

"""

The description of pg_subscription.subpasswordrequired doesn't either:

""" https://www.postgresql.org/docs/16/catalog-pg-subscription.html
If true, the subscription will be required to specify a password for 
authentication

"""

Can we consider adding something like this to clarify?

"""
This parameter is enforced when the CREATE SUBSCRIPTION or ALTER 
SUBSCRIPTION .. CONNECTION commands are executed. Therefore, it's 
possible to alter the ownership of a subscription with 
password_required=true to a non-superuser.

"""

Is the DROP SUBSCRIPTION failure in password_required.log expected for 
both superuser and non-superuser?


Is the DROP SUBSCRIPTION success in password_required2.log expected? 
(i.e., with password_required=false, the only action a non-superuser can 
perform is dropping the subscription. Since they own it, it is 
understandable.)


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Questions about the new subscription parameter: password_required

2023-09-22 Thread Benoit Lobréau

On 9/22/23 14:36, Robert Haas wrote:

I haven't checked this, but I think what's happening here is that DROP
SUBSCRIPTION tries to drop the remote slot, which requires making a
connection, which can trigger the error. You might get different
results if you did ALTER SUBSCRIPTION ... SET (slot_name = none)
first.


You're right, it comes from the connection to drop the slot.

But the code for DropSubscription in 
src/backend/commands/subscriptioncmds.c tries to connect before testing 
if the slot is NONE / NULL. So it doesn't work to DISABLE the 
subscription and set the slot to NONE.
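A sketch of the attempted workaround (hypothetical names), which still fails because the connection is made before the slot name is checked:

```sql
ALTER SUBSCRIPTION sub_pg16 DISABLE;
ALTER SUBSCRIPTION sub_pg16 SET (slot_name = NONE);
DROP SUBSCRIPTION sub_pg16;  -- ERROR:  password is required
```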



  1522     must_use_password = !superuser_arg(subowner) && form->subpasswordrequired;
  ...
  1685     wrconn = walrcv_connect(conninfo, true, must_use_password,
                                   subname, &err);
           if (wrconn == NULL)
           {
               if (!slotname)
               {
                   /* be tidy */
                   list_free(rstates);
                   table_close(rel, NoLock);
                   return;
               }
               else
               {
                   ReportSlotConnectionError(rstates, subid, slotname, err);
               }
           }


Reading the code, I think I understand why the postgres user cannot drop 
the slot:


* the owner is sub_owner (not a superuser)
* and form->subpasswordrequired is true

Should there be a test to check if the user executing the query is a 
superuser? Maybe it's handled differently? (I am not very familiar with 
the code.)


I don't understand (yet?) why I can do ALTER SUBSCRIPTION after changing 
the ownership to an unprivileged user (must_use_password should be true 
also in that case).


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Questions about the new subscription parameter: password_required

2023-09-26 Thread Benoit Lobréau

On 9/22/23 21:58, Robert Haas wrote

I think that there normally shouldn't be any problem here, because if
form->subpasswordrequired is true, we expect that the connection
string should contain a password which the remote side actually uses,
or we expect the subscription to be owned by the superuser. If neither
of those things is the case, then either the superuser made a
subscription that doesn't use a password owned by a non-superuser
without fixing subpasswordrequired, or else the configuration on the
remote side has changed so that it now doesn't use the password when
formerly it did. In the first case, perhaps it would be fine to go
ahead and drop the slot, but in the second case I don't think it's OK
from a security point view, because the command is going to behave the
same way no matter who executes the drop command, and a non-superuser
who drops the slot shouldn't be permitted to rely on the postgres
processes's identity to do anything on a remote node -- including
dropping a relation slot. So I tentatively think that this behavior is
correct.


I must admit I hadn't considered the implications when the configuration 
on the remote side has changed and we use a non-superuser. I see how it 
could be problematic.


I will try to come up with a documentation patch.


Maybe you're altering it in a way that doesn't involve any connections
or any changes to the connection string? There's no security issue if,
say, you rename it.


I looked at the code again. Indeed, of the ALTER SUBSCRIPTION commands, 
only ALTER SUBSCRIPTION .. CONNECTION uses walrcv_check_conninfo().


I checked the other thread (Re: [16+] subscription can end up in 
inconsistent state [1]) and will try the patch. Is it the thread you 
were referring to earlier?


[1] 
https://www.postgresql.org/message-id/flat/5dff4caf26f45ce224a33a5e18e110b93a351b2f.camel%40j-davis.com#ff4a06505de317b1ad436b8102a69446


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Questions about the new subscription parameter: password_required

2023-09-26 Thread Benoit Lobréau

On 9/26/23 16:27, Benoit Lobréau wrote:

I will try to come up with a documentation patch.


This is my attempt at a documentation patch.

--
Benoit Lobréau
Consultant
http://dalibo.comFrom a73baa91032fff37ef039168c276508553830f86 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Tue, 26 Sep 2023 18:07:47 +0200
Subject: [PATCH] Doc patch for require_password

Add documentation to ALTER / DROP SUBSCRIPTION regarding non-superuser
subscriptions with require_password=true.
---
 doc/src/sgml/ref/alter_subscription.sgml | 3 +++
 doc/src/sgml/ref/drop_subscription.sgml  | 6 ++
 2 files changed, 9 insertions(+)

diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a85e04e4d6..0bbe7e2335 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -51,6 +51,9 @@ ALTER SUBSCRIPTION name RENAME TO <
to alter the owner, you must be able to SET ROLE to the
new owning role. If the subscription has
password_required=false, only superusers can modify it.
+   If the ownership of a subscription with password_required=true
+   is transferred to a non-superuser, they will gain full control over the subscription
+   but will not be able to modify its connection string.
   
 
   
diff --git a/doc/src/sgml/ref/drop_subscription.sgml b/doc/src/sgml/ref/drop_subscription.sgml
index 2a67bdea91..8ec743abd0 100644
--- a/doc/src/sgml/ref/drop_subscription.sgml
+++ b/doc/src/sgml/ref/drop_subscription.sgml
@@ -102,6 +102,12 @@ DROP SUBSCRIPTION [ IF EXISTS ] nameDROP
SUBSCRIPTION cannot be executed inside a transaction block.
   
+
+  
+   If the ownership of a subscription with password_required=true
+   has been transferred to a non-superuser, it must be reverted to a superuser for
+   the DROP operation to succeed.
+  
  
 
  
-- 
2.41.0



Re: Questions about the new subscription parameter: password_required

2023-09-28 Thread Benoit Lobréau

On 9/26/23 19:00, Jeff Davis wrote:

   +   If the ownership of a subscription with password_required=true
   +   is transferred to a non-superuser, they will gain full control over the subscription
   +   but will not be able to modify it's connection string.

I think you mean false, right?


No, but I was wrong. At the beginning of the thread, I was surprised it 
was even possible to change the ownership to a non-superuser, because it 
shouldn't work, and commands like ENABLE didn't complain in the terminal. 
Then Robert Haas explained to me that it's OK because the superuser can 
do whatever he wants. I came back to it later and somehow convinced 
myself it was working. Sorry.



   +   If the ownership of a subscription with password_required=true
   +   has been transferred to a non-superuser, it must be reverted to a superuser for
   +   the DROP operation to succeed.

That's only needed if the superuser transfers a subscription with
password_required=true to a non-superuser and the connection string
does not contain a password. In that case, the subscription is already
in a failing state, not just for DROP. Ideally we'd have some other
warning in the docs not to do that -- maybe in CREATE and ALTER.


Yes, I forgot the connection string bit.


Also, if the subscription is in that kind of failing state, there are
other ways to get out of it as well, like disabling it and setting
connection=none, then dropping it.

The code in for DropSubscription in
src/backend/commands/subscriptioncmds.c tries to connect before testing
if the slot is NONE / NULL. So it doesn't work to DISABLE the
subscription and set the slot to NONE.

Robert Haas proposed something in the following message but I am a 
little out of my depth here ...


https://www.postgresql.org/message-id/af9435ae-18df-6a9e-2374-2de23009518c%40dalibo.com


The whole thing is fairly delicate. As soon as you work slightly
outside of the intended use, password_required starts causing
unexpected things to happen.

As I said earlier, I think the best thing to do is to just have a
section that describes when to use password_required, what specific
things you should do to satisfy that case, and what caveats you should
avoid. Something like:

   "If you want to have a subscription using a connection string without
a password managed by a non-superuser, then: [ insert SQL steps here ].
Warning: if the connection string doesn't contain a password, make sure
to set password_required=false before transferring ownership, otherwise
it will start failing."


Ok, I will do it that way. Would you prefer this section to be in the 
ALTER SUBSCRIPTION or the CREATE SUBSCRIPTION doc?


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Logging parallel worker draught

2023-10-12 Thread Benoit Lobréau

On 10/11/23 17:26, Imseih (AWS), Sami wrote:

Thank you for resurrecting this thread.


Well, if you read Benoit's earlier proposal at [1] you'll see that he
does propose to have some cumulative stats; this LOG line he proposes
here is not a substitute for stats, but rather a complement.  I don't
see any reason to reject this patch even if we do get stats.


I believe both cumulative statistics and logs are needed. Logs excel at 
pinpointing specific queries at precise times, while statistics provide 
a broader overview of the situation. Additionally, I often encounter 
situations where clients lack pg_stat_statements and can't restart their 
production promptly.



Regarding the current patch, the latest version removes the separate GUC,
but the user should be able to control this behavior.


I created this patch in response to Amit Kapila's proposal to keep the 
discussion ongoing. However, I still favor the initial version with the 
GUCs.



Query text is logged when  log_min_error_statement > default level of "error".

This could be especially problematic when there is a query running more than 1 
Parallel
Gather node that is in draught. In those cases each node will end up
generating a log with the statement text. So, a single query execution could 
end up
having multiple log lines with the statement text.
...
I wonder if it will be better to accumulate the total # of workers planned and 
# of workers launched and
logging this information at the end of execution?


log_temp_files exhibits similar behavior when a query involves multiple 
on-disk sorts. I'm uncertain whether this is something we should or need 
to address. I'll explore whether the error message can be made more 
informative.


[local]:5437 postgres@postgres=# SET work_mem to '125kB';
[local]:5437 postgres@postgres=# SET log_temp_files TO 0;
[local]:5437 postgres@postgres=# SET client_min_messages TO log;
[local]:5437 postgres@postgres=# WITH a AS ( SELECT x FROM 
generate_series(1,1) AS F(x) ORDER BY 1 ) , b AS (SELECT x FROM 
generate_series(1,1) AS F(x) ORDER BY 1 ) SELECT * FROM a,b;
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.20", size 
122880 => First sort

LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.19", size 14
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.23", size 14
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.22", size 
122880 => Second sort

LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.21", size 14

--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Questions about the new subscription parameter: password_required

2023-10-13 Thread Benoit Lobréau

On 9/23/23 03:57, Jeff Davis wrote:

IIUC there is really one use case here, which is for superuser to
define a subscription including the connection, and then change the
owner to a non-superuser to actually run it (without being able to
touch the connection string itself). I'd just document that in its own
section, and mention a few caveats / mistakes to avoid. For instance,
when the superuser is defining the connection, don't forget to set
password_required=false, so that when you reassign to a non-superuser
then the connection doesn't break.


Hi,

I tried adding a section in "Logical Replication > Subscription" with 
the text you suggested and links in the CREATE / ALTER SUBSCRIPTION commands.


Is it better?

--
Benoit Lobréau
Consultant
http://dalibo.comFrom f3f1b0ce8617971b173ea901c9735d8357955aa2 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Thu, 12 Oct 2023 16:45:11 +0200
Subject: [PATCH] Doc patch for password_required

Add documentation regarding non-superuser subscriptions with
password_required=true.
---
 doc/src/sgml/logical-replication.sgml | 32 +++
 doc/src/sgml/ref/alter_subscription.sgml  |  3 ++-
 doc/src/sgml/ref/create_subscription.sgml |  3 ++-
 3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 3b2fa1129e..c3faaf88cd 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -329,6 +329,38 @@

   
 
+  
+   Password required
+
+   
+password_required is a subscription parameter which specifies whether
+connections to the publisher made as a result of this subscription must
+use password authentication. This setting is ignored when the subscription
+is owned by a superuser and set to true by default.
+   
+
+   
+If you want to have a subscription managed by a non-superuser with a connection string without
+a password, you have to set password_required = false before transferring its
+ownership. In that case, only superusers can modify the subscription.
+   
+test_pub=# CREATE SUBSCRIPTION test_sub CONNECTION 'host=somehost port=5432 user=repli dbname=tests_pub' PUBLICATION test_pub WITH (password_required=false);
+CREATE SUBSCRIPTION
+test_pub=# ALTER SUBSCRIPTION test_sub OWNER TO new_sub_owner;
+ALTER SUBSCRIPTION
+   
+   
+
+   
+   
+   If the connection string doesn't contain a password or the publication
+   side doesn't require a password during authentication and you have set
+   password_required = true before transferring ownership,
+   the subscription will start failing.
+   
+   
+  
+
   
 Examples: Set Up Logical Replication
 
diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index a85e04e4d6..e061c96937 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -50,7 +50,8 @@ ALTER SUBSCRIPTION name RENAME TO <
CREATE permission on the database. In addition,
to alter the owner, you must be able to SET ROLE to the
new owning role. If the subscription has
-   password_required=false, only superusers can modify it.
+   password_required=false, only superusers can modify it
+   (See ).
   
 
   
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index c1bafbfa06..33ad3d12c7 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -361,7 +361,8 @@ CREATE SUBSCRIPTION subscription_nametrue. Only superusers can set
-  this value to false.
+  this value to false
+  (See ).
  
 

-- 
2.41.0



Re: archive modules

2022-08-30 Thread Benoit Lobréau
On Tue, Aug 30, 2022 at 12:46 AM Nathan Bossart 
wrote:

> I would advise that you create a commitfest entry for your patch so that it
> isn't forgotten.
>

Ok done, https://commitfest.postgresql.org/39/3856/ (is it fine if I
added you as a reviewer?)

and thanks for the added links in the patch.


Re: Logging parallel worker draught

2024-08-28 Thread Benoit Lobréau

Hi,

Here is a new version of the patch. Sorry for the long delay, I was hit 
by a motivation drought and was quite busy otherwise.


The guc is now called `log_parallel_workers` and has three possible values:

* "none": disables logging
* "all": logs parallel worker info for all parallel queries or utilities
* "failures": logs only when the number of parallel workers planned 
couldn't be reached.
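With the patch applied, the setting would be controlled like any other logging GUC (sketch; log_parallel_workers is not a stock PostgreSQL parameter):

```
# postgresql.conf
log_parallel_workers = failures   # none | all | failures
```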


For this, I added several members to the EState struct.

Each gather node / gather merge node updates the values and the 
offending queries are displayed during standard_ExecutorEnd.


For CREATE INDEX / REINDEX on btree and brin, I check the parallel 
context struct (pcxt) during _bt_end_parallel() or _brin_end_parallel() 
and display a log message when needed.


For vacuum, I do the same in parallel_vacuum_end().

I added some information to the error message for parallel queries as an 
experiment. I find it useful, but it can be removed if you're not 
convinced.


2024-08-27 15:59:11.089 CEST [54585] LOG:  1 parallel nodes planned (1 
obtained all their workers, 0 obtained none), 2 workers planned (2 
workers launched)

2024-08-27 15:59:11.089 CEST [54585] STATEMENT:  EXPLAIN (ANALYZE)
SELECT i, avg(j) FROM test_pql GROUP BY i;

2024-08-27 15:59:14.006 CEST [54585] LOG:  2 parallel nodes planned (0 
obtained all their workers, 1 obtained none), 4 workers planned (1 
workers launched)

2024-08-27 15:59:14.006 CEST [54585] STATEMENT:  EXPLAIN (ANALYZE)
SELECT i, avg(j) FROM test_pql GROUP BY i
UNION
SELECT i, avg(j) FROM test_pql GROUP BY i;

For CREATE INDEX / REINDEX / VACUUM:

2024-08-27 15:58:59.769 CEST [54521] LOG:  1 workers planned (0 workers 
launched)

2024-08-27 15:58:59.769 CEST [54521] STATEMENT:  REINDEX TABLE test_pql;

Do you think this is better?

I am not sure if a struct is needed to store the es_nworkers* and if the 
modification I did to parallel.h is ok.


Thanks to: Jehan-Guillaume de Rorthais, Guillaume Lelarge and Franck 
Boudehen for the help and motivation boost.


(sorry for the spam, I had to resend the mail to the list)

--
Benoit Lobréau
Consultant
http://dalibo.comFrom 940e1d1149f33ea00e7a0a688da67f492f6d Mon Sep 17 00:00:00 2001
From: benoit 
Date: Mon, 26 Aug 2024 13:48:44 +0200
Subject: [PATCH] Add logging for parallel worker usage

The new guc log_parallel_workers controls whether a log message is
produced to display information on the number of workers spawned when a
parallel query or utility is executed. The default value is `none`
which disables logging. `all` displays information for all parallel
queries, whereas `failures` displays information only when the number of
workers launched is lower than the number of planned workers. This new
parameter can help database administrators and developers diagnose
performance issues related to parallelism and optimize the configuration
of the system accordingly.
---
 doc/src/sgml/config.sgml  | 18 ++
 src/backend/access/brin/brin.c|  8 
 src/backend/access/nbtree/nbtsort.c   |  8 
 src/backend/commands/vacuumparallel.c |  8 
 src/backend/executor/execMain.c   | 12 
 src/backend/executor/execUtils.c  |  6 ++
 src/backend/executor/nodeGather.c |  8 
 src/backend/executor/nodeGatherMerge.c|  8 
 src/backend/utils/misc/guc_tables.c   | 12 
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/access/parallel.h | 15 +++
 src/include/nodes/execnodes.h |  6 ++
 12 files changed, 110 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 2937384b00..234552c3fc 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7700,6 +7700,24 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
  
 
+
+  log_parallel_workers (enum)
+  
+   log_parallel_workers configuration parameter
+  
+  
+  
+   
+Controls whether a log message is produced to display information on the number of
+workers spawned when a parallel query or utility is executed. The default value is
+none which disables logging. all displays
+information for all parallel queries, whereas failures displays
+information only when the number of workers launched is lower than the number of
+planned workers.
+   
+  
+ 
+
  
   log_parameter_max_length (integer)
   
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 6467bed604..0d53d5c575 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -2540,6 +2540,14 @@ _brin_end_parallel(BrinLeader *brinleader, BrinBuildState *state)
 	/* Shutdown worker processes */
 	Wa

Parallel workers stats in pg_stat_database

2024-08-28 Thread Benoit Lobréau

Hi hackers,

This patch introduces four new columns in pg_stat_database:

* parallel_worker_planned
* parallel_worker_launched
* parallel_maint_worker_planned
* parallel_maint_worker_launched

The intent is to help administrators evaluate the usage of parallel 
workers in their databases and help size max_worker_processes, 
max_parallel_workers or max_parallel_maintenance_workers.
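As a sketch of the intended usage, assuming the patched columns exist, an administrator could compute the launch ratio per database:

```sql
-- Hypothetical query against the patched pg_stat_database
SELECT datname,
       parallel_workers_launched::float8
         / NULLIF(parallel_workers_planned, 0) AS worker_launch_ratio
  FROM pg_stat_database
 WHERE parallel_workers_planned > 0;
```

A ratio well below 1 would suggest max_parallel_workers or max_worker_processes is undersized for the workload.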


Here is a test script:

psql << _EOF_

-- Index creation
DROP TABLE IF EXISTS test_pql;
CREATE TABLE test_pql(i int, j int);
INSERT INTO test_pql SELECT x,x FROM generate_series(1,100) as F(x);

-- 0 planned / 0 launched
EXPLAIN (ANALYZE)
SELECT 1;

-- 2 planned / 2 launched
EXPLAIN (ANALYZE)
SELECT i, avg(j) FROM test_pql GROUP BY i;

SET max_parallel_workers TO 1;
-- 4 planned / 1 launched
EXPLAIN (ANALYZE)
SELECT i, avg(j) FROM test_pql GROUP BY i
UNION
SELECT i, avg(j) FROM test_pql GROUP BY i;

RESET max_parallel_workers;
-- 1 planned / 1 launched
CREATE INDEX ON test_pql(i);

SET max_parallel_workers TO 0;
-- 1 planned / 0 launched
CREATE INDEX ON test_pql(j);
-- 1 planned / 0 launched
CREATE INDEX ON test_pql(i, j);

SET maintenance_work_mem TO '96MB';
RESET max_parallel_workers;
-- 2 planned / 2 launched
VACUUM (VERBOSE) test_pql;

SET max_parallel_workers TO 1;
-- 2 planned / 1 launched
VACUUM (VERBOSE) test_pql;

-- TOTAL: parallel workers: 6 planned / 3 launched
-- TOTAL: parallel maint workers: 7 planned / 4 launched
_EOF_


And the output in pg_stat_database on a fresh server without any 
configuration change except those in the script:


[local]:5445 postgres@postgres=# SELECT datname, parallel_workers_planned,
parallel_workers_launched, parallel_maint_workers_planned,
parallel_maint_workers_launched FROM pg_stat_database
WHERE datname = 'postgres' \gx
-[ RECORD 1 ]-------------------+---------
datname                         | postgres
parallel_workers_planned        | 6
parallel_workers_launched       | 3
parallel_maint_workers_planned  | 7
parallel_maint_workers_launched | 4

Thanks to: Jehan-Guillaume de Rorthais, Guillaume Lelarge and Franck
Boudehen for the help and motivation boost.

---
Benoit Lobréau
Consultant
http://dalibo.comFrom cabe5e8ed2e88d9cf219161394396ebacb33a5a0 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Wed, 28 Aug 2024 02:27:13 +0200
Subject: [PATCH] Adds four parallel workers stat columns to pg_stat_database

* parallel_workers_planned
* parallel_workers_launched
* parallel_maint_workers_planned
* parallel_maint_workers_launched
---
 doc/src/sgml/monitoring.sgml | 36 
 src/backend/access/brin/brin.c   |  4 +++
 src/backend/access/nbtree/nbtsort.c  |  4 +++
 src/backend/catalog/system_views.sql |  4 +++
 src/backend/commands/vacuumparallel.c|  5 +++
 src/backend/executor/nodeGather.c|  5 +++
 src/backend/executor/nodeGatherMerge.c   |  5 +++
 src/backend/utils/activity/pgstat_database.c | 36 
 src/backend/utils/adt/pgstatfuncs.c  | 12 +++
 src/include/catalog/pg_proc.dat  | 20 +++
 src/include/pgstat.h |  7 
 src/test/regress/expected/rules.out  |  4 +++
 12 files changed, 142 insertions(+)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 55417a6fa9..8c4b11c11d 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3508,6 +3508,42 @@ description | Waiting for a newly initialized WAL file to reach durable storage
   
  
 
+ 
+  
+   parallel_workers_planned bigint
+  
+  
+   Number of parallel workers planned by queries on this database
+  
+ 
+
+ 
+  
+   parallel_workers_launched bigint
+  
+  
+   Number of parallel workers obtained by queries on this database
+  
+ 
+
+ 
+  
+   parallel_maint_workers_planned bigint
+  
+  
+   Number of parallel workers planned by utilities on this database
+  
+ 
+
+ 
+  
+   parallel_maint_workers_launched bigint
+  
+  
+   Number of parallel workers obtained by utilities on this database
+  
+ 
+
  
   
stats_reset timestamp with time zone
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 6467bed604..9eceb87b52 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -2540,6 +2540,10 @@ _brin_end_parallel(BrinLeader *brinleader, BrinBuildState *state)
 	/* Shutdown worker processes */
 	WaitForParallelWorkersToFinish(brinleader->pcxt);
 
+	pgstat_update_parallel_maint_workers_stats(
+		(PgStat_Counter) brinleader->pcxt->nworkers_to_launch,
+		(PgStat_Counter) brinleader->pcxt->nworkers_launched);
+
 	/*
 	 * Next, accumulate WAL usage.  (This must wait for the workers to finish,
 	 * or we mig

Re: Parallel workers stats in pg_stat_database

2024-08-29 Thread Benoit Lobréau

Hi,

This is a new patch which:

* fixes some typos
* changes the ExecGather / ExecGatherMerge code so that the stats are 
accumulated in EState and inserted into pg_stat_database only once, during 
ExecutorEnd
* adds tests (very ugly, but I could not get the parallel plan to be stable 
across make check executions)



On 8/28/24 17:10, Benoit Lobréau wrote:

Hi hackers,

This patch introduces four new columns in pg_stat_database:

* parallel_worker_planned
* parallel_worker_launched
* parallel_maint_worker_planned
* parallel_maint_worker_launched

The intent is to help administrators evaluate the usage of parallel 
workers in their databases and help size max_worker_processes, 
max_parallel_workers, or max_parallel_maintenance_workers.
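
As an illustration, once the patch is applied, the launched-to-planned
ratio could be derived with a query along these lines (a sketch only; the
column names are the ones proposed in this patch and may still change):

```sql
-- Sketch of a monitoring query against the proposed columns.
-- A pct_launched well below 100 suggests max_parallel_workers (or
-- max_worker_processes) is undersized for the workload.
SELECT datname,
       parallel_workers_planned,
       parallel_workers_launched,
       round(100.0 * parallel_workers_launched
             / NULLIF(parallel_workers_planned, 0), 1) AS pct_launched
  FROM pg_stat_database
 WHERE datname IS NOT NULL;
```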


Here is a test script:

psql << _EOF_

-- Index creation
DROP TABLE IF EXISTS test_pql;
CREATE TABLE test_pql(i int, j int);
INSERT INTO test_pql SELECT x,x FROM generate_series(1,100) as F(x);

-- 0 planned / 0 launched
EXPLAIN (ANALYZE)
 SELECT 1;

-- 2 planned / 2 launched
EXPLAIN (ANALYZE)
 SELECT i, avg(j) FROM test_pql GROUP BY i;

SET max_parallel_workers TO 1;
-- 4 planned / 1 launched
EXPLAIN (ANALYZE)
 SELECT i, avg(j) FROM test_pql GROUP BY i
 UNION
 SELECT i, avg(j) FROM test_pql GROUP BY i;

RESET max_parallel_workers;
-- 1 planned / 1 launched
CREATE INDEX ON test_pql(i);

SET max_parallel_workers TO 0;
-- 1 planned / 0 launched
CREATE INDEX ON test_pql(j);
-- 1 planned / 0 launched
CREATE INDEX ON test_pql(i, j);

SET maintenance_work_mem TO '96MB';
RESET max_parallel_workers;
-- 2 planned / 2 launched
VACUUM (VERBOSE) test_pql;

SET max_parallel_workers TO 1;
-- 2 planned / 1 launched
VACUUM (VERBOSE) test_pql;

-- TOTAL: parallel workers: 6 planned / 3 launched
-- TOTAL: parallel maint workers: 7 planned / 4 launched
_EOF_


And the output of pg_stat_database on a fresh server, without any 
configuration change except those made in the script:


[local]:5445 postgres@postgres=# SELECT datname,
parallel_workers_planned, parallel_workers_launched,
parallel_maint_workers_planned, parallel_maint_workers_launched
FROM pg_stat_database WHERE datname = 'postgres' \gx

-[ RECORD 1 ]---+-
datname                         | postgres
parallel_workers_planned        | 6
parallel_workers_launched       | 3
parallel_maint_workers_planned  | 7
parallel_maint_workers_launched | 4

Thanks to: Jehan-Guillaume de Rorthais, Guillaume Lelarge and Franck
Boudehen for the help and motivation boost.

---
Benoit Lobréau
Consultant
http://dalibo.com


--
Benoit Lobréau
Consultant
http://dalibo.com

From 0338cfb11ab98594b2f16d143b505e269566bb6e Mon Sep 17 00:00:00 2001
From: benoit 
Date: Wed, 28 Aug 2024 02:27:13 +0200
Subject: [PATCH] Adds four parallel workers stat columns to pg_stat_database

* parallel_workers_planned
* parallel_workers_launched
* parallel_maint_workers_planned
* parallel_maint_workers_launched
---
 doc/src/sgml/monitoring.sgml | 36 
 src/backend/access/brin/brin.c   |  4 +++
 src/backend/access/nbtree/nbtsort.c  |  4 +++
 src/backend/catalog/system_views.sql |  4 +++
 src/backend/commands/vacuumparallel.c|  5 +++
 src/backend/executor/execMain.c  |  5 +++
 src/backend/executor/execUtils.c |  3 ++
 src/backend/executor/nodeGather.c|  3 ++
 src/backend/executor/nodeGatherMerge.c   |  3 ++
 src/backend/utils/activity/pgstat_database.c | 36 
 src/backend/utils/adt/pgstatfuncs.c  | 12 +++
 src/include/catalog/pg_proc.dat  | 20 +++
 src/include/nodes/execnodes.h|  3 ++
 src/include/pgstat.h |  7 
 src/test/regress/expected/rules.out  |  4 +++
 src/test/regress/expected/stats.out  | 17 +
 src/test/regress/sql/stats.sql   | 14 
 17 files changed, 180 insertions(+)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 55417a6fa9..8c4b11c11d 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3508,6 +3508,42 @@ description | Waiting for a newly initialized WAL file 
to reach durable storage
   
  
 
+ 
+  
+   parallel_workers_planned bigint
+  
+  
+   Number of parallel workers planned by queries on this database
+  
+ 
+
+ 
+  
+   parallel_workers_launched bigint
+  
+  
+   Number of parallel workers obtained by queries on this database
+  
+ 
+
+ 
+  
+   parallel_maint_workers_planned 
bigint
+  
+  
+   Number of parallel workers planned by utilities on this database
+  
+ 
+
+ 
+  
+   parallel_maint_workers_launched 
bigint
+  
+  
+   Number of parallel workers obtained by utilities on this database
+  
+ 
+
  
   
stats_reset timestamp 

Re: Parallel workers stats in pg_stat_database

2024-09-03 Thread Benoit Lobréau

Hi,

This new version avoids updating the stats for non-parallel queries.

I noticed that the tests are still not stable. I tried using tenk2
but failed to get stable plans. I'd love to have pointers on that front.
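
For what it's worth, a trick used elsewhere in the regression suite is to
force parallelism through the planner cost GUCs instead of relying on table
contents, which makes the chosen plan independent of statistics (a sketch,
not part of the patch):

```sql
-- Force a deterministic parallel plan regardless of table size/statistics.
BEGIN;
SET LOCAL parallel_setup_cost = 0;
SET LOCAL parallel_tuple_cost = 0;
SET LOCAL min_parallel_table_scan_size = 0;
SET LOCAL max_parallel_workers_per_gather = 2;
EXPLAIN (COSTS OFF) SELECT count(*) FROM tenk2;
COMMIT;
```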

--
Benoit Lobréau
Consultant
http://dalibo.com

From 5e4401c865f77ed447b8b3f25aac0ffa9af0d700 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Wed, 28 Aug 2024 02:27:13 +0200
Subject: [PATCH] Adds four parallel workers stat columns to pg_stat_database

* parallel_workers_planned
* parallel_workers_launched
* parallel_maint_workers_planned
* parallel_maint_workers_launched
---
 doc/src/sgml/monitoring.sgml | 36 
 src/backend/access/brin/brin.c   |  4 +++
 src/backend/access/nbtree/nbtsort.c  |  4 +++
 src/backend/catalog/system_views.sql |  4 +++
 src/backend/commands/vacuumparallel.c|  5 +++
 src/backend/executor/execMain.c  |  7 
 src/backend/executor/execUtils.c |  3 ++
 src/backend/executor/nodeGather.c|  3 ++
 src/backend/executor/nodeGatherMerge.c   |  3 ++
 src/backend/utils/activity/pgstat_database.c | 36 
 src/backend/utils/adt/pgstatfuncs.c  | 12 +++
 src/include/catalog/pg_proc.dat  | 20 +++
 src/include/nodes/execnodes.h|  3 ++
 src/include/pgstat.h |  7 
 src/test/regress/expected/rules.out  |  4 +++
 src/test/regress/expected/stats.out  | 17 +
 src/test/regress/sql/stats.sql   | 14 
 17 files changed, 182 insertions(+)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 55417a6fa9..8c4b11c11d 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3508,6 +3508,42 @@ description | Waiting for a newly initialized WAL file 
to reach durable storage
   
  
 
+ 
+  
+   parallel_workers_planned bigint
+  
+  
+   Number of parallel workers planned by queries on this database
+  
+ 
+
+ 
+  
+   parallel_workers_launched bigint
+  
+  
+   Number of parallel workers obtained by queries on this database
+  
+ 
+
+ 
+  
+   parallel_maint_workers_planned 
bigint
+  
+  
+   Number of parallel workers planned by utilities on this database
+  
+ 
+
+ 
+  
+   parallel_maint_workers_launched 
bigint
+  
+  
+   Number of parallel workers obtained by utilities on this database
+  
+ 
+
  
   
stats_reset timestamp with time 
zone
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 6467bed604..9eceb87b52 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -2540,6 +2540,10 @@ _brin_end_parallel(BrinLeader *brinleader, 
BrinBuildState *state)
/* Shutdown worker processes */
WaitForParallelWorkersToFinish(brinleader->pcxt);
 
+   pgstat_update_parallel_maint_workers_stats(
+   (PgStat_Counter) brinleader->pcxt->nworkers_to_launch,
+   (PgStat_Counter) brinleader->pcxt->nworkers_launched);
+
/*
 * Next, accumulate WAL usage.  (This must wait for the workers to 
finish,
 * or we might get incomplete data.)
diff --git a/src/backend/access/nbtree/nbtsort.c 
b/src/backend/access/nbtree/nbtsort.c
index f5d7b3b0c3..232e1a0942 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1611,6 +1611,10 @@ _bt_end_parallel(BTLeader *btleader)
/* Shutdown worker processes */
WaitForParallelWorkersToFinish(btleader->pcxt);
 
+   pgstat_update_parallel_maint_workers_stats(
+   (PgStat_Counter) btleader->pcxt->nworkers_to_launch,
+   (PgStat_Counter) btleader->pcxt->nworkers_launched);
+
/*
 * Next, accumulate WAL usage.  (This must wait for the workers to 
finish,
 * or we might get incomplete data.)
diff --git a/src/backend/catalog/system_views.sql 
b/src/backend/catalog/system_views.sql
index 19cabc9a47..48bf9e5535 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1073,6 +1073,10 @@ CREATE VIEW pg_stat_database AS
 pg_stat_get_db_sessions_abandoned(D.oid) AS sessions_abandoned,
 pg_stat_get_db_sessions_fatal(D.oid) AS sessions_fatal,
 pg_stat_get_db_sessions_killed(D.oid) AS sessions_killed,
+pg_stat_get_db_parallel_workers_planned(D.oid) as 
parallel_workers_planned,
+pg_stat_get_db_parallel_workers_launched(D.oid) as 
parallel_workers_launched,
+pg_stat_get_db_parallel_maint_workers_planned(D.oid) as 
parallel_maint_workers_planned,
+pg_stat_get_db_parallel_maint_workers_launched(D.oid) as 
parallel_maint_workers_launched,
 pg_stat_get_db_stat_reset_t

Re: Parallel workers stats in pg_stat_database

2024-09-04 Thread Benoit Lobréau
On 9/4/24 08:46, Bertrand Drouvot wrote:

What about moving the tests to places where it's "guaranteed" to get
parallel workers involved? For example, a "parallel_maint_workers" only test
could be done in vacuum_parallel.sql.


Thank you! I was too focused on the stats part and missed the obvious.
It's indeed better with this file.

... Which led me to discover that the area I chose to gather my stats in 
is wrong (parallel_vacuum_end): it only traps workers allocated for 
parallel_vacuum_cleanup_all_indexes() and not 
parallel_vacuum_bulkdel_all_indexes().


Back to the drawing board...

--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Logging parallel worker draught

2024-09-04 Thread Benoit Lobréau
I found out in [1] that I am not correctly tracking the workers for 
vacuum commands. I trap workers used by 
parallel_vacuum_cleanup_all_indexes but not 
parallel_vacuum_bulkdel_all_indexes.


Back to the drawing board.

[1] 
https://www.postgresql.org/message-id/flat/783bc7f7-659a-42fa-99dd-ee0565644...@dalibo.com


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Parallel workers stats in pg_stat_database

2024-09-17 Thread Benoit Lobréau

Here is an updated patch fixing the aforementioned problems
with tests and vacuum stats.

--
Benoit Lobréau
Consultant
http://dalibo.com

From 1b52f5fb4e977599b8925c69193d31148042ca7d Mon Sep 17 00:00:00 2001
From: benoit 
Date: Wed, 28 Aug 2024 02:27:13 +0200
Subject: [PATCH] Adds four parallel workers stat columns to pg_stat_database

* parallel_workers_planned
* parallel_workers_launched
* parallel_maint_workers_planned
* parallel_maint_workers_launched
---
 doc/src/sgml/monitoring.sgml  | 36 +++
 src/backend/access/brin/brin.c|  4 +++
 src/backend/access/nbtree/nbtsort.c   |  4 +++
 src/backend/catalog/system_views.sql  |  4 +++
 src/backend/commands/vacuumparallel.c |  5 +++
 src/backend/executor/execMain.c   |  7 
 src/backend/executor/execUtils.c  |  3 ++
 src/backend/executor/nodeGather.c |  3 ++
 src/backend/executor/nodeGatherMerge.c|  3 ++
 src/backend/utils/activity/pgstat_database.c  | 36 +++
 src/backend/utils/adt/pgstatfuncs.c   | 12 +++
 src/include/catalog/pg_proc.dat   | 20 +++
 src/include/nodes/execnodes.h |  3 ++
 src/include/pgstat.h  |  7 
 src/test/regress/expected/rules.out   |  4 +++
 src/test/regress/expected/select_parallel.out | 27 ++
 src/test/regress/expected/vacuum_parallel.out | 19 ++
 src/test/regress/sql/select_parallel.sql  | 15 
 src/test/regress/sql/vacuum_parallel.sql  | 11 ++
 19 files changed, 223 insertions(+)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 55417a6fa9..8c4b11c11d 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3508,6 +3508,42 @@ description | Waiting for a newly initialized WAL file 
to reach durable storage
   
  
 
+ 
+  
+   parallel_workers_planned bigint
+  
+  
+   Number of parallel workers planned by queries on this database
+  
+ 
+
+ 
+  
+   parallel_workers_launched bigint
+  
+  
+   Number of parallel workers obtained by queries on this database
+  
+ 
+
+ 
+  
+   parallel_maint_workers_planned 
bigint
+  
+  
+   Number of parallel workers planned by utilities on this database
+  
+ 
+
+ 
+  
+   parallel_maint_workers_launched 
bigint
+  
+  
+   Number of parallel workers obtained by utilities on this database
+  
+ 
+
  
   
stats_reset timestamp with time 
zone
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 6467bed604..9eceb87b52 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -2540,6 +2540,10 @@ _brin_end_parallel(BrinLeader *brinleader, 
BrinBuildState *state)
/* Shutdown worker processes */
WaitForParallelWorkersToFinish(brinleader->pcxt);
 
+   pgstat_update_parallel_maint_workers_stats(
+   (PgStat_Counter) brinleader->pcxt->nworkers_to_launch,
+   (PgStat_Counter) brinleader->pcxt->nworkers_launched);
+
/*
 * Next, accumulate WAL usage.  (This must wait for the workers to 
finish,
 * or we might get incomplete data.)
diff --git a/src/backend/access/nbtree/nbtsort.c 
b/src/backend/access/nbtree/nbtsort.c
index f5d7b3b0c3..232e1a0942 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1611,6 +1611,10 @@ _bt_end_parallel(BTLeader *btleader)
/* Shutdown worker processes */
WaitForParallelWorkersToFinish(btleader->pcxt);
 
+   pgstat_update_parallel_maint_workers_stats(
+   (PgStat_Counter) btleader->pcxt->nworkers_to_launch,
+   (PgStat_Counter) btleader->pcxt->nworkers_launched);
+
/*
 * Next, accumulate WAL usage.  (This must wait for the workers to 
finish,
 * or we might get incomplete data.)
diff --git a/src/backend/catalog/system_views.sql 
b/src/backend/catalog/system_views.sql
index 19cabc9a47..48bf9e5535 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1073,6 +1073,10 @@ CREATE VIEW pg_stat_database AS
 pg_stat_get_db_sessions_abandoned(D.oid) AS sessions_abandoned,
 pg_stat_get_db_sessions_fatal(D.oid) AS sessions_fatal,
 pg_stat_get_db_sessions_killed(D.oid) AS sessions_killed,
+pg_stat_get_db_parallel_workers_planned(D.oid) as 
parallel_workers_planned,
+pg_stat_get_db_parallel_workers_launched(D.oid) as 
parallel_workers_launched,
+pg_stat_get_db_parallel_maint_workers_planned(D.oid) as 
parallel_maint_workers_planned,
+pg_stat_get_db_parallel_maint_workers_launched(D.oid) as 
parallel_maint_workers_launched,
 pg_stat_ge

Using pgadmin with AWS RDS https based Data API

2020-10-13 Thread Benoit Marchant
Hi there, it's my first email, so I'd like to first thank everyone working on 
this; using pgAdmin has made a huge difference for me! I'm using it with 
serverless PostgreSQL databases on AWS, and for the first one I went through 
the work of creating an EC2 instance I could SSH-tunnel through so I could 
use pgAdmin. For now I have to create more databases by hand in different 
accounts, and ultimately this should all be automated, so I'd rather find 
another way. Hence the question: what would it take, and what would be the 
simplest way, to get pgAdmin to work using the HTTPS-based Data API of AWS RDS 
<https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/data-api.html>? 
I'm unfortunately not a Python developer (I focus on JavaScript, client- and 
server-side), but I hope I'm not the only one who'd like to do this!

Thanks for any feedback/pointers!

Benoit

Re: Logging parallel worker draught

2024-10-15 Thread Benoit Lobréau

On 10/14/24 22:42, Alena Rybakina wrote:

Attached is the correct version of the patch.


Thank you, it's a lot cleaner that way.
I'll add this asap.

--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Parallel workers stats in pg_stat_database

2024-10-08 Thread Benoit Lobréau

On 10/7/24 10:19, Guillaume Lelarge wrote:

I've done the split, but I didn't go any further than that.


Thank you, Guillaume. I have done the rest of the reformatting
suggested by Michael, but I decided to check whether I have similar stuff
in my logging patch, and refactor accordingly if needed, before posting 
the result here.

I hope to finish it this week.

--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Logging parallel worker draught

2024-10-11 Thread Benoit Lobréau

This is a rebased version.

I have split queries, vacuum and index creation into different patches.
I have also split out the declarations that are in common with the 
pg_stat_database patch.


--
Benoit Lobréau
Consultant
http://dalibo.com

From cfa426a080ca2d0c484ac8f5201800bd6434 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Fri, 11 Oct 2024 23:59:15 +0200
Subject: [PATCH 5/5] Implements logging for parallel worker usage in queries

---
 src/backend/executor/execMain.c| 12 
 src/backend/executor/execUtils.c   |  3 +++
 src/backend/executor/nodeGather.c  |  5 +
 src/backend/executor/nodeGatherMerge.c |  5 +
 src/include/nodes/execnodes.h  |  4 
 5 files changed, 29 insertions(+)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index cc9a594cba..6663fea7b1 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -483,6 +483,18 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
 
 	Assert(estate != NULL);
 
+	if ((log_parallel_workers == LOG_PARALLEL_WORKERS_ALL &&
+		 estate->es_parallel_workers_to_launch > 0) ||
+		(log_parallel_workers == LOG_PARALLEL_WORKERS_FAILURE &&
+		 estate->es_parallel_workers_to_launch != estate->es_parallel_workers_launched))
+		elog(LOG, "%i parallel nodes planned (%i obtained all their workers, %i obtained none), "
+  "%i workers planned to be launched (%i workers launched)",
+			estate->es_parallel_nodes,
+			estate->es_parallel_nodes_success,
+			estate->es_parallel_nodes_no_workers,
+			estate->es_parallel_workers_to_launch,
+			estate->es_parallel_workers_launched);
+
 	/*
 	 * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This
 	 * Assert is needed because ExecutorFinish is new as of 9.1, and callers
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 6712302ec8..fd463f7e4c 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -160,6 +160,9 @@ CreateExecutorState(void)
 	estate->es_use_parallel_mode = false;
 	estate->es_parallel_workers_to_launch = 0;
 	estate->es_parallel_workers_launched = 0;
+	estate->es_parallel_nodes = 0;
+	estate->es_parallel_nodes_success = 0;
+	estate->es_parallel_nodes_no_workers = 0;
 
 	estate->es_jit_flags = 0;
 	estate->es_jit = NULL;
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index 7f7edc7f9f..49a96f7cbf 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -188,6 +188,10 @@ ExecGather(PlanState *pstate)
 			 */
 			estate->es_parallel_workers_to_launch += pcxt->nworkers_to_launch;
 			estate->es_parallel_workers_launched += pcxt->nworkers_launched;
+			estate->es_parallel_nodes += 1;
+
+			if (pcxt->nworkers_to_launch == pcxt->nworkers_launched)
+estate->es_parallel_nodes_success += 1;
 
 			/* Set up tuple queue readers to read the results. */
 			if (pcxt->nworkers_launched > 0)
@@ -205,6 +209,7 @@ ExecGather(PlanState *pstate)
 /* No workers?	Then never mind. */
 node->nreaders = 0;
 node->reader = NULL;
+estate->es_parallel_nodes_no_workers += 1;
 			}
 			node->nextreader = 0;
 		}
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index bc99c0b448..f3aa7ee14f 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -229,6 +229,10 @@ ExecGatherMerge(PlanState *pstate)
 			 */
 			estate->es_parallel_workers_to_launch += pcxt->nworkers_to_launch;
 			estate->es_parallel_workers_launched += pcxt->nworkers_launched;
+			estate->es_parallel_nodes += 1;
+
+			if (pcxt->nworkers_to_launch == pcxt->nworkers_launched)
+estate->es_parallel_nodes_success += 1;
 
 			/* Set up tuple queue readers to read the results. */
 			if (pcxt->nworkers_launched > 0)
@@ -246,6 +250,7 @@ ExecGatherMerge(PlanState *pstate)
 /* No workers?	Then never mind. */
 node->nreaders = 0;
 node->reader = NULL;
+estate->es_parallel_nodes_no_workers += 1;
 			}
 		}
 
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index e4698a28c4..d87af53853 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -713,6 +713,10 @@ typedef struct EState
 	int			es_parallel_workers_launched;	/* number of workers actually
  * launched. */
 
+	int			es_parallel_nodes;
+	int			es_parallel_nodes_success;
+	int			es_parallel_nodes_no_workers;
+
 	/* The per-query shared memory area to use for parallel execution. */
 	struct dsa_area *es_query_dsa;
 
-- 
2.46.2

From c6dfd6d7751337a9e7a435efe742c4aea9d831e3 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Fri, 11 Oct 2024 23:58:40 +0200
Subject: [PATCH 4/5] Implements logging for parallel worker usage in vacuums

---
 src/bac

Re: Parallel workers stats in pg_stat_database

2024-10-11 Thread Benoit Lobréau

On 10/11/24 09:33, Guillaume Lelarge wrote:
FWIW, with the recent commits of the pg_stat_statements patch, you need 
a slight change in the patch I sent on this thread. You'll find a patch 
attached to do that. You need to apply it after a rebase to master.


Thanks.

Here is an updated version; I modified it to:

* have the same wording in the doc and code (planned => to_launch)
* split the declarations from the rest (and have the same code as the 
parallel worker logging patch)


--
Benoit Lobréau
Consultant
http://dalibo.com

From ab9aa1344f974348638dd3898c944f3d5253374d Mon Sep 17 00:00:00 2001
From: benoit 
Date: Tue, 8 Oct 2024 10:01:52 +0200
Subject: [PATCH 3/3] Adds two parallel workers stat columns for utilities to
 pg_stat_database

* parallel_maint_workers_to_launch
* parallel_maint_workers_launched
---
 doc/src/sgml/monitoring.sgml  | 18 ++
 src/backend/access/brin/brin.c|  4 
 src/backend/access/nbtree/nbtsort.c   |  4 
 src/backend/catalog/system_views.sql  |  2 ++
 src/backend/commands/vacuumparallel.c |  6 ++
 src/backend/utils/activity/pgstat_database.c  | 18 ++
 src/backend/utils/adt/pgstatfuncs.c   |  6 ++
 src/include/catalog/pg_proc.dat   | 10 ++
 src/include/pgstat.h  |  3 +++
 src/test/regress/expected/rules.out   |  2 ++
 src/test/regress/expected/vacuum_parallel.out | 19 +++
 src/test/regress/sql/vacuum_parallel.sql  | 11 +++
 12 files changed, 103 insertions(+)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 840d7f8161..bd4e4b63c7 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3629,6 +3629,24 @@ description | Waiting for a newly initialized WAL file to reach durable storage
   
  
 
+ 
+  
+   parallel_maint_workers_to_launch bigint
+  
+  
+   Number of parallel workers planned to be launched by utilities on this database
+  
+ 
+
+ 
+  
+   parallel_maint_workers_launched bigint
+  
+  
+   Number of parallel workers launched by utilities on this database
+  
+ 
+
  
   
stats_reset timestamp with time zone
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index c0b978119a..4e83091d2c 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -2544,6 +2544,10 @@ _brin_end_parallel(BrinLeader *brinleader, BrinBuildState *state)
 	/* Shutdown worker processes */
 	WaitForParallelWorkersToFinish(brinleader->pcxt);
 
+	pgstat_update_parallel_maint_workers_stats(
+		(PgStat_Counter) brinleader->pcxt->nworkers_to_launch,
+		(PgStat_Counter) brinleader->pcxt->nworkers_launched);
+
 	/*
 	 * Next, accumulate WAL usage.  (This must wait for the workers to finish,
 	 * or we might get incomplete data.)
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 5cca0d4f52..8ee5fcf6d3 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1615,6 +1615,10 @@ _bt_end_parallel(BTLeader *btleader)
 	/* Shutdown worker processes */
 	WaitForParallelWorkersToFinish(btleader->pcxt);
 
+	pgstat_update_parallel_maint_workers_stats(
+		(PgStat_Counter) btleader->pcxt->nworkers_to_launch,
+		(PgStat_Counter) btleader->pcxt->nworkers_launched);
+
 	/*
 	 * Next, accumulate WAL usage.  (This must wait for the workers to finish,
 	 * or we might get incomplete data.)
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index da9a8fe99f..648166bb3b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1075,6 +1075,8 @@ CREATE VIEW pg_stat_database AS
 pg_stat_get_db_sessions_killed(D.oid) AS sessions_killed,
 pg_stat_get_db_parallel_workers_to_launch(D.oid) as parallel_workers_to_launch,
 pg_stat_get_db_parallel_workers_launched(D.oid) as parallel_workers_launched,
+pg_stat_get_db_parallel_maint_workers_to_launch(D.oid) as parallel_maint_workers_to_launch,
+pg_stat_get_db_parallel_maint_workers_launched(D.oid) as parallel_maint_workers_launched,
 pg_stat_get_db_stat_reset_time(D.oid) AS stats_reset
 FROM (
 SELECT 0 AS oid, NULL::name AS datname
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 77679e8df6..edd0823353 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -443,6 +443,12 @@ parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
 {
 	Assert(!IsParallelWorker());
 
+	if (pvs->nworkers_to_launch > 0)
+		pgstat_update_parallel_maint_workers_stats(
+			(PgStat_Counter) pvs->pcxt->nworkers_to_launch,

Re: Logging parallel worker draught

2024-10-18 Thread Benoit Lobréau

On 10/15/24 09:52, Benoit Lobréau wrote:

Thank you, it's a lot cleaner that way.
I'll add this asap.


This is an updated version with the suggested changes.

--
Benoit Lobréau
Consultant
http://dalibo.com

From b9bf7c0fa967c72972fd77333a768dfe89d91765 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Fri, 11 Oct 2024 23:59:15 +0200
Subject: [PATCH 5/5] Implements logging for parallel worker usage in queries

---
 src/backend/executor/execMain.c| 11 +++
 src/backend/executor/execUtils.c   |  3 +++
 src/backend/executor/nodeGather.c  |  5 +
 src/backend/executor/nodeGatherMerge.c |  5 +
 src/include/nodes/execnodes.h  |  4 
 5 files changed, 28 insertions(+)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index cc9a594cba..b85d679b7f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -483,6 +483,17 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
 
 	Assert(estate != NULL);
 
+	if (LoggingParallelWorkers(log_parallel_workers,
+			   estate->es_parallel_workers_to_launch,
+			   estate->es_parallel_workers_launched))
+		elog(LOG, "%i parallel nodes planned (%i obtained all their workers, %i obtained none), "
+  "%i workers planned to be launched (%i workers launched)",
+			estate->es_parallel_nodes,
+			estate->es_parallel_nodes_success,
+			estate->es_parallel_nodes_no_workers,
+			estate->es_parallel_workers_to_launch,
+			estate->es_parallel_workers_launched);
+
 	/*
 	 * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This
 	 * Assert is needed because ExecutorFinish is new as of 9.1, and callers
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 6712302ec8..fd463f7e4c 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -160,6 +160,9 @@ CreateExecutorState(void)
 	estate->es_use_parallel_mode = false;
 	estate->es_parallel_workers_to_launch = 0;
 	estate->es_parallel_workers_launched = 0;
+	estate->es_parallel_nodes = 0;
+	estate->es_parallel_nodes_success = 0;
+	estate->es_parallel_nodes_no_workers = 0;
 
 	estate->es_jit_flags = 0;
 	estate->es_jit = NULL;
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index 7f7edc7f9f..49a96f7cbf 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -188,6 +188,10 @@ ExecGather(PlanState *pstate)
 			 */
 			estate->es_parallel_workers_to_launch += pcxt->nworkers_to_launch;
 			estate->es_parallel_workers_launched += pcxt->nworkers_launched;
+			estate->es_parallel_nodes += 1;
+
+			if (pcxt->nworkers_to_launch == pcxt->nworkers_launched)
+estate->es_parallel_nodes_success += 1;
 
 			/* Set up tuple queue readers to read the results. */
 			if (pcxt->nworkers_launched > 0)
@@ -205,6 +209,7 @@ ExecGather(PlanState *pstate)
 /* No workers?	Then never mind. */
 node->nreaders = 0;
 node->reader = NULL;
+estate->es_parallel_nodes_no_workers += 1;
 			}
 			node->nextreader = 0;
 		}
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index bc99c0b448..f3aa7ee14f 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -229,6 +229,10 @@ ExecGatherMerge(PlanState *pstate)
 			 */
 			estate->es_parallel_workers_to_launch += pcxt->nworkers_to_launch;
 			estate->es_parallel_workers_launched += pcxt->nworkers_launched;
+			estate->es_parallel_nodes += 1;
+
+			if (pcxt->nworkers_to_launch == pcxt->nworkers_launched)
+estate->es_parallel_nodes_success += 1;
 
 			/* Set up tuple queue readers to read the results. */
 			if (pcxt->nworkers_launched > 0)
@@ -246,6 +250,7 @@ ExecGatherMerge(PlanState *pstate)
 /* No workers?	Then never mind. */
 node->nreaders = 0;
 node->reader = NULL;
+estate->es_parallel_nodes_no_workers += 1;
 			}
 		}
 
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index e4698a28c4..d87af53853 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -713,6 +713,10 @@ typedef struct EState
 	int			es_parallel_workers_launched;	/* number of workers actually
  * launched. */
 
+	int			es_parallel_nodes;
+	int			es_parallel_nodes_success;
+	int			es_parallel_nodes_no_workers;
+
 	/* The per-query shared memory area to use for parallel execution. */
 	struct dsa_area *es_query_dsa;
 
-- 
2.46.2

From f13f1da31e4e45c68c3afae5c71572008caf6e84 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Fri, 11 Oct 2024 23:58:40 +0200
Subject: [PATCH 4/5] Implements logging for parallel worker usage in vacuums

---
 src/backend/commands/vacuumparallel.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/com

Re: Parallel workers stats in pg_stat_database

2024-11-11 Thread Benoit Lobréau

On 11/11/24 02:51, Michael Paquier wrote:

Okidoki, applied.  If tweaks are necessary depending on the feedback,
like column names, let's tackle things as required.  We still have a
good chunk of time for this release cycle.
--
Michael


Thanks !

--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Parallel workers stats in pg_stat_database

2024-11-12 Thread Benoit Lobréau




On 11/12/24 16:24, Michael Banck wrote:

I am not sure "backend state" is a good reason (unless it is exposed
somewhere to users?), but the point about utilities does make sense I
guess.


We only track parallel workers used by queries right now.

Tracking for parallel index builds (btree & brin) and vacuum cleanup is
not committed yet, since those are less common occurrences. I implemented
it in separate counters.

--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Parallel workers stats in pg_stat_database

2024-11-12 Thread Benoit Lobréau

On 11/12/24 15:05, Michael Banck wrote:

I was wondering about the weird new column name workers_to_launch when I
read the commit message - AFAICT this has been an internal term so far,
and this is the first time we expose it to users?

I personally find (parallel_)workers_planned/launched clearer from a
user perspective, was it discussed that we need to follow the internal
terms here? If so, I missed that discussion in this thread (and the
other thread that lead to cf54a2c00).


Michael


I initially called it that, but changed it to mirror the column name
added in pg_stat_statements for consistency's sake. I prefer "planned",
but English is clearly not my strong suit, and I assumed it implied that
the number of workers planned could change before execution. I just
checked in parallel.c and I don't think that's the case; could it happen
elsewhere?

--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Logging parallel worker draught

2024-09-18 Thread Benoit Lobréau

Here is a new version that fixes the aforementioned problems.

If this patch is accepted in this form, the counters could be used for 
the patch in pg_stat_database. [1]
[1] 
https://www.postgresql.org/message-id/flat/783bc7f7-659a-42fa-99dd-ee0565644...@dalibo.com


--
Benoit Lobréau
Consultant
http://dalibo.com

From d82cea62d4ab53d3d77054286cddb1536172c8c0 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Mon, 26 Aug 2024 13:48:44 +0200
Subject: [PATCH] Add logging for parallel worker usage

The new guc log_parallel_workers controls whether a log message is
produced to display information on the number of workers spawned when a
parallel query or utility is executed. The default value is `none`
which disables logging. `all` displays information for all parallel
queries, whereas `failures` displays information only when the number of
workers launched is lower than the number of planned workers. This new
parameter can help database administrators and developers diagnose
performance issues related to parallelism and optimize the configuration
of the system accordingly.
---
 doc/src/sgml/config.sgml  | 18 ++
 src/backend/access/brin/brin.c|  8 
 src/backend/access/nbtree/nbtsort.c   |  8 
 src/backend/commands/vacuumparallel.c | 17 +
 src/backend/executor/execMain.c   | 12 
 src/backend/executor/execUtils.c  |  6 ++
 src/backend/executor/nodeGather.c |  8 
 src/backend/executor/nodeGatherMerge.c|  8 
 src/backend/utils/misc/guc_tables.c   | 12 
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/access/parallel.h | 15 +++
 src/include/nodes/execnodes.h |  6 ++
 12 files changed, 119 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0aec11f443..b687bf3507 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7696,6 +7696,24 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
  
 
+
+  log_parallel_workers (enum)
+  
+   log_parallel_workers configuration parameter
+  
+  
+  
+   
+Controls whether a log message is produced to display information on the number of
+workers spawned when a parallel query or utility is executed. The default value is
+none which disables logging. all displays
+information for all parallel queries, whereas failures displays
+information only when the number of workers launched is lower than the number of
+planned workers.
+   
+  
+ 
+
  
   log_parameter_max_length (integer)
   
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 60853a0f6a..2a516911e7 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -2540,6 +2540,14 @@ _brin_end_parallel(BrinLeader *brinleader, BrinBuildState *state)
/* Shutdown worker processes */
WaitForParallelWorkersToFinish(brinleader->pcxt);
 
+   if ((log_parallel_workers == LOG_PARALLEL_WORKERS_ALL &&
+brinleader->pcxt->nworkers_to_launch > 0) ||
+   (log_parallel_workers == LOG_PARALLEL_WORKERS_FAILURE &&
+brinleader->pcxt->nworkers_to_launch > brinleader->pcxt->nworkers_launched))
+   elog(LOG, "%i workers planned (%i workers launched)",
+   brinleader->pcxt->nworkers_to_launch,
+   brinleader->pcxt->nworkers_launched);
+
/*
 * Next, accumulate WAL usage.  (This must wait for the workers to finish,
 * or we might get incomplete data.)
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index f5d7b3b0c3..eaa3e1bac1 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1611,6 +1611,14 @@ _bt_end_parallel(BTLeader *btleader)
/* Shutdown worker processes */
WaitForParallelWorkersToFinish(btleader->pcxt);
 
+   if ((log_parallel_workers == LOG_PARALLEL_WORKERS_ALL &&
+btleader->pcxt->nworkers_to_launch > 0) ||
+   (log_parallel_workers == LOG_PARALLEL_WORKERS_FAILURE &&
+btleader->pcxt->nworkers_to_launch > btleader->pcxt->nworkers_launched))
+   elog(LOG, "%i workers planned (%i workers launched)",
+   btleader->pcxt->nworkers_to_launch,
+   btleader->pcxt->nworkers_launched);
+
/*
 * Next, accumulate WAL usage.  (This must wait for the workers to finish,
 * or we might get incomplete data.)
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c

Re: Parallel workers stats in pg_stat_database

2024-10-02 Thread Benoit Lobréau

Hi,

Thanks for your input! I will fix the doc as proposed and do the split
as soon as I have time.


On 10/1/24 09:27, Michael Paquier wrote:

I'm less
a fan of the addition for utilities because these are less common
operations.


My thought process was that, in order to size max_parallel_workers, we
need information on both maintenance parallel workers and "query"
parallel workers.



Actually, could we do better than what's proposed here?  How about
presenting an aggregate of this data in pg_stat_statements for each
query instead?


I think both features are useful.

My colleagues and I had a discussion about what could be done to improve
parallelism observability in PostgreSQL [0]. We considered several
places to do it, for several use cases.

Guillaume Lelarge worked on pg_stat_statements [1] and
pg_stat_user_[tables|indexes] [2]. I proposed a patch for the logs [3].

As a consultant, I frequently work on installations without
pg_stat_statements, and I cannot install it on the client's production
systems within the timeframe of my intervention.

pg_stat_database is available everywhere and can easily be sampled by 
collectors/supervision services (like check_pgactivity).


Lastly, the numbers would be more precise and easier to make sense of,
since pg_stat_statements has a limited size.


[0] 
https://www.postgresql.org/message-id/flat/d657df20-c4bf-63f6-e74c-cb85a81d0...@dalibo.com
[1] 
https://www.postgresql.org/message-id/CAECtzeWtTGOK0UgKXdDGpfTVSa5bd_VbUt6K6xn8P7X%2B_dZqKw%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/flat/CAECtzeXXuMkw-RVGTWvHGOJsmFdsRY%2BjK0ndQa80sw46y2uvVQ%40mail.gmail.com
[3] 
https://www.postgresql.org/message-id/8123423a-f041-4f4c-a771-bfd96ab235b0%40dalibo.com


--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Parallel workers stats in pg_stat_database

2024-11-08 Thread Benoit Lobréau

On 11/8/24 05:08, Michael Paquier wrote:

On Thu, Nov 07, 2024 at 02:36:58PM +0900, Michael Paquier wrote:

Incorrect comment format, about which pgindent does not complain..

..  But pgindent complains in execMain.c and pgstat_database.c.  These
are only nits, the patch is fine.  If anybody has objections or
comments, feel free.


Found a few more things, but overall it was fine.  Here is what I have
staged on my local branch.
--
Michael


Hi,

I just reread the patch.
Thanks for the changes. It looks great.

--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Logging parallel worker draught

2024-12-26 Thread Benoit Lobréau

This is just a rebase.

As stated before, I added some information to the error message for
parallel queries as an experiment. It can be removed if you're not
convinced.


---
Benoit Lobréau
Consultant
http://dalibo.com

From 7e63f79226357f2fe84acedb950e64383e582b17 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Tue, 8 Oct 2024 12:39:41 +0200
Subject: [PATCH 1/5] Add a guc for parallel worker logging

The new guc log_parallel_workers controls whether a log message is
produced to display information on the number of workers spawned when a
parallel query or utility is executed.

The default value is `none` which disables logging. `all` displays
information for all parallel queries, whereas `failures` displays
information only when the number of workers launched is lower than the
number of planned workers.

This new parameter can help database administrators and developers
diagnose performance issues related to parallelism and optimize the
configuration of the system accordingly.
---
 doc/src/sgml/config.sgml  | 18 ++
 src/backend/access/transam/parallel.c | 15 +++
 src/backend/utils/misc/guc_tables.c   | 12 
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/access/parallel.h | 19 +++
 5 files changed, 65 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fbdd6ce574..3a5f7131b3 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7752,6 +7752,24 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
  
 
+
+  log_parallel_workers (enum)
+  
+   log_parallel_workers configuration parameter
+  
+  
+  
+   
+Controls whether a log message about the number of workers is produced during the
+execution of a parallel query or utility statement. The default value is
+none which disables logging. all displays
+information for all parallel queries, whereas failures displays
+information only when the number of workers launched is lower than the number of
+planned workers.
+   
+  
+ 
+
  
   log_parameter_max_length (integer)
   
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 0a1e089ec1..e1d378efa6 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1653,3 +1653,18 @@ LookupParallelWorkerFunction(const char *libraryname, const char *funcname)
 	return (parallel_worker_main_type)
 		load_external_function(libraryname, funcname, true, NULL);
 }
+
+/*
+ * The function determines if information about workers should
+ * be logged.
+*/
+bool
+LoggingParallelWorkers(int log_parallel_workers,
+	   int parallel_workers_to_launch,
+	   int parallel_workers_launched)
+{
+	return ((log_parallel_workers == LOG_PARALLEL_WORKERS_ALL &&
+			parallel_workers_to_launch > 0) ||
+			(log_parallel_workers == LOG_PARALLEL_WORKERS_FAILURE &&
+			parallel_workers_to_launch != parallel_workers_launched));
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8cf1afbad2..cc4d70177d 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -482,6 +482,7 @@ extern const struct config_enum_entry archive_mode_options[];
 extern const struct config_enum_entry recovery_target_action_options[];
 extern const struct config_enum_entry wal_sync_method_options[];
 extern const struct config_enum_entry dynamic_shared_memory_options[];
+extern const struct config_enum_entry log_parallel_workers_options[];
 
 /*
  * GUC option variables that are exported from this module
@@ -526,6 +527,7 @@ int			log_min_duration_statement = -1;
 int			log_parameter_max_length = -1;
 int			log_parameter_max_length_on_error = 0;
 int			log_temp_files = -1;
+int			log_parallel_workers = LOG_PARALLEL_WORKERS_NONE;
 double		log_statement_sample_rate = 1.0;
 double		log_xact_sample_rate = 0;
 char	   *backtrace_functions;
@@ -5226,6 +5228,16 @@ struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"log_parallel_workers", PGC_SUSET, LOGGING_WHAT,
+			gettext_noop("Log information about parallel worker usage"),
+			NULL
+		},
+		&log_parallel_workers,
+		LOG_PARALLEL_WORKERS_NONE, log_parallel_workers_options,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a2ac7575ca..f11bf173ff 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -622,6 +622,7 @@
 #log_temp_files = -1			# log temporary files equal or larger
 	# than the specified size in kilobytes;
 	# -1 disables, 0 logs all temp files
+#log_pa

Re: Logging parallel worker draught

2025-02-05 Thread Benoit Lobréau

On 2/3/25 6:36 AM, Sami Imseih wrote:

As far as the current set of patches, I had some other changes that
I missed earlier; indentation to the calls in LogParallelWorkersIfNeeded
and comment for the LogParallelWorkersIfNeeded function. I also re-worked the
setup of the GUC as it was not setup the same way as other
GUCs with an options list. I ended up just making those changes
in v8.



Besides that, I think this is ready for committer.


Thank you for your time and advice on this.

--
Benoit Lobréau
Consultant
http://dalibo.com





Re: Proposal to Enable/Disable Index using ALTER INDEX (with patch)

2025-01-30 Thread Benoit Lobréau
I did some additional testing with this command within transactions.

I had the "BEGIN; ALTER INDEX; EXPLAIN; ROLLBACK;" scenario in mind, but
didn't realise we acquire an AccessExclusiveLock on the index. Therefore, it's
not possible to change the visibility within a single transaction
unless you don’t mind blocking all access to the relation.

I read the comments at the top of "AlterTableGetLockLevel" and in the
documentation and I understand that this behavior seems unavoidable.
I suppose this is what was meant by the "globally visible effects" of an
ALTER INDEX in the old discussion? [1]

Being able to roll back the changes is nice, but in this case there is
not much to alter back anyway. This is probably not the intended use
case (hence the discussion about GUCs and hints).

[1] https://www.postgresql.org/message-id/30558.1529359929%40sss.pgh.pa.us




Re: doc: explain pgstatindex fragmentation

2025-01-22 Thread Benoit Lobréau

Here is an updated patch to save you some time.

--
Benoit Lobréau
Consultant
http://dalibo.com
From 5d89ca0a36e98d1e5be858166a8394c09e0fa129 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= 
Date: Tue, 5 Nov 2024 17:59:44 +0100
Subject: [PATCH] doc: explain pgstatindex fragmentation

It was quite hard to guess what leaf_fragmentation meant without looking
at pgstattuple's code. This patch aims to give the user a better idea of
what it means.
---
 doc/src/sgml/pgstattuple.sgml | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/doc/src/sgml/pgstattuple.sgml b/doc/src/sgml/pgstattuple.sgml
index 4071da4ed94..280f1ac9ce8 100644
--- a/doc/src/sgml/pgstattuple.sgml
+++ b/doc/src/sgml/pgstattuple.sgml
@@ -277,6 +277,13 @@ leaf_fragmentation | 0
  page-by-page, and should not be expected to represent an
  instantaneous snapshot of the whole index.
 
+
+
+ avg_leaf_density can be seen as the inverse of bloat,
+ while leaf_fragmentation represents a measure of disorder.
+ A higher leaf_fragmentation indicates that the physical
+ order of the index leaf pages increasingly deviates from their logical order.
+
 

 
-- 
2.48.1



doc: explain pgstatindex fragmentation

2025-01-21 Thread Benoit Lobréau

Hi,

I like the clarification, but I think that:

A higher leaf_fragmentation indicates that the physical order of the
index leaf pages increasingly deviates from their logical order.

Would be cleaner than:

The higher leaf_fragmentation is, the less the physical order of the
index leaf pages corresponds to the logical order we would have just
after a REINDEX.

--
Benoit Lobréau
Consultant
http://dalibo.com




Re: Doc: Move standalone backup section, mention -X argument

2025-01-23 Thread Benoit Lobréau

On 1/23/25 4:18 AM, David G. Johnston wrote:
Aside from the name choice this is what I propose, so can you elaborate 
on what doesn't feel right?  You cannot have both "Standalone Physical 
Backup" and "File System Level Backup" co-exist so maybe that was it - 
not realizing that your "new" section is just my proposal?


The current "File System Level Backup" section describes OS-centric, 
mostly cold backups (part of the file system snapshot solutions can be 
considered hotish).


On the other hand "Standalone Hot Backups" use PostgreSQL's backup API 
and the WALs. They can be fetched using archiving which is described in 
the "Continuous Archiving and (PITR)" section, or through streaming 
(e.g., pg_basebackup -X stream or with pg_receivewal). Overall, I feel 
these backups share more in common with what is described in section 
"25.3" than in "25.2".


I also wasn't a fan of the following:

* That standalone hot backups with the backup API disappear in your proposal.

* Of the sentence "PostgreSQL provides a tool, pg_basebackup, that can 
produce a similar standalone backup to the one produced by pg_dump," 
because I don't think they have much in common aside from being standalone.



My initial annoyance was having the following sentence in a section 
named, in part, PITR.


"These are backups that cannot be used for point-in-time recovery."

Which suggests the advice is fundamentally misplaced when in PITR sect2.


Yes, I totally agree. I didn’t like that either and tried to address it 
by renaming the section to:


25.3. Continuous Archiving, backups and Point-in-Time Recovery (PITR)

If that’s not sufficient, how about:

25.3. PostgreSQL level physical backups and recovery
25.3.1. Setting Up WAL Archiving
25.3.2. Making a Base Backup
25.3.3. Making an Incremental Backup
25.3.4. Making a Base Backup Using the Low Level API
25.3.5. Making a Standalone Hot Backup
25.3.6. Recovering Using a Continuous Archive Backup (PITR)
25.3.7. Timelines
25.3.8. Tips and Examples
25.3.9. Caveats

Note: As I mentioned to you privately, I made a mistake and broke the 
thread. I’ve added the new thread to the commit fest. Here is a link to 
the old one to help others follow the conversation:


https://www.postgresql.org/message-id/flat/CAKFQuwaS6DtSde4TWpk133mfaQbgh8d%2BPkk0kDN%3D6jf6qEWbvQ%40mail.gmail.com

--
Benoit Lobréau
Consultant
http://dalibo.com





Re: Proposal to Enable/Disable Index using ALTER INDEX (with patch)

2025-01-24 Thread Benoit Lobréau
Hi,

Thank you for the patch! I've had a need for this feature several times,
so I appreciate the work you’ve put into it.

I like the new name VISIBLE/INVISIBLE and the fact that it's a separate flag in
pg_index (it's easy to monitor).

I don’t feel qualified to provide an opinion on the code itself just yet.

I did notice something in the command prototype:

+ALTER INDEX [ IF EXISTS ] name VISIBLE
+ALTER INDEX [ IF EXISTS ] name INVISIBLE

it would probably be better as:

 +ALTER INDEX [ IF EXISTS ] name {VISIBLE|INVISIBLE}

The completion for the INVISIBLE / VISIBLE keyword is missing in psql.


I also tested the ALTER command within a transaction, and it worked as I
expected: the changes are transactional (possibly because you didn’t use
systable_inplace_update_begin?).


Additionally, I tried using the ALTER command on an index that supports
a foreign key. As expected, delete and update operations on the referenced
table became significantly slower. I was wondering if this behavior should
be documented here.

+  Make the specified index invisible. The index will not be used for queries.
+  This can be useful for testing query performance with and without specific
+  indexes.

Maybe something like :

The index will not be used for user or system queries (e.g., an index
supporting foreign keys).

I noticed that you mentioned checking pg_stat_user_indexes before using
the query, but it might not be enough?



Re: Proposal to Enable/Disable Index using ALTER INDEX (with patch)

2025-01-24 Thread Benoit Lobréau
I also noticed that \d on an index doesn't warn about the invisible state
whereas \d on a table does:

[local]:5444 postgres@postgres=# SELECT indexrelid::regclass,
indisvalid, indisvisible FROM pg_index WHERE indexrelid =
'repli_pkey'::regclass \gx
-[ RECORD 1 ]+---
indexrelid   | repli_pkey
indisvalid   | f
indisvisible | f

[local]:5444 postgres@postgres=# \d repli_pkey
  Index "public.repli_pkey"
 Column |  Type   | Key? | Definition
+-+--+
 i  | integer | yes  | i
primary key, btree, for table "public.repli", invalid

[local]:5444 postgres@postgres=# \d repli
   Table "public.repli"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 i  | integer |   | not null |
 t  | text|   |  |
Indexes:
"repli_pkey" PRIMARY KEY, btree (i) INVISIBLE INVALID
Publications:
"pub"

The attached patch adds the flag.

[local]:5444 postgres@postgres=# \d repli_pkey
  Index "public.repli_pkey"
 Column |  Type   | Key? | Definition
+-+--+
 i  | integer | yes  | i
primary key, btree, for table "public.repli", invalid, invisible
From bf3f11e5e88a30a9c1affd9678dadec9bc236351 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Fri, 24 Jan 2025 16:12:45 +0100
Subject: [PATCH 2/3] Add the invisible tag for indexes in \d

---
 src/bin/psql/describe.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 2ef99971ac0..5d1acbd149d 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2318,6 +2318,11 @@ describeOneTableDetails(const char *schemaname,
 		else
 			appendPQExpBufferStr(&buf, "false AS indnullsnotdistinct,\n");
 
+		if (pset.sversion >= 18)
+			appendPQExpBufferStr(&buf, "i.indisvisible,\n");
+		else
+			appendPQExpBufferStr(&buf, "true AS indisvisible,\n");
+
 		appendPQExpBuffer(&buf, "  a.amname, c2.relname, "
 		  "pg_catalog.pg_get_expr(i.indpred, i.indrelid, true)\n"
 		  "FROM pg_catalog.pg_index i, pg_catalog.pg_class c, pg_catalog.pg_class c2, pg_catalog.pg_am a\n"
@@ -2343,9 +2348,10 @@ describeOneTableDetails(const char *schemaname,
 			char	   *deferred = PQgetvalue(result, 0, 5);
 			char	   *indisreplident = PQgetvalue(result, 0, 6);
 			char	   *indnullsnotdistinct = PQgetvalue(result, 0, 7);
-			char	   *indamname = PQgetvalue(result, 0, 8);
-			char	   *indtable = PQgetvalue(result, 0, 9);
-			char	   *indpred = PQgetvalue(result, 0, 10);
+			char	   *indisvisible = PQgetvalue(result, 0, 8);
+			char	   *indamname = PQgetvalue(result, 0, 9);
+			char	   *indtable = PQgetvalue(result, 0, 10);
+			char	   *indpred = PQgetvalue(result, 0, 11);
 
 			if (strcmp(indisprimary, "t") == 0)
 printfPQExpBuffer(&tmpbuf, _("primary key, "));
@@ -2382,6 +2388,9 @@ describeOneTableDetails(const char *schemaname,
 			if (strcmp(indisreplident, "t") == 0)
 appendPQExpBufferStr(&tmpbuf, _(", replica identity"));
 
+			if (strcmp(indisvisible, "t") != 0)
+appendPQExpBufferStr(&tmpbuf, _(", invisible"));
+
 			printTableAddFooter(&cont, tmpbuf.data);
 
 			/*
-- 
2.48.1



Re: Proposal to Enable/Disable Index using ALTER INDEX (with patch)

2025-01-24 Thread Benoit Lobréau
On Fri, Jan 24, 2025 at 11:32 AM Benoit Lobréau
 wrote:
> The completion for the INVISIBLE / VISIBLE keyword is missing in psql.

I think this should to the trick ?

diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 81cbf10aa28..43ea8e55fd0 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -2393,7 +2393,8 @@ match_previous_words(int pattern_id,
else if (Matches("ALTER", "INDEX", MatchAny))
COMPLETE_WITH("ALTER COLUMN", "OWNER TO", "RENAME TO", "SET",
  "RESET", "ATTACH PARTITION",
- "DEPENDS ON EXTENSION", "NO
DEPENDS ON EXTENSION");
+ "DEPENDS ON EXTENSION", "NO
DEPENDS ON EXTENSION",
+ "INVISIBLE", "VISIBLE");
else if (Matches("ALTER", "INDEX", MatchAny, "ATTACH"))
COMPLETE_WITH("PARTITION");
else if (Matches("ALTER", "INDEX", MatchAny, "ATTACH", "PARTITION"))




Re: Logging parallel worker draught

2025-01-20 Thread Benoit Lobréau

Here is a new set of patches.

The following changed:

* rebase
* simplify the log message to go back to "launched X parallel workers 
(planned: Y)"

* rename the "failure" configuration item to "shortage".

On 1/3/25 17:24, Sami Imseih wrote:

Maintenance work is usually planned, so if queries issued by the
applications are not launching enough workers, it's easy to point the
blame on the maintenance activity.


I often work on systems I have no prior knowledge of. Some of these 
systems have external schedulers. Having information in PostgreSQL's 
logs is really useful in such cases.



Maybe it's better to log parallel maintenance workers separately actually
if there is a truly good justification for it. As it stands now, even
pg_stat_database
does not track maintenance workers. Maybe adding logging could also be part
of that discussion?


The original patch on pg_stat_database included this information. I 
still think that having a centralized way to get the information is 
important, whether in the logs and/or pg_stat_database (preferably both).


I feel that observability is important, and I don't understand why we 
would want to have the information for only a portion of the 
functionality's usage (even if it's the most important).


--
Benoit Lobréau
Consultant
http://dalibo.com

From b99dc5c1c549921ed2ac054db4e8c5398543dfc4 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Tue, 8 Oct 2024 12:39:41 +0200
Subject: [PATCH 1/5] Add a guc for parallel worker logging

The new guc log_parallel_workers controls whether a log message is
produced to display information on the number of workers spawned when a
parallel query or utility is executed.

The default value is `none` which disables logging. `all` displays
information for all parallel queries, whereas `shortage` displays
information only when the number of workers launched is lower than the
number of planned workers.

This new parameter can help database administrators and developers
diagnose performance issues related to parallelism and optimize the
configuration of the system accordingly.
---
 doc/src/sgml/config.sgml  | 18 ++
 src/backend/access/transam/parallel.c | 15 +++
 src/backend/utils/misc/guc_tables.c   | 12 
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/access/parallel.h | 19 +++
 5 files changed, 65 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a8866292d46..095bb751183 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7611,6 +7611,24 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
  
 
+
+  log_parallel_workers (enum)
+  
+   log_parallel_workers configuration parameter
+  
+  
+  
+   
+Controls whether a log message about the number of workers is produced during the
+execution of a parallel query or utility statement. The default value is
+none which disables logging. all displays
+information for all parallel queries or utilities, whereas shortage
+displays information only when the number of workers launched is lower than the number
+of planned workers.
+   
+  
+ 
+
  
   log_parameter_max_length (integer)
   
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 7817bedc2ef..d5b630e055b 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1659,3 +1659,18 @@ LookupParallelWorkerFunction(const char *libraryname, const char *funcname)
 	return (parallel_worker_main_type)
 		load_external_function(libraryname, funcname, true, NULL);
 }
+
+/*
+ * The function determines if information about workers should
+ * be logged.
+*/
+bool
+LoggingParallelWorkers(int log_parallel_workers,
+	   int parallel_workers_to_launch,
+	   int parallel_workers_launched)
+{
+	return ((log_parallel_workers == LOG_PARALLEL_WORKERS_ALL &&
+			parallel_workers_to_launch > 0) ||
+			(log_parallel_workers == LOG_PARALLEL_WORKERS_SHORTAGE &&
+			parallel_workers_to_launch != parallel_workers_launched));
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 38cb9e970d5..fe38758fd57 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -482,6 +482,7 @@ extern const struct config_enum_entry archive_mode_options[];
 extern const struct config_enum_entry recovery_target_action_options[];
 extern const struct config_enum_entry wal_sync_method_options[];
 extern const struct config_enum_entry dynamic_shared_memory_options[];
+extern const struct config_enum_entry log_parallel_workers_options[];
 
 /*
  * GUC option variables that are exported from this module
@@ -526,6 +527,7 @@ int			log_min_duration_statem

Re: doc: explain pgstatindex fragmentation

2025-01-27 Thread Benoit Lobréau

On 1/25/25 7:07 PM, Laurenz Albe wrote:

Looks good to me.  I have one question left: the explanation for the
performance penalty of a high leaf fragmentation sounds like it would
only be relevant for disks where sequential reads are faster.  If that
is correct, perhaps it would be worth mentioning.


Hi Laurenz,

Frédéric is on holiday this week, so he might not be able to answer;
I'll try to do it in his stead.

Frédéric noticed a performance hit even on his laptop with an SSD.

On Fri, 2025-01-24 at 15:41 +0100, Frédéric Yhuel wrote:
> I've noticed that maximum leaf_fragmentation can have a huge impact on
> a range index-only scan, when reading all blocs from disks, even on my
> laptop machine with SSD, but I don't know if this is the right place
> to document this?

He reported to our team that he did a test with two indexes on the same
data. They had the same density, but one had no fragmentation while the
other had 100%. He got an execution time of ~90ms (0% frag) vs ~340ms
(100% frag).

I get similar results on my laptop (except my disk is significantly
worse: ~152ms vs ~833ms).


Here are the scripts.

--
Benoit Lobréau
Consultant
http://dalibo.com


evict_from_both_caches.sql
Description: application/sql


leaf_fragmentation.sql
Description: application/sql


Re: Doc: Move standalone backup section, mention -X argument

2025-01-22 Thread Benoit Lobréau

Hi,

Initially, I shared your perspective, but I wasn’t entirely on board 
with the subdivision you proposed. The more I thought about it and tried 
to reshape it, the less convinced I became.


I don’t think pg_basebackup fits naturally under the "File System Level 
Backup" section. I considered creating a "Standalone Physical Backup" 
section with two subsections: FS-level backups and pg_basebackup, but 
that didn’t feel right either.


What I find most problematic about the current state of the
documentation is that this solution is buried in the "Tips and Examples"
section. As a result, it's easy to miss when skimming through the docs,
since the list of subsections displayed at the top of the page and in
"Chapter 25. Backup and Restore" stops three levels deep.


What if we just move the "Standalone Hot Backups" section up one level
and rename the level-2 section? Something like this?


25.3. Continuous Archiving, backups and Point-in-Time Recovery (PITR)
25.3.1. Setting Up WAL Archiving
25.3.2. Making a Base Backup
25.3.3. Making an Incremental Backup
25.3.4. Making a Base Backup Using the Low Level API
25.3.5. Making a Standalone Hot Backup
25.3.6. Recovering Using a Continuous Archive Backup
25.3.7. Timelines
25.3.8. Tips and Examples
25.3.9. Caveats
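For context, the standalone backup being discussed boils down to a single
pg_basebackup invocation; a minimal sketch (the target path is
illustrative):

```
# -X stream bundles the WAL generated during the backup into the
# backup itself, so it can be restored without a WAL archive.
pg_basebackup -D /path/to/standalone_backup -Ft -z -X stream
```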

--
Benoit Lobréau
Consultant
http://dalibo.com





Re: Logging parallel worker draught

2025-01-29 Thread Benoit Lobréau

On 1/29/25 12:41 AM, Sami Imseih wrote:

There will both be an INFO ( existing behavior ) and LOG ( new behavior ).
This seems wrong to me and there should only really be one
mechanism to log parallel workers for utility statements.
Others may have different opinions.


In the use case you describe, I agree that the VERBOSE option is more 
suited for the job. But it's not the use case I had in mind for this 
GUC.

The "story" I have in mind is: I need to audit an instance I know 
nothing about. I ask the client to adapt the logging parameters for 
pgbadger (including this one), collect the logs, and generate a report 
for that period to have a broad overview of what is happening.


I attached patches that implemented both your suggestions (I regrouped 
all the utilities together).


Thank you !

--
Benoit Lobréau
Consultant
http://dalibo.com
From 5ae1a4e619c1d7c8e899641e16dda831e792438d Mon Sep 17 00:00:00 2001
From: benoit 
Date: Tue, 8 Oct 2024 12:39:41 +0200
Subject: [PATCH 1/4] Add a guc for parallel worker logging

The new guc log_parallel_workers controls whether a log message is
produced to display information on the number of workers spawned when a
parallel query or utility is executed.

The default value is `none` which disables logging. `all` displays
information for all parallel queries, whereas `shortage` displays
information only when the number of workers launched is lower than the
number of planned workers.

This new parameter can help database administrators and developers
diagnose performance issues related to parallelism and optimize the
configuration of the system accordingly.
---
 doc/src/sgml/config.sgml  | 18 ++
 src/backend/access/transam/parallel.c | 18 ++
 src/backend/utils/misc/guc_tables.c   | 12 
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/access/parallel.h | 19 +++
 5 files changed, 68 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a782f109982..f8c879f1c10 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7612,6 +7612,24 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
  
 
+
+  log_parallel_workers (enum)
+  
+   log_parallel_workers configuration parameter
+  
+  
+  
+   
+Controls whether a log message about the number of workers is emitted during the
+execution of a parallel query or utility statement. The default value is
+none which disables logging. all emits
+information for all parallel queries or utilities, whereas shortage
+emits information only when the number of workers launched is lower than the number
+of planned workers.
+   
+  
+ 
+
  
   log_parameter_max_length (integer)
   
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 7817bedc2ef..d18d203d178 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1659,3 +1659,21 @@ LookupParallelWorkerFunction(const char *libraryname, const char *funcname)
 	return (parallel_worker_main_type)
 		load_external_function(libraryname, funcname, true, NULL);
 }
+
+/*
+ * This function emits information about workers in the logs depending
+ * on the setting of log_parallel_workers
+ */
+void
+LogParallelWorkersIfNeeded(int log_parallel_workers,
+			int parallel_workers_to_launch,
+			int parallel_workers_launched)
+{
+	if ((log_parallel_workers == LOG_PARALLEL_WORKERS_ALL &&
+		parallel_workers_to_launch > 0) ||
+		(log_parallel_workers == LOG_PARALLEL_WORKERS_SHORTAGE &&
+		parallel_workers_to_launch != parallel_workers_launched))
+		elog(LOG, "launched %i parallel workers (planned: %i)",
+			parallel_workers_launched,
+			parallel_workers_to_launch);
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 38cb9e970d5..fe38758fd57 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -482,6 +482,7 @@ extern const struct config_enum_entry archive_mode_options[];
 extern const struct config_enum_entry recovery_target_action_options[];
 extern const struct config_enum_entry wal_sync_method_options[];
 extern const struct config_enum_entry dynamic_shared_memory_options[];
+extern const struct config_enum_entry log_parallel_workers_options[];
 
 /*
  * GUC option variables that are exported from this module
@@ -526,6 +527,7 @@ int			log_min_duration_statement = -1;
 int			log_parameter_max_length = -1;
 int			log_parameter_max_length_on_error = 0;
 int			log_temp_files = -1;
+int			log_parallel_workers = LOG_PARALLEL_WORKERS_NONE;
 double		log_statement_sample_rate = 1.0;
 double		log_xact_sample_rate = 0;
 char	   *backtrace_functions;
@@ -5236,6 +5238,16 @@ struct 

Re: Logging parallel worker draught

2025-01-03 Thread Benoit Lobréau

Hi, thank you for the review and sorry for the delayed answer.

On 12/28/24 05:17, Sami Imseih wrote:

> Thinking about this further, it seems to me this logging only makes sense
> for parallel query and not maintenance commands. This is because
> for the latter case, the commands are executed manually and a user can issue
> VACUUM VERBOSE ( for the case of vacuum ). For CREATE INDEX, one
> can enable DEBUG1. For example:
>
> postgres=# CREATE INDEX tbl_c1 ON tbl(c1);
> DEBUG:  building index "tbl_c1" on table "tbl" with request for 1
> parallel workers
> DEBUG:  index "tbl_c1" can safely use deduplication
> CREATE INDEX


In my opinion, it's useful:

* Centralization of logs: The maintenance operation can be planned in
cron jobs. In that context, it could certainly be logged separately via
the script. However, when I am doing an audit on a client database, I
think it's useful to have all the relevant information in PostgreSQL logs.

* Centralization of the configuration: If I am auditing the usage of
parallelism, I would like to have a single GUC to enable the logging
for all relevant information.

* Context: Maintenance workers are part of the global pool of workers.
To know why there is a shortage after the fact, having all the information
on what was using workers could be useful. I tried this argument when we
added the parallel worker info to pg_stat_database and wasn't successful.
It would be nice to have the info somewhere.

The logging for CREATE INDEX and VACUUM is in separate patches. If you
don't mind, I would like to keep it as long as possible.

> 2/ For logging parallel query workers, the log line could be simpler.
>
> Instead of:
>
> + elog(LOG, "%i parallel nodes planned (%i obtained all their workers,
> %i obtained none), "
> + "%i workers planned to be launched (%i workers launched)",
>
> what about something like:
>
> "launched %d parallel workers (planned: %)"
>
> For example, if I have 2 gather nodes and they each plan and launch
> 2 workers, the log line should be as simple as
>
> "launched 4 parallel workers (planned: 4)"
>
> This behavior in which workers planned and launched are aggregated in the
> log can be described in the documentation.

"%i parallel nodes planned (%i obtained all their workers, %i obtained 
none),"


I added this part during my first reorganization. I was trying to extract
as much information as possible and thought that the special case where a
parallel node was planned but no workers were obtained was the worst-case
scenario, which might be interesting to log.

If you think it doesn't bring much value, we can strip it.

> 3/ Also, log_parallel_query_workers as the name of the GUC will make 
more sense

> if we go logging parallel query only.

I agree that log_parallel_query_workers would make more sense if we focus
on logging parallel queries only. Since the maintenance-related logging
was added later, I missed this point. Thank you for bringing this up.

If agreeable, I will rename the GUC to log_parallel_workers as long as
the argument regarding maintenance workers is unresolved. If we decide
to strip them, I will revert it to log_parallel_query_workers.

>> I don't think "failure" is a good name for the setting as it's
>> a bit too harsh. What we really have is a "shortage" of
>> workers.

I agree that "failure" is too harsh. The term "shortage" better describes
the situation.

Note: I added Tomas Vondra to the recipients of the email since he advised
me on this topic initially and might have an opinion on this.

--
Benoit Lobréau
Consultant
http://dalibo.com





Re: Fix logging for invalid recovery timeline

2025-02-21 Thread Benoit Lobréau

On 2/20/25 4:40 PM, David Steele wrote:

Benoit -- this was your idea. Did you want to submit a patch yourself?


Here is an attempt at that. I kept the wording I used above. Is it fine 
to repeat the whole ereport block twice?


--
Benoit Lobréau
Consultant
http://dalibo.com
From 44459bf799fca517b13aef03be952b0374463d5c Mon Sep 17 00:00:00 2001
From: benoit 
Date: Fri, 21 Feb 2025 14:09:56 +0100
Subject: [PATCH] Adjust logging for invalid recovery timeline

The original message doesn't mention where the checkpoint was read from.
It refers to the "Latest checkpoint"; that terminology is used for some
records from the control file (specifically, the pg_controldata output).
The backup label uses "Checkpoint location".
---
 src/backend/access/transam/xlogrecovery.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index f234007d348..7ada300bf9b 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -846,13 +846,22 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		 * not in expectedTLEs at all.
 		 */
 		switchpoint = tliSwitchPoint(CheckPointTLI, expectedTLEs, NULL);
-		ereport(FATAL,
-(errmsg("requested timeline %u is not a child of this server's history",
-		recoveryTargetTLI),
- errdetail("Latest checkpoint is at %X/%X on timeline %u, but in the history of the requested timeline, the server forked off from that timeline at %X/%X.",
-		   LSN_FORMAT_ARGS(CheckPointLoc),
-		   CheckPointTLI,
-		   LSN_FORMAT_ARGS(switchpoint))));
+		if (haveBackupLabel)
+			ereport(FATAL,
+	(errmsg("requested timeline %u is not a child of this server's history",
+			recoveryTargetTLI),
+	 errdetail("The checkpoint location in the backup_label file is at %X/%X on timeline %u, but in the history of the requested timeline, the server forked off from that timeline at %X/%X.",
+			   LSN_FORMAT_ARGS(CheckPointLoc),
+			   CheckPointTLI,
+			   LSN_FORMAT_ARGS(switchpoint))));
+		else
+			ereport(FATAL,
+	(errmsg("requested timeline %u is not a child of this server's history",
+			recoveryTargetTLI),
+	 errdetail("The latest checkpoint in the control file is at %X/%X on timeline %u, but in the history of the requested timeline, the server forked off from that timeline at %X/%X.",
+			   LSN_FORMAT_ARGS(CheckPointLoc),
+			   CheckPointTLI,
+			   LSN_FORMAT_ARGS(switchpoint))));
 	}
 
 	/*
-- 
2.48.1



Re: Fix logging for invalid recovery timeline

2025-02-25 Thread Benoit Lobréau

On 2/24/25 11:33 PM, Michael Paquier wrote:

On Mon, Feb 24, 2025 at 05:30:35PM +, David Steele wrote:

+/* translator: %s is a backup_label or
pg_control file */


See for example PostmasterMain() with the "/* translator: %s is a
configuration file */".


Thank you Michael and David.
I never paid attention to those...

--
Benoit Lobréau
Consultant
http://dalibo.com
From 9a12cad29afb86f2277bf76bc53757702715afd5 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Fri, 21 Feb 2025 14:09:56 +0100
Subject: [PATCH] Adjust logging for invalid recovery timeline

The original message doesn't mention where the checkpoint was read from.
It refers to the "Latest checkpoint"; that terminology is used for some
records from the control file (specifically, the pg_controldata output).
The backup label uses "Checkpoint location".
---
 src/backend/access/transam/xlogrecovery.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 915cb330295..65335e542ad 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -849,7 +849,9 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		ereport(FATAL,
 (errmsg("requested timeline %u is not a child of this server's history",
 		recoveryTargetTLI),
- errdetail("Latest checkpoint is at %X/%X on timeline %u, but in the history of the requested timeline, the server forked off from that timeline at %X/%X.",
+		/* translator: %s is a backup_label or pg_control file */
+ errdetail("Latest checkpoint in %s is at %X/%X on timeline %u, but in the history of the requested timeline, the server forked off from that timeline at %X/%X.",
+		   haveBackupLabel ? "backup_label" : "pg_control",
 		   LSN_FORMAT_ARGS(CheckPointLoc),
 		   CheckPointTLI,
 		   LSN_FORMAT_ARGS(switchpoint))));
-- 
2.48.1



Re: long-standing data loss bug in initial sync of logical replication

2025-02-25 Thread Benoit Lobréau

Hi,

After reading the thread and doing a bit of testing, the problem seems 
significant and is still present. The fact that it's probably not well 
known makes it more concerning, in my opinion. I was wondering what 
could be done to help move this topic forward (given my limited abilities)?


--
Benoit Lobréau
Consultant
http://dalibo.com





Re: Fix logging for invalid recovery timeline

2025-02-24 Thread Benoit Lobréau

On 2/24/25 2:05 AM, Michael Paquier wrote:

I think that you have the right idea here, avoiding the duplication
of the errdetail() which feels itchy when looking at the patch.  


Done this way in the attached patch.


This should have a note for translators that this field refers to a file
name.  


Is there a specific way to do this?

--
Benoit Lobréau
Consultant
http://dalibo.com
From 35dd8db22c997e78ba92f63ffca3022b435d7304 Mon Sep 17 00:00:00 2001
From: benoit 
Date: Fri, 21 Feb 2025 14:09:56 +0100
Subject: [PATCH] Adjust logging for invalid recovery timeline

The original message doesn't mention where the checkpoint was read from.
It refers to the "Latest checkpoint"; that terminology is used for some
records from the control file (specifically, the pg_controldata output).
The backup label uses "Checkpoint location".
---
 src/backend/access/transam/xlogrecovery.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 915cb330295..c5160bf5258 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -849,7 +849,8 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		ereport(FATAL,
 (errmsg("requested timeline %u is not a child of this server's history",
 		recoveryTargetTLI),
- errdetail("Latest checkpoint is at %X/%X on timeline %u, but in the history of the requested timeline, the server forked off from that timeline at %X/%X.",
+ errdetail("Latest checkpoint in %s is at %X/%X on timeline %u, but in the history of the requested timeline, the server forked off from that timeline at %X/%X.",
+		   haveBackupLabel ? "backup_label" : "pg_control",
 		   LSN_FORMAT_ARGS(CheckPointLoc),
 		   CheckPointTLI,
 		   LSN_FORMAT_ARGS(switchpoint))));
-- 
2.48.1



Re: Fix logging for invalid recovery timeline

2025-02-19 Thread Benoit Lobréau

Hi,

I think the changes make sense. Would it be helpful to add the origin of 
the checkpoint record we are referring to (i.e., the control file or the 
backup label)?


For example:

* Latest checkpoint in the control file is at %X/%X on timeline %u,
* Checkpoint location in the backup_label file is at %X/%X on timeline %u,

The current message refers to "Latest checkpoint", which automatically 
makes me think of the control file (specifically, the pg_controldata 
output).


Now that I have read the code, I understand which one we are referring 
to, but for someone who hasn't (or for me in a few months), it might be 
useful?


--
Benoit Lobréau
Consultant
http://dalibo.com





Re: long-standing data loss bug in initial sync of logical replication

2025-03-03 Thread Benoit Lobréau

On 3/3/25 8:41 AM, Zhijie Hou (Fujitsu) wrote:

A nitpick with the data for the Concurrent Transaction (2000) case. The results
show that the HEAD's data appears worse than the patch data, which seems
unusual. However, I confirmed that the details in the attachment are as 
expected,
so, this seems to be a typo. (I assume you intended to use a
decimal point instead of a comma in the data like (8,43500...))


Hi,

Argh, yes, sorry! I didn't pay enough attention and accidentally 
inverted the Patch and Head numbers in the last line when copying them 
from the ODS to the email to match the previous report layout.


The comma is due to how decimals are written in my language (comma 
instead of dot). I forgot to "translate" it.


Concurrent Txn |Head (sec)|Patch (sec) | Degradation in %
-
50 |   0.1797647  |   0.1920949|  6.85907744957
100|   0.3693029  |   0.3823425|  3.53086856344
500|   1.62265755 |   1.91427485   | 17.97158617972
1000   |   3.01388635 |   3.57678295   | 18.67676928162
2000   |   6.4713304  |   7.0171877|  8.43500897435



as Amit pointed out, we will share a new test script soon
that uses the SQL API xxx_get_changes() to test. It would be great if you could
verify the performance using the updated script as well.


Will do.

--
Benoit Lobréau
Consultant
http://dalibo.com





Re: long-standing data loss bug in initial sync of logical replication

2025-02-28 Thread Benoit Lobréau

Hi,

It took me a while, but I ran the tests on my laptop with 20 runs per 
test. I asked for a dedicated server and will re-run the tests if/when I 
have it.


count of partitions |   Head (sec) |Fix (sec) |Degradation (%)
--
1000|   0,0265 |   0,028  |  5,66037735849054
5000|   0,091  |   0,0945 |  3,84615384615385
1   |   0,1795 |   0,1815 |  1,11420612813371

 Concurrent Txn |Head (sec)|Patch (sec) | Degradation in %
 -
 50 |   0,1797647  |   0,1920949|  6,85907744957
 100|   0,3693029  |   0,3823425|  3,53086856344
 500|   1,62265755 |   1,91427485   | 17,97158617972
 1000   |   3,01388635 |   3,57678295   | 18,67676928162
 2000   |   7,0171877  |   6,4713304|  8,43500897435

I'll try to run test2.pl later (right now it fails).

hope this helps.

--
Benoit Lobréau
Consultant
http://dalibo.com


cache_inval_bug_v1.ods
Description: application/vnd.oasis.opendocument.spreadsheet


Re: Logging parallel worker draught

2025-04-14 Thread Benoit Lobréau

On 4/7/25 6:41 PM, Melanie Plageman wrote:

On Mon, Feb 3, 2025 at 12:37 AM Sami Imseih  wrote:
I started looking at this, and I like the idea.


Thanks for taking a look!


A few comments: I don't understand what 0002 is. For starters, the
commit message says something about pg_stat_database, and there are no
changes related to that.


I had originally split this part out while working on the patch to add 
parallel worker stats in pg_stat_database [1], in order to isolate the 
common components. In the end, that patch only accounted for user queries.


I merged it into "Implements logging for parallel worker usage in 
utilities" for v9.



Also, we already have basically identical logging coming from
parallel_vacuum_process_all_indexes() and viewable in existing output.
Not only does your implementation not replace this, it is odd that
setting your new guc to none does not disable this. It seems a bit
inconsistent. I'm not sure what the exact right behavior is here,
though.


That logging is used for the VERBOSE mode of VACUUM. There was also 
discussion about adding similar info for parallel index creation.

The use case here is different — the goal is to audit parallel worker 
usage across the entire instance, without needing every call site to use 
VACUUM (VERBOSE) along with SET log_min_messages = info.


I avoided reusing that part of the code because I thought the 
expectation was to aggregate worker counts and display them in 
parallel_vacuum_end(). Sami also mentioned that using the same log
line everywhere in the patch would make parsing easier, which I thought 
was a good idea.



Since your last update, it seems parallel gin index build has been
committed, so perhaps you want to add that.


Thanks for the heads-up! I've added logging in _gin_end_parallel().

You’ll find the updated patch attached.

[1] https://commitfest.postgresql.org/patch/5212/

--
Benoit
From 59cd090e78a5833b901ad662d6ae1ae936c3172d Mon Sep 17 00:00:00 2001
From: benoit 
Date: Tue, 8 Oct 2024 12:39:41 +0200
Subject: [PATCH 1/3] Add a guc for parallel worker logging

The new guc log_parallel_workers controls whether a log message is
produced to display information on the number of workers spawned when a
parallel query or utility is executed.

The default value is `none` which disables logging. `all` displays
information for all parallel queries, whereas `shortage` displays
information only when the number of workers launched is lower than the
number of planned workers.

This new parameter can help database administrators and developers
diagnose performance issues related to parallelism and optimize the
configuration of the system accordingly.
---
 doc/src/sgml/config.sgml  | 18 ++
 src/backend/access/transam/parallel.c | 19 +++
 src/backend/utils/misc/guc_tables.c   | 19 +++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/access/parallel.h | 10 ++
 src/include/utils/guc.h   |  1 +
 6 files changed, 68 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c1674c22cb2..b1345d15a84 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7950,6 +7950,24 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
  
 
+
+  log_parallel_workers (enum)
+  
+   log_parallel_workers configuration parameter
+  
+  
+  
+   
+Controls whether a log message about the number of workers is emitted during the
+execution of a parallel query or utility statement. The default value is
+none which disables logging. all emits
+information for all parallel queries or utilities, whereas shortage
+emits information only when the number of workers launched is lower than the number
+of planned workers.
+   
+  
+ 
+
  
   log_parameter_max_length (integer)
   
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 94db1ec3012..77a8deff30d 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1663,3 +1663,22 @@ LookupParallelWorkerFunction(const char *libraryname, const char *funcname)
 	return (parallel_worker_main_type)
 		load_external_function(libraryname, funcname, true, NULL);
 }
+
+/*
+ * If required, emit information about parallel workers usage in
+ * the logs.
+ */
+void
+LogParallelWorkersIfNeeded(int log_parallel_workers,
+		   int parallel_workers_to_launch,
+		   int parallel_workers_launched)
+{
+	if ((log_parallel_workers == LOG_PARALLEL_WORKERS_ALL &&
+		parallel_workers_to_launch > 0) ||
+		(log_parallel_workers == LOG_PARALLEL_WORKERS_SHORTAGE &&
+		parallel_workers_to_launch != parallel_workers_launched))
+		ereport(LOG,
+(errmsg("launched %i parallel workers (planned: %i)",
+

Re: Doc: Move standalone backup section, mention -X argument

2025-04-15 Thread Benoit Lobréau

On 3/16/25 2:19 PM, vignesh C wrote:

I noticed that Alvaro's comment from [1] is not yet addressed, I have
changed the status of commitfest entry to Waiting on Author, please
address them and change the status back to Needs review.
[1] - 
https://www.postgresql.org/message-id/202502101154.bmb536npfl5e%40alvherre.pgsql

Regards,
Vignesh


Hi,

You will find a patch for the proposed changes attached to this mail.

The menu is now:

25.1. SQL Dump
25.1.1. Restoring the Dump
25.1.2. Using pg_dumpall
25.1.3. Handling Large Databases
25.2. Physical Backups Using Continuous Archiving
25.2.1. Built-In Standalone Backups
25.2.2. Setting Up WAL Archiving
25.2.3. Making a Base Backup
25.2.4. Making an Incremental Backup
25.2.5. Making a Base Backup Using the Low Level API
25.2.6. Recovering Using a Continuous Archive Backup
25.2.7. Timelines
25.2.8. Tips and Examples
25.2.9. Caveats
25.3. File System Level Backup

I slightly modified sections 25.2.1 and 25.3 as proposed.

--
Benoit Lobréau
Consultant
http://dalibo.com
From 16813b396a45c4061b5c2d21a9091e3fb372567c Mon Sep 17 00:00:00 2001
From: benoit 
Date: Tue, 15 Apr 2025 15:25:08 +0200
Subject: [PATCH] Reorganize the backup section

The standalone backup section of the backup documentation lacks
visibility. The solution described in the file-system-level backup
section, while still usable, is not the preferred method. This patch
attempts to remedy this by moving things around.
---
 doc/src/sgml/backup.sgml | 278 +++
 1 file changed, 139 insertions(+), 139 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 25b8904baf7..c167fb5b6b6 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -354,124 +354,8 @@ pg_dump -j num -F d -f 
  
 
- 
-  File System Level Backup
-
-  
-   An alternative backup strategy is to directly copy the files that
-   PostgreSQL uses to store the data in the database;
-explains where these files
-   are located.  You can use whatever method you prefer
-   for doing file system backups; for example:
-
-
-tar -cf backup.tar /usr/local/pgsql/data
-
-  
-
-  
-   There are two restrictions, however, which make this method
-   impractical, or at least inferior to the pg_dump
-   method:
-
-   
-
- 
-  The database server must be shut down in order to
-  get a usable backup. Half-way measures such as disallowing all
-  connections will not work
-  (in part because tar and similar tools do not take
-  an atomic snapshot of the state of the file system,
-  but also because of internal buffering within the server).
-  Information about stopping the server can be found in
-  .  Needless to say, you
-  also need to shut down the server before restoring the data.
- 
-
-
-
- 
-  If you have dug into the details of the file system layout of the
-  database, you might be tempted to try to back up or restore only certain
-  individual tables or databases from their respective files or
-  directories. This will not work because the
-  information contained in these files is not usable without
-  the commit log files,
-  pg_xact/*, which contain the commit status of
-  all transactions. A table file is only usable with this
-  information. Of course it is also impossible to restore only a
-  table and the associated pg_xact data
-  because that would render all other tables in the database
-  cluster useless.  So file system backups only work for complete
-  backup and restoration of an entire database cluster.
- 
-
-   
-  
-
-  
-   An alternative file-system backup approach is to make a
-   consistent snapshot of the data directory, if the
-   file system supports that functionality (and you are willing to
-   trust that it is implemented correctly).  The typical procedure is
-   to make a frozen snapshot of the volume containing the
-   database, then copy the whole data directory (not just parts, see
-   above) from the snapshot to a backup device, then release the frozen
-   snapshot.  This will work even while the database server is running.
-   However, a backup created in this way saves
-   the database files in a state as if the database server was not
-   properly shut down; therefore, when you start the database server
-   on the backed-up data, it will think the previous server instance
-   crashed and will replay the WAL log.  This is not a problem; just
-   be aware of it (and be sure to include the WAL files in your backup).
-   You can perform a CHECKPOINT before taking the
-   snapshot to reduce recovery time.
-  
-
-  
-   If your database is spread across multiple file systems, there might not
-   be any way to obtain exactly-simultaneous frozen snapshots of all
-   the volumes.  For example, if your data files and WAL log are on different
-   disks, or if tablespaces are on different file sy