Re: Add 64-bit XIDs into PostgreSQL 15
Hi; Trying to discuss where we are talking past each other. On Fri, Nov 25, 2022 at 9:38 AM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi hackers, > > > I'm wondering whether the safest way to handle this is by creating a > > new TAM called "heap64", so that all storage changes happen there. > > > Many current users see stability as one of the greatest strengths of > > Postgres, so while I very much support this move, I wonder if this > > gives us a way to have both stability and innovation at the same time? > > That would be nice. > > However, from what I see TransactionId is a type used globally in > PostgreSQL. It is part of structures used by the TAM interface, used in > WAL records, etc. So we will have to teach these components to work > with 64-bit XIDs anyway and then start thinking about cases like: when > a user runs a transaction affecting two tables, a heap32 one and a > heap64 one, and we will have to figure out which tuples are visible and > which are not. This perhaps is doable but the maintenance burden for > the project will be too high IMO. > > It seems to me that the best option we can offer for the users looking > for stability is to use the latest PostgreSQL version with 32-bit > XIDs. Assuming these users care that much about this particular design > choice, of course. > I didn't see any changes to pg_upgrade to make this change possible on upgrade. Is that also outside of the scope of your patch set? If so, how is that continuity supposed to be ensured? Also related to that, I think you would have to have a check on streaming replication that both instances use the same xid format (that you don't accidentally upgrade this somehow), since this is set per db cluster, right? > > > The whole project seems to just ignore basic, pertinent questions. > > Questions like: why are we falling behind like this in the first > > place? And: If we don't catch up soon, why should we be able to catch > > up later on? Falling behind on freezing is still a huge problem with > > 64-bit XIDs. > > Is the example I provided above wrong? > > """ > Consider the case when you run a slow OLAP query that takes 12h to > complete and 100K TPS of fast OLTP-type queries on the same system. > The fast queries will consume all 32-bit XIDs in less than 12 hours, > while the OLAP query started 12 hours ago didn't finish yet and thus > its tuples can't be frozen. > """ > > If it is, please let me know. I would very much like to know if my > understanding here is flawed. > So, you have described a scenario we cannot support today (because xids would be exhausted within 5.5 hours at that transactional rate). Additionally, as PostgreSQL becomes more capable, this sort of scale will increasingly be within reach, and that is an important point in favor of this effort. This being said, there is another set of xid wraparound cases, today much larger in number, that I think would be hurt if this patchset were to be accepted into Postgres without mitigating measures which you consider out of bounds -- cases like Mailchimp, Adjust, and the like. This is why I keep stressing this, and I don't think waving away concerns about use cases outside of the one you are focusing on is helpful, particularly from those of us who have faced xid wraparounds in these cases in the past. In these cases, database teams are usually faced with an operational emergency while tools like vacuum, pg_repack, etc. are severely degraded due to getting so far behind on freezing.
The deeper the hole, the harder it will be to dig out of. Every large-scale high-throughput database I have ever worked on had long-running query alerts precisely because of the impact on vacuum and the downstream performance impacts. I would love to get to a point where this wasn't necessary and maybe in a few specific workloads we might be there very soon. The effort you are engaging in here is an important part of the path to get there, but let's not forget the people who today are facing xid wraparounds due to vacuum problems and what this sort of set of changes will mean for them.
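As a rough sanity check on the numbers in this exchange: a 32-bit XID space leaves roughly 2 billion usable XIDs, so a sustained 100K TPS of write transactions burns through them in about 2e9 / 1e5 = 20,000 seconds, i.e. roughly 5.5 hours. A minimal monitoring sketch (ordinary catalog queries, not part of the patch set) for watching how far freezing is falling behind:

-- Age of the oldest unfrozen XID per database; values approaching
-- autovacuum_freeze_max_age mean freezing is not keeping up.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;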
Re: [PATCH] Allow specification of custom slot for custom nodes
On Tue, Nov 22, 2022 at 5:50, Alexander Korotkov wrote: > > On Mon, Nov 21, 2022 at 4:34 PM Pavel Borisov wrote: > > The following review has been posted through the commitfest application: > > make installcheck-world: tested, passed > > Implements feature: tested, passed > > Spec compliant: not tested > > Documentation: not tested > > > > I've looked at this patch and don't see any problems with it. It is > > minimally invasive, it doesn't affect functionality unless anyone (e.g. an > > extension) sets its own slotOps in CustomScanState. > > Furthermore, the current patch very slightly modifies patch 0b03e5951bf0 > > with the intention of introducing extensibility. So I think adding more > > extensibility regarding different tuple formats is an excellent thing to do. > > > > I'm going to mark it as RfC if there are no objections. > > Thank you for your feedback. I also don't see how this patch could > affect anybody. > I'm going to push this if there are no objections. I see this was pushed (cee1209514) so I have closed it in the CF app. Thanks Ian Barwick
Re: Avoid streaming the transaction which are skipped (in corner cases)
On Sat, Nov 26, 2022 at 10:59 AM Dilip Kumar wrote: > > On Fri, Nov 25, 2022 at 4:04 PM Ashutosh Bapat > wrote: > > > > Excellent catch. We were looking at this code last week and wondered > > about the purpose of this abort. Probably we should have some macro or > > function to decide whether to skip a transaction based on the log record. > > That will avoid using different values in different places. > > We do have a common function, i.e. SnapBuildXactNeedsSkip(), but there > are two problems: 1) it has a dependency on the input parameter, so the > result may vary based on the input; 2) it checks only the LSN, but there are other factors -- dbid and originid -- based on which a > transaction could also be skipped during DecodeCommit. So I think one > possible solution could be to remember the dbid and originid in > ReorderBufferTXN as soon as we get the first change which has valid > values for these parameters. > But is the required information, say 'dbid', available in all records? For example, what about XLOG_XACT_INVALIDATIONS? The other thing to consider in this regard is that if we are planning to have additional information, as mentioned by me in another email, to decide whether to stream or not, then the additional checks may be redundant anyway. It is a good idea to have a common check at both places, but if not, we can at least add some comments to say why the check is different. -- With Regards, Amit Kapila.
Re: postgres_fdw: batch inserts vs. before row triggers
On Sat, Nov 26, 2022 at 1:57 AM Tom Lane wrote: > I don't think the committed patch is acceptable at all, at least > not in the back branches, because it creates a severe ABI break. > Specifically, by adding a field to ResultRelInfo you have changed > the array stride of es_result_relations, and that will break any > previously-compiled extension code that accesses that array. Ugh. > I'm not terribly pleased with it having added a field to EState > either. That seems much more global than what we need here. The field stores pending buffered inserts, and I added it to EState so that it can be shared across primary/secondary ModifyTable nodes. (Re-)consider this:

create extension postgres_fdw;
create server loopback foreign data wrapper postgres_fdw options (dbname 'postgres');
create user mapping for current_user server loopback;
create table t1 (a text, b int);
create foreign table ft1 (a text, b int) server loopback options (table_name 't1');
create table w1 (a text, b int);
create function ft1_rowcount_trigf() returns trigger language plpgsql as
$$ begin raise notice '%: there are % rows in ft1', tg_name, (select count(*) from ft1); return new; end; $$;
create trigger ft1_rowcount_trigger before insert on w1 for each row execute function ft1_rowcount_trigf();
alter server loopback options (add batch_size '10');
with t as (insert into w1 values ('foo', 10), ('bar', 20) returning *) insert into ft1 select * from t;
NOTICE: ft1_rowcount_trigger: there are 0 rows in ft1
NOTICE: ft1_rowcount_trigger: there are 1 rows in ft1
INSERT 0 2

For this query, the primary ModifyTable node doing batch insert is executed concurrently with the secondary ModifyTable node doing the modifying CTE, and in the secondary ModifyTable node, any pending buffered insert done in the primary ModifyTable node needs to be flushed before firing the BEFORE ROW trigger, so the row is visible to the trigger. The field is useful for cases like this. > Couldn't we add the field to ModifyTableState, instead? We could probably do so, but I thought having a global list would be more efficient for handling pending buffered inserts than that. Anyway I will work on this further. Thanks for looking at this! Best regards, Etsuro Fujita
Re: Avoid distributing new catalog snapshot again for the transaction being decoded.
On Fri, Nov 25, 2022 at 5:30 PM Ashutosh Bapat wrote: > > Hi Hou, > Thanks for the patch. With a simple condition, we can eliminate the > need to queue the snapshot change in the current transaction and then > apply it. Saves some memory and computation. This looks useful. > > When the queued snapshot change is processed in > ReorderBufferProcessTXN(), we call SetupHistoricSnapshot(). But I > didn't find a path through which SetupHistoricSnapshot() is called > from SnapBuildCommitTxn(). > It will be called after SnapBuildCommitTxn() via ReorderBufferCommit()->ReorderBufferReplay()->ReorderBufferProcessTXN(). But I think what I don't see is how the snapshot we build in SnapBuildCommitTxn() will be assigned as a historic snapshot. We assign base_snapshot as a historic snapshot, and the new snapshot we build in SnapBuildCommitTxn() is assigned as base_snapshot only if the same is not set previously. I might be missing something, but if that is true then I don't think the patch is correct; OTOH, I would expect some existing tests to fail if this change were incorrect. -- With Regards, Amit Kapila.
Re: Small miscellaneous fixes
On Fri, Nov 25, 2022 at 10:21 PM, Michael Paquier wrote: > On Fri, Nov 25, 2022 at 01:15:40PM +0100, Peter Eisentraut wrote: > > Is this something you want to follow up on, since you were involved in > that > > patch? Is the redundant assignment simply to be deleted, or do you want > to > > check the original patch again for context? > > Most of the changes of this thread have been applied as of c42cd05c. > What remains is the SIGALRM business with sig_atomic_t, and I wanted to check > that by myself first. > Thank you, Michael, for taking care of it. regards, Ranier Vilela
Re: MSVC vs Perl
On 2022-11-25 Fr 18:52, Andrew Dunstan wrote: > On 2022-11-25 Fr 18:48, Andrew Dunstan wrote: >> For various reasons (see below) it's preferable to build on Windows with >> Strawberry Perl. This works OK if you're building with Msys2, I upgraded >> the instance on the machine that runs fairywren and drongo today, and >> fairywren seems fine. Not so drongo, however. First it encountered a >> build error, which I attempted to cure using something I found in the >> vim sources [1]. That worked, but then there's a failure at runtime like >> this: >> >> 2022-11-25 21:33:01.073 UTC [3764:3] pg_regress LOG: statement: CREATE >> EXTENSION IF NOT EXISTS "plperl" >> src/pl/plperl/Util.c: loadable library and perl binaries are mismatched (got >> handshake key 12800080, needed 12900080) >> 2022-11-25 21:33:01.100 UTC [5048:5] LOG: server process (PID 3764) exited >> with exit code 1 >> >> I don't know how to debug this. I'm not sure if there's some flag we >> could set which would cure it. It looks like a 1 bit difference, but I >> haven't found what that bit corresponds to.

OK, so this cures the problem for drongo:

diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 83a3e40425..dc6b94b74f 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -707,6 +707,7 @@ sub mkvcbuild
 print "CFLAGS recommended by Perl: $Config{ccflags}\n";
 print "CFLAGS to compile embedded Perl: ",
 (join ' ', map { "-D$_" } @perl_embed_ccflags), "\n";
+ push @perl_embed_ccflags,'NO_THREAD_SAFE_LOCALE';
 foreach my $f (@perl_embed_ccflags)
 {
 $plperl->AddDefine($f);

I'll see if it also works for bowerbird. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: postgres_fdw: batch inserts vs. before row triggers
Etsuro Fujita writes: > On Sat, Nov 26, 2022 at 1:57 AM Tom Lane wrote: >> Couldn't we add the field to ModifyTableState, instead? > We could probably do so, but I thought having a global list would be > more efficient to handle pending buffered inserts than that. OK, as long as there's a reason for doing it that way, it's OK by me. I don't think that adding a field at the end of EState is an ABI problem. We have to do something else than add to ResultRelInfo, though. regards, tom lane
Re: CI and test improvements
[ resending to -hackers because of list issues ] On 2022-05-28 Sa 11:37, Justin Pryzby wrote: > I'm "joining" a bunch of unresolved threads hoping to present them better > since > they're related and I'm maintaining them together anyway. > > https://www.postgresql.org/message-id/flat/20220219234148.GC9008%40telsasoft.com > - set TESTDIR from perl rather than Makefile I looked at the latest set here; patch 1 still doesn't look right: I think vcregress.pl should be setting PG_SUBDIR like the Makefile.global does. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
On Fri, 2022-11-25 at 14:47 -0800, Peter Geoghegan wrote: > Attached WIP patch invents the idea of a regular autovacuum that is > tasked with advancing relfrozenxid -- which is really just another > trigger criteria, reported on in the server log in its autovacuum > reports. Of course we retain the idea of antiwraparound autovacuums. > The only difference is that they are triggered when table age has > advanced by twice the usual amount, which is presumably only possible > because a regular autovacuum couldn't start or couldn't complete in > time (most likely due to continually being auto-cancelled). > > As I said before, I think that the most important thing is to give > regular autovacuuming a chance to succeed. The exact approach taken > has a relatively large amount of slack, but that probably isn't > needed. So the new antiwraparound cutoffs were chosen because they're > easy to understand and remember, which is fairly arbitrary. The target is a table that receives no DML at all, right? I think that is a good idea. Wouldn't it make sense to trigger that at *half* "autovacuum_freeze_max_age"? Yours, Laurenz Albe
Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
On Sat, Nov 26, 2022 at 9:58 AM Laurenz Albe wrote: > The target is a table that receives no DML at all, right? Sort of, but not really. The target is a table that doesn't get vacuumed for any other reason -- I don't make any claims beyond that one. It seems a little too optimistic to suppose that such a table really didn't need any vacuuming to deal with bloat just because autovacuum.c didn't seem to think that it did. > I think that is a good idea. > Wouldn't it make sense to trigger that at *half* "autovacuum_freeze_max_age"? That would be equivalent to what I've done here, provided you also double the autovacuum_freeze_max_age setting. I did it this way because I believe that it has fewer problems. The approach I took makes the general perception that antiwraparound autovacuum are a scary thing (really just needed for emergencies) become true, for the first time. We should expect to see very few antiwraparound autovacuums with the patch, but when we do see even one it'll be after a less aggressive approach was given the opportunity to succeed, but (for whatever reason) failed. Just seeing any antiwraparound autovacuums will become a useful signal of something being amiss in a way that it just isn't at the moment. The docs can be updated to talk about antiwraparound autovacuum as a thing that you should encounter approximately never. This is possible even though the patch isn't invasive at all. -- Peter Geoghegan
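For readers following along, a minimal sketch (ordinary catalog queries, independent of the patch discussed here) for seeing where individual tables sit relative to the antiwraparound trigger point:

SHOW autovacuum_freeze_max_age;

-- Per-table age of relfrozenxid; 'r' = ordinary table, 'm' = materialized view, 't' = TOAST.
SELECT c.oid::regclass AS rel, age(c.relfrozenxid) AS xid_age
FROM pg_class c
WHERE c.relkind IN ('r', 'm', 't')
ORDER BY xid_age DESC
LIMIT 10;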
Re: TAP output format in pg_regress
In a message of Friday, November 25, 2022 00:20:01 MSK, Daniel Gustafsson wrote:

+ /*
+ * The width of the testname field when printing to ensure vertical alignment
+ * of test runtimes. Thius number is somewhat arbitrarily chosen to match the
+ * older pre-TAP output format.
+ */

"Thius" seems to be a typo :-)

-
+ #define bail_noatexit(...) bail_out(true, __VA_ARGS__)

BTW what does "noat" stand for? I thought it was a typo too :-) and originally meant to be "not".

-
- snprintf(buf, sizeof(buf),
-_(" All %d tests passed. "),
-success_count);
- else if (fail_count == 0) /* fail_count=0, fail_ignore_count>0 */
- snprintf(buf, sizeof(buf),
-_(" %d of %d tests passed, %d failed test(s) ignored. "),
-success_count,
-success_count + fail_ignore_count,
-fail_ignore_count);
- else if (fail_ignore_count == 0)/* fail_count>0 && fail_ignore_count=0 */
- snprintf(buf, sizeof(buf),
-_(" %d of %d tests failed. "),
-fail_count,
-success_count + fail_count);
+ note(_("All %d tests passed.\n"), success_count);
+ /* fail_count=0, fail_ignore_count>0 */
+ else if (fail_count == 0)
+ note(_("%d of %d tests passed, %d failed test(s) ignored.\n"),
+success_count,
+success_count + fail_ignore_count,
+fail_ignore_count);
+ /* fail_count>0 && fail_ignore_count=0 */
+ else if (fail_ignore_count == 0)
+ diag(_("%d of %d tests failed.\n"),
+fail_count,
+success_count + fail_count);
+ /* fail_count>0 && fail_ignore_count>0 */

Just out of overaccuracy: Logic here has not changed. Can we keep ifs, elses and maybe indent offsets of lines that did not change as they were, to have a nicer diff? Would make understanding this changeset easier... Or is this the work of pgindent that spoils it?

While looking at my output I am getting a wrong offset for sanity_check:

ok 84 hash_func 121 ms
ok 85 errors 68 ms
ok 86 infinite_recurse 233 ms
ok 87 sanity_check 144 ms
# parallel group (20 tests): select_into delete random select_having select_distinct_on namespace select_implicit case prepared_xacts subselect transactions portals select_distinct union arrays update hash_index join aggregates btree_index
ok 88 select_into 134 ms
ok 89 select_distinct 812 ms

(also for select_parallel write_parallel vacuum_parallel and fast_default) I guess the intention was to align them too...

As for the rest: I see no other problems in the code, and consider it should be passed to committer (or maybe a more experienced reviewer)

> > On 24 Nov 2022, at 20:32, Andres Freund wrote: > > > > On November 24, 2022 11:07:43 AM PST, Daniel Gustafsson wrote: > >>> On 24 Nov 2022, at 18:07, Nikolay Shaplov wrote: > >> One option could be to redefine bail() to take the exit function as a > >> parameter and have the caller pass the preferred exit handler. > >> > >> -bail_out(bool non_rec, const char *fmt,...) > >> +bail(void (*exit_func)(int), const char *fmt,...) > >> > >> The callsites would then look like the below, which puts a reference to > >> the > >> actual exit handler used in the code where it is called. > > > > I'd just rename _bail to bail_noatexit(). > > That's probably the best option, done in the attached along with the comment > fixup to mention the recursion issue. > > >>> This magic spell "...%-5i %s%-*s %8.0f ms\n" is too dark to repeat it > >>> even two times. I understand problems with spaces... But may be it > >>> would be better somehow narrow it to one ugly print... Print "ok %-5i > >>> "|"not ok %-5i" to buffer first, and then have one "%s%-*s %8.0f > >>> ms%s\n" print or something like that...
> >> > >> I'm not convinced that this printf format is that hard to read (which may > >> well be attributed to Stockholm Syndrome), and I do think that breaking > >> it up and adding more code to print the line will make it less readable > >> instead.> > > I don't think it's terrible either. I do think it'd also be ok to switch > > between ok / not ok within a single printf, making it easier to keep them > > in sync. > I made it into a single printf to see what it would look like, with some > additional comments to make it more readable (I'm not a fan of where > pgindent moves those but..). > > -- > Daniel Gustafsson https://vmware.com/ -- Nikolay Shaplov aka Nataraj Fuzzing Engineer at Postgres Professional Matrix IM: @dhyan:nataraj.su
Re: User functions for building SCRAM secrets
On 11/16/22 10:09 PM, Michael Paquier wrote: On Thu, Nov 10, 2022 at 11:14:34PM -0500, Jonathan S. Katz wrote: On 10/31/22 8:56 PM, Michael Paquier wrote: Well, one could pass a salt based on something generated by random() to emulate what we currently do in the default case, as well. The salt length is an extra possibility, letting it be randomly generated by pg_strong_random(). Sure, this is a good point. From a SQL level we can get that from pgcrypto "gen_random_bytes". Could it be something we could just push into core? FWIW, I've used that quite a bit in the past to cheaply build long random strings of data for other things. Without pgcrypto, random() with generate_series() has always been kind of... fun. I would be a +1 for moving that into core, given we did something similar with gen_random_uuid [1]. Separate patch, of course :)

+SELECT scram_build_secret_sha256(NULL);
+ERROR: password must not be null
+SELECT scram_build_secret_sha256(NULL, NULL);
+ERROR: password must not be null
+SELECT scram_build_secret_sha256(NULL, NULL, NULL);
+ERROR: password must not be null

This is just testing three times the same thing as per the defaults. I would cut the second and third cases. AFAICT it's not returning the defaults. Quick other example:

CREATE FUNCTION ab (a int DEFAULT 0) RETURNS int AS $$ SELECT a; $$ LANGUAGE SQL;
SELECT ab();
 ab
  0
(1 row)
SELECT ab(NULL::int);
 ab
(1 row)

Given scram_build_secret_sha256 is not a strict function, I'd prefer to test all of the NULL cases in case anything in the underlying code changes in the future. It's a cheap cost to be a bit more careful. git diff --check reports some whitespaces. Ack. Will fix on the next pass. (I've been transitioning editors, which could have resulted in that.) scram_build_secret_sha256_internal() is missing SASLprep on the password string. Perhaps the best thing to do here is just to extend pg_be_scram_build_secret() with more arguments so that callers can optionally pass down a custom salt with its length, leaving the responsibility to pg_be_scram_build_secret() to create a random salt if nothing has been given? Ah, good catch! I think if we go with passing down the salt, we'd also have to allow for the passing down of the iterations, too, and we're close to rebuilding "scram_build_secret". I'll stare a bit at this on the next pass and either 1/ just SASLprep the string in the new "scram_build_secret_sha256_internal" func, or 2/ change the definition of "pg_be_scram_build_secret" to accommodate more overrides. Thanks, Jonathan [1] https://www.postgresql.org/docs/current/functions-uuid.html
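For context, a small usage sketch of the functions being discussed. gen_random_bytes() is the existing pgcrypto function mentioned above; scram_build_secret_sha256() is the function proposed by this thread's patch, so its name, argument order, and defaults may still change before commit:

CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Existing way to get strong random bytes from SQL, e.g. for a custom salt.
SELECT encode(gen_random_bytes(16), 'base64') AS salt;

-- Proposed (not yet committed) function from the patch under review:
-- build a SCRAM-SHA-256 secret for a password, with optional salt/iterations.
SELECT scram_build_secret_sha256('correct horse battery staple');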
pg_regress: Treat child process failure as test failure
In the thread about TAP output format in pg_regress, Andres pointed out [0] that we allow a test to pass even if the test child process failed. While it's probably pretty rare to have a test pass if the process failed, this brings a risk of false positives (and it seems questionable that any regress test will have a child process failing as part of its intended run). The attached makes child failures an error condition for the test as a belts and suspenders type check. Thoughts? -- Daniel Gustafsson https://vmware.com/ [0] https://postgr.es/m/20221122235636.4frx7hjterq6b...@awork3.anarazel.de v1-0001-Consider-a-failed-test-process-as-a-failed-test.patch
Re: TAP output format in pg_regress
> On 26 Nov 2022, at 20:37, Nikolay Shaplov wrote:
> In a message of Friday, November 25, 2022 00:20:01 MSK, Daniel Gustafsson wrote:
> "Thius" seems to be a typo :-)

Correct, fixed in the attached.

> + #define bail_noatexit(...) bail_out(true, __VA_ARGS__)
>
> BTW what does "noat" stand for? I thought it was a typo too :-) and originally meant to be "not".

Calling _exit() will cause exit handler functions registered with atexit() to not be invoked, so noatexit was intentional spelling.

> Just out of overaccuracy: Logic here has not changed. Can we keep ifs, elses and maybe indent offsets of lines that did not change as they were, to have a nicer diff? Would make understanding this changeset easier... Or is this the work of pgindent that spoils it?

The diff algorithm decided that this was the compact way of displaying the unified diff, probably because too many lines in proximity changed. While avoiding moving the comments to before the line might mitigate that somewhat, I prefer this greatly to a slightly easier to read diff.

> While looking at my output I am getting a wrong offset for sanity_check:
>
> ok 84 hash_func 121 ms
> ok 85 errors 68 ms
> ok 86 infinite_recurse 233 ms
> ok 87 sanity_check 144 ms
> # parallel group (20 tests): select_into delete random select_having select_distinct_on namespace select_implicit case prepared_xacts subselect transactions portals select_distinct union arrays update hash_index join aggregates btree_index
> ok 88 select_into 134 ms
> ok 89 select_distinct 812 ms
>
> (also for select_parallel write_parallel vacuum_parallel and fast_default)
>
> I guess the intention was to align them too...

No, they are aligned in such a way because they are running outside of a parallel group. Note that it's not part of the "parallel group" note preceding the tests:

# parallel group (6 tests): collate.linux.utf8 amutils psql_crosstab psql rules stats_ext
ok 146 rules 507 ms
ok 147 psql 448 ms
ok 148 psql_crosstab 47 ms
ok 149 amutils 39 ms
ok 150 stats_ext 2578 ms
ok 151 collate.linux.utf8 27 ms
ok 152 select_parallel 668 ms
ok 153 write_parallel 84 ms
ok 154 vacuum_parallel 90 ms

In the previous format it's a bit clearer, and maybe we should adopt that for TAP as well?

parallel group (6 tests): collate.linux.utf8 amutils psql_crosstab psql rules stats_ext
rules... ok 488 ms
psql ... ok 430 ms
psql_crosstab... ok 47 ms
amutils ... ok 38 ms
stats_ext... ok 2301 ms
collate.linux.utf8 ... ok 24 ms
test select_parallel ... ok 641 ms
test write_parallel ... ok 83 ms
test vacuum_parallel ... ok 87 ms

If so, that would make the output something like the below. Personally I think the "test" prefix adds little value since everything printed is a test suite, and we are already today using indentation for grouping parallel tests.

# parallel group (6 tests): collate.linux.utf8 amutils psql_crosstab psql rules stats_ext
ok 146 rules 507 ms
ok 147 psql 448 ms
ok 148 psql_crosstab 47 ms
ok 149 amutils 39 ms
ok 150 stats_ext 2578 ms
ok 151 collate.linux.utf8 27 ms
ok 152 test select_parallel 668 ms
ok 153 test write_parallel 84 ms
ok 154 test vacuum_parallel 90 ms

-- Daniel Gustafsson https://vmware.com/ v13-0001-Change-pg_regress-output-format-to-be-TAP-compli.patch
Re: pg_regress: Treat child process failure as test failure
Hi, On 2022-11-26 21:11:39 +0100, Daniel Gustafsson wrote: > In the thread about TAP format out in pg_regress, Andres pointed out [0] that > we allow a test to pass even if the test child process failed. While its > probably pretty rare to have a test pass if the process failed, this brings a > risk for false positives (and it seems questionable that any regress test will > have a child process failing as part of its intended run). > The attached makes child failures an error condition for the test as a belts > and suspenders type check. Thoughts? I wonder if it's the right thing to treat a failed psql that's then also ignored as "failed (ignored)". Perhaps it'd be better to move the statuses[i] != 0 check to before the if (differ)? > - if (differ) > + if (differ || statuses[i] != 0) > { > boolignore = false; > _stringlist *sl; > @@ -1815,7 +1815,7 @@ run_single_test(const char *test, test_start_function > startfunc, > differ |= newdiff; > } > > - if (differ) > + if (differ || exit_status != 0) > { > status(_("FAILED")); > fail_count++; It certainly is a bit confusing that we print a psql failure separately from the if "FAILED" vs "ok" bit. Greetings, Andres Freund
Re: MSVC vs Perl
Hi, On 2022-11-26 09:43:19 -0500, Andrew Dunstan wrote: > OK, so this cures the problem for drongo: > > > diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm > index 83a3e40425..dc6b94b74f 100644 > --- a/src/tools/msvc/Mkvcbuild.pm > +++ b/src/tools/msvc/Mkvcbuild.pm > @@ -707,6 +707,7 @@ sub mkvcbuild > print "CFLAGS recommended by Perl: $Config{ccflags}\n"; > print "CFLAGS to compile embedded Perl: ", > (join ' ', map { "-D$_" } @perl_embed_ccflags), "\n"; > + push @perl_embed_ccflags,'NO_THREAD_SAFE_LOCALE'; > foreach my $f (@perl_embed_ccflags) > { > $plperl->AddDefine($f); This likely is just a test patch, in case it is not, it seems we should add NO_THREAD_SAFE_LOCALE to @perl_embed_ccflags before printing it. Do we need a "configure" check for this? I guess it's ok to define this whenever building with msvc - I don't currently see a scenario where it could hurt. We already define flags unconditionally, c.f. PLPERL_HAVE_UID_GID. Given how fragile the embedding is (we've had several prior iterations of problems around this), I think it'd be good to test that the current flags avoid the "got handshake key" at configure time, rather than having to debug runtime failures. As noted by Noah in [1], the Mkvcbuild.pm actually has code to do so - but only does for 32bit builds. I don't think it's worth generalizing this for src/tools/msvc at this point, but it might be worth copying the test to meson and running the binary (except when cross building, of course). Greetings, Andres Freund [1] https://postgr.es/m/20220130231432.GA2658915%40rfd.leadboat.com
Re: Decouple last important WAL record LSN from WAL insert locks
Hi, On 2022-11-23 19:12:07 +0530, Bharath Rupireddy wrote: > While working on something else, I noticed that each WAL insert lock > tracks its own last important WAL record's LSN (lastImportantAt) and > both the bgwriter and checkpointer later computes the max > value/server-wide last important WAL record's LSN via > GetLastImportantRecPtr(). While doing so, each WAL insertion lock is > acquired in exclusive mode in a for loop. This seems like too much > overhead to me. GetLastImportantRecPtr() should be a very rare operation, so it's fine for it to be expensive. The important thing is for the maintenance of the underlying data to be very cheap. > I quickly coded a patch (attached herewith) that > tracks the server-wide last important WAL record's LSN in > XLogCtlInsert (lastImportantPos) protected with a spinlock and gets > rid of lastImportantAt from each WAL insert lock. That strikes me as a very bad idea. It adds another point of contention to a very very hot code path, to make a very rare code path cheaper. Greetings, Andres Freund
Re: Allow processes to reset procArrayGroupNext themselves instead of leader resetting for all the followers
Hi, On 2022-11-24 10:43:46 +0530, Bharath Rupireddy wrote: > While working on something else, I noticed that the proc array group > XID clearing leader resets procArrayGroupNext of all the followers > atomically along with procArrayGroupMember. ISTM that it's enough for > the followers to exit the wait loop and continue if the leader resets > just procArrayGroupMember, the followers can reset procArrayGroupNext > by themselves. This relieves the leader a bit, especially when there > are many followers, as it avoids a bunch of atomic writes and > pg_write_barrier() for the leader . I doubt this is a useful change - the leader already has to modify the relevant cacheline (for procArrayGroupMember). That makes it pretty much free to modify another field in the same cacheline. Greetings, Andres Freund
Re: MSVC vs Perl
On 2022-11-26 Sa 16:05, Andres Freund wrote: > Hi, > > On 2022-11-26 09:43:19 -0500, Andrew Dunstan wrote: >> OK, so this cures the problem for drongo: >> >> >> diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm >> index 83a3e40425..dc6b94b74f 100644 >> --- a/src/tools/msvc/Mkvcbuild.pm >> +++ b/src/tools/msvc/Mkvcbuild.pm >> @@ -707,6 +707,7 @@ sub mkvcbuild >> print "CFLAGS recommended by Perl: $Config{ccflags}\n"; >> print "CFLAGS to compile embedded Perl: ", >> (join ' ', map { "-D$_" } @perl_embed_ccflags), "\n"; >> + push @perl_embed_ccflags,'NO_THREAD_SAFE_LOCALE'; >> foreach my $f (@perl_embed_ccflags) >> { >> $plperl->AddDefine($f); > This likely is just a test patch, in case it is not, it seems we should add > NO_THREAD_SAFE_LOCALE to @perl_embed_ccflags before printing it. Sure > Do we need a "configure" check for this? I guess it's ok to define this > whenever building with msvc - I don't currently see a scenario where it could > hurt. We already define flags unconditionally, c.f. PLPERL_HAVE_UID_GID. > > Given how fragile the embedding is (we've had several prior iterations of > problems around this), I think it'd be good to test that the current flags > avoid the "got handshake key" at configure time, rather than having to debug > runtime failures. > > As noted by Noah in [1], the Mkvcbuild.pm actually has code to do so - but > only does for 32bit builds. > > I don't think it's worth generalizing this for src/tools/msvc at this point, > but it might be worth copying the test to meson and running the binary (except > when cross building, of course). Yeah, given that we are planning on ditching this build system as soon as we can I'm not inclined to do anything very heroic. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: pg_regress: Treat child process failure as test failure
> On 26 Nov 2022, at 21:55, Andres Freund wrote: > On 2022-11-26 21:11:39 +0100, Daniel Gustafsson wrote: >> The attached makes child failures an error condition for the test as a belts >> and suspenders type check. Thoughts? > > I wonder if it's the right thing to treat a failed psql that's then also > ignored as "failed (ignored)". Perhaps it'd be better to move the statuses[i] > != 0 check to before the if (differ)? I was thinking about that too, but I think you're right. The "ignore" part is about the test content and not the test run structure. > It certainly is a bit confusing that we print a psql failure separately from > the if "FAILED" vs "ok" bit. I've moved the statuses[i] check before the differ check, such that there is a separate block for this not mixed up with the differs check and printing. It does duplicate things a little bit but also makes it a lot clearer. -- Daniel Gustafsson https://vmware.com/ v2-0001-Consider-a-failed-test-process-as-a-failed-test.patch
CF 2022-11: entries "Waiting for Committer" but no recent activity
Hi

Apologies for the intermission in activity; among other "fun" things I had a family visitation of the influenza virus, which is becoming fashionable in these parts again. In an attempt to salvage something from the situation I am having a crack at the older entries marked "Ready for Committer", some of which probably need some sort of action, but it's not always clear (to me at least) what. If there's some sort of consensus for individual items, per previous practice I'll update the individual threads. The date in parentheses is that of the most recent message on the thread.

pg_stat_statements: Track statement entry timestamp (2022-04-08)
- https://commitfest.postgresql.org/40/3048/
- https://www.postgresql.org/message-id/flat/72e80e7b160a6eb189df9ef6f068cce3765d37f8.ca...@moonset.ru
Patch is applying but no recent activity.

psql - refactor echo code (2022-07-24)
- https://commitfest.postgresql.org/40/3140/
- https://www.postgresql.org/message-id/flat/alpine.DEB.2.22.394.2105301104400.3020016@pseudo
Seems like a small patch which can be applied easily.

Add Amcheck option for checking unique constraints in btree indexes (2022-09-28)
- https://commitfest.postgresql.org/40/3464/
- https://www.postgresql.org/message-id/flat/calt9zehrn5xam5boga0qnrcmpv52bscek2qnq1hmuzdd301...@mail.gmail.com
There seems to be consensus that it is actually RfC, but it needs a committer to show interest.

meson PGXS compatibility (2022-10-13)
- https://commitfest.postgresql.org/40/3932/
- https://www.postgresql.org/message-id/flat/20221005200710.luvw5evhwf6cl...@awork3.anarazel.de
Seems to be RfC.

pg_receivewal fail to streams when the partial file to write [...] (2022-10-13)
- https://commitfest.postgresql.org/40/3503/
- https://www.postgresql.org/message-id/flat/cahg+qdcvuss6ocomblbv5f4yeglhoct+1x2ylnfg2h_edwu...@mail.gmail.com
The author of the latest patch (not the original patch author) indicates this needs further review; should the status be changed? Setting it back to "Needs review" seems the obvious thing to do, but it feels like it would put it back in the pool of possibly unreviewed entries (maybe we need a "Needs further review" status?).

Update relfrozenxmin when truncating temp tables (2022-11-05)
- https://commitfest.postgresql.org/40/3358/
- https://www.postgresql.org/message-id/flat/cam-w4hnnbdexixrj0b+_-wvp5nz6of0rlueqfuzfyqblcbe...@mail.gmail.com
I'll set this one to "Waiting on Author" based on Tom's latest feedback.

Faster pglz compression (2021-11-17)
- https://commitfest.postgresql.org/40/2897/
- https://www.postgresql.org/message-id/flat/fef3dc5e-4bc4-44e1-8deb-dadc67046...@yandex-team.ru
This one, prior to my reminder, has a committer promising to commit but was inactive for over a year.

Regards

Ian Barwick
Re: CF 2022-11: entries "Waiting for Committer" but no recent activity
On 11/27/22 01:43, Ian Lawrence Barwick wrote: > ... > Faster pglz compression (2021-11-17) > > > - https://commitfest.postgresql.org/40/2897/ > - > https://www.postgresql.org/message-id/flat/fef3dc5e-4bc4-44e1-8deb-dadc67046...@yandex-team.ru > > This one, prior to my reminder, has a committer promising to commit but was > inactive for over a year. > Ugh, I see that slacker is me, so I'll get this committed (unless someone else wants to). regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Removing another gen_node_support.pl special case
I got confused about how we were managing EquivalenceClass pointers in the copy/equal infrastructure, and it took me awhile to remember that the reason it works is that gen_node_support.pl has hard-wired knowledge about that. I think that's something we'd be best off dropping in favor of explicit annotations on affected fields. Hence, I propose the attached. This results in zero change in the generated copy/equal code. regards, tom lane diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl index b6f086e262..9f7b4b833f 100644 --- a/src/backend/nodes/gen_node_support.pl +++ b/src/backend/nodes/gen_node_support.pl @@ -166,9 +166,6 @@ my @custom_read_write; # Track node types with manually assigned NodeTag numbers. my %manual_nodetag_number; -# EquivalenceClasses are never moved, so just shallow-copy the pointer -push @scalar_types, qw(EquivalenceClass* EquivalenceMember*); - # This is a struct, so we can copy it by assignment. Equal support is # currently not required. push @scalar_types, qw(QualCost); @@ -458,9 +455,14 @@ foreach my $infile (@ARGV) && $attr !~ /^copy_as\(\w+\)$/ && $attr !~ /^read_as\(\w+\)$/ && !elem $attr, -qw(equal_ignore equal_ignore_if_zero read_write_ignore -write_only_relids write_only_nondefault_pathtarget write_only_req_outer) - ) +qw(copy_as_scalar +equal_as_scalar +equal_ignore +equal_ignore_if_zero +read_write_ignore +write_only_relids +write_only_nondefault_pathtarget +write_only_req_outer)) { die "$infile:$lineno: unrecognized attribute \"$attr\"\n"; @@ -692,6 +694,8 @@ _equal${n}(const $n *a, const $n *b) # extract per-field attributes my $array_size_field; my $copy_as_field; + my $copy_as_scalar = 0; + my $equal_as_scalar = 0; foreach my $a (@a) { if ($a =~ /^array_size\(([\w.]+)\)$/) @@ -702,19 +706,41 @@ _equal${n}(const $n *a, const $n *b) { $copy_as_field = $1; } + elsif ($a eq 'copy_as_scalar') + { +$copy_as_scalar = 1; + } + elsif ($a eq 'equal_as_scalar') + { +$equal_as_scalar = 1; + } elsif ($a eq 'equal_ignore') { $equal_ignore = 1; } } - # override type-specific copy method if copy_as is specified + # override type-specific copy method if requested if (defined $copy_as_field) { print $cff "\tnewnode->$f = $copy_as_field;\n" unless $copy_ignore; $copy_ignore = 1; } + elsif ($copy_as_scalar) + { + print $cff "\tCOPY_SCALAR_FIELD($f);\n" + unless $copy_ignore; + $copy_ignore = 1; + } + + # override type-specific equal method if requested + if ($equal_as_scalar) + { + print $eff "\tCOMPARE_SCALAR_FIELD($f);\n" + unless $equal_ignore; + $equal_ignore = 1; + } # select instructions by field type if ($t eq 'char*') diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index a80f43e540..d98f2c91d9 100644 --- a/src/include/nodes/nodes.h +++ b/src/include/nodes/nodes.h @@ -86,6 +86,12 @@ typedef enum NodeTag * * - copy_as(VALUE): In copyObject(), replace the field's value with VALUE. * + * - copy_as_scalar: In copyObject(), copy the field's pointer value + * even if it is a node-type pointer. + * + * - equal_as_scalar: In equal(), compare the field by pointer equality + * even if it is a node-type pointer. + * * - equal_ignore: Ignore the field for equality. * * - equal_ignore_if_zero: Ignore the field for equality if it is zero. 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h index a544b313d3..885ad42327 100644 --- a/src/include/nodes/pathnodes.h +++ b/src/include/nodes/pathnodes.h @@ -1273,7 +1273,9 @@ typedef struct StatisticExtInfo * * NB: EquivalenceClasses are never copied after creation. Therefore, * copyObject() copies pointers to them as pointers, and equal() compares - * pointers to EquivalenceClasses via pointer equality. + * pointers to EquivalenceClasses via pointer equality. This is implemented + * by putting copy_as_scalar and equal_as_scalar attributes on fields that + * are pointers to EquivalenceClasses. The same goes for EquivalenceMembers. */ typedef struct EquivalenceClass { @@ -1364,7 +1366,8 @@ typedef struct PathKey NodeTag type; - EquivalenceClass *pk_eclass; /* the value that is ordered */ + /* the value that is ordered */ + EquivalenceClass *pk_eclass pg_node_attr(copy_as_scalar, equal_as_scalar); Oid pk_opfamily; /* btree opfamily defining the ordering */ int pk_strategy; /* sort direction (ASC or DESC) */ bool pk_nulls_first; /* do NULLs come before normal values? */ @@ -2472,7 +2475,7 @@ typedef struct RestrictInfo * Generating EquivalenceClass. This field is NULL unless clause is * potentially redundant. */ - EquivalenceClass *parent_ec pg_node_attr(equal_ignore, read_write_ignore); + EquivalenceClass *paren
Re: Add tracking of backend memory allocated to pg_stat_activity
Code rebased to current master. Updated to incorporate additional recommendations from the the list - add units to variables in view - remove 'backend_' prefix from variables/functions - update documentation - add functional test for allocated_bytes - refactor allocation reporting to reduce number of functions and branches/reduce performance hit - zero allocated bytes after fork to avoid double counting postmaster allocations -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From 3d772d8620faba4bd4e091d6618c63557fbf6749 Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Thu, 11 Aug 2022 12:01:25 -0400 Subject: [PATCH] Add tracking of backend memory allocated to pg_stat_activity This new field displays the current bytes of memory allocated to the backend process. It is updated as memory for the process is malloc'd/free'd. Memory allocated to items on the freelist is included in the displayed value. Dynamic shared memory allocations are included only in the value displayed for the backend that created them, they are not included in the value for backends that are attached to them to avoid double counting. On occasion, orphaned memory segments may be cleaned up on postmaster startup. This may result in decreasing the sum without a prior increment. We limit the floor of backend_mem_allocated to zero. Updated pg_stat_activity documentation for the new column. --- doc/src/sgml/monitoring.sgml| 15 src/backend/catalog/system_views.sql| 1 + src/backend/postmaster/autovacuum.c | 6 ++ src/backend/postmaster/postmaster.c | 13 src/backend/postmaster/syslogger.c | 3 + src/backend/storage/ipc/dsm_impl.c | 81 + src/backend/utils/activity/backend_status.c | 45 src/backend/utils/adt/pgstatfuncs.c | 4 +- src/backend/utils/mmgr/aset.c | 17 + src/backend/utils/mmgr/generation.c | 15 src/backend/utils/mmgr/slab.c | 21 ++ src/include/catalog/pg_proc.dat | 6 +- src/include/utils/backend_status.h | 59 ++- src/test/regress/expected/rules.out | 9 ++- src/test/regress/expected/stats.out | 11 +++ src/test/regress/sql/stats.sql | 3 + 16 files changed, 300 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 5579b8b9e0..ffe7d2566c 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -947,6 +947,21 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser + + + allocated_bytes bigint + + + Memory currently allocated to this backend in bytes. This is the balance + of bytes allocated and freed by this backend. Dynamic shared memory + allocations are included only in the value displayed for the backend that + created them, they are not included in the value for backends that are + attached to them to avoid double counting. Use pg_size_pretty + described in to make this value + more easily readable. 
+ + + query text diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 2d8104b090..9ea8f78c95 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -864,6 +864,7 @@ CREATE VIEW pg_stat_activity AS S.state, S.backend_xid, s.backend_xmin, +S.allocated_bytes, S.query_id, S.query, S.backend_type diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 601834d4b4..f54606104d 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -407,6 +407,9 @@ StartAutoVacLauncher(void) #ifndef EXEC_BACKEND case 0: + /* Zero allocated bytes to avoid double counting parent allocation */ + pgstat_zero_my_allocated_bytes(); + /* in postmaster child ... */ InitPostmasterChild(); @@ -1485,6 +1488,9 @@ StartAutoVacWorker(void) #ifndef EXEC_BACKEND case 0: + /* Zero allocated bytes to avoid double counting parent allocation */ + pgstat_zero_my_allocated_bytes(); + /* in postmaster child ... */ InitPostmasterChild(); diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index a8a246921f..89a6caec78 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -4102,6 +4102,9 @@ BackendStartup(Port *port) { free(bn); + /* Zero allocated bytes to avoid double counting parent allocation */ + pgstat_zero_my_allocated_bytes(); + /* Detangle from postmaster */ InitPostmasterChild(); @@ -5307,6 +5310,11 @@ StartChildProcess(AuxProcType type) MemoryContextDelete(Postmast
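Assuming the view definition from the patch above, the new column could be inspected with something like the following; note that allocated_bytes does not exist in released servers, it is the column this patch proposes to add:

-- Hypothetical until the patch is applied: allocated_bytes is the proposed column.
SELECT pid, backend_type, state,
       pg_size_pretty(allocated_bytes) AS allocated
FROM pg_stat_activity
ORDER BY allocated_bytes DESC NULLS LAST;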
Re: Add tracking of backend memory allocated to pg_stat_activity
On Wed, 2022-11-09 at 09:23 -0500, Reid Thompson wrote: > Hi Melanie, > Thank you for looking at this and for the feedback. Responses inline > below. > > On Mon, 2022-11-07 at 16:17 -0500, Melanie Plageman wrote: > > > > It doesn't seem like you need the backend_ prefix in the view since > > that is implied by it being in pg_stat_activity. > > I will remove the prefix. done > > > For the wording on the description, I find "memory allocated to > > this > > backend" a bit confusing. Perhaps you could reword it to make clear > > you mean that the number represents the balance of allocations by > > this > > backend. Something like: > > > > Memory currently allocated to this backend in bytes. This > > is the > > balance of bytes allocated and freed by this backend. > > I would also link to the system administration function > > pg_size_pretty() so users know how to easily convert the value. > > Thanks, I'll make these changes done > > > +/* > > > + * pgstat_report_backend_allocated_bytes_increase() - > > > + * > > > + * Called to report increase in memory allocated for this > > > backend > > > + * > > > + */ > > > > It seems like you could combine the > > pgstat_report_backend_allocated_bytes_decrease/increase() by either > > using a signed integer to represent the allocation/deallocation or > > passing in > > a "direction" that is just a positive or negative multiplier enum. > > > > Especially if you don't use the write barriers, I think you could > > simplify the logic in the two functions. > > > > If you do combine them, you might shorten the name to > > pgstat_report_backend_allocation() or pgstat_report_allocation(). > > Agreed. This seems a cleaner, simpler way to go. I'll add it to the > TODO list. done > > > > + /* > > > + * Do not allow backend_allocated_bytes to go below zero. > > > ereport if we > > > + * would have. There's no need for a lock around the read > > > here as it's > > > + * being referenced from the same backend which means > > > that > > > there shouldn't > > > + * be concurrent writes. We want to generate an ereport > > > in > > > these cases. > > > + */ > > > + if (deallocation > beentry->backend_allocated_bytes) > > > + { > > > + ereport(LOG, errmsg("decrease reduces reported > > > backend memory allocated below zero; setting reported to 0")); > > > + > > > > I also think it would be nice to include the deallocation amount > > and > > backend_allocated_bytes amount in the log message. > > It also might be nice to start the message with something more > > clear > > than "decrease". > > For example, I would find this clear as a user: > > > > Backend [backend_type or pid] deallocated [deallocation > > number] bytes, > > [backend_allocated_bytes - deallocation number] more than > > this backend > > has reported allocating. > > Sounds good, I'll implement these changes done > > > diff --git a/src/include/utils/backend_status.h > > > b/src/include/utils/backend_status.h > > > index b582b46e9f..75d87e8308 100644 > > > --- a/src/include/utils/backend_status.h > > > +++ b/src/include/utils/backend_status.h > > > @@ -169,6 +169,9 @@ typedef struct PgBackendStatus > > > > > > /* query identifier, optionally computed using > > > post_parse_analyze_hook */ > > > uint64 st_query_id; > > > + > > > + /* Current memory allocated to this backend */ > > > + uint64 backend_allocated_bytes; > > > } PgBackendStatus; > > > > I don't think you need the backend_ prefix here since it is in > > PgBackendStatus. > > Agreed again, I'll remove the prefix. 
done > > > /* -- > > > * Support functions for the SQL-callable functions to > > > diff --git a/src/test/regress/expected/rules.out > > > b/src/test/regress/expected/rules.out > > > index 624d0e5aae..ba9f494806 100644 > > > --- a/src/test/regress/expected/rules.out > > > +++ b/src/test/regress/expected/rules.out > > > @@ -1753,10 +1753,11 @@ pg_stat_activity| SELECT s.datid, > > > s.state, > > > s.backend_xid, > > > s.backend_xmin, > > > + s.backend_allocated_bytes, > > > s.query_id, > > > s.query, > > > s.backend_type > > > > Seems like it would be possible to add a functional test to > > stats.sql. > > I will look at adding this. done patch attached to https://www.postgresql.org/message-id/06b4922193b80776a31e08a3809f2414b0d4bf90.camel%40crunchydata.com -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com
Re: Add tracking of backend memory allocated to pg_stat_activity
On Wed, 2022-11-09 at 08:54 -0500, Reid Thompson wrote: > Hi Andres, > Thanks for looking at this and for the feedback. Responses inline > below. > >> On Fri, 2022-11-04 at 19:41 -0700, Andres Freund wrote: > > Hi, > > > On 2022-11-04 08:56:13 -0400, Reid Thompson wrote: > > > > I'm not convinced that counting DSM values this way is quite right. > > There are a few uses of DSMs that track shared resources, with the biggest > > likely being the stats for relations etc. I suspect that tracking that via > > backend memory usage will be quite confusing, because fairly random > > backends that > > had to grow the shared state end up being blamed with the memory usage in > > perpituity - and then suddenly that memory usage will vanish when that > > backend exits, > > despite the memory continuing to exist. > > Ok, I'll make an attempt to identify these allocations and manage > them elsewhere. still TBD > > > > > > > @@ -734,6 +747,7 @@ AllocSetAlloc(MemoryContext context, Size > > > size) > > > return NULL; > > > > > > context->mem_allocated += blksize; > > > + pgstat_report_backend_allocated_bytes_increase(bl > > > ksize); > > > > I suspect this will be noticable cost-wise. Even though these paths aren't > > the > > hottest memory allocation paths, by nature of going down into malloc, adding > > an external function call that then does a bunch of branching etc. seems > > likely to add up to some. See below for how I think we can deal with > > that... > > > > This is quite a few branches, including write/read barriers. > > > > It doesn't really make sense to use the > > PGSTAT_BEGIN_WRITE_ACTIVITY() pattern > > here - you're just updating a single value, there's nothing to be gained by > > it. The point of PGSTAT_BEGIN_*ACTIVITY() stuff is to allow readers to get a > > consistent view of multiple values - but there aren't multiple values > > here! > > I'll remove the barriers - initially I copied how prior functions were barriers removed > > > > To avoid the overhead of checking (!beentry || !pgstat_track_activities) and > > the external function call, I think you'd be best off copying the trickery I > > introduced for pgstat_report_wait_start(), in 225a22b19ed. > > > > I.e. make pgstat_report_backend_allocated_bytes_increase() a static inline > > function that unconditionally updates something like > > *my_backend_allocated_memory. To deal with the case of (!beentry || > > !pgstat_track_activities), that variable initially points to some backend > > local state and is set to the shared state in pgstat_bestart(). > > > > This additionally has the nice benefit that you can track memory usage from > > before pgstat_bestart(), it'll be in the local variable. > > OK, I think I can mimic the code you reference. done patch attached to https://www.postgresql.org/message-id/06b4922193b80776a31e08a3809f2414b0d4bf90.camel%40crunchydata.com -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Thu, 2022-11-03 at 11:48 -0400, Reid Thompson wrote: > On Tue, 2022-10-25 at 11:49 -0400, Reid Thompson wrote: > > Rebased to current. Add a couple changes per conversation with D > Christensen (include units in field name, group field with > backend_xid > and backend_xmin fields in pg_stat_activity view, rather than between > query_id and query) > rebased/patched to current master && current pg-stat-activity-backend-memory-allocated -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From 1470f45e086bef0757cc262d10e08904e46b9a88 Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Sat, 4 Jun 2022 22:23:59 -0400 Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends. This builds on the work that adds backend memory allocated to pg_stat_activity. Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. It is intended as a resource to help avoid the OOM killer on LINUX and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view. --- doc/src/sgml/config.sgml | 26 + src/backend/storage/ipc/dsm_impl.c| 12 ++ src/backend/utils/activity/backend_status.c | 108 ++ src/backend/utils/misc/guc_tables.c | 11 ++ src/backend/utils/misc/postgresql.conf.sample | 3 + src/backend/utils/mmgr/aset.c | 17 +++ src/backend/utils/mmgr/generation.c | 9 ++ src/backend/utils/mmgr/slab.c | 8 ++ src/include/utils/backend_status.h| 3 + 9 files changed, 197 insertions(+) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 24b1624bad..c2db3ace7a 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2079,6 +2079,32 @@ include_dir 'conf.d' + + max_total_backend_memory (integer) + + max_total_backend_memory configuration parameter + + + + +Specifies a limit to the amount of memory (MB) that may be allocated to +backends in total (i.e. this is not a per user or per backend limit). +If unset, or set to 0 it is disabled. A backend request that would +push the total over the limit will be denied with an out of memory +error causing that backend's current query/transaction to fail. Due to +the dynamic nature of memory allocations, this limit is not exact. If +within 1.5MB of the limit and two backends request 1MB each at the same +time both may be allocated, and exceed the limit. Further requests will +not be allocated until dropping below the limit. Keep this in mind when +setting this value. This limit does not affect auxiliary backend +processes . Backend memory +allocations (allocated_bytes) are displayed in the +pg_stat_activity +view. 
+ + + + diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c index 65d59fc43e..8d9df676af 100644 --- a/src/backend/storage/ipc/dsm_impl.c +++ b/src/backend/storage/ipc/dsm_impl.c @@ -254,6 +254,10 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Create new segment or open an existing one for attach. * @@ -525,6 +529,10 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size, int flags = IPCProtection; size_t segsize; + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Allocate the memory BEFORE acquiring the resource, so that we don't * leak the resource if memory allocation fails. @@ -719,6 +727,10 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* Create new segment or open an existing one for attach. */ if (op == DSM_OP_CREATE)
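The commit message above notes that the limit is not exact. A minimal sketch
of why - with illustrative names; the GUC matches the patch but the accessor
function here is hypothetical - is that the cluster-wide total is read without
a lock, so two backends that are both just under the limit can pass the check
concurrently and together overshoot it slightly:

```
/*
 * Sketch only, not the patch code.  max_total_backend_memory is the GUC
 * (in MB, 0 = disabled); total_allocated_bytes() stands in for however the
 * running total is read from shared memory.
 */
static bool
exceeds_max_total_backend_memory(Size request_size)
{
	uint64		limit_bytes;

	if (max_total_backend_memory == 0)	/* feature disabled */
		return false;

	limit_bytes = (uint64) max_total_backend_memory * 1024 * 1024;

	/* unlocked read: concurrent requests may both pass and overshoot */
	return total_allocated_bytes() + request_size > limit_bytes;
}
```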
RE: Perform streaming logical transactions by background workers and parallel apply
On Friday, November 25, 2022 10:54 AM Peter Smith wrote: > > Here are some review comments for v51-0001. Thanks for the comments! > == > > .../replication/logical/applyparallelworker.c > > 1. General - Error messages, get_worker_name() > > I previously wrote a comment to ask if the get_worker_name() should be used > in more places but the reply [1, #2b] was: > > > 2b. > > Consider if maybe all of these ought to be calling get_worker_name() > > which is currently static in worker.c. Doing this means any future > > changes to get_worker_name won't cause more inconsistencies. > > The most error message in applyparallelxx.c can only use "xx parallel worker", > so I think it's fine not to call get_worker_name > > ~ > > I thought the reply missed the point I was trying to make -- I meant if it was > arranged now so *every* message would go via > get_worker_name() then in future somebody wanted to change the names (e.g. > from "logical replication parallel apply worker" to "LR PA > worker") then it would only need to be changed in one central place instead of > hunting down every hardwired error message. > Thanks for the suggestion. I understand your point, but I feel that using get_worker_name() at some places where the worker type is decided could make developer think that all kind of worker can enter this code which I am not sure is better. So I didn't change this. > > 2. HandleParallelApplyMessage > > + case 'X': /* Terminate, indicating clean exit. */ > + shm_mq_detach(winfo->error_mq_handle); > + winfo->error_mq_handle = NULL; > + break; > + default: > + elog(ERROR, "unrecognized message type received from logical > replication parallel apply worker: %c (message length %d bytes)", > + msgtype, msg->len); > > The case 'X' code indentation is too much. Changed. > == > > src/backend/replication/logical/origin.c > > 3. replorigin_session_setup(RepOriginId node, int acquired_by) > > @@ -1075,12 +1075,20 @@ ReplicationOriginExitCleanup(int code, Datum arg) > * array doesn't have to be searched when calling > * replorigin_session_advance(). > * > - * Obviously only one such cached origin can exist per process and the > current > + * Normally only one such cached origin can exist per process and the > + current > * cached value can only be set again after the previous value is torn down > * with replorigin_session_reset(). > + * > + * However, we do allow multiple processes to point to the same origin > + slot if > + * requested by the caller by passing PID of the process that has > + already > + * acquired it as acquired_by. This is to allow multiple parallel apply > + * processes to use the same origin, provided they maintain commit > + order, for > + * example, by allowing only one process to commit at a time. For the > + first > + * process requesting this origin, the acquired_by parameter needs to > + be set to > + * 0. > */ > void > -replorigin_session_setup(RepOriginId node) > +replorigin_session_setup(RepOriginId node, int acquired_by) > > I think the meaning of the acquired_by=0 is not fully described here: > "For the first process requesting this origin, the acquired_by parameter needs > to be set to 0." > IMO that seems to be describing it only from POV that you are always going to > want to allow multiple processes. But really this is an optional feature so > you > might pass acquired_by=0, not just because this is the first of multiple, but > also > because you *never* want to allow multiple at all. The comment does not > convey this meaning. 
> > Maybe something worded like below is better? > > SUGGESTION > Normally only one such cached origin can exist per process so the cached value > can only be set again after the previous value is torn down with > replorigin_session_reset(). For this normal case pass > acquired_by=0 (meaning the slot is not allowed to be already acquired by > another process). > > However, sometimes multiple processes can safely re-use the same origin slot > (for example, multiple parallel apply processes can safely use the same > origin, > provided they maintain commit order by allowing only one process to commit > at a time). For this case the first process must pass acquired_by=0, and then > the > other processes sharing that same origin can pass acquired_by=PID of the first > process. Changes as suggested. > == > > src/backend/replication/logical/worker.c > > 4. GENERAL - get_worker_name() > > If you decide it is OK to hardwire some error messages instead of > unconditionally calling the get_worker_name() -- see my #1 review comment in > this post -- then there are some other messages in this file that also seem > like > they can be also hardwired because the type of worker is already known. > > Here are some examples: > > 4a. > > + else if (am_parallel_apply_worker()) > + { > + if (rel->state != SUBREL_STATE_READY) > + ereport(ERROR, > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + /* translator: first
RE: Perform streaming logical transactions by background workers and parallel apply
On Tuesday, November 22, 2022 9:53 PM Kuroda, Hayato wrote:
>
> Thanks for updating the patch!
> I tested whether the deadlock caused by a foreign key constraint could
> be detected, and it worked well.
>
> Followings are my review comments. They are basically related to 0001, but
> some contents may not be. It takes time to understand 0002 correctly...

Thanks for the comments!

> 01. typedefs.list
>
> LeaderFileSetState should be added to typedefs.list.
>
> 02. 032_streaming_parallel_apply.pl
>
> As I said in [1]: the test name may not be matched. Do you have reasons to
> revert the change?

The original parallel safety check has been removed, so I changed the name.
After rethinking about this, I named it stream_parallel_conflict.

> 03. 032_streaming_parallel_apply.pl
>
> The test does not cover the case where the backend process is involved in
> the deadlock. IIUC this is another motivation to use a stream/transaction
> lock. I think it should be added.

The main deadlock cases that the stream/transaction lock can detect are
1) LA->PA and 2) LA->PA->PA, as explained atop applyparallelworker.c. So I
think the backend-process-related case is a variant that is already covered
by the existing tests in the patch.

> 04. log output
>
> While spooled changes are being applied by the PA, there are many messages
> like "replayed %d changes from file..." and "applied %u changes...". They
> come from apply_handle_stream_stop() and apply_spooled_messages(). They
> have the same meaning, so I think one of them can be removed.

Changed.

> 05. system_views.sql
>
> In the previous version you modified the pg_stat_subscription system view.
> Why did you revert that?

I was not sure whether we should include that in the main patch set. I added
a top-up patch that changes the view.

> 06. interrupt.c - SignalHandlerForShutdownRequest()
>
> In the comment atop SignalHandlerForShutdownRequest(), some processes that
> assign the function (other than for SIGTERM) are mentioned. We may be able
> to add the parallel apply worker.

Changed.

> 08. guc_tables.c - ConfigureNamesInt
>
> ```
> &max_sync_workers_per_subscription,
> + 2, 0, MAX_PARALLEL_WORKER_LIMIT,
> + NULL, NULL, NULL
> + },
> ```
>
> The upper limit for max_sync_workers_per_subscription seems to be wrong; it
> should be used for max_parallel_apply_workers_per_subscription.

That's my miss, sorry for that.

> 10. worker.c - maybe_reread_subscription()
>
> ```
> + if (am_parallel_apply_worker())
> +     ereport(LOG,
> +             /* translator: first %s is the name of logical replication worker */
> +             (errmsg("%s for subscription \"%s\" will stop because of a parameter change",
> +                     get_worker_name(), MySubscription->name)));
> ```
>
> I was not sure get_worker_name() is needed. I think "logical replication
> apply worker" should be embedded.

Changed.

> 11. worker.c - ApplyWorkerMain()
>
> ```
> + (errmsg_internal("%s for subscription \"%s\" two_phase is %s",
> +                  get_worker_name(),
> ```

Changed.

Best regards,
Hou zj
CF 2022-11: entries "Ready for Committer" with recent activity
Hi

This is an overview of the current patches marked "Ready for Committer" which
have been actively worked on during the current CF. They largely fall into two
categories:

- active participation from likely committers
- have been reviewed and marked as "RfC", but need committer interest

Dates in parentheses represent the last mail on the relevant thread.

meson PGXS compatibility (2022-10-13)
--------------------------------------
- https://commitfest.postgresql.org/40/3932/
- https://www.postgresql.org/message-id/flat/20221005200710.luvw5evhwf6cl...@awork3.anarazel.de

Sounds like it is committable, presumably just needs one of the committers on
the thread to do it.

Let libpq reject unexpected authentication requests (2022-11-16)
-----------------------------------------------------------------
- https://commitfest.postgresql.org/40/3716/
- https://www.postgresql.org/message-id/flat/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3.ca...@vmware.com
- Named committer: Michael Paquier (michael-kun)

Patch refreshed per Michael's comments.

Add LSN to error messages reported for WAL file (2022-11-17)
-------------------------------------------------------------
- https://commitfest.postgresql.org/40/3909/
- https://www.postgresql.org/message-id/flat/CALj2ACWV=FCddsxcGbVOA=cvPyMr75YCFbSQT6g4KDj=gcj...@mail.gmail.com

Thread consensus is that it is RfC, but it needs interest from a committer.

Parallel Hash Full Join (2022-11-17)
-------------------------------------
- https://commitfest.postgresql.org/40/2903/
- https://www.postgresql.org/message-id/flat/CA+hUKG+A6ftXPz4oe92+x8Er+xpGZqto70-Q_ERwRaSyA=a...@mail.gmail.com
- Named committer: Thomas Munro (macdice)

Thomas has indicated he will look at this.

Fix assertion failure with barriers in parallel hash join (2022-11-18)
------------------------------------------------------------------------
- https://commitfest.postgresql.org/40/3662/
- https://www.postgresql.org/message-id/flat/20200929061142.ga29...@paquier.xyz

Thomas has indicated he will look at this.

XID formatting and SLRU refactorings ... (2022-11-21)
------------------------------------------------------
- https://commitfest.postgresql.org/40/3489/
- https://www.postgresql.org/message-id/flat/caj7c6tpdoybyrncaeyndkbkto0wg2xsdydutf0nxq+vfkmt...@mail.gmail.com

Patch has been updated several times very recently; needs interest from a
committer.

Reducing power consumption when idle (2022-11-21)
---------------------------------------------------
- https://commitfest.postgresql.org/40/3566/
- https://www.postgresql.org/message-id/flat/canbhv-hk8yvo_g4vwbmz__vzu_vz4_jjwstwmanmxietdzq...@mail.gmail.com

This is mainly awaiting resolution of the decision whether to deprecate or
remove "promote_trigger_file"; consensus seems to be towards "remove".

pg_dump - read data for some options from external file (2022-11-22)
----------------------------------------------------------------------
- https://commitfest.postgresql.org/40/2573/
- https://www.postgresql.org/message-id/flat/CAFj8pRB10wvW0CC9Xq=1XDs=zcqxer3cblcnza+qix4cuh-...@mail.gmail.com

Updated patch, needs interest from a committer. This one has been floating
around for a couple of years...

Support % wildcard in extension upgrade scripts (2022-11-22)
-------------------------------------------------------------
- https://commitfest.postgresql.org/40/3654/
- https://www.postgresql.org/message-id/flat/YgakFklJyM5pNdt+@c19

Patch has feedback, needs interest from a committer; I haven't looked into it
in detail but maybe it needs a bit more discussion?

Compress KnownAssignedXids more frequently (2022-11-22)
---------------------------------------------------------
- https://commitfest.postgresql.org/40/3902/
- https://www.postgresql.org/message-id/flat/CALdSSPgahNUD_=pB_j=1zsndbaiotqvfzo8ejt5j_k7qziu...@mail.gmail.com

Active, with feedback from committers.

Transaction Management docs (2022-11-23)
------------------------------------------
- https://commitfest.postgresql.org/40/3899/
- https://www.postgresql.org/message-id/flat/canbhv-e_iy9fmrerxrch8tztyenpfo72hf_xd2hldppva4d...@mail.gmail.com

Seems committable.
Fix order of checking ICU options in initdb and create database (2022-11-24)
------------------------------------------------------------------------------
- https://commitfest.postgresql.org/40/3976/
- https://www.postgresql.org/message-id/flat/534fed4a262fee534662bd07a691c...@postgrespro.ru
- Named committer: Peter Eisentraut (petere)

Concerns expressed by Peter, will change to WoA.

Use fadvise in wal replay (2022-11-25)
---------------------------------------
- https://commitfest.postgresql.org/40/3707/
- https://www.postgresql.org/message-id/flat/CADVKa1WsQMBYK_02_Ji=pbofnms+ct7tv6qjxddhsfcic9v...@mail.gmail.com

There seems to be consensus that this patch is small and will bring a small
benefit with minimal code. Needs interest from a committer.

PGDOCS - Stats views and functions not in order? (2022-11-26)
---------------------------------------------------------------
Re: Small miscellaneous fixes
On Sat, Nov 26, 2022 at 11:30:07AM -0300, Ranier Vilela wrote: > Thank you Michael, for taking care of it. (As of 1e31484, after finishing the tests I wanted.) -- Michael signature.asc Description: PGP signature
Re: Reducing power consumption on idle servers
On Tue, Nov 22, 2022 at 5:50, Laurenz Albe wrote:
>
> On Mon, 2022-11-21 at 12:11 -0500, Tom Lane wrote:
> > Robert Haas writes:
> > > The reason that I pushed back -- not as successfully as I would have
> > > liked -- on the changes to pg_stop_backup / pg_start_backup is that I
> > > know there are people using the old method successfully, and it's not
> > > just a 1:1 substitution. Here I don't, and it is. I'm totally open to
> > > the feedback that such people exist and to hearing why adopting one of
> > > the newer methods would be a problem for them, if that's the case. But
> > > if there's no evidence that such people exist or that changing is a
> > > problem for them, I don't think waiting 5 years on principle is good
> > > for the project.
> >
> > We make incompatible changes in every release; see the release notes.
> > Unless somebody can give a plausible use-case where this'd be a
> > difficult change to deal with, I concur that we don't need to
> > deprecate it ahead of time.
>
> Since I am the only one that seems to worry, I'll shut up. You are probably
> right that the feature won't be missed by many users.

FWIW, though I prefer to err on the side of caution when removing features
from anything, I am struggling to remember ever having used
"promote_trigger_file" (or "trigger_file" as it was in Pg 11 and earlier);
grepping my private notes brings up a single example from ca. 2012 when I
appear to have been experimenting with replication.

On a quick web search, a large part of the results are related to its change
to a GUC in Pg 12 and/or commented-out entries in sample postgresql.conf
files; most of the rest seem to be blog articles/tutorials on setting up
replication.

Regards

Ian Barwick
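For context, the promotion methods that remain if "promote_trigger_file" is
removed are the SQL function and pg_ctl, for example:

```
-- On the standby: either of these replaces touching a promote trigger file.
SELECT pg_promote();   -- waits up to 60 seconds by default for promotion

-- or, from the shell:
--   pg_ctl promote -D /path/to/standby/data
```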
Re: User functions for building SCRAM secrets
On 11/26/22 2:53 PM, Jonathan S. Katz wrote: On 11/16/22 10:09 PM, Michael Paquier wrote: git diff --check reports some whitespaces. Ack. Will fix on the next pass. (I've been transitioning editors, which could have resulted in that), Fixed (and have run that check subsequently). scram_build_secret_sha256_internal() is missing SASLprep on the password string. Perhaps the best thing to do here is just to extend pg_be_scram_build_secret() with more arguments so as callers can optionally pass down a custom salt with its length, leaving the responsibility to pg_be_scram_build_secret() to create a random salt if nothing has been given? Ah, good catch! I think if we go with passing down the salt, we'd also have to allow for the passing down of the iterations, too, and we're close to rebuilding "scram_build_secret". I'll stare a bit at this on the next pass and either 1/ just SASLprep the string in the new "scram_build_secret_sha256_internal" func, or 2/ change the definition of "pg_be_scram_build_secret" to accommodate more overrides. In the end I went with your suggested approach as it limited the amount of code duplication. I did keep in all the permutations of the tests as it did help me catch an error in my code that let to a panic. As this seems to be closer to completion, I did include docs in this patch. I added this function as part of the "string functions" section of the docs as "md5" was already there. If we continue to add more authentication helper functions, perhaps we should consider breaking those out into their own documentation section. Thanks, Jonathan From 2b04b1aca0415b726fdacc7c0cc4903ee864257c Mon Sep 17 00:00:00 2001 From: "Jonathan S. Katz" Date: Mon, 31 Oct 2022 16:13:08 -0400 Subject: [PATCH] Add "scram_build_secret_sha_256" SQL function This function lets users build SCRAM secrets from SQL functions and provides the ability for the user to select the password, salt, and number of iterations for the password hashing algorithm. --- doc/src/sgml/func.sgml | 42 +++ src/backend/catalog/system_functions.sql | 6 ++ src/backend/libpq/auth-scram.c | 29 +--- src/backend/libpq/crypt.c| 2 +- src/backend/utils/adt/Makefile | 1 + src/backend/utils/adt/authfuncs.c| 69 ++ src/backend/utils/adt/meson.build| 1 + src/include/catalog/pg_proc.dat | 4 ++ src/include/libpq/scram.h| 3 +- src/test/regress/expected/scram.out | 91 src/test/regress/parallel_schedule | 2 +- src/test/regress/sql/scram.sql | 56 +++ 12 files changed, 295 insertions(+), 11 deletions(-) create mode 100644 src/backend/utils/adt/authfuncs.c create mode 100644 src/test/regress/expected/scram.out create mode 100644 src/test/regress/sql/scram.sql diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 6e0425cb3d..f582da138f 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -3485,6 +3485,48 @@ repeat('Pg', 4) PgPgPgPg + + + + scram_build_secret_sha256 + +scram_build_secret_sha256 ( password text +[, salt bytea +[, iterations integer ] ]) +text + + +Using the value provided in password, builds a +SCRAM secret equilvaent to what is stored in +pg_authid.rolpassword +and used with scram-sha-256 +authentication. If not provided or set to NULL, +salt is randomly generated and +iterations defaults to 4096. 
+ + +SELECT scram_build_secret_sha256('secret password', decode('MTIzNDU2Nzg5MGFiY2RlZg==', 'base64')); + + + SCRAM-SHA-256$4096:MTIzNDU2Nzg5MGFiY2RlZg==$D5BmucT796UQKargx2k3fdqjDYR7cH/L0viKKhGo6kA=:M33+iHFOESP8C3DKLDb2k5QAhkNVWEbp/YUIFd2CxN4= + + + +SELECT scram_build_secret_sha256('secret password', '\xabba5432'); + + + SCRAM-SHA-256$4096:q7pUMg==$05Nb9QHwHkMA0CRcYaEfwtgZ+3kStIefz8fLMjTEtio=:P126h1ycyP938E69yxktEfhoAILbiwL/UMsMk3Efb6o= + + + +SELECT scram_build_secret_sha256('secret password', decode('MTIzNDU2Nzg5MGFiY2RlZg==', 'base64'), 1); + + + SCRAM-SHA-256$1:MTIzNDU2Nzg5MGFiY2RlZg==$9NkDu1TFpx3L30zMgHUqjRNSq3GRZRrdWU4TuGOnT3Q=:svuIH9L6HH8loyKWguT64XXoOLCrr4FkVViPd2JVR4M= + + + + diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql index 30a048f6b0..4aa76b81d9 100644 --- a/src/backend/catalog/system_functions.sql +++ b/src/backend/catalog/system_functions.sql @@ -594,6 +594,12 @@ LANGUAGE internal STRICT IMMUTABLE PARALLEL SAFE AS 'unicode_is_normalized'; +-- defaults for building a "scram-sha-256" secret +CREATE OR REPLACE FUNCTION + scram_build_secret_sha256(text, bytea DEFAULT NULL, int DEFAULT NULL) +RETURNS text +LANGUA
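As a possible usage sketch (assuming the patch as posted; "app_user" is a
placeholder role): the generated string is a standard SCRAM verifier, so it
can be assigned with ALTER ROLE, which stores a password that already looks
like a SCRAM secret verbatim. From psql:

```
-- Build a SCRAM secret server-side and assign it to a role.
SELECT scram_build_secret_sha256('secret password') AS secret \gset
ALTER ROLE app_user PASSWORD :'secret';
```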
Re: Add tracking of backend memory allocated to pg_stat_activity
On Sat, 2022-11-26 at 22:10 -0500, Reid Thompson wrote: > Code rebased to current master. > Updated to incorporate additional recommendations from the the list > - add units to variables in view > - remove 'backend_' prefix from variables/functions > - update documentation > - add functional test for allocated_bytes > - refactor allocation reporting to reduce number of functions and > branches/reduce performance hit > - zero allocated bytes after fork to avoid double counting > postmaster allocations > > > > attempt to remedy cfbot windows build issues -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From fb43761181925a87cb0674b6744b009fea614796 Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Thu, 11 Aug 2022 12:01:25 -0400 Subject: [PATCH] Add tracking of backend memory allocated to pg_stat_activity This new field displays the current bytes of memory allocated to the backend process. It is updated as memory for the process is malloc'd/free'd. Memory allocated to items on the freelist is included in the displayed value. Dynamic shared memory allocations are included only in the value displayed for the backend that created them, they are not included in the value for backends that are attached to them to avoid double counting. On occasion, orphaned memory segments may be cleaned up on postmaster startup. This may result in decreasing the sum without a prior increment. We limit the floor of backend_mem_allocated to zero. Updated pg_stat_activity documentation for the new column. --- doc/src/sgml/monitoring.sgml| 15 src/backend/catalog/system_views.sql| 1 + src/backend/postmaster/autovacuum.c | 6 ++ src/backend/postmaster/postmaster.c | 13 src/backend/postmaster/syslogger.c | 3 + src/backend/storage/ipc/dsm_impl.c | 81 + src/backend/utils/activity/backend_status.c | 45 src/backend/utils/adt/pgstatfuncs.c | 4 +- src/backend/utils/mmgr/aset.c | 17 + src/backend/utils/mmgr/generation.c | 15 src/backend/utils/mmgr/slab.c | 21 ++ src/include/catalog/pg_proc.dat | 6 +- src/include/utils/backend_status.h | 58 ++- src/test/regress/expected/rules.out | 9 ++- src/test/regress/expected/stats.out | 11 +++ src/test/regress/sql/stats.sql | 3 + 16 files changed, 299 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 5579b8b9e0..ffe7d2566c 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -947,6 +947,21 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser + + + allocated_bytes bigint + + + Memory currently allocated to this backend in bytes. This is the balance + of bytes allocated and freed by this backend. Dynamic shared memory + allocations are included only in the value displayed for the backend that + created them, they are not included in the value for backends that are + attached to them to avoid double counting. Use pg_size_pretty + described in to make this value + more easily readable. 
+ + + query text diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 2d8104b090..9ea8f78c95 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -864,6 +864,7 @@ CREATE VIEW pg_stat_activity AS S.state, S.backend_xid, s.backend_xmin, +S.allocated_bytes, S.query_id, S.query, S.backend_type diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 601834d4b4..f54606104d 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -407,6 +407,9 @@ StartAutoVacLauncher(void) #ifndef EXEC_BACKEND case 0: + /* Zero allocated bytes to avoid double counting parent allocation */ + pgstat_zero_my_allocated_bytes(); + /* in postmaster child ... */ InitPostmasterChild(); @@ -1485,6 +1488,9 @@ StartAutoVacWorker(void) #ifndef EXEC_BACKEND case 0: + /* Zero allocated bytes to avoid double counting parent allocation */ + pgstat_zero_my_allocated_bytes(); + /* in postmaster child ... */ InitPostmasterChild(); diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index a8a246921f..89a6caec78 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -4102,6 +4102,9 @@ BackendStartup(Port *port) { free(bn); + /* Zero allocated
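As a usage illustration (assuming the patch applies and the column is named
allocated_bytes, as in the documentation hunk above), the new field can be
inspected like any other pg_stat_activity column:

```
-- Largest backends by tracked allocation, most readable with pg_size_pretty.
SELECT pid, backend_type, state,
       pg_size_pretty(allocated_bytes) AS allocated
FROM pg_stat_activity
ORDER BY allocated_bytes DESC
LIMIT 10;
```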
Re: Avoid streaming the transaction which are skipped (in corner cases)
On Sat, Nov 26, 2022 at 12:15 PM Amit Kapila wrote: > > On Fri, Nov 25, 2022 at 5:38 PM Amit Kapila wrote: > > > > On Fri, Nov 25, 2022 at 1:35 PM Dilip Kumar wrote: > > > > > > During DecodeCommit() for skipping a transaction we use ReadRecPtr to > > > check whether to skip this transaction or not. Whereas in > > > ReorderBufferCanStartStreaming() we use EndRecPtr to check whether to > > > stream or not. Generally it will not create a problem but if the > > > commit record itself is adding some changes to the transaction(e.g. > > > snapshot) and if the "start_decoding_at" is in between ReadRecPtr and > > > EndRecPtr then streaming will decide to stream the transaction where > > > as DecodeCommit will decide to skip it. And for handling this case in > > > ReorderBufferForget() we call stream_abort(). > > > > > > > The other cases are probably where we don't have FilterByOrigin or > > dbid check, for example, XLOG_HEAP2_NEW_CID/XLOG_XACT_INVALIDATIONS. > > We anyway actually don't send anything for such cases except empty > > start/stop messages. Can we add some flag to txn which says that there > > is at least one change like DML that we want to stream? > > > > We can probably think of using txn_flags for this purpose. In the attached patch I have used txn_flags to identify whether it has any streamable change or not and the transaction will not be selected for streaming unless it has at least one streamable change. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com From f330d37f6ac1930cde1d2773dcd568b9a35454c9 Mon Sep 17 00:00:00 2001 From: Dilip Kumar Date: Fri, 25 Nov 2022 13:11:44 +0530 Subject: [PATCH v2] Fix thinko in when to stream a transaction Actually, during DecodeCommit() for skipping a transaction we use ReadRecPtr to check whether to skip this transaction or not. Whereas in ReorderBufferCanStartStreaming() we use EndRecPtr to check whether to stream or not. Generally it will not create a problem but if the commit record itslef is adding some changes to the transaction(e.g. snapshot) and if the start_decoding_at is in between ReadRecPtr and EndRecPtr then streaming will decide to stream the transaction where as DecodeCommit will decide to skip it. And for handling this case in ReorderBufferForget() we call stream_abort() in order to abort any streamed changes. So ideally if we are planning to skip the transaction we should never stream it hence there is no need to stream abort such transaction in case of skip. Along with that we also skip the transaction if the transaction dbid is not same slot dbid or it is filtered by origin id. So in corner cases it is possible that we might stream the transaction but later it will be skipped in DecodeCommit. For fixing that do not select any transaction for streaming unless there is any streamable change and if there is any streamable change then we can safely select it for streaming as it will not be skipped by DecodeCommit. 
--- src/backend/replication/logical/reorderbuffer.c | 34 + src/include/replication/reorderbuffer.h | 23 +++-- 2 files changed, 44 insertions(+), 13 deletions(-) diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index 31f7381..e1a031d 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -793,6 +793,30 @@ ReorderBufferQueueChange(ReorderBuffer *rb, TransactionId xid, XLogRecPtr lsn, return; } + /* + * If there are any streamable changes getting queued then get the top + * transaction and mark it has streamable change. This is required for + * streaming in-progress transactions, the in-progress transaction will + * not be selected for streaming unless it has at least one streamable + * change. + */ + if (change->action == REORDER_BUFFER_CHANGE_INSERT || + change->action == REORDER_BUFFER_CHANGE_UPDATE || + change->action == REORDER_BUFFER_CHANGE_DELETE || + change->action == REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT || + change->action == REORDER_BUFFER_CHANGE_TRUNCATE) + { + ReorderBufferTXN *toptxn; + + /* get the top transaction */ + if (txn->toptxn != NULL) + toptxn = txn->toptxn; + else + toptxn = txn; + + toptxn->txn_flags |= RBTXN_HAS_STREAMABLE_CHANGE; + } + change->lsn = lsn; change->txn = txn; @@ -2942,9 +2966,8 @@ ReorderBufferForget(ReorderBuffer *rb, TransactionId xid, XLogRecPtr lsn) if (txn == NULL) return; - /* For streamed transactions notify the remote node about the abort. */ - if (rbtxn_is_streamed(txn)) - rb->stream_abort(rb, txn, lsn); + /* the transaction which is being skipped shouldn't have been streamed */ + Assert(!rbtxn_is_streamed(txn)); /* cosmetic... */ txn->final_lsn = lsn; @@ -3502,7 +3525,8 @@ ReorderBufferLargestTopTXN(ReorderBuffer *rb) Assert(txn->base_snapshot != NULL); if ((largest == NULL || txn->total_size > largest_size) && - (txn->total_
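To show how the new flag would be consumed when picking a transaction to
stream, here is a sketch under the assumption that a helper macro in the
style of the existing rbtxn_* macros accompanies RBTXN_HAS_STREAMABLE_CHANGE;
this is not necessarily the exact code in the posted patch:

```
/* Sketch only: helper macro plus the kind of guard the selection code
 * (e.g. ReorderBufferLargestTopTXN()) would add. */
#define rbtxn_has_streamable_change(txn) \
	(((txn)->txn_flags & RBTXN_HAS_STREAMABLE_CHANGE) != 0)

	/*
	 * Never select a transaction for streaming unless it queued at least
	 * one streamable change; otherwise its only changes came from the
	 * commit record itself (snapshot, invalidations) and DecodeCommit()
	 * would later skip the transaction that we already streamed.
	 */
	if (!rbtxn_has_streamable_change(txn))
		continue;
```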
Re: Allow file inclusion in pg_hba and pg_ident files
On Fri, Nov 25, 2022 at 11:25, Julien Rouhaud wrote:
>
> On Fri, Nov 25, 2022 at 07:41:59AM +0900, Michael Paquier wrote:
> > On Thu, Nov 24, 2022 at 05:07:24PM +0800, Julien Rouhaud wrote:
> > > So I went with CONF_FILE_START_DEPTH and CONF_FILE_MAX_DEPTH. Attached
> > > v22 that fixes it in all the places I found.
> >
> > Sounds fine. Added one comment, fixed one comment, and applied.
> > Thanks!
>
> Thanks a lot!

Hi

I'm trying to reconcile open CommitFest entries with their actual status; the
entry for this:

https://commitfest.postgresql.org/40/3558/

shows as "Waiting on Author", but it looks like it's all been committed; is
there anything left to do on this?

Thanks

Ian Barwick
Re: Allow file inclusion in pg_hba and pg_ident files
On Sun, Nov 27, 2022 at 15:31, Ian Lawrence Barwick wrote:
>
> I'm trying to reconcile open CommitFest entries with their actual
> status; the entry for this:
>
> https://commitfest.postgresql.org/40/3558/
>
> shows as "Waiting on Author", but looks like it's all been committed;
> is there anything left to do on this?

Right, the CF entry is out of date. There's still the additional TAP test
that needs to be taken care of. I sent a new version recently that works on
Windows CI, so I guess the correct status should now be Needs Review,
although the latest patchset probably doesn't apply anymore since the first
patch has been committed.

I'm traveling today; I'll try to send a rebased version in the evening.