Remove an undefined function CalculateMaxmumSafeLSN

2022-05-21 Thread Bharath Rupireddy
Hi,

While working on a feature, I came across an undefined
(declaration-only) function CalculateMaxmumSafeLSN added by commit
c655077 [1]. Attaching a patch to remove it.

[1] commit c6550776394e25c1620bc8258427c8f1d448080d
Author: Alvaro Herrera 
Date:   Tue Apr 7 18:35:00 2020 -0400

Allow users to limit storage reserved by replication slots

Regards,
Bharath Rupireddy.


v1-0001-Remove-an-undefined-function-CalculateMaxmumSafeL.patch
Description: Binary data


Fix unnecessary includes and comments in 019_replslot_limit.pl, 007_wal.pl and 004_timeline_switch.pl

2022-05-21 Thread Bharath Rupireddy
Hi,

While working on a feature, I noticed that some TAP test files carry an
unnecessary "use File::Path qw(rmtree);" include and that
019_replslot_limit.pl has some incorrect comments. Attaching a patch
to fix these.

Regards,
Bharath Rupireddy.


v1-0001-Fix-unnecessary-includes-and-comments-in-019_repl.patch
Description: Binary data


Re: Remove an undefined function CalculateMaxmumSafeLSN

2022-05-21 Thread Sergei Kornilov
Hello

Yes, I already wrote about this artifact and created a CF app entry so it
wouldn't get lost: https://commitfest.postgresql.org/38/3616/

regards, Sergei




Re: CPU time for pg_stat_statement

2022-05-21 Thread Julien Rouhaud
Hi,

On Sat, May 21, 2022 at 12:21:49AM +0300, Michail Nikolaev wrote:
>
> > This might be interesting:
> > https://github.com/powa-team/pg_stat_kcache
>
> Oh, nice, looks like it could help me to reduce CPU and test my
> assumption (using exec_user_time and exec_system_time).
>
> BTW, do you know why the extension is not in standard contrib (it looks mature)?

Because contrib isn't meant to eventually contain all possible extensions.

There is an official postgres extension network, as well as community deb/rpm
repositories, that are intended to handle postgres extensibility. This
extension is available on all of them, the same as a lot of other extensions
that are at least as mature.




Re: PG15 beta1 fix pg_stat_statements view document

2022-05-21 Thread Michael Paquier
On Sat, May 21, 2022 at 03:36:10AM +, Shinoda, Noriyoshi (PN Japan FSIP) 
wrote:
> Thank you for your comment.
> I attached the fixed patch.

Thanks, applied.
--
Michael


signature.asc
Description: PGP signature


Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

2022-05-21 Thread Amit Kapila
On Mon, Oct 11, 2021 at 11:58 AM Masahiko Sawada  wrote:
>
> It's much simpler than mine. I think that creating an entry for a
> catalog-changed transaction in the reorder buffer before
> SnapBuildCommitTxn() is the right direction.
>
> After more thought, given that DDLs are less likely to happen than DML in
> practice, probably we can always mark both the top transaction and its
> subtransactions as containing catalog changes if the commit record has
> XACT_XINFO_HAS_INVALS?
>

I have some observations and thoughts on this work.

1.
+# For the transaction that TRUNCATEd the table tbl1, the last decoding decodes
+# only its COMMIT record, because it starts from the RUNNING_XACT record emitted
+# during the second checkpoint execution.  This transaction must be marked as
+# containing catalog changes during decoding the COMMIT record and the decoding
+# of the INSERT record must read the pg_class with the correct historic snapshot.
+permutation "s0_init" "s0_begin" "s0_savepoint" "s0_truncate" "s1_checkpoint" "s1_get_changes" "s0_commit" "s0_begin" "s0_insert" "s1_checkpoint" "s1_get_changes" "s0_commit" "s1_get_changes"

In the first line of the comment, do you want to say "... record emitted
during the first checkpoint"? Only then can decoding start from the commit
record of the transaction that performed the truncate.

2.
+ /*
+  * Mark the top transaction and its subtransactions as containing catalog
+  * changes, if the commit record has invalidation message.  This is necessary
+  * for the case where we decode only the commit record of the transaction
+  * that actually has done catalog changes.
+  */
+ if (parsed->xinfo & XACT_XINFO_HAS_INVALS)
+ {
+     ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr);
+
+     for (int i = 0; i < parsed->nsubxacts; i++)
+     {
+         ReorderBufferAssignChild(ctx->reorder, xid, parsed->subxacts[i],
+                                  buf->origptr);
+         ReorderBufferXidSetCatalogChanges(ctx->reorder, parsed->subxacts[i],
+                                           buf->origptr);
+     }
+ }
+
  SnapBuildCommitTxn(ctx->snapshot_builder, buf->origptr, xid,
                     parsed->nsubxacts, parsed->subxacts);

Marking it before SnapBuildCommitTxn has one disadvantage: we sometimes do
this work even if the snapshot state is SNAPBUILD_START or
SNAPBUILD_BUILDING_SNAPSHOT, in which case SnapBuildCommitTxn wouldn't do
anything. Now, while this will fix the issue, it seems we need to do this
work even when we would have already marked the txn as containing catalog
changes, and there are probably cases where we mark transactions when it is
not required, as discussed in this thread.

I think if we don't have any better ideas then we should go with
either this or one of the other proposals in this thread. The other
idea that occurred to me is whether we can somehow update the snapshot
we have serialized on disk with this information. On each
running_xacts record, when we serialize the snapshot, we also try to
purge the committed xacts (via SnapBuildPurgeCommittedTxn). During
that, we can check whether there are committed xacts to be purged and
whether we have previously serialized the snapshot for the prior
running_xacts record; if so, we can update it with the list of xacts
that have catalog changes. If this is feasible then I think we need to
somehow remember the point where we last serialized the snapshot
(maybe by using builder->last_serialized_snapshot). Even if this is
feasible, we may not be able to do it in back-branches because of the
disk-format change required.

Thoughts?

-- 
With Regards,
Amit Kapila.




Re: definition of CalculateMaxmumSafeLSN

2022-05-21 Thread Bharath Rupireddy
On Mon, Feb 28, 2022 at 7:31 PM Sergei Kornilov  wrote:
>
> Hello
> I just spotted in src/include/access/xlog.h:
> extern XLogRecPtr CalculateMaxmumSafeLSN(void);
>
> This function doesn't seem to be used anywhere or even defined? "git grep 
> CalculateMaxmumSafeLSN" shows only this line.
> Was added in c6550776394e25c1620bc8258427c8f1d448080d "Allow users to limit 
> storage reserved by replication slots"

+1 to remove. Although the change is straightforward, I attached the
patch just for the sake of convention. I've marked it as RfC
https://commitfest.postgresql.org/38/3616/.

Just for the record - there's another thread
https://www.postgresql.org/message-id/CALj2ACVoQ7NEf43Xz0rfxsGOKYTN5r4VZp2DO2_5p%2BCMzsRPFw%40mail.gmail.com

Regards,
Bharath Rupireddy.


v1-0001-Remove-an-undefined-function-CalculateMaxmumSafeL.patch
Description: Binary data


Re: check for null value before looking up the hash function

2022-05-21 Thread Ranier Vilela
>Zhihong Yu  writes:
>> I was looking at the code in hash_record()
>> of src/backend/utils/adt/rowtypes.c
>> It seems if nulls[i] is true, we don't need to look up the hash function.

>I don't think this is worth changing. It complicates the logic,
>rendering it unlike quite a few other functions written in the same
>style. In cases where the performance actually matters, the hash
>function is cached across multiple calls anyway. You might save
>something if you have many calls in a query and not one of them
>receives a non-null input, but how likely is that?

I disagree.
I think it is worth changing; the fact that it complicates the logic
is irrelevant.
But maybe the attached v2 would be a little better.
My doubt is about the result calculation when nulls are true.
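
For reference, the existing hash_array() convention is that a null element
contributes a zero element hash, so a null short-circuit would still need to
fold a zero into the running result, roughly like this (just a sketch
mirroring the current code, not the attached patch):

if (nulls[i])
{
    /* see hash_array(): a null element contributes a zero hash */
    result = (result << 5) - result + 0;
    continue;
}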

regards,
Ranier Vilela


v2-hash-record-check-null-first.patch
Description: Binary data


doc: CREATE FOREIGN TABLE .. PARTITION OF .. DEFAULT

2022-05-21 Thread Justin Pryzby
It looks like the docs weren't updated in 6f6b99d13 for v11.

The docs also seem to omit the "FOR VALUES" literal,
and they don't define partition_bound_spec (which I didn't fix here).

diff --git a/doc/src/sgml/ref/create_foreign_table.sgml b/doc/src/sgml/ref/create_foreign_table.sgml
index b374d8645db..1f1c4a52a2a 100644
--- a/doc/src/sgml/ref/create_foreign_table.sgml
+++ b/doc/src/sgml/ref/create_foreign_table.sgml
@@ -35,7 +35,8 @@ CREATE FOREIGN TABLE [ IF NOT EXISTS ] table_name
   { column_name [ WITH OPTIONS ] [ column_constraint [ ... ] ]
 | table_constraint }
 [, ... ]
-) ] partition_bound_spec
+) ]
+  { FOR VALUES partition_bound_spec | DEFAULT }
   SERVER server_name
 [ OPTIONS ( option 'value' [, ... ] ) ]
 




Re: check for null value before looking up the hash function

2022-05-21 Thread Ranier Vilela
On Sat, May 21, 2022 at 10:06, Ranier Vilela wrote:

> >Zhihong Yu  writes:
> >> I was looking at the code in hash_record()
> >> of src/backend/utils/adt/rowtypes.c
> >> It seems if nulls[i] is true, we don't need to look up the hash
> function.
>
> >I don't think this is worth changing. It complicates the logic,
> >rendering it unlike quite a few other functions written in the same
> >style. In cases where the performance actually matters, the hash
> >function is cached across multiple calls anyway. You might save
> >something if you have many calls in a query and not one of them
> >receives a non-null input, but how likely is that?
>
> I disagree.
> I think that is worth changing. The fact of complicating the logic
> is irrelevant.
> But maybe the v2 attached would be a little better.
>
Or v3.

regards,
Ranier Vilela


v3-hash-record-check-null-first.patch
Description: Binary data


Enforce "max_wal_size/ min_wal_size must be at least twice wal_segment_size" limit while setting GUCs

2022-05-21 Thread Bharath Rupireddy
Hi,

Currently postgres allows setting any value for max_wal_size or
min_wal_size, without enforcing the "at least twice wal_segment_size"
limit [1]. This isn't a problem while the server continues to run;
however, the server can't come back up after a crash, restart,
maintenance, or upgrade (it goes into a continuous restart loop with the
FATAL errors [1]). A reproduction is shown in [2].

How about we add GUC check hooks for both max_wal_size and
min_wal_size that emit an ERROR or WARNING if the values are not at
least twice wal_segment_size?

Thoughts?

[1]
FATAL:  "max_wal_size" must be at least twice "wal_segment_size"
FATAL:  "min_wal_size" must be at least twice "wal_segment_size"

[2]
./initdb -D data
./pg_ctl -D data -l logfile start
./psql -c "alter system set max_wal_size='2MB'" postgres
./psql -c "alter system set min_wal_size='2MB'" postgres
./psql -c "select pg_reload_conf()" postgres
./pg_ctl -D data -l logfile restart

Regards,
Bharath Rupireddy.




Re: check for null value before looking up the hash function

2022-05-21 Thread Tomas Vondra



On 5/21/22 15:06, Ranier Vilela wrote:
>>Zhihong Yu  writes:
>>> I was looking at the code in hash_record()
>>> of src/backend/utils/adt/rowtypes.c
>>> It seems if nulls[i] is true, we don't need to look up the hash function.
> 
>>I don't think this is worth changing. It complicates the logic,
>>rendering it unlike quite a few other functions written in the same
>>style. In cases where the performance actually matters, the hash
>>function is cached across multiple calls anyway. You might save
>>something if you have many calls in a query and not one of them
>>receives a non-null input, but how likely is that?
> 
> I disagree.
> I think that is worth changing. The fact of complicating the logic
> is irrelevant.

That's a quite bold claim, and yet you haven't supported it by any
argument whatsoever. Trade-offs between complexity and efficiency are a
crucial development task, so complicating the logic clearly does matter.

It might be outweighed by efficiency benefits, but as Tom suggested,
the cases that might benefit from this are extremely unlikely (data with
just NULL values). And even for those cases, no benchmark quantifying the
difference was presented.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: check for null value before looking up the hash function

2022-05-21 Thread Ranier Vilela
On Sat, May 21, 2022 at 12:05, Tomas Vondra <tomas.von...@enterprisedb.com> wrote:

>
>
> On 5/21/22 15:06, Ranier Vilela wrote:
> >>Zhihong Yu  writes:
> >>> I was looking at the code in hash_record()
> >>> of src/backend/utils/adt/rowtypes.c
> >>> It seems if nulls[i] is true, we don't need to look up the hash
> function.
> >
> >>I don't think this is worth changing. It complicates the logic,
> >>rendering it unlike quite a few other functions written in the same
> >>style. In cases where the performance actually matters, the hash
> >>function is cached across multiple calls anyway. You might save
> >>something if you have many calls in a query and not one of them
> >>receives a non-null input, but how likely is that?
> >
> > I disagree.
> > I think that is worth changing. The fact of complicating the logic
> > is irrelevant.
>
> That's a quite bold claim, and yet you haven't supported it by any
> argument whatsoever. Trade-offs between complexity and efficiency are a
> crucial development task, so complicating the logic clearly does matter.
>
What I meant is that complicating the logic in search of efficiency is
worth it, and that's what everyone in this thread is looking for.
Likewise, keeping the logic simple and losing a little bit of efficiency,
applied across all the code, adds up to a big loss of efficiency.
In other words, I never miss an opportunity to gain efficiency.

regards,
Ranier Vilela


Re: check for null value before looking up the hash function

2022-05-21 Thread David G. Johnston
On Sat, May 21, 2022 at 8:32 AM Ranier Vilela  wrote:

> Em sáb., 21 de mai. de 2022 às 12:05, Tomas Vondra <
> tomas.von...@enterprisedb.com> escreveu:
>
>>
>>
>> On 5/21/22 15:06, Ranier Vilela wrote:
>> >>Zhihong Yu  writes:
>> >>> I was looking at the code in hash_record()
>> >>> of src/backend/utils/adt/rowtypes.c
>> >>> It seems if nulls[i] is true, we don't need to look up the hash
>> function.
>> >
>> >>I don't think this is worth changing. It complicates the logic,
>> >>rendering it unlike quite a few other functions written in the same
>> >>style. In cases where the performance actually matters, the hash
>> >>function is cached across multiple calls anyway. You might save
>> >>something if you have many calls in a query and not one of them
>> >>receives a non-null input, but how likely is that?
>> >
>> > I disagree.
>> > I think that is worth changing. The fact of complicating the logic
>> > is irrelevant.
>>
>> That's a quite bold claim, and yet you haven't supported it by any
>> argument whatsoever. Trade-offs between complexity and efficiency are a
>> crucial development task, so complicating the logic clearly does matter.
>>
> What I meant is that complicating the logic in search of efficiency is
> worth it, and that's what everyone is looking for in this thread.
> Likewise, not complicating the logic, losing a little bit of efficiency,
> applied to all the code, leads to a big loss of efficiency.
> In other words, I never miss an opportunity to gain efficiency.
>
>
I disliked the fact that 80% of the patch was adding indentation.  Instead,
remove indentation for the normal flow case and move the loop short-circuit
to the top of the loop where it is traditionally found.

This seems like a win on both efficiency and complexity grounds.  Having
the "/* see hash_array() */" comment and logic twice is a downside but a
minor one that could be replaced with a function call if desired.

diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c
index db843a0fbf..0bc28d1742 100644
--- a/src/backend/utils/adt/rowtypes.c
+++ b/src/backend/utils/adt/rowtypes.c
@@ -1838,6 +1838,13 @@ hash_record(PG_FUNCTION_ARGS)
TypeCacheEntry *typentry;
uint32  element_hash;

+   if (nulls[i])
+   {
+   /* see hash_array() */
+   result = (result << 5) - result + 0;
+   continue;
+   }
+
att = TupleDescAttr(tupdesc, i);

if (att->attisdropped)
@@ -1860,24 +1867,16 @@ hash_record(PG_FUNCTION_ARGS)
my_extra->columns[i].typentry = typentry;
}

-   /* Compute hash of element */
-   if (nulls[i])
-   {
-   element_hash = 0;
-   }
-   else
-   {
-   LOCAL_FCINFO(locfcinfo, 1);
+   LOCAL_FCINFO(locfcinfo, 1);

-   InitFunctionCallInfoData(*locfcinfo, &typentry->hash_proc_finfo, 1,
-   att->attcollation, NULL, NULL);
-   locfcinfo->args[0].value = values[i];
-   locfcinfo->args[0].isnull = false;
-   element_hash = DatumGetUInt32(FunctionCallInvoke(locfcinfo));
+   InitFunctionCallInfoData(*locfcinfo, &typentry->hash_proc_finfo, 1,
+   att->attcollation, NULL, NULL);
+   locfcinfo->args[0].value = values[i];
+   locfcinfo->args[0].isnull = false;
+   element_hash = DatumGetUInt32(FunctionCallInvoke(locfcinfo));

-   /* We don't expect hash support functions to return null */
-   Assert(!locfcinfo->isnull);
-   }
+   /* We don't expect hash support functions to return null */
+   Assert(!locfcinfo->isnull);

/* see hash_array() */
result = (result << 5) - result + element_hash;


Re: check for null value before looking up the hash function

2022-05-21 Thread Tom Lane
Ranier Vilela  writes:
> Em sáb., 21 de mai. de 2022 às 12:05, Tomas Vondra <
> tomas.von...@enterprisedb.com> escreveu:
>> That's a quite bold claim, and yet you haven't supported it by any
>> argument whatsoever. Trade-offs between complexity and efficiency are a
>> crucial development task, so complicating the logic clearly does matter.

> What I meant is that complicating the logic in search of efficiency is
> worth it, and that's what everyone is looking for in this thread.
> Likewise, not complicating the logic, losing a little bit of efficiency,
> applied to all the code, leads to a big loss of efficiency.
> In other words, I never miss an opportunity to gain efficiency.

[ shrug... ]  You quietly ignored Tomas' main point, which is that no
evidence has been provided that there's actually any efficiency gain.

(1) Sure, in the case where only null values are encountered during a
query, we can save a cache lookup, but will that be even micro-measurable
compared to general query overhead?  Seems unlikely, especially if this is
changed in only one place.  That ties into my complaint about how this is
just one instance of a fairly widely used coding pattern.

(2) What are the effects when we *do* eventually encounter a non-null
value?  The existing coding will perform all the necessary lookups
at first call, but with the proposed change those may be spread across
query execution.  It's not implausible that that leads to a net loss
of efficiency, due to locality-of-access effects.

I'm also concerned that this increases the size of the "state space"
of this function, in that there are now more possible states of its
cached information.  While that probably doesn't add any new bugs,
it does add complication and make things harder to reason about.
So the bottom line remains that I don't think it's worth changing.

regards, tom lane




Re: check for null value before looking up the hash function

2022-05-21 Thread Ranier Vilela
On Sat, May 21, 2022 at 13:13, Tom Lane wrote:

> Ranier Vilela  writes:
> > Em sáb., 21 de mai. de 2022 às 12:05, Tomas Vondra <
> > tomas.von...@enterprisedb.com> escreveu:
> >> That's a quite bold claim, and yet you haven't supported it by any
> >> argument whatsoever. Trade-offs between complexity and efficiency are a
> >> crucial development task, so complicating the logic clearly does matter.
>
> > What I meant is that complicating the logic in search of efficiency is
> > worth it, and that's what everyone is looking for in this thread.
> > Likewise, not complicating the logic, losing a little bit of efficiency,
> > applied to all the code, leads to a big loss of efficiency.
> > In other words, I never miss an opportunity to gain efficiency.
>
> [ shrug... ]  You quietly ignored Tomas' main point, which is that no
> evidence has been provided that there's actually any efficiency gain.
>
IMHO, the point here is that for non-committers everything needs benchmarks,
but I have seen many commits without benchmarks or any evidence of gains.
And of course, having benchmarks is better, but for micro-optimizations
it doesn't seem like it's needed that much.


> (1) Sure, in the case where only null values are encountered during a
> query, we can save a cache lookup, but will that be even micro-measurable
> compared to general query overhead?  Seems unlikely, especially if this is
> changed in only one place.  That ties into my complaint about how this is
> just one instance of a fairly widely used coding pattern.
>
Of course, changing only one place the gain is tiny, but, step by step,
the coding pattern changes too and becomes the new "fairly widely used" one.


>
> (2) What are the effects when we *do* eventually encounter a non-null
> value?  The existing coding will perform all the necessary lookups
> at first call, but with the proposed change those may be spread across
> query execution.  It's not implausible that that leads to a net loss
> of efficiency, due to locality-of-access effects.
>
Well, the current code tests the nulls branch first.
Most of the time, this is not true.
So, the current code has poor branch prediction, at least.
What I proposed improves the branch prediction, at least.


> I'm also concerned that this increases the size of the "state space"
> of this function, in that there are now more possible states of its
> cached information.  While that probably doesn't add any new bugs,
> it does add complication and make things harder to reason about.
> So the bottom line remains that I don't think it's worth changing.
>
Of course, your arguments are all valid.
That would all be clarified with benchmarks; maybe the OP is interested in
doing them.

regards,
Ranier Vilela


Re: check for null value before looking up the hash function

2022-05-21 Thread David G. Johnston
On Sat, May 21, 2022 at 10:04 AM Ranier Vilela  wrote:

> Em sáb., 21 de mai. de 2022 às 13:13, Tom Lane 
> escreveu:
>
>> Ranier Vilela  writes:
>> > Em sáb., 21 de mai. de 2022 às 12:05, Tomas Vondra <
>> > tomas.von...@enterprisedb.com> escreveu:
>> >> That's a quite bold claim, and yet you haven't supported it by any
>> >> argument whatsoever. Trade-offs between complexity and efficiency are a
>> >> crucial development task, so complicating the logic clearly does
>> matter.
>>
>> > What I meant is that complicating the logic in search of efficiency is
>> > worth it, and that's what everyone is looking for in this thread.
>> > Likewise, not complicating the logic, losing a little bit of efficiency,
>> > applied to all the code, leads to a big loss of efficiency.
>> > In other words, I never miss an opportunity to gain efficiency.
>>
>> [ shrug... ]  You quietly ignored Tomas' main point, which is that no
>> evidence has been provided that there's actually any efficiency gain.
>>
> IMHO, the point here, is for-non-commiters everything needs benchmarks.
> But, I have been see many commits withouts benchmarks or any evidence
> gains.
> And of course, having benchmarks is better, but for micro-optimizations,
> It doesn't seem like it's needed that much.
>

Mostly because committers don't tend to do this kind of drive-by patching
of long-established code, which is fairly categorized as premature
optimization.


>
>> (1) Sure, in the case where only null values are encountered during a
>> query, we can save a cache lookup, but will that be even micro-measurable
>> compared to general query overhead?  Seems unlikely, especially if this is
>> changed in only one place.  That ties into my complaint about how this is
>> just one instance of a fairly widely used coding pattern.
>>
> Of course, changing only in one place, the gain is tiny, but, step by step,
> the coding pattern is changing too, becoming new "fairly widely".
>

Agreed, but that isn't what was done here; there was no investigation of
the overall coding practice and no suggestion to change all the places to
the improved form.

>
>
>>
>> (2) What are the effects when we *do* eventually encounter a non-null
>> value?  The existing coding will perform all the necessary lookups
>> at first call, but with the proposed change those may be spread across
>> query execution.  It's not implausible that that leads to a net loss
>> of efficiency, due to locality-of-access effects.
>>
> Weel the current code, test branch for nulls first.
> Most of the time, this is not true.
> So, the current code has poor branch prediction, at least.
> What I proposed, improves the branch prediction, at least.
>

Per my other reply, the v3 proposal did not, IMHO, do a good job of
branch prediction either.

I find an improvement on code complexity grounds to be warranted, though
the benefit seems unlikely to outweigh the cost of doing it everywhere
(fixing only one place actually increases the cost component).

Even without the plausible locality-of-access argument the benefit here is
likely to be a micro-optimization that provides only minimal benefit.
Evidence to the contrary is welcomed but, yes, the burden is going to be
placed squarely on the patch author(s) to demonstrate the benefit accrued
from the code churn is worth the cost.

David J.


Re: definition of CalculateMaxmumSafeLSN

2022-05-21 Thread Tom Lane
Bharath Rupireddy  writes:
> On Mon, Feb 28, 2022 at 7:31 PM Sergei Kornilov  wrote:
>> I just spotted in src/include/access/xlog.h:
>> extern XLogRecPtr CalculateMaxmumSafeLSN(void);
>> This function doesn't seem to be used anywhere or even defined? "git grep 
>> CalculateMaxmumSafeLSN" shows only this line.

> +1 to remove.

Right, done.

regards, tom lane




Re: Enforce "max_wal_size/ min_wal_size must be at least twice wal_segment_size" limit while setting GUCs

2022-05-21 Thread rajesh singarapu
Hi Bharath,

Could you explain why min_wal_size must be at least twice
wal_segment_size, rather than just equal to it?

thanks
Rajesh

On Sat, May 21, 2022 at 7:08 PM Bharath Rupireddy
 wrote:
>
> Hi,
>
> Currently postgres allows setting any value for max_wal_size or
> min_wal_size, not enforcing "at least twice as wal_segment_size" limit
> [1]. This isn't a problem if the server continues to run, however, the
> server can't come up after a crash or restart or maintenance or
> upgrade (goes to a continuous restart loop with FATAL errors [1]).
>
> How about we add GUC check hooks for both max_wal_size and
> min_wal_size where we can either emit ERROR or WARNING if values are
> not "at least twice as wal_segment_size"?
>
> Thoughts?
>
> [1]
> FATAL:  "max_wal_size" must be at least twice "wal_segment_size"
> FATAL:  "min_wal_size" must be at least twice "wal_segment_size"
>
> [2]
> ./initdb -D data
> ./pg_ctl -D data -l logfile start
> ./psql -c "alter system set max_wal_size='2MB'" postgres
> ./psql -c "alter system set min_wal_size='2MB'" postgres
> ./psql -c "select pg_reload_conf()" postgres
> ./pg_ctl -D data -l logfile restart
>
> Regards,
> Bharath Rupireddy.
>
>




Unsubscribing from this mailing list.

2022-05-21 Thread Israa Odeh
Greetings!

Please, I need to cancel my subscription to this mailing list. Could you
delete me from it, or tell me how to unsubscribe?

Best Regards,
Israa Odeh.

Sent from Mail for Windows




Re: docs: mention "pg_read_all_stats" in "track_activities" description

2022-05-21 Thread Nathan Bossart
On Sat, May 21, 2022 at 12:28:58PM +0900, Michael Paquier wrote:
> Indeed, it is a good idea to add this information.  Will apply and
> backpatch accordingly.

Sorry, I should've noticed this yesterday.  This should probably follow
6198420's example and say "roles with privileges of the pg_read_all_stats
role" instead of "members of the pg_read_all_stats role."  Also, I think we
should mention that this information is visible to roles with privileges of
the session user being reported on, too.  Patch attached.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fe64239ed9..a91145ccf3 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7917,11 +7917,11 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
 executing command of each session, along with its identifier and the
 time when that command began execution. This parameter is on by
 default. Note that even when enabled, this information is only
-visible to superusers, members of the
-pg_read_all_stats role and the user owning the
-session being reported on, so it should not represent a security risk.
-Only superusers and users with the appropriate SET
-privilege can change this setting.
+visible to superusers, roles with privileges of the
+pg_read_all_stats role, and roles with privileges of
+the user owning the session being reported on, so it should not
+represent a security risk. Only superusers and users with the
+appropriate SET privilege can change this setting.

   
  


Re: check for null value before looking up the hash function

2022-05-21 Thread Tomas Vondra
On 5/21/22 19:24, David G. Johnston wrote:
> On Sat, May 21, 2022 at 10:04 AM Ranier Vilela  > wrote:
> 
> Em sáb., 21 de mai. de 2022 às 13:13, Tom Lane  > escreveu:
> 
> Ranier Vilela mailto:ranier...@gmail.com>>
> writes:
> > Em sáb., 21 de mai. de 2022 às 12:05, Tomas Vondra <
> > tomas.von...@enterprisedb.com
> > escreveu:
> >> That's a quite bold claim, and yet you haven't supported it
> by any
> >> argument whatsoever. Trade-offs between complexity and
> efficiency are a
> >> crucial development task, so complicating the logic clearly
> does matter.
> 
> > What I meant is that complicating the logic in search of
> efficiency is
> > worth it, and that's what everyone is looking for in this thread.
> > Likewise, not complicating the logic, losing a little bit of
> efficiency,
> > applied to all the code, leads to a big loss of efficiency.
> > In other words, I never miss an opportunity to gain efficiency.
> 
> [ shrug... ]  You quietly ignored Tomas' main point, which is
> that no
> evidence has been provided that there's actually any efficiency
> gain.
> 
> IMHO, the point here, is for-non-commiters everything needs benchmarks.
> But, I have been see many commits withouts benchmarks or any
> evidence gains.
> And of course, having benchmarks is better, but for
> micro-optimizations,
> It doesn't seem like it's needed that much.
> 
> 
> Mostly because committers don't tend to do this kind of drive-by
> patching that changes long-established code, which are fairly
> categorized as premature optimizations.
> 

FWIW I find the argument that committers are somehow absolved of having
to demonstrate the benefits of a change rather misleading. Perhaps even
offensive, as it hints committers are less demanding/careful when it
comes to their own patches. Which goes directly against my experience
and understanding of what being a committer is.

I'm not going to claim every "optimization" patch submitted by a
committer had a benchmark - some patches certainly are pushed without
it, with just some general reasoning why the change is beneficial.

But I'm sure that when someone suggests the reasoning is wrong, it's
taken seriously - the discussion continues, there's a benchmark etc. And
I can't recall a committer suggesting it's fine because some other patch
didn't have a benchmark either.

> 
> 
> (1) Sure, in the case where only null values are encountered
> during a
> query, we can save a cache lookup, but will that be even
> micro-measurable
> compared to general query overhead?  Seems unlikely, especially
> if this is
> changed in only one place.  That ties into my complaint about
> how this is
> just one instance of a fairly widely used coding pattern.
> 
> Of course, changing only in one place, the gain is tiny, but, step
> by step,
> the coding pattern is changing too, becoming new "fairly widely".
> 
> 
> Agreed, but that isn't what was done here, there was no investigation of
> the overall coding practice and suggestions to change them all to the
> improved form.
> 

Right. If we think the coding pattern is an improvement, we should tweak
all the places, not just one (and hope the other places will magically
switch on their own).

More importantly, though, I kinda doubt tweaking more places will
actually make the difference more significant (assuming it actually does
improve things). How likely is it you need to hash the same data type
multiple times, with just NULL values? And even if you do, improvements
like this tend to sum, not multiply - i.e. if the improvement is 1% for
one place, it's still 1% even if the query hits 10 such places.

> 
> (2) What are the effects when we *do* eventually encounter a
> non-null
> value?  The existing coding will perform all the necessary lookups
> at first call, but with the proposed change those may be spread
> across
> query execution.  It's not implausible that that leads to a net loss
> of efficiency, due to locality-of-access effects.
> 
> Weel the current code, test branch for nulls first.
> Most of the time, this is not true.
> So, the current code has poor branch prediction, at least.
> What I proposed, improves the branch prediction, at least.
> 
> 
> Per my other reply, the v3 proposal did not, IMHO, do a good job of
> branch prediction either.
> 
> I find an improvement on code complexity grounds to be warranted, though
> the benefit seems unlikely to outweigh the cost of doing it everywhere
> (fixing only one place actually increases the cost component).
> 
> Even without the plausible locality-of-access argu

Re: docs: mention "pg_read_all_stats" in "track_activities" description

2022-05-21 Thread Michael Paquier
On Sat, May 21, 2022 at 11:57:43AM -0700, Nathan Bossart wrote:
> Sorry, I should've noticed this yesterday.  This should probably follow
> 6198420's example and say "roles with privileges of the pg_read_all_stats
> role" instead of "members of the pg_read_all_stats role."

Yes, I saw that, but that sounds pretty much the same to me, while we
mention membership of a role in other places.  I don't mind tweaking
that more, FWIW, while we are on it.

> Also, I think we
> should mention that this information is visible to roles with privileges of
> the session user being reported on, too.  Patch attached.

 default. Note that even when enabled, this information is only
-visible to superusers, members of the
-pg_read_all_stats role and the user owning the
-session being reported on, so it should not represent a security risk.
-Only superusers and users with the appropriate SET
-privilege can change this setting.
+visible to superusers, roles with privileges of the
+pg_read_all_stats role, and roles with privileges of
+the user owning the session being reported on, so it should not
+represent a security risk. Only superusers and users with the
+appropriate SET privilege can change this setting.

Regarding the fact that a user can see their own information, the last
part of the description would be right, though perhaps still a bit
confusing when it comes to one's own information?
--
Michael


signature.asc
Description: PGP signature


Re: tweak to a few index tests to hits ambuildempty() routine.

2022-05-21 Thread Thomas Munro
On Sat, May 21, 2022 at 6:15 PM Noah Misch  wrote:
> On Mon, Apr 25, 2022 at 10:05:08AM -0400, Tom Lane wrote:
> > Alvaro Herrera  writes:
> > > Hmm, so 027_stream_regress.pl is not prepared to deal with any unlogged
> > > tables that may be left in the regression database (which is what my
> > > spgist addition did).  I first tried doing a TRUNCATE of the unlogged
> > > table, but that doesn't work either, and it turns out that the
> > > regression database does not have any UNLOGGED relations.  Maybe that's
> > > something we need to cater for, eventually, but for now dropping the
> > > table suffices.  I have pushed that.
> >
> > It does seem like the onus should be on 027_stream_regress.pl to
> > deal with that, rather than restricting what the core tests can
> > leave behind.
>
> Yeah.  Using "pg_dumpall --no-unlogged-table-data", as attached, suffices.

'pg_dumpall', '-f', $outputdir . '/primary.dump',
-   '--no-sync', '-p', $node_primary->port
+   '--no-sync',  '-p', $node_primary->port,
+   '--no-unlogged-table-data'    # if unlogged, standby has schema only

LGTM, except for the stray extra whitespace.  I tested by reverting
dec8ad36 locally, at which point "gmake check" still passed but "gmake
-C src/test/recovery/ check PROVE_TESTS=t/027_stream_regress.pl
PROVE_FLAGS=-v" failed, and then your change fixed that.




Re: Enforce "max_wal_size/ min_wal_size must be at least twice wal_segment_size" limit while setting GUCs

2022-05-21 Thread mahendrakar s
Hi Bharath,

+1.
This seems like a good idea, to have bounds checks for max_wal_size and
min_wal_size. We have seen customers play with these parameters and run
into issues.
It would also be better to consider all the control parameters and have
min/max checks on them as well.

Thanks,
Mahendrakar.

On Sat, 21 May 2022 at 23:26, rajesh singarapu 
wrote:

> Hi Bharath,
>
> Could you explain why min wal size must be at least twice but not
> equal to wal_segment_size ?
>
> thanks
> Rajesh
>
> On Sat, May 21, 2022 at 7:08 PM Bharath Rupireddy
>  wrote:
> >
> > Hi,
> >
> > Currently postgres allows setting any value for max_wal_size or
> > min_wal_size, not enforcing "at least twice as wal_segment_size" limit
> > [1]. This isn't a problem if the server continues to run, however, the
> > server can't come up after a crash or restart or maintenance or
> > upgrade (goes to a continuous restart loop with FATAL errors [1]).
> >
> > How about we add GUC check hooks for both max_wal_size and
> > min_wal_size where we can either emit ERROR or WARNING if values are
> > not "at least twice as wal_segment_size"?
> >
> > Thoughts?
> >
> > [1]
> > FATAL:  "max_wal_size" must be at least twice "wal_segment_size"
> > FATAL:  "min_wal_size" must be at least twice "wal_segment_size"
> >
> > [2]
> > ./initdb -D data
> > ./pg_ctl -D data -l logfile start
> > ./psql -c "alter system set max_wal_size='2MB'" postgres
> > ./psql -c "alter system set min_wal_size='2MB'" postgres
> > ./psql -c "select pg_reload_conf()" postgres
> > ./pg_ctl -D data -l logfile restart
> >
> > Regards,
> > Bharath Rupireddy.
> >
> >
>
>
>


Re: tweak to a few index tests to hits ambuildempty() routine.

2022-05-21 Thread Noah Misch
On Sun, May 22, 2022 at 04:24:16PM +1200, Thomas Munro wrote:
> On Sat, May 21, 2022 at 6:15 PM Noah Misch  wrote:
> > On Mon, Apr 25, 2022 at 10:05:08AM -0400, Tom Lane wrote:
> > > Alvaro Herrera  writes:
> > > > Hmm, so 027_stream_regress.pl is not prepared to deal with any unlogged
> > > > tables that may be left in the regression database (which is what my
> > > > spgist addition did).  I first tried doing a TRUNCATE of the unlogged
> > > > table, but that doesn't work either, and it turns out that the
> > > > regression database does not have any UNLOGGED relations.  Maybe that's
> > > > something we need to cater for, eventually, but for now dropping the
> > > > table suffices.  I have pushed that.
> > >
> > > It does seem like the onus should be on 027_stream_regress.pl to
> > > deal with that, rather than restricting what the core tests can
> > > leave behind.
> >
> > Yeah.  Using "pg_dumpall --no-unlogged-table-data", as attached, suffices.
> 
> 'pg_dumpall', '-f', $outputdir . '/primary.dump',
> -   '--no-sync', '-p', $node_primary->port
> +   '--no-sync',  '-p', $node_primary->port,
> +   '--no-unlogged-table-data'# if unlogged, standby
> has schema only
> 
> LGTM, except for the stray extra whitespace.

perltidy contributes the prior-line whitespace change.




Patch: Don't set LoadedSSL unless secure_initialize succeeds

2022-05-21 Thread Gurjeet Singh
The initialization in PostmasterMain() blindly turns on LoadedSSL,
irrespective of the outcome of secure_initialize(). I don't think
that's how it should behave, primarily because of the pattern followed
by the other places that call secure_initialize().

This patch makes PostmasterMain() behave identically to the other places
(the SIGHUP handler and SubPostmasterMain()) where LoadedSSL is turned on
only after the secure_initialize() call succeeds. Upon failure of
secure_initialize(), it now emits a log message instead of setting
LoadedSSL to true.
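
For reference, a minimal sketch of the changed PostmasterMain() path (not
the attached patch itself; the log message wording here is illustrative):

#ifdef USE_SSL
    if (EnableSSL)
    {
        /* only claim SSL is loaded when initialization actually succeeded */
        if (secure_initialize(true) == 0)
            LoadedSSL = true;
        else
            ereport(LOG,
                    (errmsg("SSL configuration could not be loaded")));
    }
#endif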

Best regards,
Gurjeet
http://Gurje.et


LoadedSSL.patch
Description: Binary data