date:20221030

Re: Pulling up direct-correlated ANY_SUBLINK

2022-10-30 Thread Andy Fan

Hi All:

On Tue, Sep 10, 2019 at 9:49 PM Tom Lane  wrote:

> Richard Guo  writes:
> > Currently we do not try to pull up sub-select of type ANY_SUBLINK if it
> > refers to any Vars of the parent query, as indicated in the code snippet
> > below:
> > if (contain_vars_of_level((Node *) subselect, 1))
> > return NULL;
> > Why do we have this check?
>
> Because the result would not be a join between two independent tables.
>

I think this situation is caused by we pull-up the ANY-sublink with 2
steps, the first step is to pull up the sublink as a subquery,  and the
next step is to pull up the subquery if it is allowed.  The benefits of
this method are obvious,  pulling up the subquery has more requirements,
even if we can just finish the first step, we still get huge benefits.
However the bad stuff happens if varlevelsup = 1 involves,  step 1 fails!

The solution here is to use the lateral join to overcome the two
independent tables, the issue of this solution includes:

1.  LATERAL pretty much constrains things to use a nestloop like below,
but this reason is questioned since if we can pull-up the subquery,  if so
the
constraint gone. [1]
2.  It has something with unique-ify the inner path. [2] , but Richard
thought
it should be fixed but without an agreement for all people [3].
3.  Richard [4] found it would fail to get a plan for some query. (the
error is
below per my testing)

> ERROR:  failed to build any 3-way joins

So back to the root cause of this issue,  IIUC,  if varlevelsup = 1
involves,
can we just bypass the 2-steps method,  just as what we do for EXISTS
sublinks?  If so, we just need to convert the ANY-SUBLINK to EXIST-SUBLINK
under the case.

The attached is the one commit which includes the 2 methods discussed
here, controlled by different GUC separately, for easy testing.  Per my
test,
Query 2 choosed the Unique Join with the IN-to-EXISTS method, but not
with the Lateral method, and query 3 raises error with the lateral method,
but not with the IN-to-EXISTS method.

[1]
https://www.postgresql.org/message-id/flat/60794.1568104308%40antos#365d5ec69fd605a8569a2674a33909a1
[2] https://www.postgresql.org/message-id/60794.1568104308%40antos
[3]
https://www.postgresql.org/message-id/CAN_9JTzqa-3RmHAw3wZv099Rk8xX480YdEvGy%2BJAdVw8dTnHRA%40mail.gmail.com
[4]
https://www.postgresql.org/message-id/CAMbWs49cvkF9akbomz_fCCKS%3DD5TY%3D4KGHEQcfHPZCXS1GVhkA%40mail.gmail.com

> > Can we try to pull up direct-correlated ANY SubLink with the help of
> > LATERAL?
>
> Perhaps.  But what's the argument that you'd end up with a better
> plan?  LATERAL pretty much constrains things to use a nestloop,
> so I'm not sure there's anything fundamentally different.
>
> regards, tom lane
>
>
-- 
Best Regards
Andy Fan

test.sql
Description: application/sql
From 1bdc3b9098851ab1f6897f497daf90c601eca27e Mon Sep 17 00:00:00 2001
From: Andy Fan 
Date: Sun, 9 Oct 2022 17:47:23 +0800
Subject: [PATCH v1] 2 methods for Pulling up direct-correlated ANY_SUBLINK

---
 .../postgres_fdw/expected/postgres_fdw.out|  24 +-
 src/backend/optimizer/plan/subselect.c|   7 +-
 src/backend/optimizer/prep/prepjointree.c | 291 ++
 src/backend/utils/misc/guc_tables.c   |  23 ++
 src/test/regress/expected/join.out|  35 ++-
 src/test/regress/expected/subselect.out   | 121 
 src/test/regress/sql/subselect.sql|  61 
 7 files changed, 531 insertions(+), 31 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index b3c8ce01313..870590df562 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -11377,19 +11377,19 @@ CREATE FOREIGN TABLE foreign_tbl2 () INHERITS (foreign_tbl)
   SERVER loopback OPTIONS (table_name 'base_tbl');
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT a FROM base_tbl WHERE a IN (SELECT a FROM foreign_tbl);
- QUERY PLAN  
--
- Seq Scan on public.base_tbl
+QUERY PLAN 
+---
+ Nested Loop Semi Join
Output: base_tbl.a
-   Filter: (SubPlan 1)
-   SubPlan 1
- ->  Result
-   Output: base_tbl.a
-   ->  Append
- ->  Async Foreign Scan on public.foreign_tbl foreign_tbl_1
-   Remote SQL: SELECT NULL FROM public.base_tbl
- ->  Async Foreign Scan on public.foreign_tbl2 foreign_tbl_2
-   Remote SQL: SELECT NULL FROM public.base_tbl
+   ->  Seq Scan on public.base_tbl
+ Output: base_tbl.a, base_tbl.b
+ Filter: (base_tbl.a IS NOT NULL)
+   ->  Materialize
+ ->  Append
+   ->  Async Foreign Scan on public.foreign_t

How to started with Contributions

2022-10-30 Thread 1DS20CS093 Joseph Raj Vishal

 Respected Sir/Madam
I am Joseph Raj Vishal, a Computer Science undergrad. I have just entered
third year at Dayananda Sagar College of Engineering in Bengaluru. I am new
to open source contributions but I am well aware of Python, Java, Django,
PHP, SQL, Javascript,Kotlin and Android. I would love to contribute to your
organisation but I would need your help getting started.
Hoping to hear from you soon

Regards
Joseph Raj Vishal

Re: How to started with Contributions

2022-10-30 Thread Andy Fan

HI Joseph:

Maybe the following links are helpful.

https://www.postgresql.org/message-id/pine.bsf.4.21.9912052011040.823-100...@thelab.hub.org

https://wiki.postgresql.org/wiki/So,_you_want_to_be_a_developer%3F

On Sun, Oct 30, 2022 at 3:42 PM 1DS20CS093 Joseph Raj Vishal <
josephrvis...@gmail.com> wrote:

> Respected Sir/Madam
> I am Joseph Raj Vishal, a Computer Science undergrad. I have just entered
> third year at Dayananda Sagar College of Engineering in Bengaluru. I am new
> to open source contributions but I am well aware of Python, Java, Django,
> PHP, SQL, Javascript,Kotlin and Android. I would love to contribute to your
> organisation but I would need your help getting started.
> Hoping to hear from you soon
>
> Regards
> Joseph Raj Vishal
>



-- 
Best Regards
Andy Fan

Re: Segfault on logical replication to partitioned table with foreign children

2022-10-30 Thread Dilip Kumar

On Sat, Oct 29, 2022 at 1:01 AM  wrote:
>
> Right now it is possible to add a partitioned table with foreign tables
> as its children as a target of a subscription. It can lead to an assert
> (or a segfault, if compiled without asserts) on a logical replication
> worker when the worker attempts to insert the data received via
> replication into the foreign table. Reproduce with caution, the worker
> is going to crash and restart indefinitely. The setup:

Yes, this looks like a bug and your fix seems correct to me.  It would
be nice to add a test case for this scenario.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Segfault on logical replication to partitioned table with foreign children

2022-10-30 Thread Tom Lane

Dilip Kumar  writes:
> Yes, this looks like a bug and your fix seems correct to me.  It would
> be nice to add a test case for this scenario.

A test case doesn't seem that exciting to me.  If we were trying to
make it actually work, then yeah, but throwing an error isn't that
useful to test.  The code will be exercised by replication to a
regular partitioned table (I assume we do have tests for that).

What I'm wondering about is whether we can refactor this code
to avoid so many usually-useless catalog lookups.  Pulling the
namespace name, in particular, is expensive and we generally
are not going to need the result.  In the child-rel case it'd
be much better to pass the opened relation and let the error-check
subroutine work from that.  Maybe we should just do it like that
at the start of the recursion, too?  Or pass the relid and let
the subroutine look up the names only in the error case.

A completely different line of thought is that this doesn't seem
like a terribly bulletproof fix, since children could get added to
a partitioned table after we look.  Maybe it'd be better to check
the relkind at the last moment before we do something that depends
on a child table being a relation.

regards, tom lane

Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

2022-10-30 Thread Alvaro Herrera

On 2022-Oct-27, Michael Paquier wrote:

> On Thu, Sep 29, 2022 at 08:09:56PM -0700, Nathan Bossart wrote:
> > Looks reasonable to me.
> 
> 0001, to move pg_pwritev_with_retry() to a new home, seems fine, so
> applied.

Maybe something a bit useless, but while perusing the commits I noticed
a forward struct declaration was moved from one file to another; this is
claimed to avoid including port/pg_iovec.h in common/file_utils.h.  We
do that kind of thing in a few places, but in this particular case it
seems a bit of a pointless exercise, since pg_iovec.h doesn't include
anything else and it's a quite simple header.

So I'm kinda proposing that we only do the forward struct initialization
dance when it really saves on things -- in particular, when it helps
avoid or reduce massive indirect header inclusion.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
>From c170367e5ac7dca2dcd11b5cbb6d351b1b205842 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Sun, 30 Oct 2022 15:22:24 +0100
Subject: [PATCH] No need to avoid including pg_iovec.h

---
 src/include/common/file_utils.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/include/common/file_utils.h b/src/include/common/file_utils.h
index 2c5dbcb0b1..20fe5806fb 100644
--- a/src/include/common/file_utils.h
+++ b/src/include/common/file_utils.h
@@ -15,6 +15,8 @@

 #include 

+#include "port/pg_iovec.h"
+
 typedef enum PGFileType
 {
 	PGFILETYPE_ERROR,
@@ -24,7 +26,6 @@ typedef enum PGFileType
 	PGFILETYPE_LNK
 } PGFileType;

-struct iovec;	/* avoid including port/pg_iovec.h here */

 #ifdef FRONTEND
 extern int	fsync_fname(const char *fname, bool isdir);
@@ -40,7 +41,7 @@ extern PGFileType get_dirent_type(const char *path,
   int elevel);

 extern ssize_t pg_pwritev_with_retry(int fd,
-	 const struct iovec *iov,
+	 const iovec *iov,
 	 int iovcnt,
 	 off_t offset);

-- 
2.30.2

Re: How to started with Contributions

2022-10-30 Thread 1DS20CS093 Joseph Raj Vishal

Thanks, I'll check them out.

On Sun, 30 Oct, 2022, 2:47 pm Andy Fan,  wrote:

> HI Joseph:
>
> Maybe the following links are helpful.
>
>
> https://www.postgresql.org/message-id/pine.bsf.4.21.9912052011040.823-100...@thelab.hub.org
>
> https://wiki.postgresql.org/wiki/So,_you_want_to_be_a_developer%3F
>
> On Sun, Oct 30, 2022 at 3:42 PM 1DS20CS093 Joseph Raj Vishal <
> josephrvis...@gmail.com> wrote:
>
>> Respected Sir/Madam
>> I am Joseph Raj Vishal, a Computer Science undergrad. I have just entered
>> third year at Dayananda Sagar College of Engineering in Bengaluru. I am new
>> to open source contributions but I am well aware of Python, Java, Django,
>> PHP, SQL, Javascript,Kotlin and Android. I would love to contribute to your
>> organisation but I would need your help getting started.
>> Hoping to hear from you soon
>>
>> Regards
>> Joseph Raj Vishal
>>
>
>
>
> --
> Best Regards
> Andy Fan
>

Re: Segfault on logical replication to partitioned table with foreign children

2022-10-30 Thread Alvaro Herrera

On 2022-Oct-28, ilya.v.gladys...@gmail.com wrote:

> This will cause a segfault or raise an assert, because inserting into
> foreign tables via logical replication is not possible. The solution I
> propose is to add recursive checks of relkind for children of a target,
> if the target is a partitioned table.

However, I imagine that the only reason we don't support this is that
the code hasn't been written yet. I think it would be better to write
that code, so that we don't have to raise any error at all (unless the
foreign table is something that doesn't support DML, in which case we
would have to raise an error).  Of course, we still have to fix it in
backbranches, but we can just do it as a targeted check at the moment of
insert/update, not at the moment of subscription create/alter.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"The eagle never lost so much time, as
when he submitted to learn of the crow." (William Blake)

pg15 inherited stats expressions: cache lookup failed for statistics object

2022-10-30 Thread Justin Pryzby

postgres=# \set QUIET 
CREATE TABLE stxdinp (i int, j int) PARTITION BY RANGE(i);
CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(10);
INSERT INTO stxdinp SELECT generate_series(1,9)a;
CREATE STATISTICS stxdp ON i,j FROM stxdinp;
ANALYZE stxdinp;
explain SELECT i, j, COUNT(1) FROM ONLY stxdinp GROUP BY 1,2;
ERROR:  cache lookup failed for statistics object 4060843

It's evidently an issue with 269b532ae (Add stxdinherit flag to
pg_statistic_ext_data)

(gdb) bt
#0  pg_re_throw () at elog.c:1795
#1  0x0096578a in errfinish (filename=, 
filename@entry=0xaf0442 "extended_stats.c", lineno=lineno@entry=2467, 
funcname=funcname@entry=0xaf0720 <__func__.28166> "statext_expressions_load") 
at elog.c:588
#2  0x007efd07 in statext_expressions_load (stxoid=3914359, 
inh=, idx=idx@entry=0) at extended_stats.c:2467
#3  0x00913947 in examine_variable (root=root@entry=0x2b5b530, 
node=node@entry=0x2b88820, varRelid=varRelid@entry=7, 
vardata=vardata@entry=0x7ffe5baa2f00) at selfuncs.c:5264
#4  0x009141ae in get_restriction_variable (root=root@entry=0x2b5b530, 
args=args@entry=0x2b88a30, varRelid=varRelid@entry=7, 
vardata=vardata@entry=0x7ffe5baa2f90, other=other@entry=0x7ffe5baa2f88, 
varonleft=varonleft@entry=0x7ffe5baa2f87) at selfuncs.c:4848
#5  0x00915535 in eqsel_internal (fcinfo=, 
negate=negate@entry=false) at selfuncs.c:263
#6  0x009155f6 in eqsel (fcinfo=) at selfuncs.c:226
#7  0x0096a373 in FunctionCall4Coll 
(flinfo=flinfo@entry=0x7ffe5baa3090, collation=collation@entry=0, 
arg1=arg1@entry=45462832, arg2=arg2@entry=1320, arg3=arg3@entry=45648432, 
arg4=arg4@entry=7) at fmgr.c:1198
#8  0x0096a92a in OidFunctionCall4Coll (functionId=, 
collation=collation@entry=0, arg1=arg1@entry=45462832, arg2=arg2@entry=1320, 
arg3=arg3@entry=45648432, arg4=arg4@entry=7) at fmgr.c:1434
#9  0x0077e759 in restriction_selectivity (root=root@entry=0x2b5b530, 
operatorid=operatorid@entry=1320, args=0x2b88a30, inputcollid=0, 
varRelid=varRelid@entry=7) at plancat.c:1880
#10 0x00728dc9 in clause_selectivity_ext (root=root@entry=0x2b5b530, 
clause=0x2b889d8, clause@entry=0x2b86e88, varRelid=varRelid@entry=7, 
jointype=jointype@entry=JOIN_INNER, sjinfo=sjinfo@entry=0x0, 
use_extended_stats=use_extended_stats@entry=true) at clausesel.c:875
#11 0x007291b6 in clauselist_selectivity_ext 
(root=root@entry=0x2b5b530, clauses=0x2b88fc0, varRelid=7, 
jointype=jointype@entry=JOIN_INNER, sjinfo=sjinfo@entry=0x0, 
use_extended_stats=use_extended_stats@entry=true)
at clausesel.c:185
#12 0x0072962e in clauselist_selectivity (root=root@entry=0x2b5b530, 
clauses=, varRelid=, 
jointype=jointype@entry=JOIN_INNER, sjinfo=sjinfo@entry=0x0) at clausesel.c:108
#13 0x0072fb1d in get_parameterized_baserel_size 
(root=root@entry=0x2b5b530, rel=rel@entry=0x2b73900, 
param_clauses=param_clauses@entry=0x2b88f68) at costsize.c:5015
#14 0x007836f6 in get_baserel_parampathinfo (root=root@entry=0x2b5b530, 
baserel=baserel@entry=0x2b73900, required_outer=required_outer@entry=0x2b86478) 
at relnode.c:1346
#15 0x00776819 in create_seqscan_path (root=root@entry=0x2b5b530, 
rel=rel@entry=0x2b73900, required_outer=required_outer@entry=0x2b86478, 
parallel_workers=parallel_workers@entry=0) at pathnode.c:937
#16 0x0077a32c in reparameterize_path (root=root@entry=0x2b5b530, 
path=path@entry=0x2b847b0, required_outer=required_outer@entry=0x2b86478, 
loop_count=loop_count@entry=1) at pathnode.c:3872
#17 0x007249bc in get_cheapest_parameterized_child_path 
(root=root@entry=0x2b5b530, rel=, 
required_outer=required_outer@entry=0x2b86478) at allpaths.c:1996
#18 0x00727619 in add_paths_to_append_rel (root=root@entry=0x2b5b530, 
rel=rel@entry=0x2b6a6a8, live_childrels=live_childrels@entry=0x2b858e8) at 
allpaths.c:1597
#19 0x00728084 in set_append_rel_pathlist (root=root@entry=0x2b5b530, 
rel=rel@entry=0x2b6a6a8, rti=rti@entry=6, rte=rte@entry=0x2b5e1c0) at 
allpaths.c:1270
#20 0x00727e17 in set_rel_pathlist (root=root@entry=0x2b5b530, 
rel=0x2b6a6a8, rti=rti@entry=6, rte=0x2b5e1c0) at allpaths.c:483
#21 0x00727f8a in set_base_rel_pathlists (root=root@entry=0x2b5b530) at 
allpaths.c:355
#22 0x007286fd in make_one_rel (root=root@entry=0x2b5b530, 
joinlist=joinlist@entry=0x2b6fd98) at allpaths.c:225
#23 0x007512d5 in query_planner (root=root@entry=0x2b5b530, 
qp_callback=qp_callback@entry=0x75300a , 
qp_extra=qp_extra@entry=0x7ffe5baa3670) at planmain.c:276
#24 0x007589ec in grouping_planner (root=root@entry=0x2b5b530, 
tuple_fraction=, tuple_fraction@entry=0) at planner.c:1467
#25 0x0075a5f2 in subquery_planner (glob=, 
parse=parse@entry=0x2b23c08, parent_root=parent_root@entry=0x2736768, 
hasRecursion=hasRecursion@entry=false, tuple_fraction=) at 
planner.c:1044
#26 0x00726567 in set_subquery_pathlist (root=root@entry=0x2736768, 
rel=rel@entry=0x275

Re: Code checks for App Devs, using new options for transaction behavior

2022-10-30 Thread Simon Riggs

On Fri, 28 Oct 2022 at 10:33, Simon Riggs  wrote:

> Thanks for the feedback, I will make all of those corrections in the
> next version.

New version attached. I've rolled 002-004 into one patch, but can
split again as needed.

-- 
Simon Riggshttp://www.EnterpriseDB.com/

001_psql_parse_only.v1.patch
Description: Binary data

002_nested_xacts_and_rollback_on_commit.v8.patch
Description: Binary data

Re: Schema variables - new implementation for Postgres 15

2022-10-30 Thread Dmitry Dolgov

> On Thu, Oct 13, 2022 at 07:41:32AM +0200, Pavel Stehule wrote:
>  rebased with simplified code related to usage of pfree function

Thanks for the patch, great work!

I've got a couple of questions, although I haven't fully finished reviewing yet
(so more to come):

* I'm curious about ALTER VARIABLE. Current implementation allows altering only
  the name, schema or the owner -- why not e.g. immutability?

* psql tab completion implementation mentions that CREATE VARIABLE could be
  used inside CREATE SCHEMA:

/* CREATE VARIABLE --- is allowed inside CREATE SCHEMA, so use TailMatches 
*/
/* Complete CREATE VARIABLE  with AS */
else if (TailMatches("IMMUTABLE"))

  Is that correct? It doesn't like it works, and from what I see it requires
  some modifications in transformCreateSchemaStmt and schema_stmt.

* psql describe mentions the following:

/*
 * Most functions in this file are content to print an empty table when
 * there are no matching objects.  We intentionally deviate from that
 * here, but only in !quiet mode, for historical reasons.
 */

  I guess it's taken from listTables, and the extended versions says "because
  of the possibility that the user is confused about what the two pattern
  arguments mean". Are those historical reasons apply to variables as well?

Re: Schema variables - new implementation for Postgres 15

2022-10-30 Thread Pavel Stehule

Hi

ne 30. 10. 2022 v 19:05 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> > On Thu, Oct 13, 2022 at 07:41:32AM +0200, Pavel Stehule wrote:
> >  rebased with simplified code related to usage of pfree function
>
> Thanks for the patch, great work!
>
> I've got a couple of questions, although I haven't fully finished
> reviewing yet
> (so more to come):
>
> * I'm curious about ALTER VARIABLE. Current implementation allows altering
> only
>   the name, schema or the owner -- why not e.g. immutability?
>

It is just in category - "not implemented yet". The name, schema or owner
doesn't change behavior. It can be possible (in next versions) to change
type, default expression, immutability (I think). But the patch is long
enough so I prefer just to support basic generic ALTER related to schema,
and other possibilities to implement in next iterations.


>
> * psql tab completion implementation mentions that CREATE VARIABLE could be
>   used inside CREATE SCHEMA:
>
> /* CREATE VARIABLE --- is allowed inside CREATE SCHEMA, so use
> TailMatches */
> /* Complete CREATE VARIABLE  with AS */
> else if (TailMatches("IMMUTABLE"))
>
>   Is that correct? It doesn't like it works, and from what I see it
> requires
>   some modifications in transformCreateSchemaStmt and schema_stmt.
>

yes,

This is a bug. It should be fixed



>
> * psql describe mentions the following:
>
> /*
>  * Most functions in this file are content to print an empty table
> when
>  * there are no matching objects.  We intentionally deviate from
> that
>  * here, but only in !quiet mode, for historical reasons.
>  */
>
>   I guess it's taken from listTables, and the extended versions says
> "because
>   of the possibility that the user is confused about what the two pattern
>   arguments mean". Are those historical reasons apply to variables as well?
>

The behave is same like the tables

(2022-10-30 19:48:14) postgres=# \dt
Did not find any relations.
(2022-10-30 19:48:16) postgres=# \dV
Did not find any session variables.

Thank you for comments

Pavel

Re: warn if GUC set to an invalid shared library

2022-10-30 Thread Maciek Sakrejda

On Sat, Oct 29, 2022 at 10:40 AM Justin Pryzby  wrote:
>
> On Fri, Sep 02, 2022 at 05:24:58PM -0500, Justin Pryzby wrote:
> > It caused no issue when I changed:
> >
> > /* Check that it's acceptable for the indicated 
> > parameter */
> > if (!parse_and_validate_value(record, name, value,
> > - PGC_S_FILE, ERROR,
> > + PGC_S_TEST, ERROR,
> >   &newval, &newextra))
> >
> > I'm not sure where to go from here.
>
> I'm hoping for some guidance ; this simple change may be naive, but I'm not
> sure what a wider change would look like.

I assume you mean guidance on implementation details here, and not on
design. I still think this is a useful patch and I'd be happy to
review and try out future iterations, but I don't have any useful
input here.

Also, for what it's worth, I think requiring the libraries to be in
place before running ALTER SYSTEM does not really seem that onerous. I
can't really think of use cases it precludes.

Thanks,
Maciek

Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

2022-10-30 Thread Michael Paquier

On Sun, Oct 30, 2022 at 03:44:32PM +0100, Alvaro Herrera wrote:
> So I'm kinda proposing that we only do the forward struct initialization
> dance when it really saves on things -- in particular, when it helps
> avoid or reduce massive indirect header inclusion.

Sure.

>  extern ssize_t pg_pwritev_with_retry(int fd,
> - const struct iovec *iov,
> + const iovec *iov,
>   int iovcnt,
>   off_t offset);

However this still needs to be defined as a struct, no?
--
Michael


signature.asc
Description: PGP signature

Re: Simplifying our Trap/Assert infrastructure

2022-10-30 Thread Michael Paquier

On Fri, Oct 28, 2022 at 09:36:23AM +0200, Peter Eisentraut wrote:
> Would there be a use for that?  It's currently only used in the atomics
> code.

Yep, but they would not trigger when using atomics in the frontend
code.  We don't have any use for that in core on HEAD, still that
could be useful for some external frontend code?  Please see the
attached.
--
Michael
diff --git a/src/include/c.h b/src/include/c.h
index d70ed84ac5..12ff782a63 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -800,7 +800,12 @@ typedef NameData *Name;
 #include 
 #define Assert(p) assert(p)
 #define AssertMacro(p)	((void) assert(p))
-#define AssertPointerAlignment(ptr, bndr)	((void)true)
+
+/*
+ * Check that `ptr' is `bndr' aligned.
+ */
+#define AssertPointerAlignment(ptr, bndr) \
+	Assert(TYPEALIGN(bndr, (uintptr_t)(ptr)) == (uintptr_t)(ptr))
 
 #else			/* USE_ASSERT_CHECKING && !FRONTEND */
 


signature.asc
Description: PGP signature

Re: GUC values - recommended way to declare the C variables?

2022-10-30 Thread Peter Smith

On Fri, Oct 28, 2022 at 6:05 PM Michael Paquier  wrote:
>
> On Fri, Oct 28, 2022 at 11:48:13AM +0900, Michael Paquier wrote:
> > Thanks.  I have not looked at the checkup logic yet, but the central
> > declarations seem rather sane, and I have a few comments about the
> > latter.
>
> So, I've had the energy to look at the check logic today, and noticed
> that, while the proposed patch is doing the job when loading the
> in-core GUCs, nothing is happening for the custom GUCs that could be
> loaded through shared_preload_libraries or just from a LOAD command.
>
> After adding an extra check in define_custom_variable() (reworking a
> bit the interface proposed while on it), I have found a few more
> issues than what's been already found on this thread:
> - 5 missing spots in pg_stat_statements.
> - 3 float rounding issues in pg_trgm.
> - 1 spot in pg_prewarm.
> - A few more that had no initialization, but these had a default of
> false/0/0.0 so it does not influence the end result but I have added
> some initializations anyway.
>
> With all that addressed, I am finishing with the attached.  I have
> added some comments for the default definitions depending on the
> CFLAGS, explaining the reasons behind the choices made.  The CI has
> produced a green run, which is not the same as the buildfarm, still
> gives some confidence.
>
> Thoughts?

LGTM.

The patch was intended to expose mismatches, and it seems to be doing
that job already...

I only had some nitpicks for a couple of the new comments, below:

==

1. src/include/storage/bufmgr.h

+
+/* effective when prefetching is available */
+#ifdef USE_PREFETCH
+#define DEFAULT_EFFECTIVE_IO_CONCURRENCY 1
+#define DEFAULT_MAINTENANCE_IO_CONCURRENCY 10
+#else
+#define DEFAULT_EFFECTIVE_IO_CONCURRENCY 0
+#define DEFAULT_MAINTENANCE_IO_CONCURRENCY 0
+#endif

Maybe avoid the word "effective" since that is also one of the GUC names.

Use uppercase.

SUGGESTION
/* Only applicable when prefetching is available */

==

2. src/include/utils/ps_status.h

+/* Disabled on Windows as the performance overhead can be significant */
+#ifdef WIN32
+#define DEFAULT_UPDATE_PROCESS_TITLE false
+#else
+#define DEFAULT_UPDATE_PROCESS_TITLE true
+#endif
 extern PGDLLIMPORT bool update_process_title;


Perhaps put that comment inside the #ifdef WIN32

SUGGESTION
#ifdef WIN32
/* Disabled on Windows because the performance overhead can be significant */
#define DEFAULT_UPDATE_PROCESS_TITLE false
#else
...

==

src/backend/utils/misc/guc.c

3. InitializeGUCOptions

@@ -1413,6 +1496,9 @@ InitializeGUCOptions(void)
  hash_seq_init(&status, guc_hashtab);
  while ((hentry = (GUCHashEntry *) hash_seq_search(&status)) != NULL)
  {
+ /* check mapping between initial and default value */
+ Assert(check_GUC_init(hentry->gucvar));
+

Use uppercase.

Minor re-wording.

SUGGESTION
/* Check the GUC default and declared initial value for consistency */

~~~

4. define_custom_variable

Same as #3.

--
Kind Regards,
Peter Smith
Fujitsu Australia

Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2022-10-30 Thread Maciek Sakrejda

On Wed, Oct 26, 2022 at 10:55 AM Melanie Plageman
 wrote:

+ The pg_statio_ and
+ pg_stat_io views are primarily useful to determine
+ the effectiveness of the buffer cache. When the number of actual disk reads

Totally nitpicking, but this reads a little funny to me. Previously
the trailing underscore suggested this is a group, and now with
pg_stat_io itself added (stupid question: should this be
"pg_statio"?), it sounds like we're talking about two views:
pg_stat_io and "pg_statio_". Maybe something like "The pg_stat_io view
and the pg_statio_ set of views are primarily..."?

+ by that backend type in that IO context. Currently only a subset of IO
+ operations are tracked here. WAL IO, IO on temporary files, and some forms
+ of IO outside of shared buffers (such as when building indexes or moving a
+ table from one tablespace to another) could be added in the future.

Again nitpicking, but should this be "may be added"? I think "could"
suggests the possibility of implementation, whereas "may" feels more
like a hint as to how the feature could evolve.

+ portion of the main shared buffer pool. This pattern is called a
+ Buffer Access Strategy in the
+ PostgreSQL source code and the fixed-size
+ ring buffer is referred to as a strategy ring buffer for
+ the purposes of this view's documentation.
+ 

Nice, I think this explanation is very helpful. You also use the term
"strategy context" and "strategy operation" below. I think it's fairly
obvious what those mean, but pointing it out in case we want to note
that here, too.

+ read and extended for

Maybe "plus" instead of "and" here for clarity (I'm assuming that's
what the "and" means)?

+ backend_types autovacuum launcher,
+ autovacuum worker, client backend,
+ standalone backend, background
+ worker, and walsender for all
+ io_contexts is similar to the sum of

I'm reviewing the rendered docs now, and I noticed sentences like this
are a bit hard to scan: they force the reader to parse a big list of
backend types before even getting to the meat of what this is talking
about. Should we maybe reword this so that the backend list comes at
the end of the sentence? Or maybe even use a list (e.g., like in the
"state" column description in pg_stat_activity)?

+ heap_blks_read, idx_blks_read,
+ tidx_blks_read, and
+ toast_blks_read in 
+ pg_statio_all_tables. and
+ blks_read from If using the PostgreSQL extension,
+ ,
+ read for
+ backend_types autovacuum launcher,
+ autovacuum worker, client backend,
+ standalone backend, background
+ worker, and walsender for all
+ io_contexts is equivalent to

Same comment as above re: the lengthy list.

+ Normal client backends should be able to rely on maintenance processes
+ like the checkpointer and background writer to write out dirty data as

Nice--it's great to see this mentioned. But I think these are
generally referred to as "auxiliary" not "maintenance" processes, no?

+ If using the PostgreSQL extension,
+ , written and
+ extended for backend_types

Again, should this be "plus" instead of "and"?

+ 
+ bytes_conversion bigint
+ 

I think this general approach works (instead of unit). I'm not wild
about the name, but I don't really have a better suggestion. Maybe
"op_bytes" (since each cell is counting the number of I/O operations)?
But I think bytes_conversion is okay.

Also, is this (in the middle of the table) the right place for this
column? I would have expected to see it before or after all the actual
I/O op cells.

+ io_contexts. When a Buffer Access
+ Strategy reuses a buffer in the strategy ring, it must evict its
+ contents, incrementing reused. When a Buffer
+ Access Strategy adds a new shared buffer to the strategy ring
+ and this shared buffer is occupied, the Buffer Access
+ Strategy must evict the contents of the shared buffer,
+ incrementing evicted.

I think the parallel phrasing here makes this a little hard to follow.
Specifically, I think "must evict its contents" for the strategy case
sounds like a bad thing, but in fact this is a totally normal thing
that happens as part of strategy access, no? The idea is you probably
won't need that buffer again, so it's fine to evict it. I'm not sure
how to reword, but I think the current phrasing is misleading.

+ The number of times a bulkread found the current
+ buffer in the fixed-size strategy ring dirty and requiring flush.

Maybe "...found ... to be dirty..."?

+ frequent vacuuming or more aggressive autovacuum settings, as buffers are
+ dirtied during a bulkread operation when updating the hint bit or when
+ performing on-access pruning.

Are there docs to cross-reference here, especially for pruning? I
couldn't find much except a few un-explained mentions in the page
layout docs [2], and most of the search results refer to partition
pruning. Searching for hint bits at least gives some info in blog
posts and the wiki.

+ again. A high number of repossessions is a sign of contention for the
+ blocks operated on by the strategy operation.

This (and in ge

Re: resowner "cold start" overhead

2022-10-30 Thread Kyotaro Horiguchi

At Sat, 29 Oct 2022 13:00:25 -0700, Andres Freund  wrote in 
> One way to reduce the size increase would be to use the space for initialarr
> to store variables we don't need while initialarr is used. E.g. itemsarr,
> maxitems, lastarr are candidates.  But I suspect that the code complication
> isn't worth it.

+1

> A different approach could be to not go for the "embedded initial elements"
> approach, but instead to not delete resource owners / resource arrays inside
> ResourceOwnerDelete(). We could stash them in a bounded list of resource
> owners, to be reused by ResourceOwnerCreate().  We do end up creating a
> several resource owners even for the simplest queries.

We often do end up creating several resource owners that aquires not
an element at all .  On the other hand, a few resource owners
sometimes grown up to 2048 (several times) or 4096 (one time) elements
druing a run of the regressiont tests. (I saw catlist, tupdesc and
relref grown to 2048 or more elements.)

> The advantage of that scheme is that it'd save more and that we'd only reserve
> space for ResourceArrays that are actually used in the current workload -
> often the majority of arrays won't be.

Thus I believe preserving resource owners works well. Preserving
resource arrays also would work for the time efficiency, but some
resource owners may end up keeping large amount of memory
unnecessarily most of the time for the backend lifetime. I guess that
the amount is far less than the possible bloat by catcache..

> A potential problem would be that we don't want to use the "hashing" style
> ResourceArrays forever, I don't think they'll be as fast for other cases. But
> we could reset the arrays when they get large.

I'm not sure linear search (am I correct?) doesn't harm for 2048 or
more elements.  I think that the "hashing" style doesn't prevent the
arrays from being reset (free-d) at transaction end (or at resource
owner deletion). That allows releasing unused elements while in
transaction but I'm not sure we need to be so keen to reclaim space
during a transaction.

> Greetings,
> 
> Andres Freund
> 
> https://www.postgresql.org/message-id/20221029025420.eplyow6k7tgu6he3%40awork3.anarazel.de

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Lock on ShmemVariableCache fields?

2022-10-30 Thread Japin Li



Hi, hackers

The VariableCacheData says nextOid and oidCount are protected by
OidGenLock.  However, we update them without holding the lock on
OidGenLock in BootStrapXLOG().  Same as nextXid, for other fields
that are protected by XidGenLock, it holds the lock, see
SetTransactionIdLimit().

void
BootStrapXLOG(void)
{
[...]

ShmemVariableCache->nextXid = checkPoint.nextXid;
ShmemVariableCache->nextOid = checkPoint.nextOid;
ShmemVariableCache->oidCount = 0;

[...]

SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);

[...]
}

I also find a similar code in StartupXLOG().  Why we don't hold the lock
on OidGenLock when updating ShmemVariableCache->nextOid and
ShmemVariableCache->oidCount?

If the lock is unnecessary, I think adding some comments is better.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

Re: Adding doubly linked list type which stores the number of items in the list

2022-10-30 Thread David Rowley

Thank you for having another look at this

On Sat, 29 Oct 2022 at 18:32, Bharath Rupireddy
 wrote:
> 1. I guess we need to cast the 'node' parameter too, something like
> below? I'm reading the comment there talking about compilers
> complaning about the unused function arguments.
> dlist_member_check(head, node) ((void) (head); (void) (node);)

I looked at dlist_check() and I didn't quite manage to figure out why
the cast is needed. As far as I can see, there are no calls where we
only pass dlist_head solely for the dlist_check().  For
dlist_member_check(), dlist_delete_from() does not use the 'head'
parameter for anything apart from dlist_member_check(), so I believe
the cast is required for 'head'.  I think I'd rather only add the cast
for 'node' unless we really require it. Cargo-culting it in there just
because that's what the other macros do does not seem like a good idea
to me.

> Can we put max limit, at least in assert, something like below, on
> 'count' instead of saying above? I'm not sure if there's someone
> storing 4 billion items, but it will be a good-to-have safety from the
> data structure perspective if others think it's not an overkill.
> Assert(head->count > 0 && head->count <= PG_UINT32_MAX);

'count' is a uint32. It's always going to be <= PG_UINT32_MAX.

My original thoughts were that it seems unlikely we'd ever give an
assert build a workload that would ever have 2^32 dlist_nodes in
dclist. Having said that, perhaps it would do no harm to add some
overflow checks to 'count'.  I've gone and added some
Assert(head->count > 0) after we do count++.

> + * As dlist_delete but performs checks in ILIST_DEBUG to ensure that 'node'
> + * belongs to 'head'.
> I think 'Same as dlist_delete' instead of just 'As dlist_delete'

I don't really see what's wrong with this.  We use "As above" when we
mean "Same as above" in many locations.  Anyway, I don't feel strongly
about not adding the word, so I've adjusted the wording in that
comment which includes adding the word "Same" at the start.

> 5.
> + * Caution: 'node' must be a member of 'head'.
> + * Caller must ensure that 'before' is a member of 'head'.
> Can we have the same comments around something like below?

I've adjusted dclist_insert_after() and dclist_insert_before(). Each
dclist function that uses dlist_member_check() now has the same text.

> 8. Wondering if we need dlist_delete_from() at all. Can't we just add
> dlist_member_check() dclist_delete_from() and call dlist_delete()
> directly?

Certainly, but I made it that way on purpose. I wanted dclist to have
a superset of the functions that dlist has.  I just see no reason why
dlist shouldn't have dlist_delete_from() when dclist has it.

I've attached the v3 version of the patch which includes some
additional polishing work.

David
diff --git a/src/backend/access/heap/rewriteheap.c 
b/src/backend/access/heap/rewriteheap.c
index b01b39b008..a34e9b352d 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -196,8 +196,7 @@ typedef struct RewriteMappingFile
TransactionId xid;  /* xid that might need to see 
the row */
int vfd;/* fd of mappings file 
*/
off_t   off;/* how far have we written yet 
*/
-   uint32  num_mappings;   /* number of in-memory mappings */
-   dlist_head  mappings;   /* list of in-memory mappings */
+   dclist_head mappings;   /* list of in-memory mappings */
charpath[MAXPGPATH];/* path, for error messages */
 } RewriteMappingFile;

@@ -864,9 +863,10 @@ logical_heap_rewrite_flush_mappings(RewriteState state)
Oid dboid;
uint32  len;
int written;
+   uint32  num_mappings = dclist_count(&src->mappings);

/* this file hasn't got any new mappings */
-   if (src->num_mappings == 0)
+   if (num_mappings == 0)
continue;

if (state->rs_old_rel->rd_rel->relisshared)
@@ -874,7 +874,7 @@ logical_heap_rewrite_flush_mappings(RewriteState state)
else
dboid = MyDatabaseId;

-   xlrec.num_mappings = src->num_mappings;
+   xlrec.num_mappings = num_mappings;
xlrec.mapped_rel = RelationGetRelid(state->rs_old_rel);
xlrec.mapped_xid = src->xid;
xlrec.mapped_db = dboid;
@@ -882,31 +882,30 @@ logical_heap_rewrite_flush_mappings(RewriteState state)
xlrec.start_lsn = state->rs_begin_lsn;

/* write all mappings consecutively */
-   len = src->num_mappings * sizeof(LogicalRewriteMappingData);
+   len = num_mappings * sizeof(LogicalRewriteMappingData);
waldata_start = waldata = palloc(len);

RE: Segfault on logical replication to partitioned table with foreign children

2022-10-30 Thread shiy.f...@fujitsu.com

On Sun, Oct 30, 2022 9:39 PM Tom Lane  wrote:
> 
> What I'm wondering about is whether we can refactor this code
> to avoid so many usually-useless catalog lookups.  Pulling the
> namespace name, in particular, is expensive and we generally
> are not going to need the result.  In the child-rel case it'd
> be much better to pass the opened relation and let the error-check
> subroutine work from that.  Maybe we should just do it like that
> at the start of the recursion, too?  Or pass the relid and let
> the subroutine look up the names only in the error case.
> 
> A completely different line of thought is that this doesn't seem
> like a terribly bulletproof fix, since children could get added to
> a partitioned table after we look.  Maybe it'd be better to check
> the relkind at the last moment before we do something that depends
> on a child table being a relation.
> 

I agree. So maybe we can add this check in apply_handle_tuple_routing().

diff --git a/src/backend/replication/logical/worker.c 
b/src/backend/replication/logical/worker.c
index 5250ae7f54..e941b68e4b 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -2176,6 +2176,10 @@ apply_handle_tuple_routing(ApplyExecutionData *edata,
Assert(partrelinfo != NULL);
partrel = partrelinfo->ri_RelationDesc;

+   /* Check for supported relkind. */
+   CheckSubscriptionRelkind(partrel->rd_rel->relkind,
+
relmapentry->remoterel.nspname, relmapentry->remoterel.relname);
+
/*
 * To perform any of the operations below, the tuple must match the
 * partition's rowtype. Convert if needed or just copy, using a 
dedicated


Regards,
Shi yu

Re: Documentation for building with meson

2022-10-30 Thread samay sharma

Hi,

On Thu, Oct 27, 2022 at 1:04 AM John Naylor 
wrote:

> +# Run the main pg_regress and isolation tests
> +meson test --suite main
>
> This does not work for me in a fresh install until running
>
> meson test --suite setup
>
> In fact, we see in
>
> https://wiki.postgresql.org/wiki/Meson
>
> meson test --suite setup --suite main
>

You are right that this will be needed for a new install. I've added
--suite setup in the testing section in the v3 of the patch (attached).


> That was just an eyeball check from a naive user -- it would be good to
> try running everything documented here.
>

I retried all the instructions as suggested and they work for me.

Regards,
Samay


>
> --
> John Naylor
> EDB: http://www.enterprisedb.com
>


v3-0001-Add-docs-for-building-with-meson.patch
Description: Binary data

Prefetch the next tuple's memory during seqscans

2022-10-30 Thread David Rowley

As part of the AIO work [1], Andres mentioned to me that he found that
prefetching tuple memory during hot pruning showed significant wins.
I'm not proposing anything to improve HOT pruning here, but as a segue
to get the prefetching infrastructure in so that there are fewer AIO
patches, I'm proposing we prefetch the next tuple during sequence
scans in while page mode.

It turns out the gains are pretty good when we apply this:

-- table with 4 bytes of user columns
create table t as select a from generate_series(1,1000)a;
vacuum freeze t;
select pg_prewarm('t');

Master @ a9f8ca600
# select * from t where a = 0;
Time: 355.001 ms
Time: 354.573 ms
Time: 354.490 ms
Time: 354.556 ms
Time: 354.335 ms

Master + 0001 + 0003:
# select * from t where a = 0;
Time: 328.578 ms
Time: 329.387 ms
Time: 329.349 ms
Time: 329.704 ms
Time: 328.225 ms (avg ~7.7% faster)

-- table with 64 bytes of user columns
create table t2 as
select a,a a2,a a3,a a4,a a5,a a6,a a7,a a8,a a9,a a10,a a11,a a12,a
a13,a a14,a a15,a a16
from generate_series(1,1000)a;
vacuum freeze t2;
select pg_prewarm('t2');

Master:
# select * from t2 where a = 0;
Time: 501.725 ms
Time: 501.815 ms
Time: 503.225 ms
Time: 501.242 ms
Time: 502.394 ms

Master + 0001 + 0003:
# select * from t2 where a = 0;
Time: 412.076 ms
Time: 410.669 ms
Time: 410.490 ms
Time: 409.782 ms
Time: 410.843 ms (avg ~22% faster)

This was tested on an AMD 3990x CPU. I imagine the CPU matters quite a
bit here. It would be interesting to see if the same or similar gains
can be seen on some modern intel chip too.

I believe Thomas wrote the 0001 patch (same as patch in [2]?). I only
quickly put together the 0003 patch.

I wondered if we might want to add a macro to 0001 that says if
pg_prefetch_mem() is empty or not then use that to #ifdef out the code
I added to heapam.c. Although, perhaps most compilers will be able to
optimise away the extra lines that are figuring out what the address
of the next tuple is.

My tests above are likely the best case for this.  It seems plausible
to me that if there was a much more complex plan that found a
reasonable number of tuples and did something with them that we
wouldn't see the same sort of gains. Also, it also does not seem
impossible that the prefetch just results in evicting some
useful-to-some-other-exec-node cache line or that the prefetched tuple
gets flushed out the cache by the time we get around to fetching the
next tuple from the scan again due to various other node processing
that's occurred since the seq scan was last called. I imagine such
things would be indistinguishable from noise, but I've not tested.

I also tried prefetching out by 2 tuples. It didn't help any further
than prefetching 1 tuple.

I'll add this to the November CF.

David

[1] 
https://www.postgresql.org/message-id/flat/20210223100344.llw5an2akleng...@alap3.anarazel.de
[2] 
https://www.postgresql.org/message-id/CA%2BhUKG%2Bpi63ZbcZkYK3XB1pfN%3DkuaDaeV0Ha9E%2BX_p6TTbKBYw%40mail.gmail.com
From 2fd10f1266550f26f4395de080bcdcf89b6859b6 Mon Sep 17 00:00:00 2001
From: David Rowley 
Date: Wed, 19 Oct 2022 08:54:01 +1300
Subject: [PATCH 1/3] Add pg_prefetch_mem() macro to load cache lines.

Initially mapping to GCC, Clang and MSVC builtins.

Discussion: 
https://postgr.es/m/CAEepm%3D2y9HM9QP%2BHhRZdQ3pU6FShSMyu%3DV1uHXhQ5gG-dketHg%40mail.gmail.com
---
 config/c-compiler.m4   | 17 
 configure  | 40 ++
 configure.ac   |  3 +++
 meson.build|  3 ++-
 src/include/c.h|  8 
 src/include/pg_config.h.in |  3 +++
 src/tools/msvc/Solution.pm |  1 +
 7 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 000b075312..582a47501c 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -355,6 +355,23 @@ AC_DEFINE_UNQUOTED(AS_TR_CPP([HAVE$1]), 1,
[Define to 1 if your compiler understands $1.])
 fi])# PGAC_CHECK_BUILTIN_FUNC
 
+# PGAC_CHECK_BUILTIN_VOID_FUNC
+# ---
+# Variant for void functions.
+AC_DEFUN([PGAC_CHECK_BUILTIN_VOID_FUNC],
+[AC_CACHE_CHECK(for $1, pgac_cv$1,
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([
+void
+call$1($2)
+{
+$1(x);
+}], [])],
+[pgac_cv$1=yes],
+[pgac_cv$1=no])])
+if test x"${pgac_cv$1}" = xyes ; then
+AC_DEFINE_UNQUOTED(AS_TR_CPP([HAVE$1]), 1,
+   [Define to 1 if your compiler understands $1.])
+fi])# PGAC_CHECK_BUILTIN_VOID_FUNC
 
 
 # PGAC_CHECK_BUILTIN_FUNC_PTR
diff --git a/configure b/configure
index 3966368b8d..c4685b8a1e 100755
--- a/configure
+++ b/configure
@@ -15988,6 +15988,46 @@ _ACEOF
 
 fi
 
+# Can we use a built-in to prefetch memory?
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_prefetch" >&5
+$as_echo_n "checking for __builtin_prefetch... " >&6; }
+if ${pgac_cv__builtin_prefetch+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end c

Re: pg15 inherited stats expressions: cache lookup failed for statistics object

2022-10-30 Thread Richard Guo

On Mon, Oct 31, 2022 at 1:05 AM Justin Pryzby  wrote:

> I think this is what's needed.
>
> diff --git a/src/backend/utils/adt/selfuncs.c
> b/src/backend/utils/adt/selfuncs.c
> index 14e0885f19f..4450f0d682f 100644
> --- a/src/backend/utils/adt/selfuncs.c
> +++ b/src/backend/utils/adt/selfuncs.c
> @@ -5240,6 +5240,8 @@ examine_variable(PlannerInfo *root, Node *node, int
> varRelid,
> /* skip stats without per-expression stats */
> if (info->kind != STATS_EXT_EXPRESSIONS)
> continue;
> +   if (info->inherit != rte->inh)
> +   continue;
>
> pos = 0;
> foreach(expr_item, info->exprs)
>

I think we also need to do this when loading the ndistinct value, to
skip statistics with mismatching stxdinherit in
estimate_multivariate_ndistinct().

Thanks
Richard

Re: GUC values - recommended way to declare the C variables?

2022-10-30 Thread Michael Paquier

On Mon, Oct 31, 2022 at 12:01:33PM +1100, Peter Smith wrote:
> SUGGESTION
> /* Only applicable when prefetching is available */

Thanks for the suggestion.  Done this way, then.

> +/* Disabled on Windows as the performance overhead can be significant */
> +#ifdef WIN32
> +#define DEFAULT_UPDATE_PROCESS_TITLE false
> +#else
> +#define DEFAULT_UPDATE_PROCESS_TITLE true
> +#endif
>  extern PGDLLIMPORT bool update_process_title;
> 
> Perhaps put that comment inside the #ifdef WIN32

I'd keep that externally, as ps_status.h does so.

> [...]
> SUGGESTION
> /* Check the GUC default and declared initial value for consistency */

Okay, fine by me.

I have split the change into two parts at the end: one to refactor and
fix the C declarations, and a second to introduce the check routine
with all the correct declarations in place.

FWIW, I have been testing that with my own in-house modules and it has
caught a few stupid inconsistencies.  Let's see how it goes.
--
Michael


signature.asc
Description: PGP signature

Re: pg15 inherited stats expressions: cache lookup failed for statistics object

2022-10-30 Thread Richard Guo

On Mon, Oct 31, 2022 at 12:26 PM Richard Guo  wrote:

> On Mon, Oct 31, 2022 at 1:05 AM Justin Pryzby 
> wrote:
>
>> I think this is what's needed.
>>
>> diff --git a/src/backend/utils/adt/selfuncs.c
>> b/src/backend/utils/adt/selfuncs.c
>> index 14e0885f19f..4450f0d682f 100644
>> --- a/src/backend/utils/adt/selfuncs.c
>> +++ b/src/backend/utils/adt/selfuncs.c
>> @@ -5240,6 +5240,8 @@ examine_variable(PlannerInfo *root, Node *node, int
>> varRelid,
>> /* skip stats without per-expression stats */
>> if (info->kind != STATS_EXT_EXPRESSIONS)
>> continue;
>> +   if (info->inherit != rte->inh)
>> +   continue;
>>
>> pos = 0;
>> foreach(expr_item, info->exprs)
>>
>
> I think we also need to do this when loading the ndistinct value, to
> skip statistics with mismatching stxdinherit in
> estimate_multivariate_ndistinct().
>

To be concrete, I mean something like attached.

BTW, I noticed a micro-optimization opportunity in examine_variable that
we can fetch the RangeTblEntry for 'onerel' outside the foreach loop
when iterating the extended stats so that we can do it only once rather
than for each stat.

Thanks
Richard


v1-0001-Skip-statistics-with-mismatching-stxdinherit-flag.patch
Description: Binary data

Re: heavily contended lwlocks with long wait queues scale badly

2022-10-30 Thread Kyotaro Horiguchi

At Thu, 27 Oct 2022 09:59:14 -0700, Andres Freund  wrote in 
> But I think we can solve that fairly reasonably nonetheless. We can change
> PGPROC->lwWaiting to not just be a boolean, but have three states:
> 0: not waiting
> 1: waiting in waitlist
> 2: waiting to be woken up
> 
> which we then can use in LWLockDequeueSelf() to only remove ourselves from the
> list if we're on it. As removal from that list is protected by the wait list
> lock, there's no race to worry about.

Since LWLockDequeueSelf() is defined to be called in some restricted
situation, there's no chance that the proc to remove is in another
lock waiters list at the time the function is called. So it seems to
work well. It is simple and requires no additional memory or cycles...

No. It enlarges PRPC by 8 bytes, but changing lwWaiting to int8/uint8
keeps the size as it is. (Rocky8/x86-64)

It just shaves off looping cycles. So +1 for what the patch does.


> client  patched   HEAD
> 16010960174
> 2   112694   116169
> 4   214287   208119
> 8   377459   373685
> 16  524132   515247
> 32  565772   554726
> 64  587716   497508
> 128 581297   415097
> 256 550296   334923
> 512 486207   243679
> 768 449673   192959
> 1024410836   157734
> 204832622482904
> 409625025232007
> 
> Not perfect with the patch, but not awful either.

Fairly good? Agreed.  The performance peak is improved by 6% and
shifted to larger number of clients (32->128).

> I suspect this issue might actually explain quite a few odd performance
> behaviours we've seen at the larger end in the past. I think it has gotten a
> bit worse with the conversion of lwlock.c to proclists (I see lots of
> expensive multiplications to deal with sizeof(PGPROC)), but otherwise likely
> exists at least as far back as ab5194e6f61, in 9.5.
>
> I guess there's an argument for considering this a bug that we should
> backpatch a fix for? But given the vintage, probably not?  The only thing that
> gives me pause is that this is quite hard to pinpoint as happening.

I don't think this is a bug but I think it might be back-patchable
since it doesn't change memory footprint (if adjusted), causes no
additional cost or interfarce breakage, thus it might be ok to
backpatch.  Since this "bug" has the nature of positive-feedback so
reducing the coefficient is benetifical than the direct cause of the
change.

> I've attached my quick-and-dirty patch. Obviously it'd need a few defines etc,
> but I wanted to get this out to discuss before spending further time.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: pg15 inherited stats expressions: cache lookup failed for statistics object

2022-10-30 Thread Michael Paquier

On Mon, Oct 31, 2022 at 01:12:09PM +0800, Richard Guo wrote:
> BTW, I noticed a micro-optimization opportunity in examine_variable that
> we can fetch the RangeTblEntry for 'onerel' outside the foreach loop
> when iterating the extended stats so that we can do it only once rather
> than for each stat.

Isn't that the kind of thing where we'd better have some regression
coverage?
--
Michael


signature.asc
Description: PGP signature

Limit of WAL LSN value of type pg_lsn

2022-10-30 Thread Shubham Shingne

Hi all,

I am trying to determine optimal standby for automatic failover based on
pg_last_wal_recieve_lsn() value of all slaves.

But what if max_wal_size is reached, is LSN value reset to zero.

Please share some documentation that mentions clarification.

Also suggest if there is any better approach to get node with best RPO out
of all slaves.

Thanks & regards,
Shubham

Commit fest 2022-11

2022-10-30 Thread Michael Paquier

Hi all,

As per the world clock, the next commit fest will begin in 30 hours
(11/1 0:00 AoE time).  I may have missed something, but it looks like
we have no CFM for this one yet.

Opinions, thoughts or volunteers?
--
Michael


signature.asc
Description: PGP signature

RE: [PATCH] Support % wildcard in extension upgrade filenames

2022-10-30 Thread Regina Obe

Just to reiterate the main impetus for this patch is to save PostGIS from
shipping 100s of duplicate extension files for each release. 

> And now with the actual patch attached ... (sorry)
> 
> --strk;
> 

Sandro, can you submit an updated version of this patch.
I was testing it out and looked good first time.

But I retried just now testing against master, and it fails with 

commands/extension.o: In function `file_exists':
postgresql-git\src\backend\commands/extension.c:3430: undefined reference to
`AssertArg'

It seems 2 days ago AssertArg and AssertState were removed.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b1099eca8f38f
f5cfaf0901bb91cb6a22f909bc6

So your use of AssertArg needs to be replaced with Assert I guess. 
I did that and was able to test again with a sample extension I made

Called:

wildtest

1) The wildcard patch in its current state only does anything if 

wildcard_upgrades = true 

is in the control file. If it's false or missing, then the behavior of
extension upgrades doesn't change.


2) It only understands % as a complete wildcard for a version number

So this is legal
wildtest--%--2.0.sql

This does nothing
wildtest--2%--2.0.sql

3) I confirmed that if you have a path such as

wildtest--2.0--2.2.sql
wildtest--%--2.2.sql

then the exact match trumps the wildcard. In the above case if I am on 2.0
and going to 2.2, the wildtest--2.0--2.2.sql script is used instead of the
wildtest--% one.

4) It is not possible to downgrade with the wildcard.  For example I had
paths
wildtest--%--2.1.sql  

and I was unable to go from a version 2.2 down to a version 2.1.  I didn't
check why that was so, but probably a good thing.

If everyone is okay with this patch, we'll go ahead and add tests and
documentation to go with it.

Thanks,
Regina

Re: Adding doubly linked list type which stores the number of items in the list

2022-10-30 Thread Bharath Rupireddy

On Mon, Oct 31, 2022 at 8:26 AM David Rowley  wrote:
>
> I looked at dlist_check() and I didn't quite manage to figure out why
> the cast is needed. As far as I can see, there are no calls where we
> only pass dlist_head solely for the dlist_check(). For
> dlist_member_check(), dlist_delete_from() does not use the 'head'
> parameter for anything apart from dlist_member_check(), so I believe
> the cast is required for 'head'.  I think I'd rather only add the cast
> for 'node' unless we really require it. Cargo-culting it in there just
> because that's what the other macros do does not seem like a good idea
> to me.

Hm, you're right, dlist_member_check() needs it. Also, slist_check()
needs it for slist_has_next(). dlist_check() doesn't need it, however,
keeping it intact doesn't harm, I guess.

> My original thoughts were that it seems unlikely we'd ever give an
> assert build a workload that would ever have 2^32 dlist_nodes in
> dclist. Having said that, perhaps it would do no harm to add some
> overflow checks to 'count'.  I've gone and added some
> Assert(head->count > 0) after we do count++.

So, when an overflow occurs, the head->count wraps around after
PG_UINT32_MAX, meaning, becomes 0 and we will catch it in an assert
build. This looks reasonable to me. However, the responsibility lies
with the developers to deal with such overflows.

> > 8. Wondering if we need dlist_delete_from() at all. Can't we just add
> > dlist_member_check() dclist_delete_from() and call dlist_delete()
> > directly?
>
> Certainly, but I made it that way on purpose. I wanted dclist to have
> a superset of the functions that dlist has.  I just see no reason why
> dlist shouldn't have dlist_delete_from() when dclist has it.

Okay.

> I've attached the v3 version of the patch which includes some
> additional polishing work.

Thanks. The v3 patch looks good to me.

BTW, do we need sclist_* as well? AFAICS, no such use-case exists
needing slist and element count, maybe we don't need.

I'm wondering if adding count to dlist_head and maintaining it as part
of the existing dlist_* data structure and functions is any better
than introducing dclist_*? In that case, we need only one function,
dlist_count, no? Or do we choose to go dclist_* because we want to
avoid the extra cost of maintaining count within dlist_*? If yes, is
maintaining count in dlist_* really costly?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Lock on ShmemVariableCache fields?

2022-10-30 Thread Zhang Mingli

HI,

On Oct 31, 2022, 10:48 +0800, Japin Li , wrote:
>
> Hi, hackers
>
> The VariableCacheData says nextOid and oidCount are protected by
> OidGenLock. However, we update them without holding the lock on
> OidGenLock in BootStrapXLOG(). Same as nextXid, for other fields
> that are protected by XidGenLock, it holds the lock, see
> SetTransactionIdLimit().
>
> void
> BootStrapXLOG(void)
> {
> [...]
>
> ShmemVariableCache->nextXid = checkPoint.nextXid;
> ShmemVariableCache->nextOid = checkPoint.nextOid;
> ShmemVariableCache->oidCount = 0;
>
> [...]
>
> SetTransactionIdLimit(checkPoint.oldestXid, checkPoint.oldestXidDB);
>
> [...]
> }
>
> I also find a similar code in StartupXLOG(). Why we don't hold the lock
> on OidGenLock when updating ShmemVariableCache->nextOid and
> ShmemVariableCache->oidCount?
>
> If the lock is unnecessary, I think adding some comments is better.
>
> --
> Regrads,
> Japin Li.
> ChengDu WenWu Information Technology Co.,Ltd.
>
>
As its name BootStrapXLOG, it’s used in BootStrap mode to initialize the 
template database.
The process doesn’t speak SQL and the database is not ready.
There won’t be  concurrent access to variables.

Regards,
Zhang Mingli
>

Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

2022-10-30 Thread Bharath Rupireddy

On Mon, Oct 31, 2022 at 5:01 AM Michael Paquier  wrote:
>
> On Sun, Oct 30, 2022 at 03:44:32PM +0100, Alvaro Herrera wrote:
> > So I'm kinda proposing that we only do the forward struct initialization
> > dance when it really saves on things -- in particular, when it helps
> > avoid or reduce massive indirect header inclusion.
>
> Sure.

I don't think including pg_iovec.h in file_utils.h is a good idea. I
agree that pg_iovec.h is fairly a small header file but file_utils.h
is included in 21 .c files, as of today and the file_utils.h footprint
might increase in future. Therefore, I'd vote for forward struct
initialization as it is on HEAD today.

> >  extern ssize_t pg_pwritev_with_retry(int fd,
> > - const struct iovec *iov,
> > + const iovec *iov,
> >   int iovcnt,
> >   off_t offset);
>
> However this still needs to be defined as a struct, no?

Yes, we need a struct there because we haven't typedef'ed struct iovec.

Also, the patch forgets to remove "port/pg_iovec.h" from file_utils.c

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: GUC values - recommended way to declare the C variables?

2022-10-30 Thread Peter Smith

On Mon, Oct 31, 2022 at 4:02 PM Michael Paquier  wrote:
>
> On Mon, Oct 31, 2022 at 12:01:33PM +1100, Peter Smith wrote:
> > SUGGESTION
> > /* Only applicable when prefetching is available */
>
> Thanks for the suggestion.  Done this way, then.
>
> > +/* Disabled on Windows as the performance overhead can be significant */
> > +#ifdef WIN32
> > +#define DEFAULT_UPDATE_PROCESS_TITLE false
> > +#else
> > +#define DEFAULT_UPDATE_PROCESS_TITLE true
> > +#endif
> >  extern PGDLLIMPORT bool update_process_title;
> >
> > Perhaps put that comment inside the #ifdef WIN32
>
> I'd keep that externally, as ps_status.h does so.
>
> > [...]
> > SUGGESTION
> > /* Check the GUC default and declared initial value for consistency */
>
> Okay, fine by me.
>
> I have split the change into two parts at the end: one to refactor and
> fix the C declarations, and a second to introduce the check routine
> with all the correct declarations in place.
>
> FWIW, I have been testing that with my own in-house modules and it has
> caught a few stupid inconsistencies.  Let's see how it goes.

Thanks for pushing.

--
Kind Regards,
Peter Smith.
Fujitsu Australia.

Re: Pulling up direct-correlated ANY_SUBLINK

How to started with Contributions

Re: How to started with Contributions

Re: Segfault on logical replication to partitioned table with foreign children

Re: Segfault on logical replication to partitioned table with foreign children

Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

Re: How to started with Contributions

Re: Segfault on logical replication to partitioned table with foreign children

pg15 inherited stats expressions: cache lookup failed for statistics object

Re: Code checks for App Devs, using new options for transaction behavior

Re: Schema variables - new implementation for Postgres 15

Re: Schema variables - new implementation for Postgres 15

Re: warn if GUC set to an invalid shared library

Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

Re: Simplifying our Trap/Assert infrastructure

Re: GUC values - recommended way to declare the C variables?

Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

Re: resowner "cold start" overhead

Lock on ShmemVariableCache fields?

Re: Adding doubly linked list type which stores the number of items in the list

RE: Segfault on logical replication to partitioned table with foreign children

Re: Documentation for building with meson

Prefetch the next tuple's memory during seqscans

Re: pg15 inherited stats expressions: cache lookup failed for statistics object

Re: GUC values - recommended way to declare the C variables?

Re: pg15 inherited stats expressions: cache lookup failed for statistics object

Re: heavily contended lwlocks with long wait queues scale badly

Re: pg15 inherited stats expressions: cache lookup failed for statistics object

Limit of WAL LSN value of type pg_lsn

Commit fest 2022-11

RE: [PATCH] Support % wildcard in extension upgrade filenames

Re: Adding doubly linked list type which stores the number of items in the list

Re: Lock on ShmemVariableCache fields?

Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

Re: GUC values - recommended way to declare the C variables?

35 matches

Site Navigation

Mail list logo

Footer information