Re: Statistics Import and Export

2025-05-22 Thread Robert Haas
On Sat, May 10, 2025 at 3:51 PM Greg Sabino Mullane  wrote:
> I may have missed something (we seem to have a lot of threads for this 
> subject), but we are in beta and both pg_dump and pg_upgrade seem to be 
> opt-out? I still object strongly to this;  pg_dump is meant to be a canonical 
> representation of the schema and data. Adding metadata that can change from 
> dump to dump seems wrong, and should be opt-in. I've not been convinced 
> otherwise why stats should be output by default.
>
> To be clear, I 100% want it to be the default for pg_upgrade.
>
> Maybe we are just leaving it enabled to see if anyone complains in beta, but 
> I don't want us to forget about it. :)

Yeah. This could use comments from a few more people, but I really
hope we don't ship the final release this way. We do have an "Enable
statistics in pg_dump by default" item in the open items list under
"Decisions to Recheck Mid-Beta", but that's arguably now. It also sort
of looks like we might have a consensus anyway: Jeff said "I lean
towards making it opt-in for pg_dump and opt-out for pg_upgrade" and I
agree with that and it seems you do, too. So perhaps Jeff should make
it so?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: ALTER DOMAIN ADD NOT NULL NOT VALID

2025-05-22 Thread Quan Zongliang




On 2025/5/21 18:44, jian he wrote:

Hi.

The attached patch implements $subject.

per https://www.postgresql.org/docs/current/sql-alterdomain.html
"""
Although ALTER DOMAIN ADD CONSTRAINT attempts to verify that existing stored
data satisfies the new constraint, this check is not bulletproof, because the
command cannot “see” table rows that are newly inserted or updated and not yet
committed. If there is a hazard that concurrent operations might insert bad
data, the way to proceed is to add the constraint using the NOT VALID option,
commit that command, wait until all transactions started before that commit have
finished, and then issue ALTER DOMAIN VALIDATE CONSTRAINT to search for data
violating the constraint.
"""

Obviously, the above behavior can also happen with not-null constraints.
Adding NOT NULL NOT VALID is useful for validating existing data later, too.

The NOT VALID information is displayed in the "Nullable" column,
for example:

\dD things
                                       List of domains
 Schema |  Name  |  Type   | Collation |      Nullable      | Default |       Check
--------+--------+---------+-----------+--------------------+---------+--------------------
 public | things | integer |           | not null not valid |         | CHECK (VALUE < 11)


It makes sense to support the "NOT NULL NOT VALID" option.

The two if statements in AlterDomainNotNull() should be adjusted.

if (typTup->typnotnull == notNull && !notNull)
==>
if (!notNull && !typTup->typnotnull)


if (typTup->typnotnull == notNull && notNull)
==>
if (notNull && typTup->typnotnull)

--
Quan Zongliang





Re: Assert("vacrel->eager_scan_remaining_successes > 0")

2025-05-22 Thread Melanie Plageman
On Wed, May 21, 2025 at 6:11 PM Masahiko Sawada  wrote:
>
> > if (vacrel->eager_scan_remaining_successes > 0)
> >  vacrel->eager_scan_remaining_successes--;
>
> I've attached a patch that uses this idea. Feedback is very welcome.

Thanks for writing the patch!

I actually think we have the same situation with
eager_scan_remaining_fails. Since the extra pages that are eagerly
scanned could either fail or succeed to be frozen, we probably also
need to change the assert in the failure case into a guard as well:

else
{
Assert(vacrel->eager_scan_remaining_fails > 0);
vacrel->eager_scan_remaining_fails--;
}

->

  else if (vacrel->eager_scan_remaining_fails > 0)
   vacrel->eager_scan_remaining_fails--;

In the comment you wrote, I would probably just change one thing

+/*
+ * Report only once that we disabled eager scanning. This
+ * check is required because we might have eagerly read
+ * more blocks and we could reach here even after
+ * disabling eager scanning.
+ */

I would emphasize that we read ahead these blocks before executing the
code trying to freeze them. So, I might instead say something like:
"Report only once that we disabled eager scanning. We may eagerly read
ahead blocks in excess of the success or failure caps before
attempting to freeze them, so we could reach here even after disabling
additional eager scanning"

And then probably avoid repeating the whole comment above the
remaining fails guard.

- Melanie




Re: making EXPLAIN extensible

2025-05-22 Thread Andrei Lepikhov

On 22/5/2025 16:17, Robert Haas wrote:

> On Sat, May 3, 2025 at 2:44 PM Andrei Lepikhov  wrote:
>> I have one additional proposal.
>>
>> I currently use this interface and have noticed that the parameter
>> `option_name` in the RegisterExtensionExplainOption routine is
>> case-sensitive. Since SQL treats our extended options as
>> case-insensitive, it would be beneficial to either add a comment
>> clarifying this behavior or forcibly convert the incoming text constant to
>> lowercase.
>
> I don't think this is really true. It's just standard identifier
> handling. You can have options with upper-case names if you quote
> them.
Sorry for my wording. I meant that when you call
RegisterExtensionExplainOption(), it matters whether you pass the
`option_name` parameter as "debug" or as "deBug". On the user side,
case-insensitivity works correctly, of course. I'm not sure about side
effects if one extension calls this routine with "Debug" and another
one with "debuG".


--
regards, Andrei Lepikhov




Re: Statistics Import and Export

2025-05-22 Thread Nathan Bossart
On Thu, May 22, 2025 at 10:20:16AM -0400, Robert Haas wrote:
> It also sort
> of looks like we might have a consensus anyway: Jeff said "I lean
> towards making it opt-in for pg_dump and opt-out for pg_upgrade" and I
> agree with that and it seems you do, too. So perhaps Jeff should make
> it so?

+1, I think we should go ahead and do this.  If Jeff can't get to it, I'm
happy to pick it up in the next week or so.

-- 
nathan




Re: making EXPLAIN extensible

2025-05-22 Thread Robert Haas
On Sat, May 3, 2025 at 2:44 PM Andrei Lepikhov  wrote:
> I have one additional proposal.
>
> I currently use this interface and have noticed that the parameter
> `option_name` in the RegisterExtensionExplainOption routine is
> case-sensitive. Since SQL treats our extended options as
> case-insensitive, it would be beneficial to either add a comment
> clarifying this behavior or forcibly convert the incoming text constant to
> lowercase.

I don't think this is really true. It's just standard identifier
handling. You can have options with upper-case names if you quote
them.

robert.haas=# explain (verbose) select 1;
                QUERY PLAN
-------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=4)
   Output: 1
(2 rows)

robert.haas=# explain (VERBOSE) select 1;
                QUERY PLAN
-------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=4)
   Output: 1
(2 rows)

robert.haas=# explain ("verbose") select 1;
                QUERY PLAN
-------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=4)
   Output: 1
(2 rows)

robert.haas=# explain ("VERBOSE") select 1;
ERROR:  unrecognized EXPLAIN option "VERBOSE"
LINE 1: explain ("VERBOSE") select 1;
                 ^

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Statistics Import and Export

2025-05-22 Thread Nathan Bossart
On Wed, May 21, 2025 at 04:53:17PM -0700, Jeff Davis wrote:
> On Wed, 2025-05-21 at 16:29 -0500, Nathan Bossart wrote:
>> I don't know precisely where that line might be, but in this case,
>> the
>> dumped stats have no hope of restoring into anything older than
>> v18... But I see no particular benefit from moving the complexity
>> to the
>> import side here.
> 
> That's fine with me. Perhaps we should just say that pre-18 behavior
> differences can be fixed up during export, and post-18 behavior
> differences are fixed up during import?

WFM.  I've committed the patch.

-- 
nathan




Re: [Util] Warn and Remove Invalid GUCs

2025-05-22 Thread Tom Lane
Srinath Reddy Sadipiralla  writes:
> The extension is loaded, and then I entered the bogus extension GUC into
> postgresql.conf and restarted; I did not observe any complaint/warning.

Were you looking in the right place?  I experimented with adding

shared_preload_libraries = 'plpgsql'# (change requires restart)
plpgsql.bogus = 1

to postgresql.conf.  Restarting the server, I see in the
postmaster log:

2025-05-22 11:16:45.724 EDT [1526138] WARNING:  invalid configuration parameter 
name "plpgsql.bogus", removing it
2025-05-22 11:16:45.724 EDT [1526138] DETAIL:  "plpgsql" is now a reserved 
prefix.
2025-05-22 11:16:45.728 EDT [1526138] LOG:  starting PostgreSQL 18beta1 on 
x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-26), 
64-bit
... etc etc ...

(I also tried the other order of these settings, to see if that made
any difference, but it didn't.)  I also tried

session_preload_libraries = 'plpgsql'
plpgsql.bogus = 1

and then the complaint comes out during any session start:

$ psql
WARNING:  invalid configuration parameter name "plpgsql.bogus", removing it
DETAIL:  "plpgsql" is now a reserved prefix.
psql (18beta1)
Type "help" for help.

That's all operating as intended, and I'm not seeing how the proposed
patch isn't completely redundant with it.

regards, tom lane




Re: Why our Valgrind reports suck

2025-05-22 Thread Yasir
On Wed, May 21, 2025 at 10:14 PM Tom Lane  wrote:

> Here's a v2 patchset that reaches the goal of zero reported leaks
> in the core regression tests, with some caveats:
>
> * Rather than completely fixing the function-cache and
> TS-dictionary-cache issues, I just added suppression rules to
> hide them.  I am not convinced it is worth working harder than that.
> The patchset does include some fixes that clean up low-hanging fruit
> in that area, but going further seems like a lot of work (and risk of
> bugs) for fairly minimal gain.  The core regression tests show less
> than 10K "suppressed" space in all test sessions but three, and those
> three are still under 100K.
>
> * The patch series assumes that the ModifyTable fix discussed at [1]
> is already applied.
>
> * I still observe leaks in ProcessGetMemoryContextInterrupt, but
> I think the consensus is we should just revert that as not yet ready
> for prime time [2].
>
> 0001 is the same as before except I did more work on the comments.
> I concluded that we were overloading the term "chunk" too much,
> so I invented the term "vchunk" for Valgrind's notion of chunks.
> (Feel free to suggest other terminology.)
>
> 0002 is new work to fix up MemoryContextAllocAligned so it doesn't
> cause possible-leak complaints.
>
>
I tested with the provided v2 patch series, making sure the fix mentioned in
[1] was applied.
More than 800 backend valgrind output files were generated against the
regression tests; 237 of them contain "suppressed: > 0" entries, and 5 of
those also contain "definitely lost: > 0 bytes" entries.
The maximum leak I found in these valgrind output files is only 960 bytes.

Meanwhile, the original issue I posted [link] is fixed. There are no leaks.

Regards,

Yasir Hussain
Data Bene


> The rest are more or less bite-sized fixes of individual problems.
> Probably we could squash a lot of them for final commit, but I
> thought it'd be easier to review like this.  Note that I'm not
> expecting 0013 to get applied in this form [3], but without it we
> have various gripes about memory leaked from plancache entries.
>
> regards, tom lane
>
> [1]
> https://www.postgresql.org/message-id/flat/213261.1747611093%40sss.pgh.pa.us
> [2]
> https://www.postgresql.org/message-id/594293.1747708165%40sss.pgh.pa.us
> [3]
> https://www.postgresql.org/message-id/605328.1747710381%40sss.pgh.pa.us
>
>


Re: [PATCH] Add pretty-printed XML output option

2025-05-22 Thread Jim Jones


On 22.05.25 17:00, Tom Lane wrote:
> Yeah, after sleeping on it I fear that leaving xml_parse entirely
> alone will just be a recipe for future copy-and-paste errors.
That's exactly my concern as well.
> The Assert solution seems like the way to go, approximately
>
> xmlNodePtr  root;
> +   xmlNodePtr  oldroot PG_USED_FOR_ASSERTS_ONLY;
>
> ...
> /* This attaches root to doc, so we need not free it separately. */
> -   xmlDocSetRootElement(doc, root);
> +   oldroot = xmlDocSetRootElement(doc, root);
> +   /* ... and there can't yet be any old root to clean up. */
> +   Assert(oldroot == NULL);
>
> I'll make it so.

+1

Thanks!

Best regards, Jim






Re: generic plans and "initial" pruning

2025-05-22 Thread Robert Haas
On Tue, May 20, 2025 at 11:38 AM Tom Lane  wrote:
> I still like the core idea of deferring locking, but I don't like
> anything about this implementation of it.  It seems like there has
> to be a better and simpler way.

Without particularly defending this implementation, and certainly
without defending its bugs, I just want to say that I'm not convinced
by the idea that there has to be a better and simpler way. We --
principally Amit, but also me and you and others -- have been trying
to find the best way of doing this for probably 5 years now. If you do
something during executor startup, you have to be prepared for
executor startup to force a replan, and if you do something before
executor startup, then you're duplicating executor logic into a new
phase that needs to communicate its results forward to execution
proper. Either approach is awkward and that awkwardness seems to
inevitably bleed into the plan cache specifically. I'd be beyond
delighted if you want to help chart a path through the awkwardness
here, since you know this stuff better than anybody, but I am
skeptical that there is a truly marvelous approach which we've just
managed to overlook for all this time.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Pathify RHS unique-ification for semijoin planning

2025-05-22 Thread wenhui qiu
On Thu, 22 May 2025 at 17:28, Andy Fan  wrote:

> Richard Guo  writes:
>
> Hi,
>
> > However, in the case of sort-based implementation,
> > this function pays no attention to the subpath's pathkeys or the
> > pathkeys of the resulting output.
>
> Good finding!
>
> >
> > In addition to this specific issue, it seems to me that there are
> > other potential issues in create_unique_path().
> >
> > * Currently, we only consider the cheapest_total_path of the RHS when
> > unique-ifying it.
>
> I think it is better to also check the tuple_fraction for the startup_cost
> factor, for some paths where the total cost is high but the required
> fraction is lower.
>
> > I think this may cause us to miss some optimization
> > opportunities.  For example, there might be a path with a better sort
> > order that isn't the cheapest-total one.  Such a path could help avoid
> > a sort at a higher level, potentially resulting in a cheaper overall
> > plan.
>
> > * In create_unique_path(), we currently rely on heuristics to decide
> > whether to use a hash-based or sort-based method.  I think a better
> > approach would be to create paths for both methods and let add_path()
> > determine which one is better, similar to how we handle path selection
> > in other parts of the planner.
> >
> > Therefore, I'm thinking that maybe we could create a new RelOptInfo
> > for the RHS rel to represent its unique-ified version, and then
> > generate all worthwhile paths for it,
>
> This sounds great to me, and I think we can keep the cheapest fractional
> path on the new RelOptInfo as well; then everything else should fall
> into place.
>
> > To be concrete, I'm envisioning something like the following:
> >
> > if (bms_equal(sjinfo->syn_righthand, rel2->relids) &&
> > -   create_unique_path(root, rel2, rel2->cheapest_total_path,
> > -  sjinfo) != NULL)
> > +   (rel2_unique = create_unique_rel(root, rel2, sjinfo)) !=
> NULL)
> >
> > ...
> >
> > -   add_paths_to_joinrel(root, joinrel, rel1, rel2,
> > -JOIN_UNIQUE_INNER, sjinfo,
> > +   add_paths_to_joinrel(root, joinrel, rel1, rel2_unique,
> > +JOIN_INNER, sjinfo,
> >  restrictlist);
> > -   add_paths_to_joinrel(root, joinrel, rel2, rel1,
> > -JOIN_UNIQUE_OUTER, sjinfo,
> > +   add_paths_to_joinrel(root, joinrel, rel2_unique, rel1,
> > +JOIN_INNER, sjinfo,
> >  restrictlist);
> >
> > In addition, by changing the join from "rel1" and "rel2" using
> > JOIN_UNIQUE_OUTER or JOIN_UNIQUE_INNER to a join between "rel1" and
> > "rel2_unique" using a standard JOIN_INNER, we might be able to get
> > rid of the JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER jointypes.
>
> If we can, +10.

Agreed.
Could you please release a patch for this?
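
For reference, the query shape under discussion is a semijoin whose RHS the
planner may unique-ify; the table and column names below are only
illustrative:

```
-- The IN subquery becomes a semijoin; the planner can either use a semijoin
-- directly, or unique-ify t2 on b and then perform a plain inner join.
SELECT *
FROM t1
WHERE t1.a IN (SELECT t2.b FROM t2);
```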



Re: [PoC] XMLCast (SQL/XML X025)

2025-05-22 Thread Robert Haas
On Wed, May 21, 2025 at 2:22 PM Jim Jones  wrote:
> In v10 I added this to the documentation to make the difference to CAST
> clearer:

Yes, that looks very helpful.

> In v10 I changed these comments to:

That, too.

I don't have time to re-review this right now, but I encourage you to
look through the patch for other, similar places that could benefit
from a fuller explanation. And I hope somebody else shows up to
express interest in this so that your work is not wasted...

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Retiring some encodings?

2025-05-22 Thread Bruce Momjian
On Thu, May 22, 2025 at 02:44:39PM +0300, Heikki Linnakangas wrote:
> On 22/05/2025 08:54, Michael Paquier wrote:
> > With all that in mind, I have wanted to kick a discussion about
> > potentially removing one or more encodings from the core code,
> > including the backend part, the frontend part and the conversion
> > routines, coupled with checks in pg_upgrade to complain when databases
> > or collations include the said encoding (the collation part needs
> > to be checked when not using ICU).  Just being able to remove
> > GB18030 would do us a favor in the long-term, at least, but there's
> > more.
> 
> +1 at high level for deprecating and removing conversions that are not
> widely used anymore. As the first step, we can at least add a warning to the
> documentation, that they will be removed in the future.

Agreed on notification.  A radical idea would be to add a warning for
the use of such encodings in PG 18, and then mention their deprecation
in the PG 18 release notes so everyone is informed they will be removed
in PG 19.
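
For anyone who wants to see what their own clusters use today, a query along
these lines lists per-database encodings (collations would need a similar
look at pg_collation, and ICU collations are a separate case):

```
SELECT datname, pg_encoding_to_char(encoding) AS encoding
FROM pg_database
ORDER BY datname;
```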

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.




Re: [PATCH] Add pretty-printed XML output option

2025-05-22 Thread Tom Lane
Jim Jones  writes:
> On 22.05.25 01:48, Tom Lane wrote:
>> ... I considered adding an assertion that that call returns
>> NULL, but concluded that it wasn't worth the notational hassle.
>> I'm not strongly set on that conclusion, though, if you think
>> differently.

> I see. In that case I believe that at least a different comment
> explaining this decision would avoid confusion. Something like

Yeah, after sleeping on it I fear that leaving xml_parse entirely
alone will just be a recipe for future copy-and-paste errors.
The Assert solution seems like the way to go, approximately

xmlNodePtr  root;
+   xmlNodePtr  oldroot PG_USED_FOR_ASSERTS_ONLY;

...
/* This attaches root to doc, so we need not free it separately. */
-   xmlDocSetRootElement(doc, root);
+   oldroot = xmlDocSetRootElement(doc, root);
+   /* ... and there can't yet be any old root to clean up. */
+   Assert(oldroot == NULL);

I'll make it so.

regards, tom lane




Re: Reduce "Var IS [NOT] NULL" quals during constant folding

2025-05-22 Thread Tom Lane
Robert Haas  writes:
> Before we commit to something along these lines, I think we need to
> understand whether Tom intends to press Peter for some bigger change
> around expand_virtual_generated_columns.
> If Tom doesn't respond right away, I suggest that we need to add an
> open item for http://postgr.es/m/602561.1744314...@sss.pgh.pa.us

I think that we do need to do something about that, but it may be
in the too-late-for-v18 category by now.  Not sure.  I definitely
don't love the idea of table_open'ing every table in every query
an extra time just to find out (about 99.44% of the time) that it
does not have any virtual generated columns.

I wonder if a better answer would be to make the rewriter responsible
for this.  If you hold your head at the correct angle, a table with
virtual generated columns looks a good deal like a view, and we don't
ask the planner to handle those.
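
For concreteness, the expansion in question replaces references to a virtual
generated column with its generation expression, roughly as sketched below
(the table is only an illustration, using the PG 18 virtual generated column
syntax):

```
CREATE TABLE t (a int NOT NULL, b int GENERATED ALWAYS AS (a * 2) VIRTUAL);

-- A query such as
SELECT b FROM t WHERE b IS NOT NULL;
-- is expanded to something equivalent to
SELECT a * 2 FROM t WHERE (a * 2) IS NOT NULL;
```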

BTW, in my mind the current thread is certainly v19 material,
so I have not looked at Richard's patch yet.

regards, tom lane




Re: Statistics Import and Export

2025-05-22 Thread Tom Lane
Nathan Bossart  writes:
> On Thu, May 22, 2025 at 10:20:16AM -0400, Robert Haas wrote:
>> It also sort
>> of looks like we might have a consensus anyway: Jeff said "I lean
>> towards making it opt-in for pg_dump and opt-out for pg_upgrade" and I
>> agree with that and it seems you do, too. So perhaps Jeff should make
>> it so?

> +1, I think we should go ahead and do this.  If Jeff can't get to it, I'm
> happy to pick it up in the next week or so.

Works for me, too.

regards, tom lane




Re: Tree-walker callbacks vs -Wdeprecated-non-prototype

2025-05-22 Thread Andres Freund
Hi,

On 2022-12-13 14:18:48 +1300, Thomas Munro wrote:
> On Mon, Dec 12, 2022 at 4:43 PM Thomas Munro  wrote:
> > On Mon, Dec 12, 2022 at 4:07 PM Tom Lane  wrote:
> > > I'm for "turn the warning off".  Per previous discussion, adhering
> > > strictly to that rule would make our code worse (less legible AND
> > > less safe), not better.
> >
> > Alright, this seems to do the trick here.
> 
> That did fix that problem.  But... seawasp also just recompiled its
> compiler and picked up new opaque pointer API changes.  So no green
> today.  I have more work to do to fix that, which might take some time
> to get back to.

Presumably due to conflict resolution, this commit added an empty meson.build
to REL_13_STABLE. I'll remove that...

Greetings,

Andres Freund




Re: [Util] Warn and Remove Invalid GUCs

2025-05-22 Thread Srinath Reddy Sadipiralla
On Thu, May 22, 2025 at 9:00 PM Tom Lane  wrote:

> Srinath Reddy Sadipiralla  writes:
> > The extension is loaded, and then I entered the bogus extension GUC into
> > postgresql.conf and restarted; I did not observe any complaint/warning.
>
> Were you looking in the right place?  I experimented with adding
>
> shared_preload_libraries = 'plpgsql'# (change requires restart)
> plpgsql.bogus = 1
>
> to postgresql.conf.  Restarting the server, I see in the
> postmaster log:
>
> 2025-05-22 11:16:45.724 EDT [1526138] WARNING:  invalid configuration
> parameter name "plpgsql.bogus", removing it
> 2025-05-22 11:16:45.724 EDT [1526138] DETAIL:  "plpgsql" is now a reserved
> prefix.
> 2025-05-22 11:16:45.728 EDT [1526138] LOG:  starting PostgreSQL 18beta1 on
> x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat
> 8.5.0-26), 64-bit
> ... etc etc ...
>
>
Tom, the problem is when the prefix itself is a typo. My bad, I should have
said bogus prefix instead of bogus GUC. Can you please try again with
"lpgsql.bogus = 1"?

-- 
Thanks,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/


Re: [Util] Warn and Remove Invalid GUCs

2025-05-22 Thread Tom Lane
Srinath Reddy Sadipiralla  writes:
> Tom, the problem is when the prefix itself is a typo. My bad, I should have
> said bogus prefix instead of bogus GUC. Can you please try again with
> "lpgsql.bogus = 1"?

[ shrug... ] How do you know that's a bogus prefix?  It could
perfectly well be a fully valid setting for an extension that
the installation doesn't choose to preload.

regards, tom lane




Re: [Util] Warn and Remove Invalid GUCs

2025-05-22 Thread Robert Haas
On Thu, May 22, 2025 at 12:10 PM Shaik Mohammad Mujeeb
 wrote:
> In my patch, I currently warn and remove invalid GUCs from the hashtable. 
> However, as you rightly pointed out, some of these could belong to valid but 
> unregistered prefixes. In such cases, it might not be ideal to remove them 
> outright. Instead, it could be more helpful to simply warn the user - 
> covering both potential typos and GUCs with valid yet unregistered prefixes.
>
> I do understand that not everyone may prefer seeing such warnings during PG 
> server restart. To address this, we could introduce a new GUC (perhaps named 
> warn_on_unregistered_guc_prefix), which defaults to false, preserving the 
> existing behaviour. If explicitly enabled, it would emit warnings for these 
> cases, giving users the choice to opt in to this feedback.

I think you might be missing the point of the comments from Tom and
David. To the extent that it is possible to give warnings, we already
do. So this proposal just doesn't really make sense. It either warns
in cases where there is no actual problem, or it gives a duplicate
warning in cases where there is. Changing the details of the proposal
doesn't address that fundamental problem.

I would really encourage you to spend a bit more time trying to
understand the current design intention and behavior before proposing
changes. It actually makes a lot of sense. It is not perfect, but if
there were a simple way to do better we would have likely done that a
long time ago.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [Util] Warn and Remove Invalid GUCs

2025-05-22 Thread Tom Lane
Shaik Mohammad Mujeeb  writes:
> In my patch, I currently warn and remove invalid GUCs from the hashtable. 
> However, as you rightly pointed out, some of these could belong to valid but 
> unregistered prefixes. In such cases, it might not be ideal to remove them 
> outright. Instead, it could be more helpful to simply warn the user - 
> covering both potential typos and GUCs with valid yet unregistered prefixes.

I kind of doubt that warnings at startup are all that helpful.
People don't look at the postmaster log all that often, and
by the time they start wondering why something isn't acting
as they expect, the log file that had the info may have been
rotated out of existence.

I doubt even more that removing GUCs just because they are
not registered prefixes is sane.

Let me suggest a different way of thinking about this: what
say we extend the pg_file_settings view to mark custom GUCs
that don't correspond to any reserved extension prefix?
"Not a known extension" could be one of the "error" messages
that that view provides.
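
Something like the query below would then surface the problem; the query runs
against the existing view as-is, and the "Not a known extension" message is
only a sketch of what the new error text might look like:

```
SELECT sourceline, name, setting, error
FROM pg_file_settings
WHERE error IS NOT NULL;
```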

regards, tom lane




Re: Minor adjustment to pg_aios output naming

2025-05-22 Thread Andres Freund
Hi,

On 2025-05-21 15:43:01 -0400, Robert Haas wrote:
> On Tue, May 20, 2025 at 10:45 PM Michael Paquier  wrote:
> > I think that your suggestion of fix is right.  The properties are
> > marked as "WRITEV" and "READV" for vectored operations.  So the
> > documentation is right, not the name used in the code.  Will fix,
> > thanks for the report.
> 
> Since this was Andres's commit, I think it would have been a good idea
> to give at least 24 hours, or better yet a couple of days, for him to
> comment before committing something.

I think the change is fine, but I was also a bit surprised by how quickly the
change went in...

Greetings,

Andres Freund




Re: [Util] Warn and Remove Invalid GUCs

2025-05-22 Thread Shaik Mohammad Mujeeb
Hi David J,

> Because any such setting name is perfectly valid (it serves as a placeholder) 
> and whether it is a typo or just some valid unregistered prefix is not 
> something the system can know.

In my patch, I currently warn and remove invalid GUCs from the hashtable. 
However, as you rightly pointed out, some of these could belong to valid but 
unregistered prefixes. In such cases, it might not be ideal to remove them 
outright. Instead, it could be more helpful to simply warn the user - covering 
both potential typos and GUCs with valid yet unregistered prefixes.

I do understand that not everyone may prefer seeing such warnings during PG 
server restart. To address this, we could introduce a new GUC (perhaps named 
warn_on_unregistered_guc_prefix), which defaults to false, preserving the 
existing behaviour. If explicitly enabled, it would emit warnings for these 
cases, giving users the choice to opt in to this feedback.


Thoughts on this approach?


 
Thanks & Regards,
Shaik Mohammad Mujeeb
Member Technical Staff
Zoho Corp

 On Thu, 22 May 2025 02:01:25 +0530 David G. Johnston 
 wrote ---



On Wednesday, May 21, 2025, Shaik Mohammad Mujeeb 
 wrote:

Currently, if there's a typo in an extension name while adding a GUC to
postgresql.conf, PostgreSQL server starts up silently without any warning.

Because any such setting name is perfectly valid (it serves as a placeholder) 
and whether it is a typo or just some valid unregistered prefix is not 
something the system can know.



David J.

Re: PATCH: jsonpath string methods: lower, upper, initcap, l/r/btrim, replace, split_part

2025-05-22 Thread Florents Tselai



> On 22 May 2025, at 5:05 PM, Robert Haas  wrote:
> 
> On Wed, May 21, 2025 at 2:31 PM Tom Lane  wrote:
>> Having said that, what's wrong with inventing some improved function
>> names and never removing the old ones?
> 
> I don't particularly like the clutter, but if the consensus is that
> the clutter doesn't matter, fair enough.
> 

It really depends on how much future work we expect in adding more methods to
jsonpath. I think there's a lot of potential there, but that's really a guess.

On David's point about popularity:
In my experience, timestamp-related values from jsonb documents end up in a
generated column and are indexed and queried there.
I expect that to continue from PG 18 onwards, as we'll have virtual generated
columns too.

Just to be clear, though, adding another version of these functions means
we'll have an additional (now third) set of the same 5 functions: the vanilla
versions (considered stable), the suffixed *_tz ones, and now *_volatile (?):

jsonb_path_exists
jsonb_path_query
jsonb_path_query_array
jsonb_path_query_first
jsonb_path_match
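
For illustration, the existing duplication already looks like the pair below;
the two variants take the same arguments and differ only in volatility, so a
third set would presumably follow the same pattern:

```
SELECT jsonb_path_exists('{"a": 1}', '$.a ? (@ > 0)');     -- vanilla variant
SELECT jsonb_path_exists_tz('{"a": 1}', '$.a ? (@ > 0)');  -- *_tz variant, same signature
```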










Re: Relstats after VACUUM FULL and CLUSTER

2025-05-22 Thread Erik Nordström
Hi Sami,

You need a concurrent transaction to recreate the situation. I am attaching
an isolation test to show the behavior, along with its output file. I ran
it on PostgreSQL 17.4.

The test has two permutations: the first one runs on a table without an
index, and the second with an index added. In the output file you can see
that the VACUUM FULL on the index-less table produces a reltuples count that
includes all updated tuples plus the old/garbage tuples. In other words, it
counts all tuples visible to any currently ongoing transaction. If the table
has an index, the behavior is different, because the reindex that happens as
the last step of VACUUM FULL overwrites the first reltuples count with the
"correct" number (as visible to the transaction snapshot).
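
In plain psql terms (using the table from the attached spec), the scenario is
roughly: session 1 takes a snapshot and holds it open while session 2 updates
the table and runs VACUUM FULL:

```
-- session 1
START TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM pg_attribute;          -- just acquires a snapshot

-- session 2
UPDATE stats SET a = 1 WHERE a > 6;
VACUUM FULL stats;
SELECT reltuples FROM pg_class WHERE relname = 'stats';
```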

Best,
Erik




On Thu, May 22, 2025 at 5:04 PM Sami Imseih  wrote:

> > Does this seem like a bug or is it intentional?
>
> pg_class.reltuples/relpages are only an estimate as per documentation.
>
> However, I cannot reproduce the situation you are talking about on HEAD.
> In the example below, I create a table without indexes, then insert and
> delete some rows, run vacuum to update pg_class.reltuples, and then run
> another delete to generate some more "recent" dead tuples.
>
> The result shows pg_class.reltuples with the expected value,
> but maybe I did not repro the same way you did?
>
> (I am surprised that n_live_tup and n_dead_tup are off, and also that
> VACUUM FULL does not appear to update the stats in pg_stat_all_tables.)
>
> ```
> postgres=# drop table if exists t;
> create table t ( id int );
> alter table t set (autovacuum_enabled = off);
> insert into t select n from generate_series(1, 1000000) as n;
> delete from t where id between 1 and 5000;
> vacuum t;
> delete from t where id between 5001 and 10000;
> select reltuples::int from pg_class where relname = 't';
> -- might take a bit of time for n_dead_tup to be set
> select n_dead_tup, n_live_tup from pg_stat_all_tables where relname = 't';
> DROP TABLE
> CREATE TABLE
> ALTER TABLE
> INSERT 0 1000000
> DELETE 5000
> VACUUM
> DELETE 5000
>  reltuples
> -----------
>     995000
> (1 row)
>
>  n_dead_tup | n_live_tup
> ------------+------------
>       10000 |     985000
> (1 row)
>
> postgres=# VACUUM (verbose, full) t;
> INFO:  vacuuming "public.t"
> INFO:  "public.t": found 5000 removable, 990000 nonremovable row
> versions in 4425 pages
> DETAIL:  0 dead row versions cannot be removed yet.
> CPU: user: 0.79 s, system: 0.02 s, elapsed: 0.86 s.
> VACUUM
>
> select reltuples::int from pg_class where relname = 't';
> select n_dead_tup from pg_stat_all_tables where relname = 't';
>
> postgres=# select reltuples::int from pg_class where relname = 't';
> select n_dead_tup, n_live_tup from pg_stat_all_tables where relname = 't';
>  reltuples
> -----------
>     990000
> (1 row)
>
> postgres=# select n_dead_tup, n_live_tup from pg_stat_all_tables where
> relname = 't';
>  n_dead_tup | n_live_tup
> ------------+------------
>       10000 |     985000
> (1 row)
>
> --
> Sami Imseih
> Amazon Web Services (AWS)
>
# Test for the pg_class.reltuples value computed by VACUUM FULL while a
# concurrent transaction still holds a snapshot that can see the old tuple
# versions.
#
# The second permutation adds an index, in which case the reindex performed
# as the last step of VACUUM FULL overwrites the reltuples count.

setup
{
	CREATE TABLE stats (a INT);
	INSERT INTO stats SELECT generate_series(1, 10);
}

teardown
{
	DROP TABLE IF EXISTS stats;
}

session s1
step s1_query
{
	START TRANSACTION ISOLATION LEVEL REPEATABLE READ;
	SELECT count(*) FROM pg_attribute;
}
step s1_commit
{
COMMIT;
}

session s2
step s2_update { UPDATE stats SET a=1 WHERE a > 6; } 
step s2_vac_full		{ VACUUM FULL stats; }
step s2_reltuples  { SELECT reltuples FROM pg_class WHERE relname = 'stats'; }
step s2_create_index { CREATE INDEX ON stats (a); }

permutation s2_vac_full s2_reltuples s1_query s2_update s2_reltuples s2_vac_full s1_commit s2_reltuples
permutation s2_create_index s2_vac_full s2_reltuples s1_query s2_update s2_reltuples s2_vac_full s1_commit s2_reltuples


vacuum-full-stats.out
Description: Binary data


Re: Statistics Import and Export

2025-05-22 Thread Jeff Davis
On Thu, 2025-05-22 at 10:20 -0400, Robert Haas wrote:
> Yeah. This could use comments from a few more people, but I really
> hope we don't ship the final release this way. We do have a "Enable
> statistics in pg_dump by default" item in the open items list under
> "Decisions to Recheck Mid-Beta", but that's arguably now. It also
> sort
> of looks like we might have a consensus anyway: Jeff said "I lean
> towards making it opt-in for pg_dump and opt-out for pg_upgrade" and
> I
> agree with that and it seems you do, too. So perhaps Jeff should make
> it so?

Patch attached.

A couple minor points:

 * The default for pg_restore is --no-statistics. That could cause a
minor surprise if the user specifies --with-statistics for pg_dump and
not for pg_restore. An argument could be made that "if the stats are
there, restore them", and I don't have a strong opinion about this
point, but defaulting to --no-statistics seems more consistent with
pg_dump.

 * I added --with-statistics to most of the pg_dump tests. We can be
more judicious about which tests exercise statistics as a separate
commit, but I didn't want to change the test results as a part of this
commit.

Regards,
Jeff Davis

From b76cb91441e2eefe278249e23fcd703d27a85a06 Mon Sep 17 00:00:00 2001
From: Jeff Davis 
Date: Thu, 22 May 2025 11:03:03 -0700
Subject: [PATCH v1] Change defaults for statistics export.

Set the default behavior of pg_dump, pg_dumpall, and pg_restore to be
--no-statistics. Leave the default for pg_upgrade to be
--with-statistics.

Discussion: https://postgr.es/m/CA+TgmoZ9=rnwccozikyyjzs_aw1p4qxcw--h4dollhuf1om...@mail.gmail.com
---
 src/bin/pg_dump/pg_backup_archiver.c |  4 +-
 src/bin/pg_dump/t/002_pg_dump.pl | 59 
 src/bin/pg_upgrade/dump.c|  2 +-
 src/bin/pg_upgrade/pg_upgrade.c  |  6 ++-
 4 files changed, 66 insertions(+), 5 deletions(-)

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index afa42337b11..a66d88bbc51 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -152,7 +152,7 @@ InitDumpOptions(DumpOptions *opts)
 	opts->dumpSections = DUMP_UNSECTIONED;
 	opts->dumpSchema = true;
 	opts->dumpData = true;
-	opts->dumpStatistics = true;
+	opts->dumpStatistics = false;
 }
 
 /*
@@ -1101,7 +1101,7 @@ NewRestoreOptions(void)
 	opts->compression_spec.level = 0;
 	opts->dumpSchema = true;
 	opts->dumpData = true;
-	opts->dumpStatistics = true;
+	opts->dumpStatistics = false;
 
 	return opts;
 }
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index cf34f71ea11..386e21e0c59 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -68,6 +68,7 @@ my %pgdump_runs = (
 			'--no-data',
 			'--sequence-data',
 			'--binary-upgrade',
+			'--with-statistics',
 			'--dbname' => 'postgres',# alternative way to specify database
 		],
 		restore_cmd => [
@@ -75,6 +76,7 @@ my %pgdump_runs = (
 			'--format' => 'custom',
 			'--verbose',
 			'--file' => "$tempdir/binary_upgrade.sql",
+			'--with-statistics',
 			"$tempdir/binary_upgrade.dump",
 		],
 	},
@@ -88,11 +90,13 @@ my %pgdump_runs = (
 			'--format' => 'custom',
 			'--compress' => '1',
 			'--file' => "$tempdir/compression_gzip_custom.dump",
+			'--with-statistics',
 			'postgres',
 		],
 		restore_cmd => [
 			'pg_restore',
 			'--file' => "$tempdir/compression_gzip_custom.sql",
+			'--with-statistics',
 			"$tempdir/compression_gzip_custom.dump",
 		],
 		command_like => {
@@ -115,6 +119,7 @@ my %pgdump_runs = (
 			'--format' => 'directory',
 			'--compress' => 'gzip:1',
 			'--file' => "$tempdir/compression_gzip_dir",
+			'--with-statistics',
 			'postgres',
 		],
 		# Give coverage for manually compressed blobs.toc files during
@@ -132,6 +137,7 @@ my %pgdump_runs = (
 			'pg_restore',
 			'--jobs' => '2',
 			'--file' => "$tempdir/compression_gzip_dir.sql",
+			'--with-statistics',
 			"$tempdir/compression_gzip_dir",
 		],
 	},
@@ -144,6 +150,7 @@ my %pgdump_runs = (
 			'--format' => 'plain',
 			'--compress' => '1',
 			'--file' => "$tempdir/compression_gzip_plain.sql.gz",
+			'--with-statistics',
 			'postgres',
 		],
 		# Decompress the generated file to run through the tests.
@@ -162,11 +169,13 @@ my %pgdump_runs = (
 			'--format' => 'custom',
 			'--compress' => 'lz4',
 			'--file' => "$tempdir/compression_lz4_custom.dump",
+			'--with-statistics',
 			'postgres',
 		],
 		restore_cmd => [
 			'pg_restore',
 			'--file' => "$tempdir/compression_lz4_custom.sql",
+			'--with-statistics',
 			"$tempdir/compression_lz4_custom.dump",
 		],
 		command_like => {
@@ -189,6 +198,7 @@ my %pgdump_runs = (
 			'--format' => 'directory',
 			'--compress' => 'lz4:1',
 			'--file' => "$tempdir/compression_lz4_dir",
+			'--with-statistics',
 			'postgres',
 		],
 		# Verify that data files were compressed
@@ -200,6 +210,7 @@ my %pgdump_runs = (
 			'pg_restore',
 			'--jobs' => '2',
 		

Re: Adding skip scan (including MDAM style range skip scan) to nbtree

2025-05-22 Thread Matthias van de Meent
On Tue, 20 May 2025, 22:14 Peter Geoghegan,  wrote:
>
> On Mon, May 12, 2025 at 8:58 AM Peter Geoghegan  wrote:
> > I wonder if we can fix this problem by getting rid of the old support
> > routine #5, "options". It currently doesn't do anything, and I always
> > thought it was strange that it was added "for uniformity" with other
> > index AMs.
>
> Attached patch completely removes the nbtree "options" support
> function, while changing the support function number of skip support:
> it becomes support function #5 (the number previously used by
> "options"). This patch should fix the regression that Tomas complained
> about in an expedient way.

> It's likely that somebody else will run into the same problem in the
> future, the next time that a new support function is needed. But I
> think that it makes sense to do this much now -- we need a short term
> solution for Postgres 18.

Didn't we have different solutions already? E.g. dropping the
allocation requirement from IndexAM (and indeed, making it
non-required), as seen in my 0001? It's a bit of a hassle, but it has
been shown to have the same effect in PG18b1, doesn't break compatibility
with existing amproc registrations, and even improves the memory
situation for all other index types.

> Usually I would never suggest breaking
> compatibility like this, but, remarkably, we have never actually done
> anything with our current support function 5. It's not possible to
> break compatibility with code that can never be called in the first
> place, so I see no compatibility to preserve.

I disagree. An extension or user may well have registered FUNCTION 5
for their opclass or opfamily as a blanket future-proofed
implementation, which will now break after pg_upgrade. We never
rejected this registration before, and while there won't be issues
with the proc signature across the versions (options' expected
signature is equivalent to that of skipsupport), it will probably
cause issues when it's called with unexpected inputs.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)




Re: Statistics Import and Export

2025-05-22 Thread Hari Krishna Sunder
Thanks for the help. This has unblocked us!

On Thu, May 22, 2025 at 8:25 AM Nathan Bossart 
wrote:

> On Wed, May 21, 2025 at 04:53:17PM -0700, Jeff Davis wrote:
> > On Wed, 2025-05-21 at 16:29 -0500, Nathan Bossart wrote:
> >> I don't know precisely where that line might be, but in this case,
> >> the
> >> dumped stats have no hope of restoring into anything older than
> >> v18... But I see no particular benefit from moving the complexity
> >> to the
> >> import side here.
> >
> > That's fine with me. Perhaps we should just say that pre-18 behavior
> > differences can be fixed up during export, and post-18 behavior
> > differences are fixed up during import?
>
> WFM.  I've committed the patch.
>
> --
> nathan
>


Re: Statistics Import and Export

2025-05-22 Thread Greg Sabino Mullane
On Thu, May 22, 2025 at 2:52 PM Jeff Davis  wrote:

>  * The default for pg_restore is --no-statistics. That could cause a minor
> surprise if the user specifies --with-statistics for pg_dump and
> not for pg_restore. An argument could be made that "if the stats are
> there, restore them", and I don't have a strong opinion about this point,
> but defaulting to --no-statistics seems more consistent with pg_dump.
>

Hm...somewhat to my own surprise, I don't like this. If it's in the dump,
restore it.

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support


Re: Statistics Import and Export

2025-05-22 Thread Nathan Bossart
On Thu, May 22, 2025 at 03:29:38PM -0400, Greg Sabino Mullane wrote:
> On Thu, May 22, 2025 at 2:52 PM Jeff Davis  wrote:
>>  * The default for pg_restore is --no-statistics. That could cause a minor
>> surprise if the user specifies --with-statistics for pg_dump and
>> not for pg_restore. An argument could be made that "if the stats are
>> there, restore them", and I don't have a strong opinion about this point,
>> but defaulting to --no-statistics seems more consistent with pg_dump.
> 
> Hm...somewhat to my own surprise, I don't like this. If it's in the dump,
> restore it.

+1, I think defaulting to restoring everything in the dump file is much
less surprising than the alternative.

-- 
nathan




Re: Statistics Import and Export

2025-05-22 Thread Robert Haas
On Thu, May 22, 2025 at 3:36 PM Nathan Bossart  wrote:
> +1, I think defaulting to restoring everything in the dump file is much
> less surprising than the alternative.

+1.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Avoid orphaned objects dependencies, take 3

2025-05-22 Thread Jeff Davis
On Thu, 2025-05-22 at 09:40 -0400, Robert Haas wrote:

> Pushing the locking down into recordDependencyOn amounts to hoping
> that we don't need to study each code path in detail and decide on
> the
> exactly right place to acquire the lock.

There are (by my rough count) over 250 call sites modified by the v19
patch. I fear that, if each of those call sites needs to be studied for
the kinds of subtle issues you are describing, then we are likely to
make a mistake. If not now, then in the future as new features change
those call sites.

Bertrand, what pattern is safe to follow for most call sites? Which
call sites are the most interesting ones that need special attention?

Regards,
Jeff Davis





Re: Statistics Import and Export

2025-05-22 Thread Tom Lane
Greg Sabino Mullane  writes:
> On Thu, May 22, 2025 at 2:52 PM Jeff Davis  wrote:
>> * The default for pg_restore is --no-statistics. That could cause a minor
>> surprise if the user specifies --with-statistics for pg_dump and
>> not for pg_restore.

> Hm...somewhat to my own surprise, I don't like this. If it's in the dump,
> restore it.

Yeah, I tend to lean that way too.  If the user went out of their way
to say --with-statistics for pg_dump, how likely is it that they
don't want the statistics restored?

Another argument pointing in that direction is that the definition
Jeff proposes creates an inconsistency in the output between text
mode:

pg_dump --with-statistics ... | psql

and non-text mode:

pg_dump -Fc --with-statistics ... | pg_restore

There is no additional filter in text mode, so I think pg_restore's
default behavior should also be "no additional filter".

regards, tom lane




Re: Make wal_receiver_timeout configurable per subscription

2025-05-22 Thread Amit Kapila
On Wed, May 21, 2025 at 6:04 PM Fujii Masao  wrote:
>
> On 2025/05/20 18:13, vignesh C wrote:
> > If we set the wal_receiver_timeout configuration using ALTER ROLE for
> > the subscription owner's role, the apply worker will start with that
> > value. However, any changes made via ALTER ROLE ... SET
> > wal_receiver_timeout will not take effect for an already running apply
> > worker unless the subscription is disabled and re-enabled. In
> > contrast, this is handled automatically during CREATE SUBSCRIPTION,
> > where parameter changes are detected.
>
> Yes, this is one of the limitations of the user-settable wal_receiver_timeout
> approach. If we want to change the timeout used by the apply worker without
> restarting it, storing the value in pg_subscription (similar to how
> synchronous_commit is handled) would be a better solution.
>
> In that case, for example, we could set the default value of
> pg_subscription.wal_receiver_timeout to -1, meaning the apply worker should
> use the global wal_receiver_timeout from postgresql.conf. If the value is 0
> or higher, the apply worker would use the value stored in pg_subscription.
>

Yeah, I had a similar idea in my mind.

>
> On further thought, another downside of the user-settable approach is that
> it doesn't work for parameters like wal_retrieve_retry_interval, which is
> used by the logical replication launcher not the apply worker. So if we
> want to support per-subscription control for non-apply workers, storing
> the settings in pg_subscription might be more appropriate.
>

Yeah, that could be an option, but one might not want to keep such
variables different for each subscription. Do you think one would
prefer specifying variables that apply only to the subscriber node
in a way other than a GUC? I always have this question whenever I see
GUCs like max_sync_workers_per_subscription, which are specific to
subscriber nodes only.
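
For reference, the per-subscription pattern being discussed would presumably
look like the existing synchronous_commit handling; the first statement works
today, while the second is purely hypothetical syntax for this proposal (the
subscription name is made up):

```
ALTER SUBSCRIPTION mysub SET (synchronous_commit = 'on');     -- exists today, stored in pg_subscription
ALTER SUBSCRIPTION mysub SET (wal_receiver_timeout = '60s');  -- hypothetical, per the proposal above
```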

-- 
With Regards,
Amit Kapila.




Re: generic plans and "initial" pruning

2025-05-22 Thread Tomas Vondra
On 5/22/25 10:12, Amit Langote wrote:
> On Wed, May 21, 2025 at 7:22 PM Amit Langote  wrote:
>> Fair enough. I’ll revert this and some related changes shortly.  WIP
>> patch attached.
> 
> I have pushed out the revert now.
> 

Thank you.

> Note that I’ve only reverted the changes related to deferring locks on
> prunable partitions. I’m planning to leave the preparatory commits
> leading up to that one in place unless anyone objects. For reference,
> here they are in chronological order (the last 3 are bug fixes):
> 
> bb3ec16e14d Move PartitionPruneInfo out of plan nodes into PlannedStmt
> d47cbf474ec Perform runtime initial pruning outside ExecInitNode()
> cbc127917e0 Track unpruned relids to avoid processing pruned relations
> 75dfde13639 Fix an oversight in cbc127917 to handle MERGE correctly
> cbb9086c9ef Fix bug in cbc127917 to handle nested Append correctly
> 28317de723b Ensure first ModifyTable rel initialized if all are pruned
> 
> I think separating initial pruning from plan node initialization is
> still worthwhile on its own, as evidenced by the improvements in
> cbc127917e.
> 

I'm OK with that in principle, assuming the benefits outweigh the risk
of making backpatching harder. The patches don't seem exceptionally
large / invasive, but I don't know how often we modify these parts.


regards

-- 
Tomas Vondra





Re: Avoid orphaned objects dependencies, take 3

2025-05-22 Thread Robert Haas
On Thu, May 22, 2025 at 8:15 AM Bertrand Drouvot
 wrote:
> That would probably address Robert's concern [1] "acquiring two locks on the 
> same
> object with different lock modes where we should really only acquire one" but
> probably not this one "I think it might result in acquiring the
> locks on those other objects at a subtly wrong time (leading to race
> conditions)".
>
> For the latter I'm not sure how that could be a subtly wrong time or how could
> we determine what a subtly wrong time is (to avoid taking the lock).

Well, I admit I'm not quite sure there is any timing problem here. But
I think it's possible. For example, consider RangeVarGetRelidExtended,
and in particular the callback argument. If we do permissions checking
before locking the relation, the results can change before we actually
lock the relation; if we do it afterward, users can contrive to hold
locks on relations for which they don't have permissions. That rather
ornate callback system allows us to have the best of both worlds. That
system is also concerned with the possibility that we do a name -> OID
translation before holding a lock and by the time we acquire the lock,
concurrent DDL has happened and the answer we got is no longer the
right answer. We've had security holes due to such things.

Now I'm not entirely sure that either of those things are issues here.
My first guess is that name lookup races are not an issue here,
because we're following OID links, but permissions checks seem like
they might be an issue. We might not decide to do something as
elaborate as we did with RangeVarGetRelidExtended and ... maybe that's
OK? But in general I am skeptical of the conclusion that we can just
shove all the locking down into some subordinate layer and nothing
will go wrong, because locking doesn't exist in a vacuum -- it relates
to other things that we also need to do, and whether we do the locking
before or after other steps can affect semantics and even security.

Pushing the locking down into recordDependencyOn amounts to hoping
that we don't need to study each code path in detail and decide on the
exactly right place to acquire the lock. It amounts to hoping that
wherever the recordDependencyOn call is located, it's before things
that we want the locking to happen before and after the things that we
want the locking to happen after. And maybe that's true. Or maybe it's
close enough to true that it's still better than the status quo where
we're not taking locks at all. But on the other hand, since I can't
think of any compelling reason why it HAS to be true, maybe it isn't.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: PATCH: jsonpath string methods: lower, upper, initcap, l/r/btrim, replace, split_part

2025-05-22 Thread Robert Haas
On Wed, May 21, 2025 at 2:31 PM Tom Lane  wrote:
> Having said that, what's wrong with inventing some improved function
> names and never removing the old ones?

I don't particularly like the clutter, but if the consensus is that
the clutter doesn't matter, fair enough.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Reduce "Var IS [NOT] NULL" quals during constant folding

2025-05-22 Thread Robert Haas
On Thu, May 1, 2025 at 4:33 AM Richard Guo  wrote:
> Here is the patchset that implements this optimization.  0001 moves
> the expansion of virtual generated columns to occur before sublink
> pull-up.  0002 introduces a new function, preprocess_relation_rtes,
> which scans the rangetable for relation RTEs and performs inh flag
> updates and virtual generated column expansion in a single loop, so
> that only one table_open/table_close call is required for each
> relation.  0003 collects NOT NULL attribute information for each
> relation within the same loop, stores it in a relation OID based hash
> table, and uses this information to reduce NullTest quals during
> constant folding.
>
> I think the code now more closely resembles the phase 1 and phase 2
> described earlier: it collects all required early-stage catalog
> information within a single loop over the rangetable, allowing each
> relation to be opened and closed only once.  It also avoids the
> has_subclass() call along the way.

Before we commit to something along these lines, I think we need to
understand whether Tom intends to press Peter for some bigger change
around expand_virtual_generated_columns.

If Tom doesn't respond right away, I suggest that we need to add an
open item for http://postgr.es/m/602561.1744314...@sss.pgh.pa.us

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Relstats after VACUUM FULL and CLUSTER

2025-05-22 Thread Sami Imseih
> Does this seem like a bug or is it intentional?

pg_class.reltuples/relpages are only an estimate as per documentation.

However, I cannot reproduce the situation you are talking about on HEAD.
In the example below, I create a table without indexes, then insert and
delete some rows, run vacuum to update pg_class.reltuples, and then run
another delete to generate some more "recent" dead tuples.

The result shows pg_class.reltuples with the expected value,
but maybe I did not repro the same way you did?

(I am surprised that n_live_tup and n_dead_tup are off, and also that
VACUUM FULL does not appear to update the stats in pg_stat_all_tables.)

```
postgres=# drop table if exists t;
create table t ( id int );
alter table t set (autovacuum_enabled = off);
insert into t select n from generate_series(1, 1000000) as n;
delete from t where id between 1 and 5000;
vacuum t;
delete from t where id between 5001 and 10000;
select reltuples::int from pg_class where relname = 't';
-- might take a bit of time for n_dead_tup to be set
select n_dead_tup, n_live_tup from pg_stat_all_tables where relname = 't';
DROP TABLE
CREATE TABLE
ALTER TABLE
INSERT 0 1000000
DELETE 5000
VACUUM
DELETE 5000
 reltuples
-----------
    995000
(1 row)

 n_dead_tup | n_live_tup
------------+------------
      10000 |     985000
(1 row)

postgres=# VACUUM (verbose, full) t;
INFO:  vacuuming "public.t"
INFO:  "public.t": found 5000 removable, 990000 nonremovable row
versions in 4425 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU: user: 0.79 s, system: 0.02 s, elapsed: 0.86 s.
VACUUM

select reltuples::int from pg_class where relname = 't';
select n_dead_tup from pg_stat_all_tables where relname = 't';

postgres=# select reltuples::int from pg_class where relname = 't';
select n_dead_tup, n_live_tup from pg_stat_all_tables where relname = 't';
 reltuples
-----------
    990000
(1 row)

postgres=# select n_dead_tup, n_live_tup from pg_stat_all_tables where
relname = 't';
 n_dead_tup | n_live_tup
------------+------------
      10000 |     985000
(1 row)

--
Sami Imseih
Amazon Web Services (AWS)




Re: RFC: Logging plan of the running query

2025-05-22 Thread Robert Haas
On Tue, May 20, 2025 at 9:18 AM torikoshia  wrote:
> I tackled this again and the attached patch removes ExecProcNodeOriginal
> from Planstate.
> Instead of adding a new field, this version builds the behavior into the
> existing wrapper function, ExecProcNodeFirst().
>
> Since ExecProcNodeFirst() is already handling instrumentation-related
> logic, the patch has maybe become a bit more complex to accommodate both
> that and the new behavior.
>
> While it might make sense to introduce a more general mechanism that
> allows for stacking an arbitrary number of wrappers around ExecProcNode,
> I’m not sure it's possible or worth the added complexity—such layered
> wrapping doesn't seem like something we typically need.
>
> What do you think?

Hmm, I'm not convinced that this is correct. If
GetProcessLogQueryPlanInterruptActive() is true but node->instrument
is also non-NULL, your implementation of ExecProcNodeFirst() will
handle the first but ignore the second. Plus, I don't understand why
ExecProcNodeFirst() does node->ExecProcNode = ExecProcNodeWithExplain
instead of just directly doing the necessary work. It seems to me that
this will result in the first call to ExecProcNode calling
ExecProcNodeFirst to install ExecProcNodeWithExplain; and then the
second call to ExecProcNode will call ExecProcNodeWithExplain which
will actually do the work. But this seems unnecessary to me: I think
it could just say if (GetProcessLogQueryPlanInterruptActive())
LogQueryPlan(ps).

Backing up a step, I think you've got a good idea here in thinking
that we can probably reuse ExecProcNodeFirst for this purpose. It
appears to me that it is always valid to reset a node's ExecProcNode
pointer back to ExecProcNodeFirst. It might be unnecessary and cost
some performance to do it when not required, but it is safe, because
ExecProcNodeFirst will simply reset node->ExecProcNode and then call
the appropriate function. So what we can do is just add the additional
logic to check whether we need to print the query plan after the
check_stack_depth() call and before the rest of the logic in the
function. I've attached a sample patch to show what I have in mind.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


example-execprocnodefirst-patch.diff
Description: Binary data


Re: making EXPLAIN extensible

2025-05-22 Thread Robert Haas
On Thu, May 22, 2025 at 10:29 AM Andrei Lepikhov  wrote:
> > I don't think this is really true. It's just standard identifier
> > handling. You can have options with upper-case names if you quote
> > them.
> Sorry for my language. I meant that when you call function
> RegisterExtensionExplainOption(), it make sense if you write parameter
> `option_name` as "debug" or "deBug". On the user side
> case-insensitiveness works correctly, of course. Not sure about side
> effects if one extension will call this routine with "Debug" parameter
> and another one - "debuG".

I guess my point is that this works just like other cases where SQL
identifiers are accessible in C code: they're normally lower-case but
they don't have to be if the user quoted them. I don't think it's
worth adding a comment about that general behavior to this specific
bit of code.
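
As a rough illustration (assuming an extension such as pg_overexplain has
been loaded and has registered a lower-case "debug" option):

```
LOAD 'pg_overexplain';

-- Unquoted option names are folded to lower case, so these both match
-- the option registered as "debug":
EXPLAIN (debug) SELECT 1;
EXPLAIN (DeBuG) SELECT 1;

-- A quoted name keeps its case, so this would only match an option
-- that had been registered as "Debug":
EXPLAIN ("Debug") SELECT 1;  -- ERROR:  unrecognized EXPLAIN option "Debug"
```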

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: POC: Parallel processing of indexes in autovacuum

2025-05-22 Thread Sami Imseih
I started looking at the patch but I have some high level thoughts I would
like to share before looking further.

> > I find that the name "autovacuum_reserved_workers_num" is generic. It
> > would be better to have a more specific name for parallel vacuum such
> > as autovacuum_max_parallel_workers. This parameter is related to
> > neither autovacuum_worker_slots nor autovacuum_max_workers, which
> > seems fine to me. Also, max_parallel_maintenance_workers doesn't
> > affect this parameter.
> > ...
> > I've also considered some alternative names. If we were to use
> > parallel_maintenance_workers, it sounds like it controls the parallel
> > degree for all operations using max_parallel_maintenance_workers,
> > including CREATE INDEX. Similarly, vacuum_parallel_workers could be
> > interpreted as affecting both autovacuum and manual VACUUM commands,
> > suggesting that when users run "VACUUM (PARALLEL) t", the system would
> > use their specified value for the parallel degree. I prefer
> > autovacuum_parallel_workers or vacuum_parallel_workers.
> >
>
> This was my headache when I created names for variables. Autovacuum
> initially implies parallelism, because we have several parallel a/v
> workers. So I think that parameter like
> `autovacuum_max_parallel_workers` will confuse somebody.
> If we want to have a more specific name, I would prefer
> `max_parallel_index_autovacuum_workers`.

I don't think we should have a separate pool of parallel workers for those
used to support parallel autovacuum. At the end of the day, these are
parallel workers and they should be capped by max_parallel_workers. I think
it will be confusing if we claim these are parallel workers but they come
from a different pool.

I envision another GUC such as "max_parallel_autovacuum_workers" (which I
think is a better name) that matches the behavior of
"max_parallel_maintenance_workers". Meaning that the autovacuum workers
still keep their existing behavior (launching one worker per table), and if
they do need to vacuum in parallel, they can draw from a pool of parallel
workers.

With the above said, I therefore think the reloption should actually be a
number of parallel workers rather than a boolean. Take the example of a user
with 3 tables they wish autovacuum to process in parallel, each of them with
4 parallel workers if available. However, so as not to overload the system,
they cap 'max_parallel_maintenance_workers' to something like 8. If all 3
tables happen to be auto-vacuumed at the same time, there may not be enough
parallel workers, so one table will lose out and be vacuumed serially. That
is acceptable, and a/v logging (and perhaps other stat views) should display
this behavior: workers planned vs. workers launched.
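
To sketch that out (the GUC and reloption names below are hypothetical,
just following the naming discussed in this thread):

```
-- postgresql.conf (hypothetical names):
--   max_parallel_workers = 16
--   max_parallel_autovacuum_workers = 8   -- shared pool for all a/v workers

-- reloption as a parallel degree rather than a boolean (hypothetical):
ALTER TABLE big_tbl_1 SET (autovacuum_parallel_workers = 4);
ALTER TABLE big_tbl_2 SET (autovacuum_parallel_workers = 4);
ALTER TABLE big_tbl_3 SET (autovacuum_parallel_workers = 4);

-- If all 3 tables are autovacuumed at once, only 8 parallel workers are
-- available, so one of them falls back to serial index vacuuming, and the
-- a/v log should report workers planned vs. workers launched.
```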

thoughts?


--
Sami Imseih
Amazon Web Services (AWS)




Re: PATCH: jsonpath string methods: lower, upper, initcap, l/r/btrim, replace, split_part

2025-05-22 Thread David E. Wheeler
On May 22, 2025, at 12:38, Florents Tselai  wrote:

> In my experience timestamp related stuff from jsonb documents end up in a 
> generated column,
> and are indexed & queried there.

Have you seen this in the wild using the _tz functions? I wouldn’t think they 
were indexable, given the volatility.
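
For illustration, a minimal sketch (with a made-up table and path) of why
the _tz variants are rejected in an expression index:

```
-- The _tz jsonpath functions are only STABLE (they depend on the TimeZone
-- setting), and expression indexes require IMMUTABLE functions:
CREATE TABLE docs (js jsonb);
CREATE INDEX ON docs ((jsonb_path_query_first_tz(js, '$.ts.datetime()')));
-- ERROR:  functions in index expression must be marked IMMUTABLE
```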

D



signature.asc
Description: Message signed with OpenPGP


Re: Relstats after VACUUM FULL and CLUSTER

2025-05-22 Thread Sami Imseih
> You need a concurrent transaction to recreate the situation. I am attaching 
> an isolation test to show the behavior,

Thanks! That helps.
Indeed, heapam_relation_copy_for_cluster and heapam_index_build_range_scan
count HEAPTUPLE_RECENTLY_DEAD (tuples deleted but not yet removable)
differently. In heapam_relation_copy_for_cluster they are considered live,
and in heapam_index_build_range_scan they are considered dead.

in heapam_relation_copy_for_cluster
```
case HEAPTUPLE_RECENTLY_DEAD:
   *tups_recently_dead += 1;
   /* fall through */
case HEAPTUPLE_LIVE:
   /* Live or recently dead, must copy it */
   isdead = false;
break;
```

In both cases, the recently dead tuples must be copied to the table or
index, but they should not be counted towards reltuples. So, I think we need
to fix this in heapam_relation_copy_for_cluster, probably by subtracting
tups_recently_dead from num_tuples (which is the value set in
pg_class.reltuples) after we process all the tuples; that looks like the
best fix to me.


--
Sami Imseih
Amazon Web Services (AWS)




Re: Assert("vacrel->eager_scan_remaining_successes > 0")

2025-05-22 Thread Masahiko Sawada
On Thu, May 22, 2025 at 7:27 AM Melanie Plageman
 wrote:
>
> On Wed, May 21, 2025 at 6:11 PM Masahiko Sawada  wrote:
> >
> > > if (vacrel->eager_scan_remaining_successes > 0)
> > >  vacrel->eager_scan_remaining_successes--;
> >
> > I've attached a patch that uses this idea. Feedback is very welcome.
>
> Thanks for writing the patch!
>
> I actually think we have the same situation with
> eager_scan_remaining_fails.

Good catch.

> Since the extra pages that are eagerly
> scanned could either fail or succeed to be frozen, so we probably also
> need to change the assert in the failure case into a guard as well:
>
> else
> {
> Assert(vacrel->eager_scan_remaining_fails > 0);
> vacrel->eager_scan_remaining_fails--;
> }
>
> ->
>
>   else if (vacrel->eager_scan_remaining_fails > 0)
>vacrel->eager_scan_remaining_fails--;
>
> In the comment you wrote, I would probably just change one thing
>
> +/*
> + * Report only once that we disabled eager scanning. This
> + * check is required because we might have eagerly read
> + * more blocks and we could reach here even after
> + * disabling eager scanning.
> + */
>
> I would emphasize that we read ahead these blocks before executing the
> code trying to freeze them. So, I might instead say something like:
> "Report only once that we disabled eager scanning. We may eagerly read
> ahead blocks in excess of the success or failure caps before
> attempting to freeze them, so we could reach here even after disabling
> additional eager scanning"
>
> And then probably avoid repeating the whole comment above the
> remaining fails guard.

Agreed. I've updated the patch. Does this address your comments?

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


v2-0001-Fix-assertion-when-decrementing-eager-scanning-su.patch
Description: Binary data


Re: queryId constant squashing does not support prepared statements

2025-05-22 Thread Sami Imseih
> IMO adding a struct as suggested is okay, especially if it reduces the
> overall code complexity.  But we don't want a node, just a bare struct.
> Adding a node would be more troublesome.

In v4, a new private struct is added in gram.y, but we are also adding
additional fields to track the expression boundaries to the required
nodes.

--
Sami




Re: Log connection establishment timings

2025-05-22 Thread Peter Eisentraut

On 21.05.25 21:53, Melanie Plageman wrote:

On Tue, May 20, 2025 at 11:16 AM Melanie Plageman
 wrote:


In earlier versions of my patch, I played around with replacing these references in the docs. I ended up not doing it because I wasn't sure we had 
consensus on deprecating the "on", "true", "yes" options and that we would continue to support them indefinitely. 
Thinking about it now, by no longer documenting "on" and "off", I was obviously deprecating them (not to mention removing support 
for log_connections = "y", "ye", etc).

I'll write a patch to change these.


Attached is a patch that updates these as well as changes all usages
of log_connections in the tests. I made some judgment calls about
where we might want expanded or reduced log_connections aspects. As
such, the patch could use a once-over from someone else.


This looks good to me.





Re: Log connection establishment timings

2025-05-22 Thread Peter Eisentraut

On 20.05.25 17:16, Melanie Plageman wrote:

In src/backend/tcop/postgres.c, there is a call

          SetConfigOption("log_connections", "true", context, source);

that could be adjusted.


Do you think the debug option should be 'all' or a list of the options 
covered by "true" (which is a subset of 'all')?


I think the spirit of this debug option is to show lots of things, so 
mapping it to 'all' makes sense to me.





Re: Assert("vacrel->eager_scan_remaining_successes > 0")

2025-05-22 Thread Melanie Plageman
On Thu, May 22, 2025 at 4:07 PM Masahiko Sawada  wrote:
>
> Agreed. I've updated the patch. Does this address your comments?

Yep. LGTM.

I'd probably just remove the parenthetical remark about 20% from the
commit message since that only applies to the success cap and
referencing both the success and failure caps will make the sentence
very long.

from the commit message:
 pages (beyond 20% of the total number of all-visible but not
 all-frozen pages), this does not pose a practical concern as the

Thanks again for writing the patch!

- Melanie




Re: [Util] Warn and Remove Invalid GUCs

2025-05-22 Thread Greg Sabino Mullane
On Thu, May 22, 2025 at 12:45 PM Tom Lane  wrote:

> > "lpgsql.bogus = 1" .
>
> [ shrug... ] How do you know that's a bogus prefix?  It could perfectly
> well be a fully valid setting for an extension that
> the installation doesn't choose to preload.
>

Well, we do have ways to view all *potential* extensions. I find myself
more sympathetic to the OP than the others here, as I think it's a good
idea to display a warning for a potentially misspelled GUC prefix (but do
nothing else - certainly not remove it!). Will such a warning be seen?
Perhaps not, but that's a risk any warning message has to face. And
obviously there could be false positives if an extension decides to name
their GUCs something other than pg_available_extensions()->name but
hopefully that's a rare case (and shame on them if so.)

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support


Re: Relstats after VACUUM FULL and CLUSTER

2025-05-22 Thread Sami Imseih
> In both cases, the recently dead tuples must be copied to the table or index,
> but
> they should not be counted towards reltuples. So, I think we need to fix this 
> in
> heapam_relation_copy_for_cluster by probably subtracting
> tups_recently_dead from num_tuples ( which is the value set in
> pg_class.reltuples )
> after we process all the tuples, which looks like the best fix to me.

something like the attached.

--
Sami Imseih
Amazon Web Services (AWS)


v1-0001-Correct-reltuples-count-after-a-VACUUM-CLUSTER-op.patch
Description: Binary data


Re: PATCH: jsonpath string methods: lower, upper, initcap, l/r/btrim, replace, split_part

2025-05-22 Thread Peter Eisentraut

On 09.05.25 21:50, Robert Haas wrote:

I always struggle a bit to remember our policy on these issues -- to
the best of my knowledge, we haven't documented it anywhere, and I
think we probably should. I believe the way it works is that whenever
a function depends on the operating system's timestamp or locale
definitions, we decide it has to be stable, not immutable. We don't
expect those things to be updated very often, but we know sometimes
they do get updated.


I don't understand how this discussion got to the conclusion that 
functions that depend on the locale cannot be immutable.  Note that the 
top-level functions lower, upper, and initcap themselves are immutable.
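
For reference, a quick way to check (provolatile 'i' = immutable,
's' = stable):

```
SELECT proname, provolatile
FROM pg_proc
WHERE pronamespace = 'pg_catalog'::regnamespace
  AND proname IN ('lower', 'upper', 'initcap');
-- All of these are reported as 'i' (immutable).
```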






Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly

2025-05-22 Thread Alexander Korotkov
Hi, Vitaly!

On Tue, May 20, 2025 at 6:44 PM Vitaly Davydov  wrote:
>
> Thank you very much for the review!
>
> > The patchset doesn't seem to build after 371f2db8b0, which adjusted
> > the signature of the INJECTION_POINT() macro.  Could you, please,
> > update the patchset accordingly.
>
> I've updated the patch (see attached). Thanks.
>
> > I see in 0004 patch we're calling XLogGetReplicationSlotMinimumLSN()
> > before slots synchronization then use it for WAL truncation.
> > Generally looks good.  But what about the "if
> > (InvalidateObsoleteReplicationSlots(...))" branch?  It calls
> > XLogGetReplicationSlotMinimumLSN() again.  Why would the value
> > obtained from the latter call reflect slots as they are synchronized
> > to the disk?
>
> In patch 0004 I call XLogGetReplicationSlotMinimumLSN() again to keep the old
> behaviour - this function was called in KeepLogSeg prior to my change. I also
> call CheckPointReplicationSlots at the next line to save the invalidated and
> other dirty slots on disk again to make sure, the new oldest LSN is in sync.
>
> The problem I tried to solve in this if-branch is to fix test
> src/test/recovery/t/019_replslot_limit.pl which was failed because the WAL was
> not truncated enought for the test to pass ok. In general, this branch is not
> necessary and we may fix the test by calling checkpoint twice (please, see the
> alternative.rej patch for this case). If you think, we should incorporate this
> new change, I'm ok to do it. But the WAL will be truncating more lazily.
>
> Furthermore, I think we can save slots on disk right after invalidation, not 
> in
> CheckPointGuts to avoid saving invalidated slots twice.

Thank you for the clarification.  It's all good.  I just missed that
CheckPointReplicationSlots() syncs slots inside the "if" branch.

I've reordered the patchset.  The fix comes first, the tests come second,
so the tests pass after each commit.  Also, I've joined both tests and
injection points into a single commit.  I don't see a reason to place the
tests into src/test/modules, because there is no module.  I've moved them
into src/test/recovery.

I also improved some comments and commit messages.  I think 0001 should go
to all supported releases as it fixes a material bug, while 0002 should be
backpatched to 17, where injection points first appeared.  0003 should go
to pg19 after branching.  I'm continuing to review this.


--
Regards,
Alexander Korotkov
Supabase
From c409a441be6487063d49c2671d3a3aecb9ba6994 Mon Sep 17 00:00:00 2001
From: Vitaly Davydov 
Date: Wed, 30 Apr 2025 14:09:21 +0300
Subject: [PATCH v4 1/3] Keep WAL segments by the flushed value of the slot's
 restart LSN

The patch fixes the issue with the unexpected removal of old WAL segments
after checkpoint, followed by an immediate restart. The issue occurs when
a slot is advanced after the start of the checkpoint and before old WAL
segments are removed at the end of the checkpoint.

The idea of the patch is to get the minimal restart_lsn at the beginning
of checkpoint (or restart point) creation and use this value when calculating
the oldest LSN for WAL segments removal at the end of checkpoint. This idea
was proposed by Tomas Vondra in the discussion.

Discussion: https://postgr.es/m/flat/1d12d2-67235980-35-19a406a0%4063439497
Author: Vitaly Davydov 
Reviewed-by: Tomas Vondra 
Reviewed-by: Alexander Korotkov 
Backpatch-through: 13
---
 src/backend/access/transam/xlog.c | 37 ---
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1914859b2ee..30ae65fce53 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -677,7 +677,8 @@ static XLogRecPtr CreateOverwriteContrecordRecord(XLogRecPtr aborted_lsn,
   XLogRecPtr pagePtr,
   TimeLineID newTLI);
 static void CheckPointGuts(XLogRecPtr checkPointRedo, int flags);
-static void KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo);
+static void KeepLogSeg(XLogRecPtr recptr, XLogRecPtr slotsMinLSN,
+	   XLogSegNo *logSegNo);
 static XLogRecPtr XLogGetReplicationSlotMinimumLSN(void);
 
 static void AdvanceXLInsertBuffer(XLogRecPtr upto, TimeLineID tli,
@@ -7087,6 +7088,7 @@ CreateCheckPoint(int flags)
 	VirtualTransactionId *vxids;
 	int			nvxids;
 	int			oldXLogAllowed = 0;
+	XLogRecPtr	slotsMinReqLSN;
 
 	/*
 	 * An end-of-recovery checkpoint is really a shutdown checkpoint, just
@@ -7315,6 +7317,11 @@ CreateCheckPoint(int flags)
 	 */
 	END_CRIT_SECTION();
 
+	/*
+	 * Get the current minimum LSN to be used later in WAL segments cleanup.
+	 */
+	slotsMinReqLSN = XLogGetReplicationSlotMinimumLSN();
+
 	/*
 	 * In some cases there are groups of actions that must all occur on one
 	 * side or the other of a checkpoint record. Before flushing the
@@ -7503,17 +7510,20 @@ CreateCheckPoint(int flags)
 	 * prevent the disk holding the xlog from growing full.
 	 */
 	XLByteToSeg(RedoRecPtr, _logS

Re: Log connection establishment timings

2025-05-22 Thread Melanie Plageman
On Wed, May 21, 2025 at 5:20 PM Jacob Champion
 wrote:
>
> I took a quick look at the authentication and oauth_validator changes.
> Maybe 007_pre_auth.pl should include `receipt`, to make it easier to
> debug pre-auth hangs using the logs? Other than that, the changes
> looked reasonable to me.

Cool, I updated that one (had to review my good ol' perl single
quoting rules to see if q[] still worked there but I think we're safe
:)).  Thanks for taking a look.

Thanks also to Peter E for the report and review. I've pushed this.

- Melanie




Re: Statistics Import and Export

2025-05-22 Thread Jeff Davis
On Thu, 2025-05-22 at 15:41 -0400, Tom Lane wrote:
> There is no additional filter in text mode, so I think pg_restore's
> default behavior should also be "no additional filter".

Attached. Only the defaults for pg_dump and pg_dumpall are changed, and
pg_upgrade explicitly specifies --with-statistics.

Regards,
Jeff Davis

From 5b73253f8848638f1754f4b9da82e90e8814b4b1 Mon Sep 17 00:00:00 2001
From: Jeff Davis 
Date: Thu, 22 May 2025 11:03:03 -0700
Subject: [PATCH v2] Change defaults for statistics export.

Set the default behavior of pg_dump, pg_dumpall, and pg_restore to be
--no-statistics. Leave the default for pg_upgrade to be
--with-statistics.

Discussion: https://postgr.es/m/CA+TgmoZ9=rnwccozikyyjzs_aw1p4qxcw--h4dollhuf1om...@mail.gmail.com
---
 doc/src/sgml/ref/pg_dump.sgml|  4 +-
 doc/src/sgml/ref/pg_dumpall.sgml |  4 +-
 src/bin/pg_dump/pg_backup_archiver.c |  2 +-
 src/bin/pg_dump/t/002_pg_dump.pl | 59 
 src/bin/pg_upgrade/dump.c|  2 +-
 5 files changed, 65 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index c10bca63e55..995d8f9a040 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -1134,7 +1134,7 @@ PostgreSQL documentation
   --no-statistics
   

-Do not dump statistics.
+Do not dump statistics. This is the default.

   
  
@@ -1461,7 +1461,7 @@ PostgreSQL documentation
   --with-statistics
   

-Dump statistics. This is the default.
+Dump statistics.

   
  
diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index 8c5141d036c..81d34df3386 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -567,7 +567,7 @@ exclude database PATTERN
   --no-statistics
   

-Do not dump statistics.
+Do not dump statistics. This is the default.

   
  
@@ -741,7 +741,7 @@ exclude database PATTERN
   --with-statistics
   

-Dump statistics. This is the default.
+Dump statistics.

   
  
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index afa42337b11..175fe9c4273 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -152,7 +152,7 @@ InitDumpOptions(DumpOptions *opts)
 	opts->dumpSections = DUMP_UNSECTIONED;
 	opts->dumpSchema = true;
 	opts->dumpData = true;
-	opts->dumpStatistics = true;
+	opts->dumpStatistics = false;
 }
 
 /*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index cf34f71ea11..386e21e0c59 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -68,6 +68,7 @@ my %pgdump_runs = (
 			'--no-data',
 			'--sequence-data',
 			'--binary-upgrade',
+			'--with-statistics',
 			'--dbname' => 'postgres',# alternative way to specify database
 		],
 		restore_cmd => [
@@ -75,6 +76,7 @@ my %pgdump_runs = (
 			'--format' => 'custom',
 			'--verbose',
 			'--file' => "$tempdir/binary_upgrade.sql",
+			'--with-statistics',
 			"$tempdir/binary_upgrade.dump",
 		],
 	},
@@ -88,11 +90,13 @@ my %pgdump_runs = (
 			'--format' => 'custom',
 			'--compress' => '1',
 			'--file' => "$tempdir/compression_gzip_custom.dump",
+			'--with-statistics',
 			'postgres',
 		],
 		restore_cmd => [
 			'pg_restore',
 			'--file' => "$tempdir/compression_gzip_custom.sql",
+			'--with-statistics',
 			"$tempdir/compression_gzip_custom.dump",
 		],
 		command_like => {
@@ -115,6 +119,7 @@ my %pgdump_runs = (
 			'--format' => 'directory',
 			'--compress' => 'gzip:1',
 			'--file' => "$tempdir/compression_gzip_dir",
+			'--with-statistics',
 			'postgres',
 		],
 		# Give coverage for manually compressed blobs.toc files during
@@ -132,6 +137,7 @@ my %pgdump_runs = (
 			'pg_restore',
 			'--jobs' => '2',
 			'--file' => "$tempdir/compression_gzip_dir.sql",
+			'--with-statistics',
 			"$tempdir/compression_gzip_dir",
 		],
 	},
@@ -144,6 +150,7 @@ my %pgdump_runs = (
 			'--format' => 'plain',
 			'--compress' => '1',
 			'--file' => "$tempdir/compression_gzip_plain.sql.gz",
+			'--with-statistics',
 			'postgres',
 		],
 		# Decompress the generated file to run through the tests.
@@ -162,11 +169,13 @@ my %pgdump_runs = (
 			'--format' => 'custom',
 			'--compress' => 'lz4',
 			'--file' => "$tempdir/compression_lz4_custom.dump",
+			'--with-statistics',
 			'postgres',
 		],
 		restore_cmd => [
 			'pg_restore',
 			'--file' => "$tempdir/compression_lz4_custom.sql",
+			'--with-statistics',
 			"$tempdir/compression_lz4_custom.dump",
 		],
 		command_like => {
@@ -189,6 +198,7 @@ my %pgdump_runs = (
 			'--format' => 'directory',
 			'--compress' => 'lz4:1',
 			'--file' => "$tempdir/compression_lz4_dir",
+			'--with-statistics',
 			'postgres',
 		],
 		# Verify that data 

Re: [Util] Warn and Remove Invalid GUCs

2025-05-22 Thread David G. Johnston
On Thu, May 22, 2025 at 1:20 PM Greg Sabino Mullane 
wrote:

> On Thu, May 22, 2025 at 12:45 PM Tom Lane  wrote:
>
>> > "lpgsql.bogus = 1" .
>>
>> [ shrug... ] How do you know that's a bogus prefix?  It could perfectly
>> well be a fully valid setting for an extension that
>> the installation doesn't choose to preload.
>>
>
> Well, we do have ways to view all *potential* extensions. I find myself
> more sympathetic to the OP than the others here, as I think it's a good
> idea to display a warning for a potentially misspelled GUC prefix (but do
> nothing else - certainly not remove it!). Will such a warning be seen?
> Perhaps not, but that's a risk any warning message has to face. And
> obviously there could be false positives if an extension decides to name
> their GUCs something other than pg_available_extensions()->name but
> hopefully that's a rare case (and shame on them if so.)
>

We don't have proper server variables for applications and non-extensions
to use in things like RLS, so choosing a non-extension-registered prefix is
a perfectly valid thing to do.  Maybe it would seldom be done in the config
file, and be nominally reserved for SET/set_config, but it is still valid.

IMO, an improvement in this area would revolve around not making people
type out setting names manually at all.  If they don't type them, they
cannot typo them.  Again, this too only improves the config-file setup, but
why not add pre-made configuration samples in a subdirectory that people
can copy/paste individually, or set up a default include_dir directive for
contrib so that people can place the "sample" files into a live "contrib.d"
directory, making them no longer just samples but live configuration that
can be edited without having to type names - just remove comment symbols
(maybe) and change values.

David J.


Re: Pathify RHS unique-ification for semijoin planning

2025-05-22 Thread Andy Fan
Richard Guo  writes:

Hi,

> However, in the case of sort-based implementation,
> this function pays no attention to the subpath's pathkeys or the
> pathkeys of the resulting output.

Good finding!

>
> In addition to this specific issue, it seems to me that there are
> other potential issues in create_unique_path().
>
> * Currently, we only consider the cheapest_total_path of the RHS when
> unique-ifying it.

I think it would be better to also take tuple_fraction into account for the
startup cost factor, for paths where the total cost is high but the
required fraction is low.

> I think this may cause us to miss some optimization
> opportunities.  For example, there might be a path with a better sort
> order that isn't the cheapest-total one.  Such a path could help avoid
> a sort at a higher level, potentially resulting in a cheaper overall
> plan.

> * In create_unique_path(), we currently rely on heuristics to decide
> whether to use a hash-based or sort-based method.  I think a better
> approach would be to create paths for both methods and let add_path()
> determine which one is better, similar to how we handle path selection
> in other parts of the planner.
>
> Therefore, I'm thinking that maybe we could create a new RelOptInfo
> for the RHS rel to represent its unique-ified version, and then
> generate all worthwhile paths for it,

This sounds great to me, and I think we can keep the fractional cheapest
path on the new RelOptInfo as well; then everything should fall into
place.

> To be concrete, I'm envisioning something like the following:
>
> if (bms_equal(sjinfo->syn_righthand, rel2->relids) &&
> -   create_unique_path(root, rel2, rel2->cheapest_total_path,
> -  sjinfo) != NULL)
> +   (rel2_unique = create_unique_rel(root, rel2, sjinfo)) != NULL)
>
> ...
>
> -   add_paths_to_joinrel(root, joinrel, rel1, rel2,
> -JOIN_UNIQUE_INNER, sjinfo,
> +   add_paths_to_joinrel(root, joinrel, rel1, rel2_unique,
> +JOIN_INNER, sjinfo,
>  restrictlist);
> -   add_paths_to_joinrel(root, joinrel, rel2, rel1,
> -JOIN_UNIQUE_OUTER, sjinfo,
> +   add_paths_to_joinrel(root, joinrel, rel2_unique, rel1,
> +JOIN_INNER, sjinfo,
>  restrictlist);
>
> In addition, by changing the join from "rel1" and "rel2" using
> JOIN_UNIQUE_OUTER or JOIN_UNIQUE_INNER to a join between "rel1" and
> "rel2_unique" using a standard JOIN_INNER, we might be able to get
> rid of the JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER jointypes.

if we can, +10.

-- 
Best Regards
Andy Fan





Re: Feature Suggestion: Make synchronous_commit a table level property

2025-05-22 Thread Andy Fan
Anas-ur-Rasheed Khan  writes:

Hi,

> We have a use case where some tables are derived, i.e., can be reconstructed 
> entirely from 'source' tables (similar to
> views, but more complex mathematical transformations are applied). Data 
> integrity and durability are important for
> 'source' tables, but not so much for derived tables. In the event of a crash, 
> it is important that we can recover data
> from volatile memory for 'source' tables.
>
> What are your thoughts on this, and potentially adding such support in
> future postgres versions?

I think it is possible to do that: since PostgreSQL already checks whether
a table needs to write WAL, we could also check whether the table needs an
immediate xlog flush.

But the end result would be that if *any* relation needs an immediate
flush, the transaction has to flush the xlog during the commit stage, so
only a transaction which touches the derived tables *only* could benefit
from this feature.  As an alternative, your application could identify the
"derived tables only" transactions and set synchronous_commit to off on its
own.
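
For example, a minimal sketch of that application-side workaround (the
table and function names are made up):

```
BEGIN;
-- This transaction only modifies derived tables, so relax durability for
-- it alone; SET LOCAL reverts automatically at commit or rollback.
SET LOCAL synchronous_commit = off;
INSERT INTO derived_summary (src_id, total)
SELECT src_id, expensive_transform(payload)
FROM source_events;
COMMIT;
```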

-- 
Best Regards
Andy Fan





Re: Feature Suggestion: Make synchronous_commit a table level property

2025-05-22 Thread Christoph Moench-Tegeder
## Anas-ur-Rasheed Khan (annich...@gmail.com):

> We have a use case where some tables are derived, i.e., can be
> reconstructed entirely from 'source' tables (similar to views, but more
> complex mathematical transformations are applied). Data integrity and
> durability are important for 'source' tables, but not so much for derived
> tables. In the event of a crash, it is important that we can recover data
> from volatile memory for 'source' tables.

Did you consider unlogged tables?
https://www.postgresql.org/docs/current/sql-createtable.html#SQL-CREATETABLE-UNLOGGED
The documentation says "considerably faster than ordinary tables" but
"automatically truncated after a crash" which might fit your
requirements.
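
A minimal sketch (table name made up):

```
-- No WAL is written for an unlogged table, so it is much cheaper to
-- populate, but its contents are truncated after a crash:
CREATE UNLOGGED TABLE derived_summary (src_id int, total numeric);

-- An existing table can also be switched either way (each switch
-- rewrites the table):
ALTER TABLE derived_summary SET LOGGED;
ALTER TABLE derived_summary SET UNLOGGED;
```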

"Synchronous Commit" should be understood as transaction property and
does not really work on a table level.

Regards,
Christoph

-- 
Spare Space.




Re: generic plans and "initial" pruning

2025-05-22 Thread Amit Langote
On Wed, May 21, 2025 at 7:22 PM Amit Langote  wrote:
> Fair enough. I’ll revert this and some related changes shortly.  WIP
> patch attached.

I have pushed out the revert now.

Note that I’ve only reverted the changes related to deferring locks on
prunable partitions. I’m planning to leave the preparatory commits
leading up to that one in place unless anyone objects. For reference,
here they are in chronological order (the last 3 are bug fixes):

bb3ec16e14d Move PartitionPruneInfo out of plan nodes into PlannedStmt
d47cbf474ec Perform runtime initial pruning outside ExecInitNode()
cbc127917e0 Track unpruned relids to avoid processing pruned relations
75dfde13639 Fix an oversight in cbc127917 to handle MERGE correctly
cbb9086c9ef Fix bug in cbc127917 to handle nested Append correctly
28317de723b Ensure first ModifyTable rel initialized if all are pruned

I think separating initial pruning from plan node initialization is
still worthwhile on its own, as evidenced by the improvements in
cbc127917e.

-- 
Thanks, Amit Langote




Expression push down from Join Node to below node.

2025-05-22 Thread Shubhankar Anand Kulkarni
Hi Hackers,
 
While dealing with a few queries, I noticed that when the join expression
(join clause) is also used in the projection, the expression is computed
twice. For a better understanding, please refer to the following example:


SELECT sensitive_data1, column1 FROM benchmark_encytion AS t1 LEFT JOIN ( 
SELECT aes256_cbc_decrypt( c1, '\x1234' :: bytea, '\x5678' :: bytea ) AS 
column1 FROM cipher ) AS t2 ON t1.sensitive_data1 = t2.column1;

As you can see in the above query, the join clause involves a column that
needs to be decrypted first and then joined with the other table, and the
projection needs the decrypted values as well. In this case the plan
generated by PG is as shown below:

                                            QUERY PLAN
----------------------------------------------------------------------------------------------------
 Hash Right Join  (cost=22.74..73.43 rows=1 width=65)
   Output: t1.sensitive_data1, aes256_cbc_decrypt(cipher.c1, '4696e67'::bytea, '6e67'::bytea)
   Hash Cond: (aes256_cbc_decrypt(cipher.c1, '4696e67'::bytea, '6e67'::bytea) = t1.sensitive_data1)
   ->  Foreign Scan on public.cipher  (cost=0.00..50.68 rows=1 width=49)
         Output: cipher.c1, cipher.c2, cipher.c3, cipher.c4, cipher.c5, cipher.c6
         CStore Dir: /home/shubha/Documents/zoho/postgres17/data/cstore_fdw/116838/116931
         CStore Table Size: 2424 kB
   ->  Hash  (cost=22.72..22.72 rows=1 width=33)
         Output: t1.sensitive_data1
         ->  Foreign Scan on public.benchmark_encytion t1  (cost=0.00..22.72 rows=1 width=33)
               Output: t1.sensitive_data1
               CStore Dir: /home/shubha/Documents/zoho/postgres17/data/cstore_fdw/116838/135230
               CStore Table Size: 1268 kB
 Query Identifier: 1810637692808683603
(14 rows)

As seen in the plan, the join clause uses the aes256_cbc_decrypt FuncExpr
to join the columns, and we also select the same expression in the
projection of the HashJoin node, resulting in the expression being computed
twice, which is very costly.

My doubt here is: while planning this join, why can't we parse the join
clause and push the expressions involved there down to the respective scan
nodes, and then refer to them above that, wherever needed, as a Var?
In this particular case, we could push the expression (the decrypt
FuncExpr) from the join clause down to the foreign scan of the cipher
table. Why has this case not been handled in PG?
Please share your thoughts on this, and please correct me if my
understanding is wrong.

Thanks and Regards.
Shubhankar  Kulkarni
ZLabs-CStore

Re: plan shape work

2025-05-22 Thread Andy Fan
Andy Fan  writes:

>
>> This list of elided nodes is stored in the PlannedStmt
>
> I did a quick check on the attached patches and I can see some more
> information is added into PlannedStmt. My question then is: the PlannedStmt
> is not available during a future planning cycle, so how would this
> information be helpful for future planning? I'm also surprised that there
> are so few changes in the optimizer part. Is this patchset just preparation
> work for reconstructing the same plan in the future?

I'm now sure it is preliminary work for reconstructing the same plan in the
future; sorry for the noise.

-- 
Best Regards
Andy Fan





Re: [PATCH] Add pretty-printed XML output option

2025-05-22 Thread Jim Jones


On 22.05.25 01:48, Tom Lane wrote:
> I did look at that one too.  I think it's fine, because we're
> dealing with a newly-created document which can't have a root node
> yet.  (Reinforcing this, Valgrind sees no leaks after applying
> my patch.)  I considered adding an assertion that that call returns
> NULL, but concluded that it wasn't worth the notational hassle.
> I'm not strongly set on that conclusion, though, if you think
> differently.


I see. In that case I believe that at least a different comment
explaining this decision would avoid confusion. Something like

/*
 * This attaches root to doc, so we do not need to free it separately.
 * The return value of xmlDocSetRootElement (xmlNodePtr) is intentionally
 * ignored here, as it is guaranteed to be NULL in this specific context.
 * When using this function elsewhere, ensure to handle the return value
 * properly.
 */


Best regards, Jim






Re: Retiring some encodings?

2025-05-22 Thread Laurenz Albe
The obvious question is how many people would suffer because
of that removal, as it would prevent them from using pg_upgrade.

Can anybody who works in a region that uses these encodings make
an educated guess?

Yours,
Laurenz Albe




Re: [Util] Warn and Remove Invalid GUCs

2025-05-22 Thread Srinath Reddy Sadipiralla
Hi,

On Thu, May 22, 2025 at 2:09 AM Tom Lane  wrote:

> Shaik Mohammad Mujeeb  writes:
> > Currently, if there's a typo in an extension name while adding a GUC to
> postgresql.conf, PostgreSQL server starts up silently without any warning.
> This can be misleading, as one might assume the configuration has been
> correctly applied, when in fact the value hasn’t been set due to the typo.
>
> Well, yeah, because the core server has no way to identify bogus
> extension GUCs if the relevant extension isn't loaded.  We do
> already complain about such things at extension load time.
>

The extension is loaded, and then I entered the bogus extension GUC into
postgresql.conf and restarted; I did not observe any complaint or warning.
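
For instance, a sketch of the scenario (with the prefix intentionally
misspelled):

```
-- postgresql.conf contains:
--   lpgsql.bogus = 1     -- typo for the "plpgsql." prefix
-- The server starts without any warning, and the value is simply kept as
-- a placeholder custom variable:
SHOW lpgsql.bogus;
--  lpgsql.bogus
-- --------------
--  1
-- (1 row)
```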


> > To improve this experience, I’m proposing a patch that issues a
> > warning for such invalid GUC entries.
>
> How will you know they are invalid?  All I see in the patch is
> a syntactic check, which looks quite redundant with
> assignable_custom_variable_name().
>

After the extension is loaded, MarkGUCPrefixReserved() appends the
extension name to reserved_class_prefix, so this patch has code to check
whether a GUC's prefix from postgresql.conf is in reserved_class_prefix or
not; if not, it invalidates the GUC and throws a warning. Please correct me
if I am wrong.

+static bool
+has_valid_class_prefix(const char *name)
+{
+    const char *sep = strchr(name, GUC_QUALIFIER_SEPARATOR);
+    size_t      class_len;
+    ListCell   *lc;
+
+    /* If there's no separator, it can't be a custom variable */
+    if (sep == NULL)
+        return false;
+
+    class_len = sep - name;
+
+    foreach(lc, reserved_class_prefix)
+    {
+        const char *rcprefix = lfirst(lc);
+
+        if (strlen(rcprefix) == class_len &&
+            strncmp(name, rcprefix, class_len) == 0)
+            return true;
+    }
+
+    return false;
+}
+
+void
+WarnAndRemoveInvalidGUCs(void)
+{
+    HASH_SEQ_STATUS status;
+    GUCHashEntry *hentry;
+
+    /* Warn about and remove invalid placeholders. */
+    hash_seq_init(&status, guc_hashtab);
+    while ((hentry = (GUCHashEntry *) hash_seq_search(&status)) != NULL)
+    {
+        struct config_generic *var = hentry->gucvar;
+
+        if ((var->flags & GUC_CUSTOM_PLACEHOLDER) != 0 &&
+            !has_valid_class_prefix(var->name))
+        {
+            ereport(WARNING,
+                    (errcode(ERRCODE_INVALID_NAME),
+                     errmsg("invalid configuration parameter name \"%s\", removing it",
+                            var->name),
+                     errdetail("\"%s\" does not have a reserved prefix.",
+                               var->name)));
+            remove_gucvar(var);
+        }
+    }
+}
+


-- 
Thanks,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/


Re: Expression push down from Join Node to below node.

2025-05-22 Thread Andy Fan
Shubhankar Anand Kulkarni  writes:

Hi,

> SELECT sensitive_data1, column1 FROM benchmark_encytion AS t1 LEFT JOIN ( 
> SELECT aes256_cbc_decrypt( c1, '\x1234' :: 
> bytea, '\x5678' :: bytea ) AS column1 FROM cipher ) AS t2 ON 
> t1.sensitive_data1 = t2.column1; 
>
> As you can see in the above Query, the join clause involves the column which 
> needs to be decrypted first and then joined
> with other table and then in projection we need the needed decrypted values 
> to print as well, in this case the plan
> generated by the PG is as mentioned below (refer to the image as well):
>
>   
>  QUERY PLAN
>   
>   
> ---
>
> Hash Right Join  (cost=22.74..73.43 rows=1 width=65)
>Output: t1.sensitive_data1, aes256_cbc_decrypt(cipher.c1, 
> '4696e67'::bytea, '6e67'::bytea)
>Hash Cond: (aes256_cbc_decrypt(cipher.c1, '4696e67'::bytea, '6e67'::bytea) 
> = t1.sensitive_data1)
>->  Foreign Scan on public.cipher  (cost=0.00..50.68 rows=1 width=49)
>  Output: cipher.c1, cipher.c2, cipher.c3, cipher.c4, cipher.c5, 
> cipher.c6
>  CStore Dir: 
> /home/shubha/Documents/zoho/postgres17/data/cstore_fdw/116838/116931
>  CStore Table Size: 2424 kB
>->  Hash  (cost=22.72..22.72 rows=1 width=33)
>  Output: t1.sensitive_data1
>  ->  Foreign Scan on public.benchmark_encytion t1  (cost=0.00..22.72 
> rows=1 width=33)
>Output: t1.sensitive_data1
>CStore Dir: 
> /home/shubha/Documents/zoho/postgres17/data/cstore_fdw/116838/135230
>CStore Table Size: 1268 kB
> Query Identifier: 1810637692808683603
> (14 rows)
>
> My doubt here is, while planing this join, why can't we parse the join clause 
> and pass the expressions involved there to
> the respective scan nodes and use it above that wherever needed as a
> Var?

I guess the reason is that once we push the function down to the "foreign
scan" node, we need to run the function *before any other filter*, which may
increase the number of calls of the function. e.g.

SELECT udf1(t1.a) FROM t1_1000row t1, t2_1row t2 WHERE t2.fid = t1.id;

If we push udf1 down to the scan of t1, udf1 would be called 1000 times,
but without the push-down it is called only once in the above case. IIRC,
PostgreSQL assumes that after the join the total number of rows will be
smaller.

For your case specifically:

1. The number of calls of aes256_cbc_decrypt would not be increased even if
we pushed it down; however, figuring out this fact needs some work done
very early in the planning stage, which might be kind of complex.

2. You can simply rewrite your query with a materialized CTE; I think that
probably resolves your issue:

WITH t2 AS MATERIALIZED (
  SELECT aes256_cbc_decrypt(c1, '\x1234'::bytea, '\x5678'::bytea) AS column1
  FROM cipher
)
SELECT sensitive_data1, column1
FROM benchmark_encytion AS t1
LEFT JOIN t2 ON t1.sensitive_data1 = t2.column1;

For the general case, I do want to share intermediate results between
ExecQual and ExecProject by storing them in some special tts_values in the
TupleTableSlot. e.g. the case is:

SELECT costly_udf(t1.a) FROM t1 JOIN t2 ON costly_udf(t1.a) = t2.a;

In the past I wanted to use a similar idea to bypass some duplicated
detoast effort at [1], but Robert thought it was unacceptable, so the
project died. Your case makes me think about it again.

[1]
https://www.postgresql.org/message-id/CA%2BTgmoZfpruG%3DVvqeKLiRC95VbbxEyxBm8d1r3YOpaedkQuL4A%40mail.gmail.com

-- 
Best Regards
Andy Fan





Pathify RHS unique-ification for semijoin planning

2025-05-22 Thread Richard Guo
I came across a query where the plan includes some unnecessary Sort
nodes.  Here's an example that shows the issue.

create table t(a int, b int);
insert into t select i%100, i from generate_series(1,10000)i;
create index on t(a);
vacuum analyze t;

set enable_hashagg to off;

explain (costs off)
select * from t t1 where t1.a in
  (select a from t t2 where a < 10)
order by t1.a;
  QUERY PLAN
---
 Merge Join
   Merge Cond: (t1.a = t2.a)
   ->  Index Scan using t_a_idx on t t1
   ->  Sort
 Sort Key: t2.a
 ->  Unique
   ->  Sort
 Sort Key: t2.a
 ->  Index Only Scan using t_a_idx on t t2
   Index Cond: (a < 10)
(10 rows)

I believe the two Sort nodes are unnecessary.

After some digging, it seems that this is related to one of the
approaches we use to implement semijoins: unique-ifying the RHS and
then doing a regular join.  The unique-ification is handled in
create_unique_path(), which considers both hash-based and sort-based
implementations.  However, in the case of sort-based implementation,
this function pays no attention to the subpath's pathkeys or the
pathkeys of the resulting output.

In addition to this specific issue, it seems to me that there are
other potential issues in create_unique_path().

* Currently, we only consider the cheapest_total_path of the RHS when
unique-ifying it.  I think this may cause us to miss some optimization
opportunities.  For example, there might be a path with a better sort
order that isn't the cheapest-total one.  Such a path could help avoid
a sort at a higher level, potentially resulting in a cheaper overall
plan.

* In create_unique_path(), we currently rely on heuristics to decide
whether to use a hash-based or sort-based method.  I think a better
approach would be to create paths for both methods and let add_path()
determine which one is better, similar to how we handle path selection
in other parts of the planner.

Therefore, I'm thinking that maybe we could create a new RelOptInfo
for the RHS rel to represent its unique-ified version, and then
generate all worthwhile paths for it, similar to how it's done in
create_distinct_paths().  Since this is likely to be called repeatedly
on the same rel, we can cache the new RelOptInfo in the rel struct,
just like how we cache cheapest_unique_path currently.

To be concrete, I'm envisioning something like the following:

if (bms_equal(sjinfo->syn_righthand, rel2->relids) &&
-   create_unique_path(root, rel2, rel2->cheapest_total_path,
-  sjinfo) != NULL)
+   (rel2_unique = create_unique_rel(root, rel2, sjinfo)) != NULL)

...

-   add_paths_to_joinrel(root, joinrel, rel1, rel2,
-JOIN_UNIQUE_INNER, sjinfo,
+   add_paths_to_joinrel(root, joinrel, rel1, rel2_unique,
+JOIN_INNER, sjinfo,
 restrictlist);
-   add_paths_to_joinrel(root, joinrel, rel2, rel1,
-JOIN_UNIQUE_OUTER, sjinfo,
+   add_paths_to_joinrel(root, joinrel, rel2_unique, rel1,
+JOIN_INNER, sjinfo,
 restrictlist);

In addition, by changing the join from "rel1" and "rel2" using
JOIN_UNIQUE_OUTER or JOIN_UNIQUE_INNER to a join between "rel1" and
"rel2_unique" using a standard JOIN_INNER, we might be able to get
rid of the JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER jointypes.  This
could simplify a lot of logic in joinpath.c, where we're increasingly
adding special-case handling for these two jointypes.

One last point, I doubt that the code related to UNIQUE_PATH_NOOP is
reachable in practice.  create_unique_path() checks whether the input
rel is an RTE_RELATION or RTE_SUBQUERY and is provably unique, and
creates a UNIQUE_PATH_NOOP UniquePath if so.  However, in that case,
the semijoin should have already been simplified to a plain inner join
by analyzejoins.c.

Any thoughts?

Thanks
Richard




Re: pg_dump does not dump domain not-null constraint's comments

2025-05-22 Thread jian he
hi.

"Catalog domain not-null constraints" commit[1]
doesn't have a pg_dump test.

I actually found another bug.
create schema test;
CREATE DOMAIN test.d1 AS integer NOT NULL default 11;
pg_dump  --schema=test > 1.sql
""
pg_dump: warning: could not resolve dependency loop among these items:
pg_dump: detail: TYPE d1  (ID 1415 OID 18007)
pg_dump: detail: CONSTRAINT d1_not_null  (ID 1416 OID 18008)
""

"Catalog domain not-null constraints" was committed in 17, so there is no
need to worry about pre-17 branches.

The attached patch will work for PG17 and PG18.

You can use the following SQL examples
to compare with master and see the intended effect of my attached patch.

CREATE DOMAIN test.d1 AS integer NOT NULL default 11;
CREATE DOMAIN test.d3 AS integer constraint nn NOT NULL default 11;

[1] 
https://git.postgresql.org/cgit/postgresql.git/commit/?id=e5da0fe3c22b34c4433f1729e88495554b5331ed
From aaa19222079d27c554c8b06a2e95b2f2581bccd8 Mon Sep 17 00:00:00 2001
From: jian he 
Date: Thu, 22 May 2025 14:25:58 +0800
Subject: [PATCH v4 1/2] Improve pg_dump handling of domain not-null
 constraints

1. If requested, we should dump comments for domain not-null constraints.
Note: In PostgreSQL 16 and earlier, these constraints do not have entries in
pg_constraint.
this patch can apply to PG17 and up.

discussion: https://postgr.es/m/CACJufxF-0bqVR=j4jons6n2ka6hhupfyu3_3twknhow_4yf...@mail.gmail.com
---
 src/bin/pg_dump/pg_dump.c| 64 +++-
 src/bin/pg_dump/pg_dump_sort.c   |  9 ++---
 src/bin/pg_dump/t/002_pg_dump.pl |  6 ++-
 3 files changed, 62 insertions(+), 17 deletions(-)

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c73e73a87d1..fe60ca874b3 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -8250,7 +8250,8 @@ getDomainConstraints(Archive *fout, TypeInfo *tyinfo)
 	int			i_tableoid,
 i_oid,
 i_conname,
-i_consrc;
+i_consrc,
+i_contype;
 	int			ntups;
 
 	if (!fout->is_prepared[PREPQUERY_GETDOMAINCONSTRAINTS])
@@ -8260,10 +8261,19 @@ getDomainConstraints(Archive *fout, TypeInfo *tyinfo)
 			 "PREPARE getDomainConstraints(pg_catalog.oid) AS\n"
 			 "SELECT tableoid, oid, conname, "
 			 "pg_catalog.pg_get_constraintdef(oid) AS consrc, "
-			 "convalidated "
+			 "convalidated, "
+			 "contype "
 			 "FROM pg_catalog.pg_constraint "
-			 "WHERE contypid = $1 AND contype = 'c' "
-			 "ORDER BY conname");
+			 "WHERE contypid = $1 ");
+
+		/*
+		 * only PG17 or later versions, not-null domain constraint catalog
+		 * information is stored in the pg_constraint.
+		 */
+		if (fout->remoteVersion < 17)
+			appendPQExpBufferStr(query, "AND contype = 'c' ");
+
+		appendPQExpBufferStr(query, "ORDER BY conname");
 
 		ExecuteSqlStatement(fout, query->data);
 
@@ -8282,6 +8292,7 @@ getDomainConstraints(Archive *fout, TypeInfo *tyinfo)
 	i_oid = PQfnumber(res, "oid");
 	i_conname = PQfnumber(res, "conname");
 	i_consrc = PQfnumber(res, "consrc");
+	i_contype = PQfnumber(res, "contype");
 
 	constrinfo = (ConstraintInfo *) pg_malloc(ntups * sizeof(ConstraintInfo));
 
@@ -8300,7 +8311,7 @@ getDomainConstraints(Archive *fout, TypeInfo *tyinfo)
 		constrinfo[i].dobj.namespace = tyinfo->dobj.namespace;
 		constrinfo[i].contable = NULL;
 		constrinfo[i].condomain = tyinfo;
-		constrinfo[i].contype = 'c';
+		constrinfo[i].contype = *(PQgetvalue(res, i, i_contype));
 		constrinfo[i].condef = pg_strdup(PQgetvalue(res, i, i_consrc));
 		constrinfo[i].confrelid = InvalidOid;
 		constrinfo[i].conindex = 0;
@@ -12485,7 +12496,31 @@ dumpDomain(Archive *fout, const TypeInfo *tyinfo)
 	}
 
 	if (typnotnull[0] == 't')
-		appendPQExpBufferStr(q, " NOT NULL");
+	{
+		if (fout->remoteVersion < 17)
+			appendPQExpBufferStr(q, " NOT NULL");
+		else
+		{
+			for (i = 0; i < tyinfo->nDomChecks; i++)
+			{
+ConstraintInfo *domcheck = &(tyinfo->domChecks[i]);
+
+if (!domcheck->separate && domcheck->contype == 'n')
+{
+	char	   *default_name;
+
+	/* XXX should match ChooseConstraintName better */
+	default_name = psprintf("%s_not_null", qtypname);
+
+	if (strcmp(default_name, fmtId(domcheck->dobj.name)) == 0)
+		appendPQExpBufferStr(q, " NOT NULL");
+	else
+		appendPQExpBuffer(q, " CONSTRAINT %s %s",
+		fmtId(domcheck->dobj.name), domcheck->condef);
+}
+			}
+		}
+	}
 
 	if (typdefault != NULL)
 	{
@@ -12505,7 +12540,7 @@ dumpDomain(Archive *fout, const TypeInfo *tyinfo)
 	{
 		ConstraintInfo *domcheck = &(tyinfo->domChecks[i]);
 
-		if (!domcheck->separate)
+		if (!domcheck->separate && domcheck->contype == 'c')
 			appendPQExpBuffer(q, "\n\tCONSTRAINT %s %s",
 			  fmtId(domcheck->dobj.name), domcheck->condef);
 	}
@@ -18375,14 +18410,23 @@ dumpConstraint(Archive *fout, const ConstraintInfo *coninfo)
 		  .dropStmt = delq->data));
 		}
 	}
-	else if (coninfo->contype == 'c' && tbinfo == NULL)
+	else if (tbinfo == NULL)
 	{
-		/* CHECK constrain

Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2025-05-22 Thread Dmitry Koval

superuser can bypass all permission checks.
superuser can attach any table to the partitioned table as he wants.


That's right.
Using SUPERUSER may be a rare situation, but it still needs to be handled.

--
With best regards,
Dmitry Koval

Postgres Professional: http://postgrespro.com





Re: queryId constant squashing does not support prepared statements

2025-05-22 Thread Dmitry Dolgov
> On Wed, May 21, 2025 at 08:22:19PM GMT, Sami Imseih wrote:
> > > > At the same time AFAICT there isn't much more code paths
> > > > to worry about in case of a LocationExpr as a node
> > >
> > > I can imagine there are others like value expressions,
> > > row expressions, json array expressions, etc. that we may
> > > want to also normalize.
>
> > Exactly. When using a node, one can explicitly wrap whatever is needed
> > into it, while otherwise one would need to find a new way to piggy back
> > on A_Expr in a new context.
>
> Looking at the VALUES expression case, we will need to carry the info
> with SelectStmt and ultimately to RangeTblEntry which is where the
> values_list is, so either approach we take RangeTblEntry will need the
> LocationExpr pointer or the additional ParseLoc info I am suggesting.
> A_Expr is not used in the values list case.

Right, that's precisely my point -- introducing a new node will allow us to
use the same generalized mechanism in such scenarios as well, instead of
inventing something new every time.

> > I'll take a look at the proposed change, but a bit later.
>
> Here is a v4 to compare with v3.
>
> 0001- is the infrastructure to track the boundaries
> 0002- the changes to jumbling

Just to call this out, I don't think there is an agreement on squashing
Params, which you have added into 0002. Let's discuss this change
separately from the 18 open item.

---

Here is a short summary of the open item:

* An issue has been discovered with the squashing feature in 18, which
  can lead to invalid normalized queries in pg_stat_statements.

* The proposed fix extends gram.y to capture the end location of list
  expressions in order to address that.

* There is a disagreement on how exactly to capture the location: the
  options are introducing a new node LocationExpr or piggy-backing on an
  existing A_Expr. I find the former more flexible and less invasive,
  but it looks like there are also other opinions.

Now, both flavours of the proposed solution could still be considered too
invasive to be applied as a bug fix. I personally don't see it that way,
but I'm obviously biased. This leads us to the following decisions to be
made:

* Is modifying the parser (either adding a new node or modifying an
  existing one) acceptable at this stage? I guess it would be enough to
  collect a couple of yes/no votes in this thread.

* If it's not acceptable, the feature could be reverted in 18, and the
  fix could be applied to the master branch only.

I'm fine with both outcomes (apply the fix to both 18 and master, or
revert in 18 and apply the fix on master), and leave the decision to
Álvaro (sorry for causing all the troubles). It's fair to say that
reverting the feature will be the least risky move.




Re: Consider explicit incremental sort for Append and MergeAppend

2025-05-22 Thread Richard Guo
On Mon, May 19, 2025 at 10:21 PM Robert Haas  wrote:
> On Thu, May 15, 2025 at 9:03 AM Andrei Lepikhov  wrote:
> > 2. IncrementalSort is not always more effective - it depends on the
> > column's number of groups. In my experience, a non-cost-based decision
> > one day meets the problematic case, and the people who stick with it are
> > much more confused than in the case when planner decision connected to
> > the costings - they trust the cost model or the cost model tuned by GUCs.

> I wonder if this could be fixed in nodeIncrementalSort.c. I think it's
> a problem to rely on planner estimates because planner estimates of
> the # of groups are quite unreliable. But at runtime, we can notice
> whether the groups are small or large and adjust the algorithm
> accordingly. In fact, it looks like we already have some logic for
> that:
>
> /*
>  * Sorting many small groups with tuplesort is inefficient. In order to
>  * cope with this problem we don't start a new group until the current one
>  * contains at least DEFAULT_MIN_GROUP_SIZE tuples (unfortunately this also
>  * means we can't assume small groups of tuples all have the same prefix 
> keys.)
>  * When we have a bound that's less than DEFAULT_MIN_GROUP_SIZE we start 
> looking
>  * for the new group as soon as we've met our bound to avoid fetching more
>  * tuples than we absolutely have to fetch.
>  */
> #define DEFAULT_MIN_GROUP_SIZE 32
>
> But maybe this logic needs to be further refined somehow. I can't
> shake the feeling that it's going to be really hard to have every
> place that uses incremental sort decide whether to use an incremental
> sort or a regular sort -- we should try to get to a place where it's
> safe to use an incremental sort when it's possible without worrying
> that it might actually be slower.

Agreed.  In some cases, we currently don't have the infrastructure to
consider both incremental sort and full sort and compare their costs —
for example, when inserting explicit sorts for mergejoins, or, as in
this case, for Append/MergeAppend.

Besides, I think the planner currently assumes that incremental sort
is always faster than full sort when there are presorted keys, and
this premise has been applied in various parts of the code.  I checked
all the callers of create_incremental_sort_path, and they all follow
the logic that if there are presorted keys, only incremental sort is
considered.

Also, to understand how incremental sort performs in the worst case, I
ran the following test.

create table ab (a int not null, b int not null);
insert into ab select 0,x from generate_Series(1,999000)x union all
select x%1000+1,0 from generate_Series(999001,1000000)x;

create index on ab (a);

vacuum analyze ab;

In this table, group 0 has 999000 rows, and the remaining groups
1-1000 have just 1 row each.

-- incremental sort
explain (analyze, costs off) select * from ab order by a,b;
   QUERY PLAN

 Incremental Sort (actual time=585.093..714.589 rows=1000000.00 loops=1)
   Sort Key: a, b
   Presorted Key: a
   Full-sort Groups: 33  Sort Method: quicksort  Average Memory: 26kB
Peak Memory: 26kB
   Pre-sorted Groups: 1  Sort Method: external merge  Average Disk:
17680kB  Peak Disk: 17680kB
   Buffers: shared hit=5273, temp read=2210 written=
   ->  Index Scan using ab_a_idx on ab (actual time=0.192..330.289
rows=1000000.00 loops=1)
 Index Searches: 1
 Buffers: shared hit=5273
 Planning Time: 0.354 ms
 Execution Time: 759.693 ms
(11 rows)

-- full sort
explain (analyze, costs off) select * from ab order by a,b;
 QUERY PLAN

 Sort (actual time=570.755..831.825 rows=1000000.00 loops=1)
   Sort Key: a, b
   Sort Method: external merge  Disk: 17704kB
   Buffers: shared hit=5273, temp read=2213 written=2225
   ->  Index Scan using ab_a_idx on ab (actual time=0.187..327.701
rows=1000000.00 loops=1)
 Index Searches: 1
 Buffers: shared hit=5273
 Planning Time: 0.304 ms
 Execution Time: 877.112 ms
(9 rows)

So with the same subpath, incremental sort still outperforms full sort
even if there is a big skew in the number of rows per pre-sorted group
(which is a bit surprising to me).
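For reference, the "full sort" plan above is from the same query with
incremental sort disabled for that run, i.e. something along the lines of:

set enable_incremental_sort = off;
explain (analyze, costs off) select * from ab order by a,b;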

So, I think it's generally safe to use incremental sort whenever
possible.  There might be some corner cases where incremental sort is
slower than full sort, and I think it would be best to address those
in nodeIncrementalSort.c, as Robert suggested.

Thanks
Richard




Re: POC: Parallel processing of indexes in autovacuum

2025-05-22 Thread Daniil Davydov
Hi,

On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada  wrote:
>
> I have some comments on v2-0001 patch

Thank you for reviewing this patch!

> +   {
> +   {"autovacuum_reserved_workers_num", PGC_USERSET,
> RESOURCES_WORKER_PROCESSES,
> +   gettext_noop("Number of worker processes, reserved for
> participation in parallel index processing during autovacuum."),
> +   gettext_noop("This parameter is depending on
> \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
> +"*Only* autovacuum workers can use these
> additional processes. "
> +"Also, these processes are taken into account
> in \"max_parallel_workers\"."),
> +   },
> +   &av_reserved_workers_num,
> +   0, 0, MAX_BACKENDS,
> +   check_autovacuum_reserved_workers_num, NULL, NULL
> +   },
>
> I find that the name "autovacuum_reserved_workers_num" is generic. It
> would be better to have a more specific name for parallel vacuum such
> as autovacuum_max_parallel_workers. This parameter is related to
> neither autovacuum_worker_slots nor autovacuum_max_workers, which
> seems fine to me. Also, max_parallel_maintenance_workers doesn't
> affect this parameter.
> ...
> I've also considered some alternative names. If we were to use
> parallel_maintenance_workers, it sounds like it controls the parallel
> degree for all operations using max_parallel_maintenance_workers,
> including CREATE INDEX. Similarly, vacuum_parallel_workers could be
> interpreted as affecting both autovacuum and manual VACUUM commands,
> suggesting that when users run "VACUUM (PARALLEL) t", the system would
> use their specified value for the parallel degree. I prefer
> autovacuum_parallel_workers or vacuum_parallel_workers.
>

This was my headache when I created names for variables. Autovacuum
initially implies parallelism, because we have several parallel a/v
workers. So I think that a parameter like
`autovacuum_max_parallel_workers` will confuse somebody.
If we want to have a more specific name, I would prefer
`max_parallel_index_autovacuum_workers`.

> Which number does this parameter mean to specify: the maximum number
> of parallel vacuum workers that can be used during autovacuum or the
> maximum number of parallel vacuum workers that each autovacuum can
> use?

The first variant. I will make this concrete in the variable's description.

> +   {
> +   {
> +   "parallel_idx_autovac_enabled",
> +   "Allows autovacuum to process indexes of this table in
> parallel mode",
> +   RELOPT_KIND_HEAP,
> +   ShareUpdateExclusiveLock
> +   },
> +   false
> +   },
>
> The proposed reloption name doesn't align with our naming conventions.
> Looking at our existing reloptions, we typically write out full words
> rather than using abbreviations like 'autovac' or 'idx'.
>
> The new reloption name seems not to follow the conventional naming
> style for existing reloption. For instance, we don't use abbreviations
> such as 'autovac' and 'idx'.

OK, I'll fix it.

> +   /*
> +* If we are running autovacuum - decide whether we need to process 
> indexes
> +* of table with given oid in parallel.
> +*/
> +   if (AmAutoVacuumWorkerProcess() &&
> +   params->index_cleanup != VACOPTVALUE_DISABLED &&
> +   RelationAllowsParallelIdxAutovac(rel))
>
> I think that this should be done in autovacuum code.

We need the params->index_cleanup variable to decide whether we need to
use parallel index a/v. In autovacuum.c we have this code:
***
/*
 * index_cleanup and truncate are unspecified at first in autovacuum.
 * They will be filled in with usable values using their reloptions
 * (or reloption defaults) later.
 */
tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
***
This variable is filled in inside the `vacuum_rel` function, so I
think we should keep the above logic in vacuum.c.

> +#define AV_PARALLEL_DEADTUP_THRESHOLD  1024
>
> These fixed values really useful in common cases? I think we already
> have an optimization where we skip vacuum indexes if the table has
> fewer dead tuples (see BYPASS_THRESHOLD_PAGES).

When we allocate dead items (and optionally init parallel autovacuum) we
don't have a sane value for `vacrel->lpdead_item_pages` (which should be
compared with BYPASS_THRESHOLD_PAGES).
The only criterion that we can focus on is the number of dead tuples
indicated in the PgStat_StatTabEntry.
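(For reference, that counter is the one exposed at the SQL level as
n_dead_tup, e.g.:

SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;

so the threshold is based purely on that statistic, not on anything measured
during the scan.)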



> I guess we can implement this parameter as an integer parameter so
> that the user can specify the number of parallel vacuum workers for
> the table. For example, we can have a reloption
> autovacuum_parallel_workers. Setting 0 (by default) means to disable
> parallel vacuum during autovacuum, and setting special value -1 means
> to let PostgreSQL calculate the parallel degree for the table (same as
> the default VACUUM command behavior).
> ...
> The patch includes the changes to bgworker.c so that w

Re: queryId constant squashing does not support prepared statements

2025-05-22 Thread Álvaro Herrera
On 2025-May-22, Dmitry Dolgov wrote:

> Just to call this out, I don't think there is an agreement on squashing
> Params, which you have added into 0002.

Actually I think we do have agreement on squashing PARAM_EXTERN Params.
https://postgr.es/m/3086744.1746500...@sss.pgh.pa.us

> Now, both flavours of the proposed solution could still be considered too
> invasive to be applied as a bug fix. I personally don't see it like
> this, but I'm obviously biased. This leads us to the following decisions to
> be made:
> 
> * Is modifying the parser (either adding a new node or modifying an existing
>   one) acceptable at this stage? I guess it would be enough to collect a
>   couple of votes yes/no in this thread.

IMO adding a struct as suggested is okay, especially if it reduces the
overall code complexity.  But we don't want a node, just a bare struct.
Adding a node would be more troublesome.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Pido que me den el Nobel por razones humanitarias" (Nicanor Parra)




Re: Retiring some encodings?

2025-05-22 Thread Heikki Linnakangas

On 22/05/2025 08:54, Michael Paquier wrote:

With all that in mind, I have wanted to kick off a discussion about
potentially removing one or more encodings from the core code,
including the backend part, the frontend part and the conversion
routines, coupled with checks in pg_upgrade to complain when databases
or collations include the encoding in question (the collation part needs
to be checked when not using ICU).  Just being able to remove
GB18030 would do us a favor in the long term, at least, but there's
more.


+1 at high level for deprecating and removing conversions that are not 
widely used anymore. As the first step, we can at least add a warning to 
the documentation, that they will be removed in the future.
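For reference, the built-in conversion routines that would go away for a
given encoding can be listed with something like this (purely illustrative,
using GB18030 as the example):

SELECT conname,
       pg_encoding_to_char(conforencoding) AS source_encoding,
       pg_encoding_to_char(contoencoding)  AS destination_encoding
FROM pg_conversion
WHERE pg_encoding_to_char(conforencoding) = 'GB18030'
   OR pg_encoding_to_char(contoencoding)  = 'GB18030';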


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Retiring some encodings?

2025-05-22 Thread Pavel Stehule
čt 22. 5. 2025 v 13:44 odesílatel Heikki Linnakangas 
napsal:

> On 22/05/2025 08:54, Michael Paquier wrote:
> > With all that in mind, I have wanted to kick off a discussion about
> > potentially removing one or more encodings from the core code,
> > including the backend part, the frontend part and the conversion
> > routines, coupled with checks in pg_upgrade to complain when databases
> > or collations include the encoding in question (the collation part needs
> > to be checked when not using ICU).  Just being able to remove
> > GB18030 would do us a favor in the long term, at least, but there's
> > more.
>
> +1 at high level for deprecating and removing conversions that are not
> widely used anymore. As the first step, we can at least add a warning to
> the documentation, that they will be removed in the future.
>

+1

Pavel


> --
> Heikki Linnakangas
> Neon (https://neon.tech)
>
>
>
>


Relstats after VACUUM FULL and CLUSTER

2025-05-22 Thread Erik Nordström
Hi all,

I noticed a potential issue with the heap cluster code used by VACUUM FULL
and CLUSTER, but I am not sure so I thought I'd post the question to the
list.

The code in question counts the number of tuples it processes and uses that
count to update reltuples in pg_class. However, the tuple count seems off
sometimes because it includes recently dead tuples (due to updates and
deletes). The wrong reltuples count is set, AFAICT, only on tables
that don't have indexes, because the cluster code also later rebuilds the
indexes, which then updates reltuples to the "correct" value.

Does this seem like a bug or is it intentional?
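For reference, a rough way to reproduce what I'm seeing (assuming a table
with no indexes, and a second session holding an older snapshot open so the
deleted tuples are only recently dead when the rewrite runs):

-- session 2: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT 1;  -- leave it open
-- session 1:
CREATE TABLE relstats_demo (i int);     -- deliberately no indexes
INSERT INTO relstats_demo SELECT generate_series(1, 1000);
DELETE FROM relstats_demo WHERE i <= 500;
VACUUM FULL relstats_demo;
SELECT reltuples FROM pg_class WHERE relname = 'relstats_demo';  -- appears to count the 500 recently dead tuples too
SELECT count(*) FROM relstats_demo;                              -- 500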

Best regards,

Erik

-- 
Database Architect, Timescale


Re: Avoid orphaned objects dependencies, take 3

2025-05-22 Thread Bertrand Drouvot
Hi,

On Wed, May 21, 2025 at 10:17:58AM -0700, Jeff Davis wrote:
> On Wed, 2025-05-21 at 12:55 -0400, Robert Haas wrote:
> > Yeah, that's not a terrible idea. I still like the idea I thought
> > Bertrand was pursuing, namely, to take no lock in
> > recordDependencyOn()
> > but assert that the caller has previously acquired one. However, we
> > could still do the Assert() check with this design when NoLock is
> > passed, so I think this is a reasonable alternative to that design.
> 
> I'd have to see the patch to see whether I liked the end result. But
> I'm guessing that involves a lot of non-mechanical changes in the call
> sites, and also relies on test coverage for all of them.

Thinking more about it, I'm not sure the NoLock/AccessShareLock approach will be
easy to get right and maintain.

The thing is that in recordMultipleDependencies() we may iterate over multiple
referenced objects and those are the ones we want a lock on.

So if we enter recordMultipleDependencies() with "NoLock" we would need to be
100% sure that ALL the referenced objects are already locked (or that we don't
want to take a lock on ALL the referenced objects).

But that's not so easy, for example with something like:

create schema my_schema;
CREATE TABLE my_schema.test_maint(i INT);
INSERT INTO my_schema.test_maint VALUES (1), (2);
CREATE FUNCTION my_schema.fn(INT) RETURNS INT IMMUTABLE LANGUAGE plpgsql AS $$
  BEGIN
RAISE NOTICE 'BDT';
RETURN $1;
  END;
$$;

Then during:

CREATE MATERIALIZED VIEW my_schema.test_maint_mv AS SELECT my_schema.fn(i) FROM 
my_schema.test_maint;

1. The schema is locked
2. The relation is locked
3. The function is not locked

So we should pass "AccessShareLock" to recordMultipleDependencies() but that 
could
be easy to miss, resulting in passing "NoLock".

Furthermore, it means that even if we pass "AccessShareLock" correctly here, we'd
need to check, for each referenced object, that there is no existing lock (with
LockHeldByMe()/CheckRelationOidLockedByMe()) at a level > AccessShareLock (if
there is one, we'd be adding an extra lock without any good reason to do so).

With Robert's idea we could avoid calling LockDatabaseObject()/LockRelationOid()
if we know that the object/relation is already locked (or that we don't want
a lock at this place). But it would mean being 100% sure that if there are
multiple code paths leading to the same referenced-object insertion location
then each of them has the same locking behavior.

As that seems hard, a better approach would probably be to also always call
LockHeldByMe()/CheckRelationOidLockedByMe() before trying to acquire the lock.

So I think:

- Jeff's idea could be hard to reason about, as "NoLock" would mean "we are sure
that ALL the existing referenced objects are already locked", while
"AccessShareLock" would mean "we know that at least one referenced object needs
an AccessShareLock".
That would mean that, "by design", we'd need to check whether there is an existing
lock before trying to acquire the AccessShareLock on the referenced objects.

- Robert's idea would still require that we check whether there is any existing
lock before acquiring the AccessShareLock (to be on the safe side of things).

So I wonder if, after all, it makes sense to simply try to acquire the
AccessShareLock on a referenced object in recordMultipleDependencies() IFF
this referenced object is not already locked (basically what was initially
proposed, but with this extra check added and without the "NoLock"/"lock"
addition to recordMultipleDependencies()).

That would probably address Robert's concern [1] "acquiring two locks on the 
same
object with different lock modes where we should really only acquire one" but 
probably not this one "I think it might result in acquiring the
locks on those other objects at a subtly wrong time (leading to race
conditions)".

For the latter I'm not sure how that could be a subtly wrong time, or how we
could determine what a subtly wrong time is (to avoid taking the lock).

Thoughts?

[1]: 
https://www.postgresql.org/message-id/CA%2BTgmoaCJ5-Zx5R0uN%2Bah4EZU1SamY1PneaW5O617FsNKavmfw%40mail.gmail.com

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Replication slot is not able to sync up

2025-05-22 Thread Amit Kapila
On Fri, May 23, 2025 at 9:57 AM Suraj Kharage <
suraj.khar...@enterprisedb.com> wrote:

> Hi,
>
> Noticed below behaviour where replication slot is not able to sync up if
> any catalog changes happened after the creation.
> Getting below LOG when trying to sync replication slots using
> pg_sync_replication_slots() function.
> The newly created slot does not appear on the standby after this LOG -
>
> 2025-05-23 07:57:12.453 IST [4178805] LOG:  could not synchronize
> replication slot "failover_slot" because remote slot precedes local slot
> 2025-05-23 07:57:12.453 IST [4178805] DETAIL:  The remote slot has LSN
> 0/B60 and catalog xmin 764, but the local slot has LSN 0/B60 and
> catalog xmin 765.
> 2025-05-23 07:57:12.453 IST [4178805] STATEMENT:  SELECT
> pg_sync_replication_slots();
>
> Below is the test case tried on latest master branch -
> =
> - Create the Primary and start the server.
> wal_level = logical
>
> - Create the physical slot on Primary.
> SELECT pg_create_physical_replication_slot('slot1');
>
> - Setup the standby using pg_basebackup.
> bin/pg_basebackup -D data1 -p 5418 -d "dbname=postgres" -R
>
> primary_slot_name = 'slot1'
> hot_standby_feedback = on
> port = 5419
>
> -- Start the standby.
>
> -- Connect to Primary and create a logical replication slot.
> SELECT pg_create_logical_replication_slot('failover_slot', 'pgoutput',
> false, false, true);
>
> postgres@4177929=#select xmin,* from pg_replication_slots ;
>  xmin |   slot_name   |  plugin  | slot_type | datoid | database |
> temporary | active | active_pid | xmin | catalog_xmin | restart_lsn |
> confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
> e | two_phase_at |  inactive_since  | conflicting |
> invalidation_reason | failover | synced
>
> --+---+--+---++--+---+++--+--+-+-++---+-
>
> --+--+--+-+-+--+
>   765 | slot1 |  | physical  ||  | f
>   | t  |4177898 |  765 |  | 0/B018B00   |
>   | reserved   |   | f
>   |  |  | |
>   | f| f
>   | failover_slot | pgoutput | logical   |  5 | postgres | f
>   | f  ||  |  764 | 0/B60   | 0/B98
>   | reserved   |   | f
>   |  | 2025-05-23 07:55:31.277584+05:30 | f   |
>   | t| f
> (2 rows)
>
> -- Perform some catalog changes. e.g.:
> create table abc(id int);
> postgres@4179034=#select xmin from pg_class where relname='abc';
>  xmin
> --
>   764
> (1 row)
>
> -- Connect to the standby and try to sync the replication slots.
> SELECT pg_sync_replication_slots();
>
> In the logfile, can see below LOG -
> 2025-05-23 07:57:12.453 IST [4178805] LOG:  could not synchronize
> replication slot "failover_slot" because remote slot precedes local slot
> 2025-05-23 07:57:12.453 IST [4178805] DETAIL:  The remote slot has LSN
> 0/B60 and catalog xmin 764, but the local slot has LSN 0/B60 and
> catalog xmin 765.
> 2025-05-23 07:57:12.453 IST [4178805] STATEMENT:  SELECT
> pg_sync_replication_slots();
>
> select xmin,* from pg_replication_slots ;
> no rows
>
> Primary -
> postgres@4179034=#select xmin,* from pg_replication_slots ;
>  xmin |   slot_name   |  plugin  | slot_type | datoid | database |
> temporary | active | active_pid | xmin | catalog_xmin | restart_lsn |
> confirmed_flush_lsn | wal_status | safe_wal_size | two_phas
> e | two_phase_at |  inactive_since  | conflicting |
> invalidation_reason | failover | synced
>
> --+---+--+---++--+---+++--+--+-+-++---+-
>
> --+--+--+-+-+--+
>   765 | slot1 |  | physical  ||  | f
>   | t  |4177898 |  765 |  | 0/B018C08   |
>   | reserved   |   | f
>   |  |  | |
>   | f| f
>   | failover_slot | pgoutput | logical   |  5 | postgres | f
>   | f  ||  |  764 | 0/B60   | 0/B98
>   | reserved   |   | f
>   |  | 2025-05-23 07:55:31.277584+05:30 | f   |
>   | t| f
> (2 rows)
> =
>
> Is there any way to sync up the replication slot after the catalog changes
> have been made after creation?
>

The remote_slot (slot on primary) should be advanced before you invoke
sync_slot. Can you do pg_logical_slot_get_changes() API before perf
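A rough sketch of advancing the remote slot before retrying the sync
(pg_replication_slot_advance() is shown here rather than
pg_logical_slot_get_changes(), since the latter needs pgoutput's
proto_version/publication_names options):

-- on the primary: advance the failover slot past the catalog change
SELECT pg_replication_slot_advance('failover_slot', pg_current_wal_lsn());
-- on the standby: retry the sync
SELECT pg_sync_replication_slots();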

Re: doc: Make logical replication examples executable in bulk and legal sgml.

2025-05-22 Thread Amit Kapila
On Sat, May 3, 2025 at 9:33 PM David G. Johnston
 wrote:
>
> While responding to a "our documentation is buggy" complaint I got annoyed in 
> my attempt to reproduce the behavior by having to surgically copy 
> line-by-line the DDL and DML code involved.  Let's strive for a more 
> copy-paste friendly example setup.  No prompts and no interspersed command 
> tags (they are ok if the script is one block and the output is another).
>

That's a valid point. Since this is not a correctness issue, I am less
inclined to backpatch. What do you or others think?

-- 
With Regards,
Amit Kapila.




doc: Make logical replication examples executable in bulk and legal sgml.

2025-05-22 Thread David G. Johnston
On Thursday, May 22, 2025, Amit Kapila  wrote:

> On Sat, May 3, 2025 at 9:33 PM David G. Johnston
>  wrote:
> >
> > While responding to a "our documentation is buggy" complaint I got
> annoyed in my attempt to reproduce the behavior by having to surgically
> copy line-by-line the DDL and DML code involved.  Let's strive for a more
> copy-paste friendly example setup.  No prompts and no interspersed command
> tags (they are ok if the script is one block and the output is another).
> >
>
> That's a valid point. Since this is not a correctness issue, I am less
> inclined to backpatch. What do you or others think?
>

Agreed, this would not be back-patched.

David J.


Re: [Util] Warn and Remove Invalid GUCs

2025-05-22 Thread Shaik Mohammad Mujeeb
Hi Tom Lane,

> Well, yeah, because the core server has no way to identify bogus
> extension GUCs if the relevant extension isn't loaded.  We do
> already complain about such things at extension load time.

Yes, currently we do issue warnings at extension load time if a GUC with a 
valid prefix contains an invalid or misspelled suffix (e.g., plpgsql.bogus, 
plpgsql.abc, plpgsql.xyz). In such cases, only the part after the dot is 
invalid, so the warning helps users catch typos early.



However, if the typo exists in the extension name itself (e.g., 
plppsql.extra_errors, plpgaql.extra_errors, plpgsqi.extra_errors), we currently 
don’t provide any acknowledgment. In these examples, the prefix is invalid or 
misspelled, and it would be helpful to notify the user so they have a chance to 
correct it.
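To make the difference concrete (a rough sketch, assuming plpgsql has not been
loaded yet in the session):

SET plpgsql.bogus = on;             -- placeholder accepted for now
DO $$ BEGIN NULL; END $$;           -- loading plpgsql emits a warning about the bogus suffix
SET plpgaql.extra_errors = 'all';   -- misspelled prefix: accepted silently, never flagged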



I do agree that in cases of unrecognised prefixes, it’s possible they may 
either be typos or valid but unregistered prefixes. So instead of removing such 
GUCs from the hashtable - as my current patch does - it would be more 
appropriate to just issue a warning.



Of course, I understand that not all users may want to see these warnings. To 
make this behaviour optional, we could introduce a new GUC (e.g., 
warn_on_unregistered_guc_prefix), which defaults to false to preserve existing 
behaviour. When set to true, it would enable warnings for unregistered or 
potentially misspelled GUC prefixes, giving users the flexibility to opt in.



> How will you know they are invalid?  All I see in the patch is
> a syntactic check, which looks quite redundant with
> assignable_custom_variable_name().

After all the shared preload libraries are loaded, the reserved_class_prefix 
list contains the valid prefixes which we can refer to for this validation. I’m 
performing this check only after all such libraries are loaded. While this 
might look similar to the logic in assignable_custom_variable_name(), the 
purpose here is different.

I hope this helps clarify the intention behind the proposal, and I’d appreciate 
any thoughts or feedback on this approach.



Thanks & Regards,

Shaik Mohammad Mujeeb

Member Technical Staff

Zoho Corp







 On Thu, 22 May 2025 02:08:58 +0530 Tom Lane  wrote ---



Shaik Mohammad Mujeeb  writes: 
> Currently, if there's a typo in an extension name while adding a GUC to 
> postgresql.conf, PostgreSQL server starts up silently without any warning. 
> This can be misleading, as one might assume the configuration has been 
> correctly applied, when in fact the value hasn’t been set due to the typo. 
 
Well, yeah, because the core server has no way to identify bogus 
extension GUCs if the relevant extension isn't loaded.  We do 
already complain about such things at extension load time. 
 
> To improve this experience, I’m proposing a patch that issues a 
> warning for such invalid GUC entries. 
 
How will you know they are invalid?  All I see in the patch is 
a syntactic check, which looks quite redundant with 
assignable_custom_variable_name(). 
 
regards, tom lane

Re: Why our Valgrind reports suck

2025-05-22 Thread Andres Freund
Hi,

0001:

This is rather wild, I really have only the dimmest memory of that other
thread even though I apparently resorted to reading valgrind's source code...

I think the vchunk/vpool naming, while not necessarily elegant, is helpful.


0002: Makes sense.


0003:
0004:
0005:

Ugh, that's rather gross. Not that I have a better idea. So it's probably
worth doing ...


0006: LGTM


0007:

+   /* Run the rest in xact context to avoid Valgrind leak complaints */
+   MemoryContextSwitchTo(TopTransactionContext);
 
It seems like it also protects at least somewhat against actual leaks?


0008: LGTM


0009:

Seems like we really ought to do more here. But it's a substantial
improvement, so let's not let perfect be the enemy of good.


0010:
0011:

Not great, but better than not doing anything.


0012:

Hm. Kinda feels like we should just always use the USE_VALGRIND
approach. ISTM that if domain typecache entry building is a bottleneck, there
are way bigger problems than a copyObject()...


0014:

I guess I personally would try to put the freeing code into a helper, but
it's a close call.


0015:

I'd probably put the list_free() after the
LWLockRelease(LogicalRepWorkerLock)?


0016:

Agreed with the concern stated in the commit message, but addressing that
would obviously be a bigger project.


0017:

A tad weird to leave the comments above the removed = NILs in place, even
though it's obviously still correct.


0018: LGTM.


> But this is the last step to get to zero reported leaks in a run of the core
> regression tests, so let's do it.

I assume that's just about the core tests, not more? I.e. I can't make skink
enable leak checking?

Greetings,

Andres Freund




Re: PG 18 release notes draft committed

2025-05-22 Thread Bruce Momjian
On Wed, May 21, 2025 at 05:57:07PM -0400, Peter Geoghegan wrote:
> For example, TimescaleDB offers Loose index scan as part of the
> TimescaleDB Postgres extension, which (for whatever reason) they chose
> to call skip scan:
> 
> https://www.timescale.com/blog/how-we-made-distinct-queries-up-to-8000x-faster-on-postgresql
> 
> Note that loose index scan can only be used with certain kinds of
> queries involving DISTINCT or GROUP BY. Whereas skip scan (in Oracle
> and now in Postgres) can work with any query that omits one or more
> "=" conditions on a prefix index column from a multicolumn index (when
> a later index column has some condition that can be used by the scan)
> -- it doesn't have to involve aggregation. I believe that describing
> the feature along these lines will make it less likely that users will
> be confused by the apparent naming conflict.

Yes, I understand this "loose index scan" feature as preventing DISTINCT
from traversing all matching indexed values and then removing
duplicates.  Rather, it stops after finding the first match for each
distinct value.
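So, with a multicolumn index, the two features target different queries;
roughly (table name made up here):

CREATE INDEX ON tab (a, b);
-- loose index scan: DISTINCT/aggregation, hopping to the first entry of each distinct "a"
SELECT DISTINCT a FROM tab;
-- skip scan (Oracle, and now Postgres): no "=" condition on the prefix column "a",
-- but a useful condition on the later column "b"
SELECT * FROM tab WHERE b = 42;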

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.




Re: PG 18 release notes draft committed

2025-05-22 Thread Bruce Momjian
On Wed, May 21, 2025 at 05:57:07PM -0400, Peter Geoghegan wrote:
> On Thu, May 1, 2025 at 10:44 PM Bruce Momjian  wrote:
> > I have committd the first draft of the PG 18 release notes.
> 
> I suggest that you use something like the following wording for the
> skip scan feature:
> 
> Add the "skip scan" optimization, which enables more efficient scans
> of multicolumn B-tree indexes for queries that omit an "=" condition
> on one or more prefix index columns.
> 
> This is similar to the wording that appeared in the beta1 announcement.
> 
> The term "skip scan" has significant baggage -- we need to be careful
> to not add to the confusion. There are naming conflicts, which seem
> likely to confuse some users. Various community members have in the
> past referred to a feature that MySQL calls loose index scan as skip
> scan, which seems wrong to me -- it clashes with the naming
> conventions used by other RDBMSs, for no good reason. Skip scan and
> loose index scan are in fact rather different features.
> 
> For example, TimescaleDB offers Loose index scan as part of the
> TimescaleDB Postgres extension, which (for whatever reason) they chose
> to call skip scan:
> 
> https://www.timescale.com/blog/how-we-made-distinct-queries-up-to-8000x-faster-on-postgresql
> 
> Note that loose index scan can only be used with certain kinds of
> queries involving DISTINCT or GROUP BY. Whereas skip scan (in Oracle
> and now in Postgres) can work with any query that omits one or more
> "=" conditions on a prefix index column from a multicolumn index (when
> a later index column has some condition that can be used by the scan)
> -- it doesn't have to involve aggregation. I believe that describing
> the feature along these lines will make it less likely that users will
> be confused by the apparent naming conflict.
> 
> FWIW, I don't think that it's important that the release notes point
> out that skip scan is only helpful when the leading/skipped column is
> low cardinality (though that detail is accurate).

I see your point that we are not defining what this does.  I went with
the attached text.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.
diff --git a/doc/src/sgml/release-18.sgml b/doc/src/sgml/release-18.sgml
index 2a52cef1c7c..a2b1921fbbb 100644
--- a/doc/src/sgml/release-18.sgml
+++ b/doc/src/sgml/release-18.sgml
@@ -469,7 +469,8 @@ Allow skip scans of btree indexes (Peter Geoghegan)
 
 
 
-This is effective if the earlier non-referenced columns contain few unique values.
+This allows multi-column btree indexes to be used by queries that only
+reference the second or later indexed columns.
 
 
 


Re: Prevent internal error at concurrent CREATE OR REPLACE FUNCTION

2025-05-22 Thread jian he
On Thu, May 22, 2025 at 10:25 AM jian he  wrote:
>
Hi.
Earlier, I didn't check patch 0002.

I think that in AlterFunction, adding
/* Lock the function so nobody else can do anything with it. */
LockDatabaseObject(ProcedureRelationId, funcOid, 0, AccessExclusiveLock);

right after
funcOid = LookupFuncWithArgs(stmt->objtype, stmt->func, false);

should be fine.

attached are some simple isolation tests for
CREATE OR REPLACE FUNCTION, ALTER FUNCTION, DROP FUNCTION.
From 040c5f739dbd4e5640ba40ae7c30b86d312bd004 Mon Sep 17 00:00:00 2001
From: jian he 
Date: Fri, 23 May 2025 10:33:39 +0800
Subject: [PATCH v1 1/1] isolation tests for concurrent change FUNCTION
 definition

discussion: https://postgr.es/m/20250331200057.00a62760966a821d484ea...@sraoss.co.jp
---
 .../isolation/expected/change-function.out| 39 +++
 src/test/isolation/isolation_schedule |  1 +
 src/test/isolation/specs/change-function.spec | 29 ++
 3 files changed, 69 insertions(+)
 create mode 100644 src/test/isolation/expected/change-function.out
 create mode 100644 src/test/isolation/specs/change-function.spec

diff --git a/src/test/isolation/expected/change-function.out b/src/test/isolation/expected/change-function.out
new file mode 100644
index 000..812346f2fc1
--- /dev/null
+++ b/src/test/isolation/expected/change-function.out
@@ -0,0 +1,39 @@
+unused step name: r1
+Parsed test spec with 2 sessions
+
+starting permutation: b1 b2 create_f1 alter_f s1 c1 r2
+step b1: BEGIN;
+step b2: BEGIN;
+step create_f1: CREATE OR REPLACE FUNCTION f1() RETURNS INT LANGUAGE sql AS $$ SELECT 2;$$;
+step alter_f: ALTER FUNCTION f1() COST 71; 
+step s1: SELECT pg_get_functiondef(oid) FROM pg_proc pp WHERE proname = 'f1';
+pg_get_functiondef  
+
+CREATE OR REPLACE FUNCTION public.f1()
+ RETURNS integer
+ LANGUAGE sql
+AS $function$ SELECT 2;$function$
+
+(1 row)
+
+step c1: COMMIT;
+step alter_f: <... completed>
+step r2: ROLLBACK;
+
+starting permutation: b1 b2 drop_f create_f1 c2 c1
+step b1: BEGIN;
+step b2: BEGIN;
+step drop_f: DROP FUNCTION f1;
+step create_f1: CREATE OR REPLACE FUNCTION f1() RETURNS INT LANGUAGE sql AS $$ SELECT 2;$$; 
+step c2: COMMIT;
+step create_f1: <... completed>
+step c1: COMMIT;
+
+starting permutation: b1 b2 create_f2 create_f1 c2 c1
+step b1: BEGIN;
+step b2: BEGIN;
+step create_f2: CREATE OR REPLACE FUNCTION f1() RETURNS INT LANGUAGE sql AS $$ SELECT 3;$$;
+step create_f1: CREATE OR REPLACE FUNCTION f1() RETURNS INT LANGUAGE sql AS $$ SELECT 2;$$; 
+step c2: COMMIT;
+step create_f1: <... completed>
+step c1: COMMIT;
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index e3c669a29c7..99e3ce780a7 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -116,3 +116,4 @@ test: serializable-parallel-2
 test: serializable-parallel-3
 test: matview-write-skew
 test: lock-nowait
+test: change-function
diff --git a/src/test/isolation/specs/change-function.spec b/src/test/isolation/specs/change-function.spec
new file mode 100644
index 000..987e1ee7cf1
--- /dev/null
+++ b/src/test/isolation/specs/change-function.spec
@@ -0,0 +1,29 @@
+setup
+{
+CREATE OR REPLACE FUNCTION f1() RETURNS INT LANGUAGE sql AS $$ SELECT 1;$$;
+}
+
+teardown
+{
+DROP FUNCTION IF EXISTS f1;
+}
+
+session "s1"
+step b1 { BEGIN;}
+step create_f1   { CREATE OR REPLACE FUNCTION f1() RETURNS INT LANGUAGE sql AS $$ SELECT 2;$$; }
+step c1 { COMMIT; }
+step s1 { SELECT pg_get_functiondef(oid) FROM pg_proc pp WHERE proname = 'f1'; }
+step r1 { ROLLBACK; }
+
+session "s2"
+step b2 { BEGIN;}
+step alter_f{ ALTER FUNCTION f1() COST 71; }
+step create_f2   { CREATE OR REPLACE FUNCTION f1() RETURNS INT LANGUAGE sql AS $$ SELECT 3;$$; }
+step drop_f { DROP FUNCTION f1; }
+step c2 { COMMIT; }
+step r2 { ROLLBACK; }
+
+# Basic effects
+permutation b1 b2 create_f1 alter_f s1 c1 r2
+permutation b1 b2 drop_f create_f1 c2 c1
+permutation b1 b2 create_f2 create_f1 c2 c1
-- 
2.34.1



Re: PG 18 release notes draft committed

2025-05-22 Thread vignesh C
On Fri, 2 May 2025 at 08:14, Bruce Momjian  wrote:
>
> I have committd the first draft of the PG 18 release notes.  The item
> count looks strong:
>
> I will continue improving it until beta 1, and until the final release.
> I will probably add markup in 1-3 weeks.  Let the feedback begin.  ;-)

Regarding the following:
Change the default CREATE SUBSCRIPTION streaming option from "off" to
"parallel" (Hayato Kuroda, Masahiko Sawada, Peter Smith, Amit Kapila)

The author name was incorrectly listed in the commit; it should be
"Vignesh C". This correction is noted in the follow-up to the commit
email at [1].
[1] - 
https://www.postgresql.org/message-id/CAA4eK1K_M6%3DTB7sMKhBS5e819ePknB1_bT3a8LAnzV2dv64wjA%40mail.gmail.com

Regards,
Vignesh




Re: PG 18 release notes draft committed

2025-05-22 Thread Bruce Momjian
On Fri, May 23, 2025 at 08:19:15AM +0530, vignesh C wrote:
> On Fri, 2 May 2025 at 08:14, Bruce Momjian  wrote:
> >
> > I have committd the first draft of the PG 18 release notes.  The item
> > count looks strong:
> >
> > I will continue improving it until beta 1, and until the final release.
> > I will probably add markup in 1-3 weeks.  Let the feedback begin.  ;-)
> 
> Regarding the following:
> Change the default CREATE SUBSCRIPTION streaming option from "off" to
> "parallel" (Hayato Kuroda, Masahiko Sawada, Peter Smith, Amit Kapila)
> 
> The author name was incorrectly listed in the commit; it should be
> "Vignesh C". This correction is noted in the follow-up to the commit
> email at [1].
> [1] - 
> https://www.postgresql.org/message-id/CAA4eK1K_M6%3DTB7sMKhBS5e819ePknB1_bT3a8LAnzV2dv64wjA%40mail.gmail.com

Ah, I see, fixed with the attached patch.  It might be nice if we had a
more organized way of recording such commit corrections.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.
diff --git a/doc/src/sgml/release-18.sgml b/doc/src/sgml/release-18.sgml
index a2b1921fbbb..246e49ce740 100644
--- a/doc/src/sgml/release-18.sgml
+++ b/doc/src/sgml/release-18.sgml
@@ -1316,7 +1316,7 @@ Author: Amit Kapila 
 
 
 
-Change the default CREATE SUBSCRIPTION streaming option from "off" to "parallel" (Hayato Kuroda, Masahiko Sawada, Peter Smith, Amit Kapila)
+Change the default CREATE SUBSCRIPTION streaming option from "off" to "parallel" (Vignesh C)
 §
 
 


Re: Reduce "Var IS [NOT] NULL" quals during constant folding

2025-05-22 Thread Richard Guo
On Thu, May 22, 2025 at 11:51 PM Tom Lane  wrote:
> I wonder if a better answer would be to make the rewriter responsible
> for this.  If you hold your head at the correct angle, a table with
> virtual generated columns looks a good deal like a view, and we don't
> ask the planner to handle those.

In Peter's initial commit (83ea6c540), it was the rewriter that was
responsible for expanding virtual generated columns.  However, this
approach introduced several problems (see the reports starting from
[1]).  In some cases, we can't simply replace Var nodes that reference
virtual columns with their corresponding generation expressions.  To
preserve correctness, we may need to wrap those expressions in
PlaceHolderVars — for example, when the Vars come from the nullable
side of an outer join or are used in grouping sets.
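As a concrete sketch of that hazard (names made up; b is a virtual generated
column whose expression is not NULL even when a is):

CREATE TABLE t1 (a int);
CREATE TABLE t2 (a int, b int GENERATED ALWAYS AS (coalesce(a, 0)) VIRTUAL);

SELECT t1.a, t2.b FROM t1 LEFT JOIN t2 ON t1.a = t2.a;

-- If t2.b is naively replaced with coalesce(t2.a, 0) and evaluated above the
-- join, a NULL-extended t2 row yields 0 instead of NULL; wrapping the expanded
-- expression in a PlaceHolderVar makes it go to NULL along with the rest of
-- the nullable side.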

So in commit 1e4351af3, Dean and I proposed moving the expansion of
virtual generated columns into the planner, so that we can insert
PlaceHolderVars when needed.

Yeah, the extra table_open call is annoying.  In this patchset, we're
performing some additional tasks while the relation is open — such as
retrieving relhassubclass and attnotnull information.  We also get rid
of the has_subclass() call along the way.  Maybe this would help
justify the added cost?

[1] https://postgr.es/m/75eb1a6f-d59f-42e6-8a78-124ee808c...@gmail.com

Thanks
Richard




Understanding when VM record needs snapshot conflict horizon

2025-05-22 Thread Melanie Plageman
Hi,

I'm trying to understand when the visibility map WAL record
(xl_heap_visible) needs to include a snapshot conflict horizon.
Currently, when emitting a xl_heap_visible record after phase I of
vacuum, we include a snapshot conflict horizon if the page is being
newly set all-visible in the VM.

We do not include a snapshot conflict horizon in the xl_heap_visible
record if we are newly setting an already all-visible page all-frozen.

I thought this was because if we are setting a page all-visible in the
VM, then we are likely also setting the page level hint PD_ALL_VISIBLE
and thus are likely modifying the page (and perhaps doing so without
emitting WAL), so we should include a conflict horizon in the
subsequent xl_heap_visible record to avoid recovery conflicts. There
is no page-level hint about being all-frozen.

However, there is a comment in the code that says we don't need to
include a conflict horizon when setting an already all-visible page
all-frozen because the snapshot conflict horizon sufficient to make
everything safe for REDO was logged when the page's tuples were
frozen.

That doesn't make sense to me because:
1) isn't it possible that a page was entirely frozen but not set all-frozen
in the VM for some reason or other, so that we didn't actually
freeze any tuples in order to set the page all-frozen in the VM, and
2) if our inclusion of a cutoff_xid when freezing tuples is what makes
it safe to omit it from the VM update, then wouldn't that be equally true if
we included a cutoff_xid when pruning a page in a way that rendered it
all-visible too?

For context, I'm writing a patch to add VM update redo to the
xl_heap_prune record, and, in some cases, the record will only contain
an update to the VM and I'm trying to determine when I need a snapshot
conflict horizon in the record.
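(For anyone following along, the two VM bits in question can be inspected
with the pg_visibility extension, e.g.:

CREATE EXTENSION pg_visibility;
SELECT blkno, all_visible, all_frozen FROM pg_visibility_map('some_table');

where 'some_table' is whatever table you're poking at.)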

- Melanie




Re: queryId constant squashing does not support prepared statements

2025-05-22 Thread Michael Paquier
On Thu, May 22, 2025 at 03:10:34PM -0500, Sami Imseih wrote:
> > IMO adding a struct as suggested is okay, especially if it reduces the
> > overall code complexity.  But we don't want a node, just a bare struct.
> > Adding a node would be more troublesome.
> 
> In v4, a new private struct is added in gram.y, but we are also adding
> additional fields to track the expression boundaries to the required
> nodes.

Handling external parameters as something that gets squashed is also
the consensus I am understanding we've reached.  I'm OK with it.

Upthread, scenarios with multiple IN lists were mentioned as being broken:
https://www.postgresql.org/message-id/CAA5RZ0ts6zb-efiJ+K31Z_YDU=m7the43vv6zbcqqxiabr3...@mail.gmail.com

For example with bind queries like that:
select where $1 in ($3, $2) and 1 in ($4, cast($5 as int))
\bind 0 1 2 3 4

Should we have a bit more coverage, where we use multiple IN and/or
ARRAY lists with constants and/or external parameters?

v4-0003 with the extra tests for ARRAY can be applied first, with the
test output slightly adjusted and the casts showing up.  Now, looking
independently at v4-0001, it is a bit hard to say what the direct
benefit of this patch is, because nothing in the pgss tests changes
after applying it.  Could the benefit of this patch be demonstrated, so
that it is possible to compare the current vs. the would-be
new behavior?

The patterns generated when using casts are still a bit funny, but
perhaps nobody will bother much about the result generated, as these
are uncommon.  For example, this gets squashed, with the end of the
cast included:
Q: select where 2 in (1, 4) and 1 in (5, (cast(7 as int)), 6, cast(8 as int));
R: select where $1 in ($2 /*, ... */) and $3 in ($4 /*, ... */ as int))

This does not get squashed:
Q: select where 2 in (1, 4) and
1 in (5, cast(7 as int), 6, (cast(8 as int)), 9, 10, (cast(8 as text))::int);
R: select where $1 in ($2 /*, ... */) and
$3 in ($4, cast($5 as int), $6, (cast($7 as int)), $8, $9, (cast($10 as 
text))::int) 

This is the kind of stuff that we should also have coverage for, IMO, or
we will never keep track of what the existing behavior is and whether
things break in some way in the future.

FWIW, with v4-0002 applied, I am seeing one diff in the dml tests,
where an IN list is not squashed for pgss_dml_tab.

The squashing tests point to more issues in the v4 series:
- Some lists are not getting squashed anymore.
- Some spacing issues, like "( $5)::jsonb".
Am I missing something?
--
Michael


signature.asc
Description: PGP signature


Re: parallel_safe

2025-05-22 Thread Andy Fan
Andy Fan  writes:

Hi,

Some clearer ideas are provided below. Any feedback telling me whether this
is *obviously wrong* or *not obviously wrong* is welcome.

> see the below example:
>
> create table bigt (a int, b int, c int);
> insert into bigt select i, i, i from generate_series(1, 100)i;
> analyze bigt;
>
> select * from bigt o where b = 1 and c = (select sum(c) from bigt i where c = 
> o.c);
..
> I think the below plan should be correct and more efficient but is
> impossible.
>
> Plan 1:
>
>QUERY PLAN
> -
>  Gather
>Workers Planned: 2
>->  Parallel Seq Scan on bigt o
>  Filter: ((b = 1) AND (c = (SubPlan 1)))
>  SubPlan 1
>->  Aggregate
>  ->  Seq Scan on bigt
>Filter: (c = o.c)
> (8 rows)
>
> because:
>
> (1). During the planning of the SubPlan, we use is_parallel_safe() to
> set the "bigt i"'s consider_parallel to false because of the above
> "PARAM_EXEC" reason. 
>
> (2). The parallel_safe of the final SubPlan is set to false due to
> rel->consider_parallel. 
>
> (3). During the planning of "bigt o", it calls is_parallel_safe and then
> it find a subplan->parallel_safe == false, then all the partial path is
> impossible.
>
>
> I think it is better to think about what parallel_safe is designed
> for. In Path: 
>
> The definition looks to say: (1) the Path/Plan should not be run as a
> 'parallel_aware' plan,  but the code looks to say: (2) the Path/Plan
> should not be run in a parallel worker even if it is *not*
> parallel_aware.
..
> So parallel_safe looks have two different meaning to me.

I'd like to change 'bool parallel_safe' into 'ParallelSafety
parallel_safe' for RelOptInfo, Path and Plan (I'd also like to rename
RelOptInfo->consider_parallel to parallel_safe for consistency).

ParallelSafety would contain 3 values:

1. PARALLEL_UNSAFE = 0  // default. This acts exactly the same as the
current parallel_safe = false.  When it is set on a RelOptInfo, no
partial pathlist should be considered for it. When it is set on a
Path/Plan, no parallel worker should run that Path/Plan.

2. PARALLEL_WORKER_SAFE = 1 // We can set parallel_safe to this value for
the PARAM_EXEC case (when no parallel-unsafe function and no
Gather/GatherMerge node are present). The theory behind it is that a
non-partial path always produces a complete/identical result, no matter
which PARAM_EXEC values different workers use. The impact is that no
partial path should be considered on this RelOptInfo, but the
non-partial path/plan could still be used together with other partial paths.

3. PARALLEL_PARTIALPATH_SAFE = 2: same as the current parallel_safe = true.

With this design, more plans containing SubPlans could be parallelized. Take
my case for example:

select * from bigt o where b = 1 and c = (select sum(c) from bigt i
where c = o.c);

The RelOptInfo of 'bigt i' would have parallel_safe =
PARALLEL_WORKER_SAFE, so only non-partial paths would be generated for it,
and the final SubPlan would have parallel_safe = PARALLEL_WORKER_SAFE.

When planning the RelOptInfo of 'bigt o', we only check whether
SubPlan->parallel_safe is PARALLEL_UNSAFE, so in the end
RelOptInfo->parallel_safe is PARALLEL_PARTIALPATH_SAFE, we can
populate the partial_pathlist for it, and the desired plan can be
generated.

-- 
Best Regards
Andy Fan





Re: queryId constant squashing does not support prepared statements

2025-05-22 Thread Sami Imseih
> For example with bind queries like that:
> select where $1 in ($3, $2) and 1 in ($4, cast($5 as int))
> \bind 0 1 2 3 4
>
> Should we have a bit more coverage, where we use multiple IN and/or
> ARRAY lists with constants and/or external parameters?

I will add more test coverage. All the tests we have for constants
should also have an external parameter counterpart.

> v4-0003 with the extra tests for ARRAY can be applied first, with the
> test output slightly adjusted and the casts showing up.

That was my mistake in rearranging the v3-0001 as v4-0003. I will
fix it in the next revision.

> Now, looking
> independently at v4-0001, it is a bit hard to say what the direct
> benefit of this patch is, because nothing in the pgss tests changes
> after applying it.  Could the benefit of this patch be demonstrated, so
> that it is possible to compare the current vs. the would-be
> new behavior?

You're right, this should not be an independent patch. I had intended to
eventually merge these v4-0001 and v4-0002 but felt it was cleaner to
review separately. I'll just combine them in the next rev.

> The patterns generated when using casts are still a bit funny, but
> perhaps nobody will bother much about the result generated, as these
> are uncommon.  For example, this gets squashed, with the end of the
> cast included:
> Q: select where 2 in (1, 4) and 1 in (5, (cast(7 as int)), 6, cast(8 as int));
> R: select where $1 in ($2 /*, ... */) and $3 in ($4 /*, ... */ as int))
>
> This does not get squashed:
> Q: select where 2 in (1, 4) and
> 1 in (5, cast(7 as int), 6, (cast(8 as int)), 9, 10, (cast(8 as text))::int);
> R: select where $1 in ($2 /*, ... */) and
> $3 in ($4, cast($5 as int), $6, (cast($7 as int)), $8, $9, (cast($10 as 
> text))::int)
>
> This is the kind of stuff that we should also have coverage for, IMO, or
> we will never keep track of what the existing behavior is and whether
> things break in some way in the future.

This is interesting actually. This is the behavior on HEAD, and I don't get why
the first list with the casts does not get squashed, while the second one does.
I will check IsSquashableConst tomorrow unless Dmitry gets to it first.

```
test=# select where 2 in (1, 4) and 1 in (5, cast(7 as int), 6,
(cast(8 as int)), 9, 10, (cast(8 as text))::int);
--
(0 rows)

test=# select where 1 in (5, cast(7 as int), 6);
--
(0 rows)

test=# select queryid, substr(query, 1, 100) as query from pg_stat_statements;
   queryid|
query

--+---
---
  2125518472894925252 | select where $1 in ($2 /*, ... */) and $3 in
($4, cast($5 as int), $6, (cast($7 as
 int)), $8, $9, (c
 -4436613157077978160 | select where $1 in ($2 /*, ... */)

```

> FWIW, with v4-0002 applied, I am seeing one diff in the dml tests,
> where an IN list is not squashed for pgss_dml_tab.

hmm, I did not observe the same diff.

--
Sami




Re: PG 18 release notes draft committed

2025-05-22 Thread Michael Paquier
On Thu, May 01, 2025 at 10:44:50PM -0400, Bruce Momjian wrote:
> I will continue improving it until beta 1, and until the final release. 
> I will probably add markup in 1-3 weeks.  Let the feedback begin.  ;-)

May I suggest the attached addition for the release notes, following
commit 371f2db8b that has added support for runtime parameters in
injection points?
--
Michael
From 80eef94686acfc434a3ef01ffd73a0acd708ecb8 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Fri, 23 May 2025 15:15:01 +0900
Subject: [PATCH] doc PG 18 relnotes: Mention runtime parameters in injection
 points

As per recent commit 371f2db8b.
---
 doc/src/sgml/release-18.sgml | 12 
 1 file changed, 12 insertions(+)

diff --git a/doc/src/sgml/release-18.sgml b/doc/src/sgml/release-18.sgml
index 2a52cef1c7ce..35ba139d04d5 100644
--- a/doc/src/sgml/release-18.sgml
+++ b/doc/src/sgml/release-18.sgml
@@ -2847,6 +2847,18 @@ Injection points can now be created, but not run, via INJECTION_POINT_LOAD(), an
 
 
 
+
+
+
+
+Support runtime arguments in injection points (Michael Paquier)
+§
+
+
+
 

Re: POC: Parallel processing of indexes in autovacuum

2025-05-22 Thread Masahiko Sawada
On Thu, May 22, 2025 at 10:48 AM Sami Imseih  wrote:
>
> I started looking at the patch but I have some high level thoughts I would
> like to share before looking further.
>
> > > I find that the name "autovacuum_reserved_workers_num" is generic. It
> > > would be better to have a more specific name for parallel vacuum such
> > > as autovacuum_max_parallel_workers. This parameter is related to
> > > neither autovacuum_worker_slots nor autovacuum_max_workers, which
> > > seems fine to me. Also, max_parallel_maintenance_workers doesn't
> > > affect this parameter.
> > > ...
> > > I've also considered some alternative names. If we were to use
> > > parallel_maintenance_workers, it sounds like it controls the parallel
> > > degree for all operations using max_parallel_maintenance_workers,
> > > including CREATE INDEX. Similarly, vacuum_parallel_workers could be
> > > interpreted as affecting both autovacuum and manual VACUUM commands,
> > > suggesting that when users run "VACUUM (PARALLEL) t", the system would
> > > use their specified value for the parallel degree. I prefer
> > > autovacuum_parallel_workers or vacuum_parallel_workers.
> > >
> >
> > This was my headache when I created names for variables. Autovacuum
> > initially implies parallelism, because we have several parallel a/v
> > workers. So I think that parameter like
> > `autovacuum_max_parallel_workers` will confuse somebody.
> > If we want to have a more specific name, I would prefer
> > `max_parallel_index_autovacuum_workers`.
>
> I don't think we should have a separate pool of parallel workers for those
> that are used to support parallel autovacuum. At the end of the day, these
> are parallel workers and they should be capped by max_parallel_workers. I 
> think
> it will be confusing if we claim these are parallel workers, but they
> are coming from
> a different pool.

I agree that parallel vacuum workers used during autovacuum should be
capped by the max_parallel_workers.

>
> I envision we have another GUC such as "max_parallel_autovacuum_workers"
> (which I think is a better name) that matches the behavior of
> "max_parallel_maintenance_worker". Meaning that the autovacuum workers
> still maintain their existing behavior ( launching a worker per table
> ), and if they do need
> to vacuum in parallel, they can draw from a pool of parallel workers.
>
> With the above said, I therefore think the reloption should actually be a 
> number
> of parallel workers rather than a boolean. Let's take an example of a
> user that has 3 tables
> they wish to (auto)vacuum can process in parallel, and if available
> they wish each of these tables
> could be autovacuumed with 4 parallel workers. However, as to not
> overload the system, they
> cap the 'max_parallel_maintenance_worker' to something like 8. If it
> so happens that all
> 3 tables are auto-vacuumed at the same time, there may not be enough
> parallel workers,
> so one table will be a loser and be vacuumed in serial.

+1 for the reloption having a number of parallel workers, leaving
aside the name competition.
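To illustrate how this could end up looking for users (both names below are
placeholders, since the naming is still under discussion):

-- cluster-wide cap on parallel vacuum workers used by autovacuum
ALTER SYSTEM SET autovacuum_max_parallel_workers = 8;
SELECT pg_reload_conf();
-- per-table parallel degree for autovacuum
ALTER TABLE big_table SET (autovacuum_parallel_workers = 4);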

> That is
> acceptable, and a/v logging
> ( and perhaps other stat views ) should display this behavior: workers
> planned vs workers launched.

Agreed. The workers planned vs. launched count is reported only with the
VERBOSE option, so we need to change that so autovacuum can log it at least.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: [Util] Warn and Remove Invalid GUCs

2025-05-22 Thread David G. Johnston
On Thu, May 22, 2025 at 8:43 AM Shaik Mohammad Mujeeb <
mujeeb...@zohocorp.com> wrote:

> I do understand that not everyone may prefer seeing such warnings during
> PG server restart. To address this, we could introduce a new GUC (perhaps
> named *warn_on_unregistered_guc_prefix*), which defaults to false,
> preserving the existing behaviour. If explicitly enabled, it would emit
> warnings for these cases, giving users the choice to opt in to this
> feedback.
>
>
> Thoughts on this approach?
>

The need for yet another GUC makes this considerably less appealing than it
already is.

I see and agree with the problem statement posed here but would prefer an
approach that improves the UX to minimize such mistakes or encourages
people to check their settings more easily to ensure that they didn't type
100 when they meant to type 10 for the correct setting name.  In short,
warning in limited places for a subset of potential errors when doing so
involves false positives is just an unappealing change.

David J.


Re: POC: Parallel processing of indexes in autovacuum

2025-05-22 Thread Masahiko Sawada
On Thu, May 22, 2025 at 12:44 AM Daniil Davydov <3daniss...@gmail.com> wrote:
>
> Hi,
>
> On Wed, May 21, 2025 at 5:30 AM Masahiko Sawada  wrote:
> >
> > I have some comments on v2-0001 patch
>
> Thank you for reviewing this patch!
>
> > +   {
> > +   {"autovacuum_reserved_workers_num", PGC_USERSET,
> > RESOURCES_WORKER_PROCESSES,
> > +   gettext_noop("Number of worker processes, reserved for
> > participation in parallel index processing during autovacuum."),
> > +   gettext_noop("This parameter is depending on
> > \"max_worker_processes\" (not on \"autovacuum_max_workers\"). "
> > +"*Only* autovacuum workers can use these
> > additional processes. "
> > +"Also, these processes are taken into account
> > in \"max_parallel_workers\"."),
> > +   },
> > +   &av_reserved_workers_num,
> > +   0, 0, MAX_BACKENDS,
> > +   check_autovacuum_reserved_workers_num, NULL, NULL
> > +   },
> >
> > I find that the name "autovacuum_reserved_workers_num" is generic. It
> > would be better to have a more specific name for parallel vacuum such
> > as autovacuum_max_parallel_workers. This parameter is related to
> > neither autovacuum_worker_slots nor autovacuum_max_workers, which
> > seems fine to me. Also, max_parallel_maintenance_workers doesn't
> > affect this parameter.
> > ...
> > I've also considered some alternative names. If we were to use
> > parallel_maintenance_workers, it sounds like it controls the parallel
> > degree for all operations using max_parallel_maintenance_workers,
> > including CREATE INDEX. Similarly, vacuum_parallel_workers could be
> > interpreted as affecting both autovacuum and manual VACUUM commands,
> > suggesting that when users run "VACUUM (PARALLEL) t", the system would
> > use their specified value for the parallel degree. I prefer
> > autovacuum_parallel_workers or vacuum_parallel_workers.
> >
>
> This was my headache when I created names for variables. Autovacuum
> initially implies parallelism, because we have several parallel a/v
> workers.

I'm not sure if it's parallelism. We can have multiple autovacuum
workers simultaneously working on different tables, which does not seem
like parallelism to me.

> So I think that a parameter like
> `autovacuum_max_parallel_workers` will confuse somebody.
> If we want to have a more specific name, I would prefer
> `max_parallel_index_autovacuum_workers`.

It's better not to use 'index' as we're trying to extend parallel
vacuum to heap scanning/vacuuming as well[1].

>
> > +   /*
> > +* If we are running autovacuum - decide whether we need to process 
> > indexes
> > +* of table with given oid in parallel.
> > +*/
> > +   if (AmAutoVacuumWorkerProcess() &&
> > +   params->index_cleanup != VACOPTVALUE_DISABLED &&
> > +   RelationAllowsParallelIdxAutovac(rel))
> >
> > I think that this should be done in autovacuum code.
>
> We need the params->index_cleanup variable to decide whether we need to
> use parallel index a/v. In autovacuum.c we have this code:
> ***
> /*
>  * index_cleanup and truncate are unspecified at first in autovacuum.
>  * They will be filled in with usable values using their reloptions
>  * (or reloption defaults) later.
>  */
> tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED;
> tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED;
> ***
> This variable is filled in inside the `vacuum_rel` function, so I
> think we should keep the above logic in vacuum.c.

I guess that we can specify the parallel degree even if index_cleanup
is still UNSPECIFIED. vacuum_rel() would then decide whether to use
index vacuuming and vacuumlazy.c would decide whether to use parallel
vacuum based on the specified parallel degree and index_cleanup value.
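
In other words, something along these lines (a rough sketch only; the
parallel_workers field and the helper function are hypothetical names,
not from the patch):

    /* autovacuum.c: choose a parallel degree up front; index_cleanup can
     * still be VACOPTVALUE_UNSPECIFIED at this point */
    tab->at_params.parallel_workers = av_table_parallel_workers(rel);

    /* vacuumlazy.c: engage parallel vacuum only once index_cleanup has
     * been resolved from the reloptions */
    if (params->parallel_workers > 0 &&
        params->index_cleanup != VACOPTVALUE_DISABLED &&
        vacrel->nindexes > 1)
        use_parallel_vacuum = true;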

>
> > +#define AV_PARALLEL_DEADTUP_THRESHOLD  1024
> >
> > Are these fixed values really useful in common cases? I think we already
> > have an optimization where we skip vacuuming indexes if the table has
> > fewer dead tuples (see BYPASS_THRESHOLD_PAGES).
>
> When we allocate dead items (and optionally init parallel autovacuum) we
> don't have a sane value for `vacrel->lpdead_item_pages` (which should be
> compared with BYPASS_THRESHOLD_PAGES).
> The only criterion that we can focus on is the number of dead tuples
> indicated in the PgStat_StatTabEntry.

My point is that this criterion might not be useful. We have the
bypass optimization for index vacuuming, and having many dead tuples
doesn't necessarily mean that index vacuuming will take a long time.
Conversely, even if the table has only a few dead tuples, index
vacuuming could take a very long time if the table is very large and
has many indexes, and parallel index vacuuming would help in that
situation.
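
For reference, the existing bypass test is keyed to the fraction of heap
pages that contain LP_DEAD items, not to a raw dead-tuple count.  Roughly
(a simplified paraphrase, not the exact code in lazy_vacuum(), which also
caps the dead-item memory usage):

    /* skip index vacuuming when few heap pages actually need it */
    if (vacrel->lpdead_item_pages <
        (BlockNumber) (vacrel->rel_pages * BYPASS_THRESHOLD_PAGES))
        bypass_index_vacuuming = true;

So a table can easily exceed a fixed dead-tuple threshold and still be
bypassed here, or stay under it while index vacuuming remains the
expensive part.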

>
> 
>
> > I guess we can implement this as an integer parameter so that the
> > user can specify the number of parallel vacuum workers for
> > the table. For example, we can have a reloption
> > autovacuum_parallel_workers. Setting 0 (by default) means to disable

Re: Proposal for enabling auto-vectorization for checksum calculations

2025-05-22 Thread Matthew Sterrett
> You can see the failure at the artifacts ->
> 'log/tmp_install/log/install.log' file on the CI web page [1].
>
> If you want to replicate that on your local:
>
> $ ./configure --with-llvm CLANG="ccache clang-16"
> $ make -s -j8 world-bin
> $ make -j8 check-world
>
> should be enough. I was able to replicate it with these commands. I
> hope these help.
Thanks so much for helping me figure this out!

Okay, I've determined that versions of LLVM/Clang before 19 crash when
compiling this patch for some reason; it seems that both make
check-world and make install will crash with the affected LLVM
versions.
Unfortunately, what matters seems to be the version of the linker/LTO
optimizer, which I don't think we can check at compile time.
I added a check for Clang >= 19, which prevents the crash on my system.
It's possible that some unusual combination of clang/LLVM might still
crash during the build, but I think this is a reasonable solution.
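
One possible shape for such a guard, assuming it sits in the C source
rather than in configure/meson (the macro it defines is illustrative and
not necessarily what the attached patch does):

#if defined(__clang__) && __clang_major__ < 19
/* illustrative: fall back to the scalar checksum path, since older
 * LLVM LTO backends were seen to crash while building the new one */
#define PG_CHECKSUM_DISABLE_AUTOVEC 1
#endif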


v4-0005-Use-dummy-function-to-avoid-linker-error-move-dec.patch
Description: Binary data


v4-0004-fix-bench-compiling.patch
Description: Binary data


v4-0002-Fix-compilation-on-systems-where-immintrin.h-is-n.patch
Description: Binary data


v4-0001-Enable-autovectorizing-pg_checksum_block.patch
Description: Binary data


v4-0003-Benchmark-code-for-postgres-checksums.patch
Description: Binary data


v4-0006-Workaround-for-clang-19-crash.patch
Description: Binary data


Re: Why our Valgrind reports suck

2025-05-22 Thread Tom Lane
Andres Freund  writes:
> [ review ]

Thanks for the comments!  I'll go through them and post an updated
version tomorrow.  The cfbot is already nagging me for a rebase
now that 0013 is moot.

>> But this is the last step to get to zero reported leaks in a run of the core
>> regression tests, so let's do it.

> I assume that's just about the core tests, not more? I.e. I can't make skink
> enable leak checking?

No, we're not there yet.  I've identified some other backend issues (in
postgres_fdw in particular), and I've not looked at frontend programs
at all.  For most frontend programs, I'm dubious how much we care.

Actually the big problem is I don't know what to do about
plperl/plpython/pltcl.  I suppose the big-hammer approach
would be to put in suppression patterns covering those,
at least till such time as someone has a better idea.

I'm envisioning this patch series as v19 work; were you
thinking we should be more aggressive?

regards, tom lane




Re: Retiring some encodings?

2025-05-22 Thread Michael Paquier
On Thu, May 22, 2025 at 10:02:16AM -0400, Bruce Momjian wrote:
> Agreed on notification.  A radical idea would be to add a warning for
> the use of such encodings in PG 18, and then mention their deprecation
> in the PG 18 release notes so everyone is informed they will be removed
> in PG 19.

With v18beta1 already out in the wild, I think that we are too late to
take any action on this version at this stage.  Putting a deprecation
notice in place for a selected set of conversions and/or encodings, and
doing the actual removal work when v20 opens up around July 2026, sounds
like better timing here, if the overall consensus goes in this
direction, of course.
--
Michael



