Re: Statistics Import and Export

2025-04-05 Thread Corey Huinker
> > * Changed to use LookupExplicitNamespace() > Seems good. > * Added test for temp tables > +1 > * Doc fixes So this patch swings the pendulum a bit back towards accepting some things as errors. That's understandable, as we're never going to have a situation where we can guarantee that th

Re: Statistics Import and Export

2025-04-05 Thread Corey Huinker
> > > Also, why do we need the clause "WHERE s.tablename = ANY($2)"? Isn't > > that already implied by "JOIN unnest($1, $2) ... s.tablename = > > u.tablename"? > > Good question. Corey, do you recall why this was needed? > In my patch, that SQL statement came with the comment: + /* + * The resul

Re: Statistics Import and Export

2025-04-05 Thread Corey Huinker
> > The first is that i_relallfrozen is undefined in versions earlier than > 18. That's trivial to fix, we just add "0 AS relallfrozen," in the > earlier versions, but still refrain from outputting it. > Ok, so long as we refrain from outputting it, I'm cool with whatever we store internally. >

Re: Statistics Import and Export

2025-04-05 Thread Jeff Davis
On Wed, 2025-03-19 at 15:17 -0700, Jeff Davis wrote: > On Sat, 2025-03-15 at 21:37 -0400, Corey Huinker wrote: > > > 0001 - no changes, but the longer I go the more I'm certain this > > > is > > > something we want to do. > > This replaces regclassin with custom lookups of the namespace and > reln

Re: Statistics Import and Export

2025-04-04 Thread Nathan Bossart
On Fri, Apr 04, 2025 at 07:32:48PM -0400, Corey Huinker wrote: > This patch shrinks the array size to 1 for versions < 9.4, which keeps the > modern code fairly elegant. Committed. -- nathan

Re: Statistics Import and Export

2025-04-04 Thread Corey Huinker
On Fri, Apr 4, 2025 at 6:25 PM Nathan Bossart wrote: > On Fri, Apr 04, 2025 at 03:58:53PM -0500, Nathan Bossart wrote: > > I pushed commit 8ec0aae to fix this. > > And now I'm seeing cross-version test failures due to our use of WITH > ORDINALITY, which wasn't added until v9.4. Looking into it..

Re: Statistics Import and Export

2025-04-04 Thread Nathan Bossart
On Fri, Apr 04, 2025 at 03:58:53PM -0500, Nathan Bossart wrote: > I pushed commit 8ec0aae to fix this. And now I'm seeing cross-version test failures due to our use of WITH ORDINALITY, which wasn't added until v9.4. Looking into it... -- nathan

Re: Statistics Import and Export

2025-04-04 Thread Nathan Bossart
On Fri, Apr 04, 2025 at 03:06:45PM -0500, Nathan Bossart wrote: > I see the buildfarm failure and am working on a fix. I pushed commit 8ec0aae to fix this. -- nathan

Re: Statistics Import and Export

2025-04-04 Thread Nathan Bossart
On Fri, Apr 04, 2025 at 02:56:54PM -0500, Nathan Bossart wrote: > Committed. I see the buildfarm failure and am working on a fix. -- nathan

Re: Statistics Import and Export

2025-04-04 Thread Nathan Bossart
On Thu, Apr 03, 2025 at 09:19:51PM -0500, Nathan Bossart wrote: > Great. I'm planning to commit the attached patch set tomorrow morning. Committed. -- nathan

Re: Statistics Import and Export

2025-04-04 Thread Nathan Bossart
On Tue, Apr 01, 2025 at 10:44:19PM -0700, Jeff Davis wrote: > On Tue, 2025-04-01 at 22:21 -0500, Nathan Bossart wrote: >> We might be able to improve this by inventing a new callback that fails for >> all formats except for custom with feesko() available.  That would at least >> ensure hard failure

Re: Statistics Import and Export

2025-04-03 Thread Nathan Bossart
Thanks for reviewing. On Thu, Apr 03, 2025 at 03:23:40PM -0700, Jeff Davis wrote: > This simplifies commit a0a4601765. I'd break out that simplification as > a separate commit to make it easier to understand what happened. Done. > In patch 0003, there are quite a few static function-scoped vari

Re: Statistics Import and Export

2025-04-03 Thread Jeff Davis
On Wed, 2025-04-02 at 21:26 -0500, Nathan Bossart wrote: > Okay, here is an updated patch set. > * Besides custom format calling WriteToc() twice to update the data >   offsets, tar format ... even if it did, the worst case is that > the >   content of restore.sql (which isn't used by pg_restore

Re: statistics import and export: another difference in dump/restore

2025-04-02 Thread Ashutosh Bapat
On Wed, Apr 2, 2025 at 10:55 PM Jeff Davis wrote: > > On Wed, 2025-04-02 at 15:35 +0530, Ashutosh Bapat wrote: > > Once we fix this issue, we need to enable statistics dump and > > comparison in pg_upgrade/002_pg_upgrade using the attached patch. > > The diff appears to be an issue in 002_pg_upgra

Re: Statistics Import and Export

2025-04-02 Thread Nathan Bossart
On Wed, Apr 02, 2025 at 10:34:58PM -0400, Corey Huinker wrote: >> >> > Also, why do we need the clause "WHERE s.tablename = ANY($2)"? Isn't >> > that already implied by "JOIN unnest($1, $2) ... s.tablename = >> > u.tablename"? >> >> Good question. Corey, do you recall why this was needed? >> > >

Re: Statistics Import and Export

2025-04-02 Thread Jeff Davis
On Tue, 2025-04-01 at 22:21 -0500, Nathan Bossart wrote: > It certainly feels risky.  I was able to avoid executing the queries > twice > in all cases by saving the definition length in the TOC entry and > skipping > that many bytes the second time round. Another idea that was under-discussed is w

Re: statistics import and export: another difference in dump/restore

2025-04-02 Thread Jeff Davis
On Wed, 2025-04-02 at 15:35 +0530, Ashutosh Bapat wrote: > Once we fix this issue, we need to enable statistics dump and > comparison in pg_upgrade/002_pg_upgrade using the attached patch. The diff appears to be an issue in 002_pg_upgrade.pl introduced in 172259afb5. There are two dumps taken from

Re: Statistics Import and Export

2025-04-02 Thread Andres Freund
Hi, https://commitfest.postgresql.org/patch/4538/ is still in "needs review", even though the feature really has been committed. Is that intention, e.g. to track pending changes that we're planning to make? Greetings, Andres

Re: statistics import and export: another difference in dump/restore

2025-04-02 Thread Ashutosh Bapat
Hi, On Tue, Apr 1, 2025 at 12:54 PM Ashutosh Bapat wrote: > [1] > https://www.postgresql.org/message-id/caexhw5vvftcejh+uyznxmgsxofj_1xwi5aqhqfemqjgfmky...@mail.gmail.com > [2] https://cirrus-ci.com/task/5164175841820672 I have added this to PG 18 open items. It might be too early to call this

Re: Statistics Import and Export

2025-04-02 Thread Jeff Davis
On Tue, 2025-04-01 at 22:21 -0500, Nathan Bossart wrote: > It certainly feels risky.  I was able to avoid executing the queries > twice > in all cases by saving the definition length in the TOC entry and > skipping > that many bytes the second time round. That feels like a better approach. >   Th

Re: Statistics Import and Export

2025-04-01 Thread Nathan Bossart
On Tue, Apr 01, 2025 at 03:05:59PM -0700, Jeff Davis wrote: > To restate the problem: one of the problems being solved here is that > the existing code for custom-format dumps calls WriteToc twice. That > was not a big problem before this patch, when the contents of the > entries was easily accessi

Re: Statistics Import and Export

2025-04-01 Thread Robert Haas
On Tue, Apr 1, 2025 at 4:24 PM Jeff Davis wrote: > On Tue, 2025-04-01 at 09:37 -0400, Robert Haas wrote: > > I don't think I was aware of the open item; I was just catching up on > > email. > > I lean towards making it opt-in for pg_dump and opt-out for pg_upgrade. Big +1. > But I think we shoul

Re: Statistics Import and Export

2025-04-01 Thread Jeff Davis
On Tue, 2025-04-01 at 13:44 -0500, Nathan Bossart wrote: > Apologies for the noise.  I noticed one more way to simplify 0002.  > As > before, there should be no functional differences. To restate the problem: one of the problems being solved here is that the existing code for custom-format dumps

Re: Statistics Import and Export

2025-04-01 Thread Jeff Davis
On Tue, 2025-04-01 at 09:37 -0400, Robert Haas wrote: > I don't think I was aware of the open item; I was just catching up on > email. I lean towards making it opt-in for pg_dump and opt-out for pg_upgrade. But I think we should leave open the possibility for changing the default to opt-out for pg

Re: Statistics Import and Export

2025-04-01 Thread Nathan Bossart
On Mon, Mar 31, 2025 at 09:33:15PM -0500, Nathan Bossart wrote: > My goal is to commit the attached patches on Friday morning, but of course > that is subject to change based on any feedback or objections that emerge > in the meantime. I spent some more time polishing these patches this morning.

Re: Statistics Import and Export

2025-04-01 Thread Nathan Bossart
On Tue, Apr 01, 2025 at 01:20:30PM -0500, Nathan Bossart wrote: > On Mon, Mar 31, 2025 at 09:33:15PM -0500, Nathan Bossart wrote: >> My goal is to commit the attached patches on Friday morning, but of course >> that is subject to change based on any feedback or objections that emerge >> in the mean

Re: Statistics Import and Export

2025-04-01 Thread Robert Haas
On Mon, Mar 31, 2025 at 6:04 PM Jeff Davis wrote: > On Mon, 2025-03-31 at 13:39 -0400, Robert Haas wrote: > > +1. I think I said this before, but I don't think it's correct to > > regard the statistics as part of the database. It's great for > > pg_upgrade to preserve them, but I think doing so fo

Re: Statistics Import and Export

2025-04-01 Thread Jeff Davis
On Mon, 2025-03-31 at 13:39 -0400, Robert Haas wrote: > +1. I think I said this before, but I don't think it's correct to > regard the statistics as part of the database. It's great for > pg_upgrade to preserve them, but I think doing so for a regular dump > should be opt-in. I'm confused about th

Re: Statistics Import and Export

2025-03-31 Thread Robert Treat
On Mon, Mar 31, 2025 at 10:33 PM Nathan Bossart wrote: > On Mon, Mar 31, 2025 at 11:11:47AM -0400, Corey Huinker wrote: > Regarding whether pg_dump should dump statistics by default, my current > thinking is that it shouldn't, but I think we _should_ have pg_upgrade > dump/restore statistics by de

Re: Statistics Import and Export

2025-03-31 Thread Nathan Bossart
On Mon, Mar 31, 2025 at 11:11:47AM -0400, Corey Huinker wrote: > In light of v11-0001 being committed as 4694aedf63bf, I've rebased the > remaining patches. I spent the day preparing these for commit. A few notes: * I've added a new prerequisite patch that skips the second WriteToc() call for

Re: Statistics Import and Export

2025-03-31 Thread Robert Haas
On Thu, Feb 27, 2025 at 10:43 PM Greg Sabino Mullane wrote: > I know I'm coming late to this, but I would like us to rethink having > statistics dumped by default. +1. I think I said this before, but I don't think it's correct to regard the statistics as part of the database. It's great for pg_u

Re: Statistics Import and Export

2025-03-31 Thread Corey Huinker
> > The second is that the pg_upgrade test (when run with >> olddump/oldinstall) compares the before and after dumps, and if the >> "before" version is 17, then it will not have the relallfrozen argument >> to pg_restore_relation_stats. We might need a filtering step in >> adjust_new_dumpfile? >> >

Re: Statistics Import and Export

2025-03-28 Thread Jeff Davis
On Fri, 2025-03-28 at 21:11 -0400, Corey Huinker wrote: > A rebase and a reordering of the commits to put the really-really- > must-have relallfrozen ahead of the really-must-have stats batching > and both of them head of the error->warning step-downs. v11-0001 has a couple issues: The first is t

Re: Statistics Import and Export

2025-03-28 Thread Corey Huinker
A rebase and a reordering of the commits to put the really-really-must-have relallfrozen ahead of the really-must-have stats batching and both of them head of the error->warning step-downs. From 96b10b1eb955c5619d23cadf7de8b12d2db638a9 Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Sat, 15 Mar

Re: Statistics import and export: difference in statistics of materialized view dumped

2025-03-28 Thread Jeff Davis
On Fri, 2025-03-28 at 14:53 +0530, Ashutosh Bapat wrote: > When I applied v1 it didn't pass. I applied v1 on top of master as of March 15 (771ba90298), and then took your two changes adding the tests, and it passed. Version v2j is just rebased forward, which involved a trivial merge conflict. >

Re: Statistics import and export: difference in statistics of materialized view dumped

2025-03-28 Thread Ashutosh Bapat
On Fri, Mar 28, 2025 at 10:41 AM Jeff Davis wrote: > > On Thu, 2025-03-27 at 17:07 +0530, Ashutosh Bapat wrote: > > Pulled the latest sources but the test is still failing with the same > > differences. > > The attached set of patches (your 0001 and 0002, combined with my patch > v2j-0003) applied

Re: Statistics import and export: difference in statistics of materialized view dumped

2025-03-27 Thread Jeff Davis
On Thu, 2025-03-27 at 17:07 +0530, Ashutosh Bapat wrote: > Pulled the latest sources but the test is still failing with the same > differences. The attached set of patches (your 0001 and 0002, combined with my patch v2j-0003) applied on master (058b5152f0) are passing the pg_upgrade test suite for

Re: Statistics import and export: difference in statistics of materialized view dumped

2025-03-27 Thread Ashutosh Bapat
On Wed, Mar 12, 2025 at 4:29 PM Ashutosh Bapat wrote: > > I ran my test with this patch (we have to remove 0003 patch in my test > which uses --no-statistics option). It failed with following > differences > @@ -452068,8 +452068,8 @@ > SELECT * FROM pg_catalog.pg_restore_relation_stats( > 'version

Re: Statistics Import and Export

2025-03-27 Thread Robert Treat
On Tue, Mar 25, 2025 at 1:32 AM Jeff Davis wrote: > On Sat, 2025-03-08 at 14:09 -0500, Corey Huinker wrote: > > > > > > except it is perfectly clear that you *asked for* data and > > > statistics, so you get what you asked for. however the user > > > conjures in their heads what they are looking

Re: Statistics Import and Export

2025-03-25 Thread Corey Huinker
> > At this point, I feel I've demonstrated the limit of what can be made into > WARNINGs, giving us a range of options for now and into the beta. I'll > rebase and move the 0002 patch to be in last position so as to tee up > 0003-0004 for consideration. > And here's the rebase (after bde2fb797aae

Re: Statistics Import and Export

2025-03-25 Thread Corey Huinker
> > The original reason we wanted to issue warnings was to allow ourselves > a chance to change the meaning of parameters, add new parameters, or > even remove parameters without causing restore failures. If there are > any ERRORs that might limit our flexibility I think we should downgrade > those

Re: Statistics Import and Export

2025-03-25 Thread Jeff Davis
On Tue, 2025-03-25 at 10:53 -0400, Corey Huinker wrote: > > So this patch swings the pendulum a bit back towards accepting some > things as errors. Not exactly. I see patch 0001 as a change to the function signatures from regclass to schemaname/relname, both for usability as well as control over

Re: Statistics Import and Export

2025-03-24 Thread Jeff Davis
On Sat, 2025-03-08 at 14:09 -0500, Corey Huinker wrote: > > > > except it is perfectly clear that you *asked for* data and > > statistics, so you get what you asked for. however the user > > conjures in their heads what they are looking for, the logic is > > simple, you get what you asked for.  >

Re: Statistics Import and Export

2025-03-19 Thread Corey Huinker
> > This replaces regclassin with custom lookups of the namespace and > relname, but misses some of the complexities that regclassin is > handling. For instance, it calls RangeVarGetRelid(), which calls > LookupExplicitNamespace(), which handles temp tables and > InvokeNamespaceSearchHook(). > > At

Re: Statistics Import and Export

2025-03-19 Thread Jeff Davis
On Sat, 2025-03-15 at 21:37 -0400, Corey Huinker wrote: > > 0001 - no changes, but the longer I go the more I'm certain this is > > something we want to do. This replaces regclassin with custom lookups of the namespace and relname, but misses some of the complexities that regclassin is handling. F

Re: Statistics Import and Export

2025-03-17 Thread Nathan Bossart
On Mon, Mar 17, 2025 at 07:24:46PM -0400, Corey Huinker wrote: > On Mon, Mar 17, 2025 at 10:24 AM Nathan Bossart > wrote: >> I'm assuming that writing a completely different TOC on the second pass >> would corrupt the dump file. Perhaps we could teach it to skip stats >> entries on the second pas

Re: Statistics Import and Export

2025-03-17 Thread Corey Huinker
On Mon, Mar 17, 2025 at 10:24 AM Nathan Bossart wrote: > On Sun, Mar 16, 2025 at 05:32:15PM -0400, Corey Huinker wrote: > >> > >> * The custom format actually does two WriteToc() calls, and since these > >> patches move the queries to this part of pg_dump, it means we'll run > all > >> the qu

Re: Statistics Import and Export

2025-03-17 Thread Nathan Bossart
On Sun, Mar 16, 2025 at 05:32:15PM -0400, Corey Huinker wrote: >> >> * The custom format actually does two WriteToc() calls, and since these >> patches move the queries to this part of pg_dump, it means we'll run all >> the queries twice. The comments around this code suggest that the second >

Re: Statistics Import and Export

2025-03-16 Thread Corey Huinker
> > * The custom format actually does two WriteToc() calls, and since these > patches move the queries to this part of pg_dump, it means we'll run all > the queries twice. The comments around this code suggest that the second > pass isn't strictly necessary and that it is really only useful

Re: Statistics Import and Export

2025-03-16 Thread Nathan Bossart
Thanks for working on this, Corey. On Fri, Mar 14, 2025 at 04:03:16PM -0400, Corey Huinker wrote: > 0003 - > > Storing the restore function calls in the archive entry hogged a lot of > memory and made people nervous. This introduces a new function pointer that > generates those restore SQL calls

Re: Statistics Import and Export

2025-03-15 Thread Corey Huinker
On Thu, Mar 6, 2025 at 3:48 AM Jeff Davis wrote: > On Wed, 2025-03-05 at 23:04 -0500, Corey Huinker wrote: > > > > Anyway, here's a rebased set of the existing up-for-consideration > > patches, plus the optimization of avoiding querying on non-expression > > indexes. > > Comments on 0003: > > * A

Re: Statistics Import and Export

2025-03-15 Thread Andres Freund
On 2025-03-06 13:47:51 -0500, Corey Huinker wrote: > I'm at the same conclusion. This would mean keeping the one > getAttributeStats query perrelation, Why does it have to mean that? It surely would be easier with separate queries, but I don't think there's anything inherently blocking us from doi

Re: Statistics Import and Export

2025-03-15 Thread Corey Huinker
> > > https://www.postgresql.org/docs/current/ddl-priv.html > > The above text indicates that we should do the check, but also that > it's not terribly important for actual security. > Ok, I'm convinced. > > > If we do, we'll want to change downgrade the following errors to > > warn+return fals

Re: Statistics Import and Export

2025-03-15 Thread Andres Freund
Hi, On 2025-03-06 12:16:44 -0500, Corey Huinker wrote: > > > > To be honest, I am a bit surprised that we decided to enable this by > > default. It's not obvious to me that statistics should be regarded as > > part of the database in the same way that table definitions or table > > data are. That

Re: Statistics import and export: difference in statistics of materialized view dumped

2025-03-12 Thread Ashutosh Bapat
On Wed, Mar 12, 2025 at 4:08 AM Jeff Davis wrote: > > On Tue, 2025-03-11 at 11:26 -0400, Tom Lane wrote: > > Right, that was what I was thinking, but hadn't had time to look in > > detail. The postDataBound dependency isn't real helpful here, we > > could lose that if we had the data dependency.

Re: Statistics Import and Export

2025-03-11 Thread Nathan Bossart
On Thu, Mar 06, 2025 at 01:47:34PM -0500, Tom Lane wrote: > 1. pg_upgrade has made a policy judgement to apply parallelism across > databases not within a database, ie it will launch concurrent dump/ > restore tasks in different DBs but not authorize any one of them to > eat multiple CPUs. That ne

Re: Statistics Import and Export

2025-03-11 Thread Jeff Davis
On Wed, 2025-03-05 at 23:04 -0500, Corey Huinker wrote: > > Anyway, here's a rebased set of the existing up-for-consideration > patches, plus the optimization of avoiding querying on non-expression > indexes. Comments on 0003: * All the argument names for pg_restore_attribute_stats match pg_stat

Re: Statistics import and export: difference in statistics of materialized view dumped

2025-03-11 Thread Jeff Davis
On Tue, 2025-03-11 at 11:26 -0400, Tom Lane wrote: > Right, that was what I was thinking, but hadn't had time to look in > detail.  The postDataBound dependency isn't real helpful here, we > could lose that if we had the data dependency. Attached a patch. It's a bit messier than I expected, so I'

Re: Statistics Import and Export

2025-03-11 Thread Corey Huinker
> > I don't follow. We already have the tablenames, schemanames and oids of the > to-be-dumped tables/indexes collected in pg_dump, all that's needed is to > send > a list of those to the server to filter there? > Do we have something that currently does that? All of the collect functions (collect

Re: Statistics import and export: difference in statistics of materialized view dumped

2025-03-11 Thread Tom Lane
Jeff Davis writes: > On Tue, 2025-03-11 at 10:17 -0400, Tom Lane wrote: >> Are you doing the restore in parallel by any chance? I had a todo >> item to revisit the dependencies that pg_dump is creating for stats >> items, because they looked wrong to me, ie inadequate to guarantee >> correct rest

Re: Statistics import and export: difference in statistics of materialized view dumped

2025-03-11 Thread Jeff Davis
On Tue, 2025-03-11 at 10:17 -0400, Tom Lane wrote: > Ashutosh Bapat writes: > > After fixing the statistics difference in dumps of tables with > > indexes, I now see difference in statistics of materialized view > > dump > > in the test I am developing at [1] (see the latest patches there). > > A

Re: Statistics import and export: difference in statistics of materialized view dumped

2025-03-11 Thread Tom Lane
Ashutosh Bapat writes: > After fixing the statistics difference in dumps of tables with > indexes, I now see difference in statistics of materialized view dump > in the test I am developing at [1] (see the latest patches there). Are you doing the restore in parallel by any chance? I had a todo i

Re: Statistics Import and Export: difference in statistics dumped

2025-03-11 Thread Ashutosh Bapat
On Tue, Mar 11, 2025 at 5:23 AM Jeff Davis wrote: > > On Mon, 2025-03-10 at 17:53 -0400, Tom Lane wrote: > > I wrote: > > > I think what is happening is that the patch shut off CREATE > > > INDEX's update of not only the table's stats but also the > > > index's stats. This seems unhelpful: the in

Re: Statistics Import and Export: difference in statistics dumped

2025-03-10 Thread Jeff Davis
On Mon, 2025-03-10 at 17:53 -0400, Tom Lane wrote: > I wrote: > > I think what is happening is that the patch shut off CREATE > > INDEX's update of not only the table's stats but also the > > index's stats.  This seems unhelpful: the index's empty > > stats can never be what's wanted. > > I looked

Re: Statistics Import and Export: difference in statistics dumped

2025-03-10 Thread Tom Lane
I wrote: > I think what is happening is that the patch shut off CREATE > INDEX's update of not only the table's stats but also the > index's stats. This seems unhelpful: the index's empty > stats can never be what's wanted. I looked at this more closely and realized that it's a simple matter of h

Re: Statistics Import and Export

2025-03-09 Thread Jeff Davis
On Sat, 2025-03-08 at 10:56 -0500, Robert Treat wrote: > In the UX world, the general pattern is people start to get > overwhelmed once you get over a 1/2 dozen options (I think that's > based on Miller's law, but might be mis-remembering); we are already > at 9 for this use case. So really it is q

Re: Statistics Import and Export

2025-03-08 Thread Corey Huinker
> > Until we add a fourth option, and then it becomes completely ambiguous as >> to whether you wanted data+statstics, or you not-wanted schema. >> >> > except it is perfectly clear that you *asked for* data and statistics, so > you get what you asked for. however the user conjures in their heads w

Re: Statistics Import and Export

2025-03-08 Thread Robert Treat
On Fri, Mar 7, 2025 at 10:40 PM Corey Huinker wrote: > >> if you want everything --include=schema,data,statistics (presumably >> redundant with the default behavior) >> if you want schema only --include=schema >> if you want "everything except schema" --include=data,statistics >> > > Until we add

Re: Statistics Import and Export

2025-03-08 Thread Jeff Davis
On Wed, 2025-03-05 at 23:04 -0500, Corey Huinker wrote: > > Anyway, here's a rebased set of the existing up-for-consideration > patches, plus the optimization of avoiding querying on non-expression > indexes. Patch 0001 contains a bug: it returns REQ_STATS early, before doing any exclusions. But

Re: Statistics Import and Export

2025-03-07 Thread Corey Huinker
Updated and rebase patches. 0001 is the same as v6-0002, but with proper ACL checks on schemas after cache lookup 0002 attempts to replace all possible ERRORs in the restore/clear functions with WARNINGs. This is done with an eye towards reducing the set of things that could potentially cause an

Re: Statistics Import and Export

2025-03-07 Thread Corey Huinker
On Sat, Mar 8, 2025 at 12:52 AM Hari Krishna Sunder wrote: > To improve the performance of pg_dump can we add a new sql function that > can operate more efficiently than the pg_stats view? It could also take in > an optional list of oids to filter on. > This will help speed up the dump and restor

Re: Statistics Import and Export

2025-03-07 Thread Hari Krishna Sunder
To improve the performance of pg_dump can we add a new sql function that can operate more efficiently than the pg_stats view? It could also take in an optional list of oids to filter on. This will help speed up the dump and restore within pg18 and future upgrades to higher pg versions. Thanks Hari

Re: Statistics Import and Export

2025-03-07 Thread Corey Huinker
> > I tried to generalize that requirement to all of > {schema|data|statistics} for consistency, but that resulted in 9 > options. > 9 options that resolve to 3 boolean variables. It's not that hard. And if we add a fourth option set, then we have 12 options. So it's O(3N), not O(N^2). People ha

Re: Statistics Import and Export

2025-03-07 Thread Corey Huinker
> > > if you want everything --include=schema,data,statistics (presumably > redundant with the default behavior) > if you want schema only --include=schema > if you want "everything except schema" --include=data,statistics > Until we add a fourth option, and then it becomes completely ambiguous as

Re: Statistics Import and Export

2025-03-07 Thread Jeff Davis
On Fri, 2025-03-07 at 15:46 -0500, Robert Treat wrote: > There might be some variability depending on the default behavior, > but > if we assume that default means "output everything" The reason I posted this patch is that, depending on performance characteristics in v18 and a decision to be made

Re: Statistics Import and Export

2025-03-07 Thread Robert Treat
On Fri, Mar 7, 2025 at 1:41 PM Jeff Davis wrote: > > On Fri, 2025-03-07 at 12:41 -0500, Robert Treat wrote: > > Ugh... this feels like a bit of the combinatorial explosion, > > especially if we ever need to add another option. > > Not quite that bad, because ideally the yes/no/only would not be >

Re: Statistics Import and Export: difference in statistics dumped

2025-03-07 Thread Tom Lane
Jeff Davis writes: > Sounds good. I will commit something like the v2 patch then soon, and > if we need a different condition we can change it later. Sadly, this made things worse not better: crake is failing cross-version-upgrade tests again [1], with dump diffs like @@ -270836,8 +270836,8 @@

Re: Statistics Import and Export

2025-03-07 Thread Jeff Davis
On Fri, 2025-03-07 at 12:41 -0500, Robert Treat wrote: > Ugh... this feels like a bit of the combinatorial explosion, > especially if we ever need to add another option. Not quite that bad, because ideally the yes/no/only would not be expanding as well. But I agree that it feels like a lot of opt

Re: Statistics Import and Export

2025-03-07 Thread Robert Treat
On Thu, Mar 6, 2025 at 8:42 PM Jeff Davis wrote: > On Thu, 2025-03-06 at 11:15 -0500, Robert Haas wrote: > Patch attached. This patch does NOT change the default; stats are still > opt-out. But it makes it easier for users to start specifying what they > want or not explicitly, or to rely on the d

Re: Statistics Import and Export

2025-03-07 Thread Jeff Davis
On Fri, 2025-03-07 at 11:22 -0500, Andres Freund wrote: > +1, this has been annoying me while testing. IIRC, originally someone had questioned the need for options that expressed what was already the default, but I can't find it right now. Regardless, now the need is clear enough. > Could we, ins

Re: Statistics Import and Export

2025-03-07 Thread Andres Freund
Hi, On 2025-03-06 17:42:30 -0800, Jeff Davis wrote: > At minimum, we would need to at least add the option "--with- > statistics", because right now the only way to explicitly request stats > is to say "--statistics-only". +1, this has been annoying me while testing. I did get confused for a whi

Re: Statistics Import and Export

2025-03-07 Thread Jeff Davis
On Thu, 2025-03-06 at 08:49 -0500, Corey Huinker wrote: > Unless some check was being done by the 'foo.bar'::regclass cast, I > understand why we should add one. "For schemas, allows access to objects contained in the schema (assuming that the objects' own privilege requirements are also met). Ess

Re: Statistics Import and Export

2025-03-06 Thread Tom Lane
Andres Freund writes: >And in contrast to analyzing the database in parallel, the pg_dump/restore >work to restore stats afaict happens single-threaded for each database. In principle we should be able to do stats dump/restore parallelized just as we do for data. There are some stumbling

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > > Patch attached. This patch does NOT change the default; stats are still > opt-out. But it makes it easier for users to start specifying what they > want or not explicitly, or to rely on the defaults if they prefer. > > Note that the patch would mean we go from 2 options in v17: > --{schema|

Re: Statistics Import and Export

2025-03-06 Thread Andres Freund
Hi, On 2025-03-06 10:07:43 -0800, Jeff Davis wrote: > On Thu, 2025-03-06 at 12:16 -0500, Andres Freund wrote: > > I don't follow. We already have the tablenames, schemanames and oids > > of the > > to-be-dumped tables/indexes collected in pg_dump, all that's needed > > is to send > > a list of tho

Re: Statistics Import and Export

2025-03-06 Thread Jeff Davis
On Thu, 2025-03-06 at 11:15 -0500, Robert Haas wrote: > To be honest, I am a bit surprised that we decided to enable this by > default. It's not obvious to me that statistics should be regarded as > part of the database in the same way that table definitions or table > data are. That said, I'm not

Re: Statistics Import and Export

2025-03-06 Thread Andres Freund
Hi, On 2025-03-06 14:51:26 -0500, Tom Lane wrote: > Andres Freund writes: > > On 2025-03-06 13:47:34 -0500, Tom Lane wrote: > >> ... I wonder if we could just rip out pg_upgrade's support > >> for DB-level parallelism, which is not terribly pretty anyway, and > >> simply pass the -j switch straig

Re: Statistics Import and Export

2025-03-06 Thread Robert Haas
On Thu, Mar 6, 2025 at 3:50 PM Nathan Bossart wrote: > That being said, I do think in-database parallelism would be useful in some > cases. I frequently hear about problems with huge numbers of large objects > on a cluster with one big database. But that's probably less likely than > the many da

Re: Statistics Import and Export

2025-03-06 Thread Nathan Bossart
On Thu, Mar 06, 2025 at 01:04:55PM -0500, Andres Freund wrote: > To be clear, I think this is a very important improvement that most people > should use. +1 > I just don't think it's quite there yet. I agree that we should continue working on the performance/memory stuff. > 1) It's a difference

Re: Statistics Import and Export

2025-03-06 Thread Nathan Bossart
On Thu, Mar 06, 2025 at 03:20:16PM -0500, Andres Freund wrote: > There are many systems with hundreds of databases, removing all parallelism > for those from pg_upgrade would likely hurt way more than what we can gain > here. I just did a quick test on a freshly analyzed database with 1,000 sequen

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > To be honest, I am a bit surprised that we decided to enable this by > default. It's not obvious to me that statistics should be regarded as > part of the database in the same way that table definitions or table > data are. That said, I'm not overwhelmingly opposed to that choice. > However, ev

Re: Statistics Import and Export

2025-03-06 Thread Andres Freund
Hi, On 2025-03-06 14:47:08 -0500, Tom Lane wrote: > Nathan Bossart writes: > > On Thu, Mar 06, 2025 at 01:47:34PM -0500, Tom Lane wrote: > >> ... I wonder if we could just rip out pg_upgrade's support > >> for DB-level parallelism, which is not terribly pretty anyway, and > >> simply pass the -j

Re: Statistics Import and Export

2025-03-06 Thread Tom Lane
Andres Freund writes: > On 2025-03-06 13:47:34 -0500, Tom Lane wrote: >> ... I wonder if we could just rip out pg_upgrade's support >> for DB-level parallelism, which is not terribly pretty anyway, and >> simply pass the -j switch straight to pg_dump and pg_restore. > I don't think that'd work we

Re: Statistics Import and Export

2025-03-06 Thread Tom Lane
Nathan Bossart writes: > On Thu, Mar 06, 2025 at 01:47:34PM -0500, Tom Lane wrote: >> ... I wonder if we could just rip out pg_upgrade's support >> for DB-level parallelism, which is not terribly pretty anyway, and >> simply pass the -j switch straight to pg_dump and pg_restore. > That would cert

Re: Statistics Import and Export

2025-03-06 Thread Andres Freund
Hi, On 2025-03-06 13:47:34 -0500, Tom Lane wrote: > Andres Freund writes: > >And in contrast to analyzing the database in parallel, the > > pg_dump/restore > >work to restore stats afaict happens single-threaded for each database. > > In principle we should be able to do stats dump/resto

Re: Statistics Import and Export

2025-03-06 Thread Tom Lane
Andres Freund writes: > On 2025-03-06 13:47:51 -0500, Corey Huinker wrote: >> I'm at the same conclusion. This would mean keeping the one >> getAttributeStats query perrelation, > Why does it have to mean that? It surely would be easier with separate > queries, but I don't think there's anything

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > > The more I think about it, the less correct it seems to me to have the > statement to restore statistics tracked via ArchiveOpts->createStmt. We > use > that for DDL, but this really is data, not DDL. Because we store it in > ->createStmt it's stored in-memory for the runtime of pg_dump, wh

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > Would it be appropriate to create a temp table? I wouldn't normally > expect pg_dump to create temp tables, but I can't think of a major > reason not to. > I think we can't - the db might be a replica. > > If not, did you have in mind a CTE with a large VALUES expression, or > just a giant I

Re: Statistics Import and Export

2025-03-06 Thread Andres Freund
Hi, On 2025-03-06 13:00:07 -0500, Corey Huinker wrote: > > > > I don't follow. We already have the tablenames, schemanames and oids of the > > to-be-dumped tables/indexes collected in pg_dump, all that's needed is to > > send > > a list of those to the server to filter there? > > > > Do we have so

Re: Statistics Import and Export

2025-03-06 Thread Jeff Davis
On Thu, 2025-03-06 at 12:16 -0500, Andres Freund wrote: > I don't follow. We already have the tablenames, schemanames and oids > of the > to-be-dumped tables/indexes collected in pg_dump, all that's needed > is to send > a list of those to the server to filter there? Would it be appropriate to cre

  1   2   3   4   5   6   >