Re: Statistics Import and Export

2025-03-01 Thread Corey Huinker
> > Independently of that, do we want to switch over to storing > reltuples as a string instead of converting it? I still feel > uncomfortable about the amount of baggage we added to pg_dump > to avoid that. I'm obviously a 'yes' vote for string, either fixed width buffer or pg_strdup'd, for the

Re: Statistics Import and Export

2025-02-27 Thread Corey Huinker
On Thu, Feb 27, 2025 at 10:01 PM Corey Huinker wrote: > +--- error: relation is wrong type >> +SELECT pg_catalog.pg_restore_relation_stats( >> +'relation', 0::oid, >> +'relpages', 17::integer, >> +'reltuples&#x

Re: Statistics Import and Export

2025-03-02 Thread Corey Huinker
> > Also, we will need to think through the set of pg_dump options again. A >> lot of our tools seem to assume that "if it's the default, we don't >> need a way to ask for it explicitly", which makes it a lot harder to >> ever change the default and keep a coherent set of options. >> > > That's a g

Re: Statistics Import and Export

2025-03-07 Thread Corey Huinker
> > I tried to generalize that requirement to all of > {schema|data|statistics} for consistency, but that resulted in 9 > options. > 9 options that resolve to 3 boolean variables. It's not that hard. And if we add a fourth option set, then we have 12 options. So it's O(3N), not O(N^2). People ha

Re: Statistics Import and Export

2025-03-07 Thread Corey Huinker
> > > if you want everything --include=schema,data,statistics (presumably > redundant with the default behavior) > if you want schema only --include=schema > if you want "everything except schema" --include=data,statistics > Until we add a fourth option, and then it becomes completely ambiguous as

Re: Statistics Import and Export

2025-03-07 Thread Corey Huinker
On Sat, Mar 8, 2025 at 12:52 AM Hari Krishna Sunder wrote: > To improve the performance of pg_dump can we add a new sql function that > can operate more efficiently than the pg_stats view? It could also take in > an optional list of oids to filter on. > This will help speed up the dump and restor

Re: Statistics Import and Export

2025-03-07 Thread Corey Huinker
a couple orders of magnitude gain. From 9cd4b4e0e280d0fd8cb120ac105d6e65a491cd7e Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Tue, 4 Mar 2025 22:16:52 -0500 Subject: [PATCH v7 1/2] Split relation into schemaname and relname. In order to further reduce potential error-failures in restor

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > > Patch attached. This patch does NOT change the default; stats are still > opt-out. But it makes it easier for users to start specifying what they > want or not explicitly, or to rely on the defaults if they prefer. > > Note that the patch would mean we go from 2 options in v17: > --{schema|

Re: Statistics Import and Export

2025-03-08 Thread Corey Huinker
> > Until we add a fourth option, and then it becomes completely ambiguous as >> to whether you wanted data+statstics, or you not-wanted schema. >> >> > except it is perfectly clear that you *asked for* data and statistics, so > you get what you asked for. however the user conjures in their heads w

Re: Statistics Import and Export

2025-03-05 Thread Corey Huinker
> > One fairly easy win would be to stop issuing getAttributeStats() for > non-expression indexes. In most cases that'll already drastically cut down > on > the extra queries. That does seem like an easy win, especially since we're already using indexprs for some filters. I am concerned that, dow

Re: Statistics Import and Export

2025-03-05 Thread Corey Huinker
> > Apologies if this has already been considered upthread, but would it be > possible to use one query to gather all the required information into a > sorted table? At a glance, it looks to me like it might be feasible. I > had a lot of luck with reducing the number per-object queries with that

Re: Statistics Import and Export

2025-03-05 Thread Corey Huinker
On Wed, Mar 5, 2025 at 9:18 PM Andres Freund wrote: > Hi, > > On 2025-03-05 20:54:35 -0500, Corey Huinker wrote: > > It's been considered and not ruled out, with a "let's see how the simple > > thing works, first" approach. Considerations are:

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > > The more I think about it, the less correct it seems to me to have the > statement to restore statistics tracked via ArchiveOpts->createStmt. We > use > that for DDL, but this really is data, not DDL. Because we store it in > ->createStmt it's stored in-memory for the runtime of pg_dump, wh

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > > > > Pardon my inexperience, but aren't the ArchiveEntry records needed right > up > > until the program's run? > > s/the/the end of the/? > yes > > If there's value in freeing them, why isn't it being done already? What > > other thing would consume this freed memory? > > I'm not saying th

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > To be honest, I am a bit surprised that we decided to enable this by > default. It's not obvious to me that statistics should be regarded as > part of the database in the same way that table definitions or table > data are. That said, I'm not overwhelmingly opposed to that choice. > However, ev

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > Would it be appropriate to create a temp table? I wouldn't normally > expect pg_dump to create temp tables, but I can't think of a major > reason not to. > I think we can't - the db might be a replica. > > If not, did you have in mind a CTE with a large VALUES expression, or > just a giant I

Re: Statistics Import and Export

2025-03-11 Thread Corey Huinker
> > I don't follow. We already have the tablenames, schemanames and oids of the > to-be-dumped tables/indexes collected in pg_dump, all that's needed is to > send > a list of those to the server to filter there? > Do we have something that currently does that? All of the collect functions (collect

Re: Statistics Import and Export commit related issue.

2025-02-27 Thread Corey Huinker
On Tue, Feb 25, 2025 at 1:31 AM jian he wrote: > hi. > the thread "Statistics Import and Export" is quite hot. > so a new thread for some minor issue i found. > > > static int > _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH) > { > > if (strcmp(te->desc, "STATISTICS D

Re: Statistics Import and Export

2025-02-21 Thread Corey Huinker
> > Oy. Those are outright horrid, even without any consideration of > pre-preparing them. We know the OID of the table we want to dump, > we should be doing "FROM pg_class WHERE oid = whatever" and lose > the join to pg_namespace altogether. The explicit casts to regclass > are quite expensive

Re: Statistics Import and Export

2025-02-21 Thread Corey Huinker
moteVersionStr is "18devel" rather than "18". I didn't include any work on the attribute query as I wanted to keep that separate for clarity purposes. From 45467a69813cbf25c2850b254c5d2710c231a723 Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Fri, 21 Feb 2025 23

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 4:07 PM Jeff Davis wrote: > On Mon, 2025-02-24 at 15:40 -0500, Tom Lane wrote: > > I'm a little suspicious whether that has any effect if you insert it > > before set_pglocale_pgservice(). > > Ah, right. Corey, can you please include that (in the right place, of > course)

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
> > > > After a bit of playing around, it seemed messy to make it into > a join, but we could replace the two array_agg sub-selects with > a single one: > > (SELECT pg_catalog.array_agg(ROW(attname, attstattarget) ORDER BY attnum) > FROM pg_catalog.pg_attribute WHERE attrelid = i.indexrelid) > > a

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 12:51 PM Tom Lane wrote: > Andres Freund writes: > > On 2025-02-24 05:11:48 -0500, Corey Huinker wrote: > >> * relpages/reltuples/relallvisible are now char[32] buffers in > RelStatsInfo > >> and nowhere else (existing relpages conversion r

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
> > > > I suspect that this is a *really* bad idea. It's very very hard to get > inplace > updates right. We have several unfixed correctness bugs that are related to > the use of inplace updates. I really don't think it's wise to add > additional > interfaces that can reach inplace updates unless

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 2:36 PM Tom Lane wrote: > Corey Huinker writes: > > Sadly, that attnum isn't available in pg_stats, so we'd have to > reintroduce > > the joins to pg_namespace and pg_class to get at pg_attribute, at least > for > > indexes. > >

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
> > I don't think that's necessarily true, hot pruning might help some, as > afaict > the restore happens in multiple transactions. > If we're willing to take the potential bloat to avoid a nasty complexity, then I'm all for discarding it. Jeff just indicated off-list that he isn't seeing noticeab

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 4:33 PM Andres Freund wrote: > Hi, > > On February 24, 2025 10:30:08 PM GMT+01:00, Corey Huinker < > corey.huin...@gmail.com> wrote: > >From what I can see, it doesn't. Moreover, the attstattarget array agg is > >only done in version

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 1:54 PM Jeff Davis wrote: > On Mon, 2025-02-24 at 13:47 -0500, Corey Huinker wrote: > > There doesn't seem to be any way around it, but it will > > slightly complicate the dump-ing side of things, in that we need to > > either: > > &g

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 4:30 PM Corey Huinker wrote: > >> >> After a bit of playing around, it seemed messy to make it into >> a join, but we could replace the two array_agg sub-selects with >> a single one: >> >> (SELECT pg_catalog.array_agg(ROW(attname, a

Re: Statistics Import and Export

2025-02-23 Thread Corey Huinker
On Sun, Feb 23, 2025 at 7:22 PM Jeff Davis wrote: > On Sat, 2025-02-22 at 00:00 -0500, Corey Huinker wrote: > > > > Attached is the first optimization, which gets rid of the pg_class > > queries entirely, instead getting the same information from the > > existi

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
oved to end of both queries for consistency. From 95127f6fd82bde843e5840de93678ff784750c8a Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Mon, 24 Feb 2025 02:38:22 -0500 Subject: [PATCH v2 1/3] Leverage existing functions for relation stats. Rather than quer pg_class once per relation in order to

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 9:54 AM Andres Freund wrote: > Hi, > > On 2025-02-24 05:11:48 -0500, Corey Huinker wrote: > > Incorporating most of the feedback (I kept a few of > > the appendNamedArgument() calls) presented over the weekend. > > > > * removeVer

Re: Statistics Import and Export

2025-02-26 Thread Corey Huinker
On Wed, Feb 26, 2025 at 11:23 AM Tom Lane wrote: > Jeff Davis writes: > > I think you had mentioned upthread something about getting rid of the > > table-driven logic, which is fine with me. Did you mean for that to > > happen in this patch as well? > > Per Corey's description of the patch (I di

Re: Statistics Import and Export

2025-02-26 Thread Corey Huinker
> > I have a patch that is getting thwacked around by the churn in > stats_import.sql, and it occurred to me that I don't see why all the > negative tests for pg_restore_relation_stats() need to have all the > parameters provided. For example, in both of these tests, you are > testing the relation

Re: Statistics Import and Export

2025-02-26 Thread Corey Huinker
On Wed, Feb 26, 2025 at 4:46 PM Tom Lane wrote: > Melanie Plageman writes: > > I have a patch that is getting thwacked around by the churn in > > stats_import.sql, and it occurred to me that I don't see why all the > > negative tests for pg_restore_relation_stats() need to have all the > > para

Re: Statistics Import and Export

2025-02-26 Thread Corey Huinker
what was already there, etc. * the set difference tests remain, as they proved extremely useful in detecting undesirable side-effects during development From f3087b04784d6853970bf41eb619281d72ce94bd Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Wed, 26 Feb 2025 21:02:44 -0500 Subject: [PAT

Re: Statistics Import and Export

2025-02-25 Thread Corey Huinker
> > To my mind the next task is to get the buildfarm green again by > fixing the expression-index-stats problem. I can have a go at > that once Jeff pushes these patches, unless one of you are already > on it? > Already on it, but I can step aside if you've got a clearer vision of how to solve it

Re: Statistics Import and Export

2025-02-25 Thread Corey Huinker
On Tue, Feb 25, 2025 at 11:36 PM Tom Lane wrote: > Corey Huinker writes: > > My solution so far is to take allo the v11+ (SELECT array_agg...) > functions > > and put them into a LATERAL, two of them filtered by attstattarget > 0 > and > > a new one aggregating att

Re: Statistics Import and Export

2025-02-25 Thread Corey Huinker
On Tue, Feb 25, 2025 at 9:00 PM Jeff Davis wrote: > On Mon, 2025-02-24 at 09:54 -0500, Andres Freund wrote: > > Have you compared performance of with/without stats after these > > optimizations? > > On unoptimized build with asserts enabled, dumping the regression > database: > > --no-statistic

Re: Statistics Import and Export

2025-02-26 Thread Corey Huinker
On Wed, Feb 26, 2025 at 12:05 AM Tom Lane wrote: > Corey Huinker writes: > > Just to confirm, we ARE able to assume dense packing of attributes in an > > index, and thus we can infer the attnum from the position of the attname > in > > the aggregated array, and there&#x

Re: Statistics Import and Export

2025-02-25 Thread Corey Huinker
On Tue, Feb 25, 2025 at 1:22 PM Jeff Davis wrote: > On Mon, 2025-02-24 at 12:50 -0500, Tom Lane wrote: > > Also, while working on the attached, I couldn't help forming the > > opinion that we'd be better off to nuke pg_set_attribute_stats() > > from orbit and require people to use pg_restore_attr

Re: Add partial :-variable expansion to psql \copy

2025-03-31 Thread Corey Huinker
> > Anyway, my feeling about it is that \copy parsing is a huge hack > right now, and I'd rather see it become less of a hack, that is > more like other psql commands, instead of getting even hackier. > I wasn't as horrified as Tom, but it did have the feeling of it solving half the problem. We c

Re: Statistics Import and Export

2025-03-31 Thread Corey Huinker
ing step in >> adjust_new_dumpfile? >> > > That sounds trickier. > Narrator: It was not trickier. In light of v11-0001 being committed as 4694aedf63bf, I've rebased the remaining patches. From 607984bdcc91fa31fb7a12e9b24fb8704aa14975 Mon Sep 17 00:00:00 2001 From: Corey Huinke

Re: Statistics Import and Export

2025-03-28 Thread Corey Huinker
A rebase and a reordering of the commits to put the really-really-must-have relallfrozen ahead of the really-must-have stats batching and both of them head of the error->warning step-downs. From 96b10b1eb955c5619d23cadf7de8b12d2db638a9 Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Sat,

<    1   2   3   4   5