>
> Independently of that, do we want to switch over to storing
> reltuples as a string instead of converting it? I still feel
> uncomfortable about the amount of baggage we added to pg_dump
> to avoid that.
I'm obviously a 'yes' vote for string, either fixed width buffer or
pg_strdup'd, for the
On Thu, Feb 27, 2025 at 10:01 PM Corey Huinker
wrote:
> +--- error: relation is wrong type
>> +SELECT pg_catalog.pg_restore_relation_stats(
>> +'relation', 0::oid,
>> +'relpages', 17::integer,
>> +'reltuples
>
> Also, we will need to think through the set of pg_dump options again. A
>> lot of our tools seem to assume that "if it's the default, we don't
>> need a way to ask for it explicitly", which makes it a lot harder to
>> ever change the default and keep a coherent set of options.
>>
>
> That's a g
>
> I tried to generalize that requirement to all of
> {schema|data|statistics} for consistency, but that resulted in 9
> options.
>
9 options that resolve to 3 boolean variables. It's not that hard.
And if we add a fourth option set, then we have 12 options. So it's O(3N),
not O(N^2).
People ha
>
>
> if you want everything --include=schema,data,statistics (presumably
> redundant with the default behavior)
> if you want schema only --include=schema
> if you want "everything except schema" --include=data,statistics
>
Until we add a fourth option, and then it becomes completely ambiguous as
On Sat, Mar 8, 2025 at 12:52 AM Hari Krishna Sunder
wrote:
> To improve the performance of pg_dump can we add a new sql function that
> can operate more efficiently than the pg_stats view? It could also take in
> an optional list of oids to filter on.
> This will help speed up the dump and restor
a couple
orders of magnitude gain.
From 9cd4b4e0e280d0fd8cb120ac105d6e65a491cd7e Mon Sep 17 00:00:00 2001
From: Corey Huinker
Date: Tue, 4 Mar 2025 22:16:52 -0500
Subject: [PATCH v7 1/2] Split relation into schemaname and relname.
In order to further reduce potential error-failures in restor
>
>
> Patch attached. This patch does NOT change the default; stats are still
> opt-out. But it makes it easier for users to start specifying what they
> want or not explicitly, or to rely on the defaults if they prefer.
>
> Note that the patch would mean we go from 2 options in v17:
> --{schema|
>
> Until we add a fourth option, and then it becomes completely ambiguous as
>> to whether you wanted data+statstics, or you not-wanted schema.
>>
>>
> except it is perfectly clear that you *asked for* data and statistics, so
> you get what you asked for. however the user conjures in their heads w
>
> One fairly easy win would be to stop issuing getAttributeStats() for
> non-expression indexes. In most cases that'll already drastically cut down
> on
> the extra queries.
That does seem like an easy win, especially since we're already using
indexprs for some filters. I am concerned that, dow
>
> Apologies if this has already been considered upthread, but would it be
> possible to use one query to gather all the required information into a
> sorted table? At a glance, it looks to me like it might be feasible. I
> had a lot of luck with reducing the number per-object queries with that
On Wed, Mar 5, 2025 at 9:18 PM Andres Freund wrote:
> Hi,
>
> On 2025-03-05 20:54:35 -0500, Corey Huinker wrote:
> > It's been considered and not ruled out, with a "let's see how the simple
> > thing works, first" approach. Considerations are:
>
>
> The more I think about it, the less correct it seems to me to have the
> statement to restore statistics tracked via ArchiveOpts->createStmt. We
> use
> that for DDL, but this really is data, not DDL. Because we store it in
> ->createStmt it's stored in-memory for the runtime of pg_dump, wh
>
>
>
> > Pardon my inexperience, but aren't the ArchiveEntry records needed right
> up
> > until the program's run?
>
> s/the/the end of the/?
>
yes
> > If there's value in freeing them, why isn't it being done already? What
> > other thing would consume this freed memory?
>
> I'm not saying th
>
> To be honest, I am a bit surprised that we decided to enable this by
> default. It's not obvious to me that statistics should be regarded as
> part of the database in the same way that table definitions or table
> data are. That said, I'm not overwhelmingly opposed to that choice.
> However, ev
>
> Would it be appropriate to create a temp table? I wouldn't normally
> expect pg_dump to create temp tables, but I can't think of a major
> reason not to.
>
I think we can't - the db might be a replica.
>
> If not, did you have in mind a CTE with a large VALUES expression, or
> just a giant I
>
> I don't follow. We already have the tablenames, schemanames and oids of the
> to-be-dumped tables/indexes collected in pg_dump, all that's needed is to
> send
> a list of those to the server to filter there?
>
Do we have something that currently does that? All of the collect functions
(collect
On Tue, Feb 25, 2025 at 1:31 AM jian he wrote:
> hi.
> the thread "Statistics Import and Export" is quite hot.
> so a new thread for some minor issue i found.
>
>
> static int
> _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH)
> {
>
> if (strcmp(te->desc, "STATISTICS D
>
> Oy. Those are outright horrid, even without any consideration of
> pre-preparing them. We know the OID of the table we want to dump,
> we should be doing "FROM pg_class WHERE oid = whatever" and lose
> the join to pg_namespace altogether. The explicit casts to regclass
> are quite expensive
moteVersionStr is "18devel" rather than
"18".
I didn't include any work on the attribute query as I wanted to keep that
separate for clarity purposes.
From 45467a69813cbf25c2850b254c5d2710c231a723 Mon Sep 17 00:00:00 2001
From: Corey Huinker
Date: Fri, 21 Feb 2025 23
On Mon, Feb 24, 2025 at 4:07 PM Jeff Davis wrote:
> On Mon, 2025-02-24 at 15:40 -0500, Tom Lane wrote:
> > I'm a little suspicious whether that has any effect if you insert it
> > before set_pglocale_pgservice().
>
> Ah, right. Corey, can you please include that (in the right place, of
> course)
>
>
>
> After a bit of playing around, it seemed messy to make it into
> a join, but we could replace the two array_agg sub-selects with
> a single one:
>
> (SELECT pg_catalog.array_agg(ROW(attname, attstattarget) ORDER BY attnum)
> FROM pg_catalog.pg_attribute WHERE attrelid = i.indexrelid)
>
> a
On Mon, Feb 24, 2025 at 12:51 PM Tom Lane wrote:
> Andres Freund writes:
> > On 2025-02-24 05:11:48 -0500, Corey Huinker wrote:
> >> * relpages/reltuples/relallvisible are now char[32] buffers in
> RelStatsInfo
> >> and nowhere else (existing relpages conversion r
>
>
>
> I suspect that this is a *really* bad idea. It's very very hard to get
> inplace
> updates right. We have several unfixed correctness bugs that are related to
> the use of inplace updates. I really don't think it's wise to add
> additional
> interfaces that can reach inplace updates unless
On Mon, Feb 24, 2025 at 2:36 PM Tom Lane wrote:
> Corey Huinker writes:
> > Sadly, that attnum isn't available in pg_stats, so we'd have to
> reintroduce
> > the joins to pg_namespace and pg_class to get at pg_attribute, at least
> for
> > indexes.
>
>
>
> I don't think that's necessarily true, hot pruning might help some, as
> afaict
> the restore happens in multiple transactions.
>
If we're willing to take the potential bloat to avoid a nasty complexity,
then I'm all for discarding it. Jeff just indicated off-list that he isn't
seeing noticeab
On Mon, Feb 24, 2025 at 4:33 PM Andres Freund wrote:
> Hi,
>
> On February 24, 2025 10:30:08 PM GMT+01:00, Corey Huinker <
> corey.huin...@gmail.com> wrote:
> >From what I can see, it doesn't. Moreover, the attstattarget array agg is
> >only done in version
On Mon, Feb 24, 2025 at 1:54 PM Jeff Davis wrote:
> On Mon, 2025-02-24 at 13:47 -0500, Corey Huinker wrote:
> > There doesn't seem to be any way around it, but it will
> > slightly complicate the dump-ing side of things, in that we need to
> > either:
> >
&g
On Mon, Feb 24, 2025 at 4:30 PM Corey Huinker
wrote:
>
>>
>> After a bit of playing around, it seemed messy to make it into
>> a join, but we could replace the two array_agg sub-selects with
>> a single one:
>>
>> (SELECT pg_catalog.array_agg(ROW(attname, a
On Sun, Feb 23, 2025 at 7:22 PM Jeff Davis wrote:
> On Sat, 2025-02-22 at 00:00 -0500, Corey Huinker wrote:
> >
> > Attached is the first optimization, which gets rid of the pg_class
> > queries entirely, instead getting the same information from the
> > existi
oved to end of both queries for consistency.
From 95127f6fd82bde843e5840de93678ff784750c8a Mon Sep 17 00:00:00 2001
From: Corey Huinker
Date: Mon, 24 Feb 2025 02:38:22 -0500
Subject: [PATCH v2 1/3] Leverage existing functions for relation stats.
Rather than quer pg_class once per relation in order to
On Mon, Feb 24, 2025 at 9:54 AM Andres Freund wrote:
> Hi,
>
> On 2025-02-24 05:11:48 -0500, Corey Huinker wrote:
> > Incorporating most of the feedback (I kept a few of
> > the appendNamedArgument() calls) presented over the weekend.
> >
> > * removeVer
On Wed, Feb 26, 2025 at 11:23 AM Tom Lane wrote:
> Jeff Davis writes:
> > I think you had mentioned upthread something about getting rid of the
> > table-driven logic, which is fine with me. Did you mean for that to
> > happen in this patch as well?
>
> Per Corey's description of the patch (I di
>
> I have a patch that is getting thwacked around by the churn in
> stats_import.sql, and it occurred to me that I don't see why all the
> negative tests for pg_restore_relation_stats() need to have all the
> parameters provided. For example, in both of these tests, you are
> testing the relation
On Wed, Feb 26, 2025 at 4:46 PM Tom Lane wrote:
> Melanie Plageman writes:
> > I have a patch that is getting thwacked around by the churn in
> > stats_import.sql, and it occurred to me that I don't see why all the
> > negative tests for pg_restore_relation_stats() need to have all the
> > para
what was already there, etc.
* the set difference tests remain, as they proved extremely useful in
detecting undesirable side-effects during development
From f3087b04784d6853970bf41eb619281d72ce94bd Mon Sep 17 00:00:00 2001
From: Corey Huinker
Date: Wed, 26 Feb 2025 21:02:44 -0500
Subject: [PAT
>
> To my mind the next task is to get the buildfarm green again by
> fixing the expression-index-stats problem. I can have a go at
> that once Jeff pushes these patches, unless one of you are already
> on it?
>
Already on it, but I can step aside if you've got a clearer vision of how
to solve it
On Tue, Feb 25, 2025 at 11:36 PM Tom Lane wrote:
> Corey Huinker writes:
> > My solution so far is to take allo the v11+ (SELECT array_agg...)
> functions
> > and put them into a LATERAL, two of them filtered by attstattarget > 0
> and
> > a new one aggregating att
On Tue, Feb 25, 2025 at 9:00 PM Jeff Davis wrote:
> On Mon, 2025-02-24 at 09:54 -0500, Andres Freund wrote:
> > Have you compared performance of with/without stats after these
> > optimizations?
>
> On unoptimized build with asserts enabled, dumping the regression
> database:
>
> --no-statistic
On Wed, Feb 26, 2025 at 12:05 AM Tom Lane wrote:
> Corey Huinker writes:
> > Just to confirm, we ARE able to assume dense packing of attributes in an
> > index, and thus we can infer the attnum from the position of the attname
> in
> > the aggregated array, and there
On Tue, Feb 25, 2025 at 1:22 PM Jeff Davis wrote:
> On Mon, 2025-02-24 at 12:50 -0500, Tom Lane wrote:
> > Also, while working on the attached, I couldn't help forming the
> > opinion that we'd be better off to nuke pg_set_attribute_stats()
> > from orbit and require people to use pg_restore_attr
>
> Anyway, my feeling about it is that \copy parsing is a huge hack
> right now, and I'd rather see it become less of a hack, that is
> more like other psql commands, instead of getting even hackier.
>
I wasn't as horrified as Tom, but it did have the feeling of it solving
half the problem.
We c
ing step in
>> adjust_new_dumpfile?
>>
>
> That sounds trickier.
>
Narrator: It was not trickier.
In light of v11-0001 being committed as 4694aedf63bf, I've rebased the
remaining patches.
From 607984bdcc91fa31fb7a12e9b24fb8704aa14975 Mon Sep 17 00:00:00 2001
From: Corey Huinke
A rebase and a reordering of the commits to put the really-really-must-have
relallfrozen ahead of the really-must-have stats batching and both of them
head of the error->warning step-downs.
From 96b10b1eb955c5619d23cadf7de8b12d2db638a9 Mon Sep 17 00:00:00 2001
From: Corey Huinker
Date: Sat,
401 - 444 of 444 matches
Mail list logo