relfilenode statistics

2024-05-25 Thread Bertrand Drouvot
Hi hackers,

Please find attached a POC patch to implement $SUBJECT.

Adding relfilenode statistics has been proposed in [1]. The idea is to allow
tracking dirtied blocks, written blocks,... on a per relation basis.

The attached patch is not in a fully "polished" state yet: there is more places
we should add relfilenode counters, create more APIS to retrieve the relfilenode
stats

But I think that it is in a state that can be used to discuss the approach it
is implementing (so that we can agree or not on it) before moving forward.

The approach that is implemented in this patch is the following:

- A new PGSTAT_KIND_RELFILENODE is added
- A new attribute (aka relfile) has been added to PgStat_HashKey so that we
can record (dboid, spcOid and relfile) to identify a relfilenode entry
- pgstat_create_transactional() is used in RelationCreateStorage()
- pgstat_drop_transactional() is used in RelationDropStorage()
- RelationPreserveStorage() will remove the entry from the list of dropped stats

The current approach to deal with table rewrite is to:

- copy the relfilenode stats in table_relation_set_new_filelocator() from
the relfilenode stats entry to the shared table stats entry
- in the pg_statio_all_tables view: add the table stats entry (that contains
"previous" relfilenode stats (due to the above) that were linked to this 
relation
) to the current relfilenode stats linked to the relation

An example is done in the attached patch for the new heap_blks_written field
in pg_statio_all_tables. Outcome is:

"
postgres=# create table bdt (a int);
CREATE TABLE
postgres=# select heap_blks_written from pg_statio_all_tables where relname = 
'bdt';
 heap_blks_written
---
 0
(1 row)

postgres=# insert into bdt select generate_series(1,1);
INSERT 0 1
postgres=# select heap_blks_written from pg_statio_all_tables where relname = 
'bdt';
 heap_blks_written
---
 0
(1 row)

postgres=# checkpoint;
CHECKPOINT
postgres=# select heap_blks_written from pg_statio_all_tables where relname = 
'bdt';
 heap_blks_written
---
45
(1 row)

postgres=# truncate table bdt;
TRUNCATE TABLE
postgres=# select heap_blks_written from pg_statio_all_tables where relname = 
'bdt';
 heap_blks_written
---
45
(1 row)

postgres=# insert into bdt select generate_series(1,1);
INSERT 0 1
postgres=# select heap_blks_written from pg_statio_all_tables where relname = 
'bdt';
 heap_blks_written
---
45
(1 row)

postgres=# checkpoint;
CHECKPOINT
postgres=# select heap_blks_written from pg_statio_all_tables where relname = 
'bdt';
 heap_blks_written
---
90
(1 row)
"

Some remarks:

- My first attempt has been to call the pgstat_create_transactional() and
pgstat_drop_transactional() at the same places it is done for the relations but
that did not work well (mainly due to corner cases in case of rewrite).

- Please don't take care of the pgstat_count_buffer_read() and 
pgstat_count_buffer_hit() calls in pgstat_report_relfilenode_buffer_read()
and pgstat_report_relfilenode_buffer_hit(). Those stats will follow the same
flow as the one done and explained above for the new heap_blks_written one (
should we agree on it).

Looking forward to your comments, feedback.

Regards,

[1]: 
https://www.postgresql.org/message-id/20231113204439.r4lmys73tessqmak%40awork3.anarazel.de

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
>From e102c9d15c08c638879ece26008faee58cf4a07e Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Thu, 16 Nov 2023 02:30:01 +
Subject: [PATCH v1] Provide relfilenode statistics

---
 src/backend/access/rmgrdesc/xactdesc.c|   5 +-
 src/backend/catalog/storage.c |   8 ++
 src/backend/catalog/system_functions.sql  |   2 +-
 src/backend/catalog/system_views.sql  |   5 +-
 src/backend/postmaster/checkpointer.c |   5 +
 src/backend/storage/buffer/bufmgr.c   |   6 +-
 src/backend/storage/smgr/md.c |   7 ++
 src/backend/utils/activity/pgstat.c   |  39 --
 src/backend/utils/activity/pgstat_database.c  |  12 +-
 src/backend/utils/activity/pgstat_function.c  |  13 +-
 src/backend/utils/activity/pgstat_relation.c  | 112 --
 src/backend/utils/activity/pgstat_replslot.c  |  13 +-
 src/backend/utils/activity/pgstat_shmem.c |  19 ++-
 .../utils/activity/pgstat_subscription.c  |  12 +-
 src/backend/utils/activity/pgstat_xact.c  |  60 +++---
 src/backend/utils/adt/pgstatfuncs.c   |  34 +-
 src/include/access/tableam.h  |  19 +++
 src/include/access/xact.h |   1 +
 src/include/catalog/pg_proc.dat   |  14 ++-
 src/include/pgstat.h  |  19 ++-
 src/include/utils/pgstat_internal.h   |  34 

Re: Schema variables - new implementation for Postgres 15

2024-05-25 Thread walther

Pavel Stehule:
Sure there is more possibilities, but I don't want to lost the 
possibility to write code like


CREATE TEMP VARIABLE _x;

LET _x = 'hello';

DO $$
BEGIN
   RAISE NOTICE '%', _x;
END;
$$;

So I am searching for a way to do it safely, but still intuitive and 
user friendly.


Maybe a middle-way between this and Alvaro's proposal could be:

Whenever you have a FROM clause, a variable must be added to it to be 
accessible.  When you don't have a FROM clause, you can access it directly.


This would make the following work:

RAISE NOTICE '%', _x;

SELECT _x;

SELECT tbl.*, _x FROM tbl, _x;

SELECT tbl.*, (SELECT _x) FROM tbl, _x;

SELECT tbl.*, (SELECT _x FROM _x) FROM tbl;


But the following would be an error:

SELECT tbl.*, _x FROM tbl;

SELECT tbl.*, (SELECT _x) FROM tbl;


Best,

Wolfgang




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-25 Thread Jelte Fennema-Nio
On Thu, 23 May 2024 at 20:40, Tom Lane  wrote:
>
> Jacob Champion  writes:
> > Would it be good to expand on that idea of criticality? IIRC one of
> > Jelte's complaints earlier was that middleware has to know all the
> > extension types anyway, to be able to figure out whether it has to do
> > something about them or not. HTTP has the concept of hop-by-hop vs
> > end-to-end headers for related reasons.
>
> Yeah, perhaps.  We'd need to figure out just which classes we need
> to divide protocol parameters into, and then think about a way for
> code to understand which class a parameter falls into even when
> it doesn't specifically know that parameter.

I think this class is so rare, that it's not worth complicating the
discussion on new protocol features even more. AFAICT there is only
one proposed protocol change that does not need any pooler support
(apart from syncing the feature value when re-assigning the
connectin): Automatic binary encoding for a list of types

All others need some support from poolers, at the very least they need
new message types to not error out. But in many cases more complex
stuff is needed.




[PATCH]A minor improvement to the error-report in SimpleLruWriteAll()

2024-05-25 Thread Long Song


Hi hackers,
When I read the code, I noticed that in SimpleLruWriteAll(), only the last 
error is 
recorded when the file fails to close. Like the following,
```void
SimpleLruWriteAll(SlruCtl ctl, bool allow_redirtied)
{
SlruShared  shared = ctl->shared;
SlruWriteAllData fdata;
int64   pageno = 0;
int prevbank = SlotGetBankNumber(0);
boolok;
...
/*
 * Now close any files that were open
 */
ok = true;
for (int i = 0; i < fdata.num_files; i++)
{
if (CloseTransientFile(fdata.fd[i]) != 0)
{
slru_errcause = SLRU_CLOSE_FAILED;
slru_errno = errno;
pageno = fdata.segno[i] * SLRU_PAGES_PER_SEGMENT;
ok = false;
}
}
if (!ok)
SlruReportIOError(ctl, pageno, InvalidTransactionId);
```
// Here, SlruReportIOError() is called only once, meaning that the last error 
message is
 recorded. In my opinion, since failure to close a file is not common, logging 
an error 
message every time a failure occurs will not result in much log growth, but 
detailed error 
logging will help in most cases.

So, I changed the code to move the call to SlruReportIOError() inside the while 
loop.

Attached is the patch, I hope it can help.






v1-0001-Fix-error-report-missing-of-SimpleLruWriteAll.patch
Description: Binary data


Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-25 Thread Jelte Fennema-Nio
On Thu, 23 May 2024 at 20:12, Tom Lane  wrote:
>
> Jelte Fennema-Nio  writes:
> > On Fri, 17 May 2024 at 21:24, Robert Haas  wrote:
> >> Perhaps these are all minority positions, but I can't tell you what
> >> everyone thinks, only what I think.
>
> > I'd love to hear some opinions from others on these design choices. So
> > far it seems like we're the only two that have opinions on this (which
> > seems hard to believe) and our opinions are clearly conflicting. And
> > above all I'd like to move forward with this, be it my way or yours
> > (although I'd prefer my way of course ;) )
>
> I got around to looking through this thread in preparation for next
> week's patch review session.  I have a couple of opinions to offer:
>
> 1. Protocol versions suck.  Bumping them is seldom a good answer,
> and should never be done if you have any finer-grained negotiation
> mechanism available.  My aversion to this is over thirty years old:
> I learned that lesson from watching the GIF87-to-GIF89 transition mess.
> Authors of GIF-writing tools tended to take the easy way out and write
> "GIF89" in the header whether they were actually using any of the new
> version's features or not.  This led to an awful lot of pictures that
> couldn't be read by available GIF-displaying tools, for no good reason
> whatsoever.  The PNG committee, a couple years later, reacted to that
> mess by designing PNG to have no version number whatsoever, and yet
> be extensible in a fine-grained way.  (Basically, a PNG file is made
> up of labeled chunks.  If a reader doesn't recognize a particular
> chunk code, it can still tell whether the chunk is "critical" or not,
> and thereby decide if it must give up or can proceed while ignoring
> that chunk.)
>
> So overall, I have a strong preference for using the _pq_.xxx
> mechanism instead of a protocol version bump.  I do not believe
> the latter has any advantage.

I'm not necessarily super opposed to only using the _pq_.xxx
mechanism. I mainly think it's silly to have a protocol version number
and then never use it. And I feel some of the proposed changes don't
really benefit from being able to be turned on-and-off by themselves.
My rule of thumb would be:
1. Things that a modern client/pooler would always request: version bump
2. Everything else: _pq_.xxx

Of the proposed changes so far on the mailing list the only 2 that
would fall under 1 imho are:
1. The ParameterSet message
2. Longer than 32bit secret in BackendKeyData

I also don't think the GIF situation you describe translates fully to
this discussion. We have active protocol version negotiation, so if a
server doesn't support protocol 3.1 a client is expected to fall back
to the 3.0 protocol when communicating. Of course you can argue that a
badly behaved client will fail to connect when it gets a downgrade
request from the server, but that same argument can be made about a
server not reporting support for a _pq_.xxx parameter that every
modern client/pooler requests. So I don't think there's a practical
difference in the problem you're describing.



But again if I'm alone in this, then I don't




Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-25 Thread Jelte Fennema-Nio
(pressed send to early)

On Sat, 25 May 2024 at 12:39, Jelte Fennema-Nio  wrote:
> But again if I'm alone in this, then I don't

... mind budging on this to move this decision along. Using _pq_.xxx
parameters for all protocol changes would totally be acceptable to me.




Re: Schema variables - new implementation for Postgres 15

2024-05-25 Thread Pavel Stehule
so 25. 5. 2024 v 10:24 odesílatel  napsal:

> Pavel Stehule:
> > Sure there is more possibilities, but I don't want to lost the
> > possibility to write code like
> >
> > CREATE TEMP VARIABLE _x;
> >
> > LET _x = 'hello';
> >
> > DO $$
> > BEGIN
> >RAISE NOTICE '%', _x;
> > END;
> > $$;
> >
> > So I am searching for a way to do it safely, but still intuitive and
> > user friendly.
>
> Maybe a middle-way between this and Alvaro's proposal could be:
>
> Whenever you have a FROM clause, a variable must be added to it to be
> accessible.  When you don't have a FROM clause, you can access it directly.
>
> This would make the following work:
>
> RAISE NOTICE '%', _x;
>
> SELECT _x;
>
> SELECT tbl.*, _x FROM tbl, _x;
>
> SELECT tbl.*, (SELECT _x) FROM tbl, _x;
>
> SELECT tbl.*, (SELECT _x FROM _x) FROM tbl;
>
>
> But the following would be an error:
>
> SELECT tbl.*, _x FROM tbl;
>
> SELECT tbl.*, (SELECT _x) FROM tbl;
>
>
It looks odd - It is not intuitive, it introduces new inconsistency inside
Postgres, or with solutions in other databases. No other database has a
similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird will be
confused. Users that use PL/pgSQL will be confused.

Regards

Pavel


>
> Best,
>
> Wolfgang
>


Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2024-05-25 Thread Dave Cramer
On Sat, 25 May 2024 at 06:40, Jelte Fennema-Nio  wrote:

> On Thu, 23 May 2024 at 20:12, Tom Lane  wrote:
> >
> > Jelte Fennema-Nio  writes:
> > > On Fri, 17 May 2024 at 21:24, Robert Haas 
> wrote:
> > >> Perhaps these are all minority positions, but I can't tell you what
> > >> everyone thinks, only what I think.
> >
> > > I'd love to hear some opinions from others on these design choices. So
> > > far it seems like we're the only two that have opinions on this (which
> > > seems hard to believe) and our opinions are clearly conflicting. And
> > > above all I'd like to move forward with this, be it my way or yours
> > > (although I'd prefer my way of course ;) )
> >
> > I got around to looking through this thread in preparation for next
> > week's patch review session.  I have a couple of opinions to offer:
> >
> > 1. Protocol versions suck.  Bumping them is seldom a good answer,
> > and should never be done if you have any finer-grained negotiation
> > mechanism available.  My aversion to this is over thirty years old:
> > I learned that lesson from watching the GIF87-to-GIF89 transition mess.
> > Authors of GIF-writing tools tended to take the easy way out and write
> > "GIF89" in the header whether they were actually using any of the new
> > version's features or not.  This led to an awful lot of pictures that
> > couldn't be read by available GIF-displaying tools, for no good reason
> > whatsoever.  The PNG committee, a couple years later, reacted to that
> > mess by designing PNG to have no version number whatsoever, and yet
> > be extensible in a fine-grained way.  (Basically, a PNG file is made
> > up of labeled chunks.  If a reader doesn't recognize a particular
> > chunk code, it can still tell whether the chunk is "critical" or not,
> > and thereby decide if it must give up or can proceed while ignoring
> > that chunk.)
> >
> > So overall, I have a strong preference for using the _pq_.xxx
> > mechanism instead of a protocol version bump.  I do not believe
> > the latter has any advantage.
>
> I'm not necessarily super opposed to only using the _pq_.xxx
> mechanism.


I find it interesting that up to now nobody has ever used this mechanism.


> I mainly think it's silly to have a protocol version number
> and then never use it. And I feel some of the proposed changes don't
> really benefit from being able to be turned on-and-off by themselves.
> My rule of thumb would be:
> 1. Things that a modern client/pooler would always request: version bump
> 2. Everything else: _pq_.xxx
>

Have to agree, why have a protocol version and then just not use it ?

>
> Of the proposed changes so far on the mailing list the only 2 that
> would fall under 1 imho are:
> 1. The ParameterSet message
> 2. Longer than 32bit secret in BackendKeyData
>
> I also don't think the GIF situation you describe translates fully to
> this discussion. We have active protocol version negotiation, so if a
> server doesn't support protocol 3.1 a client is expected to fall back
> to the 3.0 protocol when communicating.


Also agree. Isn't the point of having a version number to figure out what
features the client wants and subsequently the server can provide?

> Of course you can argue that a
> badly behaved client will fail to connect when it gets a downgrade
> request from the server, but that same argument can be made about a
> server not reporting support for a _pq_.xxx parameter that every
> modern client/pooler requests. So I don't think there's a practical
> difference in the problem you're describing.
>

+1

>
>
>
> But again if I'm alone in this, then I don't
>

I would prefer to see a well defined protocol handshaking mechanism rather
than some strange _pq.xxx dance.

Dave


Re: Volatile write caches on macOS and Windows, redux

2024-05-25 Thread Jelte Fennema-Nio
On Tue, 18 Jul 2023 at 05:29, Thomas Munro  wrote:
> It's possible that fcntl(F_FULLFSYNC) might fail with ENOSUPP or other
> errors in obscure cases (eg unusual file systems).  In that case, you
> could manually lower fsync to just "on" and do your own research on
> whether power loss can toast your database, but that doesn't seem like
> a reason for us not to ship good solid defaults for typical users.

Is this the only reason why you're suggesting adding fsync=full,
instead of simply always setting F_FULLFSYNC when fsync=true on MacOS.
If so, I'm not sure we really gain anything by this tri-state. I think
people either care about data loss on power loss, or they don't. I
doubt many people want his third intermediate option, which afaict
basically means lose data on powerloss less often than fsync=false but
still lose data most of the time.

If you're going to keep this tri-state for MacOS, then it still seems
nicer to me to "fix" fsync=true on MacOS and introduce a fsync=partial
or something. Then defaults are the same across platforms and anyone
setting fsync=yes currently in their postgresql.conf would get the
fixed behaviour on upgrade.




Re: Improving tracking/processing of buildfarm test failures

2024-05-25 Thread Alexander Lakhin

Hello Amit and Noah,

24.05.2024 14:15, Amit Kapila wrote:

I feel it is a good idea to do something about this. It makes sense to
start with something simple and see how it works. I think this can
also help us whether we need to chase a particular BF failure
immediately after committing.


24.05.2024 23:00, Noah Misch wrote:




(I could start with the second approach, if you don't mind, and we'll see
how it works.)

Certainly you doing (2) can only help, though it may help less than (1).


Thank you for paying attention to this!

I've created such page to accumulate information on test failures:
https://wiki.postgresql.org/wiki/Known_Buildfarm_Test_Failures

I've deliberately added a trivial issue with partition_split, which is
doomed to be fixed soon, to test the information workflow, and I'm going
to add a few other items in the coming days.

Please share your comments and suggestions, if any.

Best regards,
Alexander




Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-05-25 Thread Alexander Korotkov
On Fri, May 24, 2024 at 11:00 PM Alexander Lakhin  wrote:
>
> 24.05.2024 22:29, Tom Lane wrote:
> > The partition_split test has unstable results, as shown at [1].
> > I suggest adding "ORDER BY conname" to the two queries shown
> > to fail there.  Better look at other queries in the test for
> > possible similar problems, too.
>
> Yes, I've just reproduced it on an aarch64 device as follows:
> echo "autovacuum_naptime = 1
> autovacuum_vacuum_threshold = 1
> autovacuum_analyze_threshold = 1
> " > ~/temp.config
> TEMP_CONFIG=~/temp.config TESTS="$(printf 'partition_split %.0s' `seq 100`)" 
> make -s check-tests
> ...
> ok 80- partition_split   749 ms
> not ok 81- partition_split   728 ms
> ok 82- partition_split   732 ms
>
> $ cat src/test/regress/regression.diffs
> diff -U3 .../src/test/regress/expected/partition_split.out 
> .../src/test/regress/results/partition_split.out
> --- .../src/test/regress/expected/partition_split.out   2024-05-15 
> 17:15:57.171999830 +
> +++ .../src/test/regress/results/partition_split.out2024-05-24 
> 19:28:37.32749 +
> @@ -625,8 +625,8 @@
>   SELECT pg_get_constraintdef(oid), conname, conkey FROM pg_constraint WHERE 
> conrelid =
> 'sales_feb_mar_apr2022'::regclass::oid;
> pg_get_constraintdef | conname | conkey
>   
> -+-+
> - CHECK ((sales_amount > 1))  | 
> sales_range_sales_amount_check  | {2}
>FOREIGN KEY (salesperson_id) REFERENCES salespeople(salesperson_id) | 
> sales_range_salesperson_id_fkey | {1}
> + CHECK ((sales_amount > 1))  | 
> sales_range_sales_amount_check  | {2}
>   (2 rows)
>
>   ALTER TABLE sales_range SPLIT PARTITION sales_feb_mar_apr2022 INTO

Tom, Alexander, thank you for spotting this.
I'm going to care about it later today.

--
Regards,
Alexander Korotkov
Supabase




Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-05-25 Thread Hannu Krosing
On Fri, May 24, 2024 at 10:00 PM Tom Lane  wrote:
>
> Robert Haas  writes:
> > On Fri, May 24, 2024 at 2:57 PM Tom Lane  wrote:
> >> Doesn't seem right to me.  That will give pg_dump the wrong idea
> >> of what the initial privileges actually were, and I don't see how
> >> it can construct correct delta GRANT/REVOKE on the basis of false
> >> information.  During the dump reload, the extension will be
> >> recreated with the original owner (I think), causing its objects'
> >> privileges to go back to the original pg_init_privs values.
>
> > Oh! That does seem like it would make what I said wrong, but how would
> > it even know who the original owner was? Shouldn't we be recreating
> > the object with the owner it had at dump time?
>
> Keep in mind that the whole point here is for the pg_dump script to
> just say "CREATE EXTENSION foo", not to mess with the individual
> objects therein.  So the objects are (probably) going to be owned by
> the user that issued CREATE EXTENSION.
>
> In the original conception, that was the end of it: what you got for
> the member objects was whatever state CREATE EXTENSION left behind.
> The idea of pg_init_privs is to support dump/reload of subsequent
> manual alterations of privileges for extension-created objects.
> I'm not, at this point, 100% certain that that's a fully realizable
> goal.

The issue became visible because pg_dump issued a bogus

REVOKE ALL ON TABLE public.pg_stat_statements FROM "16390";

Maybe the right place for a fix is in pg_dump and the fix would be to *not*
issue REVOKE ALL ON  FROM  ?

Or alternatively change REVOKE to treat non-existing users as a no-op ?

Also, the pg_init_privs entry should either go away or at least be
changed at the point when the user referenced in init-privs is
dropped.

Having an pg_init_privs entry referencing a non-existing user is
certainly of no practical use.

Or maybe we should change the user at that point to NULL or some
special non-existing-user-id ?

> But I definitely think it's insane to expect that to work
> without also tracking changes in the ownership of said objects.
>
> Maybe forbidding ALTER OWNER on extension-owned objects isn't
> such a bad idea?




Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-05-25 Thread Tom Lane
Hannu Krosing  writes:
> Having an pg_init_privs entry referencing a non-existing user is
> certainly of no practical use.

Sure, that's not up for debate.  What I think we're discussing
right now is

1. What other cases are badly handled by the pg_init_privs
mechanisms.

2. How much of that is practical to fix in v17, seeing that
it's all long-standing bugs and we're already past beta1.

I kind of doubt that the answer to #2 is "all of it".
But perhaps we can do better than "none of it".

regards, tom lane




Fix an incorrect assertion condition in mdwritev().

2024-05-25 Thread Xing Guo
Hi hackers,

In commit 4908c58[^1], a vectored variant of smgrwrite() is added and
the assertion condition in mdwritev() no longer matches the comment.
This patch helps fix it.

[^1]: 
https://github.com/postgres/postgres/commit/4908c5872059c409aa647bcde758dfeffe07996e

Best Regards,
Xing
From c77d1810ac6f13e9b58da5cd3ded5e47d44d03af Mon Sep 17 00:00:00 2001
From: Xing Guo 
Date: Sat, 25 May 2024 23:36:29 +0800
Subject: [PATCH v1] Fix an incorrect assertion in mdwritev().

In commit 4908c58[^1], we added smgrwritev() which is a vectored variant
of smgrwrite().  Since mdwritev() is to be used only for updating
already-existing blocks of a relation, the assertion condition shall be
updated accordingly.

[^1]: https://github.com/postgres/postgres/commit/4908c5872059c409aa647bcde758dfeffe07996e
---
 src/backend/storage/smgr/md.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index bf0f3ca76d..2abb5133a6 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -930,7 +930,7 @@ mdwritev(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
 {
 	/* This assert is too expensive to have on normally ... */
 #ifdef CHECK_WRITE_VS_EXTEND
-	Assert(blocknum < mdnblocks(reln, forknum));
+	Assert(blocknum + nblocks <= mdnblocks(reln, forknum));
 #endif
 
 	while (nblocks > 0)
-- 
2.45.1



Reg - pg_background async triggers

2024-05-25 Thread _sanjiv_ SK
Hi Postgres gurus,

  I try to perform DELETE and INSERT queries in the Trigger
function,

BEGIN
IF (TG_OP = 'DELETE') THEN
DELETE FROM…;
INSERT INTO….;
RETURN OLD;
ELSIF (TG_OP = 'UPDATE' OR TG_OP = 'INSERT') THEN
DELETE FROM…;
INSERT INTO….;
RETURN NEW;
END IF;
RETURN NULL; -- result is ignored since this is an AFTER trigger
END;

but as it is synchronous the performance of the each queries will be high
so I need to make the Queries in trigger function to be performed
asynchronously, I had found some approaches like dblink and pg_background,
In db_link it creates a new connection which is also not suit for my case,
it also comsumes time so I dropped it ☹.

I tried pg_background to achieve async queries like

DECLARE

result text;

BEGIN
IF (TG_OP = 'DELETE') THEN
SELECT * FROM pg_background_result(pg_background_launch(sql_command)) as
(result TEXT) INTO result;

  SELECT * FROM pg_background_result(pg_background_launch(sql_command)) as
(result TEXT) INTO result;
RETURN OLD;
ELSIF (TG_OP = 'UPDATE' OR TG_OP = 'INSERT') THEN
SELECT * FROM pg_background_result(pg_background_launch(sql_command)) as
(result TEXT) INTO result;

SELECT * FROM pg_background_result(pg_background_launch(sql_command)) as
(result TEXT) INTO result;

RETURN NEW;
END IF;
RETURN NULL; -- result is ignored since this is an AFTER trigger
END;

  Here also we are facing performance issue it consumes more
time than a direct sync Queries, So is this approach is correct for my case
and how to achieve it by any other approach. I had tried with LISTEN NOTIFY
as pg_notify() but I can’t listen and perform additional queries inside
postgres itself so I have wrote a java application to listen for this
notification and perform the queries asynchronously it is working fine😊
but I need to reduce external dependency here so please look up this issue
any suggestions most welcome..
#postgresql


Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-05-25 Thread Justin Pryzby
On Fri, May 03, 2024 at 04:32:25PM +0300, Alexander Korotkov wrote:
> On Fri, May 3, 2024 at 4:23 PM Justin Pryzby  wrote:
> > Note that the error that led to "EXCLUDING IDENTITY" is being discused
> > over here:
> > https://www.postgresql.org/message-id/3b8a9dc1-bbc7-0ef5-6863-c432afac7...@gmail.com
> >
> > It's possible that once that's addressed, the exclusion should be
> > removed here, too.
> 
> +1

Can EXCLUDING IDENTITY be removed now ?

I wasn't able to find why it was needed - at one point, I think there
was a test case that threw an error, but now when I remove the EXCLUDE,
nothing goes wrong.

-- 
Justin




Re: DROP OWNED BY fails to clean out pg_init_privs grants

2024-05-25 Thread Hannu Krosing
On Sat, May 25, 2024 at 4:48 PM Tom Lane  wrote:
>
> Hannu Krosing  writes:
> > Having an pg_init_privs entry referencing a non-existing user is
> > certainly of no practical use.
>
> Sure, that's not up for debate.  What I think we're discussing
> right now is
>
> 1. What other cases are badly handled by the pg_init_privs
> mechanisms.
>
> 2. How much of that is practical to fix in v17, seeing that
> it's all long-standing bugs and we're already past beta1.
>
> I kind of doubt that the answer to #2 is "all of it".
> But perhaps we can do better than "none of it".

Putting the fix either in pg_dump or making REVOKE tolerate
non-existing users  would definitely be most practical / useful fixes,
as these would actually allow pg_upgrade to v17 to work without
changing anything in older versions.

Currently one already can revoke a privilege that is not there in the
first place, with the end state being that the privilege (still) does
not exist.
This does not even generate a warning.

Extending this to revoking from users that do not exist does not seem
any different on conceptual level, though I understand that
implementation would be very different as it needs catching the user
lookup error from a very different part of the code.

That said, it would be better if we can have something that would be
easy to backport something that would make pg_upgrade work for all
supported versions.
Making REVOKE silently ignore revoking from non-existing users would
improve general robustness but could conceivably change behaviour if
somebody relies on it in their workflows.

Regards,
Hannu




Re: Fix an incorrect assertion condition in mdwritev().

2024-05-25 Thread Michael Paquier
On Sat, May 25, 2024 at 11:52:22PM +0800, Xing Guo wrote:
> In commit 4908c58[^1], a vectored variant of smgrwrite() is added and
> the assertion condition in mdwritev() no longer matches the comment.
> This patch helps fix it.
>
>   /* This assert is too expensive to have on normally ... */
>  #ifdef CHECK_WRITE_VS_EXTEND
> - Assert(blocknum < mdnblocks(reln, forknum));
> + Assert(blocknum + nblocks <= mdnblocks(reln, forknum));
>  #endif

Yes, it looks like you're right that this can be made stricter,
computing the number of blocks we're adding in the number calculated
(aka adding one block to this number fails immediately at initdb).
--
Michael


signature.asc
Description: PGP signature


Re: First draft of PG 17 release notes

2024-05-25 Thread Bruce Momjian
On Fri, May 24, 2024 at 10:50:28AM -0700, Andres Freund wrote:
> Hi,
> 
> On 2024-05-23 23:27:04 -0400, Bruce Momjian wrote:
> > On Thu, May 23, 2024 at 11:11:10PM -0400, Tom Lane wrote:
> > > Bruce Momjian  writes:
> > > I am not sure Bruce that you realize that your disregard for
> > > performance improvements is shared by nobody.  Arguably,
> > > performance is 90% of what we do these days, and it's also
> > > 90% of what users care about.
> >
> > Please stop saying I don't document performance.  I have already
> > explained enough which performance items I choose.  Please address my
> > criteria or suggest new criteria.
> 
> Bruce, just about everyone seems to disagree with your current approach. And
> not just this year, this has been a discussion in most if not all release note
> threads of the last few years.
> 
> People, including me, *have* addressed your criteria, but you just waved those
> concerns away. It's hard to continue discussing criteria when it doesn't at
> all feel like a conversation.
> 
> In the end, these are patches to the source code, I don't think you can just
> wave away widespread disagreement with your changes. That's not how we do
> postgres development.

Well, let's start with a new section for PG 17 that lists these.  Is it
20 items, 50, or 150?  I have no idea, but without the user-visible
filter, I am unable to determine what not-included performance features
are worthy of the release notes.

Can someone do that?  There is no reason other committers can't change
the release notes.  Yes, I realize we are looking for a consistent
voice, but the new section can probably have its own style, and I can
make adjustments if desired.

Also, I think this has gone unaddressed so long because if we skip a
user-visible change, users complain, but I don't remember anyone
complaining about skipped performance changes.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: First draft of PG 17 release notes

2024-05-25 Thread Bruce Momjian
On Fri, May 24, 2024 at 11:23:29AM -0700, Andres Freund wrote:
> Hi,
> 
> On 2024-05-22 18:33:03 -0400, Bruce Momjian wrote:
> > On Tue, May 21, 2024 at 09:40:28AM -0700, Andres Freund wrote:
> > > On 2024-05-18 11:13:54 -0400, Bruce Momjian wrote:
> > > I agree keeping things reasonably short is important. But I don't think 
> > > you're
> > > evenly applying it as a goal.
> > >
> > > Just skimming the notes from the end, I see
> > > - an 8 entries long pg_stat_statements section
> >
> > What item did you want to remove?  Those are all user-visible changes.
> 
> My point here was not that we necessarily need to remove those, but that their
> impact to users is smaller than many of the performance impacts you disregard.

I liked all your detailed suggestions so applied the attached patch. 

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.
diff --git a/doc/src/sgml/release-17.sgml b/doc/src/sgml/release-17.sgml
index 891678cc94b..1e65e99f2b2 100644
--- a/doc/src/sgml/release-17.sgml
+++ b/doc/src/sgml/release-17.sgml
@@ -628,45 +628,20 @@ Relevant columns have been removed from pg_stat_bgwriter and added to this new s
 
-
-
-
-Allow pg_stat_reset_shared() to reset all shared statistics (Atsushi Torikoshi)
-
-
-
-This is done by passing NULL.
-
-
-
-
-
-
-
-Allow pg_stat_reset_shared('slru') to clear SLRU statistics (Atsushi Torikoshi)
-
-
-
-Now pg_stat_reset_shared(NULL) also resets SLRU statistics.
-
-
-
-
 
 
 
-Allow pg_stat_reset_slru() to reset all SLRU statistics (Bharath Rupireddy)
+Improve control over resetting statistics (Atsushi Torikoshi, Bharath Rupireddy)
 
 
 
-The command pg_stat_reset_slru(NULL) already did this.
+Allow pg_stat_reset_shared() (with no arguments) and pg_stat_reset_shared(NULL) to reset all shared statistics.
+Allow pg_stat_reset_shared('slru') and pg_stat_reset_slru() (with no arguments) to reset SLRU statistics, which was already possible with pg_stat_reset_slru(NULL).
 
 
 
@@ -784,21 +759,22 @@ Add server variable trace_connection_negotiation to allow debugging of connectio
 
 
 
 
-Add per-table GRANT permission MAINTAIN to control maintenance operations (Nathan Bossart)
+Allow granting the right to perform maintenance operations (Nathan Bossart)
 
 
 
-The operations are VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZED VIEW, CLUSTER, and LOCK TABLE.
+The permission can be granted on a per-table basis using the MAINTAIN privilege and on a per-role basis via the pg_maintain predefined role.  Permitted operations are VACUUM, ANALYZE,
+REINDEX, REFRESH MATERIALIZED VIEW, CLUSTER, and LOCK TABLE.
 
 
 
 
 
 
@@ -2234,45 +2210,19 @@ Allow reindexdb --index to process indexes from different tables in parallel (Ma
 
-
-
-
-Allow reindexdb to process objects in all databases matching a pattern (Nathan Bossart)
-
-
-
-Specifically, --all can now be used with --table, --schema, --index, and --system.
-
-
-
-
-
-
-
-Allow vacuumdb to process objects in all databases matching a pattern (Nathan Bossart)
-
-
-
-Specifically, --all can now be used with --table, --schema, and --exclude-schema.
-
-
-
-
 
 
 
-Allow clusterdb to process objects in all databases matching a pattern (Nathan Bossart)
+Allow reindexdb, vacuumdb, and clusterdb to process objects in all databases matching a pattern (Nathan Bossart)
 
 
 
-Specifically, --all can now be used with --table.
+The new option --all controls this behavior.
 
 
 
@@ -2550,28 +2500,6 @@ This value is used by the optimizer.
 
 
 
-
-
-
-
-Create custom wait events for postgres_fdw (Masahiro Ikeda)
-
-
-
-
-
-
-
-Create custom wait events for dblink (Masahiro Ikeda)
-
-
-
 
 
 
 
 Allow extensions to define custom wait events (Masahiro Ikeda)
 
+
+
+Custom wait events have been added to postgres_fdw and dblink.
+
 
 
 

Re: First draft of PG 17 release notes

2024-05-25 Thread Bruce Momjian
On Thu, May 23, 2024 at 01:22:51PM +0200, Álvaro Herrera wrote:
> Hello,
> 
> Regarding this item
> 
> : Allow the SLRU cache sizes to be configured (Andrey Borodin, Dilip Kumar)
> : 
> : The new server variables are commit_timestamp_buffers,
> : multixact_member_buffers, multixact_offset_buffers, notify_buffers,
> : serializable_buffers, subtransaction_buffers, and transaction_buffers.
> 
> I hereby request to be listed as third author of this feature.
> 
> Also, I'd like to suggest to make it more verbose, as details might be
> useful to users.  Mention that scalability is improved, because
> previously we've suggested to recompile with larger #defines, but to be
> cautious because values too high degrade performance.  Also mention the
> point that some of these grow with shared_buffers is user-visible enough
> that it warrants an explicit mention.  How about like this:
> 
> : Allow the SLRU cache sizes to be configured and improve performance of
> : larger caches
> : (Andrey Borodin, Dilip Kumar, Álvaro Herrera)
> : 
> : The new server variables are commit_timestamp_buffers,
> : multixact_member_buffers, multixact_offset_buffers, notify_buffers,
> : serializable_buffers, subtransaction_buffers, and transaction_buffers.
> : commit_timestamp_buffers, transaction_buffers and
> : subtransaction_buffers scale up automatically with shared_buffers.

Yes, I like that, patch applied.

> These three items
> 
> : Allow pg_stat_reset_shared() to reset all shared statistics (Atsushi 
> Torikoshi)
> : 
> : This is done by passing NULL.
> : 
> : Allow pg_stat_reset_shared('slru') to clear SLRU statistics (Atsushi 
> Torikoshi)
> : 
> : Now pg_stat_reset_shared(NULL) also resets SLRU statistics.
> : 
> : Allow pg_stat_reset_slru() to reset all SLRU statistics (Bharath Rupireddy)
> : 
> : The command pg_stat_reset_slru(NULL) already did this.
> 
> seem a bit repetitive.  (I think the first one is also wrong, because it
> says you have to pass NULL, but in reality you can also not give an
> argument and it works.)  Can we make them a single item?  Maybe
> something like
> 
> : Improve reset routines for shared statistics (Atsushi Torikoshi, Bharath 
> Rupireddy)
> :
> : Resetting all shared statistics can now be done with
> : pg_stat_reset_shared() or pg_stat_reset_shared(NULL), while SLRU
> : statistics can now be reset with pg_stat_reset_shared('slru'),
> : pg_stat_reset_slru() and pg_stat_reset_slru(NULL).

Andres already suggested improvement for this, and I posted the applied
patch.  Can you see if that is good or can be improved?  Thanks.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: First draft of PG 17 release notes

2024-05-25 Thread Bruce Momjian
On Thu, May 23, 2024 at 04:54:28PM -0300, Marcos Pegoraro wrote:
>   • Rename SLRU columns in system view pg_stat_slru (Alvaro Herrera)
> 
> The column names accepted by pg_stat_slru_rest() are also changed.
> 
> Is pg_stat_slru_rest() correct ? 

Oops, typo, fixed, thanks.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-05-25 Thread Alexander Korotkov
On Sat, May 25, 2024 at 8:53 PM Justin Pryzby  wrote:
> On Fri, May 03, 2024 at 04:32:25PM +0300, Alexander Korotkov wrote:
> > On Fri, May 3, 2024 at 4:23 PM Justin Pryzby  wrote:
> > > Note that the error that led to "EXCLUDING IDENTITY" is being discused
> > > over here:
> > > https://www.postgresql.org/message-id/3b8a9dc1-bbc7-0ef5-6863-c432afac7...@gmail.com
> > >
> > > It's possible that once that's addressed, the exclusion should be
> > > removed here, too.
> >
> > +1
>
> Can EXCLUDING IDENTITY be removed now ?
>
> I wasn't able to find why it was needed - at one point, I think there
> was a test case that threw an error, but now when I remove the EXCLUDE,
> nothing goes wrong.

Yes, it was broken before [1][2], but now it seems to work.  At the
same time, I'm not sure if we need to remove the EXCLUDE now.
IDENTITY is anyway successfully created when the new partition gets
attached.

Links.
1. 
https://www.postgresql.org/message-id/171085360143.2046436.7217841141682511557.p...@coridan.postgresql.org
2. 
https://www.postgresql.org/message-id/flat/ZiGH0xc1lxJ71ZfB%40pryzbyj2023#297b6aef85cb089abb38e9b1a9a7

--
Regards,
Alexander Korotkov
Supabase




Re: First draft of PG 17 release notes

2024-05-25 Thread Bruce Momjian
On Thu, May 23, 2024 at 08:19:15PM -0400, Peter Geoghegan wrote:
> On Wed, May 22, 2024 at 6:50 PM Bruce Momjian  wrote:
> > Agreed, patch applied, thanks.
> 
> The item for my commit 5bf748b8 currently reads:
> 
> "Allow btree indexes to more efficiently find array matches"
> 
> I think that this isn't likely to mean much to most users. It seems
> like it'd be a lot clearer if the wording was more in line with the
> beta1 announcement, which talks about the work as an enhancement to
> index scans that use an IN ( ) condition. Specifically referencing
> IN() (as opposed to something about arrays or array conditions) is
> likely to make the item much more understandable.
> 
> Referencing array matches makes me think of a GIN index on an array
> column. While ScalarArrayOps do use an array under the cover, that's
> mostly an implementation detail. At least it is to users that
> exclusively use IN(), likely the majority (that's the SQL standard
> syntax).
> 
> For context, the Postgres 9.2 release notes described the feature that
> my work directly built on as follows:
> 
> "Allow indexed_col op ANY(ARRAY[...]) conditions to be used in plain
> index scans and index-only scans"
> 
> This was a very accurate description of this earlier work. Similar
> wording could be used now, but that doesn't seem great to me either.
> Simply because this wording also doesn't reference IN() conditions in
> index quals.

Agreed.  I changed it to:

Allow btree indexes to more efficiently find a set of values, such as
those supplied by IN subqueries

Is that good?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-05-25 Thread Alexander Korotkov
On Sat, May 25, 2024 at 3:53 PM Alexander Korotkov  wrote:
> On Fri, May 24, 2024 at 11:00 PM Alexander Lakhin  wrote:
> >
> > 24.05.2024 22:29, Tom Lane wrote:
> > > The partition_split test has unstable results, as shown at [1].
> > > I suggest adding "ORDER BY conname" to the two queries shown
> > > to fail there.  Better look at other queries in the test for
> > > possible similar problems, too.
> >
> > Yes, I've just reproduced it on an aarch64 device as follows:
> > echo "autovacuum_naptime = 1
> > autovacuum_vacuum_threshold = 1
> > autovacuum_analyze_threshold = 1
> > " > ~/temp.config
> > TEMP_CONFIG=~/temp.config TESTS="$(printf 'partition_split %.0s' `seq 
> > 100`)" make -s check-tests
> > ...
> > ok 80- partition_split   749 ms
> > not ok 81- partition_split   728 ms
> > ok 82- partition_split   732 ms
> >
> > $ cat src/test/regress/regression.diffs
> > diff -U3 .../src/test/regress/expected/partition_split.out 
> > .../src/test/regress/results/partition_split.out
> > --- .../src/test/regress/expected/partition_split.out   2024-05-15 
> > 17:15:57.171999830 +
> > +++ .../src/test/regress/results/partition_split.out2024-05-24 
> > 19:28:37.32749 +
> > @@ -625,8 +625,8 @@
> >   SELECT pg_get_constraintdef(oid), conname, conkey FROM pg_constraint 
> > WHERE conrelid =
> > 'sales_feb_mar_apr2022'::regclass::oid;
> > pg_get_constraintdef | conname | conkey
> >   
> > -+-+
> > - CHECK ((sales_amount > 1))  | 
> > sales_range_sales_amount_check  | {2}
> >FOREIGN KEY (salesperson_id) REFERENCES salespeople(salesperson_id) | 
> > sales_range_salesperson_id_fkey | {1}
> > + CHECK ((sales_amount > 1))  | 
> > sales_range_sales_amount_check  | {2}
> >   (2 rows)
> >
> >   ALTER TABLE sales_range SPLIT PARTITION sales_feb_mar_apr2022 INTO
>
> Tom, Alexander, thank you for spotting this.
> I'm going to care about it later today.

ORDER BY is added in d53a4286d7 in these queries altogether with other
catalog queries with potentially unstable result.

--
Regards,
Alexander Korotkov
Supabase




Re: Fix an incorrect assertion condition in mdwritev().

2024-05-25 Thread Tom Lane
Michael Paquier  writes:
> On Sat, May 25, 2024 at 11:52:22PM +0800, Xing Guo wrote:
>> #ifdef CHECK_WRITE_VS_EXTEND
>> -Assert(blocknum < mdnblocks(reln, forknum));
>> +Assert(blocknum + nblocks <= mdnblocks(reln, forknum));
>> #endif

> Yes, it looks like you're right that this can be made stricter,
> computing the number of blocks we're adding in the number calculated
> (aka adding one block to this number fails immediately at initdb).

Hmm ... I agree that this is better normally.  But there's an
edge case where it would fail to notice a problem that the
existing code does notice: if blocknum is close to UINT32_MAX
and adding nblocks causes it to wrap around to a small value.
Is there an inexpensive way to catch that?  (If not, it's
not a reason to block this patch; but let's think about it
while we're here.)

regards, tom lane




Re: AIX support

2024-05-25 Thread Bruce Momjian
On Thu, May 23, 2024 at 07:03:20PM +0300, Heikki Linnakangas wrote:
> > Can you provide some more details on the expectations here?
> 
> Smallest possible patch that makes Postgres work on AIX again.
> 
> Perhaps start with the patch you posted yesterday, but remove hunks from it
> one by one, to see which ones are still needed.

Yes, bingo, that is exactly what needs to be done, and for the minimal
compiler, gcc, and the most recently supported versions of AIX.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Fix an incorrect assertion condition in mdwritev().

2024-05-25 Thread Tom Lane
I wrote:
> Hmm ... I agree that this is better normally.  But there's an
> edge case where it would fail to notice a problem that the
> existing code does notice: if blocknum is close to UINT32_MAX
> and adding nblocks causes it to wrap around to a small value.
> Is there an inexpensive way to catch that?

After a few minutes' thought, how about:

Assert((uint64) blocknum + (uint64) nblocks <= (uint64) mdnblocks(reln, 
forknum));

This'd stop being helpful if we ever widen BlockNumber to 64 bits,
but I think that's unlikely.  (Partitioning seems like a better answer
for giant tables.)

regards, tom lane