date:20240723

On Mon, 2024-07-22 at 13:55 -0400, Robert Haas wrote:
> On Mon, Jul 22, 2024 at 1:18 PM Laurenz Albe  wrote:
> > I understand the difficulty (madness) of discussing every Unicode
> > change.  If that's unworkable, my preference would be to stick with some
> > Unicode version and never modify it, ever.
> 
> I think that's a completely non-viable way forward. Even if everyone
> here voted in favor of that, five years from now there will be someone
> who shows up to say "I can't use your crappy software because the
> Unicode tables haven't been updated in five years, here's a patch!".
> And, like, what are we going to do? Still keeping shipping the 2024
> version of Unicode four hundred years from now, assuming humanity and
> civilization and PostgreSQL are still around then? Holding something
> still "forever" is just never going to work.

I hear you.  It would be interesting to know what other RDBMS do here.

Yours,
Laurenz Albe

Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)

2024-07-23 Thread Richard Guo

On Wed, Jun 5, 2024 at 3:48 PM Ashutosh Bapat
 wrote:
> This is different. But it needs a rebase. PFA rebased patch. The revised 
> commit message explains the change better.

I looked through this patch and I think it is in a good shape.

Some minor comments:

* I think it's better to declare 'child_relids' as type Relids
rather than Bitmapset *.  While they are the same thing, Bitmapset *
isn't typically used in joinrels.c.  And I doubt that it is necessary
to explicitly initialize it to NULL.

* This patch changes the signature of function build_child_join_rel().
I recommend arranging the two newly added parameters as 'int
nappinfos, AppendRelInfo **appinfos' to maintain consistency with
other functions that include these parameters.

* At the end of the for loop in try_partitionwise_join(), we attempt
to free the variables used within the loop.  I suggest freeing these
variables in the order or reverse order of their allocation, rather
than arbitrarily.  Also, in non-partitionwise-join planning, we do not
free local variables in this manner.  I understand that we do this for
partitionwise-join because accumulating local variables can lead to
significant memory usage, particularly with many partitions.  I think
it would be beneficial to add a comment in try_partitionwise_join()
explaining this behavior.

Thanks
Richard

Re: Document use of ldapurl with LDAP simple bind


On 08.07.24 23:27, Jacob Champion wrote:

On Fri, Jun 28, 2024 at 12:11 AM Peter Eisentraut  wrote:

This appears to imply that specifying ldapurl is only applicable for
search+bind.  Maybe that whole message should be simplified to something
like

"configuration mixes arguments for simple bind and search+bind"

(The old wording also ignores that the error might arise via "ldapsuffix".)


I kept the imperative "cannot" and tried to match the terminology with
our documentation at [1]:

 cannot mix options for simple bind and search+bind modes


Committed.

(I suppose this could be considered a bug fix, but I don't feel an 
urgency to go backpatching this.  Let me know if there are different 
opinions.)

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

On Fri, Jul 12, 2024 at 12:44:54PM +0300, Aleksander Alekseev wrote:
> Fair enough. Here is the updated patchset.

Hearing nothing but cicadas as now is their season, I have taken the
initiative here to address this open item.

0001 felt a bit too complicated in slru.h, so I have simplified it and
kept all the details in slru.c with SlruFileName().

I have reviewed all the code that uses SLRUs, and spotted three more
problematic code paths in predicate.c that needed an update like the
others for some pagenos.  I've added these, and applied 0002.  We
should be good now.
--
Michael


signature.asc
Description: PGP signature

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2024-07-23 Thread Aleksander Alekseev

Hi,

> Hearing nothing but cicadas as now is their season, I have taken the
> initiative here to address this open item.
>
> 0001 felt a bit too complicated in slru.h, so I have simplified it and
> kept all the details in slru.c with SlruFileName().
>
> I have reviewed all the code that uses SLRUs, and spotted three more
> problematic code paths in predicate.c that needed an update like the
> others for some pagenos.  I've added these, and applied 0002.  We
> should be good now.

Thank you!

-- 
Best regards,
Aleksander Alekseev

Re: PG_TEST_EXTRA and meson

2024-07-23 Thread Ashutosh Bapat

Hi Nazir,

On Fri, Jul 19, 2024 at 1:37 PM Nazir Bilal Yavuz  wrote:

> >
> > If you are willing to work on this further, please add it to the commitfest.
>
> Since general consensus is more towards having an environment variable
> to override Meson configure option, I converted solution-3 to
> something more like a patch. I updated the docs, added more comments
> and added this to the commitfest [1].

Thanks.

>
> The one downside of this approach is that PG_TEXT_EXTRA in user
> defined options in meson setup output could be misleading now.
>

Upthread Tom asked whether we should do a symmetric change to "make".
This set of patches does not do that. Here are my thoughts:
1. Those who use make, are not using configure time PG_TEST_EXTRA
anyway, so they don't need it.
2. Those meson users who use setup time PG_TEST_EXTRA and also want to
use make may find the need for the feature in make.
3. https://www.postgresql.org/docs/devel/install-requirements.html
says that the meson support is currently experimental and only works
when building from a Git checkout. So there's a possibility (even if
theoretical) that make and meson will co-exist. Also that we may
abandon meson?

Considering those, it seems to me that symmetry is required. I don't
know how hard it is to introduce PG_TEST_EXTRA as a configure time
option in "make". If it's simple, we should do that. Otherwise it will
be better to just remove PG_EXTRA_TEST option support from meson
support to keep make and meson symmetrical.

As far as the implementation is concerned the patch seems to be doing
what's expected. If PG_TEST_EXTRA is specified at setup time, it is
not needed to be specified as an environment variable at run time. But
it can also be overridden at runtime. If PG_TEST_EXTRA is not
specified at the time of setup, but specified at run time, it works. I
have tested xid_wraparound and wal_consistency_check.

I wonder whether we really require pg_test_extra argument to testwrap.
Why can't we use the logic in testwrap, to set run time PG_TEST_EXTRA,
in meson.build directly? I.e. set test_env['PG_TEST_EXTRA'] to
os.environ[;PG_TEST_EXTRA'] if the latter is set, otherwise set the
first to get_option('PG_TEST_EXTRA').

> Also, with this change; PG_TEST_EXTRA from configure_scripts in the
> .cirrus.tasks.yml file should be removed as they are useless now. I
> added that as a second patch.

I think this is useful and allows both make and meson to use the same
logic in cirrus.

--
Best Wishes,
Ashutosh Bapat

Re: Logical Replication of sequences

2024-07-23 Thread vignesh C

On Tue, 16 Jul 2024 at 06:00, Peter Smith  wrote:
>
> Hi,
>
> I was reading back through this thread to find out how the proposed new 
> command for refreshing sequences,  came about. The patch 0705 introduces a 
> new command syntax for ALTER SUBSCRIPTION ... REFRESH SEQUENCES
>
> So now there are 2 forms of subscription refresh.
>
> #1. ALTER SUBSCRIPTION name REFRESH PUBLICATION [ WITH ( refresh_option [= 
> value] [, ... ] ) ]

This is correct.

> #2. ALTER SUBSCRIPTION name REFRESH SEQUENCES

This is not correct, it is actually "ALTER SUBSCRIPTION name REFRESH
PUBLICATION SEQUENCES"

> 
>
> IMO, that separation seems complicated. It leaves many questions like:
> * It causes a bit of initial confusion. e.g. When I saw the REFRESH SEQUENCES 
> I first assumed that was needed because sequences were not covered by the 
> existing REFRESH PUBLICATION
> * Why wasn't command #2 called ALTER SUBSCRIPTION REFRESH PUBLICATION 
> SEQUENCES? E.g. missing keyword PUBLICATION. It seems inconsistent.

This is not correct, the existing implementation uses the key word
PUBLICATION, the actual syntax is:
"ALTER SUBSCRIPTION name REFRESH PUBLICATION SEQUENCES"

> * I expect sequence values can become stale pretty much immediately after 
> command #1, so the user will want to use command #2 anyway...

Yes

> * ... but if command #2 also does add/remove changed sequences same as 
> command #1 then what benefit was there of having the command #1 for sequences?
> * There is a separation of sequences (from tables) in command #2 but there is 
> no separation currently possible in command #1. It seemed inconsistent.

This can be enhanced if required. It is not included as of now because
I'm not sure if there is such a use case in case of tables.

> ~~~
>
> IIUC some of the goals I saw in the thread are to:
> * provide a way to fetch and refresh sequences that also keeps behaviors 
> (e.g. copy_data etc.) consistent with the refresh of subscription tables
> * provide a way to fetch and refresh *only* sequences
>
> I felt you could just enhance the existing refresh command syntax (command 
> #1), instead of introducing a new one it would be simpler and it would still 
> meet those same objectives.
>
> Synopsis:
> ALTER SUBSCRIPTION name REFRESH PUBLICATION [TABLES | SEQUENCES | ALL] [ WITH 
> ( refresh_option [= value] [, ... ] ) ]
>
> My only change is the introduction of the optional "[TABLES | SEQUENCES | 
> ALL]" clause.
>
> I believe that can do everything your current patch does, plus more:
> * Can refresh *only* TABLES if that is what you want (current patch 0705 
> cannot do this)
> * Can refresh *only* SEQUENCES (same as current patch 0705 command #2)
> * Has better integration with refresh options like "copy_data" (current patch 
> 0705 command #2 doesn't have options)
> * Existing REFRESH PUBLICATION syntax still works as-is. You can decide later 
> what is PG18 default if the "[TABLES | SEQUENCES | ALL]" is omitted.
>
> ~~~
>
> More examples using proposed syntax.
>
> ex1.
> ALTER SUBSCRIPTION sub REFRESH PUBLICATION TABLES WITH (copy_data = false)
> - same as PG17 functionality for ALTER SUBSCRIPTION sub REFRESH PUBLICATION 
> WITH (copy_data = false)
>
> ex2.
> ALTER SUBSCRIPTION sub REFRESH PUBLICATION TABLES WITH (copy_data = true)
> - same as PG17 functionality for ALTER SUBSCRIPTION sub REFRESH PUBLICATION 
> WITH (copy_data = true)
>
> ex3.
> ALTER SUBSCRIPTION sub REFRESH PUBLICATION SEQUENCES WITH (copy data = false)
> - this adds/removes only sequences to pg_subscription_rel but doesn't update 
> their sequence values
>
> ex4.
> ALTER SUBSCRIPTION sub REFRESH PUBLICATION SEQUENCES WITH (copy data = true)
> - this adds/removes only sequences to pg_subscription_rel and also updates 
> all sequence values.
> - this is equivalent behaviour of what your current 0705 patch is doing for 
> command #2, ALTER SUBSCRIPTION sub REFRESH SEQUENCES
>
> ex5.
> ALTER SUBSCRIPTION sub REFRESH PUBLICATION ALL WITH (copy_data = false)
> - this is equivalent behaviour of what your current 0705 patch is doing for 
> command #1, ALTER SUBSCRIPTION sub REFRESH PUBLICATION WITH (copy_data = 
> false)
>
> ex6.
> ALTER SUBSCRIPTION sub REFRESH PUBLICATION ALL WITH (copy_data = true)
> - this adds/removes tables and sequences and updates all table initial data 
> sequence values.- I think it is equivalent to your current 0705 patch doing
> command #1 ALTER SUBSCRIPTION sub REFRESH PUBLICATION WITH (copy_data = 
> true), followed by another command #2 ALTER SUBSCRIPTION sub REFRESH SEQUENCES
>
> ex7.
> ALTER SUBSCRIPTION sub REFRESH PUBLICATION SEQUENCES
> - Because default copy_data is true you do not need to specify options, so 
> this is the same behaviour as your current 0705 patch command #2, ALTER 
> SUBSCRIPTION sub REFRESH SEQUENCES.

I felt ex:4 is equivalent to command #2 "ALTER SUBSCRIPTION name
REFRESH PUBLICATION SEQUENCES" and ex:3 just updates the
pg_subscription_rel. But I'm not seeing an equivalent for "ALTER
SUBS

Re: Docs: Order of json aggregate functions

2024-07-23 Thread Marlene Reiterer

Am Mo., 22. Juli 2024 um 15:19 Uhr schrieb Wolfgang Walther
:
>
> The order of json related aggregate functions in the docs is currently
> like this:
>
> [...]
> json_agg
> json_objectagg
> json_object_agg
> json_object_agg_strict
> json_object_agg_unique
> json_arrayagg
> json_object_agg_unique_strict
> max
> min
> range_agg
> range_intersect_agg
> json_agg_strict
> [...]
>
> json_arrayagg and json_agg_strict are out of place.
>
> Attached patch puts them in the right spot. This is the same down to v16.


I compiled and it worked and didn't throw an error.

The changes to the patch seem useful in my perspective, for making it
easier to find the functions in the documentation, so people will find
them easier.

There is another table which isn't sorted too, the "Hypothetical-Set
Aggregate Functions". Which would be in need of an alphabetical
sorting too, if all the tables on this side
of the documentation should look alike.

Regards
Marlene

Add new fielids only at last

2024-07-23 Thread .

Currently, the new fields are only supported at the end, Cancannot move the 
location of the field when editing the table, This does not seem to be an 
elegant approach

Restrictive combination of GRANT and POLICY

2024-07-23 Thread Bakhtiyar Neyman

Hi!

The permissions given by GRANT and POLICY statements seem to always be
combined "permissively". In other words, if role `foo` inherits from roles
`can_access_all_columns_but_no_rows` and
`can_access_all_rows_but_no_columns`, then `foo` would be able to access
all rows and all columns of the table in question. I wonder, what it would
take to extend Postgres to allow to combine them "restrictively".

One hacky way to do so would be to apply the following logic when
evaluating a query:
1) Use the RLS policies to filter out the rows that should be visible to
the given user. On each row, record the set roles that allow the operation.
2) For each row and for each column, iterate through the intersection of
(recorded roles, roles the current roles inherits from) to see if the
column should be given access to. If not, return a null in that position.
(For updates/inserts error out).

Obviously, this would be a departure from SQL standard. But other than
that, is this a valid feature idea? I am not a fan of shoehorning nulls for
this, but given that the database can omit rows when using RLS, nulls don't
seem to be too far from that.

The reason I'm bringing it up is that it seems to solve the following
problem nicely: imagine you have a table `people`, and an association table
between two people called `friends`. Each person should see their own data
in `people` and a subset of columns of `people` if they are friends.
(Please refer to the attached file for definitions).

If there's an easier solution that's possible today I'd be curious to learn
about it. The best I could come up with (for queries only) is defining
views that do this "null-masking".Something like this:

CREATE VIEW people_for_person AS
SELECT
  id,
  CASE WHEN roles.is_self OR roles.is_friend THEN email END AS email,
  CASE WHEN roles.is_self THEN password END AS password
FROM people p
JOIN LATERAL (
  SELECT p.id = current_setting('app.user_id')::INTEGER AS is_self,
  EXISTS (
SELECT true
FROM friends f
WHERE f.person_id = p.id
  AND f.friend_id = current_setting('app.user_id')::INTEGER
  ) AS is_friend
) roles ON true;

Cheers,
Bakhtiyar


20240723032444_test.up.sql
Description: application/sql

Re: Parent/child context relation in pg_get_backend_memory_contexts()

2024-07-23 Thread Melih Mutlu

Hi David,

David Rowley , 15 Tem 2024 Pzt, 14:38 tarihinde şunu
yazdı:

> On Sat, 13 Jul 2024 at 10:12, Melih Mutlu  wrote:
> > I updated documentation for path and level columns and also fixed the
> tests as level starts from 1.
>
> Thanks for updating.
>
> +   The path column can be useful to build
> +   parent/child relation between memory contexts. For example, the
> following
> +   query calculates the total number of bytes used by a memory context
> and its
> +   child contexts:
>
> "a memory context" doesn't quite sound specific enough. Let's say what
> the query is doing exactly.
>

Changed "a memory context" with "CacheMemoryContext".


> +
> +WITH memory_contexts AS (
> +SELECT *
> +FROM pg_backend_memory_contexts
> +)
> +SELECT SUM(total_bytes)
> +FROM memory_contexts
> +WHERE ARRAY[(SELECT path[array_length(path, 1)] FROM memory_contexts
> WHERE name = 'CacheMemoryContext')] <@ path;
>
> I don't think that example query is the most simple example. Isn't it
> better to use the most simple form possible to express that?
>
> I think it would be nice to give an example of using "level" as an
> index into "path"
>
> WITH c AS (SELECT * FROM pg_backend_memory_contexts)
> SELECT sum(c1.total_bytes)
> FROM c c1, c c2
> WHERE c2.name = 'CacheMemoryContext'
> AND c1.path[c2.level] = c2.path[c2.level];
>

I changed the queries in the documentation and regression test to the ones
similar to the above query that you shared.


+ /*
> + * Queue up all the child contexts of this level for the next
> + * iteration of the outer loop.
> + */
>
> That outer loop is gone.
>

Removed that part.



> Also, this was due to my hasty writing of the patch. I named the
> function get_memory_context_name_and_indent. I meant to write "ident".
> If we did get rid of the "parent" column, I'd not see any need to keep
> that function. The logic could just be put in
> PutMemoryContextsStatsTupleStore(). I just did it that way to avoid
> the repeat.
>

Fixed the name. Also I needed to cast parameters when calling that function
as below to get rid of some warnings.

+   get_memory_context_name_and_ident(context,
+
(const char **)&name,
+
(const char **) &ident);

Thanks,
-- 
Melih Mutlu
Microsoft


v10-0001-Add-path-column-into-pg_backend_memory_contexts.patch
Description: Binary data

Re: CI, macports, darwin version problems

2024-07-23 Thread Thomas Munro

On Tue, Jul 23, 2024 at 7:37 AM Andres Freund  wrote:
> [2] https://cirrus-ci.com/task/5190473306865664

"Error: “disk.img” couldn’t be copied to
“3FA983DD-3078-4B28-A969-BCF86F8C9585” because there isn’t enough
space."

Could it be copying the whole image every time, in some way that would
get copy-on-write on the same file system, but having to copy
physically here?  That is, instead of using some kind of chain of
overlay disk image files as seen elsewhere, is this Tart thing relying
on file system COW for that?  Perhaps that is happening here[1] but I
don't immediately know how to find out where that Swift standard
library call turns into system calls...

[1] 
https://github.com/cirruslabs/tart/blob/main/Sources/tart/VMDirectory.swift#L119

Re: PG_TEST_EXTRA and meson

2024-07-23 Thread Nazir Bilal Yavuz

Hi,

On Tue, 23 Jul 2024 at 12:26, Ashutosh Bapat
 wrote:
>
> Upthread Tom asked whether we should do a symmetric change to "make".
> This set of patches does not do that. Here are my thoughts:
> 1. Those who use make, are not using configure time PG_TEST_EXTRA
> anyway, so they don't need it.
> 2. Those meson users who use setup time PG_TEST_EXTRA and also want to
> use make may find the need for the feature in make.
> 3. https://www.postgresql.org/docs/devel/install-requirements.html
> says that the meson support is currently experimental and only works
> when building from a Git checkout. So there's a possibility (even if
> theoretical) that make and meson will co-exist. Also that we may
> abandon meson?
>
> Considering those, it seems to me that symmetry is required. I don't
> know how hard it is to introduce PG_TEST_EXTRA as a configure time
> option in "make". If it's simple, we should do that. Otherwise it will
> be better to just remove PG_EXTRA_TEST option support from meson
> support to keep make and meson symmetrical.

I agree that symmetry should be the ultimate goal.

Upthread Jacob said he could work on a patch about introducing the
PG_TEST_EXTRA configure option to make builds. Would you still be
interested in working on this? If not, I would gladly work on it.

> As far as the implementation is concerned the patch seems to be doing
> what's expected. If PG_TEST_EXTRA is specified at setup time, it is
> not needed to be specified as an environment variable at run time. But
> it can also be overridden at runtime. If PG_TEST_EXTRA is not
> specified at the time of setup, but specified at run time, it works. I
> have tested xid_wraparound and wal_consistency_check.

Thanks for testing it!

> I wonder whether we really require pg_test_extra argument to testwrap.
> Why can't we use the logic in testwrap, to set run time PG_TEST_EXTRA,
> in meson.build directly? I.e. set test_env['PG_TEST_EXTRA'] to
> os.environ[;PG_TEST_EXTRA'] if the latter is set, otherwise set the
> first to get_option('PG_TEST_EXTRA').

When test_env('PG_TEST_EXTRA') is set, it could not be overridden
afterwards. Perhaps there is a way to override test_env() but I do not
know how.

-- 
Regards,
Nazir Bilal Yavuz
Microsoft

Re: PG_TEST_EXTRA and meson

2024-07-23 Thread Ashutosh Bapat

On Tue, Jul 23, 2024 at 4:02 PM Nazir Bilal Yavuz  wrote:
>
> > I wonder whether we really require pg_test_extra argument to testwrap.
> > Why can't we use the logic in testwrap, to set run time PG_TEST_EXTRA,
> > in meson.build directly? I.e. set test_env['PG_TEST_EXTRA'] to
> > os.environ[;PG_TEST_EXTRA'] if the latter is set, otherwise set the
> > first to get_option('PG_TEST_EXTRA').
>
> When test_env('PG_TEST_EXTRA') is set, it could not be overridden
> afterwards. Perhaps there is a way to override test_env() but I do not
> know how.
>

I am not suggesting to override test_env['PG_TEST_EXTRA'] but set it
to the value after overriding if required. meson.build file seems to
allow some conditional variable setting. So I thought this would be
possible, haven't tried myself though.

-- 
Best Wishes,
Ashutosh Bapat

Re: xid_wraparound tests intermittent failure.

2024-07-23 Thread Andrew Dunstan



On 2024-07-22 Mo 9:29 PM, Masahiko Sawada wrote:

On Mon, Jul 22, 2024 at 12:53 PM Andrew Dunstan  wrote:


On 2024-07-22 Mo 12:46 PM, Tom Lane wrote:

Masahiko Sawada  writes:

Looking at dodo's failures, it seems that while it passes
module-xid_wraparound-check, all failures happened only during
testmodules-install-check-C. Can we check the server logs written
during xid_wraparound test in testmodules-install-check-C?

Oooh, that is indeed an interesting observation.  There are enough
examples now that it's hard to dismiss it as chance, but why would
the two runs be different?


It's not deterministic.

I tested the theory that it was some other concurrent tests causing the issue, 
but that didn't wash. Here's what I did:

 for f in `seq 1 100`
   do echo iteration = $f
   meson test --suite xid_wraparound || break
 done

It took until iteration 6 to get an error. I don't think my Ubuntu instance is especially 
slow. e.g. "meson compile" normally takes a handful of seconds. Maybe 
concurrent tests make it more likely, but they can't be the only cause.

Could you provide server logs in both OK and NG tests? I want to see
if there's a difference in the rate at which tables are vacuumed.



See 




The failure logs are from a run where both tests 1 and 2 failed.


cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

Re: thread-safety: gmtime_r(), localtime_r()

On 04.07.24 18:36, Heikki Linnakangas wrote:

The Linux man page for localtime_r() says:

According to POSIX.1-2001, localtime() is required to behave as
though tzset(3) was called, while localtime_r() does not have  this
requirement.   For  portable  code,  tzset(3) should be called before
localtime_r().

It's not clear to me what happens if tzset() has not been called and the 
localtime_r() implementation does not call it either. I guess some 
implementation default timezone is used.

In the libpq traces, I don't think we care much. In ecpg, I'm not sure 
what the impact is if the application has not previously called tzset(). 
I'd suggest that we just document that an ecpg application should call 
tzset() before calling the functions that are sensitive to local 
timezone setting.

I have been studying this question.  It appears that various libc 
implementers have been equally puzzled by this; there are various 
comments like "it's unclear what POSIX wants here" in the sources.  (I 
have checked glibc, FreeBSD, and Solaris.)

Consider if a program calls localtime() or localtime_r() twice:

localtime(...);
...
localtime(...);

or

localtime_r(...);
...
localtime_r(...);

The question here is, how many times does this effectively (internally) 
call tzset().  There are three possible answers: 0, 1, or 2.

For localtime(), the answer is clear.  localtime() is required to call 
tzset() every time, so the answer is 2.

For localtime_r(), it's unclear.  What you are wondering, and I have 
been wondering, is whether the answer is 0 or non-zero (and possibly, if 
it's 0, will these calls misbehave badly).  What the libc implementers 
are wondering is whether the answer is 1 or 2.  The implementations 
whose source code I have checked think it should be 1.  They never 
consider that it could be 0 and it's okay to misbehave.

Where this difference appears it practice would be something like

setenv("TZ", "foo");
localtime(...);  // uses TZ foo
setenv("TZ", "bar");
localtime(...);  // uses TZ bar

versus

setenv("TZ", "foo");
localtime_r(...);  // uses TZ foo if first call in program
setenv("TZ", "bar");
localtime_r(...);  // probably does not use new TZ

If you want the second case to pick up the changed TZ setting, you must 
explicitly call tzset() to be sure.

I think, considering this, the proposed patch should be okay.  At least, 
the libraries are not going to misbehave if tzset() hasn't been called 
explicitly.  It will be called internally the first time it's needed.  I 
don't think we need to cater to cases like my example where the 
application changes the TZ environment variable but neglects to call 
tzset() itself.

>> There is one affected call in the backend.  Most of the backend
>> otherwise uses the custom functions pg_gmtime() and pg_localtime(),
>> which are implemented differently.
>
> Do we need to call tzset(3) somewhere in backend startup? Or could we
> replace that localtime() call in the backend with pg_localtime()?

Let's look at what this code actually does.  It just takes the current 
time and then loops through all possible weekdays and months to collect 
and cache the localized names.  The current time or time zone doesn't 
actually matter for this, we just need to fill in the struct tm a bit 
for strftime() to be happy.  We could probably replace this with 
gmtime() to avoid the question about time zone state.  (We probably 
don't even need to call time() beforehand, we could just use time zero. 
But the comment says "We use times close to current time as data for 
strftime().", which is probably prudent.)  (Since we are caching the 
results for the session, we're already not dealing with the hilarious 
hypothetical situation where the weekday and month names depend on the 
actual time, in case that is a concern.)

Re: Add new fielids only at last

2024-07-23 Thread Aleksander Alekseev

Hi,

> Currently, the new fields are only supported at the end, Cancannot move the 
> location of the field when editing the table, This does not seem to be an 
> elegant approach

Pretty confident it was discussed many times before. Please use the search.

-- 
Best regards,
Aleksander Alekseev

Re: PG_TEST_EXTRA and meson

2024-07-23 Thread Nazir Bilal Yavuz

Hi,

On Tue, 23 Jul 2024 at 13:40, Ashutosh Bapat
 wrote:
>
> On Tue, Jul 23, 2024 at 4:02 PM Nazir Bilal Yavuz  wrote:
> >
> > > I wonder whether we really require pg_test_extra argument to testwrap.
> > > Why can't we use the logic in testwrap, to set run time PG_TEST_EXTRA,
> > > in meson.build directly? I.e. set test_env['PG_TEST_EXTRA'] to
> > > os.environ[;PG_TEST_EXTRA'] if the latter is set, otherwise set the
> > > first to get_option('PG_TEST_EXTRA').
> >
> > When test_env('PG_TEST_EXTRA') is set, it could not be overridden
> > afterwards. Perhaps there is a way to override test_env() but I do not
> > know how.
> >
>
> I am not suggesting to override test_env['PG_TEST_EXTRA'] but set it
> to the value after overriding if required. meson.build file seems to
> allow some conditional variable setting. So I thought this would be
> possible, haven't tried myself though.

Sorry if I caused a misunderstanding. What I meant was, when the
test_env('PG_TEST_EXTRA') is set, Meson will always use PG_TEST_EXTRA
from the setup. So, Meson needs to be built again to change
PG_TEST_EXTRA.

AFAIK Meson does not support accessing environment variables but
run_command() could be used to test this:

-test_env.set('PG_TEST_EXTRA', get_option('PG_TEST_EXTRA'))
+pg_test_extra_env = run_command(
+  [python,
+  '-c',
+  'import os; print(os.getenv("PG_TEST_EXTRA", ""))'],
+  check: true).stdout()
+
+test_env.set('PG_TEST_EXTRA', pg_test_extra_env != '' ?
+  pg_test_extra_env :
+  get_option('PG_TEST_EXTRA'))
+

-- 
Regards,
Nazir Bilal Yavuz
Microsoft

Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-07-23 Thread Amit Kapila

On Mon, Jul 22, 2024 at 2:48 PM Peter Smith  wrote:
>
> Hi, Patch v22-0001 LGTM apart from the following nitpicks
>

I have included these in the attached. The patch looks good to me. I
am planning to push this tomorrow unless there are more comments.

-- 
With Regards,
Amit Kapila.

v23-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIP.patch
Description: Binary data

Re: DSO Terms Galore


On 19.07.24 21:27, David E. Wheeler wrote:

I’m trying to understand the standard terms for extension libraries. There seem 
too a bewildering number of terms used to refer to a shared object library, for 
example:

* LOAD[1]:
   * “shared library”
   * “shared library file”
* dynamic_library_path[2]:
   * “dynamically loadable module”
* xfunc-c[3]:
   * “dynamically loadable object”
   * “shared library”
   * “loadable object”
   * “loadable object file”
   * “object file”
   * “dynamically loaded object file”
* pg_config[5]:
   * “object code libraries” (static?)
   * “dynamically loadable modules”
* PGXS[4]:
   * “MODULES”
   * “shared-library objects”
   * “shared library”


I think in the POSIX-ish realm, the best term is "dynamically loadable 
library".  It's a library, because it contains functions you can, uh, 
borrow, just like a static library.  And it's dynamically loadable, as 
opposed to being loaded in a fixed manner at startup time.


Also, the "dl" in dlopen() etc. presumably stands for dynamic-something 
load-something.


Libtool uses the term "dlopened module" for this, and the command-line 
option is -module. 
(https://www.gnu.org/software/libtool/manual/libtool.html#Dlopened-modules)


Meson uses shared_module() for this.  (It has shared_library() and 
static_library() for things like libpq.)


Things like "object" or "object file" or probably wrong-ish.  I 
understand an object file to be a .o file, which you can't dlopen directly.


Shared library is semi-ok because on many platforms, link-time shared 
libraries (like libpq) and dynamically loadable libraries (like plpgsql) 
are the same file format.  But on some they're not, so it leads to 
confusion.


I think we can unify this around terms like "dynamically loadable 
library" and "dynamically loadable module" (or "loaded" in cases where 
it's talking about a file that has already been loaded).

Re: pgsql: Add more SQL/JSON constructor functions

2024-07-23 Thread jian he

While reviewing the patch, I found some inconsistency on json_table EXISTS.

--tested based on your patch and master.
src4=# SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a jsonb
EXISTS PATH '$'));
ERROR:  cannot cast behavior expression of type boolean to jsonb
src4=# SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a jsonb
EXISTS PATH '$' error on error));
  a
--
 true
(1 row)

Why explicitly "error on error" not report error while not explicitly
mentioning it yields an error?

"(a jsonb EXISTS PATH '$' error on error)" returns jsonb 'true'
imply that no errors happened.
so "(a jsonb EXISTS PATH '$')" should not have any errors.


but boolean cannot cast to jsonb so for JSON_TABLE,
we should reject
COLUMNS (a jsonb EXISTS PATH '$' error on error ));
COLUMNS (a jsonb EXISTS PATH '$' unknown on error ));
at an earlier stage.

because json_populate_type will use literal 'true'/'false' cast to
jsonb, which will not fail.
but JsonPathExists returns are *not* quoted true/false.
so rejecting it earlier is better than handling it at ExecEvalJsonExprPath.


attached patch trying to solve the problem, changes applied based on
your 0001, 0002.
after apply attached patch:


create domain djsonb as jsonb check(value = 'true');
SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a djsonb EXISTS
PATH '$' error on error));
ERROR:  cannot cast type boolean to djsonb
SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a djsonb EXISTS
PATH '$' unknown on error));
ERROR:  cannot cast type boolean to djsonb
SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a jsonb EXISTS PATH '$'));
ERROR:  cannot cast type boolean to jsonb



i found out a typo in
src/test/regress/expected/sqljson_queryfuncs.out,
src/test/regress/sql/sqljson_queryfuncs.sql
"fixed-legth" should be "fixed-length"
From bf05974a330a1df5877e77a0484922b4d17ccde9 Mon Sep 17 00:00:00 2001
From: jian he 
Date: Tue, 23 Jul 2024 19:57:26 +0800
Subject: [PATCH v1 1/1] better handle json_table EXISTS ERROR|UNKNOWN ON ERROR
 cases

bool cannot cast to jsonb/json.
so the following two expression should fail.
(column_name jsonb EXISTS PATH path_expression error on error);
(column_name jsonb EXISTS PATH path_expression unknown on error);

make it fail at parsing stage.
---
 src/backend/parser/parse_expr.c |  9 +
 src/test/regress/expected/sqljson_jsontable.out | 15 +--
 src/test/regress/sql/sqljson_jsontable.sql  | 10 ++
 3 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 61611b8a..92dbf854 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -4262,6 +4262,7 @@ transformJsonFuncExpr(ParseState *pstate, JsonFuncExpr *func)
 	Node	   *path_spec;
 	const char *func_name = NULL;
 	JsonFormatType default_format;
+	Oid returning_typid;
 
 	switch (func->op)
 	{
@@ -4478,6 +4479,14 @@ transformJsonFuncExpr(ParseState *pstate, JsonFuncExpr *func)
 			if (jsexpr->returning->typid != BOOLOID)
 jsexpr->use_json_coercion = jsexpr->returning->typid != INT4OID;
 
+			returning_typid = getBaseType(jsexpr->returning->typid);
+
+			if (returning_typid == JSONBOID || returning_typid == JSONOID)
+	ereport(ERROR,
+			errcode(ERRCODE_CANNOT_COERCE),
+			errmsg("cannot cast type boolean to %s",
+			format_type_be(jsexpr->returning->typid)));
+
 			jsexpr->on_error = transformJsonBehavior(pstate, func->on_error,
 	 JSON_BEHAVIOR_FALSE,
 	 jsexpr->returning);
diff --git a/src/test/regress/expected/sqljson_jsontable.out b/src/test/regress/expected/sqljson_jsontable.out
index 9e93307f..b8779276 100644
--- a/src/test/regress/expected/sqljson_jsontable.out
+++ b/src/test/regress/expected/sqljson_jsontable.out
@@ -560,6 +560,17 @@ SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a int4 EXISTS PATH '$' ERROR
  1
 (1 row)
 
+--corner cases:
+--(column_name, jsonb EXISTS PATH '$' unknown on error)
+--(column_name, jsonb EXISTS PATH '$' error on error)
+create domain djsonb as jsonb check(value = 'true');
+SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a djsonb EXISTS PATH '$' error on error));
+ERROR:  cannot cast type boolean to djsonb
+SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a djsonb EXISTS PATH '$' unknown on error));
+ERROR:  cannot cast type boolean to djsonb
+SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a jsonb EXISTS PATH '$'));
+ERROR:  cannot cast type boolean to jsonb
+drop domain djsonb;
 SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a int2 EXISTS PATH '$.a'));
 ERROR:  cannot cast behavior expression of type boolean to smallint
 SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a int8 EXISTS PATH '$.a'));
@@ -582,9 +593,9 @@ SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a char(5) EXISTS PATH '$.a' E
 (1 row)
 
 SELECT * FROM JSON_TABLE(jsonb '"a"', '$' COLUMNS (a json EXISTS PATH '$.a'));
-ERROR:  cannot cast behavior expression of type boolean

Re: [18] Policy on IMMUTABLE functions and Unicode updates

On Tue, Jul 23, 2024 at 3:11 AM Laurenz Albe  wrote:
> I hear you.  It would be interesting to know what other RDBMS do here.

Yeah, I agree.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Jeremy Schneider

On Tue, Jul 23, 2024 at 1:11 AM Laurenz Albe 
wrote:

> On Mon, 2024-07-22 at 13:55 -0400, Robert Haas wrote:
> > On Mon, Jul 22, 2024 at 1:18 PM Laurenz Albe 
> wrote:
> > > I understand the difficulty (madness) of discussing every Unicode
> > > change.  If that's unworkable, my preference would be to stick with
> some
> > > Unicode version and never modify it, ever.
> >
> > I think that's a completely non-viable way forward. Even if everyone
> > here voted in favor of that, five years from now there will be someone
> > who shows up to say "I can't use your crappy software because the
> > Unicode tables haven't been updated in five years, here's a patch!".
> > And, like, what are we going to do? Still keeping shipping the 2024
> > version of Unicode four hundred years from now, assuming humanity and
> > civilization and PostgreSQL are still around then? Holding something
> > still "forever" is just never going to work.
>
> I hear you.  It would be interesting to know what other RDBMS do here.

Other RDBMS are very careful not to corrupt databases, afaik including
function based indexes, by changing Unicode. I’m not aware of any other
RDBMS that updates Unicode versions in place; instead they support multiple
Unicode versions and do not drop the old ones.

See also:
https://www.postgresql.org/message-id/E8754F74-C65F-4A1A-826F-FD9F37599A2E%40ardentperf.com

I know Jeff mentioned that Unicode tables copied into Postgres for
normalization have been updated a few times. Did anyone ever actually
discuss the fact that things like function based indexes can be corrupted
by this, and weigh the reasoning? Are there past mailing list threads
touching on the corruption problem and making the argument why updating
anyway is the right thing to do? I always assumed that nobody had really
dug deeply into this before the last few years.

I do agree it isn’t as broad of a problem as linguistic collation itself,
which causes a lot more widespread corruption when it changes (as we’ve
seen from glibc 2.28 and also other older hacker mailing list threads about
smaller changes in older glibc versions corrupting databases). For now,
Postgres only has code-point collation and the other Unicode functions
mentioned in this thread.

-Jeremy

Re: Use read streams in CREATE DATABASE command when the strategy is wal_log

2024-07-23 Thread Noah Misch

On Mon, Jul 22, 2024 at 12:00:45PM +0300, Nazir Bilal Yavuz wrote:
> On Sat, 20 Jul 2024 at 21:14, Noah Misch  wrote:
> > On a different naming topic, my review missed that field name
> > copy_storage_using_buffer_read_stream_private.last_block doesn't fit how the
> > field is used.  Code uses it like an nblocks.  So let's either rename the
> > field or change the code to use it as a last_block (e.g. initialize it to
> > nblocks-1, not nblocks).
> 
> I prefered renaming it as nblocks, since that is how it was used in
> RelationCopyStorageUsingBuffer() before. Also, I realized that instead
> of setting p.blocknum = 0; initializing blkno as 0 and using
> p.blocknum = blkno makes sense. Because, p.blocknum and blkno should
> always start with the same block number. The relevant patch is
> attached.

I felt the local variable change was not a clear improvement.  It would have
been fine for the original patch to do it in that style, but the style of the
original patch was also fine.  So I've pushed just the struct field rename.

Re: replace strtok()


On 08.07.24 07:45, David Steele wrote:

On 6/24/24 19:57, Peter Eisentraut wrote:

On 24.06.24 02:34, Michael Paquier wrote:

On Sat, Jun 22, 2024 at 11:48:21AM -0400, Tom Lane wrote:

Peter Eisentraut  writes:

On 18.06.24 13:43, Ranier Vilela wrote:

I found another implementation of strsep, it seems lighter to me.
I will attach it for consideration, however, I have not done any 
testing.



Yeah, surely there are many possible implementations.  I'm thinking,
since we already took other str*() functions from OpenBSD, it makes
sense to do this here as well, so we have only one source to deal 
with.


Why not use strpbrk?  That's equally thread-safe, it's been there
since C89, and it doesn't have the problem that you can't find out
which of the delimiter characters was found.


Yeah, strpbrk() has been used in the tree as far as 2003 without any
port/ implementation.


The existing uses of strpbrk() are really just checking whether some 
characters exist in a string, more like an enhanced strchr().  I don't 
see any uses for tokenizing a string like strtok() or strsep() would 
do.   I think that would look quite cumbersome.  So I think a simpler 
and more convenient abstraction like strsep() would still be worthwhile.


I agree that using strsep() in these cases seems more natural. Since 
this patch provides a default implementation compatibility does not seem 
like a big issue.


I've also reviewed the rest of the patch and it looks good to me.


This has been committed.  Thanks.

Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-07-23 Thread Melanie Plageman

On Mon, Jul 22, 2024 at 10:54 PM Masahiko Sawada  wrote:
>
> On Mon, Jul 22, 2024 at 6:26 PM Melanie Plageman
>  wrote:
> >
> > On Mon, Jul 22, 2024 at 6:36 PM Tom Lane  wrote:
> > >
> > > Melanie Plageman  writes:
> > > > We've only run tests with this commit on some of the back branches for
> > > > some of these animals. Of those, I don't see any failures so far. So,
> > > > it seems the test instability is just related to trying to get
> > > > multiple passes of index vacuuming reliably with TIDStore.
> > >
> > > > AFAICT, all the 32bit machine failures are timeouts waiting for the
> > > > standby to catch up (mamba, gull, merswine). Unfortunately, the
> > > > failures on copperhead (a 64 bit machine) are because we don't
> > > > actually succeed in triggering a second vacuum pass. This would not be
> > > > fixed by a longer timeout.
> > >
> > > Ouch.  This seems to me to raise the importance of getting a better
> > > way to test multiple-index-vacuum-passes.  Peter argued upthread
> > > that we don't need a better way, but I don't see how that argument
> > > holds water if copperhead was not reaching it despite being 64-bit.
> > > (Did you figure out exactly why it doesn't reach the code?)
> >
> > I wasn't able to reproduce the failure (failing to do > 1 index vacuum
> > pass) on my local machine (which is 64 bit) without decreasing the
> > number of tuples inserted. The copperhead failure confuses me because
> > the speed of the machine should *not* affect how much space the dead
> > item TIDStore takes up. I would have bet money that the same number
> > and offsets of dead tuples per page in a relation would take up the
> > same amount of space in a TIDStore on any 64-bit system -- regardless
> > of how slowly it runs vacuum.
>
> Looking at copperhead's failure logs, I could not find that "VACUUM
> (VERBOSE, FREEZE) vac_horizon_floor_table;" wrote the number of index
> scans in logs. Is there any clue that made you think the test failed
> to do multiple index vacuum passes?

The vacuum doesn't actually finish because I have a cursor that keeps
it from finishing and then I query pg_stat_progress_vacuum after the
first index vacuuming round should have happened and it did not do the
index vacuum:

[20:39:34.645](351.522s) # poll_query_until timed out executing this query:
#
# SELECT index_vacuum_count > 0
# FROM pg_stat_progress_vacuum
# WHERE datname='test_db' AND relid::regclass =
'vac_horizon_floor_table'::regclass;
#
# expecting this output:
# t
# last actual query output:
# f

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=copperhead&dt=2024-07-22%2015%3A00%3A11

I suppose it is possible that it did in fact time out and the index
vacuum was still in progress. But most of the other "too slow"
failures were when the standby was trying to catch up. Usually the
pg_stat_progress_vacuum test fails because we didn't actually do that
index vacuuming round yet.

- Melanie

Re: [18] Policy on IMMUTABLE functions and Unicode updates

On Tue, Jul 23, 2024 at 8:32 AM Jeremy Schneider
 wrote:
> Other RDBMS are very careful not to corrupt databases, afaik including 
> function based indexes, by changing Unicode. I’m not aware of any other RDBMS 
> that updates Unicode versions in place; instead they support multiple Unicode 
> versions and do not drop the old ones.
>
> See also:
> https://www.postgresql.org/message-id/E8754F74-C65F-4A1A-826F-FD9F37599A2E%40ardentperf.com

Hmm. I think we might have some unique problems due to the fact that
we rely partly on the operating system behavior, partly on libicu, and
partly on our own internal tables.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: pgsql: Add more SQL/JSON constructor functions

2024-07-23 Thread Amit Langote

On Tue, Jul 23, 2024 at 11:45 AM jian he  wrote:
> On Mon, Jul 22, 2024 at 4:46 PM Amit Langote  wrote:
> >
> > On Thu, Jul 18, 2024 at 3:04 PM jian he  wrote:
> > > we still have problem in transformJsonBehavior
> > >
> > > currently transformJsonBehavior:
> > > SELECT JSON_VALUE(jsonb '1234', '$' RETURNING bit(3)  DEFAULT 010111 ON 
> > > ERROR);
> > > ERROR:  cannot cast behavior expression of type text to bit
> > > LINE 1: ...VALUE(jsonb '1234', '$' RETURNING bit(3)  DEFAULT 010111 ON ...
> > >
> > > here, 010111 will default to int4, so "cannot cast behavior expression
> > > of type text to bit"
> > > is wrong?
> > > also int4/int8 can be explicitly cast to bit(3), in this case, it
> > > should return 111.
> >
> > I think we shouldn't try too hard in the code to "automatically" cast
> > the DEFAULT expression, especially if that means having to add special
> > case code for all sorts of source-target-type combinations.
> >
> > I'm inclined to just give a HINT to the user to cast the DEFAULT
> > expression by hand, because they *can* do that with the syntax that
> > exists.
>
> select typname, typinput, pg_get_function_identity_arguments(typinput)
> from pg_type pt join pg_proc proc  on proc.oid = pt.typinput
> where typtype = 'b' and typarray <> 0 and proc.pronargs > 1;
>
> As you can see from the query result, we only need to deal with bit
> and character type
> in this context.
>
> SELECT JSON_VALUE(jsonb '1234', '$.a' RETURNING bit(3) DEFAULT 10111 ON 
> empty);
> SELECT JSON_VALUE(jsonb '1234', '$.a' RETURNING char(3) DEFAULT 10111
> ON empty) ;
>
> the single quote literal ', no explicit cast, resolve to text type.
> no single quote like 11, no explicit cast, resolve to int type.
> we actually can cast int to bit, also have pg_cast entry.
> so the above these 2 examples should behave the same, given there is
> no pg_cast entry for int to text.
>
> select castsource::regtype ,casttarget::regtype ,castfunc,castcontext,
> castmethod
> from pg_cast where 'int'::regtype in (castsource::regtype 
> ,casttarget::regtype);
>
> but i won't insist on it, since bit/varbit don't use that much.

The cast from int to bit that exists in pg_cast is only good for
explicit casts, so would truncate user's value instead of flagging it
as invalid input, and this whole discussion is about not doing that.
With the DEFAULT expression specified or interpreted as a text string,
we don't have that problem because we can then use CoerceViaIO as an
assignment-level cast, whereby the invalid input *is* flagged as it
should, like this:

SELECT JSON_VALUE(jsonb '1234', '$' RETURNING bit(3)  DEFAULT '1' ON ERROR);
ERROR:  bit string length 5 does not match type bit(3)

So it seems fair to me to flag it when the user specifies an integer
in DEFAULT we can't create a cast expression that does not truncate a
value to fit the RETURNING type.

> > I'm planning to push the attached 2 patches.  0001 is to fix
> > transformJsonBehavior() for these cases and 0002 to adjust the
> > behavior of casting the result of JSON_EXISTS() and EXISTS columns to
> > integer type.  I've included the tests in your patch in 0001.  I
> > noticed using cast expression to coerce the boolean constants to
> > fixed-length types would produce unexpected errors when the planner's
> > const-simplification calls the cast functions.  So in 0001, I've made
> > that case also use runtime coercion using json_populate_type().
> >
>
> +  
> +   
> +If an ON ERROR or ON EMPTY
> +expression can't be coerced to the RETURNING type
> +successfully, an SQL NULL value will be returned.
> +   
> +  
>
> I think this change will have some controversy.

On second thought, I agree.  I've made some changes to *throw* the
error when the JsonBehavior values fail being coerced to the RETURNING
type.  Please check the attached.

In the attached patch, I've also taken care of the problem mentioned
in your latest email -- the solution I've chosen is not to produce the
error when ERROR ON ERROR is specified but to use runtime coercion
also for the jsonb type or any type that is not integer.  Also fixed
the typos.

Thanks for your attention!

--
Thanks, Amit Langote


0002-SQL-JSON-Fix-casting-for-integer-EXISTS-columns-in-J.patch
Description: Binary data


0001-SQL-JSON-Some-fixes-to-JsonBehavior-expression-casti.patch
Description: Binary data

Re: Things I don't like about \du's "Attributes" column

On Mon, Jul 22, 2024 at 5:19 PM Pavel Luzanov  wrote:
> Visible but inaccessible objects in system catalogs increase the volume
> of command output unnecessarily. Why do I need to know the list of all
> schemas in the database if I only have access to the public schema?
> The same applies to inaccessible tables, views, functions, etc.
>
> Not for safety, but for convenience, it might be worth having a set of views
> that show only those rows of the system catalog (with *acl column) that
> the user has access to. Either as the object owner, or through the privileges.
> Directly or indirectly through role membership.

So, I wasn't actually aware that anyone had a big problem in this
area. I thought that most of the junk you might see in \d
output would be hidden either because the objects you don't care about
are not in your search_path or because they are system objects. I
agree that doesn't help with schemas, but most people don't have a
huge number of schemas, and even if you do, you don't necessarily need
to look at the list all that frequently.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: xid_wraparound tests intermittent failure.

2024-07-23 Thread Andrew Dunstan




On 2024-07-22 Mo 10:11 PM, Tom Lane wrote:

Andrew Dunstan  writes:

On 2024-07-22 Mo 12:46 PM, Tom Lane wrote:

Masahiko Sawada  writes:

Looking at dodo's failures, it seems that while it passes
module-xid_wraparound-check, all failures happened only during
testmodules-install-check-C. Can we check the server logs written
during xid_wraparound test in testmodules-install-check-C?

Oooh, that is indeed an interesting observation.  There are enough
examples now that it's hard to dismiss it as chance, but why would
the two runs be different?

It's not deterministic.

Perhaps.  I tried "make check" on mamba's host and got exactly the
same failures as with "make installcheck", which counts in favor of
dodo's results being just luck.  Still, dodo has now shown 11 failures
in "make installcheck" and zero in "make check", so it's getting hard
to credit that there's no difference.





Yeah, I agree that's perplexing. That step doesn't run with "make -j 
nn", so it's a bit hard to see why it should get different results from 
one run rather than the other.  The only thing that's different is that 
there's another postgres instance running. Maybe that's just enough to 
slow the test down? After all, this is an RPi.



cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: Comments on Custom RMGRs

2024-07-23 Thread Heikki Linnakangas


On 27/05/2024 21:20, Michael Paquier wrote:

On Fri, May 17, 2024 at 04:25:15PM -0400, Robert Haas wrote:

On Fri, May 17, 2024 at 4:20 PM Jeff Davis  wrote:

Regarding this particular change: the checkpointing hook seems more
like a table AM feature, so I agree with you that we should have a good
idea how a real table AM might use this, rather than only
pg_stat_statements.


I would even be OK with a pg_stat_statements example that is fully
working and fully explained. I just don't want to have no example at
all. The original proposal has been changed twice because of
complaints that the hook wasn't quite useful enough, but I think that
only proves that v3 is closer to being useful than v1. If v1 is 40% of
the way to useful and v3 is 120% of the way to useful, wonderful! But
if v1 is 20% of the way to being useful and v3 is 60% of the way to
being useful, it's not time to commit anything yet. I don't know which
is the case, and I think if someone wants this to be committed, they
need to explain clearly why it's the first and not the second.


Please note that I've been studying ways to have pg_stat_statements
being plugged in directly with the shared pgstat APIs to get it backed
by a dshash to give more flexibility and scaling, giving a way for
extensions to register their own stats kind.  In this case, the flush
of the stats would be controlled with a callback in the stats
registered by the extensions, conflicting with what's proposed here.
pg_stat_statements is all about stats, at the end.  I don't want this
argument to act as a barrier if a checkpoint hook is an accepted
consensus here,  but a checkpoint hook used for this code path is not
the most intuitive solution I can think of in the long-term.


On the topic of concrete uses for this API: We have a bunch of built-in 
resource managers that could be refactored to use this API. 
CheckPointGuts currently looks like this:



/*
 * Flush all data in shared memory to disk, and fsync
 *
 * This is the common code shared between regular checkpoints and
 * recovery restartpoints.
 */
static void
CheckPointGuts(XLogRecPtr checkPointRedo, int flags)
{
CheckPointRelationMap();
CheckPointReplicationSlots(flags & CHECKPOINT_IS_SHUTDOWN);
CheckPointSnapBuild();
CheckPointLogicalRewriteHeap();
CheckPointReplicationOrigin();

/* Write out all dirty data in SLRUs and the main buffer pool */
TRACE_POSTGRESQL_BUFFER_CHECKPOINT_START(flags);
CheckpointStats.ckpt_write_t = GetCurrentTimestamp();
CheckPointCLOG();
CheckPointCommitTs();
CheckPointSUBTRANS();
CheckPointMultiXact();
CheckPointPredicate();

RmgrCheckpoint(flags, RMGR_CHECKPOINT_BEFORE_BUFFERS);

CheckPointBuffers(flags);

RmgrCheckpoint(flags, RMGR_CHECKPOINT_AFTER_BUFFERS);

/* Perform all queued up fsyncs */
TRACE_POSTGRESQL_BUFFER_CHECKPOINT_SYNC_START();
CheckpointStats.ckpt_sync_t = GetCurrentTimestamp();
ProcessSyncRequests();
CheckpointStats.ckpt_sync_end_t = GetCurrentTimestamp();
TRACE_POSTGRESQL_BUFFER_CHECKPOINT_DONE();

/* We deliberately delay 2PC checkpointing as long as possible */
CheckPointTwoPhase(checkPointRedo);
}


Of these calls, CheckPointCLOG would be natural as the rmgr_callback of 
the clog rmgr. Similarly for CheckPointMultiXact and maybe a few others.



However, let's look at the pg_stat_statements patch (pgss_001.v1.patch):

It's now writing a new WAL record for every call to pgss_store(), 
turning even simple queries into WAL-logged operations. That's a 
non-starter. It will also not work on a standby. This needs to be 
redesigned so that the data is updated in memory, and written to disk 
and/or WAL-logged only periodically. Perhaps at checkpoints, but you 
could do it more frequently too.


I'm not convinced that the stats should be WAL-logged. Do you want them 
to be replicated and included in backups? Maybe, but maybe not. It's 
certainly a change to how it currently works.


If we don't WAL-log the stats, we don't really need a custom RMGR for 
it. We just need a checkpoint hook to flush the stats to disk, but we 
don't need a registered RMGR ID for it.


So, I got a feeling that adding this to the rmgr interface is not quite 
right. The rmgr callbacks are for things that run when WAL is 
*replayed*, while checkpoints are related to how WAL is generated. Let's 
design this as an independent hook, separate from rmgrs.



Another data point: In Neon, we actually had to add a little code to 
checkpoints, to WAL-log some exta data. That was a quick hack and might 
not be the right design in the first place, but these hooks would not 
have been useful for our purposes. We wanted to write a new WAL record 
at shutdown, and in CheckPointGuts(), it's already too late for that. It 
needs to be done earlier, before starting to the shutdown checkpoint. 
Similar to LogStandbySnapshot(), except

Re: [PATCH] GROUP BY ALL

2024-07-23 Thread Andrei Borodin

On 23 Jul 2024, at 00:40, Isaac Morland  wrote:odyssey=> select (uw_term).*, count(*) from uw_term group by uw_term;ERROR:  column "uw_term.term_id" must appear in the GROUP BY clause or be used in an aggregate functionLINE 1: select (uw_term).*, count(*) from uw_term group by uw_term;AFAIR this problem was solved in my implementation [0]On 23 Jul 2024, at 01:29, Tom Lane  wrote:(Personally, I'd wonder exactly what ALL is quantified over: thewhole output of the FROM clause, or only columns mentioned in theSELECT tlist, or what? And why that choice rather than another?)I'd like to have GROUP BY AUTO (I also proposed version GROUP BY SURPRISE ME). But I wouldn't like to open pandora box of syntax sugar extensions which may will be incompatible with future standards.If we could have extensible grammar - I'd be happy to have a lot of such enhancements. My top 2 are FROM table SELECT column and better GROUP BY.Best regards, Andrey Borodin.[0] https://www.postgresql.org/message-id/flat/CAAhFRxjyTO5BHn9y1oOSEp0TtpTDTTTb7HJBNhTG%2Bi3-hXC0XQ%40mail.gmail.com

Re: [PATCH] GROUP BY ALL

On Mon, Jul 22, 2024 at 4:34 PM David G. Johnston
 wrote:
>
> On Mon, Jul 22, 2024 at 1:55 PM David Christensen  wrote:
>>
>> I see that there'd been some chatter but not a lot of discussion about
>> a GROUP BY ALL feature/functionality.  There certainly is utility in
>> such a construct IMHO.
>>
>> Still need some docs; just throwing this out there and getting some feedback.
>>
>
> I strongly dislike adding this feature.  I'd only consider supporting it if 
> it was part of the SQL standard.
>
> Code is written once and read many times.  This feature caters to the writer, 
> not the reader.  And furthermore usage of this is prone to be to the writer's 
> detriment as well.

I'd say this feature (at least for me) caters to the investigator;
someone who is interactively looking at data hence why it would cater
to the writer.  Consider acquainting yourself with a large table that
has a large number of annoying-named fields where you want to look at
how different data is correlated or broken-down.  Something along the
lines of:

SELECT last_name, substring(first_name,1,1) as first_initial,
income_range, count(*) FROM census_data GROUP BY ALL;

If you are iteratively looking at things, adding or removing fields
from your breakdown, you only need to change it in a single place, the
tlist.  Additionally, expressions can be used transparently without
needing to repeat them.  (Yes, in practice, I'd often use GROUP BY 1,
2, say, but if you add more fields to this you need to edit in
multiple places.)

David

Re: Document use of ldapurl with LDAP simple bind

2024-07-23 Thread Jacob Champion

On Tue, Jul 23, 2024 at 1:37 AM Peter Eisentraut  wrote:
> Committed.

Thanks!

> (I suppose this could be considered a bug fix, but I don't feel an
> urgency to go backpatching this.  Let me know if there are different
> opinions.)

Certainly no urgency. The docs part of the patch also could be
backported alone, but I don't feel strongly either way.

--Jacob

Re: [PATCH] GROUP BY ALL

On Mon, Jul 22, 2024 at 5:29 PM Tom Lane  wrote:
>
> "David G. Johnston"  writes:
> > On Mon, Jul 22, 2024 at 1:55 PM David Christensen  wrote:
> >> I see that there'd been some chatter but not a lot of discussion about
> >> a GROUP BY ALL feature/functionality.  There certainly is utility in
> >> such a construct IMHO.
>
> > I strongly dislike adding this feature.  I'd only consider supporting it if
> > it was part of the SQL standard.
>
> Yeah ... my recollection is that we already rejected this idea.
> If you want to re-litigate that, "throwing this out there" is
> not a sufficient argument.

Heh, fair enough.  I actually wrote the patch after encountering the
syntax in DuckDB and it seemed easy enough to add to Postgres while
providing some utility, then ended up seeing a thread about it later.
I did not get the sense that this had been globally rejected; though
there were definitely voices against it, it seemed to trail off rather
than coming to a conclusion.

> (Personally, I'd wonder exactly what ALL is quantified over: the
> whole output of the FROM clause, or only columns mentioned in the
> SELECT tlist, or what? And why that choice rather than another?)

My intention here was to basically be a shorthand for "group by
specified non-aggregate fields in the select list".  Perhaps I'm not
being creative enough, but what is the interpretation/use case for
anything else? :-)

While there are other ways to accomplish these things, making an easy
way to GROUP BY with aggregate queries would be useful in the field,
particularly when doing iterative discovery work would save a lot of
time with a situation that is both detectable and hits users with
errors all the time.

I'm not married to the exact syntax of this feature; anything else
short and consistent could work if `ALL` is considered to potentially
gain a different interpretation in the future.

David

Re: [PATCH] GROUP BY ALL

On Tue, Jul 23, 2024 at 8:21 AM Andrei Borodin  wrote:
>
> On 23 Jul 2024, at 00:40, Isaac Morland  wrote:
>
> odyssey=> select (uw_term).*, count(*) from uw_term group by uw_term;
> ERROR:  column "uw_term.term_id" must appear in the GROUP BY clause or be 
> used in an aggregate function
> LINE 1: select (uw_term).*, count(*) from uw_term group by uw_term;
>
>
> AFAIR this problem was solved in my implementation [0]
>
> On 23 Jul 2024, at 01:29, Tom Lane  wrote:
>
> (Personally, I'd wonder exactly what ALL is quantified over: the
> whole output of the FROM clause, or only columns mentioned in the
> SELECT tlist, or what? And why that choice rather than another?)
>
>
> I'd like to have GROUP BY AUTO (I also proposed version GROUP BY SURPRISE 
> ME). But I wouldn't like to open pandora box of syntax sugar extensions which 
> may will be incompatible with future standards.
> If we could have extensible grammar - I'd be happy to have a lot of such 
> enhancements. My top 2 are FROM table SELECT column and better GROUP BY.

GROUP BY AUTO also seems fine here to me; I understand the desire to
avoid major incompatible syntax changes; GROUP BY ALL does exist in
multiple products so it's not unprecedented.

I wrote my patch before seeing your thread, sorry I missed that. :-)

David

Re: [PATCH] GROUP BY ALL

On Mon, Jul 22, 2024 at 4:41 PM Isaac Morland  wrote:

> And for when this might be useful, the syntax for it already exists, although 
> a spurious error message is generated:
>
> odyssey=> select (uw_term).*, count(*) from uw_term group by uw_term;
> ERROR:  column "uw_term.term_id" must appear in the GROUP BY clause or be 
> used in an aggregate function
> LINE 1: select (uw_term).*, count(*) from uw_term group by uw_term;
> ^

This is with my patch, or existing postgres?  Grouping by record is
not actually what this patch is trying to do, so perhaps there is some
ambiguity; this is intended to GROUP BY any select item that is not an
aggregate function.

Re: Add new COPY option REJECT_LIMIT

2024-07-23 Thread torikoshia

On Tue, Jul 23, 2024 at 1:35 PM Fujii Masao
wrote:

Thanks for your review.

On 2024/07/22 21:37, torikoshia wrote:
On Fri, Jul 19, 2024 at 11:48 PM Junwang Zhao
wrote:

Thanks for the comment.

In patch 0002, the ratio is calculated by the already
skipped/processed
rows, but what if a user wants to copy 1000 rows, and he/she can
tolerate
10 error rows, so he/she might set *reject_limit 0.01*, but one bad
row in the
first 100 rows will fail the entire command, this might surprise the
user.

Since the ratio is calculated after all data is processed, the case
"one bad row in the first 100 rows will fail the entire command"
doesn't happen:

Yes, but is this the desired behavior when using the ratio threshold?
I was thinking that COPY should fail once the error ratio (errors vs.
total rows in the input file) reaches the threshold.

Users might expect like you.
However, implementing it would need a row number counting feature as you
pointed out, and it seems an overkill.

Describing the difference between ratio and number in the manual might
help, but
it might be better to make REJECT_LIMIT support only the number of
errors and leave it to the user to calculate the number from the ratio.

I'd like to hear if anyone has an opinion on the need for supporting
ratio.

I remember David prefers ratio[1].

+ This option must be used with ON_ERROR to be
set to

+ other than stop.

Regarding the ON_ERROR option, now it only has two values.
Instead of saying "other than stop," we should simply use "ignore"
for clarity and intuition?

I'll Modify it.
The reason for the roundabout wording was the expectation that on_error
would support values other than these in the future, but as you point
out, there are currently only two.

+ When specified INFINITY,
COPY ignores all
+ the errors. This is a synonym for ON_ERROR
ignore.

For the INFINITY value, the description is a bit unclear.
As I understand it, INFINITY is the default for REJECT_LIMIT.
So, if ON_ERROR=ignore is set without specifying REJECT_LIMIT,
COPY will ignore all errors, similar to when REJECT_LIMIT=INFINITY is
used.

In line with my previous suggestion, if we support only REJECT_LIMIT
without ON_ERROR, having INFINITY makes sense. However,
with ON_ERROR supported, REJECT_LIMIT=INFINITY seems redundant.
Users can just set ON_ERROR=ignore to ignore all errors,
making INFINITY unnecessary for REJECT_LIMIT. This is open for
discussion, but I believe it's better to remove INFINITY from
the REJECT_LIMIT options.

Agreed.
Unless there are opposing opinions, I'm going to remove 'INFINITY' in
the next patch.

+ else if (strcmp(defel->defname, "reject_limit") == 0)
+ {
+ if (reject_limit_specified)
+ errorConflictingDefElem(defel, pstate);
+ if (!opts_out->on_error)
+ ereport(ERROR,
+
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("REJECT_LIMIT
requires ON_ERROR to be set to other than stop")));

Using "ignore" instead of "other than stop" in the error message
is clearer and more intuitive.

Agreed.

Checking if ON_ERROR and REJECT_LIMIT are specified should be
done after the foreach() processing of the options. Otherwise,
if REJECT_LIMIT is specified before ON_ERROR, unexpected errors can
occur.

---
=# copy test from '...' WITH (REJECT_LIMIT 7, ON_ERROR 'ignore');
ERROR: REJECT_LIMIT requires ON_ERROR to be set to other than stop
---

Ugh, I'll modify it.

+ ereport(ERROR,
+
(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
+errmsg("exceeded the
number specified by REJECT_LIMIT \"%lld\"",
+ (long
long) cstate->opts.reject_limits.num_err)));

The error message isn't clear about what exceeded REJECT_LIMIT.
How about saying "skipped more than REJECT_LIMIT rows" or something
instead?

Agreed.

+/*
+ * A struct to hold reject_limit options, in a parsed form.
+ * More values to be added in another patch.
+ */

The latter comment doesn't seem necessary or helpful.

Agreed.

[1]
https://www.postgresql.org/message-id/CAKFQuwYP91_G6tktYFTZq_CmkZ_%3DzuWjkz1%2B25Nd8bpsrDkx5Q%40mail.gmail.com

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Re: Add new COPY option REJECT_LIMIT

2024-07-23 Thread torikoshia


On 2024-07-23 02:06, Kirill Reshke wrote:

Thanks for your review.


Few comments:

+  When a positive integer value is specified, 
COPY limits
+  the maximum tolerable number of errors while converting a 
column's input

+  value into its data type.


If nothing is specified, then the maximum tolerable number of errors
is one, right? Should we state  this explicitly in the documentation?


REJECT_LIMIT now can be used wonly when on_error=ignore, I think the 
default(when only on_error=ignore is specified) is unlimited.

Anyway, I'm going to add a description about the default.


+COPY x from stdin with (on_error ignore, reject_limit 0);

How about a test where reject_limit is a string, but not
(case-intensively) 'infinity'?


Considering the discussion in[1], I'm now going to remove 'infinity'.


+ CopyRejectLimits reject_limits; /* thresholds of reject_limit */


Why are there multiple thresholds? Can we have only one?


This is because I thought it'd be more convenient to support both the 
number and the ratio of error, but I'm also beginning to think that it 
might be better to support only the number of cases, as discussed in 
[1].


[1]https://www.postgresql.org/message-id/5f807fcf3a36df7ba41464ab40b5c37d%40oss.nttdata.com

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Re: PG_TEST_EXTRA and meson

2024-07-23 Thread Jacob Champion

On Tue, Jul 23, 2024 at 3:32 AM Nazir Bilal Yavuz  wrote:
> Upthread Jacob said he could work on a patch about introducing the
> PG_TEST_EXTRA configure option to make builds. Would you still be
> interested in working on this? If not, I would gladly work on it.

Sure! Attached is a minimalist approach using AC_ARG_VAR.

It works for top-level `make check-world`, or `make check -C
src/test`. If you run `make check` directly from a test subdirectory,
the variable doesn't get picked up, because it's only exported from
the src/test level as of your patch c3382a3c3cc -- but if that turns
out to be a problem, we can plumb it all the way down or expand the
scope of the export.

Thanks,
--Jacob

make_test_extra.diff
Description: Binary data

Re: DSO Terms Galore

2024-07-23 Thread David E. Wheeler

On Jul 23, 2024, at 07:26, Peter Eisentraut  wrote:

> Things like "object" or "object file" or probably wrong-ish.  I understand an 
> object file to be a .o file, which you can't dlopen directly.

Agreed.

Another option, however, is “dynamically shared object” (DSO), which 
corresponds to the usual *nix extension, .so. I think I know the term most from 
Apache. It’s curious that I didn’t run across it while perusing the Postgres 
docs.

> I think we can unify this around terms like "dynamically loadable library" 
> and "dynamically loadable module" (or "loaded" in cases where it's talking 
> about a file that has already been loaded).

+1 for “dynamically loadable module” and, in common usage, “module”, since I 
don’t think it would be confused for anything else. “dynamically loadable 
library” would either have to always be used in full --- because “library” can 
be static, too --- or to “DLL”, which has strong Windows associations.

Best,

David

Re: proposal: schema variables

Thanks for the fixes and the new patch set!
I think that this would be a very valuable feature!

This is a very incomplete review after playing with the patch set for a while.

Some bugs and oddities I have found:

"psql" support:

  \? gives

\dV [PATTERN]  list variables

  I think that should say "schema variables" to distinguish them from
  psql variables.

  Running \dV when connected to an older server gives

ERROR:  relation "pg_catalog.pg_variable" does not exist
LINE 16: FROM pg_catalog.pg_variable v
  ^

  I think it would be better not to run the query and show a response like

session variables don't exist in server version 16

The LET statement:

CREATE VARIABLE testvar AS int4multirange[];
LET testvar = '{\{[2\,7]\,[11\,13]\}}';
ERROR:  variable "laurenz.testvar" is of type int4multirange[], but 
expression is of type text
LINE 1: LET testvar = '{\{[2\,7]\,[11\,13]\}}';
  ^
HINT:  You will need to rewrite or cast the expression.

  Sure, I can add an explicit type cast, but I think that the type should
  be determined by the type of the variable.  The right-hand side should be
  treated as "unknown", and the type input function should be used.

Parameter session_variables_ambiguity_warning:

--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -1544,6 +1544,16 @@ struct config_bool ConfigureNamesBool[] =
false,
NULL, NULL, NULL
},
+   {
+   {"session_variables_ambiguity_warning", PGC_USERSET, 
DEVELOPER_OPTIONS,
+   gettext_noop("Raise warning when reference to a session 
variable is ambiguous."),
+   NULL,
+   GUC_NOT_IN_SAMPLE
+   },
+   &session_variables_ambiguity_warning,
+   false,
+   NULL, NULL, NULL
+   },

  I think the short_desc should be "Raise a warning" (with the indefinite 
article).

  DEVELOPER_OPTIONS is the wrong category.  We normally use that for parameters
  that are only relevant for PostgreSQL hackers.  I think it should be
  CLIENT_CONN_OTHER.

CREATE VARIABLE command:

CREATE VARIABLE str AS text NOT NULL;
ERROR:  session variable must have a default value, since it's declared NOT 
NULL

  Perhaps this error message would be better:

session variables declared as NOT NULL must have a default value

  This is buggy:

CREATE VARIABLE str AS text NOT NULL DEFAULT NULL;

  Ugh.

SELECT str;
ERROR:  null value is not allowed for NOT NULL session variable 
"laurenz.str"
DETAIL:  The result of DEFAULT expression is NULL.

  Perhaps that is a leftover from the previous coding, but I think there need be
  no check upon SELECT.  It should be enough to check during CREATE VARIABLE and
  LET.

pg_dump support:

  The attempt to dump a database with an older version leads to

pg_dump: error: query failed: ERROR:  relation "pg_catalog.pg_variable" 
does not exist
LINE 14: FROM pg_catalog.pg_variable v
  ^

  Dumping variables must be conditional on the server version.

IMMUTABLE variables:

+   
+IMMUTABLE
+
+ 
+  The assigned value of the session variable can not be changed.
+  Only if the session variable doesn't have a default value, a single
+  initialization is allowed using the LET command. 
Once
+  done, no further change is allowed until end of transaction
+  if the session variable was created with clause ON 
TRANSACTION
+  END RESET, or until reset of all session variables by
+  DISCARD VARIABLES, or until reset of all session
+  objects by command DISCARD ALL.
+ 
+
+   

  I can see the usefulness of IMMUTABLE variables, but I am surprised that
  they are reset by DISCARD.  What is the use case you have in mind?
  The use case I can envision is an application that sets a value right after
  authentication, for use with row-level security.  But then it would be harmful
  if the user could reset the variable with DISCARD.

Documentation:

--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml

+   
+The session variables can be shadowed by column references in a query. 
When
+a query contains identifiers or qualified identifiers that could be 
used as
+both a session variable identifiers and as column identifier, then the
+column identifier is preferred every time. Warnings can be emitted when
+this situation happens by enabling configuration parameter . User can 
explicitly
+qualify the source object by syntax table.column or
+variable.column.
+   

  I think you mean schema.variable, not 
variable.column.

Yours,
Laurenz Albe

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Noah Misch

On Mon, Jul 22, 2024 at 09:34:42AM -0700, Jeff Davis wrote:
> On Mon, 2024-07-22 at 11:14 -0400, Robert Haas wrote:
> > On Mon, Jul 22, 2024 at 10:26 AM Peter Eisentraut  
> > wrote:
> > > I disagree with that.  We should put ourselves into the position to
> > > adopt new Unicode versions without fear.  Similar to updates to
> > > time
> > > zones, snowball, etc.
> > > 
> > > We can't be discussing the merits of the Unicode update every year.
> > > That would be madness.
> > 
> > Yeah, I agree with that 100%.
> 
> It's hard for me to argue; that was my reasoning during development.
> 
> But Noah seems to have a very strong opinion on this matter:
> 
> https://www.postgresql.org/message-id/20240629220857.fb.nmisch%40google.com
> 
> and I thought this thread would be a better opportunity for him to
> express it. Noah?

Long-term, we should handle this like Oracle, SQL Server, and DB2 do:
https://postgr.es/m/CA+fnDAbmn2d5tzZsj-4wmD0jApHTsg_zGWUpteb=omssx5r...@mail.gmail.com

Short-term, we should remedy the step backward that pg_c_utf8 has taken:
https://postgr.es/m/20240718233908.52.nmi...@google.com
https://postgr.es/m/486d71991a3f80ec1c47e1bd7931e2ef3627b6b3.ca...@cybertec.at


$SUBJECT has proposed remedy "take more care with Unicode updates".  If one
wanted to pursue that, it should get more specific, by giving one or both of:

(a) principles for deciding whether a Unicode update is okay
(b) examples of past Unicode release changes and whether PostgreSQL should
adopt a future Unicode version making a similar change

That said, I'm not aware of an (a) or (b) likely to create an attractive
compromise between the "index scan agrees with seqscan after pg_upgrade" goal
(https://postgr.es/m/20240706195129...@rfd.leadboat.com) and the "don't freeze
Unicode data" goal
(https://postgr.es/m/CA+TgmoZRpOFVmQWKEXHdcKj9AFLbXT5ouwtXa58J=3ydlp0...@mail.gmail.com).
The "long-term" above would satisfy both goals.  If it were me, I would
abandon the "more care" proposal.

Re: Enhance pg_dump multi-threaded streaming (WAS: Re: filesystem full during vacuum - space recovery issues)

2024-07-23 Thread Thomas Simpson


Hi Andrew,

This is very interesting.

I had started looking at pg_dumpall trying to work out an approach.  I 
noticed parallel.c essentially already does all the thread creation and 
coordination that I knew would be needed. Given that is a solved 
problem, I started to look further (continued below).



On 22-Jul-2024 11:50, Andrew Dunstan wrote:


On 2024-07-19 Fr 9:46 AM, Thomas Simpson wrote:


Hi Scott,

I realize some of the background was snipped on what I sent to the 
hacker list, I'll try to fill in the details.


Short background is very large database ran out of space during 
vacuum full taking down the server.  There is a replica which was 
applying the WALs and so it too ran out of space.  On restart after 
clearing some space, the database came back up but left over the 
in-progress rebuild files.  I've cleared that replica and am using it 
as my rebuild target just now.


Trying to identify the 'orphan' files and move them away always led 
to the database spotting the supposedly unused files having gone and 
refusing to start, so I had no successful way to clean up and get 
space back.


Last resort after discussion is pg_dumpall & reload.  I'm doing this 
via a network pipe (netcat) as I do not have the vast amount of 
storage necessary for the dump file to be stored (in any format).


On 19-Jul-2024 09:26, Scott Ribe wrote:
Do you actually have 100G networking between the nodes? Because if 
not, a single CPU should be able to saturate 10G.
Servers connect via 10G WAN; sending is not the issue, it's 
application of the incoming stream on the destination which is 
bottlenecked.
Likewise the receiving end would need disk capable of keeping up. 
Which brings up the question, why not write to disk, but directly to 
the destination rather than write locally then copy?

In this case, it's not a local write, it's piped via netcat.
Do you require dump-reload because of suspected corruption? That's a 
tough one. But if not, if the goal is just to get up and running on 
a new server, why not pg_basebackup, streaming replica, promote? 
That depends on the level of data modification activity being low 
enough that pg_basebackup can keep up with WAL as it's generated and 
apply it faster than new WAL comes in, but given that your server is 
currently keeping up with writing that much WAL and flushing that 
many changes, seems likely it would keep up as long as the network 
connection is fast enough. Anyway, in that scenario, you don't need 
to care how long pg_basebackup takes.


If you do need a dump/reload because of suspected corruption, the 
only thing I can think of is something like doing it a table at a 
time--partitioning would help here, if practical.


The basebackup is, to the best of my understanding, essentially just 
copying the database files.  Since the failed vacuum has left extra 
files, my expectation is these too would be copied, leaving me in the 
same position I started in.  If I'm wrong, please tell me as that 
would be vastly quicker - it is how I originally set up the replica 
and it took only a few hours on the 10G link.


The inability to get a clean start if I move any files out the way 
leads me to be concerned for some underlying corruption/issue and the 
recommendation earlier in the discussion was opt for dump/reload as 
the fail-safe.


Resigned to my fate, my thoughts were to see if there is a way to 
improve the dump-reload approach for the future.  Since dump-reload 
is the ultimate upgrade suggestion in the documentation, it seems 
worthwhile to see if there is a way to improve the performance of 
that especially as very large databases like mine are a thing with 
PostgreSQL.  From a quick review of pg_dump.c (I'm no expert on it 
obviously), it feels like it's already doing most of what needs done 
and the addition is some sort of multi-thread coordination with a 
restore client to ensure each thread can successfully complete each 
task it has before accepting more work.  I realize that's actually 
difficult to implement.





There is a plan for a non-text mode for pg_dumpall. I have started 
work on it, and hope to have a WIP patch in a month or so. It's not my 
intention to parallelize it for the first cut, but it could definitely 
be parallelizable in future. However, it will require writing to disk 
somewhere, albeit that the data will be compressed. It's well nigh 
impossible to parallelize text format dumps.


Restoration of custom and directory format dumps has long been 
parallelized. Parallel dumps require directory format, and so will 
non-text pg_dumpall.




My general approach (which I'm sure is naive) was:

Add to pg_dumpall the concept of backup phase and I have the basic hooks 
in place.  0 = role grants etc.  The stuff before dumping actual 
databases.  I intercepted the fprintf(OPF to a hook function that for 
normal run just ends up doing the same as fprintf but for my parallel 
mode, it has a hook to send the info via the network (still to be

Re: CI, macports, darwin version problems

2024-07-23 Thread Joe Conway


On 7/23/24 06:31, Thomas Munro wrote:

On Tue, Jul 23, 2024 at 7:37 AM Andres Freund  wrote:

[2] https://cirrus-ci.com/task/5190473306865664


"Error: “disk.img” couldn’t be copied to
“3FA983DD-3078-4B28-A969-BCF86F8C9585” because there isn’t enough
space."

Could it be copying the whole image every time, in some way that would
get copy-on-write on the same file system, but having to copy
physically here?  That is, instead of using some kind of chain of
overlay disk image files as seen elsewhere, is this Tart thing relying
on file system COW for that?  Perhaps that is happening here[1] but I
don't immediately know how to find out where that Swift standard
library call turns into system calls...

[1] 
https://github.com/cirruslabs/tart/blob/main/Sources/tart/VMDirectory.swift#L119


I tried moving ~/.tart/tmp to the external drive as well, but that 
failed -- I *think* because tart is trying to do some kind of hardlink 
between the files in ~/.tart/tmp and ~/.tart/vms. So I move that back 
and at least the ventura runs are working again.


I also noticed that when I set up the external drive, I 
somehow automatically configured time machine to run (it was not done 
intentionally), and it seemed that the backups were consuming space on 
the primary drive . Did I mention I really hate messing with 
macos ;-). Any idea how to disable time machine entirely? The settings 
app provides next to zero configuration of the thing.


Anyway, maybe with the time machine stuff removed the there is enough space?

I guess if all else fails I will have to get the mac mini with more 
built in storage in order to accommodate sonoma.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: [PATCH] GROUP BY ALL

On Tue, 2024-07-23 at 08:37 -0500, David Christensen wrote:
> My intention here was to basically be a shorthand for "group by
> specified non-aggregate fields in the select list".  Perhaps I'm not
> being creative enough, but what is the interpretation/use case for
> anything else? :-)

I am somewhat against this feature.
It is too much magic for my taste.

It might be handy for interactive use, but I would frown at an application
that uses code like that, much like I'd frown at "SELECT *" in application code.

Yours,
Laurenz Albe

Re: SQL:2011 application time

2024-07-23 Thread Paul Jungwirth


On 7/18/24 11:39, Paul Jungwirth wrote:
So I swapped in the &&& patch, cleaned it up, and added tests. But something is wrong. After I get 
one failure from an empty, I keep getting failures, even though the table is empty:


regression=# truncate temporal_rng cascade;
NOTICE:  truncate cascades to table "temporal_fk_rng2rng"
TRUNCATE TABLE
regression=# insert into temporal_rng values ('[1,2)', 
'[2000-01-01,2010-01-01)'); -- ok so far
INSERT 0 1
regression=# insert into temporal_rng values ('[1,2)', 'empty'); -- should fail 
and does
ERROR:  range cannot be empty
regression=# insert into temporal_rng values ('[1,2)', 
'[2010-01-01,2020-01-01)'); -- uh oh
ERROR:  range cannot be empty
regression=# truncate temporal_rng cascade;
NOTICE:  truncate cascades to table "temporal_fk_rng2rng"
TRUNCATE TABLE
regression=# insert into temporal_rng values ('[1,2)', 
'[2000-01-01,2010-01-01)'); -- ok so far
INSERT 0 1
regression=# insert into temporal_rng values ('[1,2)', 
'[2010-01-01,2020-01-01)'); -- ok now
INSERT 0 1

It looks like the index is getting corrupted. Continuing from the above:

regression=# create extension pageinspect;
CREATE EXTENSION
regression=# select gist_page_items(get_raw_page('temporal_rng_pk', 0), 
'temporal_rng_pk');
   gist_page_items

  (1,"(0,1)",40,f,"(id, valid_at)=(""[1,2)"", ""[2000-01-01,2010-01-01)"")")
  (2,"(0,2)",40,f,"(id, valid_at)=(""[1,2)"", ""[2010-01-01,2020-01-01)"")")
(2 rows)

regression=# insert into temporal_rng values ('[1,2)', 'empty');
ERROR:  range cannot be empty
regression=# select gist_page_items(get_raw_page('temporal_rng_pk', 0), 
'temporal_rng_pk');
   gist_page_items

  (1,"(0,1)",40,f,"(id, valid_at)=(""[1,2)"", ""[2000-01-01,2010-01-01)"")")
  (2,"(0,2)",40,f,"(id, valid_at)=(""[1,2)"", ""[2010-01-01,2020-01-01)"")")
  (3,"(0,3)",32,f,"(id, valid_at)=(""[1,2)"", empty)")
(3 rows)


I realized this isn't index corruption, just MVCC. The exclusion constraint is 
checked after we
update the index, which is why the row gets left behind. But it doesn't cause 
any wrong answers, and
if you vacuum the table the row goes away.

This also explains my confusion here:

I thought of a possible problem: this operator works great if there are already rows in the table, 
but what if the *first row you insert* has an empty range? Then there is nothing to compare against, 
so the operator will never be used. Right?


Except when I test it, it still works!


The first row still does a comparison because when we check the exclusion 
constraint, there is a
comparison between the query and the key we just inserted. (When I say "query" 
I don't mean a SQL
query, but the value used to search the index that is compared against its 
keys.)

So I'm glad I didn't stumble on a GiST bug, but I think it means ereporting 
from an exclusion operator
is not a workable approach. Failures leave behind invalid tuples, and future 
(valid) tuples can fail if
we compare to those invalid tuples. Since MVCC visibility is stored in the 
heap, not in the index, it's
not really accessible to us here. So far I don't have any ideas to rescue this 
idea, even though I like
it a lot. So I will go back to the executor idea we discussed at pgconf.dev.

One tempting alternative though is to let exclusion constraints do the 
not-empty check, instead of
putting it in the executor. It would be an extra check we do only when the 
constraint has
pg_constraint.conperiod. Then we don't need to add & maintain 
pg_class.relwithoutoverlaps, and we don't
need a relcache change, and we don't need so much extra code to check existing 
rows when you add the
constraint. It doesn't use the existing available exclusion constraint 
functionality, but if we're
willing to extend the executor to know about WITHOUT OVERLAPS, I guess we could 
teach exclusion
constraints about it instead. Doing the check there does seem to have better 
locality with the feature.
So I think I will try that out as well.

Yours,

--
Paul  ~{:-)
p...@illuminatedcomputing.com

Re: [PATCH] GROUP BY ALL

On Tue, Jul 23, 2024 at 10:57 AM Laurenz Albe  wrote:
>
> On Tue, 2024-07-23 at 08:37 -0500, David Christensen wrote:
> > My intention here was to basically be a shorthand for "group by
> > specified non-aggregate fields in the select list".  Perhaps I'm not
> > being creative enough, but what is the interpretation/use case for
> > anything else? :-)
>
> I am somewhat against this feature.
> It is too much magic for my taste.
>
> It might be handy for interactive use, but I would frown at an application
> that uses code like that, much like I'd frown at "SELECT *" in application 
> code.

Sure, not everything that makes things easier is strictly necessary;
we could require `CAST(field AS text)` instead of `::text`, make
subqueries required for transforming oids into specific system tables
instead of `::regfoo` casts,  any number of other choices, remove
`SELECT *` as a parse option, but making it easier to do common things
interactively as a DBA has value as well.

David

Re: [PATCH] GROUP BY ALL

2024-07-23 Thread David G. Johnston

On Tue, Jul 23, 2024 at 9:48 AM David Christensen  wrote:

>
> Sure, not everything that makes things easier is strictly necessary;
> we could require `CAST(field AS text)` instead of `::text`,

Probably should have...being standard and all.  Though syntactic sugar is
quite different from new behavior - transforming :: to CAST is
straight-forward.

make
> subqueries required for transforming oids into specific system tables
> instead of `::regfoo` casts,

Since OID is non-standard this falls within our purview.

  any number of other choices, remove
> `SELECT *` as a parse option,

Again, standard dictated.

but making it easier to do common things
> interactively as a DBA has value as well.
>
>
Agreed, but this isn't a clear-cut win, and doesn't have standard
conformance to tip the scale over fully.

Also, there are so many better tools for data exploration.  Removing this
quirk only marginally closes that gap.

David J.

Re: [PATCH] GROUP BY ALL

2024-07-23 Thread Paul Jungwirth


On 7/22/24 15:43, Tom Lane wrote:

Isaac Morland  writes:

And for when this might be useful, the syntax for it already exists,
although a spurious error message is generated:



odyssey=> select (uw_term).*, count(*) from uw_term group by uw_term;
ERROR:  column "uw_term.term_id" must appear in the GROUP BY clause or be
used in an aggregate function
LINE 1: select (uw_term).*, count(*) from uw_term group by uw_term;
 ^



I'm not sure exactly what's going on here


The SELECT entry is expanded into "uw_term.col1, uw_term.col2,
uw_term.col3, ...", and those single-column Vars don't match the
whole-row Var appearing in the GROUP BY list.  I guess if we
think this is important, we could add a proof rule saying that
a per-column Var is functionally dependent on a whole-row Var
of the same relation.  Odd that the point hasn't come up before
(though I guess that suggests that few people try this).


I was just using this group-by-row feature last week to implement a temporal outer join in a way 
that would work for arbitrary tables. Here is some example SQL:


https://github.com/pjungwir/temporal_ops/blob/b10d65323749faa6c47956db2e8f95441e508fce/sql/outer_join.sql#L48-L66

That does `GROUP BY a` then `SELECT (x.a).*`.[1]

It is very useful for writing queries that don't want to know about the 
structure of the row.

I noticed the same error as Isaac. I worked around the problem by wrapping it in a subquery and 
decomposing the row outside. It's already an obscure feature, and an easy workaround might be why 
you haven't heard complaints before. I wouldn't mind writing a patch for that rule when I get a 
chance (if no one else gets to it first.)


[1] Actually I see it does `GROUP BY a, a.valid_at`, but that is surely more than I need. I think 
that `a.valid_at` is leftover from a previous version of the query.


Yours,

--
Paul  ~{:-)
p...@illuminatedcomputing.com

Re: [18] Policy on IMMUTABLE functions and Unicode updates

On Tue, 2024-07-23 at 08:49 -0400, Robert Haas wrote:
> Hmm. I think we might have some unique problems due to the fact that
> we rely partly on the operating system behavior, partly on libicu,
> and
> partly on our own internal tables.

The reliance on the OS is especially problematic for reasons that have
already been discussed extensively.

One of my strongest motivations for PG_C_UTF8 was that there was still
a use case for libc in PG16: the "C.UTF-8" locale, which is not
supported at all in ICU. Daniel Vérité made me aware of the importance
of this locale, which offers code point order collation combined with
Unicode ctype semantics.

With PG17, between ICU and the builtin provider, there's little
remaining reason to use libc (aside from legacy).

Regards,
Jeff Davis

Re: Direct SSL connection and ALPN loose ends

2024-07-23 Thread Heikki Linnakangas


On 16/07/2024 09:54, Michael Paquier wrote:

On Mon, Jun 24, 2024 at 11:30:53PM +0300, Heikki Linnakangas wrote:

0002: This is the main patch that fixes the SSL fallback issue


+ conn->failed_enc_methods |= conn->allowed_enc_methods &
(~conn->current_enc_method);

Sounds reasonable to me.

It's a bit annoying to have to guess that current_enc_method is
tracking only one method at a time (aka these three fields are not
documented in libpq-int.h), while allowed_enc_methods and
failed_enc_methods is a bitwise combination of the methods that are
still allowed or that have already failed.


Yeah. In hindsight I'm still not very happy with the code structure with 
"allowed_enc_methods" and "current_enc_methods" and all that. The 
fallback logic is still complicated. It's better than in v16, IMHO, but 
still not great. This patch seems like the best fix for v17, but I 
wouldn't mind another round of refactoring for v18, if anyone's got some 
good ideas on how to structure it better. All these new tests are a 
great asset when refactoring this again.



+if (IsInjectionPointAttached("backend-initialize-v2-error"))
+{
+FrontendProtocol = PG_PROTOCOL(2,0);
+elog(FATAL, "protocol version 2 error triggered");
+}

This is an attempt to do stack manipulation with an injection point
set.  FrontendProtocol is a global variable, so you could have a new
callback setting up this global variable directly, then FATAL (I
really don't mind is modules/injection_points finishes with a library
of callbacks).

Not sure to like much this new IsInjectionPointAttached() that does a
search in the existing injection point pool, though.  This leads to
more code footprint in the core backend, and I'm trying to minimize
that.  Not everybody agrees with this view, I'd guess, which is also
fine.


Yeah, I'm also not too excited about the additional code in the backend, 
but I'm also not excited about writing another test C module just for 
this. I'm inclined to commit this as it is, but we can certainly revisit 
this later, since it's just test code.


Here's a new rebased version with some minor cleanup. Notably, I added 
docs for the new IS_INJECTION_POINT_ATTACHED() macro.


--
Heikki Linnakangas
Neon (https://neon.tech)
From 651db616fcb19b89c94cacb30c27267ac7c17070 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Tue, 23 Jul 2024 20:03:27 +0300
Subject: [PATCH v2 1/3] Fix fallback behavior when server sends an ERROR early
 at startup

With sslmode=prefer, the desired behavior is to completely fail the
connection attempt, *not* fall back to a plaintext connection, if the
server responds to the SSLRequest with an error ('E') response instead
of rejecting SSL with an 'N' response. This was broken in commit
05fd30c0e7.

Reported-by: Jacob Champion
Discussion: https://www.postgresql.org/message-id/CAOYmi%2Bnwvu21mJ4DYKUa98HdfM_KZJi7B1MhyXtnsyOO-PB6Ww%40mail.gmail.com
---
 src/interfaces/libpq/fe-connect.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index e003279fb6..360d9a4547 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3521,6 +3521,12 @@ keep_going:		/* We will come back to here until there is
 		 * byte here.
 		 */
 		conn->status = CONNECTION_AWAITING_RESPONSE;
+
+		/*
+		 * Don't fall back to a plaintext connection after
+		 * reading the error.
+		 */
+		conn->failed_enc_methods |= conn->allowed_enc_methods & (~conn->current_enc_method);
 		goto keep_going;
 	}
 	else
@@ -3612,6 +3618,13 @@ keep_going:		/* We will come back to here until there is
 		 * into AWAITING_RESPONSE state and let the code there
 		 * deal with it.  Note we have *not* consumed the "E"
 		 * byte here.
+		 *
+		 * Note that unlike on an error response to
+		 * SSLRequest, we allow falling back to SSL or
+		 * plaintext connection here.  GSS support was
+		 * introduced in PostgreSQL version 12, so an error
+		 * response might mean that we are connecting to a
+		 * pre-v12 server.
 		 */
 		conn->status = CONNECTION_AWAITING_RESPONSE;
 		goto keep_going;
@@ -3659,6 +3672,10 @@ keep_going:		/* We will come back to here until there is
 }
 else if (pollres == PGRES_POLLING_FAILED)
 {
+	/*
+	 * GSS handshake failed.  We will retry with an SSL or
+	 * plaintext connection, if permitted by the options.
+	 */
 	CONNECTION_FAILED();
 }
 /* Else, return POLLING_READING or POLLING_WRITING status */
-- 
2.39.2

From 649e2c89f62484940c55b02635e393bdd85654af Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Tue, 23 Jul 2024 20:20:37 +0300
Subject: [PATCH v2 2/3] Add test for early backend startup errors

The new test tests the libpq fallback behavior on an early error.

This adds an IS_INJECTION_POINT_ATTACHED() macro, to allow writing the
t

Re: Add mention of execution time memory for enable_partitionwise_* GUCs

2024-07-23 Thread Dimitrios Apostolou


Thank you for the patch improving the docs, I think it's a clear
improvement from before.

On Thu, 18 Jul 2024, David Rowley wrote:


I considered writing about work_mem, but felt I wanted to keep it as
brief as possible and just have some words that might make someone
think twice.  The details in the work_mem documentation should inform
the reader that work_mem is per executor node.  It likely wouldn't
hurt to have more documentation around which executor node types can
use a work_mem, which use work_mem * hash_mem_multiplier and which use
neither. We tend to not write too much about executor nodes in the
documents, so I'm not proposing that for this patch.


This is the only part I think is missing, since we now know (measurements
in [1], reproducible scenario in [2]) that the number of partitions plays
an important role in sizing the RAM of the server. It's just too big to
not mention that worst case will be n_partitions * work_mem.


[1] 
https://www.postgresql.org/message-id/d26e67d3-74bc-60aa-bf24-2a8fb83efe9c%40gmx.net

[2] 
https://www.postgresql.org/message-id/af6ed790-a5fe-19aa-1141-927595604c01%40gmx.net


I would also like to add an entry about this issue with links to the above
pages, to the TODO page at [3], as this is the only bugtracker I'm aware
of. Am I doing it right bringing it up for approval on this list?

[3] https://wiki.postgresql.org/wiki/Todo



Thanks,
Dimitris

Re: Add privileges test for pg_stat_statements to improve coverage

2024-07-23 Thread Fujii Masao




On 2024/07/23 15:02, kuroda.keis...@nttcom.co.jp wrote:

Hi Fujii-san,
Thank you for your comment!

attach v5 fixed patch.


Using "postgres" as the default superuser name can cause instability.
This is why the Patch Tester reports now test failures again.
You should create and use a different superuser, such as
"regress_stats_superuser."


I understand.
Add "regress_stats_superuser" for test by superuser.


Thanks for updating the patch!

I've slightly modified the comments in the regression tests for clarity.
Attached is the v6 patch. If there are no objections,
I will push this version.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATIONFrom d2dfc611750bce0b06ba3a3f82b136014d399987 Mon Sep 17 00:00:00 2001
From: Fujii Masao 
Date: Wed, 24 Jul 2024 02:50:35 +0900
Subject: [PATCH v6] pg_stat_statements: Add regression test for privilege
 handling.

This commit adds a regression test to verify that pg_stat_statements
correctly handles privileges, improving its test coverage.

Author: Keisuke Kuroda
Reviewed-by: Michael Paquier, Fujii Masao
Discussion: https://postgr.es/m/2224ccf2e12c41ccb81702ef3303d...@nttcom.co.jp
---
 contrib/pg_stat_statements/Makefile   |  3 +-
 .../expected/privileges.out   | 97 +++
 contrib/pg_stat_statements/meson.build|  1 +
 contrib/pg_stat_statements/sql/privileges.sql | 60 
 4 files changed, 160 insertions(+), 1 deletion(-)
 create mode 100644 contrib/pg_stat_statements/expected/privileges.out
 create mode 100644 contrib/pg_stat_statements/sql/privileges.sql

diff --git a/contrib/pg_stat_statements/Makefile 
b/contrib/pg_stat_statements/Makefile
index 414a30856e..c19ccad77e 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -19,7 +19,8 @@ LDFLAGS_SL += $(filter -lm, $(LIBS))
 
 REGRESS_OPTS = --temp-config 
$(top_srcdir)/contrib/pg_stat_statements/pg_stat_statements.conf
 REGRESS = select dml cursors utility level_tracking planning \
-   user_activity wal entry_timestamp cleanup oldextversions
+   user_activity wal entry_timestamp privileges cleanup \
+   oldextversions
 # Disabled because these tests require 
"shared_preload_libraries=pg_stat_statements",
 # which typical installcheck users do not have (e.g. buildfarm clients).
 NO_INSTALLCHECK = 1
diff --git a/contrib/pg_stat_statements/expected/privileges.out 
b/contrib/pg_stat_statements/expected/privileges.out
new file mode 100644
index 00..d0bb83
--- /dev/null
+++ b/contrib/pg_stat_statements/expected/privileges.out
@@ -0,0 +1,97 @@
+--
+-- Only superusers and roles with privileges of the pg_read_all_stats role
+-- are allowed to see the SQL text and queryid of queries executed by
+-- other users. Other users can see the statistics.
+--
+SET pg_stat_statements.track_utility = FALSE;
+CREATE ROLE regress_stats_superuser SUPERUSER;
+CREATE ROLE regress_stats_user1;
+CREATE ROLE regress_stats_user2;
+GRANT pg_read_all_stats TO regress_stats_user2;
+SET ROLE regress_stats_superuser;
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t 
+---
+ t
+(1 row)
+
+SELECT 1 AS "ONE";
+ ONE 
+-
+   1
+(1 row)
+
+SET ROLE regress_stats_user1;
+SELECT 1+1 AS "TWO";
+ TWO 
+-
+   2
+(1 row)
+
+--
+-- A superuser can read all columns of queries executed by others,
+-- including query text and queryid.
+--
+SET ROLE regress_stats_superuser;
+SELECT r.rolname, ss.queryid <> 0 AS queryid_bool, ss.query, ss.calls, ss.rows
+  FROM pg_stat_statements ss JOIN pg_roles r ON ss.userid = r.oid
+  ORDER BY r.rolname, ss.query COLLATE "C", ss.calls, ss.rows;
+ rolname | queryid_bool |   query  
  | calls | rows 
+-+--++---+--
+ regress_stats_superuser | t| SELECT $1 AS "ONE"   
  | 1 |1
+ regress_stats_superuser | t| SELECT pg_stat_statements_reset() IS 
NOT NULL AS t | 1 |1
+ regress_stats_user1 | t| SELECT $1+$2 AS "TWO"
  | 1 |1
+(3 rows)
+
+--
+-- regress_stats_user1 has no privileges to read the query text or
+-- queryid of queries executed by others but can see statistics
+-- like calls and rows.
+--
+SET ROLE regress_stats_user1;
+SELECT r.rolname, ss.queryid <> 0 AS queryid_bool, ss.query, ss.calls, ss.rows
+  FROM pg_stat_statements ss JOIN pg_roles r ON ss.userid = r.oid
+  ORDER BY r.rolname, ss.query COLLATE "C", ss.calls, ss.rows;
+ rolname | queryid_bool |  query   | calls | 
rows 
+-+--+--+---+--
+ regress_stats_superuser |  |  | 1 |   
 1
+ regress_stats_superuser |  |  | 1 |   
 1
+ regress_stats_superuser |  |  |

Useless toast

2024-07-23 Thread Marcos Pegoraro

Using version 16, seems strange when toast needs to be created. Tested with
domain being numeric or varchar(10) with the same results.

And If that domain is integer then no toast is created.

I think none of these tables should have a toast, right ?

postgres=# create domain mynum as numeric(15,2);
CREATE DOMAIN
postgres=# create table tab1(id integer, num numeric(15,2));
CREATE TABLE
postgres=# create table tab2(id integer, num mynum);
CREATE TABLE
postgres=# create table tab3(id integer, num mynum storage main);
CREATE TABLE
postgres=# create table tab4(id integer, num mynum storage plain);
CREATE TABLE
postgres=# select relname, reltoastrelid from pg_class where relname ~
'tab\d$' order by 1;
 relname | reltoastrelid
-+---
 tab1| 0
 tab2| 25511
 tab3| 25516
 tab4| 0
(4 rows)

regards
Marcos

Re: [18] Policy on IMMUTABLE functions and Unicode updates

On Tue, Jul 23, 2024 at 1:03 PM Jeff Davis  wrote:
> One of my strongest motivations for PG_C_UTF8 was that there was still
> a use case for libc in PG16: the "C.UTF-8" locale, which is not
> supported at all in ICU. Daniel Vérité made me aware of the importance
> of this locale, which offers code point order collation combined with
> Unicode ctype semantics.
>
> With PG17, between ICU and the builtin provider, there's little
> remaining reason to use libc (aside from legacy).

I was really interested to read Jeremy Schneider's slide deck, to
which he linked earlier, wherein he explained that other major
databases default to something more like C.UTF-8. Maybe we need to
relitigate the debate about what our default should be in light of
those findings (but, if so, on another thread with a clear subject
line). But even if we were to decide to change the default, there are
lots and lots of existing databases out there that are using libc
collations. I'm not in a good position to guess how many of those
people actually truly care about language-specific collations. I'm
positive it's not zero, but I can't really guess how much more than
zero it is. Even if it were zero, though, the fact that so many
upgrades are done using pg_upgrade means that this problem will still
be around in a decade even if we changed the default tomorrow.

(I do understand that you wrote "aside from legacy" so I'm not
accusing you of ignoring the upgrade issues, just taking the
opportunity to be more explicit about my own view.)

Also, Noah has pointed out that C.UTF-8 introduces some
forward-compatibility hazards of its own, at least with respect to
ctype semantics. I don't have a clear view of what ought to be done
about that, but if we just replace a dependency on an unstable set of
libc definitions with a dependency on an equally unstable set of
PostgreSQL definitions, we're not really winning. Do we need to
version the new ctype provider?

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: [18] Policy on IMMUTABLE functions and Unicode updates

Robert Haas  writes:
> Also, Noah has pointed out that C.UTF-8 introduces some
> forward-compatibility hazards of its own, at least with respect to
> ctype semantics. I don't have a clear view of what ought to be done
> about that, but if we just replace a dependency on an unstable set of
> libc definitions with a dependency on an equally unstable set of
> PostgreSQL definitions, we're not really winning.

No, I think we *are* winning, because the updates are not "equally
unstable": with pg_c_utf8, we control when changes happen.  We can
align them with major releases and release-note the differences.
With libc-based collations, we have zero control and not much
notification.

> Do we need to version the new ctype provider?

It would be a version for the underlying Unicode definitions,
not the provider as such, but perhaps yes.  I don't know to what
extent doing so would satisfy Noah's concern; but if it would do
so I'd be happy with that answer.

regards, tom lane

Re: Useless toast


On 23.07.24 20:35, Marcos Pegoraro wrote:
Using version 16, seems strange when toast needs to be created. 
Tested with domain being numeric or varchar(10) with the same results.


And If that domain is integer then no toast is created.

I think none of these tables should have a toast, right ?


The mechanism that determines whether a toast table is needed only 
considers the data type, not the "typmod" (arguments of the data type). 
So this is perhaps suboptimal, but this logic just doesn't exist.


Also, note that varchar(10) means 10 characters, not 10 bytes, so you 
can't necessarily draw conclusions about storage size from that.  There 
aren't any supported character encodings that would encode 10 characters 
into more bytes than the toast threshold, so this is just theoretical, 
but it would be hard to decide what the actual threshold would be in 
practice.

Re: Useless toast

Marcos Pegoraro  writes:
> Using version 16, seems strange when toast needs to be created. Tested with
> domain being numeric or varchar(10) with the same results.

Domains are fairly opaque when it comes to maximum length.
I cannot get excited about adding code to make them less so.

regards, tom lane

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Joe Conway


On 7/23/24 15:26, Tom Lane wrote:

Robert Haas  writes:

Also, Noah has pointed out that C.UTF-8 introduces some
forward-compatibility hazards of its own, at least with respect to
ctype semantics. I don't have a clear view of what ought to be done
about that, but if we just replace a dependency on an unstable set of
libc definitions with a dependency on an equally unstable set of
PostgreSQL definitions, we're not really winning.


No, I think we *are* winning, because the updates are not "equally
unstable": with pg_c_utf8, we control when changes happen.  We can
align them with major releases and release-note the differences.
With libc-based collations, we have zero control and not much
notification.


+1


Do we need to version the new ctype provider?


It would be a version for the underlying Unicode definitions,
not the provider as such, but perhaps yes.  I don't know to what
extent doing so would satisfy Noah's concern; but if it would do
so I'd be happy with that answer.


I came to the same conclusion. I think someone mentioned somewhere on 
this thread that other databases support multiple Unicode versions. I 
think we need to figure out how to do that too.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Useless toast

Peter Eisentraut  writes:
> On 23.07.24 20:35, Marcos Pegoraro wrote:
>> I think none of these tables should have a toast, right ?

> The mechanism that determines whether a toast table is needed only 
> considers the data type, not the "typmod" (arguments of the data type). 
> So this is perhaps suboptimal, but this logic just doesn't exist.

Not true, see type_maximum_size() in format_type.c.  But I'm
uninterested in making that drill down into domains, or at least
that would not be my first concern if we were trying to improve it.
(The first concern would be to let extension types in on the fun.)

regards, tom lane

Re: pg_upgrade and logical replication

2024-07-23 Thread Nathan Bossart

On Tue, Jul 23, 2024 at 09:05:05AM +0530, Amit Kapila wrote:
> Right, the other option would be to move it to the place where we call
> check_old_cluster_for_valid_slots(), etc. Initially, it was kept in
> the specific function (get_db_rel_and_slot_infos) as we were
> mainlining the count at the per-database level but now as we are
> changing that I am not sure if calling it from the same place is a
> good idea. But OTOH, it is okay to keep it at the place where we
> retrieve the required information from the old cluster.

I moved it to where you suggested.

> One minor point is the comment atop get_subscription_count() still
> refers to the function name as get_db_subscription_count().

Oops, fixed.

-- 
nathan
>From 19831c5a2869f949e73564abea8a36858b39bcd1 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Sat, 20 Jul 2024 21:01:29 -0500
Subject: [PATCH v3 1/1] pg_upgrade: retrieve subscription count more
 efficiently

---
 src/bin/pg_upgrade/check.c  | 13 -
 src/bin/pg_upgrade/info.c   | 51 +
 src/bin/pg_upgrade/pg_upgrade.h |  4 +--
 3 files changed, 15 insertions(+), 53 deletions(-)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 27924159d6..51e30a2f23 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -609,8 +609,10 @@ check_and_dump_old_cluster(bool live_check)
 
/*
 * Subscriptions and their dependencies can be migrated since 
PG17.
-* See comments atop get_db_subscription_count().
+* Before that the logical slots are not upgraded, so we will 
not be
+* able to upgrade the logical replication clusters completely.
 */
+   get_subscription_count(&old_cluster);
check_old_cluster_subscription_state();
}
 
@@ -1797,17 +1799,14 @@ check_new_cluster_subscription_configuration(void)
 {
PGresult   *res;
PGconn *conn;
-   int nsubs_on_old;
int max_replication_slots;
 
/* Subscriptions and their dependencies can be migrated since PG17. */
if (GET_MAJOR_VERSION(old_cluster.major_version) < 1700)
return;
 
-   nsubs_on_old = count_old_cluster_subscriptions();
-
/* Quick return if there are no subscriptions to be migrated. */
-   if (nsubs_on_old == 0)
+   if (old_cluster.nsubs == 0)
return;
 
prep_status("Checking for new cluster configuration for subscriptions");
@@ -1821,10 +1820,10 @@ check_new_cluster_subscription_configuration(void)
pg_fatal("could not determine parameter settings on new 
cluster");
 
max_replication_slots = atoi(PQgetvalue(res, 0, 0));
-   if (nsubs_on_old > max_replication_slots)
+   if (old_cluster.nsubs > max_replication_slots)
pg_fatal("\"max_replication_slots\" (%d) must be greater than 
or equal to the number of "
 "subscriptions (%d) on the old cluster",
-max_replication_slots, nsubs_on_old);
+max_replication_slots, old_cluster.nsubs);
 
PQclear(res);
PQfinish(conn);
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 95c22a7200..c07a69b63e 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -28,7 +28,6 @@ static void print_db_infos(DbInfoArr *db_arr);
 static void print_rel_infos(RelInfoArr *rel_arr);
 static void print_slot_infos(LogicalSlotInfoArr *slot_arr);
 static void get_old_cluster_logical_slot_infos(DbInfo *dbinfo, bool 
live_check);
-static void get_db_subscription_count(DbInfo *dbinfo);
 
 
 /*
@@ -293,15 +292,8 @@ get_db_rel_and_slot_infos(ClusterInfo *cluster, bool 
live_check)
 
get_rel_infos(cluster, pDbInfo);
 
-   /*
-* Retrieve the logical replication slots infos and the 
subscriptions
-* count for the old cluster.
-*/
if (cluster == &old_cluster)
-   {
get_old_cluster_logical_slot_infos(pDbInfo, live_check);
-   get_db_subscription_count(pDbInfo);
-   }
}
 
if (cluster == &old_cluster)
@@ -748,54 +740,25 @@ count_old_cluster_logical_slots(void)
 }
 
 /*
- * get_db_subscription_count()
- *
- * Gets the number of subscriptions in the database referred to by "dbinfo".
+ * get_subscription_count()
  *
- * Note: This function will not do anything if the old cluster is pre-PG17.
- * This is because before that the logical slots are not upgraded, so we will
- * not be able to upgrade the logical replication clusters completely.
+ * Gets the number of subscriptions in the cluster.
  */
-static void
-get_db_subscription_count(DbInfo *dbinfo)
+void
+get_subscription_count(ClusterInfo *cluster)
 {
PGconn *conn;

Re: [18] Policy on IMMUTABLE functions and Unicode updates

On Tue, 2024-07-23 at 15:26 -0400, Tom Lane wrote:
> No, I think we *are* winning, because the updates are not "equally
> unstable": with pg_c_utf8, we control when changes happen.  We can
> align them with major releases and release-note the differences.
> With libc-based collations, we have zero control and not much
> notification.

Also, changes to libc collations are much more impactful, at least two
orders of magnitude. All indexes on text are at risk, even primary
keys.

PG_C_UTF8 has stable code point ordering (memcmp()) that is unaffected
by Unicode updates, so primary keys will never be affected. The risks
we are talking about are for expression indexes, e.g. on LOWER(). Even
if you do have such expression indexes, the types of changes Unicode
makes to casing and character properties are typically much more mild.

Regards,
Jeff Davis

Re: [PATCH] GROUP BY ALL


On 23.07.24 00:29, Tom Lane wrote:

(Personally, I'd wonder exactly what ALL is quantified over: the
whole output of the FROM clause, or only columns mentioned in the
SELECT tlist, or what? And why that choice rather than another?)


Looks like the main existing implementations take it to mean all entries 
in the SELECT list that are not aggregate functions.


https://duckdb.org/docs/sql/query_syntax/groupby.html#group-by-all
https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-qry-select-groupby.html#parameters
https://docs.snowflake.com/en/sql-reference/constructs/group-by#parameters

Re: [18] Policy on IMMUTABLE functions and Unicode updates

Jeff Davis  writes:
> On Tue, 2024-07-23 at 15:26 -0400, Tom Lane wrote:
>> No, I think we *are* winning, because the updates are not "equally
>> unstable": with pg_c_utf8, we control when changes happen.  We can
>> align them with major releases and release-note the differences.
>> With libc-based collations, we have zero control and not much
>> notification.

> Also, changes to libc collations are much more impactful, at least two
> orders of magnitude. All indexes on text are at risk, even primary
> keys.

Well, it depends on which libc collation you have in mind.  I was
thinking of a libc-supplied C.UTF-8 collation, which I would expect
to behave the same as pg_c_utf8, modulo which Unicode version it's
based on.  But even when comparing to that, pg_c_utf8 can win on
stability for the reasons I stated.  If you don't have a C.UTF-8
collation available, and are forced to use en_US.UTF-8 or
$locale-of-choice, then the stability picture is far more dire,
as Jeff says.

Noah seems to be comparing the stability of pg_c_utf8 to the stability
of a pure C/POSIX collation, but I do not think that is the relevant
comparison to make.  Besides, if someone is using C/POSIX, this
feature doesn't stop them from continuing to do so.

regards, tom lane

Re: [18] Policy on IMMUTABLE functions and Unicode updates

On Tue, 2024-07-23 at 07:39 -0700, Noah Misch wrote:
> we should remedy the step backward that pg_c_utf8 has taken:

Obviously I disagree that we've taken a step backwards.

Can you articulate the principle by which all of the other problems
with IMMUTABLE are just fine, but updates to Unicode are intolerable,
and only for PG_C_UTF8?

Regards,
Jeff Davis

Re: [18] Policy on IMMUTABLE functions and Unicode updates

On Tue, Jul 23, 2024 at 3:26 PM Tom Lane  wrote:
> No, I think we *are* winning, because the updates are not "equally
> unstable": with pg_c_utf8, we control when changes happen.  We can
> align them with major releases and release-note the differences.
> With libc-based collations, we have zero control and not much
> notification.

OK, that's pretty fair.

> > Do we need to version the new ctype provider?
>
> It would be a version for the underlying Unicode definitions,
> not the provider as such, but perhaps yes.  I don't know to what
> extent doing so would satisfy Noah's concern; but if it would do
> so I'd be happy with that answer.

I don't see how we can get by without some kind of versioning here.
It's probably too late to do that for v17, but if we bet either that
(1) we'll never need to change anything for pg_c_utf8 or that (2)
those changes will be so minor that nobody will have a problem, I
think we will lose our bet.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: [18] Policy on IMMUTABLE functions and Unicode updates

Robert Haas  writes:
> On Tue, Jul 23, 2024 at 3:26 PM Tom Lane  wrote:
>>> Do we need to version the new ctype provider?

>> It would be a version for the underlying Unicode definitions,
>> not the provider as such, but perhaps yes.  I don't know to what
>> extent doing so would satisfy Noah's concern; but if it would do
>> so I'd be happy with that answer.

> I don't see how we can get by without some kind of versioning here.
> It's probably too late to do that for v17,

Why?  If we agree that that's the way forward, we could certainly
stick some collversion other than "1" into pg_c_utf8's pg_collation
entry.  There's already been one v17 catversion bump since beta2
(716bd12d2), so another one is basically free.

regards, tom lane

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Daniel Verite

Tom Lane wrote:

> > I don't see how we can get by without some kind of versioning here.
> > It's probably too late to do that for v17,
> 
> Why?  If we agree that that's the way forward, we could certainly
> stick some collversion other than "1" into pg_c_utf8's pg_collation
> entry.  There's already been one v17 catversion bump since beta2
> (716bd12d2), so another one is basically free.

pg_collation.collversion has been used so far for the sort part
of the collations.

For the ctype part:

postgres=# select unicode_version();
 unicode_version 
-
 15.1
(1 row)


postgres=# select icu_unicode_version ();
 icu_unicode_version 
-
 14.0
(1 row)



Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite

Re: [18] Policy on IMMUTABLE functions and Unicode updates


On 22.07.24 19:55, Robert Haas wrote:

Every other piece of software in the world has to deal with changes as
a result of the addition of new code points, and probably less
commonly, revisions to existing code points. Presumably, their stuff
breaks too, from time to time. I mean, I find it a bit difficult to
believe that web browsers or messaging applications on phones only
ever display emoji, and never try to do any sort of string sorting.


The sorting isn't the problem.  We have a versioning mechanism for 
collations.  What we do with the version information is clearly not 
perfect yet, but the mechanism exists and you can hack together queries 
that answer the question, did anything change here that would affect my 
indexes.  And you could build more tooling around that and so on.


The problem being considered here are updates to Unicode itself, as 
distinct from the collation tables.  A Unicode update can impact at 
least two things:


- Code points that were previously unassigned are now assigned.  That's 
obviously a very common thing with every Unicode update.  The new 
character will have new properties attached to it, so the result of 
various functions that use such properties (upper(), lower(), 
normalize(), etc.) could change, because previously the code point had 
no properties, and so those functions would not do anything interesting 
with the character.


- Certain properties of an existing character can change.  Like, a 
character used to be a letter and now it's a digit.  (This is an 
example; I'm not sure if that particular change would be allowed.)  In 
the extreme case, this could have the same impact as the above, but in 
practice the kinds of changes that are allowed wouldn't affect typical 
indexes.


I don't think this has anything in particular to do with the new builtin 
collation provider.  That is just one new consumer of this.

Re: [18] Policy on IMMUTABLE functions and Unicode updates

"Daniel Verite"  writes:
>   Tom Lane wrote:
>> Why?  If we agree that that's the way forward, we could certainly
>> stick some collversion other than "1" into pg_c_utf8's pg_collation
>> entry.  There's already been one v17 catversion bump since beta2
>> (716bd12d2), so another one is basically free.

> pg_collation.collversion has been used so far for the sort part
> of the collations.

Hmm, we haven't particularly drawn a distinction between sort-related
and not-sort-related aspects of collation versions AFAIK.  Perhaps
it'd be appropriate to do so, and I agree that there's not time to
design such a thing for v17.  But pg_c_utf8 might be the only case
where we could do anything other than advance those versions in
lockstep.  I doubt we have enough insight into the behaviors of
other providers to say confidently that an update affects only
one side of their behavior.

regards, tom lane

Re: [18] Policy on IMMUTABLE functions and Unicode updates

On Tue, 2024-07-23 at 16:07 -0400, Tom Lane wrote:
> Well, it depends on which libc collation you have in mind.  I was
> thinking of a libc-supplied C.UTF-8 collation, which I would expect
> to behave the same as pg_c_utf8, modulo which Unicode version it's
> based on.

Daniel Vérité documented[1] cases where the libc C.UTF-8 locale changed
the *sort* behavior, thereby affecting primary keys.

Regards,
Jeff Davis

[1]
https://www.postgresql.org/message-id/8a3dc06f-9b9d-4ed7-9a12-2070d8b0165f%40manitou-mail.org

Re: query_id, pg_stat_activity, extended query protocol

2024-07-23 Thread Sami Imseih

> On Wed, Jul 17, 2024 at 11:32:49AM +0200, Anthonin Bonnefoy wrote:
>> Wouldn't it be enough to call pgstat_report_query_id in ExecutorRun
>> and ProcessUtility? With those changes [1], both normal statements and
>> utility statements called through extended protocol will correctly
>> report the query_id.
> 
> Interesting, and this position is really tempting.  By doing so you
> would force the query ID to be set to the one from the CTAS and
> EXPLAIN, because these would be executed before the inner queries, and
> pgstat_report_query_id() with its non-force option does not overwrite
> what would be already set (aka what should be the top-level query ID).
> 
> Using ExecutorRun() feels consistent with the closest thing I've
> touched in this area lately in 1d477a907e63, because that's the only
> code path that we are sure to take depending on the portal execution
> (two execution scenarios depending on how rows are retrieved, as far
> as I recall).  The comment should be perhaps more consistent with the
> executor start counterpart.  So I would be OK with that..  The
> location before the hook of ProcessUtility is tempting, as it would
> take care of the case of PortalRunMulti().  However..  Joining with a
> point from Sami upthread..
> 
> This is still not enough in the case of where we have a holdStore, no?
> This is the case where we would do *one* ExecutorRun(), followed up by
> a scan of the tuplestore in more than one execute message.  The v2
> proposed upthread, by positioning a query ID to be set in
> PortalRunSelect(), is still positioning that in two places.

Correct, I also don’t think ExecutorRun is enough. Another reason is we should 
also 
be setting the queryId during bind, right before planning starts. 
Planning could have significant impact on the server and I think we better
track the responsible queryId. 

I have not tested the holdStore case.  IIUC the holdStore deals with fetching a 
WITH HOLD CURSOR. Why would this matter for this conversation?

> Hmm...  How about being much more aggressive and just do the whole
> business in exec_execute_message(), just before we do the PortalRun()?
> I mean, that's the source of all our problems, and we know the
> statements that the portal will work on so we could go through the
> list, grab the first planned query and set the query ID based on that,
> without caring about the portal patterns we would need to think about.

Doing the work in exec_execute_message makes sense, although maybe
setting the queryId after pgstat_report_activity is better because it occurs 
earlier. 
Also, we should do the same for exec_bind_message and set the queryId 
right after pgstat_report_activity in this function as well.

We do have to account for the queryId changing after cache revalidation, so
we should still set the queryId inside GetCachedPlan in the case the query
underwent re-analysis. This means there is a chance that a queryId set at
the start of the exec_bind_message may be different by the time we complete
the function, in the case the query revalidation results in a different queryId.

See the attached v3.

Regards, 

Sami 

v3-0001-Fix-queryId-reporting-in-Extended-Query-Protocol.patch
Description: Binary data

Re: [18] Policy on IMMUTABLE functions and Unicode updates

Jeff Davis  writes:
> On Tue, 2024-07-23 at 16:07 -0400, Tom Lane wrote:
>> Well, it depends on which libc collation you have in mind.  I was
>> thinking of a libc-supplied C.UTF-8 collation, which I would expect
>> to behave the same as pg_c_utf8, modulo which Unicode version it's
>> based on.

> Daniel Vérité documented[1] cases where the libc C.UTF-8 locale changed
> the *sort* behavior, thereby affecting primary keys.

Ouch.  But we didn't establish whether that was an ancient bug,
or something likely to happen again.  (In any case, that surely
reinforces the point that we can expect pg_c_utf8 to be more
stable than any previously-available alternative.)

regards, tom lane

Re: Remove dependence on integer wrapping

2024-07-23 Thread Joseph Koshakow

On Mon, Jul 22, 2024 at 6:07 PM Matthew Kim  wrote:
>
> On Mon, Jul 22, 2024 at 5:52 PM Alexander Lakhin 
wrote:
>
>> Also there are several trap-producing cases with date types:
>> SELECT to_date('1', 'CC');
>
> Hi, I’ve attached a patch that fixes the to_date() overflow. Patches 1
through 3 remain unchanged.

Thanks for the contribution Mattew!

On Tue, Jul 23, 2024 at 2:14 AM jian he  wrote:
>
> On Tue, Jul 23, 2024 at 6:56 AM Joseph Koshakow  wrote:
>>
>> The specific bug that this patch fixes is preventing the following
>> statement:
>>
>> # INSERT INTO arroverflowtest(i[-2147483648:2147483647]) VALUES
('{1}');
>>
>> So we may want to add that test back in.
>>
> I agree with you.

I've updated the patch to add this test back in.

> also v13-0003-Remove-dependence-on-integer-wrapping-for-jsonb.patch
> in setPathArray we change to can
>
> if (idx == PG_INT32_MIN || -idx > nelems)
> {
> /*
>  * If asked to keep elements position consistent, it's not
allowed
>  * to prepend the array.
>  */
> if (op_type & JB_PATH_CONSISTENT_POSITION)
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>  errmsg("path element at position %d is out of
> range: %d",
> level + 1, idx)));
> idx = PG_INT32_MIN;
> }

Done in the attached patch.
From 6f9b253b2f35dcbb91a44285b28f68916b039e9a Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sat, 8 Jun 2024 22:16:46 -0400
Subject: [PATCH 1/4] Remove dependence on integer wrapping

This commit updates various parts of the code to no longer rely on
integer wrapping for correctness. Not all compilers support -fwrapv, so
it's best not to rely on it.
---
 src/backend/utils/adt/cash.c   |   7 +-
 src/backend/utils/adt/numeric.c|   5 +-
 src/backend/utils/adt/numutils.c   |  34 ---
 src/backend/utils/adt/timestamp.c  |  28 +-
 src/include/common/int.h   | 105 +
 src/interfaces/ecpg/pgtypeslib/timestamp.c |  11 +--
 src/test/regress/expected/timestamp.out|  13 +++
 src/test/regress/expected/timestamptz.out  |  13 +++
 src/test/regress/sql/timestamp.sql |   4 +
 src/test/regress/sql/timestamptz.sql   |   4 +
 10 files changed, 169 insertions(+), 55 deletions(-)

diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index b20c358486..52687dbf7b 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -429,8 +429,11 @@ cash_out(PG_FUNCTION_ARGS)
 
 	if (value < 0)
 	{
-		/* make the amount positive for digit-reconstruction loop */
-		value = -value;
+		/*
+		 * make the amount positive for digit-reconstruction loop, we can
+		 * leave INT64_MIN unchanged
+		 */
+		pg_neg_s64_overflow(value, &value);
 		/* set up formatting data */
 		signsymbol = (*lconvert->negative_sign != '\0') ? lconvert->negative_sign : "-";
 		sign_posn = lconvert->n_sign_posn;
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index d0f0923710..38965b4023 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -8114,15 +8114,14 @@ int64_to_numericvar(int64 val, NumericVar *var)
 
 	/* int64 can require at most 19 decimal digits; add one for safety */
 	alloc_var(var, 20 / DEC_DIGITS);
+	uval = pg_abs_s64(val);
 	if (val < 0)
 	{
 		var->sign = NUMERIC_NEG;
-		uval = -val;
 	}
 	else
 	{
 		var->sign = NUMERIC_POS;
-		uval = val;
 	}
 	var->dscale = 0;
 	if (val == 0)
@@ -11443,7 +11442,7 @@ power_var_int(const NumericVar *base, int exp, int exp_dscale,
 	 * Now we can proceed with the multiplications.
 	 */
 	neg = (exp < 0);
-	mask = abs(exp);
+	mask = pg_abs_s32(exp);
 
 	init_var(&base_prod);
 	set_var_from_var(base, &base_prod);
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index adc1e8a4cb..a3d7d6bf01 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 
+#include "common/int.h"
 #include "port/pg_bitutils.h"
 #include "utils/builtins.h"
 
@@ -131,6 +132,7 @@ pg_strtoint16_safe(const char *s, Node *escontext)
 	uint16		tmp = 0;
 	bool		neg = false;
 	unsigned char digit;
+	int16		result;
 
 	/*
 	 * The majority of cases are likely to be base-10 digits without any
@@ -190,10 +192,9 @@ pg_strtoint16_safe(const char *s, Node *escontext)
 
 	if (neg)
 	{
-		/* check the negative equivalent will fit without overflowing */
-		if (unlikely(tmp > (uint16) (-(PG_INT16_MIN + 1)) + 1))
+		if (pg_neg_u16_overflow(tmp, &result))
 			goto out_of_range;
-		return -((int16) tmp);
+		return result;
 	}
 
 	if (unlikely(tmp > PG_INT16_MAX))
@@ -333,10 +334,9 @@ slow:
 
 	if (neg)
 	{
-		/* check the negative equivalent will fit without overflowing */
-		if (tmp > (uint16) (-(PG_INT16_MIN + 1)) + 1)
+		if (pg_neg_u

Re: proposal: schema variables

On Tue, 2024-07-23 at 16:34 +0200, Laurenz Albe wrote:
> CREATE VARIABLE command:
> 
>   This is buggy:
> 
> CREATE VARIABLE str AS text NOT NULL DEFAULT NULL;
> 
>   Ugh.
> 
> SELECT str;
> ERROR:  null value is not allowed for NOT NULL session variable 
> "laurenz.str"
> DETAIL:  The result of DEFAULT expression is NULL.
> 
>   Perhaps that is a leftover from the previous coding, but I think there need 
> be
>   no check upon SELECT.  It should be enough to check during CREATE VARIABLE 
> and
>   LET.

I'm having second thoughts about that.

I was thinking of a variable like of a table column, but there is a fundamental
difference: there is a clear moment when a tuple is added (INSERT or UPDATE),
which is the point where a column can be checked for NULL values.

A variable can be SELECTed without having been LET before, in which case it
has the default value.  But there is no way to test the default value before
the variable is SELECTed.  So while DEFAULT NULL for a non-nullable variable
seems weird, it is no worse than DEFAULT somefunc() for a function that returns
NULL.

So perhaps the behavior I complained about above is actually the right one.
In the view of that, it doesn't seem necessary to enforce a DEFAULT value for
a NOT NULL variable: NOT NULL might just as well mean "you have to LET it before
you can SELECT it".

> IMMUTABLE variables:
> 
> +   
> +IMMUTABLE
> +
> + 
> +  The assigned value of the session variable can not be changed.
> +  Only if the session variable doesn't have a default value, a single
> +  initialization is allowed using the LET 
> command. Once
> +  done, no further change is allowed until end of transaction
> +  if the session variable was created with clause ON 
> TRANSACTION
> +  END RESET, or until reset of all session variables by
> +  DISCARD VARIABLES, or until reset of all session
> +  objects by command DISCARD ALL.
> + 
> +
> +   
> 
>   I can see the usefulness of IMMUTABLE variables, but I am surprised that
>   they are reset by DISCARD.  What is the use case you have in mind?
>   The use case I can envision is an application that sets a value right after
>   authentication, for use with row-level security.  But then it would be 
> harmful
>   if the user could reset the variable with DISCARD.

I'm beginning to be uncertain about that as well.  You might want to use a
connection pool, and you LET the variable when you take it out of the pool.
When the session is returned to the pool, variables get DISCARDed.

Sure, a user can call DISCARD, but only if he or she is in an interactive 
session.

So perhaps it is good as it is.

Yours,
Laurenz Albe

Re: Statistics Import and Export

2024-07-23 Thread Corey Huinker

Giving the parameter lists more thought, the desire for a return code more
granular than true/false/null, and the likelihood that each function will
inevitably get more parameters both stats and non-stats, I'm proposing the
following:

Two functions:

pg_set_relation_stats(
out schemaname name,
out relname name,
out row_written boolean,
out params_rejected text[],
kwargs any[]) RETURNS RECORD

and

pg_set_attribute_stats(
out schemaname name,
out relname name,
out inherited bool,
out row_written boolean,
out params_accepted text[],
out params_rejected text[],
kwargs any[]) RETURNS RECORD

The leading OUT parameters tell us the rel/attribute/inh affected (if any),
and which params had to be rejected for whatever reason. The kwargs is the
variadic key-value pairs that we were using for all stat functions, but now
we will be using it for all parameters, both statistics and control, the
control parameters will be:

relation - the oid of the relation
attname - the attribute name (does not apply for relstats)
inherited - true false for attribute stats, defaults false, does not apply
for relstats
warnings, boolean, if supplied AND set to true, then all ERROR that can be
stepped down to WARNINGS will be. This is "binary upgrade mode".
version - the numeric version (a la PG_VERSION_NUM) of the statistics
given. If NULL or omitted assume current PG_VERSION_NUM of server.
actual stats columns.

This allows casual users to set only the params they want for their needs,
and get proper errors, while pg_upgrade can set

'warnings', 'true', 'version', 120034

and get the upgrade behavior we need.











 and pg_set_attribute_stats.
pg_set_relation_stats(out schemaname name, out relname name,, out
row_written boolean, out params_entered int, out params_accepted int,
kwargs any[])

Re: Statistics Import and Export

2024-07-23 Thread Corey Huinker

>
>
>  and pg_set_attribute_stats.
> pg_set_relation_stats(out schemaname name, out relname name,, out
> row_written boolean, out params_entered int, out params_accepted int,
> kwargs any[])
>
>
Oops, didn't hit undo fast enough. Disregard this last bit.

Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2024-07-23 Thread Masahiko Sawada

On Tue, Jul 23, 2024 at 5:43 AM Melanie Plageman
 wrote:
>
> On Mon, Jul 22, 2024 at 10:54 PM Masahiko Sawada  
> wrote:
> >
> > On Mon, Jul 22, 2024 at 6:26 PM Melanie Plageman
> >  wrote:
> > >
> > > On Mon, Jul 22, 2024 at 6:36 PM Tom Lane  wrote:
> > > >
> > > > Melanie Plageman  writes:
> > > > > We've only run tests with this commit on some of the back branches for
> > > > > some of these animals. Of those, I don't see any failures so far. So,
> > > > > it seems the test instability is just related to trying to get
> > > > > multiple passes of index vacuuming reliably with TIDStore.
> > > >
> > > > > AFAICT, all the 32bit machine failures are timeouts waiting for the
> > > > > standby to catch up (mamba, gull, merswine). Unfortunately, the
> > > > > failures on copperhead (a 64 bit machine) are because we don't
> > > > > actually succeed in triggering a second vacuum pass. This would not be
> > > > > fixed by a longer timeout.
> > > >
> > > > Ouch.  This seems to me to raise the importance of getting a better
> > > > way to test multiple-index-vacuum-passes.  Peter argued upthread
> > > > that we don't need a better way, but I don't see how that argument
> > > > holds water if copperhead was not reaching it despite being 64-bit.
> > > > (Did you figure out exactly why it doesn't reach the code?)
> > >
> > > I wasn't able to reproduce the failure (failing to do > 1 index vacuum
> > > pass) on my local machine (which is 64 bit) without decreasing the
> > > number of tuples inserted. The copperhead failure confuses me because
> > > the speed of the machine should *not* affect how much space the dead
> > > item TIDStore takes up. I would have bet money that the same number
> > > and offsets of dead tuples per page in a relation would take up the
> > > same amount of space in a TIDStore on any 64-bit system -- regardless
> > > of how slowly it runs vacuum.
> >
> > Looking at copperhead's failure logs, I could not find that "VACUUM
> > (VERBOSE, FREEZE) vac_horizon_floor_table;" wrote the number of index
> > scans in logs. Is there any clue that made you think the test failed
> > to do multiple index vacuum passes?
>
> The vacuum doesn't actually finish because I have a cursor that keeps
> it from finishing and then I query pg_stat_progress_vacuum after the
> first index vacuuming round should have happened and it did not do the
> index vacuum:
>
> [20:39:34.645](351.522s) # poll_query_until timed out executing this query:
> #
> # SELECT index_vacuum_count > 0
> # FROM pg_stat_progress_vacuum
> # WHERE datname='test_db' AND relid::regclass =
> 'vac_horizon_floor_table'::regclass;
> #
> # expecting this output:
> # t
> # last actual query output:
> # f
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=copperhead&dt=2024-07-22%2015%3A00%3A11
>
> I suppose it is possible that it did in fact time out and the index
> vacuum was still in progress. But most of the other "too slow"
> failures were when the standby was trying to catch up. Usually the
> pg_stat_progress_vacuum test fails because we didn't actually do that
> index vacuuming round yet.

Thank you for your explanation! I understood the test cases.

I figured out why two-round index vacuum was not triggered on
copperhead although it's a 64-bit system. In short, this test case
depends on MEMORY_CONTEXT_CHECK (or USE_ASSERT_CHECKING) being on.

In this test case, every BlocktableEntry size would be 16 bytes; the
header is 8 bytes and offset bitmap is 8 bytes (covering up to offset
63). We calculate the memory size (required_size in BumpAlloc()) to
allocate in a bump memory context as follows:

#ifdef MEMORY_CONTEXT_CHECKING
/* ensure there's always space for the sentinel byte */
chunk_size = MAXALIGN(size + 1);
#else
chunk_size = MAXALIGN(size);
#endif

   (snip)

required_size = chunk_size + Bump_CHUNKHDRSZ;

Without MEMORY_CONTEXT_CHECK, if size is 16 bytes, required_size is
also 16 bytes as it's already 8-byte aligned and Bump_CHUNKHDRSZ is 0.
On the other hand with MEMORY_CONTEXT_CHECK, the requied_size is
bumped to 40 bytes as chunk_size is 24 bytes and Bump_CHUNKHDRSZ is 16
bytes. Therefore, with MEMORY_CONTEXT_CHECK, we allocate more memory
and use more Bump memory blocks, resulting in filling up TidStore in
the test cases. We can easily reproduce this test failure with
PostgreSQL server built without --enable-cassert. It seems that
copperhead is the sole BF animal that doesn't use --enable-cassert but
runs recovery-check.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: xid_wraparound tests intermittent failure.

2024-07-23 Thread Masahiko Sawada

On Tue, Jul 23, 2024 at 3:49 AM Andrew Dunstan  wrote:
>
>
> On 2024-07-22 Mo 9:29 PM, Masahiko Sawada wrote:
>
> On Mon, Jul 22, 2024 at 12:53 PM Andrew Dunstan  wrote:
>
> On 2024-07-22 Mo 12:46 PM, Tom Lane wrote:
>
> Masahiko Sawada  writes:
>
> Looking at dodo's failures, it seems that while it passes
> module-xid_wraparound-check, all failures happened only during
> testmodules-install-check-C. Can we check the server logs written
> during xid_wraparound test in testmodules-install-check-C?
>
> Oooh, that is indeed an interesting observation.  There are enough
> examples now that it's hard to dismiss it as chance, but why would
> the two runs be different?
>
>
> It's not deterministic.
>
> I tested the theory that it was some other concurrent tests causing the 
> issue, but that didn't wash. Here's what I did:
>
> for f in `seq 1 100`
>   do echo iteration = $f
>   meson test --suite xid_wraparound || break
> done
>
> It took until iteration 6 to get an error. I don't think my Ubuntu instance 
> is especially slow. e.g. "meson compile" normally takes a handful of seconds. 
> Maybe concurrent tests make it more likely, but they can't be the only cause.
>
> Could you provide server logs in both OK and NG tests? I want to see
> if there's a difference in the rate at which tables are vacuumed.
>
>
> See 
> 
>
>
> The failure logs are from a run where both tests 1 and 2 failed.
>

Thank you for sharing the logs.

I think that the problem seems to match what Alexander Lakhin
mentioned[1]. Probably we can fix such a race condition somehow but
I'm not sure it's worth it as setting autovacuum = off and
autovacuum_max_workers = 1 (or a low number) is an extremely rare
case. I think it would be better to stabilize these tests. One idea is
to turn the autovacuum GUC parameter on while setting
autovacuum_enabled = off for each table. That way, we can ensure that
autovacuum workers are launched. And I think it seems to align real
use cases.

Regards,

[1] 
https://www.postgresql.org/message-id/02373ec3-50c6-df5a-0d65-5b9b1c0c86d6%40gmail.com

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: UUID v7

2024-07-23 Thread Sergey Prokhorenko


Dear Colleagues,

Althoughthe uuidv7(timestamp) function clearly contradicts RFC 9562, but 
theuuidv7(timestamp_offset) function is fully compliant with RFC 9562 and 
isabsolutely necessary.
Here is a quote from the RFC 9562to support thisstatement (RFC 9562: 
Universally Unique IDentifiers (UUIDs)):

| 
| 
| 
|  |  |

 |

 |
| 
|  | 
RFC 9562: Universally Unique IDentifiers (UUIDs)

This specification defines UUIDs (Universally Unique IDentifiers) -- also known 
as GUIDs (Globally Unique IDenti...
 |

 |

 |




"Altering,Fuzzing, or Smearing:

ImplementationsMAY alter the actual timestamp. Some examples include security 
considerationsaround providing a real-clock value within a UUID to 1) correct 
inaccurateclocks, 2) handle leap seconds, or 3) obtain a millisecond value by 
dividing by1024 (or some other value) for performance reasons (instead of 
dividing anumber of microseconds by 1000). This specification makes no 
requirement orguarantee about how close the clock value needs to be to the 
actual time. "

It’s written clumsily, of course, butthe intention of the authors of RFC 9562 
is completely clear: the currenttimestamp can be changed by any amount and for 
any reason, including securityor performance reasons. The wording provides only 
a few examples, the list ofwhich is certainly not exhaustive.

The motives of the authors of RFC 9562are also clear. The timestamp is needed 
only to generate monotonicallyincreasing UUIDv7.The timestamp should not be 
used as a source of data about the time the recordwas created (this is 
explicitly stated in section 6.12. Opacity). Therefore,the actual timestampcan 
and should be changed if necessary.

Why then does RFC 9562 contain wording aboutthe need to use "Unix Epoch 
timestamp"? First, the authors of RFC9562 wanted toget away from using the 
Gregorian calendar, which required a timestamp that wastoo long. Second, the 
RFC 9562 prohibits inserting into UUIDv7 a completely arbitrary dateand time 
value that does not increase with the passage of real time. And thisis correct, 
since in this case the generated UUIDv7 would not be monotonicallyincreasing. 
Thirdly, on almost all computing platforms there is a convenientsource of "Unix 
Epoch timestamp".

Whydoes the uuidv7() function need the optional formal parameter 
timestamp_offset?This question isbest answered by a quote from 
https://lu.sagebl.eu/notes/maybe-we-dont-need-uuidv7 :

"Leakinginformation

UUIDv4does not leak information assuming a proper implementation. But, UUIDv7 
in factdoes: the timestamp of the server is embeded into the ID. From a 
business pointof view it discloses information about resource creation time. It 
may not be aproblem depending on the context. Current RFC draft allows 
implementation totweak timestamps a little to enforce a strict increasing order 
between twogenerations and to alleviate some security concerns."

There is a lot of hate on the internetabout "UUIDv7 should not be used because 
it discloses the date and time the record wascreated." If there was a ban on 
changing the actual timestamp, this wouldprevent the use of UUIDv7 in 
mission-critical databases, and would generallylead to a decrease in the 
popularity of UUIDv7.

The implementation details of timestamp_offsetare, of course, up to the 
developer. But I would suggest two features:

1. Ifthe result of applyingtimestamp_offsetthe timestamp goes beyond the 
permissible interval, the timestamp_offset value mustbe reset tozero
2. Thedata type for timestamp_offsetshould bedeveloper-friendly interval 
type,(https://postgrespro.ru/docs/postgresql/16/datatype-datetime?lang=en#DATATYPE-INTERVAL-INPUT),
 which allows you to enter the argument value using words 
microsecond,millisecond, second, minute, hour, day, week, month, year, decade, 
century,millennium.
Ireally hope that timestamp_offsetwill be used inthe uuidv7() function for 
PostgreSQL.

Sergey prokhorenkosergeyprokhore...@yahoo.com.au

Re: Direct SSL connection and ALPN loose ends

On Tue, Jul 23, 2024 at 08:32:29PM +0300, Heikki Linnakangas wrote:
> All these new tests are a great asset when refactoring this again.

Thanks for doing that.  The coverage, especially with v2, is going to
be really useful.

> Yeah, I'm also not too excited about the additional code in the backend, but
> I'm also not excited about writing another test C module just for this. I'm
> inclined to commit this as it is, but we can certainly revisit this later,
> since it's just test code.

The point would be to rely on the existing injection_points module,
with a new callback in it.  The callbacks could be on a file of their
own in the module, for clarity.  What you have is OK for me anyway, it
is good to add more options to developers in this area and this gets
used in core.  That's also enough to manipulate the stack in or even
out of core.

> Here's a new rebased version with some minor cleanup. Notably, I added docs
> for the new IS_INJECTION_POINT_ATTACHED() macro.

0001 looks OK.

+   push @events, "backenderror" if $line =~ /error triggered for
injection point backend-/;
+   push @events, "v2error" if $line =~ /protocol version 2 error
triggered/;

Perhaps append an "injection_" for these two keywords?

+#include "storage/proc.h"

This inclusion in injection_point.c should not be needed.

> sets the FrontendProtocol global variable, but I think it's more
> straightforward to have the test code

The last sentence in the commit message of 0002 seems to be
unfinished.

Could you run a perltidy on 005_negotiate_encryption.pl?  There are a
bunch of new diffs in it.
--
Michael

signature.asc
Description: PGP signature

Re: Can't find bugs to work on

2024-07-23 Thread jian he

On Sat, Jul 13, 2024 at 12:09 AM David G. Johnston
 wrote:
>
> On Fri, Jul 12, 2024 at 8:44 AM Tom Lane  wrote:
>>
>>
>> As this example shows, it's now standard for PG commit log messages
>> to include a link to the relevant email thread, so all you need
>> is the message-ID of the first message in the thread to search
>> the commit log with.
>>
>
> Cross-posting my comment from Hackers Discord:
>
> """
> The specific request here seems fairly doable as something we could offer as 
> a custom search mode on the mailing list website.  For a given time period 
> show me all messages that began a new message thread where there are zero 
> replies to the message.  Apply to the -bugs list and you have at least most 
> of what is being asked for.  [...]
> """
>

In an ideal world, we should be able to use postgres FDW to query all
kinds of metadata about postgres mailing lists.

it will be same as "Search for archive" in
https://www.postgresql.org/list/pgsql-hackers/

Re: Add privileges test for pg_stat_statements to improve coverage

2024-07-23 Thread kuroda . keisuke

I've slightly modified the comments in the regression tests for 
clarity.

Attached is the v6 patch. If there are no objections,
I will push this version.


Thank you for updating patch! I have no objection.

Best Regards,
Keisuke Kuroda
NTT Comware

Re: [18] Policy on IMMUTABLE functions and Unicode updates

On Tue, Jul 23, 2024 at 4:36 PM Peter Eisentraut  wrote:
> The sorting isn't the problem.  We have a versioning mechanism for
> collations.  What we do with the version information is clearly not
> perfect yet, but the mechanism exists and you can hack together queries
> that answer the question, did anything change here that would affect my
> indexes.  And you could build more tooling around that and so on.

In my experience, sorting is, overwhelmingly, the problem. What people
complain about is that they do an upgrade - of PG or some OS package -
and then their indexes are broken. Or their partition bounds are
broken.

That we have versioning information that someone could hypothetically
know how to do something useful with is not really useful, because
nobody actually knows how to do it, and there's nothing to trigger
them to do it in the first place. People don't think "oh, I'm running
dnf update, I better run undocumented queries against the PostgreSQL
system catalogs to see whether my system is going to melt afterwards."

What needs to happen is that when you do something that breaks
something, something notices automatically and tells you and gives you
a way to get it fixed again. Or better yet, when you do something that
would break something as things stand today, some kind of versioning
logic kicks in and you keep the old behavior and nothing actually
breaks.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: [18] Policy on IMMUTABLE functions and Unicode updates

On Tue, 2024-07-23 at 21:37 -0400, Robert Haas wrote:
> In my experience, sorting is, overwhelmingly, the problem.

I strongly agree.

> That we have versioning information that someone could hypothetically
> know how to do something useful with is not really useful, because
> nobody actually knows how to do it

Including me. I put significant effort into creating some views that
could help users identify potentially-affected indexes based on
collation changes, and I gave up. In theory it's just about impossible
(consider some UDF that constructs queries and EXECUTEs them -- what
collations does that depend on?). In practice, it's not much easier,
and you might as well just reindex everything having to do with text.

In contrast, if the problem is CTYPE-related, users are in a much
better position. It won't affect their primary keys or most indexes.
It's much more tractable to review your expression indexes and look for
problems (not ideal, but better). Also, as Peter points out, CTYPE
changes are typically more narrow, so there's a good chance that
there's no problem at all.

Regards,
Jeff Davis

The 031_recovery_conflict.pl test might fail due to late pgstat entries flushing

2024-07-23 Thread Alexander Lakhin


Hello hackers,

A recent buildfarm failure [1] shows that the test 031_recovery_conflict.pl
can fail yet another way:
 23/296 postgresql:recovery / recovery/031_recovery_conflict ERROR    
11.55s   exit status 1

[07:58:53.979](0.255s) ok 11 - tablespace conflict: logfile contains terminated 
connection due to recovery conflict
[07:58:54.058](0.080s) not ok 12 - tablespace conflict: stats show conflict on 
standby
[07:58:54.059](0.000s) #   Failed test 'tablespace conflict: stats show 
conflict on standby'
#   at 
/home/bf/bf-build/rorqual/REL_17_STABLE/pgsql/src/test/recovery/t/031_recovery_conflict.pl
 line 332.
[07:58:54.059](0.000s) #  got: '0'
# expected: '1'

I managed to reproduce a similar failure by running multiple test instances
in parallel on a slow VM. With extra logging added, I see in a failed run
log:
10
10  #   Failed test 'startup deadlock: stats show conflict on standby'
10  #   at t/031_recovery_conflict.pl line 368.
10  #  got: '0'
10  # expected: '1'
10  [19:48:19] t/031_recovery_conflict.pl ..
10  Dubious, test returned 1 (wstat 256, 0x100)

2024-07-23 19:48:13.966 UTC [1668402:12][client backend][1/2:0] LOG:  !!!pgstat_report_recovery_conflict| reason: 13, 
pgstat_track_counts: 1 at character 15

2024-07-23 19:48:13.966 UTC [1668402:13][client backend][1/2:0] STATEMENT:  
SELECT * FROM test_recovery_conflict_table2;
2024-07-23 19:48:13.966 UTC [1668402:14][client backend][1/2:0] ERROR:  canceling statement due to conflict with 
recovery at character 15
2024-07-23 19:48:13.966 UTC [1668402:15][client backend][1/2:0] DETAIL:  User transaction caused buffer deadlock with 
recovery.

...
2024-07-23 19:48:14.129 UTC [1668805:8][client backend][5/2:0] LOG: statement: SELECT confl_deadlock FROM 
pg_stat_database_conflicts WHERE datname='test_db';

...
2024-07-23 19:48:14.148 UTC [1668402:16][client backend][1/0:0] LOG:  
!!!pgstat_database_flush_cb| nowait: 0

This failure can be reproduced easily with a sleep added as below:
@@ -514,6 +514,7 @@ pgstat_shutdown_hook(int code, Datum arg)
 if (OidIsValid(MyDatabaseId))
 pgstat_report_disconnect(MyDatabaseId);

+if (rand() % 5 == 0) pg_usleep(10);
 pgstat_report_stat(true);

By running the test in a loop, I get miscellaneous
"stats show conflict on standby" failures, including:
iteration 19
# +++ tap check in src/test/recovery +++
t/031_recovery_conflict.pl .. 1/?
#   Failed test 'buffer pin conflict: stats show conflict on standby'
#   at t/031_recovery_conflict.pl line 332.
#  got: '0'
# expected: '1'
t/031_recovery_conflict.pl .. 17/? # Looks like you failed 1 test of 18.

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rorqual&dt=2024-07-23%2007%3A56%3A35

Best regards,
Alexander

Re: Remove dependence on integer wrapping

2024-07-23 Thread Nathan Bossart

On Tue, Jul 23, 2024 at 05:41:18PM -0400, Joseph Koshakow wrote:
> On Tue, Jul 23, 2024 at 2:14 AM jian he  wrote:
>> On Tue, Jul 23, 2024 at 6:56 AM Joseph Koshakow  wrote:
>>> The specific bug that this patch fixes is preventing the following
>>> statement:
>>>
>>> # INSERT INTO arroverflowtest(i[-2147483648:2147483647]) VALUES
> ('{1}');
>>>
>>> So we may want to add that test back in.
>>>
>> I agree with you.
> 
> I've updated the patch to add this test back in.

I've committed/back-patched the fix for array_set_slice().  I ended up
fiddling with the test cases a bit more because they generate errors that
include a platform-dependent value (MaxArraySize).  To handle that, I moved
the new tests to the section added in commit 18b5851 where VERBOSITY is set
to "sqlstate".

-- 
nathan

Re: Logical Replication of sequences

2024-07-23 Thread Peter Smith

Hi, here are some review comments for patch v20240720-0003.

This review is a WIP. This post is only about the docs (*.sgml) of patch 0003.

==
doc/src/sgml/ref/alter_subscription.sgml

1. REFRESH PUBLICATION and copy_data
nitpicks:
- IMO the "synchronize the sequence data" info was misleading because
synchronization should only occur when copy_data=true.
- I also felt it was strange to mention pg_subscription_rel for
sequences, but not for tables. I modified this part too.
- Then I moved the information about re/synchronization of sequences
into the "copy_data" part.
- And added another link to ALTER SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES

Anyway, in summary, I have updated this page quite a lot according to
my understanding. Please take a look at the attached nitpick for my
suggestions.

nitpick - /The supported options are:/The only supported option is:/

~~~

2. REFRESH PUBLICATION SEQUENCES
nitpick - tweaked the wording
nitpicK - typo /syncronizes/synchronizes/

==
3. catalogs.sgml

IMO something is missing in Section "1.55. pg_subscription_rel".

Currently, this page only talks of relations/tables, but I think it
should mention "sequences" here too, particularly since now we are
linking to here from ALTER SUBSCRIPTION when talking about sequences.

==
99.
Please see the attached diffs patch which implements any nitpicks
mentioned above.

==
Kind Regards,
Peter Smith.
Fujitsu Australia
diff --git a/doc/src/sgml/ref/alter_subscription.sgml 
b/doc/src/sgml/ref/alter_subscription.sgml
index ba8c2b1..666d9b0 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -153,7 +153,7 @@ ALTER SUBSCRIPTION name RENAME TO <
 REFRESH PUBLICATION
 
  
-  Fetch missing table information from publisher.  This will start
+  Fetch missing table information from the publisher.  This will start
   replication of tables that were added to the subscribed-to publications
   since 
   CREATE SUBSCRIPTION or
@@ -161,29 +161,26 @@ ALTER SUBSCRIPTION name RENAME TO <
  
 
  
-  It also fetches the missing sequence information from the publisher and
-  synchronize the sequence data for newly added sequences with the
-  publisher. This will start synchronizing of sequences that were added to
-  the subscribed-to publications since 
-  CREATE SUBSCRIPTION or the last invocation of
-  REFRESH PUBLICATION. Additionally, it will remove any
-  sequences that are no longer part of the publication from the
-  pg_subscription_rel
-  system catalog. Sequences that have already been synchronized will not be
-  re-synchronized.
+  Also, fetch missing sequence information from the publisher.
+ 
+
+ 
+  The system catalog pg_subscription_rel
+  is updated to record all tables and sequences known to the subscription,
+  that are still part of the publication.
  
 
  
   refresh_option specifies additional options 
for the
-  refresh operation.  The supported options are:
+  refresh operation.  The only supported option is:
 
   

 copy_data (boolean)
 
  
-  Specifies whether to copy pre-existing data for tables and sequences
-  in the publications that are being subscribed to when the replication
+  Specifies whether to copy pre-existing data for tables and 
synchronize
+  sequences in the publications that are being subscribed to when the 
replication
   starts. The default is true.
  
  
@@ -191,6 +188,11 @@ ALTER SUBSCRIPTION name RENAME TO <
   filter WHERE clause has since been modified.
  
  
+  Previously subscribed sequences are not re-synchronized. To do that,
+  see 
+  ALTER SUBSCRIPTION ... REFRESH PUBLICATION 
SEQUENCES
+ 
+ 
   See  for details of
   how copy_data = true can interact with the
   origin
@@ -212,11 +214,11 @@ ALTER SUBSCRIPTION name RENAME TO <
 REFRESH PUBLICATION SEQUENCES
 
  
-  Fetch missing sequences information from publisher and re-synchronize the
+  Fetch missing sequence information from the publisher, then 
re-synchronize
   sequence data with the publisher. Unlike 
   ALTER SUBSCRIPTION ... REFRESH PUBLICATION 
which
-  only syncronizes the newly added sequences, this option will also
-  re-synchronize the sequence data for sequences that were previously 
added.
+  only synchronizes newly added sequences, REFRESH PUBLICATION 
SEQUENCES
+  will re-synchronize the sequence data for all subscribed sequences.

Re: apply_scanjoin_target_to_paths and partitionwise join

2024-07-23 Thread Richard Guo

On Wed, May 22, 2024 at 3:57 PM Ashutosh Bapat
 wrote:
> I will create patches for the back-branches once the patch for master is in a 
> committable state.

AFAIU, this patch prevents apply_scanjoin_target_to_paths() from
discarding old paths of partitioned joinrels.  Therefore, we can
retain non-partitionwise join paths if the cheapest path happens to be
among them.

One concern from me is that if the cheapest path of a joinrel is a
partitionwise join path, following this approach could lead to
undesirable cross-platform plan variations, as detailed in the
original comment.

Is there a specific query that demonstrates benefits from this change?
I'm curious about scenarios where a partitionwise join runs slower
than a non-partitionwise join.

Thanks
Richard

Re: [18] Policy on IMMUTABLE functions and Unicode updates


On 24.07.24 03:37, Robert Haas wrote:

On Tue, Jul 23, 2024 at 4:36 PM Peter Eisentraut  wrote:

The sorting isn't the problem.  We have a versioning mechanism for
collations.  What we do with the version information is clearly not
perfect yet, but the mechanism exists and you can hack together queries
that answer the question, did anything change here that would affect my
indexes.  And you could build more tooling around that and so on.


In my experience, sorting is, overwhelmingly, the problem. What people
complain about is that they do an upgrade - of PG or some OS package -
and then their indexes are broken. Or their partition bounds are
broken.


Fair enough.  My argument was, that topic is distinct from the topic of 
this thread.

Re: Conflict detection and logging in logical replication

2024-07-23 Thread Nisha Moond

On Thu, Jul 18, 2024 at 7:52 AM Zhijie Hou (Fujitsu)
 wrote:
>
> Attach the V5 patch set which changed the following:
>

Tested v5-0001 patch, and it fails to detect the update_exists
conflict for a setup where Pub has a non-partitioned table and Sub has
the same table partitioned.
Below is a testcase showcasing the issue:

Setup:
Pub:
create table tab (a int not null, b int not null);
alter table tab add constraint tab_pk primary key (a,b);

Sub:
create table tab (a int not null, b int not null) partition by range (b);
alter table tab add constraint tab_pk primary key (a,b);
CREATE TABLE tab_1 PARTITION OF tab FOR VALUES FROM (MINVALUE) TO (100);
CREATE TABLE tab_2 PARTITION OF tab FOR VALUES FROM (101) TO (MAXVALUE);

Test:
Pub: insert into tab values (1,1);
Sub: insert into tab values (2,1);
Pub: update tab set a=2 where a=1;--> After this update on Pub,
'update_exists' is expected on Sub, but it fails to detect the
conflict and logs the key violation error -

ERROR:  duplicate key value violates unique constraint "tab_1_pkey"
DETAIL:  Key (a, b)=(2, 1) already exists.

Thanks,
Nisha

Re: [PATCH] Add additional extended protocol commands to psql: \parse and \bindx

On Fri, Jul 19, 2024 at 03:28:44PM +0200, Tomas Vondra wrote:
> OK, if you're already half-way through the review, I'll leave it up to
> you. I don't think we need to rush, and I'd have to learn about all the
> psql stuff first anyway.

It took me a couple of days to get back to it, but attached is what I
have finished with.  This was mostly OK, except for a few things:
- \close was inconsistent with the other two commands, where no
argument was treated as the unnamed prepared statement.  I think that
this should be made consistent with \parse and \bindx, requiring an
argument, where '' is the unnamed statement.
- The docs did not mention the case of the unnamed statement, so added
some notes about that.
- Some free() calls were not needed in the command executions, where
psql_scan_slash_option() returns NULL.
- Tests missing when no argument is provided for the new commands.

One last thing I have found really confusing is that this leads to the
addition of two more status flags in pset for the close and parse
parts, with \bind and \bindx sharing the third one while deciding
which path to use depending on if the statement name is provided.
That's fragile.  I think that it would be much cleaner to put all that
behind an enum, falling back to PQsendQuery() by default.  I am
attaching that as 0002, for clarity, but my plan is to merge both 0001
and 0002 together.
--
Michael
From 62303eeab20740f8e7bba36d6595b2a02771e5aa Mon Sep 17 00:00:00 2001
From: Anthonin Bonnefoy 
Date: Thu, 18 Jan 2024 08:46:33 +0100
Subject: [PATCH v6 1/2] psql: Add support for prepared stmt with extended
 protocol

Currently, only unnamed prepared statement is supported by psql with the
\bind command so it's not possible to test named statement creation and
execution through extended protocol.

This commit introduces three additional commands: \parse, \bindx and
\close.
\parse creates a prepared statement using extended protocol.
\bindx binds and execute an existing prepared statement using extended
protocol.
\close closes an existing prepared statement using extended protocol.
---
 src/bin/psql/command.c | 128 +
 src/bin/psql/common.c  |  36 +++-
 src/bin/psql/help.c|   4 +
 src/bin/psql/settings.h|   6 ++
 src/bin/psql/tab-complete.c|   6 +-
 src/test/regress/expected/psql.out |  55 +
 src/test/regress/sql/psql.sql  |  28 ++-
 doc/src/sgml/ref/psql-ref.sgml |  90 
 8 files changed, 346 insertions(+), 7 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 180781ecd0..f23a7404cc 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -64,10 +64,14 @@ static backslashResult exec_command(const char *cmd,
 	PQExpBuffer previous_buf);
 static backslashResult exec_command_a(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_bind(PsqlScanState scan_state, bool active_branch);
+static backslashResult exec_command_bindx(PsqlScanState scan_state, bool active_branch,
+		  const char *cmd);
 static backslashResult exec_command_C(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_connect(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_cd(PsqlScanState scan_state, bool active_branch,
 	   const char *cmd);
+static backslashResult exec_command_close(PsqlScanState scan_state, bool active_branch,
+		  const char *cmd);
 static backslashResult exec_command_conninfo(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_copy(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_copyright(PsqlScanState scan_state, bool active_branch);
@@ -116,6 +120,8 @@ static backslashResult exec_command_lo(PsqlScanState scan_state, bool active_bra
 static backslashResult exec_command_out(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_print(PsqlScanState scan_state, bool active_branch,
 		  PQExpBuffer query_buf, PQExpBuffer previous_buf);
+static backslashResult exec_command_parse(PsqlScanState scan_state, bool active_branch,
+		  const char *cmd);
 static backslashResult exec_command_password(PsqlScanState scan_state, bool active_branch);
 static backslashResult exec_command_prompt(PsqlScanState scan_state, bool active_branch,
 		   const char *cmd);
@@ -312,12 +318,16 @@ exec_command(const char *cmd,
 		status = exec_command_a(scan_state, active_branch);
 	else if (strcmp(cmd, "bind") == 0)
 		status = exec_command_bind(scan_state, active_branch);
+	else if (strcmp(cmd, "bindx") == 0)
+		status = exec_command_bindx(scan_state, active_branch, cmd);
 	else if (strcmp(cmd, "C") == 0)
 		status = exec_command_C(scan_state, active_branch);
 	else if (strcmp(cmd, "c") == 0 || strcmp(cmd, "connect") == 0)
 		status = exec_command_connect(scan_state,

Re: [PATCH] Add additional extended protocol commands to psql: \parse and \bindx