Re: Optimising numeric division

2024-08-24 Thread Joel Jacobson
On Sat, Aug 24, 2024, at 01:35, Joel Jacobson wrote:
> On Sat, Aug 24, 2024, at 00:00, Joel Jacobson wrote:
>> Since statistical tools that rely on normal distributions can't be used,
>> let's look at the individual measurements for (var1ndigits=3, var2ndigits=3)
>> since that seems to be the biggest slowdown on both CPUs,
>> and see if our level of surprise is affected.
>
> Here is a more traditional benchmark,
> which seems to also indicate (var1ndigits=3, var2ndigits=3) is a bit slower:

I tested just adding back div_var_int64, and it seems to help.

-- Intel Core i9-14900K:

select summary, x var1ndigits,y 
var2ndigits,a_avg,b_avg,pooled_stddev,abs_diff,rel_diff,sigmas from 
catbench.vreport where function_name = 'numeric_div' and summary like 'Add back 
div_var_int64%' and sigmas > 1 order by x,y;
 summary  | var1ndigits | var2ndigits | a_avg  | b_avg  | 
pooled_stddev | abs_diff | rel_diff | sigmas
--+-+-+++---+--+--+
 Add back div_var_int64.  |   1 |   3 | 120 ns | 85 ns  | 11 ns 
| -40 ns   |  -32 |  4
 Add back div_var_int64.  |   1 |   4 | 120 ns | 97 ns  | 11 ns 
| -28 ns   |  -23 |  3
 Add back div_var_int64.  |   2 |   3 | 120 ns | 89 ns  | 11 ns 
| -29 ns   |  -25 |  3
 Add back div_var_int64.  |   2 |   4 | 130 ns | 94 ns  | 14 ns 
| -32 ns   |  -25 |  2
 Add back div_var_int64.  |   3 |   3 | 130 ns | 85 ns  | 11 ns 
| -41 ns   |  -32 |  4
 Add back div_var_int64.  |   3 |   4 | 130 ns | 99 ns  | 13 ns 
| -29 ns   |  -23 |  2
 Add back div_var_int64.  |   4 |   4 | 130 ns | 100 ns | 12 ns 
| -28 ns   |  -22 |  2
(7 rows)

Regards,
Joel




thread-safety: getpwuid_r()

2024-08-24 Thread Peter Eisentraut
Here is a patch to replace a getpwuid() call in the backend, for 
thread-safety.


This is AFAICT the only call in the getpw*() family that needs to be 
dealt with.


(There is also a getgrnam() call, but that is called very early in the 
postmaster, before multiprocessing, so we can leave that as is.)


The solution here is actually quite attractive: We can replace the 
getpwuid() call by the existing wrapper function pg_get_user_name(), 
which internally uses getpwuid_r().  This also makes the code a bit 
simpler.  The same function is already used in libpq for a purpose that 
mirrors the backend use, so it's also nicer to use the same function for 
consistency.
From 090c800afd6271885d345f72bbd1d3b535dd6886 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sat, 24 Aug 2024 10:35:56 +0200
Subject: [PATCH] thread-safety: getpwuid_r()

There was one getpwuid() call in the backend.  We can replace that one
by the existing wrapper function pg_get_user_name(), which internally
uses getpwuid_r() for thread-safety.  And this also makes the code a
bit simpler.  The same function is already used in libpq for a purpose
that mirrors the backend use.
---
 src/backend/libpq/auth.c | 21 +++--
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 2b607c52704..85dcf9e9fc9 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -1857,7 +1857,7 @@ auth_peer(hbaPort *port)
uid_t   uid;
gid_t   gid;
 #ifndef WIN32
-   struct passwd *pw;
+   charpwdbuf[BUFSIZ];
int ret;
 #endif
 
@@ -1876,25 +1876,18 @@ auth_peer(hbaPort *port)
}
 
 #ifndef WIN32
-   errno = 0;  /* clear errno before 
call */
-   pw = getpwuid(uid);
-   if (!pw)
+   if (!pg_get_user_name(uid, pwdbuf, sizeof pwdbuf))
{
-   int save_errno = errno;
-
-   ereport(LOG,
-   (errmsg("could not look up local user ID %ld: 
%s",
-   (long) uid,
-   save_errno ? 
strerror(save_errno) : _("user does not exist";
+   ereport(LOG, errmsg_internal("%s", pwdbuf));
return STATUS_ERROR;
}
 
/*
-* Make a copy of static getpw*() result area; this is our authenticated
-* identity.  Set it before calling check_usermap, because 
authentication
-* has already succeeded and we want the log file to reflect that.
+* Make a copy of result; this is our authenticated identity.  Set it
+* before calling check_usermap, because authentication has already
+* succeeded and we want the log file to reflect that.
 */
-   set_authn_id(port, pw->pw_name);
+   set_authn_id(port, pwdbuf);
 
ret = check_usermap(port->hba->usermap, port->user_name,

MyClientConnectionInfo.authn_id, false);
-- 
2.46.0



Re: Optimising numeric division

2024-08-24 Thread Dean Rasheed
On Sat, 24 Aug 2024 at 08:26, Joel Jacobson  wrote:
>
> On Sat, Aug 24, 2024, at 01:35, Joel Jacobson wrote:
> > On Sat, Aug 24, 2024, at 00:00, Joel Jacobson wrote:
> >> Since statistical tools that rely on normal distributions can't be used,
> >> let's look at the individual measurements for (var1ndigits=3, 
> >> var2ndigits=3)
> >> since that seems to be the biggest slowdown on both CPUs,
> >> and see if our level of surprise is affected.
> >
> > Here is a more traditional benchmark,
> > which seems to also indicate (var1ndigits=3, var2ndigits=3) is a bit slower:
>
> I tested just adding back div_var_int64, and it seems to help.
>

Thanks for testing.

There does appear to be quite a lot of variability between platforms
over whether or not div_var_int64() is a win for 3 and 4 digit
divisors. Since this patch is primarily about improving div_var()'s
long division algorithm, it's probably best for it to not touch that,
so I've put div_var_int64() back in for now. We could possibly
investigate whether it can be improved separately.

Looking at your other test results, they seem to confirm my previous
observation that exact mode is faster than approximate mode for
var2ndigits <= 12 or so, so I've added code to do that.

I also expanded on the comments for the quotient-correction code a bit.

Regards,
Dean
From 0b5ae26beb3b7cd0c68c566f01c527e59a264386 Mon Sep 17 00:00:00 2001
From: Dean Rasheed 
Date: Fri, 23 Aug 2024 09:54:27 +0100
Subject: [PATCH v2 1/2] Speed up numeric division by always using the "fast"
 algorithm.

Formerly there were two internal functions in numeric.c to perform
numeric division, div_var() and div_var_fast(). div_var() performed
division exactly to a specified rscale using Knuth's long division
algorithm, while div_var_fast() used the algorithm from the "FM"
library, which approximates each quotient digit using floating-point
arithmetic, and computes a truncated quotient with DIV_GUARD_DIGITS
extra digits. div_var_fast() could be many times faster than
div_var(), but did not guarantee correct results in all cases, and was
therefore only suitable for use in transcendental functions, where
small errors are acceptable.

This commit merges div_var() and div_var_fast() together into a single
function with an extra "exact" boolean parameter, which can be set to
false if the caller is OK with an approximate result. The new function
uses the faster algorithm from the "FM" library, except that when
"exact" is true, it does not truncate the computation with
DIV_GUARD_DIGITS extra digits, but instead performs the full-precision
computation, subtracting off complete multiples of the divisor for
each quotient digit. However, it is able to retain most of the
performance benefits of div_var_fast(), by delaying the propagation of
carries, allowing the inner loop to be auto-vectorized.

Since this may still lead to an inaccurate result, when "exact" is
true, it then inspects the remainder and uses that to adjust the
quotient, if necessary, to make it correct. In practice, the quotient
rarely needs to be adjusted, and never by more than one in the final
digit, though it's difficult to prove that, so the code allows for
larger adjustments, just in case.

In addition, use base-NBASE^2 arithmetic and a 64-bit dividend array,
similar to mul_var(), so that the number of iterations of the outer
loop is roughly halved. Together with the faster algorithm, this makes
div_var() up to around 20 times as fast as the old Knuth algorithm
when "exact" is true, and using base-NBASE^2 arithmetic makes it up to
2 or 3 times as fast as the old div_var_fast() function when "exact"
is false.
---
 src/backend/utils/adt/numeric.c | 829 ++--
 1 file changed, 352 insertions(+), 477 deletions(-)

diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 44d88e9007..46ac04035e 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -60,7 +60,7 @@
  * NBASE that's less than sqrt(INT_MAX), in practice we are only interested
  * in NBASE a power of ten, so that I/O conversions and decimal rounding
  * are easy.  Also, it's actually more efficient if NBASE is rather less than
- * sqrt(INT_MAX), so that there is "headroom" for mul_var and div_var_fast to
+ * sqrt(INT_MAX), so that there is "headroom" for mul_var and div_var to
  * postpone processing carries.
  *
  * Values of NBASE other than 1 are considered of historical interest only
@@ -563,10 +563,7 @@ static void mul_var(const NumericVar *var1, const NumericVar *var2,
 static void mul_var_short(const NumericVar *var1, const NumericVar *var2,
 		  NumericVar *result);
 static void div_var(const NumericVar *var1, const NumericVar *var2,
-	NumericVar *result,
-	int rscale, bool round);
-static void div_var_fast(const NumericVar *var1, const NumericVar *var2,
-		 NumericVar *result, int rscale, bool round);
+	NumericVar *result, int rscale, bool round, bool exact);
 static

Re: Remove dependence on integer wrapping

2024-08-24 Thread Joseph Koshakow
On Wed, Aug 21, 2024 at 11:37 AM Nathan Bossart 
wrote:
>
> Hm.  It seems pretty clear that removing -fwrapv won't be happening
anytime
> soon.  I don't mind trying to fix a handful of cases from time to time,
but
> unless there's a live bug, I'm probably not going to treat this stuff as
> high priority.

I think I'm also going to take a step back because I'm a bit
fatigued on the overflow work. My goal here wasn't necessarily to
remove -fwrapv, because I think it will always be a useful safeguard.
Instead I wanted to add -ftrapv to builds with asserts enabled to try
and prevent future overflow based bugs. Though, it looks like that
won't happen anytime soon either.

FWIW, Matthew's patch actually does resolve a bug with `to_timestamp`
and `to_date`. It converts the following incorrect queries

test=# SELECT to_timestamp('2147483647,999', 'Y,YYY');
  to_timestamp
-
 0001-01-01 00:00:00-04:56:02 BC
(1 row)

test=# SELECT to_date('-2147483648', 'CC');
  to_date

 0001-01-01
(1 row)

into errors

test=# SELECT to_timestamp('2147483647,999', 'Y,YYY');
ERROR:  invalid input string for "Y,YYY"
test=# SELECT to_date('-2147483648', 'CC');
ERROR:  date out of range: "-2147483648"

So, it might be worth committing only his changes before moving on.


Thanks,
Joseph Koshakow


Re: Official devcontainer config

2024-08-24 Thread Junwang Zhao
Hi hackers,

On Sun, Aug 4, 2024 at 10:12 PM Andrew Dunstan  wrote:
>
>
> On 2024-08-03 Sa 10:13 PM, Junwang Zhao wrote:
> > On Sat, Aug 3, 2024 at 7:30 PM Andrew Dunstan  wrote:
> >>
> >> On 2024-08-02 Fr 2:45 PM, Peter Eisentraut wrote:
> >>> On 01.08.24 23:38, Andrew Dunstan wrote:
>  Not totally opposed, and I will probably give it a try very soon, but
>  I'm wondering if this really needs to go in the core repo. We've
>  generally shied away from doing much in the way of editor / devenv
>  support, trying to be fairly agnostic. It's true we carry
>  .dir-locals.el and .editorconfig, so that's not entirely true, but
>  those are really just about supporting our indentation etc. standards.
> >>> Yeah, the editor support in the tree ought to be minimal and factual,
> >>> based on coding standards and widely recognized best practices, not a
> >>> collection of one person's favorite aliases and scripts.  If the
> >>> scripts are good, let's look at them and maybe put them under
> >>> src/tools/ for everyone to use.  But a lot of this looks like it will
> >>> requite active maintenance if output formats or node formats or build
> >>> targets etc. change.  And other things require specific local paths.
> >>> That's fine for a local script or something, but not for a mainline
> >>> tool that the community will need to maintain.
> >>>
> >>> I suggest to start with a very minimal configuration. What are the
> >>> settings that absolute everyone will need, maybe to set indentation
> >>> style or something.
> >>>
> >> I believe you can get VS Code to support editorconfig, so from that POV
> >> maybe we don't need to do anything.
> >>
> >> I did try yesterday with the code from the OP's patch symlinked into my
> >> repo, but got an error with the Docker build, which kinda reinforces
> >> your point.
> > The reason symlink does not work is that configure_vscode needs to copy
> > launch.json and tasks.json into .vscode, it has to be in the
> > WORKDIR/.devcontainer.
>
>
> That's kind of awful. Anyway, I think we don't need to do anything about
> ignoring those. The user should simply add entries for them to
> .git/info/exclude or their local global exclude file (I have
> core.excludesfile = /home/andrew/.gitignore set.)
>
> I was eventually able to get it to work without using a symlink.
>
>
> cheers
>
>
> andrew
>
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>

I did some work on the **devcontainer config** today, hoping this can
address some of the concerns raised on this thread.

1. I created a separate git repo named **Unofficial devcontainer
config for Postgres**[1].
2. The usage is quite simple, clone the repo directly into postgres's tree root,
and reopen in DevContainer (See REAME in [1] if you are interested).
3. Using Tristan's vscode code extension postgres-hacker as the formatter,
pgindent will be called on file save, I found this quite convenient.
4. Remove the git settings, Dev container copies the global
~/.gitconfig into the container
by default, so normally it is not necessary to add extra git aliases.

What I haven't addressed is that the repo still uses specific local
paths, I think
this is ok since the code is not going into the core.

One thing I want to ask is, is there any objection to adding the
.devcontainer and .vscode to the .gitignore file?

There are *.vcproj and pgsql.sln in .gitignore, so I guess it's ok to add
.devcontainer and .vscode?

[1]: https://github.com/pghacking/.devcontainer

-- 
Regards
Junwang Zhao


v1-0001-add-.devcontainer-and-.vscode-to-.gitignore.patch
Description: Binary data


Re: POC, WIP: OR-clause support for indexes

2024-08-24 Thread Alena Rybakina

On 23.08.2024 19:38, Alexander Korotkov wrote:

Hi, Alena!

On Fri, Aug 23, 2024 at 5:06 PM Alena Rybakina
 wrote:

To be fair, I fixed this before [0] by selecting the appropriate group
of "or" expressions to transform them to "ANY" expression and then
checking for compatibility with the index column. maybe we should try
this too? I can think about it.

[0]
https://www.postgresql.org/message-id/531fc0ab-371e-4235-97e3-dd2d077b6995%40postgrespro.ru

I probably didn't get your message.  Which patch version you think
resolve the problem?  I see [0] doesn't contain any patch.

Sorry, I got the links mixed up. We need this link [0].


I think further progress in this area of grouping OR args is possible
if there is a solution, which doesn't take extraordinary computational
complexity.
This approach does not require a large overhead - in fact, we separately 
did the conversion to "any" once going through the list of restrictinfo, 
we form candidates in the form of boolexpr using the "and" operator, 
which contains "any" and "or" expression, then we check with index 
columns which expression suits us.


--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: POC, WIP: OR-clause support for indexes

2024-08-24 Thread Alexander Korotkov
On Sat, Aug 24, 2024 at 4:08 PM Alena Rybakina
 wrote:
> On 23.08.2024 19:38, Alexander Korotkov wrote:
> > Hi, Alena!
> >
> > On Fri, Aug 23, 2024 at 5:06 PM Alena Rybakina
> >  wrote:
> >> To be fair, I fixed this before [0] by selecting the appropriate group
> >> of "or" expressions to transform them to "ANY" expression and then
> >> checking for compatibility with the index column. maybe we should try
> >> this too? I can think about it.
> >>
> >> [0]
> >> https://www.postgresql.org/message-id/531fc0ab-371e-4235-97e3-dd2d077b6995%40postgrespro.ru
> > I probably didn't get your message.  Which patch version you think
> > resolve the problem?  I see [0] doesn't contain any patch.
> Sorry, I got the links mixed up. We need this link [0].

Still confusion.
If that's another [0] from [0] in the cited message then it seems you
missed new link in your last message.

--
Regards,
Alexander Korotkov
Supabase




Re: POC, WIP: OR-clause support for indexes

2024-08-24 Thread Alena Rybakina

Sorry again.

The link to letter - - 
https://www.postgresql.org/message-id/759292d5-cb51-4b12-89fa-576c1d9b374d%40postgrespro.ru


Patch - 
https://www.postgresql.org/message-id/attachment/162897/v28-Transform-OR-clauses-to-ANY-expression.patch



On 24.08.2024 16:23, Alexander Korotkov wrote:

On Sat, Aug 24, 2024 at 4:08 PM Alena Rybakina
 wrote:

On 23.08.2024 19:38, Alexander Korotkov wrote:

Hi, Alena!

On Fri, Aug 23, 2024 at 5:06 PM Alena Rybakina
 wrote:

To be fair, I fixed this before [0] by selecting the appropriate group
of "or" expressions to transform them to "ANY" expression and then
checking for compatibility with the index column. maybe we should try
this too? I can think about it.

[0]
https://www.postgresql.org/message-id/531fc0ab-371e-4235-97e3-dd2d077b6995%40postgrespro.ru

I probably didn't get your message.  Which patch version you think
resolve the problem?  I see [0] doesn't contain any patch.

Sorry, I got the links mixed up. We need this link [0].

Still confusion.
If that's another [0] from [0] in the cited message then it seems you
missed new link in your last message.

--
Regards,
Alexander Korotkov
Supabase



--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Official devcontainer config

2024-08-24 Thread Peter Eisentraut

On 24.08.24 14:49, Junwang Zhao wrote:

What I haven't addressed is that the repo still uses specific local
paths, I think
this is ok since the code is not going into the core.


I'm not among the target users of this, but I imagine that that would 
significantly reduce the utility of this for everyone besides you?



One thing I want to ask is, is there any objection to adding the
.devcontainer and .vscode to the .gitignore file?


The standing policy is that files related to IDEs and editors should not 
be in our .gitignore, but you can put them into your personal ignore 
file somewhere.



There are *.vcproj and pgsql.sln in .gitignore, so I guess it's ok to add
.devcontainer and .vscode?


Those are files generated by the build process, so it is appropriate to 
have them there.  But in fact, they should have been removed now that 
the MSVC build system is done.






Re: thread-safety: getpwuid_r()

2024-08-24 Thread Heikki Linnakangas

On 24/08/2024 11:42, Peter Eisentraut wrote:

Here is a patch to replace a getpwuid() call in the backend, for
thread-safety.

This is AFAICT the only call in the getpw*() family that needs to be
dealt with.

(There is also a getgrnam() call, but that is called very early in the
postmaster, before multiprocessing, so we can leave that as is.)

The solution here is actually quite attractive: We can replace the
getpwuid() call by the existing wrapper function pg_get_user_name(),
which internally uses getpwuid_r().  This also makes the code a bit
simpler.  The same function is already used in libpq for a purpose that
mirrors the backend use, so it's also nicer to use the same function for
consistency.


Makes sense.

The temporary buffers are a bit funky. pg_get_user_name() internally 
uses a BUFSIZE-sized area to hold the result of getpwuid_r(). If the 
pg_get_user_name() caller passes a buffer smaller than BUFSIZE, the user 
id might get truncated. I don't think that's a concern on any real-world 
system, and the callers do pass a large-enough buffer so truncation 
can't happen. At a minimum, it would be good to add a comment to 
pg_get_user_name() along the lines of "if 'buflen' is smaller than 
BUFSIZE, the result might be truncated".


Come to think of it, the pg_get_user_name() function is just a thin 
wrapper around getpwuid_r(). It doesn't provide a lot of value. Perhaps 
we should remove pg_get_user_name() and pg_get_user_home_dir() 
altogether and call getpwuid_r() directly.


But no objection to committing this as it is, either.

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Official devcontainer config

2024-08-24 Thread Junwang Zhao
On Sat, Aug 24, 2024 at 9:47 PM Peter Eisentraut  wrote:
>
> On 24.08.24 14:49, Junwang Zhao wrote:
> > What I haven't addressed is that the repo still uses specific local
> > paths, I think
> > this is ok since the code is not going into the core.
>
> I'm not among the target users of this, but I imagine that that would
> significantly reduce the utility of this for everyone besides you?

Yeah, the reason why I started this thread is that we(at least Jelta and I)
think it may make some potential new contributors' lives easier.

>
> > One thing I want to ask is, is there any objection to adding the
> > .devcontainer and .vscode to the .gitignore file?
>
> The standing policy is that files related to IDEs and editors should not
> be in our .gitignore, but you can put them into your personal ignore
> file somewhere.

Sure, I can put them in global ignore file in various ways.

I just saw the policy in the comment, so I'm ok with it.

>
> > There are *.vcproj and pgsql.sln in .gitignore, so I guess it's ok to add
> > .devcontainer and .vscode?
>
> Those are files generated by the build process, so it is appropriate to
> have them there.  But in fact, they should have been removed now that
> the MSVC build system is done.
>

-- 
Regards
Junwang Zhao




list of acknowledgments for PG17

2024-08-24 Thread Peter Eisentraut
The list of acknowledgments for the PG17 release notes has been 
committed.  You can see it here:


https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=doc/src/sgml/release-17.sgml;h=08a479807ca2933668dede22e4e6f464b937ee45;hb=refs/heads/REL_17_STABLE#l3229

As usual, please check for problems such as wrong sorting, duplicate 
names in different variants, or names in the wrong order etc.





Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY

2024-08-24 Thread Michail Nikolaev
Hello, everyone!

This patch set addresses the issues discussed in this thread.

The main idea behind this fix is that it is safe to consider indisready
indexes alongside indisvalid indexes as arbiter indexes. However, it's
crucial that at least one fully valid index is present.

Why is it necessary to consider indisready during the planning phase?

The reason is that these indexes are required for correct processing during
the execution phase.
If "ready" indexes are skipped as arbiters by one transaction, they may
already have become "valid" for another concurrent transaction during its
planning phase.
As a result, both transactions could concurrently process the UPSERT
command with different sets of arbiters (while using the same set of
indexes for tuple insertion later).
This can lead to unexpected "duplicate key value violates unique
constraint" errors and deadlocks.

Is it safe to use a "ready" but not yet "valid" index as an arbiter?
Yes, as long as at least one "valid" index is also used as an arbiter.
The valid index ensures the correctness of the UPSERT logic, while the
"ready" index contains an equal or lesser number of tuples, making it safe
for speculative insertion.
In any case, the insert to that index will be processed during
ExecInsertIndexTuples one way or another (with applyNoDupErr or without).

Fix is divided into a few patches, each following this logic:

1) The first patch provides specs (and injection points) for the various
scenarios related to the issue.
2) The second patch introduces a straightforward change—adding indisready
indexes to arbiters alongside indisvalid. However, at least one indisvalid
is still required. This resolves simple cases involving REINDEX
CONCURRENTLY and CREATE INDEX CONCURRENTLY.
3) The third patch deals with named constraints. Instead of relying solely
on the index with the specified name, we attempt to find other indexes that
are equivalent in terms of being used as an arbiter.
4) This patch fixes a scenario involving partitioned tables. Special checks
are required for partitioned indexes, which may be processed by REINDEX
CONCURRENTLY.

Additionally, a patch with three extra TAP specifications for stress
testing is attached. This patch is not intended for commitment, so I
renamed the extension to prevent accidental application in some CI/DI jobs.

>
Also, it is possible to look at the patches on GitHub:
https://github.com/postgres/postgres/compare/master...michail-nikolaev:postgres:reindex_concurrently_with_upsert

Best regards,
Mikhail.
From 9a4d01018e21d491c967b3378e61abdb06f52c74 Mon Sep 17 00:00:00 2001
From: nkey 
Date: Sat, 24 Aug 2024 14:04:02 +0200
Subject: [PATCH v3 2/4] Modify the infer_arbiter_indexes function to consider
 both indisvalid and indisready indexes. Ensure that at least one indisvalid
 index is still required.

The change ensures that all concurrent transactions utilize the same set of indexes as arbiters. This uniformity is required to avoid conditions that could lead to "duplicate key value violates unique constraint" errors during UPSERT operations.

The patch resolves the issues in the following specs:
* reindex_concurrently_upsert
* index_concurrently_upsert
* index_concurrently_upsert_predicate

Despite the patch, the following specs are still affected:
* reindex_concurrently_upsert_partitioned
* reindex_concurrently_upsert_on_constraint
---
 src/backend/optimizer/util/plancat.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 78a3cfafde..5ffc815ae4 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -720,6 +720,7 @@ infer_arbiter_indexes(PlannerInfo *root)
 
 	/* Results */
 	List	   *results = NIL;
+	bool	   foundValid = false;
 
 	/*
 	 * Quickly return NIL for ON CONFLICT DO NOTHING without an inference
@@ -813,7 +814,13 @@ infer_arbiter_indexes(PlannerInfo *root)
 		idxRel = index_open(indexoid, rte->rellockmode);
 		idxForm = idxRel->rd_index;
 
-		if (!idxForm->indisvalid)
+		/*
+		 * We need to consider both indisvalid and indisready indexes because
+		 * them may become indisvalid before execution phase. It is required
+		 * to keep set of indexes used as arbiter to be the same for all
+		 * concurrent transactions.
+		 */
+		if (!idxForm->indisready)
 			goto next;
 
 		/*
@@ -835,10 +842,9 @@ infer_arbiter_indexes(PlannerInfo *root)
 		 errmsg("ON CONFLICT DO UPDATE not supported with exclusion constraints")));
 
 			results = lappend_oid(results, idxForm->indexrelid);
-			list_free(indexList);
+			foundValid |= idxForm->indisvalid;
 			index_close(idxRel, NoLock);
-			table_close(relation, NoLock);
-			return results;
+			break;
 		}
 		else if (indexOidFromConstraint != InvalidOid)
 		{
@@ -932,6 +938,7 @@ infer_arbiter_indexes(PlannerInfo *root)
 			goto next;
 
 		results = lappend_oid(results, idxForm->indexrelid);
+		foundValid |= idxForm->

Re: Optimize mul_var() for var1ndigits >= 8

2024-08-24 Thread Tom Lane
Dean Rasheed  writes:
> On Wed, 14 Aug 2024 at 07:31, Joel Jacobson  wrote:
>> I think this is acceptable, since it produces more correct results.

> Thanks for checking. I did a bit more testing myself and didn't see
> any problems, so I have committed both these patches.

About a dozen buildfarm members are complaining thus (eg [1]):

gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Werror=vla -Wendif-labels 
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv 
-fexcess-precision=standard -g -O2 -ftree-vectorize -I. -I. 
-I../../../../src/include  -D_GNU_SOURCE -I/usr/include/libxml2   -c -o 
numeric.o numeric.c
numeric.c: In function \342\200\230mul_var\342\200\231:
numeric.c:9209:9: warning: \342\200\230carry\342\200\231 may be used 
uninitialized in this function [-Wmaybe-uninitialized]
term = PRODSUM1(var1digits, 0, var2digits, 0) + carry;
 ^
numeric.c:8972:10: note: \342\200\230carry\342\200\231 was declared here
  uint32  carry;
  ^

I guess these compilers aren't able to convince themselves that the
first switch must initialize "carry".

regards, tom lane

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=arowana&dt=2024-08-24%2004%3A19%3A29&stg=build