Re: when the startup process doesn't (logging startup delays)

2021-07-21 Thread Nitin Jadhav
> I am not sure I am getting the code flow correctly. From the naming of
> CloseStartupProgress(), it seems it corresponds to InitStartupProgress(), but
> what it is doing is similar to LogStartupProgress(). I think it should be
> renamed to be in line with LogStartupProgress(), IMO.

Renamed CloseStartupProgress() to LogStartupProgressComplete().

> This part should be an assertion; it's the developer's responsibility to call
> it correctly.

This code is not required at all due to the fix of the next comment.

> Since we do have a separate call for the in-progress operation and the
> end-operation, only a single enum would have been enough. If we do this, then I
> think we should remove get_startup_process_operation_string() and move the
> messages to the respective functions.

Fixed.

> I'd really like to see this enabled by default, say with a default
> interval of 10 seconds. If it has to be enabled explicitly, most
> people won't, but I think a lot of people would benefit from knowing
> why their system is slow to start up when that sort of thing happens.
> I don't see much downside to having it on by default either, since it
> shouldn't be expensive.  I think the GUC's units should be seconds, not
> milliseconds, though.

I agree that it is better to enable it by default. Changed the code
accordingly and changed the GUC's units to seconds.
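
For reference, the resulting setting would look roughly like this in
postgresql.conf (a sketch only: the GUC name is the one used in the patch, the
10-second default follows the suggestion above, and the exact disable
semantics are whatever the patch implements):

    log_startup_progress_interval = 10s    # report long-running startup
                                           # operations every 10 seconds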

> The messages you've added are capitalized, but the ones PostgreSQL
> has currently are not. You should conform to the existing style.

Fixed.

> The "crash recovery complete" message looks redundant with the "redo
> done" message. Also, in my mind, "redo" is one particular phase of
> crash recovery, so it looks really odd that "crash recovery" finishes
> first and then "redo" finishes. I think some thought is needed about
> the terminology here.

Yes. "redo" is one phase of the crash recovery. Even "resetting
unlogged relations" is also a part of the crash recovery. These 2 are
the major time consuming operations of the crash recovery task. There
is a separate log message to indicate the progress of "resetting the
unlogged relations". So instead of saying 'performing crash recovery",
it is better to say "redo in progress" and not add any additional
message at the end of redo, instead retain the existing message to
avoid redundancy.

> I'm not clear why I get a message about the data directory fsync but
> not about resetting unlogged relations. I wasn't really expecting to
> get a message about things that completed in less than the configured
> interval, although I find that I don't mind having it there either.
> But then it seems like it should be consistent across the various
> things we're timing, and resetting unlogged relations should certainly
> be one of those.

It is the same across all the things we're timing. I was able to see
those messages on my machine. I feel there is not much overhead in
logging one message at the end of the operation even if it
completes within the configured interval. Following are the log
messages shown on my machine.

2021-07-20 18:47:32.046 IST [102230] LOG:  listening on IPv4 address
"127.0.0.1", port 5445
2021-07-20 18:47:32.048 IST [102230] LOG:  listening on Unix socket
"/tmp/.s.PGSQL.5445"
2021-07-20 18:47:32.051 IST [102234] LOG:  database system was
interrupted; last known up at 2021-07-20 18:46:27 IST
2021-07-20 18:47:32.060 IST [102234] LOG:  data directory sync (fsync)
complete after 0.00 s
2021-07-20 18:47:32.060 IST [102234] LOG:  database system was not
properly shut down; automatic recovery in progress
2021-07-20 18:47:32.063 IST [102234] LOG:  unlogged relations reset after 0.00 s
2021-07-20 18:47:32.063 IST [102234] LOG:  redo starts at 0/14EF418
.2021-07-20 18:47:33.063 IST [102234] LOG:  redo in progress, elapsed
time: 1.00 s, current LSN: 0/5C13D28
.2021-07-20 18:47:34.063 IST [102234] LOG:  redo in progress, elapsed
time: 2.00 s, current LSN: 0/A289160
.2021-07-20 18:47:35.063 IST [102234] LOG:  redo in progress, elapsed
time: 3.00 s, current LSN: 0/EE2DE10
2021-07-20 18:47:35.563 IST [102234] LOG:  invalid record length at
0/115C63E0: wanted 24, got 0
2021-07-20 18:47:35.563 IST [102234] LOG:  redo done at 0/115C63B8
system usage: CPU: user: 3.58 s, system: 0.14 s, elapsed: 3.50 s
2021-07-20 18:47:35.564 IST [102234] LOG:  unlogged relations reset after 0.00 s
2021-07-20 18:47:35.706 IST [102230] LOG:  database system is ready to
accept connections


> The way you've coded this has some drift. In a perfect world, I'd
> get a progress report at 1000.00 ms, 2000.00 ms, 3000.00 ms, etc.
> That's never going to be the case, because there's always going to be
> a slight delay in responding to the timer interrupt. However, as
> you've coded it, the delay increases over time. The first "Performing
> crash recovery" message is only 373 ms late, but the last one is 4916
> ms late. To avoid this, you want to reschedule the timer interrupt
> based on the time the last one was supposed to fire, not the time it
> actually did.
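
To make that concrete, here is a minimal sketch of drift-free rescheduling
(the variable names are illustrative and not taken from the patch; only
enable_timeout_after() and the timeout id correspond to the proposed code):

    /*
     * Reschedule relative to when the timer was supposed to fire, not when
     * it actually fired, so the lateness of each interrupt cannot accumulate.
     */
    next_due_ms += interval_ms;                 /* ideal next firing time */
    delay_ms = next_due_ms - now_ms;            /* time remaining until then */
    if (delay_ms < 0)
        delay_ms = 0;                           /* already late: fire at once */
    enable_timeout_after(LOG_STARTUP_PROGRESS_TIMEOUT, delay_ms);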

Re: Re: parallel distinct union and aggregate support patch

2021-07-21 Thread bu...@sohu.com
Sorry, this email was marked as spam by sohu, so I didn't notice it; for the
last few months I have been working hard to merge PostgreSQL 14 into our
cluster version (github.com/ADBSQL/AntDB).

I have an idea for how to make "Parallel Redistribute" work, even under
"Parallel Append" and "Nestloop", but "grouping sets" cannot work in parallel
mode using "Parallel Redistribute".
Please wait a few days; a patch is coming soon.


 
From: David Rowley
Date: 2021-07-06 10:48
To: bu...@sohu.com
CC: David Steele; pgsql-hackers; tgl; Dilip Kumar; Thomas Munro; Tomas Vondra; 
hlinnaka; robertmhaas; pgsql
Subject: Re: Re: parallel distinct union and aggregate support patch
On Tue, 30 Mar 2021 at 22:33, bu...@sohu.com  wrote:
> I have written a plan with similar functions. It is known that the following
> two situations do not work well.
 
I read through this thread and also wondered about a Parallel
Partition type operator.  It also seems to me that if it could be done
this way then you could just plug in existing nodes to get Sorting and
Aggregation rather than having to modify existing nodes to get them to
do what you need.
 
From what I've seen looking over the thread, a few people suggested
this and I didn't see anywhere where you responded to them about the
idea.  Just so you're aware, contributing to PostgreSQL is not a case
of throwing code at a wall and seeing which parts stick.  You need to
interact and respond to people reviewing your work. This is especially
true for the people who actually have the authority to merge any of
your work with the main code repo.
 
It seems to me you might be getting off to a bad start and you might
not be aware of this process. So I hope this email will help put you
on track.
 
Some of the people that you've not properly responded to include:
 
Thomas Munro:  PostgreSQL committer. Wrote Parallel Hash Join.
Robert Haas: PostgreSQL committer. Wrote much of the original parallel
query code.
Heikki Linnakangas: PostgreSQL committer. Worked on many parts of the
planner and executor. Also works for the company that develops
Greenplum, a massively parallel processing RDBMS, based on Postgres.
 
You might find other information in [1].
 
If I wanted to do what you want to do, I think those 3 people might be
some of the last people I'd pick to ignore questions from! :-)
 
Also, I'd say copying in Tom Lane randomly when he's not shown
any interest in the patch here is likely not a good way of making
forward progress.  You might find that it bubbles up on his radar
if you start constructively interacting with the people who have
questioned your design.  I'd say that should be your next step.
 
The probability of anyone merging any of your code without properly
discussing the design with the appropriate people is either very
close to zero or actually zero.
 
I hope this email helps you get on track.
 
David
 
[1] https://www.postgresql.org/community/contributors/


Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

2021-07-21 Thread Kyotaro Horiguchi
At Thu, 15 Jul 2021 16:19:07 +0900, Michael Paquier  wrote 
in 
> On Wed, Jul 14, 2021 at 11:02:47AM -0400, Alvaro Herrera wrote:
> > No, this doesn't work.  When the first word is something that is
> > not to be translated (a literal parameter name), let's use a %s (but of
> > course the values must be taken out of the phrase too).  But when it is
> > a translatable phrase, it must be present a full phrase, no
> > substitution:
> > 
> > pg_log_error("%s must be in range %s..%s", "extra_float_digits", "-15", 
> > "3");
> > pg_log..("compression level must be in range %s..%s", "0", "9"))
> > 
> > I think the purity test is whether you want to use a _() around the
> > argument, then you have to include it into the original message.
> 
> After thinking about that, it seems to me that we don't have that much
> context to lose once we build those error messages to show the option
> name of the command.  And the patch, as proposed, finishes with the

Agreed.

> same error message patterns all over the place, which would be a
> recipe for more inconsistencies in the future.  I think that we should
> centralize all that, say with a small-ish routine in a new file called
> src/fe_utils/option_parsing.c that uses strtoint() as follows:
> bool option_parse_int(const char *optarg,
> const char *optname,
> int min_range,
> int max_range,
> int *result);
>
> Then this routine may print two types of errors through
> pg_log_error():
> - Incorrect range, using min_range/max_range.
> - Junk input.
> The boolean status is here to let the caller do any custom exit()
> actions he wishes if there one of those two failures.  pg_dump has
> some of that with exit_nicely(), for one.

It is substantially the same suggestion as [1] in the original
thread.  The original proposal in the old thread was

> bool pg_strtoint64_range(arg, &result, min, max, &error_message)

The difference is that your suggestion makes the function emit the
message itself.  I guess that the reason for the original proposal is
that a different style of message is required in several places.

Actually, there are several irregular cases.

1. Some "bare" options (that is not preceded by a hyphen option) like
 PID of pg_ctl kill doesn't fit the format.  \pset parameters of
 pg_ctl is the same.

2. pg_ctl, pg_upgrade use their own error reporting mechanisms.

3. Parameters that take real numbers don't fit the scheme of specifying
 range borders. For example, boundary values may or may not be included
 in the range.

4. Most of the errors are PG_LOG_ERROR, but a few are
 PG_LOG_FATAL.

That being said, most of the call sites are satisfied with
fixed-form messages.

So I changed the interface as follows to deal with the above issues:

+extern optparse_result option_parse_int(int loglevel,
+                                        const char *optarg, const char *optname,
+                                        int min_range, int max_range,
+                                        int *result);

loglevel specifies the log level to use to emit error messages. If it
is the newly added item PG_LOG_OMIT, the function never emits an error
message. In addition, the return type is changed to an enum which
indicates what problem the given string has. The caller can print
arbitrary error messages by consulting the value (see killproc in pg_ctl.c).
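
As an illustration, a caller that wants its own wording could then do
something like this (a sketch only; OPTPARSE_OK, the option name and the
message are made up here and are not taken from the patch):

    int         seconds;

    /* parse silently (PG_LOG_OMIT) and report in the caller's own words */
    if (option_parse_int(PG_LOG_OMIT, optarg, "-s/--seconds",
                         1, INT_MAX, &seconds) != OPTPARSE_OK)
    {
        pg_log_error("-s/--seconds must be an integer between 1 and %d",
                     INT_MAX);
        exit(1);
    }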

Other points:

I added two more similar functions option_parse_long/double. The
former is a clone of _int. The latter doesn't perform explicit range
checks for the reason described above.

Maybe we need to make pg_upgrade use the common logging features
instead of its own, but that is not included in this patch.

pgbench's -L option accepts values that are out of range for the internal
variable. As the added comment says, we could limit the value to the exact
large number, but I limited it to 3600s since I can't imagine people needing
a larger latency limit than that.

Similarly, the -R option can take, for example, 1E-300, which leads to int64
overflow later. As with -L, I don't think people need values smaller than
1E-5 or so for this parameter.


The attached patch needs more polish but should be enough to show what
I have in mind.

regards.

1: 
https://www.postgresql.org/message-id/CAKJS1f94kkuB_53LcEf0HF%2BuxMiTCk5FtLx9gSsXcCByUKMz1g%40mail.gmail.com

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From aaf81ac0340e9df3b74e18c1492e5984c0412fb5 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 8 Jul 2021 15:08:01 +0900
Subject: [PATCH v3 1/3] Be strict in numeric parameters on command line

Some numeric command line parameters are tolerant of valid values
followed by garbage like "123xyz".  Be strict to reject such invalid
values. Do the same for psql meta command parameters.
---
 src/bin/pg_amcheck/pg_amcheck.c|   9 +-
 src/bin/pg_basebackup/pg_basebackup.c  |  17 ++-
 src

Re: 回复: Why is XLOG_FPI_FOR_HINT always need backups?

2021-07-21 Thread Kyotaro Horiguchi
At Wed, 21 Jul 2021 11:23:20 +0900, Fujii Masao  
wrote in 
> I slightly modified the comments and pushed the patch. Thanks!

Thank you for committing this!

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: shared-memory based stats collector

2021-07-21 Thread Kyotaro Horiguchi
At Mon, 19 Jul 2021 15:34:56 +0500, Ibrar Ahmed  wrote 
in 
> The patch does not apply and requires a rebase,

Yeah, thank you very much for checking that. However, this patch is
now being developed in Andres' GitHub repository.  So I'm at a loss as to
what to do about the failure...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: ECPG bug fix: DECALRE STATEMENT and DEALLOCATE, DESCRIBE

2021-07-21 Thread Kyotaro Horiguchi
Hello, Kuroda-san.

At Mon, 12 Jul 2021 04:05:21 +, "kuroda.hay...@fujitsu.com" 
 wrote in 
> > Similary, the following sequence doesn't yield an error, which is
> > expected.
> > 
> > > EXEC SQL AT conn1 DECLARE stmt STATEMENT;
> > > EXEC SQL AT conn2 EXECUTE stmt INTO ..;
> > 
> > In this case "conn2" set by the AT option is silently overwritten with
> > "conn1" by check_declared_list(). I think we should reject AT option
> > (with a different connection) in that case.
> 
> Actually this comes from Oracle's specification. The Pro*C precompiler
> overwrites the connection in this situation, hence I followed that.
> But I agree this might be confusing, so I added a warning report.
> What do you think? Is it still strange?

(I'm perplexed about what is done during precompilation and what is done
 at execution time...)

How does Pro*C behave in that case?  If the second command ends with an
error, I think we are free to reject the second command before execution.
If it works... do you know what happens at that point?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Have I found an interval arithmetic bug?

2021-07-21 Thread Dean Rasheed
On Wed, 21 Jul 2021 at 03:48, Bruce Momjian  wrote:
>
> this example now gives me concern:
>
> SELECT INTERVAL '1.06 months 1 hour';
>interval
> ---
>  1 mon 2 days 01:00:00
>
> Notice that it rounds the '1.06 months' to '1 mon 2 days', rather than
> spilling to hours/minutes/seconds, even though hours is already
> specified.  I don't see a better way to handle this than the current
> code already does, but it is something odd.

Hmm, looking at this whole thread, I have to say that I prefer the old
behaviour of spilling down to lower units.

For example, with this patch:

SELECT '0.5 weeks'::interval;
 interval
--
 4 days

which I don't think is really an improvement. My expectation is that
half a week is 3.5 days, and I prefer what it used to return, namely
'3 days 12:00:00'.

It's true that that leads to odd-looking results when the field value
has lots of fractional digits, but that was at least explainable, and
followed the documentation.

Looking for a general principle to follow, how about this -- the
result of specifying a fractional value should be the same as
multiplying an interval of 1 unit by that value. In other words,
'1.8594 months'::interval should be the same as '1 month'::interval *
1.8594. (Actually, it probably can't easily be made exactly the same
in all cases, due to differences in the floating point computations in
the two cases, and rounding errors, but it's hopefully not far off,
unlike the results obtained by not spilling down to lower units on
input.)

The cases that are broken in master, in my opinion, are the larger
units (year and above), which don't propagate down in the same way as
fractional months and below. So, for example, '0.7 years' should be
8.4 months (with the conversion factor of 1 year = 12 months), giving
'8 months 12 days', which is what '1 year'::interval * 0.7 produces.
Sure, there are arguably more accurate ways of computing that.
However, that's the result obtained using the documented conversion
factors, so it's justifiable in those terms.

It's worth noting another case that is broken in master:

SELECT '1.7 decades'::interval;
 interval
--
 16 years 11 mons

which is surely not what anyone would expect. The current patch fixes
this, but it would also be fixed by handling the fractional digits for
these units in the same way as for smaller units. There was an earlier
patch doing that, I think, though I didn't test it.

Regards,
Dean




Re: A problem about partitionwise join

2021-07-21 Thread Richard Guo
On Fri, Nov 27, 2020 at 8:05 PM Ashutosh Bapat 
wrote:

> On Tue, Nov 10, 2020 at 2:43 PM Richard Guo 
> wrote:
> > Thanks Anastasia. I've rebased the patch with latest master.
> >
> > To recap, the problem we are fixing here is when generating join clauses
> > from equivalence classes, we only select the joinclause with the 'best
> > score', or the first joinclause with a score of 3. This may cause us to
> > miss some joinclause on partition keys and thus fail to generate
> > partitionwise join.
> >
> > The initial idea for the fix is to create all the RestrictInfos from ECs
> > in order to check whether there exist equi-join conditions involving
> > pairs of matching partition keys of the relations being joined for all
> > partition keys. And then Tom proposed a much better idea which leverages
> > function exprs_known_equal() to tell whether the partkeys can be found
> > in the same eclass, which is the current implementation in the latest
> > patch.
> >
>
> In the example you gave earlier, the equi join on partition key was
> there but it was replaced by individual constant assignment clauses.
> So if we keep the original restrictclause in there with a new flag
> indicating that it's redundant, have_partkey_equi_join will still be
> able to use it without much change. Depending upon where all we need
> to use avoid restrictclauses with the redundant flag, this might be an
> easier approach. However, with Tom's idea partition-wise join may be
> used even when there is no equi-join between partition keys but there
> are clauses like pk = const for all tables involved and const is the
> same for all such tables.
>

Correct. So with Tom's idea partition-wise join can cope with clauses
such as 'foo.k1 = bar.k1 and foo.k2 = 16 and bar.k2 = 16'.


>
> In the spirit of small improvement made to the performance of
> have_partkey_equi_join(), pk_has_clause should be renamed as
> pk_known_equal and pks_known_equal as num_equal_pks.
>

Thanks for the suggestion. Will do that in the new version of patch.


>
> The loop traversing the partition keys at a given position, may be
> optimized further if we pass lists to exprs_known_equal() which in
> turns checks whether one expression from each list is member of a
> given EC. This will avoid traversing all equivalence classes for each
> partition key expression, which can be a huge improvement when there
> are many ECs. But I think if one of the partition key expression at a
> given position is member of an equivalence class all the other
> partition key expressions at that position should be part of that
> equivalence class since there should be an equi-join between those. So
> the loop in loop may not be required to start with.
>

Good point. Quote from one of Tom's earlier emails,
 "It seems at least plausible that in the cases we care about, all the
 partkeys on each side would be in the same eclasses anyway, so that
 comparing the first members of each list would be sufficient."

But I'm not sure if this holds true in all cases. However, since each
base relation within the join contributes only one partexpr, the number
of partexprs would only be equal to the join degree. Thus the nested
loop may not be a big problem?

PS. Sorry for the long delay!

Thanks
Richard


v5-0001-Fix-up-partitionwise-join.patch
Description: Binary data


Re: ORDER BY pushdowns seem broken in postgres_fdw

2021-07-21 Thread Ronan Dunklau
On Wednesday, 21 July 2021 at 04:25:00 CEST, David Rowley wrote:
> Here the test claims that it wants to ensure that the order by using
> operator(public.<^) is not pushed down into the foreign scan.
> However, unless I'm mistaken, it seems there's a completely wrong
> assumption there that the planner would even attempt that.  In current
> master we don't add PathKeys for ORDER BY aggregates, why would that
> sort get pushed down in the first place?

The whole aggregate, including its ORDER BY clause, can be pushed down, so
there is nothing related to pathkeys here.

> 
> If I adjust that query to something that would have the planner set
> pathkeys for, it does push the ORDER BY to the foreign server without
> any consideration that the sort operator is not shippable to the
> foreign server.
> 
> Am I missing something here, or is postgres_fdw.c's
> get_useful_pathkeys_for_relation() just broken?

I think you're right, we need to add a check that the opfamily is shippable.
I'll submit a patch for that including regression tests.

Regards,

-- 
Ronan Dunklau






Re: Using each rel as both outer and inner for JOIN_ANTI

2021-07-21 Thread Richard Guo
On Fri, Jul 2, 2021 at 11:59 AM Zhihong Yu  wrote:

>
>
> On Thu, Jul 1, 2021 at 8:24 PM Richard Guo  wrote:
>
>>
>> On Thu, Jul 1, 2021 at 3:18 PM Ronan Dunklau 
>> wrote:
>>
>>> On Thursday, 1 July 2021 at 09:09:38 CEST, Ronan Dunklau wrote:
>>> > > Yes, thanks! I was making a big mistake here thinking the executor
>>> can
>>> > > stop after the first match. That's not true. We need to use each
>>> outer
>>> > > tuple to find all the matches and mark the corresponding hashtable
>>> > > entries. I have updated the patch with the fix.
>>> >
>>> > It looks OK to me.
>>>
>>> I forgot to mention: you also have failing tests due to the plans
>>> changing to
>>> use the new join type. This might not be the case anymore once you
>>> update the
>>> cost model, but if that's the case the tests should be updated.
>>>
>>
>> Thanks! Test cases are updated in v3 patch. Also merge join can do the
>> 'right anti join' too in the same patch.
>>
>> Thanks again for reviewing this patch.
>>
>> Thanks
>> Richard
>>
>>
> Hi,
> Minor comment:
> +* In a right-antijoin, we never return a matched
> tuple.
>
> matched tuple -> matching tuple
>

Thanks for reviewing.

The comment for JOIN_ANTI in ExecHashJoinImpl/ExecMergeJoin is:

"In an antijoin, we never return a matched tuple"

So I think we'd better keep it consistent for JOIN_RIGHT_ANTI?

Thanks
Richard


Re: Added schema level support for publication.

2021-07-21 Thread Rahila Syed
On Mon, Jul 19, 2021 at 2:41 PM Greg Nancarrow  wrote:

> On Fri, Jul 16, 2021 at 8:13 PM vignesh C  wrote:
> >
> > Modified.
> >
> > Thanks for the comments, these issues are fixed as part of the v12 patch
> posted at [1].
> > [1]  -
> https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
> >
>
> There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
> After that command, it still regards it as an empty (e) publication,
> so I can then ALTER PUBLICATION ... ADD SCHEMA ...
>
>
One issue here is that the code to update the publication type is missing
in AlterPublicationTables for the SET TABLE command.

More broadly, I am not clear about the behaviour of the patch when a
publication is created to publish only certain tables, and is later altered
to publish a whole schema. I think such behaviour is legitimate. However,
AFAIU, as per the current code we can't update the publication type
from PUBTYPE_TABLE to PUBTYPE_SCHEMA.

I have some review comments as follows:
1.
In ConvertSchemaSpecListToOidList(List *schemas) function:
 +          search_path = fetch_search_path(false);
 +          nspname = get_namespace_name(linitial_oid(search_path));
 +          if (nspname == NULL)        /* recently-deleted namespace? */
 +              ereport(ERROR,
 +                      errcode(ERRCODE_UNDEFINED_SCHEMA),
 +                      errmsg("no schema has been selected"));
 +
 +          schemoid = get_namespace_oid(nspname, false);
 +          break;

The call to get_namespace_oid() is perhaps not needed, as fetch_search_path()
already fetches OIDs, and simply doing schemoid = linitial_oid(search_path);
should be enough (see the sketch below, after point 2).

2. In the same function, should there be an if-else block instead of a
switch case, as there are only two cases?
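
For illustration, a minimal sketch of the simplification suggested in point 1
(the variable names follow the excerpt above; this is not code from the patch,
and the recently-deleted-namespace concern from the original comment may still
need its own handling):

            /* fetch_search_path() already returns OIDs, so use the first
             * one directly instead of converting name -> OID again */
            search_path = fetch_search_path(false);
            if (search_path == NIL)
                ereport(ERROR,
                        errcode(ERRCODE_UNDEFINED_SCHEMA),
                        errmsg("no schema has been selected"));

            schemoid = linitial_oid(search_path);
            break;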


Thank you,
Rahila Syed


Re: Have I found an interval arithmetic bug?

2021-07-21 Thread Tom Lane
Dean Rasheed  writes:
> Hmm, looking at this whole thread, I have to say that I prefer the old
> behaviour of spilling down to lower units.

> For example, with this patch:

> SELECT '0.5 weeks'::interval;
>  interval
> --
>  4 days

> which I don't think is really an improvement. My expectation is that
> half a week is 3.5 days, and I prefer what it used to return, namely
> '3 days 12:00:00'.

Yeah, that is clearly a significant dis-improvement.

In general, considering that (most of?) the existing behavior has stood
for decades, I think we need to tread VERY carefully about changing it.
I don't want to see this patch changing any case that is not indisputably
broken.

regards, tom lane




Re: [PATCH] Use optimized single-datum tuplesort in ExecSort

2021-07-21 Thread David Rowley
On Wed, 21 Jul 2021 at 13:39, James Coleman  wrote:
> Thanks for doing the math measuring how much we could impact things.
>
> I'm +lots on getting this committed as is.

Ok good. I plan on taking a final look at the v10 patch tomorrow
morning NZ time (about 12 hours from now) and if all is well, I'll
push it.

If anyone feels differently, please let me know before then.

David




Re: Numeric x^y for negative x

2021-07-21 Thread Dean Rasheed
On Tue, 20 Jul 2021 at 10:17, Yugo NAGATA  wrote:
>
> This patch fixes numeric_power() to handle negative bases correctly and not
> to raise an error "cannot take logarithm of a negative number", as well as a
> bug that a result whose value is almost zero is incorrectly returned as 1.
> The previous behaviors are obviously strange, and these fixes seem
> reasonable to me.
>
> Also, improper overflow errors are corrected in numeric_power() and
> numeric_exp() to return 0 when the result actually underflows.
> I think it is no problem that these functions return zero instead of underflow
> errors because power_var_int() already does the same.
>
> The patch includes additional tests for checking negative bases cases and
> underflow and rounding of the almost zero results. It seems good.

Thanks for the review!


> Let me just make one comment.
>
> (errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
>  errmsg("zero raised to a negative power is undefined")));
>
> -   if (sign1 < 0 && !numeric_is_integral(num2))
> -   ereport(ERROR,
> -   (errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
> -errmsg("a negative number raised to a non-integer power 
> yields a complex result")));
> -
> /*
>  * Initialize things
>  */
>
> I don't think we need to move this check from numeric_power to power_var.

Moving it to power_var() means that it only needs to be checked in the
case of a negative base, together with an exponent that cannot be
handled by power_var_int(), which saves unnecessary checking. It isn't
necessary to do this test at all if the exponent is an integer small
enough to fit in a 32-bit int. And if it's not an integer, or it's a
larger integer than that, it seems more logical to do the test in
power_var() near to the other code handling that case.


> I noticed the following comment in a numeric_power().
>
> /*
>  * The SQL spec requires that we emit a particular SQLSTATE error code for
>  * certain error conditions.  Specifically, we don't return a
>  * divide-by-zero error code for 0 ^ -1.
>  */
>
> In the original code, two checks that could raise an error of
> ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION are following the comment.
> I think these check codes are placed together under this comment 
> intentionally,
> so I suggest not to move one of them.

Ah, that's a good point about the SQL spec. The comment only refers to
the case of 0 ^ -1, but the SQL spec does indeed say that a negative
number to a non-integer power should return the same error code.

Here is an updated patch with additional comments about the required
error code when raising a negative number to a non-integer power, and
where it is checked.

Regards,
Dean
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
new file mode 100644
index 2a0f68f..8a8822c
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -3935,7 +3935,9 @@ numeric_power(PG_FUNCTION_ARGS)
 	/*
 	 * The SQL spec requires that we emit a particular SQLSTATE error code for
 	 * certain error conditions.  Specifically, we don't return a
-	 * divide-by-zero error code for 0 ^ -1.
+	 * divide-by-zero error code for 0 ^ -1.  Raising a negative number to a
+	 * non-integer power must produce the same error code, but that case is
+	 * handled in power_var().
 	 */
 	sign1 = numeric_sign_internal(num1);
 	sign2 = numeric_sign_internal(num2);
@@ -3945,11 +3947,6 @@ numeric_power(PG_FUNCTION_ARGS)
 (errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
  errmsg("zero raised to a negative power is undefined")));
 
-	if (sign1 < 0 && !numeric_is_integral(num2))
-		ereport(ERROR,
-(errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
- errmsg("a negative number raised to a non-integer power yields a complex result")));
-
 	/*
 	 * Initialize things
 	 */
@@ -9762,12 +9759,18 @@ exp_var(const NumericVar *arg, NumericVa
 	 */
 	val = numericvar_to_double_no_overflow(&x);
 
-	/* Guard against overflow */
+	/* Guard against overflow/underflow */
 	/* If you change this limit, see also power_var()'s limit */
 	if (Abs(val) >= NUMERIC_MAX_RESULT_SCALE * 3)
-		ereport(ERROR,
-(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
- errmsg("value overflows numeric format")));
+	{
+		if (val > 0)
+			ereport(ERROR,
+	(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+	 errmsg("value overflows numeric format")));
+		zero_var(result);
+		result->dscale = rscale;
+		return;
+	}
 
 	/* decimal weight = log10(e^x) = x * log10(e) */
 	dweight = (int) (val * 0.434294481903252);
@@ -10125,6 +10128,8 @@ log_var(const NumericVar *base, const Nu
 static void
 power_var(const NumericVar *base, const NumericVar *exp, NumericVar *result)
 {
+	int			res_sign;
+	NumericVar	abs_base;
 	NumericVar	ln_base;
 	NumericVar	ln_num;
 	int			ln_dweight;
@@ -10168,9 +10173,40 @@ power_var(const NumericVar *base, const
 		return;
 	}
 
+	init_var(&abs_base)

Re: postgres_fdw - make cached connection functions tests meaningful

2021-07-21 Thread Bharath Rupireddy
On Sat, Jul 17, 2021 at 9:37 PM Tom Lane  wrote:
>
> Bharath Rupireddy  writes:
> > On Thu, Jul 15, 2021 at 3:48 AM Tom Lane  wrote:
> >> More generally, though, I am not sure that I believe the premise of
> >> this patch.  AFAICS it's assuming that forcing debug_discard_caches
> >> off guarantees zero cache flushes, which it does not.
>
> > Can the setting debug_discard_caches = 0 still make extra
> > flushes/discards (not the regular cache flushes/discards that happen
> > because of alters or changes in the cached elements)? My understanding
> > was that debug_discard_caches = 0, disables all the extra flushes with
> > clobber cache builds. If my understanding wasn't right, isn't it good
> > to mention it somewhere in the documentation or in the source code?
>
> The reason for this mechanism is that cache flushes can be triggered
> at any time by sinval messages from other processes (e.g., background
> autovacuums).  Setting debug_discard_caches allows us to exercise
> that possibility exhaustively and make sure that the code is proof
> against cache entries disappearing unexpectedly.  Not setting
> debug_discard_caches doesn't mean that that can't happen, only that
> you can't predict when it will happen.

Thanks. I'm fine with dropping this patch, hence I marked the CF entry
as "rejected".

Regards,
Bharath Rupireddy.




Re: Printing backtrace of postgres processes

2021-07-21 Thread Bharath Rupireddy
On Tue, Jul 13, 2021 at 9:03 PM vignesh C  wrote:
>
> On Wed, May 12, 2021 at 2:27 AM Robert Haas  wrote:
> >
> > Maybe we should have a role that is specifically for server debugging
> > type things. This kind of overlaps with Mark Dilger's proposal to try
> > to allow SET for security-sensitive GUCs to be delegated via
> > predefined roles. The exact way to divide that up is open to question,
> > but it wouldn't seem crazy to me if the same role controlled the
> > ability to do this plus the ability to set the GUCs
> > backtrace_functions, debug_invalidate_system_caches_always,
> > wal_consistency_checking, and maybe a few other things.
>
> +1 for the idea of having a new role for this. Currently I have
> implemented this feature to be supported only for the superuser. If we
> are ok with having a new role to handle debugging features, I will
> make a 002 patch to handle this.

I see that there are a good number of user functions that are
accessible only by superuser (I searched for "if (!superuser())" in
the code base). I agree with the intention to not overload the
superuser anymore and have a few other special roles to delegate the
existing superuser-only functions to them. In that case, are we going
to revisit and reassign all the existing superuser-only functions?

Regards,
Bharath Rupireddy.




Re: Micro-optimizations to avoid some strlen calls.

2021-07-21 Thread David Rowley
On Tue, 20 Jul 2021 at 10:49, Ranier Vilela  wrote:
> There are some places, where strlen can have an overhead.
> This patch tries to fix this.

I'm with Michael and David on this.

I don't really feel like doing;

- snprintf(buffer, sizeof(buffer), "E%s%s\n",
+ buflen = snprintf(buffer, sizeof(buffer), "E%s%s\n",
  _("could not fork new process for connection: "),

is a good idea.  I'm unsure if you're either not aware of the value
that snprintf() returns or just happen to think an overflow is
unlikely enough because you're convinced that 1000 chars are always
enough to fit this translatable string.   I'd say if we were 100%
certain of that then it might as well become sprintf() instead.
However, I imagine you'll struggle to get people to side with you that
taking this overflow risk would be worthwhile given your lack of any
evidence that anything actually has become meaningfully faster as a
result of any of these changes.
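
(For what it's worth, if the return value were used, it would need clamping,
because snprintf() reports the length it would have written had the buffer
been large enough.  A sketch only -- the elided last argument is assumed to
be strerror(errnum), and Min() is the macro from c.h:

    int     rc = snprintf(buffer, sizeof(buffer), "E%s%s\n",
                          _("could not fork new process for connection: "),
                          strerror(errnum));
    size_t  buflen = (rc < 0) ? 0 : Min((size_t) rc, sizeof(buffer) - 1);

Whether that complexity is worth saving one strlen() call is exactly the
trade-off being questioned here.)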

David




Re: logical replication empty transactions

2021-07-21 Thread Ajin Cherian
On Mon, Jul 19, 2021 at 3:24 PM Peter Smith  wrote:

> 1a. Commit Comment - wording
>
updated.
>
> 1b. Commit Comment - wording
>
updated.

> 2. doc/src/sgml/logicaldecoding.sgml - wording
>
> @@ -884,11 +884,19 @@ typedef void (*LogicalDecodePrepareCB) (struct
> LogicalDecodingContext *ctx,
>The required commit_prepared_cb callback is called
>whenever a transaction COMMIT PREPARED has
> been decoded.
>The gid field, which is part of the
> -  txn parameter, can be used in this callback.
> +  txn parameter, can be used in this callback. The
> +  parameters prepare_end_lsn and
> +  prepare_time can be used to check if the plugin
> +  has received this PREPARE TRANSACTION in which case
> +  it can commit the transaction, otherwise, it can skip the commit. The
> +  gid alone is not sufficient because the 
> downstream
> +  node can have a prepared transaction with the same identifier.
>
> =>
>
> (some minor rewording of the last part)

updated.

>
> 3. src/backend/replication/logical/proto.c - whitespace
>
> @@ -244,12 +248,16 @@ logicalrep_read_commit_prepared(StringInfo in,
> LogicalRepCommitPreparedTxnData *
>   elog(ERROR, "unrecognized flags %u in commit prepared message", flags);
>
>   /* read fields */
> + prepare_data->prepare_end_lsn = pq_getmsgint64(in);
> + if (prepare_data->prepare_end_lsn == InvalidXLogRecPtr)
> + elog(ERROR,"prepare_end_lsn is not set in commit prepared message");
>
> =>
>
> There is missing space before the 2nd elog param.
>

fixed.

>
> 4a. =>
>
> "and was essentially an empty prepare" --> "so was essentially an empty 
> prepare"
>
> 4b. =>
>
> "In which case" --> "In this case"
>
> --

fixed.

> I felt that since this message postponement is now the new behaviour
> of this function then probably this should all be a function level
> comment instead of the comment being in the body of the function
>
> --
>
> 6. src/backend/replication/pgoutput/pgoutput.c - pgoutput_begin
>
> +
> +static void
> +pgoutput_begin(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
>
> =>
>
> Even though it is kind of obvious, it is probably better to provide a
> function comment here too
>
> --

Changed accordingly.

>

> I felt that the comment "skip COMMIT message if nothing was sent"
> should be done at the point where you *decide* to skip or not. So you
> could either move that comment to where the skip variable is assigned.
> Or (my preference) leave the comment where it is but change the
> variable name to be sent_begin = !data->sent_begin_txn;
>

Updated the comment to where the skip variable is assigned.


> --
>
> Regardless I think the comment should be elaborated a bit to describe
> the reason more.
>
> 7b. =>
>
> BEFORE
> /* skip COMMIT message if nothing was sent */
>
> AFTER
> /* If a BEGIN message was not yet sent, then it means there were no
> relevant changes encountered, so we can skip the COMMIT message too.
> */
>

Updated accordingly.


> --

> Like previously, I felt that this big comment should be at the
> function level of pgoutput_begin_prepare_txn instead of in the body of
> the function.
>
> --
>
> 8b. =>
>
> And then the body comment would be something simple like:
>
> /* Delegate to assign the begin sent flag as false same as for the
> BEGIN message. */
> pgoutput_begin_txn(ctx, txn);
>

Updated accordingly.

> --
>
> 9. src/backend/replication/pgoutput/pgoutput.c - pgoutput_begin_prepare
>
> +
> +static void
> +pgoutput_begin_prepare(LogicalDecodingContext *ctx, ReorderBufferTXN *txn)
>
> =>
>
> Probably this needs a function comment.
>

Updated.

> --
>
> 10. src/backend/replication/pgoutput/pgoutput.c - pgoutput_prepare_txn
>
> @@ -459,8 +520,18 @@ static void
>  pgoutput_prepare_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
>   XLogRecPtr prepare_lsn)
>  {
> + PGOutputTxnData*data = (PGOutputTxnData *) txn->output_plugin_private;
> +
> + Assert(data);
>   OutputPluginUpdateProgress(ctx);
>
> + /* skip PREPARE message if nothing was sent */
> + if (!data->sent_begin_txn)
>
> =>
>
> Maybe elaborate on that "skip PREPARE message if nothing was sent"
> comment in a way similar to my review comment 7b. For example,
>
> AFTER
> /* If the BEGIN was not yet sent, then it means there were no relevant
> changes encountered, so we can skip the PREPARE message too. */
>

Updated.

> --
>
> 11. src/backend/replication/pgoutput/pgoutput.c - pgoutput_commit_prepared_txn
>
> @@ -471,12 +542,33 @@ pgoutput_prepare_txn(LogicalDecodingContext
> *ctx, ReorderBufferTXN *txn,
>   */
>  static void
>  pgoutput_commit_prepared_txn(LogicalDecodingContext *ctx,
> ReorderBufferTXN *txn,
> - XLogRecPtr commit_lsn)
> + XLogRecPtr commit_lsn, XLogRecPtr prepare_end_lsn,
> + TimestampTz prepare_time)
>  {
> + PGOutputTxnData*data = (PGOutputTxnData *) txn->output_plugin_private;
> +
>   OutputPluginUpdateProgress(ctx);
>
> + /*
> + * skip sending COMMIT PREPARED message if prepared transa

Re: logical replication empty transactions

2021-07-21 Thread Ajin Cherian
On Thu, Jul 15, 2021 at 3:50 PM osumi.takami...@fujitsu.com
 wrote:
> I started to test this patch but will give you some really minor quick
> feedback.
>
> (1) pg_logical_slot_get_binary_changes() params.
>
> Technically, looks better to have proto_version 3 & two_phase option for the 
> function
> to test empty prepare ? I felt proto_version 1 doesn't support 2PC.
> [1] says "The following messages (Begin Prepare, Prepare, Commit Prepared, 
> Rollback Prepared)
> are available since protocol version 3." Then, if the test wants to skip 
> empty *prepares*,
> I suggest to update the proto_version and set two_phase 'on'.

Updated accordingly.

> (2) The following sentences may start with a lowercase letter.
> There are other similar codes for this.
>
> +   elog(DEBUG1, "Skipping replication of an empty transaction");

Fixed this.

I've addressed these comments in version 8 of the patch.

regards,
Ajin Cherian
Fujitsu Australia




Re: CLUSTER on partitioned index

2021-07-21 Thread Laurenz Albe
On Tue, 2021-07-20 at 20:27 -0400, Alvaro Herrera wrote:
> I have to wonder if there really *is* a use case for CLUSTER in the
> first place on regular tables, let alone on partitioned tables, which
> are likely to be large and thus take a lot of time.  What justifies
> spending so much time on this implementation?  My impression is that
> CLUSTER is pretty much a fringe command nowadays, because of the access
> exclusive lock required.
> 
> Does anybody actually use it?

I see it used in the field occasionally, as it can really drastically
improve performance.  But I admit it is not frequently used.

In a data warehouse, which is updated only occasionally, running
CLUSTER after an update can make a lot of sense.

I personally think that it is enough to be able to cluster the table
partition by partition.

Yours,
Laurenz Albe





Re: when the startup process doesn't (logging startup delays)

2021-07-21 Thread Bharath Rupireddy
On Wed, Jul 21, 2021 at 12:52 PM Nitin Jadhav
 wrote:
> Please find the v5 patch attached. Kindly let me know your comments.

Thanks for the patch. Here are some comments on v5:
1) I still don't see the need for two functions LogStartupProgress and
LogStartupProgressComplete. Most of the code is duplicated. I think we
can just do it with a single function, something like [1]:

2) Why isn't there a
LogStartupProgressComplete(STARTUP_PROCESS_OP_REDO)? Is it because of
the below existing log message?
ereport(LOG,
        (errmsg("redo done at %X/%X system usage: %s",
                LSN_FORMAT_ARGS(ReadRecPtr),
                pg_rusage_show(&ru0))));

3) I think it should be, "," after occurred instead of "."
+ * elapsed or not. TRUE if timeout occurred, FALSE otherwise.
instead of
+ * elapsed or not. TRUE if timeout occurred. FALSE otherwise.

[1]
+/*
+ * Logs the progress of the operations performed during the startup process.
+ * is_complete true/false indicates that the operation is finished/
+ * in-progress respectively.
+ */
+void
+LogStartupProgress(StartupProcessOp op, const char *path, bool is_complete)
+{
+    long    secs;
+    int     usecs;
+    int     elapsed_ms;
+    int     interval_in_ms;
+
+    /* If not called from the startup process then this feature is of no use. */
+    if (!AmStartupProcess())
+        return;
+
+    /* If the feature is disabled, then no need to proceed further. */
+    if (log_startup_progress_interval < 0)
+        return;
+
+    /*
+     * If the operation is in-progress and the timeout hasn't occurred, then
+     * no need to log the details.
+     */
+    if (!is_complete && !logStartupProgressTimeout)
+        return;
+
+    /* Timeout has occurred. */
+    TimestampDifference(startupProcessOpStartTime,
+                        GetCurrentTimestamp(),
+                        &secs, &usecs);
+
+    /*
+     * If the operation is in-progress, enable the timer for the next log
+     * message based on the time that the current log message timer was
+     * supposed to fire.
+     */
+    if (!is_complete)
+    {
+        elapsed_ms = (secs * 1000) + (usecs / 1000);
+        interval_in_ms = log_startup_progress_interval * 1000;
+        enable_timeout_after(LOG_STARTUP_PROGRESS_TIMEOUT,
+                             (interval_in_ms - (elapsed_ms % interval_in_ms)));
+    }
+
+    switch (op)
+    {
+        case STARTUP_PROCESS_OP_SYNCFS:
+            {
+                if (is_complete)
+                    ereport(LOG,
+                            (errmsg("data directory sync (syncfs) complete after %ld.%02d s",
+                                    secs, (usecs / 10000))));
+                else
+                    ereport(LOG,
+                            (errmsg("syncing data directory (syncfs), elapsed time: %ld.%02d s, current path: %s",
+                                    secs, (usecs / 10000), path)));
+            }
+            break;
+        case STARTUP_PROCESS_OP_FSYNC:
+            {
+                if (is_complete)
+                    ereport(LOG,
+                            (errmsg("data directory sync (fsync) complete after %ld.%02d s",
+                                    secs, (usecs / 10000))));
+                else
+                    ereport(LOG,
+                            (errmsg("syncing data directory (fsync), elapsed time: %ld.%02d s, current path: %s",
+                                    secs, (usecs / 10000), path)));
+            }
+            break;
+        case STARTUP_PROCESS_OP_REDO:
+            {
+                /*
+                 * No need to log redo completion status here, as it will be
+                 * done in the caller.
+                 */
+                if (!is_complete)
+                    ereport(LOG,
+                            (errmsg("redo in progress, elapsed time: %ld.%02d s, current LSN: %X/%X",
+                                    secs, (usecs / 10000), LSN_FORMAT_ARGS(ReadRecPtr))));
+            }
+            break;
+        case STARTUP_PROCESS_OP_RESET_UNLOGGED_REL:
+            {
+                if (is_complete)
+                    ereport(LOG,
+                            (errmsg("unlogged relations reset after %ld.%02d s",
+                                    secs, (usecs / 10000))));
+                else
+                    ereport(LOG,
+                            (errmsg("resetting unlogged relations, elapsed time: %ld.%02d s, current path: %s",
+                                    secs, (usecs / 10000), path)));
+            }
+            break;
+        default:
+            ereport(E

Re: add 'noError' to euc_tw_and_big5.c

2021-07-21 Thread John Naylor
On Tue, Jul 20, 2021 at 10:35 PM Michael Paquier 
wrote:
>
> On Wed, Jul 21, 2021 at 02:15:14AM +, wangyu...@fujitsu.com wrote:
> > 'noError' argument was added at commit ea1b99a661,
> > but it seems to be neglected in euc_tw_and_big5.c Line 289.
> > please see the attachment.
>
> That sounds right to me.  Double-checking the area, I am not seeing
> another portion of the code to fix.

Agreed, will push.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

2021-07-21 Thread Michael Paquier
On Wed, Jul 21, 2021 at 05:02:29PM +0900, Kyotaro Horiguchi wrote:
> The difference is your suggestion is making the function output the
> message within.  I guess that the reason for the original proposal is
> different style of message is required in several places.

That's one step toward having as many frontend tools as possible use
the central logging APIs of src/common/.

> 1. Some "bare" options (that is not preceded by a hyphen option) like
>  PID of pg_ctl kill doesn't fit the format.  \pset parameters of
>  pg_ctl is the same.

Yep.  I was reviewing this one, but I finished by removing it.
Argument 2 just below also came to mind.

> 2. pg_ctl, pg_upgrade use their own error reporting mechanisms.

Yeah, for this reason I don't think that it is a good idea to switch
those areas to use the parsing of option_utils.c.  Perhaps we should
consider switching pg_upgrade to have a better logging infra, but
there are also reasons behind what we have now.  pg_ctl is out of
scope as it needs to cover WIN32 event logging.

> 3. parameters that take real numbers doesn't fit the scheme specifying
>  range borders. For example boundary values may or may not be included
>  in the range.

This concerns only pgbench, which I'd be fine to let as-is.

> 4. Most of the errors are PG_LOG_ERROR, but a few ones are
>  PG_LOG_FATAL.

I would take it that pgbench is inconsistent with the rest.  Note that
pg_dump uses fatal(), but that's just a wrapper to pg_log_error().

> loglevel specifies the loglevel to use to emit error messages. If it
> is the newly added item PG_LOG_OMIT, the function never emits an error
> message. Addition to that, the return type is changed to an enum which
> indicates what trouble the given string has. The caller can print
> arbitrary error messages consulting the value. (killproc in pg_ctl.c)

I am not much of a fan of that.  If we do so, what's the point in having
a dependency on logging.c in option_utils.c anyway?  This OMIT option
only exists to bypass the specific logging needs where this gets
added.  That does not seem like a well-adapted design to me in the long
term, nor am I a fan of specific error codes for a code path that's just
going to be used to parse command options.

> I added two more similar functions option_parse_long/double. The
> former is a clone of _int. The latter doesn't perform explicit range
> checks for the reason described above.

These have a limited impact, so I would limit things to int32 for
now.

> Maybe we need to make pg_upgrade use the common-logging features
> instead of its own, but it is not included in this patch.

Maybe.  That would be good in the long term, though its case is very
particular.

> The attached patch needs more polish but should be enough to tell what
> I have in my mind.

This breaks some of the TAP tests of pgbench and pg_dump, on a quick
look.

The checks for the port value in pg_receivewal and pg_recvlogical are
strange to have.  We don't care about that in any other tools.

The number of checks for --jobs and workers could be made more
consistent across the board, but I have let that out for now.

Hacking on that, I finished with the attached.  It is less
ambitious, but still very useful, as it removes a dozen custom error
messages in favor of the two introduced in option_utils.c.  On
top of that, it reduces the code a bit:
 15 files changed, 156 insertions(+), 169 deletions(-) 

What do you think?
--
Michael
From 3aaf98a9e53d90ca2ac8c2734ae77487aa03843f Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 21 Jul 2021 20:42:13 +0900
Subject: [PATCH v4] Introduce and use routine for parsing of int32 options

---
 src/include/fe_utils/option_utils.h|  3 ++
 src/fe_utils/option_utils.c| 39 ++
 src/bin/pg_amcheck/pg_amcheck.c|  8 ++-
 src/bin/pg_basebackup/pg_basebackup.c  | 17 +++---
 src/bin/pg_basebackup/pg_receivewal.c  | 23 +++--
 src/bin/pg_basebackup/pg_recvlogical.c | 25 -
 src/bin/pg_checksums/Makefile  |  3 ++
 src/bin/pg_checksums/pg_checksums.c|  9 ++--
 src/bin/pg_dump/pg_dump.c  | 41 +++
 src/bin/pg_dump/pg_restore.c   | 31 +--
 src/bin/pg_dump/t/001_basic.pl |  8 +--
 src/bin/pgbench/pgbench.c  | 60 +++---
 src/bin/pgbench/t/002_pgbench_no_server.pl | 18 +++
 src/bin/scripts/reindexdb.c|  9 ++--
 src/bin/scripts/vacuumdb.c | 31 ---
 15 files changed, 156 insertions(+), 169 deletions(-)

diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index d653cb94e3..e999d56ec0 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -19,5 +19,8 @@ typedef void (*help_handler) (const char *progname);
 extern void handle_help_version_opts(int argc, char *argv[],
 	 const char *fixed_progname,
 	 help_handle

Re: ORDER BY pushdowns seem broken in postgres_fdw

2021-07-21 Thread Ronan Dunklau
On Wednesday, 21 July 2021 at 11:05:14 CEST, Ronan Dunklau wrote:
> On Wednesday, 21 July 2021 at 04:25:00 CEST, David Rowley wrote:
> > Here the test claims that it wants to ensure that the order by using
> > operator(public.<^) is not pushed down into the foreign scan.
> > However, unless I'm mistaken, it seems there's a completely wrong
> > assumption there that the planner would even attempt that.  In current
> > master we don't add PathKeys for ORDER BY aggregates, why would that
> > sort get pushed down in the first place?
> 
> The whole aggregate, including it's order by clause, can be pushed down so
> there is nothing related to pathkeys here.
> 
> > If I adjust that query to something that would have the planner set
> > pathkeys for, it does push the ORDER BY to the foreign server without
> > any consideration that the sort operator is not shippable to the
> > foreign server.
> > 
> > Am I missing something here, or is postgres_fdw.c's
> > get_useful_pathkeys_for_relation() just broken?
> 
> I think you're right, we need to add a check if the opfamily is shippable.
> I'll submit a patch for that including regression tests.
> 

In fact the support for generating the correct USING clause was nonexistent
too, so that needed a bit more work.

The attached patch does the following:
  - verify the opfamily is shippable to keep pathkeys
  - generate a correct order by clause using the actual operator.

The second part needed a bit of refactoring: the find_em_expr_for_input_target 
and find_em_expr_for_rel need to return the whole EquivalenceMember, because we 
can't know the type used by the opfamily from the expression (example: the 
expression could be of type intarray, but the type used by the opfamily could 
be anyarray).

I also moved the "USING "' string generation to a separate function 
since it's now used by appendAggOrderBy and appendOrderByClause.

find_em_expr_for_rel() is exposed in optimizer/paths.h, so I kept the
existing function, which returns the expr directly, in case it is used out of
tree.
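
Presumably the kept compatibility function ends up as a thin wrapper,
roughly like this sketch (the exact shape in the attached patch may differ):

    Expr *
    find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
    {
        EquivalenceMember *em = find_em_for_rel(ec, rel);

        return em ? em->em_expr : NULL;
    }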



-- 
Ronan Dunklau

commit 2376cc6656853987e8f08e9b8f444bf391a9c269
Author: Ronan Dunklau 
Date:   Wed Jul 21 12:44:41 2021 +0200

Fix postgres_fdw PathKey's handling.

The operator family being used for the sort was completely
ignored, and as such its existence on the foreign server was not
checked.

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 31919fda8c..2c986d3325 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -37,9 +37,11 @@
 #include "access/sysattr.h"
 #include "access/table.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_amop.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_operator.h"
+#include "catalog/pg_opfamily.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
 static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
    Index ignore_rel, List **ignore_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
+static void appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context);
 static void appendAggOrderBy(List *orderList, List *targetList,
    deparse_expr_cxt *context);
 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
@@ -910,6 +913,26 @@ is_foreign_param(PlannerInfo *root,
return false;
 }
 
+/* Returns true if the given pathkey can be evaluated on the remote side
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,
+  RelOptInfo *baserel,
+  PathKey *pathkey)
+{
+   EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+   EquivalenceMember *em;
+   PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+
+   if (pathkey_ec->ec_has_volatile)
+   return false;
+   return (!pathkey_ec->ec_has_volatile &&
+   (em = find_em_for_rel(pathkey_ec, baserel)) &&
+   is_foreign_expr(root, baserel, em->em_expr) &&
+   is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo));
+}
+
+
 /*
  * Convert type OID + typmod info into a type name we can ship to the remote
  * server.  Someplace else had better have verified that this type name is
@@ -3125,6 +3148,37 @@ deparseAggref(Aggref *node, deparse_expr_cxt *context)
appendStringInfoChar(buf, ')');
 }
 
+/*
+ * Append the USING <operator> part of an ORDER BY clause
+ */
+static void
+appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context)
+{
+   StringInfo  buf = context->buf;
+   TypeCacheEntry *typentry;
+   typentry = lookup_t

Re: Micro-optimizations to avoid some strlen calls.

2021-07-21 Thread Ranier Vilela
On Wed, 21 Jul 2021 at 07:44, David Rowley wrote:

> On Tue, 20 Jul 2021 at 10:49, Ranier Vilela  wrote:
> > There are some places, where strlen can have an overhead.
> > This patch tries to fix this.
>
> I'm with Michael and David on this.
>
> I don't really feel like doing;
>
> - snprintf(buffer, sizeof(buffer), "E%s%s\n",
> + buflen = snprintf(buffer, sizeof(buffer), "E%s%s\n",
>   _("could not fork new process for connection: "),
>
> is a good idea.  I'm unsure if you're either not aware of the value
> that snprintf() returns or just happen to think an overflow is
> unlikely enough because you're convinced that 1000 chars are always
> enough to fit this translatable string.   I'd say if we were 100%
> certain of that then it might as well become sprintf() instead.
> However, I imagine you'll struggle to get people to side with you that
> taking this overflow risk would be worthwhile given your lack of any
> evidence that anything actually has become meaningfully faster as a
> result of any of these changes.
>
I get your point.
Relying only on the return value of snprintf really is a bad idea.
In this case, the right way would be:

snprintf(buffer, sizeof(buffer), "E%s%s\n",
         _("could not fork new process for connection: "),
         strerror(errnum));
buflen = strlen(buffer);

That way the buffer length does not have to be recomputed on every retry
if the send (rc) fails.
Thanks for the tip about snprintf's return value, even though that was not
the intention here. This is what I call a bad interface.
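
For completeness, a small sketch (not part of the proposed patch) of what using
the snprintf() return value safely would require; the Min() clamp (the usual
c.h macro) is needed because snprintf() reports the length it wanted to write,
which can exceed the buffer:

	int		rc;
	int		buflen;

	rc = snprintf(buffer, sizeof(buffer), "E%s%s\n",
				  _("could not fork new process for connection: "),
				  strerror(errnum));
	/* snprintf reports the length it *wanted* to write, so clamp it */
	/* (the rc < 0 encoding-error case is ignored here for brevity)  */
	buflen = Min(rc, (int) sizeof(buffer) - 1);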

regards,
Ranier Vilela



Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

2021-07-21 Thread David Rowley
On Wed, 21 Jul 2021 at 23:50, Michael Paquier  wrote:
> Hacking on that, I am finishing with the attached.  It is less
> ambitious, still very useful as it removes a dozen of custom error
> messages in favor of the two ones introduced in option_utils.c.  On
> top of that this reduces a bit the code:
>  15 files changed, 156 insertions(+), 169 deletions(-)
>
> What do you think?

This is just a driveby review, but I think that it's good to see no
increase in the number of lines of code to make these improvements.
It's also good to see the added consistency introduced by this patch.

I didn't test the patch, so this is just from reading through.

I wondered about the TAP tests here:

command_fails_like(
[ 'pg_dump', '-j', '-1' ],
qr/\Qpg_dump: error: -j\/--jobs must be in range 0..2147483647\E/,
'pg_dump: invalid number of parallel jobs');

command_fails_like(
[ 'pg_restore', '-j', '-1', '-f -' ],
qr/\Qpg_restore: error: -j\/--jobs must be in range 0..2147483647\E/,
'pg_restore: invalid number of parallel jobs');

I see both of these are limited to 64 on Windows. Won't those fail on Windows?

I also wondered if it would be worth doing #define MAX_JOBS  somewhere
away from the option parsing code.  This part is pretty ugly:

/*
* On Windows we can only have at most MAXIMUM_WAIT_OBJECTS
* (= 64 usually) parallel jobs because that's the maximum
* limit for the WaitForMultipleObjects() call.
*/
if (!option_parse_int(optarg, "-j/--jobs", 0,
#ifdef WIN32
  MAXIMUM_WAIT_OBJECTS,
#else
  INT_MAX,
#endif
  &numWorkers))
exit(1);
break;
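
To make that concrete, a rough sketch of the kind of macro being suggested (the
name and the header it would live in are assumptions here, not part of the
posted patch):

/*
 * Hypothetical sketch only.  On Windows, WaitForMultipleObjects() can wait
 * on at most MAXIMUM_WAIT_OBJECTS (usually 64) handles, so that is the
 * ceiling for parallel jobs; elsewhere any positive int is acceptable.
 */
#ifdef WIN32
#define PG_MAX_JOBS		MAXIMUM_WAIT_OBJECTS
#else
#define PG_MAX_JOBS		INT_MAX
#endif

which would reduce the option-parsing code to:

		if (!option_parse_int(optarg, "-j/--jobs", 0, PG_MAX_JOBS,
							  &numWorkers))
			exit(1);
		break;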

David




Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

2021-07-21 Thread Michael Paquier
On Thu, Jul 22, 2021 at 12:32:39AM +1200, David Rowley wrote:
> I see both of these are limited to 64 on windows. Won't those fail on Windows?

Yes, thanks, they would.  I would just cut the range numbers from the
expected output here.  This does not matter in terms of coverage
either.

> I also wondered if it would be worth doing #define MAX_JOBS somewhere
> away from the option parsing code.  This part is pretty ugly:

Agreed as well.  pg_dump and pg_restore have their own idea of
parallelism in parallel.{c,h}.  What about putting MAX_JOBS in
parallel.h then?
--
Michael




Re: Use extended statistics to estimate (Var op Var) clauses

2021-07-21 Thread Dean Rasheed
On Tue, 20 Jul 2021 at 19:28, Tomas Vondra
 wrote:
>
> > The new code in statext_is_compatible_clause_internal() is a little
> > hard to follow because some of the comments aren't right
>
> I ended up doing something slightly different - examine_opclause_args
> now "returns" a list of expressions, instead of explicitly setting two
> parameters. That means we can do a simple foreach() here, which seems
> cleaner. It means we have to extract the expressions from the list in a
> couple places, but that seems acceptable. Do you agree?

Yes, that looks much neater.

> > In mcv_get_match_bitmap(), perhaps do the RESULT_IS_FINAL() checks
> > first in each loop.
>
> This is how master already does that now, and I wonder if it's done in
> this order intentionally. It's not clear to me doing it in the other way
> would be faster?

Ah OK, it just felt more natural to do it the other way round. I
suppose though, that for the first clause, the is-final check isn't
going to catch anything, whereas the is-null checks might. For the
remaining clauses, it will depend on the data as to which way is
faster, but it probably isn't going to make any noticeable difference
either way. So, although it initially seems a bit counter-intuitive,
it's probably better the way it is.

> I guess the last thing is maybe mentioning this in
> the docs, adding an example etc.

Yeah, good idea.

Regards,
Dean




Re: when the startup process doesn't (logging startup delays)

2021-07-21 Thread Justin Pryzby
I think walkdir() should only call LogStartupProgress(FSYNC_IN_PROGRESS, path);
when action == datadir_fsync_fname.

ResetUnloggedRelations() is calling
ResetUnloggedRelationsInTablespaceDir("base", op);
before calling InitStartupProgress().

This shows StartupXLOG() calling ResetUnloggedRelations() twice.
Should they both be shown?  If so, maybe they should be distinguished, as here:

elog(DEBUG1, "resetting unlogged relations: cleanup %d init %d",
 (op & UNLOGGED_RELATION_CLEANUP) != 0,
 (op & UNLOGGED_RELATION_INIT) != 0);

On Wed, Jul 21, 2021 at 12:52:24PM +0530, Nitin Jadhav wrote:
> 2021-07-20 18:47:32.046 IST [102230] LOG:  listening on IPv4 address 
> "127.0.0.1", port 5445
> 2021-07-20 18:47:32.048 IST [102230] LOG:  listening on Unix socket 
> "/tmp/.s.PGSQL.5445"
> 2021-07-20 18:47:32.051 IST [102234] LOG:  database system was interrupted; 
> last known up at 2021-07-20 18:46:27 IST
> 2021-07-20 18:47:32.060 IST [102234] LOG:  data directory sync (fsync) 
> complete after 0.00 s
> 2021-07-20 18:47:32.060 IST [102234] LOG:  database system was not properly 
> shut down; automatic recovery in progress
> 2021-07-20 18:47:32.063 IST [102234] LOG:  unlogged relations reset after 
> 0.00 s
> 2021-07-20 18:47:32.063 IST [102234] LOG:  redo starts at 0/14EF418
> 2021-07-20 18:47:33.063 IST [102234] LOG:  redo in progress, elapsed time: 
> 1.00 s, current LSN: 0/5C13D28
> 2021-07-20 18:47:34.063 IST [102234] LOG:  redo in progress, elapsed time: 
> 2.00 s, current LSN: 0/A289160
> 2021-07-20 18:47:35.063 IST [102234] LOG:  redo in progress, elapsed time: 
> 3.00 s, current LSN: 0/EE2DE10
> 2021-07-20 18:47:35.563 IST [102234] LOG:  invalid record length at 
> 0/115C63E0: wanted 24, got 0
> 2021-07-20 18:47:35.563 IST [102234] LOG:  redo done at 0/115C63B8 system 
> usage: CPU: user: 3.58 s, system: 0.14 s, elapsed: 3.50 s
> 2021-07-20 18:47:35.564 IST [102234] LOG:  unlogged relations reset after 
> 0.00 s
> 2021-07-20 18:47:35.706 IST [102230] LOG:  database system is ready to accept 
> connections

-- 
Justin




Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

2021-07-21 Thread David Rowley
On Thu, 22 Jul 2021 at 00:44, Michael Paquier  wrote:
>
> On Thu, Jul 22, 2021 at 12:32:39AM +1200, David Rowley wrote:
> > I see both of these are limited to 64 on windows. Won't those fail on 
> > Windows?
>
> Yes, thanks, they would.  I would just cut the range numbers from the
> expected output here.  This does not matter in terms of coverage
> either.

Sounds good.

> x> I also wondered if it would be worth doing #define MAX_JOBS  somewhere
> > away from the option parsing code.  This part is pretty ugly:
>
> Agreed as well.  pg_dump and pg_restore have their own idea of
> parallelism in parallel.{c.h}.  What about putting MAX_JOBS in
> parallel.h then?

parallel.h looks ok to me.

David




Re: ORDER BY pushdowns seem broken in postgres_fdw

2021-07-21 Thread David Rowley
On Thu, 22 Jul 2021 at 00:28, Ronan Dunklau  wrote:
> The attached patch does the following:
>   - verify the opfamily is shippable to keep pathkeys
>   - generate a correct order by clause using the actual operator.

Thanks for writing the patch.

This is just a very superficial review.  I've not spent a great deal
of time looking at postgres_fdw code, so would rather some eyes that
were more familiar with the code looked too.

1. This comment needs to be updated. It still mentions
is_foreign_expr, which you're no longer calling.

  * is_foreign_expr would detect volatile expressions as well, but
  * checking ec_has_volatile here saves some cycles.
  */
- if (pathkey_ec->ec_has_volatile ||
- !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
- !is_foreign_expr(root, rel, em_expr))
+ if (!is_foreign_pathkey(root, rel, pathkey))

2. This is not a very easy return condition to read:

+ return (!pathkey_ec->ec_has_volatile &&
+ (em = find_em_for_rel(pathkey_ec, baserel)) &&
+ is_foreign_expr(root, baserel, em->em_expr) &&
+ is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo));

I think it would be nicer to break that down into something easier on
the eyes that could be commented a little more.

3. This comment is no longer true:

  * Find an equivalence class member expression, all of whose Vars, come from
  * the indicated relation.
  */
-Expr *
-find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+EquivalenceMember*
+find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)

Also, missing space after EquivalenceMember.

The comment can just be moved down to:

+Expr *
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+ EquivalenceMember *em = find_em_for_rel(ec, rel);
+ return em ? em->em_expr : NULL;
+}

and you can rewrite the one for find_em_for_rel.

David




Re: add 'noError' to euc_tw_and_big5.c

2021-07-21 Thread John Naylor
On Tue, Jul 20, 2021 at 10:35 PM Michael Paquier 
wrote:
>
> On Wed, Jul 21, 2021 at 02:15:14AM +, wangyu...@fujitsu.com wrote:
> > 'noError' argument was added at commit ea1b99a661,
> > but it seems to be neglected in euc_tw_and_big5.c Line 289.
> > please see the attachment.
>
> That sounds right to me.  Double-checking the area, I am not seeing
> another portion of the code to fix.

Pushed, but I forgot to give you review credit, sorry about that. Thanks
for taking a look!

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Bitmap reuse

2021-07-21 Thread David Rowley
On Wed, 21 Jul 2021 at 13:32, Tom Lane  wrote:
>
> David Rowley  writes:
> > I imagined Jeff was meaning the bitmap from the scan of foo_x_idx, not
> > the combined ANDed bitmap from both indexes.
>
> To re-use that, you'd need a way to prevent the upper node from
> destructively modifying the tidbitmap.

Yeah.  And that would slow things down in the case where it was only
executed once, as we'd need to make a copy of it to prevent the cached
version from being modified, regardless of whether it would ever be
used again.

Maybe the planner would need to be involved in making the decision of
whether the bitmap index scan should tuck away a carbon copy of the
resulting TIDBitmap after the first scan.  That way on rescan we could
just make a copy of the cached version and return that.  That saves
having to modify the callers to tell them not to damage the returned
TIDBitmap.

David




Re: Bitmap reuse

2021-07-21 Thread David Rowley
On Thu, 22 Jul 2021 at 01:54, David Rowley  wrote:
> Maybe the planner would need to be involved in making the decision of
> whether the bitmap index scan should tuck away a carbon copy of the
> resulting TIDBitmap after the first scan.  That way on rescan we could
> just make a copy of the cached version and return that.  That saves
> having to modify the callers to tell them not to damage the returned
> TIDBitmap.

Oh but, meh. Caching could blow out work_mem...  We might end up using
work_mem * 2.

David




Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2021-07-21 Thread Ronan Dunklau
On Wednesday, 21 July 2021 at 04:52:43 CEST, David Rowley wrote:
> Thanks for finding this.  I've made a few changes to make this case
> work as you describe. Please see attached v6 patches.
> 
> Because I had to add additional code to take the GROUP BY pathkeys
> into account when choosing the best ORDER BY agg pathkeys, the
> function that does that became a little bigger.  To try to reduce the
> complexity of it, I got rid of all the processing from the initial
> loop and instead it now uses the logic from the 2nd loop to find the
> best pathkeys.  The reason I'd not done that in the first place was
> because I'd thought I could get away without building an additional
> Bitmapset for simple cases, but that's probably fairly cheap compared
> to building Pathkeys.   With the additional complexity for the GROUP
> BY pathkeys, the extra code seemed not worth it.
> 
> The 0001 patch is the ORDER BY aggregate code.  0002 is to fix up some
> broken regression tests in postgres_fdw that 0001 caused.  It appears
> that 0001 uncovered a bug in the postgres_fdw code.  I've reported
> that in [1]. If that turns out to be a bug then it'll need to be fixed
> before this can progress.

I tested the 0001 patch against both HEAD and my proposed bugfix for 
postgres_fdw.

The problem is that the ordered aggregate is not pushed down anymore, although
the underlying Sort node is still correctly pushed down.

This comes from the fact that postgres_fdw grouping path never contains any 
pathkey. Since the cost is fuzzily the same between the pushed-down aggregate 
and the locally performed one, the tie is broken against the pathkeys.

Ideally we would add the group pathkeys to the grouping path, but this would 
add an additional ORDER BY expression matching the GROUP BY. Moreover, some 
triaging of the pathkeys would be necessary since we now mix the sort-in-
aggref pathkeys with the group_pathkeys.

-- 
Ronan Dunklau






Re: ORDER BY pushdowns seem broken in postgres_fdw

2021-07-21 Thread Ronan Dunklau
On Wednesday, 21 July 2021 at 15:45:15 CEST, you wrote:
> On Thu, 22 Jul 2021 at 00:28, Ronan Dunklau  wrote:
> > The attached patch does the following:
> >   - verify the opfamily is shippable to keep pathkeys
> >   - generate a correct order by clause using the actual operator.
> 
> Thanks for writing the patch.
> 
> This is just a very superficial review.  I've not spent a great deal
> of time looking at postgres_fdw code, so would rather some eyes that
> were more familiar with the code looked too.

Thank you for the review.

> 
> 1. This comment needs to be updated. It still mentions
> is_foreign_expr, which you're no longer calling.
> 
>   * is_foreign_expr would detect volatile expressions as well, but
>   * checking ec_has_volatile here saves some cycles.
>   */
> - if (pathkey_ec->ec_has_volatile ||
> - !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
> - !is_foreign_expr(root, rel, em_expr))
> + if (!is_foreign_pathkey(root, rel, pathkey))
> 
Done. By the way, the comment just above mentions we don't have a way to use a 
prefix pathkey, but I suppose we should revisit that now that we have 
IncrementalSort. I'll mark it in my todo list for another patch.

> 2. This is not a very easy return condition to read:
> 
> + return (!pathkey_ec->ec_has_volatile &&
> + (em = find_em_for_rel(pathkey_ec, baserel)) &&
> + is_foreign_expr(root, baserel, em->em_expr) &&
> + is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo));
> 
> I think it would be nicer to break that down into something easier on
> the eyes that could be commented a little more.

Done, let me know what you think about it.

> 
> 3. This comment is no longer true:
> 
>   * Find an equivalence class member expression, all of whose Vars, come
> from * the indicated relation.
>   */
> -Expr *
> -find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
> +EquivalenceMember*
> +find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
> 
> Also, missing space after EquivalenceMember.
> 
> The comment can just be moved down to:
> 
> +Expr *
> +find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
> +{
> + EquivalenceMember *em = find_em_for_rel(ec, rel);
> + return em ? em->em_expr : NULL;
> +}
> 
> and you can rewrite the one for find_em_for_rel.

I have done it the other way around. I'm not sure we really need to keep the 
find_em_expr_for_rel function on HEAD. If we decide to backpatch, it would need 
to be kept though. 

-- 
Ronan Dunklau

commit f7b3dc81878bf2ac503899f010eb18f390a64e37
Author: Ronan Dunklau 
Date:   Wed Jul 21 12:44:41 2021 +0200

Fix postgres_fdw PathKey's handling.

The operator family being used for the sort was completely
ignored, and as such its existence on the foreign server was not
checked.

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 31919fda8c..5efefff65e 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -37,9 +37,11 @@
 #include "access/sysattr.h"
 #include "access/table.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_amop.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_namespace.h"
 #include "catalog/pg_operator.h"
+#include "catalog/pg_opfamily.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -180,6 +182,7 @@ static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 			   Index ignore_rel, List **ignore_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
+static void appendOrderbyUsingClause(Oid sortop, Oid sortcoltype, deparse_expr_cxt *context);
 static void appendAggOrderBy(List *orderList, List *targetList,
 			 deparse_expr_cxt *context);
 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context);
@@ -910,6 +913,37 @@ is_foreign_param(PlannerInfo *root,
 	return false;
 }
 
+/* Returns true if the given pathkey can be evaluated on the remote side
+ */
+bool
+is_foreign_pathkey(PlannerInfo *root,
+   RelOptInfo *baserel,
+   PathKey *pathkey)
+{
+	EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+	EquivalenceMember *em;
+	PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+
+	/*
+	 * is_foreign_expr would detect volatile expressions as well, but
+	 * checking ec_has_volatile here saves some cycles.
+	 */
+	if (pathkey_ec->ec_has_volatile)
+		return false;
+
+	em = find_em_for_rel(pathkey_ec, baserel);
+	if (em == NULL)
+		return false;
+
+	/*
+	 * Finally, verify the found member's expression is foreign and its operator
+	 * family is shippable.
+	 */
+	return (is_foreign_expr(root, baserel, em->em_expr) &&
+			is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo));
+}
+
+
 /*
  * Convert type OID + typmod info into a type name we can ship to the remote
  * server.  Someplace else had better have verified that this type name is

Re: postgresql.conf.sample missing units

2021-07-21 Thread John Naylor
On Mon, Jul 19, 2021 at 10:31 AM John Naylor 
wrote:
>
> On Mon, Jul 19, 2021 at 5:44 AM Pavel Luzanov 
wrote:
> >
> > Hello,
> >
> > I found that the start section of the postgresql.conf file is missing a
> > description of two units: bytes (appeared in version 11) and
> > microseconds (in version 12).
> >
> > The attached patch makes these changes to the postgresql.conf.sample
file.
>
> Seems like an oversight. I'll commit this soon barring objections.

I pushed this and backpatched to v12. I also extracted just the "bytes"
part and applied it to v11.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: refactoring basebackup.c

2021-07-21 Thread Robert Haas
On Tue, Jul 20, 2021 at 4:03 PM Mark Dilger
 wrote:
> I was only imagining having a callback for injecting manifests or recovery 
> configurations.  It is not necessary that this be done in the current patch 
> set, or perhaps ever.

A callback where?

I actually think the ideal scenario would be if the server always did
all the work and the client wasn't involved in editing the tarfile,
but it's not super-easy to get there from here. We could add an option
to tell the server whether to inject the manifest into the archive,
which probably wouldn't be too bad. For it to inject the recovery
configuration, we'd have to send that configuration to the server
somehow. I thought about using COPY BOTH mode instead of COPY OUT mode
to allow for stuff like that, but it seems pretty complicated, and I
wasn't really sure that we'd get consensus that it was better even if
I went to the trouble of coding it up.

If we don't do that and stick with the current system where it's
handled on the client side, then I agree that we want to separate the
tar-specific concerns from the injection-type concerns, which the
patch does by making those operations different kinds of bbstreamer
that know only a relatively limited amount about what each other are
doing. You get [server] => [tar parser] => [recovery injector] => [tar
archiver], where the [recovery injector] step nukes the archive file
headers for the files it adds or modifies, and the [tar archiver] step
fixes them up again. So the only thing that the [recovery injector]
piece needs to know is that if it makes any changes to a file, it
should send that file to the next step with a 0-length archive header,
and all the [tar archiver] piece needs to know is that already-valid
headers can be left alone and 0-length ones need to be regenerated.

There may be a better scheme; I don't think this is perfectly elegant.
I do think it's better than what we've got now.
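
As a purely illustrative sketch of that chain (the constructor names and
signatures below are assumptions for illustration, not the actual API of the
patch):

	bbstreamer *streamer;

	/*
	 * Build the chain back-to-front: each constructor wraps the streamer it
	 * feeds into, so data flows
	 * server -> tar parser -> recovery injector -> tar archiver -> writer.
	 */
	streamer = bbstreamer_plain_writer_new(filename, file);	/* final sink (hypothetical) */
	streamer = bbstreamer_tar_archiver_new(streamer);			/* regenerates zeroed headers */
	streamer = bbstreamer_recovery_injector_new(streamer,
												recoveryconfcontents);	/* adds/edits files */
	streamer = bbstreamer_tar_parser_new(streamer);				/* parses the server stream */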

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [POC] verifying UTF-8 using SIMD instructions

2021-07-21 Thread Thomas Munro
On Sat, Mar 13, 2021 at 4:37 AM John Naylor
 wrote:
> On Fri, Mar 12, 2021 at 9:14 AM Amit Khandekar  wrote:
> > I was not thinking about auto-vectorizing the code in
> > pg_validate_utf8_sse42(). Rather, I was considering auto-vectorization
> > inside the individual helper functions that you wrote, such as
> > _mm_setr_epi8(), shift_right(), bitwise_and(), prev1(), splat(),
>
> If the PhD holders who came up with this algorithm thought it possible to do 
> it that way, I'm sure they would have. In reality, simdjson has different 
> files for SSE4, AVX, AVX512, NEON, and Altivec. We can incorporate any of 
> those as needed. That's a PG15 project, though, and I'm not volunteering.

Just for fun/experimentation, here's a quick (and probably too naive)
translation of those helper functions to NEON, on top of the v15
patch.
From 1464c22da33117900341496d03b92dec5be2a62a Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Thu, 22 Jul 2021 02:05:06 +1200
Subject: [PATCH 1/2] XXX Make SIMD code more platform neutral.

Move SIMD code into pg_utf_simd.c to experiment with the idea of a
shared implementation across architectures.  Introduce pg_u8x16_t to
abstract vector type.

XXX Experiment grade code only
---
 configure |  18 +--
 configure.ac  |  18 +--
 src/include/pg_config.h.in|  14 +-
 src/include/port/pg_utf8.h|  10 +-
 src/port/Makefile |   8 +-
 ...g_utf8_sse42_choose.c => pg_utf8_choose.c} |  10 +-
 src/port/{pg_utf8_sse42.c => pg_utf8_simd.c}  | 148 +-
 7 files changed, 114 insertions(+), 112 deletions(-)
 rename src/port/{pg_utf8_sse42_choose.c => pg_utf8_choose.c} (88%)
 rename src/port/{pg_utf8_sse42.c => pg_utf8_simd.c} (76%)

diff --git a/configure b/configure
index 30969840b1..df546b641c 100755
--- a/configure
+++ b/configure
@@ -18442,13 +18442,13 @@ fi
 #
 # You can override this logic by setting the appropriate USE_*_UTF8 flag to 1
 # in the template or configure command line.
-if test x"$USE_SSE42_UTF8" = x"" && test x"$USE_SSE42_UTF8_WITH_RUNTIME_CHECK" 
= x"" && test x"$USE_FALLBACK_UTF8" = x""; then
+if test x"$USE_SIMD_UTF8" = x"" && test x"$USE_SIMD_UTF8_WITH_RUNTIME_CHECK" = 
x"" && test x"$USE_FALLBACK_UTF8" = x""; then
   if test x"$pgac_sse42_intrinsics" = x"yes" && test x"$SSE4_2_TARGETED" = 
x"1" ; then
-USE_SSE42_UTF8=1
+USE_SIMD_UTF8=1
   else
 # the CPUID instruction is needed for the runtime check.
 if test x"$pgac_sse42_intrinsics" = x"yes" && (test x"$pgac_cv__get_cpuid" 
= x"yes" || test x"$pgac_cv__cpuid" = x"yes"); then
-  USE_SSE42_UTF8_WITH_RUNTIME_CHECK=1
+  USE_SIMD_UTF8_WITH_RUNTIME_CHECK=1
 else
   # fall back to algorithm which doesn't require any special
   # CPU support.
@@ -18461,19 +18461,19 @@ fi
 # Note: We need the fallback for error handling in all builds.
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking which UTF-8 validator to 
use" >&5
 $as_echo_n "checking which UTF-8 validator to use... " >&6; }
-if test x"$USE_SSE42_UTF8" = x"1"; then
+if test x"$USE_SIMD_UTF8" = x"1"; then
 
-$as_echo "#define USE_SSE42_UTF8 1" >>confdefs.h
+$as_echo "#define USE_SIMD_UTF8 1" >>confdefs.h
 
-  PG_UTF8_OBJS="pg_utf8_sse42.o pg_utf8_fallback.o"
+  PG_UTF8_OBJS="pg_utf8_simd.o pg_utf8_fallback.o"
   { $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2" >&5
 $as_echo "SSE 4.2" >&6; }
 else
-  if test x"$USE_SSE42_UTF8_WITH_RUNTIME_CHECK" = x"1"; then
+  if test x"$USE_SIMD_UTF8_WITH_RUNTIME_CHECK" = x"1"; then
 
-$as_echo "#define USE_SSE42_UTF8_WITH_RUNTIME_CHECK 1" >>confdefs.h
+$as_echo "#define USE_SIMD_UTF8_WITH_RUNTIME_CHECK 1" >>confdefs.h
 
-PG_UTF8_OBJS="pg_utf8_sse42.o pg_utf8_fallback.o pg_utf8_sse42_choose.o"
+PG_UTF8_OBJS="pg_utf8_simd.o pg_utf8_fallback.o pg_utf8_choose.o"
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: SSE 4.2 with runtime 
check" >&5
 $as_echo "SSE 4.2 with runtime check" >&6; }
   else
diff --git a/configure.ac b/configure.ac
index 5e2b4717c1..1606a80fb7 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2217,13 +2217,13 @@ AC_SUBST(PG_CRC32C_OBJS)
 #
 # You can override this logic by setting the appropriate USE_*_UTF8 flag to 1
 # in the template or configure command line.
-if test x"$USE_SSE42_UTF8" = x"" && test x"$USE_SSE42_UTF8_WITH_RUNTIME_CHECK" 
= x"" && test x"$USE_FALLBACK_UTF8" = x""; then
+if test x"$USE_SIMD_UTF8" = x"" && test x"$USE_SIMD_UTF8_WITH_RUNTIME_CHECK" = 
x"" && test x"$USE_FALLBACK_UTF8" = x""; then
   if test x"$pgac_sse42_intrinsics" = x"yes" && test x"$SSE4_2_TARGETED" = 
x"1" ; then
-USE_SSE42_UTF8=1
+USE_SIMD_UTF8=1
   else
 # the CPUID instruction is needed for the runtime check.
 if test x"$pgac_sse42_intrinsics" = x"yes" && (test x"$pgac_cv__get_cpuid" 
= x"yes" || test x"$pgac_cv__cpuid" = x"yes"); then
-  USE_SSE42_UTF8_WITH_RUNTIME_CHECK=1
+  USE_SIMD_UTF8_WITH_RUNTIME_CHECK=1
  

Re: shared-memory based stats collector

2021-07-21 Thread Andres Freund
Hi,

On 2021-07-21 17:09:49 +0900, Kyotaro Horiguchi wrote:
> At Mon, 19 Jul 2021 15:34:56 +0500, Ibrar Ahmed  wrote 
> in 
> > The patch does not apply, and require rebase,
> 
> Yeah, thank you very much for checking that. However, this patch is
> now developed in Andres' GitHub repository.  So I'm at a loss as to
> what to do about the failure.

I'll post a rebased version soon.

Greetings,

Andres Freund




Re: badly calculated width of emoji in psql

2021-07-21 Thread Jacob Champion
On Wed, 2021-07-21 at 00:08 +, Jacob Champion wrote:
> On Mon, 2021-07-19 at 13:13 +0200, Laurenz Albe wrote:
> > That could be adapted; the question is if the approach as such is
> > desirable or not.  This is necessarily a moving target, at the rate
> > that emojis are created and added to Unicode.
> 
> Sure. We already have code in the tree that deals with that moving
> target, though, by parsing apart pieces of the Unicode database. So the
> added maintenance cost should be pretty low.

(I am working on such a patch today and will report back.)

--Jacob


Re: refactoring basebackup.c

2021-07-21 Thread Mark Dilger



> On Jul 21, 2021, at 8:09 AM, Robert Haas  wrote:
> 
> A callback where?

If you were going to support lots of formats, not just tar, you might want the 
streamer class for each format to have a callback which sets up the injector, 
rather than having CreateBackupStreamer do it directly.  Even then, having now 
studied CreateBackupStreamer a bit more, the idea seems less appealing than it 
did initially.  I don't think it makes things any cleaner when only supporting 
tar, and maybe not even when supporting multiple formats, so I'll withdraw the 
suggestion.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: speed up verifying UTF-8

2021-07-21 Thread Vladimir Sitnikov
> I'm pretty confident this improvement is architecture-independent.

Thanks for testing it with different architectures.

It looks like the same utf8_advance function is good for both the fast path
and the slow path.
Then pg_utf8_verifychar could be removed altogether along with the
corresponding IS_*_BYTE_LEAD macros.

Vladimir


Re: Parallel scan with SubTransGetTopmostTransaction assert coredump

2021-07-21 Thread Robert Haas
On Wed, Jul 14, 2021 at 10:34 PM Greg Nancarrow  wrote:
> Unfortunately there is currently no test, code-comment, README or
> developer-discussion that definitively determines which approach (v2
> vs  v3/v4) is a valid fix for this issue.
> We don't know if having both the transaction and active snapshots in a
> parallel worker is intentional or not, and if so, why so?
> (certainly in the non-parallel case of the same statement execution,
> there is only one snapshot in question here - the obtained transaction
> snapshot is pushed as the active snapshot, as it is done in 95% of
> cases in the code)
> It seems that only the original code authors know how the snapshot
> handling in parallel-workers is MEANT to work, and they have yet to
> speak up about it here.
> At this point, we can only all agree that there is a problem to be fixed here.

Hi.

Thanks to Thomas Munro for drawing my attention to this thread. I
wasn't intentionally ignoring it, but there's a lot of email in the
world and only so much time.

When I wrote this code originally, the idea that I had in mind was
simply this: whatever state we have in the leader ought to be
reproduced in each worker. So if there's an active snapshot in the
leader, we ought to make that active in all the workers, and if
there's a transaction snapshot in the leader, we ought to make that
the transaction snapshot in all of the workers.

But I see now that my thinking was fuzzy, and I'm going to blame that
on the name GetTransactionSnapshot() being slightly misleading. If
IsolationUsesXactSnapshot() is true, then there's really such a thing
as a transaction snapshot and reproducing that in the worker is a
sensible thing to do. But when !IsolationUsesXactSnapshot(),
GetTransactionSnapshot() doesn't just "get the transaction snapshot",
because there isn't any such thing. It takes a whole new snapshot, on
the theory that you wouldn't be calling this function unless you had
finished up with the snapshot you got the last time you called this
function. And in the case of initiating parallel query, that is the
wrong thing.

I think that, at least in the case where IsolationUsesXactSnapshot()
is true, we need to make sure that calling GetTransactionSnapshot() in
a worker produces the same result that it would have produced in the
leader. Say one of the workers calls an sql or plpgsql function and
that function runs a bunch of SQL statements. It seems to me that
there's probably a way for this to result in calls inside the worker
to GetTransactionSnapshot(), and if that doesn't return the same
snapshot as in the leader, then we've broken MVCC.

What about when IsolationUsesXactSnapshot() is false? Perhaps it's OK
to just skip this altogether in that case. Certainly what we're doing
can't be right, because copying a snapshot that wouldn't have been
taken without parallel query can't ever be the right thing to do.
Perhaps we need to copy something else instead. I'm not really sure.

So I think v2 is probably on the right track, but wrong when the
transaction isolation level is REPEATABLE READ or SERIALIZABLE, and v3
and v4 just seem like unprincipled hacks that try to avoid the
assertion failure by lying about whether there's a problem.
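
To illustrate the direction only (a rough sketch, not the committed fix; the
serialization buffers are hypothetical placeholders):

	/* Only ship a "transaction snapshot" when one really exists. */
	Snapshot	active_snapshot = GetActiveSnapshot();

	if (IsolationUsesXactSnapshot())
	{
		/* REPEATABLE READ / SERIALIZABLE: reproduce the xact snapshot. */
		Snapshot	xact_snapshot = GetTransactionSnapshot();

		SerializeSnapshot(xact_snapshot, xact_snapshot_area);		/* hypothetical buffer */
	}

	/* In READ COMMITTED, workers would just use the leader's active snapshot. */
	SerializeSnapshot(active_snapshot, active_snapshot_area);		/* hypothetical buffer */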

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: refactoring basebackup.c

2021-07-21 Thread Robert Haas
On Wed, Jul 21, 2021 at 12:11 PM Mark Dilger
 wrote:
> If you were going to support lots of formats, not just tar, you might want 
> the streamer class for each format to have a callback which sets up the 
> injector, rather than having CreateBackupStreamer do it directly.  Even then, 
> having now studied CreateBackupStreamer a bit more, the idea seems less 
> appealing than it did initially.  I don't think it makes things any cleaner 
> when only supporting tar, and maybe not even when supporting multiple 
> formats, so I'll withdraw the suggestion.

Gotcha. I think if we had a lot of formats I'd probably make a
separate function where you passed in the file extension and archive
type and it hands you back a parser for the appropriate kind of
archive, or something like that. And then maybe a second, similar
function where you pass in the injector and archive type and it wraps
an archiver of the right type around it and hands that back. But I
don't think that's worth doing until we have 2 or 3 formats, which may
or may not happen any time in the forseeable future.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: speed up verifying UTF-8

2021-07-21 Thread John Naylor
On Wed, Jul 21, 2021 at 12:13 PM Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:
> It looks like the same utf8_advance function is good for both fast-path
and for the slow path.
> Then pg_utf8_verifychar could be removed altogether along with the
corresponding IS_*_BYTE_LEAD macros.

pg_utf8_verifychar() is a public function usually called
through pg_wchar_table[], so it needs to remain in any case.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Case expression pushdown

2021-07-21 Thread Tom Lane
Gilles Darold  writes:
> I'm attaching the v5 patch again as it doesn't appears in the Latest 
> attachment list in the commitfest.

So this has a few issues:

1. In foreign_expr_walker, you're failing to recurse to either the
"arg" or "defresult" subtrees of a CaseExpr, so that it would fail
to notice unshippable constructs within those.

2. You're also failing to guard against the hazard that a WHEN
expression within a CASE-with-arg has been expanded into something
that doesn't look like "CaseTestExpr = something".  As written,
this patch would likely dump core in that situation, and if it didn't
it would send nonsense to the remote server.  Take a look at the
check for that situation in ruleutils.c (starting at line 8764
as of HEAD) and adapt it to this.  Probably what you want is to
just deem the CASE un-pushable if it's been modified away from that
structure.  This is enough of a corner case that optimizing it
isn't worth a great deal of trouble ... but crashing is not ok.  (A rough
sketch of this kind of check appears after this list.)

3. A potentially uncomfortable issue for the CASE-with-arg syntax
is that the specific equality operator being used appears nowhere
in the decompiled expression, thus raising the question of whether
the remote server will interpret it the same way we did.  Given
that we restrict the values-to-be-compared to be of shippable
types, maybe this is safe in practice, but I have a bad feeling
about it.  I wonder if we'd be better off just refusing to ship
CASE-with-arg at all, which would a-fortiori avoid point 2.

4. I'm not sure that I believe any part of the collation handling.
There is the question of what collations will be used for the
individual WHEN comparisons, which can probably be left for
the recursive checks of the CaseWhen.expr subtrees to handle;
and then there is the separate issue of whether the CASE's result
collation (which arises from the CaseWhen.result exprs plus the
CaseExpr.defresult expr) can be deemed to be safely derived from
remote Vars.  I haven't totally thought through how that should
work, but I'm pretty certain that handling the CaseWhen's within
separate recursive invocations of foreign_expr_walker cannot
possibly get it right.  However, you'll likely have to flatten
those anyway (i.e., handle them within the loop in the CaseExpr
case) while fixing point 2.

5. This is a cosmetic point, but: the locations of the various
additions in deparse.c seem to have been chosen with the aid
of a dartboard.  We do have a convention for this sort of thing,
which is to lay out code concerned with different node types
in the same order that the node types are declared in *nodes.h.
I'm not sufficiently anal to want to fix the existing violations
of that rule that I see in deparse.c; but the fact that somebody
got this wrong before isn't license to make things worse.
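
To illustrate point 2 above, a rough sketch of the kind of structural check
meant there (freely adapted; the names and exact placement are assumptions,
not the ruleutils.c code itself):

	/*
	 * Sketch only: for CASE-with-arg, each WHEN clause should still look
	 * like "CaseTestExpr <op> something".  If the planner has rewritten it
	 * into anything else, deem the whole CASE un-pushable.
	 */
	if (caseexpr->arg != NULL)
	{
		ListCell   *lc;

		foreach(lc, caseexpr->args)
		{
			CaseWhen   *when = lfirst_node(CaseWhen, lc);
			Node	   *w = (Node *) when->expr;

			if (!IsA(w, OpExpr) ||
				!IsA(linitial(((OpExpr *) w)->args), CaseTestExpr))
				return false;	/* don't try to push this CASE down */
		}
	}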

regards, tom lane




Re: Have I found an interval arithmetic bug?

2021-07-21 Thread Bryn Llewellyn
> On 21-Jul-2021, at 02:58, Tom Lane  wrote:
> 
> Dean Rasheed  writes:
>> Hmm, looking at this whole thread, I have to say that I prefer the old
>> behaviour of spilling down to lower units.
> 
>> For example, with this patch:
> 
>> SELECT '0.5 weeks'::interval;
>> interval
>> --
>> 4 days
> 
>> which I don't think is really an improvement. My expectation is that
>> half a week is 3.5 days, and I prefer what it used to return, namely
>> '3 days 12:00:00'.
> 
> Yeah, that is clearly a significant dis-improvement.
> 
> In general, considering that (most of?) the existing behavior has stood
> for decades, I think we need to tread VERY carefully about changing it.
> I don't want to see this patch changing any case that is not indisputably
> broken.

It was me that started the enormous thread with the title “Have I found an 
interval arithmetic bug?” on 01-Apr-2021. I presented this testcase:

> select interval '-1.7 years';                          -- -1 years -8 mons
> 
> select interval '29.4 months';                         --  2 years  5 mons 12 days
> 
> select interval '-1.7 years 29.4 months';              --   8 mons 12 days  << wrong
> select interval '29.4 months -1.7 years';              --   9 mons 12 days
> 
> select interval '-1.7 years' + interval '29.4 months'; --   9 mons 12 days
> select interval '29.4 months' + interval '-1.7 years'; --   9 mons 12 days

The consensus was that the outcome that I flagged with “wrong” does indeed have 
that status. After all, it’s hard to see how anybody could intend this rule 
(that anyway holds in only some cases):

-a + b <> b - a

It seems odd that there’s been no recent reference to my testcase and how it 
behaves in the environment of Bruce’s patch.

I don’t recall the history of the thread. But Bruce took on the task of fixing 
this narrow issue. Anyway, somehow, the whole question of “spill down” came up 
for discussion. The rules aren’t documented and I’ve been unable to find any 
reference even to the phenomenon. I have managed to implement a model, in 
PL/pgSQL, that gets the same results as the native implementation in every one 
of many tests that I’ve done. I appreciate that this doesn’t prove that my 
model is correct. But it would seem that it must be on the right track. The 
rules that my PL/pgSQL uses are self-evidently whimsical—but they were needed 
precisely to get the same outcomes as the native implementation. There was some 
discussion of all this somewhere in this thread.

If memory serves, it was Tom who suggested changing the spill-down rules. This 
was possibly meant entirely rhetorically. But it seems that Bruce did set about 
implementing a change here. (I was unable to find a clear prose functional spec 
for the new behavior. Probably I didn't know where to look.)

There’s no doubt that a change in these rules would change the behavior of 
extant code. But then, in a purist sense, this is the case with any bug fix.

I’m simply waiting on a final ruling and final outcome.

Meanwhile, I’ve worked out a way to tame all this business (by using domain 
types and associated functionality) so that application code can deal 
confidently with only pure months, pure days, and pure seconds interval values 
(thinking of the internal [mm, dd, ss] representation). The scheme ensures that 
spill-down never occurs by rounding the years or the months field to integral 
values. If you want to add a “mixed” interval to a timestamp, then you simply 
add the different kinds of interval in the one expression. And you use 
parentheses to assert, visibly, the priority rule that you intend.

Because this is ordinary application code, there are no compatibility issues 
for me. My approach won’t see a change in behavior no matter what is decided 
about the present patch.



Re: Printing backtrace of postgres processes

2021-07-21 Thread vignesh C
On Wed, Jul 21, 2021 at 3:52 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:
>
> On Tue, Jul 13, 2021 at 9:03 PM vignesh C  wrote:
> >
> > On Wed, May 12, 2021 at 2:27 AM Robert Haas 
wrote:
> > >
> > > Maybe we should have a role that is specifically for server debugging
> > > type things. This kind of overlaps with Mark Dilger's proposal to try
> > > to allow SET for security-sensitive GUCs to be delegated via
> > > predefined roles. The exact way to divide that up is open to question,
> > > but it wouldn't seem crazy to me if the same role controlled the
> > > ability to do this plus the ability to set the GUCs
> > > backtrace_functions, debug_invalidate_system_caches_always,
> > > wal_consistency_checking, and maybe a few other things.
> >
> > +1 for the idea of having a new role for this. Currently I have
> > implemented this feature to be supported only for the superuser. If we
> > are ok with having a new role to handle debugging features, I will
> > make a 002 patch to handle this.
>
> I see that there are a good number of user functions that are
> accessible only by superuser (I searched for "if (!superuser())" in
> the code base). I agree with the intention to not overload the
> superuser anymore and have a few other special roles to delegate the
> existing superuser-only functions to them. In that case, are we going
> to revisit and reassign all the existing superuser-only functions?

As Robert pointed out, this idea is based on Mark Dilger's proposal. Mark
Dilger is already handling many of them at [1]. I'm proposing this patch
only for server debugging functionalities based on Robert's suggestions at
[2].
[1] - https://commitfest.postgresql.org/33/3223/
[2] -
https://www.postgresql.org/message-id/CA%2BTgmoZz%3DK1bQRp0Ug%3D6uMGFWg-6kaxdHe6VSWaxq0U-YkppYQ%40mail.gmail.com

Regards,
Vignesh


Re: Have I found an interval arithmetic bug?

2021-07-21 Thread Tom Lane
Bryn Llewellyn  writes:
> On 21-Jul-2021, at 02:58, Tom Lane  wrote:
>> In general, considering that (most of?) the existing behavior has stood
>> for decades, I think we need to tread VERY carefully about changing it.
>> I don't want to see this patch changing any case that is not indisputably
>> broken.

> It was me that started the enormous thread with the title “Have I found an 
> interval arithmetic bug?” on 01-Apr-2021. I presented this testcase:

>> select interval '-1.7 years';                          -- -1 years -8 mons
>> 
>> select interval '29.4 months';                         --  2 years  5 mons 12 days
>> 
>> select interval '-1.7 years 29.4 months';              --   8 mons 12 days  << wrong
>> select interval '29.4 months -1.7 years';              --   9 mons 12 days
>> 
>> select interval '-1.7 years' + interval '29.4 months'; --   9 mons 12 days
>> select interval '29.4 months' + interval '-1.7 years'; --   9 mons 12 days

> The consensus was that the outcome that I flagged with “wrong” does indeed 
> have that status.

Yeah, I think it's self-evident that your last four cases should
produce the same results.  Whether '9 mons 12 days' is the best
possible result is debatable --- in a perfect world, maybe we'd
produce '9 mons' exactly --- but given that the first two cases
produce what they do, that does seem self-consistent.  I think
we should be setting out to fix that outlier without causing
any of the other five results to change.

regards, tom lane




Re: Added schema level support for publication.

2021-07-21 Thread vignesh C
On Mon, Jul 19, 2021 at 9:32 AM tanghy.f...@fujitsu.com <
tanghy.f...@fujitsu.com> wrote:
>
> On Friday, July 16, 2021 6:10 PM vignesh C 
> > On Wed, Jul 14, 2021 at 6:25 PM houzj.f...@fujitsu.com
> >  wrote:
> > >
> > > Wednesday, July 14, 2021 6:17 PM vignesh C 
wrote:
> > > > On Tue, Jul 13, 2021 at 12:06 PM tanghy.f...@fujitsu.com
> > > >  wrote:
> > > > >
> > > > > On Monday, July 12, 2021 5:36 PM vignesh C 
> > wrote:
> > > > > >
> > > > > > Thanks for reporting this issue, this issue is fixed in the v10
> > > > > > patch attached at [1].
> > > > > > [1] - https://www.postgresql.org/message-id/CALDaNm2%2BtR%2B8R-sD1CSyMbZcZbkintZE-avefjsp7LCkm6HMmw%40mail.gmail.com
> > > > >
> > > > > Thanks for fixing it.
> > > > >
> > > > > By applying your V10 patch, I saw three problems, please have a
look.
> > > > >
> > > > > 1. An issue about pg_dump.
> > > > > When public schema was published, the publication was created in the
> > > > > output file, but public schema was not added to it. (Other schemas
> > > > > could be added as expected.)
> > > > >
> > > > > I looked into it and found that the selectDumpableNamespace function
> > > > > marks DUMP_COMPONENT_DEFINITION as needless when the schema is public,
> > > > > leading to the public schema being ignored in getPublicationSchemas.
> > > > > So we'd better check whether schemas should be dumped in another way.
> > > > >
> > > > > I tried to fix it with the following change, please have a look.
> > > > > (Maybe we also need to add some comments for it.)
> > > > >
> > > > > diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
> > > > > index f6b4f12648..a327d2568b 100644
> > > > > --- a/src/bin/pg_dump/pg_dump.c
> > > > > +++ b/src/bin/pg_dump/pg_dump.c
> > > > > @@ -4206,7 +4206,8 @@ getPublicationSchemas(Archive *fout, NamespaceInfo nspinfo[], int numSchemas)
> > > > >  		 * Ignore publication membership of schemas whose definitions are not
> > > > >  		 * to be dumped.
> > > > >  		 */
> > > > > -		if (!(nsinfo->dobj.dump & DUMP_COMPONENT_DEFINITION))
> > > > > +		if (!((nsinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
> > > > > +			|| (strcmp(nsinfo->dobj.name, "public") == 0
> > > > > +				&& nsinfo->dobj.dump != DUMP_COMPONENT_NONE)))
> > > > >  			continue;
> > > > >
> > > > >  		pg_log_info("reading publication membership for schema \"%s\"",
> > > >
> > > > I felt it is intentionally done like that, as the public schema is
> > > > created by default; hence it is not required to dump it, else we will
> > > > get errors while restoring. Thoughts?
> > >
> > > Thanks for the new patches and I also looked at this issue.
> > >
> > > For user defined schema and publication:
> > > --
> > > create schema s1;
> > > create publication pub2 for SCHEMA s1;
> > > --
> > >
> > > pg_dump will only generate the following SQLs:
> > >
> > > --pg_dump result--
> > > CREATE PUBLICATION pub2 WITH (publish = 'insert, update, delete, truncate');
> > > ALTER PUBLICATION pub2 ADD SCHEMA s1;
> > > --
> > >
> > > But for the public schema:
> > > --
> > > create publication pub for SCHEMA public;
> > > --
> > >
> > > pg_dump will only generate the following SQL:
> > >
> > > --pg_dump result--
> > > CREATE PUBLICATION pub WITH (publish = 'insert, update, delete, truncate');
> > > --
> > >
> > > It didn't generate SQL like "ALTER PUBLICATION pub ADD SCHEMA public;",
> > > which means the public schema won't be published after restoring. So, I
> > > think we'd better let pg_dump generate the ADD SCHEMA public SQL. Thoughts?
> >
> > Thanks for reporting this issue, this issue is fixed in the v12 patch
> > attached.
> >
>
> I tested your v12 patch and found a problem in the following case.
>
> Step 1:
> postgres=# create schema s1;
> CREATE SCHEMA
> postgres=# create table s1.t1 (a int);
> CREATE TABLE
> postgres=# create publication pub_t for table s1.t1;
> CREATE PUBLICATION
> postgres=# create publication pub_s for schema s1;
> CREATE PUBLICATION
>
> Step 2:
> pg_dump -N s1
>
> I dumped and excluded schema s1, pg_dump generated the following SQL:
> ---
> ALTER PUBLICATION pub_s ADD SCHEMA s1;
>
> I think it was not expected because SQL like "ALTER PUBLICATION pub_t ADD
> TABLE s1.t1" was not generated in my case. Thoughts?

Thanks for reporting this issue, this issue is fixed in the v13 patch
posted at [1]
[1] -
https://www.postgresql.org/message-id/CALDaNm0%3DMaXyAok5iq_-DeWUd81vpdF47-MZbbrsd%2BzB2P6WwA%40mail.gmail.com

Regards,
Vignesh


Re: Added schema level support for publication.

2021-07-21 Thread vignesh C
On Mon, Jul 19, 2021 at 2:41 PM Greg Nancarrow  wrote:
>
> On Fri, Jul 16, 2021 at 8:13 PM vignesh C  wrote:
> >
> > Modified.
> >
> > Thanks for the comments, these issues are fixed as part of the v12 patch 
> > posted at [1].
> > [1]  - 
> > https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
> >
>
> There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
> After that command, it still regards it as an empty (e) publication,
> so I can then ALTER PUBLICATION ... ADD SCHEMA ...
>
> e.g.
>
> test_pub=# create schema myschema;
> CREATE SCHEMA
> test_pub=# CREATE TABLE myschema.test (key int, value text, data jsonb);
> CREATE TABLE
> test_pub=# create publication pub1;
> CREATE PUBLICATION
> test_pub=# \dRp+ pub1
>                               Publication pub1
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | Pubtype
> -------+------------+---------+---------+---------+-----------+----------+---------
>  gregn | f          | t       | t       | t       | t         | f        | e
> (1 row)
>
> test_pub=# alter publication pub1 set table myschema.test;
> ALTER PUBLICATION
> test_pub=# \dRp+ pub1
>                               Publication pub1
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | Pubtype
> -------+------------+---------+---------+---------+-----------+----------+---------
>  gregn | f          | t       | t       | t       | t         | f        | e
> (1 row)
>
> test_pub=# alter publication pub1 add schema myschema;
> ALTER PUBLICATION
> test_pub=# \dRp+ pub1
>                               Publication pub1
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root | Pubtype
> -------+------------+---------+---------+---------+-----------+----------+---------
>  gregn | f          | t       | t       | t       | t         | f        | s
> Schemas:
>     "myschema"

Thanks for reporting this issue, this issue is fixed in the v13 patch
posted at [1]
[1] - 
https://www.postgresql.org/message-id/CALDaNm0%3DMaXyAok5iq_-DeWUd81vpdF47-MZbbrsd%2BzB2P6WwA%40mail.gmail.com

Regards,
Vignesh




Re: Added schema level support for publication.

2021-07-21 Thread vignesh C
On Wed, Jul 21, 2021 at 3:14 PM Rahila Syed  wrote:
>
>
>
> On Mon, Jul 19, 2021 at 2:41 PM Greg Nancarrow 
wrote:
>>
>> On Fri, Jul 16, 2021 at 8:13 PM vignesh C  wrote:
>> >
>> > Modified.
>> >
>> > Thanks for the comments, these issues are fixed as part of the v12 patch
>> > posted at [1].
>> > [1] - https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
>> >
>>
>> There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
>> After that command, it still regards it as an empty (e) publication,
>> so I can then ALTER PUBLICATION ... ADD SCHEMA ...
>>
>
> One issue here is that the code to update publication type is missing
> in AlterPublicationTables for SET TABLE command.

Modified.

> More broadly, I am not clear about the behaviour of the patch when a
> publication is created to publish only certain tables, and is later altered
> to publish a whole schema. I think such behaviour is legitimate. However,
> AFAIU as per current code we can't update the publication type
> from PUBTYPE_TABLE to PUBTYPE_SCHEMA.

I initially thought this might not be required for users, so I have not made
any change for this. I will try to get a few more people's opinions on this
and then fix it if required.

> I have some review comments as follows:
> 1.
> In ConvertSchemaSpecListToOidList(List *schemas) function:
>  +		search_path = fetch_search_path(false);
>  +		nspname = get_namespace_name(linitial_oid(search_path));
>  +		if (nspname == NULL)	/* recently-deleted namespace? */
>  +			ereport(ERROR,
>  +					errcode(ERRCODE_UNDEFINED_SCHEMA),
>  +					errmsg("no schema has been selected"));
>  +
>  +		schemoid = get_namespace_oid(nspname, false);
>  +		break;
>
> The call to get_namespace_oid() is perhaps not needed, as fetch_search_path
> already fetches OIDs, and simply doing schemoid = linitial_oid(search_path);
> should be enough.

Modified

> 2. In the same function, should there be an if-else block instead of a
> switch case, as there are only two cases?

Modified.

Thanks for the comments, these comments are fixed in the v13 patch posted
at [1].
[1] -
https://www.postgresql.org/message-id/CALDaNm0%3DMaXyAok5iq_-DeWUd81vpdF47-MZbbrsd%2BzB2P6WwA%40mail.gmail.com

Regards,
Vignesh


Re: Have I found an interval arithmetic bug?

2021-07-21 Thread Bryn Llewellyn
> On 21-Jul-2021, at 01:23, Dean Rasheed  wrote:
> 
> On Wed, 21 Jul 2021 at 03:48, Bruce Momjian  wrote:
>> 
>> this example now gives me concern:
>> 
>>SELECT INTERVAL '1.06 months 1 hour';
>>   interval
>>---
>> 1 mon 2 days 01:00:00
>> 
>> Notice that it rounds the '1.06 months' to '1 mon 2 days', rather than
>> spilling to hours/minutes/seconds, even though hours is already
>> specified.  I don't see a better way to handle this than the current
>> code already does, but it is something odd.
> 
> Hmm, looking at this whole thread, I have to say that I prefer the old
> behaviour of spilling down to lower units.
> 
> For example, with this patch:
> 
> SELECT '0.5 weeks'::interval;
> interval
> --
> 4 days
> 
> which I don't think is really an improvement. My expectation is that
> half a week is 3.5 days, and I prefer what it used to return, namely
> '3 days 12:00:00'.
> 
> It's true that that leads to odd-looking results when the field value
> has lots of fractional digits, but that was at least explainable, and
> followed the documentation.
> 
> Looking for a general principle to follow, how about this -- the
> result of specifying a fractional value should be the same as
> multiplying an interval of 1 unit by that value. In other words,
> '1.8594 months'::interval should be the same as '1 month'::interval *
> 1.8594. (Actually, it probably can't easily be made exactly the same
> in all cases, due to differences in the floating point computations in
> the two cases, and rounding errors, but it's hopefully not far off,
> unlike the results obtained by not spilling down to lower units on
> input.)
> 
> The cases that are broken in master, in my opinion, are the larger
> units (year and above), which don't propagate down in the same way as
> fractional months and below. So, for example, '0.7 years' should be
> 8.4 months (with the conversion factor of 1 year = 12 months), giving
> '8 months 12 days', which is what '1 year'::interval * 0.7 produces.
> Sure, there are arguably more accurate ways of computing that.
> However, that's the result obtained using the documented conversion
> factors, so it's justifiable in those terms.
> 
> It's worth noting another case that is broken in master:
> 
> SELECT '1.7 decades'::interval;
> interval
> --
> 16 years 11 mons
> 
> which is surely not what anyone would expect. The current patch fixes
> this, but it would also be fixed by handling the fractional digits for
> these units in the same way as for smaller units. There was an earlier
> patch doing that, I think, though I didn't test it.
> 
> Regards,
> Dean

And try these two tests (I'm using version 13.3 on current macOS):

select
  '1.7 decades'::interval as i1, 
  ('1 decades'::interval)*1.7 as i2,
  ('10 years'::interval)*1.7 as i3;

        i1        |    i2    |    i3
------------------+----------+----------
 16 years 11 mons | 17 years | 17 years

select
  '1.7345 decades'::interval as i4, 
  ('1 decades'::interval)*1.7345 as i5,
  ('10 years'::interval)*1.7345 as i6;

       i4        |               i5                |               i6
-----------------+---------------------------------+---------------------------------
 17 years 4 mons | 17 years 4 mons 4 days 04:48:00 | 17 years 4 mons 4 days 04:48:00

Shows only what we know already: mixed interval arithmetic is fishy.

Seems to me that units like “weeks”, “centuries”, “millennia”, and so on are a 
solution (broken in some cases) looking for a problem. Try this (and variants 
like I showed above):

select
  '1.7345 millennia'::interval as i7,
  '1.7345 centuries'::interval as i8,
  '1.7345 weeks'::interval as i9;

        i7         |        i8        |         i9
-------------------+------------------+---------------------
 1734 years 6 mons | 173 years 5 mons | 12 days 03:23:45.6





Re: [POC] verifying UTF-8 using SIMD instructions

2021-07-21 Thread John Naylor
On Wed, Jul 21, 2021 at 11:29 AM Thomas Munro 
wrote:

> Just for fun/experimentation, here's a quick (and probably too naive)
> translation of those helper functions to NEON, on top of the v15
> patch.

Neat! It's good to make it more architecture-agnostic, and I'm sure we can
use quite a bit of this. I don't know enough about NEON to comment
intelligently, but a quick glance through the simdjson source shows a couple
of differences that might be worth a look:

 to_bool(const pg_u8x16_t v)
 {
+#if defined(USE_NEON)
+ return vmaxvq_u32((uint32x4_t) v) != 0;

--> return vmaxvq_u8(*this) != 0;

 vzero()
 {
+#if defined(USE_NEON)
+ return vmovq_n_u8(0);

--> return vdupq_n_u8(0); // or equivalently, splat(0)

is_highbit_set(const pg_u8x16_t v)
 {
+#if defined(USE_NEON)
+ return to_bool(bitwise_and(v, vmovq_n_u8(0x80)));

--> return vmaxq_u8(v) > 0x7F

(Technically, their convention is: is_ascii(v) { return vmaxq_u8(v) < 0x80;
} , but same effect)

+#if defined(USE_NEON)
+static pg_attribute_always_inline pg_u8x16_t
+vset(uint8 v0, uint8 v1, uint8 v2, uint8 v3,
+ uint8 v4, uint8 v5, uint8 v6, uint8 v7,
+ uint8 v8, uint8 v9, uint8 v10, uint8 v11,
+ uint8 v12, uint8 v13, uint8 v14, uint8 v15)
+{
+ uint8 pg_attribute_aligned(16) values[16] = {
+ v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14, v15
+ };
+ return vld1q_u8(values);
+}

--> They have this strange beast instead:

  // Doing a load like so end ups generating worse code.
  // uint8_t array[16] = {x1, x2, x3, x4, x5, x6, x7, x8,
  // x9, x10,x11,x12,x13,x14,x15,x16};
  // return vld1q_u8(array);
  uint8x16_t x{};
  // incredibly, Visual Studio does not allow x[0] = x1
  x = vsetq_lane_u8(x1, x, 0);
  x = vsetq_lane_u8(x2, x, 1);
  x = vsetq_lane_u8(x3, x, 2);
...
  x = vsetq_lane_u8(x15, x, 14);
  x = vsetq_lane_u8(x16, x, 15);
  return x;

Since you aligned the array, that might not have the problem alluded to
above, and it looks nicer.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Git revision in tarballs

2021-07-21 Thread Peter Eisentraut

On 15.07.21 10:33, Magnus Hagander wrote:

I think it'd be useful to be able to identify exactly which git commit
was used to produce a tarball. This would be especially useful when
downloading snapshot tarballs where that's not entirely clear, but can
also be used to verify that the release tarballs matches what's
expected (in the extremely rare case that a tarball is rewrapped for
example).


Or we could do what git-archive does:

Additionally the commit ID is stored in a global extended
pax header if the tar format is used; it can be extracted using git
get-tar-commit-id. In ZIP files it is stored as
a file comment.




Re: Git revision in tarballs

2021-07-21 Thread Daniel Gustafsson
> On 21 Jul 2021, at 20:25, Peter Eisentraut 
>  wrote:
> 
> On 15.07.21 10:33, Magnus Hagander wrote:
>> I think it'd be useful to be able to identify exactly which git commit
>> was used to produce a tarball. This would be especially useful when
>> downloading snapshot tarballs where that's not entirely clear, but can
>> also be used to verify that the release tarballs matches what's
>> expected (in the extremely rare case that a tarball is rewrapped for
>> example).
> 
> Or we could do what git-archive does:
> 
>Additionally the commit ID is stored in a global extended
>pax header if the tar format is used; it can be extracted using git
>get-tar-commit-id. In ZIP files it is stored as
>a file comment.

That does add Git as a dependency for consuming the tarball though, which
might not be a problem but it's a change from what we require today.

--
Daniel Gustafsson   https://vmware.com/





Re: Git revision in tarballs

2021-07-21 Thread Tom Lane
Daniel Gustafsson  writes:
> On 21 Jul 2021, at 20:25, Peter Eisentraut 
>  wrote:
>> On 15.07.21 10:33, Magnus Hagander wrote:
>>> I think it'd be useful to be able to identify exactly which git commit
>>> was used to produce a tarball.

>> Or we could do what git-archive does:
>> Additionally the commit ID is stored in a global extended
>> pax header if the tar format is used; it can be extracted using git
>> get-tar-commit-id. In ZIP files it is stored as
>> a file comment.

> That does add Git as a dependency for consuming the tarball though, which
> might not be a problem but it's a change from what we require today.

It also requires keeping the tarball itself around, which you might not
have done, or you might not remember which directory you extracted which
tarball into.  So on the whole that solution seems strictly worse.

FYI, the "put it into .gitrevision" solution is already implemented
in the new tarball-building script that Magnus and I have been
working on off-list.

regards, tom lane




Re: Git revision in tarballs

2021-07-21 Thread Daniel Gustafsson
> On 21 Jul 2021, at 21:23, Tom Lane  wrote:

> FYI, the "put it into .gitrevision" solution is already implemented
> in the new tarball-building script that Magnus and I have been
> working on off-list.

+1, I think that's the preferred option.

--
Daniel Gustafsson   https://vmware.com/





Re: Git revision in tarballs

2021-07-21 Thread Peter Eisentraut

On 21.07.21 21:12, Daniel Gustafsson wrote:

Or we could do what git-archive does:

Additionally the commit ID is stored in a global extended
pax header if the tar format is used; it can be extracted using git
get-tar-commit-id. In ZIP files it is stored as
a file comment.


That does add Git as a dependency for consuming the tarball though, which
might not be a problem but it's a change from what we require today.


How so?




Re: psql - add SHOW_ALL_RESULTS option

2021-07-21 Thread Peter Eisentraut

On 15.07.21 17:46, Fabien COELHO wrote:

The patch does not apply on Head anymore, could you rebase and post a
patch. I'm changing the status to "Waiting for Author".


Ok. I noticed. The patch got significantly broken by the watch pager 
commit. I also have to enhance the added tests (per Peter's request).


I wrote a test to check psql query cancel support.  I checked that it 
fails against the patch that was reverted.  Maybe this is useful.
From 6ff6f8b0246cea08b7e329a3e5f49cec3f83a5bc Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 21 Jul 2021 21:46:11 +0200
Subject: [PATCH] psql: Add test for query canceling

---
 src/bin/psql/t/020_cancel.pl | 32 
 1 file changed, 32 insertions(+)
 create mode 100644 src/bin/psql/t/020_cancel.pl

diff --git a/src/bin/psql/t/020_cancel.pl b/src/bin/psql/t/020_cancel.pl
new file mode 100644
index 00..0d56b47ff3
--- /dev/null
+++ b/src/bin/psql/t/020_cancel.pl
@@ -0,0 +1,32 @@
+use strict;
+use warnings;
+
+use PostgresNode;
+use TestLib;
+use Test::More tests => 2;
+
+my $tempdir = TestLib::tempdir;
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+
+# test query canceling by sending SIGINT to a running psql
+SKIP: {
+   skip "cancel test requires a Unix shell", 2 if $windows_os;
+
+   local %ENV = $node->_get_env();
+
+   local $SIG{ALRM} = sub {
+   my $psql_pid = TestLib::slurp_file("$tempdir/psql.pid");
+   kill 'INT', $psql_pid;
+   };
+   alarm 1;
+
+   my $stdin = "\\! echo \$PPID >$tempdir/psql.pid\nselect pg_sleep(5);";
+   my ($stdout, $stderr);
+   my $result = IPC::Run::run(['psql', '-v', 'ON_ERROR_STOP=1'], '<', 
\$stdin, '>', \$stdout, '2>', \$stderr);
+
+   ok(!$result, 'query failed');
+   like($stderr, qr/canceling statement due to user request/, 'query was 
canceled');
+}
-- 
2.32.0



Re: Git revision in tarballs

2021-07-21 Thread Tom Lane
Peter Eisentraut  writes:
> On 21.07.21 21:12, Daniel Gustafsson wrote:
>>> Additionally the commit ID is stored in a global extended
>>> pax header if the tar format is used; it can be extracted using git
>>> get-tar-commit-id. In ZIP files it is stored as
>>> a file comment.

>> That does add Git as a dependency for consuming the tarball though, which
>> might not be a problem but it's a change from what we require today.

> How so?

It's only a dependency if you want to know the commit ID, which
perhaps isn't something you would have use for if you don't have
git installed ... but I don't think that's totally obvious.

Personally I'd be more worried about rendering the tarballs
totally corrupt from the perspective of somebody using an old
"tar" that hasn't heard of extended pax headers.  Maybe there
are no such versions in the wild anymore; but I do not see any
advantages to this approach that would justify taking any risk for.

regards, tom lane




Re: proposal: enhancing plpgsql debug API - returns text value of variable content

2021-07-21 Thread Pavel Stehule
Hi

On Fri, Jul 16, 2021 at 18:40, Pavel Stehule wrote:

> Hi
>
> On Fri, Jul 16, 2021 at 15:05, Aleksander Alekseev <
> aleksan...@timescale.com> wrote:
>
>> Hi Pavel,
>>
>> > I would like to print content of variables - and now, I have to go some
>> > deeper than I would like. I need to separate between scalar, row, and
>> > record variables. PLpgSQL has code for it - but it is private.
>> > [...]
>>
>> The patch seems OK, but I wonder - would it be possible to write a test
>> on it?
>>
>
>  Sure, it is possible - unfortunately - the size of this test will be
> significantly bigger than the patch itself.
>
> I'll try to write some simple tracer where this API can be used
>

I am sending an enhanced patch that adds a regression test for plpgsql's debug
API.

Regards

Pavel


>
>
>> --
>> Best regards,
>> Aleksander Alekseev
>>
>
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 14bbe12da5..e0e68a2592 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -4059,6 +4059,8 @@ plpgsql_estate_setup(PLpgSQL_execstate *estate,
 	{
 		(*plpgsql_plugin_ptr)->error_callback = plpgsql_exec_error_callback;
 		(*plpgsql_plugin_ptr)->assign_expr = exec_assign_expr;
+		(*plpgsql_plugin_ptr)->eval_datum = exec_eval_datum;
+		(*plpgsql_plugin_ptr)->cast_value = do_cast_value;
 
 		if ((*plpgsql_plugin_ptr)->func_setup)
 			((*plpgsql_plugin_ptr)->func_setup) (estate, func);
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 0f6a5b30b1..abe7d63f78 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -415,6 +415,7 @@ pl_block		: decl_sect K_BEGIN proc_sect exception_sect K_END opt_label
 		new->cmd_type	= PLPGSQL_STMT_BLOCK;
 		new->lineno		= plpgsql_location_to_lineno(@2);
 		new->stmtid		= ++plpgsql_curr_compile->nstatements;
+		new->ns			= plpgsql_ns_top();
 		new->label		= $1.label;
 		new->n_initvars = $1.n_initvars;
 		new->initvarnos = $1.initvarnos;
@@ -907,7 +908,8 @@ stmt_perform	: K_PERFORM
 		new = palloc0(sizeof(PLpgSQL_stmt_perform));
 		new->cmd_type = PLPGSQL_STMT_PERFORM;
 		new->lineno   = plpgsql_location_to_lineno(@1);
-		new->stmtid = ++plpgsql_curr_compile->nstatements;
+		new->stmtid   = ++plpgsql_curr_compile->nstatements;
+		new->ns		  = plpgsql_ns_top();
 		plpgsql_push_back_token(K_PERFORM);
 
 		/*
@@ -943,6 +945,7 @@ stmt_call		: K_CALL
 		new->cmd_type = PLPGSQL_STMT_CALL;
 		new->lineno = plpgsql_location_to_lineno(@1);
 		new->stmtid = ++plpgsql_curr_compile->nstatements;
+		new->ns = plpgsql_ns_top();
 		plpgsql_push_back_token(K_CALL);
 		new->expr = read_sql_stmt();
 		new->is_call = true;
@@ -962,6 +965,7 @@ stmt_call		: K_CALL
 		new->cmd_type = PLPGSQL_STMT_CALL;
 		new->lineno = plpgsql_location_to_lineno(@1);
 		new->stmtid = ++plpgsql_curr_compile->nstatements;
+		new->ns = plpgsql_ns_top();
 		plpgsql_push_back_token(K_DO);
 		new->expr = read_sql_stmt();
 		new->is_call = false;
@@ -1000,7 +1004,8 @@ stmt_assign		: T_DATUM
 		new = palloc0(sizeof(PLpgSQL_stmt_assign));
 		new->cmd_type = PLPGSQL_STMT_ASSIGN;
 		new->lineno   = plpgsql_location_to_lineno(@1);
-		new->stmtid = ++plpgsql_curr_compile->nstatements;
+		new->stmtid   = ++plpgsql_curr_compile->nstatements;
+		new->ns		  = plpgsql_ns_top();
 		new->varno = $1.datum->dno;
 		/* Push back the head name to include it in the stmt */
 		plpgsql_push_back_token(T_DATUM);
@@ -1022,6 +1027,7 @@ stmt_getdiag	: K_GET getdiag_area_opt K_DIAGNOSTICS getdiag_list ';'
 		new->cmd_type = PLPGSQL_STMT_GETDIAG;
 		new->lineno   = plpgsql_location_to_lineno(@1);
 		new->stmtid	  = ++plpgsql_curr_compile->nstatements;
+		new->ns		  = plpgsql_ns_top();
 		new->is_stacked = $2;
 		new->diag_items = $4;
 
@@ -1194,6 +1200,7 @@ stmt_if			: K_IF expr_until_then proc_sect stmt_elsifs stmt_else K_END K_IF ';'
 		new->cmd_type	= PLPGSQL_STMT_IF;
 		new->lineno		= plpgsql_location_to_lineno(@1);
 		new->stmtid		= ++plpgsql_curr_compile->nstatements;
+		new->ns			= plpgsql_ns_top();
 		new->cond		= $2;
 		new->then_body	= $3;
 		new->elsif_list = $4;
@@ -1299,6 +1306,7 @@ stmt_loop		: opt_loop_label K_LOOP loop_body
 		new->cmd_type = PLPGSQL_STMT_LOOP;
 		new->lineno   = plpgsql_location_to_lineno(@2);
 		new->stmtid   = ++plpgsql_curr_compile->nstatements;
+		new->ns		  = plpgsql_ns_top();
 		new->label	  = $1;
 		new->body	  = $3.stmts;
 
@@ -1317,6 +1325,7 @@ stmt_while		: opt_loop_label K_WHILE expr_until_loop loop_body
 		new->cmd_type = PLPGSQL_STMT_WHILE;
 		new->lineno   = plpgsql_location_to_lineno(@2);
 		new->stmtid	  = ++plpgsql_curr_compile->nstatements;
+		new->ns		  = plpgsql_ns_top();
 		new->label	  = $1;
 		new->cond	  = $3;
 		new->body	  = $4.stmts;
@@ -1381,6 +1390,7 @@

Re: WIP: Relaxing the constraints on numeric scale

2021-07-21 Thread Tom Lane
Dean Rasheed  writes:
> Attached is a more complete patch, with updated docs and tests.

I took a brief look at this and have a couple of quick suggestions:

* As you mention, keeping some spare bits in the typmod might come
in handy some day, but as given this patch isn't really doing so.
I think it might be advisable to mask the scale off at 11 bits,
preserving the high 5 bits of the low-order half of the word for future
use.  The main objection to that I guess is that it would complicate
doing sign extension in TYPMOD_SCALE().  But it doesn't seem like we
use that logic in any really hot code paths, so another instruction
or three probably is not much of a cost.

* I agree with wrapping the typmod construction/extraction into macros
(or maybe they should be inline functions?) but the names you chose
seem generic enough to possibly confuse onlookers.  I'd suggest
changing TYPMOD to NUMERIC_TYPMOD or NUM_TYPMOD.  The comment for them
should probably also explicitly explain "For purely historical reasons,
VARHDRSZ is added to the typmod value after these fields are combined",
or words to that effect.

* It might be advisable to write NUMERIC_MIN_SCALE with parens:

#define NUMERIC_MIN_SCALE   (-1000)

to avoid any precedence gotchas.
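
Just to make that concrete, here is a minimal sketch of what such macros
could look like, assuming an 11-bit signed scale field in the low half of
the typmod (names and exact layout here are hypothetical, not taken from
the patch):

/*
 * Hypothetical layout: precision in the upper half of the word, scale in
 * bits 0..10, leaving bits 11..15 spare for future use.  VARHDRSZ is added
 * for purely historical reasons.
 */
#define NUMERIC_TYPMOD(precision, scale) \
	((((precision) << 16) | ((scale) & 0x7ff)) + VARHDRSZ)

#define NUMERIC_TYPMOD_PRECISION(typmod) \
	((((typmod) - VARHDRSZ) >> 16) & 0xffff)

/* sign-extend the 11-bit scale field back to a full int */
#define NUMERIC_TYPMOD_SCALE(typmod) \
	(((((typmod) - VARHDRSZ) & 0x7ff) ^ 0x400) - 0x400)

The sign extension in the last macro is the extra instruction or two
mentioned above; everything else is the same shifting and masking.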

* I'd be inclined to leave the num_typemod_test table in place,
rather than dropping it, so that it serves to exercise pg_dump
for these cases during the pg_upgrade test.

Haven't read the code in detail yet.

regards, tom lane




Re: badly calculated width of emoji in psql

2021-07-21 Thread Jacob Champion
On Wed, 2021-07-21 at 00:08 +, Jacob Champion wrote:
> I note that the doc comment for ucs_wcwidth()...
> 
> >  *- Spacing characters in the East Asian Wide (W) or East Asian
> >  *  FullWidth (F) category as defined in Unicode Technical
> >  *  Report #11 have a column width of 2.
> 
> ...doesn't match reality anymore. The East Asian width handling was
> last updated in 2006, it looks like? So I wonder whether fixing the
> code to match the comment would not only fix the emoji problem but also
> a bunch of other non-emoji characters.

Attached is my attempt at that. This adds a second interval table,
handling not only the emoji range in the original patch but also
correcting several non-emoji character ranges which are included in the
13.0 East Asian Wide/Fullwidth sets. Try for example

- U+2329 LEFT POINTING ANGLE BRACKET
- U+16FE0 TANGUT ITERATION MARK
- U+18000 KATAKANA LETTER ARCHAIC E

This should work reasonably well for terminals that depend on modern
versions of Unicode's EastAsianWidth.txt to figure out character width.
I don't know how it behaves on BSD libc or Windows.

The new binary search isn't free, but my naive attempt at measuring the
performance hit made it look worse than it actually is. Since the
measurement function was previously returning an incorrect (too short)
width, we used to get a free performance boost by not printing the
correct number of alignment/border characters. I'm still trying to
figure out how best to isolate the performance changes due to this
patch.
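
For anyone following along, the lookup itself is just a binary search over
sorted [first, last] codepoint ranges, roughly like this (a simplified
sketch of the approach, not the actual code in the patch; names are made
up):

typedef struct
{
	uint32		first;			/* first codepoint in range, inclusive */
	uint32		last;			/* last codepoint in range, inclusive */
} codepoint_range;

/* Return true if ucs falls within any of the sorted, non-overlapping ranges. */
static bool
codepoint_in_table(uint32 ucs, const codepoint_range *table, int nranges)
{
	int			lo = 0;
	int			hi = nranges - 1;

	/* quick exit for codepoints below the first range */
	if (nranges == 0 || ucs < table[0].first)
		return false;

	while (lo <= hi)
	{
		int			mid = (lo + hi) / 2;

		if (ucs > table[mid].last)
			lo = mid + 1;
		else if (ucs < table[mid].first)
			hi = mid - 1;
		else
			return true;
	}

	return false;
}

If the codepoint lands in the Wide/Fullwidth table, the width function
reports 2 instead of 1, so the added cost is one O(log n) lookup per
character.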

Pavel, I'd be interested to see what your benchmarks find with this
code. Does this fix the original issue for you?

--Jacob
From e59292beb2aeb1860734ead992cea38f59f6cab6 Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Wed, 21 Jul 2021 10:41:05 -0700
Subject: [PATCH] ucs_wcwidth: update Fullwidth and Wide codepoint set

The hardcoded "wide character" set at the end of ucs_wcwidth() was last
touched around the Unicode 5.0 era.  This led to misalignment on modern
platforms when printing emoji and other codepoints that have since been
designated wide/fullwidth.

Use an interval table for these codepoints, and add a recipe to the
existing update-unicode rule to keep it up to date.

TODO: performance implications
---
 src/common/unicode/.gitignore |   1 +
 src/common/unicode/Makefile   |   9 +-
 .../generate-unicode_east_asian_fw_table.pl   |  76 +++
 src/common/wchar.c|  18 +--
 .../common/unicode_east_asian_fw_table.h  | 120 ++
 5 files changed, 208 insertions(+), 16 deletions(-)
 create mode 100644 src/common/unicode/generate-unicode_east_asian_fw_table.pl
 create mode 100644 src/include/common/unicode_east_asian_fw_table.h

diff --git a/src/common/unicode/.gitignore b/src/common/unicode/.gitignore
index 512862e538..46243f701d 100644
--- a/src/common/unicode/.gitignore
+++ b/src/common/unicode/.gitignore
@@ -4,5 +4,6 @@
 # Downloaded files
 /CompositionExclusions.txt
 /DerivedNormalizationProps.txt
+/EastAsianWidth.txt
 /NormalizationTest.txt
 /UnicodeData.txt
diff --git a/src/common/unicode/Makefile b/src/common/unicode/Makefile
index eb14add28a..a3683dd86b 100644
--- a/src/common/unicode/Makefile
+++ b/src/common/unicode/Makefile
@@ -18,14 +18,14 @@ LIBS += $(PTHREAD_LIBS)
 # By default, do nothing.
 all:
 
-update-unicode: unicode_norm_table.h unicode_combining_table.h unicode_normprops_table.h unicode_norm_hashfunc.h
+update-unicode: unicode_norm_table.h unicode_combining_table.h unicode_east_asian_fw_table.h unicode_normprops_table.h unicode_norm_hashfunc.h
 	mv $^ ../../../src/include/common/
 	$(MAKE) normalization-check
 
 # These files are part of the Unicode Character Database. Download
 # them on demand.  The dependency on Makefile.global is for
 # UNICODE_VERSION.
-UnicodeData.txt DerivedNormalizationProps.txt CompositionExclusions.txt NormalizationTest.txt: $(top_builddir)/src/Makefile.global
+UnicodeData.txt EastAsianWidth.txt DerivedNormalizationProps.txt CompositionExclusions.txt NormalizationTest.txt: $(top_builddir)/src/Makefile.global
 	$(DOWNLOAD) https://www.unicode.org/Public/$(UNICODE_VERSION)/ucd/$(@F)
 
 # Generation of conversion tables used for string normalization with
@@ -38,6 +38,9 @@ unicode_norm_table.h: generate-unicode_norm_table.pl UnicodeData.txt Composition
 unicode_combining_table.h: generate-unicode_combining_table.pl UnicodeData.txt
 	$(PERL) $^ >$@
 
+unicode_east_asian_fw_table.h: generate-unicode_east_asian_fw_table.pl EastAsianWidth.txt
+	$(PERL) $^ >$@
+
 unicode_normprops_table.h: generate-unicode_normprops_table.pl DerivedNormalizationProps.txt
 	$(PERL) $^ >$@
 
@@ -64,6 +67,6 @@ clean:
 	rm -f $(OBJS) norm_test norm_test.o
 
 distclean: clean
-	rm -f UnicodeData.txt CompositionExclusions.txt NormalizationTest.txt norm_test_table.h unicode_norm_table.h
+	rm -f UnicodeData.txt EastAsianWidth.txt CompositionExclusions.txt NormalizationTest.txt norm_test_table.

Re: Rename of triggers for partitioned tables

2021-07-21 Thread Alvaro Herrera
... now, thinking about how does pg_dump deal with this, I think this
thread is all wrong, or at least mostly wrong.  We cannot have triggers
in partitions with different names from the triggers in the parents.
pg_dump would only dump the trigger in the parent and totally ignore the
ones in children if they have different names.

So we still need the recursion bits; but

1. if we detect that we're running in a trigger which has tgparentid
set, we have to raise an error,
2. if we are running on a trigger in a partitioned table, then ONLY must
not be specified.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"Puedes vivir sólo una vez, pero si lo haces bien, una vez es suficiente"




Re: Using a stock openssl BIO

2021-07-21 Thread Daniel Gustafsson
> On 15 Jul 2021, at 04:17, Andres Freund  wrote:

> Thomas^WA bad person recently nerdsniped me (with the help of an accidental 
> use
> of an SSL connection in a benchmark leading to poor results) into checking 
> what
> would be needed to benefit from SSL/TLS hardware acceleration (available with
> suitable hardware, OS support (linux and freebsd) and openssl 3). 

Now why does that sounds so familiar.. =)

> In the backend the first reason is:
> 
> * Private substitute BIO: this does the sending and receiving using send() and
> * recv() instead. This is so that we can enable and disable interrupts
> * just while calling recv(). We cannot have interrupts occurring while
> * the bulk of OpenSSL runs, because it uses malloc() and possibly other
> * non-reentrant libc facilities.
> 
> I think this part has been obsolete for a while now

I concur.
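
For context, dropping the private BIO essentially means handing the raw
socket straight to OpenSSL, something along these lines (a minimal sketch
under the assumption that interrupt handling moves up into the
secure_read/secure_write layer; error handling omitted, and this is not
the actual patch):

#include <openssl/bio.h>
#include <openssl/ssl.h>

static void
attach_stock_bio(SSL *ssl, int sock)
{
	/* let OpenSSL's own socket BIO drive the fd directly */
	BIO		   *bio = BIO_new_socket(sock, BIO_NOCLOSE);

	/* same BIO for both reading and writing */
	SSL_set_bio(ssl, bio, bio);
}

Any signal/interrupt processing then has to happen around the SSL_read()
and SSL_write() calls rather than inside a custom BIO.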

> The second part is
> * We also need to call send() and recv()
> * directly so it gets passed through the socket/signals layer on Win32.
> 
> And the not stated need to set/unset pgwin32_noblock around the recv/send
> calls.
> 
> I don't think the signal handling stuff is still needed with nonblocking
> sockets? It seems we could just ensure that there's a pgwin32_poll_signals()
> somewhere higher up in secure_read/write()? E.g. in
> ProcessClientReadInterrupt()/ProcessClientWriteInterrupt() or with an explicit
> call.
> 
> And the pgwin32_noblock handling could just be done outside the 
> SSL_read/write().

I hadn't yet looked at the pgwin32 parts in detail, but this is along the lines
of what I was thinking (just more refined).

> On the client side it looks like things would be a bit harder. The main 
> problem
> seems to be dealing with SIGPIPE. We could possibly deal with that by moving
> the handling of that a layer up. That's a thick nest of ugly stuff :(.

My initial plan was to keep this for the backend, as the invasiveness of the
frontend patch is unlikely to be justified by the returns of the acceleration.

> FWIW, I don't think hardware tls acceleration is a particularly crucial thing
> for now.

Agreed, it will most likely be of limited use to most.  It might however make
sense to "get in on the ground floor" to be ready in case it's expanded on in
kernel+OpenSSL with postgres automatically just reaping the benefits.  Either
way I was hoping to get to a patch which is close enough to what it would need
to look like so we can decide with the facts at hand.

> I don't plan to work on this, but Thomas encouraged me to mention this on the
> list when I mention it to him.

I still have it on my TODO for after the vacation, and hope to reach that part
of the list soon.

--
Daniel Gustafsson   https://vmware.com/





Re: POC: GROUP BY optimization

2021-07-21 Thread Tomas Vondra
Hi,

after a bit more time spent on this, I found that the issue is due to
this chunk of code:

if (heapSort)
{
if (tuplesPerPrevGroup < output_tuples)
/* comparing only inside output_tuples */
correctedNGroups =
ceil(2.0 * output_tuples /
 (tuplesPerPrevGroup / nGroups));
else
/* two groups - in output and out */
correctedNGroups = 2.0;
}
else
correctedNGroups = nGroups;

if (correctedNGroups <= 1.0)
correctedNGroups = 2.0;
else
correctedNGroups = ceil(correctedNGroups);
per_tuple_cost += totalFuncCost * LOG2(correctedNGroups);


There are a couple of issues, mostly due to differences in handling of cases
with different heapSort flag values. For a full sort (no LIMIT clause) we have
heapSort=false, and hence the execution simply jumps to

correctedNGroups = nGroups;

while for LIMIT we do heapSort=true, in which case we also start with

tuplesPerPrevGroup = ntuples;

That is almost certainly greater than output_tuples (=limit+offset), so
the first if condition in calculating correctedNGroups can't possibly be
true, and we simply end with

correctedNGroups = 2.0;

in the first loop. Which seems pretty bogus - why would there be just
two groups? When processing the first expression, it's as if there was
one big "prev group" with all the tuples, so why not just use nGroups
as it is?

This seems to confuse the costing quite a bit - enough to produce the
"inversed" costs with/without LIMIT, and picking the other plan.

I've simplified the costing a bit, and the attached version actually
undoes all the "suspicious" plan changes in postgres_fdw. It changes one
new plan, but that seems somewhat reasonable, as it pushes sort to the
remote side.

But after looking at the costing function, I have a bunch of additional
comments and questions:


1) I looked at the resources mentioned as sources the formulas came
from, but I've been unable to really match the algorithm to them. The
quicksort paper is particularly "dense", the notation seems to be very
different, and none of the theorems seem like an obvious fit. Would be
good to make the relationship clearer in comments etc.

For the Sedgewick course it's even worse - it's way too extensive to
just point at it and say "We're using ideas from this!" because no one
is going to know which chapter/section to look at. We need to be a bit
more specific about the reference.


2) I'm a bit puzzled by the "heapsort" chunk, actually. How come we need
it now, when we didn't need that before? In a way, the difference in
behavior between heapsort and non-heapsort is what triggered the plan
changes ...

FWIW It's quite possible I tweaked the costing incorrectly, but it ends
up choosing the right plans purely by accident.


3) I'm getting a bit skeptical about the various magic coefficients that
are meant to model higher costs with non-uniform distribution. But
consider that we do this, for example:

   tuplesPerPrevGroup = ceil(1.5 * tuplesPerPrevGroup / nGroups);

but then in the next loop we call estimate_num_groups_incremental and
pass this "tweaked" tuplesPerPrevGroup value to it. I'm pretty sure this
may have various strange consequences - we'll calculate the nGroups
based on the inflated value, and we'll calculate tuplesPerPrevGroup from
that again - that seems susceptible to "amplification".

We inflate tuplesPerPrevGroup by 50%, which means we'll get a higher
nGroups estimate in the next loop - but not linearly. And then we'll
calculate the inflated tuplesPerPrevGroup and estimated nGroup ...

That seems pretty dubious, with hard to explain behavior, IMO.

If we want to keep applying these coefficients, we need to do that in a
way that does not affect the subsequent loop like this - we might tweak
the per_tuple_cost formula, for example, not tuplesPerPrevGroup.


4) I'm not sure it's actually a good idea to pass tuplesPerPrevGroup to
estimate_num_groups_incremental. In principle yes, if we use "group
size" from the previous step, then the returned value is the number of
new groups after adding the "new" pathkey.

But even if we ignore the issues with amplification mentioned in (3),
there's an issue with non-linear behavior in estimate_num_groups,
because at some point it's calculating

D(N,n,p) = n * (1 - ((N-p)/N)^(N/n))

where N - total rows, p - sample size, n - number of distinct values.
And if we have (N1,n1) and (N2,n2) then the ratio of calculated
estimated (which is pretty much what calculating group size does)

D(N2,n2,p2) / D(N1,n1,p1)

which will differ depending on p1 and p2. And if we're tweaking the
tuplesPerPrevGroup all the time, that's really annoying, as it may make
the groups smaller or larger, which is unpredictable and annoying, and I
wonder if it might go against the idea of penalizing tuplesPerPrevGroup
to some extent.
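
To illustrate that non-linearity, here is a toy sketch that just evaluates
the formula above for the same N and n but two different sample sizes (not
planner code; the numbers are made up):

#include <math.h>
#include <stdio.h>

/* D(N, n, p) = n * (1 - ((N - p) / N)^(N / n)) */
static double
estimate_distinct(double N, double n, double p)
{
	return n * (1.0 - pow((N - p) / N, N / n));
}

int
main(void)
{
	/* N = 1M rows, n = 1000 distinct values */
	printf("p = 500:  %.1f\n", estimate_distinct(1000000, 1000, 500));	/* ~394 */
	printf("p = 1000: %.1f\n", estimate_distinct(1000000, 1000, 1000));	/* ~632 */

	/* doubling p does not double the estimate */
	return 0;
}

So the ratio D(N2,n2,p2) / D(N1,n1,p1) really does move around as we keep
tweaking tuplesPerPrevGroup, which is what makes the behavior hard to
reason about.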

We could simply use the input "tuples" value here, and then divide the
curren

Re: Using a stock openssl BIO

2021-07-21 Thread Andres Freund
Hi,

On 2021-07-22 00:21:16 +0200, Daniel Gustafsson wrote:
> > On 15 Jul 2021, at 04:17, Andres Freund  wrote:
>
> > Thomas^WA bad person recently nerdsniped me (with the help of an accidental 
> > use
> > of an SSL connection in a benchmark leading to poor results) into checking 
> > what
> > would be needed to benefit from SSL/TLS hardware acceleration (available 
> > with
> > suitable hardware, OS support (linux and freebsd) and openssl 3).
>
> Now why does that sounds so familiar.. =)

:)


> > On the client side it looks like things would be a bit harder. The main 
> > problem
> > seems to be dealing with SIGPIPE. We could possibly deal with that by moving
> > the handling of that a layer up. That's a thick nest of ugly stuff :(.
>
> My initial plan was to keep this for the backend, as the invasiveness of the
> frontend patch is unlikely to be justified by the returns of the acceleration.

There's two main reasons I'd prefer not to do that:

1) It makes it surprisingly hard to benchmark the single connection TLS
   throughput, because there's no client that can pull data quick enough.
2) The main case for wanting offload imo is bulk data stuff (basebackup,
   normal backup). For that you also want to be able to receive the data
   fast. Outside of situations like that I don't think the gains are likely to
   be meaningful given AES-NI and other similar cpu level acceleration.


> > FWIW, I don't think hardware tls acceleration is a particularly crucial 
> > thing
> > for now.
>
> Agreed, it will most likely be of limited use to most.  It might however make
> sense to "get in on the ground floor" to be ready in case it's expanded on in
> kernel+OpenSSL with postgres automatically just reaping the benefits.  Either
> way I was hoping to get to a patch which is close enough to what it would need
> to look like so we can decide with the facts at hand.

Yea. I also just think getting rid of the bio stuff is good for
maintainability / robustness. Relying on our own socket functions being
compatible with the openssl bio doesn't sound very future proof... Especially
not combined with our use of the data field, which of course the other bio
functions may use (as they do when ktls is enabled!).


> > I don't plan to work on this, but Thomas encouraged me to mention this on 
> > the
> > list when I mention it to him.
>
> I still have it on my TODO for after the vacation, and hope to reach that part
> of the list soon.

Cool!

Greetings,

Andres Freund




Re: something is wonky with pgbench pipelining

2021-07-21 Thread Andres Freund
Hi,

On 2021-07-20 14:57:15 -0400, Alvaro Herrera wrote:
> On 2021-Jul-20, Andres Freund wrote:
> 
> > I think what's happening is that the first recvfrom() actually gets all 7
> > connection results. The server doesn't have any queries to process at that
> > point. But we ask the kernel whether there is new network input over and 
> > over
> > again, despite having results to process!
> 
> Hmm, yeah, that seems a missed opportunity.

> > with-isbusy:
> > ...
> > tps = 3990.424742 (without initial connection time)
> > ...
> >   1,013.71 msec task-clock#0.202 CPUs utilized
> > 80,203  raw_syscalls:sys_enter#   79.119 K/sec
> > 19,947  context-switches  #   19.677 K/sec
> >  2,943,676,361  cycles:u  #2.904 GHz
> >346,607,769  cycles:k  #0.342 GHz
> >  8,464,188,379  instructions:u#2.88  insn per cycle
> >226,665,530  instructions:k#0.65  insn per cycle
> 
> This is quite compelling.
> 
> If you don't mind I can get this pushed soon in the next couple of days
> -- or do you want to do it yourself?

I was thinking of pushing the attached, to both 14 and master, thinking
that was what you meant, but then I wasn't quite sure: It's a relatively
minor performance improvement, after all? OTOH, it arguably also just is
a bit of an API misuse...

I'm inclined to push it to 14 and master, but ...

Greetings,

Andres Freund
>From 336497ae1df22b966c3b05826987eb3b374f551a Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 21 Jul 2021 16:40:43 -0700
Subject: [PATCH] pgbench: When using pipelining only do PQconsumeInput() when
 necessary.

Up to now we did a PQconsumeInput() for each pipelined query, asking the OS
for more input - which it often won't have, as all results might already have
been sent. That turns out to have a noticeable performance impact.

Alvaro Herrera reviewed the idea to add the PQisBusy() check, but not this
concrete patch.

Author: Andres Freund 
Discussion: https://postgr.es/m/20210720180039.23rivhdft3l4m...@alap3.anarazel.de
Backpatch: 14, where libpq/pgbench pipelining was introduced.
---
 src/bin/pgbench/pgbench.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 364b5a2e47d..129cf2ed61d 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3460,7 +3460,14 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
  */
 			case CSTATE_WAIT_RESULT:
 pg_log_debug("client %d receiving", st->id);
-if (!PQconsumeInput(st->con))
+
+/*
+ * Only check for new network data if we processed all data
+ * fetched prior. Otherwise we end up doing a syscall for each
+ * individual pipelined query, which has a measurable
+ * performance impact.
+ */
+if (PQisBusy(st->con) && !PQconsumeInput(st->con))
 {
 	/* there's something wrong */
 	commandFailed(st, "SQL", "perhaps the backend died while processing");
-- 
2.32.0.rc2



Re: Have I found an interval arithmetic bug?

2021-07-21 Thread Bruce Momjian
On Wed, Jul 21, 2021 at 01:29:49PM -0400, Tom Lane wrote:
> Bryn Llewellyn  writes:
> > It was me that started the enormous thread with the title “Have I found an 
> > interval arithmetic bug?” on 01-Apr-2021. I presented this testcase:
> 
> >> select interval '-1.7 years';  -- -1 years -8 mons
> >> 
> >> select interval '29.4 months'; --  2 years  5 mons 
> >> 12 days
> >> 
> >> select interval '-1.7 years 29.4 months';  --   8 mons 
> >> 12 days << wrong
> >> select interval '29.4 months -1.7 years';  --   9 mons 
> >> 12 days
> >> 
> >> select interval '-1.7 years' + interval '29.4 months'; --   9 mons 
> >> 12 days
> >> select interval '29.4 months' + interval '-1.7 years'; --   9 mons 
> >> 12 days
> 
> > The consensus was that the outcome that I flagged with “wrong” does indeed 
> > have that status.
> 
> Yeah, I think it's self-evident that your last four cases should
> produce the same results.  Whether '9 mons 12 days' is the best
> possible result is debatable --- in a perfect world, maybe we'd
> produce '9 mons' exactly --- but given that the first two cases
> produce what they do, that does seem self-consistent.  I think
> we should be setting out to fix that outlier without causing
> any of the other five results to change.

OK, I decided to reverse some of the changes I was proposing once I
started to think about the inaccuracy of not spilling down from 'weeks'
to seconds when hours also appear.  The fundamental issue is that the
months-to-days conversion is almost always an approximation, while the
days to seconds conversion is almost always accurate.  This means we are
never going to have consistent spill-down that is useful.

Therefore, I went ahead and accepted that years and larger units spill
only to months, months spill only to days, and weeks and lower spill all
the way down to seconds.  I also spelled this out in the docs, and
explained why we have this behavior.

Also, with my patch, the last four queries return the same result
because of the proper rounding also added by the patch, attached.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index e016f96fb4..e831242bf6 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2800,15 +2800,18 @@ P  years-months-
 
 
 
- In the verbose input format, and in some fields of the more compact
- input formats, field values can have fractional parts; for example
- '1.5 week' or '01:02:03.45'.  Such input is
- converted to the appropriate number of months, days, and seconds
- for storage.  When this would result in a fractional number of
- months or days, the fraction is added to the lower-order fields
- using the conversion factors 1 month = 30 days and 1 day = 24 hours.
- For example, '1.5 month' becomes 1 month and 15 days.
- Only seconds will ever be shown as fractional on output.
+ Field values can have fractional parts;  for example, '1.5
+ weeks' or '01:02:03.45'.  However,
+ because interval internally stores only three integer units (months,
+ days, seconds), fractional units must be spilled to smaller units.
+ For example, because months are approximated to equal 30 days,
+ fractional values of units greater than months is rounded to be the
+ nearest integer number of months.  Fractional months are rounded to
+ be the nearest integer number of days, assuming 24 hours per day.
+ Fractional units smaller than months are computed to the nearest
+ second.  For example, '1.5 months' becomes
+ 1 month 15 days.  Only seconds will ever be shown
+ as fractional on output.
 
 
 
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 54ae632de2..5551102447 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3300,35 +3300,43 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 
 	case DTK_MONTH:
 		tm->tm_mon += val;
-		AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
+		/*
+		 * The DAYS_PER_MONTH is an estimate so just round
+		 * to days, rather than spilling to seconds.
+		 */
+		/* round to a full month? */
+		if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)
+			tm->tm_mon++;
+		else
+			tm->tm_mday += rint(fval * DAYS_PER_MONTH);
 		tmask = DTK_M(MONTH);
 		break;
 
 	case DTK_YEAR:
 		tm->tm_year += val;
 		if (fval != 0)
-			tm->tm_mon += fval * MONTHS_PER_YEAR;
+			tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
 		tmask = DTK_M(YEAR);
 		break;
 
 	case DTK_DECADE:
 		tm->tm_year += val * 10;
 		if (fval != 0)
-			tm->tm_mon += fval * MONTHS_

Re: [POC] verifying UTF-8 using SIMD instructions

2021-07-21 Thread Thomas Munro
On Thu, Jul 22, 2021 at 6:16 AM John Naylor
 wrote:
> Neat! It's good to make it more architecture-agnostic, and I'm sure we can 
> use quite a bit of this.

One question is whether this "one size fits all" approach will be
extensible to wider SIMD.

>  to_bool(const pg_u8x16_t v)
>  {
> +#if defined(USE_NEON)
> + return vmaxvq_u32((uint32x4_t) v) != 0;
>
> --> return vmaxvq_u8(*this) != 0;

I chose that lane width because I saw an unsubstantiated claim
somewhere that it might be faster, but I have no idea if it matters.
The u8 code looks more natural anyway.  Changed.

>  vzero()
>  {
> +#if defined(USE_NEON)
> + return vmovq_n_u8(0);
>
> --> return vdupq_n_u8(0); // or equivalently, splat(0)

I guess it doesn't make a difference which builtin you use here, but I
was influenced by the ARM manual which says the vdupq form is
generated for immediate values.

> is_highbit_set(const pg_u8x16_t v)
>  {
> +#if defined(USE_NEON)
> + return to_bool(bitwise_and(v, vmovq_n_u8(0x80)));
>
> --> return vmaxq_u8(v) > 0x7F

Ah, of course.  Much nicer!

> +#if defined(USE_NEON)
> +static pg_attribute_always_inline pg_u8x16_t
> +vset(uint8 v0, uint8 v1, uint8 v2, uint8 v3,
> + uint8 v4, uint8 v5, uint8 v6, uint8 v7,
> + uint8 v8, uint8 v9, uint8 v10, uint8 v11,
> + uint8 v12, uint8 v13, uint8 v14, uint8 v15)
> +{
> + uint8 pg_attribute_aligned(16) values[16] = {
> + v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14, v15
> + };
> + return vld1q_u8(values);
> +}
>
> --> They have this strange beast instead:
>
>   // Doing a load like so end ups generating worse code.
>   // uint8_t array[16] = {x1, x2, x3, x4, x5, x6, x7, x8,
>   // x9, x10,x11,x12,x13,x14,x15,x16};
>   // return vld1q_u8(array);
>   uint8x16_t x{};
>   // incredibly, Visual Studio does not allow x[0] = x1
>   x = vsetq_lane_u8(x1, x, 0);
>   x = vsetq_lane_u8(x2, x, 1);
>   x = vsetq_lane_u8(x3, x, 2);
> ...
>   x = vsetq_lane_u8(x15, x, 14);
>   x = vsetq_lane_u8(x16, x, 15);
>   return x;
>
> Since you aligned the array, that might not have the problem alluded to 
> above, and it looks nicer.

Strange indeed.  We should probably poke around in the assembler and
see... it might be that MSVC doesn't like it, and I was just
cargo-culting the alignment.  I don't expect the generated code to
really "load" anything of course, it should ideally be some kind of
immediate mov...

FWIW here are some performance results from my humble RPI4:

master:

 chinese | mixed | ascii
-+---+---
4172 |  2763 |  1823
(1 row)

Your v15 patch:

 chinese | mixed | ascii
-+---+---
2267 |  1248 |   399
(1 row)

Your v15 patch set + the NEON patch, configured with USE_UTF8_SIMD=1:

 chinese | mixed | ascii
-+---+---
 909 |   620 |   318
(1 row)

It's so good I wonder if it's producing incorrect results :-)

I also tried to do a quick and dirty AltiVec patch to see if it could
fit into the same code "shape", with less immediate success: it works
out slower than the fallback code on the POWER7 machine I scrounged an
account on.  I'm not sure what's wrong there, but maybe it's a useful
start (I'm probably confused about endianness, or the encoding of
boolean vectors which may be different (is true 0x01 or 0xff, does it
matter?), or something else, and it's falling back on errors all the
time?).
From 1464c22da33117900341496d03b92dec5be2a62a Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Thu, 22 Jul 2021 02:05:06 +1200
Subject: [PATCH v2 1/3] XXX Make SIMD code more platform neutral.

Move SIMD code into pg_utf_simd.c to experiment with the idea of a
shared implementation across architectures.  Introduce pg_u8x16_t to
abstract vector type.

XXX Experiment grade code only
---
 configure |  18 +--
 configure.ac  |  18 +--
 src/include/pg_config.h.in|  14 +-
 src/include/port/pg_utf8.h|  10 +-
 src/port/Makefile |   8 +-
 ...g_utf8_sse42_choose.c => pg_utf8_choose.c} |  10 +-
 src/port/{pg_utf8_sse42.c => pg_utf8_simd.c}  | 148 +-
 7 files changed, 114 insertions(+), 112 deletions(-)
 rename src/port/{pg_utf8_sse42_choose.c => pg_utf8_choose.c} (88%)
 rename src/port/{pg_utf8_sse42.c => pg_utf8_simd.c} (76%)

diff --git a/configure b/configure
index 30969840b1..df546b641c 100755
--- a/configure
+++ b/configure
@@ -18442,13 +18442,13 @@ fi
 #
 # You can override this logic by setting the appropriate USE_*_UTF8 flag to 1
 # in the template or configure command line.
-if test x"$USE_SSE42_UTF8" = x"" && test x"$USE_SSE42_UTF8_WITH_RUNTIME_CHECK" 
= x"" && test x"$USE_FALLBACK_UTF8" = x""; then
+if test x"$USE_SIMD_UTF8" = x"" && test x"$USE_SIMD_UTF8_WITH_RUNTIME_CHECK" = 
x"" && test x"$USE_FALLBACK_UTF8" = x""; then
   if test x"$pgac_sse42_intrinsics" = x"yes" && test x"$SSE4_2_TARGETED" = 
x"1" ; then
-USE_SSE42_UTF8=1
+USE_SIMD_UTF8=1
  

Re: ORDER BY pushdowns seem broken in postgres_fdw

2021-07-21 Thread Ranier Vilela
On Wed, Jul 21, 2021 at 11:33, Ronan Dunklau wrote:

> On Wednesday, July 21, 2021 at 15:45:15 CEST, you wrote:
> > On Thu, 22 Jul 2021 at 00:28, Ronan Dunklau 
> wrote:
> > > The attached patch does the following:
> > >   - verify the opfamily is shippable to keep pathkeys
> > >   - generate a correct order by clause using the actual operator.
> >
> > Thanks for writing the patch.
> >
> > This is just a very superficial review.  I've not spent a great deal
> > of time looking at postgres_fdw code, so would rather some eyes that
> > were more familiar with the code looked too.
>
> Thank you for the review.
>
> >
> > 1. This comment needs to be updated. It still mentions
> > is_foreign_expr, which you're no longer calling.
> >
> >   * is_foreign_expr would detect volatile expressions as well, but
> >   * checking ec_has_volatile here saves some cycles.
> >   */
> > - if (pathkey_ec->ec_has_volatile ||
> > - !(em_expr = find_em_expr_for_rel(pathkey_ec, rel)) ||
> > - !is_foreign_expr(root, rel, em_expr))
> > + if (!is_foreign_pathkey(root, rel, pathkey))
> >
> Done. By the way, the comment just above mentions we don't have a way to
> use a
> prefix pathkey, but I suppose we should revisit that now that we have
> IncrementalSort. I'll mark it in my todo list for another patch.
>
> > 2. This is not a very easy return condition to read:
> >
> > + return (!pathkey_ec->ec_has_volatile &&
> > + (em = find_em_for_rel(pathkey_ec, baserel)) &&
> > + is_foreign_expr(root, baserel, em->em_expr) &&
> > + is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId, fpinfo));
> >
> > I think it would be nicer to break that down into something easier on
> > the eyes that could be commented a little more.
>
> Done, let me know what you think about it.
>
> >
> > 3. This comment is no longer true:
> >
> >   * Find an equivalence class member expression, all of whose Vars, come
> > from * the indicated relation.
> >   */
> > -Expr *
> > -find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
> > +EquivalenceMember*
> > +find_em_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
> >
> > Also, missing space after EquivalenceMember.
> >
> > The comment can just be moved down to:
> >
> > +Expr *
> > +find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
> > +{
> > + EquivalenceMember *em = find_em_for_rel(ec, rel);
> > + return em ? em->em_expr : NULL;
> > +}
> >
> > and you can rewrite the one for find_em_for_rel.
>
> I have done it the other way around. I'm not sure we really need to keep
> the
> find_em_expr_for_rel function on HEAD. If we decide to backpatch, it would
> need
> to be kept though.
>
Unfortunately your patch does not apply cleanly onto HEAD.
So I have a few suggestions on v2, attached with the .txt extension to
avoid the cfbot.
Please, if OK, make a v3.

1. New version of is_foreign_pathkey?
+bool
+is_foreign_pathkey(PlannerInfo *root,
+   RelOptInfo *baserel,
+   PathKey *pathkey)
+{
+ EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
+ EquivalenceMember *em;
+
+ /*
+ * is_foreign_expr would detect volatile expressions as well, but
+ * checking ec_has_volatile here saves some cycles.
+ */
+ if (pathkey_ec->ec_has_volatile)
+ return false;
+
+ /*
+ * Found member's expression is foreign?
+ */
+ em = find_em_for_rel(pathkey_ec, baserel);
+ if (em != NULL && is_foreign_expr(root, baserel, em->em_expr))
+   {
+ PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) baserel->fdw_private;
+
+ /*
+ * Operator family is shippable?
+ */
+ return is_shippable(pathkey->pk_opfamily, OperatorFamilyRelationId,
fpinfo);
+ }
+
+ return false;
+}

2. appendOrderbyUsingClause function
Put the buffer actions together?

3. Apply Postgres style?
+ if (!HeapTupleIsValid(tuple))
+ {
+ elog(ERROR, "cache lookup failed for operator family %u",
pathkey->pk_opfamily);
+ }

4. Is the assertion OK here?
+ em = find_em_for_rel(pathkey->pk_eclass, baserel);
+ em_expr = em->em_expr;
  Assert(em_expr != NULL);

The find_em_for_rel function can return NULL.
I think em_expr == NULL needs to be handled at runtime.

5. More readable version?
+find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
+{
+ EquivalenceMember *em = find_em_for_rel(ec, rel);
+
+ if (em != NULL)
+ return em->em_expr;
+
+ return NULL;
+}

regards,
Ranier Vilela
commit f7b3dc81878bf2ac503899f010eb18f390a64e37
Author: Ronan Dunklau 
Date:   Wed Jul 21 12:44:41 2021 +0200

Fix postgres_fdw PathKey's handling.

The operator family being used for the sort was completely
ignored, and as such its existence on the foreign server was not
checked.

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 31919fda8c..5efefff65e 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -37,9 +37,11 @@
 #include "access/sysattr.h"
 #include "access/table.h"
 #include "catalog/pg_aggregate.h"
+#include "catalog/pg_amop.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_namespace.h"
 #include 

Re: Micro-optimizations to avoid some strlen calls.

2021-07-21 Thread Ranier Vilela
On Wed, Jul 21, 2021 at 09:28, Ranier Vilela wrote:

> On Wed, Jul 21, 2021 at 07:44, David Rowley wrote:
>
>> On Tue, 20 Jul 2021 at 10:49, Ranier Vilela  wrote:
>> > There are some places, where strlen can have an overhead.
>> > This patch tries to fix this.
>>
>> I'm with Michael and David on this.
>>
>> I don't really feel like doing;
>>
>> - snprintf(buffer, sizeof(buffer), "E%s%s\n",
>> + buflen = snprintf(buffer, sizeof(buffer), "E%s%s\n",
>>   _("could not fork new process for connection: "),
>>
>> is a good idea.  I'm unsure if you're either not aware of the value
>> that snprintf() returns or just happen to think an overflow is
>> unlikely enough because you're convinced that 1000 chars are always
>> enough to fit this translatable string.   I'd say if we were 100%
>> certain of that then it might as well become sprintf() instead.
>> However, I imagine you'll struggle to get people to side with you that
>> taking this overflow risk would be worthwhile given your lack of any
>> evidence that anything actually has become meaningfully faster as a
>> result of any of these changes.
>>
> I got your point.
> Really, relying only on the return value of snprintf is a bad idea.
> In this case, the right way would be:
>
> snprintf(buffer, sizeof(buffer), "E%s%s\n",
>   _("could not fork new process for connection: "),
> buflen = strlen(buffer);
>
> That way the buffer doesn't have to be recounted if rc fails.
> Thanks for the tip about snprintf, even though that wasn't the intention here.
> This is what I call a bad interface.
>
Here is the v1 version, with a fix for the snprintf trap.

regards,
Ranier Vilela


v1-micro-optmization-avoid-strlen.patch
Description: Binary data


RE: [PATCH] Use optimized single-datum tuplesort in ExecSort

2021-07-21 Thread houzj.f...@fujitsu.com
From: David Rowley 
> On Wed, 21 Jul 2021 at 13:39, James Coleman  wrote:
> > Thanks for doing the math measuring how much we could impact things.
> >
> > I'm +lots on getting this committed as is.
> 
> Ok good. I plan on taking a final look at the v10 patch tomorrow morning NZ
> time (about 12 hours from now) and if all is well, I'll push it.
> 
> If anyone feels differently, please let me know before then.
Hi,

I noticed a minor thing about the v10 patch.

-
-   for (;;)
+   if (node->datumSort)
{
-   slot = ExecProcNode(outerNode);
-
-   if (TupIsNull(slot))
-   break;
-
-   tuplesort_puttupleslot(tuplesortstate, slot);
+   for (;;)
+   {
+   slot = ExecProcNode(outerNode);
+
+   if (TupIsNull(slot))
+   break;
+   slot_getsomeattrs(slot, 1);
+   tuplesort_putdatum(tuplesortstate,
+  
slot->tts_values[0],
+  
slot->tts_isnull[0]);
+   }
+   }
+   else
+   {
+   for (;;)
+   {
+   slot = ExecProcNode(outerNode);
+
+   if (TupIsNull(slot))
+   break;
+   tuplesort_puttupleslot(tuplesortstate, slot);
+   }

The above seems can be shorter like the following ?

for (;;)
{
slot = ExecProcNode(outerNode);
if (TupIsNull(slot))
break;
if (node->datumSort)
{
slot_getsomeattrs(slot, 1);
tuplesort_putdatum(tuplesortstate,
slot->tts_values[0],
slot->tts_isnull[0]);
}
else
tuplesort_puttupleslot(tuplesortstate, slot);
}

Best regards,
houzj



RE: Added schema level support for publication.

2021-07-21 Thread houzj.f...@fujitsu.com


From: vignesh C 
Sent: Thursday, July 22, 2021 1:38 AM
To: Rahila Syed 
Cc: Greg Nancarrow ; Tang, Haiying/唐 海英 
; Ajin Cherian ; PostgreSQL Hackers 
; Bharath Rupireddy 

Subject: Re: Added schema level support for publication.

On Wed, Jul 21, 2021 at 3:14 PM Rahila Syed wrote:
>
>
>
> On Mon, Jul 19, 2021 at 2:41 PM Greg Nancarrow wrote:
>>
>> On Fri, Jul 16, 2021 at 8:13 PM vignesh C wrote:
>> >
>> > Modified.
>> >
>> > Thanks for the comments, these issues are fixed as part of the v12 patch 
>> > posted at [1].
>> > [1]  - 
>> > https://www.postgresql.org/message-id/CALDaNm3V9ny5dJM8nofLGJ3zDuDG0gS2dX%2BAhDph--U5y%2B4VbQ%40mail.gmail.com
>> >
>>
>> There seems to be a problem with ALTER PUBLICATION ... SET TABLE ...
>> After that command, it still regards it as an empty (e) publication,
>> so I can then ALTER PUBLICATION ... ADD SCHEMA ...
>>
>
> One issue here is that the code to update publication type is missing
> in AlterPublicationTables for SET TABLE command.

Modified.

> More broadly, I am not clear about the behaviour of the patch when a
> publication is created to publish only certain tables, and is later altered 
> to publish
> a whole schema. I think such behaviour is legitimate. However,
> AFAIU as per current code we can't update the publication type
> from PUBTYPE_TABLE to PUBTYPE_SCHEMA.

I initially thought this might not be required for users, so I have not made
any change for this. I will try to get a few more people's opinions on this and
then fix it if required.

> I have some review comments as follows:
> 1.
> In ConvertSchemaSpecListToOidList(List *schemas) function:
>  + search_path = fetch_search_path(false);
>  +   nspname = 
> get_namespace_name(linitial_oid(search_path));
>  +   if (nspname == NULL)/* recently-deleted 
> namespace? */
>  +   ereport(ERROR,
>  +   
> errcode(ERRCODE_UNDEFINED_SCHEMA),
>  +   errmsg("no schema 
> has been selected"));
>  +
>  +   schemoid = get_namespace_oid(nspname, false);
>  +   break;
>
> The call get_namespace_oid() is perhaps not needed as fetch_search_path 
> already fetches oids and simply
> doing schemoid = linitial_oid(search_path); should be enough.

Modified

> 2. In the same function should there be an if else condition block instead of 
> a switch case as
> there are only two cases.

Modified.

Thanks for the comments, these comments are fixed in the v13 patch posted at 
[1].
[1] - 
https://www.postgresql.org/message-id/CALDaNm0%3DMaXyAok5iq_-DeWUd81vpdF47-MZbbrsd%2BzB2P6WwA%40mail.gmail.com

Regards,
Vignesh


Re: [PATCH] Use optimized single-datum tuplesort in ExecSort

2021-07-21 Thread David Rowley
On Thu, 22 Jul 2021 at 12:27, houzj.f...@fujitsu.com
 wrote:
> The above seems can be shorter like the following ?
>
> for (;;)
> {
> slot = ExecProcNode(outerNode);
> if (TupIsNull(slot))
> break;
> if (node->datumSort)
> {
> slot_getsomeattrs(slot, 1);
> tuplesort_putdatum(tuplesortstate,
> slot->tts_values[0],
> slot->tts_isnull[0]);
> }
> else
> tuplesort_puttupleslot(tuplesortstate, slot);
> }

I don't think that's a good change.  It puts the branch inside the
loop that pulls all tuples from the subplan.  Given the loop is likely
to be very hot combined with the fact that it's so simple, I'd much
rather have two separate loops to keep the extra branch outside the
loop.  It's true the branch predictor is likely to get the prediction
correct on each iteration, but unless the compiler rewrites this into
two loops then the comparison and jump must be done per loop.

David




RE: Added schema level support for publication.

2021-07-21 Thread houzj.f...@fujitsu.com
On, July 22, 2021 1:38 AM vignesh C  wrote:
>On Wed, Jul 21, 2021 at 3:14 PM Rahila Syed  
>wrote:
>> More broadly, I am not clear about the behaviour of the patch when a
>> publication is created to publish only certain tables, and is later altered 
>> to publish
>> a whole schema. I think such behaviour is legitimate. However,
>> AFAIU as per current code we can't update the publication type
>> from PUBTYPE_TABLE to PUBTYPE_SCHEMA.

> I initially thought this might not be required for users, I have not made any
> change for this, I will try to get a few more people's opinion on this and 
> then fix it if required.

Currently, it's not allowed to ALTER a FOR TABLE PUBLICATION to a FOR ALL
TABLES PUBLICATION. So, I am not sure it's legitimate to ALTER a FOR TABLE
PUBLICATION to a FOR SCHEMA PUBLICATION. Personally, it sounds more like a
separate feature which can be discussed in a separate thread.

Best regards,
houzj


RE: [PATCH] Use optimized single-datum tuplesort in ExecSort

2021-07-21 Thread houzj.f...@fujitsu.com
On July 22, 2021 8:38 AM David Rowley 
> On Thu, 22 Jul 2021 at 12:27, houzj.f...@fujitsu.com 
> wrote:
> > The above seems can be shorter like the following ?
> >
> > for (;;)
> > {
> > slot = ExecProcNode(outerNode);
> > if (TupIsNull(slot))
> > break;
> > if (node->datumSort)
> > {
> > slot_getsomeattrs(slot, 1);
> > tuplesort_putdatum(tuplesortstate,
> > slot->tts_values[0],
> > slot->tts_isnull[0]);
> > }
> > else
> > tuplesort_puttupleslot(tuplesortstate, slot); }
> 
> I don't think that's a good change.  It puts the branch inside the loop the 
> pulls
> all tuples from the subplan.  Given the loop is likely to be very hot combined
> with the fact that it's so simple, I'd much rather have two separate loops to
> keep the extra branch outside the loop.  It's true the branch predictor is 
> likely
> to get the prediction correct on each iteration, but unless the compiler
> rewrites this into two loops then the comparison and jump must be done per
> loop.

Ah, you are right, I missed that. Thanks for the explanation.

Best regards,
houzj


Re: Have I found an interval arithmetic bug?

2021-07-21 Thread Bryn Llewellyn
> On 21-Jul-2021, at 17:07, Bruce Momjian  wrote:
> 
> On Wed, Jul 21, 2021 at 01:29:49PM -0400, Tom Lane wrote:
>> Bryn Llewellyn  writes:
>>> It was me that started the enormous thread with the title “Have I found an 
>>> interval arithmetic bug?” on 01-Apr-2021. I presented this testcase:
>> 
 select interval '-1.7 years';  -- -1 years -8 mons

 select interval '29.4 months'; --  2 years  5 mons 12 days

 select interval '-1.7 years 29.4 months';  --  8 mons 12 days  << wrong
 select interval '29.4 months -1.7 years';  --  9 mons 12 days

 select interval '-1.7 years' + interval '29.4 months'; --  9 mons 12 days
 select interval '29.4 months' + interval '-1.7 years'; --  9 mons 12 days
>> 
>>> The consensus was that the outcome that I flagged with “wrong” does indeed 
>>> have that status.
>> 
>> Yeah, I think it's self-evident that your last four cases should
>> produce the same results.  Whether '9 mons 12 days' is the best
>> possible result is debatable --- in a perfect world, maybe we'd
>> produce '9 mons' exactly --- but given that the first two cases
>> produce what they do, that does seem self-consistent.  I think
>> we should be setting out to fix that outlier without causing
>> any of the other five results to change.
> 
> OK, I decided to reverse some of the changes I was proposing once I
> started to think about the inaccuracy of not spilling down from 'weeks'
> to seconds when hours also appear.  The fundamental issue is that the
> months-to-days conversion is almost always an approximation, while the
> days to seconds conversion is almost always accurate.  This means we are
> never going to have consistent spill-down that is useful.
> 
> Therefore, I went ahead and accepted that years and larger units spill
> only to months, months spill only to days, and weeks and lower spill all
> the way down to seconds.  I also spelled this out in the docs, and
> explained why we have this behavior.
> 
> Also, with my patch, the last four queries return the same result
> because of the proper rounding also added by the patch, attached.

Your statement

“months-to-days conversion is almost always an approximation, while the days to 
seconds conversion is almost always accurate.” 

is misleading. Conversions like these (and also the “spill up” conversions
that the justify_hours(), justify_days(), and justify_interval() built-in
functions bring) are semantically dangerous because of the different rules for
adding a pure months, a pure days, or a pure seconds interval to a timestamptz
value.
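
For concreteness, here is a small illustration of those different rules, using
the same session time zone and date as the telling example further below (the
commented results assume the 2021 spring-forward DST transition on March 14 in
America/Los_Angeles):

set timezone = 'America/Los_Angeles';

-- a pure days interval preserves the local wall-clock time across the DST
-- change, so only 23 real hours elapse:
select '2021-03-13 19:00:00 America/Los_Angeles'::timestamptz
     + '1 day'::interval;      -- 2021-03-14 19:00:00-07

-- a pure seconds interval adds exactly 24 elapsed hours, so the wall-clock
-- time shifts by one hour:
select '2021-03-13 19:00:00 America/Los_Angeles'::timestamptz
     + '24 hours'::interval;   -- 2021-03-14 20:00:00-07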

Unless you avoid mixed interval values, it is very hard (even though it is
possible) to predict the outcomes of interval arithmetic. Rather, all you get
is emergent behavior that I cannot see being relied upon in deliberately
designed application code. Here’s a telling example:

set timezone = 'America/Los_Angeles';
with
  c as (
select
  '2021-03-13 19:00:00 America/Los_Angeles'::timestamptz as d,
  '25 hours'::interval   as i)
select
  d +   i  as "d + i",
  d + justify_hours(i) as "d + justify_hours(i)"
from c;

This is the result:

          d + i         |  d + justify_hours(i)
------------------------+------------------------
 2021-03-14 21:00:00-07 | 2021-03-14 20:00:00-07

The two results are different, even though the native equality test shows that 
the two different interval values are the same:

with
  c as (select '25 hours'::interval as i)
select (i = justify_hours(i))::text
from c;

The result is TRUE.

The only route to sanity is to use only pure interval values (i.e. where only
one of the fields of the internal [mm, dd, ss] representation is non-zero).

I mentioned that you can use a set of three domain types to enforce your 
intended practice here.
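
A minimal sketch of what such domains could look like (the names and the exact
checks here are only illustrative, not necessarily the definitions from my
earlier post):

create domain pure_months_interval as interval
  check (extract(day    from value) = 0 and
         extract(hour   from value) = 0 and
         extract(minute from value) = 0 and
         extract(second from value) = 0);

create domain pure_days_interval as interval
  check (extract(year   from value) = 0 and
         extract(month  from value) = 0 and
         extract(hour   from value) = 0 and
         extract(minute from value) = 0 and
         extract(second from value) = 0);

create domain pure_seconds_interval as interval
  check (extract(year  from value) = 0 and
         extract(month from value) = 0 and
         extract(day   from value) = 0);

-- accepted: only the months field of [mm, dd, ss] is non-zero
select '2 years 3 months'::pure_months_interval;

-- rejected with a check violation, because both the months and the days
-- fields are non-zero:
-- select '1 month 10 days'::pure_months_interval;

Declaring table columns, function parameters, and variables with these domains,
rather than with bare interval, then enforces the "pure values only" discipline
throughout the application code.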

In other words, by programming application code defensively, it's possible to
insulate oneself entirely from the emergent behavior of the decades-old PG code
that implements the unconstrained native interval functionality and that brings
what can only be considered unpredictable results.

Moreover, this defensive approach insulates you from any changes that Bruce’s 
patch might make.



Re: Have I found an interval arithmetic bug?

2021-07-21 Thread Bruce Momjian
On Wed, Jul 21, 2021 at 06:39:26PM -0700, Bryn Llewellyn wrote:
> Your statement
> 
> 
> “months-to-days conversion is almost always an approximation, while the
> days to seconds conversion is almost always accurate.” 
> 
> 
> is misleading. Any conversion like these (and also the “spill up” conversions
> that the justify_hours(), justify_days(), and justify_interval() built-in
> functions bring) are semantically dangerous because of the different rules for
> adding a pure months, a pure days, or a pure seconds interval to a timestamptz
> value.

We are trying to get the most reasonable output for fractional values
--- I stand by my statements.

> Unless you avoid mixed interval values, then it’s so hard (even though it is
> possible) to predict the outcomes of interval arithmetic. Rather, all you get
> is emergent behavior that I fail to see can be relied upon in deliberately
> designed application code. Here’s a telling example:

The point is that we will get unusual values, so we should do the best
we can.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: [PATCH] Use optimized single-datum tuplesort in ExecSort

2021-07-21 Thread David Rowley
On Wed, 21 Jul 2021 at 22:09, David Rowley  wrote:
>
> On Wed, 21 Jul 2021 at 13:39, James Coleman  wrote:
> > Thanks for doing the math measuring how much we could impact things.
> >
> > I'm +lots on getting this committed as is.
>
> Ok good. I plan on taking a final look at the v10 patch tomorrow
> morning NZ time (about 12 hours from now) and if all is well, I'll
> push it.

Pushed.

David




RE: Added schema level support for publication.

2021-07-21 Thread houzj.f...@fujitsu.com
On July 22, 2021 1:30 AM vignesh C  wrote
> > I think PubType in this case should be 't' instead of 'e'. Please have a 
> > look.
> 
> Thanks for reporting this issue, this issue is fixed in the attached v13 
> patch.
> I have changed relation name pg_publication_schema to pg_publication_sch
> so that the names are in similar lines with pg_publication_rel relation and 
> similar
> changes were done for variable names too.

Hi,

Thanks for the new version patches.
I had a few comments.

1)
+
+   appendPQExpBuffer(query, "ALTER PUBLICATION %s ", 
fmtId(pubrinfo->pubname));
+   appendPQExpBuffer(query, "ADD SCHEMA %s;\n", 
fmtId(schemainfo->dobj.name));
+

It seems we can combine these two function calls,
like appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD SCHEMA %s;\n",
   fmtId(pubrinfo->pubname),
   fmtId(schemainfo->dobj.name));


2)
+   footers[0] = pstrdup("Publications:");
+

It seems this word needs to be translated.
footers[0] = pstrdup(_("Publications:"));

3)
I think it might be better to add a testcase to cover the issue
reported before [1].

[1] 
https://www.postgresql.org/message-id/CAJcOf-dsKOYKmdrU5nwWeFoHvhiACbmw_KU%3DJQMEeDp6WwijqA%40mail.gmail.com

4)
Personally, the new name pg_publication_sch is not very easy to understand.
(Maybe it's because I am not a native English speaker. If others feel it is OK,
please ignore this comment.)

Best regards,
Houzj


Re: [bug?] Missed parallel safety checks, and wrong parallel safety

2021-07-21 Thread Amit Kapila
On Wed, Jul 21, 2021 at 12:30 AM Robert Haas  wrote:
>
> On Sun, Jul 4, 2021 at 1:44 AM Dilip Kumar  wrote:
> > In general, for the non-partitioned table, where we don't have much
> > overhead of checking the parallel safety and invalidation is also not
> > a big problem so I am tempted to provide an automatic parallel safety
> > check.  This would enable parallelism for more cases wherever it is
> > suitable without user intervention.  OTOH, I understand that providing
> > automatic checking might be very costly if the number of partitions is
> > more.  Can't we provide some mid-way where the parallelism is enabled
> > by default for the normal table but for the partitioned table it is
> > disabled by default and the user has to set it safe for enabling
> > parallelism?  I agree that such behavior might sound a bit hackish.
>
> I think that's basically the proposal that Amit and I have been discussing.
>

I see here we have a mix of opinions from various people. Dilip seems
to be favoring the approach where we provide some option to the user
for partitioned tables and automatic behavior for non-partitioned
tables but he also seems to have mild concerns about this behavior.
OTOH, Greg and Hou-San seem to favor an approach where we can provide
an option to the user for both partitioned and non-partitioned tables.
I am also in favor of providing an option to the user for the sake of
consistency in behavior and not trying to introduce a special kind of
invalidation as it doesn't serve the purpose for partitioned tables.
Robert seems to be in favor of automatic behavior, but it is not very
clear to me if he is fine with dealing differently with partitioned and
non-partitioned relations. Robert, can you please provide your opinion
on what you think is the best way to move forward here?

-- 
With Regards,
Amit Kapila.




[PATCH] support tab-completion for single quote input with equal sign

2021-07-21 Thread tanghy.f...@fujitsu.com
Hi

I found a problem when using tab-completion as follows:

CREATE SUBSCRIPTION my_subscription 
CONNECTION 'host=localhost port=5432 dbname=postgres' [TAB]

The word 'PUBLICATION' couldn't be auto completed as expected.

The reason is that an equal sign inside a single-quoted string is treated as
one of the WORD_BREAKS characters, which makes the word count incorrect for
such an input command string.

The attached patch fixes the problem. With this fix, the characters in
"\t\n@$><=;|&{() " will no longer be treated as WORD_BREAKS inside a
single-quoted input string.
Please take a look at the fix and let me know if you find any scenario
affected by it.

Regards,
Tang


0001-support-tab-completion-for-single-quote-input-with-e.patch
Description:  0001-support-tab-completion-for-single-quote-input-with-e.patch


window build doesn't apply PG_CPPFLAGS correctly

2021-07-21 Thread Pavel Stehule
Hi

I tried to write a test for the plpgsql debug API, where I need access to
plpgsql.h.

I have this line:

PG_CPPFLAGS = -I$(top_srcdir)/src/pl/plpgsql/src

which works well on Unix, but does nothing on Windows:

[00:05:14] Project "C:\projects\postgresql\pgsql.sln" (1) is building
"C:\projects\postgresql\test_dbgapi.vcxproj" (87) on node 1 (default
targets).
[00:05:14] PrepareForBuild:
[00:05:14]   Creating directory ".\Release\test_dbgapi\".
[00:05:14]   Creating directory ".\Release\test_dbgapi\test_dbgapi.tlog\".
[00:05:14] InitializeBuildStatus:
[00:05:14]   Creating
".\Release\test_dbgapi\test_dbgapi.tlog\unsuccessfulbuild" because
"AlwaysCreate" was specified.
[00:05:14] ClCompile:
[00:05:14]   C:\Program Files (x86)\Microsoft Visual Studio
12.0\VC\bin\x86_amd64\CL.exe /c /Isrc/include /Isrc/include/port/win32
/Isrc/include/port/win32_msvc /Zi /nologo /W3 /WX- /Ox /D WIN32 /D _WINDOWS
/D __WINDOWS__ /D __WIN32__ /D WIN32_STACK_RLIMIT=4194304 /D
_CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _WINDLL /D _MBCS
/GF /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope
/Fo".\Release\test_dbgapi\\" /Fd".\Release\test_dbgapi\vc120.pdb" /Gd /TC
/wd4018 /wd4244 /wd4273 /wd4102 /wd4090 /wd4267 /errorReport:queue /MP
src/test/modules/test_dbgapi/test_dbgapi.c
[00:05:14]   test_dbgapi.c
[00:05:16] src/test/modules/test_dbgapi/test_dbgapi.c(17): fatal error
C1083: Cannot open include file: 'plpgsql.h': No such file or directory
[C:\projects\postgresql\test_dbgapi.vcxproj]
[00:05:16] Done Building Project
"C:\projects\postgresql\test_dbgapi.vcxproj" (default targets) -- FAILED.
[00:05:16] Project "C:\projects\postgresql\pgsql.sln" (1) is building
"C:\projects\postgresql\test_ddl_deparse.vcxproj" (88) on node 1 (default
targets).

It looks like PG_CPPFLAGS is not propagated to CPPFLAGS there.

Regards

Pavel


Re: Added schema level support for publication.

2021-07-21 Thread Greg Nancarrow
On Thu, Jul 22, 2021 at 1:42 PM houzj.f...@fujitsu.com
 wrote:
>
> Personally, the new name pg_publication_sch is not very easy to understand.
> (Maybe it's because I am not a native english speaker. If others feel ok,
> please ignore this comment)
>

I was actually thinking the same thing.
I prefer the full SCHEMA/schema, even for all the internal
variables/definitions which have been changed since the last patch
version.
I think Vignesh was trying to be consistent with pg_publication_rel
and pg_subscription_rel, but maybe "rel" is better understood as an
abbreviation for "relation" than "sch" is for "schema"?
Thoughts from others?

Regards,
Greg Nancarrow
Fujitsu Australia




Re: proposal: enhancing plpgsql debug API - returns text value of variable content

2021-07-21 Thread Pavel Stehule
st 21. 7. 2021 v 22:23 odesílatel Pavel Stehule 
napsal:

> Hi
>
> pá 16. 7. 2021 v 18:40 odesílatel Pavel Stehule 
> napsal:
>
>> Hi
>>
>> pá 16. 7. 2021 v 15:05 odesílatel Aleksander Alekseev <
>> aleksan...@timescale.com> napsal:
>>
>>> Hi Pavel,
>>>
>>> > I would like to print content of variables - and now, I have to go some
>>> > deeper than I would like. I need to separate between scalar, row, and
>>> > record variables. PLpgSQL has code for it - but it is private.
>>> > [...]
>>>
>>> The patch seems OK, but I wonder - would it be possible to write a test
>>> on it?
>>>
>>
>>  Sure, it is possible - unfortunately - the size of this test will be
>> significantly bigger than patch self.
>>
>> I'll try to write it some simply tracer, where this API can be used
>>
>
> I am sending an enhanced patch about the regress test for plpgsql's debug
> API.
>

with a modified Makefile to force use of the option
-I$(top_srcdir)/src/pl/plpgsql/src:

override CPPFLAGS := $(CPPFLAGS) -I$(top_srcdir)/src/pl/plpgsql/src



> Regards
>
> Pavel
>
>
>>
>>
>>> --
>>> Best regards,
>>> Aleksander Alekseev
>>>
>>
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 14bbe12da5..e0e68a2592 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -4059,6 +4059,8 @@ plpgsql_estate_setup(PLpgSQL_execstate *estate,
 	{
 		(*plpgsql_plugin_ptr)->error_callback = plpgsql_exec_error_callback;
 		(*plpgsql_plugin_ptr)->assign_expr = exec_assign_expr;
+		(*plpgsql_plugin_ptr)->eval_datum = exec_eval_datum;
+		(*plpgsql_plugin_ptr)->cast_value = do_cast_value;
 
 		if ((*plpgsql_plugin_ptr)->func_setup)
 			((*plpgsql_plugin_ptr)->func_setup) (estate, func);
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 0f6a5b30b1..abe7d63f78 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -415,6 +415,7 @@ pl_block		: decl_sect K_BEGIN proc_sect exception_sect K_END opt_label
 		new->cmd_type	= PLPGSQL_STMT_BLOCK;
 		new->lineno		= plpgsql_location_to_lineno(@2);
 		new->stmtid		= ++plpgsql_curr_compile->nstatements;
+		new->ns			= plpgsql_ns_top();
 		new->label		= $1.label;
 		new->n_initvars = $1.n_initvars;
 		new->initvarnos = $1.initvarnos;
@@ -907,7 +908,8 @@ stmt_perform	: K_PERFORM
 		new = palloc0(sizeof(PLpgSQL_stmt_perform));
 		new->cmd_type = PLPGSQL_STMT_PERFORM;
 		new->lineno   = plpgsql_location_to_lineno(@1);
-		new->stmtid = ++plpgsql_curr_compile->nstatements;
+		new->stmtid   = ++plpgsql_curr_compile->nstatements;
+		new->ns		  = plpgsql_ns_top();
 		plpgsql_push_back_token(K_PERFORM);
 
 		/*
@@ -943,6 +945,7 @@ stmt_call		: K_CALL
 		new->cmd_type = PLPGSQL_STMT_CALL;
 		new->lineno = plpgsql_location_to_lineno(@1);
 		new->stmtid = ++plpgsql_curr_compile->nstatements;
+		new->ns = plpgsql_ns_top();
 		plpgsql_push_back_token(K_CALL);
 		new->expr = read_sql_stmt();
 		new->is_call = true;
@@ -962,6 +965,7 @@ stmt_call		: K_CALL
 		new->cmd_type = PLPGSQL_STMT_CALL;
 		new->lineno = plpgsql_location_to_lineno(@1);
 		new->stmtid = ++plpgsql_curr_compile->nstatements;
+		new->ns = plpgsql_ns_top();
 		plpgsql_push_back_token(K_DO);
 		new->expr = read_sql_stmt();
 		new->is_call = false;
@@ -1000,7 +1004,8 @@ stmt_assign		: T_DATUM
 		new = palloc0(sizeof(PLpgSQL_stmt_assign));
 		new->cmd_type = PLPGSQL_STMT_ASSIGN;
 		new->lineno   = plpgsql_location_to_lineno(@1);
-		new->stmtid = ++plpgsql_curr_compile->nstatements;
+		new->stmtid   = ++plpgsql_curr_compile->nstatements;
+		new->ns		  = plpgsql_ns_top();
 		new->varno = $1.datum->dno;
 		/* Push back the head name to include it in the stmt */
 		plpgsql_push_back_token(T_DATUM);
@@ -1022,6 +1027,7 @@ stmt_getdiag	: K_GET getdiag_area_opt K_DIAGNOSTICS getdiag_list ';'
 		new->cmd_type = PLPGSQL_STMT_GETDIAG;
 		new->lineno   = plpgsql_location_to_lineno(@1);
 		new->stmtid	  = ++plpgsql_curr_compile->nstatements;
+		new->ns		  = plpgsql_ns_top();
 		new->is_stacked = $2;
 		new->diag_items = $4;
 
@@ -1194,6 +1200,7 @@ stmt_if			: K_IF expr_until_then proc_sect stmt_elsifs stmt_else K_END K_IF ';'
 		new->cmd_type	= PLPGSQL_STMT_IF;
 		new->lineno		= plpgsql_location_to_lineno(@1);
 		new->stmtid		= ++plpgsql_curr_compile->nstatements;
+		new->ns			= plpgsql_ns_top();
 		new->cond		= $2;
 		new->then_body	= $3;
 		new->elsif_list = $4;
@@ -1299,6 +1306,7 @@ stmt_loop		: opt_loop_label K_LOOP loop_body
 		new->cmd_type = PLPGSQL_STMT_LOOP;
 		new->lineno   = plpgsql_location_to_lineno(@2);
 		new->stmtid   = ++plpgsql_curr_compile->nstatements;
+		new->ns		  = plpgsql_ns_top();
 		new->label	  = $1;
 		new->body	  = $3.stmts;
 
@@ -1317,6 +1325,7 @@ stmt_while		: opt_loop_label K_WHILE expr_until_loop loop_body
 		new->cmd_type = PLPGSQL_STMT_WHILE

Re: alter table set TABLE ACCESS METHOD

2021-07-21 Thread vignesh C
On Fri, Jul 16, 2021 at 9:19 AM Justin Pryzby  wrote:
>
> rebased.
>
> Also, there were two redundant checks for multiple SET ACCESS METHOD commands.
> But one of them wasn't hit if the ALTER was setting the current AM due to the
> no-op test.
>
> I think it's better to fail in every case, and not just sometimes (especially
> if we were to use ERRCODE_SYNTAX_ERROR).
>
> I included my 2ndary patch allowing to set the AM of partitioned table, same 
> as
> for a tablespace.

One of the tests is failing; please post an updated patch for this:
create_am.out   2021-07-22 10:34:56.234654166 +0530
@@ -177,6 +177,7 @@
 (1 row)

 -- CREATE TABLE ..  PARTITION BY supports USING
+-- new partitions will inherit from the current default, rather the
partition root
 CREATE TABLE tableam_parted_heap2 (a text, b int) PARTITION BY list
(a) USING heap2;
 SET default_table_access_method = 'heap';
 CREATE TABLE tableam_parted_a_heap2 PARTITION OF tableam_parted_heap2
FOR VALUES IN ('a');

Regards,
Vignesh




Re: Next Steps with Hash Indexes

2021-07-21 Thread Amit Kapila
On Tue, Jul 20, 2021 at 6:32 PM Simon Riggs
 wrote:
>
> On Tue, Jul 20, 2021 at 1:00 PM Amit Kapila  wrote:
> >
> > On Thu, Jul 15, 2021 at 10:11 PM Simon Riggs
> >  wrote:
> >
> > I think the main thing to think about for uniqueness check during
> > split (where we scan both the old and new buckets) was whether we need
> > to lock both the old (bucket_being_split) and new
> > (bucket_being_populated) buckets or just holding locks on one of them
> > (the current bucket in which we are inserting) is sufficient? During a
> > scan of the new bucket, we just retain pins on both the buckets (see
> > comments in _hash_first()) but if we need to retain locks on both
> > buckets then we need to do something different then we do it for
> > scans. But, I think it is sufficient to just hold an exclusive lock on
> > the primary bucket page in the bucket we are trying to insert and pin
> > on the other bucket (old bucket as we do for scans). Because no
> > concurrent inserter should try to insert into the old bucket and new
> > bucket the same tuple as before starting the split we always update
> > the metapage for hashm_lowmask and hashm_highmask which decides the
> > routing of the tuples.
>
> During an incomplete split, we need to scan both old and new. So
> during insert, we need to scan both old and new, while holding
> exclusive locks on both bucket pages.
>

It will surely work if we have an exclusive lock on both the buckets
(old and new) in this case, but I think it is better if we can avoid
exclusively locking the old bucket (bucket_being_split) unless it is
really required. We need an exclusive lock on the primary bucket where
we are trying to insert to avoid any other inserter with the same key,
but I think we don't need it for the old bucket because no inserter
with the same key can try to insert into the old bucket a key which
would belong to the new bucket.

> I've spent a few days looking at
> the split behavior and this seems a complete approach. I'm working on
> a patch now, still at hacking stage.
>
> (BTW, my opinion of the split mechanism has now changed from bad to
> very good. It works really well for unique data, but can be completely
> ineffective for badly skewed data).
>
> > Now, I think here the other problem we need to think about is that for
> > the hash index after finding the tuple in the index, we need to always
> > recheck in the heap as we don't store the actual value in the hash
> > index. For that in the scan, we get the tuple(s) from the index
> > (release locks) and then match qual after fetching tuple from the
> > heap. But we can't do that for uniqueness check because if we release
> > the locks on the index bucket page then another inserter could come
> > before we match it in heap. I think we need some mechanism that after
> > fetching TID from the index, we recheck the actual value in heap
> > before releasing the lock on the index bucket page.
>
> I don't think btree does that, so I'm not sure we do need that for
> hash. Yes, there is a race condition, but someone will win. Do we care
> who? Do we care enough to take the concurrency hit?
>

I think if we don't care we might end up with duplicates. I might be
missing something here, but let me explain the problem I see. Say,
while doing a unique check, we find the same hash value in the bucket
we are trying to insert into; we can't report a unique key violation
at this stage and error out without checking the actual value in the
heap, because there is always a chance that two different key values
map to the same hash value. Now, suppose that after checking in the
heap we find that the actual value doesn't match, so we decide to
insert the value into the hash index; in the meantime, another insert
of the same key value has already performed these checks and ends up
inserting the value into the hash index, and that would lead to a
duplicate value in the hash index. I think btree doesn't have a
similar risk, so we don't need such a mechanism for btree.


-- 
With Regards,
Amit Kapila.




Re: Numeric x^y for negative x

2021-07-21 Thread Yugo NAGATA
On Wed, 21 Jul 2021 11:10:16 +0100
Dean Rasheed  wrote:

> On Tue, 20 Jul 2021 at 10:17, Yugo NAGATA  wrote:
> >
> > This patch fixes numeric_power() to handle negative bases correctly and not
> > to raise an error "cannot take logarithm of a negative number", as well as a
> > bug that a result whose values is almost zero is incorrectly returend as 1.
> > The previous behaviors are obvious strange, and these fixes seem to me 
> > reasonable.
> >
> > Also, improper overflow errors are corrected in numeric_power() and
> > numeric_exp() to return 0 when it is underflow in fact.
> > I think it is no problem that these functions return zero instead of 
> > underflow
> > errors because power_var_int() already do the same.
> >
> > The patch includes additional tests for checking negative bases cases and
> > underflow and rounding of the almost zero results. It seems good.
> 
> Thanks for the review!
> 
> 
> > Let me just make one comment.
> >
> > (errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
> >  errmsg("zero raised to a negative power is undefined")));
> >
> > -   if (sign1 < 0 && !numeric_is_integral(num2))
> > -   ereport(ERROR,
> > -   (errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
> > -errmsg("a negative number raised to a non-integer power 
> > yields a complex result")));
> > -
> > /*
> >  * Initialize things
> >  */
> >
> > I don't think we need to move this check from numeric_power to power_var.
> 
> Moving it to power_var() means that it only needs to be checked in the
> case of a negative base, together with an exponent that cannot be
> handled by power_var_int(), which saves unnecessary checking. It isn't
> necessary to do this test at all if the exponent is an integer small
> enough to fit in a 32-bit int. And if it's not an integer, or it's a
> larger integer than that, it seems more logical to do the test in
> power_var() near to the other code handling that case.

Indeed, I agree that this change saves unnecessary checking.

> 
> > I noticed the following comment in a numeric_power().
> >
> > /*
> >  * The SQL spec requires that we emit a particular SQLSTATE error code 
> > for
> >  * certain error conditions.  Specifically, we don't return a
> >  * divide-by-zero error code for 0 ^ -1.
> >  */
> >
> > In the original code, two checks that could raise an error of
> > ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION are following the comment.
> > I think these check codes are placed together under this comment 
> > intentionally,
> > so I suggest not to move one of them.
> 
> Ah, that's a good point about the SQL spec. The comment only refers to
> the case of 0 ^ -1, but the SQL spec does indeed say that a negative
> number to a non-integer power should return the same error code.
> 
> Here is an updated patch with additional comments about the required
> error code when raising a negative number to a non-integer power, and
> where it is checked.

Thank you for updating the patch. I am fine with the additional comments.
I don't think there is any other problem left, so I marked it
Ready for Committer.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: Allow batched insert during cross-partition updates

2021-07-21 Thread vignesh C
On Fri, Jul 2, 2021 at 7:35 AM Amit Langote  wrote:
>
> On Fri, Jul 2, 2021 at 1:39 AM Tom Lane  wrote:
> >
> > Amit Langote  writes:
> > > [ v6-0001-Allow-batching-of-inserts-during-cross-partition-.patch ]
> >
> > Per the cfbot, this isn't applying anymore, so I'm setting it back
> > to Waiting on Author.
>
> Rebased patch attached.  Thanks for the reminder.

One of the tests, postgres_fdw, has failed; can you please post an
updated patch to fix it:
test postgres_fdw ... FAILED (test process exited with
exit code 2) 7264 ms

Regards,
Vignesh




Re: Automatic notification of top transaction IDs

2021-07-21 Thread vignesh C
On Thu, Jul 1, 2021 at 6:41 AM Gurjeet Singh  wrote:
>
> The proposed patch is attached.

There is one compilation warning:
xid.c:165:1: warning: no previous prototype for
‘FullTransactionIdToStr’ [-Wmissing-prototypes]
  165 | FullTransactionIdToStr(FullTransactionId fxid)
  | ^~

There are a few compilation issues in the documentation:
/usr/bin/xmllint --path . --noout --valid postgres.sgml
protocol.sgml:1327: parser error : Opening and ending tag mismatch:
literal line 1322 and para
   
  ^
protocol.sgml:1339: parser error : Opening and ending tag mismatch:
literal line 1322 and sect2
  
  ^
protocol.sgml:1581: parser error : Opening and ending tag mismatch:
para line 1322 and sect1
 
 ^
protocol.sgml:7893: parser error : Opening and ending tag mismatch:
sect2 line 1322 and chapter

  ^
protocol.sgml:7894: parser error : chunk is not well balanced

^
postgres.sgml:253: parser error : Failure to process entity protocol
  &protocol;
^
postgres.sgml:253: parser error : Entity 'protocol' not defined
  &protocol;
^

Regards,
Vignesh




Re: Incorrect usage of strtol, atoi for non-numeric junk inputs

2021-07-21 Thread Michael Paquier
On Thu, Jul 22, 2021 at 01:19:41AM +1200, David Rowley wrote:
> On Thu, 22 Jul 2021 at 00:44, Michael Paquier  wrote:
>>
>> On Thu, Jul 22, 2021 at 12:32:39AM +1200, David Rowley wrote:
>> > I see both of these are limited to 64 on windows. Won't those fail on 
>> > Windows?
>>
>> Yes, thanks, they would.  I would just cut the range numbers from the
>> expected output here.  This does not matter in terms of coverage
>> either.
> 
> Sounds good.
> 
>> > I also wondered if it would be worth doing #define MAX_JOBS  somewhere
>> > away from the option parsing code.  This part is pretty ugly:
>>
>> Agreed as well.  pg_dump and pg_restore have their own idea of
>> parallelism in parallel.{c.h}.  What about putting MAX_JOBS in
>> parallel.h then?
> 
> parallel.h looks ok to me.

Okay, done those parts as per the attached.  While on it, I noticed an
extra one for pg_dump --rows-per-insert.  I am counting 25 translated
strings cut in total.

Any objections to this first step?
--
Michael
From 36be37c0dd0810b65d75e9807b61c244d4f56333 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 22 Jul 2021 14:31:31 +0900
Subject: [PATCH v5] Introduce and use routine for parsing of int32 options

---
 src/include/fe_utils/option_utils.h|  3 ++
 src/fe_utils/option_utils.c| 39 ++
 src/bin/pg_amcheck/pg_amcheck.c|  8 ++-
 src/bin/pg_basebackup/pg_basebackup.c  | 17 +++---
 src/bin/pg_basebackup/pg_receivewal.c  | 23 +++--
 src/bin/pg_basebackup/pg_recvlogical.c | 25 -
 src/bin/pg_checksums/Makefile  |  3 ++
 src/bin/pg_checksums/pg_checksums.c|  9 ++--
 src/bin/pg_dump/parallel.h | 13 +
 src/bin/pg_dump/pg_dump.c  | 46 -
 src/bin/pg_dump/pg_restore.c   | 22 ++--
 src/bin/pg_dump/t/001_basic.pl | 20 
 src/bin/pgbench/pgbench.c  | 60 +++---
 src/bin/pgbench/t/002_pgbench_no_server.pl | 36 -
 src/bin/scripts/reindexdb.c|  9 ++--
 src/bin/scripts/vacuumdb.c | 31 ---
 16 files changed, 174 insertions(+), 190 deletions(-)

diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h
index d653cb94e3..e999d56ec0 100644
--- a/src/include/fe_utils/option_utils.h
+++ b/src/include/fe_utils/option_utils.h
@@ -19,5 +19,8 @@ typedef void (*help_handler) (const char *progname);
 extern void handle_help_version_opts(int argc, char *argv[],
 	 const char *fixed_progname,
 	 help_handler hlp);
+extern bool option_parse_int(const char *optarg, const char *optname,
+			 int min_range, int max_range,
+			 int *result);
 
 #endif			/* OPTION_UTILS_H */
diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c
index e19a495dba..a09d57db71 100644
--- a/src/fe_utils/option_utils.c
+++ b/src/fe_utils/option_utils.c
@@ -12,6 +12,8 @@
 
 #include "postgres_fe.h"
 
+#include "common/logging.h"
+#include "common/string.h"
 #include "fe_utils/option_utils.h"
 
 /*
@@ -36,3 +38,40 @@ handle_help_version_opts(int argc, char *argv[],
 		}
 	}
 }
+
+/*
+ * option_parse_int
+ *
+ * Parse integer value for an option.  Returns true if the parsing could
+ * be done with optionally *result holding the parsed value, and false on
+ * failure.
+ */
+bool
+option_parse_int(const char *optarg, const char *optname,
+ int min_range, int max_range,
+ int *result)
+{
+	char	   *endptr;
+	int			val;
+
+	errno = 0;
+	val = strtoint(optarg, &endptr, 10);
+
+	if (*endptr)
+	{
+		pg_log_error("invalid value \"%s\" for option \"%s\"",
+	 optarg, optname);
+		return false;
+	}
+
+	if (errno == ERANGE || val < min_range || val > max_range)
+	{
+		pg_log_error("\"%s\" must be in range %d..%d",
+	 optname, min_range, max_range);
+		return false;
+	}
+
+	if (result)
+		*result = val;
+	return true;
+}
diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 4bde16fb4b..ee8aa71bdf 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -12,6 +12,7 @@
  */
 #include "postgres_fe.h"
 
+#include 
 #include 
 
 #include "catalog/pg_am_d.h"
@@ -326,12 +327,9 @@ main(int argc, char *argv[])
 append_btree_pattern(&opts.exclude, optarg, encoding);
 break;
 			case 'j':
-opts.jobs = atoi(optarg);
-if (opts.jobs < 1)
-{
-	pg_log_error("number of parallel jobs must be at least 1");
+if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
+	  &opts.jobs))
 	exit(1);
-}
 break;
 			case 'p':
 port = pg_strdup(optarg);
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 8bb0acf498..9ea98481d8 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -31,6 +31,7 @@
 #include "common/file_utils.h"
 #include "common/logging.h"
 #include "common/string.h"
+#include "f

Re: psql - add SHOW_ALL_RESULTS option

2021-07-21 Thread Fabien COELHO



The patch does not apply on Head anymore, could you rebase and post a
patch. I'm changing the status to "Waiting for Author".


Ok. I noticed. The patch got significantly broken by the watch pager 
commit. I also have to enhance the added tests (per Peter request).


I wrote a test to check psql query cancel support.  I checked that it fails 
against the patch that was reverted.  Maybe this is useful.


Thank you! The patch update is in progress…

The newly added PSQL_WATCH_PAGER feature, which broke the patch, does not
seem to be tested anywhere; this is tiring :-(


--
Fabien.

Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)

2021-07-21 Thread vignesh C
On Thu, Jul 1, 2021 at 9:42 PM Mark Dilger  wrote:
>
>
>
> > On Jun 29, 2021, at 6:25 PM, Mark Dilger  
> > wrote:
> >
> > Please find attached a new set of patches.
>
> And again, this time attaching a fifth patch which includes the work to allow 
> users who belong to the right security role to SET and ALTER SYSTEM SET 
> variables without being a superuser.

One of the patches,
v4-0004-Add-default-role-for-database-operations.patch, does not apply
on HEAD; please post an updated patch:
patching file src/backend/commands/dbcommands.c
Hunk #1 succeeded at 748 (offset -15 lines).
Hunk #2 FAILED at 780.
Hunk #3 succeeded at 1883 (offset -42 lines).
1 out of 3 hunks FAILED -- saving rejects to file
src/backend/commands/dbcommands.c.rej

Regards,
Vignesh




Re: dynamic result sets support in extended query protocol

2021-07-21 Thread vignesh C
On Tue, Jun 29, 2021 at 7:10 PM Peter Eisentraut
 wrote:
>
> Here is an updated patch with some merge conflicts resolved, to keep it
> fresh.  It's still pending in the commit fest from last time.
>
> My focus right now is to work on the "psql - add SHOW_ALL_RESULTS
> option" patch (https://commitfest.postgresql.org/33/2096/) first, which
> is pretty much a prerequisite to this one.  The attached patch set
> contains a minimal variant of that patch in 0001 and 0002, just to get
> this working, but disregard those for the purposes of code review.
>
> The 0003 patch contains comprehensive documentation and test changes
> that can explain the feature in its current form.

One of the patches, v3-0003-Dynamic-result-sets-from-procedures.patch,
does not apply on HEAD; please post an updated patch for it:
Hunk #1 FAILED at 57.
1 out of 1 hunk FAILED -- saving rejects to file
src/include/commands/defrem.h.rej

Regards,
Vignesh




Re: Enhanced error message to include hint messages for redundant options error

2021-07-21 Thread vignesh C
On Thu, Jul 15, 2021 at 5:12 PM vignesh C  wrote:
>
> On Thu, Jul 15, 2021 at 1:40 PM Dean Rasheed  wrote:
> >
> > On Tue, 13 Jul 2021 at 15:30, vignesh C  wrote:
> > >
> > > On Tue, Jul 13, 2021 at 4:25 PM Dean Rasheed  
> > > wrote:
> > > >
> > > > As it stands, the improvements from (3) seem quite worthwhile. Also,
> > > > the patch saves a couple of hundred lines of code, and for me an
> > > > optimised executable is around 30 kB smaller, which is more than I
> > > > expected.
> > >
> > > Agreed, it can be handled as part of the 2nd patch. The changes you
> > > made apply neatly and the test passes.
> >
> > Pushed.
> >
> > I noticed that it's actually safe to call parser_errposition() with a
> > null ParseState, so I simplified the ereport() code to just call it
> > unconditionally. Also, I decided to not bother using the new function
> > in cases with a null ParseState anyway since it doesn't provide any
> > meaningful benefit in those cases, and those are the cases most likely
> > to targeted next, so it didn't seem sensible to change that code, only
> > for it to be changed again later.
> >
> > Probably the thing to think about next is the few remaining cases that
> > throw this error directly and don't have any errdetail or errhint to
> > help the user identify the offending option. My preference remains to
> > leave the primary error text unchanged, but just add some suitable
> > errdetail. Also, it's probably not worth adding a new function for
> > those remaining errors, since there are only a handful of them.
>
> Thanks for pushing this patch. I have changed the commitfest status to
> "waiting for author" till 0002 patch is posted.

I'm marking this entry in the commitfest as committed. I'm planning to
work on the other comments later, once I finish my current project
work.

Regards,
Vignesh




Re: printf %s with NULL pointer (was Re: BUG #17098: Assert failed on composing an error message when adding a type to an extension being dropped)

2021-07-21 Thread Peter Eisentraut

On 14.07.21 18:26, Tom Lane wrote:

Peter Eisentraut  writes:

In this particular case, I would for example be quite curious how those
alternative minimal C libraries such as musl-libc handle this.


Interesting question, so I took a look:

https://git.musl-libc.org/cgit/musl/tree/src/stdio/vfprintf.c#n593

 case 's':
a = arg.p ? arg.p : "(null)";
...

Any others you'd like to consider?


Similar here: 
https://github.com/ensc/dietlibc/blob/master/lib/__v_printf.c#L188


I think unless we get counterexamples, this is all good.




Re: Hook for extensible parsing.

2021-07-21 Thread vignesh C
On Sat, Jun 12, 2021 at 1:59 PM Julien Rouhaud  wrote:
>
> On Tue, Jun 08, 2021 at 12:16:48PM +0800, Julien Rouhaud wrote:
> > On Sun, Jun 06, 2021 at 02:50:19PM +0800, Julien Rouhaud wrote:
> > > On Sat, May 01, 2021 at 03:24:58PM +0800, Julien Rouhaud wrote:
> > > >
> > > > I'm attaching some POC patches that implement this approach to start a
> > > > discussion.
>
> The regression tests weren't stable, v4 fixes that.

1) CFBOT showed the following compilation errors on Windows:
"C:\projects\postgresql\pgsql.sln" (default target) (1) ->
"C:\projects\postgresql\sqlol.vcxproj" (default target) (69) ->
(ClCompile target) ->
 c1 : fatal error C1083: Cannot open source file:
'contrib/sqlol/sqlol_gram.c': No such file or directory
[C:\projects\postgresql\sqlol.vcxproj]
 c:\projects\postgresql\contrib\sqlol\sqlol_gramparse.h(25): fatal
error C1083: Cannot open include file: 'sqlol_gram.h': No such file or
directory (contrib/sqlol/sqlol.c)
[C:\projects\postgresql\sqlol.vcxproj]
 c:\projects\postgresql\contrib\sqlol\sqlol_gramparse.h(25): fatal
error C1083: Cannot open include file: 'sqlol_gram.h': No such file or
directory (contrib/sqlol/sqlol_keywords.c)
[C:\projects\postgresql\sqlol.vcxproj]
c1 : fatal error C1083: Cannot open source file:
'contrib/sqlol/sqlol_scan.c': No such file or directory
[C:\projects\postgresql\sqlol.vcxproj]
0 Warning(s)
4 Error(s)
6123
6124Time Elapsed 00:05:40.23
6125

2) There was one small whitespace error with the patch:
git am v4-0002-Add-a-sqlol-parser.patch
Applying: Add a sqlol parser.
.git/rebase-apply/patch:818: new blank line at EOF.
+
warning: 1 line adds whitespace errors.

Regards,
Vignesh



