Re: Improve performance of pg_strtointNN functions

2022-12-04 Thread Dean Rasheed
On Sun, 4 Dec 2022 at 03:19, David Rowley  wrote:
>
> Pushed with some small adjustments.
>

Ah, I see that you changed the overflow test, and I realise that I
forgot to answer your question about why I wrote that as 1 - INT_MIN /
10 over on the other thread.

The reason is that we need to detect whether tmp * base will exceed
-INT_MIN, not INT_MAX, since we're accumulating the absolute value of
a signed integer. So the right test is

tmp >= 1 - INT_MIN / base

or equivalently

tmp > -(INT_MIN / base)

I used the first form, because it didn't require extra parentheses,
but that doesn't really matter. The point is that, in general, that's
not the same as

tmp > INT_MAX / base

though it happens to be the same for base = 10, because INT_MIN and
INT_MAX aren't divisible by 10. It will break when base is a power of
2 though, so although it's not broken now, it's morally wrong, and it
risks breaking when Peter commits his patch.

Regards,
Dean




RE: Perform streaming logical transactions by background workers and parallel apply

2022-12-04 Thread houzj.f...@fujitsu.com
On Friday, December 2, 2022 7:27 PM Kuroda, Hayato/黒田 隼人 
wrote:
> 
> Dear Hou,
> 
> Thanks for making the patch. Followings are my comments for v54-0003 and
> 0004.

Thanks for the comments!

> 
> 0003
> 
> pa_free_worker()
> 
> +   /* Unlink any files that were needed to serialize partial changes. */
> +   if (winfo->serialize_changes)
> +   stream_cleanup_files(MyLogicalRepWorker->subid, winfo->shared->xid);
> +
> 
> I think this part is not needed, because the LA cannot reach here if
> winfo->serialize_changes is true. Moreover stream_cleanup_files() is done in
> pa_free_worker_info().

Removed.

> LogicalParallelApplyLoop()
> 
> The parallel apply worker wakes up every 0.1s even if we are in the
> PARTIAL_SERIALIZE mode. Do you have an idea how to reduce that?

The parallel apply worker will usually wait on the stream lock after entering
PARTIAL_SERIALIZE mode.

> ```
> +   pa_spooled_messages();
> ```
> 
> Comments are needed here, like "Changes may be serialize...".

Added.

> pa_stream_abort()
> 
> ```
> +   /*
> +* Reopen the file and set the file position to the saved
> +* position.
> +*/
> +   if (reopen_stream_fd)
> +   {
> +   char    path[MAXPGPATH];
> +
> +   changes_filename(path, MyLogicalRepWorker->subid, xid);
> +   stream_fd = BufFileOpenFileSet(&MyParallelShared->fileset,
> +  path, O_RDONLY, false);
> +   BufFileSeek(stream_fd, fileno, offset, SEEK_SET);
> +   }
> ```
> 
> MyParallelShared->serialize_changes may be used instead of reopen_stream_fd.

This code has been removed.

> 
> ```
> +   /*
> +* It is possible that while sending this change to parallel apply
> +* worker we need to switch to serialize mode.
> +*/
> +   if (winfo->serialize_changes)
> +   pa_set_fileset_state(winfo->shared, FS_READY);
> ```
> 
> There are three identical parts in the code; can we combine them into a common part?

This code has been slightly refactored.

> apply_spooled_messages()
> 
> ```
> +   /*
> +* Break the loop if the parallel apply worker has finished applying
> +* the transaction. The parallel apply worker should have closed the
> +* file before committing.
> +*/
> +   if (am_parallel_apply_worker() &&
> +   MyParallelShared->xact_state == PARALLEL_TRANS_FINISHED)
> +   goto done;
> ```
> 
> I think pfree(buffer) and pfree(s2.data) should not be skipped.
> And this part should be below "nchanges++;"

buffer and s2.data were allocated in the top-level transaction's context, and
they will be freed automatically when handling STREAM COMMIT.

> 
> 0004
> 
> set_subscription_retry()
> 
> ```
> +   LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
> +AccessShareLock);
> +
> ```
> 
> I think AccessExclusiveLock should be acquired instead of AccessShareLock.
> In AlterSubscription(), LockSharedObject(AccessExclusiveLock) seems to be
> used.

Changed.

Best regards,
Hou zj


RE: Perform streaming logical transactions by background workers and parallel apply

2022-12-04 Thread houzj.f...@fujitsu.com
On Friday, December 2, 2022 4:59 PM Peter Smith  wrote:
> 
> Here are my review comments for patch v54-0001.

Thanks for the comments!

> ==
> 
> FILE: .../replication/logical/applyparallelworker.c
> 
> 1b.
> 
> + *
> + * This file contains routines that are intended to support setting up,
> + using
> + * and tearing down a ParallelApplyWorkerInfo which is required to
> + communicate
> + * among leader and parallel apply workers.
> 
> "that are intended to support" -> "for"

I find the current wording consistent with the comments atop
vacuumparallel.c and execParallel.c, so I didn't change this one.

> 3. pa_setup_dsm
> 
> +/*
> + * Set up a dynamic shared memory segment.
> + *
> + * We set up a control region that contains a fixed-size worker info
> + * (ParallelApplyWorkerShared), a message queue, and an error queue.
> + *
> + * Returns true on success, false on failure.
> + */
> +static bool
> +pa_setup_dsm(ParallelApplyWorkerInfo *winfo)
> 
> IMO that's confusing to say "fixed-sized worker info" when it's referring to 
> the
> ParallelApplyWorkerShared structure and not the other
> ParallelApplyWorkerInfo.
> 
> Might be better to say:
> 
> "a fixed-size worker info (ParallelApplyWorkerShared)" -> "a fixed-size struct
> (ParallelApplyWorkerShared)"

ParallelApplyWorkerShared is also a kind of information shared between
workers, so I am fine with the current wording. Or maybe just "fixed-size
info"?

> ~~~
> 
> 12. pa_clean_subtrans
> 
> +/* Reset the list that maintains subtransactions. */ void
> +pa_clean_subtrans(void)
> +{
> + subxactlist = NIL;
> +}
> 
> Maybe a more informative function name would be pa_reset_subxactlist()?

I thought the current name is more consistent with pa_start_subtrans.

> ~~~
> 
> 17. apply_handle_stream_stop
> 
> + case TRANS_PARALLEL_APPLY:
> + elog(DEBUG1, "applied %u changes in the streaming chunk",
> + parallel_stream_nchanges);
> +
> + /*
> + * By the time parallel apply worker is processing the changes in
> + * the current streaming block, the leader apply worker may have
> + * sent multiple streaming blocks. This can lead to parallel apply
> + * worker start waiting even when there are more chunk of streams
> + * in the queue. So, try to lock only if there is no message left
> + * in the queue. See Locking Considerations atop
> + * applyparallelworker.c.
> + */
> 
> SUGGESTION (minor rewording)
> 
> By the time the parallel apply worker is processing the changes in the current
> streaming block, the leader apply worker may have sent multiple streaming
> blocks. To prevent the parallel apply worker from waiting unnecessarily,
> try to lock only if there is no message left in the queue. See Locking
> Considerations atop applyparallelworker.c.
> 

I didn't change this one, per Amit's comment.

> 
> 21. apply_worker_clean_exit
> 
> I wasn't sure if calling this a 'clean' exit meant anything much.
> 
> How about:
> - apply_worker_proc_exit, or
> - apply_worker_exit

I thought "clean" meant that the exit code is 0 (proc_exit(0)) and the exit is
not due to any ERROR. I am not sure whether proc_exit or exit is better.

I have addressed other comments in the new version patch.

Best regards,
Hou zj


RE: Avoid streaming the transaction which are skipped (in corner cases)

2022-12-04 Thread houzj.f...@fujitsu.com
On Saturday, December 3, 2022 7:37 PM Amit Kapila  
wrote:
> 
> On Tue, Nov 29, 2022 at 12:23 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Tuesday, November 29, 2022 12:08 PM Dilip Kumar
>  wrote:
> >
> > I have a few comments about the patch.
> >
> > 1.
> >
> > 1.1.
> > -   /* For streamed transactions notify the remote node about the abort. */
> > -   if (rbtxn_is_streamed(txn))
> > -   rb->stream_abort(rb, txn, lsn);
> > +   /* the transaction which is being skipped shouldn't have been streamed */
> > +   Assert(!rbtxn_is_streamed(txn));
> >
> > 1.2
> > -   rbtxn_is_serialized(txn))
> > +   rbtxn_is_serialized(txn) &&
> > +   rbtxn_has_streamable_change(txn))
> > ReorderBufferStreamTXN(rb, toptxn);
> >
> > In the above two places, I think we should do the check for the
> > top-level transaction (i.e., toptxn) because the patch only sets the flag for
> > the top-level transaction.
> >
> 
> Among these, the first one seems okay because it will check both the 
> transaction
> and its subtransactions from that path and none of those should be marked as
> streamed. I have fixed the second one in the attached patch.
> 
> > 2.
> >
> > +   /*
> > +* If there are any streamable changes getting queued then get the top
> > +* transaction and mark it as having a streamable change.  This is
> > +* required for streaming in-progress transactions; the in-progress
> > +* transaction will not be selected for streaming unless it has at
> > +* least one streamable change.
> > +*/
> > +   if (change->action == REORDER_BUFFER_CHANGE_INSERT ||
> > +   change->action == REORDER_BUFFER_CHANGE_UPDATE ||
> > +   change->action == REORDER_BUFFER_CHANGE_DELETE ||
> > +   change->action == REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT ||
> > +   change->action == REORDER_BUFFER_CHANGE_TRUNCATE)
> >
> > I think that a transaction that contains REORDER_BUFFER_CHANGE_MESSAGE
> > can also be considered streamable. Is there a reason we don't check it
> > here?
> >
> 
> No, I don't see any reason not to do this check for
> REORDER_BUFFER_CHANGE_MESSAGE.
> 
> Apart from the above, I have slightly adjusted the comments in the attached. 
> Do
> let me know what you think of the attached.

Thanks for updating the patch. It looks good to me.

Best regards,
Hou zj


Re: Error-safe user functions

2022-12-04 Thread Andrew Dunstan


On 2022-12-03 Sa 16:46, Tom Lane wrote:
> Andrew Dunstan  writes:
>> Great. Let's hope we can get this settled early next week and then we
>> can get to work on the next tranche of functions, those that will let
>> the SQL/JSON work restart.
> OK, here's a draft proposal.  I should start out by acknowledging that
> this steals a great deal from Nikita's original patch as well as yours,
> though I editorialized heavily.
>
> 0001 is the core infrastructure and documentation for the feature.
> (I didn't bother breaking it down further than that.)
>
> 0002 fixes boolin and int4in.  That is the work that we're going to
> have to replicate in an awful lot of places, and I am pleased by how
> short-and-sweet it is.  Of course, stuff like the datetime functions
> might be more complex to adapt.
>
> Then 0003 is a quick-hack version of COPY that is able to exercise
> all this.  I did not bother with the per-column flags as you had
> them, because I'm not sure if they're worth the trouble compared
> to a simple boolean; in any case we can add that refinement later.
> What I did add was a WARN_ON_ERROR option that exercises the ability
> to extract the error message after a soft error.  I'm not proposing
> that as a shippable feature, it's just something for testing.


Overall I think this is pretty good, and I hope we can settle on it quickly.


>
> I think there are just a couple of loose ends here:
>
> 1. Bikeshedding on my name choices is welcome.  I know Robert is
> dissatisfied with "ereturn", but I'm content with that so I didn't
> change it here.


I haven't got anything better than ereturn.

details_please seems more informal than our usual style. details_wanted
maybe?


>
> 2. Everybody has struggled with just where to put the declaration
> of the error context structure.  The most natural home for it
> probably would be elog.h, but that's out because it cannot depend
> on nodes.h, and the struct has to be a Node type to conform to
> the fmgr safety guidelines.  What I've done here is to drop it
> in nodes.h, as we've done with a couple of other hard-to-classify
> node types; but I can't say I'm satisfied with that.
>
> Other plausible answers seem to be:
>
> * Drop it in fmgr.h.  The only real problem is that historically
> we've not wanted fmgr.h to depend on nodes.h either.  But I'm not
> sure how strong the argument for that really is/was.  If we did
> do it like that we could clean up a few kluges, both in this patch
> and pre-existing (fmNodePtr at least could go away).
>
> * Invent a whole new header just for this struct.  But then we're
> back to the question of what to call it.  Maybe something along the
> lines of utils/elog_extras.h ?
>
>   


Maybe a new header misc_nodes.h?

Soon after we get this done I think we'll find we need to extend this to
non-input functions. But that can wait a short while.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Questions regarding distinct operation implementation

2022-12-04 Thread Ankit Kumar Pandey



On 04/12/22 02:27, David Rowley wrote:


To make this work when rows can exit the window frame seems
significantly harder. Likely a hash table would be a better data
structure to remove records from, but then how are you going to spill
the hash table to disk when it reaches work_mem? As David J mentions,
it seems like you'd need a hash table with a counter to track how many
times a given value appears and only remove it from the table once
that counter reaches 0.  Unsure how you're going to constrain that to
not use more than work_mem though.

Interesting problem. Hash tables created in normal aggregates (AGG_HASHED
mode) may provide some reference, as they have hashagg_spill_tuple, but I
am not aware of any prior implementation of a hash table with a counter
and spilling. My major concern is: if we go the tuplesort route (for the
no-ORDER-BY case), would we be handicapped in the future if we want ORDER
BY or more features?



Are there any other databases which support DISTINCT window aggregate
with an ORDER BY in the window clause?

Oracle supports distinct window aggregates, albeit without an ORDER BY
clause. The rest of the databases I've tried (MySQL, SQL Server Express)
don't support distinct in window aggregates at all, so those get ruled out
as well.



If you were to limit this to only working with the query you mentioned
in [1], i.e PARTITION BY without an ORDER BY, then you only need to
aggregate once per partition per aggregate and you only need to do
that once all of the tuples for the partition are in the tuplestore.
It seems to me like you could add all the records to a tuplesort and
then sort by the DISTINCT column then aggregate everything except for
consecutive duplicates. You can then aggregate any other aggregates
which share the same DISTINCT column, otherwise, you just destroy the
tuplesort and rinse and repeat for the next aggregate.
This looks like the way to go, and it would cover the main use case of
portability from Oracle.



If you were to limit this to only working with the query you mentioned
in [1], i.e PARTITION BY without an ORDER BY,


I need to dig deeper into the ORDER BY case.

--
Regards,
Ankit Kumar Pandey





Re: Error-safe user functions

2022-12-04 Thread Tom Lane
Andrew Dunstan  writes:
> On 2022-12-03 Sa 16:46, Tom Lane wrote:
>> 1. Bikeshedding on my name choices is welcome.  I know Robert is
>> dissatisfied with "ereturn", but I'm content with that so I didn't
>> change it here.

> details_please seems more informal than our usual style. details_wanted
> maybe?

Yeah, Corey didn't like that either.  "details_wanted" works for me.

> Soon after we get this done I think we'll find we need to extend this to
> non-input functions. But that can wait a short while.

I'm curious to know exactly which other use-cases you foresee.
It wouldn't be a bad idea to write some draft code to verify
that this mechanism will work conveniently for them.

regards, tom lane




[PATCH] Add .idea to gitignore for JetBrains CLion

2022-12-04 Thread Sayyid Ali Sajjad Rizavi
From 6d10dafdd7c7789eddd7fd72ca22dfde74febe23 Mon Sep 17 00:00:00 2001
From: Ali Sajjad 
Date: Sun, 4 Dec 2022 06:03:11 -0800
Subject: [PATCH] Add .idea to gitignore for JetBrains CLion

---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index 1c0f3e5e35..7118b90f25 100644
--- a/.gitignore
+++ b/.gitignore
@@ -31,6 +31,7 @@ win32ver.rc
 *.exe
 lib*dll.def
 lib*.pc
+**/.idea

 # Local excludes in root directory
 /GNUmakefile
-- 
2.34.1

Probably I'm the first one building PostgreSQL on CLion.


Re: [PATCH] Add .idea to gitignore for JetBrains CLion

2022-12-04 Thread Sayyid Ali Sajjad Rizavi
I searched the commit fest app and there's already someone who has made
this.

On Sun, Dec 4, 2022 at 7:28 AM Sayyid Ali Sajjad Rizavi 
wrote:

> From 6d10dafdd7c7789eddd7fd72ca22dfde74febe23 Mon Sep 17 00:00:00 2001
> From: Ali Sajjad 
> Date: Sun, 4 Dec 2022 06:03:11 -0800
> Subject: [PATCH] Add .idea to gitignore for JetBrains CLion
>
> ---
>  .gitignore | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/.gitignore b/.gitignore
> index 1c0f3e5e35..7118b90f25 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -31,6 +31,7 @@ win32ver.rc
>  *.exe
>  lib*dll.def
>  lib*.pc
> +**/.idea
>
>  # Local excludes in root directory
>  /GNUmakefile
> --
> 2.34.1
>
> Probably I'm the first one building PostgreSQL on CLion.
>


Re: [PATCH] Add .idea to gitignore for JetBrains CLion

2022-12-04 Thread Tom Lane
Sayyid Ali Sajjad Rizavi  writes:
> +**/.idea

Our policy is that the in-tree .gitignore files should only hide
files that are build artifacts of standard build processes.
Something like this belongs in your personal ~/.gitexclude,
instead.

(BTW, perhaps we should remove the entries targeting ".sl"
extensions?  AFAIK that was only for HP-UX, which is now
desupported.)

regards, tom lane




Re: Error-safe user functions

2022-12-04 Thread Andrew Dunstan


On 2022-12-04 Su 10:25, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 2022-12-03 Sa 16:46, Tom Lane wrote:
>>> 1. Bikeshedding on my name choices is welcome.  I know Robert is
>>> dissatisfied with "ereturn", but I'm content with that so I didn't
>>> change it here.
>> details_please seems more informal than our usual style. details_wanted
>> maybe?
> Yeah, Corey didn't like that either.  "details_wanted" works for me.
>
>> Soon after we get this done I think we'll find we need to extend this to
>> non-input functions. But that can wait a short while.
> I'm curious to know exactly which other use-cases you foresee.
> It wouldn't be a bad idea to write some draft code to verify
> that this mechanism will work conveniently for them.


The SQL/JSON patches at [1] included fixes for some numeric and datetime
conversion functions as well as various input functions, so that's a
fairly immediate need. More generally, I can see uses for error-free
casts, something like, say, CAST(foo AS bar ON ERROR blurfl).


cheers


andrew


[1]
https://www.postgresql.org/message-id/f54ebd2b-0e67-d1c6-4ff7-5d732492d1a0%40postgrespro.ru

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Questions regarding distinct operation implementation

2022-12-04 Thread Vik Fearing

On 12/4/22 14:34, Ankit Kumar Pandey wrote:


On 04/12/22 02:27, David Rowley wrote:





If you were to limit this to only working with the query you mentioned
in [1], i.e PARTITION BY without an ORDER BY, then you only need to
aggregate once per partition per aggregate and you only need to do
that once all of the tuples for the partition are in the tuplestore.
It seems to me like you could add all the records to a tuplesort and
then sort by the DISTINCT column then aggregate everything except for
consecutive duplicates. You can then aggregate any other aggregates
which share the same DISTINCT column, otherwise, you just destroy the
tuplesort and rinse and repeat for the next aggregate.

>
This looks like the way to go, and it would cover the main use case of
portability from Oracle.


The goal should not be portability from Oracle, but adherence to the 
standard.

--
Vik Fearing





Re: Error-safe user functions

2022-12-04 Thread Vik Fearing

On 12/4/22 17:21, Andrew Dunstan wrote:


More generally, I can see uses for error-free
casts, something like, say, CAST(foo AS bar ON ERROR blurfl).


What I am proposing for inclusion in the standard is basically the same 
as what JSON does:


<cast specification> ::=
    CAST <left paren> <cast operand> AS <cast target>
        [ FORMAT <cast template> ]
        [ <cast error behavior> ON ERROR ]
    <right paren>

<cast error behavior> ::=
    ERROR
  | NULL
  | DEFAULT <value expression>

Once/If I get that in, I will be pushing to get that syntax in postgres 
as well.

--
Vik Fearing





Re: Questions regarding distinct operation implementation

2022-12-04 Thread Ankit Kumar Pandey



On 04/12/22 22:25, Vik Fearing wrote:

On 12/4/22 14:34, Ankit Kumar Pandey wrote:

This looks like the way to go, and it would cover the main use case of
portability from Oracle.


The goal should not be portability from Oracle, but adherence to the 
standard.

Yes, Vik, you are right. That was a wrong remark on my side.

--
Regards,
Ankit Kumar Pandey





Re: Generate pg_stat_get_* functions with Macros

2022-12-04 Thread Nathan Bossart
On Sun, Dec 04, 2022 at 06:07:37AM +0100, Drouvot, Bertrand wrote:
> On 12/3/22 9:16 PM, Nathan Bossart wrote:
>> Thanks.  I editorialized a bit in the attached v3.  I'm not sure that my
>> proposed names for the macros are actually an improvement.  WDYT?
> 
> Thanks! I do prefer the macros definition ordering that you're proposing 
> (that makes pgstatfuncs.c "easier" to read).
> 
> As far the names, I think it's better to replace "TAB" with "REL" (like in v4 
> attached): the reason is that those macros will be used in [1] for both 
> tables and indexes stats (and so we'd have to replace "TAB" with "REL" in 
> [1]).
> Having "REL" already in place reduces the changes that will be needed in [1].

Alright.  I marked this as ready-for-committer.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: CREATE INDEX CONCURRENTLY on partitioned index

2022-12-04 Thread Justin Pryzby
On Sat, Dec 03, 2022 at 07:13:30PM +0400, Ilya Gladyshev wrote:
> Hi,
> 
> Thank you Justin and Alexander for working on this, I have reviewed and
> tested the latest patch, it works well, the problems mentioned
> previously are all fixed. I like the idea of sharing code of reindex
> and index, but I have noticed some peculiarities as a user. 
> 
> The reporting is somewhat confusing as it switches to reporting for
> reindex concurrently while building child indexes, this should be fixed
> with the simple patch I have attached. Another thing that I have
> noticed is that REINDEX, which is used under the hood, creates new
> indexes with suffix _ccnew, and if the index building fails, the
> indexes that could not be built will keep the _ccnew suffix.
> This can actually be seen in your test:
> 
> ERROR:  could not create unique index "idxpart2_a_idx2_ccnew"

> I find it quite confusing and I don't think that this the expected
> behavior (if it is, I think it should be documented, like it is for
> REINDEX). As an example of problems that it might entail, DROP INDEX
> will not drop all the invalid indexes in the inheritance tree, because
> it will leave _ccnew indexes in place, which is ok for reindex
> concurrently, but that's not how C-I-C works now. I think that fixing
> this problem requires some heavy code rewrite and I'm not quite sure

This behavior is fixed.  I refactored and re-implemented it to use
DefineIndex() for building indexes concurrently rather than reindexing.
That makes the patch smaller, actually, and has the added benefit of
splitting off the "Concurrently" part of DefineIndex() into a separate
function.

This currently handles partitions with a loop around the whole CIC
implementation, which means that things like WaitForLockers() happen
once for each index, the same as REINDEX CONCURRENTLY on a partitioned
table.  Contrast that with ReindexRelationConcurrently(), which handles
all the indexes on a table in one pass by looping around indexes within
each phase.

BTW, it causes the patch to fail to apply in cfbot when you send an
additional (002) supplementary patch without including the original
(001) patch.  You can name it *.txt to avoid the issue.
https://wiki.postgresql.org/wiki/Cfbot#Which_attachments_are_considered_to_be_patches.3F

Thanks for looking.

-- 
Justin
From e25b15173f4ce939efa54426e369b6996129ff59 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 6 Jun 2020 17:42:23 -0500
Subject: [PATCH] Allow CREATE INDEX CONCURRENTLY on partitioned table

---
 doc/src/sgml/ddl.sgml  |   4 +-
 doc/src/sgml/ref/create_index.sgml |   9 --
 src/backend/commands/indexcmds.c   | 172 +
 src/test/regress/expected/indexing.out | 127 +-
 src/test/regress/sql/indexing.sql  |  26 +++-
 5 files changed, 268 insertions(+), 70 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 38618de01c5..cd72b455447 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -4163,9 +4163,7 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
  so that they are applied automatically to the entire hierarchy.
  This is very
  convenient, as not only will the existing partitions become indexed, but
- also any partitions that are created in the future will.  One limitation is
- that it's not possible to use the CONCURRENTLY
- qualifier when creating such a partitioned index.  To avoid long lock
+ also any partitions that are created in the future will.  To avoid long lock
  times, it is possible to use CREATE INDEX ON ONLY
  the partitioned table; such an index is marked invalid, and the partitions
  do not get the index applied automatically.  The indexes on partitions can
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 40986aa502f..fc8cda655f0 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -692,15 +692,6 @@ Indexes:
 cannot.

 
-   
-Concurrent builds for indexes on partitioned tables are currently not
-supported.  However, you may concurrently build the index on each
-partition individually and then finally create the partitioned index
-non-concurrently in order to reduce the time where writes to the
-partitioned table will be locked out.  In this case, building the
-partitioned index is a metadata only operation.
-   
-
   
  
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index b5b860c3abf..cfab45b9992 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -92,6 +92,11 @@ static char *ChooseIndexName(const char *tabname, Oid namespaceId,
 			 bool primary, bool isconstraint);
 static char *ChooseIndexNameAddition(List *colnames);
 static List *ChooseIndexColumnNames(List *indexElems);
+static void DefineIndexConcurrentInternal(Oid relationId,
+		  Oid 

Re: [PATCH] Add .idea to gitignore for JetBrains CLion

2022-12-04 Thread Daniel Gustafsson
> On 4 Dec 2022, at 16:35, Tom Lane  wrote:
> 
> Sayyid Ali Sajjad Rizavi  writes:
>> +**/.idea
> 
> Our policy is that the in-tree .gitignore files should only hide
> files that are build artifacts of standard build processes.
> Something like this belongs in your personal ~/.gitexclude,
> instead.

Since this comes up every now and again, I wonder if it's worth documenting
this in our .gitignore along the lines of:

--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,7 @@
+# This contains ignores for build artifacts from standard builds,
+# auxiliary files from local workflows should be ignored locally
+# with $GIT_DIR/info/exclude
+

> (BTW, perhaps we should remove the entries targeting ".sl"
> extensions?  AFAIK that was only for HP-UX, which is now
> desupported.)

+1.  Grepping through the .gitignores in the tree didn't reveal anything else
that seemed to have outlived its usefulness.

--
Daniel Gustafsson   https://vmware.com/





Re: [PATCH] Add .idea to gitignore for JetBrains CLion

2022-12-04 Thread Tom Lane
Daniel Gustafsson  writes:
>> On 4 Dec 2022, at 16:35, Tom Lane  wrote:
>> Our policy is that the in-tree .gitignore files should only hide
>> files that are build artifacts of standard build processes.
>> Something like this belongs in your personal ~/.gitexclude,
>> instead.

> Since this comes up every now and again, I wonder if it's worth documenting
> this in our .gitignore along the lines of:

Good idea.

>> (BTW, perhaps we should remove the entries targeting ".sl"
>> extensions?  AFAIK that was only for HP-UX, which is now
>> desupported.)

> +1.  Grepping through the .gitignores in the tree didn't reveal anything else
> that seemed to have outlived its usefulness.

I'll make it so.  Thanks for checking.

regards, tom lane




Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2022-12-04 Thread Maciek Sakrejda
On Tue, Nov 29, 2022 at 5:13 PM Melanie Plageman
 wrote:
> Thanks for the review, Maciek!
>
> I've attached a new version 39 of the patch which addresses your docs
> feedback from this email as well as docs feedback from Andres in [1] and
> Justin in [2].

This looks great! Just a couple of minor comments.

> You are right: reused is a normal, expected part of strategy
> execution. And you are correct: the idea behind reusing existing
> strategy buffers instead of taking buffers off the freelist is to leave
> those buffers for blocks that we might expect to be accessed more than
> once.
>
> In practice, however, if you happen to not be using many shared buffers,
> and then do a large COPY, for example, you will end up doing a bunch of
> writes (in order to reuse the strategy buffers) that you perhaps didn't
> need to do at that time had you leveraged the freelist. I think the
> decision about which tradeoff to make is quite contentious, though.

Thanks for the explanation--that makes sense.

> On Mon, Nov 7, 2022 at 1:26 PM Maciek Sakrejda  wrote:
> > Alternately, what do you think about pulling equivalencies to existing
> > views out of the main column descriptions, and adding them after the
> > main table as a sort of footnote? Most view docs don't have anything
> > like that, but pg_stat_replication does and it might be a good pattern
> > to follow.
> >
> > Thoughts?
>
> Thanks for including a patch!
> In the attached v39, I've taken your suggestion of flattening some of
> the lists and done some rewording as well. I have also moved the note
> about equivalence with pg_stat_statements columns to the
> pg_stat_statements documentation. The result is quite a bit different
> than what I had before, so I would be interested to hear your thoughts.
>
> My concern with the blue "note" section like you mentioned is that it
> would be harder to read the lists of backend types than it was in the
> tabular format.

Oh, I wasn't thinking of doing a separate "note": just additional
paragraphs of text after the table (like what pg_stat_replication has
before its "note", or the brief comment after the pg_stat_archiver
table). But I think the updated docs work also.

+  
+   The context or location of an IO operation.
+  

maybe "...of an IO operation:" (colon) instead?

+   default. Future values could include those derived from
+   XLOG_BLCKSZ, once WAL IO is tracked in this view, and
+   constant multipliers once non-block-oriented IO (e.g. temporary file IO)
+   is tracked here.

I know Lukas had commented that we should communicate that the goal is
to eventually provide relatively comprehensive I/O stats in this view
(you do that in the view description and I think that works), and this
is sort of along those lines, but I think speculative documentation
like this is not all that helpful. I'd drop this last sentence. Just
my two cents.

+  
+   evicted in io_context
+   buffer pool and io_object
+   temp relation counts the number of times a block of
+   data from an existing local buffer was evicted in order to replace it
+   with another block, also in local buffers.
+  

Doesn't this follow from the first sentence of the column description?
I think we could drop this, no?

Otherwise, the docs look good to me.

Thanks,
Maciek




Re: Improve performance of pg_strtointNN functions

2022-12-04 Thread David Rowley
On Sun, 4 Dec 2022 at 22:53, Dean Rasheed  wrote:
> Ah, I see that you changed the overflow test, and I realise that I
> forgot to answer your question about why I wrote that as 1 - INT_MIN /
> 10 over on the other thread.
>
> The reason is that we need to detect whether tmp * base will exceed
> -INT_MIN, not INT_MAX, since we're accumulating the absolute value of
> a signed integer.

I think I'd been too focused on the simplicity of that expression and
also the base 10 part. I saw that everything worked in base 10 and
failed to give enough forward thought to other bases.

I now see that it was wrong-headed to code it the way I had it.
Thanks for pointing this out. I've just pushed a fix.

David




Re: restoring user id and SecContext before logging error in ri_PlanCheck

2022-12-04 Thread Noah Misch
On Wed, Nov 02, 2022 at 08:00:58AM -0700, Zhihong Yu wrote:
> Looking down in ri_PerformCheck(), I see there may be case where error from
> SPI_execute_snapshot() would skip restoring UID.

> @@ -2405,13 +2405,19 @@ ri_PerformCheck(const RI_ConstraintInfo *riinfo,
>  SECURITY_NOFORCE_RLS);
>  
>   /* Finally we can run the query. */
> - spi_result = SPI_execute_snapshot(qplan,
> -   vals, 
> nulls,
> -   
> test_snapshot, crosscheck_snapshot,
> -   
> false, false, limit);
> -
> - /* Restore UID and security context */
> - SetUserIdAndSecContext(save_userid, save_sec_context);
> + PG_TRY();
> + {
> + spi_result = SPI_execute_snapshot(qplan,
> + 
>   vals, nulls,
> + 
>   test_snapshot, crosscheck_snapshot,
> + 
>   false, false, limit);
> + }
> + PG_FINALLY();
> + {
> + /* Restore UID and security context */
> + SetUserIdAndSecContext(save_userid, save_sec_context);
> + }
> + PG_END_TRY();

After an error, AbortSubTransaction() or AbortTransaction() will restore
userid and sec_context.  That makes such changes unnecessary.




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-04 Thread Michael Paquier
On Sat, Dec 03, 2022 at 09:07:38AM +0530, Bharath Rupireddy wrote:
> Yes, I removed those changes. Even if someone sees an offset of a
> record within a WAL file elsewhere, they have the new utility function
> (0002) pg_walfile_offset_lsn().
> 
> I'm attaching the v4 patch set for further review.

+ * Compute an LSN and timline ID given a WAL file name and decimal byte offset.
s/timline/timeline/, exactly two times

+   Datum   values[2] = {0};
+   boolisnull[2] = {0};
I would not hardcode the number of attributes of the record returned.

Regarding pg_walfile_offset_lsn(), I am not sure that this is the best
move we can do as it is possible to compile a LSN from 0/0 with just a
segment number, say:
select '0/0'::pg_lsn + :segno * setting::int + :offset
  from pg_settings where name = 'wal_segment_size';

+   resultTupleDesc = CreateTemplateTupleDesc(2);
+   TupleDescInitEntry(resultTupleDesc, (AttrNumber) 1, "lsn",
+  PG_LSNOID, -1, 0);
+   TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "timeline_id",
+  INT4OID, -1, 0);
Let's use get_call_result_type() to get the TupleDesc and to avoid a
duplication between pg_proc.dat and this code.

Hence I would tend to let XLogFromFileName do the job, while having a
SQL function that is just a thin wrapper around it that returns the
segment TLI and its number, leaving the offset out of the equation as
well as this new XLogIdFromFileName().
--
Michael


signature.asc
Description: PGP signature


Re: Error-safe user functions

2022-12-04 Thread Michael Paquier
On Sun, Dec 04, 2022 at 06:01:33PM +0100, Vik Fearing wrote:
> Once/If I get that in, I will be pushing to get that syntax in postgres as
> well.

If I may ask, how long would it take to know if this grammar would be
integrated in the standard or not?
--
Michael


signature.asc
Description: PGP signature


Re: mprove tab completion for ALTER EXTENSION ADD/DROP

2022-12-04 Thread Michael Paquier
On Sat, Dec 03, 2022 at 05:34:57PM +, Matheus Alcantara wrote:
> I've tested your patched on current master and seems to be working properly.
> 
> I'm starting reviewing some patches here, let's see what more experience 
> hackers
> has to say about this, but as far I can tell is that is working as expected.

+	/* ALTER EXTENSION <name> ADD|DROP */
+	else if (Matches("ALTER", "EXTENSION", MatchAny, "ADD|DROP"))
+		COMPLETE_WITH("ACCESS METHOD", "AGGREGATE", "CAST", "COLLATION",
+					  "CONVERSION", "DOMAIN", "EVENT TRIGGER", "FOREIGN",
+					  "FUNCTION", "MATERIALIZED VIEW", "OPERATOR",
+					  "PROCEDURAL LANGUAGE", "PROCEDURE", "LANGUAGE",
+					  "ROUTINE", "SCHEMA", "SEQUENCE", "SERVER", "TABLE",
+					  "TEXT SEARCH", "TRANSFORM FOR", "TYPE", "VIEW");
+
+	/* ALTER EXTENSION <name> ADD|DROP FOREIGN */
+	else if (Matches("ALTER", "EXTENSION", MatchAny, "ADD|DROP", "FOREIGN"))
+		COMPLETE_WITH("DATA WRAPPER", "TABLE");

The DROP could be matched with the objects that are actually part of
the so-said extension?
--
Michael


signature.asc
Description: PGP signature


doc: add missing "id" attributes to extension packaging page

2022-12-04 Thread Ian Lawrence Barwick
Hi

On this page:

  https://www.postgresql.org/docs/current/extend-extensions.html

three of the  sections are missing an "id" attribute; patch adds
these. Noticed when trying to create a stable link to one of the affected
sections.

Regards

Ian Barwick
From c0dec489ec5b088d7e16b814aef9750ff271c606 Mon Sep 17 00:00:00 2001
From: Ian Barwick 
Date: Mon, 5 Dec 2022 10:41:05 +0900
Subject: [PATCH v1] doc: add "id" attributes to extension documentation

Some of the sections here:

  https://www.postgresql.org/docs/current/extend-extensions.html

were missing "id" attributes.
---
 doc/src/sgml/extend.sgml | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/extend.sgml b/doc/src/sgml/extend.sgml
index 46e873a166..ba89cf5a28 100644
--- a/doc/src/sgml/extend.sgml
+++ b/doc/src/sgml/extend.sgml
@@ -628,7 +628,7 @@ RETURNS anycompatible AS ...
 dropping the whole extension.

 
-   
+   
 Extension Files
 

@@ -1063,7 +1063,7 @@ SELECT pg_catalog.pg_extension_config_dump('my_config', 'WHERE NOT standard_entr
 

 
-   
+   
 Extension Updates
 
 
@@ -1144,7 +1144,7 @@ SELECT * FROM pg_extension_update_paths('extension_name

 
-   
+   
 Installing Extensions Using Update Scripts
 
 

base-commit: 71cb84ec69a38444c48bd8d3b5451b2da157848b
-- 
2.31.1



RE: Partial aggregates pushdown

2022-12-04 Thread fujii.y...@df.mitsubishielectric.co.jp
Hi Mr. Pyhalov.

> Attaching minor fixes. I haven't proof-read all comments (but perhaps, they
> need attention from some native speaker).
Thank you. I have fixed it according to your patch.
And I have proof-read all comments and messages.

> Tested it with queries from
> https://github.com/swarm64/s64da-benchmark-toolkit, works as expected.
Thank you for additional tests.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation


0001-Partial-aggregates-push-down-v16.patch
Description: 0001-Partial-aggregates-push-down-v16.patch


Re: docs: add missing id elements for developer GUCs

2022-12-04 Thread Michael Paquier
On Sat, Dec 03, 2022 at 03:58:19PM +0900, Ian Lawrence Barwick wrote:
> A few of the developer option GUCs were missing the "id" attribute
> in their markup, making it impossible to link to them directly.

True enough that the other developer GUCs do that, so applied and
backpatched down to 11.
--
Michael


signature.asc
Description: PGP signature


Re: generic plans and "initial" pruning

2022-12-04 Thread Amit Langote
On Fri, Dec 2, 2022 at 7:40 PM Amit Langote  wrote:
> On Thu, Dec 1, 2022 at 9:43 PM Amit Langote  wrote:
> > On Thu, Dec 1, 2022 at 8:21 PM Alvaro Herrera  
> > wrote:
> > > On 2022-Dec-01, Amit Langote wrote:
> > > > Hmm, how about keeping the [Merge]Append's parent relation's RT index
> > > > in the PartitionPruneInfo and passing it down to
> > > > ExecInitPartitionPruning() from ExecInit[Merge]Append() for
> > > > cross-checking?  Both Append and MergeAppend already have a
> > > > 'apprelids' field that we can save a copy of in the
> > > > PartitionPruneInfo.  Tried that in the attached delta patch.
> > >
> > > Ah yeah, that sounds about what I was thinking.  I've merged that in and
> > > pushed to github, which had a strange pg_upgrade failure on Windows
> > > mentioning log files that were not captured by the CI tooling.  So I
> > > pushed another one trying to grab those files, in case it wasn't an
> > > one-off failure.  It's running now:
> > >   https://cirrus-ci.com/task/5857239638999040
> > >
> > > If all goes well with this run, I'll get this 0001 pushed.
> >
> > Thanks for pushing 0001.
> >
> > Rebased 0002 attached.
>
> Thought it might be good for PartitionPruneResult to also have
> root_parent_relids that matches with the corresponding
> PartitionPruneInfo.  ExecInitPartitionPruning() does a sanity check
> that the root_parent_relids of a given pair of PartitionPrune{Info |
> Result} match.
>
> Posting the patch separately as the attached 0002, just in case you
> might think that the extra cross-checking would be an overkill.

Rebased over 92c4dafe1eed and fixed some factual mistakes in the
comment above ExecutorDoInitialPruning().

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


v27-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch
Description: Binary data


v27-0002-Add-root_parent_relids-to-PartitionPruneResult.patch
Description: Binary data


Re: Collation version tracking for macOS

2022-12-04 Thread Thomas Munro
On Tue, Nov 29, 2022 at 7:51 PM Jeff Davis  wrote:
> On Sat, 2022-11-26 at 18:27 +1300, Thomas Munro wrote:
> > On Thu, Nov 24, 2022 at 5:48 PM Thomas Munro 
> > wrote:
> > > On Thu, Nov 24, 2022 at 3:07 PM Jeff Davis 
> > > wrote:
> > > > I'd vote for 1 on the grounds that it's easier to document and
> > > > understand a single collation version, which comes straight from
> > > > ucol_getVersion(). This approach makes it a separate problem to
> > > > find
> > > > the collation version among whatever libraries the admin can
> > > > provide;
> > > > but adding some observability into the search should mitigate any
> > > > confusion.
> > >
> > > OK, it sounds like I should code that up next.
> >
> > Here's the first iteration.
>
> Thank you.

Thanks for the review.  Responses further down.  And thanks also for
the really interesting discussion about how the version numbers work
(or in some cases, don't work...), and practical packaging and linking
problems.

To have a hope of making something happen for PG16, which I think
means we need a serious contender patch in the next few weeks, we
really need to make some decisions.  I enjoyed trying out
search-by-collversion, but it's still not my favourite.  On the ballot
we have two main questions:

1.  Should we commit to search-by-collversion, or one of the explicit
library version ideas, and if the latter, which?
2.  Should we try to support being specific about minor versions (in
various different ways according to the choice made for #1)?

My tentative votes are:

1.  I think we should seriously consider provider = ICU63.  I still
think search-by-collversion is a little too magical, even though it
clearly can be made to work.  Of the non-magical systems, I think
encoding the choice of library into the provider name would avoid the
need to add a second confusing "X_version" concept alongside our
existing "X_version" columns in catalogues and DDL syntax, while still
making it super clear what is going on.  This would include adding DDL
commands so you can do ALTER DATABASE/COLLATION ... PROVIDER = ICU63
to make warnings go away.

2.  I think we should ignore minor versions for now (other than
reporting them in the relevant introspection functions), but not make
any choices that would prevent us from changing our mind about that in
a later release.  For example, having two levels of specificity, ICU
and ICU68, in the libver-in-provider-name design wouldn't preclude us
from adding support for ICU68_2 later.

I haven't actually tried that design out in code yet, but I'm willing
to try to code that up very soon.  So no new patch from me yet.  Does
anyone else want to express a view?

> Proposed changes:
>
> * I attached a first pass of some documentation.

Thanks.  Looks pretty good, and much of it would stay if we changed to
one of the other models.

> * Should be another GUC to turn WARNING into an ERROR. Useful at least
> for testing; perhaps too dangerous for production.

OK, will add that into the next version.

> * The libraries should be loaded in a more deliberate order. The "*"
> should be expanded in a descending fashion so that later versions are
> preferred.

Yeah, I agree.

> * GUCs should be validated.

Will do.

> * Should validate that loaded library has expected version.

Will do.

> * We need to revise or remove pg_collation_actual_version() and
> pg_database_collation_actual_version().

I never liked that use of the word "actual"...

> * The GUCs are PGC_SUSET, but don't take effect because
> icu_library_list_fully_loaded is never reset.

True.  Just rough edges because I was trying to prototype
search-by-collversion fast.  Will consider this for the next version.

> * The extra collations you're adding at bootstrap time are named based
> on the library major version. I suppose it might be more "proper" to
> name them based on the collation version, but that would be more
> verbose, so I won't advocate for that. Just pointing it out.

Ah, yes, the ones with names like "en-US-x-icu68".  I agree that made
a little less sense in the search-by-collversion patch.  Maybe we
wouldn't want these at all in the search-by-collversion model.  But I
think they're perfect the way they are in the provider = ICU68 model.
The other idea I considered ages ago was that we could use namespaces:
you could use "icu68.en-US", or just "en-US" in some contexts to get what
your search path sees, but that all seemed a little too cute and not
really like anything else we do with system-created catalogues, so I
gave that idea up.

> * It looks hard (or impossible) to mix multiple ICU libraries with the
> same major version and different minor versions. That's because,
> e.g., libicui18n.so.63.1 links against libicuuc.63 and libicudata.63,
> and when you install ICU 63.2, those dependencies get clobbered with
> the 63.2 versions. That fails the sanity check I proposed above about
> the library version number matching the requested library version
number. And it also just seems wrong.

Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-04 Thread Bharath Rupireddy
On Mon, Dec 5, 2022 at 6:34 AM Michael Paquier  wrote:
>
> Regarding pg_walfile_offset_lsn(), I am not sure that this is the best
> move we can do as it is possible to compile a LSN from 0/0 with just a
> segment number, say:
> select '0/0'::pg_lsn + :segno * setting::int + :offset
>   from pg_settings where name = 'wal_segment_size';

Nice.

> +   resultTupleDesc = CreateTemplateTupleDesc(2);
> +   TupleDescInitEntry(resultTupleDesc, (AttrNumber) 1, "lsn",
> +  PG_LSNOID, -1, 0);
> +   TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "timeline_id",
> +  INT4OID, -1, 0);
> Let's use get_call_result_type() to get the TupleDesc and to avoid a
> duplication between pg_proc.dat and this code.
>
> Hence I would tend to let XLogFromFileName do the job, while having a
> SQL function that is just a thin wrapper around it that returns the
> segment TLI and its number, leaving the offset out of the equation as
> well as this new XLogIdFromFileName().

So, a SQL function pg_dissect_walfile_name (or some other better name)
given a WAL file name returns the tli and seg number. Then the
pg_walfile_offset_lsn can just be a SQL-defined function (in
system_functions.sql) using this new function and pg_settings. If this
understanding is correct, it looks good to me at this point.

That said, let's also hear from others.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Avoid streaming the transaction which are skipped (in corner cases)

2022-12-04 Thread Amit Kapila
On Sun, Dec 4, 2022 at 5:14 PM houzj.f...@fujitsu.com
 wrote:
>
> On Saturday, December 3, 2022 7:37 PM Amit Kapila  
> wrote:
> > Apart from the above, I have slightly adjusted the comments in the 
> > attached. Do
> > let me know what you think of the attached.
>
> Thanks for updating the patch. It looks good to me.
>

I feel the function name ReorderBufferLargestTopTXN() is slightly
misleading because it also checks some of the streaming properties
(like whether the TXN has partial changes and whether it contains any
streamable change). Shall we rename it to
ReorderBufferLargestStreamableTopTXN() or something like that?

The other point to consider is whether we need to have a test case for
this patch. I think before this patch if the size of DDL changes in a
transaction exceeds logical_decoding_work_mem, the empty streams will
be output in the plugin but after this patch, there won't be any such
stream.

-- 
With Regards,
Amit Kapila.




Re: Allow placeholders in ALTER ROLE w/o superuser

2022-12-04 Thread Alexander Korotkov
On Thu, Dec 1, 2022 at 6:14 AM Alexander Korotkov  wrote:
> On Wed, Nov 23, 2022 at 1:53 AM Steve Chavez  wrote:
> > So from my side this all looks good!
>
> Thank you for your feedback.
>
> The next revision of the patch is attached.  It contains code
> improvements, comments and documentation.  I'm going to also write
> some tests.  pg_db_role_setting doesn't seem to be well-covered with
> tests.  I will probably need to write a new module into
> src/tests/modules to check how placeholders interact with dynamically
> defined GUCs.

Another revision of patch is attached.  It's fixed now that USER SET
values can't be used for PGC_SUSET parameters.  Tests are added.  That
require new module test_pg_db_role_setting to check dynamically
defined GUCs.

--
Regards,
Alexander Korotkov


0001-USER-SET-parameters-for-pg_db_role_setting-v3.patch
Description: Binary data


Re: Avoid streaming the transaction which are skipped (in corner cases)

2022-12-04 Thread Dilip Kumar
On Mon, Dec 5, 2022 at 8:59 AM Amit Kapila  wrote:
>
> On Sun, Dec 4, 2022 at 5:14 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Saturday, December 3, 2022 7:37 PM Amit Kapila  
> > wrote:
> > > Apart from the above, I have slightly adjusted the comments in the 
> > > attached. Do
> > > let me know what you think of the attached.
> >
> > Thanks for updating the patch. It looks good to me.
> >
>
> I feel the function name ReorderBufferLargestTopTXN() is slightly
> misleading because it also checks some of the streaming properties
> (like whether the TXN has partial changes and whether it contains any
> streamable change). Shall we rename it to
> ReorderBufferLargestStreamableTopTXN() or something like that?

Yes that makes sense

> The other point to consider is whether we need to have a test case for
> this patch. I think before this patch if the size of DDL changes in a
> transaction exceeds logical_decoding_work_mem, the empty streams will
> be output in the plugin but after this patch, there won't be any such
> stream.

Yes, we can do that, I will make these two changes.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Optimize common expressions in projection evaluation

2022-12-04 Thread Peifeng Qiu
> the need for this code seems not that great.  But as to the code itself I'm 
> unable to properly judge.
A simplified version of my use case is like this:
CREATE FOREIGN TABLE ft(rawdata json);
INSERT INTO tbl SELECT (convert_func(rawdata)).* FROM ft;
We have a foreign data source that can emit json data in different
formats. We need a different convert_func to extract the actual fields
out. The client knows which function to use, but as each json may have
hundreds of columns, the performance is very poor.

Best regards,
Peifeng Qiu




Re: Optimize common expressions in projection evaluation

2022-12-04 Thread David G. Johnston
On Sun, Dec 4, 2022 at 9:00 PM Peifeng Qiu  wrote:

> > the need for this code seems not that great.  But as to the code itself
> I'm unable to properly judge.
> A simplified version of my use case is like this:
> CREATE FOREIGN TABLE ft(rawdata json);
> INSERT INTO tbl SELECT (convert_func(rawdata)).* FROM ft;
>
>
Which is properly written as the following, using lateral, which also
avoids the problem you describe:

INSERT INTO tbl
SELECT func_call.*
FROM ft
JOIN LATERAL convert_func(ft.rawdata) AS func_call ON true;

David J.


Re: Bug in row_number() optimization

2022-12-04 Thread David Rowley
On Fri, 2 Dec 2022 at 00:21, Sergey Shinderuk
 wrote:
> Maybe I'm missing something, but the previous call to spool_tuples()
> might have read extra tuples (if the tuplestore spilled to disk), and
> after switching to WINDOWAGG_PASSTHROUGH_STRICT mode we nevertheless
> would loop through these extra tuples and call ExecProject if only to
> increment winstate->currentpos.

The tuples which are spooled in the WindowAgg node are the ones from
the WindowAgg's subnode.  Since these don't contain the results of the
WindowFunc, I don't think there's any issue with what's stored in
any of the spooled tuples.

What matters is what we pass along to the node that's reading from the
WindowAgg. If we NULL out the memory where we store the WindowFunc
(and maybe an Aggref) results then the ExecProject in ExecWindowAgg()
will no longer fill the WindowAgg's output slot with the address of
free'd memory (or some stale byval value which has lingered for byval
return type WindowFuncs).

Since the patch I sent sets the context's ecxt_aggnulls to true, it
means that when we do the ExecProject(), the EEOP_WINDOW_FUNC in
ExecInterpExpr (or the JIT equivalent) will put an SQL NULL in the
*output* slot for the WindowAgg node. The same is true for
EEOP_AGGREFs as the WindowAgg node that we are running in
WINDOWAGG_PASSTHROUGH mode could also contain normal aggregate
functions, not just WindowFuncs.

David




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-04 Thread Michael Paquier
On Mon, Dec 05, 2022 at 08:48:25AM +0530, Bharath Rupireddy wrote:
> So, a SQL function pg_dissect_walfile_name (or some other better name)
> given a WAL file name returns the tli and seg number. Then the
> pg_walfile_offset_lsn can just be a SQL-defined function (in
> system_functions.sql) using this new function and pg_settings. If this
> understanding is correct, it looks good to me at this point.

I would do without the SQL function that looks at pg_settings, FWIW.

> That said, let's also hear from others.

Sure.  Perhaps my set of suggestions will not get the majority,
though..
--
Michael


signature.asc
Description: PGP signature


Re: Bug in row_number() optimization

2022-12-04 Thread David Rowley
On Mon, 28 Nov 2022 at 22:59, Sergey Shinderuk
 wrote:
>
> On 28.11.2022 03:23, David Rowley wrote:
> > On Sat, 26 Nov 2022 at 05:19, Tom Lane  wrote:
> >> It's pretty unlikely that this would work during an actual index scan.
> >> I'm fairly sure that btree (and other index AMs) have hard-wired
> >> assumptions that index operators are strict.
> >
> > If we're worried about that then we could just restrict this
> > optimization to only work with strict quals.
>
> Not sure this is necessary if btree operators must be strict anyway.

I'd rather see the func_strict() test in there. You've already
demonstrated you can get wrong results with a non-strict operator. I'm
not disputing whether it's a broken operator class or not. I
just want to ensure we don't leave any holes open for this
optimisation to return incorrect results.

David




Re: Optimize common expressions in projection evaluation

2022-12-04 Thread Tom Lane
Peifeng Qiu  writes:
>> the need for this code seems not that great.  But as to the code itself I'm 
>> unable to properly judge.

> A simplified version of my use case is like this:
> CREATE FOREIGN TABLE ft(rawdata json);
> INSERT INTO tbl SELECT (convert_func(rawdata)).* FROM ft;

It might be worth noting that the code as we got it from Berkeley
could do this scenario without multiple evaluations of convert_func().
Memory is foggy, but I believe it involved essentially a two-level
targetlist.  Unfortunately, the scheme was impossibly baroque and
buggy, so we eventually ripped it out altogether in favor of the
multiple-evaluation behavior you see today.  I think that commit
62e29fe2e might have been what ripped it out, but I'm not quite
sure.  It's about the right time-frame, anyway.

I mention this because trying to reverse-engineer this situation
in execExpr seems seriously ugly and inefficient, even assuming
you can make it non-buggy.  The right solution has to involve never
expanding foo().* into duplicate function calls in the first place,
which is the way it used to be.  Maybe if you dug around in those
twenty-year-old changes you could get some inspiration.

I tend to agree with David that LATERAL offers a good-enough
solution in most cases ... but it is annoying that we accept
this syntax and then pessimize it.

regards, tom lane




Re: Optimize common expressions in projection evaluation

2022-12-04 Thread Pavel Stehule
po 5. 12. 2022 v 5:28 odesílatel Tom Lane  napsal:

> Peifeng Qiu  writes:
> >> the need for this code seems not that great.  But as to the code itself
> I'm unable to properly judge.
>
> > A simplified version of my use case is like this:
> > CREATE FOREIGN TABLE ft(rawdata json);
> > INSERT INTO tbl SELECT (convert_func(rawdata)).* FROM ft;
>
> It might be worth noting that the code as we got it from Berkeley
> could do this scenario without multiple evaluations of convert_func().
> Memory is foggy, but I believe it involved essentially a two-level
> targetlist.  Unfortunately, the scheme was impossibly baroque and
> buggy, so we eventually ripped it out altogether in favor of the
> multiple-evaluation behavior you see today.  I think that commit
> 62e29fe2e might have been what ripped it out, but I'm not quite
> sure.  It's about the right time-frame, anyway.
>
> I mention this because trying to reverse-engineer this situation
> in execExpr seems seriously ugly and inefficient, even assuming
> you can make it non-buggy.  The right solution has to involve never
> expanding foo().* into duplicate function calls in the first place,
> which is the way it used to be.  Maybe if you dug around in those
> twenty-year-old changes you could get some inspiration.
>
> I tend to agree with David that LATERAL offers a good-enough
> solution in most cases ... but it is annoying that we accept
> this syntax and then pessimize it.
>

I agree that the perfect solution is simply not to use .*, but on the
other hand any supported syntax should be optimized well, or we should
raise some warning about the negative performance impact.

Today there are a lot of baroque technologies in the stack, so it is hard to
remember all the good practices and it is hard for newbies to pick up this
knowledge. We should reduce possible performance traps where possible.

Regards

Pavel




>
> regards, tom lane
>
>
>


Re: Bug in row_number() optimization

2022-12-04 Thread David Rowley
On Thu, 1 Dec 2022 at 21:18, Richard Guo  wrote:
>> +   if (!func_strict(opexpr->opfuncid))
>> +   return false;
>>
>> Should return true instead?
>
>
> Yeah, you're right.  This should be a thinko.

Yeah, oops. That's wrong.

I've adjusted that in the attached.

I'm keen to move along and push the fix for this bug.  If there are no
objections to the method in the attached and also adding the
restriction to limit the optimization to only working with strict
OpExprs, then I'm going to push this, likely about 24 hours from now.

David
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 81ba024bba..110c594edd 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2300,7 +2300,27 @@ ExecWindowAgg(PlanState *pstate)
 					continue;
 				}
 				else
+				{
 					winstate->status = WINDOWAGG_PASSTHROUGH;
+
+					/*
+					 * Otherwise, if we're not the top-window, we'd better
+					 * NULLify the aggregate results, else, by-ref result
+					 * types will point to freed memory.  We can get away
+					 * without storing the final result of the WindowFunc
+					 * because in the planner we insisted that the
+					 * runcondition is strict.  Setting these to NULL will
+					 * ensure the top-level WindowAgg filter will always
+					 * filter out the rows produced in this WindowAgg when
+					 * not in WINDOWAGG_RUN state.
+					 */
+					numfuncs = winstate->numfuncs;
+					for (i = 0; i < numfuncs; i++)
+					{
+						econtext->ecxt_aggvalues[i] = (Datum) 0;
+						econtext->ecxt_aggnulls[i] = true;
+					}
+				}
 			}
 			else
 			{
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 03ee6dc832..595870ea30 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2399,6 +2399,24 @@ check_and_push_window_quals(Query *subquery, RangeTblEntry *rte, Index rti,
 	if (list_length(opexpr->args) != 2)
 		return true;
 
+	/*
+	 * Currently, we restrict this optimization to strict OpExprs.  The reason
+	 * for this is that during execution, once the runcondition becomes false,
+	 * we stop evaluating WindowFuncs in WindowAgg nodes and since we're no
+	 * longer updating them, we NULLify the aggregate and window aggregate
+	 * results to prevent by-ref result typed window aggregates from pointing
+	 * to free'd memory for byref WindowFuncs and storing stale values for
+	 * byval WindowFuncs (remember that window functions such as row_number()
+	 * return a byref type on non-FLOAT8PASSBYVAL builds).  Upper-level
+	 * WindowAgg nodes may make reference to these results in their filter
+	 * clause and we can ensure the filter remain false by making sure the
+	 * operator is strict and by performing the NULLification in the executor
+	 * when the runcondition first becomes false.
+	 */
+	set_opfuncid(opexpr);
+	if (!func_strict(opexpr->opfuncid))
+		return true;
+
 	/*
 	 * Check for plain Vars that reference window functions in the subquery.
 	 * If we find any, we'll ask find_window_run_conditions() if 'opexpr' can


Re: Optimize common expressions in projection evaluation

2022-12-04 Thread Tom Lane
Pavel Stehule  writes:
> po 5. 12. 2022 v 5:28 odesílatel Tom Lane  napsal:
>> I tend to agree with David that LATERAL offers a good-enough
>> solution in most cases ... but it is annoying that we accept
>> this syntax and then pessimize it.

> I agree, so there is a perfect solution like don't use .*, but on second
> hand any supported syntax should be optimized well or we should raise some
> warning about negative performance impact.

Yeah.  I wonder if we could get the parser to automatically transform

SELECT (foo(t.x)).* FROM tab t

into

SELECT f.* FROM tab t, LATERAL foo(t.x) f

There are probably cases where this doesn't work or changes the
semantics subtly, but I suspect it could be made to work in
most cases of practical interest.

regards, tom lane




Re: Optimize common expressions in projection evaluation

2022-12-04 Thread David G. Johnston
On Sun, Dec 4, 2022 at 9:37 PM Pavel Stehule 
wrote:

>
> po 5. 12. 2022 v 5:28 odesílatel Tom Lane  napsal:
>
>> Peifeng Qiu  writes:
>> >> the need for this code seems not that great.  But as to the code
>> itself I'm unable to properly judge.
>>
>> I mention this because trying to reverse-engineer this situation
>> in execExpr seems seriously ugly and inefficient, even assuming
>> you can make it non-buggy.  The right solution has to involve never
>> expanding foo().* into duplicate function calls in the first place,
>> which is the way it used to be.  Maybe if you dug around in those
>> twenty-year-old changes you could get some inspiration.
>>
>> I tend to agree with David that LATERAL offers a good-enough
>> solution in most cases ... but it is annoying that we accept
>> this syntax and then pessimize it.
>>
>
> I agree, so there is a perfect solution like don't use .*, but on second
> hand any supported syntax should be optimized well or we should raise some
> warning about negative performance impact.
>
>
Yeah, I phrased that poorly.  I agree that having this problem solved would
be beneficial to the project.  But, and this is probably a bit unfair to
the OP, if Tom decided to implement LATERAL after years of hearing
complaints I doubted this patch was going to be acceptable.  The
benefit/cost ratio of fixing this in code just doesn't seem to be high
enough to try at this point.  But I'd be happy to be proven wrong.  And I
readily admit both a lack of knowledge, and that as time has passed maybe
we've improved the foundations enough to implement something here.

Otherwise, we can maybe improve the documentation.  There isn't any way
(that the project would accept anyway) to actually warn the user at
runtime.  If we go that route we should probably just disallow the syntax
outright.

David J.


Re: [PATCH] Add native windows on arm64 support

2022-12-04 Thread Michael Paquier
On Fri, Dec 02, 2022 at 11:09:15AM +, Niyas Sait wrote:
> I've attached a new revision of the patch (v5) and includes following
> changes,
> 
> 1. Add support for meson build system
> 2. Extend MSVC scripts to handle ARM64 platform.
> 3. Add arm64 definition of spin_delay function.
> 4. Exclude arm_acle.h import with MSVC compiler.

Hmm.  There are still a few things that need some attention here:
- USE_SSE42_CRC32C_WITH_RUNTIME_CHECK should not be set for aarch64.
- This is missing updates for ICU.  Looking at the upstream code,
Build.Windows.ProjectConfiguration.props uses libARM64 and binARM64
for the output library and binary paths.
- This is missing updates for krb5.  For this case, I am seeing no
traces of packages for aarch64, so I guess that we could just fail
hard until someone cares enough to ping us about what to do here.
- There were zero changes in the docs, but we need to update at least
the section about architectures supported for the 64-bit builds.
- Last comes OpenSSL, that supports amd64_arm64 as build target (see
NOTES-WINDOWS.md), and the library names are libssl.lib and
libcrypto.lib.  Looking at
https://slproweb.com/products/Win32OpenSSL.html, there are
experimental builds for arm64 with OpenSSL 3.0.  Niyas or somebody
else, could you look at the contents of lib/VC/ and see what we could
rely on for the debug builds after installing this MSI?  We should
rely on something like lib/VC/sslcrypto64MD.lib or
lib/VC/sslcrypto32MD.lib, but for arm64.

With meson gaining in maturity, perhaps that's not the most urgent
thing as we will likely remove src/tools/msvc/ soon but I'd rather do
that right anyway as much as I can to avoid an incorrect state in the
tree at any time in its history.

-   USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK => undef,
-   USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK => undef,
+   USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK => $self->{platform} eq "ARM64" ? : 1 : undef,
Did you actually test this patch?  This won't work at all with perl,
per se the double colon after the question mark.

For now, please find attached an updated patch with all the fixes I
could come up with.
--
Michael
From 9cb858cfdb25c3a119a8b28321cf3218eac282c5 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 5 Dec 2022 14:09:29 +0900
Subject: [PATCH v6] Enable postgres native build for windows-arm64 platform

Following changes are included

- Extend MSVC scripts to handle ARM64 platform
- Add arm64 definition of spin_delay function
- Exclude arm_acle.h import for MSVC
- Add support for meson build
---
 src/include/storage/s_lock.h  | 10 +-
 src/port/pg_crc32c_armv8.c|  2 ++
 doc/src/sgml/install-windows.sgml |  3 ++-
 meson.build   | 33 +++
 src/tools/msvc/MSBuildProject.pm  | 16 +++
 src/tools/msvc/Mkvcbuild.pm   |  9 +++--
 src/tools/msvc/Solution.pm| 30 ++--
 src/tools/msvc/gendef.pl  |  8 
 8 files changed, 80 insertions(+), 31 deletions(-)

diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 8b19ab160f..ab6a6e0281 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -708,13 +708,21 @@ typedef LONG slock_t;
 #define SPIN_DELAY() spin_delay()
 
 /* If using Visual C++ on Win64, inline assembly is unavailable.
- * Use a _mm_pause intrinsic instead of rep nop.
+ * Use _mm_pause (x64) or __isb (arm64) intrinsic instead of rep nop.
  */
 #if defined(_WIN64)
 static __forceinline void
 spin_delay(void)
 {
+#ifdef _M_ARM64
+	/*
+	 * See spin_delay aarch64 inline assembly definition above for details
+	 * ref: https://learn.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics#BarrierRestrictions
+	*/
+	__isb(_ARM64_BARRIER_SY);
+#else
 	_mm_pause();
+#endif
 }
 #else
 static __forceinline void
diff --git a/src/port/pg_crc32c_armv8.c b/src/port/pg_crc32c_armv8.c
index 9e301f96f6..981718752f 100644
--- a/src/port/pg_crc32c_armv8.c
+++ b/src/port/pg_crc32c_armv8.c
@@ -14,7 +14,9 @@
  */
 #include "c.h"
 
+#ifndef _MSC_VER
 #include <arm_acle.h>
+#endif
 
 #include "port/pg_crc32c.h"
 
diff --git a/doc/src/sgml/install-windows.sgml b/doc/src/sgml/install-windows.sgml
index bbd4960e7b..3f865d7d3b 100644
--- a/doc/src/sgml/install-windows.sgml
+++ b/doc/src/sgml/install-windows.sgml
@@ -352,7 +352,8 @@ $ENV{MSBFLAGS}="/m";
   Special Considerations for 64-Bit Windows
 
   
-   PostgreSQL will only build for the x64 architecture on 64-bit Windows.
+   PostgreSQL will only build for the x64 and ARM64 architectures on 64-bit
+   Windows.
   
 
   
diff --git a/meson.build b/meson.build
index 725e10d815..e354ad7650 100644
--- a/meson.build
+++ b/meson.build
@@ -1944,7 +1944,13 @@ int main(void)
 
 elif host_cpu == 'arm' or host_cpu == 'aarch64'
 
-  prog = '''
+  if cc.get_id() == 'msvc'
+cdata.set('USE_ARMV8_CRC32C', false)
+cdata.set('USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK', 1)
+have_optimized_crc = true
+  else
+
+prog = '''
 #include <arm_acle.h>
 
 int main(void)
@@ -1960,18 +1966,

Re: generic plans and "initial" pruning

2022-12-04 Thread Amit Langote
On Mon, Dec 5, 2022 at 12:00 PM Amit Langote  wrote:
> On Fri, Dec 2, 2022 at 7:40 PM Amit Langote  wrote:
> > Thought it might be good for PartitionPruneResult to also have
> > root_parent_relids that matches with the corresponding
> > PartitionPruneInfo.  ExecInitPartitionPruning() does a sanity check
> > that the root_parent_relids of a given pair of PartitionPrune{Info |
> > Result} match.
> >
> > Posting the patch separately as the attached 0002, just in case you
> > might think that the extra cross-checking would be an overkill.
>
> Rebased over 92c4dafe1eed and fixed some factual mistakes in the
> comment above ExecutorDoInitialPruning().

Sorry, I had forgotten to git-add hunks including some cosmetic
changes in that one.  Here's another version.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


v28-0002-Add-root_parent_relids-to-PartitionPruneResult.patch
Description: Binary data


v28-0001-Optimize-AcquireExecutorLocks-by-locking-only-un.patch
Description: Binary data


Re: Failed Assert while pgstat_unlink_relation

2022-12-04 Thread Kyotaro Horiguchi
At Thu, 1 Dec 2022 19:23:28 -0800, Andres Freund  wrote in 
> > AtEOXact_PgStat_DroppedStats() visits a relation that has been dropped
> > then wiped (when CLOBBER_FREED_MEMORY) by AtEOXact_RelationCache()
> > called just before.  Since the relcache content directly pointed from
> > stats module is lost in this case, the stats side cannot defend itself
> > from this.
> > 
> > Perhaps RelationDestroyRelation() need to do pgstat_drop_entry() or
> > AtEOXact_PgStat_DroppedStats() should be called before
> > AtEOXact_RelationCache(). the latter of which is simpler. I think we
> > need to test this case, too.
> 
> This doesn't strike me as the right fix. What do you think about my patch at
> https://postgr.es/m/20221128210908.hyffmljjylj447nu%40awork3.anarazel.de ,
> leaving the quibbles around error handling aside?

Yeah, I didn't like what my patch does...

> Afaict it fixes the issue.

Hmm. I see it works for this specific case, but I don't understand why
it is generally safe.

The in-xact created relation t1 happened to be scanned during the
CREATE RULE and a stats entry is attached. So the stats entry loses t1
at roll-back, then crashes.  Thus, if I understand it correctly, it
seems to me that just unlinking the stats from t1 (when relkind is
changed) works.

But the fix doesn't change the behavior in relkind-not-changing
cases. If an in-xact-created table gets a stats entry then the
relcache entry for t1 is refreshed to a table relation again then the
transaction rolls back, crash will happen for the same reason. I'm not
sure if there is such a case actually.

When I tried to check that behavior further, I found that CREATE RULE
is no longer allowed..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Question regarding "Make archiver process an auxiliary process. commit"

2022-12-04 Thread Sravan Kumar
Hi,

I see that in the archiver code, in the function pgarch_MainLoop,
the archiver sleeps for a certain time or until there's a signal. The time
it sleeps for is represented by:

timeout = PGARCH_AUTOWAKE_INTERVAL - (curtime - last_copy_time);

It so happens that last_copy_time and curtime are always set at the same
time which always makes timeout equal (actually roughly equal) to
PGARCH_AUTOWAKE_INTERVAL.

I see that this behaviour was introduced as a part of the commit:
d75288fb27b8fe0a926aaab7d75816f091ecdc27. The discussion thread is:
https://postgr.es/m/20180629.173418.190173462.horiguchi.kyot...@lab.ntt.co.jp

The change was introduced in v31, with the following comment in the
discussion thread:

- pgarch_MainLoop start the loop with wakened = true when both
notified or timed out. Otherwise time_to_stop is set and exits from
the loop immediately. So the variable wakened is actually
useless. Removed it.

This behaviour was different before the commit:
d75288fb27b8fe0a926aaab7d75816f091ecdc27,
in which the archiver keeps track of how much time has elapsed since
last_copy_time
in case there was a signal, and it results in a smaller subsequent value of
timeout, until timeout is zero. This also avoids calling
pgarch_ArchiverCopyLoop
before PGARCH_AUTOWAKE_INTERVAL in case there's an intermittent signal.
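The arithmetic difference between the two behaviours can be sketched as follows; this is an illustrative model only, with the constant and function names standing in for the actual pgarch.c logic:

```python
# Illustrative model of the two timeout computations around signal wakeups;
# the names and values are stand-ins, not the actual archiver code.
PGARCH_AUTOWAKE_INTERVAL = 60  # seconds

def timeout_old(curtime, last_copy_time):
    # Pre-d75288f behavior: subtract the time already elapsed since the
    # last copy, so a signal wakeup resumes the remainder of the interval
    # and the timeout shrinks toward zero.
    return max(0, PGARCH_AUTOWAKE_INTERVAL - (curtime - last_copy_time))

def timeout_new(curtime, last_copy_time):
    # Current behavior: curtime and last_copy_time are set together, so
    # their difference is ~0 and the timeout is always the full interval.
    last_copy_time = curtime
    return PGARCH_AUTOWAKE_INTERVAL - (curtime - last_copy_time)

# A signal wakes the archiver 45 seconds after the last copy:
print(timeout_old(145, 100))  # 15: finish out the original interval
print(timeout_new(145, 100))  # 60: start a fresh full interval
```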

With the current changes it may be okay to always sleep for
PGARCH_AUTOWAKE_INTERVAL, but that means curtime and last_copy_time are
no longer needed.

I would like to confirm whether my understanding is correct, and to ask
which of the behaviours we would like to retain.


Thanks & Regards,
Sravan Velagandula
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Add LZ4 compression in pg_dump

2022-12-04 Thread Michael Paquier
On Sat, Dec 03, 2022 at 11:45:30AM +0900, Michael Paquier wrote:
> While this is correct in checking that the contents are compressed
> under --with-zlib, this also removes the coverage where we make sure
> that this command is able to complete under --without-zlib without
> compressing any of the table data files.  Hence my point from
> upthread: this test had better not use compile_option, but change
> glob_pattern depending on if the build uses zlib or not.

In short, I mean something like the attached.  I have named the flag
content_patterns, and switched it to an array so that we can check that
toc.dat is always uncompressed, and that the other data files are
compressed or not depending on whether the build supports gzip.
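As a side note, a content-based check would also be possible: rather than relying on the *.dat.gz file names, one could inspect the leading bytes, since gzip streams always begin with the magic bytes 0x1f 0x8b.  A hypothetical Python sketch of that idea (not part of the patch, which uses Perl's glob()):

```python
# Hypothetical content-based check: detect gzip output by its two-byte
# magic header instead of by the .gz file name suffix.
import gzip
import os
import tempfile

def looks_gzipped(path):
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"

# Build a fake dump directory: an uncompressed toc.dat plus one
# compressed table-data file, as the directory format would produce.
tmpdir = tempfile.mkdtemp()
plain = os.path.join(tmpdir, "toc.dat")
packed = os.path.join(tmpdir, "3001.dat.gz")

with open(plain, "wb") as f:
    f.write(b"PGDMP fake table-of-contents")
with gzip.open(packed, "wb") as f:
    f.write(b"fake table data")

print(looks_gzipped(plain), looks_gzipped(packed))  # False True
```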

> In order to check this behavior with defaults_custom_format, perhaps
> we could just remove the -Z6 from it or add an extra command for its
> default behavior?

This is slightly more complicated as there is just one file generated
for the compression and non-compression cases, so I have let that as
it is now.
--
Michael
From 5c583358caed5598fec9abea6750ff7fbd98d269 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 5 Dec 2022 16:04:57 +0900
Subject: [PATCH v15] Provide coverage for pg_dump default compression for dir
 format

The restore program will succeed regardless of whether the dumped output was
compressed or not. This commit implements a portable way to check the contents
of the directory via perl's built-in filename expansion.
---
 src/bin/pg_dump/t/002_pg_dump.pl | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 709db0986d..9796d2667f 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -36,6 +36,9 @@ my $tempdir = PostgreSQL::Test::Utils::tempdir;
 # to test pg_restore's ability to parse manually compressed files
 # that otherwise pg_dump does not compress on its own (e.g. *.toc).
 #
+# content_patterns is an optional array consisting of strings compilable
+# with glob() to check the files generated after a dump.
+#
 # restore_cmd is the pg_restore command to run, if any.  Note
 # that this should generally be used when the pg_dump goes to
 # a non-text file and that the restore can then be used to
@@ -46,6 +49,10 @@ my $tempdir = PostgreSQL::Test::Utils::tempdir;
 # database and then pg_dump *that* database (or something along
 # those lines) to validate that part of the process.
 
+my $supports_icu  = ($ENV{with_icu} eq 'yes');
+my $supports_lz4  = check_pg_config("#define USE_LZ4 1");
+my $supports_gzip = check_pg_config("#define HAVE_LIBZ 1");
+
 my %pgdump_runs = (
 	binary_upgrade => {
 		dump_cmd => [
@@ -213,6 +220,9 @@ my %pgdump_runs = (
 	},
 
 	# Do not use --no-sync to give test coverage for data sync.
+	# By default, the directory format compresses its contents
+	# when the code is compiled with gzip support, and lets things
+	# uncompressed when not compiled with it.
 	defaults_dir_format => {
 		test_key => 'defaults',
 		dump_cmd => [
@@ -224,6 +234,11 @@ my %pgdump_runs = (
 			"--file=$tempdir/defaults_dir_format.sql",
 			"$tempdir/defaults_dir_format",
 		],
+		content_patterns => ["$tempdir/defaults_dir_format/toc.dat",
+ $supports_gzip ?
+ "$tempdir/defaults_dir_format/*.dat.gz" :
+ "$tempdir/defaults_dir_format/*.dat",
+		],
 	},
 
 	# Do not use --no-sync to give test coverage for data sync.
@@ -3920,10 +3935,6 @@ if ($collation_check_stderr !~ /ERROR: /)
 	$collation_support = 1;
 }
 
-my $supports_icu  = ($ENV{with_icu} eq 'yes');
-my $supports_lz4  = check_pg_config("#define USE_LZ4 1");
-my $supports_gzip = check_pg_config("#define HAVE_LIBZ 1");
-
 # ICU doesn't work with some encodings
 my $encoding = $node->safe_psql('postgres', 'show server_encoding');
 $supports_icu = 0 if $encoding eq 'SQL_ASCII';
@@ -4153,6 +4164,16 @@ foreach my $run (sort keys %pgdump_runs)
 		command_ok(\@full_compress_cmd, "$run: compression commands");
 	}
 
+	if ($pgdump_runs{$run}->{content_patterns})
+	{
+		my $content_patterns = $pgdump_runs{$run}->{content_patterns};
+		foreach my $content_pattern (@{$content_patterns})
+		{
+			my @glob_output = glob($content_pattern);
+			is(scalar(@glob_output) > 0, 1, "$run: content check for $content_pattern");
+		}
+	}
+
 	if ($pgdump_runs{$run}->{restore_cmd})
 	{
 		$node->command_ok(\@{ $pgdump_runs{$run}->{restore_cmd} },
-- 
2.38.1





Re: Generate pg_stat_get_* functions with Macros

2022-12-04 Thread Drouvot, Bertrand

Hi,

On 12/4/22 6:32 PM, Nathan Bossart wrote:

On Sun, Dec 04, 2022 at 06:07:37AM +0100, Drouvot, Bertrand wrote:

On 12/3/22 9:16 PM, Nathan Bossart wrote:

Thanks.  I editorialized a bit in the attached v3.  I'm not sure that my
proposed names for the macros are actually an improvement.  WDYT?


Thanks! I do prefer the macros definition ordering that you're proposing (that makes 
pgstatfuncs.c "easier" to read).

As for the names, I think it's better to replace "TAB" with "REL" (like in v4 attached): the reason
is that those macros will be used in [1] for both table and index stats (and so we'd have to replace "TAB"
with "REL" in [1]).
Having "REL" already in place reduces the changes that will be needed in [1].


Alright.  I marked this as ready-for-committer.



Thanks!

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Generate pg_stat_get_* functions with Macros

2022-12-04 Thread Michael Paquier
On Mon, Dec 05, 2022 at 08:27:15AM +0100, Drouvot, Bertrand wrote:
> On 12/4/22 6:32 PM, Nathan Bossart wrote:
>> Alright.  I marked this as ready-for-committer.
> 
> Thanks!

Well, that's kind of nice:
 5 files changed, 139 insertions(+), 396 deletions(-)
And I like removing code, so..

In the same area, I am counting a total of 21 (?) pgstat routines for
databases that rely on pgstat_fetch_stat_dbentry() while returning an
int64.  This would lead to more cleanup.
--
Michael

