Re: [PATCH] Tab completion for SET COMPRESSION

2022-09-10 Thread Michael Paquier
On Thu, Sep 08, 2022 at 04:40:32PM +0900, Shinya Kato wrote:
> Thanks! I marked it as ready for committer.

I thought that there was a gotcha in this area for composite types,
but on second look it looks that I was wrong.  Hence, applied.
--
Michael


signature.asc
Description: PGP signature


Re: Summary function for pg_buffercache

2022-09-10 Thread Aleksander Alekseev
Hi Melih,

> I'm not sure about what undefined behaviour could harm this badly.

You are right that in practice nothing wrong will (probably) happen on
x86/x64 architecture with (most?) modern C compilers. This is not true in
the general case though. It's up to the compiler to decide how reading the
bufHdr->tag is going to be actually implemented. This can be one assembly
instruction or several instructions. This reading can be optimized-out if
the compiler believes the required value is already in the register, etc.
Since the result will be different depending on the assembly code used this
is an undefined behaviour and we can't use code like this.

> In the attached patch, I added buffer header locks just before examining
tag as follows

Many thanks for the updated patch! It looks better now.

However I have somewhat mixed feelings about avg_usagecount. Generally
AVG() is a relatively useless methric for monitoring. What if the user
wants MIN(), MAX() or let's say a 99th percentile? I suggest splitting it
into usagecount_min, usagecount_max and usagecount_sum. AVG() can be
derived as usercount_sum / used_buffers.

Also I suggest changing the names of the columns in order to make them
consistent with the rest of the system. If you consider pg_stat_activity
and family [1] you will notice that the columns are named
(entity)_(property), e.g. backend_xid, backend_type, client_addr, etc. So
instead of used_buffers and unused_buffers the naming should be
buffers_used and buffers_unused.

[1]: https://www.postgresql.org/docs/current/monitoring-stats.html

-- 
Best regards,
Aleksander Alekseev


[PATCH] initdb: do not exit after warn_on_mount_point

2022-09-10 Thread andrey . arapov
Hello,

please find my first patch for PostgreSQL is attached.
Kind regards,
Andrey Arapov


0001-initdb-do-not-exit-after-warn_on_mount_point.patch
Description: Binary data


Support for Rust

2022-09-10 Thread Lev Kokotov
Hello,

Are there any plans or thoughts about adding support for other languages
than C into Postgres, namely Rust? I would love to hack on some features
but I worry somewhat that the C compiler won't give me enough hints that
I'm doing something wrong, and the Rust compiler has been excellent at
preventing bugs.

Best,
Lev


pg_toast.pg_toast_relfilenode not exist due to vacuum full tablename

2022-09-10 Thread walker
hi,


this morning i met an issue, that after vacuum full tablename, the associated 
toast table shows not exist.
here is the operation steps:


drop table if exists reymont;
create table  reymont ( id bigint primary key, data bytea not null);
alter table reymont alter column data set compression pglz;
insert into reymont values(1, pg_read_binary_file('filename'));
vacuum full reymont;
select relname, relfilenode, reltoastrelid from pg_class where 
relname='reymont';
\d+ pg_toast.pg_toast_relfilenode
Did not find any relation named "pg_toast.pg_toast_relfilenode".


however, if display toast table before vacuum full operation, no problem.
drop table if exists reymont;
create table  reymont ( id bigint primary key, data bytea not null);
alter table reymont alter column data set compression pglz;
insert into reymont values(1, pg_read_binary_file('filename'));

\d+ pg_toast.pg_toast_relfilenode  --- it's ok, the toast table exists
vacuum full reymont;
\d+ pg_toast.pg_toast_relfilenode  --- it's ok, the toast table exists


it looks a little strange, any ideas? appreciate your help.


env:
pg14.4
linux 3.10.0-693.17.1.e17


thanks
walker

Re: Fix typo function circle_same (src/backend/utils/adt/geo_ops.c)

2022-09-10 Thread Ranier Vilela
Created a CF entry.
https://commitfest.postgresql.org/40/3883/

Attached a patch with a fix correction to regress output.

I think this needs to be backpatched until version 12.

regards,
Ranier Vilela


v1-0001-fix-typo-isnan-test-geo_ops.patch
Description: Binary data


Re: Summary function for pg_buffercache

2022-09-10 Thread Melih Mutlu
Hello Aleksander,

> I'm not sure about what undefined behaviour could harm this badly.
>
> You are right that in practice nothing wrong will (probably) happen on
> x86/x64 architecture with (most?) modern C compilers. This is not true in
> the general case though. It's up to the compiler to decide how reading the
> bufHdr->tag is going to be actually implemented. This can be one assembly
> instruction or several instructions. This reading can be optimized-out if
> the compiler believes the required value is already in the register, etc.
> Since the result will be different depending on the assembly code used this
> is an undefined behaviour and we can't use code like this.
>

Got it. Thanks for explaining.


> However I have somewhat mixed feelings about avg_usagecount. Generally
> AVG() is a relatively useless methric for monitoring. What if the user
> wants MIN(), MAX() or let's say a 99th percentile? I suggest splitting it
> into usagecount_min, usagecount_max and usagecount_sum. AVG() can be
> derived as usercount_sum / used_buffers.
>

Won't be usagecount_max almost always 5 as "BM_MAX_USAGE_COUNT" set to 5 in
buf_internals.h? I'm not sure about how much usagecount_min would add
either.
A usagecount is always an integer between 0 and 5, it's not
something unbounded. I think the 99th percentile would be much better than
average if strong outlier values could occur. But in this case, I feel like
an average value would be sufficiently useful as well.
usagecount_sum would actually be useful since average can be derived from
it. If you think that the sum of usagecounts has a meaning just by itself,
it makes sense to include it. Otherwise, wouldn't showing directly averaged
value be more useful?



> Also I suggest changing the names of the columns in order to make them
> consistent with the rest of the system. If you consider pg_stat_activity
> and family [1] you will notice that the columns are named
> (entity)_(property), e.g. backend_xid, backend_type, client_addr, etc. So
> instead of used_buffers and unused_buffers the naming should be
> buffers_used and buffers_unused.
>
> [1]: https://www.postgresql.org/docs/current/monitoring-stats.html
>

You're right. I will change the names accordingly. Thanks.


Regards,
Melih


Re: small windows psqlrc re-wording

2022-09-10 Thread Robert Treat
On Fri, Sep 9, 2022 at 1:52 PM Tom Lane  wrote:
>
> I wrote:
> > On testing that in HEAD, I read
>
> > Both the system-wide startup file and the user's personal startup file
> > can be made psql-version-specific by appending a dash and the
> > PostgreSQL major or minor release number to the file name, for example
> > ~/.psqlrc-16 or ~/.psqlrc-16devel.
>
> > That's a little confusing but it's actually accurate, because what
> > process_psqlrc_file appends is the string PG_VERSION, so in a devel
> > branch or beta release there's a non-numeric "minor release".
> > I'm inclined to go ahead and do it like that.
>
> I decided that what I found jarring about that was the use of "release
> number" with a non-numeric version, so I changed it to "release
> identifier" and pushed.
>

Looks good. Thanks Tom / Julien.


Robert Treat
https://xzilla.net




Re: pg_toast.pg_toast_relfilenode not exist due to vacuum full tablename

2022-09-10 Thread Tom Lane
"=?ISO-8859-1?B?d2Fsa2Vy?="  writes:
> this morning i met an issue, that after vacuum full tablename, the associated 
> toast table shows not exist.

Your example doesn't show what you actually did, but I think what is
fooling you is that VACUUM FULL changes the relfilenode of the table
but not the name of its toast table.  So the situation afterwards
might look like

regression=# select relname, relfilenode, reltoastrelid from pg_class where 
relname='reymont';
 relname | relfilenode | reltoastrelid 
-+-+---
 reymont |   40616 | 40611
(1 row)
regression=# select relname from pg_class where oid = 40611;
relname 

 pg_toast_40608
(1 row)

regression=# \d+ pg_toast.pg_toast_40608
TOAST table "pg_toast.pg_toast_40608"
   Column   |  Type   | Storage 
+-+-
 chunk_id   | oid | plain
 chunk_seq  | integer | plain
 chunk_data | bytea   | plain
Owning table: "public.reymont"
Indexes:
"pg_toast_40608_index" PRIMARY KEY, btree (chunk_id, chunk_seq)
Access method: heap

(where 40608 is reymont's original relfilenode).

I'm not sure if this should be considered a bug or not.  Everything still
works well enough, but conceivably we could have a TOAST name collision
down the road when we recycle the 40608 number --- I don't recall if
the TOAST logic is able to cope with that or not.

In any case, you should not be making assumptions about the name of
a TOAST table without verifying it.

regards, tom lane




Re: preserve timestamps when installing headers

2022-09-10 Thread Justin Pryzby
On Fri, Sep 09, 2022 at 10:23:57PM +0300, Heikki Linnakangas wrote:
> On 11/01/2022 00:03, Tom Lane wrote:
> > Peter Eisentraut  writes:
> > > I don't think preserving timestamps should be the default behavior, but
> > > I would support organizing things so that additional options can be
> > > passed to "install" to make it do whatever the user prefers.  But that
> > > won't work if some installations don't go through install.

> Here's a patch to switch back to $(INSTALL). With that, you can do:
> 
> ./configure INSTALL="/usr/bin/install -C"

+1, I recently looked for a way to do that while trying to accelerate
Cygwin/Mingw.

-- 
Justin




Re: [PATCH] initdb: do not exit after warn_on_mount_point

2022-09-10 Thread Tom Lane
andrey.ara...@nixaid.com writes:
> please find my first patch for PostgreSQL is attached.

You haven't explained why you think this would be a good
change, or even a safe one.

regards, tom lane




Re: Support for Rust

2022-09-10 Thread Andrey Borodin
Hi!

> On 10 Sep 2022, at 07:38, Lev Kokotov  wrote:
> 
> Are there any plans or thoughts about adding support for other languages than 
> C into Postgres, namely Rust? I would love to hack on some features but I 
> worry somewhat that the C compiler won't give me enough hints that I'm doing 
> something wrong, and the Rust compiler has been excellent at preventing bugs.

You can write Postgres extensions in Rust. And Postgres extensions are really 
powerful. What kind of features are you interested in?

Undoubtedly, attracting Rust folks to contribute Postgres could be a good 
things.
Yet some very simple questions arise.
1. Is Rust compatible with Memory Contexts and shared memory constructs of 
Postgres? With elog error reporting, PG_TRY() and his friends?
2. Does Rust support same set of platforms as Postgres? Quick glance at Build 
Farm can give an impression of what is supported by Postgres[0].
3. Do we gain anything besides compiler hints? Postgres development is hard due 
to interference of complex subsystems. It will be even harder if those systems 
will be implemented in different languages.

Probably, answers to all these questions are obvious to Rust pros. I just think 
this can be of interest to someone new to Rust (like me).


Best regards, Andrey Borodin.

[0] https://buildfarm.postgresql.org/cgi-bin/show_members.pl



Re: Support for Rust

2022-09-10 Thread Nathan Bossart
On Fri, Sep 09, 2022 at 07:38:14PM -0700, Lev Kokotov wrote:
> Are there any plans or thoughts about adding support for other languages
> than C into Postgres, namely Rust? I would love to hack on some features
> but I worry somewhat that the C compiler won't give me enough hints that
> I'm doing something wrong, and the Rust compiler has been excellent at
> preventing bugs.

There was some discussion about Rust back in 2017 [0] that you might be
interested in.

[0] 
https://www.postgresql.org/message-id/flat/CAASwCXdQUiuUnhycdRvrUmHuzk5PsaGxr54U4t34teQjcjb%3DAQ%40mail.gmail.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Splitting up guc.c

2022-09-10 Thread Tom Lane
Here's a WIP stab at the project Andres mentioned [1] of splitting up
guc.c into smaller files.  As things stand here, we have:

1. guc.c: the core GUC machinery.
2. guc_tables.c: the data arrays, and some previously-exposed constant
tables.  guc_tables.h can now be considered the associated header.
3. guc_hooks.c: (most of) the per-variable check/assign/show hooks
that had been in guc.c.  guc_hooks.h declares these.

File sizes are like so:

$ wc guc*c
  2629   9372  69467 guc-file.c
  7422  25136 202284 guc.c
   939   2693  22915 guc_hooks.c
  4877  13163 126769 guc_tables.c
 15867  50364 421435 total
$ size guc*o
   textdata bss dec hex filename
  13653   4 112   1376935c9 guc-file.o
  54953   0 564   55517d8dd guc.o
   6951   0 11270631b97 guc_hooks.o
  43570   62998 216  106784   1a120 guc_tables.o

I'm fairly happy with the way things turned out in guc.c and
guc_tables.c, but I don't much like guc_hooks.c.  I think instead of
creating such a file, what we should do is to shove most of those
functions into whatever module the GUC variable is associated with.
(Perhaps commands/variable.c could absorb any stragglers that lack
a better home.)  I made a start at that for wal_consistency_checking
and the syslog parameters, but haven't gone further yet.

Before proceeding further, I wanted to ask for comments on a design
choice that might be controversial.  Even though I don't want to
invent guc_hooks.c, I think we *should* invent guc_hooks.h, and
consolidate all the GUC hook function declarations there.  The
point would be to not have to #include guc.h in headers of unrelated
modules.  This is similar to what we've done with utils/fmgrprotos.h,
though the motivation is different.  I already moved a few declarations
from guc.h to there (and in consequence had to adjust #includes in
the modules defining those hooks), but there's a lot more to be done
if we apply that policy across the board.  Does anybody think that's
a bad approach, or have a better one?

BTW, this is more or less orthogonal to my other GUC patch at [2],
although both lead to the conclusion that we need to export
guc_malloc and friends.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/20220905233233.jhcu5jqsrtosmgh5%40awork3.anarazel.de
[2] 
https://www.postgresql.org/message-id/flat/2982579.1662416866%40sss.pgh.pa.us



split-up-guc-code-1.patch.gz
Description: split-up-guc-code-1.patch.gz


Re: Splitting up guc.c

2022-09-10 Thread Andres Freund
Hi,

On 2022-09-10 15:04:59 -0400, Tom Lane wrote:
> Here's a WIP stab at the project Andres mentioned [1] of splitting up
> guc.c into smaller files.

Cool!


> As things stand here, we have:
> 
> 1. guc.c: the core GUC machinery.
> 2. guc_tables.c: the data arrays, and some previously-exposed constant
> tables.  guc_tables.h can now be considered the associated header.
> 3. guc_hooks.c: (most of) the per-variable check/assign/show hooks
> that had been in guc.c.  guc_hooks.h declares these.
> 
> File sizes are like so:
> 
> $ wc guc*c
>   2629   9372  69467 guc-file.c
>   7422  25136 202284 guc.c
>939   2693  22915 guc_hooks.c
>   4877  13163 126769 guc_tables.c
>  15867  50364 421435 total
> $ size guc*o
>textdata bss dec hex filename
>   13653   4 112   1376935c9 guc-file.o
>   54953   0 564   55517d8dd guc.o
>6951   0 11270631b97 guc_hooks.o
>   43570   62998 216  106784   1a120 guc_tables.o

A tad surprised by the text size of guc_tables.o - not that it is a problem,
just seems a bit odd.


> I'm fairly happy with the way things turned out in guc.c and
> guc_tables.c, but I don't much like guc_hooks.c.  I think instead of
> creating such a file, what we should do is to shove most of those
> functions into whatever module the GUC variable is associated with.

+1. I think our associated habit of declaring externs in multiple .c files
isn't great either.



> Before proceeding further, I wanted to ask for comments on a design
> choice that might be controversial.  Even though I don't want to
> invent guc_hooks.c, I think we *should* invent guc_hooks.h, and
> consolidate all the GUC hook function declarations there.  The
> point would be to not have to #include guc.h in headers of unrelated
> modules.  This is similar to what we've done with utils/fmgrprotos.h,
> though the motivation is different.  I already moved a few declarations
> from guc.h to there (and in consequence had to adjust #includes in
> the modules defining those hooks), but there's a lot more to be done
> if we apply that policy across the board.  Does anybody think that's
> a bad approach, or have a better one?

Hm, I'm not opposed, the reasoning makes sense to me. How would this interact
with the declaration of the variables underlying GUCs?


Greetings,

Andres Freund




Re: Splitting up guc.c

2022-09-10 Thread Tom Lane
Andres Freund  writes:
> On 2022-09-10 15:04:59 -0400, Tom Lane wrote:
>> $ size guc*o
>> text data bss dec hex filename
>> 13653   4 112   1376935c9 guc-file.o
>> 54953   0 564   55517d8dd guc.o
>> 69510 11270631b97 guc_hooks.o
>> 43570   62998 216  106784   1a120 guc_tables.o

> A tad surprised by the text size of guc_tables.o - not that it is a problem,
> just seems a bit odd.

There's a pretty fair number of constant tables that got moved to there.
Not to mention all the constant strings.

>> Before proceeding further, I wanted to ask for comments on a design
>> choice that might be controversial.  Even though I don't want to
>> invent guc_hooks.c, I think we *should* invent guc_hooks.h, and
>> consolidate all the GUC hook function declarations there.  The
>> point would be to not have to #include guc.h in headers of unrelated
>> modules.  This is similar to what we've done with utils/fmgrprotos.h,
>> though the motivation is different.  I already moved a few declarations
>> from guc.h to there (and in consequence had to adjust #includes in
>> the modules defining those hooks), but there's a lot more to be done
>> if we apply that policy across the board.  Does anybody think that's
>> a bad approach, or have a better one?

> Hm, I'm not opposed, the reasoning makes sense to me. How would this interact
> with the declaration of the variables underlying GUCs?

I'd still declare the variables as we do now, ie just straightforwardly
export them from the associated modules.  Since they're all of native
C types, they don't cause any inclusion-footprint issues.  We could move
their declarations to a common file I guess, but I don't see any benefit.

regards, tom lane




Re: Splitting up guc.c

2022-09-10 Thread Andres Freund
Hi,

On 2022-09-10 12:15:33 -0700, Andres Freund wrote:
> On 2022-09-10 15:04:59 -0400, Tom Lane wrote:
> > As things stand here, we have:
> >
> > 1. guc.c: the core GUC machinery.
> > 2. guc_tables.c: the data arrays, and some previously-exposed constant
> > tables.  guc_tables.h can now be considered the associated header.
> > 3. guc_hooks.c: (most of) the per-variable check/assign/show hooks
> > that had been in guc.c.  guc_hooks.h declares these.
> >
> > File sizes are like so:
> >
> > $ wc guc*c
> >   2629   9372  69467 guc-file.c
> >   7422  25136 202284 guc.c
> >939   2693  22915 guc_hooks.c
> >   4877  13163 126769 guc_tables.c
> >  15867  50364 421435 total
> > $ size guc*o
> >textdata bss dec hex filename
> >   13653   4 112   1376935c9 guc-file.o
> >   54953   0 564   55517d8dd guc.o
> >6951   0 11270631b97 guc_hooks.o
> >   43570   62998 216  106784   1a120 guc_tables.o
>
> A tad surprised by the text size of guc_tables.o - not that it is a problem,
> just seems a bit odd.

Looks like that's just size misgrouping some section. Built a guc_tables.o
without debug information (that makes the output too complicated):

$ size guc_tables_nodebug.o
   textdata bss dec hex filename
  40044   66868 344  107256   1a2f8 guc_tables_nodebug.o

$ size --format=sysv guc_tables_nodebug.o
guc_tables_nodebug.o  :
sectionsize   addr
.text 0  0
.data52  0
.bss344  0
.rodata   40044  0
.data.rel.ro.local 3720  0
.data.rel.local   8  0
.data.rel 63088  0
.comment 31  0
.note.GNU-stack   0  0
Total107287

For some reason size adds .roata to the size for text in the default berkeley
style. Which is even documented:

   The Berkeley style output counts read only data in the "text" 
column, not in the "data" column, the "dec" and "hex" columns both display the 
sum
   of the "text", "data", and "bss" columns in decimal and hexadecimal 
respectively.

Greetings,

Andres Freund




Re: Splitting up guc.c

2022-09-10 Thread Tom Lane
I wrote:
> Andres Freund  writes:
>> On 2022-09-10 15:04:59 -0400, Tom Lane wrote:
>>> $ size guc*o
>>> text data bss dec hex filename
>>> 13653   4 112   1376935c9 guc-file.o
>>> 54953   0 564   55517d8dd guc.o
>>>  6951   0 11270631b97 guc_hooks.o
>>> 43570   62998 216  106784   1a120 guc_tables.o

>> A tad surprised by the text size of guc_tables.o - not that it is a problem,
>> just seems a bit odd.

> There's a pretty fair number of constant tables that got moved to there.
> Not to mention all the constant strings.

I forgot to include comparison numbers for HEAD:

$ wc guc*c
  2629   9372  69467 guc-file.c
 13335  41584 356896 guc.c
 15964  50956 426363 total
$ size guc*o
   textdata bss dec hex filename
  13653   4 112   1376935c9 guc-file.o
 105848   63156 908  169912   297b8 guc.o

This isn't completely apples-to-apples because of the few
hook functions I'd moved to other places in v1, but you can
see that the total text and data sizes didn't change much.
It'd likely indicate a mistake if they had.  (However, v1
does include const-ifying a few options tables that had
somehow escaped being labeled that way, so the total data
size did shrink a small amount.)

regards, tom lane




Re: CI and test improvements

2022-09-10 Thread Justin Pryzby
On Sun, Aug 28, 2022 at 02:28:02PM -0700, Andres Freund wrote:
> On 2022-08-28 12:10:29 -0500, Justin Pryzby wrote:
> > On Sun, Aug 28, 2022 at 09:07:52AM -0700, Andres Freund wrote:
> > > > --- /dev/null
> > > > +++ b/src/tools/ci/windows-compiler-warnings
> > >
> > > Wouldn't that be doable as something like
> > > sh -c 'if test -s file; then cat file;exit 1; fi"
> > > inside .cirrus.yml?
> >
> > I had written it inline in a couple ways, like
> > - sh -exc 'f=msbuild.warn.log; if [ -s "$f" ]; then cat "$f"; exit 1; else 
> > exit 0; fi'
> >
> > but then separated it out as you suggested in
> > 20220227010908.vz2a7dmfzgwg7...@alap3.anarazel.de
> >
> > after I complained about cmd.exe requiring escaping for && and ||
> > That makes writing any shell script a bit perilous and a separate script
> > seems better.
> 
> I remember that I suggested it - but note that the way I wrote above doesn't
> have anything needing escaping. 

It doesn't require it, but that still gives the impression that it's
normally possible to write one-liner shell scripts there, which is
misleading/wrong, and the reason why I took your suggestion to use a
separate script file.

> Anyway, what do you think of the multiline split I suggested?

Done, and sorted.

> That's what should have been in the commit message.

Sure.  I copied into the commit message the explanation that I had
written in June's email.

> > > > xi-os-only: freebsd
> > >
> > > Typo.
> >
> > No - it's deliberate so I can switch to and from "everything" to "this
> > only".
> 
> I don't see the point in posting patches to be applied if they contain lots of
> such things that a potential committer would need to catch and include a lot
> of of fixup patches.

I get that you disliked that I disabled the effect of a CI tag by
munging "c" to "x".  I've amended the message to avoid confusion.  But,
lots of what such things ?  "ci-os-only" would be removed before being
pushed anyway.

"catching things" is the first part of the review process, which (as I
understand) is intended to help patch authors to improve their patches.
If you found lots of problems in my patches, I'd need to know about
them; but most of what I heard seem like quibbles about the presentation
of the patches.  It's true that some parts are dirty/unclear, and that
seems reasonable for patches most of which haven't yet received review,
for which I asked whether to pursue the patch at all, and how best to
present them.  This is (or could be) an opportunity to make
improvements.

I renamed the two, related patches to Cluser.pm which said "f!", which
are deliberately separate but looked like "fixup" patches.  Are you
interested in any combination of those three, related changes to move
logic from Makefile to perl ?  If not, we don't need to debate the
merits of spliting the patch.

What about the three, related changes for ccache compression ?

Should these be dropped in favour of meson ?
 - cirrus/vcregress: test modules/contrib with NO_INSTALLCHECK=1
 - vcregress: add alltaptests

I added: "WIP: skip building if only docs have changed"

changesInclude() didn't seem to work right when I first tried to use it.
Eventually, I realized that it seems to use something like "git log",
and not "git diff" (as I'd thought).  It seems to work fine now that I
know what to expect.

git commit --amend --no-edit
git diff --stat @{1}..@{0} # this outputs nothing
git log  --stat @{1}..@{0} # this lists the files changed by the tip commit

It'd be nice to be have cfbot inject this patch into each commitfest
patch for awhile, to make sure everything works as expected.  Same for
the code coverage patch and the doc artifacts patch.  (These patches
currently assume that the base commit is HEAD~1, which is correct for
cfbot, and that would also provide code coverage and docs until such
time as cfbot is updated to apply and preserve the original series of
patches).

-- 
Justin
>From 0566ad149df1fc2ee5ba96e523933fc073165194 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 25 May 2022 21:53:22 -0500
Subject: [PATCH 01/23] cirrus/windows: add compiler_warnings_script

I'm not sure how to write this test in windows shell; it's also not easy to
write it in posix sh, since windows shell is somehow interpretting && and ||...

https://www.postgresql.org/message-id/20220212212310.f645c6vw3njkgxka%40alap3.anarazel.de

See also: 8a1ce5e54f6d144e4f8e19af7c767b026ee0c956

ci-os-only: windows

https://cirrus-ci.com/task/6183879907213312
https://cirrus-ci.com/task/4876271443247104
---
 .cirrus.yml| 14 +-
 src/tools/ci/windows-compiler-warnings | 16 
 2 files changed, 29 insertions(+), 1 deletion(-)
 create mode 100755 src/tools/ci/windows-compiler-warnings

diff --git a/.cirrus.yml b/.cirrus.yml
index 81eb8a9996d..2be62791448 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -370,7 +370,14 @@ task:
 # ForceNoAlign prevents msbuild from introducing line-breaks for long lines
 # disab

Re: Mingw task for Cirrus CI

2022-09-10 Thread Justin Pryzby
On Mon, Sep 05, 2022 at 04:52:17PM -0700, Andres Freund wrote:
> I don't think you should need to use --host, that indicates cross compiling,

This made me consider the idea of cross-compiling for windows under a
new linux task, and then running tests under Windows with a dependent
task.  I suppose that use -Og, in addition to CompilerWarnings, which
uses -O2.

cirrusci caches are (or can be) shared between tasks.  I got this mostly
working, with a few kludges to compile and install stuff under src/test.
This may be yet another idea that's obsoleted by meson.  WDYT?

-- 
Justin




Re: archive modules

2022-09-10 Thread Tom Lane
Nathan Bossart  writes:
> Of course.  I've marked it as ready-for-committer.

Pushed with a bit of additional wordsmithing.

regards, tom lane




Re: archive modules

2022-09-10 Thread Nathan Bossart
On Sat, Sep 10, 2022 at 04:44:16PM -0400, Tom Lane wrote:
> Nathan Bossart  writes:
>> Of course.  I've marked it as ready-for-committer.
> 
> Pushed with a bit of additional wordsmithing.

Thanks!

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Index ordering after IS NULL

2022-09-10 Thread Jeff Janes
On a two-column btree index, we can constrain the first column with
equality and read the rows in order by the 2nd column.  But we can't
constrain the first column by IS NULL and still read the rows in order by
the 2nd column.  But why not?  Surely the structure of the btree index
would allow for this to work.

Example:

create table if not exists j as select case when random()<0.9 then
floor(random()*10)::int end b, random() c from generate_series(1,100);
create index if not exists j_b_c_idx on j (b,c);
set enable_sort TO off;
explain analyze select * from j where b is null order by c limit 10;
explain analyze select * from j where b =8 order by c limit 10;

The first uses a sort despite it being disabled.

Cheers,

Jeff


Re: Support load balancing in libpq

2022-09-10 Thread Michael Banck
Hi,

the patch no longer applies cleanly, please rebase (it's trivial).

I don't like the provided commit message very much, I think the
discussion about pgJDBC having had load balancing for a while belongs
elsewhere.

On Wed, Jun 22, 2022 at 07:54:19AM +, Jelte Fennema wrote:
> I tried to stay in line with the naming of this same option in JDBC and
> Npgsql, where it's called "loadBalanceHosts" and "Load Balance Hosts"
> respectively. So, actually to be more in line it should be the option for 
> libpq should be called "load_balance_hosts" (not "loadbalance" like 
> in the previous patch). I attached a new patch with the name of the 
> option changed to this.

Maybe my imagination is not so great, but what else than hosts could we
possibly load-balance? I don't mind calling it load_balance, but I also
don't feel very strongly one way or the other and this is clearly
bikeshed territory.
 
> I also don't think the name is misleading. Randomization of hosts will 
> automatically result in balancing the load across multiple hosts. This is 
> assuming more than a single connection is made using the connection 
> string, either on the same client node or on different client nodes. I think
> I think is a fair assumption to make. Also note that this patch does not load 
> balance queries, it load balances connections. This is because libpq works
> at the connection level, not query level, due to session level state. 

I agree.

Also, I think the scope is ok for a first implementation (but see
below). You or others could possibly further enhance the algorithm in
the future, but it seems to be useful as-is.

> I agree it is indeed fairly simplistic load balancing.

If I understand correctly, you've added DNS-based load balancing on top
of just shuffling the provided hostnames.  This makes sense if a
hostname is backed by more than one IP address in the context of load
balancing, but it also complicates the patch. So I'm wondering how much
shorter the patch would be if you leave that out for now?

On the other hand, I believe pgJDBC keeps track of which hosts are up or
down and only load balances among the ones which are up (maybe
rechecking after a timeout? I don't remember), is this something you're
doing, or did you consider it?

Some quick remarks on the patch:

/* OK, scan this addrlist for a working server address */
-   conn->addr_cur = conn->addrlist;
reset_connection_state_machine = true;
conn->try_next_host = false;

The comment might need to be updated.

+   int naddr;  /* # of addrs returned 
by getaddrinfo */

This is spelt "number of" in several other places in the file, and we
still have enough space to spell it out here as well without a
line-wrap.


Michael




Re: Index ordering after IS NULL

2022-09-10 Thread Peter Geoghegan
On Sat, Sep 10, 2022 at 2:28 PM Jeff Janes  wrote:
> explain analyze select * from j where b is null order by c limit 10;
> explain analyze select * from j where b =8 order by c limit 10;
>
> The first uses a sort despite it being disabled.

The first/is null query seems to give the result and plan you're
looking for if the query is rewritten to order by "b, c", and not just
"c".

That in itself doesn't make your complaint any less valid, of course.
You don't have to do this with the second query, so why should you
have to do it with the first?

-- 
Peter Geoghegan




Re: why can't a table be part of the same publication as its schema

2022-09-10 Thread Robert Haas
On Fri, Sep 9, 2022 at 2:17 PM Mark Dilger  wrote:
> > On Sep 9, 2022, at 8:18 AM, Robert Haas  wrote:
> > Things might be clearer if we'd made the syntax "ALTER PUBLICATION p1
> > { ADD | DROP } { TABLE | SCHEMA } name". I don't understand why we
> > used this ALL TABLES IN SCHEMA language.
>
> The conversation, as I recall, was that "ADD SCHEMA foo" would only mean all 
> tables in foo, until publication of other object types became supported, at 
> which point "ADD SCHEMA foo" would suddenly mean more than it did before.  
> People might find that surprising, so the "ALL TABLES IN" was intended to 
> future-proof against surprising behavioral changes.

If I encountered this syntax in a vacuum, that's not what I would
think. I would think that ADD ALL TABLES IN SCHEMA meant add all the
tables in the schema to the publication one by one as individual
objects, i.e. add the tables that are currently as of this moment in
that schema to the publication; and I would think that ADD SCHEMA
meant remember that this schema is part of the publication and so
whenever tables are created and dropped in that schema (or moved in
and out) what is being published is automatically updated.

The analogy here seems to be to GRANT, which actually does support
both syntaxes. And if I understand correctly, GRANT ON SCHEMA gives
privileges on the schema; whereas GRANT ON ALL TABLES IN SCHEMA
modifies each table that is currently in that schema (never mind what
happens later).

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Bump MIN_WINNT to 0x0600 (Vista) as minimal runtime in 16~

2022-09-10 Thread Michael Paquier
On Fri, Sep 09, 2022 at 08:11:09PM +0900, Michael Paquier wrote:
> Based on what I can see, the Windows animals seem to have digested
> 47bd0b3 (cygwin, MinGW and MSVC), so I think that we are good.

The last part that's worth adjusting is ldap_start_tls_sA(), which
would lead to the attached simplification.  The MinGW headers list
this routine, so like the previous change I think that it should be
safe for such builds.

Looking at the buildfarm animals, bowerbird, jacana, fairywren,
lorikeet and drongo disable ldap.  hamerkop is the only member that
provides coverage for it, still that's a MSVC build.

The CI provides coverage for ldap as it is enabled by default and
windows_build_config.pl does not tell otherwise, but with the existing
animals we don't have ldap coverage under MinGW.

Anyway, I'd like to apply the attached, and I don't quite see why it
would not work after 47bd0b3 under MinGW.  Any thoughts?
--
Michael
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index a776bc3ed7..3a10baa9ce 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -135,14 +135,6 @@ static int	CheckBSDAuth(Port *port, char *user);
 #else
 #include 
 
-/* Correct header from the Platform SDK */
-typedef
-ULONG		(*__ldap_start_tls_sA) (IN PLDAP ExternalHandle,
-	OUT PULONG ServerReturnValue,
-	OUT LDAPMessage **result,
-	IN PLDAPControlA * ServerControls,
-	IN PLDAPControlA * ClientControls
-);
 #endif
 
 static int	CheckLDAPAuth(Port *port);
@@ -2348,48 +2340,7 @@ InitializeLDAPConnection(Port *port, LDAP **ldap)
 #ifndef WIN32
 		if ((r = ldap_start_tls_s(*ldap, NULL, NULL)) != LDAP_SUCCESS)
 #else
-		static __ldap_start_tls_sA _ldap_start_tls_sA = NULL;
-
-		if (_ldap_start_tls_sA == NULL)
-		{
-			/*
-			 * Need to load this function dynamically because it may not exist
-			 * on Windows, and causes a load error for the whole exe if
-			 * referenced.
-			 */
-			HANDLE		ldaphandle;
-
-			ldaphandle = LoadLibrary("WLDAP32.DLL");
-			if (ldaphandle == NULL)
-			{
-/*
- * should never happen since we import other files from
- * wldap32, but check anyway
- */
-ereport(LOG,
-		(errmsg("could not load library \"%s\": error code %lu",
-"WLDAP32.DLL", GetLastError(;
-ldap_unbind(*ldap);
-return STATUS_ERROR;
-			}
-			_ldap_start_tls_sA = (__ldap_start_tls_sA) (pg_funcptr_t) GetProcAddress(ldaphandle, "ldap_start_tls_sA");
-			if (_ldap_start_tls_sA == NULL)
-			{
-ereport(LOG,
-		(errmsg("could not load function _ldap_start_tls_sA in wldap32.dll"),
-		 errdetail("LDAP over SSL is not supported on this platform.")));
-ldap_unbind(*ldap);
-FreeLibrary(ldaphandle);
-return STATUS_ERROR;
-			}
-
-			/*
-			 * Leak LDAP handle on purpose, because we need the library to
-			 * stay open. This is ok because it will only ever be leaked once
-			 * per process and is automatically cleaned up on process exit.
-			 */
-		}
-		if ((r = _ldap_start_tls_sA(*ldap, NULL, NULL, NULL, NULL)) != LDAP_SUCCESS)
+		if ((r = ldap_start_tls_sA(*ldap, NULL, NULL, NULL, NULL)) != LDAP_SUCCESS)
 #endif
 		{
 			ereport(LOG,


signature.asc
Description: PGP signature


Re: Splitting up guc.c

2022-09-10 Thread Michael Paquier
On Sat, Sep 10, 2022 at 03:04:59PM -0400, Tom Lane wrote:
> Before proceeding further, I wanted to ask for comments on a design
> choice that might be controversial.  Even though I don't want to
> invent guc_hooks.c, I think we *should* invent guc_hooks.h, and
> consolidate all the GUC hook function declarations there.  The
> point would be to not have to #include guc.h in headers of unrelated
> modules.  This is similar to what we've done with utils/fmgrprotos.h,
> though the motivation is different.  I already moved a few declarations
> from guc.h to there (and in consequence had to adjust #includes in
> the modules defining those hooks), but there's a lot more to be done
> if we apply that policy across the board.  Does anybody think that's
> a bad approach, or have a better one?

One part that I have found a bit strange lately about guc.c is that we
have mix the core machinery with the SQL-callable parts.  What do you
think about the addition of a gucfuncs.c in src/backend/utils/adt/ to
split things a bit more?
--
Michael


signature.asc
Description: PGP signature


Re: why can't a table be part of the same publication as its schema

2022-09-10 Thread Isaac Morland
On Sat, 10 Sept 2022 at 19:18, Robert Haas  wrote:

If I encountered this syntax in a vacuum, that's not what I would
> think. I would think that ADD ALL TABLES IN SCHEMA meant add all the
> tables in the schema to the publication one by one as individual
> objects, i.e. add the tables that are currently as of this moment in
> that schema to the publication; and I would think that ADD SCHEMA
> meant remember that this schema is part of the publication and so
> whenever tables are created and dropped in that schema (or moved in
> and out) what is being published is automatically updated.
>
> The analogy here seems to be to GRANT, which actually does support
> both syntaxes. And if I understand correctly, GRANT ON SCHEMA gives
> privileges on the schema; whereas GRANT ON ALL TABLES IN SCHEMA
> modifies each table that is currently in that schema (never mind what
> happens later).
>

Yes, except GRANT ON SCHEMA only grants access to the schema - CREATE or
USAGE. You cannot write GRANT SELECT ON SCHEMA to grant access to all
tables in the schema.


Re: Splitting up guc.c

2022-09-10 Thread Tom Lane
Michael Paquier  writes:
> One part that I have found a bit strange lately about guc.c is that we
> have mix the core machinery with the SQL-callable parts.  What do you
> think about the addition of a gucfuncs.c in src/backend/utils/adt/ to
> split things a bit more?

I might be wrong, but I think the SQL-callable stuff makes use
of some APIs that are currently private in guc.c.  So we'd have
to expose more API to make that possible.  Maybe that wouldn't
be a bad thing, but it seems to be getting beyond the original
idea here.  (Note I already had to expose find_option() in
order to get the wal_consistency_checking stuff moved out.)
It's not clear to me that "move the SQL-callable stuff" will
end with a nice API boundary.

regards, tom lane




Re: Index ordering after IS NULL

2022-09-10 Thread Tom Lane
Jeff Janes  writes:
> On a two-column btree index, we can constrain the first column with
> equality and read the rows in order by the 2nd column.  But we can't
> constrain the first column by IS NULL and still read the rows in order by
> the 2nd column.  But why not?

"x IS NULL" doesn't give rise to an EquivalenceClass, which is what
is needed to drive the deduction that the first index column isn't
affecting the result ordering.

Maybe we could extend the notion of ECs to allow that, but I'm not
too sure about how it'd work.  There are already some expectations
that EC equality operators be strict, and this'd blow a large hole
in a lot of related assumptions.  For example, given "x IS NULL AND
x = y", the correct deduction is not "y IS NULL", it's that the
WHERE condition is constant-FALSE.

regards, tom lane




Re: Bump MIN_WINNT to 0x0600 (Vista) as minimal runtime in 16~

2022-09-10 Thread Justin Pryzby
On Sun, Sep 11, 2022 at 09:28:54AM +0900, Michael Paquier wrote:
> On Fri, Sep 09, 2022 at 08:11:09PM +0900, Michael Paquier wrote:
> > Based on what I can see, the Windows animals seem to have digested
> > 47bd0b3 (cygwin, MinGW and MSVC), so I think that we are good.
> 
> The last part that's worth adjusting is ldap_start_tls_sA(), which
> would lead to the attached simplification.  The MinGW headers list
> this routine, so like the previous change I think that it should be
> safe for such builds.
> 
> Looking at the buildfarm animals, bowerbird, jacana, fairywren,
> lorikeet and drongo disable ldap.  hamerkop is the only member that
> provides coverage for it, still that's a MSVC build.
> 
> The CI provides coverage for ldap as it is enabled by default and
> windows_build_config.pl does not tell otherwise, but with the existing
> animals we don't have ldap coverage under MinGW.
> 
> Anyway, I'd like to apply the attached, and I don't quite see why it
> would not work after 47bd0b3 under MinGW.  Any thoughts?

There's a CF entry to add it, and I launched it with your patch.
(This is in a branch which already has that, and also does a few other
things differently).

https://cirrus-ci.com/task/6302833585684480

[02:07:57.497] checking whether to build with LDAP support... yes

It compiles, which is probably all that matters, and eventually skips
the test anyway.

[02:23:18.209] [02:23:18] c:/cirrus/src/test/ldap/t/001_auth.pl ..  skipped: 
ldap tests not supported on MSWin32 or dependencies not installed

-- 
Justin




Re: Bump MIN_WINNT to 0x0600 (Vista) as minimal runtime in 16~

2022-09-10 Thread Michael Paquier
On Sat, Sep 10, 2022 at 10:39:19PM -0500, Justin Pryzby wrote:
> There's a CF entry to add it, and I launched it with your patch.
> (This is in a branch which already has that, and also does a few other
> things differently).

No need for a CF entry if you want to play with the tree.  I have
Cirrus enabled on my own fork of Postgres on github, and I saw the
same result as you:
https://github.com/michaelpq/postgres/
--
Michael


signature.asc
Description: PGP signature