Re: [PATCH] Tab completion for SET COMPRESSION
On Thu, Sep 08, 2022 at 04:40:32PM +0900, Shinya Kato wrote: > Thanks! I marked it as ready for committer. I thought that there was a gotcha in this area for composite types, but on second look it looks that I was wrong. Hence, applied. -- Michael signature.asc Description: PGP signature
Re: Summary function for pg_buffercache
Hi Melih, > I'm not sure about what undefined behaviour could harm this badly. You are right that in practice nothing wrong will (probably) happen on x86/x64 architecture with (most?) modern C compilers. This is not true in the general case though. It's up to the compiler to decide how reading the bufHdr->tag is going to be actually implemented. This can be one assembly instruction or several instructions. This reading can be optimized-out if the compiler believes the required value is already in the register, etc. Since the result will be different depending on the assembly code used this is an undefined behaviour and we can't use code like this. > In the attached patch, I added buffer header locks just before examining tag as follows Many thanks for the updated patch! It looks better now. However I have somewhat mixed feelings about avg_usagecount. Generally AVG() is a relatively useless methric for monitoring. What if the user wants MIN(), MAX() or let's say a 99th percentile? I suggest splitting it into usagecount_min, usagecount_max and usagecount_sum. AVG() can be derived as usercount_sum / used_buffers. Also I suggest changing the names of the columns in order to make them consistent with the rest of the system. If you consider pg_stat_activity and family [1] you will notice that the columns are named (entity)_(property), e.g. backend_xid, backend_type, client_addr, etc. So instead of used_buffers and unused_buffers the naming should be buffers_used and buffers_unused. [1]: https://www.postgresql.org/docs/current/monitoring-stats.html -- Best regards, Aleksander Alekseev
[PATCH] initdb: do not exit after warn_on_mount_point
Hello, please find my first patch for PostgreSQL is attached. Kind regards, Andrey Arapov 0001-initdb-do-not-exit-after-warn_on_mount_point.patch Description: Binary data
Support for Rust
Hello, Are there any plans or thoughts about adding support for other languages than C into Postgres, namely Rust? I would love to hack on some features but I worry somewhat that the C compiler won't give me enough hints that I'm doing something wrong, and the Rust compiler has been excellent at preventing bugs. Best, Lev
pg_toast.pg_toast_relfilenode not exist due to vacuum full tablename
hi, this morning i met an issue, that after vacuum full tablename, the associated toast table shows not exist. here is the operation steps: drop table if exists reymont; create table reymont ( id bigint primary key, data bytea not null); alter table reymont alter column data set compression pglz; insert into reymont values(1, pg_read_binary_file('filename')); vacuum full reymont; select relname, relfilenode, reltoastrelid from pg_class where relname='reymont'; \d+ pg_toast.pg_toast_relfilenode Did not find any relation named "pg_toast.pg_toast_relfilenode". however, if display toast table before vacuum full operation, no problem. drop table if exists reymont; create table reymont ( id bigint primary key, data bytea not null); alter table reymont alter column data set compression pglz; insert into reymont values(1, pg_read_binary_file('filename')); \d+ pg_toast.pg_toast_relfilenode --- it's ok, the toast table exists vacuum full reymont; \d+ pg_toast.pg_toast_relfilenode --- it's ok, the toast table exists it looks a little strange, any ideas? appreciate your help. env: pg14.4 linux 3.10.0-693.17.1.e17 thanks walker
Re: Fix typo function circle_same (src/backend/utils/adt/geo_ops.c)
Created a CF entry. https://commitfest.postgresql.org/40/3883/ Attached a patch with a fix correction to regress output. I think this needs to be backpatched until version 12. regards, Ranier Vilela v1-0001-fix-typo-isnan-test-geo_ops.patch Description: Binary data
Re: Summary function for pg_buffercache
Hello Aleksander, > I'm not sure about what undefined behaviour could harm this badly. > > You are right that in practice nothing wrong will (probably) happen on > x86/x64 architecture with (most?) modern C compilers. This is not true in > the general case though. It's up to the compiler to decide how reading the > bufHdr->tag is going to be actually implemented. This can be one assembly > instruction or several instructions. This reading can be optimized-out if > the compiler believes the required value is already in the register, etc. > Since the result will be different depending on the assembly code used this > is an undefined behaviour and we can't use code like this. > Got it. Thanks for explaining. > However I have somewhat mixed feelings about avg_usagecount. Generally > AVG() is a relatively useless methric for monitoring. What if the user > wants MIN(), MAX() or let's say a 99th percentile? I suggest splitting it > into usagecount_min, usagecount_max and usagecount_sum. AVG() can be > derived as usercount_sum / used_buffers. > Won't be usagecount_max almost always 5 as "BM_MAX_USAGE_COUNT" set to 5 in buf_internals.h? I'm not sure about how much usagecount_min would add either. A usagecount is always an integer between 0 and 5, it's not something unbounded. I think the 99th percentile would be much better than average if strong outlier values could occur. But in this case, I feel like an average value would be sufficiently useful as well. usagecount_sum would actually be useful since average can be derived from it. If you think that the sum of usagecounts has a meaning just by itself, it makes sense to include it. Otherwise, wouldn't showing directly averaged value be more useful? > Also I suggest changing the names of the columns in order to make them > consistent with the rest of the system. If you consider pg_stat_activity > and family [1] you will notice that the columns are named > (entity)_(property), e.g. backend_xid, backend_type, client_addr, etc. So > instead of used_buffers and unused_buffers the naming should be > buffers_used and buffers_unused. > > [1]: https://www.postgresql.org/docs/current/monitoring-stats.html > You're right. I will change the names accordingly. Thanks. Regards, Melih
Re: small windows psqlrc re-wording
On Fri, Sep 9, 2022 at 1:52 PM Tom Lane wrote: > > I wrote: > > On testing that in HEAD, I read > > > Both the system-wide startup file and the user's personal startup file > > can be made psql-version-specific by appending a dash and the > > PostgreSQL major or minor release number to the file name, for example > > ~/.psqlrc-16 or ~/.psqlrc-16devel. > > > That's a little confusing but it's actually accurate, because what > > process_psqlrc_file appends is the string PG_VERSION, so in a devel > > branch or beta release there's a non-numeric "minor release". > > I'm inclined to go ahead and do it like that. > > I decided that what I found jarring about that was the use of "release > number" with a non-numeric version, so I changed it to "release > identifier" and pushed. > Looks good. Thanks Tom / Julien. Robert Treat https://xzilla.net
Re: pg_toast.pg_toast_relfilenode not exist due to vacuum full tablename
"=?ISO-8859-1?B?d2Fsa2Vy?=" writes: > this morning i met an issue, that after vacuum full tablename, the associated > toast table shows not exist. Your example doesn't show what you actually did, but I think what is fooling you is that VACUUM FULL changes the relfilenode of the table but not the name of its toast table. So the situation afterwards might look like regression=# select relname, relfilenode, reltoastrelid from pg_class where relname='reymont'; relname | relfilenode | reltoastrelid -+-+--- reymont | 40616 | 40611 (1 row) regression=# select relname from pg_class where oid = 40611; relname pg_toast_40608 (1 row) regression=# \d+ pg_toast.pg_toast_40608 TOAST table "pg_toast.pg_toast_40608" Column | Type | Storage +-+- chunk_id | oid | plain chunk_seq | integer | plain chunk_data | bytea | plain Owning table: "public.reymont" Indexes: "pg_toast_40608_index" PRIMARY KEY, btree (chunk_id, chunk_seq) Access method: heap (where 40608 is reymont's original relfilenode). I'm not sure if this should be considered a bug or not. Everything still works well enough, but conceivably we could have a TOAST name collision down the road when we recycle the 40608 number --- I don't recall if the TOAST logic is able to cope with that or not. In any case, you should not be making assumptions about the name of a TOAST table without verifying it. regards, tom lane
Re: preserve timestamps when installing headers
On Fri, Sep 09, 2022 at 10:23:57PM +0300, Heikki Linnakangas wrote: > On 11/01/2022 00:03, Tom Lane wrote: > > Peter Eisentraut writes: > > > I don't think preserving timestamps should be the default behavior, but > > > I would support organizing things so that additional options can be > > > passed to "install" to make it do whatever the user prefers. But that > > > won't work if some installations don't go through install. > Here's a patch to switch back to $(INSTALL). With that, you can do: > > ./configure INSTALL="/usr/bin/install -C" +1, I recently looked for a way to do that while trying to accelerate Cygwin/Mingw. -- Justin
Re: [PATCH] initdb: do not exit after warn_on_mount_point
andrey.ara...@nixaid.com writes: > please find my first patch for PostgreSQL is attached. You haven't explained why you think this would be a good change, or even a safe one. regards, tom lane
Re: Support for Rust
Hi! > On 10 Sep 2022, at 07:38, Lev Kokotov wrote: > > Are there any plans or thoughts about adding support for other languages than > C into Postgres, namely Rust? I would love to hack on some features but I > worry somewhat that the C compiler won't give me enough hints that I'm doing > something wrong, and the Rust compiler has been excellent at preventing bugs. You can write Postgres extensions in Rust. And Postgres extensions are really powerful. What kind of features are you interested in? Undoubtedly, attracting Rust folks to contribute Postgres could be a good things. Yet some very simple questions arise. 1. Is Rust compatible with Memory Contexts and shared memory constructs of Postgres? With elog error reporting, PG_TRY() and his friends? 2. Does Rust support same set of platforms as Postgres? Quick glance at Build Farm can give an impression of what is supported by Postgres[0]. 3. Do we gain anything besides compiler hints? Postgres development is hard due to interference of complex subsystems. It will be even harder if those systems will be implemented in different languages. Probably, answers to all these questions are obvious to Rust pros. I just think this can be of interest to someone new to Rust (like me). Best regards, Andrey Borodin. [0] https://buildfarm.postgresql.org/cgi-bin/show_members.pl
Re: Support for Rust
On Fri, Sep 09, 2022 at 07:38:14PM -0700, Lev Kokotov wrote: > Are there any plans or thoughts about adding support for other languages > than C into Postgres, namely Rust? I would love to hack on some features > but I worry somewhat that the C compiler won't give me enough hints that > I'm doing something wrong, and the Rust compiler has been excellent at > preventing bugs. There was some discussion about Rust back in 2017 [0] that you might be interested in. [0] https://www.postgresql.org/message-id/flat/CAASwCXdQUiuUnhycdRvrUmHuzk5PsaGxr54U4t34teQjcjb%3DAQ%40mail.gmail.com -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Splitting up guc.c
Here's a WIP stab at the project Andres mentioned [1] of splitting up guc.c into smaller files. As things stand here, we have: 1. guc.c: the core GUC machinery. 2. guc_tables.c: the data arrays, and some previously-exposed constant tables. guc_tables.h can now be considered the associated header. 3. guc_hooks.c: (most of) the per-variable check/assign/show hooks that had been in guc.c. guc_hooks.h declares these. File sizes are like so: $ wc guc*c 2629 9372 69467 guc-file.c 7422 25136 202284 guc.c 939 2693 22915 guc_hooks.c 4877 13163 126769 guc_tables.c 15867 50364 421435 total $ size guc*o textdata bss dec hex filename 13653 4 112 1376935c9 guc-file.o 54953 0 564 55517d8dd guc.o 6951 0 11270631b97 guc_hooks.o 43570 62998 216 106784 1a120 guc_tables.o I'm fairly happy with the way things turned out in guc.c and guc_tables.c, but I don't much like guc_hooks.c. I think instead of creating such a file, what we should do is to shove most of those functions into whatever module the GUC variable is associated with. (Perhaps commands/variable.c could absorb any stragglers that lack a better home.) I made a start at that for wal_consistency_checking and the syslog parameters, but haven't gone further yet. Before proceeding further, I wanted to ask for comments on a design choice that might be controversial. Even though I don't want to invent guc_hooks.c, I think we *should* invent guc_hooks.h, and consolidate all the GUC hook function declarations there. The point would be to not have to #include guc.h in headers of unrelated modules. This is similar to what we've done with utils/fmgrprotos.h, though the motivation is different. I already moved a few declarations from guc.h to there (and in consequence had to adjust #includes in the modules defining those hooks), but there's a lot more to be done if we apply that policy across the board. Does anybody think that's a bad approach, or have a better one? BTW, this is more or less orthogonal to my other GUC patch at [2], although both lead to the conclusion that we need to export guc_malloc and friends. regards, tom lane [1] https://www.postgresql.org/message-id/20220905233233.jhcu5jqsrtosmgh5%40awork3.anarazel.de [2] https://www.postgresql.org/message-id/flat/2982579.1662416866%40sss.pgh.pa.us split-up-guc-code-1.patch.gz Description: split-up-guc-code-1.patch.gz
Re: Splitting up guc.c
Hi, On 2022-09-10 15:04:59 -0400, Tom Lane wrote: > Here's a WIP stab at the project Andres mentioned [1] of splitting up > guc.c into smaller files. Cool! > As things stand here, we have: > > 1. guc.c: the core GUC machinery. > 2. guc_tables.c: the data arrays, and some previously-exposed constant > tables. guc_tables.h can now be considered the associated header. > 3. guc_hooks.c: (most of) the per-variable check/assign/show hooks > that had been in guc.c. guc_hooks.h declares these. > > File sizes are like so: > > $ wc guc*c > 2629 9372 69467 guc-file.c > 7422 25136 202284 guc.c >939 2693 22915 guc_hooks.c > 4877 13163 126769 guc_tables.c > 15867 50364 421435 total > $ size guc*o >textdata bss dec hex filename > 13653 4 112 1376935c9 guc-file.o > 54953 0 564 55517d8dd guc.o >6951 0 11270631b97 guc_hooks.o > 43570 62998 216 106784 1a120 guc_tables.o A tad surprised by the text size of guc_tables.o - not that it is a problem, just seems a bit odd. > I'm fairly happy with the way things turned out in guc.c and > guc_tables.c, but I don't much like guc_hooks.c. I think instead of > creating such a file, what we should do is to shove most of those > functions into whatever module the GUC variable is associated with. +1. I think our associated habit of declaring externs in multiple .c files isn't great either. > Before proceeding further, I wanted to ask for comments on a design > choice that might be controversial. Even though I don't want to > invent guc_hooks.c, I think we *should* invent guc_hooks.h, and > consolidate all the GUC hook function declarations there. The > point would be to not have to #include guc.h in headers of unrelated > modules. This is similar to what we've done with utils/fmgrprotos.h, > though the motivation is different. I already moved a few declarations > from guc.h to there (and in consequence had to adjust #includes in > the modules defining those hooks), but there's a lot more to be done > if we apply that policy across the board. Does anybody think that's > a bad approach, or have a better one? Hm, I'm not opposed, the reasoning makes sense to me. How would this interact with the declaration of the variables underlying GUCs? Greetings, Andres Freund
Re: Splitting up guc.c
Andres Freund writes: > On 2022-09-10 15:04:59 -0400, Tom Lane wrote: >> $ size guc*o >> text data bss dec hex filename >> 13653 4 112 1376935c9 guc-file.o >> 54953 0 564 55517d8dd guc.o >> 69510 11270631b97 guc_hooks.o >> 43570 62998 216 106784 1a120 guc_tables.o > A tad surprised by the text size of guc_tables.o - not that it is a problem, > just seems a bit odd. There's a pretty fair number of constant tables that got moved to there. Not to mention all the constant strings. >> Before proceeding further, I wanted to ask for comments on a design >> choice that might be controversial. Even though I don't want to >> invent guc_hooks.c, I think we *should* invent guc_hooks.h, and >> consolidate all the GUC hook function declarations there. The >> point would be to not have to #include guc.h in headers of unrelated >> modules. This is similar to what we've done with utils/fmgrprotos.h, >> though the motivation is different. I already moved a few declarations >> from guc.h to there (and in consequence had to adjust #includes in >> the modules defining those hooks), but there's a lot more to be done >> if we apply that policy across the board. Does anybody think that's >> a bad approach, or have a better one? > Hm, I'm not opposed, the reasoning makes sense to me. How would this interact > with the declaration of the variables underlying GUCs? I'd still declare the variables as we do now, ie just straightforwardly export them from the associated modules. Since they're all of native C types, they don't cause any inclusion-footprint issues. We could move their declarations to a common file I guess, but I don't see any benefit. regards, tom lane
Re: Splitting up guc.c
Hi, On 2022-09-10 12:15:33 -0700, Andres Freund wrote: > On 2022-09-10 15:04:59 -0400, Tom Lane wrote: > > As things stand here, we have: > > > > 1. guc.c: the core GUC machinery. > > 2. guc_tables.c: the data arrays, and some previously-exposed constant > > tables. guc_tables.h can now be considered the associated header. > > 3. guc_hooks.c: (most of) the per-variable check/assign/show hooks > > that had been in guc.c. guc_hooks.h declares these. > > > > File sizes are like so: > > > > $ wc guc*c > > 2629 9372 69467 guc-file.c > > 7422 25136 202284 guc.c > >939 2693 22915 guc_hooks.c > > 4877 13163 126769 guc_tables.c > > 15867 50364 421435 total > > $ size guc*o > >textdata bss dec hex filename > > 13653 4 112 1376935c9 guc-file.o > > 54953 0 564 55517d8dd guc.o > >6951 0 11270631b97 guc_hooks.o > > 43570 62998 216 106784 1a120 guc_tables.o > > A tad surprised by the text size of guc_tables.o - not that it is a problem, > just seems a bit odd. Looks like that's just size misgrouping some section. Built a guc_tables.o without debug information (that makes the output too complicated): $ size guc_tables_nodebug.o textdata bss dec hex filename 40044 66868 344 107256 1a2f8 guc_tables_nodebug.o $ size --format=sysv guc_tables_nodebug.o guc_tables_nodebug.o : sectionsize addr .text 0 0 .data52 0 .bss344 0 .rodata 40044 0 .data.rel.ro.local 3720 0 .data.rel.local 8 0 .data.rel 63088 0 .comment 31 0 .note.GNU-stack 0 0 Total107287 For some reason size adds .roata to the size for text in the default berkeley style. Which is even documented: The Berkeley style output counts read only data in the "text" column, not in the "data" column, the "dec" and "hex" columns both display the sum of the "text", "data", and "bss" columns in decimal and hexadecimal respectively. Greetings, Andres Freund
Re: Splitting up guc.c
I wrote: > Andres Freund writes: >> On 2022-09-10 15:04:59 -0400, Tom Lane wrote: >>> $ size guc*o >>> text data bss dec hex filename >>> 13653 4 112 1376935c9 guc-file.o >>> 54953 0 564 55517d8dd guc.o >>> 6951 0 11270631b97 guc_hooks.o >>> 43570 62998 216 106784 1a120 guc_tables.o >> A tad surprised by the text size of guc_tables.o - not that it is a problem, >> just seems a bit odd. > There's a pretty fair number of constant tables that got moved to there. > Not to mention all the constant strings. I forgot to include comparison numbers for HEAD: $ wc guc*c 2629 9372 69467 guc-file.c 13335 41584 356896 guc.c 15964 50956 426363 total $ size guc*o textdata bss dec hex filename 13653 4 112 1376935c9 guc-file.o 105848 63156 908 169912 297b8 guc.o This isn't completely apples-to-apples because of the few hook functions I'd moved to other places in v1, but you can see that the total text and data sizes didn't change much. It'd likely indicate a mistake if they had. (However, v1 does include const-ifying a few options tables that had somehow escaped being labeled that way, so the total data size did shrink a small amount.) regards, tom lane
Re: CI and test improvements
On Sun, Aug 28, 2022 at 02:28:02PM -0700, Andres Freund wrote: > On 2022-08-28 12:10:29 -0500, Justin Pryzby wrote: > > On Sun, Aug 28, 2022 at 09:07:52AM -0700, Andres Freund wrote: > > > > --- /dev/null > > > > +++ b/src/tools/ci/windows-compiler-warnings > > > > > > Wouldn't that be doable as something like > > > sh -c 'if test -s file; then cat file;exit 1; fi" > > > inside .cirrus.yml? > > > > I had written it inline in a couple ways, like > > - sh -exc 'f=msbuild.warn.log; if [ -s "$f" ]; then cat "$f"; exit 1; else > > exit 0; fi' > > > > but then separated it out as you suggested in > > 20220227010908.vz2a7dmfzgwg7...@alap3.anarazel.de > > > > after I complained about cmd.exe requiring escaping for && and || > > That makes writing any shell script a bit perilous and a separate script > > seems better. > > I remember that I suggested it - but note that the way I wrote above doesn't > have anything needing escaping. It doesn't require it, but that still gives the impression that it's normally possible to write one-liner shell scripts there, which is misleading/wrong, and the reason why I took your suggestion to use a separate script file. > Anyway, what do you think of the multiline split I suggested? Done, and sorted. > That's what should have been in the commit message. Sure. I copied into the commit message the explanation that I had written in June's email. > > > > xi-os-only: freebsd > > > > > > Typo. > > > > No - it's deliberate so I can switch to and from "everything" to "this > > only". > > I don't see the point in posting patches to be applied if they contain lots of > such things that a potential committer would need to catch and include a lot > of of fixup patches. I get that you disliked that I disabled the effect of a CI tag by munging "c" to "x". I've amended the message to avoid confusion. But, lots of what such things ? "ci-os-only" would be removed before being pushed anyway. "catching things" is the first part of the review process, which (as I understand) is intended to help patch authors to improve their patches. If you found lots of problems in my patches, I'd need to know about them; but most of what I heard seem like quibbles about the presentation of the patches. It's true that some parts are dirty/unclear, and that seems reasonable for patches most of which haven't yet received review, for which I asked whether to pursue the patch at all, and how best to present them. This is (or could be) an opportunity to make improvements. I renamed the two, related patches to Cluser.pm which said "f!", which are deliberately separate but looked like "fixup" patches. Are you interested in any combination of those three, related changes to move logic from Makefile to perl ? If not, we don't need to debate the merits of spliting the patch. What about the three, related changes for ccache compression ? Should these be dropped in favour of meson ? - cirrus/vcregress: test modules/contrib with NO_INSTALLCHECK=1 - vcregress: add alltaptests I added: "WIP: skip building if only docs have changed" changesInclude() didn't seem to work right when I first tried to use it. Eventually, I realized that it seems to use something like "git log", and not "git diff" (as I'd thought). It seems to work fine now that I know what to expect. git commit --amend --no-edit git diff --stat @{1}..@{0} # this outputs nothing git log --stat @{1}..@{0} # this lists the files changed by the tip commit It'd be nice to be have cfbot inject this patch into each commitfest patch for awhile, to make sure everything works as expected. Same for the code coverage patch and the doc artifacts patch. (These patches currently assume that the base commit is HEAD~1, which is correct for cfbot, and that would also provide code coverage and docs until such time as cfbot is updated to apply and preserve the original series of patches). -- Justin >From 0566ad149df1fc2ee5ba96e523933fc073165194 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Wed, 25 May 2022 21:53:22 -0500 Subject: [PATCH 01/23] cirrus/windows: add compiler_warnings_script I'm not sure how to write this test in windows shell; it's also not easy to write it in posix sh, since windows shell is somehow interpretting && and ||... https://www.postgresql.org/message-id/20220212212310.f645c6vw3njkgxka%40alap3.anarazel.de See also: 8a1ce5e54f6d144e4f8e19af7c767b026ee0c956 ci-os-only: windows https://cirrus-ci.com/task/6183879907213312 https://cirrus-ci.com/task/4876271443247104 --- .cirrus.yml| 14 +- src/tools/ci/windows-compiler-warnings | 16 2 files changed, 29 insertions(+), 1 deletion(-) create mode 100755 src/tools/ci/windows-compiler-warnings diff --git a/.cirrus.yml b/.cirrus.yml index 81eb8a9996d..2be62791448 100644 --- a/.cirrus.yml +++ b/.cirrus.yml @@ -370,7 +370,14 @@ task: # ForceNoAlign prevents msbuild from introducing line-breaks for long lines # disab
Re: Mingw task for Cirrus CI
On Mon, Sep 05, 2022 at 04:52:17PM -0700, Andres Freund wrote: > I don't think you should need to use --host, that indicates cross compiling, This made me consider the idea of cross-compiling for windows under a new linux task, and then running tests under Windows with a dependent task. I suppose that use -Og, in addition to CompilerWarnings, which uses -O2. cirrusci caches are (or can be) shared between tasks. I got this mostly working, with a few kludges to compile and install stuff under src/test. This may be yet another idea that's obsoleted by meson. WDYT? -- Justin
Re: archive modules
Nathan Bossart writes: > Of course. I've marked it as ready-for-committer. Pushed with a bit of additional wordsmithing. regards, tom lane
Re: archive modules
On Sat, Sep 10, 2022 at 04:44:16PM -0400, Tom Lane wrote: > Nathan Bossart writes: >> Of course. I've marked it as ready-for-committer. > > Pushed with a bit of additional wordsmithing. Thanks! -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Index ordering after IS NULL
On a two-column btree index, we can constrain the first column with equality and read the rows in order by the 2nd column. But we can't constrain the first column by IS NULL and still read the rows in order by the 2nd column. But why not? Surely the structure of the btree index would allow for this to work. Example: create table if not exists j as select case when random()<0.9 then floor(random()*10)::int end b, random() c from generate_series(1,100); create index if not exists j_b_c_idx on j (b,c); set enable_sort TO off; explain analyze select * from j where b is null order by c limit 10; explain analyze select * from j where b =8 order by c limit 10; The first uses a sort despite it being disabled. Cheers, Jeff
Re: Support load balancing in libpq
Hi, the patch no longer applies cleanly, please rebase (it's trivial). I don't like the provided commit message very much, I think the discussion about pgJDBC having had load balancing for a while belongs elsewhere. On Wed, Jun 22, 2022 at 07:54:19AM +, Jelte Fennema wrote: > I tried to stay in line with the naming of this same option in JDBC and > Npgsql, where it's called "loadBalanceHosts" and "Load Balance Hosts" > respectively. So, actually to be more in line it should be the option for > libpq should be called "load_balance_hosts" (not "loadbalance" like > in the previous patch). I attached a new patch with the name of the > option changed to this. Maybe my imagination is not so great, but what else than hosts could we possibly load-balance? I don't mind calling it load_balance, but I also don't feel very strongly one way or the other and this is clearly bikeshed territory. > I also don't think the name is misleading. Randomization of hosts will > automatically result in balancing the load across multiple hosts. This is > assuming more than a single connection is made using the connection > string, either on the same client node or on different client nodes. I think > I think is a fair assumption to make. Also note that this patch does not load > balance queries, it load balances connections. This is because libpq works > at the connection level, not query level, due to session level state. I agree. Also, I think the scope is ok for a first implementation (but see below). You or others could possibly further enhance the algorithm in the future, but it seems to be useful as-is. > I agree it is indeed fairly simplistic load balancing. If I understand correctly, you've added DNS-based load balancing on top of just shuffling the provided hostnames. This makes sense if a hostname is backed by more than one IP address in the context of load balancing, but it also complicates the patch. So I'm wondering how much shorter the patch would be if you leave that out for now? On the other hand, I believe pgJDBC keeps track of which hosts are up or down and only load balances among the ones which are up (maybe rechecking after a timeout? I don't remember), is this something you're doing, or did you consider it? Some quick remarks on the patch: /* OK, scan this addrlist for a working server address */ - conn->addr_cur = conn->addrlist; reset_connection_state_machine = true; conn->try_next_host = false; The comment might need to be updated. + int naddr; /* # of addrs returned by getaddrinfo */ This is spelt "number of" in several other places in the file, and we still have enough space to spell it out here as well without a line-wrap. Michael
Re: Index ordering after IS NULL
On Sat, Sep 10, 2022 at 2:28 PM Jeff Janes wrote: > explain analyze select * from j where b is null order by c limit 10; > explain analyze select * from j where b =8 order by c limit 10; > > The first uses a sort despite it being disabled. The first/is null query seems to give the result and plan you're looking for if the query is rewritten to order by "b, c", and not just "c". That in itself doesn't make your complaint any less valid, of course. You don't have to do this with the second query, so why should you have to do it with the first? -- Peter Geoghegan
Re: why can't a table be part of the same publication as its schema
On Fri, Sep 9, 2022 at 2:17 PM Mark Dilger wrote: > > On Sep 9, 2022, at 8:18 AM, Robert Haas wrote: > > Things might be clearer if we'd made the syntax "ALTER PUBLICATION p1 > > { ADD | DROP } { TABLE | SCHEMA } name". I don't understand why we > > used this ALL TABLES IN SCHEMA language. > > The conversation, as I recall, was that "ADD SCHEMA foo" would only mean all > tables in foo, until publication of other object types became supported, at > which point "ADD SCHEMA foo" would suddenly mean more than it did before. > People might find that surprising, so the "ALL TABLES IN" was intended to > future-proof against surprising behavioral changes. If I encountered this syntax in a vacuum, that's not what I would think. I would think that ADD ALL TABLES IN SCHEMA meant add all the tables in the schema to the publication one by one as individual objects, i.e. add the tables that are currently as of this moment in that schema to the publication; and I would think that ADD SCHEMA meant remember that this schema is part of the publication and so whenever tables are created and dropped in that schema (or moved in and out) what is being published is automatically updated. The analogy here seems to be to GRANT, which actually does support both syntaxes. And if I understand correctly, GRANT ON SCHEMA gives privileges on the schema; whereas GRANT ON ALL TABLES IN SCHEMA modifies each table that is currently in that schema (never mind what happens later). -- Robert Haas EDB: http://www.enterprisedb.com
Re: Bump MIN_WINNT to 0x0600 (Vista) as minimal runtime in 16~
On Fri, Sep 09, 2022 at 08:11:09PM +0900, Michael Paquier wrote: > Based on what I can see, the Windows animals seem to have digested > 47bd0b3 (cygwin, MinGW and MSVC), so I think that we are good. The last part that's worth adjusting is ldap_start_tls_sA(), which would lead to the attached simplification. The MinGW headers list this routine, so like the previous change I think that it should be safe for such builds. Looking at the buildfarm animals, bowerbird, jacana, fairywren, lorikeet and drongo disable ldap. hamerkop is the only member that provides coverage for it, still that's a MSVC build. The CI provides coverage for ldap as it is enabled by default and windows_build_config.pl does not tell otherwise, but with the existing animals we don't have ldap coverage under MinGW. Anyway, I'd like to apply the attached, and I don't quite see why it would not work after 47bd0b3 under MinGW. Any thoughts? -- Michael diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c index a776bc3ed7..3a10baa9ce 100644 --- a/src/backend/libpq/auth.c +++ b/src/backend/libpq/auth.c @@ -135,14 +135,6 @@ static int CheckBSDAuth(Port *port, char *user); #else #include -/* Correct header from the Platform SDK */ -typedef -ULONG (*__ldap_start_tls_sA) (IN PLDAP ExternalHandle, - OUT PULONG ServerReturnValue, - OUT LDAPMessage **result, - IN PLDAPControlA * ServerControls, - IN PLDAPControlA * ClientControls -); #endif static int CheckLDAPAuth(Port *port); @@ -2348,48 +2340,7 @@ InitializeLDAPConnection(Port *port, LDAP **ldap) #ifndef WIN32 if ((r = ldap_start_tls_s(*ldap, NULL, NULL)) != LDAP_SUCCESS) #else - static __ldap_start_tls_sA _ldap_start_tls_sA = NULL; - - if (_ldap_start_tls_sA == NULL) - { - /* - * Need to load this function dynamically because it may not exist - * on Windows, and causes a load error for the whole exe if - * referenced. - */ - HANDLE ldaphandle; - - ldaphandle = LoadLibrary("WLDAP32.DLL"); - if (ldaphandle == NULL) - { -/* - * should never happen since we import other files from - * wldap32, but check anyway - */ -ereport(LOG, - (errmsg("could not load library \"%s\": error code %lu", -"WLDAP32.DLL", GetLastError(; -ldap_unbind(*ldap); -return STATUS_ERROR; - } - _ldap_start_tls_sA = (__ldap_start_tls_sA) (pg_funcptr_t) GetProcAddress(ldaphandle, "ldap_start_tls_sA"); - if (_ldap_start_tls_sA == NULL) - { -ereport(LOG, - (errmsg("could not load function _ldap_start_tls_sA in wldap32.dll"), - errdetail("LDAP over SSL is not supported on this platform."))); -ldap_unbind(*ldap); -FreeLibrary(ldaphandle); -return STATUS_ERROR; - } - - /* - * Leak LDAP handle on purpose, because we need the library to - * stay open. This is ok because it will only ever be leaked once - * per process and is automatically cleaned up on process exit. - */ - } - if ((r = _ldap_start_tls_sA(*ldap, NULL, NULL, NULL, NULL)) != LDAP_SUCCESS) + if ((r = ldap_start_tls_sA(*ldap, NULL, NULL, NULL, NULL)) != LDAP_SUCCESS) #endif { ereport(LOG, signature.asc Description: PGP signature
Re: Splitting up guc.c
On Sat, Sep 10, 2022 at 03:04:59PM -0400, Tom Lane wrote: > Before proceeding further, I wanted to ask for comments on a design > choice that might be controversial. Even though I don't want to > invent guc_hooks.c, I think we *should* invent guc_hooks.h, and > consolidate all the GUC hook function declarations there. The > point would be to not have to #include guc.h in headers of unrelated > modules. This is similar to what we've done with utils/fmgrprotos.h, > though the motivation is different. I already moved a few declarations > from guc.h to there (and in consequence had to adjust #includes in > the modules defining those hooks), but there's a lot more to be done > if we apply that policy across the board. Does anybody think that's > a bad approach, or have a better one? One part that I have found a bit strange lately about guc.c is that we have mix the core machinery with the SQL-callable parts. What do you think about the addition of a gucfuncs.c in src/backend/utils/adt/ to split things a bit more? -- Michael signature.asc Description: PGP signature
Re: why can't a table be part of the same publication as its schema
On Sat, 10 Sept 2022 at 19:18, Robert Haas wrote: If I encountered this syntax in a vacuum, that's not what I would > think. I would think that ADD ALL TABLES IN SCHEMA meant add all the > tables in the schema to the publication one by one as individual > objects, i.e. add the tables that are currently as of this moment in > that schema to the publication; and I would think that ADD SCHEMA > meant remember that this schema is part of the publication and so > whenever tables are created and dropped in that schema (or moved in > and out) what is being published is automatically updated. > > The analogy here seems to be to GRANT, which actually does support > both syntaxes. And if I understand correctly, GRANT ON SCHEMA gives > privileges on the schema; whereas GRANT ON ALL TABLES IN SCHEMA > modifies each table that is currently in that schema (never mind what > happens later). > Yes, except GRANT ON SCHEMA only grants access to the schema - CREATE or USAGE. You cannot write GRANT SELECT ON SCHEMA to grant access to all tables in the schema.
Re: Splitting up guc.c
Michael Paquier writes: > One part that I have found a bit strange lately about guc.c is that we > have mix the core machinery with the SQL-callable parts. What do you > think about the addition of a gucfuncs.c in src/backend/utils/adt/ to > split things a bit more? I might be wrong, but I think the SQL-callable stuff makes use of some APIs that are currently private in guc.c. So we'd have to expose more API to make that possible. Maybe that wouldn't be a bad thing, but it seems to be getting beyond the original idea here. (Note I already had to expose find_option() in order to get the wal_consistency_checking stuff moved out.) It's not clear to me that "move the SQL-callable stuff" will end with a nice API boundary. regards, tom lane
Re: Index ordering after IS NULL
Jeff Janes writes: > On a two-column btree index, we can constrain the first column with > equality and read the rows in order by the 2nd column. But we can't > constrain the first column by IS NULL and still read the rows in order by > the 2nd column. But why not? "x IS NULL" doesn't give rise to an EquivalenceClass, which is what is needed to drive the deduction that the first index column isn't affecting the result ordering. Maybe we could extend the notion of ECs to allow that, but I'm not too sure about how it'd work. There are already some expectations that EC equality operators be strict, and this'd blow a large hole in a lot of related assumptions. For example, given "x IS NULL AND x = y", the correct deduction is not "y IS NULL", it's that the WHERE condition is constant-FALSE. regards, tom lane
Re: Bump MIN_WINNT to 0x0600 (Vista) as minimal runtime in 16~
On Sun, Sep 11, 2022 at 09:28:54AM +0900, Michael Paquier wrote: > On Fri, Sep 09, 2022 at 08:11:09PM +0900, Michael Paquier wrote: > > Based on what I can see, the Windows animals seem to have digested > > 47bd0b3 (cygwin, MinGW and MSVC), so I think that we are good. > > The last part that's worth adjusting is ldap_start_tls_sA(), which > would lead to the attached simplification. The MinGW headers list > this routine, so like the previous change I think that it should be > safe for such builds. > > Looking at the buildfarm animals, bowerbird, jacana, fairywren, > lorikeet and drongo disable ldap. hamerkop is the only member that > provides coverage for it, still that's a MSVC build. > > The CI provides coverage for ldap as it is enabled by default and > windows_build_config.pl does not tell otherwise, but with the existing > animals we don't have ldap coverage under MinGW. > > Anyway, I'd like to apply the attached, and I don't quite see why it > would not work after 47bd0b3 under MinGW. Any thoughts? There's a CF entry to add it, and I launched it with your patch. (This is in a branch which already has that, and also does a few other things differently). https://cirrus-ci.com/task/6302833585684480 [02:07:57.497] checking whether to build with LDAP support... yes It compiles, which is probably all that matters, and eventually skips the test anyway. [02:23:18.209] [02:23:18] c:/cirrus/src/test/ldap/t/001_auth.pl .. skipped: ldap tests not supported on MSWin32 or dependencies not installed -- Justin
Re: Bump MIN_WINNT to 0x0600 (Vista) as minimal runtime in 16~
On Sat, Sep 10, 2022 at 10:39:19PM -0500, Justin Pryzby wrote: > There's a CF entry to add it, and I launched it with your patch. > (This is in a branch which already has that, and also does a few other > things differently). No need for a CF entry if you want to play with the tree. I have Cirrus enabled on my own fork of Postgres on github, and I saw the same result as you: https://github.com/michaelpq/postgres/ -- Michael signature.asc Description: PGP signature