Re: Add BF member koel-like indentation checks to SanityCheck CI

2024-01-10 Thread Tristan Partin

On Wed Jan 10, 2024 at 1:58 AM CST, Tom Lane wrote:

Bharath Rupireddy  writes:
> IMO, running the
> pgindent in at least one of the CI systems if not all (either as part
> task SyanityCheck or task Linux - Debian Bullseye - Autoconf) help
> catches things early on in CF bot runs itself. This saves committers
> time but at the cost of free run-time that cirrus-ci provides.

But that puts the burden of pgindent-cleanliness onto initial patch
submitters, which I think is the wrong thing for reasons mentioned
upthread.  We want to enforce this at commit into the master repo, but
I fear enforcing it earlier will drive novice contributors away for
no very good reason.


If we are worried about turning away novice contributors, there are much 
bigger fish to fry than worrying if we will turn them away due to 
requiring code to be formatted a certain way. Like I said earlier, 
formatters are pretty common tools to be using these days. go fmt, deno 
fmt, rustfmt, prettier, black, clang-format, uncrustify, etc.


Code formatting requirements are more likely to turn someone away from 
contributing if code reviews are spent making numerous comments. 
Luckily, we can just say "run this command line incantation of 
pgindent," which in the grand scheme of things is easy compared to all 
the other things you have to be aware of to contribute to Postgres.


--
Tristan Partin
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

2024-01-12 Thread Tristan Partin

On Fri Jan 12, 2024 at 10:45 AM CST, Robert Haas wrote:

On Mon, Jan 8, 2024 at 1:03 AM Tristan Partin  wrote:
> I think the way to go is to expose some variation of libpq's
> pqSocketPoll(), which I would be happy to put together a patch for.
> Making frontends, psql in this case, have to reimplement the polling
> logic doesn't strike me as fruitful, which is essentially what I have
> done.

I encourage further exploration of this line of attack. I fear that if
I were to commit something like what you've posted up until now,
people would complain that that code was too ugly to live, and I'd
have a hard time telling them that they're wrong.


Completely agree. Let me look into this. Perhaps I can get something up 
next week or the week after.


--
Tristan Partin
Neon (https://neon.tech)




Re: Extensible storage manager API - SMGR hook Redux

2024-01-12 Thread Tristan Partin

Thought I would show off what is possible with this patchset.

Heikki, a couple of months ago in our internal Slack, said:

[I would like] a debugging tool that checks that we're not missing any 
fsyncs. I bumped into a few missing fsync bugs with unlogged tables 
lately and a tool like that would've been very helpful.


My task was to create such a tool, and off I went. I started with the 
storage manager extension patch that Matthias sent to the list last 
year[0].


Andres, in another thread[1], said:


I've been thinking that we need a validation layer for fsyncs, it's too hard
to get right without testing, and crash testing is not likel enough to catch
problems quickly / resource intensive.

My thought was that we could keep a shared hash table of all files created /
dirtied at the fd layer, with the filename as key and the value the current
LSN. We'd delete files from it when they're fsynced. At checkpoints we go
through the list and see if there's any files from before the redo that aren't
yet fsynced.  All of this in assert builds only, of course.


I took this idea and ran with it. I call it the fsync_checker™️. It is an 
extension that prints relations that haven't been fsynced prior to 
a CHECKPOINT. Note that this idea doesn't work in practice because 
relations might not be fsynced, but they might be WAL-logged, like in 
the case of createdb. See log_smgrcreate(). I can't think of an easy way 
to solve this problem looking at the codebase as it stands.


Here is a description of the patches:

0001:

This is essentially just the patch that Matthias posted earlier, but 
rebased and adjusted a little bit so storage managers can "inherit" from 
other storage managers.


0002:

This is an extension of 0001, which allows for extensions to set 
a global storage manager. This is pretty hacky, and if it was going to 
be pulled into mainline, it would need some better protection. For 
instance, only one extension should be able to set the global storage 
manager. We wouldn't want extensions stepping over each other, etc.


0003:

Adds a hook for extensions to inspect a checkpoint before it actually 
occurs. The purpose for the fsync_checker is so that I can iterate over

all the relations the extension tracks to find files that haven't been
synced prior to the completion of the checkpoint.

0004:

This is the actual fsync_checker extension itself. It must be preloaded.

Hopefully this is a good illustration of how the initial patch could be 
used, even though it isn't perfect.


[0]: 
https://www.postgresql.org/message-id/CAEze2WgMySu2suO_TLvFyGY3URa4mAx22WeoEicnK%3DPCNWEMrA%40mail.gmail.com
[1]: 
https://www.postgresql.org/message-id/20220127182838.ba3434dp2pe5vcia%40alap3.anarazel.de

--
Tristan Partin
Neon (https://neon.tech)
From 5ffbc7c35bb3248501b2517d26f99afe02fb53d6 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent 
Date: Tue, 27 Jun 2023 15:59:23 +0200
Subject: [PATCH v1 1/5] Expose f_smgr to extensions for manual implementation

There are various reasons why one would want to create their own
implementation of a storage manager, among which are block-level compression,
encryption and offloading to cold storage. This patch is a first patch that
allows extensions to register their own SMgr.

Note, however, that this SMgr is not yet used - only the first SMgr to register
is used, and this is currently the md.c smgr. Future commits will include
facilities to select an SMgr for each tablespace.
---
 src/backend/postmaster/postmaster.c |   5 +
 src/backend/storage/smgr/md.c   | 172 +++-
 src/backend/storage/smgr/smgr.c | 129 ++---
 src/backend/utils/init/miscinit.c   |  13 +++
 src/include/miscadmin.h |   1 +
 src/include/storage/md.h|   4 +
 src/include/storage/smgr.h  |  59 --
 7 files changed, 252 insertions(+), 131 deletions(-)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index feb471dd1d..a0e46fe1f2 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1010,6 +1010,11 @@ PostmasterMain(int argc, char *argv[])
 	 */
 	ApplyLauncherRegister();
 
+	/*
+	 * Register built-in managers that are not part of static arrays
+	 */
+	register_builtin_dynamic_managers();
+
 	/*
 	 * process any libraries that should be preloaded at postmaster start
 	 */
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index b1e9932a29..66a93101ab 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -87,6 +87,21 @@ typedef struct _MdfdVec
 } MdfdVec;
 
 static MemoryContext MdCxt;		/* context for all MdfdVec objects */
+SMgrId MdSMgrId;
+
+typedef struct MdSMgrRelationData
+{
+	/* parent data */
+	SMgrRelationData reln;
+	/*
+	 * for md.c; per-fork arrays of the number of open segments
+	 * (md_num_open_segs) and the 

Add pgindent test to check if codebase is correctly formatted

2024-01-16 Thread Tristan Partin
Had some time to watch code run through an extensive test suite, so 
thought I would propose this patch that is probably about 75% of the way 
to the stated $subject. I had to add in a hack for Meson, and I couldn't 
figure out a good hack for autotools.


I think a good solution would be to distribute pgindent and 
pg_bsd_indent. At Neon, we are trying to format our extension code using 
pgindent. I am sure there are other extension authors out there too that 
format using pgindent. Distributing pg_bsd_indent and pgindent in the 
postgresql-devel package would be a great help to those of us that 
pgindent out of tree code. It would also have the added benefit of 
adding the tools to $PREFIX/bin, which would make the test that I added 
not need a hack to get the pg_bsd_indent executable.


--
Tristan Partin
Neon (https://neon.tech)
From 6e9ca366b6e4976ae591012150fe77729e37c503 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Tue, 16 Jan 2024 19:00:44 -0600
Subject: [PATCH v1 1/2] Allow tests to register command line arguments in
 Meson

---
 meson.build | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index c317144b6b..12ba91109b 100644
--- a/meson.build
+++ b/meson.build
@@ -3284,6 +3284,8 @@ foreach test_dir : tests
 'env': env,
   } + t.get('test_kwargs', {})
 
+  test_args = t.get('args', [])
+
   foreach onetap : t['tests']
 # Make tap test names prettier, remove t/ and .pl
 onetap_p = onetap
@@ -3302,7 +3304,7 @@ foreach test_dir : tests
 '--testname', onetap_p,
 '--', test_command,
 test_dir['sd'] / onetap,
-  ],
+  ] + test_args,
     )
   endforeach
   install_suites += test_group
-- 
Tristan Partin
Neon (https://neon.tech)

From 13902328b24984ab2d18914b63c6874143715f48 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Tue, 16 Jan 2024 19:03:42 -0600
Subject: [PATCH v1 2/2] Add pgindent test to check if codebase is correctly
 formatted

Should be useful for developers and committers to test code as they
develop, commit, or prepare patches.
---
 src/meson.build  |  2 +-
 src/tools/pgindent/Makefile  | 24 
 src/tools/pgindent/meson.build   | 13 +
 src/tools/pgindent/t/001_pgindent.pl | 19 +++
 4 files changed, 57 insertions(+), 1 deletion(-)
 create mode 100644 src/tools/pgindent/Makefile
 create mode 100644 src/tools/pgindent/meson.build
 create mode 100644 src/tools/pgindent/t/001_pgindent.pl

diff --git a/src/meson.build b/src/meson.build
index 65c7d17d08..0345d2fa79 100644
--- a/src/meson.build
+++ b/src/meson.build
@@ -14,7 +14,7 @@ subdir('pl')
 subdir('interfaces')
 
 subdir('tools/pg_bsd_indent')
-
+subdir('tools/pgindent')
 
 ### Generate a Makefile.global that's complete enough for PGXS to work.
 #
diff --git a/src/tools/pgindent/Makefile b/src/tools/pgindent/Makefile
new file mode 100644
index 00..45d6b71a12
--- /dev/null
+++ b/src/tools/pgindent/Makefile
@@ -0,0 +1,24 @@
+#-
+#
+# src/tools/pgindent/Makefile
+#
+# Copyright (c) 2024, PostgreSQL Global Development Group
+#
+#-
+
+subdir = src/tools/pgindent
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+clean distclean:
+	rm -rf log/ tmp_check/
+
+# Provide this alternate test name to allow testing pgindent without building
+# all of the surrounding Postgres installation.
+.PHONY: test
+
+pg_bsd_indent:
+	$(MAKE) -C ../pg_bsd_indent pg_bsd_indent
+
+test: pg_bsd_indent
+	$(prove_installcheck)
diff --git a/src/tools/pgindent/meson.build b/src/tools/pgindent/meson.build
new file mode 100644
index 00..d31ade11ce
--- /dev/null
+++ b/src/tools/pgindent/meson.build
@@ -0,0 +1,13 @@
+tests += {
+  'name': 'pgindent',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+'args': [
+  pg_bsd_indent.path(),
+],
+'tests': [
+  't/001_pgindent.pl',
+],
+  },
+}
diff --git a/src/tools/pgindent/t/001_pgindent.pl b/src/tools/pgindent/t/001_pgindent.pl
new file mode 100644
index 00..8ee93f4789
--- /dev/null
+++ b/src/tools/pgindent/t/001_pgindent.pl
@@ -0,0 +1,19 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+#
+# Check that all C code is formatted with pgindent
+#
+
+use strict;
+use warnings FATAL => 'all';
+use Test::More;
+use PostgreSQL::Test::Utils qw(command_ok);
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bpgindent\b/)
+{
+	plan skip_all => "test pgindent not enabled in PG_TEST_EXTRA";
+}
+
+my $pg_bsd_indent = $ARGV[0];
+command_ok(["./pgindent", "--indent=$pg_bsd_indent", "--check", "--diff"]);
+
+done_testing();
-- 
Tristan Partin
Neon (https://neon.tech)



Re: Add pgindent test to check if codebase is correctly formatted

2024-01-16 Thread Tristan Partin

On Tue Jan 16, 2024 at 7:27 PM CST, Bruce Momjian wrote:

On Tue, Jan 16, 2024 at 07:22:23PM -0600, Tristan Partin wrote:
> I think a good solution would be to distribute pgindent and pg_bsd_indent.
> At Neon, we are trying to format our extension code using pgindent. I am
> sure there are other extension authors out there too that format using
> pgindent. Distributing pg_bsd_indent and pgindent in the postgresql-devel
> package would be a great help to those of us that pgindent out of tree code.
> It would also have the added benefit of adding the tools to $PREFIX/bin,
> which would make the test that I added not need a hack to get the
> pg_bsd_indent executable.

So your point is that pg_bsd_indent and pgindent are in the source tree,
but not in any package distribution?  Isn't that a packager decision?


It requires changes to at least the Meson build files. pg_bsd_indent is 
not marked for installation currently. There is a TODO there. pgindent 
has no install_data() for instance. pg_bsd_indent seemingly gets 
installed somewhere in autotools given the contents of its Makefile, but 
I didn't see anything in my install tree afterward.


Sure RPM/DEB packagers can solve this issue downstream, but that doesn't 
help those of that run "meson install" or "make install" upstream. 
Packagers are probably more likely to package the tools if they are 
marked for installation by upstream too.


Hope this helps to better explain what changes would be required within 
the Postgres source tree.


--
Tristan Partin
Neon (https://neon.tech)




Re: Add pgindent test to check if codebase is correctly formatted

2024-01-17 Thread Tristan Partin

On Wed Jan 17, 2024 at 3:50 AM CST, Alvaro Herrera wrote:

Hmm, should this also install typedefs.list and pgindent.man?
What about the tooling to reformat Perl code?


Good point about pgindent.man. It would definitely be good to install 
alongside pgindent and pg_bsd_indent.


I don't know if we need to install the typedefs.list file. I think it 
would just be good enough to also install the find_typedefs script. But 
it needs some fixing up first[0]. Extension authors can then just 
generate their own typedefs.list that will include the typedefs of the 
extension and the typedefs of the postgres types they use. At least, 
that is what we have found works at Neon.


I cannot vouch for extension authors writing Perl but I think it could 
make sense to install the src/test/perl tree, so extension authors could 
more easily write tests for their extensions in Perl. But we could 
install the perltidy file and whatever else too. I keep my Perl writing 
to a minimum, so I am not the best person to vouch for these usecases.


[0]: 
https://www.postgresql.org/message-id/aaa59ef5-dce8-7369-5cae-487727664127%40dunslane.net

--
Tristan Partin
Neon (https://neon.tech)




Re: make dist using git archive

2024-01-22 Thread Tristan Partin

On Mon Jan 22, 2024 at 1:31 AM CST, Peter Eisentraut wrote:

From 4b128faca90238d0a0bb6949a8050c2501d1bd67 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sat, 20 Jan 2024 21:54:36 +0100
Subject: [PATCH v0] make dist uses git archive

---
 GNUmakefile.in | 34 --
 meson.build| 38 ++
 2 files changed, 50 insertions(+), 22 deletions(-)

diff --git a/GNUmakefile.in b/GNUmakefile.in
index eba569e930e..3e04785ada2 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -87,29 +87,19 @@ update-unicode: | submake-generated-headers 
submake-libpgport
 distdir= postgresql-$(VERSION)
 dummy  = =install=
 
+GIT = git

+
 dist: $(distdir).tar.gz $(distdir).tar.bz2
-   rm -rf $(distdir)
-
-$(distdir).tar: distdir
-   $(TAR) chf $@ $(distdir)
-
-.INTERMEDIATE: $(distdir).tar
-
-distdir-location:
-   @echo $(distdir)
-
-distdir:
-   rm -rf $(distdir)* $(dummy)
-   for x in `cd $(top_srcdir) && find . \( -name CVS -prune \) -o \( -name 
.git -prune \) -o -print`; do \
- file=`expr X$$x : 'X\./\(.*\)'`; \
- if test -d "$(top_srcdir)/$$file" ; then \
-   mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \
- else \
-   ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \
- || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \
- fi || exit; \
-   done
-   $(MAKE) -C $(distdir) distclean
+
+.PHONY: check-dirty-index
+check-dirty-index:
+   $(GIT) diff-index --quiet HEAD
+
+$(distdir).tar.gz: check-dirty-index
+   $(GIT) archive --format tar.gz --prefix $(distdir)/ HEAD -o $@
+
+$(distdir).tar.bz2: check-dirty-index
+   $(GIT) -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 
--prefix $(distdir)/ HEAD -o $@
 
 distcheck: dist

rm -rf $(dummy)
diff --git a/meson.build b/meson.build
index c317144b6bc..f0d870c5192 100644
--- a/meson.build
+++ b/meson.build
@@ -3347,6 +3347,44 @@ run_target('help',
 
 
 
+###

+# Distribution archive
+###
+
+git = find_program('git', required: false, native: true, disabler: true)
+bzip2 = find_program('bzip2', required: false, native: true, disabler: true)


This doesn't need to be a disabler. git is fine as-is. See later 
comment. Disablers only work like you are expecting when they are used 
like how git is used. Once you call a method like .path(), all bets are 
off.



+distdir = meson.project_name() + '-' + meson.project_version()
+
+check_dirty_index = run_target('check-dirty-index',
+   command: [git, 'diff-index', '--quiet', 'HEAD'])


Seems like you might want to add -C here too?


+
+tar_gz = custom_target('tar.gz',
+  build_always_stale: true,
+  command: [git, '-C', '@SOURCE_ROOT@', 'archive',
+'--format', 'tar.gz',
+'--prefix', distdir + '/',
+'-o', '@BUILD_ROOT@/@OUTPUT@',
+'HEAD', '.'],
+  install: false,
+  output: distdir + '.tar.gz',
+)
+
+tar_bz2 = custom_target('tar.bz2',
+  build_always_stale: true,
+  command: [git, '-C', '@SOURCE_ROOT@', '-c', 'tar.tar.bz2.command=' + 
bzip2.path() + ' -c', 'archive',
+'--format', 'tar.bz2',
+'--prefix', distdir + '/',


-'-o', '@BUILD_ROOT@/@OUTPUT@',
+'-o', join_paths(meson.build_root(), '@OUTPUT@'),

This will generate the tarballs in the build directory. Do the same for 
the previous target. Tested locally.



+'HEAD', '.'],
+  install: false,
+  output: distdir + '.tar.bz2',
+)


The bz2 target should be wrapped in an `if bzip2.found()`. It is 
possible for git to be found, but not bzip2. I might also define the bz2 
command out of line. Also, you may want to add 
these programs to meson_options.txt for overriding, even though the 
"meson-ic" way is to use a machine file.



+
+alias_target('pgdist', [check_dirty_index, tar_gz, tar_bz2])


Are you intending for the check_dirty_index target to prohibit the other 
two targets from running? Currently that is not the case. If it is what 
you intend, use a stamp file or something to indicate a relationship. 
Alternatively, inline the git diff-index into the other commands. These 
might also do better as external scripts. It would reduce duplication 
between the autotools and Meson builds.



+
+
+
 ###
 # The End, The End, My Friend
 ###


I am not really following why we can't use the builtin Meson dist 
command. The only difference from my testing is it doesn't use 
a --prefix argument.


--
Tristan Partin
Neon (https://neon.tech)




Remove pthread_is_threaded_np() checks in postmaster

2024-01-23 Thread Tristan Partin
These checks are not effective for what they are trying to prevent. 
A recent commit[0] in libcurl when used on macOS has been tripping the 
pthread_is_threaded_np() check in postmaster.c for 
shared_preload_libraries entries which use libcurl (like Neon). Under 
the hood, libcurl calls SCDynamicStoreCopyProxies[1], which apparently 
causes the check to fail.


Attached is a patch to remove the check, and a minimal reproducer 
(curlexe.zip) for others to run on macOS.


Here are some logs:

Postgres working as expected:

$ LC_ALL="C" /opt/homebrew/opt/postgresql@16/bin/postgres -D 
/opt/homebrew/var/postgresql@16
2024-01-22 23:18:51.203 GMT [50388] LOG:  starting PostgreSQL 16.1 (Homebrew) 
on aarch64-apple-darwin23.2.0, compiled by Apple clang version 15.0.0 
(clang-1500.1.0.2.5), 64-bit
2024-01-22 23:18:51.204 GMT [50388] LOG:  listening on IPv6 address "::1", port 
5432
2024-01-22 23:18:51.204 GMT [50388] LOG:  listening on IPv4 address 
"127.0.0.1", port 5432
2024-01-22 23:18:51.205 GMT [50388] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5432"
2024-01-22 23:18:51.207 GMT [50391] LOG:  database system was shut down at 
2023-12-21 23:12:10 GMT
2024-01-22 23:18:51.211 GMT [50388] LOG:  database system is ready to accept 
connections
^C2024-01-22 23:18:53.797 GMT [50388] LOG:  received fast shutdown request
2024-01-22 23:18:53.798 GMT [50388] LOG:  aborting any active transactions
2024-01-22 23:18:53.800 GMT [50388] LOG:  background worker "logical replication 
launcher" (PID 50394) exited with exit code 1
2024-01-22 23:18:53.801 GMT [50389] LOG:  shutting down
2024-01-22 23:18:53.801 GMT [50389] LOG:  checkpoint starting: shutdown 
immediate
2024-01-22 23:18:53.805 GMT [50389] LOG:  checkpoint complete: wrote 3 buffers 
(0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.002 s, sync=0.001 
s, total=0.005 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 
kB, estimate=0 kB; lsn=0/4BE77E0, redo lsn=0/4BE77E0
2024-01-22 23:18:53.809 GMT [50388] LOG:  database system is shut down


Postgres not working with attached extension preloaded:

$ echo shared_preload_libraries=curlexe >> 
/opt/homebrew/var/postgresql@16/postgresql.conf
$ LC_ALL="C" /opt/homebrew/opt/postgresql@16/bin/postgres -D 
/opt/homebrew/var/postgresql@16
2024-01-22 23:19:01.108 GMT [50395] LOG:  starting PostgreSQL 16.1 (Homebrew) 
on aarch64-apple-darwin23.2.0, compiled by Apple clang version 15.0.0 
(clang-1500.1.0.2.5), 64-bit
2024-01-22 23:19:01.110 GMT [50395] LOG:  listening on IPv6 address "::1", port 
5432
2024-01-22 23:19:01.110 GMT [50395] LOG:  listening on IPv4 address 
"127.0.0.1", port 5432
2024-01-22 23:19:01.111 GMT [50395] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5432"
2024-01-22 23:19:01.113 GMT [50395] FATAL:  postmaster became multithreaded 
during startup
2024-01-22 23:19:01.113 GMT [50395] HINT:  Set the LC_ALL environment variable 
to a valid locale.
2024-01-22 23:19:01.114 GMT [50395] LOG:  database system is shut down


Same as previous, but without IPv6 support in libcurl:

$ DYLD_LIBRARY_PATH=/opt/homebrew/opt/curl-without-ipv6/lib LC_ALL="C" 
/opt/homebrew/opt/postgresql@16/bin/postgres -D /opt/homebrew/var/postgresql@16
2024-01-22 23:23:17.151 GMT [50546] LOG:  starting PostgreSQL 16.1 (Homebrew) 
on aarch64-apple-darwin23.2.0, compiled by Apple clang version 15.0.0 
(clang-1500.1.0.2.5), 64-bit
2024-01-22 23:23:17.152 GMT [50546] LOG:  listening on IPv6 address "::1", port 
5432
2024-01-22 23:23:17.152 GMT [50546] LOG:  listening on IPv4 address 
"127.0.0.1", port 5432
2024-01-22 23:23:17.152 GMT [50546] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5432"
2024-01-22 23:23:17.155 GMT [50549] LOG:  database system was shut down at 
2024-01-22 23:23:10 GMT
2024-01-22 23:23:17.158 GMT [50546] LOG:  database system is ready to accept 
connections
^C2024-01-22 23:23:26.997 GMT [50546] LOG:  received fast shutdown request
2024-01-22 23:23:26.998 GMT [50546] LOG:  aborting any active transactions
2024-01-22 23:23:27.000 GMT [50546] LOG:  background worker "logical replication 
launcher" (PID 50552) exited with exit code 1
2024-01-22 23:23:27.000 GMT [50547] LOG:  shutting down
2024-01-22 23:23:27.001 GMT [50547] LOG:  checkpoint starting: shutdown 
immediate
2024-01-22 23:23:27.002 GMT [50547] LOG:  checkpoint complete: wrote 3 buffers 
(0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 
s, total=0.003 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 
kB, estimate=0 kB; lsn=0/4BE78D0, redo lsn=0/4BE78D0
2024-01-22 23:23:27.005 GMT [50546] LOG:  database system is shut down


[0]: 
https://github.com/curl/curl/commit/8b7cbe9decc205b08ec8258eb184c89a33e3084b
[1]: 
https://developer.apple.com/documentation/systemconfiguration/1517088-scdynamicstorecopyproxies

--
Tristan Partin
Neon (https://neon.tech)
<&g

Re: Remove pthread_is_threaded_np() checks in postmaster

2024-01-23 Thread Tristan Partin

On Tue Jan 23, 2024 at 1:47 PM CST, Andres Freund wrote:

Hi,
On 2024-01-23 13:20:15 -0600, Tristan Partin wrote:
> These checks are not effective for what they are trying to prevent. A recent
> commit[0] in libcurl when used on macOS has been tripping the
> pthread_is_threaded_np() check in postmaster.c for shared_preload_libraries
> entries which use libcurl (like Neon). Under the hood, libcurl calls
> SCDynamicStoreCopyProxies[1], which apparently causes the check to fail.

Maybe I'm missing something, but isn't that indicating the exact opposite,
namely that the check is precisely doing what it's intended?


The logic in the original commit seems sound:


On Darwin, detect and report a multithreaded postmaster.

Darwin --enable-nls builds use a substitute setlocale() that may start a
thread.  Buildfarm member orangutan experienced BackendList corruption
on account of different postmaster threads executing signal handlers
simultaneously.  Furthermore, a multithreaded postmaster risks undefined
behavior from sigprocmask() and fork().  Emit LOG messages about the
problem and its workaround.  Back-patch to 9.0 (all supported versions).


Some reading from signal(7):

A process-directed signal may be delivered to any one of the threads 
that does not currently have the signal blocked. If more than one of 
the threads has the signal unblocked, then the kernel chooses an 
arbitrary thread to which to deliver the signal.


So it sounds like the commit is trying to protect from the last 
sentence.


And then forks only copy from their parent thread...

Please ignore this thread. I need to think of a better solution to this 
problem.


--
Tristan Partin
Neon (https://neon.tech)




Re: Remove pthread_is_threaded_np() checks in postmaster

2024-01-23 Thread Tristan Partin

On Tue Jan 23, 2024 at 2:10 PM CST, Tristan Partin wrote:

On Tue Jan 23, 2024 at 1:47 PM CST, Andres Freund wrote:
> Hi,
> On 2024-01-23 13:20:15 -0600, Tristan Partin wrote:
> > These checks are not effective for what they are trying to prevent. A recent
> > commit[0] in libcurl when used on macOS has been tripping the
> > pthread_is_threaded_np() check in postmaster.c for shared_preload_libraries
> > entries which use libcurl (like Neon). Under the hood, libcurl calls
> > SCDynamicStoreCopyProxies[1], which apparently causes the check to fail.
>
> Maybe I'm missing something, but isn't that indicating the exact opposite,
> namely that the check is precisely doing what it's intended?

The logic in the original commit seems sound:

> On Darwin, detect and report a multithreaded postmaster.
>
> Darwin --enable-nls builds use a substitute setlocale() that may start a
> thread.  Buildfarm member orangutan experienced BackendList corruption
> on account of different postmaster threads executing signal handlers
> simultaneously.  Furthermore, a multithreaded postmaster risks undefined
> behavior from sigprocmask() and fork().  Emit LOG messages about the
> problem and its workaround.  Back-patch to 9.0 (all supported versions).

Some reading from signal(7):

> A process-directed signal may be delivered to any one of the threads 
> that does not currently have the signal blocked. If more than one of 
> the threads has the signal unblocked, then the kernel chooses an 
> arbitrary thread to which to deliver the signal.


So it sounds like the commit is trying to protect from the last 
sentence.


And then forks only copy from their parent thread...


What is keeping us from using pthread_sigmask(3) instead of 
sigprocmask(2)? If an extension can guarantee that threads that get 
launched by it don't interact with anything Postgres-related, would that 
be enough to protect from any fork(2) related issues? In the OP example, 
is there any harm in the thread that libcurl inadvertantly launches from 
a Postgres perspective?


--
Tristan Partin
Neon (https://neon.tech)




Re: Remove pthread_is_threaded_np() checks in postmaster

2024-01-23 Thread Tristan Partin

On Tue Jan 23, 2024 at 4:23 PM CST, Andres Freund wrote:

Hi,

On 2024-01-23 15:50:11 -0600, Tristan Partin wrote:
> What is keeping us from using pthread_sigmask(3) instead of sigprocmask(2)?

We would need to make sure to compile with threading support everywhere. One
issue is that on some platforms things get slower once you actually start
using pthreads.


What are examples of these reduced performance platforms?

From reading the meson.build files, it seems like building with 
threading enabled is the future, so should we just rip the band-aid off 
for 17?



> If an extension can guarantee that threads that get launched by it don't
> interact with anything Postgres-related, would that be enough to protect
> from any fork(2) related issues?

A fork() while threads are running is undefined behavior IIRC, and undefined
behavior isn't limited to a single thread. You'd definitely need to use
pthread_sigprocmask etc to address that aspect alone.


If you can find a resource that explains the UB, I would be very 
interested to read that. I found a SO[0] answer that made it seem like 
this actually wasn't the case.


[0]: https://stackoverflow.com/a/42679479/7572728

--
Tristan Partin
Neon (https://neon.tech)




Re: make dist using git archive

2024-01-24 Thread Tristan Partin

On Tue Jan 23, 2024 at 3:30 AM CST, Peter Eisentraut wrote:

On 22.01.24 21:04, Tristan Partin wrote:
> I am not really following why we can't use the builtin Meson dist 
> command. The only difference from my testing is it doesn't use a 
> --prefix argument.


Here are some problems I have identified:

1. meson dist internally runs gzip without the -n option.  That makes 
the tar.gz archive include a timestamp, which in turn makes it not 
reproducible.


2. Because gzip includes a platform indicator in the archive, the 
produced tar.gz archive is not reproducible across platforms.  (I don't 
know if gzip has an option to avoid that.  git archive uses an internal 
gzip implementation that handles this.)


3. Meson does not support tar.bz2 archives.

4. Meson uses git archive internally, but then unpacks and repacks the 
archive, which loses the ability to use git get-tar-commit-id.


5. I have found that the tar archives created by meson and git archive 
include the files in different orders.  I suspect that the Python 
tarfile module introduces some either randomness or platform dependency.


6. meson dist is also slower because of the additional work.

7. meson dist produces .sha256sum files but we have called them .sha256. 
  (This is obviously trivial, but it is something that would need to be 
dealt with somehow nonetheless.)


Most or all of these issues are fixable, either upstream in Meson or by 
adjusting our own requirements.  But for now this route would have some 
significant disadvantages.


Thanks Peter. I will bring these up with upstream!

--
Tristan Partin
Neon (https://neon.tech)




Re: SSL tests fail on OpenSSL v3.2.0

2024-01-24 Thread Tristan Partin

On Wed Jan 24, 2024 at 9:58 AM CST, Jelte Fennema-Nio wrote:

I ran into an SSL issue when using the MSYS2/MINGW build of Postgres
for the PgBouncer test suite. Postgres crashed whenever you tried to
open an ssl connection to it.
https://github.com/msys2/MINGW-packages/issues/19851

I'm wondering if the issue described in this thread could be related
to the issue I ran into. Afaict the merged patch has not been released
yet.


Do you have a backtrace? Given that the version is 3.2.0, seems likely.

--
Tristan Partin
Neon (https://neon.tech)




Re: make dist using git archive

2024-01-24 Thread Tristan Partin

On Wed Jan 24, 2024 at 10:18 AM CST, Tristan Partin wrote:

On Tue Jan 23, 2024 at 3:30 AM CST, Peter Eisentraut wrote:
> On 22.01.24 21:04, Tristan Partin wrote:
> > I am not really following why we can't use the builtin Meson dist 
> > command. The only difference from my testing is it doesn't use a 
> > --prefix argument.

>
> Here are some problems I have identified:
>
> 1. meson dist internally runs gzip without the -n option.  That makes 
> the tar.gz archive include a timestamp, which in turn makes it not 
> reproducible.


It doesn't look like Python provides the facilities to affect this.

> 2. Because gzip includes a platform indicator in the archive, the 
> produced tar.gz archive is not reproducible across platforms.  (I don't 
> know if gzip has an option to avoid that.  git archive uses an internal 
> gzip implementation that handles this.)


Same reason as above.


> 3. Meson does not support tar.bz2 archives.


Submitted https://github.com/mesonbuild/meson/pull/12770.

> 4. Meson uses git archive internally, but then unpacks and repacks the 
> archive, which loses the ability to use git get-tar-commit-id.


Because Meson allows projects to distribute arbitrary files via 
meson.add_dist_script(), and can include subprojects via `meson dist 
--include-subprojects`, this doesn't seem like an easily solvable 
problem.


> 5. I have found that the tar archives created by meson and git archive 
> include the files in different orders.  I suspect that the Python 
> tarfile module introduces some either randomness or platform dependency.


Seems likely.


> 6. meson dist is also slower because of the additional work.


Not easily solvable due to 4.

> 7. meson dist produces .sha256sum files but we have called them .sha256. 
>   (This is obviously trivial, but it is something that would need to be 
> dealt with somehow nonetheless.)

>
> Most or all of these issues are fixable, either upstream in Meson or by 
> adjusting our own requirements.  But for now this route would have some 
> significant disadvantages.


Thanks Peter. I will bring these up with upstream!


I think the solution to point 4 is to not unpack/repack if there are no 
dist scripts and/or subprojects to distribute. I can take a look at 
this later. I think this would also solve points 1, 2, 5, and 6 because 
at that point meson is just calling git-archive.


--
Tristan Partin
Neon (https://neon.tech)




Re: make dist using git archive

2024-01-25 Thread Tristan Partin

On Thu Jan 25, 2024 at 10:04 AM CST, Peter Eisentraut wrote:

On 22.01.24 21:04, Tristan Partin wrote:
>> +    'HEAD', '.'],
>> +  install: false,
>> +  output: distdir + '.tar.bz2',
>> +)
> 
> The bz2 target should be wrapped in an `if bzip2.found()`.


The way that this currently works is that you will fail at configure 
time if bz2 doesn't exist on the system. Meson will try to resolve 
a .path() method on a NotFoundProgram. You might want to define the bz2 
target to just call `exit 1` in this case.


if bzip2.found()
 # do your current target
else
 bz2 = run_target('tar.bz2', command: ['exit', 1])
endif

This should cause pgdist to appropriately fail at run time when 
generating the bz2 tarball.


Well, I think we want the dist step to fail if bzip2 is not there.  At 
least that is the current expectation.


>> +alias_target('pgdist', [check_dirty_index, tar_gz, tar_bz2])
> 
> Are you intending for the check_dirty_index target to prohibit the other 
> two targets from running? Currently that is not the case.


Yes, that was the hope, and that's how the make dist variant works.  But 
I couldn't figure this out with meson.  Also, the above actually also 
doesn't work with older meson versions, so I had to comment this out to 
get CI to work.


> If it is what 
> you intend, use a stamp file or something to indicate a relationship. 
> Alternatively, inline the git diff-index into the other commands. These 
> might also do better as external scripts. It would reduce duplication 
> between the autotools and Meson builds.


Yeah, maybe that's a direction.


For what it's worth, I run Meson 1.3, and the behavior of generating the 
tarballs even though it is a dirty tree still occurred. In the new patch 
you seem to say it was fixed in 0.60.


--
Tristan Partin
Neon (https://neon.tech)




Re: make dist using git archive

2024-01-26 Thread Tristan Partin

On Fri Jan 26, 2024 at 12:28 AM CST, Peter Eisentraut wrote:

On 25.01.24 17:25, Tristan Partin wrote:
> For what it's worth, I run Meson 1.3, and the behavior of generating the 
> tarballs even though it is a dirty tree still occurred. In the new patch 
> you seem to say it was fixed in 0.60.


The problem I'm referring to is that before 0.60, alias_target cannot 
depend on run_target (only "build target").  This is AFAICT not 
documented and might not have been an intentional change, but you can 
trace it in the meson source code, and it shows in the PostgreSQL CI. 
That's also why for the above bzip2 issue I have to use custom_target in 
place of your run_target.


https://github.com/mesonbuild/meson/pull/12783

Thanks for finding these issues.

--
Tristan Partin
Neon (https://neon.tech)




Re: meson + libpq_pipeline

2024-01-29 Thread Tristan Partin

On Mon Jan 29, 2024 at 11:37 AM CST, Alvaro Herrera wrote:

I just realized while looking at Jelte's patch for the new nonblocking
query cancel stuff that the Meson build doesn't run the libpq_pipeline
tests :-(

Is there any way to wire the tests to make it work?


I can try to take a look for you. Not sure how hard it will be, but 
I can take a crack at it this week.


--
Tristan Partin
Neon (https://neon.tech)




Fix some ubsan/asan related issues

2024-01-30 Thread Tristan Partin

Patch 1:

Passing NULL as a second argument to memcpy breaks ubsan, and there 
didn't seem to be anything preventing that in the LogLogicalMessage() 
codepath. Here is a preventative measure in LogLogicalMessage() and an 
Assert() in CopyXLogRecordToWAL().


Patch 2:

Support building with -Db_sanitize=address in Meson. Various executables 
are leaky which can cause the builds to fail and tests to fail, when we 
are fine leaking this memory.


Personally, I am a big stickler for always freeing my memory in 
executables even if it gets released at program exit because it makes 
the leak sanitizer much more effective. However (!), I am not here to 
change the philosophy of memory management in one-off executables, so 
I have just silenced memory leaks in various executables for now.


Patch 3:

THIS DOESN'T WORK. DO NOT COMMIT. PROPOSING FOR DISCUSSION!

In my effort to try to see if the test suite would pass with asan 
enabled, I ran into a max_stack_depth issue. I tried maxing it out 
(hence, the patch), but that still didn't remedy my issue. I tried to 
look on the list for any relevant emails, but nothing turned up. Maybe 
others have not attempted this before? Seems doubtful.


Not entirely sure how to fix this issue. I personally find asan 
extremely effective, even more than valgrind, so it would be great if 
I could run Postgres with asan enabled to catch various stupid C issues 
I might create. On my system, ulimit -a returns 8192 kbytes, so Postgres 
just lets me set 7680 kbytes as the max. Whatever asan is doing doesn't 
seem to leave enough stack space for Postgres.


---

I would like to see patch 1 reviewed and committed. Patch 2 honestly 
doesn't matter unless asan support can be fixed. I can also add a patch 
that errors out the Meson build if asan support is requested. That way 
others don't spend time heading down a dead end.


--
Tristan Partin
Neon (https://neon.tech)




Re: Fix some ubsan/asan related issues

2024-01-30 Thread Tristan Partin
Spend so much time writing out the email, I once again forget 
attachments...UGH.


--
Tristan Partin
Neon (https://neon.tech)
From 331cec1c9db6ff60dcc6d9ba62a9c8be4e5e95ed Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Mon, 29 Jan 2024 18:03:39 -0600
Subject: [PATCH v1 1/3] Refuse to register message in LogLogicalMessage if
 NULL

If this occurs, the memcpy of rdata_data in CopyXLogRecordToWAL breaks
the API contract of memcpy in glibc. The two pointer arguments are
marked as nonnull, even in the event the amount to copy is 0 bytes.
---
 src/backend/access/transam/xlog.c | 1 +
 src/backend/replication/logical/message.c | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 478377c4a2..929888beb5 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1288,6 +1288,7 @@ CopyXLogRecordToWAL(int write_len, bool isLogSwitch, XLogRecData *rdata,
 		}
 
 		Assert(CurrPos % XLOG_BLCKSZ >= SizeOfXLogShortPHD || rdata_len == 0);
+		Assert(rdata_data != NULL);
 		memcpy(currpos, rdata_data, rdata_len);
 		currpos += rdata_len;
 		CurrPos += rdata_len;
diff --git a/src/backend/replication/logical/message.c b/src/backend/replication/logical/message.c
index 2ac34e7781..126c57ef6e 100644
--- a/src/backend/replication/logical/message.c
+++ b/src/backend/replication/logical/message.c
@@ -67,7 +67,8 @@ LogLogicalMessage(const char *prefix, const char *message, size_t size,
 	XLogBeginInsert();
 	XLogRegisterData((char *) &xlrec, SizeOfLogicalMessage);
 	XLogRegisterData(unconstify(char *, prefix), xlrec.prefix_size);
-	XLogRegisterData(unconstify(char *, message), size);
+	if (message)
+		XLogRegisterData(unconstify(char *, message), size);
 
 	/* allow origin filtering */
 	XLogSetRecordFlags(XLOG_INCLUDE_ORIGIN);
-- 
Tristan Partin
Neon (https://neon.tech)

From dc9488f3fdee69b981b52c985fb77106d7d301ff Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Wed, 24 Jan 2024 17:07:01 -0600
Subject: [PATCH v1 2/3] meson: Support compiling with -Db_sanitize=address

The ecpg is parser is extremely leaky, so we need to silence leak
detection.
---
 meson.build|  3 +++
 src/bin/initdb/initdb.c| 11 +++
 src/bin/pg_config/pg_config.c  | 10 ++
 src/bin/pg_resetwal/pg_resetwal.c  | 10 ++
 src/include/pg_config.h.in |  5 +
 src/interfaces/ecpg/preproc/ecpg.c | 11 +++
 6 files changed, 50 insertions(+)

diff --git a/meson.build b/meson.build
index 8ed51b6aae..d8c524d6f6 100644
--- a/meson.build
+++ b/meson.build
@@ -2530,6 +2530,9 @@ cdata.set_quoted('PG_VERSION_STR',
   )
 )
 
+if get_option('b_sanitize').contains('address')
+  cdata.set('USE_ADDRESS_SANITIZER', 1)
+endif
 
 ###
 # NLS / Gettext
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index ac409b0006..e18e716d9c 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -338,6 +338,17 @@ do { \
 		output_failed = true, output_errno = errno; \
 } while (0)
 
+#ifdef USE_ADDRESS_SANITIZER
+
+const char *__asan_default_options(void);
+
+const char *__asan_default_options(void)
+{
+	return "detect_leaks=0";
+}
+
+#endif
+
 /*
  * Escape single quotes and backslashes, suitably for insertions into
  * configuration files or SQL E'' strings.
diff --git a/src/bin/pg_config/pg_config.c b/src/bin/pg_config/pg_config.c
index 77d09ccfc4..26d0b2f62b 100644
--- a/src/bin/pg_config/pg_config.c
+++ b/src/bin/pg_config/pg_config.c
@@ -67,6 +67,16 @@ static const InfoItem info_items[] = {
 	{NULL, NULL}
 };
 
+#ifdef USE_ADDRESS_SANITIZER
+
+const char *__asan_default_options(void);
+
+const char *__asan_default_options(void)
+{
+	return "detect_leaks=0";
+}
+
+#endif
 
 static void
 help(void)
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index e9dcb5a6d8..54f1ce5e44 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -89,6 +89,16 @@ static void KillExistingWALSummaries(void);
 static void WriteEmptyXLOG(void);
 static void usage(void);
 
+#ifdef USE_ADDRESS_SANITIZER
+
+const char *__asan_default_options(void);
+
+const char *__asan_default_options(void)
+{
+	return "detect_leaks=0";
+}
+
+#endif
 
 int
 main(int argc, char *argv[])
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 07e73567dc..ce0c700b6d 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -668,6 +668,11 @@
 /* Define to 1 if strerror_r() returns int. */
 #undef STRERROR_R_INT
 
+/* Define to 1 if using the address sanitizer. Typically this can be detecte
+ * with __has_feature(address_sanitizer), but GCC doesn't support it with C99.
+ * Remove it when the standard is bumped. */
+#undef USE_ADDRESS_SANITI

Re: psql not responding to SIGINT upon db reconnection

2024-01-30 Thread Tristan Partin

On Fri Jan 12, 2024 at 11:13 AM CST, Tristan Partin wrote:

On Fri Jan 12, 2024 at 10:45 AM CST, Robert Haas wrote:
> On Mon, Jan 8, 2024 at 1:03 AM Tristan Partin  wrote:
> > I think the way to go is to expose some variation of libpq's
> > pqSocketPoll(), which I would be happy to put together a patch for.
> > Making frontends, psql in this case, have to reimplement the polling
> > logic doesn't strike me as fruitful, which is essentially what I have
> > done.
>
> I encourage further exploration of this line of attack. I fear that if
> I were to commit something like what you've posted up until now,
> people would complain that that code was too ugly to live, and I'd
> have a hard time telling them that they're wrong.

Completely agree. Let me look into this. Perhaps I can get something up 
next week or the week after.


Not next week, but here is a respin. I've exposed pqSocketPoll as 
PQsocketPoll and am just using that. You can see the diff is so much 
smaller, which is great!


In order to fight the race condition, I am just using a 1 second timeout 
instead of trying to integrate pselect or ppoll. We could add 
a PQsocketPPoll() to support those use cases, but I am not sure how 
available pselect and ppoll are. I guess on Windows we don't have 
pselect. I don't think using the pipe trick that Heikki mentioned 
earlier is suitable to expose via an API in libpq, but someone else 
might have a different opinion.


Maybe this is good enough until someone complains? Most people would 
probably just chalk any latency between keypress and cancellation as 
network latency and not a hardcoded 1 second.


Thanks for your feedback Robert!

--
Tristan Partin
Neon (https://neon.tech)
From 4fa6db1900a355a171d3e16019d02f3a415764d0 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Tue, 30 Jan 2024 15:40:57 -0600
Subject: [PATCH v8 1/2] Expose PQsocketPoll for frontends

PQsocketPoll should help with state machine processing when connecting
to a database asynchronously via PQconnectStart().
---
 doc/src/sgml/libpq.sgml | 5 -
 src/interfaces/libpq/fe-misc.c  | 7 +++
 src/interfaces/libpq/libpq-fe.h | 4 
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d0d5aefadc..aa26c2cc8d 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -358,7 +358,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn);
Loop thus: If PQconnectPoll(conn) last returned
PGRES_POLLING_READING, wait until the socket is ready to
read (as indicated by select(), poll(), or
-   similar system function).
+   similar system function).  Note that PQsocketPoll
+   can help reduce boilerplate by abstracting the setup of
+   select(), or poll() if it is
+   available on your system.
Then call PQconnectPoll(conn) again.
Conversely, if PQconnectPoll(conn) last returned
PGRES_POLLING_WRITING, wait until the socket is ready
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index 47a28b0a3a..ee917d375d 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -55,7 +55,6 @@ static int	pqPutMsgBytes(const void *buf, size_t len, PGconn *conn);
 static int	pqSendSome(PGconn *conn, int len);
 static int	pqSocketCheck(PGconn *conn, int forRead, int forWrite,
 		  time_t end_time);
-static int	pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time);
 
 /*
  * PQlibVersion: return the libpq version number
@@ -1059,7 +1058,7 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time)
 
 	/* We will retry as long as we get EINTR */
 	do
-		result = pqSocketPoll(conn->sock, forRead, forWrite, end_time);
+		result = PQsocketPoll(conn->sock, forRead, forWrite, end_time);
 	while (result < 0 && SOCK_ERRNO == EINTR);
 
 	if (result < 0)
@@ -1083,8 +1082,8 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time)
  * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
  * if end_time is 0 (or indeed, any time before now).
  */
-static int
-pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
+int
+PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time)
 {
 	/* We use poll(2) if available, otherwise select(2) */
 #ifdef HAVE_POLL
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index defc415fa3..11a7fd32b6 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -21,6 +21,7 @@ extern "C"
 #endif
 
 #include 
+#include 
 
 /*
  * postgres_ext.h defines the backend's externally visible types,
@@ -644,6 +645,9 @@ extern int	lo_export(PGconn *conn, Oid lobjId, const char *filename);
 /* Get the version of the libpq library in use */
 extern int	PQlibVersion(void);
 
+/* Poll 

Re: psql not responding to SIGINT upon db reconnection

2024-01-31 Thread Tristan Partin

On Tue Jan 30, 2024 at 4:42 PM CST, Jelte Fennema-Nio wrote:

On Tue, 30 Jan 2024 at 23:20, Tristan Partin  wrote:
> Not next week, but here is a respin. I've exposed pqSocketPoll as
> PQsocketPoll and am just using that. You can see the diff is so much
> smaller, which is great!

The exports.txt change should be made part of patch 0001, also docs
are missing for the newly exposed function. PQsocketPoll does look
like quite a nice API to expose from libpq.


I was looking for documentation of PQsocket(), but didn't find any 
standalone (unless I completely missed it). So I just copied how 
PQsocket() is documented in PQconnectPoll(). I am happy to document it 
separately if you think it would be useful.


My bad on the exports.txt change being in the wrong commit. Git 
things... I will fix it on the next re-spin after resolving the previous 
paragraph.


--
Tristan Partin
Neon (https://neon.tech)




Re: Fix some ubsan/asan related issues

2024-01-31 Thread Tristan Partin

On Tue Jan 30, 2024 at 10:00 PM CST, Alexander Lakhin wrote:

Hello,

30.01.2024 18:57, Tristan Partin wrote:
> Patch 1:
>
> Passing NULL as a second argument to memcpy breaks ubsan, ...

Maybe you would like to fix also one more similar place, reached with:
create extension xml2;
select xslt_process('',
$$http://www.w3.org/1999/XSL/Transform";>


$$);

varlena.c:201:26: runtime error: null pointer passed as argument 2, which is 
declared to never be null

There is also an issue with pg_bsd_indent, I stumble upon when doing
`make check-world` with the sanitizers enabled:
https://www.postgresql.org/message-id/591971ce-25c1-90f3-0526-5f54e3ebb32e%40gmail.com

31.01.2024 00:23, Andres Freund wrote:
> The reason asan fails is that it uses a "shadow stack" to track stack variable
> lifetimes. These confuse our stack depth check. CI doesn't have the issue
> because the compiler doesn't yet enable the feature, locally I get around it
> by using ASAN_OPTIONS=detect_stack_use_after_return=0:...

Even with detect_stack_use_after_return=0, clang-18's asan makes the test
012_subtransactions.pl fail:
2024-01-31 03:24:25.691 UTC [4112455] 012_subtransactions.pl LOG: statement: 
SELECT hs_subxids(201);
2024-01-31 03:24:25.714 UTC [4112455] 012_subtransactions.pl ERROR: stack depth 
limit exceeded
2024-01-31 03:24:25.714 UTC [4112455] 012_subtransactions.pl HINT: Increase the configuration parameter max_stack_depth 
(currently 2048kB), after ensuring the platform's stack depth limit is adequate.


(All the other tests pass.)
Though the same test passes when I use clang-16.


Thanks Alexander! I will try and take a look at these.

--
Tristan Partin
Neon (https://neon.tech)




Re: Fix some ubsan/asan related issues

2024-02-05 Thread Tristan Partin

On Tue Jan 30, 2024 at 3:58 PM CST, Andres Freund wrote:

Hi,

On 2024-01-30 09:59:25 -0600, Tristan Partin wrote:
> From 331cec1c9db6ff60dcc6d9ba62a9c8be4e5e95ed Mon Sep 17 00:00:00 2001
> From: Tristan Partin 
> Date: Mon, 29 Jan 2024 18:03:39 -0600
> Subject: [PATCH v1 1/3] Refuse to register message in LogLogicalMessage if
>  NULL

> If this occurs, the memcpy of rdata_data in CopyXLogRecordToWAL breaks
> the API contract of memcpy in glibc. The two pointer arguments are
> marked as nonnull, even in the event the amount to copy is 0 bytes.

It seems pretty odd to call LogLogicalMessage() with a NULL argument. Why is
that something useful?


Dropped. Will change on the Neon side. Should we add an assert 
somewhere for good measure? Near the memcpy in question?



> From dc9488f3fdee69b981b52c985fb77106d7d301ff Mon Sep 17 00:00:00 2001
> From: Tristan Partin 
> Date: Wed, 24 Jan 2024 17:07:01 -0600
> Subject: [PATCH v1 2/3] meson: Support compiling with -Db_sanitize=address
> 
> The ecpg is parser is extremely leaky, so we need to silence leak

> detection.

This does stuff beyond epcg...


Dropped.


> +if get_option('b_sanitize').contains('address')
> +  cdata.set('USE_ADDRESS_SANITIZER', 1)
> +endif
>  
>  ###

>  # NLS / Gettext
> diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
> index ac409b0006..e18e716d9c 100644
> --- a/src/bin/initdb/initdb.c
> +++ b/src/bin/initdb/initdb.c
> @@ -338,6 +338,17 @@ do { \
>output_failed = true, output_errno = errno; \
>  } while (0)
>  
> +#ifdef USE_ADDRESS_SANITIZER


When asan is used  __SANITIZE_ADDRESS__ is defined, so we don't need to
implement this ourselves.


Thanks!


> +const char *__asan_default_options(void);
> +
> +const char *__asan_default_options(void)
> +{
> +  return "detect_leaks=0";
> +}
> +
> +#endif

Wonder if we should move this into some static library and link it into all
binaries that don't want leak detection? It doesn't seem great to have to
adjust this in a bunch of files if we want to adjust the options...


See attached patches. Here is what I found to be necessary to get 
a -Db_sanitize=address,undefined build to successfully make it through 
tests. I do have a few concerns about the patch.


1. For whatever reason, __SANITIZE_LEAK__ is not defined when the leak 
  sanitizer is enabled. So you will see me use this, to make some 
  include directives work. I don't like this as a final solution 
  because someone could use -fsanitize=leak.
2. I tracked down what seems to be a valid leak in adt/xml.c. Attached 
  (test.sql) is a fairly minimal reproduction of what is needed to show 
  the leak. I didn't spend too much time tracking it down. Might get to 
  it later, who knows. Below you will find the backtrace, and whoever 
  wants to try their hand at fixing it will need to comment out 
  xmlNewNode in the leak.supp file.
3. I don't love the new library name. Maybe it should be name more lsan 
  specific.
4. Should pg_attribute_no_asan be renamed to 
  pg_attribute_no_sanitize_address? That would match 
  pg_attribute_no_sanitize_alignment.


I will also attach a Meson test log for good measure. I didn't try 
testing anything that requires PG_TEST_EXTRA, but I suspect that 
everything will be fine.


Alexander, I haven't yet gotten to the things you pointed out in the 
sibling thread.


==221848==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 120 byte(s) in 1 object(s) allocated from:
   #0 0x7fac4a6d92ef in malloc (/lib64/libasan.so.8+0xd92ef) (BuildId: 
7fcb7759bc17ef47f9682414b6d99732d6a6ab0c)
   #1 0x7fac4a48427d in xmlNewNode (/lib64/libxml2.so.2+0x5d27d) (BuildId: 
3074681c8fa9b17e0cbed09bc61c25ada5c28e7c)
   #2 0x22107a6 in xmltotext_with_options ../src/backend/utils/adt/xml.c:754
   #3 0xead047 in ExecEvalXmlExpr ../src/backend/executor/execExprInterp.c:4020
   #4 0xe8c119 in ExecInterpExpr ../src/backend/executor/execExprInterp.c:1537
   #5 0xe91f2c in ExecInterpExprStillValid 
../src/backend/executor/execExprInterp.c:1881
   #6 0x109632d in ExecEvalExprSwitchContext 
../src/include/executor/executor.h:355
   #7 0x109655d in ExecProject ../src/include/executor/executor.h:389
   #8 0x1097186 in ExecResult ../src/backend/executor/nodeResult.c:136
   #9 0xf0f90c in ExecProcNodeFirst ../src/backend/executor/execProcnode.c:464
   #10 0xec9bec in ExecProcNode ../src/include/executor/executor.h:273
   #11 0xed875c in ExecutePlan ../src/backend/executor/execMain.c:1670
   #12 0xecbee0 in standard_ExecutorRun ../src/backend/executor/execMain.c:365
   #13 0xecb529 in ExecutorRun ../src/backend/executor/execMain.c:309
   #14 0x1ae89f6 in PortalRunSelect ../src/backend/tcop/pquery.c:924
   #15 0x1ae7c06 in PortalRun ../src/backend/tcop/pquery.

Re: psql not responding to SIGINT upon db reconnection

2024-03-22 Thread Tristan Partin

On Fri Mar 22, 2024 at 9:59 AM CDT, Robert Haas wrote:

On Wed, Jan 31, 2024 at 1:07 PM Tristan Partin  wrote:
> I was looking for documentation of PQsocket(), but didn't find any
> standalone (unless I completely missed it). So I just copied how
> PQsocket() is documented in PQconnectPoll(). I am happy to document it
> separately if you think it would be useful.

As Jelte said back at the end of January, I think you just completely
missed it. The relevant part of libpq.sgml starts like this:


 
PQsocketPQsocket

As far as I know, we document all of the exported libpq functions in
that SGML file.

I think there's no real reason why we couldn't get at least 0001 and
maybe also 0002 into this release, but only if you move quickly on
this. Feature freeze is approaching rapidly.

Modulo the documentation changes, I think 0001 is pretty much ready to
go. 0002 needs comments. I'm also not so sure about the name
process_connection_state_machine(); it seems a little verbose. How
about something like wait_until_connected()? And maybe put it below
do_connect() instead of above.


Robert, Jelte:

Sorry for taking a while to get back to y'all. I have taken your 
feedback into consideration for v9. This is my first time writing 
Postgres docs, so I'm ready for another round of editing :).


Robert, could you point out some places where comments would be useful 
in 0002? I did rename the function and moved it as suggested, thanks! In 
turn, I also realized I forgot a prototype, so also added it.


This patchset has also been rebased on master, so the libpq export 
number for PQsocketPoll was bumped.


--
Tristan Partin
Neon (https://neon.tech)
From 7f8bf7250ecf79f65c129ccc42643c36bc53f882 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Tue, 30 Jan 2024 15:40:57 -0600
Subject: [PATCH v9 1/2] Expose PQsocketPoll for frontends

PQsocketPoll should help with state machine processing when connecting
to a database asynchronously via PQconnectStart().
---
 doc/src/sgml/libpq.sgml  | 38 +++-
 src/interfaces/libpq/exports.txt |  1 +
 src/interfaces/libpq/fe-misc.c   |  7 +++---
 src/interfaces/libpq/libpq-fe.h  |  4 
 4 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d3e87056f2c..10f191f6b88 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -262,6 +262,39 @@ PGconn *PQsetdb(char *pghost,
  
 
 
+
+ PQsocketPollPQsocketPoll
+ 
+  
+   nonblocking connection
+   Poll a connection's underlying socket descriptor retrieved with .
+
+int PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);
+
+  
+
+  
+   This function takes care of the setup for monitoring a file descriptor. The underlying
+   function is either poll(2) or select(2),
+   depending on platform support. The primary use of this function is working through the state
+   machine instantiated by .
+  
+
+  
+   The function returns a value greater than 0 if the specified condition
+   is met, 0 if a timeout occurred, or -1 if an error
+   or interrupt occurred. In the event forRead and
+   forWrite are not set, the function immediately retuns a timeout
+   condition.
+  
+
+  
+   end_time is the time in the future in seconds starting from the UNIX
+   epoch in which you would like the function to return if the condition is not met.
+  
+ 
+
+
 
  PQconnectStartParamsPQconnectStartParams
  PQconnectStartPQconnectStart
@@ -358,7 +391,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn);
Loop thus: If PQconnectPoll(conn) last returned
PGRES_POLLING_READING, wait until the socket is ready to
read (as indicated by select(), poll(), or
-   similar system function).
+   similar system function).  Note that PQsocketPoll
+   can help reduce boilerplate by abstracting the setup of
+   select(2) or poll(2) if it is
+   available on your system.
Then call PQconnectPoll(conn) again.
Conversely, if PQconnectPoll(conn) last returned
PGRES_POLLING_WRITING, wait until the socket is ready
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 9fbd3d34074..1e48d37677d 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -202,3 +202,4 @@ PQcancelSocket199
 PQcancelErrorMessage  200
 PQcancelReset 201
 PQcancelFinish202
+PQsocketPoll  203
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index f2fc78a481c..f562cd8d344 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -55,7 +55,6 @@ static int	pqPutMsgBytes(const void *buf, size_t len, PGconn *conn);
 static int	pqSendSome(PGconn *conn, int len);
 static int	pqSock

Re: make dist using git archive

2024-03-22 Thread Tristan Partin

On Thu Mar 21, 2024 at 3:44 AM CDT, Peter Eisentraut wrote:

Here is an updated version of this patch set.


You should add 'disabler: true' to the git find_program in Meson. If Git 
doesn't exist on the system with the way your patch is currently 
written, the targets would be defined, even though they would never 
succeed.


You may also want to make sure that we are actually in a Git repository. 
I don't think git-archive works outside one.


Re the autoclrf, is this something we could throw in a .gitattributes 
files?


I have removed the "dirty check" stuff.  It didn't really work well/was 
buggy under meson, and it failed mysteriously on the Linux CI tasks.  So 
let's just drop that functionality for now.


I have also added a more complete commit message and some more code 
comments.



Meson has its own distribution building command (meson dist), but we
are not using that at this point.  The main problem is that the way
they have implemented it, it is not deterministic in the above sense.
Also, we want a "make" version for the time being.  But the target
name "dist" in meson is reserved for that reason, so we call the
custom target "pgdist" (so call something like "meson compile -C build
pgdist").


I would suggest poisoning `meson dist` in the following way:

if not meson.is_subproject()
	# Maybe edit the comment...Maybe tell perl to print this message 
	# instead and then exit non-zero?

#
# Meson has its own distribution building command (meson dist), but we
# are not using that at this point.  The main problem is that the way
# they have implemented it, it is not deterministic in the above sense.
# Also, we want a "make" version for the time being.  But the target
# name "dist" in meson is reserved for that reason, so we call the
# custom target "pgdist" (so call something like "meson compile -C build
# pgdist").
#
	# We don't poison the dist if we are a subproject because it is 
	# possible that the parent project may want to create a dist using 
	# the builtin Meson method.

meson.add_dist_script(perl, '-e', 'exit 1')
endif

I have extracted the freebsd CI script fix into a separate patch (0002). 
  I think this is useful even if we don't take the full CI patch (0003).


0002 looks pretty reasonable to me.

About the 0003 patch: It seems useful in principle to test these things 
continuously.  The dist script runs about 10 seconds in each task, and 
takes a bit of disk space for the artifacts.  I'm not sure to what 
degree this might bother someone.


0003 works for me :).

--
Tristan Partin
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

2024-03-22 Thread Tristan Partin

On Fri Mar 22, 2024 at 12:17 PM CDT, Robert Haas wrote:

On Fri, Mar 22, 2024 at 1:05 PM Tristan Partin  wrote:
> Sorry for taking a while to get back to y'all. I have taken your
> feedback into consideration for v9. This is my first time writing
> Postgres docs, so I'm ready for another round of editing :).

Yeah, that looks like it needs some wordsmithing yet. I can take a
crack at that at some point, but here are a few notes:

- "takes care of" and "working through the state machine" seem quite
vague to me.
- the meanings of forRead and forWrite don't seem to be documented.
- "retuns" is a typo.

> Robert, could you point out some places where comments would be useful
> in 0002? I did rename the function and moved it as suggested, thanks! In
> turn, I also realized I forgot a prototype, so also added it.

Well, for starters, I'd give the function a header comment.

Telling me that a 1 second timeout is a 1 second timeout is less
useful than telling me why we've picked a 1 second timeout. Presumably
the answer is "so we can respond to cancel interrupts in a timely
fashion", but I'd make that explicit.

It might be worth a comment saying that PQsocket() shouldn't be
hoisted out of the loop because it could be a multi-host connection
string and the socket might change under us. Unless that's not true,
in which case we should hoist the PQsocket() call out of the loop.

I think it would also be worth a comment saying that we don't need to
report errors here, as the caller will do that; we just need to wait
until the connection is known to have either succeeded or failed, or
until the user presses cancel.


This is good feedback, thanks. I have added comments where you 
suggested. I reworded the PQsocketPoll docs to hopefully meet your 
expectations. I am using the term "connection sequence" which is from 
the PQconnectStartParams docs, so hopefully people will be able to make 
that connection. I wrote documentation for "forRead" and "forWrite" as 
well.


I had a question about parameter naming. Right now I have a mix of 
camel-case and snake-case in the function signature since that is what 
I inherited. Should I change that to be consistent? If so, which case 
would you like?


Thanks for your continued reviews.

--
Tristan Partin
Neon (https://neon.tech)
From dc45e95ab443d973845dd840600d6719dcd4cae2 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Tue, 30 Jan 2024 15:40:57 -0600
Subject: [PATCH v10 1/2] Expose PQsocketPoll for frontends

PQsocketPoll should help with state machine processing when connecting
to a database asynchronously via PQconnectStart().
---
 doc/src/sgml/libpq.sgml  | 43 +++-
 src/interfaces/libpq/exports.txt |  1 +
 src/interfaces/libpq/fe-misc.c   |  7 +++---
 src/interfaces/libpq/libpq-fe.h  |  4 +++
 4 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d3e87056f2c..774b57ba70b 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -262,6 +262,44 @@ PGconn *PQsetdb(char *pghost,
  
 
 
+
+ PQsocketPollPQsocketPoll
+ 
+  
+   nonblocking connection
+   Poll a connection's underlying socket descriptor retrieved with .
+
+int PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);
+
+  
+
+  
+   This function sets up polling of a file descriptor. The underlying function is either
+   poll(2) or select(2), depending on platform
+   support. The primary use of this function is iterating through the connection sequence
+   described in the documentation of . If
+   forRead is specified, the function waits for the socket to be ready
+   for reading. If forWrite is specified, the function waits for the
+   socket to be ready for write. See POLLIN and POLLOUT
+   from poll(2), or readfds and
+   writefds from select(2) for more information.
+  
+
+  
+   The function returns a value greater than 0 if the specified condition
+   is met, 0 if a timeout occurred, or -1 if an error
+   or interrupt occurred. In the event forRead and
+   forWrite are not set, the function immediately returns a timeout
+   condition.
+  
+
+  
+   end_time is the time in the future in seconds starting from the UNIX
+   epoch in which you would like the function to return if the condition is not met.
+  
+ 
+
+
 
  PQconnectStartParamsPQconnectStartParams
  PQconnectStartPQconnectStart
@@ -358,7 +396,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn);
Loop thus: If PQconnectPoll(conn) last returned
PGRES_POLLING_READING, wait until the socket is ready to
read (as indicated by select(), poll(), or
-   similar system function).
+   similar system function).  Note

Re: make dist using git archive

2024-03-24 Thread Tristan Partin

3 comments left that are inconsequential. Feel free to ignore.


+# Meson has its own distribution building command (meson dist), but we
+# are not using that at this point.  The main problem is that the way
+# they have implemented it, it is not deterministic.  Also, we want it
+# to be equivalent to the "make" version for the time being.  But the
+# target name "dist" in meson is reserved for that reason, so we call
+# the custom target "pgdist".


The second sentence is a run-on.


+if bzip2.found()
+  tar_bz2 = custom_target('tar.bz2',
+build_always_stale: true,
+command: [git, '-C', '@SOURCE_ROOT@',
+  '-c', 'core.autocrlf=false',
+  '-c', 'tar.tar.bz2.command="' + bzip2.path() + '" -c',
+  'archive',
+  '--format', 'tar.bz2',
+  '--prefix', distdir + '/',
+  '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+  'HEAD', '.'],
+install: false,
+output: distdir + '.tar.bz2',
+  )


You might find Meson's string formatting syntax creates a more readable 
command string:


'tar.tar.bz2.command=@0@ -c'.format(bzip2.path())

And then 'install: false' is the default if you feel like leaving it 
out.


Otherwise, let's get this in!

--
Tristan Partin
Neon (https://neon.tech)




Re: make dist using git archive

2024-03-26 Thread Tristan Partin

On Tue Mar 26, 2024 at 2:56 AM CDT, Andres Freund wrote:

Hi,

On 2024-03-26 08:36:58 +0100, Peter Eisentraut wrote:
> On 26.03.24 01:23, Andres Freund wrote:
> > On 2024-03-25 06:44:33 +0100, Peter Eisentraut wrote:
> > > Done and committed.
> > 
> > This triggered a new warning for me:
> > 
> > ../../../../../home/andres/src/postgresql/meson.build:3422: WARNING: Project targets '>=0.54' but uses feature introduced in '0.55.0': Passing executable/found program object to script parameter of add_dist_script.
> 
> Hmm, I don't see that.  Is there another version dependency that controls

> when you see version dependency warnings? ;-)

Sometimes an incompatibility is later noticed and a warning is introduced at
that point.

> We could trivially remove this particular line, or perhaps put a
> 
> if meson.version().version_compare('>=0.55')
> 
> around it.  (But would that still warn?)


It shouldn't, no. As long as the code is actually executed within the check,
it avoids the warning. However if you just set a variable inside the version
gated block and then later use the variable outside that, it will
warn. Probably hard to avoid...


The following change also makes the warning go away, but the version 
comparison seems better to me due to how we choose not to use machine 
files for overriding programs[0]. :(


- meson.add_dist_script(perl, ...)
+ meson.add_dist_script('perl', ...)

Aside, but I think since we dropped AIX, we can bump the required Meson 
version. My last analysis of the situation told me that the AIX 
buildfarm animals were the only machines which didn't have a Python 
version capable of running a newer version. I would need to look at the 
situation again though.


[0]: If someone wants to make a plea here: 
https://github.com/mesonbuild/meson/pull/12623


--
Tristan Partin
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

2024-04-01 Thread Tristan Partin

On Mon Mar 25, 2024 at 1:44 PM CDT, Robert Haas wrote:

On Fri, Mar 22, 2024 at 4:58 PM Tristan Partin  wrote:
> I had a question about parameter naming. Right now I have a mix of
> camel-case and snake-case in the function signature since that is what
> I inherited. Should I change that to be consistent? If so, which case
> would you like?

Uh... PostgreSQL is kind of the wild west in that regard. The thing to
do is look for nearby precedents, but that doesn't help much here
because in the very same file, libpq-fe.h, we have:

extern int  PQsetResultAttrs(PGresult *res, int numAttributes,
PGresAttDesc *attDescs);
extern int  PQsetvalue(PGresult *res, int tup_num, int field_num,
char *value, int len);

Since the existing naming is consistent with one of those two styles,
I'd probably just leave it be.

+   The function returns a value greater than 0
if the specified condition
+   is met, 0 if a timeout occurred, or
-1 if an error
+   or interrupt occurred. In the event forRead and

We either need to tell people how to find out which error it was, or
if that's not possible and we can't reasonably make it possible, we
need to tell them why they shouldn't care. Because there's nothing
more delightful than someone who shows up and says "hey, I tried to do
XYZ, and I got an error," as if that were sufficient information for
me to do something useful.

+   end_time is the time in the future in
seconds starting from the UNIX
+   epoch in which you would like the function to return if the
condition is not met.

This sentence seems a bit contorted to me, like maybe Yoda wrote it. I
was about to try to rephrase it and maybe split it in two when I
wondered why we need to document how time_t works at all. Can't we
just say something like "If end_time is not -1, it specifies the time
at which this function should stop waiting for the condition to be
met" -- and maybe move it to the end of the first paragraph, so it's
before where we list the meanings of the return values?


Incorporated feedback, I have :).

--
Tristan Partin
Neon (https://neon.tech)
From 14c794d9699f7b9f27d1a4ec026f748c6b7d8853 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Tue, 30 Jan 2024 15:40:57 -0600
Subject: [PATCH v11 1/2] Expose PQsocketPoll for frontends

PQsocketPoll should help with state machine processing when connecting
to a database asynchronously via PQconnectStart().
---
 doc/src/sgml/libpq.sgml  | 40 +++-
 src/interfaces/libpq/exports.txt |  1 +
 src/interfaces/libpq/fe-misc.c   |  7 +++---
 src/interfaces/libpq/libpq-fe.h  |  4 
 4 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d3e87056f2c..e69feacfe6a 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -262,6 +262,41 @@ PGconn *PQsetdb(char *pghost,
  
 
 
+
+ PQsocketPollPQsocketPoll
+ 
+  
+   nonblocking connection
+   Poll a connection's underlying socket descriptor retrieved with .
+
+int PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);
+
+  
+
+  
+   This function sets up polling of a file descriptor. The underlying function is either
+   poll(2) or select(2), depending on platform
+   support. The primary use of this function is iterating through the connection sequence
+   described in the documentation of . If
+   forRead is specified, the function waits for the socket to be ready
+   for reading. If forWrite is specified, the function waits for the
+   socket to be ready for write. See POLLIN and POLLOUT
+   from poll(2), or readfds and
+   writefds from select(2) for more information. If
+   end_time is not -1, it specifies the time at which
+   this function should stop waiting for the condition to be met.
+  
+
+  
+   The function returns a value greater than 0 if the specified condition
+   is met, 0 if a timeout occurred, or -1 if an error
+   occurred. The error can be retrieved by checking the errno(3) value. In
+   the event forRead and forWrite are not set, the
+   function immediately returns a timeout condition.
+  
+ 
+
+
 
  PQconnectStartParamsPQconnectStartParams
  PQconnectStartPQconnectStart
@@ -358,7 +393,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn);
Loop thus: If PQconnectPoll(conn) last returned
PGRES_POLLING_READING, wait until the socket is ready to
read (as indicated by select(), poll(), or
-   similar system function).
+   similar system function).  Note that PQsocketPoll
+   can help reduce boilerplate by abstracting the setup of
+   select(2) or poll(2) if it is
+   available on your system.
Then call PQconnectPoll(conn) again.
Conversely, if PQconnectPoll(conn) last returned
 

Silence Meson warning on HEAD

2024-04-01 Thread Tristan Partin
Ubuntu 18.04.5   0.45.1 
Ubuntu 20.04.6   0.53.2 
Ubuntu 22.04.1   0.61.2 
Ubuntu 22.04.3   0.61.2 
Windows 10 / Cygwin64 3.4.9  1.2.3  
Windows / Msys 12.2.01.4.0  x   
Windows Server 2016  1.3.1  x   
Windows Server 2019  1.0.1  x   


Some notes:

- The minimum Meson version we test against is 1.0.1, on drongo
- For any  RHEL 7 derivatives, you see, I took the EPEL Meson version
- Debian 10 requires the backports repository to be enabled
- OmniOS / illumos r151038 has Python 3.9, so could fetch Meson over 
 pypi since it isn't packaged
- Missing information on OpenBSD 6.8, 6.9, and 7.3, but they should have 
 at least 0.55.0 available based on release dates
- The missing macOS versions could definitely run 0.55 either through 
 Homebrew, Macports, or PyPI

- Other systems didn't have easily publicly available information

At the top of the root meson.build file, there is this comment:


# We want < 0.56 for python 3.5 compatibility on old platforms. EPEL for
# RHEL 7 has 0.55. < 0.54 would require replacing some uses of the fs
# module, < 0.53 all uses of fs. So far there's no need to go to >=0.56.


Seems like as good an opportunity to bump Meson to 0.56 for Postgres 17, 
which I have found to exist in the EPEL for RHEL 7. I don't know what 
version exists in the base repo. Perhaps it is 0.55, which would still 
get rid of the aforementioned warning.


Committer, please pick your patch :).

[0]: 
https://www.postgresql.org/message-id/40e80f77-a294-4f29-a16f-e21bc7bc7...@eisentraut.org

--
Tristan Partin
Neon (https://neon.tech)
diff --git a/meson.build b/meson.build
index 18b5be842e3..87437960bc3 100644
--- a/meson.build
+++ b/meson.build
@@ -3419,7 +3419,10 @@ alias_target('pgdist', [tar_gz, tar_bz2])
 # But not if we are in a subproject, in case the parent project wants to
 # create a dist using the standard Meson command.
 if not meson.is_subproject()
-  meson.add_dist_script(perl, '-e', 'exit 1')
+  # We can only pass the identifier perl here when we depend on >= 0.55
+  if meson.version().version_compare('>=0.55')
+meson.add_dist_script(perl, '-e', 'exit 1')
+  endif
 endif
 
 
diff --git a/meson.build b/meson.build
index 18b5be842e3..80b412f741b 100644
--- a/meson.build
+++ b/meson.build
@@ -3419,7 +3419,7 @@ alias_target('pgdist', [tar_gz, tar_bz2])
 # But not if we are in a subproject, in case the parent project wants to
 # create a dist using the standard Meson command.
 if not meson.is_subproject()
-  meson.add_dist_script(perl, '-e', 'exit 1')
+  meson.add_dist_script(perl.path(), '-e', 'exit 1')
 endif
 
 
diff --git a/contrib/basebackup_to_shell/meson.build b/contrib/basebackup_to_shell/meson.build
index 8175c9b5c5b..201e69708a7 100644
--- a/contrib/basebackup_to_shell/meson.build
+++ b/contrib/basebackup_to_shell/meson.build
@@ -24,7 +24,7 @@ tests += {
 'tests': [
   't/001_basic.pl',
 ],
-'env': {'GZIP_PROGRAM': gzip.found() ? gzip.path() : '',
-'TAR': tar.found() ? tar.path() : '' },
+'env': {'GZIP_PROGRAM': gzip.found() ? gzip.full_path() : '',
+'TAR': tar.found() ? tar.full_path() : '' },
   },
 }
diff --git a/meson.build b/meson.build
index 18b5be842e3..e11df3ec002 100644
--- a/meson.build
+++ b/meson.build
@@ -1059,7 +1059,7 @@ pyopt = get_option('plpython')
 python3_dep = not_found_dep
 if not pyopt.disabled()
   pm = import('python')
-  python3_inst = pm.find_installation(python.path(), required: pyopt)
+  python3_inst = pm.find_installation(python.full_path(), required: pyopt)
   if python3_inst.found()
 python3_dep = python3_inst.dependency(embed: true, required: pyopt)
 # Remove this check after we depend on Meson >= 1.1.0
@@ -2723,11 +2723,11 @@ if host_system == 'windows'
 
   if cc.get_argument_syntax() == 'msvc'
 rc = find_program('rc', required: true)
-rcgen_base_args += ['--rc', rc.path()]
+rcgen_base_args += ['--rc', rc.full_path()]
 rcgen_outputs = ['@BASENAME@.rc', '@BASENAME@.res']
   else
 windres = find_program('windres', required: true)
-rcgen_base_args += ['--windres', windres.path()]
+rcgen_base_args += ['--windres', windres.full_path()]
 rcgen_outputs = ['@BASENAME@.rc', '@BASENAME@.obj']
   endif
 
@@ -3273,7 +3273,7 @@ foreach test_dir : tests
   endif
 
   test_co

Re: psql not responding to SIGINT upon db reconnection

2024-04-02 Thread Tristan Partin

On Tue Apr 2, 2024 at 9:32 AM CDT, Robert Haas wrote:

On Mon, Apr 1, 2024 at 12:04 PM Tristan Partin  wrote:
> > This sentence seems a bit contorted to me, like maybe Yoda wrote it.
>
> Incorporated feedback, I have :).

Committed it, I did. My thanks for working on this issue, I extend.


Thanks to everyone for their reviews! Patch is in a much better place 
than when it started.


--
Tristan Partin
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

2024-04-03 Thread Tristan Partin

On Wed Apr 3, 2024 at 8:32 AM CDT, Jelte Fennema-Nio wrote:

On Tue, 2 Apr 2024 at 16:33, Robert Haas  wrote:
> Committed it, I did. My thanks for working on this issue, I extend.

Looking at the committed version of this patch, the pg_unreachable
calls seemed weird to me. 1 is actually incorrect, thus possibly
resulting in undefined behaviour. And for the other call an imho
better fix would be to remove the now 21 year unused enum variant,
instead of introducing its only reference in the whole codebase.

Attached are two trivial patches, feel free to remove both of the
pg_unreachable calls.


Patches look good. Sorry about causing you to do some work.

--
Tristan Partin
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

2024-04-03 Thread Tristan Partin

On Wed Apr 3, 2024 at 9:50 AM CDT, Jelte Fennema-Nio wrote:

On Wed, 3 Apr 2024 at 16:31, Tom Lane  wrote:
> If we do the latter, we will almost certainly get pushback from
> distros who check for library ABI breaks.  I fear the comment
> suggesting that we could remove it someday is too optimistic.

Alright, changed patch 0002 to keep the variant. But remove it from
the recently added switch/case. And also updated the comment to remove
the "for awhile".


Removing from the switch statement causes a warning:


[1/2] Compiling C object src/bin/psql/psql.p/command.c.o
../src/bin/psql/command.c: In function ‘wait_until_connected’:
../src/bin/psql/command.c:3803:17: warning: enumeration value 
‘PGRES_POLLING_ACTIVE’ not handled in switch [-Wswitch]
 3803 | switch (PQconnectPoll(conn))
  |     ^~


--
Tristan Partin
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

2024-04-03 Thread Tristan Partin

On Wed Apr 3, 2024 at 10:05 AM CDT, Jelte Fennema-Nio wrote:

On Wed, 3 Apr 2024 at 16:55, Tristan Partin  wrote:
> Removing from the switch statement causes a warning:
>
> > [1/2] Compiling C object src/bin/psql/psql.p/command.c.o
> > ../src/bin/psql/command.c: In function ‘wait_until_connected’:
> > ../src/bin/psql/command.c:3803:17: warning: enumeration value 
‘PGRES_POLLING_ACTIVE’ not handled in switch [-Wswitch]
> >  3803 | switch (PQconnectPoll(conn))
> >   | ^~


Ofcourse... fixed now


I think patch 2 makes it worse. The value in -Wswitch is that when new 
enum variants are added, the developer knows the locations to update. 
Adding a default case makes -Wswitch pointless.


Patch 1 is still good. The comment change in patch 2 is good too!

--
Tristan Partin
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

2024-04-04 Thread Tristan Partin
Thanks Jelte and Robert for the extra effort to correct my mistake. 
I apologize. Bad copy-paste from a previous revision of the patch at 
some point.


--
Tristan Partin
Neon (https://neon.tech)




Re: make BuiltinTrancheNames less ugly

2024-02-12 Thread Tristan Partin
on's list of wait events.
  */
-extern const char *const IndividualLWLockNames[];  /* in lwlocknames.c */
-
 static const char *const BuiltinTrancheNames[] = {
+#include "lwlocknames.c"
[LWTRANCHE_XACT_BUFFER] = "XactBuffer",
[LWTRANCHE_COMMITTS_BUFFER] = "CommitTsBuffer",
[LWTRANCHE_SUBTRANS_BUFFER] = "SubtransBuffer",
@@ -738,11 +737,7 @@ LWLockReportWaitEnd(void)
 static const char *
 GetLWTrancheName(uint16 trancheId)
 {
-   /* Individual LWLock? */
-   if (trancheId < NUM_INDIVIDUAL_LWLOCKS)
-   return IndividualLWLockNames[trancheId];
-
-   /* Built-in tranche? */
+   /* Individual LWLock or built-in tranche? */
if (trancheId < LWTRANCHE_FIRST_USER_DEFINED)
return BuiltinTrancheNames[trancheId];
 
diff --git a/src/backend/storage/lmgr/meson.build b/src/backend/storage/lmgr/meson.build

index da32198f78..a12064ae8a 100644
--- a/src/backend/storage/lmgr/meson.build
+++ b/src/backend/storage/lmgr/meson.build
@@ -5,11 +5,19 @@ backend_sources += files(
   'deadlock.c',
   'lmgr.c',
   'lock.c',
-  'lwlock.c',
   'predicate.c',
   'proc.c',
   's_lock.c',
   'spin.c',
 )
 
-generated_backend_sources += lwlocknames[1]

+# this includes a .c file generated. Is there a better way?
+lwlock_source = files('lwlock.c')


I don't understand this comment. Could you explain it a bit more?


+
+lwlock_lib = static_library('lwlock',
+  lwlock_source,
+  dependencies: [backend_code],
+  include_directories: include_directories('../../../include/storage'),
+  kwargs: internal_lib_args,
+  )


Move the paren to the beginning of the line.


+backend_link_with += lwlock_lib


Earlier in the thread you had said this:


IMO it would be less ugly to have the origin file lwlocknames.txt be
not a text file but a .h with a macro that can be defined by
interested parties so that they can extract what they want from the
file, like PG_CMDTAG or PG_KEYWORD.  Using Perl for this seems dirty...


I really like this idea, and would definitely be more inclined to see 
a solution like this.


--
Tristan Partin
Neon (https://neon.tech)




Re: backend *.c #include cleanup (IWYU)

2024-02-12 Thread Tristan Partin

On Sat Feb 10, 2024 at 1:40 AM CST, Peter Eisentraut wrote:
I played with include-what-you-use (IWYU), "a tool for use with clang to 
analyze #includes in C and C++ source files".[0]  I came across this via 
clangd (the language server), because clangd (via the editor) kept 
suggesting a bunch of #includes to remove.  And I suppose it was right.


So as a test, I ran IWYU over the backend *.c files and removed all the 
#includes it suggested.  (Note that IWYU also suggests to *add* a bunch 
of #includes, in fact that is its main purpose; I didn't do this here.) 
In some cases, a more specific #include replaces another less specific 
one.  (To keep the patch from exploding in size, I ignored for now all 
the suggestions to replace catalog/pg_somecatalog.h with 
catalog/pg_somecatalog_d.h.)  This ended up with the attached patch, 
which has


  432 files changed, 233 insertions(+), 1023 deletions(-)

I tested this with various compilation options (assert, WAL_DEBUG, 
LOCK_DEBUG, different geqo variants, etc.) to make sure a header wasn't 
just used for some conditional code section.  Also, again, this patch 
touches just *.c files, so nothing declared from header files changes in 
hidden ways.


Also, as a small example, in src/backend/access/transam/rmgr.c you'll 
see some IWYU pragma annotations to handle a special case there.


The purpose of this patch is twofold: One, it's of course a nice 
cleanup.  Two, this is a test how well IWYU might work for us.  If we 
find either by human interpretation that a change doesn't make sense, or 
something breaks on some platform, then that would be useful feedback 
(perhaps to addressed by more pragma annotations or more test coverage).


(Interestingly, IWYU has been mentioned in src/tools/pginclude/README 
since 2012.  Has anyone else played with it?  Was it not mature enough 
back then?)


[0]: https://include-what-you-use.org/


I like this idea. This was something I tried to do but never finished in 
my last project. I have also been noticing the same issues from clangd. 
Having more automation around include patterns would be great! I think 
it would be good if there a Meson run_target()/Make .PHONY target that 
people could use to test too. Then, eventually the cfbot could pick this 
up.


--
Tristan Partin
Neon (https://neon.tech)




Re: make dist using git archive

2024-02-12 Thread Tristan Partin

On Sun Feb 11, 2024 at 5:09 PM CST, Peter Eisentraut wrote:
Small update: I noticed that on Windows (at least the one that is 
running the CI job), I need to use git -c core.autocrlf=false, otherwise 
git archive does line-ending conversion for the files it puts into the 
archive.  With this fix, all the archives produced by all the CI jobs 
across the different platforms match, except the .tar.gz archive from 
the Linux job, which I suspect suffers from an old git version.  We 
should get the Linux images updated to a newer Debian version soon 
anyway, so I think that issue will go away.


I think with this change, it is unlikely I will be able to upstream 
anything to Meson that would benefit Postgres here since setting this 
option seems project dependent.


--
Tristan Partin
Neon (https://neon.tech)




Re: make BuiltinTrancheNames less ugly

2024-03-01 Thread Tristan Partin

On Fri Mar 1, 2024 at 8:00 AM CST, Alvaro Herrera wrote:

On 2024-Feb-23, Heikki Linnakangas wrote:

> On 12/02/2024 19:01, Tristan Partin wrote:
> > On Wed Jan 24, 2024 at 8:09 AM CST, Alvaro Herrera wrote:
> > > IMO it would be less ugly to have the origin file lwlocknames.txt be
> > > not a text file but a .h with a macro that can be defined by
> > > interested parties so that they can extract what they want from the
> > > file, like PG_CMDTAG or PG_KEYWORD.  Using Perl for this seems dirty...
> > 
> > I really like this idea, and would definitely be more inclined to see

> > a solution like this.
> 
> +1 to that idea from me too. Seems pretty straightforward.


OK, here's a patch that does it.  I have not touched Meson yet.

I'm pretty disappointed of not being able to remove
generate-lwlocknames.pl (it now generates the .h, no longer the .c
file), but I can't find a way to do the equivalent #defines in CPP ...
it'd require invoking the C preprocessor twice.  Maybe an intermediate
.h file would solve the problem, but I have no idea how would that work
with Meson.  I guess I'll do it in Make and let somebody suggest a Meson
way.


I can help you with Meson if you get the Make implementation done.

--
Tristan Partin
Neon (https://neon.tech)




Re: Refactoring backend fork+exec code

2024-03-05 Thread Tristan Partin
  (errmsg("could not fork %s process: %m", 
PostmasterChildName(type;


I assume you are referring to the child name here?


XXX: We now have functions called AuxiliaryProcessInit() and
InitAuxiliaryProcess(). Confusing.


Based on my analysis, the *Init() is called in the Main functions, while 
Init*() is called before the Main functions. Maybe 
AuxiliaryProcessInit() could be renamed to AuxiliaryProcessStartup()? 
Rename the other to AuxiliaryProcessInit().


--
Tristan Partin
Neon (https://neon.tech)




Re: meson: Specify -Wformat as a common warning flag for extensions

2024-03-07 Thread Tristan Partin

On Sun Jan 21, 2024 at 11:11 PM CST, Sutou Kouhei wrote:

Hi,

I'm an extension developer. If I use PostgreSQL built with
Meson, I get the following warning:

cc1: warning: '-Wformat-security' ignored without '-Wformat' 
[-Wformat-security]

Because "pg_config --cflags" includes -Wformat-security but
doesn't include -Wformat.

Can we specify -Wformat as a common warning flag too? If we
do it, "pg_config --cflags" includes both of
-Wformat-security and -Wformat. So I don't get the warning.


The GCC documentation[0] says the following:

If -Wformat is specified, also warn about uses of format functions 
that represent possible security problems. At present, this warns 
about calls to printf and scanf functions where the format string is 
not a string literal and there are no format arguments, as in printf 
(foo);. This may be a security hole if the format string came from 
untrusted input and contains ‘%n’. (This is currently a subset of what 
-Wformat-nonliteral warns about, but in future warnings may be added 
to -Wformat-security that are not included in -Wformat-nonliteral.)


It sounds like a legitimate issue. I have confirmed the issue exists 
with a pg_config compiled with Meson. I can also confirm that this issue 
exists in the autotools build.


Here is a v2 of your patch which includes the fix for autotools. I will 
mark this "Ready for Committer" in the commitfest. Thanks!


[0]: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html

--
Tristan Partin
Neon (https://neon.tech)
From bce0f70f866bb0e3f8fb8eb14c05bbbdd27c51f2 Mon Sep 17 00:00:00 2001
From: Sutou Kouhei 
Date: Mon, 22 Jan 2024 13:51:58 +0900
Subject: [PATCH v2] Add -Wformat to common warning flags

We specify -Wformat-security as a common warning flag explicitly. GCC
requires -Wformat to be added for -Wformat-security to take effect. If
-Wformat-security is used without -Wformat, GCC shows the following
warning:

cc1: warning: '-Wformat-security' ignored without '-Wformat' [-Wformat-security]

Co-authored-by: Sutou Kouhei 
---
 configure| 92 
 configure.ac |  3 ++
 meson.build  |  2 ++
 3 files changed, 97 insertions(+)

diff --git a/configure b/configure
index 36feeafbb23..eafab247e1d 100755
--- a/configure
+++ b/configure
@@ -5985,6 +5985,98 @@ if test x"$pgac_cv_prog_CXX_cxxflags__Wshadow_compatible_local" = x"yes"; then
 fi
 
 
+  # -Wformat-security requires -Wformat, so check for it
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports -Wformat, for CFLAGS" >&5
+$as_echo_n "checking whether ${CC} supports -Wformat, for CFLAGS... " >&6; }
+if ${pgac_cv_prog_CC_cflags__Wformat+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CFLAGS=$CFLAGS
+pgac_save_CC=$CC
+CC=${CC}
+CFLAGS="${CFLAGS} -Wformat"
+ac_save_c_werror_flag=$ac_c_werror_flag
+ac_c_werror_flag=yes
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv_prog_CC_cflags__Wformat=yes
+else
+  pgac_cv_prog_CC_cflags__Wformat=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+ac_c_werror_flag=$ac_save_c_werror_flag
+CFLAGS="$pgac_save_CFLAGS"
+CC="$pgac_save_CC"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_prog_CC_cflags__Wformat" >&5
+$as_echo "$pgac_cv_prog_CC_cflags__Wformat" >&6; }
+if test x"$pgac_cv_prog_CC_cflags__Wformat" = x"yes"; then
+  CFLAGS="${CFLAGS} -Wformat"
+fi
+
+
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CXX} supports -Wformat, for CXXFLAGS" >&5
+$as_echo_n "checking whether ${CXX} supports -Wformat, for CXXFLAGS... " >&6; }
+if ${pgac_cv_prog_CXX_cxxflags__Wformat+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CXXFLAGS=$CXXFLAGS
+pgac_save_CXX=$CXX
+CXX=${CXX}
+CXXFLAGS="${CXXFLAGS} -Wformat"
+ac_save_cxx_werror_flag=$ac_cxx_werror_flag
+ac_cxx_werror_flag=yes
+ac_ext=cpp
+ac_cpp='$CXXCPP $CPPFLAGS'
+ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext >&5'
+ac_link='$CXX -o conftest$ac_exeext $CXXFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5'
+ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
+
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_cxx_try_compile "$LINENO"; then :
+  pgac_cv_prog_CXX_cxxflags__Wformat=yes
+else
+  pgac_cv_prog_CXX_cxxflags__Wformat=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+ac_ext=c
+ac_cpp='$CPP $CPPFLAG

Re: meson: Specify -Wformat as a common warning flag for extensions

2024-03-08 Thread Tristan Partin

On Fri Mar 8, 2024 at 12:32 AM CST, Michael Paquier wrote:

On Thu, Mar 07, 2024 at 11:39:39PM -0600, Tristan Partin wrote:
> It sounds like a legitimate issue. I have confirmed the issue exists with a
> pg_config compiled with Meson. I can also confirm that this issue exists in
> the autotools build.

First time I'm hearing about that, but I'll admit that I am cheating
because -Wformat is forced in my local builds for some time now.  I'm
failing to see the issue with meson and ./configure even if I remove
the switch, though, using a recent version of gcc at 13.2.0, but
perhaps Debian does something underground.  Are there version and/or
environment requirements to be aware of?

Forcing -Wformat implies more stuff that can be disabled with
-Wno-format-contains-nul, -Wno-format-extra-args, and
-Wno-format-zero-length, but the thing is that we're usually very
conservative with such additions in the scripts.  See also
8b6f5f25102f, done, I guess, as an answer to this thread:
https://www.postgresql.org/message-id/4D431505.9010002%40dunslane.net

A quick look at the past history of pgsql-hackers does not mention
that as a problem, either, but I may have missed something.


Ok, I figured this out. -Wall implies -Wformat=1. We set warning_level 
to 1 in the Meson project() call, which implies -Wall, and set -Wall in 
CFLAGS for autoconf. That's the reason we don't get issues building 
Postgres. A user making use of the pg_config --cflags option, as Sutou 
is, *will* run into the aforementioned issues, since we don't propogate 
-Wall into pg_config.


$ gcc $(pg_config --cflags) -E - < /dev/null > /dev/null
cc1: warning: ‘-Wformat-security’ ignored without ‘-Wformat’ 
[-Wformat-security]
$ gcc -Wall $(pg_config --cflags) -E - < /dev/null > /dev/null
(nothing printed)

--
Tristan Partin
Neon (https://neon.tech)




Re: meson: Specify -Wformat as a common warning flag for extensions

2024-03-12 Thread Tristan Partin

On Tue Mar 12, 2024 at 6:56 PM CDT, Sutou Kouhei wrote:

Hi,

In 
  "Re: meson: Specify -Wformat as a common warning flag for extensions" on Fri, 
08 Mar 2024 10:05:27 -0600,
  "Tristan Partin"  wrote:

> Ok, I figured this out. -Wall implies -Wformat=1. We set warning_level
> to 1 in the Meson project() call, which implies -Wall, and set -Wall
> in CFLAGS for autoconf. That's the reason we don't get issues building
> Postgres. A user making use of the pg_config --cflags option, as Sutou
> is, *will* run into the aforementioned issues, since we don't
> propogate -Wall into pg_config.
> 
> 	$ gcc $(pg_config --cflags) -E - < /dev/null > /dev/null

>cc1: warning: ‘-Wformat-security’ ignored without ‘-Wformat’
>[-Wformat-security]
>$ gcc -Wall $(pg_config --cflags) -E - < /dev/null > /dev/null
>(nothing printed)

Thanks for explaining this. You're right. This is the reason
why we don't need this for PostgreSQL itself but we need
this for PostgreSQL extensions. Sorry. I should have
explained this in the first e-mail...


What should we do to proceed this patch?


Perhaps adding some more clarification in the comments that I wrote.

-  # -Wformat-security requires -Wformat, so check for it
+  # -Wformat-secuirty requires -Wformat. We compile with -Wall in 
+  # Postgres, which includes -Wformat=1. -Wformat is shorthand for 
+  # -Wformat=1. The set of flags which includes -Wformat-security is 
+  # persisted into pg_config --cflags, which is commonly used by 
+  # PGXS-based extensions. The lack of -Wformat in the persisted flags
+  # will produce a warning on many GCC versions, so even though adding 
+  # -Wformat here is a no-op for Postgres, it silences other use cases.


That might be too long-winded though :).

--
Tristan Partin
Neon (https://neon.tech)




Remove a FIXME and unused variables in Meson

2024-03-13 Thread Tristan Partin

Two meson patches.

One of them adds version gates to two LLVM flags (-frwapv, 
-fno-strict-aliasing). I believe we moved the minimum LLVM version 
recently, so these might not be necessary, but maybe it helps for 
historictal reasons. If not, I'll just remove the comment in a different 
patch.


Second patch removes some unused variables. Were they analogous to 
things in autotools and the Meson portions haven't been added yet?


I was looking into adding LLVM JIT support to Meson since there is 
a TODO about it, but it wasn't clear what was missing except adding some 
variables into the PGXS Makefile.


--
Tristan Partin
Neon (https://neon.tech)




Re: Remove a FIXME and unused variables in Meson

2024-03-13 Thread Tristan Partin

On Thu Mar 14, 2024 at 12:15 AM CDT, Michael Paquier wrote:

On Thu, Mar 14, 2024 at 12:13:18AM -0500, Tristan Partin wrote:
> One of them adds version gates to two LLVM flags (-frwapv,
> -fno-strict-aliasing). I believe we moved the minimum LLVM version recently,
> so these might not be necessary, but maybe it helps for historictal reasons.
> If not, I'll just remove the comment in a different patch.
> 
> Second patch removes some unused variables. Were they analogous to things in

> autotools and the Meson portions haven't been added yet?
> 
> I was looking into adding LLVM JIT support to Meson since there is a TODO

> about it, but it wasn't clear what was missing except adding some variables
> into the PGXS Makefile.

It looks like you have forgotten to attach the patches.  :)


CLASSIC!

--
Tristan Partin
Neon (https://neon.tech)
From 615acd91d353defd7fbe136fd115515452d6d00b Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Thu, 14 Mar 2024 00:02:11 -0500
Subject: [PATCH v1 1/2] Protect adding llvm flags if found version is not
 enough

-fwrapv: https://github.com/llvm/llvm-project/commit/51924e517bd2d25faea6ef873db3c59ec4d09bf8
-fno-strict-aliasing: https://github.com/llvm/llvm-project/commit/10169b94cfe6838f881339f1944891f6d8451174
---
 src/backend/jit/llvm/meson.build | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/backend/jit/llvm/meson.build b/src/backend/jit/llvm/meson.build
index 41c759f73c5..0e1de7bc98e 100644
--- a/src/backend/jit/llvm/meson.build
+++ b/src/backend/jit/llvm/meson.build
@@ -58,8 +58,13 @@ else
 endif
 
 
-# XXX: Need to determine proper version of the function cflags for clang
-bitcode_cflags = ['-fno-strict-aliasing', '-fwrapv']
+bitcode_cflags = []
+if llvm.version().version_compare('>=2.9.0')
+  bitcode_cflags += ['-fno-strict-aliasing']
+endif
+if llvm.version().version_compare('>=2.8.0')
+  bitcode_cflags += ['-fwrapv']
+endif
 if llvm.version().version_compare('=15.0')
   bitcode_cflags += ['-Xclang', '-no-opaque-pointers']
 endif
-- 
Tristan Partin
Neon (https://neon.tech)

From 4e5541352cd582c7c7320d02c075cf7a95c6bfda Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Thu, 14 Mar 2024 00:05:38 -0500
Subject: [PATCH v1 2/2] Remove some unused variables in Meson

They weren't being used, so don't assign to them.
---
 meson.build | 5 -
 src/backend/meson.build | 1 -
 2 files changed, 6 deletions(-)

diff --git a/meson.build b/meson.build
index 85788f9dd8f..b0a45a7d834 100644
--- a/meson.build
+++ b/meson.build
@@ -202,7 +202,6 @@ if host_system == 'cygwin'
   dlsuffix = '.dll'
   mod_link_args_fmt = ['@0@']
   mod_link_with_name = 'lib@0@.exe.a'
-  mod_link_with_dir = 'libdir'
 
 elif host_system == 'darwin'
   dlsuffix = '.dylib'
@@ -212,7 +211,6 @@ elif host_system == 'darwin'
   export_fmt = '-Wl,-exported_symbols_list,@0@'
 
   mod_link_args_fmt = ['-bundle_loader', '@0@']
-  mod_link_with_dir = 'bindir'
   mod_link_with_name = '@0@'
 
   sysroot_args = [files('src/tools/darwin_sysroot'), get_option('darwin_sysroot')]
@@ -269,7 +267,6 @@ elif host_system == 'windows'
 mod_link_with_name = 'lib@0@.exe.a'
   endif
   mod_link_args_fmt = ['@0@']
-  mod_link_with_dir = 'libdir'
 
   shmem_kind = 'win32'
   sema_kind = 'win32'
@@ -488,10 +485,8 @@ dir_pgxs = dir_lib_pkg / 'pgxs'
 dir_include = get_option('includedir')
 
 dir_include_pkg = dir_include
-dir_include_pkg_rel = ''
 if not (dir_prefix_contains_pg or dir_include_pkg.contains('pgsql') or dir_include_pkg.contains('postgres'))
   dir_include_pkg = dir_include_pkg / pkg
-  dir_include_pkg_rel = pkg
 endif
 
 dir_man = get_option('mandir')
diff --git a/src/backend/meson.build b/src/backend/meson.build
index 436c04af080..d2c10d1abd7 100644
--- a/src/backend/meson.build
+++ b/src/backend/meson.build
@@ -154,7 +154,6 @@ if mod_link_args_fmt.length() > 0
 
   name = mod_link_with_name.format('postgres')
   link_with_uninst = meson.current_build_dir() / name
-  link_with_inst = '${@0@}/@1@'.format(mod_link_with_dir, name)
 
   foreach el : mod_link_args_fmt
 pg_mod_link_args += el.format(link_with_uninst)
-- 
Tristan Partin
Neon (https://neon.tech)



Re: controlling meson's parallelism (and some whining)

2023-10-20 Thread Tristan Partin

On Thu Oct 19, 2023 at 12:44 PM CDT, Robert Haas wrote:
The obvious fix to this is to just tell 'meson test' how many 
processes I'd like it to run. I thought maybe I could just do 'meson 
-j8 test' but that does not work, because the option is 
--num-processes and has no short version. Even typing -j8 every time 
would be kind of annoying; typing --num-processes 8 every time is 
ridiculously verbose.


I submitted a patch[0] to Meson to add -j.

[0]: https://github.com/mesonbuild/meson/pull/12403

--
Tristan Partin
Neon (https://neon.tech)




Re: controlling meson's parallelism (and some whining)

2023-10-20 Thread Tristan Partin

On Fri Oct 20, 2023 at 11:22 AM CDT, Tristan Partin wrote:

On Thu Oct 19, 2023 at 12:44 PM CDT, Robert Haas wrote:
> The obvious fix to this is to just tell 'meson test' how many 
> processes I'd like it to run. I thought maybe I could just do 'meson 
> -j8 test' but that does not work, because the option is 
> --num-processes and has no short version. Even typing -j8 every time 
> would be kind of annoying; typing --num-processes 8 every time is 
> ridiculously verbose.


I submitted a patch[0] to Meson to add -j.

[0]: https://github.com/mesonbuild/meson/pull/12403


You will see this in the 1.3.0 release which will be happening soon™️.

--
Tristan Partin
Neon (https://neon.tech)




Re: Explicitly skip TAP tests under Meson if disabled

2023-10-31 Thread Tristan Partin

Hi Peter,

You may find value in this Meson PR[0] adding a skip keyword argument to 
Meson's test() function. From what I understand of the PR and your 
issue, they seem related. If you could provide a comment describing why 
this is valuable to you, it would be good to help the Meson 
maintainers understand the use case better.


Thanks!

[0]: https://github.com/mesonbuild/meson/pull/12362

--
Tristan Partin
Neon (https://neon.tech)




Use thread-safe locale APIs

2023-10-31 Thread Tristan Partin
Postgres has been bitten by a few locale-related bugs, most recently via 
libperl[0]. I had previously submitted this patchset in the bug thread 
for the aforementioned bug, but here is a standalone submission for the 
purposes of an eventual commitfest submission, and to get discussion 
going. I was also flubbing up the commitfest bot with my patches. Sorry 
Joe! I feel fairly good about the patch, but I think I need some more 
testing and feedback. Localization is such a fickle beast.


I did leave one TODO because I need some input:

/* TODO: This does not handle "" as the locale */

check_locale() takes a canonname argument, which the caller expects to 
be the "canonical name" of the locale the caller passed in. The 
setlocale() man page is not very explicit about under what conditions 
the return value is different from the input string, and I haven't found 
much on the internet. Best I can tell is that the empty string is the 
only input value that differs from the output value of setlocale(). If 
that's the case, on Postmaster startup, I can query setlocale() for what 
the empty string canonicalizes to for all the locale categories we care 
about, and save them off. The other solution to the problem would be to 
find the equivalent API in the uselocale() family of functions, but I am 
under the impression that such an API doesn't exist given I haven't 
found it yet.


Also, should we just remove HAVE_USELOCALE? It seems like Windows is the 
only platform that doesn't support it. Then we can just use _WIN32 
instead.


I do not think this should be backpatched. Please see Joe's patch in the 
bug thread as a way to fix the libperl bug on pre-17 versions.


[0]: 
https://www.postgresql.org/message-id/17946-3e84cb577e9551c3%40postgresql.org

--
Tristan Partin
Neon (https://neon.tech)
From d336d84cf60b22147e1234260b3199a52e9863e3 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Fri, 30 Jun 2023 09:31:04 -0500
Subject: [PATCH v1 1/3] Skip checking for uselocale on Windows

Windows doesn't have uselocale, so skip it.
---
 meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index 2d516c8f37..aeaf75d45a 100644
--- a/meson.build
+++ b/meson.build
@@ -2454,7 +2454,7 @@ func_checks = [
   ['strsignal'],
   ['sync_file_range'],
   ['syncfs'],
-  ['uselocale'],
+  ['uselocale', {'skip': host_system == 'windows'}],
   ['wcstombs_l'],
 ]
 
-- 
Tristan Partin
Neon (https://neon.tech)

From 557870f1846990246b80cafdf3e06349302a38a5 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Fri, 30 Jun 2023 11:13:29 -0500
Subject: [PATCH v1 2/3] Add locale_is_c function

In some places throughout the codebase, there are string comparisons for
locales matching C or POSIX. Encapsulate this logic into a single
function and use it.
---
 src/backend/utils/adt/pg_locale.c | 39 ++-
 src/backend/utils/init/postinit.c |  4 +---
 src/backend/utils/mb/mbutils.c|  5 ++--
 src/include/utils/pg_locale.h |  1 +
 4 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index d5003da417..61d5c5dc4c 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -195,6 +195,23 @@ wcstombs_l(char *dest, const wchar_t *src, size_t n, locale_t loc)
 }
 #endif
 
+
+/*
+ * Check if a locale is the C locale
+ *
+ * POSIX is an alias for C. Passing ingore_case as true will use
+ * pg_strcasecmp() instead of strcmp().
+ */
+bool
+locale_is_c(const char *locale, bool ignore_case)
+{
+	if (!ignore_case)
+		return strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0;
+
+	return pg_strcasecmp(locale, "C") == 0 || pg_strcasecmp(locale, "POSIX") == 0;
+}
+
+
 /*
  * pg_perm_setlocale
  *
@@ -1280,10 +1297,8 @@ lookup_collation_cache(Oid collation, bool set_flags)
 			datum = SysCacheGetAttrNotNull(COLLOID, tp, Anum_pg_collation_collctype);
 			collctype = TextDatumGetCString(datum);
 
-			cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
-		 (strcmp(collcollate, "POSIX") == 0));
-			cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
-	   (strcmp(collctype, "POSIX") == 0));
+			cache_entry->collate_is_c = locale_is_c(collcollate, false);
+			cache_entry->ctype_is_c = locale_is_c(collctype, false);
 		}
 		else
 		{
@@ -1331,12 +1346,8 @@ lc_collate_is_c(Oid collation)
 		if (!localeptr)
 			elog(ERROR, "invalid LC_COLLATE setting");
 
-		if (strcmp(localeptr, "C") == 0)
-			result = true;
-		else if (strcmp(localeptr, "POSIX") == 0)
-			result = true;
-		else
-			result = false;
+		result = locale_is_c(localeptr, false);
+
 		return (bool) result;
 	}
 
@@ -13

Re: Use thread-safe locale APIs

2023-10-31 Thread Tristan Partin
Please discard this second thread. My mail client seems to have done 
something very wrong.


--
Tristan Partin
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

2023-11-06 Thread Tristan Partin

On Thu Nov 2, 2023 at 4:03 AM CDT, Shlok Kyal wrote:

Hi,

> That sounds like a much better solution. Attached you will find a v4
> that implements your suggestion. Please let me know if there is
> something that I missed. I can confirm that the patch works.
>
> $ ./build/src/bin/psql/psql -h pg.neon.tech
> NOTICE:  Welcome to Neon!
> Authenticate by visiting:
> https://console.neon.tech/psql_session/xxx
>
>
> NOTICE:  Connecting to database.
> psql (17devel, server 15.3)
> SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, 
compression: off)
> Type "help" for help.
>
> tristan957=> \c
> NOTICE:  Welcome to Neon!
> Authenticate by visiting:
> https://console.neon.tech/psql_session/yyy
>
>
> ^Cconnection to server at "pg.neon.tech" (3.18.6.96), port 5432 
failed:
> Previous connection kept
> tristan957=>

I went through CFbot and found out that some tests are timing out.
Links:
https://cirrus-ci.com/task/6735437444677632
https://cirrus-ci.com/task/4536414189125632
https://cirrus-ci.com/task/5046587584413696
https://cirrus-ci.com/task/6172487491256320
https://cirrus-ci.com/task/5609537537835008

Some tests which are timing out are as follows:
[00:48:49.243] Summary of Failures:
[00:48:49.243]
[00:48:49.243] 4/270 postgresql:regress / regress/regress TIMEOUT 1000.01s
[00:48:49.243] 6/270 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade
TIMEOUT 1000.02s
[00:48:49.243] 34/270 postgresql:recovery /
recovery/027_stream_regress TIMEOUT 1000.02s
[00:48:49.243] 48/270 postgresql:plpgsql / plpgsql/regress TIMEOUT 1000.02s
[00:48:49.243] 49/270 postgresql:plperl / plperl/regress TIMEOUT 1000.01s
[00:48:49.243] 61/270 postgresql:dblink / dblink/regress TIMEOUT 1000.03s
[00:48:49.243] 89/270 postgresql:postgres_fdw / postgres_fdw/regress
TIMEOUT 1000.01s
[00:48:49.243] 93/270 postgresql:test_decoding / test_decoding/regress
TIMEOUT 1000.02s
[00:48:49.243] 110/270 postgresql:test_extensions /
test_extensions/regress TIMEOUT 1000.03s
[00:48:49.243] 123/270 postgresql:unsafe_tests / unsafe_tests/regress
TIMEOUT 1000.02s
[00:48:49.243] 152/270 postgresql:pg_dump / pg_dump/010_dump_connstr
TIMEOUT 1000.03s

Just want to make sure you are aware of the issue.


Hi Shlok!

Thanks for taking a look. I will check these failures out to see if 
I can reproduce.


--
Tristan Partin
Neon (https://neon.tech)




Fix use of openssl.path() if openssl isn't found

2023-11-07 Thread Tristan Partin
Found this issue during my Fedora 39 upgrade. Tested that uninstalling 
openssl still allows the various ssl tests to run and succeed.


--
Tristan Partin
Neon (https://neon.tech)
From d684d6fc1546335804d2ed82ff909991965a61a8 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Tue, 7 Nov 2023 15:59:04 -0600
Subject: [PATCH v1] Fix use of openssl.path() if openssl isn't found

openssl(1) is an optional dependency of the Postgres Meson build, but
was inadvertantly required when defining some SSL tests.
---
 src/test/ssl/meson.build | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/test/ssl/meson.build b/src/test/ssl/meson.build
index 4cda81f3bc..c3ffcaa032 100644
--- a/src/test/ssl/meson.build
+++ b/src/test/ssl/meson.build
@@ -1,5 +1,10 @@
 # Copyright (c) 2022-2023, PostgreSQL Global Development Group
 
+openssl_path = ''
+if openssl.found()
+  openssl_path = openssl.path()
+endif
+
 tests += {
   'name': 'ssl',
   'sd': meson.current_source_dir(),
@@ -7,7 +12,7 @@ tests += {
   'tap': {
 'env': {
   'with_ssl': ssl_library,
-  'OPENSSL': openssl.path(),
+  'OPENSSL': openssl_path,
 },
 'tests': [
   't/001_ssltests.pl',
-- 
Tristan Partin
Neon (https://neon.tech)



Re: Fix use of openssl.path() if openssl isn't found

2023-11-07 Thread Tristan Partin

On Tue Nov 7, 2023 at 11:53 PM CST, Michael Paquier wrote:

On Tue, Nov 07, 2023 at 04:06:56PM -0600, Tristan Partin wrote:
> Found this issue during my Fedora 39 upgrade. Tested that uninstalling
> openssl still allows the various ssl tests to run and succeed.

Good catch.  You are right that this is inconsistent with what we
expect in the test.

> +openssl_path = ''
> +if openssl.found()
> +  openssl_path = openssl.path()
> +endif
> +
>  tests += {
>'name': 'ssl',
>'sd': meson.current_source_dir(),
> @@ -7,7 +12,7 @@ tests += {
>'tap': {
>  'env': {
>'with_ssl': ssl_library,
> -  'OPENSSL': openssl.path(),
> +  'OPENSSL': openssl_path,
>  },
>  'tests': [
>'t/001_ssltests.pl',

Okay, that's a nit and it leads to the same result, but why not using
the same one-liner style like all the other meson.build files that
rely on optional commands?  See pg_verifybackup, pg_dump, etc.  That
would be more consistent.


Because I forgot there were ternary statements in Meson :). Thanks for 
the review. Here is v2.


--
Tristan Partin
Neon (https://neon.tech)
From 5fc0460b0b85b8f04976182c0f6ec650c40df833 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Tue, 7 Nov 2023 15:59:04 -0600
Subject: [PATCH v2] Fix use of openssl.path() if openssl isn't found

openssl(1) is an optional dependency of the Postgres Meson build, but
was inadvertantly required when defining some SSL tests.
---
 src/test/ssl/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/test/ssl/meson.build b/src/test/ssl/meson.build
index 4cda81f3bc..ed83c438ef 100644
--- a/src/test/ssl/meson.build
+++ b/src/test/ssl/meson.build
@@ -7,7 +7,7 @@ tests += {
   'tap': {
 'env': {
   'with_ssl': ssl_library,
-  'OPENSSL': openssl.path(),
+  'OPENSSL': openssl.found() ? openssl.path : '',
 },
 'tests': [
   't/001_ssltests.pl',
-- 
Tristan Partin
Neon (https://neon.tech)



Re: meson documentation build open issues

2023-11-08 Thread Tristan Partin

Looks good to me. Thanks for finding this.

--
Tristan Partin
Neon (https://neon.tech)




Re: Fix use of openssl.path() if openssl isn't found

2023-11-08 Thread Tristan Partin

On Wed Nov 8, 2023 at 2:31 AM CST, Michael Paquier wrote:

On Wed, Nov 08, 2023 at 12:07:49AM -0600, Tristan Partin wrote:
>'with_ssl': ssl_library,
> -  'OPENSSL': openssl.path(),
> +  'OPENSSL': openssl.found() ? openssl.path : '',

Except that this was incorrect.  I've fixed the grammar and applied
that down to 16.


Coding at 12 in the morning is never conducive to coherent thought :). 
Thanks. Sorry for the trouble.


--
Tristan Partin
Neon (https://neon.tech)




Fix some memory leaks in ecpg.addons

2023-11-08 Thread Tristan Partin
clang and gcc both now support -fsanitize=address,undefined. These are 
really useful to me personally when trying to debug issues. 
Unfortunately ecpg code has a ton of memory leaks, which makes builds 
really painful. It would be great to fix all of them, but I don't have 
the patience to try to read flex/bison code. Here are two memory leak 
fixes in any case.


--
Tristan Partin
Neon (https://neon.tech)
From af69e1711e87ae96d75814fc83a8588a4bdf7cac Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Wed, 8 Nov 2023 10:53:02 -0600
Subject: [PATCH v1] Patch two memory leaks in ecpg.addons

make_str() allocates memory, but we were never releasing it. There are
more issues in this code, but these were just two easy ones to fix.
---
 src/interfaces/ecpg/preproc/ecpg.addons | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/interfaces/ecpg/preproc/ecpg.addons b/src/interfaces/ecpg/preproc/ecpg.addons
index e94da2a3f8..dd1ea39c6e 100644
--- a/src/interfaces/ecpg/preproc/ecpg.addons
+++ b/src/interfaces/ecpg/preproc/ecpg.addons
@@ -51,6 +51,8 @@ ECPG: stmtExecuteStmt block
 str[strlen(str) - 1] = '\0';
 sprintf(length, "%zu", strlen(str));
 add_variable_to_tail(&argsinsert, new_variable(str, ECPGmake_simple_type(ECPGt_const, length, 0), 0), &no_indicator);
+
+free(str);
 			}
 			output_statement(cat_str(3, mm_strdup("execute"), mm_strdup("$0"), $1.type), 0, ECPGst_exec_with_exprlist);
 		}
@@ -79,6 +81,8 @@ ECPG: stmtPrepareStmt block
 str[strlen(str) - 1] = '\0';
 sprintf(length, "%zu", strlen(str));
 add_variable_to_tail(&argsinsert, new_variable(str, ECPGmake_simple_type(ECPGt_const, length, 0), 0), &no_indicator);
+
+free(str);
 			}
 			output_statement(cat_str(5, mm_strdup("prepare"), mm_strdup("$0"), $1.type, mm_strdup("as"), $1.stmt), 0, ECPGst_prepare);
 		}
-- 
Tristan Partin
Neon (https://neon.tech)



Re: Fix some memory leaks in ecpg.addons

2023-11-08 Thread Tristan Partin

On Wed Nov 8, 2023 at 11:18 AM CST, Michael Meskes wrote:

Am Mittwoch, dem 08.11.2023 um 12:07 -0500 schrieb Tom Lane:
> "Tristan Partin"  writes:
> > clang and gcc both now support -fsanitize=address,undefined. These
> > are 
> > really useful to me personally when trying to debug issues. 
> > Unfortunately ecpg code has a ton of memory leaks, which makes
> > builds 
> > really painful. It would be great to fix all of them, but I don't
> > have 
> > the patience to try to read flex/bison code. Here are two memory
> > leak 
> > fixes in any case.
> 
> I'm kind of failing to see the point.  As you say, the ecpg

> preprocessor leaks memory like there's no tomorrow.  But given its
> usage (process one source file and exit) I'm not sure that is worth
> much effort to fix.  And what does it buy to fix just two spots?

Agreed, it's not exactly uncommon for tools like ecpg to not worry
about memory. After all it gets freed when the program ends.


In the default configuration of AddressSanitizer, I can't even complete 
a full build of Postgres.


meson setup build -Db_sanitize=address
ninja -C build
[1677/1855] Generating 
src/interfaces/ecpg/test/compat_informix/rfmtlong.c with a custom command
	FAILED: src/interfaces/ecpg/test/compat_informix/rfmtlong.c 
	/home/tristan957/Projects/work/postgresql/build/src/interfaces/ecpg/preproc/ecpg --regression -I../src/interfaces/ecpg/test/compat_informix -I../src/interfaces/ecpg/include/ -C INFORMIX -o src/interfaces/ecpg/test/compat_informix/rfmtlong.c ../src/interfaces/ecpg/test/compat_informix/rfmtlong.pgc


=
==114881==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 5 byte(s) in 1 object(s) allocated from:
#0 0x7f88c34814a8 in strdup (/lib64/libasan.so.8+0x814a8) (BuildId: 
6f17f87dc4c1aa9f9dde7c4856604c3a25ba4872)
#1 0x4cfd93 in get_progname ../src/port/path.c:589
#2 0x4b6dae in main ../src/interfaces/ecpg/preproc/ecpg.c:137
#3 0x7f88c3246149 in __libc_start_call_main 
(/lib64/libc.so.6+0x28149) (BuildId: 651b2bed7ecaf18098a63b8f10299821749766e6)
#4 0x7f88c324620a in __libc_start_main_impl 
(/lib64/libc.so.6+0x2820a) (BuildId: 651b2bed7ecaf18098a63b8f10299821749766e6)
#5 0x402664 in _start 
(/home/tristan957/Projects/work/postgresql/build/src/interfaces/ecpg/preproc/ecpg+0x402664)
 (BuildId: fab06f774e305cbe628e03cdc22d935f7bb70a76)

SUMMARY: AddressSanitizer: 5 byte(s) leaked in 1 allocation(s).
ninja: build stopped: subcommand failed.

Are people using some suppression file or setting ASAN_OPTIONS to 
something?


Here is a patch with a better solution.

--
Tristan Partin
Neon (https://neon.tech)
From da1bce1d68a55d8f00093a1aa1b2dfe913bd4549 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Wed, 8 Nov 2023 11:35:48 -0600
Subject: [PATCH v1] Silence AddressSanitizer when creating custom targets with
 ecpg

ecpg leaks like no tomorrow and will cause builds to fail.
---
 src/interfaces/ecpg/test/compat_oracle/meson.build | 12 
 src/interfaces/ecpg/test/meson.build   |  9 +
 2 files changed, 21 insertions(+)

diff --git a/src/interfaces/ecpg/test/compat_oracle/meson.build b/src/interfaces/ecpg/test/compat_oracle/meson.build
index f07d9859d1..3aff426fec 100644
--- a/src/interfaces/ecpg/test/compat_oracle/meson.build
+++ b/src/interfaces/ecpg/test/compat_oracle/meson.build
@@ -4,6 +4,17 @@ pgc_files = [
   'char_array',
 ]
 
+exe_preproc_kw = {}
+
+# ecpg leaks memory like no tomorrow; silence AddressSanitizer
+if get_option('b_sanitize').contains('address')
+  exe_preproc_kw += {
+'env': {
+  'ASAN_OPTIONS': 'detect_leaks=0'
+}
+  }
+endif
+
 foreach pgc_file : pgc_files
   exe_input = custom_target('@0@.c'.format(pgc_file),
 input: '@0@.pgc'.format(pgc_file),
@@ -13,6 +24,7 @@ foreach pgc_file : pgc_files
   ecpg_preproc_test_command_end,
 install: false,
 build_by_default: false,
+kwargs: exe_preproc_kw,
   )
 
   ecpg_test_dependencies += executable(pgc_file,
diff --git a/src/interfaces/ecpg/test/meson.build b/src/interfaces/ecpg/test/meson.build
index b7a3fb4e0e..621e76bd88 100644
--- a/src/interfaces/ecpg/test/meson.build
+++ b/src/interfaces/ecpg/test/meson.build
@@ -40,6 +40,15 @@ ecpg_preproc_kw = {
   'build_by_default': false,
 }
 
+# ecpg leaks memory like no tomorrow; silence AddressSanitizer
+if get_option('b_sanitize').contains('address')
+  ecpg_preproc_kw += {
+'env': {
+  'ASAN_OPTIONS': 'detect_leaks=0'
+}
+  }
+endif
+
 ecpg_preproc_test_command_start = [
   ecpg_exe,
   '--regression',
-- 
Tristan Partin
Neon (https://neon.tech)



Re: Failure during Building Postgres in Windows with Meson

2023-11-09 Thread Tristan Partin

Can you try with Meson v1.2.3?

--
Tristan Partin
Neon (https://neon.tech)




Re: Failure during Building Postgres in Windows with Meson

2023-11-09 Thread Tristan Partin

On Thu Nov 9, 2023 at 9:31 AM CST, Nazir Bilal Yavuz wrote:

Hi,

On Thu, 9 Nov 2023 at 18:27, Tristan Partin  wrote:
>
> Can you try with Meson v1.2.3?

I tried with Meson v1.2.3 and upstream, both failed with the same error.


Please open a bug in the Meson repository which also mentions the last 
known working version. I wonder what versions of Meson we use in the 
build farm.


--
Tristan Partin
Neon (https://neon.tech)




Re: Failure during Building Postgres in Windows with Meson

2023-11-10 Thread Tristan Partin

On Thu Nov 9, 2023 at 9:31 AM CST, Nazir Bilal Yavuz wrote:

Hi,

On Thu, 9 Nov 2023 at 18:27, Tristan Partin  wrote:
>
> Can you try with Meson v1.2.3?

I tried with Meson v1.2.3 and upstream, both failed with the same error.


An employee at Collabora produced a fix[0]. It might still be worthwhile 
however to see why it happens in one shell and not the other.


[0]: https://github.com/mesonbuild/meson/pull/12498

--
Tristan Partin
Neon (https://neon.tech)




Re: Fix some memory leaks in ecpg.addons

2023-11-15 Thread Tristan Partin

On Wed Nov 8, 2023 at 11:52 AM CST, Tom Lane wrote:

"Tristan Partin"  writes:
> On Wed Nov 8, 2023 at 11:18 AM CST, Michael Meskes wrote:
>> Agreed, it's not exactly uncommon for tools like ecpg to not worry
>> about memory. After all it gets freed when the program ends.

> In the default configuration of AddressSanitizer, I can't even complete 
> a full build of Postgres.


Why is the meson stuff building ecpg test cases as part of the core build?
That seems wrong for a number of reasons, not only that we don't hold
that code to the same standards as the core server.


After looking into this a tiny bit more, we are building the 
dependencies of the ecpg tests.



foreach pgc_file : pgc_files
  exe_input = custom_target('@0@.c'.format(pgc_file),
input: '@0@.pgc'.format(pgc_file),
output: '@BASENAME@.c',
command: ecpg_preproc_test_command_start +
  ['-C', 'ORACLE',] +
  ecpg_preproc_test_command_end,
install: false,
build_by_default: false,
kwargs: exe_preproc_kw,
  )

  ecpg_test_dependencies += executable(pgc_file,
exe_input,
kwargs: ecpg_test_exec_kw,
  )
endforeach


This is the pattern that we have in all the ecpg/test/*/meson.build 
files. That ecpg_test_dependencies variable is then used in the actual 
ecpg tests:



tests += {
  'name': 'ecpg',
  'sd': meson.current_source_dir(),
  'bd': meson.current_build_dir(),
  'ecpg': {
'expecteddir': meson.current_source_dir(),
'inputdir': meson.current_build_dir(),
'schedule': ecpg_test_files,
'sql': [
  'sql/twophase',
],
'test_kwargs': {
  'depends': ecpg_test_dependencies,
},
    'dbname': 'ecpg1_regression,ecpg2_regression',
'regress_args': ecpg_regress_args,
  },
}


So in my opinion there is nothing wrong here. The build is working as 
intended. Does this make sense to you, Tom?


--
Tristan Partin
Neon (https://neon.tech)




On non-Windows, hard depend on uselocale(3)

2023-11-15 Thread Tristan Partin
I have been working on adding using thread-safe locale APIs within 
Postgres where appropriate[0]. The patch that I originally submitted 
crashed during initdb (whoops!), so I worked on fixing the crash, which 
led me to having to touch some code in chklocale.c, which became 
a frustrating experience because chklocale.c is compiled in 3 different 
configurations.



pgport_variants = {
  '_srv': internal_lib_args + {
'dependencies': [backend_port_code],
  },
  '': default_lib_args + {
'dependencies': [frontend_port_code],
  },
  '_shlib': default_lib_args + {
'pic': true,
'dependencies': [frontend_port_code],
  },
}


This means that some APIs I added or changed in pg_locale.c, can't be 
used without conditional compilation depending on what variant is being 
compiled. Additionally, I also have conditional compilation based on 
HAVE_USELOCALE and WIN32.


I would like to propose removing HAVE_USELOCALE, and just have WIN32, 
which means that Postgres would require uselocale(3) on anything that 
isn't WIN32.


[0]: https://www.postgresql.org/message-id/cwmw5ozbwj10.1yflqwsue5...@neon.tech

--
Tristan Partin
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

2023-11-22 Thread Tristan Partin

On Thu Nov 16, 2023 at 8:33 AM CST, Heikki Linnakangas wrote:

On 06/11/2023 19:16, Tristan Partin wrote:
>>> That sounds like a much better solution. Attached you will find a v4
>>> that implements your suggestion. Please let me know if there is
>>> something that I missed. I can confirm that the patch works.

This patch is missing a select(). It will busy loop until the connection 
is established or cancelled.


If I add a wait (select, poll, etc.), then I can't control-C during the 
blocking call, so it doesn't really solve the problem. On Linux, we have 
signalfds which seem like the perfect solution to this problem, "wait on 
the pq socket or SIGINT." But that doesn't translate well to other 
operating systems :(.



tristan957=> \c
NOTICE:  Welcome to Neon!
Authenticate by visiting:
https://console.neon.tech/psql_session/


^CTerminated


You can see here that I can't terminate the command. Where you see 
"Terminated" is me running `kill $(pgrep psql)` in another terminal.


Shouldn't we also clear CancelRequested after we have cancelled the 
attempt? Otherwise, any subsequent attempts will immediately fail too.


After switching to cancel_pressed, I don't think so. See this bit of 
code in the psql main loop:



/* main loop to get queries and execute them */
while (successResult == EXIT_SUCCESS)
{
/*
 * Clean up after a previous Control-C
 */
if (cancel_pressed)
{
if (!pset.cur_cmd_interactive)
{
/*
 * You get here if you stopped a script with Ctrl-C.
 */
successResult = EXIT_USER;
break;
}

cancel_pressed = false;
}


Should we use 'cancel_pressed' here rather than CancelRequested? To be 
honest, I don't understand the difference, so that's a genuine question. 
There was an attempt at unifying them in the past but it was reverted in 
commit 5d43c3c54d.


The more I look at this, the more I don't understand... I need to look 
at this harder, but wanted to get this email out. Switched to 
cancel_pressed though. Thanks for pointing it out. Going to wait to send 
a v2 while more discussion occurs.


--
Tristan Partin
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

2023-11-22 Thread Tristan Partin

On Wed Nov 22, 2023 at 3:00 PM CST, Heikki Linnakangas wrote:

On 22/11/2023 19:29, Tristan Partin wrote:
> On Thu Nov 16, 2023 at 8:33 AM CST, Heikki Linnakangas wrote:
>> On 06/11/2023 19:16, Tristan Partin wrote:
>>>>> That sounds like a much better solution. Attached you will find a v4
>>>>> that implements your suggestion. Please let me know if there is
>>>>> something that I missed. I can confirm that the patch works.
>>
>> This patch is missing a select(). It will busy loop until the connection
>> is established or cancelled.
> 
> If I add a wait (select, poll, etc.), then I can't control-C during the

> blocking call, so it doesn't really solve the problem.

Hmm, they should return with EINTR on signal. At least on Linux; I'm not 
sure how portable that is. See signal(7) man page, section "Interruption 
of system calls and library functions by signal handlers". You could 
also use a timeout like 5 s to ensure that you wake up and notice that 
the signal was received eventually, even if it doesn't interrupt the 
blocking call.


Ha, you're right. I had this working yesterday, but convinced myself it 
didn't. I had a do while loop wrapping the blocking call. Here is a v4, 
which seems to pass the tests that were pointed out to be failing 
earlier.


Noticed that I copy-pasted pqSocketPoll() into the psql code. I think 
this may be controversial. Not sure what the best solution to the issue 
is. I will pay attention to the buildfarm animals when they pick this 
up.


--
Tristan Partin
Neon (https://neon.tech)
From 1b4303df1200c561cf478f9c7037ff940f6cd741 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Mon, 24 Jul 2023 11:12:59 -0500
Subject: [PATCH v4] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Use PQconnectStartParams() in order to insert a CancelRequested check
into the polling loop.
---
 src/bin/psql/command.c | 129 -
 1 file changed, 128 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 82cc091568..588eed9351 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -11,6 +11,11 @@
 #include 
 #include 
 #include 
+#if HAVE_POLL
+#include 
+#else
+#include 
+#endif
 #ifndef WIN32
 #include 			/* for stat() */
 #include 			/* for setitimer() */
@@ -40,6 +45,7 @@
 #include "large_obj.h"
 #include "libpq-fe.h"
 #include "libpq/pqcomm.h"
+#include "libpq/pqsignal.h"
 #include "mainloop.h"
 #include "portability/instr_time.h"
 #include "pqexpbuffer.h"
@@ -3251,6 +3257,90 @@ copy_previous_query(PQExpBuffer query_buf, PQExpBuffer previous_buf)
 	return false;
 }
 
+/*
+ * Check a file descriptor for read and/or write data, possibly waiting.
+ * If neither forRead nor forWrite are set, immediately return a timeout
+ * condition (without waiting).  Return >0 if condition is met, 0
+ * if a timeout occurred, -1 if an error or interrupt occurred.
+ *
+ * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
+ * if end_time is 0 (or indeed, any time before now).
+ */
+static int
+pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
+{
+	/* We use poll(2) if available, otherwise select(2) */
+#ifdef HAVE_POLL
+	struct pollfd input_fd;
+	int			timeout_ms;
+
+	if (!forRead && !forWrite)
+		return 0;
+
+	input_fd.fd = sock;
+	input_fd.events = POLLERR;
+	input_fd.revents = 0;
+
+	if (forRead)
+		input_fd.events |= POLLIN;
+	if (forWrite)
+		input_fd.events |= POLLOUT;
+
+	/* Compute appropriate timeout interval */
+	if (end_time == ((time_t) -1))
+		timeout_ms = -1;
+	else
+	{
+		time_t		now = time(NULL);
+
+		if (end_time > now)
+			timeout_ms = (end_time - now) * 1000;
+		else
+			timeout_ms = 0;
+	}
+
+	return poll(&input_fd, 1, timeout_ms);
+#else			/* !HAVE_POLL */
+
+	fd_set		input_mask;
+	fd_set		output_mask;
+	fd_set		except_mask;
+	struct timeval timeout;
+	struct timeval *ptr_timeout;
+
+	if (!forRead && !forWrite)
+		return 0;
+
+	FD_ZERO(&input_mask);
+	FD_ZERO(&output_mask);
+	FD_ZERO(&except_mask);
+	if (forRead)
+		FD_SET(sock, &input_mask);
+
+	if (forWrite)
+		FD_SET(sock, &output_mask);
+	FD_SET(sock, &except_mask);
+
+	/* Compute appropriate timeout interval */
+	if (end_time == ((time_t) -1))
+		ptr_timeout = NULL;
+	else
+	{
+		time_t		now = time(NULL);
+
+		if (end_time > now)
+			timeout.tv_sec = end_time - now;
+		else
+			timeout.tv_sec = 0;
+		timeout.tv_usec = 0;
+		ptr_timeout = &timeout;
+	}
+
+	return select(sock

Re: SSL tests fail on OpenSSL v3.2.0

2023-11-27 Thread Tristan Partin

Nazir,

Thanks for opening a thread. Was just about to start one, here what we 
came up with so far.


Homebrew users discovered a regression[0] when using Postgres compiled 
and linked against OpenSSL version 3.2.


$ psql "postgresql://$DB?sslmode=require"
psql: error: connection to server at "redacted" (redacted), port 5432 failed: 
ERROR:  Parameter 'user' is missing in startup packet.
double free or corruption (out)
Aborted (core dumped)

Analyzing the backtrace, OpenSSL was overwriting heap-allocated data in
our PGconn struct because it thought BIO::ptr was a struct bss_sock_st
*. OpenSSL then called a memset() on a member of that struct, and we
zeroed out data in our PGconn struct.

BIO_get_data(3) says the following:


These functions are mainly useful when implementing a custom BIO.

The BIO_set_data() function associates the custom data pointed to by ptr
with the BIO a. This data can subsequently be retrieved via a call to
BIO_get_data(). This can be used by custom BIOs for storing
implementation specific information.


If you take a look at my_BIO_s_socket(), we create a partially custom
BIO, but for the most part are defaulting to the methods defined by
BIO_s_socket(). We need to set application-specific data and not BIO
private data, so that the BIO implementation we rely on, can properly
assert that its private data is what it expects.

The ssl test suite continues to pass with this patch. This patch should 
be backported to every supported Postgres version most likely.


[0]: https://github.com/Homebrew/homebrew-core/issues/155651

--
Tristan Partin
Neon (https://neon.tech)
From 672103a67aaf0dae5be6c5adcc5ce19e5b6d7676 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Mon, 27 Nov 2023 11:49:52 -0600
Subject: [PATCH v1] Use BIO_{get,set}_app_data() instead of
 BIO_{get,set}_data()

Compiling Postgres against OpenSSL 3.2 exposed a regression:

$ psql "postgresql://$DB?sslmode=require"
psql: error: connection to server at "redacted" (redacted), port 5432 failed: ERROR:  Parameter 'user' is missing in startup packet.
double free or corruption (out)
Aborted (core dumped)

Analyzing the backtrace, openssl was overwriting heap-allocated data in
our PGconn struct because it thought BIO::ptr was a struct bss_sock_st
*. OpenSSL then called a memset() on a member of that struct, and we
zeroed out data in our PGconn struct.

BIO_get_data(3) says the following:

> These functions are mainly useful when implementing a custom BIO.
>
> The BIO_set_data() function associates the custom data pointed to by ptr
> with the BIO a. This data can subsequently be retrieved via a call to
> BIO_get_data(). This can be used by custom BIOs for storing
> implementation specific information.

If you take a look at my_BIO_s_socket(), we create a partially custom
BIO, but for the most part are defaulting to the methods defined by
BIO_s_socket(). We need to set application-specific data and not BIO
private data, so that the BIO implementation we rely on, can properly
assert that its private data is what it expects.

Co-authored-by: Bo Andreson 
---
 meson.build  |  1 -
 src/backend/libpq/be-secure-openssl.c| 11 +++
 src/include/pg_config.h.in   |  3 ---
 src/interfaces/libpq/fe-secure-openssl.c | 20 +++-
 src/tools/msvc/Solution.pm   |  2 --
 5 files changed, 6 insertions(+), 31 deletions(-)

diff --git a/meson.build b/meson.build
index ee58ee7a06..0095fb183a 100644
--- a/meson.build
+++ b/meson.build
@@ -1285,7 +1285,6 @@ if sslopt in ['auto', 'openssl']
   # doesn't have these OpenSSL 1.1.0 functions. So check for individual
   # functions.
   ['OPENSSL_init_ssl'],
-  ['BIO_get_data'],
   ['BIO_meth_new'],
   ['ASN1_STRING_get0_data'],
   ['HMAC_CTX_new'],
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 31b6a6eacd..1b8b32c5b3 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -842,11 +842,6 @@ be_tls_write(Port *port, void *ptr, size_t len, int *waitfor)
  * see sock_read() and sock_write() in OpenSSL's crypto/bio/bss_sock.c.
  */
 
-#ifndef HAVE_BIO_GET_DATA
-#define BIO_get_data(bio) (bio->ptr)
-#define BIO_set_data(bio, data) (bio->ptr = data)
-#endif
-
 static BIO_METHOD *my_bio_methods = NULL;
 
 static int
@@ -856,7 +851,7 @@ my_sock_read(BIO *h, char *buf, int size)
 
 	if (buf != NULL)
 	{
-		res = secure_raw_read(((Port *) BIO_get_data(h)), buf, size);
+		res = secure_raw_read(((Port *) BIO_get_app_data(h)), buf, size);
 		BIO_clear_retry_flags(h);
 		if (res <= 0)
 		{
@@ -876,7 +871,7 @@ my_sock_write(BIO *h, const char *buf, int size)
 {
 	int			res = 0;
 
-	res = secure_raw_write(((Port *) BIO_get_data(h)), buf, size);
+	res = secur

Re: SSL tests fail on OpenSSL v3.2.0

2023-11-27 Thread Tristan Partin

Here is a v2 which adds back a comment that was not meant to be removed.

--
Tristan Partin
Neon (https://neon.tech)
From 4bcb73eab9ceba950581a890c52820a81134f7e4 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Mon, 27 Nov 2023 11:49:52 -0600
Subject: [PATCH v2] Use BIO_{get,set}_app_data() instead of
 BIO_{get,set}_data()

Compiling Postgres against OpenSSL 3.2 exposed a regression:

$ psql "postgresql://$DB?sslmode=require"
psql: error: connection to server at "redacted" (redacted), port 5432 failed: ERROR:  Parameter 'user' is missing in startup packet.
double free or corruption (out)
Aborted (core dumped)

Analyzing the backtrace, openssl was overwriting heap-allocated data in
our PGconn struct because it thought BIO::ptr was a struct bss_sock_st
*. OpenSSL then called a memset() on a member of that struct, and we
zeroed out data in our PGconn struct.

BIO_get_data(3) says the following:

> These functions are mainly useful when implementing a custom BIO.
>
> The BIO_set_data() function associates the custom data pointed to by ptr
> with the BIO a. This data can subsequently be retrieved via a call to
> BIO_get_data(). This can be used by custom BIOs for storing
> implementation specific information.

If you take a look at my_BIO_s_socket(), we create a partially custom
BIO, but for the most part are defaulting to the methods defined by
BIO_s_socket(). We need to set application-specific data and not BIO
private data, so that the BIO implementation we rely on, can properly
assert that its private data is what it expects.

Co-authored-by: Bo Andreson 
---
 meson.build  |  1 -
 src/backend/libpq/be-secure-openssl.c| 11 +++
 src/include/pg_config.h.in   |  3 ---
 src/interfaces/libpq/fe-secure-openssl.c | 11 +++
 src/tools/msvc/Solution.pm   |  2 --
 5 files changed, 6 insertions(+), 22 deletions(-)

diff --git a/meson.build b/meson.build
index ee58ee7a06..0095fb183a 100644
--- a/meson.build
+++ b/meson.build
@@ -1285,7 +1285,6 @@ if sslopt in ['auto', 'openssl']
   # doesn't have these OpenSSL 1.1.0 functions. So check for individual
   # functions.
   ['OPENSSL_init_ssl'],
-  ['BIO_get_data'],
   ['BIO_meth_new'],
   ['ASN1_STRING_get0_data'],
   ['HMAC_CTX_new'],
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 31b6a6eacd..1b8b32c5b3 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -842,11 +842,6 @@ be_tls_write(Port *port, void *ptr, size_t len, int *waitfor)
  * see sock_read() and sock_write() in OpenSSL's crypto/bio/bss_sock.c.
  */
 
-#ifndef HAVE_BIO_GET_DATA
-#define BIO_get_data(bio) (bio->ptr)
-#define BIO_set_data(bio, data) (bio->ptr = data)
-#endif
-
 static BIO_METHOD *my_bio_methods = NULL;
 
 static int
@@ -856,7 +851,7 @@ my_sock_read(BIO *h, char *buf, int size)
 
 	if (buf != NULL)
 	{
-		res = secure_raw_read(((Port *) BIO_get_data(h)), buf, size);
+		res = secure_raw_read(((Port *) BIO_get_app_data(h)), buf, size);
 		BIO_clear_retry_flags(h);
 		if (res <= 0)
 		{
@@ -876,7 +871,7 @@ my_sock_write(BIO *h, const char *buf, int size)
 {
 	int			res = 0;
 
-	res = secure_raw_write(((Port *) BIO_get_data(h)), buf, size);
+	res = secure_raw_write(((Port *) BIO_get_app_data(h)), buf, size);
 	BIO_clear_retry_flags(h);
 	if (res <= 0)
 	{
@@ -952,7 +947,7 @@ my_SSL_set_fd(Port *port, int fd)
 		SSLerr(SSL_F_SSL_SET_FD, ERR_R_BUF_LIB);
 		goto err;
 	}
-	BIO_set_data(bio, port);
+	BIO_set_app_data(bio, port);
 
 	BIO_set_fd(bio, fd, BIO_NOCLOSE);
 	SSL_set_bio(port->ssl, bio, bio);
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index d8a2985567..5f16918243 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -66,9 +66,6 @@
 /* Define to 1 if you have the `backtrace_symbols' function. */
 #undef HAVE_BACKTRACE_SYMBOLS
 
-/* Define to 1 if you have the `BIO_get_data' function. */
-#undef HAVE_BIO_GET_DATA
-
 /* Define to 1 if you have the `BIO_meth_new' function. */
 #undef HAVE_BIO_METH_NEW
 
diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index 4aeaf08312..e669bdbf1d 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -1815,11 +1815,6 @@ PQsslAttribute(PGconn *conn, const char *attribute_name)
  * see sock_read() and sock_write() in OpenSSL's crypto/bio/bss_sock.c.
  */
 
-#ifndef HAVE_BIO_GET_DATA
-#define BIO_get_data(bio) (bio->ptr)
-#define BIO_set_data(bio, data) (bio->ptr = data)
-#endif
-
 /* protected by ssl_config_mutex */
 static BIO_METHOD *my_bio_methods;
 
@@ -1828,7 +1823,7 @@ my_sock_read(BIO *h, char *buf, int size)
 {
 	int			res;
 
-	res = pqsecure_raw_rea

Re: SSL tests fail on OpenSSL v3.2.0

2023-11-27 Thread Tristan Partin

On Mon Nov 27, 2023 at 5:53 PM CST, Michael Paquier wrote:

On Mon, Nov 27, 2023 at 12:33:49PM -0600, Tristan Partin wrote:
> -#ifndef HAVE_BIO_GET_DATA
> -#define BIO_get_data(bio) (bio->ptr)
> -#define BIO_set_data(bio, data) (bio->ptr = data)
> -#endif

Shouldn't this patch do a refresh of configure.ac and remove the check
on BIO_get_data() if HAVE_BIO_GET_DATA is gone?


See the attached v3. I am unfamiliar with autotools, so I just hand 
edited the configure.ac script instead of whatever "refresh" means.


--
Tristan Partin
Neon (https://neon.tech)
From 2aa28288c389a2d8f9cbc77a8a710c905227383f Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Mon, 27 Nov 2023 11:49:52 -0600
Subject: [PATCH v3] Use BIO_{get,set}_app_data() instead of
 BIO_{get,set}_data()

Compiling Postgres against OpenSSL 3.2 exposed a regression:

$ psql "postgresql://$DB?sslmode=require"
psql: error: connection to server at "redacted" (redacted), port 5432 failed: ERROR:  Parameter 'user' is missing in startup packet.
double free or corruption (out)
Aborted (core dumped)

Analyzing the backtrace, openssl was overwriting heap-allocated data in
our PGconn struct because it thought BIO::ptr was a struct bss_sock_st
*. OpenSSL then called a memset() on a member of that struct, and we
zeroed out data in our PGconn struct.

BIO_get_data(3) says the following:

> These functions are mainly useful when implementing a custom BIO.
>
> The BIO_set_data() function associates the custom data pointed to by ptr
> with the BIO a. This data can subsequently be retrieved via a call to
> BIO_get_data(). This can be used by custom BIOs for storing
> implementation specific information.

If you take a look at my_BIO_s_socket(), we create a partially custom
BIO, but for the most part are defaulting to the methods defined by
BIO_s_socket(). We need to set application-specific data and not BIO
private data, so that the BIO implementation we rely on, can properly
assert that its private data is what it expects.

Co-authored-by: Bo Andreson 
---
 configure|  2 +-
 configure.ac |  2 +-
 meson.build  |  1 -
 src/backend/libpq/be-secure-openssl.c| 11 +++
 src/include/pg_config.h.in   |  3 ---
 src/interfaces/libpq/fe-secure-openssl.c | 11 +++
 src/tools/msvc/Solution.pm   |  2 --
 7 files changed, 8 insertions(+), 24 deletions(-)

diff --git a/configure b/configure
index c064115038..bf3ea690db 100755
--- a/configure
+++ b/configure
@@ -12836,7 +12836,7 @@ done
   # defines OPENSSL_VERSION_NUMBER to claim version 2.0.0, even though it
   # doesn't have these OpenSSL 1.1.0 functions. So check for individual
   # functions.
-  for ac_func in OPENSSL_init_ssl BIO_get_data BIO_meth_new ASN1_STRING_get0_data HMAC_CTX_new HMAC_CTX_free
+  for ac_func in OPENSSL_init_ssl BIO_meth_new ASN1_STRING_get0_data HMAC_CTX_new HMAC_CTX_free
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/configure.ac b/configure.ac
index f220b379b3..fed7167e65 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1367,7 +1367,7 @@ if test "$with_ssl" = openssl ; then
   # defines OPENSSL_VERSION_NUMBER to claim version 2.0.0, even though it
   # doesn't have these OpenSSL 1.1.0 functions. So check for individual
   # functions.
-  AC_CHECK_FUNCS([OPENSSL_init_ssl BIO_get_data BIO_meth_new ASN1_STRING_get0_data HMAC_CTX_new HMAC_CTX_free])
+  AC_CHECK_FUNCS([OPENSSL_init_ssl BIO_meth_new ASN1_STRING_get0_data HMAC_CTX_new HMAC_CTX_free])
   # OpenSSL versions before 1.1.0 required setting callback functions, for
   # thread-safety. In 1.1.0, it's no longer required, and CRYPTO_lock()
   # function was removed.
diff --git a/meson.build b/meson.build
index ee58ee7a06..0095fb183a 100644
--- a/meson.build
+++ b/meson.build
@@ -1285,7 +1285,6 @@ if sslopt in ['auto', 'openssl']
   # doesn't have these OpenSSL 1.1.0 functions. So check for individual
   # functions.
   ['OPENSSL_init_ssl'],
-  ['BIO_get_data'],
   ['BIO_meth_new'],
   ['ASN1_STRING_get0_data'],
   ['HMAC_CTX_new'],
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 31b6a6eacd..1b8b32c5b3 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -842,11 +842,6 @@ be_tls_write(Port *port, void *ptr, size_t len, int *waitfor)
  * see sock_read() and sock_write() in OpenSSL's crypto/bio/bss_sock.c.
  */
 
-#ifndef HAVE_BIO_GET_DATA
-#define BIO_get_data(bio) (bio->ptr)
-#define BIO_set_data(bio, data) (bio->ptr = data)
-#endif
-
 static BIO_METHOD *my_bio_methods = NULL;

Re: SSL tests fail on OpenSSL v3.2.0

2023-11-27 Thread Tristan Partin

On Mon Nov 27, 2023 at 6:21 PM CST, Tom Lane wrote:

Michael Paquier  writes:
> Interesting.  I have yet to look at that in details, but
> BIO_get_app_data() exists down to 0.9.8, which is the oldest version
> we need to support for stable branches.  So that looks like a safe
> bet.

What about LibreSSL?  In general, I'm not too pleased with just assuming
that BIO_get_app_data exists.  If we can do that, we can probably remove
most of the OpenSSL function probes that configure.ac has today.  Even
if that's a good idea in HEAD, I doubt we want to do it all the way back.


As Bo said, this has been available since before LibreSSL forked off of 
OpenSSL.



I'd be inclined to form the patch more along the lines of
s/BIO_get_data/BIO_get_app_data/g, with a configure check for
BIO_get_app_data and falling back to the existing direct use of
bio->ptr if it's not there.


Falling back to what existed before is invalid. BIO::ptr is private data 
for the BIO implementation. BIO_{get,set}_app_data() does
something completely different than setting BIO::ptr. In Postgres we 
call BIO_meth_set_create() with BIO_meth_get_create() from 
BIO_s_socket(). The create function we pass allocates bi->ptr to 
a struct bss_sock_st * as previously stated, and that's been the case 
since March 10, 2022[0]. Essentially Postgres only worked because the 
BIO implementation didn't use the private data section until the linked 
commit. I don't see any reason to keep compatibility with what only 
worked by accident.


[0]: 
https://github.com/openssl/openssl/commit/a3e53d56831adb60d6875297b3339a4251f735d2

--
Tristan Partin
Neon (https://neon.tech)




Re: SSL tests fail on OpenSSL v3.2.0

2023-11-27 Thread Tristan Partin

On Mon Nov 27, 2023 at 7:14 PM CST, Tom Lane wrote:

"Tristan Partin"  writes:
> On Mon Nov 27, 2023 at 6:21 PM CST, Tom Lane wrote:
>> What about LibreSSL?  In general, I'm not too pleased with just assuming
>> that BIO_get_app_data exists.

> Falling back to what existed before is invalid.

Well, sure it only worked by accident, but it did work with older
OpenSSL versions.  If we assume that BIO_get_app_data exists, and
somebody tries to use it with a version that hasn't got that,
it won't work.

Having said that, my concern was mainly driven by the comments in
configure.ac claiming that this was an OpenSSL 1.1.0 addition.
Looking at the relevant commits, 593d4e47d and 5c6df67e0, it seems
that that was less about "the function doesn't exist before 1.1.0"
and more about "in 1.1.0 we have to use the function because we
can no longer directly access the ptr field".  If the function
does exist in 0.9.8 then I concur that we don't need to test.


I have gone back all the way to 1.0.0 and confirmed that the function 
exists. Didn't choose to go further than that since Postgres doesn't 
support it.


--
Tristan Partin
Neon (https://neon.tech)




Re: SSL tests fail on OpenSSL v3.2.0

2023-11-28 Thread Tristan Partin

On Tue Nov 28, 2023 at 9:00 AM CST, Tom Lane wrote:

Daniel Gustafsson  writes:
> Thats not an issue, we don't support building with BoringSSL.

Right.  I'll work on getting this pushed, unless someone else
is already on it?


When you say "this" are you referring to the patch I sent or adding 
support for BoringSSL?


--
Tristan Partin
Neon (https://neon.tech)




Re: SSL tests fail on OpenSSL v3.2.0

2023-11-28 Thread Tristan Partin
How are you guys running the tests? I have PG_TEST_EXTRA=ssl and 
everything passes for me. Granted, I am using the Meson build.


--
Tristan Partin
Neon (https://neon.tech)




Re: SSL tests fail on OpenSSL v3.2.0

2023-11-28 Thread Tristan Partin

On Tue Nov 28, 2023 at 9:31 AM CST, Tom Lane wrote:

"Tristan Partin"  writes:
> How are you guys running the tests? I have PG_TEST_EXTRA=ssl and 
> everything passes for me. Granted, I am using the Meson build.


I'm doing what it says in test/ssl/README:

make check PG_TEST_EXTRA=ssl

I don't know whether the meson build has support for running these
extra "unsafe" tests or not.


Thanks Tom. I'll check again. Maybe I didn't set LD_LIBRARY_PATH when 
running the tests. I have openssl installing to a non-default prefix.


--
Tristan Partin
Neon (https://neon.tech)




Re: SSL tests fail on OpenSSL v3.2.0

2023-11-28 Thread Tristan Partin

On Tue Nov 28, 2023 at 9:42 AM CST, Tom Lane wrote:

"Tristan Partin"  writes:
> When you say "this" are you referring to the patch I sent or adding 
> support for BoringSSL?


I have no interest in supporting BoringSSL.  I just replied to
Daniel's comment because it seemed to resolve the last concern
about whether your patch is OK.


If you haven't started fixing the tests, then I'll get to work on it and 
send a new revision before the end of the day. Thanks!


--
Tristan Partin
Neon (https://neon.tech)




Re: SSL tests fail on OpenSSL v3.2.0

2023-11-28 Thread Tristan Partin

On Tue Nov 28, 2023 at 10:06 AM CST, Tom Lane wrote:

"Tristan Partin"  writes:
> On Tue Nov 28, 2023 at 9:42 AM CST, Tom Lane wrote:
>> I have no interest in supporting BoringSSL.  I just replied to
>> Daniel's comment because it seemed to resolve the last concern
>> about whether your patch is OK.

> If you haven't started fixing the tests, then I'll get to work on it and 
> send a new revision before the end of the day. Thanks!


No need, I can finish it up from here.


Sweet. I appreciate your help.

--
Tristan Partin
Neon (https://neon.tech)




Re: SSL tests fail on OpenSSL v3.2.0

2023-11-29 Thread Tristan Partin

On Tue Nov 28, 2023 at 9:42 AM CST, Tom Lane wrote:

"Tristan Partin"  writes:
> When you say "this" are you referring to the patch I sent or adding 
> support for BoringSSL?


I have no interest in supporting BoringSSL.


Funnily enough, here[0] is BoringSSL adding the BIO_{get,set}_app_data() 
APIs.


[0]: 
https://github.com/google/boringssl/commit/2139aba2e3e28cd1cdefbd9b48e2c31a75441203

--
Tristan Partin
Neon (https://neon.tech)




Re: SSL tests fail on OpenSSL v3.2.0

2023-11-29 Thread Tristan Partin

On Wed Nov 29, 2023 at 10:32 AM CST, Tom Lane wrote:

Daniel Gustafsson  writes:
> On 29 Nov 2023, at 16:21, Tristan Partin  wrote:
>> Funnily enough, here[0] is BoringSSL adding the BIO_{get,set}_app_data() 
APIs.

> Still doesn't seem like a good candidate for a postgres TLS library since they
> themselves claim:
>"Although BoringSSL is an open source project, it is not intended for
> general use, as OpenSSL is.  We don't recommend that third parties depend
> upon it.  Doing so is likely to be frustrating because there are no
> guarantees of API or ABI stability."

Kind of odd that, with that mission statement, they are adding
BIO_{get,set}_app_data on the justification that OpenSSL has it
and Postgres is starting to use it.  Nonetheless, that commit
also seems to prove the point about lack of API/ABI stability.

I'm content to take their advice and not try to support BoringSSL.
It's not clear what benefit to us there would be, and we already
have our hands full coping with all the different OpenSSL and LibreSSL
versions.


Yep, I just wanted to point it out in the interest of relevancy to our 
conversation yesterday :).


--
Tristan Partin
Neon (https://neon.tech)




Whose Cirrus CI credits are used when making a PR to the GitHub mirror?

2023-11-29 Thread Tristan Partin
I enabled CI on my personal Postgres fork. I then tried to open a PR 
against my fork, and since GitHub defaults to creating PRs against 
upstream, I accidentally opened a PR against the Postgres mirror, which 
the postgres-mirror bot then closed, which is good. Stupid me.


What the bot didn't do however was cancel the Cirrus CI build that arose 
from my immediately closed PR. Here[0] is the current run. It seems like 
you could very easily waste all the CI credits by creating a whole bunch 
of PRs against the mirror. 


If I am just wasting my own credits, please ignore :).

[0]: https://cirrus-ci.com/build/6235510532734976

--
Tristan Partin
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

2023-11-29 Thread Tristan Partin

On Thu Nov 23, 2023 at 3:19 AM CST, Heikki Linnakangas wrote:

On 22/11/2023 23:29, Tristan Partin wrote:
> Ha, you're right. I had this working yesterday, but convinced myself it
> didn't. I had a do while loop wrapping the blocking call. Here is a v4,
> which seems to pass the tests that were pointed out to be failing
> earlier.

Thanks! This suffers from a classic race condition:

> +  if (cancel_pressed)
> +  break;
> +
> +  sock = PQsocket(n_conn);
> +  if (sock == -1)
> +  break;
> +
> +  rc = pqSocketPoll(sock, for_read, !for_read, (time_t)-1);
> +  Assert(rc != 0); /* Timeout is impossible. */
> +  if (rc == -1)
> +  {
> +  success = false;
> +  break;
> +  }

If a signal arrives between the "if (cancel_pressed)" check and 
pqSocketPoll(), pgSocketPoll will "miss" the signal and block 
indefinitely. There are three solutions to this:


1. Use a timeout, so that you wake up periodically to check for any 
missed signals. Easy solution, the downsides are that you will not react 
as quickly if the signal is missed, and you waste some cycles by waking 
up unnecessarily.


2. The Self-pipe trick: https://cr.yp.to/docs/selfpipe.html. We also use 
that in src/backend/storage/ipc/latch.c. It's portable, but somewhat 
complicated.


3. Use pselect() or ppoll(). They were created specifically to address 
this problem. Not fully portable, I think it's missing on Windows at least.


Maybe use pselect() if it's available, and select() with a timeout if 
it's not.


First I have ever heard of this race condition, and now I will commit it 
to durable storage :). I went ahead and did option #3 that you proposed. 
On Windows, I have a 1 second timeout for the select/poll. That seemed 
like a good balance of user experience and spinning the CPU. But I am 
open to other values. I don't have a Windows VM, but maybe I should set 
one up...


I am not completely in love with the code I have written. Lots of 
conditional compilation which makes it hard to read. Looking forward to 
another round of review to see what y'all think.


For what it's worth, I did try #2, but since psql installs its own 
SIGINT handler, the code became really hairy.


--
Tristan Partin
Neon (https://neon.tech)
From 4fdccf9211ac78bbd7430488d515646f9e9cce9b Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Mon, 24 Jul 2023 11:12:59 -0500
Subject: [PATCH v5] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Use PQconnectStartParams() in order to insert a CancelRequested check
into the polling loop.
---
 meson.build|   1 +
 src/bin/psql/command.c | 222 -
 src/include/pg_config.h.in |   3 +
 src/tools/msvc/Solution.pm |   1 +
 4 files changed, 226 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index ee58ee7a06..2d63485c53 100644
--- a/meson.build
+++ b/meson.build
@@ -2440,6 +2440,7 @@ func_checks = [
   ['posix_fadvise'],
   ['posix_fallocate'],
   ['ppoll'],
+  ['pselect'],
   ['pstat'],
   ['pthread_barrier_wait', {'dependencies': [thread_dep]}],
   ['pthread_is_threaded_np', {'dependencies': [thread_dep]}],
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 82cc091568..c325fedb51 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -11,6 +11,11 @@
 #include 
 #include 
 #include 
+#if HAVE_POLL
+#include 
+#else
+#include 
+#endif
 #ifndef WIN32
 #include 			/* for stat() */
 #include 			/* for setitimer() */
@@ -40,6 +45,7 @@
 #include "large_obj.h"
 #include "libpq-fe.h"
 #include "libpq/pqcomm.h"
+#include "libpq/pqsignal.h"
 #include "mainloop.h"
 #include "portability/instr_time.h"
 #include "pqexpbuffer.h"
@@ -3251,6 +3257,169 @@ copy_previous_query(PQExpBuffer query_buf, PQExpBuffer previous_buf)
 	return false;
 }
 
+/*
+ * Check a file descriptor for read and/or write data, possibly waiting.
+ * If neither forRead nor forWrite are set, immediately return a timeout
+ * condition (without waiting).  Return >0 if condition is met, 0
+ * if a timeout occurred, -1 if an error or interrupt occurred.
+ *
+ * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
+ * if end_time is 0 (or indeed, any time before now).
+ */
+static int
+pqSock

Re: meson: Stop using deprecated way getting path of files

2023-11-29 Thread Tristan Partin
This looks good to me. What is our limiting factor on bumping the 
minimum Meson version?


While we are on the topic of Meson, it would be great if you could take 
a look at this thread[0], where I am trying to compile Postgres with 
-fsanitize=address,undefined (-Db_sanitize=address,undefined).


[0]: https://www.postgresql.org/message-id/CWTM35CAUKRT.1733OSMXUZW7%40neon.tech

--
Tristan Partin
Neon (https://neon.tech)




Re: Refactoring backend fork+exec code

2023-11-30 Thread Tristan Partin
pe, so the 
pipe
 + * chunking protocol isn't disturbed. Non-logpipe data gets 
translated on
 + * redirection (e.g. via pg_ctl -l) anyway.
 + */


Nit: The 'm' in the first "make" should be capitalized.


 +if (fread(¶m, sizeof(param), 1, fp) != 1)
 +{
 +write_stderr("could not read from backend variables file \"%s\": 
%s\n",
 + id, strerror(errno));
 +exit(1);
 +}
 +
 +/* read startup data */
 +*startup_data_len = param.startup_data_len;
 +if (param.startup_data_len > 0)
 +{
 +*startup_data = palloc(*startup_data_len);
 +if (fread(*startup_data, *startup_data_len, 1, fp) != 1)
 +{
 +write_stderr("could not read startup data from backend variables 
file \"%s\": %s\n",
 + id, strerror(errno));
 +exit(1);
 +}
 +}


fread(3) doesn't set errno. I would probably switch these to read(2) for 
the reason I wrote in a previous comment.



 +/*
 + * Need to reinitialize the SSL library in the backend, since the 
context
 + * structures contain function pointers and cannot be passed through 
the
 + * parameter file.
 + *
 + * If for some reason reload fails (maybe the user installed broken 
key
 + * files), soldier on without SSL; that's better than all connections
 + * becoming impossible.
 + *
 + * XXX should we do this in all child processes?  For the moment it's
 + * enough to do it in backend children.
 + */
 +#ifdef USE_SSL
 +if (EnableSSL)
 +{
 +if (secure_initialize(false) == 0)
 +LoadedSSL = true;
 +else
 +ereport(LOG,
 +(errmsg("SSL configuration could not be 
loaded in child process")));
 +}
 +#endif


Do other child process types do any non-local communication?


 -typedef struct ClientSocket {
 +struct ClientSocket
 +{
  pgsocketsock;/* File descriptor */
  SockAddrladdr;/* local addr 
(postmaster) */
  SockAddrraddr;/* remote addr (client) 
*/
 -} ClientSocket;
 +};
 +typedef struct ClientSocket ClientSocket;


Can't say I completely understand the reason for this change given it 
was added in your series.


I didn't look too hard at the Windows-specific code, so maybe someone 
who knows Windows will have something to say, but it also might've just 
been copy-paste that I didn't realize.


There were a few more XXXs that probably should be figured out before 
committing. Though perhaps some of them were already there.


Patches 1-3 seem committable as-is. I only had minor comments on 
everything but 7, so after taking a look at those, they could be 
committed.


Overall, this seems liked a marked improvement :).

--
Tristan Partin
Neon (https://neon.tech)




Re: meson: Stop using deprecated way getting path of files

2023-11-30 Thread Tristan Partin

On Wed Nov 29, 2023 at 1:42 PM CST, Andres Freund wrote:

Hi,

On 2023-11-29 13:11:23 -0600, Tristan Partin wrote:
> What is our limiting factor on bumping the minimum Meson version?

Old distro versions, particularly ones where the distro just has an older
python. It's one thing to require installing meson but quite another to also
require building python. There's some other ongoing discussion about
establishing the minimum baseline in a somewhat more, uh, formalized way:
https://postgr.es/m/CA%2BhUKGLhNs5geZaVNj2EJ79Dx9W8fyWUU3HxcpZy55sMGcY%3DiA%40mail.gmail.com


I'll take a look there. According to Meson, the following versions had 
Python version bumps:


0.61.5: 3.6
0.56.2: 3.5
0.45.1: 3.4

Taking a look at pkgs.org, Debian 10, Ubuntu 20.04, and Oracle Linux 
7 (a RHEL re-spin), and CentOS 7, all have >= Python 3.6.8. Granted, 
this isn't the whole picture of what Postgres supports from version 16+. 
To put things in perspective, Python 3.6 was released on December 23, 
2016, which is coming up on 7 years. Python 3.6 reached end of life on 
the same date in 2021.


Is there a complete list somewhere that talks about what platforms each 
version of Postgres supports?


--
Tristan Partin
Neon (https://neon.tech)




Re: Bug in pgbench prepared statements

2023-11-30 Thread Tristan Partin

On Wed Nov 29, 2023 at 7:38 PM CST, Lev Kokotov wrote:

Patch attached, if there is any interest in fixing this small bug.


I see prepareCommand() is called one more time in 
prepareCommandsInPipeline(). Should you also check the return value 
there?


It may also be useful to throw this patch on the January commitfest if 
no one else comes along to review/commit it. This first set of changes 
looks good to me.


--
Tristan Partin
Neon (https://neon.tech)




Re: meson: Stop using deprecated way getting path of files

2023-12-01 Thread Tristan Partin

On Thu Nov 30, 2023 at 3:46 PM CST, Andrew Dunstan wrote:


On 2023-11-30 Th 16:00, Tristan Partin wrote:
> On Wed Nov 29, 2023 at 1:42 PM CST, Andres Freund wrote:
>> Hi,
>>
>> On 2023-11-29 13:11:23 -0600, Tristan Partin wrote:
>> > What is our limiting factor on bumping the minimum Meson version?
>>
>> Old distro versions, particularly ones where the distro just has an 
>> older
>> python. It's one thing to require installing meson but quite another 
>> to also

>> require building python. There's some other ongoing discussion about
>> establishing the minimum baseline in a somewhat more, uh, formalized 
>> way:
>> https://postgr.es/m/CA%2BhUKGLhNs5geZaVNj2EJ79Dx9W8fyWUU3HxcpZy55sMGcY%3DiA%40mail.gmail.com 
>>

>
> I'll take a look there. According to Meson, the following versions had 
> Python version bumps:

>
> 0.61.5: 3.6
> 0.56.2: 3.5
> 0.45.1: 3.4
>
> Taking a look at pkgs.org, Debian 10, Ubuntu 20.04, and Oracle Linux 7 
> (a RHEL re-spin), and CentOS 7, all have >= Python 3.6.8. Granted, 
> this isn't the whole picture of what Postgres supports from version 
> 16+. To put things in perspective, Python 3.6 was released on December 
> 23, 2016, which is coming up on 7 years. Python 3.6 reached end of 
> life on the same date in 2021.

>
> Is there a complete list somewhere that talks about what platforms 
> each version of Postgres supports?



You can look at animals in the buildfarm. For meson only release 16 and 
up matter.


On the buildfarm page[0], it would be nice if more than just the 
compiler versions were stated. It would be nice to have all 
build/runtime dependencies listed. For instance, it would be interesting 
if there was a json document for each build animal, and perhaps a root 
json document which was an amalgomation of the individual documents. 
Then I could use a tool like jq to query all the information rather 
easily. As-is, I don't know where to search for package versions for 
some of the archaic operating systems in the farm. Perhaps other people 
have had similar problems in the past. Having a full write-up of every 
build machine would also be good for debugging purposes. If I see 
openssl tests suddenly failing on one machine, then I can just check the 
openssl version, and try to reproduce locally.


I know the buildfarm seems to be a volunteer thing, so asking more of 
them seems like a hard ask. Just wanted to throw my thoughts into the 
void.


[0]: https://buildfarm.postgresql.org/cgi-bin/show_members.pl

--
Tristan Partin
Neon (https://neon.tech)




Re: Refactoring backend fork+exec code

2023-12-01 Thread Tristan Partin

On Fri Dec 1, 2023 at 6:10 AM CST, Heikki Linnakangas wrote:

On 30/11/2023 20:44, Tristan Partin wrote:
> Patches 1-3 seem committable as-is.

Thanks for the review! I'm focusing on patches 1-3 now, and will come 
back to the rest after committing 1-3.


There was one test failure with EXEC_BACKEND from patch 2, in 
'test_shm_mq'. In restore_backend_variables() I checked if 'bgw_name' is 
empty to decide if the BackgroundWorker struct is filled in or not, but 
it turns out that 'test_shm_mq' doesn't fill in bgw_name. It probably 
should, I think that's an oversight in 'test_shm_mq', but that's a 
separate issue.


I did some more refactoring of patch 2, to fix that and to improve it in 
general. The BackgroundWorker struct is now passed through the 
fork-related functions similarly to the Port struct. That seems more 
consistent.


Attached is new version of these patches. For easier review, I made the 
new refactorings compared in a new commit 0003. I will squash that 
before pushing, but this makes it easier to see what changed.


Barring any new feedback or issues, I will commit these.


My only thought is that perhaps has_bg_worker is a better name than 
has_worker, but I agree that having a flag is better than checking 
bgw_name.


--
Tristan Partin
Neon (https://neon.tech)




Re: meson: Stop using deprecated way getting path of files

2023-12-01 Thread Tristan Partin

On Fri Dec 1, 2023 at 12:07 PM CST, Tom Lane wrote:

"Tristan Partin"  writes:
> On the buildfarm page[0], it would be nice if more than just the 
> compiler versions were stated. It would be nice to have all 
> build/runtime dependencies listed.


By and large, we've attempted to address such concerns by extending
the configure script to emit info about versions of things it finds.
So you should look into the configure step's log to see what version
of bison, openssl, etc is in use.


Good point. For some reason that slipped my mind. Off into the weeds 
I go...


--
Tristan Partin
Neon (https://neon.tech)




Re: Refactoring backend fork+exec code

2023-12-01 Thread Tristan Partin

On Fri Dec 1, 2023 at 2:44 PM CST, Heikki Linnakangas wrote:

On 01/12/2023 16:00, Alexander Lakhin wrote:
> Maybe you could look at issues with running `make check` under Valgrind
> when server built with CPPFLAGS="-DUSE_VALGRIND -DEXEC_BACKEND":
> # +++ regress check in src/test/regress +++
> # initializing database system by copying initdb template
> # postmaster failed, examine ".../src/test/regress/log/postmaster.log" for 
the reason
> Bail out!make[1]: ***
> 
> ...

> 2023-12-01 16:48:39.136 MSK postmaster[1307988] LOG:  listening on Unix socket 
"/tmp/pg_regress-pPFNk0/.s.PGSQL.55312"
> ==00:00:00:01.692 1259396== Syscall param write(buf) points to uninitialised 
byte(s)
> ==00:00:00:01.692 1259396==    at 0x5245A37: write (write.c:26)
> ==00:00:00:01.692 1259396==    by 0x51BBF6C: _IO_file_write@@GLIBC_2.2.5 
(fileops.c:1180)
> ==00:00:00:01.692 1259396==    by 0x51BC84F: new_do_write (fileops.c:448)
> ==00:00:00:01.692 1259396==    by 0x51BC84F: _IO_new_file_xsputn 
(fileops.c:1254)
> ==00:00:00:01.692 1259396==    by 0x51BC84F: _IO_file_xsputn@@GLIBC_2.2.5 
(fileops.c:1196)
> ==00:00:00:01.692 1259396==    by 0x51B1056: fwrite (iofwrite.c:39)
> ==00:00:00:01.692 1259396==    by 0x552E21: internal_forkexec 
(postmaster.c:4518)
> ==00:00:00:01.692 1259396==    by 0x5546A1: postmaster_forkexec 
(postmaster.c:)
> ==00:00:00:01.692 1259396==    by 0x55471C: StartChildProcess 
(postmaster.c:5275)
> ==00:00:00:01.692 1259396==    by 0x557B61: PostmasterMain (postmaster.c:1454)
> ==00:00:00:01.692 1259396==    by 0x472136: main (main.c:198)
> ==00:00:00:01.692 1259396==  Address 0x1ffeffdc11 is on thread 1's stack
> ==00:00:00:01.692 1259396==  in frame #4, created by internal_forkexec 
(postmaster.c:4482)
> ==00:00:00:01.692 1259396==
> 
> With memset(param, 0, sizeof(*param)); added at the beginning of

> save_backend_variables(), server starts successfully, but then
> `make check` fails with other Valgrind error.

That's actually a pre-existing issue, I'm seeing that even on unpatched 
'master'.


In a nutshell, the problem is that the uninitialized padding bytes in 
BackendParameters are written to the file. When we read the file back, 
we don't access the padding bytes, so that's harmless. But Valgrind 
doesn't know that.


On Windows, the file is created with 
CreateFileMapping(INVALID_HANDLE_VALUE, ...) and we write the variables 
directly to the mapping. If I understand the Windows API docs correctly, 
it is guaranteed to be initialized to zeros, so we don't have this 
problem on Windows, only on other platforms with EXEC_BACKEND. I think 
it makes sense to clear the memory on other platforms too, since that's 
what we do on Windows.


Committed a fix with memset(). I'm not sure what our policy with 
backpatching this kind of issues is. This goes back to all supported 
versions, but given the lack of complaints, I chose to not backpatch.


Seems like a harmless think to backpatch. It is conceivable that someone 
might want to run Valgrind on something other than HEAD.


--
Tristan Partin
Neon (https://neon.tech)




Re: Improving asan/ubsan support

2023-12-01 Thread Tristan Partin
e. Even some assistance from AddressSanitizer is better than none. 
Here[1][2] are all the AddressSanitizer flags for those curious.



Best regards,
Alexander

[1] 
https://www.postgresql.org/message-id/flat/CWTLB2WWVJJ2.2YV6ERNOL1WVF%40neon.tech
[2] 
https://www.postgresql.org/message-id/591971ce-25c1-90f3-0526-5f54e3ebb32e%40gmail.com


I personally would like to see Postgres have support for  
AddressSanitizer. I think it already supports UndefinedBehaviorSanitizer 
if I am remembering the buildfarm properly. AddressSanitizer has been so 
helpful in past experiences writing C.


[0]: 
https://github.com/google/sanitizers/wiki/AddressSanitizerUseAfterReturn#algorithm
[1]: 
https://github.com/postgres/postgres/commit/faeedbcefd40bfdf314e048c425b6d9208896d90
[2]: https://github.com/google/sanitizers/wiki/AddressSanitizerFlags
[3]: https://github.com/google/sanitizers/wiki/SanitizerCommonFlags

--
Tristan Partin
Neon (https://neon.tech)




Re: pgsql: meson: docs: Add {html,man} targets, rename install-doc-*

2023-12-01 Thread Tristan Partin
Commits look fine to me, but I hate the new target names... Luckily, 
I just use plain ninja, so I don't interact with that.



 +for name, v in targets_info_byname.items():
 +if len(targets_info_byname[name]) > 1:


My only comment is that you could reverse the logic and save yourself an 
indentation.


- if len(targets_info_byname[name]) > 1:
+ if len(targets_info_byname[name]) <= 1:
+ continue

But whatever you want.

--
Tristan Partin
Neon (https://neon.tech)




Re: [RFC] Clang plugin for catching suspicious typedef casting

2023-12-01 Thread Tristan Partin

On Fri Aug 4, 2023 at 5:47 AM CDT, Dmitry Dolgov wrote:

> On Thu, Aug 03, 2023 at 12:23:52PM -0500, Tristan Partin wrote:
>
> This is the first I am learning about clang plugins. Interesting concept.
> Did you give any thought to using libclang instead of a compiler plugin? I
> am kind of doing similar work, but I went with libclang instead of a plugin.

Nope, never thought about trying libclang. From the clang documentation
it seems a plugin is a suitable interface if one wants to:

special lint-style warnings or errors for your project

Which is what I was trying to achieve. Are there any advantages of
libclang that you see?


Only advantage I see to using libclang is that you can run programs 
built with libclang regardless of what your C compiler is. I typically 
use GCC.


I think your idea of automating this kind of thing is great no matter 
how it is implemented.


--
Tristan Partin
Neon (https://neon.tech)




Re: Clean up some signal usage mainly related to Windows

2023-12-01 Thread Tristan Partin

On Wed Jul 12, 2023 at 9:35 AM CDT, Tristan Partin wrote:

On Wed Jul 12, 2023 at 9:31 AM CDT, Peter Eisentraut wrote:
> On 12.07.23 16:23, Tristan Partin wrote:
> > It has come to my attention that STDOUT_FILENO might not be portable and
> > fileno(3) isn't marked as signal-safe, so I have just used the raw 1 for
> > stdout, which as far as I know is portable.
>
> We do use STDOUT_FILENO elsewhere in the code, and there are even 
> workaround definitions for Windows, so it appears it is meant to be used.


v3 is back to the original patch with newline being printed. Thanks.


Peter, did you have anything more to say about patch 1 in this series? 
Thinking about patch 2 more, not sure it should be considered until 
I setup a Windows VM to do some testing, or unless some benevolent 
Windows user wants to look at it and test it.


--
Tristan Partin
Neon (https://neon.tech)




Re: Rename ShmemVariableCache and initialize it in more standard way

2023-12-04 Thread Tristan Partin

On Mon Dec 4, 2023 at 6:49 AM CST, Heikki Linnakangas wrote:
This came up in the "Refactoring backend fork+exec code" thread recently 
[0], but is independent of that work:


On 11/07/2023 01:50, Andres Freund wrote:
>> --- a/src/backend/storage/ipc/shmem.c
>> +++ b/src/backend/storage/ipc/shmem.c
>> @@ -144,6 +144,8 @@ InitShmemAllocation(void)
>>/*
>> * Initialize ShmemVariableCache for transaction manager. (This 
doesn't
>> * really belong here, but not worth moving.)
>> +   *
>> +   * XXX: we really should move this
>> */
>>ShmemVariableCache = (VariableCache)
>>ShmemAlloc(sizeof(*ShmemVariableCache));
> 
> Heh. Indeed. And probably just rename it to something less insane.


Here's a patch to allocate and initialize it with a pair of ShmemSize 
and ShmemInit functions, like all other shared memory structs.





 +if (!IsUnderPostmaster)
 +{
 +Assert(!found);
 +memset(ShmemVariableCache, 0, sizeof(VariableCacheData));
 +}
 +else
 +Assert(found);


Should the else branch instead be a fatal log?

Patches look good to me.
--
Tristan Partin
Neon (https://neon.tech)




Re: Clean up some signal usage mainly related to Windows

2023-12-04 Thread Tristan Partin

On Mon Dec 4, 2023 at 9:22 AM CST, Peter Eisentraut wrote:

On 01.12.23 23:10, Tristan Partin wrote:
> On Wed Jul 12, 2023 at 9:35 AM CDT, Tristan Partin wrote:
>> On Wed Jul 12, 2023 at 9:31 AM CDT, Peter Eisentraut wrote:
>> > On 12.07.23 16:23, Tristan Partin wrote:
>> > > It has come to my attention that STDOUT_FILENO might not be 
>> portable and
>> > > fileno(3) isn't marked as signal-safe, so I have just used the raw 
>> 1 for

>> > > stdout, which as far as I know is portable.
>> >
>> > We do use STDOUT_FILENO elsewhere in the code, and there are even > 
>> workaround definitions for Windows, so it appears it is meant to be used.

>>
>> v3 is back to the original patch with newline being printed. Thanks.
> 
> Peter, did you have anything more to say about patch 1 in this series?


I think that patch is correct.  However, I wonder whether we even need 
that signal handler.  We could just delete the file immediately after 
opening it; then we don't need to worry about deleting it later.  On 
Windows, we could use O_TEMPORARY instead.


I don't think that would work because the same file is opened and closed 
multiple times throughout the course of the program. We first open the 
file in test_open() which sets needs_unlink to true, so for the rest of 
the program we need to unlink the file, but also continue to be able to 
open it. Here is the unlink(2) description for context:


unlink() deletes a name from the filesystem.  If that name was the 
last link to a file and no processes have the file open, the file is 
deleted and the space it was using is made available for reuse.


Given what you've suggested, we could never reopen the file after the 
unlink(2) call.


This is my first time reading this particular code, so maybe you have 
come to a different conclusion.


--
Tristan Partin
Neon (https://neon.tech)




Re: meson: Stop using deprecated way getting path of files

2023-12-04 Thread Tristan Partin

On Fri Dec 1, 2023 at 12:16 PM CST, Tristan Partin wrote:

On Fri Dec 1, 2023 at 12:07 PM CST, Tom Lane wrote:
> "Tristan Partin"  writes:
> > On the buildfarm page[0], it would be nice if more than just the 
> > compiler versions were stated. It would be nice to have all 
> > build/runtime dependencies listed.

>
> By and large, we've attempted to address such concerns by extending
> the configure script to emit info about versions of things it finds.
> So you should look into the configure step's log to see what version
> of bison, openssl, etc is in use.

Good point. For some reason that slipped my mind. Off into the weeds 
I go...


Ok, so what I found is that we still have build farm animals using 
Python 3.4, specifically the AIX machines. There was also at least one 
Python 3.5 user too. Note that this was a manual check.


I think I'll probably work on a tool for querying information out of the 
build farm tonight to make tasks like this much more automated.


--
Tristan Partin
Neon (https://neon.tech)




Re: meson: Stop using deprecated way getting path of files

2023-12-04 Thread Tristan Partin

On Mon Dec 4, 2023 at 2:10 PM CST, Tom Lane wrote:

"Tristan Partin"  writes:
> On Fri Dec 1, 2023 at 12:16 PM CST, Tristan Partin wrote:
>>> Ok, so what I found is that we still have build farm animals using 
>>> Python 3.4, specifically the AIX machines. There was also at least one 
>>> Python 3.5 user too. Note that this was a manual check.


> I think I'll probably work on a tool for querying information out of the 
> build farm tonight to make tasks like this much more automated.


Not sure what you were using, but are you aware that SQL access to the
buildfarm database is available to project members?  My own stock
approach to checking on this sort of thing is like

select * from
(select sysname, snapshot, unnest(string_to_array(log_text, E'\n')) as l
from build_status_log join snapshots using (sysname, snapshot)
where log_stage = 'configure.log') ss
where l like 'checking for builtin %'


This looks useful. I had no idea about this. Can you send me any 
resources for setting this up? My idea was just to do some web scraping.


--
Tristan Partin
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

2023-12-05 Thread Tristan Partin

On Wed Nov 29, 2023 at 11:48 AM CST, Tristan Partin wrote:
I am not completely in love with the code I have written. Lots of 
conditional compilation which makes it hard to read. Looking forward to 
another round of review to see what y'all think.


Ok. Here is a patch which just uses select(2) with a timeout of 1s or 
pselect(2) if it is available. I also moved the state machine processing 
into its own function.


Thanks for your comments thus far.

--
Tristan Partin
Neon (https://neon.tech)
From b22f286d3733d6d5ec2a924682679f6884b3600c Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Mon, 24 Jul 2023 11:12:59 -0500
Subject: [PATCH v6] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Use PQconnectStartParams() in order to insert a CancelRequested check
into the polling loop.
---
 meson.build|   1 +
 src/bin/psql/command.c | 224 -
 src/include/pg_config.h.in |   3 +
 src/tools/msvc/Solution.pm |   1 +
 4 files changed, 228 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index ee58ee7a06..2d63485c53 100644
--- a/meson.build
+++ b/meson.build
@@ -2440,6 +2440,7 @@ func_checks = [
   ['posix_fadvise'],
   ['posix_fallocate'],
   ['ppoll'],
+  ['pselect'],
   ['pstat'],
   ['pthread_barrier_wait', {'dependencies': [thread_dep]}],
   ['pthread_is_threaded_np', {'dependencies': [thread_dep]}],
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 82cc091568..e87401b790 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -11,6 +11,11 @@
 #include 
 #include 
 #include 
+#if HAVE_POLL
+#include 
+#else
+#include 
+#endif
 #ifndef WIN32
 #include 			/* for stat() */
 #include 			/* for setitimer() */
@@ -40,6 +45,7 @@
 #include "large_obj.h"
 #include "libpq-fe.h"
 #include "libpq/pqcomm.h"
+#include "libpq/pqsignal.h"
 #include "mainloop.h"
 #include "portability/instr_time.h"
 #include "pqexpbuffer.h"
@@ -3251,6 +3257,169 @@ copy_previous_query(PQExpBuffer query_buf, PQExpBuffer previous_buf)
 	return false;
 }
 
+/*
+ * Check a file descriptor for read and/or write data, possibly waiting.
+ * If neither forRead nor forWrite are set, immediately return a timeout
+ * condition (without waiting).  Return >0 if condition is met, 0
+ * if a timeout occurred, -1 if an error or interrupt occurred.
+ *
+ * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
+ * if end_time is 0 (or indeed, any time before now).
+ */
+static int
+pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
+{
+	/*
+	 * We use functions in the following order if available:
+	 *   - ppoll(2) OR pselect(2)
+	 *   - poll(2) OR select(2)
+	 */
+#ifdef HAVE_POLL
+	struct pollfd input_fd;
+#ifdef HAVE_PPOLL
+	int			rc;
+	sigset_t	emptyset;
+	sigset_t	blockset;
+	sigset_t	origset;
+	struct timespec timeout;
+	struct timespec *ptr_timeout;
+#else
+	int			timeout_ms;
+#endif
+
+	if (!forRead && !forWrite)
+		return 0;
+
+	input_fd.fd = sock;
+	input_fd.events = POLLERR;
+	input_fd.revents = 0;
+
+	if (forRead)
+		input_fd.events |= POLLIN;
+	if (forWrite)
+		input_fd.events |= POLLOUT;
+
+	/* Compute appropriate timeout interval */
+#ifdef HAVE_PPOLL
+	sigemptyset(&blockset);
+	sigaddset(&blockset, SIGINT);
+	sigprocmask(SIG_BLOCK, &blockset, &origset);
+
+	if (end_time == ((time_t) -1))
+		ptr_timeout = NULL;
+	else
+	{
+		timeout.tv_sec = end_time;
+		timeout.tv_nsec = 0;
+		ptr_timeout = &timeout;
+	}
+#else
+	if (end_time == ((time_t) -1))
+		timeout_ms = -1;
+	else
+	{
+		time_t		now = time(NULL);
+
+		if (end_time > now)
+			timeout_ms = (end_time - now) * 1000;
+		else
+			timeout_ms = 0;
+	}
+#endif
+
+#ifdef HAVE_PPOLL
+	sigemptyset(&emptyset);
+	if (cancel_pressed)
+	{
+		sigprocmask(SIG_SETMASK, &origset, NULL);
+		return 1;
+	}
+
+	rc = ppoll(&input_fd, 1, ptr_timeout, &emptyset);
+	sigprocmask(SIG_SETMASK, &origset, NULL);
+	return rc;
+#else
+	return poll(&input_fd, 1, timeout_ms);
+#endif
+#else			/* !HAVE_POLL */
+
+	fd_set		input_mask;
+	fd_set		output_mask;
+	fd_set		except_mask;
+#ifdef HAVE_PSELECT
+	int			rc;
+	sigset_t	emptyset;
+	sigset_t	blockset;
+	sigset_t	origset;
+	struct timespec timeout;
+	struct timespec *ptr_timeout;
+#else
+	struct timeval timeout;
+	struct timeval *ptr_timeout;
+#endif
+
+	if (!forRead && !forWrite)
+		return 0;
+
+	FD_ZERO(&input_mask);
+	FD_ZERO(&output_mask);
+	FD_ZERO(&except_mask);
+	if (forRead)
+		FD_SET(sock, &input_mask);
+

Remove WIN32 conditional compilation from win32common.c

2023-12-05 Thread Tristan Partin
The file is only referenced in Meson and MSVC scripts from what I can 
tell, and the Meson reference is protected by a Windows check.


--
Tristan Partin
Neon (https://neon.tech)
From bc9ffc7b0b141959a4c2a3f8b731f457ff5825c1 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Tue, 5 Dec 2023 10:18:37 -0600
Subject: [PATCH v1] Remove WIN32 conditional compilation from win32common.c

This file is only compiled when WIN32 is defined, so the #ifdef was
not needed.
---
 src/port/win32common.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/port/win32common.c b/src/port/win32common.c
index 2fd78f7f93..a075d01209 100644
--- a/src/port/win32common.c
+++ b/src/port/win32common.c
@@ -19,8 +19,6 @@
 #include "postgres.h"
 #endif
 
-#ifdef WIN32
-
 /*
  * pgwin32_get_file_type
  *
@@ -64,5 +62,3 @@ pgwin32_get_file_type(HANDLE hFile)
 
 	return fileType;
 }
-
-#endif			/* WIN32 */
-- 
Tristan Partin
Neon (https://neon.tech)



Re: Clean up some signal usage mainly related to Windows

2023-12-06 Thread Tristan Partin

On Wed Dec 6, 2023 at 10:18 AM CST, Nathan Bossart wrote:

On Wed, Dec 06, 2023 at 10:23:52AM +0100, Peter Eisentraut wrote:
> Ok, I have committed your 0001 patch.

My compiler is unhappy about this one:

../postgresql/src/bin/pg_test_fsync/pg_test_fsync.c:605:2: error: ignoring 
return value of ‘write’, declared with attribute warn_unused_result 
[-Werror=unused-result]
  605 |  write(STDOUT_FILENO, "\n", 1);
  |  ^


Some glibc source:


/* If fortification mode, we warn about unused results of certain
   function calls which can lead to problems.  */
#if __GNUC_PREREQ (3,4) || __glibc_has_attribute (__warn_unused_result__)
# define __attribute_warn_unused_result__ \
   __attribute__ ((__warn_unused_result__))
# if defined __USE_FORTIFY_LEVEL && __USE_FORTIFY_LEVEL > 0
#  define __wur __attribute_warn_unused_result__
# endif
#else
# define __attribute_warn_unused_result__ /* empty */
#endif
#ifndef __wur
# define __wur /* Ignore */
#endif



extern ssize_t write (int __fd, const void *__buf, size_t __n) __wur
__attr_access ((__read_only__, 2, 3));


According to my setup, I am hitting the /* Ignore */ variant of __wur. 
I am guessing that Fedora doesn't add fortification to the default 
CFLAGS. What distro are you using? But yes, something like what you 
proposed sounds good to me. Sorry for leaving this out!


Makes me wonder if setting -D_FORTIFY_SOURCE=2 in debug builds at least 
would make sense, if not all builds. According to the OpenSSF[0], level 
2 is only supposed to impact runtime performance by 0.1%.


[0]: 
https://best.openssf.org/Compiler-Hardening-Guides/Compiler-Options-Hardening-Guide-for-C-and-C++.html#performance-implications

--
Tristan Partin
Neon (https://neon.tech)




Add --check option to pgindent

2023-12-11 Thread Tristan Partin
Not sold on the name, but --check is a combination of --silent-diff and 
--show-diff. I envision --check mostly being used in CI environments. 
I recently came across a situation where this behavior would have been 
useful. Without --check, you're left to capture the output of 
--show-diff and exit 2 if the output isn't empty by yourself.


--
Tristan Partin
Neon (https://neon.tech)
From 1001252d49a47e660045cee3d2ba5abd87e925d9 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Mon, 11 Dec 2023 17:34:17 -0600
Subject: [PATCH v1] Add --check option to pgindent

The option is a combination of --show-diff and --silent-diff. It is
useful for situations such as CI checks where the diff can be logged,
but will also cause the build to fail. Without such an option, people
are left to re-implement the same logic over and over again.
---
 src/tools/pgindent/pgindent | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index bce63d95da..a285015c76 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -23,7 +23,8 @@ my $devnull = File::Spec->devnull;
 
 my ($typedefs_file, $typedef_str, @excludes,
 	$indent, $build, $show_diff,
-	$silent_diff, $help, @commits,);
+	$silent_diff, $help, @commits,
+	$check,);
 
 $help = 0;
 
@@ -35,7 +36,8 @@ my %options = (
 	"excludes=s" => \@excludes,
 	"indent=s" => \$indent,
 	"show-diff" => \$show_diff,
-	"silent-diff" => \$silent_diff,);
+	"silent-diff" => \$silent_diff,
+	"check" => \$check,);
 GetOptions(%options) || usage("bad command line argument");
 
 usage() if $help;
@@ -43,6 +45,12 @@ usage() if $help;
 usage("Cannot have both --silent-diff and --show-diff")
   if $silent_diff && $show_diff;
 
+usage("Cannot have both --check and --show-diff")
+  if $check && $show_diff;
+
+usage("Cannot have both --check and --silent-diff")
+  if $check && $silent_diff;
+
 usage("Cannot use --commit with command line file list")
   if (@commits && @ARGV);
 
@@ -325,6 +333,7 @@ Options:
 	--indent=PATH   path to pg_bsd_indent program
 	--show-diff show the changes that would be made
 	--silent-diff   exit with status 2 if any changes would be made
+	--check combination of --show-diff and --silent-diff
 The --excludes and --commit options can be given more than once.
 EOF
 	if ($help)
@@ -417,7 +426,12 @@ foreach my $source_filename (@files)
 
 	if ($source ne $orig_source)
 	{
-		if ($silent_diff)
+		if ($check)
+		{
+			print show_diff($source, $source_filename);
+			exit 2;
+		}
+		elsif ($silent_diff)
 		{
 			exit 2;
 		}
-- 
Tristan Partin
Neon (https://neon.tech)



Re: Add --check option to pgindent

2023-12-12 Thread Tristan Partin

On Tue Dec 12, 2023 at 5:47 AM CST, Euler Taveira wrote:

On Tue, Dec 12, 2023, at 7:28 AM, Michael Banck wrote:
> On Tue, Dec 12, 2023 at 11:18:59AM +0100, Daniel Gustafsson wrote:
> > > On 12 Dec 2023, at 01:09, Tristan Partin  wrote:
> > > 
> > > Not sold on the name, but --check is a combination of --silent-diff

> > > and --show-diff. I envision --check mostly being used in CI
> > > environments. I recently came across a situation where this behavior
> > > would have been useful. Without --check, you're left to capture the
> > > output of --show-diff and exit 2 if the output isn't empty by
> > > yourself.
> > 
> > I wonder if we should model this around the semantics of git diff to

> > keep it similar to other CI jobs which often use git diff?  git diff
> > --check means "are there conflicts or issues" which isn't really
> > comparable to here, git diff --exit-code however is pretty much
> > exactly what this is trying to accomplish.
> > 
> > That would make pgindent --show-diff --exit-code exit with 1 if there

> > were diffs and 0 if there are no diffs.
> 
> To be honest, I find that rather convoluted; contrary to "git diff", I

> believe the primary action of pgident is not to show diffs, so I find
> the proposed --check option to be entirely reasonable from a UX
> perspective.
> 
> On the other hand, tying a "does this need re-indenting?" question to a

> "--show-diff --exit-code" option combination is not very obvious (to me,
> at least).

Multiple options to accomplish a use case might not be obvious. I'm wondering
if we can combine it into a unique option.

--show-diff show the changes that would be made
--silent-diff   exit with status 2 if any changes would be made
+ --check combination of --show-diff and --silent-diff

I mean

--diff=show,silent,check

When you add exceptions, it starts to complicate the UI.

usage("Cannot have both --silent-diff and --show-diff")
   if $silent_diff && $show_diff;
 
+usage("Cannot have both --check and --show-diff")

+  if $check && $show_diff;
+
+usage("Cannot have both --check and --silent-diff")
+  if $check && $silent_diff;
+


I definitely agree. I just didn't want to remove options, but if people 
are ok with that, I will just change the interface.


I like the git-diff semantics or have --diff and --check, similar to how 
Python's black[0] is.


[0]: https://github.com/psf/black

--
Tristan Partin
Neon (https://neon.tech)




Re: typedefs.list glitches

2023-12-12 Thread Tristan Partin

On Thu May 12, 2022 at 4:21 PM CDT, Tom Lane wrote:

I wrote:
> every buildfarm member that's contributing to the typedefs list
> builds with OpenSSL.  That wouldn't surprise me, except that
> my own animal sifaka should be filling that gap.  Looking at
> its latest attempt[1], it seems to be generating an empty list,
> which I guess means that our recipe for extracting typedefs
> doesn't work on macOS/arm64.  I shall investigate.

Found it.  Current macOS produces

$ objdump -W
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/objdump:
 error: unknown argument '-W'

where last year's vintage produced

$ objdump -W
objdump: Unknown command line argument '-W'.  Try: 
'/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/objdump
 --help'
objdump: Did you mean '-C'?

This confuses run_build.pl into taking the "Linux and sometimes windows"
code path instead of the $using_osx one.  I think simplest fix is to
move the $using_osx branch ahead of the heuristic ones, as attached.


Hey Tom,

Was this patch ever committed?

--
Tristan Partin
Neon (https://neon.tech)




  1   2   3   >