Re: pg_preadv() and pg_pwritev()
On 11.01.2021 05:59, Thomas Munro wrote: Since only sifaka has managed to return a result so far (nice CPU), I had plenty of time to notice that macOS Big Sur has introduced preadv/pwritev. They were missing on Catalina[1]. [1] https://cirrus-ci.com/task/6002157537198080 Hi, Thomas! Indeed, pwritev is not available on macOS Catalina. So I get compiler warnings about that: /Users/shinderuk/src/pgwork/devel/build/../src/port/pwrite.c:117:10: warning: 'pwritev' is only available on macOS 11.0 or newer [-Wunguarded-availability-new] part = pg_pwritev(fd, iov, iovcnt, offset); ^~ /Users/shinderuk/src/pgwork/devel/build/../src/include/port/pg_iovec.h:49:20: note: expanded from macro 'pg_pwritev' #define pg_pwritev pwritev ^~~ /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk/usr/include/sys/uio.h:104:9: note: 'pwritev' has been marked as being introduced in macOS 11.0 here, but the deployment target is macOS 10.15.0 ssize_t pwritev(int, const struct iovec *, int, off_t) __DARWIN_NOCANCEL(pwritev) __API_AVAILABLE(macos(11.0), ios(14.0), watchos(7.0), tvos(14.0)); ^ /Users/shinderuk/src/pgwork/devel/build/../src/port/pwrite.c:117:10: note: enclose 'pwritev' in a __builtin_available check to silence this warning part = pg_pwritev(fd, iov, iovcnt, offset); ^~ /Users/shinderuk/src/pgwork/devel/build/../src/include/port/pg_iovec.h:49:20: note: expanded from macro 'pg_pwritev' #define pg_pwritev pwritev ^~~ 1 warning generated. (... several more warnings ...) And initdb fails: running bootstrap script ... dyld: lazy symbol binding failed: Symbol not found: _pwritev Referenced from: /Users/shinderuk/src/pgwork/devel/install/bin/postgres Expected in: /usr/lib/libSystem.B.dylib dyld: Symbol not found: _pwritev Referenced from: /Users/shinderuk/src/pgwork/devel/install/bin/postgres Expected in: /usr/lib/libSystem.B.dylib Regards. -- Sergey Shinderuk Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: pg_preadv() and pg_pwritev()
On 13.01.2021 12:56, Thomas Munro wrote: On Wed, Jan 13, 2021 at 10:40 PM Sergey Shinderuk wrote: /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk/usr/include/sys/uio.h:104:9: note: 'pwritev' has been marked as being introduced in macOS 11.0 here, but the deployment target is macOS 10.15.0 ssize_t pwritev(int, const struct iovec *, int, off_t) __DARWIN_NOCANCEL(pwritev) __API_AVAILABLE(macos(11.0), ios(14.0), watchos(7.0), tvos(14.0)); ^ Hrm... So why did "configure" think you have pwritev, then? It seems like you must have been using different compilers or options at configure time and compile time, no? No, i've just rerun configure from clean checkout without any options. It does think that pwritev is available. I'll try to figure this out later and come back to you. Thanks.
Re: pg_preadv() and pg_pwritev()
The symptoms sound consistent with using bleeding-edge Xcode on a Catalina machine ... please report exact OS and Xcode versions. macOS 10.15.7 (19H2) Xcode 12.3 (12C33) macOS SDK 11.1 (20C63) Everything is fine if I run "configure" with PG_SYSROOT=/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk I noticed that "cc" invoked from command line uses: -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk But "xcodebuild -version -sdk macosx Path" invoked by "configure" when PG_SYSROOT is not provided gives: /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk Now I'm confused about different SDK versions and locations used by Xcode and CommandLineTools :)
Re: pg_preadv() and pg_pwritev()
On 14.01.2021 18:42, Tom Lane wrote: I noticed that "cc" invoked from command line uses: -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk Hm, how did you determine that exactly? % echo 'int main(void){}' >tmp.c % cc -v tmp.c Apple clang version 12.0.0 (clang-1200.0.32.28) Target: x86_64-apple-darwin19.6.0 Thread model: posix InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin "/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang" -cc1 -triple x86_64-apple-macosx10.15.0 -Wdeprecated-objc-isa-usage -Werror=deprecated-objc-isa-usage -Werror=implicit-function-declaration -emit-obj -mrelax-all -disable-free -disable-llvm-verifier -discard-value-names -main-file-name tmp.c -mrelocation-model pic -pic-level 2 -mthread-model posix -mframe-pointer=all -fno-strict-return -masm-verbose -munwind-tables -target-sdk-version=10.15.6 -fcompatibility-qualified-id-block-type-checking -target-cpu penryn -dwarf-column-info -debugger-tuning=lldb -target-linker-version 609.8 -v -resource-dir /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/12.0.0 -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -I/usr/local/include -internal-isystem /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/local/include -internal-isystem /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/12.0.0/include -internal-externc-isystem /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include -internal-externc-isystem /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include -Wno-reorder-init-list -Wno-implicit-int-float-conversion -Wno-c99-designator -Wno-final-dtor-non-final-class -Wno-extra-semi-stmt -Wno-misleading-indentation -Wno-quoted-include-in-framework-header -Wno-implicit-fallthrough -Wno-enum-enum-conversion -Wno-enum-float-conversion -fdebug-compilation-dir /Users/shinderuk -ferror-limit 19 -fmessage-length 238 -stack-protector 1 -fstack-check -mdarwin-stkchk-strong-link -fblocks -fencode-extended-block-signature -fregister-global-dtors-with-atexit -fgnuc-version=4.2.1 -fobjc-runtime=macosx-10.15.0 -fmax-type-align=16 -fdiagnostics-show-option -fcolor-diagnostics -o /var/folders/8x/jvqv7hyd5h98m7tz2zm9r0ycgn/T/tmp-91fb5e.o -x c tmp.c clang -cc1 version 12.0.0 (clang-1200.0.32.28) default target x86_64-apple-darwin19.6.0 ignoring nonexistent directory "/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/local/include" ignoring nonexistent directory "/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/Library/Frameworks" #include "..." search starts here: #include <...> search starts here: /usr/local/include /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/12.0.0/include /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/System/Library/Frameworks (framework directory) End of search list. "/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld" -demangle -lto_library /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/libLTO.dylib -no_deduplicate -dynamic -arch x86_64 -platform_version macos 10.15.0 10.15.6 -syslibroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -o a.out -L/usr/local/lib /var/folders/8x/jvqv7hyd5h98m7tz2zm9r0ycgn/T/tmp-91fb5e.o -lSystem /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/12.0.0/lib/darwin/libclang_rt.osx.a
Re: pg_preadv() and pg_pwritev()
On 15.01.2021 01:13, Tom Lane wrote: I borrowed my wife's Mac, which is still on Catalina and up to now never had Xcode on it, and found some very interesting things. Step 1: download/install Xcode 12.3, open it, agree to license, wait for it to finish "installing components". At this point, /Library/Developer/CommandLineTools doesn't exist, and we have the following outputs from various probe commands: % xcrun --show-sdk-path /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk % xcrun --sdk macosx --show-sdk-path /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk % xcodebuild -version -sdk macosx Path /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk Also, cc -v reports -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk Unsurprisingly, Xcode 12.3 itself only contains % ls -l /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs total 0 drwxr-xr-x 5 root wheel 160 Nov 30 07:27 DriverKit20.2.sdk drwxr-xr-x 7 root wheel 224 Nov 30 07:27 MacOSX.sdk lrwxr-xr-x 1 root wheel 10 Jan 14 15:57 MacOSX11.1.sdk -> MacOSX.sdk Step 2: install command line tools (I used "xcode-select --install" to fire this off, rather than the Xcode menu item). Now I have % ls -l /Library/Developer/CommandLineTools/SDKs total 0 lrwxr-xr-x 1 root wheel 14 Jan 14 16:42 MacOSX.sdk -> MacOSX11.1.sdk drwxr-xr-x 8 root wheel 256 Jul 9 2020 MacOSX10.15.sdk drwxr-xr-x 7 root wheel 224 Nov 30 07:33 MacOSX11.1.sdk which is pretty interesting in itself, because the same directory on my recently-updated-to-Big-Sur Macs does NOT have the 11.1 SDK. I wonder what determines which versions get installed here. More interesting yet: % xcrun --show-sdk-path /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk % xcrun --sdk macosx --show-sdk-path /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk % xcodebuild -version -sdk macosx Path /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk and cc -v reports -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk So apparently, "xcrun --show-sdk-path" (without any -sdk option) is the most authoritative guide to the compiler's default sysroot. However, googling turns up various people reporting that "xcrun --show-sdk-path" returns an empty string for them, and our last major investigation into this [1] found that there are some system states where the compiler appears to have no default sysroot, which I bet is the same thing. I do not at this point have a recipe to reproduce such a state, but we'd be fools to imagine it's no longer possible. My guess about it is that Apple's processes for updating the default sysroot during system updates are just plain buggy, with various corner cases that have ill-understood causes. Also, after re-reading [1] I am not at all excited about trying to remove the -isysroot switches from our *FLAGS. What I propose to do is keep that, but improve our mechanism for choosing a default value for PG_SYSROOT. It looks like first trying "xcrun --show-sdk-path", and falling back to "xcodebuild -version -sdk macosx Path" if that doesn't yield a valid path, is more likely to give a working build than relying entirely on xcodebuild. Maybe there's a case for trying "xcrun --sdk macosx --show-sdk-path" in between; in my tests that seemed noticeably faster than invoking xcodebuild, and I've not yet seen a case where it gave a different answer. Thoughts? regards, tom lane [1] https://www.postgresql.org/message-id/flat/20840.1537850987%40sss.pgh.pa.us Thanks for thorough investigation and sorry for the late reply. I spent quite some time trying to understand / reverse engineer the logic behind xcrun's default SDK selection. Apparently, "man xcrun" is not accurate saying: The SDK which will be searched defaults to the most recent available... I didn't find anything really useful or helpful. "/Library/Developer/CommandLineTools" is hardcoded into "libxcrun.dylib". On my machine xcrun scans /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs and /Library/Developer/CommandLineTools/SDKs in that order, and loads "SDKSettings.plist" from each subdirectory. I looked into plists, but couldn't find anything special about "MacOSX10.15.sdk". Okay, here is what I have: % ls -l /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs total 0 drwxr-xr-x 5 root wheel 160 Nov 30 15:27 DriverKit20.2.sdk drwxr-xr-x 7 root wheel 224 Nov 30 15:27 MacOSX.sdk lrwxr-xr-x 1 root wheel 10 Dec 17 14:25 MacOSX11.1.sdk -> MacOSX.sdk % ls -l /Library/Developer/CommandLineTools/SDKs total 0 lrwxr-xr-x 1 root wheel 14
Re: pg_preadv() and pg_pwritev()
On 14.01.2021 21:05, Tom Lane wrote: After considerable playing around, I'm guessing that the reason -no_weak_imports doesn't help is that it rejects calls that are marked as weak references on the *calling* side. Since AC_CHECK_FUNCS doesn't bother to #include the relevant header file, the compiler doesn't know that preadv() ought to be marked as a weak reference. Then, when the test program gets linked against the stub libc that's provided by the SDK, there is a version of preadv() there so no link failure occurs. (There are way more moving parts in this weak-reference thing than I'd realized.) Oh, that's interesting. I've just played with it a bit and it looks exactly as you say. Another thing I've been realizing while poking at this is that we might not need to set -isysroot explicitly at all, which would then lead to the compiler using its default sysroot automatically. In some experimentation, it seems like what we need PG_SYSROOT for is just for configure to be able to find tclConfig.sh and the Perl header files. So at this point I'm tempted to try ripping that out altogether. If you remove the lines in src/template/darwin that inject PG_SYSROOT into CPPFLAGS and LDFLAGS, do things work for you? Yes, it works fine.
Re: pg_preadv() and pg_pwritev()
On 15.01.2021 01:13, Tom Lane wrote: than relying entirely on xcodebuild. Maybe there's a case for trying "xcrun --sdk macosx --show-sdk-path" in between; in my tests that seemed noticeably faster than invoking xcodebuild, and I've not yet seen a case where it gave a different answer. I see that "xcrun --sdk macosx --show-sdk-path" really calls "/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk macosx -version Path" under the hood. % lldb -- xcrun --no-cache --sdk macosx --show-sdk-path (lldb) target create "xcrun" Current executable set to 'xcrun' (x86_64). (lldb) settings set -- target.run-args "--no-cache" "--sdk" "macosx" "--show-sdk-path" (lldb) settings set target.unset-env-vars SDKROOT (lldb) b popen Breakpoint 1: where = libsystem_c.dylib`popen, address = 0x7fff67265607 (lldb) r Process 26857 launched: '/usr/bin/xcrun' (x86_64) Process 26857 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 frame #0: 0x7fff6e313607 libsystem_c.dylib`popen libsystem_c.dylib`popen: -> 0x7fff6e313607 <+0>: pushq %rbp 0x7fff6e313608 <+1>: movq %rsp, %rbp 0x7fff6e31360b <+4>: pushq %r15 0x7fff6e31360d <+6>: pushq %r14 Target 0: (xcrun) stopped. (lldb) p (char *)$arg1 (char *) $1 = 0x000100406960 "/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk macosx -version Path" % sudo dtrace -n 'pid$target::popen:entry { trace(copyinstr(arg0)) }' -c 'xcrun --sdk macosx --show-sdk-path' dtrace: description 'pid$target::popen:entry ' matched 1 probe CPU IDFUNCTION:NAME 0 413269 popen:entry /Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk macosx -version Path /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk dtrace: pid 26905 has exited
Re: pg_preadv() and pg_pwritev()
On 15.01.2021 04:53, Sergey Shinderuk wrote: I see that "xcrun --sdk macosx --show-sdk-path" really calls "/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -sdk macosx -version Path" under the hood. ... and caches the result. xcodebuild not called without --no-cache. So it still make sense to fall back on xcodebuild. % sudo dtrace -n 'pid$target::popen:entry { trace(copyinstr(arg0)) }' -c 'xcrun --sdk macosx --show-sdk-path' dtrace: description 'pid$target::popen:entry ' matched 1 probe /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk dtrace: pid 26981 has exited
Re: pg_preadv() and pg_pwritev()
On 15.01.2021 05:04, Tom Lane wrote: on her machine there's no detail at all: % xcrun --verbose --no-cache --show-sdk-path /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk The same on my machine. I get details for --find, but not for --show-sdk-path. So I'm not sure what to make of that. But I'm hesitant to assume that xcrun is just a wrapper around xcodebuild. I agree.
Re: pg_preadv() and pg_pwritev()
On 15.01.2021 04:45, Tom Lane wrote: Hence, I propose the attached. This works as far as I can tell to fix the problem you're seeing. Yes, it works fine. Thank you very much.
Re: [PATCH 1/1] Fix detection of pwritev support for OSX.
On 21.01.2021 02:07, Tom Lane wrote: I now believe what is actually happening with the short command is that it's iterating through the available SDKs (according to some not very clear search path) and picking the first one it finds that matches the host system version. That matches the ktrace evidence that shows it reading the SDKSettings.plist file in each SDK directory. Yes, you are right. After some more digging... It searches the DEVELOPER_DIR first and then /Library/Developer/CommandLineTools, which is hardcoded. My DEVELOPER_DIR is % xcode-select -p /Applications/Xcode.app/Contents/Developer (For more detail try "otool -tV /usr/lib/libxcselect.dylib -p _xcselect_get_developer_dir_path".) It reads ProductVersion from /System/Library/CoreServices/SystemVersion.plist % plutil -p /System/Library/CoreServices/SystemVersion.plist | grep ProductVersion "ProductVersion" => "10.15.7" Strips anything after the second dot, and prepends "macosx" to it, which gives "macosx10.15". Then it scans through SDK dirs looking up CanonicalName from SDKSettings.plist until it finds a match with "macosx10.15". The overall callstack: % sudo dtrace -n 'syscall::getdirentries64:entry { ustack() }' -c 'xcrun --show-sdk-path' dtrace: description 'syscall::getdirentries64:entry ' matched 1 probe /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk dtrace: pid 20183 has exited CPU IDFUNCTION:NAME 0846getdirentries64:entry libsystem_kernel.dylib`__getdirentries64+0xa libsystem_c.dylib`readdir$INODE64+0x23 libsystem_c.dylib`scandir$INODE64+0x6c libxcrun.dylib`cltools_lookup_sdk_by_key+0x5f libxcrun.dylib`cltools_lookup_boot_system_sdk+0xda libxcrun.dylib`xcinfocache_resolve_sdkroot+0xc0 libxcrun.dylib`xcrun_main2+0x57a libxcrun.dylib`xcrun_main+0x9 libxcselect.dylib`xcselect_invoke_xcrun_via_library+0xc8 libxcselect.dylib`xcselect_invoke_xcrun+0x25a xcrun`DYLD-STUB$$getprogname libdyld.dylib`start+0x1 xcrun`0x2 0846getdirentries64:entry libsystem_kernel.dylib`__getdirentries64+0xa libsystem_c.dylib`readdir$INODE64+0x23 libsystem_c.dylib`scandir$INODE64+0x6c libxcrun.dylib`cltools_lookup_sdk_by_key+0x5f libxcrun.dylib`cltools_lookup_boot_system_sdk+0xf3 libxcrun.dylib`xcinfocache_resolve_sdkroot+0xc0 libxcrun.dylib`xcrun_main2+0x57a libxcrun.dylib`xcrun_main+0x9 libxcselect.dylib`xcselect_invoke_xcrun_via_library+0xc8 libxcselect.dylib`xcselect_invoke_xcrun+0x25a xcrun`DYLD-STUB$$getprogname libdyld.dylib`start+0x1 xcrun`0x2 The SDK search path: % sudo dtrace -n 'pid$target:::entry /probefunc=="cltools_lookup_sdk_by_key"/ { trace(copyinstr(arg0)); trace(copyinstr(arg1)) }' -c 'xcrun --show-sdk-path' dtrace: description 'pid$target:::entry ' matched 17293 probes /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk dtrace: pid 20191 has exited CPU IDFUNCTION:NAME 8 398290 cltools_lookup_sdk_by_key:entry /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer macosx10.15 9 398290 cltools_lookup_sdk_by_key:entry /Library/Developer/CommandLineTools macosx10.15 The properties read from SDKSettings.plist: % sudo dtrace -n 'pid$target:::entry /probefunc=="_cltools_lookup_property_in_path"/ { trace(copyinstr(arg0)); trace(copyinstr(arg1)); trace(copyinstr(arg2)) }' -c 'xcrun --show-sdk-path' dtrace: description 'pid$target:::entry ' matched 17293 probes /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk dtrace: pid 20195 has exited CPU IDFUNCTION:NAME 8 398288 _cltools_lookup_property_in_path:entry / System/Library/CoreServices/SystemVersion.plist ProductVersion 8 398288 _cltools_lookup_property_in_path:entry /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/DriverKit20.2.sdk SDKSettings.plist IsBaseSDK 8 398288 _cltools_lookup_property_in_path:entry /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/DriverKit20.2.sdk SDKSettings.plist CanonicalName 4 398288 _cltools_lookup_property_in_path:entry /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/DriverKit20.2.sdk SDKSettings.plist CanonicalNameForBuildSettings 4 398288 _cltools_lookup_property_in_path:entry /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk SDKSettings.plist IsBaseSDK 4 398288 _cltools_lookup_property_in_path:entry /Applications/Xcode.app/Contents/Developer/Platforms
Re: [PATCH 1/1] Fix detection of pwritev support for OSX.
On 22.01.2021 01:17, James Hilliard wrote: On Thu, Jan 21, 2021 at 11:38 AM Tom Lane wrote: James Hilliard writes: On Wed, Jan 20, 2021 at 4:07 PM Tom Lane wrote: I'm not sure that the case of not having the "command line tools" installed is interesting for our purposes. AFAIK you have to have that in order to have access to required tools like bison and gmake. (That reminds me, I was intending to add something to our docs about how-to-build-from-source to say that you need to install those.) Yeah, not 100% sure but I was able to build just fine after deleting my command line tools. Hm. I've never been totally clear on what's included in the "command line tools", although it's now apparent that one thing that gets installed is an SDK matching the host OS version. However, Apple's description at [1] says Command Line Tools Download the macOS SDK, headers, and build tools such as the Apple LLVM compiler and Make. These tools make it easy to install open source software or develop on UNIX within Terminal. macOS can automatically download these tools the first time you try to build software, and they are available on the downloads page. which certainly strongly implies that gmake is not there otherwise. At this point I lack any "bare" macOS system to check it on. I wonder whether you have a copy of make available from MacPorts or Homebrew. Or maybe uninstalling the command line tools doesn't really remove everything? Yeah, not entirely sure there but I do use homebrew. FWIW, I tested with a clean install of Catalina. Before I install anything at all, I already have xcode-select, xcrun and all the shims in /usr/bin for developer tools, including cc, make, git, xcodebuild... Just about everything listed in the FILES section of "man xcode-select". When I run any tool (except xcode-select), a GUI dialog pops up offering to install the Command Line Tools. So apparently those shims are not functional yet. I rejected the installation. Instead I downloaded Xcode12.1.xip via [1], the latest version with macosx10.15 SDK. I unpacked it and installed by dragging Xcode.app to /Applications. It seems to me there is no magic behind the scenes, just moving the directory. I selectively checked that the shims in /usr/bin didn't change after that. Now, running "cc" tells me that I have to accept the Xcode license agreement. After accepting it, all the shims in /usr/bin start to work, forwarding to the real tools inside Xcode.app. If I run the Homebrew installer, it says that it's going to install the Command Line Tools. I don't know why it needs them, all the tools are there already. I thought that CLT is a lighter-weight option when you don't want the full Xcode installation, but Homebrew requires them anyway. I rejected to install CLT and abandoned Homebrew. Then I just cloned and built Postgres successfully. So it looks like Xcode is really enough, at least on a recent macOS version. [1] https://xcodereleases.com -- Sergey Shinderuk Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [PATCH 1/1] Fix detection of pwritev support for OSX.
On 22.01.2021 20:12, Tom Lane wrote: [ pokes more carefully... ] Ah-hah, I see why I needed the CLT. I bet you'll find that you can't build from "git clean -dfx" state with only Xcode, because comparing the contents of /Applications/Xcode.app/Contents/Developer/usr/bin and /Library/Developer/CommandLineTools/usr/bin on my own Mac, I observe that only the CLT provides bison and flex. I also see install_name_tool only in the CLT; we don't depend on that today, but may soon (see the latest thread about coping with SIP). I did git clone from scratch. Xcode really has all the tools. configure:9519: checking for bison configure:9537: found /usr/bin/bison configure:9549: result: /usr/bin/bison configure:9571: using bison (GNU Bison) 2.3 configure:9609: checking for flex configure:9654: result: /usr/bin/flex configure:9674: using flex 2.5.35 Apple(flex-32) % xcrun --find bison /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/bison % xcrun --find install_name_tool /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/install_name_tool On the whole it looks like we should recommend installing the CLT and not bothering with Xcode, which is about 10X the size: $ du -hs /Library/Developer/CommandLineTools 1.1G/Library/Developer/CommandLineTools $ du -hs /Applications/Xcode.app 15G/Applications/Xcode.app Fair. BTW, reading [1] I see You can install Xcode, the CLT, or both; Homebrew supports all three configurations. So I'm not sure why you got that prompt, unless you were using a formula that knew you were going to need bison. [1] https://docs.brew.sh/Installation#3 Apparently, this documentation is wrong. I’m not installing any particular formula, just running the Homebrew installer script. % /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)" Password: ==> This script will install: [...] ==> The following new directories will be created: [...] ==> The Xcode Command Line Tools will be installed. Press RETURN to continue or any other key to abort ==> Installing Command Line Tools for Xcode-12.3 ==> /usr/bin/sudo /usr/sbin/softwareupdate -i Command\ Line\ Tools\ for\ Xcode-12.3 Software Update Tool Downloading Command Line Tools for Xcode [...] I checked the script [1], and it really requires the CLT. Here is the explanation [2] for this: There is actually no such requirement. However, there are formulae that will be forced to build from source if you do not have the CLT. They can still be built from source with Xcode only, but because the pre-built bottles are compiled in an environment that has both Xcode and the CLT installed, there are some cases where the bottles end up having a hard dependency on the CLT. A major example is gcc. So installing the CLT may help you avoid some lengthy source builds. We ensure that all Homebrew formulae can be built with Xcode.app alone. Most formulae can be built with just the CLT, and those that require the full Xcode.app have an explicit depends_on :xcode => :build. Some users would prefer to use only the CLT because it's a much smaller download and takes less time to install and upgrade than Xcode. [1] https://github.com/Homebrew/install/blob/master/install.sh#L191 [2] https://github.com/Homebrew/brew/issues/1613 Regards. -- Sergey Shinderuk Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [PATCH 1/1] Fix detection of pwritev support for OSX.
On 23.01.2021 08:02, Sergey Shinderuk wrote: I checked the script [1], and it really requires the CLT. Here is the explanation [2] for this: There is actually no such requirement. However, there are formulae that will be forced to build from source if you do not have the CLT. They can still be built from source with Xcode only, but because the pre-built bottles are compiled in an environment that has both Xcode and the CLT installed, there are some cases where the bottles end up having a hard dependency on the CLT. A major example is gcc. So installing the CLT may help you avoid some lengthy source builds. We ensure that all Homebrew formulae can be built with Xcode.app alone. Most formulae can be built with just the CLT, and those that require the full Xcode.app have an explicit depends_on :xcode => :build. Some users would prefer to use only the CLT because it's a much smaller download and takes less time to install and upgrade than Xcode. In the gcc formula [1]: # The bottles are built on systems with the CLT installed, and do not work # out of the box on Xcode-only systems due to an incorrect sysroot. pour_bottle? do reason "The bottle needs the Xcode CLT to be installed." satisfy { MacOS::CLT.installed? } end I guess this is the "xcrun --show-sdk-path" thing we've alredy disccussed :) [1] https://github.com/Homebrew/homebrew-core/blob/master/Formula/gcc.rb#L36 -- Sergey Shinderuk Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [PATCH 1/1] Fix detection of pwritev support for OSX.
On 23.01.2021 08:02, Sergey Shinderuk wrote: On the whole it looks like we should recommend installing the CLT and not bothering with Xcode, which is about 10X the size: $ du -hs /Library/Developer/CommandLineTools 1.1G /Library/Developer/CommandLineTools $ du -hs /Applications/Xcode.app 15G /Applications/Xcode.app Fair. BTW, Homebrew prefers the CLT SDK: https://github.com/Homebrew/brew/blob/master/Library/Homebrew/os/mac.rb#L138 # Prefer CLT SDK when both Xcode and the CLT are installed. # Expected results: # 1. On Xcode-only systems, return the Xcode SDK. # 2. On Xcode-and-CLT systems where headers are provided by the system, return nil. # 3. On CLT-only systems with no CLT SDK, return nil. # 4. On CLT-only systems with a CLT SDK, where headers are provided by the system, return nil. # 5. On CLT-only systems with a CLT SDK, where headers are not provided by the system, return the CLT SDK. Here is the relevant discussion: https://github.com/Homebrew/brew/pull/7134 I like the example of Git compiled against the wrong LIBCURL_VERSION_NUM. Clearly, there are other issues with cross-compiling to a newer SDK, besides autoconf probes and weak imports. -- Sergey Shinderuk Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Improve error handling of HMAC computations and SCRAM
Hi, On 11.01.2022 07:56, Michael Paquier wrote: > Thoughts? A few comments after a quick glance... + * Returns a static string providing errors about an error that happened "errors about an error" looks odd. +static const char * +SSLerrmessage(unsigned long ecode) +{ + if (ecode == 0) + return NULL; + + /* +* This may return NULL, but we would fall back to a default error path if +* that were the case. +*/ + return ERR_reason_error_string(ecode); +} We already have SSLerrmessage elsewhere and it's documented to never return NULL. I find that confusing. If I have two distinct pg_hmac_ctx's, are their errreason's idependent from one another or do they really point to the same static buffer? Regards, -- Sergey Shinderukhttps://postgrespro.com/
Re: Improve error handling of HMAC computations and SCRAM
On 11.01.2022 10:57, Michael Paquier wrote: On Tue, Jan 11, 2022 at 10:50:50AM +0300, Sergey Shinderuk wrote: + * Returns a static string providing errors about an error that happened "errors about an error" looks odd. Sure, that could be reworded. What about "providing details about an error"? Yeah, that's better. I thought "providing errors about an error" was a typo, but now I see the same comment was committed in b69aba745. Is it just me? :) Thanks, -- Sergey Shinderukhttps://postgrespro.com/
Re: Improve error handling of HMAC computations and SCRAM
On 12.01.2022 14:32, Michael Paquier wrote: On Wed, Jan 12, 2022 at 12:56:17PM +0900, Michael Paquier wrote: Attached is a rebased patch for the HMAC portions, with a couple of fixes I noticed while going through this stuff again (mostly around SASLprep and pg_fe_scram_build_secret), and a fix for a conflict coming from 9cb5518. psql's \password is wrong to assume that the only error that can happen for scran-sha-256 is an OOM, but we'll get there. With an attachment, that's even better. (Thanks, Daniel.) Gave it a thorough read. Looks good, except for errstr not set in a couple of places (see the diff attached). Didn't test it. -- Sergey Shinderukhttps://postgrespro.com/diff --git a/src/common/hmac.c b/src/common/hmac.c index 592f9b20a38..a27778e86b3 100644 --- a/src/common/hmac.c +++ b/src/common/hmac.c @@ -46,9 +46,7 @@ typedef enum pg_hmac_errno PG_HMAC_ERROR_INTERNAL } pg_hmac_errno; -/* - * Internal structure for pg_hmac_ctx->data with this implementation. - */ +/* Internal pg_hmac_ctx structure */ struct pg_hmac_ctx { pg_cryptohash_ctx *hash; diff --git a/src/common/hmac_openssl.c b/src/common/hmac_openssl.c index c352f9db9e9..44f36d51dcb 100644 --- a/src/common/hmac_openssl.c +++ b/src/common/hmac_openssl.c @@ -60,9 +60,7 @@ typedef enum pg_hmac_errno PG_HMAC_ERROR_OPENSSL } pg_hmac_errno; -/* - * Internal structure for pg_hmac_ctx->data with this implementation. - */ +/* Internal pg_hmac_ctx structure */ struct pg_hmac_ctx { HMAC_CTX *hmacctx; diff --git a/src/common/scram-common.c b/src/common/scram-common.c index 5f90397c66d..8896b1e73e4 100644 --- a/src/common/scram-common.c +++ b/src/common/scram-common.c @@ -44,7 +44,10 @@ scram_SaltedPassword(const char *password, pg_hmac_ctx *hmac_ctx = pg_hmac_create(PG_SHA256); if (hmac_ctx == NULL) + { + *errstr = pg_hmac_error(NULL); /* returns OOM */ return -1; + } /* * Iterate hash calculation of HMAC entry using given salt. This is @@ -126,7 +129,10 @@ scram_ClientKey(const uint8 *salted_password, uint8 *result, pg_hmac_ctx *ctx = pg_hmac_create(PG_SHA256); if (ctx == NULL) + { + *errstr = pg_hmac_error(NULL); /* returns OOM */ return -1; + } if (pg_hmac_init(ctx, salted_password, SCRAM_KEY_LEN) < 0 || pg_hmac_update(ctx, (uint8 *) "Client Key", strlen("Client Key")) < 0 ||
Re: Improve error handling of HMAC computations and SCRAM
On 13.01.2022 10:24, Michael Paquier wrote: Thanks for the review. The comments about pg_hmac_ctx->data were wrong from the beginning, coming, I guess, from one of the earlier patch versions where this was discussed. So I have applied that independently. I have also spent a good amount of time on that to close the loop and make sure that no code paths are missing an error context, adjusted a couple of comments to explain more the role of *errstr in all the SCRAM routines, and finally applied it on HEAD. Thanks! -- Sergey Shinderukhttps://postgrespro.com/
Re: Windows: Wrong error message at connection termination
On 02.12.2021 22:31, Tom Lane wrote: I'll push the close-only change in a little bit. Unexpectedly, this changes the error message: postgres=# set idle_session_timeout = '1s'; SET postgres=# select 1; could not receive data from server: Software caused connection abort (0x2745/10053) The connection to the server was lost. Succeeded. postgres=# Without shutdown/closesocket it would most likely be: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. When the timeout expires, the server sends the error message and gracefully closes the connection by sending a FIN. Later, psql sends another query to the server, and the server responds with a RST. But now recv() returns WSAECONNABORTED(10053) instead of WSAECONNRESET(10054). Without shutdown/closesocket, after the timeout expires, the server sends the error message, the client sends an ACK, and the server responds with a RST. Then psql tries to sends the next query, but nothing is sent at the TCP level, and the next recv() returns WSAECONNRESET. IIUIC, in both cases we may or may not recv() the error message from the server depending on how fast the RST arrives from the server. Should we handle ECONNABORTED similarly to ECONNRESET in pqsecure_raw_read? -- Sergey Shinderukhttps://postgrespro.com/
Re: Windows: Wrong error message at connection termination
On 14.01.2022 13:01, Sergey Shinderuk wrote: When the timeout expires, the server sends the error message and gracefully closes the connection by sending a FIN. Later, psql sends another query to the server, and the server responds with a RST. But now recv() returns WSAECONNABORTED(10053) instead of WSAECONNRESET(10054). On the other hand, I cannot reproduce this behavior with a remote server even if pause psql just before the recv() call to let the RST win the race. So I get: postgres=# set idle_session_timeout = '1s'; recv() returned 15 errno 0 SET recv() returned -1 errno 10035 (WSAEWOULDBLOCK) postgres=# select 1; recv() returned 116 errno 0 recv() returned 0 errno 0 recv() returned 0 errno 0 FATAL: terminating connection due to idle-session timeout server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. recv() signals EOF like on Unix. Here I connected from a Windows virtual machine to the macOS host, but the Wireshark dump looks the same (there is a RST) as for a localhost connection. Is this "error-eating" behavior of RST on Windows specific only to localhost connections? -- Sergey Shinderukhttps://postgrespro.com/
Re: Windows: Wrong error message at connection termination
On 14.01.2022 13:01, Sergey Shinderuk wrote: Unexpectedly, this changes the error message: postgres=# set idle_session_timeout = '1s'; SET postgres=# select 1; could not receive data from server: Software caused connection abort (0x2745/10053) For the record, after more poking I realized that it depends on timing. By injecting delays I can get any of the following from libpq: * could not receive data from server: Software caused connection abort * server closed the connection unexpectedly * no connection to the server Should we handle ECONNABORTED similarly to ECONNRESET in pqsecure_raw_read? So this doesn't make sense anymore. Sorry for the noise. -- Sergey Shinderukhttps://postgrespro.com/
Re: Bug in DefineRange() with multiranges
On 10.10.2021 20:12, Peter Eisentraut wrote: On 04.10.21 19:09, Sergey Shinderuk wrote: I wonder what is the proper fix. Just drop pfree() altogether or add pstrdup() instead? I see that makeMultirangeTypeName() doesn't bother freeing its buf. I think removing the pfree()s is a correct fix. Thanks, here is a patch. -- Sergey Shinderukhttps://postgrespro.com/ From 6c548f07d2a254da46cd0b6f6e99b7ed24a6b811 Mon Sep 17 00:00:00 2001 From: Sergey Shinderuk Date: Tue, 12 Oct 2021 07:57:42 +0300 Subject: [PATCH] Fix premature pfree() of multirange_type_name in DefineRange() If the mutlirange_type_name parameter is given in the query, this would erroneously pfree() the string in the parse tree. Oversight in 6df7a9698bb0. --- src/backend/commands/typecmds.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c index b290629a450..9ab40341793 100644 --- a/src/backend/commands/typecmds.c +++ b/src/backend/commands/typecmds.c @@ -1707,7 +1707,6 @@ DefineRange(ParseState *pstate, CreateRangeStmt *stmt) /* Create cast from the range type to its multirange type */ CastCreate(typoid, multirangeOid, castFuncOid, 'e', 'f', DEPENDENCY_INTERNAL); - pfree(multirangeTypeName); pfree(multirangeArrayName); return address; -- 2.24.3 (Apple Git-128)
Re: Bug in DefineRange() with multiranges
On 13.10.2021 07:21, Michael Paquier wrote: Looks fine seen from here, so I'll apply shortly. Thank you! -- Sergey Shinderukhttps://postgrespro.com/
Re: [RFC] building postgres with meson
Hi, On 14.10.2021 23:54, John Naylor wrote: On Thu, Oct 14, 2021 at 4:34 PM Andres Freund <mailto:and...@anarazel.de>> wrote: > Is this a Mac with SIP enabled? The Mac CI presumably has that disabled, which is why I didn't see this issue there. Probably need to implement whatever Tom figured out to do about that for the current way of running tests. System Information says it's disabled. Running "csrutil status" complains of an unsupported configuration, which doesn't sound good, so I should probably go fix that independent of anything else. :-/ Maybe you could check that DYLD_LIBRARY_PATH is working for you? % DYLD_FALLBACK_LIBRARY_PATH= DYLD_LIBRARY_PATH=./tmp_install/usr/local/lib ./tmp_install/usr/local/bin/psql --version psql (PostgreSQL) 15devel Without DYLD_LIBRARY_PATH I get the error, as expected: % DYLD_FALLBACK_LIBRARY_PATH= ./tmp_install/usr/local/bin/psql --version dyld: Library not loaded: /usr/local/lib/libpq.5.dylib Referenced from: /Users/shinderuk/src/postgres-meson/build/./tmp_install/usr/local/bin/psql Reason: image not found I add "DYLD_FALLBACK_LIBRARY_PATH=" because otherwise dyld falls back to /usr/lib/libpq.5.dylib provided by Apple (I am testing on Catalina). % DYLD_PRINT_LIBRARIES=1 ./tmp_install/usr/local/bin/psql --version 2>&1 | grep libpq dyld: loaded: <4EDF735E-2104-32AD-BE7B-B400ABFCF57C> /usr/lib/libpq.5.dylib Regards, -- Sergey Shinderukhttps://postgrespro.com/
Re: Fix error handling in be_tls_open_server()
On 19.09.2023 03:54, Michael Paquier wrote: One doubt that I have is if we shouldn't let X509_NAME_print_ex() be as it is now, and not force a failure on the bio if this calls fails. If malloc fails inside X509_NAME_print_ex, then we will be left with empty port->peer_dn. Here is a gdb session showing this: (gdb) b X509_NAME_print_ex Breakpoint 1 at 0x7f539f6c0cf0 (gdb) c Continuing. Breakpoint 1, 0x7f539f6c0cf0 in X509_NAME_print_ex () from /lib/x86_64-linux-gnu/libcrypto.so.3 (gdb) bt #0 0x7f539f6c0cf0 in X509_NAME_print_ex () from /lib/x86_64-linux-gnu/libcrypto.so.3 #1 0x56026d2fbe8d in be_tls_open_server (port=port@entry=0x56026ed5d730) at be-secure-openssl.c:635 #2 0x56026d2eefa5 in secure_open_server (port=port@entry=0x56026ed5d730) at be-secure.c:118 #3 0x56026d3dc412 in ProcessStartupPacket (port=port@entry=0x56026ed5d730, ssl_done=ssl_done@entry=false, gss_done=gss_done@entry=false) at postmaster.c:2065 #4 0x56026d3dcd8e in BackendInitialize (port=port@entry=0x56026ed5d730) at postmaster.c:4377 #5 0x56026d3def6a in BackendStartup (port=port@entry=0x56026ed5d730) at postmaster.c:4155 #6 0x56026d3df115 in ServerLoop () at postmaster.c:1781 #7 0x56026d3e0645 in PostmasterMain (argc=argc@entry=20, argv=argv@entry=0x56026ec5a0d0) at postmaster.c:1465 #8 0x56026d2fcd7c in main (argc=20, argv=0x56026ec5a0d0) at main.c:198 (gdb) b malloc Breakpoint 2 at 0x7f539eca5120: file ./malloc/malloc.c, line 3287. (gdb) c Continuing. Breakpoint 2, __GI___libc_malloc (bytes=4) at ./malloc/malloc.c:3287 3287./malloc/malloc.c: No such file or directory. (gdb) bt #0 __GI___libc_malloc (bytes=4) at ./malloc/malloc.c:3287 #1 0x7f539f6f6e09 in BUF_MEM_grow_clean () from /lib/x86_64-linux-gnu/libcrypto.so.3 #2 0x7f539f6e4fb8 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3 #3 0x7f539f6d22fb in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3 #4 0x7f539f6d5c06 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3 #5 0x7f539f6d5d37 in BIO_write () from /lib/x86_64-linux-gnu/libcrypto.so.3 #6 0x7f539f6bdb41 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3 #7 0x7f539f6c0b7d in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3 #8 0x56026d2fbe8d in be_tls_open_server (port=port@entry=0x56026ed5d730) at be-secure-openssl.c:635 #9 0x56026d2eefa5 in secure_open_server (port=port@entry=0x56026ed5d730) at be-secure.c:118 #10 0x56026d3dc412 in ProcessStartupPacket (port=port@entry=0x56026ed5d730, ssl_done=ssl_done@entry=false, gss_done=gss_done@entry=false) at postmaster.c:2065 #11 0x56026d3dcd8e in BackendInitialize (port=port@entry=0x56026ed5d730) at postmaster.c:4377 #12 0x56026d3def6a in BackendStartup (port=port@entry=0x56026ed5d730) at postmaster.c:4155 #13 0x56026d3df115 in ServerLoop () at postmaster.c:1781 #14 0x56026d3e0645 in PostmasterMain (argc=argc@entry=20, argv=argv@entry=0x56026ec5a0d0) at postmaster.c:1465 #15 0x56026d2fcd7c in main (argc=20, argv=0x56026ec5a0d0) at main.c:198 (gdb) return 0 Make __GI___libc_malloc return now? (y or n) y #0 0x7f539f6f6e09 in BUF_MEM_grow_clean () from /lib/x86_64-linux-gnu/libcrypto.so.3 (gdb) delete Delete all breakpoints? (y or n) y (gdb) c Continuing. And in the server log we see: DEBUG: SSL connection from DN:"" CN:"ssltestuser" While in the normal case we get: DEBUG: SSL connection from DN:"CN=ssltestuser" CN:"ssltestuser"\ Probably we shouldn't ignore the error from X509_NAME_print_ex? -- Sergey Shinderukhttps://postgrespro.com/
Re: Fix error handling in be_tls_open_server()
On 20.09.2023 11:42, Daniel Gustafsson wrote: Attached is a v2 on top of HEAD with commit message etc, which I propose to backpatch to v15 where it was introduced. There is a typo: upon en error. Otherwise, looks good to me. Thank you. -- Sergey Shinderukhttps://postgrespro.com/
Re: Schema variables - new implementation for Postgres 15
On 13.11.2022 20:59, Pavel Stehule wrote: fresh rebase Hello, Sorry, I haven't been following this thread, but I'd like to report a memory management bug. I couldn't apply the latest patches, so I tested with v20221104-1-* patches applied atop of commit b0284bfb1db. postgres=# create variable s text default 'abc'; create function f() returns text as $$ begin return g(s); end; $$ language plpgsql; create function g(t text) returns text as $$ begin let s = 'BOOM!'; return t; end; $$ language plpgsql; select f(); CREATE VARIABLE CREATE FUNCTION CREATE FUNCTION server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. LOG: server process (PID 55307) was terminated by signal 11: Segmentation fault DETAIL: Failed process was running: select f(); I believe it's a use-after-free error, triggered by assigning a new value to s in g(), thus making t a dangling pointer. After reconnecting I get a scary error: postgres=# select f(); ERROR: compressed pglz data is corrupt Best regards, -- Sergey Shinderukhttps://postgrespro.com/
Bug in row_number() optimization
Hello, We've accidentally found a subtle bug introduced by commit 9d9c02ccd1aea8e9131d8f4edb21bf1687e40782 Author: David Rowley Date: Fri Apr 8 10:34:36 2022 +1200 Teach planner and executor about monotonic window funcs On a 32-bit system Valgrind reports use-after-free when running the "window" test: ==35487== Invalid read of size 4 ==35487==at 0x48398A4: memcpy (vg_replace_strmem.c:1035) ==35487==by 0x1A2902: fill_val (heaptuple.c:287) ==35487==by 0x1A2902: heap_fill_tuple (heaptuple.c:336) ==35487==by 0x1A3C29: heap_form_minimal_tuple (heaptuple.c:1412) ==35487==by 0x3D4555: tts_virtual_copy_minimal_tuple (execTuples.c:290) ==35487==by 0x72FC33: ExecCopySlotMinimalTuple (tuptable.h:473) ==35487==by 0x72FC33: tuplesort_puttupleslot (tuplesortvariants.c:610) ==35487==by 0x403463: ExecSort (nodeSort.c:153) ==35487==by 0x3D0C8E: ExecProcNodeFirst (execProcnode.c:464) ==35487==by 0x40AF09: ExecProcNode (executor.h:259) ==35487==by 0x40AF09: begin_partition (nodeWindowAgg.c:1106) ==35487==by 0x40D259: ExecWindowAgg (nodeWindowAgg.c:2125) ==35487==by 0x3D0C8E: ExecProcNodeFirst (execProcnode.c:464) ==35487==by 0x405E17: ExecProcNode (executor.h:259) ==35487==by 0x405E17: SubqueryNext (nodeSubqueryscan.c:53) ==35487==by 0x3D41C7: ExecScanFetch (execScan.c:133) ==35487==by 0x3D41C7: ExecScan (execScan.c:199) ==35487== Address 0xe3e8af0 is 168 bytes inside a block of size 8,192 alloc'd ==35487==at 0x483463B: malloc (vg_replace_malloc.c:299) ==35487==by 0x712B63: AllocSetContextCreateInternal (aset.c:446) ==35487==by 0x3D82BE: CreateExprContextInternal (execUtils.c:253) ==35487==by 0x3D84DC: CreateExprContext (execUtils.c:303) ==35487==by 0x3D8750: ExecAssignExprContext (execUtils.c:482) ==35487==by 0x40BC1A: ExecInitWindowAgg (nodeWindowAgg.c:2382) ==35487==by 0x3D1232: ExecInitNode (execProcnode.c:346) ==35487==by 0x4035E0: ExecInitSort (nodeSort.c:265) ==35487==by 0x3D11AB: ExecInitNode (execProcnode.c:321) ==35487==by 0x40BD36: ExecInitWindowAgg (nodeWindowAgg.c:2432) ==35487==by 0x3D1232: ExecInitNode (execProcnode.c:346) ==35487==by 0x405E99: ExecInitSubqueryScan (nodeSubqueryscan.c:126) It's faster to run just this test under Valgrind: make installcheck-test TESTS='test_setup window' This can also be reproduced on a 64-bit system by forcing int8 to be passed by reference: --- a/src/include/pg_config_manual.h +++ b/src/include/pg_config_manual.h @@ -82,9 +82,7 @@ * * Changing this requires an initdb. */ -#if SIZEOF_VOID_P >= 8 -#define USE_FLOAT8_BYVAL 1 -#endif +#undef USE_FLOAT8_BYVAL /* * When we don't have native spinlocks, we use semaphores to simulate them. Futhermore, zeroing freed memory makes the test fail: --- a/src/include/utils/memdebug.h +++ b/src/include/utils/memdebug.h @@ -39,7 +39,7 @@ static inline void wipe_mem(void *ptr, size_t size) { VALGRIND_MAKE_MEM_UNDEFINED(ptr, size); - memset(ptr, 0x7F, size); + memset(ptr, 0, size); VALGRIND_MAKE_MEM_NOACCESS(ptr, size); } $ cat src/test/regress/regression.diffs diff -U3 /home/sergey/pgwork/devel/src/src/test/regress/expected/window.out /home/sergey/pgwork/devel/src/src/test/regress/results/window.out --- /home/sergey/pgwork/devel/src/src/test/regress/expected/window.out 2022-11-03 18:26:52.203624217 +0300 +++ /home/sergey/pgwork/devel/src/src/test/regress/results/window.out 2022-11-16 01:47:18.494273352 +0300 @@ -3721,7 +3721,8 @@ ---+---++-++++ personnel | 5 | 3500 | 12-10-2007 | 2 | 1 | 2 | 2 sales | 3 | 4800 | 08-01-2007 | 3 | 1 | 3 | 3 -(2 rows) + sales | 4 | 4800 | 08-08-2007 | 3 | 0 | 3 | 3 +(3 rows) -- Tests to ensure we don't push down the run condition when it's not valid to -- do so. The failing query is: SELECT * FROM (SELECT *, count(salary) OVER (PARTITION BY depname || '') c1, -- w1 row_number() OVER (PARTITION BY depname) rn, -- w2 count(*) OVER (PARTITION BY depname) c2, -- w2 count(*) OVER (PARTITION BY '' || depname) c3 -- w3 FROM empsalary ) e WHERE rn <= 1 AND c1 <= 3; As far as I understand, ExecWindowAgg for the intermediate WindowAgg node switches into pass-through mode, stops evaluating row_number(), and returns the previous value instead. But if int8 is passed by reference, the previous value stored in econtext->ecxt_aggvalues becomes a dangling pointer when the per-output-tuple memory context is reset. Attaching a patch that makes the window test fail on a 64-bit system. Best regards, -- Sergey Shinderukhttps://postgrespro.com/diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h index f2a106f983d..66ab343d51c 100644 --- a/src/include/pg_config_manual.h +++ b/
Re: Bug in row_number() optimization
On 24.11.2022 06:16, Richard Guo wrote: Regarding how to fix this problem, firstly I believe we need to evaluate window functions in the per-tuple memory context, as the HEAD does. When we decide we need to go into pass-through mode, I'm thinking that we can just copy out the results of the last evaluation to the per-query memory context, while still storing their pointers in ecxt_aggvalues. Does this idea work? Although I'm not familiar with the code, this makes sense to me. You proposed: +#ifdef USE_FLOAT8_BYVAL + evalWfuncContext = winstate->ss.ps.ps_ExprContext->ecxt_per_tuple_memory; +#else + evalWfuncContext = winstate->ss.ps.ps_ExprContext->ecxt_per_query_memory; +#endif Shouldn't we handle any pass-by-reference type the same? I suppose, a user-defined window function can return some other type, not int8. Best regards, -- Sergey Shinderukhttps://postgrespro.com/
Re: Bug in row_number() optimization
On 25.11.2022 15:46, Richard Guo wrote: Considering the 'Filter' is a strict function, marking it as NULL would do. I think this is why this patch works. What about user-defined operators? I created my own <= operator for int8 which returns true on null input, and put it in a btree operator class. With my operator I get: depname | empno | salary | enroll_date | c1 | rn | c2 | c3 ---+---++-++++ personnel | 5 | 3500 | 2007-12-10 | 2 | 1 | 2 | 2 sales | 3 | 4800 | 2007-08-01 | 3 | 1 | 3 | 3 sales | 4 | 4800 | 2007-08-08 | 3 ||| 3 (3 rows) Admittedly, it's weird that (null <= 1) evaluates to true. But does it violate the contract of the btree operator class or something? Didn't find a clear answer in the docs. -- Sergey Shinderukhttps://postgrespro.com/
Re: Bug in row_number() optimization
On 28.11.2022 03:23, David Rowley wrote: On Sat, 26 Nov 2022 at 05:19, Tom Lane wrote: Sergey Shinderuk writes: What about user-defined operators? I created my own <= operator for int8 which returns true on null input, and put it in a btree operator class. Admittedly, it's weird that (null <= 1) evaluates to true. But does it violate the contract of the btree operator class or something? Didn't find a clear answer in the docs. It's pretty unlikely that this would work during an actual index scan. I'm fairly sure that btree (and other index AMs) have hard-wired assumptions that index operators are strict. If we're worried about that then we could just restrict this optimization to only work with strict quals. Not sure this is necessary if btree operators must be strict anyway. The proposal to copy the datums into the query context does not seem to me to be a good idea. If there are a large number of partitions then it sounds like we'll leak lots of memory. We could invent some partition context that we reset after each partition, but that's probably more complexity than it would be worth. Ah, good point. I've attached a draft patch to move the code to nullify the aggregate results so that's only done once per partition and adjusted the planner to limit this to strict quals. Not quite sure that we don't need to do anything for the WINDOWAGG_PASSTHROUGH_STRICT case. Although, we won't return any more tuples for the current partition, we still call ExecProject with dangling pointers. Is it okay? + if (!func_strict(opexpr->opfuncid)) + return false; Should return true instead? -- Sergey Shinderukhttps://postgrespro.com/
Re: Bug in row_number() optimization
On 01.12.2022 11:18, Richard Guo wrote: On Mon, Nov 28, 2022 at 5:59 PM Sergey Shinderuk mailto:s.shinde...@postgrespro.ru>> wrote: Not quite sure that we don't need to do anything for the WINDOWAGG_PASSTHROUGH_STRICT case. Although, we won't return any more tuples for the current partition, we still call ExecProject with dangling pointers. Is it okay? AFAIU once we go into WINDOWAGG_PASSTHROUGH_STRICT we will spool all the remaining tuples in the current partition without storing them and then move to the next partition if available and become WINDOWAGG_RUN again or become WINDOWAGG_DONE if there are no further partitions. It seems we would not have chance to see the dangling pointers. Maybe I'm missing something, but the previous call to spool_tuples() might have read extra tuples (if the tuplestore spilled to disk), and after switching to WINDOWAGG_PASSTHROUGH_STRICT mode we nevertheless would loop through these extra tuples and call ExecProject if only to increment winstate->currentpos. -- Sergey Shinderukhttps://postgrespro.com/
Add PL/pgSQL extra check no_data_found
Hello, I propose to add a new value "no_data_found" for the plpgsql.extra_errors and plpgsql.extra_warnings parameters [1]. With plpgsql.extra_errors=no_data_found SELECT INTO raises no_data_found exception when the result set is empty. With plpgsql.extra_errors=too_many_rows,no_data_found SELECT INTO behaves like SELECT INTO STRICT [2]. This could simplify migration from PL/SQL and may be just more convenient. One potential downside is that plpgsql.extra_errors=no_data_found could break existing functions expecting to get null or checking IF found explicitly. This is also true for the too_many_rows exception, but arguably it's a programmer error, while no_data_found switches to a different convention for handling (or not handling) an empty result with SELECT INTO. Otherwise the patch is straightforward. What do you think? -- Sergey Shinderukhttps://postgrespro.com/ [1] https://www.postgresql.org/docs/devel/plpgsql-development-tips.html#PLPGSQL-EXTRA-CHECKS [2] https://www.postgresql.org/docs/devel/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-ONEROWdiff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index a6473429480..80672a3672f 100644 --- a/src/pl/plpgsql/src/pl_exec.c +++ b/src/pl/plpgsql/src/pl_exec.c @@ -4197,8 +4197,14 @@ exec_stmt_execsql(PLpgSQL_execstate *estate, long tcount; int rc; PLpgSQL_expr *expr = stmt->sqlstmt; + int no_data_found_level = 0; int too_many_rows_level = 0; + if (plpgsql_extra_errors & PLPGSQL_XCHECK_NODATAFOUND) + no_data_found_level = ERROR; + else if (plpgsql_extra_warnings & PLPGSQL_XCHECK_NODATAFOUND) + no_data_found_level = WARNING; + if (plpgsql_extra_errors & PLPGSQL_XCHECK_TOOMANYROWS) too_many_rows_level = ERROR; else if (plpgsql_extra_warnings & PLPGSQL_XCHECK_TOOMANYROWS) @@ -4355,16 +4361,19 @@ exec_stmt_execsql(PLpgSQL_execstate *estate, */ if (n == 0) { - if (stmt->strict) + if (stmt->strict || no_data_found_level) { char *errdetail; +int errlevel; if (estate->func->print_strict_params) errdetail = format_expr_params(estate, expr); else errdetail = NULL; -ereport(ERROR, +errlevel = stmt->strict ? ERROR : no_data_found_level; + +ereport(errlevel, (errcode(ERRCODE_NO_DATA_FOUND), errmsg("query returned no rows"), errdetail ? errdetail_internal("parameters: %s", errdetail) : 0)); diff --git a/src/pl/plpgsql/src/pl_handler.c b/src/pl/plpgsql/src/pl_handler.c index 5dc334b61b3..80151540c38 100644 --- a/src/pl/plpgsql/src/pl_handler.c +++ b/src/pl/plpgsql/src/pl_handler.c @@ -90,6 +90,8 @@ plpgsql_extra_checks_check_hook(char **newvalue, void **extra, GucSource source) if (pg_strcasecmp(tok, "shadowed_variables") == 0) extrachecks |= PLPGSQL_XCHECK_SHADOWVAR; + else if (pg_strcasecmp(tok, "no_data_found") == 0) +extrachecks |= PLPGSQL_XCHECK_NODATAFOUND; else if (pg_strcasecmp(tok, "too_many_rows") == 0) extrachecks |= PLPGSQL_XCHECK_TOOMANYROWS; else if (pg_strcasecmp(tok, "strict_multi_assignment") == 0) diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h index 088768867d9..b69c058ba86 100644 --- a/src/pl/plpgsql/src/plpgsql.h +++ b/src/pl/plpgsql/src/plpgsql.h @@ -1202,8 +1202,9 @@ extern bool plpgsql_check_asserts; /* extra compile-time and run-time checks */ #define PLPGSQL_XCHECK_NONE 0 #define PLPGSQL_XCHECK_SHADOWVAR(1 << 1) -#define PLPGSQL_XCHECK_TOOMANYROWS(1 << 2) -#define PLPGSQL_XCHECK_STRICTMULTIASSIGNMENT (1 << 3) +#define PLPGSQL_XCHECK_NODATAFOUND(1 << 2) +#define PLPGSQL_XCHECK_TOOMANYROWS(1 << 3) +#define PLPGSQL_XCHECK_STRICTMULTIASSIGNMENT (1 << 4) #define PLPGSQL_XCHECK_ALL ((int) ~0) extern int plpgsql_extra_warnings; diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out index 08e42f17dc2..454e517fcc9 100644 --- a/src/test/regress/expected/plpgsql.out +++ b/src/test/regress/expected/plpgsql.out @@ -3086,6 +3086,25 @@ select shadowtest(1); (1 row) -- runtime extra checks +set plpgsql.extra_warnings to 'no_data_found'; +do $$ +declare x int; +begin + select 1 into x where 0 = 1; +end; +$$; +WARNING: query returned no rows +set plpgsql.extra_errors to 'no_data_found'; +do $$ +declare x int; +begin + select 1 into x where 0 = 1; +end; +$$; +ERROR: query returned no rows +CONTEXT: PL/pgSQL function inline_code_block line 4 at SQL statement +reset plpgsql.extra_errors; +reset plpgsql.extra_warnings; set plpgsql.extra_warnings to 'too_many_rows'; do $$ declare x int; diff --git a/src/test/regress/sql/plpgsql.sql b/src/test/regress/sql/plpgsql.sql index 588c3310337..60896b3d0da 100644 --- a/src/test/regress/sql/plpgsql.sql +++ b/src/test/regress/sql/plpgsql.sql @@ -2620
Re: Add PL/pgSQL extra check no_data_found
On 09.12.2022 09:46, Pavel Stehule wrote: I don't like the idea about possible replacement of INTO STRICT by INTO + extra warnings. Handling exceptions is significantly more expensive than in Oracle, and using INTO without STRICT with the next test IF NOT FOUND THEN can save one safepoint and one handling an exception. It should be mentioned in the documentation. Using this very common Oracle's pattern can have a very negative impact on performance in Postgres. If somebody does port from Oracle, and wants compatible behavior then he should use INTO STRICT. I think it is counterproductive to hide syntax differences when there is a significant difference in performance (and will be). Fair enough. Thank you, Pavel. -- Sergey Shinderukhttps://postgrespro.com/
Re: A bug with ExecCheckPermissions
On 08.02.2023 21:23, Alvaro Herrera wrote: On 2023-Feb-08, Amit Langote wrote: On Wed, Feb 8, 2023 at 16:19 Alvaro Herrera wrote: I think we should also patch ExecCheckPermissions to use forboth(), scanning the RTEs as it goes over the perminfos, and make sure that the entries are consistent. Hmm, we can’t use forboth here, because not all RTEs have the corresponding RTEPermissionInfo, inheritance children RTEs, for example. Doh, of course. Also, it doesn’t make much sense to reinstate the original loop over range table and fetch the RTEPermissionInfo for the RTEs with non-0 perminfoindex, because the main goal of the patch was to make ExecCheckPermissions() independent of range table length. Yeah, I'm thinking in a mechanism that would allow us to detect bugs in development builds — no need to have it run in production builds. However, I can't see any useful way to implement it. Maybe something like the attached would do? -- Sergey Shinderukhttps://postgrespro.com/ diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index 39bfb48dc22..94866743f48 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -577,6 +577,28 @@ ExecCheckPermissions(List *rangeTable, List *rteperminfos, ListCell *l; boolresult = true; +#ifdef USE_ASSERT_CHECKING + Bitmapset *indexset = NULL; + + /* Check that rteperminfos is consistent with rangeTable */ + foreach(l, rangeTable) + { + RangeTblEntry *rte = lfirst_node(RangeTblEntry, l); + + if (rte->perminfoindex != 0) + { + /* Sanity checks */ + (void) getRTEPermissionInfo(rteperminfos, rte); + /* Many-to-one mapping not allowed */ + Assert(!bms_is_member(rte->perminfoindex, indexset)); + indexset = bms_add_member(indexset, rte->perminfoindex); + } + } + + /* All rteperminfos are referenced */ + Assert(bms_num_members(indexset) == list_length(rteperminfos)); +#endif + foreach(l, rteperminfos) { RTEPermissionInfo *perminfo = lfirst_node(RTEPermissionInfo, l); diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c index 375b17b9f3b..f6d19226a0a 100644 --- a/src/backend/utils/adt/ri_triggers.c +++ b/src/backend/utils/adt/ri_triggers.c @@ -1399,6 +1399,7 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel) pkrte->relid = RelationGetRelid(pk_rel); pkrte->relkind = pk_rel->rd_rel->relkind; pkrte->rellockmode = AccessShareLock; + pkrte->perminfoindex = 2; pk_perminfo = makeNode(RTEPermissionInfo); pk_perminfo->relid = RelationGetRelid(pk_rel); @@ -1409,6 +1410,7 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel) fkrte->relid = RelationGetRelid(fk_rel); fkrte->relkind = fk_rel->rd_rel->relkind; fkrte->rellockmode = AccessShareLock; + fkrte->perminfoindex = 1; fk_perminfo = makeNode(RTEPermissionInfo); fk_perminfo->relid = RelationGetRelid(fk_rel);
Re: Compile fail on macos big sur
Hi, On 23.09.2021 10:09, zhang listar wrote: > Hi, guys, I encount a problem on compiling pssql, the environment is: > os: macos big sur version 11.5.2 (20G95) > compiler: gcc-11 (Homebrew GCC 11.2.0) 11.2.0 I've just tried building with gcc-11 on Catalina (yes, it's time to upgrade) and it went fine, no warnings. Maybe `git clean -fdx` would help? (Be careful, though, it removes any changes you may have maid.) > /usr/local/bin/gcc-11 -Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Werror=vla -Wendif-labels > -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type > -Wformat-security -fno-strict-aliasing -fwrapv > -fexcess-precision=standard -Wno-format-truncation > -Wno-stringop-truncation -O2 zic.o -L../../src/port -L../../src/common > -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX11.3.sdk > -L/usr/local/opt/binutils/lib -Wl,-dead_strip_dylibs -lpgcommon > -lpgport -lz -lreadline -lm -o zic Here is what I have: /usr/local/bin/gcc-11 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -g -O2 zic.o -L../../src/port -L../../src/common -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk -Wl,-dead_strip_dylibs -lpgcommon -lpgport -lz -lreadline -lm -o zic Looks the same, and gives no warnings. Just in case, I configured like that: ./configure --prefix=$(cd ..;pwd)/install-gcc-11 --enable-cassert --enable-debug --enable-tap-tests CC=/usr/local/bin/gcc-11 Hope that helps. -- Sergey Shinderukhttps://postgrespro.com/
Re: Compile fail on macos big sur
On 23.09.2021 10:50, zhang listar wrote: > Thanks for your reply, I do make distclean and git clean -fdx, but it > does no help. > > the code: master, c7aeb775df895db240dcd6f47242f7e08899adfb > It looks like the macos issue, because of the ignoring of some lib, it > drives the compiling error. Maybe you could try adding -v to the problematic gcc command to see what really goes on. I see that gcc calls /usr/bin/ld, not binutils ld installed with Homebrew. I saw an advice to `brew unlink binutils` somewhere.
Memory leak in pg_hmac_final
Hi, Here is a patch fixing the subject. Regards, -- Sergey Shinderukhttps://postgrespro.com/ diff --git a/src/common/hmac.c b/src/common/hmac.c index 1089db67443..bfe2e7cb5e9 100644 --- a/src/common/hmac.c +++ b/src/common/hmac.c @@ -232,7 +232,10 @@ pg_hmac_final(pg_hmac_ctx *ctx, uint8 *dest, size_t len) memset(h, 0, ctx->digest_size); if (pg_cryptohash_final(ctx->hash, h, ctx->digest_size) < 0) + { + FREE(h); return -1; + } /* H(K XOR opad, tmp) */ if (pg_cryptohash_init(ctx->hash) < 0 || @@ -240,9 +243,11 @@ pg_hmac_final(pg_hmac_ctx *ctx, uint8 *dest, size_t len) pg_cryptohash_update(ctx->hash, h, ctx->digest_size) < 0 || pg_cryptohash_final(ctx->hash, dest, len) < 0) { + FREE(h); return -1; } + FREE(h); return 0; }
Re: Memory leak in pg_hmac_final
On 01.10.2021 15:05, Daniel Gustafsson wrote: >> On 1 Oct 2021, at 12:39, Sergey Shinderuk wrote: > >> Here is a patch fixing the subject. > > Seems reasonable on a quick glance, the interim h buffer should be freed (this > is present since 14). I'll have another look at this in a bit and will take > care of it. Thanks. I found it using the leaks tool on macOS. Without the patch: % MallocStackLogging=1 leaks -quiet -atExit -- psql -d 'dbname=postgres user=alice password=secret' -XAtc 'select 1' ... Process 91635: 4390 nodes malloced for 252 KB Process 91635: 4103 leaks for 131296 total leaked bytes. ... (User alice has a SCRAM-encrypted password.) With the patch: Process 98250: 290 nodes malloced for 124 KB Process 98250: 3 leaks for 96 total leaked bytes. The remaining leaks are expected and not worth fixing, I guess: STACK OF 1 INSTANCE OF 'ROOT LEAK: malloc<32>': 4 libdyld.dylib 0x7fff68d80cc9 start + 1 3 psql 0x10938b9f9 main + 2393 startup.c:207 2 psql 0x1093ab5a5 pg_malloc + 21 fe_memutils.c:49 1 libsystem_malloc.dylib 0x7fff68f36cf5 malloc + 21 0 libsystem_malloc.dylib 0x7fff68f36d9e malloc_zone_malloc + 140 2 (48 bytes) ROOT LEAK: 0x7ffbb75040d0 [32] 1 (16 bytes) 0x7ffbb75040f0 [16] length: 8 "select 1" STACK OF 1 INSTANCE OF 'ROOT LEAK: malloc<48>': 5 libdyld.dylib 0x7fff68d80cc9 start + 1 4 psql 0x10938b8b0 main + 2064 startup.c:207 3 psql 0x1093ab78e pg_strdup + 14 fe_memutils.c:96 2 libsystem_c.dylib 0x7fff68e26ce6 strdup + 32 1 libsystem_malloc.dylib 0x7fff68f36cf5 malloc + 21 0 libsystem_malloc.dylib 0x7fff68f36d9e malloc_zone_malloc + 140 1 (48 bytes) ROOT LEAK: 0x7ffbb75040a0 [48] length: 42 "dbname=postgres user=alice password=secret" -- Sergey Shinderukhttps://postgrespro.com/
Re: pgcrypto support for bcrypt $2b$ hashes
On 02.10.2021 06:48, Daniel Fone wrote: > I don’t get these compiler warnings and I can’t find any settings to use that > might generate them. I’m compiling on macOS 11.6 configured with > `--enable-cassert --enable-depend --enable-debug CFLAGS=-O0` > Hi, Daniel! I don't get them from clang on macOS either. > I’ve optimistically updated the patch to hopefully address them, but I’d like > to know what I need to do to get those warnings. But gcc-11 on Ubuntu 20.04 emits them. Regards, -- Sergey Shinderukhttps://postgrespro.com/
Bug in DefineRange() with multiranges
Hi, My colleague, Alex Kozhemyakin, stumbled upon a bug in DefineRange(). The problem is here: @@ -1707,7 +1707,6 @@ DefineRange(ParseState *pstate, CreateRangeStmt *stmt) /* Create cast from the range type to its multirange type */ CastCreate(typoid, multirangeOid, castFuncOid, 'e', 'f', DEPENDENCY_INTERNAL); - pfree(multirangeTypeName); pfree(multirangeArrayName); return address; Given a query create type textrange1 as range(subtype=text, multirange_type_name=multirange_of_text, collation="C"); the string "multirange_of_text" in the parse tree is erroneously pfree'd. The corrupted parse tree is then passed to event triggers. There is another branch in DefineRange() that genereates a multirange type name which is fine to free. I wonder what is the proper fix. Just drop pfree() altogether or add pstrdup() instead? I see that makeMultirangeTypeName() doesn't bother freeing its buf. Here is a gdb session demonstating the bug: Breakpoint 1, ProcessUtilitySlow (pstate=0x5652e80c7730, pstmt=0x5652e80a6a40, queryString=0x5652e80a5790 "create type textrange1 as range(subtype=text, multirange_type_name=multirange_of_text, collation=\"C\");", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, qc=0x7ffe835b4be0, dest=) at /pgwork/REL_14_STABLE/src/src/backend/tcop/utility.c:1621 1621address = DefineRange((CreateRangeStmt *) parsetree); (gdb) p *(Value *)((TypeName *)((DefElem *)((CreateRangeStmt *)parsetree)->params->elements[1].ptr_value)->arg)->names->elements[0].ptr_value $1 = {type = T_String, val = {ival = -401972176, str = 0x5652e80a6430 "multirange_of_text"}} (gdb) n 1900if (!commandCollected) (gdb) p *(Value *)((TypeName *)((DefElem *)((CreateRangeStmt *)parsetree)->params->elements[1].ptr_value)->arg)->names->elements[0].ptr_value $2 = {type = T_String, val = {ival = -401972176, str = 0x5652e80a6430 '\177' , "\020"}} Regards, -- Sergey Shinderukhttps://postgrespro.com/
Re: gcc -Wclobbered in PostgresMain
Hello, Tom, On 08.07.2023 18:11, Tom Lane wrote: What we ought to be doing is resetting these two flags after the disable_all_timeouts call. Oops, I missed that. Having done that, it wouldn't really be necessary to mark these as volatile. I kept that marking anyway for consistency with send_ready_for_query, but perhaps we shouldn't? I don't know. Maybe marking them volatile is more future proof. Not sure. I also moved firstchar's declaration inside the loop where it's used, to make it clear that this variable needn't be volatile and is not preserved after longjmp(). Good idea, but then why not the same for input_message? It's fully reinitialized each time through the loop, too. Yeah, that's better. In short, something like the attached, except I'm not totally sold on changing the volatility of the timeout flags. Looks good to me. Thank you. -- Sergey Shinderukhttps://postgrespro.com/
Fix error handling in be_tls_open_server()
Hi, A static analyzer reported a possible pfree(NULL) in be_tls_open_server(). Here is a fix. Also handle an error from X509_NAME_print_ex(). AFAICS, the error "SSL certificate's distinguished name contains embedded null" could not be reached at all, because XN_FLAG_RFC2253 passed to X509_NAME_print_ex() ensures that null bytes are escaped. Best regards, -- Sergey Shinderukhttps://postgrespro.com/diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index 658b09988d6..31b6a6eacdf 100644 --- a/src/backend/libpq/be-secure-openssl.c +++ b/src/backend/libpq/be-secure-openssl.c @@ -620,8 +620,11 @@ aloop: bio = BIO_new(BIO_s_mem()); if (!bio) { - pfree(port->peer_cn); - port->peer_cn = NULL; + if (port->peer_cn != NULL) + { + pfree(port->peer_cn); + port->peer_cn = NULL; + } return -1; } @@ -632,12 +635,15 @@ aloop: * which make regular expression matching a bit easier. Also note that * it prints the Subject fields in reverse order. */ - X509_NAME_print_ex(bio, x509name, 0, XN_FLAG_RFC2253); - if (BIO_get_mem_ptr(bio, &bio_buf) <= 0) + if (X509_NAME_print_ex(bio, x509name, 0, XN_FLAG_RFC2253) == -1 || + BIO_get_mem_ptr(bio, &bio_buf) <= 0) { BIO_free(bio); - pfree(port->peer_cn); - port->peer_cn = NULL; + if (port->peer_cn != NULL) + { + pfree(port->peer_cn); + port->peer_cn = NULL; + } return -1; } peer_dn = MemoryContextAlloc(TopMemoryContext, bio_buf->length + 1); @@ -651,8 +657,11 @@ aloop: (errcode(ERRCODE_PROTOCOL_VIOLATION), errmsg("SSL certificate's distinguished name contains embedded null"))); pfree(peer_dn); - pfree(port->peer_cn); - port->peer_cn = NULL; + if (port->peer_cn != NULL) + { + pfree(port->peer_cn); + port->peer_cn = NULL; + } return -1; }
Re: Fix error handling in be_tls_open_server()
On 23.08.2023 16:23, Daniel Gustafsson wrote: On 1 Aug 2023, at 16:44, Sergey Shinderuk wrote: A static analyzer reported a possible pfree(NULL) in be_tls_open_server(). This has the smell of a theoretical problem, I can't really imagine a certificate where which would produce this. Have you been able to trigger it? I triggered a crash by generating a certificate without a CN and forcing malloc to return NULL when called from X509_NAME_print_ex or BIO_get_mem_ptr with gdb. Initially I tried to trigger a crash by generating a certificate without a CN and with a DN contaning the null byte. But as I said, the error condition "SSL certificate's distinguished name contains embedded null" isn't really reachable, because XN_FLAG_RFC2253 escapes null bytes. -- Sergey Shinderukhttps://postgrespro.com/
Re: Fix error handling in be_tls_open_server()
On 24.08.2023 11:38, Daniel Gustafsson wrote: On 24 Aug 2023, at 10:11, Sergey Shinderuk wrote: I triggered a crash by generating a certificate without a CN and forcing malloc to return NULL when called from X509_NAME_print_ex or BIO_get_mem_ptr with gdb. Can you extend the patch with that certificate and a test using it? The certificates are generated from config files kept in the repo in src/test/ssl in order to be reproducible. But I need to force malloc to fail at the right time during the test. I don't think I can make a stable and portable test from this. -- Sergey Shinderukhttps://postgrespro.com/
Re: Fix error handling in be_tls_open_server()
On 28.08.2023 18:12, Jacob Champion wrote: On Thu, Aug 24, 2023 at 6:25 PM Michael Paquier wrote: LD_PRELOAD is the only thing I can think about, but that's very fancy. Even with that, having a certificate with a NULL peer_cn could prove to be useful in the SSL suite to stress more patterns around it? +1. Last we tried it, OpenSSL didn't want to create a certificate with an embedded null, but maybe things have changed? To embed a null byte into the Subject, I first generated a regular certificate request in the DER (binary) format, then manually inserted null into the file and recomputed the checksum. Like this: https://security.stackexchange.com/a/58845 I'll try to add a client certificate lacking a CN to the SSL test suite. -- Sergey Shinderukhttps://postgrespro.com/
Re: On login trigger: take three
On 02.09.2022 18:36, Daniel Gustafsson wrote: This had bitrotted a fair bit, attached is a rebase along with (mostly) documentation fixes. 0001 adds a generic GUC for ignoring event triggers and 0002 adds the login event with event trigger support, and hooks it up to the GUC such that broken triggers wont require single-user mode. Moving the CF entry back to Needs Review. Hello! There is a race around setting and clearing of dathasloginevt. Steps to reproduce: 1. Create a trigger: CREATE FUNCTION on_login_proc() RETURNS event_trigger AS $$ BEGIN RAISE NOTICE 'You are welcome!'; END; $$ LANGUAGE plpgsql; CREATE EVENT TRIGGER on_login_trigger ON login EXECUTE PROCEDURE on_login_proc(); 2. Then drop it, but don't start new sessions: DROP EVENT TRIGGER on_login_trigger; 3. Create another trigger, but don't commit yet: BEGIN; CREATE EVENT TRIGGER on_login_trigger ON login EXECUTE PROCEDURE on_login_proc(); 4. Start a new session. This clears dathasloginevt. 5. Commit the transaction: COMMIT; Now we have a trigger, but it doesn't fire, because dathasloginevt=false. If two sessions create triggers concurrently, one of them will fail. Steps: 1. In the first session, start a transaction and create a trigger: BEGIN; CREATE EVENT TRIGGER on_login_trigger1 ON login EXECUTE PROCEDURE on_login_proc(); 2. In the second session, create another trigger (this query blocks): CREATE EVENT TRIGGER on_login_trigger2 ON login EXECUTE PROCEDURE on_login_proc(); 3. Commit in the first session: COMMIT; The second session fails: postgres=# CREATE EVENT TRIGGER on_login_trigger2 ON login EXECUTE PROCEDURE on_login_proc(); ERROR: tuple concurrently updated What else bothers me is that login triggers execute in an environment under user control and one has to be very careful. The example trigger from the documentation +DECLARE + hour integer = EXTRACT('hour' FROM current_time); + rec boolean; +BEGIN +-- 1. Forbid logging in between 2AM and 4AM. +IF hour BETWEEN 2 AND 4 THEN + RAISE EXCEPTION 'Login forbidden'; +END IF; can be bypassed with PGOPTIONS='-c timezone=...'. Probably this is nothing new and concerns any SECURITY DEFINER function, but still... Best regards, -- Sergey Shinderukhttps://postgrespro.com/
Re: On login trigger: take three
On 20.09.2022 17:10, Daniel Gustafsson wrote: On 20 Sep 2022, at 15:43, Sergey Shinderuk wrote: There is a race around setting and clearing of dathasloginevt. Thanks for the report! The whole dathasloginevt mechanism is IMO too clunky and unelegant to go ahead with, I wouldn't be surprised if there are other bugs lurking there. Indeed. CREATE DATABASE doesn't copy dathasloginevt from the template database. ALTER EVENT TRIGGER ... ENABLE doesn't set dathasloginevt. In both cases triggers are enabled according to \dy output, but don't fire. The admin may not notice it immediately. -- Sergey Shinderukhttps://postgrespro.com/
gcc -Wclobbered in PostgresMain
Hi, hackers! While analyzing -Wclobbered warnings from gcc we found a true one in PostgresMain(): postgres.c: In function ‘PostgresMain’: postgres.c:4118:25: warning: variable ‘idle_in_transaction_timeout_enabled’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered] 4118 | boolidle_in_transaction_timeout_enabled = false; | ^~~ postgres.c:4119:25: warning: variable ‘idle_session_timeout_enabled’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered] 4119 | boolidle_session_timeout_enabled = false; | ^~~~ These variables must be declared volatile, because they are read after longjmp(). send_ready_for_query declared there is volatile. Without volatile, these variables are kept in registers and restored by longjump(). I think, this is harmless because the error handling code calls disable_all_timeouts() anyway. But strictly speaking the code is invoking undefined behavior by reading those variables after longjmp(), so it's worth fixing. And for consistency with send_ready_for_query too. I believe, making them volatile doesn't affect performance in any way. I also moved firstchar's declaration inside the loop where it's used, to make it clear that this variable needn't be volatile and is not preserved after longjmp(). Best regards, -- Sergey Shinderukhttps://postgrespro.com/diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 01b6cc1f7d3..56d8b0814b5 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -4111,12 +4111,11 @@ PostgresSingleUserMain(int argc, char *argv[], void PostgresMain(const char *dbname, const char *username) { - int firstchar; StringInfoData input_message; sigjmp_buf local_sigjmp_buf; volatile bool send_ready_for_query = true; - bool idle_in_transaction_timeout_enabled = false; - bool idle_session_timeout_enabled = false; + volatile bool idle_in_transaction_timeout_enabled = false; + volatile bool idle_session_timeout_enabled = false; Assert(dbname != NULL); Assert(username != NULL); @@ -4418,6 +4417,8 @@ PostgresMain(const char *dbname, const char *username) for (;;) { + int firstchar; + /* * At top of loop, reset extended-query-message flag, so that any * errors encountered in "idle" state don't provoke skip.
Postmaster fails to shut down right after crash restart
Hello, While developing a patch and running regression tests I noticed that the postmaster could fail to shut down right after crash restart. It could get stuck in the PM_WAIT_BACKENDS state forever. As far as I understand, the problem occurs when a shutdown signal is received before getting PMSIGNAL_RECOVERY_STARTED from the startup process. In that case the FatalError flag is not cleared, and the postmaster is stuck in PM_WAIT_BACKENDS waiting for the checkpointer, which ignores SIGTERM. To easily reproduce the problem I added pg_usleep in xlogrecovery.c just before SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED). See the patch attached. Then I run a script that simulates a crash and does pg_ctl stop: $ ./init.sh [...] $ ./stop-after-crash.sh waiting for server to start done server started waiting for server to shut down... failed pg_ctl: server does not shut down Some processes are still alive: $ ps uf -C postgres USER PID %CPU %MEMVSZ RSS TTY STAT START TIME COMMAND sergey279874 0.0 0.0 222816 28560 ?Ss 14:25 0:00 /home/sergey/pgwork/devel/install/bin/postgres -D data sergey279887 0.0 0.0 222772 5664 ?Ss 14:25 0:00 \_ postgres: io worker 0 sergey279888 0.0 0.0 222772 5664 ?Ss 14:25 0:00 \_ postgres: io worker 1 sergey279889 0.0 0.0 222772 5664 ?Ss 14:25 0:00 \_ postgres: io worker 2 sergey279891 0.0 0.0 222884 8480 ?Ss 14:25 0:00 \_ postgres: checkpointer Here is an excerpt from the debug log: postmaster[279874] LOG: all server processes terminated; reinitializing startup[279890] LOG: database system was interrupted; last known up at 2025-04-24 14:25:58 MSK startup[279890] LOG: database system was not properly shut down; automatic recovery in progress postmaster[279874] DEBUG: postmaster received shutdown request signal postmaster[279874] LOG: received fast shutdown request postmaster[279874] DEBUG: updating PMState from PM_STARTUP to PM_STOP_BACKENDS postmaster[279874] DEBUG: sending signal 15/SIGTERM to background writer process with pid 279892 postmaster[279874] DEBUG: sending signal 15/SIGTERM to checkpointer process with pid 279891 postmaster[279874] DEBUG: sending signal 15/SIGTERM to startup process with pid 279890 postmaster[279874] DEBUG: sending signal 15/SIGTERM to io worker process with pid 279889 postmaster[279874] DEBUG: sending signal 15/SIGTERM to io worker process with pid 279888 postmaster[279874] DEBUG: sending signal 15/SIGTERM to io worker process with pid 279887 postmaster[279874] DEBUG: updating PMState from PM_STOP_BACKENDS to PM_WAIT_BACKENDS startup[279890] LOG: invalid record length at 0/175A4D8: expected at least 24, got 0 postmaster[279874] DEBUG: postmaster received pmsignal signal startup[279890] LOG: redo is not required checkpointer[279891] LOG: checkpoint starting: end-of-recovery immediate wait checkpointer[279891] LOG: checkpoint complete: wrote 0 buffers (0.0%), wrote 3 SLRU buffers; 0 WAL file(s) added, 0 removed, 0 recycled; write=0.007 s, sync=0.002 s, total=0.026 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB; lsn=0/175A4D8, redo lsn=0/175A4D8 startup[279890] DEBUG: exit(0) postmaster[279874] DEBUG: updating PMState from PM_WAIT_BACKENDS to PM_WAIT_BACKENDS checkpointer[279891] DEBUG: checkpoint skipped because system is idle checkpointer[279891] DEBUG: checkpoint skipped because system is idle I don't know how to fix this, but thought it's worth reporting. Best regards, -- Sergey Shinderukhttps://postgrespro.com/diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index 6ce979f2d8b..19ee8b09685 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -1696,7 +1696,10 @@ PerformWalRecovery(void) * archiver if necessary. */ if (IsUnderPostmaster) + { + pg_usleep(300); SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED); + } /* * Allow read-only connections immediately if we're consistent already. init.sh Description: application/shellscript stop-after-crash.sh Description: application/shellscript logfile.gz Description: application/gzip
Re: Read-Write optimistic lock (Re: sinvaladt.c: remove msgnumLock, use atomic operations on maxMsgNum)
On 16.06.2025 17:41, Andres Freund wrote: TBH, I don't see a point in continuing with this thread without something that others can test. I rather doubt that the right fix here is to just change the lock model over, but without a repro I can't evaluate that. Hello, I think I can reproduce the issue with pgbench on a muti-core server. I start a regular select-only test with 64 clients, and while it's running, I start a plpgsql loop creating and dropping temporary tables from a single psql session. I observe ~25% drop in tps reported by pgbench until I cancel the query in psql. $ pgbench -n -S -c64 -j64 -T300 -P1 progress: 10.0 s, 1249724.7 tps, lat 0.051 ms stddev 0.002, 0 failed progress: 11.0 s, 1248289.0 tps, lat 0.051 ms stddev 0.002, 0 failed progress: 12.0 s, 1246001.0 tps, lat 0.051 ms stddev 0.002, 0 failed progress: 13.0 s, 1247832.5 tps, lat 0.051 ms stddev 0.002, 0 failed progress: 14.0 s, 1248205.8 tps, lat 0.051 ms stddev 0.002, 0 failed progress: 15.0 s, 1247737.3 tps, lat 0.051 ms stddev 0.002, 0 failed progress: 16.0 s, 1219444.3 tps, lat 0.052 ms stddev 0.039, 0 failed progress: 17.0 s, 893943.4 tps, lat 0.071 ms stddev 0.159, 0 failed progress: 18.0 s, 927861.3 tps, lat 0.069 ms stddev 0.150, 0 failed progress: 19.0 s, 886317.1 tps, lat 0.072 ms stddev 0.163, 0 failed progress: 20.0 s, 877200.1 tps, lat 0.073 ms stddev 0.164, 0 failed progress: 21.0 s, 875424.4 tps, lat 0.073 ms stddev 0.163, 0 failed progress: 22.0 s, 877693.0 tps, lat 0.073 ms stddev 0.165, 0 failed progress: 23.0 s, 897202.8 tps, lat 0.071 ms stddev 0.158, 0 failed progress: 24.0 s, 917853.4 tps, lat 0.070 ms stddev 0.153, 0 failed progress: 25.0 s, 907865.1 tps, lat 0.070 ms stddev 0.154, 0 failed Here I started the following loop in psql around 17s and tps dropped by ~25%: do $$ begin for i in 1..100 loop create temp table tt1 (a bigserial primary key, b text); drop table tt1; commit; end loop; end; $$; Now, if I simply remove the spinlock in SIGetDataEntries, I see a drop of just ~6% under concurrent DDL. I think this strongly suggests that the spinlock is the bottleneck. Before that, I tried removing `if (!hasMessages) return` optimization in SIGetDataEntries to stress the spinlock and observed ~35% drop in tps of select-only with an empty sinval queue (no DDL running in background). Then I also removed the spinlock in SIGetDataEntries, and the loss was just ~4%, which may be noise. I think this also suggests that the spinlock could be the bottleneck. I'm running this on a 2 socket AMD EPYC 9654 96-Core server with postgres and pgbench bound to distinct CPUs. PGDATA is placed on tmpfs. postgres is running with the default settings. pgbench tables are of scale 1. pgbench is connecting via loopback/127.0.0.1. Does this sound convincing? Best regards, -- Sergey Shinderukhttps://postgrespro.com/
Re: Error with DEFAULT VALUE in temp table
Hi, Tom! Thank you for working on this. I see you've fixed the patch and committed it as a0b99fc1220. I tested it a bit and see some side effects which may be unintentional. 1. SCHEMA lost object_name. Before: postgres=# create schema foo; CREATE SCHEMA postgres=# drop schema foo; DROP SCHEMA postgres=# select * from dropped_objects \gx -[ RECORD 1 ]---+--- n | 1 classid | 2615 objid | 16404 objsubid| 0 original| t normal | f is_temporary| f object_type | schema schema_name | object_name | foo object_identity | foo address_names | {foo} address_args| {} After: postgres=# select * from dropped_objects \gx -[ RECORD 1 ]---+--- n | 1 classid | 2615 objid | 16394 objsubid| 0 original| t normal | f is_temporary| f object_type | schema schema_name | object_name | object_identity | foo address_names | {foo} address_args| {} 2. DEFAULT VALUE now has schema_name and object_name. Before: postgres=# create temp table bar (a int default 0); CREATE TABLE postgres=# drop table bar; DROP TABLE postgres=# select * from dropped_objects where object_type = 'default value' \gx -[ RECORD 1 ]---+-- n | 4 classid | 2604 objid | 16422 objsubid| 0 original| f normal | f is_temporary| f object_type | default value schema_name | object_name | object_identity | for pg_temp.bar.a address_names | {pg_temp,bar,a} address_args| {} After: postgres=# select * from dropped_objects where object_type = 'default value' \gx -[ RECORD 1 ]---+-- n | 4 classid | 2604 objid | 16430 objsubid| 0 original| f normal | f is_temporary| t object_type | default value schema_name | pg_temp object_name | bar object_identity | for pg_temp.bar.a address_names | {pg_temp,bar,a} address_args| {} This may be intentional, but doesn't quite match the description for object_name in the docs: Name of the object, if the combination of schema and name can be used as a unique identifier for the object; otherwise NULL. Also it doesn't match with the record for the column itself: postgres=# create temp table bar (a int default 0); CREATE TABLE postgres=# alter table bar drop column a; ALTER TABLE postgres=# select * from dropped_objects \gx -[ RECORD 1 ]---+-- n | 1 classid | 1259 objid | 16435 objsubid| 1 original| t normal | f is_temporary| t object_type | table column schema_name | pg_temp object_name | object_identity | pg_temp.bar.a address_names | {pg_temp,bar,a} address_args| {} -[ RECORD 2 ]---+-- n | 2 classid | 2604 objid | 16438 objsubid| 0 original| f normal | f is_temporary| t object_type | default value schema_name | pg_temp object_name | bar object_identity | for pg_temp.bar.a address_names | {pg_temp,bar,a} address_args| {} object_name is null for the table column, but not null for its default value. As for schema_name, I'm not sure whether it should be null or not. Currently schema_name is null for triggers and policy objects, but that may be accidental. Best regards, -- Sergey Shinderukhttps://postgrespro.com/
Re: Error with DEFAULT VALUE in temp table
On 12.09.2025 14:01, Sergey Shinderuk wrote: object_name is null for the table column, but not null for its default value. As for schema_name, I'm not sure whether it should be null or not. Currently schema_name is null for triggers and policy objects, but that may be accidental. Perhaps "default value" should be like "table constraint", which have schema_name and null object_name. postgres=# create temp table bar (a int not null default 0); CREATE TABLE postgres=# alter table bar drop column a; ALTER TABLE postgres=# select * from dropped_objects \gx -[ RECORD 1 ]---+-- n | 1 classid | 1259 objid | 16445 objsubid| 1 original| t normal | f is_temporary| t object_type | table column schema_name | pg_temp object_name | object_identity | pg_temp.bar.a address_names | {pg_temp,bar,a} address_args| {} -[ RECORD 2 ]---+-- n | 2 classid | 2604 objid | 16448 objsubid| 0 original| f normal | f is_temporary| t object_type | default value schema_name | pg_temp object_name | bar object_identity | for pg_temp.bar.a address_names | {pg_temp,bar,a} address_args| {} -[ RECORD 3 ]---+-- n | 3 classid | 2606 objid | 16449 objsubid| 0 original| f normal | f is_temporary| t object_type | table constraint schema_name | pg_temp object_name | object_identity | bar_a_not_null on pg_temp.bar address_names | {pg_temp,bar,bar_a_not_null} address_args| {} Best regards, -- Sergey Shinderukhttps://postgrespro.com/
Re: Error with DEFAULT VALUE in temp table
On 13.09.2025 00:19, Tom Lane wrote: Fixed (and tested) in the attached. Great! Thank you. > So that "else" action was unreachable, and the code failed > to set "istemp" true for its own temp schema. As for dropping my own temp schema, it's still a bit inconsistent (as it was before): postgres=# select pg_my_temp_schema()::regnamespace; pg_my_temp_schema --- pg_temp_0 (1 row) postgres=# drop schema pg_temp_0; DROP SCHEMA postgres=# select * from dropped_objects where object_type = 'schema' and is_temporary \gx -[ RECORD 1 ]---+-- n | 7 classid | 2615 objid | 16398 objsubid| 0 original| t normal | f is_temporary| t object_type | schema schema_name | object_name | pg_temp_0 object_identity | pg_temp address_names | {pg_temp} address_args| {} object_identity is pg_temp, but object_name is pg_temp_0. But maybe that's okay. Anyway, I don't think that dropping my own temp schema makes sense. Also I noticed that schema_name for temp functions doesn't match with object_identity (pg_temp vs pg_temp_1): postgres=# create function pg_temp.bar(int) returns int as 'select $1' language sql; CREATE FUNCTION postgres=# drop function pg_temp.bar(int); DROP FUNCTION postgres=# select * from dropped_objects where object_type = 'function' and is_temporary \gx -[ RECORD 1 ]---+--- n | 8 classid | 1255 objid | 16412 objsubid| 0 original| t normal | f is_temporary| t object_type | function schema_name | pg_temp object_name | object_identity | pg_temp_1.bar(integer) address_names | {pg_temp,bar} address_args| {integer} There should be a call to get_namespace_name_or_temp somewhere, I guess. If you say this should be fixed, I can come up with a patch later. But maybe it's trivial. Thanks again! -- Sergey Shinderukhttps://postgrespro.com/