Hi,

As threatened in [1]...

For CI, originally in the AIO project but now more generally, I wanted to get Windows backtraces as part of CI. I was also confused why Visual Studio's "just in time debugging" (i.e. a window popping up offering to debug a process when it crashes) didn't work with postgres.

My first attempt was to try to use the existing crashdump stuff in pgwin32_install_crashdump_handler(). That's not really quite what I want, because it only handles postmaster rather than any binary, but I thought it'd be a good start. But outside of toy situations it didn't work for me.

A bunch of debugging later I figured out that the reason neither the SetUnhandledExceptionFilter() nor JIT debugging works is that the SEM_NOGPFAULTERRORBOX in the

    SetErrorMode(SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX);

we do in startup_hacks() prevents the paths dealing with crashes from being reached.

The SEM_NOGPFAULTERRORBOX hails from:

    commit 27bff7502f04ee01237ed3f5a997748ae43d3a81
    Author: Bruce Momjian <br...@momjian.us>
    Date:   2006-06-12 16:17:20 +0000

        Prevent Win32 from displaying a popup box on backend crash.  Instead let
        the postmaster deal with it.

        Magnus Hagander

I actually see error popups despite SEM_NOGPFAULTERRORBOX, at least for paths reaching abort() (and thus our assertions).

The reason for abort() error boxes not being suppressed appears to be that in debug mode a separate facility is responsible for that: [2], [3]

    "The default behavior is to print the message. _CALL_REPORTFAULT, if set,
    specifies that a Watson crash dump is generated and reported when abort is
    called. By default, crash dump reporting is enabled in non-DEBUG builds."

We apparently need _set_abort_behavior(_CALL_REPORTFAULT) to have abort() behave the same between debug and release builds. [4]

To prevent the error popups we appear to at least need to call _CrtSetReportMode(). The docs say:

    If you do not call _CrtSetReportMode to define the output destination of
    messages, then the following defaults are in effect:

    Assertion failures and errors are directed to a debug message window.

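To make the error-mode and abort() part concrete, here is a minimal sketch of what the startup code could look like. This is not a patch: configure_crash_paths() is a hypothetical helper name, and it assumes the fix is simply to stop setting SEM_NOGPFAULTERRORBOX. The Windows-only calls are guarded so the snippet also compiles (as a no-op) elsewhere.

```c
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#endif

/*
 * Hypothetical helper (not existing PostgreSQL code): set up the process so
 * that crashes and abort() reach the OS crash-handling paths.
 * Returns 0; on non-Windows platforms there is nothing to do.
 */
int
configure_crash_paths(void)
{
#ifdef _WIN32
    /*
     * Assumption: keep SEM_FAILCRITICALERRORS but no longer pass
     * SEM_NOGPFAULTERRORBOX, since that flag is what keeps the unhandled
     * exception filter and JIT debugging from being reached.
     */
    SetErrorMode(SEM_FAILCRITICALERRORS);

#ifdef _MSC_VER
    /*
     * Ask abort() to generate a fault report in debug builds too. The second
     * argument is documented as the mask of flag bits to change; passing the
     * same bit for both sets it.
     */
    _set_abort_behavior(_CALL_REPORTFAULT, _CALL_REPORTFAULT);
#endif
#endif
    return 0;
}
```

Something like this would be called once, early in main(), before anything has a chance to crash.
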
We can configure the CRT so that assertion failures and errors go to stderr instead, by calling

    _CrtSetReportMode(_CRT_ASSERT, _CRTDBG_MODE_FILE | _CRTDBG_MODE_DEBUG);
    _CrtSetReportFile(_CRT_ASSERT, _CRTDBG_FILE_STDERR);

(and the same for _CRT_ERROR and perhaps _CRT_WARN), which removes the default _CRTDBG_MODE_WNDW.

It's possible that we'd need to do more than this, but this was sufficient to get crash reports for segfaults and abort() in both assert and release builds, without seeing an error popup.

To actually get the crash reports I ended up doing the following on the OS level [5]:

    Set-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug' -Name 'Debugger' -Value '\"C:\Windows Kits\10\Debuggers\x64\cdb.exe\" -p %ld -e %ld -g -kqm -c \".lines -e; .symfix+ ;.logappend c:\cirrus\crashlog.txt ; !peb; ~*kP ; .logclose ; q \"' ; `
    New-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug' -Name 'Auto' -Value 1 -PropertyType DWord ; `
    Get-ItemProperty -Path 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug' -Name Debugger; `

This requires 'cdb' to be present, which is included in the Windows 10 SDK (or the SDK for other OS versions, it doesn't appear to have changed much).

Whenever there's an unhandled crash, cdb.exe is invoked with the parameters above, which appends the crash report to crashlog.txt.

Alternatively we can generate "minidumps" [6], but that doesn't appear to be more helpful for CI purposes at least - all we'd do is to create a backtrace using the same tool. But it might be helpful for local development, to e.g. analyze crashes in more detail.

The above ends up dumping all crashes into a single file, but that can probably be improved. But cdb is so gnarly that I wanted to stop looking once I got this far...

Andrew, I wonder if something like this could make sense for Windows BF animals?

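For reference, the _CrtSetReportMode() / _CrtSetReportFile() calls described above could be wrapped up roughly as follows. This is only a sketch under assumptions: redirect_crt_reports_to_stderr() is a made-up name, and whether _CRT_WARN really needs the same treatment is untested. The MSVC-only parts are guarded so it compiles as a no-op with other toolchains.

```c
#ifdef _MSC_VER
#include <crtdbg.h>
#endif

/*
 * Hypothetical helper: route CRT warnings, errors and assertion failures to
 * stderr (and the debugger output) instead of a popup window.
 * Returns 0 on success, -1 if the CRT rejects a report mode.
 */
int
redirect_crt_reports_to_stderr(void)
{
#ifdef _MSC_VER
    static const int report_types[] = {_CRT_WARN, _CRT_ERROR, _CRT_ASSERT};
    int         i;

    for (i = 0; i < (int) (sizeof(report_types) / sizeof(report_types[0])); i++)
    {
        /* Not including _CRTDBG_MODE_WNDW is what suppresses the popup. */
        if (_CrtSetReportMode(report_types[i],
                              _CRTDBG_MODE_FILE | _CRTDBG_MODE_DEBUG) == -1)
            return -1;
        _CrtSetReportFile(report_types[i], _CRTDBG_FILE_STDERR);
    }
#endif
    return 0;
}
```

In builds without _DEBUG the CRT turns these calls into no-ops, so the helper should be harmless in release builds.
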
Greetings,

Andres Freund

[1] https://postgr.es/m/20211001222752.wrz7erzh4cajvgp6%40alap3.anarazel.de
[2] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/crtsetreportmode?view=msvc-160
[3] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/set-abort-behavior?view=msvc-160
[4] If anybody can explain to me what the two different parameters to _set_abort_behavior() do, I'd be all ears
[5] https://docs.microsoft.com/en-us/windows/win32/debug/configuring-automatic-debugging
[6] https://docs.microsoft.com/en-us/windows/win32/wer/wer-settings