Peter Eisentraut <peter.eisentr...@2ndquadrant.com> writes:
> I took this patch for a quick spin on macOS.  The result was that the
> test suite hangs in the test src/test/recovery/t/017_shm.pl.  I didn't
> see any mentions of this anywhere in the thread, but that test is newer
> than the beginning of this thread.  Can anyone confirm or deny this
> issue?  Is it specific to macOS perhaps?

Yeah, I duplicated the problem in macOS Catalina (10.15.2), using
today's HEAD.  The core regression tests pass, as do the earlier
recovery tests (I didn't try a full check-world though).  Somewhere
early in 017_shm.pl, things freeze up with four postmaster-child
processes stuck in 100%-CPU-consuming loops.  I captured stack
traces:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff6554dbb6 libsystem_kernel.dylib`kqueue + 10
    frame #1: 0x0000000105511533 postgres`CreateWaitEventSet(context=<unavailable>, nevents=<unavailable>) at latch.c:622:19 [opt]
    frame #2: 0x0000000105511305 postgres`WaitLatchOrSocket(latch=0x0000000112e02da4, wakeEvents=41, sock=-1, timeout=237000, wait_event_info=83886084) at latch.c:389:22 [opt]
    frame #3: 0x00000001054a7073 postgres`CheckpointerMain at checkpointer.c:514:10 [opt]
    frame #4: 0x00000001052da390 postgres`AuxiliaryProcessMain(argc=2, argv=0x00007ffeea9dded0) at bootstrap.c:461:4 [opt]

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff6554dbce libsystem_kernel.dylib`kevent + 10
    frame #1: 0x0000000105511ddc postgres`WaitEventAdjustKqueue(set=0x00007fc8e8805920, event=0x00007fc8e8805958, old_events=<unavailable>) at latch.c:1034:7 [opt]
    frame #2: 0x0000000105511638 postgres`AddWaitEventToSet(set=<unavailable>, events=<unavailable>, fd=<unavailable>, latch=<unavailable>, user_data=<unavailable>) at latch.c:778:2 [opt]
    frame #3: 0x0000000105511342 postgres`WaitLatchOrSocket(latch=0x0000000112e030f4, wakeEvents=41, sock=-1, timeout=200, wait_event_info=83886083) at latch.c:397:3 [opt]
    frame #4: 0x00000001054a6d69 postgres`BackgroundWriterMain at bgwriter.c:304:8 [opt]
    frame #5: 0x00000001052da38b postgres`AuxiliaryProcessMain(argc=2, argv=0x00007ffeea9dded0) at bootstrap.c:456:4 [opt]

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff65549c66 libsystem_kernel.dylib`close + 10
    frame #1: 0x0000000105511466 postgres`WaitLatchOrSocket [inlined] FreeWaitEventSet(set=<unavailable>) at latch.c:660:2 [opt]
    frame #2: 0x000000010551145d postgres`WaitLatchOrSocket(latch=0x0000000112e03444, wakeEvents=<unavailable>, sock=-1, timeout=5000, wait_event_info=83886093) at latch.c:432 [opt]
    frame #3: 0x00000001054b8685 postgres`WalWriterMain at walwriter.c:256:10 [opt]
    frame #4: 0x00000001052da39a postgres`AuxiliaryProcessMain(argc=2, argv=0x00007ffeea9dded0) at bootstrap.c:467:4 [opt]

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff655515be libsystem_kernel.dylib`__select + 10
    frame #1: 0x00000001056a6191 postgres`pg_usleep(microsec=<unavailable>) at pgsleep.c:56:10 [opt]
    frame #2: 0x00000001054abe12 postgres`backend_read_statsfile at pgstat.c:5720:3 [opt]
    frame #3: 0x00000001054adcc0 postgres`pgstat_fetch_stat_dbentry(dbid=<unavailable>) at pgstat.c:2431:2 [opt]
    frame #4: 0x00000001054a320c postgres`do_start_worker at autovacuum.c:1248:20 [opt]
    frame #5: 0x00000001054a2639 postgres`AutoVacLauncherMain [inlined] launch_worker(now=632853327674576) at autovacuum.c:1357:9 [opt]
    frame #6: 0x00000001054a2634 postgres`AutoVacLauncherMain(argc=<unavailable>, argv=<unavailable>) at autovacuum.c:769 [opt]
    frame #7: 0x00000001054a1ea7 postgres`StartAutoVacLauncher at autovacuum.c:415:4 [opt]

I'm not sure how much faith to put in the last couple of those, as
stopping the earlier processes could perhaps have had side-effects.
But evidently 017_shm.pl is doing something that interferes with
our ability to create kqueue-based WaitEventSets.

			regards, tom lane