On Sat, Jan 7, 2023 at 3:40 AM Andrew Dunstan <and...@dunslane.net> wrote:
> OK, should I now try re-enabling TAP tests on lorikeet?

Not before https://commitfest.postgresql.org/41/4032/ is committed. After that, it might be worth a try? I have no idea if the PANIC problem I mentioned last night would apply to lorikeet's kernel too.

To summarise the kinds of failure we have analysed in this thread:

1. SysV semaphores are buggy; fixed, I hope, by a recent commit (= just don't use them).

2. The regular crashes we already knew about from other threads, due to buggy signal masking, seem to be fixed, coincidentally, by CF #4032, not yet committed (= don't rely on sa_mask for correctness).

3. PANIC apparently caused by non-atomic rename(), based on analysis of similar failures seen on other old buggy OSes back in 2018.

If lorikeet has problem #3 (which it may not; we know from CF #3951 that kernel versions differ in related respects, and Server 2019 as used on CI seems to have the most conservative/old Windows behaviour) then it might fail in the TAP tests just like the proposed CI-for-Cygwin patch, unless you also set data_sync_retry=on, which seems like a pretty ugly workaround to me. I don't know what else might be broken by non-atomic rename(), and I'd rather not find out :-D

I ended up here by trying to tidy up some weird-looking nonsense in our code while working on general portability cleanup, since I needed a way to check if __CYGWIN__ stuff still works. What we found out is that it's more broken than anyone realised, and now I have to pull the emergency rabbit hole ejection cord because I have less than zero time for or interest in debugging Cygwin.
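
For readers unfamiliar with why a non-atomic rename() (failure kind 3 above) is a problem, here is a minimal sketch of the general write-to-temp-then-rename pattern that depends on it. This is an illustration in Python, not PostgreSQL code, and it is not established here that this is the exact path behind the PANIC. The helper name atomic_replace is hypothetical, and the sketch skips fsync of the containing directory, so it shows atomic replacement only, not full durability.

```python
import os
import tempfile


def atomic_replace(path: str, data: bytes) -> None:
    """Replace the contents of `path` via a temp file plus rename.

    Hypothetical helper for illustration. Readers of `path` should see
    either the old contents or the new contents, never a missing or
    half-written file, provided the OS makes the rename atomic.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays
    # within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push file contents to stable storage
        # os.replace() overwrites an existing destination. If the
        # platform's rename is not atomic, there can be a window in which
        # `path` does not exist or the call fails, and callers that
        # assume atomicity may treat that as a fatal error.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Code written this way has no fallback for an intermediate state, which is why an OS that exposes one can turn a routine file update into a hard failure.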