On Sat, Aug 24, 2024 at 12:22 AM Anthonin Bonnefoy
<anthonin.bonne...@datadoghq.com> wrote:
> On Thu, Aug 22, 2024 at 12:33 PM Thomas Munro <thomas.mu...@gmail.com> wrote:
> > I fear that back-porting, for the LLVM project, would mean "we fix it
> > in main/20.x, and also back-port it to 19.x". Do distros back-port
> > further?
>
> That's also my fear, I'm not familiar with distros back-port policy
> but eyeballing ubuntu package changelog[1], it seems to be mostly
> build fixes.
>
> Given that there's no visible way to fix the relocation issue, I
> wonder if jit shouldn't be disabled for arm64 until either the
> RuntimeDyld fix is merged or the switch to JITLink is done. Disabling
> jit tuple deforming may be enough but I'm not confident the issue
> won't happen in a different part.

We've experienced something a little similar before: In the early days of PostgreSQL LLVM, it didn't work at all on ARM or POWER. We sent a trivial fix[1] upstream that landed in LLVM 7; since it was a small and obvious problem and it took a long time for some distros to ship LLVM 7, we even contemplated hot-patching that LLVM function with our own copy (but, ugh, only for about 7 nanoseconds). That was before we turned JIT on by default, and was also easier to deal with because it was an obvious consistent failure in basic tests, so packagers probably just disabled the build option on those architectures.

IIUC this one is a random and rare crash depending on malloc() and perhaps also the working size of your virtual memory dart board. (Annoyingly, I had tried to reproduce this quite a few times on small ARM systems when earlier reports came in, d'oh!).

This degree of support window mismatch is probably what triggered RHEL to develop their new rolling LLVM version policy. Unfortunately, it's the other distros that tell *us* which versions to support, and not the reverse (for example CF #4920 is about to drop support for LLVM < 14, but that will only be for PostgreSQL 18+).

Ultimately, if it doesn't work, and doesn't get fixed, it's hard for us to do much about it. But hmm, this is probably madness... I wonder if it would be feasible to detect address span overflow ourselves at a useful time, as a kind of band-aid defence...

[1] https://www.postgresql.org/message-id/CAEepm%3D39F_B3Ou8S3OrUw%2BhJEUP3p%3DwCu0ug-TTW67qKN53g3w%40mail.gmail.com
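
To make the "detect address span overflow ourselves" idea a bit more concrete, here is a rough sketch of what such a check might look like. It is only an illustration, not code from PostgreSQL or LLVM: the SectionRange struct, the sections_fit_in_span() function and the ASSUMED_MAX_SPAN constant are all hypothetical names, and the 4 GiB limit is an assumption based on the reach of AArch64 page-relative (ADRP) addressing rather than something stated in this thread. Where the section addresses would come from, and what the "useful time" to run the check is, are left open.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one memory section allocated for JIT-ed code/data. */
typedef struct SectionRange
{
	uintptr_t	start;			/* first byte of the section */
	size_t		size;			/* length of the section in bytes */
} SectionRange;

/*
 * Assumption: relocations between sections can only span about 4 GiB
 * (the reach of an AArch64 ADRP instruction).  Adjust if the real limit
 * turns out to be different.
 */
#define ASSUMED_MAX_SPAN ((uint64_t) 1 << 32)

/*
 * Return true if all sections fit inside a window of max_span bytes,
 * i.e. the distance from the lowest start to the highest end is small
 * enough that a limited-range relocation between any two of them could
 * still be encoded.  An empty list trivially fits.
 */
static bool
sections_fit_in_span(const SectionRange *sections, size_t nsections,
					 uint64_t max_span)
{
	uint64_t	lowest = UINT64_MAX;
	uint64_t	highest = 0;

	if (nsections == 0)
		return true;

	for (size_t i = 0; i < nsections; i++)
	{
		uint64_t	start = (uint64_t) sections[i].start;
		uint64_t	end = start + (uint64_t) sections[i].size;

		/* Treat address wraparound as "does not fit". */
		if (end < start)
			return false;

		if (start < lowest)
			lowest = start;
		if (end > highest)
			highest = end;
	}

	return (highest - lowest) <= max_span;
}
```

If something like this returned false before the object was linked, the caller could fall back to non-JIT execution for that query instead of crashing. Whether the relevant addresses can actually be observed early enough is exactly the open question.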