Den ons 26 feb. 2025 kl 14:45 skrev Daniel Sahlberg < daniel.l.sahlb...@gmail.com>:
> Den ons 26 feb. 2025 kl 14:30 skrev Stefan Sperling <s...@apache.org>: > >> On Wed, Feb 26, 2025 at 01:24:22PM +0100, Daniel Sahlberg wrote: >> > Looking at the code, I'm assuming that if SVN_UNALIGNED_ACCESS_IS_OK was >> > set to 0 (so the char by char loop after #endif is used instead), the >> code >> > would run just fine. Can you confirm that? >> > >> > SVN_UNALIGNED_ACCESS_IS_OK is defined as follows - we'd probably have to >> > change that for GCC15. >> >> Unaligned access is undefined behaviour (UB). Compilers are now taking >> more liberties with undefined behaviour than they used to. So we need >> to be very careful in our assumptions about UB. >> >> Our code which is guarded by SVN_UNALIGNED_ACCESS_IS_OK might become >> unsafe when built with newer compilers on any platform. This code's >> behaviour is no longer under our control, but the compiler's s control. >> svn_eol__find_eol_start() might crash, become a no-op, or have unknown >> side-effects resulting in annoyances or even CVEs. >> >> At the very least, we should have this macro off by default and require >> it to be enabled manually. For best safety, the code should be removed. >> I understand that removing this code will incur a performance cost. >> The code was backed up by elaborate performance testing when it was >> written years ago. >> >> But what modern compilers make of UB will likely keep changing for the >> worse (or better, depending on which side of the UB-fence you are on), >> violating assumptions about platform behaviour in the macro definition: >> >> > #ifndef SVN_UNALIGNED_ACCESS_IS_OK >> > # if defined(_M_IX86) || defined(i386) \ >> > || defined(_M_X64) || defined(__x86_64) \ >> > || defined(__powerpc__) || defined(__ppc__) >> > # define SVN_UNALIGNED_ACCESS_IS_OK 1 >> > # else >> > # define SVN_UNALIGNED_ACCESS_IS_OK 0 >> > # endif >> > #endif >> >> I have seen contrived programs being translated into nothing on x86 when >> built with a relatively recent version of clang (16) at optimization >> levels >> of -O2 and up. For example, the following program will literally have its >> entire main() function elided: >> >> [[[ >> #include <float.h> >> int main(void) { >> double v = DBL_MAX; >> int v2 = v; >> if (v2 != v) return 0; >> return 1; >> } >> ]]] >> (courtesy of https://bsd.network/@kristapsdz/113870782376321887) >> >> On OpenBSD/amd64, with clang -O2 this program compiles to 3 instructions, >> an empty function. The resulting executable just crashes when run. >> [[[ >> (gdb) disassemble main >> 0x0000000000001950 <+0>: endbr64 >> 0x0000000000001954 <+4>: push %rbp >> 0x0000000000001955 <+5>: mov %rsp,%rbp >> ]]] >> >> At -01 there are 4 lines (still crashes). Without optimization (-O0) >> there are 38 lines of assembly instead of 3, and the program exits >> with status 0. >> >> Granted, what I have seen are contrived cases. But they resulted >> from someone looking into problems while doing development in C. >> As a C developer, this scares me enough to avoid UB at all costs >> because I cannot trust the compiler to warn about my mistakes >> which introduce UB. Rather, the compiler might decide to elide >> the code and change the intended meaning of the program. >> > > Thanks for the detailed explanation. It seems really ugly. > > Would add a separate for-loop before "one machine word at a time", > iterating until buf is aligned (maximum sizeof(void*)-1 times) help here? > Should have researched a bit more before replying. There was already some code to do exactly this, but it was removed in r1722879: [[[ stefan2 2016-01-04 15:17:04 Stop using pointer arithmetics to check for proper alignment because that is not portable. As a result, platforms that don't allow unaligned data access will suffer a small additional performance hit. * subversion/libsvn_subr/eol.c (svn_eol__find_eol_start): No longer attempt aligned chunky processing when unaligned access is not supported. * subversion/libsvn_subr/utf_validate.c (first_non_fsm_start_char): Same. ]]] Don't quite understand why but there is probably a reasonable explanation. > > Any idea of the performance gain on SVN_UNALIGNED_ACCESS_IS_OK on modern > hardware/modern compilers? > > Cheers, > Daniel > >