jrtc27 added a comment. In D79916#2279871 <https://reviews.llvm.org/D79916#2279871>, @Bdragon28 wrote:
> In D79916#2279866 <https://reviews.llvm.org/D79916#2279866>, @jrtc27 wrote: > >> In D79916#2279863 <https://reviews.llvm.org/D79916#2279863>, @Bdragon28 >> wrote: >> >>> In D79916#2279816 <https://reviews.llvm.org/D79916#2279816>, @jrtc27 wrote: >>> >>>> In D79916#2279812 <https://reviews.llvm.org/D79916#2279812>, @Bdragon28 >>>> wrote: >>>> >>>>> In D79916#2279045 <https://reviews.llvm.org/D79916#2279045>, @jrtc27 >>>>> wrote: >>>>> >>>>>> This has significantly regressed FreeBSD's performance with the new >>>>>> version of Clang. It seems Clang does not inline functions at -O1, >>>>>> unlike GCC, and since FreeBSD currently compiles its kernel with -O >>>>>> whenever debug symbols are enabled[1] (which, of course, is almost >>>>>> always true), this results in all its `static inline` helper functions >>>>>> not being inlined at all, a pattern that is common in the kernel, used >>>>>> for things like `get_curthread` and the atomics implementations. >>>>>> >>>>>> [1] This is a dubious decision made in r140400 in 2005 to provide "truer >>>>>> debugger stack traces" (well, before then there was ping-ponging between >>>>>> -O and -O2 based on concerns around correctness vs performance, but >>>>>> amd64 is an exception that has always used -O2 since r127180 it seems). >>>>>> Given that GCC will inline at -O, at least these days, the motivation >>>>>> seems to no longer exist, and compiling a kernel at anything other than >>>>>> -O2 (or maybe -O3) seems like a silly thing to do, but nevertheless it's >>>>>> what is currently done. >>>>>> >>>>>> Cc: @dim @trasz >>>>> >>>>> This is actually SUCH a bad idea that a kernel built with -O will *not >>>>> work at all* on 32 bit powerpc platforms (presumably due to allocating >>>>> stack frames in the middle of assembly fragments in the memory management >>>>> that are supposed to be inlined at all times.) I had to hack kern.pre.mk >>>>> to rquest -O2 at all times just to get a functioning kernel. >>>> >>>> Well, -O0, -O1, -O2 and -O should all produce working kernels, and any >>>> cases where they don't are compiler bugs (or kernel bugs if they rely on >>>> UB) that should be fixed, not worked around by tweaking the compiler flags >>>> in a fragile way until you get the behaviour relied on. Correctness and >>>> performance are very different issues here. >>> >>> As an example: >>> >>> static __inline void >>> mtsrin(vm_offset_t va, register_t value) >>> { >>> >>> __asm __volatile ("mtsrin %0,%1; isync" :: "r"(value), "r"(va)); >>> } >>> >>> This code is used in the mmu when bootstrapping the cpu like so: >>> >>> for (i = 0; i < 16; i++) >>> mtsrin(i << ADDR_SR_SHFT, kernel_pmap->pm_sr[i]); >>> powerpc_sync(); >>> >>> sdr = (u_int)moea_pteg_table | (moea_pteg_mask >> 10); >>> __asm __volatile("mtsdr1 %0" :: "r"(sdr)); >>> isync(); >>> >>> tlbia(); >>> >>> During the loop there, we are in the middle of programming the MMU segment >>> registers in real mode, and is supposed to be doing all work out of >>> registers. (and powerpc_sync() and isync() should be expanded to their >>> single assembly instruction, not a function call. The whole point of >>> calling those is that we are in an inconsistent hardware state and need to >>> sync up before continuing execution) >>> >>> If there isn't a way to force inlining, we will have to change to using >>> preprocessor macros in cpufunc.h. >> >> There is, it's called `__attribute__((always_inline))` and supported by both >> GCC and Clang. But at -O0 you'll still have register allocation to deal >> with, so really that code is just fundamentally broken and should not be >> written in C. There is no way for you to guarantee stack spills are not >> used, it's way out of scope for C. > > Is there a way to have always_inline and unused at the same time? I tried > using always_inline and it caused warnings in things that used *parts* of > cpufunc.h. Both `__attribute__((always_inline)) __attribute__((unused))` and `__attribute__((always_inline, unused))` work, but really you should use `__always_inline __unused` in FreeBSD (which will expand to the former). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79916/new/ https://reviews.llvm.org/D79916 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits