jrtc27 added a comment.

In D79916#2279875 <https://reviews.llvm.org/D79916#2279875>, @jrtc27 wrote:

> In D79916#2279871 <https://reviews.llvm.org/D79916#2279871>, @Bdragon28 wrote:
>
>> In D79916#2279866 <https://reviews.llvm.org/D79916#2279866>, @jrtc27 wrote:
>>
>>> In D79916#2279863 <https://reviews.llvm.org/D79916#2279863>, @Bdragon28 
>>> wrote:
>>>
>>>> In D79916#2279816 <https://reviews.llvm.org/D79916#2279816>, @jrtc27 wrote:
>>>>
>>>>> In D79916#2279812 <https://reviews.llvm.org/D79916#2279812>, @Bdragon28 
>>>>> wrote:
>>>>>
>>>>>> In D79916#2279045 <https://reviews.llvm.org/D79916#2279045>, @jrtc27 
>>>>>> wrote:
>>>>>>
>>>>>>> This has significantly regressed FreeBSD's performance with the new 
>>>>>>> version of Clang. It seems Clang does not inline functions at -O1, 
>>>>>>> unlike GCC, and since FreeBSD currently compiles its kernel with -O 
>>>>>>> whenever debug symbols are enabled[1] (which, of course, is almost 
>>>>>>> always true), this results in all its `static inline` helper functions 
>>>>>>> not being inlined at all, a pattern that is common in the kernel, used 
>>>>>>> for things like `get_curthread` and the atomics implementations.
>>>>>>>
>>>>>>> [1] This is a dubious decision made in r140400 in 2005 to provide 
>>>>>>> "truer debugger stack traces" (well, before then there was ping-ponging 
>>>>>>> between -O and -O2 based on concerns around correctness vs performance, 
>>>>>>> but amd64 is an exception that has always used -O2 since r127180 it 
>>>>>>> seems). Given that GCC will inline at -O, at least these days, the 
>>>>>>> motivation seems to no longer exist, and compiling a kernel at anything 
>>>>>>> other than -O2 (or maybe -O3) seems like a silly thing to do, but 
>>>>>>> nevertheless it's what is currently done.
>>>>>>>
>>>>>>> Cc: @dim @trasz
>>>>>>
>>>>>> This is actually SUCH a bad idea that a kernel built with -O will *not 
>>>>>> work at all* on 32 bit powerpc platforms (presumably due to allocating 
>>>>>> stack frames in the middle of assembly fragments in the memory 
>>>>>> management that are supposed to be inlined at all times.) I had to hack 
>>>>>> kern.pre.mk to rquest -O2 at all times just to get a functioning kernel.
>>>>>
>>>>> Well, -O0, -O1, -O2 and -O should all produce working kernels, and any 
>>>>> cases where they don't are compiler bugs (or kernel bugs if they rely on 
>>>>> UB) that should be fixed, not worked around by tweaking the compiler 
>>>>> flags in a fragile way until you get the behaviour relied on. Correctness 
>>>>> and performance are very different issues here.
>>>>
>>>> As an example:
>>>>
>>>>   static __inline void
>>>>   mtsrin(vm_offset_t va, register_t value)
>>>>   {
>>>>   
>>>>           __asm __volatile ("mtsrin %0,%1; isync" :: "r"(value), "r"(va));
>>>>   }
>>>>
>>>> This code is used in the mmu when bootstrapping the cpu like so:
>>>>
>>>>   for (i = 0; i < 16; i++)
>>>>           mtsrin(i << ADDR_SR_SHFT, kernel_pmap->pm_sr[i]);
>>>>   powerpc_sync();
>>>>   
>>>>   sdr = (u_int)moea_pteg_table | (moea_pteg_mask >> 10);
>>>>   __asm __volatile("mtsdr1 %0" :: "r"(sdr));
>>>>   isync();
>>>>   
>>>>   tlbia();
>>>>
>>>> During the loop there, we are in the middle of programming the MMU segment 
>>>> registers in real mode, and is supposed to be doing all work out of 
>>>> registers. (and powerpc_sync() and isync() should be expanded to their 
>>>> single assembly instruction, not a function call. The whole point of 
>>>> calling those is that we are in an inconsistent hardware state and need to 
>>>> sync up before continuing execution)
>>>>
>>>> If there isn't a way to force inlining, we will have to change to using 
>>>> preprocessor macros in cpufunc.h.
>>>
>>> There is, it's called `__attribute__((always_inline))` and supported by 
>>> both GCC and Clang. But at -O0 you'll still have register allocation to 
>>> deal with, so really that code is just fundamentally broken and should not 
>>> be written in C. There is no way for you to guarantee stack spills are not 
>>> used, it's way out of scope for C.
>>
>> Is there a way to have always_inline and unused at the same time? I tried 
>> using always_inline and it caused warnings in things that used *parts* of 
>> cpufunc.h.
>
> Both `__attribute__((always_inline)) __attribute__((unused))` and 
> `__attribute__((always_inline, unused))` work, but really you should use 
> `__always_inline __unused` in FreeBSD (which will expand to the former).

But also you really should not get warnings for unused static functions in 
included headers, only ones defined in the C source file itself. We'd have 
countless warnings in the kernel across all architectures otherwise.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79916/new/

https://reviews.llvm.org/D79916

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to