On Thu, 11 May 2023 at 08:08, Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Thu, May 11, 2023 at 12:04 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > On Wed, May 10, 2023 at 2:17 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> > >
> > > On Tue, May 9, 2023 at 10:58 AM Ard Biesheuvel <a...@kernel.org> wrote:
> > > >
> > > > The small and medium PIC code models generate profiling calls that
> > > > always load the address of __fentry__() via the GOT, even if
> > > > -mdirect-extern-access is in effect.
> > > >
> > > > This deviates from the behavior with respect to other external
> > > > references, and results in a longer opcode that relies on linker
> > > > relaxation to eliminate the GOT load. In this particular case, the
> > > > transformation replaces an indirect 'CALL *__fentry__@GOTPCREL(%rip)'
> > > > with either 'CALL __fentry__; NOP' or 'NOP; CALL __fentry__', where the
> > > > NOP is a 1 byte NOP that preserves the 6 byte length of the sequence.
> > > >
> > > > This is problematic for the Linux kernel, which generally relies on
> > > > -mdirect-extern-access and hidden visibility to eliminate GOT based
> > > > symbol references in code generated with -fpie/-fpic, without having to
> > > > depend on linker relaxation.
> > > >
> > > > The Linux kernel relies on code patching to replace these opcodes with
> > > > NOPs at runtime, and this is complicated code that we'd prefer not to
> > > > complicate even more by adding support for patching both 5 and 6 byte
> > > > sequences as well as parsing the instruction stream to decide which
> > > > variant of CALL+NOP we are dealing with.
> > > >
> > > > So let's honour -mdirect-extern-access, and only load the address of
> > > > __fentry__ via the GOT if direct references to external symbols are not
> > > > permitted.
> > > >
> > > > Note that the GOT reference in question is in fact a data reference: we
> > > > explicitly load the address of __fentry__ from the GOT, which amounts to
> > > > eager binding, rather than emitting a PLT call that could bind eagerly,
> > > > lazily or directly at link time.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * config/i386/i386.cc (x86_function_profiler): Take
> > > > ix86_direct_extern_access into account when generating calls
> > > > to __fentry__()
> > >
> > > HJ, is the patch OK with you?
> >
> > LGTM.
>
> OK then.
>
Thanks all. Is anything expected of me at this point?