On 4 June 2018 at 20:10, Laszlo Ersek <ler...@redhat.com> wrote:
> Hi!
>
> Apologies if this isn't the right place for asking. For the problem
> statement, I'll simply steal Ard's writeup [1]:
>
>> KVM on ARM refuses to decode load/store instructions used to perform
>> I/O to emulated devices, and instead relies on the exception syndrome
>> information to describe the operand register, access size, etc. This
>> is only possible for instructions that have a single input/output
>> register (as opposed to ones that increment the offset register, or
>> load/store pair instructions, etc). Otherwise, QEMU crashes with the
>> following error
>>
>> error: kvm run failed Function not implemented
>> [...]
>> QEMU: Terminated
>>
>> and KVM produces a warning such as the following in the kernel log
>>
>> kvm [17646]: load/store instruction decoding not implemented
>>
>> GCC with LTO enabled will emit such instructions for Mmio[Read|Write]
>> invocations performed in a loop, so we need to disable LTO [...]
>
> We have a Red Hat Bugzilla about the (very likely) same issue [2].
>
> Earlier, we had to work around the same on AArch64 too [3].
>
> Would it be possible to introduce a dedicated -mXXX option, for ARM and
> AArch64, that disabled the generation of such multi-operand
> instructions?
>
> I note there are several similar instructions (for other architectures):
> * -mno-multiple (ppc)
> * -mno-fused-madd (ia64)
> * -mno-mmx and a lot of friends (x86)
>
> Obviously, if the feature request is deemed justified, we should provide
> the exact family of instructions to disable. I'll leave that to others
> on the CC list with more ARM/AArch64 expertise; I just wanted to get
> this thread started. (Sorry if the option is already being requested
> elsewhere; I admit I didn't search the GCC bugzilla.)
>

I am not convinced that tweaking GCC code generation is the correct
approach here, to be honest. The issue only occurs when load/store
instructions trap into KVM, which (correct me if I am wrong) mostly
only happens when emulating MMIO.

The case I have been looking into (UEFI) uses MMIO accessors correctly,
but because those accessors are implemented in C, LTO code generation
may end up using load/store instructions with multiple outputs.

So first of all, I would like to understand the magnitude of the
problem. If all cases we can identify involve performing MMIO using C
memory references, I think we should fix the code rather than the
compiler.
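
As a rough illustration of what fixing the code could look like, the
sketch below wraps each 32-bit access in inline assembly, so that the
compiler (with or without LTO) cannot merge accesses or pick a
writeback or load/store pair form. The names mmio_read32 and
mmio_write32 are hypothetical and not taken from any existing code
base, and the non-ARM branch is only a plain volatile fallback so the
example builds on other hosts.

```c
#include <stdint.h>

/*
 * Hypothetical MMIO accessors, for illustration only.
 * On ARM/AArch64 each access is a single-register LDR/STR with plain
 * base-register addressing and no writeback, the kind of instruction
 * that the exception syndrome is expected to describe fully.
 */
static inline uint32_t mmio_read32(const volatile void *addr)
{
    uint32_t val;
#if defined(__aarch64__)
    /* %w0 selects the 32-bit view of the destination register. */
    __asm__ volatile("ldr %w0, [%1]" : "=r"(val) : "r"(addr) : "memory");
#elif defined(__arm__)
    __asm__ volatile("ldr %0, [%1]" : "=r"(val) : "r"(addr) : "memory");
#else
    /* Assumption: non-ARM hosts only need this to compile and run. */
    val = *(const volatile uint32_t *)addr;
#endif
    return val;
}

static inline void mmio_write32(volatile void *addr, uint32_t val)
{
#if defined(__aarch64__)
    __asm__ volatile("str %w0, [%1]" : : "r"(val), "r"(addr) : "memory");
#elif defined(__arm__)
    __asm__ volatile("str %0, [%1]" : : "r"(val), "r"(addr) : "memory");
#else
    *(volatile uint32_t *)addr = val;
#endif
}
```

Because the instruction is fixed inside the asm statement, a loop over
such accessors still issues one single-register access per iteration,
whatever the optimizer does with the surrounding code.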