On 06/05/18 08:04, Ard Biesheuvel wrote:
> On 4 June 2018 at 20:10, Laszlo Ersek <ler...@redhat.com> wrote:
>> Hi!
>>
>> Apologies if this isn't the right place for asking. For the problem
>> statement, I'll simply steal Ard's writeup [1]:
>>
>>> KVM on ARM refuses to decode load/store instructions used to perform
>>> I/O to emulated devices, and instead relies on the exception syndrome
>>> information to describe the operand register, access size, etc. This
>>> is only possible for instructions that have a single input/output
>>> register (as opposed to ones that increment the offset register, or
>>> load/store pair instructions, etc). Otherwise, QEMU crashes with the
>>> following error
>>>
>>>   error: kvm run failed Function not implemented
>>>   [...]
>>>   QEMU: Terminated
>>>
>>> and KVM produces a warning such as the following in the kernel log
>>>
>>>   kvm [17646]: load/store instruction decoding not implemented
>>>
>>> GCC with LTO enabled will emit such instructions for Mmio[Read|Write]
>>> invocations performed in a loop, so we need to disable LTO [...]
>>
>> We have a Red Hat Bugzilla about the (very likely) same issue [2].
>>
>> Earlier, we had to work around the same on AArch64 too [3].
>>
>> Would it be possible to introduce a dedicated -mXXX option, for ARM and
>> AArch64, that disabled the generation of such multi-operand
>> instructions?
>>
>> I note there are several similar options (for other architectures):
>> * -mno-multiple (ppc)
>> * -mno-fused-madd (ia64)
>> * -mno-mmx and a lot of friends (x86)
>>
>> Obviously, if the feature request is deemed justified, we should provide
>> the exact family of instructions to disable. I'll leave that to others
>> on the CC list with more ARM/AArch64 expertise; I just wanted to get
>> this thread started. (Sorry if the option is already being requested
>> elsewhere; I admit I didn't search the GCC bugzilla.)
>>
> 
> I am not convinced that tweaking GCC code generation is the correct
> approach here, to be honest.
> 
> The issue only occurs when load/store instructions trap into KVM,
> which (correct me if I am wrong) mostly only occurs when emulating
> MMIO. The case I have been looking into (UEFI) uses MMIO accessors
> correctly, but due to the way they are implemented (in C), LTO code
> generation may result in load/store instructions with multiple outputs
> being used.
> 
> So first of all, I would like to understand the magnitude of the
> problem. If all cases we can identify involve performing MMIO using C
> memory references, I think we should fix the code rather than the
> compiler.

To my understanding, Daniel has the opposite preference; namely, the
above approach doesn't scale to a large and moving target like the
kernel. Because the instructions in question work on the bare metal
(IOW, the guest code is not "broken" in any sense of the word), people
will continue writing kernel MMIO code in C that "lures" gcc into
generating such ARM/AArch64 assembly that contains those instructions.
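
For illustration only, here is a minimal sketch of what "fixing the code"
tends to mean at a single site. It is not taken from edk2 or the kernel;
the helper names are made up. A plain volatile dereference leaves the
choice of instruction to the compiler, so inside a loop it may come out as
a writeback or load/store pair form. Pinning the access in inline assembly
forces one transfer register and plain base-register addressing, which, as
far as I understand, is the form the exception syndrome can describe. The
non-AArch64 branch is only there so the sketch builds elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessors: names are illustrative, not an existing API. */
static inline uint32_t mmio_read32(const volatile void *addr)
{
    uint32_t val;
#if defined(__aarch64__)
    /* One destination register, base-register addressing, no writeback. */
    __asm__ volatile("ldr %w0, [%1]" : "=r"(val) : "r"(addr) : "memory");
#else
    /* Other targets: ordinary volatile load, instruction chosen by the compiler. */
    val = *(const volatile uint32_t *)addr;
#endif
    return val;
}

static inline void mmio_write32(volatile void *addr, uint32_t val)
{
#if defined(__aarch64__)
    /* One source register, base-register addressing, no writeback. */
    __asm__ volatile("str %w0, [%1]" : : "r"(val), "r"(addr) : "memory");
#else
    *(volatile uint32_t *)addr = val;
#endif
}

/* Exercises the accessors in loops, using ordinary memory as a stand-in
 * for a device register window. */
static void mmio_selftest(void)
{
    uint32_t regs[4] = {0};
    uint32_t sum = 0;

    for (int i = 0; i < 4; i++)
        mmio_write32(&regs[i], (uint32_t)i + 1);
    for (int i = 0; i < 4; i++)
        sum += mmio_read32(&regs[i]);
    assert(sum == 10);
}
```

This only helps at sites that actually go through such accessors, which is
exactly the scaling concern: every other C memory reference to a device
remains at the compiler's discretion.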

The RHBZ I linked earlier remains elusive; the issue is not easy to
trigger, and when it does trigger, one has to investigate the symptoms
(the guest code at the trap address) every time, trace it back to
C-language source code, and either tweak that C code, or else tweak the
compiler flags specifically for that code / module. AIUI Daniel prefers
to work around the KVM issue without having to analyze every guest site,
as they pop up over time. The expression "all cases we can identify" is
the core of the problem; it's not a well-defined set.

Your edk2 ArmVirtQemu patch adds a heavy-weight flag (-fno-lto) to a
pinpoint location; another possibility (one that might scale better for
humans) is a new, lighter-weight flag, such as "-mno-multiple", that is
applied universally to a codebase.

Thanks!
Laszlo
