On 07/06/18 10:46, Richard Biener wrote:
> On Thu, Jun 7, 2018 at 11:45 AM Ard Biesheuvel
> <ard.biesheu...@linaro.org> wrote:
>>
>> On 7 June 2018 at 11:35, Richard Biener <richard.guent...@gmail.com> wrote:
>>> On Thu, Jun 7, 2018 at 10:45 AM Ard Biesheuvel
>>> <ard.biesheu...@linaro.org> wrote:
>>>>
>>>> On 7 June 2018 at 10:21, Christoffer Dall <christoffer.d...@arm.com> wrote:
>>>>> On Thu, Jun 07, 2018 at 09:56:18AM +0200, Ard Biesheuvel wrote:
>>>>>> On 7 June 2018 at 09:48, Christoffer Dall <christoffer.d...@arm.com> wrote:
>>>>>>> [+Will]
>>>>>>>
>>>>>>> On Tue, Jun 05, 2018 at 03:07:14PM +0200, Laszlo Ersek wrote:
>>>>>>>> On 06/05/18 13:30, Richard Biener wrote:
>>>>>>>>> On Mon, Jun 4, 2018 at 8:11 PM Laszlo Ersek <ler...@redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi!
>>>>>>>>>>
>>>>>>>>>> Apologies if this isn't the right place for asking. For the problem
>>>>>>>>>> statement, I'll simply steal Ard's writeup [1]:
>>>>>>>>>>
>>>>>>>>>>> KVM on ARM refuses to decode load/store instructions used to perform
>>>>>>>>>>> I/O to emulated devices, and instead relies on the exception syndrome
>>>>>>>>>>> information to describe the operand register, access size, etc. This
>>>>>>>>>>> is only possible for instructions that have a single input/output
>>>>>>>>>>> register (as opposed to ones that increment the offset register, or
>>>>>>>>>>> load/store pair instructions, etc). Otherwise, QEMU crashes with the
>>>>>>>>>>> following error
>>>>>>>>>>>
>>>>>>>>>>>   error: kvm run failed Function not implemented
>>>>>>>>>>>   [...]
>>>>>>>>>>>   QEMU: Terminated
>>>>>>>>>>>
>>>>>>>>>>> and KVM produces a warning such as the following in the kernel log
>>>>>>>>>>>
>>>>>>>>>>>   kvm [17646]: load/store instruction decoding not implemented
>>>>>>>>>
>>>>>>>>> This looks like a kvm/qemu issue to me.  Whatever that exception syndrome
>>>>>>>>> thing is, it surely has a pointer to the offending instruction it could
>>>>>>>>> decode?
>>>>>>>>
>>>>>>>> I believe so -- the instruction decoding is theoretically possible (to
>>>>>>>> my understanding); KVM currently doesn't do it because it's super
>>>>>>>> complex (again, to my understanding).
>>>>>>>>
>>>>>>> The instruction decoding was considered and discarded because the
>>>>>>> understanding at the time was that any instruction that didn't generate
>>>>>>> valid decoding hints in the syndrome register (such as multiple output
>>>>>>> register operations) would not be safe to use on device memory, and
>>>>>>> therefore should be used neither on real hardware nor in VM guests.
>>>>>>>
>>>>>>
>>>>>> How is it unsafe for a load or store with writeback to be used on
>>>>>> device memory? That does not make sense to me.
>>>>>
>>>>> I don't understand that either, which is why I cc'ed Will who argued for
>>>>> this last IIRC.
>>>>>
>>>>>> In any case, I suppose that *decoding* the instruction is not the
>>>>>> problem, it is loading the opcode in the first place, given that it is
>>>>>> not recorded in any system registers when the exception is taken. ELR
>>>>>> will contain a virtual guest address [which could be in userland], and
>>>>>> the host should translate that (which involves guest page tables that
>>>>>> may be modified by other VCPUs concurrently) and map it to be able to
>>>>>> get to the actual bits.
>>>>>>
>>>>>>> If this still holds, it's not a question of an architecture bug or a
>>>>>>> missing feature in KVM, but a question of a guest doing something wrong.
>>>>>>>
>>>>>>
>>>>>> Do you have a mutt macro for that response? :-)
>>>>>>
>>>>>
>>>>> No I don't.  And I wouldn't mind adding instruction decoding to KVM.  I
>>>>> already wrote it once, but the maintainer didn't want to merge the code
>>>>> unless I unified all instruction decoding in the arm kernel, which I was
>>>>> unable to do.
>>>>>
>>>>
>>>> Yikes.
>>>>
>>>> So how does your code actually load the opcode?
>>>>
>>>>> Sarcasm and instruction decoding stories aside, we've had a number of
>>>>> reports of this kind of error in the past where the problem was simply
>>>>> people using the wrong DT with their guest kernel.  I don't think
>>>>> we've seen an actual case of a real guest that was using the 'wrong'
>>>>> instruction to actually do I/O.
>>>>>
>>>>
>>>> Currently, LTO builds of EDK2 for 32-bit mach-virt are broken because
>>>> of this. The MMIO accessors are written in C using volatile pointers,
>>>> allowing LTO to merge adjacent accesses or loops performing MMIO,
>>>> resulting in, e.g., instructions with writeback to be emitted.
>>>
>>> I'd like to see a testcase where GCC does merging on volatile accesses.
>>> That would be a GCC bug.  So I suspect the C code isn't quite using volatile
>>> accesses...
>>>
>>
>> The accesses themselves are not being merged. But code such as
>>
>> MmioRead32:
>>   ldr   w0, [x0]
>>   ret
>>
>> SomeOtherFunction:
>>   ...
>> 0:mov   x20, x0
>>   bl    MmioRead32
>>   ...
>>   add   x20, x20, #4
>>   ...
>>   b.xx  0b
>>
>>
>> (where the two are based on C code but from different compilation
>> units) may under LTO be turned into code involving a post increment on
>> the memory address of the ldr, resulting in an instruction that has
>> two outputs, triggering the KVM error.
> 
> Ah, I see!  Of course this doesn't have anything to do with LTO per se,
> you are just lucky it doesn't happen without ;)
> 
> Richard.
> 

I didn't think we supported pre/post increments on volatiles.

R.
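
For illustration, here is a minimal sketch of the pattern discussed above:
an MMIO read helper plus a caller that advances the address by 4 after
each read. It is not EDK2's actual code. The names MmioRead32 and
SumRegisters and the inline-asm variant are assumptions made for the
example. On AArch64 the helper spells out the load with a plain base
register, so the compiler has no addressing mode to choose; elsewhere it
falls back to an ordinary volatile access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one 32-bit register. Hypothetical stand-in for an EDK2-style accessor. */
static inline uint32_t MmioRead32(uintptr_t addr)
{
    assert((addr & 3u) == 0);  /* device accesses are expected to be naturally aligned */
#if defined(__aarch64__)
    uint32_t value;
    /* Base-register-only addressing: one destination register, no writeback,
     * so the caller's "addr += 4" cannot be folded into this load. */
    __asm__ volatile("ldr %w0, [%1]" : "=r"(value) : "r"(addr) : "memory");
    return value;
#else
    /* Portable fallback: the access is volatile, but the addressing mode is
     * still the compiler's choice. */
    return *(volatile uint32_t *)addr;
#endif
}

/* Mirrors the loop in SomeOtherFunction: read, then advance by 4 bytes. */
static uint32_t SumRegisters(uintptr_t base, size_t count)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += MmioRead32(base);
        base += 4;
    }
    return sum;
}
```

With the asm variant the load form is fixed by the template, whether or
not the helper is inlined into the loop. With the volatile fallback the
number and order of accesses are preserved, but nothing in the C source
constrains whether the increment is emitted as a separate add or as
writeback on the load.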
