On Thu, Jun 07, 2018 at 09:56:18AM +0200, Ard Biesheuvel wrote: > On 7 June 2018 at 09:48, Christoffer Dall <christoffer.d...@arm.com> wrote: > > [+Will] > > > > On Tue, Jun 05, 2018 at 03:07:14PM +0200, Laszlo Ersek wrote: > >> On 06/05/18 13:30, Richard Biener wrote: > >> > On Mon, Jun 4, 2018 at 8:11 PM Laszlo Ersek <ler...@redhat.com> wrote: > >> >> > >> >> Hi! > >> >> > >> >> Apologies if this isn't the right place for asking. For the problem > >> >> statement, I'll simply steal Ard's writeup [1]: > >> >> > >> >>> KVM on ARM refuses to decode load/store instructions used to perform > >> >>> I/O to emulated devices, and instead relies on the exception syndrome > >> >>> information to describe the operand register, access size, etc. This > >> >>> is only possible for instructions that have a single input/output > >> >>> register (as opposed to ones that increment the offset register, or > >> >>> load/store pair instructions, etc). Otherwise, QEMU crashes with the > >> >>> following error > >> >>> > >> >>> error: kvm run failed Function not implemented > >> >>> [...] > >> >>> QEMU: Terminated > >> >>> > >> >>> and KVM produces a warning such as the following in the kernel log > >> >>> > >> >>> kvm [17646]: load/store instruction decoding not implemented > >> > > >> > This looks like a kvm/qemu issue to me. Whatever that exception syndrome > >> > thing is, it surely has a pointer to the offending instruction it could > >> > decode? > >> > >> I believe so -- the instruction decoding is theoretically possible (to > >> my understanding); KVM currently doesn't do it because it's super > >> complex (again, to my understanding). > >> > > The instruction decoding was considered and discarded because the > > understanding at the time was that any instruction that didn't generate > > valid decoding hints in the syndrome register (such as multiple output > > register operations) would not be safe to use on device memory, and > > therefore shouldn't be used neither on real hardware nor in VM guests. > > > > How is it unsafe for a load or store with writeback to be used on > device memory? That does not make sense to me.
I don't understand that either, which is why I cc'ed Will who argued for this last IIRC. > In any case, I suppose that *decoding* the instruction is not the > problem, it is loading the opcode in the first place, given that it is > not recorded in any system registers when the exception is taken. ELR > will contain a virtual guest address [which could be in userland], and > the host should translate that (which involves guest page tables that > may be modified by other VCPUs concurrently) and map it to be able to > get to the actual bits. > > > If this still holds, it's not a question of an architecture bug or a > > missing feature in KVM, but a question of a guest doing something wrong. > > > > Do you have a mutt macro for that response? :-) > No I don't. And I wouldn't mind adding instruction decoding to KVM. I already wrote it once, but the maintainer didn't want to merge the code unless I unified all instruction decoding in the arm kernel, which I was unable to do. Sarkasm and instruction decoding stories aside, we've had a number of reports of this kind of error in the past where the problem was simply people using the wrong the DT with their guest kernel. I don't think we've seen an actual case of a real guest that was using the 'wrong' instruction to actually do I/O. > > I added Will here, because he provided the rationale for why the premise > > held last we discussed this, and he may be able to update us on that. > > > > As Ard pointed out, KVM could probably provide a more helpful error > > message or be a bit more clever in trying to find out what happened. > > Some times, if guests are misconfigured, and for example think that > > normal memory is placed in the guest physical memory map where the > > hypervisor placed an I/O device, guests will issue non-decodable > > instructions against I/O memory and KVM simply provides this misleading > > error message. > > > > As I pointed out in the BZ entry, this smells like another exception > that gets reported at EL2, and lacks the syndrome information for > unrelated reasons (i.e., nothing to do with the opcode) I don't understand how that would result in the error message reported. kvm_handle_guest_abort() has: /* Check the stage-2 fault is trans. fault or write fault */ if (fault_status != FSC_FAULT && fault_status != FSC_PERM && fault_status != FSC_ACCESS) { kvm_err("Unsupported FSC: EC=%#x xFSC=%#lx ESR_EL2=%#lx\n", kvm_vcpu_trap_get_class(vcpu), (unsigned long)kvm_vcpu_trap_get_fault(vcpu), (unsigned long)kvm_vcpu_get_hsr(vcpu)); return -EFAULT; } Are you saying that we can get a stage 2 data abort, which reports the DFSC as translation, permission, or access fault, while it's something different completely? In any case, I was just trying to help here, by providing some additional background and context. I agree that KVM's error messaging could be improved, and if we should implement instruction decoding for certain instructions that can validly be used for I/O access, then we can add that to our to-do list. -Christoffer