On 10.07.20 10:32, David Hildenbrand wrote: > On 09.07.20 12:37, Cornelia Huck wrote: >> On Wed, 8 Jul 2020 20:51:32 +0200 >> David Hildenbrand <da...@redhat.com> wrote: >> >>> Let's implement the "storage configuration" part of diag260. This diag >>> is found under z/VM, to indicate usable chunks of memory tot he guest OS. >>> As I don't have access to documentation, I have no clue what the actual >>> error cases are, and which other stuff we could eventually query using this >>> interface. Somebody with access to documentation should fix this. This >>> implementation seems to work with Linux guests just fine. >>> >>> The Linux kernel supports diag260 to query the available memory since >>> v4.20. Older kernels / kvm-unit-tests will later fail to run in such a VM >>> (with maxmem being defined and bigger than the memory size, e.g., "-m >>> 2G,maxmem=4G"), just as if support for SCLP storage information is not >>> implemented. They will fail to detect the actual initial memory size. >>> >>> This interface allows us to expose the maximum ramsize via sclp >>> and the initial ramsize via diag260 - without having to mess with the >>> memory increment size and having to align the initial memory size to it. >>> >>> This is a preparation for memory device support. We'll unlock the >>> implementation with a new QEMU machine that supports memory devices. >>> >>> Signed-off-by: David Hildenbrand <da...@redhat.com> >>> --- >>> target/s390x/diag.c | 57 ++++++++++++++++++++++++++++++++++++++ >>> target/s390x/internal.h | 2 ++ >>> target/s390x/kvm.c | 11 ++++++++ >>> target/s390x/misc_helper.c | 6 ++++ >>> target/s390x/translate.c | 4 +++ >>> 5 files changed, 80 insertions(+) >>> >>> diff --git a/target/s390x/diag.c b/target/s390x/diag.c >>> index 1a48429564..c3b1e24b2c 100644 >>> --- a/target/s390x/diag.c >>> +++ b/target/s390x/diag.c >>> @@ -23,6 +23,63 @@ >>> #include "hw/s390x/pv.h" >>> #include "kvm_s390x.h" >>> >>> +void handle_diag_260(CPUS390XState *env, uint64_t r1, uint64_t r3, >>> uintptr_t ra) >>> +{ >>> + MachineState *ms = MACHINE(qdev_get_machine()); >>> + const ram_addr_t initial_ram_size = ms->ram_size; >>> + const uint64_t subcode = env->regs[r3]; >>> + S390CPU *cpu = env_archcpu(env); >>> + ram_addr_t addr, length; >>> + uint64_t tmp; >>> + >>> + /* TODO: Unlock with new QEMU machine. */ >>> + if (false) { >>> + s390_program_interrupt(env, PGM_OPERATION, ra); >>> + return; >>> + } >>> + >>> + /* >>> + * There also seems to be subcode "0xc", which stores the size of the >>> + * first chunk and the total size to r1/r2. It's only used by very old >>> + * Linux, so don't implement it. >> >> FWIW, >> https://www-01.ibm.com/servers/resourcelink/svc0302a.nsf/pages/zVMV7R1sc246272/$file/hcpb4_v7r1.pdf >> seems to list the available subcodes. Anything but 0xc and 0x10 is for >> 24/31 bit only, so we can safely ignore them. Not sure what we want to >> do with 0xc: it is supposed to "Return the highest addressable byte of >> virtual storage in the host-primary address space, including named >> saved systems and saved segments", so returning the end of the address >> space should be easy enough, but not very useful. >> >>> + */ >>> + if ((r1 & 1) || subcode != 0x10) { >>> + s390_program_interrupt(env, PGM_SPECIFICATION, ra); >>> + return; >>> + } >>> + addr = env->regs[r1]; >>> + length = env->regs[r1 + 1]; >>> + >>> + /* FIXME: Somebody with documentation should fix this. */ >> >> Doc mentioned above says for specification exception: >> >> "For subcode X'10': >> • Rx is not an even-numbered register. >> • The address contained in Rx is not on a quadword boundary. >> • The length contained in Rx+1 is not a positive multiple of 16." >> >>> + if (!QEMU_IS_ALIGNED(addr, 16) || !QEMU_IS_ALIGNED(length, 16)) { >>> + s390_program_interrupt(env, PGM_SPECIFICATION, ra); >>> + return; >>> + } >>> + >>> + /* FIXME: Somebody with documentation should fix this. */ >>> + if (!length) { >> >> Probably specification exception as well? >> >>> + setcc(cpu, 3); >>> + return; >>> + } >>> + >>> + /* FIXME: Somebody with documentation should fix this. */ >> >> For access exception: >> >> "For subcode X'10', an error occurred trying to store the extent >> information into the guest's output area." >> >>> + if (!address_space_access_valid(&address_space_memory, addr, length, >>> true, >>> + MEMTXATTRS_UNSPECIFIED)) { >>> + s390_program_interrupt(env, PGM_ADDRESSING, ra); >>> + return; >>> + } >>> + >>> + /* Indicate our initial memory ([0 .. ram_size - 1]) */ >>> + tmp = cpu_to_be64(0); >>> + cpu_physical_memory_write(addr, &tmp, sizeof(tmp)); >>> + tmp = cpu_to_be64(initial_ram_size - 1); >>> + cpu_physical_memory_write(addr + sizeof(tmp), &tmp, sizeof(tmp)); >>> + >>> + /* Exactly one entry was stored. */ >>> + env->regs[r3] = 1; >>> + setcc(cpu, 0); >>> +} >>> + >>> int handle_diag_288(CPUS390XState *env, uint64_t r1, uint64_t r3) >>> { >>> uint64_t func = env->regs[r1]; >> >> (...) >> >>> diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c >>> index 58dbc023eb..d7274eb320 100644 >>> --- a/target/s390x/misc_helper.c >>> +++ b/target/s390x/misc_helper.c >>> @@ -116,6 +116,12 @@ void HELPER(diag)(CPUS390XState *env, uint32_t r1, >>> uint32_t r3, uint32_t num) >>> uint64_t r; >>> >>> switch (num) { >>> + case 0x260: >>> + qemu_mutex_lock_iothread(); >>> + handle_diag_260(env, r1, r3, GETPC()); >>> + qemu_mutex_unlock_iothread(); >>> + r = 0; >>> + break; >>> case 0x500: >>> /* KVM hypercall */ >>> qemu_mutex_lock_iothread(); >> >> Looking at the doc referenced above, it seems that we treat every diag >> call as privileged under tcg; but it seems that 0x44 isn't? (Unrelated >> to your patch; maybe I'm misreading.) > > That's also a BUG in kvm then? > > int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) > { > ... > if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE) > return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP); > ... > } >
But OTOH, it does not sound sane if user space can bypass the OS to yield the CPU ... so this might just be a wrong documentation. All DIAGs should be privileged IIRC. -- Thanks, David / dhildenb