Re: [Qemu-devel] [PATCH 2/3] s390: implement pci instructions

Frank Blaschka Wed, 12 Nov 2014 00:57:22 -0800

On Tue, Nov 11, 2014 at 04:24:24PM +0100, Alexander Graf wrote:
> 
> 
> On 11.11.14 15:08, Frank Blaschka wrote:
> > On Tue, Nov 11, 2014 at 01:51:25PM +0100, Alexander Graf wrote:
> >>
> >>
> >>
> >>> Am 11.11.2014 um 13:39 schrieb Frank Blaschka 
> >>> <blasc...@linux.vnet.ibm.com>:
> >>>
> >>>> On Tue, Nov 11, 2014 at 01:16:04PM +0100, Alexander Graf wrote:
> >>>>
> >>>>
> >>>>> On 11.11.14 13:10, Frank Blaschka wrote:
> >>>>>> On Mon, Nov 10, 2014 at 04:56:21PM +0100, Alexander Graf wrote:
> >>>>>>
> >>>>>>
> >>>>>>> On 10.11.14 15:20, Frank Blaschka wrote:
> >>>>>>> From: Frank Blaschka <frank.blasc...@de.ibm.com>
> >>>>>>>
> >>>>>>> This patch implements the s390 pci instructions in qemu. It allows
> >>>>>>> to access and drive pci devices attached to the s390 pci bus.
> >>>>>>> Because of platform constrains devices using IO BARs are not
> >>>>>>> supported. Also a device has to support MSI/MSI-X to run on s390.
> >>>>>>>
> >>>>>>> Signed-off-by: Frank Blaschka <frank.blasc...@de.ibm.com>
> >>>>>>> ---
> >>>>>>> target-s390x/Makefile.objs |   2 +-
> >>>>>>> target-s390x/kvm.c         |  52 ++++
> >>>>>>> target-s390x/pci_ic.c      | 753 
> >>>>>>> +++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>> target-s390x/pci_ic.h      | 335 ++++++++++++++++++++
> >>>>>>> 4 files changed, 1141 insertions(+), 1 deletion(-)
> >>>>>>> create mode 100644 target-s390x/pci_ic.c
> >>>>>>> create mode 100644 target-s390x/pci_ic.h
> >>>>>>>
> >>>>
> >>>> [...]
> >>>>
> >>>>>>> +int kvm_pcilg_service_call(S390CPU *cpu, struct kvm_run *run)
> >>>>>>> +{
> >>>>>>> +    CPUS390XState *env = &cpu->env;
> >>>>>>> +    S390PCIBusDevice *pbdev;
> >>>>>>> +    uint8_t r1 = (run->s390_sieic.ipb & 0x00f00000) >> 20;
> >>>>>>> +    uint8_t r2 = (run->s390_sieic.ipb & 0x000f0000) >> 16;
> >>>>>>> +    PciLgStg *rp;
> >>>>>>> +    uint64_t offset;
> >>>>>>> +    uint64_t data;
> >>>>>>> +    uint8_t len;
> >>>>>>> +
> >>>>>>> +    cpu_synchronize_state(CPU(cpu));
> >>>>>>> +
> >>>>>>> +    if (env->psw.mask & PSW_MASK_PSTATE) {
> >>>>>>> +        program_interrupt(env, PGM_PRIVILEGED, 4);
> >>>>>>> +        return 0;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    if (r2 & 0x1) {
> >>>>>>> +        program_interrupt(env, PGM_SPECIFICATION, 4);
> >>>>>>> +        return 0;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    rp = (PciLgStg *)&env->regs[r2];
> >>>>>>> +    offset = env->regs[r2 + 1];
> >>>>>>> +
> >>>>>>> +    pbdev = s390_pci_find_dev_by_fh(rp->fh);
> >>>>>>> +    if (!pbdev) {
> >>>>>>> +        DPRINTF("pcilg no pci dev\n");
> >>>>>>> +        setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE);
> >>>>>>> +        return 0;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    len = rp->len & 0xF;
> >>>>>>> +    if (rp->pcias < 6) {
> >>>>>>> +        if ((8 - (offset & 0x7)) < len) {
> >>>>>>> +            program_interrupt(env, PGM_OPERAND, 4);
> >>>>>>> +            return 0;
> >>>>>>> +        }
> >>>>>>> +        MemoryRegion *mr = pbdev->pdev->io_regions[rp->pcias].memory;
> >>>>>>> +        io_mem_read(mr, offset, &data, len);
> >>>>>>> +    } else if (rp->pcias == 15) {
> >>>>>>> +        if ((4 - (offset & 0x3)) < len) {
> >>>>>>> +            program_interrupt(env, PGM_OPERAND, 4);
> >>>>>>> +            return 0;
> >>>>>>> +        }
> >>>>>>> +        data =  pci_host_config_read_common(
> >>>>>>> +                   pbdev->pdev, offset, 
> >>>>>>> pci_config_size(pbdev->pdev), len);
> >>>>>>> +
> >>>>>>> +        switch (len) {
> >>>>>>> +        case 1:
> >>>>>>> +            break;
> >>>>>>> +        case 2:
> >>>>>>> +            data = cpu_to_le16(data);
> >>>>>>> +            break;
> >>>>>>> +        case 4:
> >>>>>>> +            data = cpu_to_le32(data);
> >>>>>>> +            break;
> >>>>>>> +        case 8:
> >>>>>>> +            data = cpu_to_le64(data);
> >>>>>>> +            break;
> >>>>>>
> >>>>>> Why? Also, this is wrong. cpu_to_le64 convert between host endianness
> >>>>>> and LE. So if you're running this on an LE host, you won't swap the
> >>>>>> value and get a broken result.
> >>>>>>
> >>>>>> If you know that the value is always swapped, use bswapxx().
> >>>>>>
> >>>>>
> >>>>> Actually the code is right and required for a big endian host :-)
> >>>>> pcilg/pcistg provide access to the PCI config space which is defined
> >>>>> as PCI byte order (little endian). Since pci_host_config_read_common 
> >>>>> does
> >>>>> already a le to cpu conversion we have to convert back to PCI byte 
> >>>>> order.
> >>>>> Doing an unconditional swap would be a bug on a little endian host.
> >>>>
> >>>> Why would it be a bug? The value you end up writing is contents of a
> >>>> register and thus doesn't have endianness. So if QEMU was an LE process,
> >>>
> >>> No, the s390 guest executing pcilg instruction expects to receive config 
> >>> space data
> >>> in PCI byte order.
> >>>
> >>>> the value of data would be identical as on a BE QEMU before your swab.
> >>>> After the swab, it would be bswap'ed on BE, but not LE. So LE hosts 
> >>>> break.
> >>>>
> >>>
> >>> Again on BE endian host we do the swap because of 
> >>> pci_host_config_read_common does
> >>> read the value and do a byte swap for that value, but we need PCI byte 
> >>> order not BE here.
> >>>
> >>> On LE host pci_host_config_read_common does not do a byte swap so we do 
> >>> not have to
> >>> convert back to PCI byte order.
> >>
> >> We maintain the PCI config space always in LE byte order in memory, that's 
> >> why there is a bwap in its read function. The return result of the read 
> >> function however is always the same, regardless of LE or BE host. If I do 
> >> a read of size 4, I will always get 0x1, not 0x01000000 returned.
> >>
> >> So now you need to convert that 0x1 into a 0x01000000 manually here 
> >> because some architect thought that registers have endianness (which they 
> >> don't). But you need to do it always, even on an LE host, because the pci 
> >> config space return value is identical on LE and BE.
> >>
> > so you tell me pci_host_config_read_common does not end up in 
> > pci_default_read_config?
> > 
> > uint32_t pci_default_read_config(PCIDevice *d,
> >                                  uint32_t address, int len)
> > {
> >     uint32_t val = 0;
> > 
> >     memcpy(&val, d->config + address, len);
> >     return le32_to_cpu(val);
> > }
> > 
> > What did I miss?
> 
> That's exactly where you end up in - and it's there to convert from the
> PCI config space backing storage to a native number.
> 
> Imagine you write 0x12345678 at offset 0. Because PCI config space is
> defined to be LE, in the PCI config space memory this gets stored as
> 
> 78 56 34 12
> 
> The reason we do the internal storage of the config space that way is
> that it's (in some PCI implementations) legal to access with single byte
> granularities. So you could do a pci_config_read(offset = 1) which
> should return 0x56.
> 
> However, that means we completely nullify any effect of host endianness
> in the PCI config layer already. So if you do pci_config_write(offset =
> 0, size = 4, value = 0x12345678), the contents of d->config will always
> be identical, regardless of host endianness. The same holds true for
> pci_config_read(offset = 0, size = 4). It will always return 0x12345678.
>

I understood this from the beginning and I completely agree to this.
 
> In your code, you swab that value again. I assume there's a reason
> you're swapping it and that it's the way the architecture expects it

Yes, s390 pcilg architecture states:
Data in the PCI configuration space are treated
as being in little-endian byte ordering

> (mind to point me to the respective spec so I can verify?). But if the
> architecture expects it, then it expects it regardless of host
> endianness. The contents of regs[r1] should always be 0x78563412, no
> matter whether we're in an LE or a BE environment.
> 
> Does that make sense now?
> 
Absolutely lets make an example for qemu running on BE and LE

byte order    config space backing   pci_default_read_config   pcilg (with 
cpu_to_le)
BE            0x78563412             0x12345678                0x78563412
LE            0x78563412             0x78563412                0x78563412

So what is the problem with my code? Adding unconditional byte swap instead of
cpu_to_le in pcilg would break architecture for pcilg if qemu is running on LE
platform.

> 
> Alex
>

Re: [Qemu-devel] [PATCH 2/3] s390: implement pci instructions

Reply via email to