Re: [Qemu-devel] virtio scsi host draft specification, v3

2011-07-01 Thread Hannes Reinecke

On 07/01/2011 08:41 AM, Paolo Bonzini wrote:

On 06/29/2011 11:39 AM, Stefan Hajnoczi wrote:

> > Of course, when doing so we would be lose the ability to
freely remap
> > LUNs. But then remapping LUNs doesn't gain you much imho.
> > Plus you could always use qemu block backend here if you want
> > to hide the details.
>
> And you could always use the QEMU block backend with
> scsi-generic if you want to remap LUNs, instead of true

>> > passthrough via the kernel target.


IIUC the in-kernel target always does remapping. It passes through
individual LUNs rather than entire targets and you pick LU Numbers to
map to the backing storage (which may or may not be a SCSI
pass-through device). Nicholas Bellinger can confirm whether this is
correct.


But then I don't understand. If you pick LU numbers both with the
in-kernel target and with QEMU, you do not need to use e.g. WWPNs
with fiber channel, because we are not passing through the details
of the transport protocol (one day we might have virtio-fc, but more
likely not). So the LUNs you use might as well be represented by
hierarchical LUNs.



Actually, the kernel does _not_ do a LUN remapping. It just so 
happens that most storage arrays will present the LUN starting with 
0, so normally you wouldn't notice.


However, some arrays have an array-wide LUN range, so you start 
seeing LUNs at odd places:


[3:0:5:0]diskLSI  INF-01-000750  /dev/sdw
[3:0:5:7]diskLSI  Universal Xport  0750  /dev/sdx


Using NPIV with KVM would be done by mapping the same virtual N_Port
ID in the host(s) to the same LU number in the guest. You might
already do this now with virtio-blk, in fact.


The point here is not the mapping. The point is rescanning.

You can map existing NPIV devices already. But you _cannot_ rescan
the host/device whatever _from the guest_ to detect if new devices
are present.
That is the problem I'm trying to describe here.

To be more explicit:
Currently you have to map existing devices directly as individual 
block or scsi devices to the guest.
And rescan within the guest can only be sent to that device, so the 
only information you will get able to gather is if the device itself 
is still present.
You are unable to detect if there are other devices attached to your 
guest which you should connect to.


So we have to have an enclosing instance (ie the equivalent of a 
SCSI target), which is capable of telling us exactly this.



Put in another way: the virtio-scsi device is itself a SCSI target,
so yes, there is a single target port identifier in virtio-scsi. But
this SCSI target just passes requests down to multiple real targets,
and so will let you do ALUA and all that.


Argl. No way. The virtio-scsi device has to map to a single LUN.

I thought I mentioned this already, but I'd better clarify this again:

The SCSI spec itself only deals with LUNs, so anything you'll read 
in there obviously will only handle the interaction between the 
initiator (read: host) and the LUN itself. However, the actual 
command is send via an intermediat target, hence you'll always see 
the reference to the ITL (initiator-target-lun) nexus.
The SCSI spec details discovery of the individual LUNs presented by 
a given target, it does _NOT_ detail the discovery of the targets 
themselves.
That is being delegated to the underlying transport, in most cases 
SAS or FibreChannel.
For the same reason the SCSI spec can afford to disdain any 
reference to path failure, device hot-plugging etc; all of these 
things are being delegated to the transport.


In our context the virtio-scsi device should map to the LUN, and the 
virtio-scsi _host_ backend should map to the target.

The virtio-scsi _guest_ driver will then map to the initiator.

So we should be able to attach more than one device to the backend,
which then will be presented to the initiator.

In the case of NPIV it would make sense to map the virtual SCSI host 
to the backend, so that all devices presented to the virtual SCSI 
host will be presented to the backend, too.
However, when doing so these devices will normally be referenced by 
their original LUN, as these will be presented to the guest via eg 
'REPORT LUNS'.


The above thread now tries to figure out if we should remap those 
LUN numbers or just expose them as they are.
If we decide on remapping, we have to emulate _all_ commands 
referring explicitely to those LUN numbers (persistent reservations, 
anyone?). If we don't, we would expose some hardware detail to the 
guest, but would save us _a lot_ of processing.


I'm all for the latter.

Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)



Re: [Qemu-devel] [PATCH 2/3] block: drive_init(): Simplify interface type setting

2011-07-01 Thread Markus Armbruster
Reviewed-by: Markus Armbruster 



Re: [Qemu-devel] [PATCH 3/3] block: drive_init(): Improve CHS setting error message

2011-07-01 Thread Markus Armbruster
Luiz Capitulino  writes:

> The current media doesn't clearly say the error cause.
>
> Signed-off-by: Luiz Capitulino 
> ---
>  blockdev.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/blockdev.c b/blockdev.c
> index 0a90ae8..2dbdd1b 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -311,7 +311,7 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
>   media = MEDIA_DISK;
>   } else if (!strcmp(buf, "cdrom")) {
>  if (cyls || secs || heads) {
> -error_report("'%s' invalid physical CHS format", buf);
> +error_report("CHS can't be set for CDROM media '%s'", buf);
>   return NULL;
>  }
>   media = MEDIA_CDROM;

It's an improvement.  I'd like "CHS can't be set with media=%s" even
better, because it's closer to the actual option string.  Matter of
taste.



Re: [Qemu-devel] [PATCH 1/3] block: drive_init(): Fix indentation

2011-07-01 Thread Markus Armbruster
Luiz Capitulino  writes:

> Some constructions in that function have broken indentation. Fix them.
>
> Signed-off-by: Luiz Capitulino 
> ---
>  blockdev.c |   52 ++--
>  1 files changed, 26 insertions(+), 26 deletions(-)
>
> diff --git a/blockdev.c b/blockdev.c
> index 7d579d6..27bf68a 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -272,23 +272,23 @@ DriveInfo *drive_init(QemuOpts *opts, int 
> default_to_scsi)
>  if (type == IF_COUNT) {
>  error_report("unsupported bus type '%s'", buf);
>  return NULL;
> - }
> + }

Old indentation is correct, apart from tabs.

New indentation is incorrect, and fails to remove tabs.

if (type == IF_COUNT) {
error_report("unsupported bus type '%s'", buf);
return NULL;
}
^
tab hereincorrect indentation

Check your editor settings.

I don't see "broken" indentation anywhere in this function.  A few lines
are off: indented 3 instead of 4.  If we want to fix such things, I'd
prefer it done globally.



[Qemu-devel] [PATCH 0/3] [v4] Megasas HBA emulation

2011-07-01 Thread Hannes Reinecke
Hi all,

thanks to Paolo and Stefan most of the SCSI patches are now in, so
I've made the next attempt of submitting my Megaraid SAS HBA emulation.

To do so, I've done two additional patches, both should be valid cleanups.

- Replace 'tag' by 'hba_private'
  The SCSIRequest structure has a 'tag', which is being used by the
  drivers to match the SCSIRequest to the internal request structure.
  The only driver actually to benefit from this is the lsi53c895a
  driver, everyone else either leaves it blank or uses some internal
  numberting here.
  So this patch converts the 'tag' to a 'hba_private' pointer, which
  allows the driver to store a pointer to the internal structure
  directly within the SCSIRequest. This saves the lookup and an
  additional field in the driver internal request structure.
- Add an 'offset' parameter to iov_to_buf()
  iov_from_buf() has it, but iov_to_buf() has it not. But we'll be
  needing it if the iovec is larger than the buffer. So there.

And, of course, the megasas driver itself. Which has been modified
to work with the new interface; otherwise there have been no changes
to the previous submission.

Hannes Reinecke (3):
  iov: Add 'offset' parameter to iov_to_buf()
  scsi: replace 'tag' with 'hba_private' pointer
  megasas: LSI Megaraid SAS emulation

 Makefile.objs   |1 +
 default-configs/pci.mak |1 +
 hw/esp.c|2 +-
 hw/lsi53c895a.c |   17 +-
 hw/megasas.c| 1923 +++
 hw/mfi.h| 1197 +
 hw/pci_ids.h|3 +-
 hw/scsi-bus.c   |   22 +-
 hw/scsi-disk.c  |5 +-
 hw/scsi-generic.c   |4 +-
 hw/scsi.h   |8 +-
 hw/spapr_vscsi.c|   41 +-
 hw/usb-msd.c|   10 +-
 hw/virtio-net.c |2 +-
 hw/virtio-serial-bus.c  |2 +-
 iov.c   |   23 +-
 iov.h   |2 +-
 trace-events|   14 +-
 18 files changed, 3193 insertions(+), 84 deletions(-)
 create mode 100644 hw/megasas.c
 create mode 100644 hw/mfi.h




[Qemu-devel] [PATCH 1/3] iov: Add 'offset' parameter to iov_to_buf()

2011-07-01 Thread Hannes Reinecke
Occasionally, the buffer needs to be placed at a offset within
the iovec when copying the buffer to the iovec.

Signed-off-by: Hannes Reinecke 
---
 hw/virtio-net.c|2 +-
 hw/virtio-serial-bus.c |2 +-
 iov.c  |   23 ++-
 iov.h  |2 +-
 4 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 6997e02..a32cc01 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -657,7 +657,7 @@ static ssize_t virtio_net_receive(VLANClientState *nc, 
const uint8_t *buf, size_
 
 /* copy in packet.  ugh */
 len = iov_from_buf(sg, elem.in_num,
-   buf + offset, size - offset);
+   buf + offset, 0, size - offset);
 total += len;
 offset += len;
 /* If buffers can't be merged, at this point we
diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 7f6db7b..53c58d0 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -103,7 +103,7 @@ static size_t write_to_port(VirtIOSerialPort *port,
 }
 
 len = iov_from_buf(elem.in_sg, elem.in_num,
-   buf + offset, size - offset);
+   buf + offset, 0, size - offset);
 offset += len;
 
 virtqueue_push(vq, &elem, len);
diff --git a/iov.c b/iov.c
index 588cd04..9ead6ee 100644
--- a/iov.c
+++ b/iov.c
@@ -15,21 +15,26 @@
 #include "iov.h"
 
 size_t iov_from_buf(struct iovec *iov, unsigned int iovcnt,
-const void *buf, size_t size)
+const void *buf, size_t offset, size_t size)
 {
-size_t offset;
+size_t iov_off, buf_off;
 unsigned int i;
 
-offset = 0;
-for (i = 0; offset < size && i < iovcnt; i++) {
-size_t len;
+iov_off = 0;
+buf_off = 0;
+for (i = 0; i < iovcnt && size; i++) {
+if (offset < (iov_off + iov[i].iov_len)) {
+size_t len = MIN((iov_off + iov[i].iov_len) - offset, size);
 
-len = MIN(iov[i].iov_len, size - offset);
+memcpy(iov[i].iov_base + (offset - iov_off), buf + buf_off, len);
 
-memcpy(iov[i].iov_base, buf + offset, len);
-offset += len;
+buf_off += len;
+offset += len;
+size -= len;
+}
+iov_off += iov[i].iov_len;
 }
-return offset;
+return buf_off;
 }
 
 size_t iov_to_buf(const struct iovec *iov, const unsigned int iovcnt,
diff --git a/iov.h b/iov.h
index 60a8547..2677527 100644
--- a/iov.h
+++ b/iov.h
@@ -13,7 +13,7 @@
 #include "qemu-common.h"
 
 size_t iov_from_buf(struct iovec *iov, unsigned int iovcnt,
-const void *buf, size_t size);
+const void *buf, size_t offset, size_t size);
 size_t iov_to_buf(const struct iovec *iov, const unsigned int iovcnt,
   void *buf, size_t offset, size_t size);
 size_t iov_size(const struct iovec *iov, const unsigned int iovcnt);
-- 
1.6.0.2




[Qemu-devel] [PATCH 2/3] scsi: replace 'tag' with 'hba_private' pointer

2011-07-01 Thread Hannes Reinecke
'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves the lookup for driver handling several commands
in parallel.

Signed-off-by: Hannes Reinecke 
---
 hw/esp.c  |2 +-
 hw/lsi53c895a.c   |   17 -
 hw/scsi-bus.c |   22 +++---
 hw/scsi-disk.c|5 ++---
 hw/scsi-generic.c |4 ++--
 hw/scsi.h |8 
 hw/spapr_vscsi.c  |   41 -
 hw/usb-msd.c  |   10 +-
 trace-events  |   14 +++---
 9 files changed, 52 insertions(+), 71 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index 6d3f5d2..912ff89 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t 
busid)
 
 DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
 lun = busid & 7;
-s->current_req = scsi_req_new(s->current_dev, 0, lun);
+s->current_req = scsi_req_new(s->current_dev, lun, s);
 datalen = scsi_req_enqueue(s->current_req, buf);
 s->ti_size = datalen;
 if (datalen != 0) {
diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index 940b43a..272e919 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -670,7 +670,7 @@ static void lsi_request_cancelled(SCSIRequest *req)
 return;
 }
 
-p = lsi_find_by_tag(s, req->tag);
+p = req->hba_private;
 if (p) {
 QTAILQ_REMOVE(&s->queue, p, next);
 scsi_req_unref(req);
@@ -680,18 +680,17 @@ static void lsi_request_cancelled(SCSIRequest *req)
 
 /* Record that data is available for a queued command.  Returns zero if
the device was reselected, nonzero if the IO is deferred.  */
-static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
+static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
 {
-lsi_request *p;
+lsi_request *p = req->hba_private;
 
-p = lsi_find_by_tag(s, tag);
 if (!p) {
-BADF("IO with unknown tag %d\n", tag);
+BADF("IO with unknown reference %p\n", req->hba_private);
 return 1;
 }
 
 if (p->pending) {
-BADF("Multiple IO pending for tag %d\n", tag);
+BADF("Multiple IO pending for request %p\n", p);
 }
 p->pending = len;
 /* Reselect if waiting for it, or if reselection triggers an IRQ
@@ -743,9 +742,9 @@ static void lsi_transfer_data(SCSIRequest *req, uint32_t 
len)
 LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
 int out;
 
-if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
+if (s->waiting == 1 || !s->current || req->hba_private != s->current ||
 (lsi_irq_on_rsl(s) && !(s->scntl1 & LSI_SCNTL1_CON))) {
-if (lsi_queue_tag(s, req->tag, len)) {
+if (lsi_queue_req(s, req, len)) {
 return;
 }
 }
@@ -789,7 +788,7 @@ static void lsi_do_command(LSIState *s)
 assert(s->current == NULL);
 s->current = qemu_mallocz(sizeof(lsi_request));
 s->current->tag = s->select_tag;
-s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
+s->current->req = scsi_req_new(dev, s->current_lun, s->current);
 
 n = scsi_req_enqueue(s->current->req, buf);
 if (n) {
diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index ad6a730..d1fc481 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -131,7 +131,7 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
 return res;
 }
 
-SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t 
lun)
+SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t lun, void 
*hba_private)
 {
 SCSIRequest *req;
 
@@ -139,16 +139,16 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, 
uint32_t tag, uint32_t l
 req->refcount = 1;
 req->bus = scsi_bus_from_device(d);
 req->dev = d;
-req->tag = tag;
 req->lun = lun;
+req->hba_private = hba_private;
 req->status = -1;
-trace_scsi_req_alloc(req->dev->id, req->lun, req->tag);
+trace_scsi_req_alloc(req->dev->id, req->lun, req->hba_private);
 return req;
 }
 
-SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun)
+SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t lun, void *hba_private)
 {
-return d->info->alloc_req(d, tag, lun);
+return d->info->alloc_req(d, lun, hba_private);
 }
 
 uint8_t *scsi_req_get_buf(SCSIRequest *req)
@@ -182,7 +182,7 @@ int32_t scsi_req_enqueue(SCSIRequest *req, uint8_t *buf)
 
 static void scsi_req_dequeue(SCSIRequest *req)
 {
-trace_scsi_req_dequeue(req->dev->id, req->lun, req->tag);
+trace_scsi_req_dequeue(req->dev->id, req->lun, req->hba_private);
 if (req->enqueued) {
 QTAILQ_REMOVE(&req->dev->requests, req, next);
 req->enqueued = false;
@@ -214,7 +214,7 @@ static int scsi_req_length(SCSIRequest *req, uint8_t *cmd)
 req->cmd.len = 12;
 break;
 default:
-trace_scsi_req_pars

Re: [Qemu-devel] [PATCH v3] xen: implement unplug protocol in xen_platform

2011-07-01 Thread Kevin Wolf
Am 30.06.2011 16:16, schrieb Stefano Stabellini:
> On Thu, 30 Jun 2011, Kevin Wolf wrote:
>>> +static int pci_piix3_xen_ide_unplug(DeviceState *dev)
>>> +{
>>> +PCIDevice *pci_dev;
>>> +PCIIDEState *pci_ide;
>>> +DriveInfo *di;
>>> +int i = 0;
>>> +
>>> +pci_dev = DO_UPCAST(PCIDevice, qdev, dev);
>>> +pci_ide = DO_UPCAST(PCIIDEState, dev, pci_dev);
>>> +
>>> +for (; i < 3; i++) {
>>> +di = drive_get_by_index(IF_IDE, i);
>>> +if (di != NULL && di->bdrv != NULL && !di->bdrv->removable) {
>>> +DeviceState *ds = bdrv_get_attached(di->bdrv);
>>> +if (ds) {
>>> +bdrv_detach(di->bdrv, ds);
>>> +}
>>> +bdrv_close(di->bdrv);
>>> +pci_ide->bus[di->bus].ifs[di->unit].bs = NULL;
>>
>> Have you tested if this is enough if the guest tries to continue using
>> the device? I don't know of any case where it's not sufficient, just
>> trying to make sure that it's really true in practice.
> 
> The purpose of this is to "hide" the disk from the guest. The unplug is
> supposed to happen *before* the guest enumerates the IDE disks; it is
> responsibility of the guest to make sure of it.
> I tested it with Linux PV on HVM drivers, and Linux doesn't see the
> emulated disk after the unplug, as it should be.

Yeah. What I meant is that we should make sure that a misbehaving guest,
which just keeps on playing with the IDE ports anyway, can't crash qemu.
A quick review suggests that it is the case, but testing it anyway would
be better.

>>> +drive_put_ref(di);
>>> +}
>>> +}
>>> +qdev_reset_all(&(pci_ide->dev.qdev));
>>> +return 0;
>>> +}
>>> +
>>> +PCIDevice *pci_piix3_xen_ide_init(PCIBus *bus, DriveInfo **hd_table, int 
>>> devfn)
>>> +{
>>> +PCIDevice *dev;
>>> +
>>> +dev = pci_create_simple(bus, devfn, "piix3-ide-xen");
>>> +dev->qdev.info->unplug = pci_piix3_xen_ide_unplug;
>>
>> Can't this be moved into the PCIDeviceInfo now that we have a separate
>> one for Xen?
>  
> No because it would be overridden by the default pci unplug function,
> that is not what we want in this case.

Okay. I'm not familiar with that code, so I'll just trust you there.

Kevin



Re: [Qemu-devel] [PATCH] virtio-blk: Turn drive serial into a qdev property

2011-07-01 Thread Kevin Wolf
Am 20.06.2011 11:35, schrieb Markus Armbruster:
> It needs to be a qdev property, because it belongs to the drive's
> guest part.  Precedence: commit a0fef654 and 6ced55a5.
> 
> Bonus: info qtree now shows the serial number.
> 
> Signed-off-by: Markus Armbruster 
> ---
>  hw/s390-virtio-bus.c |4 +++-
>  hw/s390-virtio-bus.h |1 +
>  hw/virtio-blk.c  |   29 +++--
>  hw/virtio-blk.h  |2 ++
>  hw/virtio-pci.c  |4 +++-
>  hw/virtio-pci.h  |1 +
>  hw/virtio.h  |3 ++-
>  7 files changed, 31 insertions(+), 13 deletions(-)
> 
> diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c
> index d4a12f7..2bf4821 100644
> --- a/hw/s390-virtio-bus.c
> +++ b/hw/s390-virtio-bus.c
> @@ -128,7 +128,8 @@ static int s390_virtio_blk_init(VirtIOS390Device *dev)
>  {
>  VirtIODevice *vdev;
>  
> -vdev = virtio_blk_init((DeviceState *)dev, &dev->block);
> +vdev = virtio_blk_init((DeviceState *)dev, &dev->block,
> +   &dev->block_serial);
>  if (!vdev) {
>  return -1;
>  }
> @@ -355,6 +356,7 @@ static VirtIOS390DeviceInfo s390_virtio_blk = {
>  .qdev.size = sizeof(VirtIOS390Device),
>  .qdev.props = (Property[]) {
>  DEFINE_BLOCK_PROPERTIES(VirtIOS390Device, block),
> +DEFINE_PROP_STRING("serial", VirtIOS390Device, block_serial),
>  DEFINE_PROP_END_OF_LIST(),
>  },
>  };
> diff --git a/hw/s390-virtio-bus.h b/hw/s390-virtio-bus.h
> index 0c412d0..f1bece7 100644
> --- a/hw/s390-virtio-bus.h
> +++ b/hw/s390-virtio-bus.h
> @@ -42,6 +42,7 @@ typedef struct VirtIOS390Device {
>  uint8_t feat_len;
>  VirtIODevice *vdev;
>  BlockConf block;
> +char *block_serial;
>  NICConf nic;
>  uint32_t host_features;
>  virtio_serial_conf serial;
> diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
> index 91e0394..6471ac8 100644
> --- a/hw/virtio-blk.c
> +++ b/hw/virtio-blk.c
> @@ -28,8 +28,8 @@ typedef struct VirtIOBlock
>  void *rq;
>  QEMUBH *bh;
>  BlockConf *conf;
> +char *serial;
>  unsigned short sector_mask;
> -char sn[BLOCK_SERIAL_STRLEN];
>  DeviceState *qdev;
>  } VirtIOBlock;
>  
> @@ -362,8 +362,13 @@ static void virtio_blk_handle_request(VirtIOBlockReq 
> *req,
>  } else if (type & VIRTIO_BLK_T_GET_ID) {
>  VirtIOBlock *s = req->dev;
>  
> -memcpy(req->elem.in_sg[0].iov_base, s->sn,
> -   MIN(req->elem.in_sg[0].iov_len, sizeof(s->sn)));
> +/*
> + * NB: per existing s/n string convention the string is
> + * terminated by '\0' only when shorter than buffer.
> + */
> +strncpy(req->elem.in_sg[0].iov_base,
> +s->serial ? s->serial : "",
> +MIN(req->elem.in_sg[0].iov_len, VIRTIO_BLK_ID_BYTES));

Not sure what you're trying to do with VIRTIO_BLK_ID_BYTES here.

s->string either is dinfo->serial, in which case it happens to be the
same as BLOCK_SERIAL_STRLEN and as such makes some sense. Or it may be a
qdev property, in which case the string just has the length it has. Or
it's the empty string. So I think in two of three cases you're
potentially reading beyond the end of the buffer.

Kevin



Re: [Qemu-devel] [PATCH 1/3] iov: Add 'offset' parameter to iov_to_buf()

2011-07-01 Thread Paolo Bonzini

On 07/01/2011 09:42 AM, Hannes Reinecke wrote:

  size_t iov_from_buf(struct iovec *iov, unsigned int iovcnt,
-const void *buf, size_t size)
+const void *buf, size_t offset, size_t size)


Wrong commit subject, it seems. :)

Paolo



Re: [Qemu-devel] [PATCH 1/3] iov: Add 'offset' parameter to iov_to_buf()

2011-07-01 Thread Hannes Reinecke

On 07/01/2011 10:02 AM, Alexander Graf wrote:


On 01.07.2011, at 09:42, Hannes Reinecke wrote:


Occasionally, the buffer needs to be placed at a offset within
the iovec when copying the buffer to the iovec.


So this is a buffer into the iovec, right? Wouldn't it make sense
> to also modify iov_to_buf respectively then, so the API stays 
similar?


Ahem. That's exactly what the patch does. Except from the mixed-up 
subject.


iov_to_buff() has an offset parameter, iov_from_buf() has not.
For no obvious reasons.


Also, it'd be nice to give the parameter a more obvious name, so potential

> users can easily recognize what it offsets.



Yes, that sounds reasonable.

What about 'iov_off' ?
(And possibly rename 'iovcnt' to 'iov_cnt' for consistency ?)

Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)



Re: [Qemu-devel] [PATCH 1/3] iov: Add 'offset' parameter to iov_to_buf()

2011-07-01 Thread Alexander Graf

On 01.07.2011, at 09:42, Hannes Reinecke wrote:

> Occasionally, the buffer needs to be placed at a offset within
> the iovec when copying the buffer to the iovec.

So this is a buffer into the iovec, right? Wouldn't it make sense to also 
modify iov_to_buf respectively then, so the API stays similar? Also, it'd be 
nice to give the parameter a more obvious name, so potential users can easily 
recognize what it offsets.


Alex

> 
> Signed-off-by: Hannes Reinecke 
> ---
> hw/virtio-net.c|2 +-
> hw/virtio-serial-bus.c |2 +-
> iov.c  |   23 ++-
> iov.h  |2 +-
> 4 files changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index 6997e02..a32cc01 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -657,7 +657,7 @@ static ssize_t virtio_net_receive(VLANClientState *nc, 
> const uint8_t *buf, size_
> 
> /* copy in packet.  ugh */
> len = iov_from_buf(sg, elem.in_num,
> -   buf + offset, size - offset);
> +   buf + offset, 0, size - offset);
> total += len;
> offset += len;
> /* If buffers can't be merged, at this point we
> diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
> index 7f6db7b..53c58d0 100644
> --- a/hw/virtio-serial-bus.c
> +++ b/hw/virtio-serial-bus.c
> @@ -103,7 +103,7 @@ static size_t write_to_port(VirtIOSerialPort *port,
> }
> 
> len = iov_from_buf(elem.in_sg, elem.in_num,
> -   buf + offset, size - offset);
> +   buf + offset, 0, size - offset);
> offset += len;
> 
> virtqueue_push(vq, &elem, len);
> diff --git a/iov.c b/iov.c
> index 588cd04..9ead6ee 100644
> --- a/iov.c
> +++ b/iov.c
> @@ -15,21 +15,26 @@
> #include "iov.h"
> 
> size_t iov_from_buf(struct iovec *iov, unsigned int iovcnt,
> -const void *buf, size_t size)
> +const void *buf, size_t offset, size_t size)
> {
> -size_t offset;
> +size_t iov_off, buf_off;
> unsigned int i;
> 
> -offset = 0;
> -for (i = 0; offset < size && i < iovcnt; i++) {
> -size_t len;
> +iov_off = 0;
> +buf_off = 0;
> +for (i = 0; i < iovcnt && size; i++) {
> +if (offset < (iov_off + iov[i].iov_len)) {
> +size_t len = MIN((iov_off + iov[i].iov_len) - offset, size);
> 
> -len = MIN(iov[i].iov_len, size - offset);
> +memcpy(iov[i].iov_base + (offset - iov_off), buf + buf_off, len);
> 
> -memcpy(iov[i].iov_base, buf + offset, len);
> -offset += len;
> +buf_off += len;
> +offset += len;
> +size -= len;
> +}
> +iov_off += iov[i].iov_len;
> }
> -return offset;
> +return buf_off;
> }
> 
> size_t iov_to_buf(const struct iovec *iov, const unsigned int iovcnt,
> diff --git a/iov.h b/iov.h
> index 60a8547..2677527 100644
> --- a/iov.h
> +++ b/iov.h
> @@ -13,7 +13,7 @@
> #include "qemu-common.h"
> 
> size_t iov_from_buf(struct iovec *iov, unsigned int iovcnt,
> -const void *buf, size_t size);
> +const void *buf, size_t offset, size_t size);
> size_t iov_to_buf(const struct iovec *iov, const unsigned int iovcnt,
>   void *buf, size_t offset, size_t size);
> size_t iov_size(const struct iovec *iov, const unsigned int iovcnt);
> -- 
> 1.6.0.2
> 




Re: [Qemu-devel] [PATCH 1/3] iov: Add 'offset' parameter to iov_to_buf()

2011-07-01 Thread Hannes Reinecke

On 07/01/2011 10:03 AM, Paolo Bonzini wrote:

On 07/01/2011 09:42 AM, Hannes Reinecke wrote:

size_t iov_from_buf(struct iovec *iov, unsigned int iovcnt,
- const void *buf, size_t size)
+ const void *buf, size_t offset, size_t size)


Wrong commit subject, it seems. :)


Bummer.

Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)



Re: [Qemu-devel] [PATCH 2/3] scsi: replace 'tag' with 'hba_private' pointer

2011-07-01 Thread Paolo Bonzini

On 07/01/2011 09:42 AM, Hannes Reinecke wrote:

'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves the lookup for driver handling several commands
in parallel.


This makes tracing a bit harder to follow.  Perhaps you can keep the 
transport tag (a uint64_t) in the SCSIRequest for debugging purposes?



Signed-off-by: Hannes Reinecke
---
  hw/esp.c  |2 +-
  hw/lsi53c895a.c   |   17 -
  hw/scsi-bus.c |   22 +++---
  hw/scsi-disk.c|5 ++---
  hw/scsi-generic.c |4 ++--
  hw/scsi.h |8 
  hw/spapr_vscsi.c  |   41 -
  hw/usb-msd.c  |   10 +-
  trace-events  |   14 +++---
  9 files changed, 52 insertions(+), 71 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index 6d3f5d2..912ff89 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t 
busid)

  DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
  lun = busid&  7;
-s->current_req = scsi_req_new(s->current_dev, 0, lun);
+s->current_req = scsi_req_new(s->current_dev, lun, s);


Might as well pass NULL here.  The hba_private value is basically 
unnecessary when the adapter doesn't support tagged command queuing.



diff --git a/hw/usb-msd.c b/hw/usb-msd.c
index 86582cc..4e2ea03 100644
--- a/hw/usb-msd.c
+++ b/hw/usb-msd.c
@@ -216,8 +216,8 @@ static void usb_msd_transfer_data(SCSIRequest *req, 
uint32_t len)
  MSDState *s = DO_UPCAST(MSDState, dev.qdev, req->bus->qbus.parent);
  USBPacket *p = s->packet;

-if (req->tag != s->tag) {
-fprintf(stderr, "usb-msd: Unexpected SCSI Tag 0x%x\n", req->tag);
+if (req->hba_private != s) {
+fprintf(stderr, "usb-msd: Unexpected SCSI command 0x%p\n", req);
  }


Same here, just pass NULL and remove these ifs.

Otherwise looks like a very good idea.

Paolo



Re: [Qemu-devel] virtio scsi host draft specification, v3

2011-07-01 Thread Paolo Bonzini

On 07/01/2011 09:14 AM, Hannes Reinecke wrote:

Actually, the kernel does _not_ do a LUN remapping.


Not the kernel, the in-kernel target.  The in-kernel target can and will
map hardware LUNs (target_lun in drivers/target/*) to arbitrary LUNs
(mapped_lun).


Put in another way: the virtio-scsi device is itself a SCSI
target,


Argl. No way. The virtio-scsi device has to map to a single LUN.


I think we are talking about different things. By "virtio-scsi device"
I meant the "virtio-scsi HBA".  When I referred to a LUN as seen by the
guest, I was calling it a "virtual SCSI device".  So yes, we were
calling things with different names.  Perhaps from now on
we can call them virtio-scsi {initiator,target,LUN} and have no
ambiguity?  I'll also modify the spec in this sense.


The SCSI spec itself only deals with LUNs, so anything you'll read in
there obviously will only handle the interaction between the
initiator (read: host) and the LUN itself. However, the actual
command is send via an intermediat target, hence you'll always see
the reference to the ITL (initiator-target-lun) nexus.


Yes, this I understand.


The SCSI spec details discovery of the individual LUNs presented by a
given target, it does _NOT_ detail the discovery of the targets
themselves.  That is being delegated to the underlying transport


And in fact I have this in virtio-scsi too, since virtio-scsi _is_ a
transport:

When VIRTIO_SCSI_EVT_RESET_REMOVED or VIRTIO_SCSI_EVT_RESET_RESCAN
is sent for LUN 0, the driver should ask the initiator to rescan
the target, in order to detect the case when an entire target has
appeared or disappeared.

[If the device fails] to report an event due to missing buffers,
[...] the driver should poll the logical units for unit attention
conditions, and/or do whatever form of bus scan is appropriate for
the guest operating system.


In the case of NPIV it would make sense to map the virtual SCSI host
 to the backend, so that all devices presented to the virtual SCSI
host will be presented to the backend, too. However, when doing so
these devices will normally be referenced by their original LUN, as
these will be presented to the guest via eg 'REPORT LUNS'.


Right.


The above thread now tries to figure out if we should remap those LUN
numbers or just expose them as they are. If we decide on remapping,
we have to emulate _all_ commands referring explicitely to those LUN
numbers (persistent reservations, anyone?).


But it seems to me that commands referring explicitly to LUN numbers
most likely have to be reimplemented anyway for virtualization.  I'm
thinking exactly of persistent reservations.  If two guests on the same
host try a persistent reservation, they should conflict with each other.
If reservation commands were just passed through, they would be seen
as coming from the same initiator (the HBA driver or iSCSI initiator in
the host OS).

etc.


If we don't, we would expose some hardware detail to the guest, but
would save us _a lot_ of processing.


But can we afford it?  And would the architecture allow that at all?

Paolo



Re: [Qemu-devel] [PATCH] sheepdog: add data preallocation support

2011-07-01 Thread Kevin Wolf
Am 21.05.2011 14:35, schrieb MORITA Kazutaka:
> This introduces a qemu-img create option for sheepdog which allows the
> data to be preallocated (note that sheepdog always preallocates
> metadata).  This is necessary to use Sheepdog volumes as a backend
> storage for iSCSI target.  More information is available at
> https://sourceforge.net/apps/trac/sheepdog/wiki/General%20Protocol%20Support
> 
> The option is disabled by default and you need to enable it like the
> following:
> 
> qemu-img create sheepdog:test -o preallocation=data 1G
> 
> Signed-off-by: MORITA Kazutaka 
> Signed-off-by: FUJITA Tomonori 

Hm, looks like I forgot about this patch and nobody pinged me...

>  block/sheepdog.c |   73 
> +-
>  1 files changed, 72 insertions(+), 1 deletions(-)
> 
> diff --git a/block/sheepdog.c b/block/sheepdog.c
> index 0392ca8..38ca9aa 100644
> --- a/block/sheepdog.c
> +++ b/block/sheepdog.c
> @@ -1292,6 +1292,57 @@ static int do_sd_create(char *filename, int64_t 
> vdi_size,
>  return 0;
>  }
>  
> +static int sd_prealloc(uint32_t vid, int64_t vdi_size)

General comment to this function: Wouldn't it be easier to call the
existing bdrv_write function in a loop instead of reimplementing the
write manually? Though there may be a reason for it that I'm just missing.

> +{
> +int fd, ret;
> +SheepdogInode *inode;
> +char *buf;
> +unsigned long idx, max_idx;
> +
> +fd = connect_to_sdog(NULL, NULL);
> +if (fd < 0) {
> +return -EIO;
> +}
> +
> +inode = qemu_malloc(sizeof(*inode));
> +buf = qemu_malloc(SD_DATA_OBJ_SIZE);
> +
> +ret = read_object(fd, (char *)inode, vid_to_vdi_oid(vid),
> +  0, sizeof(*inode), 0);

No error handling?

> +
> +max_idx = (vdi_size + SD_DATA_OBJ_SIZE - 1) / SD_DATA_OBJ_SIZE;
> +
> +for (idx = 0; idx < max_idx; idx++) {
> +uint64_t oid;
> +oid = vid_to_data_oid(vid, idx);
> +
> +if (inode->data_vdi_id[idx]) {
> +ret = read_object(fd, buf, 
> vid_to_vdi_oid(inode->data_vdi_id[idx]),
> +  1, SD_DATA_OBJ_SIZE, 0);
> +if (ret)
> +goto out;

Missing braces.

Also, what is this if branch doing? Is it to ensure that we don't
overwrite existing data? But then, isn't an image always empty when we
preallocate it?

> +} else {
> +memset(buf, 0, SD_DATA_OBJ_SIZE);
> +}
> +
> +ret = write_object(fd, buf, oid, 1, SD_DATA_OBJ_SIZE, 0, 1);
> +if (ret)
> +goto out;

Braces

> +
> +inode->data_vdi_id[idx] = vid;
> +ret = write_object(fd, (char *)inode, vid_to_vdi_oid(vid),
> +   1, sizeof(*inode), 0, 0);
> +if (ret)
> +goto out;

Same here

> +}
> +out:
> +free(inode);
> +free(buf);
> +closesocket(fd);
> +
> +return ret;
> +}
> +
>  static int sd_create(const char *filename, QEMUOptionParameter *options)
>  {
>  int ret;
> @@ -1301,6 +1352,7 @@ static int sd_create(const char *filename, 
> QEMUOptionParameter *options)
>  BDRVSheepdogState s;
>  char vdi[SD_MAX_VDI_LEN], tag[SD_MAX_VDI_TAG_LEN];
>  uint32_t snapid;
> +int prealloc = 0;
>  
>  strstart(filename, "sheepdog:", (const char **)&filename);
>  
> @@ -1317,6 +1369,16 @@ static int sd_create(const char *filename, 
> QEMUOptionParameter *options)
>  vdi_size = options->value.n;
>  } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
>  backing_file = options->value.s;
> +} else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
> +if (!options->value.s || !strcmp(options->value.s, "off")) {
> +prealloc = 0;
> +} else if (!strcmp(options->value.s, "data")) {
> +prealloc = 1;
> +} else {
> +error_report("Invalid preallocation mode: '%s'\n",
> +options->value.s);
> +return -EINVAL;
> +}
>  }
>  options++;
>  }
> @@ -1354,7 +1416,11 @@ static int sd_create(const char *filename, 
> QEMUOptionParameter *options)
>  bdrv_delete(bs);
>  }
>  
> -return do_sd_create((char *)vdi, vdi_size, base_vid, &vid, 0, s.addr, 
> s.port);
> +ret = do_sd_create((char *)vdi, vdi_size, base_vid, &vid, 0, s.addr, 
> s.port);
> +if (!prealloc || ret)
> +return ret;

And here.

> +
> +return sd_prealloc(vid, vdi_size);
>  }
>  
>  static void sd_close(BlockDriverState *bs)
> @@ -1990,6 +2056,11 @@ static QEMUOptionParameter sd_create_options[] = {
>  .type = OPT_STRING,
>  .help = "File name of a base image"
>  },
> +{
> +.name = BLOCK_OPT_PREALLOC,
> +.type = OPT_STRING,
> +.help = "Preallocation mode (allowed values: off, data)"
> +},
>  { NULL }
>  };

In my RFC for qcow2, I called this preallocation mode "full" 

Re: [Qemu-devel] [PATCH 1/3] iov: Add 'offset' parameter to iov_to_buf()

2011-07-01 Thread Alexander Graf

On 01.07.2011, at 10:07, Hannes Reinecke wrote:

> On 07/01/2011 10:02 AM, Alexander Graf wrote:
>> 
>> On 01.07.2011, at 09:42, Hannes Reinecke wrote:
>> 
>>> Occasionally, the buffer needs to be placed at a offset within
>>> the iovec when copying the buffer to the iovec.
>> 
>> So this is a buffer into the iovec, right? Wouldn't it make sense
> > to also modify iov_to_buf respectively then, so the API stays similar?
> 
> Ahem. That's exactly what the patch does. Except from the mixed-up subject.
> 
> iov_to_buff() has an offset parameter, iov_from_buf() has not.
> For no obvious reasons.

Ah, I see. Please state this in your patch description :). Makes it a lot 
easier to understand the rationale that you're merely moving the "from" API 
towards the same parameters as to "to" one.

> 
>> Also, it'd be nice to give the parameter a more obvious name, so potential
> > users can easily recognize what it offsets.
>> 
> Yes, that sounds reasonable.
> 
> What about 'iov_off' ?
> (And possibly rename 'iovcnt' to 'iov_cnt' for consistency ?)

Yup, that'd be a lot more readable :)


Alex




Re: [Qemu-devel] [PATCH] [PowerPC][RFC] booke timers

2011-07-01 Thread Fabien Chouteau
On 30/06/2011 21:26, Scott Wood wrote:
> On Thu, 30 Jun 2011 17:51:10 +0200
> Fabien Chouteau  wrote:
>
>> On 28/06/2011 19:49, Scott Wood wrote:
>>> On Tue, 28 Jun 2011 15:35:00 +0200
>>> Fabien Chouteau  wrote:
>>>
 On 27/06/2011 22:28, Scott Wood wrote:
> On Mon, 27 Jun 2011 18:14:06 +0200
> Fabien Chouteau  wrote:
>
>> While working on the emulation of the freescale p2010 (e500v2) I 
>> realized that
>> there's no implementation of booke's timers features. Currently mpc8544 
>> uses
>> ppc_emb (ppc_emb_timers_init) which is close but not exactly like booke 
>> (for
>> example booke uses different SPR).
>
> ppc_emb_timers_init should be renamed something less generic, then.

 Agreed, can you help me to find a better name?
>>>
>>> What chips are covered by it?  40x?
>>
>> The function is used by ppc4xx_init (ppc4xx_devs.c) and ppc440_init_xilinx
>> (virtex_ml507.c), so I guess ppc_4xx_timers_int will be fine...
>
> Doesn't 440 have normal booke timers?
>

Actually ppc_emb_timers_init should be renamed ppc40x_timers_init, and Xilinx's
ppc440 should use the new implementation of booke timers.

> I think some changes in the decrementer code are needed to provide booke
> semantics -- no raising the interrupt based on the high bit of decr, and
> stop counting when reach zero.

 Can you please explain, I don't see where I'm not compliant with booke's
 decrementer.
>>>
>>> It's not an issue with this code specifically, but existing behavior in the
>>> decrementer code that isn't appropriate for booke.
>>>
>>> On classic/server powerpc, when decrementer hits zero, it wraps around, and
>>> the upper bit of DECR is used to signal the interrupt.  On booke, when
>>> decrementer hits zero, it stops, and the upper bit of DECR is not special.
>>>
>>
>> And that's why I implemented the booke_decr_cb function that will emulate 
>> this
>> behavior.
>
> booke_decr_cb() doesn't control what happens when DECR is read after an
> expiration, nor does it control whether an interrupt is raised if software
> writes DECR with the high bit set.
>

OK, I will add a condition to avoid this on booke processors.

 It's just a mask to keep only the defined bits.
>>>
>>> Just seems unnecessary, and potentially harmful if CPU-specific code wants
>>> to interpret implementation-defined bits, or if new bits are architected
>>> in the future.
>>>
>>
>> On the other hand, undefined bit should always be read as zeros.
>
> I don't think there's any such requirement.

My bad ;)

"The handling of reserved bits in System Registers (e.g. Integer Exception
Register, Floating-Point Status and Control Register) is
implementation-dependent. Software is permitted to write any value to such a
bit with no visible effect on processors that implement this version of Book E.
A subsequent reading of the bit returns a 0 if the last value written to the
bit was 0 and returns an undefined value (0 or 1) otherwise."

-- 
Fabien Chouteau



Re: [Qemu-devel] [PATCH 2/3] scsi: replace 'tag' with 'hba_private' pointer

2011-07-01 Thread Hannes Reinecke

On 07/01/2011 10:27 AM, Paolo Bonzini wrote:

On 07/01/2011 09:42 AM, Hannes Reinecke wrote:

'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves the lookup for driver handling several commands
in parallel.


This makes tracing a bit harder to follow. Perhaps you can keep the
transport tag (a uint64_t) in the SCSIRequest for debugging purposes?


Sure. Anything to get the patches accepted :-)


Signed-off-by: Hannes Reinecke
---
hw/esp.c | 2 +-
hw/lsi53c895a.c | 17 -
hw/scsi-bus.c | 22 +++---
hw/scsi-disk.c | 5 ++---
hw/scsi-generic.c | 4 ++--
hw/scsi.h | 8 
hw/spapr_vscsi.c | 41 -
hw/usb-msd.c | 10 +-
trace-events | 14 +++---
9 files changed, 52 insertions(+), 71 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index 6d3f5d2..912ff89 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t
*buf, uint8_t busid)

DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
lun = busid& 7;
- s->current_req = scsi_req_new(s->current_dev, 0, lun);
+ s->current_req = scsi_req_new(s->current_dev, lun, s);


Might as well pass NULL here. The hba_private value is basically
unnecessary when the adapter doesn't support tagged command queuing.


diff --git a/hw/usb-msd.c b/hw/usb-msd.c
index 86582cc..4e2ea03 100644
--- a/hw/usb-msd.c
+++ b/hw/usb-msd.c
@@ -216,8 +216,8 @@ static void usb_msd_transfer_data(SCSIRequest
*req, uint32_t len)
MSDState *s = DO_UPCAST(MSDState, dev.qdev, req->bus->qbus.parent);
USBPacket *p = s->packet;

- if (req->tag != s->tag) {
- fprintf(stderr, "usb-msd: Unexpected SCSI Tag 0x%x\n", req->tag);
+ if (req->hba_private != s) {
+ fprintf(stderr, "usb-msd: Unexpected SCSI command 0x%p\n", req);
}


Same here, just pass NULL and remove these ifs.

Otherwise looks like a very good idea.


Ok, I'll be resending both.

Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)



Re: [Qemu-devel] [PATCH 3/3] megasas: LSI Megaraid SAS emulation

2011-07-01 Thread Alexander Graf

On 01.07.2011, at 09:42, Hannes Reinecke wrote:

> This patch adds an emulation for the LSI Megaraid SAS 8708EM2 HBA.
> 
> Signed-off-by: Hannes Reinecke 
> ---
> Makefile.objs   |1 +
> default-configs/pci.mak |1 +
> hw/megasas.c| 1923 +++
> hw/mfi.h| 1197 +
> hw/pci_ids.h|3 +-
> 5 files changed, 3124 insertions(+), 1 deletions(-)
> create mode 100644 hw/megasas.c
> create mode 100644 hw/mfi.h
> 
> diff --git a/Makefile.objs b/Makefile.objs
> index cea15e4..6f5d113 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -258,6 +258,7 @@ hw-obj-$(CONFIG_AHCI) += ide/ich.o
> 
> # SCSI layer
> hw-obj-$(CONFIG_LSI_SCSI_PCI) += lsi53c895a.o
> +hw-obj-$(CONFIG_MEGASAS_SCSI_PCI) += megasas.o
> hw-obj-$(CONFIG_ESP) += esp.o
> 
> hw-obj-y += dma-helpers.o sysbus.o isa-bus.o
> diff --git a/default-configs/pci.mak b/default-configs/pci.mak
> index 22bd350..fabb56c 100644
> --- a/default-configs/pci.mak
> +++ b/default-configs/pci.mak
> @@ -9,6 +9,7 @@ CONFIG_EEPRO100_PCI=y
> CONFIG_PCNET_PCI=y
> CONFIG_PCNET_COMMON=y
> CONFIG_LSI_SCSI_PCI=y
> +CONFIG_MEGASAS_SCSI_PCI=y
> CONFIG_RTL8139_PCI=y
> CONFIG_E1000_PCI=y
> CONFIG_IDE_CORE=y
> diff --git a/hw/megasas.c b/hw/megasas.c
> new file mode 100644
> index 000..75f9be3
> --- /dev/null
> +++ b/hw/megasas.c
> @@ -0,0 +1,1923 @@
> +/*
> + * QEMU MegaRAID SAS 8708EM2 Host Bus Adapter emulation
> + *
> + * Copyright (c) 2009-2011 Hannes Reinecke, SUSE Labs
> + *
> + * This code is licenced under the LGPL.

Please take a look at the license header of other LGPL code and just copy it :).

> + */
> +
> +#include 
> +#include 

Are you sure you need to manually include those?

> +
> +#include "hw.h"
> +#include "pci.h"
> +#include "dma.h"
> +#include "iov.h"
> +#include "scsi.h"
> +#include "scsi-defs.h"
> +#include "block_int.h"
> +#ifdef __linux__
> +# include 

Is this really necessary? Device code shouldn't be host dependent IMHO. I also 
haven't found any user of this in the actual code, so it might be as easy as 
merely removing the include :).

> +#endif
> +
> +#include "mfi.h"
> +
> +#define DEBUG_MEGASAS
> +#undef DEBUG_MEGASAS_REG
> +#undef DEBUG_MEGASAS_QUEUE
> +#undef DEBUG_MEGASAS_MFI
> +#undef DEBUG_MEGASAS_IO
> +#undef DEBUG_MEGASAS_DCMD
> +
> +#ifdef DEBUG_MEGASAS
> +#define DPRINTF(fmt, ...) \
> +do { printf("megasas: " fmt , ## __VA_ARGS__); } while (0)
> +#define BADF(fmt, ...) \
> +do { fprintf(stderr, "megasas: error: " fmt , ## __VA_ARGS__); exit(1);} 
> while (0)
> +#ifdef DEBUG_MEGASAS_REG
> +#define DPRINTF_REG DPRINTF
> +#else
> +#define DPRINTF_REG(fmt, ...) do {} while(0)
> +#endif
> +#ifdef DEBUG_MEGASAS_QUEUE
> +#define DPRINTF_QUEUE DPRINTF
> +#else
> +#define DPRINTF_QUEUE(fmt, ...) do {} while(0)
> +#endif
> +#ifdef DEBUG_MEGASAS_MFI
> +#define DPRINTF_MFI DPRINTF
> +#else
> +#define DPRINTF_MFI(fmt, ...) do {} while(0)
> +#endif
> +#ifdef DEBUG_MEGASAS_IO
> +#define DPRINTF_IO DPRINTF
> +#else
> +#define DPRINTF_IO(fmt, ...) do {} while(0)
> +#endif
> +#ifdef DEBUG_MEGASAS_DCMD
> +#define DPRINTF_DCMD DPRINTF
> +#else
> +#define DPRINTF_DCMD(fmt, ...) do {} while(0)
> +#endif
> +#else
> +#define DPRINTF(fmt, ...) do {} while(0)
> +#define DPRINTF_REG DPRINTF
> +#define DPRINTF_QUEUE DPRINTF
> +#define DPRINTF_MFI DPRINTF
> +#define DPRINTF_IO DPRINTF
> +#define DPRINTF_DCMD DPRINTF
> +#define BADF(fmt, ...) \
> +do { fprintf(stderr, "megasas: error: " fmt , ## __VA_ARGS__);} while (0)
> +#endif
> +
> +/* Static definitions */
> +#define MEGASAS_VERSION "1.20"
> +#define MEGASAS_MAX_FRAMES 2048 /* Firmware limit at 65535 */
> +#define MEGASAS_DEFAULT_FRAMES 1000 /* Windows requires this */
> +#define MEGASAS_MAX_SGE 256 /* Firmware limit */
> +#define MEGASAS_DEFAULT_SGE 80
> +#define MEGASAS_MAX_SECTORS 0x  /* No real limit */
> +#define MEGASAS_MAX_ARRAYS 128
> +
> +const char *mfi_frame_desc[] = {
> +"MFI init", "LD Read", "LD Write", "LD SCSI", "PD SCSI",
> +"MFI Doorbell", "MFI Abort", "MFI SMP", "MFI Stop"};
> +
> +struct megasas_cmd_t {
> +int index;
> +int context;
> +int count;
> +
> +target_phys_addr_t pa;
> +target_phys_addr_t pa_size;
> +union mfi_frame *frame;
> +SCSIRequest *req;
> +struct iovec *iov;
> +void *iov_buf;
> +long iov_cnt;
> +long iov_size;
> +long iov_offset;

Why would anything be a long? It's either target_ulong or uintXX_t for device 
code usually :).

> +SCSIDevice *sdev;
> +struct megasas_state_t *state;
> +};
> +
> +typedef struct megasas_state_t {
> +PCIDevice dev;
> +int mmio_io_addr;
> +int io_addr;
> +int queue_addr;
> +uint32_t frame_hi;
> +
> +int fw_state;
> +uint32_t fw_sge;
> +uint32_t fw_cmds;
> +int fw_luns;
> +int intr_mask;
> +int doorbell;
> +int busy;
> +char *raid_mode_str;
> +int is_jbod;
> +
> +int event_co

[Qemu-devel] qemu crashes on Mac OS X

2011-07-01 Thread Damjan Marion (damarion)

Hi,

I have an issue when I try to run qemu-system-arm on Mac OS X. 
Sometime between 1 and 15 secs after qemu is started it crashes
as shown bellow.

Same thing on linux host works fine.

Is anybody else experiencing this?
Any Hints?

Thanks,

Damjan



(gdb) run
Starting program: /opt/arm-qemu/bin/qemu-system-arm -M verdex -pflash flash.img 
-nographic -monitor null -m 289
Reading symbols for shared libraries 
.++
 done
pxa2xx_clkpwr_write: CPU frequency change attempt


U-Boot 1.2.0 (May 10 2008 - 21:17:19) - PXA270@400 MHz - 1604

*** Welcome to Gumstix ***

DRAM:  256 MB
Flash: 32 MB
Using default environment

Hit any key to stop autoboot:  1 
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x7fff5fbfed30
0x7fff5fbfed30 in ?? ()
(gdb) 
(gdb) bt
#0  0x7fff5fbfed30 in ?? ()
#1  0x0001000c26f4 in qemu_iohandler_poll ()
#2  0x0001001975ae in main_loop_wait ()
#3  0x0001001976e2 in main_loop ()
#4  0x00010019bfbc in qemu_main ()
#5  0x0001000d63a5 in main ()
(gdb) 




Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Jakub Jermar
Hi Artyom,

On 1.7.2011 11:15, Artyom Tarasenko wrote:
> Hi Jakub,
> 2011/6/30 Jakub Jermar :
>> Hi,
>>
>> we have been observing a problem with HelenOS running on the latest git
>> Qemu/sparc64. The gist of the problem is that the following computation
>> of minimum of 64 and 512 surprisingly gives 512:
>>
>>  bytes = min(len, BPS(bs) - pos % BPS(bs));
>>  bytes = min(bytes, nodep->size - pos);
>>
>> On input, `len` is 64, `pos` is 0, `BPS(bs)` is 512 and `nodep->size` is
>> some bigger number. Surprisingly, in a non-single-stepping mode, Qemu
>> computes `bytes` as 512 instead of 64.
> 
> How is did you manage to see "bytes" without GDB?
> In my case, as reported, a printf or any other statement using bytes
> was simply not reached.

I just added a printf of all the variables in play after the computation
of the second min(). In my case, bytes was printed as the fourth number
(i.e. fifth argument to printf()). This modification of the code, as you
pointed out on our ML, reproduces the error on the first invocation of
the function, otherwise the bug seems to show later.

>> When singlestepping via GDB, the
>> result is correct.
> 
> You mean singlestepping gives other result than just setting a
> breakpoint at target address?

Yes, even though I tried singlestepping just once.

> And just to clarify, with 'singlestepping' do you mean manual stepping
> through the code, or the '-singlestep' qemu option?

Should have been clearer on this - I meant putting a GDB breakpoint to
the first `cmp` instruction in the block at 0x67c8 and then doing
`stepi` all the way down beoynd the second `movgu` instruction at
0x6808. In this case, the condition code register was computed properly.

On the other hand, putting a breakpoint directly to the secnod `movgu`
at 0x6808 showed me the unexpected value in ccr and the behavior was the
same as if no breakpoint was placed at all.

I would imagine that when singlestepping using GDB's `stepi`, Qemu would
throw away the large block and create a separate block for each
instruction, but this is just a bloody guess of mine.

Jakub



Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Artyom Tarasenko
Hi Jakub,
2011/6/30 Jakub Jermar :
> Hi,
>
> we have been observing a problem with HelenOS running on the latest git
> Qemu/sparc64. The gist of the problem is that the following computation
> of minimum of 64 and 512 surprisingly gives 512:
>
>  bytes = min(len, BPS(bs) - pos % BPS(bs));
>  bytes = min(bytes, nodep->size - pos);
>
> On input, `len` is 64, `pos` is 0, `BPS(bs)` is 512 and `nodep->size` is
> some bigger number. Surprisingly, in a non-single-stepping mode, Qemu
> computes `bytes` as 512 instead of 64.

How is did you manage to see "bytes" without GDB?
In my case, as reported, a printf or any other statement using bytes
was simply not reached.

>When singlestepping via GDB, the
> result is correct.

You mean singlestepping gives other result than just setting a
breakpoint at target address?
And just to clarify, with 'singlestepping' do you mean manual stepping
through the code, or the '-singlestep' qemu option?

> I think this could be a bug in Qemu so I am attaching the relevant
> portion of qemu.log with some comments and pointers in it.
>
> I would appreciate if someone who understands the sparc64 code
> translation could have a look at this. More debugging data may be
> provided upon request.
>
> Thanks,
> Jakub
>
> ___
> HelenOS-devel mailing list
> helenos-de...@lists.modry.cz
> http://lists.modry.cz/cgi-bin/listinfo/helenos-devel
>
>



-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/



Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Artyom Tarasenko
On Fri, Jul 1, 2011 at 11:36 AM, Jakub Jermar  wrote:
> Hi Artyom,
>
> On 1.7.2011 11:15, Artyom Tarasenko wrote:
>> Hi Jakub,
>> 2011/6/30 Jakub Jermar :
>>> Hi,
>>>
>>> we have been observing a problem with HelenOS running on the latest git
>>> Qemu/sparc64. The gist of the problem is that the following computation
>>> of minimum of 64 and 512 surprisingly gives 512:
>>>
>>>  bytes = min(len, BPS(bs) - pos % BPS(bs));
>>>  bytes = min(bytes, nodep->size - pos);
>>>
>>> On input, `len` is 64, `pos` is 0, `BPS(bs)` is 512 and `nodep->size` is
>>> some bigger number. Surprisingly, in a non-single-stepping mode, Qemu
>>> computes `bytes` as 512 instead of 64.
>>
>> How is did you manage to see "bytes" without GDB?
>> In my case, as reported, a printf or any other statement using bytes
>> was simply not reached.
>
> I just added a printf of all the variables in play after the computation
> of the second min(). In my case, bytes was printed as the fourth number
> (i.e. fifth argument to printf()). This modification of the code, as you
> pointed out on our ML, reproduces the error on the first invocation of
> the function, otherwise the bug seems to show later.
>
>>> When singlestepping via GDB, the
>>> result is correct.
>>
>> You mean singlestepping gives other result than just setting a
>> breakpoint at target address?
>
> Yes, even though I tried singlestepping just once.
>
>> And just to clarify, with 'singlestepping' do you mean manual stepping
>> through the code, or the '-singlestep' qemu option?
>
> Should have been clearer on this - I meant putting a GDB breakpoint to
> the first `cmp` instruction in the block at 0x67c8 and then doing
> `stepi` all the way down beoynd the second `movgu` instruction at
> 0x6808. In this case, the condition code register was computed properly.
>
> On the other hand, putting a breakpoint directly to the secnod `movgu`
> at 0x6808 showed me the unexpected value in ccr and the behavior was the
> same as if no breakpoint was placed at all.

Looks like it's a pretty nice test case.

The test case scenario is to load the initial values into the
registers (you already know them),
calculate the mins, cmp the result with expected and jump somewhere.
Since it happens with interrupts disabled, we don't need an OS or bios
to test that. The binary just has to have an entry point at 0x20,
then we can load it instead of bios.

Care to produce the test case? (Otherwise I'll do it, but probably not
until the weekend after the next one).

> I would imagine that when singlestepping using GDB's `stepi`, Qemu would
> throw away the large block and create a separate block for each
> instruction, but this is just a bloody guess of mine.

Yes, I think this the way stepi  works.

> Jakub
>
> ___
> HelenOS-devel mailing list
> helenos-de...@lists.modry.cz
> http://lists.modry.cz/cgi-bin/listinfo/helenos-devel
>



-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/



Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Paul Brook
> One feature we need for QEMU/KVM on embedded Power Architecture is the
> ability to do passthru assignment of SoC I/O devices and memory.  An
> important use case in embedded is creating static partitions--
> taking physical memory and I/O devices (non-PCI) and partitioning
> them between the host Linux and several virtual machines.   Things like
> live migration would not be needed or supported in these types of
> scenarios.
> 
> SoC devices do not sit on a probeable bus and there are no identifiers
> like 01:00.0 with PCI that we can use to identify devices--  the host
> Linux kernel is made aware of SoC I/O devices from nodes/properties in a
> device tree structure passed at boot.   QEMU needs to generate a
> device tree to pass to the guest as well with all the guest's virtual
> and physical resources.  Today a number of mostly complete guest device
> trees are kept under ./pc-bios in QEMU, but this too static and
> inflexible.

I doubt you're going to get generic passthrough of arbitrary devices working 
in a useful way. My expectation is that, at minimum, you'll need a bus 
specific proxy device. i.e. create a virtual device in qemu that responds to 
the guest, and happens poke at a host device rather than emulating things 
directly.

For busses like I2C this is fairly trivial - all communication with the device 
goes down a single well defined and easily proxied channel.  For more complex 
busses you end up having to emulate a lot more.  Basically you have to emulate 
everything that is different between the host and guest.  If that happens to 
include device specific state then you loose.

Using PCI devices as an example: The resources provided by the device are 
self-describing, so proxying those is fairly straightforward, and doesn't even 
require manual configuration.  However replicating the environment seen by the 
device is trickier as PCI devices can initiate memory accesses (i.e. bus-
master).  For machines without an IOMMU this means passthrough in general 
can't work, and substantial amounts of device specific knowledge is required. 
You'd need to intercept and modify and/oor proxy all data relating to DMA 
addresses.  In practice you need to emulate an IOMMU inside qemu (so you can 
determine the address space accessed by the device), and arrange for the host 
IOMMU to present the same virtual address space to the real device.

Paul



[Qemu-devel] PCI: how handle multifunction / compound devices best?

2011-07-01 Thread Gerd Hoffmann
  Hi folks,

I'm still looking for a sane way to handle multifunction pci devices,
specifically a EHCI USB controller with UHCI companion controllers.

Here comes a small (incomplete[1]) two patch series to make the issue
more clear.  The first patch adds a function which creates all devices
with the properties set as needed.  The second patch adds a '-usb2'
switch to windup this in a user-friendly way.

Adding a -usb2 switch sucks though.  I'd prefer to have some way to
create such devices via -device, but without asking the user to create
all functions individually on the command line, i.e. have something
along the lines of ...

 qemu -device ich9-usb,slot=08

... instead of ...

 qemu \
  -device ich9-usb-ehci1,addr=08.7,multifunction=on,id=ehci \
  -device 
ich9-usb-uhci1,addr=08.0,multifunction=on,masterbus=ehci.0,firstport=0 \
  -device 
ich9-usb-uhci2,addr=08.1,multifunction=on,masterbus=ehci.0,firstport=2 \
  -device ich9-usb-uhci3,addr=08.2,multifunction=on,masterbus=ehci.0,firstport=4

Suggestions?

cheers,
  Gerd

[1] full series @ http://www.kraxel.org/cgit/qemu/log/?h=usb.18

Gerd Hoffmann (2):
  usb: add ich9_ehci_with_companion_init()
  usb: windup ich9_ehci_with_companion_init via -usb2 switch.

 hw/pc_piix.c|   14 ++
 hw/usb-ehci.c   |   49 +
 hw/usb.h|3 +++
 qemu-options.hx |   13 +++--
 vl.c|3 +++
 5 files changed, 76 insertions(+), 6 deletions(-)




[Qemu-devel] [PATCH 1/2] usb: add ich9_ehci_with_companion_init()

2011-07-01 Thread Gerd Hoffmann
Add convinience function which creates a ehci controller with three
companion uhci controllers, with all properties are setup correctly.

The guest will see this:

[root@fedora64 ~]# lspci -s2
00:02.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #1 (rev 03)
00:02.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #2 (rev 03)
00:02.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI 
Controller #3 (rev 03)
00:02.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI 
Controller #1 (rev 03)

Signed-off-by: Gerd Hoffmann 
---
 hw/usb-ehci.c |   49 +
 hw/usb.h  |3 +++
 2 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/hw/usb-ehci.c b/hw/usb-ehci.c
index a4758f9..d24eecb 100644
--- a/hw/usb-ehci.c
+++ b/hw/usb-ehci.c
@@ -2354,6 +2354,55 @@ static void ehci_register(void)
 }
 device_init(ehci_register);
 
+int ich9_ehci_with_companion_init(PCIBus *bus, int devfn, const char *id)
+{
+PCIDevice *ehci = NULL, *uhci1 = NULL, *uhci2 = NULL, *uhci3 = NULL;
+char masterbus[32];
+int slot = PCI_SLOT(devfn);
+
+snprintf(masterbus, sizeof(masterbus), "%s.0", id);
+
+ehci = pci_create_multifunction(bus, PCI_DEVFN(slot, 7), true,
+"ich9-usb-ehci1");
+if (NULL == ehci) {
+goto fail;
+}
+ehci->qdev.id = id;
+qdev_init_nofail(&ehci->qdev);
+
+uhci1 = pci_create_multifunction(bus, PCI_DEVFN(slot, 0), true,
+ "ich9-usb-uhci1");
+if (NULL == uhci1) {
+goto fail;
+}
+qdev_prop_set_string(&uhci1->qdev, "masterbus", masterbus);
+qdev_prop_set_uint32(&uhci1->qdev, "firstport", 0);
+qdev_init_nofail(&uhci1->qdev);
+
+uhci2 = pci_create_multifunction(bus, PCI_DEVFN(slot, 1), true,
+ "ich9-usb-uhci2");
+if (NULL == uhci2) {
+goto fail;
+}
+qdev_prop_set_string(&uhci2->qdev, "masterbus", masterbus);
+qdev_prop_set_uint32(&uhci2->qdev, "firstport", 2);
+qdev_init_nofail(&uhci2->qdev);
+
+uhci3 = pci_create_multifunction(bus, PCI_DEVFN(slot, 2), true,
+ "ich9-usb-uhci3");
+if (NULL == uhci3) {
+goto fail;
+}
+qdev_prop_set_string(&uhci3->qdev, "masterbus", masterbus);
+qdev_prop_set_uint32(&uhci3->qdev, "firstport", 4);
+qdev_init_nofail(&uhci3->qdev);
+return 0;
+
+fail:
+return -1;
+}
+
+
 /*
  * vim: expandtab ts=4
  */
diff --git a/hw/usb.h b/hw/usb.h
index ded2de2..2f90b3a 100644
--- a/hw/usb.h
+++ b/hw/usb.h
@@ -313,6 +313,9 @@ void usb_hid_datain_cb(USBDevice *dev, void *opaque, void 
(*datain)(void *));
 /* usb-bt.c */
 USBDevice *usb_bt_init(HCIInfo *hci);
 
+/* usb-ehci.c */
+int ich9_ehci_with_companion_init(PCIBus *bus, int devfn, const char *id);
+
 /* usb ports of the VM */
 
 #define VM_USB_HUB_SIZE 8
-- 
1.7.1




Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Alexander Graf

On 01.07.2011, at 13:16, Paul Brook wrote:

>> One feature we need for QEMU/KVM on embedded Power Architecture is the
>> ability to do passthru assignment of SoC I/O devices and memory.  An
>> important use case in embedded is creating static partitions--
>> taking physical memory and I/O devices (non-PCI) and partitioning
>> them between the host Linux and several virtual machines.   Things like
>> live migration would not be needed or supported in these types of
>> scenarios.
>> 
>> SoC devices do not sit on a probeable bus and there are no identifiers
>> like 01:00.0 with PCI that we can use to identify devices--  the host
>> Linux kernel is made aware of SoC I/O devices from nodes/properties in a
>> device tree structure passed at boot.   QEMU needs to generate a
>> device tree to pass to the guest as well with all the guest's virtual
>> and physical resources.  Today a number of mostly complete guest device
>> trees are kept under ./pc-bios in QEMU, but this too static and
>> inflexible.
> 
> I doubt you're going to get generic passthrough of arbitrary devices working 
> in a useful way. My expectation is that, at minimum, you'll need a bus 
> specific proxy device. i.e. create a virtual device in qemu that responds to 
> the guest, and happens poke at a host device rather than emulating things 
> directly.
> 
> For busses like I2C this is fairly trivial - all communication with the 
> device 
> goes down a single well defined and easily proxied channel.  For more complex 
> busses you end up having to emulate a lot more.  Basically you have to 
> emulate 
> everything that is different between the host and guest.  If that happens to 
> include device specific state then you loose.
> 
> Using PCI devices as an example: The resources provided by the device are 
> self-describing, so proxying those is fairly straightforward, and doesn't 
> even 
> require manual configuration.  However replicating the environment seen by 
> the 
> device is trickier as PCI devices can initiate memory accesses (i.e. bus-
> master).  For machines without an IOMMU this means passthrough in general 
> can't work, and substantial amounts of device specific knowledge is required. 
> You'd need to intercept and modify and/oor proxy all data relating to DMA 
> addresses.  In practice you need to emulate an IOMMU inside qemu (so you can 
> determine the address space accessed by the device), and arrange for the host 
> IOMMU to present the same virtual address space to the real device.

Well, for DMA the solution is reasonably simple. We have basically two choices:

  * run 1:1 mapped, so the guest physical address == host physical address, at 
which point DMA works, but everything is insecure
  * use an IOMMU

We can easily limit it to those two cases. The more challenging part here (and 
the main reason for the email) is the question on how to configure all of that 
in a flexible, yet simple way. We can find the IO regions for devices from the 
host device tree - no problem there.

But the real challenge is how to expose the device to the guest device tree. 
Especially when it comes to links between dt nodes, interrupt maps, etc. We 
basically have 3 choices there:

  * take the host device tree pieces and modify them
  * provide device tree chunks for each device (manually or through qdev 
parameters)
  * use the device tree as machine config file and base everything on it 
(solves the linking problem)

The main question is which one would be the cleanest solution. And how would it 
be implemented.


Alex




[Qemu-devel] [PATCH 2/2] usb: windup ich9_ehci_with_companion_init via -usb2 switch.

2011-07-01 Thread Gerd Hoffmann
Signed-off-by: Gerd Hoffmann 
---
 hw/pc_piix.c|   14 ++
 qemu-options.hx |   13 +++--
 vl.c|3 +++
 3 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index c5c16b4..c9aed67 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -39,6 +39,7 @@
 #include "blockdev.h"
 #include "smbus.h"
 #include "xen.h"
+#include "usb.h"
 #ifdef CONFIG_XEN
 #  include 
 #endif
@@ -134,6 +135,15 @@ static void pc_init1(ram_addr_t ram_size,
 
 pc_register_ferr_irq(isa_get_irq(13));
 
+if (pci_enabled && usb_enabled) {
+if (usb_enabled == 2) {
+int slot = PCI_SLOT(piix3_devfn) + 1;
+ich9_ehci_with_companion_init(pci_bus, PCI_DEVFN(slot, 0), "ehci");
+} else {
+usb_uhci_piix3_init(pci_bus, piix3_devfn + 2);
+}
+}
+
 pc_vga_init(pci_enabled? pci_bus: NULL);
 
 if (xen_enabled()) {
@@ -172,10 +182,6 @@ static void pc_init1(ram_addr_t ram_size,
 pc_cmos_init(below_4g_mem_size, above_4g_mem_size, boot_device,
  idebus[0], idebus[1], rtc_state);
 
-if (pci_enabled && usb_enabled) {
-usb_uhci_piix3_init(pci_bus, piix3_devfn + 2);
-}
-
 if (pci_enabled && acpi_enabled) {
 i2c_bus *smbus;
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 37e54ee..c59b532 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -411,7 +411,7 @@ STEXI
 ETEXI
 
 DEF("usb", 0, QEMU_OPTION_usb,
-"-usbenable the USB driver (will be the default soon)\n",
+"-usbenable the USB 1.1 driver (will be the default soon)\n",
 QEMU_ARCH_ALL)
 STEXI
 USB options:
@@ -419,7 +419,16 @@ USB options:
 
 @item -usb
 @findex -usb
-Enable the USB driver (will be the default soon)
+Enable the USB 1.1 driver (will be the default soon)
+ETEXI
+
+DEF("usb2", 0, QEMU_OPTION_usb2,
+"-usb2   enable the USB 2.0 driver\n",
+QEMU_ARCH_ALL)
+STEXI
+@item -usb2
+@findex -usb2
+Enable the USB 2.0 driver
 ETEXI
 
 DEF("usbdevice", HAS_ARG, QEMU_OPTION_usbdevice,
diff --git a/vl.c b/vl.c
index 52402a2..f4f23ce 100644
--- a/vl.c
+++ b/vl.c
@@ -2698,6 +2698,9 @@ int main(int argc, char **argv, char **envp)
 case QEMU_OPTION_usb:
 usb_enabled = 1;
 break;
+case QEMU_OPTION_usb2:
+usb_enabled = 2;
+break;
 case QEMU_OPTION_usbdevice:
 usb_enabled = 1;
 add_device_config(DEV_USB, optarg);
-- 
1.7.1




Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Alexander Graf

On 01.07.2011, at 02:58, Benjamin Herrenschmidt wrote:

> On Thu, 2011-06-30 at 15:59 +, Yoder Stuart-B08248 wrote:
>> One feature we need for QEMU/KVM on embedded Power Architecture is the 
>> ability to do passthru assignment of SoC I/O devices and memory.  An 
>> important use case in embedded is creating static partitions-- 
>> taking physical memory and I/O devices (non-PCI) and partitioning
>> them between the host Linux and several virtual machines.   Things like
>> live migration would not be needed or supported in these types of scenarios.
>> 
>> SoC devices do not sit on a probeable bus and there are no identifiers 
>> like 01:00.0 with PCI that we can use to identify devices--  the host
>> Linux kernel is made aware of SoC I/O devices from nodes/properties in a 
>> device tree structure passed at boot.   QEMU needs to generate a
>> device tree to pass to the guest as well with all the guest's virtual
>> and physical resources.  Today a number of mostly complete guest device
>> trees are kept under ./pc-bios in QEMU, but this too static and
>> inflexible.
>> 
>> Some new mechanism is needed to assign SoC devices to guests, and we
>> (FSL + Alex Graf) have been discussing a few possible approaches
>> for doing this from QEMU and would like some feedback.
>> 
>> Some possibilities:
>> 
>> 1. Option 1.  Pass the host dev tree to QEMU and assign devices
>>   by device tree path
>> 
>> -dtb ./mpc8572ds.dtb -device assigned-soc-dev,dev=/soc/i2c@3000
>> 
>>   /soc/i2c@3000 is the device tree path to the assigned device.
>>   The device node 'i2c@3000' has some number of properties (e.g. 
>>   address, interrupt info) and possibly subnodes under
>>   it.   QEMU copies that node when generating the guest dev tree.
>>   See snippet of entire node:  http://paste2.org/p/1496460
> 
> Yuck (see below)
> 
>> 2. Option 2.  Pass the entire assigned device node as a string to
>>   QEMU
>> 
>> -device assigned-soc-dev,dev=/i2c@3000,dev-node='#address-cells = <1>;
>>  #size-cells = <0>; cell-index = <0>; compatible = "fsl-i2c";
>>  reg = <0xffe03000 0x100>; interrupts = <43 2>;
>>  interrupt-parent = <&mpic>; dfsrr;'
> 
> Beuark ! (see below)
> 
>>   This avoids needing to pass the host device tree, but could 
>>   get awkward-- the i2c example above is very simple, some device
>>   nodes are very large with a complex hierarchy of subnodes and 
>>   could be hundreds of lines of text to represent a single
>>   node.
>> 
>> It gets more complicated...
> 
> 
> So, from a qemu command line perspective, all you should have to do is
> pass qemu the device-tree -path- to the device you want to pass-trough
> (you may support passing a full hierarchy here).
> 
> That is for normal MMIO mapped SoC devices. Something else (individual
> i2c, usb, ...) will use specific virtualization of the corresponding
> busses.
> 
> Anything else sucks too much really.
> 
> From there, well, there's several approach inside qemu/kvm to handle
> that path. If you want to do things at the qemu level you can probably
> parse /proc/device-tree. But I'd personally just make it a kernel thing.
> 
> IE. I would have an ioctl to "instanciate" a pass-through device, that
> takes that path as an argument. I would make it return an anonymous fd
> which you can then use to mmap the resources, etc...

Yeah, one idea was to use VFIO here. We could for example modify the host 
device tree to occupy device we want to pass through with a specific 
compatibility parameter. Or we could try to steal the node during runtime. But 
I agree, reading the device tree data from a VFIO node sounds reasonable. If 
it's required.

> 
>> In some cases, modifications to device tree nodes may be needed.
>> An example-- sometimes a device tree property references another node 
>> and that relationship may not exist when assigned to a guest.
>> A "phy-handle" property may need to be deleted and a "fixed-link"
>> property added to a node representing a network device.
> 
> That's fishy. Why wouldn't you give full access to the MDIO ? It's
> shared ? Such things are so device-specific that they would have to be
> handled by device-specific quirks, which can live either in qemu or in
> the kernel.

Hrm, so you'd create a separate device for MDIO which can do pass-through of 
those?

> 
>> So in addition to assigning a device, a mechanism is needed to update 
>> device tree nodes.  So for the above example, maybe--
>> 
>> -device assigned-soc-dev,dev=/soc/ethernet@b2000,delete-prop=phy-handle,
>>  node-update="fixed-link = <2 1 1000 0 0>"
> 
> That's just so gross and error prone, borderline insane.

Alternatives:

  * not modify the device tree (unlikely to work)
  * pass a full device tree chunk to qemu instead of modification commands
  * ?

> 
>> The types of modifications needed--  deleting nodes, deleting properties, 
>> adding nodes, adding properties, adding properties that reference other
>> nodes, changing properties. This device tree transforma

Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Alexander Graf

On 01.07.2011, at 13:55, Paul Brook wrote:

> 
>> But the real challenge is how to expose the device to the guest device
>> tree. Especially when it comes to links between dt nodes, interrupt maps,
>> etc. We basically have 3 choices there:
>> 
>>  * take the host device tree pieces and modify them
>>  * provide device tree chunks for each device (manually or through qdev
>> parameters) * use the device tree as machine config file and base
>> everything on it (solves the linking problem)
>> 
>> The main question is which one would be the cleanest solution. And how
>> would it be implemented.
> 
> I don't think any of this is specific to device passthrough.  It occurs as 
> soon as you have any user-configurable parts of the machine (or even just a 
> nontrivial selection of machine variants).  My guess is the only reason you 
> haven't hit it before is because you're only emulated a single hard-coded 
> SoC/board.

Well, the real reason we haven't hit this before is that we don't have any 
devices in Qemu that are generic. We only have specific device emulation. This 
however would be a device that can handle hundreds of different backing 
devices, all with different requirements.

The infrastructure we have today simply isn't made for this. The question is 
how can we model it so that it will? :)


Alex




Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Paul Brook

> But the real challenge is how to expose the device to the guest device
> tree. Especially when it comes to links between dt nodes, interrupt maps,
> etc. We basically have 3 choices there:
> 
>   * take the host device tree pieces and modify them
>   * provide device tree chunks for each device (manually or through qdev
> parameters) * use the device tree as machine config file and base
> everything on it (solves the linking problem)
> 
> The main question is which one would be the cleanest solution. And how
> would it be implemented.

I don't think any of this is specific to device passthrough.  It occurs as 
soon as you have any user-configurable parts of the machine (or even just a 
nontrivial selection of machine variants).  My guess is the only reason you 
haven't hit it before is because you're only emulated a single hard-coded 
SoC/board.

Paul



Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Anthony Liguori

On 07/01/2011 06:40 AM, Alexander Graf wrote:


On 01.07.2011, at 02:58, Benjamin Herrenschmidt wrote:


On Thu, 2011-06-30 at 15:59 +, Yoder Stuart-B08248 wrote:

One feature we need for QEMU/KVM on embedded Power Architecture is the
ability to do passthru assignment of SoC I/O devices and memory.  An
important use case in embedded is creating static partitions--
taking physical memory and I/O devices (non-PCI) and partitioning
them between the host Linux and several virtual machines.   Things like
live migration would not be needed or supported in these types of scenarios.

SoC devices do not sit on a probeable bus and there are no identifiers
like 01:00.0 with PCI that we can use to identify devices--  the host
Linux kernel is made aware of SoC I/O devices from nodes/properties in a
device tree structure passed at boot.   QEMU needs to generate a
device tree to pass to the guest as well with all the guest's virtual
and physical resources.  Today a number of mostly complete guest device
trees are kept under ./pc-bios in QEMU, but this too static and
inflexible.

Some new mechanism is needed to assign SoC devices to guests, and we
(FSL + Alex Graf) have been discussing a few possible approaches
for doing this from QEMU and would like some feedback.

Some possibilities:

1. Option 1.  Pass the host dev tree to QEMU and assign devices
   by device tree path

 -dtb ./mpc8572ds.dtb -device assigned-soc-dev,dev=/soc/i2c@3000

   /soc/i2c@3000 is the device tree path to the assigned device.
   The device node 'i2c@3000' has some number of properties (e.g.
   address, interrupt info) and possibly subnodes under
   it.   QEMU copies that node when generating the guest dev tree.
   See snippet of entire node:  http://paste2.org/p/1496460


Yuck (see below)


2. Option 2.  Pass the entire assigned device node as a string to
   QEMU

 -device assigned-soc-dev,dev=/i2c@3000,dev-node='#address-cells =<1>;
  #size-cells =<0>; cell-index =<0>; compatible = "fsl-i2c";
  reg =<0xffe03000 0x100>; interrupts =<43 2>;
  interrupt-parent =<&mpic>; dfsrr;'


Beuark ! (see below)


   This avoids needing to pass the host device tree, but could
   get awkward-- the i2c example above is very simple, some device
   nodes are very large with a complex hierarchy of subnodes and
   could be hundreds of lines of text to represent a single
   node.

It gets more complicated...



So, from a qemu command line perspective, all you should have to do is
pass qemu the device-tree -path- to the device you want to pass-trough
(you may support passing a full hierarchy here).

That is for normal MMIO mapped SoC devices. Something else (individual
i2c, usb, ...) will use specific virtualization of the corresponding
busses.

Anything else sucks too much really.

 From there, well, there's several approach inside qemu/kvm to handle
that path. If you want to do things at the qemu level you can probably
parse /proc/device-tree. But I'd personally just make it a kernel thing.

IE. I would have an ioctl to "instanciate" a pass-through device, that
takes that path as an argument. I would make it return an anonymous fd
which you can then use to mmap the resources, etc...


Yeah, one idea was to use VFIO here. We could for example modify the host 
device tree to occupy device we want to pass through with a specific 
compatibility parameter. Or we could try to steal the node during runtime. But 
I agree, reading the device tree data from a VFIO node sounds reasonable. If 
it's required.


That makes it very specific to systems that use device trees.

To do the same for ARM platforms or x86, you would need to invent yet 
another mechanism.


Passing through arbitrary MMIO is fairly straight forward (likewise with 
PIO).  Passing through IRQs is a bit less straight forward and perhaps 
VFIO is the answer here.


I don't see a problem with QEMU figuring out what a device's resources 
are and doing the assignment.


Regards,

Anthony Liguori



Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Anthony Liguori

On 06/30/2011 07:58 PM, Benjamin Herrenschmidt wrote:

On Thu, 2011-06-30 at 15:59 +, Yoder Stuart-B08248 wrote:

This avoids needing to pass the host device tree, but could
get awkward-- the i2c example above is very simple, some device
nodes are very large with a complex hierarchy of subnodes and
could be hundreds of lines of text to represent a single
node.

It gets more complicated...



So, from a qemu command line perspective, all you should have to do is
pass qemu the device-tree -path- to the device you want to pass-trough
(you may support passing a full hierarchy here).


I agree in principle but I think it should be done in a slightly 
different way.


I think we ought to support composing a device by passthrough.  For 
instance, something like:


[physical-device "mydev"]
region[0].file = "/dev/mem"
region[0].guest_address = "0x42232000"
region[0].file_offset = "0x23423400"
region[0].size = "4096"
irq[0].guest_irq = "10"
irq[0].host_irq = "10"

This should be independent of anything to do with device tree.  This 
would be useful for x86 too to assign platform devices (like the HPET).


I think there should be a separate mechanism to manipulate the guest 
device tree, just like there are mechanisms to manipulate the guest's 
ACPI tables.


Given these two mechanisms, there should be a simple command line like 
Ben has suggested that just takes a host device tree path and Just 
Works.  It really is just a convenience interface though.


With raw mechanisms like I described above, it would give you the 
flexibility to pass through a device with a modified host tree fragment 
without having an overly complicated command line interface for the more 
common case.


Regards,

Anthony Liguori



Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Anthony Liguori

On 07/01/2011 07:02 AM, Alexander Graf wrote:


On 01.07.2011, at 13:55, Paul Brook wrote:




But the real challenge is how to expose the device to the guest device
tree. Especially when it comes to links between dt nodes, interrupt maps,
etc. We basically have 3 choices there:

  * take the host device tree pieces and modify them
  * provide device tree chunks for each device (manually or through qdev
parameters) * use the device tree as machine config file and base
everything on it (solves the linking problem)

The main question is which one would be the cleanest solution. And how
would it be implemented.


I don't think any of this is specific to device passthrough.  It occurs as
soon as you have any user-configurable parts of the machine (or even just a
nontrivial selection of machine variants).  My guess is the only reason you
haven't hit it before is because you're only emulated a single hard-coded
SoC/board.


Well, the real reason we haven't hit this before is that we don't have any 
devices in Qemu that are generic. We only have specific device emulation. This 
however would be a device that can handle hundreds of different backing 
devices, all with different requirements.

The infrastructure we have today simply isn't made for this. The question is 
how can we model it so that it will? :)


Our infrastructure is quite capable of handling this.  It has many other 
problems but I think the only thing really missing is the way to have 
lists of parameters.  That seems easy to solve though.


Regards,

Anthony Liguori




Alex






Re: [Qemu-devel] PCI: how handle multifunction / compound devices best?

2011-07-01 Thread Anthony Liguori

On 07/01/2011 06:27 AM, Gerd Hoffmann wrote:

   Hi folks,

I'm still looking for a sane way to handle multifunction pci devices,
specifically a EHCI USB controller with UHCI companion controllers.

Here comes a small (incomplete[1]) two patch series to make the issue
more clear.  The first patch adds a function which creates all devices
with the properties set as needed.  The second patch adds a '-usb2'
switch to windup this in a user-friendly way.

Adding a -usb2 switch sucks though.  I'd prefer to have some way to
create such devices via -device, but without asking the user to create
all functions individually on the command line, i.e. have something
along the lines of ...

  qemu -device ich9-usb,slot=08


This is the place where are current infrastructure pretty much stinks.

The first problem is that the device factory interface should be 
involved with device addressing.  We really should have:


qemu -device ich9-usb,id=uhci0 -set i440fx.slot[8]=uhci0

The device doesn't care what it's PCI slot address is.

ich9-usb should be able to create its functions too as part of its 
initialization.  I don't see an easy answer here unless we do away with 
the current bus model.  There's no way we can do composition given the 
bus model we have.


Best we could do is syntactic sugar for what you have below.

Regards,

Anthony Liguori



... instead of ...

  qemu \
   -device ich9-usb-ehci1,addr=08.7,multifunction=on,id=ehci \
   -device 
ich9-usb-uhci1,addr=08.0,multifunction=on,masterbus=ehci.0,firstport=0 \
   -device 
ich9-usb-uhci2,addr=08.1,multifunction=on,masterbus=ehci.0,firstport=2 \
   -device 
ich9-usb-uhci3,addr=08.2,multifunction=on,masterbus=ehci.0,firstport=4

Suggestions?

cheers,
   Gerd

[1] full series @ http://www.kraxel.org/cgit/qemu/log/?h=usb.18

Gerd Hoffmann (2):
   usb: add ich9_ehci_with_companion_init()
   usb: windup ich9_ehci_with_companion_init via -usb2 switch.

  hw/pc_piix.c|   14 ++
  hw/usb-ehci.c   |   49 +
  hw/usb.h|3 +++
  qemu-options.hx |   13 +++--
  vl.c|3 +++
  5 files changed, 76 insertions(+), 6 deletions(-)







Re: [Qemu-devel] [PATCH v6 01/12] VMDK: introduce VmdkExtent

2011-07-01 Thread Kevin Wolf
Am 01.07.2011 06:55, schrieb Fam Zheng:
> Introduced VmdkExtent array into BDRVVmdkState, enable holding multiple
> image extents for multiple file image support.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/vmdk.c |  355 
> +-
>  1 files changed, 252 insertions(+), 103 deletions(-)
> 
> diff --git a/block/vmdk.c b/block/vmdk.c
> index 922b23d..f26137d 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -60,7 +60,11 @@ typedef struct {
>  
>  #define L2_CACHE_SIZE 16
>  
> -typedef struct BDRVVmdkState {
> +typedef struct VmdkExtent {
> +BlockDriverState *file;
> +bool flat;
> +int64_t sectors;
> +int64_t end_sector;
>  int64_t l1_table_offset;
>  int64_t l1_backup_table_offset;
>  uint32_t *l1_table;
> @@ -74,7 +78,12 @@ typedef struct BDRVVmdkState {
>  uint32_t l2_cache_counts[L2_CACHE_SIZE];
>  
>  unsigned int cluster_sectors;
> +} VmdkExtent;
> +
> +typedef struct BDRVVmdkState {
> +int num_extents;
>  uint32_t parent_cid;
> +VmdkExtent *extents;
>  } BDRVVmdkState;
>  
>  typedef struct VmdkMetaData {
> @@ -105,6 +114,19 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, 
> const char *filename)
>  #define DESC_SIZE 20*SECTOR_SIZE // 20 sectors of 512 bytes each
>  #define HEADER_SIZE 512  // first sector of 512 bytes
>  
> +static void vmdk_free_extents(BlockDriverState *bs)
> +{
> +int i;
> +BDRVVmdkState *s = bs->opaque;
> +
> +for (i = 0; i < s->num_extents; i++) {
> +qemu_free(s->extents[i].l1_table);
> +qemu_free(s->extents[i].l2_cache);
> +qemu_free(s->extents[i].l1_backup_table);
> +}
> +qemu_free(s->extents);
> +}
> +
>  static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
>  {
>  char desc[DESC_SIZE];
> @@ -156,8 +178,8 @@ static int vmdk_write_cid(BlockDriverState *bs, uint32_t 
> cid)
>  
>  static int vmdk_is_cid_valid(BlockDriverState *bs)
>  {
> -#ifdef CHECK_CID
>  BDRVVmdkState *s = bs->opaque;
> +#ifdef CHECK_CID
>  BlockDriverState *p_bs = bs->backing_hd;
>  uint32_t cur_pcid;

Hm, intentional? The only thing this seems to change is that it causes
an 'unused variable' warning of CHECK_CID isn't defined.

> @@ -358,11 +380,50 @@ static int vmdk_parent_open(BlockDriverState *bs)
>  return 0;
>  }
>  
> +/* Create and append extent to the extent array. Return the added VmdkExtent
> + * address. return NULL if allocation failed. */
> +static VmdkExtent *vmdk_add_extent(BlockDriverState *bs,
> +   BlockDriverState *file, bool flat, int64_t 
> sectors,
> +   int64_t l1_offset, int64_t l1_backup_offset,
> +   uint32_t l1_size,
> +   int l2_size, unsigned int cluster_sectors)
> +{
> +VmdkExtent *extent;
> +BDRVVmdkState *s = bs->opaque;
> +
> +s->extents = qemu_realloc(s->extents,
> +  (s->num_extents + 1) * sizeof(VmdkExtent));
> +extent = &s->extents[s->num_extents];
> +s->num_extents++;
> +
> +memset(extent, 0, sizeof(VmdkExtent));
> +extent->file = file;
> +extent->flat = flat;
> +extent->sectors = sectors;
> +extent->l1_table_offset = l1_offset;
> +extent->l1_backup_table_offset = l1_backup_offset;
> +extent->l1_size = l1_size;
> +extent->l1_entry_sectors = l2_size * cluster_sectors;
> +extent->l2_size = l2_size;
> +extent->cluster_sectors = cluster_sectors;
> +
> +if (s->num_extents > 1) {
> +extent->end_sector = (*(extent - 1)).end_sector + extent->sectors;
> +} else {
> +extent->end_sector = extent->sectors;
> +}
> +bs->total_sectors = extent->end_sector;

This means that extents must be added in ascending order, so that the
latest extent describes the highest sector numbers. Should be documented.

> +return extent;
> +}
> +
> +
>  static int vmdk_open(BlockDriverState *bs, int flags)
>  {
>  BDRVVmdkState *s = bs->opaque;
>  uint32_t magic;
> -int l1_size, i;
> +int i;
> +uint32_t l1_size, l1_entry_sectors;
> +VmdkExtent *extent = NULL;
>  
>  if (bdrv_pread(bs->file, 0, &magic, sizeof(magic)) != sizeof(magic))
>  goto fail;
> @@ -370,32 +431,34 @@ static int vmdk_open(BlockDriverState *bs, int flags)
>  magic = be32_to_cpu(magic);
>  if (magic == VMDK3_MAGIC) {
>  VMDK3Header header;
> -
> -if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header)) != 
> sizeof(header))
> +if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
> +!= sizeof(header)) {
>  goto fail;
> -s->cluster_sectors = le32_to_cpu(header.granularity);
> -s->l2_size = 1 << 9;
> -s->l1_size = 1 << 6;
> -bs->total_sectors = le32_to_cpu(header.disk_sectors);
> -s->l1_table_offset = le32_to_cpu(header.l1dir_offset) << 9;
> -s->l1_ba

Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Paul Brook
> > So, from a qemu command line perspective, all you should have to do is
> > pass qemu the device-tree -path- to the device you want to pass-trough
> > (you may support passing a full hierarchy here).
> 
> I agree in principle but I think it should be done in a slightly
> different way.
> 
> I think we ought to support composing a device by passthrough.  For
> instance, something like:
> 
> [physical-device "mydev"]
> region[0].file = "/dev/mem"
> region[0].guest_address = "0x42232000"
> region[0].file_offset = "0x23423400"
> region[0].size = "4096"
> irq[0].guest_irq = "10"
> irq[0].host_irq = "10"
> 
> This should be independent of anything to do with device tree.  This
> would be useful for x86 too to assign platform devices (like the HPET).

I'm not quite sure what you're getting at here.  IMO there should be little or 
no need for special knowledge of passthrough devices.  They should just be 
annother qdev device, configured in the normal way.  e.g.:
   -device sysbus-host,hostdev=whatever,normal_mmio_and_irq_config
Should work the same as adding any other device. If it doesn't then we should 
fix that.  This is an example of why it's good to have device features (IRQs, 
MMIO regions, sockets, or whatever we call them) registered when the device is 
instantiated, not relying on pre-compiled device decriptors/property lists.  
In the latter case you probably need explicit variants for differnt numbers of 
IRQs, MMIO regions, etc.

While I'm thinking about it, we already have exactly this for USB (i.e. the 
usb-host device).

> I think there should be a separate mechanism to manipulate the guest
> device tree, just like there are mechanisms to manipulate the guest's
> ACPI tables.

I aggree.  Any sort of device tree (IIUC ACPI tables are in principle giving 
the same information) is, in practice, going to need to be assembled at 
runtime.  This needs some mechanism for devices to describe themselves, 
probably largely independent of actual machine/device creation code.

We've got away without it thus far because the only real place where we have 
nontrivial user-specified machine variants is on the PCI bus.  Devices there 
are for the most part self-describing so the guest firmware/OS can probe 
hardware itself.

Paul



Re: [Qemu-devel] [PATCH v6 04/12] VMDK: separate vmdk_open by format version

2011-07-01 Thread Kevin Wolf
Am 01.07.2011 06:55, schrieb Fam Zheng:
> Separate vmdk_open by subformats to:
> * vmdk_open_vmdk3
> * vmdk_open_vmdk4
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/vmdk.c |  197 ++---
>  1 files changed, 131 insertions(+), 66 deletions(-)
> 
> diff --git a/block/vmdk.c b/block/vmdk.c
> index 4684670..03248f3 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -456,74 +456,26 @@ static VmdkExtent *vmdk_add_extent(BlockDriverState *bs,
>  return extent;
>  }
>  
> -
> -static int vmdk_open(BlockDriverState *bs, int flags)
> +static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent)
>  {
> -BDRVVmdkState *s = bs->opaque;
> -uint32_t magic;
> -int i;
> -uint32_t l1_size, l1_entry_sectors;
> -VmdkExtent *extent = NULL;
> -
> -if (bdrv_pread(bs->file, 0, &magic, sizeof(magic)) != sizeof(magic))
> -goto fail;
> -
> -magic = be32_to_cpu(magic);
> -if (magic == VMDK3_MAGIC) {
> -VMDK3Header header;
> -if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
> -!= sizeof(header)) {
> -goto fail;
> -}
> -extent = vmdk_add_extent(bs, bs->file, false,
> -  le32_to_cpu(header.disk_sectors),
> -  le32_to_cpu(header.l1dir_offset) << 9, 0,
> -  1 << 6, 1 << 9, 
> le32_to_cpu(header.granularity));
> -} else if (magic == VMDK4_MAGIC) {
> -VMDK4Header header;
> -if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
> -!= sizeof(header)) {
> -goto fail;
> -}
> -l1_entry_sectors = le32_to_cpu(header.num_gtes_per_gte)
> -* le64_to_cpu(header.granularity);
> -l1_size = (le64_to_cpu(header.capacity) + l1_entry_sectors - 1)
> -/ l1_entry_sectors;
> -extent = vmdk_add_extent(bs, bs->file, false,
> -  le64_to_cpu(header.capacity),
> -  le64_to_cpu(header.gd_offset) << 9,
> -  le64_to_cpu(header.rgd_offset) << 9,
> -  l1_size,
> -  le32_to_cpu(header.num_gtes_per_gte),
> -  le64_to_cpu(header.granularity));
> -if (extent->l1_entry_sectors <= 0) {
> -goto fail;
> -}
> -// try to open parent images, if exist
> -if (vmdk_parent_open(bs) != 0)
> -goto fail;
> -// write the CID once after the image creation
> -s->parent_cid = vmdk_read_cid(bs,1);
> -} else {
> -goto fail;
> -}
> -
> -/* sum up the total sectors */
> -bs->total_sectors = 0;
> -for (i = 0; i < s->num_extents; i++) {
> -bs->total_sectors += s->extents[i].sectors;
> -}
> +int ret;
> +int l1_size, i;
>  
>  /* read the L1 table */
>  l1_size = extent->l1_size * sizeof(uint32_t);
>  extent->l1_table = qemu_malloc(l1_size);
> -if (bdrv_pread(bs->file,
> -extent->l1_table_offset,
> -extent->l1_table,
> -l1_size)
> -!= l1_size) {
> +if (!extent->l1_table) {
> +ret = -ENOMEM;
>  goto fail;
>  }

qemu_malloc() never fails, so you don't need this check.

> +ret = bdrv_pread(bs->file,
> +extent->l1_table_offset,
> +extent->l1_table,
> +l1_size);
> +if (ret != l1_size) {
> +ret = ret < 0 ? ret : -EIO;
> +goto fail_l1;
> +}

bdrv_pread only ever returns 0 for success or -errno for errors. So you
can simplify the code like this:

ret = bdrv_pread(...);
if (ret < 0) {
goto fail_l1;
}

You have the same pattern in other places, too.

>  for (i = 0; i < extent->l1_size; i++) {
>  le32_to_cpus(&extent->l1_table[i]);
>  }
> @@ -538,16 +490,129 @@ static int vmdk_open(BlockDriverState *bs, int flags)
>  goto fail;
>  }
>  for (i = 0; i < extent->l1_size; i++) {
> +if (!extent->l1_backup_table) {
> +ret = -ENOMEM;
> +goto fail_l1;
> +}
> +}

Hm, isn't this checking the same thing multiple times?

Anyway, qemu_malloc() still doesn't fail. ;-)

> +ret = bdrv_pread(bs->file,
> +extent->l1_backup_table_offset,
> +extent->l1_backup_table,
> +l1_size);

This duplicates a read from a few lines above. Mismerge?

> +if (ret != l1_size) {
> +ret = ret < 0 ? ret : -EIO;
> +goto fail;
> +}
> +for (i = 0; i < extent->l1_size; i++) {
>  le32_to_cpus(&extent->l1_backup_table[i]);
>  }
>  }
>  
>  extent->l2_cache =
>  qemu_malloc(extent->l2_size * L2_CACHE_SIZE * sizeof(uint32_t));
> +i

Re: [Qemu-devel] [PATCH v6 04/12] VMDK: separate vmdk_open by format version

2011-07-01 Thread Fam Zheng
>
> bdrv_pread only ever returns 0 for success or -errno for errors. So you
> can simplify the code like this:
>
> ret = bdrv_pread(...);
> if (ret < 0) {
>    goto fail_l1;
> }
>
> You have the same pattern in other places, too.
>

I think bdrv_pead do return the read bytes, did you mean bdrv_read here? :)

-- 
Best regards!
Fam Zheng



Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Jakub Jermar
On 1.7.2011 12:41, Artyom Tarasenko wrote:
> Looks like it's a pretty nice test case.
> 
> The test case scenario is to load the initial values into the
> registers (you already know them),
> calculate the mins, cmp the result with expected and jump somewhere.
> Since it happens with interrupts disabled, we don't need an OS or bios
> to test that. The binary just has to have an entry point at 0x20,
> then we can load it instead of bios.
> 
> Care to produce the test case? (Otherwise I'll do it, but probably not
> until the weekend after the next one).

Yes I have a small reproducible testcase, see the attachments.

When _not_ singlestepping via GDB's `stepi`, the testcase will fail and
crash Qemu like this:

qemu: fatal: Trap 0x0101 while trap level (5) >= MAXTL (5), Error state


On the other hand, when I attach GDB to Qemu and step through all
instructions using `stepi`, the testcase will succeed and crash Qemu
like this:

qemu: fatal: Trap 0x0100 while trap level (5) >= MAXTL (5), Error state


Mind the difference in the trap type - 0x100 for success, 0x101 for failure.

This is how I run the test:

Without GDB:
$ qemu-system-sparc64 -bios ./testcase

With GDB:
$ qemu-system-sparc64 -bios ./testcase -s -S

>From another terminal:
$ /usr/local/cross/sparc64/bin/sparc64-linux-gnu-gdb
(gdb) set architecture sparc:v9
(gdb) target remote localhost:1234
(gdb) stepi
...

Hope this helps to fix the problem.

Jakub
.PHONY=all
all:
rm *.o testcase
/usr/local/cross/sparc64/bin/sparc64-linux-gnu-as -g testcase.S -o 
testcase.o
/usr/local/cross/sparc64/bin/sparc64-linux-gnu-ld -Ttext 0x00 
testcase.o -o testcase --oformat binary



testcase
Description: Binary data
.global _start 

.register %g2, #scratch
.register %g3, #scratch

.text

.space 0x20

_start:
set 110393, %i1
set 0, %i2
set 131072, %g1
set 0x40, %g3

cmp  %i1, %g3
srl  %g1, 8, %g4
srl  %g1, 0x18, %g1
or  %g4, %g1, %g4
sllx  %g4, 0x30, %g2
srlx  %g2, 0x30, %g2
udivx  %i2, %g2, %g1
mulx  %g1, %g2, %g1
movgu  %xcc, %g3, %i1
sub  %i2, %g1, %g1
sub  %g2, %g1, %g1
cmp  %i1, %g1
bgu  %xcc, 0f
nop

succ:
ta 0

fail:
0:
ta 1



Re: [Qemu-devel] [PATCH v6 04/12] VMDK: separate vmdk_open by format version

2011-07-01 Thread Kevin Wolf
Am 01.07.2011 15:06, schrieb Fam Zheng:
>>
>> bdrv_pread only ever returns 0 for success or -errno for errors. So you
>> can simplify the code like this:
>>
>> ret = bdrv_pread(...);
>> if (ret < 0) {
>>goto fail_l1;
>> }
>>
>> You have the same pattern in other places, too.
> 
> I think bdrv_pead do return the read bytes, did you mean bdrv_read here? :)

Yes, you're right, it returns the read bytes. But it's always -errno or
the full byte count, there are no short reads. So my explanation wasn't
quite right, but the suggestion stays the same. :-)

Kevin



Re: [Qemu-devel] [PATCH 1/3] block: drive_init(): Fix indentation

2011-07-01 Thread Luiz Capitulino
On Fri, 01 Jul 2011 09:16:38 +0200
Markus Armbruster  wrote:

> Luiz Capitulino  writes:
> 
> > Some constructions in that function have broken indentation. Fix them.
> >
> > Signed-off-by: Luiz Capitulino 
> > ---
> >  blockdev.c |   52 ++--
> >  1 files changed, 26 insertions(+), 26 deletions(-)
> >
> > diff --git a/blockdev.c b/blockdev.c
> > index 7d579d6..27bf68a 100644
> > --- a/blockdev.c
> > +++ b/blockdev.c
> > @@ -272,23 +272,23 @@ DriveInfo *drive_init(QemuOpts *opts, int 
> > default_to_scsi)
> >  if (type == IF_COUNT) {
> >  error_report("unsupported bus type '%s'", buf);
> >  return NULL;
> > -   }
> > +   }
> 
> Old indentation is correct, apart from tabs.
> 
> New indentation is incorrect, and fails to remove tabs.
> 
> if (type == IF_COUNT) {
> error_report("unsupported bus type '%s'", buf);
> return NULL;
>   }
> ^
> tab hereincorrect indentation

Yeah, I realized that after Stefan's comment.

> Check your editor settings.
> 
> I don't see "broken" indentation anywhere in this function.  A few lines
> are off: indented 3 instead of 4.  If we want to fix such things, I'd
> prefer it done globally.

Ok, I'll drop this patch from the series then and let to do it globally
later.



Re: [Qemu-devel] [PATCH v6 05/12] VMDK: add field BDRVVmdkState.desc_offset

2011-07-01 Thread Kevin Wolf
Am 01.07.2011 06:55, schrieb Fam Zheng:
> There are several occurrence of magic number 0x200 as the descriptor
> offset within mono sparse image file. This is not the case for images
> with separate descriptor file. So a field is added to BDRVVmdkState to
> hold the correct value.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/vmdk.c |   26 --
>  1 files changed, 16 insertions(+), 10 deletions(-)
> 

> @@ -296,7 +300,6 @@ static int vmdk_snapshot_create(const char *filename, 
> const char *backing_file)
>  ret = -errno;
>  goto fail;
>  }
> -/* the descriptor offset = 0x200 */
>  if (lseek(p_fd, 0x200, SEEK_SET) == -1) {
>  ret = -errno;
>  goto fail;

Is this comment wrong or why do you remove it? I think for the newly
created image it is correct and with this patch we have an unexplained
magic 0x200 in the code.

Kevin



Re: [Qemu-devel] [PATCH 2/3] scsi: replace 'tag' with 'hba_private' pointer

2011-07-01 Thread Hannes Reinecke

On 07/01/2011 10:27 AM, Paolo Bonzini wrote:

On 07/01/2011 09:42 AM, Hannes Reinecke wrote:

'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves the lookup for driver handling several commands
in parallel.


This makes tracing a bit harder to follow. Perhaps you can keep the
transport tag (a uint64_t) in the SCSIRequest for debugging purposes?


Hmm. The transport tag wouldn't have any meaning outside scsi-bus.c.
And it's a 64-bit value. So why can't we use the hba_private pointer 
directly here?
After some I/O has been ongoing the linear 'tag' number becomes 
unreadable very quickly, so there's not much difference here ...


Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)



[Qemu-devel] [PATCH 1/2] block: drive_init(): Simplify interface type setting

2011-07-01 Thread Luiz Capitulino
Signed-off-by: Luiz Capitulino 
---
 blockdev.c |   12 
 1 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 7d579d6..470be71 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -240,14 +240,6 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
 int ret;
 
 translation = BIOS_ATA_TRANSLATION_AUTO;
-
-if (default_to_scsi) {
-type = IF_SCSI;
-pstrcpy(devname, sizeof(devname), "scsi");
-} else {
-type = IF_IDE;
-pstrcpy(devname, sizeof(devname), "ide");
-}
 media = MEDIA_DISK;
 
 /* extract parameters */
@@ -273,7 +265,11 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
 error_report("unsupported bus type '%s'", buf);
 return NULL;
}
+} else {
+type = default_to_scsi ? IF_SCSI : IF_IDE;
+pstrcpy(devname, sizeof(devname), if_name[type]);
 }
+
 max_devs = if_max_devs[type];
 
 if (cyls || heads || secs) {
-- 
1.7.6.49.g033c2




Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Anthony Liguori

On 07/01/2011 07:52 AM, Paul Brook wrote:

So, from a qemu command line perspective, all you should have to do is
pass qemu the device-tree -path- to the device you want to pass-trough
(you may support passing a full hierarchy here).


I agree in principle but I think it should be done in a slightly
different way.

I think we ought to support composing a device by passthrough.  For
instance, something like:

[physical-device "mydev"]
region[0].file = "/dev/mem"
region[0].guest_address = "0x42232000"
region[0].file_offset = "0x23423400"
region[0].size = "4096"
irq[0].guest_irq = "10"
irq[0].host_irq = "10"

This should be independent of anything to do with device tree.  This
would be useful for x86 too to assign platform devices (like the HPET).


I'm not quite sure what you're getting at here.  IMO there should be little or
no need for special knowledge of passthrough devices.  They should just be
annother qdev device, configured in the normal way.  e.g.:
-device sysbus-host,hostdev=whatever,normal_mmio_and_irq_config


What I wrote about is just readconfig syntax.  It's the same as:

-device physical-device,id=mydev,region[0].file=/dev/mem,

Regards,

Anthony Liguori


I think there should be a separate mechanism to manipulate the guest
device tree, just like there are mechanisms to manipulate the guest's
ACPI tables.


I aggree.  Any sort of device tree (IIUC ACPI tables are in principle giving
the same information) is, in practice, going to need to be assembled at
runtime.  This needs some mechanism for devices to describe themselves,
probably largely independent of actual machine/device creation code.

We've got away without it thus far because the only real place where we have
nontrivial user-specified machine variants is on the PCI bus.  Devices there
are for the most part self-describing so the guest firmware/OS can probe
hardware itself.

Paul






Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Jakub Jermar
On 1.7.2011 14:57, Jakub Jermar wrote:
> On 1.7.2011 12:41, Artyom Tarasenko wrote:
>> Looks like it's a pretty nice test case.
>>
>> The test case scenario is to load the initial values into the
>> registers (you already know them),
>> calculate the mins, cmp the result with expected and jump somewhere.
>> Since it happens with interrupts disabled, we don't need an OS or bios
>> to test that. The binary just has to have an entry point at 0x20,
>> then we can load it instead of bios.
>>
>> Care to produce the test case? (Otherwise I'll do it, but probably not
>> until the weekend after the next one).
> 
> Yes I have a small reproducible testcase, see the attachments.

Actually, the testcase can be further reduced into:

.global _start

.text

.space 0x20

_start:
set 110393, %i1
set 0x40, %i2

cmp  %i1, %i2
udivx  %g0, 1, %g0
movgu  %xcc, %i2, %i1
cmp  %i1, 512
bgu  %xcc, 0f
nop

succ:
ta 0

fail:
0:
ta 1

The presence of the `udivx` instruction seems to be essential. Even
though it has no effect on the computation, removing it will make the
testcase non-reproducible.

Jakub



Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Laurent Desnogues
On Fri, Jul 1, 2011 at 2:57 PM, Jakub Jermar  wrote:
[...]
> When _not_ singlestepping via GDB's `stepi`, the testcase will fail and
> crash Qemu like this:
>
> qemu: fatal: Trap 0x0101 while trap level (5) >= MAXTL (5), Error state
> 
>
> On the other hand, when I attach GDB to Qemu and step through all
> instructions using `stepi`, the testcase will succeed and crash Qemu
> like this:
>
> qemu: fatal: Trap 0x0100 while trap level (5) >= MAXTL (5), Error state
> 
>
> Mind the difference in the trap type - 0x100 for success, 0x101 for failure.
>
> This is how I run the test:
>
> Without GDB:
> $ qemu-system-sparc64 -bios ./testcase
>
> With GDB:
> $ qemu-system-sparc64 -bios ./testcase -s -S
>
> From another terminal:
> $ /usr/local/cross/sparc64/bin/sparc64-linux-gnu-gdb
> (gdb) set architecture sparc:v9
> (gdb) target remote localhost:1234
> (gdb) stepi
> ...
>
> Hope this helps to fix the problem.

You don't have to use gdb to reproduce the issue, just add -singlestep
when running qemu.

I find it odd that udivx is using cpu_cc_src and cpu_cc_src2.  Using
dedicated local temps seems to fix the issue.

HTH,

Laurent



[Qemu-devel] [PATCH V2] [PowerPC][RFC] booke timers

2011-07-01 Thread Fabien Chouteau
While working on the emulation of the freescale p2010 (e500v2) I realized that
there's no implementation of booke's timers features. Currently mpc8544 uses
ppc_emb (ppc_emb_timers_init) which is close but not exactly like booke (for
example booke uses different SPR).

This is a first attempt for a distinct and clean implementation of booke's
timers.

Please feel free to comment.

V2:
  - Fix fixed timer, now trigger each time the selected
bit switch from 0 to 1.
  - Fix e500 criterion.
  - Trigger an interrupt when user set DIE/FIE/WIE while
DIS/FIS/WIS is already set.
  - Minor fixes (mask definition, variable name...).
  - rename ppc_emb to ppc_40x

Signed-off-by: Fabien Chouteau 
---
 Makefile.target |2 +-
 hw/ppc.c|  126 +++-
 hw/ppc.h|   25 -
 hw/ppc4xx_devs.c|2 +-
 hw/ppc_booke.c  |  266 +++
 hw/ppce500_mpc8544ds.c  |4 +-
 hw/virtex_ml507.c   |   11 +--
 target-ppc/cpu.h|   22 
 target-ppc/helper.c |   12 ++-
 target-ppc/translate_init.c |   43 +++-
 10 files changed, 412 insertions(+), 101 deletions(-)
 create mode 100644 hw/ppc_booke.c

diff --git a/Makefile.target b/Makefile.target
index 38afdb8..f0d4062 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -242,7 +242,7 @@ obj-i386-$(CONFIG_KVM) += kvmclock.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
-obj-ppc-y = ppc.o
+obj-ppc-y = ppc.o ppc_booke.o
 obj-ppc-y += vga.o
 # PREP target
 obj-ppc-y += i8259.o mc146818rtc.o
diff --git a/hw/ppc.c b/hw/ppc.c
index 9157719..f13f4ea 100644
--- a/hw/ppc.c
+++ b/hw/ppc.c
@@ -50,7 +50,7 @@
 static void cpu_ppc_tb_stop (CPUState *env);
 static void cpu_ppc_tb_start (CPUState *env);
 
-static void ppc_set_irq (CPUState *env, int n_IRQ, int level)
+void ppc_set_irq (CPUState *env, int n_IRQ, int level)
 {
 unsigned int old_pending = env->pending_interrupts;
 
@@ -423,25 +423,8 @@ void ppce500_irq_init (CPUState *env)
 }
 /*/
 /* PowerPC time base and decrementer emulation */
-struct ppc_tb_t {
-/* Time base management */
-int64_t  tb_offset;/* Compensation*/
-int64_t  atb_offset;   /* Compensation*/
-uint32_t tb_freq;  /* TB frequency*/
-/* Decrementer management */
-uint64_t decr_next;/* Tick for next decr interrupt*/
-uint32_t decr_freq;/* decrementer frequency   */
-struct QEMUTimer *decr_timer;
-/* Hypervisor decrementer management */
-uint64_t hdecr_next;/* Tick for next hdecr interrupt  */
-struct QEMUTimer *hdecr_timer;
-uint64_t purr_load;
-uint64_t purr_start;
-void *opaque;
-};
 
-static inline uint64_t cpu_ppc_get_tb(ppc_tb_t *tb_env, uint64_t vmclk,
-  int64_t tb_offset)
+uint64_t cpu_ppc_get_tb(ppc_tb_t *tb_env, uint64_t vmclk, int64_t tb_offset)
 {
 /* TB time in tb periods */
 return muldiv64(vmclk, tb_env->tb_freq, get_ticks_per_sec()) + tb_offset;
@@ -678,18 +661,23 @@ static void __cpu_ppc_store_decr (CPUState *env, uint64_t 
*nextp,
 decr, value);
 now = qemu_get_clock_ns(vm_clock);
 next = now + muldiv64(value, get_ticks_per_sec(), tb_env->decr_freq);
-if (is_excp)
+if (is_excp) {
 next += *nextp - now;
-if (next == now)
+}
+if (next == now) {
 next++;
+}
 *nextp = next;
 /* Adjust timer */
 qemu_mod_timer(timer, next);
 /* If we set a negative value and the decrementer was positive,
- * raise an exception.
+ * raise an exception (not for booke).
  */
-if ((value & 0x8000) && !(decr & 0x8000))
+if (! (env->insns_flags & PPC_BOOKE)
+&& (value & 0x8000)
+&& !(decr & 0x8000)) {
 (*raise_excp)(env);
+}
 }
 
 static inline void _cpu_ppc_store_decr(CPUState *env, uint32_t decr,
@@ -806,11 +794,11 @@ uint32_t cpu_ppc601_load_rtcl (CPUState *env)
 }
 
 /*/
-/* Embedded PowerPC timers */
+/* PowerPC 40x timers */
 
 /* PIT, FIT & WDT */
-typedef struct ppcemb_timer_t ppcemb_timer_t;
-struct ppcemb_timer_t {
+typedef struct ppc40x_timer_t ppc40x_timer_t;
+struct ppc40x_timer_t {
 uint64_t pit_reload;  /* PIT auto-reload value*/
 uint64_t fit_next;/* Tick for next FIT interrupt  */
 struct QEMUTimer *fit_timer;
@@ -826,12 +814,12 @@ static void cpu_4xx_fit_cb (void *opaque)
 {
 CPUState *env;
 ppc_tb_t *tb_env;
-ppcemb_timer_t *ppcemb_timer;
+ppc40x_timer_t *ppc40x_timer;
 uint64_t now, next;
 
 env = opaque;
 tb_env = env->tb_env;
-ppcemb_timer = tb_env->opaque;
+ppc40x_timer = tb_env->opaque;
 now = qemu_get_clock_ns(vm_cloc

Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Laurent Desnogues
On Fri, Jul 1, 2011 at 4:11 PM, Jakub Jermar  wrote:
[...]
> Actually, the testcase can be further reduced into:
>
> .global _start
>
> .text
>
> .space 0x20
>
> _start:
>        set 110393, %i1
>        set 0x40, %i2
>
>        cmp  %i1, %i2
>        udivx  %g0, 1, %g0
>        movgu  %xcc, %i2, %i1
>        cmp  %i1, 512
>        bgu  %xcc, 0f
>        nop
>
> succ:
>        ta 0
>
> fail:
> 0:
>        ta 1
>
> The presence of the `udivx` instruction seems to be essential. Even
> though it has no effect on the computation, removing it will make the
> testcase non-reproducible.

Could you try to replace udivx with sdivx?  It looks wrong too.


Laurent



Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Jakub Jermar
On 1.7.2011 16:15, Laurent Desnogues wrote:
> You don't have to use gdb to reproduce the issue, just add -singlestep
> when running qemu.

This is strange, -singlestep does not work for me here as the testcase
still ends with trap 0x101. The only way how the testcase can succeed
for me is via GDB's stepi.

Jakub



[Qemu-devel] [PATCH v2 0/2]: block: Small drive_init() improvements

2011-07-01 Thread Luiz Capitulino
Please, see individual patches for details.

v1 -> v2:
-

o Drop indentation patch
o Use error message suggested by Markus

 blockdev.c |   14 +-
 1 files changed, 5 insertions(+), 9 deletions(-)



Re: [Qemu-devel] [PATCH 2/3] scsi: replace 'tag' with 'hba_private' pointer

2011-07-01 Thread Paolo Bonzini

On 07/01/2011 03:11 PM, Hannes Reinecke wrote:

On 07/01/2011 10:27 AM, Paolo Bonzini wrote:

On 07/01/2011 09:42 AM, Hannes Reinecke wrote:

'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves the lookup for driver handling several commands
in parallel.


This makes tracing a bit harder to follow. Perhaps you can keep the
transport tag (a uint64_t) in the SCSIRequest for debugging purposes?


Hmm. The transport tag wouldn't have any meaning outside scsi-bus.c.


It depends, in vmw_pvscsi I take it from a field in the request block 
that is 0..255.  So either you have a small tag that is recycled but 
stays nice, or a large tag that is unwieldy but should not be recycled 
ever.  A pointer is unwieldy _and_ is recycled, so it gives the worse of 
both worlds.


But I'm not very attached to this, I may even do it myself if/when I 
find the need.  Won't ack yet because of the nit with ESP/USB, but even 
if you do not bother I will ack the next respin.


Paolo



[Qemu-devel] [PATCH 2/2] block: drive_init(): Improve CHS setting error message

2011-07-01 Thread Luiz Capitulino
The current message doesn't clearly communicate the error cause.

Signed-off-by: Luiz Capitulino 
---
 blockdev.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 470be71..a97a801 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -310,7 +310,7 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
media = MEDIA_DISK;
} else if (!strcmp(buf, "cdrom")) {
 if (cyls || secs || heads) {
-error_report("'%s' invalid physical CHS format", buf);
+error_report("CHS can't be set with media=%s", buf);
return NULL;
 }
media = MEDIA_CDROM;
-- 
1.7.6.49.g033c2




Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Artyom Tarasenko
On Fri, Jul 1, 2011 at 4:15 PM, Laurent Desnogues
 wrote:
> On Fri, Jul 1, 2011 at 2:57 PM, Jakub Jermar  wrote:
> [...]
>> When _not_ singlestepping via GDB's `stepi`, the testcase will fail and
>> crash Qemu like this:
>>
>> qemu: fatal: Trap 0x0101 while trap level (5) >= MAXTL (5), Error state
>> 
>>
>> On the other hand, when I attach GDB to Qemu and step through all
>> instructions using `stepi`, the testcase will succeed and crash Qemu
>> like this:
>>
>> qemu: fatal: Trap 0x0100 while trap level (5) >= MAXTL (5), Error state
>> 
>>
>> Mind the difference in the trap type - 0x100 for success, 0x101 for failure.
>>
>> This is how I run the test:
>>
>> Without GDB:
>> $ qemu-system-sparc64 -bios ./testcase
>>
>> With GDB:
>> $ qemu-system-sparc64 -bios ./testcase -s -S
>>
>> From another terminal:
>> $ /usr/local/cross/sparc64/bin/sparc64-linux-gnu-gdb
>> (gdb) set architecture sparc:v9
>> (gdb) target remote localhost:1234
>> (gdb) stepi
>> ...
>>
>> Hope this helps to fix the problem.

It definitely does, thanks a lot Jakub!

> You don't have to use gdb to reproduce the issue, just add -singlestep
> when running qemu.
>
> I find it odd that udivx is using cpu_cc_src and cpu_cc_src2.  Using
> dedicated local temps seems to fix the issue.

Do we need to copy cpu_src* to further temps at all? IMHO

-tcg_gen_mov_tl(cpu_cc_src, cpu_src1);
-tcg_gen_mov_tl(cpu_cc_src2, cpu_src2);
-gen_trap_ifdivzero_tl(cpu_cc_src2);
-tcg_gen_divu_i64(cpu_dst, cpu_cc_src, cpu_cc_src2);
+gen_trap_ifdivzero_tl(cpu_src2);
+tcg_gen_divu_i64(cpu_dst, cpu_src1, cpu_src2);

should do it. Or cpu_src is what you mean by dedicated?

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/



Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Jakub Jermar
On 1.7.2011 16:24, Laurent Desnogues wrote:
> On Fri, Jul 1, 2011 at 4:11 PM, Jakub Jermar  wrote:
> [...]
>> Actually, the testcase can be further reduced into:
>>
>> .global _start
>>
>> .text
>>
>> .space 0x20
>>
>> _start:
>>set 110393, %i1
>>set 0x40, %i2
>>
>>cmp  %i1, %i2
>>udivx  %g0, 1, %g0
>>movgu  %xcc, %i2, %i1
>>cmp  %i1, 512
>>bgu  %xcc, 0f
>>nop
>>
>> succ:
>>ta 0
>>
>> fail:
>> 0:
>>ta 1
>>
>> The presence of the `udivx` instruction seems to be essential. Even
>> though it has no effect on the computation, removing it will make the
>> testcase non-reproducible.
> 
> Could you try to replace udivx with sdivx?  It looks wrong too.

Yeah, `sdivx` behaves the same wrt. the testcase.

Jakub



Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Artyom Tarasenko
On Fri, Jul 1, 2011 at 4:28 PM, Jakub Jermar  wrote:
> On 1.7.2011 16:24, Laurent Desnogues wrote:
>> On Fri, Jul 1, 2011 at 4:11 PM, Jakub Jermar  wrote:
>> [...]
>>> Actually, the testcase can be further reduced into:
>>>
>>> .global _start
>>>
>>> .text
>>>
>>> .space 0x20
>>>
>>> _start:
>>>        set 110393, %i1
>>>        set 0x40, %i2
>>>
>>>        cmp  %i1, %i2
>>>        udivx  %g0, 1, %g0
>>>        movgu  %xcc, %i2, %i1
>>>        cmp  %i1, 512
>>>        bgu  %xcc, 0f
>>>        nop
>>>
>>> succ:
>>>        ta 0
>>>
>>> fail:
>>> 0:
>>>        ta 1
>>>
>>> The presence of the `udivx` instruction seems to be essential. Even
>>> though it has no effect on the computation, removing it will make the
>>> testcase non-reproducible.
>>
>> Could you try to replace udivx with sdivx?  It looks wrong too.
>
> Yeah, `sdivx` behaves the same wrt. the testcase.

Don't need to. Will send a qemu patch tonight. Great job! Once more,
thanks a lot!


-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/



Re: [Qemu-devel] [PATCH] Add e500 instructions dcblc, dcbtls and dcbtstl as no-op

2011-07-01 Thread Alexander Graf

On 01.07.2011, at 16:59, Fabien Chouteau wrote:

> On 01/07/2011 00:38, Alexander Graf wrote:
>> 
>> On 01.07.2011, at 00:32, Scott Wood wrote:
>> 
>>> On Fri, 1 Jul 2011 00:28:19 +0200
>>> Alexander Graf  wrote:
>>> 
 
 On 01.07.2011, at 00:23, Scott Wood wrote:
 
> On Fri, 1 Jul 2011 00:18:22 +0200
> Alexander Graf  wrote:
> 
>> 
>> On 01.07.2011, at 00:11, Scott Wood wrote:
>> 
>>> Almost, but what if we have write permission but not read?
>> 
>> How would you write back data from a cache line when you haven't read it 
>> earlier?
> 
> The CPU can read it.  The program can't.
 
 Hrm. We can always just call the check manually and trigger the respective 
 interrupt :).
>>> 
>>> Yep.  A little trickier, but doable.
>>> 
>>> but what about a race with DMA from the I/O thread?
>> 
>> That'd simply be broken, but I don't quite see how it wouldn't with real 
>> hardware either :).
> 
> Real hardware doesn't generate a load/store sequence that the program 
> didn't
> ask for -- where's the breakage?
 
 Real hardware flushes whatever contents are in that cache line to RAM, no? 
 So it would collide with the DMA just as much. Or am I missing something?
>>> 
>>> If the DMA happens after the cache line is fetched, it'll be flushed,
>>> whether locked or not.  But that's different from losing some of what the
>>> device wrote.
>> 
>> Ah, so DMA flushes even locked cache lines? Then it makes sense. Well, I 
>> guess the best choice here really is to merely do a manual storage 
>> protection check and be done with it.
>> 
> 
> Well, this is far beyond the scope of my knowledge of e500 and the current
> patch is sufficient for me, so I will let you implement this if you want to...

Well, all it needs is to call mmubooke206_get_physical_address on the address 
(or each page of the destination) with access_type set to write and check for 
the result. If it's protected, inject a DSI (see cpu_ppc_handle_mmu_fault).

But yeah, I can try and see if I get around to implementing it. No promises 
though. Do you have a test case I can verify this against?


Alex




Re: [Qemu-devel] [PATCH] Add e500 instructions dcblc, dcbtls and dcbtstl as no-op

2011-07-01 Thread Fabien Chouteau
On 01/07/2011 00:38, Alexander Graf wrote:
> 
> On 01.07.2011, at 00:32, Scott Wood wrote:
> 
>> On Fri, 1 Jul 2011 00:28:19 +0200
>> Alexander Graf  wrote:
>>
>>>
>>> On 01.07.2011, at 00:23, Scott Wood wrote:
>>>
 On Fri, 1 Jul 2011 00:18:22 +0200
 Alexander Graf  wrote:

>
> On 01.07.2011, at 00:11, Scott Wood wrote:
>
>> Almost, but what if we have write permission but not read?
>
> How would you write back data from a cache line when you haven't read it 
> earlier?

 The CPU can read it.  The program can't.
>>>
>>> Hrm. We can always just call the check manually and trigger the respective 
>>> interrupt :).
>>
>> Yep.  A little trickier, but doable.
>>
>> but what about a race with DMA from the I/O thread?
>
> That'd simply be broken, but I don't quite see how it wouldn't with real 
> hardware either :).

 Real hardware doesn't generate a load/store sequence that the program 
 didn't
 ask for -- where's the breakage?
>>>
>>> Real hardware flushes whatever contents are in that cache line to RAM, no? 
>>> So it would collide with the DMA just as much. Or am I missing something?
>>
>> If the DMA happens after the cache line is fetched, it'll be flushed,
>> whether locked or not.  But that's different from losing some of what the
>> device wrote.
> 
> Ah, so DMA flushes even locked cache lines? Then it makes sense. Well, I 
> guess the best choice here really is to merely do a manual storage protection 
> check and be done with it.
> 

Well, this is far beyond the scope of my knowledge of e500 and the current
patch is sufficient for me, so I will let you implement this if you want to...

-- 
Fabien Chouteau



Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Laurent Desnogues
On Fri, Jul 1, 2011 at 5:03 PM, Artyom Tarasenko  wrote:
[...]
>> I find it odd that udivx is using cpu_cc_src and cpu_cc_src2.  Using
>> dedicated local temps seems to fix the issue.
>
> Do we need to copy cpu_src* to further temps at all? IMHO
>
> -                        tcg_gen_mov_tl(cpu_cc_src, cpu_src1);
> -                        tcg_gen_mov_tl(cpu_cc_src2, cpu_src2);
> -                        gen_trap_ifdivzero_tl(cpu_cc_src2);
> -                        tcg_gen_divu_i64(cpu_dst, cpu_cc_src, cpu_cc_src2);
> +                        gen_trap_ifdivzero_tl(cpu_src2);
> +                        tcg_gen_divu_i64(cpu_dst, cpu_src1, cpu_src2);
>
> should do it. Or cpu_src is what you mean by dedicated?

You have to use two local temps here to store cpu_src1
and cpu_src2 because gen_trap_ifdivzero_tl uses
tcg_gen_brcondi_tl (cf tcg/README comment about
local usage and jumps).  Note you'll have to do something
similar in gen_op_sdivx.


Laurent



Re: [Qemu-devel] qemu crashes on Mac OS X

2011-07-01 Thread Damjan Marion

On Jul 1, 2011, at 11:17 AM, Damjan Marion (damarion) wrote:

> 
> Hi,
> 
> I have an issue when I try to run qemu-system-arm on Mac OS X. 
> Sometime between 1 and 15 secs after qemu is started it crashes
> as shown bellow.
> 
> Same thing on linux host works fine.
> 
> Is anybody else experiencing this?
> Any Hints?

After bisection seems that this starts happening after following patch:

commit 09716e45a05cc0c93bcf55bd0c0888dd678e490f
Author: Alexander Graf 
Date:   Thu Jun 9 00:55:37 2011 +0200

sigfd: use pthread_sigmask


diff --git a/compatfd.c b/compatfd.c
index bd377c4..41586ce 100644
--- a/compatfd.c
+++ b/compatfd.c
@@ -29,7 +29,7 @@ static void *sigwait_compat(void *opaque)
 sigset_t all;
 
 sigfillset(&all);
-sigprocmask(SIG_BLOCK, &all, NULL);
+pthread_sigmask(SIG_BLOCK, &all, NULL);
 
 while (1) {


However before this patch qemu doesn't respond to keyboard (i.e. commit 
31b7c261).

Last full working commit is 630ecca.

Thanks,

Damjan


[Qemu-devel] [PATCH 0/3] [v5] Megasas HBA Emulation

2011-07-01 Thread Hannes Reinecke
Hi all,

after getting various feedback from Paolo, Stefan, and Alexander
I've respun the patches.

Chances since the previous version:
- iov: Update parameter usage in iov_(to|from)_buf()
  Updated description for the first patch and clarified the usage
  Renamed arguments for io_XXX for clarification
- scsi: Add 'hba_private' to SCSIRequest
  Kept 'tag' for tracing and just add 'hba_private' as an
  additional field as per request from Paolo
- megasas: checkpatch.pl fixes and update to work with the
  changed interface in scsi_req_new(). Also included the
  suggested fixes from Alex.

Hannes Reinecke (3):
  iov: Update parameter usage in iov_(to|from)_buf()
  scsi: Add 'hba_private' to SCSIRequest
  megasas: LSI Megaraid SAS emulation

 Makefile.objs   |1 +
 default-configs/pci.mak |1 +
 hw/esp.c|2 +-
 hw/lsi53c895a.c |   22 +-
 hw/megasas.c| 1920 +++
 hw/mfi.h| 1197 +
 hw/pci_ids.h|3 +-
 hw/scsi-bus.c   |9 +-
 hw/scsi-disk.c  |4 +-
 hw/scsi-generic.c   |5 +-
 hw/scsi.h   |   10 +-
 hw/spapr_vscsi.c|   29 +-
 hw/usb-msd.c|9 +-
 hw/virtio-net.c |2 +-
 hw/virtio-serial-bus.c  |2 +-
 iov.c   |   49 +-
 iov.h   |   10 +-
 17 files changed, 3192 insertions(+), 83 deletions(-)
 create mode 100644 hw/megasas.c
 create mode 100644 hw/mfi.h

-- 
1.7.3.4




[Qemu-devel] [PATCH 1/3] iov: Update parameter usage in iov_(to|from)_buf()

2011-07-01 Thread Hannes Reinecke
iov_to_buf() has an 'offset' parameter, iov_from_buf() hasn't.
This patch adds the missing parameter to iov_from_buf().
It also renames the 'offset' parameter to 'iov_off' to
emphasize it's the offset into the iovec and not the buffer.

Signed-off-by: Hannes Reinecke 
---
 hw/virtio-net.c|2 +-
 hw/virtio-serial-bus.c |2 +-
 iov.c  |   49 ++-
 iov.h  |   10 
 4 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 6997e02..a32cc01 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -657,7 +657,7 @@ static ssize_t virtio_net_receive(VLANClientState *nc, 
const uint8_t *buf, size_
 
 /* copy in packet.  ugh */
 len = iov_from_buf(sg, elem.in_num,
-   buf + offset, size - offset);
+   buf + offset, 0, size - offset);
 total += len;
 offset += len;
 /* If buffers can't be merged, at this point we
diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 7f6db7b..53c58d0 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -103,7 +103,7 @@ static size_t write_to_port(VirtIOSerialPort *port,
 }
 
 len = iov_from_buf(elem.in_sg, elem.in_num,
-   buf + offset, size - offset);
+   buf + offset, 0, size - offset);
 offset += len;
 
 virtqueue_push(vq, &elem, len);
diff --git a/iov.c b/iov.c
index 588cd04..1e02791 100644
--- a/iov.c
+++ b/iov.c
@@ -14,56 +14,61 @@
 
 #include "iov.h"
 
-size_t iov_from_buf(struct iovec *iov, unsigned int iovcnt,
-const void *buf, size_t size)
+size_t iov_from_buf(struct iovec *iov, unsigned int iov_cnt,
+const void *buf, size_t iov_off, size_t size)
 {
-size_t offset;
+size_t iovec_off, buf_off;
 unsigned int i;
 
-offset = 0;
-for (i = 0; offset < size && i < iovcnt; i++) {
-size_t len;
+iovec_off = 0;
+buf_off = 0;
+for (i = 0; i < iov_cnt && size; i++) {
+if (iov_off < (iovec_off + iov[i].iov_len)) {
+size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off, size);
 
-len = MIN(iov[i].iov_len, size - offset);
+memcpy(iov[i].iov_base + (iov_off - iovec_off), buf + buf_off, 
len);
 
-memcpy(iov[i].iov_base, buf + offset, len);
-offset += len;
+buf_off += len;
+iov_off += len;
+size -= len;
+}
+iovec_off += iov[i].iov_len;
 }
-return offset;
+return buf_off;
 }
 
-size_t iov_to_buf(const struct iovec *iov, const unsigned int iovcnt,
-  void *buf, size_t offset, size_t size)
+size_t iov_to_buf(const struct iovec *iov, const unsigned int iov_cnt,
+  void *buf, size_t iov_off, size_t size)
 {
 uint8_t *ptr;
-size_t iov_off, buf_off;
+size_t iovec_off, buf_off;
 unsigned int i;
 
 ptr = buf;
-iov_off = 0;
+iovec_off = 0;
 buf_off = 0;
-for (i = 0; i < iovcnt && size; i++) {
-if (offset < (iov_off + iov[i].iov_len)) {
-size_t len = MIN((iov_off + iov[i].iov_len) - offset , size);
+for (i = 0; i < iov_cnt && size; i++) {
+if (iov_off < (iovec_off + iov[i].iov_len)) {
+size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off , size);
 
-memcpy(ptr + buf_off, iov[i].iov_base + (offset - iov_off), len);
+memcpy(ptr + buf_off, iov[i].iov_base + (iov_off - iovec_off), 
len);
 
 buf_off += len;
-offset += len;
+iov_off += len;
 size -= len;
 }
-iov_off += iov[i].iov_len;
+iovec_off += iov[i].iov_len;
 }
 return buf_off;
 }
 
-size_t iov_size(const struct iovec *iov, const unsigned int iovcnt)
+size_t iov_size(const struct iovec *iov, const unsigned int iov_cnt)
 {
 size_t len;
 unsigned int i;
 
 len = 0;
-for (i = 0; i < iovcnt; i++) {
+for (i = 0; i < iov_cnt; i++) {
 len += iov[i].iov_len;
 }
 return len;
diff --git a/iov.h b/iov.h
index 60a8547..110f67a 100644
--- a/iov.h
+++ b/iov.h
@@ -12,8 +12,8 @@
 
 #include "qemu-common.h"
 
-size_t iov_from_buf(struct iovec *iov, unsigned int iovcnt,
-const void *buf, size_t size);
-size_t iov_to_buf(const struct iovec *iov, const unsigned int iovcnt,
-  void *buf, size_t offset, size_t size);
-size_t iov_size(const struct iovec *iov, const unsigned int iovcnt);
+size_t iov_from_buf(struct iovec *iov, unsigned int iov_cnt,
+const void *buf, size_t iov_off, size_t size);
+size_t iov_to_buf(const struct iovec *iov, const unsigned int iov_cnt,
+  void *buf, size_t iov_off, size_t size);
+size_t iov_size(const struct iovec *iov, const unsigned int iov_cnt);
-- 
1.7.3.4




[Qemu-devel] [PATCH 2/3] scsi: Add 'hba_private' to SCSIRequest

2011-07-01 Thread Hannes Reinecke
'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves the lookup for driver handling several commands
in parallel.
'tag' is still being kept for tracing purposes.

Signed-off-by: Hannes Reinecke 
---
 hw/esp.c  |2 +-
 hw/lsi53c895a.c   |   22 --
 hw/scsi-bus.c |9 ++---
 hw/scsi-disk.c|4 ++--
 hw/scsi-generic.c |5 +++--
 hw/scsi.h |   10 +++---
 hw/spapr_vscsi.c  |   29 +
 hw/usb-msd.c  |9 +
 8 files changed, 37 insertions(+), 53 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index 6d3f5d2..aa87197 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t 
busid)
 
 DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
 lun = busid & 7;
-s->current_req = scsi_req_new(s->current_dev, 0, lun);
+s->current_req = scsi_req_new(s->current_dev, 0, lun, NULL);
 datalen = scsi_req_enqueue(s->current_req, buf);
 s->ti_size = datalen;
 if (datalen != 0) {
diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index 940b43a..69eec1d 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -661,7 +661,7 @@ static lsi_request *lsi_find_by_tag(LSIState *s, uint32_t 
tag)
 static void lsi_request_cancelled(SCSIRequest *req)
 {
 LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
-lsi_request *p;
+lsi_request *p = req->hba_private;
 
 if (s->current && req == s->current->req) {
 scsi_req_unref(req);
@@ -670,7 +670,6 @@ static void lsi_request_cancelled(SCSIRequest *req)
 return;
 }
 
-p = lsi_find_by_tag(s, req->tag);
 if (p) {
 QTAILQ_REMOVE(&s->queue, p, next);
 scsi_req_unref(req);
@@ -680,18 +679,12 @@ static void lsi_request_cancelled(SCSIRequest *req)
 
 /* Record that data is available for a queued command.  Returns zero if
the device was reselected, nonzero if the IO is deferred.  */
-static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
+static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
 {
-lsi_request *p;
-
-p = lsi_find_by_tag(s, tag);
-if (!p) {
-BADF("IO with unknown tag %d\n", tag);
-return 1;
-}
+lsi_request *p = req->hba_private;
 
 if (p->pending) {
-BADF("Multiple IO pending for tag %d\n", tag);
+BADF("Multiple IO pending for request %p\n", p);
 }
 p->pending = len;
 /* Reselect if waiting for it, or if reselection triggers an IRQ
@@ -743,9 +736,9 @@ static void lsi_transfer_data(SCSIRequest *req, uint32_t 
len)
 LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
 int out;
 
-if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
+if (s->waiting == 1 || !s->current || req->hba_private != s->current ||
 (lsi_irq_on_rsl(s) && !(s->scntl1 & LSI_SCNTL1_CON))) {
-if (lsi_queue_tag(s, req->tag, len)) {
+if (lsi_queue_req(s, req, len)) {
 return;
 }
 }
@@ -789,7 +782,8 @@ static void lsi_do_command(LSIState *s)
 assert(s->current == NULL);
 s->current = qemu_mallocz(sizeof(lsi_request));
 s->current->tag = s->select_tag;
-s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
+s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun,
+   s->current);
 
 n = scsi_req_enqueue(s->current->req, buf);
 if (n) {
diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index ad6a730..8b1a412 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -131,7 +131,8 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
 return res;
 }
 
-SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t 
lun)
+SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag,
+uint32_t lun, void *hba_private)
 {
 SCSIRequest *req;
 
@@ -141,14 +142,16 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, 
uint32_t tag, uint32_t l
 req->dev = d;
 req->tag = tag;
 req->lun = lun;
+req->hba_private = hba_private;
 req->status = -1;
 trace_scsi_req_alloc(req->dev->id, req->lun, req->tag);
 return req;
 }
 
-SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun)
+SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
+  void *hba_private)
 {
-return d->info->alloc_req(d, tag, lun);
+return d->info->alloc_req(d, tag, lun, hba_private);
 }
 
 uint8_t *scsi_req_get_buf(SCSIRequest *req)
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index a8c7372..c2a99fe 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -81,13 +81,13 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, 
int type);
 static int scsi_disk_emulate_command(SCSIDiskReq *r, uint8_t *outbuf);
 

Re: [Qemu-devel] [PATCH] Add e500 instructions dcblc, dcbtls and dcbtstl as no-op

2011-07-01 Thread Fabien Chouteau
On 01/07/2011 17:05, Alexander Graf wrote:
>
> On 01.07.2011, at 16:59, Fabien Chouteau wrote:
>
>> On 01/07/2011 00:38, Alexander Graf wrote:
>>>
>>> On 01.07.2011, at 00:32, Scott Wood wrote:
>>>
 On Fri, 1 Jul 2011 00:28:19 +0200
 Alexander Graf  wrote:

>
> On 01.07.2011, at 00:23, Scott Wood wrote:
>
>> On Fri, 1 Jul 2011 00:18:22 +0200
>> Alexander Graf  wrote:
>>
>>>
>>> On 01.07.2011, at 00:11, Scott Wood wrote:
>>>
 Almost, but what if we have write permission but not read?
>>>
>>> How would you write back data from a cache line when you haven't read 
>>> it earlier?
>>
>> The CPU can read it.  The program can't.
>
> Hrm. We can always just call the check manually and trigger the 
> respective interrupt :).

 Yep.  A little trickier, but doable.

 but what about a race with DMA from the I/O thread?
>>>
>>> That'd simply be broken, but I don't quite see how it wouldn't with 
>>> real hardware either :).
>>
>> Real hardware doesn't generate a load/store sequence that the program 
>> didn't
>> ask for -- where's the breakage?
>
> Real hardware flushes whatever contents are in that cache line to RAM, 
> no? So it would collide with the DMA just as much. Or am I missing 
> something?

 If the DMA happens after the cache line is fetched, it'll be flushed,
 whether locked or not.  But that's different from losing some of what the
 device wrote.
>>>
>>> Ah, so DMA flushes even locked cache lines? Then it makes sense. Well, I 
>>> guess the best choice here really is to merely do a manual storage 
>>> protection check and be done with it.
>>>
>>
>> Well, this is far beyond the scope of my knowledge of e500 and the current
>> patch is sufficient for me, so I will let you implement this if you want 
>> to...
>
> Well, all it needs is to call mmubooke206_get_physical_address on the address 
> (or each page of the destination) with access_type set to write and check for 
> the result. If it's protected, inject a DSI (see cpu_ppc_handle_mmu_fault).
>
> But yeah, I can try and see if I get around to implementing it. No promises 
> though. Do you have a test case I can verify this against?

No, I just implemented these no-oped instructions to get rid of an illegal
instruction exception while running u-boot on my emulated p2010.

-- 
Fabien Chouteau



Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Artyom Tarasenko
On Fri, Jul 1, 2011 at 5:20 PM, Laurent Desnogues
 wrote:
> On Fri, Jul 1, 2011 at 5:03 PM, Artyom Tarasenko  wrote:
> [...]
>>> I find it odd that udivx is using cpu_cc_src and cpu_cc_src2.  Using
>>> dedicated local temps seems to fix the issue.
>>
>> Do we need to copy cpu_src* to further temps at all? IMHO
>>
>> -                        tcg_gen_mov_tl(cpu_cc_src, cpu_src1);
>> -                        tcg_gen_mov_tl(cpu_cc_src2, cpu_src2);
>> -                        gen_trap_ifdivzero_tl(cpu_cc_src2);
>> -                        tcg_gen_divu_i64(cpu_dst, cpu_cc_src, cpu_cc_src2);
>> +                        gen_trap_ifdivzero_tl(cpu_src2);
>> +                        tcg_gen_divu_i64(cpu_dst, cpu_src1, cpu_src2);
>>
>> should do it. Or cpu_src is what you mean by dedicated?
>
> You have to use two local temps here to store cpu_src1
> and cpu_src2 because gen_trap_ifdivzero_tl uses
> tcg_gen_brcondi_tl (cf tcg/README comment about
> local usage and jumps).

The same story as the recent crash with wrpr %pstate, I see. I should
have done my homework.
Thanks for the hint!

Artyom

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/



Re: [Qemu-devel] [PATCH 2/3] scsi: Add 'hba_private' to SCSIRequest

2011-07-01 Thread Paolo Bonzini

On 07/01/2011 05:35 PM, Hannes Reinecke wrote:

'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves the lookup for driver handling several commands
in parallel.
'tag' is still being kept for tracing purposes.

Signed-off-by: Hannes Reinecke
---
  hw/esp.c  |2 +-
  hw/lsi53c895a.c   |   22 --
  hw/scsi-bus.c |9 ++---
  hw/scsi-disk.c|4 ++--
  hw/scsi-generic.c |5 +++--
  hw/scsi.h |   10 +++---
  hw/spapr_vscsi.c  |   29 +
  hw/usb-msd.c  |9 +
  8 files changed, 37 insertions(+), 53 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index 6d3f5d2..aa87197 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t 
busid)

  DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
  lun = busid&  7;
-s->current_req = scsi_req_new(s->current_dev, 0, lun);
+s->current_req = scsi_req_new(s->current_dev, 0, lun, NULL);
  datalen = scsi_req_enqueue(s->current_req, buf);
  s->ti_size = datalen;
  if (datalen != 0) {
diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index 940b43a..69eec1d 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -661,7 +661,7 @@ static lsi_request *lsi_find_by_tag(LSIState *s, uint32_t 
tag)
  static void lsi_request_cancelled(SCSIRequest *req)
  {
  LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
-lsi_request *p;
+lsi_request *p = req->hba_private;

  if (s->current&&  req == s->current->req) {
  scsi_req_unref(req);
@@ -670,7 +670,6 @@ static void lsi_request_cancelled(SCSIRequest *req)
  return;
  }

-p = lsi_find_by_tag(s, req->tag);
  if (p) {
  QTAILQ_REMOVE(&s->queue, p, next);
  scsi_req_unref(req);
@@ -680,18 +679,12 @@ static void lsi_request_cancelled(SCSIRequest *req)

  /* Record that data is available for a queued command.  Returns zero if
 the device was reselected, nonzero if the IO is deferred.  */
-static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
+static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
  {
-lsi_request *p;
-
-p = lsi_find_by_tag(s, tag);
-if (!p) {
-BADF("IO with unknown tag %d\n", tag);
-return 1;
-}
+lsi_request *p = req->hba_private;

  if (p->pending) {
-BADF("Multiple IO pending for tag %d\n", tag);
+BADF("Multiple IO pending for request %p\n", p);
  }
  p->pending = len;
  /* Reselect if waiting for it, or if reselection triggers an IRQ
@@ -743,9 +736,9 @@ static void lsi_transfer_data(SCSIRequest *req, uint32_t 
len)
  LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
  int out;

-if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
+if (s->waiting == 1 || !s->current || req->hba_private != s->current ||
  (lsi_irq_on_rsl(s)&&  !(s->scntl1&  LSI_SCNTL1_CON))) {
-if (lsi_queue_tag(s, req->tag, len)) {
+if (lsi_queue_req(s, req, len)) {
  return;
  }
  }
@@ -789,7 +782,8 @@ static void lsi_do_command(LSIState *s)
  assert(s->current == NULL);
  s->current = qemu_mallocz(sizeof(lsi_request));
  s->current->tag = s->select_tag;
-s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
+s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun,
+   s->current);

  n = scsi_req_enqueue(s->current->req, buf);
  if (n) {
diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index ad6a730..8b1a412 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -131,7 +131,8 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
  return res;
  }

-SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t 
lun)
+SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag,
+uint32_t lun, void *hba_private)
  {
  SCSIRequest *req;

@@ -141,14 +142,16 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, 
uint32_t tag, uint32_t l
  req->dev = d;
  req->tag = tag;
  req->lun = lun;
+req->hba_private = hba_private;
  req->status = -1;
  trace_scsi_req_alloc(req->dev->id, req->lun, req->tag);
  return req;
  }

-SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun)
+SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
+  void *hba_private)
  {
-return d->info->alloc_req(d, tag, lun);
+return d->info->alloc_req(d, tag, lun, hba_private);
  }

  uint8_t *scsi_req_get_buf(SCSIRequest *req)
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index a8c7372..c2a99fe 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -81,13 +81,13 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int

Re: [Qemu-devel] [PATCH] Add e500 instructions dcblc, dcbtls and dcbtstl as no-op

2011-07-01 Thread Scott Wood
On Fri, 1 Jul 2011 17:39:49 +0200
Fabien Chouteau  wrote:

> No, I just implemented these no-oped instructions to get rid of an illegal
> instruction exception while running u-boot on my emulated p2010.
> 

Heh, so someone *is* trying to run firmware. :-)

It would take a lot of low-level hardware simulation to get unmodified
U-Boot to run -- probably not worth it unless your goal is an environment
for debugging U-Boot.

It uses cache locking to create working space before the memory controller
is initialized, so it's really expecting those instructions to work.

-Scott




Re: [Qemu-devel] SPARC64 support on FreeBSD, has it improved as of yet?

2011-07-01 Thread Super Bisquit
On Wed, Jun 29, 2011 at 9:46 PM, Super Bisquit wrote:

>
>
> On Wed, Jun 29, 2011 at 1:10 AM, Bob Breuer  wrote:
>
>> Super Bisquit wrote:
>> >
>> ...
>> >
>> > It builds, doesn't run. More like it runs and hangs.
>> >
>> > $ qemu-system-sparc -cpu LEON3 -hda test.img -cdrom
>> Downloads/debian-6.0.2.1-sparc-businesscard.iso -m 256 -boot d
>> >
>>
>> That command line won't work.  OpenBIOS doesn't support LEON, and the
>> last version of Debian for sparc32 was 4.0.
>>
>> Try instead: "qemu-system-sparc -cdrom debian-40r9-sparc-netinst.iso
>> -boot d"
>>
>> You can get a cd image from
>> http://cdimage.debian.org/cdimage/archive/4.0_r9/sparc/iso-cd/ but the
>> installer may not be able to load packages from the internet because the
>> packages have been moved to archive.debian.org.
>>
>> Bob
>>
>
> No response either from sparc32 or powerpc.  I386 also didn't work.
> What gdb commands should be ran on the core and what qemu monitor commands
> should I run?
>
>
Here. When someone else on the list has FreeBSD installed to a
SPARC64/UltraSPARC device and has installed qemu to it, then it will be easy
to see what I am referring to constantly.


Re: [Qemu-devel] [PATCH 1/3] iov: Update parameter usage in iov_(to|from)_buf()

2011-07-01 Thread Alexander Graf

On 01.07.2011, at 17:35, Hannes Reinecke wrote:

> iov_to_buf() has an 'offset' parameter, iov_from_buf() hasn't.
> This patch adds the missing parameter to iov_from_buf().
> It also renames the 'offset' parameter to 'iov_off' to
> emphasize it's the offset into the iovec and not the buffer.
> 
> Signed-off-by: Hannes Reinecke 

Acked-by: Alexander Graf 

Alex




Re: [Qemu-devel] [HelenOS-devel] [sparc64] Miscomputed minimum of a group of numbers in sparc64 emulation

2011-07-01 Thread Artyom Tarasenko
On Fri, Jul 1, 2011 at 5:53 PM, Artyom Tarasenko  wrote:
> On Fri, Jul 1, 2011 at 5:20 PM, Laurent Desnogues
>  wrote:
>> On Fri, Jul 1, 2011 at 5:03 PM, Artyom Tarasenko  wrote:
>> [...]
 I find it odd that udivx is using cpu_cc_src and cpu_cc_src2.  Using
 dedicated local temps seems to fix the issue.
>>>
>>> Do we need to copy cpu_src* to further temps at all? IMHO
>>>
>>> -                        tcg_gen_mov_tl(cpu_cc_src, cpu_src1);
>>> -                        tcg_gen_mov_tl(cpu_cc_src2, cpu_src2);
>>> -                        gen_trap_ifdivzero_tl(cpu_cc_src2);
>>> -                        tcg_gen_divu_i64(cpu_dst, cpu_cc_src, cpu_cc_src2);
>>> +                        gen_trap_ifdivzero_tl(cpu_src2);
>>> +                        tcg_gen_divu_i64(cpu_dst, cpu_src1, cpu_src2);
>>>
>>> should do it. Or cpu_src is what you mean by dedicated?
>>
>> You have to use two local temps here to store cpu_src1
>> and cpu_src2 because gen_trap_ifdivzero_tl uses
>> tcg_gen_brcondi_tl (cf tcg/README comment about
>> local usage and jumps).

Thought on a second thought, it's only critical if the jump is taken,
right? And here if the jump is taken, we don't need  cpu_src1 &
cpu_src2.

> The same story as the recent crash with wrpr %pstate, I see. I should
> have done my homework.
> Thanks for the hint!




-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/



Re: [Qemu-devel] [PATCH] Add e500 instructions dcblc, dcbtls and dcbtstl as no-op

2011-07-01 Thread Fabien Chouteau
On 01/07/2011 18:00, Scott Wood wrote:
> On Fri, 1 Jul 2011 17:39:49 +0200
> Fabien Chouteau  wrote:
> 
>> No, I just implemented these no-oped instructions to get rid of an illegal
>> instruction exception while running u-boot on my emulated p2010.
>>
> 
> Heh, so someone *is* trying to run firmware. :-)
> 
> It would take a lot of low-level hardware simulation to get unmodified
> U-Boot to run -- probably not worth it unless your goal is an environment
> for debugging U-Boot.
> 
> It uses cache locking to create working space before the memory controller
> is initialized, so it's really expecting those instructions to work.

It was mainly to test the correctness of emulation, but it appears that I have
to implement more devices (eSPI, I2C, LBC...) than I initially thought.

-- 
Fabien Chouteau



Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Scott Wood
On Fri, 1 Jul 2011 10:58:14 +1000
Benjamin Herrenschmidt  wrote:

> So, from a qemu command line perspective, all you should have to do is
> pass qemu the device-tree -path- to the device you want to pass-trough
> (you may support passing a full hierarchy here).
> 
> That is for normal MMIO mapped SoC devices. Something else (individual
> i2c, usb, ...) will use specific virtualization of the corresponding
> busses.
> 
> Anything else sucks too much really.
> 
> From there, well, there's several approach inside qemu/kvm to handle
> that path. If you want to do things at the qemu level you can probably
> parse /proc/device-tree.

That's what option 1 is, except that instead of adding code to qemu to
parse /proc/device-tree, we'd use dtc to dump /proc/device-tree into a dtb
and let qemu use libfdt to look at the tree.  This is less Linux-specific,
more modular, and more flexible for doing the sort of insane hacks that are
going to happen in embedded-land whether you like them or not. :-)

> But I'd personally just make it a kernel thing.

I'd rather keep the kernel interface simple -- assign this memory region,
assign that interrupt, use this IOMMU device ID, etc.  Getting the kernel
involved in preparing the guest device tree, and understanding guuest
configuration, seems quite excessive.

> IE. I would have an ioctl to "instanciate" a pass-through device, that
> takes that path as an argument. I would make it return an anonymous fd
> which you can then use to mmap the resources, etc...
> 
> > In some cases, modifications to device tree nodes may be needed.
> > An example-- sometimes a device tree property references another node 
> > and that relationship may not exist when assigned to a guest.
> > A "phy-handle" property may need to be deleted and a "fixed-link"
> > property added to a node representing a network device.
> 
> That's fishy. Why wouldn't you give full access to the MDIO ? It's
> shared ? 

Yes, it's shared.  Yes, it sucks.

> Such things are so device-specific that they would have to be
> handled by device-specific quirks, which can live either in qemu or in
> the kernel.

Or in the configuration of qemu.  Not all users of the device want to do
the same thing.

> > So in addition to assigning a device, a mechanism is needed to update 
> > device tree nodes.  So for the above example, maybe--
> > 
> >  -device assigned-soc-dev,dev=/soc/ethernet@b2000,delete-prop=phy-handle,
> >   node-update="fixed-link = <2 1 1000 0 0>"
> 
> That's just so gross and error prone, borderline insane.

Welcome to embedded. :-)

Here, users are going to want to be able to mess around under the hood in
a way that server or desktop users generally don't need or want to.

> > The types of modifications needed--  deleting nodes, deleting properties, 
> > adding nodes, adding properties, adding properties that reference other
> > nodes, changing properties. This device tree transformation mechanism
> > needed is general enough that it could apply to any device tree based
> > embedded platform (e.g. ARM, MIPS)
> >
> > Another complexity relates to the IOMMU.  Here things get very company 
> > and IOMMU specific. Freescale has a proprietary IOMMU.
> 
> Look at the work currently being done for a generic qemu iommu layer. We
> need it for server power as well and from what I last saw coming from
> Eduardo and David, it's not PCI specific.

The problem is that our current IOMMU doesn't implement full paging (yes,
the HW people have been screamed at, but we're stuck with it for current
chips).  You have to break things down into regions following certain
alignment rules, which may require user guidance as to which memory regions
actually need DMA access, especially if you're setting up discontiguous
shared memory regions and such.

-Scott




Re: [Qemu-devel] [PATCH 0/9] PPC patch queue 2011-06-17

2011-07-01 Thread Alexander Graf

On 17.06.2011, at 16:43, Alexander Graf wrote:

> The following changes since commit eb47d7c5d96060040931c42773ee07e61e547af9:
>  Peter Maydell (1):
>hw/9118.c: Implement active-low interrupt support
> 
> are available in the git repository at:
> 
>  git://repo.or.cz/qemu/agraf.git ppc-next

This was supposed to be called [PULL]. Any chance we could get it in? :)


Alex




Re: [Qemu-devel] [PATCH 3/3] megasas: LSI Megaraid SAS emulation

2011-07-01 Thread Alexander Graf

On 01.07.2011, at 17:35, Hannes Reinecke wrote:

> This patch adds an emulation for the LSI Megaraid SAS 8708EM2 HBA.

Have you tried to execute the current version of megasas and actually do 
something with it? I just booted up openSUSE 11.4 rescue from DVD with a 
megasas adapter that contained a raw file backed by tmpfs. Creating a partition 
worked fine, but when running mkfs.ext3 and mounting afterwards, the mount 
fails saying there is no ext3 on the disk.

Sounds like data corruption to me :). I know that this used to work a while 
back, so it might be a regression recently?


Alex




Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Scott Wood
On Fri, 1 Jul 2011 07:10:45 -0500
Anthony Liguori  wrote:

> I agree in principle but I think it should be done in a slightly 
> different way.
> 
> I think we ought to support composing a device by passthrough.  For 
> instance, something like:
> 
> [physical-device "mydev"]
> region[0].file = "/dev/mem"
> region[0].guest_address = "0x42232000"
> region[0].file_offset = "0x23423400"
> region[0].size = "4096"
> irq[0].guest_irq = "10"
> irq[0].host_irq = "10"
> 
> This should be independent of anything to do with device tree.  This 
> would be useful for x86 too to assign platform devices (like the HPET).

That's fine, as long as there's something layered on top of it for the case
where we do want to reference something in the device tree.  

However, we'll need to address the question of what it means to say "irq 10"
-- outside of PC-land there often isn't a global IRQ numberspace that isn't
a fiction created by some software layer.  Addressing this is one of the
device tree's strengths.

-Scott




Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Scott Wood
On Fri, 1 Jul 2011 18:03:01 +0100
Paul Brook  wrote:

> Basically you should start by implementing full emulation of a device with 
> similar characteristics to the one you want to passthrough.

That's not going to happen.

> Once you've done all the above, host device passthrough should be relatively 
> straightforward.  Just replace the emulation bits in the above device with 
> code that pokes at a real device via the relevant kernel API.

That's not what we mean by direct device assignment.

We're talking about directly mapping the registers into the guest.  The
whole point is performance.

-Scott




Re: [Qemu-devel] [PATCH v6 08/12] VMDK: change get_cluster_offset return type

2011-07-01 Thread Kevin Wolf
Am 01.07.2011 06:55, schrieb Fam Zheng:
> The return type of get_cluster_offset was an offset that use 0 to denote
> 'not allocated', this will be no longer true for flat extents, as we see
> flat extent file as a single huge cluster whose offset is 0 and length
> is the whole file length.
> So now we use int return value, 0 means success and otherwise offset
> invalid.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/vmdk.c |   73 
> +++---
>  1 files changed, 39 insertions(+), 34 deletions(-)
> 
> diff --git a/block/vmdk.c b/block/vmdk.c
> index 8783629..40e4464 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -685,18 +685,23 @@ static int vmdk_L2update(VmdkExtent *extent, 
> VmdkMetaData *m_data)
>  return 0;
>  }
>  
> -static uint64_t get_cluster_offset(BlockDriverState *bs,
> +static int get_cluster_offset(BlockDriverState *bs,
>  VmdkExtent *extent,
>  VmdkMetaData *m_data,
> -uint64_t offset, int allocate)
> +uint64_t offset,
> +int allocate,
> +uint64_t *cluster_offset)
>  {
>  unsigned int l1_index, l2_offset, l2_index;
>  int min_index, i, j;
>  uint32_t min_count, *l2_table, tmp = 0;
> -uint64_t cluster_offset;
>  
>  if (m_data)
>  m_data->valid = 0;
> +if (extent->flat) {
> +*cluster_offset = 0;
> +return 0;
> +}
>  
>  l1_index = (offset >> 9) / extent->l1_entry_sectors;
>  if (l1_index >= extent->l1_size) {

Let me complete what comes next:

l1_index = (offset >> 9) / extent->l1_entry_sectors;
if (l1_index >= extent->l1_size) {
return 0;
}
l2_offset = extent->l1_table[l1_index];
if (!l2_offset) {
return 0;
}

Shouldn't these returns be changed to -1? Also there is a return 0;
after a failed read, which doesn't seem to be changed.

Kevin



Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Paul Brook
> > irq[0].guest_irq = "10"
> > 
> > This should be independent of anything to do with device tree.  This
> > would be useful for x86 too to assign platform devices (like the HPET).
> 
> That's fine, as long as there's something layered on top of it for the case
> where we do want to reference something in the device tree.
> 
> However, we'll need to address the question of what it means to say "irq
> 10" -- outside of PC-land there often isn't a global IRQ numberspace that
> isn't a fiction created by some software layer.  Addressing this is one of
> the device tree's strengths.

That's an entirely separate problem, thoug probably a prerequisite.

Basically you should start by implementing full emulation of a device with 
similar characteristics to the one you want to passthrough.

Then fix whatever is needed to allow the user to contol instantiation of those 
devices. This almost certainly means using the -device commandline option.  
This currently only works for a fairly simple subset of devices (approximately 
PCI and USB), so you'll probably need to fix/implement the missing bits.  To 
do this you'll probably need to do some work on the various bits of the qdev 
relating to linking devices together.  See recent discussion about sockets in 
the "basic support for composing sysbus devices" thread.

To expose this to the guest you'll probably also need to implement some form 
of dynamic device tree assembly/manipulation.  Not strictly necessary (we can 
require the user supply a complete device tree that matches whatever devices 
they've configured), but probably highly desirable.

Once you've done all the above, host device passthrough should be relatively 
straightforward.  Just replace the emulation bits in the above device with 
code that pokes at a real device via the relevant kernel API.

Paul



Re: [Qemu-devel] [PATCH V3] Support logging xen-guest console

2011-07-01 Thread Stefano Stabellini
On Fri, 1 Jul 2011, Chun Yan Liu wrote:
> On Thursday, June 30, 2011 07:18:54 PM you wrote:
> > On 06/30/2011 11:39 AM, Chun Yan Liu wrote:
> > > On Thursday, June 30, 2011 03:58:57 PM Alexander Graf wrote:
> > >> On 30.06.2011, at 09:08, Chunyan Liu wrote:
> > >>> Add code to support logging xen-domU console, as what xenconsoled does.
> > >>> To enable logging, set environment variable XENCONSOLED_TRACE=guest and
> > >>> XENCONSOLED_LOGDIR=, log file will be saved in
> > >>> .
> > >> 
> > >> In fact, this whole thing looks as if you're merely trying to
> > >> reimplement "tee" on the xenconsole output. Wouldn't it make more sense
> > >> to do this in
> > >> 
> > >> the char layer? So we could do:
> > >>-xenconsole tee:stdio,file:/tmp/xen.log
> > >> 
> > >> or similar? That's probably a lot more useful than a random Xen specific
> > >> hack.
> > > 
> > > Thanks, Alex.  It IS something like "tee". But IMO, change in
> > > xen_console.c and change in char layer are just different time points,
> > > do not have essential difference. Change in xen_console.c is trying to
> > > backup output data into log file before sending to char device, change
> > > in char device is trying to dupicate data from char device to log file.
> > > Correct me if I'm wrong.
> > 
> > Sure, the outcome is the same though, no? We get the output data in both
> > a file and the char backend.
> >  
> Char device in qemu is not only used by console. Compared with the benefits 
> that brings, I still doubt if it is proper to adding this functionality to 
> char layer.  
> Stefano, how do you think?
 
I agree with Alexander that Qemu should take a command line argument to
enable xen_console logging, rather than reading an environmental
variable.
However considering that xen_console allocates char drivers dynamically,
the command line syntax proposed by Alexander is not a good fit.
We need a global command line parameter that enables logging for all
xen_console ptys, for example:

-xencon_logs /path/to/logdir

Regarding the particular way to implement logging, I think this patch is
OK. Of course if the char layer had a "tee" option we could use it to
log the pty output, for example prepending "tee:/log/to/file," to the
filename parameter that we are going to pass to qemu_chr_open.




Re: [Qemu-devel] [PATCH v3] xen: implement unplug protocol in xen_platform

2011-07-01 Thread Stefano Stabellini
On Fri, 1 Jul 2011, Kevin Wolf wrote:
> Am 30.06.2011 16:16, schrieb Stefano Stabellini:
> > On Thu, 30 Jun 2011, Kevin Wolf wrote:
> >>> +static int pci_piix3_xen_ide_unplug(DeviceState *dev)
> >>> +{
> >>> +PCIDevice *pci_dev;
> >>> +PCIIDEState *pci_ide;
> >>> +DriveInfo *di;
> >>> +int i = 0;
> >>> +
> >>> +pci_dev = DO_UPCAST(PCIDevice, qdev, dev);
> >>> +pci_ide = DO_UPCAST(PCIIDEState, dev, pci_dev);
> >>> +
> >>> +for (; i < 3; i++) {
> >>> +di = drive_get_by_index(IF_IDE, i);
> >>> +if (di != NULL && di->bdrv != NULL && !di->bdrv->removable) {
> >>> +DeviceState *ds = bdrv_get_attached(di->bdrv);
> >>> +if (ds) {
> >>> +bdrv_detach(di->bdrv, ds);
> >>> +}
> >>> +bdrv_close(di->bdrv);
> >>> +pci_ide->bus[di->bus].ifs[di->unit].bs = NULL;
> >>
> >> Have you tested if this is enough if the guest tries to continue using
> >> the device? I don't know of any case where it's not sufficient, just
> >> trying to make sure that it's really true in practice.
> > 
> > The purpose of this is to "hide" the disk from the guest. The unplug is
> > supposed to happen *before* the guest enumerates the IDE disks; it is
> > responsibility of the guest to make sure of it.
> > I tested it with Linux PV on HVM drivers, and Linux doesn't see the
> > emulated disk after the unplug, as it should be.
> 
> Yeah. What I meant is that we should make sure that a misbehaving guest,
> which just keeps on playing with the IDE ports anyway, can't crash qemu.
> A quick review suggests that it is the case, but testing it anyway would
> be better.

I see what you mean: I tested it, a guest cannot crash Qemu.




Re: [Qemu-devel] qemu-user[armel/mips] and debian-rootfs

2011-07-01 Thread Lisandro Damián Nicanor Pérez Meyer
Hi! First of all, sorry for breaking the thread [0] (although it is too old 
maybe).

[0] 

As I have stumbled upon this bug, I would like to know if there is something I 
can do to try to help fixing it. I have never seen qemu's code and I am really 
a newbie to it, so any pointer would be higly appreciated.

Kinds regrads, Lisandro.


-- 
Wiki is not WysiWyg. It's an intelligence test of sorts to be able to edit a
wiki page. It's not rocket science, but it doesn't appeal to the VideoAddicts.
If it doesn't appeal, they don't participate, which leaves those of us who 
read
and write to get on with rational discourse.
  http://www.c2.com/cgi/wiki?WhyWikiWorks

Lisandro Damián Nicanor Pérez Meyer
http://perezmeyer.com.ar/
http://perezmeyer.blogspot.com/


signature.asc
Description: This is a digitally signed message part.


Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Scott Wood
On Fri, 1 Jul 2011 12:16:35 +0100
Paul Brook  wrote:

> > One feature we need for QEMU/KVM on embedded Power Architecture is the
> > ability to do passthru assignment of SoC I/O devices and memory.  An
> > important use case in embedded is creating static partitions--
> > taking physical memory and I/O devices (non-PCI) and partitioning
> > them between the host Linux and several virtual machines.   Things like
> > live migration would not be needed or supported in these types of
> > scenarios.
> > 
> > SoC devices do not sit on a probeable bus and there are no identifiers
> > like 01:00.0 with PCI that we can use to identify devices--  the host
> > Linux kernel is made aware of SoC I/O devices from nodes/properties in a
> > device tree structure passed at boot.   QEMU needs to generate a
> > device tree to pass to the guest as well with all the guest's virtual
> > and physical resources.  Today a number of mostly complete guest device
> > trees are kept under ./pc-bios in QEMU, but this too static and
> > inflexible.
> 
> I doubt you're going to get generic passthrough of arbitrary devices working 
> in a useful way.

It's usefully working for us internally -- we're just trying to find a way
to improve it for upstream, with a better configuration mechanism.

> My expectation is that, at minimum, you'll need a bus 
> specific proxy device. i.e. create a virtual device in qemu that responds to 
> the guest, and happens poke at a host device rather than emulating things 
> directly.

Many of these embedded devices don't sit on any sort of software-visible
bus, and requiring that the I/O happen via MMIO traps would result in
unacceptable overhead.

> Basically you have to emulate  everything that is different between the host 
> and guest.

Directly assigning a device means you don't get to have differences between
the actual hardware device and what the guest sees.  The kind of thin
wrapper you're suggesting might have some use cases, but it's a different
problem from what we're trying to solve.

-Scott




[Qemu-devel] [PATCH][sparc64] fix cpu_cc_src and cpu_cc_src2 corruption in udivx and sdivx

2011-07-01 Thread Artyom Tarasenko
udivx and sdvix don't modify condition flags, so they shall not
overwrite cpu_cc_*

Signed-off-by: Artyom Tarasenko 
---
 target-sparc/translate.c |   32 ++--
 1 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 992cd77..f32a674 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -727,19 +727,24 @@ static inline void gen_trap_ifdivzero_tl(TCGv divisor)
 static inline void gen_op_sdivx(TCGv dst, TCGv src1, TCGv src2)
 {
 int l1, l2;
+TCGv r_temp1, r_temp2;
 
 l1 = gen_new_label();
 l2 = gen_new_label();
-tcg_gen_mov_tl(cpu_cc_src, src1);
-tcg_gen_mov_tl(cpu_cc_src2, src2);
-gen_trap_ifdivzero_tl(cpu_cc_src2);
-tcg_gen_brcondi_tl(TCG_COND_NE, cpu_cc_src, INT64_MIN, l1);
-tcg_gen_brcondi_tl(TCG_COND_NE, cpu_cc_src2, -1, l1);
+r_temp1 = tcg_temp_local_new();
+r_temp2 = tcg_temp_local_new();
+tcg_gen_mov_tl(r_temp1, src1);
+tcg_gen_mov_tl(r_temp2, src2);
+gen_trap_ifdivzero_tl(r_temp2);
+tcg_gen_brcondi_tl(TCG_COND_NE, r_temp1, INT64_MIN, l1);
+tcg_gen_brcondi_tl(TCG_COND_NE, r_temp2, -1, l1);
 tcg_gen_movi_i64(dst, INT64_MIN);
 tcg_gen_br(l2);
 gen_set_label(l1);
-tcg_gen_div_i64(dst, cpu_cc_src, cpu_cc_src2);
+tcg_gen_div_i64(dst, r_temp1, r_temp2);
 gen_set_label(l2);
+tcg_temp_free(r_temp1);
+tcg_temp_free(r_temp2);
 }
 #endif
 
@@ -3173,10 +3178,17 @@ static void disas_sparc_insn(DisasContext * dc)
 break;
 #ifdef TARGET_SPARC64
 case 0xd: /* V9 udivx */
-tcg_gen_mov_tl(cpu_cc_src, cpu_src1);
-tcg_gen_mov_tl(cpu_cc_src2, cpu_src2);
-gen_trap_ifdivzero_tl(cpu_cc_src2);
-tcg_gen_divu_i64(cpu_dst, cpu_cc_src, cpu_cc_src2);
+{
+TCGv r_temp1, r_temp2;
+r_temp1 = tcg_temp_local_new();
+r_temp2 = tcg_temp_local_new();
+tcg_gen_mov_tl(r_temp1, cpu_src1);
+tcg_gen_mov_tl(r_temp2, cpu_src2);
+gen_trap_ifdivzero_tl(r_temp2);
+tcg_gen_divu_i64(cpu_dst, r_temp1, r_temp2);
+tcg_temp_free(r_temp1);
+tcg_temp_free(r_temp2);
+}
 break;
 #endif
 case 0xe: /* udiv */
-- 
1.7.3.4




Re: [Qemu-devel] Benchmarking activities

2011-07-01 Thread Blue Swirl
2011/6/27 Ben Vogler :
> Hi QEMU devel team,
>
>
>
> I work for Toyota Technical Centre Australia in the software department. We
> are currently conducting benchmarking on a broad spectrum of virtual
> platform simulators. I was wondering if I could ask you guys a few question
> about QEMU, however I understand if it is too much to ask. Below is a list
> of questions I have:
>
>
>
> -  I have seen examples of QEMU processor cores being wrapped in
> SystemC and used in OSCI based virtual systems – is this the general
> approach, or is there other/better ways of going about using QEMU not as an
> emulator (such as VMware), but as a simulator?

Maybe this report from QEMU User Forum may help:
http://lists.nongnu.org/archive/html/qemu-devel/2011-03/msg02065.html

> -  Is there full backwards compatibility between versions of QEMU?

In some cases (maybe x86 only) there is migration compatibility
between versions. In other cases this effort has not been considered
worthwhile.

> -  I have been looking but could not find a complete list of
> processor core models supported by QEMU. I have seen there are processors
> from Sparc, ARM, MIPS, but are there any core models from NEC, or Renesas in
> particular? Would you please be able to point me in the right direction?

Those are not supported. Each emulator can list the CPU models with
flag -cpu '?'.

>
>
>
> The rest of my questions are based on the assumption that QEMU IP will be
> used in a virtual system simulation, rather than emulation.
>
>
>
> -  Is co-simulation possible? For example, connecting an engine
> model running in Dymola to the QEMU (processor model) based virtual system
> simulator.

No idea.

> -   Are there any inbuilt data tracing features? For example,
> hardware signal tracing, register monitoring etc.

Tracing is quite new addition, so far it's only used for development
or debugging QEMU point of view I think.

>
>
>
>
> Thanks for your time.
>
>
>
>
>
> Regards,
>
>
>
> TOYOTA TECHNICAL CENTER AUSTRALIA
> Ben Vogler    ベン・ヴォグラ
> Software Engineer
> Software Development Group
> 611-633 Blackburn Road
> Notting Hill, VIC 3168
> Email:    bvog...@toyotatech.com.au
>
> Phone:  +61 3 9501 5226
>
> Confidential - do not forward this email.
>
> P  Please consider the environment before printing this e-mail
>
>



Re: [Qemu-devel] 80-column rule and breaking output statements

2011-07-01 Thread Blue Swirl
On Mon, Jun 27, 2011 at 12:48 PM, Amit Shah  wrote:
> On (Sun) 26 Jun 2011 [20:43:16], Blue Swirl wrote:
>> On Wed, Jun 22, 2011 at 7:53 AM, Amit Shah  wrote:
>
> [snip]
>
>> > From dec93d9eccd639f7bfd1343dca65fa112eb19e3e Mon Sep 17 00:00:00 2001
>> > Message-Id: 
>> > 
>> > From: Amit Shah 
>> > Date: Wed, 22 Jun 2011 10:20:48 +0530
>> > Subject: [PATCH 1/1] CODING_STYLE: Add exception for log output 80-char 
>> > limit
>> >
>> > Output that's logged can be beyond 80 chars to preserve sane grepping.
>>
>> Nack. Grepping (why just logs?) is just one use case. There are other
>
> I'm not talking about grepping logs.  I'm talking about grepping code
> once you have something in the logs.  With line breaks in the code but
> not in the log, you will not get the desired result.
>
> Example:
>
>
> fprintf("This is a line ");
> fprintf("broken in two code lines\n");
>
>
> Output is:
>
> This is a line broken in two code lines
>
>
> Picking that line from the output and grepping for it in the source
> doesn't get you what you want.

Nack. Grep would not match for example
fprintf("'%s' invalid media\n", s);
unless you know how to grep only for some parts of the error string
and in that case splitting the line would not hurt very much anymore.

If you want robust grepping, each error message should contain an ID,
for example:
fprintf("ERROR-ID-0015:'%s' invalid media\n", s);

Then grepping would work even for
fprintf(
"ERROR-ID-0015:"
"'%s'"
" invalid"
" media\n",
s);



[Qemu-devel] simulation paper with Blackfin and QEMU + GNU sim

2011-07-01 Thread Mike Frysinger
Robin and i wrote a paper on simulation that focused on QEMU and the
GNU sim with the Blackfin architecture.  it covers implementation,
testing, and benchmarking.  maybe some people will find it useful.
maybe not.
http://docs.blackfin.uclinux.org/lib/exe/fetch.php?media=presentations:ols-2011-sim.pdf
then we presented it at OLS this year:
http://docs.blackfin.uclinux.org/lib/exe/fetch.php?media=presentations:ols-2011-sim.odp
-mike



Re: [Qemu-devel] [PATCHv2] Add compat eventfd header

2011-07-01 Thread Blue Swirl
On Thu, Jun 30, 2011 at 6:57 PM, Michael S. Tsirkin  wrote:
> Support build on rhel 5.X where we have syscall for eventfd but not
> userspace wrapper.
>
> (cherry-picked from commit 9e3269181e9bc56feb43bcd4e8ce0b82cd543e65
>  in qemu-kvm.git).
>
> Signed-off-by: Michael S. Tsirkin 
> ---
>
> Changes from v1:
>  checkpatch fix
>  address comments by agraf
>  verify we are on linux
>
>  compat/sys/eventfd.h |   20 
>  configure            |    6 --
>  2 files changed, 24 insertions(+), 2 deletions(-)
>  create mode 100644 compat/sys/eventfd.h
>
> diff --git a/compat/sys/eventfd.h b/compat/sys/eventfd.h
> new file mode 100644
> index 000..1801a5f
> --- /dev/null
> +++ b/compat/sys/eventfd.h

Since we have linux-headers directory now, the directory should be
compat-headers. I'd also add 'linux' directory below that to avoid
collisions, so the full path would be
compat-headers/linux/sys/eventfd.h.

> @@ -0,0 +1,20 @@
> +#ifndef _COMPAT_SYS_EVENTFD
> +#define _COMPAT_SYS_EVENTFD
> +
> +#ifdef CONFIG_EVENTFD
> +
> +#ifndef __linux__
> +#error __linux__ is not defined: eventfd is only supported on linux
> +#endif

With the linux directory, this check wouldn't be needed. It's not
incorrect and we could add more specific checks later (for example if
SYS_eventfd is not defined).

> +
> +#include 
> +#include 
> +
> +static inline int eventfd(int count, int flags)
> +{
> +    return syscall(SYS_eventfd, count, flags);
> +}
> +
> +#endif
> +
> +#endif
> diff --git a/configure b/configure
> index 856b41e..6f7dd74 100755
> --- a/configure
> +++ b/configure
> @@ -822,7 +822,6 @@ esac
>
>  [ -z "$guest_base" ] && guest_base="$host_guest_base"
>
> -
>  default_target_list=""
>
>  # these targets are portable
> @@ -891,6 +890,9 @@ sparc64-bsd-user \
>  "
>  fi
>
> +#compat headers
> +QEMU_CFLAGS="$QEMU_CFLAGS -idirafter $source_path/compat"

Please use $source_path/compat-headers/$targetos/.

> +
>  if test x"$show_help" = x"yes" ; then
>  cat << EOF
>
> @@ -2122,7 +2124,7 @@ int main(void)
>     return 0;
>  }
>  EOF
> -if compile_prog "" "" ; then
> +if compile_prog "-DCONFIG_EVENTFD" "" ; then
>   eventfd=yes
>  fi
>
> --
> 1.7.5.53.gc233e
>



[Qemu-devel] [Bug 804517] Re: qemu crashes on Darwin in qemu_iohandler_poll

2011-07-01 Thread Damjan Marion
fter bisection seems that this starts happening after following patch:

commit 09716e45a05cc0c93bcf55bd0c0888dd678e490f
Author: Alexander Graf 
Date:   Thu Jun 9 00:55:37 2011 +0200

   sigfd: use pthread_sigmask


diff --git a/compatfd.c b/compatfd.c
index bd377c4..41586ce 100644
--- a/compatfd.c
+++ b/compatfd.c
@@ -29,7 +29,7 @@ static void *sigwait_compat(void *opaque)
sigset_t all;

sigfillset(&all);
-sigprocmask(SIG_BLOCK, &all, NULL);
+pthread_sigmask(SIG_BLOCK, &all, NULL);

while (1) {


However before this patch qemu doesn't respond to keyboard (i.e. commit 
31b7c261).

Last full working commit is 630ecca.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/804517

Title:
  qemu crashes on Darwin in qemu_iohandler_poll

Status in QEMU:
  New

Bug description:
  I have an issue when I try to run qemu-system-arm on Mac OS X. 
  Sometime between 1 and 15 secs after qemu is started it crashes
  as shown bellow.

  Same thing on linux host works fine.

  Is anybody else experiencing this?
  Any Hints?

  Thanks,

  Damjan


  (gdb) run
  Starting program: /opt/arm-qemu/bin/qemu-system-arm -M verdex -pflash 
flash.img -nographic -monitor null -m 289
  Reading symbols for shared libraries 
.++
 done
  pxa2xx_clkpwr_write: CPU frequency change attempt

  
  U-Boot 1.2.0 (May 10 2008 - 21:17:19) - PXA270@400 MHz - 1604

  *** Welcome to Gumstix ***

  DRAM:  256 MB
  Flash: 32 MB
  Using default environment

  Hit any key to stop autoboot:  1 
  Program received signal EXC_BAD_ACCESS, Could not access memory.
  Reason: KERN_PROTECTION_FAILURE at address: 0x7fff5fbfed30
  0x7fff5fbfed30 in ?? ()
  (gdb) 
  (gdb) bt
  #0  0x7fff5fbfed30 in ?? ()
  #1  0x0001000c26f4 in qemu_iohandler_poll ()
  #2  0x0001001975ae in main_loop_wait ()
  #3  0x0001001976e2 in main_loop ()
  #4  0x00010019bfbc in qemu_main ()
  #5  0x0001000d63a5 in main ()
  (gdb)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/804517/+subscriptions



[Qemu-devel] [Bug 804517] [NEW] qemu crashes on Darwin in qemu_iohandler_poll

2011-07-01 Thread Damjan Marion
Public bug reported:

I have an issue when I try to run qemu-system-arm on Mac OS X. 
Sometime between 1 and 15 secs after qemu is started it crashes
as shown bellow.

Same thing on linux host works fine.

Is anybody else experiencing this?
Any Hints?

Thanks,

Damjan


(gdb) run
Starting program: /opt/arm-qemu/bin/qemu-system-arm -M verdex -pflash flash.img 
-nographic -monitor null -m 289
Reading symbols for shared libraries 
.++
 done
pxa2xx_clkpwr_write: CPU frequency change attempt


U-Boot 1.2.0 (May 10 2008 - 21:17:19) - PXA270@400 MHz - 1604

*** Welcome to Gumstix ***

DRAM:  256 MB
Flash: 32 MB
Using default environment

Hit any key to stop autoboot:  1 
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x7fff5fbfed30
0x7fff5fbfed30 in ?? ()
(gdb) 
(gdb) bt
#0  0x7fff5fbfed30 in ?? ()
#1  0x0001000c26f4 in qemu_iohandler_poll ()
#2  0x0001001975ae in main_loop_wait ()
#3  0x0001001976e2 in main_loop ()
#4  0x00010019bfbc in qemu_main ()
#5  0x0001000d63a5 in main ()
(gdb)

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/804517

Title:
  qemu crashes on Darwin in qemu_iohandler_poll

Status in QEMU:
  New

Bug description:
  I have an issue when I try to run qemu-system-arm on Mac OS X. 
  Sometime between 1 and 15 secs after qemu is started it crashes
  as shown bellow.

  Same thing on linux host works fine.

  Is anybody else experiencing this?
  Any Hints?

  Thanks,

  Damjan


  (gdb) run
  Starting program: /opt/arm-qemu/bin/qemu-system-arm -M verdex -pflash 
flash.img -nographic -monitor null -m 289
  Reading symbols for shared libraries 
.++
 done
  pxa2xx_clkpwr_write: CPU frequency change attempt

  
  U-Boot 1.2.0 (May 10 2008 - 21:17:19) - PXA270@400 MHz - 1604

  *** Welcome to Gumstix ***

  DRAM:  256 MB
  Flash: 32 MB
  Using default environment

  Hit any key to stop autoboot:  1 
  Program received signal EXC_BAD_ACCESS, Could not access memory.
  Reason: KERN_PROTECTION_FAILURE at address: 0x7fff5fbfed30
  0x7fff5fbfed30 in ?? ()
  (gdb) 
  (gdb) bt
  #0  0x7fff5fbfed30 in ?? ()
  #1  0x0001000c26f4 in qemu_iohandler_poll ()
  #2  0x0001001975ae in main_loop_wait ()
  #3  0x0001001976e2 in main_loop ()
  #4  0x00010019bfbc in qemu_main ()
  #5  0x0001000d63a5 in main ()
  (gdb)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/804517/+subscriptions



Re: [Qemu-devel] [PATCH V2] [PowerPC][RFC] booke timers

2011-07-01 Thread Scott Wood
On Fri, 1 Jul 2011 16:13:41 +0200
Fabien Chouteau  wrote:

> +uint64_t cpu_ppc_get_tb(ppc_tb_t *tb_env, uint64_t vmclk, int64_t tb_offset)
>  {
>  /* TB time in tb periods */
>  return muldiv64(vmclk, tb_env->tb_freq, get_ticks_per_sec()) + tb_offset;
> @@ -678,18 +661,23 @@ static void __cpu_ppc_store_decr (CPUState *env, 
> uint64_t *nextp,
>  decr, value);
>  now = qemu_get_clock_ns(vm_clock);
>  next = now + muldiv64(value, get_ticks_per_sec(), tb_env->decr_freq);
> -if (is_excp)
> +if (is_excp) {
>  next += *nextp - now;
> -if (next == now)
> +}
> +if (next == now) {
>  next++;
> +}
>  *nextp = next;
>  /* Adjust timer */
>  qemu_mod_timer(timer, next);
>  /* If we set a negative value and the decrementer was positive,
> - * raise an exception.
> + * raise an exception (not for booke).
>   */
> -if ((value & 0x8000) && !(decr & 0x8000))
> +if (! (env->insns_flags & PPC_BOOKE)
> +&& (value & 0x8000)
> +&& !(decr & 0x8000)) {
>  (*raise_excp)(env);
> +}
>  }
>  

Also, load_decr should go from this:

if (diff >= 0)
decr = muldiv64(diff, tb_env->decr_freq, get_ticks_per_sec());
else
decr = -muldiv64(-diff, tb_env->decr_freq, get_ticks_per_sec());

to something like this:

if (diff >= 0) {
decr = muldiv64(diff, tb_env->decr_freq, get_ticks_per_sec());
} else if (env->insns_flags & PPC_BOOKE) {
decr = 0;
} else {
decr = -muldiv64(-diff, tb_env->decr_freq, get_ticks_per_sec());
}

> +/* Return the location of the bit of time base at which the FIT will raise an
> +   interrupt */
> +static uint8_t booke_get_fit_target(CPUState *env)
> +{
> +uint8_t fp = (env->spr[SPR_BOOKE_TCR] & TCR_FP_MASK) >> TCR_FP_SHIFT;
> +
> +/* Only for e500 */
> +if (env->insns_flags2 & PPC2_E500) {
> +uint32_t fpext = (env->spr[SPR_BOOKE_TCR] & TCR_E500_FPEXT_MASK)
> +>> TCR_E500_FPEXT_SHIFT;
> +fp |= fpext << 2;
> +} else {
> +fp = env->fit_period[fp];
> +}
> +
> +return fp;
> +}
> +
> +/* Return the location of the bit of time base at which the WDT will raise an
> +   interrupt */
> +static uint8_t booke_get_wdt_target(CPUState *env)
> +{
> +uint8_t wp = (env->spr[SPR_BOOKE_TCR] & TCR_WP_MASK) >> TCR_WP_SHIFT;
> +
> +/* Only for e500 */
> +if (env->insns_flags2 & PPC2_E500) {
> +uint32_t wpext = (env->spr[SPR_BOOKE_TCR] & TCR_E500_WPEXT_MASK)
> +>> TCR_E500_WPEXT_SHIFT;
> +wp |= wpext << 2;
> +} else {
> +wp = env->wdt_period[wp];
> +}
> +
> +return wp;
> +}

e500 fp/wp is expressed as bits from the MSB side of TB, so we need to
subtract from 63 to get something sane -- and document that fit/wdt_period
is from the LSB side.

> +static void booke_update_fixed_timer(CPUState *env,
> + uint8_t   target_bit,
> + uint64_t  *next,
> + struct QEMUTimer *timer)
> +{
> +ppc_tb_t *tb_env = env->tb_env;
> +uint64_t lapse;
> +uint64_t tb;
> +uint64_t period = 1 << (target_bit + 1);
> +uint64_t now;
> +
> +now = qemu_get_clock_ns(vm_clock);
> +tb  = cpu_ppc_get_tb(tb_env, now, tb_env->tb_offset);
> +
> +if (tb <= (1 << target_bit)) {
> +lapse = (1 << target_bit) - tb;
> +} else {
> +lapse = period - ((tb - (1 << target_bit)) % period);

We know period is a power of two, so just do "& (period - 1)".

That should let you get rid of the special case for
"tb <= (1 << target_bit)" as well.

-Scott




Re: [Qemu-devel] SPARC64 support on FreeBSD, has it improved as of yet?

2011-07-01 Thread Blue Swirl
On Fri, Jul 1, 2011 at 7:03 PM, Super Bisquit  wrote:
>
>
> On Wed, Jun 29, 2011 at 9:46 PM, Super Bisquit 
> wrote:
>>
>>
>> On Wed, Jun 29, 2011 at 1:10 AM, Bob Breuer  wrote:
>>>
>>> Super Bisquit wrote:
>>> >
>>> ...
>>> >
>>> > It builds, doesn't run. More like it runs and hangs.
>>> >
>>> > $ qemu-system-sparc -cpu LEON3 -hda test.img -cdrom
>>> > Downloads/debian-6.0.2.1-sparc-businesscard.iso -m 256 -boot d
>>> >
>>>
>>> That command line won't work.  OpenBIOS doesn't support LEON, and the
>>> last version of Debian for sparc32 was 4.0.
>>>
>>> Try instead: "qemu-system-sparc -cdrom debian-40r9-sparc-netinst.iso
>>> -boot d"
>>>
>>> You can get a cd image from
>>> http://cdimage.debian.org/cdimage/archive/4.0_r9/sparc/iso-cd/ but the
>>> installer may not be able to load packages from the internet because the
>>> packages have been moved to archive.debian.org.
>>>
>>> Bob
>>
>> No response either from sparc32 or powerpc.  I386 also didn't work.
>> What gdb commands should be ran on the core and what qemu monitor commands
>> should I run?
>>
>
> Here. When someone else on the list has FreeBSD installed to a
> SPARC64/UltraSPARC device and has installed qemu to it, then it will be easy
> to see what I am referring to constantly.

More Sparc (or BSD) hackers are very much welcome.



Re: [Qemu-devel] "cpu-exec.c: avoid AREG0 use" breaks x86 emulation on x86-64

2011-07-01 Thread Blue Swirl
On Fri, Jul 1, 2011 at 4:44 AM, TeLeMan  wrote:
> On Fri, Jul 1, 2011 at 00:47, Jan Kiszka  wrote:
>> Hi Blue,
>>
>> commit cea5f9a28f breaks here, just starting qemu without any
>> parameters:
>>
>> Starting program: qemu-system-x86_64
>> [Thread debugging using libthread_db enabled]
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x739ac770 in __sigsetjmp () from /lib64/libc.so.6
>> (gdb) bt
>> #0  0x739ac770 in __sigsetjmp () from /lib64/libc.so.6
>> #1  0x004eb96c in cpu_x86_exec (env=0x11d09a0) at cpu-exec.c:233
>> #2  0x0040f056 in tcg_cpu_exec (env=0x11d09a0) at cpus.c:1059
>> #3  cpu_exec_all () at cpus.c:1100
>> #4  0x0058cfcb in main_loop () at vl.c:1380
>> #5  main (argc=, argv=, 
>> envp=) at vl.c:3318
>>
>> Please have a look.
>>
>> Jan
>>
>> --
>> Siemens AG, Corporate Technology, CT T DE IT 1
>> Corporate Competence Center Embedded Linux
>>
>>
> It works on gcc 4.4.5 x64. Which version is your gcc?

Works for me too. The change added a second argument to
tcg_qemu_tb_exec() and in tcg/i386/* we assume that the second
register is %rsi. Isn't that so also with your gcc?



Re: [Qemu-devel] device assignment for embedded Power

2011-07-01 Thread Paul Brook
> On Fri, 1 Jul 2011 18:03:01 +0100
> 
> Paul Brook  wrote:
> > Basically you should start by implementing full emulation of a device
> > with similar characteristics to the one you want to passthrough.
> 
> That's not going to happen.

Why is your device so unique? How does it interact with the guest system and 
what features does it require that doen't exist in any device that can be 
emulated?

I'm also extremely sceptical of anything that only works in a kvm environment.  
Makes me think it's an unmaintainable hack, and almost certainly going to 
cause you immense amounts of pain later.

> > I doubt you're going to get generic passthrough of arbitrary devices
> > working in a useful way.
> 
> It's usefully working for us internally -- we're just trying to find a way
> to improve it for upstream, with a better configuration mechanism.

I don't believe that either.  More likely you've got passthrough of device 
hanging off your specific CPU bus, using only (or even a subset of) the 
facilities provided by that bus.

> > Basically you have to emulate  everything that is different between the
> > host and guest.
> 
> Directly assigning a device means you don't get to have differences between
> the actual hardware device and what the guest sees.  The kind of thin
> wrapper you're suggesting might have some use cases, but it's a different
> problem from what we're trying to solve.

That's the problem. You've skipped several steps and gone startigh for 
optimization before you've even got basic functionality working.

You've also missed the point I was making.  In order to do device passthrough 
you need to define a boundary allong which the emulated machine state can be 
fully replicated on the host machine.  Anything inside this boundary is (by 
definition) that same on both the host and guest systems (we're effectively 
using host hardware to emulate a device for us). Outside that boundary the 
host and guest systems will diverge.

For a device that merely responds to CPU initiated MMIO transfers this is 
pretty simple, it's the point at which MMIO transfers are generated. So the 
guest gets a proxy device that intercepts accesses to that memory region, and 
the host proxies some way for qemu to poke values at the host device.

> > Once you've done all the above, host device passthrough should be
> > relatively straightforward.  Just replace the emulation bits in the
> > above device with code that pokes at a real device via the relevant
> > kernel API.
> 
> That's not what we mean by direct device assignment.

Maybe, but IMO but it's a necessary prerequisite. You're trying to run before 
you can walk.

> We're talking about directly mapping the registers into the guest.  The
> whole point is performance.

That's an additional step after you get passthrough working the normal way.
We already have mechanisms (or at least patches) for mapping file-like objects 
into guest physical memory.  That's largely independent of device passthrough.  
It's a relatively minor tweak to how the passthrough device sets up its MMIO 
regions.

Mapping host device MMIO regions into guest space is entirely uninteresting 
unless we already have some way of creating guest-host passthrough devices.  
Creating guest-device passthrough devices isn't going to happen until the can 
create arbitrary devices (within the set emulated by qemu) that interact with 
the rest of the emulated machine in a similar way.

Paul



Re: [Qemu-devel] [PATCH 08/12] TCG/PPC: use TCG_REG_CALL_STACK instead of TCG_REG_R1

2011-07-01 Thread Blue Swirl
On Tue, Jun 28, 2011 at 1:51 AM, malc  wrote:
> On Sun, 26 Jun 2011, Blue Swirl wrote:
>
>> Use TCG_REG_CALL_STACK instead of TCG_REG_R1 etc. for consistency.
>>
>
> This i'd rather avoid.

Why? In addition to the consistency among targets, a magic constant is
replaced with a symbol which improves the documenting abilities and
readability of the code.



Re: [Qemu-devel] [PATCH 08/12] TCG/PPC: use TCG_REG_CALL_STACK instead of TCG_REG_R1

2011-07-01 Thread malc
On Sat, 2 Jul 2011, Blue Swirl wrote:

> On Tue, Jun 28, 2011 at 1:51 AM, malc  wrote:
> > On Sun, 26 Jun 2011, Blue Swirl wrote:
> >
> >> Use TCG_REG_CALL_STACK instead of TCG_REG_R1 etc. for consistency.
> >>
> >
> > This i'd rather avoid.
> 
> Why? In addition to the consistency among targets, a magic constant is
> replaced with a symbol which improves the documenting abilities and
> readability of the code.

Makes it harder to read for me personally.

-- 
mailto:av1...@comtv.ru



Re: [Qemu-devel] [PATCH] hppa: Fix printf warnings in hppa-dis.c.

2011-07-01 Thread Blue Swirl
Thanks, applied.

On Tue, Jun 21, 2011 at 1:02 AM, Richard Henderson  wrote:
> Signed-off-by: Richard Henderson 
> ---
>  hppa-dis.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/hppa-dis.c b/hppa-dis.c
> index a5760a9..435da73 100644
> --- a/hppa-dis.c
> +++ b/hppa-dis.c
> @@ -1771,13 +1771,13 @@ static const char *const add_compl_names[] = { 0, "", 
> ",l", ",tsv" };
>  static void
>  fput_reg (unsigned reg, disassemble_info *info)
>  {
> -  (*info->fprintf_func) (info->stream, reg ? reg_names[reg] : "r0");
> +  (*info->fprintf_func) (info->stream, "%s", reg ? reg_names[reg] : "r0");
>  }
>
>  static void
>  fput_fp_reg (unsigned reg, disassemble_info *info)
>  {
> -  (*info->fprintf_func) (info->stream, reg ? fp_reg_names[reg] : "fr0");
> +  (*info->fprintf_func) (info->stream, "%s", reg ? fp_reg_names[reg] : 
> "fr0");
>  }
>
>  static void
> @@ -1794,7 +1794,7 @@ fput_fp_reg_r (unsigned reg, disassemble_info *info)
>  static void
>  fput_creg (unsigned reg, disassemble_info *info)
>  {
> -  (*info->fprintf_func) (info->stream, control_reg[reg]);
> +  (*info->fprintf_func) (info->stream, "%s", control_reg[reg]);
>  }
>
>  /* Print constants with sign.  */
> --
> 1.5.6.5
>
>
>



Re: [Qemu-devel] [PATCH 01/12] TCG/HPPA: use TCG_REG_CALL_STACK instead of TCG_REG_SP

2011-07-01 Thread Blue Swirl
On Mon, Jun 27, 2011 at 12:02 AM, Richard Henderson  wrote:
> On 06/26/2011 12:20 PM, Blue Swirl wrote:
>> Use TCG_REG_CALL_STACK instead of TCG_REG_SP for consistency.
>>
>> Signed-off-by: Blue Swirl 
>
> Acked-by: Richard Henderson 

Thanks for the Ack, applied both patches.



Re: [Qemu-devel] [PATCH][sparc64] fix cpu_cc_src and cpu_cc_src2 corruption in udivx and sdivx

2011-07-01 Thread Laurent Desnogues
On Fri, Jul 1, 2011 at 9:28 PM, Artyom Tarasenko  wrote:
> udivx and sdvix don't modify condition flags, so they shall not
> overwrite cpu_cc_*

Looks good to me.


Laurent

> Signed-off-by: Artyom Tarasenko 
> ---
>  target-sparc/translate.c |   32 ++--
>  1 files changed, 22 insertions(+), 10 deletions(-)
>
> diff --git a/target-sparc/translate.c b/target-sparc/translate.c
> index 992cd77..f32a674 100644
> --- a/target-sparc/translate.c
> +++ b/target-sparc/translate.c
> @@ -727,19 +727,24 @@ static inline void gen_trap_ifdivzero_tl(TCGv divisor)
>  static inline void gen_op_sdivx(TCGv dst, TCGv src1, TCGv src2)
>  {
>     int l1, l2;
> +    TCGv r_temp1, r_temp2;
>
>     l1 = gen_new_label();
>     l2 = gen_new_label();
> -    tcg_gen_mov_tl(cpu_cc_src, src1);
> -    tcg_gen_mov_tl(cpu_cc_src2, src2);
> -    gen_trap_ifdivzero_tl(cpu_cc_src2);
> -    tcg_gen_brcondi_tl(TCG_COND_NE, cpu_cc_src, INT64_MIN, l1);
> -    tcg_gen_brcondi_tl(TCG_COND_NE, cpu_cc_src2, -1, l1);
> +    r_temp1 = tcg_temp_local_new();
> +    r_temp2 = tcg_temp_local_new();
> +    tcg_gen_mov_tl(r_temp1, src1);
> +    tcg_gen_mov_tl(r_temp2, src2);
> +    gen_trap_ifdivzero_tl(r_temp2);
> +    tcg_gen_brcondi_tl(TCG_COND_NE, r_temp1, INT64_MIN, l1);
> +    tcg_gen_brcondi_tl(TCG_COND_NE, r_temp2, -1, l1);
>     tcg_gen_movi_i64(dst, INT64_MIN);
>     tcg_gen_br(l2);
>     gen_set_label(l1);
> -    tcg_gen_div_i64(dst, cpu_cc_src, cpu_cc_src2);
> +    tcg_gen_div_i64(dst, r_temp1, r_temp2);
>     gen_set_label(l2);
> +    tcg_temp_free(r_temp1);
> +    tcg_temp_free(r_temp2);
>  }
>  #endif
>
> @@ -3173,10 +3178,17 @@ static void disas_sparc_insn(DisasContext * dc)
>                         break;
>  #ifdef TARGET_SPARC64
>                     case 0xd: /* V9 udivx */
> -                        tcg_gen_mov_tl(cpu_cc_src, cpu_src1);
> -                        tcg_gen_mov_tl(cpu_cc_src2, cpu_src2);
> -                        gen_trap_ifdivzero_tl(cpu_cc_src2);
> -                        tcg_gen_divu_i64(cpu_dst, cpu_cc_src, cpu_cc_src2);
> +                        {
> +                            TCGv r_temp1, r_temp2;
> +                            r_temp1 = tcg_temp_local_new();
> +                            r_temp2 = tcg_temp_local_new();
> +                            tcg_gen_mov_tl(r_temp1, cpu_src1);
> +                            tcg_gen_mov_tl(r_temp2, cpu_src2);
> +                            gen_trap_ifdivzero_tl(r_temp2);
> +                            tcg_gen_divu_i64(cpu_dst, r_temp1, r_temp2);
> +                            tcg_temp_free(r_temp1);
> +                            tcg_temp_free(r_temp2);
> +                        }
>                         break;
>  #endif
>                     case 0xe: /* udiv */
> --
> 1.7.3.4
>
>



  1   2   >