date:20111206

Re: [Qemu-devel] [PATCH v4 2/3] Extract code to nbd_setup function to be used for many purposes

2011-12-06 Thread Paolo Bonzini


On 12/06/2011 07:56 AM, Chunyan Liu wrote:

> Currently, the nbd_setup needs parameters: device, srcpath, flags,
> partition, dev_offset, nbdflags, sockpath, bindto, port, shared,
> persistent, verbose, sigterm_rfd. More than 10 parameters. I still
> didn't find a better way to reduce parameters. Making variables global
> is a workaround to avoid nbd_setup taking too many parameters. Actually,
> except for sigterm_rfd, all others are parsed from command line options.


Reading again this patch, I am not sure why you are doing it this way.

There is no reason why bdrv_new/open/delete has to be redone for every 
/dev/nbdX we try (or if there is a reason, _that_ is what should be 
fixed first).  Also the "tail" of nbd_setup, basically the select loop, 
should not be tried multiple times.


I do not understand why you cannot simply do it like this:

- in the server thread, do everything as it is now

- pass "device" to the client thread instead of opening it in main()

- in the client thread, either use "device" as it is or (if device == 
NULL, which implies find == 1) loop until nbd_init succeeds.


Am I just confused?

Paolo
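
The client-side step proposed above can be sketched roughly as follows. This is a minimal sketch, not the qemu-nbd code: pick_device, attach_fn and fake_attach are hypothetical names, the attach callback stands in for the nbd_init call mentioned in the message, and the upper bound on device numbers is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the real attach step (nbd_init in the thread).
 * Returns 0 on success, -1 if the device cannot be used. */
typedef int (*attach_fn)(const char *device);

/* Toy attach function for illustration: only /dev/nbd2 is "free". */
static int fake_attach(const char *device)
{
    return strcmp(device, "/dev/nbd2") == 0 ? 0 : -1;
}

/* If a device was named, try only that one.  If device is NULL (the
 * "find" case), probe /dev/nbd0, /dev/nbd1, ... until one attaches.
 * On success the chosen path is left in buf and 0 is returned. */
static int pick_device(const char *device, attach_fn attach,
                       int max_devices, char *buf, size_t len)
{
    int i, n;

    assert(attach != NULL && buf != NULL);

    if (device != NULL) {
        n = snprintf(buf, len, "%s", device);
        if (n < 0 || (size_t)n >= len) {
            return -1;
        }
        return attach(buf);
    }
    for (i = 0; i < max_devices; i++) {
        n = snprintf(buf, len, "/dev/nbd%d", i);
        if (n < 0 || (size_t)n >= len) {
            return -1;
        }
        if (attach(buf) == 0) {
            return 0;
        }
    }
    return -1;
}
```

With this shape the block device is opened once and the server side stays unchanged; only the loop over candidate device nodes is repeated.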

Re: [Qemu-devel] [PATCH v4 2/3] Extract code to nbd_setup function to be used for many purposes

2011-12-06 Thread Chunyan Liu

2011/12/6 Paolo Bonzini 

> On 12/06/2011 07:56 AM, Chunyan Liu wrote:
>
>>
>> Currently, the nbd_setup needs parameters: device, srcpath, flags,
>> partition, dev_offset, nbdflags, sockpath, bindto, port, shared,
>> persistent, verbose, sigterm_rfd. More than 10 parameters. I still
>> didn't find a better way to reduce parameters. Making variables global
>> is a workaround to avoid nbd_setup taking too many parameters. Actually,
>> except for sigterm_rfd, all others are parsed from command line options.
>>
>
> Reading again this patch, I am not sure why you are doing it this way.
>
> There is no reason why bdrv_new/open/delete has to be redone for every
> /dev/nbdX we try (or if there is a reason, _that_ is what should be fixed
> first).


Well, it does not have to be redone; I did it that way only to avoid passing
too many parameters to nbd_setup (otherwise "bs", "dev_offset" and "fd_size"
would also have to be passed to nbd_setup). It can be moved out.


> Also the "tail" of nbd_setup, basically the select loop, should not be
> tried multiple times.
>
> I do not understand why you cannot simply do it like this:
>
> - in the server thread, do everything as it is now
>

Nope. When the device changes, both the client thread and the server thread
should be refreshed. sockpath and sharing_fds[] change with a different device.


> - pass "device" to the client thread instead of opening it in main()
>
> - in the client thread, either use "device" as it is or (if device ==
> NULL, which implies find == 1) loop until nbd_init succeeds.

> Am I just confused?
>
> Paolo
>
>

Re: [Qemu-devel] [PATCH v4 2/3] Extract code to nbd_setup function to be used for many purposes

2011-12-06 Thread Paolo Bonzini


On 12/06/2011 09:42 AM, Chunyan Liu wrote:

>> I do not understand why you cannot simply do it like this:
>>
>> - in the server thread, do everything as it is now
>
> Nope. When the device changes, both the client thread and the server thread
> should be refreshed. sockpath and sharing_fds[] change with a different device.


Then let's change the default sockpath to include the pid rather than 
the NBD device name.


Paolo
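
A minimal sketch of a pid-based default socket path, assuming a path template of the form shown in SOCKPATH_FMT. The template, the macro name and the helper default_sockpath are illustrative assumptions; the thread does not spell out the real path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed template; the actual default path is not given in the thread. */
#define SOCKPATH_FMT "/var/lock/qemu-nbd-%ld"

/* Format the default socket path for the given pid into buf.
 * Returns 0 on success, -1 if the result would not fit. */
static int default_sockpath(char *buf, size_t len, long pid)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, len, SOCKPATH_FMT, pid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Because the pid does not change while the process probes different /dev/nbdX nodes, a path built this way would stay the same across attempts, so the server side would not need to be set up again for each candidate device.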

Re: [Qemu-devel] [PATCH v4 2/3] Extract code to nbd_setup function to be used for many purposes

2011-12-06 Thread Chunyan Liu

2011/12/6 Paolo Bonzini 

> On 12/06/2011 09:42 AM, Chunyan Liu wrote:
>
>>> I do not understand why you cannot simply do it like this:
>>>
>>> - in the server thread, do everything as it is now
>>
>> Nope. When the device changes, both the client thread and the server thread
>> should be refreshed. sockpath and sharing_fds[] change with a different device.
>
> Then let's change the default sockpath to include the pid rather than the
> NBD device name.

Sounds good. Will try. Thanks.


> Paolo
>
>

[Qemu-devel] About the snapshot

2011-12-06 Thread Zhi Hui Li



1) :

for example:

BDRVQcowState *s = bs->opaque;

s->snapshots
s->nb_snapshots


1: With the command:   qemu-img snapshot ./test.qcow2  -c aa
the memory for s->snapshots is not freed.
If s->nb_snapshots is large, does this cause problems?

2: With the command:  qemu-system-x86_64  ./test.qcow2 -snapshot
when the program ends, does s->snapshots need to be freed?




2):
In the function
qcow2_update_snapshot_refcount

there are several "goto fail" statements.
If the function has run for a while and then something makes it goto fail,
I am not sure whether that leaves the refcount incorrect.

Re: [Qemu-devel] Qemu stable releases

2011-12-06 Thread Stefan Hajnoczi

On Mon, Dec 5, 2011 at 8:08 PM, Justin M. Forbes  wrote:
> The stable tree for 1.0 has now been created and the mailing list
> exists.

Where does the stable 1.0 tree live?

Stefan

Re: [Qemu-devel] About the snapshot

2011-12-06 Thread Stefan Hajnoczi

On Tue, Dec 6, 2011 at 9:07 AM, Zhi Hui Li  wrote:
>
> 1) :
>
> for example:
>
> BDRVQcowState *s = bs->opaque;
>
> s->snapshots
> s->nb_snapshots
>
>
> 1: With the command:   qemu-img snapshot ./test.qcow2  -c aa
> the memory for s->snapshots is not freed.
> If s->nb_snapshots is large, does this cause problems?
>
> 2: With the command:  qemu-system-x86_64  ./test.qcow2 -snapshot
> when the program ends, does s->snapshots need to be freed?

These two commands are unrelated.  QEMU uses the term "snapshot" for
several different features:

1. qemu-img snapshot refers to "internal snapshots" that are contained
within qcow2 image files.  The savevm/loadvm/delvm monitor commands
operate on internal snapshots.

2. qemu -snapshot refers to a temporary qcow2 image file created to
buffer any data that the guest writes.  When QEMU exits your disk
image is not modified and the temporary qcow2 file is deleted.  You
can also apply the buffer to the disk image using the "commit" monitor
command.

> 2):
> In the function
> qcow2_update_snapshot_refcount
>
> there are several "goto fail" statements.
> If the function has run for a while and then something makes it goto fail,
> I am not sure whether that leaves the refcount incorrect.

When an error occurs it's possible that refcount leaks are introduced
(the refcount was incremented but will never be used), but we should
never decrement a reference that is still in use.

Please be more specific about the problem so that Kevin or I can take a look.

Stefan

Re: [Qemu-devel] About the snapshot

2011-12-06 Thread Zhi Hui Li


On 2011-12-06 17:40, Stefan Hajnoczi wrote:
> On Tue, Dec 6, 2011 at 9:07 AM, Zhi Hui Li  wrote:
>>
>> 1) :
>>
>> for example:
>>
>> BDRVQcowState *s = bs->opaque;
>>
>> s->snapshots
>> s->nb_snapshots
>>
>>
>> 1: With the command:   qemu-img snapshot ./test.qcow2  -c aa
>> the memory for s->snapshots is not freed.
>> If s->nb_snapshots is large, does this cause problems?
>>
>> 2: With the command:  qemu-system-x86_64  ./test.qcow2 -snapshot
>> when the program ends, does s->snapshots need to be freed?
>
> These two commands are unrelated.  QEMU uses the term "snapshot" for
> several different features:
>
> 1. qemu-img snapshot refers to "internal snapshots" that are contained
> within qcow2 image files.  The savevm/loadvm/delvm monitor commands
> operate on internal snapshots.
>
> 2. qemu -snapshot refers to a temporary qcow2 image file created to
> buffer any data that the guest writes.  When QEMU exits your disk
> image is not modified and the temporary qcow2 file is deleted.  You
> can also apply the buffer to the disk image using the "commit" monitor
> command.

Yes, I understand, but qemu-img and savevm both call
qcow2_snapshot_create. When I use the command
qemu-img snapshot ./test.qcow2  -c aa,
the memory for s->snapshots is not freed.

>> 2):
>> In the function
>> qcow2_update_snapshot_refcount
>>
>> there are several "goto fail" statements.
>> If the function has run for a while and then something makes it goto fail,
>> I am not sure whether that leaves the refcount incorrect.
>
> When an error occurs it's possible that refcount leaks are introduced
> (the refcount was incremented but will never be used), but we should
> never decrement a reference that is still in use.
>
> Please be more specific about the problem so that Kevin or I can take a look.
>
> Stefan

OK, I got it.

Thank you very much!

Re: [Qemu-devel] [Qemu-trivial] [PATCH] Convert source files to UTF-8 encoding

2011-12-06 Thread Stefan Hajnoczi

On Fri, Dec 02, 2011 at 10:30:41AM +0100, Stefan Weil wrote:
> Most QEMU files either are pure ASCII or use UTF-8.
> Convert some files which still used ISO-8859-1 to UTF-8.
> 
> Signed-off-by: Stefan Weil 
> ---
>  hw/ds1225y.c   |2 +-
>  hw/fdc.c   |2 +-
>  hw/jazz_led.c  |2 +-
>  hw/tc6393xb_template.h |2 +-
>  hw/vmport.c|2 +-
>  5 files changed, 5 insertions(+), 5 deletions(-)

Thänk§, äppliéd to thé triviäl pätché§ -néxt tréé:
http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/trivial-patches-next

Stefan

Re: [Qemu-devel] [PATCH 0/2] linux-user: Fix non-UTF-8 characters

2011-12-06 Thread Stefan Hajnoczi

On Sat, Dec 03, 2011 at 08:55:59AM +0100, Stefan Weil wrote:
> Am 02.12.2011 20:30, schrieb Peter Maydell:
> >These two patches fix some non-UTF-8 characters in linux-user/
> >source files. Other than the five files Stefan Weil recently sent
> >a patch to fix, these are the only remaining non-UTF-8 characters
> >in the source tree.
> >
> >Since I don't entirely trust the email path for sending patches
> >to files where the diffs include non-UTF-8, I've also put the
> >changes up in a git repo, so you may prefer to pull from there:
> 
> 'git am' accepted both patches here without any problem.
> 
> >The following changes since commit 1c8a881daaca6fe0646a425b0970fb3ad25f6732:
> >
> >   Update version for 1.0 release (2011-12-01 14:04:21 -0600)
> >
> >are available in the git repository at:
> >   git://git.linaro.org/people/pmaydell/qemu-arm.git non-utf-fixes
> >
> >Peter Maydell (2):
> >   linux-user/cpu-uname.c: Convert to UTF-8
> >   linux-user/arm/nwfpe/fpopcode.h: Fix non-UTF-8 characters
> >
> >  linux-user/arm/nwfpe/fpopcode.h |   34 +-
> >  linux-user/cpu-uname.c  |2 +-
> >  2 files changed, 18 insertions(+), 18 deletions(-)
> 
> Reviewed-by: Stefan Weil 
> 
> I cc'ed qemu-trivial, too. Stefan, you can take both patches
> into the trivial patch queue.

Thänk§, äppliéd to thé triviäl pätché§ -néxt tréé.

Stefan

Re: [Qemu-devel] About the snapshot

2011-12-06 Thread Stefan Hajnoczi

On Tue, Dec 6, 2011 at 10:01 AM, Zhi Hui Li  wrote:
> On 2011-12-06 17:40, Stefan Hajnoczi wrote:
>>
>> On Tue, Dec 6, 2011 at 9:07 AM, Zhi Hui Li
>>  wrote:
>>>
>>>
>>> 1) :
>>>
>>> for example:
>>>
>>> BDRVQcowState *s = bs->opaque;
>>>
>>> s->snapshots
>>> s->nb_snapshots
>>>
>>>
>>> 1: With the command:   qemu-img snapshot ./test.qcow2  -c aa
>>> the memory for s->snapshots is not freed.
>>> If s->nb_snapshots is large, does this cause problems?
>>>
>>> 2: With the command:  qemu-system-x86_64  ./test.qcow2 -snapshot
>>> when the program ends, does s->snapshots need to be freed?
>>
>>
>> These two commands are unrelated.  QEMU uses the term "snapshot" for
>> several different features:
>>
>> 1. qemu-img snapshot refers to "internal snapshots" that are contained
>> within qcow2 image files.  The savevm/loadvm/delvm monitor commands
>> operate on internal snapshots.
>>
>> 2. qemu -snapshot refers to a temporary qcow2 image file created to
>> buffer any data that the guest writes.  When QEMU exits your disk
>> image is not modified and the temporary qcow2 file is deleted.  You
>> can also apply the buffer to the disk image using the "commit" monitor
>> command.
>
>
> Yes, I understand, but qemu-img and savevm both call
> qcow2_snapshot_create. When I use the command
> qemu-img snapshot ./test.qcow2  -c aa,
> the memory for s->snapshots is not freed.

Okay, I think you're saying that in #1 s->snapshots is leaked because
qcow2_free_snapshots() is not being called from qcow2_close().

Do you want to send a patch to fix this?

Stefan
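
The leak described above has the following general shape. This is a minimal sketch with simplified, hypothetical types (Snapshot, ImageState) and helpers (add_snapshot, free_snapshots, image_close); it is not the qcow2 driver code, only an illustration of why the close path has to release the in-memory snapshot table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical stand-ins for the driver state. */
typedef struct {
    char *name;
} Snapshot;

typedef struct {
    Snapshot *snapshots;
    int nb_snapshots;
} ImageState;

/* Append one snapshot entry; returns 0 on success, -1 on allocation failure. */
static int add_snapshot(ImageState *s, const char *name)
{
    Snapshot *table;
    char *copy;

    assert(s != NULL && name != NULL);
    copy = malloc(strlen(name) + 1);
    if (copy == NULL) {
        return -1;
    }
    strcpy(copy, name);
    table = realloc(s->snapshots,
                    ((size_t)s->nb_snapshots + 1) * sizeof(*table));
    if (table == NULL) {
        free(copy);
        return -1;
    }
    table[s->nb_snapshots].name = copy;
    s->snapshots = table;
    s->nb_snapshots++;
    return 0;
}

/* Release the in-memory snapshot table and reset the bookkeeping. */
static void free_snapshots(ImageState *s)
{
    int i;

    for (i = 0; i < s->nb_snapshots; i++) {
        free(s->snapshots[i].name);
    }
    free(s->snapshots);
    s->snapshots = NULL;
    s->nb_snapshots = 0;
}

/* Without the free_snapshots() call here, the table outlives the image. */
static void image_close(ImageState *s)
{
    assert(s != NULL);
    free_snapshots(s);
}
```

For a short-lived tool the operating system reclaims the memory at exit anyway, but a long-running process that opens and closes many images would accumulate one leaked table per close.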

Re: [Qemu-devel] [ANNOUNCE] QEMU 1.0 release

2011-12-06 Thread Avi Kivity

On 12/05/2011 08:05 PM, Anthony Liguori wrote:
> On 12/05/2011 11:50 AM, Avi Kivity wrote:
>> On 12/05/2011 05:28 PM, Anthony Liguori wrote:
>>> On 12/05/2011 09:06 AM, Alex Jia wrote:
>>>> Hi Anthony,
>>>> It seems the following link is unavailable now:
>>>> http://wiki.qemu.org/download/qemu-1.0.tar.gz
>>>
>>> The VM is crashing pretty often.
>>
>> Out of curiosity, do you know why?
>
> Yes :-)

Please share it if you can (publicly or privately), then, or curiosity
will kill another cat.

-- 
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [Qemu-trivial] [PATCH] Convert keymap file to UTF-8 encoding

2011-12-06 Thread Stefan Hajnoczi

On Sat, Dec 03, 2011 at 10:45:25AM +0100, Stefan Weil wrote:
> Most QEMU files either are pure ASCII or use UTF-8.
> Convert this keymap file which still used ISO-8859-1 to UTF-8.
> 
> Signed-off-by: Stefan Weil 
> ---
>  pc-bios/keymaps/is |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)

Thanks, applied to the trivial patches tree:
http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/trivial-patches

Stefan

Re: [Qemu-devel] [Qemu-trivial] endless loop when use qemu-system-mipsel to load bios

2011-12-06 Thread Stefan Hajnoczi

On Sat, Dec 03, 2011 at 11:27:29PM +0100, Stefan Weil wrote:
> Am 16.11.2011 16:21, schrieb rui chen:
> >sorry, here is my new patch file:
> >
> >From 05f4abe8d8c37f1585f2bb7cb89b15426044bb65 Mon Sep 17 00:00:00 2001
> >From: Chen Rui <chenn...@gmail.com>
> >Date: Sun, 13 Nov 2011 19:42:42 +0800
> >Subject: [PATCH] resolve an endless loop when use
> >qemu-system-mipsel to load bios
> >
> >
> >Signed-off-by: Chen Rui <chenn...@gmail.com>
> >---
> > hw/mips_malta.c |1 +
> > 1 files changed, 1 insertions(+), 0 deletions(-)
> >
> >diff --git a/hw/mips_malta.c b/hw/mips_malta.c
> >index bb49749..e7dfbd6 100644
> >--- a/hw/mips_malta.c
> >+++ b/hw/mips_malta.c
> >@@ -911,6 +911,7 @@ void mips_malta_init (ram_addr_t ram_size,
> > uint32_t *end = addr + bios_size;
> > while (addr < end) {
> > bswap32s(addr);
> >+addr++;
> > }
> > }
> > #endif
> >-- 
> >1.7.1
> 
> This patch fixes a regression introduced by commit
> d758525180e0efff8a59cfea11f5f8348014ff6a
> (More phys_ram_base elimination).
> 
> Please apply it to QEMU git master (with a modified subject, s/use/using/).
> 
> Tested-by: Stefan Weil 

Thanks, applied to the trivial patches -next tree.

Stefan
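
The bug fixed by the patch above is a loop whose cursor never advances. A minimal, self-contained sketch of the corrected loop shape follows; swap32_in_place and swap_buffer are hypothetical names, and the sketch takes the buffer size in bytes, which is a choice made here and not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the byte order of one 32-bit word in place. */
static void swap32_in_place(uint32_t *p)
{
    uint32_t v = *p;

    *p = (v >> 24) | ((v >> 8) & 0x0000ff00u) |
         ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Byte-swap every 32-bit word in a buffer of the given size in bytes. */
static void swap_buffer(uint32_t *addr, size_t bytes)
{
    uint32_t *end;

    assert(addr != NULL);
    end = addr + bytes / sizeof(uint32_t);
    while (addr < end) {
        swap32_in_place(addr);
        addr++;   /* without this step the loop condition never changes */
    }
}
```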

Re: [Qemu-devel] [PATCH] Rename get_tls to tls_var

2011-12-06 Thread Stefan Hajnoczi

On Mon, Dec 05, 2011 at 03:18:54PM +0100, Jan Kiszka wrote:
> get_tls() can serve as a lvalue as well, so 'get' might be confusing.
> 
> Signed-off-by: Jan Kiszka 
> ---
>  cpu-all.h  |2 +-
>  qemu-tls.h |4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)

Thanks, applied to the trivial patches tree:
http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/trivial-patches

Stefan

Re: [Qemu-devel] [Qemu-trivial] [PATCH] linux-user/syscall.c: Don't skip stracing for fcntl64 failure case

2011-12-06 Thread Stefan Hajnoczi

On Mon, Dec 05, 2011 at 11:11:50PM +, Peter Maydell wrote:
> In an fcntl64 failure path, we were returning directly rather than
> simply breaking out of the switch statement. This skips the strace
> code for printing the syscall return value, so don't do that.
> 
> Signed-off-by: Peter Maydell 
> ---
> Alex Graf spotted this one...
> 
>  linux-user/syscall.c |6 --
>  1 files changed, 4 insertions(+), 2 deletions(-)

Thanks, applied to the trivial patches tree:
http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/trivial-patches

Stefan

Re: [Qemu-devel] [Qemu-trivial] [PATCH] memory: minor documentation fixes/enhancements

2011-12-06 Thread Stefan Hajnoczi

On Mon, Dec 05, 2011 at 10:48:37PM +0100, Stefan Weil wrote:
> Am 05.12.2011 20:54, schrieb Ademar de Souza Reis Jr:
> >Fix typos and minor documentation errors in both memory.h and
> >docs/memory.txt.
> >
> >Also add missing documentation formatting tags to transaction
> >functions.
> >
> >Signed-off-by: Ademar de Souza Reis Jr
> >---
> >  docs/memory.txt |6 +++---
> >  memory.h|   38 ++
> >  2 files changed, 25 insertions(+), 19 deletions(-)

Thanks, applied to the trivial-patches -next tree.

Stefan

Re: [Qemu-devel] [PATCH v3 for-1.1 0/9] target-arm: More inference rules for features

2011-12-06 Thread Peter Maydell

On 6 December 2011 00:30, Andreas Färber  wrote:
> Question: Should we also add the following rules?
> V7MP => V7

Well, V7MP indicates support for the Multiprocessing Extensions, which
are an optional extension to ARMv7-A and ARMv7-R. So in some sense
V7MP implies V7. However it's not like the "main series" v4-v5-v6-v7.
So I think I'd rather not have the rule there: then when you're defining
the CPU you're effectively defining the main architecture version (V7)
and also specifying any optional extensions (V7MP).

> THUMB2EE => THUMB2

Again, this is kind of true by accident -- THUMB2EE is an extension
to V7AR, and V7AR mandates THUMB2, so any core with THUMB2EE has
THUMB2, but it's not the kind of dependency I'd want to put in a rule
for. (V7 and A profile (not R) does imply THUMB2EE, but we don't
currently have a feature flag to distinguish A and R profiles unless
you want to claim that FEATURE_MPU does that...)

-- PMM

[Qemu-devel] [PATCH 00/11] virtio-scsi device model

2011-12-06 Thread Paolo Bonzini

Given that the discussion on the spec has converged, here is finally
the virtio-scsi device model.

The first patch is an (already posted) bug fix.  The next 6 patches
add scatter/gather support to the SCSI layer; the final 4 add the
device---first a stub, and then progressively more features.

Paolo Bonzini (9):
  qiov: prevent double free or use-after-free
  dma-helpers: make QEMUSGList target independent
  dma-helpers: add dma_buf_read and dma_buf_write
  dma-helpers: add accounting wrappers
  scsi: pass residual amount to command_complete
  scsi: add scatter/gather functionality
  scsi-disk: enable scatter/gather functionality
  virtio-scsi: add basic SCSI bus operation
  virtio-scsi: process control queue requests

Stefan Hajnoczi (2):
  virtio-scsi: Add virtio-scsi stub device
  virtio-scsi: Add basic request processing infrastructure

 Makefile.target   |1 +
 cutils.c  |3 +
 default-configs/pci.mak   |1 +
 default-configs/s390x-softmmu.mak |1 +
 dma-helpers.c |   36 +++
 dma.h |   20 +-
 hw/esp.c  |5 +-
 hw/ide/ahci.c |   10 +-
 hw/lsi53c895a.c   |4 +-
 hw/pci.h  |1 +
 hw/s390-virtio-bus.c  |   24 ++
 hw/s390-virtio-bus.h  |2 +
 hw/scsi-bus.c |   35 ++-
 hw/scsi-disk.c|   63 -
 hw/scsi.h |7 +-
 hw/spapr_vscsi.c  |4 +-
 hw/usb-msd.c  |4 +-
 hw/virtio-pci.c   |   41 +++
 hw/virtio-pci.h   |2 +
 hw/virtio-scsi.c  |  553 +
 hw/virtio-scsi.h  |   36 +++
 hw/virtio.h   |3 +
 22 files changed, 815 insertions(+), 41 deletions(-)
 create mode 100644 hw/virtio-scsi.c
 create mode 100644 hw/virtio-scsi.h

-- 
1.7.7.1

[Qemu-devel] [PATCH 04/11] dma-helpers: add accounting wrappers

2011-12-06 Thread Paolo Bonzini

The length of the transfer is already in the sglist, so add a wrapper
that fetches it.

Signed-off-by: Paolo Bonzini 
---
 dma-helpers.c |6 ++
 dma.h |3 +++
 hw/ide/ahci.c |   10 --
 3 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/dma-helpers.c b/dma-helpers.c
index f53a51f..a773489 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -234,3 +234,9 @@ uint64_t dma_buf_write(uint8_t *ptr, int32_t len, QEMUSGList *sg)
 {
 return dma_buf_rw(ptr, len, sg, 1);
 }
+
+void dma_acct_start(BlockDriverState *bs, BlockAcctCookie *cookie,
+QEMUSGList *sg, enum BlockAcctType type)
+{
+bdrv_acct_start(bs, cookie, sg->size, type);
+}
diff --git a/dma.h b/dma.h
index 346ac4f..20e86d2 100644
--- a/dma.h
+++ b/dma.h
@@ -61,4 +61,7 @@ BlockDriverAIOCB *dma_bdrv_write(BlockDriverState *bs,
 uint64_t dma_buf_read(uint8_t *ptr, int32_t len, QEMUSGList *sg);
 uint64_t dma_buf_write(uint8_t *ptr, int32_t len, QEMUSGList *sg);
 
+void dma_acct_start(BlockDriverState *bs, BlockAcctCookie *cookie,
+QEMUSGList *sg, enum BlockAcctType type);
+
 #endif
diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 0af201d..28f32cc 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -794,9 +794,8 @@ static void process_ncq_command(AHCIState *s, int port, uint8_t *cmd_fis,
 DPRINTF(port, "tag %d aio read %"PRId64"\n",
 ncq_tfs->tag, ncq_tfs->lba);
 
-bdrv_acct_start(ncq_tfs->drive->port.ifs[0].bs, &ncq_tfs->acct,
-(ncq_tfs->sector_count-1) * BDRV_SECTOR_SIZE,
-BDRV_ACCT_READ);
+dma_acct_start(ncq_tfs->drive->port.ifs[0].bs, &ncq_tfs->acct,
+   &ncq_tfs->sglist, BDRV_ACCT_READ);
 ncq_tfs->aiocb = dma_bdrv_read(ncq_tfs->drive->port.ifs[0].bs,
&ncq_tfs->sglist, ncq_tfs->lba,
ncq_cb, ncq_tfs);
@@ -808,9 +807,8 @@ static void process_ncq_command(AHCIState *s, int port, uint8_t *cmd_fis,
 DPRINTF(port, "tag %d aio write %"PRId64"\n",
 ncq_tfs->tag, ncq_tfs->lba);
 
-bdrv_acct_start(ncq_tfs->drive->port.ifs[0].bs, &ncq_tfs->acct,
-(ncq_tfs->sector_count-1) * BDRV_SECTOR_SIZE,
-BDRV_ACCT_WRITE);
+dma_acct_start(ncq_tfs->drive->port.ifs[0].bs, &ncq_tfs->acct,
+   &ncq_tfs->sglist, BDRV_ACCT_WRITE);
 ncq_tfs->aiocb = dma_bdrv_write(ncq_tfs->drive->port.ifs[0].bs,
 &ncq_tfs->sglist, ncq_tfs->lba,
 ncq_cb, ncq_tfs);
-- 
1.7.7.1

[Qemu-devel] [PATCH 01/11] qiov: prevent double free or use-after-free

2011-12-06 Thread Paolo Bonzini

qemu_iovec_destroy does not clear the QEMUIOVector fully, and the data
could thus be used after free or freed again.  This can be observed with
virtio-scsi, because canceling DMA requests can happen more easily with
SCSI (due to task management functions) than with other backends.

Signed-off-by: Paolo Bonzini 
---
 cutils.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/cutils.c b/cutils.c
index 6db6304..24b3fe3 100644
--- a/cutils.c
+++ b/cutils.c
@@ -217,7 +217,10 @@ void qemu_iovec_destroy(QEMUIOVector *qiov)
 {
 assert(qiov->nalloc != -1);
 
+qemu_iovec_reset(qiov);
 g_free(qiov->iov);
+qiov->nalloc = 0;
+qiov->iov = NULL;
 }
 
 void qemu_iovec_reset(QEMUIOVector *qiov)
-- 
1.7.7.1
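
The pattern behind this fix, resetting every field after freeing so that a second destroy or a stale use finds an empty object, can be shown on a cut-down vector type. Vec, vec_init and vec_destroy are hypothetical names for this sketch and are not QEMU's QEMUIOVector API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, cut-down vector type used only for this illustration. */
typedef struct {
    int *items;
    int nitems;
    int nalloc;
} Vec;

static void vec_init(Vec *v, int alloc_hint)
{
    assert(v != NULL && alloc_hint > 0);
    v->items = malloc((size_t)alloc_hint * sizeof(*v->items));
    v->nitems = 0;
    v->nalloc = v->items ? alloc_hint : 0;
}

static void vec_destroy(Vec *v)
{
    assert(v != NULL);
    free(v->items);    /* free(NULL) is a no-op, so a repeat call is harmless */
    v->items = NULL;   /* no dangling pointer is left behind */
    v->nitems = 0;
    v->nalloc = 0;
}
```

Calling vec_destroy twice on the same object is safe in this sketch, which is the property the patch aims for when requests can be cancelled while they are in flight.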

[Qemu-devel] [PATCH 11/11] virtio-scsi: process control queue requests

2011-12-06 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 hw/virtio-scsi.c |  128 ++---
 1 files changed, 120 insertions(+), 8 deletions(-)

diff --git a/hw/virtio-scsi.c b/hw/virtio-scsi.c
index dbb4003..601a646 100644
--- a/hw/virtio-scsi.c
+++ b/hw/virtio-scsi.c
@@ -134,6 +134,7 @@ typedef struct {
 VirtQueue *cmd_vq;
 uint32_t sense_size;
 uint32_t cdb_size;
+bool resetting;
 } VirtIOSCSI;
 
 typedef struct VirtIOSCSIReq {
@@ -236,15 +237,98 @@ static VirtIOSCSIReq *virtio_scsi_pop_req(VirtIOSCSI *s, VirtQueue *vq)
 return req;
 }
 
-static void virtio_scsi_fail_ctrl_req(VirtIOSCSIReq *req)
+static void virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req)
 {
-if (req->req.tmf->type == VIRTIO_SCSI_T_TMF) {
-req->resp.tmf->response = VIRTIO_SCSI_S_FAILURE;
-} else {
-req->resp.an->response = VIRTIO_SCSI_S_FAILURE;
+SCSIDevice *d = virtio_scsi_device_find(s, req->req.cmd->lun);
+SCSIRequest *r, *next;
+DeviceState *qdev;
+int target;
+
+switch (req->req.tmf->subtype) {
+case VIRTIO_SCSI_T_TMF_ABORT_TASK:
+case VIRTIO_SCSI_T_TMF_QUERY_TASK:
+d = virtio_scsi_device_find(s, req->req.cmd->lun);
+if (!d) {
+goto fail;
+}
+if (d->lun != virtio_scsi_get_lun(req->req.cmd->lun)) {
+req->resp.tmf->response = VIRTIO_SCSI_S_INCORRECT_LUN;
+break;
+}
+QTAILQ_FOREACH_SAFE(r, &d->requests, next, next) {
+if (r->tag == req->req.cmd->tag) {
+break;
+}
+}
+if (r && r->hba_private) {
+if (req->req.tmf->subtype == VIRTIO_SCSI_T_TMF_ABORT_TASK) {
+scsi_req_cancel(r);
+}
+req->resp.tmf->response = VIRTIO_SCSI_S_FUNCTION_SUCCEEDED;
+} else {
+req->resp.tmf->response = VIRTIO_SCSI_S_OK;
+}
+break;
+
+case VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET:
+d = virtio_scsi_device_find(s, req->req.cmd->lun);
+if (!d) {
+goto fail;
+}
+if (d->lun == virtio_scsi_get_lun(req->req.cmd->lun)) {
+s->resetting++;
+qdev_reset_all(&d->qdev);
+s->resetting--;
+}
+break;
+
+case VIRTIO_SCSI_T_TMF_ABORT_TASK_SET:
+case VIRTIO_SCSI_T_TMF_CLEAR_TASK_SET:
+case VIRTIO_SCSI_T_TMF_QUERY_TASK_SET:
+d = virtio_scsi_device_find(s, req->req.cmd->lun);
+if (!d) {
+goto fail;
+}
+if (d->lun != virtio_scsi_get_lun(req->req.cmd->lun)) {
+req->resp.tmf->response = VIRTIO_SCSI_S_INCORRECT_LUN;
+break;
+}
+req->resp.tmf->response = VIRTIO_SCSI_S_OK;
+QTAILQ_FOREACH_SAFE(r, &d->requests, next, next) {
+if (r->hba_private) {
+if (req->req.tmf->subtype != VIRTIO_SCSI_T_TMF_QUERY_TASK) {
+scsi_req_cancel(r);
+}
+req->resp.tmf->response = VIRTIO_SCSI_S_FUNCTION_SUCCEEDED;
+}
+}
+break;
+
+case VIRTIO_SCSI_T_TMF_I_T_NEXUS_RESET:
+target = req->req.cmd->lun[1];
+s->resetting++;
+QTAILQ_FOREACH(qdev, &s->bus.qbus.children, sibling) {
+ d = DO_UPCAST(SCSIDevice, qdev, qdev);
+ if (d->channel == 0 && d->id == target) {
+qdev_reset_all(&d->qdev);
+ }
+}
+s->resetting--;
+break;
+
+case VIRTIO_SCSI_T_TMF_CLEAR_ACA:
+req->resp.tmf->response = VIRTIO_SCSI_S_FUNCTION_SUCCEEDED;
+break;
+
+default:
+req->resp.tmf->response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
+break;
 }
 
-virtio_scsi_complete_req(req);
+return;
+
+fail:
+req->resp.tmf->response = VIRTIO_SCSI_S_BAD_TARGET;
 }
 
 static void virtio_scsi_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
@@ -253,7 +334,31 @@ static void virtio_scsi_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 VirtIOSCSIReq *req;
 
 while ((req = virtio_scsi_pop_req(s, vq))) {
-virtio_scsi_fail_ctrl_req(req);
+int out_size, in_size;
+if (req->elem.out_num < 1 || req->elem.in_num < 1) {
+virtio_scsi_bad_req();
+continue;
+}
+
+out_size = req->elem.out_sg[0].iov_len;
+in_size = req->elem.in_sg[0].iov_len;
+if (req->req.tmf->type == VIRTIO_SCSI_T_TMF) {
+if (out_size < sizeof(VirtIOSCSICtrlTMFReq) ||
+in_size < sizeof(VirtIOSCSICtrlTMFResp)) {
+virtio_scsi_bad_req();
+}
+virtio_scsi_do_tmf(s, req);
+
+} else if (req->req.tmf->type == VIRTIO_SCSI_T_AN_QUERY ||
+   req->req.tmf->type == VIRTIO_SCSI_T_AN_SUBSCRIBE) {
+if (out_size < sizeof(VirtIOSCSICtrlANReq) ||
+in_size < sizeof(VirtIOSCSICtrlANResp)) {
+virtio_scsi_bad_req();
+}
+req-

[Qemu-devel] [PATCH 05/19] configure: Print a banner comment at the top of config.log

2011-12-06 Thread Stefan Hajnoczi

From: Peter Maydell 

Print a banner comment at the top of config.log identifying
when configure was run and the arguments used. This is occasionally
useful for debugging purposes.

Signed-off-by: Peter Maydell 
Signed-off-by: Stefan Hajnoczi 
---
 configure |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index c98aed1..452b8cf 100755
--- a/configure
+++ b/configure
@@ -20,6 +20,11 @@ TMPE="${TMPDIR1}/qemu-conf-${RANDOM}-$$-${RANDOM}.exe"
 trap "rm -f $TMPC $TMPO $TMPE" EXIT INT QUIT TERM
 rm -f config.log
 
+# Print a helpful header at the top of config.log
+echo "# QEMU configure log $(date)" >> config.log
+echo "# produced by $0 $*" >> config.log
+echo "#" >> config.log
+
 compile_object() {
   echo $cc $QEMU_CFLAGS -c -o $TMPO $TMPC >> config.log
   $cc $QEMU_CFLAGS -c -o $TMPO $TMPC >> config.log 2>&1
-- 
1.7.7.3

[Qemu-devel] [PATCH 04/19] configure: Include #define name in check_define compiler error

2011-12-06 Thread Stefan Hajnoczi

From: Peter Maydell 

Include the name of the #define being tested for in the compiler
error produced when a check_define test is run and fails. This
appears only in the config.log, but it does make it a little easier
to debug problems by inspecting config.log.

Signed-off-by: Peter Maydell 
Signed-off-by: Stefan Hajnoczi 
---
 configure |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index 5fbd812..c98aed1 100755
--- a/configure
+++ b/configure
@@ -249,7 +249,7 @@ source_path=`cd "$source_path"; pwd`
 check_define() {
 cat > $TMPC <

[Qemu-devel] [PATCH 14/19] linux-user/cpu-uname.c: Convert to UTF-8

2011-12-06 Thread Stefan Hajnoczi

From: Peter Maydell 

Convert comment from ISO-8859-1 encoding to UTF-8 to match the rest
of QEMU's source code.

Reviewed-by: Stefan Weil 
Signed-off-by: Peter Maydell 
Signed-off-by: Stefan Hajnoczi 
---
 linux-user/cpu-uname.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/linux-user/cpu-uname.c b/linux-user/cpu-uname.c
index 23afede..ddc37be 100644
--- a/linux-user/cpu-uname.c
+++ b/linux-user/cpu-uname.c
@@ -1,7 +1,7 @@
 /*
  *  cpu to uname machine name map
  *
- *  Copyright (c) 2009 Lo�c Minier
+ *  Copyright (c) 2009 Loïc Minier
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
-- 
1.7.7.3

[Qemu-devel] [PATCH 16/19] Rename get_tls to tls_var

2011-12-06 Thread Stefan Hajnoczi

From: Jan Kiszka 

get_tls() can serve as a lvalue as well, so 'get' might be confusing.

Signed-off-by: Jan Kiszka 
Signed-off-by: Stefan Hajnoczi 
---
 cpu-all.h  |2 +-
 qemu-tls.h |4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 7246a67..9d78715 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -336,7 +336,7 @@ void QEMU_NORETURN cpu_abort(CPUState *env, const char *fmt, ...)
 GCC_FMT_ATTR(2, 3);
 extern CPUState *first_cpu;
 DECLARE_TLS(CPUState *,cpu_single_env);
-#define cpu_single_env get_tls(cpu_single_env)
+#define cpu_single_env tls_var(cpu_single_env)
 
 /* Flags for use in ENV->INTERRUPT_PENDING.
 
diff --git a/qemu-tls.h b/qemu-tls.h
index 5b70f10..b92ea9d 100644
--- a/qemu-tls.h
+++ b/qemu-tls.h
@@ -41,12 +41,12 @@
 #ifdef __linux__
 #define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
 #define DEFINE_TLS(type, x)  __thread __typeof__(type) tls__##x
-#define get_tls(x)   tls__##x
+#define tls_var(x)   tls__##x
 #else
 /* Dummy implementations which define plain global variables */
 #define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
 #define DEFINE_TLS(type, x)  __typeof__(type) tls__##x
-#define get_tls(x)   tls__##x
+#define tls_var(x)   tls__##x
 #endif
 
 #endif
-- 
1.7.7.3
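
The reason given for the rename, that the macro result can be assigned to, can be seen in a small sketch. It uses a simplified plain-global form of the macros (no __thread and no __typeof__), and the variable counter and the function bump are hypothetical names.

```c
#include <assert.h>

/* Simplified plain-global variant of the macros from the patch. */
#define DEFINE_TLS(type, x)  type tls__##x
#define tls_var(x)           tls__##x

DEFINE_TLS(int, counter);
#define counter tls_var(counter)

static int bump(void)
{
    int *p = &counter;          /* taking the address works: it is an lvalue */

    assert(p == &tls__counter);
    counter = counter + 1;      /* assignment through the macro works too */
    return *p;
}
```

Since the macro is used on both sides of an assignment, a name that describes the variable fits better than one that suggests a read-only getter.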

[Qemu-devel] [PATCH 17/19] linux-user/syscall.c: Don't skip stracing for fcntl64 failure case

2011-12-06 Thread Stefan Hajnoczi

From: Peter Maydell 

In an fcntl64 failure path, we were returning directly rather than
simply breaking out of the switch statement. This skips the strace
code for printing the syscall return value, so don't do that.

Acked-by: Alexander Graf 
Signed-off-by: Peter Maydell 
Signed-off-by: Stefan Hajnoczi 
---
 linux-user/syscall.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index c84cc65..2bf9e7e 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -7521,8 +7521,10 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #endif
 
cmd = target_to_host_fcntl_cmd(arg2);
-   if (cmd == -TARGET_EINVAL)
-   return cmd;
+if (cmd == -TARGET_EINVAL) {
+ret = cmd;
+break;
+}
 
 switch(arg2) {
 case TARGET_F_GETLK64:
-- 
1.7.7.3
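
The control-flow issue fixed here can be reduced to a small sketch: a function that traces its return value after a switch must leave the switch with break on error paths, because an early return skips the trace. All names below (do_call, trace_return, SYS_A, SYS_B, ERR_INVAL) are hypothetical and only model the shape of the code.

```c
#include <assert.h>

static int trace_calls;   /* counts how often the trailing trace step ran */

/* Stand-in for the strace-style printing of the return value. */
static void trace_return(long ret)
{
    (void)ret;
    trace_calls++;
}

enum { SYS_A = 1, SYS_B = 2 };
#define ERR_INVAL (-22L)

static long do_call(int num, long arg)
{
    long ret;

    assert(num == SYS_A || num == SYS_B);
    switch (num) {
    case SYS_A:
        if (arg < 0) {
            ret = ERR_INVAL;   /* record the error ... */
            break;             /* ... and leave the switch, so tracing still runs */
        }
        ret = arg;
        break;
    default:
        ret = 0;
        break;
    }
    trace_return(ret);         /* reached on success and on failure */
    return ret;
}
```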

[Qemu-devel] [PATCH 15/19] linux-user/arm/nwfpe/fpopcode.h: Fix non-UTF-8 characters

2011-12-06 Thread Stefan Hajnoczi

From: Peter Maydell 

Fix some stray non-UTF-8 characters used in some ASCII art tables
by converting them to plain ASCII '|' instead.

Reviewed-by: Stefan Weil 
Signed-off-by: Peter Maydell 
Signed-off-by: Stefan Hajnoczi 
---
 linux-user/arm/nwfpe/fpopcode.h |   34 +-
 1 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/linux-user/arm/nwfpe/fpopcode.h b/linux-user/arm/nwfpe/fpopcode.h
index e7d1009..1b1137f 100644
--- a/linux-user/arm/nwfpe/fpopcode.h
+++ b/linux-user/arm/nwfpe/fpopcode.h
@@ -75,11 +75,11 @@ TABLE 1
 +-+---+---+-+-+
 |  Precision  | u | v | FPSR.EP | length  |
 +-+---+---+-+-+
-| Single  | 0 � 0 |x| 1 words |
-| Double  | 1 � 1 |x| 2 words |
-| Extended| 1 � 1 |x| 3 words |
-| Packed decimal  | 1 � 1 |0| 3 words |
-| Expanded packed decimal | 1 � 1 |1| 4 words |
+| Single  | 0 | 0 |x| 1 words |
+| Double  | 1 | 1 |x| 2 words |
+| Extended| 1 | 1 |x| 3 words |
+| Packed decimal  | 1 | 1 |0| 3 words |
+| Expanded packed decimal | 1 | 1 |1| 4 words |
 +-+---+---+-+-+
 Note: x = don't care
 */
@@ -89,10 +89,10 @@ TABLE 2
 +---+---+-+
 | w | x | Number of registers to transfer |
 +---+---+-+
-| 0 � 1 |  1  |
-| 1 � 0 |  2  |
-| 1 � 1 |  3  |
-| 0 � 0 |  4  |
+| 0 | 1 |  1  |
+| 1 | 0 |  2  |
+| 1 | 1 |  3  |
+| 0 | 0 |  4  |
 +---+---+-+
 */
 
@@ -153,10 +153,10 @@ TABLE 5
 +-+---+---+
 |  Rounding Precision | e | f |
 +-+---+---+
-| IEEE Single precision   | 0 � 0 |
-| IEEE Double precision   | 0 � 1 |
-| IEEE Extended precision | 1 � 0 |
-| undefined (trap)| 1 � 1 |
+| IEEE Single precision   | 0 | 0 |
+| IEEE Double precision   | 0 | 1 |
+| IEEE Extended precision | 1 | 0 |
+| undefined (trap)| 1 | 1 |
 +-+---+---+
 */
 
@@ -165,10 +165,10 @@ TABLE 5
 +-+---+---+
 |  Rounding Mode  | g | h |
 +-+---+---+
-| Round to nearest (default)  | 0 � 0 |
-| Round toward plus infinity  | 0 � 1 |
-| Round toward negative infinity  | 1 � 0 |
-| Round toward zero   | 1 � 1 |
+| Round to nearest (default)  | 0 | 0 |
+| Round toward plus infinity  | 0 | 1 |
+| Round toward negative infinity  | 1 | 0 |
+| Round toward zero   | 1 | 1 |
 +-+---+---+
 */
 
-- 
1.7.7.3

[Qemu-devel] [PATCH 18/19] memory: minor documentation fixes/enhancements

2011-12-06 Thread Stefan Hajnoczi

From: Ademar de Souza Reis Jr 

Fix typos and minor documentation errors in both memory.h and
docs/memory.txt.

Also add missing documentation formatting tags to transaction
functions.

Reviewed-by: Stefan Weil 
Signed-off-by: Ademar de Souza Reis Jr 
Signed-off-by: Stefan Hajnoczi 
---
 docs/memory.txt |6 +++---
 memory.h|   38 ++
 2 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/docs/memory.txt b/docs/memory.txt
index 3fc1683..5bbee8e 100644
--- a/docs/memory.txt
+++ b/docs/memory.txt
@@ -7,7 +7,7 @@ machine.  It attempts to allow modelling of:
  - ordinary RAM
  - memory-mapped I/O (MMIO)
  - memory controllers that can dynamically reroute physical memory regions
-  to different destinations
+   to different destinations
 
 The memory model provides support for
 
@@ -121,7 +121,7 @@ pci (0-2^32-1)
 
 ram: ram@0x-0x
 
-The is a (simplified) PC memory map. The 4GB RAM block is mapped into the
+This is a (simplified) PC memory map. The 4GB RAM block is mapped into the
 system address space via two aliases: "lomem" is a 1:1 mapping of the first
 3.5GB; "himem" maps the last 0.5GB at address 4GB.  This leaves 0.5GB for the
 so-called PCI hole, that allows a 32-bit PCI bus to exist in a system with
@@ -164,7 +164,7 @@ various constraints can be supplied to control how these 
callbacks are called:
  - .impl.min_access_size, .impl.max_access_size define the access sizes
(in bytes) supported by the *implementation*; other access sizes will be
emulated using the ones available.  For example a 4-byte write will be
-   emulated using four 1-byte write, if .impl.max_access_size = 1.
+   emulated using four 1-byte writes, if .impl.max_access_size = 1.
  - .impl.valid specifies that the *implementation* only supports unaligned
accesses; unaligned accesses will be emulated by two aligned accesses.
  - .old_portio and .old_mmio can be used to ease porting from code using
diff --git a/memory.h b/memory.h
index 53bf261..beae127 100644
--- a/memory.h
+++ b/memory.h
@@ -149,7 +149,7 @@ struct MemoryRegionPortio {
 /**
  * memory_region_init: Initialize a memory region
  *
- * The region typically acts as a container for other memory regions.  Us
+ * The region typically acts as a container for other memory regions.  Use
  * memory_region_add_subregion() to add subregions.
  *
  * @mr: the #MemoryRegion to be initialized
@@ -162,7 +162,7 @@ void memory_region_init(MemoryRegion *mr,
 /**
  * memory_region_init_io: Initialize an I/O memory region.
  *
- * Accesses into the region will be cause the callbacks in @ops to be called.
+ * Accesses into the region will cause the callbacks in @ops to be called.
  * if @size is nonzero, subregions will be clipped to @size.
  *
  * @mr: the #MemoryRegion to be initialized.
@@ -180,7 +180,7 @@ void memory_region_init_io(MemoryRegion *mr,
 
 /**
  * memory_region_init_ram:  Initialize RAM memory region.  Accesses into the
- *  region will be modify memory directly.
+ *  region will modify memory directly.
  *
  * @mr: the #MemoryRegion to be initialized.
  * @dev: a device associated with the region; may be %NULL.
@@ -196,7 +196,7 @@ void memory_region_init_ram(MemoryRegion *mr,
 
 /**
  * memory_region_init_ram:  Initialize RAM memory region from a user-provided.
- *  pointer.  Accesses into the region will be modify
+ *  pointer.  Accesses into the region will modify
  *  memory directly.
  *
  * @mr: the #MemoryRegion to be initialized.
@@ -250,7 +250,7 @@ void memory_region_init_rom_device(MemoryRegion *mr,
uint64_t size);
 
 /**
- * memory_region_destroy: Destroy a memory region and relaim all resources.
+ * memory_region_destroy: Destroy a memory region and reclaim all resources.
  *
  * @mr: the region to be destroyed.  May not currently be a subregion
  *  (see memory_region_add_subregion()) or referenced in an alias
@@ -417,7 +417,7 @@ void memory_region_clear_coalescing(MemoryRegion *mr);
  *
  * Marks a word in an IO region (initialized with memory_region_init_io())
  * as a trigger for an eventfd event.  The I/O callback will not be called.
- * The caller must be prepared to handle failure (hat is, take the required
+ * The caller must be prepared to handle failure (that is, take the required
  * action if the callback _is_ called).
  *
  * @mr: the memory region being updated.
@@ -435,10 +435,10 @@ void memory_region_add_eventfd(MemoryRegion *mr,
int fd);
 
 /**
- * memory_region_del_eventfd: Cancel and eventfd.
+ * memory_region_del_eventfd: Cancel an eventfd.
  *
- * Cancels an eventfd trigger request by a previous memory_region_add_eventfd()
- * call.
+ * Cancels an eventfd trigger requested by a previous
+ * memory_region_add_eventfd() call.
  *
  * @mr: the memory region being updated.
  * @a

[Qemu-devel] [PATCH 19/19] mips_malta: resolve endless loop when loading bios

2011-12-06 Thread Stefan Hajnoczi

From: Chen Rui 

Tested-by: Stefan Weil 
Signed-off-by: Chen Rui 
---
 hw/mips_malta.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/mips_malta.c b/hw/mips_malta.c
index bb49749..e7dfbd6 100644
--- a/hw/mips_malta.c
+++ b/hw/mips_malta.c
@@ -911,6 +911,7 @@ void mips_malta_init (ram_addr_t ram_size,
 uint32_t *end = addr + bios_size;
 while (addr < end) {
 bswap32s(addr);
+addr++;
 }
 }
 #endif
-- 
1.7.7.3
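
Editorial sketch (not part of the patch): the fix above adds the pointer
increment that lets the byte-swap loop terminate. A self-contained version of
the same loop shape might look as follows; sketch_bswap32s and
sketch_swap_words are hypothetical helpers standing in for QEMU's bswap32s()
and the loop in mips_malta_init(), and the sketch takes a word count as its
length argument.

```c
#include <stddef.h>
#include <stdint.h>

/* Swap the byte order of one 32-bit word in place. */
static void sketch_bswap32s(uint32_t *p)
{
    uint32_t v = *p;

    *p = (v >> 24) | ((v >> 8) & 0x0000ff00u) |
         ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Byte-swap nwords consecutive 32-bit words starting at addr. */
static void sketch_swap_words(uint32_t *addr, size_t nwords)
{
    uint32_t *end = addr + nwords;

    while (addr < end) {
        sketch_bswap32s(addr);
        addr++;  /* without this step, addr < end stays true forever */
    }
}
```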

[Qemu-devel] [PATCH 05/11] scsi: pass residual amount to command_complete

2011-12-06 Thread Paolo Bonzini

With the upcoming sglist support, HBAs will not see any transfer_data
call and will not have a way to detect short transfers.  So pass the
residual amount of data upon command completion.

Signed-off-by: Paolo Bonzini 
---
 hw/esp.c |3 ++-
 hw/lsi53c895a.c  |2 +-
 hw/scsi-bus.c|   12 
 hw/scsi.h|3 ++-
 hw/spapr_vscsi.c |2 +-
 hw/usb-msd.c |2 +-
 6 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index b698a43..8516db5 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -389,7 +389,8 @@ static void esp_do_dma(ESPState *s)
 esp_dma_done(s);
 }
 
-static void esp_command_complete(SCSIRequest *req, uint32_t status)
+static void esp_command_complete(SCSIRequest *req, uint32_t status,
+ int32_t resid)
 {
 ESPState *s = DO_UPCAST(ESPState, busdev.qdev, req->bus->qbus.parent);
 
diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index fcc27d7..c53760b 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -699,7 +699,7 @@ static int lsi_queue_req(LSIState *s, SCSIRequest *req, 
uint32_t len)
 }
 
  /* Callback to indicate that the SCSI layer has completed a command.  */
-static void lsi_command_complete(SCSIRequest *req, uint32_t status)
+static void lsi_command_complete(SCSIRequest *req, uint32_t status, int32_t 
resid)
 {
 LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
 int out;
diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index 64e709e..aa811f4 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -512,6 +512,8 @@ SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, 
uint32_t lun,
 }
 
 req->cmd = cmd;
+req->resid = req->cmd.xfer;
+
 switch (buf[0]) {
 case INQUIRY:
 trace_scsi_inquiry(d->id, lun, tag, cmd.buf[1], cmd.buf[2]);
@@ -1256,10 +1258,12 @@ void scsi_req_data(SCSIRequest *req, int len)
 {
 if (req->io_canceled) {
 trace_scsi_req_data_canceled(req->dev->id, req->lun, req->tag, len);
-} else {
-trace_scsi_req_data(req->dev->id, req->lun, req->tag, len);
-req->bus->info->transfer_data(req, len);
+return;
 }
+trace_scsi_req_data(req->dev->id, req->lun, req->tag, len);
+assert(req->cmd.mode != SCSI_XFER_NONE);
+req->resid -= len;
+req->bus->info->transfer_data(req, len);
 }
 
 void scsi_req_print(SCSIRequest *req)
@@ -1318,7 +1322,7 @@ void scsi_req_complete(SCSIRequest *req, int status)
 
 scsi_req_ref(req);
 scsi_req_dequeue(req);
-req->bus->info->complete(req, req->status);
+req->bus->info->complete(req, req->status, req->resid);
 scsi_req_unref(req);
 }
 
diff --git a/hw/scsi.h b/hw/scsi.h
index ab6e952..27ca087 100644
--- a/hw/scsi.h
+++ b/hw/scsi.h
@@ -47,6 +47,7 @@ struct SCSIRequest {
 uint32_t  tag;
 uint32_t  lun;
 uint32_t  status;
+size_tresid;
 SCSICommand   cmd;
 BlockDriverAIOCB  *aiocb;
 uint8_t sense[SCSI_SENSE_BUF_SIZE];
@@ -107,7 +108,7 @@ struct SCSIBusInfo {
 int tcq;
 int max_channel, max_target, max_lun;
 void (*transfer_data)(SCSIRequest *req, uint32_t arg);
-void (*complete)(SCSIRequest *req, uint32_t arg);
+void (*complete)(SCSIRequest *req, uint32_t arg, int32_t len);
 void (*cancel)(SCSIRequest *req);
 };
 
diff --git a/hw/spapr_vscsi.c b/hw/spapr_vscsi.c
index 00e2d2d..c28bba9 100644
--- a/hw/spapr_vscsi.c
+++ b/hw/spapr_vscsi.c
@@ -494,7 +494,7 @@ static void vscsi_transfer_data(SCSIRequest *sreq, uint32_t 
len)
 }
 
 /* Callback to indicate that the SCSI layer has completed a transfer.  */
-static void vscsi_command_complete(SCSIRequest *sreq, uint32_t status)
+static void vscsi_command_complete(SCSIRequest *sreq, uint32_t status, int32_t 
resid)
 {
 VSCSIState *s = DO_UPCAST(VSCSIState, vdev.qdev, sreq->bus->qbus.parent);
 vscsi_req *req = sreq->hba_private;
diff --git a/hw/usb-msd.c b/hw/usb-msd.c
index 4c06950..1c7bc82 100644
--- a/hw/usb-msd.c
+++ b/hw/usb-msd.c
@@ -223,7 +223,7 @@ static void usb_msd_transfer_data(SCSIRequest *req, 
uint32_t len)
 }
 }
 
-static void usb_msd_command_complete(SCSIRequest *req, uint32_t status)
+static void usb_msd_command_complete(SCSIRequest *req, uint32_t status, 
int32_t resid)
 {
 MSDState *s = DO_UPCAST(MSDState, dev.qdev, req->bus->qbus.parent);
 USBPacket *p = s->packet;
-- 
1.7.7.1
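
Editorial sketch (not part of the patch): the residual bookkeeping described
above can be reduced to three steps: start with the expected transfer length,
subtract each chunk that is actually transferred, and hand whatever is left
to the completion callback. The types and function names below are
hypothetical simplifications, not the SCSIRequest API.

```c
#include <stddef.h>

/* Hypothetical, stripped-down request: only the fields the sketch needs. */
typedef struct SketchRequest {
    size_t xfer;   /* bytes the command expects to transfer */
    size_t resid;  /* bytes not yet transferred */
} SketchRequest;

/* On creation the whole expected transfer is still outstanding. */
static void sketch_req_new(SketchRequest *req, size_t xfer)
{
    req->xfer = xfer;
    req->resid = xfer;
}

/* Each data phase reduces the residual by the amount moved. */
static void sketch_req_data(SketchRequest *req, size_t len)
{
    req->resid -= len;
}

/* On completion the residual is what an HBA would be told about. */
static size_t sketch_req_complete(const SketchRequest *req)
{
    return req->resid;
}
```

A nonzero value from sketch_req_complete() corresponds to a short transfer,
which is what an HBA could otherwise not detect once it no longer sees the
individual data calls.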

[Qemu-devel] [PATCH 10/19] pcie_aer: adjust do_pcie_aer_inejct_error -> do_pcie_aer_inject_error

2011-12-06 Thread Stefan Hajnoczi

From: Zhi Yong Wu 

The function name is misspelled ("inejct" instead of "inject"). This does not
affect behavior, but it should be fixed.

Signed-off-by: Zhi Yong Wu 
Signed-off-by: Stefan Hajnoczi 
---
 hmp-commands.hx |2 +-
 hw/pci-stub.c   |2 +-
 hw/pcie_aer.c   |2 +-
 sysemu.h|2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 79a9195..54b2abf 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -923,7 +923,7 @@ ETEXI
   " = 32bit x 4\n\t\t\t"
   " = 32bit x 4",
 .user_print  = pcie_aer_inject_error_print,
-.mhandler.cmd_new = do_pcie_aer_inejct_error,
+.mhandler.cmd_new = do_pcie_aer_inject_error,
 },
 
 STEXI
diff --git a/hw/pci-stub.c b/hw/pci-stub.c
index 636171c..134c448 100644
--- a/hw/pci-stub.c
+++ b/hw/pci-stub.c
@@ -34,7 +34,7 @@ static void pci_error_message(Monitor *mon)
 monitor_printf(mon, "PCI devices not supported\n");
 }
 
-int do_pcie_aer_inejct_error(Monitor *mon,
+int do_pcie_aer_inject_error(Monitor *mon,
  const QDict *qdict, QObject **ret_data)
 {
 pci_error_message(mon);
diff --git a/hw/pcie_aer.c b/hw/pcie_aer.c
index b9d1097..3b6981c 100644
--- a/hw/pcie_aer.c
+++ b/hw/pcie_aer.c
@@ -951,7 +951,7 @@ static int pcie_aer_parse_error_string(const char 
*error_name,
 return -EINVAL;
 }
 
-int do_pcie_aer_inejct_error(Monitor *mon,
+int do_pcie_aer_inject_error(Monitor *mon,
  const QDict *qdict, QObject **ret_data)
 {
 const char *id = qdict_get_str(qdict, "id");
diff --git a/sysemu.h b/sysemu.h
index 22cd720..3806901 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -148,7 +148,7 @@ void do_pci_device_hot_remove(Monitor *mon, const QDict 
*qdict);
 
 /* pcie aer error injection */
 void pcie_aer_inject_error_print(Monitor *mon, const QObject *data);
-int do_pcie_aer_inejct_error(Monitor *mon,
+int do_pcie_aer_inject_error(Monitor *mon,
  const QDict *qdict, QObject **ret_data);
 
 /* serial ports */
-- 
1.7.7.3

[Qemu-devel] [PATCH 08/11] virtio-scsi: Add virtio-scsi stub device

2011-12-06 Thread Paolo Bonzini

From: Stefan Hajnoczi 

Add a useless virtio SCSI HBA device:

  qemu -device virtio-scsi-pci

Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
---
 Makefile.target   |1 +
 default-configs/pci.mak   |1 +
 default-configs/s390x-softmmu.mak |1 +
 hw/pci.h  |1 +
 hw/s390-virtio-bus.c  |   24 
 hw/s390-virtio-bus.h  |2 +
 hw/virtio-pci.c   |   41 +++
 hw/virtio-pci.h   |2 +
 hw/virtio-scsi.c  |  227 +
 hw/virtio-scsi.h  |   36 ++
 hw/virtio.h   |3 +
 11 files changed, 339 insertions(+), 0 deletions(-)
 create mode 100644 hw/virtio-scsi.c
 create mode 100644 hw/virtio-scsi.h

diff --git a/Makefile.target b/Makefile.target
index a111521..f3bc562 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -199,6 +199,7 @@ obj-y = arch_init.o cpus.o monitor.o machine.o gdbstub.o 
balloon.o ioport.o
 # need to fix this properly
 obj-$(CONFIG_NO_PCI) += pci-stub.o
 obj-$(CONFIG_VIRTIO) += virtio.o virtio-blk.o virtio-balloon.o virtio-net.o 
virtio-serial-bus.o
+obj-$(CONFIG_VIRTIO_SCSI) += virtio-scsi.o
 obj-y += vhost_net.o
 obj-$(CONFIG_VHOST_NET) += vhost.o
 obj-$(CONFIG_REALLY_VIRTFS) += 9pfs/virtio-9p-device.o
diff --git a/default-configs/pci.mak b/default-configs/pci.mak
index 22bd350..9c8edd4 100644
--- a/default-configs/pci.mak
+++ b/default-configs/pci.mak
@@ -1,5 +1,6 @@
 CONFIG_PCI=y
 CONFIG_VIRTIO_PCI=y
+CONFIG_VIRTIO_SCSI=y
 CONFIG_VIRTIO=y
 CONFIG_USB_UHCI=y
 CONFIG_USB_OHCI=y
diff --git a/default-configs/s390x-softmmu.mak 
b/default-configs/s390x-softmmu.mak
index 3005729..e588803 100644
--- a/default-configs/s390x-softmmu.mak
+++ b/default-configs/s390x-softmmu.mak
@@ -1 +1,2 @@
 CONFIG_VIRTIO=y
+CONFIG_VIRTIO_SCSI=y
diff --git a/hw/pci.h b/hw/pci.h
index 625e717..767bcd4 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -75,6 +75,7 @@
 #define PCI_DEVICE_ID_VIRTIO_BLOCK   0x1001
 #define PCI_DEVICE_ID_VIRTIO_BALLOON 0x1002
 #define PCI_DEVICE_ID_VIRTIO_CONSOLE 0x1003
+#define PCI_DEVICE_ID_VIRTIO_SCSI0x1004
 
 #define FMT_PCIBUS  PRIx64
 
diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c
index c4b9a99..2da52c2 100644
--- a/hw/s390-virtio-bus.c
+++ b/hw/s390-virtio-bus.c
@@ -158,6 +158,18 @@ static int s390_virtio_serial_init(VirtIOS390Device *dev)
 return r;
 }
 
+static int s390_virtio_scsi_init(VirtIOS390Device *dev)
+{
+VirtIODevice *vdev;
+
+vdev = virtio_scsi_init((DeviceState *)dev, &dev->scsi);
+if (!vdev) {
+return -1;
+}
+
+return s390_virtio_device_init(dev, vdev);
+}
+
 static uint64_t s390_virtio_device_vq_token(VirtIOS390Device *dev, int vq)
 {
 ram_addr_t token_off;
@@ -370,6 +382,17 @@ static VirtIOS390DeviceInfo s390_virtio_serial = {
 },
 };
 
+static VirtIOS390DeviceInfo s390_virtio_scsi = {
+.init = s390_virtio_scsi_init,
+.qdev.name = "virtio-scsi-s390",
+.qdev.alias = "virtio-scsi",
+.qdev.size = sizeof(VirtIOS390Device),
+.qdev.props = (Property[]) {
+DEFINE_VIRTIO_SCSI_PROPERTIES(VirtIOS390Device, host_features, scsi),
+DEFINE_PROP_END_OF_LIST(),
+},
+};
+
 static int s390_virtio_busdev_init(DeviceState *dev, DeviceInfo *info)
 {
 VirtIOS390DeviceInfo *_info = (VirtIOS390DeviceInfo *)info;
@@ -392,6 +415,7 @@ static void s390_virtio_register(void)
 s390_virtio_bus_register_withprop(&s390_virtio_serial);
 s390_virtio_bus_register_withprop(&s390_virtio_blk);
 s390_virtio_bus_register_withprop(&s390_virtio_net);
+s390_virtio_bus_register_withprop(&s390_virtio_scsi);
 }
 device_init(s390_virtio_register);
 
diff --git a/hw/s390-virtio-bus.h b/hw/s390-virtio-bus.h
index f1bece7..a840936 100644
--- a/hw/s390-virtio-bus.h
+++ b/hw/s390-virtio-bus.h
@@ -19,6 +19,7 @@
 
 #include "virtio-net.h"
 #include "virtio-serial.h"
+#include "virtio-scsi.h"
 
 #define VIRTIO_DEV_OFFS_TYPE   0   /* 8 bits */
 #define VIRTIO_DEV_OFFS_NUM_VQ 1   /* 8 bits */
@@ -47,6 +48,7 @@ typedef struct VirtIOS390Device {
 uint32_t host_features;
 virtio_serial_conf serial;
 virtio_net_conf net;
+VirtIOSCSIConf scsi;
 } VirtIOS390Device;
 
 typedef struct VirtIOS390Bus {
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 64c6a94..5ba018f 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -19,6 +19,7 @@
 #include "virtio-blk.h"
 #include "virtio-net.h"
 #include "virtio-serial.h"
+#include "virtio-scsi.h"
 #include "pci.h"
 #include "qemu-error.h"
 #include "msix.h"
@@ -779,6 +780,32 @@ static int virtio_balloon_exit_pci(PCIDevice *pci_dev)
 return virtio_exit_pci(pci_dev);
 }
 
+static int virtio_scsi_init_pci(PCIDevice *pci_dev)
+{
+VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
+VirtIODevice *vdev;
+
+vdev = virtio_scsi_init(&pci_dev->qdev, &proxy->scsi);
+

[Qemu-devel] [PATCH 02/11] dma-helpers: make QEMUSGList target independent

2011-12-06 Thread Paolo Bonzini

This lets scsi-* observe the overall s/g list and pass it to the DMA
helpers.

size_t can potentially be smaller than dma_addr_t.  However, the overall
size of a S/G list can actually be bigger than dma_addr_t because (for
writes) you could reuse the same buffer multiple times.  So there is
no difference in practice.

Signed-off-by: Paolo Bonzini 
---
 dma.h |   14 +++---
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/dma.h b/dma.h
index a13209d..d50019b 100644
--- a/dma.h
+++ b/dma.h
@@ -17,6 +17,13 @@
 
 typedef struct ScatterGatherEntry ScatterGatherEntry;
 
+struct QEMUSGList {
+ScatterGatherEntry *sg;
+int nsg;
+int nalloc;
+size_t size;
+};
+
 #if defined(TARGET_PHYS_ADDR_BITS)
 typedef target_phys_addr_t dma_addr_t;
 
@@ -32,13 +39,6 @@ struct ScatterGatherEntry {
 dma_addr_t len;
 };
 
-struct QEMUSGList {
-ScatterGatherEntry *sg;
-int nsg;
-int nalloc;
-dma_addr_t size;
-};
-
 void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint);
 void qemu_sglist_add(QEMUSGList *qsg, dma_addr_t base, dma_addr_t len);
 void qemu_sglist_destroy(QEMUSGList *qsg);
-- 
1.7.7.1

[Qemu-devel] [PATCH 02/19] console: Fix console_putchar() for CSI J

2011-12-06 Thread Stefan Hajnoczi

From: Markus Armbruster 

It falls through to the code for CSI K.  "Erase Down" also does "Erase
End of Line", "Erase Up" also does "Erase Start of Line", and "Erase
Screen" also does "Erase Line".  Happens not to be visible.  Fix it
anyway.  Spotted by Coverity.

Signed-off-by: Markus Armbruster 
Signed-off-by: Stefan Hajnoczi 
---
 console.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/console.c b/console.c
index 374fcba..ce0429d 100644
--- a/console.c
+++ b/console.c
@@ -1011,6 +1011,7 @@ static void console_putchar(TextConsole *s, int ch)
 }
 break;
 }
+break;
 case 'K':
 switch (s->esc_params[0]) {
 case 0:
-- 
1.7.7.3
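
Editorial sketch (not part of the patch): the fall-through described above is
standard C switch behaviour, where a case without a terminating break
continues into the next case. A reduced illustration, with hypothetical names
and action flags that are not taken from console.c:

```c
/* Hypothetical action flags for this sketch. */
enum {
    SKETCH_ERASE_SCREEN = 1,
    SKETCH_ERASE_LINE   = 2
};

/* Return the set of actions performed for a CSI final byte. */
static unsigned sketch_handle_csi(char final_byte)
{
    unsigned actions = 0;

    switch (final_byte) {
    case 'J':
        actions |= SKETCH_ERASE_SCREEN;
        break;  /* without this, 'J' would also run the 'K' case */
    case 'K':
        actions |= SKETCH_ERASE_LINE;
        break;
    }
    return actions;
}
```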

[Qemu-devel] [PULL 00/19] Trivial patches for 3 to 6 December 2011

2011-12-06 Thread Stefan Hajnoczi

The following changes since commit 217bfb445b54db618a30f3a39170bebd9fd9dbf2:

  hw/arm_gic.c: Ignore attempts to complete nonexistent IRQs (2011-12-05 
21:38:56 +0100)

are available in the git repository at:
  ssh://repo.or.cz/srv/git/qemu/stefanha.git trivial-patches-next

Ademar de Souza Reis Jr (1):
  memory: minor documentation fixes/enhancements

Chen Rui (1):
  mips_malta: resolve endless loop when loading bios

Dong Xu Wang (2):
  fix typo: delete redundant semicolon
  fix spelling in hw sub directory

Jan Kiszka (1):
  Rename get_tls to tls_var

Markus Armbruster (3):
  console: Clean up confusing indentation in console_putchar()
  console: Fix console_putchar() for CSI J
  console: Fix qemu_default_pixelformat() for 24 bpp

Peter Maydell (6):
  configure: Include #define name in check_define compiler error
  configure: Print a banner comment at the top of config.log
  configure: Pull linux-headers/asm symlink creation out of loop
  linux-user/cpu-uname.c: Convert to UTF-8
  linux-user/arm/nwfpe/fpopcode.h: Fix non-UTF-8 characters
  linux-user/syscall.c: Don't skip stracing for fcntl64 failure case

Stefan Weil (2):
  Convert source files to UTF-8 encoding
  Convert keymap file to UTF-8 encoding

Zhi Hui Li (2):
  net/socket.c : fix memory leak
  win32: fix memory leak

Zhi Yong Wu (1):
  pcie_aer: adjust do_pcie_aer_inejct_error -> do_pcie_aer_inject_error

 block/nbd.c |4 +-
 configure   |   44 +++
 console.c   |   20 +
 cpu-all.h   |2 +-
 cpus.c  |2 +-
 docs/memory.txt |6 ++--
 hmp-commands.hx |2 +-
 hw/9pfs/codir.c |6 ++--
 hw/9pfs/virtio-9p-coth.h|2 +-
 hw/9pfs/virtio-9p-handle.c  |4 +-
 hw/9pfs/virtio-9p.c |4 +-
 hw/acpi.c   |2 +-
 hw/alpha_dp264.c|2 +-
 hw/arm_gic.c|2 +-
 hw/bt-hci-csr.c |2 +-
 hw/cirrus_vga.c |2 +-
 hw/ds1225y.c|2 +-
 hw/e1000_hw.h   |2 +-
 hw/eepro100.c   |2 +-
 hw/etraxfs_dma.c|2 +-
 hw/etraxfs_pic.c|2 +-
 hw/fdc.c|2 +-
 hw/fmopl.c  |8 +++---
 hw/gusemu.h |2 +-
 hw/gusemu_hal.c |4 +-
 hw/ide/core.c   |2 +-
 hw/ide/via.c|2 +-
 hw/jazz_led.c   |2 +-
 hw/lan9118.c|4 +-
 hw/mips_malta.c |1 +
 hw/omap2.c  |6 ++--
 hw/pc.c |2 +-
 hw/pci-stub.c   |2 +-
 hw/pcie_aer.c   |4 +-
 hw/pl110.c  |2 +-
 hw/pl181.c  |4 +-
 hw/ppc.c|2 +-
 hw/sh7750_regs.h|2 +-
 hw/smc91c111.c  |2 +-
 hw/spapr.h  |2 +-
 hw/tc6393xb_template.h  |2 +-
 hw/vmport.c |2 +-
 hw/wdt_ib700.c  |2 +-
 linux-user/arm/nwfpe/fpopcode.h |   34 +++---
 linux-user/cpu-uname.c  |2 +-
 linux-user/syscall.c|8 --
 memory.h|   38 +++--
 net/socket.c|3 ++
 net/tap-solaris.c   |2 +-
 os-win32.c  |7 ++
 pc-bios/keymaps/is  |2 +-
 qemu-tls.h  |4 +-
 sysemu.h|2 +-
 target-s390x/op_helper.c|4 +-
 ui/vnc.c|2 +-
 usb-redir.c |4 +-
 56 files changed, 159 insertions(+), 130 deletions(-)

-- 
1.7.7.3

[Qemu-devel] [PATCH 07/11] scsi-disk: enable scatter/gather functionality

2011-12-06 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 hw/scsi-disk.c |   63 ---
 1 files changed, 50 insertions(+), 13 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 505accd..1640d2d 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -38,6 +38,7 @@ do { fprintf(stderr, "scsi-disk: " fmt , ## __VA_ARGS__); } 
while (0)
 #include "sysemu.h"
 #include "blockdev.h"
 #include "block_int.h"
+#include "dma.h"
 
 #ifdef __linux
 #include 
@@ -123,6 +124,27 @@ static uint32_t scsi_init_iovec(SCSIDiskReq *r)
 return r->qiov.size / 512;
 }
 
+static void scsi_dma_complete(void *opaque, int ret)
+{
+SCSIDiskReq *r = (SCSIDiskReq *)opaque;
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+
+bdrv_acct_done(s->qdev.conf.bs, &r->acct);
+
+if (ret) {
+if (scsi_handle_rw_error(r, -ret)) {
+goto done;
+}
+}
+
+r->sector += r->sector_count;
+r->sector_count = 0;
+scsi_req_complete(&r->req, GOOD);
+
+done:
+scsi_req_unref(&r->req);
+}
+
 static void scsi_read_complete(void * opaque, int ret)
 {
 SCSIDiskReq *r = (SCSIDiskReq *)opaque;
@@ -213,10 +235,17 @@ static void scsi_read_data(SCSIRequest *req)
 return;
 }
 
-n = scsi_init_iovec(r);
-bdrv_acct_start(s->qdev.conf.bs, &r->acct, n * BDRV_SECTOR_SIZE, 
BDRV_ACCT_READ);
-r->req.aiocb = bdrv_aio_readv(s->qdev.conf.bs, r->sector, &r->qiov, n,
-  scsi_read_complete, r);
+if (r->req.sg) {
+dma_acct_start(s->qdev.conf.bs, &r->acct, r->req.sg, BDRV_ACCT_READ);
+r->req.resid -= r->req.sg->size;
+r->req.aiocb = dma_bdrv_read(s->qdev.conf.bs, r->req.sg, r->sector,
+ scsi_dma_complete, r);
+} else {
+n = scsi_init_iovec(r);
+bdrv_acct_start(s->qdev.conf.bs, &r->acct, n * BDRV_SECTOR_SIZE, 
BDRV_ACCT_READ);
+r->req.aiocb = bdrv_aio_readv(s->qdev.conf.bs, r->sector, &r->qiov, n,
+  scsi_read_complete, r);
+}
 }
 
 /*
@@ -315,18 +344,26 @@ static void scsi_write_data(SCSIRequest *req)
 return;
 }
 
-n = r->qiov.size / 512;
-if (n) {
-if (s->tray_open) {
-scsi_write_complete(r, -ENOMEDIUM);
-return;
-}
+if (!r->req.sg && !r->qiov.size) {
+/* Called for the first time.  Ask the driver to send us more data.  */
+scsi_write_complete(r, 0);
+return;
+}
+if (s->tray_open) {
+scsi_write_complete(r, -ENOMEDIUM);
+return;
+}
+
+if (r->req.sg) {
+dma_acct_start(s->qdev.conf.bs, &r->acct, r->req.sg, BDRV_ACCT_WRITE);
+r->req.resid -= r->req.sg->size;
+r->req.aiocb = dma_bdrv_write(s->qdev.conf.bs, r->req.sg, r->sector,
+  scsi_dma_complete, r);
+} else {
+n = r->qiov.size / 512;
 bdrv_acct_start(s->qdev.conf.bs, &r->acct, n * BDRV_SECTOR_SIZE, 
BDRV_ACCT_WRITE);
 r->req.aiocb = bdrv_aio_writev(s->qdev.conf.bs, r->sector, &r->qiov, n,
scsi_write_complete, r);
-} else {
-/* Called for the first time.  Ask the driver to send us more data.  */
-scsi_write_complete(r, 0);
 }
 }
 
-- 
1.7.7.1

[Qemu-devel] [PATCH 03/11] dma-helpers: add dma_buf_read and dma_buf_write

2011-12-06 Thread Paolo Bonzini

These helpers do a full transfer from an in-memory buffer to target
memory, with support for scatter/gather lists.  It will be used to
store the reply of an emulated command into a QEMUSGList provided by
the adapter.

Signed-off-by: Paolo Bonzini 
---
 dma-helpers.c |   30 ++
 dma.h |3 +++
 2 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/dma-helpers.c b/dma-helpers.c
index f08cdb5..f53a51f 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -204,3 +204,33 @@ BlockDriverAIOCB *dma_bdrv_write(BlockDriverState *bs,
 {
 return dma_bdrv_io(bs, sg, sector, bdrv_aio_writev, cb, opaque, true);
 }
+
+
+static uint64_t dma_buf_rw(uint8_t *ptr, int32_t len, QEMUSGList *sg, bool 
to_dev)
+{
+uint64_t resid;
+int sg_cur_index;
+
+resid = sg->size;
+sg_cur_index = 0;
+len = MIN(len, resid);
+while (len > 0) {
+ScatterGatherEntry entry = sg->sg[sg_cur_index++];
+cpu_physical_memory_rw(entry.base, ptr, MIN(len, entry.len), !to_dev);
+ptr += entry.len;
+len -= entry.len;
+resid -= entry.len;
+}
+
+return resid;
+}
+
+uint64_t dma_buf_read(uint8_t *ptr, int32_t len, QEMUSGList *sg)
+{
+return dma_buf_rw(ptr, len, sg, 0);
+}
+
+uint64_t dma_buf_write(uint8_t *ptr, int32_t len, QEMUSGList *sg)
+{
+return dma_buf_rw(ptr, len, sg, 1);
+}
diff --git a/dma.h b/dma.h
index d50019b..346ac4f 100644
--- a/dma.h
+++ b/dma.h
@@ -58,4 +58,7 @@ BlockDriverAIOCB *dma_bdrv_read(BlockDriverState *bs,
 BlockDriverAIOCB *dma_bdrv_write(BlockDriverState *bs,
  QEMUSGList *sg, uint64_t sector,
  BlockDriverCompletionFunc *cb, void *opaque);
+uint64_t dma_buf_read(uint8_t *ptr, int32_t len, QEMUSGList *sg);
+uint64_t dma_buf_write(uint8_t *ptr, int32_t len, QEMUSGList *sg);
+
 #endif
-- 
1.7.7.1
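
Editorial sketch (not part of the patch): the helpers above walk a
scatter/gather list and copy a linear buffer into it piece by piece. The
following self-contained illustration uses plain host memory and memcpy()
instead of cpu_physical_memory_rw(); all names are hypothetical. Unlike the
patch, the sketch advances by the clamped per-entry amount, a choice made
here so that the returned residual stays exact when the buffer ends in the
middle of an entry.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-memory analogues of ScatterGatherEntry/QEMUSGList. */
typedef struct SketchSGEntry {
    uint8_t *base;
    size_t len;
} SketchSGEntry;

typedef struct SketchSGList {
    SketchSGEntry *sg;
    int nsg;
    size_t size;   /* sum of all entry lengths */
} SketchSGList;

/*
 * Copy up to len bytes from ptr into the list, entry by entry.
 * Returns the number of bytes of the list that were not filled.
 */
static size_t sketch_buf_to_sg(const uint8_t *ptr, size_t len,
                               const SketchSGList *list)
{
    size_t resid = list->size;
    int i = 0;

    if (len > resid) {
        len = resid;
    }
    while (len > 0 && i < list->nsg) {
        const SketchSGEntry *e = &list->sg[i++];
        size_t xfer = len < e->len ? len : e->len;

        memcpy(e->base, ptr, xfer);
        ptr += xfer;
        len -= xfer;
        resid -= xfer;
    }
    return resid;
}
```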

[Qemu-devel] [PATCH 09/19] fix spelling in hw sub directory

2011-12-06 Thread Stefan Hajnoczi

From: Dong Xu Wang 

Correct obvious spelling errors in qemu/hw directory.

Signed-off-by: Dong Xu Wang 
Signed-off-by: Stefan Hajnoczi 
---
 hw/9pfs/virtio-9p-coth.h   |2 +-
 hw/9pfs/virtio-9p-handle.c |2 +-
 hw/alpha_dp264.c   |2 +-
 hw/arm_gic.c   |2 +-
 hw/bt-hci-csr.c|2 +-
 hw/cirrus_vga.c|2 +-
 hw/e1000_hw.h  |2 +-
 hw/etraxfs_dma.c   |2 +-
 hw/etraxfs_pic.c   |2 +-
 hw/fmopl.c |8 
 hw/gusemu.h|2 +-
 hw/gusemu_hal.c|4 ++--
 hw/ide/core.c  |2 +-
 hw/lan9118.c   |4 ++--
 hw/omap2.c |6 +++---
 hw/pc.c|2 +-
 hw/pcie_aer.c  |2 +-
 hw/pl110.c |2 +-
 hw/pl181.c |4 ++--
 hw/sh7750_regs.h   |2 +-
 hw/spapr.h |2 +-
 hw/wdt_ib700.c |2 +-
 22 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/hw/9pfs/virtio-9p-coth.h b/hw/9pfs/virtio-9p-coth.h
index c4b74b0..c31c965 100644
--- a/hw/9pfs/virtio-9p-coth.h
+++ b/hw/9pfs/virtio-9p-coth.h
@@ -44,7 +44,7 @@ typedef struct V9fsThPool {
 qemu_coroutine_self()); \
 qemu_bh_schedule(co_bh);\
 /*  \
- * yeild in qemu thread and re-enter back   \
+ * yield in qemu thread and re-enter back   \
  * in glib worker thread\
  */ \
 qemu_coroutine_yield(); \
diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c
index 755e8e0..b556e39 100644
--- a/hw/9pfs/virtio-9p-handle.c
+++ b/hw/9pfs/virtio-9p-handle.c
@@ -520,7 +520,7 @@ static int handle_name_to_path(FsContext *ctx, V9fsPath 
*dir_path,
 }
 fh = g_malloc(sizeof(struct file_handle) + data->handle_bytes);
 fh->handle_bytes = data->handle_bytes;
-/* add a "./" at the begining of the path */
+/* add a "./" at the beginning of the path */
 snprintf(buffer, PATH_MAX, "./%s", name);
 /* flag = 0 imply don't follow symlink */
 ret = name_to_handle(dirfd, buffer, fh, &mnt_id, 0);
diff --git a/hw/alpha_dp264.c b/hw/alpha_dp264.c
index fcc20e9..598b830 100644
--- a/hw/alpha_dp264.c
+++ b/hw/alpha_dp264.c
@@ -2,7 +2,7 @@
  * QEMU Alpha DP264/CLIPPER hardware system emulator.
  *
  * Choose CLIPPER IRQ mappings over, say, DP264, MONET, or WEBBRICK
- * variants because CLIPPER doesn't have an SMC669 SuperIO controler
+ * variants because CLIPPER doesn't have an SMC669 SuperIO controller
  * that we need to emulate as well.
  */
 
diff --git a/hw/arm_gic.c b/hw/arm_gic.c
index 527c9ce..1a896fb 100644
--- a/hw/arm_gic.c
+++ b/hw/arm_gic.c
@@ -602,7 +602,7 @@ static uint32_t gic_cpu_read(gic_state *s, int cpu, int 
offset)
 return 0;
 case 0x0c: /* Acknowledge */
 return gic_acknowledge_irq(s, cpu);
-case 0x14: /* Runing Priority */
+case 0x14: /* Running Priority */
 return s->running_priority[cpu];
 case 0x18: /* Highest Pending Interrupt */
 return s->current_pending[cpu];
diff --git a/hw/bt-hci-csr.c b/hw/bt-hci-csr.c
index 0dcf897..772b677 100644
--- a/hw/bt-hci-csr.c
+++ b/hw/bt-hci-csr.c
@@ -222,7 +222,7 @@ static void csrhci_in_packet(struct csrhci_s *s, uint8_t 
*pkt)
 
 rpkt = csrhci_out_packet_csr(s, H4_NEG_PKT, 10);
 
-*rpkt ++ = 0x20;   /* Operational settings negotation Ok */
+*rpkt ++ = 0x20;   /* Operational settings negotiation Ok */
 memcpy(rpkt, pkt, 7); rpkt += 7;
 *rpkt ++ = 0xff;
 *rpkt = 0xff;
diff --git a/hw/cirrus_vga.c b/hw/cirrus_vga.c
index c7e365b..1b216e8 100644
--- a/hw/cirrus_vga.c
+++ b/hw/cirrus_vga.c
@@ -783,7 +783,7 @@ static void cirrus_bitblt_cputovideo_next(CirrusVGAState * 
s)
 s->cirrus_srccounter -= s->cirrus_blt_srcpitch;
 if (s->cirrus_srccounter <= 0)
 goto the_end;
-/* more bytes than needed can be transfered because of
+/* more bytes than needed can be transferred because of
word alignment, so we keep them for the next line */
 /* XXX: keep alignment to speed up transfer */
 end_ptr = s->cirrus_bltbuf + s->cirrus_blt_srcpitch;
diff --git a/hw/e1000_hw.h b/hw/e1000_hw.h
index 2e341ac..9e29af8 100644
--- a/hw/e1000_hw.h
+++ b/hw/e1000_hw.h
@@ -295,7 +295,7 @@
 
 #define E1000_KUMCTRLSTA 0x00034 /* MAC-PHY interface - RW */
 #define E1000_MDPHYA 0x0003C  /* PHY address - RW */
-#define E1000_MANC2H 0x05860  /* Managment Control To Host - RW */
+#define E1000_MANC2H 0x0586

Re: [Qemu-devel] [PATCH 16/19] Rename get_tls to tls_var

2011-12-06 Thread Andreas Färber

Am 06.12.2011 12:01, schrieb Stefan Hajnoczi:
> From: Jan Kiszka 
> 
> get_tls() can serve as a lvalue as well, so 'get' might be confusing.

Note that this does not work for POSIX pthread_getspecific(), which
we'll need to support at some point in time, so I don't think this is a
terribly good idea.

At least please don't start actually using it as lvalue, we'd need a
set_tls() for assignment (in which case get_tls() would've provided nice
symmetry).

Andreas

> 
> Signed-off-by: Jan Kiszka 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  cpu-all.h  |2 +-
>  qemu-tls.h |4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/cpu-all.h b/cpu-all.h
> index 7246a67..9d78715 100644
> --- a/cpu-all.h
> +++ b/cpu-all.h
> @@ -336,7 +336,7 @@ void QEMU_NORETURN cpu_abort(CPUState *env, const char *fmt, ...)
>  GCC_FMT_ATTR(2, 3);
>  extern CPUState *first_cpu;
>  DECLARE_TLS(CPUState *,cpu_single_env);
> -#define cpu_single_env get_tls(cpu_single_env)
> +#define cpu_single_env tls_var(cpu_single_env)
>  
>  /* Flags for use in ENV->INTERRUPT_PENDING.
>  
> diff --git a/qemu-tls.h b/qemu-tls.h
> index 5b70f10..b92ea9d 100644
> --- a/qemu-tls.h
> +++ b/qemu-tls.h
> @@ -41,12 +41,12 @@
>  #ifdef __linux__
>  #define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
>  #define DEFINE_TLS(type, x)  __thread __typeof__(type) tls__##x
> -#define get_tls(x)   tls__##x
> +#define tls_var(x)   tls__##x
>  #else
>  /* Dummy implementations which define plain global variables */
>  #define DECLARE_TLS(type, x) extern DEFINE_TLS(type, x)
>  #define DEFINE_TLS(type, x)  __typeof__(type) tls__##x
> -#define get_tls(x)   tls__##x
> +#define tls_var(x)   tls__##x
>  #endif
>  
>  #endif


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH 16/19] Rename get_tls to tls_var

2011-12-06 Thread Jan Kiszka

On 2011-12-06 12:43, Andreas Färber wrote:
> Am 06.12.2011 12:01, schrieb Stefan Hajnoczi:
>> From: Jan Kiszka 
>>
>> get_tls() can serve as a lvalue as well, so 'get' might be confusing.
> 
> Note that this does not work for POSIX pthread_getspecific(), which
> we'll need to support at some point in time, so I don't think this is a
> terribly good idea.
> 
> At least please don't start actually using it as lvalue, we'd need a
> set_tls() for assignment (in which case get_tls() would've provided nice
> symmetry).

We already use it like this as TLS is not usable otherwise.

I don't mind get/set_tls, I just didn't like current get_tls(x) = bla.

Jan
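
To make the lvalue point concrete, here is a minimal sketch built on the
__thread branch of qemu-tls.h as quoted in this thread.  The variable name
"counter" and the helper functions are hypothetical and exist only for
illustration; it assumes a compiler that supports __thread and __typeof__
(GCC or Clang).

```c
#include <assert.h>

/* Same shape as the __thread branch of qemu-tls.h quoted in this thread. */
#define DEFINE_TLS(type, x)  __thread __typeof__(type) tls__##x
#define tls_var(x)           tls__##x

/* Hypothetical per-thread variable, for illustration only. */
DEFINE_TLS(int, counter);

/* The macro expands to a plain variable name, so it can be assigned to. */
static void counter_store(int v)
{
    tls_var(counter) = v;           /* lvalue use */
}

static int counter_load(void)
{
    return tls_var(counter);        /* rvalue use */
}

/* Increment in place: only possible because tls_var() is an lvalue. */
static int counter_bump(void)
{
    return ++tls_var(counter);
}

/* Small self-check tying the three helpers together. */
static void counter_selfcheck(void)
{
    counter_store(41);
    assert(counter_bump() == 42);
    assert(counter_load() == 42);
}
```

An accessor that returned the result of pthread_getspecific() by value could
not appear on the left of an assignment like this; it would need either a
separate setter or a pointer dereference.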

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH 11/19] configure: Pull linux-headers/asm symlink creation out of loop

2011-12-06 Thread Stefan Hajnoczi

From: Peter Maydell 

Pull the creation of the linux-headers/asm symlink out of the loop
so we don't pointlessly delete and recreate it once for each target.
Also move the setting of the includes variable up so that it is
in the same place as the other code which sets this variable.

Signed-off-by: Peter Maydell 
Signed-off-by: Stefan Hajnoczi 
---
 configure |   37 -
 1 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/configure b/configure
index 452b8cf..4bcb8ef 100755
--- a/configure
+++ b/configure
@@ -3286,6 +3286,22 @@ for d in libdis libdis-user; do
 echo > $d/config.mak
 done
 
+# use included Linux headers
+if test "$linux" = "yes" ; then
+  mkdir -p linux-headers
+  case "$cpu" in
+  i386|x86_64)
+symlink $source_path/linux-headers/asm-x86 linux-headers/asm
+;;
+  ppcemb|ppc|ppc64)
+symlink $source_path/linux-headers/asm-powerpc linux-headers/asm
+;;
+  s390x)
+symlink $source_path/linux-headers/asm-s390 linux-headers/asm
+;;
+  esac
+fi
+
 for target in $target_list; do
 target_dir="$target"
 config_target_mak=$target_dir/config-target.mak
@@ -3611,6 +3627,10 @@ else
 fi
 includes="-I\$(SRC_PATH)/tcg $includes"
 
+if test "$linux" = "yes" ; then
+  includes="-I\$(SRC_PATH)/linux-headers $includes"
+fi
+
 if test "$target_user_only" = "yes" ; then
 libdis_config_mak=libdis-user/config.mak
 else
@@ -3742,23 +3762,6 @@ if test "$target_linux_user" = "yes" -o "$target_bsd_user" = "yes" ; then
   esac
 fi
 
-# use included Linux headers
-if test "$linux" = "yes" ; then
-  includes="-I\$(SRC_PATH)/linux-headers $includes"
-  mkdir -p linux-headers
-  case "$cpu" in
-  i386|x86_64)
-symlink $source_path/linux-headers/asm-x86 linux-headers/asm
-;;
-  ppcemb|ppc|ppc64)
-symlink $source_path/linux-headers/asm-powerpc linux-headers/asm
-;;
-  s390x)
-symlink $source_path/linux-headers/asm-s390 linux-headers/asm
-;;
-  esac
-fi
-
 echo "LDFLAGS+=$ldflags" >> $config_target_mak
 echo "QEMU_CFLAGS+=$cflags" >> $config_target_mak
 echo "QEMU_INCLUDES+=$includes" >> $config_target_mak
-- 
1.7.7.3

[Qemu-devel] [PATCH 08/19] fix typo: delete redundant semicolon

2011-12-06 Thread Stefan Hajnoczi

From: Dong Xu Wang 

Double semicolons should be single.

Signed-off-by: Dong Xu Wang 
Signed-off-by: Stefan Hajnoczi 
---
 block/nbd.c|4 ++--
 cpus.c |2 +-
 hw/9pfs/codir.c|6 +++---
 hw/9pfs/virtio-9p-handle.c |2 +-
 hw/9pfs/virtio-9p.c|4 ++--
 hw/acpi.c  |2 +-
 hw/eepro100.c  |2 +-
 hw/ide/via.c   |2 +-
 hw/ppc.c   |2 +-
 hw/smc91c111.c |2 +-
 linux-user/syscall.c   |2 +-
 net/tap-solaris.c  |2 +-
 target-s390x/op_helper.c   |4 ++--
 ui/vnc.c   |2 +-
 usb-redir.c|4 ++--
 15 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 882b2dc..95212da 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -189,7 +189,7 @@ static int nbd_read(BlockDriverState *bs, int64_t sector_num,
 
 request.type = NBD_CMD_READ;
 request.handle = (uint64_t)(intptr_t)bs;
-request.from = sector_num * 512;;
+request.from = sector_num * 512;
 request.len = nb_sectors * 512;
 
 if (nbd_send_request(s->sock, &request) == -1)
@@ -219,7 +219,7 @@ static int nbd_write(BlockDriverState *bs, int64_t sector_num,
 
 request.type = NBD_CMD_WRITE;
 request.handle = (uint64_t)(intptr_t)bs;
-request.from = sector_num * 512;;
+request.from = sector_num * 512;
 request.len = nb_sectors * 512;
 
 if (nbd_send_request(s->sock, &request) == -1)
diff --git a/cpus.c b/cpus.c
index ca46ec6..a53276a 100644
--- a/cpus.c
+++ b/cpus.c
@@ -89,7 +89,7 @@ TimersState timers_state;
 int64_t cpu_get_icount(void)
 {
 int64_t icount;
-CPUState *env = cpu_single_env;;
+CPUState *env = cpu_single_env;
 
 icount = qemu_icount;
 if (env) {
diff --git a/hw/9pfs/codir.c b/hw/9pfs/codir.c
index 9b6d47d..3d18828 100644
--- a/hw/9pfs/codir.c
+++ b/hw/9pfs/codir.c
@@ -90,7 +90,7 @@ int v9fs_co_mkdir(V9fsPDU *pdu, V9fsFidState *fidp, V9fsString *name,
 V9fsState *s = pdu->s;
 
 if (v9fs_request_cancelled(pdu)) {
-return -EINTR;;
+return -EINTR;
 }
 cred_init(&cred);
 cred.fc_mode = mode;
@@ -124,7 +124,7 @@ int v9fs_co_opendir(V9fsPDU *pdu, V9fsFidState *fidp)
 V9fsState *s = pdu->s;
 
 if (v9fs_request_cancelled(pdu)) {
-return -EINTR;;
+return -EINTR;
 }
 v9fs_path_read_lock(s);
 v9fs_co_run_in_worker(
@@ -152,7 +152,7 @@ int v9fs_co_closedir(V9fsPDU *pdu, V9fsFidOpenState *fs)
 V9fsState *s = pdu->s;
 
 if (v9fs_request_cancelled(pdu)) {
-return -EINTR;;
+return -EINTR;
 }
 v9fs_co_run_in_worker(
 {
diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c
index f97d898..755e8e0 100644
--- a/hw/9pfs/virtio-9p-handle.c
+++ b/hw/9pfs/virtio-9p-handle.c
@@ -59,7 +59,7 @@ static inline int open_by_handle(int mountfd, const char *fh, int flags)
 static int handle_update_file_cred(int dirfd, const char *name, FsCred *credp)
 {
 int fd, ret;
-fd = openat(dirfd, name, O_NONBLOCK | O_NOFOLLOW);;
+fd = openat(dirfd, name, O_NONBLOCK | O_NOFOLLOW);
 if (fd < 0) {
 return fd;
 }
diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index dd43209..36a862f 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -1492,7 +1492,7 @@ static void v9fs_walk(void *opaque)
 int32_t fid, newfid;
 V9fsString *wnames = NULL;
 V9fsFidState *fidp;
-V9fsFidState *newfidp = NULL;;
+V9fsFidState *newfidp = NULL;
 V9fsPDU *pdu = opaque;
 V9fsState *s = pdu->s;
 
@@ -2398,7 +2398,7 @@ static void v9fs_link(void *opaque)
 V9fsState *s = pdu->s;
 int32_t dfid, oldfid;
 V9fsFidState *dfidp, *oldfidp;
-V9fsString name;;
+V9fsString name;
 size_t offset = 7;
 int err = 0;
 
diff --git a/hw/acpi.c b/hw/acpi.c
index 1cf35e1..9c35f2d 100644
--- a/hw/acpi.c
+++ b/hw/acpi.c
@@ -304,7 +304,7 @@ void acpi_pm_tmr_calc_overflow_time(ACPIPMTimer *tmr)
 
 uint32_t acpi_pm_tmr_get(ACPIPMTimer *tmr)
 {
-uint32_t d = acpi_pm_tmr_get_clock();;
+uint32_t d = acpi_pm_tmr_get_clock();
 return d & 0xff;
 }
 
diff --git a/hw/eepro100.c b/hw/eepro100.c
index 29ec5b4..e430f56 100644
--- a/hw/eepro100.c
+++ b/hw/eepro100.c
@@ -258,7 +258,7 @@ typedef struct {
 
 /* Data in mem is always in the byte order of the controller (le).
  * It must be dword aligned to allow direct access to 32 bit values. */
-uint8_t mem[PCI_MEM_SIZE] __attribute__((aligned(8)));;
+uint8_t mem[PCI_MEM_SIZE] __attribute__((aligned(8)));
 
 /* Configuration bytes. */
 uint8_t configuration[22];
diff --git a/hw/ide/via.c b/hw/ide/via.c
index 098f150..a57134c 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -172,7 +172,7 @@ static void vt82c686b_init_ports(PCIIDEState *d) {
 /* via ide func */
 static int vt82c686b_ide_initfn(PCIDevice *dev)
 {
-PCIIDEState *d = DO_UPCAST(PCIIDESt

[Qemu-devel] [PATCH 10/11] virtio-scsi: add basic SCSI bus operation

2011-12-06 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 hw/virtio-scsi.c |  101 +++---
 1 files changed, 89 insertions(+), 11 deletions(-)

diff --git a/hw/virtio-scsi.c b/hw/virtio-scsi.c
index c86e15e..dbb4003 100644
--- a/hw/virtio-scsi.c
+++ b/hw/virtio-scsi.c
@@ -128,6 +128,7 @@ typedef struct {
 DeviceState *qdev;
 VirtIOSCSIConf *conf;
 
+SCSIBus bus;
 VirtQueue *ctrl_vq;
 VirtQueue *event_vq;
 VirtQueue *cmd_vq;
@@ -156,6 +157,22 @@ typedef struct VirtIOSCSIReq {
 } resp;
 } VirtIOSCSIReq;
 
+static inline int virtio_scsi_get_lun(uint8_t *lun)
+{
+return ((lun[2] << 8) | lun[3]) & 0x3FFF;
+}
+
+static inline SCSIDevice *virtio_scsi_device_find(VirtIOSCSI *s, uint8_t *lun)
+{
+if (lun[0] != 1) {
+return NULL;
+}
+if (lun[2] != 0 && !(lun[2] >= 0x40 && lun[2] < 0x80)) {
+return NULL;
+}
+return scsi_device_find(&s->bus, 0, lun[1], virtio_scsi_get_lun(lun));
+}
+
 static void virtio_scsi_complete_req(VirtIOSCSIReq *req)
 {
 VirtIOSCSI *s = req->dev;
@@ -240,6 +257,36 @@ static void virtio_scsi_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 }
 }
 
+static void virtio_scsi_command_complete(SCSIRequest *r, uint32_t status,
+ int32_t resid)
+{
+VirtIOSCSIReq *req = r->hba_private;
+
+req->resp.cmd->response = VIRTIO_SCSI_S_OK;
+req->resp.cmd->status = status;
+if (req->resp.cmd->status == GOOD) {
+req->resp.cmd->resid = resid;
+if (resid) {
+req->resp.cmd->response = VIRTIO_SCSI_S_UNDERRUN;
+}
+} else {
+   req->resp.cmd->resid = 0;
+   scsi_req_get_sense(r, req->resp.cmd->sense, VIRTIO_SCSI_SENSE_SIZE);
+}
+virtio_scsi_complete_req(req);
+}
+
+static void virtio_scsi_request_cancelled(SCSIRequest *r)
+{
+VirtIOSCSIReq *req = r->hba_private;
+
+if (!req) {
+return;
+}
+req->resp.cmd->response = VIRTIO_SCSI_S_ABORTED;
+virtio_scsi_complete_req(req);
+}
+
 static void virtio_scsi_fail_cmd_req(VirtIOSCSIReq *req)
 {
 req->resp.cmd->response = VIRTIO_SCSI_S_FAILURE;
@@ -250,8 +297,10 @@ static void virtio_scsi_handle_cmd(VirtIODevice *vdev, VirtQueue *vq)
 {
 VirtIOSCSI *s = (VirtIOSCSI *)vdev;
 VirtIOSCSIReq *req;
+int n;
 
 while ((req = virtio_scsi_pop_req(s, vq))) {
+SCSIDevice *d;
 int out_size, in_size;
 if (req->elem.out_num < 1 || req->elem.in_num < 1) {
 virtio_scsi_bad_req();
@@ -265,18 +314,32 @@ static void virtio_scsi_handle_cmd(VirtIODevice *vdev, VirtQueue *vq)
 virtio_scsi_fail_cmd_req(req);
 continue;
 }
+
+d = virtio_scsi_device_find(s, req->req.cmd->lun);
+if (!d) {
+req->resp.cmd->response = VIRTIO_SCSI_S_BAD_TARGET;
+virtio_scsi_complete_req(req);
+continue;
+}
+req->sreq = scsi_req_new(d, req->req.cmd->tag,
+ virtio_scsi_get_lun(req->req.cmd->lun),
+ req->req.cmd->cdb, req);
+
+if (req->sreq->cmd.mode != SCSI_XFER_NONE) {
+int req_mode =
+(req->elem.in_num > 1 ? SCSI_XFER_FROM_DEV : SCSI_XFER_TO_DEV);
+
+if (req->sreq->cmd.mode != req_mode) {
+virtio_scsi_fail_cmd_req(req);
+scsi_req_cancel(req->sreq);
+continue;
+}
+}
 
-req->resp.cmd->resid = 0;
-req->resp.cmd->status_qualifier = 0;
-req->resp.cmd->status = CHECK_CONDITION;
-req->resp.cmd->sense_len = 4;
-req->resp.cmd->sense[0] = 0xf0; /* Fixed format current sense */
-req->resp.cmd->sense[1] = ILLEGAL_REQUEST;
-req->resp.cmd->sense[2] = 0x20;
-req->resp.cmd->sense[3] = 0x00;
-req->resp.cmd->response = VIRTIO_SCSI_S_OK;
-
-virtio_scsi_complete_req(req);
+n = scsi_req_enqueue(req->sreq, &req->qsgl);
+if (n) {
+scsi_req_continue(req->sreq);
+}
 }
 }
 
@@ -331,6 +394,16 @@ static void virtio_scsi_reset(VirtIODevice *vdev)
 s->cdb_size = VIRTIO_SCSI_CDB_SIZE;
 }
 
+static struct SCSIBusInfo virtio_scsi_scsi_info = {
+.tcq = true,
+.max_channel = VIRTIO_SCSI_MAX_CHANNEL,
+.max_target = VIRTIO_SCSI_MAX_TARGET,
+.max_lun = VIRTIO_SCSI_MAX_LUN,
+
+.complete = virtio_scsi_command_complete,
+.cancel = virtio_scsi_request_cancelled,
+};
+
 VirtIODevice *virtio_scsi_init(DeviceState *dev, VirtIOSCSIConf *proxyconf)
 {
 VirtIOSCSI *s;
@@ -355,6 +428,11 @@ VirtIODevice *virtio_scsi_init(DeviceState *dev, VirtIOSCSIConf *proxyconf)
 s->cmd_vq = virtio_add_queue(&s->vdev, VIRTIO_SCSI_VQ_SIZE,
virtio_scsi_handle_cmd);
 
+scsi_bus_new(&s->bus, dev, &virtio_scsi_scsi_info);
+if (!dev->hotplugged) {
+scsi_bus_legacy_handle_cmdline(&s->bus);
+}
+
 /* TODO savevm */

[Qemu-devel] [PATCH 12/19] Convert source files to UTF-8 encoding

2011-12-06 Thread Stefan Hajnoczi

From: Stefan Weil 

Most QEMU files either are pure ASCII or use UTF-8.
Convert some files which still used ISO-8859-1 to UTF-8.

Signed-off-by: Stefan Weil 
Signed-off-by: Stefan Hajnoczi 
---
 hw/ds1225y.c   |2 +-
 hw/fdc.c   |2 +-
 hw/jazz_led.c  |2 +-
 hw/tc6393xb_template.h |2 +-
 hw/vmport.c|2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/ds1225y.c b/hw/ds1225y.c
index 59d224e..7aa0832 100644
--- a/hw/ds1225y.c
+++ b/hw/ds1225y.c
@@ -1,7 +1,7 @@
 /*
  * QEMU NVRAM emulation for DS1225Y chip
  *
- * Copyright (c) 2007-2008 Herv� Poussineau
+ * Copyright (c) 2007-2008 Hervé Poussineau
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
diff --git a/hw/fdc.c b/hw/fdc.c
index 2adfb44..70aa5c7 100644
--- a/hw/fdc.c
+++ b/hw/fdc.c
@@ -2,7 +2,7 @@
  * QEMU Floppy disk emulator (Intel 82078)
  *
  * Copyright (c) 2003, 2007 Jocelyn Mayer
- * Copyright (c) 2008 Herv� Poussineau
+ * Copyright (c) 2008 Hervé Poussineau
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
diff --git a/hw/jazz_led.c b/hw/jazz_led.c
index 6fab334..f8a2182 100644
--- a/hw/jazz_led.c
+++ b/hw/jazz_led.c
@@ -1,7 +1,7 @@
 /*
  * QEMU JAZZ LED emulator.
  *
- * Copyright (c) 2007 Herv� Poussineau
+ * Copyright (c) 2007 Hervé Poussineau
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
diff --git a/hw/tc6393xb_template.h b/hw/tc6393xb_template.h
index 1ccf6e8..4cbbad5 100644
--- a/hw/tc6393xb_template.h
+++ b/hw/tc6393xb_template.h
@@ -5,7 +5,7 @@
  *
  * FB support code. Based on G364 fb emulator
  *
- * Copyright (c) 2007 Herv� Poussineau
+ * Copyright (c) 2007 Hervé Poussineau
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
diff --git a/hw/vmport.c b/hw/vmport.c
index b5c6fa1..0a3dbc5 100644
--- a/hw/vmport.c
+++ b/hw/vmport.c
@@ -1,7 +1,7 @@
 /*
  * QEMU VMPort emulation
  *
- * Copyright (C) 2007 Herv� Poussineau
+ * Copyright (C) 2007 Hervé Poussineau
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
-- 
1.7.7.3

[Qemu-devel] [PATCH 03/19] console: Fix qemu_default_pixelformat() for 24 bpp

2011-12-06 Thread Stefan Hajnoczi

From: Markus Armbruster 

Falls through to 32 bpp.  Harmless, because the only difference is the
alpha component, and we're not using that.  Spotted by Coverity.
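
The effect of the missing break can be sketched in isolation.  The struct
below is a hypothetical, reduced pixel format (not QEMU's PixelFormat), the
mask values are illustrative, and the "buggy" flag exists only so both
behaviours fit in one function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced pixel format: only the masks matter here. */
struct pixfmt {
    uint32_t rmask, gmask, bmask, amask;
};

/* buggy != 0 omits the break after case 24, as before the patch. */
static struct pixfmt pixfmt_for(int bpp, int buggy)
{
    struct pixfmt pf = { 0, 0, 0, 0 };

    /* This sketch only models the two depths discussed in the patch. */
    assert(bpp == 24 || bpp == 32);

    switch (bpp) {
    case 24:
        pf.rmask = 0x00FF0000;
        pf.gmask = 0x0000FF00;
        pf.bmask = 0x000000FF;
        if (!buggy) {
            break;
        }
        /* fall through */
    case 32:
        pf.rmask = 0x00FF0000;
        pf.gmask = 0x0000FF00;
        pf.bmask = 0x000000FF;
        pf.amask = 0xFF000000;   /* the only field the 24 bpp case lacks */
        break;
    default:
        break;
    }
    return pf;
}
```

With the fall-through, a 24 bpp request ends up with the colour masks it
would have had anyway plus an alpha mask, which matches the "harmless"
assessment as long as nothing reads the alpha fields.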

Acked-by: Stefano Stabellini 
Signed-off-by: Markus Armbruster 
Signed-off-by: Stefan Hajnoczi 
---
 console.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/console.c b/console.c
index ce0429d..223f8fd 100644
--- a/console.c
+++ b/console.c
@@ -1688,6 +1688,7 @@ PixelFormat qemu_default_pixelformat(int bpp)
 pf.rbits = 8;
 pf.gbits = 8;
 pf.bbits = 8;
+break;
 case 32:
 pf.rmask = 0x00FF;
 pf.gmask = 0xFF00;
-- 
1.7.7.3

[Qemu-devel] [PATCH 01/19] console: Clean up confusing indentation in console_putchar()

2011-12-06 Thread Stefan Hajnoczi

From: Markus Armbruster 

Signed-off-by: Markus Armbruster 
Signed-off-by: Stefan Hajnoczi 
---
 console.c |   18 +-
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/console.c b/console.c
index f6fe441..374fcba 100644
--- a/console.c
+++ b/console.c
@@ -1009,16 +1009,16 @@ static void console_putchar(TextConsole *s, int ch)
 console_clear_xy(s, x, y);
 }
 }
-break;
+break;
 }
 case 'K':
 switch (s->esc_params[0]) {
 case 0:
-/* clear to eol */
-for(x = s->x; x < s->width; x++) {
+/* clear to eol */
+for(x = s->x; x < s->width; x++) {
 console_clear_xy(s, x, s->y);
-}
-break;
+}
+break;
 case 1:
 /* clear from beginning of line */
 for (x = 0; x <= s->x; x++) {
@@ -1030,12 +1030,12 @@ static void console_putchar(TextConsole *s, int ch)
 for(x = 0; x < s->width; x++) {
 console_clear_xy(s, x, s->y);
 }
-break;
-}
+break;
+}
 break;
 case 'm':
-console_handle_escape(s);
-break;
+console_handle_escape(s);
+break;
 case 'n':
 /* report cursor position */
 /* TODO: send ESC[row;colR */
-- 
1.7.7.3

[Qemu-devel] [PATCH 06/11] scsi: add scatter/gather functionality

2011-12-06 Thread Paolo Bonzini

Scatter/gather functionality uses the newly added DMA helpers.  The
device can choose between doing DMA itself, or calling scsi_req_data
as usual.  In the latter case, scsi_req_data will use the buffer-based
DMA helpers to copy piecewise to/from the destination area(s).
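
As a rough illustration of "copy piecewise", here is a self-contained sketch.
It is not the dma_buf_read/dma_buf_write code: struct sg_entry and
sg_fill_from_buf are hypothetical names, the entries hold plain host pointers
rather than guest addresses, and returning the unfilled part of the list as
the residual is an assumption about the convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one scatter/gather entry.  Plain host pointers
 * keep the sketch self-contained. */
struct sg_entry {
    uint8_t *base;
    size_t   len;
};

/* Copy up to len bytes from the linear buffer buf into the areas listed in
 * sg[0..n-1], one area at a time.  Returns how many bytes of the list were
 * left unfilled (the residual). */
static size_t sg_fill_from_buf(const uint8_t *buf, size_t len,
                               const struct sg_entry *sg, size_t n)
{
    size_t resid = 0;
    size_t i;

    assert(sg != NULL || n == 0);

    /* Total capacity of the list. */
    for (i = 0; i < n; i++) {
        resid += sg[i].len;
    }

    /* Fill each area in turn until the source buffer is exhausted. */
    for (i = 0; i < n && len > 0; i++) {
        size_t chunk = len < sg[i].len ? len : sg[i].len;

        memcpy(sg[i].base, buf, chunk);
        buf += chunk;
        len -= chunk;
        resid -= chunk;
    }
    return resid;
}
```

For example, filling a list of a 3-byte and a 4-byte area from a 5-byte
buffer fills the first area completely, the first two bytes of the second,
and leaves a residual of 2.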

Signed-off-by: Paolo Bonzini 
---
 hw/esp.c |2 +-
 hw/lsi53c895a.c  |2 +-
 hw/scsi-bus.c|   28 
 hw/scsi.h|4 +++-
 hw/spapr_vscsi.c |2 +-
 hw/usb-msd.c |2 +-
 6 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index 8516db5..e72e8ba 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -239,7 +239,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t busid)
 lun = busid & 7;
 current_lun = scsi_device_find(&s->bus, 0, s->current_dev->id, lun);
 s->current_req = scsi_req_new(current_lun, 0, lun, buf, NULL);
-datalen = scsi_req_enqueue(s->current_req);
+datalen = scsi_req_enqueue(s->current_req, NULL);
 s->ti_size = datalen;
 if (datalen != 0) {
 s->rregs[ESP_RSTAT] = STAT_TC;
diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index c53760b..55533af 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -778,7 +778,7 @@ static void lsi_do_command(LSIState *s)
 s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun, buf,
s->current);
 
-n = scsi_req_enqueue(s->current->req);
+n = scsi_req_enqueue(s->current->req, NULL);
 if (n) {
 if (n > 0) {
 lsi_set_phase(s, PHASE_DI);
diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index aa811f4..b42c3b1 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -5,6 +5,7 @@
 #include "qdev.h"
 #include "blockdev.h"
 #include "trace.h"
+#include "dma.h"
 
 static char *scsibus_get_fw_dev_path(DeviceState *dev);
 static int scsi_req_parse(SCSICommand *cmd, SCSIDevice *dev, uint8_t *buf);
@@ -51,7 +52,7 @@ static void scsi_dma_restart_bh(void *opaque)
 break;
 case SCSI_XFER_NONE:
 scsi_req_dequeue(req);
-scsi_req_enqueue(req);
+scsi_req_enqueue(req, NULL);
 break;
 }
 }
@@ -626,7 +627,7 @@ void scsi_req_build_sense(SCSIRequest *req, SCSISense sense)
 req->sense_len = 18;
 }
 
-int32_t scsi_req_enqueue(SCSIRequest *req)
+int32_t scsi_req_enqueue(SCSIRequest *req, QEMUSGList *sg)
 {
 int32_t rc;
 
@@ -634,6 +635,7 @@ int32_t scsi_req_enqueue(SCSIRequest *req)
 scsi_req_ref(req);
 req->enqueued = true;
 QTAILQ_INSERT_TAIL(&req->dev->requests, req, next);
+req->sg = sg;
 
 scsi_req_ref(req);
 rc = req->ops->send_command(req, req->cmd.buf);
@@ -1256,14 +1258,32 @@ void scsi_req_continue(SCSIRequest *req)
Once it completes, calling scsi_req_continue will restart I/O.  */
 void scsi_req_data(SCSIRequest *req, int len)
 {
+uint8_t *buf;
 if (req->io_canceled) {
 trace_scsi_req_data_canceled(req->dev->id, req->lun, req->tag, len);
 return;
 }
 trace_scsi_req_data(req->dev->id, req->lun, req->tag, len);
 assert(req->cmd.mode != SCSI_XFER_NONE);
-req->resid -= len;
-req->bus->info->transfer_data(req, len);
+if (!req->sg) {
+req->resid -= len;
+req->bus->info->transfer_data(req, len);
+return;
+}
+
+/* If the device calls scsi_req_data and the HBA specified a
+ * scatter/gather list, the transfer has to happen in a single
+ * step.  */
+assert(!req->dma_started);
+req->dma_started = true;
+
+buf = scsi_req_get_buf(req);
+if (req->cmd.mode == SCSI_XFER_FROM_DEV) {
+req->resid = dma_buf_read(buf, len, req->sg);
+} else {
+req->resid = dma_buf_write(buf, len, req->sg);
+}
+scsi_req_continue(req);
 }
 
 void scsi_req_print(SCSIRequest *req)
diff --git a/hw/scsi.h b/hw/scsi.h
index 27ca087..8d84c8f 100644
--- a/hw/scsi.h
+++ b/hw/scsi.h
@@ -50,6 +50,8 @@ struct SCSIRequest {
 size_tresid;
 SCSICommand   cmd;
 BlockDriverAIOCB  *aiocb;
+QEMUSGList*sg;
+bool  dma_started;
 uint8_t sense[SCSI_SENSE_BUF_SIZE];
 uint32_t sense_len;
 bool enqueued;
@@ -187,7 +189,7 @@ SCSIRequest *scsi_req_alloc(const SCSIReqOps *reqops, SCSIDevice *d,
 uint32_t tag, uint32_t lun, void *hba_private);
 SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
   uint8_t *buf, void *hba_private);
-int32_t scsi_req_enqueue(SCSIRequest *req);
+int32_t scsi_req_enqueue(SCSIRequest *req, QEMUSGList *qsg);
 void scsi_req_free(SCSIRequest *req);
 SCSIRequest *scsi_req_ref(SCSIRequest *req);
 void scsi_req_unref(SCSIRequest *req);
diff --git a/hw/spapr_vscsi.c b/hw/spapr_vscsi.c
index c28bba9..fe4ca65 100644
--- a/hw/spapr_vscsi.c
+++ b/hw/spapr_vscsi.c
@@ -624,7 +624,7 @@ static int vscsi_queue_cmd(VSCSIState *s, vscsi_req *req)
 
 req->lun = l

Re: [Qemu-devel] [PATCH 16/19] Rename get_tls to tls_var

2011-12-06 Thread Paolo Bonzini


On 12/06/2011 12:43 PM, Andreas Färber wrote:

Am 06.12.2011 12:01, schrieb Stefan Hajnoczi:

From: Jan Kiszka

get_tls() can serve as a lvalue as well, so 'get' might be confusing.


Note that this does not work for POSIX pthread_getspecific(), which
we'll need to support at some point in time, so I don't think this is a
terribly good idea.


I already posted the code for pthread_getspecific(), but it didn't work 
on OpenBSD.


The idea was to reserve a single pthread key and allocate a whole block 
of memory.  Then get_tls can use pthread_getspecific() and add an offset 
within the block of memory.  The reason is that with many allocated keys 
pthread_getspecific() becomes slower.


Paolo
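
One possible reading of the single-key idea, as a minimal sketch.  This is
not the code that was posted to the list: struct tls_block, its fields and
tls_block_get() are hypothetical, and the "offset within the block" is
expressed here simply as a struct field.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical layout: every TLS variable is a field of one struct, so its
 * offset within the per-thread block is fixed at compile time. */
struct tls_block {
    int   counter;
    void *current_cpu;
};

static pthread_key_t tls_key;
static pthread_once_t tls_once = PTHREAD_ONCE_INIT;

static void tls_key_create(void)
{
    /* free() releases a thread's block when that thread exits. */
    int rc = pthread_key_create(&tls_key, free);

    assert(rc == 0);
    (void)rc;
}

/* Return this thread's block, allocating it zero-filled on first use. */
static struct tls_block *tls_block_get(void)
{
    struct tls_block *b;

    pthread_once(&tls_once, tls_key_create);
    b = pthread_getspecific(tls_key);
    if (!b) {
        b = calloc(1, sizeof(*b));
        assert(b != NULL);
        pthread_setspecific(tls_key, b);
    }
    return b;
}

/* Only one key is ever looked up; the field access adds the offset.
 * Because this dereferences a pointer, the result is still an lvalue. */
#define tls_var(x)  (tls_block_get()->x)
```

With this shape, tls_var(counter) = 7 and reading tls_var(counter) both go
through the same single pthread_getspecific() lookup, whatever the number of
variables in the block.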

[Qemu-devel] [PATCH 09/11] virtio-scsi: Add basic request processing infrastructure

2011-12-06 Thread Paolo Bonzini

From: Stefan Hajnoczi 

Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
---
 hw/virtio-scsi.c |  142 +-
 1 files changed, 140 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-scsi.c b/hw/virtio-scsi.c
index a807a28..c86e15e 100644
--- a/hw/virtio-scsi.c
+++ b/hw/virtio-scsi.c
@@ -135,14 +135,152 @@ typedef struct {
 uint32_t cdb_size;
 } VirtIOSCSI;
 
+typedef struct VirtIOSCSIReq {
+VirtIOSCSI *dev;
+VirtQueue *vq;
+VirtQueueElement elem;
+QEMUSGList qsgl;
+SCSIRequest *sreq;
+union {
+char  *buf;
+VirtIOSCSICmdReq  *cmd;
+VirtIOSCSICtrlTMFReq  *tmf;
+VirtIOSCSICtrlANReq   *an;
+} req;
+union {
+char  *buf;
+VirtIOSCSICmdResp *cmd;
+VirtIOSCSICtrlTMFResp *tmf;
+VirtIOSCSICtrlANResp  *an;
+VirtIOSCSIEvent   *event;
+} resp;
+} VirtIOSCSIReq;
+
+static void virtio_scsi_complete_req(VirtIOSCSIReq *req)
+{
+VirtIOSCSI *s = req->dev;
+VirtQueue *vq = req->vq;
+virtqueue_push(vq, &req->elem, req->qsgl.size + req->elem.in_sg[0].iov_len);
+qemu_sglist_destroy(&req->qsgl);
+if (req->sreq) {
+req->sreq->hba_private = NULL;
+scsi_req_unref(req->sreq);
+}
+g_free(req);
+virtio_notify(&s->vdev, vq);
+}
+
+static void virtio_scsi_bad_req(void)
+{
+error_report("wrong size for virtio-scsi headers");
+exit(1);
+}
+
+static void qemu_sgl_init_external(QEMUSGList *qsgl, struct iovec *sg,
+   target_phys_addr_t *addr, int num)
+{
+memset(qsgl, 0, sizeof(*qsgl));
+while (num--) {
+qemu_sglist_add(qsgl, *(addr++), (sg++)->iov_len);
+}
+}
+
+static void virtio_scsi_parse_req(VirtIOSCSI *s, VirtQueue *vq,
+  VirtIOSCSIReq *req)
+{
+assert(req->elem.out_num && req->elem.in_num);
+req->vq = vq;
+req->dev = s;
+req->sreq = NULL;
+req->req.buf = req->elem.out_sg[0].iov_base;
+req->resp.buf = req->elem.in_sg[0].iov_base;
+
+if (req->elem.out_num > 1) {
+qemu_sgl_init_external(&req->qsgl, &req->elem.out_sg[1],
+   &req->elem.out_addr[1],
+   req->elem.out_num - 1);
+} else {
+qemu_sgl_init_external(&req->qsgl, &req->elem.in_sg[1],
+   &req->elem.in_addr[1],
+   req->elem.in_num - 1);
+}
+}
+
+static VirtIOSCSIReq *virtio_scsi_pop_req(VirtIOSCSI *s, VirtQueue *vq)
+{
+VirtIOSCSIReq *req;
+req = g_malloc(sizeof(*req));
+if (!virtqueue_pop(vq, &req->elem)) {
+g_free(req);
+return NULL;
+}
+
+virtio_scsi_parse_req(s, vq, req);
+return req;
+}
+
+static void virtio_scsi_fail_ctrl_req(VirtIOSCSIReq *req)
+{
+if (req->req.tmf->type == VIRTIO_SCSI_T_TMF) {
+req->resp.tmf->response = VIRTIO_SCSI_S_FAILURE;
+} else {
+req->resp.an->response = VIRTIO_SCSI_S_FAILURE;
+}
+
+virtio_scsi_complete_req(req);
+}
+
 static void virtio_scsi_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 {
-/* TODO */
+VirtIOSCSI *s = (VirtIOSCSI *)vdev;
+VirtIOSCSIReq *req;
+
+while ((req = virtio_scsi_pop_req(s, vq))) {
+virtio_scsi_fail_ctrl_req(req);
+}
+}
+
+static void virtio_scsi_fail_cmd_req(VirtIOSCSIReq *req)
+{
+req->resp.cmd->response = VIRTIO_SCSI_S_FAILURE;
+virtio_scsi_complete_req(req);
 }
 
 static void virtio_scsi_handle_cmd(VirtIODevice *vdev, VirtQueue *vq)
 {
-/* TODO */
+VirtIOSCSI *s = (VirtIOSCSI *)vdev;
+VirtIOSCSIReq *req;
+
+while ((req = virtio_scsi_pop_req(s, vq))) {
+int out_size, in_size;
+if (req->elem.out_num < 1 || req->elem.in_num < 1) {
+virtio_scsi_bad_req();
+}
+
+out_size = req->elem.out_sg[0].iov_len;
+in_size = req->elem.in_sg[0].iov_len;
+if (out_size < sizeof(VirtIOSCSICmdReq) + s->cdb_size ||
+in_size < sizeof(VirtIOSCSICmdResp) + s->sense_size) {
+virtio_scsi_bad_req();
+}
+
+if (req->elem.out_num > 1 && req->elem.in_num > 1) {
+virtio_scsi_fail_cmd_req(req);
+continue;
+}
+
+req->resp.cmd->resid = 0;
+req->resp.cmd->status_qualifier = 0;
+req->resp.cmd->status = CHECK_CONDITION;
+req->resp.cmd->sense_len = 4;
+req->resp.cmd->sense[0] = 0xf0; /* Fixed format current sense */
+req->resp.cmd->sense[1] = ILLEGAL_REQUEST;
+req->resp.cmd->sense[2] = 0x20;
+req->resp.cmd->sense[3] = 0x00;
+req->resp.cmd->response = VIRTIO_SCSI_S_OK;
+
+virtio_scsi_complete_req(req);
+}
 }
 
 static void virtio_scsi_get_config(VirtIODevice *vdev,
-- 
1.7.7.1

Re: [Qemu-devel] [PATCH v3 for-1.1 0/9] target-arm: More inference rules for features

2011-12-06 Thread Peter Maydell

On 6 December 2011 00:30, Andreas Färber  wrote:
> Hello Peter,
>
> Here's an extended and cleaned up version of my inference series, taking
> into account your suggestions and dropping my motivating FYI patches for
> Cortex-R4F.
>
> The intent is to reduce effort and knowledge required to add/maintain ARM
> CPU cores.

Series:
Reviewed-by: Peter Maydell 

and taken into my target-arm.next tree
http://git.linaro.org/gitweb?p=people/pmaydell/qemu-arm.git;a=shortlog;h=refs/heads/target-arm.next

-- PMM

[Qemu-devel] [Bug 899961] Re: qemu/kvm locks up when run 32bit userspace with 64bit kernel

2011-12-06 Thread Michael Tokarev

** Description changed:

- Applies to both qemu and qemu-kvm 1.0, but only when kernel is 64bit and
+ Only applies to qemu-kvm 1.0, and only when kernel is 64bit and
  userspace is 32bit, on x86.  Did not happen with previous released
  versions, such as 0.15.  Not all guests triggers this issue - so far,
  only (32bit) windows 7 guest shows it, but does that quite reliable:
- first boot of an old guest with new qemu (or qemu-kvm), windows finds a
- new CPU and suggests rebooting - hit "Reboot" and in a few seconds it
- will be locked up (including the monitor), with 100% CPU usage.
- Killable with -9.
+ first boot of an old guest with new qemu-kvm, windows finds a new CPU
+ and suggests rebooting - hit "Reboot" and in a few seconds it will be
+ locked up (including the monitor), with no CPU usage whatsoever.
+ Killable only with -9.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/899961

Title:
  qemu/kvm locks up when run 32bit userspace with 64bit kernel

Status in QEMU:
  New

Bug description:
  Only applies to qemu-kvm 1.0, and only when kernel is 64bit and
  userspace is 32bit, on x86.  Did not happen with previous released
  versions, such as 0.15.  Not all guests trigger this issue - so far,
  only a (32bit) Windows 7 guest shows it, but does so quite reliably:
  first boot of an old guest with new qemu-kvm, windows finds a new CPU
  and suggests rebooting - hit "Reboot" and in a few seconds it will be
  locked up (including the monitor), with no CPU usage whatsoever.
  Killable only with -9.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/899961/+subscriptions

[Qemu-devel] [PATCH 13/19] Convert keymap file to UTF-8 encoding

2011-12-06 Thread Stefan Hajnoczi

From: Stefan Weil 

Most QEMU files either are pure ASCII or use UTF-8.
Convert this keymap file which still used ISO-8859-1 to UTF-8.

Signed-off-by: Stefan Weil 
Signed-off-by: Stefan Hajnoczi 
---
 pc-bios/keymaps/is |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/pc-bios/keymaps/is b/pc-bios/keymaps/is
index 21dc1fd..935ac1d 100644
--- a/pc-bios/keymaps/is
+++ b/pc-bios/keymaps/is
@@ -1,4 +1,4 @@
-# 2004-03-16 Halld�r Gu�mundsson and Morten Lange
+# 2004-03-16 Halldór Guðmundsson and Morten Lange
 # Keyboard definition file for the Icelandic keyboard
 # to be used in rdesktop 1.3.x ( See rdesktop.org)
 # generated from XKB map de, and changed manually
-- 
1.7.7.3

[Qemu-devel] [PATCH 06/19] net/socket.c : fix memory leak

2011-12-06 Thread Stefan Hajnoczi

From: Zhi Hui Li 

Signed-off-by: Li Zhi Hui 
Signed-off-by: Stefan Hajnoczi 
---
 net/socket.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index e9ef128..0f09164 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -409,6 +409,7 @@ static int net_socket_listen_init(VLANState *vlan,
 fd = qemu_socket(PF_INET, SOCK_STREAM, 0);
 if (fd < 0) {
 perror("socket");
+g_free(s);
 return -1;
 }
 socket_set_nonblock(fd);
@@ -420,11 +421,13 @@ static int net_socket_listen_init(VLANState *vlan,
 ret = bind(fd, (struct sockaddr *)&saddr, sizeof(saddr));
 if (ret < 0) {
 perror("bind");
+g_free(s);
 return -1;
 }
 ret = listen(fd, 0);
 if (ret < 0) {
 perror("listen");
+g_free(s);
 return -1;
 }
 s->vlan = vlan;
-- 
1.7.7.3

[Qemu-devel] [PATCH 07/19] win32: fix memory leak

2011-12-06 Thread Stefan Hajnoczi

From: Zhi Hui Li 

The string is allocated by g_malloc and is not used after putenv, so it
should be freed before returning.

Paolo Bonzini  confirmed this is safe under Wine:

"1) the underlying Win32 APIs require separate arguments for the
variable and value; 2) even though in the end Wine stores the
environment as name=value
(http://source.winehq.org/source/dlls/ntdll/env.c), it does so in a
single consecutive block of memory, not as a char* array like POSIX
does.  While (2) might apply only to Wine, (1) surely applies to Windows
as well."
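
For contrast, the POSIX side of this argument can be sketched as follows.
putenv_keep() and the variable name used with it are hypothetical; the sketch
assumes a POSIX libc, where the string passed to putenv() itself becomes part
of the environment and therefore must not be freed while the variable is set.

```c
#define _XOPEN_SOURCE 700   /* for putenv() under strict -std= modes */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: place "name=value" in the environment via putenv()
 * and hand the buffer back to the caller.  On POSIX the buffer stays in use
 * by the environment, so the caller has to keep it alive. */
static char *putenv_keep(const char *name, const char *value)
{
    size_t length = strlen(name) + strlen(value) + 2;   /* '=' and NUL */
    char *string = malloc(length);

    if (!string) {
        return NULL;
    }
    snprintf(string, length, "%s=%s", name, value);
    if (putenv(string) != 0) {
        free(string);        /* not installed, so safe to release */
        return NULL;
    }
    return string;           /* now owned by the environment */
}
```

On such a system, overwriting bytes in the returned buffer is expected to
change what getenv() reports for that name, which is why POSIX code cannot
free the string the way the Windows version in this patch does.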

Tested-by: Stefan Weil 
Reviewed-by: Paolo Bonzini 
Signed-off-by: Li Zhi Hui 
Signed-off-by: Stefan Hajnoczi 
---
 os-win32.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/os-win32.c b/os-win32.c
index 8ad5fa1..8523d8d 100644
--- a/os-win32.c
+++ b/os-win32.c
@@ -44,6 +44,13 @@ int setenv(const char *name, const char *value, int 
overwrite)
 char *string = g_malloc(length);
 snprintf(string, length, "%s=%s", name, value);
 result = putenv(string);
+
+/* Windows takes a copy and does not continue to use our string.
+ * Therefore it can be safely freed on this platform.  POSIX code
+ * typically has to leak the string because according to the spec it
+ * becomes part of the environment.
+ */
+g_free(string);
 }
 return result;
 }
-- 
1.7.7.3
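
The comment in the hunk above relies on a difference in `putenv` semantics. On POSIX systems the string handed to `putenv` itself becomes part of the environment, which the following sketch tries to make visible. `putenv_aliases_buffer` and the variable name `QEMU_PUTENV_DEMO` are made up for this illustration, and the block is meant for a POSIX host, not for Windows.

```c
#define _XOPEN_SOURCE 700   /* for the putenv() declaration */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 if a later change to the buffer handed to putenv() is
 * visible through getenv(), i.e. the environment points at our memory.
 * Returns -1 if putenv() itself fails.
 */
static int putenv_aliases_buffer(void)
{
    /* static storage: on POSIX the buffer must outlive this call */
    static char entry[] = "QEMU_PUTENV_DEMO=bar";
    const char *value;

    if (putenv(entry) != 0) {
        return -1;
    }
    /* change "bar" to "car" in place, without calling putenv() again */
    entry[strlen("QEMU_PUTENV_DEMO=")] = 'c';

    value = getenv("QEMU_PUTENV_DEMO");
    assert(value != NULL);
    return strcmp(value, "car") == 0;
}
```

On such a host the buffer cannot be freed while the variable is set, which is why POSIX callers typically leak it; the patch argues that the Windows implementation copies the string and so does not have this constraint.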

Re: [Qemu-devel] winXP "Standard PC" HAL and qemu-kvm >= 0.15

2011-12-06 Thread Michael S. Tsirkin

On Tue, Dec 06, 2011 at 03:02:49PM +0400, Michael Tokarev wrote:
> On 06.12.2011 14:32, Avi Kivity wrote:
> > On 12/05/2011 10:19 PM, Michael Tokarev wrote:
> >> On 05.12.2011 17:28, Avi Kivity wrote:
> >> []
> >>>> I haven't debugged further yet, -- because it was
> >>>> not easy to find out what was causing the regression
> >>>> and how to reproduce it, and also because I don't think
> >>>> it is the right HAL for qemu-kvm guest anyway.
> >>>
> >>> It's not, but the regression indicates we broke something.  It would be
> >>> good to know what that is.
> >>
> >> So today I gave it a chance with git bisect, and here's what it found:
> 
> >> First bad commit ef390067a72fe09977bb4ac8211313e1503302ea
> >> Merge: c7b3e90 0fd542f
> >> Author: Avi Kivity 
> >> Date:   Sun May 15 04:48:05 2011 -0400
> >>
> >> Merge commit '0fd542fb7d13ddf12f897bb27c5950f31638b1df' into 
> >> upstream-merge
> 
> And after applying Avi's instructions here's the real bisect
> result:
> 
> ab431c283e7055bcd6fb622f212bb29e84a6a134 is the first bad commit
> commit ab431c283e7055bcd6fb622f212bb29e84a6a134
> Author: Isaku Yamahata 
> Date:   Fri Apr 1 20:43:23 2011 +0900
> 
> piix_pci: optimize set irq path

Could you try with this commit reverted please?
Reverting patch below. Warning: compiled only.

commit 8f40db3918a0618a3beb9a771a569d20fe9c1bb3
Author: Michael S. Tsirkin 
Date:   Tue Dec 6 14:24:32 2011 +0200

Revert "piix_pci: optimize set irq path"

This reverts commit ab431c283e7055bcd6fb622f212bb29e84a6a134.

Conflicts:

hw/piix_pci.c

diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index ee11ff2..5b35c01 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -38,28 +38,12 @@
 
 typedef PCIHostState I440FXState;
 
-#define PIIX_NUM_PIC_IRQS   16  /* i8259 * 2 */
 #define PIIX_NUM_PIRQS  4ULL/* PIRQ[A-D] */
 #define XEN_PIIX_NUM_PIRQS  128ULL
 #define PIIX_PIRQC  0x60
 
 typedef struct PIIX3State {
 PCIDevice dev;
-
-/*
- * bitmap to track pic levels.
- * The pic level is the logical OR of all the PCI irqs mapped to it
- * So one PIC level is tracked by PIIX_NUM_PIRQS bits.
- *
- * PIRQ is mapped to PIC pins, we track it by
- * PIIX_NUM_PIRQS * PIIX_NUM_PIC_IRQS = 64 bits with
- * pic_irq * PIIX_NUM_PIRQS + pirq
- */
-#if PIIX_NUM_PIC_IRQS * PIIX_NUM_PIRQS > 64
-#error "unable to encode pic state in 64bit in pic_levels."
-#endif
-uint64_t pic_levels;
-
 qemu_irq *pic;
 
 /* This member isn't used. Just for save/load compatibility */
@@ -365,61 +349,24 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int 
*piix3_devfn,
 return b;
 }
 
-/* PIIX3 PCI to ISA bridge */
-static void piix3_set_irq_pic(PIIX3State *piix3, int pic_irq)
-{
-qemu_set_irq(piix3->pic[pic_irq],
- !!(piix3->pic_levels &
-(((1ULL << PIIX_NUM_PIRQS) - 1) <<
- (pic_irq * PIIX_NUM_PIRQS))));
-}
-
-static void piix3_set_irq_level(PIIX3State *piix3, int pirq, int level)
-{
-int pic_irq;
-uint64_t mask;
-
-pic_irq = piix3->dev.config[PIIX_PIRQC + pirq];
-if (pic_irq >= PIIX_NUM_PIC_IRQS) {
-return;
-}
-
-mask = 1ULL << ((pic_irq * PIIX_NUM_PIRQS) + pirq);
-piix3->pic_levels &= ~mask;
-piix3->pic_levels |= mask * !!level;
-
-piix3_set_irq_pic(piix3, pic_irq);
-}
-
-static void piix3_set_irq(void *opaque, int pirq, int level)
+static void piix3_set_irq(void *opaque, int irq_num, int level)
 {
+int i, pic_irq, pic_level;
 PIIX3State *piix3 = opaque;
-piix3_set_irq_level(piix3, pirq, level);
-}
-
-/* irq routing is changed. so rebuild bitmap */
-static void piix3_update_irq_levels(PIIX3State *piix3)
-{
-int pirq;
-
-piix3->pic_levels = 0;
-for (pirq = 0; pirq < PIIX_NUM_PIRQS; pirq++) {
-piix3_set_irq_level(piix3, pirq,
-pci_bus_get_irq_level(piix3->dev.bus, pirq));
-}
-}
 
-static void piix3_write_config(PCIDevice *dev,
-   uint32_t address, uint32_t val, int len)
-{
-pci_default_write_config(dev, address, val, len);
-if (ranges_overlap(address, len, PIIX_PIRQC, 4)) {
-PIIX3State *piix3 = DO_UPCAST(PIIX3State, dev, dev);
-int pic_irq;
-piix3_update_irq_levels(piix3);
-for (pic_irq = 0; pic_irq < PIIX_NUM_PIC_IRQS; pic_irq++) {
-piix3_set_irq_pic(piix3, pic_irq);
+/* now we change the pic irq level according to the piix irq mappings */
+/* XXX: optimize */
+pic_irq = piix3->dev.config[0x60 + irq_num];
+if (pic_irq < 16) {
+/* The pic level is the logical OR of all the PCI irqs mapped
+   to it */
+pic_level = 0;
+for (i = 0; i < 4; i++) {
+if (pic_irq == piix3->dev.config[0x60 + i]) {
+pic_level |= pci_bus_get_irq_level(piix3->dev.bus, i);
+}
 }
+qemu_set_irq(piix3->pic[pic_irq], pic_level);
 }
 }
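
Both versions of `piix3_set_irq` in the diff above compute the same thing: a PIC input is the logical OR of every PIRQ routed to it. The sketch below models the two approaches outside QEMU so they can be compared side by side. `route`, `pirq_level` and the function names are hypothetical stand-ins for the PIIX3 config registers at 0x60..0x63 and for `pci_bus_get_irq_level()`.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PIRQS    4      /* PIRQ[A-D] */
#define NUM_PIC_IRQS 16     /* two i8259s */

/* Loop variant (restored by the revert): OR all PIRQs routed to pic_irq. */
static int pic_level_loop(const uint8_t route[NUM_PIRQS],
                          const int pirq_level[NUM_PIRQS], int pic_irq)
{
    int i, level = 0;

    assert(pic_irq >= 0 && pic_irq < NUM_PIC_IRQS);
    for (i = 0; i < NUM_PIRQS; i++) {
        if (route[i] == pic_irq) {
            level |= !!pirq_level[i];
        }
    }
    return level;
}

/* Bitmap variant: bit (pic_irq * NUM_PIRQS + pirq) tracks one PIRQ. */
static uint64_t pic_levels_build(const uint8_t route[NUM_PIRQS],
                                 const int pirq_level[NUM_PIRQS])
{
    uint64_t levels = 0;
    int pirq;

    for (pirq = 0; pirq < NUM_PIRQS; pirq++) {
        /* a route value >= 16 means the PIRQ is not connected */
        if (route[pirq] < NUM_PIC_IRQS && pirq_level[pirq]) {
            levels |= 1ULL << (route[pirq] * NUM_PIRQS + pirq);
        }
    }
    return levels;
}

/* A PIC input is high if any of its NUM_PIRQS bits is set. */
static int pic_level_bitmap(uint64_t levels, int pic_irq)
{
    uint64_t mask;

    assert(pic_irq >= 0 && pic_irq < NUM_PIC_IRQS);
    mask = ((1ULL << NUM_PIRQS) - 1) << (pic_irq * NUM_PIRQS);
    return !!(levels & mask);
}
```

The sketch says nothing about why the optimized path misbehaves with the "Standard PC" HAL; that is what testing with the revert is meant to narrow down.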

Re: [Qemu-devel] [PATCH] Add minimal Vexpress Cortex A15 support

2011-12-06 Thread Avi Kivity

On 12/01/2011 03:37 AM, bill4car...@gmail.com wrote:
> From: Bill Carson 
>
> This patch adds minimal codes to support A15  which enables ARM KVM could
> run Guest OS build with Versatile Express Cortex-A15x4 tile.
>
> +
> +static inline int
> +gic_get_current_cpu(void)
> +{
> +  return cpu_single_env->cpu_index;
> +}

Bad indents.

> +/* Per-CPU private memory mapped IO.  */
> +static uint64_t a15mpcore_priv_read(void *opaque, target_phys_addr_t offset,
> + unsigned size)
> +{
> +a15mpcore_priv_state *s = (a15mpcore_priv_state *)opaque;
> +int id;
> +
> +offset &= 0xfff;
> +/* Interrupt controller.  */
> +if (offset < 0x200) {
> + id = gic_get_current_cpu();
> + } else {
> +id = (offset - 0x200) >> 8;
> +if (id >= s->num_cpu) {
> +return 0;
> +}
> +}
> + return gic_cpu_read(&s->gic, id, offset & 0xff);
> +}

Very bad indents.  Detab your files.

> +
> +static void a15mpcore_priv_write(void *opaque, target_phys_addr_t offset,
> +  uint64_t value, unsigned size)
> +{
> +a15mpcore_priv_state *s = (a15mpcore_priv_state *)opaque;
> +int id;
> + 
> +offset &= 0xfff;
> +/* Interrupt controller.  */
> +if (offset < 0x200) {
> + id = gic_get_current_cpu();
> + } else {
> +id = (offset - 0x200) >> 8;
> +if (id >= s->num_cpu) {
> +return 0;
> +}
> +}
> + return gic_cpu_write(&s->gic, id, offset & 0xff, value);
> +}

Here, too.

> +static void a15mpcore_priv_map_setup(a15mpcore_priv_state *s)
> +{
> +memory_region_init(&s->container, "mpcode-priv-container", 0x3000);
> +memory_region_init_io(&s->iomem, &mpcore_priv_ops, s, "mpcode-priv",
> +  0x1000);
> +memory_region_add_subregion(&s->container, 0x2000, &s->iomem);
> +memory_region_add_subregion(&s->container, 0x1000, &s->gic.iomem);
> +}

mpcode or mpcore?

> +/* RAM is from 0x8000 upwards. The bottom 64MB of the
> + * address space should in theory be remappable to various
> + * things including ROM or RAM; we always map the RAM there.
> + */
> +cpu_register_physical_memory(0x0, low_ram_size, ram_offset | IO_MEM_RAM);
> +cpu_register_physical_memory(0x8000, ram_size,
> + ram_offset | IO_MEM_RAM);
> +

Use memory_region_init_ram()/memory_region_add_subregion() instead.

> +
> +/* ??? Hack to map an additional page of ram for the secondary CPU
> +   startup code.  I guess this works on real hardware because the
> +   BootROM happens to be in ROM/flash or in memory that isn't clobbered
> +   until after Linux boots the secondary CPUs.  */
> +ram_offset = qemu_ram_alloc(NULL, "vexpress.hack", 0x1000);
> +cpu_register_physical_memory(SMP_BOOT_ADDR, 0x1000,
> + ram_offset | IO_MEM_RAM);

Here, too.

It would be better to unhack this; short-term hacks tend to remain in
the long term, and even after they're fixed we keep them for backwards
compatibility.


-- 
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH] Add minimal Vexpress Cortex A15 support

2011-12-06 Thread Peter Maydell

On 6 December 2011 12:28, Avi Kivity  wrote:
> On 12/01/2011 03:37 AM, bill4car...@gmail.com wrote:
>> +    /* ??? Hack to map an additional page of ram for the secondary CPU
>> +       startup code.  I guess this works on real hardware because the
>> +       BootROM happens to be in ROM/flash or in memory that isn't clobbered
>> +       until after Linux boots the secondary CPUs.  */
>> +    ram_offset = qemu_ram_alloc(NULL, "vexpress.hack", 0x1000);
>> +    cpu_register_physical_memory(SMP_BOOT_ADDR, 0x1000,
>> +                                 ram_offset | IO_MEM_RAM);

> It would be better to unhack this; short-term hacks tend to remain in
> the long term, and even after they're fixed we keep them for backwards
> compatibility.

Do you have a better suggestion in this case? We've had the same
code in the realview board since 2007 when ARM SMP support was first
added...

There's no particular back-compat implication here as far as I know:
the location of the secondary CPU holding pen code is irrelevant to
the actual guest being run. (On a real system it will be somewhere
inside the boot ROM.)

-- PMM

Re: [Qemu-devel] [Bug 899961] Re: qemu/kvm locks up when run 32bit userspace with 64bit kernel

2011-12-06 Thread Avi Kivity

On 12/04/2011 07:45 PM, Michael Tokarev wrote:
> Actually after trying to do lots of experiments and finally a git
> bisection, it turned out that the issue only affects qemu-kvm, not
> upstream qemu.  Bisection between qemu-kvm 0.15.0 and 1.0 led to this
> commit:
>
> commit 145e11e840500e04a4d0a624918bb17596be19e9
> Merge: ce967f6 b195043
> Author: Avi Kivity 
> Date:   Wed Aug 10 12:06:58 2011 +0300
>
> Merge commit 'b195043003d90ea4027ea01cc7a6c974ac915108' into 
> upstream-merge
> 
> * commit 'b195043003d90ea4027ea01cc7a6c974ac915108': (130 commits)
>...
>
> After which I'm stuck... ;)
>

32-on-64 doesn't build on Fedora due to glib2-devel.i686 conflicting
with its x86_64 cousin, so I can't reproduce this.  Can you try
bisecting this further?

$ git tag M b195043003d90ea4027ea01cc7a6c974ac915108
$ git bisect start M M^
...
$ git merge --no-commit M^
[build and test]
$ git merge --abort (or git checkout -f)
$ git bisect [good|bad]

If it happens that M^2 was the bad commit, you'll get a merge conflict.
In that case do:

$ git merge --abort
$ git merge M

(you'll just be testing M again, but it's good to verify)

-- 
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH] Add minimal Vexpress Cortex A15 support

2011-12-06 Thread Avi Kivity

On 12/06/2011 02:35 PM, Peter Maydell wrote:
> On 6 December 2011 12:28, Avi Kivity  wrote:
> > On 12/01/2011 03:37 AM, bill4car...@gmail.com wrote:
> >> +/* ??? Hack to map an additional page of ram for the secondary CPU
> >> +   startup code.  I guess this works on real hardware because the
> >> +   BootROM happens to be in ROM/flash or in memory that isn't 
> >> clobbered
> >> +   until after Linux boots the secondary CPUs.  */
> >> +ram_offset = qemu_ram_alloc(NULL, "vexpress.hack", 0x1000);
> >> +cpu_register_physical_memory(SMP_BOOT_ADDR, 0x1000,
> >> + ram_offset | IO_MEM_RAM);
>
> > It would be better to unhack this; short-term hacks tend to remain in
> > the long term, and even after they're fixed we keep them for backwards
> > compatibility.
>
> Do you have a better suggestion in this case? We've had the same
> code in the realview board since 2007 when ARM SMP support was first
> added...

No idea really since I don't fully understand what's going on.  It's
just a knee-jerk reaction to the word 'hack'.

Can't we just do what real hardware does?

> There's no particular back-compat implication here as far as I know:
> the location of the secondary CPU holding pen code is irrelevant to
> the actual guest being run. (On a real system it will be somewhere
> inside the boot ROM.)

Suppose you live migrate when the code is running there?

-- 
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH] Add minimal Vexpress Cortex A15 support

2011-12-06 Thread Peter Maydell

On 6 December 2011 12:39, Avi Kivity  wrote:
> On 12/06/2011 02:35 PM, Peter Maydell wrote:
>> On 6 December 2011 12:28, Avi Kivity  wrote:
>> Do you have a better suggestion in this case? We've had the same
>> code in the realview board since 2007 when ARM SMP support was first
>> added...
>
> No idea really since I don't fully understand what's going on.  It's
> just a knee-jerk reaction to the word 'hack'.
>
> Can't we just do what real hardware does?

Real hardware runs the boot ROM. The boot ROM is an unredistributable
binary blob, and in any case we almost certainly don't implement
the hardware well enough to pass the boot ROM's self tests and init
code.

We could probably put the pen code in a QEMU-specific bit of ROM
code in the same place the hardware boot ROM actually lives.

Longer term, I'm toying with the idea of having QEMU run UEFI
(for vexpress UEFI can boot the platform from bare metal, and it's
open source so we can (a) redistribute it and (b) add in qemu-specific
code if we need to). That would look a bit more like x86 qemu.

>> There's no particular back-compat implication here as far as I know:
>> the location of the secondary CPU holding pen code is irrelevant to
>> the actual guest being run. (On a real system it will be somewhere
>> inside the boot ROM.)
>
> Suppose you live migrate when the code is running there?

Currently for ARM we permit changes which would break live migration,
because it's not supported to start with.

(Does migration copy across rom contents and sizes?)

-- PMM

Re: [Qemu-devel] [PATCH v5] block:add-cow file format

2011-12-06 Thread Marcelo Tosatti

On Tue, Nov 15, 2011 at 01:28:51PM +0800, Dong Xu Wang wrote:
> From: Dong Xu Wang 
> 
> Provide a new file format: add-cow. The usage can be found in add-cow.txt of
> this patch.
> 
> Signed-off-by: Dong Xu Wang 
> ---
>  Makefile.objs  |1 +
>  block.c|2 +-
>  block.h|1 +
>  block/add-cow.c|  417 
> 
>  block_int.h|1 +
>  docs/specs/add-cow.txt |   57 +++
>  6 files changed, 478 insertions(+), 1 deletions(-)
>  create mode 100644 block/add-cow.c
>  create mode 100644 docs/specs/add-cow.txt
> 
> diff --git a/Makefile.objs b/Makefile.objs
> index d7a6539..ad99243 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -31,6 +31,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>  
>  block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o 
> vpc.o vvfat.o
>  block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o 
> qcow2-cache.o
> +block-nested-y += add-cow.o
>  block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
>  block-nested-y += qed-check.o
>  block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
> diff --git a/block.c b/block.c
> index 86910b0..a2be27b 100644
> --- a/block.c
> +++ b/block.c
> @@ -106,7 +106,7 @@ int is_windows_drive(const char *filename)
>  #endif
>  
>  /* check if the path starts with ":" */
> -static int path_has_protocol(const char *path)
> +int path_has_protocol(const char *path)
>  {
>  #ifdef _WIN32
>  if (is_windows_drive(path) ||
> diff --git a/block.h b/block.h
> index 051a25d..836284f 100644
> --- a/block.h
> +++ b/block.h
> @@ -276,6 +276,7 @@ char *bdrv_snapshot_dump(char *buf, int buf_size, 
> QEMUSnapshotInfo *sn);
>  
>  char *get_human_readable_size(char *buf, int buf_size, int64_t size);
>  int path_is_absolute(const char *path);
> +int path_has_protocol(const char *path);
>  void path_combine(char *dest, int dest_size,
>const char *base_path,
>const char *filename);
> diff --git a/block/add-cow.c b/block/add-cow.c
> new file mode 100644
> index 000..54d30a9
> --- /dev/null
> +++ b/block/add-cow.c
> @@ -0,0 +1,417 @@
> +#include "qemu-common.h"
> +#include "block_int.h"
> +#include "module.h"
> +
> +#define ADD_COW_MAGIC   (((uint64_t)'A' << 56) | ((uint64_t)'D' << 48) | 
> \
> +((uint64_t)'D' << 40) | ((uint64_t)'_' << 32) | \
> +((uint64_t)'C' << 24) | ((uint64_t)'O' << 16) | \
> +((uint64_t)'W' << 8) | 0xFF)
> +#define ADD_COW_VERSION 1
> +#define ADD_COW_FILE_LEN1024
> +
> +typedef struct AddCowHeader {
> +uint64_tmagic;
> +uint32_tversion;
> +charbacking_file[ADD_COW_FILE_LEN];
> +charimage_file[ADD_COW_FILE_LEN];
> +uint64_tsize;
> +} QEMU_PACKED AddCowHeader;
> +
> +typedef struct BDRVAddCowState {
> +charimage_file[ADD_COW_FILE_LEN];
> +BlockDriverState*image_hd;
> +uint8_t *bitmap;
> +uint64_tbitmap_size;
> +CoMutex lock;
> +} BDRVAddCowState;
> +
> +static int add_cow_probe(const uint8_t *buf, int buf_size, const char 
> *filename)
> +{
> +const AddCowHeader *header = (const void *)buf;
> +
> +if (be64_to_cpu(header->magic) == ADD_COW_MAGIC &&
> +be32_to_cpu(header->version) == ADD_COW_VERSION) {
> +return 100;
> +} else {
> +return 0;
> +}
> +}
> +
> +static int add_cow_open(BlockDriverState *bs, int flags)
> +{
> +AddCowHeaderheader;
> +int64_t size;
> +charimage_filename[ADD_COW_FILE_LEN];
> +int image_flags;
> +BlockDriver *image_drv = NULL;
> +int ret;
> +BDRVAddCowState *state = (BDRVAddCowState *)(bs->opaque);
> +
> +ret = bdrv_pread(bs->file, 0, &header, sizeof(header));
> +if (ret != sizeof(header)) {
> +goto fail;
> +}
> +
> +if (be64_to_cpu(header.magic) != ADD_COW_MAGIC ||
> +be32_to_cpu(header.version) != ADD_COW_VERSION) {
> +ret = -EINVAL;
> +goto fail;
> +}
> +
> +size = be64_to_cpu(header.size);
> +bs->total_sectors = size / BDRV_SECTOR_SIZE;
> +
> +QEMU_BUILD_BUG_ON(sizeof(state->image_file) != 
> sizeof(header.image_file));
> +pstrcpy(bs->backing_file, sizeof(bs->backing_file),
> +header.backing_file);
> +pstrcpy(state->image_file, sizeof(state->image_file),
> +header.image_file);
> +
> +state->bitmap_size = ((bs->total_sectors + 7) >> 3);
> +state->bitmap = g_malloc0(state->bitmap_size);
> +
> +ret = bdrv_pread(bs->file, sizeof(header), state->bitmap,
> +state->bitmap_size);
> +if (ret != state->bitmap_size) {
> +goto fail;
> +}

Reading the entire bitmap in memory is not acceptable, it may be huge.
Be
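
To put a number on that concern: the patch allocates one bit per sector, `(total_sectors + 7) >> 3` bytes. A small sketch of the arithmetic follows, assuming 512-byte sectors for `BDRV_SECTOR_SIZE` and sector-aligned image sizes; `add_cow_bitmap_bytes` is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512ULL  /* assumed value of BDRV_SECTOR_SIZE */

/* Bytes needed for a one-bit-per-sector bitmap covering image_bytes. */
static uint64_t add_cow_bitmap_bytes(uint64_t image_bytes)
{
    uint64_t total_sectors;

    /* the sketch only handles sector-aligned image sizes */
    assert(image_bytes % SECTOR_SIZE == 0);
    total_sectors = image_bytes / SECTOR_SIZE;

    return (total_sectors + 7) >> 3;   /* round up to whole bytes */
}
```

For a 1 TiB image this comes to 2^28 bytes, i.e. 256 MiB held in memory for as long as the image is open.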

Re: [Qemu-devel] [PATCH v5] block:add-cow file format

2011-12-06 Thread Kevin Wolf

Am 06.12.2011 13:48, schrieb Marcelo Tosatti:
> On Tue, Nov 15, 2011 at 01:28:51PM +0800, Dong Xu Wang wrote:
>> From: Dong Xu Wang 
>>
>> Provide a new file format: add-cow. The usage can be found in add-cow.txt of
>> this patch.
>>
>> Signed-off-by: Dong Xu Wang 
>> ---
>>  Makefile.objs  |1 +
>>  block.c|2 +-
>>  block.h|1 +
>>  block/add-cow.c|  417 
>> 
>>  block_int.h|1 +
>>  docs/specs/add-cow.txt |   57 +++
>>  6 files changed, 478 insertions(+), 1 deletions(-)
>>  create mode 100644 block/add-cow.c
>>  create mode 100644 docs/specs/add-cow.txt
>>
>> diff --git a/Makefile.objs b/Makefile.objs
>> index d7a6539..ad99243 100644
>> --- a/Makefile.objs
>> +++ b/Makefile.objs
>> @@ -31,6 +31,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>>  
>>  block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o 
>> vpc.o vvfat.o
>>  block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o 
>> qcow2-cache.o
>> +block-nested-y += add-cow.o
>>  block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
>>  block-nested-y += qed-check.o
>>  block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
>> diff --git a/block.c b/block.c
>> index 86910b0..a2be27b 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -106,7 +106,7 @@ int is_windows_drive(const char *filename)
>>  #endif
>>  
>>  /* check if the path starts with ":" */
>> -static int path_has_protocol(const char *path)
>> +int path_has_protocol(const char *path)
>>  {
>>  #ifdef _WIN32
>>  if (is_windows_drive(path) ||
>> diff --git a/block.h b/block.h
>> index 051a25d..836284f 100644
>> --- a/block.h
>> +++ b/block.h
>> @@ -276,6 +276,7 @@ char *bdrv_snapshot_dump(char *buf, int buf_size, 
>> QEMUSnapshotInfo *sn);
>>  
>>  char *get_human_readable_size(char *buf, int buf_size, int64_t size);
>>  int path_is_absolute(const char *path);
>> +int path_has_protocol(const char *path);
>>  void path_combine(char *dest, int dest_size,
>>const char *base_path,
>>const char *filename);
>> diff --git a/block/add-cow.c b/block/add-cow.c
>> new file mode 100644
>> index 000..54d30a9
>> --- /dev/null
>> +++ b/block/add-cow.c
>> @@ -0,0 +1,417 @@
>> +#include "qemu-common.h"
>> +#include "block_int.h"
>> +#include "module.h"
>> +
>> +#define ADD_COW_MAGIC   (((uint64_t)'A' << 56) | ((uint64_t)'D' << 48) 
>> | \
>> +((uint64_t)'D' << 40) | ((uint64_t)'_' << 32) | 
>> \
>> +((uint64_t)'C' << 24) | ((uint64_t)'O' << 16) | 
>> \
>> +((uint64_t)'W' << 8) | 0xFF)
>> +#define ADD_COW_VERSION 1
>> +#define ADD_COW_FILE_LEN1024
>> +
>> +typedef struct AddCowHeader {
>> +uint64_tmagic;
>> +uint32_tversion;
>> +charbacking_file[ADD_COW_FILE_LEN];
>> +charimage_file[ADD_COW_FILE_LEN];
>> +uint64_tsize;
>> +} QEMU_PACKED AddCowHeader;
>> +
>> +typedef struct BDRVAddCowState {
>> +charimage_file[ADD_COW_FILE_LEN];
>> +BlockDriverState*image_hd;
>> +uint8_t *bitmap;
>> +uint64_tbitmap_size;
>> +CoMutex lock;
>> +} BDRVAddCowState;
>> +
>> +static int add_cow_probe(const uint8_t *buf, int buf_size, const char 
>> *filename)
>> +{
>> +const AddCowHeader *header = (const void *)buf;
>> +
>> +if (be64_to_cpu(header->magic) == ADD_COW_MAGIC &&
>> +be32_to_cpu(header->version) == ADD_COW_VERSION) {
>> +return 100;
>> +} else {
>> +return 0;
>> +}
>> +}
>> +
>> +static int add_cow_open(BlockDriverState *bs, int flags)
>> +{
>> +AddCowHeaderheader;
>> +int64_t size;
>> +charimage_filename[ADD_COW_FILE_LEN];
>> +int image_flags;
>> +BlockDriver *image_drv = NULL;
>> +int ret;
>> +BDRVAddCowState *state = (BDRVAddCowState *)(bs->opaque);
>> +
>> +ret = bdrv_pread(bs->file, 0, &header, sizeof(header));
>> +if (ret != sizeof(header)) {
>> +goto fail;
>> +}
>> +
>> +if (be64_to_cpu(header.magic) != ADD_COW_MAGIC ||
>> +be32_to_cpu(header.version) != ADD_COW_VERSION) {
>> +ret = -EINVAL;
>> +goto fail;
>> +}
>> +
>> +size = be64_to_cpu(header.size);
>> +bs->total_sectors = size / BDRV_SECTOR_SIZE;
>> +
>> +QEMU_BUILD_BUG_ON(sizeof(state->image_file) != 
>> sizeof(header.image_file));
>> +pstrcpy(bs->backing_file, sizeof(bs->backing_file),
>> +header.backing_file);
>> +pstrcpy(state->image_file, sizeof(state->image_file),
>> +header.image_file);
>> +
>> +state->bitmap_size = ((bs->total_sectors + 7) >> 3);
>> +state->bitmap = g_malloc0(state->bitmap_size);
>> +
>> +ret = bdrv_pread(bs->file, sizeof(header), sta

[Qemu-devel] [PATCH v3 00/16] uq/master: Introduce basic irqchip support

2011-12-06 Thread Jan Kiszka

In this revision, I'm now trying the approach of backend/frontend
split-ups for the affected IRQ chips. That means we keep a single qdev
device description but fork off specific logic early during device init.
The backends support this by providing hooks that user space and KVM
models can implement differently.
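
As a rough illustration of the split described above, and not the code from this series: one device description, with a table of hooks chosen once at init time. All names here (`IrqchipBackend`, `DemoApic`, `demo_apic_init`, `reevaluations` and so on) are hypothetical, and the difference between the two backends is only a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct DemoApic DemoApic;

/* Hooks a backend provides; user space and KVM fill them differently. */
typedef struct IrqchipBackend {
    const char *name;
    void (*set_tpr)(DemoApic *s, uint8_t val);
} IrqchipBackend;

struct DemoApic {
    const IrqchipBackend *backend;  /* selected once during init */
    uint8_t tpr;
    int reevaluations;              /* times pending IRQs were re-checked */
};

/* User-space model: update state and re-evaluate pending interrupts. */
static void user_set_tpr(DemoApic *s, uint8_t val)
{
    s->tpr = (uint8_t)((val & 0x0f) << 4);
    s->reevaluations++;
}

/* In-kernel model: only cache the value, the kernel does the rest. */
static void kvm_set_tpr(DemoApic *s, uint8_t val)
{
    s->tpr = (uint8_t)((val & 0x0f) << 4);
}

static const IrqchipBackend user_backend = { "user", user_set_tpr };
static const IrqchipBackend kvm_backend  = { "kvm",  kvm_set_tpr };

/* Fork off backend-specific logic early, during device init. */
static void demo_apic_init(DemoApic *s, int in_kernel)
{
    s->backend = in_kernel ? &kvm_backend : &user_backend;
    s->tpr = 0;
    s->reevaluations = 0;
}

/* Frontend entry point: callers never need to know which backend runs. */
static void demo_apic_set_tpr(DemoApic *s, uint8_t val)
{
    assert(s->backend != NULL);
    s->backend->set_tpr(s, val);
}
```

The point of the layout is that code calling `demo_apic_set_tpr` stays identical for both models; only the init path decides which hooks are wired in.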

The result is slightly larger and comes with the not really beautiful
ioapic.kvm_gsi_base property but should otherwise meet expectations.

Comments?

PS: Series is still against old uq/master, therefore containing patches
that took/will take different routes.

Jan Kiszka (16):
  msi: Generalize msix_supported to msi_supported
  kvm: Move kvmclock into hw/kvm folder
  apic: Stop timer on reset
  apic: Introduce backend/frontend infrastructure for KVM reuse
  apic: Open-code timer save/restore
  i8259: Introduce backend/frontend infrastructure for KVM reuse
  ioapic: Convert to memory API
  ioapic: Reject non-dword accesses to IOWIN register
  ioapic: Introduce backend/frontend infrastructure for KVM reuse
  memory: Introduce memory_region_init_reservation
  kvm: Introduce core services for in-kernel irqchip support
  kvm: x86: Establish IRQ0 override control
  kvm: x86: Add user space part for in-kernel APIC
  kvm: x86: Add user space part for in-kernel i8259
  kvm: x86: Add user space part for in-kernel IOAPIC
  kvm: Arm in-kernel irqchip support

 Makefile.objs  |2 +-
 Makefile.target|6 +-
 configure  |1 +
 hw/apic.c  |  303 
 hw/apic_common.c   |  305 
 hw/apic_internal.h |  121 
 hw/i8259.c |  127 ++---
 hw/i8259_common.c  |  173 +++
 hw/i8259_internal.h|   82 +++
 hw/ioapic.c|  160 --
 hw/ioapic_common.c |  138 ++
 hw/ioapic_internal.h   |  106 ++
 hw/kvm/apic.c  |  129 +
 hw/{kvmclock.c => kvm/clock.c} |4 +-
 hw/{kvmclock.h => kvm/clock.h} |0
 hw/kvm/i8259.c |  126 +
 hw/kvm/ioapic.c|  101 +
 hw/msi.c   |8 +
 hw/msi.h   |2 +
 hw/msix.c  |9 +-
 hw/msix.h  |2 -
 hw/pc.c|   19 ++-
 hw/pc.h|1 +
 hw/pc_piix.c   |   66 -
 kvm-all.c  |  154 
 kvm-stub.c |5 +
 kvm.h  |   13 ++
 memory.c   |   36 +
 memory.h   |   16 ++
 qemu-config.c  |4 +
 qemu-options.hx|5 +-
 sysemu.h   |1 -
 target-i386/kvm.c  |   19 +++
 trace-events   |2 +-
 vl.c   |1 -
 35 files changed, 1694 insertions(+), 553 deletions(-)
 create mode 100644 hw/apic_common.c
 create mode 100644 hw/apic_internal.h
 create mode 100644 hw/i8259_common.c
 create mode 100644 hw/i8259_internal.h
 create mode 100644 hw/ioapic_common.c
 create mode 100644 hw/ioapic_internal.h
 create mode 100644 hw/kvm/apic.c
 rename hw/{kvmclock.c => kvm/clock.c} (98%)
 rename hw/{kvmclock.h => kvm/clock.h} (100%)
 create mode 100644 hw/kvm/i8259.c
 create mode 100644 hw/kvm/ioapic.c

-- 
1.7.3.4

[Qemu-devel] [PATCH v3 02/16] kvm: Move kvmclock into hw/kvm folder

2011-12-06 Thread Jan Kiszka

More KVM-specific devices will come, so let's start with moving the
kvmclock into a dedicated folder.

Signed-off-by: Jan Kiszka 
---
 Makefile.target|4 ++--
 configure  |1 +
 hw/{kvmclock.c => kvm/clock.c} |4 ++--
 hw/{kvmclock.h => kvm/clock.h} |0
 hw/pc_piix.c   |2 +-
 5 files changed, 6 insertions(+), 5 deletions(-)
 rename hw/{kvmclock.c => kvm/clock.c} (98%)
 rename hw/{kvmclock.h => kvm/clock.h} (100%)

diff --git a/Makefile.target b/Makefile.target
index 1e90df7..3a9e95d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -231,7 +231,7 @@ obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvmclock.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
@@ -421,7 +421,7 @@ qmp-commands-old.h: $(SRC_PATH)/qmp-commands.hx
 
 clean:
rm -f *.o *.a *~ $(PROGS) nwfpe/*.o fpu/*.o
-   rm -f *.d */*.d tcg/*.o ide/*.o 9pfs/*.o
+   rm -f *.d */*.d tcg/*.o ide/*.o 9pfs/*.o kvm/*.o
rm -f hmp-commands.h qmp-commands-old.h gdbstub-xml.c
 ifdef CONFIG_TRACE_SYSTEMTAP
rm -f *.stp
diff --git a/configure b/configure
index 4f87e0a..d768e44 100755
--- a/configure
+++ b/configure
@@ -3220,6 +3220,7 @@ mkdir -p $target_dir/fpu
 mkdir -p $target_dir/tcg
 mkdir -p $target_dir/ide
 mkdir -p $target_dir/9pfs
+mkdir -p $target_dir/kvm
 if test "$target" = "arm-linux-user" -o "$target" = "armeb-linux-user" -o 
"$target" = "arm-bsd-user" -o "$target" = "armeb-bsd-user" ; then
   mkdir -p $target_dir/nwfpe
 fi
diff --git a/hw/kvmclock.c b/hw/kvm/clock.c
similarity index 98%
rename from hw/kvmclock.c
rename to hw/kvm/clock.c
index 5388bc4..5983271 100644
--- a/hw/kvmclock.c
+++ b/hw/kvm/clock.c
@@ -13,9 +13,9 @@
 
 #include "qemu-common.h"
 #include "sysemu.h"
-#include "sysbus.h"
 #include "kvm.h"
-#include "kvmclock.h"
+#include "hw/sysbus.h"
+#include "hw/kvm/clock.h"
 
 #include 
 #include 
diff --git a/hw/kvmclock.h b/hw/kvm/clock.h
similarity index 100%
rename from hw/kvmclock.h
rename to hw/kvm/clock.h
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index c89042f..22997b0 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -34,7 +34,7 @@
 #include "boards.h"
 #include "ide.h"
 #include "kvm.h"
-#include "kvmclock.h"
+#include "kvm/clock.h"
 #include "sysemu.h"
 #include "sysbus.h"
 #include "arch_init.h"
-- 
1.7.3.4

[Qemu-devel] [PATCH v3 13/16] kvm: x86: Add user space part for in-kernel APIC

2011-12-06 Thread Jan Kiszka

This introduces the alternative APIC backend which makes use of KVM's
in-kernel device model. MSI is not yet supported, so we disable this
when the in-kernel model is in use.

Signed-off-by: Jan Kiszka 
---
 Makefile.target   |2 +-
 hw/kvm/apic.c |  129 +
 hw/pc.c   |   15 --
 kvm.h |3 +
 target-i386/kvm.c |8 +++
 5 files changed, 151 insertions(+), 6 deletions(-)
 create mode 100644 hw/kvm/apic.c

diff --git a/Makefile.target b/Makefile.target
index 4cd3c0e..66b42d5 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -231,7 +231,7 @@ obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvm/clock.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/kvm/apic.c b/hw/kvm/apic.c
new file mode 100644
index 000..ed8638c
--- /dev/null
+++ b/hw/kvm/apic.c
@@ -0,0 +1,129 @@
+/*
+ * KVM in-kernel APIC support
+ *
+ * Copyright (c) 2011 Siemens AG
+ *
+ * Authors:
+ *  Jan Kiszka  
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ */
+#include "hw/apic_internal.h"
+#include "kvm.h"
+
+static inline void kvm_apic_set_reg(struct kvm_lapic_state *kapic,
+   int reg_id, uint32_t val)
+{
+*((uint32_t *)(kapic->regs + (reg_id << 4))) = val;
+}
+
+static inline uint32_t kvm_apic_get_reg(struct kvm_lapic_state *kapic,
+   int reg_id)
+{
+return *((uint32_t *)(kapic->regs + (reg_id << 4)));
+}
+
+int kvm_put_apic(CPUState *env)
+{
+APICState *s = DO_UPCAST(APICState, busdev.qdev, env->apic_state);
+struct kvm_lapic_state kapic;
+int i;
+
+if (s && kvm_enabled() && kvm_irqchip_in_kernel()) {
+memset(&kapic, 0, sizeof(kapic));
+kvm_apic_set_reg(&kapic, 0x2, s->id << 24);
+kvm_apic_set_reg(&kapic, 0x8, s->tpr);
+kvm_apic_set_reg(&kapic, 0xd, s->log_dest << 24);
+kvm_apic_set_reg(&kapic, 0xe, s->dest_mode << 28 | 0x0fff);
+kvm_apic_set_reg(&kapic, 0xf, s->spurious_vec);
+for (i = 0; i < 8; i++) {
+kvm_apic_set_reg(&kapic, 0x10 + i, s->isr[i]);
+kvm_apic_set_reg(&kapic, 0x18 + i, s->tmr[i]);
+kvm_apic_set_reg(&kapic, 0x20 + i, s->irr[i]);
+}
+kvm_apic_set_reg(&kapic, 0x28, s->esr);
+kvm_apic_set_reg(&kapic, 0x30, s->icr[0]);
+kvm_apic_set_reg(&kapic, 0x31, s->icr[1]);
+for (i = 0; i < APIC_LVT_NB; i++) {
+kvm_apic_set_reg(&kapic, 0x32 + i, s->lvt[i]);
+}
+kvm_apic_set_reg(&kapic, 0x38, s->initial_count);
+kvm_apic_set_reg(&kapic, 0x3e, s->divide_conf);
+
+return kvm_vcpu_ioctl(env, KVM_SET_LAPIC, &kapic);
+}
+
+return 0;
+}
+
+int kvm_get_apic(CPUState *env)
+{
+APICState *s = DO_UPCAST(APICState, busdev.qdev, env->apic_state);
+struct kvm_lapic_state kapic;
+int ret, i, v;
+
+if (s && kvm_enabled() && kvm_irqchip_in_kernel()) {
+ret = kvm_vcpu_ioctl(env, KVM_GET_LAPIC, &kapic);
+if (ret < 0) {
+return ret;
+}
+
+s->id = kvm_apic_get_reg(&kapic, 0x2) >> 24;
+s->tpr = kvm_apic_get_reg(&kapic, 0x8);
+s->arb_id = kvm_apic_get_reg(&kapic, 0x9);
+s->log_dest = kvm_apic_get_reg(&kapic, 0xd) >> 24;
+s->dest_mode = kvm_apic_get_reg(&kapic, 0xe) >> 28;
+s->spurious_vec = kvm_apic_get_reg(&kapic, 0xf);
+for (i = 0; i < 8; i++) {
+s->isr[i] = kvm_apic_get_reg(&kapic, 0x10 + i);
+s->tmr[i] = kvm_apic_get_reg(&kapic, 0x18 + i);
+s->irr[i] = kvm_apic_get_reg(&kapic, 0x20 + i);
+}
+s->esr = kvm_apic_get_reg(&kapic, 0x28);
+s->icr[0] = kvm_apic_get_reg(&kapic, 0x30);
+s->icr[1] = kvm_apic_get_reg(&kapic, 0x31);
+for (i = 0; i < APIC_LVT_NB; i++) {
+s->lvt[i] = kvm_apic_get_reg(&kapic, 0x32 + i);
+}
+s->initial_count = kvm_apic_get_reg(&kapic, 0x38);
+s->divide_conf = kvm_apic_get_reg(&kapic, 0x3e);
+
+v = (s->divide_conf & 3) | ((s->divide_conf >> 1) & 4);
+s->count_shift = (v + 1) & 7;
+
+s->initial_count_load_time = qemu_get_clock_ns(vm_clock);
+apic_next_timer(s, s->initial_count_load_time);
+}
+return 0;
+}
+
+static void kvm_apic_set_base(APICState *s, uint64_t val)
+{
+s->apicbase = val;
+}
+
+static void kvm_apic_set_tpr(APICState *s, uint8_t val)
+{
+s->tpr = (val & 0x0f) << 4;
+}
+
+static void kvm_apic_backend_init(APICState *s)
+{
+memory_region_init_reservation(&s->io_memory, "kvm-apic-msi",
+   MSI_SPACE_SIZE);
+}
+
+static APICBackend kvm_apic_backend = {
+.
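
The divide-configuration decoding at the end of kvm_get_apic() above is compact enough to be worth a worked example. The following is a minimal sketch that isolates the same arithmetic; the function name demo_count_shift is hypothetical and not part of the patch. It assumes the usual local APIC encoding, where bits 0, 1 and 3 of the divide configuration register select the divisor.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the two lines in kvm_get_apic():
 * bits 0, 1 and 3 of the divide configuration register are packed
 * into a 3-bit value v, and the timer shift is (v + 1) modulo 8.
 */
static int demo_count_shift(uint32_t divide_conf)
{
    int v = (divide_conf & 3) | ((divide_conf >> 1) & 4);

    assert(v >= 0 && v <= 7);
    return (v + 1) & 7;
}
```

With this helper, a register value of 0x0 gives a shift of 1 (divide by 2), 0x8 gives 5 (divide by 32), and 0xb wraps around to 0 (divide by 1).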

[Qemu-devel] [PATCH v3 15/16] kvm: x86: Add user space part for in-kernel IOAPIC

2011-12-06 Thread Jan Kiszka

This introduces the KVM-accelerated IOAPIC backend and extends the IRQ
routing setup by the 0->2 redirection when needed.

The IOAPIC gains a KVM-specific property that defines the GSI base for
injecting interrupts into the kernel model. This will make it possible to
disentangle PIC and IOAPIC pins for chipsets that support more
sophisticated IRQ routes than the PIIX3. So far the base is kept at 0,
i.e. PIC and IOAPIC share pins 0..15.

Signed-off-by: Jan Kiszka 
---
 Makefile.target  |2 +-
 hw/ioapic_common.c   |1 +
 hw/ioapic_internal.h |1 +
 hw/kvm/ioapic.c  |  101 ++
 hw/pc_piix.c |   14 +++
 5 files changed, 118 insertions(+), 1 deletions(-)
 create mode 100644 hw/kvm/ioapic.c

diff --git a/Makefile.target b/Makefile.target
index 850b80f..2f3407b 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -231,7 +231,7 @@ obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o kvm/ioapic.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/ioapic_common.c b/hw/ioapic_common.c
index 5268459..64dc233 100644
--- a/hw/ioapic_common.c
+++ b/hw/ioapic_common.c
@@ -122,6 +122,7 @@ static SysBusDeviceInfo ioapic_info = {
 .qdev.no_user = 1,
 .qdev.props = (Property[]) {
 DEFINE_PROP_STRING("backend", IOAPICState, backend_name),
+DEFINE_PROP_UINT32("kvm_gsi_base", IOAPICState, kvm_gsi_base, 0),
 DEFINE_PROP_END_OF_LIST(),
 },
 };
diff --git a/hw/ioapic_internal.h b/hw/ioapic_internal.h
index c5fab8b..bf63115 100644
--- a/hw/ioapic_internal.h
+++ b/hw/ioapic_internal.h
@@ -95,6 +95,7 @@ struct IOAPICState {
 
 char *backend_name;
 IOAPICBackend *backend;
+uint32_t kvm_gsi_base;
 };
 
 void ioapic_register_device(void);
diff --git a/hw/kvm/ioapic.c b/hw/kvm/ioapic.c
new file mode 100644
index 000..0e66240
--- /dev/null
+++ b/hw/kvm/ioapic.c
@@ -0,0 +1,101 @@
+/*
+ * KVM in-kernel IOAPIC support
+ *
+ * Copyright (c) 2011 Siemens AG
+ *
+ * Authors:
+ *  Jan Kiszka  
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "hw/pc.h"
+#include "hw/ioapic_internal.h"
+#include "hw/apic_internal.h"
+#include "kvm.h"
+
+static void kvm_ioapic_get(IOAPICState *s)
+{
+struct kvm_irqchip chip;
+struct kvm_ioapic_state *kioapic;
+int ret, i;
+
+chip.chip_id = KVM_IRQCHIP_IOAPIC;
+ret = kvm_vm_ioctl(kvm_state, KVM_GET_IRQCHIP, &chip);
+if (ret < 0) {
+fprintf(stderr, "KVM_GET_IRQCHIP failed: %s\n", strerror(-ret));
+abort();
+}
+
+kioapic = &chip.chip.ioapic;
+
+s->id = kioapic->id;
+s->ioregsel = kioapic->ioregsel;
+s->irr = kioapic->irr;
+for (i = 0; i < IOAPIC_NUM_PINS; i++) {
+s->ioredtbl[i] = kioapic->redirtbl[i].bits;
+}
+}
+
+static void kvm_ioapic_put(IOAPICState *s)
+{
+struct kvm_irqchip chip;
+struct kvm_ioapic_state *kioapic;
+int ret, i;
+
+chip.chip_id = KVM_IRQCHIP_IOAPIC;
+kioapic = &chip.chip.ioapic;
+
+kioapic->id = s->id;
+kioapic->ioregsel = s->ioregsel;
+kioapic->base_address = s->busdev.mmio[0].addr;
+kioapic->irr = s->irr;
+for (i = 0; i < IOAPIC_NUM_PINS; i++) {
+kioapic->redirtbl[i].bits = s->ioredtbl[i];
+}
+
+ret = kvm_vm_ioctl(kvm_state, KVM_SET_IRQCHIP, &chip);
+if (ret < 0) {
+fprintf(stderr, "KVM_SET_IRQCHIP failed: %s\n", strerror(-ret));
+abort();
+}
+}
+
+static void kvm_ioapic_reset(IOAPICState *s)
+{
+ioapic_reset_internal(s);
+
+kvm_ioapic_put(s);
+}
+
+static void kvm_ioapic_set_irq(void *opaque, int irq, int level)
+{
+IOAPICState *s = opaque;
+int delivered;
+
+delivered = kvm_irqchip_set_irq(kvm_state, s->kvm_gsi_base + irq, level);
+apic_set_irq_delivered(delivered);
+}
+
+static void kvm_ioapic_backend_init(IOAPICState *s, int index)
+{
+memory_region_init_reservation(&s->io_memory, "kvm-ioapic", 0x1000);
+
+qdev_init_gpio_in(&s->busdev.qdev, kvm_ioapic_set_irq, IOAPIC_NUM_PINS);
+}
+
+static IOAPICBackend kvm_ioapic_backend = {
+.name = "KVM",
+.init = kvm_ioapic_backend_init,
+.reset = kvm_ioapic_reset,
+.pre_save = kvm_ioapic_get,
+.post_load = kvm_ioapic_put,
+};
+
+static void kvm_ioapic_register(void)
+{
+ioapic_register_backend(&kvm_ioapic_backend);
+}
+
+device_init(kvm_ioapic_register)
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 351b032..dfebf37 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -68,6 +68,15 @@ static void kvm_piix3_setup_irq_routing(bool pci_enabled)
 for (i = 8; i < 16; ++i) {
 kvm_irqchip_add_route(s, i, KVM_IRQCHIP_PIC_SLAVE, i - 8);
 }
+

Re: [Qemu-devel] [PATCH] Add minimal Vexpress Cortex A15 support

2011-12-06 Thread Avi Kivity

On 12/06/2011 02:48 PM, Peter Maydell wrote:
> On 6 December 2011 12:39, Avi Kivity  wrote:
> > On 12/06/2011 02:35 PM, Peter Maydell wrote:
> >> On 6 December 2011 12:28, Avi Kivity  wrote:
> >> Do you have a better suggestion in this case? We've had the same
> >> code in the realview board since 2007 when ARM SMP support was first
> >> added...
> >
> > No idea really since I don't fully understand what's going on.  It's
> > just a knee-jerk reaction to the word 'hack'.
> >
> > Can't we just do what real hardware does?
>
> Real hardware runs the boot ROM. The boot ROM is an unredistributable
> binary blob, and in any case we almost certainly don't implement
> the hardware well enough to pass the boot ROM's self tests and init
> code.
>
> We could probably put the pen code in a QEMU-specific bit of ROM
> code in the same place the hardware boot ROM actually lives.

That would be an improvement.

> Longer term, I'm toying with the idea of having QEMU run UEFI
> (for vexpress UEFI can boot the platform from bare metal, and it's
> open source so we can (a) redistribute it and (b) add in qemu-specific
> code if we need to). That would look a bit more like x86 qemu.

Right.

> >> There's no particular back-compat implication here as far as I know:
> >> the location of the secondary CPU holding pen code is irrelevant to
> >> the actual guest being run. (On a real system it will be somewhere
> >> inside the boot ROM.)
> >
> > Suppose you live migrate when the code is running there?
>
> Currently for ARM we permit changes which would break live migration,
> because it's not supported to start with.

Okay, so we should remove the hack before enabling live migration.

In any case, I understand you currently have to boot with -kernel? 
That's even a stronger reason to fix this thing.

> (Does migration copy across rom contents and sizes?)

Contents, yes, sizes, no.

-- 
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH] Add minimal Vexpress Cortex A15 support

2011-12-06 Thread Peter Maydell

On 6 December 2011 13:12, Avi Kivity  wrote:
> On 12/06/2011 02:48 PM, Peter Maydell wrote:
>> Real hardware runs the boot ROM. The boot ROM is an unredistributable
>> binary blob, and in any case we almost certainly don't implement
>> the hardware well enough to pass the boot ROM's self tests and init
>> code.
>>
>> We could probably put the pen code in a QEMU-specific bit of ROM
>> code in the same place the hardware boot ROM actually lives.
>
> That would be an improvement.

Feel free to submit a patch :-)

Anyway, I don't intend to make cleaning this up be a blocker for
adding A15 system support.

-- PMM

Re: [Qemu-devel] [PATCH] coroutine: switch per-thread free pool to a global pool

2011-12-06 Thread Stefan Hajnoczi

On Mon, Dec 5, 2011 at 5:20 PM, Avi Kivity  wrote:
> ucontext-based coroutines use a free pool to reduce allocations and
> deallocations of coroutine objects.  The pool is per-thread, presumably
> to improve locality.  However, as coroutines are usually allocated in
> a vcpu thread and freed in the I/O thread, the pool accounting gets
> screwed up and we end up allocating and freeing a coroutine for every I/O
> request.  This is expensive since large objects are allocated via the
> kernel, and are not cached by the C runtime.
>
> Fix by switching to a global pool.  This is safe since we're protected
> by the global mutex.

Looks good to me.  I did check how hw/9pfs/ uses coroutines because it
bounces them into worker threads that are not under the QEMU mutex but
they are not created/destroyed there so we should be okay.

Stefan
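
The accounting problem described in the quoted commit message can be reproduced with a small free-list model. This is only a sketch under stated assumptions: DemoCo, DemoPool, DEMO_POOL_MAX and the two helpers are hypothetical names, there is no locking (the patch relies on the global mutex instead), and the objects are plain heap blocks rather than real coroutines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_POOL_MAX 64            /* hypothetical cap on cached objects */

typedef struct DemoCo {
    struct DemoCo *next;
    char stack[256];                /* stand-in for a large coroutine stack */
} DemoCo;

typedef struct DemoPool {
    DemoCo *head;
    unsigned size;
    unsigned long mallocs;          /* counts real allocations, for illustration */
} DemoPool;

/* Take an object from the pool, falling back to malloc() when it is empty. */
static DemoCo *demo_co_get(DemoPool *p)
{
    DemoCo *co = p->head;

    if (co) {
        p->head = co->next;
        p->size--;
        return co;
    }
    p->mallocs++;
    return malloc(sizeof(*co));
}

/* Return an object to the pool, or free it once the pool is full. */
static void demo_co_put(DemoPool *p, DemoCo *co)
{
    assert(co != NULL);
    if (p->size < DEMO_POOL_MAX) {
        co->next = p->head;
        p->head = co;
        p->size++;
    } else {
        free(co);
    }
}
```

If every request takes from one pool and returns to another (the vcpu thread and I/O thread case), the first pool never refills and each request costs a malloc(); the second pool fills to its cap and then frees every object. With a single shared pool, repeated get/put cycles reuse the same object.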

[Qemu-devel] [PATCH v3 09/16] ioapic: Introduce backend/frontend infrastructure for KVM reuse

2011-12-06 Thread Jan Kiszka

Split up the IOAPIC analogously to APIC and i8259. KVM will share the
device description, reset logic and certain init parts with the user
space model.

Signed-off-by: Jan Kiszka 
---
 Makefile.target  |2 +-
 hw/ioapic.c  |  130 ---
 hw/ioapic_common.c   |  137 ++
 hw/ioapic_internal.h |  105 ++
 4 files changed, 253 insertions(+), 121 deletions(-)
 create mode 100644 hw/ioapic_common.c
 create mode 100644 hw/ioapic_internal.h

diff --git a/Makefile.target b/Makefile.target
index 7bb6b13..4cd3c0e 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -226,7 +226,7 @@ obj-$(CONFIG_IVSHMEM) += ivshmem.o
 # Hardware support
 obj-i386-y += vga.o
 obj-i386-y += mc146818rtc.o pc.o
-obj-i386-y += cirrus_vga.o sga.o apic_common.o apic.o ioapic.o piix_pci.o
+obj-i386-y += cirrus_vga.o sga.o apic_common.o apic.o ioapic_common.o ioapic.o piix_pci.o
 obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
diff --git a/hw/ioapic.c b/hw/ioapic.c
index eb75766..2db72e0 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -24,9 +24,7 @@
 #include "pc.h"
 #include "apic.h"
 #include "ioapic.h"
-#include "qemu-timer.h"
-#include "host-utils.h"
-#include "sysbus.h"
+#include "ioapic_internal.h"
 
 //#define DEBUG_IOAPIC
 
@@ -37,62 +35,6 @@
 #define DPRINTF(fmt, ...)
 #endif
 
-#define MAX_IOAPICS 1
-
-#define IOAPIC_VERSION  0x11
-
-#define IOAPIC_LVT_DEST_SHIFT   56
-#define IOAPIC_LVT_MASKED_SHIFT 16
-#define IOAPIC_LVT_TRIGGER_MODE_SHIFT   15
-#define IOAPIC_LVT_REMOTE_IRR_SHIFT 14
-#define IOAPIC_LVT_POLARITY_SHIFT   13
-#define IOAPIC_LVT_DELIV_STATUS_SHIFT   12
-#define IOAPIC_LVT_DEST_MODE_SHIFT  11
-#define IOAPIC_LVT_DELIV_MODE_SHIFT 8
-
-#define IOAPIC_LVT_MASKED   (1 << IOAPIC_LVT_MASKED_SHIFT)
-#define IOAPIC_LVT_REMOTE_IRR   (1 << IOAPIC_LVT_REMOTE_IRR_SHIFT)
-
-#define IOAPIC_TRIGGER_EDGE 0
-#define IOAPIC_TRIGGER_LEVEL 1
-
-/*io{apic,sapic} delivery mode*/
-#define IOAPIC_DM_FIXED 0x0
-#define IOAPIC_DM_LOWEST_PRIORITY   0x1
-#define IOAPIC_DM_PMI   0x2
-#define IOAPIC_DM_NMI   0x4
-#define IOAPIC_DM_INIT  0x5
-#define IOAPIC_DM_SIPI  0x6
-#define IOAPIC_DM_EXTINT 0x7
-#define IOAPIC_DM_MASK  0x7
-
-#define IOAPIC_VECTOR_MASK  0xff
-
-#define IOAPIC_IOREGSEL 0x00
-#define IOAPIC_IOWIN 0x10
-
-#define IOAPIC_REG_ID   0x00
-#define IOAPIC_REG_VER  0x01
-#define IOAPIC_REG_ARB  0x02
-#define IOAPIC_REG_REDTBL_BASE  0x10
-#define IOAPIC_ID   0x00
-
-#define IOAPIC_ID_SHIFT 24
-#define IOAPIC_ID_MASK  0xf
-
-#define IOAPIC_VER_ENTRIES_SHIFT 16
-
-typedef struct IOAPICState IOAPICState;
-
-struct IOAPICState {
-SysBusDevice busdev;
-MemoryRegion io_memory;
-uint8_t id;
-uint8_t ioregsel;
-uint32_t irr;
-uint64_t ioredtbl[IOAPIC_NUM_PINS];
-};
-
 static IOAPICState *ioapics[MAX_IOAPICS];
 
 static void ioapic_service(IOAPICState *s)
@@ -278,83 +220,31 @@ ioapic_mem_write(void *opaque, target_phys_addr_t addr, uint64_t val,
 }
 }
 
-static int ioapic_post_load(void *opaque, int version_id)
-{
-IOAPICState *s = opaque;
-
-if (version_id == 1) {
-/* set sane value */
-s->irr = 0;
-}
-return 0;
-}
-
-static const VMStateDescription vmstate_ioapic = {
-.name = "ioapic",
-.version_id = 3,
-.post_load = ioapic_post_load,
-.minimum_version_id = 1,
-.minimum_version_id_old = 1,
-.fields = (VMStateField[]) {
-VMSTATE_UINT8(id, IOAPICState),
-VMSTATE_UINT8(ioregsel, IOAPICState),
-VMSTATE_UNUSED_V(2, 8), /* to account for qemu-kvm's v2 format */
-VMSTATE_UINT32_V(irr, IOAPICState, 2),
-VMSTATE_UINT64_ARRAY(ioredtbl, IOAPICState, IOAPIC_NUM_PINS),
-VMSTATE_END_OF_LIST()
-}
-};
-
-static void ioapic_reset(DeviceState *d)
-{
-IOAPICState *s = DO_UPCAST(IOAPICState, busdev.qdev, d);
-int i;
-
-s->id = 0;
-s->ioregsel = 0;
-s->irr = 0;
-for (i = 0; i < IOAPIC_NUM_PINS; i++) {
-s->ioredtbl[i] = 1 << IOAPIC_LVT_MASKED_SHIFT;
-}
-}
-
 static const MemoryRegionOps ioapic_io_ops = {
 .read = ioapic_mem_read,
 .write = ioapic_mem_write,
 .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-static int ioapic_init1(SysBusDevice *dev)
+static void ioapic_backend_init(IOAPICState *s, int index)
 {
-IOAPICState *s = FROM_SYSBUS(IOAPICState, dev);
-static int ioapic_no;
-
-if (ioapic_no >= MAX_IOAPICS) {
-return -1;
-}
-
 memory_region_init_io(&s->io_memory,

[Qemu-devel] [PATCH v3 10/16] memory: Introduce memory_region_init_reservation

2011-12-06 Thread Jan Kiszka

Introduce a memory region type that can reserve I/O space. Such regions
are useful for modeling I/O that is only handled outside of QEMU, i.e.
in the context of an accelerator like KVM.

Any access to such a region from QEMU is a bug, but could theoretically
be triggered by guest code (DMA to reserved region). So only warn
about such events once, then ignore them.

Signed-off-by: Jan Kiszka 
---
 memory.c |   36 
 memory.h |   16 
 2 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/memory.c b/memory.c
index dc5e35d..6d55cf6 100644
--- a/memory.c
+++ b/memory.c
@@ -1003,6 +1003,42 @@ void memory_region_init_rom_device(MemoryRegion *mr,
 mr->backend_registered = true;
 }
 
+static uint64_t invalid_read(void *opaque, target_phys_addr_t addr,
+ unsigned size)
+{
+MemoryRegion *mr = opaque;
+
+if (!mr->warning_printed) {
+fprintf(stderr, "Invalid read from memory region %s\n", mr->name);
+mr->warning_printed = true;
+}
+return -1U;
+}
+
+static void invalid_write(void *opaque, target_phys_addr_t addr, uint64_t data,
+  unsigned size)
+{
+MemoryRegion *mr = opaque;
+
+if (!mr->warning_printed) {
+fprintf(stderr, "Invalid write to memory region %s\n", mr->name);
+mr->warning_printed = true;
+}
+}
+
+static const MemoryRegionOps reservation_ops = {
+.read = invalid_read,
+.write = invalid_write,
+.endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+void memory_region_init_reservation(MemoryRegion *mr,
+const char *name,
+uint64_t size)
+{
+memory_region_init_io(mr, &reservation_ops, mr, name, size);
+}
+
 void memory_region_destroy(MemoryRegion *mr)
 {
 assert(QTAILQ_EMPTY(&mr->subregions));
diff --git a/memory.h b/memory.h
index d5b47da..b479350 100644
--- a/memory.h
+++ b/memory.h
@@ -115,6 +115,7 @@ struct MemoryRegion {
 bool terminates;
 bool readable;
 bool readonly; /* For RAM regions */
+bool warning_printed; /* For reservations */
 MemoryRegion *alias;
 target_phys_addr_t alias_offset;
 unsigned priority;
@@ -242,6 +243,21 @@ void memory_region_init_rom_device(MemoryRegion *mr,
uint64_t size);
 
 /**
+ * memory_region_init_reservation: Initialize a memory region that reserves
+ * I/O space.
+ *
+ * A reservation region primarily serves debugging purposes.  It claims I/O
+ * space that is not supposed to be handled by QEMU itself.  Any access via
+ * the memory API triggers a one-time warning; writes are dropped and reads
+ * return -1U.
+ *
+ * @mr: the #MemoryRegion to be initialized
+ * @name: used for debugging; not visible to the user or ABI
+ * @size: size of the region.
+ */
+void memory_region_init_reservation(MemoryRegion *mr,
+const char *name,
+uint64_t size);
+/**
  * memory_region_destroy: Destroy a memory region and relaim all resources.
  *
  * @mr: the region to be destroyed.  May not currently be a subregion
-- 
1.7.3.4
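
The warn-once behaviour of the reservation region can be shown in isolation. The following is a minimal sketch, not the QEMU implementation: DemoRegion and the demo_* functions are hypothetical stand-ins for MemoryRegion and the invalid_read/invalid_write callbacks, and the warnings counter exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for MemoryRegion; only the fields the sketch needs. */
typedef struct DemoRegion {
    const char *name;
    bool warning_printed;
    unsigned warnings;              /* emitted warnings, for illustration only */
} DemoRegion;

static void demo_region_init_reservation(DemoRegion *r, const char *name)
{
    assert(r != NULL && name != NULL);
    r->name = name;
    r->warning_printed = false;
    r->warnings = 0;
}

/* Warn on the first invalid access only, then stay silent. */
static void demo_warn_once(DemoRegion *r, const char *what)
{
    if (!r->warning_printed) {
        fprintf(stderr, "Invalid %s memory region %s\n", what, r->name);
        r->warning_printed = true;
        r->warnings++;
    }
}

static uint64_t demo_invalid_read(DemoRegion *r)
{
    demo_warn_once(r, "read from");
    return -1U;                     /* same return expression as the patch */
}

static void demo_invalid_write(DemoRegion *r, uint64_t data)
{
    (void)data;                     /* the written value is dropped */
    demo_warn_once(r, "write to");
}
```

Note that the flag is shared between reads and writes, so a region that has already warned about a read stays silent on a later write. The patch behaves the same way, since both callbacks test the same warning_printed field.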

[Qemu-devel] [PATCH v3 06/16] i8259: Introduce backend/frontend infrastructure for KVM reuse

2011-12-06 Thread Jan Kiszka

Analogously to the APIC, we will reuse some parts of the user space
i8259 model for KVM. Again, we create a PIC backend infrastructure and
provide hooks for init, reset, and vmload/save. This also introduces a
common helper to instantiate a single i8259 chip from the cascade-
creating i8259_init function.

Signed-off-by: Jan Kiszka 
---
 Makefile.objs   |2 +-
 hw/i8259.c  |  127 +-
 hw/i8259_common.c   |  173 +++
 hw/i8259_internal.h |   82 
 4 files changed, 271 insertions(+), 113 deletions(-)
 create mode 100644 hw/i8259_common.c
 create mode 100644 hw/i8259_internal.h

diff --git a/Makefile.objs b/Makefile.objs
index 01587c8..5372eec 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -220,7 +220,7 @@ hw-obj-$(CONFIG_APPLESMC) += applesmc.o
 hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o ccid-card-passthru.o
 hw-obj-$(CONFIG_SMARTCARD_NSS) += ccid-card-emulated.o
 hw-obj-$(CONFIG_USB_REDIR) += usb-redir.o
-hw-obj-$(CONFIG_I8259) += i8259.o
+hw-obj-$(CONFIG_I8259) += i8259_common.o i8259.o
 
 # PPC devices
 hw-obj-$(CONFIG_PREP_PCI) += prep_pci.o
diff --git a/hw/i8259.c b/hw/i8259.c
index ab519de..413802c 100644
--- a/hw/i8259.c
+++ b/hw/i8259.c
@@ -26,6 +26,7 @@
 #include "isa.h"
 #include "monitor.h"
 #include "qemu-timer.h"
+#include "i8259_internal.h"
 
 /* debug PIC */
 //#define DEBUG_PIC
@@ -40,33 +41,6 @@
 //#define DEBUG_IRQ_LATENCY
 //#define DEBUG_IRQ_COUNT
 
-struct PicState {
-ISADevice dev;
-uint8_t last_irr; /* edge detection */
-uint8_t irr; /* interrupt request register */
-uint8_t imr; /* interrupt mask register */
-uint8_t isr; /* interrupt service register */
-uint8_t priority_add; /* highest irq priority */
-uint8_t irq_base;
-uint8_t read_reg_select;
-uint8_t poll;
-uint8_t special_mask;
-uint8_t init_state;
-uint8_t auto_eoi;
-uint8_t rotate_on_auto_eoi;
-uint8_t special_fully_nested_mode;
-uint8_t init4; /* true if 4 byte init */
-uint8_t single_mode; /* true if slave pic is not initialized */
-uint8_t elcr; /* PIIX edge/trigger selection*/
-uint8_t elcr_mask;
-qemu_irq int_out[1];
-uint32_t master; /* reflects /SP input pin */
-uint32_t iobase;
-uint32_t elcr_addr;
-MemoryRegion base_io;
-MemoryRegion elcr_io;
-};
-
 #if defined(DEBUG_PIC) || defined(DEBUG_IRQ_COUNT)
 static int irq_level[16];
 #endif
@@ -248,29 +222,12 @@ int pic_read_irq(PicState *s)
 
 static void pic_init_reset(PicState *s)
 {
-s->last_irr = 0;
-s->irr = 0;
-s->imr = 0;
-s->isr = 0;
-s->priority_add = 0;
-s->irq_base = 0;
-s->read_reg_select = 0;
-s->poll = 0;
-s->special_mask = 0;
-s->init_state = 0;
-s->auto_eoi = 0;
-s->rotate_on_auto_eoi = 0;
-s->special_fully_nested_mode = 0;
-s->init4 = 0;
-s->single_mode = 0;
-/* Note: ELCR is not reset */
+pic_reset_internal(s);
 pic_update_irq(s);
 }
 
-static void pic_reset(DeviceState *dev)
+static void pic_reset(PicState *s)
 {
-PicState *s = container_of(dev, PicState, dev.qdev);
-
 pic_init_reset(s);
 s->elcr = 0;
 }
@@ -418,32 +375,6 @@ static uint64_t elcr_ioport_read(void *opaque, target_phys_addr_t addr,
 return s->elcr;
 }
 
-static const VMStateDescription vmstate_pic = {
-.name = "i8259",
-.version_id = 1,
-.minimum_version_id = 1,
-.minimum_version_id_old = 1,
-.fields = (VMStateField[]) {
-VMSTATE_UINT8(last_irr, PicState),
-VMSTATE_UINT8(irr, PicState),
-VMSTATE_UINT8(imr, PicState),
-VMSTATE_UINT8(isr, PicState),
-VMSTATE_UINT8(priority_add, PicState),
-VMSTATE_UINT8(irq_base, PicState),
-VMSTATE_UINT8(read_reg_select, PicState),
-VMSTATE_UINT8(poll, PicState),
-VMSTATE_UINT8(special_mask, PicState),
-VMSTATE_UINT8(init_state, PicState),
-VMSTATE_UINT8(auto_eoi, PicState),
-VMSTATE_UINT8(rotate_on_auto_eoi, PicState),
-VMSTATE_UINT8(special_fully_nested_mode, PicState),
-VMSTATE_UINT8(init4, PicState),
-VMSTATE_UINT8(single_mode, PicState),
-VMSTATE_UINT8(elcr, PicState),
-VMSTATE_END_OF_LIST()
-}
-};
-
 static const MemoryRegionOps pic_base_ioport_ops = {
 .read = pic_ioport_read,
 .write = pic_ioport_write,
@@ -462,24 +393,13 @@ static const MemoryRegionOps pic_elcr_ioport_ops = {
 },
 };
 
-static int pic_initfn(ISADevice *dev)
+static void pic_backend_init(PicState *s)
 {
-PicState *s = DO_UPCAST(PicState, dev, dev);
-
 memory_region_init_io(&s->base_io, &pic_base_ioport_ops, s, "pic", 2);
 memory_region_init_io(&s->elcr_io, &pic_elcr_ioport_ops, s, "elcr", 1);
 
-isa_register_ioport(NULL, &s->base_io, s->iobase);
-if (s->elcr_addr != -1) {
-isa_register_ioport(NULL, &s->elcr_io, s->elcr_addr);
-}
-
-qdev_init_gpio_out(&dev->qdev, s->int_out, ARRAY

[Qemu-devel] [PATCH v3 08/16] ioapic: Reject non-dword accesses to IOWIN register

2011-12-06 Thread Jan Kiszka

Aligns the model with the spec.

Signed-off-by: Jan Kiszka 
---
 hw/ioapic.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/hw/ioapic.c b/hw/ioapic.c
index 56b1612..eb75766 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -208,6 +208,9 @@ ioapic_mem_read(void *opaque, target_phys_addr_t addr, unsigned int size)
 val = s->ioregsel;
 break;
 case IOAPIC_IOWIN:
+if (size != 4) {
+break;
+}
 switch (s->ioregsel) {
 case IOAPIC_REG_ID:
 val = s->id << IOAPIC_ID_SHIFT;
@@ -247,6 +250,9 @@ ioapic_mem_write(void *opaque, target_phys_addr_t addr, uint64_t val,
 s->ioregsel = val;
 break;
 case IOAPIC_IOWIN:
+if (size != 4) {
+break;
+}
 DPRINTF("write: %08x = %08x\n", s->ioregsel, val);
 switch (s->ioregsel) {
 case IOAPIC_REG_ID:
-- 
1.7.3.4

[Qemu-devel] [PATCH v3 16/16] kvm: Arm in-kernel irqchip support

2011-12-06 Thread Jan Kiszka

Make the basic in-kernel irqchip support selectable via
-machine ...,kernel_irqchip=on. Leave it off by default until it can
fully replace user space models.

Signed-off-by: Jan Kiszka 
---
 qemu-config.c   |4 
 qemu-options.hx |5 -
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/qemu-config.c b/qemu-config.c
index 90b6b3e..fc25115 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -483,6 +483,10 @@ static QemuOptsList qemu_machine_opts = {
 .name = "accel",
 .type = QEMU_OPT_STRING,
 .help = "accelerator list",
+}, {
+.name = "kernel_irqchip",
+.type = QEMU_OPT_BOOL,
+.help = "use KVM in-kernel irqchip",
 },
 { /* End of list */ }
 },
diff --git a/qemu-options.hx b/qemu-options.hx
index 5d2a776..e10186b 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -31,7 +31,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
 "-machine [type=]name[,prop[=value][,...]]\n"
 "selects emulated machine (-machine ? for list)\n"
 "property accel=accel1[:accel2[:...]] selects accelerator\n"
-"supported accelerators are kvm, xen, tcg (default: tcg)\n",
+"supported accelerators are kvm, xen, tcg (default: tcg)\n"
+"kernel_irqchip=on|off controls accelerated irqchip support\n",
 QEMU_ARCH_ALL)
 STEXI
 @item -machine [type=]@var{name}[,prop=@var{value}[,...]]
@@ -44,6 +45,8 @@ This is used to enable an accelerator. Depending on the target architecture,
 kvm, xen, or tcg can be available. By default, tcg is used. If there is more
 than one accelerator specified, the next one is used if the previous one fails
 to initialize.
+@item kernel_irqchip=on|off
+Enables in-kernel irqchip support for the chosen accelerator when available.
 @end table
 ETEXI
 
-- 
1.7.3.4

Re: [Qemu-devel] [PATCH v3 00/16] uq/master: Introduce basic irqchip support

2011-12-06 Thread Avi Kivity

On 12/06/2011 02:58 PM, Jan Kiszka wrote:
> In this revision, I'm now trying the approach of backend/frontend
> split-ups for the affected IRQ chips. That means we keep a single qdev
> device description but fork off specific logic early during device init.
> The backends support this by providing hooks that user space and KVM
> models can implement differently.
>
> The result is slightly larger and comes with the not really beautiful
> ioapic.kvm_gsi_base property but should otherwise meet expectations.
>
> Comments?

Looks good to me, much nicer than the previous approaches.  I'll wait a
bit for more reviews though.

> PS: Series is still against old uq/master, therefore containing patches
> that took/will take different routes.

I just pushed a rebased uq/master.  In the future, either ping me or
just base on upstream (which uq/master supposedly tracks).

-- 
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH v3 01/16] msi: Generalize msix_supported to msi_supported

2011-12-06 Thread Jan Kiszka

Rename msix_supported to msi_supported and control MSI and MSI-X
activation this way. That was likely the original intention for this
flag, but MSI support came after MSI-X.

Signed-off-by: Jan Kiszka 
---
 hw/msi.c  |8 
 hw/msi.h  |2 ++
 hw/msix.c |9 -
 hw/msix.h |2 --
 hw/pc.c   |4 ++--
 5 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/hw/msi.c b/hw/msi.c
index f214fcf..5d6ceb6 100644
--- a/hw/msi.c
+++ b/hw/msi.c
@@ -36,6 +36,9 @@
 
 #define PCI_MSI_VECTORS_MAX 32
 
+/* Flag for interrupt controller to declare MSI/MSI-X support */
+bool msi_supported;
+
 /* If we get rid of cap allocator, we won't need this. */
 static inline uint8_t msi_cap_sizeof(uint16_t flags)
 {
@@ -116,6 +119,11 @@ int msi_init(struct PCIDevice *dev, uint8_t offset,
 uint16_t flags;
 uint8_t cap_size;
 int config_offset;
+
+if (!msi_supported) {
+return -ENOTSUP;
+}
+
 MSI_DEV_PRINTF(dev,
"init offset: 0x%"PRIx8" vector: %"PRId8
" 64bit %d mask %d\n",
diff --git a/hw/msi.h b/hw/msi.h
index 5766018..3040bb0 100644
--- a/hw/msi.h
+++ b/hw/msi.h
@@ -24,6 +24,8 @@
 #include "qemu-common.h"
 #include "pci.h"
 
+extern bool msi_supported;
+
 bool msi_enabled(const PCIDevice *dev);
 int msi_init(struct PCIDevice *dev, uint8_t offset,
  unsigned int nr_vectors, bool msi64bit, bool msi_per_vector_mask);
diff --git a/hw/msix.c b/hw/msix.c
index b15bafc..8850fbd 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -12,6 +12,7 @@
  */
 
 #include "hw.h"
+#include "msi.h"
 #include "msix.h"
 #include "pci.h"
 #include "range.h"
@@ -32,9 +33,6 @@
 #define MSIX_MAX_ENTRIES 32
 
 
-/* Flag for interrupt controller to declare MSI-X support */
-int msix_supported;
-
 /* Add MSI-X capability to the config space for the device. */
 /* Given a bar and its size, add MSI-X table on top of it
  * and fill MSI-X capability in the config space.
@@ -212,10 +210,11 @@ int msix_init(struct PCIDevice *dev, unsigned short nentries,
   unsigned bar_nr, unsigned bar_size)
 {
 int ret;
+
 /* Nothing to do if MSI is not supported by interrupt controller */
-if (!msix_supported)
+if (!msi_supported) {
 return -ENOTSUP;
-
+}
 if (nentries > MSIX_MAX_ENTRIES)
 return -EINVAL;
 
diff --git a/hw/msix.h b/hw/msix.h
index 7e04336..5aba22b 100644
--- a/hw/msix.h
+++ b/hw/msix.h
@@ -29,6 +29,4 @@ void msix_notify(PCIDevice *dev, unsigned vector);
 
 void msix_reset(PCIDevice *dev);
 
-extern int msix_supported;
-
 #endif
diff --git a/hw/pc.c b/hw/pc.c
index 9328ee5..5225d5b 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -36,7 +36,7 @@
 #include "elf.h"
 #include "multiboot.h"
 #include "mc146818rtc.h"
-#include "msix.h"
+#include "msi.h"
 #include "sysbus.h"
 #include "sysemu.h"
 #include "blockdev.h"
@@ -896,7 +896,7 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
 apic_mapped = 1;
 }
 
-msix_supported = 1;
+msi_supported = true;
 
 return dev;
 }
-- 
1.7.3.4
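
The gating pattern this patch introduces, one flag consulted by both init paths, can be sketched on its own. The demo_* names and DEMO_MSIX_MAX_ENTRIES are hypothetical, the functions do no real capability setup, and the assert() on nr_vectors is an assumption made for the sketch rather than something taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define DEMO_MSIX_MAX_ENTRIES 32

/* One flag for both MSI and MSI-X; set by the interrupt controller model. */
static bool demo_msi_supported;

static int demo_msi_init(unsigned nr_vectors)
{
    /* Hypothetical precondition for the sketch: at least one vector. */
    assert(nr_vectors > 0);
    if (!demo_msi_supported) {
        return -ENOTSUP;
    }
    return 0;
}

static int demo_msix_init(unsigned nentries)
{
    if (!demo_msi_supported) {
        return -ENOTSUP;
    }
    if (nentries > DEMO_MSIX_MAX_ENTRIES) {
        return -EINVAL;
    }
    return 0;
}

/* Stand-in for the point where apic_init() sets the flag in the patch. */
static void demo_apic_init(void)
{
    demo_msi_supported = true;
}
```

Before demo_apic_init() runs, both init functions fail with -ENOTSUP; afterwards they succeed, and the MSI-X path still rejects an oversized table with -EINVAL.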

[Qemu-devel] [PATCH v3 03/16] apic: Stop timer on reset

2011-12-06 Thread Jan Kiszka

All LVTs are masked on reset, so the timer becomes ineffective. Letting
it tick nevertheless is harmless, but will at least create a spurious
trace event.

Signed-off-by: Jan Kiszka 
---
 hw/apic.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 8289eef..2644a82 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -528,6 +528,8 @@ void apic_init_reset(DeviceState *d)
 s->initial_count_load_time = 0;
 s->next_time = 0;
 s->wait_for_sipi = 1;
+
+qemu_del_timer(s->timer);
 }
 
 static void apic_startup(APICState *s, int vector_num)
-- 
1.7.3.4

[Qemu-devel] [PATCH v3 12/16] kvm: x86: Establish IRQ0 override control

2011-12-06 Thread Jan Kiszka

KVM is forced to disable the IRQ0 override when we run with in-kernel
irqchip but without IRQ routing support of the kernel. Set the fwcfg
value correspondingly. This aligns us with qemu-kvm.

Signed-off-by: Jan Kiszka 
---
 hw/pc.c|3 ++-
 kvm-all.c  |5 +
 kvm-stub.c |5 +
 kvm.h  |2 ++
 sysemu.h   |1 -
 vl.c   |1 -
 6 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index fcecbf2..99c1bfd 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -39,6 +39,7 @@
 #include "msi.h"
 #include "sysbus.h"
 #include "sysemu.h"
+#include "kvm.h"
 #include "blockdev.h"
 #include "ui/qemu-spice.h"
 #include "memory.h"
@@ -609,7 +610,7 @@ static void *bochs_bios_init(void)
 fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
 fw_cfg_add_bytes(fw_cfg, FW_CFG_ACPI_TABLES, (uint8_t *)acpi_tables,
  acpi_tables_len);
-fw_cfg_add_bytes(fw_cfg, FW_CFG_IRQ0_OVERRIDE, &irq0override, 1);
+fw_cfg_add_i32(fw_cfg, FW_CFG_IRQ0_OVERRIDE, kvm_allows_irq0_override());
 
 smbios_table = smbios_get_table(&smbios_len);
 if (smbios_table)
diff --git a/kvm-all.c b/kvm-all.c
index a85e14f..665455c 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1260,6 +1260,11 @@ int kvm_has_gsi_routing(void)
 return kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING);
 }
 
+int kvm_allows_irq0_override(void)
+{
+return !kvm_enabled() || !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
 if (!kvm_has_sync_mmu()) {
diff --git a/kvm-stub.c b/kvm-stub.c
index 06064b9..6c2b06b 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -78,6 +78,11 @@ int kvm_has_many_ioeventfds(void)
 return 0;
 }
 
+int kvm_allows_irq0_override(void)
+{
+return 1;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
 }
diff --git a/kvm.h b/kvm.h
index 0d6c453..a3c87af 100644
--- a/kvm.h
+++ b/kvm.h
@@ -53,6 +53,8 @@ int kvm_has_xcrs(void);
 int kvm_has_many_ioeventfds(void);
 int kvm_has_gsi_routing(void);
 
+int kvm_allows_irq0_override(void);
+
 #ifdef NEED_CPU_H
 int kvm_init_vcpu(CPUState *env);
 
diff --git a/sysemu.h b/sysemu.h
index 22cd720..3bd896e 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -102,7 +102,6 @@ extern int vga_interface_type;
 extern int graphic_width;
 extern int graphic_height;
 extern int graphic_depth;
-extern uint8_t irq0override;
 extern DisplayType display_type;
 extern const char *keyboard_layout;
 extern int win2k_install_hack;
diff --git a/vl.c b/vl.c
index fcce25f..22d02b9 100644
--- a/vl.c
+++ b/vl.c
@@ -218,7 +218,6 @@ int no_reboot = 0;
 int no_shutdown = 0;
 int cursor_hide = 1;
 int graphic_rotate = 0;
-uint8_t irq0override = 1;
 const char *watchdog;
 QEMUOptionRom option_rom[MAX_OPTION_ROMS];
 int nb_option_roms;
-- 
1.7.3.4
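
The condition in kvm_allows_irq0_override() is easier to read as a truth table: the override is dropped in exactly one of the eight input combinations. The sketch below restates the predicate over plain booleans; demo_allows_irq0_override and demo_check_override_table are hypothetical names, and the three parameters stand in for kvm_enabled(), kvm_irqchip_in_kernel() and kvm_has_gsi_routing().

```c
#include <assert.h>
#include <stdbool.h>

/* Same boolean expression as the patch, with the three inputs made explicit. */
static bool demo_allows_irq0_override(bool kvm_on, bool irqchip_in_kernel,
                                      bool has_gsi_routing)
{
    return !kvm_on || !irqchip_in_kernel || has_gsi_routing;
}

/*
 * Walk all eight input combinations.  The override is only forbidden when
 * KVM is enabled, the irqchip lives in the kernel, and the kernel lacks
 * IRQ routing support.
 */
static void demo_check_override_table(void)
{
    for (int k = 0; k < 2; k++) {
        for (int i = 0; i < 2; i++) {
            for (int r = 0; r < 2; r++) {
                bool forbidden = k && i && !r;

                assert(demo_allows_irq0_override(k, i, r) == !forbidden);
            }
        }
    }
}
```

This also shows why the kvm-stub.c variant can simply return 1: without KVM the first term is always true.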

[Qemu-devel] [PATCH v3 04/16] apic: Introduce backend/frontend infrastructure for KVM reuse

2011-12-06 Thread Jan Kiszka

The KVM in-kernel APIC model will reuse parts of the user space model
while providing the same frontend view to guest and most management
interfaces. Introduce an APIC backend concept to encapsulate those
parts that will tell user space and KVM model apart. The backend offers
callback hooks for init and base/tpr setting that will be implemented
accordingly.

Signed-off-by: Jan Kiszka 
---
 Makefile.target|2 +-
 hw/apic.c  |  277 +++-
 hw/apic_common.c   |  258 
 hw/apic_internal.h |  118 ++
 hw/pc.c|1 +
 trace-events   |2 +-
 6 files changed, 393 insertions(+), 265 deletions(-)
 create mode 100644 hw/apic_common.c
 create mode 100644 hw/apic_internal.h

diff --git a/Makefile.target b/Makefile.target
index 3a9e95d..7bb6b13 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -226,7 +226,7 @@ obj-$(CONFIG_IVSHMEM) += ivshmem.o
 # Hardware support
 obj-i386-y += vga.o
 obj-i386-y += mc146818rtc.o pc.o
-obj-i386-y += cirrus_vga.o sga.o apic.o ioapic.o piix_pci.o
+obj-i386-y += cirrus_vga.o sga.o apic_common.o apic.o ioapic.o piix_pci.o
 obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
diff --git a/hw/apic.c b/hw/apic.c
index 2644a82..8c8f658 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -16,53 +16,13 @@
  * You should have received a copy of the GNU Lesser General Public
  * License along with this library; if not, see 
  */
-#include "hw.h"
+#include "apic_internal.h"
 #include "apic.h"
 #include "ioapic.h"
-#include "qemu-timer.h"
 #include "host-utils.h"
-#include "sysbus.h"
 #include "trace.h"
 #include "pc.h"
 
-/* APIC Local Vector Table */
-#define APIC_LVT_TIMER   0
-#define APIC_LVT_THERMAL 1
-#define APIC_LVT_PERFORM 2
-#define APIC_LVT_LINT0   3
-#define APIC_LVT_LINT1   4
-#define APIC_LVT_ERROR   5
-#define APIC_LVT_NB  6
-
-/* APIC delivery modes */
-#define APIC_DM_FIXED  0
-#define APIC_DM_LOWPRI 1
-#define APIC_DM_SMI    2
-#define APIC_DM_NMI    4
-#define APIC_DM_INIT   5
-#define APIC_DM_SIPI   6
-#define APIC_DM_EXTINT 7
-
-/* APIC destination mode */
-#define APIC_DESTMODE_FLAT 0xf
-#define APIC_DESTMODE_CLUSTER  1
-
-#define APIC_TRIGGER_EDGE  0
-#define APIC_TRIGGER_LEVEL 1
-
-#define APIC_LVT_TIMER_PERIODIC (1<<17)
-#define APIC_LVT_MASKED         (1<<16)
-#define APIC_LVT_LEVEL_TRIGGER  (1<<15)
-#define APIC_LVT_REMOTE_IRR     (1<<14)
-#define APIC_INPUT_POLARITY     (1<<13)
-#define APIC_SEND_PENDING       (1<<12)
-
-#define ESR_ILLEGAL_ADDRESS (1 << 7)
-
-#define APIC_SV_DIRECTED_IO (1<<12)
-#define APIC_SV_ENABLE  (1<<8)
-
-#define MAX_APICS 255
 #define MAX_APIC_WORDS 8
 
 /* Intel APIC constants: from include/asm/msidef.h */
@@ -75,40 +35,7 @@
 #define MSI_ADDR_DEST_ID_SHIFT 12
 #defineMSI_ADDR_DEST_ID_MASK   0x000
 
-#define MSI_ADDR_SIZE   0x10
-
-typedef struct APICState APICState;
-
-struct APICState {
-SysBusDevice busdev;
-MemoryRegion io_memory;
-void *cpu_env;
-uint32_t apicbase;
-uint8_t id;
-uint8_t arb_id;
-uint8_t tpr;
-uint32_t spurious_vec;
-uint8_t log_dest;
-uint8_t dest_mode;
-uint32_t isr[8];  /* in service register */
-uint32_t tmr[8];  /* trigger mode register */
-uint32_t irr[8]; /* interrupt request register */
-uint32_t lvt[APIC_LVT_NB];
-uint32_t esr; /* error register */
-uint32_t icr[2];
-
-uint32_t divide_conf;
-int count_shift;
-uint32_t initial_count;
-int64_t initial_count_load_time, next_time;
-uint32_t idx;
-QEMUTimer *timer;
-int sipi_vector;
-int wait_for_sipi;
-};
-
 static APICState *local_apics[MAX_APICS + 1];
-static int apic_irq_delivered;
 
 static void apic_set_irq(APICState *s, int vector_num, int trigger_mode);
 static void apic_update_irq(APICState *s);
@@ -293,14 +220,8 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode, 
uint8_t delivery_mode,
 apic_bus_deliver(deliver_bitmask, delivery_mode, vector_num, trigger_mode);
 }
 
-void cpu_set_apic_base(DeviceState *d, uint64_t val)
+static void apic_set_base(APICState *s, uint64_t val)
 {
-APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-
-trace_cpu_set_apic_base(val);
-
-if (!s)
-return;
 s->apicbase = (val & 0xf000) |
 (s->apicbase & (MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE));
 /* if disabled, cannot be enabled again */
@@ -311,32 +232,12 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val)
 }
 }
 
-uint64_t cpu_get_apic_base(DeviceState *d)
+static void apic_set_tpr(APICState *s, uint8_t val)
 {
-APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
-
-trace_cpu_get_apic_base(s ? (uint64_

[Qemu-devel] [PATCH v3 05/16] apic: Open-code timer save/restore

2011-12-06 Thread Jan Kiszka

To enable migration between accelerated and non-accelerated APIC models,
we will need to handle the timer saving and restoring specially and can
no longer rely on the automatics of VMSTATE_TIMER. Specifically, the
accelerated model will not start any QEMUTimer.

This patch therefore factors out the generic bits into apic_next_timer
and introduces a post-load callback that can be implemented differently
by both models.
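
As a worked illustration of the expiry computation that gets factored out, here is a standalone sketch mirroring the logic of apic_next_timer from the diff below. DemoTimer and demo_next_timer are hypothetical names; the struct is a reduced stand-in for the relevant APICState fields, with the LVT mask and periodic bits flattened into booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduced timer state; not the real APICState. */
typedef struct DemoTimer {
    bool masked;            /* LVT timer entry masked */
    bool periodic;          /* periodic vs. one-shot mode */
    uint32_t initial_count;
    int count_shift;        /* derived from the divide configuration */
    int64_t load_time;      /* when initial_count was loaded */
    int64_t timer_expiry;   /* -1 when no timer is pending */
} DemoTimer;

/* Returns true and fills timer_expiry if a timer has to be armed. */
bool demo_next_timer(DemoTimer *t, int64_t now)
{
    int64_t d;
    int64_t period = (int64_t)t->initial_count + 1;

    t->timer_expiry = -1;
    if (t->masked) {
        return false;
    }

    /* Elapsed timer ticks since the count was loaded. */
    d = (now - t->load_time) >> t->count_shift;

    if (t->periodic) {
        if (!t->initial_count) {
            return false;
        }
        /* Round up to the end of the current period. */
        d = ((d / period) + 1) * period;
    } else {
        if (d >= t->initial_count) {
            return false;   /* one-shot timer already fired */
        }
        d = period;
    }
    t->timer_expiry = t->load_time + (d << t->count_shift);
    return true;
}
```

Since timer_expiry is a plain integer, it can be migrated as is and interpreted on the destination by whichever model is active there.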

Signed-off-by: Jan Kiszka 
---
 hw/apic.c  |   30 --
 hw/apic_common.c   |   51 +--
 hw/apic_internal.h |3 +++
 3 files changed, 64 insertions(+), 20 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 8c8f658..d5a3f84 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -516,25 +516,9 @@ static uint32_t apic_get_current_count(APICState *s)
 
 static void apic_timer_update(APICState *s, int64_t current_time)
 {
-int64_t next_time, d;
-
-if (!(s->lvt[APIC_LVT_TIMER] & APIC_LVT_MASKED)) {
-d = (current_time - s->initial_count_load_time) >>
-s->count_shift;
-if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) {
-if (!s->initial_count)
-goto no_timer;
-d = ((d / ((uint64_t)s->initial_count + 1)) + 1) * 
((uint64_t)s->initial_count + 1);
-} else {
-if (d >= s->initial_count)
-goto no_timer;
-d = (uint64_t)s->initial_count + 1;
-}
-next_time = s->initial_count_load_time + (d << s->count_shift);
-qemu_mod_timer(s->timer, next_time);
-s->next_time = next_time;
+if (apic_next_timer(s, current_time)) {
+qemu_mod_timer(s->timer, s->next_time);
 } else {
-no_timer:
 qemu_del_timer(s->timer);
 }
 }
@@ -765,11 +749,21 @@ static void apic_backend_init(APICState *s)
 local_apics[s->idx] = s;
 }
 
+static void apic_post_load(APICState *s)
+{
+if (s->timer_expiry != -1) {
+qemu_mod_timer(s->timer, s->timer_expiry);
+} else {
+qemu_del_timer(s->timer);
+}
+}
+
 static APICBackend apic_backend = {
 .name = "QEMU",
 .init = apic_backend_init,
 .set_base = apic_set_base,
 .set_tpr = apic_set_tpr,
+.post_load = apic_post_load,
 };
 
 static void apic_register_devices(void)
diff --git a/hw/apic_common.c b/hw/apic_common.c
index 1e6f287..e6ac1af 100644
--- a/hw/apic_common.c
+++ b/hw/apic_common.c
@@ -82,6 +82,39 @@ int apic_get_irq_delivered(void)
 return apic_irq_delivered;
 }
 
+bool apic_next_timer(APICState *s, int64_t current_time)
+{
+int64_t d;
+
+/* We need to store the timer state separately to support APIC
+ * implementations that maintain a non-QEMU timer, e.g. inside the
+ * host kernel. This open-coded state allows us to migrate between
+ * both models. */
+s->timer_expiry = -1;
+
+if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_MASKED) {
+return false;
+}
+
+d = (current_time - s->initial_count_load_time) >> s->count_shift;
+
+if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) {
+if (!s->initial_count) {
+return false;
+}
+d = ((d / ((uint64_t)s->initial_count + 1)) + 1) *
+((uint64_t)s->initial_count + 1);
+} else {
+if (d >= s->initial_count) {
+return false;
+}
+d = (uint64_t)s->initial_count + 1;
+}
+s->next_time = s->initial_count_load_time + (d << s->count_shift);
+s->timer_expiry = s->next_time;
+return true;
+}
+
 void apic_init_reset(DeviceState *d)
 {
 APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
@@ -109,7 +142,10 @@ void apic_init_reset(DeviceState *d)
 s->next_time = 0;
 s->wait_for_sipi = 1;
 
-qemu_del_timer(s->timer);
+if (s->timer) {
+qemu_del_timer(s->timer);
+}
+s->timer_expiry = -1;
 }
 
 static void apic_reset(DeviceState *d)
@@ -174,12 +210,23 @@ static int apic_load_old(QEMUFile *f, void *opaque, int 
version_id)
 return 0;
 }
 
+static int apic_dispatch_post_load(void *opaque, int version_id)
+{
+APICState *s = opaque;
+
+if (s->backend->post_load) {
+s->backend->post_load(s);
+}
+return 0;
+}
+
 static const VMStateDescription vmstate_apic = {
 .name = "apic",
 .version_id = 3,
 .minimum_version_id = 3,
 .minimum_version_id_old = 1,
 .load_state_old = apic_load_old,
+.post_load = apic_dispatch_post_load,
 .fields  = (VMStateField[]) {
 VMSTATE_UINT32(apicbase, APICState),
 VMSTATE_UINT8(id, APICState),
@@ -199,7 +246,7 @@ static const VMStateDescription vmstate_apic = {
 VMSTATE_UINT32(initial_count, APICState),
 VMSTATE_INT64(initial_count_load_time, APICState),
 VMSTATE_INT64(next_time, APICState),
-VMSTATE_TIMER(timer, APICState),
+VMSTATE_INT64(timer_expiry, APICState), /* open-coded timer state */
 VMSTATE_END_OF_LIST()
 }
 };
diff --git a/hw/api

[Qemu-devel] [PATCH v3 14/16] kvm: x86: Add user space part for in-kernel i8259

2011-12-06 Thread Jan Kiszka

Introduce the alternative i8259 backend that exploits KVM in-kernel
acceleration.

The PIIX3 initialization code is furthermore extended by KVM specific
IRQ route setup. GSI injection differs in KVM mode from the user space
model. As we can dispatch ISA-range IRQs to both IOAPIC and PIC inside
the kernel, we do not need to inject them separately. This is reflected
by a KVM-specific GSI handler.

Signed-off-by: Jan Kiszka 
---
 Makefile.target |2 +-
 hw/kvm/i8259.c  |  126 +++
 hw/pc.h |1 +
 hw/pc_piix.c|   50 --
 4 files changed, 174 insertions(+), 5 deletions(-)
 create mode 100644 hw/kvm/i8259.c

diff --git a/Makefile.target b/Makefile.target
index 66b42d5..850b80f 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -231,7 +231,7 @@ obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/kvm/i8259.c b/hw/kvm/i8259.c
new file mode 100644
index 000..98d7141
--- /dev/null
+++ b/hw/kvm/i8259.c
@@ -0,0 +1,126 @@
+/*
+ * KVM in-kernel PIC (i8259) support
+ *
+ * Copyright (c) 2011 Siemens AG
+ *
+ * Authors:
+ *  Jan Kiszka  
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ */
+#include "hw/i8259_internal.h"
+#include "hw/apic_internal.h"
+#include "kvm.h"
+
+static void kvm_pic_get(PicState *s)
+{
+struct kvm_irqchip chip;
+struct kvm_pic_state *kpic;
+int ret;
+
+chip.chip_id = s->master ? KVM_IRQCHIP_PIC_MASTER : KVM_IRQCHIP_PIC_SLAVE;
+ret = kvm_vm_ioctl(kvm_state, KVM_GET_IRQCHIP, &chip);
+if (ret < 0) {
+fprintf(stderr, "KVM_GET_IRQCHIP failed: %s\n", strerror(-ret));
+abort();
+}
+
+kpic = &chip.chip.pic;
+
+s->last_irr = kpic->last_irr;
+s->irr = kpic->irr;
+s->imr = kpic->imr;
+s->isr = kpic->isr;
+s->priority_add = kpic->priority_add;
+s->irq_base = kpic->irq_base;
+s->read_reg_select = kpic->read_reg_select;
+s->poll = kpic->poll;
+s->special_mask = kpic->special_mask;
+s->init_state = kpic->init_state;
+s->auto_eoi = kpic->auto_eoi;
+s->rotate_on_auto_eoi = kpic->rotate_on_auto_eoi;
+s->special_fully_nested_mode = kpic->special_fully_nested_mode;
+s->init4 = kpic->init4;
+s->elcr = kpic->elcr;
+s->elcr_mask = kpic->elcr_mask;
+}
+
+static void kvm_pic_put(PicState *s)
+{
+struct kvm_irqchip chip;
+struct kvm_pic_state *kpic;
+int ret;
+
+chip.chip_id = s->master ? KVM_IRQCHIP_PIC_MASTER : KVM_IRQCHIP_PIC_SLAVE;
+
+kpic = &chip.chip.pic;
+
+kpic->last_irr = s->last_irr;
+kpic->irr = s->irr;
+kpic->imr = s->imr;
+kpic->isr = s->isr;
+kpic->priority_add = s->priority_add;
+kpic->irq_base = s->irq_base;
+kpic->read_reg_select = s->read_reg_select;
+kpic->poll = s->poll;
+kpic->special_mask = s->special_mask;
+kpic->init_state = s->init_state;
+kpic->auto_eoi = s->auto_eoi;
+kpic->rotate_on_auto_eoi = s->rotate_on_auto_eoi;
+kpic->special_fully_nested_mode = s->special_fully_nested_mode;
+kpic->init4 = s->init4;
+kpic->elcr = s->elcr;
+kpic->elcr_mask = s->elcr_mask;
+
+ret = kvm_vm_ioctl(kvm_state, KVM_SET_IRQCHIP, &chip);
+if (ret < 0) {
+fprintf(stderr, "KVM_SET_IRQCHIP failed: %s\n", strerror(-ret));
+abort();
+}
+}
+
+static void kvm_pic_reset(PicState *s)
+{
+pic_reset_internal(s);
+s->elcr = 0;
+
+kvm_pic_put(s);
+}
+
+static void kvm_pic_set_irq(void *opaque, int irq, int level)
+{
+int delivered;
+
+delivered = kvm_irqchip_set_irq(kvm_state, irq, level);
+apic_set_irq_delivered(delivered);
+}
+
+static void kvm_pic_backend_init(PicState *s)
+{
+memory_region_init_reservation(&s->base_io, "kvm-pic", 2);
+memory_region_init_reservation(&s->elcr_io, "kvm-elcr", 1);
+}
+
+qemu_irq *kvm_i8259_init(void)
+{
+i8259_init_chip(true, "KVM");
+i8259_init_chip(false, "KVM");
+
+return qemu_allocate_irqs(kvm_pic_set_irq, NULL, ISA_NUM_IRQS);
+}
+
+static PICBackend kvm_pic_backend = {
+.name = "KVM",
+.init = kvm_pic_backend_init,
+.reset = kvm_pic_reset,
+.pre_save = kvm_pic_get,
+.post_load = kvm_pic_put,
+};
+
+static void kvm_pic_register(void)
+{
+pic_register_backend(&kvm_pic_backend);
+}
+
+device_init(kvm_pic_register)
diff --git a/hw/pc.h b/hw/pc.h
index b8ad9a3..d8e7313 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -63,6 +63,7 @@ bool parallel_mm_init(target_phys_addr_t base, int it_shift, 
qemu_irq irq,
 typedef struct PicState PicState;
 extern PicState *isa_pic;
 qemu_irq *i8259_init(qemu_irq parent_irq);
+qemu_irq *kvm_i8259_init(void);

[Qemu-devel] [PATCH v3 11/16] kvm: Introduce core services for in-kernel irqchip support

2011-12-06 Thread Jan Kiszka

Add the basic infrastructure to activate in-kernel irqchip support, inject
interrupts into these models, and maintain IRQ routes.

Routing is optional and depends on the host arch supporting
KVM_CAP_IRQ_ROUTING. When it's not available on x86, we lose the HPET as
we can't route GSI0 to IOAPIC pin 2.

In-kernel irqchip support will eventually be controlled by the machine
property 'kernel_irqchip', but this is not yet wired up.
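
The route bookkeeping relies on a bitmap of used GSIs. The following is a minimal standalone sketch of such a bitmap, sized by rounding the GSI count up to a multiple of 32 and with the padding bits marked as used. DemoGsiMap and the demo_gsi_* helpers are hypothetical names, and calloc stands in for the glib allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bitmap of used GSIs; max_gsi is the number of bits. */
typedef struct DemoGsiMap {
    uint32_t *bitmap;
    unsigned int max_gsi;
} DemoGsiMap;

void demo_gsi_set(DemoGsiMap *m, unsigned int gsi)
{
    assert(gsi < m->max_gsi);
    m->bitmap[gsi / 32] |= 1U << (gsi % 32);
}

int demo_gsi_is_used(const DemoGsiMap *m, unsigned int gsi)
{
    assert(gsi < m->max_gsi);
    return (m->bitmap[gsi / 32] >> (gsi % 32)) & 1U;
}

/* Returns 0 on success, -1 on allocation failure or gsi_count == 0. */
int demo_gsi_map_init(DemoGsiMap *m, unsigned int gsi_count)
{
    unsigned int bits, i;

    if (gsi_count == 0) {
        return -1;
    }
    /* Round up to whole 32-bit words so the map can be scanned per word. */
    bits = (gsi_count + 31) & ~31U;
    m->bitmap = calloc(bits / 32, sizeof(uint32_t));
    if (!m->bitmap) {
        return -1;
    }
    m->max_gsi = bits;

    /* Padding bits beyond gsi_count must never be handed out. */
    for (i = gsi_count; i < bits; i++) {
        demo_gsi_set(m, i);
    }
    return 0;
}
```

With 40 GSIs, for instance, the map holds 64 bits, of which bits 40..63 are pre-marked as used.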

Signed-off-by: Jan Kiszka 
---
 kvm-all.c         |  149 +
 kvm.h             |    8 +++
 target-i386/kvm.c |   11 
 3 files changed, 168 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index e7faf5c..a85e14f 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -76,6 +76,13 @@ struct KVMState
 int pit_in_kernel;
 int xsave, xcrs;
 int many_ioeventfds;
+int irqchip_inject_ioctl;
+#ifdef KVM_CAP_IRQ_ROUTING
+struct kvm_irq_routing *irq_routes;
+int nr_allocated_irq_routes;
+uint32_t *used_gsi_bitmap;
+unsigned int max_gsi;
+#endif
 };
 
 KVMState *kvm_state;
@@ -692,6 +699,138 @@ static void kvm_handle_interrupt(CPUState *env, int mask)
 }
 }
 
+int kvm_irqchip_set_irq(KVMState *s, int irq, int level)
+{
+struct kvm_irq_level event;
+int ret;
+
+assert(s->irqchip_in_kernel);
+
+event.level = level;
+event.irq = irq;
+ret = kvm_vm_ioctl(s, s->irqchip_inject_ioctl, &event);
+if (ret < 0) {
+perror("kvm_set_irqchip_line");
+abort();
+}
+
+return (s->irqchip_inject_ioctl == KVM_IRQ_LINE) ? 1 : event.status;
+}
+
+#ifdef KVM_CAP_IRQ_ROUTING
+static void set_gsi(KVMState *s, unsigned int gsi)
+{
+assert(gsi < s->max_gsi);
+
+s->used_gsi_bitmap[gsi / 32] |= 1U << (gsi % 32);
+}
+
+static void kvm_init_irq_routing(KVMState *s)
+{
+int gsi_count;
+
+gsi_count = kvm_check_extension(s, KVM_CAP_IRQ_ROUTING);
+if (gsi_count > 0) {
+unsigned int gsi_bits, i;
+
+/* Round up so we can search ints using ffs */
+gsi_bits = (gsi_count + 31) & ~31U;
+s->used_gsi_bitmap = g_malloc0(gsi_bits / 8);
+s->max_gsi = gsi_bits;
+
+/* Mark any over-allocated bits as already in use */
+for (i = gsi_count; i < gsi_bits; i++) {
+set_gsi(s, i);
+}
+}
+
+s->irq_routes = g_malloc0(sizeof(*s->irq_routes));
+s->nr_allocated_irq_routes = 0;
+
+kvm_arch_init_irq_routing(s);
+}
+
+static void kvm_add_routing_entry(KVMState *s,
+  struct kvm_irq_routing_entry *entry)
+{
+struct kvm_irq_routing_entry *new;
+int n, size;
+
+if (s->irq_routes->nr == s->nr_allocated_irq_routes) {
+n = s->nr_allocated_irq_routes * 2;
+if (n < 64) {
+n = 64;
+}
+size = sizeof(struct kvm_irq_routing);
+size += n * sizeof(*new);
+s->irq_routes = g_realloc(s->irq_routes, size);
+s->nr_allocated_irq_routes = n;
+}
+n = s->irq_routes->nr++;
+new = &s->irq_routes->entries[n];
+memset(new, 0, sizeof(*new));
+new->gsi = entry->gsi;
+new->type = entry->type;
+new->flags = entry->flags;
+new->u = entry->u;
+
+set_gsi(s, entry->gsi);
+}
+
+void kvm_irqchip_add_route(KVMState *s, int irq, int irqchip, int pin)
+{
+struct kvm_irq_routing_entry e;
+
+e.gsi = irq;
+e.type = KVM_IRQ_ROUTING_IRQCHIP;
+e.flags = 0;
+e.u.irqchip.irqchip = irqchip;
+e.u.irqchip.pin = pin;
+kvm_add_routing_entry(s, &e);
+}
+
+int kvm_irqchip_commit_routes(KVMState *s)
+{
+s->irq_routes->flags = 0;
+return kvm_vm_ioctl(s, KVM_SET_GSI_ROUTING, s->irq_routes);
+}
+
+#else /* !KVM_CAP_IRQ_ROUTING */
+
+static void kvm_init_irq_routing(KVMState *s)
+{
+}
+#endif /* !KVM_CAP_IRQ_ROUTING */
+
+static int kvm_irqchip_create(KVMState *s)
+{
+QemuOptsList *list = qemu_find_opts("machine");
+int ret;
+
+if (QTAILQ_EMPTY(&list->head) ||
+!qemu_opt_get_bool(QTAILQ_FIRST(&list->head),
+   "kernel_irqchip", false) ||
+!kvm_check_extension(s, KVM_CAP_IRQCHIP)) {
+return 0;
+}
+
+ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP);
+if (ret < 0) {
+fprintf(stderr, "Create kernel irqchip failed\n");
+return ret;
+}
+
+s->irqchip_inject_ioctl = KVM_IRQ_LINE;
+if (kvm_check_extension(s, KVM_CAP_IRQ_INJECT_STATUS)) {
+s->irqchip_inject_ioctl = KVM_IRQ_LINE_STATUS;
+}
+s->irqchip_in_kernel = 1;
+
+kvm_init_irq_routing(s);
+
+return 0;
+}
+
 int kvm_init(void)
 {
 static const char upgrade_note[] =
@@ -786,6 +925,11 @@ int kvm_init(void)
 goto err;
 }
 
+ret = kvm_irqchip_create(s);
+if (ret < 0) {
+goto err;
+}
+
 kvm_state = s;
 cpu_register_phys_memory_client(&kvm_cpu_phys_memory_client);
 
@@ -,6 +1255,11 @@ int kvm_has_many_ioeventfds(void)
 return kvm_state->many_ioeventfds;
 }
 
+int kvm_has_gsi_routing(void)
+

[Qemu-devel] [PATCH v3 07/16] ioapic: Convert to memory API

2011-12-06 Thread Jan Kiszka

This maintains the old imprecise access size handling.
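
For readers unfamiliar with the conversion pattern, here is a mock of what "imprecise access size handling" amounts to: one read and one write callback receive the access size but treat every width the same, which matches the old tables that listed the 32-bit handler three times. The DemoRegionOps type and the register layout (select register at 0x00, data window at 0x10) are simplified assumptions for illustration, not the QEMU memory API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mock of a region-ops table; not QEMU's MemoryRegionOps. */
typedef struct DemoRegionOps {
    uint64_t (*read)(void *opaque, uint64_t addr, unsigned int size);
    void (*write)(void *opaque, uint64_t addr, uint64_t val,
                  unsigned int size);
} DemoRegionOps;

typedef struct DemoIoapic {
    uint8_t ioregsel;       /* index selected via the register at 0x00 */
    uint32_t regs[4];       /* data accessed via the window at 0x10 */
} DemoIoapic;

static uint64_t demo_ioapic_read(void *opaque, uint64_t addr,
                                 unsigned int size)
{
    DemoIoapic *s = opaque;

    (void)size;             /* imprecise: all widths handled alike */
    switch (addr & 0xff) {
    case 0x00:
        return s->ioregsel;
    case 0x10:
        return s->regs[s->ioregsel & 3];
    }
    return 0;
}

static void demo_ioapic_write(void *opaque, uint64_t addr, uint64_t val,
                              unsigned int size)
{
    DemoIoapic *s = opaque;

    (void)size;             /* imprecise: all widths handled alike */
    switch (addr & 0xff) {
    case 0x00:
        s->ioregsel = (uint8_t)val;
        break;
    case 0x10:
        s->regs[s->ioregsel & 3] = (uint32_t)val;
        break;
    }
}

const DemoRegionOps demo_ioapic_ops = {
    demo_ioapic_read,
    demo_ioapic_write,
};
```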

Signed-off-by: Jan Kiszka 
---
 hw/ioapic.c |   28 +++-
 1 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/hw/ioapic.c b/hw/ioapic.c
index 61991d7..56b1612 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -86,6 +86,7 @@ typedef struct IOAPICState IOAPICState;
 
 struct IOAPICState {
 SysBusDevice busdev;
+MemoryRegion io_memory;
 uint8_t id;
 uint8_t ioregsel;
 uint32_t irr;
@@ -195,7 +196,8 @@ void ioapic_eoi_broadcast(int vector)
 }
 }
 
-static uint32_t ioapic_mem_readl(void *opaque, target_phys_addr_t addr)
+static uint64_t
+ioapic_mem_read(void *opaque, target_phys_addr_t addr, unsigned int size)
 {
 IOAPICState *s = opaque;
 int index;
@@ -234,7 +236,8 @@ static uint32_t ioapic_mem_readl(void *opaque, 
target_phys_addr_t addr)
 }
 
 static void
-ioapic_mem_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
+ioapic_mem_write(void *opaque, target_phys_addr_t addr, uint64_t val,
+ unsigned int size)
 {
 IOAPICState *s = opaque;
 int index;
@@ -309,32 +312,23 @@ static void ioapic_reset(DeviceState *d)
 }
 }
 
-static CPUReadMemoryFunc * const ioapic_mem_read[3] = {
-ioapic_mem_readl,
-ioapic_mem_readl,
-ioapic_mem_readl,
-};
-
-static CPUWriteMemoryFunc * const ioapic_mem_write[3] = {
-ioapic_mem_writel,
-ioapic_mem_writel,
-ioapic_mem_writel,
+static const MemoryRegionOps ioapic_io_ops = {
+.read = ioapic_mem_read,
+.write = ioapic_mem_write,
+.endianness = DEVICE_NATIVE_ENDIAN,
 };
 
 static int ioapic_init1(SysBusDevice *dev)
 {
 IOAPICState *s = FROM_SYSBUS(IOAPICState, dev);
-int io_memory;
 static int ioapic_no;
 
 if (ioapic_no >= MAX_IOAPICS) {
 return -1;
 }
 
-io_memory = cpu_register_io_memory(ioapic_mem_read,
-   ioapic_mem_write, s,
-   DEVICE_NATIVE_ENDIAN);
-sysbus_init_mmio(dev, 0x1000, io_memory);
+memory_region_init_io(&s->io_memory, &ioapic_io_ops, s, "ioapic", 0x1000);
+sysbus_init_mmio_region(dev, &s->io_memory);
 
 qdev_init_gpio_in(&dev->qdev, ioapic_set_irq, IOAPIC_NUM_PINS);
 
-- 
1.7.3.4

Re: [Qemu-devel] [PATCH v3 00/16] uq/master: Introduce basic irqchip support

2011-12-06 Thread Jan Kiszka

On 2011-12-06 14:55, Avi Kivity wrote:
> On 12/06/2011 02:58 PM, Jan Kiszka wrote:
>> In this revision, I'm now trying the approach of backend/frontend
>> split-ups for the affected IRQ chips. That means we keep a single qdev
>> device description but fork off specific logic early during device init.
>> The backends support this by providing hooks that user space and KVM
>> models can implement differently.
>>
>> The result is slightly larger and comes with the not really beautiful
>> ioapic.kvm_gsi_base property but should otherwise meet expectations.
>>
>> Comments?
> 
> Looks good to me, much nicer than the previous approaches.  I'll wait a
> bit for more reviews though.
> 
>> PS: Series is still against old uq/master, therefore containing patches
>> that took/will take different routes.
> 
> I just pushed a rebased uq/master.  In the future, either ping me or
> just base on upstream (which uq/master supposedly tracks).

Requires minor rebasing. Will wait for comments before reposting.

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH 0/2] [RFC] qemu-ga: add support for guest command execution

2011-12-06 Thread Michael Roth

The code is still in rough shape, but while we're on the topic of guest agents
I wanted to put out a working example of how exec functionality can be added
to qemu-ga to provide a mechanism for building arbitrarily high-level
interfaces.

The hope is that by allowing qemu-ga to execute commands in the guest, paired
with file read/write access, we can instrument a guest "on the fly" to support
any type of hypervisor functionality, and do so without dramatically enlarging
the role qemu-ga plays as a small, QEMU-specific agent that is tightly
integrated with QEMU/QMP/libvirt.

These patches add the following interfaces:

guest-file-open-pipe
guest-exec
guest-exec-status

The guest-file-open-pipe interface is analogous to the existing guest-file-open
interface (it might be best to roll it into it actually): it returns a handle
that can be handled via the existing guest-file-{read,write,flush,close}
interface. Internally it creates a FIFO pair that we can use to associate
handles to the stdin/stdout/stderr of a guest-exec spawned process. We can also
use them to redirect output into other processes, giving us the basic
tools to build a basic shell (or a full-blown one if we add TTY support) using
a single qemu-ga.
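
To illustrate the underlying idea of tying a pipe to a spawned process's stdout, here is a rough POSIX sketch. It uses pipe/fork/dup2/execvp, which is an assumption about how one would typically do this and not necessarily how the patches implement it; demo_spawn_with_stdout and demo_read_all are hypothetical helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Spawn argv[0] with its stdout connected to a pipe. Returns the child
 * pid and stores the read end of the pipe in *out_fd, or returns -1. */
pid_t demo_spawn_with_stdout(char *const argv[], int *out_fd)
{
    int fds[2];
    pid_t pid;

    if (pipe(fds) == -1) {
        return -1;
    }
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: the write end becomes stdout, then exec the command. */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1) {
            _exit(127);
        }
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);         /* only reached if exec failed */
    }
    /* Parent: keep only the read end. */
    close(fds[1]);
    *out_fd = fds[0];
    return pid;
}

/* Read until EOF or until the buffer is full; returns bytes read or -1. */
ssize_t demo_read_all(int fd, char *buf, size_t len)
{
    size_t total = 0;

    while (total < len) {
        ssize_t n = read(fd, buf + total, len - total);
        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -1;
        }
        if (n == 0) {
            break;          /* EOF: child closed its stdout */
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

The parent must close its copy of the write end, otherwise the reader never sees EOF after the child exits.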

Theoretically we can even deploy other agents, including session-level agents,
and communicate with them via these same handles. Thus, ovirt could deploy and
run an agent via qemu-ga, Spice could deploy vdagent, etc. Since the interface
is somewhat tedious, I'm working on a wrapper script to try out some of
these scenarios, but a basic use case using the raw QMP interface is included
below.

Any thoughts/comments on this approach are appreciated.

EXAMPLE USAGE (execute `top -b -n1`):

{'execute': 'guest-file-open-pipe'}
{'return': 6}

{'execute': 'guest-exec',\
 'arguments': {'detach': True,   \
   'handle_stdout': 6,   \
   'params': [{'param': '-b'},   \
  {'param': '-n1'}], \
   'path': 'top'}}
{'return': {'exit-code': 0,  \
'exited': False, \
'handle_stderr': -1, \
'handle_stdin': -1,  \
'handle_stdout': 6,  \
'pid': 14267}}

{'execute': 'guest-file-read',   \
 'arguments': {'count': 65536,   \
   'handle': 6}}
{'return': {'buf-b64': '',   \
'count': 0,  \
'eof': False}}

{'execute': 'guest-file-read',   \
 'arguments': {'count': 65536,   \
   'handle': 6}}
{'return': {'buf-b64': 'dG9wIC0gMjI6N...',   \
'count': 11064,  \
'eof': True}}

/*
top - 22:41:49 up 1 day,  4:30,  3 users,  load average: 0.00, 0.00, 0.00
Tasks: 114 total,   1 running, 113 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.2%us,  0.2%sy,  0.0%ni, 99.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:504848k total,   445664k used,59184k free,49100k buffers
Swap:   323580k total,  224k used,   323356k free,   256392k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
14267 root  20   0 19272 1248  924 R2  0.2   0:00.02 top
1 root  20   0 24008 2048 1280 S0  0.4   0:00.85 init
2 root  20   0 000 S0  0.0   0:00.30 kthreadd
3 root  20   0 000 S0  0.0   0:01.09 ksoftirqd/0
...
*/

{'execute': 'guest-exec-status', \
 'arguments': {'pid': 14267}}
{'return': {'exit-code': 0,  \
'exited': True,  \
'handle_stderr': -1, \
'handle_stdin': -1,  \
'handle_stdout': 6,  \
'pid': 14267}}

{'execute': 'guest-file-close',  \
 'arguments': {'handle': 6}}
{'return': {}}
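
The first guest-file-read above returns a count of 0 without eof, presumably because the handle is non-blocking and the process had not produced output yet. A small sketch of that behaviour on a plain POSIX pipe follows; demo_set_nonblock and demo_read_nowait are hypothetical helpers, and the eof reporting is simplified compared to the agent (which, as the example shows, can report eof together with the final data).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Set or clear O_NONBLOCK on fd; returns 0 on success, -1 on error. */
int demo_set_nonblock(int fd, int enable)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1) {
        return -1;
    }
    flags = enable ? (flags | O_NONBLOCK) : (flags & ~O_NONBLOCK);
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}

/* Returns the number of bytes read. A return of 0 means "no data yet"
 * when *eof is 0 and end of stream when *eof is 1; -1 signals an error. */
ssize_t demo_read_nowait(int fd, void *buf, size_t len, int *eof)
{
    ssize_t n = read(fd, buf, len);

    *eof = 0;
    if (n == -1) {
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return 0;       /* nothing available, writer still open */
        }
        return -1;
    }
    if (n == 0) {
        *eof = 1;           /* all writers closed the pipe */
    }
    return n;
}
```

A client therefore has to poll: keep reading until eof is reported, treating empty reads as "try again later".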

Michael Roth (2):
  guest agent: add guest-file-open-pipe
  guest agent: add guest-exec and guest-exec-status interfaces

 qapi-schema-guest.json |   79 +++-
 qga/guest-agent-commands.c |  478 +---
 2 files changed, 531 insertions(+), 26 deletions(-)

-- 
1.7.4.1

[Qemu-devel] [PATCH 2/2] guest agent: add guest-exec and guest-exec-status interfaces

2011-12-06 Thread Michael Roth

Interfaces to execute/manage processes in the guest. Child process'
stdin/stdout/stderr can be associated with handles for communication
via read/write interfaces.
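
For illustration, the poll-or-wait semantics of a status query can be sketched with waitpid() as below. DemoExecStatus and demo_exec_status are hypothetical, reduced versions of what the patch adds; mapping abnormal termination to an exit code of -1 is an assumption made for this sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical reduced status record. */
typedef struct DemoExecStatus {
    pid_t pid;
    bool exited;
    int exit_code;          /* valid only if exited; -1 if killed by a signal */
} DemoExecStatus;

/* Poll once (wait == false) or block until the child exits (wait == true).
 * Returns 0 on success, -1 if pid is not a waitable child. On exit the
 * child is reaped, so a later query for the same pid fails. */
int demo_exec_status(pid_t pid, bool wait, DemoExecStatus *st)
{
    int status;
    pid_t ret = waitpid(pid, &status, wait ? 0 : WNOHANG);

    if (ret == -1) {
        return -1;
    }
    st->pid = pid;
    st->exited = false;
    st->exit_code = 0;
    if (ret == pid) {
        st->exited = true;
        st->exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
    return 0;
}
```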

Signed-off-by: Michael Roth 
---
 qapi-schema-guest.json |   55 
 qga/guest-agent-commands.c |  299 
 2 files changed, 354 insertions(+), 0 deletions(-)

diff --git a/qapi-schema-guest.json b/qapi-schema-guest.json
index 4c9f063..7bf3086 100644
--- a/qapi-schema-guest.json
+++ b/qapi-schema-guest.json
@@ -237,3 +237,58 @@
 ##
 { 'command': 'guest-fsfreeze-thaw',
   'returns': 'int' }
+
+##
+# @guest-exec-status
+#
+# Check status of process associated with PID retrieved via guest-exec.
+# Reap the process and associated metadata if it has exited.
+#
+# @pid: pid returned from guest-exec
+# @wait: #optional whether to wait for completion or simply poll once.
+# Waiting is disallowed if a stdin/stdout/stderr handle was supplied
+# to guest-exec
+#
+# Returns: GuestExecStatus on success
+#
+{ 'type': 'GuestExecStatus',
+  'data': { 'pid': 'int', 'exited': 'bool', 'exit-code': 'int',
+'handle_stdin': 'int', 'handle_stdout': 'int',
+'handle_stderr': 'int' } }
+{ 'command': 'guest-exec-status',
+  'data':{ 'pid': 'int', '*wait': 'bool' },
+  'returns': 'GuestExecStatus' }
+
+##
+# @guest-exec:
+#
+# Execute a command in the guest
+#
+# If a pipe is associated with the resulting process, the
+# read/write/write sides of the process' stdin/stdout/stderr will
+# be transferred automatically, so no need to close them from the
+# client. If no handle is passed in for stdin/stdout/stderr, they
+# will be closed before executing the command.
+#
+# @path: path or executable name to execute
+# @params: #optional parameter list to pass to executable
+# @handle_stdin: #optional handle to associate with process' stdin.
+# @handle_stdout: #optional handle to associate with process' stdout
+# @handle_stderr: #optional handle to associate with process' stderr
+# @detach: #optional whether to detach the process or execute it
+# synchronously. If synchronous, passing of stdin/stdout/stderr handles
+# will be disallowed, since process can block on closing them and cause
+# a deadlock in the guest agent.
+#
+# Returns: GuestExecStatus on success.
+#
+# Since: 1.0.50
+##
+{ 'type': 'GuestExecParam',
+  'data': { 'param': 'str' } }
+{ 'command': 'guest-exec',
+  'data':{ 'path': 'str', '*params': ['GuestExecParam'],
+   '*handle_stdin': 'int', '*handle_stdout': 'int',
+   '*handle_stderr': 'int',
+   '*detach': 'bool' },
+  'returns': 'GuestExecStatus' }
diff --git a/qga/guest-agent-commands.c b/qga/guest-agent-commands.c
index ae77ee4..11f9d00 100644
--- a/qga/guest-agent-commands.c
+++ b/qga/guest-agent-commands.c
@@ -450,6 +450,304 @@ static void guest_file_init(void)
 QTAILQ_INIT(&guest_file_state.filehandles);
 }
 
+typedef struct GuestExecInfo {
+pid_t pid;
+char **params;
+GuestFileHandle *gfh_stdin;
+GuestFileHandle *gfh_stdout;
+GuestFileHandle *gfh_stderr;
+QTAILQ_ENTRY(GuestExecInfo) next;
+} GuestExecInfo;
+
+static struct {
+QTAILQ_HEAD(, GuestExecInfo) processes;
+} guest_exec_state;
+
+static void guest_exec_info_add(pid_t pid, char **params,
+GuestFileHandle *in, GuestFileHandle *out,
+GuestFileHandle *error, Error **err)
+{
+GuestExecInfo *gei;
+
+gei = g_malloc0(sizeof(*gei));
+gei->pid = pid;
+gei->params = params;
+gei->gfh_stdin = in;
+gei->gfh_stdout = out;
+gei->gfh_stderr = error;
+QTAILQ_INSERT_TAIL(&guest_exec_state.processes, gei, next);
+}
+
+static GuestExecInfo *guest_exec_info_find(pid_t pid)
+{
+GuestExecInfo *gei;
+
+QTAILQ_FOREACH(gei, &guest_exec_state.processes, next)
+{
+if (gei->pid == pid) {
+return gei;
+}
+}
+
+return NULL;
+}
+
+#include 
+#include 
+GuestExecStatus *qmp_guest_exec_status(int64_t pid, bool has_wait,
+   bool wait, Error **err)
+{
+GuestExecInfo *gei;
+GuestExecStatus *ges;
+int status, ret;
+char **ptr;
+
+slog("guest-exec-status called");
+
+gei = guest_exec_info_find(pid);
+if (gei == NULL) {
+error_set(err, QERR_INVALID_PARAMETER, "pid");
+return NULL;
+}
+
+ret = waitpid(gei->pid, &status, WNOHANG);
+if (ret == -1) {
+error_set(err, QERR_UNDEFINED_ERROR);
+return NULL;
+}
+
+ges = g_malloc0(sizeof(*ges));
+ges->handle_stdin = gei->gfh_stdin ? gei->gfh_stdin->id : -1;
+ges->handle_stdout = gei->gfh_stdout ? gei->gfh_stdout->id : -1;
+ges->handle_stderr = gei->gfh_stderr ? gei->gfh_stderr->id : -1;
+ges->pid = gei->pid;
+if (ret == 0) {
+ges->exited = false;
+} else {
+ges->exited = true;
+/* reap child info once user has successfully wait()'d */
+

[Qemu-devel] [PATCH 1/2] guest agent: add guest-file-open-pipe

2011-12-06 Thread Michael Roth

Creates a FIFO pair that can be used with existing file read/write
interfaces to communicate with processes spawned via the forthcoming
guest-exec interface.

Signed-off-by: Michael Roth 
---
 qapi-schema-guest.json |   24 ++-
 qga/guest-agent-commands.c |  179 +--
 2 files changed, 177 insertions(+), 26 deletions(-)

diff --git a/qapi-schema-guest.json b/qapi-schema-guest.json
index fde5971..4c9f063 100644
--- a/qapi-schema-guest.json
+++ b/qapi-schema-guest.json
@@ -80,18 +80,40 @@
   'returns': 'int' }
 
 ##
+# @guest-file-open-pipe
+#
+# Open a pipe in the guest to associate with a qga-spawned process
+# for communication.
+#
+# Returns: Guest file handle on success, as per guest-file-open. This
+# handle is useable with the same interfaces as a handle returned by
+# guest-file-open.
+#
+# Since: 1.0.50
+##
+{ 'command': 'guest-file-open-pipe',
+  'returns': 'int' }
+
+##
 # @guest-file-close:
 #
 # Close an open file in the guest
 #
 # @handle: filehandle returned by guest-file-open
+# @pipe-end: #optional GuestFilePipeEnd value ("rw"/"w"/"r") to specify
+# which end of the pipe to close. Please note that closing the write
+# side of a pipe will block until the read side is closed. If you've
+# passed the read-side of the pipe to a qga-spawned process, make sure
+# the process has exited before attempting to close the write side.
 #
 # Returns: Nothing on success.
 #
 # Since: 0.15.0
 ##
+{ 'enum': 'GuestFilePipeEnd',
+  'data': [ 'r', 'w', 'rw' ] }
 { 'command': 'guest-file-close',
-  'data': { 'handle': 'int' } }
+  'data': { 'handle': 'int', '*pipe-end': 'GuestFilePipeEnd' } }
 
 ##
 # @guest-file-read:
diff --git a/qga/guest-agent-commands.c b/qga/guest-agent-commands.c
index 6da9904..ae77ee4 100644
--- a/qga/guest-agent-commands.c
+++ b/qga/guest-agent-commands.c
@@ -44,6 +44,34 @@ static void slog(const char *fmt, ...)
 va_end(ap);
 }
 
+static void toggle_flags(int fd, long flags, bool set, Error **err)
+{
+int ret, old_flags;
+
+old_flags = fcntl(fd, F_GETFL);
+if (old_flags == -1) {
+error_set(err, QERR_QGA_COMMAND_FAILED,
+  "failed to fetch filehandle flags");
+return;
+}
+ret = fcntl(fd, F_SETFL, set ? old_flags | flags : old_flags & ~flags);
+if (ret == -1) {
+error_set(err, QERR_QGA_COMMAND_FAILED,
+  "failed to set filehandle flags");
+return;
+}
+}
+
+static void ftoggle_flags(FILE *fh, long flags, bool set, Error **err)
+{
+int fd;
+if (!fh || (fd = fileno(fh)) == -1) {
+error_set(err, QERR_QGA_COMMAND_FAILED, "invalid filehandle");
+return;
+}
+toggle_flags(fd, flags, set, err);
+}
+
 int64_t qmp_guest_sync(int64_t id, Error **errp)
 {
 return id;
@@ -102,7 +130,14 @@ void qmp_guest_shutdown(bool has_mode, const char *mode, 
Error **err)
 
 typedef struct GuestFileHandle {
 uint64_t id;
-FILE *fh;
+bool is_pipe;
+union {
+FILE *fh;
+struct {
+FILE *in;
+FILE *out;
+} pipe;
+} stream;
 QTAILQ_ENTRY(GuestFileHandle) next;
 } GuestFileHandle;
 
@@ -110,14 +145,31 @@ static struct {
 QTAILQ_HEAD(, GuestFileHandle) filehandles;
 } guest_file_state;
 
-static void guest_file_handle_add(FILE *fh)
+static uint64_t guest_file_handle_add(FILE *fh)
 {
 GuestFileHandle *gfh;
 
 gfh = g_malloc0(sizeof(GuestFileHandle));
 gfh->id = fileno(fh);
-gfh->fh = fh;
+gfh->is_pipe = false;
+gfh->stream.fh = fh;
+
+QTAILQ_INSERT_TAIL(&guest_file_state.filehandles, gfh, next);
+return gfh->id;
+}
+
+static uint64_t guest_file_handle_add_pipe(FILE *in, FILE *out)
+{
+GuestFileHandle *gfh;
+
+gfh = g_malloc0(sizeof(GuestFileHandle));
+gfh->id = fileno(in);
+gfh->is_pipe = true;
+gfh->stream.pipe.in = in;
+gfh->stream.pipe.out = out;
+
 QTAILQ_INSERT_TAIL(&guest_file_state.filehandles, gfh, next);
+return gfh->id;
 }
 
 static GuestFileHandle *guest_file_handle_find(int64_t id)
@@ -137,7 +189,6 @@ static GuestFileHandle *guest_file_handle_find(int64_t id)
 int64_t qmp_guest_file_open(const char *path, bool has_mode, const char *mode, 
Error **err)
 {
 FILE *fh;
-int fd;
 int64_t ret = -1;
 
 if (!has_mode) {
@@ -153,39 +204,112 @@ int64_t qmp_guest_file_open(const char *path, bool 
has_mode, const char *mode, E
 /* set fd non-blocking to avoid common use cases (like reading from a
  * named pipe) from hanging the agent
  */
-fd = fileno(fh);
-ret = fcntl(fd, F_GETFL);
-ret = fcntl(fd, F_SETFL, ret | O_NONBLOCK);
-if (ret == -1) {
-error_set(err, QERR_QGA_COMMAND_FAILED, "fcntl() failed");
+ftoggle_flags(fh, O_NONBLOCK, true, err);
+if (error_is_set(err)) {
 fclose(fh);
 return -1;
 }
 
-guest_file_handle_add(fh);
-slog("guest-file-open, handle: %d", fd);
-return fd;
+ret = guest_file_handle_ad

Re: [Qemu-devel] [PATCH 0/2] [RFC] qemu-ga: add support for guest command execution

2011-12-06 Thread Daniel P. Berrange

On Tue, Dec 06, 2011 at 08:34:06AM -0600, Michael Roth wrote:
> The code is still in rough shape, but while we're on the topic of guest agents
> I wanted to put out a working example of how exec functionality can be added
> to qemu-ga to provide a mechanism for building arbitrarily high-level
> interfaces.
> 
> The hope is that by allowing qemu-ga to execute commands in the guest, paired
> with file read/write access, we can instrument a guest "on the fly" to support
> any type of hypervisor functionality, and do so without dramatically enlarging
> the role qemu-ga plays as a small, QEMU-specific agent that is tightly
> integrated with QEMU/QMP/libvirt.
> 
> These patches add the following interfaces:
> 
> guest-file-open-pipe
> guest-exec
> guest-exec-status
> 
> The guest-file-open-pipe interface is analogous to the existing guest-file-open
> interface (it might be best to roll it into it actually): it returns a handle
> that can be handled via the existing guest-file-{read,write,flush,close}
> interface. Internally it creates a FIFO pair that we can use to associate
> handles to the stdin/stdout/stderr of a guest-exec spawned process. We can also
> use them to redirect output into other processes, giving us the basic
> tools to build a basic shell (or a full-blown one if we add TTY support) using
> a single qemu-ga.
> 
> Theoretically we can even deploy other agents, including session-level agents,
> and communicate with them via these same handles. Thus, ovirt could deploy and
> run an agent via qemu-ga, Spice could deploy vdagent, etc. Since the interface
> is somewhat tedious, I'm working on a wrapper script to try out some of
> these scenarios, but a basic use case using the raw QMP interface is included
> below.
> 
> Any thoughts/comments on this approach are appreciated.
> 
> EXAMPLE USAGE (execute `top -b -n1`):
> 
> {'execute': 'guest-file-open-pipe'}
> {'return': 6}
> 
> {'execute': 'guest-exec',\
>  'arguments': {'detach': True,   \
>'handle_stdout': 6,   \
>'params': [{'param': '-b'},   \
>   {'param': '-n1'}], \
>'path': 'top'}}

This feels like a rather verbose way of specifying
the ARGV. Why not just allow

  {'execute': 'guest-exec',\
   'arguments': {'detach': True,   \
 'handle_stdout': 6,   \
 'params': ['-b', '-n1'],  \
 'path': 'top'}}

Or even


  {'execute': 'guest-exec',\
   'arguments': {'detach': True,   \
 'handle_stdout': 6,   \
 'argv': ['top', '-b', '-n1']}}

and just use the first element of argv as the binary to
execute. Also you might need to set env variables for
some tools, so we'd want

  {'execute': 'guest-exec',\
   'arguments': {'detach': True,   \
 'handle_stdout': 6,   \
 'argv': ['top', '-b', '-n1'], \
 'env' : ['TMPDIR=/wibble']}}

and perhaps also you might want to run as a non-root
user, so allow a username/groupname ?

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
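
For illustration only (this is not code from the patch series): the argv-plus-env form suggested above lines up with the POSIX exec family, where the first element of argv names the binary to run. A minimal sketch in C follows. The helper name spawn_argv is hypothetical, and the sketch leaves out the handle redirection and the user/group switching raised in the thread.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for putenv() under strict -std modes */
#endif
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (looked up via PATH) with the extra
 * "NAME=value" entries in env added to the child's environment.
 * Returns the child's exit status, or -1 on failure. */
static int spawn_argv(char *const argv[], char *const env[])
{
    pid_t pid;
    int status;
    size_t i;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: extend the environment, then replace the process image. */
        for (i = 0; env != NULL && env[i] != NULL; i++) {
            putenv(env[i]);
        }
        execvp(argv[0], argv);
        _exit(127);             /* only reached if exec failed */
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this shape, a request carrying 'argv': ['top', '-b', '-n1'] and 'env': ['TMPDIR=/wibble'] would translate into one call with two NULL-terminated arrays, and no separate 'path' field is needed.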

Re: [Qemu-devel] winXP "Standard PC" HAL and qemu-kvm >= 0.15

2011-12-06 Thread Michael Tokarev

On 06.12.2011 16:27, Michael S. Tsirkin wrote:
> On Tue, Dec 06, 2011 at 03:02:49PM +0400, Michael Tokarev wrote:
[]
>> And after applying Avi's instructions here's the real bisect
>> result:
>>
>> ab431c283e7055bcd6fb622f212bb29e84a6a134 is the first bad commit
>> commit ab431c283e7055bcd6fb622f212bb29e84a6a134
>> Author: Isaku Yamahata 
>> Date:   Fri Apr 1 20:43:23 2011 +0900
>>
>> piix_pci: optimize set irq path
> 
> Could you try with this commit reverted please?
> Reverting patch below. Warning: compiled only.

After some discussion on IRC, here's a summary.

I applied this patch on top of qemu-kvm-0.15.0.
The resulting executable shows the same bad behaviour with my
test guest as without this patch.  So apparently just
reverting this patch isn't enough for the problem to go away.

But when doing a bisection, the result is very reliable - it
always points to the commit above (which we tried to revert
by this patch).

More data points (all against qemu-kvm-0.15.0).

First, as Avi pointed out, this patch references PIC which is
used by standardPC HAL and not used by ACPI HAL.  So it might
be something to think about, at least.

Now, so far, all devices which are on IRQ11 are affected.  When
enabling USB and NIC, they both get assigned to IRQ11 and neither
works.  When enabling just one of them (either), only
that device (which gets assigned to IRQ11) does not work.  All
other devices apparently work fine (including PS/2 Mouse on
IRQ12).

When using just one of NIC/USB, all IRQs in the guest become
single-device, so IRQ sharing isn't a problem.

I wasn't able to force the guest to use IRQ10 so far (to verify).

Also, when booted with -no-kvm-irqchip, guest Just Works, including
USB and NIC sharing IRQ11.

While on IRC there was one more person who suffered from the same
issue, now with Win2003.  He was able to solve his guest issue by
changing StandardPC HAL into ACPI HAL, using a "hackish" way (by
replacing C:\Windows\System32\HAL.DLL with HALACPI.DLL as found
on the win2k3 installation CDROM).  I wasn't able to replace stdhal
with anything else on my test winXP machine - after changing HAL.DLL,
on next reboot my guest complains about being unable to find the boot
device (BSOD STOP 0x7b) - despite using stdIDE and mergeide.  I'll
investigate the guest side further later.

When in this funky mode with non-working IRQ11 (when a NIC (rtl8139)
is assigned to it), winXP guest shows huge delays when trying to
open "My Computer" properties - it freezes for 30..40 seconds after
hitting "Properties" in the context menu.  I can only guess it is
trying to do something with the IRQs at that time, which does not
work.  I wasn't able to (quickly) find a tool for winXP to show
IRQ statistics.

That's about all the info so far which I know about this issue.

Thanks,

/mjt

Re: [Qemu-devel] [PATCH v5] block:add-cow file format

2011-12-06 Thread Kevin Wolf

Am 15.11.2011 06:28, schrieb Dong Xu Wang:
> From: Dong Xu Wang 
> 
> Provide a new file format: add-cow. The usage can be found in add-cow.txt of
> this patch.
> 
> Signed-off-by: Dong Xu Wang 
> ---
>  Makefile.objs  |1 +
>  block.c|2 +-
>  block.h|1 +
>  block/add-cow.c|  417 
> 
>  block_int.h|1 +
>  docs/specs/add-cow.txt |   57 +++
>  6 files changed, 478 insertions(+), 1 deletions(-)
>  create mode 100644 block/add-cow.c
>  create mode 100644 docs/specs/add-cow.txt
> 
> diff --git a/Makefile.objs b/Makefile.objs
> index d7a6539..ad99243 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -31,6 +31,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>  
>  block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o 
> vpc.o vvfat.o
>  block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o 
> qcow2-cache.o
> +block-nested-y += add-cow.o
>  block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
>  block-nested-y += qed-check.o
>  block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
> diff --git a/block.c b/block.c
> index 86910b0..a2be27b 100644
> --- a/block.c
> +++ b/block.c
> @@ -106,7 +106,7 @@ int is_windows_drive(const char *filename)
>  #endif
>  
>  /* check if the path starts with ":" */
> -static int path_has_protocol(const char *path)
> +int path_has_protocol(const char *path)
>  {
>  #ifdef _WIN32
>  if (is_windows_drive(path) ||
> diff --git a/block.h b/block.h
> index 051a25d..836284f 100644
> --- a/block.h
> +++ b/block.h
> @@ -276,6 +276,7 @@ char *bdrv_snapshot_dump(char *buf, int buf_size, 
> QEMUSnapshotInfo *sn);
>  
>  char *get_human_readable_size(char *buf, int buf_size, int64_t size);
>  int path_is_absolute(const char *path);
> +int path_has_protocol(const char *path);
>  void path_combine(char *dest, int dest_size,
>const char *base_path,
>const char *filename);
> diff --git a/block/add-cow.c b/block/add-cow.c
> new file mode 100644
> index 000..54d30a9
> --- /dev/null
> +++ b/block/add-cow.c
> @@ -0,0 +1,417 @@
> +#include "qemu-common.h"
> +#include "block_int.h"
> +#include "module.h"
> +
> +#define ADD_COW_MAGIC   (((uint64_t)'A' << 56) | ((uint64_t)'D' << 48) | \
> +((uint64_t)'D' << 40) | ((uint64_t)'_' << 32) | \
> +((uint64_t)'C' << 24) | ((uint64_t)'O' << 16) | \
> +((uint64_t)'W' << 8) | 0xFF)
> +#define ADD_COW_VERSION 1
> +#define ADD_COW_FILE_LEN1024
> +
> +typedef struct AddCowHeader {
> +uint64_tmagic;
> +uint32_tversion;
> +charbacking_file[ADD_COW_FILE_LEN];
> +charimage_file[ADD_COW_FILE_LEN];
> +uint64_tsize;
> +} QEMU_PACKED AddCowHeader;
> +
> +typedef struct BDRVAddCowState {
> +charimage_file[ADD_COW_FILE_LEN];
> +BlockDriverState*image_hd;
> +uint8_t *bitmap;
> +uint64_tbitmap_size;
> +CoMutex lock;
> +} BDRVAddCowState;
> +
> +static int add_cow_probe(const uint8_t *buf, int buf_size, const char *filename)
> +{
> +const AddCowHeader *header = (const void *)buf;
> +
> +if (be64_to_cpu(header->magic) == ADD_COW_MAGIC &&
> +be32_to_cpu(header->version) == ADD_COW_VERSION) {
> +return 100;
> +} else {
> +return 0;
> +}
> +}
> +
> +static int add_cow_open(BlockDriverState *bs, int flags)
> +{
> +AddCowHeaderheader;
> +int64_t size;
> +charimage_filename[ADD_COW_FILE_LEN];
> +int image_flags;
> +BlockDriver *image_drv = NULL;
> +int ret;
> +BDRVAddCowState *state = (BDRVAddCowState *)(bs->opaque);
> +
> +ret = bdrv_pread(bs->file, 0, &header, sizeof(header));
> +if (ret != sizeof(header)) {
> +goto fail;
> +}
> +
> +if (be64_to_cpu(header.magic) != ADD_COW_MAGIC ||
> +be32_to_cpu(header.version) != ADD_COW_VERSION) {
> +ret = -EINVAL;
> +goto fail;
> +}

Please have a look at qcow2 for better handling of newer version
numbers. We should try to give a good error message for this case.

> +
> +size = be64_to_cpu(header.size);
> +bs->total_sectors = size / BDRV_SECTOR_SIZE;
> +
> +QEMU_BUILD_BUG_ON(sizeof(state->image_file) != sizeof(header.image_file));
> +pstrcpy(bs->backing_file, sizeof(bs->backing_file),
> +header.backing_file);
> +pstrcpy(state->image_file, sizeof(state->image_file),
> +header.image_file);

You need the same QEMU_BUILD_BUG_ON for the backing file, or you can't
assume that header.image_file is large enough that it doesn't matter
that it isn't necessarily correctly terminated.

> +
> +state->bitmap_size = ((bs->total_sectors + 7) >> 3);

Re: [Qemu-devel] [PATCH v5] block:add-cow file format

2011-12-06 Thread Marcelo Tosatti

On Tue, Dec 06, 2011 at 01:59:48PM +0100, Kevin Wolf wrote:
> >> +
> >> +ret = bdrv_pread(bs->file, sizeof(header), state->bitmap,
> >> +state->bitmap_size);
> >> +if (ret != state->bitmap_size) {
> >> +goto fail;
> >> +}
> > 
> > Reading the entire bitmap in memory is not acceptable, it may be huge.
> > Better mmap it and use msync(MS_SYNC) when writing it back. This way the
> > host can free memory easily upon pressure.
> 
> You can't use mmap in block drivers. It would only work with raw-posix
> backends, if at all.
> 
> Kevin

This is just the bitmap, a plain file. Why would you want to use
anything other than a plain file to use as storage for the bitmap?
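
For readers unfamiliar with the mmap/msync approach under discussion, here is a minimal POSIX sketch. It is not code from the patch. It assumes the bitmap lives in its own plain file starting at offset 0 (mmap offsets must be page-aligned, which a bitmap stored right after a header would not be), and the helper names and the LSB-first bit layout are hypothetical.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for ftruncate()/pread() under strict -std modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open (or create) a standalone bitmap file, size it
 * to len bytes, and map it shared so stores go to the file's page cache. */
static uint8_t *bitmap_map_file(const char *path, size_t len, int *fd_out)
{
    void *p;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0) {
        return NULL;
    }
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return NULL;
    }
    *fd_out = fd;
    return p;
}

/* Mark one sector as allocated and force the change to stable storage. */
static int bitmap_mark_and_sync(uint8_t *map, size_t len, uint64_t sector)
{
    assert(map != NULL && (sector >> 3) < len);
    map[sector >> 3] |= (uint8_t)(1u << (sector & 7));
    return msync(map, len, MS_SYNC);
}
```

The kernel can drop clean pages of such a mapping under memory pressure, which is the benefit claimed above. The sketch only works where the backing store is a local file descriptor.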

Re: [Qemu-devel] About the snapshot

2011-12-06 Thread Zhi Hui Li

2011/12/6 Stefan Hajnoczi 

>  On Tue, Dec 6, 2011 at 10:01 AM, Zhi Hui Li 
> wrote:
> > On 2011年12月06日 17:40, Stefan Hajnoczi wrote:
> >>
> >> On Tue, Dec 6, 2011 at 9:07 AM, Zhi Hui Li
> >>  wrote:
> >>>
> >>>
> >>> 1) :
> >>>
> >>> for example:
> >>>
> >>> BDRVQcowState *s = bs->opaque;
> >>>
> >>> s->snapshots
> >>> s->nb_snapshots
> >>>
> >>>
> >>> 1: use the command:   qemu-img snapshot ./test.qcow2  -c aa
> >>> The memory of s->snapshots is not freed;
> >>> if s->nb_snapshots is large, does this cause problems?
> >>>
> >>> 2: use the command:  qemu-system-x86_64  ./test.qcow2 -snapshot
> >>> When the program ends, does it need to free s->snapshots?
> >>
> Okay, I think you're saying that in #1 s->snapshots is leaked because
> qcow2_free_snapshots() is not being called from qcow2_close().
>
> Do you want to send a patch to fix this?
>

Ok, I will send a patch tomorrow.
But I think #2 also needs to call qcow2_free_snapshots(): if savevm has been
called several times, s->snapshots will be very large, and it also needs to be
freed when the process ends.

Thank you very much for your feedback !

>
> Stefan
>
>

Re: [Qemu-devel] [PATCH v5] block:add-cow file format

2011-12-06 Thread Marcelo Tosatti

On Tue, Dec 06, 2011 at 12:53:16PM -0200, Marcelo Tosatti wrote:
> On Tue, Dec 06, 2011 at 01:59:48PM +0100, Kevin Wolf wrote:
> > >> +
> > >> +ret = bdrv_pread(bs->file, sizeof(header), state->bitmap,
> > >> +state->bitmap_size);
> > >> +if (ret != state->bitmap_size) {
> > >> +goto fail;
> > >> +}
> > > 
> > > Reading the entire bitmap in memory is not acceptable, it may be huge.
> > > Better mmap it and use msync(MS_SYNC) when writing it back. This way the
> > > host can free memory easily upon pressure.
> > 
> > You can't use mmap in block drivers. It would only work with raw-posix
> > backends, if at all.
> > 
> > Kevin
> 
> This is just the bitmap, a plain file. Why would you want to use
> anything other than a plain file to use as storage for the bitmap?

Well, mmap'ing would make life much simpler, but it has limitations such 
as portability.

Then what is necessary is a cache similar to qcow2's metadata cache.

Re: [Qemu-devel] [PATCH v5] block:add-cow file format

2011-12-06 Thread Kevin Wolf

Am 06.12.2011 15:53, schrieb Marcelo Tosatti:
> On Tue, Dec 06, 2011 at 01:59:48PM +0100, Kevin Wolf wrote:
>>>> +
>>>> +ret = bdrv_pread(bs->file, sizeof(header), state->bitmap,
>>>> +state->bitmap_size);
>>>> +if (ret != state->bitmap_size) {
>>>> +goto fail;
>>>> +}
>>>
>>> Reading the entire bitmap in memory is not acceptable, it may be huge.
>>> Better mmap it and use msync(MS_SYNC) when writing it back. This way the
>>> host can free memory easily upon pressure.
>>
>> You can't use mmap in block drivers. It would only work with raw-posix
>> backends, if at all.
> 
> This is just the bitmap, a plain file. Why would you want to use
> anything other than a plain file to use as storage for the bitmap?

The obvious case is raw-win32. There are probably not so obvious, but
still valid use cases that involve things like NBD, iSCSI, blkdebug or
whatever.

Kevin

Re: [Qemu-devel] [PATCH v5] block:add-cow file format

2011-12-06 Thread Kevin Wolf

Am 06.12.2011 16:06, schrieb Marcelo Tosatti:
> On Tue, Dec 06, 2011 at 12:53:16PM -0200, Marcelo Tosatti wrote:
>> On Tue, Dec 06, 2011 at 01:59:48PM +0100, Kevin Wolf wrote:
>>>>> +
>>>>> +ret = bdrv_pread(bs->file, sizeof(header), state->bitmap,
>>>>> +state->bitmap_size);
>>>>> +if (ret != state->bitmap_size) {
>>>>> +goto fail;
>>>>> +}
>>>>
>>>> Reading the entire bitmap in memory is not acceptable, it may be huge.
>>>> Better mmap it and use msync(MS_SYNC) when writing it back. This way the
>>>> host can free memory easily upon pressure.
>>>
>>> You can't use mmap in block drivers. It would only work with raw-posix
>>> backends, if at all.
>>>
>>> Kevin
>>
>> This is just the bitmap, a plain file. Why would you want to use
>> anything other than a plain file to use as storage for the bitmap?
> 
> Well, mmap'ing would make life much simpler, but it has limitations such 
> as portability.
> 
> Then what is necessary is a cache similar to qcow2's metadata cache.

Right, we can probably generalise the qcow2 code and make it available
for other drivers as well.

Kevin
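
To make the idea of a metadata cache concrete, here is a deliberately small sketch of a write-back cache over bitmap chunks. It uses plain pread/pwrite on a file descriptor for brevity, where a block driver would go through the block layer instead. This is not how qcow2's cache is implemented. All names, the chunk size, the slot count and the direct-mapped placement are assumptions made for the example.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pread()/pwrite() under strict -std modes */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_SIZE 4096   /* bitmap bytes per cache slot (assumed) */
#define NUM_SLOTS  4      /* deliberately tiny, for illustration */

typedef struct BitmapCache {
    int fd;               /* file that stores the bitmap */
    off_t base;           /* byte offset of the bitmap within the file */
    struct {
        int64_t chunk;    /* chunk index held in this slot, -1 if empty */
        int dirty;        /* slot has changes not yet written back */
        uint8_t data[CHUNK_SIZE];
    } slot[NUM_SLOTS];
} BitmapCache;

static int cache_open(BitmapCache *c, const char *path, off_t base)
{
    int i;

    c->fd = open(path, O_RDWR | O_CREAT, 0600);
    c->base = base;
    for (i = 0; i < NUM_SLOTS; i++) {
        c->slot[i].chunk = -1;
        c->slot[i].dirty = 0;
    }
    return c->fd < 0 ? -1 : 0;
}

/* Write one slot back to the file if it holds unwritten changes. */
static int cache_writeback(BitmapCache *c, int i)
{
    if (c->slot[i].chunk < 0 || !c->slot[i].dirty) {
        return 0;
    }
    if (pwrite(c->fd, c->slot[i].data, CHUNK_SIZE,
               c->base + (off_t)c->slot[i].chunk * CHUNK_SIZE) != CHUNK_SIZE) {
        return -1;
    }
    c->slot[i].dirty = 0;
    return 0;
}

/* Return the slot index holding `chunk`, evicting and loading if needed. */
static int cache_get(BitmapCache *c, int64_t chunk)
{
    int i = (int)(chunk % NUM_SLOTS);   /* direct-mapped placement */

    assert(chunk >= 0);
    if (c->slot[i].chunk != chunk) {
        if (cache_writeback(c, i) < 0) {
            return -1;
        }
        c->slot[i].chunk = -1;
        memset(c->slot[i].data, 0, CHUNK_SIZE);  /* bytes past EOF read as 0 */
        if (pread(c->fd, c->slot[i].data, CHUNK_SIZE,
                  c->base + (off_t)chunk * CHUNK_SIZE) < 0) {
            return -1;
        }
        c->slot[i].chunk = chunk;
    }
    return i;
}

static int cache_set_bit(BitmapCache *c, uint64_t bit)
{
    int i = cache_get(c, (int64_t)(bit / (8 * CHUNK_SIZE)));

    if (i < 0) {
        return -1;
    }
    c->slot[i].data[(bit / 8) % CHUNK_SIZE] |= (uint8_t)(1u << (bit % 8));
    c->slot[i].dirty = 1;
    return 0;
}

static int cache_test_bit(BitmapCache *c, uint64_t bit)
{
    int i = cache_get(c, (int64_t)(bit / (8 * CHUNK_SIZE)));

    if (i < 0) {
        return -1;
    }
    return (c->slot[i].data[(bit / 8) % CHUNK_SIZE] >> (bit % 8)) & 1;
}

/* Write back every dirty slot, then flush the file itself. */
static int cache_flush(BitmapCache *c)
{
    int i;

    for (i = 0; i < NUM_SLOTS; i++) {
        if (cache_writeback(c, i) < 0) {
            return -1;
        }
    }
    return fsync(c->fd);
}
```

Memory use is bounded by NUM_SLOTS * CHUNK_SIZE regardless of image size, which addresses the objection to reading the whole bitmap, at the cost of extra I/O when chunks collide in a slot.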

[Qemu-devel] [PATCH 01/25] add qemu_send_full and qemu_recv_full

2011-12-06 Thread Paolo Bonzini

Signed-off-by: Paolo Bonzini 
---
 osdep.c   |   67 +
 qemu-common.h |4 +++
 2 files changed, 71 insertions(+), 0 deletions(-)

diff --git a/osdep.c b/osdep.c
index 56e6963..70bad27 100644
--- a/osdep.c
+++ b/osdep.c
@@ -166,3 +166,70 @@ int qemu_accept(int s, struct sockaddr *addr, socklen_t 
*addrlen)
 
 return ret;
 }
+
+/*
+ * A variant of send(2) which handles partial write.
+ *
+ * Return the number of bytes transferred, which is only
+ * smaller than `count' if there is an error.
+ *
+ * This function won't work with non-blocking fd's.
+ * Any of the possibilities with non-blocking fd's is bad:
+ *   - return a short write (then name is wrong)
+ *   - busy wait adding (errno == EAGAIN) to the loop
+ */
+ssize_t qemu_send_full(int fd, const void *buf, size_t count, int flags)
+{
+ssize_t ret = 0;
+ssize_t total = 0;
+
+while (count) {
+ret = send(fd, buf, count, flags);
+if (ret < 0) {
+if (errno == EINTR) {
+continue;
+}
+break;
+}
+
+count -= ret;
+buf += ret;
+total += ret;
+}
+
+return total;
+}
+
+/*
+ * A variant of recv(2) which handles partial read.
+ *
+ * Return the number of bytes transferred, which is only
+ * smaller than `count' if there is an error or end of stream.
+ *
+ * This function won't work with non-blocking fd's.
+ * Any of the possibilities with non-blocking fd's is bad:
+ *   - return a short read (then name is wrong)
+ *   - busy wait adding (errno == EAGAIN) to the loop
+ */
+ssize_t qemu_recv_full(int fd, void *buf, size_t count, int flags)
+{
+ssize_t ret = 0;
+ssize_t total = 0;
+
+while (count) {
+ret = qemu_recv(fd, buf, count, flags);
+if (ret <= 0) {
+if (ret < 0 && errno == EINTR) {
+continue;
+}
+break;
+}
+
+count -= ret;
+buf += ret;
+total += ret;
+}
+
+return total;
+}
+
diff --git a/qemu-common.h b/qemu-common.h
index 44870fe..bb60a5a 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -173,6 +173,10 @@ void *qemu_oom_check(void *ptr);
 int qemu_open(const char *name, int flags, ...);
 ssize_t qemu_write_full(int fd, const void *buf, size_t count)
 QEMU_WARN_UNUSED_RESULT;
+ssize_t qemu_send_full(int fd, const void *buf, size_t count, int flags)
+QEMU_WARN_UNUSED_RESULT;
+ssize_t qemu_recv_full(int fd, void *buf, size_t count, int flags)
+QEMU_WARN_UNUSED_RESULT;
 void qemu_set_cloexec(int fd);
 
 #ifndef _WIN32
-- 
1.7.7.1

[Qemu-devel] [PATCH 05/25] nbd: allow multiple in-flight requests

2011-12-06 Thread Paolo Bonzini

Allow sending up to 16 requests, and drive the replies to the coroutine
that did the request.  The code is written to be exactly the same as
before this patch when MAX_NBD_REQUESTS == 1 (modulo the extra mutex
and state).

Signed-off-by: Paolo Bonzini 
---
 block/nbd.c |   69 +++---
 1 files changed, 56 insertions(+), 13 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 7e6bf87..93f5d16 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -46,6 +46,10 @@
 #define logout(fmt, ...) ((void)0)
 #endif
 
+#define MAX_NBD_REQUESTS   16
+#define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
+#define INDEX_TO_HANDLE(bs, index)  ((index)  ^ ((uint64_t)(intptr_t)bs))
+
 typedef struct BDRVNBDState {
 int sock;
 uint32_t nbdflags;
@@ -53,9 +57,12 @@ typedef struct BDRVNBDState {
 size_t blocksize;
 char *export_name; /* An NBD server may export several devices */
 
-CoMutex mutex;
-Coroutine *coroutine;
+CoMutex send_mutex;
+CoMutex free_sema;
+Coroutine *send_coroutine;
+int in_flight;
 
+Coroutine *recv_coroutine[MAX_NBD_REQUESTS];
 struct nbd_reply reply;
 
 /* If it begins with  '/', this is a UNIX domain socket. Otherwise,
@@ -112,41 +119,68 @@ out:
 
 static void nbd_coroutine_start(BDRVNBDState *s, struct nbd_request *request)
 {
-qemu_co_mutex_lock(&s->mutex);
-s->coroutine = qemu_coroutine_self();
-request->handle = (uint64_t)(intptr_t)s;
+int i;
+
+/* Poor man's semaphore.  The free_sema is locked when no other request
+ * can be accepted, and unlocked after receiving one reply.  */
+if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
+qemu_co_mutex_lock(&s->free_sema);
+assert(s->in_flight < MAX_NBD_REQUESTS);
+}
+s->in_flight++;
+
+for (i = 0; i < MAX_NBD_REQUESTS; i++) {
+if (s->recv_coroutine[i] == NULL) {
+s->recv_coroutine[i] = qemu_coroutine_self();
+break;
+}
+}
+
+assert(i < MAX_NBD_REQUESTS);
+request->handle = INDEX_TO_HANDLE(s, i);
 }
 
 static int nbd_have_request(void *opaque)
 {
 BDRVNBDState *s = opaque;
 
-return !!s->coroutine;
+return s->in_flight > 0;
 }
 
 static void nbd_reply_ready(void *opaque)
 {
 BDRVNBDState *s = opaque;
+int i;
 
 if (s->reply.handle == 0) {
 /* No reply already in flight.  Fetch a header.  */
 if (nbd_receive_reply(s->sock, &s->reply) < 0) {
 s->reply.handle = 0;
+goto fail;
 }
 }
 
 /* There's no need for a mutex on the receive side, because the
  * handler acts as a synchronization point and ensures that only
  * one coroutine is called until the reply finishes.  */
-if (s->coroutine) {
-qemu_coroutine_enter(s->coroutine, NULL);
+i = HANDLE_TO_INDEX(s, s->reply.handle);
+if (s->recv_coroutine[i]) {
+qemu_coroutine_enter(s->recv_coroutine[i], NULL);
+return;
+}
+
+fail:
+for (i = 0; i < MAX_NBD_REQUESTS; i++) {
+if (s->recv_coroutine[i]) {
+qemu_coroutine_enter(s->recv_coroutine[i], NULL);
+}
 }
 }
 
 static void nbd_restart_write(void *opaque)
 {
 BDRVNBDState *s = opaque;
-qemu_coroutine_enter(s->coroutine, NULL);
+qemu_coroutine_enter(s->send_coroutine, NULL);
 }
 
 static int nbd_co_send_request(BDRVNBDState *s, struct nbd_request *request,
@@ -154,6 +188,8 @@ static int nbd_co_send_request(BDRVNBDState *s, struct 
nbd_request *request,
 {
 int rc, ret;
 
+qemu_co_mutex_lock(&s->send_mutex);
+s->send_coroutine = qemu_coroutine_self();
 qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, nbd_restart_write,
 nbd_have_request, NULL, s);
 rc = nbd_send_request(s->sock, request);
@@ -166,6 +202,8 @@ static int nbd_co_send_request(BDRVNBDState *s, struct 
nbd_request *request,
 }
 qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, NULL,
 nbd_have_request, NULL, s);
+s->send_coroutine = NULL;
+qemu_co_mutex_unlock(&s->send_mutex);
 return rc;
 }
 
@@ -175,7 +213,8 @@ static void nbd_co_receive_reply(BDRVNBDState *s, struct 
nbd_request *request,
 {
 int ret;
 
-/* Wait until we're woken up by the read handler.  */
+/* Wait until we're woken up by the read handler.  TODO: perhaps
+ * peek at the next reply and avoid yielding if it's ours?  */
 qemu_coroutine_yield();
 *reply = s->reply;
 if (reply->handle != request->handle) {
@@ -195,8 +234,11 @@ static void nbd_co_receive_reply(BDRVNBDState *s, struct 
nbd_request *request,
 
 static void nbd_coroutine_end(BDRVNBDState *s, struct nbd_request *request)
 {
-s->coroutine = NULL;
-qemu_co_mutex_unlock(&s->mutex);
+int i = HANDLE_TO_INDEX(s, request->handle);
+s->recv_coroutine[i] = NULL;
+if (s->in_flight-- == MAX_NBD_REQUESTS) {
+qemu_co_mutex_unlock(&s->free_sema);
+}

[Qemu-devel] [PATCH 23/25] qemu-nbd: add client pointer to NBDRequest

2011-12-06 Thread Paolo Bonzini

By attaching a client to an NBDRequest, we can avoid passing around the
socket descriptor and data buffer.

Also, we can now manage the reference count for the client in
nbd_request_get/put instead of having to do it ourselves in
nbd_read.  This simplifies things when coroutines are used.

Signed-off-by: Paolo Bonzini 
---
 nbd.c |   48 +++-
 1 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/nbd.c b/nbd.c
index f479c30..ee5325b 100644
--- a/nbd.c
+++ b/nbd.c
@@ -589,6 +589,7 @@ typedef struct NBDRequest NBDRequest;
 
 struct NBDRequest {
 QSIMPLEQ_ENTRY(NBDRequest) entry;
+NBDClient *client;
 uint8_t *data;
 };
 
@@ -631,9 +632,11 @@ static void nbd_client_close(NBDClient *client)
 nbd_client_put(client);
 }
 
-static NBDRequest *nbd_request_get(NBDExport *exp)
+static NBDRequest *nbd_request_get(NBDClient *client)
 {
 NBDRequest *req;
+NBDExport *exp = client->exp;
+
 if (QSIMPLEQ_EMPTY(&exp->requests)) {
 req = g_malloc0(sizeof(NBDRequest));
 req->data = qemu_blockalign(exp->bs, NBD_BUFFER_SIZE);
@@ -641,12 +644,16 @@ static NBDRequest *nbd_request_get(NBDExport *exp)
 req = QSIMPLEQ_FIRST(&exp->requests);
 QSIMPLEQ_REMOVE_HEAD(&exp->requests, entry);
 }
+nbd_client_get(client);
+req->client = client;
 return req;
 }
 
-static void nbd_request_put(NBDExport *exp, NBDRequest *req)
+static void nbd_request_put(NBDRequest *req)
 {
-QSIMPLEQ_INSERT_HEAD(&exp->requests, req, entry);
+NBDClient *client = req->client;
+QSIMPLEQ_INSERT_HEAD(&client->exp->requests, req, entry);
+nbd_client_put(client);
 }
 
 NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset,
@@ -674,9 +681,11 @@ void nbd_export_close(NBDExport *exp)
 g_free(exp);
 }
 
-static int nbd_do_send_reply(int csock, struct nbd_reply *reply,
- uint8_t *data, int len)
+static int nbd_do_send_reply(NBDRequest *req, struct nbd_reply *reply,
+ int len)
 {
+NBDClient *client = req->client;
+int csock = client->sock;
 int rc, ret;
 
 if (!len) {
@@ -688,7 +697,7 @@ static int nbd_do_send_reply(int csock, struct nbd_reply 
*reply,
 socket_set_cork(csock, 1);
 rc = nbd_send_reply(csock, reply);
 if (rc != -1) {
-ret = write_sync(csock, data, len);
+ret = write_sync(csock, req->data, len);
 if (ret != len) {
 errno = EIO;
 rc = -1;
@@ -702,9 +711,10 @@ static int nbd_do_send_reply(int csock, struct nbd_reply 
*reply,
 return rc;
 }
 
-static int nbd_do_receive_request(int csock, struct nbd_request *request,
-  uint8_t *data)
+static int nbd_do_receive_request(NBDRequest *req, struct nbd_request *request)
 {
+NBDClient *client = req->client;
+int csock = client->sock;
 int rc;
 
 if (nbd_receive_request(csock, request) == -1) {
@@ -731,7 +741,7 @@ static int nbd_do_receive_request(int csock, struct 
nbd_request *request,
 if ((request->type & NBD_CMD_MASK_COMMAND) == NBD_CMD_WRITE) {
 TRACE("Reading %u byte(s)", request->len);
 
-if (read_sync(csock, data, request->len) != request->len) {
+if (read_sync(csock, req->data, request->len) != request->len) {
 LOG("reading from socket failed");
 rc = -EIO;
 goto out;
@@ -745,9 +755,8 @@ out:
 
 static int nbd_trip(NBDClient *client)
 {
+NBDRequest *req = nbd_request_get(client);
 NBDExport *exp = client->exp;
-NBDRequest *req = nbd_request_get(exp);
-int csock = client->sock;
 struct nbd_request request;
 struct nbd_reply reply;
 int rc = -1;
@@ -755,7 +764,7 @@ static int nbd_trip(NBDClient *client)
 
 TRACE("Reading request.");
 
-ret = nbd_do_receive_request(csock, &request, req->data);
+ret = nbd_do_receive_request(req, &request);
 if (ret == -EIO) {
 goto out;
 }
@@ -790,7 +799,7 @@ static int nbd_trip(NBDClient *client)
 }
 
 TRACE("Read %u byte(s)", request.len);
-if (nbd_do_send_reply(csock, &reply, req->data, request.len) < 0)
+if (nbd_do_send_reply(req, &reply, request.len) < 0)
 goto out;
 break;
 case NBD_CMD_WRITE:
@@ -821,7 +830,7 @@ static int nbd_trip(NBDClient *client)
 }
 }
 
-if (nbd_do_send_reply(csock, &reply, NULL, 0) < 0)
+if (nbd_do_send_reply(req, &reply, 0) < 0)
 goto out;
 break;
 case NBD_CMD_DISC:
@@ -837,7 +846,7 @@ static int nbd_trip(NBDClient *client)
 reply.error = -ret;
 }
 
-if (nbd_do_send_reply(csock, &reply, NULL, 0) < 0)
+if (nbd_do_send_reply(req, &reply, 0) < 0)
 goto out;
 break;
 case NBD_CMD_TRIM:
@@ -848,7 +857,7 @@ static int nbd_trip(NBDClient *client)
 LOG("discard failed");

Re: [Qemu-devel] About the snapshot

2011-12-06 Thread Stefan Hajnoczi

2011/12/6 Zhi Hui Li :
>
>
> 2011/12/6 Stefan Hajnoczi 
>>
>> On Tue, Dec 6, 2011 at 10:01 AM, Zhi Hui Li 
>> wrote:
>> > On 2011年12月06日 17:40, Stefan Hajnoczi wrote:
>> >>
>> >> On Tue, Dec 6, 2011 at 9:07 AM, Zhi Hui Li
>> >>  wrote:
>> >>>
>> >>>
>> >>> 1) :
>> >>>
>> >>> for example:
>> >>>
>> >>> BDRVQcowState *s = bs->opaque;
>> >>>
>> >>> s->snapshots
>> >>> s->nb_snapshots
>> >>>
>> >>>
>> >>> 1: use the command:   qemu-img snapshot ./test.qcow2  -c aa
>> >>> The memory of s->snapshots is not freed;
>> >>> if s->nb_snapshots is large, does this cause problems?
>> >>>
>> >>> 2: use the command:  qemu-system-x86_64  ./test.qcow2 -snapshot
>> >>> When the program ends, does it need to free s->snapshots?
>> >>
>> Okay, I think you're saying that in #1 s->snapshots is leaked because
>> qcow2_free_snapshots() is not being called from qcow2_close().
>>
>> Do you want to send a patch to fix this?
>
>
> Ok, I will send a patch tomorrow.
> But I think #2 also needs to call qcow2_free_snapshots(): if savevm has been
> called several times, s->snapshots will be very large, and it also needs to be
> freed when the process ends.

Right, I think I understand what you're saying.  I was thinking about
what #1 and #2 mean differently, but it doesn't matter.

If qcow2_close() frees s->snapshots then the problem is solved in all
possible qcow2 use cases, including #2 with savevm.

Stefan
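
The point above, that freeing in the close path covers every user of the driver, can be sketched generically. This is not QEMU code: the structures and function names below are hypothetical stand-ins for the qcow2 state and for qcow2_free_snapshots() / qcow2_close().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's in-memory snapshot table. */
typedef struct Snapshot {
    char *id_str;
    char *name;
} Snapshot;

typedef struct ImageState {
    Snapshot *snapshots;
    int nb_snapshots;
} ImageState;

static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    if (p != NULL) {
        memcpy(p, s, n);
    }
    return p;
}

static int image_add_snapshot(ImageState *s, const char *id, const char *name)
{
    Snapshot *n = realloc(s->snapshots,
                          (size_t)(s->nb_snapshots + 1) * sizeof(*n));

    if (n == NULL) {
        return -1;
    }
    s->snapshots = n;
    n[s->nb_snapshots].id_str = dup_str(id);
    n[s->nb_snapshots].name = dup_str(name);
    s->nb_snapshots++;
    return 0;
}

/* Release every entry and reset the table so a second call is harmless. */
static void image_free_snapshots(ImageState *s)
{
    int i;

    for (i = 0; i < s->nb_snapshots; i++) {
        free(s->snapshots[i].id_str);
        free(s->snapshots[i].name);
    }
    free(s->snapshots);
    s->snapshots = NULL;
    s->nb_snapshots = 0;
}

/* Calling the free routine from close covers every code path that opened
 * the image, whether a short-lived tool or a long-running process. */
static void image_close(ImageState *s)
{
    image_free_snapshots(s);
}
```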
