date:20110602

Re: [Qemu-devel] [RFC]QEMU disk I/O limits

2011-06-02 Thread Sasha Levin

On Thu, 2011-06-02 at 14:29 +0800, Zhi Yong Wu wrote:
> On Thu, Jun 02, 2011 at 09:17:06AM +0300, Sasha Levin wrote:
> >Date: Thu, 02 Jun 2011 09:17:06 +0300
> >From: Sasha Levin 
> >To: Zhi Yong Wu 
> >Cc: qemu-devel@nongnu.org, k...@vger.kernel.org, kw...@redhat.com,
> > aligu...@us.ibm.com, herb...@gondor.apana.org.au,
> > guijianf...@cn.fujitsu.com, wu...@cn.ibm.com, luow...@cn.ibm.com,
> > zh...@cn.ibm.com, zhaoy...@cn.ibm.com, l...@redhat.com,
> > rahar...@us.ibm.com, vgo...@redhat.com, stefa...@linux.vnet.ibm.com
> >Subject: Re: [Qemu-devel] [RFC]QEMU disk I/O limits
> >X-Mailer: Evolution 2.32.2 
> >
> >Hi,
> >
> >On Mon, 2011-05-30 at 13:09 +0800, Zhi Yong Wu wrote:
> >> Hello, all,
> >> 
> >> I am preparing to work on a feature called "Disk I/O limits" for the
> >> qemu-kvm project.
> >> This feature will enable the user to cap the amount of disk I/O performed by
> >> a VM. This is important when storage resources are shared among
> >> multiple VMs. As you know, if some VMs do excessive disk I/O,
> >> they will hurt the performance of the other VMs.
> >> 
> >> More detail is available here:
> >> http://wiki.qemu.org/Features/DiskIOLimits
> >> 
> >> 1.) Why we need per-drive disk I/O limits 
> >> As you've known, for linux, cgroup blkio-controller has supported I/O 
> >> throttling on block devices. More importantly, there is no single 
> >> mechanism for disk I/O throttling across all underlying storage types 
> >> (image file, LVM, NFS, Ceph) and for some types there is no way to 
> >> throttle at all. 
> >> 
> >> Disk I/O limits feature introduces QEMU block layer I/O limits 
> >> together with command-line and QMP interfaces for configuring limits. This 
> >> allows I/O limits to be imposed across all underlying storage types using 
> >> a single interface.
> >> 
> >> 2.) How disk I/O limits will be implemented
> >> QEMU block layer will introduce a per-drive disk I/O request queue for 
> >> those disks whose "disk I/O limits" feature is enabled. It can control 
> >> disk I/O limits individually for each disk when multiple disks are 
> >> attached to a VM, and enable use cases like unlimited local disk access 
> >> but shared storage access with limits. 
> >> In a multiple I/O threads scenario, when an application in a VM issues a
> >> block I/O request, the request will be intercepted by the QEMU block layer,
> >> which will calculate the disk's runtime I/O rate and determine whether it has
> >> gone beyond its limits. If so, the I/O request will be added to the
> >> per-drive queue; otherwise it will be serviced.
> >> 
> >> 3.) How the users enable and play with it
> >> QEMU -drive option will be extended so that disk I/O limits can be 
> >> specified on its command line, such as -drive [iops=xxx,][throughput=xxx] 
> >> or -drive [iops_rd=xxx,][iops_wr=xxx,][throughput=xxx] etc. When this 
> >> argument is specified, it means that "disk I/O limits" feature is enabled 
> >> for this drive disk.
> >> The feature will also provide users with the ability to change 
> >> per-drive disk I/O limits at runtime using QMP commands.
> >
> >I'm wondering if you've considered adding a 'burst' parameter -
> >something which will not limit (or limit less) the io ops or the
> >throughput for the first 'x' ms in a given time window.
> Currently no. Could you let us know in which scenarios it would make sense?

My assumption is that most guests are not doing constant disk I/O
access. Instead, the operations are usually short and happen on small
scale (relatively small amount of bytes accessed).

For example: Multiple table DB lookup, serving a website, file servers.

Basically, if I need to do a DB lookup which needs 50MB of data from a
disk which is limited to 10MB/s, I'd rather let it burst for 1 second
and complete the lookup faster instead of having it read data for 5
seconds.

If the guest now starts running multiple lookups one after the other,
that's when I would like to limit.
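
One way to picture the burst idea is a token bucket: the bucket capacity is the burst allowance and the refill rate is the sustained limit. The following is a minimal sketch under that assumption, not QEMU code; `BurstBucket` and its functions are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical token bucket; not part of QEMU. */
typedef struct {
    uint64_t rate;      /* sustained limit, bytes per second */
    uint64_t capacity;  /* burst allowance, bytes */
    uint64_t tokens;    /* bytes that may be issued right now */
} BurstBucket;

static void bucket_init(BurstBucket *b, uint64_t rate, uint64_t capacity)
{
    assert(b != 0 && rate > 0 && capacity > 0);
    b->rate = rate;
    b->capacity = capacity;
    b->tokens = capacity;   /* an idle guest starts with a full burst */
}

/* Credit tokens for elapsed_ms of wall time, capped at capacity. */
static void bucket_refill(BurstBucket *b, uint64_t elapsed_ms)
{
    uint64_t add = b->rate * elapsed_ms / 1000;
    uint64_t sum = b->tokens + add;
    b->tokens = (sum > b->capacity) ? b->capacity : sum;
}

/* Returns 1 if the request may run now, 0 if it should be queued. */
static int bucket_try_consume(BurstBucket *b, uint64_t bytes)
{
    if (bytes > b->tokens) {
        return 0;
    }
    b->tokens -= bytes;
    return 1;
}
```

With a 10MB/s rate and a 50MB capacity, the single 50MB lookup goes through at once, while back-to-back lookups drain the bucket and are then held to the sustained rate.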

> Regards,
> 
> Zhiyong Wu
> >
> >> Regards,
> >> 
> >> Zhiyong Wu
> >> 
> >
> >-- 
> >
> >Sasha.
> >

-- 

Sasha.

Re: [Qemu-devel] [PATCH V2 3/3] Remove warning in printf due to type mismatch

2011-06-02 Thread Stefan Weil


Am 30.05.2011 00:22, schrieb Alexandre Raymond:

8<
qemu/target-lm32/translate.c: In function ‘gen_intermediate_code_internal’:
qemu/target-lm32/translate.c:1135: warning: format ‘%zd’ expects type ‘signed size_t’, but argument 4 has type ‘int’

8<

Both gen_opc_ptr and gen_opc_buf are "uint16_t *". The difference between
pointers is a ptrdiff_t so printf needs '%td'.

Signed-off-by: Alexandre Raymond 
---
target-lm32/translate.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index eb21158..5e19725 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1132,7 +1132,7 @@ static void gen_intermediate_code_internal(CPUState *env,
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
         qemu_log("\n");
         log_target_disas(pc_start, dc->pc - pc_start, 0);
-        qemu_log("\nisize=%d osize=%zd\n",
+        qemu_log("\nisize=%d osize=%td\n",
             dc->pc - pc_start, gen_opc_ptr - gen_opc_buf);
     }
 #endif


Acked-by: Stefan Weil
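
The point of the patch can be shown in isolation. The sketch below is a standalone illustration, not code from target-lm32; `format_osize` is a hypothetical helper. It subtracts two `uint16_t *` values, which yields a `ptrdiff_t`, and prints the result with the `t` length modifier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Formats the distance between two uint16_t pointers into buf.
 * The subtraction yields a ptrdiff_t, so the matching conversion is %td. */
static int format_osize(char *buf, size_t len,
                        const uint16_t *start, const uint16_t *end)
{
    assert(buf != NULL && len > 0);
    ptrdiff_t n = end - start;          /* element count, not bytes */
    return snprintf(buf, len, "osize=%td", n);
}
```

Using `%zd` or `%d` here only happens to work on hosts where the types have the same size, which is why the compiler warns.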

[Qemu-devel] qemu-mipsel problem GC_INIT()

2011-06-02 Thread Dragoslav Sicarov

Hi,
I'm using qemu-mipsel in user mode, and I have a problem with GC_INIT();

I wrote this simple program

#include
int main ()
{
GC_INIT();
}

and compiled it for mipsel.

When I start the executable, I get a segmentation fault in qemu user mode.
The same executable works on qemu system mipsel.

How do I resolve this problem?

Best regards, Dragoslav Sicarov.

Re: [Qemu-devel] [RFC]QEMU disk I/O limits

2011-06-02 Thread Zhi Yong Wu

On Thu, Jun 02, 2011 at 10:15:02AM +0300, Sasha Levin wrote:
>Date: Thu, 02 Jun 2011 10:15:02 +0300
>From: Sasha Levin 
>To: Zhi Yong Wu 
>Cc: kw...@redhat.com, aligu...@us.ibm.com, herb...@gondor.apana.org.au,
>   k...@vger.kernel.org, guijianf...@cn.fujitsu.com,
>   qemu-devel@nongnu.org, wu...@cn.ibm.com, luow...@cn.ibm.com,
>   zh...@cn.ibm.com, zhaoy...@cn.ibm.com, l...@redhat.com,
>   rahar...@us.ibm.com, vgo...@redhat.com, stefa...@linux.vnet.ibm.com
>Subject: Re: [Qemu-devel] [RFC]QEMU disk I/O limits
>X-Mailer: Evolution 2.32.2 
>
>On Thu, 2011-06-02 at 14:29 +0800, Zhi Yong Wu wrote:
>> On Thu, Jun 02, 2011 at 09:17:06AM +0300, Sasha Levin wrote:
>> >Date: Thu, 02 Jun 2011 09:17:06 +0300
>> >From: Sasha Levin 
>> >To: Zhi Yong Wu 
>> >Cc: qemu-devel@nongnu.org, k...@vger.kernel.org, kw...@redhat.com,
>> >aligu...@us.ibm.com, herb...@gondor.apana.org.au,
>> >guijianf...@cn.fujitsu.com, wu...@cn.ibm.com, luow...@cn.ibm.com,
>> >zh...@cn.ibm.com, zhaoy...@cn.ibm.com, l...@redhat.com,
>> >rahar...@us.ibm.com, vgo...@redhat.com, stefa...@linux.vnet.ibm.com
>> >Subject: Re: [Qemu-devel] [RFC]QEMU disk I/O limits
>> >X-Mailer: Evolution 2.32.2 
>> >
>> >Hi,
>> >
>> >On Mon, 2011-05-30 at 13:09 +0800, Zhi Yong Wu wrote:
>> >> Hello, all,
>> >> 
>> >> I am preparing to work on a feature called "Disk I/O limits" for the
>> >> qemu-kvm project.
>> >> This feature will enable the user to cap the amount of disk I/O performed by
>> >> a VM. This is important when storage resources are shared among
>> >> multiple VMs. As you know, if some VMs do excessive disk I/O,
>> >> they will hurt the performance of the other VMs.
>> >> 
>> >> More detail is available here:
>> >> http://wiki.qemu.org/Features/DiskIOLimits
>> >> 
>> >> 1.) Why we need per-drive disk I/O limits 
>> >> As you've known, for linux, cgroup blkio-controller has supported I/O 
>> >> throttling on block devices. More importantly, there is no single 
>> >> mechanism for disk I/O throttling across all underlying storage types 
>> >> (image file, LVM, NFS, Ceph) and for some types there is no way to 
>> >> throttle at all. 
>> >> 
>> >> Disk I/O limits feature introduces QEMU block layer I/O limits 
>> >> together with command-line and QMP interfaces for configuring limits. 
>> >> This allows I/O limits to be imposed across all underlying storage types 
>> >> using a single interface.
>> >> 
>> >> 2.) How disk I/O limits will be implemented
>> >> QEMU block layer will introduce a per-drive disk I/O request queue 
>> >> for those disks whose "disk I/O limits" feature is enabled. It can 
>> >> control disk I/O limits individually for each disk when multiple disks 
>> >> are attached to a VM, and enable use cases like unlimited local disk 
>> >> access but shared storage access with limits. 
>> >> In a multiple I/O threads scenario, when an application in a VM issues
>> >> a block I/O request, the request will be intercepted by the QEMU block
>> >> layer, which will calculate the disk's runtime I/O rate and determine whether
>> >> it has gone beyond its limits. If so, the I/O request will be added to the
>> >> per-drive queue; otherwise it will be serviced.
>> >> 
>> >> 3.) How the users enable and play with it
>> >> QEMU -drive option will be extended so that disk I/O limits can be 
>> >> specified on its command line, such as -drive [iops=xxx,][throughput=xxx] 
>> >> or -drive [iops_rd=xxx,][iops_wr=xxx,][throughput=xxx] etc. When this 
>> >> argument is specified, it means that "disk I/O limits" feature is enabled 
>> >> for this drive disk.
>> >> The feature will also provide users with the ability to change 
>> >> per-drive disk I/O limits at runtime using QMP commands.
>> >
>> >I'm wondering if you've considered adding a 'burst' parameter -
>> >something which will not limit (or limit less) the io ops or the
>> >throughput for the first 'x' ms in a given time window.
>> Currently no. Could you let us know in which scenarios it would make sense?
>
>My assumption is that most guests are not doing constant disk I/O
>access. Instead, the operations are usually short and happen on small
>scale (relatively small amount of bytes accessed).
>
>For example: Multiple table DB lookup, serving a website, file servers.
>
>Basically, if I need to do a DB lookup which needs 50MB of data from a
>disk which is limited to 10MB/s, I'd rather let it burst for 1 second
>and complete the lookup faster instead of having it read data for 5
>seconds.
>
>If the guest now starts running multiple lookups one after the other,
>thats when I would like to limit.
Hi, Sasha,

If the iops or bps parameters are not specified for -drive, the I/O rate of this
disk will not be limited. Of course, QMP commands will be extended to support
changing or disabling disk I/O limits at runtime. If you do not want to limit a
disk's I/O rate, you can use them to disable this feature.

I'm not sure that this is the right answer for your

Re: [Qemu-devel] [PATCH 5/5] QMP: add server mode to QEMUMonitorProtocol

2011-06-02 Thread Daniel P. Berrange

On Thu, Jun 02, 2011 at 12:19:22PM +1000, Brad Hards wrote:
> On Thu, 2 Jun 2011 01:54:05 AM Luiz Capitulino wrote:
> > QEMU supports socket chardevs that establish connections like a server
> > or a client. 
> Is this protocol documented anywhere?

There are docs for the QMP monitor in the QMP/ subdirectory of the
QEMU source tree, while the chardev options are documented in the man
page.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2011-06-02 Thread Daniel P. Berrange

On Wed, Jun 01, 2011 at 04:35:03PM -0500, Anthony Liguori wrote:
> On 06/01/2011 04:12 PM, Luiz Capitulino wrote:
> >Hi there,
> >
> >There are people who want to use QMP for thin provisioning. That's, the VM is
> >started with a small storage and when a no space error is triggered, more 
> >space
> >is allocated and the VM is put to run again.
> >
> >QMP has two limitations that prevent people from doing this today:
> >
> >1. The BLOCK_IO_ERROR doesn't contain error information
> >
> >2. Considering we solve item 1, we still have to provide a way for clients
> >to query why a VM stopped. This is needed because clients may miss the
> >BLOCK_IO_ERROR event or may connect to the VM while it's already stopped
> >
> >A proposal to solve both problems follow.
> >
> >A. BLOCK_IO_ERROR information
> >-
> >
> >We already have discussed this a lot, but didn't reach a consensus. My 
> >solution
> >is quite simple: to add a stringfied errno name to the BLOCK_IO_ERROR event,
> >for example (see the "reason" key):
> >
> >{ "event": "BLOCK_IO_ERROR",
> >"data": { "device": "ide0-hd1",
> >  "operation": "write",
> >  "action": "stop",
> >  "reason": "enospc", }
> 
> you can call the reason whatever you want, but don't call it
> stringfied errno name :-)
> 
> In fact, just make reason "no space".
> 
> >"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
> >
> >Valid error reasons could be: "enospc", "eio", etc.
> 
> No etc :-)  Error reasons should we be well known and well documented.
> 
> >B. query-stop-reason
> >
> >
> >I also have a simple solution for item 2. The vm_stop() accepts a reason
> >argument, so we could store it somewhere and return it as a string, like:
> >
> >->  { "execute": "query-stop-reason" }
> ><- { "return": { "reason": "user" } }
> >
> >Valid reasons could be: "user", "debug", "shutdown", "diskfull" (hey,
> >this should be "ioerror", no?), "watchdog", "panic", "savevm", "loadvm",
> >"migrate".
> >
> >Also note that we have a STOP event. It should be extended with the
> >stop reason too, for completeness.
> 
> 
> Can we just extend query-block?

Primarily we want 'query-stop-reason' to tell us what caused the VM
CPUs to stop. If that reason was 'ioerror', then 'query-block' could
be used to find out which particular block device(s) caused the IO
error to occur & get the "reason" that was in the BLOCK_IO_ERROR
event.
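
The two-step flow described above can be sketched as follows. This is only an illustration under stated assumptions: the reason strings are the ones proposed in this thread (with "ioerror" in place of "diskfull"), and `StopReason`, `stop_reason_str` and `needs_query_block` are hypothetical names, not an existing QEMU or QMP interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stop reasons, taken from the list proposed in this thread. */
typedef enum {
    STOP_USER, STOP_DEBUG, STOP_SHUTDOWN, STOP_IOERROR, STOP_WATCHDOG,
    STOP_PANIC, STOP_SAVEVM, STOP_LOADVM, STOP_MIGRATE, STOP_REASON_MAX
} StopReason;

static const char *const stop_reason_names[STOP_REASON_MAX] = {
    "user", "debug", "shutdown", "ioerror", "watchdog",
    "panic", "savevm", "loadvm", "migrate",
};

/* String that a query-stop-reason style command might return. */
static const char *stop_reason_str(StopReason r)
{
    assert((int)r >= 0 && r < STOP_REASON_MAX);
    return stop_reason_names[r];
}

/* A client only needs the follow-up query-block call after an I/O error. */
static int needs_query_block(const char *reason)
{
    return reason != NULL && strcmp(reason, "ioerror") == 0;
}
```

Keeping the reasons in one documented table also fits the earlier request that error reasons be well known rather than open-ended.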

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] [RFC]QEMU disk I/O limits

2011-06-02 Thread Michal Suchanek

On 1 June 2011 05:12, Zhi Yong Wu  wrote:
> On Tue, May 31, 2011 at 03:55:49PM -0400, Vivek Goyal wrote:
>>Date: Tue, 31 May 2011 15:55:49 -0400
>>From: Vivek Goyal 
>>To: Zhi Yong Wu 
>>Cc: kw...@redhat.com, aligu...@us.ibm.com, stefa...@linux.vnet.ibm.com,
>>       k...@vger.kernel.org, guijianf...@cn.fujitsu.com,
>>       qemu-devel@nongnu.org, wu...@cn.ibm.com,
>>       herb...@gondor.hengli.com.au, luow...@cn.ibm.com, zh...@cn.ibm.com,
>>       zhaoy...@cn.ibm.com, l...@redhat.com, rahar...@us.ibm.com
>>Subject: Re: [Qemu-devel] [RFC]QEMU disk I/O limits
>>User-Agent: Mutt/1.5.21 (2010-09-15)
>>
>>On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote:
>>
>>[..]
>>>     3.) How the users enable and play with it
>>>     QEMU -drive option will be extended so that disk I/O limits can be 
>>> specified on its command line, such as -drive [iops=xxx,][throughput=xxx] 
>>> or -drive [iops_rd=xxx,][iops_wr=xxx,][throughput=xxx] etc. When this 
>>> argument is specified, it means that "disk I/O limits" feature is enabled 
>>> for this drive disk.
>>
>>How does throughput interface look like? is it bytes per second or something
>>else?
> HI, Vivek,
> It will be a value based on bytes per second.
>
>>
>>Do we have read and write variants for throughput as we have for iops.
> QEMU code has two variants "rd_bytes, wr_bytes", but we maybe need to get 
> their bytes per second.
>
>>
>>if you have bytes interface(as kenrel does), then "bps_rd" and "bps_wr"
>>might be good names too for thoughput interface.
> I agree with you, and can change them as your suggestions.
>

Changing them this way is not going to be an improvement. While
rd_bytes and wr_bytes lack the time interval specification, bps_rd and
bps_wr are ambiguous. Is that bits? Bytes? Sure, there should be some
distinction by capitalization, but that does not apply since qemu
arguments are all lowercase.

Thanks

Michal

Re: [Qemu-devel] [RFC PATCH 01/13] Generic DMA memory access interface

2011-06-02 Thread David Gibson

On Wed, Jun 01, 2011 at 08:45:56AM -0700, Richard Henderson wrote:
> On 06/01/2011 08:35 AM, Eduard - Gabriel Munteanu wrote:
> > Maybe it's not nice, but you're missing the fact upcasting gives you
> > some type safety. With opaques you have none.
> 
> Lol.  Do you understand what container_of does?
> This is not dynamic_cast<> with RTTI.
> 
> You can put any type name in there that you like,
> so long as it has a field name to match.  The type
> of the field you give doesn't even have to match
> the type of the pointer that you pass in.

Uh, if that's true, that's a bug in the container_of implementation.
The ccan container_of implementation, for example, certainly does
check that the given field has type matching the pointer.
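
A type-checked container_of can be sketched as below. This is not the ccan implementation; it is a minimal illustration that assumes the GCC/Clang extensions `__typeof__` and `__builtin_types_compatible_p`, and the names `container_of_checked`, `DemoIommu` and `DemoDevice` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Like container_of, but the array size becomes -1 (a compile error)
 * when *ptr and the named member do not have compatible types. */
#define container_of_checked(ptr, type, member)                              \
    ((type *)((char *)(ptr) - offsetof(type, member) +                       \
              0 * sizeof(char[__builtin_types_compatible_p(                  \
                                  __typeof__(*(ptr)),                        \
                                  __typeof__(((type *)0)->member)) ? 1 : -1])))

typedef struct DemoIommu { int id; } DemoIommu;

typedef struct DemoDevice {
    int devfn;
    DemoIommu iommu;    /* embedded, so the device can be recovered from it */
} DemoDevice;

/* Upcast from the embedded IOMMU state back to its containing device. */
static DemoDevice *device_from_iommu(DemoIommu *m)
{
    assert(m != NULL);
    return container_of_checked(m, DemoDevice, iommu);
}
```

Passing a pointer of the wrong type, or naming `devfn` instead of `iommu`, fails to compile, which is the safety that a plain opaque `void *` cannot give.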

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson

Re: [Qemu-devel] [PATCH] Fix compilation warning due to missing header for sigaction

2011-06-02 Thread Andreas Färber


Am 02.06.2011 um 04:21 schrieb Alexandre Raymond:

Fix the following warning by including signal.h directly in qemu-common.h

8<
iohandler.c: In function ‘qemu_init_child_watch’:
iohandler.c:172: warning: implicit declaration of function ‘sigaction’
iohandler.c:172: warning: nested extern declaration of ‘sigaction’
8<

Signed-off-by: Alexandre Raymond 


Tested-by: Andreas Färber 

Why in qemu-common.h and not in iohandler.c though?
If we put it into qemu-common.h, you should remove other inclusions of  
signal.h.


Andreas


---
qemu-common.h |1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/qemu-common.h b/qemu-common.h
index b851b20..39fabc9 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -39,6 +39,7 @@ typedef struct Monitor Monitor;
#include 
#include 
#include 
+#include <signal.h>

#ifdef _WIN32
#include "qemu-os-win32.h"
--
1.7.5
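
For reference, the warning goes away as soon as the translation unit that calls sigaction() sees <signal.h>. The sketch below is a standalone illustration, not the code from iohandler.c; `install_child_watch` and `on_sigchld` are hypothetical names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose sigaction() under strict -std= modes */
#endif
#include <assert.h>
#include <signal.h>   /* declares sigaction(); the header the patch adds */
#include <string.h>

static volatile sig_atomic_t child_events;

static void on_sigchld(int signo)
{
    (void)signo;
    child_events = 1;   /* only async-signal-safe work in a handler */
}

/* Installs a SIGCHLD handler; returns 0 on success, -1 on failure. */
static int install_child_watch(void)
{
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    act.sa_handler = on_sigchld;
    int rc = sigemptyset(&act.sa_mask);
    assert(rc == 0);
    (void)rc;
    act.sa_flags = SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &act, NULL);
}
```

Whether the include belongs in the common header or only in the file that needs it is the open question raised above; the code compiles either way.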

Re: [Qemu-devel] [PATCH] target-arm: Fix compilation failure for 64 bit hosts

2011-06-02 Thread Peter Maydell

Ping?

thanks
-- PMM

On 26 May 2011 18:57, Stefan Weil  wrote:
> Am 26.05.2011 13:03, schrieb Peter Maydell:
>>
>> Use the correct _ptr aliases for manipulating the pointer to
>> the fp_status; this fixes a compilation failure on 64 bit hosts.
>>
>> Signed-off-by: Peter Maydell
>> ---
>> Apologies for the build breakage.
>>
>>  target-arm/translate.c |   18 +-
>>  1 files changed, 9 insertions(+), 9 deletions(-)
>>
>>
>
> Acked-by: Stefan Weil

Re: [Qemu-devel] [RFC PATCH 01/13] Generic DMA memory access interface

2011-06-02 Thread David Gibson

On Wed, Jun 01, 2011 at 08:16:44AM -0700, Richard Henderson wrote:
> On 06/01/2011 07:29 AM, Avi Kivity wrote:
> > On 06/01/2011 05:01 PM, Richard Henderson wrote:
> >> >  +err = dev->mmu->translate(dev, addr,&paddr,&plen, is_write);
> >>
> >> I see you didn't take my suggestion for using an opaque callback pointer.
> >> Really and truly, I won't be able to use this as-is for Alpha.
> >>
> > 
> > Rather than opaques, please pass the DMA engine itself and use 
> > container_of().
> 
> The dma engine object is currently sitting in the PCIBus structure.
> Which is private, and can't be extended by a host bridge implementation.
> 
> The entire code could be re-arranged, true, but please suggest something
> reasonable.
> 
> > We should be removing opaques, not adding them.
> 
> See my followup elsewhere.  Opaques *can* be cleaner than upcasting,
> particularly if there are too many hoops through which to jump.

So, in the meantime, I've also done a version of Eduard's earlier
patches, with added support for the PAPR hypervisor managed IOMMU.

I have also significantly reworked how the structure lookup works,
partly because in my case I'm looking at IOMMU translation for non-PCI
devices, but I think it may also address your concerns.  I'm still
using upcasts, but there are fewer steps from the device to the IOMMU
state.

I've been sick and haven't had a chance to merge my stuff with
Eduard's changes.  I'll post them anyway, as another discussion
point.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson

Re: [Qemu-devel] virtio scsi host draft specification, v2

2011-06-02 Thread Michael S. Tsirkin

On Wed, Jun 01, 2011 at 05:51:54PM +0300, Avi Kivity wrote:
> On 06/01/2011 05:36 PM, Michael S. Tsirkin wrote:
> >>
> >>  So, if I am going to give this liberty with buffers to the driver, I
> >>  _have_ to keep the size information.  Otherwise, I agree that it is
> >>  redundant and I will remove it.  What poison do you prefer?
> >>
> >
> >Ah, I think I understand now. Both sense and data have in
> >fields that might only be used partially?
> >In that case I think I agree: it's best to require the use of separate
> >buffers for them, in this way used len will give you useful information
> >and you won't need sense_len and data_len: just a flag to
> >mark the fact that there *is* a sense buffer following.
> >And the num field does that.
> 
> 
> Do you mean to use the virtio iovec length to determine information
> about the message (like splitting it into buffers)?

Exactly the reverse :)

> I think that's a bad idea.  Splitting into buffers is a function of
> memory management.  For example, a driver in userspace (or a nested
> guest) will have additional fragmentation into 4K pages after it
> passes through the iommu.
> 
> Let's not mix layers here.

Right. If there are two buffers of variable length there
should be two add_buf calls.

> -- 
> error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH] cocoa: Provide central qemu_main() prototype

2011-06-02 Thread Andreas Färber


Am 30.05.2011 um 01:53 schrieb Alexandre Raymond:


Just thinking out loud here : wouldn't it make more sense to put the
main() of each gui framework directly in its corresponding file and
select the right one in the makefile using the configure options?

so you'd have
-no gui -> ui/no_gui.c:main() -> qemu_main() === compile with no_gui.c + vl.c
-sdl -> ui/sdl.c:main() -> qemu_main() === compile with sdl.c + vl.c
-cocoa -> ui/cocoa.m:main() -> qemu_main() === compile with cocoa.m + vl.c


with ui/no_gui.c, ui/sdl.c and ui/cocoa.m each having their own  
main():

8<
...
int main(...) {
   return qemu_main();
}
...
8<

and definitively rename main() to qemu_main() in vl.c ?


Anthony, waiting on your comment here as it's an overall UI  
architectural question.


To me that sounds the wrong direction to fix this... The only frontend  
that forces another main() function on us seems to be SDL under some  
circumstances. For Cocoa that was a QEMU-internal choice.


Instead of always renaming our main() function, maybe we can introduce  
some general hooks from our main() that the frontends can use to  
initialize them? One hook would need to be before processing of  
options (since launching a Cocoa app may add some Cocoa-specific  
parameters from the desktop or AppleScript/Automator) and another one  
once the options are processed and it's clear what display mode we're  
in.
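
The hook idea could look roughly like the sketch below. Everything here is hypothetical: `FrontendHooks`, `run_with_hooks` and the demo hooks are illustrative names only, and nothing like this exists in vl.c today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook table; none of these names exist in QEMU. */
typedef struct {
    const char *name;
    void (*pre_options)(int *argc, char ***argv);   /* before option parsing */
    void (*post_options)(const char *display_mode); /* display mode is known */
} FrontendHooks;

static int pre_calls, post_calls;   /* counters for the demo hooks below */

static void demo_pre(int *argc, char ***argv)
{
    (void)argc; (void)argv;         /* a real hook could append arguments here */
    pre_calls++;
}

static void demo_post(const char *display_mode)
{
    (void)display_mode;
    post_calls++;
}

/* Stand-in for a shared main(): calls the hooks around option processing. */
static int run_with_hooks(int argc, char **argv, const FrontendHooks *ui)
{
    assert(ui != NULL && argv != NULL);
    if (ui->pre_options) {
        ui->pre_options(&argc, &argv);
    }
    /* Option processing would happen here; the demo just peeks at argv[1]. */
    const char *mode = (argc > 1) ? argv[1] : "default";
    if (ui->post_options) {
        ui->post_options(mode);
    }
    return 0;
}
```

A frontend that needs neither hook would leave both pointers NULL, so main() keeps its name and only the frontends that care register anything.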


So should we go ahead with my patch for the next pull or do some  
reorganization touching all frontends?


Andreas

On Sun, May 29, 2011 at 3:58 PM, Andreas Färber wrote:

This fixes a missing prototype warning in vl.c and obsoletes
the prototype in cocoa.m. Adjust callers in cocoa.m to supply
third argument, which is currently only used on Linux/ppc.

The prototype is designed so that it could be shared with SDL
and other frontends, if desired.

Cc: Alexandre Raymond 
Signed-off-by: Andreas Färber 
---
 qemu-common.h |5 +
 ui/cocoa.m|6 +++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/qemu-common.h b/qemu-common.h
index b851b20..218289c 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -131,6 +131,11 @@ static inline char *realpath(const char *path, char *resolved_path)


 #endif /* !defined(NEED_CPU_H) */

+/* main function, renamed */
+#if defined(CONFIG_COCOA)
+int qemu_main(int argc, char **argv, char **envp);
+#endif
+
 /* bottom halves */
 typedef void QEMUBHFunc(void *opaque);

diff --git a/ui/cocoa.m b/ui/cocoa.m
index 1ff1ac6..6566e46 100644
--- a/ui/cocoa.m
+++ b/ui/cocoa.m
@@ -23,6 +23,7 @@
 */

 #import 
+#include 

 #include "qemu-common.h"
 #include "console.h"
@@ -61,7 +62,6 @@ typedef struct {
int bitsPerPixel;
 } QEMUScreen;

-int qemu_main(int argc, char **argv); // main defined in qemu/vl.c
 NSWindow *normalWindow;
 id cocoaView;
 static DisplayChangeListener *dcl;
@@ -794,7 +794,7 @@ static int cocoa_keycode_to_qemu(int keycode)
COCOA_DEBUG("QemuCocoaAppController: startEmulationWithArgc\n");

int status;
-status = qemu_main(argc, argv);
+status = qemu_main(argc, argv, *_NSGetEnviron());
exit(status);
 }

@@ -876,7 +876,7 @@ int main (int argc, const char * argv[]) {
!strcmp(opt, "-nographic") ||
!strcmp(opt, "-version") ||
!strcmp(opt, "-curses")) {
-return qemu_main(gArgc, gArgv);
+return qemu_main(gArgc, gArgv, *_NSGetEnviron());
}
}
}
--
1.7.5.3

Re: [Qemu-devel] virtio scsi host draft specification, v2

2011-06-02 Thread Michael S. Tsirkin

BTW, I think Cc'ing the virtio mailing list
on the next version is a good idea.

On Wed, Jun 01, 2011 at 05:59:28PM +0200, Paolo Bonzini wrote:
> On 06/01/2011 04:36 PM, Michael S. Tsirkin wrote:
> >Ah, I think I understand now. Both sense and data have in
> >fields that might only be used partially?
> >In that case I think I agree: it's best to require the use of separate
> >buffers for them, in this way used len will give you useful information
> >and you won't need sense_len and data_len: just a flag to
> >mark the fact that there*is*  a sense buffer following.
> 
> If the device wants a sense buffer to be there always, that's
> sensible.  No flag needed here, then.  Also, sense is always "in",
> there is no out.
> 
> But I do not understand how the used len helps me.  If I read the
> spec correctly, the length will be the number of bytes written, but
> this will always point to after the last field.  If sense or data
> are written partially, this will not be written in the fields---in
> fact, virtio_blk does contain both sense_len and residual.  Its
> sense field is fixed size, which is probably why it doesn't contain
> something like datain_size (there is just one variable-size field).
> 
> Strictly speaking I wouldn't need dataout_size too, because I have
> only one variable-size read-only field, but I prefer to be
> future-proof.
> 
> I think the following is a good compromise:
> 
> 1) keep dataout_size, datain_size and sense_size;
> 
> 2) dataout, datain and sense shall each start a separate buffer, and
> sense shall be contained in a single buffer; it is permissible to
> put sense and the subsequent fields in the same buffer.  This will
> make it easy for the QEMU implementation to pick up its iovecs.
> 
> It will also let the device detect mistakes in filling the data
> sizes.

I am not sure whether the length info in the header is redundant. If so, I
think it's better not to duplicate it on the assumption that this will
let us detect bugs: the bugs will be in the header as likely as not.
If the overlap is not complete, some redundancy is not too bad.

> In practice I expect 3 descriptors will be used (one direct
> for read-only stuff up to data; one possibly indirect for data; one
> direct for sense and other write-only stuff).
> 
> >However some questions:
> >1. I think you don't need numdatain/numdataout: each
> >buffer can include in and out segments. Just tell device how many
> >buffers are there.
> 
> I don't understand.
> 
> Paolo

I think I didn't express myself clearly. Follows a somewhat
lengthy background to make sure we use the same terms.
Feel free to ignore until 

There seems to be some confusion about the terminology: the term buffers
is confusing. I'm guilty of mixing terms too. Let's refer to the virtio
spec.  There are two kinds of entities there:

- descriptor: each descriptor points to a location
  in memory. It can be in or out. descriptors can be
  chained together.

- head: internally, points to a chain of descriptors, both in and out.
  * driver makes head available to device, thus adding some
memory for out + some memory for in.
  * device uses the memory, and reports to driver how much
in memory was used. The assumption is always that
device consumes in memory from the beginning
in a contiguous way. That's why length is enough.
That's also why it's called add_buf: conceptually
it is a single buffer.
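
To make the "length is enough" point concrete, here is a minimal sketch. It assumes, as described above, that the device fills the in memory contiguously from the start of the chain; `split_used_len` is a hypothetical helper, not part of any virtio implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Splits a single "used" length over a chain of device-writable ("in")
 * descriptors, assuming the device fills them contiguously from the start.
 * written[i] receives the number of bytes that landed in descriptor i.
 * Returns 0 on success, -1 if used exceeds the chain's total size. */
static int split_used_len(const size_t *desc_len, size_t ndesc,
                          size_t used, size_t *written)
{
    assert(desc_len != NULL && written != NULL);
    for (size_t i = 0; i < ndesc; i++) {
        size_t n = used < desc_len[i] ? used : desc_len[i];
        written[i] = n;
        used -= n;
    }
    return used == 0 ? 0 : -1;
}
```

Because the mapping from one length to per-descriptor byte counts is fully determined, how the driver chose to fragment the memory carries no extra meaning.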

Drivers and devices never operate in terms of descriptors: these are
internal to the virtio ring transport.  Instead, they always operate in
terms of heads. The virtio interface does let you pass in
s/g entries, but this is an artifact of Linux.

At the moment many devices in qemu assume that drivers put some info in
specific descriptors. This is not a good idea, they should always
operate in terms of offsets from start of descriptor.  vhost does this
correctly BTW.  I posted a patch to fix that previously, need to dust it
up and merge.

Now to our problem:
As far as I can tell there are two input buffers in each request: sense
and data. Right?

If sense is fixed length, we can simply put it first, have device write
sense then data.  This does not seem too limiting, if you want a lot of
flexibility sense length can be in device config.  If we don't want to
limit ourselves to fixed length sense, we would have driver use two
heads for a request.  This is possible but one needs to be careful in
the driver to make sure there's enough space for both requests. Maybe
add_bufs API to add multiple bufs might be a good idea here.
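
For the fixed-length sense variant, the driver could recover both lengths from the single used length. The sketch below assumes the device always writes sense first and then data; `DEMO_SENSE_SIZE`, `DemoInSplit` and `split_in_area` are hypothetical, and the size 96 is only a placeholder for whatever the device config would advertise.

```c
#include <stddef.h>

#define DEMO_SENSE_SIZE 96  /* placeholder; could come from device config */

typedef struct {
    size_t sense_len;   /* bytes of the fixed sense area that were written */
    size_t data_len;    /* bytes of data written after the sense area */
} DemoInSplit;

/* Derives sense and data lengths from the used length of one head. */
static DemoInSplit split_in_area(size_t used)
{
    DemoInSplit s;
    s.sense_len = used < DEMO_SENSE_SIZE ? used : DEMO_SENSE_SIZE;
    s.data_len = used - s.sense_len;
    return s;
}
```

This only works because the sense area has a known size; with two variable-length areas the split is ambiguous, which is why the alternative above needs two heads.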

-- 
MST

Re: [Qemu-devel] virtio scsi host draft specification, v2

2011-06-02 Thread Avi Kivity


On 06/02/2011 01:42 PM, Michael S. Tsirkin wrote:

On Wed, Jun 01, 2011 at 05:51:54PM +0300, Avi Kivity wrote:
>  On 06/01/2011 05:36 PM, Michael S. Tsirkin wrote:
>  >>
>  >>   So, if I am going to give this liberty with buffers to the driver, I
>  >>   _have_ to keep the size information.  Otherwise, I agree that it is
>  >>   redundant and I will remove it.  What poison do you prefer?
>  >>
>  >
>  >Ah, I think I understand now. Both sense and data have in
>  >fields that might only be used partially?
>  >In that case I think I agree: it's best to require the use of separate
>  >buffers for them, in this way used len will give you useful information
>  >and you won't need sense_len and data_len: just a flag to
>  >mark the fact that there *is* a sense buffer following.
>  >And the num field does that.
>
>
>  Do you mean to use the virtio iovec length to determine information
>  about the message (like splitting it into buffers)?

Exactly the reverse :)


They're both equally bad.


>  I think that's a bad idea.  Splitting into buffers is a function of
>  memory management.  For example, a driver in userspace (or a nested
>  guest) will have additional fragmentation into 4K pages after it
>  passes through the iommu.
>
>  Let's not mix layers here.

Right. If there are two buffers of variable length there
should be two add_buf calls.


No.  The guest should be free to use one large continuous buffer of size
N, or N buffers of size 1.


--
error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH 01/14] kvm: remove fop write only variable

2011-06-02 Thread Juan Quintela


Signed-off-by: Juan Quintela 
---
 target-i386/kvm.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index faedc6c..58a70bc 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -970,7 +970,7 @@ static int kvm_get_xsave(CPUState *env)
 #ifdef KVM_CAP_XSAVE
 struct kvm_xsave* xsave;
 int ret, i;
-uint16_t cwd, swd, twd, fop;
+uint16_t cwd, swd, twd;

 if (!kvm_has_xsave()) {
 return kvm_get_fpu(env);
@@ -986,7 +986,6 @@ static int kvm_get_xsave(CPUState *env)
 cwd = (uint16_t)xsave->region[0];
 swd = (uint16_t)(xsave->region[0] >> 16);
 twd = (uint16_t)xsave->region[1];
-fop = (uint16_t)(xsave->region[1] >> 16);
 env->fpstt = (swd >> 11) & 7;
 env->fpus = swd;
 env->fpuc = cwd;
-- 
1.7.5.2

[Qemu-devel] [PATCH 00/14] More gcc 4.6 warnings fixes

2011-06-02 Thread Juan Quintela

Hi

This series fixes compilation of plain:

./configure

with gcc-4.6.0 on Fedora 15.

Notes:
- linuxload.c: why does it define id_change if it never tests its value?
  git log shows that it has been this way forever
- linux-user/syscall.c: do we want to return an error in the default case?
  my guess is yes, but ...
- mips: it puts 8 arguments on the stack, but do_syscall() only uses 6.
  at least some syscalls already use 7 arguments, so I guess this has
  never worked before.

- for this kind of warning, I have added:
   (void)unused_var;
  Alternatively, we can remove the variable altogether, comment it out,
  or use __attribute__((unused)).

- linux-user, syscall for alpha.  can anyone check that my s/arg1/how/
  is the right change?  Looking at the normal sigprocmask call
  emulation, it looks like my change is right, but one never knows.
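
As an illustration of the options for a variable that is set but never
read, here is a minimal sketch. The function and variable names are made
up for the example, and __attribute__((unused)) is the GCC spelling,
assumed to be available in the compiler used.

```c
#include <stdint.h>

/* Option 1: keep the variable and mark it as deliberately unused. */
uint16_t low_half_cast(uint32_t reg)
{
    uint16_t lo, hi;

    lo = (uint16_t)reg;
    hi = (uint16_t)(reg >> 16);   /* written, never read */
    (void)hi;                     /* counts as a use, so no warning */
    return lo;
}

/* Option 2: GCC attribute on the declaration. */
uint16_t low_half_attr(uint32_t reg)
{
    uint16_t lo;
    uint16_t hi __attribute__((unused));

    lo = (uint16_t)reg;
    hi = (uint16_t)(reg >> 16);
    return lo;
}

/* Option 3: remove the variable altogether. */
uint16_t low_half_removed(uint32_t reg)
{
    return (uint16_t)reg;
}
```

All three return the same value; they differ only in how much dead code
is left behind for a later reader.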

Later, Juan.


Juan Quintela (14):
  kvm: remove fop write only variable
  tcg: define and set call_type only when it is used
  flatload: memp was a write-only variable
  xen: pentry is not used in this function
  linuxload: id_change was a write only variable
  flatload: end_code was only used in a debug message
  alpha: fn2 was a write only variable
  syscall: really return ret code
  exec: last_first_tb was only used in !ONLY_USER case
  mips: we really need the extra arguments
  linux-user: fpu_save_addr is not used
  linux-user: syscall should use sanitized arg1
  alpha: disp12 is not used for USER emulation
  lsi53c895a: current_dev is not used

 exec.c   |   10 +++---
 hw/lsi53c895a.c  |2 --
 linux-user/flatload.c|8 ++--
 linux-user/linuxload.c   |   25 +
 linux-user/main.c|4 
 linux-user/signal.c  |3 ++-
 linux-user/syscall.c |   10 +-
 target-alpha/translate.c |   10 +++---
 target-i386/kvm.c|3 +--
 tcg/tcg.c|9 +++--
 xen-mapcache.c   |3 +--
 11 files changed, 33 insertions(+), 54 deletions(-)

-- 
1.7.5.2

[Qemu-devel] [PATCH 02/14] tcg: define and set call_type only when it is used

2011-06-02 Thread Juan Quintela


Signed-off-by: Juan Quintela 
---
 tcg/tcg.c |9 +++--
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index fad92f9..2d180a5 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -585,9 +585,6 @@ void tcg_register_helper(void *func, const char *name)
 void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
int sizemask, TCGArg ret, int nargs, TCGArg *args)
 {
-#ifdef TCG_TARGET_I386
-int call_type;
-#endif
 int i;
 int real_args;
 int nb_rets;
@@ -612,9 +609,6 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,

 *gen_opc_ptr++ = INDEX_op_call;
 nparam = gen_opparam_ptr++;
-#ifdef TCG_TARGET_I386
-call_type = (flags & TCG_CALL_TYPE_MASK);
-#endif
 if (ret != TCG_CALL_DUMMY_ARG) {
 #if TCG_TARGET_REG_BITS < 64
 if (sizemask & 1) {
@@ -641,6 +635,9 @@ void tcg_gen_callN(TCGContext *s, TCGv_ptr func, unsigned int flags,
 int is_64bit = sizemask & (1 << (i+1)*2);
 if (is_64bit) {
 #ifdef TCG_TARGET_I386
+int call_type;
+
+call_type = (flags & TCG_CALL_TYPE_MASK);
 /* REGPARM case: if the third parameter is 64 bit, it is
allocated on the stack */
 if (i == 2 && call_type == TCG_CALL_TYPE_REGPARM) {
-- 
1.7.5.2

[Qemu-devel] [PATCH 04/14] xen: pentry is not used in this function

2011-06-02 Thread Juan Quintela


Signed-off-by: Juan Quintela 
---
 xen-mapcache.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/xen-mapcache.c b/xen-mapcache.c
index 349cc62..a2419dc 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -198,7 +198,7 @@ uint8_t *qemu_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size, u

 void qemu_map_cache_unlock(void *buffer)
 {
-MapCacheEntry *entry = NULL, *pentry = NULL;
+MapCacheEntry *entry = NULL;
 MapCacheRev *reventry;
 target_phys_addr_t paddr_index;
 int found = 0;
@@ -218,7 +218,6 @@ void qemu_map_cache_unlock(void *buffer)

 entry = &mapcache->entry[paddr_index % mapcache->nr_buckets];
 while (entry && entry->paddr_index != paddr_index) {
-pentry = entry;
 entry = entry->next;
 }
 if (!entry) {
-- 
1.7.5.2

[Qemu-devel] [PATCH 06/14] flatload: end_code was only used in a debug message

2011-06-02 Thread Juan Quintela

Just unfold its definition in its only use.

Signed-off-by: Juan Quintela 
---
 linux-user/flatload.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/linux-user/flatload.c b/linux-user/flatload.c
index 580bc21..8dad5df 100644
--- a/linux-user/flatload.c
+++ b/linux-user/flatload.c
@@ -383,7 +383,7 @@ static int load_flat_file(struct linux_binprm * bprm,
 abi_ulong reloc = 0, rp;
 int i, rev, relocs = 0;
 abi_ulong fpos;
-abi_ulong start_code, end_code;
+abi_ulong start_code;
 abi_ulong indx_len;

 hdr = ((struct flat_hdr *) bprm->buf); /* exec-header */
@@ -549,11 +549,10 @@ static int load_flat_file(struct linux_binprm * bprm,

 /* The main program needs a little extra setup in the task structure */
 start_code = textpos + sizeof (struct flat_hdr);
-end_code = textpos + text_len;

 DBG_FLT("%s %s: TEXT=%x-%x DATA=%x-%x BSS=%x-%x\n",
 id ? "Lib" : "Load", bprm->filename,
-(int) start_code, (int) end_code,
+(int) start_code, (int) (textpos + text_len),
 (int) datapos,
 (int) (datapos + data_len),
 (int) (datapos + data_len),
-- 
1.7.5.2

[Qemu-devel] [PATCH 08/14] syscall: really return ret code

2011-06-02 Thread Juan Quintela

We assign ret with the error code, but then return 0 unconditionally.

Signed-off-by: Juan Quintela 
---
 linux-user/syscall.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 5cb27c7..f3d03b0 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -3751,10 +3751,10 @@ static abi_long do_get_thread_area(CPUX86State *env, abi_ulong ptr)
 #ifndef TARGET_ABI32
 static abi_long do_arch_prctl(CPUX86State *env, int code, abi_ulong addr)
 {
-abi_long ret;
+abi_long ret = 0;
 abi_ulong val;
 int idx;
-
+
 switch(code) {
 case TARGET_ARCH_SET_GS:
 case TARGET_ARCH_SET_FS:
@@ -3773,13 +3773,13 @@ static abi_long do_arch_prctl(CPUX86State *env, int code, abi_ulong addr)
 idx = R_FS;
 val = env->segs[idx].base;
 if (put_user(val, addr, abi_ulong))
-return -TARGET_EFAULT;
+ret = -TARGET_EFAULT;
 break;
 default:
 ret = -TARGET_EINVAL;
 break;
 }
-return 0;
+return ret;
 }
 #endif

-- 
1.7.5.2
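
The bug pattern described in the commit message above can be reduced to
the following sketch. The function names and the error constant are
hypothetical stand-ins for illustration, not the QEMU code.

```c
#define EX_EINVAL 22   /* hypothetical error number for the sketch */

/* Buggy shape: the error is stored in 'ret' but 0 is returned regardless. */
long handle_code_buggy(int code)
{
    long ret = 0;

    switch (code) {
    case 1:
        break;              /* supported code */
    default:
        ret = -EX_EINVAL;   /* assigned ... */
        break;
    }
    (void)ret;
    return 0;               /* ... but never returned */
}

/* Fixed shape: every path reports through 'ret'. */
long handle_code_fixed(int code)
{
    long ret = 0;

    switch (code) {
    case 1:
        break;
    default:
        ret = -EX_EINVAL;
        break;
    }
    return ret;
}
```

A write-only 'ret' is exactly what the gcc 4.6 set-but-unused warning
flags, which is how this kind of bug tends to surface.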

[Qemu-devel] [PATCH 12/14] linux-user: syscall should use sanitized arg1

2011-06-02 Thread Juan Quintela

Looking at the other architectures, we should be using "how" not "arg1".

Signed-off-by: Juan Quintela 
---
 linux-user/syscall.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f3d03b0..c90fcc2 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -7077,7 +7077,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 }
 mask = arg2;
 target_to_host_old_sigset(&set, &mask);
-sigprocmask(arg1, &set, &oldset);
+sigprocmask(how, &set, &oldset);
 host_to_target_old_sigset(&mask, &oldset);
 ret = mask;
 }
-- 
1.7.5.2
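
For context, a sanitized "how" usually means that the guest's value has
been translated into the host's constant before the host call is made.
The sketch below shows that idea only; all constants and the function
name are hypothetical, and real values depend on the target and host
ABIs.

```c
/* Hypothetical guest-side and host-side encodings of the 'how' argument. */
enum { TGT_SIG_BLOCK = 1, TGT_SIG_UNBLOCK = 2, TGT_SIG_SETMASK = 3 };
enum { HOST_SIG_BLOCK = 0, HOST_SIG_UNBLOCK = 1, HOST_SIG_SETMASK = 2 };

/*
 * Translate the raw guest argument into a host value.
 * Returns 0 on success, -1 if the guest passed an unknown value;
 * '*how' is left untouched on failure.
 */
int sanitize_how(long arg1, int *how)
{
    switch (arg1) {
    case TGT_SIG_BLOCK:
        *how = HOST_SIG_BLOCK;
        return 0;
    case TGT_SIG_UNBLOCK:
        *how = HOST_SIG_UNBLOCK;
        return 0;
    case TGT_SIG_SETMASK:
        *how = HOST_SIG_SETMASK;
        return 0;
    default:
        return -1;
    }
}
```

Passing the raw arg1 to the host call would silently use the wrong
operation whenever the two encodings differ.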

[Qemu-devel] [PATCH 03/14] flatload: memp was a write-only variable

2011-06-02 Thread Juan Quintela


Signed-off-by: Juan Quintela 
---
 linux-user/flatload.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/linux-user/flatload.c b/linux-user/flatload.c
index cd7af7c..580bc21 100644
--- a/linux-user/flatload.c
+++ b/linux-user/flatload.c
@@ -379,7 +379,6 @@ static int load_flat_file(struct linux_binprm * bprm,
 abi_long result;
 abi_ulong realdatastart = 0;
 abi_ulong text_len, data_len, bss_len, stack_len, flags;
-abi_ulong memp = 0; /* for finding the brk area */
 abi_ulong extra;
 abi_ulong reloc = 0, rp;
 int i, rev, relocs = 0;
@@ -491,7 +490,6 @@ static int load_flat_file(struct linux_binprm * bprm,
 }

 reloc = datapos + (ntohl(hdr->reloc_start) - text_len);
-memp = realdatastart;

 } else {

@@ -506,7 +504,6 @@ static int load_flat_file(struct linux_binprm * bprm,
 realdatastart = textpos + ntohl(hdr->data_start);
 datapos = realdatastart + indx_len;
 reloc = (textpos + ntohl(hdr->reloc_start) + indx_len);
-memp = textpos;

 #ifdef CONFIG_BINFMT_ZFLAT
 #error code needs checking
-- 
1.7.5.2

[Qemu-devel] [PATCH 05/14] linuxload: id_change was a write only variable

2011-06-02 Thread Juan Quintela


Signed-off-by: Juan Quintela 
---
 linux-user/linuxload.c |   25 +
 1 files changed, 1 insertions(+), 24 deletions(-)

diff --git a/linux-user/linuxload.c b/linux-user/linuxload.c
index ac8c486..62ebc7e 100644
--- a/linux-user/linuxload.c
+++ b/linux-user/linuxload.c
@@ -26,22 +26,6 @@ abi_long memcpy_to_target(abi_ulong dest, const void *src,
 return 0;
 }

-static int in_group_p(gid_t g)
-{
-/* return TRUE if we're in the specified group, FALSE otherwise */
-intngroup;
-inti;
-gid_t  grouplist[NGROUPS];
-
-ngroup = getgroups(NGROUPS, grouplist);
-for(i = 0; i < ngroup; i++) {
-   if(grouplist[i] == g) {
-   return 1;
-   }
-}
-return 0;
-}
-
 static int count(char ** vec)
 {
 inti;
@@ -57,7 +41,7 @@ static int prepare_binprm(struct linux_binprm *bprm)
 {
 struct statst;
 int mode;
-int retval, id_change;
+int retval;

 if(fstat(bprm->fd, &st) < 0) {
return(-errno);
@@ -73,14 +57,10 @@ static int prepare_binprm(struct linux_binprm *bprm)

 bprm->e_uid = geteuid();
 bprm->e_gid = getegid();
-id_change = 0;

 /* Set-uid? */
 if(mode & S_ISUID) {
bprm->e_uid = st.st_uid;
-   if(bprm->e_uid != geteuid()) {
-   id_change = 1;
-   }
 }

 /* Set-gid? */
@@ -91,9 +71,6 @@ static int prepare_binprm(struct linux_binprm *bprm)
  */
 if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
bprm->e_gid = st.st_gid;
-   if (!in_group_p(bprm->e_gid)) {
-   id_change = 1;
-   }
 }

 retval = read(bprm->fd, bprm->buf, BPRM_BUF_SIZE);
-- 
1.7.5.2

[Qemu-devel] [PATCH 14/14] lsi53c895a: current_dev is not used

2011-06-02 Thread Juan Quintela


Signed-off-by: Juan Quintela 
---
 hw/lsi53c895a.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index 83084b6..90c6cbc 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -889,7 +889,6 @@ static void lsi_do_msgout(LSIState *s)
 uint8_t msg;
 int len;
 uint32_t current_tag;
-SCSIDevice *current_dev;
 lsi_request *current_req, *p, *p_next;
 int id;

@@ -901,7 +900,6 @@ static void lsi_do_msgout(LSIState *s)
 current_req = lsi_find_by_tag(s, current_tag);
 }
 id = (current_tag >> 8) & 0xf;
-current_dev = s->bus.devs[id];

 DPRINTF("MSG out len=%d\n", s->dbc);
 while (s->dbc) {
-- 
1.7.5.2

[Qemu-devel] [PATCH 07/14] alpha: fn2 was a write only variable

2011-06-02 Thread Juan Quintela


Signed-off-by: Juan Quintela 
---
 target-alpha/translate.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 456ba51..5c11cf2 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -1469,7 +1469,7 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 uint32_t palcode;
 int32_t disp21, disp16, disp12;
 uint16_t fn11;
-uint8_t opc, ra, rb, rc, fpfn, fn7, fn2, islit, real_islit;
+uint8_t opc, ra, rb, rc, fpfn, fn7, islit, real_islit;
 uint8_t lit;
 ExitStatus ret;

@@ -1491,7 +1491,6 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 fn11 = (insn >> 5) & 0x07FF;
 fpfn = fn11 & 0x3F;
 fn7 = (insn >> 5) & 0x007F;
-fn2 = (insn >> 5) & 0x0003;
 LOG_DISAS("opc %02x ra %2d rb %2d rc %2d disp16 %6d\n",
   opc, ra, rb, rc, disp16);

-- 
1.7.5.2

[Qemu-devel] [PATCH 13/14] alpha: disp12 is not used for USER emulation

2011-06-02 Thread Juan Quintela


Signed-off-by: Juan Quintela 
---
 target-alpha/translate.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 5c11cf2..fd286d8 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -1467,7 +1467,10 @@ static void gen_rx(int ra, int set)
 static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 {
 uint32_t palcode;
-int32_t disp21, disp16, disp12;
+int32_t disp21, disp16;
+#ifndef CONFIG_USER_ONLY
+int32_t disp12;
+#endif
 uint16_t fn11;
 uint8_t opc, ra, rb, rc, fpfn, fn7, islit, real_islit;
 uint8_t lit;
@@ -1487,7 +1490,9 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
 palcode = insn & 0x03FFFFFF;
 disp21 = ((int32_t)((insn & 0x001FFFFF) << 11)) >> 11;
 disp16 = (int16_t)(insn & 0x0000FFFF);
+#ifndef CONFIG_USER_ONLY
 disp12 = (int32_t)((insn & 0x0FFF) << 20) >> 20;
+#endif
 fn11 = (insn >> 5) & 0x07FF;
 fpfn = fn11 & 0x3F;
 fn7 = (insn >> 5) & 0x007F;
-- 
1.7.5.2

Re: [Qemu-devel] virtio scsi host draft specification, v2

2011-06-02 Thread Paolo Bonzini


On 06/02/2011 01:41 PM, Michael S. Tsirkin wrote:

Now to our problem:
As far as I can tell there are two input buffers in each request: sense
and data. Right?

If sense is fixed length, we can simply put it first, have device write
sense then data.  This does not seem too limiting, if you want a lot of
flexibility sense length can be in device config.  If we don't want to
limit ourselves to fixed length sense, we would have driver use two
heads for a request.  This is possible but one needs to be careful in
the driver to make sure there's enough space for both requests. Maybe
add_bufs API to add multiple bufs might be a good idea here.


I should be on holiday today so I'll answer this quickly.  Sounds like 
we can converge, I'll put data at the end and define the length of sense 
in the config: the device writes a default (defined by the spec to be 
always 96) and the driver can modify it.  The _single_ head would contain:


- read-only: command etc.

followed by:

- write-only: sense, status etc.

followed by:

- read-only: data to device
- write-only: data from device

IIUC, qemu only sees a bunch of read-only and write-only buffers.  It 
doesn't see the relative ordering of read-only vs. write-only.  But it 
knows the sizes of read-only and write-only data, so it can figure out 
datain_size and dataout_size.  sense_size is in the config, so neither 
of the three needs to be in the request.
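
As a rough sketch of that size computation: given the total read-only
and write-only byte counts of a request, the data sizes fall out by
subtracting the fixed parts. The struct, the function name and the
cmd_size/resp_size parameters are assumptions made for the example; the
draft spec defines the real layout.

```c
#include <stddef.h>

/* Hypothetical result type for the sketch. */
struct data_sizes {
    size_t dataout;  /* data to device (read-only for the device) */
    size_t datain;   /* data from device (write-only for the device) */
    int valid;       /* 0 if the buffers are too small for the fixed parts */
};

/*
 * total_ro / total_wo: summed lengths of all read-only / write-only buffers.
 * cmd_size:   fixed read-only part (command etc.).
 * resp_size:  fixed write-only part excluding sense (status etc.).
 * sense_size: sense area size taken from the device config.
 */
struct data_sizes split_sizes(size_t total_ro, size_t total_wo,
                              size_t cmd_size, size_t resp_size,
                              size_t sense_size)
{
    struct data_sizes s = { 0, 0, 0 };

    if (total_ro < cmd_size || total_wo < resp_size ||
        total_wo - resp_size < sense_size) {
        return s;   /* malformed request */
    }
    s.dataout = total_ro - cmd_size;
    s.datain = total_wo - resp_size - sense_size;
    s.valid = 1;
    return s;
}
```

Nothing here looks at how the buffers are split into descriptors, only
at the totals.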


sense_len needs to stay, since any number of bytes can be written in the 
sense buffer.  The used-length field should be usable for 
uni-directional commands, but I'm not sure about commands that have both 
datain and dataout.  I'll read the SCSI spec about it tomorrow.


Making qemu support arbitrarily partitioned buffers may require some 
extra utility functions to work on iovecs, but nothing too complex.  If 
your patches already contain something like that, please dig them up so 
I can avoid duplicate work!


Paolo

[Qemu-devel] [PATCH 09/14] exec: last_first_tb was only used in !ONLY_USER case

2011-06-02 Thread Juan Quintela

While there, use a better variable name.

Signed-off-by: Juan Quintela 
---
 exec.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/exec.c b/exec.c
index 8529390..4b1afec 100644
--- a/exec.c
+++ b/exec.c
@@ -1208,12 +1208,16 @@ static inline void tb_alloc_page(TranslationBlock *tb,
  unsigned int n, tb_page_addr_t page_addr)
 {
 PageDesc *p;
-TranslationBlock *last_first_tb;
+#ifndef CONFIG_USER_ONLY
+bool page_already_protected;
+#endif

 tb->page_addr[n] = page_addr;
 p = page_find_alloc(page_addr >> TARGET_PAGE_BITS, 1);
 tb->page_next[n] = p->first_tb;
-last_first_tb = p->first_tb;
+#ifndef CONFIG_USER_ONLY
+page_already_protected = p->first_tb != NULL;
+#endif
 p->first_tb = (TranslationBlock *)((long)tb | n);
 invalidate_page_bitmap(p);

@@ -1249,7 +1253,7 @@ static inline void tb_alloc_page(TranslationBlock *tb,
 /* if some code is already present, then the pages are already
protected. So we handle the case where only the first TB is
allocated in a physical page */
-if (!last_first_tb) {
+if (!page_already_protected) {
 tlb_protect_code(page_addr);
 }
 #endif
-- 
1.7.5.2

Re: [Qemu-devel] virtio scsi host draft specification, v2

2011-06-02 Thread Michael S. Tsirkin

On Thu, Jun 02, 2011 at 02:42:51PM +0300, Avi Kivity wrote:
> On 06/02/2011 01:42 PM, Michael S. Tsirkin wrote:
> >On Wed, Jun 01, 2011 at 05:51:54PM +0300, Avi Kivity wrote:
> >>  On 06/01/2011 05:36 PM, Michael S. Tsirkin wrote:
> >>  >>
> >>  >>   So, if I am going to give this liberty with buffers to the driver, I
> >>  >>   _have_ to keep the size information.  Otherwise, I agree that it is
> >>  >>   redundant and I will remove it.  What poison do you prefer?
> >>  >>
> >>  >
> >>  >Ah, I think I understand now. Both sense and data have in
> >>  >fields that might only be used partially?
> >>  >In that case I think I agree: it's best to require the use of separate
> >>  >buffers for them, in this way used len will give you useful information
> >>  >and you won't need sense_len and data_len: just a flag to
> >>  >mark the fact that there *is* a sense buffer following.
> >>  >And the num field does that.
> >>
> >>
> >>  Do you mean to use the virtio iovec length to determine information
> >>  about the message (like splitting it into buffers)?
> >
> >Exactly the reverse :)
> 
> They're both equally bad.
> 
> >>  I think that's a bad idea.  Splitting into buffers is a function of
> >>  memory management.  For example, a driver in userspace (or a nested
> >>  guest) will have additional fragmentation into 4K pages after it
> >>  passes through the iommu.
> >>
> >>  Let's not mix layers here.
> >
> >Right. If there are two buffers of variable length there
> >should be two add_buf calls.
> 
> No.  The guest should be free to use one large continuous buffer of
> size N, or N buffers of size 1.

That's exactly what I was saying.

> -- 
> error compiling committee.c: too many arguments to function

[Qemu-devel] [PATCH 10/14] mips: we really need the extra arguments

2011-06-02 Thread Juan Quintela

I have no clue how/why syscalls with 7 parameters work on mips.

Signed-off-by: Juan Quintela 
---
 linux-user/main.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 088def3..d13affa 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -2067,6 +2067,10 @@ void cpu_loop(CPUMIPSState *env)
 default:
 break;
 }
+/* We should change do_syscall to take extra args.
+   Some syscalls on mips already use 7 args */
+(void)arg7;
+(void)arg8;
 ret = do_syscall(env, env->active_tc.gpr[2],
  env->active_tc.gpr[4],
  env->active_tc.gpr[5],
-- 
1.7.5.2

[Qemu-devel] [PATCH 11/14] linux-user: fpu_save_addr is not used

2011-06-02 Thread Juan Quintela

It is only read to set the error code?

Signed-off-by: Juan Quintela 
---
 linux-user/signal.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index c7a375f..edf4cdb 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -2122,7 +2122,7 @@ long do_sigreturn(CPUState *env)
}

 err |= __get_user(fpu_save_addr, &sf->fpu_save);
-
+(void)fpu_save_addr;
 //if (fpu_save)
 //err |= restore_fpu_state(env, fpu_save);

@@ -2295,6 +2295,7 @@ void sparc64_set_context(CPUSPARCState *env)
  abi_ulong) != 0)
 goto do_sigsegv;
 err |= __get_user(fenab, &(ucp->tuc_mcontext.mc_fpregs.mcfpu_enab));
+(void)fenab;
 err |= __get_user(env->fprs, &(ucp->tuc_mcontext.mc_fpregs.mcfpu_fprs));
 {
 uint32_t *src, *dst;
-- 
1.7.5.2

Re: [Qemu-devel] virtio scsi host draft specification, v2

2011-06-02 Thread Michael S. Tsirkin

On Thu, Jun 02, 2011 at 02:02:52PM +0200, Paolo Bonzini wrote:
> On 06/02/2011 01:41 PM, Michael S. Tsirkin wrote:
> >Now to our problem:
> >As far as I can tell there are two input buffers in each request: sense
> >and data. Right?
> >
> >If sense is fixed length, we can simply put it first, have device write
> >sense then data.  This does not seem too limiting, if you want a lot of
> >flexibility sense length can be in device config.  If we don't want to
> >limit ourselves to fixed length sense, we would have driver use two
> >heads for a request.  This is possible but one needs to be careful in
> >the driver to make sure there's enough space for both requests. Maybe
> >add_bufs API to add multiple bufs might be a good idea here.
> 
> I should be on holiday today so I'll answer this quickly.  Sounds
> like we can converge, I'll put data at the end and define the length
> of sense in the config: the device writes a default (defined by the
> spec to be always 96) and the driver can modify it.  The _single_
> head would contain:
> 
> - read-only: command etc.
> 
> followed by:
> 
> - write-only: sense, status etc.
> 
> followed by:
> 
> - read-only: data to device
> - write-only: data from device

Yes, this works.

> IIUC, qemu only sees a bunch of read-only and write-only buffers.
> It doesn't see the relative ordering of read-only vs. write-only.

In virtio write is always before read.

> But it knows the sizes of read-only and write-only data, so it can
> figure out datain_size and dataout_size.  sense_size is in the
> config, so neither of the three needs to be in the request.
> 
> sense_len needs to stay, since any number of bytes can be written in
> the sense buffer.

I think this means sense_len needs to go into the in buffer
as well. As head is first it's an out buffer.

>  The used-length field should be usable for
> uni-directional commands, but I'm not sure about commands that have
> both datain and dataout.  I'll read the SCSI spec about it tomorrow.

used-length is the part of in buffer actually written.
actual data len is thus used-length - 96
(sense is assumed to be fixed length by virtio ring,
 sense_len tells driver how many bytes are
 actually valid).
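
The arithmetic above can be sketched as follows. This is a
simplification under the stated assumption that the sense area is the
only fixed-size part of the in buffer; the function names are
hypothetical, and 96 is just the default mentioned in this thread.

```c
#include <stddef.h>

#define SENSE_SIZE_DEFAULT 96  /* default from the thread; driver may change it */

/* Bytes of data the device wrote, given the used length of the in buffer. */
size_t data_written(size_t used_len, size_t sense_size)
{
    return used_len > sense_size ? used_len - sense_size : 0;
}

/* Number of sense bytes that are actually valid, clamped to the area size. */
size_t valid_sense(size_t sense_len, size_t sense_size)
{
    return sense_len < sense_size ? sense_len : sense_size;
}
```

For example, a used length of 608 with the default sense size gives 512
bytes of data, regardless of how many sense bytes are valid.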


> 
> Making qemu support arbitrarily partitioned buffers may require some
> extra utility functions to work on iovecs, but nothing too complex.
> If your patches already contain something like that, please dig them
> up so I can avoid duplicate work!
> 
> Paolo

Yes. Will do.

-- 
MST

Re: [Qemu-devel] [RFC PATCH] Darwin: Fix compilation warning regarding the deprecated daemon() function

2011-06-02 Thread Andreas Färber


On 02.06.2011 at 04:45, Alexandre Raymond wrote:

On OSX > 10.5, daemon() is deprecated, resulting in the following
warning:


>= 10.5

http://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man3/daemon.3.html



8<
qemu-nbd.c: In function ‘main’:
qemu-nbd.c:371: warning: ‘daemon’ is deprecated (declared at /usr/include/stdlib.h:289)

8<

The following trick, used in mDNSResponder, takes care of this  
warning:

http://www.opensource.apple.com/source/mDNSResponder/mDNSResponder-258.18/mDNSPosix/PosixDaemon.c


Even if apparently applied by Apple themselves, I consider it a bad  
hack for curing symptoms.


http://developer.apple.com/library/mac/technotes/tn2083/_index.html#//apple_ref/doc/uid/DTS10003794-CH1-SUBSECTION64


Possibly a better fix would be to supply a .plist file for use with  
launchd/launchctl and to #ifndef __APPLE__ the daemon() functionality?


Further comments inline.


Signed-off-by: Alexandre Raymond 
---
qemu-nbd.c |9 +
1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index e858033..10b0791 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -16,6 +16,10 @@
 *  along with this program; if not, see .

 */

+#if __APPLE__


#ifdef __APPLE__


+#define daemon fake_daemon_function
+#endif
+
#include 
#include "block_int.h"
#include "nbd.h"
@@ -32,6 +36,11 @@
#include 
#include 

+#if __APPLE__


ditto


+#undef daemon
+extern int daemon(int, int);


Blue Swirl has declared war on "extern"... ;)


+#endif
+
#define SOCKET_PATH"/var/lock/qemu-nbd-%s"

#define NBD_BUFFER_SIZE (1024*1024)
--
1.7.5


Andreas

Re: [Qemu-devel] [PATCH 00/14] More gcc 4.6 warnings fixes

2011-06-02 Thread Christophe Fergeau

Hi,

On Thu, Jun 02, 2011 at 01:53:35PM +0200, Juan Quintela wrote:
> Juan Quintela (14):
>   kvm: remove fop write only variable
>   tcg: define and set call_type only when it is used

These were already sent in this thread:
http://thread.gmane.org/gmane.comp.emulators.qemu/101411

>   linuxload: id_change was a write only variable
>   lsi53c895a: current_dev is not used

And these were sent in
http://thread.gmane.org/gmane.comp.emulators.qemu/104773

Your linux-load patch seems better than the one I proposed though, I didn't
notice there was an unused function after doing the cleanup.

Christophe


Re: [Qemu-devel] virtio scsi host draft specification, v2

2011-06-02 Thread Avi Kivity


On 06/02/2011 02:56 PM, Michael S. Tsirkin wrote:

>  >
>  >Right. If there are two buffers of variable length there
>  >should be two add_buf calls.
>
>  No.  The guest should be free to use one large continuous buffer of
>  size N, or N buffers of size 1.

That's exactly what I was saying.


I'm really confused then.

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH 14/14] lsi53c895a: current_dev is not used

2011-06-02 Thread Andreas Färber


On 02.06.2011 at 13:53, Juan Quintela wrote:



Signed-off-by: Juan Quintela 


See http://patchwork.ozlabs.org/patch/98182/ - ack'ed by Paolo.

Andreas


---
hw/lsi53c895a.c |2 --
1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index 83084b6..90c6cbc 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -889,7 +889,6 @@ static void lsi_do_msgout(LSIState *s)
uint8_t msg;
int len;
uint32_t current_tag;
-SCSIDevice *current_dev;
lsi_request *current_req, *p, *p_next;
int id;

@@ -901,7 +900,6 @@ static void lsi_do_msgout(LSIState *s)
current_req = lsi_find_by_tag(s, current_tag);
}
id = (current_tag >> 8) & 0xf;
-current_dev = s->bus.devs[id];

DPRINTF("MSG out len=%d\n", s->dbc);
while (s->dbc) {
--
1.7.5.2

Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2011-06-02 Thread Anthony Liguori


On 06/02/2011 04:06 AM, Daniel P. Berrange wrote:

On Wed, Jun 01, 2011 at 04:35:03PM -0500, Anthony Liguori wrote:

"timestamp": { "seconds": 1265044230, "microseconds": 450486 } }

Valid error reasons could be: "enospc", "eio", etc.


No etc :-)  Error reasons should be well known and well documented.


B. query-stop-reason


I also have a simple solution for item 2. The vm_stop() accepts a reason
argument, so we could store it somewhere and return it as a string, like:

->   { "execute": "query-stop-reason" }
<- { "return": { "reason": "user" } }

Valid reasons could be: "user", "debug", "shutdown", "diskfull" (hey,
this should be "ioerror", no?), "watchdog", "panic", "savevm", "loadvm",
"migrate".

Also note that we have a STOP event. It should be extended with the
stop reason too, for completeness.



Can we just extend query-block?


Primarily we want 'query-stop-reason' to tell us what caused the VM
CPUs to stop. If that reason was 'ioerror', then 'query-block' could
be used to find out which particular block device(s) caused the IO
error to occur & get the "reason" that was in the BLOCK_IO_ERROR
event.


My concern is that we're over abstracting here.  We're not going to add 
additional stop reasons in the future.


Maybe just add an 'io-error': True to query-state.

Regards,

Anthony Liguori



Regards,
Daniel

Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2011-06-02 Thread Jiri Denemark

On Thu, Jun 02, 2011 at 08:08:35 -0500, Anthony Liguori wrote:
> On 06/02/2011 04:06 AM, Daniel P. Berrange wrote:
> >>> B. query-stop-reason
> >>> 
> >>>
> >>> I also have a simple solution for item 2. The vm_stop() accepts a reason
> >>> argument, so we could store it somewhere and return it as a string, like:
> >>>
> >>> ->   { "execute": "query-stop-reason" }
> >>> <- { "return": { "reason": "user" } }
> >>>
> >>> Valid reasons could be: "user", "debug", "shutdown", "diskfull" (hey,
> >>> this should be "ioerror", no?), "watchdog", "panic", "savevm", "loadvm",
> >>> "migrate".
> >>>
> >>> Also note that we have a STOP event. It should be extended with the
> >>> stop reason too, for completeness.
> >>
> >>
> >> Can we just extend query-block?
> >
> > Primarily we want 'query-stop-reason' to tell us what caused the VM
> > CPUs to stop. If that reason was 'ioerror', then 'query-block' could
> > be used to find out which particular block device(s) caused the IO
> > error to occur & get the "reason" that was in the BLOCK_IO_ERROR
> > event.
> 
> My concern is that we're over abstracting here.  We're not going to add 
> additional stop reasons in the future.
> 
> Maybe just add an 'io-error': True to query-state.

Sure, adding a new field to the query-status response would work as well. And it
seems like a good idea to me since one already needs to call query-status to
check if CPUs are stopped or not so it makes sense to incorporate the
additional information there as well. And if you want to be safe for the
future, the new field doesn't have to be boolean 'io-error' but it can be the
string 'reason' which Luiz suggested above.
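
A string 'reason' field could be backed by something as simple as an
enum-to-name table. The sketch below is only an illustration: the enum,
its members and stop_reason_str() are hypothetical names, and the
strings are the candidate reasons listed earlier in this thread (with
"ioerror" in place of "diskfull").

```c
#include <stddef.h>

/* Hypothetical stop reasons; not the identifiers QEMU uses internally. */
enum stop_reason {
    STOP_USER,
    STOP_DEBUG,
    STOP_SHUTDOWN,
    STOP_IOERROR,
    STOP_WATCHDOG,
    STOP_PANIC,
    STOP_SAVEVM,
    STOP_LOADVM,
    STOP_MIGRATE,
    STOP_REASON_MAX
};

/* Returns the wire name for a reason, or NULL for an unknown value. */
const char *stop_reason_str(enum stop_reason r)
{
    static const char *const names[STOP_REASON_MAX] = {
        [STOP_USER]     = "user",
        [STOP_DEBUG]    = "debug",
        [STOP_SHUTDOWN] = "shutdown",
        [STOP_IOERROR]  = "ioerror",
        [STOP_WATCHDOG] = "watchdog",
        [STOP_PANIC]    = "panic",
        [STOP_SAVEVM]   = "savevm",
        [STOP_LOADVM]   = "loadvm",
        [STOP_MIGRATE]  = "migrate",
    };

    if ((unsigned)r >= STOP_REASON_MAX) {
        return NULL;
    }
    return names[r];
}
```

Compared with a boolean 'io-error', a table like this lets new reasons
be added without changing the shape of the response.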

Jirka

Re: [Qemu-devel] [PATCH 14/14] lsi53c895a: current_dev is not used

2011-06-02 Thread Juan Quintela

Andreas Färber  wrote:
> On 02.06.2011 at 13:53, Juan Quintela wrote:
>
>>
>> Signed-off-by: Juan Quintela 
>
> See http://patchwork.ozlabs.org/patch/98182/ - ack'ed by Paolo.
>
> Andreas

oops, yeap.

This makes everything compile (not that I care which one gets
integrated).  Just to get the whole thing compiling O:-)

Thanks, Juan.

>
>> ---
>> hw/lsi53c895a.c |2 --
>> 1 files changed, 0 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
>> index 83084b6..90c6cbc 100644
>> --- a/hw/lsi53c895a.c
>> +++ b/hw/lsi53c895a.c
>> @@ -889,7 +889,6 @@ static void lsi_do_msgout(LSIState *s)
>> uint8_t msg;
>> int len;
>> uint32_t current_tag;
>> -SCSIDevice *current_dev;
>> lsi_request *current_req, *p, *p_next;
>> int id;
>>
>> @@ -901,7 +900,6 @@ static void lsi_do_msgout(LSIState *s)
>> current_req = lsi_find_by_tag(s, current_tag);
>> }
>> id = (current_tag >> 8) & 0xf;
>> -current_dev = s->bus.devs[id];
>>
>> DPRINTF("MSG out len=%d\n", s->dbc);
>> while (s->dbc) {
>> -- 
>> 1.7.5.2
>

Re: [Qemu-devel] [PATCH 11/14] linux-user: fpu_save_addr is not used

2011-06-02 Thread Peter Maydell

On 2 June 2011 12:53, Juan Quintela  wrote:
> It is only read to set the error code?

>         err |= __get_user(fpu_save_addr, &sf->fpu_save);
> -
> +        (void)fpu_save_addr;

In linux-user __get_user can never generate an error: faults
are always caught by the lock_user_struct() or equivalent call
done beforehand. The error handling is I think a leftover from
code borrowed from the kernel (which does have a __get_user
that might return an error).

So I think the correct fix here is just to remove the __get_user
lines and the variables if they're not used.

-- PMM
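
To illustrate the point made above, here is a simplified mock, not the
linux-user code itself. mock_lock_struct() and MOCK_GET_USER() are
hypothetical stand-ins for lock_user_struct() and __get_user(), written on
the assumption (as described in the message) that validation happens once
up front and the per-field read then cannot fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mock_sigframe {
    uint32_t fpu_save;
    uint32_t pc;
};

/* Stand-in for lock_user_struct(): the only step that can fail. */
static struct mock_sigframe *mock_lock_struct(struct mock_sigframe *guest)
{
    return guest;   /* NULL models an inaccessible guest address */
}

/* Stand-in for __get_user(): a plain load that always evaluates to 0. */
#define MOCK_GET_USER(x, hptr) ((x) = *(hptr), 0)

/* Returns 0 on success, -1 if the frame could not be locked. */
static int mock_restore(struct mock_sigframe *guest, uint32_t *pc_out)
{
    struct mock_sigframe *sf = mock_lock_struct(guest);
    int err = 0;

    if (sf == NULL) {
        return -1;   /* the fault is caught here, before any field read */
    }
    /* fpu_save is deliberately not read: its value was never used, so
     * both the load and the local variable can simply be dropped. */
    err |= MOCK_GET_USER(*pc_out, &sf->pc);   /* err stays 0 */
    assert(err == 0);
    return 0;
}
```

In this shape the err accumulation carries no information, which is why
removing the unused read is cleaner than casting the variable to void.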

Re: [Qemu-devel] [PATCH v1][ 21/21] qapi: add QAPI code generation documentation

2011-06-02 Thread Lluís

Here go some minor text corrections.

Michael Roth writes:

> Signed-off-by: Michael Roth 
> ---
>  docs/qapi-code-gen.txt |  315 
> 
>  1 files changed, 315 insertions(+), 0 deletions(-)
>  create mode 100644 docs/qapi-code-gen.txt

> diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
> new file mode 100644
> index 000..1ba7a9e
> --- /dev/null
> +++ b/docs/qapi-code-gen.txt
> @@ -0,0 +1,315 @@
> += How to use the QAPI code generator =
> +
> +* Note: as of this writing, QMP does not use QAPI. Eventually QMP
> +commands will be converted to use QAPI internally. The following
> +information describes QMP/QAPI as it will exist afterward the
> +conversion.
> +
> +QAPI is a native C API within QEMU which provides management-level
> +functionality to internal/external users. For external
> +users/processes, this interface is made available by a JSON-based
> +QEMU Monitor protocol which provided by the QMP server.

s/which provided/that is provided/


> +
> +To map QMP-defined interfaces to the native C QAPI implementations,
> +and JSON-based schema is used to define types and function

s/and JSON-based/a JSON-based/


> +signatures, and a set of scripts is used to generate types/signatures,
> +and marshaling/dispatch code. The QEMU Guest Agent also uses these
> +scripts, paired with a seperate schema, to generate
> +marshaling/dispatch code for the guest agent server running in the
> +guest.
> +
> +This document will describe how the schemas, scripts, and resulting
> +code is used.
> +
> +
> +== QMP/Guest agent schema ==
> +
> +This file defines the types, commands, and events used by QMP.  It should
> +fully describe the interface used by QMP.
> +
> +This file is designed to be loosely based on JSON although it's technical

s/technical/technically/


> +executable Python.  While dictionaries are used, they are parsed as
> +OrderedDicts so that ordering is preserved.

Valid data types should probably be explained/listed.


> +
> +There are two basic syntaxes used.  The first syntax defines a type and is
> +represented by a dictionary.  There are three kinds of types that are
> +supported.
> +
> +A complex type is a dictionary containing a single key who's value is a
> +dictionary.  This corresponds to a struct in C or an Object in JSON.  An
> +example of a complex type is:
> +
> + { 'type': 'MyType',
> +   'data' { 'member1': 'str', 'member2': 'int', '*member3': 'str } }
> +
> +The use of '*' as a prefix to the name means the member is optional.  
> Optional
> +members should always be added to the end of the dictionary to preserve
> +backwards compatibility.
> +
> +An enumeration type is a dictionary containing a single key who's value is a
> +list of strings.  An example enumeration is:
> +
> + { 'enum': 'MyEnum', 'data': [ 'value1', 'value2', 'value3' ] }
> +
> +Generally speaking, complex types and enums should always use CamelCase for
> +the type names.
> +
> +Commands are defined by using a list containing three members.  The first
> +member is the command name, the second member is a dictionary containing
> +arguments, and the third member is the return type.
> +
> +An example command is:
> +
> + { 'command': 'my-command',
> +   'data': { 'arg1': 'str', '*arg2': 'str' },
> +   'returns': 'str' ]
> +
> +Command names should be all lower case with words separated by a hyphen.
> +
> +
> +== Code generation ==
> +
> +Schemas are fed into 3 scripts to generate all the code/files that, paired
> +with the core QAPI libraries, comprise everything required to take JSON
> +commands read in by a QMP/guest agent server, unmarshal the arguments into
> +the underlying C types, call into the corresponding C function, and map the
> +response back to a QMP/guest agent response to be returned to the user.
> +
> +For example usage, we'll use the following schema, which describes a single

s/For example usage/As an (usage )example/


> +complex user-defined type (which will produce a C struct, along with a list
> +node structure that can be used to chain together a list of such types in
> +case we want to accept/return a list of this type with a command), and a
> +command which takes that type as a parameter and returns the same type:
> +
> +mdroth@illuin:~/w/qemu2.git$ cat example-schema.json 
> +{ 'type': 'UserDefOne',
> +  'data': { 'integer': 'int', 'string': 'str' } }
> +
> +{ 'commands': 'my-command',

s/commands/command/


> +  'data': {'arg1': 'UserDefOne'},
> +  'returns':  'UserDefOne' }
> +mdroth@illuin:~/w/qemu2.git$
> +
> +=== scripts/qapi-types.py ===
> +
> +Used to generate the C types defined by a schema. The following files are
> +created:
> +
> +$(prefix)qapi-types.h - C types corresponding to types defined in
> +the schema you pass in
> +$(prefix)qapi-types.c - Cleanup functions for the above C types
> +
> +The $(prefix) is an optional parameter used to as a namespace to keep the

s

Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2011-06-02 Thread Anthony Liguori


On 06/02/2011 08:24 AM, Jiri Denemark wrote:
> On Thu, Jun 02, 2011 at 08:08:35 -0500, Anthony Liguori wrote:
>> On 06/02/2011 04:06 AM, Daniel P. Berrange wrote:
>>>>> B. query-stop-reason
>>>>>
>>>>> I also have a simple solution for item 2. The vm_stop() accepts a reason
>>>>> argument, so we could store it somewhere and return it as a string, like:
>>>>>
>>>>> ->{ "execute": "query-stop-reason" }
>>>>> <- { "return": { "reason": "user" } }
>>>>>
>>>>> Valid reasons could be: "user", "debug", "shutdown", "diskfull" (hey,
>>>>> this should be "ioerror", no?), "watchdog", "panic", "savevm", "loadvm",
>>>>> "migrate".
>>>>>
>>>>> Also note that we have a STOP event. It should be extended with the
>>>>> stop reason too, for completeness.
>>>>
>>>> Can we just extend query-block?
>>>
>>> Primarily we want 'query-stop-reason' to tell us what caused the VM
>>> CPUs to stop. If that reason was 'ioerror', then 'query-block' could
>>> be used to find out which particular block device(s) caused the IO
>>> error to occur & get the "reason" that was in the BLOCK_IO_ERROR
>>> event.
>>
>> My concern is that we're over abstracting here.  We're not going to add
>> additional stop reasons in the future.
>>
>> Maybe just add an 'io-error': True to query-state.
>
> Sure, adding a new field to query-state response would work as well. And it
> seems like a good idea to me since one already needs to call query-status to
> check if CPUs are stopped or not so it makes sense to incorporate the
> additional information there as well. And if you want to be safe for the
> future, the new field doesn't have to be boolean 'io-error' but it can be the
> string 'reason' which Luiz suggested above.

String enumerations are a Bad Thing.  It's impossible to figure out what
strings are valid and it lacks type safety.

Adding more booleans provides better type safety, and when we move to
QAPI with a queryable schema, provides a way to figure out exactly what
combinations are supported by QEMU.

Regards,

Anthony Liguori

> Jirka

Re: [Qemu-devel] [PATCH] Fix compilation warning due to missing header for sigaction

2011-06-02 Thread Alexandre Raymond

On Thu, Jun 2, 2011 at 6:13 AM, Andreas Färber  wrote:
> Am 02.06.2011 um 04:21 schrieb Alexandre Raymond:
>
>> Fix the following warning by including signal.h directly in qemu-common.h
>> 8<
>> iohandler.c: In function ‘qemu_init_child_watch’:
>> iohandler.c:172: warning: implicit declaration of function ‘sigaction’
>> iohandler.c:172: warning: nested extern declaration of ‘sigaction’
>> 8<
>>
>> Signed-off-by: Alexandre Raymond 
>
> Tested-by: Andreas Färber 
>
> Why in qemu-common.h and not in iohandler.c though?
> If we put it into qemu-common.h, you should remove other inclusions of
> signal.h.

Well, I was simply following Anthony's advice from "[PATCH] #include
cleanliness" : "The idea behind qemu-common.h is to avoid direct
includes to help with portability."

Alexandre

Re: [Qemu-devel] [RFC PATCH] Darwin: Fix compilation warning regarding the deprecated daemon() function

2011-06-02 Thread Peter Maydell

On 2 June 2011 03:45, Alexandre Raymond  wrote:

> The following trick, used in mDNSResponder, takes care of this warning:
> http://www.opensource.apple.com/source/mDNSResponder/mDNSResponder-258.18/mDNSPosix/PosixDaemon.c

If we do decide to borrow this trick from there, can we
also borrow some equivalent of the comment which explains
why we're doing it?

-- PMM
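
For readers who do not want to follow the link, a minimal sketch of the kind
of trick being discussed, carrying the explanatory comment asked for above.
It assumes the trick is the usual macro-rename approach: hide the name
daemon while <stdlib.h> is parsed, then declare it again without the
deprecation attribute. The placeholder macro name and get_daemon_fn() are
hypothetical and are not taken from the patch.

```c
/*
 * daemon() is marked deprecated in the OS X headers (10.5 and later),
 * so every call site produces a warning.  Renaming the symbol while
 * <stdlib.h> is parsed attaches the deprecation attribute to a
 * throwaway name; the plain prototype below then declares daemon()
 * without the attribute.  The placeholder name is illustrative only.
 */
#define daemon qemu_hidden_deprecated_daemon
#include <stdlib.h>
#undef daemon
extern int daemon(int nochdir, int noclose);

#include <assert.h>
#include <stddef.h>

typedef int (*daemon_fn)(int, int);

/* Hypothetical helper: hands back the undecorated function pointer. */
static daemon_fn get_daemon_fn(void)
{
    return daemon;
}
```

The sketch only silences the warning. It does not address the objection
raised elsewhere in the thread that the function itself is discouraged on
that platform.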

Re: [Qemu-devel] [RFC PATCH] Darwin: Fix compilation warning regarding the deprecated daemon() function

2011-06-02 Thread Alexandre Raymond

Hi Andreas,

On Thu, Jun 2, 2011 at 8:09 AM, Andreas Färber  wrote:
> Am 02.06.2011 um 04:45 schrieb Alexandre Raymond:
>
>> On OSX > 10.5, daemon() is deprecated, resulting in the following warning:
>
> >= 10.5
>
> http://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man3/daemon.3.html
>
>> 8<
>> qemu-nbd.c: In function ‘main’:
>> qemu-nbd.c:371: warning: ‘daemon’ is deprecated (declared at
>> /usr/include/stdlib.h:289)
>> 8<
>>
>> The following trick, used in mDNSResponder, takes care of this warning:
>>
>> http://www.opensource.apple.com/source/mDNSResponder/mDNSResponder-258.18/mDNSPosix/PosixDaemon.c
>
> Even if apparently applied by Apple themselves, I consider it a bad hack for
> curing symptoms.
>
> http://developer.apple.com/library/mac/technotes/tn2083/_index.html#//apple_ref/doc/uid/DTS10003794-CH1-SUBSECTION64

I agree that this is a nasty hack. It's really up to you guys. I can
try to modify this patch to use launchd instead.

Alexandre

Re: [Qemu-devel] [PULL v5 00/25] SCSI subsystem improvements

2011-06-02 Thread Andreas Färber


Am 31.05.2011 um 15:38 schrieb Anthony Liguori:

> On 05/26/2011 05:56 AM, Paolo Bonzini wrote:
>> The following changes since commit
>> aa29141d84d58171c2d219f0a4b599bd76fb2e37:
>>
>>   Merge remote-tracking branch 'kraxel/CVE-2011-1751' into staging
>> (2011-05-25 07:04:13 -0500)
>>
>> are available in the git repository at:
>>
>>   git://github.com/bonzini/qemu.git scsi.2
>>
>> This series includes the following improvements to the SCSI
>> subsystem:
>
> Pulled.  Thanks.

Unfortunately that pulled in the v5 version, breaking simple trace
build.

Paolo, do you have differential patches for your v6 already?

Andreas

Re: [Qemu-devel] [PULL 00/26] Alpha system emulation, v5

2011-06-02 Thread Richard Henderson

Ping^2.

r~

On 05/27/2011 12:55 PM, Richard Henderson wrote:
> Ping?
> 
> 
> r~
> 
> On 05/23/2011 01:28 PM, Richard Henderson wrote:
>> Changes from v4 -> v5
>>
>>   * Claim official ownership of the Alpha port, rather
>> than leave it as "unmaintained".
>>
>>   * Drop all the patches in hw/ for now.  While they're necessary
>> to actually make the port work, these are the subset of the whole
>> patchset for which I'm confident I'm doing the Right Thing and
>> don't really need patch review.
>>
>> No mistake, patch review is still welcome but no one has posted
>> *anything* substantive for v1->v4.
>>
>> Please pull.
>>
>>
>> r~
>>
>>
>> The following changes since commit dcfd14b3741983c466ad92fa2ae91eeafce3e5d5:
>>
>>   Delete unused tb_invalidate_page_range (2011-05-22 10:47:28 +)
>>
>> are available in the git repository at:
>>   git://repo.or.cz/qemu/rth.git axp-next
>>
>> Richard Henderson (26):
>>   target-alpha: Claim ownership.
>>   target-alpha: Disassemble EV6 PALcode instructions.
>>   target-alpha: Single-step properly across branches.
>>   target-alpha: Remove partial support for palcode emulation.
>>   target-alpha: Fix translation of PALmode memory insns.
>>   target-alpha: Fix system store_conditional
>>   target-alpha: Cleanup MMU modes.
>>   target-alpha: Merge HW_REI and HW_RET implementations.
>>   target-alpha: Rationalize internal processor registers.
>>   target-alpha: Enable the alpha-softmmu target.
>>   target-alpha: Tidy exception constants.
>>   target-alpha: Tidy up arithmetic exceptions.
>>   target-alpha: Use do_restore_state for arithmetic exceptions.
>>   target-alpha: Add various symbolic constants.
>>   target-alpha: Use kernel mmu_idx for pal_mode.
>>   target-alpha: Add IPRs to be used by the emulation PALcode.
>>   target-alpha: Implement do_interrupt for system mode.
>>   target-alpha: Swap shadow registers moving to/from PALmode.
>>   target-alpha: All ISA checks to use TB->FLAGS.
>>   target-alpha: Disable interrupts properly.
>>   target-alpha: Implement more CALL_PAL values inline.
>>   target-alpha: Implement cpu_alpha_handle_mmu_fault for system mode.
>>   target-alpha: Remap PIO space for 43-bit KSEG for EV6.
>>   target-alpha: Trap for unassigned and unaligned addresses.
>>   target-alpha: Use a fixed frequency for the RPCC in system mode.
>>   target-alpha: Implement TLB flush primitives.
>>
>>  MAINTAINERS   |4 +-
>>  Makefile.target   |3 +-
>>  alpha-dis.c   |4 -
>>  configure |1 +
>>  cpu-exec.c|   33 +-
>>  default-configs/alpha-softmmu.mak |9 +
>>  dis-asm.h |3 +
>>  disas.c   |2 +-
>>  exec-all.h|2 +-
>>  exec.c|   12 +-
>>  hw/alpha_palcode.c| 1048 
>> -
>>  linux-user/main.c |   50 +--
>>  target-alpha/cpu.h|  375 ++
>>  target-alpha/exec.h   |   12 +-
>>  target-alpha/helper.c |  589 +
>>  target-alpha/helper.h |   32 +-
>>  target-alpha/machine.c|   87 +++
>>  target-alpha/op_helper.c  |  278 +--
>>  target-alpha/translate.c  |  804 
>>  19 files changed, 1179 insertions(+), 2169 deletions(-)
>>  create mode 100644 default-configs/alpha-softmmu.mak
>>  delete mode 100644 hw/alpha_palcode.c
>>  create mode 100644 target-alpha/machine.c
>

Re: [Qemu-devel] [PATCH] Fix build on FreeBSD

2011-06-02 Thread Andreas Färber


Am 31.05.2011 um 16:57 schrieb Nathan Whitehorn:

> Add some includes required to build qemu on FreeBSD.

Missing Sob.

> ---
> bsd-user/syscall.c |2 ++
> iohandler.c|1 +
> os-posix.c |4 
> 3 files changed, 7 insertions(+), 0 deletions(-)

[...]

> diff --git a/iohandler.c b/iohandler.c
> index 2b82421..7266aca 100644
> --- a/iohandler.c
> +++ b/iohandler.c
> @@ -29,6 +29,7 @@
>
> #ifndef _WIN32
> #include 
> +#include 
> #endif
>
> typedef struct IOHandlerRecord {

This is independent of the other BSD issues (that I cannot judge) and
affects Darwin and Haiku as well. Did you check that with your patch
Win32 does not need the header?

Alexandre (cc'ed) posted a slightly different patch, adding it in
qemu-common.h instead.

Andreas

Re: [Qemu-devel] [PATCH 11/14] linux-user: fpu_save_addr is not used

2011-06-02 Thread Juan Quintela

Peter Maydell  wrote:
> On 2 June 2011 12:53, Juan Quintela  wrote:
>> It is only read to set the error code?
>
>>         err |= __get_user(fpu_save_addr, &sf->fpu_save);
>> -
>> +        (void)fpu_save_addr;
>
> In linux-user __get_user can never generate an error: faults
> are always caught by the lock_user_struct() or equivalent call
> done beforehand. The error handling is I think a leftover from
> code borrowed from the kernel (which does have a __get_user
> that might return an error).
>
> So I think the correct fix here is just to remove the __get_user
> lines and the variables if they're not used.

Fine with me.  Will do for next series.

Later, Juan.

[Qemu-devel] [PATCH 14/14] Make spapr tces use generic dma layer

2011-06-02 Thread David Gibson

---
 Makefile.target  |4 +-
 hw/spapr.c   |3 +
 hw/spapr.h   |   14 +++-
 hw/spapr_iommu.c |  236 
 hw/spapr_llan.c  |   70 --
 hw/spapr_vio.c   |  289 +++--
 hw/spapr_vio.h   |   60 ++--
 hw/spapr_vscsi.c |   26 +++---
 hw/spapr_vty.c   |1 +
 9 files changed, 354 insertions(+), 349 deletions(-)
 create mode 100644 hw/spapr_iommu.c

diff --git a/Makefile.target b/Makefile.target
index 042ba1b..365d43e 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -252,8 +252,8 @@ obj-ppc-y += ppc_oldworld.o
 # NewWorld PowerMac
 obj-ppc-y += ppc_newworld.o
 # IBM pSeries (sPAPR)
-ifeq ($(CONFIG_FDT)$(TARGET_PPC64),yy)
-obj-ppc-y += spapr.o spapr_hcall.o spapr_rtas.o spapr_vio.o
+ifeq ($(CONFIG_FDT)$(CONFIG_IOMMU)$(TARGET_PPC64),yyy)
+obj-ppc-y += spapr.o spapr_hcall.o spapr_rtas.o spapr_vio.o spapr_iommu.o
 obj-ppc-y += xics.o spapr_vty.o spapr_llan.o spapr_vscsi.o
 obj-ppc-$(CONFIG_VIRTIO) += spapr_virtio.o
 endif
diff --git a/hw/spapr.c b/hw/spapr.c
index febcad4..45b87bb 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -415,6 +415,9 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 /* Set up Interrupt Controller */
 spapr->icp = xics_system_init(XICS_IRQS);
 
+/* Set up IOMMU */
+spapr_iommu_init();
+
 /* Set up VIO bus */
 spapr->vio_bus = spapr_vio_bus_init();
 
diff --git a/hw/spapr.h b/hw/spapr.h
index 382b0f1..525c9ba 100644
--- a/hw/spapr.h
+++ b/hw/spapr.h
@@ -1,6 +1,8 @@
 #if !defined(__HW_SPAPR_H__)
 #define __HW_SPAPR_H__
 
+#include "dma_rw.h"
+
 struct VIOsPAPRBus;
 struct VirtIOsPAPRBus;
 struct icp_state;
@@ -269,7 +271,7 @@ typedef struct sPAPREnvironment {
 
 extern sPAPREnvironment *spapr;
 
-/*#define DEBUG_SPAPR_HCALLS*/
+#define DEBUG_SPAPR_HCALLS
 
 #ifdef DEBUG_SPAPR_HCALLS
 #define hcall_dprintf(fmt, ...) \
@@ -307,4 +309,14 @@ target_ulong spapr_rtas_call(sPAPREnvironment *spapr,
 int spapr_rtas_device_tree_setup(void *fdt, target_phys_addr_t rtas_addr,
  target_phys_addr_t rtas_size);
 
+#define SPAPR_TCE_PAGE_SHIFT   12
+#define SPAPR_TCE_PAGE_SIZE(1ULL << SPAPR_TCE_PAGE_SHIFT)
+#define SPAPR_TCE_PAGE_MASK(SPAPR_TCE_PAGE_SIZE - 1)
+
+void spapr_iommu_init(void);
+void spapr_tce_init_dev(DeviceState *dev, uint64_t liobn,
+size_t window_size);
+void spapr_tce_clear_dev(DeviceState *dev);
+int spapr_dma_dt(void *fdt, int node_off, DMAMmu *iommu);
+
 #endif /* !defined (__HW_SPAPR_H__) */
diff --git a/hw/spapr_iommu.c b/hw/spapr_iommu.c
new file mode 100644
index 000..49b930d
--- /dev/null
+++ b/hw/spapr_iommu.c
@@ -0,0 +1,236 @@
+/*
+ * QEMU sPAPR IOMMU (TCE) code
+ *
+ * Copyright (c) 2010 David Gibson, IBM Corporation 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+#include "hw.h"
+#include "kvm.h"
+#include "qdev.h"
+#include "kvm_ppc.h"
+#include "hw/dma_rw.h"
+
+#include "hw/spapr.h"
+
+#include 
+
+/* #define DEBUG_TCE */
+
+enum sPAPRTCEAccess {
+SPAPR_TCE_FAULT = 0,
+SPAPR_TCE_RO = 1,
+SPAPR_TCE_WO = 2,
+SPAPR_TCE_RW = 3,
+};
+
+typedef struct sPAPRTCE {
+uint64_t tce;
+} sPAPRTCE;
+
+
+typedef struct sPAPRTCETable sPAPRTCETable;
+
+struct sPAPRTCETable {
+struct DMAMmu dma;
+target_ulong liobn;
+uint32_t window_size;
+sPAPRTCE *table;
+int fd;
+QLIST_ENTRY(sPAPRTCETable) list;
+};
+
+
+QLIST_HEAD(spapr_tce_tables, sPAPRTCETable) spapr_tce_tables;
+
+static sPAPRTCETable *spapr_tce_find_by_liobn(target_ulong liobn)
+{
+sPAPRTCETable *tcet;
+
+QLIST_FOREACH(tcet, &spapr_tce_tables, list) {
+if (tcet->liobn == liobn) {
+return tcet;
+}
+}
+
+return NULL;
+}
+
+static int spapr_tce_translate(DeviceState *qdev,
+   dma_addr_t addr,
+   dma_addr_t *paddr,
+   dma_addr_t *len,
+   int is_write)
+{
+sPAPRTCETable *tcet = DO_UPCAST(sPAPRTCETable, dma, qdev->iommu);
+enum sPAPRTCEAccess access = is_write ? SPAPR_TCE_WO : SPAPR_TCE_RO;
+uint64_t tce;
+
+#ifdef DEBUG_TCE
+fprintf(stderr, "spapr_tce_translate addr=0x%llx\n",
+(unsigned long long)addr);
+#endif
+
+/* Check if we

[Qemu-devel] Yet another take on a generic dma/iommu layer

2011-06-02 Thread David Gibson

Here is my variant on Eduard - Gabriel Munteanu's patches to add a
DMA/IOMMU layer; this one is expanded to support the PAPR TCE
mechanism.  At present, we implement PAPR TCEs directly in the PAPR
virtual IO bus layer; the last patch of this series reworks the code
to implement them through the generic DMA layer.  That will make life
easier when we come to implement PCI for the pseries machine.

Apart from that, I've significantly reworked how the IOMMU data is
accessed from the qdev.  The DMADevice structure is gone - I saw no
point to it.  Instead, the DeviceState contains a pointer directly to
a DmaMmu structure.  NULL here indicates no IOMMU, so DMAs go directly
to guest physical addresses.  All the DMA R/W helper functions take a
DeviceState * and reach the DmaMmu from there.

The DmaMmu represents a single DMA context / address space.  It could
be separate for each device, or shared between several devices,
depending on whether a particular IOMMU implements independent
translation for each device, or a single shared DMA address space for
(e.g.) a whole bus.  From the DmaMmu structure, IOMMU specific state
information can be reached via upcasting.
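
The access path described in the last two paragraphs can be sketched as
follows. This is a self-contained toy under stated assumptions, not the
code from the series: the context type is spelled DMAMmu as in the patches,
while ToyIOMMU, TOY_UPCAST(), toy_translate() and dma_translate() are
hypothetical names, and the single-callback layout of the context is a
simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;

typedef struct DMAMmu DMAMmu;
typedef struct DeviceState DeviceState;

/* One DMA context / address space (simplified to one callback). */
struct DMAMmu {
    int (*translate)(DeviceState *dev, dma_addr_t addr, dma_addr_t *paddr);
};

struct DeviceState {
    DMAMmu *iommu;   /* NULL: no IOMMU, DMA goes to guest physical */
};

/* IOMMU-specific state embeds the generic context as a member. */
typedef struct ToyIOMMU {
    DMAMmu dma;
    dma_addr_t offset;   /* toy translation: add a fixed offset */
} ToyIOMMU;

/* "Upcast" from the embedded member back to its container. */
#define TOY_UPCAST(type, field, ptr) \
    ((type *)((char *)(ptr) - offsetof(type, field)))

static int toy_translate(DeviceState *dev, dma_addr_t addr, dma_addr_t *paddr)
{
    ToyIOMMU *t = TOY_UPCAST(ToyIOMMU, dma, dev->iommu);

    *paddr = addr + t->offset;
    return 0;
}

/* What a DMA helper would do before touching guest memory. */
static int dma_translate(DeviceState *dev, dma_addr_t addr, dma_addr_t *paddr)
{
    if (dev->iommu == NULL) {
        *paddr = addr;   /* identity mapping */
        return 0;
    }
    return dev->iommu->translate(dev, addr, paddr);
}
```

A device with a NULL context gets its address back unchanged, while a
device behind the toy IOMMU gets the address shifted by the per-context
offset that only the upcast can reach.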

For PCI IOMMUs, the PCI bus structure references a PCIBusIOMMU
structure.  That contains a single 'new_device' callback which obtains
the appropriate DmaMmu context for a given PCI device.  That could
either be a pointer to a fixed existing DmaMmu, if the IOMMU
implements a single shared address space (the AMD IOMMU uses this), or
it could allocate a new DmaMmu context if the IOMMU provides a
separate DMA address space for each device.
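
The two 'new_device' policies described above might look roughly like this.
The sketch is standalone and makes assumptions: PCIBusIOMMU, its new_device
member and the callback typedef follow the PCI patch in this series, but
the DMAMmu body, shared_new_device(), private_new_device() and
attach_device() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct DMAMmu { int id; } DMAMmu;   /* stand-in DMA context */
typedef struct PCIBus PCIBus;

typedef DMAMmu *(*pci_iommu_new_device_fn)(PCIBus *);

typedef struct PCIBusIOMMU {
    pci_iommu_new_device_fn new_device;
} PCIBusIOMMU;

struct PCIBus {
    PCIBusIOMMU *iommu;
};

/* Shared policy: every device on the bus gets the same context. */
static DMAMmu shared_ctx = { 0 };

static DMAMmu *shared_new_device(PCIBus *bus)
{
    (void)bus;
    return &shared_ctx;
}

/* Per-device policy: allocate a fresh context on each call. */
static DMAMmu *private_new_device(PCIBus *bus)
{
    static int next_id = 1;
    DMAMmu *ctx = malloc(sizeof(*ctx));

    (void)bus;
    if (ctx != NULL) {
        ctx->id = next_id++;
    }
    return ctx;
}

/* Mirrors the registration hook: NULL when the bus has no IOMMU. */
static DMAMmu *attach_device(PCIBus *bus)
{
    return bus->iommu ? bus->iommu->new_device(bus) : NULL;
}
```

With the shared policy two devices compare equal on their context pointer;
with the per-device policy each call yields a distinct allocation that the
caller owns.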

[Qemu-devel] [PATCH 02/14] pci: add IOMMU support via the generic DMA layer

2011-06-02 Thread David Gibson

IOMMUs can now be hooked onto the PCI bus. This makes use of the generic
DMA layer.

Signed-off-by: Eduard - Gabriel Munteanu 
Signed-off-by: David Gibson 
---
 hw/pci.c   |9 +
 hw/pci.h   |5 +
 hw/pci_internals.h |7 +++
 3 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 1d297d6..03ec453 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -745,6 +745,10 @@ static PCIDevice *do_pci_register_device(PCIDevice 
*pci_dev, PCIBus *bus,
 return NULL;
 }
 pci_dev->bus = bus;
+#ifdef CONFIG_IOMMU
+if (bus->iommu)
+pci_dev->qdev.iommu = bus->iommu->new_device(bus);
+#endif
 pci_dev->devfn = devfn;
 pstrcpy(pci_dev->name, sizeof(pci_dev->name), name);
 pci_dev->irq_state = 0;
@@ -2163,3 +2167,8 @@ int pci_qdev_find_device(const char *id, PCIDevice **pdev)
 
 return rc;
 }
+
+void pci_register_iommu(PCIBus *bus, PCIBusIOMMU *iommu)
+{
+bus->iommu = iommu;
+}
diff --git a/hw/pci.h b/hw/pci.h
index 0d288ce..a71ba04 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -5,6 +5,7 @@
 #include "qobject.h"
 
 #include "qdev.h"
+#include "dma_rw.h"
 
 /* PCI includes legacy ISA access.  */
 #include "isa.h"
@@ -129,6 +130,7 @@ enum {
 
 struct PCIDevice {
 DeviceState qdev;
+
 /* PCI config space */
 uint8_t *config;
 
@@ -271,6 +273,9 @@ void pci_bridge_update_mappings(PCIBus *b);
 
 void pci_device_deassert_intx(PCIDevice *dev);
 
+typedef struct PCIBusIOMMU PCIBusIOMMU;
+void pci_register_iommu(PCIBus *bus, PCIBusIOMMU *iommu);
+
 static inline void
 pci_set_byte(uint8_t *config, uint8_t val)
 {
diff --git a/hw/pci_internals.h b/hw/pci_internals.h
index fbe1866..ff1a640 100644
--- a/hw/pci_internals.h
+++ b/hw/pci_internals.h
@@ -14,8 +14,15 @@
 
 extern struct BusInfo pci_bus_info;
 
+typedef DMAMmu *(*pci_iommu_new_device_fn)(PCIBus *);
+
+struct PCIBusIOMMU {
+pci_iommu_new_device_fn new_device;
+};
+
 struct PCIBus {
 BusState qbus;
+PCIBusIOMMU *iommu;
 uint8_t devfn_min;
 pci_set_irq_fn set_irq;
 pci_map_irq_fn map_irq;
-- 
1.7.4.4

[Qemu-devel] [PATCH 10/14] lsi53c895a: use the DMA memory access interface

2011-06-02 Thread David Gibson

This allows the device to work properly with an emulated IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu 
---
 hw/lsi53c895a.c |   24 
 1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index 83084b6..0b8a213 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -395,7 +395,7 @@ static inline uint32_t read_dword(LSIState *s, uint32_t 
addr)
 if ((addr & 0xe000) == s->script_ram_base) {
 return s->script_ram[(addr & 0x1fff) >> 2];
 }
-cpu_physical_memory_read(addr, (uint8_t *)&buf, 4);
+dma_memory_read(&s->dev.qdev, addr, (uint8_t *)&buf, 4);
 return cpu_to_le32(buf);
 }
 
@@ -573,9 +573,9 @@ static void lsi_do_dma(LSIState *s, int out)
 }
 /* ??? Set SFBR to first data byte.  */
 if (out) {
-cpu_physical_memory_read(addr, s->current->dma_buf, count);
+dma_memory_read(&s->dev.qdev, addr, s->current->dma_buf, count);
 } else {
-cpu_physical_memory_write(addr, s->current->dma_buf, count);
+dma_memory_write(&s->dev.qdev, addr, s->current->dma_buf, count);
 }
 s->current->dma_len -= count;
 if (s->current->dma_len == 0) {
@@ -775,7 +775,7 @@ static void lsi_do_command(LSIState *s)
 DPRINTF("Send command len=%d\n", s->dbc);
 if (s->dbc > 16)
 s->dbc = 16;
-cpu_physical_memory_read(s->dnad, buf, s->dbc);
+dma_memory_read(&s->dev.qdev, s->dnad, buf, s->dbc);
 s->sfbr = buf[0];
 s->command_complete = 0;
 
@@ -825,7 +825,7 @@ static void lsi_do_status(LSIState *s)
 s->dbc = 1;
 status = s->status;
 s->sfbr = status;
-cpu_physical_memory_write(s->dnad, &status, 1);
+dma_memory_write(&s->dev.qdev, s->dnad, &status, 1);
 lsi_set_phase(s, PHASE_MI);
 s->msg_action = 1;
 lsi_add_msg_byte(s, 0); /* COMMAND COMPLETE */
@@ -839,7 +839,7 @@ static void lsi_do_msgin(LSIState *s)
 len = s->msg_len;
 if (len > s->dbc)
 len = s->dbc;
-cpu_physical_memory_write(s->dnad, s->msg, len);
+dma_memory_write(&s->dev.qdev, s->dnad, s->msg, len);
 /* Linux drivers rely on the last byte being in the SIDL.  */
 s->sidl = s->msg[len - 1];
 s->msg_len -= len;
@@ -871,7 +871,7 @@ static void lsi_do_msgin(LSIState *s)
 static uint8_t lsi_get_msgbyte(LSIState *s)
 {
 uint8_t data;
-cpu_physical_memory_read(s->dnad, &data, 1);
+dma_memory_read(&s->dev.qdev, s->dnad, &data, 1);
 s->dnad++;
 s->dbc--;
 return data;
@@ -1028,8 +1028,8 @@ static void lsi_memcpy(LSIState *s, uint32_t dest, 
uint32_t src, int count)
 DPRINTF("memcpy dest 0x%08x src 0x%08x count %d\n", dest, src, count);
 while (count) {
 n = (count > LSI_BUF_SIZE) ? LSI_BUF_SIZE : count;
-cpu_physical_memory_read(src, buf, n);
-cpu_physical_memory_write(dest, buf, n);
+dma_memory_read(&s->dev.qdev, src, buf, n);
+dma_memory_write(&s->dev.qdev, dest, buf, n);
 src += n;
 dest += n;
 count -= n;
@@ -1097,7 +1097,7 @@ again:
 
 /* 32-bit Table indirect */
 offset = sxt24(addr);
-cpu_physical_memory_read(s->dsa + offset, (uint8_t *)buf, 8);
+dma_memory_read(&s->dev.qdev, s->dsa + offset, (uint8_t *)buf, 8);
 /* byte count is stored in bits 0:23 only */
 s->dbc = cpu_to_le32(buf[0]) & 0xff;
 s->rbc = s->dbc;
@@ -1456,7 +1456,7 @@ again:
 n = (insn & 7);
 reg = (insn >> 16) & 0xff;
 if (insn & (1 << 24)) {
-cpu_physical_memory_read(addr, data, n);
+dma_memory_read(&s->dev.qdev, addr, data, n);
 DPRINTF("Load reg 0x%x size %d addr 0x%08x = %08x\n", reg, n,
 addr, *(int *)data);
 for (i = 0; i < n; i++) {
@@ -1467,7 +1467,7 @@ again:
 for (i = 0; i < n; i++) {
 data[i] = lsi_reg_readb(s, reg + i);
 }
-cpu_physical_memory_write(addr, data, n);
+dma_memory_write(&s->dev.qdev, addr, data, n);
 }
 }
 }
-- 
1.7.4.4

[Qemu-devel] [PATCH 04/14] ide: use the DMA memory access interface for PCI IDE controllers

2011-06-02 Thread David Gibson

Emulated PCI IDE controllers now use the memory access interface. This
also allows an emulated IOMMU to translate and check accesses.

Map invalidation results in cancelling DMA transfers. Since the guest OS
can't properly recover the DMA results in case the mapping is changed,
this is a fairly good approximation.

Note this doesn't handle AHCI emulation yet!

Signed-off-by: Eduard - Gabriel Munteanu 
Signed-off-by: David Gibson 
---
 dma-helpers.c |   25 +++--
 dma.h |   10 ++
 hw/ide/ahci.c |3 ++-
 hw/ide/internal.h |1 +
 hw/ide/macio.c|4 ++--
 hw/ide/pci.c  |   18 +++---
 6 files changed, 41 insertions(+), 20 deletions(-)

diff --git a/dma-helpers.c b/dma-helpers.c
index 712ed89..6a00df4 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -10,12 +10,13 @@
 #include "dma.h"
 #include "block_int.h"
 
-void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint)
+void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DeviceState *dev)
 {
 qsg->sg = qemu_malloc(alloc_hint * sizeof(ScatterGatherEntry));
 qsg->nsg = 0;
 qsg->nalloc = alloc_hint;
 qsg->size = 0;
+qsg->dev = dev;
 }
 
 void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base,
@@ -73,16 +74,27 @@ static void dma_bdrv_unmap(DMAAIOCB *dbs)
 int i;
 
 for (i = 0; i < dbs->iov.niov; ++i) {
-cpu_physical_memory_unmap(dbs->iov.iov[i].iov_base,
-  dbs->iov.iov[i].iov_len, !dbs->is_write,
-  dbs->iov.iov[i].iov_len);
+dma_memory_unmap(dbs->sg->dev,
+ dbs->iov.iov[i].iov_base,
+ dbs->iov.iov[i].iov_len, !dbs->is_write,
+ dbs->iov.iov[i].iov_len);
 }
 }
 
+static void dma_bdrv_cancel(void *opaque)
+{
+DMAAIOCB *dbs = opaque;
+
+bdrv_aio_cancel(dbs->acb);
+dma_bdrv_unmap(dbs);
+qemu_iovec_destroy(&dbs->iov);
+qemu_aio_release(dbs);
+}
+
 static void dma_bdrv_cb(void *opaque, int ret)
 {
 DMAAIOCB *dbs = (DMAAIOCB *)opaque;
-target_phys_addr_t cur_addr, cur_len;
+dma_addr_t cur_addr, cur_len;
 void *mem;
 
 dbs->acb = NULL;
@@ -100,7 +112,8 @@ static void dma_bdrv_cb(void *opaque, int ret)
 while (dbs->sg_cur_index < dbs->sg->nsg) {
 cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
 cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
-mem = cpu_physical_memory_map(cur_addr, &cur_len, !dbs->is_write);
+mem = dma_memory_map(dbs->sg->dev, dma_bdrv_cancel, dbs,
+ cur_addr, &cur_len, !dbs->is_write);
 if (!mem)
 break;
 qemu_iovec_add(&dbs->iov, mem, cur_len);
diff --git a/dma.h b/dma.h
index f3bb275..a2cd649 100644
--- a/dma.h
+++ b/dma.h
@@ -14,20 +14,22 @@
 //#include "cpu.h"
 #include "hw/hw.h"
 #include "block.h"
+#include "hw/dma_rw.h"
 
 typedef struct {
-target_phys_addr_t base;
-target_phys_addr_t len;
+dma_addr_t base;
+dma_addr_t len;
 } ScatterGatherEntry;
 
 typedef struct {
 ScatterGatherEntry *sg;
 int nsg;
 int nalloc;
-target_phys_addr_t size;
+dma_addr_t size;
+DeviceState *dev;
 } QEMUSGList;
 
-void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint);
+void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DeviceState *dev);
 void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base,
  target_phys_addr_t len);
 void qemu_sglist_destroy(QEMUSGList *qsg);
diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 1f008a3..5bc3f4a 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -680,7 +680,8 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist)
 if (sglist_alloc_hint > 0) {
 AHCI_SG *tbl = (AHCI_SG *)prdt;
 
-qemu_sglist_init(sglist, sglist_alloc_hint);
+/* FIXME: pass a proper DMADevice. */
+qemu_sglist_init(sglist, sglist_alloc_hint, NULL);
 for (i = 0; i < sglist_alloc_hint; i++) {
 /* flags_size is zero-based */
 qemu_sglist_add(sglist, le64_to_cpu(tbl[i].addr),
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index c2b35ec..fd7e04f 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -474,6 +474,7 @@ struct IDEDMA {
 struct iovec iov;
 QEMUIOVector qiov;
 BlockDriverAIOCB *aiocb;
+DeviceState *dev;
 };
 
 struct IDEBus {
diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index 7107f6b..a111481 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -78,7 +78,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
 
 s->io_buffer_size = io->len;
 
-qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
+qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL);
 qemu_sglist_add(&s->sg, io->addr, io->len);
 io->addr += io->len;
 io->len = 0;
@@ -140,7 +140,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
 s->io_buffer_i

[Qemu-devel] [PATCH 13/14] usb-ohci: use the DMA memory access interface

2011-06-02 Thread David Gibson

This allows the device to work properly with an emulated IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu 
---
 hw/usb-ohci.c |   43 ---
 1 files changed, 28 insertions(+), 15 deletions(-)
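
Note for readers: the helpers below only receive an OHCIState pointer, so
the conversion recovers the enclosing PCI device with container_of(). A
minimal, self-contained sketch of that pattern follows. The demo_* names are
hypothetical stand-ins, not QEMU's types, and the macro is a simplified form
without the type checking that production versions usually add.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a pointer to a member to the
 * structure that embeds it. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for PCIDevice / OHCIState / OHCIPCIState. */
struct demo_pci_dev {
    int devfn;
};

struct demo_ohci_state {
    unsigned localmem_base;
};

struct demo_ohci_pci_state {
    struct demo_pci_dev pci_dev;
    struct demo_ohci_state state;
};

/* Given only the inner state, reach a field of the outer device. */
static int demo_devfn_of(struct demo_ohci_state *ohci)
{
    struct demo_ohci_pci_state *s =
        demo_container_of(ohci, struct demo_ohci_pci_state, state);

    return s->pci_dev.devfn;
}
```

The pointer arithmetic is only valid when the inner pointer really refers to
the state member of such a wrapper; passing a standalone state object would
yield a bogus outer pointer.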

diff --git a/hw/usb-ohci.c b/hw/usb-ohci.c
index 8b966f7..061aa7b 100644
--- a/hw/usb-ohci.c
+++ b/hw/usb-ohci.c
@@ -116,6 +116,11 @@ typedef struct {
 
 } OHCIState;
 
+typedef struct {
+PCIDevice pci_dev;
+OHCIState state;
+} OHCIPCIState;
+
 /* Host Controller Communications Area */
 struct ohci_hcca {
 uint32_t intr[32];
@@ -422,12 +427,13 @@ static void ohci_reset(void *opaque)
 static inline int get_dwords(OHCIState *ohci,
  uint32_t addr, uint32_t *buf, int num)
 {
+OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
 int i;
 
 addr += ohci->localmem_base;
 
 for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
-cpu_physical_memory_read(addr, buf, sizeof(*buf));
+dma_memory_read(&s->pci_dev.qdev, addr, buf, sizeof(*buf));
 *buf = le32_to_cpu(*buf);
 }
 
@@ -438,13 +444,14 @@ static inline int get_dwords(OHCIState *ohci,
 static inline int put_dwords(OHCIState *ohci,
  uint32_t addr, uint32_t *buf, int num)
 {
+OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
 int i;
 
 addr += ohci->localmem_base;
 
 for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
 uint32_t tmp = cpu_to_le32(*buf);
-cpu_physical_memory_write(addr, &tmp, sizeof(tmp));
+dma_memory_write(&s->pci_dev.qdev, addr, &tmp, sizeof(tmp));
 }
 
 return 1;
@@ -454,12 +461,13 @@ static inline int put_dwords(OHCIState *ohci,
 static inline int get_words(OHCIState *ohci,
 uint32_t addr, uint16_t *buf, int num)
 {
+OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
 int i;
 
 addr += ohci->localmem_base;
 
 for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
-cpu_physical_memory_read(addr, buf, sizeof(*buf));
+dma_memory_read(&s->pci_dev.qdev, addr, buf, sizeof(*buf));
 *buf = le16_to_cpu(*buf);
 }
 
@@ -470,13 +478,14 @@ static inline int get_words(OHCIState *ohci,
 static inline int put_words(OHCIState *ohci,
 uint32_t addr, uint16_t *buf, int num)
 {
+OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
 int i;
 
 addr += ohci->localmem_base;
 
 for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
 uint16_t tmp = cpu_to_le16(*buf);
-cpu_physical_memory_write(addr, &tmp, sizeof(tmp));
+dma_memory_write(&s->pci_dev.qdev, addr, &tmp, sizeof(tmp));
 }
 
 return 1;
@@ -504,7 +513,9 @@ static inline int ohci_read_iso_td(OHCIState *ohci,
 static inline int ohci_read_hcca(OHCIState *ohci,
  uint32_t addr, struct ohci_hcca *hcca)
 {
-cpu_physical_memory_read(addr + ohci->localmem_base, hcca, sizeof(*hcca));
+OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
+dma_memory_read(&s->pci_dev.qdev, addr + ohci->localmem_base, hcca,
+sizeof(*hcca));
 return 1;
 }
 
@@ -530,7 +541,10 @@ static inline int ohci_put_iso_td(OHCIState *ohci,
 static inline int ohci_put_hcca(OHCIState *ohci,
 uint32_t addr, struct ohci_hcca *hcca)
 {
-cpu_physical_memory_write(addr + ohci->localmem_base, hcca, sizeof(*hcca));
+OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
+
+dma_memory_write(&s->pci_dev.qdev, addr + ohci->localmem_base, hcca,
+ sizeof(*hcca));
 return 1;
 }
 
@@ -538,6 +552,7 @@ static inline int ohci_put_hcca(OHCIState *ohci,
 static void ohci_copy_td(OHCIState *ohci, struct ohci_td *td,
  uint8_t *buf, int len, int write)
 {
+OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
 uint32_t ptr;
 uint32_t n;
 
@@ -545,12 +560,13 @@ static void ohci_copy_td(OHCIState *ohci, struct ohci_td *td,
 n = 0x1000 - (ptr & 0xfff);
 if (n > len)
 n = len;
-cpu_physical_memory_rw(ptr + ohci->localmem_base, buf, n, write);
+dma_memory_rw(&s->pci_dev.qdev, ptr + ohci->localmem_base, buf, n, write);
 if (n == len)
 return;
 ptr = td->be & ~0xfffu;
 buf += n;
-cpu_physical_memory_rw(ptr + ohci->localmem_base, buf, len - n, write);
+dma_memory_rw(&s->pci_dev.qdev, ptr + ohci->localmem_base,
+  buf, len - n, write);
 }
 
 /* Read/Write the contents of an ISO TD from/to main memory.  */
@@ -558,6 +574,7 @@ static void ohci_copy_iso_td(OHCIState *ohci,
  uint32_t start_addr, uint32_t end_addr,
  uint8_t *buf, int len, int write)
 {
+OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
 uint32_t ptr;
 uint32_t n;
 
@@ -565,12 +582,13 @@ static void ohci_copy_iso_td(OHCIState

[Qemu-devel] [RESEND][PATCH v3][ 5/7] guest agent: add guest agent RPCs/commands

2011-06-02 Thread Michael Roth

This adds the initial set of QMP/QAPI commands provided by the guest
agent:

guest-sync
guest-ping
guest-info
guest-file-open
guest-file-read
guest-file-write
guest-file-seek
guest-file-close
guest-fsfreeze-freeze
guest-fsfreeze-thaw
guest-fsfreeze-status

The input/output specifications for these commands are documented in the
schema.

Signed-off-by: Michael Roth 
---
 qga/guest-agent-commands.c |  497 
 1 files changed, 497 insertions(+), 0 deletions(-)
 create mode 100644 qga/guest-agent-commands.c
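
Note for readers: the guest-file-* commands refer to open files by an
integer handle kept in a small registry. The sketch below illustrates such a
registry in plain C, without GLib. All demo_* names are hypothetical and the
payload is an opaque pointer rather than a FILE *; it is an illustration of
the idea under those assumptions, not the code in this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One registry entry: an id handed to the client plus the host object. */
struct demo_handle {
    int64_t id;
    void *payload;
    struct demo_handle *next;
};

static struct demo_handle *demo_handles;
static int64_t demo_next_id;

/* Register a payload; returns its id, or -1 if allocation fails. */
static int64_t demo_handle_add(void *payload)
{
    struct demo_handle *h = malloc(sizeof(*h));

    if (!h) {
        return -1;
    }
    h->id = demo_next_id++;
    h->payload = payload;
    h->next = demo_handles;
    demo_handles = h;
    return h->id;
}

/* Look up a payload by id; returns NULL for an unknown id. */
static void *demo_handle_find(int64_t id)
{
    struct demo_handle *h;

    for (h = demo_handles; h; h = h->next) {
        if (h->id == id) {
            return h->payload;
        }
    }
    return NULL;
}

/* Drop an entry; returns 1 if it existed, 0 for an unknown id. */
static int demo_handle_remove(int64_t id)
{
    struct demo_handle **pp;

    for (pp = &demo_handles; *pp; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            struct demo_handle *dead = *pp;

            *pp = dead->next;
            free(dead);
            return 1;
        }
    }
    return 0;
}
```

Removing an unknown id is treated as a harmless no-op here, so a stale or
repeated close request from the host cannot crash the agent.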

diff --git a/qga/guest-agent-commands.c b/qga/guest-agent-commands.c
new file mode 100644
index 000..88865ee
--- /dev/null
+++ b/qga/guest-agent-commands.c
@@ -0,0 +1,497 @@
+/*
+ * QEMU Guest Agent commands
+ *
+ * Copyright IBM Corp. 2011
+ *
+ * Authors:
+ *  Michael Roth  
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+//#include "guest-agent.h"
+#include "qga/guest-agent-core.h"
+#include "qga-qmp-commands.h"
+
+static GAState *ga_state;
+
+static bool logging_enabled(void)
+{
+return ga_logging_enabled(ga_state);
+}
+
+static void disable_logging(void)
+{
+ga_disable_logging(ga_state);
+}
+
+static void enable_logging(void)
+{
+ga_enable_logging(ga_state);
+}
+
+/* Note: in some situations, like with the fsfreeze, logging may be
+ * temporarily disabled. If it is necessary that a command be able
+ * to log for accounting purposes, check logging_enabled() beforehand,
+ * and use QERR_QGA_LOGGING_FAILED to generate an error
+ */
+static void slog(const char *fmt, ...)
+{
+va_list ap;
+
+va_start(ap, fmt);
+g_logv("syslog", G_LOG_LEVEL_INFO, fmt, ap);
+va_end(ap);
+}
+
+int64_t qmp_guest_sync(int64_t id, Error **errp)
+{
+return id;
+}
+
+void qmp_guest_ping(Error **err)
+{
+slog("guest-ping called");
+}
+
+struct GuestAgentInfo *qmp_guest_info(Error **err)
+{
+GuestAgentInfo *info = g_malloc0(sizeof(GuestAgentInfo));
+
+info->version = g_strdup(QGA_VERSION);
+info->timeout_ms = ga_get_timeout(ga_state);
+
+return info;
+}
+
+void qmp_guest_shutdown(const char *shutdown_mode, Error **err)
+{
+int ret;
+const char *shutdown_flag;
+
+if (!logging_enabled()) {
+error_set(err, QERR_QGA_LOGGING_FAILED);
+return;
+}
+
+slog("guest-shutdown called, shutdown_mode: %s", shutdown_mode);
+if (strcmp(shutdown_mode, "halt") == 0) {
+shutdown_flag = "-H";
+} else if (strcmp(shutdown_mode, "powerdown") == 0) {
+shutdown_flag = "-P";
+} else if (strcmp(shutdown_mode, "reboot") == 0) {
+shutdown_flag = "-r";
+} else {
+error_set(err, QERR_INVALID_PARAMETER_VALUE, "shutdown_mode",
+  "halt|powerdown|reboot");
+return;
+}
+
+ret = fork();
+if (ret == 0) {
+/* child, start the shutdown */
+setsid();
+fclose(stdin);
+fclose(stdout);
+fclose(stderr);
+
+sleep(5);
+ret = execl("/sbin/shutdown", "shutdown", shutdown_flag, "+0",
+"hypervisor initiated shutdown", (char*)NULL);
+exit(!!ret);
+} else if (ret < 0) {
+error_set(err, QERR_UNDEFINED_ERROR);
+}
+}
+
+typedef struct GuestFileHandle {
+uint64_t id;
+FILE *fh;
+} GuestFileHandle;
+
+static struct {
+GSList *filehandles;
+uint64_t last_id;
+} guest_file_state;
+
+static int64_t guest_file_handle_add(FILE *fh)
+{
+GuestFileHandle *gfh;
+
+gfh = g_malloc(sizeof(GuestFileHandle));
+gfh->id = guest_file_state.last_id++;
+gfh->fh = fh;
+guest_file_state.filehandles = g_slist_append(guest_file_state.filehandles,
+  gfh);
+return gfh->id;
+}
+
+static gint guest_file_handle_match(gconstpointer elem, gconstpointer id_p)
+{
+const uint64_t *id = id_p;
+const GuestFileHandle *gfh = elem;
+
+g_assert(gfh);
+return (gfh->id != *id);
+}
+
+static FILE *guest_file_handle_find(int64_t id)
+{
+GSList *elem = g_slist_find_custom(guest_file_state.filehandles, &id,
+   guest_file_handle_match);
+GuestFileHandle *gfh;
+
+if (elem) {
+g_assert(elem->data);
+gfh = elem->data;
+return gfh->fh;
+}
+
+return NULL;
+}
+
+static void guest_file_handle_remove(int64_t id)
+{
+GSList *elem = g_slist_find_custom(guest_file_state.filehandles, &id,
+   guest_file_handle_match);
+gpointer data = elem ? elem->data : NULL;
+
+if (!data) {
+return;
+}
+guest_file_state.filehandles = g_slist_remove(guest_file_state.filehandles,
+  data);
+g_free(data);
+}
+
+int64_t qmp_guest_file_open(const char *filename, const char *mode, Error **err)
+{
+FILE *fh;
+

[Qemu-devel] [PATCH 05/14] rtl8139: use the DMA memory access interface

2011-06-02 Thread David Gibson

This allows the device to work properly with an emulated IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu 
---
 hw/rtl8139.c |  104 -
 1 files changed, 58 insertions(+), 46 deletions(-)

diff --git a/hw/rtl8139.c b/hw/rtl8139.c
index 2f8db58..8d0306a 100644
--- a/hw/rtl8139.c
+++ b/hw/rtl8139.c
@@ -53,6 +53,7 @@
 
 #include "hw.h"
 #include "pci.h"
+#include "dma_rw.h"
 #include "qemu-timer.h"
 #include "net.h"
 #include "loader.h"
@@ -424,12 +425,6 @@ typedef struct RTL8139TallyCounters
 uint16_t   TxUndrn;
 } RTL8139TallyCounters;
 
-/* Clears all tally counters */
-static void RTL8139TallyCounters_clear(RTL8139TallyCounters* counters);
-
-/* Writes tally counters to specified physical memory address */
-static void RTL8139TallyCounters_physical_memory_write(target_phys_addr_t tc_addr, RTL8139TallyCounters* counters);
-
 typedef struct RTL8139State {
 PCIDevice dev;
 uint8_t phys[8]; /* mac address */
@@ -510,6 +505,14 @@ typedef struct RTL8139State {
 int rtl8139_mmio_io_addr_dummy;
 } RTL8139State;
 
+/* Clears all tally counters */
+static void RTL8139TallyCounters_clear(RTL8139TallyCounters* counters);
+
+/* Writes tally counters to specified physical memory address */
+static void
+RTL8139TallyCounters_physical_memory_write(RTL8139State *s,
+   target_phys_addr_t tc_addr);
+
 static void rtl8139_set_next_tctr_time(RTL8139State *s, int64_t current_time);
 
 static void prom9346_decode_command(EEprom9346 *eeprom, uint8_t command)
@@ -771,15 +774,15 @@ static void rtl8139_write_buffer(RTL8139State *s, const void *buf, int size)
 
 if (size > wrapped)
 {
-cpu_physical_memory_write( s->RxBuf + s->RxBufAddr,
-   buf, size-wrapped );
+dma_memory_write(&s->dev.qdev, s->RxBuf + s->RxBufAddr,
+ buf, size-wrapped);
 }
 
 /* reset buffer pointer */
 s->RxBufAddr = 0;
 
-cpu_physical_memory_write( s->RxBuf + s->RxBufAddr,
-   buf + (size-wrapped), wrapped );
+dma_memory_write(&s->dev.qdev, s->RxBuf + s->RxBufAddr,
+ buf + (size-wrapped), wrapped);
 
 s->RxBufAddr = wrapped;
 
@@ -788,7 +791,7 @@ static void rtl8139_write_buffer(RTL8139State *s, const void *buf, int size)
 }
 
 /* non-wrapping path or overwrapping enabled */
-cpu_physical_memory_write( s->RxBuf + s->RxBufAddr, buf, size );
+dma_memory_write(&s->dev.qdev, s->RxBuf + s->RxBufAddr, buf, size);
 
 s->RxBufAddr += size;
 }
@@ -988,13 +991,17 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_
 
 uint32_t val, rxdw0,rxdw1,rxbufLO,rxbufHI;
 
-cpu_physical_memory_read(cplus_rx_ring_desc,(uint8_t *)&val, 4);
+dma_memory_read(&s->dev.qdev, cplus_rx_ring_desc,
+(uint8_t *)&val, 4);
 rxdw0 = le32_to_cpu(val);
-cpu_physical_memory_read(cplus_rx_ring_desc+4,  (uint8_t *)&val, 4);
+dma_memory_read(&s->dev.qdev, cplus_rx_ring_desc+4,
+(uint8_t *)&val, 4);
 rxdw1 = le32_to_cpu(val);
-cpu_physical_memory_read(cplus_rx_ring_desc+8,  (uint8_t *)&val, 4);
+dma_memory_read(&s->dev.qdev, cplus_rx_ring_desc+8,
+(uint8_t *)&val, 4);
 rxbufLO = le32_to_cpu(val);
-cpu_physical_memory_read(cplus_rx_ring_desc+12, (uint8_t *)&val, 4);
+dma_memory_read(&s->dev.qdev, cplus_rx_ring_desc+12,
+(uint8_t *)&val, 4);
 rxbufHI = le32_to_cpu(val);
 
 DPRINTF("+++ C+ mode RX descriptor %d %08x %08x %08x %08x\n",
@@ -1062,12 +1069,12 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_
 
 /* receive/copy to target memory */
 if (dot1q_buf) {
-cpu_physical_memory_write(rx_addr, buf, 2 * ETHER_ADDR_LEN);
-cpu_physical_memory_write(rx_addr + 2 * ETHER_ADDR_LEN,
-buf + 2 * ETHER_ADDR_LEN + VLAN_HLEN,
-size - 2 * ETHER_ADDR_LEN);
+dma_memory_write(&s->dev.qdev, rx_addr, buf, 2 * ETHER_ADDR_LEN);
+dma_memory_write(&s->dev.qdev, rx_addr + 2 * ETHER_ADDR_LEN,
+ buf + 2 * ETHER_ADDR_LEN + VLAN_HLEN,
+ size - 2 * ETHER_ADDR_LEN);
 } else {
-cpu_physical_memory_write(rx_addr, buf, size);
+dma_memory_write(&s->dev.qdev, rx_addr, buf, size);
 }
 
 if (s->CpCmd & CPlusRxChkSum)
@@ -1077,7 +1084,7 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_
 
 /* write checksum */
 val = cpu_to_le32(crc32(0, buf, size_));
-cpu_physical_memory_write( rx_addr+size, (uint8_t *)&val, 4);
+dma

[Qemu-devel] [PATCH 09/14] e1000: use the DMA memory access interface

2011-06-02 Thread David Gibson

This allows the device to work properly with an emulated IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu 
---
 hw/e1000.c |   27 +++
 1 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/hw/e1000.c b/hw/e1000.c
index f160bfc..b10c6d6 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -472,7 +472,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 bytes = split_size;
 if (tp->size + bytes > msh)
 bytes = msh - tp->size;
-cpu_physical_memory_read(addr, tp->data + tp->size, bytes);
+dma_memory_read(&s->dev.qdev, addr, tp->data + tp->size, bytes);
 if ((sz = tp->size + bytes) >= hdr && tp->size < hdr)
 memmove(tp->header, tp->data, hdr);
 tp->size = sz;
@@ -487,7 +487,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 // context descriptor TSE is not set, while data descriptor TSE is set
 DBGOUT(TXERR, "TCP segmentaion Error\n");
 } else {
-cpu_physical_memory_read(addr, tp->data + tp->size, split_size);
+dma_memory_read(&s->dev.qdev, addr, tp->data + tp->size, split_size);
 tp->size += split_size;
 }
 
@@ -503,7 +503,9 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 }
 
 static uint32_t
-txdesc_writeback(target_phys_addr_t base, struct e1000_tx_desc *dp)
+txdesc_writeback(E1000State *s,
+ target_phys_addr_t base,
+ struct e1000_tx_desc *dp)
 {
 uint32_t txd_upper, txd_lower = le32_to_cpu(dp->lower.data);
 
@@ -512,8 +514,9 @@ txdesc_writeback(target_phys_addr_t base, struct e1000_tx_desc *dp)
 txd_upper = (le32_to_cpu(dp->upper.data) | E1000_TXD_STAT_DD) &
 ~(E1000_TXD_STAT_EC | E1000_TXD_STAT_LC | E1000_TXD_STAT_TU);
 dp->upper.data = cpu_to_le32(txd_upper);
-cpu_physical_memory_write(base + ((char *)&dp->upper - (char *)dp),
-  (void *)&dp->upper, sizeof(dp->upper));
+dma_memory_write(&s->dev.qdev,
+ base + ((char *)&dp->upper - (char *)dp),
+ (void *)&dp->upper, sizeof(dp->upper));
 return E1000_ICR_TXDW;
 }
 
@@ -540,14 +543,14 @@ start_xmit(E1000State *s)
 while (s->mac_reg[TDH] != s->mac_reg[TDT]) {
 base = tx_desc_base(s) +
sizeof(struct e1000_tx_desc) * s->mac_reg[TDH];
-cpu_physical_memory_read(base, (void *)&desc, sizeof(desc));
+dma_memory_read(&s->dev.qdev, base, (void *)&desc, sizeof(desc));
 
 DBGOUT(TX, "index %d: %p : %x %x\n", s->mac_reg[TDH],
(void *)(intptr_t)desc.buffer_addr, desc.lower.data,
desc.upper.data);
 
 process_tx_desc(s, &desc);
-cause |= txdesc_writeback(base, &desc);
+cause |= txdesc_writeback(s, base, &desc);
 
 if (++s->mac_reg[TDH] * sizeof(desc) >= s->mac_reg[TDLEN])
 s->mac_reg[TDH] = 0;
@@ -717,7 +720,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
 desc_size = s->rxbuf_size;
 }
 base = rx_desc_base(s) + sizeof(desc) * s->mac_reg[RDH];
-cpu_physical_memory_read(base, (void *)&desc, sizeof(desc));
+dma_memory_read(&s->dev.qdev, base, (void *)&desc, sizeof(desc));
 desc.special = vlan_special;
 desc.status |= (vlan_status | E1000_RXD_STAT_DD);
 if (desc.buffer_addr) {
@@ -726,9 +729,9 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
 if (copy_size > s->rxbuf_size) {
 copy_size = s->rxbuf_size;
 }
-cpu_physical_memory_write(le64_to_cpu(desc.buffer_addr),
-  (void *)(buf + desc_offset + vlan_offset),
-  copy_size);
+dma_memory_write(&s->dev.qdev, le64_to_cpu(desc.buffer_addr),
+ (void *)(buf + desc_offset + vlan_offset),
+ copy_size);
 }
 desc_offset += desc_size;
 desc.length = cpu_to_le16(desc_size);
@@ -742,7 +745,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
 } else { // as per intel docs; skip descriptors with null buf addr
 DBGOUT(RX, "Null RX descriptor!!\n");
 }
-cpu_physical_memory_write(base, (void *)&desc, sizeof(desc));
+dma_memory_write(&s->dev.qdev, base, (void *)&desc, sizeof(desc));
 
 if (++s->mac_reg[RDH] * sizeof(desc) >= s->mac_reg[RDLEN])
 s->mac_reg[RDH] = 0;
-- 
1.7.4.4

[Qemu-devel] [PATCH 01/14] Generic DMA memory access interface

2011-06-02 Thread David Gibson

This introduces replacements for memory access functions like
cpu_physical_memory_read(). The new interface can handle address
translation and access checking through an IOMMU.

David Gibson: I have made several bugfixes and cleanups to Eduard's
original patch.

 * dma_memory_rw() was incorrectly using (uninitialized) plen instead
   of len in the fallback to no-IOMMU case.

 * The dma_memory_map() tracking stored the guest physical address of
   each mapping, but not the qemu user virtual address.  However,
   unmap() then attempted to look up by virtual address using a
   completely bogus cast.

 * The dma_memory_rw() function is moved from dma_rw.h to dma_rw.c; it
   was a bit too much code for an inline.

 * IOMMU support is now available on all target platforms, not just
   i386, but is configurable (--enable-iommu/--disable-iommu).  Stubs
   are used so that individual drivers can use the new dma interface
   and it will turn into old-style cpu physical accesses at no cost on
   IOMMU-less builds.

Signed-off-by: Eduard - Gabriel Munteanu 
Signed-off-by: David Gibson 
---
 Makefile.target |1 +
 configure   |   12 +++
 hw/dma_rw.c |  219 +++
 hw/dma_rw.h |  149 +
 hw/qdev.h   |5 +
 5 files changed, 386 insertions(+), 0 deletions(-)
 create mode 100644 hw/dma_rw.c
 create mode 100644 hw/dma_rw.h
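
Note for readers: the stub behaviour described above (drivers call the DMA
interface, and on IOMMU-less builds the call collapses into a plain physical
access) can be sketched as follows. This is an illustration under stated
assumptions, not the code in hw/dma_rw.[ch]: every demo_* name, the
DEMO_CONFIG_IOMMU switch, the translate hook and the fake guest RAM array
are hypothetical, and the sketch translates a whole range in one step
instead of page by page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t demo_dma_addr_t;

/* Hypothetical device: an optional translation hook models the IOMMU. */
struct demo_device {
    /* Returns 0 and fills *paddr on success, nonzero if access is denied. */
    int (*translate)(demo_dma_addr_t addr, demo_dma_addr_t *paddr);
};

static uint8_t demo_guest_ram[256];

/* Stand-in for the old-style direct physical read, with a bounds check. */
static int demo_physical_read(demo_dma_addr_t paddr, void *buf, size_t len)
{
    if (paddr > sizeof(demo_guest_ram) ||
        len > sizeof(demo_guest_ram) - paddr) {
        return -1;
    }
    memcpy(buf, &demo_guest_ram[paddr], len);
    return 0;
}

/* Device-visible read: translated when an IOMMU is configured and the
 * device has a hook, otherwise identical to the direct physical read. */
static int demo_dma_memory_read(struct demo_device *dev, demo_dma_addr_t addr,
                                void *buf, size_t len)
{
#ifdef DEMO_CONFIG_IOMMU
    if (dev && dev->translate) {
        demo_dma_addr_t paddr;

        if (dev->translate(addr, &paddr) != 0) {
            return -1;
        }
        return demo_physical_read(paddr, buf, len);
    }
#else
    (void)dev; /* no IOMMU support compiled in: the device is irrelevant */
#endif
    return demo_physical_read(addr, buf, len);
}
```

With the switch undefined, the compiler sees only the direct path, which is
the sense in which converted drivers pay nothing on IOMMU-less builds.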

diff --git a/Makefile.target b/Makefile.target
index 4f97b26..76fd734 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -198,6 +198,7 @@ obj-$(CONFIG_REALLY_VIRTFS) += 9pfs/virtio-9p.o
 obj-y += rwhandler.o
 obj-$(CONFIG_KVM) += kvm.o kvm-all.o
 obj-$(CONFIG_NO_KVM) += kvm-stub.o
+obj-$(CONFIG_IOMMU) += dma_rw.o
 LIBS+=-lz
 
 QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
diff --git a/configure b/configure
index a318d37..56f9616 100755
--- a/configure
+++ b/configure
@@ -131,6 +131,7 @@ xen_ctrl_version=""
 linux_aio=""
 attr=""
 vhost_net=""
+iommu="no"
 xfs=""
 
 gprof="no"
@@ -724,6 +725,10 @@ for opt do
   ;;
   --enable-vhost-net) vhost_net="yes"
   ;;
+  --enable-iommu) iommu="yes"
+  ;;
+  --disable-iommu) iommu="no"
+  ;;
   --disable-opengl) opengl="no"
   ;;
   --enable-opengl) opengl="yes"
@@ -1006,6 +1011,8 @@ echo "  --enable-docs            enable documentation build"
 echo "  --disable-docs   disable documentation build"
 echo "  --disable-vhost-net  disable vhost-net acceleration support"
 echo "  --enable-vhost-net   enable vhost-net acceleration support"
+echo "  --disable-iommu  disable IOMMU emulation support"
+echo "  --enable-iommu   enable IOMMU emulation support"
 echo "  --enable-trace-backend=B Set trace backend"
 echo "   Available backends:" $("$source_path"/scripts/tracetool --list-backends)
 echo "  --with-trace-file=NAME   Full PATH,NAME of file to store traces"
@@ -2702,6 +2709,7 @@ echo "madvise   $madvise"
 echo "posix_madvise $posix_madvise"
 echo "uuid support  $uuid"
 echo "vhost-net support $vhost_net"
+echo "IOMMU support $iommu"
 echo "Trace backend $trace_backend"
 echo "Trace output file $trace_file-"
 echo "spice support $spice"
@@ -3515,6 +3523,10 @@ if test "$target_softmmu" = "yes" -a \( \
   echo "CONFIG_NEED_MMU=y" >> $config_target_mak
 fi
 
+if test "$iommu" = "yes" ; then
+  echo "CONFIG_IOMMU=y" >> $config_host_mak
+fi
+
 if test "$gprof" = "yes" ; then
   echo "TARGET_GPROF=yes" >> $config_target_mak
   if test "$target_linux_user" = "yes" ; then
diff --git a/hw/dma_rw.c b/hw/dma_rw.c
new file mode 100644
index 000..6586425
--- /dev/null
+++ b/hw/dma_rw.c
@@ -0,0 +1,219 @@
+/*
+ * Generic DMA memory access interface.
+ *
+ * Copyright (c) 2011 Eduard - Gabriel Munteanu
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "dma_rw.h"
+#include "range.h"
+
+/* #define DEBUG_DMA */
+
+int dma_memory_check(DeviceState *dev, dma_add

[Qemu-devel] [PATCH 03/14] AMD IOMMU emulation

2011-06-02 Thread David Gibson

This introduces emulation for the AMD IOMMU, described in "AMD I/O
Virtualization Technology (IOMMU) Specification".

Signed-off-by: Eduard - Gabriel Munteanu 
---
 Makefile.target |1 +
 hw/amd_iommu.c  |  701 +++
 hw/pc.c |2 +
 hw/pci_ids.h|2 +
 hw/pci_regs.h   |1 +
 5 files changed, 707 insertions(+), 0 deletions(-)
 create mode 100644 hw/amd_iommu.c

diff --git a/Makefile.target b/Makefile.target
index 76fd734..042ba1b 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -234,6 +234,7 @@ obj-$(CONFIG_IVSHMEM) += ivshmem.o
 obj-i386-y += vga.o
 obj-i386-y += mc146818rtc.o i8259.o pc.o
 obj-i386-y += cirrus_vga.o apic.o ioapic.o piix_pci.o
+obj-i386-$(CONFIG_IOMMU) += amd_iommu.o
 obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
diff --git a/hw/amd_iommu.c b/hw/amd_iommu.c
new file mode 100644
index 000..b487fae
--- /dev/null
+++ b/hw/amd_iommu.c
@@ -0,0 +1,701 @@
+/*
+ * AMD IOMMU emulation
+ *
+ * Copyright (c) 2011 Eduard - Gabriel Munteanu
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "pc.h"
+#include "hw.h"
+#include "pci.h"
+#include "pci_internals.h"
+#include "qlist.h"
+#include "dma_rw.h"
+
+/* Capability registers */
+#define CAPAB_HEADER    0x00
+#define   CAPAB_REV_TYPE  0x02
+#define   CAPAB_FLAGS   0x03
+#define CAPAB_BAR_LOW   0x04
+#define CAPAB_BAR_HIGH  0x08
+#define CAPAB_RANGE 0x0C
+#define CAPAB_MISC  0x10
+
+#define CAPAB_SIZE  0x14
+#define CAPAB_REG_SIZE  0x04
+
+/* Capability header data */
+#define CAPAB_FLAG_IOTLBSUP (1 << 0)
+#define CAPAB_FLAG_HTTUNNEL (1 << 1)
+#define CAPAB_FLAG_NPCACHE  (1 << 2)
+#define CAPAB_INIT_REV  (1 << 3)
+#define CAPAB_INIT_TYPE 3
+#define CAPAB_INIT_REV_TYPE (CAPAB_INIT_REV | CAPAB_INIT_TYPE)
+#define CAPAB_INIT_FLAGS    (CAPAB_FLAG_NPCACHE | CAPAB_FLAG_HTTUNNEL)
+#define CAPAB_INIT_MISC     ((64 << 15) | (48 << 8))
+#define CAPAB_BAR_MASK  ~((1UL << 14) - 1)
+
+/* MMIO registers */
+#define MMIO_DEVICE_TABLE   0x
+#define MMIO_COMMAND_BASE   0x0008
+#define MMIO_EVENT_BASE 0x0010
+#define MMIO_CONTROL    0x0018
+#define MMIO_EXCL_BASE  0x0020
+#define MMIO_EXCL_LIMIT 0x0028
+#define MMIO_COMMAND_HEAD   0x2000
+#define MMIO_COMMAND_TAIL   0x2008
+#define MMIO_EVENT_HEAD 0x2010
+#define MMIO_EVENT_TAIL 0x2018
+#define MMIO_STATUS 0x2020
+
+#define MMIO_SIZE   0x4000
+
+#define MMIO_DEVTAB_SIZE_MASK   ((1ULL << 12) - 1)
+#define MMIO_DEVTAB_BASE_MASK   (((1ULL << 52) - 1) & ~MMIO_DEVTAB_SIZE_MASK)
+#define MMIO_DEVTAB_ENTRY_SIZE  32
+#define MMIO_DEVTAB_SIZE_UNIT   4096
+
+#define MMIO_CMDBUF_SIZE_BYTE   (MMIO_COMMAND_BASE + 7)
+#define MMIO_CMDBUF_SIZE_MASK   0x0F
+#define MMIO_CMDBUF_BASE_MASK   MMIO_DEVTAB_BASE_MASK
+#define MMIO_CMDBUF_DEFAULT_SIZE 8
+#define MMIO_CMDBUF_HEAD_MASK   (((1ULL << 19) - 1) & ~0x0F)
+#define MMIO_CMDBUF_TAIL_MASK   MMIO_EVTLOG_HEAD_MASK
+
+#define MMIO_EVTLOG_SIZE_BYTE   (MMIO_EVENT_BASE + 7)
+#define MMIO_EVTLOG_SIZE_MASK   MMIO_CMDBUF_SIZE_MASK
+#define MMIO_EVTLOG_BASE_MASK   MMIO_CMDBUF_BASE_MASK
+#define MMIO_EVTLOG_DEFAULT_SIZE MMIO_CMDBUF_DEFAULT_SIZE
+#define MMIO_EVTLOG_HEAD_MASK   (((1ULL << 19) - 1) & ~0x0F)
+#define MMIO_EVTLOG_TAIL_MASK   MMIO_EVTLOG_HEAD_MASK
+
+#define MMIO_EXCL_BASE_MASK MMIO_DEVTAB_BASE_MASK
+#define MMIO_EXCL_ENABLED_MASK  (1ULL << 0)
+#define MMIO_EXCL_ALLOW_MASK    (1ULL << 1)
+#define MMIO_EXCL_LIMIT_MASK    MMIO_DEVTAB_BASE_MASK
+#define MMIO_EXCL_LIMIT_LOW 0xFFF
+
+#define MMIO_CONTROL_IOMMUEN    (1ULL << 0)
+#define MMIO_CONTROL_HTTUNEN    (1ULL << 1)
+#define MMIO_CONTROL_EVENTLOGEN

[Qemu-devel] [PATCH 08/14] es1370: use the DMA memory access interface

2011-06-02 Thread David Gibson

This allows the device to work properly with an emulated IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu 
---
 hw/es1370.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/es1370.c b/hw/es1370.c
index 40cb48c..8880746 100644
--- a/hw/es1370.c
+++ b/hw/es1370.c
@@ -802,7 +802,7 @@ static void es1370_transfer_audio (ES1370State *s, struct chan *d, int loop_sel,
 if (!acquired)
 break;
 
-cpu_physical_memory_write (addr, tmpbuf, acquired);
+dma_memory_write (&s->dev.qdev, addr, tmpbuf, acquired);
 
 temp -= acquired;
 addr += acquired;
@@ -816,7 +816,7 @@ static void es1370_transfer_audio (ES1370State *s, struct chan *d, int loop_sel,
 int copied, to_copy;
 
 to_copy = audio_MIN ((size_t) temp, sizeof (tmpbuf));
-cpu_physical_memory_read (addr, tmpbuf, to_copy);
+dma_memory_read (&s->dev.qdev, addr, tmpbuf, to_copy);
 copied = AUD_write (voice, tmpbuf, to_copy);
 if (!copied)
 break;
-- 
1.7.4.4

[Qemu-devel] [PATCH 12/14] usb-uhci: use the DMA memory access interface

2011-06-02 Thread David Gibson

This allows the device to work properly with an emulated IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu 
---
 hw/usb-uhci.c |   26 ++
 1 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/hw/usb-uhci.c b/hw/usb-uhci.c
index c0de05b..bf43a0f 100644
--- a/hw/usb-uhci.c
+++ b/hw/usb-uhci.c
@@ -683,7 +683,7 @@ static int uhci_complete_td(UHCIState *s, UHCI_TD *td, UHCIAsync *async, uint32_
 
 if (len > 0) {
 /* write the data back */
-cpu_physical_memory_write(td->buffer, async->buffer, len);
+dma_memory_write(&s->dev.qdev, td->buffer, async->buffer, len);
 }
 
 if ((td->ctrl & TD_CTRL_SPD) && len < max_len) {
@@ -803,7 +803,7 @@ static int uhci_handle_td(UHCIState *s, uint32_t addr, UHCI_TD *td, uint32_t *in
 switch(pid) {
 case USB_TOKEN_OUT:
 case USB_TOKEN_SETUP:
-cpu_physical_memory_read(td->buffer, async->buffer, max_len);
+dma_memory_read(&s->dev.qdev, td->buffer, async->buffer, max_len);
 len = uhci_broadcast_packet(s, &async->packet);
 if (len >= 0)
 len = max_len;
@@ -846,7 +846,7 @@ static void uhci_async_complete(USBDevice *dev, USBPacket *packet)
 uint32_t link = async->td;
 uint32_t int_mask = 0, val;
 
-cpu_physical_memory_read(link & ~0xf, (uint8_t *) &td, sizeof(td));
+dma_memory_read(&s->dev.qdev, link & ~0xf, (uint8_t *) &td, sizeof(td));
 le32_to_cpus(&td.link);
 le32_to_cpus(&td.ctrl);
 le32_to_cpus(&td.token);
@@ -858,8 +858,8 @@ static void uhci_async_complete(USBDevice *dev, USBPacket *packet)
 
 /* update the status bits of the TD */
 val = cpu_to_le32(td.ctrl);
-cpu_physical_memory_write((link & ~0xf) + 4,
-  (const uint8_t *)&val, sizeof(val));
+dma_memory_write(&s->dev.qdev, (link & ~0xf) + 4,
+ (const uint8_t *)&val, sizeof(val));
 uhci_async_free(s, async);
 } else {
 async->done = 1;
@@ -922,7 +922,7 @@ static void uhci_process_frame(UHCIState *s)
 
 DPRINTF("uhci: processing frame %d addr 0x%x\n" , s->frnum, frame_addr);
 
-cpu_physical_memory_read(frame_addr, (uint8_t *)&link, 4);
+dma_memory_read(&s->dev.qdev, frame_addr, (uint8_t *)&link, 4);
 le32_to_cpus(&link);
 
 int_mask = 0;
@@ -946,7 +946,8 @@ static void uhci_process_frame(UHCIState *s)
 break;
 }
 
-cpu_physical_memory_read(link & ~0xf, (uint8_t *) &qh, sizeof(qh));
+dma_memory_read(&s->dev.qdev,
+link & ~0xf, (uint8_t *) &qh, sizeof(qh));
 le32_to_cpus(&qh.link);
 le32_to_cpus(&qh.el_link);
 
@@ -966,7 +967,8 @@ static void uhci_process_frame(UHCIState *s)
 }
 
 /* TD */
-cpu_physical_memory_read(link & ~0xf, (uint8_t *) &td, sizeof(td));
+dma_memory_read(&s->dev.qdev,
+link & ~0xf, (uint8_t *) &td, sizeof(td));
 le32_to_cpus(&td.link);
 le32_to_cpus(&td.ctrl);
 le32_to_cpus(&td.token);
@@ -980,8 +982,8 @@ static void uhci_process_frame(UHCIState *s)
 if (old_td_ctrl != td.ctrl) {
 /* update the status bits of the TD */
 val = cpu_to_le32(td.ctrl);
-cpu_physical_memory_write((link & ~0xf) + 4,
-  (const uint8_t *)&val, sizeof(val));
+dma_memory_write(&s->dev.qdev, (link & ~0xf) + 4,
+ (const uint8_t *)&val, sizeof(val));
 }
 
 if (ret < 0) {
@@ -1009,8 +1011,8 @@ static void uhci_process_frame(UHCIState *s)
/* update QH element link */
 qh.el_link = link;
 val = cpu_to_le32(qh.el_link);
-cpu_physical_memory_write((curr_qh & ~0xf) + 4,
-  (const uint8_t *)&val, sizeof(val));
+dma_memory_write(&s->dev.qdev, (curr_qh & ~0xf) + 4,
+ (const uint8_t *)&val, sizeof(val));
 
 if (!depth_first(link)) {
/* done with this QH */
-- 
1.7.4.4

[Qemu-devel] [RESEND][PATCH v3][ 6/7] guest agent: add guest agent commands schema file

2011-06-02 Thread Michael Roth


Signed-off-by: Michael Roth 
---
 qapi-schema-guest.json |  198 
 1 files changed, 198 insertions(+), 0 deletions(-)
 create mode 100644 qapi-schema-guest.json

diff --git a/qapi-schema-guest.json b/qapi-schema-guest.json
new file mode 100644
index 000..be79c49
--- /dev/null
+++ b/qapi-schema-guest.json
@@ -0,0 +1,198 @@
+# *-*- Mode: Python -*-*
+
+##
+# @guest-sync:
+#
+# Echo back a unique integer value
+#
+# This is used by clients talking to the guest agent over the
+# wire to ensure the stream is in sync and doesn't contain stale
+# data from a previous client. All guest agent responses should be
+# ignored until the provided unique integer value is returned,
+# and it is up to the client to handle stale whole or
+# partially-delivered JSON text in such a way that this response
+# can be obtained.
+#
+# Such clients should also precede this command
+# with a 0xFF byte to make sure the guest agent flushes any
+# partially read JSON data from a previous session.
+#
+# @id: randomly generated 64-bit integer
+#
+# Returns: The unique integer id passed in by the client
+#
+# Since: 0.15.0
+##
+{ 'command': 'guest-sync',
+  'data':{ 'id': 'int' },
+  'returns': 'int' }
+
+##
+# @guest-ping:
+#
+# Ping the guest agent, a non-error return implies success
+#
+# Since: 0.15.0
+##
+{ 'command': 'guest-ping' }
+
+##
+# @guest-info:
+#
+# Get some information about the guest agent.
+#
+# Since: 0.15.0
+##
+{ 'type': 'GuestAgentInfo', 'data': {'version': 'str', 'timeout_ms': 'int'} }
+{ 'command': 'guest-info',
+  'returns': 'GuestAgentInfo' }
+
+##
+# @guest-shutdown:
+#
+# Initiate guest-activated shutdown
+#
+# @shutdown_mode: "halt", "powerdown", or "reboot"
+#
+# Returns: Nothing on success
+#
+# Since: 0.15.0
+##
+{ 'command': 'guest-shutdown', 'data': { 'shutdown_mode': 'str' } }
+
+##
+# @guest-file-open:
+#
+# Open a file in the guest and retrieve a file handle for it
+#
+# @filename: Full path to the file in the guest to open.
+#
+# @mode: #optional open mode, as per fopen(), "r" is the default.
+#
+# Returns: Guest file handle on success.
+#  If @filename cannot be opened, OpenFileFailed
+#
+# Since: 0.15.0
+##
+{ 'command': 'guest-file-open',
+  'data':{ 'filename': 'str', 'mode': 'str' },
+  'returns': 'int' }
+
+##
+# @guest-file-read:
+#
+# Read from an open file in the guest
+#
+# @filehandle: filehandle returned by guest-file-open
+#
+# @count: maximum number of bytes to read
+#
+# Returns: GuestFileRead on success.
+#  If @filehandle cannot be found, OpenFileFailed
+#
+# Since: 0.15.0
+##
+{ 'type': 'GuestFileRead',
+  'data': { 'count': 'int', 'buf': 'str', 'eof': 'bool' } }
+
+{ 'command': 'guest-file-read',
+  'data':{ 'filehandle': 'int', 'count': 'int' },
+  'returns': 'GuestFileRead' }
+
+##
+# @guest-file-write:
+#
+# Write to an open file in the guest
+#
+# @filehandle: filehandle returned by guest-file-open
+#
+# @data_b64: base64-encoded string representing data to be written
+#
+# @count: bytes to write (actual bytes, after b64-decode)
+#
+# Returns: GuestFileWrite on success.
+#  If @filehandle cannot be found, OpenFileFailed
+#
+# Since: 0.15.0
+##
+{ 'type': 'GuestFileWrite',
+  'data': { 'count': 'int', 'eof': 'bool' } }
+{ 'command': 'guest-file-write',
+  'data':{ 'filehandle': 'int', 'data_b64': 'str', 'count': 'int' },
+  'returns': 'GuestFileWrite' }
+
+##
+# @guest-file-seek:
+#
+# Seek to a position in the file, as with fseek(), and return the
+# current file position afterward. Also encapsulates ftell()'s
+# functionality, just set offset=0, whence=SEEK_CUR.
+#
+# @filehandle: filehandle returned by guest-file-open
+#
+# @offset: bytes to skip over in the file stream
+#
+# @whence: SEEK_SET, SEEK_CUR, or SEEK_END, as with fseek()
+#
+# Returns: GuestFileSeek on success.
+#  If @filehandle cannot be found, OpenFileFailed
+#
+# Since: 0.15.0
+##
+{ 'type': 'GuestFileSeek',
+  'data': { 'position': 'int', 'eof': 'bool' } }
+
+{ 'command': 'guest-file-seek',
+  'data':{ 'filehandle': 'int', 'offset': 'int', 'whence': 'int' },
+  'returns': 'GuestFileSeek' }
+
+##
+# @guest-file-close:
+#
+# Close an open file in the guest
+#
+# @filehandle: filehandle returned by guest-file-open
+#
+# Returns: Nothing on success.
+#  If @filehandle cannot be found, OpenFileFailed
+#
+# Since: 0.15.0
+##
+{ 'command': 'guest-file-close',
+  'data': { 'filehandle': 'int' } }
+
+##
+# @guest-fsfreeze-status:
+#
+# get guest fsfreeze state
+#
+# Returns: Status of fsfreeze state
+#
+# Since: 0.15.0
+##
+{ 'command': 'guest-fsfreeze-status',
+  'returns': 'int' }
+
+##
+# @guest-fsfreeze-freeze:
+#
+# Sync and freeze all non-network guest filesystems
+#
+# Returns: Number of file systems frozen
+#
+# Since: 0.15.0
+##
+{ 'command': 'guest-fsfreeze-freeze',
+  'returns': 'int' }
+
+##
+# @guest-fsfreeze-thaw:
+#
+# Unfreeze frozen guest filesystems
+#
+# Returns: Number of file systems thawed
+#
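
The guest-sync handshake described in the schema comments can be sketched from the client side as follows. This is only an illustration, not code from the patch: the helper names are hypothetical, the "execute"/"arguments" wire layout is assumed to follow the QMP convention, and the replies are assumed to have been parsed into integers already.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the bytes a client would send, i.e. a 0xFF
 * flush byte followed by the guest-sync request. The JSON layout is an
 * assumption based on the QMP convention.
 * Returns the total length, or -1 if buf is too small. */
static int build_guest_sync(char *buf, size_t size, int64_t id)
{
    int n;

    if (size < 2) {
        return -1;
    }
    buf[0] = (char)0xFF;
    n = snprintf(buf + 1, size - 1,
                 "{\"execute\": \"guest-sync\", \"arguments\": {\"id\": %"
                 PRId64 "}}", id);
    if (n < 0 || (size_t)n >= size - 1) {
        return -1;
    }
    return n + 1;
}

/* Hypothetical helper: scan already-parsed return values and report the
 * index of the first one that echoes our id. Everything before it is
 * stale and must be ignored. Returns -1 if the id has not come back. */
static int find_sync_reply(const int64_t *returns, size_t count, int64_t id)
{
    for (size_t i = 0; i < count; i++) {
        if (returns[i] == id) {
            return (int)i;
        }
    }
    return -1;
}
```

A real client would pick the id at random for each session, so that a stale reply from an earlier session is unlikely to match by accident.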

[Qemu-devel] [PATCH 07/14] ac97: use the DMA memory access interface

2011-06-02 Thread David Gibson

This allows the device to work properly with an emulated IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu 
---
 hw/ac97.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/ac97.c b/hw/ac97.c
index a946c1a..a947afc 100644
--- a/hw/ac97.c
+++ b/hw/ac97.c
@@ -223,7 +223,7 @@ static void fetch_bd (AC97LinkState *s, AC97BusMasterRegs *r)
 {
 uint8_t b[8];
 
-cpu_physical_memory_read (r->bdbar + r->civ * 8, b, 8);
+dma_memory_read (&s->dev.qdev, r->bdbar + r->civ * 8, b, 8);
 r->bd_valid = 1;
 r->bd.addr = le32_to_cpu (*(uint32_t *) &b[0]) & ~3;
 r->bd.ctl_len = le32_to_cpu (*(uint32_t *) &b[4]);
@@ -972,7 +972,7 @@ static int write_audio (AC97LinkState *s, AC97BusMasterRegs *r,
 while (temp) {
 int copied;
 to_copy = audio_MIN (temp, sizeof (tmpbuf));
-cpu_physical_memory_read (addr, tmpbuf, to_copy);
+dma_memory_read (&s->dev.qdev, addr, tmpbuf, to_copy);
 copied = AUD_write (s->voice_po, tmpbuf, to_copy);
 dolog ("write_audio max=%x to_copy=%x copied=%x\n",
max, to_copy, copied);
@@ -1053,7 +1053,7 @@ static int read_audio (AC97LinkState *s, AC97BusMasterRegs *r,
 *stop = 1;
 break;
 }
-cpu_physical_memory_write (addr, tmpbuf, acquired);
+dma_memory_write (&s->dev.qdev, addr, tmpbuf, acquired);
 temp -= acquired;
 addr += acquired;
 nread += acquired;
-- 
1.7.4.4

[Qemu-devel] [PATCH 06/14] eepro100: use the DMA memory access interface

2011-06-02 Thread David Gibson

This allows the device to work properly with an emulated IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu 
Signed-off-by: David Gibson 
---
 hw/eepro100.c |  107 +++--
 1 files changed, 58 insertions(+), 49 deletions(-)

diff --git a/hw/eepro100.c b/hw/eepro100.c
index 9f16efd..761ecaa 100644
--- a/hw/eepro100.c
+++ b/hw/eepro100.c
@@ -317,35 +317,36 @@ static const uint16_t eepro100_mdi_mask[] = {
 };
 
 /* Read a 16 bit little endian value from physical memory. */
-static uint16_t e100_ldw_le_phys(target_phys_addr_t addr)
+static uint16_t e100_ldw_le_dma(EEPRO100State *s, pcibus_t addr)
 {
 /* Load 16 bit (little endian) word from emulated hardware. */
 uint16_t val;
-cpu_physical_memory_read(addr, &val, sizeof(val));
+dma_memory_read(&s->dev.qdev, addr, &val, sizeof(val));
 return le16_to_cpu(val);
 }
-
 /* Read a 32 bit little endian value from physical memory. */
-static uint32_t e100_ldl_le_phys(target_phys_addr_t addr)
+static uint32_t e100_ldl_le_dma(EEPRO100State *s, pcibus_t addr)
 {
 /* Load 32 bit (little endian) word from emulated hardware. */
 uint32_t val;
-cpu_physical_memory_read(addr, &val, sizeof(val));
+dma_memory_read(&s->dev.qdev, addr, &val, sizeof(val));
 return le32_to_cpu(val);
 }
 
 /* Write a 16 bit little endian value to physical memory. */
-static void e100_stw_le_phys(target_phys_addr_t addr, uint16_t val)
+static void e100_stw_le_dma(EEPRO100State *s, pcibus_t addr,
+uint16_t val)
 {
 val = cpu_to_le16(val);
-cpu_physical_memory_write(addr, &val, sizeof(val));
+dma_memory_write(&s->dev.qdev, addr, &val, sizeof(val));
 }
 
 /* Write a 32 bit little endian value to physical memory. */
-static void e100_stl_le_phys(target_phys_addr_t addr, uint32_t val)
+static void e100_stl_le_dma(EEPRO100State *s, pcibus_t addr,
+uint32_t val)
 {
 val = cpu_to_le32(val);
-cpu_physical_memory_write(addr, &val, sizeof(val));
+dma_memory_write(&s->dev.qdev, addr, (const uint8_t *)&val, sizeof(val));
 }
 
 #define POLYNOMIAL 0x04c11db6
@@ -757,21 +758,23 @@ static void dump_statistics(EEPRO100State * s)
  * values which really matter.
  * Number of data should check configuration!!!
  */
-cpu_physical_memory_write(s->statsaddr, &s->statistics, s->stats_size);
-e100_stl_le_phys(s->statsaddr + 0, s->statistics.tx_good_frames);
-e100_stl_le_phys(s->statsaddr + 36, s->statistics.rx_good_frames);
-e100_stl_le_phys(s->statsaddr + 48, s->statistics.rx_resource_errors);
-e100_stl_le_phys(s->statsaddr + 60, s->statistics.rx_short_frame_errors);
+dma_memory_write(&s->dev.qdev, s->statsaddr,
+ (uint8_t *) & s->statistics, s->stats_size);
+e100_stl_le_dma(s, s->statsaddr + 0, s->statistics.tx_good_frames);
+e100_stl_le_dma(s, s->statsaddr + 36, s->statistics.rx_good_frames);
+e100_stl_le_dma(s, s->statsaddr + 48, s->statistics.rx_resource_errors);
+e100_stl_le_dma(s, s->statsaddr + 60, s->statistics.rx_short_frame_errors);
 #if 0
-e100_stw_le_phys(s->statsaddr + 76, s->statistics.xmt_tco_frames);
-e100_stw_le_phys(s->statsaddr + 78, s->statistics.rcv_tco_frames);
+e100_stw_le_dma(s, s->statsaddr + 76, s->statistics.xmt_tco_frames);
+e100_stw_le_dma(s, s->statsaddr + 78, s->statistics.rcv_tco_frames);
 missing("CU dump statistical counters");
 #endif
 }
 
 static void read_cb(EEPRO100State *s)
 {
-cpu_physical_memory_read(s->cb_address, &s->tx, sizeof(s->tx));
+dma_memory_read(&s->dev.qdev, s->cb_address,
+(uint8_t *) &s->tx, sizeof(s->tx));
 s->tx.status = le16_to_cpu(s->tx.status);
 s->tx.command = le16_to_cpu(s->tx.command);
 s->tx.link = le32_to_cpu(s->tx.link);
@@ -801,18 +804,18 @@ static void tx_command(EEPRO100State *s)
 }
 assert(tcb_bytes <= sizeof(buf));
 while (size < tcb_bytes) {
-uint32_t tx_buffer_address = e100_ldl_le_phys(tbd_address);
-uint16_t tx_buffer_size = e100_ldw_le_phys(tbd_address + 4);
+uint32_t tx_buffer_address = e100_ldl_le_dma(s, tbd_address);
+uint16_t tx_buffer_size = e100_ldw_le_dma(s, tbd_address + 4);
 #if 0
-uint16_t tx_buffer_el = e100_ldw_le_phys(tbd_address + 6);
+uint16_t tx_buffer_el = e100_ldw_le_dma(s, tbd_address + 6);
 #endif
 tbd_address += 8;
 TRACE(RXTX, logout
 ("TBD (simplified mode): buffer address 0x%08x, size 0x%04x\n",
  tx_buffer_address, tx_buffer_size));
 tx_buffer_size = MIN(tx_buffer_size, sizeof(buf) - size);
-cpu_physical_memory_read(tx_buffer_address, &buf[size],
- tx_buffer_size);
+dma_memory_read(&s->dev.qdev,
+tx_buffer_address, &buf[size], tx_buffer_size);
 size += tx_buffer_size;
 }
 if (tbd_array == 0x) {
@@ -823,16 +826,16 @@ static void tx_

[Qemu-devel] [PATCH 11/14] pcnet: use the DMA memory access interface

2011-06-02 Thread David Gibson

This allows the device to work properly with an emulated IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu 
---
 hw/pcnet-pci.c |   17 +
 1 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/pcnet-pci.c b/hw/pcnet-pci.c
index 9415a1e..9e962a3 100644
--- a/hw/pcnet-pci.c
+++ b/hw/pcnet-pci.c
@@ -214,16 +214,16 @@ static CPUReadMemoryFunc * const pcnet_mmio_read[] = {
 &pcnet_mmio_readl
 };
 
-static void pci_physical_memory_write(void *dma_opaque, target_phys_addr_t addr,
-  uint8_t *buf, int len, int do_bswap)
+static void pci_dma_write(void *dma_opaque, target_phys_addr_t addr,
+  uint8_t *buf, int len, int do_bswap)
 {
-cpu_physical_memory_write(addr, buf, len);
+dma_memory_write(dma_opaque, addr, buf, len);
 }
 
-static void pci_physical_memory_read(void *dma_opaque, target_phys_addr_t addr,
- uint8_t *buf, int len, int do_bswap)
+static void pci_dma_read(void *dma_opaque, target_phys_addr_t addr,
+ uint8_t *buf, int len, int do_bswap)
 {
-cpu_physical_memory_read(addr, buf, len);
+dma_memory_read(dma_opaque, addr, buf, len);
 }
 
 static void pci_pcnet_cleanup(VLANClientState *nc)
@@ -290,8 +290,9 @@ static int pci_pcnet_init(PCIDevice *pci_dev)
 pci_register_bar_simple(pci_dev, 1, PCNET_PNPMMIO_SIZE, 0, s->mmio_index);
 
 s->irq = pci_dev->irq[0];
-s->phys_mem_read = pci_physical_memory_read;
-s->phys_mem_write = pci_physical_memory_write;
+s->phys_mem_read = pci_dma_read;
+s->phys_mem_write = pci_dma_write;
+s->dma_opaque = &pci_dev->qdev;
 
 if (!pci_dev->qdev.hotplugged) {
 static int loaded = 0;
-- 
1.7.4.4
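
The pattern this patch relies on, a device core calling through function pointers with an opaque context installed by the bus glue, can be sketched in isolation. All names below are hypothetical stand-ins and none of this is QEMU API; the real patch passes the qdev as the opaque pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature of the pattern in the patch: the device core
 * calls through a function pointer and hands back an opaque pointer that
 * the bus-specific glue set up at init time. */
typedef void (*mem_read_fn)(void *dma_opaque, uint32_t addr,
                            uint8_t *buf, int len);

struct toy_nic {
    mem_read_fn phys_mem_read;
    void *dma_opaque;
};

/* Stand-in for the per-device DMA context. */
struct toy_dma_ctx {
    const uint8_t *mem;
    uint32_t size;
    unsigned reads;      /* counts accesses, to show the context is used */
};

/* Bus glue: recovers its context from the opaque pointer. */
static void toy_pci_dma_read(void *dma_opaque, uint32_t addr,
                             uint8_t *buf, int len)
{
    struct toy_dma_ctx *ctx = dma_opaque;

    assert(len >= 0 && addr <= ctx->size &&
           (uint32_t)len <= ctx->size - addr);
    memcpy(buf, ctx->mem + addr, (size_t)len);
    ctx->reads++;
}

/* Init-time wiring, mirroring what pci_pcnet_init() does in the patch. */
static void toy_nic_init(struct toy_nic *nic, struct toy_dma_ctx *ctx)
{
    nic->phys_mem_read = toy_pci_dma_read;
    nic->dma_opaque = ctx;
}
```

Because the core only ever sees the function pointer and the opaque value, the same core can sit behind a bus with or without an IOMMU.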

Re: [Qemu-devel] [PATCH 01/14] Generic DMA memory access interface

2011-06-02 Thread Richard Henderson

On 06/02/2011 08:12 AM, David Gibson wrote:
> +err = iommu->translate(dev, addr, &paddr, &plen, is_write);
> +if (err) {
> +return NULL;
> +}
> +
> +/*
> + * If this is true, the virtual region is contiguous,
> + * but the translated physical region isn't. We just
> + * clamp *len, much like cpu_physical_memory_map() does.
> + */
> +if (plen < *len) {
> +*len = plen;
> +}
> +
> +buf = cpu_physical_memory_map(paddr, &plen, is_write);
> +*len = plen;
> +
> +/* We treat maps as remote TLBs to cope with stuff like AIO. */

Oh, that reminds me.  There's a bug here in Eduard's original:

PLEN is set to the maximum length of the transfer by the 
translate function.  What we do *not* want is to pass a
very very large region to cpu_physical_memory_map.

The effects of this are hard to see with the AMD IOMMU, since
it is entirely page based and thus PLEN will be increased by
no more than 4k, but Alpha IOMMUs have direct-mapped translation
windows that can be up to 128GB.

I'm unsure whether I prefer to force the translator function
to never increase PLEN (which is what I implemented in my 
own branch) or whether all callers of the translate function
must be aware that the returned PLEN can increase.

r~
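
The clamping concern above can be illustrated with a small sketch. Everything here is hypothetical: the translate callback is a simplified stand-in for the one in the patch, and the 1 GiB direct-mapped window only mimics the kind of large window described for Alpha IOMMUs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an IOMMU translate callback: it reports the
 * translated address and how many bytes stay valid from that address on.
 * A direct-mapped window may report far more than the caller asked for. */
typedef int (*translate_fn)(uint64_t addr, uint64_t *paddr, uint64_t *plen);

/* Example translator: one direct-mapped 1 GiB window starting at 0. */
static int window_translate(uint64_t addr, uint64_t *paddr, uint64_t *plen)
{
    const uint64_t window_size = 1ULL << 30;
    const uint64_t phys_base = 0x40000000ULL;

    if (addr >= window_size) {
        return -1;                     /* outside the window: fault */
    }
    *paddr = phys_base + addr;
    *plen = window_size - addr;        /* valid up to the end of the window */
    return 0;
}

/* Returns the length that may safely be handed to the mapping routine:
 * never more than requested, never more than the translation covers.
 * Returns 0 if the translation fails. */
static uint64_t clamp_map_len(translate_fn translate, uint64_t addr,
                              uint64_t requested, uint64_t *paddr)
{
    uint64_t plen;

    if (translate(addr, paddr, &plen) != 0) {
        return 0;
    }
    return plen < requested ? plen : requested;
}
```

Clamping in the caller corresponds to the second option in the message. The first option would instead have the translator itself never report more than the requested length.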

Re: [Qemu-devel] [PATCH 02/14] pci: add IOMMU support via the generic DMA layer

2011-06-02 Thread Richard Henderson

On 06/02/2011 08:12 AM, David Gibson wrote:
> --- a/hw/pci_internals.h
> +++ b/hw/pci_internals.h
> @@ -14,8 +14,15 @@
>  
>  extern struct BusInfo pci_bus_info;
>  
> +typedef DMAMmu *(*pci_iommu_new_device_fn)(PCIBus *);
> +
> +struct PCIBusIOMMU {
> +pci_iommu_new_device_fn new_device;
> +};
> +
>  struct PCIBus {
>  BusState qbus;
> +PCIBusIOMMU *iommu;

Is there a reason that you put PCIBusIOMMU here and not in pci.h?
At present, the only users of pci_internals.h are the core pci
implementation files, not pci host bridges, not pci devices.

Modulo that, I can live with this arrangement.

r~

Re: [Qemu-devel] [PATCH 01/14] Generic DMA memory access interface

2011-06-02 Thread Eduard - Gabriel Munteanu

On Thu, Jun 02, 2011 at 09:43:32AM -0700, Richard Henderson wrote:
> On 06/02/2011 08:12 AM, David Gibson wrote:
> > +err = iommu->translate(dev, addr, &paddr, &plen, is_write);
> > +if (err) {
> > +return NULL;
> > +}
> > +
> > +/*
> > + * If this is true, the virtual region is contiguous,
> > + * but the translated physical region isn't. We just
> > + * clamp *len, much like cpu_physical_memory_map() does.
> > + */
> > +if (plen < *len) {
> > +*len = plen;
> > +}
> > +
> > +buf = cpu_physical_memory_map(paddr, &plen, is_write);
> > +*len = plen;
> > +
> > +/* We treat maps as remote TLBs to cope with stuff like AIO. */
> 
> Oh, that reminds me.  There's a bug here in Eduard's original:
> 
> PLEN is set to the maximum length of the transfer by the 
> translate function.  What we do *not* want is to pass a
> very very large region to cpu_physical_memory_map.
> 
> The effects of this are hard to see with the AMD IOMMU, since
> it is entirely page based and thus PLEN will be increased by
> no more than 4k, but Alpha IOMMUs have direct-mapped translation
> windows that can be up to 128GB.
> 
> I'm unsure whether I prefer to force the translator function
> to never increase PLEN (which is what I implemented in my 
> own branch) or whether all callers of the translate function
> must be aware that the returned PLEN can increase.

My latest patches seem to have fixed that:

+if (plen < *len) {
+*len = plen;
+}
+
+buf = cpu_physical_memory_map(paddr, len, is_write);

I think the callers of translate() should take care of clamping the
length. Note the 'len' passed into translate() is write-only, so there's
nothing to increase in relation to, but I get your point.

The reason is we have two different behaviors to cater for: maps need to
be contiguous, while plain reads/writes try to resolve the whole range
of DMA addresses. In order to do that, translate() tells the caller
where the translation ceases to be valid so it can make informed
choices.

And anyway, translate() isn't supposed to be called from other places,
just the DMA abstraction.

> 
> r~
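
The two behaviours described above (maps must be contiguous, plain reads resolve the whole range) can be sketched with a toy page-based translator. All names, the page table and the "physical memory" array are hypothetical, and the signatures are simplified compared with the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define NUM_PAGES 4u

/* Toy "physical memory" and a toy page table; both are hypothetical. */
static uint8_t phys_mem[NUM_PAGES * PAGE_SIZE];
static const uint32_t page_table[NUM_PAGES] = { 2, 0, 3, 1 };

/* Translate one DMA address. *plen tells the caller where the
 * translation stops being valid (the end of the page), counted from addr. */
static int toy_translate(uint32_t addr, uint32_t *paddr, uint32_t *plen)
{
    uint32_t page = addr / PAGE_SIZE;
    uint32_t off = addr % PAGE_SIZE;

    if (page >= NUM_PAGES) {
        return -1;
    }
    *paddr = page_table[page] * PAGE_SIZE + off;
    *plen = PAGE_SIZE - off;
    return 0;
}

/* Plain read: keep translating until the whole range is resolved. */
static int toy_dma_read(uint32_t addr, uint8_t *buf, uint32_t len)
{
    while (len > 0) {
        uint32_t paddr, plen;

        if (toy_translate(addr, &paddr, &plen) != 0) {
            return -1;
        }
        if (plen > len) {
            plen = len;
        }
        memcpy(buf, &phys_mem[paddr], plen);
        addr += plen;
        buf += plen;
        len -= plen;
    }
    return 0;
}

/* Map: the result must be contiguous, so clamp *len to what the first
 * translation covers and let the caller come back for the rest. */
static uint8_t *toy_dma_map(uint32_t addr, uint32_t *len)
{
    uint32_t paddr, plen;

    if (toy_translate(addr, &paddr, &plen) != 0) {
        return NULL;
    }
    if (plen < *len) {
        *len = plen;
    }
    return &phys_mem[paddr];
}
```

A two-byte access that straddles a page boundary shows the difference: the read succeeds in two steps, while the map returns a one-byte mapping.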

[Qemu-devel] [PATCH v3] Introduce format string for pid_t

2011-06-02 Thread Andreas Färber

BeOS and Haiku on i386 use long for 32-bit types, including pid_t.
Using %d with pid_t therefore results in a warning.

Unfortunately POSIX:2008 does not define a PRId* string for pid_t.

In some places pid_t was previously cast to long and %ld hardcoded.
The predecessor of this patch added another upcast for the simpletrace
filename but was not applied to date.

Since new uses of pid_t with %d keep creeping in, let's instead define
an OS-dependent format string and use that consistently.

Cc: Stefan Hajnoczi 
Cc: Blue Swirl 
Cc: Ingo Weinhold 
Cc: Gleb Natapov 
Signed-off-by: Andreas Färber 
---
 v2: Use %ld and long instead of %lu for Haiku for x86_64 compatibility.
 
 configure  |2 +-
 os-posix.c |2 +-
 os-win32.c |2 +-
 osdep.h|6 ++
 vl.c   |2 +-
 5 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index a318d37..0f2b6bf 100755
--- a/configure
+++ b/configure
@@ -3037,7 +3037,7 @@ if test "$trace_backend" = "simple"; then
 fi
 # Set the appropriate trace file.
 if test "$trace_backend" = "simple"; then
-  trace_file="\"$trace_file-%u\""
+  trace_file="\"$trace_file-\" FMT_pid"
 fi
 if test "$trace_backend" = "dtrace" -a "$trace_backend_stap" = "yes" ; then
   echo "CONFIG_SYSTEMTAP_TRACE=y" >> $config_host_mak
diff --git a/os-posix.c b/os-posix.c
index 3204197..7dfb278 100644
--- a/os-posix.c
+++ b/os-posix.c
@@ -368,7 +368,7 @@ int qemu_create_pidfile(const char *filename)
 if (lockf(fd, F_TLOCK, 0) == -1) {
 return -1;
 }
-len = snprintf(buffer, sizeof(buffer), "%ld\n", (long)getpid());
+len = snprintf(buffer, sizeof(buffer), FMT_pid "\n", getpid());
 if (write(fd, buffer, len) != len) {
 return -1;
 }
diff --git a/os-win32.c b/os-win32.c
index d6d54c6..b6652af 100644
--- a/os-win32.c
+++ b/os-win32.c
@@ -258,7 +258,7 @@ int qemu_create_pidfile(const char *filename)
 if (file == INVALID_HANDLE_VALUE) {
 return -1;
 }
-len = snprintf(buffer, sizeof(buffer), "%ld\n", (long)getpid());
+len = snprintf(buffer, sizeof(buffer), FMT_pid "\n", getpid());
 ret = WriteFileEx(file, (LPCVOID)buffer, (DWORD)len,
  &overlap, NULL);
 if (ret == 0) {
diff --git a/osdep.h b/osdep.h
index 970d767..97d167a 100644
--- a/osdep.h
+++ b/osdep.h
@@ -127,6 +127,12 @@ void qemu_vfree(void *ptr);
 
 int qemu_madvise(void *addr, size_t len, int advice);
 
+#if defined(__HAIKU__) && defined(__i386__)
+#define FMT_pid "%ld"
+#else
+#define FMT_pid "%d"
+#endif
+
 int qemu_create_pidfile(const char *filename);
 int qemu_get_thread_id(void);
 
diff --git a/vl.c b/vl.c
index b362871..b7b98f0 100644
--- a/vl.c
+++ b/vl.c
@@ -1191,7 +1191,7 @@ void qemu_kill_report(void)
  */
 fputc('\n', stderr);
 } else {
-fprintf(stderr, " from pid %d\n", shutdown_pid);
+fprintf(stderr, " from pid " FMT_pid "\n", shutdown_pid);
 }
 shutdown_signal = -1;
 }
-- 
1.7.5.2.317.g391b14
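
A minimal, stand-alone sketch of how the FMT_pid macro from the osdep.h hunk is meant to be used. The helper function is hypothetical and only mirrors the snprintf() call in qemu_create_pidfile(); the macro definition is copied from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Same definition as in the patch's osdep.h hunk. */
#if defined(__HAIKU__) && defined(__i386__)
#define FMT_pid "%ld"
#else
#define FMT_pid "%d"
#endif

/* Hypothetical helper: format a pid followed by a newline, as the
 * pidfile code does. Returns the number of characters written. */
static int format_pid_line(char *buf, size_t size, pid_t pid)
{
    /* Adjacent string literals are concatenated by the compiler, so the
     * format becomes "%d\n" or "%ld\n" depending on the host. */
    return snprintf(buf, size, FMT_pid "\n", pid);
}
```

Since the macro expands to a string literal, it composes with surrounding literals in the same way as the PRId* macros from inttypes.h.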

Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2011-06-02 Thread Luiz Capitulino

On Thu, 02 Jun 2011 09:02:30 -0500
Anthony Liguori  wrote:

> On 06/02/2011 08:24 AM, Jiri Denemark wrote:
> > On Thu, Jun 02, 2011 at 08:08:35 -0500, Anthony Liguori wrote:
> >> On 06/02/2011 04:06 AM, Daniel P. Berrange wrote:
> >>>>> B. query-stop-reason
> >>>>>
> >>>>> I also have a simple solution for item 2. The vm_stop() accepts a reason
> >>>>> argument, so we could store it somewhere and return it as a string, like:
> >>>>>
> >>>>> -> { "execute": "query-stop-reason" }
> >>>>> <- { "return": { "reason": "user" } }
> >>>>>
> >>>>> Valid reasons could be: "user", "debug", "shutdown", "diskfull" (hey,
> >>>>> this should be "ioerror", no?), "watchdog", "panic", "savevm", "loadvm",
> >>>>> "migrate".
> >>>>>
> >>>>> Also note that we have a STOP event. It should be extended with the
> >>>>> stop reason too, for completeness.
> >>>>
> >>>> Can we just extend query-block?
> >>>
> >>> Primarily we want 'query-stop-reason' to tell us what caused the VM
> >>> CPUs to stop. If that reason was 'ioerror', then 'query-block' could
> >>> be used to find out which particular block device(s) caused the IO
> >>> error to occur & get the "reason" that was in the BLOCK_IO_ERROR
> >>> event.
> >>
> >> My concern is that we're over abstracting here.  We're not going to add
> >> additional stop reasons in the future.
> >>
> >> Maybe just add an 'io-error': True to query-state.
> >
> > Sure, adding a new field to query-state response would work as well. And it
> > seems like a good idea to me since one already needs to call query-status to
> > check if CPUs are stopped or not so it makes sense to incorporate the
> > additional information there as well. And if you want to be safe for the
> > future, the new field doesn't have to be boolean 'io-error' but it can be the
> > string 'reason' which Luiz suggested above.
> 
> 
> String enumerations are a Bad Thing.  It's impossible to figure out what 
> strings are valid and it lacks type safety.
> 
> Adding more booleans provides better type safety, and when we move to 
> QAPI with a queryable schema, provides a way to figure out exactly what 
> combinations are supported by QEMU.

To summarize:

 1. Add an 'io-error' field to query-status (which is only present if
field 'running' is false)

 2. Extend query-block to contain error information associated with the
device. This is interesting, because this information will be available
even if the error didn't cause the VM to stop

Seems good enough to me, comments?
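
From a client's point of view, the two-step lookup summarized above might look like the sketch below. The structure and function names are hypothetical. The optional 'io-error' member is modelled with a separate presence flag, on the assumption that an absent field carries no information.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical client-side view of a query-status reply. */
struct vm_status {
    bool running;
    bool has_io_error;   /* was 'io-error' present in the reply? */
    bool io_error;       /* its value, meaningful only if present */
};

/* True when the client should go on to call query-block to find the
 * failing device: the VM is stopped and the stop was flagged as an I/O
 * error. An absent field is treated as "not known to be an I/O error". */
static bool should_query_block(const struct vm_status *st)
{
    return !st->running && st->has_io_error && st->io_error;
}
```

This covers the case where the client missed the BLOCK_IO_ERROR event or connected after the VM had already stopped.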

Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2011-06-02 Thread Luiz Capitulino

On Wed, 01 Jun 2011 16:35:03 -0500
Anthony Liguori  wrote:

> On 06/01/2011 04:12 PM, Luiz Capitulino wrote:
> > Hi there,
> >
> > There are people who want to use QMP for thin provisioning. That is, the VM is
> > started with a small storage and when a no space error is triggered, more space
> > is allocated and the VM is put to run again.
> >
> > QMP has two limitations that prevent people from doing this today:
> >
> > 1. The BLOCK_IO_ERROR doesn't contain error information
> >
> > 2. Considering we solve item 1, we still have to provide a way for clients
> > to query why a VM stopped. This is needed because clients may miss the
> > BLOCK_IO_ERROR event or may connect to the VM while it's already stopped
> >
> > A proposal to solve both problems follows.
> >
> > A. BLOCK_IO_ERROR information
> > -
> >
> > We already have discussed this a lot, but didn't reach a consensus. My solution
> > is quite simple: to add a stringified errno name to the BLOCK_IO_ERROR event,
> > for example (see the "reason" key):
> >
> > { "event": "BLOCK_IO_ERROR",
> >   "data": { "device": "ide0-hd1",
> >             "operation": "write",
> >             "action": "stop",
> >             "reason": "enospc" } }
> 
> you can call the reason whatever you want, but don't call it stringified
> errno name :-)
> 
> In fact, just make reason "no space".

You mean, we should do:

  "reason": "no space"

Or that we should make it a boolean, like:

 "no space": true

I'm ok with either way. But in case you meant the second one, I guess
we should make "reason" a dictionary so that we can group related
information when we extend the field, for example:

 "reason": { "no space": false, "no permission": true }
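
The dictionary-of-booleans idea can be sketched on the client side as a set of flags. The flag names are hypothetical (the thread only mentions "no space" and "no permission" as examples), and the recovery rule is an assumption about how a thin-provisioning client might use them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag names, one per boolean in the "reason" dictionary. */
enum io_error_flags {
    IO_ERR_NO_SPACE      = 1u << 0,
    IO_ERR_NO_PERMISSION = 1u << 1,
};

/* Thin provisioning can only help when running out of space is the sole
 * problem; any other flag means growing the storage will not fix it. */
static bool can_recover_by_growing(unsigned flags)
{
    return flags == IO_ERR_NO_SPACE;
}
```

Grouping the booleans under one key keeps related conditions together, so new conditions can be added later without changing the shape of the reply.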

Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2011-06-02 Thread Anthony Liguori


On 06/02/2011 12:57 PM, Luiz Capitulino wrote:
> On Wed, 01 Jun 2011 16:35:03 -0500
> Anthony Liguori wrote:
>
>> On 06/01/2011 04:12 PM, Luiz Capitulino wrote:
>>> Hi there,
>>>
>>> There are people who want to use QMP for thin provisioning. That is, the VM is
>>> started with a small storage and when a no space error is triggered, more space
>>> is allocated and the VM is put to run again.
>>>
>>> QMP has two limitations that prevent people from doing this today:
>>>
>>> 1. The BLOCK_IO_ERROR doesn't contain error information
>>>
>>> 2. Considering we solve item 1, we still have to provide a way for clients
>>>    to query why a VM stopped. This is needed because clients may miss the
>>>    BLOCK_IO_ERROR event or may connect to the VM while it's already stopped
>>>
>>> A proposal to solve both problems follows.
>>>
>>> A. BLOCK_IO_ERROR information
>>> -
>>>
>>> We already have discussed this a lot, but didn't reach a consensus. My solution
>>> is quite simple: to add a stringified errno name to the BLOCK_IO_ERROR event,
>>> for example (see the "reason" key):
>>>
>>> { "event": "BLOCK_IO_ERROR",
>>>   "data": { "device": "ide0-hd1",
>>>             "operation": "write",
>>>             "action": "stop",
>>>             "reason": "enospc" } }
>>
>> you can call the reason whatever you want, but don't call it stringified
>> errno name :-)
>>
>> In fact, just make reason "no space".
>
> You mean, we should do:
>
>    "reason": "no space"
>
> Or that we should make it a boolean, like:
>
>   "no space": true

Do we need reason in BLOCK_IO_ERROR if query-block returns this information?

> I'm ok with either way. But in case you meant the second one, I guess
> we should make "reason" a dictionary so that we can group related
> information when we extend the field, for example:
>
>   "reason": { "no space": false, "no permission": true }

Why would we ever have "no permission"?

Part of my argument for not having reason is I don't think we actually 
need to be this generic.  I think we're over abstracting.

Regards,

Anthony Liguori

Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2011-06-02 Thread Luiz Capitulino

On Thu, 02 Jun 2011 13:00:04 -0500
Anthony Liguori  wrote:

> On 06/02/2011 12:57 PM, Luiz Capitulino wrote:
> > On Wed, 01 Jun 2011 16:35:03 -0500
> > Anthony Liguori  wrote:
> >
> >> On 06/01/2011 04:12 PM, Luiz Capitulino wrote:
> >>> Hi there,
> >>>
> >>> There are people who want to use QMP for thin provisioning. That is, the VM is
> >>> started with a small storage and when a no space error is triggered, more space
> >>> is allocated and the VM is put to run again.
> >>>
> >>> QMP has two limitations that prevent people from doing this today:
> >>>
> >>> 1. The BLOCK_IO_ERROR doesn't contain error information
> >>>
> >>> 2. Considering we solve item 1, we still have to provide a way for clients
> >>>    to query why a VM stopped. This is needed because clients may miss the
> >>>    BLOCK_IO_ERROR event or may connect to the VM while it's already stopped
> >>>
> >>> A proposal to solve both problems follows.
> >>>
> >>> A. BLOCK_IO_ERROR information
> >>> -
> >>>
> >>> We already have discussed this a lot, but didn't reach a consensus. My solution
> >>> is quite simple: to add a stringified errno name to the BLOCK_IO_ERROR event,
> >>> for example (see the "reason" key):
> >>>
> >>> { "event": "BLOCK_IO_ERROR",
> >>>   "data": { "device": "ide0-hd1",
> >>>             "operation": "write",
> >>>             "action": "stop",
> >>>             "reason": "enospc" } }
> >>
> >> you can call the reason whatever you want, but don't call it stringified
> >> errno name :-)
> >>
> >> In fact, just make reason "no space".
> >
> > You mean, we should do:
> >
> >"reason": "no space"
> >
> > Or that we should make it a boolean, like:
> >
> >   "no space": true
> 
> 
> Do we need reason in BLOCK_IO_ERROR if query-block returns this information?

True, no.

> > I'm ok with either way. But in case you meant the second one, I guess
> > we should make "reason" a dictionary so that we can group related
> > information when we extend the field, for example:
> >
> >   "reason": { "no space": false, "no permission": true }
> 
> Why would we ever have "no permission"?

It's an I/O error. I have a report from a developer who was getting
the BLOCK_IO_ERROR event and had to debug qemu to know the error cause,
it turned out to be no permission.

> Part of my argument for not having reason is I don't think we actually 
> need to be this generic.  I think we're over abstracting.

I'm quite sure we'll want to add new error reasons in the near future.

Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2011-06-02 Thread Anthony Liguori


On 06/02/2011 01:01 PM, Luiz Capitulino wrote:

On Thu, 02 Jun 2011 09:02:30 -0500
Anthony Liguori  wrote:


On 06/02/2011 08:24 AM, Jiri Denemark wrote:

On Thu, Jun 02, 2011 at 08:08:35 -0500, Anthony Liguori wrote:

On 06/02/2011 04:06 AM, Daniel P. Berrange wrote:

B. query-stop-reason


I also have a simple solution for item 2. The vm_stop() accepts a reason
argument, so we could store it somewhere and return it as a string, like:

-> { "execute": "query-stop-reason" }
<- { "return": { "reason": "user" } }

Valid reasons could be: "user", "debug", "shutdown", "diskfull" (hey,
this should be "ioerror", no?), "watchdog", "panic", "savevm", "loadvm",
"migrate".

Also note that we have a STOP event. It should be extended with the
stop reason too, for completeness.



Can we just extend query-block?


Primarily we want 'query-stop-reason' to tell us what caused the VM
CPUs to stop. If that reason was 'ioerror', then 'query-block' could
be used to find out which particular block device(s) caused the IO
error to occurr&get the "reason" that was in the BLOCK_IO_ERROR
event.


My concern is that we're over abstracting here.  We're not going to add
additional stop reasons in the future.

Maybe just add an 'io-error': True to query-state.


Sure, adding a new field to query-state response would work as well. And it
seems like a good idea to me since one already needs to call query-status to
check if CPUs are stopped or not so it makes sense to incorporate the
additional information there as well. And if you want to be safe for the
future, the new field doesn't have to be boolean 'io-error' but it can be the
string 'reason' which Luiz suggested above.



String enumerations are a Bad Thing.  It's impossible to figure out what
strings are valid and it lacks type safety.

Adding more booleans provides better type safety, and when we move to
QAPI with a queryable schema, provides a way to figure out exactly what
combinations are supported by QEMU.


To summarize:

  1. Add a 'io-error' field to query-status (which is only present if
 field 'running' is false)


It may or may not be present.  Lack of presence does not tell you anything.

It is only true when running is false AND the guest was stopped because 
of an io error.




  2. Extend query-block to contain error information associated with the
 device. This is interesting, because this information will be available
 even if the error didn't cause the VM to stop


Well we need at least some way to indicate that a block device is in a 
failed state.  For instance, if you have two block device, but you miss 
the IO_ERROR event, you need to figure out which of the two devices is 
giving errors.


But I was thinking of something that had the semantics of, last_iop_failed.

Regards,

Anthony Liguori


Seems good enough to me, comments?
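
The semantics discussed above (the 'io-error' key may be absent, and it only
means something when 'running' is false) can be sketched from a client's point
of view. This is a minimal sketch, not QEMU code: the struct and function
names are hypothetical, and it assumes the client has already parsed the
query-status reply into plain fields.

```c
#include <stdbool.h>

/* Hypothetical client-side view of a parsed query-status reply.
 * 'has_io_error' records whether the optional 'io-error' key was present. */
struct vm_status {
    bool running;
    bool has_io_error;
    bool io_error;
};

/* Returns true only when the VM is stopped AND the reply explicitly says
 * the stop was caused by an I/O error. An absent key proves nothing, so it
 * is treated the same as "not known to be an I/O error". */
static bool stopped_by_io_error(const struct vm_status *st)
{
    if (st->running) {
        return false;
    }
    return st->has_io_error && st->io_error;
}
```

A client that gets false here while the VM is stopped would still need
query-block, or some other source, to learn more.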

Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2011-06-02 Thread Anthony Liguori


On 06/02/2011 01:09 PM, Luiz Capitulino wrote:

On Thu, 02 Jun 2011 13:00:04 -0500
Anthony Liguori  wrote:


On 06/02/2011 12:57 PM, Luiz Capitulino wrote:

On Wed, 01 Jun 2011 16:35:03 -0500
Anthony Liguori   wrote:


On 06/01/2011 04:12 PM, Luiz Capitulino wrote:

Hi there,

There are people who want to use QMP for thin provisioning. That is, the VM is
started with a small amount of storage and, when a no-space error is triggered,
more space is allocated and the VM is resumed.
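
The decision a thin-provisioning client has to make could be sketched roughly
as follows. This is only an illustration under assumptions: the struct, the
function name and the "enospc" value are hypothetical, and the 'reason' member
models the information proposed later in this message, which the event does
not carry today.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, already-parsed view of a BLOCK_IO_ERROR event. The 'reason'
 * member is only a proposal in this thread, not an existing QMP field. */
struct block_io_error {
    const char *device;
    const char *action;  /* e.g. "stop" */
    const char *reason;  /* e.g. "enospc"; NULL if the event gave no cause */
};

/* Decide whether the client should allocate more space and resume the VM. */
static bool should_grow_and_resume(const struct block_io_error *ev)
{
    if (ev->reason == NULL) {
        return false; /* cause unknown: do not guess */
    }
    return strcmp(ev->action, "stop") == 0 &&
           strcmp(ev->reason, "enospc") == 0;
}
```

Without a cause in the event, or a way to query it afterwards, the client
cannot tell a full disk from any other I/O error.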

QMP has two limitations that prevent people from doing this today:

1. The BLOCK_IO_ERROR doesn't contain error information

2. Considering we solve item 1, we still have to provide a way for clients
  to query why a VM stopped. This is needed because clients may miss the
  BLOCK_IO_ERROR event or may connect to the VM while it's already stopped

A proposal to solve both problems follows.

A. BLOCK_IO_ERROR information
-----------------------------

We already have discussed this a lot, but didn't reach a consensus. My solution
is quite simple: to add a stringified errno name to the BLOCK_IO_ERROR event,
for example (see the "reason" key):

{ "event": "BLOCK_IO_ERROR",
  "data": { "device": "ide0-hd1",
            "operation": "write",
            "action": "stop",
            "reason": "enospc" } }


you can call the reason whatever you want, but don't call it stringified
errno name :-)

In fact, just make reason "no space".


You mean, we should do:

"reason": "no space"

Or that we should make it a boolean, like:

   "no space": true



Do we need reason in BLOCK_IO_ERROR if query-block returns this information?


True, no.


I'm ok with either way. But in case you meant the second one, I guess
we should make "reason" a dictionary so that we can group related
information when we extend the field, for example:

   "reason": { "no space": false, "no permission": true }


Why would we ever have "no permission"?


Why did it happen?  It's not clear to me when read/write would return 
EPERM.  open() should fail.  In fact, EPERM is not mentioned in man 2 read.


Regards,

Anthony Liguori



It's an I/O error. I have a report from a developer who was getting
the BLOCK_IO_ERROR event and had to debug qemu to know the error cause,
it turned out to be no permission.


Part of my argument for not having reason is I don't think we actually
need to be this generic.  I think we're over abstracting.


I'm quite sure we'll want to add new error reasons in the near future.

Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2011-06-02 Thread Luiz Capitulino

On Thu, 02 Jun 2011 13:32:25 -0500
Anthony Liguori  wrote:

> On 06/02/2011 01:01 PM, Luiz Capitulino wrote:
> > On Thu, 02 Jun 2011 09:02:30 -0500
> > Anthony Liguori  wrote:
> >
> >> On 06/02/2011 08:24 AM, Jiri Denemark wrote:
> >>> On Thu, Jun 02, 2011 at 08:08:35 -0500, Anthony Liguori wrote:
>  On 06/02/2011 04:06 AM, Daniel P. Berrange wrote:
> >>> B. query-stop-reason
> >>> 
> >>>
> >>> I also have a simple solution for item 2. The vm_stop() accepts a 
> >>> reason
> >>> argument, so we could store it somewhere and return it as a string, 
> >>> like:
> >>>
> >>> -> { "execute": "query-stop-reason" }
> >>> <- { "return": { "reason": "user" } }
> >>>
> >>> Valid reasons could be: "user", "debug", "shutdown", "diskfull" (hey,
> >>> this should be "ioerror", no?), "watchdog", "panic", "savevm", 
> >>> "loadvm",
> >>> "migrate".
> >>>
> >>> Also note that we have a STOP event. It should be extended with the
> >>> stop reason too, for completeness.
> >>
> >>
> >> Can we just extend query-block?
> >
> > Primarily we want 'query-stop-reason' to tell us what caused the VM
> > CPUs to stop. If that reason was 'ioerror', then 'query-block' could
> > be used to find out which particular block device(s) caused the IO
> > error to occurr&get the "reason" that was in the BLOCK_IO_ERROR
> > event.
> 
>  My concern is that we're over abstracting here.  We're not going to add
>  additional stop reasons in the future.
> 
>  Maybe just add an 'io-error': True to query-state.
> >>>
> >>> Sure, adding a new field to query-state response would work as well. And 
> >>> it
> >>> seems like a good idea to me since one already needs to call query-status 
> >>> to
> >>> check if CPUs are stopped or not so it makes sense to incorporate the
> >>> additional information there as well. And if you want to be safe for the
> >>> future, the new field doesn't have to be boolean 'io-error' but it can be 
> >>> the
> >>> string 'reason' which Luiz suggested above.
> >>
> >>
> >> String enumerations are a Bad Thing.  It's impossible to figure out what
> >> strings are valid and it lacks type safety.
> >>
> >> Adding more booleans provides better type safety, and when we move to
> >> QAPI with a queryable schema, provides a way to figure out exactly what
> >> combinations are supported by QEMU.
> >
> > To summarize:
> >
> >   1. Add a 'io-error' field to query-status (which is only present if
> >  field 'running' is false)
> 
> It may or may not be present.  Lack of presence does not tell you anything.
> 
> It is only true when running is false AND the guest was stopped because 
> of an io error.

Right.

> >
> >   2. Extend query-block to contain error information associated with the
> >  device. This is interesting, because this information will be available
> >  even if the error didn't cause the VM to stop
> 
> Well we need at least some way to indicate that a block device is in a 
> failed state.  For instance, if you have two block device, but you miss 
> the IO_ERROR event, you need to figure out which of the two devices is 
> giving errors.

Can't query-block be used for that? The 'io-error' key will only be present
for the failing device(s).

> 
> But I was thinking of something that had the semantics of, last_iop_failed.
> 
> Regards,
> 
> Anthony Liguori
> 
> > Seems good enough to me, comments?
> >
>

[Qemu-devel] [PATCH] cocoa: Revert dependency on VNC

2011-06-02 Thread Andreas Färber

In 821601ea5b02a68ada479731a4d3d07a9876632a (Make VNC support optional)
cocoa.o was moved from ui-obj-$(CONFIG_COCOA) to vnc-obj-$(CONFIG_COCOA),
adding a dependency on $(CONFIG_VNC). That must've been unintentional.

Cc: Jes Sorensen 
Cc: Anthony Liguori 
Signed-off-by: Andreas Färber 
---
 Makefile.objs |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index 90838f6..2e6419f 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -126,6 +126,7 @@ common-obj-y += $(addprefix audio/, $(audio-obj-y))
 
 ui-obj-y += keymaps.o
 ui-obj-$(CONFIG_SDL) += sdl.o sdl_zoom.o x_keymap.o
+ui-obj-$(CONFIG_COCOA) += cocoa.o
 ui-obj-$(CONFIG_CURSES) += curses.o
 vnc-obj-y += vnc.o d3des.o
 vnc-obj-y += vnc-enc-zlib.o vnc-enc-hextile.o
@@ -133,7 +134,6 @@ vnc-obj-y += vnc-enc-tight.o vnc-palette.o
 vnc-obj-y += vnc-enc-zrle.o
 vnc-obj-$(CONFIG_VNC_TLS) += vnc-tls.o vnc-auth-vencrypt.o
 vnc-obj-$(CONFIG_VNC_SASL) += vnc-auth-sasl.o
-vnc-obj-$(CONFIG_COCOA) += cocoa.o
 ifdef CONFIG_VNC_THREAD
 vnc-obj-y += vnc-jobs-async.o
 else
-- 
1.7.5.3

[Qemu-devel] [RHEL6 qemu-kvm PATCH 02/11] Allow an optional qemu_early_init_vcpu()

2011-06-02 Thread Eduardo Habkost

From: john cooper 

Allow an optional qemu_early_init_vcpu() so that
kvm_arch_get_supported_cpuid() can be used from
cpu_x86_register().  Without this minimal setup,
kvm_arch_get_supported_cpuid() makes kvm_ioctl() choke by
passing it a NULL-initialized KVMState *.

[ehabkost: made Subject line shorter]

Signed-off-by: john cooper 
Signed-off-by: Eduardo Habkost 
---
 cpus.c   |8 
 kvm-all.c|   36 +---
 kvm.h|1 +
 qemu-common.h|2 ++
 target-i386/helper.c |1 +
 5 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/cpus.c b/cpus.c
index 1fc34b7..25122db 100644
--- a/cpus.c
+++ b/cpus.c
@@ -544,6 +544,14 @@ void qemu_main_loop_start(void)
 {
 }
 
+void qemu_early_init_vcpu(void *_env)
+{
+CPUState *env = _env;
+
+if (kvm_enabled())
+kvm_early_init_vcpu(env);
+}
+
 void qemu_init_vcpu(void *_env)
 {
 CPUState *env = _env;
diff --git a/kvm-all.c b/kvm-all.c
index 106eb3a..dc846aa 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -195,24 +195,46 @@ int kvm_pit_in_kernel(void)
 return kvm_state->pit_in_kernel;
 }
 
-int kvm_init_vcpu(CPUState *env)
+/* env->kvm_state is needed early by kvm_check_extension()
+ * break it out so it may be setup early where needed
+ */
+int kvm_early_init_vcpu(CPUState *env)
+
 {
 KVMState *s = kvm_state;
-long mmap_size;
 int ret;
 
-DPRINTF("kvm_init_vcpu\n");
+DPRINTF("kvm_early_init_vcpu\n");
+
+if (env->kvm_state) {  /* already setup */
+return 0;
+}
 
 ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, env->cpu_index);
 if (ret < 0) {
 DPRINTF("kvm_create_vcpu failed\n");
-goto err;
+} else {
+env->kvm_fd = ret;
+env->kvm_state = s;
+env->kvm_vcpu_dirty = 1;
 }
+return ret;
+}
+
+int kvm_init_vcpu(CPUState *env)
+{
+KVMState *s;
+long mmap_size;
+int ret;
+
+DPRINTF("kvm_init_vcpu\n");
 
-env->kvm_fd = ret;
-env->kvm_state = s;
-env->kvm_vcpu_dirty = 1;
+ret = kvm_early_init_vcpu(env);
+if (ret < 0) {
+goto err;
+}
 
+s = env->kvm_state;
 mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
 if (mmap_size < 0) {
 ret = mmap_size;
diff --git a/kvm.h b/kvm.h
index d565dba..fe0631b 100644
--- a/kvm.h
+++ b/kvm.h
@@ -53,6 +53,7 @@ int kvm_has_xcrs(void);
 int kvm_has_many_ioeventfds(void);
 
 #ifdef NEED_CPU_H
+int kvm_early_init_vcpu(CPUState *env);
 int kvm_init_vcpu(CPUState *env);
 
 int kvm_cpu_exec(CPUState *env);
diff --git a/qemu-common.h b/qemu-common.h
index b851b20..2bea318 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -313,8 +313,10 @@ struct qemu_work_item {
 };
 
 #ifdef CONFIG_USER_ONLY
+#define qemu_early_init_vcpu(env) do { } while (0)
 #define qemu_init_vcpu(env) do { } while (0)
 #else
+void qemu_early_init_vcpu(void *env);
 void qemu_init_vcpu(void *env);
 #endif
 
diff --git a/target-i386/helper.c b/target-i386/helper.c
index 89df997..73f44e8 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -1261,6 +1261,7 @@ CPUX86State *cpu_x86_init(const char *cpu_model)
 cpu_set_debug_excp_handler(breakpoint_handler);
 #endif
 }
+qemu_early_init_vcpu(env);
 if (cpu_x86_register(env, cpu_model) < 0) {
 cpu_x86_close(env);
 return NULL;
-- 
1.7.3.2
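
The pattern the patch above introduces (an early init step that may be called
more than once, with the full init calling it first) can be modelled on its
own. This is a simplified sketch with hypothetical stand-in types, not the
QEMU code: the counter stands in for the KVM_CREATE_VCPU ioctl.

```c
#include <stddef.h>

/* Hypothetical stand-ins for KVMState and CPUState; not the QEMU types. */
struct fake_kvm_state { int vcpus_created; };
struct fake_cpu {
    struct fake_kvm_state *kvm_state; /* NULL until early init has run */
    int kvm_fd;
};

static struct fake_kvm_state global_state;

/* Early init: safe to call repeatedly; only the first call creates the vcpu. */
static int fake_early_init_vcpu(struct fake_cpu *cpu)
{
    if (cpu->kvm_state) {
        return 0; /* already set up */
    }
    cpu->kvm_fd = global_state.vcpus_created++; /* stands in for the ioctl */
    cpu->kvm_state = &global_state;
    return cpu->kvm_fd;
}

/* Full init reuses early init, then continues with state that is now valid. */
static int fake_init_vcpu(struct fake_cpu *cpu)
{
    int ret = fake_early_init_vcpu(cpu);
    if (ret < 0) {
        return ret;
    }
    /* ...remaining setup would use cpu->kvm_state here... */
    return 0;
}
```

Because the early step checks for existing state, calling it from the CPU
model code and again from the full init creates the vcpu only once.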

[Qemu-devel] [RHEL6 qemu-kvm PATCH 05/11] cpu defs: use Intel flag names for Intel models

2011-06-02 Thread Eduardo Habkost

Use 'i64' instead of 'lm' and 'xd' instead of 'nx' on Intel models.

The flags have different names on Intel docs, so use those names for clarity.

This is based on a previous patch from John Cooper where this was introduced
with many other changes at the same time. John's original patch submission is
at Message-ID: <4ddad5e7.2020...@redhat.com>.

Signed-off-by: Eduardo Habkost 
---
 sysconfigs/target/target-x86_64.conf |6 +++---
 target-i386/cpuid.c  |4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/sysconfigs/target/target-x86_64.conf 
b/sysconfigs/target/target-x86_64.conf
index a0df33c..fd4e421 100644
--- a/sysconfigs/target/target-x86_64.conf
+++ b/sysconfigs/target/target-x86_64.conf
@@ -9,7 +9,7 @@
stepping = "3"
feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
feature_ecx = "sse3 ssse3 x2apic"
-   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   lm syscall nx"
+   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   i64 syscall xd"
extfeature_ecx = "lahf_lm"
xlevel = "0x800A"
model_id = "Intel Celeron_4x0 (Conroe/Merom Class Core 2)"
@@ -23,7 +23,7 @@
stepping = "3"
feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
feature_ecx = "sse3 cx16 ssse3 sse4.1 x2apic"
-   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   lm syscall nx"
+   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   i64 syscall xd"
extfeature_ecx = "lahf_lm"
xlevel = "0x800A"
model_id = "Intel Core 2 Duo P9xxx (Penryn Class Core 2)"
@@ -37,7 +37,7 @@
stepping = "3"
feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
feature_ecx = "sse3 cx16 ssse3 sse4.1 sse4.2 popcnt x2apic"
-   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   lm syscall nx"
+   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   i64 syscall xd"
extfeature_ecx = "lahf_lm"
xlevel = "0x800A"
model_id = "Intel Core i7 9xx (Nehalem Class Core i7)"
diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index c151f12..fc72f7b 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -57,9 +57,9 @@ static const char *ext2_feature_name[] = {
 "cx8" /* AMD CMPXCHG8B */, "apic", NULL, "syscall",
 "mtrr", "pge", "mca", "cmov",
 "pat", "pse36", NULL, NULL /* Linux mp */,
-"nx" /* Intel xd */, NULL, "mmxext", "mmx",
+"nx|xd", NULL, "mmxext", "mmx",
 "fxsr", "fxsr_opt" /* AMD ffxsr */, "pdpe1gb" /* AMD Page1GB */, "rdtscp",
-NULL, "lm" /* Intel 64 */, "3dnowext", "3dnow",
+NULL, "lm|i64", "3dnowext", "3dnow",
 };
 static const char *ext3_feature_name[] = {
 "lahf_lm" /* AMD LahfSahf */, "cmp_legacy", "svm", "extapic" /* AMD 
ExtApicSpace */,
-- 
1.7.3.2
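
The "nx|xd" and "lm|i64" table entries above pack two accepted names into one
string. How QEMU's own lookup code treats the '|' separator is not shown in
this patch, so the helper below is only a hypothetical illustration of
matching a user-supplied flag name against such an entry.

```c
#include <string.h>

/* Returns 1 if 'name' equals one of the '|'-separated alternatives in
 * 'entry' (for example "nx|xd"), 0 otherwise. Hypothetical helper. */
static int flag_name_matches(const char *entry, const char *name)
{
    size_t len = strlen(name);
    const char *p = entry;

    if (len == 0) {
        return 0;
    }
    while (*p) {
        const char *end = strchr(p, '|');
        size_t alt_len = end ? (size_t)(end - p) : strlen(p);

        /* Compare whole alternatives only, so "i6" does not match "i64". */
        if (alt_len == len && strncmp(p, name, len) == 0) {
            return 1;
        }
        if (!end) {
            break;
        }
        p = end + 1;
    }
    return 0;
}
```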

[Qemu-devel] [RHEL6 qemu-kvm PATCH 07/11] cpu defs: uncomment empty extfeatures_ecx definition for Opteron_G1

2011-06-02 Thread Eduardo Habkost

This should have no visible effect; it just cleans up the config file a
bit.

This is based on a previous patch from John Cooper where this was introduced
with many other changes at the same time. John's original patch submission is
at Message-ID: <4ddad5e7.2020...@redhat.com>.

Signed-off-by: Eduardo Habkost 
---
 sysconfigs/target/target-x86_64.conf |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/sysconfigs/target/target-x86_64.conf 
b/sysconfigs/target/target-x86_64.conf
index 09b30a4..ea310bb 100644
--- a/sysconfigs/target/target-x86_64.conf
+++ b/sysconfigs/target/target-x86_64.conf
@@ -52,7 +52,7 @@
feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
feature_ecx = "sse3 x2apic" # x2apic kvm emulated
extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   lm syscall nx"
-#   extfeature_ecx = ""
+   extfeature_ecx = " "
xlevel = "0x8008"
model_id = "AMD Opteron 240 (Gen 1 Class Opteron)"
 
-- 
1.7.3.2

[Qemu-devel] [RHEL6 qemu-kvm PATCH 04/11] Support -readconfig "?" to debug config file loading

2011-06-02 Thread Eduardo Habkost

From: john cooper 

Failure by qemu to open a default config file is not cause for
an error exit; qemu just quietly continues.  After puzzling
over otherwise opaque config file locations and startup
handling numerous times, some help from qemu seemed
justified.

In the case of a "?" pseudo filename arg to -readconfig,
verbose open of all config files will be enabled.  Normal
handling of config files is otherwise unaffected by this
option.

Note: other CLI flag schemes have been discussed at length
to accommodate this option.  However given the constraints
of the existing user interface, a solution which minimally
impacts the user is ultimately required.

[ehabkost: edited commit message to have better Subject line]

Signed-off-by: john cooper 
Signed-off-by: Eduardo Habkost 
---
 qemu-config.c |   30 +++---
 qemu-config.h |2 +-
 vl.c  |   20 +---
 3 files changed, 33 insertions(+), 19 deletions(-)

diff --git a/qemu-config.c b/qemu-config.c
index 5d7ffa2..b39b8fe 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -666,21 +666,29 @@ out:
 return res;
 }
 
-int qemu_read_config_file(const char *filename)
+/* attempt to open and parse config file, report problems if vflag
+ */
+int qemu_read_config_file(const char *filename, int vflag)
 {
 FILE *f = fopen(filename, "r");
-int ret;
+int rv = 0;
+const char *err;
 
 if (f == NULL) {
-return -errno;
+rv = -errno;
+err = "open";
+} else if (qemu_config_parse(f, vm_config_groups, filename) != 0) {
+rv = -EINVAL;
+err = "parse";
+} else if (vflag) {
+fprintf(stderr, "parsed config file %s\n", filename);
 }
-
-ret = qemu_config_parse(f, vm_config_groups, filename);
-fclose(f);
-
-if (ret == 0) {
-return 0;
-} else {
-return -EINVAL;
+if (f) {
+fclose(f);
+}
+if (rv && vflag) {
+fprintf(stderr, "can't %s config file %s: %s\n",
+err, filename, strerror(-rv));
 }
+return rv;
 }
diff --git a/qemu-config.h b/qemu-config.h
index 20d707f..b90a7cc 100644
--- a/qemu-config.h
+++ b/qemu-config.h
@@ -14,6 +14,6 @@ void qemu_add_globals(void);
 void qemu_config_write(FILE *fp);
 int qemu_config_parse(FILE *fp, QemuOptsList **lists, const char *fname);
 
-int qemu_read_config_file(const char *filename);
+int qemu_read_config_file(const char *filename, int vflag);
 
 #endif /* QEMU_CONFIG_H */
diff --git a/vl.c b/vl.c
index b362871..65b0791 100644
--- a/vl.c
+++ b/vl.c
@@ -2059,6 +2059,7 @@ int main(int argc, char **argv, char **envp)
 int show_vnc_port = 0;
 #endif
 int defconfig = 1;
+int defconfig_verbose = 0;
 const char *trace_file = NULL;
 
 atexit(qemu_run_exit_notifiers);
@@ -2108,6 +2109,12 @@ int main(int argc, char **argv, char **envp)
 case QEMU_OPTION_nodefconfig:
 defconfig=0;
 break;
+case QEMU_OPTION_readconfig:
+/* pseudo filename "?" enables verbose config file handling */
+if (!strcmp(optarg, "?")) {
+defconfig_verbose = 1;
+}
+break;
 }
 }
 }
@@ -2115,12 +2122,13 @@ int main(int argc, char **argv, char **envp)
 if (defconfig) {
 int ret;
 
-ret = qemu_read_config_file(CONFIG_QEMU_CONFDIR "/qemu.conf");
+ret = qemu_read_config_file(CONFIG_QEMU_CONFDIR "/qemu.conf",
+defconfig_verbose);
 if (ret < 0 && ret != -ENOENT) {
 exit(1);
 }
 
-ret = qemu_read_config_file(arch_config_name);
+ret = qemu_read_config_file(arch_config_name, defconfig_verbose);
 if (ret < 0 && ret != -ENOENT) {
 exit(1);
 }
@@ -2857,11 +2865,9 @@ int main(int argc, char **argv, char **envp)
 #endif
 case QEMU_OPTION_readconfig:
 {
-int ret = qemu_read_config_file(optarg);
-if (ret < 0) {
-fprintf(stderr, "read config %s: %s\n", optarg,
-strerror(-ret));
-exit(1);
+if (strcmp(optarg, "?") &&
+qemu_read_config_file(optarg, defconfig_verbose) < 0) {
+exit(1);
 }
 break;
 }
-- 
1.7.3.2
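
The reporting logic in the reworked qemu_read_config_file() can be tried out
in isolation. The version below is a standalone sketch that mirrors the
control flow of the patch; the function names are hypothetical and the parser
is a stub that accepts every file, so only the open failure path reports a
real error.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parser stub: accepts any file. A real parser would return
 * nonzero on a syntax error. */
static int parse_config_stub(FILE *f)
{
    (void)f;
    return 0;
}

/* Open and parse 'filename'; print what happened to stderr only if 'verbose'
 * is set. Returns 0 on success or a negative errno value. */
static int read_config_file_sketch(const char *filename, int verbose)
{
    FILE *f = fopen(filename, "r");
    int rv = 0;
    const char *step = NULL;

    if (f == NULL) {
        rv = -errno;
        step = "open";
    } else if (parse_config_stub(f) != 0) {
        rv = -EINVAL;
        step = "parse";
    } else if (verbose) {
        fprintf(stderr, "parsed config file %s\n", filename);
    }
    if (f) {
        fclose(f);
    }
    if (rv && verbose) {
        fprintf(stderr, "can't %s config file %s: %s\n",
                step, filename, strerror(-rv));
    }
    return rv;
}
```

The caller still decides what a failure means: the patch keeps ignoring
-ENOENT for the default config files and exits on any other error.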

[Qemu-devel] [RHEL6 qemu-kvm PATCH 08/11] reorder cpuid feature bits on target-x86_64.conf

2011-06-02 Thread Eduardo Habkost

This makes the flag order match the bit order in the CPU. This patch just
changes the ordering on the config file, and should have no visible effect.

This is based on a previous patch from John Cooper where this was introduced
with many other changes at the same time. John's original patch submission is
at Message-ID: <4ddad5e7.2020...@redhat.com>.

To make sure the flag sets are really not changed by this patch, I have
used the following stupid script to compare the flag values in the
config files:
https://gist.github.com/1004885

Signed-off-by: Eduardo Habkost 
---
 sysconfigs/target/target-x86_64.conf |   32 
 1 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/sysconfigs/target/target-x86_64.conf 
b/sysconfigs/target/target-x86_64.conf
index ea310bb..d368b6c 100644
--- a/sysconfigs/target/target-x86_64.conf
+++ b/sysconfigs/target/target-x86_64.conf
@@ -7,8 +7,8 @@
family = "6"
model = "15"
stepping = "3"
-   feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
-   feature_ecx = "sse3 ssse3 x2apic"
+   feature_edx = "sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep 
apic cx8 mce pae msr tsc pse de fpu"
+   feature_ecx = "x2apic ssse3 sse3"
extfeature_edx = "i64 syscall xd"
extfeature_ecx = "lahf_lm"
xlevel = "0x800A"
@@ -21,8 +21,8 @@
family = "6"
model = "23"
stepping = "3"
-   feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
-   feature_ecx = "sse3 cx16 ssse3 sse4.1 x2apic"
+   feature_edx = "sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep 
apic cx8 mce pae msr tsc pse de fpu"
+   feature_ecx = "x2apic sse4.1 cx16 ssse3 sse3"
extfeature_edx = "i64 syscall xd"
extfeature_ecx = "lahf_lm"
xlevel = "0x800A"
@@ -35,8 +35,8 @@
family = "6"
model = "26"
stepping = "3"
-   feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
-   feature_ecx = "sse3 cx16 ssse3 sse4.1 sse4.2 popcnt x2apic"
+   feature_edx = "sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep 
apic cx8 mce pae msr tsc pse de fpu"
+   feature_ecx = "popcnt x2apic sse4.2 sse4.1 cx16 ssse3 sse3"
extfeature_edx = "i64 syscall xd"
extfeature_ecx = "lahf_lm"
xlevel = "0x800A"
@@ -49,9 +49,9 @@
family = "15"
model = "6"
stepping = "1"
-   feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
-   feature_ecx = "sse3 x2apic" # x2apic kvm emulated
-   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   lm syscall nx"
+   feature_edx = "sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep 
apic cx8 mce pae msr tsc pse de fpu"
+   feature_ecx = "x2apic sse3" # x2apic kvm emulated
+   extfeature_edx = "lm fxsr mmx nx pat cmov pge syscall apic cx8 mce pae msr 
tsc pse de fpu"
extfeature_ecx = " "
xlevel = "0x8008"
model_id = "AMD Opteron 240 (Gen 1 Class Opteron)"
@@ -63,9 +63,9 @@
family = "15"
model = "6"
stepping = "1"
-   feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
-   feature_ecx = "sse3 cx16 x2apic"# x2apic kvm emulated
-   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   lm syscall nx rdtscp"
+   feature_edx = "sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep 
apic cx8 mce pae msr tsc pse de fpu"
+   feature_ecx = "x2apic cx16 sse3"# x2apic kvm emulated
+   extfeature_edx = "lm rdtscp fxsr mmx nx pat cmov pge syscall apic cx8 mce 
pae msr tsc pse de fpu"
extfeature_ecx = "svm lahf_lm"
xlevel = "0x8008"
model_id = "AMD Opteron 22xx (Gen 2 Class Opteron)"
@@ -77,10 +77,10 @@
family = "15"
model = "6"
stepping = "1"
-   feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
-   feature_ecx = "sse3 cx16 monitor popcnt x2apic" # x2apic kvm emulated
-   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   lm syscall nx rdtscp"
-   extfeature_ecx = "svm sse4a  abm misalignsse lahf_lm"
+   feature_edx = "sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep 
apic cx8 mce pae msr tsc pse de fpu"
+   feature_ecx = "popcnt x2apic cx16 monitor sse3" # x2apic kvm emulated
+   extfeature_edx = "lm rdtscp fxsr mmx nx pat cmov pge syscall apic cx8 mce 
pae msr tsc pse de fpu"
+   extfeature_ecx = "misalignsse sse4a abm svm lahf_lm"
xlevel = "0x8008"
model_id = "AMD Opteron 23xx (Gen 3 Class Opteron)"
 
-- 
1.7.3.2
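
The claim that a reorder leaves each flag set unchanged can be checked with an
order-independent comparison. The following is not the script linked in the
commit message; it is a minimal sketch with hypothetical names that compares
two whitespace-separated flag strings after sorting their tokens, assuming no
flag is listed twice and no line has more than 64 flags.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_FLAGS 64

/* Split a whitespace-separated flag string (modified in place) into tokens. */
static size_t split_flags(char *buf, char *out[], size_t max)
{
    size_t n = 0;

    for (char *tok = strtok(buf, " \t\n"); tok && n < max;
         tok = strtok(NULL, " \t\n")) {
        out[n++] = tok;
    }
    return n;
}

static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Returns 1 if both strings list the same flags, ignoring order and spacing. */
static int same_flag_set(const char *a, const char *b)
{
    char bufa[512], bufb[512];
    char *fa[MAX_FLAGS], *fb[MAX_FLAGS];
    size_t na, nb;

    if (strlen(a) >= sizeof(bufa) || strlen(b) >= sizeof(bufb)) {
        return 0; /* too long for this sketch */
    }
    strcpy(bufa, a);
    strcpy(bufb, b);
    na = split_flags(bufa, fa, MAX_FLAGS);
    nb = split_flags(bufb, fb, MAX_FLAGS);
    if (na != nb) {
        return 0;
    }
    qsort(fa, na, sizeof(fa[0]), cmp_str);
    qsort(fb, nb, sizeof(fb[0]), cmp_str);
    for (size_t i = 0; i < na; i++) {
        if (strcmp(fa[i], fb[i]) != 0) {
            return 0;
        }
    }
    return 1;
}
```

For example, the old and new feature_ecx values of the first model,
"sse3 ssse3 x2apic" and "x2apic ssse3 sse3", compare as equal.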

[Qemu-devel] [RHEL6 qemu-kvm PATCH 06/11] cpu defs: remove replicated flags from Intel

2011-06-02 Thread Eduardo Habkost

This patch removes the replicated feature flags from cpuid 8000_0001:edx
(extfeature_edx) from Intel models, as the duplicated feature flags are present
only on AMD CPUs. On Intel models, only the i64, syscall, and xd flags are kept
on extfeature_edx.

This is based on a previous patch from John Cooper where this was introduced
with many other changes at the same time. John's original patch submission is
at Message-ID: <4ddad5e7.2020...@redhat.com>.

Original John's patch description was:

cpu model bug fixes and definition corrections

This patch was intended to address the replicated feature
flags in cpuid 8000_0001:edx from cpuid 0000_0001:edx.
This is due to AMD's definition where these flags are
mostly cloned in the 8000_0001:edx cpuid function.
qemu64 attempted to glue together the respective Intel
and AMD nearly disjoint features and this propagated to
the new Intel models as doing so was believed conservative
at the time.  However, after further soak and test, lugging
around this cruft doesn't provide any value, could
conceivably confuse a guest, and has confused users trying
to maintain/add cpu definitions.  This also caused issues
for libvirt attempting to track this mis-encoding.

So we've here tossed out the AMD replicated definitions
from the Intel models, added a few replications into AMD
definitions which were missing according to AMD's latest
CPUID document, and reordered the config file flags to
follow intuitive sequential bit ordering.  Also two flag
name aliases were added for clarity to Intel models.  The
end result being the models definitions now conform to
their respective cpuid specifications sans x2apic which is
emulated by kvm.

This was tested with the following combinations:

[Conroe, Penryn, Nehalem] x [F12-64, win64, win32] -- Intel host
[Opteron_G1, Opteron_G2, Opteron_G3] x [F12-64, win64, win32] -- AMD 
host

Yielding successful boots in all cases.

Signed-off-by: john cooper 

Signed-off-by: Eduardo Habkost 
---
 sysconfigs/target/target-x86_64.conf |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/sysconfigs/target/target-x86_64.conf 
b/sysconfigs/target/target-x86_64.conf
index fd4e421..09b30a4 100644
--- a/sysconfigs/target/target-x86_64.conf
+++ b/sysconfigs/target/target-x86_64.conf
@@ -9,7 +9,7 @@
stepping = "3"
feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
feature_ecx = "sse3 ssse3 x2apic"
-   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   i64 syscall xd"
+   extfeature_edx = "i64 syscall xd"
extfeature_ecx = "lahf_lm"
xlevel = "0x800A"
model_id = "Intel Celeron_4x0 (Conroe/Merom Class Core 2)"
@@ -23,7 +23,7 @@
stepping = "3"
feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
feature_ecx = "sse3 cx16 ssse3 sse4.1 x2apic"
-   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   i64 syscall xd"
+   extfeature_edx = "i64 syscall xd"
extfeature_ecx = "lahf_lm"
xlevel = "0x800A"
model_id = "Intel Core 2 Duo P9xxx (Penryn Class Core 2)"
@@ -37,7 +37,7 @@
stepping = "3"
feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
feature_ecx = "sse3 cx16 ssse3 sse4.1 sse4.2 popcnt x2apic"
-   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   i64 syscall xd"
+   extfeature_edx = "i64 syscall xd"
extfeature_ecx = "lahf_lm"
xlevel = "0x800A"
model_id = "Intel Core i7 9xx (Nehalem Class Core i7)"
-- 
1.7.3.2

[Qemu-devel] [RHEL6 qemu-kvm PATCH 00/11] cpu model bug fixes and definition corrections (v2)

2011-06-02 Thread Eduardo Habkost

Hi,

This series is a revamp of the series John Cooper sent at
Message-ID: <4ddad592.3020...@redhat.com> 
(http://marc.info/?l=qemu-devel&m=130618852625770).

The result of applying this series is bit-for-bit exactly the same as applying
the previous series from John. All I did was to rewrite Subject lines and split
one large patch changing CPU flags into small steps, so the changes can be more
easily reviewed/discussed. The proof can be checked by looking at the
'cpudefs-v1-john' and 'cpudefs-v2-ehabkost' branches at
.

Eduardo Habkost (5):
  cpu defs: use Intel flag names for Intel models
  cpu defs: remove replicated flags from Intel
  cpu defs: uncomment empty extfeatures_ecx definition for Opteron_G1
  reorder cpuid feature bits on target-x86_64.conf
  cpu defs: add pse36, mca, mtrr to AMD CPU definitions

john cooper (6):
  correct archaic CPU model "model" field for Intel CPUs.
  Allow an optional qemu_early_init_vcpu()
  Add kvm emulated x2apic flag to config defined cpu models (v2)
  Support -readconfig "?" to debug config file loading
  add Westmere as a qemu cpu model
  add "default" pseudo CPU model name

 cpus.c   |8 
 hw/pc.c  |   41 +++---
 kvm-all.c|   36 +++
 kvm.h|1 +
 qemu-common.h|2 +
 qemu-config.c|   30 ++--
 qemu-config.h|2 +-
 sysconfigs/target/target-x86_64.conf |   58 +++---
 target-i386/cpuid.c  |   65 --
 target-i386/helper.c |2 +
 vl.c |   20 +++
 11 files changed, 193 insertions(+), 72 deletions(-)

-- 
1.7.3.2

[Qemu-devel] [RHEL6 qemu-kvm PATCH 01/11] correct archaic CPU model "model" field for Intel CPUs.

2011-06-02 Thread Eduardo Habkost

From: john cooper 

The old "model" values caused two known problems:

- Skype crashes on a winxp guest if model < 6, due to syscall vs.
  sysenter confusion.

- 32 bit windows doesn't enable MSI support if model < 13.

After consulting with Intel the following recommendations were
received which more accurately represent shipped silicon.

[ehabkost: made Subject line shorter]

Signed-off-by: john cooper 
Signed-off-by: Eduardo Habkost 
---
 sysconfigs/target/target-x86_64.conf |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/sysconfigs/target/target-x86_64.conf 
b/sysconfigs/target/target-x86_64.conf
index 43ad282..0613870 100644
--- a/sysconfigs/target/target-x86_64.conf
+++ b/sysconfigs/target/target-x86_64.conf
@@ -5,7 +5,7 @@
level = "2"
vendor = "GenuineIntel"
family = "6"
-   model = "2"
+   model = "15"
stepping = "3"
feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
feature_ecx = "sse3 ssse3"
@@ -19,7 +19,7 @@
level = "2"
vendor = "GenuineIntel"
family = "6"
-   model = "2"
+   model = "23"
stepping = "3"
feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
feature_ecx = "sse3 cx16 ssse3 sse4.1"
@@ -33,7 +33,7 @@
level = "2"
vendor = "GenuineIntel"
family = "6"
-   model = "2"
+   model = "26"
stepping = "3"
feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
feature_ecx = "sse3 cx16 ssse3 sse4.1 sse4.2 popcnt"
-- 
1.7.3.2

[Qemu-devel] [RHEL6 qemu-kvm PATCH 03/11] Add kvm emulated x2apic flag to config defined cpu models (v2)

2011-06-02 Thread Eduardo Habkost

From: john cooper 

Add kvm emulated x2apic flag to config defined cpu models
and general support for such hypervisor emulated flags.

In addition to checking user request flags against the host
we also selectively check against kvm for emulated flags.

[ehabkost: made Subject line shorter]
[ehabkost: v2: cosmetic: add "x2apic kvm emulated" comments to conf file]

Signed-off-by: john cooper 
Signed-off-by: Eduardo Habkost 
---
 hw/pc.c  |2 +-
 sysconfigs/target/target-x86_64.conf |   12 +++---
 target-i386/cpuid.c  |   61 -
 3 files changed, 51 insertions(+), 24 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 8106197..5b94e53 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -927,7 +927,7 @@ static CPUState *pc_new_cpu(const char *cpu_model)
 
 env = cpu_init(cpu_model);
 if (!env) {
-fprintf(stderr, "Unable to find x86 CPU definition\n");
+fprintf(stderr, "Unable to support requested x86 CPU definition\n");
 exit(1);
 }
 if ((env->cpuid_features & CPUID_APIC) || smp_cpus > 1) {
diff --git a/sysconfigs/target/target-x86_64.conf 
b/sysconfigs/target/target-x86_64.conf
index 0613870..a0df33c 100644
--- a/sysconfigs/target/target-x86_64.conf
+++ b/sysconfigs/target/target-x86_64.conf
@@ -8,7 +8,7 @@
model = "15"
stepping = "3"
feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc 
pse de fpumtrr clflush mca pse36"
-   feature_ecx = "sse3 ssse3"
+   feature_ecx = "sse3 ssse3 x2apic"
extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu 
   lm syscall nx"
extfeature_ecx = "lahf_lm"
xlevel = "0x800A"
@@ -22,7 +22,7 @@
    model = "23"
    stepping = "3"
    feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc pse de fpu mtrr clflush mca pse36"
-   feature_ecx = "sse3 cx16 ssse3 sse4.1"
+   feature_ecx = "sse3 cx16 ssse3 sse4.1 x2apic"
    extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu lm syscall nx"
    extfeature_ecx = "lahf_lm"
    xlevel = "0x8000000A"
@@ -36,7 +36,7 @@
    model = "26"
    stepping = "3"
    feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc pse de fpu mtrr clflush mca pse36"
-   feature_ecx = "sse3 cx16 ssse3 sse4.1 sse4.2 popcnt"
+   feature_ecx = "sse3 cx16 ssse3 sse4.1 sse4.2 popcnt x2apic"
    extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu lm syscall nx"
    extfeature_ecx = "lahf_lm"
    xlevel = "0x8000000A"
@@ -50,7 +50,7 @@
    model = "6"
    stepping = "1"
    feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc pse de fpu mtrr clflush mca pse36"
-   feature_ecx = "sse3"
+   feature_ecx = "sse3 x2apic" # x2apic kvm emulated
    extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu lm syscall nx"
 #   extfeature_ecx = ""
    xlevel = "0x80000008"
@@ -64,7 +64,7 @@
    model = "6"
    stepping = "1"
    feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc pse de fpu mtrr clflush mca pse36"
-   feature_ecx = "sse3 cx16"
+   feature_ecx = "sse3 cx16 x2apic" # x2apic kvm emulated
    extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu lm syscall nx rdtscp"
    extfeature_ecx = "svm lahf_lm"
    xlevel = "0x80000008"
@@ -78,7 +78,7 @@
    model = "6"
    stepping = "1"
    feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc pse de fpu mtrr clflush mca pse36"
-   feature_ecx = "sse3 cx16 monitor popcnt"
+   feature_ecx = "sse3 cx16 monitor popcnt x2apic" # x2apic kvm emulated
    extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu lm syscall nx rdtscp"
    extfeature_ecx = "svm sse4a  abm misalignsse lahf_lm"
    xlevel = "0x80000008"
diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index e479a4d..c151f12 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -549,15 +549,40 @@ static int unavailable_host_feature(struct model_features_t *f, uint32_t mask)
     return 0;
 }
 
-/* best effort attempt to inform user requested cpu flags aren't making
- * their way to the guest.  Note: ft[].check_feat ideally should be
- * specified via a guest_def field to suppress report of extraneous flags.
+/* determine the effective set of cpuid features visible to a guest.
+ * in the case kvm is enabled, we also selectively include features
+ * emulated by the hypervisor
  */
-static int check_features_against_host(x86_def_t *guest_def)
+static void summary_cpuid_features(CPUX86State *env, x86_def_t *hd)
+{
+    struct {
+        uint32_t *pfeat, cmd, reg, mask;
+    } fmap[] = {
+        {&hd->features, 0x00000001, R_EDX, 0},
+        {&hd->ext_features, 0x00000001, R_ECX, CPUID_EXT_X2APIC},
+        {&hd->ext2_features, 0x80000001, R_EDX, 0},
+        {&hd->ext3_features, 0x80000001, R_ECX, 0},
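
The idea in the commit message, taking the host-reported bits and adding only
a masked subset of the bits the hypervisor emulates, can be sketched roughly
as below. This is an illustration under assumptions, not the patch's code:
effective_features() and unavailable_features() are hypothetical helper
names, and the input words are assumed to have been read from the host CPUID
and from kvm's supported-CPUID query by the caller. Only CPUID_EXT_X2APIC
(CPUID.1:ECX bit 21) corresponds to a name used in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define CPUID_EXT_X2APIC (1u << 21) /* CPUID.1:ECX bit 21 */

/*
 * Hypothetical helper: bits visible to the guest are the host bits plus
 * those kvm-reported bits that are explicitly whitelisted as emulated.
 */
uint32_t effective_features(uint32_t host_bits, uint32_t kvm_bits,
                            uint32_t emulated_mask)
{
    uint32_t eff = host_bits | (kvm_bits & emulated_mask);

    /* Never report a bit that neither the host nor kvm offers. */
    assert((eff & ~(host_bits | kvm_bits)) == 0);
    return eff;
}

/* Hypothetical helper: requested bits that cannot be provided. */
uint32_t unavailable_features(uint32_t requested, uint32_t effective)
{
    return requested & ~effective;
}
```

With a mask of 0 the result degenerates to a plain host check, which matches
the table rows above that pass 0 as their mask.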
+

[Qemu-devel] [RHEL6 qemu-kvm PATCH 09/11] cpu defs: add pse36, mca, mtrr to AMD CPU definitions

2011-06-02 Thread Eduardo Habkost

This patch adds flags to extfeature_edx that were missing according to
AMD's latest CPUID document.

This is based on a previous patch from John Cooper, where this change was
introduced together with many other changes. John's original patch submission
is at Message-ID: <4ddad5e7.2020...@redhat.com>.

John's original patch description was:

cpu model bug fixes and definition corrections

This patch was intended to address the replicated feature
flags in cpuid 8000_0001:edx from cpuid 0000_0001:edx.
This is due to AMD's definition where these flags are
mostly cloned in the 8000_0001:edx cpuid function.
qemu64 attempted to glue together the respective Intel
and AMD nearly disjoint features and this propagated to
the new Intel models as doing so was believed conservative
at the time.  However, after further soak and test, lugging
around this cruft doesn't provide any value, could
conceivably confuse a guest, and has confused users trying
to maintain/add cpu definitions.  This also caused issues
for libvirt attempting to track this mis-encoding.

So we've here tossed out the AMD replicated definitions
from the Intel models, added a few replications into AMD
definitions which were missing according to AMD's latest
CPUID document, and reordered the config file flags to
follow intuitive sequential bit ordering.  Also two flag
name aliases were added for clarity to Intel models.  The
end result is that the model definitions now conform to
their respective cpuid specifications, sans x2apic, which is
emulated by kvm.

This was tested with the following combinations:

[Conroe, Penryn, Nehalem] x [F12-64, win64, win32] -- Intel host
[Opteron_G1, Opteron_G2, Opteron_G3] x [F12-64, win64, win32] -- AMD host

Yielding successful boots in all cases.

Signed-off-by: john cooper 

Signed-off-by: Eduardo Habkost 
---
 sysconfigs/target/target-x86_64.conf |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/sysconfigs/target/target-x86_64.conf b/sysconfigs/target/target-x86_64.conf
index d368b6c..3874ff1 100644
--- a/sysconfigs/target/target-x86_64.conf
+++ b/sysconfigs/target/target-x86_64.conf
@@ -51,7 +51,7 @@
    stepping = "1"
    feature_edx = "sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep apic cx8 mce pae msr tsc pse de fpu"
    feature_ecx = "x2apic sse3" # x2apic kvm emulated
-   extfeature_edx = "lm fxsr mmx nx pat cmov pge syscall apic cx8 mce pae msr tsc pse de fpu"
+   extfeature_edx = "lm fxsr mmx nx pse36 pat cmov mca pge mtrr syscall apic cx8 mce pae msr tsc pse de fpu"
    extfeature_ecx = " "
    xlevel = "0x80000008"
    model_id = "AMD Opteron 240 (Gen 1 Class Opteron)"
@@ -65,7 +65,7 @@
    stepping = "1"
    feature_edx = "sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep apic cx8 mce pae msr tsc pse de fpu"
    feature_ecx = "x2apic cx16 sse3" # x2apic kvm emulated
-   extfeature_edx = "lm rdtscp fxsr mmx nx pat cmov pge syscall apic cx8 mce pae msr tsc pse de fpu"
+   extfeature_edx = "lm rdtscp fxsr mmx nx pse36 pat cmov mca pge mtrr syscall apic cx8 mce pae msr tsc pse de fpu"
    extfeature_ecx = "svm lahf_lm"
    xlevel = "0x80000008"
    model_id = "AMD Opteron 22xx (Gen 2 Class Opteron)"
@@ -79,7 +79,7 @@
    stepping = "1"
    feature_edx = "sse2 sse fxsr mmx clflush pse36 pat cmov mca pge mtrr sep apic cx8 mce pae msr tsc pse de fpu"
    feature_ecx = "popcnt x2apic cx16 monitor sse3" # x2apic kvm emulated
-   extfeature_edx = "lm rdtscp fxsr mmx nx pat cmov pge syscall apic cx8 mce pae msr tsc pse de fpu"
+   extfeature_edx = "lm rdtscp fxsr mmx nx pse36 pat cmov mca pge mtrr syscall apic cx8 mce pae msr tsc pse de fpu"
    extfeature_ecx = "misalignsse sse4a abm svm lahf_lm"
    xlevel = "0x80000008"
    model_id = "AMD Opteron 23xx (Gen 3 Class Opteron)"
-- 
1.7.3.2
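
The flag replication discussed in the patch description above can be sketched
in C as follows. This is only an illustration under assumptions:
AMD_ALIAS_MASK and amd_replicate_edx() are hypothetical names, and the bit
list reflects my reading of AMD's CPUID specification, in which these
positions of 8000_0001:edx mirror 0000_0001:edx. The three flags this patch
adds (mtrr, mca, pse36) are bits 12, 14 and 17.

```c
#include <assert.h>
#include <stdint.h>

/* Bits of cpuid 0000_0001:edx that AMD mirrors in 8000_0001:edx
 * (hypothetical macro name; bit positions per the CPUID layout). */
#define AMD_ALIAS_MASK ( \
    (1u << 0)  /* fpu   */ | (1u << 1)  /* vme   */ | \
    (1u << 2)  /* de    */ | (1u << 3)  /* pse   */ | \
    (1u << 4)  /* tsc   */ | (1u << 5)  /* msr   */ | \
    (1u << 6)  /* pae   */ | (1u << 7)  /* mce   */ | \
    (1u << 8)  /* cx8   */ | (1u << 9)  /* apic  */ | \
    (1u << 12) /* mtrr  */ | (1u << 13) /* pge   */ | \
    (1u << 14) /* mca   */ | (1u << 15) /* cmov  */ | \
    (1u << 16) /* pat   */ | (1u << 17) /* pse36 */ | \
    (1u << 23) /* mmx   */ | (1u << 24) /* fxsr  */)

/* Copy the mirrored bits from the standard leaf into the extended leaf,
 * leaving AMD-only bits (syscall, nx, lm, ...) untouched. */
uint32_t amd_replicate_edx(uint32_t std_edx, uint32_t ext_edx)
{
    uint32_t merged = (ext_edx & ~AMD_ALIAS_MASK) | (std_edx & AMD_ALIAS_MASK);

    assert((merged & AMD_ALIAS_MASK) == (std_edx & AMD_ALIAS_MASK));
    return merged;
}
```

Note that bit 11 is deliberately outside the mask: it means sep in the
standard leaf but syscall in the extended one, so it is never copied.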

Re: [Qemu-devel] [PATCH 01/14] Generic DMA memory access interface

2011-06-02 Thread Richard Henderson

On 06/02/2011 10:35 AM, Eduard - Gabriel Munteanu wrote:
> My latest patches seem to have fixed that:
> 
> +    if (plen < *len) {
> +        *len = plen;
> +    }
> +
> +    buf = cpu_physical_memory_map(paddr, len, is_write);

No, len is (or was in previous patches) dma_addr_t which
is not the same as target_phys_addr_t.

Which is why plen was used before, because it was the 
right type.

r~
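
The type concern raised above can be illustrated with a small standalone
sketch. Everything here is an assumption made for illustration: the two
typedef widths are chosen so that the types differ, map_stub() is a
hypothetical stand-in for a mapping call that takes and updates a
target_phys_addr_t length, and dma_map_sketch() is not the code from the
patch series. It only shows the pattern of going through a temporary of the
type the callee expects and copying the result back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;         /* assumption: 64-bit DMA address */
typedef uint32_t target_phys_addr_t; /* assumption: 32-bit target */

/* Hypothetical mapping call: maps at most 4096 bytes and reports the
 * mapped length back through *plen. */
static void *map_stub(target_phys_addr_t paddr, target_phys_addr_t *plen)
{
    static unsigned char page[4096];

    (void)paddr;
    if (*plen > sizeof(page)) {
        *plen = (target_phys_addr_t)sizeof(page);
    }
    return page;
}

void *dma_map_sketch(target_phys_addr_t paddr, dma_addr_t *len)
{
    target_phys_addr_t plen;
    void *buf;

    assert(len != NULL);
    /* Clamp into the callee's type instead of passing len directly. */
    plen = (*len > UINT32_MAX) ? UINT32_MAX : (target_phys_addr_t)*len;
    buf = map_stub(paddr, &plen);
    *len = plen; /* widen the mapped length back for the caller */
    return buf;
}
```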

Re: [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

2011-06-02 Thread Luiz Capitulino

On Thu, 02 Jun 2011 13:33:52 -0500
Anthony Liguori  wrote:

> On 06/02/2011 01:09 PM, Luiz Capitulino wrote:
> > On Thu, 02 Jun 2011 13:00:04 -0500
> > Anthony Liguori  wrote:
> >
> >> On 06/02/2011 12:57 PM, Luiz Capitulino wrote:
> >>> On Wed, 01 Jun 2011 16:35:03 -0500
> >>> Anthony Liguori   wrote:
> >>>
> >>>> On 06/01/2011 04:12 PM, Luiz Capitulino wrote:
> >>>>> Hi there,
> >>>>>
> >>>>> There are people who want to use QMP for thin provisioning. That's, the VM is
> >>>>> started with a small storage and when a no space error is triggered, more space
> >>>>> is allocated and the VM is put to run again.
> >>>>>
> >>>>> QMP has two limitations that prevent people from doing this today:
> >>>>>
> >>>>> 1. The BLOCK_IO_ERROR doesn't contain error information
> >>>>>
> >>>>> 2. Considering we solve item 1, we still have to provide a way for clients
> >>>>>    to query why a VM stopped. This is needed because clients may miss the
> >>>>>    BLOCK_IO_ERROR event or may connect to the VM while it's already stopped
> >>>>>
> >>>>> A proposal to solve both problems follow.
> >>>>>
> >>>>> A. BLOCK_IO_ERROR information
> >>>>> ----------------------------
> >>>>>
> >>>>> We already have discussed this a lot, but didn't reach a consensus. My solution
> >>>>> is quite simple: to add a stringfied errno name to the BLOCK_IO_ERROR event,
> >>>>> for example (see the "reason" key):
> >>>>>
> >>>>> { "event": "BLOCK_IO_ERROR",
> >>>>>   "data": { "device": "ide0-hd1",
> >>>>>             "operation": "write",
> >>>>>             "action": "stop",
> >>>>>             "reason": "enospc", }
> >>>>
> >>>> you can call the reason whatever you want, but don't call it stringfied
> >>>> errno name :-)
> >>>>
> >>>> In fact, just make reason "no space".
> >>>
> >>> You mean, we should do:
> >>>
> >>> "reason": "no space"
> >>>
> >>> Or that we should make it a boolean, like:
> >>>
> >>>    "no space": true
> >>
> >>
> >> Do we need reason in BLOCK_IO_ERROR if query-block returns this information?
> >
> > True, no.
> >
> >>> I'm ok with either way. But in case you meant the second one, I guess
> >>> we should make "reason" a dictionary so that we can group related
> >>> information when we extend the field, for example:
> >>>
> >>>    "reason": { "no space": false, "no permission": true }
> >>
> >> Why would we ever have "no permission"?
> 
> Why did it happen?  It's not clear to me when read/write would return 
> EPERM.  open() should fail.  In fact, EPERM is not mentioned in man 2 read.

Actually, the error was an EACCES, which might sound more bizarre :)

What happened was that the device file in question had its permission
changed during VM execution due to a bug somewhere else. I'm not sure if
the error was returned in a read() or write() (Kevin might have more details).

This is a bit extreme and I'd agree it's arguable whether or not we should
report EACCES, but I had this in mind and ended up mentioning it...

Maybe libvirt guys could provide more input wrt the error reason usage.
If we don't have valid use cases for other errors, then I'll agree that
providing only "no space" is enough.
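
To make the options discussed in this thread concrete, here is a minimal
sketch of turning an errno value into a "reason" string and of a client-side
check on it. Nothing here is existing QEMU or QMP API: io_error_reason() and
io_error_is_no_space() are hypothetical names, and the "other" fallback is an
assumption for errors that have no agreed-upon reason.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping from an I/O errno to a "reason" string. */
const char *io_error_reason(int err)
{
    switch (err) {
    case ENOSPC:
        return "no space";
    case EACCES:
        return "no permission";
    default:
        return "other"; /* assumption: catch-all for unlisted errors */
    }
}

/* Hypothetical client-side check: should more storage be allocated? */
int io_error_is_no_space(const char *reason)
{
    assert(reason != NULL);
    return strcmp(reason, "no space") == 0;
}
```

A thin-provisioning client would only act on the "no space" case; the other
strings matter only if valid use cases for them turn up.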

[Qemu-devel] [RHEL6 qemu-kvm PATCH 10/11] add Westmere as a qemu cpu model

2011-06-02 Thread Eduardo Habkost

From: john cooper 

This patch adds Westmere as a qemu cpu model.  The only
additional guest visible feature of a Westmere relative
to Nehalem is the inclusion of AES instructions.  However
as other non-ABI visible modifications exist along with
fabrication changes, the CPUID data of the corresponding
deployed silicon was altered slightly to reflect this.

We've seen isolated cases where apparently unrelated yet
slightly incoherent CPUID data has caused problems, most
notably during guest boot.  Providing Westmere as a
model separate from Nehalem allows us to more easily address
such quirks.

[ehabkost: edited commit message to have a better Subject line]

Signed-off-by: john cooper 
Signed-off-by: Eduardo Habkost 
---
 sysconfigs/target/target-x86_64.conf |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/sysconfigs/target/target-x86_64.conf b/sysconfigs/target/target-x86_64.conf
index 3874ff1..fcae2dd 100644
--- a/sysconfigs/target/target-x86_64.conf
+++ b/sysconfigs/target/target-x86_64.conf
@@ -43,6 +43,20 @@
    model_id = "Intel Core i7 9xx (Nehalem Class Core i7)"
 
 [cpudef]
+   name = "Westmere"
+   level = "11"
+   vendor = "GenuineIntel"
+   family = "6"
+   model = "44"
+   stepping = "1"
+   feature_edx = "sse2 sse fxsr mmx pat cmov pge sep apic cx8 mce pae msr tsc pse de fpu mtrr clflush mca pse36"
+   feature_ecx = "sse3 cx16 ssse3 sse4.1 sse4.2 x2apic popcnt aes"
+   extfeature_edx = "fxsr mmx pat cmov pge apic cx8 mce pae msr tsc pse de fpu lm syscall nx"
+   extfeature_ecx = "lahf_lm"
+   xlevel = "0x8000000A"
+   model_id = "Westmere E56xx/L56xx/X56xx (Nehalem-C)"
+
+[cpudef]
    name = "Opteron_G1"
    level = "5"
    vendor = "AuthenticAMD"
-- 
1.7.3.2

Re: [Qemu-devel] [PATCH 00/11] cpu model bug fixes and definition corrections (v2)

2011-06-02 Thread Eduardo Habkost

Ouch, the subject prefix is completely wrong because of broken
git-send-email config on my side, sorry.

Please ignore the 'RHEL6 qemu-kvm' prefix, it is actually supposed to go
to the main Qemu tree.

-- 
Eduardo
