date:20110317

[Qemu-devel] Re: [v1 PATCH 3/3]: Convert v9fs_stat to threaded model.

2011-03-17 Thread Stefan Hajnoczi

On Thu, Mar 17, 2011 at 4:26 AM, Venkateswararao Jujjuri (JV)
 wrote:
> On 3/16/2011 10:10 AM, Stefan Hajnoczi wrote:
>> On Wed, Mar 16, 2011 at 2:33 PM, Venkateswararao Jujjuri (JV)
>>  wrote:
>>> On 3/16/2011 3:23 AM, Stefan Hajnoczi wrote:
 On Tue, Mar 15, 2011 at 10:39 AM, Arun R Bharadwaj
  wrote:
> -static void v9fs_stat_post_lstat(V9fsState *s, V9fsStatState *vs, int 
> err)
> +static void v9fs_stat_post_lstat(void *opaque)
>  {
> -    if (err == -1) {
> -        err = -errno;
> +    V9fsStatState *vs = (V9fsStatState *)opaque;

 No need to cast void* in C.

> +    if (vs->err == -1) {
> +        vs->err = -(vs->v9fs_errno);

 How about the thread worker function puts the -errno into a vs->ret field:

 static void v9fs_stat_do_lstat(V9fsRequest *request)
 {
     V9fsStatState *vs = container_of(request, V9fsStatState, request);

     vs->ret = v9fs_do_lstat(vs->s, &vs->fidp->fsmap.path, &vs->stbuf);
     if (vs->ret != 0) {
         vs->ret = -errno;
     }
 }

 Then v9fs_stat_post_lstat() can use vs->ret directly and does not need
 to juggle around the two fields, vs->err and vs->v9fs_errno.

>         goto out;
>     }
>
> -    err = stat_to_v9stat(s, &vs->fidp->fsmap.path, &vs->stbuf, 
> &vs->v9stat);
> -    if (err) {
> +    vs->err = stat_to_v9stat(vs->s, &vs->fidp->fsmap.path, &vs->stbuf, 
> &vs->v9stat);

 This function can block in v9fs_do_readlink().  Needs to be done
 asynchronously to avoid blocking QEMU.

> +    if (vs->err) {
>         goto out;
>     }
>     vs->offset += pdu_marshal(vs->pdu, vs->offset, "wS", 0, &vs->v9stat);
> -    err = vs->offset;
> +    vs->err = vs->offset;
>
>  out:
> -    complete_pdu(s, vs->pdu, err);
> +    complete_pdu(vs->s, vs->pdu, vs->err);
>     v9fs_stat_free(&vs->v9stat);
>     qemu_free(vs);
>  }
>
> +static void v9fs_stat_do_lstat(V9fsRequest *request)
> +{
> +    V9fsStatState *vs = container_of(request, V9fsStatState, request);

 Nice.  Could container_of() be used for v9fs_post_lstat() too?  I'm
 suggesting making post op functions take the V9fsRequest* instead of a
 void* opaque pointer.

> +
> +    vs->err = v9fs_do_lstat(vs->s, &vs->fidp->fsmap.path, &vs->stbuf);

 This is not threadsafe since rpath still uses a static buffer in
 qemu.git.  Please ensure that rpath() is thread-safe before pushing
 this patch.
>>>
>>> There is another patch on the internal list to make rpath thread safe.
>>>

> +    vs->v9fs_errno = errno;
> +}
> +
>  static void v9fs_stat(V9fsState *s, V9fsPDU *pdu)
>  {
>     int32_t fid;
> @@ -1487,6 +1496,10 @@ static void v9fs_stat(V9fsState *s, V9fsPDU *pdu)
>     vs = qemu_malloc(sizeof(*vs));
>     vs->pdu = pdu;
>     vs->offset = 7;
> +    vs->s = s;
> +    vs->request.func = v9fs_stat_do_lstat;
> +    vs->request.post_op.func = v9fs_stat_post_lstat;
> +    vs->request.post_op.arg = vs;
>
>     memset(&vs->v9stat, 0, sizeof(vs->v9stat));
>
> @@ -1498,8 +1511,11 @@ static void v9fs_stat(V9fsState *s, V9fsPDU *pdu)
>         goto out;
>     }
>
> +    /*
>     err = v9fs_do_lstat(s, &vs->fidp->fsmap.path, &vs->stbuf);
>     v9fs_stat_post_lstat(s, vs, err);
> +    */

 Please remove unused code, it quickly becomes out-of-date and confuses 
 readers.

> +    v9fs_qemu_submit_request(&vs->request);

 What happens when another PDU is handled next that uses the same fid?
 The worst case is if the client sends TCLUNK and fid is freed while
 the worker thread and later the post op still access the memory.
 There needs to be some kind of guard (like a reference count) to
 prevent this.
>>>
>>> As per the protocol this should not happen. Client is the controls the fid,
>>> and the fid is created or destroyed per the directive of the client.
>>> It should not send clunk until the response is received on that fid
>>> based operation(if there is any).
>>
>> Unfortunately it is still possible for a guest to do it.  The model
>> for emulated hardware is that the guest is untrusted and we cannot
>> allow things to crash.
>
> Well, it can happen only if the guest OS is hacked...and the worst thing
> can happen is guest goes down. I am not sure how it is different from
> a bare metal system..

No, use after free can lead to arbitrary code execution or other
effects more severe than the guest going down:
http://en.wikipedia.org/wiki/Malloc#Use_after_free

For example if the same memory is handed out by malloc and used to
store a function pointer, then the function pointer can be corrupted
and cause a jump to an arbitrary place in memory.

Hardware emulation is like implementing system calls in an operating
system.  Y

[Qemu-devel] Re: [PATCH 15/26] Virtual hash page table handling on pSeries machine'

2011-03-17 Thread Alexander Graf


On 17.03.2011, at 02:03, David Gibson  wrote:

> On Wed, Mar 16, 2011 at 04:03:47PM +0100, Alexander Graf wrote:
>> On 03/16/2011 05:56 AM, David Gibson wrote:
> [snip]
>>> @@ -248,6 +261,16 @@ static void ppc_spapr_init(ram_addr_t ram_size,
>>> ram_offset = qemu_ram_alloc(NULL, "ppc_spapr.ram", ram_size);
>>> cpu_register_physical_memory(0, ram_size, ram_offset);
>>> 
>>> +/* allocate hash page table */
>>> +htab_size = 1ULL<<  (pteg_shift + 7);
>> 
>> Linux makes the htab size depend on the provided amount of ram.
>> Shouldn't we do the same?
> 
> Well... maybe.  In fact the guidelines for hash allocation tend to be
> rather larger than really necessary for a Linux guest, so generally
> 16mb for the hash will be fine.  This does also correspond to the
> allocation for the guest hash we use in our experimental kvm code
> (making the hash exactly one hugepage makes the necessary contiguous
> allocation easier).

Hrm - ok :).

Alex

[Qemu-devel] Re: [PATCH 16/26] Implement hcall based RTAS for pSeries machines

2011-03-17 Thread Alexander Graf





On 17.03.2011, at 02:22, David Gibson  wrote:

> On Wed, Mar 16, 2011 at 04:08:30PM +0100, Alexander Graf wrote:
>> On 03/16/2011 05:56 AM, David Gibson wrote:
> [snip]
>>> diff --git a/pc-bios/spapr-rtas.bin b/pc-bios/spapr-rtas.bin
>>> new file mode 100644
>>> index 
>>> ..eade9c0e8ff0fd3071e3a6638a11c1a2e9a47152
>>> GIT binary patch
>>> literal 20
>>> bcmb>> 
>>> literal 0
>>> HcmV?d1
>> 
>> Despite being very simple, this is missing source code. There needs
>> to at least be a reference on where to find it in some text file in
>> pc-bios.
> 
> Good point.  Can I just include the source here?  Having an external
> reference to a 5-line asm file seems crazy...

Sure - we already have the source code in-tree for some x86 option roms.

Alex

>

[Qemu-devel] Re: [PATCH 18/26] Implement the PAPR (pSeries) virtualized interrupt controller (xics)

2011-03-17 Thread Alexander Graf


On 17.03.2011, at 02:29, David Gibson  wrote:

> On Wed, Mar 16, 2011 at 04:47:11PM +0100, Alexander Graf wrote:
>> On 03/16/2011 05:56 AM, David Gibson wrote:
> [snip]
>>> +/* static void ics_resend(struct icp_server_state *ss) */
>>> +/* { */
>>> +/* int i; */
>>> +
>>> +/* for (i = 0; i<  xics->nr_irqs; i++) */
>>> +/* ics_resend_irq(xics, nr, ss); */
>>> +/* } */
>> 
>> Why is all this commented out? Better #if 0 it all away. Or even
>> better, don't include it in the patch - unless you think the code is
>> crucial and to be activated soon.
> 
> Hrm, it was supposed to implement level (rather than message)
> interrupts on XICS.  But I think its bitrotted since I commented it
> out.  Removed.
> 
>>> diff --git a/hw/xics.h b/hw/xics.h
>>> new file mode 100644
>>> index 000..e55f5f1
>>> --- /dev/null
>>> +++ b/hw/xics.h
>> 
>> Header missing
> 
> I'm not sure what you mean by this

Every source file should have a license/copyright header ;)

Alex

>

[Qemu-devel] Re: [PATCH 19/26] Add PAPR H_VIO_SIGNAL hypercall and infrastructure for VIO interrupts

2011-03-17 Thread Alexander Graf


On 17.03.2011, at 02:38, David Gibson  wrote:

> On Wed, Mar 16, 2011 at 04:49:07PM +0100, Alexander Graf wrote:
>> On 03/16/2011 05:56 AM, David Gibson wrote:
> [snip]
>>> +return H_PARAMETER;;
>>> +
>>> +dev->signal_state = mode;
>> 
>> No need to notify the device?
> 
> No, at the point it would send an interrupt the device checks
> signal_state.
> 
> That said, I was considering another cleanup to the signal code to
> have the devices use a helper function checking signal_state, and also
> allowing multiple interrupts for a VIO device.  Not sure if it's worth
> folding that into this series or doing it as a later extension.

It's fine to leave it as is for now.

>

[Qemu-devel] Re: [PATCH 25/26] Add a PAPR TCE-bypass mechanism for the pSeries machine

2011-03-17 Thread Alexander Graf

On 17.03.2011, at 04:25, Benjamin Herrenschmidt  
wrote:

> On Thu, 2011-03-17 at 13:21 +1100, David Gibson wrote:
>>> Is this an official extension used by anyone or is it your own
>>> invention that's not implemented in pHyp?
>> 
>> The latter.
> 
> The main reason is to avoid having to deal with TCEs in SLOF :-)

That makes sense :). Let's move this patch to later when you introduce SLOF 
support then? As it is, it would be unused code.

Alex

>

[Qemu-devel] [PATCH +STABLE-0.14] exit if -drive specified is invalid instead of ignoring the "wrong" -drive

2011-03-17 Thread Michael Tokarev

Trivial patch.  I've sent it yesterday but somehow it didn't
reach the list.

This fixes the problem when qemu continues even if -drive specification
is somehow invalid, resulting in a mess.  Applicable for both current
master and for stable-0.14 (and 0.13 and 0.12 as well).

The prob can actually be seriuos: when you start guest with two drives
and make an error in the specification of one of them, and the guest
has something like a raid array on the two drives, guest may start failing
that array or kick "missing" drives which may result in a mess - this is
what actually happened to me, I did't want a resync at all, and a resync
resulted in re-writing (and allocating) a 4TB virtual drive I used for
testing, which in turn resulted in my filesystem filling up and whole
thing failing badly.  Yes it was just testing VM, I experimented with
larger raid arrays, but the end result was quite, well, unexpected.

Thanks!

Signed-off-by: Michael Tokarev 
---
 vl.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/vl.c b/vl.c
index 8bcf2ae..79f996e 100644
--- a/vl.c
+++ b/vl.c
@@ -2098,7 +2098,8 @@ int main(int argc, char **argv, char **envp)
   HD_OPTS);
 break;
 case QEMU_OPTION_drive:
-drive_def(optarg);
+if (drive_def(optarg) == NULL)
+exit(1);
break;
 case QEMU_OPTION_set:
 if (qemu_set_option(optarg) != 0)
-- 
1.7.2.3

[Qemu-devel] Re: [PATCH 04/26] pci: add accessor function to get irq levels

2011-03-17 Thread Michael S. Tsirkin

On Thu, Mar 17, 2011 at 03:05:00PM +0900, Isaku Yamahata wrote:
> On Thu, Mar 17, 2011 at 07:29:09AM +0200, Michael S. Tsirkin wrote:
> > On Wed, Mar 16, 2011 at 06:29:15PM +0900, Isaku Yamahata wrote:
> > > Introduce accessor function to know INTx levels.
> > > It will be used later by q35.
> > > Although piix_pci tracks the intx line levels, it can be eliminated
> > > by this helper function.
> > 
> > At least for piix, the right thing to IMO is to have bit per
> > IRQ, then the for loop can be replaced with a single !!.  There's a TODO
> > there which this will fix.  I think we can reuse pci device irq_state
> > for this: need to check. Haven't looked at q35 yet - applies there as
> > well?
> 
> Yes, such bitmap optimization is possible.
> But this accessor function is still necessary,

OK, I'm convinced. It makes sense off data path,
much easier than try to unswizzle and swizzle back
to the new values.

> please see the following. (I didn't do any test yet. Just to show the idea)
> If you like it, I'll post it as separate patch.

Yes. BTW as long as we touch it, we might want some symbolic
name for constants 0x60, 16, and use PCI_NUM_PINS instead of 4.
Some more suggestions below.

Also, save/restore needs to be updated.

> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> index 151353c..82b7daf 100644
> --- a/hw/piix_pci.c
> +++ b/hw/piix_pci.c
> @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
>  
>  typedef struct PIIX3State {
>  PCIDevice dev;
> +unsigned long irq_level[16];

That's 1024 bits. We really only need 4*16 = 64 bits.
Also pic_levels might be a better name.
So just
   uint64_t pic_levels;

Maybe stick a check there:
#if PCI_NUM_PINS * PIIX_NUM_PIC_IRQS > 64
#error unable to encode pic state in 64 bit in pic_levels
#endif
Also, need to clear on init?

>  int32_t dummy_for_save_load_compat[4];
>  qemu_irq *pic;
>  } PIIX3State;
> @@ -200,25 +201,51 @@ PCIBus *i440fx_init(int *piix3_devfn, qemu_irq *pic, 
> ram_addr_t ram_size)
>  }
>  
>  /* PIIX3 PCI to ISA bridge */
> -
>  static void piix3_set_irq(void *opaque, int irq_num, int level)
>  {
>  int i, pic_irq, pic_level;
>  PIIX3State *piix3 = opaque;
>  
> -/* now we change the pic irq level according to the piix irq mappings */
> -/* XXX: optimize */
>  pic_irq = piix3->dev.config[0x60 + irq_num];
> -if (pic_irq < 16) {
> -/* The pic level is the logical OR of all the PCI irqs mapped
> -   to it */
> -pic_level = 0;
> -for (i = 0; i < 4; i++) {
> -if (pic_irq == piix3->dev.config[0x60 + i]) {
> -pic_level |= pci_bus_get_irq_level(piix3->dev.bus, i);
> -}
> +if (pic_irq >= 16) {
> +return;
> +}
> +
> +/* The pic level is the logical OR of all the PCI irqs mapped to it */
> +if (level) {
> +set_bit(&piix3->irq_level[pic_irq], irq_num);
> +} else {
> +clear_bit(&piix3->irq_level[pic_irq], irq_num);
> +}

We can do this without a branch too I think (assuming uint64_t suggested
above):
mask = 0x1ull << (pic_irq * 16 + irq_num);
piix3->pic_levels &= ~mask;
piix3->pic_levels |= mask;

> +qemu_set_irq(piix3->pic[pic_irq], !!piix3->irq_level[pic_irq]);
> +}
> +
> +/* irq routing is changed. so rebuild bitmap */
> +static void piix3_update_irq_levels(PIIX3State *piix3)
> +{
> +int i;
> +for (i = 0; i < 16; i++) {
> +piix3->irq_level[i] = 0;
> +}

memset(piix3->irq_level, 0, sizeof piix3->irq_level);

> +for (i = 0; i < 4; i++) {
> +int pic_irq = piix3->dev.config[0x60 + irq_num];
> +if (pic_irq >= 16) {
> +continue;
> +}
> +if (pci_bus_get_irq_level(piix3->dev.bus, i)) {
> +set_bit(&piix3->irq_level[pic_irq], i);
>  }
> -qemu_set_irq(piix3->pic[pic_irq], pic_level);

Hmm, don't we need to set the levels in guest appropriately?

There's also some duplication here.
Can't we just do
for (i = 0; i < 4; i++) {
piix3_set_irq(piix3, i, pci_bus_get_irq_level(piix3->dev.bus, 
i));
}
?

> +}
> +}
> +
> +static void piix3_write_config(PCIDevice *dev,
> +   uint32_t address, uint32_t val, int len)
> +{
> +PIIX3State *piix3 = DO_UPCAST(PIIX3State, dev, dev);
> +
> +pci_default_write_config(dev, address, val, len);
> +if (ranges_overlap(address, len, 0x60, 4)) {
> +piix3_update_irq_levels(piix3);
>  }
>  }
>  
> @@ -318,6 +345,7 @@ static PCIDeviceInfo i440fx_info[] = {
>  .qdev.no_user = 1,
>  .no_hotplug   = 1,
>  .init = piix3_initfn,
> +.config_write = piix3_write_config,
>  },{
>  /* end of list */
>  }
> 
> -- 
> yamahata

Re: [Qemu-devel] v2 revamp acpitable parsing and allow to specify complete (headerful) table

2011-03-17 Thread Michael Tokarev

17.03.2011 08:19, Isaku Yamahata wrote:
> Ouch. You already clean it up.

Please excuse me for this.  My first try was just an RFC to show the
"basic idea" - as if it's so much large idea :), it wasn't my intention
to ask for comments about the code itself, I said "_something_ of
this theme".  And I should have Cc'd you on my real submission too,
obviously, which I somehow overlooked.  I should be more careful.

> Here is my diff from your patch. Can you please merged into the patch?
> changes are
> - eliminate unaligned access
> - error report

I intentionally did not implement reporting, because it needs to be
done using generic "error reporting" infrastructure which I haven't
learned yet ;)

> - replace magic number with symbolic constants
> - unconverted strtol(base=10)

Aha, one more case which I didn't spot, thanks! :)

> Signed-off-by: Isaku Yamahata 
> 
> --- qemu-acpi-load-other-0/hw/acpi.c  2011-03-17 14:02:07.0 +0900
> +++ qemu-acpi-load-0/hw/acpi.c2011-03-17 14:14:39.0 +0900
> @@ -19,8 +19,42 @@
>  #include "pc.h"
>  #include "acpi.h"
>  
> +struct acpi_table_header
> +{
> +char signature [4];/* ACPI signature (4 ASCII characters) */
> +uint32_t length;  /* Length of table, in bytes, including header 
> */
> +uint8_t revision; /* ACPI Specification minor version # */
> +uint8_t checksum; /* To make sum of entire table == 0 */
> +char oem_id [6];   /* OEM identification */
> +char oem_table_id [8]; /* OEM table identification */
> +uint32_t oem_revision;/* OEM revision number */
> +char asl_compiler_id [4]; /* ASL compiler vendor ID */
> +uint32_t asl_compiler_revision; /* ASL compiler revision number */
> +} __attribute__((packed));
> +
> +#define ACPI_TABLE_OFF(x)   (offsetof(struct acpi_table_header, x))
> +#define ACPI_TABLE_SIZE(x)  (sizeof(((struct acpi_table_header*)0)->x))
> +
> +#define ACPI_TABLE_SIG_OFF  ACPI_TABLE_OFF(signature)
> +#define ACPI_TABLE_SIG_SIZE ACPI_TABLE_SIZE(signature)

Um.  This is jus too much.  I've a better idea, with less code: after
loading the table, do a memcpy() into a local acpi_table_header
structure, perform all stuff there without the need of these #defines,
and copy it back at the end right before the checksum.

This way, we eliminate the defines, eliminate unaligned access, and
eliminate the need for these uint16 variables too.

I'll cook it in a few minutes, is that ok?

It's just too much (IMHO anyway) for such a little function which
is used so unfrequently :)

>  /* increase number of tables */
> -(*(uint16_t *)acpi_tables) =
> -cpu_to_le32(le32_to_cpu(*(uint16_t *)acpi_tables) + 1);
> +(*(uint16_t*)acpi_tables) =
> +cpu_to_le16(le16_to_cpu(*(uint16_t*)acpi_tables) + 1);
>  return 0;

This is something about which checkpatch.pl complains, telling
there should be a space before "*" in (uint16_t*) ;)

Thank you!

/mjt

[Qemu-devel] Re: [PATCH 25/26] Add a PAPR TCE-bypass mechanism for the pSeries machine

2011-03-17 Thread Benjamin Herrenschmidt

On Thu, 2011-03-17 at 08:44 +0100, Alexander Graf wrote:
> On 17.03.2011, at 04:25, Benjamin Herrenschmidt  
> wrote:
> 
> > On Thu, 2011-03-17 at 13:21 +1100, David Gibson wrote:
> >>> Is this an official extension used by anyone or is it your own
> >>> invention that's not implemented in pHyp?
> >> 
> >> The latter.
> > 
> > The main reason is to avoid having to deal with TCEs in SLOF :-)
> 
> That makes sense :). Let's move this patch to later when you introduce SLOF
> support then? As it is, it would be unused code.

Well, SLOF is around the corner, I just need to find out where to put
the git repo :-)

Cheers,
Ben.

> 
> Alex
> 
> >

[Qemu-devel] Re: [PATCH 1/4] block: clarify the meaning of BDRV_O_NOCACHE

2011-03-17 Thread Kevin Wolf

Am 16.03.2011 18:00, schrieb Stefan Hajnoczi:
> On Wed, Mar 16, 2011 at 2:08 PM, Christoph Hellwig  wrote:
>> On Wed, Mar 16, 2011 at 09:42:37AM +, Stefan Hajnoczi wrote:
>>> -writethrough = ((flags & BDRV_O_CACHE_MASK) == 0);
>>> +writethrough = ((flags & (BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH)) == 0);
>>
>> or rather
>>
>>writethrough = ((flags & (BDRV_O_CACHE_WB) != );
>>
>> but yes, this code had sneaked in since my initial version.
> 
> My intention was that if we don't care about honoring flushes then we
> might as well use Qcow2Cache.  But yes, just checking for cache mode
> is the clearest.

You mean for a possible writethrough mode with BDRV_O_NO_FLUSH set? Such
a mode doesn't make a lot of sense to me.

Kevin

[Qemu-devel] Re: [PATCH] pcnet: Fix sign extension: make ipxe work with >2G RAM

2011-03-17 Thread Stefan Hajnoczi

On Tue, Mar 15, 2011 at 10:47:22AM -0600, Alex Williamson wrote:
> From: Michael Brown 
> 
> The problem is with definitions in hw/pcnet.c such as:
> 
>   #define CSR_CRDA(S)  ((S)->csr[28] | ((S)->csr[29] << 16))
> 
> "(S)->csr[29]" is a uint16_t, but "(S)->csr[29] << 16" gets promoted to
> int, so the overall CSR_CRDA(s) is a (signed) int rather than a uint32_t.
> 
> This then gets assigned to a uint64_t using
> 
>   target_phys_addr_t crda = CSR_CRDA(s);
> 
> so when (S)->csr[29] has the high bit set, we end up with
> crda=0x.
> 
> From: Michael Brown 
> Signed-off-by: Alex Williamson 
> ---
> 
>  hw/pcnet.c |   30 +++---
>  1 files changed, 15 insertions(+), 15 deletions(-)

Reviewed-by: Stefan Hajnoczi

[Qemu-devel] Re: [PATCH 1/4] block: clarify the meaning of BDRV_O_NOCACHE

2011-03-17 Thread Stefan Hajnoczi

On Thu, Mar 17, 2011 at 9:07 AM, Kevin Wolf  wrote:
> Am 16.03.2011 18:00, schrieb Stefan Hajnoczi:
>> On Wed, Mar 16, 2011 at 2:08 PM, Christoph Hellwig  wrote:
>>> On Wed, Mar 16, 2011 at 09:42:37AM +, Stefan Hajnoczi wrote:
 -    writethrough = ((flags & BDRV_O_CACHE_MASK) == 0);
 +    writethrough = ((flags & (BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH)) == 0);
>>>
>>> or rather
>>>
>>>        writethrough = ((flags & (BDRV_O_CACHE_WB) != );
>>>
>>> but yes, this code had sneaked in since my initial version.
>>
>> My intention was that if we don't care about honoring flushes then we
>> might as well use Qcow2Cache.  But yes, just checking for cache mode
>> is the clearest.
>
> You mean for a possible writethrough mode with BDRV_O_NO_FLUSH set? Such
> a mode doesn't make a lot of sense to me.

Yeah, I guess you're right, we can never have BDRV_O_NO_FLUSH without
BDRV_O_CACHE_WB today.  I wasn't thinking end-to-end, just looking at
the BDRV_O_* flag bits.

Stefan

Re: [Qemu-devel] [PATCH v3 4/4] hw/qxl-render: drop cursor locks, replace with pipe

2011-03-17 Thread Alon Levy

On Wed, Mar 16, 2011 at 05:48:41PM +0100, Jes Sorensen wrote:
> On 03/16/11 16:52, Alon Levy wrote:
> > +void qxl_server_request_cursor_set(PCIQXLDevice *qxl, QEMUCursor *c, int 
> > x, int y)
> > +{
> > +QXLServerCursorSetRequest req;
> > +int r;
> > +
> > +req.req = QXL_SERVER_CURSOR_SET;
> > +req.data.c = c;
> > +req.data.x = x;
> > +req.data.y = y;
> > +r = write(qxl->ssd.pipe[1], &req, sizeof(req));
> > +assert(r == sizeof(req));
> > +}
> 
> There's a number of asserts here, which I am not sure is a good thing. I
> don't understand how far down the code this is, and if it is really
> fatal if this write fails?

A failure there means we can't write to a pipe between the server thread
and the iothread (main thread). That is not supposed to happen - and if
it does it means some operation by the spice server will never complete.

Same for the asserts below, writes are from spice server thread, reads
are in iothread.

> 
> > +/* called from spice server thread context only */
> > +void qxl_server_request_cursor_move(PCIQXLDevice *qxl, int x, int y)
> > +{
> > +QXLServerCursorMoveRequest req;
> > +int r;
> > +
> > +req.req = QXL_SERVER_CURSOR_MOVE;
> > +req.data.x = x;
> > +req.data.y = y;
> > +r = write(qxl->ssd.pipe[1], &req, sizeof(req));
> > +assert(r == sizeof(req));
> 
> ditto
> 
> > +static void read_bytes(int fd, void *buf, int len_requested)
> > +{
> > +int len;
> > +int total_len = 0;
> > +
> > +do {
> > +len = read(fd, buf, len_requested - total_len);
> > +if (len < 0) {
> > +if (errno == EINTR || errno == EAGAIN) {
> > +continue;
> > +}
> > +perror("qxl: pipe_read: read failed");
> > +/* will abort once it's out of the while loop */
> > +break;
> > +}
> > +total_len += len;
> > +buf = (uint8_t *)buf + len;
> > +} while (total_len < len_requested);
> > +assert(total_len == len_requested);
> 
> and here?
> 
> Cheers,
> Jes
> 
> 
>

[Qemu-devel] Re: [PATCH 25/26] Add a PAPR TCE-bypass mechanism for the pSeries machine

2011-03-17 Thread Alexander Graf


On 03/17/2011 09:44 AM, Benjamin Herrenschmidt wrote:

On Thu, 2011-03-17 at 08:44 +0100, Alexander Graf wrote:

On 17.03.2011, at 04:25, Benjamin Herrenschmidt  
wrote:


On Thu, 2011-03-17 at 13:21 +1100, David Gibson wrote:

Is this an official extension used by anyone or is it your own
invention that's not implemented in pHyp?

The latter.

The main reason is to avoid having to deal with TCEs in SLOF :-)

That makes sense :). Let's move this patch to later when you introduce SLOF
support then? As it is, it would be unused code.

Well, SLOF is around the corner, I just need to find out where to put
the git repo :-)


Include it in v4 then :)


Alex

Re: [Qemu-devel] [PATCH v3 4/4] hw/qxl-render: drop cursor locks, replace with pipe

2011-03-17 Thread Jes Sorensen

On 03/17/11 10:32, Alon Levy wrote:
> On Wed, Mar 16, 2011 at 05:48:41PM +0100, Jes Sorensen wrote:
>> > On 03/16/11 16:52, Alon Levy wrote:
>>> > > +void qxl_server_request_cursor_set(PCIQXLDevice *qxl, QEMUCursor *c, 
>>> > > int x, int y)
>>> > > +{
>>> > > +QXLServerCursorSetRequest req;
>>> > > +int r;
>>> > > +
>>> > > +req.req = QXL_SERVER_CURSOR_SET;
>>> > > +req.data.c = c;
>>> > > +req.data.x = x;
>>> > > +req.data.y = y;
>>> > > +r = write(qxl->ssd.pipe[1], &req, sizeof(req));
>>> > > +assert(r == sizeof(req));
>>> > > +}
>> > 
>> > There's a number of asserts here, which I am not sure is a good thing. I
>> > don't understand how far down the code this is, and if it is really
>> > fatal if this write fails?
> A failure there means we can't write to a pipe between the server thread
> and the iothread (main thread). That is not supposed to happen - and if
> it does it means some operation by the spice server will never complete.
> 
> Same for the asserts below, writes are from spice server thread, reads
> are in iothread.

But shouldn't this make it try to reconnect? Even if the reconnect
fails, it shouldn't kill the guest IMHO.

Cheers,
Jes

Re: [Qemu-devel] Re: KVM call agenda for Jan 25

2011-03-17 Thread Kevin Wolf

Am 16.03.2011 18:47, schrieb Stefan Hajnoczi:
> On Tue, Mar 15, 2011 at 10:27 AM, Kevin Wolf  wrote:
>> Am 14.03.2011 16:13, schrieb Dushyant Bansal:

 Nice that qemu-img convert isn't that far out by default on raw :).

 About Google Summer of Code, I have posted my take on applying and
 want to share that with you and qemu-devel:

 http://blog.vmsplice.net/2011/03/advice-for-students-applying-to-google.html

>>> Thanks for sharing your experiences.
>>>
>>> After reading about qcow2 and qed and how they organize data (thanks to
>>> the newly added qcow2 doc and discussions on the mailing list), this is
>>> what I understand.
>>>
>>> So, the main difference between qed and qcow2 is the absence of
>>> reference count structure in qed(means less meta data).
>>> It improves performance due to:
>>> 1. For write operations, less or no metadata to update.
>>> 2. Data write and metadata write can be in any order
>>>
>>> This also means these features are no longer supported:
>>> 1. Internal snapshots,
>>> 2. CPU/device state snapshots,
>>> 3. Compression,
>>> 4. Encryption
>>>
>>> Now, coming to qed<-->qcow2 conversion, I want to clarify some things.
>>>
>>> 1. header_size: variable in qed, equals to cluster size in qcow2:
>>> When will it be larger than 1 cluster in qed? So, what will happen to
>>> that extra data on qed->qcow2 conversion.
>>
>> If you have an feature that is used in the original image, but cannot be
>> represented in the new format, I think you should just get an error.
>>
>>> 2. L2 table size: equals to L1 table size in qed, equals to cluster size
>>> in qcow2:
>>> we need to take it into account during conversion.
>>
>> Right. I think we'll have to rewrite all of the metadata.
>>
>> I wonder if we can manage to have a nice block driver interface for
>> in-place image conversions so that we don't only get a qed<->qcow2
>> converter, but also can implement the interface in e.g. VMDK and get
>> VMDK<->qcow2 and VMDK<->qed as well.
> 
> I think this will be tricky but would be very interested if someone
> has ideas.  Code-wise an in-place converter probably needs access to
> both format's on-disk structures or internal functions.  I don't think
> abstracting this is easy because the more you abstract the less
> control you have over keeping things in-place and cleanly putting the
> new structures in place.

Well, if it was easy, I would have suggested a specific way of doing it. ;-)

But it would be a really cool thing to have, and I think it's more fun
for a GSoC participant to actually think about a hard problem than just
doing mostly mechanical work.

> On the other hand, I think the starting point for a generic in-place
> converter would be a loop that does something like bdrv_is_allocated()
> but translates the guest position in the block device into an offset
> into the image file.  That, together with some sort of free map or
> space allocation bitmap would allow a generic approach to figuring out
> the data mapping and which parts of the file can be safely used.

We can discuss the detailed API later, but I agree that the critical
thing to convert is the mapping.

You would probably open the file with the source format driver read-only
and with the destination driver read-write. For qcow2 you would start
with writing a refcount table that marks the whole file as used, other
formats use the file size anyway. Then you can start creating L1 and L2
tables and copy the mapping over. Once this is done, you do an fsck to
free the metadata of the old format.

One thing that may become tricky is the image header which both drivers
may want to use and which is fixed at offset 0. And of course, you must
make sure that the image is safe at any point if the converter crashes.

> The big benefit doing an interface for in-place conversion is that we
> can support n-to-n conversions with at most n converter code rather
> than having to code n * n - n different in-place converters.

Yes, this was the idea.

Kevin

[Qemu-devel] [PATCH] v3 revamp acpitable parsing and allow to specify complete (headerful) table

2011-03-17 Thread Michael Tokarev

This patch almost rewrites acpi_table_add() function
(but still leaves it using old get_param_value() interface).
The result is that it's now possible to specify whole table
(together with a header) in an external file, instead of just
data portion, with a new file= parameter, but at the same time
it's still possible to specify header fields as before.

Now with the checkpatch.pl formatting fixes, thanks to
Stefan Hajnoczi for suggestions, with changes from
Isaku Yamahata, and with my further refinements.

Signed-off-by: Michael Tokarev 
---
 hw/acpi.c   |  285 +++
 qemu-options.hx |7 +-
 2 files changed, 168 insertions(+), 124 deletions(-)

diff --git a/hw/acpi.c b/hw/acpi.c
index 237526d..4cc0311 100644
--- a/hw/acpi.c
+++ b/hw/acpi.c
@@ -15,6 +15,7 @@
  * You should have received a copy of the GNU Lesser General Public
  * License along with this library; if not, see 
  */
+
 #include "hw.h"
 #include "pc.h"
 #include "acpi.h"
@@ -24,17 +25,29 @@
 
 struct acpi_table_header
 {
-char signature [4];/* ACPI signature (4 ASCII characters) */
+uint16_t _length; /* our length, not actual part of the hdr */
+  /* XXX why we have 2 length fields here? */
+char sig[4];  /* ACPI signature (4 ASCII characters) */
 uint32_t length;  /* Length of table, in bytes, including header */
 uint8_t revision; /* ACPI Specification minor version # */
 uint8_t checksum; /* To make sum of entire table == 0 */
-char oem_id [6];   /* OEM identification */
-char oem_table_id [8]; /* OEM table identification */
+char oem_id[6];   /* OEM identification */
+char oem_table_id[8]; /* OEM table identification */
 uint32_t oem_revision;/* OEM revision number */
-char asl_compiler_id [4]; /* ASL compiler vendor ID */
+char asl_compiler_id[4];  /* ASL compiler vendor ID */
 uint32_t asl_compiler_revision; /* ASL compiler revision number */
 } __attribute__((packed));
 
+#define ACPI_TABLE_HDR_SIZE sizeof(struct acpi_table_header)
+#define ACPI_TABLE_PFX_SIZE sizeof(uint16_t)  /* size of the extra prefix */
+
+static const char dfl_hdr[ACPI_TABLE_HDR_SIZE] =
+"\0\0"   /* fake _length (2) */
+"QEMU\0\0\0\0\1\0"   /* sig (4), len(4), revno (1), csum (1) */
+"QEMUQEQEMUQEMU\1\0\0\0" /* OEM id (6), table (8), revno (4) */
+"QEMU\1\0\0\0"   /* ASL compiler ID (4), version (4) */
+;
+
 char *acpi_tables;
 size_t acpi_tables_len;
 
@@ -47,156 +60,182 @@ static int acpi_checksum(const uint8_t *data, int len)
 return (-sum) & 0xff;
 }
 
+/* like strncpy() but zero-fills the tail of destination */
+static void strzcpy(char *dst, const char *src, size_t size)
+{
+size_t len = strlen(src);
+if (len >= size) {
+len = size;
+} else {
+  memset(dst + len, 0, size - len);
+}
+memcpy(dst, src, len);
+}
+
+/* XXX fixme: this function uses obsolete argument parsing interface */
 int acpi_table_add(const char *t)
 {
-static const char *dfl_id = "QEMUQEMU";
 char buf[1024], *p, *f;
-struct acpi_table_header acpi_hdr;
 unsigned long val;
-uint32_t length;
-struct acpi_table_header *acpi_hdr_p;
-size_t off;
+size_t len, start, allen;
+bool has_header;
+int changed;
+int r;
+struct acpi_table_header hdr;
+
+if (get_param_value(buf, sizeof(buf), "data", t)) {
+has_header = false;
+} else if (get_param_value(buf, sizeof(buf), "file", t)) {
+has_header = true;
+} else {
+has_header = false;
+buf[0] = '\0';
+}
+
+if (!acpi_tables) {
+allen = sizeof(uint16_t);
+acpi_tables = qemu_mallocz(allen);
+}
+else {
+allen = acpi_tables_len;
+}
+
+start = allen;
+acpi_tables = qemu_realloc(acpi_tables, start + ACPI_TABLE_HDR_SIZE);
+allen += has_header ? ACPI_TABLE_PFX_SIZE : ACPI_TABLE_HDR_SIZE;
+
+/* now read in the data files, reallocating buffer as needed */
+
+for (f = strtok(buf, ":"); f; f = strtok(NULL, ":")) {
+int fd = open(f, O_RDONLY);
+
+if (fd < 0) {
+fprintf(stderr, "can't open file %s: %s\n", f, strerror(errno));
+return -1;
+}
+
+for (;;) {
+char data[8192];
+r = read(fd, data, sizeof(data));
+if (r == 0) {
+break;
+} else if (r > 0) {
+acpi_tables = qemu_realloc(acpi_tables, allen + r);
+memcpy(acpi_tables + allen, data, r);
+allen += r;
+} else if (errno != EINTR) {
+fprintf(stderr, "can't read file %s: %s\n",
+f, strerror(errno));
+close(fd);
+return -1;
+}
+}
+
+close(fd);
+}
+
+/* now fill in the head

Re: [Qemu-devel] OSX build issues

2011-03-17 Thread François Revol

Hi,

Le 16 mars 2011 à 08:57, Tristan Gingold a écrit :

>> It should fix the build issue.
>> But QEMU is unreliable on OSX even when it gets built.
>> I tried to bisect but lost some time trying to find a revision that actually 
>> builds. I thought I updated more this month...
>> 
>> 55f8d6ac3e03d2859393c281737f60c65dfc9ab3 definitely works ok.
> 
> J'utilise ce hack pour eviter ce probleme.

Indeed this hack works (quite a hack for someone working on ada :p)...

> diff --git a/cpus.c b/cpus.c
> index 5d14468..e976452 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -548,6 +548,7 @@ static void qemu_tcg_init_cpu_signals(void)
> #ifndef CONFIG_IOTHREAD
> int qemu_init_main_loop(void)
> {
> +#if 0
> int ret;
> 
> ret = qemu_signal_init();
> @@ -558,6 +559,7 @@ int qemu_init_main_loop(void)
> qemu_init_sigbus();
> 
> return qemu_event_init();
> +#endif
> }
> 
> void qemu_main_loop_start(void)

(Added a return 0 for the sake of it).

From the content of the functions called it's either one of the added fds which 
cause problem on select() (but why ?), or likely some signal masks which 
interfere with some internal thread in the process.

I sampled the process and took some screenshots without and with the #if 0 hack:
http://revolf.free.fr/osx/shots/shot_qemu_init_main_loop_if0.png
Things work but oddly what is supposed to be an internal dispatcher thread ends 
up executing qemu code. The main thread includes lots of calls from arbitrary 
addresses indicating it receives some signals.

http://revolf.free.fr/osx/shots/shot_qemu_init_main_loop_if1.png
Things stale, and there are 2 more threads that wait, and the main thread seems 
quite stuck in select().

François.

Re: [Qemu-devel] [PATCH v3 4/4] hw/qxl-render: drop cursor locks, replace with pipe

2011-03-17 Thread Alon Levy

On Thu, Mar 17, 2011 at 10:48:43AM +0100, Jes Sorensen wrote:
> On 03/17/11 10:32, Alon Levy wrote:
> > On Wed, Mar 16, 2011 at 05:48:41PM +0100, Jes Sorensen wrote:
> >> > On 03/16/11 16:52, Alon Levy wrote:
> >>> > > +void qxl_server_request_cursor_set(PCIQXLDevice *qxl, QEMUCursor *c, 
> >>> > > int x, int y)
> >>> > > +{
> >>> > > +QXLServerCursorSetRequest req;
> >>> > > +int r;
> >>> > > +
> >>> > > +req.req = QXL_SERVER_CURSOR_SET;
> >>> > > +req.data.c = c;
> >>> > > +req.data.x = x;
> >>> > > +req.data.y = y;
> >>> > > +r = write(qxl->ssd.pipe[1], &req, sizeof(req));
> >>> > > +assert(r == sizeof(req));
> >>> > > +}
> >> > 
> >> > There's a number of asserts here, which I am not sure is a good thing. I
> >> > don't understand how far down the code this is, and if it is really
> >> > fatal if this write fails?
> > A failure there means we can't write to a pipe between the server thread
> > and the iothread (main thread). That is not supposed to happen - and if
> > it does it means some operation by the spice server will never complete.
> > 
> > Same for the asserts below, writes are from spice server thread, reads
> > are in iothread.
> 
> But shouldn't this make it try to reconnect? Even if the reconnect
> fails, it shouldn't kill the guest IMHO.

reconnect? between two threads in the qemu process? why would the write
fail to begin with? this is like saying if I'm failing a kvm ioctl I should
just retry.

> 
> Cheers,
> Jes
> 
> 
>

Re: [Qemu-devel] [PATCH v3 4/4] hw/qxl-render: drop cursor locks, replace with pipe

2011-03-17 Thread Jes Sorensen

On 03/17/11 11:27, Alon Levy wrote:
> On Thu, Mar 17, 2011 at 10:48:43AM +0100, Jes Sorensen wrote:
>>> Same for the asserts below, writes are from spice server thread, reads
>>> are in iothread.
>>
>> But shouldn't this make it try to reconnect? Even if the reconnect
>> fails, it shouldn't kill the guest IMHO.
> 
> reconnect? between two threads in the qemu process? why would the write
> fail to begin with? this is like saying if I'm failing a kvm ioctl I should
> just retry.

Ah ok, I missed that part, somehow I had in my mind it was two different
processes, despite you mentioning threads.

Still if gfx handling fails, it shouldn't nuke the guest.

Cheers,
Jes

Re: [libvirt] [Qemu-devel] KVM call minutes for Mar 15

2011-03-17 Thread Daniel P. Berrange

On Tue, Mar 15, 2011 at 12:06:06PM -0700, Chris Wright wrote:
> * Anthony Liguori (anth...@codemonkey.ws) wrote:
> > On 03/15/2011 09:53 AM, Chris Wright wrote:
> > > QAPI
> 
> > >- c library implementation is critical to have unit tests and test
> > >   driven development
> > >   - thread safe?
> > > - no shared state, no statics.
> > > - threading model requires lock for the qmp session
> > >   - licensiing?
> > > - LGPL
> > >   - forwards/backwards compat?
> > > - designed with that in mind see wiki:
> > >
> > >   http://wiki.qemu.org/Features/QAPI
> > 
> > One neat feature of libqmp is that once libvirt has a better QMP
> > passthrough interface, we can create a QmpSession that uses libvirt.
> > 
> > It would look something like:
> > 
> > QmpSession *libqmp_session_new_libvirt(virDomainPtr dom);
> 
> Looks like you mean this?
> 
>-> request QmpSession -> 
> client  libvirt
><- return QmpSession  <-
> 
> client -> QmpSession -> QMP -> QEMU
> 
> So bypassing libvirt completely to actually use the session?
> 
> Currently, it's more like:
> 
> client -> QemuMonitorCommand -> libvirt -> QMP -> QEMU
> 
> > The QmpSession returned by this call can then be used with all of
> > the libqmp interfaces.  This means we can still exercise our test
> > suite with a guest launched through libvirt.  It also should make
> > the libvirt pass through interface a bit easier to consume by third
> > parties.
> 
> This sounds like it's something libvirt folks should be involved with.
> At the very least, this mode is there now and considered basically
> unstable/experimental/developer use:
> 
>  "Qemu monitor command '%s' executed; libvirt results may be unpredictable!"
> 
> So likely some concern about making it easier to use, esp. assuming
> that third parties above are mgmt apps, not just developers.

Although we provide monitor and command line passthrough in libvirt,
our recommendation is that mgmt apps do not develop against these
APIs. Our goal / policy is that apps should be able todo anything
they need using the formally modelled libvirt public APIs.

The primary intended usage for the monitor/command line passthrough
is debugging & experimentation, and as a very short term hack/workaround
for mgmt apps while formal APIs are added to libvirt. In other words,
we provide the feature because we don't want libvirt to be a roadblock,
but we still strongly discourage their usage untill all other options
have been exhausted.

In same way as loading binary only modules into the kernels sets a
'tainted' flag, we plan that direct usage of monitor/command line
passthrough will set a tainted flag against a VM. This is allow distro
maintainers to identify usage & decide how they wish to support these
features in products (if at all).

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

[Qemu-devel] setting OEM ID in BIOS from qemu?

2011-03-17 Thread Michael Tokarev

Hello.

What's the possible way(s) to set OEM identification
string in BIOS too, so that it provides ACPI tables
with the given identification?

It is needed to support running OEM version of windows
vista and windows 7 (for example) from /dev/sda without
voiding its activation - this needs a real SLIC table
plus the corresponding string other tables (FACP, FACS,
-- I don't know for sure which other table windows
checks, exactly).

First is already available on the host system in
/sys/firmware/acpi/tables/SLIC, and it's possible
to use this table directly with -acpitable file=...
after my patch for hw/acpi.c which I submitted today
for qemu.

The second component is missing currently, hence
this my question.

Thanks!

/mjt

Re: [Qemu-devel] [PATCH v3 4/4] hw/qxl-render: drop cursor locks, replace with pipe

2011-03-17 Thread Alon Levy

On Thu, Mar 17, 2011 at 11:29:03AM +0100, Jes Sorensen wrote:
> On 03/17/11 11:27, Alon Levy wrote:
> > On Thu, Mar 17, 2011 at 10:48:43AM +0100, Jes Sorensen wrote:
> >>> Same for the asserts below, writes are from spice server thread, reads
> >>> are in iothread.
> >>
> >> But shouldn't this make it try to reconnect? Even if the reconnect
> >> fails, it shouldn't kill the guest IMHO.
> > 
> > reconnect? between two threads in the qemu process? why would the write
> > fail to begin with? this is like saying if I'm failing a kvm ioctl I should
> > just retry.
> 
> Ah ok, I missed that part, somehow I had in my mind it was two different
> processes, despite you mentioning threads.
> 
> Still if gfx handling fails, it shouldn't nuke the guest.

ok, try to apply that logic to any other device - network, usb, etc., I don't
think it holds.

> 
> Cheers,
> Jes
>

Re: [Qemu-devel] [PATCH 5/7] ccid: add ccid-card-emulated device

2011-03-17 Thread Alon Levy

On Mon, Mar 14, 2011 at 04:41:02PM +0100, Jes Sorensen wrote:
> On 02/23/11 12:20, Alon Levy wrote:
> > diff --git a/hw/ccid-card-emulated.c b/hw/ccid-card-emulated.c
> > new file mode 100644
> > index 000..bd84d45
> > --- /dev/null
> > +++ b/hw/ccid-card-emulated.c
> > @@ -0,0 +1,599 @@
> > +/*
> > + * CCID Card Device. Emulated card.
> > + *
> > + * Copyright (c) 2011 Red Hat.
> > + * Written by Alon Levy.
> > + *
> > + * This code is licenced under the GNU LGPL, version 2 or later.
> > + */
> > +
> > +/*
> > + * It can be used to provide access to the local hardware in a non 
> > exclusive
> > + * way, or it can use certificates. It requires the usb-ccid bus.
> > + *
> > + * Usage 1: standard, mirror hardware reader+card:
> > + * qemu .. -usb -device usb-ccid -device ccid-card-emulated
> > + *
> > + * Usage 2: use certificates, no hardware required
> > + * one time: create the certificates:
> > + *  for i in 1 2 3; do
> > + *  certutil -d /etc/pki/nssdb -x -t "CT,CT,CT" -S -s "CN=user$i" -n 
> > user$i
> > + *  done
> > + * qemu .. -usb -device usb-ccid \
> > + *  -device ccid-card-emulated,cert1=user1,cert2=user2,cert3=user3
> > + *
> > + * If you use a non default db for the certificates you can specify it 
> > using
> > + * the db parameter.
> > + */
> > +
> > +#include 
> 
> qemu-thread.h
ok, fixing.

> 
> > +static const char *emul_event_to_string(uint32_t emul_event)
> > +{
> > +switch (emul_event) {
> > +case EMUL_READER_INSERT: return "EMUL_READER_INSERT";
> > +case EMUL_READER_REMOVE: return "EMUL_READER_REMOVE";
> > +case EMUL_CARD_INSERT: return "EMUL_CARD_INSERT";
> > +case EMUL_CARD_REMOVE: return "EMUL_CARD_REMOVE";
> > +case EMUL_GUEST_APDU: return "EMUL_GUEST_APDU";
> > +case EMUL_RESPONSE_APDU: return "EMUL_RESPONSE_APDU";
> > +case EMUL_ERROR: return "EMUL_ERROR";
> 
> YUCK!
can we turn down the caps / disgust statements? I understand this
is a personal affront to you somehow? can we settle this at dawn
tommorrow?

> 
> No multi statements on a single line!
> 
> > +#define MAX_ATR_SIZE 40
> > +struct EmulatedState {
> > +CCIDCardState base;
> > +uint8_t  debug;
> > +char*backend_str;
> > +uint32_t backend;
> > +char*cert1;
> > +char*cert2;
> > +char*cert3;
> > +char*db;
> > +uint8_t  atr[MAX_ATR_SIZE];
> > +uint8_t  atr_length;
> > +QSIMPLEQ_HEAD(event_list, EmulEvent) event_list;
> > +pthread_mutex_t event_list_mutex;
> > +VReader *reader;
> > +QSIMPLEQ_HEAD(guest_apdu_list, EmulEvent) guest_apdu_list;
> > +pthread_mutex_t vreader_mutex; /* and guest_apdu_list mutex */
> > +pthread_mutex_t handle_apdu_mutex;
> > +pthread_cond_t handle_apdu_cond;
> > +int  pipe[2];
> > +int  quit_apdu_thread;
> > +pthread_mutex_t apdu_thread_quit_mutex;
> > +pthread_cond_t apdu_thread_quit_cond;
> > +};
> 
> Bad struct packing and wrong thread types.
will use qemu-thread. that's what you mean by wrong thread types, right?
s/pthread_thread_t/QemuThread/ etc. (Cond, Mutex)

> 
> > +static void emulated_push_type(EmulatedState *card, uint32_t type)
> > +{
> > +EmulEvent *event = (EmulEvent *)malloc(sizeof(EmulEvent));
> 
> qemu_malloc()
yep, fixing.

> 
> > +assert(event);
> > +event->p.gen.type = type;
> > +emulated_push_event(card, event);
> > +}
> > +
> > +static void emulated_push_error(EmulatedState *card, uint64_t code)
> > +{
> > +EmulEvent *event = (EmulEvent *)malloc(sizeof(EmulEvent));
> 
> qemu_malloc()
fixing.

> 
> > +assert(event);
> > +event->p.error.type = EMUL_ERROR;
> > +event->p.error.code = code;
> > +emulated_push_event(card, event);
> > +}
> > +
> > +static void emulated_push_data_type(EmulatedState *card, uint32_t type,
> > +const uint8_t *data, uint32_t len)
> > +{
> > +EmulEvent *event = (EmulEvent *)malloc(sizeof(EmulEvent) + len);
> 
> qemu_malloc()
fixing.

> 
> > +static void pipe_read(void *opaque)
> > +{
> > +EmulatedState *card = opaque;
> > +EmulEvent *event, *next;
> > +char dummy;
> > +int len;
> > +
> > +do {
> > +len = read(card->pipe[0], &dummy, sizeof(dummy));
> > +} while (len == sizeof(dummy));
> 
> Shouldn't you check error codes here?
yes, my bad.

> 
> Jes

Re: [Qemu-devel] [PATCH 5/7] ccid: add ccid-card-emulated device

2011-03-17 Thread Alon Levy

On Thu, Mar 17, 2011 at 12:54:16PM +0200, Alon Levy wrote:

sorry for the double review of the review, nothing new here.

> On Mon, Mar 14, 2011 at 04:41:02PM +0100, Jes Sorensen wrote:
> > On 02/23/11 12:20, Alon Levy wrote:
> > > diff --git a/hw/ccid-card-emulated.c b/hw/ccid-card-emulated.c
> > > new file mode 100644
> > > index 000..bd84d45
> > > --- /dev/null
> > > +++ b/hw/ccid-card-emulated.c
> > > @@ -0,0 +1,599 @@
> > > +/*
> > > + * CCID Card Device. Emulated card.
> > > + *
> > > + * Copyright (c) 2011 Red Hat.
> > > + * Written by Alon Levy.
> > > + *
> > > + * This code is licenced under the GNU LGPL, version 2 or later.
> > > + */
> > > +
> > > +/*
> > > + * It can be used to provide access to the local hardware in a non 
> > > exclusive
> > > + * way, or it can use certificates. It requires the usb-ccid bus.
> > > + *
> > > + * Usage 1: standard, mirror hardware reader+card:
> > > + * qemu .. -usb -device usb-ccid -device ccid-card-emulated
> > > + *
> > > + * Usage 2: use certificates, no hardware required
> > > + * one time: create the certificates:
> > > + *  for i in 1 2 3; do
> > > + *  certutil -d /etc/pki/nssdb -x -t "CT,CT,CT" -S -s "CN=user$i" -n 
> > > user$i
> > > + *  done
> > > + * qemu .. -usb -device usb-ccid \
> > > + *  -device ccid-card-emulated,cert1=user1,cert2=user2,cert3=user3
> > > + *
> > > + * If you use a non default db for the certificates you can specify it 
> > > using
> > > + * the db parameter.
> > > + */
> > > +
> > > +#include 
> > 
> > qemu-thread.h
> ok, fixing.
> 
> > 
> > > +static const char *emul_event_to_string(uint32_t emul_event)
> > > +{
> > > +switch (emul_event) {
> > > +case EMUL_READER_INSERT: return "EMUL_READER_INSERT";
> > > +case EMUL_READER_REMOVE: return "EMUL_READER_REMOVE";
> > > +case EMUL_CARD_INSERT: return "EMUL_CARD_INSERT";
> > > +case EMUL_CARD_REMOVE: return "EMUL_CARD_REMOVE";
> > > +case EMUL_GUEST_APDU: return "EMUL_GUEST_APDU";
> > > +case EMUL_RESPONSE_APDU: return "EMUL_RESPONSE_APDU";
> > > +case EMUL_ERROR: return "EMUL_ERROR";
> > 
> > YUCK!
> can we turn down the caps / disgust statements? I understand this
> is a personal affront to you somehow? can we settle this at dawn
> tommorrow?
> 
> > 
> > No multi statements on a single line!
> > 
> > > +#define MAX_ATR_SIZE 40
> > > +struct EmulatedState {
> > > +CCIDCardState base;
> > > +uint8_t  debug;
> > > +char*backend_str;
> > > +uint32_t backend;
> > > +char*cert1;
> > > +char*cert2;
> > > +char*cert3;
> > > +char*db;
> > > +uint8_t  atr[MAX_ATR_SIZE];
> > > +uint8_t  atr_length;
> > > +QSIMPLEQ_HEAD(event_list, EmulEvent) event_list;
> > > +pthread_mutex_t event_list_mutex;
> > > +VReader *reader;
> > > +QSIMPLEQ_HEAD(guest_apdu_list, EmulEvent) guest_apdu_list;
> > > +pthread_mutex_t vreader_mutex; /* and guest_apdu_list mutex */
> > > +pthread_mutex_t handle_apdu_mutex;
> > > +pthread_cond_t handle_apdu_cond;
> > > +int  pipe[2];
> > > +int  quit_apdu_thread;
> > > +pthread_mutex_t apdu_thread_quit_mutex;
> > > +pthread_cond_t apdu_thread_quit_cond;
> > > +};
> > 
> > Bad struct packing and wrong thread types.
> will use qemu-thread. that's what you mean by wrong thread types, right?
> s/pthread_thread_t/QemuThread/ etc. (Cond, Mutex)
> 
> > 
> > > +static void emulated_push_type(EmulatedState *card, uint32_t type)
> > > +{
> > > +EmulEvent *event = (EmulEvent *)malloc(sizeof(EmulEvent));
> > 
> > qemu_malloc()
> yep, fixing.
> 
> > 
> > > +assert(event);
> > > +event->p.gen.type = type;
> > > +emulated_push_event(card, event);
> > > +}
> > > +
> > > +static void emulated_push_error(EmulatedState *card, uint64_t code)
> > > +{
> > > +EmulEvent *event = (EmulEvent *)malloc(sizeof(EmulEvent));
> > 
> > qemu_malloc()
> fixing.
> 
> > 
> > > +assert(event);
> > > +event->p.error.type = EMUL_ERROR;
> > > +event->p.error.code = code;
> > > +emulated_push_event(card, event);
> > > +}
> > > +
> > > +static void emulated_push_data_type(EmulatedState *card, uint32_t type,
> > > +const uint8_t *data, uint32_t len)
> > > +{
> > > +EmulEvent *event = (EmulEvent *)malloc(sizeof(EmulEvent) + len);
> > 
> > qemu_malloc()
> fixing.
> 
> > 
> > > +static void pipe_read(void *opaque)
> > > +{
> > > +EmulatedState *card = opaque;
> > > +EmulEvent *event, *next;
> > > +char dummy;
> > > +int len;
> > > +
> > > +do {
> > > +len = read(card->pipe[0], &dummy, sizeof(dummy));
> > > +} while (len == sizeof(dummy));
> > 
> > Shouldn't you check error codes here?
> yes, my bad.
> 
> > 
> > Jes
>

Re: [Qemu-devel] OSX build issues

2011-03-17 Thread Tristan Gingold


On Mar 17, 2011, at 11:28 AM, François Revol wrote:

> Hi,
> 
> Le 16 mars 2011 à 08:57, Tristan Gingold a écrit :
> 
>>> It should fix the build issue.
>>> But QEMU is unreliable on OSX even when it gets built.
>>> I tried to bisect but lost some time trying to find a revision that 
>>> actually builds. I thought I updated more this month...
>>> 
>>> 55f8d6ac3e03d2859393c281737f60c65dfc9ab3 definitely works ok.
>> 
>> J'utilise ce hack pour eviter ce probleme.
> 
> Indeed this hack works (quite a hack for someone working on ada :p)...

That's why I send it to you via a private mail ;-)

Note that I also get failures on Linux iirc.

> From the content of the functions called it's either one of the added fds 
> which cause problem on select() (but why ?), or likely some signal masks 
> which interfere with some internal thread in the process.
> 
> I sampled the process and took some screenshots without and with the #if 0 
> hack:
> http://revolf.free.fr/osx/shots/shot_qemu_init_main_loop_if0.png
> Things work but oddly what is supposed to be an internal dispatcher thread 
> ends up executing qemu code. The main thread includes lots of calls from 
> arbitrary addresses indicating it receives some signals.
> 
> http://revolf.free.fr/osx/shots/shot_qemu_init_main_loop_if1.png
> Things stale, and there are 2 more threads that wait, and the main thread 
> seems quite stuck in select().

Yes, I have to investigate that...

Tristan.

Re: [Qemu-devel] OSX build issues

2011-03-17 Thread Juha.Riihimaki


On 17.03.11 13:09 , "ext Tristan Gingold"  wrote:

>On Mar 17, 2011, at 11:28 AM, François Revol wrote:
>
>>From the content of the functions called it's either one of the added
>>fds which cause problem on select() (but why ?), or likely some signal
>>masks which interfere with some internal thread in the process.
>> 
>> I sampled the process and took some screenshots without and with the
>>#if 0 hack:
>> http://revolf.free.fr/osx/shots/shot_qemu_init_main_loop_if0.png
>> Things work but oddly what is supposed to be an internal dispatcher
>>thread ends up executing qemu code. The main thread includes lots of
>>calls from arbitrary addresses indicating it receives some signals.
>> 
>> http://revolf.free.fr/osx/shots/shot_qemu_init_main_loop_if1.png
>> Things stale, and there are 2 more threads that wait, and the main
>>thread seems quite stuck in select().
>
>Yes, I have to investigate that...

>From what I've found out so far is that the sigwait_compat() function in
compatfd.c will block all signals with sigprocmask() call and then wait
for a single signal in sigwaitinfo() (forgot what it is). Sounds a bit odd
to me but perhaps there's a reason for that? I tried changing the
sigprocmask() call so that it only blocks the same signals that it will
later wait for and things start rolling again.


Regards,
Juha

[Qemu-devel] Re: [PATCH, RFC] virtio_blk: add cache control support

2011-03-17 Thread Rusty Russell

On Wed, 16 Mar 2011 15:09:58 +0100, Christoph Hellwig  wrote:
> On Wed, Mar 16, 2011 at 02:39:39PM +1030, Rusty Russell wrote:
> > > + if (strncmp(buf, "write through", sizeof("write through") - 1) == 0) {
> > > + ;
> > > + } else if (strncmp(buf, "write back", sizeof("write back") - 1) == 0) {
> > 
> >Is there a reason we're not letting gcc and/or strcmp do the
> > optimization work here?
> 
> I'm happ to switch strcmp.

Of course, that's assuming buf is nul terminated.

> > > + vdev->config->set(vdev, offsetof(struct virtio_blk_config, features),
> > > +   &features, sizeof(features));
> > > +
> > > + vdev->config->get(vdev, offsetof(struct virtio_blk_config, features),
> > > +   &features2, sizeof(features2));
> > > +
> > > + if ((features & VIRTIO_BLK_RT_WCE) !=
> > > + (features2 & VIRTIO_BLK_RT_WCE))
> > > + return -EIO;
> > 
> >This seems like a debugging check you left in.  Or do you suspect
> > some issues?
> 
> No, it's intentional.  config space writes can't return errors, so we need
> to check that the value has really changed.  I'll add a comment explaining it.

OK, under what circumstances could it fail?

If you're using this mechanism to indicate that the host doesn't support
the feature, that's making an assumption about the nature of config
space writes which isn't true for non-PCI virtio.

ie. lguest and S/390 don't trap writes to config space.

Or perhaps they should?  But we should be explicit about needing it...

Thanks,
Rusty.

Re: [Qemu-devel] Re: [PATCH 00/15] QAPI Round 1 (core code generator) (v2)

2011-03-17 Thread Kevin Wolf

Am 16.03.2011 16:59, schrieb Anthony Liguori:
> On 03/16/2011 09:34 AM, Luiz Capitulino wrote:
>> On Fri, 11 Mar 2011 17:05:30 -0600
>> Anthony Liguori  wrote:
>>
>>> For more information about the background of QAPI, see
>>> http://wiki.qemu.org/Features/QAPI
>>>
>>> This series depends on 'QAPI Round 0' which I posted earlier.
>>>
>>> Since v2, the major changes are:
>>>
>>>   - Switch to a multiline code emitter to ease readability
>>>   - Use named parameters for escape sequences
>>>   - Add support for proxy commands
>>>   - Add support for asynchronous commands
>>>
>>> This version still adds a -qmp2 option.  This is the only practical way I 
>>> know
>>> to have testable code while not merging 200 patches all at once.
>> I've started reviewing this and my first impression is that this seems to be
>> real good. However, there's a lot of code here and some parts of it are a bit
>> complicated, so I need more time to do a thorough review and testing.
>>
>> Having said that, my only immediate concern is weather this will have any
>> negative side effects on the wire protocol, today or in the future.
>>
>> I mean, a C library has different extensibility constraints and functionality
>> requirements than a high-level protocol and tying/mixing the two can have
>> bad side effects, like this small one (patch 12/15):
> 
> C library is not quite as important as C interface.  I want QMP to be an 
> interface that we consume internally because that will make QMP a strong 
> external interface.
> 
> A fundamental design characteristic for me is that first and foremost, 
> QMP should be a good C interface and that the wire representation should 
> be easy to support in a good C interface.
> 
> This is a shift in our direction but the good news is that the practical 
> impact is small.  But I don't think there's a lot of value of focusing 
> on non-C consumers because any non-C consumer is capable of consuming a 
> good C interface (but the inverse is not true).
> 
>> +##
>> +# @put_event:
>> +#
>> +# Disconnect a signal.  This command is used to disconnect from a signal 
>> based
>> +# on the handle returned by a signal accessor.
>> +#
>> +# @tag: the handle returned by a signal accessor.
>> +#
>> +# Returns: Nothing on success.
>> +#  If @tag is not a valid handle, InvalidParameterValue
>> +#
>> +# Since: 0.15.0
>>
>> The name 'signal' (at least today) doesn't make sense on the wire protocol,
>> 'put_event' probably doesn't make sense in the C library, nor does 'event'.
> 
> I tried very hard to make events useful without changing the wire 
> protocol significantly but I've failed there.
> 
> I've got a new proposal for handling events that introduces the concept 
> of a signal on the wire and the notion of connecting and disconnecting 
> from signals.
> 
>> Another detail is that, event extension is more important than command
>> extension, because it's probably going to happen. I think it would be very
>> bad to add new events just because we wanted to add a new field.
> 
> The way this is typically handled is that signals tend to pass 
> structures instead of lots of fields.  For instance, most of the GDK 
> events just pass a structure for the event (like GdkButtonEvent).

Can we do that with existing events or would we break the external
interface because we'd have to nest everything one level deeper?

Kevin

Re: [Qemu-devel] Re: [PATCH 00/15] QAPI Round 1 (core code generator) (v2)

2011-03-17 Thread Anthony Liguori


On 03/17/2011 07:21 AM, Kevin Wolf wrote:



Another detail is that, event extension is more important than command
extension, because it's probably going to happen. I think it would be very
bad to add new events just because we wanted to add a new field.

The way this is typically handled is that signals tend to pass
structures instead of lots of fields.  For instance, most of the GDK
events just pass a structure for the event (like GdkButtonEvent).

Can we do that with existing events or would we break the external
interface because we'd have to nest everything one level deeper?


We have to introduce new versions of existing events anyway so we can 
make sure to nest the structures appropriately.  I think BLOCK_IO_ERROR 
is the only one that isn't doing this today FWIW.


Regards,

Anthony Liguori


Kevin

Re: [Qemu-devel] Re: [PATCH 00/15] QAPI Round 1 (core code generator) (v2)

2011-03-17 Thread Kevin Wolf

Am 17.03.2011 13:46, schrieb Anthony Liguori:
> On 03/17/2011 07:21 AM, Kevin Wolf wrote:
>>>
 Another detail is that, event extension is more important than command
 extension, because it's probably going to happen. I think it would be very
 bad to add new events just because we wanted to add a new field.
>>> The way this is typically handled is that signals tend to pass
>>> structures instead of lots of fields.  For instance, most of the GDK
>>> events just pass a structure for the event (like GdkButtonEvent).
>> Can we do that with existing events or would we break the external
>> interface because we'd have to nest everything one level deeper?
> 
> We have to introduce new versions of existing events anyway so we can 
> make sure to nest the structures appropriately.  I think BLOCK_IO_ERROR 
> is the only one that isn't doing this today FWIW.

But then we must always send both events in order to maintain
compatibility, right? That sucks.

If I understand right, the problem with the current events isn't even on
the protocol level, which would be visible externally, but just that it
doesn't map to the C interface in the way you like. Is there a reason to
change the events from a wire protocol perspective?

Kevin

Re: [Qemu-devel] [PATCH 18/26] Implement the PAPR (pSeries) virtualized interrupt controller (xics)

2011-03-17 Thread Anthony Liguori


On 03/16/2011 08:34 PM, David Gibson wrote:



+/*
+ * ICP: Presentation layer
+ */
+
+struct icp_server_state {
+uint32_t cppr :8;
+uint32_t xisr :24;

No real reason to use bitfields here.

Well.. in the hardware xics implementation, CPPR and XISR are
considered fields of the one 32-bit register, XIRR.  Matching that is
why I have the bitfield.


Bitfields don't work well with the way we save device state.

Regards,

Anthony Liguori

Re: [Qemu-devel] [PATCH 03/26] Add a hook to allow hypercalls to be emulated on PowerPC

2011-03-17 Thread Anthony Liguori


On 03/16/2011 11:55 PM, David Gibson wrote:

On Wed, Mar 16, 2011 at 03:44:49PM -0500, Anthony Liguori wrote:

On 03/15/2011 11:56 PM, David Gibson wrote:

From: David Gibson

PowerPC and POWER chips since the POWER4 and 970 have a special
hypervisor mode, and a corresponding form of the system call
instruction which traps to the hypervisor.

qemu currently has stub implementations of hypervisor mode.  That
is, the outline is there to allow qemu to run a PowerPC hypervisor
under emulation.  There are a number of details missing so this
won't actually work at present, but the idea is there.

What there is no provision at all, is for qemu to instead emulate
the hypervisor itself.  That is to have hypercalls trap into qemu
and their result be emulated from qemu, rather than running
hypervisor code within the emulated system.

Hypervisor hardware aware KVM implementations are in the works and
it would  be useful for debugging and development to also allow
full emulation of the same para-virtualized guests as such a KVM.

Therefore, this patch adds a hook which will allow a machine to
set up emulation of hypervisor calls.

Signed-off-by: David Gibson
---
  target-ppc/cpu.h|2 ++
  target-ppc/helper.c |4 
  2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index a20c132..eaddc27 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -692,6 +692,8 @@ struct CPUPPCState {
  int bfd_mach;
  uint32_t flags;
  uint64_t insns_flags;
+void (*emulate_hypercall)(CPUState *, void *);
+void *hcall_opaque;

Is the hypercall handler ever specific to a CPU?

If you mean, "is the hypercall environment ever different from one cpu
to another within the same guest at the same time", then no.  Or at
least, no for any platform that exists now, and anything plausible I
can think of.


Yes, that's what I was asking.  So having a function pointer in each 
CPUState isn't necessary.



If you mean can the hypercall ABI and handling be different for
different CPU models within an architecture, then yes.  It's not there
yet, but BookE CPUs *will* have a quite different hypercall
environment to the PAPR hypercall environment used on IBM servers.


I'd prefer to see this as a generic interface that wasn't specific
to target-ppc.
Basically, add a:

void cpu_hypercall(CPUState *env);

And then implement it within your target.

I'm not exactly sure what you mean by "target" here.  It is *not*
sufficient to make the hypercall function per guest architecture, it
must be per machine.  However, it could be a global hook rather than
in the CPUState.


I'd suggest a totally generic hypercall infrastructure but I know that's 
not plausible for Power.  So I'm suggesting defining cpu_hypercall() in 
cpu.h, and then somewhere in target-ppc/, you can implement whatever 
logic you need to support that function.


This fits well with how we dispatch other forms of I/O (cpu_outb, 
cpu_physical_memory_rw, etc).


Regards,

Anthony Liguori

Re: [Qemu-devel] Re: [PATCH 00/15] QAPI Round 1 (core code generator) (v2)

2011-03-17 Thread Anthony Liguori


On 03/17/2011 08:15 AM, Kevin Wolf wrote:

Am 17.03.2011 13:46, schrieb Anthony Liguori:

On 03/17/2011 07:21 AM, Kevin Wolf wrote:

Another detail is that, event extension is more important than command
extension, because it's probably going to happen. I think it would be very
bad to add new events just because we wanted to add a new field.

The way this is typically handled is that signals tend to pass
structures instead of lots of fields.  For instance, most of the GDK
events just pass a structure for the event (like GdkButtonEvent).

Can we do that with existing events or would we break the external
interface because we'd have to nest everything one level deeper?

We have to introduce new versions of existing events anyway so we can
make sure to nest the structures appropriately.  I think BLOCK_IO_ERROR
is the only one that isn't doing this today FWIW.

But then we must always send both events in order to maintain
compatibility, right? That sucks.


No, it's more complicated than that unfortunately.  The old events are 
"broadcast events".  The new event/signal model requires explicit 
registration.  There is a capabilities negotiation feature that let's us 
disable broadcast events.


So from the wire perspective, a newer client will never see/care about 
broadcast events.



If I understand right, the problem with the current events isn't even on
the protocol level, which would be visible externally,


No, the problem with the old events is that they aren't 
registered/maskable.  So even if you don't care about BLOCK_IO_ERROR, 
you're getting the notification.  Plus, we'd like to add the ability to 
add a tag to events when we register them.


The other problem is that events are all global today.  BLOCK_IO_ERROR 
is a good example of this.  It's really an error that's specific to a 
block device and it passes the name of the block device that it's 
specific to as an argument.  But if we have a masking mechanism it could 
only globally enable/disable BLOCK_IO_ERROR for all devices.


It would be much nicer to be able to enable the event for specific block 
devices.  This requires some protocol visible changes.  I'm still 
writing up these changes but hope to have something for review soon.




  but just that it
doesn't map to the C interface in the way you like.


I think I've maybe been using "C interface" to much.  The current event 
wire protocol doesn't map to any client interface well.


Regards,

Anthony Liguori


  Is there a reason to
change the events from a wire protocol perspective?

Kevin




Regards,

Anthony Liguori

Re: [Qemu-devel] [PATCH 4/7] libcacard: initial commit

2011-03-17 Thread Alon Levy

On Mon, Mar 14, 2011 at 04:20:22PM +0100, Jes Sorensen wrote:
> On 02/23/11 12:20, Alon Levy wrote:
> > +/* private data for PKI applets */
> > +typedef struct CACPKIAppletDataStruct {
> > +unsigned char *cert;
> > +int cert_len;
> > +unsigned char *cert_buffer;
> > +int cert_buffer_len;
> > +unsigned char *sign_buffer;
> > +int sign_buffer_len;
> > +VCardKey *key;
> > +} CACPKIAppletData;
> 
> Grouping the ints together would allow for better struct padding.
> 
> > +/*
> > + *  resest the inter call state between applet selects
> > + */
> 
> I presume it meant to say 'resets' ?
> 
> > diff --git a/libcacard/event.c b/libcacard/event.c
> > new file mode 100644
> > index 000..20276a0
> > --- /dev/null
> > +++ b/libcacard/event.c
> > @@ -0,0 +1,112 @@
> > +/*
> > + *
> > + */
> 
> This comment is really spot on :)
> 
> > diff --git a/libcacard/mutex.h b/libcacard/mutex.h
> > new file mode 100644
> > index 000..cb46aa7
> > --- /dev/null
> > +++ b/libcacard/mutex.h
> 
> UH OH!
> 
> > +/*
> > + *  This header file provides a way of mapping windows and linux thread 
> > calls
> > + *  to a set of macros.  Ideally this would be shared by whatever 
> > subsystem we
> > + *  link with.
> > + */
> > +
> > +#ifndef _H_MUTEX
> > +#define _H_MUTEX
> > +#ifdef _WIN32
> > +#include 
> > +typedef CRITICAL_SECTION mutex_t;
> > +#define MUTEX_INIT(mutex) InitializeCriticalSection(&mutex)
> > +#define MUTEX_LOCK(mutex) EnterCriticalSection(&mutex)
> > +#define MUTEX_UNLOCK(mutex) LeaveCriticalSection(&mutex)
> > +typedef CONDITION_VARIABLE condition_t;
> > +#define CONDITION_INIT(cond) InitializeConditionVariable(&cond)
> > +#define CONDITION_WAIT(cond, mutex) \
> > +SleepConditionVariableCS(&cond, &mutex, INFINTE)
> > +#define CONDITION_NOTIFY(cond) WakeConditionVariable(&cond)
> > +typedef uint32_t thread_t;
> > +typedef HANDLE thread_status_t;
> > +#define THREAD_CREATE(tid, func, arg) \
> > +CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)func, arg, 0, &tid)
> > +#define THREAD_SUCCESS(status) ((status) !=  NULL)
> > +#else
> > +#include 
> > +typedef pthread_mutex_t mutex_t;
> > +#define MUTEX_INIT(mutex) pthread_mutex_init(&mutex, NULL)
> > +#define MUTEX_LOCK(mutex) pthread_mutex_lock(&mutex)
> > +#define MUTEX_UNLOCK(mutex) pthread_mutex_unlock(&mutex)
> > +typedef pthread_cond_t condition_t;
> > +#define CONDITION_INIT(cond) pthread_cond_init(&cond, NULL)
> > +#define CONDITION_WAIT(cond, mutex) pthread_cond_wait(&cond, &mutex)
> > +#define CONDITION_NOTIFY(cond) pthread_cond_signal(&cond)
> > +typedef pthread_t thread_t;
> > +typedef int thread_status_t;
> > +#define THREAD_CREATE(tid, func, arg) pthread_create(&tid, NULL, func, arg)
> > +#define THREAD_SUCCESS(status)  ((status) == 0)
> > +#endif
> 
> NACK! This is no good, the code needs to be fixed to use QemuMutex from
> qemu-thread.h
> 
> In addition, a thing like a mutex feature should be in a separate patch,
>  not part of the code that uses it. However QEMU already has a mutex set
> so this part needs to go.
> 
> > +static VCardAppletPrivate *
> > +passthru_new_applet_private(VReader *reader)
> > +{
> > +VCardAppletPrivate *applet_private = NULL;
> > +LONG rv;
> > +
> > +applet_private = (VCardAppletPrivate 
> > *)malloc(sizeof(VCardAppletPrivate));
> 
> qemu_malloc()
> 
> > +if (applet_private == NULL) {
> > +goto fail;
> > +}
> 
> and it never fails.
> 
> > +if (new_reader_list_len != reader_list_len) {
> > +/* update the list */
> > +new_reader_list = (char *)malloc(new_reader_list_len);
> 
> qemu_malloc() again.
> 
> Please grep through the full patch set and make sure you do not have any
> direct calls to malloc() or free().
> 
> > +/* try resetting the pcsc_lite subsystem */
> > +SCardReleaseContext(global_context);
> > +global_context = 0; /* should close it */
> > +printf("* SCard failure %x\n", rv);
> 
> error_report()
> 
> > diff --git a/libcacard/vcard_emul_nss.c b/libcacard/vcard_emul_nss.c
> > new file mode 100644
> > index 000..e4f0b73
> > --- /dev/null
> > +++ b/libcacard/vcard_emul_nss.c
> [snip]
> > +struct VReaderEmulStruct {
> > +PK11SlotInfo *slot;
> > +VCardEmulType default_type;
> > +char *type_params;
> > +PRBool present;
> 
> What is PRBool and where does it come from?
> 
> > +void
> > +vcard_emul_reset(VCard *card, VCardPower power)
> > +{
> > +PK11SlotInfo *slot;
> > +
> > +if (!nss_emul_init) {
> > +return;
> > +}
> > +
> > +/* if we reset the card (either power on or power off), we loose our 
> > login
> > + * state */
> > +/* TODO: we may also need to send insertion/removal events? */
> > +slot = vcard_emul_card_get_slot(card);
> > +(void)PK11_Logout(slot);
> 
> We don't (void) cast calls like that in QEMU.
> 
> > +/*
> > + *  NSS needs the app to supply a password prompt. In our case the only 
> > time
> >

[Qemu-devel] [PATCH 1/3] pci: add accessor function to get irq levels

2011-03-17 Thread Isaku Yamahata

Introduce accessor function to know INTx levels.
It will be used later by q35.
Although piix_pci tracks the intx line levels, it can be eliminated
by this helper function.

Cc: Michael S. Tsirkin 
Signed-off-by: Isaku Yamahata 
---
 hw/pci.c |7 +++
 hw/pci.h |1 +
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index d6c5e66..67cb3d7 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -126,6 +126,13 @@ static void pci_change_irq_level(PCIDevice *pci_dev, int 
irq_num, int change)
 bus->set_irq(bus->irq_opaque, irq_num, bus->irq_count[irq_num] != 0);
 }
 
+int pci_bus_get_irq_level(PCIBus *bus, int irq_num)
+{
+assert(irq_num >= 0);
+assert(irq_num < bus->nirq);
+return !!bus->irq_count[irq_num];
+}
+
 /* Update interrupt status bit in config space on interrupt
  * state change. */
 static void pci_update_irq_status(PCIDevice *dev)
diff --git a/hw/pci.h b/hw/pci.h
index 46b3ad3..f523722 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -234,6 +234,7 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
 PCIBus *pci_bus_new(DeviceState *parent, const char *name, int devfn_min);
 void pci_bus_irqs(PCIBus *bus, pci_set_irq_fn set_irq, pci_map_irq_fn map_irq,
   void *irq_opaque, int nirq);
+int pci_bus_get_irq_level(PCIBus *bus, int irq_num);
 void pci_bus_hotplug(PCIBus *bus, pci_hotplug_fn hotplug, DeviceState *dev);
 /* 0 <= pin <= 3 0 = INTA, 1 = INTB, 2 = INTC, 3 = INTD */
 int pci_swizzle_map_irq_fn(void *opaque, PCIDevice *pci_dev, int pin);
-- 
1.7.1.1

[Qemu-devel] [PATCH 0/3] piix_pci: optimize irq data path

2011-03-17 Thread Isaku Yamahata

This patch series optimizes irq data path of piix_pci.
So far piix3 tracks each pirq level and checks whether a given pic pins is
asserted by seeing if each pirq is mapped into the pic pin.
This is independent on irq routing, but data path is on slow path.

Given that irq routing is rarely changed and asserting pic pins is on
data path, the path that asserts pic pins should be optimized and
chainging irq routing should be on slow path.
The new behavior with this patch series is to use bitmap which is addressed
by pirq and pic pins with a given irq routing.
When pirq is asserted, the bitmap is set and see if the pic pins is
asserted by checking the bitmaps.
When irq routing is changed, rebuild the bitmap and re-assert pic pins.

Isaku Yamahata (3):
  pci: add accessor function to get irq levels
  piix_pci: eliminate PIIX3State::pci_irq_levels
  piix_pci: optimize set irq path

 hw/pci.c  |7 +++
 hw/pci.h  |1 +
 hw/piix_pci.c |  126 -
 3 files changed, 115 insertions(+), 19 deletions(-)

[Qemu-devel] [PATCH 2/3] piix_pci: eliminate PIIX3State::pci_irq_levels

2011-03-17 Thread Isaku Yamahata

PIIX3State::pci_irq_levels are redundant which is already tracked by
PCIBus layer. So eliminate them.

Cc: Michael S. Tsirkin 
Signed-off-by: Isaku Yamahata 
---
 hw/piix_pci.c |   31 +--
 1 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 892c576..2d0ad9b 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -39,7 +39,7 @@ typedef PCIHostState I440FXState;
 
 typedef struct PIIX3State {
 PCIDevice dev;
-int pci_irq_levels[4];
+int32_t dummy_for_save_load_compat[4];
 qemu_irq *pic;
 } PIIX3State;
 
@@ -162,9 +162,11 @@ static int i440fx_load_old(QEMUFile* f, void *opaque, int 
version_id)
 i440fx_update_memory_mappings(d);
 qemu_get_8s(f, &d->smm_enabled);
 
-if (version_id == 2)
-for (i = 0; i < 4; i++)
-d->piix3->pci_irq_levels[i] = qemu_get_be32(f);
+if (version_id == 2) {
+for (i = 0; i < 4; i++) {
+qemu_get_be32(f); /* dummy load for compatibility */
+}
+}
 
 return 0;
 }
@@ -256,8 +258,6 @@ static void piix3_set_irq(void *opaque, int irq_num, int 
level)
 int i, pic_irq, pic_level;
 PIIX3State *piix3 = opaque;
 
-piix3->pci_irq_levels[irq_num] = level;
-
 /* now we change the pic irq level according to the piix irq mappings */
 /* XXX: optimize */
 pic_irq = piix3->dev.config[0x60 + irq_num];
@@ -266,8 +266,9 @@ static void piix3_set_irq(void *opaque, int irq_num, int 
level)
to it */
 pic_level = 0;
 for (i = 0; i < 4; i++) {
-if (pic_irq == piix3->dev.config[0x60 + i])
-pic_level |= piix3->pci_irq_levels[i];
+if (pic_irq == piix3->dev.config[0x60 + i]) {
+pic_level |= pci_bus_get_irq_level(piix3->dev.bus, i);
+}
 }
 qemu_set_irq(piix3->pic[pic_irq], pic_level);
 }
@@ -309,8 +310,17 @@ static void piix3_reset(void *opaque)
 pci_conf[0xab] = 0x00;
 pci_conf[0xac] = 0x00;
 pci_conf[0xae] = 0x00;
+}
 
-memset(d->pci_irq_levels, 0, sizeof(d->pci_irq_levels));
+static void piix3_pre_save(void *opaque)
+{
+int i;
+PIIX3State *piix3 = opaque;
+
+for (i = 0; i < ARRAY_SIZE(piix3->dummy_for_save_load_compat); i++) {
+piix3->dummy_for_save_load_compat[i] =
+pci_bus_get_irq_level(piix3->dev.bus, i);
+}
 }
 
 static const VMStateDescription vmstate_piix3 = {
@@ -318,9 +328,10 @@ static const VMStateDescription vmstate_piix3 = {
 .version_id = 3,
 .minimum_version_id = 2,
 .minimum_version_id_old = 2,
+.pre_save = piix3_pre_save,
 .fields  = (VMStateField []) {
 VMSTATE_PCI_DEVICE(dev, PIIX3State),
-VMSTATE_INT32_ARRAY_V(pci_irq_levels, PIIX3State, 4, 3),
+VMSTATE_INT32_ARRAY_V(dummy_for_save_load_compat, PIIX3State, 4, 3),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
1.7.1.1

[Qemu-devel] [PATCH 3/3] piix_pci: optimize set irq path

2011-03-17 Thread Isaku Yamahata

optimize irq routing in piix_pic.c which has been a TODO.

Cc: Michael S. Tsirkin 
Signed-off-by: Isaku Yamahata 
---
 hw/piix_pci.c |  103 +---
 1 files changed, 90 insertions(+), 13 deletions(-)

diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 2d0ad9b..80ce205 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -29,6 +29,7 @@
 #include "isa.h"
 #include "sysbus.h"
 #include "range.h"
+#include "bitops.h"
 
 /*
  * I440FX chipset data sheet.
@@ -37,8 +38,32 @@
 
 typedef PCIHostState I440FXState;
 
+#define PIIX_NUM_PIC_IRQS   16  /* i8259 * 2 */
+#define PIIX_NUM_PIRQS  4   /* PIRQ[A-D] */
+#define PIIX_PIRQC  0x60
+
 typedef struct PIIX3State {
 PCIDevice dev;
+
+/*
+ * bitmap to track pic levels.
+ * The pic level is the logical OR of all the PCI irqs mapped to it
+ * So one PIC level is tracked by PIIX_NUM_PIRQS bits.
+ *
+ * PIRQ is mapped to PIC pins, we track it by
+ * PIIX_NUM_PIRQS * PIIX_NUM_PIC_IRQS = 64 bits with
+ * pic_irq * PIIX_NUM_PIRQS + pirq
+ */
+#define PIIX_PIC_LEVEL_MASK(pic_irq)\
+(((uint64_t)PIIX_NUM_PIRQS - 1) << ((pic_irq) * PIIX_NUM_PIRQS))
+#define PIIX_PIC_LEVEL_BIT(pic_irq, pirq)   \
+(1ULL << (((pic_irq) * PIIX_NUM_PIRQS) + (pirq)))
+
+#if PIIX_NUM_PIC_IRQS * PIIX_NUM_PIRQS > 64
+#error "unable to encode pic state in 64bit in pic_levels."
+#endif
+uint64_t pic_levels;
+
 int32_t dummy_for_save_load_compat[4];
 qemu_irq *pic;
 } PIIX3State;
@@ -252,25 +277,66 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int 
*piix3_devfn, qemu_irq *
 }
 
 /* PIIX3 PCI to ISA bridge */
+static void piix3_set_irq_pic(PIIX3State *piix3, int pic_irq)
+{
+qemu_set_irq(piix3->pic[pic_irq],
+ !!(piix3->pic_levels & PIIX_PIC_LEVEL_MASK(pic_irq)));
+}
+
+static int piix3_set_irq_level(PIIX3State *piix3, int irq_num, int level)
+{
+int pic_irq;
+uint64_t mask;
+
+pic_irq = piix3->dev.config[PIIX_PIRQC + irq_num];
+if (pic_irq >= PIIX_NUM_PIC_IRQS) {
+return -1;
+}
+
+mask = PIIX_PIC_LEVEL_BIT(pic_irq, irq_num);
+piix3->pic_levels &= ~mask;
+piix3->pic_levels |= mask * !!level;
+return pic_irq;
+}
 
 static void piix3_set_irq(void *opaque, int irq_num, int level)
 {
-int i, pic_irq, pic_level;
 PIIX3State *piix3 = opaque;
+int pic_irq;
+pic_irq = piix3_set_irq_level(piix3, irq_num, level);
+if (pic_irq >= 0) {
+piix3_set_irq_pic(piix3, pic_irq);
+}
+}
 
-/* now we change the pic irq level according to the piix irq mappings */
-/* XXX: optimize */
-pic_irq = piix3->dev.config[0x60 + irq_num];
-if (pic_irq < 16) {
-/* The pic level is the logical OR of all the PCI irqs mapped
-   to it */
-pic_level = 0;
-for (i = 0; i < 4; i++) {
-if (pic_irq == piix3->dev.config[0x60 + i]) {
-pic_level |= pci_bus_get_irq_level(piix3->dev.bus, i);
-}
+static void piix3_reset_irq_levels(PIIX3State *piix3)
+{
+piix3->pic_levels = 0;
+}
+
+/* irq routing is changed. so rebuild bitmap */
+static void piix3_rebuild_irq_levels(PIIX3State *piix3)
+{
+int pirq;
+
+piix3_reset_irq_levels(piix3);
+for (pirq = 0; pirq < PIIX_NUM_PIRQS; pirq++) {
+piix3_set_irq_level(piix3, pirq,
+pci_bus_get_irq_level(piix3->dev.bus, pirq));
+}
+}
+
+static void piix3_write_config(PCIDevice *dev,
+   uint32_t address, uint32_t val, int len)
+{
+pci_default_write_config(dev, address, val, len);
+if (ranges_overlap(address, len, PIIX_PIRQC, 4)) {
+PIIX3State *piix3 = DO_UPCAST(PIIX3State, dev, dev);
+int pic_irq;
+piix3_rebuild_irq_levels(piix3);
+for (pic_irq = 0; pic_irq < PIIX_NUM_PIC_IRQS; pic_irq++) {
+piix3_set_irq_pic(piix3, pic_irq);
 }
-qemu_set_irq(piix3->pic[pic_irq], pic_level);
 }
 }
 
@@ -310,6 +376,15 @@ static void piix3_reset(void *opaque)
 pci_conf[0xab] = 0x00;
 pci_conf[0xac] = 0x00;
 pci_conf[0xae] = 0x00;
+
+piix3_reset_irq_levels(d);
+}
+
+static int piix3_post_load(void *opaque, int version_id)
+{
+PIIX3State *piix3 = opaque;
+piix3_rebuild_irq_levels(piix3);
+return 0;
 }
 
 static void piix3_pre_save(void *opaque)
@@ -328,6 +403,7 @@ static const VMStateDescription vmstate_piix3 = {
 .version_id = 3,
 .minimum_version_id = 2,
 .minimum_version_id_old = 2,
+.post_load = piix3_post_load,
 .pre_save = piix3_pre_save,
 .fields  = (VMStateField []) {
 VMSTATE_PCI_DEVICE(dev, PIIX3State),
@@ -370,6 +446,7 @@ static PCIDeviceInfo i440fx_info[] = {
 .qdev.no_user = 1,
 .no_hotplug   = 1,
 .init = piix3_initfn,
+.config_write = piix3_write_config,
 },{
 /* end of list */
 }
-- 
1.7.1.1

Re: [Qemu-devel] [PATCH +STABLE-0.14] exit if -drive specified is invalid instead of ignoring the "wrong" -drive

2011-03-17 Thread Markus Armbruster

Michael Tokarev  writes:

> Trivial patch.  I've sent it yesterday but somehow it didn't
> reach the list.
>
> This fixes the problem when qemu continues even if -drive specification
> is somehow invalid, resulting in a mess.  Applicable for both current
> master and for stable-0.14 (and 0.13 and 0.12 as well).

Note patch doesn't apply to 0.12 and 0.13.

> The prob can actually be seriuos: when you start guest with two drives
> and make an error in the specification of one of them, and the guest
> has something like a raid array on the two drives, guest may start failing
> that array or kick "missing" drives which may result in a mess - this is
> what actually happened to me, I did't want a resync at all, and a resync
> resulted in re-writing (and allocating) a 4TB virtual drive I used for
> testing, which in turn resulted in my filesystem filling up and whole
> thing failing badly.  Yes it was just testing VM, I experimented with
> larger raid arrays, but the end result was quite, well, unexpected.
>
> Thanks!
>
> Signed-off-by: Michael Tokarev 
> ---
>  vl.c |3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/vl.c b/vl.c
> index 8bcf2ae..79f996e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2098,7 +2098,8 @@ int main(int argc, char **argv, char **envp)
>HD_OPTS);
>  break;
>  case QEMU_OPTION_drive:
> -drive_def(optarg);
> +if (drive_def(optarg) == NULL)
> +exit(1);
>   break;
>  case QEMU_OPTION_set:
>  if (qemu_set_option(optarg) != 0)

What about all the other unchecked drive_add() calls in main()?

Re: [Qemu-devel] Re: [PATCH 00/15] QAPI Round 1 (core code generator) (v2)

2011-03-17 Thread Kevin Wolf

Am 17.03.2011 14:28, schrieb Anthony Liguori:
> On 03/17/2011 08:15 AM, Kevin Wolf wrote:
>> Am 17.03.2011 13:46, schrieb Anthony Liguori:
>>> On 03/17/2011 07:21 AM, Kevin Wolf wrote:
>> Another detail is that, event extension is more important than command
>> extension, because it's probably going to happen. I think it would be 
>> very
>> bad to add new events just because we wanted to add a new field.
> The way this is typically handled is that signals tend to pass
> structures instead of lots of fields.  For instance, most of the GDK
> events just pass a structure for the event (like GdkButtonEvent).
 Can we do that with existing events or would we break the external
 interface because we'd have to nest everything one level deeper?
>>> We have to introduce new versions of existing events anyway so we can
>>> make sure to nest the structures appropriately.  I think BLOCK_IO_ERROR
>>> is the only one that isn't doing this today FWIW.
>> But then we must always send both events in order to maintain
>> compatibility, right? That sucks.
> 
> No, it's more complicated than that unfortunately.  The old events are 
> "broadcast events".  The new event/signal model requires explicit 
> registration.  There is a capabilities negotiation feature that let's us 
> disable broadcast events.
> 
> So from the wire perspective, a newer client will never see/care about 
> broadcast events.

Right, it can disable them (i.e. some events are registered by default
and there's a command/capability that unregisters all events). But
what's the problem if the client re-enables one of these events by
registering for it?

Adding new events means that we need to have code to generate (that's
what I meant when I said "send") both events for a single action.
Especially if we happen to do it again in the future, this is going to
get really ugly.

>> If I understand right, the problem with the current events isn't even on
>> the protocol level, which would be visible externally,
> 
> No, the problem with the old events is that they aren't 
> registered/maskable.  So even if you don't care about BLOCK_IO_ERROR, 
> you're getting the notification.  Plus, we'd like to add the ability to 
> add a tag to events when we register them.

What's the problem with registering them? If you want to stop client
from doing this you must introduce a special case for obsolete events
that cannot be registered. Do we gain anything from this?

> The other problem is that events are all global today.  BLOCK_IO_ERROR 
> is a good example of this.  It's really an error that's specific to a 
> block device and it passes the name of the block device that it's 
> specific to as an argument.  But if we have a masking mechanism it could 
> only globally enable/disable BLOCK_IO_ERROR for all devices.
> 
> It would be much nicer to be able to enable the event for specific block 
> devices.  This requires some protocol visible changes.  I'm still 
> writing up these changes but hope to have something for review soon.

I wonder if the old, more generic event couldn't be generated
automatically if the more specific signal is triggered in the block code.

>>   but just that it
>> doesn't map to the C interface in the way you like.
> 
> I think I've maybe been using "C interface" to much.  The current event 
> wire protocol doesn't map to any client interface well.

If you mean their broadcast style, that's not really related to nesting
or struct vs. argument.

Kevin

Re: [Qemu-devel] Re: [PATCH, RFC] virtio_blk: add cache control support

2011-03-17 Thread Christoph Hellwig

On Thu, Mar 17, 2011 at 03:36:08PM +1030, Rusty Russell wrote:
> > I'm happ to switch strcmp.
> 
> Of course, that's assuming buf is nul terminated.

It's the string the user writes into it, which normally should be
nul-terminated.

> > No, it's intentional.  config space writes can't return errors, so we need
> > to check that the value has really changed.  I'll add a comment explaining 
> > it.
> 
> OK, under what circumstances could it fail?
> 
> If you're using this mechanism to indicate that the host doesn't support
> the feature, that's making an assumption about the nature of config
> space writes which isn't true for non-PCI virtio.
> 
> ie. lguest and S/390 don't trap writes to config space.
> 
> Or perhaps they should?  But we should be explicit about needing it...

We have the features flag to indicate if updating the caching mode is
supported, but we we could still fail it for other reasons - e.g. we could fail
to reopen the file with/without O_SYNC.  But if lguest or S/390 don't support
trapping config space write this feature won't work for them at all.  As do
other features that make use of config space write, e.g. updating the MAC
address for virtio-net.

Re: [Qemu-devel] [PATCH +STABLE-0.14] exit if -drive specified is invalid instead of ignoring the "wrong" -drive

2011-03-17 Thread Michael Tokarev

17.03.2011 16:51, Markus Armbruster wrote:
> Michael Tokarev  writes:
> 
>> Trivial patch.  I've sent it yesterday but somehow it didn't
>> reach the list.
>>
>> This fixes the problem when qemu continues even if -drive specification
>> is somehow invalid, resulting in a mess.  Applicable for both current
>> master and for stable-0.14 (and 0.13 and 0.12 as well).
> 
> Note patch doesn't apply to 0.12 and 0.13.

Yes it doesn't, since to 0.14 the code changed.
What I mean is that the same problem exist in
earlier versions too.  I'll apply a change of
this effect to 0.12.5 as used in Debian now,
something like the one below.

[]
> What about all the other unchecked drive_add() calls in main()?

These are much less worrisome - they fail only of the
internal definitions of options are incorrect, which
should never happen.  For example:

case QEMU_OPTION_hdd:
drive_add(IF_DEFAULT, popt->index - QEMU_OPTION_hda, optarg,
  HD_OPTS);

There, optarg is just a filename, and HD_OPTS is
defined like this:

#define HD_OPTS "media=disk"

So it should not fail when parsing options, only
when trying to interpret and actually open the
filename, which happens much later in the game.

Thanks,

/mjt

For 0.12 and 0.13:

diff --git a/vl.c b/vl.c
index 77677e8..069a1df 100644
--- a/vl.c
+++ b/vl.c
@@ -5025,7 +5025,8 @@ int main(int argc, char **argv, char **envp)
 drive_add(optarg, HD_ALIAS, popt->index - QEMU_OPTION_hda);
 break;
 case QEMU_OPTION_drive:
-drive_add(NULL, "%s", optarg);
+if (drive_add(NULL, "%s", optarg) == NULL)
+exit(1);
break;
 case QEMU_OPTION_set:
 if (qemu_set_option(optarg) != 0)

[Qemu-devel] Re: Clock skew after pausing guests

2011-03-17 Thread Timur Safin

Have you checked with Windows Time service enabled in the guest?

net start w32time

Best Regards,
Timur


2011/3/17 Virtbie :
> Hello all
> I am seeing, perhaps unsurprisingly, a skewed system clock after pausing
> and then resuming a qemu-kvm guest. The guest continues with its earlier
> time.
> Since I'd like to use the pause technique for backups, which can take 1
> hour, this is significant.
>
> This is with windows guests, and it doesn't seem to recover by itself
> just by waiting.
>
> Linux guests instead seem to adjust their clock correctly after the resume.
>
> Is there any way to tell the windows guests to resynchronize its clock
> with the host?
> I am using virsh / libvirt.
>
> Thank you
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

[Qemu-devel] Clock skew after pausing guests

2011-03-17 Thread Virtbie

Hello all
I am seeing, perhaps unsurprisingly, a skewed system clock after pausing
and then resuming a qemu-kvm guest. The guest continues with its earlier
time.
Since I'd like to use the pause technique for backups, which can take 1
hour, this is significant.

This is with windows guests, and it doesn't seem to recover by itself
just by waiting.

Linux guests instead seem to adjust their clock correctly after the resume.

Is there any way to tell the windows guests to resynchronize its clock
with the host?
I am using virsh / libvirt.

Thank you

Re: [Qemu-devel] [PATCH 5/7] ccid: add ccid-card-emulated device

2011-03-17 Thread Jes Sorensen

On 03/17/11 11:54, Alon Levy wrote:
> On Mon, Mar 14, 2011 at 04:41:02PM +0100, Jes Sorensen wrote:
>>> +static const char *emul_event_to_string(uint32_t emul_event)
>>> +{
>>> +switch (emul_event) {
>>> +case EMUL_READER_INSERT: return "EMUL_READER_INSERT";
>>> +case EMUL_READER_REMOVE: return "EMUL_READER_REMOVE";
>>> +case EMUL_CARD_INSERT: return "EMUL_CARD_INSERT";
>>> +case EMUL_CARD_REMOVE: return "EMUL_CARD_REMOVE";
>>> +case EMUL_GUEST_APDU: return "EMUL_GUEST_APDU";
>>> +case EMUL_RESPONSE_APDU: return "EMUL_RESPONSE_APDU";
>>> +case EMUL_ERROR: return "EMUL_ERROR";
>>
>> YUCK!
> can we turn down the caps / disgust statements? I understand this
> is a personal affront to you somehow? can we settle this at dawn
> tommorrow?

LOL, there is nothing in coding style allowing this, even if there's a
couple of cases in the code still doing it.

Tomorrow is ok - will you be taking an overnight flight here? :)

>>
>> No multi statements on a single line!
>>
>>> +#define MAX_ATR_SIZE 40
>>> +struct EmulatedState {
>>> +CCIDCardState base;
>>> +uint8_t  debug;
>>> +char*backend_str;
>>> +uint32_t backend;
>>> +char*cert1;
>>> +char*cert2;
>>> +char*cert3;
>>> +char*db;
>>> +uint8_t  atr[MAX_ATR_SIZE];
>>> +uint8_t  atr_length;
>>> +QSIMPLEQ_HEAD(event_list, EmulEvent) event_list;
>>> +pthread_mutex_t event_list_mutex;
>>> +VReader *reader;
>>> +QSIMPLEQ_HEAD(guest_apdu_list, EmulEvent) guest_apdu_list;
>>> +pthread_mutex_t vreader_mutex; /* and guest_apdu_list mutex */
>>> +pthread_mutex_t handle_apdu_mutex;
>>> +pthread_cond_t handle_apdu_cond;
>>> +int  pipe[2];
>>> +int  quit_apdu_thread;
>>> +pthread_mutex_t apdu_thread_quit_mutex;
>>> +pthread_cond_t apdu_thread_quit_cond;
>>> +};
>>
>> Bad struct packing and wrong thread types.
> will use qemu-thread. that's what you mean by wrong thread types, right?
> s/pthread_thread_t/QemuThread/ etc. (Cond, Mutex)

Yep

Looks good!

Cheers,
Jes

Re: [Qemu-devel] [PATCH v3 4/4] hw/qxl-render: drop cursor locks, replace with pipe

2011-03-17 Thread Jes Sorensen

On 03/17/11 11:45, Alon Levy wrote:
> On Thu, Mar 17, 2011 at 11:29:03AM +0100, Jes Sorensen wrote:
>> On 03/17/11 11:27, Alon Levy wrote:
>>> On Thu, Mar 17, 2011 at 10:48:43AM +0100, Jes Sorensen wrote:
> Same for the asserts below, writes are from spice server thread, reads
> are in iothread.

 But shouldn't this make it try to reconnect? Even if the reconnect
 fails, it shouldn't kill the guest IMHO.
>>>
>>> reconnect? between two threads in the qemu process? why would the write
>>> fail to begin with? this is like saying if I'm failing a kvm ioctl I should
>>> just retry.
>>
>> Ah ok, I missed that part, somehow I had in my mind it was two different
>> processes, despite you mentioning threads.
>>
>> Still if gfx handling fails, it shouldn't nuke the guest.
> 
> ok, try to apply that logic to any other device - network, usb, etc., I don't
> think it holds.

Maybe I am looking at the wrong angle - I would think that is network or
usb breaks, we would still keep running, and for gfx the guest should be
able to keep running even if the monitor is disconnected.

It's not a big issue so if you feel it is fine as is, I won't object.

Cheers,
Jes

[Qemu-devel] Re: [PATCH 3/3] piix_pci: optimize set irq path

2011-03-17 Thread Michael S. Tsirkin

On Thu, Mar 17, 2011 at 10:49:53PM +0900, Isaku Yamahata wrote:
> optimize irq routing in piix_pic.c which has been a TODO.
> 
> Cc: Michael S. Tsirkin 
> Signed-off-by: Isaku Yamahata 

Some minor comments, and looks like load has a minor bug -
probably an old one as we didn't use to have a post load.

> ---
>  hw/piix_pci.c |  103 +---
>  1 files changed, 90 insertions(+), 13 deletions(-)
> 
> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> index 2d0ad9b..80ce205 100644
> --- a/hw/piix_pci.c
> +++ b/hw/piix_pci.c
> @@ -29,6 +29,7 @@
>  #include "isa.h"
>  #include "sysbus.h"
>  #include "range.h"
> +#include "bitops.h"

still needed?

>  
>  /*
>   * I440FX chipset data sheet.
> @@ -37,8 +38,32 @@
>  
>  typedef PCIHostState I440FXState;
>  
> +#define PIIX_NUM_PIC_IRQS   16  /* i8259 * 2 */
> +#define PIIX_NUM_PIRQS  4   /* PIRQ[A-D] */
> +#define PIIX_PIRQC  0x60
> +
>  typedef struct PIIX3State {
>  PCIDevice dev;
> +
> +/*
> + * bitmap to track pic levels.
> + * The pic level is the logical OR of all the PCI irqs mapped to it
> + * So one PIC level is tracked by PIIX_NUM_PIRQS bits.
> + *
> + * PIRQ is mapped to PIC pins, we track it by
> + * PIIX_NUM_PIRQS * PIIX_NUM_PIC_IRQS = 64 bits with
> + * pic_irq * PIIX_NUM_PIRQS + pirq
> + */
> +#define PIIX_PIC_LEVEL_MASK(pic_irq)\
> +(((uint64_t)PIIX_NUM_PIRQS - 1) << ((pic_irq) * PIIX_NUM_PIRQS))
> +#define PIIX_PIC_LEVEL_BIT(pic_irq, pirq)   \
> +(1ULL << (((pic_irq) * PIIX_NUM_PIRQS) + (pirq)))

I'll be happier with these open-coded: will be clearer
without all the () that macros need.
And we can make PIIX_NUM_PIRQS ULL so won't need a cast.
But not critical.

> +
> +#if PIIX_NUM_PIC_IRQS * PIIX_NUM_PIRQS > 64
> +#error "unable to encode pic state in 64bit in pic_levels."
> +#endif
> +uint64_t pic_levels;
> +
>  int32_t dummy_for_save_load_compat[4];
>  qemu_irq *pic;
>  } PIIX3State;
> @@ -252,25 +277,66 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int 
> *piix3_devfn, qemu_irq *
>  }
>  
>  /* PIIX3 PCI to ISA bridge */
> +static void piix3_set_irq_pic(PIIX3State *piix3, int pic_irq)
> +{
> +qemu_set_irq(piix3->pic[pic_irq],
> + !!(piix3->pic_levels & PIIX_PIC_LEVEL_MASK(pic_irq)));
> +}
> +
> +static int piix3_set_irq_level(PIIX3State *piix3, int irq_num, int level)
> +{
> +int pic_irq;
> +uint64_t mask;
> +
> +pic_irq = piix3->dev.config[PIIX_PIRQC + irq_num];
> +if (pic_irq >= PIIX_NUM_PIC_IRQS) {
> +return -1;
> +}
> +
> +mask = PIIX_PIC_LEVEL_BIT(pic_irq, irq_num);
> +piix3->pic_levels &= ~mask;
> +piix3->pic_levels |= mask * !!level;
> +return pic_irq;
> +}
>  
>  static void piix3_set_irq(void *opaque, int irq_num, int level)

I find the split between piix3_set_irq_level
and piix3_set_irq_level unintuitive.
It seems that we can just always call piix3_set_irq
and then piix3_set_irq_level can be inlined:
return instead of pic_irq >= 0 below.

>  {
> -int i, pic_irq, pic_level;
>  PIIX3State *piix3 = opaque;
> +int pic_irq;
> +pic_irq = piix3_set_irq_level(piix3, irq_num, level);
> +if (pic_irq >= 0) {
> +piix3_set_irq_pic(piix3, pic_irq);
> +}
> +}
>  
> -/* now we change the pic irq level according to the piix irq mappings */
> -/* XXX: optimize */
> -pic_irq = piix3->dev.config[0x60 + irq_num];
> -if (pic_irq < 16) {
> -/* The pic level is the logical OR of all the PCI irqs mapped
> -   to it */
> -pic_level = 0;
> -for (i = 0; i < 4; i++) {
> -if (pic_irq == piix3->dev.config[0x60 + i]) {
> -pic_level |= pci_bus_get_irq_level(piix3->dev.bus, i);
> -}
> +static void piix3_reset_irq_levels(PIIX3State *piix3)
> +{
> +piix3->pic_levels = 0;
> +}
> +

Going overboard on the abstraction front IMO.

> +/* irq routing is changed. so rebuild bitmap */
> +static void piix3_rebuild_irq_levels(PIIX3State *piix3)

I iked _update_ better than rebuild, but not critical.

> +{
> +int pirq;
> +
> +piix3_reset_irq_levels(piix3);
> +for (pirq = 0; pirq < PIIX_NUM_PIRQS; pirq++) {
> +piix3_set_irq_level(piix3, pirq,
> +pci_bus_get_irq_level(piix3->dev.bus, pirq));
> +}
> +}
> +
> +static void piix3_write_config(PCIDevice *dev,
> +   uint32_t address, uint32_t val, int len)
> +{
> +pci_default_write_config(dev, address, val, len);
> +if (ranges_overlap(address, len, PIIX_PIRQC, 4)) {
> +PIIX3State *piix3 = DO_UPCAST(PIIX3State, dev, dev);
> +int pic_irq;
> +piix3_rebuild_irq_levels(piix3);
> +for (pic_irq = 0; pic_irq < PIIX_NUM_PIC_IRQS; pic_irq++) {
> +piix3_set_irq_pic(piix3, pic_irq);
>  }
> -qemu_set_irq(piix3->pic[pic_irq], pic_level);
>  }
>

[Qemu-devel] Re: [PATCH 0/3] piix_pci: optimize irq data path

2011-03-17 Thread Michael S. Tsirkin

On Thu, Mar 17, 2011 at 10:49:50PM +0900, Isaku Yamahata wrote:
> This patch series optimizes irq data path of piix_pci.
> So far piix3 tracks each pirq level and checks whether a given pic pins is
> asserted by seeing if each pirq is mapped into the pic pin.
> This is independent on irq routing, but data path is on slow path.
> 
> Given that irq routing is rarely changed and asserting pic pins is on
> data path, the path that asserts pic pins should be optimized and
> chainging irq routing should be on slow path.
> The new behavior with this patch series is to use bitmap which is addressed
> by pirq and pic pins with a given irq routing.
> When pirq is asserted, the bitmap is set and see if the pic pins is
> asserted by checking the bitmaps.
> When irq routing is changed, rebuild the bitmap and re-assert pic pins.
> 
> Isaku Yamahata (3):
>   pci: add accessor function to get irq levels
>   piix_pci: eliminate PIIX3State::pci_irq_levels
>   piix_pci: optimize set irq path
> 
>  hw/pci.c  |7 +++
>  hw/pci.h  |1 +
>  hw/piix_pci.c |  126 
> -
>  3 files changed, 115 insertions(+), 19 deletions(-)

Thanks, I'll apply 1-2, 3 can be improved a bit but nothing major so
if you are busy I can apply as is and we can add fixups on top.
Let me know.

-- 
MST

[Qemu-devel] Re: [PATCH 03/26] pci: introduce pci_swizzle_map_irq_fn() for standardized interrupt pin swizzle

2011-03-17 Thread Michael S. Tsirkin

On Wed, Mar 16, 2011 at 06:29:14PM +0900, Isaku Yamahata wrote:
> introduce pci_swizzle_map_irq_fn() for interrupt pin swizzle which is
> standardized. PCI bridge swizzle is common logic, by introducing
> this function duplicated swizzle logic will be avoided later.
> 
> Cc: Michael S. Tsirkin 
> Signed-off-by: Isaku Yamahata 

As long as it's used on data path, better to inline it?

> ---
>  hw/pci.c |   18 ++
>  hw/pci.h |2 ++
>  2 files changed, 20 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/pci.c b/hw/pci.c
> index 5349488..d6c5e66 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -1117,6 +1117,24 @@ static void pci_set_irq(void *opaque, int irq_num, int 
> level)
>  pci_change_irq_level(pci_dev, irq_num, change);
>  }
>  
> +/*
> + * PCI-to-PCI bridge specification
> + * 9.1: Interrupt routing. Table 9-1
> + *
> + * the PCI Express Base Specification, Revision 2.1
> + * 2.2.8.1: INTx interrutp signaling - Rules
> + *  the Implementation Note
> + *  Table 2-20
> + */
> +/*
> + * 0 <= pin <= 3 0 = INTA, 1 = INTB, 2 = INTC, 3 = INTD
> + * 0-origin unlike PCI interrupt pin register.
> + */
> +int pci_swizzle_map_irq_fn(void *opaque, PCIDevice *pci_dev, int pin)
> +{
> +return (pin + PCI_SLOT(pci_dev->devfn)) % PCI_NUM_PINS;
> +}
> +
>  /***/
>  /* monitor info on PCI */
>  
> diff --git a/hw/pci.h b/hw/pci.h
> index 1a08139..46b3ad3 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -235,6 +235,8 @@ PCIBus *pci_bus_new(DeviceState *parent, const char 
> *name, int devfn_min);
>  void pci_bus_irqs(PCIBus *bus, pci_set_irq_fn set_irq, pci_map_irq_fn 
> map_irq,
>void *irq_opaque, int nirq);
>  void pci_bus_hotplug(PCIBus *bus, pci_hotplug_fn hotplug, DeviceState *dev);
> +/* 0 <= pin <= 3 0 = INTA, 1 = INTB, 2 = INTC, 3 = INTD */
> +int pci_swizzle_map_irq_fn(void *opaque, PCIDevice *pci_dev, int pin);
>  PCIBus *pci_register_bus(DeviceState *parent, const char *name,
>   pci_set_irq_fn set_irq, pci_map_irq_fn map_irq,
>   void *irq_opaque, int devfn_min, int nirq);
> -- 
> 1.7.1.1

Re: [Qemu-devel] [PATCH 2/4] block: add a helper to change writeback mode on the fly

2011-03-17 Thread Daniel P. Berrange

On Tue, Mar 15, 2011 at 03:11:32PM +0100, Christoph Hellwig wrote:
> Add a new bdrv_change_cache that can set/clear the writeback flag
> at runtime by stopping all I/O and closing/reopening the image file.
> 
> All code is based on a patch from Prerna Saxena 
> with minimal refactoring.
> 
> Signed-off-by: Christoph Hellwig 
> 
>  
> +static int bdrv_reopen(BlockDriverState *bs, int bdrv_flags)
> +{
> +BlockDriver *drv = bs->drv;
> +int ret;
> +
> +if (bdrv_flags == bs->open_flags) {
> +return 0;
> +}
> +
> +/* Quiesce IO for the given block device */
> +qemu_aio_flush();
> +bdrv_flush(bs);
> +
> +bdrv_close(bs);
> +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv);
> +
> +/*
> + * A failed attempt to reopen the image file must lead to 'abort()'
> + */
> +if (ret != 0) {
> +abort();
> +}
> +
> +return ret;
> +}

> +
> +int bdrv_change_cache(BlockDriverState *bs, bool enable)
> +{
> +int bdrv_flags = 0;
> +
> +bdrv_flags = bs->open_flags & ~BDRV_O_CACHE_WB;
> +if (enable) {
> +bdrv_flags |= BDRV_O_CACHE_WB;
> +}
> +
> +return bdrv_reopen(bs, bdrv_flags);
> +}

Is there any way we can manage todo this *without* closing and
re-opening the file descriptor ?

One of the things we're considering for the future is to enable
QEMU to be passed open FD(s) for its drives, from the management
system, instead of having QEMU open the files itself. This will
make it possible to have strong, per-VM selinux isolation for
disks stored on NFS. It will also make it easier to let us run
QEMU inside a private filesystem namespace (CLONE_NEWNS) and not
have to expose any files in that namespace except the QEMU binary
& a few device nodes (particularly useful for hotplug since we
won't know what files need to be visible inside the private namespace
ahead of time).

If QEMU expects to close+reopen any of its disks at any time the
guest requests, this will complicate life somewhat

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

[Qemu-devel] [PATCH v21 09/11] libcacard: add docs

2011-03-17 Thread Alon Levy

From: Robert Relyea 

---
 docs/libcacard.txt |  483 
 1 files changed, 483 insertions(+), 0 deletions(-)
 create mode 100644 docs/libcacard.txt

diff --git a/docs/libcacard.txt b/docs/libcacard.txt
new file mode 100644
index 000..5dee6fa
--- /dev/null
+++ b/docs/libcacard.txt
@@ -0,0 +1,483 @@
+This file documents the CAC (Common Access Card) library in the libcacard
+subdirectory.
+
+Virtual Smart Card Emulator
+
+This emulator is designed to provide emulation of actual smart cards to a
+virtual card reader running in a guest virtual machine. The emulated smart
+cards can be representations of real smart cards, where the necessary functions
+such as signing, card removal/insertion, etc. are mapped to real, physical
+cards which are shared with the client machine the emulator is running on, or
+the cards could be pure software constructs.
+
+The emulator is structured to allow multiple replacable or additional pieces,
+so it can be easily modified for future requirements. The primary envisioned
+modifications are:
+
+1) The socket connection to the virtual card reader (presumably a CCID reader,
+but other ISO-7816 compatible readers could be used). The code that handles
+this is in vscclient.c.
+
+2) The virtual card low level emulation. This is currently supplied by using
+NSS. This emulation could be replaced by implementations based on other
+security libraries, including but not limitted to openssl+pkcs#11 library,
+raw pkcs#11, Microsoft CAPI, direct opensc calls, etc. The code that handles
+this is in vcard_emul_nss.c.
+
+3) Emulation for new types of cards. The current implementation emulates the
+original DoD CAC standard with separate pki containers. This emulator lives in
+cac.c. More than one card type emulator could be included. Other cards could
+be emulated as well, including PIV, newer versions of CAC, PKCS #15, etc.
+
+
+Replacing the Socket Based Virtual Reader Interface.
+
+The current implementation contains a replacable module vscclient.c. The
+current vscclient.c implements a sockets interface to the virtual ccid reader
+on the guest. CCID commands that are pertinent to emulation are passed
+across the socket, and their responses are passed back along that same socket.
+The protocol that vscclient uses is defined in vscard_common.h and connects
+to a qemu ccid usb device. Since this socket runs as a client, vscclient.c
+implements a program with a main entry. It also handles argument parsing for
+the emulator.
+
+An application that wants to use the virtual reader can replace vscclient.c
+with it's own implementation that connects to it's own CCID reader.  The calls
+that the CCID reader can call are:
+
+  VReaderList * vreader_get_reader_list();
+
+  This function returns a list of virtual readers.  These readers may map to
+  physical devices, or simulated devices depending on vcard the back end. Each
+  reader in the list should represent a reader to the virtual machine. Virtual
+  USB address mapping is left to the CCID reader front end. This call can be
+  made any time to get an updated list. The returned list is a copy of the
+  internal list that can be referenced by the caller without locking. This copy
+  must be freed by the caller with vreader_list_delete when it is no longer
+  needed.
+
+  VReaderListEntry *vreader_list_get_first(VReaderList *);
+
+  This function gets the first entry on the reader list. Along with
+  vreader_list_get_next(), vreader_list_get_first() can be used to walk the
+  reader list returned from vreader_get_reader_list(). VReaderListEntries are
+  part of the list themselves and do not need to be freed separately from the
+  list. If there are no entries on the list, it will return NULL.
+
+  VReaderListEntry *vreader_list_get_next(VReaderListEntry *);
+
+  This function gets the next entry in the list. If there are no more entries
+  it will return NULL.
+
+  VReader * vreader_list_get_reader(VReaderListEntry *)
+
+  This function returns the reader stored in the reader List entry. Caller gets
+  a new reference to a reader. The caller must free it's reference when it is
+  finished with vreader_free().
+
+  void vreader_free(VReader *reader);
+
+   This function frees a reference to a reader. Reader's are reference counted
+   and are automatically deleted when the last reference is freed.
+
+  void vreader_list_delete(VReaderList *list);
+
+   This function frees the list, all the elements on the list, and all the
+   reader references held by the list.
+
+  VReaderStatus vreader_power_on(VReader *reader, char *atr, int *len);
+
+  This functions simulates a card power on. Virtual cards do not care about
+  the actual voltage and other physical parameters, but it does care that the
+  card is actually on or off. Cycling the card causes the card to reset. If
+  the caller provides enough space, vreader_power_on will return the ATR of
+  the vi

[Qemu-devel] [PATCH v21 05/11] ccid: add passthru card device

2011-03-17 Thread Alon Levy

The passthru ccid card is a device sitting on the usb-ccid bus and
using a chardevice to communicate with a remote device using the
VSCard protocol defined in libcacard/vscard_common.h

Usage docs available in following patch in docs/ccid.txt

Signed-off-by: Alon Levy 

---

Changes from v20->v21: (Jes Sorenson review)
 * add reference to COPYING in header
 * long comment reformatting

Changes from v19->v20:
 * checkpatch.pl

Changes from v18->v19:
 * add qdev.desc
 * remove .qdev.unplug (no hot unplug support for ccid bus)

Changes from v16->v17:
 * fix wrong cast when receiving VSC_Error
 * ccid-card-passthru: force chardev user wakeup by sending Init
   see lengthy comment below.

Changes from v15->v16:

Behavioral changes:
 * return correct size
 * return error instead of assert if client sent too large ATR
 * don't assert if client sent too large a size, but add asserts for indices to 
buffer
 * reset vscard_in indices on chardev disconnect
 * handle init from client
 * error if no chardev supplied
 * use ntoh, hton
 * eradicate reader_id_t
 * remove Reconnect usage (removed from VSCARD protocol)
 * send VSC_SUCCESS on card insert/remove and reader add/remove

Style fixes:
 * width of line fix
 * update copyright
 * remove old TODO's
 * update file header comment
 * use macros for debug levels
 * c++ style comment replacement
 * update copyright license
 * fix ATR size comment
 * fix whitespace in struct def
 * fix DPRINTF prefix
 * line width fix

ccid-card-passthru: force chardev user wakeup by sending Init

The problem: how to wakeup the user of the smartcard when the smartcard
device is initialized?

Long term solution: have a callback interface. This was done via
the deprecated so called chardev ioctl interface.

Short term solution: do a write. Specifically we write an Init message.
And we change the client to send it's own Init message regardless of
receiving this one. Additional Init messages will be regarded as
acceptable, the first one received after connection establishment is
the determining one wrt capabilities.
---
 Makefile.objs   |2 +-
 hw/ccid-card-passthru.c |  341 +++
 2 files changed, 342 insertions(+), 1 deletions(-)
 create mode 100644 hw/ccid-card-passthru.c

diff --git a/Makefile.objs b/Makefile.objs
index 6c5cfb6..4a4bf0a 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -199,7 +199,7 @@ hw-obj-$(CONFIG_APM) += pm_smbus.o apm.o
 hw-obj-$(CONFIG_DMA) += dma.o
 hw-obj-$(CONFIG_HPET) += hpet.o
 hw-obj-$(CONFIG_APPLESMC) += applesmc.o
-hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o
+hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o ccid-card-passthru.o
 
 # PPC devices
 hw-obj-$(CONFIG_OPENPIC) += openpic.o
diff --git a/hw/ccid-card-passthru.c b/hw/ccid-card-passthru.c
new file mode 100644
index 000..76abfb1
--- /dev/null
+++ b/hw/ccid-card-passthru.c
@@ -0,0 +1,341 @@
+/*
+ * CCID Passthru Card Device emulation
+ *
+ * Copyright (c) 2011 Red Hat.
+ * Written by Alon Levy.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.1 or later.
+ * This code is licenced under the GNU LGPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include 
+
+#include "qemu-char.h"
+#include "monitor.h"
+#include "hw/ccid.h"
+#include "libcacard/vscard_common.h"
+
+#define DPRINTF(card, lvl, fmt, ...)\
+do {\
+if (lvl <= card->debug) {   \
+printf("ccid-card-passthru: " fmt , ## __VA_ARGS__); \
+}   \
+} while (0)
+
+#define D_WARN 1
+#define D_INFO 2
+#define D_MORE_INFO 3
+#define D_VERBOSE 4
+
+/* TODO: do we still need this? */
+uint8_t DEFAULT_ATR[] = {
+/*
+ * From some example somewhere
+ * 0x3B, 0xB0, 0x18, 0x00, 0xD1, 0x81, 0x05, 0xB1, 0x40, 0x38, 0x1F, 0x03, 0x28
+ */
+
+/* From an Athena smart card */
+ 0x3B, 0xD5, 0x18, 0xFF, 0x80, 0x91, 0xFE, 0x1F, 0xC3, 0x80, 0x73, 0xC8, 0x21,
+ 0x13, 0x08
+};
+
+
+#define PASSTHRU_DEV_NAME "ccid-card-passthru"
+#define VSCARD_IN_SIZE 65536
+
+/* maximum size of ATR - from 7816-3 */
+#define MAX_ATR_SIZE40
+
+typedef struct PassthruState PassthruState;
+
+struct PassthruState {
+CCIDCardState base;
+CharDriverState *cs;
+uint8_t  vscard_in_data[VSCARD_IN_SIZE];
+uint32_t vscard_in_pos;
+uint32_t vscard_in_hdr;
+uint8_t  atr[MAX_ATR_SIZE];
+uint8_t  atr_length;
+uint8_t  debug;
+};
+
+/*
+ * VSCard protocol over chardev
+ * This code should not depend on the card type.
+ */
+
+static void ccid_card_vscard_send_msg(PassthruState *s,
+VSCMsgType type, uint32_t reader_id,
+const uint8_t *payload, uint32_t length)
+{
+VSCMsgHeader scr_msg_header;
+
+scr_msg_header.type = htonl(type);
+scr_msg_header.reader_id = htonl(reader_id);
+scr_msg_header.length = htonl(length);
+qemu_chr_write(s->cs, (uint8_t *)&scr_msg_header, sizeof(

[Qemu-devel] [PATCH v21 01/11] trace: move trace objects from Makefile to Makefile.objs

2011-03-17 Thread Alon Levy

---
 Makefile  |   32 
 Makefile.objs |   32 
 2 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/Makefile b/Makefile
index eca4c76..5b35b9f 100644
--- a/Makefile
+++ b/Makefile
@@ -112,38 +112,6 @@ ui/vnc.o: QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
 
 bt-host.o: QEMU_CFLAGS += $(BLUEZ_CFLAGS)
 
-ifeq ($(TRACE_BACKEND),dtrace)
-trace.h: trace.h-timestamp trace-dtrace.h
-else
-trace.h: trace.h-timestamp
-endif
-trace.h-timestamp: $(SRC_PATH)/trace-events config-host.mak
-   $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
--$(TRACE_BACKEND) -h < $< > $@,"  GEN   trace.h")
-   @cmp -s $@ trace.h || cp $@ trace.h
-
-trace.c: trace.c-timestamp
-trace.c-timestamp: $(SRC_PATH)/trace-events config-host.mak
-   $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
--$(TRACE_BACKEND) -c < $< > $@,"  GEN   trace.c")
-   @cmp -s $@ trace.c || cp $@ trace.c
-
-trace.o: trace.c $(GENERATED_HEADERS)
-
-trace-dtrace.h: trace-dtrace.dtrace
-   $(call quiet-command,dtrace -o $@ -h -s $<, "  GEN   trace-dtrace.h")
-
-# Normal practice is to name DTrace probe file with a '.d' extension
-# but that gets picked up by QEMU's Makefile as an external dependancy
-# rule file. So we use '.dtrace' instead
-trace-dtrace.dtrace: trace-dtrace.dtrace-timestamp
-trace-dtrace.dtrace-timestamp: $(SRC_PATH)/trace-events config-host.mak
-   $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
--$(TRACE_BACKEND) -d < $< > $@,"  GEN   trace-dtrace.dtrace")
-   @cmp -s $@ trace-dtrace.dtrace || cp $@ trace-dtrace.dtrace
-
-trace-dtrace.o: trace-dtrace.dtrace $(GENERATED_HEADERS)
-   $(call quiet-command,dtrace -o $@ -G -s $<, "  GEN trace-dtrace.o")
-
-simpletrace.o: simpletrace.c $(GENERATED_HEADERS)
-
 version.o: $(SRC_PATH)/version.rc config-host.mak
$(call quiet-command,$(WINDRES) -I. -o $@ $<,"  RC$(TARGET_DIR)$@")
 
diff --git a/Makefile.objs b/Makefile.objs
index a52f42f..be4c4d9 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -309,6 +309,38 @@ libdis-$(CONFIG_SPARC_DIS) += sparc-dis.o
 # trace
 
 ifeq ($(TRACE_BACKEND),dtrace)
+trace.h: trace.h-timestamp trace-dtrace.h
+else
+trace.h: trace.h-timestamp
+endif
+trace.h-timestamp: $(SRC_PATH)/trace-events config-host.mak
+   $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
--$(TRACE_BACKEND) -h < $< > $@,"  GEN   trace.h")
+   @cmp -s $@ trace.h || cp $@ trace.h
+
+trace.c: trace.c-timestamp
+trace.c-timestamp: $(SRC_PATH)/trace-events config-host.mak
+   $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
--$(TRACE_BACKEND) -c < $< > $@,"  GEN   trace.c")
+   @cmp -s $@ trace.c || cp $@ trace.c
+
+trace.o: trace.c $(GENERATED_HEADERS)
+
+trace-dtrace.h: trace-dtrace.dtrace
+   $(call quiet-command,dtrace -o $@ -h -s $<, "  GEN   trace-dtrace.h")
+
+# Normal practice is to name DTrace probe file with a '.d' extension
+# but that gets picked up by QEMU's Makefile as an external dependancy
+# rule file. So we use '.dtrace' instead
+trace-dtrace.dtrace: trace-dtrace.dtrace-timestamp
+trace-dtrace.dtrace-timestamp: $(SRC_PATH)/trace-events config-host.mak
+   $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
--$(TRACE_BACKEND) -d < $< > $@,"  GEN   trace-dtrace.dtrace")
+   @cmp -s $@ trace-dtrace.dtrace || cp $@ trace-dtrace.dtrace
+
+trace-dtrace.o: trace-dtrace.dtrace $(GENERATED_HEADERS)
+   $(call quiet-command,dtrace -o $@ -G -s $<, "  GEN trace-dtrace.o")
+
+simpletrace.o: simpletrace.c $(GENERATED_HEADERS)
+
+ifeq ($(TRACE_BACKEND),dtrace)
 trace-obj-y = trace-dtrace.o
 else
 trace-obj-y = trace.o
-- 
1.7.4.1

[Qemu-devel] [PATCH v21 04/11] introduce libcacard/vscard_common.h

2011-03-17 Thread Alon Levy

---

Signed-off-by: Alon Levy 

v20->v21 changes: (Jes Sorenson review)
 * license set to 2+
 * long comment fixes, remove empty line at eof.
 * add reference to COPYING

v19->v20 changes:
 * checkpatch.pl

v15->v16 changes:

Protocol change:
 * VSCMsgInit capabilities and magic
 * removed ReaderResponse, will use Error instead with code==VSC_SUCCESS.
 * adaded Flush and FlushComplete, remove Reconnect.
 * define VSCARD_MAGIC
 * added error code VSC_SUCCESS.

Fixes:
 * update VSCMsgInit comment
 * fix message type enum
 * remove underscore from wrapping define
 * update copyright
 * updated comments.
 * Header comment updated
 * remove C++ style comment
 * fix comment for VSCMsgError
 * give names to enums in typedefs
---
 libcacard/vscard_common.h |  178 +
 1 files changed, 178 insertions(+), 0 deletions(-)
 create mode 100644 libcacard/vscard_common.h

diff --git a/libcacard/vscard_common.h b/libcacard/vscard_common.h
new file mode 100644
index 000..bebd52d
--- /dev/null
+++ b/libcacard/vscard_common.h
@@ -0,0 +1,178 @@
+/* Virtual Smart Card protocol definition
+ *
+ * This protocol is between a host using virtual smart card readers,
+ * and a client providing the smart cards, perhaps by emulating them or by
+ * access to real cards.
+ *
+ * Definitions for this protocol:
+ *  Host   - user of the card
+ *  Client - owner of the card
+ *
+ * The current implementation passes the raw APDU's from 7816 and additionally
+ * contains messages to setup and teardown readers, handle insertion and
+ * removal of cards, negotiate the protocol via capabilities and provide
+ * for error responses.
+ *
+ * Copyright (c) 2011 Red Hat.
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+
+#ifndef VSCARD_COMMON_H
+#define VSCARD_COMMON_H
+
+#include 
+
+#define VERSION_MAJOR_BITS 11
+#define VERSION_MIDDLE_BITS 11
+#define VERSION_MINOR_BITS 10
+
+#define MAKE_VERSION(major, middle, minor) \
+ ((major  << (VERSION_MINOR_BITS + VERSION_MIDDLE_BITS)) \
+  | (middle <<  VERSION_MINOR_BITS) \
+  | (minor))
+
+/*
+ * IMPORTANT NOTE on VERSION
+ *
+ * The version below MUST be changed whenever a change in this file is made.
+ *
+ * The last digit, the minor, is for bug fix changes only.
+ *
+ * The middle digit is for backward / forward compatible changes, updates
+ * to the existing messages, addition of fields.
+ *
+ * The major digit is for a breaking change of protocol, presumably
+ * something that cannot be accomodated with the existing protocol.
+ */
+
+#define VSCARD_VERSION MAKE_VERSION(0, 0, 2)
+
+typedef enum VSCMsgType {
+VSC_Init = 1,
+VSC_Error,
+VSC_ReaderAdd,
+VSC_ReaderRemove,
+VSC_ATR,
+VSC_CardRemove,
+VSC_APDU,
+VSC_Flush,
+VSC_FlushComplete
+} VSCMsgType;
+
+typedef enum VSCErrorCode {
+VSC_SUCCESS = 0,
+VSC_GENERAL_ERROR = 1,
+VSC_CANNOT_ADD_MORE_READERS,
+VSC_CARD_ALREAY_INSERTED,
+} VSCErrorCode;
+
+#define VSCARD_UNDEFINED_READER_ID  0x
+#define VSCARD_MINIMAL_READER_ID0
+
+#define VSCARD_MAGIC (*(uint32_t *)"VSCD")
+
+/*
+ * Header
+ * Each message starts with the header.
+ * type - message type
+ * reader_id - used by messages that are reader specific
+ * length - length of payload (not including header, i.e. zero for
+ *  messages containing empty payloads)
+ */
+typedef struct VSCMsgHeader {
+uint32_t   type;
+uint32_t   reader_id;
+uint32_t   length;
+uint8_tdata[0];
+} VSCMsgHeader;
+
+/*
+ * VSCMsgInit   Client <-> Host
+ * Client sends it on connection, with its own capabilities.
+ * Host replies with VSCMsgInit filling in its capabilities.
+ *
+ * It is not meant to be used for negotiation, i.e. sending more then
+ * once from any side, but could be used for that in the future.
+ */
+typedef struct VSCMsgInit {
+uint32_t   magic;
+uint32_t   version;
+uint32_t   capabilities[1]; /* receiver must check length,
+   array may grow in the future*/
+} VSCMsgInit;
+
+/*
+ * VSCMsgError  Client <-> Host
+ * This message is a response to any of:
+ *  Reader Add
+ *  Reader Remove
+ *  Card Remove
+ * If the operation was successful then VSC_SUCCESS
+ * is returned, other wise a specific error code.
+ */
+typedef struct VSCMsgError {
+uint32_t   code;
+} VSCMsgError;
+
+/*
+ * VSCMsgReaderAdd  Client -> Host
+ * Host replies with allocated reader id in VSCMsgError with code==SUCCESS.
+ *
+ * name - name of the reader on client side, UTF-8 encoded. Only used
+ *  for client presentation (may be translated to the device presented to the
+ *  guest), protocol wise only reader_id is important.
+ */
+typedef struct VSCMsgReaderAdd {
+uint8_tname[0];
+} VSCMsgReaderAdd;
+
+/*
+ * VSCMsgReaderRemove   Client -> Host
+ * The client's reader has been removed.
+ */
+typedef struct VSC

[Qemu-devel] [PATCH v21 07/11] libcacard: add vscclient

2011-03-17 Thread Alon Levy

From: Robert Relyea 

client to talk to ccid-card-passthru and use smartcard on client to
perform actual operations.
---
 libcacard/Makefile|7 +-
 libcacard/vscclient.c |  730 +
 2 files changed, 736 insertions(+), 1 deletions(-)
 create mode 100644 libcacard/vscclient.c

diff --git a/libcacard/Makefile b/libcacard/Makefile
index d780788..7a335bc 100644
--- a/libcacard/Makefile
+++ b/libcacard/Makefile
@@ -12,6 +12,11 @@ endif
 
 QEMU_OBJS=$(QEMU_THREAD) $(oslib-obj-y) $(trace-obj-y) qemu-malloc.o
 
+vscclient: $(libcacard-y) $(QEMU_OBJS) vscclient.o
+   $(call quiet-command,$(CC) $(libcacard_libs) -lrt -o $@ $^,"  LINK  
$(TARGET_DIR)$@")
+
+all: vscclient
+
 clean:
-   rm -f *.o */*.o *.d */*.d *.a */*.a *~ */*~
+   rm -f *.o */*.o *.d */*.d *.a */*.a *~ */*~ vscclient
 
diff --git a/libcacard/vscclient.c b/libcacard/vscclient.c
new file mode 100644
index 000..8dde449
--- /dev/null
+++ b/libcacard/vscclient.c
@@ -0,0 +1,730 @@
+/*
+ * Tester for VSCARD protocol, client side.
+ *
+ * Can be used with ccid-card-passthru.
+ *
+ * Copyright (c) 2011 Red Hat.
+ * Written by Alon Levy.
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "qemu-thread.h"
+#include "qemu-common.h"
+
+#include "vscard_common.h"
+
+#include "vreader.h"
+#include "vcard_emul.h"
+#include "vevent.h"
+
+int verbose;
+
+int sock;
+
+static void
+print_byte_array(
+uint8_t *arrBytes,
+unsigned int nSize
+) {
+int i;
+for (i = 0; i < nSize; i++) {
+printf("%02X ", arrBytes[i]);
+}
+printf("\n");
+}
+
+static void
+print_usage(void) {
+printf("vscclient [-c  .. -e  -d %s] "
+" \n",
+#ifdef USE_PASSTHRU
+" -p");
+printf(" -p use passthrough mode\n");
+#else
+   "");
+#endif
+vcard_emul_usage();
+}
+
+static QemuMutex write_lock;
+
+static int
+send_msg(
+VSCMsgType type,
+uint32_t reader_id,
+const void *msg,
+unsigned int length
+) {
+int rv;
+VSCMsgHeader mhHeader;
+
+qemu_mutex_lock(&write_lock);
+
+if (verbose > 10) {
+printf("sending type=%d id=%d, len =%d (0x%x)\n",
+   type, reader_id, length, length);
+}
+
+mhHeader.type = htonl(type);
+mhHeader.reader_id = 0;
+mhHeader.length = htonl(length);
+rv = write(
+sock,
+&mhHeader,
+sizeof(mhHeader)
+);
+if (rv < 0) {
+/* Error */
+printf("write header error\n");
+close(sock);
+qemu_mutex_unlock(&write_lock);
+return 16;
+}
+rv = write(
+sock,
+msg,
+length
+);
+if (rv < 0) {
+/* Error */
+printf("write error\n");
+close(sock);
+qemu_mutex_unlock(&write_lock);
+return 16;
+}
+qemu_mutex_unlock(&write_lock);
+
+return 0;
+}
+
+static VReader *pending_reader;
+static QemuMutex pending_reader_lock;
+static QemuCond pending_reader_condition;
+
+#define MAX_ATR_LEN 40
+static void *
+event_thread(void *arg)
+{
+unsigned char atr[MAX_ATR_LEN];
+int atr_len = MAX_ATR_LEN;
+VEvent *event = NULL;
+unsigned int reader_id;
+
+
+while (1) {
+const char *reader_name;
+
+event = vevent_wait_next_vevent();
+if (event == NULL) {
+break;
+}
+reader_id = vreader_get_id(event->reader);
+if (reader_id == VSCARD_UNDEFINED_READER_ID &&
+event->type != VEVENT_READER_INSERT) {
+/* ignore events from readers qemu has rejected */
+/* if qemu is still deciding on this reader, wait to see if need to
+ * forward this event */
+qemu_mutex_lock(&pending_reader_lock);
+if (!pending_reader || (pending_reader != event->reader)) {
+/* wasn't for a pending reader, this reader has already been
+ * rejected by qemu */
+qemu_mutex_unlock(&pending_reader_lock);
+vevent_delete(event);
+continue;
+}
+/* this reader hasn't been told it's status from qemu yet, wait for
+ * that status */
+while (pending_reader != NULL) {
+qemu_cond_wait(&pending_reader_condition, 
&pending_reader_lock);
+}
+qemu_mutex_unlock(&pending_reader_lock);
+/* now recheck the id */
+reader_id = vreader_get_id(event->reader);
+if (reader_id == VSCARD_UNDEFINED_READER_ID) {
+/* this reader was rejected */
+vevent_delete(event);
+continue;
+}
+/* reader was accepted, now forward the event */
+}
+switch (event->type) {
+case VEVENT_READER_I

[Qemu-devel] [PATCH v21 03/11] usb-ccid: add CCID bus

2011-03-17 Thread Alon Levy

A CCID device is a smart card reader. It is a USB device, defined at [1].
This patch introduces the usb-ccid device that is a ccid bus. Next patches will
introduce two card types to use it, a passthru card and an emulated card.

 [1] http://www.usb.org/developers/devclass_docs/DWG_Smart-Card_CCID_Rev110.

Signed-off-by: Alon Levy 

---

changes from v20->v21: (Jes Sorenson review)
 * cosmetic changes - fix multi line comments.
 * reorder fields in USBCCIDState
 * add reference to COPYING
 * add --enable-smartcard and --disable-smartcard here (moved
 from last patch)

changes from v19->v20:
 * checkpatch.pl

changes from v18->v19:
 * merged: ccid.h: add copyright, fix define and remove non C89 comments
 * add qdev.desc

changes from v15->v16:

Behavioral changes:
 * fix abort on client answer after card remove
 * enable migration
 * remove side affect code from asserts
 * return consistent self-powered state
 * mask out reserved bits in ccid_set_parameters
 * add missing abRFU in SetParameters (no affect on linux guest)

whitefixes / comments / consts defines:
 * remove stale comment
 * remove ccid_print_pending_answers if no DEBUG_CCID
 * replace printf's with DPRINTF, remove DEBUG_CCID, add verbosity defines
 * use error_report
 * update copyright (most of the code is not original)
 * reword known bug comment
 * add missing closing quote in comment
 * add missing whitespace on one line
 * s/CCID_SetParameter/CCID_SetParameters/
 * add comments
 * use define for max packet size

Comment for "return consistent self-powered state":

the Configuration Descriptor bmAttributes claims we are self powered,
but we were returning not self powered to USB_REQ_GET_STATUS control message.

In practice, this message is not sent by a linux 2.6.35.10-74.fc14.x86_64
guest (not tested on other guests), unless you issue lsusb -v as root (for
example).
---
 Makefile.objs |1 +
 configure |   11 +
 hw/ccid.h |   59 +++
 hw/usb-ccid.c | 1419 +
 4 files changed, 1490 insertions(+), 0 deletions(-)
 create mode 100644 hw/ccid.h
 create mode 100644 hw/usb-ccid.c

diff --git a/Makefile.objs b/Makefile.objs
index be4c4d9..6c5cfb6 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -199,6 +199,7 @@ hw-obj-$(CONFIG_APM) += pm_smbus.o apm.o
 hw-obj-$(CONFIG_DMA) += dma.o
 hw-obj-$(CONFIG_HPET) += hpet.o
 hw-obj-$(CONFIG_APPLESMC) += applesmc.o
+hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o
 
 # PPC devices
 hw-obj-$(CONFIG_OPENPIC) += openpic.o
diff --git a/configure b/configure
index a166de0..3b0174e 100755
--- a/configure
+++ b/configure
@@ -174,6 +174,7 @@ trace_backend="nop"
 trace_file="trace"
 spice=""
 rbd=""
+smartcard=""
 
 # parse CC options first
 for opt do
@@ -719,6 +720,10 @@ for opt do
   ;;
   --enable-rbd) rbd="yes"
   ;;
+  --disable-smartcard) smartcard="no"
+  ;;
+  --enable-smartcard) smartcard="yes"
+  ;;
   *) echo "ERROR: unknown option $opt"; show_help="yes"
   ;;
   esac
@@ -914,6 +919,8 @@ echo "   Default:trace-"
 echo "  --disable-spice  disable spice"
 echo "  --enable-spice   enable spice"
 echo "  --enable-rbd enable building the rados block device (rbd)"
+echo "  --disable-smartcard  disable smartcard support"
+echo "  --enable-smartcard   enable smartcard support"
 echo ""
 echo "NOTE: The object files are built at the place where configure is 
launched"
 exit 1
@@ -2809,6 +2816,10 @@ if test "$spice" = "yes" ; then
   echo "CONFIG_SPICE=y" >> $config_host_mak
 fi
 
+if test "$smartcard" = "yes" ; then
+  echo "CONFIG_SMARTCARD=y" >> $config_host_mak
+fi
+
 # XXX: suppress that
 if [ "$bsd" = "yes" ] ; then
   echo "CONFIG_BSD=y" >> $config_host_mak
diff --git a/hw/ccid.h b/hw/ccid.h
new file mode 100644
index 000..dbfc13c
--- /dev/null
+++ b/hw/ccid.h
@@ -0,0 +1,59 @@
+/*
+ * CCID Passthru Card Device emulation
+ *
+ * Copyright (c) 2011 Red Hat.
+ * Written by Alon Levy.
+ *
+ * This code is licenced under the GNU LGPL, version 2 or later.
+ */
+
+#ifndef CCID_H
+#define CCID_H
+
+#include "qdev.h"
+
+typedef struct CCIDCardState CCIDCardState;
+typedef struct CCIDCardInfo CCIDCardInfo;
+
+/*
+ * state of the CCID Card device (i.e. hw/ccid-card-*.c)
+ */
+struct CCIDCardState {
+DeviceState qdev;
+uint32_tslot; /* For future use with multiple slot reader. */
+};
+
+/*
+ * callbacks to be used by the CCID device (hw/usb-ccid.c) to call
+ * into the smartcard device (hw/ccid-card-*.c)
+ */
+struct CCIDCardInfo {
+DeviceInfo qdev;
+void (*print)(Monitor *mon, CCIDCardState *card, int indent);
+const uint8_t *(*get_atr)(CCIDCardState *card, uint32_t *len);
+void (*apdu_from_guest)(CCIDCardState *card,
+const uint8_t *apdu,
+uint32_t len);
+int (*exitfn)(CCIDCardState *card);
+int (*initfn)(CCIDCardState *card);
+};
+
+/*
+ * API for smartcard calling the CCID device (used by hw/ccid-card-*.c)

[Qemu-devel] [PATCH v21 02/11] qemu-thread.h: include inttypes.h

2011-03-17 Thread Alon Levy

qemu-thread.h relies on uint64_t being defined, but doesn't include
inttypes.h explicitly. This makes it easier to use it from vscclient (part
of libcacard).
---
 qemu-thread.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/qemu-thread.h b/qemu-thread.h
index acdb6b2..b144e73 100644
--- a/qemu-thread.h
+++ b/qemu-thread.h
@@ -1,6 +1,8 @@
 #ifndef __QEMU_THREAD_H
 #define __QEMU_THREAD_H 1
 
+#include 
+
 typedef struct QemuMutex QemuMutex;
 typedef struct QemuCond QemuCond;
 typedef struct QemuThread QemuThread;
-- 
1.7.4.1

[Qemu-devel] [PATCH v21 11/11] ccid: add docs

2011-03-17 Thread Alon Levy

Add documentation for the usb-ccid device and accompanying two card
devices, ccid-card-emulated and ccid-card-passthru.

Signed-off-by: Alon Levy 
---
 docs/ccid.txt |  135 +
 1 files changed, 135 insertions(+), 0 deletions(-)
 create mode 100644 docs/ccid.txt

diff --git a/docs/ccid.txt b/docs/ccid.txt
new file mode 100644
index 000..b8e504a
--- /dev/null
+++ b/docs/ccid.txt
@@ -0,0 +1,135 @@
+Qemu CCID Device Documentation.
+
+Contents
+1. USB CCID device
+2. Building
+3. Using ccid-card-emulated with hardware
+4. Using ccid-card-emulated with certificates
+5. Using ccid-card-passthru with client side hardware
+6. Using ccid-card-passthru with client side certificates
+7. Passthrough protocol scenario
+8. libcacard
+
+1. USB CCID device
+
+The USB CCID device is a USB device implementing the CCID specification, which
+lets one connect smart card readers that implement the same spec. For more
+information see the specification:
+
+ Universal Serial Bus
+ Device Class: Smart Card
+ CCID
+ Specification for
+ Integrated Circuit(s) Cards Interface Devices
+ Revision 1.1
+ April 22rd, 2005
+
+Smartcard are used for authentication, single sign on, decryption in
+public/private schemes and digital signatures. A smartcard reader on the client
+cannot be used on a guest with simple usb passthrough since it will then not be
+available on the client, possibly locking the computer when it is "removed". On
+the other hand this device can let you use the smartcard on both the client and
+the guest machine. It is also possible to have a completely virtual smart card
+reader and smart card (i.e. not backed by a physical device) using this device.
+
+2. Building
+
+The cryptographic functions and access to the physical card is done via NSS.
+
+Installing NSS:
+
+In redhat/fedora:
+yum install nss-devel
+In ubuntu/debian:
+apt-get install libnss3-dev
+(not tested on ubuntu)
+
+Configuring and building:
+./configure --enable-smartcard && make
+
+3. Using ccid-card-emulated with hardware
+
+Assuming you have a working smartcard on the host with the current
+user, using NSS, qemu acts as another NSS client using ccid-card-emulated:
+
+qemu -usb -device usb-ccid -device ccid-card-emualated
+
+4. Using ccid-card-emulated with certificates
+
+You must create the certificates. This is a one time process. We use NSS
+certificates:
+
+certutil -d /etc/pki/nssdb -x -t "CT,CT,CT" -S -s "CN=cert1" -n cert1
+
+Note: you must have exactly three certificates.
+
+Assuming the current user can access the certificates (use certutil -L to
+verify), you can use the emulated card type with the certificates backend:
+
+qemu -usb -device usb-ccid -device 
ccid-card-emulated,backend=certificates,cert1=cert1,cert2=cert2,cert3=cert3
+
+5. Using ccid-card-passthru with client side hardware
+
+on the host specify the ccid-card-passthru device with a suitable chardev:
+
+qemu -chardev socket,server,host=0.0.0.0,port=2001,id=ccid,nowait -usb 
-device usb-ccid -device ccid-card-passthru,chardev=ccid
+
+on the client run vscclient, built when you built the libcacard library:
+libcacard/vscclient  2001
+
+6. Using ccid-card-passthru with client side certificates
+
+Run qemu as per #5, and run vscclient as follows:
+(Note: vscclient command line interface is in a state of change)
+
+libcacard/vscclient -e "db=\"/etc/pki/nssdb\" use_hw=no 
soft=(,Test,CAC,,cert1,cert2,cert3)"  2001
+
+7. Passthrough protocol scenario
+
+This is a typical interchange of messages when using the passthru card device.
+usb-ccid is a usb device. It defaults to an unattached usb device on startup.
+usb-ccid expects a chardev and expects the protocol defined in
+cac_card/vscard_common.h to be passed over that.
+The usb-ccid device can be in one of three modes:
+ * detached
+ * attached with no card
+ * attached with card
+
+A typical interchange is: (the arrow shows who started each exchange, it can 
be client
+originated or guest originated)
+
+client event  |  vscclient   |passthru| usb-ccid  
|  guest event
+--
+  |  VSC_Init||   |
+  |  VSC_ReaderAdd   || attach|
+  |  ||   
|  sees new usb device.
+card inserted ->  |  ||   |
+  |  VSC_ATR |   insert   | insert
|  see new card
+  |  ||   |
+  |  VSC_APDU|   VSC_APDU |   
| <- guest sends APDU
+client<->physical |  ||   |
+card APDU exchange|  |

[Qemu-devel] [PATCH v21 00/11] usb-ccid

2011-03-17 Thread Alon Levy

This patchset adds three new devices, usb-ccid, ccid-card-passthru and
ccid-card-emulated, providing a CCID bus, a simple passthru protocol
implementing card requiring a client, and a standalone emulated card.

It also introduces a new directory libcaccard with CAC card emulation,
CAC is a type of ISO 7816 smart card.

Tree for pull: git://anongit.freedesktop.org/~alon/qemu usb_ccid.v20

v20->v21 changes:
 * all: cosmetics
 * libcacard, ccid-card-passthru:
  * use qemu-{malloc,free} and qemu-thread, error_report
 * libcacard:
  * split to multiple patches

v19->v20 changes:
 * checkpatch.pl. Here are the remaining errors with explanation:
  * ignored 5 macro errors of the type
   "ERROR: Macros with complex values should be enclosed in parenthesis"
   because fixing them breaks current code, if it really bothers someone
   I can fix it.
   * four of them are in libcacard/card_7816t.h:
   /* give the subfields a unified look */
   ..
#define a_cla a_header->ah_cla /* class */
#define a_ins a_header->ah_ins /* instruction */
#define a_p1 a_header->ah_p1   /* parameter 1 */
#define a_p2 a_header->ah_p2   /* parameter 2 */
   * and the fifth:
#4946: FILE: libcacard/vcardt.h:31:
+#define VCARD_ATR_PREFIX(size) 0x3b, 0x66+(size), 0x00, 0xff, \
+   'V', 'C', 'A', 'R', 'D', '_'
  * Ignored this warning since I couldn't figure it out, and it's a test
   file:
WARNING: externs should be avoided in .c files
#2343: FILE: libcacard/link_test.c:7:
+VCardStatus cac_card_init(const char *flags, VCard *card,

v18-v19 changes:
 * more merges, down to a single digit number of patches.
 * drop enumeration property, use string.
 * rebased (trivial)

v17-v18 changes:
 * merge vscard_common.h patches.
 * actually provide a tree to pull.

v16-v17 changes:
 * merged all the "v15->v16" patches
 * merged some more wherever it was easy (all same file commits).
 * added signed off by to first four patches
 * ccid.h: added copyright, removed underscore in defines, and replaced
 non C89 comments

v15-v16 changes:
 * split vscard_common introducing patch for ease of review
 * sum of commit logs for the v15-v16 commits: (whitespace fixes
removed for space, see original commit messages in later patches)
  * usb-ccid:
   * fix abort on client answer after card remove
   * enable migration
   * remove side affect code from asserts
   * return consistent self-powered state
   * mask out reserved bits in ccid_set_parameters
   * add missing abRFU in SetParameters (no affect on linux guest)
  * vscard_common.h protocol change:
   * VSCMsgInit capabilities and magic
   * removed ReaderResponse, will use Error instead with code==VSC_SUCCESS.
   * added Flush and FlushComplete, remove Reconnect.
   * define VSCARD_MAGIC
   * added error code VSC_SUCCESS.
  * ccid-card-passthru
   * return correct size
   * return error instead of assert if client sent too large ATR
   * don't assert if client sent too large a size, but add asserts for indices 
to buffer
   * reset vscard_in indices on chardev disconnect
   * handle init from client
   * error if no chardev supplied
   * use ntoh, hton
   * eradicate reader_id_t
   * remove Reconnect usage (removed from VSCARD protocol)
   * send VSC_SUCCESS on card insert/remove and reader add/remove
  * ccid-card-emulated
   * fix error reporting in initfn

v14-v15 changes:
 * add patch with --enable-smartcard and --disable-smartcard and only
  disable ccid-card-emulated if nss not found.
 * add patch with description strings
 * s/libcaccard/libcacard/ in docs/ccid.txt

v13-v14 changes:
 - support device_del/device_add on ccid-card-* and usb-ccid
 * usb-ccid:
  * lose card reference when card device deleted
  * check slot number and deny adding a slot if one is already added.
 * ccid-card-*: use qdev_simple_unplug_cb in both emulated and passthru ccid 
cards,
   the exitfn already takes care of triggering card removal in the usb dev.
 * libcacard:
  * remove double include of config-host.mak
  * add replay of card events to libcacard to support second and more emulation
  * don't initialize more then once (doesn't support it right now, so one
   thread, NSS thread, is left when device_del is done)
  * add VCARD_EMUL_INIT_ALREADY_INITED
 * ccid-card-emulated:
  * take correct mutexes on signaling to fix deadlocks on device_del
  * allow card insertion/removal event without proper reader insertion event

v12-v13 changes:
 * libcacard:
  * fix Makefile clean to remove vscclient
  * fix double include of config-host in Makefile
 * usb-ccid: remove attach/detach logic, usb is always attached. Guest
  doesn't care if there is a reader attached with no card anyway.
 * ccid-card-passthru: don't close chr_dev on removal, makes it possible
  to use device_del/device_add to create remove/insertion for debugging.

v11-v12 changes:
 * fix out of tree build

v10-v11 changes:
 * fix last patch that removed one of the doc files.
 * updated flow table in docs/ccid.txt

v8-v10 changes:
 * usb-ccid:

[Qemu-devel] [PATCH v21 10/11] ccid: add ccid-card-emulated device

2011-03-17 Thread Alon Levy

This devices uses libcacard (internal) to emulate a smartcard conforming
to the CAC standard. It attaches to the usb-ccid bus. Usage instructions
(example command lines) are in the following patch in docs/ccid.txt. It
uses libcacard which uses nss, so it can work with both hw cards and
certificates (files).

Signed-off-by: Alon Levy 

---

changes from v20->v21: (Jes Sorenson review)
 * cosmetics
 * use qemu-thread and qemu_malloc/qemu_free

changes from v19->v20:
 * checkpatch.pl

changes from v18->v19:
 * add qdev.desc
 * backend: drop the enumeration property, back to using a string one.

changes from v16->v17:
 * use PROP_TYPE_ENUM for backend

changes from v15->v16:
 * fix error reporting in initfn
 * bump copyright year
 * update copyright license

changes from v1:
 * remove stale comments, use only c-style comments
 * bugfix, forgot to set recv_len
 * change reader name to 'Virtual Reader'
---
 Makefile.objs   |1 +
 hw/ccid-card-emulated.c |  595 +++
 2 files changed, 596 insertions(+), 0 deletions(-)
 create mode 100644 hw/ccid-card-emulated.c

diff --git a/Makefile.objs b/Makefile.objs
index ed77dfd..c4097fb 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -200,6 +200,7 @@ hw-obj-$(CONFIG_DMA) += dma.o
 hw-obj-$(CONFIG_HPET) += hpet.o
 hw-obj-$(CONFIG_APPLESMC) += applesmc.o
 hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o ccid-card-passthru.o
+hw-obj-$(CONFIG_SMARTCARD_NSS) += ccid-card-emulated.o
 
 # PPC devices
 hw-obj-$(CONFIG_OPENPIC) += openpic.o
diff --git a/hw/ccid-card-emulated.c b/hw/ccid-card-emulated.c
new file mode 100644
index 000..0b07184
--- /dev/null
+++ b/hw/ccid-card-emulated.c
@@ -0,0 +1,595 @@
+/*
+ * CCID Card Device. Emulated card.
+ *
+ * Copyright (c) 2011 Red Hat.
+ * Written by Alon Levy.
+ *
+ * This code is licenced under the GNU LGPL, version 2 or later.
+ */
+
+/*
+ * It can be used to provide access to the local hardware in a non exclusive
+ * way, or it can use certificates. It requires the usb-ccid bus.
+ *
+ * Usage 1: standard, mirror hardware reader+card:
+ * qemu .. -usb -device usb-ccid -device ccid-card-emulated
+ *
+ * Usage 2: use certificates, no hardware required
+ * one time: create the certificates:
+ *  for i in 1 2 3; do
+ *  certutil -d /etc/pki/nssdb -x -t "CT,CT,CT" -S -s "CN=user$i" -n user$i
+ *  done
+ * qemu .. -usb -device usb-ccid \
+ *  -device ccid-card-emulated,cert1=user1,cert2=user2,cert3=user3
+ *
+ * If you use a non default db for the certificates you can specify it using
+ * the db parameter.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "qemu-thread.h"
+#include "qemu-char.h"
+#include "monitor.h"
+#include "hw/ccid.h"
+
+#define DPRINTF(card, lvl, fmt, ...) \
+do {\
+if (lvl <= card->debug) {\
+printf("ccid-card-emul: %s: " fmt , __func__, ## __VA_ARGS__);\
+} \
+} while (0)
+
+#define EMULATED_DEV_NAME "ccid-card-emulated"
+
+#define BACKEND_NSS_EMULATED_NAME "nss-emulated"
+#define BACKEND_CERTIFICATES_NAME "certificates"
+
+enum {
+BACKEND_NSS_EMULATED = 1,
+BACKEND_CERTIFICATES
+};
+
+#define DEFAULT_BACKEND BACKEND_NSS_EMULATED
+
+typedef struct EmulatedState EmulatedState;
+
+enum {
+EMUL_READER_INSERT = 0,
+EMUL_READER_REMOVE,
+EMUL_CARD_INSERT,
+EMUL_CARD_REMOVE,
+EMUL_GUEST_APDU,
+EMUL_RESPONSE_APDU,
+EMUL_ERROR,
+};
+
+static const char *emul_event_to_string(uint32_t emul_event)
+{
+switch (emul_event) {
+case EMUL_READER_INSERT:
+return "EMUL_READER_INSERT";
+case EMUL_READER_REMOVE:
+return "EMUL_READER_REMOVE";
+case EMUL_CARD_INSERT:
+return "EMUL_CARD_INSERT";
+case EMUL_CARD_REMOVE:
+return "EMUL_CARD_REMOVE";
+case EMUL_GUEST_APDU:
+return "EMUL_GUEST_APDU";
+case EMUL_RESPONSE_APDU:
+return "EMUL_RESPONSE_APDU";
+case EMUL_ERROR:
+return "EMUL_ERROR";
+}
+return "UNKNOWN";
+}
+
+typedef struct EmulEvent {
+QSIMPLEQ_ENTRY(EmulEvent) entry;
+union {
+struct {
+uint32_t type;
+} gen;
+struct {
+uint32_t type;
+uint64_t code;
+} error;
+struct {
+uint32_t type;
+uint32_t len;
+uint8_t data[];
+} data;
+} p;
+} EmulEvent;
+
+#define MAX_ATR_SIZE 40
+struct EmulatedState {
+CCIDCardState base;
+uint8_t  debug;
+char*backend_str;
+uint32_t backend;
+char*cert1;
+char*cert2;
+char*cert3;
+char*db;
+uint8_t  atr[MAX_ATR_SIZE];
+uint8_t  atr_length;
+QSIMPLEQ_HEAD(event_list, EmulEvent) event_list;
+QemuMutex event_list_mutex;
+VReader *reader;
+QSIMPLEQ_HEAD(guest_apdu_list, EmulEvent) guest_apdu_list;
+QemuMutex vreader_mutex; /* and guest_apdu_list mutex */
+QemuMutex handle_apdu_mutex;
+QemuCond handle_apdu_cond;
+int  pipe[2];
+int  quit_apdu_thread;
+QemuMutex a

Re: [Qemu-devel] [PATCH +STABLE-0.14] exit if -drive specified is invalid instead of ignoring the "wrong" -drive

2011-03-17 Thread Kevin Wolf

Am 17.03.2011 08:58, schrieb Michael Tokarev:
> Trivial patch.  I've sent it yesterday but somehow it didn't
> reach the list.
> 
> This fixes the problem when qemu continues even if -drive specification
> is somehow invalid, resulting in a mess.  Applicable for both current
> master and for stable-0.14 (and 0.13 and 0.12 as well).
> 
> The prob can actually be seriuos: when you start guest with two drives
> and make an error in the specification of one of them, and the guest
> has something like a raid array on the two drives, guest may start failing
> that array or kick "missing" drives which may result in a mess - this is
> what actually happened to me, I did't want a resync at all, and a resync
> resulted in re-writing (and allocating) a 4TB virtual drive I used for
> testing, which in turn resulted in my filesystem filling up and whole
> thing failing badly.  Yes it was just testing VM, I experimented with
> larger raid arrays, but the end result was quite, well, unexpected.
> 
> Thanks!
> 
> Signed-off-by: Michael Tokarev 

Before I got a message like this and the guest started anyway:

qemu-system-x86_64: -drive asdfj: Invalid parameter 'asdfj'

Now it exits like it should, but I don't get an error message any more.
Exiting silently isn't really nice either.

> ---
>  vl.c |3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/vl.c b/vl.c
> index 8bcf2ae..79f996e 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2098,7 +2098,8 @@ int main(int argc, char **argv, char **envp)
>HD_OPTS);
>  break;
>  case QEMU_OPTION_drive:
> -drive_def(optarg);
> +if (drive_def(optarg) == NULL)
> +exit(1);

Coding style requires braces here.

Kevin

[Qemu-devel] [PATCH v21 08/11] libcacard: add passthru

2011-03-17 Thread Alon Levy

From: Robert Relyea 

In this mode libcacard doesn't emulate a card, but just passes apdu's
straight to the underlying card.

Not to be confused with ccid-card-passthru, which doesn't use libcacard
at all. So with this functionality in libcacard you can talk directly
to the host accessible card, for instance for provisioning or other
functions not available through the CAC interface. This can also be
used from a remote client for the same purpose.
---
 Makefile.objs   |2 +-
 libcacard/passthru.c|  609 +++
 libcacard/passthru.h|   53 
 libcacard/vcard_emul_type.c |6 +
 libcacard/vscclient.c   |   22 ++-
 5 files changed, 688 insertions(+), 4 deletions(-)
 create mode 100644 libcacard/passthru.c
 create mode 100644 libcacard/passthru.h

diff --git a/Makefile.objs b/Makefile.objs
index f703781..ed77dfd 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -354,7 +354,7 @@ endif
 ##
 # smartcard
 
-libcacard-y = cac.o event.o vcard.o vreader.o vcard_emul_nss.o 
vcard_emul_type.o card_7816.o
+libcacard-y = cac.o event.o passthru.o vcard.o vreader.o vcard_emul_nss.o 
vcard_emul_type.o card_7816.o
 
 vl.o: QEMU_CFLAGS+=$(GPROF_CFLAGS)
 
diff --git a/libcacard/passthru.c b/libcacard/passthru.c
new file mode 100644
index 000..d78e2db
--- /dev/null
+++ b/libcacard/passthru.c
@@ -0,0 +1,609 @@
+/*
+ * implement the applets for the CAC card.
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+#ifdef USE_PASSTHRU
+#include 
+#include 
+
+#include 
+
+#include "qemu-thread.h"
+
+#include "vcard.h"
+#include "vcard_emul.h"
+#include "card_7816.h"
+#include "vreader.h"
+#include "vcard_emul.h"
+#include "passthru.h"
+
+/*
+ * Passthru applet private data
+ */
+struct VCardAppletPrivateStruct {
+char *reader_name;
+/* pcsc-lite parameters */
+SCARDHANDLE hCard;
+uint32_t hProtocol;
+SCARD_IO_REQUEST *send_io;
+unsigned char atr[MAX_ATR_SIZE];
+int atr_len;
+};
+
+static SCARDCONTEXT global_context;
+
+#define MAX_RESPONSE_LENGTH 261 /*65537 */
+/*
+ * handle all the APDU's that are common to all CAC applets
+ */
+static VCardStatus
+passthru_process_apdu(VCard *card, VCardAPDU *apdu, VCardResponse **response)
+{
+LONG rv;
+unsigned char buf[MAX_RESPONSE_LENGTH];
+uint32_t len = MAX_RESPONSE_LENGTH;
+VCardAppletPrivate *applet_private = NULL;
+SCARD_IO_REQUEST receive_io;
+
+applet_private = vcard_get_current_applet_private(card, 0);
+if (applet_private == NULL) {
+*response = vcard_make_response(VCARD7816_STATUS_EXC_ERROR);
+return VCARD_DONE;
+}
+
+rv = SCardTransmit(applet_private->hCard, applet_private->send_io,
+   apdu->a_data, apdu->a_len, &receive_io, buf, &len);
+if (rv != SCARD_S_SUCCESS) {
+*response = vcard_make_response(VCARD7816_STATUS_EXC_ERROR);
+return VCARD_DONE;
+}
+
+*response = vcard_response_new_data(buf, len);
+if (*response == NULL) {
+*response =
+vcard_make_response(VCARD7816_STATUS_EXC_ERROR_MEMORY_FAILURE);
+} else {
+(*response)->b_total_len = (*response)->b_len;
+}
+return VCARD_DONE;
+}
+
+static void
+passthru_card_set_atr(VCard *card, unsigned char *atr, int atr_len)
+{
+VCardAppletPrivate *applet_private = NULL;
+applet_private = vcard_get_current_applet_private(card, 0);
+if (applet_private == NULL) {
+return;
+}
+applet_private->atr_len = MIN(atr_len, sizeof(applet_private->atr));
+memcpy(applet_private->atr, atr, applet_private->atr_len);
+}
+
+static void passthru_card_get_atr(VCard *card, unsigned char *atr, int 
*atr_len)
+{
+VCardAppletPrivate *applet_private = NULL;
+SCARD_READERSTATE *state;
+
+applet_private = vcard_get_current_applet_private(card, 0);
+if ((applet_private == NULL) || (applet_private->atr_len == 0)) {
+vcard_emul_get_atr(card, atr, atr_len);
+return;
+}
+*atr_len = MIN(applet_private->atr_len, *atr_len);
+memcpy(atr, applet_private->atr, *atr_len);
+return;
+}
+
+/*
+ *  reset the inter call state between applet selects
+ */
+static VCardStatus
+passthru_reset(VCard *card, int channel)
+{
+return VCARD_DONE;
+}
+
+static VCardStatus
+passthru_pcsc_lite_init()
+{
+LONG rv;
+if (global_context != 0) {
+return VCARD_DONE;
+}
+rv = SCardEstablishContext(SCARD_SCOPE_SYSTEM, NULL, NULL, 
&global_context);
+if (rv != SCARD_S_SUCCESS) {
+return VCARD_FAIL;
+}
+return VCARD_DONE;
+}
+
+/*
+ *  match if s1 is completely contained in s2
+ */
+static int
+string_match(const char *s1, const char *s2)
+{
+int len = strlen(s1);
+const char *start;
+
+for (start = strchr(s2, *s1); start; start = strchr(start+1, *s1)) {
+if

Re: [Qemu-devel] [RFC] QCFG: a new mechanism to replace QemuOpts and option handling

2011-03-17 Thread Markus Armbruster

Anthony Liguori  writes:

> As I've been waiting for QAPI review, I've been working on the design
> of a new mechanism to replace our current command line option handling
> (QemuOpts) with something that reuses the QAPI infrastructure.

I'm ignoring the connection to QAPI, because I'm still ignorant there.

> The 'QemuOpts' syntax is just a way to encode complex data structures.
> nic,model=virtio,macaddress=00:01:02:03:04:05' can be mapped directly
> to a C data structure.  This is exactly what QCFG does using the same
> JSON schema mechanism that QMP uses.
>
> The effect is that you describe a command line argument in JSON like so:
>
> { 'type': 'VncConfig',
>   'data': { 'address': 'str', '*password': 'bool', '*reverse': 'bool',
> '*no-lock-key-sync': 'bool', '*sasl': 'bool', '*tls': 'bool',
> '*x509': 'str', '*x509verify': 'str', '*acl': 'bool',
> '*lossy': 'bool' } }
>
>
> You then just implement a C function that gets called for each -vnc
> option specified:
>
> void qcfg_handle_vnc(VncConfig *option, Error **errp)
> {
> }
>
> And that's it.  You can squirrel away the option such that they all
> can be processed later, you can perform additional validation and
> return an error, or you can implement the appropriate logic.
>
> The VncConfig structure is a proper C data structure.  The advantages
> of this approach compared to QemuOpts are similar to QAPI:
>
> 1) Strong typing means less bugs with lack of command line validation.
> In many cases, a bad command line results in a SEGV today.

Static typing, you mean.

> 2) Every option is formally specified and documented in a way that is
> both rigorous and machine readable.  This means we can generate high
> quality documentation in a variety of formats.

You make it sound like QemuOpts wouldn't specify options.  It does.
Where your proposal differs from QemuOpts, in my opinion, is that it
uses C types rather than QemuOpts' very own ad hoc type system, and is
more expressive, see your 6) below.

> 3) The command line parameters support full introspection.  This
> should provide the same functionality as Dan's earlier introspection
> patches.

Again, this isn't something QemuOpts could not do.  We just neglected to
make its option specification available outside QEMU, mostly because for
lack of consensus on what we want to expose there.

> 4) The 'VncConfig' structure also has JSON marshallers and the
> qcfg_handle_vnc() function can be trivially bridged to QMP.  This
> means command line oriented interfaces (like device_add) are better
> integrated with QMP.

Yes.

> 5) Very complex data types can be implemented.  We had some discussion
> of supporting nested structures with -blockdev.  This wouldn't work
> with QemuOpts but I've already implemented it with QCFG (blockdev
> syntax is my test case right now).  The syntax I'm currently using is
> -blockdev
> cache=none,id=foo,format.qcow.protocol.nbd.hostname=localhost where
> .' is used to reference sub structures.

We might want some syntactic sugar so that users don't have to repeat
format.qcow.protocol.nbd.FOO for every FOO they want to configure.

> 6) Unions are supported which means complex options like -net can be
> specified in the schema and don't require hand validation.
>
> I've got a spec written up at http://wiki.qemu.org/Features/QCFG.

Not reachable right now.

> Initial code is in my QAPI tree.
>
> I'm not going to start converting things until we get closer to the
> end of 0.15 and QAPI is fully merged.  My plan is to focus on this for
> 0.16 and do a full conversion for the 0.16 time frame using the same
> approach as QAPI.  That means that for 0.16, we would be able to set
> all command line options via QMP in a programmatic fashion with full
> support for introspection.
>
> I'm haven't yet closed on how to bridge this to qdev.  qdev is a big
> consumer of QemuOpts today.  I have some general ideas about what I'd
> like to do but so far, I haven't written anything down.

Glad you mention qdev.

qdev properties describe the configurable parts of qdev state objects.
A state object is a C struct.  The description is C data.  Poor man's
reflection.

qdev needs to parse and print a device's properties.  It uses QemuOpts
to parse NAME=VALUE,... option strings into a list of (NAME, VALUE), and
callbacks to parse the VALUEs.  Awkward, especially when you go QMP ->
option string -> qdev struct.

Yet another one: vmstate, which describes migratable parts of qdev state
objects.

Unlike these two, QCFG doesn't describe C structs, it generates them
from JSON specs.  If I understand your proposal correctly.  Hmm.

Can we avoid defining our data types in JSON rather than C?

Can we adopt *one* reflection mechanism?

> I wanted to share these plans early hoping to get some feedback and
> also to maybe interest some folks in helping out.

There's also QEMUOptionParameter, thankfully limited to block code.  Any
plans to cover that as well?

Hmm, my nitpicking and qu

Re: [Qemu-devel] Re: [PATCH 03/26] pci: introduce pci_swizzle_map_irq_fn() for standardized interrupt pin swizzle

2011-03-17 Thread Isaku Yamahata

On Thu, Mar 17, 2011 at 04:43:36PM +0200, Michael S. Tsirkin wrote:
> On Wed, Mar 16, 2011 at 06:29:14PM +0900, Isaku Yamahata wrote:
> > introduce pci_swizzle_map_irq_fn() for interrupt pin swizzle which is
> > standardized. PCI bridge swizzle is common logic, by introducing
> > this function duplicated swizzle logic will be avoided later.
> > 
> > Cc: Michael S. Tsirkin 
> > Signed-off-by: Isaku Yamahata 
> 
> As long as it's used on data path, better to inline it?

The function is passed for pci_map_irq_fn as function pointer.
So inlining doesn't make sense.

> 
> > ---
> >  hw/pci.c |   18 ++
> >  hw/pci.h |2 ++
> >  2 files changed, 20 insertions(+), 0 deletions(-)
> > 
> > diff --git a/hw/pci.c b/hw/pci.c
> > index 5349488..d6c5e66 100644
> > --- a/hw/pci.c
> > +++ b/hw/pci.c
> > @@ -1117,6 +1117,24 @@ static void pci_set_irq(void *opaque, int irq_num, 
> > int level)
> >  pci_change_irq_level(pci_dev, irq_num, change);
> >  }
> >  
> > +/*
> > + * PCI-to-PCI bridge specification
> > + * 9.1: Interrupt routing. Table 9-1
> > + *
> > + * the PCI Express Base Specification, Revision 2.1
> > + * 2.2.8.1: INTx interrutp signaling - Rules
> > + *  the Implementation Note
> > + *  Table 2-20
> > + */
> > +/*
> > + * 0 <= pin <= 3 0 = INTA, 1 = INTB, 2 = INTC, 3 = INTD
> > + * 0-origin unlike PCI interrupt pin register.
> > + */
> > +int pci_swizzle_map_irq_fn(void *opaque, PCIDevice *pci_dev, int pin)
> > +{
> > +return (pin + PCI_SLOT(pci_dev->devfn)) % PCI_NUM_PINS;
> > +}
> > +
> >  /***/
> >  /* monitor info on PCI */
> >  
> > diff --git a/hw/pci.h b/hw/pci.h
> > index 1a08139..46b3ad3 100644
> > --- a/hw/pci.h
> > +++ b/hw/pci.h
> > @@ -235,6 +235,8 @@ PCIBus *pci_bus_new(DeviceState *parent, const char 
> > *name, int devfn_min);
> >  void pci_bus_irqs(PCIBus *bus, pci_set_irq_fn set_irq, pci_map_irq_fn 
> > map_irq,
> >void *irq_opaque, int nirq);
> >  void pci_bus_hotplug(PCIBus *bus, pci_hotplug_fn hotplug, DeviceState 
> > *dev);
> > +/* 0 <= pin <= 3 0 = INTA, 1 = INTB, 2 = INTC, 3 = INTD */
> > +int pci_swizzle_map_irq_fn(void *opaque, PCIDevice *pci_dev, int pin);
> >  PCIBus *pci_register_bus(DeviceState *parent, const char *name,
> >   pci_set_irq_fn set_irq, pci_map_irq_fn map_irq,
> >   void *irq_opaque, int devfn_min, int nirq);
> > -- 
> > 1.7.1.1
> 

-- 
yamahata

Re: [Qemu-devel] Re: [RFC] QCFG: a new mechanism to replace QemuOpts and option handling

2011-03-17 Thread Markus Armbruster

Kevin Wolf  writes:

> Am 15.03.2011 14:37, schrieb Anthony Liguori:
>> On 03/15/2011 06:21 AM, Kevin Wolf wrote:
>>> Am 14.03.2011 18:48, schrieb Anthony Liguori:
 I've got a spec written up at http://wiki.qemu.org/Features/QCFG.
 Initial code is in my QAPI tree.
>>> One question about a small detail on this wiki page:
>>>
 typedef struct BlockdevConfig {
  char * file;
  struct BlockdevConfig * backing_file;

  struct BlockdevConfig * next;
 } BlockdevConfig;
>>> What is the 'next' pointer used for,
>> 
>> This is a standard part of QAPI.  All types get a next pointer added 
>> such that we can support lists of complex types.
>
> Only a single list for each object.

Don't even think of trees.  Yuck.

>>>   are you going to store a list of
>>> all -blockdev options used? And why isn't it a QLIST or something?
>> 
>> Two reasons.  QLIST requires another type for the head of the list which 
>> would complicate things overall.  Second is that these types are part of 
>> the libqmp interface and I didn't want to force qemu-queue on any 
>> consumer of libqmp.
>
> And now you force existing qemu code to go back to open coded lists,
> which is arguably a step backwards. I don't think this is any better
> than forcing the (non-existent) users of libqmp to include one
> additional header file.

Agree.  Let's not screw up our internal interfaces for the sake of an
external interface that may or may not turn out to be viable.

Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests

2011-03-17 Thread rukhsana ansari

Alex, Michael,

Thank you for the clarification.

On Tue, Mar 15, 2011 at 1:01 AM, Alex Williamson  wrote:

> On Mon, 2011-03-14 at 21:00 +0200, Michael S. Tsirkin wrote:
> > On Mon, Mar 14, 2011 at 10:35:08PM +0530, rukhsana ansari wrote:
> > > Seeking clarification to the original question I posted:
> > > >>
> > > >>
> > > > This maybe a novice question - Would appreciate it if you can you
> provide a
> > > > pointer to documentation or relevant code that explains what is the
> > > > limitation in supporting level irq support in kvm irqfd.
> > > >
> > > >
> > > >
> > > After browsing the KVM kernel code, it does look like direct assignment
> of PCI
> > > devices allows support for level-triggered interrupts to be injected to
> the
> > > guest from the kernel.  (as opposed to not supporting it for vhost
> irqfd
> > > mechanism)
> > > This occurs when the guest device supports INTX.
> > > Reference:  kvm_assigned_dev_interrupt_work_handler() in assigned-dev.c
> calls
> > > kvm_set_irq()
> > > with the guest_irq.
> > > This function in turn invokes the assigned set function  (either
> > > kvm_set_pic_irq or kvm_set_ioapic_irq) which was setup at kvm_irq_chip
> creation
> > > time when kvm_setup_default_irq_routing () called for handling ioctl
> > > KVM_CREATE_IRQCHIP.
> > >
> > > So, it isn't clear why level-triggered interrupt isn't supported for
> irqfd
> > > mechanism.
> > > Would greatly appreciate clarification here
> > >
> > > Thanks
> > > -Rukhsana
> > >
> >
> > Mostly, no one came up with an implementation so far.
> >
> > If the point is to use irqfd with vhost-net, there's also
> > a question of adding interfaces to
> > 1. pass IO read transactions directly to another kernel module
> > 2. add an interface to clear the irq level
> >
> > Maybe the right thing is to combine the two somehow:
> > irqfd might get an oiption to set a bit in memory,
> > ioeventfd might get an option to read and clear from memory
> > and clear irqfd line at the same time.
>
> I had wanted this for VFIO too and it gets pretty complicated.  The
> first problem with level triggered interrupts is that you need to know
> which GSI your device triggers.  This means translating PCI INTA through
> bridge swizzles and chipset mapping to an IOAPIC.  Current device
> assignment does this through a complete hack in qemu.  Then you can set
> the IRQ, but being level triggered, we need to know when the guest has
> serviced the IRQ so we can de-assert it.  This requires a hook into the
> in-kernel APIC to sent the EOI back out to userspace.
>
> I posted RFC patches for doing all this a while back, but they didn't go
> anywhere.  I think the feeling was that it was too intrusive for "slow"
> interrupts.  The current thinking for VFIO based device assignment is to
> use qemu for level interrupts until we find something that actually
> needs low latency in this path.  We generally consider INTx to be like
> supporting i/o port space or non-4k BARs, ie. necessary for
> compatibility, but not necessarily a performance path.  High performance
> devices should always be using some kind of MSI because it bypasses all
> of the APIC complications and slowness.  Thanks,
>
> Alex
>
>


-- 
-Rukhsana

Re: [Qemu-devel] [PATCH 2/4] block: add a helper to change writeback mode on the fly

2011-03-17 Thread Kevin Wolf

Am 17.03.2011 15:44, schrieb Daniel P. Berrange:
> On Tue, Mar 15, 2011 at 03:11:32PM +0100, Christoph Hellwig wrote:
>> Add a new bdrv_change_cache that can set/clear the writeback flag
>> at runtime by stopping all I/O and closing/reopening the image file.
>>
>> All code is based on a patch from Prerna Saxena 
>> with minimal refactoring.
>>
>> Signed-off-by: Christoph Hellwig 
>>
>>  
>> +static int bdrv_reopen(BlockDriverState *bs, int bdrv_flags)
>> +{
>> +BlockDriver *drv = bs->drv;
>> +int ret;
>> +
>> +if (bdrv_flags == bs->open_flags) {
>> +return 0;
>> +}
>> +
>> +/* Quiesce IO for the given block device */
>> +qemu_aio_flush();
>> +bdrv_flush(bs);
>> +
>> +bdrv_close(bs);
>> +ret = bdrv_open(bs, bs->filename, bdrv_flags, drv);
>> +
>> +/*
>> + * A failed attempt to reopen the image file must lead to 'abort()'
>> + */
>> +if (ret != 0) {
>> +abort();
>> +}
>> +
>> +return ret;
>> +}
> 
> 
> 
>> +
>> +int bdrv_change_cache(BlockDriverState *bs, bool enable)
>> +{
>> +int bdrv_flags = 0;
>> +
>> +bdrv_flags = bs->open_flags & ~BDRV_O_CACHE_WB;
>> +if (enable) {
>> +bdrv_flags |= BDRV_O_CACHE_WB;
>> +}
>> +
>> +return bdrv_reopen(bs, bdrv_flags);
>> +}
> 
> Is there any way we can manage todo this *without* closing and
> re-opening the file descriptor ?
> 
> One of the things we're considering for the future is to enable
> QEMU to be passed open FD(s) for its drives, from the management
> system, instead of having QEMU open the files itself. 

What about cache mode, read-only flags etc. that are set with the right
flags during the open? Must qemu just assume that the management has
done the right thing?

And what about things like backing files or other information that
depends on the content of the image file?

> If QEMU expects to close+reopen any of its disks at any time the
> guest requests, this will complicate life somewhat

It does expect this. For example for making backing files temporarily
read-write during a 'commit' monitor command, we already reopen the
image. (Let's hope nobody uses -snapshot, live snapshots and commit, he
would lose his disk...)

Kevin

Re: [Qemu-devel] qemu.org unreachable ?

2011-03-17 Thread Anthony Liguori


On 03/17/2011 10:19 AM, François Revol wrote:

At least from france, confirmed by several people.

François.

traceroute to qemu.org (64.62.173.86), 64 hops max, 52 byte packets
  1  172.28.0.1 (172.28.0.1)  1.145 ms  0.960 ms  0.859 ms
  2  172.30.1.1 (172.30.1.1)  0.702 ms  0.786 ms  0.631 ms
  3  routeur-iut.iut-valence.fr (193.51.172.1)  1.207 ms  0.894 ms  0.878 ms
  4  r-val1.grenet.fr (194.214.50.1)  0.920 ms  0.996 ms  0.904 ms
  5  193.54.135.13 (193.54.135.13)  1.572 ms  3.565 ms  1.670 ms
  6  grenoble-valence.amplivia.fr (193.54.133.141)  6.838 ms  3.696 ms  3.771 ms
  7  * * *
  8  te1-3-lyon2-rtr-021.noc.renater.fr (193.51.189.110)  12.558 ms  12.544 ms  
14.353 ms
  9  te0-0-0-1-paris2-rtr-001.noc.renater.fr (193.51.189.2)  13.107 ms  13.642 
ms  12.555 ms
10  hurricane-electric.franceix.net (193.105.232.10)  18.148 ms  13.341 ms  
20.290 ms
11  10gigabitethernet1-3.core1.lon1.he.net (72.52.92.33)  21.393 ms  28.434 ms  
25.895 ms
12  10gigabitethernet7-4.core1.nyc4.he.net (72.52.92.77)  89.660 ms  89.808 ms  
90.473 ms
13  10gigabitethernet2-3.core1.ash1.he.net (72.52.92.86)  104.163 ms  95.601 ms 
 102.327 ms
14  10gigabitethernet1-4.core1.pao1.he.net (72.52.92.29)  164.219 ms  164.891 ms
 10gigabitethernet1-1.core1.fmt1.he.net (72.52.92.109)  176.086 ms
15  lafrance-internet-services.gigabitethernet3-15.core1.fmt1.he.net 
(66.220.10.126)  173.076 ms  172.690 ms  172.628 ms
16  lafrance-internet-services.gigabitethernet3-15.core1.fmt1.he.net 
(66.220.10.126)  1625.419 ms !H *  1419.260 ms !H


PING qemu.org (64.62.173.86): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
92 bytes from lafrance-internet-services.gigabitethernet3-15.core1.fmt1.he.net 
(66.220.10.126): Destination Host Unreachable
Vr HL TOS  Len   ID Flg  off TTL Pro  cks  Src  Dst
  4  5  00 5400 d1d7   0   31  01 1c40 172.28.1.225  64.62.173.86


I can confirm and I'm working on getting it resolved.

Regards,

Anthony Liguori

Re: [Qemu-devel] [PATCH v21 01/11] trace: move trace objects from Makefile to Makefile.objs

2011-03-17 Thread Alon Levy

On Thu, Mar 17, 2011 at 04:49:26PM +0200, Alon Levy wrote:
> ---
>  Makefile  |   32 
>  Makefile.objs |   32 
>  2 files changed, 32 insertions(+), 32 deletions(-)

The commit message can be a little more verbose :(

It will be:

allow libcacard (later commit) to make use of the rules for
building trace objects, to build libcacard/vscclient (and
later still libcacard/libcacard.so).

> 
> diff --git a/Makefile b/Makefile
> index eca4c76..5b35b9f 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -112,38 +112,6 @@ ui/vnc.o: QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
>  
>  bt-host.o: QEMU_CFLAGS += $(BLUEZ_CFLAGS)
>  
> -ifeq ($(TRACE_BACKEND),dtrace)
> -trace.h: trace.h-timestamp trace-dtrace.h
> -else
> -trace.h: trace.h-timestamp
> -endif
> -trace.h-timestamp: $(SRC_PATH)/trace-events config-host.mak
> - $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
> --$(TRACE_BACKEND) -h < $< > $@,"  GEN   trace.h")
> - @cmp -s $@ trace.h || cp $@ trace.h
> -
> -trace.c: trace.c-timestamp
> -trace.c-timestamp: $(SRC_PATH)/trace-events config-host.mak
> - $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
> --$(TRACE_BACKEND) -c < $< > $@,"  GEN   trace.c")
> - @cmp -s $@ trace.c || cp $@ trace.c
> -
> -trace.o: trace.c $(GENERATED_HEADERS)
> -
> -trace-dtrace.h: trace-dtrace.dtrace
> - $(call quiet-command,dtrace -o $@ -h -s $<, "  GEN   trace-dtrace.h")
> -
> -# Normal practice is to name DTrace probe file with a '.d' extension
> -# but that gets picked up by QEMU's Makefile as an external dependancy
> -# rule file. So we use '.dtrace' instead
> -trace-dtrace.dtrace: trace-dtrace.dtrace-timestamp
> -trace-dtrace.dtrace-timestamp: $(SRC_PATH)/trace-events config-host.mak
> - $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
> --$(TRACE_BACKEND) -d < $< > $@,"  GEN   trace-dtrace.dtrace")
> - @cmp -s $@ trace-dtrace.dtrace || cp $@ trace-dtrace.dtrace
> -
> -trace-dtrace.o: trace-dtrace.dtrace $(GENERATED_HEADERS)
> - $(call quiet-command,dtrace -o $@ -G -s $<, "  GEN trace-dtrace.o")
> -
> -simpletrace.o: simpletrace.c $(GENERATED_HEADERS)
> -
>  version.o: $(SRC_PATH)/version.rc config-host.mak
>   $(call quiet-command,$(WINDRES) -I. -o $@ $<,"  RC$(TARGET_DIR)$@")
>  
> diff --git a/Makefile.objs b/Makefile.objs
> index a52f42f..be4c4d9 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -309,6 +309,38 @@ libdis-$(CONFIG_SPARC_DIS) += sparc-dis.o
>  # trace
>  
>  ifeq ($(TRACE_BACKEND),dtrace)
> +trace.h: trace.h-timestamp trace-dtrace.h
> +else
> +trace.h: trace.h-timestamp
> +endif
> +trace.h-timestamp: $(SRC_PATH)/trace-events config-host.mak
> + $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
> --$(TRACE_BACKEND) -h < $< > $@,"  GEN   trace.h")
> + @cmp -s $@ trace.h || cp $@ trace.h
> +
> +trace.c: trace.c-timestamp
> +trace.c-timestamp: $(SRC_PATH)/trace-events config-host.mak
> + $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
> --$(TRACE_BACKEND) -c < $< > $@,"  GEN   trace.c")
> + @cmp -s $@ trace.c || cp $@ trace.c
> +
> +trace.o: trace.c $(GENERATED_HEADERS)
> +
> +trace-dtrace.h: trace-dtrace.dtrace
> + $(call quiet-command,dtrace -o $@ -h -s $<, "  GEN   trace-dtrace.h")
> +
> +# Normal practice is to name DTrace probe file with a '.d' extension
> +# but that gets picked up by QEMU's Makefile as an external dependancy
> +# rule file. So we use '.dtrace' instead
> +trace-dtrace.dtrace: trace-dtrace.dtrace-timestamp
> +trace-dtrace.dtrace-timestamp: $(SRC_PATH)/trace-events config-host.mak
> + $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
> --$(TRACE_BACKEND) -d < $< > $@,"  GEN   trace-dtrace.dtrace")
> + @cmp -s $@ trace-dtrace.dtrace || cp $@ trace-dtrace.dtrace
> +
> +trace-dtrace.o: trace-dtrace.dtrace $(GENERATED_HEADERS)
> + $(call quiet-command,dtrace -o $@ -G -s $<, "  GEN trace-dtrace.o")
> +
> +simpletrace.o: simpletrace.c $(GENERATED_HEADERS)
> +
> +ifeq ($(TRACE_BACKEND),dtrace)
>  trace-obj-y = trace-dtrace.o
>  else
>  trace-obj-y = trace.o
> -- 
> 1.7.4.1
> 
>

Re: [Qemu-devel] qemu.org unreachable ?

2011-03-17 Thread Антон Кочков

Looks like.
>From Russia also

Best regards,
Anton Kochkov.




On Thu, Mar 17, 2011 at 18:19, François Revol  wrote:
> At least from france, confirmed by several people.
>
> François.
>
> traceroute to qemu.org (64.62.173.86), 64 hops max, 52 byte packets
>  1  172.28.0.1 (172.28.0.1)  1.145 ms  0.960 ms  0.859 ms
>  2  172.30.1.1 (172.30.1.1)  0.702 ms  0.786 ms  0.631 ms
>  3  routeur-iut.iut-valence.fr (193.51.172.1)  1.207 ms  0.894 ms  0.878 ms
>  4  r-val1.grenet.fr (194.214.50.1)  0.920 ms  0.996 ms  0.904 ms
>  5  193.54.135.13 (193.54.135.13)  1.572 ms  3.565 ms  1.670 ms
>  6  grenoble-valence.amplivia.fr (193.54.133.141)  6.838 ms  3.696 ms  3.771 
> ms
>  7  * * *
>  8  te1-3-lyon2-rtr-021.noc.renater.fr (193.51.189.110)  12.558 ms  12.544 ms 
>  14.353 ms
>  9  te0-0-0-1-paris2-rtr-001.noc.renater.fr (193.51.189.2)  13.107 ms  13.642 
> ms  12.555 ms
> 10  hurricane-electric.franceix.net (193.105.232.10)  18.148 ms  13.341 ms  
> 20.290 ms
> 11  10gigabitethernet1-3.core1.lon1.he.net (72.52.92.33)  21.393 ms  28.434 
> ms  25.895 ms
> 12  10gigabitethernet7-4.core1.nyc4.he.net (72.52.92.77)  89.660 ms  89.808 
> ms  90.473 ms
> 13  10gigabitethernet2-3.core1.ash1.he.net (72.52.92.86)  104.163 ms  95.601 
> ms  102.327 ms
> 14  10gigabitethernet1-4.core1.pao1.he.net (72.52.92.29)  164.219 ms  164.891 
> ms
>    10gigabitethernet1-1.core1.fmt1.he.net (72.52.92.109)  176.086 ms
> 15  lafrance-internet-services.gigabitethernet3-15.core1.fmt1.he.net 
> (66.220.10.126)  173.076 ms  172.690 ms  172.628 ms
> 16  lafrance-internet-services.gigabitethernet3-15.core1.fmt1.he.net 
> (66.220.10.126)  1625.419 ms !H *  1419.260 ms !H
>
>
> PING qemu.org (64.62.173.86): 56 data bytes
> Request timeout for icmp_seq 0
> Request timeout for icmp_seq 1
> 92 bytes from 
> lafrance-internet-services.gigabitethernet3-15.core1.fmt1.he.net 
> (66.220.10.126): Destination Host Unreachable
> Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
>  4  5  00 5400 d1d7   0   31  01 1c40 172.28.1.225  64.62.173.86
>
>
>
>

Re: [Qemu-devel] [PATCH +STABLE-0.14] exit if -drive specified is invalid instead of ignoring the "wrong" -drive

2011-03-17 Thread Markus Armbruster

Kevin Wolf  writes:

> Am 17.03.2011 08:58, schrieb Michael Tokarev:
>> Trivial patch.  I've sent it yesterday but somehow it didn't
>> reach the list.
>> 
>> This fixes the problem when qemu continues even if -drive specification
>> is somehow invalid, resulting in a mess.  Applicable for both current
>> master and for stable-0.14 (and 0.13 and 0.12 as well).
>> 
>> The prob can actually be seriuos: when you start guest with two drives
>> and make an error in the specification of one of them, and the guest
>> has something like a raid array on the two drives, guest may start failing
>> that array or kick "missing" drives which may result in a mess - this is
>> what actually happened to me, I did't want a resync at all, and a resync
>> resulted in re-writing (and allocating) a 4TB virtual drive I used for
>> testing, which in turn resulted in my filesystem filling up and whole
>> thing failing badly.  Yes it was just testing VM, I experimented with
>> larger raid arrays, but the end result was quite, well, unexpected.
>> 
>> Thanks!
>> 
>> Signed-off-by: Michael Tokarev 
>
> Before I got a message like this and the guest started anyway:
>
> qemu-system-x86_64: -drive asdfj: Invalid parameter 'asdfj'
>
> Now it exits like it should, but I don't get an error message any more.
[...]

Are you sure?  I still get the error message in my testing.

Re: [Qemu-devel] [PATCH +STABLE-0.14] exit if -drive specified is invalid instead of ignoring the "wrong" -drive

2011-03-17 Thread Michael Tokarev

17.03.2011 18:04, Kevin Wolf wrote:
> Am 17.03.2011 08:58, schrieb Michael Tokarev:
[]
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -2098,7 +2098,8 @@ int main(int argc, char **argv, char **envp)
>>HD_OPTS);
>>  break;
>>  case QEMU_OPTION_drive:
>> -drive_def(optarg);
>> +if (drive_def(optarg) == NULL)
>> +exit(1);
> 
> Coding style requires braces here.

I'll just stick it into debian package.  I'm not going
to change all the other braces like this around that place.

Thanks.

/mjt

[Qemu-devel] qemu.org unreachable ?

2011-03-17 Thread François Revol

At least from france, confirmed by several people.

François.

traceroute to qemu.org (64.62.173.86), 64 hops max, 52 byte packets
 1  172.28.0.1 (172.28.0.1)  1.145 ms  0.960 ms  0.859 ms
 2  172.30.1.1 (172.30.1.1)  0.702 ms  0.786 ms  0.631 ms
 3  routeur-iut.iut-valence.fr (193.51.172.1)  1.207 ms  0.894 ms  0.878 ms
 4  r-val1.grenet.fr (194.214.50.1)  0.920 ms  0.996 ms  0.904 ms
 5  193.54.135.13 (193.54.135.13)  1.572 ms  3.565 ms  1.670 ms
 6  grenoble-valence.amplivia.fr (193.54.133.141)  6.838 ms  3.696 ms  3.771 ms
 7  * * *
 8  te1-3-lyon2-rtr-021.noc.renater.fr (193.51.189.110)  12.558 ms  12.544 ms  
14.353 ms
 9  te0-0-0-1-paris2-rtr-001.noc.renater.fr (193.51.189.2)  13.107 ms  13.642 
ms  12.555 ms
10  hurricane-electric.franceix.net (193.105.232.10)  18.148 ms  13.341 ms  
20.290 ms
11  10gigabitethernet1-3.core1.lon1.he.net (72.52.92.33)  21.393 ms  28.434 ms  
25.895 ms
12  10gigabitethernet7-4.core1.nyc4.he.net (72.52.92.77)  89.660 ms  89.808 ms  
90.473 ms
13  10gigabitethernet2-3.core1.ash1.he.net (72.52.92.86)  104.163 ms  95.601 ms 
 102.327 ms
14  10gigabitethernet1-4.core1.pao1.he.net (72.52.92.29)  164.219 ms  164.891 ms
10gigabitethernet1-1.core1.fmt1.he.net (72.52.92.109)  176.086 ms
15  lafrance-internet-services.gigabitethernet3-15.core1.fmt1.he.net 
(66.220.10.126)  173.076 ms  172.690 ms  172.628 ms
16  lafrance-internet-services.gigabitethernet3-15.core1.fmt1.he.net 
(66.220.10.126)  1625.419 ms !H *  1419.260 ms !H


PING qemu.org (64.62.173.86): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
92 bytes from lafrance-internet-services.gigabitethernet3-15.core1.fmt1.he.net 
(66.220.10.126): Destination Host Unreachable
Vr HL TOS  Len   ID Flg  off TTL Pro  cks  Src  Dst
 4  5  00 5400 d1d7   0   31  01 1c40 172.28.1.225  64.62.173.86

Re: [Qemu-devel] [PATCH +STABLE-0.14] exit if -drive specified is invalid instead of ignoring the "wrong" -drive

2011-03-17 Thread Kevin Wolf

Am 17.03.2011 16:49, schrieb Michael Tokarev:
> 17.03.2011 18:04, Kevin Wolf wrote:
>> Am 17.03.2011 08:58, schrieb Michael Tokarev:
> []
>>> --- a/vl.c
>>> +++ b/vl.c
>>> @@ -2098,7 +2098,8 @@ int main(int argc, char **argv, char **envp)
>>>HD_OPTS);
>>>  break;
>>>  case QEMU_OPTION_drive:
>>> -drive_def(optarg);
>>> +if (drive_def(optarg) == NULL)
>>> +exit(1);
>>
>> Coding style requires braces here.
> 
> I'll just stick it into debian package.  I'm not going
> to change all the other braces like this around that place.

I'm only asking to add one pair of braces in the line that you touch,
not to change everything in qemu. Shouldn't be that hard...

Kevin

Re: [Qemu-devel] qemu.org unreachable ?

2011-03-17 Thread Anthony Liguori


On 03/17/2011 11:10 AM, Антон Кочков wrote:

Looks like.
 From Russia also


The server is down.  The VM was having what appeared to be memory issues 
(random processes SEGV'ing) over the past week.  I can't say for sure 
whether it's an issue with the guest or whether it's an issue with the 
physical machine.


I'll keep the list posted as I learn more.  I hope to get it back online 
ASAP.


Regards,

Anthony Liguori


Best regards,
Anton Kochkov.




On Thu, Mar 17, 2011 at 18:19, François Revol  wrote:

At least from france, confirmed by several people.

François.

traceroute to qemu.org (64.62.173.86), 64 hops max, 52 byte packets
  1  172.28.0.1 (172.28.0.1)  1.145 ms  0.960 ms  0.859 ms
  2  172.30.1.1 (172.30.1.1)  0.702 ms  0.786 ms  0.631 ms
  3  routeur-iut.iut-valence.fr (193.51.172.1)  1.207 ms  0.894 ms  0.878 ms
  4  r-val1.grenet.fr (194.214.50.1)  0.920 ms  0.996 ms  0.904 ms
  5  193.54.135.13 (193.54.135.13)  1.572 ms  3.565 ms  1.670 ms
  6  grenoble-valence.amplivia.fr (193.54.133.141)  6.838 ms  3.696 ms  3.771 ms
  7  * * *
  8  te1-3-lyon2-rtr-021.noc.renater.fr (193.51.189.110)  12.558 ms  12.544 ms  
14.353 ms
  9  te0-0-0-1-paris2-rtr-001.noc.renater.fr (193.51.189.2)  13.107 ms  13.642 
ms  12.555 ms
10  hurricane-electric.franceix.net (193.105.232.10)  18.148 ms  13.341 ms  
20.290 ms
11  10gigabitethernet1-3.core1.lon1.he.net (72.52.92.33)  21.393 ms  28.434 ms  
25.895 ms
12  10gigabitethernet7-4.core1.nyc4.he.net (72.52.92.77)  89.660 ms  89.808 ms  
90.473 ms
13  10gigabitethernet2-3.core1.ash1.he.net (72.52.92.86)  104.163 ms  95.601 ms 
 102.327 ms
14  10gigabitethernet1-4.core1.pao1.he.net (72.52.92.29)  164.219 ms  164.891 ms
10gigabitethernet1-1.core1.fmt1.he.net (72.52.92.109)  176.086 ms
15  lafrance-internet-services.gigabitethernet3-15.core1.fmt1.he.net 
(66.220.10.126)  173.076 ms  172.690 ms  172.628 ms
16  lafrance-internet-services.gigabitethernet3-15.core1.fmt1.he.net 
(66.220.10.126)  1625.419 ms !H *  1419.260 ms !H


PING qemu.org (64.62.173.86): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
92 bytes from lafrance-internet-services.gigabitethernet3-15.core1.fmt1.he.net 
(66.220.10.126): Destination Host Unreachable
Vr HL TOS  Len   ID Flg  off TTL Pro  cks  Src  Dst
  4  5  00 5400 d1d7   0   31  01 1c40 172.28.1.225  64.62.173.86

[Qemu-devel] FVD Paper Accepted to USENIX ATC'11

2011-03-17 Thread Chunqiang Tang

I am pleased to report that a short of version of the FVD-cow paper I 
previously posted here was just accepted to USENIX Annual Technical 
Conference (USENIX ATC'11), which is a prestigious and highly competitive 
research conference in the systems field. This shows from another angle 
(in addition to the good performance numbers) that FVD indeed advances the 
state-of-the-art in virtual machine image format. Coincidentally, Fabrice 
Bellard's original paper on QEMU was also published in USENIX ATC several 
years ago. 

The draft version of the long FVD-cow paper is available at 
http://researcher.watson.ibm.com/researcher/files/us-ctang/FVD-cow.pdf . 
This paper focuses on the on-demand image streaming feature of FVD, i.e., 
using FVD's bitmap to efficiently implement copy-on-write, copy-on-read, 
and adaptive prefetching. Other features of FVD (including journal, small 
lookup table, and the most recent work on efficient internal snapshot) are 
not yet published in research paper, and will be posted here once 
published, which can serve as detailed documentation of the FVD algorithm 
and code.

Regards,
ChunQiang (CQ) Tang, Ph.D.
Homepage: http://www.research.ibm.com/people/c/ctang

Re: [Qemu-devel] [PATCH v21 01/11] trace: move trace objects from Makefile to Makefile.objs

2011-03-17 Thread Stefan Hajnoczi

On Thu, Mar 17, 2011 at 3:00 PM, Alon Levy  wrote:
> On Thu, Mar 17, 2011 at 04:49:26PM +0200, Alon Levy wrote:
>> ---
>>  Makefile      |   32 
>>  Makefile.objs |   32 
>>  2 files changed, 32 insertions(+), 32 deletions(-)
>
> The commit message can be a little more verbose :(
>
> It will be:
>
> allow libcacard (later commit) to make use of the rules for
> building trace objects, to build libcacard/vscclient (and
> later still libcacard/libcacard.so).

I will test this patch and get back to you within the next two or three days.

Stefan

Re: [Qemu-devel] [PATCH +STABLE-0.14] exit if -drive specified is invalid instead of ignoring the "wrong" -drive

2011-03-17 Thread Kevin Wolf

Am 17.03.2011 16:33, schrieb Markus Armbruster:
> Kevin Wolf  writes:
> 
>> Am 17.03.2011 08:58, schrieb Michael Tokarev:
>>> Trivial patch.  I've sent it yesterday but somehow it didn't
>>> reach the list.
>>>
>>> This fixes the problem when qemu continues even if -drive specification
>>> is somehow invalid, resulting in a mess.  Applicable for both current
>>> master and for stable-0.14 (and 0.13 and 0.12 as well).
>>>
>>> The prob can actually be seriuos: when you start guest with two drives
>>> and make an error in the specification of one of them, and the guest
>>> has something like a raid array on the two drives, guest may start failing
>>> that array or kick "missing" drives which may result in a mess - this is
>>> what actually happened to me, I did't want a resync at all, and a resync
>>> resulted in re-writing (and allocating) a 4TB virtual drive I used for
>>> testing, which in turn resulted in my filesystem filling up and whole
>>> thing failing badly.  Yes it was just testing VM, I experimented with
>>> larger raid arrays, but the end result was quite, well, unexpected.
>>>
>>> Thanks!
>>>
>>> Signed-off-by: Michael Tokarev 
>>
>> Before I got a message like this and the guest started anyway:
>>
>> qemu-system-x86_64: -drive asdfj: Invalid parameter 'asdfj'
>>
>> Now it exits like it should, but I don't get an error message any more.
> [...]
> 
> Are you sure?  I still get the error message in my testing.

You're right. Instead of using git am to apply the patch, I just edited
vl.c manually and probably messed up. The patch works.

So it's only the braces.

Kevin

Re: [Qemu-devel] [PATCH v21 01/11] trace: move trace objects from Makefile to Makefile.objs

2011-03-17 Thread Alon Levy

On Thu, Mar 17, 2011 at 04:45:15PM +, Stefan Hajnoczi wrote:
> On Thu, Mar 17, 2011 at 3:00 PM, Alon Levy  wrote:
> > On Thu, Mar 17, 2011 at 04:49:26PM +0200, Alon Levy wrote:
> >> ---
> >>  Makefile      |   32 
> >>  Makefile.objs |   32 
> >>  2 files changed, 32 insertions(+), 32 deletions(-)
> >
> > The commit message can be a little more verbose :(
> >
> > It will be:
> >
> > allow libcacard (later commit) to make use of the rules for
> > building trace objects, to build libcacard/vscclient (and
> > later still libcacard/libcacard.so).
> 
> I will test this patch and get back to you within the next two or three days.
Thanks, much appreciated.

Alon

> 
> Stefan
>

Re: [Qemu-devel] [PATCH v3 4/4] hw/qxl-render: drop cursor locks, replace with pipe

2011-03-17 Thread Alon Levy

On Thu, Mar 17, 2011 at 03:19:28PM +0100, Jes Sorensen wrote:
> On 03/17/11 11:45, Alon Levy wrote:
> > On Thu, Mar 17, 2011 at 11:29:03AM +0100, Jes Sorensen wrote:
> >> On 03/17/11 11:27, Alon Levy wrote:
> >>> On Thu, Mar 17, 2011 at 10:48:43AM +0100, Jes Sorensen wrote:
> > Same for the asserts below, writes are from spice server thread, reads
> > are in iothread.
> 
>  But shouldn't this make it try to reconnect? Even if the reconnect
>  fails, it shouldn't kill the guest IMHO.
> >>>
> >>> reconnect? between two threads in the qemu process? why would the write
> >>> fail to begin with? this is like saying if I'm failing a kvm ioctl I 
> >>> should
> >>> just retry.
> >>
> >> Ah ok, I missed that part, somehow I had in my mind it was two different
> >> processes, despite you mentioning threads.
> >>
> >> Still if gfx handling fails, it shouldn't nuke the guest.
> > 
> > ok, try to apply that logic to any other device - network, usb, etc., I 
> > don't
> > think it holds.
> 
> Maybe I am looking at the wrong angle - I would think that is network or
> usb breaks, we would still keep running, and for gfx the guest should be
> able to keep running even if the monitor is disconnected.
> 
> It's not a big issue so if you feel it is fine as is, I won't object.

I think it could be marginally useful - I mean if someone is running a vm with
spice it's for desktop use, and that means graphics is important. of course they
could be running a server vm and have it for debugging, but that would
be silly.

This could hide an actual bug, by not aborting here. Not a major point.

I'm also not sure what's the right thing to do - turn on a flag for stopping
handling requests? it would just open up a whole set of states we don't handle
currently, half-working states.

> 
> Cheers,
> Jes
>

Re: [Qemu-devel] [PATCH +STABLE-0.14] exit if -drive specified is invalid instead of ignoring the "wrong" -drive

2011-03-17 Thread Markus Armbruster

Michael Tokarev  writes:

> 17.03.2011 16:51, Markus Armbruster wrote:
>> Michael Tokarev  writes:
>> 
>>> Trivial patch.  I've sent it yesterday but somehow it didn't
>>> reach the list.
>>>
>>> This fixes the problem when qemu continues even if -drive specification
>>> is somehow invalid, resulting in a mess.  Applicable for both current
>>> master and for stable-0.14 (and 0.13 and 0.12 as well).
>> 
>> Note patch doesn't apply to 0.12 and 0.13.
>
> Yes it doesn't, since to 0.14 the code changed.
> What I mean is that the same problem exist in
> earlier versions too.  I'll apply a change of
> this effect to 0.12.5 as used in Debian now,
> something like the one below.
>
> []
>> What about all the other unchecked drive_add() calls in main()?
>
> These are much less worrisome - they fail only of the
> internal definitions of options are incorrect, which
> should never happen.  For example:
>
> case QEMU_OPTION_hdd:
> drive_add(IF_DEFAULT, popt->index - QEMU_OPTION_hda, optarg,
>   HD_OPTS);
>
> There, optarg is just a filename, and HD_OPTS is
> defined like this:
>
> #define HD_OPTS "media=disk"
>
> So it should not fail when parsing options, only
> when trying to interpret and actually open the
> filename, which happens much later in the game.

Fair enough.

Re: [Qemu-devel] Re: [PATCH 00/15] QAPI Round 1 (core code generator) (v2)

2011-03-17 Thread Anthony Liguori


On 03/17/2011 09:04 AM, Kevin Wolf wrote:



No, the problem with the old events is that they aren't
registered/maskable.  So even if you don't care about BLOCK_IO_ERROR,
you're getting the notification.  Plus, we'd like to add the ability to
add a tag to events when we register them.

What's the problem with registering them?


You could add a simple mask/unmask event to the current protocol.  But 
that doesn't let you filter events for a given object.  As I mentioned 
in the previous note, I would like to be able to only receive 
BLOCK_IO_ERROR for certain devices.



  If you want to stop client
from doing this you must introduce a special case for obsolete events
that cannot be registered. Do we gain anything from this?


Actually, the old style events are distinct from the new signals.  They 
cannot be registered/unregistered.  My intention is for new clients to 
just ignore that they exist.



The other problem is that events are all global today.  BLOCK_IO_ERROR
is a good example of this.  It's really an error that's specific to a
block device and it passes the name of the block device that it's
specific to as an argument.  But if we have a masking mechanism it could
only globally enable/disable BLOCK_IO_ERROR for all devices.

It would be much nicer to be able to enable the event for specific block
devices.  This requires some protocol visible changes.  I'm still
writing up these changes but hope to have something for review soon.

I wonder if the old, more generic event couldn't be generated
automatically if the more specific signal is triggered in the block code.


Yes, this is what I'm going to try to do.  This is already how the 
patches I sent out work but now that I have the new signal model, it 
requires a little more tweaking.


But this is the advantage of being able to consume interfaces within 
QEMU.  As long as the block layer exposes a signal mechanism that 
provides the same information, we can hide the old style events entirely 
within the QMP layer by registering for the new style signals and 
created old style events.



   but just that it
doesn't map to the C interface in the way you like.

I think I've maybe been using "C interface" to much.  The current event
wire protocol doesn't map to any client interface well.

If you mean their broadcast style, that's not really related to nesting
or struct vs. argument.


I meant structs vs. arguments too.

I'd like to do:

import qmp

def io_error(blockdev, operation, action):
   if operation == 'stop':
   

srv = qmp.ServerProxy(...)

srv.connect('IO_ERROR', io_error)

Python has the same problem re: arguments.  If we add a new argument to 
a signal, then it breaks the Python client library.  At the same time, 
when possible, it's nice to use argument syntax because the alternative is:


import qmp

def io_error(args):
if args['action'] == 'stop':
   

Which isn't quite as nice to me.  So I don't want to preclude having the 
ability to have a signal with a few arguments but we just have to be 
careful that we're not goign to need to add more arguments later.  If we 
do, then we should make sure there's a structure there too.


Regards,

Anthony Liguori


Kevin

Re: [Qemu-devel] [PATCH 2/4] block: add a helper to change writeback mode on the fly

2011-03-17 Thread Stefan Hajnoczi

On Thu, Mar 17, 2011 at 3:11 PM, Kevin Wolf  wrote:
> Am 17.03.2011 15:44, schrieb Daniel P. Berrange:
>> On Tue, Mar 15, 2011 at 03:11:32PM +0100, Christoph Hellwig wrote:
>>> Add a new bdrv_change_cache that can set/clear the writeback flag
>>> at runtime by stopping all I/O and closing/reopening the image file.
>>>
>>> All code is based on a patch from Prerna Saxena 
>>> with minimal refactoring.
>>>
>>> Signed-off-by: Christoph Hellwig 
>>>
>>>
>>> +static int bdrv_reopen(BlockDriverState *bs, int bdrv_flags)
>>> +{
>>> +    BlockDriver *drv = bs->drv;
>>> +    int ret;
>>> +
>>> +    if (bdrv_flags == bs->open_flags) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    /* Quiesce IO for the given block device */
>>> +    qemu_aio_flush();
>>> +    bdrv_flush(bs);
>>> +
>>> +    bdrv_close(bs);
>>> +    ret = bdrv_open(bs, bs->filename, bdrv_flags, drv);
>>> +
>>> +    /*
>>> +     * A failed attempt to reopen the image file must lead to 'abort()'
>>> +     */
>>> +    if (ret != 0) {
>>> +        abort();
>>> +    }
>>> +
>>> +    return ret;
>>> +}
>>
>>
>>
>>> +
>>> +int bdrv_change_cache(BlockDriverState *bs, bool enable)
>>> +{
>>> +    int bdrv_flags = 0;
>>> +
>>> +    bdrv_flags = bs->open_flags & ~BDRV_O_CACHE_WB;
>>> +    if (enable) {
>>> +        bdrv_flags |= BDRV_O_CACHE_WB;
>>> +    }
>>> +
>>> +    return bdrv_reopen(bs, bdrv_flags);
>>> +}
>>
>> Is there any way we can manage todo this *without* closing and
>> re-opening the file descriptor ?
>>
>> One of the things we're considering for the future is to enable
>> QEMU to be passed open FD(s) for its drives, from the management
>> system, instead of having QEMU open the files itself.
>
> What about cache mode, read-only flags etc. that are set with the right
> flags during the open? Must qemu just assume that the management has
> done the right thing?
>
> And what about things like backing files or other information that
> depends on the content of the image file?
>
>> If QEMU expects to close+reopen any of its disks at any time the
>> guest requests, this will complicate life somewhat
>
> It does expect this. For example for making backing files temporarily
> read-write during a 'commit' monitor command, we already reopen the
> image. (Let's hope nobody uses -snapshot, live snapshots and commit, he
> would lose his disk...)

I think we need to use the most robust solution possible.  We don't
want to get into a situation that can be avoided.  Especially in cases
where human errors can (==will) be made.

I suggested using /proc/$pid/fd to reopen an existing file descriptor.
 That interface is only available on Linux and possibly some of
{Solaris, *BSD, OSX}.  So there needs to be a fallback, but in this
situation it feels right to take advantage of whatever means are
provided by the host OS to make safe transitions between image files.

Stefan

Re: [Qemu-devel] [RFC] QCFG: a new mechanism to replace QemuOpts and option handling

2011-03-17 Thread Anthony Liguori


On 03/17/2011 10:22 AM, Markus Armbruster wrote:



void qcfg_handle_vnc(VncConfig *option, Error **errp)
{
}

And that's it.  You can squirrel away the option such that they all
can be processed later, you can perform additional validation and
return an error, or you can implement the appropriate logic.

The VncConfig structure is a proper C data structure.  The advantages
of this approach compared to QemuOpts are similar to QAPI:

1) Strong typing means less bugs with lack of command line validation.
In many cases, a bad command line results in a SEGV today.

Static typing, you mean.


Both.  We don't consistently check error returns when dealing with 
QemuOpts so while it does implement dynamic typing, the dynamic typing 
is not consistently enforced.



2) Every option is formally specified and documented in a way that is
both rigorous and machine readable.  This means we can generate high
quality documentation in a variety of formats.

You make it sound like QemuOpts wouldn't specify options.  It does.
Where your proposal differs from QemuOpts, in my opinion, is that it
uses C types rather than QemuOpts' very own ad hoc type system, and is
more expressive, see your 6) below.


My point in that bullet is that the documentation is machine readable 
and highly structured.



3) The command line parameters support full introspection.  This
should provide the same functionality as Dan's earlier introspection
patches.

Again, this isn't something QemuOpts could not do.  We just neglected to
make its option specification available outside QEMU, mostly because for
lack of consensus on what we want to expose there.


It's all just a matter of bits, right :-)

But QemuOpts is just about parsing a string, it doesn't really 
explicitly tie to a command line.  Another layer would be needed to 
clearly map that option foo is associated with this QemuOpts.


So yeah, it could grow in that direction but there's a lot more to it 
then just returning a list of names and ties given a qemu opts definition.




5) Very complex data types can be implemented.  We had some discussion
of supporting nested structures with -blockdev.  This wouldn't work
with QemuOpts but I've already implemented it with QCFG (blockdev
syntax is my test case right now).  The syntax I'm currently using is
-blockdev
cache=none,id=foo,format.qcow.protocol.nbd.hostname=localhost where
.' is used to reference sub structures.

We might want some syntactic sugar so that users don't have to repeat
format.qcow.protocol.nbd.FOO for every FOO they want to configure.


-set will still function in this model as key/value parsing and 
translation to a C structure are two separate steps.


But yeah, I think we ought to brain storm ways to simplify blockdev 
because the structure is very complex.  But that's a separate discussion.



Initial code is in my QAPI tree.

I'm not going to start converting things until we get closer to the
end of 0.15 and QAPI is fully merged.  My plan is to focus on this for
0.16 and do a full conversion for the 0.16 time frame using the same
approach as QAPI.  That means that for 0.16, we would be able to set
all command line options via QMP in a programmatic fashion with full
support for introspection.

I'm haven't yet closed on how to bridge this to qdev.  qdev is a big
consumer of QemuOpts today.  I have some general ideas about what I'd
like to do but so far, I haven't written anything down.

Glad you mention qdev.

qdev properties describe the configurable parts of qdev state objects.
A state object is a C struct.  The description is C data.  Poor man's
reflection.


Right.  The problem is it's very hard to reflect C even if you parse it 
without additional annotations.  For instance:


typedef struct Foo {
Bar *bar;
} Foo;

What the type of bar is is ambigious.  It could be a pointer to a list 
of Bar's (if bar has an embedded pointer), it could be an array of Bars 
that is terminated using a field within Bar, it could be a pointer to a 
fixed size array of Bars, or it could be just a pointer to a single Bar 
object.


So you end up needing additional annotations like:

typedef struct Foo {
   size_t n_bar;
   Bar *bar sizeis(n_bar);
} Foo;

This is what most IDLs that use C style syntax do.


qdev needs to parse and print a device's properties.  It uses QemuOpts
to parse NAME=VALUE,... option strings into a list of (NAME, VALUE), and
callbacks to parse the VALUEs.  Awkward, especially when you go QMP ->
option string ->  qdev struct.


Indeed.  You can do very weird things like pass a float through qdev and 
it appears as a (valid) string type with QemuOpts.


But qdev properties are a little odder semantically.  qdev properties 
are construct-only and read-only.   They are construct-only because they 
are set implicitly in the object before its created as a sort of default 
value.


It's easy to bridge QCFG/QMP to this but I've been thinking of going a 
step further.  I've been thinking of taking the current qdev prope

[Qemu-devel] QEMU e1000 driver loopback mode supported?

2011-03-17 Thread amy chen

Hi, I have experienced the loopback mode failed on the interfaces. Just
wondering is loopback mode is supported on the QEMU e1000 driver?

Thanks,
Amy

[Qemu-devel] [PATCH 1/3] use kernel-provided para_features instead of statically coming up with new capabilities

2011-03-17 Thread Glauber Costa

According to Avi's comments over my last submission, I decided to take a
different, and more correct direction - we hope.

This patch is now using the features provided by KVM_GET_SUPPORTED_CPUID 
directly to
mask out features from guest-visible cpuid.

The old get_para_features() mechanism is kept for older kernels that do not 
implement it.

Signed-off-by: Glauber Costa 
---
 target-i386/kvm.c |   76 +++-
 1 files changed, 45 insertions(+), 31 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index da757fa..dc1e547 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -95,6 +95,35 @@ static struct kvm_cpuid2 *try_get_cpuid(KVMState *s, int max)
 return cpuid;
 }
 
+#ifdef CONFIG_KVM_PARA
+struct kvm_para_features {
+int cap;
+int feature;
+} para_features[] = {
+{ KVM_CAP_CLOCKSOURCE, KVM_FEATURE_CLOCKSOURCE },
+{ KVM_CAP_NOP_IO_DELAY, KVM_FEATURE_NOP_IO_DELAY },
+{ KVM_CAP_PV_MMU, KVM_FEATURE_MMU_OP },
+#ifdef KVM_CAP_ASYNC_PF
+{ KVM_CAP_ASYNC_PF, KVM_FEATURE_ASYNC_PF },
+#endif
+{ -1, -1 }
+};
+
+static int get_para_features(CPUState *env)
+{
+int i, features = 0;
+
+for (i = 0; i < ARRAY_SIZE(para_features) - 1; i++) {
+if (kvm_check_extension(env->kvm_state, para_features[i].cap)) {
+features |= (1 << para_features[i].feature);
+}
+}
+
+return features;
+}
+#endif
+
+
 uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function,
   uint32_t index, int reg)
 {
@@ -102,6 +131,7 @@ uint32_t kvm_arch_get_supported_cpuid(CPUState *env, 
uint32_t function,
 int i, max;
 uint32_t ret = 0;
 uint32_t cpuid_1_edx;
+int has_kvm_features = 0;
 
 max = 1;
 while ((cpuid = try_get_cpuid(env->kvm_state, max)) == NULL) {
@@ -111,6 +141,9 @@ uint32_t kvm_arch_get_supported_cpuid(CPUState *env, 
uint32_t function,
 for (i = 0; i < cpuid->nent; ++i) {
 if (cpuid->entries[i].function == function &&
 cpuid->entries[i].index == index) {
+if (cpuid->entries[i].function == KVM_CPUID_FEATURES) {
+has_kvm_features = 1;
+}
 switch (reg) {
 case R_EAX:
 ret = cpuid->entries[i].eax;
@@ -141,41 +174,16 @@ uint32_t kvm_arch_get_supported_cpuid(CPUState *env, 
uint32_t function,
 }
 }
 
+/* fallback for older kernels */
+if (!has_kvm_features && (function == KVM_CPUID_FEATURES)) {
+ret = get_para_features(env);
+}
+
 qemu_free(cpuid);
 
 return ret;
 }
 
-#ifdef CONFIG_KVM_PARA
-struct kvm_para_features {
-int cap;
-int feature;
-} para_features[] = {
-{ KVM_CAP_CLOCKSOURCE, KVM_FEATURE_CLOCKSOURCE },
-{ KVM_CAP_NOP_IO_DELAY, KVM_FEATURE_NOP_IO_DELAY },
-{ KVM_CAP_PV_MMU, KVM_FEATURE_MMU_OP },
-#ifdef KVM_CAP_ASYNC_PF
-{ KVM_CAP_ASYNC_PF, KVM_FEATURE_ASYNC_PF },
-#endif
-{ -1, -1 }
-};
-
-static int get_para_features(CPUState *env)
-{
-int i, features = 0;
-
-for (i = 0; i < ARRAY_SIZE(para_features) - 1; i++) {
-if (kvm_check_extension(env->kvm_state, para_features[i].cap)) {
-features |= (1 << para_features[i].feature);
-}
-}
-#ifdef KVM_CAP_ASYNC_PF
-has_msr_async_pf_en = features & (1 << KVM_FEATURE_ASYNC_PF);
-#endif
-return features;
-}
-#endif
-
 #ifdef KVM_CAP_MCE
 static int kvm_get_mce_cap_supported(KVMState *s, uint64_t *mce_cap,
  int *max_banks)
@@ -363,7 +371,13 @@ int kvm_arch_init_vcpu(CPUState *env)
 c = &cpuid_data.entries[cpuid_i++];
 memset(c, 0, sizeof(*c));
 c->function = KVM_CPUID_FEATURES;
-c->eax = env->cpuid_kvm_features & get_para_features(env);
+c->eax = env->cpuid_kvm_features & kvm_arch_get_supported_cpuid(env,
+KVM_CPUID_FEATURES, 0, R_EAX);
+
+#ifdef KVM_CAP_ASYNC_PF
+has_msr_async_pf_en = c->eax & (1 << KVM_FEATURE_ASYNC_PF);
+#endif
+
 #endif
 
 cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
-- 
1.7.2.3

[Qemu-devel] [PATCH 0/3] enable newer msr set for kvm

2011-03-17 Thread Glauber Costa

This patch is a follow up to an earlier one that aims to enable
kvmclock newer msr set. This time I'm doing it through a more sane
mechanism of consulting the kernel about the supported msr set.

Glauber Costa (3):
  use kernel-provided para_features instead of statically coming up
with new capabilities
  add kvmclock to its second bit
  don't create kvmclock when one of the flags are present.

 hw/kvmclock.c   |6 +++-
 target-i386/cpuid.c |3 +-
 target-i386/kvm.c   |   76 ++-
 3 files changed, 51 insertions(+), 34 deletions(-)

-- 
1.7.2.3

[Qemu-devel] [PATCH 3/3] don't create kvmclock when one of the flags are present.

2011-03-17 Thread Glauber Costa

kvmclock presence can be signalled by two different flags. So for
device creation, we have to test for both.

Signed-off-by: Glauber Costa 
---
 hw/kvmclock.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/hw/kvmclock.c b/hw/kvmclock.c
index b6ceddf..004c4ad 100644
--- a/hw/kvmclock.c
+++ b/hw/kvmclock.c
@@ -103,7 +103,11 @@ static SysBusDeviceInfo kvmclock_info = {
 void kvmclock_create(void)
 {
 if (kvm_enabled() &&
-first_cpu->cpuid_kvm_features & (1ULL << KVM_FEATURE_CLOCKSOURCE)) {
+first_cpu->cpuid_kvm_features & ((1ULL << KVM_FEATURE_CLOCKSOURCE)
+#ifdef KVM_FEATURE_CLOCKSOURCE2
+|| (1ULL << KVM_FEATURE_CLOCKSOURCE2)
+#endif
+)) {
 sysbus_create_simple("kvmclock", -1, NULL);
 }
 }
-- 
1.7.2.3

[Qemu-devel] [PATCH 2/3] add kvmclock to its second bit

2011-03-17 Thread Glauber Costa

We have two bits that can represent kvmclock in cpuid.
They signal the guest which msr set to use. When we tweak flags
involving this value - specially when we use "-", we have to act on both.

Besides adding it to the kvm features list, we also have to "break" the
assumption represented by the break in lookup_feature.

Signed-off-by: Glauber Costa 
---
 target-i386/cpuid.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index d28de20..48f9bbd 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -73,7 +73,7 @@ static const char *ext3_feature_name[] = {
 };
 
 static const char *kvm_feature_name[] = {
-"kvmclock", "kvm_nopiodelay", "kvm_mmu", NULL, "kvm_asyncpf", NULL, NULL, 
NULL,
+"kvmclock", "kvm_nopiodelay", "kvm_mmu", "kvmclock", "kvm_asyncpf", NULL, 
NULL, NULL,
 NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
 NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
 NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
@@ -193,7 +193,6 @@ static int lookup_feature(uint32_t *pval, const char *s, 
const char *e,
 for (mask = 1, ppc = featureset; mask; mask <<= 1, ++ppc)
 if (*ppc && !altcmp(s, e, *ppc)) {
 *pval |= mask;
-break;
 }
 return (mask ? 1 : 0);
 }
-- 
1.7.2.3

[Qemu-devel] [PATCH v2 0/3] piix_pci: optimize irq data path

2011-03-17 Thread Isaku Yamahata

This patch series optimizes irq data path of piix_pci.
So far piix3 tracks each pirq level and checks whether a given pic pins is
asserted by seeing if each pirq is mapped into the pic pin.
This is independent on irq routing, but data path is on slow path.

Given that irq routing is rarely changed and asserting pic pins is on
data path, the path that asserts pic pins should be optimized and
chainging irq routing should be on slow path.
The new behavior with this patch series is to use bitmap which is addressed
by pirq and pic pins with a given irq routing.
When pirq is asserted, the bitmap is set and see if the pic pins is
asserted by checking the bitmaps.
When irq routing is changed, rebuild the bitmap and re-assert pic pins.

Changes v1 -> v2:
- addressed review comments.

Isaku Yamahata (3):
  pci: add accessor function to get irq levels
  piix_pci: eliminate PIIX3State::pci_irq_levels
  piix_pci: optimize set irq path

 hw/pci.c  |7 +++
 hw/pci.h  |1 +
 hw/piix_pci.c |  117 +++-
 3 files changed, 106 insertions(+), 19 deletions(-)

[Qemu-devel] [PATCH v2 3/3] piix_pci: optimize set irq path

2011-03-17 Thread Isaku Yamahata

optimize irq routing in piix_pic.c which has been a TODO.
So far piix3 tracks each pirq level and checks whether a given pic pins is
asserted by seeing if each pirq is mapped into the pic pin.
This is independent on irq routing, but data path is on slow path.

Given that irq routing is rarely changed and asserting pic pins is on
data path, the path that asserts pic pins should be optimized and
chainging irq routing should be on slow path.
The new behavior with this patch series is to use bitmap which is addressed
by pirq and pic pins with a given irq routing.
When pirq is asserted, the bitmap is set and see if the pic pins is
asserted by checking the bitmaps.
When irq routing is changed, rebuild the bitmap and re-assert pic pins.

Cc: Michael S. Tsirkin 
Signed-off-by: Isaku Yamahata 
---
 hw/piix_pci.c |   94 +
 1 files changed, 81 insertions(+), 13 deletions(-)

diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index a1d1f55..aa9cf6c 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -37,8 +37,27 @@
 
 typedef PCIHostState I440FXState;
 
+#define PIIX_NUM_PIC_IRQS   16  /* i8259 * 2 */
+#define PIIX_NUM_PIRQS  4ULL/* PIRQ[A-D] */
+#define PIIX_PIRQC  0x60
+
 typedef struct PIIX3State {
 PCIDevice dev;
+
+/*
+ * bitmap to track pic levels.
+ * The pic level is the logical OR of all the PCI irqs mapped to it
+ * So one PIC level is tracked by PIIX_NUM_PIRQS bits.
+ *
+ * PIRQ is mapped to PIC pins, we track it by
+ * PIIX_NUM_PIRQS * PIIX_NUM_PIC_IRQS = 64 bits with
+ * pic_irq * PIIX_NUM_PIRQS + pirq
+ */
+#if PIIX_NUM_PIC_IRQS * PIIX_NUM_PIRQS > 64
+#error "unable to encode pic state in 64bit in pic_levels."
+#endif
+uint64_t pic_levels;
+
 int32_t dummy_for_save_load_compat[4];
 qemu_irq *pic;
 } PIIX3State;
@@ -252,25 +271,63 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int 
*piix3_devfn, qemu_irq *
 }
 
 /* PIIX3 PCI to ISA bridge */
+static void piix3_set_irq_pic(PIIX3State *piix3, int pic_irq)
+{
+qemu_set_irq(piix3->pic[pic_irq],
+ !!(piix3->pic_levels &
+((PIIX_NUM_PIRQS - 1) << (pic_irq * PIIX_NUM_PIRQS;
+}
+
+static void piix3_set_irq_level(PIIX3State *piix3, int irq_num, int level,
+bool propagate)
+{
+int pic_irq;
+uint64_t mask;
+
+pic_irq = piix3->dev.config[PIIX_PIRQC + irq_num];
+if (pic_irq >= PIIX_NUM_PIC_IRQS) {
+return;
+}
+
+mask = 1ULL << ((pic_irq * PIIX_NUM_PIRQS) + irq_num);
+piix3->pic_levels &= ~mask;
+piix3->pic_levels |= mask * !!level;
+
+if (propagate) {
+piix3_set_irq_pic(piix3, pic_irq);
+}
+}
 
 static void piix3_set_irq(void *opaque, int irq_num, int level)
 {
-int i, pic_irq, pic_level;
 PIIX3State *piix3 = opaque;
+piix3_set_irq_level(piix3, irq_num, level, true);
+}
 
-/* now we change the pic irq level according to the piix irq mappings */
-/* XXX: optimize */
-pic_irq = piix3->dev.config[0x60 + irq_num];
-if (pic_irq < 16) {
-/* The pic level is the logical OR of all the PCI irqs mapped
-   to it */
-pic_level = 0;
-for (i = 0; i < 4; i++) {
-if (pic_irq == piix3->dev.config[0x60 + i]) {
-pic_level |= pci_bus_get_irq_level(piix3->dev.bus, i);
-}
+/* irq routing is changed. so rebuild bitmap */
+static void piix3_update_irq_levels(PIIX3State *piix3)
+{
+int pirq;
+
+piix3->pic_levels = 0;
+for (pirq = 0; pirq < PIIX_NUM_PIRQS; pirq++) {
+piix3_set_irq_level(piix3, pirq,
+pci_bus_get_irq_level(piix3->dev.bus, pirq),
+false);
+}
+}
+
+static void piix3_write_config(PCIDevice *dev,
+   uint32_t address, uint32_t val, int len)
+{
+pci_default_write_config(dev, address, val, len);
+if (ranges_overlap(address, len, PIIX_PIRQC, 4)) {
+PIIX3State *piix3 = DO_UPCAST(PIIX3State, dev, dev);
+int pic_irq;
+piix3_update_irq_levels(piix3);
+for (pic_irq = 0; pic_irq < PIIX_NUM_PIC_IRQS; pic_irq++) {
+piix3_set_irq_pic(piix3, pic_irq);
 }
-qemu_set_irq(piix3->pic[pic_irq], pic_level);
 }
 }
 
@@ -310,6 +367,15 @@ static void piix3_reset(void *opaque)
 pci_conf[0xab] = 0x00;
 pci_conf[0xac] = 0x00;
 pci_conf[0xae] = 0x00;
+
+d->pic_levels = 0;
+}
+
+static int piix3_post_load(void *opaque, int version_id)
+{
+PIIX3State *piix3 = opaque;
+piix3_update_irq_levels(piix3);
+return 0;
 }
 
 static void piix3_pre_save(void *opaque)
@@ -328,6 +394,7 @@ static const VMStateDescription vmstate_piix3 = {
 .version_id = 3,
 .minimum_version_id = 2,
 .minimum_version_id_old = 2,
+.post_load = piix3_post_load,
 .pre_save = piix3_pre_save,
 .fields  = (VMStateField []) {
 VMSTATE_PCI_DEVICE

[Qemu-devel] [PATCH v2 2/3] piix_pci: eliminate PIIX3State::pci_irq_levels

2011-03-17 Thread Isaku Yamahata

PIIX3State::pci_irq_levels are redundant which is already tracked by
PCIBus layer. So eliminate them.

Cc: Michael S. Tsirkin 
Signed-off-by: Isaku Yamahata 
---
 hw/piix_pci.c |   31 +--
 1 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 358da58..a1d1f55 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -39,7 +39,7 @@ typedef PCIHostState I440FXState;
 
 typedef struct PIIX3State {
 PCIDevice dev;
-int pci_irq_levels[4];
+int32_t dummy_for_save_load_compat[4];
 qemu_irq *pic;
 } PIIX3State;
 
@@ -162,9 +162,11 @@ static int i440fx_load_old(QEMUFile* f, void *opaque, int 
version_id)
 i440fx_update_memory_mappings(d);
 qemu_get_8s(f, &d->smm_enabled);
 
-if (version_id == 2)
-for (i = 0; i < 4; i++)
-d->piix3->pci_irq_levels[i] = qemu_get_be32(f);
+if (version_id == 2) {
+for (i = 0; i < 4; i++) {
+qemu_get_be32(f); /* dummy load for compatibility */
+}
+}
 
 return 0;
 }
@@ -256,8 +258,6 @@ static void piix3_set_irq(void *opaque, int irq_num, int 
level)
 int i, pic_irq, pic_level;
 PIIX3State *piix3 = opaque;
 
-piix3->pci_irq_levels[irq_num] = level;
-
 /* now we change the pic irq level according to the piix irq mappings */
 /* XXX: optimize */
 pic_irq = piix3->dev.config[0x60 + irq_num];
@@ -266,8 +266,9 @@ static void piix3_set_irq(void *opaque, int irq_num, int 
level)
to it */
 pic_level = 0;
 for (i = 0; i < 4; i++) {
-if (pic_irq == piix3->dev.config[0x60 + i])
-pic_level |= piix3->pci_irq_levels[i];
+if (pic_irq == piix3->dev.config[0x60 + i]) {
+pic_level |= pci_bus_get_irq_level(piix3->dev.bus, i);
+}
 }
 qemu_set_irq(piix3->pic[pic_irq], pic_level);
 }
@@ -309,8 +310,17 @@ static void piix3_reset(void *opaque)
 pci_conf[0xab] = 0x00;
 pci_conf[0xac] = 0x00;
 pci_conf[0xae] = 0x00;
+}
 
-memset(d->pci_irq_levels, 0, sizeof(d->pci_irq_levels));
+static void piix3_pre_save(void *opaque)
+{
+int i;
+PIIX3State *piix3 = opaque;
+
+for (i = 0; i < ARRAY_SIZE(piix3->dummy_for_save_load_compat); i++) {
+piix3->dummy_for_save_load_compat[i] =
+pci_bus_get_irq_level(piix3->dev.bus, i);
+}
 }
 
 static const VMStateDescription vmstate_piix3 = {
@@ -318,9 +328,10 @@ static const VMStateDescription vmstate_piix3 = {
 .version_id = 3,
 .minimum_version_id = 2,
 .minimum_version_id_old = 2,
+.pre_save = piix3_pre_save,
 .fields  = (VMStateField []) {
 VMSTATE_PCI_DEVICE(dev, PIIX3State),
-VMSTATE_INT32_ARRAY_V(pci_irq_levels, PIIX3State, 4, 3),
+VMSTATE_INT32_ARRAY_V(dummy_for_save_load_compat, PIIX3State, 4, 3),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
1.7.1.1

[Qemu-devel] Re: [PATCH 3/3] piix_pci: optimize set irq path

2011-03-17 Thread Isaku Yamahata

On Thu, Mar 17, 2011 at 04:41:08PM +0200, Michael S. Tsirkin wrote:

> > +static int piix3_post_load(void *opaque, int version_id)
> > +{
> > +PIIX3State *piix3 = opaque;
> > +piix3_rebuild_irq_levels(piix3);
> 
> Don't we need to set_irq_pic here as well?
> And in that case, just make the for loop
> part of piix3_rebuild_irq_levels.

No, we need not. Because pic irq state is saved/loaded by pic itself.
So todo here is to make pic_levels consistent with other part.
-- 
yamahata

[Qemu-devel] [PATCH v2 1/3] pci: add accessor function to get irq levels

2011-03-17 Thread Isaku Yamahata

Introduce accessor function to know INTx levels.
It will be used later by q35.
Although piix_pci tracks the intx line levels, it can be eliminated
by this helper function.

Cc: Michael S. Tsirkin 
Signed-off-by: Isaku Yamahata 
---
 hw/pci.c |7 +++
 hw/pci.h |1 +
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 8b76cea..6ad3f10 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -126,6 +126,13 @@ static void pci_change_irq_level(PCIDevice *pci_dev, int 
irq_num, int change)
 bus->set_irq(bus->irq_opaque, irq_num, bus->irq_count[irq_num] != 0);
 }
 
+int pci_bus_get_irq_level(PCIBus *bus, int irq_num)
+{
+assert(irq_num >= 0);
+assert(irq_num < bus->nirq);
+return !!bus->irq_count[irq_num];
+}
+
 /* Update interrupt status bit in config space on interrupt
  * state change. */
 static void pci_update_irq_status(PCIDevice *dev)
diff --git a/hw/pci.h b/hw/pci.h
index 113e556..092a463 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -233,6 +233,7 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
 PCIBus *pci_bus_new(DeviceState *parent, const char *name, int devfn_min);
 void pci_bus_irqs(PCIBus *bus, pci_set_irq_fn set_irq, pci_map_irq_fn map_irq,
   void *irq_opaque, int nirq);
+int pci_bus_get_irq_level(PCIBus *bus, int irq_num);
 void pci_bus_hotplug(PCIBus *bus, pci_hotplug_fn hotplug, DeviceState *dev);
 PCIBus *pci_register_bus(DeviceState *parent, const char *name,
  pci_set_irq_fn set_irq, pci_map_irq_fn map_irq,
-- 
1.7.1.1

[Qemu-devel] Re: [PATCH] v3 revamp acpitable parsing and allow to specify complete (headerful) table

2011-03-17 Thread Isaku Yamahata

It looks good, except the unnecessary black line after return 0;

Reviewed-by: Isaku Yamahata 

On Thu, Mar 17, 2011 at 01:00:54PM +0300, Michael Tokarev wrote:
> This patch almost rewrites acpi_table_add() function
> (but still leaves it using old get_param_value() interface).
> The result is that it's now possible to specify whole table
> (together with a header) in an external file, instead of just
> data portion, with a new file= parameter, but at the same time
> it's still possible to specify header fields as before.
> 
> Now with the checkpatch.pl formatting fixes, thanks to
> Stefan Hajnoczi for suggestions, with changes from
> Isaku Yamahata, and with my further refinements.
> 
> Signed-off-by: Michael Tokarev 
> ---
>  hw/acpi.c   |  285 
> +++
>  qemu-options.hx |7 +-
>  2 files changed, 168 insertions(+), 124 deletions(-)
> 
> diff --git a/hw/acpi.c b/hw/acpi.c
> index 237526d..4cc0311 100644
> --- a/hw/acpi.c
> +++ b/hw/acpi.c
> @@ -15,6 +15,7 @@
>   * You should have received a copy of the GNU Lesser General Public
>   * License along with this library; if not, see 
> 
>   */
> +
>  #include "hw.h"
>  #include "pc.h"
>  #include "acpi.h"
> @@ -24,17 +25,29 @@
>  
>  struct acpi_table_header
>  {
> -char signature [4];/* ACPI signature (4 ASCII characters) */
> +uint16_t _length; /* our length, not actual part of the hdr */
> +  /* XXX why we have 2 length fields here? */
> +char sig[4];  /* ACPI signature (4 ASCII characters) */
>  uint32_t length;  /* Length of table, in bytes, including header 
> */
>  uint8_t revision; /* ACPI Specification minor version # */
>  uint8_t checksum; /* To make sum of entire table == 0 */
> -char oem_id [6];   /* OEM identification */
> -char oem_table_id [8]; /* OEM table identification */
> +char oem_id[6];   /* OEM identification */
> +char oem_table_id[8]; /* OEM table identification */
>  uint32_t oem_revision;/* OEM revision number */
> -char asl_compiler_id [4]; /* ASL compiler vendor ID */
> +char asl_compiler_id[4];  /* ASL compiler vendor ID */
>  uint32_t asl_compiler_revision; /* ASL compiler revision number */
>  } __attribute__((packed));
>  
> +#define ACPI_TABLE_HDR_SIZE sizeof(struct acpi_table_header)
> +#define ACPI_TABLE_PFX_SIZE sizeof(uint16_t)  /* size of the extra prefix */
> +
> +static const char dfl_hdr[ACPI_TABLE_HDR_SIZE] =
> +"\0\0"   /* fake _length (2) */
> +"QEMU\0\0\0\0\1\0"   /* sig (4), len(4), revno (1), csum (1) */
> +"QEMUQEQEMUQEMU\1\0\0\0" /* OEM id (6), table (8), revno (4) */
> +"QEMU\1\0\0\0"   /* ASL compiler ID (4), version (4) */
> +;
> +
>  char *acpi_tables;
>  size_t acpi_tables_len;
>  
> @@ -47,156 +60,182 @@ static int acpi_checksum(const uint8_t *data, int len)
>  return (-sum) & 0xff;
>  }
>  
> +/* like strncpy() but zero-fills the tail of destination */
> +static void strzcpy(char *dst, const char *src, size_t size)
> +{
> +size_t len = strlen(src);
> +if (len >= size) {
> +len = size;
> +} else {
> +  memset(dst + len, 0, size - len);
> +}
> +memcpy(dst, src, len);
> +}
> +
> +/* XXX fixme: this function uses obsolete argument parsing interface */
>  int acpi_table_add(const char *t)
>  {
> -static const char *dfl_id = "QEMUQEMU";
>  char buf[1024], *p, *f;
> -struct acpi_table_header acpi_hdr;
>  unsigned long val;
> -uint32_t length;
> -struct acpi_table_header *acpi_hdr_p;
> -size_t off;
> +size_t len, start, allen;
> +bool has_header;
> +int changed;
> +int r;
> +struct acpi_table_header hdr;
> +
> +if (get_param_value(buf, sizeof(buf), "data", t)) {
> +has_header = false;
> +} else if (get_param_value(buf, sizeof(buf), "file", t)) {
> +has_header = true;
> +} else {
> +has_header = false;
> +buf[0] = '\0';
> +}
> +
> +if (!acpi_tables) {
> +allen = sizeof(uint16_t);
> +acpi_tables = qemu_mallocz(allen);
> +}
> +else {
> +allen = acpi_tables_len;
> +}
> +
> +start = allen;
> +acpi_tables = qemu_realloc(acpi_tables, start + ACPI_TABLE_HDR_SIZE);
> +allen += has_header ? ACPI_TABLE_PFX_SIZE : ACPI_TABLE_HDR_SIZE;
> +
> +/* now read in the data files, reallocating buffer as needed */
> +
> +for (f = strtok(buf, ":"); f; f = strtok(NULL, ":")) {
> +int fd = open(f, O_RDONLY);
> +
> +if (fd < 0) {
> +fprintf(stderr, "can't open file %s: %s\n", f, strerror(errno));
> +return -1;
> +}
> +
> +for (;;) {
> +char data[8192];
> +r = read(fd, data, sizeof(data));
> +if (r == 0) {
> +break;
> +} else if (r > 0) {
> +

Re: [Qemu-devel] [PATCH 21/26] Implement TCE translation for sPAPR VIO

2011-03-17 Thread David Gibson

On Wed, Mar 16, 2011 at 05:20:53PM -0500, Anthony Liguori wrote:
> On 03/15/2011 11:56 PM, David Gibson wrote:
> >From: Ben Herrenschmidt
[snip]
> >+static target_ulong h_put_tce(CPUState *env, sPAPREnvironment *spapr,
> >+  target_ulong opcode, target_ulong *args)
> >+{
> >+target_ulong liobn = args[0];
> >+target_ulong ioba = args[1];
> >+target_ulong tce = args[2];
> >+VIOsPAPRDevice *dev = spapr_vio_find_by_reg(spapr->vio_bus, liobn);
> >+VIOsPAPR_RTCE *rtce;
> >+
> >+if (!dev) {
> >+fprintf(stderr, "spapr_vio_put_tce on non-existent LIOBN "
> >+TARGET_FMT_lx "\n",
> >+liobn);
> 
> You generally want to avoid guest triggered fprintfs as it can be
> exploited in scenarios where qemu's stdout is logged to disk
> (libvirt).  We usually wrap this in a DPRINTF() of some sort.

Ah, good point.  I've gone through and audited for this sort of thing.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson

[Qemu-devel] [V9 PATCH 00/13] virtio-9p: Use chroot to safely access files in passthrough security model

2011-03-17 Thread M. Mohan Kumar

In passthrough security model, following symbolic links in the server
side could result in TOCTTOU vulnerabilities.

This patchset resolves this issue by creating a dedicated process which
chroots into the share path and all file object access is done in the
chroot environment.

This patchset implements chroot enviroment, provides necessary functions
that can be used by the passthrough function calls.

Changes from version V8:
* Make chmod and chown also operate under chroot process
* Check for invalid path requests, minor cleanups

Changes from version V7:
* Add two chroot methods remove and rename
* Minor cleanups like consolidating functions

Changes from version V6:
* Send only fd/errno in socket operations instead of FdInfo structure
* Minor cleanups

Changes from version V5:
* Return errno on failure instead of setting errno
* Minor cleanups like updated comments, enable CONFIG_THREAD if
  CONFIG_VIRTFS is enabled

Changes from version V4:
* Avoid using malloc/free inside chroot process
* Seperate chroot server and client functions

Changes from version V3
* Return EIO incase of socket read/write fail instead of exiting
* Changed data types as suggested by Blue Swirl
* Chroot process reports error through qemu process

Changes from version V2
* Treat socket IO errors as fatal, ie qemu will exit
* Split patchset based on chroot side (server) and qemu side(client)
  functionalities

M. Mohan Kumar (13):
  Implement qemu_read_full
  virtio-9p: Enable CONFIG_THREAD if CONFIG_VIRTFS is enabled
  virtio-9p: Provide chroot worker side interfaces
  virtio-9p: Add qemu side interfaces for chroot environment
  virtio-9p: Add support to open a file in chroot environment
  virtio-9p: Create support in chroot environment
  virtio-9p: Support for creating special files
  virtio-9p: Add support for removing file or directory
  virtio-9p: Add support to rename
  virtio-9p: Move file post creation changes to none security model
  virtio-9p: Add support for chmod
  virtio-9p: Add support for chown
  virtio-9p: Chroot environment for other functions

 Makefile.objs |1 +
 configure |1 +
 hw/9pfs/virtio-9p-chroot-worker.c |  342 +
 hw/9pfs/virtio-9p-chroot.c|  105 +++
 hw/9pfs/virtio-9p-chroot.h|   47 +
 hw/9pfs/virtio-9p-local.c |  332 +++-
 hw/9pfs/virtio-9p.c   |   24 +++
 hw/file-op-9p.h   |3 +
 osdep.c   |   32 
 qemu-common.h |2 +
 10 files changed, 811 insertions(+), 78 deletions(-)
 create mode 100644 hw/9pfs/virtio-9p-chroot-worker.c
 create mode 100644 hw/9pfs/virtio-9p-chroot.c
 create mode 100644 hw/9pfs/virtio-9p-chroot.h

-- 
1.7.3.4

[Qemu-devel] [V9 PATCH 10/13] virtio-9p: Move file post creation changes to none security model

2011-03-17 Thread M. Mohan Kumar

After creating a file object, its permission and ownership details are updated
as per 9p client's request for both passthrough and none security model.
But with chrooted environment its not required for passthrough security model.
Move all post file creation changes to none security model.

Signed-off-by: M. Mohan Kumar 
---
 hw/9pfs/virtio-9p-local.c |   19 ++-
 1 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/virtio-9p-local.c
index 48ca7ab..65622de 100644
--- a/hw/9pfs/virtio-9p-local.c
+++ b/hw/9pfs/virtio-9p-local.c
@@ -147,21 +147,14 @@ static int local_set_xattr(const char *path, FsCred 
*credp)
 return 0;
 }
 
-static int local_post_create_passthrough(FsContext *fs_ctx, const char *path,
+static int local_post_create_none(FsContext *fs_ctx, const char *path,
 FsCred *credp)
 {
+int retval;
 if (chmod(rpath(fs_ctx, path), credp->fc_mode & 0) < 0) {
 return -1;
 }
-if (lchown(rpath(fs_ctx, path), credp->fc_uid, credp->fc_gid) < 0) {
-/*
- * If we fail to change ownership and if we are
- * using security model none. Ignore the error
- */
-if (fs_ctx->fs_sm != SM_NONE) {
-return -1;
-}
-}
+retval = lchown(rpath(fs_ctx, path), credp->fc_uid, credp->fc_gid);
 return 0;
 }
 
@@ -302,7 +295,7 @@ static int local_mknod(FsContext *fs_ctx, const char *path, 
FsCred *credp)
 if (err == -1) {
 return err;
 }
-err = local_post_create_passthrough(fs_ctx, path, credp);
+err = local_post_create_none(fs_ctx, path, credp);
 if (err == -1) {
 serrno = errno;
 goto err_end;
@@ -344,7 +337,7 @@ static int local_mkdir(FsContext *fs_ctx, const char *path, 
FsCred *credp)
 if (err == -1) {
 return err;
 }
-err = local_post_create_passthrough(fs_ctx, path, credp);
+err = local_post_create_none(fs_ctx, path, credp);
 if (err == -1) {
 serrno = errno;
 goto err_end;
@@ -419,7 +412,7 @@ static int local_open2(FsContext *fs_ctx, const char *path, 
int flags,
 if (fd == -1) {
 return fd;
 }
-err = local_post_create_passthrough(fs_ctx, path, credp);
+err = local_post_create_none(fs_ctx, path, credp);
 if (err == -1) {
 serrno = errno;
 goto err_end;
-- 
1.7.3.4

1 2 >

1 - 100 of 119 matches

Mail list logo