date:20101115

Re: [Qemu-devel] [PATCH] linux-user: remove unnecessary local from __get_user(), __put_user()

2010-11-15 Thread Riku Voipio

On Mon, Nov 08, 2010 at 06:13:58PM +, Peter Maydell wrote:
> Remove an unnecessary local variable from the __get_user() and
> __put_user() macros. This avoids confusing compilation failures
> if the name of the local variable ('size') happens to be the
> same as the variable the macro user is trying to read/write.

Looks fine, will push on my next patchset

Acked-by: Riku Voipio 
 
> Signed-off-by: Peter Maydell 
> ---
>  linux-user/qemu.h |6 ++
>  1 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/linux-user/qemu.h b/linux-user/qemu.h
> index 708021e..d717392 100644
> --- a/linux-user/qemu.h
> +++ b/linux-user/qemu.h
> @@ -266,8 +266,7 @@ static inline int access_ok(int type, abi_ulong addr, 
> abi_ulong size)
>   */
>  #define __put_user(x, hptr)\
>  ({\
> -int size = sizeof(*hptr);\
> -switch(size) {\
> +switch(sizeof(*hptr)) {\
>  case 1:\
>  *(uint8_t *)(hptr) = (uint8_t)(typeof(*hptr))(x);\
>  break;\
> @@ -288,8 +287,7 @@ static inline int access_ok(int type, abi_ulong addr, 
> abi_ulong size)
>  
>  #define __get_user(x, hptr) \
>  ({\
> -int size = sizeof(*hptr);\
> -switch(size) {\
> +switch(sizeof(*hptr)) {\
>  case 1:\
>  x = (typeof(*hptr))*(uint8_t *)(hptr);\
>  break;\
> -- 
> 1.7.1
>

[Qemu-devel] Re: [PATCH v8 07/11] pci: introduce a parser for pci device path

2010-11-15 Thread Michael S. Tsirkin

On Mon, Nov 15, 2010 at 04:30:43PM +0900, Isaku Yamahata wrote:
> introduce a function to parse pci device path of
> the format, [Domain:]Slot.Function:Slot.Function:Slot.Function.
> 
> Signed-off-by: Isaku Yamahata 

Hmm.
How about we use openfirmware path like what Gleb's patch does,
with a fallback to bus:dev.fn for when it's unambiguous?

> ---
>  hw/pci.c |   87 
> ++
>  hw/pci.h |1 +
>  2 files changed, 88 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/pci.c b/hw/pci.c
> index fba765b..6a9a5ef 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -433,6 +433,93 @@ static void pci_set_default_subsystem_id(PCIDevice 
> *pci_dev)
>  }
>  
>  /*
> + * Parse format [Domain:]Slot.Function:Slot.Function:Slot.Function
> + * and get PCIDevice
> + * return 0 on success
> + *   -1 on error: format is invalid or device isn't found.
> + */
> +int pci_parse_dev_path(const char *addr, PCIDevice **pdev)
> +{
> +unsigned long domain;
> +unsigned long slot;
> +unsigned long func;
> +const char *p;
> +char *e;
> +unsigned long val;
> +PCIBus *bus;
> +PCIBus *child_bus;
> +PCIDevice *d;
> +
> +p = addr;
> +val = strtoul(p, &e, 0);
> +if (e == p) {
> +return -1;
> +}
> +if (*e == ':') {
> +domain = val;
> +p = e + 1;
> +val = strtoul(p, &e, 0);
> +if (e == p) {
> +return -1;
> +}
> +} else if (*e == '.'){
> +domain = 0;
> +} else {
> +return -1;
> +}
> +if (domain > 0x) {
> +return -1;
> +}
> +
> +bus = pci_find_root_bus(domain);
> +if (!bus) {
> +return -1;
> +}
> +
> +for (;;) {
> +slot = val;
> +if (*e != '.') {
> +return -1;
> +}
> +p = e + 1;
> +val = strtoul(p, &e, 0);
> +if (e == p) {
> +return -1;
> +}
> +func = val;
> +if (slot > 0x1f || func >= PCI_FUNC_MAX) {
> +return -1;
> +}
> +d = bus->devices[PCI_DEVFN(slot, func)];
> +if (!d) {
> +return -1;
> +}
> +if (*e == '\0') {
> +break;
> +}
> +
> +if (*e != ':') {
> +return -1;
> +}
> +p = e + 1;
> +val = strtoul(p, &e, 0);
> +if (e == p) {
> +return -1;
> +}
> +QLIST_FOREACH(child_bus, &bus->child, sibling) {
> +if (child_bus->parent_dev == d) {
> +bus = child_bus;
> +continue;
> +}
> +}
> +return -1;
> +}
> +
> +*pdev = d;
> +return 0;
> +}
> +
> +/*
>   * Parse [[:]:], return -1 on error if funcp == NULL
>   *   [[:]:]., return -1 on error
>   */
> diff --git a/hw/pci.h b/hw/pci.h
> index 7100804..8c16f91 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -239,6 +239,7 @@ PCIBus *pci_find_bus(PCIBus *bus, int bus_num);
>  PCIDevice *pci_find_device(PCIBus *bus, int bus_num, int slot, int function);
>  PCIBus *pci_get_bus_devfn(int *devfnp, const char *devaddr);
>  
> +int pci_parse_dev_path(const char *addr, PCIDevice **pdev);
>  int pci_parse_devaddr(const char *addr, int *domp, int *busp,
>unsigned int *slotp, unsigned int *funcp);
>  int pci_read_devaddr(Monitor *mon, const char *addr, int *domp, int *busp,
> -- 
> 1.7.1.1

[Qemu-devel] Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-15 Thread Gleb Natapov

On Mon, Nov 15, 2010 at 09:53:50AM +0200, Michael S. Tsirkin wrote:
> On Mon, Nov 15, 2010 at 09:40:08AM +0200, Gleb Natapov wrote:
> > On Sun, Nov 14, 2010 at 10:40:33PM -0500, Kevin O'Connor wrote:
> > > On Sun, Nov 14, 2010 at 05:39:41PM +0200, Gleb Natapov wrote:
> > > > +/*
> > > > + * This function returns device list as an array in a below format:
> > > > + * +-+-+---+-+---+--
> > > > + * |  n  |  l1 |   devpath1|  l2 |  devpath2 | ...
> > > > + * +-+-+---+-+---+--
> > > > + * where:
> > > > + *   n - a number of devise pathes (one byte)
> > > > + *   l - length of following device path string (one byte)
> > > > + *   devpath - non-null terminated string of length l representing
> > > > + * one device path
> > > > + */
> > > 
> > > Why not just return a newline separated list that is null terminated?
> > > 
> > Doing it like this will needlessly complicate firmware side. How do you
> > know how much memory to allocate before reading device list?
> 
> Do a memory scan, count newlines until you reach 0?
> 
To do memory scan you need to read it into memory first. To read it into
memory you need to know how much memory to allocate to know how much
memory to allocate you meed to do memory scan... Notice pattern here :)
Of course you can scan IO space too discarding everything you read first
time, but why introduce broken interface in the first place?

> > Doing it
> > like Blue suggest (have BOOTINDEX_LEN and BOOTINDEX_STRING) solves this.
> > To create nice array from bootindex string you firmware will still have
> > to do additional pass on it though.
> 
> Why is this a problem? Pass over memory is cheap, isn't it?
> 
More code, each line of code potentially introduce bug. But I will go with
Blue suggestion anyway since we already use it for other things.

> > With format like above the code
> > would look like that:
> > 
> > qemu_cfg_read(&n, 1);
> > arr = alloc(n);
> > for (i=0; i >  qemu_cfg_read(&l, 1);
> >  arr[i] = zalloc(l+1);
> >  qemu_cfg_read(arr[i], l);
> > }
> >  
> > 
> > --
> > Gleb.
> 
> 
> At this point I don't care about format.
I do.

> But I would like one without 1-byte-length limitations,
> just so we can cover whatever pci can through at us.
> 
I agree. 1-byte for one device string may be to limiting. It is still
more then 15 PCI bridges on a PC and if you have your pci bus go that
deep you are doing something very wrong. But according to spec device
name can be 32 byte long and device address may be 64bit physical
address and that makes length of one device element to be 50 byte.

--
Gleb.

[Qemu-devel] Re: [PATCH v8 01/11] pci: revise pci command register initialization

2010-11-15 Thread Michael S. Tsirkin

On Mon, Nov 15, 2010 at 04:30:37PM +0900, Isaku Yamahata wrote:
> This patch cleans up command register initialization with
> comments.
> 
> Signed-off-by: Isaku Yamahata 

Probably just fold with the next patch.

> ---
>  hw/pci.c |   42 ++
>  1 files changed, 42 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/pci.c b/hw/pci.c
> index 962886e..b70a568 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -544,8 +544,50 @@ static void pci_init_wmask(PCIDevice *dev)
>  
>  dev->wmask[PCI_CACHE_LINE_SIZE] = 0xff;
>  dev->wmask[PCI_INTERRUPT_LINE] = 0xff;
> +
> +/*
> + * bit 0: PCI_COMMAND_IO
> + *type 0: if IO BAR is used, RW
> + *type 1: RW
> + * bit 1: PCI_COMMAND_MEMORY
> + *type 0: if IO BAR is used, RW
> + *type 1: RW
> + * bit 2: PCI_COMMAND_MASTER
> + *type 0: RW if bus master
> + *type 1: RW
> + * bit 3: PCI_COMMAND_SPECIAL
> + *RO=0, optionally RW: Such device should set this bit itself
> + * bit 4: PCI_COMMAND_INVALIDATE
> + *RO=0, optionally RW: Such device should set this bit itself
> + * bit 5: PCI_COMMAND_VGA_PALETTE
> + *RO=0, optionally RW: Such device should set this bit itself
> + * bit 6: PCI_COMMAND_PARITY
> + *RW with exceptions: Such device should clear this bit itself
> + *Given that qemu doesn't emulate pci bus cycles, so that there
> + *is no place to generate parity error. So just making this
> + *register RW is okay because there is no place which refers
> + *this bit.
> + *TODO: When device assignment tried to inject PERR# into qemu,
> + *  some extra work would be needed.
> + * bit 7: PCI_COMMAND_WAIT: reserved (PCI 3.0)
> + *RO=0
> + * bit 8: PCI_COMMAND_SERR
> + *RW with exceptions: Such device should clear this bit itself
> + *Given that qemu doesn't emulate pci bus cycles, so that there
> + *is no place to generate system error. So just making this
> + *register RW is okay because there is no place which refers
> + *this bit.
> + *TODO: When device assignment tried to inject SERR# into qemu,
> + *  some extra work would be needed.
> + * bit 9: PCI_COMMAND_FAST_BACK
> + *RO=0, optionally RW: Such device should set this bit itself
> + * bit 10: PCI_COMMAND_INTX_DISABLE
> + * RW
> + * bit 11-15: reserved
> + */
>  pci_set_word(dev->wmask + PCI_COMMAND,
>   PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER |
> + PCI_COMMAND_PARITY | PCI_COMMAND_SERR |
>   PCI_COMMAND_INTX_DISABLE);
>  
>  memset(dev->wmask + PCI_CONFIG_HEADER_SIZE, 0xff,
> -- 
> 1.7.1.1

Re: Re: [Qemu-devel] How to make shadow memory for a process? and how to trace the data propation from the instruction level in QEMU?

2010-11-15 Thread Mulyadi Santosa

Hi

OK it's getting interesting perhaps it would lead into
instrumentation topic, which is quite hot topic in qemu-devel quite
recently, so you jump into the wagon just about the right time :)

2010/11/15 F. Zhang :
> I am very pleased to share ideas with you. But my English is too poor, er…,
> I’ll try my best to make it clear. J

Either do I. How much do you expect Indonesian like me to write
fluently English, after all? :D heheh, just joking :)

OK, one thing for sure here is, I think you can implement your idea on
top of several (not so complete) existing frameworks in Qemu.
Tracing...is one of them...not sure about the rest...

> Yes, I have read that paper, it’s wonderful!
>
> Besides the Argos, the bitblaze group, led by Dawn Song in Berkeley, has
> achieved great success in the taint analysis. The website about their
> dynamic analysis work (called TEMU) can be found at:
> http://bitblaze.cs.berkeley.edu/temu.html
>
> And TEMU is now open-source.

Thanks for sharing that...it's new stuff for me. So, why don't you
just pick TEMU and improve it instead of...uhm...sorry if I am wrong,
working from scratch? After all, I believe in both Argos and TEMU (and
maybe other similar projects), they share common codes here and there.

But ehm...CMIIW, seems like TEMU is based on Qemu 0.9,x, right? So
it's sorry I forgot the name, the generated code is mostly a
constructed by fragments of small codes generated by gcc. Now, it is
qemu which does it by itself. So, a lot of things change
(substantially).

> Yes. For each process’s memory space A, I wanna make a shadow memory B. The
> shadow memory is used to store the tag of data. In other words, if addr in
> memory A is tainted, then the corresponding byte in B should be marked to
> indicate that addr in A is tainted.

I agree that should be the way it worksbut. (see below)


>>How about using unused one of unused PTE flags for such tag?
>
> Sorry, what is the PTE flag?

Page Table Entry...i believe not all flags are really used by the OS
nowadays, so I guess you can utilize 1 or 2 bits there whenever
possible...
>
> In fact, the tag is stored in the shadow memory of the process.
>
> Let us consider the following instruction:
>
> mov eax, [esi]
>
> If data in [esi] is tainted, then eax is tained, too.

May we know, what kind of information do you plan to store in such tag?

> In this instruction, we should first consider whether [esi] is tainted or
> not. This is done by checking the tag in the shadow memory. If [esi] is
> tainted, then the tag for eax in the shadow memory is set, too.
>
> The question is: how to implement the upper functions? maybe I should modify
> the instruction-translation functions to implement the trace of tainted data
> propagation?


I think you should hook all the memory operation related opcode (or to
be precise, Qemu opcode). That way, you won't miss any..

> Yes, I wanna make QEMU cooperate with the GUEST OS. In fact, malware under
> analysis is run within the GUEST OS.

Hm, I thought it would be host OS + qemudon't you think, if it is
guest OS +qemu, while there is a chance guest OS is compromised first,
then we get such unreliable data? Or am I missing something here?

>The guest os collects “higher” semantic
> from the OS level, and the QEMU collects “lower” semantic from the
> instruction level. Combination of both semantics is necessary in the
> analysis process.

The question is, in a situation where malware already compromise "the
higher semantic", could we trust the analysis?

> The question is: how to communicate between the QEMU and the guest OS, so
> that they can cooperate with each other?

OK, so let's assume it's really guest OS +qemu...i think, uhm, better
create pseudo device, quite similar with virtioor you can think
it's like /dev/sda, /dev/rtc etc... the guest OS must somewhat be
installed with a driver which knows how to read and talk to this
device.

Via the driver, fed any analysis resultqemu collects it...and
finally pass it to host OS.

Other possibilty is to reserve certain memory region (kinda BIOS
reserved memory space), mmap it inside the guest OS, then treat it
like System V shared memory. Put the data in it, Qemu regularly checks
it...

What do you think?

PS: eduardo cruz might be an interesting person to talk to..he did
instrumention work lately too

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com

[Qemu-devel] Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-15 Thread Gleb Natapov

On Sun, Nov 14, 2010 at 10:50:13PM +, Blue Swirl wrote:
> On Sun, Nov 14, 2010 at 3:39 PM, Gleb Natapov  wrote:
> >
> > Signed-off-by: Gleb Natapov 
> > ---
> >  hw/fw_cfg.c |   14 ++
> >  hw/fw_cfg.h |    4 +++-
> >  sysemu.h    |    1 +
> >  vl.c        |   51 +++
> >  4 files changed, 69 insertions(+), 1 deletions(-)
> >
> > diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
> > index 7b9434f..f6a67db 100644
> > --- a/hw/fw_cfg.c
> > +++ b/hw/fw_cfg.c
> > @@ -53,6 +53,7 @@ struct FWCfgState {
> >     FWCfgFiles *files;
> >     uint16_t cur_entry;
> >     uint32_t cur_offset;
> > +    Notifier machine_ready;
> >  };
> >
> >  static void fw_cfg_write(FWCfgState *s, uint8_t value)
> > @@ -315,6 +316,15 @@ int fw_cfg_add_file(FWCfgState *s,  const char 
> > *filename, uint8_t *data,
> >     return 1;
> >  }
> >
> > +static void fw_cfg_machine_ready(struct Notifier* n)
> > +{
> > +    uint32_t len;
> > +    char *bootindex = get_boot_devices_list(&len);
> > +
> > +    fw_cfg_add_bytes(container_of(n, FWCfgState, machine_ready),
> > +                     FW_CFG_BOOTINDEX, (uint8_t*)bootindex, len);
> > +}
> > +
> >  FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
> >                         target_phys_addr_t ctl_addr, target_phys_addr_t 
> > data_addr)
> >  {
> > @@ -343,6 +353,10 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t 
> > data_port,
> >     fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
> >     fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
> >
> > +
> > +    s->machine_ready.notify = fw_cfg_machine_ready;
> > +    qemu_add_machine_init_done_notifier(&s->machine_ready);
> > +
> >     return s;
> >  }
> >
> > diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
> > index 856bf91..4d61410 100644
> > --- a/hw/fw_cfg.h
> > +++ b/hw/fw_cfg.h
> > @@ -30,7 +30,9 @@
> >
> >  #define FW_CFG_FILE_FIRST       0x20
> >  #define FW_CFG_FILE_SLOTS       0x10
> > -#define FW_CFG_MAX_ENTRY        (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
> > +#define FW_CFG_FILE_LAST_SLOT   (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
> > +#define FW_CFG_BOOTINDEX        (FW_CFG_FILE_LAST_SLOT + 1)
> > +#define FW_CFG_MAX_ENTRY        FW_CFG_BOOTINDEX
> 
> This should be
> #define FW_CFG_MAX_ENTRY(FW_CFG_BOOTINDEX + 1)
> because the check is like this:
> if ((key & FW_CFG_ENTRY_MASK) >= FW_CFG_MAX_ENTRY) {
> s->cur_entry = FW_CFG_INVALID;
> 
Yeah, will fix.

> With that change, I got the bootindex passed to OpenBIOS:
> OpenBIOS for Sparc64
> Configuration device id QEMU version 1 machine id 0
> kernel cmdline
> CPUs: 1 x SUNW,UltraSPARC-IIi
> UUID: ----
> bootindex num_strings 1
> bootindex /p...@01fe/i...@5/dr...@1/d...@0
> 
> The device path does not match exactly, but it's close:
> /p...@1fe,0/pci-...@5/i...@600/d...@0

pbm->pci should be solvable by the patch at the end. Were in the spec
it is allowed to abbreviate 1fe as 1fe,0? Spec allows to drop
starting zeroes but TARGET_FMT_plx definition in targphys.h has 0 after
%. I can define another one without leading zeroes. Can you suggest
a name?  TARGET_FMT_lx is poisoned. As of ATA there is no open firmware
binding spec for ATA, so everyone does what he pleases. I based my
implementation on what open firmware showing when running on qemu x86.
"pci-ata" should be "ide" according to PCI binding spec :) 

diff --git a/hw/apb_pci.c b/hw/apb_pci.c
index c619112..643aa49 100644
--- a/hw/apb_pci.c
+++ b/hw/apb_pci.c
@@ -453,6 +453,7 @@ static PCIDeviceInfo pbm_pci_host_info = {
 
 static SysBusDeviceInfo pbm_host_info = {
 .qdev.name = "pbm",
+.qdev.fw_name = "pci",
 .qdev.size = sizeof(APBState),
 .qdev.reset = pci_pbm_reset,
 .init = pci_pbm_init_device,
--
Gleb.

[Qemu-devel] Re: [PATCH v8 07/11] pci: introduce a parser for pci device path

2010-11-15 Thread Isaku Yamahata

On Mon, Nov 15, 2010 at 10:06:13AM +0200, Michael S. Tsirkin wrote:
> On Mon, Nov 15, 2010 at 04:30:43PM +0900, Isaku Yamahata wrote:
> > introduce a function to parse pci device path of
> > the format, [Domain:]Slot.Function:Slot.Function:Slot.Function.
> > 
> > Signed-off-by: Isaku Yamahata 
> 
> Hmm.
> How about we use openfirmware path like what Gleb's patch does,
> with a fallback to bus:dev.fn for when it's unambiguous?

Okay, let me check my understanding of the format.

The openfirmware path in pci case looks like
/pci@/@,/.../@,

"pci@" corresponds to pci domain. So  should be
also supported in addition to .

Maybe we'd like "@" to be optional.
So the parser would accept something like
/{, }/,/.../,
-- 
yamahata

[Qemu-devel] Re: [PATCH v8 07/11] pci: introduce a parser for pci device path

2010-11-15 Thread Gleb Natapov

On Mon, Nov 15, 2010 at 05:57:40PM +0900, Isaku Yamahata wrote:
> On Mon, Nov 15, 2010 at 10:06:13AM +0200, Michael S. Tsirkin wrote:
> > On Mon, Nov 15, 2010 at 04:30:43PM +0900, Isaku Yamahata wrote:
> > > introduce a function to parse pci device path of
> > > the format, [Domain:]Slot.Function:Slot.Function:Slot.Function.
> > > 
> > > Signed-off-by: Isaku Yamahata 
> > 
> > Hmm.
> > How about we use openfirmware path like what Gleb's patch does,
> > with a fallback to bus:dev.fn for when it's unambiguous?
> 
> Okay, let me check my understanding of the format.
> 
> The openfirmware path in pci case looks like
> /pci@/@,/.../@,
> 
> "pci@" corresponds to pci domain. So  should be
> also supported in addition to .
> 
Correct.

> Maybe we'd like "@" to be optional.
> So the parser would accept something like
> /{, }/,/.../,
My patch series has pci_dev_fw_name() function that fills in 
for you, so no reason to make it optional. If you still want to make it
optional @ should stay, so it should be /@ioport/@slot.func/

--
Gleb.

Re: [Qemu-devel] [PATCH RESENT] msix: allow byte and word reading from mmio

2010-11-15 Thread Bernhard Kohl


Am 22.08.2010 23:34, schrieb ext Anthony Liguori:

On 08/19/2010 07:56 AM, Bernhard Kohl wrote:
   

It's legal that the guest reads a single byte or word from mmio.
I have an OS which reads single bytes and it works fine on real
hardware. Maybe this happens due to casting.

Signed-off-by: Bernhard Kohl

 

Hi Michael,

I'm looking for this to come through your tree unless you disagree.

Regards,

Anthony Liguori
   


Hi,

could you please look at this again?

Thanks
Bernhard

   

---
   hw/msix.c |   20 
   1 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/hw/msix.c b/hw/msix.c
index d99403a..7dac7f7 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -100,10 +100,22 @@ static uint32_t msix_mmio_readl(void *opaque, 
target_phys_addr_t addr)
   return pci_get_long(page + offset);
   }

-static uint32_t msix_mmio_read_unallowed(void *opaque, target_phys_addr_t addr)
+static uint32_t msix_mmio_readw(void *opaque, target_phys_addr_t addr)
   {
-fprintf(stderr, "MSI-X: only dword read is allowed!\n");
-return 0;
+PCIDevice *dev = opaque;
+unsigned int offset = addr&   (MSIX_PAGE_SIZE - 1)&   ~0x1;
+void *page = dev->msix_table_page;
+
+return pci_get_word(page + offset);
+}
+
+static uint32_t msix_mmio_readb(void *opaque, target_phys_addr_t addr)
+{
+PCIDevice *dev = opaque;
+unsigned int offset = addr&   (MSIX_PAGE_SIZE - 1);
+void *page = dev->msix_table_page;
+
+return pci_get_byte(page + offset);
   }

   static uint8_t msix_pending_mask(int vector)
@@ -198,7 +210,7 @@ static CPUWriteMemoryFunc * const msix_mmio_write[] = {
   };

   static CPUReadMemoryFunc * const msix_mmio_read[] = {
-msix_mmio_read_unallowed, msix_mmio_read_unallowed, msix_mmio_readl
+msix_mmio_readb, msix_mmio_readw, msix_mmio_readl
   };

   /* Should be called from device's map method. */

[Qemu-devel] Re: [PATCH] pc: disable the BOCHS BIOS panic port

2010-11-15 Thread Bernhard Kohl


Am 01.09.2010 16:44, schrieb Bernhard Kohl:

We have an OS which writes to port 0x400 when probing for special hardware.
This causes an exit of the VM. With SeaBIOS this port isn't used anyway.

Signed-off-by: Bernhard Kohl
---
  hw/pc.c |2 --
  1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 69b13bf..3f81229 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -430,8 +430,6 @@ static void bochs_bios_write(void *opaque, uint32_t addr, 
uint32_t val)
  /* Bochs BIOS messages */
  case 0x400:
  case 0x401:
-fprintf(stderr, "BIOS panic at rombios.c, line %d\n", val);
-exit(1);
  case 0x402:
  case 0x403:
  #ifdef DEBUG_BIOS
   


Hi,

could you please look at this?

Thanks
Bernhard

Re: [Qemu-devel] Re: [PATCH] pc: disable the BOCHS BIOS panic port

2010-11-15 Thread Alexander Graf


On 15.11.2010, at 10:53, Bernhard Kohl wrote:

> Am 01.09.2010 16:44, schrieb Bernhard Kohl:
>> We have an OS which writes to port 0x400 when probing for special hardware.
>> This causes an exit of the VM. With SeaBIOS this port isn't used anyway.
>> 
>> Signed-off-by: Bernhard Kohl
>> ---
>>  hw/pc.c |2 --
>>  1 files changed, 0 insertions(+), 2 deletions(-)
>> 
>> diff --git a/hw/pc.c b/hw/pc.c
>> index 69b13bf..3f81229 100644
>> --- a/hw/pc.c
>> +++ b/hw/pc.c
>> @@ -430,8 +430,6 @@ static void bochs_bios_write(void *opaque, uint32_t 
>> addr, uint32_t val)
>>  /* Bochs BIOS messages */
>>  case 0x400:
>>  case 0x401:
>> -fprintf(stderr, "BIOS panic at rombios.c, line %d\n", val);
>> -exit(1);
>>  case 0x402:
>>  case 0x403:
>>  #ifdef DEBUG_BIOS
>>   
> 
> Hi,
> 
> could you please look at this?

This patch makes that port that was silent before print debug output if 
DEBUG_BIOS is enabled which might be confusing.

How about something like

case 0x400:
case 0x401:
/* used to be panic, now unused */
break;


Alex

[Qemu-devel] Re: [PATCHv4 13/15] Add bootindex for option roms.

2010-11-15 Thread Gleb Natapov

On Sun, Nov 14, 2010 at 09:33:06PM +, Blue Swirl wrote:
> On Sun, Nov 14, 2010 at 3:39 PM, Gleb Natapov  wrote:
> > Extend -option-rom command to have additional parameter ,bootindex=.
> 
> This patch is broken:
>   CCarm-softmmu/palm.o
> /src/qemu/hw/palm.c: In function 'palmte_init':
> /src/qemu/hw/palm.c:237: error: incompatible type for argument 1 of
> 'get_image_size'
> /src/qemu/hw/palm.c:245: error: incompatible type for argument 1 of
> 'load_image_targphys'
> cc1: warnings being treated as errors
> /src/qemu/hw/palm.c:250: error: format '%s' expects type 'char *', but
> argument 4 has type 'QEMUOptionRom'
>   CCarm-softmmu/nseries.o
> /src/qemu/hw/nseries.c: In function 'n8x0_init':
> /src/qemu/hw/nseries.c:1346: error: incompatible type for argument 1
> of 'load_image_targphys'
Fixed in new version.

--
Gleb.

Re: [Qemu-devel] [PATCH V6 02/15] xen: Support new libxc calls from xen unstable.

2010-11-15 Thread Alexander Graf


On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote:

> From: Anthony PERARD 
> 
> Update the libxenctrl calls in Qemu to use the new interface, otherwise
> Qemu wouldn't be able to build against new versions of the library.
> 
> We also check libxenctrl version in configure, from Xen 3.3.0 to Xen
> unstable.
> 
> Signed-off-by: Anthony PERARD 
> Signed-off-by: Stefano Stabellini 
> ---
> configure|   67 -
> hw/xen_backend.c |   10 +++---
> hw/xen_backend.h |2 +-
> hw/xen_common.h  |   38 +---
> hw/xen_disk.c|   12 
> hw/xen_domainbuild.c |2 +-
> hw/xen_nic.c |   16 ++--
> 7 files changed, 109 insertions(+), 38 deletions(-)
> 
> diff --git a/configure b/configure
> index 9ed05de..f6a7073 100755
> --- a/configure
> +++ b/configure
> @@ -284,6 +284,7 @@ vnc_jpeg=""
> vnc_png=""
> vnc_thread="no"
> xen=""
> +xen_ctrl_version=""
> linux_aio=""
> attr=""
> vhost_net=""
> @@ -1135,20 +1136,81 @@ fi
> 
> if test "$xen" != "no" ; then
>   xen_libs="-lxenstore -lxenctrl -lxenguest"
> +
> +  # Xen unstable
>   cat > $TMPC < #include 
> #include 
> -int main(void) { xs_daemon_open(); xc_interface_open(); return 0; }
> +#include 
> +#include 
> +#if !defined(HVM_MAX_VCPUS)
> +# error HVM_MAX_VCPUS not defined
> +#endif
> +int main(void) {
> +  xc_interface *xc;
> +  xs_daemon_open();
> +  xc_interface_open(0, 0, 0);
> +  xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0);
> +  xc_gnttab_open(xc);
> +  return 0;
> +}
> EOF
>   if compile_prog "" "$xen_libs" ; then
> +xen_ctrl_version=410
> xen=yes
> -libs_softmmu="$xen_libs $libs_softmmu"
> +
> +  # Xen 4.0.0
> +  elif (
> +  cat > $TMPC < +#include 
> +#include 
> +#include 
> +#include 
> +#if !defined(HVM_MAX_VCPUS)
> +# error HVM_MAX_VCPUS not defined
> +#endif
> +int main(void) {
> +  xs_daemon_open();
> +  xc_interface_open();
> +  xc_gnttab_open();
> +  xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0);
> +  return 0;
> +}
> +EOF
> +  compile_prog "" "$xen_libs"
> +) ; then
> +xen_ctrl_version=400
> +xen=yes
> +
> +  # Xen 3.3.0, 3.4.0
> +  elif (
> +  cat > $TMPC < +#include 
> +#include 
> +int main(void) {
> +  xs_daemon_open();
> +  xc_interface_open();
> +  xc_gnttab_open();
> +  xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0);
> +  return 0;
> +}
> +EOF
> +  compile_prog "" "$xen_libs"
> +) ; then
> +xen_ctrl_version=330
> +xen=yes
> +
> +  # Xen not found or unsupported
>   else
> if test "$xen" = "yes" ; then
>   feature_not_found "xen"
> fi
> xen=no
>   fi
> +
> +  if test "$xen" = "yes"; then
> +libs_softmmu="$xen_libs $libs_softmmu"
> +  fi
> fi
> 
> ##
> @@ -2546,6 +2608,7 @@ if test "$bluez" = "yes" ; then
> fi
> if test "$xen" = "yes" ; then
>   echo "CONFIG_XEN=y" >> $config_host_mak
> +  echo "CONFIG_XEN_CTRL_INTERFACE_VERSION=$xen_ctrl_version" >> 
> $config_host_mak
> fi
> if test "$io_thread" = "yes" ; then
>   echo "CONFIG_IOTHREAD=y" >> $config_host_mak
> diff --git a/hw/xen_backend.c b/hw/xen_backend.c
> index 860b038..3e99751 100644
> --- a/hw/xen_backend.c
> +++ b/hw/xen_backend.c
> @@ -43,7 +43,7 @@
> /* - */
> 
> /* public */
> -int xen_xc;
> +qemu_xc_interface xen_xc = XC_HANDLER_INITIAL_VALUE;
> struct xs_handle *xenstore = NULL;
> const char *xen_protocol;
> 
> @@ -216,7 +216,7 @@ static struct XenDevice *xen_be_get_xendev(const char 
> *type, int dom, int dev,
> fcntl(xc_evtchn_fd(xendev->evtchndev), F_SETFD, FD_CLOEXEC);
> 
> if (ops->flags & DEVOPS_FLAG_NEED_GNTDEV) {
> -xendev->gnttabdev = xc_gnttab_open();
> +xendev->gnttabdev = xc_gnttab_open(xen_xc);
> if (xendev->gnttabdev < 0) {
> xen_be_printf(NULL, 0, "can't open gnttab device\n");
> xc_evtchn_close(xendev->evtchndev);
> @@ -269,7 +269,7 @@ static struct XenDevice *xen_be_del_xendev(int dom, int 
> dev)
> if (xendev->evtchndev >= 0)
> xc_evtchn_close(xendev->evtchndev);
> if (xendev->gnttabdev >= 0)
> -xc_gnttab_close(xendev->gnttabdev);
> +xc_gnttab_close(xen_xc, xendev->gnttabdev);
> 
> QTAILQ_REMOVE(&xendevs, xendev, next);
> qemu_free(xendev);
> @@ -627,8 +627,8 @@ int xen_be_init(void)
> if (qemu_set_fd_handler(xs_fileno(xenstore), xenstore_update, NULL, NULL) 
> < 0)
> goto err;
> 
> -xen_xc = xc_interface_open();
> -if (xen_xc == -1) {
> +xen_xc = xc_interface_open(NULL, NULL, 0);
> +if (xen_xc == XC_HANDLER_INITIAL_VALUE) {
> xen_be_printf(NULL, 0, "can't open xen interface\n");
> goto err;
> }
> diff --git a/hw/xen_backend.h b/hw/xen_backend.h
> index 1b428e3..7d84098 100644
> --- a/hw/xen_backend.h
> +++ b/hw/xen_backend.h
> @@ -55,7 +55,7 @@ struct XenDevice {
> /*

Re: [Qemu-devel] [PATCH V6 03/15] xen: Add xen_machine_fv

2010-11-15 Thread Alexander Graf


On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote:

> From: Anthony PERARD 
> 
> Add the Xen FV (Fully Virtualized) machine to Qemu;
> this is groundwork to add Xen device model support in Qemu.
> 
> Signed-off-by: Anthony PERARD 
> Signed-off-by: Stefano Stabellini 
> ---
> Makefile.target |3 +
> hw/xen_common.h |5 ++
> hw/xen_machine_fv.c |  158 +++
> 3 files changed, 166 insertions(+), 0 deletions(-)
> create mode 100644 hw/xen_machine_fv.c
> 
> diff --git a/Makefile.target b/Makefile.target
> index c48cbcc..ddabc8a 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -185,6 +185,9 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
> # xen backend driver support
> obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
> 
> +# xen full virtualized machine
> +obj-i386-$(CONFIG_XEN) += xen_machine_fv.o
> +
> # USB layer
> obj-$(CONFIG_USB_OHCI) += usb-ohci.o
> 
> diff --git a/hw/xen_common.h b/hw/xen_common.h
> index 9f75e52..4c0f97d 100644
> --- a/hw/xen_common.h
> +++ b/hw/xen_common.h
> @@ -18,6 +18,11 @@
>  * We don't support Xen prior to 3.3.0.
>  */
> 
> +/* Before Xen 4.0.0 */
> +#if CONFIG_XEN_CTRL_INTERFACE_VERSION < 400
> +#  define HVM_MAX_VCPUS 32
> +#endif
> +
> /* Xen unstable */
> #if CONFIG_XEN_CTRL_INTERFACE_VERSION < 410
> typedef int qemu_xc_interface;
> diff --git a/hw/xen_machine_fv.c b/hw/xen_machine_fv.c
> new file mode 100644
> index 000..260cda3
> --- /dev/null
> +++ b/hw/xen_machine_fv.c
> @@ -0,0 +1,158 @@
> +/*
> + * QEMU Xen FV Machine
> + *
> + * Copyright (c) 2003-2007 Fabrice Bellard
> + * Copyright (c) 2007 Red Hat

Shouldn't there be Citrix in there?

> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> + * of this software and associated documentation files (the "Software"), to 
> deal
> + * in the Software without restriction, including without limitation the 
> rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.

So all this code has always been public domain? There are no GPL'ed pieces in 
there?


The rest of the code looks pretty much like a duplicate of pc.c. I'm not saying 
this is bad, but it'll be a lot of work to keep things in sync when others 
start changing pc.c.

So long-term we should get rid of the duplication. Hopefully by having a 
working machine description framework that doesn't require C skills :). It's 
way out of the scope of this patch set though.


Alex

Re: [Qemu-devel] [PATCH V6 04/15] Introduce -accel command option.

2010-11-15 Thread Alexander Graf


On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote:

> From: Anthony PERARD 
> 
> This option gives the ability to switch one "accelerator" like kvm, xen
> or the default one tcg. We can specify more than one accelerator by
> separate them by a comma. QEMU will try each one and use the first whose
> works.
> 
> So,
> 
> -accel xen,kvm,tcg
> 
> which would try Xen support first, then KVM and finaly tcg if none of
> the other works.
> 
> Signed-off-by: Anthony PERARD 


This patch is completely independent of the other patches. Anthony, can we pull 
it in separately before the others?

Acked-by: Alexander Graf

Re: [Qemu-devel] virtio-blk broken after system reset

2010-11-15 Thread Kevin Wolf

Am 13.11.2010 11:09, schrieb Jan Kiszka:
> Am 13.11.2010 11:01, Michael Tokarev wrote:
>> 13.11.2010 10:51, Jan Kiszka wrote:
>>> Am 13.11.2010 08:49, Stefan Hajnoczi wrote:
 On Fri, Nov 12, 2010 at 10:02 PM, Jan Kiszka  wrote:
> Hi,
>
> both after hard and guest-initiated reset, something is seriously broken
> with virtio block devices. If I reset my Linux guest while still in
> grub, the bios will simply fail to read from the disk after the reboot. 
> If I
> reset after Linux touched the device, qemu terminates:
>
> Breakpoint 1, 0x74b945b0 in _exit () from /lib64/libc.so.6
> (gdb) bt
> #0  0x74b945b0 in _exit () from /lib64/libc.so.6
> #1  0x74b2948d in __run_exit_handlers () from /lib64/libc.so.6
> #2  0x74b29535 in exit () from /lib64/libc.so.6
> #3  0x00568da3 in virtqueue_num_heads (vq=0x17040e0, idx=0) at 
> /data/qemu/hw/virtio.c:258
> #4  0x00569511 in virtqueue_pop (vq=0x17040e0, elem=0x17cea58) at 
> /data/qemu/hw/virtio.c:388
> #5  0x00419e31 in virtio_blk_get_request (s=0x1704010) at 
> /data/qemu/hw/virtio-blk.c:132
> #6  virtio_blk_handle_output (vdev=0x1704010, vq=) 
> at /data/qemu/hw/virtio-blk.c:369
>
>> []
>>> And what about the guest-triggerable qemu exit above?
>>
>> There are _lots_ of guest-triggerable qemu exits out there.
>>
>> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
>> {
>> uint16_t num_heads = vring_avail_idx(vq) - idx;
>>
>> /* Check it isn't doing very strange things with descriptor numbers. */
>> if (num_heads > vq->vring.num) {
>> fprintf(stderr, "Guest moved used index from %u to %u",
>> idx, vring_avail_idx(vq));
>> exit(1);
>> }
>>
>> return num_heads;
>> }
>>
>> This is done when guest behaves insanely (or qemu thinks it does).
>> On a real hw similar behavour most likely will lead to a system
>> lockup, qemu just exits.
> 
> There is also real hw out there that goes into an error state if it's
> misprogrammed.
> 
> I think we have to remove all those premature exits. They also prevent
> handing the device inside the guest to an untrusted driver (relevant
> once we have IOMMU emulation).

I agree, those exits are bugs. There are a lot of them in virtio code.
At some point, someone should go through the whole virtio code and fix
them all.

Kevin

[Qemu-devel] Re: [PATCH RESENT] msix: allow byte and word reading from mmio

2010-11-15 Thread Michael S. Tsirkin

On Thu, Aug 19, 2010 at 02:56:51PM +0200, Bernhard Kohl wrote:
> It's legal that the guest reads a single byte or word from mmio.

Interesting. The spec seems to say this:

For all accesses to MSI-X Table and MSI-X PBA fields, software must use
aligned full DWORD or aligned full QWORD transactions; otherwise, the
result is undefined.

> I have an OS which reads single bytes and it works fine on real
> hardware. Maybe this happens due to casting.

What do you mean by casting?

> 
> Signed-off-by: Bernhard Kohl 

I guess we can merge this, but we need a comment I think since this
seems to contradict the spec, and people were sending patches relying on
this.

Does something like the following describe the situation correctly?

/* Note: PCI spec requires that all MSI-X table accesses
   are either DWORD or QWORD, size aligned.  Some guests seem to violate
   this rule for read accesses, performing single byte reads.
   Since it's easy to support this, let's do so.
   Also support 16 bit size aligned reads, just in case.
 */

Does you guest also do 16 bit reads?  Are accesses at least aligned?

> ---
>  hw/msix.c |   20 
>  1 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/msix.c b/hw/msix.c
> index d99403a..7dac7f7 100644
> --- a/hw/msix.c
> +++ b/hw/msix.c
> @@ -100,10 +100,22 @@ static uint32_t msix_mmio_readl(void *opaque, 
> target_phys_addr_t addr)
>  return pci_get_long(page + offset);
>  }
>  
> -static uint32_t msix_mmio_read_unallowed(void *opaque, target_phys_addr_t 
> addr)
> +static uint32_t msix_mmio_readw(void *opaque, target_phys_addr_t addr)
>  {
> -fprintf(stderr, "MSI-X: only dword read is allowed!\n");
> -return 0;
> +PCIDevice *dev = opaque;
> +unsigned int offset = addr & (MSIX_PAGE_SIZE - 1) & ~0x1;
> +void *page = dev->msix_table_page;
> +
> +return pci_get_word(page + offset);
> +}
> +
> +static uint32_t msix_mmio_readb(void *opaque, target_phys_addr_t addr)
> +{
> +PCIDevice *dev = opaque;
> +unsigned int offset = addr & (MSIX_PAGE_SIZE - 1);
> +void *page = dev->msix_table_page;
> +
> +return pci_get_byte(page + offset);
>  }
>  
>  static uint8_t msix_pending_mask(int vector)
> @@ -198,7 +210,7 @@ static CPUWriteMemoryFunc * const msix_mmio_write[] = {
>  };
>  
>  static CPUReadMemoryFunc * const msix_mmio_read[] = {
> -msix_mmio_read_unallowed, msix_mmio_read_unallowed, msix_mmio_readl
> +msix_mmio_readb, msix_mmio_readw, msix_mmio_readl
>  };
>  
>  /* Should be called from device's map method. */
> -- 
> 1.7.2.1

Re: [Qemu-devel] [PATCH V6 04/15] Introduce -accel command option.

2010-11-15 Thread Alexander Graf


On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote:

> From: Anthony PERARD 
> 
> This option gives the ability to switch one "accelerator" like kvm, xen
> or the default one tcg. We can specify more than one accelerator by
> separate them by a comma. QEMU will try each one and use the first whose
> works.
> 
> So,
> 
> -accel xen,kvm,tcg
> 
> which would try Xen support first, then KVM and finaly tcg if none of
> the other works.
> 
> Signed-off-by: Anthony PERARD 
> ---
> qemu-options.hx |   10 ++
> vl.c|   86 ---
> 2 files changed, 85 insertions(+), 11 deletions(-)
> 
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 718d47a..ba8385b 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1925,6 +1925,16 @@ Enable KVM full virtualization support. This option is 
> only available
> if KVM support is enabled when compiling.
> ETEXI
> 
> +DEF("accel", HAS_ARG, QEMU_OPTION_accel, \
> +"-accel acceluse an accelerator (kvm,xen,tcg), default is tcg\n", 
> QEMU_ARCH_ALL)
> +STEXI
> +...@item -accel @var{accel}[,@var{accel}[,...]]
> +...@findex -accel
> +This is use to enable an accelerator, in kvm,xen,tcg.
> +By default, it use only tcg. If there a more than one accelerator
> +specified, the next one is used if the first don't work.
> +ETEXI
> +
> DEF("xen-domid", HAS_ARG, QEMU_OPTION_xen_domid,
> "-xen-domid id   specify xen guest domain id\n", QEMU_ARCH_ALL)
> DEF("xen-create", 0, QEMU_OPTION_xen_create,
> diff --git a/vl.c b/vl.c
> index df414ef..40a26ee 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1750,6 +1750,74 @@ static int debugcon_parse(const char *devname)
> return 0;
> }
> 
> +static struct {
> +const char *opt_name;
> +const char *name;
> +int (*available)(void);
> +int (*init)(int smp_cpus);
> +int *allowed;
> +} accel_list[] = {
> +{ "tcg", "tcg", NULL, NULL, NULL },
> +{ "kvm", "KVM", kvm_available, kvm_init, &kvm_allowed },

Thinking about this a bit more ...

kvm_available is a function pointer that gets #defined to (0) when we don't 
have KVM available. I can imagine some compiler might throw a warning on us for 
this one day.

Is there a valid reason not to do

static inline int kvm_enabled()
{
#ifdef CONFIG_KVM
return kvm_allowed;
#else
return 0;
#endif
}

That should compile into the exact same code but be valid for function pointers.

> +};
> +
> +static int accel_parse_init(const char *opts)
> +{
> +const char *p = opts;
> +char buf[10];
> +int i, ret;
> +bool accel_initalised = 0;
> +bool init_failed = 0;
> +
> +while (!accel_initalised && *p != '\0') {
> +if (*p == ',') {
> +p++;
> +}
> +p = get_opt_name(buf, sizeof (buf), p, ',');
> +for (i = 0; i < ARRAY_SIZE(accel_list); i++) {
> +if (strcmp(accel_list[i].opt_name, buf) == 0) {
> +if (accel_list[i].init) {

If you'd make the function pointers mandatory, you could also get rid of a lot 
of checks here. Just introduce a new small stub helper for tcg_init and 
tcg_allowed and always call the pointers.


I take back my Acked-by. Sorry, I guess I should have read things in more 
detail first O_o. I still believe that this patch can go in before the others, 
but I'd at least like to see some comments on the (0) pointer thing :).


Alex

Re: [Qemu-devel] [PATCH V6 05/15] xen: Add xen in -accel option.

2010-11-15 Thread Alexander Graf


On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote:

> From: Anthony PERARD 
> 
> This come with the initialisation of Xen.
> 
> Signed-off-by: Anthony PERARD 
> ---
> Makefile.target |5 +
> hw/xen.h|   10 ++
> vl.c|2 ++
> xen-all.c   |   25 +
> xen-stub.c  |   17 +
> 5 files changed, 59 insertions(+), 0 deletions(-)
> create mode 100644 xen-all.c
> create mode 100644 xen-stub.c
> 
> diff --git a/Makefile.target b/Makefile.target
> index ddabc8a..644cafa 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -2,6 +2,7 @@
> 
> GENERATED_HEADERS = config-target.h
> CONFIG_NO_KVM = $(if $(subst n,,$(CONFIG_KVM)),n,y)
> +CONFIG_NO_XEN = $(if $(subst n,,$(CONFIG_XEN)),n,y)
> 
> include ../config-host.mak
> include config-devices.mak
> @@ -185,6 +186,10 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
> # xen backend driver support
> obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
> 
> +# xen support
> +obj-$(CONFIG_XEN) += xen-all.o
> +obj-$(CONFIG_NO_XEN) += xen-stub.o
> +
> # xen full virtualized machine
> obj-i386-$(CONFIG_XEN) += xen_machine_fv.o
> 
> diff --git a/hw/xen.h b/hw/xen.h
> index 780dcf7..14bbb6e 100644
> --- a/hw/xen.h
> +++ b/hw/xen.h
> @@ -18,4 +18,14 @@ enum xen_mode {
> extern uint32_t xen_domid;
> extern enum xen_mode xen_mode;
> 
> +extern int xen_allowed;
> +
> +#if defined CONFIG_XEN
> +#define xen_enabled() (xen_allowed)
> +#else
> +#define xen_enabled() (0)
> +#endif

Please see my comment on patch 04. I don't think this preprocessor magic buys 
us anything over static inline function definitions. And with those we can get 
rid of a lot of pitfalls.


Alex

Re: [Qemu-devel] Add support for SystemTAP and DTrace tracing backends (v6)

2010-11-15 Thread Stefan Hajnoczi

On Fri, Nov 12, 2010 at 1:20 PM, Daniel P. Berrange  wrote:
> An update of the SystemTAP/DTrace patches from
>
>  http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00563.html
>
> The first patch contains the generic DTrace tracing backend support.
> The second patch contains additional pieces for SystemTAP to generate
> a tapset for each of the qemu-system-XXX binaries, to simplify life
> for admins wanting to use the tracing backend.
>
> Changed in v6:
>
>  - Fix handling of probes with no-args in DTrace provider
>  - Generate one tapset per target, for both system and user
>   emulators
>  - Re-write command line arg parsing in tracetool so it
>   doesn't assume a particular ordering of args
>  - Change '-s' to '--stap' in tracetool to avoid confusion
>  - Pass full binary name into tracetool

Both patches look good.  It would be nice to merge this.

Stefan

Re: [Qemu-devel] [PATCH V6 06/15] xen: Add the Xen platform pci device

2010-11-15 Thread Alexander Graf


On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote:

> From: Steven Smith 
> 
> Introduce a new emulated PCI device, specific to fully virtualized Xen
> guests.  The device is necessary for PV on HVM drivers to work.
> 
> Signed-off-by: Steven Smith 
> Signed-off-by: Anthony PERARD 
> Signed-off-by: Stefano Stabellini 
> ---
> Makefile.target |1 +
> hw/hw.h |3 +
> hw/pci_ids.h|2 +
> hw/xen_machine_fv.c |3 +
> hw/xen_platform.c   |  431 +++
> hw/xen_platform.h   |8 +
> 6 files changed, 448 insertions(+), 0 deletions(-)
> create mode 100644 hw/xen_platform.c
> create mode 100644 hw/xen_platform.h
> 
> diff --git a/Makefile.target b/Makefile.target
> index 644cafa..db84edb 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -192,6 +192,7 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o
> 
> # xen full virtualized machine
> obj-i386-$(CONFIG_XEN) += xen_machine_fv.o
> +obj-i386-$(CONFIG_XEN) += xen_platform.o
> 
> # USB layer
> obj-$(CONFIG_USB_OHCI) += usb-ohci.o
> diff --git a/hw/hw.h b/hw/hw.h
> index 4405092..67f3369 100644
> --- a/hw/hw.h
> +++ b/hw/hw.h
> @@ -653,6 +653,9 @@ extern const VMStateDescription vmstate_i2c_slave;
> #define VMSTATE_INT32_LE(_f, _s)   \
> VMSTATE_SINGLE(_f, _s, 0, vmstate_info_int32_le, int32_t)
> 
> +#define VMSTATE_UINT8_TEST(_f, _s, _t)   \
> +VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint8, uint8_t)
> +
> #define VMSTATE_UINT16_TEST(_f, _s, _t)   \
> VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint16, uint16_t)
> 
> diff --git a/hw/pci_ids.h b/hw/pci_ids.h
> index 39e9f1d..1f2e0dd 100644
> --- a/hw/pci_ids.h
> +++ b/hw/pci_ids.h
> @@ -105,3 +105,5 @@
> #define PCI_DEVICE_ID_INTEL_82371AB  0x7111
> #define PCI_DEVICE_ID_INTEL_82371AB_20x7112
> #define PCI_DEVICE_ID_INTEL_82371AB_30x7113
> +
> +#define PCI_VENDOR_ID_XENSOURCE  0x5853
> diff --git a/hw/xen_machine_fv.c b/hw/xen_machine_fv.c
> index 260cda3..39ee7c7 100644
> --- a/hw/xen_machine_fv.c
> +++ b/hw/xen_machine_fv.c
> @@ -35,6 +35,7 @@
> 
> #include "xen_common.h"
> #include "xen/hvm/hvm_info_table.h"
> +#include "xen_platform.h"
> 
> #define MAX_IDE_BUS 2
> 
> @@ -88,6 +89,8 @@ static void xen_init_fv(ram_addr_t ram_size,
> 
> pc_vga_init(pci_bus);
> 
> +pci_xen_platform_init(pci_bus);
> +
> /* init basic PC hardware */
> pc_basic_device_init(isa_irq, &floppy_controller, &rtc_state);
> 
> diff --git a/hw/xen_platform.c b/hw/xen_platform.c
> new file mode 100644
> index 000..7551c81
> --- /dev/null
> +++ b/hw/xen_platform.c
> @@ -0,0 +1,431 @@
> +/*
> + * XEN platform pci device, formerly known as the event channel device
> + *
> + * Copyright (c) 2003-2004 Intel Corp.
> + * Copyright (c) 2006 XenSource
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> + * of this software and associated documentation files (the "Software"), to 
> deal
> + * in the Software without restriction, including without limitation the 
> rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "hw.h"
> +#include "pc.h"
> +#include "pci.h"
> +#include "irq.h"
> +#include "xen_common.h"
> +#include "net.h"
> +#include "xen_platform.h"
> +#include "xen_backend.h"
> +#include "qemu-log.h"
> +#include "rwhandler.h"
> +
> +#include 
> +#include 
> +
> +//#define DEBUG_PLATFORM
> +
> +#ifdef DEBUG_PLATFORM
> +#define DPRINTF(fmt, ...) do { \
> +fprintf(stderr, "xen_platform: " fmt, ## __VA_ARGS__); \
> +} while (0)
> +#else
> +#define DPRINTF(fmt, ...) do { } while (0)
> +#endif
> +
> +#define PFFLAG_ROM_LOCK 1 /* Sets whether ROM memory area is RW or RO */
> +
> +typedef struct PCIXenPlatformState {
> +PCIDevice  pci_dev;
> +uint8_t flags; /* used only for version_id == 2 */
> +int drivers_blacklisted;
> +uint16_t driver_product_version;
> +
> +/* Log from guest drivers */
> +int throttling_disabled;
> +char log_buffer[4096];
> +int log_buffer_off;
> +} PCIXenPlatformState;
> +
> +#define XEN_P

Re: [Qemu-devel] [PATCH V6 03/15] xen: Add xen_machine_fv

2010-11-15 Thread Kevin Wolf

Am 15.11.2010 11:35, schrieb Alexander Graf:
> 
> On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote:
> 
>> From: Anthony PERARD 
>>
>> Add the Xen FV (Fully Virtualized) machine to Qemu;
>> this is groundwork to add Xen device model support in Qemu.
>>
>> Signed-off-by: Anthony PERARD 
>> Signed-off-by: Stefano Stabellini 
>> ---
>> Makefile.target |3 +
>> hw/xen_common.h |5 ++
>> hw/xen_machine_fv.c |  158 
>> +++
>> 3 files changed, 166 insertions(+), 0 deletions(-)
>> create mode 100644 hw/xen_machine_fv.c
>>
>> diff --git a/Makefile.target b/Makefile.target
>> index c48cbcc..ddabc8a 100644
>> --- a/Makefile.target
>> +++ b/Makefile.target
>> @@ -185,6 +185,9 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
>> # xen backend driver support
>> obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
>>
>> +# xen full virtualized machine
>> +obj-i386-$(CONFIG_XEN) += xen_machine_fv.o
>> +
>> # USB layer
>> obj-$(CONFIG_USB_OHCI) += usb-ohci.o
>>
>> diff --git a/hw/xen_common.h b/hw/xen_common.h
>> index 9f75e52..4c0f97d 100644
>> --- a/hw/xen_common.h
>> +++ b/hw/xen_common.h
>> @@ -18,6 +18,11 @@
>>  * We don't support Xen prior to 3.3.0.
>>  */
>>
>> +/* Before Xen 4.0.0 */
>> +#if CONFIG_XEN_CTRL_INTERFACE_VERSION < 400
>> +#  define HVM_MAX_VCPUS 32
>> +#endif
>> +
>> /* Xen unstable */
>> #if CONFIG_XEN_CTRL_INTERFACE_VERSION < 410
>> typedef int qemu_xc_interface;
>> diff --git a/hw/xen_machine_fv.c b/hw/xen_machine_fv.c
>> new file mode 100644
>> index 000..260cda3
>> --- /dev/null
>> +++ b/hw/xen_machine_fv.c
>> @@ -0,0 +1,158 @@
>> +/*
>> + * QEMU Xen FV Machine
>> + *
>> + * Copyright (c) 2003-2007 Fabrice Bellard
>> + * Copyright (c) 2007 Red Hat
> 
> Shouldn't there be Citrix in there?
> 
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a 
>> copy
>> + * of this software and associated documentation files (the "Software"), to 
>> deal
>> + * in the Software without restriction, including without limitation the 
>> rights
>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>> + * copies of the Software, and to permit persons to whom the Software is
>> + * furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included 
>> in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
>> FROM,
>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>> + * THE SOFTWARE.
> 
> So all this code has always been public domain? There are no GPL'ed pieces in 
> there?

Public domain is something entirely different. This is just a permissive
license, and it's used a lot throughout qemu. So as you state that this
file is mostly a copy of pc.c which uses this license, at least for the
copied part it's not only okay, but even required to use the same
license here.

Kevin

Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify

2010-11-15 Thread Stefan Hajnoczi

On Sun, Nov 14, 2010 at 12:19 PM, Avi Kivity  wrote:
> On 11/14/2010 01:05 PM, Avi Kivity wrote:
>>>
>>> I agree, but let's enable virtio-ioeventfd carefully because bad code
>>> is out there.
>>
>>
>> Sure.  Note as long as the thread waiting on ioeventfd doesn't consume too
>> much cpu, it will awaken quickly and we won't have the "transaction per
>> timeslice" effect.
>>
>> btw, what about virtio-blk with linux-aio?  Have you benchmarked that with
>> and without ioeventfd?
>>
>
> And, what about efficiency?  As in bits/cycle?

We are running benchmarks with this latest patch and will report results.

Stefan

[Qemu-devel] [PATCH] device-assignment: register a reset function

2010-11-15 Thread Bernhard Kohl

This is necessary because during reboot of a VM the assigned devices
continue DMA transfers which causes memory corruption.

Signed-off-by: Thomas Ostler 
Signed-off-by: Bernhard Kohl 
---
Sorry for for the long delay. Finally we added Alex' suggestions
and rebased the patch.

Thanks
Bernhard
---
 hw/device-assignment.c |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index 5f5bde1..3f8de66 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -1434,6 +1434,17 @@ static void 
assigned_dev_unregister_msix_mmio(AssignedDevice *dev)
 dev->msix_table_page = NULL;
 }
 
+static void reset_assigned_device(DeviceState *dev)
+{
+PCIDevice *d = DO_UPCAST(PCIDevice, qdev, dev);
+uint32_t conf;
+
+/* reset the bus master bit to avoid further DMA transfers */
+conf = assigned_dev_pci_read_config(d, PCI_COMMAND, 2);
+conf &= ~PCI_COMMAND_MASTER;
+assigned_dev_pci_write_config(d, PCI_COMMAND, conf, 2);
+}
+
 static int assigned_initfn(struct PCIDevice *pci_dev)
 {
 AssignedDevice *dev = DO_UPCAST(AssignedDevice, dev, pci_dev);
@@ -1544,6 +1555,7 @@ static PCIDeviceInfo assign_info = {
 .qdev.name= "pci-assign",
 .qdev.desc= "pass through host pci devices to the guest",
 .qdev.size= sizeof(AssignedDevice),
+.qdev.reset   = reset_assigned_device,
 .init = assigned_initfn,
 .exit = assigned_exitfn,
 .config_read  = assigned_dev_pci_read_config,
-- 
1.7.2.3

Re: [Qemu-devel] [PATCH V6 09/15] xen: Introduce the Xen mapcache

2010-11-15 Thread Alexander Graf


On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote:

> From: Jun Nakajima 
> 
> On IA32 host or IA32 PAE host, at present, generally, we can't create
> an HVM guest with more than 2G memory, because generally it's almost
> impossible for Qemu to find a large enough and consecutive virtual
> address space to map an HVM guest's whole physical address space.
> The attached patch fixes this issue using dynamic mapping based on
> little blocks of memory.
> 
> Each call to qemu_get_ram_ptr makes a call to qemu_map_cache with the
> lock option, so mapcache will not unmap these ram_ptr.
> 
> Signed-off-by: Jun Nakajima 
> Signed-off-by: Anthony PERARD 
> Signed-off-by: Stefano Stabellini 
> ---
> Makefile.target |3 +
> configure   |3 +
> exec.c  |   40 ++-
> hw/xen.h|   10 ++
> hw/xen_common.h |2 +
> xen-all.c   |   64 +++
> xen-mapcache-stub.c |   33 ++
> xen-mapcache.c  |  301 +++
> xen-mapcache.h  |   14 +++
> xen-stub.c  |4 +
> 10 files changed, 470 insertions(+), 4 deletions(-)
> create mode 100644 xen-mapcache-stub.c
> create mode 100644 xen-mapcache.c
> create mode 100644 xen-mapcache.h
> 
> diff --git a/Makefile.target b/Makefile.target
> index db84edb..5646582 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -187,8 +187,11 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
> obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
> 
> # xen support
> +CONFIG_NO_XEN_MAPCACHE = $(if $(subst n,,$(CONFIG_XEN_MAPCACHE)),n,y)
> obj-$(CONFIG_XEN) += xen-all.o
> obj-$(CONFIG_NO_XEN) += xen-stub.o
> +obj-$(CONFIG_XEN_MAPCACHE) += xen-mapcache.o
> +obj-$(CONFIG_NO_XEN_MAPCACHE) += xen-mapcache-stub.o
> 
> # xen full virtualized machine
> obj-i386-$(CONFIG_XEN) += xen_machine_fv.o
> diff --git a/configure b/configure
> index f6a7073..d5a8553 100755
> --- a/configure
> +++ b/configure
> @@ -2943,6 +2943,9 @@ case "$target_arch2" in
>   i386|x86_64)
> if test "$xen" = "yes" -a "$target_softmmu" = "yes" ; then
>   echo "CONFIG_XEN=y" >> $config_target_mak
> +  if test "$cpu" = "i386" -o "$cpu" = "x86_64"; then
> +  echo "CONFIG_XEN_MAPCACHE=y" >> $config_target_mak
> +  fi
> fi
> esac
> case "$target_arch2" in
> diff --git a/exec.c b/exec.c
> index 631d8c5..d2cded6 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -39,6 +39,7 @@
> #include "hw/qdev.h"
> #include "osdep.h"
> #include "kvm.h"
> +#include "hw/xen.h"
> #include "qemu-timer.h"
> #if defined(CONFIG_USER_ONLY)
> #include 
> @@ -58,6 +59,8 @@
> #include 
> #endif
> #endif
> +#else /* !CONFIG_USER_ONLY */
> +#include "xen-mapcache.h"
> #endif
> 
> //#define DEBUG_TB_INVALIDATE
> @@ -2834,6 +2837,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
> const char *name,
> }
> }
> 
> +new_block->offset = find_ram_offset(size);
> if (host) {
> new_block->host = host;
> } else {
> @@ -2855,13 +2859,15 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, 
> const char *name,
>PROT_EXEC|PROT_READ|PROT_WRITE,
>MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> #else
> -new_block->host = qemu_vmalloc(size);
> +if (xen_mapcache_enabled()) {
> +xen_ram_alloc(new_block->offset, size);
> +} else {
> +new_block->host = qemu_vmalloc(size);
> +}
> #endif
> qemu_madvise(new_block->host, size, QEMU_MADV_MERGEABLE);
> }
> }
> -
> -new_block->offset = find_ram_offset(size);
> new_block->length = size;
> 
> QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
> @@ -2902,7 +2908,11 @@ void qemu_ram_free(ram_addr_t addr)
> #if defined(TARGET_S390X) && defined(CONFIG_KVM)
> munmap(block->host, block->length);
> #else
> -qemu_vfree(block->host);
> +if (xen_mapcache_enabled()) {
> +qemu_invalidate_entry(block->host);
> +} else {
> +qemu_vfree(block->host);
> +}
> #endif
> }
> qemu_free(block);
> @@ -2928,6 +2938,15 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
> if (addr - block->offset < block->length) {
> QLIST_REMOVE(block, next);
> QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
> +if (xen_mapcache_enabled()) {
> +/* We need to check if the requested address is in the RAM
> + * because we don't want to map the entire memory in QEMU.
> + */
> +if (block->offset == 0) {
> +return qemu_map_cache(addr, 0, 1);
> +}
> +block->host = qemu_map_cache(block->offset, block->length, 
> 1);
> +}
> return block->host + (addr - block->offset);
> }
> }
> @@ -2944,11 +2963,21 @@ int qemu_ram

[Qemu-devel] [PATCH] make-release: fix mtime for a wider range of git versions

2010-11-15 Thread Bernhard Kohl

With the latest git versions, e.g. 1.7.2.3, git still prints out
the tag info in addition to the requested format. So let's simply
fetch the first line from the output.

In addition I use the --pretty option instead of --format which
is not recognized in very old git versions, e.g. 1.5.5.6.

Tested with git versions 1.5.5.6 and 1.7.2.3.

Signed-off-by: Bernhard Kohl 
---
 kvm/scripts/make-release |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kvm/scripts/make-release b/kvm/scripts/make-release
index 56302c3..2d050fc 100755
--- a/kvm/scripts/make-release
+++ b/kvm/scripts/make-release
@@ -51,7 +51,7 @@ cd "$(dirname "$0")"/../..
 mkdir -p "$(dirname "$tarball")"
 git archive --prefix="$name/" --format=tar "$commit" > "$tarball"
 
-mtime=`git show --format=%ct "$commit""^{commit}" --`
+mtime=`git show --pretty=format:%ct "$commit""^{commit}" -- | head -n 1`
 tarargs="--owner=root --group=root"
 
 mkdir -p "$tmpdir/$name"
-- 
1.7.2.3

Re: [Qemu-devel] How to make shadow memory for a process? and how to trace the data propation from the instruction level in QEMU?

2010-11-15 Thread Lluís

Mulyadi Santosa writes:

>> Yes, I have read that paper, it’s wonderful!
>> 
>> Besides the Argos, the bitblaze group, led by Dawn Song in Berkeley, has
>> achieved great success in the taint analysis. The website about their
>> dynamic analysis work (called TEMU) can be found at:
>> http://bitblaze.cs.berkeley.edu/temu.html
>> 
>> And TEMU is now open-source.

> Thanks for sharing that...it's new stuff for me. So, why don't you
> just pick TEMU and improve it instead of...uhm...sorry if I am wrong,
> working from scratch? After all, I believe in both Argos and TEMU (and
> maybe other similar projects), they share common codes here and there.

> But ehm...CMIIW, seems like TEMU is based on Qemu 0.9,x, right? So
> it's sorry I forgot the name, the generated code is mostly a
> constructed by fragments of small codes generated by gcc. Now, it is
> qemu which does it by itself. So, a lot of things change
> (substantially).

I haven't read the TEMU work, but from the problem description I think
you want something similar to "Practical Taint-Based Protection using
Demand Emulation" or many others (I remember reading some of them a few
years ago on the ISCA, MICRO and/or ASPLOS conferences).

>> Yes. For each process’s memory space A, I wanna make a shadow memory B. The
>> shadow memory is used to store the tag of data. In other words, if addr in
>> memory A is tainted, then the corresponding byte in B should be marked to
>> indicate that addr in A is tainted.

The main question here is... what is the granularity that you want to
track with? Bytes? Words? Pages? This will greatly influence which is
your best approach.

Now that I think of it, you could use the tracing points I sent for
guest virtual memory accesses, and instrument them instead of calling a
file-tracing backend (this should provide a hook for an arbitrary
granularity). Then, simply keep track also of address-space changes and
your instrumentation code can always know when to activate propagation.

This, together with the optimization I sent for dynamic control of trace
generation in TCG emulation code should get you on tracks.

Of course, you should still modify all register-accessing instructions
to propagate information passing through the register set. For that,
maybe you could start with the "fetch" tracing/instrumentation point I
sent long time ago, which keeps track of general-purpose register
usage/definition on x86 (although I'm sure I left some astray usages due
to the decoding complexity in x86).

>> The guest os collects “higher” semantic
>> from the OS level, and the QEMU collects “lower” semantic from the
>> instruction level. Combination of both semantics is necessary in the
>> analysis process.

> The question is, in a situation where malware already compromise "the
> higher semantic", could we trust the analysis?

Beware, I've read exactly this kind of scheme on previous top-tier
conferences (but I think tests were using an architectural simulator, so
it's not for a current production environment).

I've found it :)

 Secure program execution via dynamic information flow tracking
 ASPLOS 2004

>> The question is: how to communicate between the QEMU and the guest OS, so
>> that they can cooperate with each other?

A few choices here, but you should first define if the communication
must be based just on control signals, and/or providing memory storage:
  * virtual device : If you need some kind of storage that the guest OS
must access, you could look at the ivshmem device
  * backdoor instruction : It's the simplest option; I sent some patch
series recently with two different implementations for x86.

Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth

[Qemu-devel] [PATCH] audio: disable timer when idle.

2010-11-15 Thread Gerd Hoffmann

This patch makes sure the audio timer gets disabled when no voice is
active.  It adds a variable to track whenever the timer is needed or
not to the global audio state.  audio_timer_reset() will set that
variable.  audio_timer() will check that variable and only re-arm
the timer when needed.  One additional call to audio_timer_reset() has
been added to the output voice disable code path.

Signed-off-by: Gerd Hoffmann 
---
 audio/audio.c |6 +-
 audio/audio_int.h |1 +
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/audio/audio.c b/audio/audio.c
index ade342e..9cf91ae 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -1101,7 +1101,8 @@ static void audio_timer (void *opaque)
 AudioState *s = opaque;
 
 audio_run ("timer");
-qemu_mod_timer (s->ts, qemu_get_clock (vm_clock) + conf.period.ticks);
+if (s->need_timer)
+qemu_mod_timer (s->ts, qemu_get_clock (vm_clock) + conf.period.ticks);
 }
 
 
@@ -1124,9 +1125,11 @@ static void audio_reset_timer (void)
 AudioState *s = &glob_audio_state;
 
 if (audio_is_timer_needed ()) {
+s->need_timer = 1;
 qemu_mod_timer (s->ts, qemu_get_clock (vm_clock) + 1);
 }
 else {
+s->need_timer = 0;
 qemu_del_timer (s->ts);
 }
 }
@@ -1380,6 +1383,7 @@ static void audio_run_out (AudioState *s)
 sc->sw.active = 0;
 audio_recalc_and_notify_capture (sc->cap);
 }
+audio_reset_timer ();
 continue;
 }
 
diff --git a/audio/audio_int.h b/audio/audio_int.h
index d66f2c3..df060a9 100644
--- a/audio/audio_int.h
+++ b/audio/audio_int.h
@@ -190,6 +190,7 @@ struct AudioState {
 void *drv_opaque;
 
 QEMUTimer *ts;
+int need_timer;
 QLIST_HEAD (card_listhead, QEMUSoundCard) card_head;
 QLIST_HEAD (hw_in_listhead, HWVoiceIn) hw_head_in;
 QLIST_HEAD (hw_out_listhead, HWVoiceOut) hw_head_out;
-- 
1.7.1

Re: [Qemu-devel] [PATCH] ARM: semi-hosting support for stderr

2010-11-15 Thread Christophe Lyon

On 14.11.2010 18:50, Peter Maydell wrote:
> Although newlib/libgloss use different modes for opening stdout
> and stderr, the ARM C library implementation does not, for
> instance; so your patch is relying on a detail of implementation of
> a particular semihosting user. So I don't think we can apply this
> patch, because it could break other (nonlibgloss) users of
> semihosting.
> 

Thanks for your answer.

I think we could argue it's a bug in the ARM semihosting spec, so maybe
we could request them to change this. However, I am unsure how to do
this efficiently, as I recently tried with little success to have them
change the "angel_SWIreason_ReportException" such that at least
ADP_Stopped_AplicationExit can have the program return code as an extra
parameter... (anyway I have another small Qemu patch to handle this
situation by returning the contents of r0 when the program entered
exit(). I can submit it if someone is willing to review it).

Christophe.

Re: [Qemu-devel] [PATCH V6 04/15] Introduce -accel command option.

2010-11-15 Thread Paolo Bonzini


On 11/15/2010 11:46 AM, Alexander Graf wrote:

I take back my Acked-by. Sorry, I guess I should have read things in
more detail first O_o. I still believe that this patch can go in
before the others, but I'd at least like to see some comments on the
(0) pointer thing:).


I agree, it kind of works by chance, (0) works but (1) wouldn't.  Better 
to define the function always.


Paolo

[Qemu-devel] Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-15 Thread Kevin O'Connor

On Mon, Nov 15, 2010 at 09:40:08AM +0200, Gleb Natapov wrote:
> On Sun, Nov 14, 2010 at 10:40:33PM -0500, Kevin O'Connor wrote:
> > Why not just return a newline separated list that is null terminated?
> > 
> Doing it like this will needlessly complicate firmware side. How do you
> know how much memory to allocate before reading device list?

My preference would be for the size to be exposed via the
QEMU_CFG_FILE_DIR selector.  (My preference would be for all objects
in fw_cfg to have entries in QEMU_CFG_FILE_DIR describing their size
in a reliable manner.)

-Kevin

[Qemu-devel] Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-15 Thread Gleb Natapov

On Mon, Nov 15, 2010 at 08:26:35AM -0500, Kevin O'Connor wrote:
> On Mon, Nov 15, 2010 at 09:40:08AM +0200, Gleb Natapov wrote:
> > On Sun, Nov 14, 2010 at 10:40:33PM -0500, Kevin O'Connor wrote:
> > > Why not just return a newline separated list that is null terminated?
> > > 
> > Doing it like this will needlessly complicate firmware side. How do you
> > know how much memory to allocate before reading device list?
> 
> My preference would be for the size to be exposed via the
> QEMU_CFG_FILE_DIR selector.  (My preference would be for all objects
> in fw_cfg to have entries in QEMU_CFG_FILE_DIR describing their size
> in a reliable manner.)
> 
Will interface suggested by Blue will be good for you? The one with two
fw_cfg ids. BOOTINDEX_LEN for len and BOOTINDEX_DATA for device list. I
already changed my implementation to this one. Using FILE_DIR requires
us to generate synthetic name. Hmm BTW I do not see proper endianness
handling in FILE_DIR.

--
Gleb.

[Qemu-devel] Re: [PATCHv4 15/15] Pass boot device list to firmware.

2010-11-15 Thread Gleb Natapov

On Mon, Nov 15, 2010 at 03:36:25PM +0200, Gleb Natapov wrote:
>   Hmm BTW I do not see proper endianness
> handling in FILE_DIR.
> 
That's just me. Everything it OK there with endianness.

--
Gleb.

Re: [Qemu-devel] [PATCH V6 02/15] xen: Support new libxc calls from xen unstable.

2010-11-15 Thread Stefano Stabellini

On Mon, 15 Nov 2010, Alexander Graf wrote:
> > /*
> > - * tweaks needed to build with different xen versions
> > - *  0x00030205 -> 3.1.0
> > - *  0x00030207 -> 3.2.0
> > - *  0x00030208 -> unstable
> > + * We don't support Xen prior to 3.3.0.
> >  */
> > -#include 
> > -#if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030205
> > -# define evtchn_port_or_error_t int
> > -#endif
> > -#if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030207
> > -# define xc_map_foreign_pages xc_map_foreign_batch
> > -#endif
> > -#if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030208
> > -# define xen_mb()  mb()
> > -# define xen_rmb() rmb()
> > -# define xen_wmb() wmb()
> > +
> > +/* Xen unstable */
> > +#if CONFIG_XEN_CTRL_INTERFACE_VERSION < 410
> > +typedef int qemu_xc_interface;
> > +#  define XC_HANDLER_INITIAL_VALUE   -1
> > +#  define xc_fd(xen_xc)  xen_xc
> > +#  define xc_interface_open(l, dl, f)xc_interface_open()
> > +#  define xc_gnttab_open(xc) xc_gnttab_open()
> > +#  define xc_gnttab_map_grant_ref(xc, gnt, domid, ref, flags) \
> > +xc_gnttab_map_grant_ref(gnt, domid, ref, flags)
> > +#  define xc_gnttab_map_grant_refs(xc, gnt, count, domids, refs, flags) \
> > +xc_gnttab_map_grant_refs(gnt, count, domids, refs, flags)
> > +#  define xc_gnttab_munmap(xc, gnt, pages, niov) xc_gnttab_munmap(gnt, 
> > pages, niov)
> > +#  define xc_gnttab_close(xc, dev)   xc_gnttab_close(dev)
> 
> All these defines are very icky. In my patchset I replace all those libxc 
> calls by indirect calls through a struct anyways. Maybe it'd make sense to 
> integrate that part into your series already.
> 
> That way you could make this whole compat stuff a lot cleaner. You could use 
> static inline functions to create wrappers to the real calls and just assign 
> those as callbacks in the dispatch struct.
> 
> In general, I'd disagree to the preprocessor approach here. If you're not 
> going with the libxc dispatch struct patch, I'd at least like to see all 
> those converted to static inline functions called with different names 
> internally.
> 
 
Yeah, I am not a fan of preprocessor magic as well, it is probably worth
investigating the indirect calls approach, surely would be easier to
read.

Re: [Qemu-devel] [PATCH V6 03/15] xen: Add xen_machine_fv

2010-11-15 Thread Stefano Stabellini

On Mon, 15 Nov 2010, Kevin Wolf wrote:
> >> + * Permission is hereby granted, free of charge, to any person obtaining 
> >> a copy
> >> + * of this software and associated documentation files (the "Software"), 
> >> to deal
> >> + * in the Software without restriction, including without limitation the 
> >> rights
> >> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or 
> >> sell
> >> + * copies of the Software, and to permit persons to whom the Software is
> >> + * furnished to do so, subject to the following conditions:
> >> + *
> >> + * The above copyright notice and this permission notice shall be 
> >> included in
> >> + * all copies or substantial portions of the Software.
> >> + *
> >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> >> EXPRESS OR
> >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> >> MERCHANTABILITY,
> >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
> >> OTHER
> >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
> >> ARISING FROM,
> >> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS 
> >> IN
> >> + * THE SOFTWARE.
> > 
> > So all this code has always been public domain? There are no GPL'ed pieces 
> > in there?
> 
> Public domain is something entirely different. This is just a permissive
> license, and it's used a lot throughout qemu. So as you state that this
> file is mostly a copy of pc.c which uses this license, at least for the
> copied part it's not only okay, but even required to use the same
> license here.
 
Indeed, the license above is BSD because the original was BSD, it would
be rude to GPL it.

Re: [Qemu-devel] [PATCH] make trace options use autoconfy names

2010-11-15 Thread Stefan Hajnoczi

On Sun, Nov 14, 2010 at 1:52 PM, Paolo Bonzini  wrote:
> On 11/14/2010 02:38 PM, Andreas Färber wrote:
>>>
>>> - --trace-file=*) trace_file="$optarg"
>>> + --enable-trace-file=*) trace_file="$optarg"
>>> ;;
>>
>> but this should be --with-trace-file=... please. It is not being
>> enabled, just set to a different value.
>
> --with-* should be reserved for library paths, but I can change it if people
> prefer it that way.

Actually I think we have something similar to overriding --prefix here
>:).  It's a path that you can set at ./configure time.

So is it not okay to use --trace-file=?  But I know nothing
of autoconf and --enable-* or --with-* sort of make sense too.

Stefan

Re: [Qemu-devel] [PATCH V6 04/15] Introduce -accel command option.

2010-11-15 Thread Anthony PERARD

On Mon, 15 Nov 2010, Alexander Graf wrote:

>
> On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote:
>
> > From: Anthony PERARD 
> >
> > This option gives the ability to switch one "accelerator" like kvm, xen
> > or the default one tcg. We can specify more than one accelerator by
> > separate them by a comma. QEMU will try each one and use the first whose
> > works.
> >
> > So,
> >
> > -accel xen,kvm,tcg
> >
> > which would try Xen support first, then KVM and finaly tcg if none of
> > the other works.
> >
> > Signed-off-by: Anthony PERARD 
> > ---
> > qemu-options.hx |   10 ++
> > vl.c|   86 
> > ---
> > 2 files changed, 85 insertions(+), 11 deletions(-)
> >
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index 718d47a..ba8385b 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -1925,6 +1925,16 @@ Enable KVM full virtualization support. This option 
> > is only available
> > if KVM support is enabled when compiling.
> > ETEXI
> >
> > +DEF("accel", HAS_ARG, QEMU_OPTION_accel, \
> > +"-accel acceluse an accelerator (kvm,xen,tcg), default is tcg\n", 
> > QEMU_ARCH_ALL)
> > +STEXI
> > +...@item -accel @var{accel}[,@var{accel}[,...]]
> > +...@findex -accel
> > +This is use to enable an accelerator, in kvm,xen,tcg.
> > +By default, it use only tcg. If there a more than one accelerator
> > +specified, the next one is used if the first don't work.
> > +ETEXI
> > +
> > DEF("xen-domid", HAS_ARG, QEMU_OPTION_xen_domid,
> > "-xen-domid id   specify xen guest domain id\n", QEMU_ARCH_ALL)
> > DEF("xen-create", 0, QEMU_OPTION_xen_create,
> > diff --git a/vl.c b/vl.c
> > index df414ef..40a26ee 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -1750,6 +1750,74 @@ static int debugcon_parse(const char *devname)
> > return 0;
> > }
> >
> > +static struct {
> > +const char *opt_name;
> > +const char *name;
> > +int (*available)(void);
> > +int (*init)(int smp_cpus);
> > +int *allowed;
> > +} accel_list[] = {
> > +{ "tcg", "tcg", NULL, NULL, NULL },
> > +{ "kvm", "KVM", kvm_available, kvm_init, &kvm_allowed },
>
> Thinking about this a bit more ...
>
> kvm_available is a function pointer that gets #defined to (0) when we don't 
> have KVM available. I can imagine some compiler might throw a warning on us 
> for this one day.
>
> Is there a valid reason not to do
>
> static inline int kvm_enabled()
> {
> #ifdef CONFIG_KVM
> return kvm_allowed;
> #else
> return 0;
> #endif
> }
>
> That should compile into the exact same code but be valid for function 
> pointers.

I will do this change, as well as for the two others patches.

> > +};
> > +
> > +static int accel_parse_init(const char *opts)
> > +{
> > +const char *p = opts;
> > +char buf[10];
> > +int i, ret;
> > +bool accel_initalised = 0;
> > +bool init_failed = 0;
> > +
> > +while (!accel_initalised && *p != '\0') {
> > +if (*p == ',') {
> > +p++;
> > +}
> > +p = get_opt_name(buf, sizeof (buf), p, ',');
> > +for (i = 0; i < ARRAY_SIZE(accel_list); i++) {
> > +if (strcmp(accel_list[i].opt_name, buf) == 0) {
> > +if (accel_list[i].init) {
>
> If you'd make the function pointers mandatory, you could also get rid of a 
> lot of checks here. Just introduce a new small stub helper for tcg_init and 
> tcg_allowed and always call the pointers.

Yes, this will make the code more readable.

> I take back my Acked-by. Sorry, I guess I should have read things in more 
> detail first O_o. I still believe that this patch can go in before the 
> others, but I'd at least like to see some comments on the (0) pointer thing 
> :).

And I will send the patch out of the patch series.

Thanks,

-- 
Anthony PERARD

[Qemu-devel] [PATCHv5 06/15] Add get_fw_dev_path callback to IDE bus.

2010-11-15 Thread Gleb Natapov


Signed-off-by: Gleb Natapov 
---
 hw/ide/qdev.c |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 88ff657..01a181b 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -24,9 +24,12 @@
 
 /* - */
 
+static char *idebus_get_fw_dev_path(DeviceState *dev);
+
 static struct BusInfo ide_bus_info = {
 .name  = "IDE",
 .size  = sizeof(IDEBus),
+.get_fw_dev_path = idebus_get_fw_dev_path,
 };
 
 void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id)
@@ -35,6 +38,16 @@ void ide_bus_new(IDEBus *idebus, DeviceState *dev, int 
bus_id)
 idebus->bus_id = bus_id;
 }
 
+static char *idebus_get_fw_dev_path(DeviceState *dev)
+{
+char path[30];
+
+snprintf(path, sizeof(path), "%...@%d", qdev_fw_name(dev),
+ ((IDEBus*)dev->parent_bus)->bus_id);
+
+return strdup(path);
+}
+
 static int ide_qdev_init(DeviceState *qdev, DeviceInfo *base)
 {
 IDEDevice *dev = DO_UPCAST(IDEDevice, qdev, qdev);
-- 
1.7.1

[Qemu-devel] [PATCHv5 07/15] Add get_dev_path callback for system bus.

2010-11-15 Thread Gleb Natapov

Prints out mmio or pio used to access child device.

Signed-off-by: Gleb Natapov 
---
 hw/pci_host.c |2 ++
 hw/sysbus.c   |   30 ++
 hw/sysbus.h   |4 
 3 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/hw/pci_host.c b/hw/pci_host.c
index bc5b771..28d45bf 100644
--- a/hw/pci_host.c
+++ b/hw/pci_host.c
@@ -197,6 +197,7 @@ void pci_host_conf_register_ioport(pio_addr_t ioport, 
PCIHostState *s)
 {
 pci_host_init(s);
 register_ioport_simple(&s->conf_noswap_handler, ioport, 4, 4);
+sysbus_init_ioports(&s->busdev, ioport, 4);
 }
 
 int pci_host_data_register_mmio(PCIHostState *s, int swap)
@@ -215,4 +216,5 @@ void pci_host_data_register_ioport(pio_addr_t ioport, 
PCIHostState *s)
 register_ioport_simple(&s->data_noswap_handler, ioport, 4, 1);
 register_ioport_simple(&s->data_noswap_handler, ioport, 4, 2);
 register_ioport_simple(&s->data_noswap_handler, ioport, 4, 4);
+sysbus_init_ioports(&s->busdev, ioport, 4);
 }
diff --git a/hw/sysbus.c b/hw/sysbus.c
index d817721..1583bd8 100644
--- a/hw/sysbus.c
+++ b/hw/sysbus.c
@@ -22,11 +22,13 @@
 #include "monitor.h"
 
 static void sysbus_dev_print(Monitor *mon, DeviceState *dev, int indent);
+static char *sysbus_get_fw_dev_path(DeviceState *dev);
 
 struct BusInfo system_bus_info = {
 .name   = "System",
 .size   = sizeof(BusState),
 .print_dev  = sysbus_dev_print,
+.get_fw_dev_path = sysbus_get_fw_dev_path,
 };
 
 void sysbus_connect_irq(SysBusDevice *dev, int n, qemu_irq irq)
@@ -106,6 +108,16 @@ void sysbus_init_mmio_cb(SysBusDevice *dev, 
target_phys_addr_t size,
 dev->mmio[n].cb = cb;
 }
 
+void sysbus_init_ioports(SysBusDevice *dev, pio_addr_t ioport, pio_addr_t size)
+{
+pio_addr_t i;
+
+for (i = 0; i < size; i++) {
+assert(dev->num_pio < QDEV_MAX_PIO);
+dev->pio[dev->num_pio++] = ioport++;
+}
+}
+
 static int sysbus_device_init(DeviceState *dev, DeviceInfo *base)
 {
 SysBusDeviceInfo *info = container_of(base, SysBusDeviceInfo, qdev);
@@ -171,3 +183,21 @@ static void sysbus_dev_print(Monitor *mon, DeviceState 
*dev, int indent)
indent, "", s->mmio[i].addr, s->mmio[i].size);
 }
 }
+
+static char *sysbus_get_fw_dev_path(DeviceState *dev)
+{
+SysBusDevice *s = sysbus_from_qdev(dev);
+char path[40];
+int off;
+
+off = snprintf(path, sizeof(path), "%s", qdev_fw_name(dev));
+
+if (s->num_mmio) {
+snprintf(path + off, sizeof(path) - off, "@"TARGET_FMT_plx,
+ s->mmio[0].addr);
+} else if (s->num_pio) {
+snprintf(path + off, sizeof(path) - off, "@i%04x", s->pio[0]);
+}
+
+return strdup(path);
+}
diff --git a/hw/sysbus.h b/hw/sysbus.h
index 5980901..e9eb618 100644
--- a/hw/sysbus.h
+++ b/hw/sysbus.h
@@ -6,6 +6,7 @@
 #include "qdev.h"
 
 #define QDEV_MAX_MMIO 32
+#define QDEV_MAX_PIO 32
 #define QDEV_MAX_IRQ 256
 
 typedef struct SysBusDevice SysBusDevice;
@@ -23,6 +24,8 @@ struct SysBusDevice {
 mmio_mapfunc cb;
 ram_addr_t iofunc;
 } mmio[QDEV_MAX_MMIO];
+int num_pio;
+pio_addr_t pio[QDEV_MAX_PIO];
 };
 
 typedef int (*sysbus_initfn)(SysBusDevice *dev);
@@ -45,6 +48,7 @@ void sysbus_init_mmio_cb(SysBusDevice *dev, 
target_phys_addr_t size,
 mmio_mapfunc cb);
 void sysbus_init_irq(SysBusDevice *dev, qemu_irq *p);
 void sysbus_pass_irq(SysBusDevice *dev, SysBusDevice *target);
+void sysbus_init_ioports(SysBusDevice *dev, pio_addr_t ioport, pio_addr_t 
size);
 
 
 void sysbus_connect_irq(SysBusDevice *dev, int n, qemu_irq irq);
-- 
1.7.1

[Qemu-devel] [PATCHv5 13/15] Add bootindex for option roms.

2010-11-15 Thread Gleb Natapov

Extend -option-rom command to have additional parameter ,bootindex=.

Signed-off-by: Gleb Natapov 
---
 hw/loader.c|   16 +++-
 hw/loader.h|8 
 hw/multiboot.c |3 ++-
 hw/ne2000.c|2 +-
 hw/nseries.c   |4 ++--
 hw/palm.c  |6 +++---
 hw/pc.c|7 ---
 hw/pci.c   |2 +-
 hw/pcnet.c |2 +-
 qemu-config.c  |   17 +
 sysemu.h   |6 +-
 vl.c   |   11 +--
 12 files changed, 60 insertions(+), 24 deletions(-)

diff --git a/hw/loader.c b/hw/loader.c
index 1e98326..eb198f6 100644
--- a/hw/loader.c
+++ b/hw/loader.c
@@ -107,7 +107,7 @@ int load_image_targphys(const char *filename,
 
 size = get_image_size(filename);
 if (size > 0)
-rom_add_file_fixed(filename, addr);
+rom_add_file_fixed(filename, addr, -1);
 return size;
 }
 
@@ -557,10 +557,11 @@ static void rom_insert(Rom *rom)
 }
 
 int rom_add_file(const char *file, const char *fw_dir,
- target_phys_addr_t addr)
+ target_phys_addr_t addr, int32_t bootindex)
 {
 Rom *rom;
 int rc, fd = -1;
+char devpath[100];
 
 rom = qemu_mallocz(sizeof(*rom));
 rom->name = qemu_strdup(file);
@@ -605,7 +606,12 @@ int rom_add_file(const char *file, const char *fw_dir,
 snprintf(fw_file_name, sizeof(fw_file_name), "%s/%s", rom->fw_dir,
  basename);
 fw_cfg_add_file(fw_cfg, fw_file_name, rom->data, rom->romsize);
+snprintf(devpath, sizeof(devpath), "/r...@%s", fw_file_name);
+} else {
+snprintf(devpath, sizeof(devpath), "/rom@" TARGET_FMT_plx, addr);
 }
+
+add_boot_device_path(bootindex, NULL, devpath);
 return 0;
 
 err:
@@ -635,12 +641,12 @@ int rom_add_blob(const char *name, const void *blob, 
size_t len,
 
 int rom_add_vga(const char *file)
 {
-return rom_add_file(file, "vgaroms", 0);
+return rom_add_file(file, "vgaroms", 0, -1);
 }
 
-int rom_add_option(const char *file)
+int rom_add_option(const char *file, int32_t bootindex)
 {
-return rom_add_file(file, "genroms", 0);
+return rom_add_file(file, "genroms", 0, bootindex);
 }
 
 static void rom_reset(void *unused)
diff --git a/hw/loader.h b/hw/loader.h
index 1f82fc5..fc6bdff 100644
--- a/hw/loader.h
+++ b/hw/loader.h
@@ -22,7 +22,7 @@ void pstrcpy_targphys(const char *name,
 
 
 int rom_add_file(const char *file, const char *fw_dir,
- target_phys_addr_t addr);
+ target_phys_addr_t addr, int32_t bootindex);
 int rom_add_blob(const char *name, const void *blob, size_t len,
  target_phys_addr_t addr);
 int rom_load_all(void);
@@ -31,8 +31,8 @@ int rom_copy(uint8_t *dest, target_phys_addr_t addr, size_t 
size);
 void *rom_ptr(target_phys_addr_t addr);
 void do_info_roms(Monitor *mon);
 
-#define rom_add_file_fixed(_f, _a)  \
-rom_add_file(_f, NULL, _a)
+#define rom_add_file_fixed(_f, _a, _i)  \
+rom_add_file(_f, NULL, _a, _i)
 #define rom_add_blob_fixed(_f, _b, _l, _a)  \
 rom_add_blob(_f, _b, _l, _a)
 
@@ -43,6 +43,6 @@ void do_info_roms(Monitor *mon);
 #define PC_ROM_SIZE(PC_ROM_MAX - PC_ROM_MIN_VGA)
 
 int rom_add_vga(const char *file);
-int rom_add_option(const char *file);
+int rom_add_option(const char *file, int32_t bootindex);
 
 #endif
diff --git a/hw/multiboot.c b/hw/multiboot.c
index f9097a2..b438019 100644
--- a/hw/multiboot.c
+++ b/hw/multiboot.c
@@ -325,7 +325,8 @@ int load_multiboot(void *fw_cfg,
 fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data,
  sizeof(bootinfo));
 
-option_rom[nb_option_roms] = "multiboot.bin";
+option_rom[nb_option_roms].name = "multiboot.bin";
+option_rom[nb_option_roms].bootindex = 0;
 nb_option_roms++;
 
 return 1; /* yes, we are multiboot */
diff --git a/hw/ne2000.c b/hw/ne2000.c
index f4bbac2..67e0cb0 100644
--- a/hw/ne2000.c
+++ b/hw/ne2000.c
@@ -742,7 +742,7 @@ static int pci_ne2000_init(PCIDevice *pci_dev)
 if (!pci_dev->qdev.hotplugged) {
 static int loaded = 0;
 if (!loaded) {
-rom_add_option("pxe-ne2k_pci.bin");
+rom_add_option("pxe-ne2k_pci.bin", -1);
 loaded = 1;
 }
 }
diff --git a/hw/nseries.c b/hw/nseries.c
index 04a028d..2f6f473 100644
--- a/hw/nseries.c
+++ b/hw/nseries.c
@@ -1326,7 +1326,7 @@ static void n8x0_init(ram_addr_t ram_size, const char 
*boot_device,
 qemu_register_reset(n8x0_boot_init, s);
 }
 
-if (option_rom[0] && (boot_device[0] == 'n' || !kernel_filename)) {
+if (option_rom[0].name && (boot_device[0] == 'n' || !kernel_filename)) {
 int rom_size;
 uint8_t nolo_tags[0x1];
 /* No, wait, better start at the ROM.  */
@@ -1341,7 +1341,7 @@ static void n8x0_init(ram_addr_t ram_size, const char 
*boot_device,
  *
  * The code above is for loading the `zImage' file from Nokia
  * images.  */
-rom_size =

[Qemu-devel] [PATCHv5 04/15] Add get_fw_dev_path callback to ISA bus in qdev.

2010-11-15 Thread Gleb Natapov

Use device ioports to create unique device path.

Signed-off-by: Gleb Natapov 
---
 hw/isa-bus.c |   16 
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/hw/isa-bus.c b/hw/isa-bus.c
index c0ac7e9..c423c1b 100644
--- a/hw/isa-bus.c
+++ b/hw/isa-bus.c
@@ -31,11 +31,13 @@ static ISABus *isabus;
 target_phys_addr_t isa_mem_base = 0;
 
 static void isabus_dev_print(Monitor *mon, DeviceState *dev, int indent);
+static char *isabus_get_fw_dev_path(DeviceState *dev);
 
 static struct BusInfo isa_bus_info = {
 .name  = "ISA",
 .size  = sizeof(ISABus),
 .print_dev = isabus_dev_print,
+.get_fw_dev_path = isabus_get_fw_dev_path,
 };
 
 ISABus *isa_bus_new(DeviceState *dev)
@@ -188,4 +190,18 @@ static void isabus_register_devices(void)
 sysbus_register_withprop(&isabus_bridge_info);
 }
 
+static char *isabus_get_fw_dev_path(DeviceState *dev)
+{
+ISADevice *d = (ISADevice*)dev;
+char path[40];
+int off;
+
+off = snprintf(path, sizeof(path), "%s", qdev_fw_name(dev));
+if (d->nioports) {
+snprintf(path + off, sizeof(path) - off, "@%04x", d->ioports[0]);
+}
+
+return strdup(path);
+}
+
 device_init(isabus_register_devices)
-- 
1.7.1

[Qemu-devel] [PATCHv5 01/15] Introduce fw_name field to DeviceInfo structure.

2010-11-15 Thread Gleb Natapov

Add "fw_name" to DeviceInfo to use in device path building. In
contrast to "name" "fw_name" should refer to functionality device
provides instead of particular device model like "name" does.

Signed-off-by: Gleb Natapov 
---
 hw/fdc.c|1 +
 hw/ide/isa.c|1 +
 hw/ide/qdev.c   |1 +
 hw/isa-bus.c|1 +
 hw/lance.c  |1 +
 hw/piix_pci.c   |1 +
 hw/qdev.h   |6 ++
 hw/usb-hub.c|1 +
 hw/usb-net.c|1 +
 hw/virtio-pci.c |1 +
 10 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/hw/fdc.c b/hw/fdc.c
index c159dcb..a467c4b 100644
--- a/hw/fdc.c
+++ b/hw/fdc.c
@@ -2040,6 +2040,7 @@ static const VMStateDescription vmstate_isa_fdc ={
 static ISADeviceInfo isa_fdc_info = {
 .init = isabus_fdc_init1,
 .qdev.name  = "isa-fdc",
+.qdev.fw_name  = "fdc",
 .qdev.size  = sizeof(FDCtrlISABus),
 .qdev.no_user = 1,
 .qdev.vmsd  = &vmstate_isa_fdc,
diff --git a/hw/ide/isa.c b/hw/ide/isa.c
index 6b57e0d..9856435 100644
--- a/hw/ide/isa.c
+++ b/hw/ide/isa.c
@@ -98,6 +98,7 @@ ISADevice *isa_ide_init(int iobase, int iobase2, int isairq,
 
 static ISADeviceInfo isa_ide_info = {
 .qdev.name  = "isa-ide",
+.qdev.fw_name  = "ide",
 .qdev.size  = sizeof(ISAIDEState),
 .init   = isa_ide_initfn,
 .qdev.reset = isa_ide_reset,
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 0808760..6d27b60 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -134,6 +134,7 @@ static int ide_drive_initfn(IDEDevice *dev)
 
 static IDEDeviceInfo ide_drive_info = {
 .qdev.name  = "ide-drive",
+.qdev.fw_name  = "drive",
 .qdev.size  = sizeof(IDEDrive),
 .init   = ide_drive_initfn,
 .qdev.props = (Property[]) {
diff --git a/hw/isa-bus.c b/hw/isa-bus.c
index 4e306de..26036e0 100644
--- a/hw/isa-bus.c
+++ b/hw/isa-bus.c
@@ -153,6 +153,7 @@ static int isabus_bridge_init(SysBusDevice *dev)
 static SysBusDeviceInfo isabus_bridge_info = {
 .init = isabus_bridge_init,
 .qdev.name  = "isabus-bridge",
+.qdev.fw_name  = "isa",
 .qdev.size  = sizeof(SysBusDevice),
 .qdev.no_user = 1,
 };
diff --git a/hw/lance.c b/hw/lance.c
index dc12144..1a3bb1a 100644
--- a/hw/lance.c
+++ b/hw/lance.c
@@ -141,6 +141,7 @@ static void lance_reset(DeviceState *dev)
 static SysBusDeviceInfo lance_info = {
 .init   = lance_init,
 .qdev.name  = "lance",
+.qdev.fw_name  = "ethernet",
 .qdev.size  = sizeof(SysBusPCNetState),
 .qdev.reset = lance_reset,
 .qdev.vmsd  = &vmstate_lance,
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index b5589b9..38f9d9e 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -365,6 +365,7 @@ static PCIDeviceInfo i440fx_info[] = {
 static SysBusDeviceInfo i440fx_pcihost_info = {
 .init = i440fx_pcihost_initfn,
 .qdev.name= "i440FX-pcihost",
+.qdev.fw_name = "pci",
 .qdev.size= sizeof(I440FXState),
 .qdev.no_user = 1,
 };
diff --git a/hw/qdev.h b/hw/qdev.h
index 579328a..9f90efe 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -139,6 +139,7 @@ typedef void (*qdev_resetfn)(DeviceState *dev);
 
 struct DeviceInfo {
 const char *name;
+const char *fw_name;
 const char *alias;
 const char *desc;
 size_t size;
@@ -288,6 +289,11 @@ void qdev_prop_set_defaults(DeviceState *dev, Property 
*props);
 void qdev_prop_register_global_list(GlobalProperty *props);
 void qdev_prop_set_globals(DeviceState *dev);
 
+static inline const char *qdev_fw_name(DeviceState *dev)
+{
+return dev->info->fw_name ? : dev->info->alias ? : dev->info->name;
+}
+
 /* This is a nasty hack to allow passing a NULL bus to qdev_create.  */
 extern struct BusInfo system_bus_info;
 
diff --git a/hw/usb-hub.c b/hw/usb-hub.c
index 2a1edfc..8e3a96b 100644
--- a/hw/usb-hub.c
+++ b/hw/usb-hub.c
@@ -545,6 +545,7 @@ static int usb_hub_initfn(USBDevice *dev)
 static struct USBDeviceInfo hub_info = {
 .product_desc   = "QEMU USB Hub",
 .qdev.name  = "usb-hub",
+.qdev.fw_name= "hub",
 .qdev.size  = sizeof(USBHubState),
 .init   = usb_hub_initfn,
 .handle_packet  = usb_hub_handle_packet,
diff --git a/hw/usb-net.c b/hw/usb-net.c
index 70f9263..2287ee1 100644
--- a/hw/usb-net.c
+++ b/hw/usb-net.c
@@ -1496,6 +1496,7 @@ static USBDevice *usb_net_init(const char *cmdline)
 static struct USBDeviceInfo net_info = {
 .product_desc   = "QEMU USB Network Interface",
 .qdev.name  = "usb-net",
+.qdev.fw_name= "network",
 .qdev.size  = sizeof(USBNetState),
 .init   = usb_net_initfn,
 .handle_packet  = usb_generic_handle_packet,
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 729917d..be2c92f 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -697,6 +697,7 @@ static int virtio_9p_init_pci(PCIDevice *pci_dev)
 static PCIDeviceInfo virtio_info[] = {
 {
 .qdev.name = "virtio-blk-pci",
+.qdev.alias = "virtio-blk",
 .qdev.size = sizeof(VirtIOPCIProxy),
 .init  = virtio_

[Qemu-devel] [PATCHv5 10/15] Add get_dev_path callback for usb bus.

2010-11-15 Thread Gleb Natapov


Signed-off-by: Gleb Natapov 
---
 hw/usb-bus.c |   42 ++
 1 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/hw/usb-bus.c b/hw/usb-bus.c
index 256b881..8b4583c 100644
--- a/hw/usb-bus.c
+++ b/hw/usb-bus.c
@@ -5,11 +5,13 @@
 #include "monitor.h"
 
 static void usb_bus_dev_print(Monitor *mon, DeviceState *qdev, int indent);
+static char *usbbus_get_fw_dev_path(DeviceState *dev);
 
 static struct BusInfo usb_bus_info = {
 .name  = "USB",
 .size  = sizeof(USBBus),
 .print_dev = usb_bus_dev_print,
+.get_fw_dev_path = usbbus_get_fw_dev_path,
 };
 static int next_usb_bus = 0;
 static QTAILQ_HEAD(, USBBus) busses = QTAILQ_HEAD_INITIALIZER(busses);
@@ -307,3 +309,43 @@ USBDevice *usbdevice_create(const char *cmdline)
 }
 return usb->usbdevice_init(params);
 }
+
+static int usbbus_get_fw_dev_path_helper(USBDevice *d, USBBus *bus, char *p,
+ int len)
+{
+int l = 0;
+USBPort *port;
+
+QTAILQ_FOREACH(port, &bus->used, next) {
+if (port->dev == d) {
+if (port->pdev) {
+l = usbbus_get_fw_dev_path_helper(port->pdev, bus, p, len);
+}
+l += snprintf(p + l, len - l, "%...@%x/", qdev_fw_name(&d->qdev),
+  port->index);
+break;
+}
+}
+
+return l;
+}
+
+static char *usbbus_get_fw_dev_path(DeviceState *dev)
+{
+USBDevice *d = (USBDevice*)dev;
+USBBus *bus = usb_bus_from_device(d);
+char path[100];
+int l;
+
+assert(d->attached != 0);
+
+l = usbbus_get_fw_dev_path_helper(d, bus, path, sizeof(path));
+
+if (l == 0) {
+abort();
+}
+
+path[l-1] = '\0';
+
+return strdup(path);
+}
-- 
1.7.1

[Qemu-devel] [PATCHv5 02/15] Introduce new BusInfo callback get_fw_dev_path.

2010-11-15 Thread Gleb Natapov

New get_fw_dev_path callback will be used for build device path usable
by firmware in contrast to qdev qemu internal device path.

Signed-off-by: Gleb Natapov 
---
 hw/qdev.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/hw/qdev.h b/hw/qdev.h
index 9f90efe..dc669b3 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -49,12 +49,14 @@ struct DeviceState {
 
 typedef void (*bus_dev_printfn)(Monitor *mon, DeviceState *dev, int indent);
 typedef char *(*bus_get_dev_path)(DeviceState *dev);
+typedef char *(*bus_get_fw_dev_path)(DeviceState *dev);
 
 struct BusInfo {
 const char *name;
 size_t size;
 bus_dev_printfn print_dev;
 bus_get_dev_path get_dev_path;
+bus_get_fw_dev_path get_fw_dev_path;
 Property *props;
 };
 
-- 
1.7.1

[Qemu-devel] [PATCHv5 12/15] Change fw_cfg_add_file() to get full file path as a parameter.

2010-11-15 Thread Gleb Natapov

Change fw_cfg_add_file() to get full file path as a parameter instead
of building one internally. Two reasons for that. First caller may need
to know how file is named. Second this moves policy of file naming out
from fw_cfg. Platform may want to use more then two levels of
directories for instance.

Signed-off-by: Gleb Natapov 
---
 hw/fw_cfg.c |   16 
 hw/fw_cfg.h |4 ++--
 hw/loader.c |   16 ++--
 3 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index 72866ae..7b9434f 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -277,10 +277,9 @@ int fw_cfg_add_callback(FWCfgState *s, uint16_t key, 
FWCfgCallback callback,
 return 1;
 }
 
-int fw_cfg_add_file(FWCfgState *s,  const char *dir, const char *filename,
-uint8_t *data, uint32_t len)
+int fw_cfg_add_file(FWCfgState *s,  const char *filename, uint8_t *data,
+uint32_t len)
 {
-const char *basename;
 int i, index;
 
 if (!s->files) {
@@ -297,15 +296,8 @@ int fw_cfg_add_file(FWCfgState *s,  const char *dir, const 
char *filename,
 
 fw_cfg_add_bytes(s, FW_CFG_FILE_FIRST + index, data, len);
 
-basename = strrchr(filename, '/');
-if (basename) {
-basename++;
-} else {
-basename = filename;
-}
-
-snprintf(s->files->f[index].name, sizeof(s->files->f[index].name),
- "%s/%s", dir, basename);
+pstrcpy(s->files->f[index].name, sizeof(s->files->f[index].name),
+filename);
 for (i = 0; i < index; i++) {
 if (strcmp(s->files->f[index].name, s->files->f[i].name) == 0) {
 FW_CFG_DPRINTF("%s: skip duplicate: %s\n", __FUNCTION__,
diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
index 4d13a4f..856bf91 100644
--- a/hw/fw_cfg.h
+++ b/hw/fw_cfg.h
@@ -60,8 +60,8 @@ int fw_cfg_add_i32(FWCfgState *s, uint16_t key, uint32_t 
value);
 int fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value);
 int fw_cfg_add_callback(FWCfgState *s, uint16_t key, FWCfgCallback callback,
 void *callback_opaque, uint8_t *data, size_t len);
-int fw_cfg_add_file(FWCfgState *s, const char *dir, const char *filename,
-uint8_t *data, uint32_t len);
+int fw_cfg_add_file(FWCfgState *s, const char *filename, uint8_t *data,
+uint32_t len);
 FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
 target_phys_addr_t crl_addr, target_phys_addr_t 
data_addr);
 
diff --git a/hw/loader.c b/hw/loader.c
index 49ac1fa..1e98326 100644
--- a/hw/loader.c
+++ b/hw/loader.c
@@ -592,8 +592,20 @@ int rom_add_file(const char *file, const char *fw_dir,
 }
 close(fd);
 rom_insert(rom);
-if (rom->fw_file && fw_cfg)
-fw_cfg_add_file(fw_cfg, rom->fw_dir, rom->fw_file, rom->data, 
rom->romsize);
+if (rom->fw_file && fw_cfg) {
+const char *basename;
+char fw_file_name[56];
+
+basename = strrchr(rom->fw_file, '/');
+if (basename) {
+basename++;
+} else {
+basename = rom->fw_file;
+}
+snprintf(fw_file_name, sizeof(fw_file_name), "%s/%s", rom->fw_dir,
+ basename);
+fw_cfg_add_file(fw_cfg, fw_file_name, rom->data, rom->romsize);
+}
 return 0;
 
 err:
-- 
1.7.1

[Qemu-devel] [PATCHv5 15/15] Pass boot device list to firmware.

2010-11-15 Thread Gleb Natapov


Signed-off-by: Gleb Natapov 
---
 hw/fw_cfg.c |   15 +++
 hw/fw_cfg.h |5 -
 sysemu.h|1 +
 vl.c|   49 +
 4 files changed, 69 insertions(+), 1 deletions(-)

diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index 7b9434f..4eea338 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -53,6 +53,7 @@ struct FWCfgState {
 FWCfgFiles *files;
 uint16_t cur_entry;
 uint32_t cur_offset;
+Notifier machine_ready;
 };
 
 static void fw_cfg_write(FWCfgState *s, uint8_t value)
@@ -315,6 +316,16 @@ int fw_cfg_add_file(FWCfgState *s,  const char *filename, 
uint8_t *data,
 return 1;
 }
 
+static void fw_cfg_machine_ready(struct Notifier* n)
+{
+uint32_t len;
+FWCfgState *s = container_of(n, FWCfgState, machine_ready);
+char *bootindex = get_boot_devices_list(&len);
+
+fw_cfg_add_i32(s, FW_CFG_BOOTINDEX_LEN, len);
+fw_cfg_add_bytes(s, FW_CFG_BOOTINDEX_DATA, (uint8_t*)bootindex, len);
+}
+
 FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
 target_phys_addr_t ctl_addr, target_phys_addr_t 
data_addr)
 {
@@ -343,6 +354,10 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t 
data_port,
 fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
 fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
 
+
+s->machine_ready.notify = fw_cfg_machine_ready;
+qemu_add_machine_init_done_notifier(&s->machine_ready);
+
 return s;
 }
 
diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
index 856bf91..b951f6b 100644
--- a/hw/fw_cfg.h
+++ b/hw/fw_cfg.h
@@ -30,7 +30,10 @@
 
 #define FW_CFG_FILE_FIRST   0x20
 #define FW_CFG_FILE_SLOTS   0x10
-#define FW_CFG_MAX_ENTRY(FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
+#define FW_CFG_FILE_LAST_SLOT   (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS)
+#define FW_CFG_BOOTINDEX_LEN(FW_CFG_FILE_LAST_SLOT + 1)
+#define FW_CFG_BOOTINDEX_DATA   (FW_CFG_FILE_LAST_SLOT + 2)
+#define FW_CFG_MAX_ENTRYFW_CFG_BOOTINDEX_DATA + 1 
 
 #define FW_CFG_WRITE_CHANNEL0x4000
 #define FW_CFG_ARCH_LOCAL   0x8000
diff --git a/sysemu.h b/sysemu.h
index c42f33a..38a20a3 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -196,4 +196,5 @@ void register_devices(void);
 
 void add_boot_device_path(int32_t bootindex, DeviceState *dev,
   const char *suffix);
+char *get_boot_devices_list(uint32_t *size);
 #endif
diff --git a/vl.c b/vl.c
index 918d988..ab36f9f 100644
--- a/vl.c
+++ b/vl.c
@@ -735,6 +735,55 @@ void add_boot_device_path(int32_t bootindex, DeviceState 
*dev,
 QTAILQ_INSERT_TAIL(&fw_boot_order, node, link);
 }
 
+/*
+ * This function returns device list as an array in a below format:
+ * +---+-+-
+ * | devpath1  |  devpath2   | ...
+ * +---+-+-
+ * where:
+ *   devpath - null terminated string representing one device path
+ *
+ *  memory pointed by "size" is assigned total length of teh array in bytes
+ *
+ */
+char *get_boot_devices_list(uint32_t *size)
+{
+FWBootEntry *i;
+uint32_t total = 0;
+char *list = NULL;
+
+QTAILQ_FOREACH(i, &fw_boot_order, link) {
+char *devpath = NULL, *bootpath;
+int len;
+
+if (i->dev) {
+devpath = qdev_get_fw_dev_path(i->dev);
+assert(devpath);
+}
+
+if (i->suffix && devpath) {
+bootpath = qemu_malloc(strlen(devpath) + strlen(i->suffix) + 2);
+sprintf(bootpath, "%s/%s", devpath, i->suffix);
+qemu_free(devpath);
+} else if (devpath) {
+bootpath = devpath;
+} else {
+bootpath = strdup(i->suffix);
+assert(bootpath);
+}
+
+len = strlen(bootpath) + 1;
+list = qemu_realloc(list, total + len);
+memcpy(&list[total], bootpath, len);
+total += len;
+qemu_free(bootpath);
+}
+
+*size = total;
+
+return list;
+}
+
 static void numa_add(const char *optarg)
 {
 char option[128];
-- 
1.7.1

[Qemu-devel] [PATCHv5 00/15] boot order specification

2010-11-15 Thread Gleb Natapov

I am using open firmware naming scheme to specify device path names.
In this version: fixed compilation problem, changed how device list
is passed into firmware.

Names look like this on pci machine:
/p...@i0cf8/i...@1,1/dr...@1/d...@0
/p...@i0cf8/i...@1/f...@03f1/flo...@1
/p...@i0cf8/i...@1/f...@03f1/flo...@0
/p...@i0cf8/i...@1,1/dr...@1/d...@1
/p...@i0cf8/i...@1,1/dr...@0/d...@0
/p...@i0cf8/s...@3/d...@0
/p...@i0cf8/ether...@4/ethernet-...@0
/p...@i0cf8/ether...@5/ethernet-...@0
/p...@i0cf8/i...@1,1/dr...@0/d...@1
/p...@i0cf8/i...@1/i...@01e8/dr...@0/d...@0
/p...@i0cf8/u...@1,2/netw...@0/ether...@0
/p...@i0cf8/u...@1,2/h...@1/netw...@0/ether...@0
/r...@genroms/linuxboot.bin

and on isa machine:
/isa/i...@0170/dr...@0/d...@0
/isa/f...@03f1/flo...@1
/isa/f...@03f1/flo...@0
/isa/i...@0170/dr...@0/d...@1


Instead of using get_dev_path() callback I introduces another one
get_fw_dev_path. Unfortunately the way get_dev_path() callback is used
in migration code makes it hard to reuse it for other purposes. First
of all it is not called recursively so caller expects it to provide
unique name by itself. Device path though is inherently recursive. Each
individual element may not be unique, but the whole path will be. On
the other hand to call get_dev_path() recursively in migration code we
should implement it for all possible buses first. Other problem is
compatibility. If we change get_dev_path() output format now we will not
be able to migrate from old qemu to new one without some additional
compatibility layer.

Gleb Natapov (15):
  Introduce fw_name field to DeviceInfo structure.
  Introduce new BusInfo callback get_fw_dev_path.
  Keep track of ISA ports ISA device is using in qdev.
  Add get_fw_dev_path callback to ISA bus in qdev.
  Store IDE bus id in IDEBus structure for easy access.
  Add get_fw_dev_path callback to IDE bus.
  Add get_dev_path callback for system bus.
  Add get_fw_dev_path callback for pci bus.
  Record which USBDevice USBPort belongs too.
  Add get_dev_path callback for usb bus.
  Add bootindex parameter to net/block/fd device
  Change fw_cfg_add_file() to get full file path as a parameter.
  Add bootindex for option roms.
  Add notifier that will be called when machine is fully created.
  Pass boot device list to firmware.

 block_int.h   |4 +-
 hw/cs4231a.c  |1 +
 hw/e1000.c|4 ++
 hw/eepro100.c |3 +
 hw/fdc.c  |   12 ++
 hw/fw_cfg.c   |   31 +--
 hw/fw_cfg.h   |9 +++-
 hw/gus.c  |4 ++
 hw/ide/cmd646.c   |4 +-
 hw/ide/internal.h |3 +-
 hw/ide/isa.c  |5 ++-
 hw/ide/piix.c |4 +-
 hw/ide/qdev.c |   22 ++-
 hw/ide/via.c  |4 +-
 hw/isa-bus.c  |   42 +++
 hw/isa.h  |4 ++
 hw/lance.c|1 +
 hw/loader.c   |   32 +++---
 hw/loader.h   |8 ++--
 hw/m48t59.c   |1 +
 hw/mc146818rtc.c  |1 +
 hw/multiboot.c|3 +-
 hw/ne2000-isa.c   |3 +
 hw/ne2000.c   |5 ++-
 hw/nseries.c  |4 +-
 hw/palm.c |6 +-
 hw/parallel.c |5 ++
 hw/pc.c   |7 ++-
 hw/pci.c  |  110 +++---
 hw/pci_host.c |2 +
 hw/pckbd.c|3 +
 hw/pcnet.c|6 ++-
 hw/piix_pci.c |1 +
 hw/qdev.c |   32 +++
 hw/qdev.h |9 
 hw/rtl8139.c  |4 ++
 hw/sb16.c |4 ++
 hw/serial.c   |1 +
 hw/sysbus.c   |   30 ++
 hw/sysbus.h   |4 ++
 hw/usb-bus.c  |   45 -
 hw/usb-hub.c  |3 +-
 hw/usb-musb.c |2 +-
 hw/usb-net.c  |3 +
 hw/usb-ohci.c |2 +-
 hw/usb-uhci.c |2 +-
 hw/usb.h  |3 +-
 hw/virtio-blk.c   |2 +
 hw/virtio-net.c   |2 +
 hw/virtio-pci.c   |1 +
 net.h |4 +-
 qemu-config.c |   17 
 sysemu.h  |   11 +-
 vl.c  |  115 -
 54 files changed, 569 insertions(+), 81 deletions(-)

[Qemu-devel] [PATCHv5 08/15] Add get_fw_dev_path callback for pci bus.

2010-11-15 Thread Gleb Natapov


Signed-off-by: Gleb Natapov 
---
 hw/pci.c |  108 -
 1 files changed, 85 insertions(+), 23 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 962886e..114b435 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -43,12 +43,14 @@
 
 static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent);
 static char *pcibus_get_dev_path(DeviceState *dev);
+static char *pcibus_get_fw_dev_path(DeviceState *dev);
 
 struct BusInfo pci_bus_info = {
 .name   = "PCI",
 .size   = sizeof(PCIBus),
 .print_dev  = pcibus_dev_print,
 .get_dev_path = pcibus_get_dev_path,
+.get_fw_dev_path = pcibus_get_fw_dev_path,
 .props  = (Property[]) {
 DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
 DEFINE_PROP_STRING("romfile", PCIDevice, romfile),
@@ -1061,45 +1063,63 @@ void pci_msi_notify(PCIDevice *dev, unsigned int vector)
 typedef struct {
 uint16_t class;
 const char *desc;
+const char *fw_name;
+uint16_t fw_ign_bits;
 } pci_class_desc;
 
 static const pci_class_desc pci_class_descriptions[] =
 {
-{ 0x0100, "SCSI controller"},
-{ 0x0101, "IDE controller"},
-{ 0x0102, "Floppy controller"},
-{ 0x0103, "IPI controller"},
-{ 0x0104, "RAID controller"},
+{ 0x0001, "VGA controller", "display"},
+{ 0x0100, "SCSI controller", "scsi"},
+{ 0x0101, "IDE controller", "ide"},
+{ 0x0102, "Floppy controller", "fdc"},
+{ 0x0103, "IPI controller", "ipi"},
+{ 0x0104, "RAID controller", "raid"},
 { 0x0106, "SATA controller"},
 { 0x0107, "SAS controller"},
 { 0x0180, "Storage controller"},
-{ 0x0200, "Ethernet controller"},
-{ 0x0201, "Token Ring controller"},
-{ 0x0202, "FDDI controller"},
-{ 0x0203, "ATM controller"},
+{ 0x0200, "Ethernet controller", "ethernet"},
+{ 0x0201, "Token Ring controller", "token-ring"},
+{ 0x0202, "FDDI controller", "fddi"},
+{ 0x0203, "ATM controller", "atm"},
 { 0x0280, "Network controller"},
-{ 0x0300, "VGA controller"},
+{ 0x0300, "VGA controller", "display", 0x00ff},
 { 0x0301, "XGA controller"},
 { 0x0302, "3D controller"},
 { 0x0380, "Display controller"},
-{ 0x0400, "Video controller"},
-{ 0x0401, "Audio controller"},
+{ 0x0400, "Video controller", "video"},
+{ 0x0401, "Audio controller", "sound"},
 { 0x0402, "Phone"},
 { 0x0480, "Multimedia controller"},
-{ 0x0500, "RAM controller"},
-{ 0x0501, "Flash controller"},
+{ 0x0500, "RAM controller", "memory"},
+{ 0x0501, "Flash controller", "flash"},
 { 0x0580, "Memory controller"},
-{ 0x0600, "Host bridge"},
-{ 0x0601, "ISA bridge"},
-{ 0x0602, "EISA bridge"},
-{ 0x0603, "MC bridge"},
-{ 0x0604, "PCI bridge"},
-{ 0x0605, "PCMCIA bridge"},
-{ 0x0606, "NUBUS bridge"},
-{ 0x0607, "CARDBUS bridge"},
+{ 0x0600, "Host bridge", "host"},
+{ 0x0601, "ISA bridge", "isa"},
+{ 0x0602, "EISA bridge", "eisa"},
+{ 0x0603, "MC bridge", "mca"},
+{ 0x0604, "PCI bridge", "pci"},
+{ 0x0605, "PCMCIA bridge", "pcmcia"},
+{ 0x0606, "NUBUS bridge", "nubus"},
+{ 0x0607, "CARDBUS bridge", "cardbus"},
 { 0x0608, "RACEWAY bridge"},
 { 0x0680, "Bridge"},
-{ 0x0c03, "USB controller"},
+{ 0x0700, "Serial port", "serial"},
+{ 0x0701, "Parallel port", "parallel"},
+{ 0x0800, "Interrupt controller", "interrupt-controller"},
+{ 0x0801, "DMA controller", "dma-controller"},
+{ 0x0802, "Timer", "timer"},
+{ 0x0803, "RTC", "rtc"},
+{ 0x0900, "Keyboard", "keyboard"},
+{ 0x0901, "Pen", "pen"},
+{ 0x0902, "Mouse", "mouse"},
+{ 0x0A00, "Dock station", "dock", 0x00ff},
+{ 0x0B00, "i386 cpu", "cpu", 0x00ff},
+{ 0x0c00, "Fireware contorller", "fireware"},
+{ 0x0c01, "Access bus controller", "access-bus"},
+{ 0x0c02, "SSA controller", "ssa"},
+{ 0x0c03, "USB controller", "usb"},
+{ 0x0c04, "Fibre channel controller", "fibre-channel"},
 { 0, NULL}
 };
 
@@ -1825,6 +1845,48 @@ static void pcibus_dev_print(Monitor *mon, DeviceState 
*dev, int indent)
 }
 }
 
+static char *pci_dev_fw_name(DeviceState *dev, char *buf, int len)
+{
+PCIDevice *d = (PCIDevice *)dev;
+const char *name = NULL;
+const pci_class_desc *desc =  pci_class_descriptions;
+int class = pci_get_word(d->config + PCI_CLASS_DEVICE);
+
+while (desc->desc &&
+  (class & ~desc->fw_ign_bits) !=
+  (desc->class & ~desc->fw_ign_bits)) {
+desc++;
+}
+
+if (desc->desc) {
+name = desc->fw_name;
+}
+
+if (name) {
+pstrcpy(buf, len, name);
+} else {
+snprintf(buf, len, "pci%04x,%04x",
+ pci_get_word(d->config + PCI_VENDOR_ID),
+ pci_get_word(d->config + PCI_DEVICE_ID));
+}
+
+return buf;
+}
+
+static char *pcibus_get_fw_dev_path(DeviceState *dev)
+{
+PCIDevice *d = (PCIDevice *)dev;
+char path[50], na

[Qemu-devel] [PATCHv5 03/15] Keep track of ISA ports ISA device is using in qdev.

2010-11-15 Thread Gleb Natapov

Store all io ports used by device in ISADevice structure.

Signed-off-by: Gleb Natapov 
---
 hw/cs4231a.c |1 +
 hw/fdc.c |3 +++
 hw/gus.c |4 
 hw/ide/isa.c |2 ++
 hw/isa-bus.c |   25 +
 hw/isa.h |4 
 hw/m48t59.c  |1 +
 hw/mc146818rtc.c |1 +
 hw/ne2000-isa.c  |3 +++
 hw/parallel.c|5 +
 hw/pckbd.c   |3 +++
 hw/sb16.c|4 
 hw/serial.c  |1 +
 13 files changed, 57 insertions(+), 0 deletions(-)

diff --git a/hw/cs4231a.c b/hw/cs4231a.c
index 4d5ce5c..598f032 100644
--- a/hw/cs4231a.c
+++ b/hw/cs4231a.c
@@ -645,6 +645,7 @@ static int cs4231a_initfn (ISADevice *dev)
 isa_init_irq (dev, &s->pic, s->irq);
 
 for (i = 0; i < 4; i++) {
+isa_init_ioport(dev, i);
 register_ioport_write (s->port + i, 1, 1, cs_write, s);
 register_ioport_read (s->port + i, 1, 1, cs_read, s);
 }
diff --git a/hw/fdc.c b/hw/fdc.c
index a467c4b..5ab754b 100644
--- a/hw/fdc.c
+++ b/hw/fdc.c
@@ -1983,6 +1983,9 @@ static int isabus_fdc_init1(ISADevice *dev)
   &fdctrl_write_port, fdctrl);
 register_ioport_write(iobase + 0x07, 1, 1,
   &fdctrl_write_port, fdctrl);
+isa_init_ioport_range(dev, iobase + 1, 5);
+isa_init_ioport(dev, iobase + 7);
+
 isa_init_irq(&isa->busdev, &fdctrl->irq, isairq);
 fdctrl->dma_chann = dma_chann;
 
diff --git a/hw/gus.c b/hw/gus.c
index e9016d8..ff9e7c7 100644
--- a/hw/gus.c
+++ b/hw/gus.c
@@ -264,20 +264,24 @@ static int gus_initfn (ISADevice *dev)
 
 register_ioport_write (s->port, 1, 1, gus_writeb, s);
 register_ioport_write (s->port, 1, 2, gus_writew, s);
+isa_init_ioport_range(dev, s->port, 2);
 
 register_ioport_read ((s->port + 0x100) & 0xf00, 1, 1, gus_readb, s);
 register_ioport_read ((s->port + 0x100) & 0xf00, 1, 2, gus_readw, s);
+isa_init_ioport_range(dev, (s->port + 0x100) & 0xf00, 2);
 
 register_ioport_write (s->port + 6, 10, 1, gus_writeb, s);
 register_ioport_write (s->port + 6, 10, 2, gus_writew, s);
 register_ioport_read (s->port + 6, 10, 1, gus_readb, s);
 register_ioport_read (s->port + 6, 10, 2, gus_readw, s);
+isa_init_ioport_range(dev, s->port + 6, 10);
 
 
 register_ioport_write (s->port + 0x100, 8, 1, gus_writeb, s);
 register_ioport_write (s->port + 0x100, 8, 2, gus_writew, s);
 register_ioport_read (s->port + 0x100, 8, 1, gus_readb, s);
 register_ioport_read (s->port + 0x100, 8, 2, gus_readw, s);
+isa_init_ioport_range(dev, s->port + 0x100, 8);
 
 DMA_register_channel (s->emu.gusdma, GUS_read_DMA, s);
 s->emu.himemaddr = s->himem;
diff --git a/hw/ide/isa.c b/hw/ide/isa.c
index 9856435..4206afd 100644
--- a/hw/ide/isa.c
+++ b/hw/ide/isa.c
@@ -70,6 +70,8 @@ static int isa_ide_initfn(ISADevice *dev)
 ide_bus_new(&s->bus, &s->dev.qdev);
 ide_init_ioport(&s->bus, s->iobase, s->iobase2);
 isa_init_irq(dev, &s->irq, s->isairq);
+isa_init_ioport_range(dev, s->iobase, 8);
+isa_init_ioport(dev, s->iobase2);
 ide_init2(&s->bus, s->irq);
 vmstate_register(&dev->qdev, 0, &vmstate_ide_isa, s);
 return 0;
diff --git a/hw/isa-bus.c b/hw/isa-bus.c
index 26036e0..c0ac7e9 100644
--- a/hw/isa-bus.c
+++ b/hw/isa-bus.c
@@ -92,6 +92,31 @@ void isa_init_irq(ISADevice *dev, qemu_irq *p, int isairq)
 dev->nirqs++;
 }
 
+static void isa_init_ioport_one(ISADevice *dev, uint16_t ioport)
+{
+assert(dev->nioports < ARRAY_SIZE(dev->ioports));
+dev->ioports[dev->nioports++] = ioport;
+}
+
+static int isa_cmp_ports(const void *p1, const void *p2)
+{
+return *(uint16_t*)p1 - *(uint16_t*)p2;
+}
+
+void isa_init_ioport_range(ISADevice *dev, uint16_t start, uint16_t length)
+{
+int i;
+for (i = start; i < start + length; i++) {
+isa_init_ioport_one(dev, i);
+}
+qsort(dev->ioports, dev->nioports, sizeof(dev->ioports[0]), isa_cmp_ports);
+}
+
+void isa_init_ioport(ISADevice *dev, uint16_t ioport)
+{
+isa_init_ioport_range(dev, ioport, 1);
+}
+
 static int isa_qdev_init(DeviceState *qdev, DeviceInfo *base)
 {
 ISADevice *dev = DO_UPCAST(ISADevice, qdev, qdev);
diff --git a/hw/isa.h b/hw/isa.h
index aaf0272..4794b76 100644
--- a/hw/isa.h
+++ b/hw/isa.h
@@ -14,6 +14,8 @@ struct ISADevice {
 DeviceState qdev;
 uint32_t isairq[2];
 int nirqs;
+uint16_t ioports[32];
+int nioports;
 };
 
 typedef int (*isa_qdev_initfn)(ISADevice *dev);
@@ -26,6 +28,8 @@ ISABus *isa_bus_new(DeviceState *dev);
 void isa_bus_irqs(qemu_irq *irqs);
 qemu_irq isa_reserve_irq(int isairq);
 void isa_init_irq(ISADevice *dev, qemu_irq *p, int isairq);
+void isa_init_ioport(ISADevice *dev, uint16_t ioport);
+void isa_init_ioport_range(ISADevice *dev, uint16_t start, uint16_t length);
 void isa_qdev_register(ISADeviceInfo *info);
 ISADevice *isa_create(const char *name);
 ISADevice *isa_create_simple(const char *name);
diff --git a/hw/m48t59.c b/hw/m

[Qemu-devel] [PATCHv5 05/15] Store IDE bus id in IDEBus structure for easy access.

2010-11-15 Thread Gleb Natapov


Signed-off-by: Gleb Natapov 
---
 hw/ide/cmd646.c   |4 ++--
 hw/ide/internal.h |3 ++-
 hw/ide/isa.c  |2 +-
 hw/ide/piix.c |4 ++--
 hw/ide/qdev.c |3 ++-
 hw/ide/via.c  |4 ++--
 6 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/hw/ide/cmd646.c b/hw/ide/cmd646.c
index ff80dd5..b2cbdbc 100644
--- a/hw/ide/cmd646.c
+++ b/hw/ide/cmd646.c
@@ -257,8 +257,8 @@ static int pci_cmd646_ide_initfn(PCIDevice *dev)
 pci_conf[PCI_INTERRUPT_PIN] = 0x01; // interrupt on pin 1
 
 irq = qemu_allocate_irqs(cmd646_set_irq, d, 2);
-ide_bus_new(&d->bus[0], &d->dev.qdev);
-ide_bus_new(&d->bus[1], &d->dev.qdev);
+ide_bus_new(&d->bus[0], &d->dev.qdev, 0);
+ide_bus_new(&d->bus[1], &d->dev.qdev, 1);
 ide_init2(&d->bus[0], irq[0]);
 ide_init2(&d->bus[1], irq[1]);
 
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index d652e06..c0a1abc 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -448,6 +448,7 @@ struct IDEBus {
 IDEDevice *slave;
 BMDMAState *bmdma;
 IDEState ifs[2];
+int bus_id;
 uint8_t unit;
 uint8_t cmd;
 qemu_irq irq;
@@ -565,7 +566,7 @@ void ide_init2_with_non_qdev_drives(IDEBus *bus, DriveInfo 
*hd0,
 void ide_init_ioport(IDEBus *bus, int iobase, int iobase2);
 
 /* hw/ide/qdev.c */
-void ide_bus_new(IDEBus *idebus, DeviceState *dev);
+void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id);
 IDEDevice *ide_create_drive(IDEBus *bus, int unit, DriveInfo *drive);
 
 #endif /* HW_IDE_INTERNAL_H */
diff --git a/hw/ide/isa.c b/hw/ide/isa.c
index 4206afd..8c59c5a 100644
--- a/hw/ide/isa.c
+++ b/hw/ide/isa.c
@@ -67,7 +67,7 @@ static int isa_ide_initfn(ISADevice *dev)
 {
 ISAIDEState *s = DO_UPCAST(ISAIDEState, dev, dev);
 
-ide_bus_new(&s->bus, &s->dev.qdev);
+ide_bus_new(&s->bus, &s->dev.qdev, 0);
 ide_init_ioport(&s->bus, s->iobase, s->iobase2);
 isa_init_irq(dev, &s->irq, s->isairq);
 isa_init_ioport_range(dev, s->iobase, 8);
diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index 07483e8..d0b04a3 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -129,8 +129,8 @@ static int pci_piix_ide_initfn(PCIIDEState *d)
 
 vmstate_register(&d->dev.qdev, 0, &vmstate_ide_pci, d);
 
-ide_bus_new(&d->bus[0], &d->dev.qdev);
-ide_bus_new(&d->bus[1], &d->dev.qdev);
+ide_bus_new(&d->bus[0], &d->dev.qdev, 0);
+ide_bus_new(&d->bus[1], &d->dev.qdev, 1);
 ide_init_ioport(&d->bus[0], 0x1f0, 0x3f6);
 ide_init_ioport(&d->bus[1], 0x170, 0x376);
 
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 6d27b60..88ff657 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -29,9 +29,10 @@ static struct BusInfo ide_bus_info = {
 .size  = sizeof(IDEBus),
 };
 
-void ide_bus_new(IDEBus *idebus, DeviceState *dev)
+void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id)
 {
 qbus_create_inplace(&idebus->qbus, &ide_bus_info, dev, NULL);
+idebus->bus_id = bus_id;
 }
 
 static int ide_qdev_init(DeviceState *qdev, DeviceInfo *base)
diff --git a/hw/ide/via.c b/hw/ide/via.c
index b2c7cad..cc48b2b 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -158,8 +158,8 @@ static int vt82c686b_ide_initfn(PCIDevice *dev)
 
 vmstate_register(&dev->qdev, 0, &vmstate_ide_pci, d);
 
-ide_bus_new(&d->bus[0], &d->dev.qdev);
-ide_bus_new(&d->bus[1], &d->dev.qdev);
+ide_bus_new(&d->bus[0], &d->dev.qdev, 0);
+ide_bus_new(&d->bus[1], &d->dev.qdev, 1);
 ide_init2(&d->bus[0], isa_reserve_irq(14));
 ide_init2(&d->bus[1], isa_reserve_irq(15));
 ide_init_ioport(&d->bus[0], 0x1f0, 0x3f6);
-- 
1.7.1

[Qemu-devel] [PATCHv5 09/15] Record which USBDevice USBPort belongs too.

2010-11-15 Thread Gleb Natapov

Ports on root hub will have NULL here. This is needed to reconstruct
path from device to its root hub to build device path.

Signed-off-by: Gleb Natapov 
---
 hw/usb-bus.c  |3 ++-
 hw/usb-hub.c  |2 +-
 hw/usb-musb.c |2 +-
 hw/usb-ohci.c |2 +-
 hw/usb-uhci.c |2 +-
 hw/usb.h  |3 ++-
 6 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/hw/usb-bus.c b/hw/usb-bus.c
index b692503..256b881 100644
--- a/hw/usb-bus.c
+++ b/hw/usb-bus.c
@@ -110,11 +110,12 @@ USBDevice *usb_create_simple(USBBus *bus, const char 
*name)
 }
 
 void usb_register_port(USBBus *bus, USBPort *port, void *opaque, int index,
-   usb_attachfn attach)
+   USBDevice *pdev, usb_attachfn attach)
 {
 port->opaque = opaque;
 port->index = index;
 port->attach = attach;
+port->pdev = pdev;
 QTAILQ_INSERT_TAIL(&bus->free, port, next);
 bus->nfree++;
 }
diff --git a/hw/usb-hub.c b/hw/usb-hub.c
index 8e3a96b..8a3f829 100644
--- a/hw/usb-hub.c
+++ b/hw/usb-hub.c
@@ -535,7 +535,7 @@ static int usb_hub_initfn(USBDevice *dev)
 for (i = 0; i < s->nb_ports; i++) {
 port = &s->ports[i];
 usb_register_port(usb_bus_from_device(dev),
-  &port->port, s, i, usb_hub_attach);
+  &port->port, s, i, &s->dev, usb_hub_attach);
 port->wPortStatus = PORT_STAT_POWER;
 port->wPortChange = 0;
 }
diff --git a/hw/usb-musb.c b/hw/usb-musb.c
index 7f15842..9efe7a6 100644
--- a/hw/usb-musb.c
+++ b/hw/usb-musb.c
@@ -343,7 +343,7 @@ struct MUSBState {
 }
 
 usb_bus_new(&s->bus, NULL /* FIXME */);
-usb_register_port(&s->bus, &s->port, s, 0, musb_attach);
+usb_register_port(&s->bus, &s->port, s, 0, NULL, musb_attach);
 
 return s;
 }
diff --git a/hw/usb-ohci.c b/hw/usb-ohci.c
index c60fd8d..59604cf 100644
--- a/hw/usb-ohci.c
+++ b/hw/usb-ohci.c
@@ -1705,7 +1705,7 @@ static void usb_ohci_init(OHCIState *ohci, DeviceState 
*dev,
 usb_bus_new(&ohci->bus, dev);
 ohci->num_ports = num_ports;
 for (i = 0; i < num_ports; i++) {
-usb_register_port(&ohci->bus, &ohci->rhport[i].port, ohci, i, 
ohci_attach);
+usb_register_port(&ohci->bus, &ohci->rhport[i].port, ohci, i, NULL, 
ohci_attach);
 }
 
 ohci->async_td = 0;
diff --git a/hw/usb-uhci.c b/hw/usb-uhci.c
index 1d83400..b9b822f 100644
--- a/hw/usb-uhci.c
+++ b/hw/usb-uhci.c
@@ -1115,7 +1115,7 @@ static int usb_uhci_common_initfn(UHCIState *s)
 
 usb_bus_new(&s->bus, &s->dev.qdev);
 for(i = 0; i < NB_PORTS; i++) {
-usb_register_port(&s->bus, &s->ports[i].port, s, i, uhci_attach);
+usb_register_port(&s->bus, &s->ports[i].port, s, i, NULL, uhci_attach);
 }
 s->frame_timer = qemu_new_timer(vm_clock, uhci_frame_timer, s);
 s->expire_time = qemu_get_clock(vm_clock) +
diff --git a/hw/usb.h b/hw/usb.h
index 00d2802..0b32d77 100644
--- a/hw/usb.h
+++ b/hw/usb.h
@@ -203,6 +203,7 @@ struct USBPort {
 USBDevice *dev;
 usb_attachfn attach;
 void *opaque;
+USBDevice *pdev;
 int index; /* internal port index, may be used with the opaque */
 QTAILQ_ENTRY(USBPort) next;
 };
@@ -312,7 +313,7 @@ USBDevice *usb_create(USBBus *bus, const char *name);
 USBDevice *usb_create_simple(USBBus *bus, const char *name);
 USBDevice *usbdevice_create(const char *cmdline);
 void usb_register_port(USBBus *bus, USBPort *port, void *opaque, int index,
-   usb_attachfn attach);
+   USBDevice *pdev, usb_attachfn attach);
 void usb_unregister_port(USBBus *bus, USBPort *port);
 int usb_device_attach(USBDevice *dev);
 int usb_device_detach(USBDevice *dev);
-- 
1.7.1

[Qemu-devel] [PATCHv5 11/15] Add bootindex parameter to net/block/fd device

2010-11-15 Thread Gleb Natapov

If bootindex is specified on command line a string that describes device
in firmware readable way is added into sorted list. Later this list will
be passed into firmware to control boot order.

Signed-off-by: Gleb Natapov 
---
 block_int.h |4 +++-
 hw/e1000.c  |4 
 hw/eepro100.c   |3 +++
 hw/fdc.c|8 
 hw/ide/qdev.c   |5 +
 hw/ne2000.c |3 +++
 hw/pcnet.c  |4 
 hw/qdev.c   |   32 
 hw/qdev.h   |1 +
 hw/rtl8139.c|4 
 hw/usb-net.c|2 ++
 hw/virtio-blk.c |2 ++
 hw/virtio-net.c |2 ++
 net.h   |4 +++-
 sysemu.h|2 ++
 vl.c|   40 
 16 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/block_int.h b/block_int.h
index 3c3adb5..0a0e47d 100644
--- a/block_int.h
+++ b/block_int.h
@@ -227,6 +227,7 @@ typedef struct BlockConf {
 uint16_t logical_block_size;
 uint16_t min_io_size;
 uint32_t opt_io_size;
+int32_t bootindex;
 } BlockConf;
 
 static inline unsigned int get_physical_block_exp(BlockConf *conf)
@@ -249,6 +250,7 @@ static inline unsigned int get_physical_block_exp(BlockConf 
*conf)
 DEFINE_PROP_UINT16("physical_block_size", _state,   \
_conf.physical_block_size, 512), \
 DEFINE_PROP_UINT16("min_io_size", _state, _conf.min_io_size, 0),  \
-DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 0)
+DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 0),\
+DEFINE_PROP_INT32("bootindex", _state, _conf.bootindex, -1) \
 
 #endif /* BLOCK_INT_H */
diff --git a/hw/e1000.c b/hw/e1000.c
index 532efdc..053f33e 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -30,6 +30,7 @@
 #include "net.h"
 #include "net/checksum.h"
 #include "loader.h"
+#include "sysemu.h"
 
 #include "e1000_hw.h"
 
@@ -1148,6 +1149,9 @@ static int pci_e1000_init(PCIDevice *pci_dev)
   d->dev.qdev.info->name, d->dev.qdev.id, d);
 
 qemu_format_nic_info_str(&d->nic->nc, macaddr);
+
+add_boot_device_path(d->conf.bootindex, &pci_dev->qdev, "ethernet-...@0");
+
 return 0;
 }
 
diff --git a/hw/eepro100.c b/hw/eepro100.c
index 41d792a..80adac6 100644
--- a/hw/eepro100.c
+++ b/hw/eepro100.c
@@ -46,6 +46,7 @@
 #include "pci.h"
 #include "net.h"
 #include "eeprom93xx.h"
+#include "sysemu.h"
 
 #define KiB 1024
 
@@ -1907,6 +1908,8 @@ static int e100_nic_init(PCIDevice *pci_dev)
 s->vmstate->name = s->nic->nc.model;
 vmstate_register(&pci_dev->qdev, -1, s->vmstate, s);
 
+add_boot_device_path(s->conf.bootindex, &pci_dev->qdev, "ethernet-...@0");
+
 return 0;
 }
 
diff --git a/hw/fdc.c b/hw/fdc.c
index 5ab754b..7b1349f 100644
--- a/hw/fdc.c
+++ b/hw/fdc.c
@@ -35,6 +35,7 @@
 #include "sysbus.h"
 #include "qdev-addr.h"
 #include "blockdev.h"
+#include "sysemu.h"
 
 //
 /* debug Floppy devices */
@@ -523,6 +524,8 @@ typedef struct FDCtrlSysBus {
 typedef struct FDCtrlISABus {
 ISADevice busdev;
 struct FDCtrl state;
+int32_t bootindexA;
+int32_t bootindexB;
 } FDCtrlISABus;
 
 static uint32_t fdctrl_read (void *opaque, uint32_t reg)
@@ -1992,6 +1995,9 @@ static int isabus_fdc_init1(ISADevice *dev)
 qdev_set_legacy_instance_id(&dev->qdev, iobase, 2);
 ret = fdctrl_init_common(fdctrl);
 
+add_boot_device_path(isa->bootindexA, &dev->qdev, "flo...@0");
+add_boot_device_path(isa->bootindexB, &dev->qdev, "flo...@1");
+
 return ret;
 }
 
@@ -2051,6 +2057,8 @@ static ISADeviceInfo isa_fdc_info = {
 .qdev.props = (Property[]) {
 DEFINE_PROP_DRIVE("driveA", FDCtrlISABus, state.drives[0].bs),
 DEFINE_PROP_DRIVE("driveB", FDCtrlISABus, state.drives[1].bs),
+DEFINE_PROP_INT32("bootindexA", FDCtrlISABus, bootindexA, -1),
+DEFINE_PROP_INT32("bootindexB", FDCtrlISABus, bootindexB, -1),
 DEFINE_PROP_END_OF_LIST(),
 },
 };
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 01a181b..69a00e2 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -21,6 +21,7 @@
 #include "qemu-error.h"
 #include 
 #include "blockdev.h"
+#include "sysemu.h"
 
 /* - */
 
@@ -143,6 +144,10 @@ static int ide_drive_initfn(IDEDevice *dev)
 if (!dev->serial) {
 dev->serial = qemu_strdup(s->drive_serial_str);
 }
+
+add_boot_device_path(dev->conf.bootindex, &dev->qdev,
+ dev->unit ? "d...@1" : "d...@0");
+
 return 0;
 }
 
diff --git a/hw/ne2000.c b/hw/ne2000.c
index 126e7cf..f4bbac2 100644
--- a/hw/ne2000.c
+++ b/hw/ne2000.c
@@ -26,6 +26,7 @@
 #include "net.h"
 #include "ne2000.h"
 #include "loader.h"
+#include "sysemu.h"
 
 /* debug NE2000 card */
 //#define DEBUG_NE2000
@@ -746,6 +747,8 @@ static int pci_ne2000_init(PCIDevice *pci_dev)
 }
 }
 
+add_boot_device_path(s->c.bootindex, &pci_dev->qdev, "

[Qemu-devel] [PATCHv5 14/15] Add notifier that will be called when machine is fully created.

2010-11-15 Thread Gleb Natapov

Action that depends on fully initialized device model should register
with this notifier chain.

Signed-off-by: Gleb Natapov 
---
 sysemu.h |2 ++
 vl.c |   15 +++
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/sysemu.h b/sysemu.h
index 48f8eee..c42f33a 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -60,6 +60,8 @@ void qemu_system_reset(void);
 void qemu_add_exit_notifier(Notifier *notify);
 void qemu_remove_exit_notifier(Notifier *notify);
 
+void qemu_add_machine_init_done_notifier(Notifier *notify);
+
 void do_savevm(Monitor *mon, const QDict *qdict);
 int load_vmstate(const char *name);
 void do_delvm(Monitor *mon, const QDict *qdict);
diff --git a/vl.c b/vl.c
index e8ada75..918d988 100644
--- a/vl.c
+++ b/vl.c
@@ -253,6 +253,9 @@ static void *boot_set_opaque;
 static NotifierList exit_notifiers =
 NOTIFIER_LIST_INITIALIZER(exit_notifiers);
 
+static NotifierList machine_init_done_notifiers =
+NOTIFIER_LIST_INITIALIZER(machine_init_done_notifiers);
+
 int kvm_allowed = 0;
 uint32_t xen_domid;
 enum xen_mode xen_mode = XEN_EMULATE;
@@ -1778,6 +1781,16 @@ static void qemu_run_exit_notifiers(void)
 notifier_list_notify(&exit_notifiers);
 }
 
+void qemu_add_machine_init_done_notifier(Notifier *notify)
+{
+notifier_list_add(&machine_init_done_notifiers, notify);
+}
+
+static void qemu_run_machine_init_done_notifiers(void)
+{
+notifier_list_notify(&machine_init_done_notifiers);
+}
+
 static const QEMUOption *lookup_opt(int argc, char **argv,
 const char **poptarg, int *poptind)
 {
@@ -3023,6 +3036,8 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 
+qemu_run_machine_init_done_notifiers();
+
 qemu_system_reset();
 if (loadvm) {
 if (load_vmstate(loadvm) < 0) {
-- 
1.7.1

Re: [Qemu-devel] [PATCH V6 04/15] Introduce -accel command option.

2010-11-15 Thread Alexander Graf

On 15.11.2010, at 15:47, Anthony PERARD wrote:

> On Mon, 15 Nov 2010, Anthony PERARD wrote:
> 
>> On Mon, 15 Nov 2010, Alexander Graf wrote:
>> 
>>> 
>>> On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote:
>>> 
 From: Anthony PERARD 

 This option gives the ability to switch one "accelerator" like kvm, xen
 or the default one tcg. We can specify more than one accelerator by
 separate them by a comma. QEMU will try each one and use the first whose
 works.

 So,

 -accel xen,kvm,tcg

 which would try Xen support first, then KVM and finaly tcg if none of
 the other works.

 Signed-off-by: Anthony PERARD 
 ---
 qemu-options.hx |   10 ++
 vl.c|   86 
 ---
 2 files changed, 85 insertions(+), 11 deletions(-)

 diff --git a/vl.c b/vl.c
 index df414ef..40a26ee 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -1750,6 +1750,74 @@ static int debugcon_parse(const char *devname)
return 0;
 }

 +static struct {
 +const char *opt_name;
 +const char *name;
 +int (*available)(void);
 +int (*init)(int smp_cpus);
 +int *allowed;
 +} accel_list[] = {
 +{ "tcg", "tcg", NULL, NULL, NULL },
 +{ "kvm", "KVM", kvm_available, kvm_init, &kvm_allowed },
>>> 
>>> Thinking about this a bit more ...
>>> 
>>> kvm_available is a function pointer that gets #defined to (0) when we don't 
>>> have KVM available. I can imagine some compiler might throw a warning on us 
>>> for this one day.
>>> 
>>> Is there a valid reason not to do
>>> 
>>> static inline int kvm_enabled()
>>> {
>>> #ifdef CONFIG_KVM
>>>return kvm_allowed;
>>> #else
>>>return 0;
>>> #endif
>>> }
>>> 
>>> That should compile into the exact same code but be valid for function 
>>> pointers.
>> 
>> I will do this change, as well as for the two others patches.
> 
> Actually, kvm_available is already a function, not a define.
> 
> kvm_enable can be change from define to function, but not in this patch,
> because I don't use it.

Oh no worries for stuff you don't use. As long as kvm_available is a function, 
just make sure that xen_available is one too :). The main incentive is to get 
rid of all those if (!x->available) parts, as we can guarantee there to always 
be a function behind. 

Treating tcg the same way would also help in the long term for non-tcg build of 
qemu which some people are interested in ;).

Alex

Re: [Qemu-devel] [PATCH V6 04/15] Introduce -accel command option.

2010-11-15 Thread Anthony PERARD

On Mon, 15 Nov 2010, Anthony PERARD wrote:

> On Mon, 15 Nov 2010, Alexander Graf wrote:
>
> >
> > On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote:
> >
> > > From: Anthony PERARD 
> > >
> > > This option gives the ability to switch one "accelerator" like kvm, xen
> > > or the default one tcg. We can specify more than one accelerator by
> > > separate them by a comma. QEMU will try each one and use the first whose
> > > works.
> > >
> > > So,
> > >
> > > -accel xen,kvm,tcg
> > >
> > > which would try Xen support first, then KVM and finaly tcg if none of
> > > the other works.
> > >
> > > Signed-off-by: Anthony PERARD 
> > > ---
> > > qemu-options.hx |   10 ++
> > > vl.c|   86 
> > > ---
> > > 2 files changed, 85 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/vl.c b/vl.c
> > > index df414ef..40a26ee 100644
> > > --- a/vl.c
> > > +++ b/vl.c
> > > @@ -1750,6 +1750,74 @@ static int debugcon_parse(const char *devname)
> > > return 0;
> > > }
> > >
> > > +static struct {
> > > +const char *opt_name;
> > > +const char *name;
> > > +int (*available)(void);
> > > +int (*init)(int smp_cpus);
> > > +int *allowed;
> > > +} accel_list[] = {
> > > +{ "tcg", "tcg", NULL, NULL, NULL },
> > > +{ "kvm", "KVM", kvm_available, kvm_init, &kvm_allowed },
> >
> > Thinking about this a bit more ...
> >
> > kvm_available is a function pointer that gets #defined to (0) when we don't 
> > have KVM available. I can imagine some compiler might throw a warning on us 
> > for this one day.
> >
> > Is there a valid reason not to do
> >
> > static inline int kvm_enabled()
> > {
> > #ifdef CONFIG_KVM
> > return kvm_allowed;
> > #else
> > return 0;
> > #endif
> > }
> >
> > That should compile into the exact same code but be valid for function 
> > pointers.
>
> I will do this change, as well as for the two others patches.

Actually, kvm_available is already a function, not a define.

kvm_enable can be change from define to function, but not in this patch,
because I don't use it.

-- 
Anthony PERARD

[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device

2010-11-15 Thread Juan Quintela

"Michael S. Tsirkin"  wrote:
> There's no reason for tap to run when VM is stopped.
> If we let it, it confuses the bridge on TX
> and corrupts DMA memory on RX.
>
> Signed-off-by: Michael S. Tsirkin 

once here, what handlers make sense to run while stopped?
/me can think of the normal console, non live migration, loadvm and not
much more.  Perhaps it is easier to just move the other way around?

Later, Juan.

[Qemu-devel] [PATCH 1/2] virtio-9p: Use chroot to safely access files in passthrough model

2010-11-15 Thread M. Mohan Kumar

In passthrough security model, following symbolic links in the server
side could result in accessing files outside guest's export path.This
could happen under two conditions:
1) If a modified guest kernel is sending symbolic link as part of the
file path and when resolving that symbolic link at server side, it could
result in accessing files outside export path.
2) If a same path is exported to multiple guests and if guest1 tries to
open a file "a/b/c/passwd" and meanwhile guest2 did this "rm -rf a/b/c;
cd a/b; ln -s ../../etc c". If guest1 lookup happened and guest2
completed these operations just before guest1 opening the file, this
operation could result in opening host's /etc/passwd.

Following approach is used to avoid the security issue involved in
following symbolic links in the passthrough model. Create a sub-process
which will chroot into export path, so that even if there is a symbolic
link in the path it could never go beyond the share path.

When qemu is started with passthrough security model, a process is
forked and this sub-process process takes care of accessing files in the
passthrough share path. It does
 * Create socketpair
 * Chroot into share path
 * Read file open request from socket descriptor
 * Open request contains file path, flags, mode, uid, gid, dev etc
 * Based on the request type it creates/opens file/directory/device node
 * Return the file descriptor to main process using socket with
   SCM_RIGHTS as cmsg type.

Main process when ever there is a request for a resource to be
opened/created, it constructs the open request and writes that into
its socket descriptor and reads from chroot process socket to get the
file descriptor.

This patch implements chroot enviroment, provides necessary functions
that can be used by the passthrough function calls.

Signed-off-by: M. Mohan Kumar 
---
 Makefile.objs |1 +
 hw/file-op-9p.h   |2 +
 hw/virtio-9p-chroot.c |  275 +
 hw/virtio-9p.c|   20 
 hw/virtio-9p.h|   21 
 5 files changed, 319 insertions(+), 0 deletions(-)
 create mode 100644 hw/virtio-9p-chroot.c

diff --git a/Makefile.objs b/Makefile.objs
index cd5a24b..134da8e 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -251,6 +251,7 @@ hw-obj-$(CONFIG_SOUND) += $(sound-obj-y)
 
 hw-obj-$(CONFIG_VIRTFS) += virtio-9p-debug.o virtio-9p-local.o 
virtio-9p-xattr.o
 hw-obj-$(CONFIG_VIRTFS) += virtio-9p-xattr-user.o virtio-9p-posix-acl.o
+hw-obj-$(CONFIG_VIRTFS) += virtio-9p-chroot.o
 
 ##
 # libdis
diff --git a/hw/file-op-9p.h b/hw/file-op-9p.h
index c7731c2..149a915 100644
--- a/hw/file-op-9p.h
+++ b/hw/file-op-9p.h
@@ -55,6 +55,8 @@ typedef struct FsContext
 SecModel fs_sm;
 uid_t uid;
 struct xattr_operations **xops;
+pthread_mutex_t chroot_mutex;
+int chroot_socket;
 } FsContext;
 
 extern void cred_init(FsCred *);
diff --git a/hw/virtio-9p-chroot.c b/hw/virtio-9p-chroot.c
new file mode 100644
index 000..50e28a1
--- /dev/null
+++ b/hw/virtio-9p-chroot.c
@@ -0,0 +1,275 @@
+/*
+ * Virtio 9p chroot environment for secured access to exported file
+ * system
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Authors:
+ * M. Mohan Kumar 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the copying file in the top-level directory
+ *
+ */
+
+#include "virtio.h"
+#include "qemu_socket.h"
+#include "qemu-thread.h"
+#include "virtio-9p.h"
+#include 
+#include 
+
+/* Structure used to transfer file descriptor and error info to the main
+ * process. fd will be zero if there was an error(in this case error
+ * will hold the errno). error will be zero and fd will be a valid
+ * identifier when open was success
+ */
+typedef struct {
+int fd;
+int error;
+} FdInfo;
+
+static int sendfd(int sockfd, FdInfo fd_info)
+{
+struct msghdr msg = { };
+struct iovec iov;
+union {
+struct cmsghdr cmsg;
+char control[CMSG_SPACE(sizeof(int))];
+} msg_control;
+struct cmsghdr *cmsg;
+
+iov.iov_base = &fd_info;
+iov.iov_len = sizeof(fd_info);
+
+memset(&msg, 0, sizeof(msg));
+msg.msg_iov = &iov;
+msg.msg_iovlen = 1;
+msg.msg_control = &msg_control;
+msg.msg_controllen = sizeof(msg_control);
+
+cmsg = &msg_control.cmsg;
+cmsg->cmsg_len = CMSG_LEN(sizeof(fd_info.fd));
+cmsg->cmsg_level = SOL_SOCKET;
+cmsg->cmsg_type = SCM_RIGHTS;
+memcpy(CMSG_DATA(cmsg), &fd_info.fd, sizeof(fd_info.fd));
+
+return sendmsg(sockfd, &msg, 0);
+}
+
+static int getfd(int sockfd, int *error)
+{
+struct msghdr msg = { };
+struct iovec iov;
+union {
+struct cmsghdr cmsg;
+char control[CMSG_SPACE(sizeof(int))];
+} msg_control;
+struct cmsghdr *cmsg;
+int retval, fd;
+FdInfo fd_info;
+
+iov.iov_base = &fd_info;
+iov.iov_len = sizeof(fd_info);
+
+memset(&msg, 0, sizeof(msg));
+msg.msg_iov =

[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device

2010-11-15 Thread Anthony Liguori


On 11/15/2010 08:52 AM, Juan Quintela wrote:

"Michael S. Tsirkin"  wrote:
   

There's no reason for tap to run when VM is stopped.
If we let it, it confuses the bridge on TX
and corrupts DMA memory on RX.

Signed-off-by: Michael S. Tsirkin
 

once here, what handlers make sense to run while stopped?
/me can think of the normal console, non live migration, loadvm and not
much more.  Perhaps it is easier to just move the other way around?
   


I'm not sure I concur that this is really a problem.

Semantically, I don't think that stop has to imply that the guest memory 
no longer changes.


Regards,

Anthony Liguori


Later, Juan.

[Qemu-devel] [PATCH 2/2] virtio-9p: Use chroot interface in passthrough model

2010-11-15 Thread M. Mohan Kumar

Make use of chroot interfaces for passthrough security model to fix the
vulnerability in following symbolic links.

Signed-off-by: M. Mohan Kumar 
---
 hw/virtio-9p-local.c |  284 ++
 1 files changed, 218 insertions(+), 66 deletions(-)

diff --git a/hw/virtio-9p-local.c b/hw/virtio-9p-local.c
index 656bfb3..4b72dec 100644
--- a/hw/virtio-9p-local.c
+++ b/hw/virtio-9p-local.c
@@ -19,16 +19,91 @@
 #include 
 #include 
 #include 
+#include 
+
+static int get_fd(FsContext *fs_ctx, const char *path, int flags, FsCred 
*credp)
+{
+V9fsOpenRequest request;
+int fd, error = 0;
+
+memset(&request, 0, sizeof(request));
+request.data.path_len = strlen(path);
+request.path.path = qemu_strdup(path);
+request.data.flags = flags;
+if (credp) {
+request.data.mode = credp->fc_mode;
+request.data.uid = credp->fc_uid;
+request.data.gid = credp->fc_gid;
+request.data.dev = credp->fc_rdev;
+}
+fd = v9fs_getfd(&request, &error, fs_ctx);
+if (error) {
+errno = error;
+} else {
+errno = error;
+}
+qemu_strdup(request.path.path);
+return fd;
+}
+
+static int get_pfd(FsContext *fs_ctx, const char *path)
+{
+V9fsOpenRequest request;
+int fd, error = 0;
+char *dpath = qemu_strdup(path);
+
+memset(&request, 0, sizeof(request));
+request.path.path = dirname(dpath);
+request.data.path_len = strlen(request.path.path);
+request.data.flags = O_RDONLY | O_DIRECTORY | O_NOFOLLOW;
+fd = v9fs_getfd(&request, &error, fs_ctx);
+if (error) {
+errno = error;
+} else {
+errno = 0;
+}
+qemu_free(dpath);
+return fd;
+}
 
+static int do_symlink(FsContext *fs_ctx, const char *oldpath,
+const char *newpath, FsCred *credp)
+{
+V9fsOpenRequest request;
+int fd, error = 0;
+
+memset(&request, 0, sizeof(request));
+request.data.path_len = strlen(newpath);
+request.path.path = qemu_strdup(newpath);
+request.data.oldpath_len = strlen(oldpath);
+request.path.old_path = qemu_strdup(oldpath);
+request.data.flags = S_IFLNK | O_CREAT;
+
+if (credp) {
+request.data.mode = credp->fc_mode;
+request.data.uid = credp->fc_uid;
+request.data.gid = credp->fc_gid;
+request.data.dev = credp->fc_rdev;
+}
+fd = v9fs_getfd(&request, &error, fs_ctx);
+if (error) {
+errno = error;
+} else {
+errno = error;
+}
+qemu_strdup(request.path.path);
+return fd;
+}
 
 static int local_lstat(FsContext *fs_ctx, const char *path, struct stat *stbuf)
 {
 int err;
-err =  lstat(rpath(fs_ctx, path), stbuf);
-if (err) {
-return err;
-}
+
 if (fs_ctx->fs_sm == SM_MAPPED) {
+err =  lstat(rpath(fs_ctx, path), stbuf);
+if (err) {
+return err;
+}
 /* Actual credentials are part of extended attrs */
 uid_t tmp_uid;
 gid_t tmp_gid;
@@ -50,6 +125,22 @@ static int local_lstat(FsContext *fs_ctx, const char *path, 
struct stat *stbuf)
 sizeof(dev_t)) > 0) {
 stbuf->st_rdev = tmp_dev;
 }
+} else if (fs_ctx->fs_sm == SM_PASSTHROUGH) {
+int pfd;
+char *base, *basep;
+
+base = qemu_strdup(path);
+basep = basename(base);
+
+pfd = get_pfd(fs_ctx, path);
+err = fstatat(pfd, basep, stbuf, AT_SYMLINK_NOFOLLOW);
+close(pfd);
+free(base);
+} else {
+err =  lstat(rpath(fs_ctx, path), stbuf);
+if (err) {
+return err;
+}
 }
 return err;
 }
@@ -88,21 +179,13 @@ static int local_set_xattr(const char *path, FsCred *credp)
 return 0;
 }
 
-static int local_post_create_passthrough(FsContext *fs_ctx, const char *path,
-FsCred *credp)
+static int local_post_create_none(FsContext *fs_ctx, const char *path,
+FsCred *credp)
 {
 if (chmod(rpath(fs_ctx, path), credp->fc_mode & 0) < 0) {
 return -1;
 }
-if (lchown(rpath(fs_ctx, path), credp->fc_uid, credp->fc_gid) < 0) {
-/*
- * If we fail to change ownership and if we are
- * using security model none. Ignore the error
- */
-if (fs_ctx->fs_sm != SM_NONE) {
-return -1;
-}
-}
+lchown(rpath(fs_ctx, path), credp->fc_uid, credp->fc_gid);
 return 0;
 }
 
@@ -121,9 +204,16 @@ static ssize_t local_readlink(FsContext *fs_ctx, const 
char *path,
 } while (tsize == -1 && errno == EINTR);
 close(fd);
 return tsize;
-} else if ((fs_ctx->fs_sm == SM_PASSTHROUGH) ||
-   (fs_ctx->fs_sm == SM_NONE)) {
+} else if (fs_ctx->fs_sm == SM_NONE) {
 tsize = readlink(rpath(fs_ctx, path), buf, bufsz);
+} else if (fs_ctx->fs_sm == SM_PASSTHROUGH) {
+int pfd;
+char *base;
+base = qemu_strdup(path);
+pfd =

[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device

2010-11-15 Thread Michael S. Tsirkin

On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote:
> On 11/15/2010 08:52 AM, Juan Quintela wrote:
> >"Michael S. Tsirkin"  wrote:
> >>There's no reason for tap to run when VM is stopped.
> >>If we let it, it confuses the bridge on TX
> >>and corrupts DMA memory on RX.
> >>
> >>Signed-off-by: Michael S. Tsirkin
> >once here, what handlers make sense to run while stopped?
> >/me can think of the normal console, non live migration, loadvm and not
> >much more.  Perhaps it is easier to just move the other way around?
> 
> I'm not sure I concur that this is really a problem.
> Semantically, I don't think that stop has to imply that the guest
> memory no longer changes.
> 
> Regards,
> 
> Anthony Liguori
> 
> >Later, Juan.

Well, I do not really know about vmstop that is not for migration.
For most vmstop calls are for migration. And there, the problems are very
real.

First, it's not just memory.  At least for network transmit, sending out
packets with the same MAC from two locations is a problem. See?

For memory, it is much worse:  any memory changes can either get
discarded or not.  This breaks consistency guarantees that guest relies
upon. Imagine virtio index getting updated but content not being
updated.  See?

-- 
MST

[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device

2010-11-15 Thread Michael S. Tsirkin

On Mon, Nov 15, 2010 at 03:52:54PM +0100, Juan Quintela wrote:
> "Michael S. Tsirkin"  wrote:
> > There's no reason for tap to run when VM is stopped.
> > If we let it, it confuses the bridge on TX
> > and corrupts DMA memory on RX.
> >
> > Signed-off-by: Michael S. Tsirkin 
> 
> once here, what handlers make sense to run while stopped?
> /me can think of the normal console, non live migration, loadvm and not
> much more.  Perhaps it is easier to just move the other way around?
> 
> Later, Juan.

vnc? SDL?
Yes, more devices need to be stopped than not, but I
tread carefully to avoid breaking existing functionality.
If you could solve it for all devices in one swoop, that'd be
great. I'm not up to it.

-- 
MST

[Qemu-devel] [PATCH] Introduce -accel command option.

2010-11-15 Thread anthony . perard

From: Anthony PERARD 

This option gives the ability to switch one "accelerator" like kvm, xen
or the default one tcg. We can specify more than one accelerator by
separate them by a comma. QEMU will try each one and use the first whose
works.

So,

-accel xen,kvm,tcg

which would try Xen support first, then KVM and finaly tcg if none of
the other works.

Signed-off-by: Anthony PERARD 
---
 arch_init.c |5 +++
 arch_init.h |1 +
 qemu-options.hx |   10 ++
 vl.c|   85 +++---
 4 files changed, 90 insertions(+), 11 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 4486925..e0d7a4c 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -639,6 +639,11 @@ int audio_available(void)
 #endif
 }
 
+int tcg_available(void)
+{
+return 1;
+}
+
 int kvm_available(void)
 {
 #ifdef CONFIG_KVM
diff --git a/arch_init.h b/arch_init.h
index 682890c..f0fb6a0 100644
--- a/arch_init.h
+++ b/arch_init.h
@@ -27,6 +27,7 @@ void do_acpitable_option(const char *optarg);
 void do_smbios_option(const char *optarg);
 void cpudef_init(void);
 int audio_available(void);
+int tcg_available(void);
 int kvm_available(void);
 int xen_available(void);
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 4d99a58..958d126 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1975,6 +1975,16 @@ Enable KVM full virtualization support. This option is 
only available
 if KVM support is enabled when compiling.
 ETEXI
 
+DEF("accel", HAS_ARG, QEMU_OPTION_accel, \
+"-accel acceluse an accelerator (kvm,xen,tcg), default is tcg\n", 
QEMU_ARCH_ALL)
+STEXI
+...@item -accel @var{accel}[,@var{accel}[,...]]
+...@findex -accel
+This is use to enable an accelerator, in kvm,xen,tcg.
+By default, it use only tcg. If there a more than one accelerator
+specified, the next one is used if the first don't work.
+ETEXI
+
 DEF("xen-domid", HAS_ARG, QEMU_OPTION_xen_domid,
 "-xen-domid id   specify xen guest domain id\n", QEMU_ARCH_ALL)
 DEF("xen-create", 0, QEMU_OPTION_xen_create,
diff --git a/vl.c b/vl.c
index c58583d..2917e32 100644
--- a/vl.c
+++ b/vl.c
@@ -242,6 +242,7 @@ static void *boot_set_opaque;
 static NotifierList exit_notifiers =
 NOTIFIER_LIST_INITIALIZER(exit_notifiers);
 
+static int tcg_allowed = 1;
 int kvm_allowed = 0;
 uint32_t xen_domid;
 enum xen_mode xen_mode = XEN_EMULATE;
@@ -1723,6 +1724,72 @@ static int debugcon_parse(const char *devname)
 return 0;
 }
 
+static int tcg_init(int smp_cpus)
+{
+return 0;
+}
+
+static struct {
+const char *opt_name;
+const char *name;
+int (*available)(void);
+int (*init)(int smp_cpus);
+int *allowed;
+} accel_list[] = {
+{ "tcg", "tcg", tcg_available, tcg_init, &tcg_allowed },
+{ "kvm", "KVM", kvm_available, kvm_init, &kvm_allowed },
+};
+
+static int accel_parse_init(const char *opts)
+{
+const char *p = opts;
+char buf[10];
+int i, ret;
+bool accel_initalised = 0;
+bool init_failed = 0;
+
+while (!accel_initalised && *p != '\0') {
+if (*p == ',') {
+p++;
+}
+p = get_opt_name(buf, sizeof (buf), p, ',');
+for (i = 0; i < ARRAY_SIZE(accel_list); i++) {
+if (strcmp(accel_list[i].opt_name, buf) == 0) {
+ret = accel_list[i].init(smp_cpus);
+if (ret < 0) {
+init_failed = 1;
+if (!accel_list[i].available()) {
+printf("%s not supported for this target\n",
+   accel_list[i].name);
+} else {
+fprintf(stderr, "failed to initialize %s: %s\n",
+accel_list[i].name,
+strerror(-ret));
+}
+} else {
+accel_initalised = 1;
+*(accel_list[i].allowed) = 1;
+}
+break;
+}
+}
+if (i == ARRAY_SIZE(accel_list)) {
+fprintf(stderr, "\"%s\" accelerator does not exist.\n", buf);
+}
+}
+
+if (!accel_initalised) {
+fprintf(stderr, "No accelerator found!\n");
+exit(1);
+}
+
+if (init_failed) {
+fprintf(stderr, "Back to %s accelerator.\n", accel_list[i].name);
+}
+
+return !accel_initalised;
+}
+
 void qemu_add_exit_notifier(Notifier *notify)
 {
 notifier_list_add(&exit_notifiers, notify);
@@ -1802,6 +1869,7 @@ int main(int argc, char **argv, char **envp)
 const char *incoming = NULL;
 int show_vnc_port = 0;
 int defconfig = 1;
+const char *accel_list_opts = "tcg";
 
 #ifdef CONFIG_SIMPLE_TRACE
 const char *trace_file = NULL;
@@ -2409,7 +2477,10 @@ int main(int argc, char **argv, char **envp)
 do_smbios_option(optarg);
 break;
 case QEMU_OPTION_enable_kvm:
-kvm_allowed = 1;
+accel_list_opts = "kvm";
+

Re: [Qemu-devel] [PATCH] make trace options use autoconfy names

2010-11-15 Thread Paolo Bonzini


On 11/15/2010 03:17 PM, Stefan Hajnoczi wrote:

On Sun, Nov 14, 2010 at 1:52 PM, Paolo Bonzini  wrote:

On 11/14/2010 02:38 PM, Andreas Färber wrote:


- --trace-file=*) trace_file="$optarg"
+ --enable-trace-file=*) trace_file="$optarg"
;;


but this should be --with-trace-file=... please. It is not being
enabled, just set to a different value.


--with-* should be reserved for library paths, but I can change it if people
prefer it that way.


Actually I think we have something similar to overriding --prefix here
:).  It's a path that you can set at ./configure time.


Yeah, that's true.  However...


So is it not okay to use --trace-file=?


... Autoconf would not allow unknown options not starting with --enable- 
or --with-.


The rationale to avoid incompatible options in QEMU is this: suppose you 
have a project using Autoconf (e.g. GCC) and you want to drop QEMU as a 
subdirectory in there, e.g. to run the GCC testsuite under QEMU usermode 
emulation (GCC can already do this for other simulators).  To pass 
options to QEMU's configure, you can include them in GCC's commandline. 
 The script will simply pass the option down to QEMU and it will be 
processed there.  However, if you pass --trace-file to GCC's configure 
script, it will complain and stop.


Probably I would use something like --enable-trace-backend=simple:trace- 
if I was adding something similar to an autoconfiscated project.  But 
unless it provides some additional benefit (as is the case with 
cross-compilation support) I want to keep the syntactic changes in my 
patches to the minimum.



But I know nothing of autoconf and --enable-* or --with-* sort of
make sense too.


Whatever, I have to repost the other series anyway, so I'll change to 
--with-.


Paolo

[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device

2010-11-15 Thread Anthony Liguori


On 11/15/2010 09:18 AM, Michael S. Tsirkin wrote:

On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote:
   

On 11/15/2010 08:52 AM, Juan Quintela wrote:
 

"Michael S. Tsirkin"   wrote:
   

There's no reason for tap to run when VM is stopped.
If we let it, it confuses the bridge on TX
and corrupts DMA memory on RX.

Signed-off-by: Michael S. Tsirkin
 

once here, what handlers make sense to run while stopped?
/me can think of the normal console, non live migration, loadvm and not
much more.  Perhaps it is easier to just move the other way around?
   

I'm not sure I concur that this is really a problem.
Semantically, I don't think that stop has to imply that the guest
memory no longer changes.

Regards,

Anthony Liguori

 

Later, Juan.
   

Well, I do not really know about vmstop that is not for migration.
   


They are separate.  For instance, we don't rely on stop to pause pending 
disk I/O because we don't serialize pending disk I/O operations.  
Instead, we flush all pending I/O and rely on the fact that disk I/O 
requests are always submitted in the context of a vCPU operation.  This 
assumption breaks down though with ioeventfd so we need to revisit it.



For most vmstop calls are for migration. And there, the problems are very
real.

First, it's not just memory.  At least for network transmit, sending out
packets with the same MAC from two locations is a problem. See?
   


I agree it's a problem but I'm not sure that just marking fd handlers 
really helps.


Bottom halves can also trigger transmits.  I think that if we put 
something in the network layer that just queued packets if the vm is 
stopped, it would be a more robust solution to the problem.



For memory, it is much worse:  any memory changes can either get
discarded or not.  This breaks consistency guarantees that guest relies
upon. Imagine virtio index getting updated but content not being
updated.  See?
   


If you suppress any I/O then the memory changes don't matter because the 
same changes will happen on the destination too.


I think this basic problem is the same as Kemari.  We can either attempt 
to totally freeze a guest which means stopping all callbacks that are 
device related or we can prevent I/O from happening which should 
introduce enough determinism to fix the problem in practice.


Regards,

Anthony Liguori

Re: [Qemu-devel] Re: [PATCH] pc: disable the BOCHS BIOS panic port

2010-11-15 Thread Bernhard Kohl


Am 15.11.2010 11:09, schrieb ext Alexander Graf:

On 15.11.2010, at 10:53, Bernhard Kohl wrote:

   

Am 01.09.2010 16:44, schrieb Bernhard Kohl:
 

We have an OS which writes to port 0x400 when probing for special hardware.
This causes an exit of the VM. With SeaBIOS this port isn't used anyway.

Signed-off-by: Bernhard Kohl
---
  hw/pc.c |2 --
  1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 69b13bf..3f81229 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -430,8 +430,6 @@ static void bochs_bios_write(void *opaque, uint32_t addr, 
uint32_t val)
  /* Bochs BIOS messages */
  case 0x400:
  case 0x401:
-fprintf(stderr, "BIOS panic at rombios.c, line %d\n", val);
-exit(1);
  case 0x402:
  case 0x403:
  #ifdef DEBUG_BIOS

   

Hi,

could you please look at this?
 

This patch makes that port that was silent before print debug output if 
DEBUG_BIOS is enabled which might be confusing.

How about something like

case 0x400:
case 0x401:
 /* used to be panic, now unused */
 break;


Alex


   

Yes, you are right. I will take your proposal in patch v2.

Bernhard

[Qemu-devel] Re: [PATCH RESENT] msix: allow byte and word reading from mmio

2010-11-15 Thread Bernhard Kohl


Am 15.11.2010 11:42, schrieb ext Michael S. Tsirkin:

On Thu, Aug 19, 2010 at 02:56:51PM +0200, Bernhard Kohl wrote:
   

It's legal that the guest reads a single byte or word from mmio.
 

Interesting. The spec seems to say this:

For all accesses to MSI-X Table and MSI-X PBA fields, software must use
aligned full DWORD or aligned full QWORD transactions; otherwise, the
result is undefined.
   


I will remove the first statement from the commit message and add
something like the comment you proposed below.

   

I have an OS which reads single bytes and it works fine on real
hardware. Maybe this happens due to casting.
 

What do you mean by casting?
   


I did not yet locate the line of code where this happens in the guest.
As all other accesses are dword, I assume that there is some masking,
shifting or casting to a byte variable in the source code, and the
compiler generates a byte access from this.

   

Signed-off-by: Bernhard Kohl
 

I guess we can merge this, but we need a comment I think since this
seems to contradict the spec, and people were sending patches relying on
this.

Does something like the following describe the situation correctly?

/* Note: PCI spec requires that all MSI-X table accesses
are either DWORD or QWORD, size aligned.  Some guests seem to violate
this rule for read accesses, performing single byte reads.
Since it's easy to support this, let's do so.
Also support 16 bit size aligned reads, just in case.
  */
   


Yes, that's is exactly the situation with my guest.
I will add this comment.


Does you guest also do 16 bit reads?  Are accesses at least aligned?
   


I will check this with my guest and the readw function disabled
in the patch. This will take some time.

   

---
  hw/msix.c |   20 
  1 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/hw/msix.c b/hw/msix.c
index d99403a..7dac7f7 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -100,10 +100,22 @@ static uint32_t msix_mmio_readl(void *opaque, 
target_phys_addr_t addr)
  return pci_get_long(page + offset);
  }

-static uint32_t msix_mmio_read_unallowed(void *opaque, target_phys_addr_t addr)
+static uint32_t msix_mmio_readw(void *opaque, target_phys_addr_t addr)
  {
-fprintf(stderr, "MSI-X: only dword read is allowed!\n");
-return 0;
+PCIDevice *dev = opaque;
+unsigned int offset = addr&  (MSIX_PAGE_SIZE - 1)&  ~0x1;
+void *page = dev->msix_table_page;
+
+return pci_get_word(page + offset);
+}
+
+static uint32_t msix_mmio_readb(void *opaque, target_phys_addr_t addr)
+{
+PCIDevice *dev = opaque;
+unsigned int offset = addr&  (MSIX_PAGE_SIZE - 1);
+void *page = dev->msix_table_page;
+
+return pci_get_byte(page + offset);
  }

  static uint8_t msix_pending_mask(int vector)
@@ -198,7 +210,7 @@ static CPUWriteMemoryFunc * const msix_mmio_write[] = {
  };

  static CPUReadMemoryFunc * const msix_mmio_read[] = {
-msix_mmio_read_unallowed, msix_mmio_read_unallowed, msix_mmio_readl
+msix_mmio_readb, msix_mmio_readw, msix_mmio_readl
  };

  /* Should be called from device's map method. */
--
1.7.2.1

[Qemu-devel] Re: [PATCH v3] virtio-9p: fix build on !CONFIG_UTIMENSAT

2010-11-15 Thread M. Mohan Kumar

> This patch introduce a fallback mechanism for old systems that do not
> support utimensat().  This fix build failure with following warnings:
> 
> hw/virtio-9p-local.c: In function 'local_utimensat':
> hw/virtio-9p-local.c:479: warning: implicit declaration of function
> 'utimensat' hw/virtio-9p-local.c:479: warning: nested extern declaration
> of 'utimensat'
> 
> and:
> 
> hw/virtio-9p.c: In function 'v9fs_setattr_post_chmod':
> hw/virtio-9p.c:1410: error: 'UTIME_NOW' undeclared (first use in this
> function) hw/virtio-9p.c:1410: error: (Each undeclared identifier is
> reported only once hw/virtio-9p.c:1410: error: for each function it
> appears in.)
> hw/virtio-9p.c:1413: error: 'UTIME_OMIT' undeclared (first use in this
> function) hw/virtio-9p.c: In function 'v9fs_wstat_post_chmod':
> hw/virtio-9p.c:2905: error: 'UTIME_OMIT' undeclared (first use in this
> function)
> 
> v3:
>   - Use better alternative handling for UTIME_NOW/OMIT
>   - Move qemu_utimensat() to cutils.c
> V2:
>   - Introduce qemu_utimensat()
> 
> Signed-off-by: Hidetoshi Seto 

Looks good to me.

Acked-by: M. Mohan Kumar

Re: [Qemu-devel] [PATCH 1/2] virtio-9p: Use chroot to safely access files in passthrough model

2010-11-15 Thread Stefan Hajnoczi

On Mon, Nov 15, 2010 at 2:52 PM, M. Mohan Kumar  wrote:
> In passthrough security model, following symbolic links in the server
> side could result in accessing files outside guest's export path.This
> could happen under two conditions:
> 1) If a modified guest kernel is sending symbolic link as part of the
> file path and when resolving that symbolic link at server side, it could
> result in accessing files outside export path.
> 2) If a same path is exported to multiple guests and if guest1 tries to
> open a file "a/b/c/passwd" and meanwhile guest2 did this "rm -rf a/b/c;
> cd a/b; ln -s ../../etc c". If guest1 lookup happened and guest2
> completed these operations just before guest1 opening the file, this
> operation could result in opening host's /etc/passwd.
>
> Following approach is used to avoid the security issue involved in
> following symbolic links in the passthrough model. Create a sub-process
> which will chroot into export path, so that even if there is a symbolic
> link in the path it could never go beyond the share path.
>
> When qemu is started with passthrough security model, a process is
> forked and this sub-process process takes care of accessing files in the
> passthrough share path. It does
>  * Create socketpair
>  * Chroot into share path
>  * Read file open request from socket descriptor
>  * Open request contains file path, flags, mode, uid, gid, dev etc
>  * Based on the request type it creates/opens file/directory/device node
>  * Return the file descriptor to main process using socket with
>   SCM_RIGHTS as cmsg type.
>
> Main process when ever there is a request for a resource to be
> opened/created, it constructs the open request and writes that into
> its socket descriptor and reads from chroot process socket to get the
> file descriptor.

How does the child process exit cleanly?  If QEMU crashes will the
child process be orphaned?

>
> This patch implements chroot enviroment, provides necessary functions
> that can be used by the passthrough function calls.
>
> Signed-off-by: M. Mohan Kumar 
> ---
>  Makefile.objs         |    1 +
>  hw/file-op-9p.h       |    2 +
>  hw/virtio-9p-chroot.c |  275 
> +
>  hw/virtio-9p.c        |   20 
>  hw/virtio-9p.h        |   21 
>  5 files changed, 319 insertions(+), 0 deletions(-)
>  create mode 100644 hw/virtio-9p-chroot.c
>
> diff --git a/Makefile.objs b/Makefile.objs
> index cd5a24b..134da8e 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -251,6 +251,7 @@ hw-obj-$(CONFIG_SOUND) += $(sound-obj-y)
>
>  hw-obj-$(CONFIG_VIRTFS) += virtio-9p-debug.o virtio-9p-local.o 
> virtio-9p-xattr.o
>  hw-obj-$(CONFIG_VIRTFS) += virtio-9p-xattr-user.o virtio-9p-posix-acl.o
> +hw-obj-$(CONFIG_VIRTFS) += virtio-9p-chroot.o
>
>  ##
>  # libdis
> diff --git a/hw/file-op-9p.h b/hw/file-op-9p.h
> index c7731c2..149a915 100644
> --- a/hw/file-op-9p.h
> +++ b/hw/file-op-9p.h
> @@ -55,6 +55,8 @@ typedef struct FsContext
>     SecModel fs_sm;
>     uid_t uid;
>     struct xattr_operations **xops;
> +    pthread_mutex_t chroot_mutex;
> +    int chroot_socket;
>  } FsContext;
>
>  extern void cred_init(FsCred *);
> diff --git a/hw/virtio-9p-chroot.c b/hw/virtio-9p-chroot.c
> new file mode 100644
> index 000..50e28a1
> --- /dev/null
> +++ b/hw/virtio-9p-chroot.c
> @@ -0,0 +1,275 @@
> +/*
> + * Virtio 9p chroot environment for secured access to exported file
> + * system
> + *
> + * Copyright IBM, Corp. 2010
> + *
> + * Authors:
> + * M. Mohan Kumar 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See
> + * the copying file in the top-level directory
> + *
> + */
> +
> +#include "virtio.h"
> +#include "qemu_socket.h"
> +#include "qemu-thread.h"
> +#include "virtio-9p.h"
> +#include 
> +#include 
> +
> +/* Structure used to transfer file descriptor and error info to the main
> + * process. fd will be zero if there was an error(in this case error
> + * will hold the errno). error will be zero and fd will be a valid
> + * identifier when open was success
> + */
> +typedef struct {
> +    int fd;
> +    int error;
> +} FdInfo;
> +
> +static int sendfd(int sockfd, FdInfo fd_info)
> +{
> +    struct msghdr msg = { };
> +    struct iovec iov;
> +    union {
> +        struct cmsghdr cmsg;
> +        char control[CMSG_SPACE(sizeof(int))];
> +    } msg_control;
> +    struct cmsghdr *cmsg;
> +
> +    iov.iov_base = &fd_info;
> +    iov.iov_len = sizeof(fd_info);
> +
> +    memset(&msg, 0, sizeof(msg));
> +    msg.msg_iov = &iov;
> +    msg.msg_iovlen = 1;
> +    msg.msg_control = &msg_control;
> +    msg.msg_controllen = sizeof(msg_control);
> +
> +    cmsg = &msg_control.cmsg;
> +    cmsg->cmsg_len = CMSG_LEN(sizeof(fd_info.fd));
> +    cmsg->cmsg_level = SOL_SOCKET;
> +    cmsg->cmsg_type = SCM_RIGHTS;
> +    memcpy(CMSG_DATA(cmsg), &fd_info.fd, sizeof(fd_info.fd));
> +
> +    return sendmsg(sockfd, &msg, 0

Re: [Qemu-devel] [PATCH] Introduce -accel command option.

2010-11-15 Thread Anthony Liguori


On 11/15/2010 09:45 AM, anthony.per...@citrix.com wrote:

From: Anthony PERARD

This option gives the ability to switch one "accelerator" like kvm, xen
or the default one tcg. We can specify more than one accelerator by
separate them by a comma. QEMU will try each one and use the first whose
works.

So,

-accel xen,kvm,tcg

which would try Xen support first, then KVM and finaly tcg if none of
the other works.
   


Should use QemuOpts instead of parsing by hand.  I'd rather it be 
presented as a -machine option too with accel=xen:kvm:tcg to specify order.


Regards,

Anthony Liguori


Signed-off-by: Anthony PERARD
---
  arch_init.c |5 +++
  arch_init.h |1 +
  qemu-options.hx |   10 ++
  vl.c|   85 +++---
  4 files changed, 90 insertions(+), 11 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 4486925..e0d7a4c 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -639,6 +639,11 @@ int audio_available(void)
  #endif
  }

+int tcg_available(void)
+{
+return 1;
+}
+
  int kvm_available(void)
  {
  #ifdef CONFIG_KVM
diff --git a/arch_init.h b/arch_init.h
index 682890c..f0fb6a0 100644
--- a/arch_init.h
+++ b/arch_init.h
@@ -27,6 +27,7 @@ void do_acpitable_option(const char *optarg);
  void do_smbios_option(const char *optarg);
  void cpudef_init(void);
  int audio_available(void);
+int tcg_available(void);
  int kvm_available(void);
  int xen_available(void);

diff --git a/qemu-options.hx b/qemu-options.hx
index 4d99a58..958d126 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1975,6 +1975,16 @@ Enable KVM full virtualization support. This option is 
only available
  if KVM support is enabled when compiling.
  ETEXI

+DEF("accel", HAS_ARG, QEMU_OPTION_accel, \
+"-accel acceluse an accelerator (kvm,xen,tcg), default is tcg\n", 
QEMU_ARCH_ALL)
+STEXI
+...@item -accel @var{accel}[,@var{accel}[,...]]
+...@findex -accel
+This is use to enable an accelerator, in kvm,xen,tcg.
+By default, it use only tcg. If there a more than one accelerator
+specified, the next one is used if the first don't work.
+ETEXI
+
  DEF("xen-domid", HAS_ARG, QEMU_OPTION_xen_domid,
  "-xen-domid id   specify xen guest domain id\n", QEMU_ARCH_ALL)
  DEF("xen-create", 0, QEMU_OPTION_xen_create,
diff --git a/vl.c b/vl.c
index c58583d..2917e32 100644
--- a/vl.c
+++ b/vl.c
@@ -242,6 +242,7 @@ static void *boot_set_opaque;
  static NotifierList exit_notifiers =
  NOTIFIER_LIST_INITIALIZER(exit_notifiers);

+static int tcg_allowed = 1;
  int kvm_allowed = 0;
  uint32_t xen_domid;
  enum xen_mode xen_mode = XEN_EMULATE;
@@ -1723,6 +1724,72 @@ static int debugcon_parse(const char *devname)
  return 0;
  }

+static int tcg_init(int smp_cpus)
+{
+return 0;
+}
+
+static struct {
+const char *opt_name;
+const char *name;
+int (*available)(void);
+int (*init)(int smp_cpus);
+int *allowed;
+} accel_list[] = {
+{ "tcg", "tcg", tcg_available, tcg_init,&tcg_allowed },
+{ "kvm", "KVM", kvm_available, kvm_init,&kvm_allowed },
+};
+
+static int accel_parse_init(const char *opts)
+{
+const char *p = opts;
+char buf[10];
+int i, ret;
+bool accel_initalised = 0;
+bool init_failed = 0;
+
+while (!accel_initalised&&  *p != '\0') {
+if (*p == ',') {
+p++;
+}
+p = get_opt_name(buf, sizeof (buf), p, ',');
+for (i = 0; i<  ARRAY_SIZE(accel_list); i++) {
+if (strcmp(accel_list[i].opt_name, buf) == 0) {
+ret = accel_list[i].init(smp_cpus);
+if (ret<  0) {
+init_failed = 1;
+if (!accel_list[i].available()) {
+printf("%s not supported for this target\n",
+   accel_list[i].name);
+} else {
+fprintf(stderr, "failed to initialize %s: %s\n",
+accel_list[i].name,
+strerror(-ret));
+}
+} else {
+accel_initalised = 1;
+*(accel_list[i].allowed) = 1;
+}
+break;
+}
+}
+if (i == ARRAY_SIZE(accel_list)) {
+fprintf(stderr, "\"%s\" accelerator does not exist.\n", buf);
+}
+}
+
+if (!accel_initalised) {
+fprintf(stderr, "No accelerator found!\n");
+exit(1);
+}
+
+if (init_failed) {
+fprintf(stderr, "Back to %s accelerator.\n", accel_list[i].name);
+}
+
+return !accel_initalised;
+}
+
  void qemu_add_exit_notifier(Notifier *notify)
  {
  notifier_list_add(&exit_notifiers, notify);
@@ -1802,6 +1869,7 @@ int main(int argc, char **argv, char **envp)
  const char *incoming = NULL;
  int show_vnc_port = 0;
  int defconfig = 1;
+const char *accel_list_opts = "tcg";

  #ifdef CONFIG_SIMPLE_TRACE
  const char *trace_file = NULL;

[Qemu-devel] Re: [PATCH] ceph/rbd block driver for qemu-kvm (v7)

2010-11-15 Thread Christian Brunner

Hi Stefan,

thanks for your feedback. Yehuda and Sage have already committed some
pathes to our git repository.

What I'm not sure about is the rados_(de)initialization for multiple
rbd images. I suspect that _deinitialize should only be called for the
last rbd image.

Yehuda and Sage know librados a lot better than me. I pretty sure,
that they will give some feedback about this remaining issue. After
that we will send an updated patch.

Regards,
Christian

2010/11/11 Stefan Hajnoczi :
> On Fri, Oct 15, 2010 at 8:54 PM, Christian Brunner  wrote:
>> [...]
>> +
>> +    if ((r = rados_initialize(0, NULL)) < 0) {
>> +        error_report("error initializing");
>> +        return r;
>> +    }
>
> Does rados_initialize() work when called multiple times?  This would happen if
> the VM has several rbd devices attached.
>
> [...]
>
> Stefan
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

[Qemu-devel] Re: Where's gpxe-eepro100-80862449.rom ?

2010-11-15 Thread Stefan Weil


Am 25.10.2010 18:54, schrieb Michael S. Tsirkin:

On Mon, Oct 25, 2010 at 06:23:24PM +0200, Stefan Weil wrote:
   

Am 25.10.2010 14:11, schrieb Markus Armbruster:
 

Stefan Weil  writes:

   

Am 13.10.2010 09:13, schrieb Markus Armbruster:
 

Stefan Weil  writes:
   

[...]
   

Do you think there is urgent need for a
gpxe-eepro100-80862449.rom binary?

 

Well, "-device i82801" complains because it misses this binary. Do we
want to ship that way?

   

I just sent two patches which create the rom data on load.
So -device i82801 no longer complains but gets the boot
information from dhcp (tested with i386-softmmu).

Your build tree will be made smaller by at least 50 KB :-)

Do you think these patches should be added to stable-0.13, too?
 

If the gain for "-device i82801" is worth the risk, which depends on the
patch. Have we reached consensus on how to fix it in master?
   


The latest patch version fixes rom data only for the default
roms, so the risk is minimized and full tests are possible.

There is still no consensus whether we need a new qdev property
(which allows users to have their rom data fixed, too) or not.

My patch does not prevent adding that new qdev property,
so I suggest applying my patch now and adding the property
later (if it is ever needed).

Regards
Stefan
 

Fair enough, I guess.
   


So is there consensus that both patches can be committed to qemu master?

[Qemu-devel] Re: [PATCH] make-release: fix mtime for a wider range of git versions

2010-11-15 Thread Bernhard Kohl


Am 15.11.2010 12:48, schrieb Bernhard Kohl:

With the latest git versions, e.g. 1.7.2.3, git still prints out
the tag info in addition to the requested format. So let's simply
fetch the first line from the output.

In addition I use the --pretty option instead of --format which
is not recognized in very old git versions, e.g. 1.5.5.6.

Tested with git versions 1.5.5.6 and 1.7.2.3.

Signed-off-by: Bernhard Kohl
   


Sorry, I sent this to the wrong list. Resent to kvm!

Bernhard


---
  kvm/scripts/make-release |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kvm/scripts/make-release b/kvm/scripts/make-release
index 56302c3..2d050fc 100755
--- a/kvm/scripts/make-release
+++ b/kvm/scripts/make-release
@@ -51,7 +51,7 @@ cd "$(dirname "$0")"/../..
  mkdir -p "$(dirname "$tarball")"
  git archive --prefix="$name/" --format=tar "$commit">  "$tarball"

-mtime=`git show --format=%ct "$commit""^{commit}" --`
+mtime=`git show --pretty=format:%ct "$commit""^{commit}" -- | head -n 1`
  tarargs="--owner=root --group=root"

  mkdir -p "$tmpdir/$name"

[Qemu-devel] Re: [PATCH] ceph/rbd block driver for qemu-kvm (v7)

2010-11-15 Thread Yehuda Sadeh Weinraub

On Mon, Nov 15, 2010 at 9:04 AM, Christian Brunner
 wrote:
> Hi Stefan,
>
> thanks for your feedback. Yehuda and Sage have already committed some
> pathes to our git repository.
>
> What I'm not sure about is the rados_(de)initialization for multiple
> rbd images. I suspect that _deinitialize should only be called for the
> last rbd image.
>
> Yehuda and Sage know librados a lot better than me. I pretty sure,
> that they will give some feedback about this remaining issue. After
> that we will send an updated patch.
>
> Regards,
> Christian
>
> 2010/11/11 Stefan Hajnoczi :
>> On Fri, Oct 15, 2010 at 8:54 PM, Christian Brunner  wrote:
>>> [...]
>>> +
>>> +    if ((r = rados_initialize(0, NULL)) < 0) {
>>> +        error_report("error initializing");
>>> +        return r;
>>> +    }
>>
>> Does rados_initialize() work when called multiple times?  This would happen 
>> if
>> the VM has several rbd devices attached.
>>
>> [...]
>>
>> Stefan
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

The rados (de)initialization is refcounted and it is safe to call it
multiple times.

[Qemu-devel] [Bug 611142] Re: seabios should have native scsi support

2010-11-15 Thread Serge Hallyn

(Clearly this must be done upstream first, so marking Triaged for the Ubuntu
task)

** Changed in: qemu-kvm (Ubuntu)
   Status: New => Triaged

-- 
seabios should have native scsi support
https://bugs.launchpad.net/bugs/611142
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.

Status in QEMU: New
Status in “qemu-kvm” package in Ubuntu: Triaged
Status in “seabios” package in Ubuntu: New

Bug description:
Binary package hint: seabios

Currently when a grub multiboot image is booted with 'kvm -kernel' and 
'biosdisk' module, it will see block devices of type IDE or virtio.  It will 
not see scsi devices.

To demonstrate this:
$ qemu-img create -f qcow2 disk.img 1G
$ grub-mkrescue --output=rescue.iso
$ grub-mkimage -O i386-pc --output=grub-mb.img biosdisk minicmd part_msdos

An 'ls' inside the grub prompt will show hard disks (hd0) on if:
a.) -drive uses interface of virtio or scsi
or
b.) kvm boots boot from a cdrom or floppy 

For example, with these commands, grub will see a '(hd0)'
$ kvm -drive file=disk.img,if=scsi,boot=on -cdrom rescue.iso -boot d
$ kvm -drive file=disk.img,if=scsi,boot=on -floppy rescue.iso -boot a
$ kvm -drive file=disk.img,if=virtio,boot=on -cdrom rescue.iso -boot d
$ kvm -drive file=disk.img,if=ide,boot=on -cdrom rescue.iso -boot d
$ kvm -drive file=disk.img,if=virtio,boot=on -kernel grub-mb.img 
$ kvm -drive file=disk.img,if=ide,boot=on -kernel grub-mb.img 

But the following will not:
$ kvm -drive file=disk.img,if=scsi,boot=on -kernel grub-mb.img

ProblemType: Bug
DistroRelease: Ubuntu 10.10
Package: seabios 0.6.0-0ubuntu1
ProcVersionSignature: User Name 2.6.32-305.9-ec2 2.6.32.11+drm33.2
Uname: Linux 2.6.32-305-ec2 i686
Architecture: i386
Date: Thu Jul 29 03:21:21 2010
Dependencies:
 
Ec2AMI: ami-e930db80
Ec2AMIManifest: 
ubuntu-images-testing-us/ubuntu-maverick-daily-i386-server-20100727.manifest.xml
Ec2AvailabilityZone: us-east-1b
Ec2InstanceType: m1.small
Ec2Kernel: aki-407d9529
Ec2Ramdisk: unavailable
PackageArchitecture: all
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: seabios

[Qemu-devel] [Bug 241119] Re: usb_add of a Creative ZEN unrecognized in guest

2010-11-15 Thread Alexandrosg

Hi. I have a similar problem, with a simple JetFlash usb drive.

After adding it with usb_add, "info usb" shows nothing and I start getting the 
following message in the console:
husb: config #1 need -1
husb: 1 interfaces claimed for configuration 1
husb: grabbed usb device 1.3
usb_linux_update_endp_table: Protocol error

This is under Maverick using qemu-kvm 0.12.5+noroms-0ubuntu7

The same operation works fine in the same pc under my gentoo system,
currently running qemu-kvm-0.13.0, but it has been working now for a
year, without ever giving me problems.

-- 
usb_add of a Creative ZEN unrecognized in guest
https://bugs.launchpad.net/bugs/241119
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.

Status in QEMU: Incomplete
Status in “qemu-kvm” package in Ubuntu: Confirmed

Bug description:
Binary package hint: kvm

This happens when I add my Creative ZEN to a virtual machine running XP. The 
device is recognised well at first and drivers are installed correctly. But 
when trying to connect windows crashes with the classic blue screen It 
complains about something like usbohci.sys, I can't read well because it 
crashes too fast.
I have also tried with another virtual machine running Vista, same results.
Any help would be really appreciated!

I'm using the module kvm-amd with Ubuntu 8.04

The USB device has the following ID: 041e:4157 Creative Technology, Ltd

kvm:
  Installed: 1:62+dfsg-0ubuntu7
  Candidate: 1:62+dfsg-0ubuntu7
  Version table:
 *** 1:62+dfsg-0ubuntu7 0
500 http://archive.ubuntu.com hardy/main Packages
100 /var/lib/dpkg/status

[Qemu-devel] Re: [PATCH] pci: allow hotplug removal of cold-plugged devices

2010-11-15 Thread Cam Macdonell

On Sun, Nov 14, 2010 at 7:18 AM, Michael S. Tsirkin  wrote:
> This patch fixes 5beb8ad503c88a76f2b8106c3b74b4ce485a60e1
> which broke hotplug removal of cold plugged devices:
>
> - pass addition/removal state to hotplug callbacks
> - use that in piix and pcie
>
> This also fixes an assert on hotplug removal of coldplugged
> express devices.
>
> Reported-by: by Cam Macdonell .
> Signed-off-by: Isaku Yamahata 
> Signed-off-by: Michael S. Tsirkin 

Acked-by: Cam Macdonell 

> ---
>
> So I think the below would be the cleanest way
> to fix the bug as we keep the hot/cold plug logic
> in a central palce. Untested. Comments? Cam?

Yes, it seems to fix the problem.

>
>
> diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> index 66c7885..f549089 100644
> --- a/hw/acpi_piix4.c
> +++ b/hw/acpi_piix4.c
> @@ -585,7 +585,8 @@ static void pciej_write(void *opaque, uint32_t addr, 
> uint32_t val)
>     PIIX4_DPRINTF("pciej write %x <== %d\n", addr, val);
>  }
>
> -static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev, int 
> state);
> +static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
> +                                PCIHotplugState state);
>
>  static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
>  {
> @@ -615,18 +616,23 @@ static void disable_device(PIIX4PMState *s, int slot)
>     s->pci0_status.down |= (1 << slot);
>  }
>
> -static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev, int state)
> +static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
> +                               PCIHotplugState state)
>  {
>     int slot = PCI_SLOT(dev->devfn);
>     PIIX4PMState *s = DO_UPCAST(PIIX4PMState, dev,
>                                 DO_UPCAST(PCIDevice, qdev, qdev));
>
> -    if (!dev->qdev.hotplugged)
> +    /* Don't send event when device is enabled during qemu machine creation:
> +     * it is present on boot, no hotplug event is necessary. We do send an
> +     * event when the device is disabled later. */
> +    if (state == PCI_COLDPLUG_ENABLED) {
>         return 0;
> +    }
>
>     s->pci0_status.up = 0;
>     s->pci0_status.down = 0;
> -    if (state) {
> +    if (state == PCI_HOTPLUG_ENABLED) {
>         enable_device(s, slot);
>     } else {
>         disable_device(s, slot);
> diff --git a/hw/pci.c b/hw/pci.c
> index 30e1603..316b24f 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -1566,8 +1566,11 @@ static int pci_qdev_init(DeviceState *qdev, DeviceInfo 
> *base)
>     pci_add_option_rom(pci_dev);
>
>     if (bus->hotplug) {
> -        /* lower layer must check qdev->hotplugged */
> -        rc = bus->hotplug(bus->hotplug_qdev, pci_dev, 1);
> +        /* Let buses differentiate between hotplug and when device is
> +         * enabled during qemu machine creation. */
> +        rc = bus->hotplug(bus->hotplug_qdev, pci_dev,
> +                          qdev->hotplugged ? PCI_HOTPLUG_ENABLED:
> +                          PCI_COLDPLUG_ENABLED);
>         if (rc != 0) {
>             int r = pci_unregister_device(&pci_dev->qdev);
>             assert(!r);
> @@ -1581,7 +1584,8 @@ static int pci_unplug_device(DeviceState *qdev)
>  {
>     PCIDevice *dev = DO_UPCAST(PCIDevice, qdev, qdev);
>
> -    return dev->bus->hotplug(dev->bus->hotplug_qdev, dev, 0);
> +    return dev->bus->hotplug(dev->bus->hotplug_qdev, dev,
> +                             PCI_HOTPLUG_DISABLED);
>  }
>
>  void pci_qdev_register(PCIDeviceInfo *info)
> diff --git a/hw/pci.h b/hw/pci.h
> index 7100804..09b3e4c 100644
> --- a/hw/pci.h
> +++ b/hw/pci.h
> @@ -214,7 +214,15 @@ int pci_device_load(PCIDevice *s, QEMUFile *f);
>
>  typedef void (*pci_set_irq_fn)(void *opaque, int irq_num, int level);
>  typedef int (*pci_map_irq_fn)(PCIDevice *pci_dev, int irq_num);
> -typedef int (*pci_hotplug_fn)(DeviceState *qdev, PCIDevice *pci_dev, int 
> state);
> +
> +typedef enum {
> +    PCI_HOTPLUG_DISABLED,
> +    PCI_HOTPLUG_ENABLED,
> +    PCI_COLDPLUG_ENABLED,
> +} PCIHotplugState;
> +
> +typedef int (*pci_hotplug_fn)(DeviceState *qdev, PCIDevice *pci_dev,
> +                              PCIHotplugState state);
>  void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
>                          const char *name, int devfn_min);
>  PCIBus *pci_bus_new(DeviceState *parent, const char *name, int devfn_min);
> diff --git a/hw/pcie.c b/hw/pcie.c
> index 35918f7..4df48b8 100644
> --- a/hw/pcie.c
> +++ b/hw/pcie.c
> @@ -192,14 +192,16 @@ static void pcie_cap_slot_event(PCIDevice *dev, 
> PCIExpressHotPlugEvent event)
>  }
>
>  static int pcie_cap_slot_hotplug(DeviceState *qdev,
> -                                 PCIDevice *pci_dev, int state)
> +                                 PCIDevice *pci_dev, PCIHotplugState state)
>  {
>     PCIDevice *d = DO_UPCAST(PCIDevice, qdev, qdev);
>     uint8_t *exp_cap = d->config + d->exp.exp_cap;
>     uint16_t sltsta = pci_get_word(exp_cap + PCI_EXP_SLTSTA);
>
> -    if (!pci_dev->qdev.hotplugged) {
> -        assert(state); /* this case only happe

[Qemu-devel] [PATCH 0/3] v11: Threadlets: A generic task offloading framework

2010-11-15 Thread Arun R Bharadwaj

Hi,

This is the v11 of the refactored patch-series to have a generic
asynchronous task offloading framework (called threadlets)
within qemu.

I have run KVM autotest suite with this patch. This test suite
ran successfully for the following tests:
-connecthon
-ebizzy
-dbench
-fsstress
-disktest
-cpu_hotplug
-hackbench
-sleeptest

Changelog:

* Moved the qemu_cond_broadcast to the right place.
* Removed unnecessary extern qualifiers.



The following series implements...

---

Arun R Bharadwaj (1):
  Move threadlets infrastructure to qemu-threadlets.c

Gautham R Shenoy (2):
  Make paio subsystem use threadlets infrastructure
  Add helper functions to enable virtio-9p make use of the threadlets


 Makefile.objs  |3 -
 configure  |2 
 docs/async-support.txt |  141 ++
 hw/virtio-9p.c |  164 ++-
 posix-aio-compat.c |  227 +++-
 qemu-threadlets.c  |  189 
 qemu-threadlets.h  |   47 ++
 vl.c   |3 +
 8 files changed, 599 insertions(+), 177 deletions(-)
 create mode 100644 docs/async-support.txt
 create mode 100644 qemu-threadlets.c
 create mode 100644 qemu-threadlets.h

-- 
arun

[Qemu-devel] [PATCH 1/3] Make paio subsystem use threadlets infrastructure

2010-11-15 Thread Arun R Bharadwaj

From: Gautham R Shenoy 

This patch creates a generic asynchronous-task-offloading infrastructure named
threadlets.

The patch creates a global queue on-to which subsystems can queue their tasks to
be executed asynchronously.

The patch also provides API's that allow a subsystem to create a private queue
with an associated pool of threads.

The patch has been tested with fstress.

Signed-off-by: Arun R Bharadwaj 
Signed-off-by: Aneesh Kumar K.V 
Signed-off-by: Gautham R Shenoy 
Signed-off-by: Sripathi Kodi 
---
 Makefile.objs  |2 
 configure  |2 
 posix-aio-compat.c |  354 +++-
 3 files changed, 211 insertions(+), 147 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index cd5a24b..3b7ec27 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -9,6 +9,7 @@ qobject-obj-y += qerror.o
 
 block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o
 block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o
+block-obj-$(CONFIG_POSIX) += qemu-thread.o
 block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
@@ -124,7 +125,6 @@ endif
 common-obj-y += $(addprefix ui/, $(ui-obj-y))
 
 common-obj-y += iov.o acl.o
-common-obj-$(CONFIG_THREAD) += qemu-thread.o
 common-obj-y += notify.o event_notifier.o
 common-obj-y += qemu-timer.o
 
diff --git a/configure b/configure
index a079a49..addf733 100755
--- a/configure
+++ b/configure
@@ -2456,7 +2456,6 @@ if test "$vnc_png" != "no" ; then
 fi
 if test "$vnc_thread" != "no" ; then
   echo "CONFIG_VNC_THREAD=y" >> $config_host_mak
-  echo "CONFIG_THREAD=y" >> $config_host_mak
 fi
 if test "$fnmatch" = "yes" ; then
   echo "CONFIG_FNMATCH=y" >> $config_host_mak
@@ -2534,7 +2533,6 @@ if test "$xen" = "yes" ; then
 fi
 if test "$io_thread" = "yes" ; then
   echo "CONFIG_IOTHREAD=y" >> $config_host_mak
-  echo "CONFIG_THREAD=y" >> $config_host_mak
 fi
 if test "$linux_aio" = "yes" ; then
   echo "CONFIG_LINUX_AIO=y" >> $config_host_mak
diff --git a/posix-aio-compat.c b/posix-aio-compat.c
index 7b862b5..e1812fc 100644
--- a/posix-aio-compat.c
+++ b/posix-aio-compat.c
@@ -29,7 +29,33 @@
 #include "block_int.h"
 
 #include "block/raw-posix-aio.h"
+#include "qemu-thread.h"
 
+#define MAX_GLOBAL_THREADS  64
+#define MIN_GLOBAL_THREADS   8
+
+static QemuMutex aiocb_mutex;
+static QemuCond aiocb_completion;
+
+typedef struct ThreadletQueue
+{
+QemuMutex lock;
+QemuCond cond;
+int max_threads;
+int min_threads;
+int cur_threads;
+int idle_threads;
+QTAILQ_HEAD(, ThreadletWork) request_list;
+} ThreadletQueue;
+
+typedef struct ThreadletWork
+{
+QTAILQ_ENTRY(ThreadletWork) node;
+void (*func)(struct ThreadletWork *work);
+} ThreadletWork;
+
+static ThreadletQueue globalqueue;
+static int globalqueue_init;
 
 struct qemu_paiocb {
 BlockDriverAIOCB common;
@@ -44,13 +70,12 @@ struct qemu_paiocb {
 int ev_signo;
 off_t aio_offset;
 
-QTAILQ_ENTRY(qemu_paiocb) node;
 int aio_type;
 ssize_t ret;
-int active;
 struct qemu_paiocb *next;
 
 int async_context_id;
+ThreadletWork work;
 };
 
 typedef struct PosixAioState {
@@ -58,64 +83,169 @@ typedef struct PosixAioState {
 struct qemu_paiocb *first_aio;
 } PosixAioState;
 
+static void *threadlet_worker(void *data)
+{
+ThreadletQueue *queue = data;
 
-static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
-static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
-static pthread_t thread_id;
-static pthread_attr_t attr;
-static int max_threads = 64;
-static int cur_threads = 0;
-static int idle_threads = 0;
-static QTAILQ_HEAD(, qemu_paiocb) request_list;
+qemu_mutex_lock(&queue->lock);
+while (1) {
+ThreadletWork *work;
+int ret = 0;
+
+while (QTAILQ_EMPTY(&queue->request_list) &&
+   (ret != ETIMEDOUT)) {
+/* wait for cond to be signalled or broadcast for 1000s */
+ret = qemu_cond_timedwait((&queue->cond),
+ &(queue->lock), 10*10);
+}
 
-#ifdef CONFIG_PREADV
-static int preadv_present = 1;
-#else
-static int preadv_present = 0;
-#endif
+assert(queue->idle_threads != 0);
+if (QTAILQ_EMPTY(&queue->request_list)) {
+if (queue->cur_threads > queue->min_threads) {
+/* We retain the minimum number of threads */
+break;
+}
+} else {
+work = QTAILQ_FIRST(&queue->request_list);
+QTAILQ_REMOVE(&queue->request_list, work, node);
 
-static void die2(int err, const char *what)
-{
-fprintf(stderr, "%s failed: %s\n", what, strerror(err));
-abort();
+queue->idle_threads--;
+qemu_mutex_unlock(&queue->lock);
+
+/* execute the work function */
+work->func(work);
+
+qemu_mutex_lock(&queue->lock);
+queue->idle_threads++;
+}
+}
+
+queue->idle_th

[Qemu-devel] [PATCH 2/3] Move threadlets infrastructure to qemu-threadlets.c

2010-11-15 Thread Arun R Bharadwaj

The reason for creating this generic infrastructure is so that other subsystems,
such as virtio-9p could make use of it for offloading tasks that could block.

Signed-off-by: Arun R Bharadwaj 
Signed-off-by: Aneesh Kumar K.V 
Signed-off-by: Gautham R Shenoy 
Signed-off-by: Sripathi Kodi 

Acked-by: Stefan Hajnoczi 
---
 Makefile.objs  |1 
 docs/async-support.txt |  141 +++
 posix-aio-compat.c |  173 
 qemu-threadlets.c  |  168 +++
 qemu-threadlets.h  |   46 +
 5 files changed, 357 insertions(+), 172 deletions(-)
 create mode 100644 docs/async-support.txt
 create mode 100644 qemu-threadlets.c
 create mode 100644 qemu-threadlets.h

diff --git a/Makefile.objs b/Makefile.objs
index 3b7ec27..2cf8aba 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -10,6 +10,7 @@ qobject-obj-y += qerror.o
 block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o
 block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o
 block-obj-$(CONFIG_POSIX) += qemu-thread.o
+block-obj-$(CONFIG_POSIX) += qemu-threadlets.o
 block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
diff --git a/docs/async-support.txt b/docs/async-support.txt
new file mode 100644
index 000..9f22b9a
--- /dev/null
+++ b/docs/async-support.txt
@@ -0,0 +1,141 @@
+== How to use the threadlets infrastructure supported in Qemu ==
+
+== Threadlets ==
+
+Q.1: What are threadlets ?
+A.1: Threadlets is an infrastructure within QEMU that allows other subsystems
+ to offload possibly blocking work to a queue to be processed by a pool
+ of threads asynchronously.
+
+Q.2: When would one want to use threadlets ?
+A.2: Threadlets are useful when there are operations that can be performed
+ outside the context of the VCPU/IO threads inorder to free these latter
+ to service any other guest requests.
+
+Q.3: I have some work that can be executed in an asynchronous context. How
+ should I go about it ?
+A.3: One could follow the steps listed below:
+
+ - Define a function which would do the asynchronous work.
+   static void my_threadlet_func(ThreadletWork *work)
+   {
+   }
+
+ - Declare an object of type ThreadletWork;
+   ThreadletWork work;
+
+
+ - Assign a value to the "func" member of ThreadletWork object.
+   work.func = my_threadlet_func;
+
+ - Submit the threadlet to the global queue.
+   submit_threadletwork(&work);
+
+ - Continue servicing some other guest operations.
+
+Q.4: I want to my_threadlet_func to access some non-global data. How do I do
+ that ?
+A.4: Suppose you want my_threadlet_func to access some non-global data-object
+ of type myPrivateData. In that case one could follow the following steps.
+
+ - Define a member of the type ThreadletWork within myPrivateData.
+   typedef struct MyPrivateData {
+   ...;
+   ...;
+   ...;
+   ThreadletWork work;
+   } MyPrivateData;
+
+   MyPrivateData my_data;
+
+ - Initialize myData.work as described in A.3
+   myData.work.func = my_threadlet_func;
+   submit_threadletwork(&myData.work);
+
+ - Access the myData object inside my_threadlet_func() using container_of
+   primitive
+   static void my_threadlet_func(ThreadletWork *work)
+   {
+   myPrivateData *mydata_ptr;
+   mydata_ptr = container_of(work, myPrivateData, work);
+
+   /* mydata_ptr now points to myData object */
+   }
+
+Q.5: Are there any precautions one must take while sharing data with the
+ Asynchronous thread-pool ?
+A.5: Yes, make sure that the helper function of the type my_threadlet_func()
+ does not access/modify data when it can be accessed or modified in the
+ context of VCPU thread or IO thread. This is because the asynchronous
+ threads in the pool can run in parallel with the VCPU/IOThreads as shown
+ in the figure.
+
+ A typical workflow is as follows:
+
+  VCPU/IOThread
+   |
+   | (1)
+   |
+   V
+Offload work  (2)
+  |---> to threadlets -> Helper thread
+  ||   |
+  ||   |
+  || (3)   | (4)
+  ||   |
+  | Handle other Guest requests|
+  ||   |
+  ||   V
+  || (3)  Signal the I/O Thread
+  |(6) |

[Qemu-devel] [PATCH 3/3] Add helper functions to enable virtio-9p make use of the threadlets

2010-11-15 Thread Arun R Bharadwaj

From: Gautham R Shenoy 

infrastructure for offloading blocking tasks such as making posix calls on
to the helper threads and handle the post_posix_operations() from the
context of the iothread. This frees the vcpu thread to process any other guest
operations while the processing of v9fs_io is in progress.

Signed-off-by: Gautham R Shenoy 
Signed-off-by: Sripathi Kodi 
Signed-off-by: Arun R Bharadwaj 
---
 hw/virtio-9p.c |  164 
 posix-aio-compat.c |   30 +++---
 qemu-threadlets.c  |   21 +++
 qemu-threadlets.h  |1 
 vl.c   |3 +
 5 files changed, 196 insertions(+), 23 deletions(-)

diff --git a/hw/virtio-9p.c b/hw/virtio-9p.c
index 7c59988..88da20f 100644
--- a/hw/virtio-9p.c
+++ b/hw/virtio-9p.c
@@ -18,6 +18,7 @@
 #include "fsdev/qemu-fsdev.h"
 #include "virtio-9p-debug.h"
 #include "virtio-9p-xattr.h"
+#include "qemu-threadlets.h"
 
 int debug_9p_pdu;
 
@@ -33,6 +34,149 @@ enum {
 Oappend = 0x80,
 };
 
+typedef struct V9fsPostOp {
+QTAILQ_ENTRY(V9fsPostOp) node;
+void (*func)(void *arg);
+void *arg;
+} V9fsPostOp;
+
+static struct {
+int rfd;
+int wfd;
+QemuMutex lock;
+QTAILQ_HEAD(, V9fsPostOp) post_op_list;
+} v9fs_async_struct;
+
+static void die2(int err, const char *what)
+{
+fprintf(stderr, "%s failed: %s\n", what, strerror(err));
+abort();
+}
+
+static void die(const char *what)
+{
+die2(errno, what);
+}
+
+#define ASYNC_MAX_PROCESS   5
+
+/**
+ * v9fs_process_post_ops: Process any pending v9fs_post_posix_operation
+ * @arg: Not used.
+ *
+ * This function serves as a callback to the iothread to be called into 
whenever
+ * the v9fs_async_struct.wfd is written into. This thread goes through the list
+ * of v9fs_post_posix_operations() and executes them. In the process, it might
+ * queue more job on the asynchronous thread pool.
+ */
+static void v9fs_process_post_ops(void *arg)
+{
+int count = 0;
+struct V9fsPostOp *post_op;
+int ret;
+char byte;
+
+do {
+ret = read(v9fs_async_struct.rfd, &byte, sizeof(byte));
+} while (ret > 0 || (ret == -1 && errno == EINTR));
+
+qemu_mutex_lock(&v9fs_async_struct.lock);
+for (count = 0; count < ASYNC_MAX_PROCESS; count++) {
+if (QTAILQ_EMPTY(&(v9fs_async_struct.post_op_list))) {
+break;
+}
+post_op = QTAILQ_FIRST(&(v9fs_async_struct.post_op_list));
+QTAILQ_REMOVE(&(v9fs_async_struct.post_op_list), post_op, node);
+
+qemu_mutex_unlock(&v9fs_async_struct.lock);
+post_op->func(post_op->arg);
+qemu_free(post_op);
+qemu_mutex_lock(&v9fs_async_struct.lock);
+}
+qemu_mutex_unlock(&v9fs_async_struct.lock);
+}
+
+/**
+ * v9fs_async_signal: Inform the io-thread of completion of async job.
+ *
+ * This function is used to inform the iothread that a particular
+ * async-operation pertaining to v9fs has been completed and that the io thread
+ * can handle the v9fs_post_posix_operation.
+ *
+ * This is based on the aio_signal_handler
+ */
+static inline void v9fs_async_signal(void)
+{
+char byte = 0;
+ssize_t ret;
+int tries = 0;
+
+qemu_mutex_lock(&v9fs_async_struct.lock);
+do {
+assert(tries != 100);
+   ret = write(v9fs_async_struct.wfd, &byte, sizeof(byte));
+tries++;
+} while (ret < 0 && errno == EAGAIN);
+qemu_mutex_unlock(&v9fs_async_struct.lock);
+
+if (ret < 0 && errno != EAGAIN) {
+die("write() in v9fs");
+}
+
+if (kill(getpid(), SIGUSR2)) {
+die("kill failed");
+}
+}
+
+/**
+ * v9fs_async_helper_done: Marks the completion of the v9fs_async job
+ * @func: v9fs_post_posix_func() for post-processing invoked in the context of
+ *the io-thread
+ * @arg: Argument to func.
+ *
+ * This function is called from the context of one of the asynchronous threads
+ * in the thread pool. This is called when the asynchronous thread has finished
+ * executing a v9fs_posix_operation. It's purpose is to initiate the process of
+ * informing the io-thread that the v9fs_posix_operation has completed.
+ */
+static void v9fs_async_helper_done(void (*func)(void *arg), void *arg)
+{
+struct V9fsPostOp *post_op;
+
+post_op = qemu_mallocz(sizeof(*post_op));
+post_op->func = func;
+post_op->arg = arg;
+
+qemu_mutex_lock(&v9fs_async_struct.lock);
+QTAILQ_INSERT_TAIL(&(v9fs_async_struct.post_op_list), post_op, node);
+qemu_mutex_unlock(&v9fs_async_struct.lock);
+
+v9fs_async_signal();
+}
+
+/**
+ * v9fs_do_async_posix: Offload v9fs_posix_operation onto async thread.
+ * @vs: V9fsOPState variable for the OP operation.
+ * @posix_fn: The posix function which has to be offloaded onto async thread.
+ * @post_fn_ptr: Address of the location to hold the post_fn corresponding to
+ * the posix_fn
+ * @post_fn: The post processing function corresponding to the posix_fn.
+ *
+ * This function is a helper to offload posix_operation on to t

[Qemu-devel] [PATCH] *-dis: Replace fprintf_ftype by fprintf_function (format checking)

2010-11-15 Thread Stefan Weil

This patch adds more printf format checking.

Additional modifications were needed for this code change:

* alpha-dis.c: The local definition of MAX conflicts with
  a previous definition from osdep.h, so add an #undef.

* dis-asm.h: Add include for fprintf_function (qemu-common.h).
  The standard (now redundant) includes are removed.

* mis-dis.c: The definition of ARRAY_SIZE is no longer needed
  and must be removed (conflict with previous definition from
  qemu-common.h).

* sh4-dis.c: Remove some unneeded forward declarations.

Cc: Blue Swirl 
Signed-off-by: Stefan Weil 
---
 alpha-dis.c  |3 +++
 arm-dis.c|   14 +++---
 dis-asm.h|   10 ++
 m68k-dis.c   |2 +-
 microblaze-dis.c |2 +-
 mips-dis.c   |2 --
 sh4-dis.c|   16 +---
 7 files changed, 19 insertions(+), 30 deletions(-)

diff --git a/alpha-dis.c b/alpha-dis.c
index 970da5b..8a2411e 100644
--- a/alpha-dis.c
+++ b/alpha-dis.c
@@ -22,6 +22,9 @@ along with this file; see the file COPYING.  If not, see
 #include 
 #include "dis-asm.h"
 
+/* MAX is redefined below, so remove any previous definition. */
+#undef MAX
+
 /* The opcode table is an array of struct alpha_opcode.  */
 
 struct alpha_opcode
diff --git a/arm-dis.c b/arm-dis.c
index fe7ac99..af21739 100644
--- a/arm-dis.c
+++ b/arm-dis.c
@@ -1587,7 +1587,7 @@ arm_decode_bitfield (const char *ptr, unsigned long insn,
 }
 
 static void
-arm_decode_shift (long given, fprintf_ftype func, void *stream,
+arm_decode_shift (long given, fprintf_function func, void *stream,
  int print_shift)
 {
   func (stream, "%s", arm_regnames[given & 0xf]);
@@ -1633,7 +1633,7 @@ print_insn_coprocessor (bfd_vma pc, struct 
disassemble_info *info, long given,
 {
   const struct opcode32 *insn;
   void *stream = info->stream;
-  fprintf_ftype func = info->fprintf_func;
+  fprintf_function func = info->fprintf_func;
   unsigned long mask;
   unsigned long value;
   int cond;
@@ -2127,7 +2127,7 @@ static void
 print_arm_address (bfd_vma pc, struct disassemble_info *info, long given)
 {
   void *stream = info->stream;
-  fprintf_ftype func = info->fprintf_func;
+  fprintf_function func = info->fprintf_func;
 
   if (((given & 0x000f) == 0x000f)
   && ((given & 0x0200) == 0))
@@ -,7 +,7 @@ print_insn_neon (struct disassemble_info *info, long 
given, bfd_boolean thumb)
 {
   const struct opcode32 *insn;
   void *stream = info->stream;
-  fprintf_ftype func = info->fprintf_func;
+  fprintf_function func = info->fprintf_func;
 
   if (thumb)
 {
@@ -2676,7 +2676,7 @@ print_insn_arm_internal (bfd_vma pc, struct 
disassemble_info *info, long given)
 {
   const struct opcode32 *insn;
   void *stream = info->stream;
-  fprintf_ftype func = info->fprintf_func;
+  fprintf_function func = info->fprintf_func;
 
   if (print_insn_coprocessor (pc, info, given, false))
 return;
@@ -3036,7 +3036,7 @@ print_insn_thumb16 (bfd_vma pc, struct disassemble_info 
*info, long given)
 {
   const struct opcode16 *insn;
   void *stream = info->stream;
-  fprintf_ftype func = info->fprintf_func;
+  fprintf_function func = info->fprintf_func;
 
   for (insn = thumb_opcodes; insn->assembler; insn++)
 if ((given & insn->mask) == insn->value)
@@ -3312,7 +3312,7 @@ print_insn_thumb32 (bfd_vma pc, struct disassemble_info 
*info, long given)
 {
   const struct opcode32 *insn;
   void *stream = info->stream;
-  fprintf_ftype func = info->fprintf_func;
+  fprintf_function func = info->fprintf_func;
 
   if (print_insn_coprocessor (pc, info, given, true))
 return;
diff --git a/dis-asm.h b/dis-asm.h
index 9b9657e..3fb4838 100644
--- a/dis-asm.h
+++ b/dis-asm.h
@@ -9,11 +9,7 @@
 #ifndef DIS_ASM_H
 #define DIS_ASM_H
 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include "qemu-common.h"
 
 typedef void *PTR;
 typedef uint64_t bfd_vma;
@@ -237,8 +233,6 @@ typedef struct symbol_cache_entry
 } udata;
 } asymbol;
 
-typedef int (*fprintf_ftype) (FILE*, const char*, ...);
-
 enum dis_insn_type {
   dis_noninsn, /* Not a valid instruction */
   dis_nonbranch,   /* Not a branch instruction */
@@ -261,7 +255,7 @@ enum dis_insn_type {
by hand, or using one of the initialization macros below.  */
 
 typedef struct disassemble_info {
-  fprintf_ftype fprintf_func;
+  fprintf_function fprintf_func;
   FILE *stream;
   PTR application_data;
 
diff --git a/m68k-dis.c b/m68k-dis.c
index d93943e..04f837a 100644
--- a/m68k-dis.c
+++ b/m68k-dis.c
@@ -1732,7 +1732,7 @@ match_insn_m68k (bfd_vma memaddr,
   const char *d;
 
   bfd_byte *buffer = priv->the_buffer;
-  fprintf_ftype save_printer = info->fprintf_func;
+  fprintf_function save_printer = info->fprintf_func;
   void (* save_print_address) (bfd_vma, struct disassemble_info *)
 = info->print_address_func;
 
diff --git a/microblaze-dis.c b/microblaze-dis.c
index 7694a43..16c312f 100644
--- a/microblaze-dis.c
+++ b/microblaze-dis.c
@@ -789,7 +789,7

Re: [Qemu-devel] [PATCH] add a command line option to specify the interface to send multicast packets on

2010-11-15 Thread Mike Ryan

Anyone care to comment?

On Wed, Nov 10, 2010 at 05:47:35PM -0800, Mike Ryan wrote:
> Add an option to specify the host interface to send multicast packets on
> when using a multicast socket for networking. The option takes the name
> of a host interface (e.g., eth0) and sets the IP_MULTICAST_IF socket
> option, which causes the packets to use that interface as an egress.
> 
> This is useful if the host machine has several interfaces with several
> virtual networks across disparate interfaces.
> ---
>  net.c   |4 
>  net/socket.c|   48 
>  qemu-common.h   |1 +
>  qemu-options.hx |   11 +--
>  qemu_socket.h   |1 +
>  5 files changed, 51 insertions(+), 14 deletions(-)
> 
> diff --git a/net.c b/net.c
> index c5e6063..ff9eb27 100644
> --- a/net.c
> +++ b/net.c
> @@ -1050,6 +1050,10 @@ static const struct {
>  .name = "mcast",
>  .type = QEMU_OPT_STRING,
>  .help = "UDP multicast address and port number",
> +}, {
> +.name = "interface",
> +.type = QEMU_OPT_STRING,
> +.help = "interface to send multicast packets on",
>  },
>  { /* end of list */ }
>  },
> diff --git a/net/socket.c b/net/socket.c
> index 1c4e153..ed7cd12 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -149,11 +149,13 @@ static void net_socket_send_dgram(void *opaque)
>  qemu_send_packet(&s->nc, s->buf, size);
>  }
>  
> -static int net_socket_mcast_create(struct sockaddr_in *mcastaddr)
> +static int net_socket_mcast_create(struct sockaddr_in *mcastaddr, const char 
> *interface)
>  {
>  struct ip_mreq imr;
>  int fd;
>  int val, ret;
> +struct in_addr maddr;
> +struct ifreq ifr;
>  if (!IN_MULTICAST(ntohl(mcastaddr->sin_addr.s_addr))) {
>   fprintf(stderr, "qemu: error: specified mcastaddr \"%s\" (0x%08x) does 
> not contain a multicast address\n",
>   inet_ntoa(mcastaddr->sin_addr),
> @@ -201,6 +203,23 @@ static int net_socket_mcast_create(struct sockaddr_in 
> *mcastaddr)
>   goto fail;
>  }
>  
> +/* If an interface name is given, only send packets out that interface */
> +if (interface != NULL) {
> +strncpy(ifr.ifr_name, interface, IFNAMSIZ);
> +ret = ioctl(fd, SIOCGIFADDR, &ifr);
> +if (ret < 0) {
> +fprintf(stderr, "qemu: error: specified interface \"%s\" does 
> not exist\n", interface);
> +goto fail;
> +}
> +
> +maddr = ((struct sockaddr_in *)&ifr.ifr_addr)->sin_addr;
> +ret = setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF, &maddr, 
> sizeof(maddr));
> +if (ret < 0) {
> +perror("setsockopt(IP_MULTICAST_IF)");
> +goto fail;
> +}
> +}
> +
>  socket_set_nonblock(fd);
>  return fd;
>  fail:
> @@ -248,7 +267,7 @@ static NetSocketState *net_socket_fd_init_dgram(VLANState 
> *vlan,
>   return NULL;
>   }
>   /* clone dgram socket */
> - newfd = net_socket_mcast_create(&saddr);
> + newfd = net_socket_mcast_create(&saddr, NULL);
>   if (newfd < 0) {
>   /* error already reported by net_socket_mcast_create() */
>   close(fd);
> @@ -468,7 +487,8 @@ static int net_socket_connect_init(VLANState *vlan,
>  static int net_socket_mcast_init(VLANState *vlan,
>   const char *model,
>   const char *name,
> - const char *host_str)
> + const char *host_str,
> + const char *interface)
>  {
>  NetSocketState *s;
>  int fd;
> @@ -478,7 +498,7 @@ static int net_socket_mcast_init(VLANState *vlan,
>  return -1;
>  
>  
> -fd = net_socket_mcast_create(&saddr);
> +fd = net_socket_mcast_create(&saddr, interface);
>  if (fd < 0)
>   return -1;
>  
> @@ -505,8 +525,9 @@ int net_init_socket(QemuOpts *opts,
>  
>  if (qemu_opt_get(opts, "listen") ||
>  qemu_opt_get(opts, "connect") ||
> -qemu_opt_get(opts, "mcast")) {
> -error_report("listen=, connect= and mcast= is invalid with fd=");
> +qemu_opt_get(opts, "mcast") ||
> +qemu_opt_get(opts, "interface")) {
> +error_report("listen=, connect=, mcast= and interface= is 
> invalid with fd=\n");
>  return -1;
>  }
>  
> @@ -524,8 +545,9 @@ int net_init_socket(QemuOpts *opts,
>  
>  if (qemu_opt_get(opts, "fd") ||
>  qemu_opt_get(opts, "connect") ||
> -qemu_opt_get(opts, "mcast")) {
> -error_report("fd=, connect= and mcast= is invalid with listen=");
> +qemu_opt_get(opts, "mcast") ||
> +qemu_opt_get(opts, "interface")) {
> +error_report("fd=, connect=, mcast= and interface= is invalid 
>

[Qemu-devel] [PATCH] target-sparc: Use fprintf_function (format checking)

2010-11-15 Thread Stefan Weil

This change was missing in commit
9a78eead0c74333a394c0f7bbfc4423ac746fcd5.

Cc: Blue Swirl 
Signed-off-by: Stefan Weil 
---
 target-sparc/cpu.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
index 7e0d17c..eb5d9c1 100644
--- a/target-sparc/cpu.h
+++ b/target-sparc/cpu.h
@@ -2,6 +2,7 @@
 #define CPU_SPARC_H
 
 #include "config.h"
+#include "qemu-common.h"
 
 #if !defined(TARGET_SPARC64)
 #define TARGET_LONG_BITS 32
@@ -442,8 +443,7 @@ typedef struct CPUSPARCState {
 /* helper.c */
 CPUSPARCState *cpu_sparc_init(const char *cpu_model);
 void cpu_sparc_set_id(CPUSPARCState *env, unsigned int cpu);
-void sparc_cpu_list (FILE *f, int (*cpu_fprintf)(FILE *f, const char *fmt,
- ...));
+void sparc_cpu_list(FILE *f, fprintf_function cpu_fprintf);
 void cpu_lock(void);
 void cpu_unlock(void);
 int cpu_sparc_handle_mmu_fault(CPUSPARCState *env1, target_ulong address, int 
rw,
-- 
1.7.2.3

Re: [Qemu-devel] Re: [Try2][PATCH] Initial implementation of a mpeg1 layer2 streaming audio driver.

2010-11-15 Thread Anthony Liguori


On 11/12/2010 10:54 AM, François Revol wrote:

Le 12 nov. 2010 à 15:32, Anthony Liguori a écrit :

   

I did try years ago, but at least the current wav driver really didn't like 
fifos back then. I recall trying for hours to get it pipe to ffmpeg or others 
without much luck.

Also, this poses several problems about the control of the external process 
(respawn on listener disconnection, close on exit...).
   

Doing encoding in QEMU is going to starve the guest pretty bad.  For an x86 
guest, that's going to result in issues with time drift, etc.

Making reencoding with an external process work a bit more nicely also has the 
advantage of making this work with other formats like Ogg which are a bit more 
Open Source friendly.
 

The current patch uses a separate thread, but since I clone this part from the 
esdaudio code I didn't check what it was doing. It seems it's not really 
queueing anything though except the single mix buffer, which kind of defeats 
the purpose of having a separate thread. This might explain why it lags so much 
here... I tried increasing the buffer size and lowering the threshold but it 
doesn't seem to help.

I agree it'd be better to use external programs when possible but as I said 
it's a bit harder to handle the errors and such, and I wanted to have something 
working.
Also, it requires more work to set it up for users, they must install the 
externals, figure out the command line options...
Possibly we can provide default templates for known programs, either text files 
with % escaping for args, or just a shell script passing env vars maybe...

Besides, external or not, IIRC a pipe is by default 4kB max, which isn't much 
better for decoupling the processing on its own, if the encoder is too slow it 
will still starve the audio thread, and the rest. Also it all requires more 
context switching and IPC, which increase the total processing time.

So I think it might be interesting to have both.

I'll see if I can buffer a bit more in the twolame code and if it helps, then 
I'll try to merge with the failed attempts I have around at using external 
progs.
   


Okay, but my thinking was that we'd do something like:

audio_capture "capture_command -opt=val -opt2=val" ...

Which would make it very easy to tie into lame, oggenc, etc.

Having simple alias commands like mp3_capture and ogg_capture that 
invoke common tools with reasonable options also would be a bonus.


Regards,

Anthony Liguori


François.

Re: [Qemu-devel] [PATCH] add a command line option to specify the interface to send multicast packets on

2010-11-15 Thread Anthony Liguori


On 11/15/2010 12:54 PM, Mike Ryan wrote:

Anyone care to comment?
   


I must admit, I don't understand the use-case well enough to really give 
an appropriate amount of review as to whether this is the best solution 
to the problem.


Michael, do you have any thoughts?

Regards,

Anthony Liguori


On Wed, Nov 10, 2010 at 05:47:35PM -0800, Mike Ryan wrote:
   

Add an option to specify the host interface to send multicast packets on
when using a multicast socket for networking. The option takes the name
of a host interface (e.g., eth0) and sets the IP_MULTICAST_IF socket
option, which causes the packets to use that interface as an egress.

This is useful if the host machine has several interfaces with several
virtual networks across disparate interfaces.
---
  net.c   |4 
  net/socket.c|   48 
  qemu-common.h   |1 +
  qemu-options.hx |   11 +--
  qemu_socket.h   |1 +
  5 files changed, 51 insertions(+), 14 deletions(-)

diff --git a/net.c b/net.c
index c5e6063..ff9eb27 100644
--- a/net.c
+++ b/net.c
@@ -1050,6 +1050,10 @@ static const struct {
  .name = "mcast",
  .type = QEMU_OPT_STRING,
  .help = "UDP multicast address and port number",
+}, {
+.name = "interface",
+.type = QEMU_OPT_STRING,
+.help = "interface to send multicast packets on",
  },
  { /* end of list */ }
  },
diff --git a/net/socket.c b/net/socket.c
index 1c4e153..ed7cd12 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -149,11 +149,13 @@ static void net_socket_send_dgram(void *opaque)
  qemu_send_packet(&s->nc, s->buf, size);
  }

-static int net_socket_mcast_create(struct sockaddr_in *mcastaddr)
+static int net_socket_mcast_create(struct sockaddr_in *mcastaddr, const char 
*interface)
  {
  struct ip_mreq imr;
  int fd;
  int val, ret;
+struct in_addr maddr;
+struct ifreq ifr;
  if (!IN_MULTICAST(ntohl(mcastaddr->sin_addr.s_addr))) {
fprintf(stderr, "qemu: error: specified mcastaddr \"%s\" (0x%08x) does not 
contain a multicast address\n",
inet_ntoa(mcastaddr->sin_addr),
@@ -201,6 +203,23 @@ static int net_socket_mcast_create(struct sockaddr_in 
*mcastaddr)
goto fail;
  }

+/* If an interface name is given, only send packets out that interface */
+if (interface != NULL) {
+strncpy(ifr.ifr_name, interface, IFNAMSIZ);
+ret = ioctl(fd, SIOCGIFADDR,&ifr);
+if (ret<  0) {
+fprintf(stderr, "qemu: error: specified interface \"%s\" does not 
exist\n", interface);
+goto fail;
+}
+
+maddr = ((struct sockaddr_in *)&ifr.ifr_addr)->sin_addr;
+ret = setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF,&maddr, 
sizeof(maddr));
+if (ret<  0) {
+perror("setsockopt(IP_MULTICAST_IF)");
+goto fail;
+}
+}
+
  socket_set_nonblock(fd);
  return fd;
  fail:
@@ -248,7 +267,7 @@ static NetSocketState *net_socket_fd_init_dgram(VLANState 
*vlan,
return NULL;
}
/* clone dgram socket */
-   newfd = net_socket_mcast_create(&saddr);
+   newfd = net_socket_mcast_create(&saddr, NULL);
if (newfd<  0) {
/* error already reported by net_socket_mcast_create() */
close(fd);
@@ -468,7 +487,8 @@ static int net_socket_connect_init(VLANState *vlan,
  static int net_socket_mcast_init(VLANState *vlan,
   const char *model,
   const char *name,
- const char *host_str)
+ const char *host_str,
+ const char *interface)
  {
  NetSocketState *s;
  int fd;
@@ -478,7 +498,7 @@ static int net_socket_mcast_init(VLANState *vlan,
  return -1;


-fd = net_socket_mcast_create(&saddr);
+fd = net_socket_mcast_create(&saddr, interface);
  if (fd<  0)
return -1;

@@ -505,8 +525,9 @@ int net_init_socket(QemuOpts *opts,

  if (qemu_opt_get(opts, "listen") ||
  qemu_opt_get(opts, "connect") ||
-qemu_opt_get(opts, "mcast")) {
-error_report("listen=, connect= and mcast= is invalid with fd=");
+qemu_opt_get(opts, "mcast") ||
+qemu_opt_get(opts, "interface")) {
+error_report("listen=, connect=, mcast= and interface= is invalid with 
fd=\n");
  return -1;
  }

@@ -524,8 +545,9 @@ int net_init_socket(QemuOpts *opts,

  if (qemu_opt_get(opts, "fd") ||
  qemu_opt_get(opts, "connect") ||
-qemu_opt_get(opts, "mcast")) {
-error_report("fd=, connect= and mcast= is invalid with listen=");
+qemu_opt_get(opts, "mcast") ||
+qemu_opt_get(opts, "interface")) {
+

Re: [Qemu-devel] [PATCH] add a command line option to specify the interface to send multicast packets on

2010-11-15 Thread Mike Ryan

I'll clarify/elaborate a bit:

When using a multicast socket, the OS chooses a default physical
interface to send packets. The patch I've supplied allows the user to
select the interface.

Suppose you have a setup like so:

BoxA --- BoxB --- BoxC

You wish to run virtual machines on BoxB and BoxC and network them using
a multicast UDP socket.

BoxB has two network interfaces, and the default multicast interface may
be the link between BoxA and BoxB. In this situation, BoxC will not
receive any multicast packets from BoxB and networking between the boxes
is therefore impossible.

The utility of a multicast socket is obviously limited in my simplified
example. Generalize BoxC to a LAN of physical machines all running
virtual machines you wish to network and the use case should become a
bit clearer.

On Mon, Nov 15, 2010 at 01:36:28PM -0600, Anthony Liguori wrote:
> On 11/15/2010 12:54 PM, Mike Ryan wrote:
> >Anyone care to comment?
> 
> I must admit, I don't understand the use-case well enough to really
> give an appropriate amount of review as to whether this is the best
> solution to the problem.
> 
> Michael, do you have any thoughts?
> 
> Regards,
> 
> Anthony Liguori
> 
> >On Wed, Nov 10, 2010 at 05:47:35PM -0800, Mike Ryan wrote:
> >>Add an option to specify the host interface to send multicast packets on
> >>when using a multicast socket for networking. The option takes the name
> >>of a host interface (e.g., eth0) and sets the IP_MULTICAST_IF socket
> >>option, which causes the packets to use that interface as an egress.
> >>
> >>This is useful if the host machine has several interfaces with several
> >>virtual networks across disparate interfaces.
> >>---
> >>  net.c   |4 
> >>  net/socket.c|   48 
> >>  qemu-common.h   |1 +
> >>  qemu-options.hx |   11 +--
> >>  qemu_socket.h   |1 +
> >>  5 files changed, 51 insertions(+), 14 deletions(-)
> >>
> >>diff --git a/net.c b/net.c
> >>index c5e6063..ff9eb27 100644
> >>--- a/net.c
> >>+++ b/net.c
> >>@@ -1050,6 +1050,10 @@ static const struct {
> >>  .name = "mcast",
> >>  .type = QEMU_OPT_STRING,
> >>  .help = "UDP multicast address and port number",
> >>+}, {
> >>+.name = "interface",
> >>+.type = QEMU_OPT_STRING,
> >>+.help = "interface to send multicast packets on",
> >>  },
> >>  { /* end of list */ }
> >>  },
> >>diff --git a/net/socket.c b/net/socket.c
> >>index 1c4e153..ed7cd12 100644
> >>--- a/net/socket.c
> >>+++ b/net/socket.c
> >>@@ -149,11 +149,13 @@ static void net_socket_send_dgram(void *opaque)
> >>  qemu_send_packet(&s->nc, s->buf, size);
> >>  }
> >>
> >>-static int net_socket_mcast_create(struct sockaddr_in *mcastaddr)
> >>+static int net_socket_mcast_create(struct sockaddr_in *mcastaddr, const 
> >>char *interface)
> >>  {
> >>  struct ip_mreq imr;
> >>  int fd;
> >>  int val, ret;
> >>+struct in_addr maddr;
> >>+struct ifreq ifr;
> >>  if (!IN_MULTICAST(ntohl(mcastaddr->sin_addr.s_addr))) {
> >>fprintf(stderr, "qemu: error: specified mcastaddr \"%s\" (0x%08x) does 
> >> not contain a multicast address\n",
> >>inet_ntoa(mcastaddr->sin_addr),
> >>@@ -201,6 +203,23 @@ static int net_socket_mcast_create(struct sockaddr_in 
> >>*mcastaddr)
> >>goto fail;
> >>  }
> >>
> >>+/* If an interface name is given, only send packets out that interface 
> >>*/
> >>+if (interface != NULL) {
> >>+strncpy(ifr.ifr_name, interface, IFNAMSIZ);
> >>+ret = ioctl(fd, SIOCGIFADDR,&ifr);
> >>+if (ret<  0) {
> >>+fprintf(stderr, "qemu: error: specified interface \"%s\" does 
> >>not exist\n", interface);
> >>+goto fail;
> >>+}
> >>+
> >>+maddr = ((struct sockaddr_in *)&ifr.ifr_addr)->sin_addr;
> >>+ret = setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF,&maddr, 
> >>sizeof(maddr));
> >>+if (ret<  0) {
> >>+perror("setsockopt(IP_MULTICAST_IF)");
> >>+goto fail;
> >>+}
> >>+}
> >>+
> >>  socket_set_nonblock(fd);
> >>  return fd;
> >>  fail:
> >>@@ -248,7 +267,7 @@ static NetSocketState 
> >>*net_socket_fd_init_dgram(VLANState *vlan,
> >>return NULL;
> >>}
> >>/* clone dgram socket */
> >>-   newfd = net_socket_mcast_create(&saddr);
> >>+   newfd = net_socket_mcast_create(&saddr, NULL);
> >>if (newfd<  0) {
> >>/* error already reported by net_socket_mcast_create() */
> >>close(fd);
> >>@@ -468,7 +487,8 @@ static int net_socket_connect_init(VLANState *vlan,
> >>  static int net_socket_mcast_init(VLANState *vlan,
> >>   const char *model,
> >>   const char *name,
> >>- const char *host_str)
> >>+

[Qemu-devel] How to detect a stopped guest os?

2010-11-15 Thread Wilhelm

Hello,

I know, this is not the right place to ask, but I wasn't abled to find a
users mailing list.

The question: is there any qemu (monitor) command to detect if the guest
os has stopped / poweroff?

-- 
Wilhelm

[Qemu-devel] [PATCH] audio: Use GCC_FMT_ATTR (format checking)

2010-11-15 Thread Stefan Weil

Cc: Blue Swirl 
Signed-off-by: Stefan Weil 
---
 audio/audio_pt_int.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/audio/audio_pt_int.c b/audio/audio_pt_int.c
index f15cc70..908c569 100644
--- a/audio/audio_pt_int.c
+++ b/audio/audio_pt_int.c
@@ -8,7 +8,8 @@
 
 #include 
 
-static void logerr (struct audio_pt *pt, int err, const char *fmt, ...)
+static void GCC_FMT_ATTR(3, 4) logerr (struct audio_pt *pt, int err,
+   const char *fmt, ...)
 {
 va_list ap;
 
-- 
1.7.2.3

Re: [Qemu-devel] [PATCH] make trace options use autoconfy names

2010-11-15 Thread Andreas Färber


Am 15.11.2010 um 16:48 schrieb Paolo Bonzini:


On 11/15/2010 03:17 PM, Stefan Hajnoczi wrote:
On Sun, Nov 14, 2010 at 1:52 PM, Paolo  
Bonzini  wrote:

On 11/14/2010 02:38 PM, Andreas Färber wrote:


- --trace-file=*) trace_file="$optarg"
+ --enable-trace-file=*) trace_file="$optarg"
;;


but this should be --with-trace-file=... please. It is not being
enabled, just set to a different value.


--with-* should be reserved for library paths, but I can change it  
if people

prefer it that way.


I did think of the argument as a file name, sort of a relative path...

Actually I think we have something similar to overriding --prefix  
here

:).  It's a path that you can set at ./configure time.


Yeah, that's true.  However...


So is it not okay to use --trace-file=?


... Autoconf would not allow unknown options not starting with -- 
enable- or --with-.


The rationale to avoid incompatible options in QEMU is this: suppose  
you have a project using Autoconf (e.g. GCC) and you want to drop  
QEMU as a subdirectory in there, e.g. to run the GCC testsuite under  
QEMU usermode emulation (GCC can already do this for other  
simulators).  To pass options to QEMU's configure, you can include  
them in GCC's commandline.  The script will simply pass the option  
down to QEMU and it will be processed there.  However, if you pass -- 
trace-file to GCC's configure script, it will complain and stop.


Probably I would use something like --enable-trace- 
backend=simple:trace- if I was adding something similar to an  
autoconfiscated project.  But unless it provides some additional  
benefit (as is the case with cross-compilation support) I want to  
keep the syntactic changes in my patches to the minimum.



But I know nothing of autoconf and --enable-* or --with-* sort of
make sense too.


Whatever, I have to repost the other series anyway, so I'll change  
to --with-.


Thinking more about it, what about --enable-simple-trace=...,  
callapsing the two options into one?


Another autoconf way to pass this stuff would we ./configure ...  
TRACE_FILE=...


I wouldn't mind either way though, just noticed that the --enable- 
trace-file suggestion by autoconf convention would allow --disable- 
trace-file. Similar issue for --without-trace-file though.


Andreas

Re: [Qemu-devel] [PATCH] *-dis: Replace fprintf_ftype by fprintf_function (format checking)

2010-11-15 Thread Andreas Färber


Am 15.11.2010 um 19:39 schrieb Stefan Weil:


This patch adds more printf format checking.

Additional modifications were needed for this code change:

* alpha-dis.c: The local definition of MAX conflicts with
 a previous definition from osdep.h, so add an #undef.

* dis-asm.h: Add include for fprintf_function (qemu-common.h).
 The standard (now redundant) includes are removed.

* mis-dis.c:


:-)

[...]

alpha-dis.c  |3 +++
arm-dis.c|   14 +++---
dis-asm.h|   10 ++
m68k-dis.c   |2 +-
microblaze-dis.c |2 +-
mips-dis.c   |2 --
sh4-dis.c|   16 +---
7 files changed, 19 insertions(+), 30 deletions(-)

[Qemu-devel] [PATCH] darwin-user: Use GCC_FMT_ATTR (format checking)

2010-11-15 Thread Stefan Weil

The redundant forward declaration of qerror in machload.c
is removed because it should be taken from qemu.h.

Please note that this patch is untested because
I have no matching environment to compile it.

Cc: Blue Swirl 
Signed-off-by: Stefan Weil 
---
 darwin-user/machload.c |2 +-
 darwin-user/qemu.h |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/darwin-user/machload.c b/darwin-user/machload.c
index 4bb5c72..3bc3b65 100644
--- a/darwin-user/machload.c
+++ b/darwin-user/machload.c
@@ -82,7 +82,7 @@ void *macho_text_sect = 0;
 int   macho_offset = 0;
 
 int load_object(const char *filename, struct target_pt_regs * regs, void ** 
mh);
-void qerror(const char *format, ...);
+
 #ifdef TARGET_I386
 typedef struct mach_i386_thread_state {
 unsigned inteax;
diff --git a/darwin-user/qemu.h b/darwin-user/qemu.h
index 0c5081b..b6d3e6c 100644
--- a/darwin-user/qemu.h
+++ b/darwin-user/qemu.h
@@ -100,7 +100,7 @@ int do_sigaction(int sig, const struct sigaction *act,
 int do_sigaltstack(const struct sigaltstack *ss, struct sigaltstack *oss);
 
 void gemu_log(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
-void qerror(const char *fmt, ...);
+void qerror(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
 
 void write_dt(void *ptr, unsigned long addr, unsigned long limit, int flags);
 
-- 
1.7.2.3

[Qemu-devel] Re: [PATCH RESENT] msix: allow byte and word reading from mmio

2010-11-15 Thread Michael S. Tsirkin

On Mon, Nov 15, 2010 at 05:40:30PM +0100, Bernhard Kohl wrote:
> Am 15.11.2010 11:42, schrieb ext Michael S. Tsirkin:
> >On Thu, Aug 19, 2010 at 02:56:51PM +0200, Bernhard Kohl wrote:
> >>It's legal that the guest reads a single byte or word from mmio.
> >Interesting. The spec seems to say this:
> >
> > For all accesses to MSI-X Table and MSI-X PBA fields, software must use
> > aligned full DWORD or aligned full QWORD transactions; otherwise, the
> > result is undefined.
> 
> I will remove the first statement from the commit message and add
> something like the comment you proposed below.
> 
> >>I have an OS which reads single bytes and it works fine on real
> >>hardware. Maybe this happens due to casting.
> >What do you mean by casting?
> 
> I did not yet locate the line of code where this happens in the guest.
> As all other accesses are dword, I assume that there is some masking,
> shifting or casting to a byte variable in the source code, and the
> compiler generates a byte access from this.

And I presume it's too late to fix the guest in question?

> >>Signed-off-by: Bernhard Kohl
> >I guess we can merge this, but we need a comment I think since this
> >seems to contradict the spec, and people were sending patches relying on
> >this.
> >
> >Does something like the following describe the situation correctly?
> >
> >/* Note: PCI spec requires that all MSI-X table accesses
> >are either DWORD or QWORD, size aligned.  Some guests seem to violate
> >this rule for read accesses, performing single byte reads.
> >Since it's easy to support this, let's do so.
> >Also support 16 bit size aligned reads, just in case.
> >  */
> 
> Yes, that's is exactly the situation with my guest.
> I will add this comment.
> 
> >Does you guest also do 16 bit reads?  Are accesses at least aligned?
> 
> I will check this with my guest and the readw function disabled
> in the patch. This will take some time.
> 
> >>---
> >>  hw/msix.c |   20 
> >>  1 files changed, 16 insertions(+), 4 deletions(-)
> >>
> >>diff --git a/hw/msix.c b/hw/msix.c
> >>index d99403a..7dac7f7 100644
> >>--- a/hw/msix.c
> >>+++ b/hw/msix.c
> >>@@ -100,10 +100,22 @@ static uint32_t msix_mmio_readl(void *opaque, 
> >>target_phys_addr_t addr)
> >>  return pci_get_long(page + offset);
> >>  }
> >>
> >>-static uint32_t msix_mmio_read_unallowed(void *opaque, target_phys_addr_t 
> >>addr)
> >>+static uint32_t msix_mmio_readw(void *opaque, target_phys_addr_t addr)
> >>  {
> >>-fprintf(stderr, "MSI-X: only dword read is allowed!\n");
> >>-return 0;
> >>+PCIDevice *dev = opaque;
> >>+unsigned int offset = addr&  (MSIX_PAGE_SIZE - 1)&  ~0x1;
> >>+void *page = dev->msix_table_page;
> >>+
> >>+return pci_get_word(page + offset);
> >>+}
> >>+
> >>+static uint32_t msix_mmio_readb(void *opaque, target_phys_addr_t addr)
> >>+{
> >>+PCIDevice *dev = opaque;
> >>+unsigned int offset = addr&  (MSIX_PAGE_SIZE - 1);
> >>+void *page = dev->msix_table_page;
> >>+
> >>+return pci_get_byte(page + offset);
> >>  }
> >>
> >>  static uint8_t msix_pending_mask(int vector)
> >>@@ -198,7 +210,7 @@ static CPUWriteMemoryFunc * const msix_mmio_write[] = {
> >>  };
> >>
> >>  static CPUReadMemoryFunc * const msix_mmio_read[] = {
> >>-msix_mmio_read_unallowed, msix_mmio_read_unallowed, msix_mmio_readl
> >>+msix_mmio_readb, msix_mmio_readw, msix_mmio_readl
> >>  };
> >>
> >>  /* Should be called from device's map method. */
> >>-- 
> >>1.7.2.1

[Qemu-devel] [PATCH 2/4] virtio: Convert fprintf() to error_report()

2010-11-15 Thread Stefan Hajnoczi

Signed-off-by: Stefan Hajnoczi 
---
 hw/virtio.c |   35 ++-
 1 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/hw/virtio.c b/hw/virtio.c
index a2a657e..849a60f 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -14,6 +14,7 @@
 #include 
 
 #include "trace.h"
+#include "qemu-error.h"
 #include "virtio.h"
 #include "sysemu.h"
 
@@ -253,8 +254,8 @@ static int virtqueue_num_heads(VirtQueue *vq, unsigned int 
idx)
 
 /* Check it isn't doing very strange things with descriptor numbers. */
 if (num_heads > vq->vring.num) {
-fprintf(stderr, "Guest moved used index from %u to %u",
-idx, vring_avail_idx(vq));
+error_report("Guest moved used index from %u to %u",
+ idx, vring_avail_idx(vq));
 exit(1);
 }
 
@@ -271,7 +272,7 @@ static unsigned int virtqueue_get_head(VirtQueue *vq, 
unsigned int idx)
 
 /* If their number is silly, that's a fatal mistake. */
 if (head >= vq->vring.num) {
-fprintf(stderr, "Guest says index %u is available", head);
+error_report("Guest says index %u is available", head);
 exit(1);
 }
 
@@ -293,7 +294,7 @@ static unsigned virtqueue_next_desc(target_phys_addr_t 
desc_pa,
 wmb();
 
 if (next >= max) {
-fprintf(stderr, "Desc next is %u", next);
+error_report("Desc next is %u", next);
 exit(1);
 }
 
@@ -320,13 +321,13 @@ int virtqueue_avail_bytes(VirtQueue *vq, int in_bytes, 
int out_bytes)
 
 if (vring_desc_flags(desc_pa, i) & VRING_DESC_F_INDIRECT) {
 if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) {
-fprintf(stderr, "Invalid size for indirect buffer table\n");
+error_report("Invalid size for indirect buffer table");
 exit(1);
 }
 
 /* If we've got too many, that implies a descriptor loop. */
 if (num_bufs >= max) {
-fprintf(stderr, "Looped descriptor");
+error_report("Looped descriptor");
 exit(1);
 }
 
@@ -340,7 +341,7 @@ int virtqueue_avail_bytes(VirtQueue *vq, int in_bytes, int 
out_bytes)
 do {
 /* If we've got too many, that implies a descriptor loop. */
 if (++num_bufs > max) {
-fprintf(stderr, "Looped descriptor");
+error_report("Looped descriptor");
 exit(1);
 }
 
@@ -374,7 +375,7 @@ void virtqueue_map_sg(struct iovec *sg, target_phys_addr_t 
*addr,
 len = sg[i].iov_len;
 sg[i].iov_base = cpu_physical_memory_map(addr[i], &len, is_write);
 if (sg[i].iov_base == NULL || len != sg[i].iov_len) {
-fprintf(stderr, "virtio: trying to map MMIO memory\n");
+error_report("virtio: trying to map MMIO memory");
 exit(1);
 }
 }
@@ -397,7 +398,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
 if (vring_desc_flags(desc_pa, i) & VRING_DESC_F_INDIRECT) {
 if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) {
-fprintf(stderr, "Invalid size for indirect buffer table\n");
+error_report("Invalid size for indirect buffer table");
 exit(1);
 }
 
@@ -423,7 +424,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 
 /* If we've got too many, that implies a descriptor loop. */
 if ((elem->in_num + elem->out_num) > max) {
-fprintf(stderr, "Looped descriptor");
+error_report("Looped descriptor");
 exit(1);
 }
 } while ((i = virtqueue_next_desc(desc_pa, i, max)) != max);
@@ -694,8 +695,8 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
 qemu_get_be16s(f, &vdev->queue_sel);
 qemu_get_be32s(f, &features);
 if (features & ~supported_features) {
-fprintf(stderr, "Features 0x%x unsupported. Allowed features: 0x%x\n",
-features, supported_features);
+error_report("Features 0x%x unsupported. Allowed features: 0x%x",
+ features, supported_features);
 return -1;
 }
 if (vdev->set_features)
@@ -717,11 +718,11 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
num_heads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx;
/* Check it isn't doing very strange things with descriptor numbers. */
if (num_heads > vdev->vq[i].vring.num) {
-   fprintf(stderr, "VQ %d size 0x%x Guest index 0x%x "
-"inconsistent with Host index 0x%x: delta 0x%x\n",
-   i, vdev->vq[i].vring.num,
-vring_avail_idx(&vdev->vq[i]),
-vdev->vq[i].last_avail_idx, num_heads);
+   error_report("VQ %d size 0x%x Guest index 0x%x "
+"inconsistent with Host index 0x%x: delta 0x%x",
+i, vdev->vq[i].vring.num,
+

[Qemu-devel] [PATCH 0/4] virtio: Convert fprintf() to error_report()

2010-11-15 Thread Stefan Hajnoczi

The virtio hardware emulation code uses fprintf(stderr, ...) error messages.
Improve things slightly by moving to error_report() so error messages will be
printed to the monitor, if present.

We want to handle device error states properly instead of bailing out with
exit(1) but this series does not attempt to fix that yet.

Leave virtio-9p for now where there are many cases of fprintf(stderr, ...) and
development is still very active.

[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device

2010-11-15 Thread Anthony Liguori


On 11/15/2010 02:26 PM, Michael S. Tsirkin wrote:

Are there any system ones?
   


There's one in qemu-char.c.  Not sure it's easy to just say for the 
future that there can't be BHs that get delivered during vmstop.



Can we just stop processing them?
   


We ran into a lot of problems trying to do this with emulating 
synchronous I/O in the block layer.  If we can avoid the 
stop-dispatching handlers approach, I think we'll have far fewer problems.



  I think that if we put
something in the network layer that just queued packets if the vm is
stopped, it would be a more robust solution to the problem.
 

Will only work for -net. The problem is for anything
that can trigger activity when vm is stopped.
   


Activity or external I/O?

My assertion is that if we can stop things from hitting the disk, 
escaping the network, or leaving a character device, we're solid.



For memory, it is much worse:  any memory changes can either get
discarded or not.  This breaks consistency guarantees that guest relies
upon. Imagine virtio index getting updated but content not being
updated.  See?
   

If you suppress any I/O then the memory changes don't matter because
the same changes will happen on the destination too.
 

They matter, and same changes won't happen.
Example:

virtio used index is in page 1, it can point at data in page 2.
device writes into data, *then* into index. Order matters,
but won't be preserved: migration assumes memory does not
change after vmstop, and so it might send old values for
data but new values for index. Result will be invalid
data coming into guest.
   


No, this scenario is invalid.

Migration copies data while the guest is live and whenever a change 
happens, updates the dirty bitmap (that's why cpu_physical_memory_rw 
matters).


Once the VM is stopped, we never return to the main loop before 
completing migration so nothing else gets an opportunity to run.  This 
means when we send a page, we guarantee it won't be made dirty against 
until after the migration completes.


Once the migration completes, the contents of memory may change but 
that's okay.  As long as you stop packets from being sent, if you need 
to resume the guest, it'll pick up where it left off.



On the destination guest will pick up the index and
get bad (stale) data.
   


If you're seeing this happen with vhost, you aren't updating dirty bits 
correctly.  AFAICT, this cannot happen with userspace device models.


   

I think this basic problem is the same as Kemari.  We can either
attempt to totally freeze a guest which means stopping all callbacks
that are device related or we can prevent I/O from happening which
should introduce enough determinism to fix the problem in practice.

Regards,

Anthony Liguori
 


See above. IMO it's a different problem. Unlike Kemari,
I don't really see any drawbacks to stop all callbacks.
Do you?
   


I think it's going to be extremely difficult to get right and keep 
right.  A bunch of code needs to be converted to make us safe and then 
becomes brittle as it's very simple to change things in such a way that 
we're broken again.


Regards,

Anthony Liguori

Re: [Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device

2010-11-15 Thread Anthony Liguori


On 11/15/2010 02:53 PM, Stefan Hajnoczi wrote:


vhost has a solution for this: register a VMChangeStateHandler
function that stops ioeventfd while vm_stop(0) is in effect.  Then
poll to see if there is work pending.

I will add equivalent functionality to virtio-ioeventfd.
   


I still think that stopping this at the net/block layer is going to be a 
lot more robust in the long term.  All net/block traffic flows through a 
single set of interfaces whereas getting the devices to completely stop 
themselves requires touching every device and making sure that noone 
adds back vmstop-unfriendly behavior down the road.


Regards,

Anthony Liguori


Stefan

Re: [Qemu-devel] [PATCH 1/3] Make paio subsystem use threadlets infrastructure

2010-11-15 Thread Anthony Liguori


On 11/15/2010 11:53 AM, Arun R Bharadwaj wrote:

From: Gautham R Shenoy

This patch creates a generic asynchronous-task-offloading infrastructure named
threadlets.

The patch creates a global queue on-to which subsystems can queue their tasks to
be executed asynchronously.

The patch also provides API's that allow a subsystem to create a private queue
with an associated pool of threads.

The patch has been tested with fstress.
   


This patch is very hard to review.  Can you generize this via a serial 
of incremental changes instead of a big rewrite?


This code has been historically fickle so doing the extra work to 
simplify review is worth it.


Regards,

Anthony Liguori


Signed-off-by: Arun R Bharadwaj
Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Gautham R Shenoy
Signed-off-by: Sripathi Kodi
---
  Makefile.objs  |2
  configure  |2
  posix-aio-compat.c |  354 +++-
  3 files changed, 211 insertions(+), 147 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index cd5a24b..3b7ec27 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -9,6 +9,7 @@ qobject-obj-y += qerror.o

  block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o
  block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o
+block-obj-$(CONFIG_POSIX) += qemu-thread.o
  block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
  block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o

@@ -124,7 +125,6 @@ endif
  common-obj-y += $(addprefix ui/, $(ui-obj-y))

  common-obj-y += iov.o acl.o
-common-obj-$(CONFIG_THREAD) += qemu-thread.o
  common-obj-y += notify.o event_notifier.o
  common-obj-y += qemu-timer.o

diff --git a/configure b/configure
index a079a49..addf733 100755
--- a/configure
+++ b/configure
@@ -2456,7 +2456,6 @@ if test "$vnc_png" != "no" ; then
  fi
  if test "$vnc_thread" != "no" ; then
echo "CONFIG_VNC_THREAD=y">>  $config_host_mak
-  echo "CONFIG_THREAD=y">>  $config_host_mak
  fi
  if test "$fnmatch" = "yes" ; then
echo "CONFIG_FNMATCH=y">>  $config_host_mak
@@ -2534,7 +2533,6 @@ if test "$xen" = "yes" ; then
  fi
  if test "$io_thread" = "yes" ; then
echo "CONFIG_IOTHREAD=y">>  $config_host_mak
-  echo "CONFIG_THREAD=y">>  $config_host_mak
  fi
  if test "$linux_aio" = "yes" ; then
echo "CONFIG_LINUX_AIO=y">>  $config_host_mak
diff --git a/posix-aio-compat.c b/posix-aio-compat.c
index 7b862b5..e1812fc 100644
--- a/posix-aio-compat.c
+++ b/posix-aio-compat.c
@@ -29,7 +29,33 @@
  #include "block_int.h"

  #include "block/raw-posix-aio.h"
+#include "qemu-thread.h"

+#define MAX_GLOBAL_THREADS  64
+#define MIN_GLOBAL_THREADS   8
+
+static QemuMutex aiocb_mutex;
+static QemuCond aiocb_completion;
+
+typedef struct ThreadletQueue
+{
+QemuMutex lock;
+QemuCond cond;
+int max_threads;
+int min_threads;
+int cur_threads;
+int idle_threads;
+QTAILQ_HEAD(, ThreadletWork) request_list;
+} ThreadletQueue;
+
+typedef struct ThreadletWork
+{
+QTAILQ_ENTRY(ThreadletWork) node;
+void (*func)(struct ThreadletWork *work);
+} ThreadletWork;
+
+static ThreadletQueue globalqueue;
+static int globalqueue_init;

  struct qemu_paiocb {
  BlockDriverAIOCB common;
@@ -44,13 +70,12 @@ struct qemu_paiocb {
  int ev_signo;
  off_t aio_offset;

-QTAILQ_ENTRY(qemu_paiocb) node;
  int aio_type;
  ssize_t ret;
-int active;
  struct qemu_paiocb *next;

  int async_context_id;
+ThreadletWork work;
  };

  typedef struct PosixAioState {
@@ -58,64 +83,169 @@ typedef struct PosixAioState {
  struct qemu_paiocb *first_aio;
  } PosixAioState;

+static void *threadlet_worker(void *data)
+{
+ThreadletQueue *queue = data;

-static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
-static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
-static pthread_t thread_id;
-static pthread_attr_t attr;
-static int max_threads = 64;
-static int cur_threads = 0;
-static int idle_threads = 0;
-static QTAILQ_HEAD(, qemu_paiocb) request_list;
+qemu_mutex_lock(&queue->lock);
+while (1) {
+ThreadletWork *work;
+int ret = 0;
+
+while (QTAILQ_EMPTY(&queue->request_list)&&
+   (ret != ETIMEDOUT)) {
+/* wait for cond to be signalled or broadcast for 1000s */
+ret = qemu_cond_timedwait((&queue->cond),
+&(queue->lock), 10*10);
+}

-#ifdef CONFIG_PREADV
-static int preadv_present = 1;
-#else
-static int preadv_present = 0;
-#endif
+assert(queue->idle_threads != 0);
+if (QTAILQ_EMPTY(&queue->request_list)) {
+if (queue->cur_threads>  queue->min_threads) {
+/* We retain the minimum number of threads */
+break;
+}
+} else {
+work = QTAILQ_FIRST(&queue->request_list);
+QTAILQ_REMOVE(&queue->request_list, work, node);

-static void die2(int err, const char *what)
-{
-fprintf(stderr, "%s failed: %s\n", what, strerror(err));
-abort();
+

Re: [Qemu-devel] [PATCH] add a command line option to specify the interface to send multicast packets on

2010-11-15 Thread Anthony Liguori


On 11/15/2010 01:52 PM, Mike Ryan wrote:

I'll clarify/elaborate a bit:

When using a multicast socket, the OS chooses a default physical
interface to send packets. The patch I've supplied allows the user to
select the interface.

Suppose you have a setup like so:

BoxA --- BoxB --- BoxC

You wish to run virtual machines on BoxB and BoxC and network them using
a multicast UDP socket.

BoxB has two network interfaces, and the default multicast interface may
be the link between BoxA and BoxB. In this situation, BoxC will not
receive any multicast packets from BoxB and networking between the boxes
is therefore impossible.

The utility of a multicast socket is obviously limited in my simplified
example. Generalize BoxC to a LAN of physical machines all running
virtual machines you wish to network and the use case should become a
bit clearer.
   


Thanks.  Second question is how portable is SIOCGIFADDR?  I suspect 
that's very Linux-centric..


Regards,

Anthony Liguori


On Mon, Nov 15, 2010 at 01:36:28PM -0600, Anthony Liguori wrote:
   

On 11/15/2010 12:54 PM, Mike Ryan wrote:
 

Anyone care to comment?
   

I must admit, I don't understand the use-case well enough to really
give an appropriate amount of review as to whether this is the best
solution to the problem.

Michael, do you have any thoughts?

Regards,

Anthony Liguori

 

On Wed, Nov 10, 2010 at 05:47:35PM -0800, Mike Ryan wrote:
   

Add an option to specify the host interface to send multicast packets on
when using a multicast socket for networking. The option takes the name
of a host interface (e.g., eth0) and sets the IP_MULTICAST_IF socket
option, which causes the packets to use that interface as an egress.

This is useful if the host machine has several interfaces with several
virtual networks across disparate interfaces.
---
  net.c   |4 
  net/socket.c|   48 
  qemu-common.h   |1 +
  qemu-options.hx |   11 +--
  qemu_socket.h   |1 +
  5 files changed, 51 insertions(+), 14 deletions(-)

diff --git a/net.c b/net.c
index c5e6063..ff9eb27 100644
--- a/net.c
+++ b/net.c
@@ -1050,6 +1050,10 @@ static const struct {
  .name = "mcast",
  .type = QEMU_OPT_STRING,
  .help = "UDP multicast address and port number",
+}, {
+.name = "interface",
+.type = QEMU_OPT_STRING,
+.help = "interface to send multicast packets on",
  },
  { /* end of list */ }
  },
diff --git a/net/socket.c b/net/socket.c
index 1c4e153..ed7cd12 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -149,11 +149,13 @@ static void net_socket_send_dgram(void *opaque)
  qemu_send_packet(&s->nc, s->buf, size);
  }

-static int net_socket_mcast_create(struct sockaddr_in *mcastaddr)
+static int net_socket_mcast_create(struct sockaddr_in *mcastaddr, const char 
*interface)
  {
  struct ip_mreq imr;
  int fd;
  int val, ret;
+struct in_addr maddr;
+struct ifreq ifr;
  if (!IN_MULTICAST(ntohl(mcastaddr->sin_addr.s_addr))) {
fprintf(stderr, "qemu: error: specified mcastaddr \"%s\" (0x%08x) does not 
contain a multicast address\n",
inet_ntoa(mcastaddr->sin_addr),
@@ -201,6 +203,23 @@ static int net_socket_mcast_create(struct sockaddr_in 
*mcastaddr)
goto fail;
  }

+/* If an interface name is given, only send packets out that interface */
+if (interface != NULL) {
+strncpy(ifr.ifr_name, interface, IFNAMSIZ);
+ret = ioctl(fd, SIOCGIFADDR,&ifr);
+if (ret<   0) {
+fprintf(stderr, "qemu: error: specified interface \"%s\" does not 
exist\n", interface);
+goto fail;
+}
+
+maddr = ((struct sockaddr_in *)&ifr.ifr_addr)->sin_addr;
+ret = setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF,&maddr, 
sizeof(maddr));
+if (ret<   0) {
+perror("setsockopt(IP_MULTICAST_IF)");
+goto fail;
+}
+}
+
  socket_set_nonblock(fd);
  return fd;
  fail:
@@ -248,7 +267,7 @@ static NetSocketState *net_socket_fd_init_dgram(VLANState 
*vlan,
return NULL;
}
/* clone dgram socket */
-   newfd = net_socket_mcast_create(&saddr);
+   newfd = net_socket_mcast_create(&saddr, NULL);
if (newfd<   0) {
/* error already reported by net_socket_mcast_create() */
close(fd);
@@ -468,7 +487,8 @@ static int net_socket_connect_init(VLANState *vlan,
  static int net_socket_mcast_init(VLANState *vlan,
   const char *model,
   const char *name,
- const char *host_str)
+ const char *host_str,
+ const char *interface)
  {
  NetSocketState *s;
  int fd;
@@ -478,7 +498,7

[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device

2010-11-15 Thread Michael S. Tsirkin

On Mon, Nov 15, 2010 at 10:09:29AM -0600, Anthony Liguori wrote:
> On 11/15/2010 09:18 AM, Michael S. Tsirkin wrote:
> >On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote:
> >>On 11/15/2010 08:52 AM, Juan Quintela wrote:
> >>>"Michael S. Tsirkin"   wrote:
> There's no reason for tap to run when VM is stopped.
> If we let it, it confuses the bridge on TX
> and corrupts DMA memory on RX.
> 
> Signed-off-by: Michael S. Tsirkin
> >>>once here, what handlers make sense to run while stopped?
> >>>/me can think of the normal console, non live migration, loadvm and not
> >>>much more.  Perhaps it is easier to just move the other way around?
> >>I'm not sure I concur that this is really a problem.
> >>Semantically, I don't think that stop has to imply that the guest
> >>memory no longer changes.
> >>
> >>Regards,
> >>
> >>Anthony Liguori
> >>
> >>>Later, Juan.
> >Well, I do not really know about vmstop that is not for migration.
> 
> They are separate.  For instance, we don't rely on stop to pause
> pending disk I/O because we don't serialize pending disk I/O
> operations.  Instead, we flush all pending I/O and rely on the fact
> that disk I/O requests are always submitted in the context of a vCPU
> operation.  This assumption breaks down though with ioeventfd so we
> need to revisit it.
> 
> >For most vmstop calls are for migration. And there, the problems are very
> >real.
> >
> >First, it's not just memory.  At least for network transmit, sending out
> >packets with the same MAC from two locations is a problem. See?
> 
> I agree it's a problem but I'm not sure that just marking fd
> handlers really helps.
> 
> Bottom halves can also trigger transmits.

Are there any system ones?
Can we just stop processing them?

>  I think that if we put
> something in the network layer that just queued packets if the vm is
> stopped, it would be a more robust solution to the problem.

Will only work for -net. The problem is for anything
that can trigger activity when vm is stopped.

> >For memory, it is much worse:  any memory changes can either get
> >discarded or not.  This breaks consistency guarantees that guest relies
> >upon. Imagine virtio index getting updated but content not being
> >updated.  See?
> 
> If you suppress any I/O then the memory changes don't matter because
> the same changes will happen on the destination too.

They matter, and same changes won't happen.
Example:

virtio used index is in page 1, it can point at data in page 2.
device writes into data, *then* into index. Order matters,
but won't be preserved: migration assumes memory does not
change after vmstop, and so it might send old values for
data but new values for index. Result will be invalid
data coming into guest.

On the destination guest will pick up the index and
get bad (stale) data.


> I think this basic problem is the same as Kemari.  We can either
> attempt to totally freeze a guest which means stopping all callbacks
> that are device related or we can prevent I/O from happening which
> should introduce enough determinism to fix the problem in practice.
> 
> Regards,
> 
> Anthony Liguori


See above. IMO it's a different problem. Unlike Kemari,
I don't really see any drawbacks to stop all callbacks.
Do you?

-- 
MST

Re: [Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device

2010-11-15 Thread Stefan Hajnoczi

On Mon, Nov 15, 2010 at 4:09 PM, Anthony Liguori  wrote:
> On 11/15/2010 09:18 AM, Michael S. Tsirkin wrote:
>>
>> On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote:
>>
>>>
>>> On 11/15/2010 08:52 AM, Juan Quintela wrote:
>>>

 "Michael S. Tsirkin"   wrote:

>
> There's no reason for tap to run when VM is stopped.
> If we let it, it confuses the bridge on TX
> and corrupts DMA memory on RX.
>
> Signed-off-by: Michael S. Tsirkin
>

 once here, what handlers make sense to run while stopped?
 /me can think of the normal console, non live migration, loadvm and not
 much more.  Perhaps it is easier to just move the other way around?

>>>
>>> I'm not sure I concur that this is really a problem.
>>> Semantically, I don't think that stop has to imply that the guest
>>> memory no longer changes.
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>>>
>>>

 Later, Juan.

>>
>> Well, I do not really know about vmstop that is not for migration.
>>
>
> They are separate.  For instance, we don't rely on stop to pause pending
> disk I/O because we don't serialize pending disk I/O operations.  Instead,
> we flush all pending I/O and rely on the fact that disk I/O requests are
> always submitted in the context of a vCPU operation.  This assumption breaks
> down though with ioeventfd so we need to revisit it.

vhost has a solution for this: register a VMChangeStateHandler
function that stops ioeventfd while vm_stop(0) is in effect.  Then
poll to see if there is work pending.

I will add equivalent functionality to virtio-ioeventfd.

Stefan

1 2 >

1 - 100 of 142 matches

Mail list logo