Re: [Qemu-devel] [PATCH] linux-user: remove unnecessary local from __get_user(), __put_user()
On Mon, Nov 08, 2010 at 06:13:58PM +, Peter Maydell wrote: > Remove an unnecessary local variable from the __get_user() and > __put_user() macros. This avoids confusing compilation failures > if the name of the local variable ('size') happens to be the > same as the variable the macro user is trying to read/write. Looks fine, will push on my next patchset Acked-by: Riku Voipio > Signed-off-by: Peter Maydell > --- > linux-user/qemu.h |6 ++ > 1 files changed, 2 insertions(+), 4 deletions(-) > > diff --git a/linux-user/qemu.h b/linux-user/qemu.h > index 708021e..d717392 100644 > --- a/linux-user/qemu.h > +++ b/linux-user/qemu.h > @@ -266,8 +266,7 @@ static inline int access_ok(int type, abi_ulong addr, > abi_ulong size) > */ > #define __put_user(x, hptr)\ > ({\ > -int size = sizeof(*hptr);\ > -switch(size) {\ > +switch(sizeof(*hptr)) {\ > case 1:\ > *(uint8_t *)(hptr) = (uint8_t)(typeof(*hptr))(x);\ > break;\ > @@ -288,8 +287,7 @@ static inline int access_ok(int type, abi_ulong addr, > abi_ulong size) > > #define __get_user(x, hptr) \ > ({\ > -int size = sizeof(*hptr);\ > -switch(size) {\ > +switch(sizeof(*hptr)) {\ > case 1:\ > x = (typeof(*hptr))*(uint8_t *)(hptr);\ > break;\ > -- > 1.7.1 >
[Qemu-devel] Re: [PATCH v8 07/11] pci: introduce a parser for pci device path
On Mon, Nov 15, 2010 at 04:30:43PM +0900, Isaku Yamahata wrote: > introduce a function to parse pci device path of > the format, [Domain:]Slot.Function:Slot.Function:Slot.Function. > > Signed-off-by: Isaku Yamahata Hmm. How about we use openfirmware path like what Gleb's patch does, with a fallback to bus:dev.fn for when it's unambiguous? > --- > hw/pci.c | 87 > ++ > hw/pci.h |1 + > 2 files changed, 88 insertions(+), 0 deletions(-) > > diff --git a/hw/pci.c b/hw/pci.c > index fba765b..6a9a5ef 100644 > --- a/hw/pci.c > +++ b/hw/pci.c > @@ -433,6 +433,93 @@ static void pci_set_default_subsystem_id(PCIDevice > *pci_dev) > } > > /* > + * Parse format [Domain:]Slot.Function:Slot.Function:Slot.Function > + * and get PCIDevice > + * return 0 on success > + * -1 on error: format is invalid or device isn't found. > + */ > +int pci_parse_dev_path(const char *addr, PCIDevice **pdev) > +{ > +unsigned long domain; > +unsigned long slot; > +unsigned long func; > +const char *p; > +char *e; > +unsigned long val; > +PCIBus *bus; > +PCIBus *child_bus; > +PCIDevice *d; > + > +p = addr; > +val = strtoul(p, &e, 0); > +if (e == p) { > +return -1; > +} > +if (*e == ':') { > +domain = val; > +p = e + 1; > +val = strtoul(p, &e, 0); > +if (e == p) { > +return -1; > +} > +} else if (*e == '.'){ > +domain = 0; > +} else { > +return -1; > +} > +if (domain > 0x) { > +return -1; > +} > + > +bus = pci_find_root_bus(domain); > +if (!bus) { > +return -1; > +} > + > +for (;;) { > +slot = val; > +if (*e != '.') { > +return -1; > +} > +p = e + 1; > +val = strtoul(p, &e, 0); > +if (e == p) { > +return -1; > +} > +func = val; > +if (slot > 0x1f || func >= PCI_FUNC_MAX) { > +return -1; > +} > +d = bus->devices[PCI_DEVFN(slot, func)]; > +if (!d) { > +return -1; > +} > +if (*e == '\0') { > +break; > +} > + > +if (*e != ':') { > +return -1; > +} > +p = e + 1; > +val = strtoul(p, &e, 0); > +if (e == p) { > +return -1; > +} > +QLIST_FOREACH(child_bus, &bus->child, sibling) { > +if (child_bus->parent_dev == d) { > +bus = child_bus; > +continue; > +} > +} > +return -1; > +} > + > +*pdev = d; > +return 0; > +} > + > +/* > * Parse [[:]:], return -1 on error if funcp == NULL > * [[:]:]., return -1 on error > */ > diff --git a/hw/pci.h b/hw/pci.h > index 7100804..8c16f91 100644 > --- a/hw/pci.h > +++ b/hw/pci.h > @@ -239,6 +239,7 @@ PCIBus *pci_find_bus(PCIBus *bus, int bus_num); > PCIDevice *pci_find_device(PCIBus *bus, int bus_num, int slot, int function); > PCIBus *pci_get_bus_devfn(int *devfnp, const char *devaddr); > > +int pci_parse_dev_path(const char *addr, PCIDevice **pdev); > int pci_parse_devaddr(const char *addr, int *domp, int *busp, >unsigned int *slotp, unsigned int *funcp); > int pci_read_devaddr(Monitor *mon, const char *addr, int *domp, int *busp, > -- > 1.7.1.1
[Qemu-devel] Re: [PATCHv4 15/15] Pass boot device list to firmware.
On Mon, Nov 15, 2010 at 09:53:50AM +0200, Michael S. Tsirkin wrote: > On Mon, Nov 15, 2010 at 09:40:08AM +0200, Gleb Natapov wrote: > > On Sun, Nov 14, 2010 at 10:40:33PM -0500, Kevin O'Connor wrote: > > > On Sun, Nov 14, 2010 at 05:39:41PM +0200, Gleb Natapov wrote: > > > > +/* > > > > + * This function returns device list as an array in a below format: > > > > + * +-+-+---+-+---+-- > > > > + * | n | l1 | devpath1| l2 | devpath2 | ... > > > > + * +-+-+---+-+---+-- > > > > + * where: > > > > + * n - a number of devise pathes (one byte) > > > > + * l - length of following device path string (one byte) > > > > + * devpath - non-null terminated string of length l representing > > > > + * one device path > > > > + */ > > > > > > Why not just return a newline separated list that is null terminated? > > > > > Doing it like this will needlessly complicate firmware side. How do you > > know how much memory to allocate before reading device list? > > Do a memory scan, count newlines until you reach 0? > To do memory scan you need to read it into memory first. To read it into memory you need to know how much memory to allocate to know how much memory to allocate you meed to do memory scan... Notice pattern here :) Of course you can scan IO space too discarding everything you read first time, but why introduce broken interface in the first place? > > Doing it > > like Blue suggest (have BOOTINDEX_LEN and BOOTINDEX_STRING) solves this. > > To create nice array from bootindex string you firmware will still have > > to do additional pass on it though. > > Why is this a problem? Pass over memory is cheap, isn't it? > More code, each line of code potentially introduce bug. But I will go with Blue suggestion anyway since we already use it for other things. > > With format like above the code > > would look like that: > > > > qemu_cfg_read(&n, 1); > > arr = alloc(n); > > for (i=0; i > qemu_cfg_read(&l, 1); > > arr[i] = zalloc(l+1); > > qemu_cfg_read(arr[i], l); > > } > > > > > > -- > > Gleb. > > > At this point I don't care about format. I do. > But I would like one without 1-byte-length limitations, > just so we can cover whatever pci can through at us. > I agree. 1-byte for one device string may be to limiting. It is still more then 15 PCI bridges on a PC and if you have your pci bus go that deep you are doing something very wrong. But according to spec device name can be 32 byte long and device address may be 64bit physical address and that makes length of one device element to be 50 byte. -- Gleb.
[Qemu-devel] Re: [PATCH v8 01/11] pci: revise pci command register initialization
On Mon, Nov 15, 2010 at 04:30:37PM +0900, Isaku Yamahata wrote: > This patch cleans up command register initialization with > comments. > > Signed-off-by: Isaku Yamahata Probably just fold with the next patch. > --- > hw/pci.c | 42 ++ > 1 files changed, 42 insertions(+), 0 deletions(-) > > diff --git a/hw/pci.c b/hw/pci.c > index 962886e..b70a568 100644 > --- a/hw/pci.c > +++ b/hw/pci.c > @@ -544,8 +544,50 @@ static void pci_init_wmask(PCIDevice *dev) > > dev->wmask[PCI_CACHE_LINE_SIZE] = 0xff; > dev->wmask[PCI_INTERRUPT_LINE] = 0xff; > + > +/* > + * bit 0: PCI_COMMAND_IO > + *type 0: if IO BAR is used, RW > + *type 1: RW > + * bit 1: PCI_COMMAND_MEMORY > + *type 0: if IO BAR is used, RW > + *type 1: RW > + * bit 2: PCI_COMMAND_MASTER > + *type 0: RW if bus master > + *type 1: RW > + * bit 3: PCI_COMMAND_SPECIAL > + *RO=0, optionally RW: Such device should set this bit itself > + * bit 4: PCI_COMMAND_INVALIDATE > + *RO=0, optionally RW: Such device should set this bit itself > + * bit 5: PCI_COMMAND_VGA_PALETTE > + *RO=0, optionally RW: Such device should set this bit itself > + * bit 6: PCI_COMMAND_PARITY > + *RW with exceptions: Such device should clear this bit itself > + *Given that qemu doesn't emulate pci bus cycles, so that there > + *is no place to generate parity error. So just making this > + *register RW is okay because there is no place which refers > + *this bit. > + *TODO: When device assignment tried to inject PERR# into qemu, > + * some extra work would be needed. > + * bit 7: PCI_COMMAND_WAIT: reserved (PCI 3.0) > + *RO=0 > + * bit 8: PCI_COMMAND_SERR > + *RW with exceptions: Such device should clear this bit itself > + *Given that qemu doesn't emulate pci bus cycles, so that there > + *is no place to generate system error. So just making this > + *register RW is okay because there is no place which refers > + *this bit. > + *TODO: When device assignment tried to inject SERR# into qemu, > + * some extra work would be needed. > + * bit 9: PCI_COMMAND_FAST_BACK > + *RO=0, optionally RW: Such device should set this bit itself > + * bit 10: PCI_COMMAND_INTX_DISABLE > + * RW > + * bit 11-15: reserved > + */ > pci_set_word(dev->wmask + PCI_COMMAND, > PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER | > + PCI_COMMAND_PARITY | PCI_COMMAND_SERR | > PCI_COMMAND_INTX_DISABLE); > > memset(dev->wmask + PCI_CONFIG_HEADER_SIZE, 0xff, > -- > 1.7.1.1
Re: Re: [Qemu-devel] How to make shadow memory for a process? and how to trace the data propation from the instruction level in QEMU?
Hi OK it's getting interesting perhaps it would lead into instrumentation topic, which is quite hot topic in qemu-devel quite recently, so you jump into the wagon just about the right time :) 2010/11/15 F. Zhang : > I am very pleased to share ideas with you. But my English is too poor, er…, > I’ll try my best to make it clear. J Either do I. How much do you expect Indonesian like me to write fluently English, after all? :D heheh, just joking :) OK, one thing for sure here is, I think you can implement your idea on top of several (not so complete) existing frameworks in Qemu. Tracing...is one of them...not sure about the rest... > Yes, I have read that paper, it’s wonderful! > > Besides the Argos, the bitblaze group, led by Dawn Song in Berkeley, has > achieved great success in the taint analysis. The website about their > dynamic analysis work (called TEMU) can be found at: > http://bitblaze.cs.berkeley.edu/temu.html > > And TEMU is now open-source. Thanks for sharing that...it's new stuff for me. So, why don't you just pick TEMU and improve it instead of...uhm...sorry if I am wrong, working from scratch? After all, I believe in both Argos and TEMU (and maybe other similar projects), they share common codes here and there. But ehm...CMIIW, seems like TEMU is based on Qemu 0.9,x, right? So it's sorry I forgot the name, the generated code is mostly a constructed by fragments of small codes generated by gcc. Now, it is qemu which does it by itself. So, a lot of things change (substantially). > Yes. For each process’s memory space A, I wanna make a shadow memory B. The > shadow memory is used to store the tag of data. In other words, if addr in > memory A is tainted, then the corresponding byte in B should be marked to > indicate that addr in A is tainted. I agree that should be the way it worksbut. (see below) >>How about using unused one of unused PTE flags for such tag? > > Sorry, what is the PTE flag? Page Table Entry...i believe not all flags are really used by the OS nowadays, so I guess you can utilize 1 or 2 bits there whenever possible... > > In fact, the tag is stored in the shadow memory of the process. > > Let us consider the following instruction: > > mov eax, [esi] > > If data in [esi] is tainted, then eax is tained, too. May we know, what kind of information do you plan to store in such tag? > In this instruction, we should first consider whether [esi] is tainted or > not. This is done by checking the tag in the shadow memory. If [esi] is > tainted, then the tag for eax in the shadow memory is set, too. > > The question is: how to implement the upper functions? maybe I should modify > the instruction-translation functions to implement the trace of tainted data > propagation? I think you should hook all the memory operation related opcode (or to be precise, Qemu opcode). That way, you won't miss any.. > Yes, I wanna make QEMU cooperate with the GUEST OS. In fact, malware under > analysis is run within the GUEST OS. Hm, I thought it would be host OS + qemudon't you think, if it is guest OS +qemu, while there is a chance guest OS is compromised first, then we get such unreliable data? Or am I missing something here? >The guest os collects “higher” semantic > from the OS level, and the QEMU collects “lower” semantic from the > instruction level. Combination of both semantics is necessary in the > analysis process. The question is, in a situation where malware already compromise "the higher semantic", could we trust the analysis? > The question is: how to communicate between the QEMU and the guest OS, so > that they can cooperate with each other? OK, so let's assume it's really guest OS +qemu...i think, uhm, better create pseudo device, quite similar with virtioor you can think it's like /dev/sda, /dev/rtc etc... the guest OS must somewhat be installed with a driver which knows how to read and talk to this device. Via the driver, fed any analysis resultqemu collects it...and finally pass it to host OS. Other possibilty is to reserve certain memory region (kinda BIOS reserved memory space), mmap it inside the guest OS, then treat it like System V shared memory. Put the data in it, Qemu regularly checks it... What do you think? PS: eduardo cruz might be an interesting person to talk to..he did instrumention work lately too -- regards, Mulyadi Santosa Freelance Linux trainer and consultant blog: the-hydra.blogspot.com training: mulyaditraining.blogspot.com
[Qemu-devel] Re: [PATCHv4 15/15] Pass boot device list to firmware.
On Sun, Nov 14, 2010 at 10:50:13PM +, Blue Swirl wrote: > On Sun, Nov 14, 2010 at 3:39 PM, Gleb Natapov wrote: > > > > Signed-off-by: Gleb Natapov > > --- > > hw/fw_cfg.c | 14 ++ > > hw/fw_cfg.h | 4 +++- > > sysemu.h | 1 + > > vl.c | 51 +++ > > 4 files changed, 69 insertions(+), 1 deletions(-) > > > > diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c > > index 7b9434f..f6a67db 100644 > > --- a/hw/fw_cfg.c > > +++ b/hw/fw_cfg.c > > @@ -53,6 +53,7 @@ struct FWCfgState { > > FWCfgFiles *files; > > uint16_t cur_entry; > > uint32_t cur_offset; > > + Notifier machine_ready; > > }; > > > > static void fw_cfg_write(FWCfgState *s, uint8_t value) > > @@ -315,6 +316,15 @@ int fw_cfg_add_file(FWCfgState *s, const char > > *filename, uint8_t *data, > > return 1; > > } > > > > +static void fw_cfg_machine_ready(struct Notifier* n) > > +{ > > + uint32_t len; > > + char *bootindex = get_boot_devices_list(&len); > > + > > + fw_cfg_add_bytes(container_of(n, FWCfgState, machine_ready), > > + FW_CFG_BOOTINDEX, (uint8_t*)bootindex, len); > > +} > > + > > FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port, > > target_phys_addr_t ctl_addr, target_phys_addr_t > > data_addr) > > { > > @@ -343,6 +353,10 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t > > data_port, > > fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus); > > fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu); > > > > + > > + s->machine_ready.notify = fw_cfg_machine_ready; > > + qemu_add_machine_init_done_notifier(&s->machine_ready); > > + > > return s; > > } > > > > diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h > > index 856bf91..4d61410 100644 > > --- a/hw/fw_cfg.h > > +++ b/hw/fw_cfg.h > > @@ -30,7 +30,9 @@ > > > > #define FW_CFG_FILE_FIRST 0x20 > > #define FW_CFG_FILE_SLOTS 0x10 > > -#define FW_CFG_MAX_ENTRY (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS) > > +#define FW_CFG_FILE_LAST_SLOT (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS) > > +#define FW_CFG_BOOTINDEX (FW_CFG_FILE_LAST_SLOT + 1) > > +#define FW_CFG_MAX_ENTRY FW_CFG_BOOTINDEX > > This should be > #define FW_CFG_MAX_ENTRY(FW_CFG_BOOTINDEX + 1) > because the check is like this: > if ((key & FW_CFG_ENTRY_MASK) >= FW_CFG_MAX_ENTRY) { > s->cur_entry = FW_CFG_INVALID; > Yeah, will fix. > With that change, I got the bootindex passed to OpenBIOS: > OpenBIOS for Sparc64 > Configuration device id QEMU version 1 machine id 0 > kernel cmdline > CPUs: 1 x SUNW,UltraSPARC-IIi > UUID: ---- > bootindex num_strings 1 > bootindex /p...@01fe/i...@5/dr...@1/d...@0 > > The device path does not match exactly, but it's close: > /p...@1fe,0/pci-...@5/i...@600/d...@0 pbm->pci should be solvable by the patch at the end. Were in the spec it is allowed to abbreviate 1fe as 1fe,0? Spec allows to drop starting zeroes but TARGET_FMT_plx definition in targphys.h has 0 after %. I can define another one without leading zeroes. Can you suggest a name? TARGET_FMT_lx is poisoned. As of ATA there is no open firmware binding spec for ATA, so everyone does what he pleases. I based my implementation on what open firmware showing when running on qemu x86. "pci-ata" should be "ide" according to PCI binding spec :) diff --git a/hw/apb_pci.c b/hw/apb_pci.c index c619112..643aa49 100644 --- a/hw/apb_pci.c +++ b/hw/apb_pci.c @@ -453,6 +453,7 @@ static PCIDeviceInfo pbm_pci_host_info = { static SysBusDeviceInfo pbm_host_info = { .qdev.name = "pbm", +.qdev.fw_name = "pci", .qdev.size = sizeof(APBState), .qdev.reset = pci_pbm_reset, .init = pci_pbm_init_device, -- Gleb.
[Qemu-devel] Re: [PATCH v8 07/11] pci: introduce a parser for pci device path
On Mon, Nov 15, 2010 at 10:06:13AM +0200, Michael S. Tsirkin wrote: > On Mon, Nov 15, 2010 at 04:30:43PM +0900, Isaku Yamahata wrote: > > introduce a function to parse pci device path of > > the format, [Domain:]Slot.Function:Slot.Function:Slot.Function. > > > > Signed-off-by: Isaku Yamahata > > Hmm. > How about we use openfirmware path like what Gleb's patch does, > with a fallback to bus:dev.fn for when it's unambiguous? Okay, let me check my understanding of the format. The openfirmware path in pci case looks like /pci@/@,/.../@, "pci@" corresponds to pci domain. So should be also supported in addition to . Maybe we'd like "@" to be optional. So the parser would accept something like /{, }/,/.../, -- yamahata
[Qemu-devel] Re: [PATCH v8 07/11] pci: introduce a parser for pci device path
On Mon, Nov 15, 2010 at 05:57:40PM +0900, Isaku Yamahata wrote: > On Mon, Nov 15, 2010 at 10:06:13AM +0200, Michael S. Tsirkin wrote: > > On Mon, Nov 15, 2010 at 04:30:43PM +0900, Isaku Yamahata wrote: > > > introduce a function to parse pci device path of > > > the format, [Domain:]Slot.Function:Slot.Function:Slot.Function. > > > > > > Signed-off-by: Isaku Yamahata > > > > Hmm. > > How about we use openfirmware path like what Gleb's patch does, > > with a fallback to bus:dev.fn for when it's unambiguous? > > Okay, let me check my understanding of the format. > > The openfirmware path in pci case looks like > /pci@/@,/.../@, > > "pci@" corresponds to pci domain. So should be > also supported in addition to . > Correct. > Maybe we'd like "@" to be optional. > So the parser would accept something like > /{, }/,/.../, My patch series has pci_dev_fw_name() function that fills in for you, so no reason to make it optional. If you still want to make it optional @ should stay, so it should be /@ioport/@slot.func/ -- Gleb.
Re: [Qemu-devel] [PATCH RESENT] msix: allow byte and word reading from mmio
Am 22.08.2010 23:34, schrieb ext Anthony Liguori: On 08/19/2010 07:56 AM, Bernhard Kohl wrote: It's legal that the guest reads a single byte or word from mmio. I have an OS which reads single bytes and it works fine on real hardware. Maybe this happens due to casting. Signed-off-by: Bernhard Kohl Hi Michael, I'm looking for this to come through your tree unless you disagree. Regards, Anthony Liguori Hi, could you please look at this again? Thanks Bernhard --- hw/msix.c | 20 1 files changed, 16 insertions(+), 4 deletions(-) diff --git a/hw/msix.c b/hw/msix.c index d99403a..7dac7f7 100644 --- a/hw/msix.c +++ b/hw/msix.c @@ -100,10 +100,22 @@ static uint32_t msix_mmio_readl(void *opaque, target_phys_addr_t addr) return pci_get_long(page + offset); } -static uint32_t msix_mmio_read_unallowed(void *opaque, target_phys_addr_t addr) +static uint32_t msix_mmio_readw(void *opaque, target_phys_addr_t addr) { -fprintf(stderr, "MSI-X: only dword read is allowed!\n"); -return 0; +PCIDevice *dev = opaque; +unsigned int offset = addr& (MSIX_PAGE_SIZE - 1)& ~0x1; +void *page = dev->msix_table_page; + +return pci_get_word(page + offset); +} + +static uint32_t msix_mmio_readb(void *opaque, target_phys_addr_t addr) +{ +PCIDevice *dev = opaque; +unsigned int offset = addr& (MSIX_PAGE_SIZE - 1); +void *page = dev->msix_table_page; + +return pci_get_byte(page + offset); } static uint8_t msix_pending_mask(int vector) @@ -198,7 +210,7 @@ static CPUWriteMemoryFunc * const msix_mmio_write[] = { }; static CPUReadMemoryFunc * const msix_mmio_read[] = { -msix_mmio_read_unallowed, msix_mmio_read_unallowed, msix_mmio_readl +msix_mmio_readb, msix_mmio_readw, msix_mmio_readl }; /* Should be called from device's map method. */
[Qemu-devel] Re: [PATCH] pc: disable the BOCHS BIOS panic port
Am 01.09.2010 16:44, schrieb Bernhard Kohl: We have an OS which writes to port 0x400 when probing for special hardware. This causes an exit of the VM. With SeaBIOS this port isn't used anyway. Signed-off-by: Bernhard Kohl --- hw/pc.c |2 -- 1 files changed, 0 insertions(+), 2 deletions(-) diff --git a/hw/pc.c b/hw/pc.c index 69b13bf..3f81229 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -430,8 +430,6 @@ static void bochs_bios_write(void *opaque, uint32_t addr, uint32_t val) /* Bochs BIOS messages */ case 0x400: case 0x401: -fprintf(stderr, "BIOS panic at rombios.c, line %d\n", val); -exit(1); case 0x402: case 0x403: #ifdef DEBUG_BIOS Hi, could you please look at this? Thanks Bernhard
Re: [Qemu-devel] Re: [PATCH] pc: disable the BOCHS BIOS panic port
On 15.11.2010, at 10:53, Bernhard Kohl wrote: > Am 01.09.2010 16:44, schrieb Bernhard Kohl: >> We have an OS which writes to port 0x400 when probing for special hardware. >> This causes an exit of the VM. With SeaBIOS this port isn't used anyway. >> >> Signed-off-by: Bernhard Kohl >> --- >> hw/pc.c |2 -- >> 1 files changed, 0 insertions(+), 2 deletions(-) >> >> diff --git a/hw/pc.c b/hw/pc.c >> index 69b13bf..3f81229 100644 >> --- a/hw/pc.c >> +++ b/hw/pc.c >> @@ -430,8 +430,6 @@ static void bochs_bios_write(void *opaque, uint32_t >> addr, uint32_t val) >> /* Bochs BIOS messages */ >> case 0x400: >> case 0x401: >> -fprintf(stderr, "BIOS panic at rombios.c, line %d\n", val); >> -exit(1); >> case 0x402: >> case 0x403: >> #ifdef DEBUG_BIOS >> > > Hi, > > could you please look at this? This patch makes that port that was silent before print debug output if DEBUG_BIOS is enabled which might be confusing. How about something like case 0x400: case 0x401: /* used to be panic, now unused */ break; Alex
[Qemu-devel] Re: [PATCHv4 13/15] Add bootindex for option roms.
On Sun, Nov 14, 2010 at 09:33:06PM +, Blue Swirl wrote: > On Sun, Nov 14, 2010 at 3:39 PM, Gleb Natapov wrote: > > Extend -option-rom command to have additional parameter ,bootindex=. > > This patch is broken: > CCarm-softmmu/palm.o > /src/qemu/hw/palm.c: In function 'palmte_init': > /src/qemu/hw/palm.c:237: error: incompatible type for argument 1 of > 'get_image_size' > /src/qemu/hw/palm.c:245: error: incompatible type for argument 1 of > 'load_image_targphys' > cc1: warnings being treated as errors > /src/qemu/hw/palm.c:250: error: format '%s' expects type 'char *', but > argument 4 has type 'QEMUOptionRom' > CCarm-softmmu/nseries.o > /src/qemu/hw/nseries.c: In function 'n8x0_init': > /src/qemu/hw/nseries.c:1346: error: incompatible type for argument 1 > of 'load_image_targphys' Fixed in new version. -- Gleb.
Re: [Qemu-devel] [PATCH V6 02/15] xen: Support new libxc calls from xen unstable.
On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote: > From: Anthony PERARD > > Update the libxenctrl calls in Qemu to use the new interface, otherwise > Qemu wouldn't be able to build against new versions of the library. > > We also check libxenctrl version in configure, from Xen 3.3.0 to Xen > unstable. > > Signed-off-by: Anthony PERARD > Signed-off-by: Stefano Stabellini > --- > configure| 67 - > hw/xen_backend.c | 10 +++--- > hw/xen_backend.h |2 +- > hw/xen_common.h | 38 +--- > hw/xen_disk.c| 12 > hw/xen_domainbuild.c |2 +- > hw/xen_nic.c | 16 ++-- > 7 files changed, 109 insertions(+), 38 deletions(-) > > diff --git a/configure b/configure > index 9ed05de..f6a7073 100755 > --- a/configure > +++ b/configure > @@ -284,6 +284,7 @@ vnc_jpeg="" > vnc_png="" > vnc_thread="no" > xen="" > +xen_ctrl_version="" > linux_aio="" > attr="" > vhost_net="" > @@ -1135,20 +1136,81 @@ fi > > if test "$xen" != "no" ; then > xen_libs="-lxenstore -lxenctrl -lxenguest" > + > + # Xen unstable > cat > $TMPC < #include > #include > -int main(void) { xs_daemon_open(); xc_interface_open(); return 0; } > +#include > +#include > +#if !defined(HVM_MAX_VCPUS) > +# error HVM_MAX_VCPUS not defined > +#endif > +int main(void) { > + xc_interface *xc; > + xs_daemon_open(); > + xc_interface_open(0, 0, 0); > + xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0); > + xc_gnttab_open(xc); > + return 0; > +} > EOF > if compile_prog "" "$xen_libs" ; then > +xen_ctrl_version=410 > xen=yes > -libs_softmmu="$xen_libs $libs_softmmu" > + > + # Xen 4.0.0 > + elif ( > + cat > $TMPC < +#include > +#include > +#include > +#include > +#if !defined(HVM_MAX_VCPUS) > +# error HVM_MAX_VCPUS not defined > +#endif > +int main(void) { > + xs_daemon_open(); > + xc_interface_open(); > + xc_gnttab_open(); > + xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0); > + return 0; > +} > +EOF > + compile_prog "" "$xen_libs" > +) ; then > +xen_ctrl_version=400 > +xen=yes > + > + # Xen 3.3.0, 3.4.0 > + elif ( > + cat > $TMPC < +#include > +#include > +int main(void) { > + xs_daemon_open(); > + xc_interface_open(); > + xc_gnttab_open(); > + xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0); > + return 0; > +} > +EOF > + compile_prog "" "$xen_libs" > +) ; then > +xen_ctrl_version=330 > +xen=yes > + > + # Xen not found or unsupported > else > if test "$xen" = "yes" ; then > feature_not_found "xen" > fi > xen=no > fi > + > + if test "$xen" = "yes"; then > +libs_softmmu="$xen_libs $libs_softmmu" > + fi > fi > > ## > @@ -2546,6 +2608,7 @@ if test "$bluez" = "yes" ; then > fi > if test "$xen" = "yes" ; then > echo "CONFIG_XEN=y" >> $config_host_mak > + echo "CONFIG_XEN_CTRL_INTERFACE_VERSION=$xen_ctrl_version" >> > $config_host_mak > fi > if test "$io_thread" = "yes" ; then > echo "CONFIG_IOTHREAD=y" >> $config_host_mak > diff --git a/hw/xen_backend.c b/hw/xen_backend.c > index 860b038..3e99751 100644 > --- a/hw/xen_backend.c > +++ b/hw/xen_backend.c > @@ -43,7 +43,7 @@ > /* - */ > > /* public */ > -int xen_xc; > +qemu_xc_interface xen_xc = XC_HANDLER_INITIAL_VALUE; > struct xs_handle *xenstore = NULL; > const char *xen_protocol; > > @@ -216,7 +216,7 @@ static struct XenDevice *xen_be_get_xendev(const char > *type, int dom, int dev, > fcntl(xc_evtchn_fd(xendev->evtchndev), F_SETFD, FD_CLOEXEC); > > if (ops->flags & DEVOPS_FLAG_NEED_GNTDEV) { > -xendev->gnttabdev = xc_gnttab_open(); > +xendev->gnttabdev = xc_gnttab_open(xen_xc); > if (xendev->gnttabdev < 0) { > xen_be_printf(NULL, 0, "can't open gnttab device\n"); > xc_evtchn_close(xendev->evtchndev); > @@ -269,7 +269,7 @@ static struct XenDevice *xen_be_del_xendev(int dom, int > dev) > if (xendev->evtchndev >= 0) > xc_evtchn_close(xendev->evtchndev); > if (xendev->gnttabdev >= 0) > -xc_gnttab_close(xendev->gnttabdev); > +xc_gnttab_close(xen_xc, xendev->gnttabdev); > > QTAILQ_REMOVE(&xendevs, xendev, next); > qemu_free(xendev); > @@ -627,8 +627,8 @@ int xen_be_init(void) > if (qemu_set_fd_handler(xs_fileno(xenstore), xenstore_update, NULL, NULL) > < 0) > goto err; > > -xen_xc = xc_interface_open(); > -if (xen_xc == -1) { > +xen_xc = xc_interface_open(NULL, NULL, 0); > +if (xen_xc == XC_HANDLER_INITIAL_VALUE) { > xen_be_printf(NULL, 0, "can't open xen interface\n"); > goto err; > } > diff --git a/hw/xen_backend.h b/hw/xen_backend.h > index 1b428e3..7d84098 100644 > --- a/hw/xen_backend.h > +++ b/hw/xen_backend.h > @@ -55,7 +55,7 @@ struct XenDevice { > /*
Re: [Qemu-devel] [PATCH V6 03/15] xen: Add xen_machine_fv
On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote: > From: Anthony PERARD > > Add the Xen FV (Fully Virtualized) machine to Qemu; > this is groundwork to add Xen device model support in Qemu. > > Signed-off-by: Anthony PERARD > Signed-off-by: Stefano Stabellini > --- > Makefile.target |3 + > hw/xen_common.h |5 ++ > hw/xen_machine_fv.c | 158 +++ > 3 files changed, 166 insertions(+), 0 deletions(-) > create mode 100644 hw/xen_machine_fv.c > > diff --git a/Makefile.target b/Makefile.target > index c48cbcc..ddabc8a 100644 > --- a/Makefile.target > +++ b/Makefile.target > @@ -185,6 +185,9 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS) > # xen backend driver support > obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o > > +# xen full virtualized machine > +obj-i386-$(CONFIG_XEN) += xen_machine_fv.o > + > # USB layer > obj-$(CONFIG_USB_OHCI) += usb-ohci.o > > diff --git a/hw/xen_common.h b/hw/xen_common.h > index 9f75e52..4c0f97d 100644 > --- a/hw/xen_common.h > +++ b/hw/xen_common.h > @@ -18,6 +18,11 @@ > * We don't support Xen prior to 3.3.0. > */ > > +/* Before Xen 4.0.0 */ > +#if CONFIG_XEN_CTRL_INTERFACE_VERSION < 400 > +# define HVM_MAX_VCPUS 32 > +#endif > + > /* Xen unstable */ > #if CONFIG_XEN_CTRL_INTERFACE_VERSION < 410 > typedef int qemu_xc_interface; > diff --git a/hw/xen_machine_fv.c b/hw/xen_machine_fv.c > new file mode 100644 > index 000..260cda3 > --- /dev/null > +++ b/hw/xen_machine_fv.c > @@ -0,0 +1,158 @@ > +/* > + * QEMU Xen FV Machine > + * > + * Copyright (c) 2003-2007 Fabrice Bellard > + * Copyright (c) 2007 Red Hat Shouldn't there be Citrix in there? > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > copy > + * of this software and associated documentation files (the "Software"), to > deal > + * in the Software without restriction, including without limitation the > rights > + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell > + * copies of the Software, and to permit persons to whom the Software is > + * furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > FROM, > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN > + * THE SOFTWARE. So all this code has always been public domain? There are no GPL'ed pieces in there? The rest of the code looks pretty much like a duplicate of pc.c. I'm not saying this is bad, but it'll be a lot of work to keep things in sync when others start changing pc.c. So long-term we should get rid of the duplication. Hopefully by having a working machine description framework that doesn't require C skills :). It's way out of the scope of this patch set though. Alex
Re: [Qemu-devel] [PATCH V6 04/15] Introduce -accel command option.
On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote: > From: Anthony PERARD > > This option gives the ability to switch one "accelerator" like kvm, xen > or the default one tcg. We can specify more than one accelerator by > separate them by a comma. QEMU will try each one and use the first whose > works. > > So, > > -accel xen,kvm,tcg > > which would try Xen support first, then KVM and finaly tcg if none of > the other works. > > Signed-off-by: Anthony PERARD This patch is completely independent of the other patches. Anthony, can we pull it in separately before the others? Acked-by: Alexander Graf
Re: [Qemu-devel] virtio-blk broken after system reset
Am 13.11.2010 11:09, schrieb Jan Kiszka: > Am 13.11.2010 11:01, Michael Tokarev wrote: >> 13.11.2010 10:51, Jan Kiszka wrote: >>> Am 13.11.2010 08:49, Stefan Hajnoczi wrote: On Fri, Nov 12, 2010 at 10:02 PM, Jan Kiszka wrote: > Hi, > > both after hard and guest-initiated reset, something is seriously broken > with virtio block devices. If I reset my Linux guest while still in > grub, the bios will simply fail to read from the disk after the reboot. > If I > reset after Linux touched the device, qemu terminates: > > Breakpoint 1, 0x74b945b0 in _exit () from /lib64/libc.so.6 > (gdb) bt > #0 0x74b945b0 in _exit () from /lib64/libc.so.6 > #1 0x74b2948d in __run_exit_handlers () from /lib64/libc.so.6 > #2 0x74b29535 in exit () from /lib64/libc.so.6 > #3 0x00568da3 in virtqueue_num_heads (vq=0x17040e0, idx=0) at > /data/qemu/hw/virtio.c:258 > #4 0x00569511 in virtqueue_pop (vq=0x17040e0, elem=0x17cea58) at > /data/qemu/hw/virtio.c:388 > #5 0x00419e31 in virtio_blk_get_request (s=0x1704010) at > /data/qemu/hw/virtio-blk.c:132 > #6 virtio_blk_handle_output (vdev=0x1704010, vq=) > at /data/qemu/hw/virtio-blk.c:369 > >> [] >>> And what about the guest-triggerable qemu exit above? >> >> There are _lots_ of guest-triggerable qemu exits out there. >> >> static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx) >> { >> uint16_t num_heads = vring_avail_idx(vq) - idx; >> >> /* Check it isn't doing very strange things with descriptor numbers. */ >> if (num_heads > vq->vring.num) { >> fprintf(stderr, "Guest moved used index from %u to %u", >> idx, vring_avail_idx(vq)); >> exit(1); >> } >> >> return num_heads; >> } >> >> This is done when guest behaves insanely (or qemu thinks it does). >> On a real hw similar behavour most likely will lead to a system >> lockup, qemu just exits. > > There is also real hw out there that goes into an error state if it's > misprogrammed. > > I think we have to remove all those premature exits. They also prevent > handing the device inside the guest to an untrusted driver (relevant > once we have IOMMU emulation). I agree, those exits are bugs. There are a lot of them in virtio code. At some point, someone should go through the whole virtio code and fix them all. Kevin
[Qemu-devel] Re: [PATCH RESENT] msix: allow byte and word reading from mmio
On Thu, Aug 19, 2010 at 02:56:51PM +0200, Bernhard Kohl wrote: > It's legal that the guest reads a single byte or word from mmio. Interesting. The spec seems to say this: For all accesses to MSI-X Table and MSI-X PBA fields, software must use aligned full DWORD or aligned full QWORD transactions; otherwise, the result is undefined. > I have an OS which reads single bytes and it works fine on real > hardware. Maybe this happens due to casting. What do you mean by casting? > > Signed-off-by: Bernhard Kohl I guess we can merge this, but we need a comment I think since this seems to contradict the spec, and people were sending patches relying on this. Does something like the following describe the situation correctly? /* Note: PCI spec requires that all MSI-X table accesses are either DWORD or QWORD, size aligned. Some guests seem to violate this rule for read accesses, performing single byte reads. Since it's easy to support this, let's do so. Also support 16 bit size aligned reads, just in case. */ Does you guest also do 16 bit reads? Are accesses at least aligned? > --- > hw/msix.c | 20 > 1 files changed, 16 insertions(+), 4 deletions(-) > > diff --git a/hw/msix.c b/hw/msix.c > index d99403a..7dac7f7 100644 > --- a/hw/msix.c > +++ b/hw/msix.c > @@ -100,10 +100,22 @@ static uint32_t msix_mmio_readl(void *opaque, > target_phys_addr_t addr) > return pci_get_long(page + offset); > } > > -static uint32_t msix_mmio_read_unallowed(void *opaque, target_phys_addr_t > addr) > +static uint32_t msix_mmio_readw(void *opaque, target_phys_addr_t addr) > { > -fprintf(stderr, "MSI-X: only dword read is allowed!\n"); > -return 0; > +PCIDevice *dev = opaque; > +unsigned int offset = addr & (MSIX_PAGE_SIZE - 1) & ~0x1; > +void *page = dev->msix_table_page; > + > +return pci_get_word(page + offset); > +} > + > +static uint32_t msix_mmio_readb(void *opaque, target_phys_addr_t addr) > +{ > +PCIDevice *dev = opaque; > +unsigned int offset = addr & (MSIX_PAGE_SIZE - 1); > +void *page = dev->msix_table_page; > + > +return pci_get_byte(page + offset); > } > > static uint8_t msix_pending_mask(int vector) > @@ -198,7 +210,7 @@ static CPUWriteMemoryFunc * const msix_mmio_write[] = { > }; > > static CPUReadMemoryFunc * const msix_mmio_read[] = { > -msix_mmio_read_unallowed, msix_mmio_read_unallowed, msix_mmio_readl > +msix_mmio_readb, msix_mmio_readw, msix_mmio_readl > }; > > /* Should be called from device's map method. */ > -- > 1.7.2.1
Re: [Qemu-devel] [PATCH V6 04/15] Introduce -accel command option.
On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote: > From: Anthony PERARD > > This option gives the ability to switch one "accelerator" like kvm, xen > or the default one tcg. We can specify more than one accelerator by > separate them by a comma. QEMU will try each one and use the first whose > works. > > So, > > -accel xen,kvm,tcg > > which would try Xen support first, then KVM and finaly tcg if none of > the other works. > > Signed-off-by: Anthony PERARD > --- > qemu-options.hx | 10 ++ > vl.c| 86 --- > 2 files changed, 85 insertions(+), 11 deletions(-) > > diff --git a/qemu-options.hx b/qemu-options.hx > index 718d47a..ba8385b 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -1925,6 +1925,16 @@ Enable KVM full virtualization support. This option is > only available > if KVM support is enabled when compiling. > ETEXI > > +DEF("accel", HAS_ARG, QEMU_OPTION_accel, \ > +"-accel acceluse an accelerator (kvm,xen,tcg), default is tcg\n", > QEMU_ARCH_ALL) > +STEXI > +...@item -accel @var{accel}[,@var{accel}[,...]] > +...@findex -accel > +This is use to enable an accelerator, in kvm,xen,tcg. > +By default, it use only tcg. If there a more than one accelerator > +specified, the next one is used if the first don't work. > +ETEXI > + > DEF("xen-domid", HAS_ARG, QEMU_OPTION_xen_domid, > "-xen-domid id specify xen guest domain id\n", QEMU_ARCH_ALL) > DEF("xen-create", 0, QEMU_OPTION_xen_create, > diff --git a/vl.c b/vl.c > index df414ef..40a26ee 100644 > --- a/vl.c > +++ b/vl.c > @@ -1750,6 +1750,74 @@ static int debugcon_parse(const char *devname) > return 0; > } > > +static struct { > +const char *opt_name; > +const char *name; > +int (*available)(void); > +int (*init)(int smp_cpus); > +int *allowed; > +} accel_list[] = { > +{ "tcg", "tcg", NULL, NULL, NULL }, > +{ "kvm", "KVM", kvm_available, kvm_init, &kvm_allowed }, Thinking about this a bit more ... kvm_available is a function pointer that gets #defined to (0) when we don't have KVM available. I can imagine some compiler might throw a warning on us for this one day. Is there a valid reason not to do static inline int kvm_enabled() { #ifdef CONFIG_KVM return kvm_allowed; #else return 0; #endif } That should compile into the exact same code but be valid for function pointers. > +}; > + > +static int accel_parse_init(const char *opts) > +{ > +const char *p = opts; > +char buf[10]; > +int i, ret; > +bool accel_initalised = 0; > +bool init_failed = 0; > + > +while (!accel_initalised && *p != '\0') { > +if (*p == ',') { > +p++; > +} > +p = get_opt_name(buf, sizeof (buf), p, ','); > +for (i = 0; i < ARRAY_SIZE(accel_list); i++) { > +if (strcmp(accel_list[i].opt_name, buf) == 0) { > +if (accel_list[i].init) { If you'd make the function pointers mandatory, you could also get rid of a lot of checks here. Just introduce a new small stub helper for tcg_init and tcg_allowed and always call the pointers. I take back my Acked-by. Sorry, I guess I should have read things in more detail first O_o. I still believe that this patch can go in before the others, but I'd at least like to see some comments on the (0) pointer thing :). Alex
Re: [Qemu-devel] [PATCH V6 05/15] xen: Add xen in -accel option.
On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote: > From: Anthony PERARD > > This come with the initialisation of Xen. > > Signed-off-by: Anthony PERARD > --- > Makefile.target |5 + > hw/xen.h| 10 ++ > vl.c|2 ++ > xen-all.c | 25 + > xen-stub.c | 17 + > 5 files changed, 59 insertions(+), 0 deletions(-) > create mode 100644 xen-all.c > create mode 100644 xen-stub.c > > diff --git a/Makefile.target b/Makefile.target > index ddabc8a..644cafa 100644 > --- a/Makefile.target > +++ b/Makefile.target > @@ -2,6 +2,7 @@ > > GENERATED_HEADERS = config-target.h > CONFIG_NO_KVM = $(if $(subst n,,$(CONFIG_KVM)),n,y) > +CONFIG_NO_XEN = $(if $(subst n,,$(CONFIG_XEN)),n,y) > > include ../config-host.mak > include config-devices.mak > @@ -185,6 +186,10 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS) > # xen backend driver support > obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o > > +# xen support > +obj-$(CONFIG_XEN) += xen-all.o > +obj-$(CONFIG_NO_XEN) += xen-stub.o > + > # xen full virtualized machine > obj-i386-$(CONFIG_XEN) += xen_machine_fv.o > > diff --git a/hw/xen.h b/hw/xen.h > index 780dcf7..14bbb6e 100644 > --- a/hw/xen.h > +++ b/hw/xen.h > @@ -18,4 +18,14 @@ enum xen_mode { > extern uint32_t xen_domid; > extern enum xen_mode xen_mode; > > +extern int xen_allowed; > + > +#if defined CONFIG_XEN > +#define xen_enabled() (xen_allowed) > +#else > +#define xen_enabled() (0) > +#endif Please see my comment on patch 04. I don't think this preprocessor magic buys us anything over static inline function definitions. And with those we can get rid of a lot of pitfalls. Alex
Re: [Qemu-devel] Add support for SystemTAP and DTrace tracing backends (v6)
On Fri, Nov 12, 2010 at 1:20 PM, Daniel P. Berrange wrote: > An update of the SystemTAP/DTrace patches from > > http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg00563.html > > The first patch contains the generic DTrace tracing backend support. > The second patch contains additional pieces for SystemTAP to generate > a tapset for each of the qemu-system-XXX binaries, to simplify life > for admins wanting to use the tracing backend. > > Changed in v6: > > - Fix handling of probes with no-args in DTrace provider > - Generate one tapset per target, for both system and user > emulators > - Re-write command line arg parsing in tracetool so it > doesn't assume a particular ordering of args > - Change '-s' to '--stap' in tracetool to avoid confusion > - Pass full binary name into tracetool Both patches look good. It would be nice to merge this. Stefan
Re: [Qemu-devel] [PATCH V6 06/15] xen: Add the Xen platform pci device
On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote: > From: Steven Smith > > Introduce a new emulated PCI device, specific to fully virtualized Xen > guests. The device is necessary for PV on HVM drivers to work. > > Signed-off-by: Steven Smith > Signed-off-by: Anthony PERARD > Signed-off-by: Stefano Stabellini > --- > Makefile.target |1 + > hw/hw.h |3 + > hw/pci_ids.h|2 + > hw/xen_machine_fv.c |3 + > hw/xen_platform.c | 431 +++ > hw/xen_platform.h |8 + > 6 files changed, 448 insertions(+), 0 deletions(-) > create mode 100644 hw/xen_platform.c > create mode 100644 hw/xen_platform.h > > diff --git a/Makefile.target b/Makefile.target > index 644cafa..db84edb 100644 > --- a/Makefile.target > +++ b/Makefile.target > @@ -192,6 +192,7 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o > > # xen full virtualized machine > obj-i386-$(CONFIG_XEN) += xen_machine_fv.o > +obj-i386-$(CONFIG_XEN) += xen_platform.o > > # USB layer > obj-$(CONFIG_USB_OHCI) += usb-ohci.o > diff --git a/hw/hw.h b/hw/hw.h > index 4405092..67f3369 100644 > --- a/hw/hw.h > +++ b/hw/hw.h > @@ -653,6 +653,9 @@ extern const VMStateDescription vmstate_i2c_slave; > #define VMSTATE_INT32_LE(_f, _s) \ > VMSTATE_SINGLE(_f, _s, 0, vmstate_info_int32_le, int32_t) > > +#define VMSTATE_UINT8_TEST(_f, _s, _t) \ > +VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint8, uint8_t) > + > #define VMSTATE_UINT16_TEST(_f, _s, _t) \ > VMSTATE_SINGLE_TEST(_f, _s, _t, 0, vmstate_info_uint16, uint16_t) > > diff --git a/hw/pci_ids.h b/hw/pci_ids.h > index 39e9f1d..1f2e0dd 100644 > --- a/hw/pci_ids.h > +++ b/hw/pci_ids.h > @@ -105,3 +105,5 @@ > #define PCI_DEVICE_ID_INTEL_82371AB 0x7111 > #define PCI_DEVICE_ID_INTEL_82371AB_20x7112 > #define PCI_DEVICE_ID_INTEL_82371AB_30x7113 > + > +#define PCI_VENDOR_ID_XENSOURCE 0x5853 > diff --git a/hw/xen_machine_fv.c b/hw/xen_machine_fv.c > index 260cda3..39ee7c7 100644 > --- a/hw/xen_machine_fv.c > +++ b/hw/xen_machine_fv.c > @@ -35,6 +35,7 @@ > > #include "xen_common.h" > #include "xen/hvm/hvm_info_table.h" > +#include "xen_platform.h" > > #define MAX_IDE_BUS 2 > > @@ -88,6 +89,8 @@ static void xen_init_fv(ram_addr_t ram_size, > > pc_vga_init(pci_bus); > > +pci_xen_platform_init(pci_bus); > + > /* init basic PC hardware */ > pc_basic_device_init(isa_irq, &floppy_controller, &rtc_state); > > diff --git a/hw/xen_platform.c b/hw/xen_platform.c > new file mode 100644 > index 000..7551c81 > --- /dev/null > +++ b/hw/xen_platform.c > @@ -0,0 +1,431 @@ > +/* > + * XEN platform pci device, formerly known as the event channel device > + * > + * Copyright (c) 2003-2004 Intel Corp. > + * Copyright (c) 2006 XenSource > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > copy > + * of this software and associated documentation files (the "Software"), to > deal > + * in the Software without restriction, including without limitation the > rights > + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell > + * copies of the Software, and to permit persons to whom the Software is > + * furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > FROM, > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN > + * THE SOFTWARE. > + */ > + > +#include "hw.h" > +#include "pc.h" > +#include "pci.h" > +#include "irq.h" > +#include "xen_common.h" > +#include "net.h" > +#include "xen_platform.h" > +#include "xen_backend.h" > +#include "qemu-log.h" > +#include "rwhandler.h" > + > +#include > +#include > + > +//#define DEBUG_PLATFORM > + > +#ifdef DEBUG_PLATFORM > +#define DPRINTF(fmt, ...) do { \ > +fprintf(stderr, "xen_platform: " fmt, ## __VA_ARGS__); \ > +} while (0) > +#else > +#define DPRINTF(fmt, ...) do { } while (0) > +#endif > + > +#define PFFLAG_ROM_LOCK 1 /* Sets whether ROM memory area is RW or RO */ > + > +typedef struct PCIXenPlatformState { > +PCIDevice pci_dev; > +uint8_t flags; /* used only for version_id == 2 */ > +int drivers_blacklisted; > +uint16_t driver_product_version; > + > +/* Log from guest drivers */ > +int throttling_disabled; > +char log_buffer[4096]; > +int log_buffer_off; > +} PCIXenPlatformState; > + > +#define XEN_P
Re: [Qemu-devel] [PATCH V6 03/15] xen: Add xen_machine_fv
Am 15.11.2010 11:35, schrieb Alexander Graf: > > On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote: > >> From: Anthony PERARD >> >> Add the Xen FV (Fully Virtualized) machine to Qemu; >> this is groundwork to add Xen device model support in Qemu. >> >> Signed-off-by: Anthony PERARD >> Signed-off-by: Stefano Stabellini >> --- >> Makefile.target |3 + >> hw/xen_common.h |5 ++ >> hw/xen_machine_fv.c | 158 >> +++ >> 3 files changed, 166 insertions(+), 0 deletions(-) >> create mode 100644 hw/xen_machine_fv.c >> >> diff --git a/Makefile.target b/Makefile.target >> index c48cbcc..ddabc8a 100644 >> --- a/Makefile.target >> +++ b/Makefile.target >> @@ -185,6 +185,9 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS) >> # xen backend driver support >> obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o >> >> +# xen full virtualized machine >> +obj-i386-$(CONFIG_XEN) += xen_machine_fv.o >> + >> # USB layer >> obj-$(CONFIG_USB_OHCI) += usb-ohci.o >> >> diff --git a/hw/xen_common.h b/hw/xen_common.h >> index 9f75e52..4c0f97d 100644 >> --- a/hw/xen_common.h >> +++ b/hw/xen_common.h >> @@ -18,6 +18,11 @@ >> * We don't support Xen prior to 3.3.0. >> */ >> >> +/* Before Xen 4.0.0 */ >> +#if CONFIG_XEN_CTRL_INTERFACE_VERSION < 400 >> +# define HVM_MAX_VCPUS 32 >> +#endif >> + >> /* Xen unstable */ >> #if CONFIG_XEN_CTRL_INTERFACE_VERSION < 410 >> typedef int qemu_xc_interface; >> diff --git a/hw/xen_machine_fv.c b/hw/xen_machine_fv.c >> new file mode 100644 >> index 000..260cda3 >> --- /dev/null >> +++ b/hw/xen_machine_fv.c >> @@ -0,0 +1,158 @@ >> +/* >> + * QEMU Xen FV Machine >> + * >> + * Copyright (c) 2003-2007 Fabrice Bellard >> + * Copyright (c) 2007 Red Hat > > Shouldn't there be Citrix in there? > >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> copy >> + * of this software and associated documentation files (the "Software"), to >> deal >> + * in the Software without restriction, including without limitation the >> rights >> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell >> + * copies of the Software, and to permit persons to whom the Software is >> + * furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice shall be included >> in >> + * all copies or substantial portions of the Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS >> OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR >> OTHER >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING >> FROM, >> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN >> + * THE SOFTWARE. > > So all this code has always been public domain? There are no GPL'ed pieces in > there? Public domain is something entirely different. This is just a permissive license, and it's used a lot throughout qemu. So as you state that this file is mostly a copy of pc.c which uses this license, at least for the copied part it's not only okay, but even required to use the same license here. Kevin
Re: [Qemu-devel] Re: [PATCH 2/3] virtio-pci: Use ioeventfd for virtqueue notify
On Sun, Nov 14, 2010 at 12:19 PM, Avi Kivity wrote: > On 11/14/2010 01:05 PM, Avi Kivity wrote: >>> >>> I agree, but let's enable virtio-ioeventfd carefully because bad code >>> is out there. >> >> >> Sure. Note as long as the thread waiting on ioeventfd doesn't consume too >> much cpu, it will awaken quickly and we won't have the "transaction per >> timeslice" effect. >> >> btw, what about virtio-blk with linux-aio? Have you benchmarked that with >> and without ioeventfd? >> > > And, what about efficiency? As in bits/cycle? We are running benchmarks with this latest patch and will report results. Stefan
[Qemu-devel] [PATCH] device-assignment: register a reset function
This is necessary because during reboot of a VM the assigned devices continue DMA transfers which causes memory corruption. Signed-off-by: Thomas Ostler Signed-off-by: Bernhard Kohl --- Sorry for for the long delay. Finally we added Alex' suggestions and rebased the patch. Thanks Bernhard --- hw/device-assignment.c | 12 1 files changed, 12 insertions(+), 0 deletions(-) diff --git a/hw/device-assignment.c b/hw/device-assignment.c index 5f5bde1..3f8de66 100644 --- a/hw/device-assignment.c +++ b/hw/device-assignment.c @@ -1434,6 +1434,17 @@ static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev) dev->msix_table_page = NULL; } +static void reset_assigned_device(DeviceState *dev) +{ +PCIDevice *d = DO_UPCAST(PCIDevice, qdev, dev); +uint32_t conf; + +/* reset the bus master bit to avoid further DMA transfers */ +conf = assigned_dev_pci_read_config(d, PCI_COMMAND, 2); +conf &= ~PCI_COMMAND_MASTER; +assigned_dev_pci_write_config(d, PCI_COMMAND, conf, 2); +} + static int assigned_initfn(struct PCIDevice *pci_dev) { AssignedDevice *dev = DO_UPCAST(AssignedDevice, dev, pci_dev); @@ -1544,6 +1555,7 @@ static PCIDeviceInfo assign_info = { .qdev.name= "pci-assign", .qdev.desc= "pass through host pci devices to the guest", .qdev.size= sizeof(AssignedDevice), +.qdev.reset = reset_assigned_device, .init = assigned_initfn, .exit = assigned_exitfn, .config_read = assigned_dev_pci_read_config, -- 1.7.2.3
Re: [Qemu-devel] [PATCH V6 09/15] xen: Introduce the Xen mapcache
On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote: > From: Jun Nakajima > > On IA32 host or IA32 PAE host, at present, generally, we can't create > an HVM guest with more than 2G memory, because generally it's almost > impossible for Qemu to find a large enough and consecutive virtual > address space to map an HVM guest's whole physical address space. > The attached patch fixes this issue using dynamic mapping based on > little blocks of memory. > > Each call to qemu_get_ram_ptr makes a call to qemu_map_cache with the > lock option, so mapcache will not unmap these ram_ptr. > > Signed-off-by: Jun Nakajima > Signed-off-by: Anthony PERARD > Signed-off-by: Stefano Stabellini > --- > Makefile.target |3 + > configure |3 + > exec.c | 40 ++- > hw/xen.h| 10 ++ > hw/xen_common.h |2 + > xen-all.c | 64 +++ > xen-mapcache-stub.c | 33 ++ > xen-mapcache.c | 301 +++ > xen-mapcache.h | 14 +++ > xen-stub.c |4 + > 10 files changed, 470 insertions(+), 4 deletions(-) > create mode 100644 xen-mapcache-stub.c > create mode 100644 xen-mapcache.c > create mode 100644 xen-mapcache.h > > diff --git a/Makefile.target b/Makefile.target > index db84edb..5646582 100644 > --- a/Makefile.target > +++ b/Makefile.target > @@ -187,8 +187,11 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS) > obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o > > # xen support > +CONFIG_NO_XEN_MAPCACHE = $(if $(subst n,,$(CONFIG_XEN_MAPCACHE)),n,y) > obj-$(CONFIG_XEN) += xen-all.o > obj-$(CONFIG_NO_XEN) += xen-stub.o > +obj-$(CONFIG_XEN_MAPCACHE) += xen-mapcache.o > +obj-$(CONFIG_NO_XEN_MAPCACHE) += xen-mapcache-stub.o > > # xen full virtualized machine > obj-i386-$(CONFIG_XEN) += xen_machine_fv.o > diff --git a/configure b/configure > index f6a7073..d5a8553 100755 > --- a/configure > +++ b/configure > @@ -2943,6 +2943,9 @@ case "$target_arch2" in > i386|x86_64) > if test "$xen" = "yes" -a "$target_softmmu" = "yes" ; then > echo "CONFIG_XEN=y" >> $config_target_mak > + if test "$cpu" = "i386" -o "$cpu" = "x86_64"; then > + echo "CONFIG_XEN_MAPCACHE=y" >> $config_target_mak > + fi > fi > esac > case "$target_arch2" in > diff --git a/exec.c b/exec.c > index 631d8c5..d2cded6 100644 > --- a/exec.c > +++ b/exec.c > @@ -39,6 +39,7 @@ > #include "hw/qdev.h" > #include "osdep.h" > #include "kvm.h" > +#include "hw/xen.h" > #include "qemu-timer.h" > #if defined(CONFIG_USER_ONLY) > #include > @@ -58,6 +59,8 @@ > #include > #endif > #endif > +#else /* !CONFIG_USER_ONLY */ > +#include "xen-mapcache.h" > #endif > > //#define DEBUG_TB_INVALIDATE > @@ -2834,6 +2837,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, > const char *name, > } > } > > +new_block->offset = find_ram_offset(size); > if (host) { > new_block->host = host; > } else { > @@ -2855,13 +2859,15 @@ ram_addr_t qemu_ram_alloc_from_ptr(DeviceState *dev, > const char *name, >PROT_EXEC|PROT_READ|PROT_WRITE, >MAP_SHARED | MAP_ANONYMOUS, -1, 0); > #else > -new_block->host = qemu_vmalloc(size); > +if (xen_mapcache_enabled()) { > +xen_ram_alloc(new_block->offset, size); > +} else { > +new_block->host = qemu_vmalloc(size); > +} > #endif > qemu_madvise(new_block->host, size, QEMU_MADV_MERGEABLE); > } > } > - > -new_block->offset = find_ram_offset(size); > new_block->length = size; > > QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next); > @@ -2902,7 +2908,11 @@ void qemu_ram_free(ram_addr_t addr) > #if defined(TARGET_S390X) && defined(CONFIG_KVM) > munmap(block->host, block->length); > #else > -qemu_vfree(block->host); > +if (xen_mapcache_enabled()) { > +qemu_invalidate_entry(block->host); > +} else { > +qemu_vfree(block->host); > +} > #endif > } > qemu_free(block); > @@ -2928,6 +2938,15 @@ void *qemu_get_ram_ptr(ram_addr_t addr) > if (addr - block->offset < block->length) { > QLIST_REMOVE(block, next); > QLIST_INSERT_HEAD(&ram_list.blocks, block, next); > +if (xen_mapcache_enabled()) { > +/* We need to check if the requested address is in the RAM > + * because we don't want to map the entire memory in QEMU. > + */ > +if (block->offset == 0) { > +return qemu_map_cache(addr, 0, 1); > +} > +block->host = qemu_map_cache(block->offset, block->length, > 1); > +} > return block->host + (addr - block->offset); > } > } > @@ -2944,11 +2963,21 @@ int qemu_ram
[Qemu-devel] [PATCH] make-release: fix mtime for a wider range of git versions
With the latest git versions, e.g. 1.7.2.3, git still prints out the tag info in addition to the requested format. So let's simply fetch the first line from the output. In addition I use the --pretty option instead of --format which is not recognized in very old git versions, e.g. 1.5.5.6. Tested with git versions 1.5.5.6 and 1.7.2.3. Signed-off-by: Bernhard Kohl --- kvm/scripts/make-release |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/kvm/scripts/make-release b/kvm/scripts/make-release index 56302c3..2d050fc 100755 --- a/kvm/scripts/make-release +++ b/kvm/scripts/make-release @@ -51,7 +51,7 @@ cd "$(dirname "$0")"/../.. mkdir -p "$(dirname "$tarball")" git archive --prefix="$name/" --format=tar "$commit" > "$tarball" -mtime=`git show --format=%ct "$commit""^{commit}" --` +mtime=`git show --pretty=format:%ct "$commit""^{commit}" -- | head -n 1` tarargs="--owner=root --group=root" mkdir -p "$tmpdir/$name" -- 1.7.2.3
Re: [Qemu-devel] How to make shadow memory for a process? and how to trace the data propation from the instruction level in QEMU?
Mulyadi Santosa writes: >> Yes, I have read that paper, it’s wonderful! >> >> Besides the Argos, the bitblaze group, led by Dawn Song in Berkeley, has >> achieved great success in the taint analysis. The website about their >> dynamic analysis work (called TEMU) can be found at: >> http://bitblaze.cs.berkeley.edu/temu.html >> >> And TEMU is now open-source. > Thanks for sharing that...it's new stuff for me. So, why don't you > just pick TEMU and improve it instead of...uhm...sorry if I am wrong, > working from scratch? After all, I believe in both Argos and TEMU (and > maybe other similar projects), they share common codes here and there. > But ehm...CMIIW, seems like TEMU is based on Qemu 0.9,x, right? So > it's sorry I forgot the name, the generated code is mostly a > constructed by fragments of small codes generated by gcc. Now, it is > qemu which does it by itself. So, a lot of things change > (substantially). I haven't read the TEMU work, but from the problem description I think you want something similar to "Practical Taint-Based Protection using Demand Emulation" or many others (I remember reading some of them a few years ago on the ISCA, MICRO and/or ASPLOS conferences). >> Yes. For each process’s memory space A, I wanna make a shadow memory B. The >> shadow memory is used to store the tag of data. In other words, if addr in >> memory A is tainted, then the corresponding byte in B should be marked to >> indicate that addr in A is tainted. The main question here is... what is the granularity that you want to track with? Bytes? Words? Pages? This will greatly influence which is your best approach. Now that I think of it, you could use the tracing points I sent for guest virtual memory accesses, and instrument them instead of calling a file-tracing backend (this should provide a hook for an arbitrary granularity). Then, simply keep track also of address-space changes and your instrumentation code can always know when to activate propagation. This, together with the optimization I sent for dynamic control of trace generation in TCG emulation code should get you on tracks. Of course, you should still modify all register-accessing instructions to propagate information passing through the register set. For that, maybe you could start with the "fetch" tracing/instrumentation point I sent long time ago, which keeps track of general-purpose register usage/definition on x86 (although I'm sure I left some astray usages due to the decoding complexity in x86). >> The guest os collects “higher” semantic >> from the OS level, and the QEMU collects “lower” semantic from the >> instruction level. Combination of both semantics is necessary in the >> analysis process. > The question is, in a situation where malware already compromise "the > higher semantic", could we trust the analysis? Beware, I've read exactly this kind of scheme on previous top-tier conferences (but I think tests were using an architectural simulator, so it's not for a current production environment). I've found it :) Secure program execution via dynamic information flow tracking ASPLOS 2004 >> The question is: how to communicate between the QEMU and the guest OS, so >> that they can cooperate with each other? A few choices here, but you should first define if the communication must be based just on control signals, and/or providing memory storage: * virtual device : If you need some kind of storage that the guest OS must access, you could look at the ivshmem device * backdoor instruction : It's the simplest option; I sent some patch series recently with two different implementations for x86. Lluis -- "And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer." -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth
[Qemu-devel] [PATCH] audio: disable timer when idle.
This patch makes sure the audio timer gets disabled when no voice is active. It adds a variable to track whenever the timer is needed or not to the global audio state. audio_timer_reset() will set that variable. audio_timer() will check that variable and only re-arm the timer when needed. One additional call to audio_timer_reset() has been added to the output voice disable code path. Signed-off-by: Gerd Hoffmann --- audio/audio.c |6 +- audio/audio_int.h |1 + 2 files changed, 6 insertions(+), 1 deletions(-) diff --git a/audio/audio.c b/audio/audio.c index ade342e..9cf91ae 100644 --- a/audio/audio.c +++ b/audio/audio.c @@ -1101,7 +1101,8 @@ static void audio_timer (void *opaque) AudioState *s = opaque; audio_run ("timer"); -qemu_mod_timer (s->ts, qemu_get_clock (vm_clock) + conf.period.ticks); +if (s->need_timer) +qemu_mod_timer (s->ts, qemu_get_clock (vm_clock) + conf.period.ticks); } @@ -1124,9 +1125,11 @@ static void audio_reset_timer (void) AudioState *s = &glob_audio_state; if (audio_is_timer_needed ()) { +s->need_timer = 1; qemu_mod_timer (s->ts, qemu_get_clock (vm_clock) + 1); } else { +s->need_timer = 0; qemu_del_timer (s->ts); } } @@ -1380,6 +1383,7 @@ static void audio_run_out (AudioState *s) sc->sw.active = 0; audio_recalc_and_notify_capture (sc->cap); } +audio_reset_timer (); continue; } diff --git a/audio/audio_int.h b/audio/audio_int.h index d66f2c3..df060a9 100644 --- a/audio/audio_int.h +++ b/audio/audio_int.h @@ -190,6 +190,7 @@ struct AudioState { void *drv_opaque; QEMUTimer *ts; +int need_timer; QLIST_HEAD (card_listhead, QEMUSoundCard) card_head; QLIST_HEAD (hw_in_listhead, HWVoiceIn) hw_head_in; QLIST_HEAD (hw_out_listhead, HWVoiceOut) hw_head_out; -- 1.7.1
Re: [Qemu-devel] [PATCH] ARM: semi-hosting support for stderr
On 14.11.2010 18:50, Peter Maydell wrote: > Although newlib/libgloss use different modes for opening stdout > and stderr, the ARM C library implementation does not, for > instance; so your patch is relying on a detail of implementation of > a particular semihosting user. So I don't think we can apply this > patch, because it could break other (nonlibgloss) users of > semihosting. > Thanks for your answer. I think we could argue it's a bug in the ARM semihosting spec, so maybe we could request them to change this. However, I am unsure how to do this efficiently, as I recently tried with little success to have them change the "angel_SWIreason_ReportException" such that at least ADP_Stopped_AplicationExit can have the program return code as an extra parameter... (anyway I have another small Qemu patch to handle this situation by returning the contents of r0 when the program entered exit(). I can submit it if someone is willing to review it). Christophe.
Re: [Qemu-devel] [PATCH V6 04/15] Introduce -accel command option.
On 11/15/2010 11:46 AM, Alexander Graf wrote: I take back my Acked-by. Sorry, I guess I should have read things in more detail first O_o. I still believe that this patch can go in before the others, but I'd at least like to see some comments on the (0) pointer thing:). I agree, it kind of works by chance, (0) works but (1) wouldn't. Better to define the function always. Paolo
[Qemu-devel] Re: [PATCHv4 15/15] Pass boot device list to firmware.
On Mon, Nov 15, 2010 at 09:40:08AM +0200, Gleb Natapov wrote: > On Sun, Nov 14, 2010 at 10:40:33PM -0500, Kevin O'Connor wrote: > > Why not just return a newline separated list that is null terminated? > > > Doing it like this will needlessly complicate firmware side. How do you > know how much memory to allocate before reading device list? My preference would be for the size to be exposed via the QEMU_CFG_FILE_DIR selector. (My preference would be for all objects in fw_cfg to have entries in QEMU_CFG_FILE_DIR describing their size in a reliable manner.) -Kevin
[Qemu-devel] Re: [PATCHv4 15/15] Pass boot device list to firmware.
On Mon, Nov 15, 2010 at 08:26:35AM -0500, Kevin O'Connor wrote: > On Mon, Nov 15, 2010 at 09:40:08AM +0200, Gleb Natapov wrote: > > On Sun, Nov 14, 2010 at 10:40:33PM -0500, Kevin O'Connor wrote: > > > Why not just return a newline separated list that is null terminated? > > > > > Doing it like this will needlessly complicate firmware side. How do you > > know how much memory to allocate before reading device list? > > My preference would be for the size to be exposed via the > QEMU_CFG_FILE_DIR selector. (My preference would be for all objects > in fw_cfg to have entries in QEMU_CFG_FILE_DIR describing their size > in a reliable manner.) > Will interface suggested by Blue will be good for you? The one with two fw_cfg ids. BOOTINDEX_LEN for len and BOOTINDEX_DATA for device list. I already changed my implementation to this one. Using FILE_DIR requires us to generate synthetic name. Hmm BTW I do not see proper endianness handling in FILE_DIR. -- Gleb.
[Qemu-devel] Re: [PATCHv4 15/15] Pass boot device list to firmware.
On Mon, Nov 15, 2010 at 03:36:25PM +0200, Gleb Natapov wrote: > Hmm BTW I do not see proper endianness > handling in FILE_DIR. > That's just me. Everything it OK there with endianness. -- Gleb.
Re: [Qemu-devel] [PATCH V6 02/15] xen: Support new libxc calls from xen unstable.
On Mon, 15 Nov 2010, Alexander Graf wrote: > > /* > > - * tweaks needed to build with different xen versions > > - * 0x00030205 -> 3.1.0 > > - * 0x00030207 -> 3.2.0 > > - * 0x00030208 -> unstable > > + * We don't support Xen prior to 3.3.0. > > */ > > -#include > > -#if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030205 > > -# define evtchn_port_or_error_t int > > -#endif > > -#if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030207 > > -# define xc_map_foreign_pages xc_map_foreign_batch > > -#endif > > -#if __XEN_LATEST_INTERFACE_VERSION__ < 0x00030208 > > -# define xen_mb() mb() > > -# define xen_rmb() rmb() > > -# define xen_wmb() wmb() > > + > > +/* Xen unstable */ > > +#if CONFIG_XEN_CTRL_INTERFACE_VERSION < 410 > > +typedef int qemu_xc_interface; > > +# define XC_HANDLER_INITIAL_VALUE -1 > > +# define xc_fd(xen_xc) xen_xc > > +# define xc_interface_open(l, dl, f)xc_interface_open() > > +# define xc_gnttab_open(xc) xc_gnttab_open() > > +# define xc_gnttab_map_grant_ref(xc, gnt, domid, ref, flags) \ > > +xc_gnttab_map_grant_ref(gnt, domid, ref, flags) > > +# define xc_gnttab_map_grant_refs(xc, gnt, count, domids, refs, flags) \ > > +xc_gnttab_map_grant_refs(gnt, count, domids, refs, flags) > > +# define xc_gnttab_munmap(xc, gnt, pages, niov) xc_gnttab_munmap(gnt, > > pages, niov) > > +# define xc_gnttab_close(xc, dev) xc_gnttab_close(dev) > > All these defines are very icky. In my patchset I replace all those libxc > calls by indirect calls through a struct anyways. Maybe it'd make sense to > integrate that part into your series already. > > That way you could make this whole compat stuff a lot cleaner. You could use > static inline functions to create wrappers to the real calls and just assign > those as callbacks in the dispatch struct. > > In general, I'd disagree to the preprocessor approach here. If you're not > going with the libxc dispatch struct patch, I'd at least like to see all > those converted to static inline functions called with different names > internally. > Yeah, I am not a fan of preprocessor magic as well, it is probably worth investigating the indirect calls approach, surely would be easier to read.
Re: [Qemu-devel] [PATCH V6 03/15] xen: Add xen_machine_fv
On Mon, 15 Nov 2010, Kevin Wolf wrote: > >> + * Permission is hereby granted, free of charge, to any person obtaining > >> a copy > >> + * of this software and associated documentation files (the "Software"), > >> to deal > >> + * in the Software without restriction, including without limitation the > >> rights > >> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or > >> sell > >> + * copies of the Software, and to permit persons to whom the Software is > >> + * furnished to do so, subject to the following conditions: > >> + * > >> + * The above copyright notice and this permission notice shall be > >> included in > >> + * all copies or substantial portions of the Software. > >> + * > >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > >> EXPRESS OR > >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > >> MERCHANTABILITY, > >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR > >> OTHER > >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > >> ARISING FROM, > >> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS > >> IN > >> + * THE SOFTWARE. > > > > So all this code has always been public domain? There are no GPL'ed pieces > > in there? > > Public domain is something entirely different. This is just a permissive > license, and it's used a lot throughout qemu. So as you state that this > file is mostly a copy of pc.c which uses this license, at least for the > copied part it's not only okay, but even required to use the same > license here. Indeed, the license above is BSD because the original was BSD, it would be rude to GPL it.
Re: [Qemu-devel] [PATCH] make trace options use autoconfy names
On Sun, Nov 14, 2010 at 1:52 PM, Paolo Bonzini wrote: > On 11/14/2010 02:38 PM, Andreas Färber wrote: >>> >>> - --trace-file=*) trace_file="$optarg" >>> + --enable-trace-file=*) trace_file="$optarg" >>> ;; >> >> but this should be --with-trace-file=... please. It is not being >> enabled, just set to a different value. > > --with-* should be reserved for library paths, but I can change it if people > prefer it that way. Actually I think we have something similar to overriding --prefix here >:). It's a path that you can set at ./configure time. So is it not okay to use --trace-file=? But I know nothing of autoconf and --enable-* or --with-* sort of make sense too. Stefan
Re: [Qemu-devel] [PATCH V6 04/15] Introduce -accel command option.
On Mon, 15 Nov 2010, Alexander Graf wrote: > > On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote: > > > From: Anthony PERARD > > > > This option gives the ability to switch one "accelerator" like kvm, xen > > or the default one tcg. We can specify more than one accelerator by > > separate them by a comma. QEMU will try each one and use the first whose > > works. > > > > So, > > > > -accel xen,kvm,tcg > > > > which would try Xen support first, then KVM and finaly tcg if none of > > the other works. > > > > Signed-off-by: Anthony PERARD > > --- > > qemu-options.hx | 10 ++ > > vl.c| 86 > > --- > > 2 files changed, 85 insertions(+), 11 deletions(-) > > > > diff --git a/qemu-options.hx b/qemu-options.hx > > index 718d47a..ba8385b 100644 > > --- a/qemu-options.hx > > +++ b/qemu-options.hx > > @@ -1925,6 +1925,16 @@ Enable KVM full virtualization support. This option > > is only available > > if KVM support is enabled when compiling. > > ETEXI > > > > +DEF("accel", HAS_ARG, QEMU_OPTION_accel, \ > > +"-accel acceluse an accelerator (kvm,xen,tcg), default is tcg\n", > > QEMU_ARCH_ALL) > > +STEXI > > +...@item -accel @var{accel}[,@var{accel}[,...]] > > +...@findex -accel > > +This is use to enable an accelerator, in kvm,xen,tcg. > > +By default, it use only tcg. If there a more than one accelerator > > +specified, the next one is used if the first don't work. > > +ETEXI > > + > > DEF("xen-domid", HAS_ARG, QEMU_OPTION_xen_domid, > > "-xen-domid id specify xen guest domain id\n", QEMU_ARCH_ALL) > > DEF("xen-create", 0, QEMU_OPTION_xen_create, > > diff --git a/vl.c b/vl.c > > index df414ef..40a26ee 100644 > > --- a/vl.c > > +++ b/vl.c > > @@ -1750,6 +1750,74 @@ static int debugcon_parse(const char *devname) > > return 0; > > } > > > > +static struct { > > +const char *opt_name; > > +const char *name; > > +int (*available)(void); > > +int (*init)(int smp_cpus); > > +int *allowed; > > +} accel_list[] = { > > +{ "tcg", "tcg", NULL, NULL, NULL }, > > +{ "kvm", "KVM", kvm_available, kvm_init, &kvm_allowed }, > > Thinking about this a bit more ... > > kvm_available is a function pointer that gets #defined to (0) when we don't > have KVM available. I can imagine some compiler might throw a warning on us > for this one day. > > Is there a valid reason not to do > > static inline int kvm_enabled() > { > #ifdef CONFIG_KVM > return kvm_allowed; > #else > return 0; > #endif > } > > That should compile into the exact same code but be valid for function > pointers. I will do this change, as well as for the two others patches. > > +}; > > + > > +static int accel_parse_init(const char *opts) > > +{ > > +const char *p = opts; > > +char buf[10]; > > +int i, ret; > > +bool accel_initalised = 0; > > +bool init_failed = 0; > > + > > +while (!accel_initalised && *p != '\0') { > > +if (*p == ',') { > > +p++; > > +} > > +p = get_opt_name(buf, sizeof (buf), p, ','); > > +for (i = 0; i < ARRAY_SIZE(accel_list); i++) { > > +if (strcmp(accel_list[i].opt_name, buf) == 0) { > > +if (accel_list[i].init) { > > If you'd make the function pointers mandatory, you could also get rid of a > lot of checks here. Just introduce a new small stub helper for tcg_init and > tcg_allowed and always call the pointers. Yes, this will make the code more readable. > I take back my Acked-by. Sorry, I guess I should have read things in more > detail first O_o. I still believe that this patch can go in before the > others, but I'd at least like to see some comments on the (0) pointer thing > :). And I will send the patch out of the patch series. Thanks, -- Anthony PERARD
[Qemu-devel] [PATCHv5 06/15] Add get_fw_dev_path callback to IDE bus.
Signed-off-by: Gleb Natapov --- hw/ide/qdev.c | 13 + 1 files changed, 13 insertions(+), 0 deletions(-) diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c index 88ff657..01a181b 100644 --- a/hw/ide/qdev.c +++ b/hw/ide/qdev.c @@ -24,9 +24,12 @@ /* - */ +static char *idebus_get_fw_dev_path(DeviceState *dev); + static struct BusInfo ide_bus_info = { .name = "IDE", .size = sizeof(IDEBus), +.get_fw_dev_path = idebus_get_fw_dev_path, }; void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id) @@ -35,6 +38,16 @@ void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id) idebus->bus_id = bus_id; } +static char *idebus_get_fw_dev_path(DeviceState *dev) +{ +char path[30]; + +snprintf(path, sizeof(path), "%...@%d", qdev_fw_name(dev), + ((IDEBus*)dev->parent_bus)->bus_id); + +return strdup(path); +} + static int ide_qdev_init(DeviceState *qdev, DeviceInfo *base) { IDEDevice *dev = DO_UPCAST(IDEDevice, qdev, qdev); -- 1.7.1
[Qemu-devel] [PATCHv5 07/15] Add get_dev_path callback for system bus.
Prints out mmio or pio used to access child device. Signed-off-by: Gleb Natapov --- hw/pci_host.c |2 ++ hw/sysbus.c | 30 ++ hw/sysbus.h |4 3 files changed, 36 insertions(+), 0 deletions(-) diff --git a/hw/pci_host.c b/hw/pci_host.c index bc5b771..28d45bf 100644 --- a/hw/pci_host.c +++ b/hw/pci_host.c @@ -197,6 +197,7 @@ void pci_host_conf_register_ioport(pio_addr_t ioport, PCIHostState *s) { pci_host_init(s); register_ioport_simple(&s->conf_noswap_handler, ioport, 4, 4); +sysbus_init_ioports(&s->busdev, ioport, 4); } int pci_host_data_register_mmio(PCIHostState *s, int swap) @@ -215,4 +216,5 @@ void pci_host_data_register_ioport(pio_addr_t ioport, PCIHostState *s) register_ioport_simple(&s->data_noswap_handler, ioport, 4, 1); register_ioport_simple(&s->data_noswap_handler, ioport, 4, 2); register_ioport_simple(&s->data_noswap_handler, ioport, 4, 4); +sysbus_init_ioports(&s->busdev, ioport, 4); } diff --git a/hw/sysbus.c b/hw/sysbus.c index d817721..1583bd8 100644 --- a/hw/sysbus.c +++ b/hw/sysbus.c @@ -22,11 +22,13 @@ #include "monitor.h" static void sysbus_dev_print(Monitor *mon, DeviceState *dev, int indent); +static char *sysbus_get_fw_dev_path(DeviceState *dev); struct BusInfo system_bus_info = { .name = "System", .size = sizeof(BusState), .print_dev = sysbus_dev_print, +.get_fw_dev_path = sysbus_get_fw_dev_path, }; void sysbus_connect_irq(SysBusDevice *dev, int n, qemu_irq irq) @@ -106,6 +108,16 @@ void sysbus_init_mmio_cb(SysBusDevice *dev, target_phys_addr_t size, dev->mmio[n].cb = cb; } +void sysbus_init_ioports(SysBusDevice *dev, pio_addr_t ioport, pio_addr_t size) +{ +pio_addr_t i; + +for (i = 0; i < size; i++) { +assert(dev->num_pio < QDEV_MAX_PIO); +dev->pio[dev->num_pio++] = ioport++; +} +} + static int sysbus_device_init(DeviceState *dev, DeviceInfo *base) { SysBusDeviceInfo *info = container_of(base, SysBusDeviceInfo, qdev); @@ -171,3 +183,21 @@ static void sysbus_dev_print(Monitor *mon, DeviceState *dev, int indent) indent, "", s->mmio[i].addr, s->mmio[i].size); } } + +static char *sysbus_get_fw_dev_path(DeviceState *dev) +{ +SysBusDevice *s = sysbus_from_qdev(dev); +char path[40]; +int off; + +off = snprintf(path, sizeof(path), "%s", qdev_fw_name(dev)); + +if (s->num_mmio) { +snprintf(path + off, sizeof(path) - off, "@"TARGET_FMT_plx, + s->mmio[0].addr); +} else if (s->num_pio) { +snprintf(path + off, sizeof(path) - off, "@i%04x", s->pio[0]); +} + +return strdup(path); +} diff --git a/hw/sysbus.h b/hw/sysbus.h index 5980901..e9eb618 100644 --- a/hw/sysbus.h +++ b/hw/sysbus.h @@ -6,6 +6,7 @@ #include "qdev.h" #define QDEV_MAX_MMIO 32 +#define QDEV_MAX_PIO 32 #define QDEV_MAX_IRQ 256 typedef struct SysBusDevice SysBusDevice; @@ -23,6 +24,8 @@ struct SysBusDevice { mmio_mapfunc cb; ram_addr_t iofunc; } mmio[QDEV_MAX_MMIO]; +int num_pio; +pio_addr_t pio[QDEV_MAX_PIO]; }; typedef int (*sysbus_initfn)(SysBusDevice *dev); @@ -45,6 +48,7 @@ void sysbus_init_mmio_cb(SysBusDevice *dev, target_phys_addr_t size, mmio_mapfunc cb); void sysbus_init_irq(SysBusDevice *dev, qemu_irq *p); void sysbus_pass_irq(SysBusDevice *dev, SysBusDevice *target); +void sysbus_init_ioports(SysBusDevice *dev, pio_addr_t ioport, pio_addr_t size); void sysbus_connect_irq(SysBusDevice *dev, int n, qemu_irq irq); -- 1.7.1
[Qemu-devel] [PATCHv5 13/15] Add bootindex for option roms.
Extend -option-rom command to have additional parameter ,bootindex=. Signed-off-by: Gleb Natapov --- hw/loader.c| 16 +++- hw/loader.h|8 hw/multiboot.c |3 ++- hw/ne2000.c|2 +- hw/nseries.c |4 ++-- hw/palm.c |6 +++--- hw/pc.c|7 --- hw/pci.c |2 +- hw/pcnet.c |2 +- qemu-config.c | 17 + sysemu.h |6 +- vl.c | 11 +-- 12 files changed, 60 insertions(+), 24 deletions(-) diff --git a/hw/loader.c b/hw/loader.c index 1e98326..eb198f6 100644 --- a/hw/loader.c +++ b/hw/loader.c @@ -107,7 +107,7 @@ int load_image_targphys(const char *filename, size = get_image_size(filename); if (size > 0) -rom_add_file_fixed(filename, addr); +rom_add_file_fixed(filename, addr, -1); return size; } @@ -557,10 +557,11 @@ static void rom_insert(Rom *rom) } int rom_add_file(const char *file, const char *fw_dir, - target_phys_addr_t addr) + target_phys_addr_t addr, int32_t bootindex) { Rom *rom; int rc, fd = -1; +char devpath[100]; rom = qemu_mallocz(sizeof(*rom)); rom->name = qemu_strdup(file); @@ -605,7 +606,12 @@ int rom_add_file(const char *file, const char *fw_dir, snprintf(fw_file_name, sizeof(fw_file_name), "%s/%s", rom->fw_dir, basename); fw_cfg_add_file(fw_cfg, fw_file_name, rom->data, rom->romsize); +snprintf(devpath, sizeof(devpath), "/r...@%s", fw_file_name); +} else { +snprintf(devpath, sizeof(devpath), "/rom@" TARGET_FMT_plx, addr); } + +add_boot_device_path(bootindex, NULL, devpath); return 0; err: @@ -635,12 +641,12 @@ int rom_add_blob(const char *name, const void *blob, size_t len, int rom_add_vga(const char *file) { -return rom_add_file(file, "vgaroms", 0); +return rom_add_file(file, "vgaroms", 0, -1); } -int rom_add_option(const char *file) +int rom_add_option(const char *file, int32_t bootindex) { -return rom_add_file(file, "genroms", 0); +return rom_add_file(file, "genroms", 0, bootindex); } static void rom_reset(void *unused) diff --git a/hw/loader.h b/hw/loader.h index 1f82fc5..fc6bdff 100644 --- a/hw/loader.h +++ b/hw/loader.h @@ -22,7 +22,7 @@ void pstrcpy_targphys(const char *name, int rom_add_file(const char *file, const char *fw_dir, - target_phys_addr_t addr); + target_phys_addr_t addr, int32_t bootindex); int rom_add_blob(const char *name, const void *blob, size_t len, target_phys_addr_t addr); int rom_load_all(void); @@ -31,8 +31,8 @@ int rom_copy(uint8_t *dest, target_phys_addr_t addr, size_t size); void *rom_ptr(target_phys_addr_t addr); void do_info_roms(Monitor *mon); -#define rom_add_file_fixed(_f, _a) \ -rom_add_file(_f, NULL, _a) +#define rom_add_file_fixed(_f, _a, _i) \ +rom_add_file(_f, NULL, _a, _i) #define rom_add_blob_fixed(_f, _b, _l, _a) \ rom_add_blob(_f, _b, _l, _a) @@ -43,6 +43,6 @@ void do_info_roms(Monitor *mon); #define PC_ROM_SIZE(PC_ROM_MAX - PC_ROM_MIN_VGA) int rom_add_vga(const char *file); -int rom_add_option(const char *file); +int rom_add_option(const char *file, int32_t bootindex); #endif diff --git a/hw/multiboot.c b/hw/multiboot.c index f9097a2..b438019 100644 --- a/hw/multiboot.c +++ b/hw/multiboot.c @@ -325,7 +325,8 @@ int load_multiboot(void *fw_cfg, fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data, sizeof(bootinfo)); -option_rom[nb_option_roms] = "multiboot.bin"; +option_rom[nb_option_roms].name = "multiboot.bin"; +option_rom[nb_option_roms].bootindex = 0; nb_option_roms++; return 1; /* yes, we are multiboot */ diff --git a/hw/ne2000.c b/hw/ne2000.c index f4bbac2..67e0cb0 100644 --- a/hw/ne2000.c +++ b/hw/ne2000.c @@ -742,7 +742,7 @@ static int pci_ne2000_init(PCIDevice *pci_dev) if (!pci_dev->qdev.hotplugged) { static int loaded = 0; if (!loaded) { -rom_add_option("pxe-ne2k_pci.bin"); +rom_add_option("pxe-ne2k_pci.bin", -1); loaded = 1; } } diff --git a/hw/nseries.c b/hw/nseries.c index 04a028d..2f6f473 100644 --- a/hw/nseries.c +++ b/hw/nseries.c @@ -1326,7 +1326,7 @@ static void n8x0_init(ram_addr_t ram_size, const char *boot_device, qemu_register_reset(n8x0_boot_init, s); } -if (option_rom[0] && (boot_device[0] == 'n' || !kernel_filename)) { +if (option_rom[0].name && (boot_device[0] == 'n' || !kernel_filename)) { int rom_size; uint8_t nolo_tags[0x1]; /* No, wait, better start at the ROM. */ @@ -1341,7 +1341,7 @@ static void n8x0_init(ram_addr_t ram_size, const char *boot_device, * * The code above is for loading the `zImage' file from Nokia * images. */ -rom_size =
[Qemu-devel] [PATCHv5 04/15] Add get_fw_dev_path callback to ISA bus in qdev.
Use device ioports to create unique device path. Signed-off-by: Gleb Natapov --- hw/isa-bus.c | 16 1 files changed, 16 insertions(+), 0 deletions(-) diff --git a/hw/isa-bus.c b/hw/isa-bus.c index c0ac7e9..c423c1b 100644 --- a/hw/isa-bus.c +++ b/hw/isa-bus.c @@ -31,11 +31,13 @@ static ISABus *isabus; target_phys_addr_t isa_mem_base = 0; static void isabus_dev_print(Monitor *mon, DeviceState *dev, int indent); +static char *isabus_get_fw_dev_path(DeviceState *dev); static struct BusInfo isa_bus_info = { .name = "ISA", .size = sizeof(ISABus), .print_dev = isabus_dev_print, +.get_fw_dev_path = isabus_get_fw_dev_path, }; ISABus *isa_bus_new(DeviceState *dev) @@ -188,4 +190,18 @@ static void isabus_register_devices(void) sysbus_register_withprop(&isabus_bridge_info); } +static char *isabus_get_fw_dev_path(DeviceState *dev) +{ +ISADevice *d = (ISADevice*)dev; +char path[40]; +int off; + +off = snprintf(path, sizeof(path), "%s", qdev_fw_name(dev)); +if (d->nioports) { +snprintf(path + off, sizeof(path) - off, "@%04x", d->ioports[0]); +} + +return strdup(path); +} + device_init(isabus_register_devices) -- 1.7.1
[Qemu-devel] [PATCHv5 01/15] Introduce fw_name field to DeviceInfo structure.
Add "fw_name" to DeviceInfo to use in device path building. In contrast to "name" "fw_name" should refer to functionality device provides instead of particular device model like "name" does. Signed-off-by: Gleb Natapov --- hw/fdc.c|1 + hw/ide/isa.c|1 + hw/ide/qdev.c |1 + hw/isa-bus.c|1 + hw/lance.c |1 + hw/piix_pci.c |1 + hw/qdev.h |6 ++ hw/usb-hub.c|1 + hw/usb-net.c|1 + hw/virtio-pci.c |1 + 10 files changed, 15 insertions(+), 0 deletions(-) diff --git a/hw/fdc.c b/hw/fdc.c index c159dcb..a467c4b 100644 --- a/hw/fdc.c +++ b/hw/fdc.c @@ -2040,6 +2040,7 @@ static const VMStateDescription vmstate_isa_fdc ={ static ISADeviceInfo isa_fdc_info = { .init = isabus_fdc_init1, .qdev.name = "isa-fdc", +.qdev.fw_name = "fdc", .qdev.size = sizeof(FDCtrlISABus), .qdev.no_user = 1, .qdev.vmsd = &vmstate_isa_fdc, diff --git a/hw/ide/isa.c b/hw/ide/isa.c index 6b57e0d..9856435 100644 --- a/hw/ide/isa.c +++ b/hw/ide/isa.c @@ -98,6 +98,7 @@ ISADevice *isa_ide_init(int iobase, int iobase2, int isairq, static ISADeviceInfo isa_ide_info = { .qdev.name = "isa-ide", +.qdev.fw_name = "ide", .qdev.size = sizeof(ISAIDEState), .init = isa_ide_initfn, .qdev.reset = isa_ide_reset, diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c index 0808760..6d27b60 100644 --- a/hw/ide/qdev.c +++ b/hw/ide/qdev.c @@ -134,6 +134,7 @@ static int ide_drive_initfn(IDEDevice *dev) static IDEDeviceInfo ide_drive_info = { .qdev.name = "ide-drive", +.qdev.fw_name = "drive", .qdev.size = sizeof(IDEDrive), .init = ide_drive_initfn, .qdev.props = (Property[]) { diff --git a/hw/isa-bus.c b/hw/isa-bus.c index 4e306de..26036e0 100644 --- a/hw/isa-bus.c +++ b/hw/isa-bus.c @@ -153,6 +153,7 @@ static int isabus_bridge_init(SysBusDevice *dev) static SysBusDeviceInfo isabus_bridge_info = { .init = isabus_bridge_init, .qdev.name = "isabus-bridge", +.qdev.fw_name = "isa", .qdev.size = sizeof(SysBusDevice), .qdev.no_user = 1, }; diff --git a/hw/lance.c b/hw/lance.c index dc12144..1a3bb1a 100644 --- a/hw/lance.c +++ b/hw/lance.c @@ -141,6 +141,7 @@ static void lance_reset(DeviceState *dev) static SysBusDeviceInfo lance_info = { .init = lance_init, .qdev.name = "lance", +.qdev.fw_name = "ethernet", .qdev.size = sizeof(SysBusPCNetState), .qdev.reset = lance_reset, .qdev.vmsd = &vmstate_lance, diff --git a/hw/piix_pci.c b/hw/piix_pci.c index b5589b9..38f9d9e 100644 --- a/hw/piix_pci.c +++ b/hw/piix_pci.c @@ -365,6 +365,7 @@ static PCIDeviceInfo i440fx_info[] = { static SysBusDeviceInfo i440fx_pcihost_info = { .init = i440fx_pcihost_initfn, .qdev.name= "i440FX-pcihost", +.qdev.fw_name = "pci", .qdev.size= sizeof(I440FXState), .qdev.no_user = 1, }; diff --git a/hw/qdev.h b/hw/qdev.h index 579328a..9f90efe 100644 --- a/hw/qdev.h +++ b/hw/qdev.h @@ -139,6 +139,7 @@ typedef void (*qdev_resetfn)(DeviceState *dev); struct DeviceInfo { const char *name; +const char *fw_name; const char *alias; const char *desc; size_t size; @@ -288,6 +289,11 @@ void qdev_prop_set_defaults(DeviceState *dev, Property *props); void qdev_prop_register_global_list(GlobalProperty *props); void qdev_prop_set_globals(DeviceState *dev); +static inline const char *qdev_fw_name(DeviceState *dev) +{ +return dev->info->fw_name ? : dev->info->alias ? : dev->info->name; +} + /* This is a nasty hack to allow passing a NULL bus to qdev_create. */ extern struct BusInfo system_bus_info; diff --git a/hw/usb-hub.c b/hw/usb-hub.c index 2a1edfc..8e3a96b 100644 --- a/hw/usb-hub.c +++ b/hw/usb-hub.c @@ -545,6 +545,7 @@ static int usb_hub_initfn(USBDevice *dev) static struct USBDeviceInfo hub_info = { .product_desc = "QEMU USB Hub", .qdev.name = "usb-hub", +.qdev.fw_name= "hub", .qdev.size = sizeof(USBHubState), .init = usb_hub_initfn, .handle_packet = usb_hub_handle_packet, diff --git a/hw/usb-net.c b/hw/usb-net.c index 70f9263..2287ee1 100644 --- a/hw/usb-net.c +++ b/hw/usb-net.c @@ -1496,6 +1496,7 @@ static USBDevice *usb_net_init(const char *cmdline) static struct USBDeviceInfo net_info = { .product_desc = "QEMU USB Network Interface", .qdev.name = "usb-net", +.qdev.fw_name= "network", .qdev.size = sizeof(USBNetState), .init = usb_net_initfn, .handle_packet = usb_generic_handle_packet, diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c index 729917d..be2c92f 100644 --- a/hw/virtio-pci.c +++ b/hw/virtio-pci.c @@ -697,6 +697,7 @@ static int virtio_9p_init_pci(PCIDevice *pci_dev) static PCIDeviceInfo virtio_info[] = { { .qdev.name = "virtio-blk-pci", +.qdev.alias = "virtio-blk", .qdev.size = sizeof(VirtIOPCIProxy), .init = virtio_
[Qemu-devel] [PATCHv5 10/15] Add get_dev_path callback for usb bus.
Signed-off-by: Gleb Natapov --- hw/usb-bus.c | 42 ++ 1 files changed, 42 insertions(+), 0 deletions(-) diff --git a/hw/usb-bus.c b/hw/usb-bus.c index 256b881..8b4583c 100644 --- a/hw/usb-bus.c +++ b/hw/usb-bus.c @@ -5,11 +5,13 @@ #include "monitor.h" static void usb_bus_dev_print(Monitor *mon, DeviceState *qdev, int indent); +static char *usbbus_get_fw_dev_path(DeviceState *dev); static struct BusInfo usb_bus_info = { .name = "USB", .size = sizeof(USBBus), .print_dev = usb_bus_dev_print, +.get_fw_dev_path = usbbus_get_fw_dev_path, }; static int next_usb_bus = 0; static QTAILQ_HEAD(, USBBus) busses = QTAILQ_HEAD_INITIALIZER(busses); @@ -307,3 +309,43 @@ USBDevice *usbdevice_create(const char *cmdline) } return usb->usbdevice_init(params); } + +static int usbbus_get_fw_dev_path_helper(USBDevice *d, USBBus *bus, char *p, + int len) +{ +int l = 0; +USBPort *port; + +QTAILQ_FOREACH(port, &bus->used, next) { +if (port->dev == d) { +if (port->pdev) { +l = usbbus_get_fw_dev_path_helper(port->pdev, bus, p, len); +} +l += snprintf(p + l, len - l, "%...@%x/", qdev_fw_name(&d->qdev), + port->index); +break; +} +} + +return l; +} + +static char *usbbus_get_fw_dev_path(DeviceState *dev) +{ +USBDevice *d = (USBDevice*)dev; +USBBus *bus = usb_bus_from_device(d); +char path[100]; +int l; + +assert(d->attached != 0); + +l = usbbus_get_fw_dev_path_helper(d, bus, path, sizeof(path)); + +if (l == 0) { +abort(); +} + +path[l-1] = '\0'; + +return strdup(path); +} -- 1.7.1
[Qemu-devel] [PATCHv5 02/15] Introduce new BusInfo callback get_fw_dev_path.
New get_fw_dev_path callback will be used for build device path usable by firmware in contrast to qdev qemu internal device path. Signed-off-by: Gleb Natapov --- hw/qdev.h |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/hw/qdev.h b/hw/qdev.h index 9f90efe..dc669b3 100644 --- a/hw/qdev.h +++ b/hw/qdev.h @@ -49,12 +49,14 @@ struct DeviceState { typedef void (*bus_dev_printfn)(Monitor *mon, DeviceState *dev, int indent); typedef char *(*bus_get_dev_path)(DeviceState *dev); +typedef char *(*bus_get_fw_dev_path)(DeviceState *dev); struct BusInfo { const char *name; size_t size; bus_dev_printfn print_dev; bus_get_dev_path get_dev_path; +bus_get_fw_dev_path get_fw_dev_path; Property *props; }; -- 1.7.1
[Qemu-devel] [PATCHv5 12/15] Change fw_cfg_add_file() to get full file path as a parameter.
Change fw_cfg_add_file() to get full file path as a parameter instead of building one internally. Two reasons for that. First caller may need to know how file is named. Second this moves policy of file naming out from fw_cfg. Platform may want to use more then two levels of directories for instance. Signed-off-by: Gleb Natapov --- hw/fw_cfg.c | 16 hw/fw_cfg.h |4 ++-- hw/loader.c | 16 ++-- 3 files changed, 20 insertions(+), 16 deletions(-) diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c index 72866ae..7b9434f 100644 --- a/hw/fw_cfg.c +++ b/hw/fw_cfg.c @@ -277,10 +277,9 @@ int fw_cfg_add_callback(FWCfgState *s, uint16_t key, FWCfgCallback callback, return 1; } -int fw_cfg_add_file(FWCfgState *s, const char *dir, const char *filename, -uint8_t *data, uint32_t len) +int fw_cfg_add_file(FWCfgState *s, const char *filename, uint8_t *data, +uint32_t len) { -const char *basename; int i, index; if (!s->files) { @@ -297,15 +296,8 @@ int fw_cfg_add_file(FWCfgState *s, const char *dir, const char *filename, fw_cfg_add_bytes(s, FW_CFG_FILE_FIRST + index, data, len); -basename = strrchr(filename, '/'); -if (basename) { -basename++; -} else { -basename = filename; -} - -snprintf(s->files->f[index].name, sizeof(s->files->f[index].name), - "%s/%s", dir, basename); +pstrcpy(s->files->f[index].name, sizeof(s->files->f[index].name), +filename); for (i = 0; i < index; i++) { if (strcmp(s->files->f[index].name, s->files->f[i].name) == 0) { FW_CFG_DPRINTF("%s: skip duplicate: %s\n", __FUNCTION__, diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h index 4d13a4f..856bf91 100644 --- a/hw/fw_cfg.h +++ b/hw/fw_cfg.h @@ -60,8 +60,8 @@ int fw_cfg_add_i32(FWCfgState *s, uint16_t key, uint32_t value); int fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value); int fw_cfg_add_callback(FWCfgState *s, uint16_t key, FWCfgCallback callback, void *callback_opaque, uint8_t *data, size_t len); -int fw_cfg_add_file(FWCfgState *s, const char *dir, const char *filename, -uint8_t *data, uint32_t len); +int fw_cfg_add_file(FWCfgState *s, const char *filename, uint8_t *data, +uint32_t len); FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port, target_phys_addr_t crl_addr, target_phys_addr_t data_addr); diff --git a/hw/loader.c b/hw/loader.c index 49ac1fa..1e98326 100644 --- a/hw/loader.c +++ b/hw/loader.c @@ -592,8 +592,20 @@ int rom_add_file(const char *file, const char *fw_dir, } close(fd); rom_insert(rom); -if (rom->fw_file && fw_cfg) -fw_cfg_add_file(fw_cfg, rom->fw_dir, rom->fw_file, rom->data, rom->romsize); +if (rom->fw_file && fw_cfg) { +const char *basename; +char fw_file_name[56]; + +basename = strrchr(rom->fw_file, '/'); +if (basename) { +basename++; +} else { +basename = rom->fw_file; +} +snprintf(fw_file_name, sizeof(fw_file_name), "%s/%s", rom->fw_dir, + basename); +fw_cfg_add_file(fw_cfg, fw_file_name, rom->data, rom->romsize); +} return 0; err: -- 1.7.1
[Qemu-devel] [PATCHv5 15/15] Pass boot device list to firmware.
Signed-off-by: Gleb Natapov --- hw/fw_cfg.c | 15 +++ hw/fw_cfg.h |5 - sysemu.h|1 + vl.c| 49 + 4 files changed, 69 insertions(+), 1 deletions(-) diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c index 7b9434f..4eea338 100644 --- a/hw/fw_cfg.c +++ b/hw/fw_cfg.c @@ -53,6 +53,7 @@ struct FWCfgState { FWCfgFiles *files; uint16_t cur_entry; uint32_t cur_offset; +Notifier machine_ready; }; static void fw_cfg_write(FWCfgState *s, uint8_t value) @@ -315,6 +316,16 @@ int fw_cfg_add_file(FWCfgState *s, const char *filename, uint8_t *data, return 1; } +static void fw_cfg_machine_ready(struct Notifier* n) +{ +uint32_t len; +FWCfgState *s = container_of(n, FWCfgState, machine_ready); +char *bootindex = get_boot_devices_list(&len); + +fw_cfg_add_i32(s, FW_CFG_BOOTINDEX_LEN, len); +fw_cfg_add_bytes(s, FW_CFG_BOOTINDEX_DATA, (uint8_t*)bootindex, len); +} + FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port, target_phys_addr_t ctl_addr, target_phys_addr_t data_addr) { @@ -343,6 +354,10 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port, fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus); fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu); + +s->machine_ready.notify = fw_cfg_machine_ready; +qemu_add_machine_init_done_notifier(&s->machine_ready); + return s; } diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h index 856bf91..b951f6b 100644 --- a/hw/fw_cfg.h +++ b/hw/fw_cfg.h @@ -30,7 +30,10 @@ #define FW_CFG_FILE_FIRST 0x20 #define FW_CFG_FILE_SLOTS 0x10 -#define FW_CFG_MAX_ENTRY(FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS) +#define FW_CFG_FILE_LAST_SLOT (FW_CFG_FILE_FIRST+FW_CFG_FILE_SLOTS) +#define FW_CFG_BOOTINDEX_LEN(FW_CFG_FILE_LAST_SLOT + 1) +#define FW_CFG_BOOTINDEX_DATA (FW_CFG_FILE_LAST_SLOT + 2) +#define FW_CFG_MAX_ENTRYFW_CFG_BOOTINDEX_DATA + 1 #define FW_CFG_WRITE_CHANNEL0x4000 #define FW_CFG_ARCH_LOCAL 0x8000 diff --git a/sysemu.h b/sysemu.h index c42f33a..38a20a3 100644 --- a/sysemu.h +++ b/sysemu.h @@ -196,4 +196,5 @@ void register_devices(void); void add_boot_device_path(int32_t bootindex, DeviceState *dev, const char *suffix); +char *get_boot_devices_list(uint32_t *size); #endif diff --git a/vl.c b/vl.c index 918d988..ab36f9f 100644 --- a/vl.c +++ b/vl.c @@ -735,6 +735,55 @@ void add_boot_device_path(int32_t bootindex, DeviceState *dev, QTAILQ_INSERT_TAIL(&fw_boot_order, node, link); } +/* + * This function returns device list as an array in a below format: + * +---+-+- + * | devpath1 | devpath2 | ... + * +---+-+- + * where: + * devpath - null terminated string representing one device path + * + * memory pointed by "size" is assigned total length of teh array in bytes + * + */ +char *get_boot_devices_list(uint32_t *size) +{ +FWBootEntry *i; +uint32_t total = 0; +char *list = NULL; + +QTAILQ_FOREACH(i, &fw_boot_order, link) { +char *devpath = NULL, *bootpath; +int len; + +if (i->dev) { +devpath = qdev_get_fw_dev_path(i->dev); +assert(devpath); +} + +if (i->suffix && devpath) { +bootpath = qemu_malloc(strlen(devpath) + strlen(i->suffix) + 2); +sprintf(bootpath, "%s/%s", devpath, i->suffix); +qemu_free(devpath); +} else if (devpath) { +bootpath = devpath; +} else { +bootpath = strdup(i->suffix); +assert(bootpath); +} + +len = strlen(bootpath) + 1; +list = qemu_realloc(list, total + len); +memcpy(&list[total], bootpath, len); +total += len; +qemu_free(bootpath); +} + +*size = total; + +return list; +} + static void numa_add(const char *optarg) { char option[128]; -- 1.7.1
[Qemu-devel] [PATCHv5 00/15] boot order specification
I am using open firmware naming scheme to specify device path names. In this version: fixed compilation problem, changed how device list is passed into firmware. Names look like this on pci machine: /p...@i0cf8/i...@1,1/dr...@1/d...@0 /p...@i0cf8/i...@1/f...@03f1/flo...@1 /p...@i0cf8/i...@1/f...@03f1/flo...@0 /p...@i0cf8/i...@1,1/dr...@1/d...@1 /p...@i0cf8/i...@1,1/dr...@0/d...@0 /p...@i0cf8/s...@3/d...@0 /p...@i0cf8/ether...@4/ethernet-...@0 /p...@i0cf8/ether...@5/ethernet-...@0 /p...@i0cf8/i...@1,1/dr...@0/d...@1 /p...@i0cf8/i...@1/i...@01e8/dr...@0/d...@0 /p...@i0cf8/u...@1,2/netw...@0/ether...@0 /p...@i0cf8/u...@1,2/h...@1/netw...@0/ether...@0 /r...@genroms/linuxboot.bin and on isa machine: /isa/i...@0170/dr...@0/d...@0 /isa/f...@03f1/flo...@1 /isa/f...@03f1/flo...@0 /isa/i...@0170/dr...@0/d...@1 Instead of using get_dev_path() callback I introduces another one get_fw_dev_path. Unfortunately the way get_dev_path() callback is used in migration code makes it hard to reuse it for other purposes. First of all it is not called recursively so caller expects it to provide unique name by itself. Device path though is inherently recursive. Each individual element may not be unique, but the whole path will be. On the other hand to call get_dev_path() recursively in migration code we should implement it for all possible buses first. Other problem is compatibility. If we change get_dev_path() output format now we will not be able to migrate from old qemu to new one without some additional compatibility layer. Gleb Natapov (15): Introduce fw_name field to DeviceInfo structure. Introduce new BusInfo callback get_fw_dev_path. Keep track of ISA ports ISA device is using in qdev. Add get_fw_dev_path callback to ISA bus in qdev. Store IDE bus id in IDEBus structure for easy access. Add get_fw_dev_path callback to IDE bus. Add get_dev_path callback for system bus. Add get_fw_dev_path callback for pci bus. Record which USBDevice USBPort belongs too. Add get_dev_path callback for usb bus. Add bootindex parameter to net/block/fd device Change fw_cfg_add_file() to get full file path as a parameter. Add bootindex for option roms. Add notifier that will be called when machine is fully created. Pass boot device list to firmware. block_int.h |4 +- hw/cs4231a.c |1 + hw/e1000.c|4 ++ hw/eepro100.c |3 + hw/fdc.c | 12 ++ hw/fw_cfg.c | 31 +-- hw/fw_cfg.h |9 +++- hw/gus.c |4 ++ hw/ide/cmd646.c |4 +- hw/ide/internal.h |3 +- hw/ide/isa.c |5 ++- hw/ide/piix.c |4 +- hw/ide/qdev.c | 22 ++- hw/ide/via.c |4 +- hw/isa-bus.c | 42 +++ hw/isa.h |4 ++ hw/lance.c|1 + hw/loader.c | 32 +++--- hw/loader.h |8 ++-- hw/m48t59.c |1 + hw/mc146818rtc.c |1 + hw/multiboot.c|3 +- hw/ne2000-isa.c |3 + hw/ne2000.c |5 ++- hw/nseries.c |4 +- hw/palm.c |6 +- hw/parallel.c |5 ++ hw/pc.c |7 ++- hw/pci.c | 110 +++--- hw/pci_host.c |2 + hw/pckbd.c|3 + hw/pcnet.c|6 ++- hw/piix_pci.c |1 + hw/qdev.c | 32 +++ hw/qdev.h |9 hw/rtl8139.c |4 ++ hw/sb16.c |4 ++ hw/serial.c |1 + hw/sysbus.c | 30 ++ hw/sysbus.h |4 ++ hw/usb-bus.c | 45 - hw/usb-hub.c |3 +- hw/usb-musb.c |2 +- hw/usb-net.c |3 + hw/usb-ohci.c |2 +- hw/usb-uhci.c |2 +- hw/usb.h |3 +- hw/virtio-blk.c |2 + hw/virtio-net.c |2 + hw/virtio-pci.c |1 + net.h |4 +- qemu-config.c | 17 sysemu.h | 11 +- vl.c | 115 - 54 files changed, 569 insertions(+), 81 deletions(-)
[Qemu-devel] [PATCHv5 08/15] Add get_fw_dev_path callback for pci bus.
Signed-off-by: Gleb Natapov --- hw/pci.c | 108 - 1 files changed, 85 insertions(+), 23 deletions(-) diff --git a/hw/pci.c b/hw/pci.c index 962886e..114b435 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -43,12 +43,14 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent); static char *pcibus_get_dev_path(DeviceState *dev); +static char *pcibus_get_fw_dev_path(DeviceState *dev); struct BusInfo pci_bus_info = { .name = "PCI", .size = sizeof(PCIBus), .print_dev = pcibus_dev_print, .get_dev_path = pcibus_get_dev_path, +.get_fw_dev_path = pcibus_get_fw_dev_path, .props = (Property[]) { DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), DEFINE_PROP_STRING("romfile", PCIDevice, romfile), @@ -1061,45 +1063,63 @@ void pci_msi_notify(PCIDevice *dev, unsigned int vector) typedef struct { uint16_t class; const char *desc; +const char *fw_name; +uint16_t fw_ign_bits; } pci_class_desc; static const pci_class_desc pci_class_descriptions[] = { -{ 0x0100, "SCSI controller"}, -{ 0x0101, "IDE controller"}, -{ 0x0102, "Floppy controller"}, -{ 0x0103, "IPI controller"}, -{ 0x0104, "RAID controller"}, +{ 0x0001, "VGA controller", "display"}, +{ 0x0100, "SCSI controller", "scsi"}, +{ 0x0101, "IDE controller", "ide"}, +{ 0x0102, "Floppy controller", "fdc"}, +{ 0x0103, "IPI controller", "ipi"}, +{ 0x0104, "RAID controller", "raid"}, { 0x0106, "SATA controller"}, { 0x0107, "SAS controller"}, { 0x0180, "Storage controller"}, -{ 0x0200, "Ethernet controller"}, -{ 0x0201, "Token Ring controller"}, -{ 0x0202, "FDDI controller"}, -{ 0x0203, "ATM controller"}, +{ 0x0200, "Ethernet controller", "ethernet"}, +{ 0x0201, "Token Ring controller", "token-ring"}, +{ 0x0202, "FDDI controller", "fddi"}, +{ 0x0203, "ATM controller", "atm"}, { 0x0280, "Network controller"}, -{ 0x0300, "VGA controller"}, +{ 0x0300, "VGA controller", "display", 0x00ff}, { 0x0301, "XGA controller"}, { 0x0302, "3D controller"}, { 0x0380, "Display controller"}, -{ 0x0400, "Video controller"}, -{ 0x0401, "Audio controller"}, +{ 0x0400, "Video controller", "video"}, +{ 0x0401, "Audio controller", "sound"}, { 0x0402, "Phone"}, { 0x0480, "Multimedia controller"}, -{ 0x0500, "RAM controller"}, -{ 0x0501, "Flash controller"}, +{ 0x0500, "RAM controller", "memory"}, +{ 0x0501, "Flash controller", "flash"}, { 0x0580, "Memory controller"}, -{ 0x0600, "Host bridge"}, -{ 0x0601, "ISA bridge"}, -{ 0x0602, "EISA bridge"}, -{ 0x0603, "MC bridge"}, -{ 0x0604, "PCI bridge"}, -{ 0x0605, "PCMCIA bridge"}, -{ 0x0606, "NUBUS bridge"}, -{ 0x0607, "CARDBUS bridge"}, +{ 0x0600, "Host bridge", "host"}, +{ 0x0601, "ISA bridge", "isa"}, +{ 0x0602, "EISA bridge", "eisa"}, +{ 0x0603, "MC bridge", "mca"}, +{ 0x0604, "PCI bridge", "pci"}, +{ 0x0605, "PCMCIA bridge", "pcmcia"}, +{ 0x0606, "NUBUS bridge", "nubus"}, +{ 0x0607, "CARDBUS bridge", "cardbus"}, { 0x0608, "RACEWAY bridge"}, { 0x0680, "Bridge"}, -{ 0x0c03, "USB controller"}, +{ 0x0700, "Serial port", "serial"}, +{ 0x0701, "Parallel port", "parallel"}, +{ 0x0800, "Interrupt controller", "interrupt-controller"}, +{ 0x0801, "DMA controller", "dma-controller"}, +{ 0x0802, "Timer", "timer"}, +{ 0x0803, "RTC", "rtc"}, +{ 0x0900, "Keyboard", "keyboard"}, +{ 0x0901, "Pen", "pen"}, +{ 0x0902, "Mouse", "mouse"}, +{ 0x0A00, "Dock station", "dock", 0x00ff}, +{ 0x0B00, "i386 cpu", "cpu", 0x00ff}, +{ 0x0c00, "Fireware contorller", "fireware"}, +{ 0x0c01, "Access bus controller", "access-bus"}, +{ 0x0c02, "SSA controller", "ssa"}, +{ 0x0c03, "USB controller", "usb"}, +{ 0x0c04, "Fibre channel controller", "fibre-channel"}, { 0, NULL} }; @@ -1825,6 +1845,48 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) } } +static char *pci_dev_fw_name(DeviceState *dev, char *buf, int len) +{ +PCIDevice *d = (PCIDevice *)dev; +const char *name = NULL; +const pci_class_desc *desc = pci_class_descriptions; +int class = pci_get_word(d->config + PCI_CLASS_DEVICE); + +while (desc->desc && + (class & ~desc->fw_ign_bits) != + (desc->class & ~desc->fw_ign_bits)) { +desc++; +} + +if (desc->desc) { +name = desc->fw_name; +} + +if (name) { +pstrcpy(buf, len, name); +} else { +snprintf(buf, len, "pci%04x,%04x", + pci_get_word(d->config + PCI_VENDOR_ID), + pci_get_word(d->config + PCI_DEVICE_ID)); +} + +return buf; +} + +static char *pcibus_get_fw_dev_path(DeviceState *dev) +{ +PCIDevice *d = (PCIDevice *)dev; +char path[50], na
[Qemu-devel] [PATCHv5 03/15] Keep track of ISA ports ISA device is using in qdev.
Store all io ports used by device in ISADevice structure. Signed-off-by: Gleb Natapov --- hw/cs4231a.c |1 + hw/fdc.c |3 +++ hw/gus.c |4 hw/ide/isa.c |2 ++ hw/isa-bus.c | 25 + hw/isa.h |4 hw/m48t59.c |1 + hw/mc146818rtc.c |1 + hw/ne2000-isa.c |3 +++ hw/parallel.c|5 + hw/pckbd.c |3 +++ hw/sb16.c|4 hw/serial.c |1 + 13 files changed, 57 insertions(+), 0 deletions(-) diff --git a/hw/cs4231a.c b/hw/cs4231a.c index 4d5ce5c..598f032 100644 --- a/hw/cs4231a.c +++ b/hw/cs4231a.c @@ -645,6 +645,7 @@ static int cs4231a_initfn (ISADevice *dev) isa_init_irq (dev, &s->pic, s->irq); for (i = 0; i < 4; i++) { +isa_init_ioport(dev, i); register_ioport_write (s->port + i, 1, 1, cs_write, s); register_ioport_read (s->port + i, 1, 1, cs_read, s); } diff --git a/hw/fdc.c b/hw/fdc.c index a467c4b..5ab754b 100644 --- a/hw/fdc.c +++ b/hw/fdc.c @@ -1983,6 +1983,9 @@ static int isabus_fdc_init1(ISADevice *dev) &fdctrl_write_port, fdctrl); register_ioport_write(iobase + 0x07, 1, 1, &fdctrl_write_port, fdctrl); +isa_init_ioport_range(dev, iobase + 1, 5); +isa_init_ioport(dev, iobase + 7); + isa_init_irq(&isa->busdev, &fdctrl->irq, isairq); fdctrl->dma_chann = dma_chann; diff --git a/hw/gus.c b/hw/gus.c index e9016d8..ff9e7c7 100644 --- a/hw/gus.c +++ b/hw/gus.c @@ -264,20 +264,24 @@ static int gus_initfn (ISADevice *dev) register_ioport_write (s->port, 1, 1, gus_writeb, s); register_ioport_write (s->port, 1, 2, gus_writew, s); +isa_init_ioport_range(dev, s->port, 2); register_ioport_read ((s->port + 0x100) & 0xf00, 1, 1, gus_readb, s); register_ioport_read ((s->port + 0x100) & 0xf00, 1, 2, gus_readw, s); +isa_init_ioport_range(dev, (s->port + 0x100) & 0xf00, 2); register_ioport_write (s->port + 6, 10, 1, gus_writeb, s); register_ioport_write (s->port + 6, 10, 2, gus_writew, s); register_ioport_read (s->port + 6, 10, 1, gus_readb, s); register_ioport_read (s->port + 6, 10, 2, gus_readw, s); +isa_init_ioport_range(dev, s->port + 6, 10); register_ioport_write (s->port + 0x100, 8, 1, gus_writeb, s); register_ioport_write (s->port + 0x100, 8, 2, gus_writew, s); register_ioport_read (s->port + 0x100, 8, 1, gus_readb, s); register_ioport_read (s->port + 0x100, 8, 2, gus_readw, s); +isa_init_ioport_range(dev, s->port + 0x100, 8); DMA_register_channel (s->emu.gusdma, GUS_read_DMA, s); s->emu.himemaddr = s->himem; diff --git a/hw/ide/isa.c b/hw/ide/isa.c index 9856435..4206afd 100644 --- a/hw/ide/isa.c +++ b/hw/ide/isa.c @@ -70,6 +70,8 @@ static int isa_ide_initfn(ISADevice *dev) ide_bus_new(&s->bus, &s->dev.qdev); ide_init_ioport(&s->bus, s->iobase, s->iobase2); isa_init_irq(dev, &s->irq, s->isairq); +isa_init_ioport_range(dev, s->iobase, 8); +isa_init_ioport(dev, s->iobase2); ide_init2(&s->bus, s->irq); vmstate_register(&dev->qdev, 0, &vmstate_ide_isa, s); return 0; diff --git a/hw/isa-bus.c b/hw/isa-bus.c index 26036e0..c0ac7e9 100644 --- a/hw/isa-bus.c +++ b/hw/isa-bus.c @@ -92,6 +92,31 @@ void isa_init_irq(ISADevice *dev, qemu_irq *p, int isairq) dev->nirqs++; } +static void isa_init_ioport_one(ISADevice *dev, uint16_t ioport) +{ +assert(dev->nioports < ARRAY_SIZE(dev->ioports)); +dev->ioports[dev->nioports++] = ioport; +} + +static int isa_cmp_ports(const void *p1, const void *p2) +{ +return *(uint16_t*)p1 - *(uint16_t*)p2; +} + +void isa_init_ioport_range(ISADevice *dev, uint16_t start, uint16_t length) +{ +int i; +for (i = start; i < start + length; i++) { +isa_init_ioport_one(dev, i); +} +qsort(dev->ioports, dev->nioports, sizeof(dev->ioports[0]), isa_cmp_ports); +} + +void isa_init_ioport(ISADevice *dev, uint16_t ioport) +{ +isa_init_ioport_range(dev, ioport, 1); +} + static int isa_qdev_init(DeviceState *qdev, DeviceInfo *base) { ISADevice *dev = DO_UPCAST(ISADevice, qdev, qdev); diff --git a/hw/isa.h b/hw/isa.h index aaf0272..4794b76 100644 --- a/hw/isa.h +++ b/hw/isa.h @@ -14,6 +14,8 @@ struct ISADevice { DeviceState qdev; uint32_t isairq[2]; int nirqs; +uint16_t ioports[32]; +int nioports; }; typedef int (*isa_qdev_initfn)(ISADevice *dev); @@ -26,6 +28,8 @@ ISABus *isa_bus_new(DeviceState *dev); void isa_bus_irqs(qemu_irq *irqs); qemu_irq isa_reserve_irq(int isairq); void isa_init_irq(ISADevice *dev, qemu_irq *p, int isairq); +void isa_init_ioport(ISADevice *dev, uint16_t ioport); +void isa_init_ioport_range(ISADevice *dev, uint16_t start, uint16_t length); void isa_qdev_register(ISADeviceInfo *info); ISADevice *isa_create(const char *name); ISADevice *isa_create_simple(const char *name); diff --git a/hw/m48t59.c b/hw/m
[Qemu-devel] [PATCHv5 05/15] Store IDE bus id in IDEBus structure for easy access.
Signed-off-by: Gleb Natapov --- hw/ide/cmd646.c |4 ++-- hw/ide/internal.h |3 ++- hw/ide/isa.c |2 +- hw/ide/piix.c |4 ++-- hw/ide/qdev.c |3 ++- hw/ide/via.c |4 ++-- 6 files changed, 11 insertions(+), 9 deletions(-) diff --git a/hw/ide/cmd646.c b/hw/ide/cmd646.c index ff80dd5..b2cbdbc 100644 --- a/hw/ide/cmd646.c +++ b/hw/ide/cmd646.c @@ -257,8 +257,8 @@ static int pci_cmd646_ide_initfn(PCIDevice *dev) pci_conf[PCI_INTERRUPT_PIN] = 0x01; // interrupt on pin 1 irq = qemu_allocate_irqs(cmd646_set_irq, d, 2); -ide_bus_new(&d->bus[0], &d->dev.qdev); -ide_bus_new(&d->bus[1], &d->dev.qdev); +ide_bus_new(&d->bus[0], &d->dev.qdev, 0); +ide_bus_new(&d->bus[1], &d->dev.qdev, 1); ide_init2(&d->bus[0], irq[0]); ide_init2(&d->bus[1], irq[1]); diff --git a/hw/ide/internal.h b/hw/ide/internal.h index d652e06..c0a1abc 100644 --- a/hw/ide/internal.h +++ b/hw/ide/internal.h @@ -448,6 +448,7 @@ struct IDEBus { IDEDevice *slave; BMDMAState *bmdma; IDEState ifs[2]; +int bus_id; uint8_t unit; uint8_t cmd; qemu_irq irq; @@ -565,7 +566,7 @@ void ide_init2_with_non_qdev_drives(IDEBus *bus, DriveInfo *hd0, void ide_init_ioport(IDEBus *bus, int iobase, int iobase2); /* hw/ide/qdev.c */ -void ide_bus_new(IDEBus *idebus, DeviceState *dev); +void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id); IDEDevice *ide_create_drive(IDEBus *bus, int unit, DriveInfo *drive); #endif /* HW_IDE_INTERNAL_H */ diff --git a/hw/ide/isa.c b/hw/ide/isa.c index 4206afd..8c59c5a 100644 --- a/hw/ide/isa.c +++ b/hw/ide/isa.c @@ -67,7 +67,7 @@ static int isa_ide_initfn(ISADevice *dev) { ISAIDEState *s = DO_UPCAST(ISAIDEState, dev, dev); -ide_bus_new(&s->bus, &s->dev.qdev); +ide_bus_new(&s->bus, &s->dev.qdev, 0); ide_init_ioport(&s->bus, s->iobase, s->iobase2); isa_init_irq(dev, &s->irq, s->isairq); isa_init_ioport_range(dev, s->iobase, 8); diff --git a/hw/ide/piix.c b/hw/ide/piix.c index 07483e8..d0b04a3 100644 --- a/hw/ide/piix.c +++ b/hw/ide/piix.c @@ -129,8 +129,8 @@ static int pci_piix_ide_initfn(PCIIDEState *d) vmstate_register(&d->dev.qdev, 0, &vmstate_ide_pci, d); -ide_bus_new(&d->bus[0], &d->dev.qdev); -ide_bus_new(&d->bus[1], &d->dev.qdev); +ide_bus_new(&d->bus[0], &d->dev.qdev, 0); +ide_bus_new(&d->bus[1], &d->dev.qdev, 1); ide_init_ioport(&d->bus[0], 0x1f0, 0x3f6); ide_init_ioport(&d->bus[1], 0x170, 0x376); diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c index 6d27b60..88ff657 100644 --- a/hw/ide/qdev.c +++ b/hw/ide/qdev.c @@ -29,9 +29,10 @@ static struct BusInfo ide_bus_info = { .size = sizeof(IDEBus), }; -void ide_bus_new(IDEBus *idebus, DeviceState *dev) +void ide_bus_new(IDEBus *idebus, DeviceState *dev, int bus_id) { qbus_create_inplace(&idebus->qbus, &ide_bus_info, dev, NULL); +idebus->bus_id = bus_id; } static int ide_qdev_init(DeviceState *qdev, DeviceInfo *base) diff --git a/hw/ide/via.c b/hw/ide/via.c index b2c7cad..cc48b2b 100644 --- a/hw/ide/via.c +++ b/hw/ide/via.c @@ -158,8 +158,8 @@ static int vt82c686b_ide_initfn(PCIDevice *dev) vmstate_register(&dev->qdev, 0, &vmstate_ide_pci, d); -ide_bus_new(&d->bus[0], &d->dev.qdev); -ide_bus_new(&d->bus[1], &d->dev.qdev); +ide_bus_new(&d->bus[0], &d->dev.qdev, 0); +ide_bus_new(&d->bus[1], &d->dev.qdev, 1); ide_init2(&d->bus[0], isa_reserve_irq(14)); ide_init2(&d->bus[1], isa_reserve_irq(15)); ide_init_ioport(&d->bus[0], 0x1f0, 0x3f6); -- 1.7.1
[Qemu-devel] [PATCHv5 09/15] Record which USBDevice USBPort belongs too.
Ports on root hub will have NULL here. This is needed to reconstruct path from device to its root hub to build device path. Signed-off-by: Gleb Natapov --- hw/usb-bus.c |3 ++- hw/usb-hub.c |2 +- hw/usb-musb.c |2 +- hw/usb-ohci.c |2 +- hw/usb-uhci.c |2 +- hw/usb.h |3 ++- 6 files changed, 8 insertions(+), 6 deletions(-) diff --git a/hw/usb-bus.c b/hw/usb-bus.c index b692503..256b881 100644 --- a/hw/usb-bus.c +++ b/hw/usb-bus.c @@ -110,11 +110,12 @@ USBDevice *usb_create_simple(USBBus *bus, const char *name) } void usb_register_port(USBBus *bus, USBPort *port, void *opaque, int index, - usb_attachfn attach) + USBDevice *pdev, usb_attachfn attach) { port->opaque = opaque; port->index = index; port->attach = attach; +port->pdev = pdev; QTAILQ_INSERT_TAIL(&bus->free, port, next); bus->nfree++; } diff --git a/hw/usb-hub.c b/hw/usb-hub.c index 8e3a96b..8a3f829 100644 --- a/hw/usb-hub.c +++ b/hw/usb-hub.c @@ -535,7 +535,7 @@ static int usb_hub_initfn(USBDevice *dev) for (i = 0; i < s->nb_ports; i++) { port = &s->ports[i]; usb_register_port(usb_bus_from_device(dev), - &port->port, s, i, usb_hub_attach); + &port->port, s, i, &s->dev, usb_hub_attach); port->wPortStatus = PORT_STAT_POWER; port->wPortChange = 0; } diff --git a/hw/usb-musb.c b/hw/usb-musb.c index 7f15842..9efe7a6 100644 --- a/hw/usb-musb.c +++ b/hw/usb-musb.c @@ -343,7 +343,7 @@ struct MUSBState { } usb_bus_new(&s->bus, NULL /* FIXME */); -usb_register_port(&s->bus, &s->port, s, 0, musb_attach); +usb_register_port(&s->bus, &s->port, s, 0, NULL, musb_attach); return s; } diff --git a/hw/usb-ohci.c b/hw/usb-ohci.c index c60fd8d..59604cf 100644 --- a/hw/usb-ohci.c +++ b/hw/usb-ohci.c @@ -1705,7 +1705,7 @@ static void usb_ohci_init(OHCIState *ohci, DeviceState *dev, usb_bus_new(&ohci->bus, dev); ohci->num_ports = num_ports; for (i = 0; i < num_ports; i++) { -usb_register_port(&ohci->bus, &ohci->rhport[i].port, ohci, i, ohci_attach); +usb_register_port(&ohci->bus, &ohci->rhport[i].port, ohci, i, NULL, ohci_attach); } ohci->async_td = 0; diff --git a/hw/usb-uhci.c b/hw/usb-uhci.c index 1d83400..b9b822f 100644 --- a/hw/usb-uhci.c +++ b/hw/usb-uhci.c @@ -1115,7 +1115,7 @@ static int usb_uhci_common_initfn(UHCIState *s) usb_bus_new(&s->bus, &s->dev.qdev); for(i = 0; i < NB_PORTS; i++) { -usb_register_port(&s->bus, &s->ports[i].port, s, i, uhci_attach); +usb_register_port(&s->bus, &s->ports[i].port, s, i, NULL, uhci_attach); } s->frame_timer = qemu_new_timer(vm_clock, uhci_frame_timer, s); s->expire_time = qemu_get_clock(vm_clock) + diff --git a/hw/usb.h b/hw/usb.h index 00d2802..0b32d77 100644 --- a/hw/usb.h +++ b/hw/usb.h @@ -203,6 +203,7 @@ struct USBPort { USBDevice *dev; usb_attachfn attach; void *opaque; +USBDevice *pdev; int index; /* internal port index, may be used with the opaque */ QTAILQ_ENTRY(USBPort) next; }; @@ -312,7 +313,7 @@ USBDevice *usb_create(USBBus *bus, const char *name); USBDevice *usb_create_simple(USBBus *bus, const char *name); USBDevice *usbdevice_create(const char *cmdline); void usb_register_port(USBBus *bus, USBPort *port, void *opaque, int index, - usb_attachfn attach); + USBDevice *pdev, usb_attachfn attach); void usb_unregister_port(USBBus *bus, USBPort *port); int usb_device_attach(USBDevice *dev); int usb_device_detach(USBDevice *dev); -- 1.7.1
[Qemu-devel] [PATCHv5 11/15] Add bootindex parameter to net/block/fd device
If bootindex is specified on command line a string that describes device in firmware readable way is added into sorted list. Later this list will be passed into firmware to control boot order. Signed-off-by: Gleb Natapov --- block_int.h |4 +++- hw/e1000.c |4 hw/eepro100.c |3 +++ hw/fdc.c|8 hw/ide/qdev.c |5 + hw/ne2000.c |3 +++ hw/pcnet.c |4 hw/qdev.c | 32 hw/qdev.h |1 + hw/rtl8139.c|4 hw/usb-net.c|2 ++ hw/virtio-blk.c |2 ++ hw/virtio-net.c |2 ++ net.h |4 +++- sysemu.h|2 ++ vl.c| 40 16 files changed, 118 insertions(+), 2 deletions(-) diff --git a/block_int.h b/block_int.h index 3c3adb5..0a0e47d 100644 --- a/block_int.h +++ b/block_int.h @@ -227,6 +227,7 @@ typedef struct BlockConf { uint16_t logical_block_size; uint16_t min_io_size; uint32_t opt_io_size; +int32_t bootindex; } BlockConf; static inline unsigned int get_physical_block_exp(BlockConf *conf) @@ -249,6 +250,7 @@ static inline unsigned int get_physical_block_exp(BlockConf *conf) DEFINE_PROP_UINT16("physical_block_size", _state, \ _conf.physical_block_size, 512), \ DEFINE_PROP_UINT16("min_io_size", _state, _conf.min_io_size, 0), \ -DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 0) +DEFINE_PROP_UINT32("opt_io_size", _state, _conf.opt_io_size, 0),\ +DEFINE_PROP_INT32("bootindex", _state, _conf.bootindex, -1) \ #endif /* BLOCK_INT_H */ diff --git a/hw/e1000.c b/hw/e1000.c index 532efdc..053f33e 100644 --- a/hw/e1000.c +++ b/hw/e1000.c @@ -30,6 +30,7 @@ #include "net.h" #include "net/checksum.h" #include "loader.h" +#include "sysemu.h" #include "e1000_hw.h" @@ -1148,6 +1149,9 @@ static int pci_e1000_init(PCIDevice *pci_dev) d->dev.qdev.info->name, d->dev.qdev.id, d); qemu_format_nic_info_str(&d->nic->nc, macaddr); + +add_boot_device_path(d->conf.bootindex, &pci_dev->qdev, "ethernet-...@0"); + return 0; } diff --git a/hw/eepro100.c b/hw/eepro100.c index 41d792a..80adac6 100644 --- a/hw/eepro100.c +++ b/hw/eepro100.c @@ -46,6 +46,7 @@ #include "pci.h" #include "net.h" #include "eeprom93xx.h" +#include "sysemu.h" #define KiB 1024 @@ -1907,6 +1908,8 @@ static int e100_nic_init(PCIDevice *pci_dev) s->vmstate->name = s->nic->nc.model; vmstate_register(&pci_dev->qdev, -1, s->vmstate, s); +add_boot_device_path(s->conf.bootindex, &pci_dev->qdev, "ethernet-...@0"); + return 0; } diff --git a/hw/fdc.c b/hw/fdc.c index 5ab754b..7b1349f 100644 --- a/hw/fdc.c +++ b/hw/fdc.c @@ -35,6 +35,7 @@ #include "sysbus.h" #include "qdev-addr.h" #include "blockdev.h" +#include "sysemu.h" // /* debug Floppy devices */ @@ -523,6 +524,8 @@ typedef struct FDCtrlSysBus { typedef struct FDCtrlISABus { ISADevice busdev; struct FDCtrl state; +int32_t bootindexA; +int32_t bootindexB; } FDCtrlISABus; static uint32_t fdctrl_read (void *opaque, uint32_t reg) @@ -1992,6 +1995,9 @@ static int isabus_fdc_init1(ISADevice *dev) qdev_set_legacy_instance_id(&dev->qdev, iobase, 2); ret = fdctrl_init_common(fdctrl); +add_boot_device_path(isa->bootindexA, &dev->qdev, "flo...@0"); +add_boot_device_path(isa->bootindexB, &dev->qdev, "flo...@1"); + return ret; } @@ -2051,6 +2057,8 @@ static ISADeviceInfo isa_fdc_info = { .qdev.props = (Property[]) { DEFINE_PROP_DRIVE("driveA", FDCtrlISABus, state.drives[0].bs), DEFINE_PROP_DRIVE("driveB", FDCtrlISABus, state.drives[1].bs), +DEFINE_PROP_INT32("bootindexA", FDCtrlISABus, bootindexA, -1), +DEFINE_PROP_INT32("bootindexB", FDCtrlISABus, bootindexB, -1), DEFINE_PROP_END_OF_LIST(), }, }; diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c index 01a181b..69a00e2 100644 --- a/hw/ide/qdev.c +++ b/hw/ide/qdev.c @@ -21,6 +21,7 @@ #include "qemu-error.h" #include #include "blockdev.h" +#include "sysemu.h" /* - */ @@ -143,6 +144,10 @@ static int ide_drive_initfn(IDEDevice *dev) if (!dev->serial) { dev->serial = qemu_strdup(s->drive_serial_str); } + +add_boot_device_path(dev->conf.bootindex, &dev->qdev, + dev->unit ? "d...@1" : "d...@0"); + return 0; } diff --git a/hw/ne2000.c b/hw/ne2000.c index 126e7cf..f4bbac2 100644 --- a/hw/ne2000.c +++ b/hw/ne2000.c @@ -26,6 +26,7 @@ #include "net.h" #include "ne2000.h" #include "loader.h" +#include "sysemu.h" /* debug NE2000 card */ //#define DEBUG_NE2000 @@ -746,6 +747,8 @@ static int pci_ne2000_init(PCIDevice *pci_dev) } } +add_boot_device_path(s->c.bootindex, &pci_dev->qdev, "
[Qemu-devel] [PATCHv5 14/15] Add notifier that will be called when machine is fully created.
Action that depends on fully initialized device model should register with this notifier chain. Signed-off-by: Gleb Natapov --- sysemu.h |2 ++ vl.c | 15 +++ 2 files changed, 17 insertions(+), 0 deletions(-) diff --git a/sysemu.h b/sysemu.h index 48f8eee..c42f33a 100644 --- a/sysemu.h +++ b/sysemu.h @@ -60,6 +60,8 @@ void qemu_system_reset(void); void qemu_add_exit_notifier(Notifier *notify); void qemu_remove_exit_notifier(Notifier *notify); +void qemu_add_machine_init_done_notifier(Notifier *notify); + void do_savevm(Monitor *mon, const QDict *qdict); int load_vmstate(const char *name); void do_delvm(Monitor *mon, const QDict *qdict); diff --git a/vl.c b/vl.c index e8ada75..918d988 100644 --- a/vl.c +++ b/vl.c @@ -253,6 +253,9 @@ static void *boot_set_opaque; static NotifierList exit_notifiers = NOTIFIER_LIST_INITIALIZER(exit_notifiers); +static NotifierList machine_init_done_notifiers = +NOTIFIER_LIST_INITIALIZER(machine_init_done_notifiers); + int kvm_allowed = 0; uint32_t xen_domid; enum xen_mode xen_mode = XEN_EMULATE; @@ -1778,6 +1781,16 @@ static void qemu_run_exit_notifiers(void) notifier_list_notify(&exit_notifiers); } +void qemu_add_machine_init_done_notifier(Notifier *notify) +{ +notifier_list_add(&machine_init_done_notifiers, notify); +} + +static void qemu_run_machine_init_done_notifiers(void) +{ +notifier_list_notify(&machine_init_done_notifiers); +} + static const QEMUOption *lookup_opt(int argc, char **argv, const char **poptarg, int *poptind) { @@ -3023,6 +3036,8 @@ int main(int argc, char **argv, char **envp) exit(1); } +qemu_run_machine_init_done_notifiers(); + qemu_system_reset(); if (loadvm) { if (load_vmstate(loadvm) < 0) { -- 1.7.1
Re: [Qemu-devel] [PATCH V6 04/15] Introduce -accel command option.
On 15.11.2010, at 15:47, Anthony PERARD wrote: > On Mon, 15 Nov 2010, Anthony PERARD wrote: > >> On Mon, 15 Nov 2010, Alexander Graf wrote: >> >>> >>> On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote: >>> From: Anthony PERARD This option gives the ability to switch one "accelerator" like kvm, xen or the default one tcg. We can specify more than one accelerator by separate them by a comma. QEMU will try each one and use the first whose works. So, -accel xen,kvm,tcg which would try Xen support first, then KVM and finaly tcg if none of the other works. Signed-off-by: Anthony PERARD --- qemu-options.hx | 10 ++ vl.c| 86 --- 2 files changed, 85 insertions(+), 11 deletions(-) diff --git a/vl.c b/vl.c index df414ef..40a26ee 100644 --- a/vl.c +++ b/vl.c @@ -1750,6 +1750,74 @@ static int debugcon_parse(const char *devname) return 0; } +static struct { +const char *opt_name; +const char *name; +int (*available)(void); +int (*init)(int smp_cpus); +int *allowed; +} accel_list[] = { +{ "tcg", "tcg", NULL, NULL, NULL }, +{ "kvm", "KVM", kvm_available, kvm_init, &kvm_allowed }, >>> >>> Thinking about this a bit more ... >>> >>> kvm_available is a function pointer that gets #defined to (0) when we don't >>> have KVM available. I can imagine some compiler might throw a warning on us >>> for this one day. >>> >>> Is there a valid reason not to do >>> >>> static inline int kvm_enabled() >>> { >>> #ifdef CONFIG_KVM >>>return kvm_allowed; >>> #else >>>return 0; >>> #endif >>> } >>> >>> That should compile into the exact same code but be valid for function >>> pointers. >> >> I will do this change, as well as for the two others patches. > > Actually, kvm_available is already a function, not a define. > > kvm_enable can be change from define to function, but not in this patch, > because I don't use it. Oh no worries for stuff you don't use. As long as kvm_available is a function, just make sure that xen_available is one too :). The main incentive is to get rid of all those if (!x->available) parts, as we can guarantee there to always be a function behind. Treating tcg the same way would also help in the long term for non-tcg build of qemu which some people are interested in ;). Alex
Re: [Qemu-devel] [PATCH V6 04/15] Introduce -accel command option.
On Mon, 15 Nov 2010, Anthony PERARD wrote: > On Mon, 15 Nov 2010, Alexander Graf wrote: > > > > > On 21.10.2010, at 19:36, anthony.per...@citrix.com wrote: > > > > > From: Anthony PERARD > > > > > > This option gives the ability to switch one "accelerator" like kvm, xen > > > or the default one tcg. We can specify more than one accelerator by > > > separate them by a comma. QEMU will try each one and use the first whose > > > works. > > > > > > So, > > > > > > -accel xen,kvm,tcg > > > > > > which would try Xen support first, then KVM and finaly tcg if none of > > > the other works. > > > > > > Signed-off-by: Anthony PERARD > > > --- > > > qemu-options.hx | 10 ++ > > > vl.c| 86 > > > --- > > > 2 files changed, 85 insertions(+), 11 deletions(-) > > > > > > diff --git a/vl.c b/vl.c > > > index df414ef..40a26ee 100644 > > > --- a/vl.c > > > +++ b/vl.c > > > @@ -1750,6 +1750,74 @@ static int debugcon_parse(const char *devname) > > > return 0; > > > } > > > > > > +static struct { > > > +const char *opt_name; > > > +const char *name; > > > +int (*available)(void); > > > +int (*init)(int smp_cpus); > > > +int *allowed; > > > +} accel_list[] = { > > > +{ "tcg", "tcg", NULL, NULL, NULL }, > > > +{ "kvm", "KVM", kvm_available, kvm_init, &kvm_allowed }, > > > > Thinking about this a bit more ... > > > > kvm_available is a function pointer that gets #defined to (0) when we don't > > have KVM available. I can imagine some compiler might throw a warning on us > > for this one day. > > > > Is there a valid reason not to do > > > > static inline int kvm_enabled() > > { > > #ifdef CONFIG_KVM > > return kvm_allowed; > > #else > > return 0; > > #endif > > } > > > > That should compile into the exact same code but be valid for function > > pointers. > > I will do this change, as well as for the two others patches. Actually, kvm_available is already a function, not a define. kvm_enable can be change from define to function, but not in this patch, because I don't use it. -- Anthony PERARD
[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device
"Michael S. Tsirkin" wrote: > There's no reason for tap to run when VM is stopped. > If we let it, it confuses the bridge on TX > and corrupts DMA memory on RX. > > Signed-off-by: Michael S. Tsirkin once here, what handlers make sense to run while stopped? /me can think of the normal console, non live migration, loadvm and not much more. Perhaps it is easier to just move the other way around? Later, Juan.
[Qemu-devel] [PATCH 1/2] virtio-9p: Use chroot to safely access files in passthrough model
In passthrough security model, following symbolic links in the server side could result in accessing files outside guest's export path.This could happen under two conditions: 1) If a modified guest kernel is sending symbolic link as part of the file path and when resolving that symbolic link at server side, it could result in accessing files outside export path. 2) If a same path is exported to multiple guests and if guest1 tries to open a file "a/b/c/passwd" and meanwhile guest2 did this "rm -rf a/b/c; cd a/b; ln -s ../../etc c". If guest1 lookup happened and guest2 completed these operations just before guest1 opening the file, this operation could result in opening host's /etc/passwd. Following approach is used to avoid the security issue involved in following symbolic links in the passthrough model. Create a sub-process which will chroot into export path, so that even if there is a symbolic link in the path it could never go beyond the share path. When qemu is started with passthrough security model, a process is forked and this sub-process process takes care of accessing files in the passthrough share path. It does * Create socketpair * Chroot into share path * Read file open request from socket descriptor * Open request contains file path, flags, mode, uid, gid, dev etc * Based on the request type it creates/opens file/directory/device node * Return the file descriptor to main process using socket with SCM_RIGHTS as cmsg type. Main process when ever there is a request for a resource to be opened/created, it constructs the open request and writes that into its socket descriptor and reads from chroot process socket to get the file descriptor. This patch implements chroot enviroment, provides necessary functions that can be used by the passthrough function calls. Signed-off-by: M. Mohan Kumar --- Makefile.objs |1 + hw/file-op-9p.h |2 + hw/virtio-9p-chroot.c | 275 + hw/virtio-9p.c| 20 hw/virtio-9p.h| 21 5 files changed, 319 insertions(+), 0 deletions(-) create mode 100644 hw/virtio-9p-chroot.c diff --git a/Makefile.objs b/Makefile.objs index cd5a24b..134da8e 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -251,6 +251,7 @@ hw-obj-$(CONFIG_SOUND) += $(sound-obj-y) hw-obj-$(CONFIG_VIRTFS) += virtio-9p-debug.o virtio-9p-local.o virtio-9p-xattr.o hw-obj-$(CONFIG_VIRTFS) += virtio-9p-xattr-user.o virtio-9p-posix-acl.o +hw-obj-$(CONFIG_VIRTFS) += virtio-9p-chroot.o ## # libdis diff --git a/hw/file-op-9p.h b/hw/file-op-9p.h index c7731c2..149a915 100644 --- a/hw/file-op-9p.h +++ b/hw/file-op-9p.h @@ -55,6 +55,8 @@ typedef struct FsContext SecModel fs_sm; uid_t uid; struct xattr_operations **xops; +pthread_mutex_t chroot_mutex; +int chroot_socket; } FsContext; extern void cred_init(FsCred *); diff --git a/hw/virtio-9p-chroot.c b/hw/virtio-9p-chroot.c new file mode 100644 index 000..50e28a1 --- /dev/null +++ b/hw/virtio-9p-chroot.c @@ -0,0 +1,275 @@ +/* + * Virtio 9p chroot environment for secured access to exported file + * system + * + * Copyright IBM, Corp. 2010 + * + * Authors: + * M. Mohan Kumar + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the copying file in the top-level directory + * + */ + +#include "virtio.h" +#include "qemu_socket.h" +#include "qemu-thread.h" +#include "virtio-9p.h" +#include +#include + +/* Structure used to transfer file descriptor and error info to the main + * process. fd will be zero if there was an error(in this case error + * will hold the errno). error will be zero and fd will be a valid + * identifier when open was success + */ +typedef struct { +int fd; +int error; +} FdInfo; + +static int sendfd(int sockfd, FdInfo fd_info) +{ +struct msghdr msg = { }; +struct iovec iov; +union { +struct cmsghdr cmsg; +char control[CMSG_SPACE(sizeof(int))]; +} msg_control; +struct cmsghdr *cmsg; + +iov.iov_base = &fd_info; +iov.iov_len = sizeof(fd_info); + +memset(&msg, 0, sizeof(msg)); +msg.msg_iov = &iov; +msg.msg_iovlen = 1; +msg.msg_control = &msg_control; +msg.msg_controllen = sizeof(msg_control); + +cmsg = &msg_control.cmsg; +cmsg->cmsg_len = CMSG_LEN(sizeof(fd_info.fd)); +cmsg->cmsg_level = SOL_SOCKET; +cmsg->cmsg_type = SCM_RIGHTS; +memcpy(CMSG_DATA(cmsg), &fd_info.fd, sizeof(fd_info.fd)); + +return sendmsg(sockfd, &msg, 0); +} + +static int getfd(int sockfd, int *error) +{ +struct msghdr msg = { }; +struct iovec iov; +union { +struct cmsghdr cmsg; +char control[CMSG_SPACE(sizeof(int))]; +} msg_control; +struct cmsghdr *cmsg; +int retval, fd; +FdInfo fd_info; + +iov.iov_base = &fd_info; +iov.iov_len = sizeof(fd_info); + +memset(&msg, 0, sizeof(msg)); +msg.msg_iov =
[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device
On 11/15/2010 08:52 AM, Juan Quintela wrote: "Michael S. Tsirkin" wrote: There's no reason for tap to run when VM is stopped. If we let it, it confuses the bridge on TX and corrupts DMA memory on RX. Signed-off-by: Michael S. Tsirkin once here, what handlers make sense to run while stopped? /me can think of the normal console, non live migration, loadvm and not much more. Perhaps it is easier to just move the other way around? I'm not sure I concur that this is really a problem. Semantically, I don't think that stop has to imply that the guest memory no longer changes. Regards, Anthony Liguori Later, Juan.
[Qemu-devel] [PATCH 2/2] virtio-9p: Use chroot interface in passthrough model
Make use of chroot interfaces for passthrough security model to fix the vulnerability in following symbolic links. Signed-off-by: M. Mohan Kumar --- hw/virtio-9p-local.c | 284 ++ 1 files changed, 218 insertions(+), 66 deletions(-) diff --git a/hw/virtio-9p-local.c b/hw/virtio-9p-local.c index 656bfb3..4b72dec 100644 --- a/hw/virtio-9p-local.c +++ b/hw/virtio-9p-local.c @@ -19,16 +19,91 @@ #include #include #include +#include + +static int get_fd(FsContext *fs_ctx, const char *path, int flags, FsCred *credp) +{ +V9fsOpenRequest request; +int fd, error = 0; + +memset(&request, 0, sizeof(request)); +request.data.path_len = strlen(path); +request.path.path = qemu_strdup(path); +request.data.flags = flags; +if (credp) { +request.data.mode = credp->fc_mode; +request.data.uid = credp->fc_uid; +request.data.gid = credp->fc_gid; +request.data.dev = credp->fc_rdev; +} +fd = v9fs_getfd(&request, &error, fs_ctx); +if (error) { +errno = error; +} else { +errno = error; +} +qemu_strdup(request.path.path); +return fd; +} + +static int get_pfd(FsContext *fs_ctx, const char *path) +{ +V9fsOpenRequest request; +int fd, error = 0; +char *dpath = qemu_strdup(path); + +memset(&request, 0, sizeof(request)); +request.path.path = dirname(dpath); +request.data.path_len = strlen(request.path.path); +request.data.flags = O_RDONLY | O_DIRECTORY | O_NOFOLLOW; +fd = v9fs_getfd(&request, &error, fs_ctx); +if (error) { +errno = error; +} else { +errno = 0; +} +qemu_free(dpath); +return fd; +} +static int do_symlink(FsContext *fs_ctx, const char *oldpath, +const char *newpath, FsCred *credp) +{ +V9fsOpenRequest request; +int fd, error = 0; + +memset(&request, 0, sizeof(request)); +request.data.path_len = strlen(newpath); +request.path.path = qemu_strdup(newpath); +request.data.oldpath_len = strlen(oldpath); +request.path.old_path = qemu_strdup(oldpath); +request.data.flags = S_IFLNK | O_CREAT; + +if (credp) { +request.data.mode = credp->fc_mode; +request.data.uid = credp->fc_uid; +request.data.gid = credp->fc_gid; +request.data.dev = credp->fc_rdev; +} +fd = v9fs_getfd(&request, &error, fs_ctx); +if (error) { +errno = error; +} else { +errno = error; +} +qemu_strdup(request.path.path); +return fd; +} static int local_lstat(FsContext *fs_ctx, const char *path, struct stat *stbuf) { int err; -err = lstat(rpath(fs_ctx, path), stbuf); -if (err) { -return err; -} + if (fs_ctx->fs_sm == SM_MAPPED) { +err = lstat(rpath(fs_ctx, path), stbuf); +if (err) { +return err; +} /* Actual credentials are part of extended attrs */ uid_t tmp_uid; gid_t tmp_gid; @@ -50,6 +125,22 @@ static int local_lstat(FsContext *fs_ctx, const char *path, struct stat *stbuf) sizeof(dev_t)) > 0) { stbuf->st_rdev = tmp_dev; } +} else if (fs_ctx->fs_sm == SM_PASSTHROUGH) { +int pfd; +char *base, *basep; + +base = qemu_strdup(path); +basep = basename(base); + +pfd = get_pfd(fs_ctx, path); +err = fstatat(pfd, basep, stbuf, AT_SYMLINK_NOFOLLOW); +close(pfd); +free(base); +} else { +err = lstat(rpath(fs_ctx, path), stbuf); +if (err) { +return err; +} } return err; } @@ -88,21 +179,13 @@ static int local_set_xattr(const char *path, FsCred *credp) return 0; } -static int local_post_create_passthrough(FsContext *fs_ctx, const char *path, -FsCred *credp) +static int local_post_create_none(FsContext *fs_ctx, const char *path, +FsCred *credp) { if (chmod(rpath(fs_ctx, path), credp->fc_mode & 0) < 0) { return -1; } -if (lchown(rpath(fs_ctx, path), credp->fc_uid, credp->fc_gid) < 0) { -/* - * If we fail to change ownership and if we are - * using security model none. Ignore the error - */ -if (fs_ctx->fs_sm != SM_NONE) { -return -1; -} -} +lchown(rpath(fs_ctx, path), credp->fc_uid, credp->fc_gid); return 0; } @@ -121,9 +204,16 @@ static ssize_t local_readlink(FsContext *fs_ctx, const char *path, } while (tsize == -1 && errno == EINTR); close(fd); return tsize; -} else if ((fs_ctx->fs_sm == SM_PASSTHROUGH) || - (fs_ctx->fs_sm == SM_NONE)) { +} else if (fs_ctx->fs_sm == SM_NONE) { tsize = readlink(rpath(fs_ctx, path), buf, bufsz); +} else if (fs_ctx->fs_sm == SM_PASSTHROUGH) { +int pfd; +char *base; +base = qemu_strdup(path); +pfd =
[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device
On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote: > On 11/15/2010 08:52 AM, Juan Quintela wrote: > >"Michael S. Tsirkin" wrote: > >>There's no reason for tap to run when VM is stopped. > >>If we let it, it confuses the bridge on TX > >>and corrupts DMA memory on RX. > >> > >>Signed-off-by: Michael S. Tsirkin > >once here, what handlers make sense to run while stopped? > >/me can think of the normal console, non live migration, loadvm and not > >much more. Perhaps it is easier to just move the other way around? > > I'm not sure I concur that this is really a problem. > Semantically, I don't think that stop has to imply that the guest > memory no longer changes. > > Regards, > > Anthony Liguori > > >Later, Juan. Well, I do not really know about vmstop that is not for migration. For most vmstop calls are for migration. And there, the problems are very real. First, it's not just memory. At least for network transmit, sending out packets with the same MAC from two locations is a problem. See? For memory, it is much worse: any memory changes can either get discarded or not. This breaks consistency guarantees that guest relies upon. Imagine virtio index getting updated but content not being updated. See? -- MST
[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device
On Mon, Nov 15, 2010 at 03:52:54PM +0100, Juan Quintela wrote: > "Michael S. Tsirkin" wrote: > > There's no reason for tap to run when VM is stopped. > > If we let it, it confuses the bridge on TX > > and corrupts DMA memory on RX. > > > > Signed-off-by: Michael S. Tsirkin > > once here, what handlers make sense to run while stopped? > /me can think of the normal console, non live migration, loadvm and not > much more. Perhaps it is easier to just move the other way around? > > Later, Juan. vnc? SDL? Yes, more devices need to be stopped than not, but I tread carefully to avoid breaking existing functionality. If you could solve it for all devices in one swoop, that'd be great. I'm not up to it. -- MST
[Qemu-devel] [PATCH] Introduce -accel command option.
From: Anthony PERARD This option gives the ability to switch one "accelerator" like kvm, xen or the default one tcg. We can specify more than one accelerator by separate them by a comma. QEMU will try each one and use the first whose works. So, -accel xen,kvm,tcg which would try Xen support first, then KVM and finaly tcg if none of the other works. Signed-off-by: Anthony PERARD --- arch_init.c |5 +++ arch_init.h |1 + qemu-options.hx | 10 ++ vl.c| 85 +++--- 4 files changed, 90 insertions(+), 11 deletions(-) diff --git a/arch_init.c b/arch_init.c index 4486925..e0d7a4c 100644 --- a/arch_init.c +++ b/arch_init.c @@ -639,6 +639,11 @@ int audio_available(void) #endif } +int tcg_available(void) +{ +return 1; +} + int kvm_available(void) { #ifdef CONFIG_KVM diff --git a/arch_init.h b/arch_init.h index 682890c..f0fb6a0 100644 --- a/arch_init.h +++ b/arch_init.h @@ -27,6 +27,7 @@ void do_acpitable_option(const char *optarg); void do_smbios_option(const char *optarg); void cpudef_init(void); int audio_available(void); +int tcg_available(void); int kvm_available(void); int xen_available(void); diff --git a/qemu-options.hx b/qemu-options.hx index 4d99a58..958d126 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -1975,6 +1975,16 @@ Enable KVM full virtualization support. This option is only available if KVM support is enabled when compiling. ETEXI +DEF("accel", HAS_ARG, QEMU_OPTION_accel, \ +"-accel acceluse an accelerator (kvm,xen,tcg), default is tcg\n", QEMU_ARCH_ALL) +STEXI +...@item -accel @var{accel}[,@var{accel}[,...]] +...@findex -accel +This is use to enable an accelerator, in kvm,xen,tcg. +By default, it use only tcg. If there a more than one accelerator +specified, the next one is used if the first don't work. +ETEXI + DEF("xen-domid", HAS_ARG, QEMU_OPTION_xen_domid, "-xen-domid id specify xen guest domain id\n", QEMU_ARCH_ALL) DEF("xen-create", 0, QEMU_OPTION_xen_create, diff --git a/vl.c b/vl.c index c58583d..2917e32 100644 --- a/vl.c +++ b/vl.c @@ -242,6 +242,7 @@ static void *boot_set_opaque; static NotifierList exit_notifiers = NOTIFIER_LIST_INITIALIZER(exit_notifiers); +static int tcg_allowed = 1; int kvm_allowed = 0; uint32_t xen_domid; enum xen_mode xen_mode = XEN_EMULATE; @@ -1723,6 +1724,72 @@ static int debugcon_parse(const char *devname) return 0; } +static int tcg_init(int smp_cpus) +{ +return 0; +} + +static struct { +const char *opt_name; +const char *name; +int (*available)(void); +int (*init)(int smp_cpus); +int *allowed; +} accel_list[] = { +{ "tcg", "tcg", tcg_available, tcg_init, &tcg_allowed }, +{ "kvm", "KVM", kvm_available, kvm_init, &kvm_allowed }, +}; + +static int accel_parse_init(const char *opts) +{ +const char *p = opts; +char buf[10]; +int i, ret; +bool accel_initalised = 0; +bool init_failed = 0; + +while (!accel_initalised && *p != '\0') { +if (*p == ',') { +p++; +} +p = get_opt_name(buf, sizeof (buf), p, ','); +for (i = 0; i < ARRAY_SIZE(accel_list); i++) { +if (strcmp(accel_list[i].opt_name, buf) == 0) { +ret = accel_list[i].init(smp_cpus); +if (ret < 0) { +init_failed = 1; +if (!accel_list[i].available()) { +printf("%s not supported for this target\n", + accel_list[i].name); +} else { +fprintf(stderr, "failed to initialize %s: %s\n", +accel_list[i].name, +strerror(-ret)); +} +} else { +accel_initalised = 1; +*(accel_list[i].allowed) = 1; +} +break; +} +} +if (i == ARRAY_SIZE(accel_list)) { +fprintf(stderr, "\"%s\" accelerator does not exist.\n", buf); +} +} + +if (!accel_initalised) { +fprintf(stderr, "No accelerator found!\n"); +exit(1); +} + +if (init_failed) { +fprintf(stderr, "Back to %s accelerator.\n", accel_list[i].name); +} + +return !accel_initalised; +} + void qemu_add_exit_notifier(Notifier *notify) { notifier_list_add(&exit_notifiers, notify); @@ -1802,6 +1869,7 @@ int main(int argc, char **argv, char **envp) const char *incoming = NULL; int show_vnc_port = 0; int defconfig = 1; +const char *accel_list_opts = "tcg"; #ifdef CONFIG_SIMPLE_TRACE const char *trace_file = NULL; @@ -2409,7 +2477,10 @@ int main(int argc, char **argv, char **envp) do_smbios_option(optarg); break; case QEMU_OPTION_enable_kvm: -kvm_allowed = 1; +accel_list_opts = "kvm"; +
Re: [Qemu-devel] [PATCH] make trace options use autoconfy names
On 11/15/2010 03:17 PM, Stefan Hajnoczi wrote: On Sun, Nov 14, 2010 at 1:52 PM, Paolo Bonzini wrote: On 11/14/2010 02:38 PM, Andreas Färber wrote: - --trace-file=*) trace_file="$optarg" + --enable-trace-file=*) trace_file="$optarg" ;; but this should be --with-trace-file=... please. It is not being enabled, just set to a different value. --with-* should be reserved for library paths, but I can change it if people prefer it that way. Actually I think we have something similar to overriding --prefix here :). It's a path that you can set at ./configure time. Yeah, that's true. However... So is it not okay to use --trace-file=? ... Autoconf would not allow unknown options not starting with --enable- or --with-. The rationale to avoid incompatible options in QEMU is this: suppose you have a project using Autoconf (e.g. GCC) and you want to drop QEMU as a subdirectory in there, e.g. to run the GCC testsuite under QEMU usermode emulation (GCC can already do this for other simulators). To pass options to QEMU's configure, you can include them in GCC's commandline. The script will simply pass the option down to QEMU and it will be processed there. However, if you pass --trace-file to GCC's configure script, it will complain and stop. Probably I would use something like --enable-trace-backend=simple:trace- if I was adding something similar to an autoconfiscated project. But unless it provides some additional benefit (as is the case with cross-compilation support) I want to keep the syntactic changes in my patches to the minimum. But I know nothing of autoconf and --enable-* or --with-* sort of make sense too. Whatever, I have to repost the other series anyway, so I'll change to --with-. Paolo
[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device
On 11/15/2010 09:18 AM, Michael S. Tsirkin wrote: On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote: On 11/15/2010 08:52 AM, Juan Quintela wrote: "Michael S. Tsirkin" wrote: There's no reason for tap to run when VM is stopped. If we let it, it confuses the bridge on TX and corrupts DMA memory on RX. Signed-off-by: Michael S. Tsirkin once here, what handlers make sense to run while stopped? /me can think of the normal console, non live migration, loadvm and not much more. Perhaps it is easier to just move the other way around? I'm not sure I concur that this is really a problem. Semantically, I don't think that stop has to imply that the guest memory no longer changes. Regards, Anthony Liguori Later, Juan. Well, I do not really know about vmstop that is not for migration. They are separate. For instance, we don't rely on stop to pause pending disk I/O because we don't serialize pending disk I/O operations. Instead, we flush all pending I/O and rely on the fact that disk I/O requests are always submitted in the context of a vCPU operation. This assumption breaks down though with ioeventfd so we need to revisit it. For most vmstop calls are for migration. And there, the problems are very real. First, it's not just memory. At least for network transmit, sending out packets with the same MAC from two locations is a problem. See? I agree it's a problem but I'm not sure that just marking fd handlers really helps. Bottom halves can also trigger transmits. I think that if we put something in the network layer that just queued packets if the vm is stopped, it would be a more robust solution to the problem. For memory, it is much worse: any memory changes can either get discarded or not. This breaks consistency guarantees that guest relies upon. Imagine virtio index getting updated but content not being updated. See? If you suppress any I/O then the memory changes don't matter because the same changes will happen on the destination too. I think this basic problem is the same as Kemari. We can either attempt to totally freeze a guest which means stopping all callbacks that are device related or we can prevent I/O from happening which should introduce enough determinism to fix the problem in practice. Regards, Anthony Liguori
Re: [Qemu-devel] Re: [PATCH] pc: disable the BOCHS BIOS panic port
Am 15.11.2010 11:09, schrieb ext Alexander Graf: On 15.11.2010, at 10:53, Bernhard Kohl wrote: Am 01.09.2010 16:44, schrieb Bernhard Kohl: We have an OS which writes to port 0x400 when probing for special hardware. This causes an exit of the VM. With SeaBIOS this port isn't used anyway. Signed-off-by: Bernhard Kohl --- hw/pc.c |2 -- 1 files changed, 0 insertions(+), 2 deletions(-) diff --git a/hw/pc.c b/hw/pc.c index 69b13bf..3f81229 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -430,8 +430,6 @@ static void bochs_bios_write(void *opaque, uint32_t addr, uint32_t val) /* Bochs BIOS messages */ case 0x400: case 0x401: -fprintf(stderr, "BIOS panic at rombios.c, line %d\n", val); -exit(1); case 0x402: case 0x403: #ifdef DEBUG_BIOS Hi, could you please look at this? This patch makes that port that was silent before print debug output if DEBUG_BIOS is enabled which might be confusing. How about something like case 0x400: case 0x401: /* used to be panic, now unused */ break; Alex Yes, you are right. I will take your proposal in patch v2. Bernhard
[Qemu-devel] Re: [PATCH RESENT] msix: allow byte and word reading from mmio
Am 15.11.2010 11:42, schrieb ext Michael S. Tsirkin: On Thu, Aug 19, 2010 at 02:56:51PM +0200, Bernhard Kohl wrote: It's legal that the guest reads a single byte or word from mmio. Interesting. The spec seems to say this: For all accesses to MSI-X Table and MSI-X PBA fields, software must use aligned full DWORD or aligned full QWORD transactions; otherwise, the result is undefined. I will remove the first statement from the commit message and add something like the comment you proposed below. I have an OS which reads single bytes and it works fine on real hardware. Maybe this happens due to casting. What do you mean by casting? I did not yet locate the line of code where this happens in the guest. As all other accesses are dword, I assume that there is some masking, shifting or casting to a byte variable in the source code, and the compiler generates a byte access from this. Signed-off-by: Bernhard Kohl I guess we can merge this, but we need a comment I think since this seems to contradict the spec, and people were sending patches relying on this. Does something like the following describe the situation correctly? /* Note: PCI spec requires that all MSI-X table accesses are either DWORD or QWORD, size aligned. Some guests seem to violate this rule for read accesses, performing single byte reads. Since it's easy to support this, let's do so. Also support 16 bit size aligned reads, just in case. */ Yes, that's is exactly the situation with my guest. I will add this comment. Does you guest also do 16 bit reads? Are accesses at least aligned? I will check this with my guest and the readw function disabled in the patch. This will take some time. --- hw/msix.c | 20 1 files changed, 16 insertions(+), 4 deletions(-) diff --git a/hw/msix.c b/hw/msix.c index d99403a..7dac7f7 100644 --- a/hw/msix.c +++ b/hw/msix.c @@ -100,10 +100,22 @@ static uint32_t msix_mmio_readl(void *opaque, target_phys_addr_t addr) return pci_get_long(page + offset); } -static uint32_t msix_mmio_read_unallowed(void *opaque, target_phys_addr_t addr) +static uint32_t msix_mmio_readw(void *opaque, target_phys_addr_t addr) { -fprintf(stderr, "MSI-X: only dword read is allowed!\n"); -return 0; +PCIDevice *dev = opaque; +unsigned int offset = addr& (MSIX_PAGE_SIZE - 1)& ~0x1; +void *page = dev->msix_table_page; + +return pci_get_word(page + offset); +} + +static uint32_t msix_mmio_readb(void *opaque, target_phys_addr_t addr) +{ +PCIDevice *dev = opaque; +unsigned int offset = addr& (MSIX_PAGE_SIZE - 1); +void *page = dev->msix_table_page; + +return pci_get_byte(page + offset); } static uint8_t msix_pending_mask(int vector) @@ -198,7 +210,7 @@ static CPUWriteMemoryFunc * const msix_mmio_write[] = { }; static CPUReadMemoryFunc * const msix_mmio_read[] = { -msix_mmio_read_unallowed, msix_mmio_read_unallowed, msix_mmio_readl +msix_mmio_readb, msix_mmio_readw, msix_mmio_readl }; /* Should be called from device's map method. */ -- 1.7.2.1
[Qemu-devel] Re: [PATCH v3] virtio-9p: fix build on !CONFIG_UTIMENSAT
> This patch introduce a fallback mechanism for old systems that do not > support utimensat(). This fix build failure with following warnings: > > hw/virtio-9p-local.c: In function 'local_utimensat': > hw/virtio-9p-local.c:479: warning: implicit declaration of function > 'utimensat' hw/virtio-9p-local.c:479: warning: nested extern declaration > of 'utimensat' > > and: > > hw/virtio-9p.c: In function 'v9fs_setattr_post_chmod': > hw/virtio-9p.c:1410: error: 'UTIME_NOW' undeclared (first use in this > function) hw/virtio-9p.c:1410: error: (Each undeclared identifier is > reported only once hw/virtio-9p.c:1410: error: for each function it > appears in.) > hw/virtio-9p.c:1413: error: 'UTIME_OMIT' undeclared (first use in this > function) hw/virtio-9p.c: In function 'v9fs_wstat_post_chmod': > hw/virtio-9p.c:2905: error: 'UTIME_OMIT' undeclared (first use in this > function) > > v3: > - Use better alternative handling for UTIME_NOW/OMIT > - Move qemu_utimensat() to cutils.c > V2: > - Introduce qemu_utimensat() > > Signed-off-by: Hidetoshi Seto Looks good to me. Acked-by: M. Mohan Kumar
Re: [Qemu-devel] [PATCH 1/2] virtio-9p: Use chroot to safely access files in passthrough model
On Mon, Nov 15, 2010 at 2:52 PM, M. Mohan Kumar wrote: > In passthrough security model, following symbolic links in the server > side could result in accessing files outside guest's export path.This > could happen under two conditions: > 1) If a modified guest kernel is sending symbolic link as part of the > file path and when resolving that symbolic link at server side, it could > result in accessing files outside export path. > 2) If a same path is exported to multiple guests and if guest1 tries to > open a file "a/b/c/passwd" and meanwhile guest2 did this "rm -rf a/b/c; > cd a/b; ln -s ../../etc c". If guest1 lookup happened and guest2 > completed these operations just before guest1 opening the file, this > operation could result in opening host's /etc/passwd. > > Following approach is used to avoid the security issue involved in > following symbolic links in the passthrough model. Create a sub-process > which will chroot into export path, so that even if there is a symbolic > link in the path it could never go beyond the share path. > > When qemu is started with passthrough security model, a process is > forked and this sub-process process takes care of accessing files in the > passthrough share path. It does > * Create socketpair > * Chroot into share path > * Read file open request from socket descriptor > * Open request contains file path, flags, mode, uid, gid, dev etc > * Based on the request type it creates/opens file/directory/device node > * Return the file descriptor to main process using socket with > SCM_RIGHTS as cmsg type. > > Main process when ever there is a request for a resource to be > opened/created, it constructs the open request and writes that into > its socket descriptor and reads from chroot process socket to get the > file descriptor. How does the child process exit cleanly? If QEMU crashes will the child process be orphaned? > > This patch implements chroot enviroment, provides necessary functions > that can be used by the passthrough function calls. > > Signed-off-by: M. Mohan Kumar > --- > Makefile.objs | 1 + > hw/file-op-9p.h | 2 + > hw/virtio-9p-chroot.c | 275 > + > hw/virtio-9p.c | 20 > hw/virtio-9p.h | 21 > 5 files changed, 319 insertions(+), 0 deletions(-) > create mode 100644 hw/virtio-9p-chroot.c > > diff --git a/Makefile.objs b/Makefile.objs > index cd5a24b..134da8e 100644 > --- a/Makefile.objs > +++ b/Makefile.objs > @@ -251,6 +251,7 @@ hw-obj-$(CONFIG_SOUND) += $(sound-obj-y) > > hw-obj-$(CONFIG_VIRTFS) += virtio-9p-debug.o virtio-9p-local.o > virtio-9p-xattr.o > hw-obj-$(CONFIG_VIRTFS) += virtio-9p-xattr-user.o virtio-9p-posix-acl.o > +hw-obj-$(CONFIG_VIRTFS) += virtio-9p-chroot.o > > ## > # libdis > diff --git a/hw/file-op-9p.h b/hw/file-op-9p.h > index c7731c2..149a915 100644 > --- a/hw/file-op-9p.h > +++ b/hw/file-op-9p.h > @@ -55,6 +55,8 @@ typedef struct FsContext > SecModel fs_sm; > uid_t uid; > struct xattr_operations **xops; > + pthread_mutex_t chroot_mutex; > + int chroot_socket; > } FsContext; > > extern void cred_init(FsCred *); > diff --git a/hw/virtio-9p-chroot.c b/hw/virtio-9p-chroot.c > new file mode 100644 > index 000..50e28a1 > --- /dev/null > +++ b/hw/virtio-9p-chroot.c > @@ -0,0 +1,275 @@ > +/* > + * Virtio 9p chroot environment for secured access to exported file > + * system > + * > + * Copyright IBM, Corp. 2010 > + * > + * Authors: > + * M. Mohan Kumar > + * > + * This work is licensed under the terms of the GNU GPL, version 2. See > + * the copying file in the top-level directory > + * > + */ > + > +#include "virtio.h" > +#include "qemu_socket.h" > +#include "qemu-thread.h" > +#include "virtio-9p.h" > +#include > +#include > + > +/* Structure used to transfer file descriptor and error info to the main > + * process. fd will be zero if there was an error(in this case error > + * will hold the errno). error will be zero and fd will be a valid > + * identifier when open was success > + */ > +typedef struct { > + int fd; > + int error; > +} FdInfo; > + > +static int sendfd(int sockfd, FdInfo fd_info) > +{ > + struct msghdr msg = { }; > + struct iovec iov; > + union { > + struct cmsghdr cmsg; > + char control[CMSG_SPACE(sizeof(int))]; > + } msg_control; > + struct cmsghdr *cmsg; > + > + iov.iov_base = &fd_info; > + iov.iov_len = sizeof(fd_info); > + > + memset(&msg, 0, sizeof(msg)); > + msg.msg_iov = &iov; > + msg.msg_iovlen = 1; > + msg.msg_control = &msg_control; > + msg.msg_controllen = sizeof(msg_control); > + > + cmsg = &msg_control.cmsg; > + cmsg->cmsg_len = CMSG_LEN(sizeof(fd_info.fd)); > + cmsg->cmsg_level = SOL_SOCKET; > + cmsg->cmsg_type = SCM_RIGHTS; > + memcpy(CMSG_DATA(cmsg), &fd_info.fd, sizeof(fd_info.fd)); > + > + return sendmsg(sockfd, &msg, 0
Re: [Qemu-devel] [PATCH] Introduce -accel command option.
On 11/15/2010 09:45 AM, anthony.per...@citrix.com wrote: From: Anthony PERARD This option gives the ability to switch one "accelerator" like kvm, xen or the default one tcg. We can specify more than one accelerator by separate them by a comma. QEMU will try each one and use the first whose works. So, -accel xen,kvm,tcg which would try Xen support first, then KVM and finaly tcg if none of the other works. Should use QemuOpts instead of parsing by hand. I'd rather it be presented as a -machine option too with accel=xen:kvm:tcg to specify order. Regards, Anthony Liguori Signed-off-by: Anthony PERARD --- arch_init.c |5 +++ arch_init.h |1 + qemu-options.hx | 10 ++ vl.c| 85 +++--- 4 files changed, 90 insertions(+), 11 deletions(-) diff --git a/arch_init.c b/arch_init.c index 4486925..e0d7a4c 100644 --- a/arch_init.c +++ b/arch_init.c @@ -639,6 +639,11 @@ int audio_available(void) #endif } +int tcg_available(void) +{ +return 1; +} + int kvm_available(void) { #ifdef CONFIG_KVM diff --git a/arch_init.h b/arch_init.h index 682890c..f0fb6a0 100644 --- a/arch_init.h +++ b/arch_init.h @@ -27,6 +27,7 @@ void do_acpitable_option(const char *optarg); void do_smbios_option(const char *optarg); void cpudef_init(void); int audio_available(void); +int tcg_available(void); int kvm_available(void); int xen_available(void); diff --git a/qemu-options.hx b/qemu-options.hx index 4d99a58..958d126 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -1975,6 +1975,16 @@ Enable KVM full virtualization support. This option is only available if KVM support is enabled when compiling. ETEXI +DEF("accel", HAS_ARG, QEMU_OPTION_accel, \ +"-accel acceluse an accelerator (kvm,xen,tcg), default is tcg\n", QEMU_ARCH_ALL) +STEXI +...@item -accel @var{accel}[,@var{accel}[,...]] +...@findex -accel +This is use to enable an accelerator, in kvm,xen,tcg. +By default, it use only tcg. If there a more than one accelerator +specified, the next one is used if the first don't work. +ETEXI + DEF("xen-domid", HAS_ARG, QEMU_OPTION_xen_domid, "-xen-domid id specify xen guest domain id\n", QEMU_ARCH_ALL) DEF("xen-create", 0, QEMU_OPTION_xen_create, diff --git a/vl.c b/vl.c index c58583d..2917e32 100644 --- a/vl.c +++ b/vl.c @@ -242,6 +242,7 @@ static void *boot_set_opaque; static NotifierList exit_notifiers = NOTIFIER_LIST_INITIALIZER(exit_notifiers); +static int tcg_allowed = 1; int kvm_allowed = 0; uint32_t xen_domid; enum xen_mode xen_mode = XEN_EMULATE; @@ -1723,6 +1724,72 @@ static int debugcon_parse(const char *devname) return 0; } +static int tcg_init(int smp_cpus) +{ +return 0; +} + +static struct { +const char *opt_name; +const char *name; +int (*available)(void); +int (*init)(int smp_cpus); +int *allowed; +} accel_list[] = { +{ "tcg", "tcg", tcg_available, tcg_init,&tcg_allowed }, +{ "kvm", "KVM", kvm_available, kvm_init,&kvm_allowed }, +}; + +static int accel_parse_init(const char *opts) +{ +const char *p = opts; +char buf[10]; +int i, ret; +bool accel_initalised = 0; +bool init_failed = 0; + +while (!accel_initalised&& *p != '\0') { +if (*p == ',') { +p++; +} +p = get_opt_name(buf, sizeof (buf), p, ','); +for (i = 0; i< ARRAY_SIZE(accel_list); i++) { +if (strcmp(accel_list[i].opt_name, buf) == 0) { +ret = accel_list[i].init(smp_cpus); +if (ret< 0) { +init_failed = 1; +if (!accel_list[i].available()) { +printf("%s not supported for this target\n", + accel_list[i].name); +} else { +fprintf(stderr, "failed to initialize %s: %s\n", +accel_list[i].name, +strerror(-ret)); +} +} else { +accel_initalised = 1; +*(accel_list[i].allowed) = 1; +} +break; +} +} +if (i == ARRAY_SIZE(accel_list)) { +fprintf(stderr, "\"%s\" accelerator does not exist.\n", buf); +} +} + +if (!accel_initalised) { +fprintf(stderr, "No accelerator found!\n"); +exit(1); +} + +if (init_failed) { +fprintf(stderr, "Back to %s accelerator.\n", accel_list[i].name); +} + +return !accel_initalised; +} + void qemu_add_exit_notifier(Notifier *notify) { notifier_list_add(&exit_notifiers, notify); @@ -1802,6 +1869,7 @@ int main(int argc, char **argv, char **envp) const char *incoming = NULL; int show_vnc_port = 0; int defconfig = 1; +const char *accel_list_opts = "tcg"; #ifdef CONFIG_SIMPLE_TRACE const char *trace_file = NULL;
[Qemu-devel] Re: [PATCH] ceph/rbd block driver for qemu-kvm (v7)
Hi Stefan, thanks for your feedback. Yehuda and Sage have already committed some pathes to our git repository. What I'm not sure about is the rados_(de)initialization for multiple rbd images. I suspect that _deinitialize should only be called for the last rbd image. Yehuda and Sage know librados a lot better than me. I pretty sure, that they will give some feedback about this remaining issue. After that we will send an updated patch. Regards, Christian 2010/11/11 Stefan Hajnoczi : > On Fri, Oct 15, 2010 at 8:54 PM, Christian Brunner wrote: >> [...] >> + >> + if ((r = rados_initialize(0, NULL)) < 0) { >> + error_report("error initializing"); >> + return r; >> + } > > Does rados_initialize() work when called multiple times? This would happen if > the VM has several rbd devices attached. > > [...] > > Stefan > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >
[Qemu-devel] Re: Where's gpxe-eepro100-80862449.rom ?
Am 25.10.2010 18:54, schrieb Michael S. Tsirkin: On Mon, Oct 25, 2010 at 06:23:24PM +0200, Stefan Weil wrote: Am 25.10.2010 14:11, schrieb Markus Armbruster: Stefan Weil writes: Am 13.10.2010 09:13, schrieb Markus Armbruster: Stefan Weil writes: [...] Do you think there is urgent need for a gpxe-eepro100-80862449.rom binary? Well, "-device i82801" complains because it misses this binary. Do we want to ship that way? I just sent two patches which create the rom data on load. So -device i82801 no longer complains but gets the boot information from dhcp (tested with i386-softmmu). Your build tree will be made smaller by at least 50 KB :-) Do you think these patches should be added to stable-0.13, too? If the gain for "-device i82801" is worth the risk, which depends on the patch. Have we reached consensus on how to fix it in master? The latest patch version fixes rom data only for the default roms, so the risk is minimized and full tests are possible. There is still no consensus whether we need a new qdev property (which allows users to have their rom data fixed, too) or not. My patch does not prevent adding that new qdev property, so I suggest applying my patch now and adding the property later (if it is ever needed). Regards Stefan Fair enough, I guess. So is there consensus that both patches can be committed to qemu master?
[Qemu-devel] Re: [PATCH] make-release: fix mtime for a wider range of git versions
Am 15.11.2010 12:48, schrieb Bernhard Kohl: With the latest git versions, e.g. 1.7.2.3, git still prints out the tag info in addition to the requested format. So let's simply fetch the first line from the output. In addition I use the --pretty option instead of --format which is not recognized in very old git versions, e.g. 1.5.5.6. Tested with git versions 1.5.5.6 and 1.7.2.3. Signed-off-by: Bernhard Kohl Sorry, I sent this to the wrong list. Resent to kvm! Bernhard --- kvm/scripts/make-release |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/kvm/scripts/make-release b/kvm/scripts/make-release index 56302c3..2d050fc 100755 --- a/kvm/scripts/make-release +++ b/kvm/scripts/make-release @@ -51,7 +51,7 @@ cd "$(dirname "$0")"/../.. mkdir -p "$(dirname "$tarball")" git archive --prefix="$name/" --format=tar "$commit"> "$tarball" -mtime=`git show --format=%ct "$commit""^{commit}" --` +mtime=`git show --pretty=format:%ct "$commit""^{commit}" -- | head -n 1` tarargs="--owner=root --group=root" mkdir -p "$tmpdir/$name"
[Qemu-devel] Re: [PATCH] ceph/rbd block driver for qemu-kvm (v7)
On Mon, Nov 15, 2010 at 9:04 AM, Christian Brunner wrote: > Hi Stefan, > > thanks for your feedback. Yehuda and Sage have already committed some > pathes to our git repository. > > What I'm not sure about is the rados_(de)initialization for multiple > rbd images. I suspect that _deinitialize should only be called for the > last rbd image. > > Yehuda and Sage know librados a lot better than me. I pretty sure, > that they will give some feedback about this remaining issue. After > that we will send an updated patch. > > Regards, > Christian > > 2010/11/11 Stefan Hajnoczi : >> On Fri, Oct 15, 2010 at 8:54 PM, Christian Brunner wrote: >>> [...] >>> + >>> + if ((r = rados_initialize(0, NULL)) < 0) { >>> + error_report("error initializing"); >>> + return r; >>> + } >> >> Does rados_initialize() work when called multiple times? This would happen >> if >> the VM has several rbd devices attached. >> >> [...] >> >> Stefan >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > The rados (de)initialization is refcounted and it is safe to call it multiple times.
[Qemu-devel] [Bug 611142] Re: seabios should have native scsi support
(Clearly this must be done upstream first, so marking Triaged for the Ubuntu task) ** Changed in: qemu-kvm (Ubuntu) Status: New => Triaged -- seabios should have native scsi support https://bugs.launchpad.net/bugs/611142 You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. Status in QEMU: New Status in “qemu-kvm” package in Ubuntu: Triaged Status in “seabios” package in Ubuntu: New Bug description: Binary package hint: seabios Currently when a grub multiboot image is booted with 'kvm -kernel' and 'biosdisk' module, it will see block devices of type IDE or virtio. It will not see scsi devices. To demonstrate this: $ qemu-img create -f qcow2 disk.img 1G $ grub-mkrescue --output=rescue.iso $ grub-mkimage -O i386-pc --output=grub-mb.img biosdisk minicmd part_msdos An 'ls' inside the grub prompt will show hard disks (hd0) on if: a.) -drive uses interface of virtio or scsi or b.) kvm boots boot from a cdrom or floppy For example, with these commands, grub will see a '(hd0)' $ kvm -drive file=disk.img,if=scsi,boot=on -cdrom rescue.iso -boot d $ kvm -drive file=disk.img,if=scsi,boot=on -floppy rescue.iso -boot a $ kvm -drive file=disk.img,if=virtio,boot=on -cdrom rescue.iso -boot d $ kvm -drive file=disk.img,if=ide,boot=on -cdrom rescue.iso -boot d $ kvm -drive file=disk.img,if=virtio,boot=on -kernel grub-mb.img $ kvm -drive file=disk.img,if=ide,boot=on -kernel grub-mb.img But the following will not: $ kvm -drive file=disk.img,if=scsi,boot=on -kernel grub-mb.img ProblemType: Bug DistroRelease: Ubuntu 10.10 Package: seabios 0.6.0-0ubuntu1 ProcVersionSignature: User Name 2.6.32-305.9-ec2 2.6.32.11+drm33.2 Uname: Linux 2.6.32-305-ec2 i686 Architecture: i386 Date: Thu Jul 29 03:21:21 2010 Dependencies: Ec2AMI: ami-e930db80 Ec2AMIManifest: ubuntu-images-testing-us/ubuntu-maverick-daily-i386-server-20100727.manifest.xml Ec2AvailabilityZone: us-east-1b Ec2InstanceType: m1.small Ec2Kernel: aki-407d9529 Ec2Ramdisk: unavailable PackageArchitecture: all ProcEnviron: PATH=(custom, user) LANG=en_US.UTF-8 SHELL=/bin/bash SourcePackage: seabios
[Qemu-devel] [Bug 241119] Re: usb_add of a Creative ZEN unrecognized in guest
Hi. I have a similar problem, with a simple JetFlash usb drive. After adding it with usb_add, "info usb" shows nothing and I start getting the following message in the console: husb: config #1 need -1 husb: 1 interfaces claimed for configuration 1 husb: grabbed usb device 1.3 usb_linux_update_endp_table: Protocol error This is under Maverick using qemu-kvm 0.12.5+noroms-0ubuntu7 The same operation works fine in the same pc under my gentoo system, currently running qemu-kvm-0.13.0, but it has been working now for a year, without ever giving me problems. -- usb_add of a Creative ZEN unrecognized in guest https://bugs.launchpad.net/bugs/241119 You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. Status in QEMU: Incomplete Status in “qemu-kvm” package in Ubuntu: Confirmed Bug description: Binary package hint: kvm This happens when I add my Creative ZEN to a virtual machine running XP. The device is recognised well at first and drivers are installed correctly. But when trying to connect windows crashes with the classic blue screen It complains about something like usbohci.sys, I can't read well because it crashes too fast. I have also tried with another virtual machine running Vista, same results. Any help would be really appreciated! I'm using the module kvm-amd with Ubuntu 8.04 The USB device has the following ID: 041e:4157 Creative Technology, Ltd kvm: Installed: 1:62+dfsg-0ubuntu7 Candidate: 1:62+dfsg-0ubuntu7 Version table: *** 1:62+dfsg-0ubuntu7 0 500 http://archive.ubuntu.com hardy/main Packages 100 /var/lib/dpkg/status
[Qemu-devel] Re: [PATCH] pci: allow hotplug removal of cold-plugged devices
On Sun, Nov 14, 2010 at 7:18 AM, Michael S. Tsirkin wrote: > This patch fixes 5beb8ad503c88a76f2b8106c3b74b4ce485a60e1 > which broke hotplug removal of cold plugged devices: > > - pass addition/removal state to hotplug callbacks > - use that in piix and pcie > > This also fixes an assert on hotplug removal of coldplugged > express devices. > > Reported-by: by Cam Macdonell . > Signed-off-by: Isaku Yamahata > Signed-off-by: Michael S. Tsirkin Acked-by: Cam Macdonell > --- > > So I think the below would be the cleanest way > to fix the bug as we keep the hot/cold plug logic > in a central palce. Untested. Comments? Cam? Yes, it seems to fix the problem. > > > diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c > index 66c7885..f549089 100644 > --- a/hw/acpi_piix4.c > +++ b/hw/acpi_piix4.c > @@ -585,7 +585,8 @@ static void pciej_write(void *opaque, uint32_t addr, > uint32_t val) > PIIX4_DPRINTF("pciej write %x <== %d\n", addr, val); > } > > -static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev, int > state); > +static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev, > + PCIHotplugState state); > > static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s) > { > @@ -615,18 +616,23 @@ static void disable_device(PIIX4PMState *s, int slot) > s->pci0_status.down |= (1 << slot); > } > > -static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev, int state) > +static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev, > + PCIHotplugState state) > { > int slot = PCI_SLOT(dev->devfn); > PIIX4PMState *s = DO_UPCAST(PIIX4PMState, dev, > DO_UPCAST(PCIDevice, qdev, qdev)); > > - if (!dev->qdev.hotplugged) > + /* Don't send event when device is enabled during qemu machine creation: > + * it is present on boot, no hotplug event is necessary. We do send an > + * event when the device is disabled later. */ > + if (state == PCI_COLDPLUG_ENABLED) { > return 0; > + } > > s->pci0_status.up = 0; > s->pci0_status.down = 0; > - if (state) { > + if (state == PCI_HOTPLUG_ENABLED) { > enable_device(s, slot); > } else { > disable_device(s, slot); > diff --git a/hw/pci.c b/hw/pci.c > index 30e1603..316b24f 100644 > --- a/hw/pci.c > +++ b/hw/pci.c > @@ -1566,8 +1566,11 @@ static int pci_qdev_init(DeviceState *qdev, DeviceInfo > *base) > pci_add_option_rom(pci_dev); > > if (bus->hotplug) { > - /* lower layer must check qdev->hotplugged */ > - rc = bus->hotplug(bus->hotplug_qdev, pci_dev, 1); > + /* Let buses differentiate between hotplug and when device is > + * enabled during qemu machine creation. */ > + rc = bus->hotplug(bus->hotplug_qdev, pci_dev, > + qdev->hotplugged ? PCI_HOTPLUG_ENABLED: > + PCI_COLDPLUG_ENABLED); > if (rc != 0) { > int r = pci_unregister_device(&pci_dev->qdev); > assert(!r); > @@ -1581,7 +1584,8 @@ static int pci_unplug_device(DeviceState *qdev) > { > PCIDevice *dev = DO_UPCAST(PCIDevice, qdev, qdev); > > - return dev->bus->hotplug(dev->bus->hotplug_qdev, dev, 0); > + return dev->bus->hotplug(dev->bus->hotplug_qdev, dev, > + PCI_HOTPLUG_DISABLED); > } > > void pci_qdev_register(PCIDeviceInfo *info) > diff --git a/hw/pci.h b/hw/pci.h > index 7100804..09b3e4c 100644 > --- a/hw/pci.h > +++ b/hw/pci.h > @@ -214,7 +214,15 @@ int pci_device_load(PCIDevice *s, QEMUFile *f); > > typedef void (*pci_set_irq_fn)(void *opaque, int irq_num, int level); > typedef int (*pci_map_irq_fn)(PCIDevice *pci_dev, int irq_num); > -typedef int (*pci_hotplug_fn)(DeviceState *qdev, PCIDevice *pci_dev, int > state); > + > +typedef enum { > + PCI_HOTPLUG_DISABLED, > + PCI_HOTPLUG_ENABLED, > + PCI_COLDPLUG_ENABLED, > +} PCIHotplugState; > + > +typedef int (*pci_hotplug_fn)(DeviceState *qdev, PCIDevice *pci_dev, > + PCIHotplugState state); > void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent, > const char *name, int devfn_min); > PCIBus *pci_bus_new(DeviceState *parent, const char *name, int devfn_min); > diff --git a/hw/pcie.c b/hw/pcie.c > index 35918f7..4df48b8 100644 > --- a/hw/pcie.c > +++ b/hw/pcie.c > @@ -192,14 +192,16 @@ static void pcie_cap_slot_event(PCIDevice *dev, > PCIExpressHotPlugEvent event) > } > > static int pcie_cap_slot_hotplug(DeviceState *qdev, > - PCIDevice *pci_dev, int state) > + PCIDevice *pci_dev, PCIHotplugState state) > { > PCIDevice *d = DO_UPCAST(PCIDevice, qdev, qdev); > uint8_t *exp_cap = d->config + d->exp.exp_cap; > uint16_t sltsta = pci_get_word(exp_cap + PCI_EXP_SLTSTA); > > - if (!pci_dev->qdev.hotplugged) { > - assert(state); /* this case only happe
[Qemu-devel] [PATCH 0/3] v11: Threadlets: A generic task offloading framework
Hi, This is the v11 of the refactored patch-series to have a generic asynchronous task offloading framework (called threadlets) within qemu. I have run KVM autotest suite with this patch. This test suite ran successfully for the following tests: -connecthon -ebizzy -dbench -fsstress -disktest -cpu_hotplug -hackbench -sleeptest Changelog: * Moved the qemu_cond_broadcast to the right place. * Removed unnecessary extern qualifiers. The following series implements... --- Arun R Bharadwaj (1): Move threadlets infrastructure to qemu-threadlets.c Gautham R Shenoy (2): Make paio subsystem use threadlets infrastructure Add helper functions to enable virtio-9p make use of the threadlets Makefile.objs |3 - configure |2 docs/async-support.txt | 141 ++ hw/virtio-9p.c | 164 ++- posix-aio-compat.c | 227 +++- qemu-threadlets.c | 189 qemu-threadlets.h | 47 ++ vl.c |3 + 8 files changed, 599 insertions(+), 177 deletions(-) create mode 100644 docs/async-support.txt create mode 100644 qemu-threadlets.c create mode 100644 qemu-threadlets.h -- arun
[Qemu-devel] [PATCH 1/3] Make paio subsystem use threadlets infrastructure
From: Gautham R Shenoy This patch creates a generic asynchronous-task-offloading infrastructure named threadlets. The patch creates a global queue on-to which subsystems can queue their tasks to be executed asynchronously. The patch also provides API's that allow a subsystem to create a private queue with an associated pool of threads. The patch has been tested with fstress. Signed-off-by: Arun R Bharadwaj Signed-off-by: Aneesh Kumar K.V Signed-off-by: Gautham R Shenoy Signed-off-by: Sripathi Kodi --- Makefile.objs |2 configure |2 posix-aio-compat.c | 354 +++- 3 files changed, 211 insertions(+), 147 deletions(-) diff --git a/Makefile.objs b/Makefile.objs index cd5a24b..3b7ec27 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -9,6 +9,7 @@ qobject-obj-y += qerror.o block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o +block-obj-$(CONFIG_POSIX) += qemu-thread.o block-obj-$(CONFIG_POSIX) += posix-aio-compat.o block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o @@ -124,7 +125,6 @@ endif common-obj-y += $(addprefix ui/, $(ui-obj-y)) common-obj-y += iov.o acl.o -common-obj-$(CONFIG_THREAD) += qemu-thread.o common-obj-y += notify.o event_notifier.o common-obj-y += qemu-timer.o diff --git a/configure b/configure index a079a49..addf733 100755 --- a/configure +++ b/configure @@ -2456,7 +2456,6 @@ if test "$vnc_png" != "no" ; then fi if test "$vnc_thread" != "no" ; then echo "CONFIG_VNC_THREAD=y" >> $config_host_mak - echo "CONFIG_THREAD=y" >> $config_host_mak fi if test "$fnmatch" = "yes" ; then echo "CONFIG_FNMATCH=y" >> $config_host_mak @@ -2534,7 +2533,6 @@ if test "$xen" = "yes" ; then fi if test "$io_thread" = "yes" ; then echo "CONFIG_IOTHREAD=y" >> $config_host_mak - echo "CONFIG_THREAD=y" >> $config_host_mak fi if test "$linux_aio" = "yes" ; then echo "CONFIG_LINUX_AIO=y" >> $config_host_mak diff --git a/posix-aio-compat.c b/posix-aio-compat.c index 7b862b5..e1812fc 100644 --- a/posix-aio-compat.c +++ b/posix-aio-compat.c @@ -29,7 +29,33 @@ #include "block_int.h" #include "block/raw-posix-aio.h" +#include "qemu-thread.h" +#define MAX_GLOBAL_THREADS 64 +#define MIN_GLOBAL_THREADS 8 + +static QemuMutex aiocb_mutex; +static QemuCond aiocb_completion; + +typedef struct ThreadletQueue +{ +QemuMutex lock; +QemuCond cond; +int max_threads; +int min_threads; +int cur_threads; +int idle_threads; +QTAILQ_HEAD(, ThreadletWork) request_list; +} ThreadletQueue; + +typedef struct ThreadletWork +{ +QTAILQ_ENTRY(ThreadletWork) node; +void (*func)(struct ThreadletWork *work); +} ThreadletWork; + +static ThreadletQueue globalqueue; +static int globalqueue_init; struct qemu_paiocb { BlockDriverAIOCB common; @@ -44,13 +70,12 @@ struct qemu_paiocb { int ev_signo; off_t aio_offset; -QTAILQ_ENTRY(qemu_paiocb) node; int aio_type; ssize_t ret; -int active; struct qemu_paiocb *next; int async_context_id; +ThreadletWork work; }; typedef struct PosixAioState { @@ -58,64 +83,169 @@ typedef struct PosixAioState { struct qemu_paiocb *first_aio; } PosixAioState; +static void *threadlet_worker(void *data) +{ +ThreadletQueue *queue = data; -static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; -static pthread_cond_t cond = PTHREAD_COND_INITIALIZER; -static pthread_t thread_id; -static pthread_attr_t attr; -static int max_threads = 64; -static int cur_threads = 0; -static int idle_threads = 0; -static QTAILQ_HEAD(, qemu_paiocb) request_list; +qemu_mutex_lock(&queue->lock); +while (1) { +ThreadletWork *work; +int ret = 0; + +while (QTAILQ_EMPTY(&queue->request_list) && + (ret != ETIMEDOUT)) { +/* wait for cond to be signalled or broadcast for 1000s */ +ret = qemu_cond_timedwait((&queue->cond), + &(queue->lock), 10*10); +} -#ifdef CONFIG_PREADV -static int preadv_present = 1; -#else -static int preadv_present = 0; -#endif +assert(queue->idle_threads != 0); +if (QTAILQ_EMPTY(&queue->request_list)) { +if (queue->cur_threads > queue->min_threads) { +/* We retain the minimum number of threads */ +break; +} +} else { +work = QTAILQ_FIRST(&queue->request_list); +QTAILQ_REMOVE(&queue->request_list, work, node); -static void die2(int err, const char *what) -{ -fprintf(stderr, "%s failed: %s\n", what, strerror(err)); -abort(); +queue->idle_threads--; +qemu_mutex_unlock(&queue->lock); + +/* execute the work function */ +work->func(work); + +qemu_mutex_lock(&queue->lock); +queue->idle_threads++; +} +} + +queue->idle_th
[Qemu-devel] [PATCH 2/3] Move threadlets infrastructure to qemu-threadlets.c
The reason for creating this generic infrastructure is so that other subsystems, such as virtio-9p could make use of it for offloading tasks that could block. Signed-off-by: Arun R Bharadwaj Signed-off-by: Aneesh Kumar K.V Signed-off-by: Gautham R Shenoy Signed-off-by: Sripathi Kodi Acked-by: Stefan Hajnoczi --- Makefile.objs |1 docs/async-support.txt | 141 +++ posix-aio-compat.c | 173 qemu-threadlets.c | 168 +++ qemu-threadlets.h | 46 + 5 files changed, 357 insertions(+), 172 deletions(-) create mode 100644 docs/async-support.txt create mode 100644 qemu-threadlets.c create mode 100644 qemu-threadlets.h diff --git a/Makefile.objs b/Makefile.objs index 3b7ec27..2cf8aba 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -10,6 +10,7 @@ qobject-obj-y += qerror.o block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o block-obj-$(CONFIG_POSIX) += qemu-thread.o +block-obj-$(CONFIG_POSIX) += qemu-threadlets.o block-obj-$(CONFIG_POSIX) += posix-aio-compat.o block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o diff --git a/docs/async-support.txt b/docs/async-support.txt new file mode 100644 index 000..9f22b9a --- /dev/null +++ b/docs/async-support.txt @@ -0,0 +1,141 @@ +== How to use the threadlets infrastructure supported in Qemu == + +== Threadlets == + +Q.1: What are threadlets ? +A.1: Threadlets is an infrastructure within QEMU that allows other subsystems + to offload possibly blocking work to a queue to be processed by a pool + of threads asynchronously. + +Q.2: When would one want to use threadlets ? +A.2: Threadlets are useful when there are operations that can be performed + outside the context of the VCPU/IO threads inorder to free these latter + to service any other guest requests. + +Q.3: I have some work that can be executed in an asynchronous context. How + should I go about it ? +A.3: One could follow the steps listed below: + + - Define a function which would do the asynchronous work. + static void my_threadlet_func(ThreadletWork *work) + { + } + + - Declare an object of type ThreadletWork; + ThreadletWork work; + + + - Assign a value to the "func" member of ThreadletWork object. + work.func = my_threadlet_func; + + - Submit the threadlet to the global queue. + submit_threadletwork(&work); + + - Continue servicing some other guest operations. + +Q.4: I want to my_threadlet_func to access some non-global data. How do I do + that ? +A.4: Suppose you want my_threadlet_func to access some non-global data-object + of type myPrivateData. In that case one could follow the following steps. + + - Define a member of the type ThreadletWork within myPrivateData. + typedef struct MyPrivateData { + ...; + ...; + ...; + ThreadletWork work; + } MyPrivateData; + + MyPrivateData my_data; + + - Initialize myData.work as described in A.3 + myData.work.func = my_threadlet_func; + submit_threadletwork(&myData.work); + + - Access the myData object inside my_threadlet_func() using container_of + primitive + static void my_threadlet_func(ThreadletWork *work) + { + myPrivateData *mydata_ptr; + mydata_ptr = container_of(work, myPrivateData, work); + + /* mydata_ptr now points to myData object */ + } + +Q.5: Are there any precautions one must take while sharing data with the + Asynchronous thread-pool ? +A.5: Yes, make sure that the helper function of the type my_threadlet_func() + does not access/modify data when it can be accessed or modified in the + context of VCPU thread or IO thread. This is because the asynchronous + threads in the pool can run in parallel with the VCPU/IOThreads as shown + in the figure. + + A typical workflow is as follows: + + VCPU/IOThread + | + | (1) + | + V +Offload work (2) + |---> to threadlets -> Helper thread + || | + || | + || (3) | (4) + || | + | Handle other Guest requests| + || | + || V + || (3) Signal the I/O Thread + |(6) |
[Qemu-devel] [PATCH 3/3] Add helper functions to enable virtio-9p make use of the threadlets
From: Gautham R Shenoy infrastructure for offloading blocking tasks such as making posix calls on to the helper threads and handle the post_posix_operations() from the context of the iothread. This frees the vcpu thread to process any other guest operations while the processing of v9fs_io is in progress. Signed-off-by: Gautham R Shenoy Signed-off-by: Sripathi Kodi Signed-off-by: Arun R Bharadwaj --- hw/virtio-9p.c | 164 posix-aio-compat.c | 30 +++--- qemu-threadlets.c | 21 +++ qemu-threadlets.h |1 vl.c |3 + 5 files changed, 196 insertions(+), 23 deletions(-) diff --git a/hw/virtio-9p.c b/hw/virtio-9p.c index 7c59988..88da20f 100644 --- a/hw/virtio-9p.c +++ b/hw/virtio-9p.c @@ -18,6 +18,7 @@ #include "fsdev/qemu-fsdev.h" #include "virtio-9p-debug.h" #include "virtio-9p-xattr.h" +#include "qemu-threadlets.h" int debug_9p_pdu; @@ -33,6 +34,149 @@ enum { Oappend = 0x80, }; +typedef struct V9fsPostOp { +QTAILQ_ENTRY(V9fsPostOp) node; +void (*func)(void *arg); +void *arg; +} V9fsPostOp; + +static struct { +int rfd; +int wfd; +QemuMutex lock; +QTAILQ_HEAD(, V9fsPostOp) post_op_list; +} v9fs_async_struct; + +static void die2(int err, const char *what) +{ +fprintf(stderr, "%s failed: %s\n", what, strerror(err)); +abort(); +} + +static void die(const char *what) +{ +die2(errno, what); +} + +#define ASYNC_MAX_PROCESS 5 + +/** + * v9fs_process_post_ops: Process any pending v9fs_post_posix_operation + * @arg: Not used. + * + * This function serves as a callback to the iothread to be called into whenever + * the v9fs_async_struct.wfd is written into. This thread goes through the list + * of v9fs_post_posix_operations() and executes them. In the process, it might + * queue more job on the asynchronous thread pool. + */ +static void v9fs_process_post_ops(void *arg) +{ +int count = 0; +struct V9fsPostOp *post_op; +int ret; +char byte; + +do { +ret = read(v9fs_async_struct.rfd, &byte, sizeof(byte)); +} while (ret > 0 || (ret == -1 && errno == EINTR)); + +qemu_mutex_lock(&v9fs_async_struct.lock); +for (count = 0; count < ASYNC_MAX_PROCESS; count++) { +if (QTAILQ_EMPTY(&(v9fs_async_struct.post_op_list))) { +break; +} +post_op = QTAILQ_FIRST(&(v9fs_async_struct.post_op_list)); +QTAILQ_REMOVE(&(v9fs_async_struct.post_op_list), post_op, node); + +qemu_mutex_unlock(&v9fs_async_struct.lock); +post_op->func(post_op->arg); +qemu_free(post_op); +qemu_mutex_lock(&v9fs_async_struct.lock); +} +qemu_mutex_unlock(&v9fs_async_struct.lock); +} + +/** + * v9fs_async_signal: Inform the io-thread of completion of async job. + * + * This function is used to inform the iothread that a particular + * async-operation pertaining to v9fs has been completed and that the io thread + * can handle the v9fs_post_posix_operation. + * + * This is based on the aio_signal_handler + */ +static inline void v9fs_async_signal(void) +{ +char byte = 0; +ssize_t ret; +int tries = 0; + +qemu_mutex_lock(&v9fs_async_struct.lock); +do { +assert(tries != 100); + ret = write(v9fs_async_struct.wfd, &byte, sizeof(byte)); +tries++; +} while (ret < 0 && errno == EAGAIN); +qemu_mutex_unlock(&v9fs_async_struct.lock); + +if (ret < 0 && errno != EAGAIN) { +die("write() in v9fs"); +} + +if (kill(getpid(), SIGUSR2)) { +die("kill failed"); +} +} + +/** + * v9fs_async_helper_done: Marks the completion of the v9fs_async job + * @func: v9fs_post_posix_func() for post-processing invoked in the context of + *the io-thread + * @arg: Argument to func. + * + * This function is called from the context of one of the asynchronous threads + * in the thread pool. This is called when the asynchronous thread has finished + * executing a v9fs_posix_operation. It's purpose is to initiate the process of + * informing the io-thread that the v9fs_posix_operation has completed. + */ +static void v9fs_async_helper_done(void (*func)(void *arg), void *arg) +{ +struct V9fsPostOp *post_op; + +post_op = qemu_mallocz(sizeof(*post_op)); +post_op->func = func; +post_op->arg = arg; + +qemu_mutex_lock(&v9fs_async_struct.lock); +QTAILQ_INSERT_TAIL(&(v9fs_async_struct.post_op_list), post_op, node); +qemu_mutex_unlock(&v9fs_async_struct.lock); + +v9fs_async_signal(); +} + +/** + * v9fs_do_async_posix: Offload v9fs_posix_operation onto async thread. + * @vs: V9fsOPState variable for the OP operation. + * @posix_fn: The posix function which has to be offloaded onto async thread. + * @post_fn_ptr: Address of the location to hold the post_fn corresponding to + * the posix_fn + * @post_fn: The post processing function corresponding to the posix_fn. + * + * This function is a helper to offload posix_operation on to t
[Qemu-devel] [PATCH] *-dis: Replace fprintf_ftype by fprintf_function (format checking)
This patch adds more printf format checking. Additional modifications were needed for this code change: * alpha-dis.c: The local definition of MAX conflicts with a previous definition from osdep.h, so add an #undef. * dis-asm.h: Add include for fprintf_function (qemu-common.h). The standard (now redundant) includes are removed. * mis-dis.c: The definition of ARRAY_SIZE is no longer needed and must be removed (conflict with previous definition from qemu-common.h). * sh4-dis.c: Remove some unneeded forward declarations. Cc: Blue Swirl Signed-off-by: Stefan Weil --- alpha-dis.c |3 +++ arm-dis.c| 14 +++--- dis-asm.h| 10 ++ m68k-dis.c |2 +- microblaze-dis.c |2 +- mips-dis.c |2 -- sh4-dis.c| 16 +--- 7 files changed, 19 insertions(+), 30 deletions(-) diff --git a/alpha-dis.c b/alpha-dis.c index 970da5b..8a2411e 100644 --- a/alpha-dis.c +++ b/alpha-dis.c @@ -22,6 +22,9 @@ along with this file; see the file COPYING. If not, see #include #include "dis-asm.h" +/* MAX is redefined below, so remove any previous definition. */ +#undef MAX + /* The opcode table is an array of struct alpha_opcode. */ struct alpha_opcode diff --git a/arm-dis.c b/arm-dis.c index fe7ac99..af21739 100644 --- a/arm-dis.c +++ b/arm-dis.c @@ -1587,7 +1587,7 @@ arm_decode_bitfield (const char *ptr, unsigned long insn, } static void -arm_decode_shift (long given, fprintf_ftype func, void *stream, +arm_decode_shift (long given, fprintf_function func, void *stream, int print_shift) { func (stream, "%s", arm_regnames[given & 0xf]); @@ -1633,7 +1633,7 @@ print_insn_coprocessor (bfd_vma pc, struct disassemble_info *info, long given, { const struct opcode32 *insn; void *stream = info->stream; - fprintf_ftype func = info->fprintf_func; + fprintf_function func = info->fprintf_func; unsigned long mask; unsigned long value; int cond; @@ -2127,7 +2127,7 @@ static void print_arm_address (bfd_vma pc, struct disassemble_info *info, long given) { void *stream = info->stream; - fprintf_ftype func = info->fprintf_func; + fprintf_function func = info->fprintf_func; if (((given & 0x000f) == 0x000f) && ((given & 0x0200) == 0)) @@ -,7 +,7 @@ print_insn_neon (struct disassemble_info *info, long given, bfd_boolean thumb) { const struct opcode32 *insn; void *stream = info->stream; - fprintf_ftype func = info->fprintf_func; + fprintf_function func = info->fprintf_func; if (thumb) { @@ -2676,7 +2676,7 @@ print_insn_arm_internal (bfd_vma pc, struct disassemble_info *info, long given) { const struct opcode32 *insn; void *stream = info->stream; - fprintf_ftype func = info->fprintf_func; + fprintf_function func = info->fprintf_func; if (print_insn_coprocessor (pc, info, given, false)) return; @@ -3036,7 +3036,7 @@ print_insn_thumb16 (bfd_vma pc, struct disassemble_info *info, long given) { const struct opcode16 *insn; void *stream = info->stream; - fprintf_ftype func = info->fprintf_func; + fprintf_function func = info->fprintf_func; for (insn = thumb_opcodes; insn->assembler; insn++) if ((given & insn->mask) == insn->value) @@ -3312,7 +3312,7 @@ print_insn_thumb32 (bfd_vma pc, struct disassemble_info *info, long given) { const struct opcode32 *insn; void *stream = info->stream; - fprintf_ftype func = info->fprintf_func; + fprintf_function func = info->fprintf_func; if (print_insn_coprocessor (pc, info, given, true)) return; diff --git a/dis-asm.h b/dis-asm.h index 9b9657e..3fb4838 100644 --- a/dis-asm.h +++ b/dis-asm.h @@ -9,11 +9,7 @@ #ifndef DIS_ASM_H #define DIS_ASM_H -#include -#include -#include -#include -#include +#include "qemu-common.h" typedef void *PTR; typedef uint64_t bfd_vma; @@ -237,8 +233,6 @@ typedef struct symbol_cache_entry } udata; } asymbol; -typedef int (*fprintf_ftype) (FILE*, const char*, ...); - enum dis_insn_type { dis_noninsn, /* Not a valid instruction */ dis_nonbranch, /* Not a branch instruction */ @@ -261,7 +255,7 @@ enum dis_insn_type { by hand, or using one of the initialization macros below. */ typedef struct disassemble_info { - fprintf_ftype fprintf_func; + fprintf_function fprintf_func; FILE *stream; PTR application_data; diff --git a/m68k-dis.c b/m68k-dis.c index d93943e..04f837a 100644 --- a/m68k-dis.c +++ b/m68k-dis.c @@ -1732,7 +1732,7 @@ match_insn_m68k (bfd_vma memaddr, const char *d; bfd_byte *buffer = priv->the_buffer; - fprintf_ftype save_printer = info->fprintf_func; + fprintf_function save_printer = info->fprintf_func; void (* save_print_address) (bfd_vma, struct disassemble_info *) = info->print_address_func; diff --git a/microblaze-dis.c b/microblaze-dis.c index 7694a43..16c312f 100644 --- a/microblaze-dis.c +++ b/microblaze-dis.c @@ -789,7 +789,7
Re: [Qemu-devel] [PATCH] add a command line option to specify the interface to send multicast packets on
Anyone care to comment? On Wed, Nov 10, 2010 at 05:47:35PM -0800, Mike Ryan wrote: > Add an option to specify the host interface to send multicast packets on > when using a multicast socket for networking. The option takes the name > of a host interface (e.g., eth0) and sets the IP_MULTICAST_IF socket > option, which causes the packets to use that interface as an egress. > > This is useful if the host machine has several interfaces with several > virtual networks across disparate interfaces. > --- > net.c |4 > net/socket.c| 48 > qemu-common.h |1 + > qemu-options.hx | 11 +-- > qemu_socket.h |1 + > 5 files changed, 51 insertions(+), 14 deletions(-) > > diff --git a/net.c b/net.c > index c5e6063..ff9eb27 100644 > --- a/net.c > +++ b/net.c > @@ -1050,6 +1050,10 @@ static const struct { > .name = "mcast", > .type = QEMU_OPT_STRING, > .help = "UDP multicast address and port number", > +}, { > +.name = "interface", > +.type = QEMU_OPT_STRING, > +.help = "interface to send multicast packets on", > }, > { /* end of list */ } > }, > diff --git a/net/socket.c b/net/socket.c > index 1c4e153..ed7cd12 100644 > --- a/net/socket.c > +++ b/net/socket.c > @@ -149,11 +149,13 @@ static void net_socket_send_dgram(void *opaque) > qemu_send_packet(&s->nc, s->buf, size); > } > > -static int net_socket_mcast_create(struct sockaddr_in *mcastaddr) > +static int net_socket_mcast_create(struct sockaddr_in *mcastaddr, const char > *interface) > { > struct ip_mreq imr; > int fd; > int val, ret; > +struct in_addr maddr; > +struct ifreq ifr; > if (!IN_MULTICAST(ntohl(mcastaddr->sin_addr.s_addr))) { > fprintf(stderr, "qemu: error: specified mcastaddr \"%s\" (0x%08x) does > not contain a multicast address\n", > inet_ntoa(mcastaddr->sin_addr), > @@ -201,6 +203,23 @@ static int net_socket_mcast_create(struct sockaddr_in > *mcastaddr) > goto fail; > } > > +/* If an interface name is given, only send packets out that interface */ > +if (interface != NULL) { > +strncpy(ifr.ifr_name, interface, IFNAMSIZ); > +ret = ioctl(fd, SIOCGIFADDR, &ifr); > +if (ret < 0) { > +fprintf(stderr, "qemu: error: specified interface \"%s\" does > not exist\n", interface); > +goto fail; > +} > + > +maddr = ((struct sockaddr_in *)&ifr.ifr_addr)->sin_addr; > +ret = setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF, &maddr, > sizeof(maddr)); > +if (ret < 0) { > +perror("setsockopt(IP_MULTICAST_IF)"); > +goto fail; > +} > +} > + > socket_set_nonblock(fd); > return fd; > fail: > @@ -248,7 +267,7 @@ static NetSocketState *net_socket_fd_init_dgram(VLANState > *vlan, > return NULL; > } > /* clone dgram socket */ > - newfd = net_socket_mcast_create(&saddr); > + newfd = net_socket_mcast_create(&saddr, NULL); > if (newfd < 0) { > /* error already reported by net_socket_mcast_create() */ > close(fd); > @@ -468,7 +487,8 @@ static int net_socket_connect_init(VLANState *vlan, > static int net_socket_mcast_init(VLANState *vlan, > const char *model, > const char *name, > - const char *host_str) > + const char *host_str, > + const char *interface) > { > NetSocketState *s; > int fd; > @@ -478,7 +498,7 @@ static int net_socket_mcast_init(VLANState *vlan, > return -1; > > > -fd = net_socket_mcast_create(&saddr); > +fd = net_socket_mcast_create(&saddr, interface); > if (fd < 0) > return -1; > > @@ -505,8 +525,9 @@ int net_init_socket(QemuOpts *opts, > > if (qemu_opt_get(opts, "listen") || > qemu_opt_get(opts, "connect") || > -qemu_opt_get(opts, "mcast")) { > -error_report("listen=, connect= and mcast= is invalid with fd="); > +qemu_opt_get(opts, "mcast") || > +qemu_opt_get(opts, "interface")) { > +error_report("listen=, connect=, mcast= and interface= is > invalid with fd=\n"); > return -1; > } > > @@ -524,8 +545,9 @@ int net_init_socket(QemuOpts *opts, > > if (qemu_opt_get(opts, "fd") || > qemu_opt_get(opts, "connect") || > -qemu_opt_get(opts, "mcast")) { > -error_report("fd=, connect= and mcast= is invalid with listen="); > +qemu_opt_get(opts, "mcast") || > +qemu_opt_get(opts, "interface")) { > +error_report("fd=, connect=, mcast= and interface= is invalid >
[Qemu-devel] [PATCH] target-sparc: Use fprintf_function (format checking)
This change was missing in commit 9a78eead0c74333a394c0f7bbfc4423ac746fcd5. Cc: Blue Swirl Signed-off-by: Stefan Weil --- target-sparc/cpu.h |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h index 7e0d17c..eb5d9c1 100644 --- a/target-sparc/cpu.h +++ b/target-sparc/cpu.h @@ -2,6 +2,7 @@ #define CPU_SPARC_H #include "config.h" +#include "qemu-common.h" #if !defined(TARGET_SPARC64) #define TARGET_LONG_BITS 32 @@ -442,8 +443,7 @@ typedef struct CPUSPARCState { /* helper.c */ CPUSPARCState *cpu_sparc_init(const char *cpu_model); void cpu_sparc_set_id(CPUSPARCState *env, unsigned int cpu); -void sparc_cpu_list (FILE *f, int (*cpu_fprintf)(FILE *f, const char *fmt, - ...)); +void sparc_cpu_list(FILE *f, fprintf_function cpu_fprintf); void cpu_lock(void); void cpu_unlock(void); int cpu_sparc_handle_mmu_fault(CPUSPARCState *env1, target_ulong address, int rw, -- 1.7.2.3
Re: [Qemu-devel] Re: [Try2][PATCH] Initial implementation of a mpeg1 layer2 streaming audio driver.
On 11/12/2010 10:54 AM, François Revol wrote: Le 12 nov. 2010 à 15:32, Anthony Liguori a écrit : I did try years ago, but at least the current wav driver really didn't like fifos back then. I recall trying for hours to get it pipe to ffmpeg or others without much luck. Also, this poses several problems about the control of the external process (respawn on listener disconnection, close on exit...). Doing encoding in QEMU is going to starve the guest pretty bad. For an x86 guest, that's going to result in issues with time drift, etc. Making reencoding with an external process work a bit more nicely also has the advantage of making this work with other formats like Ogg which are a bit more Open Source friendly. The current patch uses a separate thread, but since I clone this part from the esdaudio code I didn't check what it was doing. It seems it's not really queueing anything though except the single mix buffer, which kind of defeats the purpose of having a separate thread. This might explain why it lags so much here... I tried increasing the buffer size and lowering the threshold but it doesn't seem to help. I agree it'd be better to use external programs when possible but as I said it's a bit harder to handle the errors and such, and I wanted to have something working. Also, it requires more work to set it up for users, they must install the externals, figure out the command line options... Possibly we can provide default templates for known programs, either text files with % escaping for args, or just a shell script passing env vars maybe... Besides, external or not, IIRC a pipe is by default 4kB max, which isn't much better for decoupling the processing on its own, if the encoder is too slow it will still starve the audio thread, and the rest. Also it all requires more context switching and IPC, which increase the total processing time. So I think it might be interesting to have both. I'll see if I can buffer a bit more in the twolame code and if it helps, then I'll try to merge with the failed attempts I have around at using external progs. Okay, but my thinking was that we'd do something like: audio_capture "capture_command -opt=val -opt2=val" ... Which would make it very easy to tie into lame, oggenc, etc. Having simple alias commands like mp3_capture and ogg_capture that invoke common tools with reasonable options also would be a bonus. Regards, Anthony Liguori François.
Re: [Qemu-devel] [PATCH] add a command line option to specify the interface to send multicast packets on
On 11/15/2010 12:54 PM, Mike Ryan wrote: Anyone care to comment? I must admit, I don't understand the use-case well enough to really give an appropriate amount of review as to whether this is the best solution to the problem. Michael, do you have any thoughts? Regards, Anthony Liguori On Wed, Nov 10, 2010 at 05:47:35PM -0800, Mike Ryan wrote: Add an option to specify the host interface to send multicast packets on when using a multicast socket for networking. The option takes the name of a host interface (e.g., eth0) and sets the IP_MULTICAST_IF socket option, which causes the packets to use that interface as an egress. This is useful if the host machine has several interfaces with several virtual networks across disparate interfaces. --- net.c |4 net/socket.c| 48 qemu-common.h |1 + qemu-options.hx | 11 +-- qemu_socket.h |1 + 5 files changed, 51 insertions(+), 14 deletions(-) diff --git a/net.c b/net.c index c5e6063..ff9eb27 100644 --- a/net.c +++ b/net.c @@ -1050,6 +1050,10 @@ static const struct { .name = "mcast", .type = QEMU_OPT_STRING, .help = "UDP multicast address and port number", +}, { +.name = "interface", +.type = QEMU_OPT_STRING, +.help = "interface to send multicast packets on", }, { /* end of list */ } }, diff --git a/net/socket.c b/net/socket.c index 1c4e153..ed7cd12 100644 --- a/net/socket.c +++ b/net/socket.c @@ -149,11 +149,13 @@ static void net_socket_send_dgram(void *opaque) qemu_send_packet(&s->nc, s->buf, size); } -static int net_socket_mcast_create(struct sockaddr_in *mcastaddr) +static int net_socket_mcast_create(struct sockaddr_in *mcastaddr, const char *interface) { struct ip_mreq imr; int fd; int val, ret; +struct in_addr maddr; +struct ifreq ifr; if (!IN_MULTICAST(ntohl(mcastaddr->sin_addr.s_addr))) { fprintf(stderr, "qemu: error: specified mcastaddr \"%s\" (0x%08x) does not contain a multicast address\n", inet_ntoa(mcastaddr->sin_addr), @@ -201,6 +203,23 @@ static int net_socket_mcast_create(struct sockaddr_in *mcastaddr) goto fail; } +/* If an interface name is given, only send packets out that interface */ +if (interface != NULL) { +strncpy(ifr.ifr_name, interface, IFNAMSIZ); +ret = ioctl(fd, SIOCGIFADDR,&ifr); +if (ret< 0) { +fprintf(stderr, "qemu: error: specified interface \"%s\" does not exist\n", interface); +goto fail; +} + +maddr = ((struct sockaddr_in *)&ifr.ifr_addr)->sin_addr; +ret = setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF,&maddr, sizeof(maddr)); +if (ret< 0) { +perror("setsockopt(IP_MULTICAST_IF)"); +goto fail; +} +} + socket_set_nonblock(fd); return fd; fail: @@ -248,7 +267,7 @@ static NetSocketState *net_socket_fd_init_dgram(VLANState *vlan, return NULL; } /* clone dgram socket */ - newfd = net_socket_mcast_create(&saddr); + newfd = net_socket_mcast_create(&saddr, NULL); if (newfd< 0) { /* error already reported by net_socket_mcast_create() */ close(fd); @@ -468,7 +487,8 @@ static int net_socket_connect_init(VLANState *vlan, static int net_socket_mcast_init(VLANState *vlan, const char *model, const char *name, - const char *host_str) + const char *host_str, + const char *interface) { NetSocketState *s; int fd; @@ -478,7 +498,7 @@ static int net_socket_mcast_init(VLANState *vlan, return -1; -fd = net_socket_mcast_create(&saddr); +fd = net_socket_mcast_create(&saddr, interface); if (fd< 0) return -1; @@ -505,8 +525,9 @@ int net_init_socket(QemuOpts *opts, if (qemu_opt_get(opts, "listen") || qemu_opt_get(opts, "connect") || -qemu_opt_get(opts, "mcast")) { -error_report("listen=, connect= and mcast= is invalid with fd="); +qemu_opt_get(opts, "mcast") || +qemu_opt_get(opts, "interface")) { +error_report("listen=, connect=, mcast= and interface= is invalid with fd=\n"); return -1; } @@ -524,8 +545,9 @@ int net_init_socket(QemuOpts *opts, if (qemu_opt_get(opts, "fd") || qemu_opt_get(opts, "connect") || -qemu_opt_get(opts, "mcast")) { -error_report("fd=, connect= and mcast= is invalid with listen="); +qemu_opt_get(opts, "mcast") || +qemu_opt_get(opts, "interface")) { +
Re: [Qemu-devel] [PATCH] add a command line option to specify the interface to send multicast packets on
I'll clarify/elaborate a bit: When using a multicast socket, the OS chooses a default physical interface to send packets. The patch I've supplied allows the user to select the interface. Suppose you have a setup like so: BoxA --- BoxB --- BoxC You wish to run virtual machines on BoxB and BoxC and network them using a multicast UDP socket. BoxB has two network interfaces, and the default multicast interface may be the link between BoxA and BoxB. In this situation, BoxC will not receive any multicast packets from BoxB and networking between the boxes is therefore impossible. The utility of a multicast socket is obviously limited in my simplified example. Generalize BoxC to a LAN of physical machines all running virtual machines you wish to network and the use case should become a bit clearer. On Mon, Nov 15, 2010 at 01:36:28PM -0600, Anthony Liguori wrote: > On 11/15/2010 12:54 PM, Mike Ryan wrote: > >Anyone care to comment? > > I must admit, I don't understand the use-case well enough to really > give an appropriate amount of review as to whether this is the best > solution to the problem. > > Michael, do you have any thoughts? > > Regards, > > Anthony Liguori > > >On Wed, Nov 10, 2010 at 05:47:35PM -0800, Mike Ryan wrote: > >>Add an option to specify the host interface to send multicast packets on > >>when using a multicast socket for networking. The option takes the name > >>of a host interface (e.g., eth0) and sets the IP_MULTICAST_IF socket > >>option, which causes the packets to use that interface as an egress. > >> > >>This is useful if the host machine has several interfaces with several > >>virtual networks across disparate interfaces. > >>--- > >> net.c |4 > >> net/socket.c| 48 > >> qemu-common.h |1 + > >> qemu-options.hx | 11 +-- > >> qemu_socket.h |1 + > >> 5 files changed, 51 insertions(+), 14 deletions(-) > >> > >>diff --git a/net.c b/net.c > >>index c5e6063..ff9eb27 100644 > >>--- a/net.c > >>+++ b/net.c > >>@@ -1050,6 +1050,10 @@ static const struct { > >> .name = "mcast", > >> .type = QEMU_OPT_STRING, > >> .help = "UDP multicast address and port number", > >>+}, { > >>+.name = "interface", > >>+.type = QEMU_OPT_STRING, > >>+.help = "interface to send multicast packets on", > >> }, > >> { /* end of list */ } > >> }, > >>diff --git a/net/socket.c b/net/socket.c > >>index 1c4e153..ed7cd12 100644 > >>--- a/net/socket.c > >>+++ b/net/socket.c > >>@@ -149,11 +149,13 @@ static void net_socket_send_dgram(void *opaque) > >> qemu_send_packet(&s->nc, s->buf, size); > >> } > >> > >>-static int net_socket_mcast_create(struct sockaddr_in *mcastaddr) > >>+static int net_socket_mcast_create(struct sockaddr_in *mcastaddr, const > >>char *interface) > >> { > >> struct ip_mreq imr; > >> int fd; > >> int val, ret; > >>+struct in_addr maddr; > >>+struct ifreq ifr; > >> if (!IN_MULTICAST(ntohl(mcastaddr->sin_addr.s_addr))) { > >>fprintf(stderr, "qemu: error: specified mcastaddr \"%s\" (0x%08x) does > >> not contain a multicast address\n", > >>inet_ntoa(mcastaddr->sin_addr), > >>@@ -201,6 +203,23 @@ static int net_socket_mcast_create(struct sockaddr_in > >>*mcastaddr) > >>goto fail; > >> } > >> > >>+/* If an interface name is given, only send packets out that interface > >>*/ > >>+if (interface != NULL) { > >>+strncpy(ifr.ifr_name, interface, IFNAMSIZ); > >>+ret = ioctl(fd, SIOCGIFADDR,&ifr); > >>+if (ret< 0) { > >>+fprintf(stderr, "qemu: error: specified interface \"%s\" does > >>not exist\n", interface); > >>+goto fail; > >>+} > >>+ > >>+maddr = ((struct sockaddr_in *)&ifr.ifr_addr)->sin_addr; > >>+ret = setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF,&maddr, > >>sizeof(maddr)); > >>+if (ret< 0) { > >>+perror("setsockopt(IP_MULTICAST_IF)"); > >>+goto fail; > >>+} > >>+} > >>+ > >> socket_set_nonblock(fd); > >> return fd; > >> fail: > >>@@ -248,7 +267,7 @@ static NetSocketState > >>*net_socket_fd_init_dgram(VLANState *vlan, > >>return NULL; > >>} > >>/* clone dgram socket */ > >>- newfd = net_socket_mcast_create(&saddr); > >>+ newfd = net_socket_mcast_create(&saddr, NULL); > >>if (newfd< 0) { > >>/* error already reported by net_socket_mcast_create() */ > >>close(fd); > >>@@ -468,7 +487,8 @@ static int net_socket_connect_init(VLANState *vlan, > >> static int net_socket_mcast_init(VLANState *vlan, > >> const char *model, > >> const char *name, > >>- const char *host_str) > >>+
[Qemu-devel] How to detect a stopped guest os?
Hello, I know, this is not the right place to ask, but I wasn't abled to find a users mailing list. The question: is there any qemu (monitor) command to detect if the guest os has stopped / poweroff? -- Wilhelm
[Qemu-devel] [PATCH] audio: Use GCC_FMT_ATTR (format checking)
Cc: Blue Swirl Signed-off-by: Stefan Weil --- audio/audio_pt_int.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/audio/audio_pt_int.c b/audio/audio_pt_int.c index f15cc70..908c569 100644 --- a/audio/audio_pt_int.c +++ b/audio/audio_pt_int.c @@ -8,7 +8,8 @@ #include -static void logerr (struct audio_pt *pt, int err, const char *fmt, ...) +static void GCC_FMT_ATTR(3, 4) logerr (struct audio_pt *pt, int err, + const char *fmt, ...) { va_list ap; -- 1.7.2.3
Re: [Qemu-devel] [PATCH] make trace options use autoconfy names
Am 15.11.2010 um 16:48 schrieb Paolo Bonzini: On 11/15/2010 03:17 PM, Stefan Hajnoczi wrote: On Sun, Nov 14, 2010 at 1:52 PM, Paolo Bonzini wrote: On 11/14/2010 02:38 PM, Andreas Färber wrote: - --trace-file=*) trace_file="$optarg" + --enable-trace-file=*) trace_file="$optarg" ;; but this should be --with-trace-file=... please. It is not being enabled, just set to a different value. --with-* should be reserved for library paths, but I can change it if people prefer it that way. I did think of the argument as a file name, sort of a relative path... Actually I think we have something similar to overriding --prefix here :). It's a path that you can set at ./configure time. Yeah, that's true. However... So is it not okay to use --trace-file=? ... Autoconf would not allow unknown options not starting with -- enable- or --with-. The rationale to avoid incompatible options in QEMU is this: suppose you have a project using Autoconf (e.g. GCC) and you want to drop QEMU as a subdirectory in there, e.g. to run the GCC testsuite under QEMU usermode emulation (GCC can already do this for other simulators). To pass options to QEMU's configure, you can include them in GCC's commandline. The script will simply pass the option down to QEMU and it will be processed there. However, if you pass -- trace-file to GCC's configure script, it will complain and stop. Probably I would use something like --enable-trace- backend=simple:trace- if I was adding something similar to an autoconfiscated project. But unless it provides some additional benefit (as is the case with cross-compilation support) I want to keep the syntactic changes in my patches to the minimum. But I know nothing of autoconf and --enable-* or --with-* sort of make sense too. Whatever, I have to repost the other series anyway, so I'll change to --with-. Thinking more about it, what about --enable-simple-trace=..., callapsing the two options into one? Another autoconf way to pass this stuff would we ./configure ... TRACE_FILE=... I wouldn't mind either way though, just noticed that the --enable- trace-file suggestion by autoconf convention would allow --disable- trace-file. Similar issue for --without-trace-file though. Andreas
Re: [Qemu-devel] [PATCH] *-dis: Replace fprintf_ftype by fprintf_function (format checking)
Am 15.11.2010 um 19:39 schrieb Stefan Weil: This patch adds more printf format checking. Additional modifications were needed for this code change: * alpha-dis.c: The local definition of MAX conflicts with a previous definition from osdep.h, so add an #undef. * dis-asm.h: Add include for fprintf_function (qemu-common.h). The standard (now redundant) includes are removed. * mis-dis.c: :-) [...] alpha-dis.c |3 +++ arm-dis.c| 14 +++--- dis-asm.h| 10 ++ m68k-dis.c |2 +- microblaze-dis.c |2 +- mips-dis.c |2 -- sh4-dis.c| 16 +--- 7 files changed, 19 insertions(+), 30 deletions(-)
[Qemu-devel] [PATCH] darwin-user: Use GCC_FMT_ATTR (format checking)
The redundant forward declaration of qerror in machload.c is removed because it should be taken from qemu.h. Please note that this patch is untested because I have no matching environment to compile it. Cc: Blue Swirl Signed-off-by: Stefan Weil --- darwin-user/machload.c |2 +- darwin-user/qemu.h |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/darwin-user/machload.c b/darwin-user/machload.c index 4bb5c72..3bc3b65 100644 --- a/darwin-user/machload.c +++ b/darwin-user/machload.c @@ -82,7 +82,7 @@ void *macho_text_sect = 0; int macho_offset = 0; int load_object(const char *filename, struct target_pt_regs * regs, void ** mh); -void qerror(const char *format, ...); + #ifdef TARGET_I386 typedef struct mach_i386_thread_state { unsigned inteax; diff --git a/darwin-user/qemu.h b/darwin-user/qemu.h index 0c5081b..b6d3e6c 100644 --- a/darwin-user/qemu.h +++ b/darwin-user/qemu.h @@ -100,7 +100,7 @@ int do_sigaction(int sig, const struct sigaction *act, int do_sigaltstack(const struct sigaltstack *ss, struct sigaltstack *oss); void gemu_log(const char *fmt, ...) GCC_FMT_ATTR(1, 2); -void qerror(const char *fmt, ...); +void qerror(const char *fmt, ...) GCC_FMT_ATTR(1, 2); void write_dt(void *ptr, unsigned long addr, unsigned long limit, int flags); -- 1.7.2.3
[Qemu-devel] Re: [PATCH RESENT] msix: allow byte and word reading from mmio
On Mon, Nov 15, 2010 at 05:40:30PM +0100, Bernhard Kohl wrote: > Am 15.11.2010 11:42, schrieb ext Michael S. Tsirkin: > >On Thu, Aug 19, 2010 at 02:56:51PM +0200, Bernhard Kohl wrote: > >>It's legal that the guest reads a single byte or word from mmio. > >Interesting. The spec seems to say this: > > > > For all accesses to MSI-X Table and MSI-X PBA fields, software must use > > aligned full DWORD or aligned full QWORD transactions; otherwise, the > > result is undefined. > > I will remove the first statement from the commit message and add > something like the comment you proposed below. > > >>I have an OS which reads single bytes and it works fine on real > >>hardware. Maybe this happens due to casting. > >What do you mean by casting? > > I did not yet locate the line of code where this happens in the guest. > As all other accesses are dword, I assume that there is some masking, > shifting or casting to a byte variable in the source code, and the > compiler generates a byte access from this. And I presume it's too late to fix the guest in question? > >>Signed-off-by: Bernhard Kohl > >I guess we can merge this, but we need a comment I think since this > >seems to contradict the spec, and people were sending patches relying on > >this. > > > >Does something like the following describe the situation correctly? > > > >/* Note: PCI spec requires that all MSI-X table accesses > >are either DWORD or QWORD, size aligned. Some guests seem to violate > >this rule for read accesses, performing single byte reads. > >Since it's easy to support this, let's do so. > >Also support 16 bit size aligned reads, just in case. > > */ > > Yes, that's is exactly the situation with my guest. > I will add this comment. > > >Does you guest also do 16 bit reads? Are accesses at least aligned? > > I will check this with my guest and the readw function disabled > in the patch. This will take some time. > > >>--- > >> hw/msix.c | 20 > >> 1 files changed, 16 insertions(+), 4 deletions(-) > >> > >>diff --git a/hw/msix.c b/hw/msix.c > >>index d99403a..7dac7f7 100644 > >>--- a/hw/msix.c > >>+++ b/hw/msix.c > >>@@ -100,10 +100,22 @@ static uint32_t msix_mmio_readl(void *opaque, > >>target_phys_addr_t addr) > >> return pci_get_long(page + offset); > >> } > >> > >>-static uint32_t msix_mmio_read_unallowed(void *opaque, target_phys_addr_t > >>addr) > >>+static uint32_t msix_mmio_readw(void *opaque, target_phys_addr_t addr) > >> { > >>-fprintf(stderr, "MSI-X: only dword read is allowed!\n"); > >>-return 0; > >>+PCIDevice *dev = opaque; > >>+unsigned int offset = addr& (MSIX_PAGE_SIZE - 1)& ~0x1; > >>+void *page = dev->msix_table_page; > >>+ > >>+return pci_get_word(page + offset); > >>+} > >>+ > >>+static uint32_t msix_mmio_readb(void *opaque, target_phys_addr_t addr) > >>+{ > >>+PCIDevice *dev = opaque; > >>+unsigned int offset = addr& (MSIX_PAGE_SIZE - 1); > >>+void *page = dev->msix_table_page; > >>+ > >>+return pci_get_byte(page + offset); > >> } > >> > >> static uint8_t msix_pending_mask(int vector) > >>@@ -198,7 +210,7 @@ static CPUWriteMemoryFunc * const msix_mmio_write[] = { > >> }; > >> > >> static CPUReadMemoryFunc * const msix_mmio_read[] = { > >>-msix_mmio_read_unallowed, msix_mmio_read_unallowed, msix_mmio_readl > >>+msix_mmio_readb, msix_mmio_readw, msix_mmio_readl > >> }; > >> > >> /* Should be called from device's map method. */ > >>-- > >>1.7.2.1
[Qemu-devel] [PATCH 2/4] virtio: Convert fprintf() to error_report()
Signed-off-by: Stefan Hajnoczi --- hw/virtio.c | 35 ++- 1 files changed, 18 insertions(+), 17 deletions(-) diff --git a/hw/virtio.c b/hw/virtio.c index a2a657e..849a60f 100644 --- a/hw/virtio.c +++ b/hw/virtio.c @@ -14,6 +14,7 @@ #include #include "trace.h" +#include "qemu-error.h" #include "virtio.h" #include "sysemu.h" @@ -253,8 +254,8 @@ static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx) /* Check it isn't doing very strange things with descriptor numbers. */ if (num_heads > vq->vring.num) { -fprintf(stderr, "Guest moved used index from %u to %u", -idx, vring_avail_idx(vq)); +error_report("Guest moved used index from %u to %u", + idx, vring_avail_idx(vq)); exit(1); } @@ -271,7 +272,7 @@ static unsigned int virtqueue_get_head(VirtQueue *vq, unsigned int idx) /* If their number is silly, that's a fatal mistake. */ if (head >= vq->vring.num) { -fprintf(stderr, "Guest says index %u is available", head); +error_report("Guest says index %u is available", head); exit(1); } @@ -293,7 +294,7 @@ static unsigned virtqueue_next_desc(target_phys_addr_t desc_pa, wmb(); if (next >= max) { -fprintf(stderr, "Desc next is %u", next); +error_report("Desc next is %u", next); exit(1); } @@ -320,13 +321,13 @@ int virtqueue_avail_bytes(VirtQueue *vq, int in_bytes, int out_bytes) if (vring_desc_flags(desc_pa, i) & VRING_DESC_F_INDIRECT) { if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) { -fprintf(stderr, "Invalid size for indirect buffer table\n"); +error_report("Invalid size for indirect buffer table"); exit(1); } /* If we've got too many, that implies a descriptor loop. */ if (num_bufs >= max) { -fprintf(stderr, "Looped descriptor"); +error_report("Looped descriptor"); exit(1); } @@ -340,7 +341,7 @@ int virtqueue_avail_bytes(VirtQueue *vq, int in_bytes, int out_bytes) do { /* If we've got too many, that implies a descriptor loop. */ if (++num_bufs > max) { -fprintf(stderr, "Looped descriptor"); +error_report("Looped descriptor"); exit(1); } @@ -374,7 +375,7 @@ void virtqueue_map_sg(struct iovec *sg, target_phys_addr_t *addr, len = sg[i].iov_len; sg[i].iov_base = cpu_physical_memory_map(addr[i], &len, is_write); if (sg[i].iov_base == NULL || len != sg[i].iov_len) { -fprintf(stderr, "virtio: trying to map MMIO memory\n"); +error_report("virtio: trying to map MMIO memory"); exit(1); } } @@ -397,7 +398,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem) if (vring_desc_flags(desc_pa, i) & VRING_DESC_F_INDIRECT) { if (vring_desc_len(desc_pa, i) % sizeof(VRingDesc)) { -fprintf(stderr, "Invalid size for indirect buffer table\n"); +error_report("Invalid size for indirect buffer table"); exit(1); } @@ -423,7 +424,7 @@ int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem) /* If we've got too many, that implies a descriptor loop. */ if ((elem->in_num + elem->out_num) > max) { -fprintf(stderr, "Looped descriptor"); +error_report("Looped descriptor"); exit(1); } } while ((i = virtqueue_next_desc(desc_pa, i, max)) != max); @@ -694,8 +695,8 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f) qemu_get_be16s(f, &vdev->queue_sel); qemu_get_be32s(f, &features); if (features & ~supported_features) { -fprintf(stderr, "Features 0x%x unsupported. Allowed features: 0x%x\n", -features, supported_features); +error_report("Features 0x%x unsupported. Allowed features: 0x%x", + features, supported_features); return -1; } if (vdev->set_features) @@ -717,11 +718,11 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f) num_heads = vring_avail_idx(&vdev->vq[i]) - vdev->vq[i].last_avail_idx; /* Check it isn't doing very strange things with descriptor numbers. */ if (num_heads > vdev->vq[i].vring.num) { - fprintf(stderr, "VQ %d size 0x%x Guest index 0x%x " -"inconsistent with Host index 0x%x: delta 0x%x\n", - i, vdev->vq[i].vring.num, -vring_avail_idx(&vdev->vq[i]), -vdev->vq[i].last_avail_idx, num_heads); + error_report("VQ %d size 0x%x Guest index 0x%x " +"inconsistent with Host index 0x%x: delta 0x%x", +i, vdev->vq[i].vring.num, +
[Qemu-devel] [PATCH 0/4] virtio: Convert fprintf() to error_report()
The virtio hardware emulation code uses fprintf(stderr, ...) error messages. Improve things slightly by moving to error_report() so error messages will be printed to the monitor, if present. We want to handle device error states properly instead of bailing out with exit(1) but this series does not attempt to fix that yet. Leave virtio-9p for now where there are many cases of fprintf(stderr, ...) and development is still very active.
[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device
On 11/15/2010 02:26 PM, Michael S. Tsirkin wrote: Are there any system ones? There's one in qemu-char.c. Not sure it's easy to just say for the future that there can't be BHs that get delivered during vmstop. Can we just stop processing them? We ran into a lot of problems trying to do this with emulating synchronous I/O in the block layer. If we can avoid the stop-dispatching handlers approach, I think we'll have far fewer problems. I think that if we put something in the network layer that just queued packets if the vm is stopped, it would be a more robust solution to the problem. Will only work for -net. The problem is for anything that can trigger activity when vm is stopped. Activity or external I/O? My assertion is that if we can stop things from hitting the disk, escaping the network, or leaving a character device, we're solid. For memory, it is much worse: any memory changes can either get discarded or not. This breaks consistency guarantees that guest relies upon. Imagine virtio index getting updated but content not being updated. See? If you suppress any I/O then the memory changes don't matter because the same changes will happen on the destination too. They matter, and same changes won't happen. Example: virtio used index is in page 1, it can point at data in page 2. device writes into data, *then* into index. Order matters, but won't be preserved: migration assumes memory does not change after vmstop, and so it might send old values for data but new values for index. Result will be invalid data coming into guest. No, this scenario is invalid. Migration copies data while the guest is live and whenever a change happens, updates the dirty bitmap (that's why cpu_physical_memory_rw matters). Once the VM is stopped, we never return to the main loop before completing migration so nothing else gets an opportunity to run. This means when we send a page, we guarantee it won't be made dirty against until after the migration completes. Once the migration completes, the contents of memory may change but that's okay. As long as you stop packets from being sent, if you need to resume the guest, it'll pick up where it left off. On the destination guest will pick up the index and get bad (stale) data. If you're seeing this happen with vhost, you aren't updating dirty bits correctly. AFAICT, this cannot happen with userspace device models. I think this basic problem is the same as Kemari. We can either attempt to totally freeze a guest which means stopping all callbacks that are device related or we can prevent I/O from happening which should introduce enough determinism to fix the problem in practice. Regards, Anthony Liguori See above. IMO it's a different problem. Unlike Kemari, I don't really see any drawbacks to stop all callbacks. Do you? I think it's going to be extremely difficult to get right and keep right. A bunch of code needs to be converted to make us safe and then becomes brittle as it's very simple to change things in such a way that we're broken again. Regards, Anthony Liguori
Re: [Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device
On 11/15/2010 02:53 PM, Stefan Hajnoczi wrote: vhost has a solution for this: register a VMChangeStateHandler function that stops ioeventfd while vm_stop(0) is in effect. Then poll to see if there is work pending. I will add equivalent functionality to virtio-ioeventfd. I still think that stopping this at the net/block layer is going to be a lot more robust in the long term. All net/block traffic flows through a single set of interfaces whereas getting the devices to completely stop themselves requires touching every device and making sure that noone adds back vmstop-unfriendly behavior down the road. Regards, Anthony Liguori Stefan
Re: [Qemu-devel] [PATCH 1/3] Make paio subsystem use threadlets infrastructure
On 11/15/2010 11:53 AM, Arun R Bharadwaj wrote: From: Gautham R Shenoy This patch creates a generic asynchronous-task-offloading infrastructure named threadlets. The patch creates a global queue on-to which subsystems can queue their tasks to be executed asynchronously. The patch also provides API's that allow a subsystem to create a private queue with an associated pool of threads. The patch has been tested with fstress. This patch is very hard to review. Can you generize this via a serial of incremental changes instead of a big rewrite? This code has been historically fickle so doing the extra work to simplify review is worth it. Regards, Anthony Liguori Signed-off-by: Arun R Bharadwaj Signed-off-by: Aneesh Kumar K.V Signed-off-by: Gautham R Shenoy Signed-off-by: Sripathi Kodi --- Makefile.objs |2 configure |2 posix-aio-compat.c | 354 +++- 3 files changed, 211 insertions(+), 147 deletions(-) diff --git a/Makefile.objs b/Makefile.objs index cd5a24b..3b7ec27 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -9,6 +9,7 @@ qobject-obj-y += qerror.o block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o block-obj-y += nbd.o block.o aio.o aes.o osdep.o qemu-config.o +block-obj-$(CONFIG_POSIX) += qemu-thread.o block-obj-$(CONFIG_POSIX) += posix-aio-compat.o block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o @@ -124,7 +125,6 @@ endif common-obj-y += $(addprefix ui/, $(ui-obj-y)) common-obj-y += iov.o acl.o -common-obj-$(CONFIG_THREAD) += qemu-thread.o common-obj-y += notify.o event_notifier.o common-obj-y += qemu-timer.o diff --git a/configure b/configure index a079a49..addf733 100755 --- a/configure +++ b/configure @@ -2456,7 +2456,6 @@ if test "$vnc_png" != "no" ; then fi if test "$vnc_thread" != "no" ; then echo "CONFIG_VNC_THREAD=y">> $config_host_mak - echo "CONFIG_THREAD=y">> $config_host_mak fi if test "$fnmatch" = "yes" ; then echo "CONFIG_FNMATCH=y">> $config_host_mak @@ -2534,7 +2533,6 @@ if test "$xen" = "yes" ; then fi if test "$io_thread" = "yes" ; then echo "CONFIG_IOTHREAD=y">> $config_host_mak - echo "CONFIG_THREAD=y">> $config_host_mak fi if test "$linux_aio" = "yes" ; then echo "CONFIG_LINUX_AIO=y">> $config_host_mak diff --git a/posix-aio-compat.c b/posix-aio-compat.c index 7b862b5..e1812fc 100644 --- a/posix-aio-compat.c +++ b/posix-aio-compat.c @@ -29,7 +29,33 @@ #include "block_int.h" #include "block/raw-posix-aio.h" +#include "qemu-thread.h" +#define MAX_GLOBAL_THREADS 64 +#define MIN_GLOBAL_THREADS 8 + +static QemuMutex aiocb_mutex; +static QemuCond aiocb_completion; + +typedef struct ThreadletQueue +{ +QemuMutex lock; +QemuCond cond; +int max_threads; +int min_threads; +int cur_threads; +int idle_threads; +QTAILQ_HEAD(, ThreadletWork) request_list; +} ThreadletQueue; + +typedef struct ThreadletWork +{ +QTAILQ_ENTRY(ThreadletWork) node; +void (*func)(struct ThreadletWork *work); +} ThreadletWork; + +static ThreadletQueue globalqueue; +static int globalqueue_init; struct qemu_paiocb { BlockDriverAIOCB common; @@ -44,13 +70,12 @@ struct qemu_paiocb { int ev_signo; off_t aio_offset; -QTAILQ_ENTRY(qemu_paiocb) node; int aio_type; ssize_t ret; -int active; struct qemu_paiocb *next; int async_context_id; +ThreadletWork work; }; typedef struct PosixAioState { @@ -58,64 +83,169 @@ typedef struct PosixAioState { struct qemu_paiocb *first_aio; } PosixAioState; +static void *threadlet_worker(void *data) +{ +ThreadletQueue *queue = data; -static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; -static pthread_cond_t cond = PTHREAD_COND_INITIALIZER; -static pthread_t thread_id; -static pthread_attr_t attr; -static int max_threads = 64; -static int cur_threads = 0; -static int idle_threads = 0; -static QTAILQ_HEAD(, qemu_paiocb) request_list; +qemu_mutex_lock(&queue->lock); +while (1) { +ThreadletWork *work; +int ret = 0; + +while (QTAILQ_EMPTY(&queue->request_list)&& + (ret != ETIMEDOUT)) { +/* wait for cond to be signalled or broadcast for 1000s */ +ret = qemu_cond_timedwait((&queue->cond), +&(queue->lock), 10*10); +} -#ifdef CONFIG_PREADV -static int preadv_present = 1; -#else -static int preadv_present = 0; -#endif +assert(queue->idle_threads != 0); +if (QTAILQ_EMPTY(&queue->request_list)) { +if (queue->cur_threads> queue->min_threads) { +/* We retain the minimum number of threads */ +break; +} +} else { +work = QTAILQ_FIRST(&queue->request_list); +QTAILQ_REMOVE(&queue->request_list, work, node); -static void die2(int err, const char *what) -{ -fprintf(stderr, "%s failed: %s\n", what, strerror(err)); -abort(); +
Re: [Qemu-devel] [PATCH] add a command line option to specify the interface to send multicast packets on
On 11/15/2010 01:52 PM, Mike Ryan wrote: I'll clarify/elaborate a bit: When using a multicast socket, the OS chooses a default physical interface to send packets. The patch I've supplied allows the user to select the interface. Suppose you have a setup like so: BoxA --- BoxB --- BoxC You wish to run virtual machines on BoxB and BoxC and network them using a multicast UDP socket. BoxB has two network interfaces, and the default multicast interface may be the link between BoxA and BoxB. In this situation, BoxC will not receive any multicast packets from BoxB and networking between the boxes is therefore impossible. The utility of a multicast socket is obviously limited in my simplified example. Generalize BoxC to a LAN of physical machines all running virtual machines you wish to network and the use case should become a bit clearer. Thanks. Second question is how portable is SIOCGIFADDR? I suspect that's very Linux-centric.. Regards, Anthony Liguori On Mon, Nov 15, 2010 at 01:36:28PM -0600, Anthony Liguori wrote: On 11/15/2010 12:54 PM, Mike Ryan wrote: Anyone care to comment? I must admit, I don't understand the use-case well enough to really give an appropriate amount of review as to whether this is the best solution to the problem. Michael, do you have any thoughts? Regards, Anthony Liguori On Wed, Nov 10, 2010 at 05:47:35PM -0800, Mike Ryan wrote: Add an option to specify the host interface to send multicast packets on when using a multicast socket for networking. The option takes the name of a host interface (e.g., eth0) and sets the IP_MULTICAST_IF socket option, which causes the packets to use that interface as an egress. This is useful if the host machine has several interfaces with several virtual networks across disparate interfaces. --- net.c |4 net/socket.c| 48 qemu-common.h |1 + qemu-options.hx | 11 +-- qemu_socket.h |1 + 5 files changed, 51 insertions(+), 14 deletions(-) diff --git a/net.c b/net.c index c5e6063..ff9eb27 100644 --- a/net.c +++ b/net.c @@ -1050,6 +1050,10 @@ static const struct { .name = "mcast", .type = QEMU_OPT_STRING, .help = "UDP multicast address and port number", +}, { +.name = "interface", +.type = QEMU_OPT_STRING, +.help = "interface to send multicast packets on", }, { /* end of list */ } }, diff --git a/net/socket.c b/net/socket.c index 1c4e153..ed7cd12 100644 --- a/net/socket.c +++ b/net/socket.c @@ -149,11 +149,13 @@ static void net_socket_send_dgram(void *opaque) qemu_send_packet(&s->nc, s->buf, size); } -static int net_socket_mcast_create(struct sockaddr_in *mcastaddr) +static int net_socket_mcast_create(struct sockaddr_in *mcastaddr, const char *interface) { struct ip_mreq imr; int fd; int val, ret; +struct in_addr maddr; +struct ifreq ifr; if (!IN_MULTICAST(ntohl(mcastaddr->sin_addr.s_addr))) { fprintf(stderr, "qemu: error: specified mcastaddr \"%s\" (0x%08x) does not contain a multicast address\n", inet_ntoa(mcastaddr->sin_addr), @@ -201,6 +203,23 @@ static int net_socket_mcast_create(struct sockaddr_in *mcastaddr) goto fail; } +/* If an interface name is given, only send packets out that interface */ +if (interface != NULL) { +strncpy(ifr.ifr_name, interface, IFNAMSIZ); +ret = ioctl(fd, SIOCGIFADDR,&ifr); +if (ret< 0) { +fprintf(stderr, "qemu: error: specified interface \"%s\" does not exist\n", interface); +goto fail; +} + +maddr = ((struct sockaddr_in *)&ifr.ifr_addr)->sin_addr; +ret = setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF,&maddr, sizeof(maddr)); +if (ret< 0) { +perror("setsockopt(IP_MULTICAST_IF)"); +goto fail; +} +} + socket_set_nonblock(fd); return fd; fail: @@ -248,7 +267,7 @@ static NetSocketState *net_socket_fd_init_dgram(VLANState *vlan, return NULL; } /* clone dgram socket */ - newfd = net_socket_mcast_create(&saddr); + newfd = net_socket_mcast_create(&saddr, NULL); if (newfd< 0) { /* error already reported by net_socket_mcast_create() */ close(fd); @@ -468,7 +487,8 @@ static int net_socket_connect_init(VLANState *vlan, static int net_socket_mcast_init(VLANState *vlan, const char *model, const char *name, - const char *host_str) + const char *host_str, + const char *interface) { NetSocketState *s; int fd; @@ -478,7 +498,7
[Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device
On Mon, Nov 15, 2010 at 10:09:29AM -0600, Anthony Liguori wrote: > On 11/15/2010 09:18 AM, Michael S. Tsirkin wrote: > >On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote: > >>On 11/15/2010 08:52 AM, Juan Quintela wrote: > >>>"Michael S. Tsirkin" wrote: > There's no reason for tap to run when VM is stopped. > If we let it, it confuses the bridge on TX > and corrupts DMA memory on RX. > > Signed-off-by: Michael S. Tsirkin > >>>once here, what handlers make sense to run while stopped? > >>>/me can think of the normal console, non live migration, loadvm and not > >>>much more. Perhaps it is easier to just move the other way around? > >>I'm not sure I concur that this is really a problem. > >>Semantically, I don't think that stop has to imply that the guest > >>memory no longer changes. > >> > >>Regards, > >> > >>Anthony Liguori > >> > >>>Later, Juan. > >Well, I do not really know about vmstop that is not for migration. > > They are separate. For instance, we don't rely on stop to pause > pending disk I/O because we don't serialize pending disk I/O > operations. Instead, we flush all pending I/O and rely on the fact > that disk I/O requests are always submitted in the context of a vCPU > operation. This assumption breaks down though with ioeventfd so we > need to revisit it. > > >For most vmstop calls are for migration. And there, the problems are very > >real. > > > >First, it's not just memory. At least for network transmit, sending out > >packets with the same MAC from two locations is a problem. See? > > I agree it's a problem but I'm not sure that just marking fd > handlers really helps. > > Bottom halves can also trigger transmits. Are there any system ones? Can we just stop processing them? > I think that if we put > something in the network layer that just queued packets if the vm is > stopped, it would be a more robust solution to the problem. Will only work for -net. The problem is for anything that can trigger activity when vm is stopped. > >For memory, it is much worse: any memory changes can either get > >discarded or not. This breaks consistency guarantees that guest relies > >upon. Imagine virtio index getting updated but content not being > >updated. See? > > If you suppress any I/O then the memory changes don't matter because > the same changes will happen on the destination too. They matter, and same changes won't happen. Example: virtio used index is in page 1, it can point at data in page 2. device writes into data, *then* into index. Order matters, but won't be preserved: migration assumes memory does not change after vmstop, and so it might send old values for data but new values for index. Result will be invalid data coming into guest. On the destination guest will pick up the index and get bad (stale) data. > I think this basic problem is the same as Kemari. We can either > attempt to totally freeze a guest which means stopping all callbacks > that are device related or we can prevent I/O from happening which > should introduce enough determinism to fix the problem in practice. > > Regards, > > Anthony Liguori See above. IMO it's a different problem. Unlike Kemari, I don't really see any drawbacks to stop all callbacks. Do you? -- MST
Re: [Qemu-devel] Re: [PATCHv2 2/2] tap: mark fd handler as device
On Mon, Nov 15, 2010 at 4:09 PM, Anthony Liguori wrote: > On 11/15/2010 09:18 AM, Michael S. Tsirkin wrote: >> >> On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote: >> >>> >>> On 11/15/2010 08:52 AM, Juan Quintela wrote: >>> "Michael S. Tsirkin" wrote: > > There's no reason for tap to run when VM is stopped. > If we let it, it confuses the bridge on TX > and corrupts DMA memory on RX. > > Signed-off-by: Michael S. Tsirkin > once here, what handlers make sense to run while stopped? /me can think of the normal console, non live migration, loadvm and not much more. Perhaps it is easier to just move the other way around? >>> >>> I'm not sure I concur that this is really a problem. >>> Semantically, I don't think that stop has to imply that the guest >>> memory no longer changes. >>> >>> Regards, >>> >>> Anthony Liguori >>> >>> Later, Juan. >> >> Well, I do not really know about vmstop that is not for migration. >> > > They are separate. For instance, we don't rely on stop to pause pending > disk I/O because we don't serialize pending disk I/O operations. Instead, > we flush all pending I/O and rely on the fact that disk I/O requests are > always submitted in the context of a vCPU operation. This assumption breaks > down though with ioeventfd so we need to revisit it. vhost has a solution for this: register a VMChangeStateHandler function that stops ioeventfd while vm_stop(0) is in effect. Then poll to see if there is work pending. I will add equivalent functionality to virtio-ioeventfd. Stefan