Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
Le 20 janv. 2011 à 17:18, Yoshiaki Tamura a écrit : > 2011/1/20 Pierre Riteau : >> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: >> >>> 2011/1/19 Pierre Riteau : b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return value of bdrv_write and aborts migration when it fails. However, if the size of the block device to migrate is not a multiple of BLOCK_SIZE (currently 1 MB), the last bdrv_write will fail with -EIO. Fixed by calling bdrv_write with the correct size of the last block. --- block-migration.c | 16 +++- 1 files changed, 15 insertions(+), 1 deletions(-) diff --git a/block-migration.c b/block-migration.c index 1475325..eeb9c62 100644 --- a/block-migration.c +++ b/block-migration.c @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id) int64_t addr; BlockDriverState *bs; uint8_t *buf; +int64_t total_sectors; +int nr_sectors; do { addr = qemu_get_be64(f); @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id) return -EINVAL; } +total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; +if (total_sectors <= 0) { +fprintf(stderr, "Error getting length of block device %s\n", device_name); +return -EINVAL; +} + +if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) { +nr_sectors = total_sectors - addr; +} else { +nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; +} + buf = qemu_malloc(BLOCK_SIZE); qemu_get_buffer(f, buf, BLOCK_SIZE); -ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK); +ret = bdrv_write(bs, addr, buf, nr_sectors); qemu_free(buf); if (ret < 0) { -- 1.7.3.5 >>> >>> Hi Pierre, >>> >>> I don't think the fix above is correct. If you have a file which >>> isn't aliened with BLOCK_SIZE, you won't get an error with the >>> patch. However, the receiver doesn't know how much sectors which >>> the sender wants to be written, so the guest may fail after >>> migration because some data may not be written. IIUC, although >>> changing bytestream should be prevented as much as possible, we >>> should save/load total_sectors to check appropriate file is >>> allocated on the receiver side. >> >> Isn't the guest supposed to be started using a file with the correct size? > > I personally don't like that; It's insisting too much to the user. > Can't we expand the image on the fly? We can just abort if expanding > failed anyway. At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails. Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition. >> But I guess changing the protocol would be best as it would avoid headaches >> to people who mistakenly created a file that is too small. > > We should think carefully before changing the protocol. > > Kevin? > >> >>> BTW, you should use error_report instead of fprintf(stderr, ...). >> >> I didn't know that, I followed what was used in this file. Thank you. >> >> -- >> Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France >> http://perso.univ-rennes1.fr/pierre.riteau/ >> >> >>
Re: [Qemu-devel] Qemu signal handling
On 21 January 2011 03:34, maheen butt wrote: > In QEMU code almost every signal is handled then why this warning is > generated from syscall.c > #elif defined(TARGET_ABI_MIPSN64) > > # warning signal handling not implemented This is in the Linux user-mode code, which has to correctly emulate calling a signal handler in the program being run. This requires (among other things) setting up a structure with all the CPU registers in the right place, which we will pass to the signal handler. So it is both CPU specific, and specific to the ABI being used by the program being run. For this particular case (MIPS CPU and the N64 or N32 ABI) the code has not been implemented. So we print a warning at compile time, and at runtime we will not correctly run programs which use signals. -- PMM
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 01/20/11 20:39, Anthony Liguori wrote: On 01/20/2011 02:44 AM, Gerd Hoffmann wrote: Hi, For (2), you cannot use bus=X,addr=Y because it makes assumptions about the PCI topology which may change in newer -M pc's. Why should the PCI topology for 'pc' ever change? We'll probably get q35 support some day, but when this lands I expect we'll see a new machine type 'q35', so '-m q35' will pick the ich9 chipset (which will have a different pci topology of course) and '-m pc' will pick the existing piix chipset (which will continue to look like it looks today). But then what's the default machine type? When I say -M pc, I really mean the default machine. I'd tend to leave pc as default for a release cycle or two so we can hash out issues with q35, then flip the default once it got broader testing and runs stable. At some point, "qemu-system-x86_64 -device virtio-net-pci,addr=2.0" Is not going to be a reliable way to invoke qemu because there's no way we can guarantee that slot 2 isn't occupied by a chipset device or some other default device. Indeed. But qemu -M pc should continue to work though. 'pc' would better named 'piix3', but renaming it now is probably not worth the trouble. cheers, Gerd
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
Hi, By the way, we don't have a QEMUState but instead use globals. /me wants to underline this. IMO it is absolutely pointless to worry about ways to pass around kvm_state. There never ever will be a serious need for that. We can stick with the current model of keeping global state in global variables. And just do the same with kvm_state. Or we can move to have all state in a QEMUState struct which we'll pass around basically everywhere. Then we can simply embed or reference kvm_state there. I'd tend to stick with the global variables as I don't see the point in having a QEMUstate. I doubt we'll ever see two virtual machines driven by a single qemu process. YMMV. cheers, Gerd
Re: [Qemu-devel] [PATCH 0/5] -drive/drive_add fixes
On Mon, Jan 17, 2011 at 07:31:25PM +0100, Markus Armbruster wrote: > Note: PATCH 3/5 makes -drive reject duplicate definitions instead of > ignoring all but the first silently. If this isn't sufficiently > bug-compatible for you, we need to talk. > > Markus Armbruster (5): > blockdev: Fix error message for invalid -drive CHS > blockdev: Make drive_init() use error_report() > blockdev: Reject multiple definitions for the same drive > blockdev: Fix drive_del not to crash when drive is not in use > blockdev: Fix drive_add for drives without media > > blockdev.c | 88 -- > blockdev.h |2 +- > hw/device-hotplug.c |3 +- > hw/usb-msd.c|3 +- > vl.c|6 +-- > 5 files changed, 47 insertions(+), 55 deletions(-) Reviewed-by: Stefan Hajnoczi
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
Am 21.01.2011 09:08, schrieb Pierre Riteau: > Le 20 janv. 2011 à 17:18, Yoshiaki Tamura a > écrit : > >> 2011/1/20 Pierre Riteau : >>> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: >>> 2011/1/19 Pierre Riteau : > b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return > value of bdrv_write and aborts migration when it fails. However, if the > size of the block device to migrate is not a multiple of BLOCK_SIZE > (currently 1 MB), the last bdrv_write will fail with -EIO. > > Fixed by calling bdrv_write with the correct size of the last block. > --- > block-migration.c | 16 +++- > 1 files changed, 15 insertions(+), 1 deletions(-) > > diff --git a/block-migration.c b/block-migration.c > index 1475325..eeb9c62 100644 > --- a/block-migration.c > +++ b/block-migration.c > @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int > version_id) > int64_t addr; > BlockDriverState *bs; > uint8_t *buf; > +int64_t total_sectors; > +int nr_sectors; > > do { > addr = qemu_get_be64(f); > @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, > int version_id) > return -EINVAL; > } > > +total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; > +if (total_sectors <= 0) { > +fprintf(stderr, "Error getting length of block device > %s\n", device_name); > +return -EINVAL; > +} > + > +if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) { > +nr_sectors = total_sectors - addr; > +} else { > +nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; > +} > + > buf = qemu_malloc(BLOCK_SIZE); > > qemu_get_buffer(f, buf, BLOCK_SIZE); > -ret = bdrv_write(bs, addr, buf, > BDRV_SECTORS_PER_DIRTY_CHUNK); > +ret = bdrv_write(bs, addr, buf, nr_sectors); > > qemu_free(buf); > if (ret < 0) { > -- > 1.7.3.5 > > > Hi Pierre, I don't think the fix above is correct. If you have a file which isn't aliened with BLOCK_SIZE, you won't get an error with the patch. However, the receiver doesn't know how much sectors which the sender wants to be written, so the guest may fail after migration because some data may not be written. IIUC, although changing bytestream should be prevented as much as possible, we should save/load total_sectors to check appropriate file is allocated on the receiver side. >>> >>> Isn't the guest supposed to be started using a file with the correct size? >> >> I personally don't like that; It's insisting too much to the user. >> Can't we expand the image on the fly? We can just abort if expanding >> failed anyway. > > At first I thought your expansion idea was best, but now I think there are > valid scenarios where it fails. > > Imagine both sides are not using a file but a disk partition as storage. If > the partition size is not rounded to 1 MB, the last write will fail with the > current code, and there is no way we can expand the partition. Actually, that you can change the image size is a special case. It only works on raw with file and sheepdog, and on qcow2 and qed. All other block drivers can't do it. >>> But I guess changing the protocol would be best as it would avoid headaches >>> to people who mistakenly created a file that is too small. >> >> We should think carefully before changing the protocol. >> >> Kevin? Can we do it in a compatible way? I agree that it would be nice to catch this error, but changing the protocol in an incompatible way for it seems to be too much. Anyway, it's independent of this patch and can be done on top. Kevin
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
Am 19.01.2011 15:59, schrieb Pierre Riteau: > b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return > value of bdrv_write and aborts migration when it fails. However, if the > size of the block device to migrate is not a multiple of BLOCK_SIZE > (currently 1 MB), the last bdrv_write will fail with -EIO. > > Fixed by calling bdrv_write with the correct size of the last block. > --- > block-migration.c | 16 +++- > 1 files changed, 15 insertions(+), 1 deletions(-) > > diff --git a/block-migration.c b/block-migration.c > index 1475325..eeb9c62 100644 > --- a/block-migration.c > +++ b/block-migration.c > @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int > version_id) > int64_t addr; > BlockDriverState *bs; > uint8_t *buf; > +int64_t total_sectors; > +int nr_sectors; > > do { > addr = qemu_get_be64(f); > @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int > version_id) > return -EINVAL; > } > > +total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; > +if (total_sectors <= 0) { > +fprintf(stderr, "Error getting length of block device %s\n", > device_name); > +return -EINVAL; > +} Can you resend the patch with error_report(), as Yoshi mentioned? Also, I would move the total_sectors calculation outside the loop - though I have no idea how many iterations it typically has, so it might not improve things a lot. Kevin
[Qemu-devel] [PATCH] docs: Document simple trace backend thread-safety limitation
Signed-off-by: Stefan Hajnoczi --- docs/tracing.txt |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/docs/tracing.txt b/docs/tracing.txt index 963c504..d2499d9 100644 --- a/docs/tracing.txt +++ b/docs/tracing.txt @@ -133,6 +133,11 @@ source tree. It may not be as powerful as platform-specific or third-party trace backends but it is portable. This is the recommended trace backend unless you have specific needs for more advanced backends. +This trace backend is not thread-safe. In many cases this is not an issue +since the QEMU global mutex covers much of the codebase. Consider this +limitation when tracing utility functions that may be called from worker +threads with no synchronization. + Monitor commands * info trace -- 1.7.2.3
Re: [Qemu-devel] [PATCH 1/3] make path_has_protocol() to return pointer instead of bool
Am 12.01.2011 11:57, schrieb Michael Tokarev: > Currently protocol: parsing in filenames is ad-hoc and scattered all around > block.c. This is a first step to prepare for common parsing. > > Signed-off-by: Michael Tokarev > --- > block.c | 18 +++--- > 1 files changed, 15 insertions(+), 3 deletions(-) > > diff --git a/block.c b/block.c > index ff2795b..e5a6f60 100644 > --- a/block.c > +++ b/block.c > @@ -90,9 +90,11 @@ int is_windows_drive(const char *filename) > } > #endif > > -/* check if the path starts with ":" */ > -static int path_has_protocol(const char *path) > +/* check if the path starts with ":" > + * Return pointer to the leading colon or NULL */ > +static char *path_has_protocol(const char *path) > { > +const char *p; > #ifdef _WIN32 > if (is_windows_drive(path) || > is_windows_drive_prefix(path)) { > @@ -100,7 +102,17 @@ static int path_has_protocol(const char *path) > } > #endif > > -return strchr(path, ':') != NULL; > +p = path; > +/* we allow [a-z_] for now */ > +while((*p >= 'a' && *p <= 'z') || *p == '_') { Maybe qemu_isalnum(*p) || *p == '_' instead? We probably won't need uppercase letters, but digits are well possible. We'll have a hard time adding any characters here later as this will break previously working image filenames. > +++p; > +} > + > +#define MAX_PROTO_LEN 31 > +/* recognize non-empty string of max MAX_PROTO chars as protocol */ > +return > +*p == ':' && p > path && (p - path) <= MAX_PROTO_LEN ? > +(char*)p : NULL; What's the point of MAX_PROTO_LEN? It just seems to make the handling even less consistent than it already is. Kevin
Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
On Thu, Jan 20, 2011 at 06:23:36PM -0600, Anthony Liguori wrote: > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote: > >On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote: > >>On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote: > >>>When MSI is off, each interrupt needs to be bounced through the io > >>>thread when it's set/cleared, so vhost-net causes more context switches and > >>>higher CPU utilization than userspace virtio which handles networking in > >>>the same thread. > >>> > >>>We'll need to fix this by adding level irq support in kvm irqfd, > >>>for now disable vhost-net in these configurations. > >>> > >>>Signed-off-by: Michael S. Tsirkin > >>I actually think this should be a terminal error. The user asks for > >>vhost-net, if we cannot enable it, we should exit. > >> > >>Or we should warn the user that they should expect bad performance. > >>Silently doing something that the user has explicitly asked us not > >>to do is not a good behavior. > >> > >>Regards, > >> > >>Anthony Liguori > >The issue is that user has no control of the guest, and can not know > >whether the guest enables MSI. So what you ask for will just make > >some guests fail, and others fail sometimes. > >The user also has no way to know that version X of kvm does not expose a > >way to inject level interrupts with irqfd. > > > >We could have *another* flag that says "use vhost where it helps" but > >then I think this is what everyone wants to do, anyway, and libvirt > >already sets vhost=on so I prefer redefining the meaning of an existing > >flag. > > In the very least, there needs to be a vhost=force. > Having some sort of friendly default policy is fine but we need to > provide a mechanism for a user to have the final say. If you want > to redefine vhost=on to really mean, use the friendly default, > that's fine by me, but only if the vhost=force option exists. OK, I will add that, probably as a separate flag as vhost is a boolean. This will get worse performance but it will be what the user asked for. > > I actually would think libvirt would want to use vhost=force. > Debugging with vhost=on is going to be a royal pain in the ass if a > user reports bad performance. Given the libvirt XML, you can't > actually tell from the guest and the XML whether or not vhost was > actually in use or not. Yes you can: check MSI enabled in the guest, if it is - check vhost enabled in the XML. Not that bad at all, is it? > > Regards, > > Anthony Liguori We get worse performance without MSI anyway, how is this different? > >Maybe this is best handled by a documentation update? > > > >We always said: > > "use vhost=on to enable experimental in kernel > > accelerator\n" > > > >note 'enable' not 'require'. This is similar to how we specify > >nvectors : you can not make guest use the feature. > > > >How about this: > > > >diff --git a/qemu-options.hx b/qemu-options.hx > >index 898561d..3c937c1 100644 > >--- a/qemu-options.hx > >+++ b/qemu-options.hx > >@@ -1061,6 +1061,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net, > > "use vnet_hdr=off to avoid enabling the IFF_VNET_HDR > > tap flag\n" > > "use vnet_hdr=on to make the lack of IFF_VNET_HDR > > support an error condition\n" > > "use vhost=on to enable experimental in kernel > > accelerator\n" > >+"(note: vhost=on has no effect unless guest uses > >MSI-X)\n" > > "use 'vhostfd=h' to connect to an already opened vhost > > net device\n" > > #endif > > "-net > > socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n" > > > >
Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote: > On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote: > > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote: > > > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote: > > > > > >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote: > > >> > > >>> When MSI is off, each interrupt needs to be bounced through the io > > >>> thread when it's set/cleared, so vhost-net causes more context switches > > >>> and > > >>> higher CPU utilization than userspace virtio which handles networking in > > >>> the same thread. > > >>> > > >>> We'll need to fix this by adding level irq support in kvm irqfd, > > >>> for now disable vhost-net in these configurations. > > >>> > > >>> Signed-off-by: Michael S. Tsirkin > > >>> > > >> I actually think this should be a terminal error. The user asks for > > >> vhost-net, if we cannot enable it, we should exit. > > >> > > >> Or we should warn the user that they should expect bad performance. > > >> Silently doing something that the user has explicitly asked us not > > >> to do is not a good behavior. > > >> > > >> Regards, > > >> > > >> Anthony Liguori > > >> > > > The issue is that user has no control of the guest, and can not know > > > whether the guest enables MSI. So what you ask for will just make > > > some guests fail, and others fail sometimes. > > > The user also has no way to know that version X of kvm does not expose a > > > way to inject level interrupts with irqfd. > > > > > > We could have *another* flag that says "use vhost where it helps" but > > > then I think this is what everyone wants to do, anyway, and libvirt > > > already sets vhost=on so I prefer redefining the meaning of an existing > > > flag. > > > > > > > In the very least, there needs to be a vhost=force. > > > > Having some sort of friendly default policy is fine but we need to > > provide a mechanism for a user to have the final say. If you want to > > redefine vhost=on to really mean, use the friendly default, that's fine > > by me, but only if the vhost=force option exists. > > > > I actually would think libvirt would want to use vhost=force. Debugging > > with vhost=on is going to be a royal pain in the ass if a user reports > > bad performance. Given the libvirt XML, you can't actually tell from > > the guest and the XML whether or not vhost was actually in use or not. > > If we add a force option, let's please distinguish hotplug from VM > creation time. The latter can abort. Hotplug should print an error and > fail the initfn. It can't abort at init - MSI is disabled at init, it needs to be enabled by the guest later. And aborting the guest in the middle of the run is a very bad idea. What vhostforce=true will do is force vhost backend to be used even if it is slower. > Thanks, > > Alex
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
Gerd Hoffmann writes: > On 01/20/11 20:39, Anthony Liguori wrote: >> On 01/20/2011 02:44 AM, Gerd Hoffmann wrote: >>> Hi, >>> For (2), you cannot use bus=X,addr=Y because it makes assumptions about the PCI topology which may change in newer -M pc's. >>> >>> Why should the PCI topology for 'pc' ever change? >>> >>> We'll probably get q35 support some day, but when this lands I expect >>> we'll see a new machine type 'q35', so '-m q35' will pick the ich9 >>> chipset (which will have a different pci topology of course) and '-m >>> pc' will pick the existing piix chipset (which will continue to look >>> like it looks today). >> >> But then what's the default machine type? When I say -M pc, I really >> mean the default machine. > > I'd tend to leave pc as default for a release cycle or two so we can > hash out issues with q35, then flip the default once it got broader > testing and runs stable. > >> At some point, "qemu-system-x86_64 -device virtio-net-pci,addr=2.0" >> >> Is not going to be a reliable way to invoke qemu because there's no way >> we can guarantee that slot 2 isn't occupied by a chipset device or some >> other default device. > > Indeed. But qemu -M pc should continue to work though. 'pc' would > better named 'piix3', but renaming it now is probably not worth the > trouble. We mustn't change pc-0.14 & friends. We routinely change pc, but whether an upgrade to q35 qualifies as routine change is debatable. If you don't want PCI topology (and more) to change across QEMU updates, consider using the versioned machine types.
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
Gerd Hoffmann writes: > Hi, > >> By the way, we don't have a QEMUState but instead use globals. > > /me wants to underline this. > > IMO it is absolutely pointless to worry about ways to pass around > kvm_state. There never ever will be a serious need for that. > > We can stick with the current model of keeping global state in global > variables. And just do the same with kvm_state. > > Or we can move to have all state in a QEMUState struct which we'll > pass around basically everywhere. Then we can simply embed or > reference kvm_state there. > > I'd tend to stick with the global variables as I don't see the point > in having a QEMUstate. I doubt we'll ever see two virtual machines > driven by a single qemu process. YMMV. /me grabs the fat magic marker and underlines some more.
Re: [Qemu-devel] [PATCH 3/3] make path_combine() especially for filenames, not URLs
Am 12.01.2011 11:57, schrieb Michael Tokarev: > Currently the two routines tries to "understand" and skip : > prefix in path arguments are path_combine() and path_is_absolute() > (the latter isn't used anywhere but in the former). This is wrong, > since notion of absolute path is, at least, protocol-specific. > > The implementation is more wrong on windows where even non-absolute > paths but with drive name (d:foo) should be treated as absolute, as > in, one can't combine, say, c:\bar with d:foo forming c:\foo as > path_combine() currently does. > > Introduce isslash() macro that works correctly on both windows and > unix, use it in is_windows_drive_prefix() (since any form of slash > can be used in constructs like //./), You're saying that \/.\ is a valid format for it? Wow... > remove path_is_absolute() and > hardcode the trivial (but right) condition in path_combine(), and > simplify path_combine() further by removing : handling > and unifying shash searching. > > Signed-off-by: Michael Tokarev > --- > block.c | 72 +- > 1 files changed, 25 insertions(+), 47 deletions(-) > > diff --git a/block.c b/block.c > index 42d6ff1..31a821d 100644 > --- a/block.c > +++ b/block.c > @@ -71,6 +71,9 @@ static BlockDriverState *bs_snapshots; > static int use_bdrv_whitelist; > > #ifdef _WIN32 > + > +#define isslash(c) ((c) == '/' || (c) == '\\') > + > static int is_windows_drive_prefix(const char *filename) > { > return (((filename[0] >= 'a' && filename[0] <= 'z') || > @@ -83,11 +86,17 @@ int is_windows_drive(const char *filename) > if (is_windows_drive_prefix(filename) && > filename[2] == '\0') > return 1; > -if (strstart(filename, ".\\", NULL) || > -strstart(filename, "//./", NULL)) > -return 1; > +if (isslash(filename[0] && isslash(filename[1]) && Missing bracket, doesn't even compile. > +filename[2] == '.' && isslash(filename[3])) > +return 1; /* special case: windows device "//./" */ Please add curly braces. > return 0; > } > + > +#else > + > +#define isslash(c) ((c) == '/') > +#define is_windows_drive_prefix(filename) (0) Please make this a function, for consistency and type checking. The compiler is going to inline it anyway. > + > #endif > > /* check if the path starts with ":" > @@ -115,61 +124,30 @@ static char *path_has_protocol(const char *path) > (char*)p : NULL; > } > > -int path_is_absolute(const char *path) > -{ > -const char *p; > -#ifdef _WIN32 > -/* specific case for names like: "\\.\d:" */ > -if (*path == '/' || *path == '\\') > -return 1; > -#endif > -p = strchr(path, ':'); > -if (p) > -p++; > -else > -p = path; > -#ifdef _WIN32 > -return (*p == '/' || *p == '\\'); > -#else > -return (*p == '/'); > -#endif > -} > - > /* if filename is absolute, just copy it to dest. Otherwise, build a > - path to it by considering it is relative to base_path. URL are > - supported. */ > + path to it by considering it is relative to base_path. > + This is really about filenames not URLs - we don't support > + : prefix in filename since it makes no sense, especially > + if the protocol in base_path is not the same as in filename. > + */ > void path_combine(char *dest, int dest_size, >const char *base_path, >const char *filename) > { > -const char *p, *p1; > +const char *p; > int len; > > if (dest_size <= 0) > return; > -if (path_is_absolute(filename)) { > +/* on windows, d:filename should be treated as absolute too */ > +if (isslash(filename[0]) || is_windows_drive_prefix(filename)) { > pstrcpy(dest, dest_size, filename); > } else { > -p = strchr(base_path, ':'); > -if (p) > -p++; > -else > -p = base_path; > -p1 = strrchr(base_path, '/'); > -#ifdef _WIN32 > -{ > -const char *p2; > -p2 = strrchr(base_path, '\\'); > -if (!p1 || p2 > p1) > -p1 = p2; > -} > -#endif > -if (p1) > -p1++; > -else > -p1 = base_path; > -if (p1 > p) > -p = p1; > +/* find last slash */ > +p = base_path + strlen(base_path); > +while(p >= base_path && !isslash(*p)) > +--p; > +p++; Please add braces. > len = p - base_path; > if (len > dest_size - 1) > len = dest_size - 1; This changes the semantics to throw away the protocol. For example fat:foo combined with bar would have resulted in fat:bar previously and results in bar now. Probably not a problem. path_combine gets even worse if filename has a protocol. It's completely unclear what it's supposed to do with protocols anyway. Kevin
[Qemu-devel] [PATCH 3/5] spitz: make sl-nand emulation use qdev infrastructure
Switch sl-nand emulation to use qdev and vmstate. Also drop ecc_get/_put functions as sl-nand was the only user of that code. Signed-off-by: Dmitry Eremin-Solenikov --- hw/ecc.c | 27 +++- hw/flash.h |3 +- hw/onenand.c |1 + hw/spitz.c | 97 - 4 files changed, 75 insertions(+), 53 deletions(-) diff --git a/hw/ecc.c b/hw/ecc.c index 2fbf167..a75408b 100644 --- a/hw/ecc.c +++ b/hw/ecc.c @@ -74,18 +74,15 @@ void ecc_reset(ECCState *s) } /* Save/restore */ -void ecc_put(QEMUFile *f, ECCState *s) -{ -qemu_put_8s(f, &s->cp); -qemu_put_be16s(f, &s->lp[0]); -qemu_put_be16s(f, &s->lp[1]); -qemu_put_be16s(f, &s->count); -} - -void ecc_get(QEMUFile *f, ECCState *s) -{ -qemu_get_8s(f, &s->cp); -qemu_get_be16s(f, &s->lp[0]); -qemu_get_be16s(f, &s->lp[1]); -qemu_get_be16s(f, &s->count); -} +VMStateDescription vmstate_ecc_state = { +.name = "ecc-state", +.version_id = 0, +.minimum_version_id = 0, +.minimum_version_id_old = 0, +.fields = (VMStateField []) { +VMSTATE_UINT8(cp, ECCState), +VMSTATE_UINT16_ARRAY(lp, ECCState, 2), +VMSTATE_UINT16(count, ECCState), +VMSTATE_END_OF_LIST(), +}, +}; diff --git a/hw/flash.h b/hw/flash.h index a80205c..d7d103e 100644 --- a/hw/flash.h +++ b/hw/flash.h @@ -51,5 +51,4 @@ typedef struct { uint8_t ecc_digest(ECCState *s, uint8_t sample); void ecc_reset(ECCState *s); -void ecc_put(QEMUFile *f, ECCState *s); -void ecc_get(QEMUFile *f, ECCState *s); +extern VMStateDescription vmstate_ecc_state; diff --git a/hw/onenand.c b/hw/onenand.c index d9cdcf2..71c1ab4 100644 --- a/hw/onenand.c +++ b/hw/onenand.c @@ -19,6 +19,7 @@ */ #include "qemu-common.h" +#include "hw.h" #include "flash.h" #include "irq.h" #include "blockdev.h" diff --git a/hw/spitz.c b/hw/spitz.c index 87731d3..c69a121 100644 --- a/hw/spitz.c +++ b/hw/spitz.c @@ -47,8 +47,11 @@ #define FLASHCTL_NCE (FLASHCTL_CE0 | FLASHCTL_CE1) typedef struct { +SysBusDevice busdev; NANDFlashState *nand; uint8_t ctl; +uint8_t manf_id; +uint8_t chip_id; ECCState ecc; } SLNANDState; @@ -131,56 +134,53 @@ static void sl_writeb(void *opaque, target_phys_addr_t addr, } } -static void sl_save(QEMUFile *f, void *opaque) -{ -SLNANDState *s = (SLNANDState *) opaque; - -qemu_put_8s(f, &s->ctl); -ecc_put(f, &s->ecc); -} - -static int sl_load(QEMUFile *f, void *opaque, int version_id) -{ -SLNANDState *s = (SLNANDState *) opaque; - -qemu_get_8s(f, &s->ctl); -ecc_get(f, &s->ecc); - -return 0; -} - enum { FLASH_128M, FLASH_1024M, }; +static CPUReadMemoryFunc * const sl_readfn[] = { +sl_readb, +sl_readb, +sl_readl, +}; +static CPUWriteMemoryFunc * const sl_writefn[] = { +sl_writeb, +sl_writeb, +sl_writeb, +}; + static void sl_flash_register(PXA2xxState *cpu, int size) { +DeviceState *dev; + +dev = qdev_create(NULL, "sl-nand"); + +qdev_prop_set_uint8(dev, "manf_id", NAND_MFR_SAMSUNG); +if (size == FLASH_128M) +qdev_prop_set_uint8(dev, "chip_id", 0x73); +else if (size == FLASH_1024M) +qdev_prop_set_uint8(dev, "chip_id", 0xf1); + +qdev_init_nofail(dev); +sysbus_mmio_map(sysbus_from_qdev(dev), 0, FLASH_BASE); +} + +static int sl_nand_init(SysBusDevice *dev) { int iomemtype; SLNANDState *s; -CPUReadMemoryFunc * const sl_readfn[] = { -sl_readb, -sl_readb, -sl_readl, -}; -CPUWriteMemoryFunc * const sl_writefn[] = { -sl_writeb, -sl_writeb, -sl_writeb, -}; - -s = (SLNANDState *) qemu_mallocz(sizeof(SLNANDState)); + +s = FROM_SYSBUS(SLNANDState, dev); + s->ctl = 0; -if (size == FLASH_128M) -s->nand = nand_init(NAND_MFR_SAMSUNG, 0x73); -else if (size == FLASH_1024M) -s->nand = nand_init(NAND_MFR_SAMSUNG, 0xf1); +s->nand = nand_init(s->manf_id, s->chip_id); iomemtype = cpu_register_io_memory(sl_readfn, sl_writefn, s, DEVICE_NATIVE_ENDIAN); -cpu_register_physical_memory(FLASH_BASE, 0x40, iomemtype); -register_savevm(NULL, "sl_flash", 0, 0, sl_save, sl_load, s); +sysbus_init_mmio(dev, 0x40, iomemtype); + +return 0; } /* Spitz Keyboard */ @@ -1027,6 +1027,30 @@ static void spitz_machine_init(void) machine_init(spitz_machine_init); +static VMStateDescription vmstate_sl_nand_info = { +.name = "sl-nand", +.version_id = 0, +.minimum_version_id = 0, +.minimum_version_id_old = 0, +.fields = (VMStateField []) { +VMSTATE_UINT8(ctl, SLNANDState), +VMSTATE_STRUCT(ecc, SLNANDState, 0, vmstate_ecc_state, ECCState), +VMSTATE_END_OF_LIST(), +}, +}; + +static SysBusDeviceInfo sl_nand_info = { +.init = sl_nand_init, +.qdev.name = "sl-nand", +.qdev.size = sizeof(SLNANDState), +.qdev.vmsd = &vmstate_sl_nand_inf
[Qemu-devel] [PATCH 5/5] pxa2xx_gpio: switch to using qdev
Signed-off-by: Dmitry Eremin-Solenikov --- hw/gumstix.c |4 +- hw/pxa.h | 10 +--- hw/pxa2xx.c |4 +- hw/pxa2xx_gpio.c | 151 ++ hw/spitz.c | 34 ++-- hw/tosa.c| 12 ++-- 6 files changed, 102 insertions(+), 113 deletions(-) diff --git a/hw/gumstix.c b/hw/gumstix.c index af8b464..ee63f63 100644 --- a/hw/gumstix.c +++ b/hw/gumstix.c @@ -78,7 +78,7 @@ static void connex_init(ram_addr_t ram_size, /* Interrupt line of NIC is connected to GPIO line 36 */ smc91c111_init(&nd_table[0], 0x04000300, -pxa2xx_gpio_in_get(cpu->gpio)[36]); +qdev_get_gpio_in(cpu->gpio, 36)); } static void verdex_init(ram_addr_t ram_size, @@ -117,7 +117,7 @@ static void verdex_init(ram_addr_t ram_size, /* Interrupt line of NIC is connected to GPIO line 99 */ smc91c111_init(&nd_table[0], 0x04000300, -pxa2xx_gpio_in_get(cpu->gpio)[99]); +qdev_get_gpio_in(cpu->gpio, 99)); } static QEMUMachine connex_machine = { diff --git a/hw/pxa.h b/hw/pxa.h index 8d6a8c3..f73d33b 100644 --- a/hw/pxa.h +++ b/hw/pxa.h @@ -70,13 +70,9 @@ void pxa25x_timer_init(target_phys_addr_t base, qemu_irq *irqs); void pxa27x_timer_init(target_phys_addr_t base, qemu_irq *irqs, qemu_irq irq4); /* pxa2xx_gpio.c */ -typedef struct PXA2xxGPIOInfo PXA2xxGPIOInfo; -PXA2xxGPIOInfo *pxa2xx_gpio_init(target_phys_addr_t base, +DeviceState *pxa2xx_gpio_init(target_phys_addr_t base, CPUState *env, qemu_irq *pic, int lines); -qemu_irq *pxa2xx_gpio_in_get(PXA2xxGPIOInfo *s); -void pxa2xx_gpio_out_set(PXA2xxGPIOInfo *s, -int line, qemu_irq handler); -void pxa2xx_gpio_read_notifier(PXA2xxGPIOInfo *s, qemu_irq handler); +void pxa2xx_gpio_read_notifier(DeviceState *dev, qemu_irq handler); /* pxa2xx_dma.c */ typedef struct PXA2xxDMAState PXA2xxDMAState; @@ -132,7 +128,7 @@ typedef struct { qemu_irq *pic; qemu_irq reset; PXA2xxDMAState *dma; -PXA2xxGPIOInfo *gpio; +DeviceState *gpio; PXA2xxLCDState *lcd; SSIBus **ssp; PXA2xxI2CState *i2c[2]; diff --git a/hw/pxa2xx.c b/hw/pxa2xx.c index 6e72a5c..d966846 100644 --- a/hw/pxa2xx.c +++ b/hw/pxa2xx.c @@ -2158,7 +2158,7 @@ PXA2xxState *pxa270_init(unsigned int sdram_size, const char *revision) /* GPIO1 resets the processor */ /* The handler can be overridden by board-specific code */ -pxa2xx_gpio_out_set(s->gpio, 1, s->reset); +qdev_connect_gpio_out(s->gpio, 1, s->reset); return s; } @@ -2279,7 +2279,7 @@ PXA2xxState *pxa255_init(unsigned int sdram_size) /* GPIO1 resets the processor */ /* The handler can be overridden by board-specific code */ -pxa2xx_gpio_out_set(s->gpio, 1, s->reset); +qdev_connect_gpio_out(s->gpio, 1, s->reset); return s; } diff --git a/hw/pxa2xx_gpio.c b/hw/pxa2xx_gpio.c index 0d03446..295a0ff 100644 --- a/hw/pxa2xx_gpio.c +++ b/hw/pxa2xx_gpio.c @@ -8,15 +8,17 @@ */ #include "hw.h" +#include "sysbus.h" #include "pxa.h" #define PXA2XX_GPIO_BANKS 4 +typedef struct PXA2xxGPIOInfo PXA2xxGPIOInfo; struct PXA2xxGPIOInfo { -qemu_irq *pic; +SysBusDevice busdev; +qemu_irq irq0, irq1, irqX; int lines; -CPUState *cpu_env; -qemu_irq *in; +void *cpu_env; /* XXX: GNU C vectors are more suitable */ uint32_t ilevel[PXA2XX_GPIO_BANKS]; @@ -66,19 +68,19 @@ static struct { static void pxa2xx_gpio_irq_update(PXA2xxGPIOInfo *s) { if (s->status[0] & (1 << 0)) -qemu_irq_raise(s->pic[PXA2XX_PIC_GPIO_0]); +qemu_irq_raise(s->irq0); else -qemu_irq_lower(s->pic[PXA2XX_PIC_GPIO_0]); +qemu_irq_lower(s->irq0); if (s->status[0] & (1 << 1)) -qemu_irq_raise(s->pic[PXA2XX_PIC_GPIO_1]); +qemu_irq_raise(s->irq1); else -qemu_irq_lower(s->pic[PXA2XX_PIC_GPIO_1]); +qemu_irq_lower(s->irq1); if ((s->status[0] & ~3) | s->status[1] | s->status[2] | s->status[3]) -qemu_irq_raise(s->pic[PXA2XX_PIC_GPIO_X]); +qemu_irq_raise(s->irqX); else -qemu_irq_lower(s->pic[PXA2XX_PIC_GPIO_X]); +qemu_irq_lower(s->irqX); } /* Bitmap of pins used as standby and sleep wake-up sources. */ @@ -114,7 +116,7 @@ static void pxa2xx_gpio_set(void *opaque, int line, int level) pxa2xx_gpio_irq_update(s); /* Wake-up GPIOs */ -if (s->cpu_env->halted && (mask & ~s->dir[bank] & pxa2xx_gpio_wake[bank])) +if (((CPUARMState*)s->cpu_env)->halted && (mask & ~s->dir[bank] & pxa2xx_gpio_wake[bank])) cpu_interrupt(s->cpu_env, CPU_INTERRUPT_EXITTB); } @@ -249,96 +251,87 @@ static CPUWriteMemoryFunc * const pxa2xx_gpio_writefn[] = { pxa2xx_gpio_write }; -static void pxa2xx_gpio_save(QEMUFile *f, void *opaque) -{ -PXA2xxGPIOInfo *s = (PXA2xxGPIOInfo *) opaque; -int i; - -qemu_put_be32(f, s->lines); - -for (i
[Qemu-devel] [PATCH 4/5] spitz: make spitz-keyboard to use qdev infrastructure
Signed-off-by: Dmitry Eremin-Solenikov --- hw/spitz.c | 127 ++-- 1 files changed, 72 insertions(+), 55 deletions(-) diff --git a/hw/spitz.c b/hw/spitz.c index c69a121..5f95bab 100644 --- a/hw/spitz.c +++ b/hw/spitz.c @@ -219,11 +219,10 @@ static const int spitz_gpiomap[5] = { SPITZ_GPIO_AK_INT, SPITZ_GPIO_SYNC, SPITZ_GPIO_ON_KEY, SPITZ_GPIO_SWA, SPITZ_GPIO_SWB, }; -static int spitz_gpio_invert[5] = { 0, 0, 0, 0, 0, }; typedef struct { +SysBusDevice busdev; qemu_irq sense[SPITZ_KEY_SENSE_NUM]; -qemu_irq *strobe; qemu_irq gpiomap[5]; int keymap[0x80]; uint16_t keyrow[SPITZ_KEY_SENSE_NUM]; @@ -274,8 +273,7 @@ static void spitz_keyboard_keydown(SpitzKeyboardState *s, int keycode) /* Handle the additional keys */ if ((spitz_keycode >> 4) == SPITZ_KEY_SENSE_NUM) { -qemu_set_irq(s->gpiomap[spitz_keycode & 0xf], (keycode < 0x80) ^ -spitz_gpio_invert[spitz_keycode & 0xf]); +qemu_set_irq(s->gpiomap[spitz_keycode & 0xf], (keycode < 0x80)); return; } @@ -293,8 +291,9 @@ static void spitz_keyboard_keydown(SpitzKeyboardState *s, int keycode) #define QUEUE_KEY(c) s->fifo[(s->fifopos + s->fifolen ++) & 0xf] = c -static void spitz_keyboard_handler(SpitzKeyboardState *s, int keycode) +static void spitz_keyboard_handler(void *opaque, int keycode) { +SpitzKeyboardState *s = opaque; uint16_t code; int mapcode; switch (keycode) { @@ -440,34 +439,15 @@ static void spitz_keyboard_pre_map(SpitzKeyboardState *s) s->imodifiers = 0; s->fifopos = 0; s->fifolen = 0; -s->kbdtimer = qemu_new_timer(vm_clock, spitz_keyboard_tick, s); -spitz_keyboard_tick(s); } #undef SHIFT #undef CTRL #undef FN -static void spitz_keyboard_save(QEMUFile *f, void *opaque) +static int spitz_keyboard_post_load(void *opaque, int version_id) { SpitzKeyboardState *s = (SpitzKeyboardState *) opaque; -int i; - -qemu_put_be16s(f, &s->sense_state); -qemu_put_be16s(f, &s->strobe_state); -for (i = 0; i < 5; i ++) -qemu_put_byte(f, spitz_gpio_invert[i]); -} - -static int spitz_keyboard_load(QEMUFile *f, void *opaque, int version_id) -{ -SpitzKeyboardState *s = (SpitzKeyboardState *) opaque; -int i; - -qemu_get_be16s(f, &s->sense_state); -qemu_get_be16s(f, &s->strobe_state); -for (i = 0; i < 5; i ++) -spitz_gpio_invert[i] = qemu_get_byte(f); /* Release all pressed keys */ memset(s->keyrow, 0, sizeof(s->keyrow)); @@ -482,36 +462,55 @@ static int spitz_keyboard_load(QEMUFile *f, void *opaque, int version_id) static void spitz_keyboard_register(PXA2xxState *cpu) { -int i, j; +int i; +DeviceState *dev; SpitzKeyboardState *s; -s = (SpitzKeyboardState *) -qemu_mallocz(sizeof(SpitzKeyboardState)); -memset(s, 0, sizeof(SpitzKeyboardState)); - -for (i = 0; i < 0x80; i ++) -s->keymap[i] = -1; -for (i = 0; i < SPITZ_KEY_SENSE_NUM + 1; i ++) -for (j = 0; j < SPITZ_KEY_STROBE_NUM; j ++) -if (spitz_keymap[i][j] != -1) -s->keymap[spitz_keymap[i][j]] = (i << 4) | j; +dev = sysbus_create_simple("spitz-keyboard", -1, NULL); +s = FROM_SYSBUS(SpitzKeyboardState, sysbus_from_qdev(dev)); for (i = 0; i < SPITZ_KEY_SENSE_NUM; i ++) -s->sense[i] = pxa2xx_gpio_in_get(cpu->gpio)[spitz_gpio_key_sense[i]]; +qdev_connect_gpio_out(dev, i, pxa2xx_gpio_in_get(cpu->gpio)[spitz_gpio_key_sense[i]]); for (i = 0; i < 5; i ++) s->gpiomap[i] = pxa2xx_gpio_in_get(cpu->gpio)[spitz_gpiomap[i]]; -s->strobe = qemu_allocate_irqs(spitz_keyboard_strobe, s, -SPITZ_KEY_STROBE_NUM); +if (graphic_rotate) +s->gpiomap[4] = qemu_irq_invert(s->gpiomap[4]); + +for (i = 0; i < 5; i++) +qemu_set_irq(s->gpiomap[i], 0); + for (i = 0; i < SPITZ_KEY_STROBE_NUM; i ++) -pxa2xx_gpio_out_set(cpu->gpio, spitz_gpio_key_strobe[i], s->strobe[i]); +pxa2xx_gpio_out_set(cpu->gpio, spitz_gpio_key_strobe[i], +qdev_get_gpio_in(dev, i)); + +qemu_mod_timer(s->kbdtimer, qemu_get_clock(vm_clock)); + +qemu_add_kbd_event_handler(spitz_keyboard_handler, s); +} + +static int spitz_keyboard_init(SysBusDevice *dev) +{ +SpitzKeyboardState *s; +int i, j; + +s = FROM_SYSBUS(SpitzKeyboardState, dev); + +for (i = 0; i < 0x80; i ++) +s->keymap[i] = -1; +for (i = 0; i < SPITZ_KEY_SENSE_NUM + 1; i ++) +for (j = 0; j < SPITZ_KEY_STROBE_NUM; j ++) +if (spitz_keymap[i][j] != -1) +s->keymap[spitz_keymap[i][j]] = (i << 4) | j; spitz_keyboard_pre_map(s); -qemu_add_kbd_event_handler((QEMUPutKBDEvent *) spitz_keyboard_handler, s); -register_savevm(NULL, "spitz_keyboard", 0, 0, -spitz_keyboard_save, spitz_keyboard_load, s); +s->kbdtimer = q
[Qemu-devel] [ARM] Contributing tests for Neon
Hi all, I have developed some tests for ARM-Neon in the form of C sources files calling ARM Neon intrinsics, and comparing the results of the resulting program with a known reference (eg execution on actual CPU) shows if the execution engine is follows the spec. These tests currently represent 750KB of sources (150 files, 12000 lines), and I would like to contribute them to the qemu testing infrastructure. I have a few questions on how to proceed: - we wish to release the files under the MIT license, is it OK to deliver them as-is in the 'tests' qemu subdir? - given the size of the whole stuff, I don't think it's suitable that I send it on the list for review, what should I do? - at the time of writing, GCC/ARM fails to compile those tests correctly, and I rely on ARM's compiler to generate my reference executable and output. Is it acceptable that the Makefile references armcc? Thanks, Christophe.
Re: [Qemu-devel] [RFC PATCH] Fake machine for scalability testing
Anthony Liguori writes: > On 01/20/2011 11:12 AM, Markus Armbruster wrote: >> Anthony Liguori writes: >> >> >>> On 01/18/2011 02:16 PM, Markus Armbruster wrote: >>> The problem: you want to do serious scalability testing (1000s of VMs) of your management stack. If each guest eats up a few 100MiB and competes for CPU, that requires a serious host machine. Which you don't have. You also don't want to modify the management stack at all, if you can help it. The solution: a perfectly normal-looking QEMU that uses minimal resources. Ability to execute any guest code is strictly optional ;) New option -fake-machine creates a fake machine incapable of running guest code. Completely compiled out by default, enable with configure --enable-fake-machine. With -fake-machine, CPU use is negligible, and memory use is rather modest. Non-fake VM running F-14 live, right after boot: UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD armbru 15707 2558 53 191837 414388 1 21:05 pts/300:00:29 [...] Same VM -fake-machine, after similar time elapsed: UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD armbru 15742 2558 0 85129 9412 0 21:07 pts/300:00:00 [...] We're using a very similar patch for RHEL scalability testing. >>> Interesting, but: >>> >>> 9432 anthony 20 0 153m 14m 5384 S0 0.2 0:00.22 >>> qemu-system-x86 >>> >>> That's qemu-system-x86 -m 4 >>> >> Sure you ran qemu-system-x86 -fake-machine? >> > > No, I didn't try it. My point was that -m 4 is already pretty small. Ah! However, it's not as small as -fake-machine, and eats all the CPU it can get. None-fake VM as above, but with -m 4: UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD armbru 19325 2558 93 39869 17020 1 11:30 pts/300:00:42 [...] And I believe we can make -fake-machine use even less memory than now, with a little more work. >>> In terms of memory overhead, the largest source is not really going to >>> be addressed by -fake-machine (l1_phys_map and phys_ram_dirty). >>> >> git-grep phys_ram_dirty finds nothing. >> > > Yeah, it's now ram_list[i].phys_dirty. > > l1_phys_map is (sizeof(void *) + sizeof(PhysPageDesc)) * mem_size_in_pages > > phys_dirty is mem_size_in_pages bytes. Thanks. >>> I don't really understand the point of not creating a VCPU with KVM. >>> Is there some type of overhead in doing that? >>> >> I briefly looked at both main loops, TCG's was the first one I happened >> to crack, and I didn't feel like doing both then. If the general >> approach is okay, I'll gladly investigate how to do it with KVM. >> > > I guess what I don't understand is why do you need to not run guest > code? Specifically, if you remove the following, is it any less > useful? > > diff --git a/cpu-exec.c b/cpu-exec.c > index 8c9fb8b..cd1259a 100644 > --- a/cpu-exec.c > +++ b/cpu-exec.c > @@ -230,7 +230,7 @@ int cpu_exec(CPUState *env1) > uint8_t *tc_ptr; > unsigned long next_tb; > > -if (cpu_halted(env1) == EXCP_HALTED) > +if (fake_machine || cpu_halted(env1) == EXCP_HALTED) > > return EXCP_HALTED; I don't want 1000s of guests running infinite "not enough memory to do anything useful, panic!" reboot loops. Because that's 1000s of guests competing for CPU. If you think we can achieve my goals (stated in my first paragraph) in a different way, I'm all ears.
Re: [Qemu-devel] [PATCH 2/3] block: tell drivers about an image resize
Am 19.01.2011 18:02, schrieb Christoph Hellwig: > Extend the change_cb callback with a reason argument, and use it > to tell drivers about size changes. > > Signed-off-by: Christoph Hellwig > > Index: qemu/block.c > === > --- qemu.orig/block.c 2011-01-18 20:54:45.246021572 +0100 > +++ qemu/block.c 2011-01-18 20:56:54.117255612 +0100 > @@ -645,7 +645,7 @@ int bdrv_open(BlockDriverState *bs, cons > /* call the change callback */ > bs->media_changed = 1; > if (bs->change_cb) > -bs->change_cb(bs->change_opaque); > +bs->change_cb(bs->change_opaque, CHANGE_MEDIA); > } > > return 0; > @@ -684,7 +684,7 @@ void bdrv_close(BlockDriverState *bs) > /* call the change callback */ > bs->media_changed = 1; > if (bs->change_cb) > -bs->change_cb(bs->change_opaque); > +bs->change_cb(bs->change_opaque, CHANGE_MEDIA); > } > } > > @@ -1135,6 +1135,8 @@ int bdrv_truncate(BlockDriverState *bs, > ret = drv->bdrv_truncate(bs, offset); > if (ret == 0) { > ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS); > +if (bs->change_cb) > +bs->change_cb(bs->change_opaque, CHANGE_SIZE); Braces are missing here. Apart from that and Stefan's s/id/device/ the series looks good to me. Kevin
[Qemu-devel] Re: [PATCH] pci/pcie: make pci_find_device() ARI aware.
On Thu, Jan 20, 2011 at 04:15:48PM +0200, Michael S. Tsirkin wrote: > On Thu, Jan 20, 2011 at 03:57:39PM +0900, Isaku Yamahata wrote: > > make pci_find_device() ARI aware. > > > > Signed-off-by: Isaku Yamahata > > --- > > hw/pci.c |7 +++ > > 1 files changed, 7 insertions(+), 0 deletions(-) > > > > diff --git a/hw/pci.c b/hw/pci.c > > index 8d0e3df..851f350 100644 > > --- a/hw/pci.c > > +++ b/hw/pci.c > > @@ -1596,11 +1596,18 @@ PCIBus *pci_find_bus(PCIBus *bus, int bus_num) > > > > PCIDevice *pci_find_device(PCIBus *bus, int bus_num, int slot, int > > function) > > { > > +PCIDevice *d; > > bus = pci_find_bus(bus, bus_num); > > > > if (!bus) > > return NULL; > > > > +d = bus->parent_dev; > > +if (d && pci_is_express(d) && > > +pcie_cap_get_type(d) == PCI_EXP_TYPE_DOWNSTREAM && > > +!pcie_cap_is_ari_enabled(d) && slot > 0) { > > +return NULL; > > +} > > return bus->devices[PCI_DEVFN(slot, function)]; > > } > > I'd like to split this express-specific code out in some way. > Further, the downstream port always has a single slot. > Maybe it shouldn't be a bus at all, but this needs some thought. Yes, it needs consideration. > How about we put flag in PCIBus that says that a single > slot is supported? Downstream ports would just set it. So such a flag must be set/clear by something like pcie_cap_ari_write_config() depending on ARI bit at runtime. pcie device can have 256 functions instead of 8. Maybe we'd like to emulate how p2p bridge transfers pci transaction to child pci bus somehow. So how about introducing new method? something like the following. Which is your preference? new flag or new method? PCIBus PCIDevice (*get_device)(bus, slot, function) pci_get_device_default() return bus->devices[PCI_DEVFN(slot, function)]; pcie_downstream_get_device() d = bus->parent_dev; if (!pcie_cap_is_ari_enabled(d) && slot > 0) { return NULL } return bus->devices[PCI_DEVFN(slot, function)]; pci_find_device() ... return bus->get_device(bus, slot, function) -- yamahata
Re: [Qemu-devel] [RFC PATCH] Fake machine for scalability testing
On Thu, Jan 20, 2011 at 01:50:33PM -0600, Anthony Liguori wrote: > On 01/20/2011 11:12 AM, Markus Armbruster wrote: > >Anthony Liguori writes: > > > >>On 01/18/2011 02:16 PM, Markus Armbruster wrote: > >>>The problem: you want to do serious scalability testing (1000s of VMs) > >>>of your management stack. If each guest eats up a few 100MiB and > >>>competes for CPU, that requires a serious host machine. Which you don't > >>>have. You also don't want to modify the management stack at all, if you > >>>can help it. > >>> > >>>The solution: a perfectly normal-looking QEMU that uses minimal > >>>resources. Ability to execute any guest code is strictly optional ;) > >>> > >>>New option -fake-machine creates a fake machine incapable of running > >>>guest code. Completely compiled out by default, enable with configure > >>>--enable-fake-machine. > >>> > >>>With -fake-machine, CPU use is negligible, and memory use is rather > >>>modest. > >>> > >>>Non-fake VM running F-14 live, right after boot: > >>>UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD > >>>armbru 15707 2558 53 191837 414388 1 21:05 pts/300:00:29 [...] > >>> > >>>Same VM -fake-machine, after similar time elapsed: > >>>UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD > >>>armbru 15742 2558 0 85129 9412 0 21:07 pts/300:00:00 [...] > >>> > >>>We're using a very similar patch for RHEL scalability testing. > >>> > >>Interesting, but: > >> > >> 9432 anthony 20 0 153m 14m 5384 S0 0.2 0:00.22 > >>qemu-system-x86 > >> > >>That's qemu-system-x86 -m 4 > >Sure you ran qemu-system-x86 -fake-machine? > > No, I didn't try it. My point was that -m 4 is already pretty small. One of the core ideas/requirements behind the "fake QEMU" was that we won't need to modify applications to adjust the command line arguments in this kind of way. We want all their machine definition logic to remain unaffected. In fact my original prototype did not even require addition of the passing of an extra '-fake-machine' argument, it would have just been a plain drop in alternative QEMU binary. It also stubbed out much of the KVM codepaths, so you could "enable" KVM mode without actually really having KVM available on the host. > >>In terms of memory overhead, the largest source is not really going to > >>be addressed by -fake-machine (l1_phys_map and phys_ram_dirty). > >git-grep phys_ram_dirty finds nothing. > > Yeah, it's now ram_list[i].phys_dirty. > > l1_phys_map is (sizeof(void *) + sizeof(PhysPageDesc)) * mem_size_in_pages > > phys_dirty is mem_size_in_pages bytes. > > >>I don't really understand the point of not creating a VCPU with KVM. > >>Is there some type of overhead in doing that? > >I briefly looked at both main loops, TCG's was the first one I happened > >to crack, and I didn't feel like doing both then. If the general > >approach is okay, I'll gladly investigate how to do it with KVM. > > I guess what I don't understand is why do you need to not run guest > code? Specifically, if you remove the following, is it any less > useful? IIUC, if you don't have the following change, then the guest CPUs will actually execute the kernel/bootable disk configured, causing host CPU utilization to rise. Even if it only adds 2% load on the host, this quickly becomes an issue as you get 200 or more VMs on the host. Ideally we would have the main loop completely disabled, not merely the CPUs, because this would avoid all possible background CPU load that any QEMU internal timers might cause. Basically the desired goal is, make no change to the QEMU command line aguments, but have zero memory and CPU overhead by running QEMU. fake-machine doesn't get as close to zero as my original fake QEMU target managed, but it is still pretty good, considering how much less code is involved in fake-machine. > diff --git a/cpu-exec.c b/cpu-exec.c > index 8c9fb8b..cd1259a 100644 > --- a/cpu-exec.c > +++ b/cpu-exec.c > @@ -230,7 +230,7 @@ int cpu_exec(CPUState *env1) > uint8_t *tc_ptr; > unsigned long next_tb; > > -if (cpu_halted(env1) == EXCP_HALTED) > +if (fake_machine || cpu_halted(env1) == EXCP_HALTED) > > return EXCP_HALTED; Daniel
[Qemu-devel] [PATCH] audio: consolidate audio_init()
consolidate audio_init() and remove references to shoundhw. Cc: Blue Swirl Signed-off-by: Isaku Yamahata --- arch_init.c | 35 ++- arch_init.h |1 + hw/mips_jazz.c | 24 ++-- hw/mips_malta.c | 23 ++- hw/pc.c | 17 - hw/pc.h |3 --- hw/pc_piix.c|3 ++- sysemu.h| 15 --- 8 files changed, 41 insertions(+), 80 deletions(-) diff --git a/arch_init.c b/arch_init.c index e32e289..cc56f0f 100644 --- a/arch_init.c +++ b/arch_init.c @@ -461,7 +461,18 @@ void qemu_service_io(void) } #ifdef HAS_AUDIO -struct soundhw soundhw[] = { +struct soundhw { +const char *name; +const char *descr; +int enabled; +int isa; +union { +int (*init_isa) (qemu_irq *pic); +int (*init_pci) (PCIBus *bus); +} init; +}; + +static struct soundhw soundhw[] = { #ifdef HAS_AUDIO_CHOICE #if defined(TARGET_I386) || defined(TARGET_MIPS) { @@ -610,10 +621,32 @@ void select_soundhw(const char *optarg) } } } + +void audio_init(qemu_irq *isa_pic, PCIBus *pci_bus) +{ +struct soundhw *c; + +for (c = soundhw; c->name; ++c) { +if (c->enabled) { +if (c->isa) { +if (isa_pic) { +c->init.init_isa(isa_pic); +} +} else { +if (pci_bus) { +c->init.init_pci(pci_bus); +} +} +} +} +} #else void select_soundhw(const char *optarg) { } +void audio_init(qemu_irq *isa_pic, PCIBus *pci_bus) +{ +} #endif int qemu_uuid_parse(const char *str, uint8_t *uuid) diff --git a/arch_init.h b/arch_init.h index 682890c..17c9164 100644 --- a/arch_init.h +++ b/arch_init.h @@ -27,6 +27,7 @@ void do_acpitable_option(const char *optarg); void do_smbios_option(const char *optarg); void cpudef_init(void); int audio_available(void); +void audio_init(qemu_irq *isa_pic, PCIBus *pci_bus); int kvm_available(void); int xen_available(void); diff --git a/hw/mips_jazz.c b/hw/mips_jazz.c index a7caa3f..85eba5a 100644 --- a/hw/mips_jazz.c +++ b/hw/mips_jazz.c @@ -29,7 +29,7 @@ #include "isa.h" #include "fdc.h" #include "sysemu.h" -#include "audio/audio.h" +#include "arch_init.h" #include "boards.h" #include "net.h" #include "esp.h" @@ -90,26 +90,6 @@ static CPUWriteMemoryFunc * const dma_dummy_write[3] = { dma_dummy_writeb, }; -static void audio_init(qemu_irq *pic) -{ -struct soundhw *c; -int audio_enabled = 0; - -for (c = soundhw; !audio_enabled && c->name; ++c) { -audio_enabled = c->enabled; -} - -if (audio_enabled) { -for (c = soundhw; c->name; ++c) { -if (c->enabled) { -if (c->isa) { -c->init.init_isa(pic); -} -} -} -} -} - #define MAGNUM_BIOS_SIZE_MAX 0x7e000 #define MAGNUM_BIOS_SIZE (BIOS_SIZE < MAGNUM_BIOS_SIZE_MAX ? BIOS_SIZE : MAGNUM_BIOS_SIZE_MAX) @@ -284,7 +264,7 @@ void mips_jazz_init (ram_addr_t ram_size, /* Sound card */ /* FIXME: missing Jazz sound at 0x8000c000, rc4030[2] */ -audio_init(i8259); +audio_init(i8259, NULL); /* NVRAM: Unprotected at 0x9000, Protected at 0xa000, Read only at 0xb000 */ ds1225y_init(0x80009000, "nvram"); diff --git a/hw/mips_malta.c b/hw/mips_malta.c index d3ba969..9ae1e1f 100644 --- a/hw/mips_malta.c +++ b/hw/mips_malta.c @@ -37,7 +37,7 @@ #include "vmware_vga.h" #include "qemu-char.h" #include "sysemu.h" -#include "audio/audio.h" +#include "arch_init.h" #include "boards.h" #include "qemu-log.h" #include "mips-bios.h" @@ -457,25 +457,6 @@ static MaltaFPGAState *malta_fpga_init(target_phys_addr_t base, qemu_irq uart_ir return s; } -/* Audio support */ -static void audio_init (PCIBus *pci_bus) -{ -struct soundhw *c; -int audio_enabled = 0; - -for (c = soundhw; !audio_enabled && c->name; ++c) { -audio_enabled = c->enabled; -} - -if (audio_enabled) { -for (c = soundhw; c->name; ++c) { -if (c->enabled) { -c->init.init_pci(pci_bus); -} -} -} -} - /* Network support */ static void network_init(void) { @@ -967,7 +948,7 @@ void mips_malta_init (ram_addr_t ram_size, fdctrl_init_isa(fd); /* Sound card */ -audio_init(pci_bus); +audio_init(NULL, pci_bus); /* Network card */ network_init(); diff --git a/hw/pc.c b/hw/pc.c index fface7d..4dfdc0b 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -829,23 +829,6 @@ static const int ne2000_irq[NE2000_NB_MAX] = { 9, 10, 11, 3, 4, 5 }; static const int parallel_io[MAX_PARALLEL_PORTS] = { 0x378, 0x278, 0x3bc }; static const int parallel_irq[MAX_PARALLEL_PORTS] = { 7, 7, 7 }; -void pc_audio_init (PCIBus *pci_bus, qemu_irq *pic) -{ -struct soundhw *c; - -for (c = soundhw; c->name; ++c) { -if (c->enabled) { -
[Qemu-devel] [PATCH] mips_fulong: remove bogus HAS_AUDIO
remove bogus HAS_AUDIO according to 738012bec4c67e697e766edadab3f522c552a04d. Cc: Blue Swirl Cc: Huacai Chen Cc: Aurelien Jarno Signed-off-by: Isaku Yamahata --- hw/mips_fulong2e.c |4 1 files changed, 0 insertions(+), 4 deletions(-) diff --git a/hw/mips_fulong2e.c b/hw/mips_fulong2e.c index 07eb9ee..2783ed5 100644 --- a/hw/mips_fulong2e.c +++ b/hw/mips_fulong2e.c @@ -218,13 +218,11 @@ uint8_t eeprom_spd[0x80] = { }; /* Audio support */ -#ifdef HAS_AUDIO static void audio_init (PCIBus *pci_bus) { vt82c686b_ac97_init(pci_bus, PCI_DEVFN(FULONG2E_VIA_SLOT, 5)); vt82c686b_mc97_init(pci_bus, PCI_DEVFN(FULONG2E_VIA_SLOT, 6)); } -#endif /* Network support */ static void network_init (void) @@ -391,9 +389,7 @@ static void mips_fulong2e_init(ram_addr_t ram_size, const char *boot_device, } /* Sound card */ -#ifdef HAS_AUDIO audio_init(pci_bus); -#endif /* Network card */ network_init(); } -- 1.7.1.1
Re: [Qemu-devel] [PATCH v2 1/3] scsi-disk: Allow overriding SCSI INQUIRY removable bit
On Tue, Jan 18, 2011 at 12:06 PM, Stefan Hajnoczi wrote: > On Tue, Jan 18, 2011 at 11:39 AM, Kevin Wolf wrote: >> Am 18.01.2011 11:10, schrieb Stefan Hajnoczi: >>> Provide the "removable" qdev property bit to override the SCSI INQUIRY >>> removable (RMB) bit for non-CDROM devices. This will be used by USB >>> Mass Storage Devices, which sometimes have this guest-visible bit set >>> and sometimes do not. They therefore requires a means for user >>> configuration. >>> >>> Signed-off-by: Stefan Hajnoczi >> >> Should we print an error message when the user tries to make a CD-ROM >> non-removable instead of silently ignoring the option? > > Good point. I will add a check in scsi_disk_initfn() for v3. Actually this case is hard to check against. The removable property is a boolean that defaults to false. We can't detect the difference between default, cleared, or set. I'm sending out a new version of the patch series that updates docs/qdev-device-use.txt to describe how the removable property works. Stefan
[Qemu-devel] [PATCH v3 3/4] usb-msd: Propagate removable bit to SCSI device
USB Mass Storage Devices sometimes have the RMB (removable) bit set in the SCSI INQUIRY response. Thumbdrives tend to have the bit set whereas hard disks do not. Operating systems differentiate between removable devices and fixed devices. Under Linux, the anaconda installer looks for removable devices. Under Windows, only fixed devices may have more than one partition and AutoRun is also affected by the removable bit. For these reasons, allow USB Mass Storage Devices to override the removable bit: qemu -usb -drive if=none,file=test.img,cache=none,id=disk0 -device usb-storage,drive=disk0,removable=on The default is off. Signed-off-by: Stefan Hajnoczi --- hw/usb-msd.c |4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/hw/usb-msd.c b/hw/usb-msd.c index ee897b6..f1f150b 100644 --- a/hw/usb-msd.c +++ b/hw/usb-msd.c @@ -50,6 +50,7 @@ typedef struct { SCSIBus bus; BlockConf conf; SCSIDevice *scsi_dev; +uint32_t removable; int result; /* For async completion. */ USBPacket *packet; @@ -544,7 +545,7 @@ static int usb_msd_initfn(USBDevice *dev) s->dev.speed = USB_SPEED_FULL; scsi_bus_new(&s->bus, &s->dev.qdev, 0, 1, usb_msd_command_complete); -s->scsi_dev = scsi_bus_legacy_add_drive(&s->bus, bs, 0, false); +s->scsi_dev = scsi_bus_legacy_add_drive(&s->bus, bs, 0, !!s->removable); if (!s->scsi_dev) { return -1; } @@ -634,6 +635,7 @@ static struct USBDeviceInfo msd_info = { .usbdevice_init = usb_msd_init, .qdev.props = (Property[]) { DEFINE_BLOCK_PROPERTIES(MSDState, conf), +DEFINE_PROP_BIT("removable", MSDState, removable, 0, false), DEFINE_PROP_END_OF_LIST(), }, }; -- 1.7.2.3
[Qemu-devel] [PATCH] monitor: use after free in do_wav_capture()
use after free in do_wav_capture() on the error path. Signed-off-by: Isaku Yamahata --- monitor.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/monitor.c b/monitor.c index d291158..cab5f20 100644 --- a/monitor.c +++ b/monitor.c @@ -2511,6 +2511,7 @@ static void do_wav_capture(Monitor *mon, const QDict *qdict) if (wav_start_capture (s, path, freq, bits, nchannels)) { monitor_printf(mon, "Faied to add wave capture\n"); qemu_free (s); +return; } QLIST_INSERT_HEAD (&capture_head, s, entries); } -- 1.7.1.1
[Qemu-devel] [PATCH v3 1/4] scsi-disk: Allow overriding SCSI INQUIRY removable bit
Provide the "removable" qdev property bit to override the SCSI INQUIRY removable (RMB) bit for non-CDROM devices. This will be used by USB Mass Storage Devices, which sometimes have this guest-visible bit set and sometimes do not. They therefore requires a means for user configuration. Signed-off-by: Stefan Hajnoczi --- hw/scsi-disk.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index 6cb317c..488eedd 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -72,6 +72,7 @@ struct SCSIDiskState /* The qemu block layer uses a fixed 512 byte sector size. This is the number of 512 byte blocks in a single scsi sector. */ int cluster_size; +uint32_t removable; uint64_t max_lba; QEMUBH *bh; char *version; @@ -552,6 +553,7 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf) memcpy(&outbuf[16], "QEMU CD-ROM ", 16); } else { outbuf[0] = 0; +outbuf[1] = s->removable ? 0x80 : 0; memcpy(&outbuf[16], "QEMU HARDDISK ", 16); } memcpy(&outbuf[8], "QEMU", 8); @@ -1295,6 +1297,7 @@ static SCSIDeviceInfo scsi_disk_info = { DEFINE_BLOCK_PROPERTIES(SCSIDiskState, qdev.conf), DEFINE_PROP_STRING("ver", SCSIDiskState, version), DEFINE_PROP_STRING("serial", SCSIDiskState, serial), +DEFINE_PROP_BIT("removable", SCSIDiskState, removable, 0, false), DEFINE_PROP_END_OF_LIST(), }, }; -- 1.7.2.3
[Qemu-devel] [PATCH v3 4/4] docs: Document scsi-disk and usb-storage removable parameter
Signed-off-by: Stefan Hajnoczi --- docs/qdev-device-use.txt | 13 +++-- 1 files changed, 11 insertions(+), 2 deletions(-) diff --git a/docs/qdev-device-use.txt b/docs/qdev-device-use.txt index f2f9b75..e43 100644 --- a/docs/qdev-device-use.txt +++ b/docs/qdev-device-use.txt @@ -80,7 +80,11 @@ The -device argument differs in detail for each kind of drive: This SCSI controller a single SCSI bus, named ID.0. Put a disk on it: - -device scsi-disk,drive=DRIVE-ID,bus=ID.0,scsi-id=SCSI-ID + -device scsi-disk,drive=DRIVE-ID,bus=ID.0,scsi-id=SCSI-ID,removable=RMB + + The (optional) removable parameter lets you override the SCSI INQUIRY + removable (RMB) bit for non CD-ROM devices. It is ignored for CD-ROM devices + which are always removable. RMB is "on" or "off". * if=floppy @@ -116,7 +120,12 @@ For USB devices, the old way is actually different: Provides much less control than -drive's HOST-OPTS... The new way fixes that: --device usb-storage,drive=DRIVE-ID +-device usb-storage,drive=DRIVE-ID,removable=RMB + +The removable parameter gives control over the SCSI INQUIRY removable (RMB) +bit. USB thumbdrives usually set removable=on, while USB hard disks set +removable=off. See the if=scsi description above for details on the removable +parameter. === Character Devices === -- 1.7.2.3
[Qemu-devel] [PATCH v3 2/4] scsi: Allow SCSI devices to override the removable bit
Some SCSI devices may wish to override the removable bit. Add support for a qdev property on the SCSI device. Signed-off-by: Stefan Hajnoczi --- hw/pci-hotplug.c |2 +- hw/scsi-bus.c|8 ++-- hw/scsi.h|3 ++- hw/usb-msd.c |2 +- 4 files changed, 10 insertions(+), 5 deletions(-) diff --git a/hw/pci-hotplug.c b/hw/pci-hotplug.c index 716133c..270a982 100644 --- a/hw/pci-hotplug.c +++ b/hw/pci-hotplug.c @@ -90,7 +90,7 @@ static int scsi_hot_add(Monitor *mon, DeviceState *adapter, * specified). */ dinfo->unit = qemu_opt_get_number(dinfo->opts, "unit", -1); -scsidev = scsi_bus_legacy_add_drive(scsibus, dinfo->bdrv, dinfo->unit); +scsidev = scsi_bus_legacy_add_drive(scsibus, dinfo->bdrv, dinfo->unit, false); if (!scsidev) { return -1; } diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c index 7febb86..ceeb4ec 100644 --- a/hw/scsi-bus.c +++ b/hw/scsi-bus.c @@ -87,7 +87,8 @@ void scsi_qdev_register(SCSIDeviceInfo *info) } /* handle legacy '-drive if=scsi,...' cmd line args */ -SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, int unit) +SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, + int unit, bool removable) { const char *driver; DeviceState *dev; @@ -95,6 +96,9 @@ SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, int driver = bdrv_is_sg(bdrv) ? "scsi-generic" : "scsi-disk"; dev = qdev_create(&bus->qbus, driver); qdev_prop_set_uint32(dev, "scsi-id", unit); +if (qdev_prop_exists(dev, "removable")) { +qdev_prop_set_bit(dev, "removable", removable); +} if (qdev_prop_set_drive(dev, "drive", bdrv) < 0) { qdev_free(dev); return NULL; @@ -117,7 +121,7 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus) continue; } qemu_opts_loc_restore(dinfo->opts); -if (!scsi_bus_legacy_add_drive(bus, dinfo->bdrv, unit)) { +if (!scsi_bus_legacy_add_drive(bus, dinfo->bdrv, unit, false)) { res = -1; break; } diff --git a/hw/scsi.h b/hw/scsi.h index bf02adf..846fbba 100644 --- a/hw/scsi.h +++ b/hw/scsi.h @@ -94,7 +94,8 @@ static inline SCSIBus *scsi_bus_from_device(SCSIDevice *d) return DO_UPCAST(SCSIBus, qbus, d->qdev.parent_bus); } -SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, int unit); +SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, + int unit, bool removable); int scsi_bus_legacy_handle_cmdline(SCSIBus *bus); SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t lun); diff --git a/hw/usb-msd.c b/hw/usb-msd.c index 0a95d8d..ee897b6 100644 --- a/hw/usb-msd.c +++ b/hw/usb-msd.c @@ -544,7 +544,7 @@ static int usb_msd_initfn(USBDevice *dev) s->dev.speed = USB_SPEED_FULL; scsi_bus_new(&s->bus, &s->dev.qdev, 0, 1, usb_msd_command_complete); -s->scsi_dev = scsi_bus_legacy_add_drive(&s->bus, bs, 0); +s->scsi_dev = scsi_bus_legacy_add_drive(&s->bus, bs, 0, false); if (!s->scsi_dev) { return -1; } -- 1.7.2.3
[Qemu-devel] [PATCH v3 0/4] usb-msd: Add usb-storage, removable=on|off property
Allow overriding the SCSI INQUIRY removable (RMB) bit for scsi-disk and usb-msd devices. In particular this addresses the problem that some usb-msd devices have the bit set while other do not have it set. Now the user can choose and get desired guest behavior. qemu -usb -drive if=none,file=test.img,cache=none,id=disk0 -device usb-storage,drive=disk0,removable=on The default is off. v3: * Document removable property in qdev-device-use.txt * Use bit number 0 instead of bit 1 for the qdev property v2: * Rewritten to override the bit at the scsi-disk level docs/qdev-device-use.txt | 13 +++-- hw/pci-hotplug.c |2 +- hw/scsi-bus.c|8 ++-- hw/scsi-disk.c |3 +++ hw/scsi.h|3 ++- hw/usb-msd.c |4 +++- 6 files changed, 26 insertions(+), 7 deletions(-)
Re: [Qemu-devel] [PATCH v2 1/3] scsi-disk: Allow overriding SCSI INQUIRY removable bit
Am 21.01.2011 11:45, schrieb Stefan Hajnoczi: > On Tue, Jan 18, 2011 at 12:06 PM, Stefan Hajnoczi wrote: >> On Tue, Jan 18, 2011 at 11:39 AM, Kevin Wolf wrote: >>> Am 18.01.2011 11:10, schrieb Stefan Hajnoczi: Provide the "removable" qdev property bit to override the SCSI INQUIRY removable (RMB) bit for non-CDROM devices. This will be used by USB Mass Storage Devices, which sometimes have this guest-visible bit set and sometimes do not. They therefore requires a means for user configuration. Signed-off-by: Stefan Hajnoczi >>> >>> Should we print an error message when the user tries to make a CD-ROM >>> non-removable instead of silently ignoring the option? >> >> Good point. I will add a check in scsi_disk_initfn() for v3. > > Actually this case is hard to check against. The removable property > is a boolean that defaults to false. We can't detect the difference > between default, cleared, or set. Hm, I see... Maybe we should make scsi-disk and scsi-cdrom different devices in the long run, then scsi-cdrom could default to true (or rather not have the property at all). > I'm sending out a new version of the patch series that updates > docs/qdev-device-use.txt to describe how the removable property works. Okay, let's just document how it works. Kevin
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
On 21 janv. 2011, at 10:16, Kevin Wolf wrote: > Am 19.01.2011 15:59, schrieb Pierre Riteau: >> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return >> value of bdrv_write and aborts migration when it fails. However, if the >> size of the block device to migrate is not a multiple of BLOCK_SIZE >> (currently 1 MB), the last bdrv_write will fail with -EIO. >> >> Fixed by calling bdrv_write with the correct size of the last block. >> --- >> block-migration.c | 16 +++- >> 1 files changed, 15 insertions(+), 1 deletions(-) >> >> diff --git a/block-migration.c b/block-migration.c >> index 1475325..eeb9c62 100644 >> --- a/block-migration.c >> +++ b/block-migration.c >> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int >> version_id) >> int64_t addr; >> BlockDriverState *bs; >> uint8_t *buf; >> +int64_t total_sectors; >> +int nr_sectors; >> >> do { >> addr = qemu_get_be64(f); >> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int >> version_id) >> return -EINVAL; >> } >> >> +total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; >> +if (total_sectors <= 0) { >> +fprintf(stderr, "Error getting length of block device >> %s\n", device_name); >> +return -EINVAL; >> +} > > Can you resend the patch with error_report(), as Yoshi mentioned? > > Also, I would move the total_sectors calculation outside the loop - > though I have no idea how many iterations it typically has, so it might > not improve things a lot. Actually, it is not possible to move the total_sectors calculation outside the loop, since the loop can receive blocks from any device (this is why each block is prefixed by the device name). I'm sending a new patch with a small optimization to avoid recalculating total_sectors when the device doesn't change in the next iteration. -- Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France http://perso.univ-rennes1.fr/pierre.riteau/
[Qemu-devel] [PATCH v2] Fix block migration when the device size is not a multiple of 1 MB
b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return value of bdrv_write and aborts migration when it fails. However, if the size of the block device to migrate is not a multiple of BLOCK_SIZE (currently 1 MB), the last bdrv_write will fail with -EIO. Fixed by calling bdrv_write with the correct size of the last block. Signed-off-by: Pierre Riteau --- This v2 has the following changes: - use error_report instead of fprintf (comment from Yoshiaki Tamura) - don't recompute total_sectors when the device hasn't changed since the previous iteration (comment from Kevin Wolf) block-migration.c | 22 -- 1 files changed, 20 insertions(+), 2 deletions(-) diff --git a/block-migration.c b/block-migration.c index 1475325..5f1473a 100644 --- a/block-migration.c +++ b/block-migration.c @@ -633,8 +633,10 @@ static int block_load(QEMUFile *f, void *opaque, int version_id) int len, flags; char device_name[256]; int64_t addr; -BlockDriverState *bs; +BlockDriverState *bs, *bs_prev = NULL; uint8_t *buf; +int64_t total_sectors = 0; +int nr_sectors; do { addr = qemu_get_be64(f); @@ -656,10 +658,26 @@ static int block_load(QEMUFile *f, void *opaque, int version_id) return -EINVAL; } +if (bs != bs_prev) { +bs_prev = bs; +total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; +if (total_sectors <= 0) { +error_report("Error getting length of block device %s\n", + device_name); +return -EINVAL; +} +} + +if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) { +nr_sectors = total_sectors - addr; +} else { +nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; +} + buf = qemu_malloc(BLOCK_SIZE); qemu_get_buffer(f, buf, BLOCK_SIZE); -ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK); +ret = bdrv_write(bs, addr, buf, nr_sectors); qemu_free(buf); if (ret < 0) { -- 1.7.3.5
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
Am 21.01.2011 12:38, schrieb Pierre Riteau: > On 21 janv. 2011, at 10:16, Kevin Wolf wrote: > >> Am 19.01.2011 15:59, schrieb Pierre Riteau: >>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return >>> value of bdrv_write and aborts migration when it fails. However, if the >>> size of the block device to migrate is not a multiple of BLOCK_SIZE >>> (currently 1 MB), the last bdrv_write will fail with -EIO. >>> >>> Fixed by calling bdrv_write with the correct size of the last block. >>> --- >>> block-migration.c | 16 +++- >>> 1 files changed, 15 insertions(+), 1 deletions(-) >>> >>> diff --git a/block-migration.c b/block-migration.c >>> index 1475325..eeb9c62 100644 >>> --- a/block-migration.c >>> +++ b/block-migration.c >>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int >>> version_id) >>> int64_t addr; >>> BlockDriverState *bs; >>> uint8_t *buf; >>> +int64_t total_sectors; >>> +int nr_sectors; >>> >>> do { >>> addr = qemu_get_be64(f); >>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int >>> version_id) >>> return -EINVAL; >>> } >>> >>> +total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; >>> +if (total_sectors <= 0) { >>> +fprintf(stderr, "Error getting length of block device >>> %s\n", device_name); >>> +return -EINVAL; >>> +} >> >> Can you resend the patch with error_report(), as Yoshi mentioned? >> >> Also, I would move the total_sectors calculation outside the loop - >> though I have no idea how many iterations it typically has, so it might >> not improve things a lot. > > Actually, it is not possible to move the total_sectors calculation outside > the loop, since the loop can receive blocks from any device (this is why each > block is prefixed by the device name). > I'm sending a new patch with a small optimization to avoid recalculating > total_sectors when the device doesn't change in the next iteration. Right, I should have read a bit more context... I won't insist on an optimization in this case, but if you like to do it, go ahead. Kevin
[Qemu-devel] Re: [PATCH v2] Fix block migration when the device size is not a multiple of 1 MB
Am 21.01.2011 12:42, schrieb Pierre Riteau: > b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return > value of bdrv_write and aborts migration when it fails. However, if the > size of the block device to migrate is not a multiple of BLOCK_SIZE > (currently 1 MB), the last bdrv_write will fail with -EIO. > > Fixed by calling bdrv_write with the correct size of the last block. > > Signed-off-by: Pierre Riteau Thanks, applied to the block branch. Kevin
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
2011/1/21 Pierre Riteau : > Le 20 janv. 2011 à 17:18, Yoshiaki Tamura a > écrit : > >> 2011/1/20 Pierre Riteau : >>> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: >>> 2011/1/19 Pierre Riteau : > b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return > value of bdrv_write and aborts migration when it fails. However, if the > size of the block device to migrate is not a multiple of BLOCK_SIZE > (currently 1 MB), the last bdrv_write will fail with -EIO. > > Fixed by calling bdrv_write with the correct size of the last block. > --- > block-migration.c | 16 +++- > 1 files changed, 15 insertions(+), 1 deletions(-) > > diff --git a/block-migration.c b/block-migration.c > index 1475325..eeb9c62 100644 > --- a/block-migration.c > +++ b/block-migration.c > @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int > version_id) > int64_t addr; > BlockDriverState *bs; > uint8_t *buf; > + int64_t total_sectors; > + int nr_sectors; > > do { > addr = qemu_get_be64(f); > @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, > int version_id) > return -EINVAL; > } > > + total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; > + if (total_sectors <= 0) { > + fprintf(stderr, "Error getting length of block device > %s\n", device_name); > + return -EINVAL; > + } > + > + if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) { > + nr_sectors = total_sectors - addr; > + } else { > + nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; > + } > + > buf = qemu_malloc(BLOCK_SIZE); > > qemu_get_buffer(f, buf, BLOCK_SIZE); > - ret = bdrv_write(bs, addr, buf, > BDRV_SECTORS_PER_DIRTY_CHUNK); > + ret = bdrv_write(bs, addr, buf, nr_sectors); > > qemu_free(buf); > if (ret < 0) { > -- > 1.7.3.5 > > > Hi Pierre, I don't think the fix above is correct. If you have a file which isn't aliened with BLOCK_SIZE, you won't get an error with the patch. However, the receiver doesn't know how much sectors which the sender wants to be written, so the guest may fail after migration because some data may not be written. IIUC, although changing bytestream should be prevented as much as possible, we should save/load total_sectors to check appropriate file is allocated on the receiver side. >>> >>> Isn't the guest supposed to be started using a file with the correct size? >> >> I personally don't like that; It's insisting too much to the user. >> Can't we expand the image on the fly? We can just abort if expanding >> failed anyway. > > At first I thought your expansion idea was best, but now I think there are > valid scenarios where it fails. > > Imagine both sides are not using a file but a disk partition as storage. If > the partition size is not rounded to 1 MB, the last write will fail with the > current code, and there is no way we can expand the partition. > Right. But in case of partition doesn't the check in the patch below return error? Does bdrv_getlength return the size correctly? total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; if (total_sectors <= 0) { fprintf(stderr, "Error getting length of block device %s\n", device_name); return -EINVAL; } Yoshi >>> But I guess changing the protocol would be best as it would avoid headaches >>> to people who mistakenly created a file that is too small. >> >> We should think carefully before changing the protocol. >> >> Kevin? >> >>> BTW, you should use error_report instead of fprintf(stderr, ...). >>> >>> I didn't know that, I followed what was used in this file. Thank you. >>> >>> -- >>> Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France >>> http://perso.univ-rennes1.fr/pierre.riteau/ >>> >>> >>> > >
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
2011/1/21 Kevin Wolf : > Am 21.01.2011 09:08, schrieb Pierre Riteau: >> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura a >> écrit : >> >>> 2011/1/20 Pierre Riteau : On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: > 2011/1/19 Pierre Riteau : >> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return >> value of bdrv_write and aborts migration when it fails. However, if the >> size of the block device to migrate is not a multiple of BLOCK_SIZE >> (currently 1 MB), the last bdrv_write will fail with -EIO. >> >> Fixed by calling bdrv_write with the correct size of the last block. >> --- >> block-migration.c | 16 +++- >> 1 files changed, 15 insertions(+), 1 deletions(-) >> >> diff --git a/block-migration.c b/block-migration.c >> index 1475325..eeb9c62 100644 >> --- a/block-migration.c >> +++ b/block-migration.c >> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int >> version_id) >> int64_t addr; >> BlockDriverState *bs; >> uint8_t *buf; >> + int64_t total_sectors; >> + int nr_sectors; >> >> do { >> addr = qemu_get_be64(f); >> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, >> int version_id) >> return -EINVAL; >> } >> >> + total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; >> + if (total_sectors <= 0) { >> + fprintf(stderr, "Error getting length of block device >> %s\n", device_name); >> + return -EINVAL; >> + } >> + >> + if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) { >> + nr_sectors = total_sectors - addr; >> + } else { >> + nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; >> + } >> + >> buf = qemu_malloc(BLOCK_SIZE); >> >> qemu_get_buffer(f, buf, BLOCK_SIZE); >> - ret = bdrv_write(bs, addr, buf, >> BDRV_SECTORS_PER_DIRTY_CHUNK); >> + ret = bdrv_write(bs, addr, buf, nr_sectors); >> >> qemu_free(buf); >> if (ret < 0) { >> -- >> 1.7.3.5 >> >> >> > > Hi Pierre, > > I don't think the fix above is correct. If you have a file which > isn't aliened with BLOCK_SIZE, you won't get an error with the > patch. However, the receiver doesn't know how much sectors which > the sender wants to be written, so the guest may fail after > migration because some data may not be written. IIUC, although > changing bytestream should be prevented as much as possible, we > should save/load total_sectors to check appropriate file is > allocated on the receiver side. Isn't the guest supposed to be started using a file with the correct size? >>> >>> I personally don't like that; It's insisting too much to the user. >>> Can't we expand the image on the fly? We can just abort if expanding >>> failed anyway. >> >> At first I thought your expansion idea was best, but now I think there are >> valid scenarios where it fails. >> >> Imagine both sides are not using a file but a disk partition as storage. If >> the partition size is not rounded to 1 MB, the last write will fail with the >> current code, and there is no way we can expand the partition. > > Actually, that you can change the image size is a special case. It only > works on raw with file and sheepdog, and on qcow2 and qed. All other > block drivers can't do it. > But I guess changing the protocol would be best as it would avoid headaches to people who mistakenly created a file that is too small. >>> >>> We should think carefully before changing the protocol. >>> >>> Kevin? > > Can we do it in a compatible way? I agree that it would be nice to catch > this error, but changing the protocol in an incompatible way for it > seems to be too much. No. However, it's not only about catching this error, but improving the usability of block migration. I don't expect to change all at once, I think it would be worthwhile to discuss if we want to improve block migration. Yoshi > Anyway, it's independent of this patch and can be done on top. > > Kevin > >
[Qemu-devel] Re: [PATCH 0/5] -drive/drive_add fixes
Am 17.01.2011 19:31, schrieb Markus Armbruster: > Note: PATCH 3/5 makes -drive reject duplicate definitions instead of > ignoring all but the first silently. If this isn't sufficiently > bug-compatible for you, we need to talk. > > Markus Armbruster (5): > blockdev: Fix error message for invalid -drive CHS > blockdev: Make drive_init() use error_report() > blockdev: Reject multiple definitions for the same drive > blockdev: Fix drive_del not to crash when drive is not in use > blockdev: Fix drive_add for drives without media > > blockdev.c | 88 -- > blockdev.h |2 +- > hw/device-hotplug.c |3 +- > hw/usb-msd.c|3 +- > vl.c|6 +-- > 5 files changed, 47 insertions(+), 55 deletions(-) Thanks, applied all to the block branch. Kevin
[Qemu-devel] Re: [PATCH v3 0/4] usb-msd: Add usb-storage, removable=on|off property
Am 21.01.2011 12:00, schrieb Stefan Hajnoczi: > Allow overriding the SCSI INQUIRY removable (RMB) bit for scsi-disk and > usb-msd > devices. In particular this addresses the problem that some usb-msd devices > have the bit set while other do not have it set. Now the user can choose and > get desired guest behavior. > > qemu -usb > -drive if=none,file=test.img,cache=none,id=disk0 > -device usb-storage,drive=disk0,removable=on > > The default is off. > > v3: > * Document removable property in qdev-device-use.txt > * Use bit number 0 instead of bit 1 for the qdev property > > v2: > * Rewritten to override the bit at the scsi-disk level > > docs/qdev-device-use.txt | 13 +++-- > hw/pci-hotplug.c |2 +- > hw/scsi-bus.c|8 ++-- > hw/scsi-disk.c |3 +++ > hw/scsi.h|3 ++- > hw/usb-msd.c |4 +++- > 6 files changed, 26 insertions(+), 7 deletions(-) Thanks, applied all to the block branch. Kevin
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
Am 21.01.2011 13:15, schrieb Yoshiaki Tamura: > 2011/1/21 Pierre Riteau : >> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura a >> écrit : >> >>> 2011/1/20 Pierre Riteau : On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: > 2011/1/19 Pierre Riteau : >> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return >> value of bdrv_write and aborts migration when it fails. However, if the >> size of the block device to migrate is not a multiple of BLOCK_SIZE >> (currently 1 MB), the last bdrv_write will fail with -EIO. >> >> Fixed by calling bdrv_write with the correct size of the last block. >> --- >> block-migration.c | 16 +++- >> 1 files changed, 15 insertions(+), 1 deletions(-) >> >> diff --git a/block-migration.c b/block-migration.c >> index 1475325..eeb9c62 100644 >> --- a/block-migration.c >> +++ b/block-migration.c >> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int >> version_id) >> int64_t addr; >> BlockDriverState *bs; >> uint8_t *buf; >> +int64_t total_sectors; >> +int nr_sectors; >> >> do { >> addr = qemu_get_be64(f); >> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, >> int version_id) >> return -EINVAL; >> } >> >> +total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; >> +if (total_sectors <= 0) { >> +fprintf(stderr, "Error getting length of block device >> %s\n", device_name); >> +return -EINVAL; >> +} >> + >> +if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) { >> +nr_sectors = total_sectors - addr; >> +} else { >> +nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; >> +} >> + >> buf = qemu_malloc(BLOCK_SIZE); >> >> qemu_get_buffer(f, buf, BLOCK_SIZE); >> -ret = bdrv_write(bs, addr, buf, >> BDRV_SECTORS_PER_DIRTY_CHUNK); >> +ret = bdrv_write(bs, addr, buf, nr_sectors); >> >> qemu_free(buf); >> if (ret < 0) { >> -- >> 1.7.3.5 >> >> >> > > Hi Pierre, > > I don't think the fix above is correct. If you have a file which > isn't aliened with BLOCK_SIZE, you won't get an error with the > patch. However, the receiver doesn't know how much sectors which > the sender wants to be written, so the guest may fail after > migration because some data may not be written. IIUC, although > changing bytestream should be prevented as much as possible, we > should save/load total_sectors to check appropriate file is > allocated on the receiver side. Isn't the guest supposed to be started using a file with the correct size? >>> >>> I personally don't like that; It's insisting too much to the user. >>> Can't we expand the image on the fly? We can just abort if expanding >>> failed anyway. >> >> At first I thought your expansion idea was best, but now I think there are >> valid scenarios where it fails. >> >> Imagine both sides are not using a file but a disk partition as storage. If >> the partition size is not rounded to 1 MB, the last write will fail with the >> current code, and there is no way we can expand the partition. >> > > Right. But in case of partition doesn't the check in the patch below > return error? Does bdrv_getlength return the size correctly? I'm pretty sure that it does. We would have problems in other places if it didn't (e.g. we're checking if I/O requests are within the disk size). Kevin
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
2011/1/21 Kevin Wolf : > Am 21.01.2011 13:15, schrieb Yoshiaki Tamura: >> 2011/1/21 Pierre Riteau : >>> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura a >>> écrit : >>> 2011/1/20 Pierre Riteau : > On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: > >> 2011/1/19 Pierre Riteau : >>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return >>> value of bdrv_write and aborts migration when it fails. However, if the >>> size of the block device to migrate is not a multiple of BLOCK_SIZE >>> (currently 1 MB), the last bdrv_write will fail with -EIO. >>> >>> Fixed by calling bdrv_write with the correct size of the last block. >>> --- >>> block-migration.c | 16 +++- >>> 1 files changed, 15 insertions(+), 1 deletions(-) >>> >>> diff --git a/block-migration.c b/block-migration.c >>> index 1475325..eeb9c62 100644 >>> --- a/block-migration.c >>> +++ b/block-migration.c >>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, >>> int version_id) >>> int64_t addr; >>> BlockDriverState *bs; >>> uint8_t *buf; >>> + int64_t total_sectors; >>> + int nr_sectors; >>> >>> do { >>> addr = qemu_get_be64(f); >>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, >>> int version_id) >>> return -EINVAL; >>> } >>> >>> + total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; >>> + if (total_sectors <= 0) { >>> + fprintf(stderr, "Error getting length of block device >>> %s\n", device_name); >>> + return -EINVAL; >>> + } >>> + >>> + if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) { >>> + nr_sectors = total_sectors - addr; >>> + } else { >>> + nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; >>> + } >>> + >>> buf = qemu_malloc(BLOCK_SIZE); >>> >>> qemu_get_buffer(f, buf, BLOCK_SIZE); >>> - ret = bdrv_write(bs, addr, buf, >>> BDRV_SECTORS_PER_DIRTY_CHUNK); >>> + ret = bdrv_write(bs, addr, buf, nr_sectors); >>> >>> qemu_free(buf); >>> if (ret < 0) { >>> -- >>> 1.7.3.5 >>> >>> >>> >> >> Hi Pierre, >> >> I don't think the fix above is correct. If you have a file which >> isn't aliened with BLOCK_SIZE, you won't get an error with the >> patch. However, the receiver doesn't know how much sectors which >> the sender wants to be written, so the guest may fail after >> migration because some data may not be written. IIUC, although >> changing bytestream should be prevented as much as possible, we >> should save/load total_sectors to check appropriate file is >> allocated on the receiver side. > > Isn't the guest supposed to be started using a file with the correct size? I personally don't like that; It's insisting too much to the user. Can't we expand the image on the fly? We can just abort if expanding failed anyway. >>> >>> At first I thought your expansion idea was best, but now I think there are >>> valid scenarios where it fails. >>> >>> Imagine both sides are not using a file but a disk partition as storage. If >>> the partition size is not rounded to 1 MB, the last write will fail with >>> the current code, and there is no way we can expand the partition. >>> >> >> Right. But in case of partition doesn't the check in the patch below >> return error? Does bdrv_getlength return the size correctly? > > I'm pretty sure that it does. We would have problems in other places if > it didn't (e.g. we're checking if I/O requests are within the disk size). Sorry for the noise. I just learned it's returning the value of lseek in case of raw-posix. Yoshi > > Kevin > >
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote: > 2011/1/21 Kevin Wolf : >> Am 21.01.2011 13:15, schrieb Yoshiaki Tamura: >>> 2011/1/21 Pierre Riteau : Le 20 janv. 2011 à 17:18, Yoshiaki Tamura a écrit : > 2011/1/20 Pierre Riteau : >> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: >> >>> 2011/1/19 Pierre Riteau : b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return value of bdrv_write and aborts migration when it fails. However, if the size of the block device to migrate is not a multiple of BLOCK_SIZE (currently 1 MB), the last bdrv_write will fail with -EIO. Fixed by calling bdrv_write with the correct size of the last block. --- block-migration.c | 16 +++- 1 files changed, 15 insertions(+), 1 deletions(-) diff --git a/block-migration.c b/block-migration.c index 1475325..eeb9c62 100644 --- a/block-migration.c +++ b/block-migration.c @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id) int64_t addr; BlockDriverState *bs; uint8_t *buf; +int64_t total_sectors; +int nr_sectors; do { addr = qemu_get_be64(f); @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id) return -EINVAL; } +total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; +if (total_sectors <= 0) { +fprintf(stderr, "Error getting length of block device %s\n", device_name); +return -EINVAL; +} + +if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) { +nr_sectors = total_sectors - addr; +} else { +nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; +} + buf = qemu_malloc(BLOCK_SIZE); qemu_get_buffer(f, buf, BLOCK_SIZE); -ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK); +ret = bdrv_write(bs, addr, buf, nr_sectors); qemu_free(buf); if (ret < 0) { -- 1.7.3.5 >>> >>> Hi Pierre, >>> >>> I don't think the fix above is correct. If you have a file which >>> isn't aliened with BLOCK_SIZE, you won't get an error with the >>> patch. However, the receiver doesn't know how much sectors which >>> the sender wants to be written, so the guest may fail after >>> migration because some data may not be written. IIUC, although >>> changing bytestream should be prevented as much as possible, we >>> should save/load total_sectors to check appropriate file is >>> allocated on the receiver side. >> >> Isn't the guest supposed to be started using a file with the correct >> size? > > I personally don't like that; It's insisting too much to the user. > Can't we expand the image on the fly? We can just abort if expanding > failed anyway. At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails. Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition. >>> >>> Right. But in case of partition doesn't the check in the patch below >>> return error? Does bdrv_getlength return the size correctly? >> >> I'm pretty sure that it does. We would have problems in other places if >> it didn't (e.g. we're checking if I/O requests are within the disk size). > > Sorry for the noise. I just learned it's returning the value of lseek > in case of raw-posix. And it does a ioctl call on other platforms than Linux. -- Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France http://perso.univ-rennes1.fr/pierre.riteau/
Re: [Spice-devel] [Qemu-devel] usb redirection status report
Hi, On 01/20/2011 09:11 PM, David Mansfield wrote: H i, On 01/20/2011 12:27 PM, Christoph Hellwig wrote: On Wed, Jan 19, 2011 at 07:15:47PM +0100, Hans de Goede wrote: Hi All, As most of you know I'm working on usb redirection (making client usb devices accessible in guests over the network). I thought it would be a good idea to write a short status report. So fat the following has been done: * written and posted a network protocol for usb redir. over the network, and send this to the list. As another late-comer, hate to ask a possibly old question but will this added protocol allow usb devices from arbitrary "remote" hosts to be redirected or only from the host running the spice client? Yes, the protocol is transport independent and there will be a standalone usb-host implementation (an tcp/ip server daemon running a machine wishing to "share" a usb device). I recently tried to use the native qemu usb functionality to pass-through a webcam and fell flat on my face due to what others described as "timing issues", and was wondering if it would be possible to pass a USB device plugged in to the vm-host to the vm-guest using your new functionality, even though a third (and unrelated) host is running the spice client. For vm-host to vm-guest pass through it is best to use the existing pass through support. I'm aware of the timing issues there, and I started with fixing those before even thinking about doing usb redirection over the network, see: http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg02527.html For the first patch of my patch set fixing the timing issues for local usb redirection. Tested with multiple webcams (usb iso mode input devices) and a set of usb speakers (usb iso mode output device). Unfortunately there has been little response to this patch set, so I've no idea when it will get merged. (this seems to be the story with a lot of qemu patch sets here on the list, like Gerd's usb descriptor rehandling patches, maybe we need to review how qemu's patch merging processes work?) Regards, Hans
Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
On Fri, 2011-01-21 at 11:55 +0200, Michael S. Tsirkin wrote: > On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote: > > On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote: > > > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote: > > > > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote: > > > > > > > >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote: > > > >> > > > >>> When MSI is off, each interrupt needs to be bounced through the io > > > >>> thread when it's set/cleared, so vhost-net causes more context > > > >>> switches and > > > >>> higher CPU utilization than userspace virtio which handles networking > > > >>> in > > > >>> the same thread. > > > >>> > > > >>> We'll need to fix this by adding level irq support in kvm irqfd, > > > >>> for now disable vhost-net in these configurations. > > > >>> > > > >>> Signed-off-by: Michael S. Tsirkin > > > >>> > > > >> I actually think this should be a terminal error. The user asks for > > > >> vhost-net, if we cannot enable it, we should exit. > > > >> > > > >> Or we should warn the user that they should expect bad performance. > > > >> Silently doing something that the user has explicitly asked us not > > > >> to do is not a good behavior. > > > >> > > > >> Regards, > > > >> > > > >> Anthony Liguori > > > >> > > > > The issue is that user has no control of the guest, and can not know > > > > whether the guest enables MSI. So what you ask for will just make > > > > some guests fail, and others fail sometimes. > > > > The user also has no way to know that version X of kvm does not expose a > > > > way to inject level interrupts with irqfd. > > > > > > > > We could have *another* flag that says "use vhost where it helps" but > > > > then I think this is what everyone wants to do, anyway, and libvirt > > > > already sets vhost=on so I prefer redefining the meaning of an existing > > > > flag. > > > > > > > > > > In the very least, there needs to be a vhost=force. > > > > > > Having some sort of friendly default policy is fine but we need to > > > provide a mechanism for a user to have the final say. If you want to > > > redefine vhost=on to really mean, use the friendly default, that's fine > > > by me, but only if the vhost=force option exists. > > > > > > I actually would think libvirt would want to use vhost=force. Debugging > > > with vhost=on is going to be a royal pain in the ass if a user reports > > > bad performance. Given the libvirt XML, you can't actually tell from > > > the guest and the XML whether or not vhost was actually in use or not. > > > > If we add a force option, let's please distinguish hotplug from VM > > creation time. The latter can abort. Hotplug should print an error and > > fail the initfn. > > It can't abort at init - MSI is disabled at init, it needs to be enabled > by the guest later. And aborting the guest in the middle of the run > is a very bad idea. Yeah, I was thinking about the ordering of device being added vs guest enabling MSI this morning. Waiting until the guest decides to try to start using the device to NAK it with an abort is very undesirable. What if when we have vhost=on,force, the device doesn't advertise an INTx (PCI_INTERRUPT_PIN = 0)? Alex
Re: [Spice-devel] [Qemu-devel] usb redirection status report
On 21.01.2011, at 14:23, Hans de Goede wrote: > Hi, > > On 01/20/2011 09:11 PM, David Mansfield wrote: >>> H >>> i, >>> >>> On 01/20/2011 12:27 PM, Christoph Hellwig wrote: On Wed, Jan 19, 2011 at 07:15:47PM +0100, Hans de Goede wrote: > Hi All, > > As most of you know I'm working on usb redirection (making client usb > devices > accessible in guests over the network). > > I thought it would be a good idea to write a short status report. > > So fat the following has been done: > > * written and posted a network protocol for usb redir. over the network, > and send this to the list. >> >> As another late-comer, hate to ask a possibly old question but will this >> added protocol allow usb devices from arbitrary "remote" hosts to be >> redirected or only from the host running the spice client? >> > > Yes, the protocol is transport independent and there will be a standalone > usb-host implementation (an tcp/ip server daemon running a machine wishing > to "share" a usb device). > >> I recently tried to use the native qemu usb functionality to >> pass-through a webcam and fell flat on my face due to what others >> described as "timing issues", and was wondering if it would be possible >> to pass a USB device plugged in to the vm-host to the vm-guest using >> your new functionality, even though a third (and unrelated) host is >> running the spice client. > > For vm-host to vm-guest pass through it is best to use the existing > pass through support. I'm aware of the timing issues there, and I started > with fixing those before even thinking about doing usb redirection over > the network, see: > http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg02527.html > > For the first patch of my patch set fixing the timing issues for local > usb redirection. Tested with multiple webcams (usb iso mode input devices) > and a set of usb speakers (usb iso mode output device). > > Unfortunately there has been little response to this patch set, so I've > no idea when it will get merged. > > (this seems to be the story with a lot of qemu patch sets here on the list, > like Gerd's usb descriptor rehandling patches, maybe we need to review how > qemu's patch merging processes work?) IIUC the main problem is that we have no active usb maintainer and Anthony wants people to volunteer themselves instead of just assigning them to be one :). Alex
RE: [Qemu-devel] [PATCH] target-arm: Fix loading of scalar value forNeon multiply-by-scalar
> -Original Message- > From: qemu-devel-bounces+wschi=dolby@nongnu.org > [mailto:qemu-devel-bounces+wschi=dolby@nongnu.org] On > Behalf Of Peter Maydell > Sent: Wednesday, January 19, 2011 11:30 AM > To: qemu-devel@nongnu.org > Cc: Christophe Lyon; patc...@linaro.org > Subject: [Qemu-devel] [PATCH] target-arm: Fix loading of > scalar value forNeon multiply-by-scalar > > Fix the register and part of register we get the scalar from > in the various "multiply vector by scalar" ops (VMUL by > scalar and friends). FWIW, on the two test cases that I have, this patch (together with Christophe's) does not improve behaviour (see https://bugs.launchpad.net/bugs/702885). Regards, - Wolfgang
Re: [Spice-devel] [Qemu-devel] usb redirection status report
Hi, Unfortunately there has been little response to this patch set, so I've no idea when it will get merged. (this seems to be the story with a lot of qemu patch sets here on the list, like Gerd's usb descriptor rehandling patches, maybe we need to review how qemu's patch merging processes work?) IIUC the main problem is that we have no active usb maintainer and Anthony wants people to volunteer themselves instead of just assigning them to be one :). http://cgit.freedesktop.org/spice/qemu/commit/?h=usb.4&id=5d0d62feee8aa75525207ef24919c0522651a432 cheers, Gerd
Re: [Spice-devel] [Qemu-devel] usb redirection status report
On 21.01.2011, at 14:33, Gerd Hoffmann wrote: > Hi, > >>> Unfortunately there has been little response to this patch set, so >>> I've no idea when it will get merged. >>> >>> (this seems to be the story with a lot of qemu patch sets here on >>> the list, like Gerd's usb descriptor rehandling patches, maybe we >>> need to review how qemu's patch merging processes work?) >> >> IIUC the main problem is that we have no active usb maintainer and >> Anthony wants people to volunteer themselves instead of just >> assigning them to be one :). > > http://cgit.freedesktop.org/spice/qemu/commit/?h=usb.4&id=5d0d62feee8aa75525207ef24919c0522651a432 Thanks for stepping up :). Since you volunteered, I guess it's your role to review those patches then, despite that maintainers commit not being upstream yet. Alex
Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
On Fri, Jan 21, 2011 at 06:19:13AM -0700, Alex Williamson wrote: > On Fri, 2011-01-21 at 11:55 +0200, Michael S. Tsirkin wrote: > > On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote: > > > On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote: > > > > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote: > > > > > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote: > > > > > > > > > >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote: > > > > >> > > > > >>> When MSI is off, each interrupt needs to be bounced through the io > > > > >>> thread when it's set/cleared, so vhost-net causes more context > > > > >>> switches and > > > > >>> higher CPU utilization than userspace virtio which handles > > > > >>> networking in > > > > >>> the same thread. > > > > >>> > > > > >>> We'll need to fix this by adding level irq support in kvm irqfd, > > > > >>> for now disable vhost-net in these configurations. > > > > >>> > > > > >>> Signed-off-by: Michael S. Tsirkin > > > > >>> > > > > >> I actually think this should be a terminal error. The user asks for > > > > >> vhost-net, if we cannot enable it, we should exit. > > > > >> > > > > >> Or we should warn the user that they should expect bad performance. > > > > >> Silently doing something that the user has explicitly asked us not > > > > >> to do is not a good behavior. > > > > >> > > > > >> Regards, > > > > >> > > > > >> Anthony Liguori > > > > >> > > > > > The issue is that user has no control of the guest, and can not know > > > > > whether the guest enables MSI. So what you ask for will just make > > > > > some guests fail, and others fail sometimes. > > > > > The user also has no way to know that version X of kvm does not > > > > > expose a > > > > > way to inject level interrupts with irqfd. > > > > > > > > > > We could have *another* flag that says "use vhost where it helps" but > > > > > then I think this is what everyone wants to do, anyway, and libvirt > > > > > already sets vhost=on so I prefer redefining the meaning of an > > > > > existing > > > > > flag. > > > > > > > > > > > > > In the very least, there needs to be a vhost=force. > > > > > > > > Having some sort of friendly default policy is fine but we need to > > > > provide a mechanism for a user to have the final say. If you want to > > > > redefine vhost=on to really mean, use the friendly default, that's fine > > > > by me, but only if the vhost=force option exists. > > > > > > > > I actually would think libvirt would want to use vhost=force. > > > > Debugging > > > > with vhost=on is going to be a royal pain in the ass if a user reports > > > > bad performance. Given the libvirt XML, you can't actually tell from > > > > the guest and the XML whether or not vhost was actually in use or not. > > > > > > If we add a force option, let's please distinguish hotplug from VM > > > creation time. The latter can abort. Hotplug should print an error and > > > fail the initfn. > > > > It can't abort at init - MSI is disabled at init, it needs to be enabled > > by the guest later. And aborting the guest in the middle of the run > > is a very bad idea. > > Yeah, I was thinking about the ordering of device being added vs guest > enabling MSI this morning. Waiting until the guest decides to try to > start using the device to NAK it with an abort is very undesirable. > What if when we have vhost=on,force, the device doesn't advertise an > INTx (PCI_INTERRUPT_PIN = 0)? > > Alex Then we break backward compatibility with old guests. I don't see what the issue is really: It is trivial to check that the guest uses MSIX. -- MST
Re: [Qemu-devel] [PATCH] target-arm: Fix loading of scalar value forNeon multiply-by-scalar
On 21 January 2011 13:32, Schildbach, Wolfgang wrote: > FWIW, on the two test cases that I have, this patch (together with > Christophe's) does not improve behaviour (see > https://bugs.launchpad.net/bugs/702885). Hrm. Can you attached the compiled ARM binaries to that bug report, to save me the effort of digging out an armcc, please? thanks -- PMM
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
2011/1/21 Pierre Riteau : > On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote: > >> 2011/1/21 Kevin Wolf : >>> Am 21.01.2011 13:15, schrieb Yoshiaki Tamura: 2011/1/21 Pierre Riteau : > Le 20 janv. 2011 à 17:18, Yoshiaki Tamura > a écrit : > >> 2011/1/20 Pierre Riteau : >>> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: >>> 2011/1/19 Pierre Riteau : > b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return > value of bdrv_write and aborts migration when it fails. However, if > the > size of the block device to migrate is not a multiple of BLOCK_SIZE > (currently 1 MB), the last bdrv_write will fail with -EIO. > > Fixed by calling bdrv_write with the correct size of the last block. > --- > block-migration.c | 16 +++- > 1 files changed, 15 insertions(+), 1 deletions(-) > > diff --git a/block-migration.c b/block-migration.c > index 1475325..eeb9c62 100644 > --- a/block-migration.c > +++ b/block-migration.c > @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, > int version_id) > int64_t addr; > BlockDriverState *bs; > uint8_t *buf; > + int64_t total_sectors; > + int nr_sectors; > > do { > addr = qemu_get_be64(f); > @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void > *opaque, int version_id) > return -EINVAL; > } > > + total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; > + if (total_sectors <= 0) { > + fprintf(stderr, "Error getting length of block > device %s\n", device_name); > + return -EINVAL; > + } > + > + if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) > { > + nr_sectors = total_sectors - addr; > + } else { > + nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; > + } > + > buf = qemu_malloc(BLOCK_SIZE); > > qemu_get_buffer(f, buf, BLOCK_SIZE); > - ret = bdrv_write(bs, addr, buf, > BDRV_SECTORS_PER_DIRTY_CHUNK); > + ret = bdrv_write(bs, addr, buf, nr_sectors); > > qemu_free(buf); > if (ret < 0) { > -- > 1.7.3.5 > > > Hi Pierre, I don't think the fix above is correct. If you have a file which isn't aliened with BLOCK_SIZE, you won't get an error with the patch. However, the receiver doesn't know how much sectors which the sender wants to be written, so the guest may fail after migration because some data may not be written. IIUC, although changing bytestream should be prevented as much as possible, we should save/load total_sectors to check appropriate file is allocated on the receiver side. >>> >>> Isn't the guest supposed to be started using a file with the correct >>> size? >> >> I personally don't like that; It's insisting too much to the user. >> Can't we expand the image on the fly? We can just abort if expanding >> failed anyway. > > At first I thought your expansion idea was best, but now I think there > are valid scenarios where it fails. > > Imagine both sides are not using a file but a disk partition as storage. > If the partition size is not rounded to 1 MB, the last write will fail > with the current code, and there is no way we can expand the partition. > Right. But in case of partition doesn't the check in the patch below return error? Does bdrv_getlength return the size correctly? >>> >>> I'm pretty sure that it does. We would have problems in other places if >>> it didn't (e.g. we're checking if I/O requests are within the disk size). >> >> Sorry for the noise. I just learned it's returning the value of lseek >> in case of raw-posix. > > > And it does a ioctl call on other platforms than Linux. Thanks. Just a quick question regarding total_sectors. BlockDriverState seems to contain total_sectors. Can we avoid calling bdrv_getlength() if bs->total_sectors were already there? Yoshi > > -- > Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France > http://perso.univ-rennes1.fr/pierre.riteau/ > > >
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
Am 21.01.2011 14:59, schrieb Yoshiaki Tamura: > 2011/1/21 Pierre Riteau : >> On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote: >> >>> 2011/1/21 Kevin Wolf : Am 21.01.2011 13:15, schrieb Yoshiaki Tamura: > 2011/1/21 Pierre Riteau : >> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura >> a écrit : >> >>> 2011/1/20 Pierre Riteau : On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: > 2011/1/19 Pierre Riteau : >> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return >> value of bdrv_write and aborts migration when it fails. However, if >> the >> size of the block device to migrate is not a multiple of BLOCK_SIZE >> (currently 1 MB), the last bdrv_write will fail with -EIO. >> >> Fixed by calling bdrv_write with the correct size of the last block. >> --- >> block-migration.c | 16 +++- >> 1 files changed, 15 insertions(+), 1 deletions(-) >> >> diff --git a/block-migration.c b/block-migration.c >> index 1475325..eeb9c62 100644 >> --- a/block-migration.c >> +++ b/block-migration.c >> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, >> int version_id) >> int64_t addr; >> BlockDriverState *bs; >> uint8_t *buf; >> +int64_t total_sectors; >> +int nr_sectors; >> >> do { >> addr = qemu_get_be64(f); >> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void >> *opaque, int version_id) >> return -EINVAL; >> } >> >> +total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; >> +if (total_sectors <= 0) { >> +fprintf(stderr, "Error getting length of block >> device %s\n", device_name); >> +return -EINVAL; >> +} >> + >> +if (total_sectors - addr < >> BDRV_SECTORS_PER_DIRTY_CHUNK) { >> +nr_sectors = total_sectors - addr; >> +} else { >> +nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; >> +} >> + >> buf = qemu_malloc(BLOCK_SIZE); >> >> qemu_get_buffer(f, buf, BLOCK_SIZE); >> -ret = bdrv_write(bs, addr, buf, >> BDRV_SECTORS_PER_DIRTY_CHUNK); >> +ret = bdrv_write(bs, addr, buf, nr_sectors); >> >> qemu_free(buf); >> if (ret < 0) { >> -- >> 1.7.3.5 >> >> >> > > Hi Pierre, > > I don't think the fix above is correct. If you have a file which > isn't aliened with BLOCK_SIZE, you won't get an error with the > patch. However, the receiver doesn't know how much sectors which > the sender wants to be written, so the guest may fail after > migration because some data may not be written. IIUC, although > changing bytestream should be prevented as much as possible, we > should save/load total_sectors to check appropriate file is > allocated on the receiver side. Isn't the guest supposed to be started using a file with the correct size? >>> >>> I personally don't like that; It's insisting too much to the user. >>> Can't we expand the image on the fly? We can just abort if expanding >>> failed anyway. >> >> At first I thought your expansion idea was best, but now I think there >> are valid scenarios where it fails. >> >> Imagine both sides are not using a file but a disk partition as storage. >> If the partition size is not rounded to 1 MB, the last write will fail >> with the current code, and there is no way we can expand the partition. >> > > Right. But in case of partition doesn't the check in the patch below > return error? Does bdrv_getlength return the size correctly? I'm pretty sure that it does. We would have problems in other places if it didn't (e.g. we're checking if I/O requests are within the disk size). >>> >>> Sorry for the noise. I just learned it's returning the value of lseek >>> in case of raw-posix. >> >> >> And it does a ioctl call on other platforms than Linux. > > Thanks. Just a quick question regarding total_sectors. > BlockDriverState seems to contain total_sectors. Can we avoid > calling bdrv_getlength() if bs->total_sectors were already there? I'd need to check the details, but I think it may not be correct with growable files. Kevin
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
On 21 janv. 2011, at 14:59, Yoshiaki Tamura wrote: > 2011/1/21 Pierre Riteau : >> On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote: >> >>> 2011/1/21 Kevin Wolf : Am 21.01.2011 13:15, schrieb Yoshiaki Tamura: > 2011/1/21 Pierre Riteau : >> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura >> a écrit : >> >>> 2011/1/20 Pierre Riteau : On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: > 2011/1/19 Pierre Riteau : >> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return >> value of bdrv_write and aborts migration when it fails. However, if >> the >> size of the block device to migrate is not a multiple of BLOCK_SIZE >> (currently 1 MB), the last bdrv_write will fail with -EIO. >> >> Fixed by calling bdrv_write with the correct size of the last block. >> --- >> block-migration.c | 16 +++- >> 1 files changed, 15 insertions(+), 1 deletions(-) >> >> diff --git a/block-migration.c b/block-migration.c >> index 1475325..eeb9c62 100644 >> --- a/block-migration.c >> +++ b/block-migration.c >> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, >> int version_id) >> int64_t addr; >> BlockDriverState *bs; >> uint8_t *buf; >> +int64_t total_sectors; >> +int nr_sectors; >> >> do { >> addr = qemu_get_be64(f); >> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void >> *opaque, int version_id) >> return -EINVAL; >> } >> >> +total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; >> +if (total_sectors <= 0) { >> +fprintf(stderr, "Error getting length of block >> device %s\n", device_name); >> +return -EINVAL; >> +} >> + >> +if (total_sectors - addr < >> BDRV_SECTORS_PER_DIRTY_CHUNK) { >> +nr_sectors = total_sectors - addr; >> +} else { >> +nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; >> +} >> + >> buf = qemu_malloc(BLOCK_SIZE); >> >> qemu_get_buffer(f, buf, BLOCK_SIZE); >> -ret = bdrv_write(bs, addr, buf, >> BDRV_SECTORS_PER_DIRTY_CHUNK); >> +ret = bdrv_write(bs, addr, buf, nr_sectors); >> >> qemu_free(buf); >> if (ret < 0) { >> -- >> 1.7.3.5 >> >> >> > > Hi Pierre, > > I don't think the fix above is correct. If you have a file which > isn't aliened with BLOCK_SIZE, you won't get an error with the > patch. However, the receiver doesn't know how much sectors which > the sender wants to be written, so the guest may fail after > migration because some data may not be written. IIUC, although > changing bytestream should be prevented as much as possible, we > should save/load total_sectors to check appropriate file is > allocated on the receiver side. Isn't the guest supposed to be started using a file with the correct size? >>> >>> I personally don't like that; It's insisting too much to the user. >>> Can't we expand the image on the fly? We can just abort if expanding >>> failed anyway. >> >> At first I thought your expansion idea was best, but now I think there >> are valid scenarios where it fails. >> >> Imagine both sides are not using a file but a disk partition as storage. >> If the partition size is not rounded to 1 MB, the last write will fail >> with the current code, and there is no way we can expand the partition. >> > > Right. But in case of partition doesn't the check in the patch below > return error? Does bdrv_getlength return the size correctly? I'm pretty sure that it does. We would have problems in other places if it didn't (e.g. we're checking if I/O requests are within the disk size). >>> >>> Sorry for the noise. I just learned it's returning the value of lseek >>> in case of raw-posix. >> >> >> And it does a ioctl call on other platforms than Linux. > > Thanks. Just a quick question regarding total_sectors. > BlockDriverState seems to contain total_sectors. Can we avoid > calling bdrv_getlength() if bs->total_sectors were already there? From a comment in bdrv_getlength(): Fixed size devices use the total_sectors value for speed instead of issuing a length query (like lseek) on each call. Also, legacy block drivers don't provide a bdrv_getlength function and mu
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
2011/1/21 Kevin Wolf : > Am 21.01.2011 14:59, schrieb Yoshiaki Tamura: >> 2011/1/21 Pierre Riteau : >>> On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote: >>> 2011/1/21 Kevin Wolf : > Am 21.01.2011 13:15, schrieb Yoshiaki Tamura: >> 2011/1/21 Pierre Riteau : >>> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura >>> a écrit : >>> 2011/1/20 Pierre Riteau : > On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: > >> 2011/1/19 Pierre Riteau : >>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return >>> value of bdrv_write and aborts migration when it fails. However, if >>> the >>> size of the block device to migrate is not a multiple of BLOCK_SIZE >>> (currently 1 MB), the last bdrv_write will fail with -EIO. >>> >>> Fixed by calling bdrv_write with the correct size of the last block. >>> --- >>> block-migration.c | 16 +++- >>> 1 files changed, 15 insertions(+), 1 deletions(-) >>> >>> diff --git a/block-migration.c b/block-migration.c >>> index 1475325..eeb9c62 100644 >>> --- a/block-migration.c >>> +++ b/block-migration.c >>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void >>> *opaque, int version_id) >>> int64_t addr; >>> BlockDriverState *bs; >>> uint8_t *buf; >>> + int64_t total_sectors; >>> + int nr_sectors; >>> >>> do { >>> addr = qemu_get_be64(f); >>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void >>> *opaque, int version_id) >>> return -EINVAL; >>> } >>> >>> + total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; >>> + if (total_sectors <= 0) { >>> + fprintf(stderr, "Error getting length of block >>> device %s\n", device_name); >>> + return -EINVAL; >>> + } >>> + >>> + if (total_sectors - addr < >>> BDRV_SECTORS_PER_DIRTY_CHUNK) { >>> + nr_sectors = total_sectors - addr; >>> + } else { >>> + nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; >>> + } >>> + >>> buf = qemu_malloc(BLOCK_SIZE); >>> >>> qemu_get_buffer(f, buf, BLOCK_SIZE); >>> - ret = bdrv_write(bs, addr, buf, >>> BDRV_SECTORS_PER_DIRTY_CHUNK); >>> + ret = bdrv_write(bs, addr, buf, nr_sectors); >>> >>> qemu_free(buf); >>> if (ret < 0) { >>> -- >>> 1.7.3.5 >>> >>> >>> >> >> Hi Pierre, >> >> I don't think the fix above is correct. If you have a file which >> isn't aliened with BLOCK_SIZE, you won't get an error with the >> patch. However, the receiver doesn't know how much sectors which >> the sender wants to be written, so the guest may fail after >> migration because some data may not be written. IIUC, although >> changing bytestream should be prevented as much as possible, we >> should save/load total_sectors to check appropriate file is >> allocated on the receiver side. > > Isn't the guest supposed to be started using a file with the correct > size? I personally don't like that; It's insisting too much to the user. Can't we expand the image on the fly? We can just abort if expanding failed anyway. >>> >>> At first I thought your expansion idea was best, but now I think there >>> are valid scenarios where it fails. >>> >>> Imagine both sides are not using a file but a disk partition as >>> storage. If the partition size is not rounded to 1 MB, the last write >>> will fail with the current code, and there is no way we can expand the >>> partition. >>> >> >> Right. But in case of partition doesn't the check in the patch below >> return error? Does bdrv_getlength return the size correctly? > > I'm pretty sure that it does. We would have problems in other places if > it didn't (e.g. we're checking if I/O requests are within the disk size). Sorry for the noise. I just learned it's returning the value of lseek in case of raw-posix. >>> >>> >>> And it does a ioctl call on other platforms than Linux. >> >> Thanks. Just a quick question regarding total_sectors. >> BlockDriverState seems to contain total_sectors. Can we avoid >> calling bdrv_getlength() if bs->total_sectors were already there? > > I'd need to check the details, but I think it may not be correct with > growable files. Does growable flag me
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
2011/1/21 Pierre Riteau : > On 21 janv. 2011, at 14:59, Yoshiaki Tamura wrote: > >> 2011/1/21 Pierre Riteau : >>> On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote: >>> 2011/1/21 Kevin Wolf : > Am 21.01.2011 13:15, schrieb Yoshiaki Tamura: >> 2011/1/21 Pierre Riteau : >>> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura >>> a écrit : >>> 2011/1/20 Pierre Riteau : > On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: > >> 2011/1/19 Pierre Riteau : >>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return >>> value of bdrv_write and aborts migration when it fails. However, if >>> the >>> size of the block device to migrate is not a multiple of BLOCK_SIZE >>> (currently 1 MB), the last bdrv_write will fail with -EIO. >>> >>> Fixed by calling bdrv_write with the correct size of the last block. >>> --- >>> block-migration.c | 16 +++- >>> 1 files changed, 15 insertions(+), 1 deletions(-) >>> >>> diff --git a/block-migration.c b/block-migration.c >>> index 1475325..eeb9c62 100644 >>> --- a/block-migration.c >>> +++ b/block-migration.c >>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void >>> *opaque, int version_id) >>> int64_t addr; >>> BlockDriverState *bs; >>> uint8_t *buf; >>> + int64_t total_sectors; >>> + int nr_sectors; >>> >>> do { >>> addr = qemu_get_be64(f); >>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void >>> *opaque, int version_id) >>> return -EINVAL; >>> } >>> >>> + total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; >>> + if (total_sectors <= 0) { >>> + fprintf(stderr, "Error getting length of block >>> device %s\n", device_name); >>> + return -EINVAL; >>> + } >>> + >>> + if (total_sectors - addr < >>> BDRV_SECTORS_PER_DIRTY_CHUNK) { >>> + nr_sectors = total_sectors - addr; >>> + } else { >>> + nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; >>> + } >>> + >>> buf = qemu_malloc(BLOCK_SIZE); >>> >>> qemu_get_buffer(f, buf, BLOCK_SIZE); >>> - ret = bdrv_write(bs, addr, buf, >>> BDRV_SECTORS_PER_DIRTY_CHUNK); >>> + ret = bdrv_write(bs, addr, buf, nr_sectors); >>> >>> qemu_free(buf); >>> if (ret < 0) { >>> -- >>> 1.7.3.5 >>> >>> >>> >> >> Hi Pierre, >> >> I don't think the fix above is correct. If you have a file which >> isn't aliened with BLOCK_SIZE, you won't get an error with the >> patch. However, the receiver doesn't know how much sectors which >> the sender wants to be written, so the guest may fail after >> migration because some data may not be written. IIUC, although >> changing bytestream should be prevented as much as possible, we >> should save/load total_sectors to check appropriate file is >> allocated on the receiver side. > > Isn't the guest supposed to be started using a file with the correct > size? I personally don't like that; It's insisting too much to the user. Can't we expand the image on the fly? We can just abort if expanding failed anyway. >>> >>> At first I thought your expansion idea was best, but now I think there >>> are valid scenarios where it fails. >>> >>> Imagine both sides are not using a file but a disk partition as >>> storage. If the partition size is not rounded to 1 MB, the last write >>> will fail with the current code, and there is no way we can expand the >>> partition. >>> >> >> Right. But in case of partition doesn't the check in the patch below >> return error? Does bdrv_getlength return the size correctly? > > I'm pretty sure that it does. We would have problems in other places if > it didn't (e.g. we're checking if I/O requests are within the disk size). Sorry for the noise. I just learned it's returning the value of lseek in case of raw-posix. >>> >>> >>> And it does a ioctl call on other platforms than Linux. >> >> Thanks. Just a quick question regarding total_sectors. >> BlockDriverState seems to contain total_sectors. Can we avoid >> calling bdrv_getlength() if bs->total_sectors were already there? > > From a comment in bdrv_getlength(): > > Fixed size devices use the total_sectors value for speed ins
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
On 21 janv. 2011, at 15:21, Yoshiaki Tamura wrote: > 2011/1/21 Pierre Riteau : >> On 21 janv. 2011, at 14:59, Yoshiaki Tamura wrote: >> >>> 2011/1/21 Pierre Riteau : On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote: > 2011/1/21 Kevin Wolf : >> Am 21.01.2011 13:15, schrieb Yoshiaki Tamura: >>> 2011/1/21 Pierre Riteau : Le 20 janv. 2011 à 17:18, Yoshiaki Tamura a écrit : > 2011/1/20 Pierre Riteau : >> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: >> >>> 2011/1/19 Pierre Riteau : b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return value of bdrv_write and aborts migration when it fails. However, if the size of the block device to migrate is not a multiple of BLOCK_SIZE (currently 1 MB), the last bdrv_write will fail with -EIO. Fixed by calling bdrv_write with the correct size of the last block. --- block-migration.c | 16 +++- 1 files changed, 15 insertions(+), 1 deletions(-) diff --git a/block-migration.c b/block-migration.c index 1475325..eeb9c62 100644 --- a/block-migration.c +++ b/block-migration.c @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id) int64_t addr; BlockDriverState *bs; uint8_t *buf; +int64_t total_sectors; +int nr_sectors; do { addr = qemu_get_be64(f); @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id) return -EINVAL; } +total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS; +if (total_sectors <= 0) { +fprintf(stderr, "Error getting length of block device %s\n", device_name); +return -EINVAL; +} + +if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) { +nr_sectors = total_sectors - addr; +} else { +nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; +} + buf = qemu_malloc(BLOCK_SIZE); qemu_get_buffer(f, buf, BLOCK_SIZE); -ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK); +ret = bdrv_write(bs, addr, buf, nr_sectors); qemu_free(buf); if (ret < 0) { -- 1.7.3.5 >>> >>> Hi Pierre, >>> >>> I don't think the fix above is correct. If you have a file which >>> isn't aliened with BLOCK_SIZE, you won't get an error with the >>> patch. However, the receiver doesn't know how much sectors which >>> the sender wants to be written, so the guest may fail after >>> migration because some data may not be written. IIUC, although >>> changing bytestream should be prevented as much as possible, we >>> should save/load total_sectors to check appropriate file is >>> allocated on the receiver side. >> >> Isn't the guest supposed to be started using a file with the correct >> size? > > I personally don't like that; It's insisting too much to the user. > Can't we expand the image on the fly? We can just abort if expanding > failed anyway. At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails. Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition. >>> >>> Right. But in case of partition doesn't the check in the patch below >>> return error? Does bdrv_getlength return the size correctly? >> >> I'm pretty sure that it does. We would have problems in other places if >> it didn't (e.g. we're checking if I/O requests are within the disk size). > > Sorry for the noise. I just learned it's returning the value of lseek > in case of raw-posix. And it does a ioctl call on other platforms than Linux. >>> >>> Thanks. Just a quick question regarding total_sectors. >>>
[Qemu-devel] Re: [PATCH] pci/pcie: make pci_find_device() ARI aware.
On Fri, Jan 21, 2011 at 07:44:16PM +0900, Isaku Yamahata wrote: > On Thu, Jan 20, 2011 at 04:15:48PM +0200, Michael S. Tsirkin wrote: > > On Thu, Jan 20, 2011 at 03:57:39PM +0900, Isaku Yamahata wrote: > > > make pci_find_device() ARI aware. > > > > > > Signed-off-by: Isaku Yamahata > > > --- > > > hw/pci.c |7 +++ > > > 1 files changed, 7 insertions(+), 0 deletions(-) > > > > > > diff --git a/hw/pci.c b/hw/pci.c > > > index 8d0e3df..851f350 100644 > > > --- a/hw/pci.c > > > +++ b/hw/pci.c > > > @@ -1596,11 +1596,18 @@ PCIBus *pci_find_bus(PCIBus *bus, int bus_num) > > > > > > PCIDevice *pci_find_device(PCIBus *bus, int bus_num, int slot, int > > > function) > > > { > > > +PCIDevice *d; > > > bus = pci_find_bus(bus, bus_num); > > > > > > if (!bus) > > > return NULL; > > > > > > +d = bus->parent_dev; > > > +if (d && pci_is_express(d) && > > > +pcie_cap_get_type(d) == PCI_EXP_TYPE_DOWNSTREAM && > > > +!pcie_cap_is_ari_enabled(d) && slot > 0) { > > > +return NULL; > > > +} > > > return bus->devices[PCI_DEVFN(slot, function)]; > > > } > > > > I'd like to split this express-specific code out in some way. > > Further, the downstream port always has a single slot. > > Maybe it shouldn't be a bus at all, but this needs some thought. > > Yes, it needs consideration. > > > > How about we put flag in PCIBus that says that a single > > slot is supported? Downstream ports would just set it. > > So such a flag must be set/clear by something like pcie_cap_ari_write_config() > depending on ARI bit at runtime. Well, to figure it out, could you please describe what is the situation your patch tries to fix? I would generally would like a reason for the change to be given in commit logs, please try to avoid just restating what the patch does. Are you trying to create a device with > 8 functions? If that is the case I suspect this is not the best way to do this at all. > pcie device can have 256 functions instead of 8. Only if it's an ARI device. And note that if you have a device with 256 functions and disable ARI in the port, it appears as multiple devices. > Maybe we'd like to emulate how p2p bridge transfers pci transaction > to child pci bus somehow. To support > 8 functions per device, some refactoring would be needed: you can not figure out slot and function from the root bus, it depends on ARI along the way. So APIs that pass in decoded slot/function do not make sense anymore, you must pass in devfn all the way. But since everyone decodes and encodes them in the same way, many things will work even without decoding. -- MST
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
2011/1/21 Pierre Riteau : > On 21 janv. 2011, at 15:21, Yoshiaki Tamura wrote: > >> 2011/1/21 Pierre Riteau : >>> On 21 janv. 2011, at 14:59, Yoshiaki Tamura wrote: >>> 2011/1/21 Pierre Riteau : > On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote: > >> 2011/1/21 Kevin Wolf : >>> Am 21.01.2011 13:15, schrieb Yoshiaki Tamura: 2011/1/21 Pierre Riteau : > Le 20 janv. 2011 à 17:18, Yoshiaki Tamura > a écrit : > >> 2011/1/20 Pierre Riteau : >>> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: >>> 2011/1/19 Pierre Riteau : > b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the > return > value of bdrv_write and aborts migration when it fails. However, > if the > size of the block device to migrate is not a multiple of > BLOCK_SIZE > (currently 1 MB), the last bdrv_write will fail with -EIO. > > Fixed by calling bdrv_write with the correct size of the last > block. > --- > block-migration.c | 16 +++- > 1 files changed, 15 insertions(+), 1 deletions(-) > > diff --git a/block-migration.c b/block-migration.c > index 1475325..eeb9c62 100644 > --- a/block-migration.c > +++ b/block-migration.c > @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void > *opaque, int version_id) > int64_t addr; > BlockDriverState *bs; > uint8_t *buf; > + int64_t total_sectors; > + int nr_sectors; > > do { > addr = qemu_get_be64(f); > @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void > *opaque, int version_id) > return -EINVAL; > } > > + total_sectors = bdrv_getlength(bs) >> > BDRV_SECTOR_BITS; > + if (total_sectors <= 0) { > + fprintf(stderr, "Error getting length of block > device %s\n", device_name); > + return -EINVAL; > + } > + > + if (total_sectors - addr < > BDRV_SECTORS_PER_DIRTY_CHUNK) { > + nr_sectors = total_sectors - addr; > + } else { > + nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; > + } > + > buf = qemu_malloc(BLOCK_SIZE); > > qemu_get_buffer(f, buf, BLOCK_SIZE); > - ret = bdrv_write(bs, addr, buf, > BDRV_SECTORS_PER_DIRTY_CHUNK); > + ret = bdrv_write(bs, addr, buf, nr_sectors); > > qemu_free(buf); > if (ret < 0) { > -- > 1.7.3.5 > > > Hi Pierre, I don't think the fix above is correct. If you have a file which isn't aliened with BLOCK_SIZE, you won't get an error with the patch. However, the receiver doesn't know how much sectors which the sender wants to be written, so the guest may fail after migration because some data may not be written. IIUC, although changing bytestream should be prevented as much as possible, we should save/load total_sectors to check appropriate file is allocated on the receiver side. >>> >>> Isn't the guest supposed to be started using a file with the >>> correct size? >> >> I personally don't like that; It's insisting too much to the user. >> Can't we expand the image on the fly? We can just abort if expanding >> failed anyway. > > At first I thought your expansion idea was best, but now I think > there are valid scenarios where it fails. > > Imagine both sides are not using a file but a disk partition as > storage. If the partition size is not rounded to 1 MB, the last write > will fail with the current code, and there is no way we can expand > the partition. > Right. But in case of partition doesn't the check in the patch below return error? Does bdrv_getlength return the size correctly? >>> >>> I'm pretty sure that it does. We would have problems in other places if >>> it didn't (e.g. we're checking if I/O requests are within the disk >>> size). >> >> Sorry for the noise. I just learned it's returning the value of lseek >> in case of raw-posix
[Qemu-devel] Re: [PATCH 3/5] blockdev: Reject multiple definitions for the same drive
Am 17.01.2011 19:31, schrieb Markus Armbruster: > For reasons lost in the mist of time, we silently ignore multiple > definitions for the same drive: > > $ qemu-system-x86_64 -nodefaults -vnc :1 -S -monitor stdio -drive > if=ide,index=1,file=tmp.qcow2 -drive if=ide,index=1,file=nonexistant > QEMU 0.13.50 monitor - type 'help' for more information > (qemu) info block > ide0-hd1: type=hd removable=0 file=tmp.qcow2 backing_file=tmp.img ro=0 > drv=qcow2 encrypted=0 > > With if=none, this can become quite confusing: > > $ qemu-system-x86_64 -nodefaults -vnc :1 -S -monitor stdio -drive > if=none,index=1,file=tmp.qcow2,id=eins -drive > if=none,index=1,file=nonexistant,id=zwei -device ide-drive,drive=eins -device > ide-drive,drive=zwei > qemu-system-x86_64: -device ide-drive,drive=zwei: Property > 'ide-drive.drive' can't find value 'zwei' > > The second -device fails, because it refers to drive zwei, which got > silently ignored. > > Make multiple drive definitions fail cleanly. > > Signed-off-by: Markus Armbruster Dropped this one (and patch 5, which depends on it) from the block branch again, it breaks -cdrom and probably other drives which are created by default. Kevin
Re: [Spice-devel] [Qemu-devel] usb redirection status report
On 01/21/11 14:41, Alexander Graf wrote: On 21.01.2011, at 14:33, Gerd Hoffmann wrote: Hi, Unfortunately there has been little response to this patch set, so I've no idea when it will get merged. (this seems to be the story with a lot of qemu patch sets here on the list, like Gerd's usb descriptor rehandling patches, maybe we need to review how qemu's patch merging processes work?) IIUC the main problem is that we have no active usb maintainer and Anthony wants people to volunteer themselves instead of just assigning them to be one :). http://cgit.freedesktop.org/spice/qemu/commit/?h=usb.4&id=5d0d62feee8aa75525207ef24919c0522651a432 Thanks for stepping up :). Since you volunteered, I guess it's your role to review those patches then, despite that maintainers commit not being upstream yet. They are sitting in my inbox, waiting for the usb patch queue being merged+flushed. I can't stack up stuff endlessly, the patch queue is quite big as-is already ... cheers, Gerd
Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
On 01/21/2011 03:48 AM, Michael S. Tsirkin wrote: On Thu, Jan 20, 2011 at 06:23:36PM -0600, Anthony Liguori wrote: On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote: On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote: On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote: When MSI is off, each interrupt needs to be bounced through the io thread when it's set/cleared, so vhost-net causes more context switches and higher CPU utilization than userspace virtio which handles networking in the same thread. We'll need to fix this by adding level irq support in kvm irqfd, for now disable vhost-net in these configurations. Signed-off-by: Michael S. Tsirkin I actually think this should be a terminal error. The user asks for vhost-net, if we cannot enable it, we should exit. Or we should warn the user that they should expect bad performance. Silently doing something that the user has explicitly asked us not to do is not a good behavior. Regards, Anthony Liguori The issue is that user has no control of the guest, and can not know whether the guest enables MSI. So what you ask for will just make some guests fail, and others fail sometimes. The user also has no way to know that version X of kvm does not expose a way to inject level interrupts with irqfd. We could have *another* flag that says "use vhost where it helps" but then I think this is what everyone wants to do, anyway, and libvirt already sets vhost=on so I prefer redefining the meaning of an existing flag. In the very least, there needs to be a vhost=force. Having some sort of friendly default policy is fine but we need to provide a mechanism for a user to have the final say. If you want to redefine vhost=on to really mean, use the friendly default, that's fine by me, but only if the vhost=force option exists. OK, I will add that, probably as a separate flag as vhost is a boolean. This will get worse performance but it will be what the user asked for. I actually would think libvirt would want to use vhost=force. Debugging with vhost=on is going to be a royal pain in the ass if a user reports bad performance. Given the libvirt XML, you can't actually tell from the guest and the XML whether or not vhost was actually in use or not. Yes you can: check MSI enabled in the guest, if it is - check vhost enabled in the XML. Not that bad at all, is it? Until you automatically detect level triggered interrupt support for irqfd. This means it's also dependent on a kernel feature too. Is there any way to tell in QEMU that vhost was silently disabled? Regards, Anthony Liguori Regards, Anthony Liguori We get worse performance without MSI anyway, how is this different? Maybe this is best handled by a documentation update? We always said: "use vhost=on to enable experimental in kernel accelerator\n" note 'enable' not 'require'. This is similar to how we specify nvectors : you can not make guest use the feature. How about this: diff --git a/qemu-options.hx b/qemu-options.hx index 898561d..3c937c1 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -1061,6 +1061,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net, "use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap flag\n" "use vnet_hdr=on to make the lack of IFF_VNET_HDR support an error condition\n" "use vhost=on to enable experimental in kernel accelerator\n" +"(note: vhost=on has no effect unless guest uses MSI-X)\n" "use 'vhostfd=h' to connect to an already opened vhost net device\n" #endif "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
On 01/21/2011 03:55 AM, Michael S. Tsirkin wrote: On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote: On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote: On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote: On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote: On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote: When MSI is off, each interrupt needs to be bounced through the io thread when it's set/cleared, so vhost-net causes more context switches and higher CPU utilization than userspace virtio which handles networking in the same thread. We'll need to fix this by adding level irq support in kvm irqfd, for now disable vhost-net in these configurations. Signed-off-by: Michael S. Tsirkin I actually think this should be a terminal error. The user asks for vhost-net, if we cannot enable it, we should exit. Or we should warn the user that they should expect bad performance. Silently doing something that the user has explicitly asked us not to do is not a good behavior. Regards, Anthony Liguori The issue is that user has no control of the guest, and can not know whether the guest enables MSI. So what you ask for will just make some guests fail, and others fail sometimes. The user also has no way to know that version X of kvm does not expose a way to inject level interrupts with irqfd. We could have *another* flag that says "use vhost where it helps" but then I think this is what everyone wants to do, anyway, and libvirt already sets vhost=on so I prefer redefining the meaning of an existing flag. In the very least, there needs to be a vhost=force. Having some sort of friendly default policy is fine but we need to provide a mechanism for a user to have the final say. If you want to redefine vhost=on to really mean, use the friendly default, that's fine by me, but only if the vhost=force option exists. I actually would think libvirt would want to use vhost=force. Debugging with vhost=on is going to be a royal pain in the ass if a user reports bad performance. Given the libvirt XML, you can't actually tell from the guest and the XML whether or not vhost was actually in use or not. If we add a force option, let's please distinguish hotplug from VM creation time. The latter can abort. Hotplug should print an error and fail the initfn. It can't abort at init - MSI is disabled at init, it needs to be enabled by the guest later. And aborting the guest in the middle of the run is a very bad idea. What vhostforce=true will do is force vhost backend to be used even if it is slower. vhost=on,vhostforce=false use vhost if we think it will improve performance vhost=on,vhostforce=true always use vhost vhost=off,vhostforce=*do not use vhost Regards, Anthony Liguori Thanks, Alex
Re: [Qemu-devel] [RFC PATCH] Fake machine for scalability testing
On 01/21/2011 04:43 AM, Daniel P. Berrange wrote: On Thu, Jan 20, 2011 at 01:50:33PM -0600, Anthony Liguori wrote: On 01/20/2011 11:12 AM, Markus Armbruster wrote: Anthony Liguori writes: On 01/18/2011 02:16 PM, Markus Armbruster wrote: The problem: you want to do serious scalability testing (1000s of VMs) of your management stack. If each guest eats up a few 100MiB and competes for CPU, that requires a serious host machine. Which you don't have. You also don't want to modify the management stack at all, if you can help it. The solution: a perfectly normal-looking QEMU that uses minimal resources. Ability to execute any guest code is strictly optional ;) New option -fake-machine creates a fake machine incapable of running guest code. Completely compiled out by default, enable with configure --enable-fake-machine. With -fake-machine, CPU use is negligible, and memory use is rather modest. Non-fake VM running F-14 live, right after boot: UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD armbru 15707 2558 53 191837 414388 1 21:05 pts/300:00:29 [...] Same VM -fake-machine, after similar time elapsed: UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD armbru 15742 2558 0 85129 9412 0 21:07 pts/300:00:00 [...] We're using a very similar patch for RHEL scalability testing. Interesting, but: 9432 anthony 20 0 153m 14m 5384 S0 0.2 0:00.22 qemu-system-x86 That's qemu-system-x86 -m 4 Sure you ran qemu-system-x86 -fake-machine? No, I didn't try it. My point was that -m 4 is already pretty small. One of the core ideas/requirements behind the "fake QEMU" was that we won't need to modify applications to adjust the command line arguments in this kind of way. We want all their machine definition logic to remain unaffected. In fact my original prototype did not even require addition of the passing of an extra '-fake-machine' argument, it would have just been a plain drop in alternative QEMU binary. It also stubbed out much of the KVM codepaths, so you could "enable" KVM mode without actually really having KVM available on the host. In terms of memory overhead, the largest source is not really going to be addressed by -fake-machine (l1_phys_map and phys_ram_dirty). git-grep phys_ram_dirty finds nothing. Yeah, it's now ram_list[i].phys_dirty. l1_phys_map is (sizeof(void *) + sizeof(PhysPageDesc)) * mem_size_in_pages phys_dirty is mem_size_in_pages bytes. I don't really understand the point of not creating a VCPU with KVM. Is there some type of overhead in doing that? I briefly looked at both main loops, TCG's was the first one I happened to crack, and I didn't feel like doing both then. If the general approach is okay, I'll gladly investigate how to do it with KVM. I guess what I don't understand is why do you need to not run guest code? Specifically, if you remove the following, is it any less useful? IIUC, if you don't have the following change, then the guest CPUs will actually execute the kernel/bootable disk configured, causing host CPU utilization to rise. Even if it only adds 2% load on the host, this quickly becomes an issue as you get 200 or more VMs on the host. Ideally we would have the main loop completely disabled, not merely the CPUs, because this would avoid all possible background CPU load that any QEMU internal timers might cause. Basically the desired goal is, make no change to the QEMU command line aguments, but have zero memory and CPU overhead by running QEMU. fake-machine doesn't get as close to zero as my original fake QEMU target managed, but it is still pretty good, considering how much less code is involved in fake-machine. Oh, so what you really want to do is: #!/bin/sh /usr/libexec/qemu-kvm -m 4 Ignore all command line parameters and just run a micro guest. If you don't specify any kernel/boot disks, you don't need to disable a VCPU execution because it'll spin in a hlt loop once the bios executes. Regards, Anthony Liguori diff --git a/cpu-exec.c b/cpu-exec.c index 8c9fb8b..cd1259a 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -230,7 +230,7 @@ int cpu_exec(CPUState *env1) uint8_t *tc_ptr; unsigned long next_tb; -if (cpu_halted(env1) == EXCP_HALTED) +if (fake_machine || cpu_halted(env1) == EXCP_HALTED) return EXCP_HALTED; Daniel
Re: [Qemu-devel] [RFC PATCH] Fake machine for scalability testing
On Fri, Jan 21, 2011 at 08:43:20AM -0600, Anthony Liguori wrote: > On 01/21/2011 04:43 AM, Daniel P. Berrange wrote: > >On Thu, Jan 20, 2011 at 01:50:33PM -0600, Anthony Liguori wrote: > >>On 01/20/2011 11:12 AM, Markus Armbruster wrote: > >>>Anthony Liguori writes: > >>> > On 01/18/2011 02:16 PM, Markus Armbruster wrote: > >The problem: you want to do serious scalability testing (1000s of VMs) > >of your management stack. If each guest eats up a few 100MiB and > >competes for CPU, that requires a serious host machine. Which you don't > >have. You also don't want to modify the management stack at all, if you > >can help it. > > > >The solution: a perfectly normal-looking QEMU that uses minimal > >resources. Ability to execute any guest code is strictly optional ;) > > > >New option -fake-machine creates a fake machine incapable of running > >guest code. Completely compiled out by default, enable with configure > >--enable-fake-machine. > > > >With -fake-machine, CPU use is negligible, and memory use is rather > >modest. > > > >Non-fake VM running F-14 live, right after boot: > >UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD > >armbru 15707 2558 53 191837 414388 1 21:05 pts/300:00:29 [...] > > > >Same VM -fake-machine, after similar time elapsed: > >UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD > >armbru 15742 2558 0 85129 9412 0 21:07 pts/300:00:00 [...] > > > >We're using a very similar patch for RHEL scalability testing. > > > Interesting, but: > > 9432 anthony 20 0 153m 14m 5384 S0 0.2 0:00.22 > qemu-system-x86 > > That's qemu-system-x86 -m 4 > >>>Sure you ran qemu-system-x86 -fake-machine? > >>No, I didn't try it. My point was that -m 4 is already pretty small. > >One of the core ideas/requirements behind the "fake QEMU" was > >that we won't need to modify applications to adjust the command > >line arguments in this kind of way. We want all their machine > >definition logic to remain unaffected. In fact my original > >prototype did not even require addition of the passing of an > >extra '-fake-machine' argument, it would have just been a plain > >drop in alternative QEMU binary. It also stubbed out much of > >the KVM codepaths, so you could "enable" KVM mode without > >actually really having KVM available on the host. > > > In terms of memory overhead, the largest source is not really going to > be addressed by -fake-machine (l1_phys_map and phys_ram_dirty). > >>>git-grep phys_ram_dirty finds nothing. > >>Yeah, it's now ram_list[i].phys_dirty. > >> > >>l1_phys_map is (sizeof(void *) + sizeof(PhysPageDesc)) * mem_size_in_pages > >> > >>phys_dirty is mem_size_in_pages bytes. > >> > I don't really understand the point of not creating a VCPU with KVM. > Is there some type of overhead in doing that? > >>>I briefly looked at both main loops, TCG's was the first one I happened > >>>to crack, and I didn't feel like doing both then. If the general > >>>approach is okay, I'll gladly investigate how to do it with KVM. > >>I guess what I don't understand is why do you need to not run guest > >>code? Specifically, if you remove the following, is it any less > >>useful? > >IIUC, if you don't have the following change, then the guest CPUs > >will actually execute the kernel/bootable disk configured, causing > >host CPU utilization to rise. Even if it only adds 2% load on the > >host, this quickly becomes an issue as you get 200 or more VMs on > >the host. Ideally we would have the main loop completely disabled, > >not merely the CPUs, because this would avoid all possible background > >CPU load that any QEMU internal timers might cause. > > > >Basically the desired goal is, make no change to the QEMU command > >line aguments, but have zero memory and CPU overhead by running > >QEMU. fake-machine doesn't get as close to zero as my original > >fake QEMU target managed, but it is still pretty good, considering > >how much less code is involved in fake-machine. > > Oh, so what you really want to do is: > > #!/bin/sh > /usr/libexec/qemu-kvm -m 4 > > Ignore all command line parameters and just run a micro guest. If > you don't specify any kernel/boot disks, you don't need to disable a > VCPU execution because it'll spin in a hlt loop once the bios > executes. That's going to likely cause app confusion, because the app will be specifying 1 GB, but when it talks to the balloon it is only going to see / be allowed to set the balloon between 0 & 4 MB. Daniel
Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
On 21 janv. 2011, at 15:30, Yoshiaki Tamura wrote: > 2011/1/21 Pierre Riteau : >> On 21 janv. 2011, at 15:21, Yoshiaki Tamura wrote: >> >>> 2011/1/21 Pierre Riteau : On 21 janv. 2011, at 14:59, Yoshiaki Tamura wrote: > 2011/1/21 Pierre Riteau : >> On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote: >> >>> 2011/1/21 Kevin Wolf : Am 21.01.2011 13:15, schrieb Yoshiaki Tamura: > 2011/1/21 Pierre Riteau : >> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura >> a écrit : >> >>> 2011/1/20 Pierre Riteau : On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote: > 2011/1/19 Pierre Riteau : >> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the >> return >> value of bdrv_write and aborts migration when it fails. However, >> if the >> size of the block device to migrate is not a multiple of >> BLOCK_SIZE >> (currently 1 MB), the last bdrv_write will fail with -EIO. >> >> Fixed by calling bdrv_write with the correct size of the last >> block. >> --- >> block-migration.c | 16 +++- >> 1 files changed, 15 insertions(+), 1 deletions(-) >> >> diff --git a/block-migration.c b/block-migration.c >> index 1475325..eeb9c62 100644 >> --- a/block-migration.c >> +++ b/block-migration.c >> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void >> *opaque, int version_id) >> int64_t addr; >> BlockDriverState *bs; >> uint8_t *buf; >> +int64_t total_sectors; >> +int nr_sectors; >> >> do { >> addr = qemu_get_be64(f); >> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void >> *opaque, int version_id) >> return -EINVAL; >> } >> >> +total_sectors = bdrv_getlength(bs) >> >> BDRV_SECTOR_BITS; >> +if (total_sectors <= 0) { >> +fprintf(stderr, "Error getting length of block >> device %s\n", device_name); >> +return -EINVAL; >> +} >> + >> +if (total_sectors - addr < >> BDRV_SECTORS_PER_DIRTY_CHUNK) { >> +nr_sectors = total_sectors - addr; >> +} else { >> +nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK; >> +} >> + >> buf = qemu_malloc(BLOCK_SIZE); >> >> qemu_get_buffer(f, buf, BLOCK_SIZE); >> -ret = bdrv_write(bs, addr, buf, >> BDRV_SECTORS_PER_DIRTY_CHUNK); >> +ret = bdrv_write(bs, addr, buf, nr_sectors); >> >> qemu_free(buf); >> if (ret < 0) { >> -- >> 1.7.3.5 >> >> >> > > Hi Pierre, > > I don't think the fix above is correct. If you have a file which > isn't aliened with BLOCK_SIZE, you won't get an error with the > patch. However, the receiver doesn't know how much sectors which > the sender wants to be written, so the guest may fail after > migration because some data may not be written. IIUC, although > changing bytestream should be prevented as much as possible, we > should save/load total_sectors to check appropriate file is > allocated on the receiver side. Isn't the guest supposed to be started using a file with the correct size? >>> >>> I personally don't like that; It's insisting too much to the user. >>> Can't we expand the image on the fly? We can just abort if >>> expanding >>> failed anyway. >> >> At first I thought your expansion idea was best, but now I think >> there are valid scenarios where it fails. >> >> Imagine both sides are not using a file but a disk partition as >> storage. If the partition size is not rounded to 1 MB, the last >> write will fail with the current code, and there is no way we can >> expand the partition. >> > > Right. But in case of partition doesn't the check in the patch below > return error? Does bdrv_getlength return the size correctly? I'm pretty sure that it does. We would have problems in other places if >
Re: [Qemu-devel] [RFC PATCH] Fake machine for scalability testing
On 01/21/2011 04:38 AM, Markus Armbruster wrote: Anthony Liguori writes: On 01/20/2011 11:12 AM, Markus Armbruster wrote: Anthony Liguori writes: On 01/18/2011 02:16 PM, Markus Armbruster wrote: The problem: you want to do serious scalability testing (1000s of VMs) of your management stack. If each guest eats up a few 100MiB and competes for CPU, that requires a serious host machine. Which you don't have. You also don't want to modify the management stack at all, if you can help it. The solution: a perfectly normal-looking QEMU that uses minimal resources. Ability to execute any guest code is strictly optional ;) New option -fake-machine creates a fake machine incapable of running guest code. Completely compiled out by default, enable with configure --enable-fake-machine. With -fake-machine, CPU use is negligible, and memory use is rather modest. Non-fake VM running F-14 live, right after boot: UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD armbru 15707 2558 53 191837 414388 1 21:05 pts/300:00:29 [...] Same VM -fake-machine, after similar time elapsed: UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD armbru 15742 2558 0 85129 9412 0 21:07 pts/300:00:00 [...] We're using a very similar patch for RHEL scalability testing. Interesting, but: 9432 anthony 20 0 153m 14m 5384 S0 0.2 0:00.22 qemu-system-x86 That's qemu-system-x86 -m 4 Sure you ran qemu-system-x86 -fake-machine? No, I didn't try it. My point was that -m 4 is already pretty small. Ah! However, it's not as small as -fake-machine, and eats all the CPU it can get. None-fake VM as above, but with -m 4: UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD armbru 19325 2558 93 39869 17020 1 11:30 pts/300:00:42 [...] And I believe we can make -fake-machine use even less memory than now, with a little more work. In terms of memory overhead, the largest source is not really going to be addressed by -fake-machine (l1_phys_map and phys_ram_dirty). git-grep phys_ram_dirty finds nothing. Yeah, it's now ram_list[i].phys_dirty. l1_phys_map is (sizeof(void *) + sizeof(PhysPageDesc)) * mem_size_in_pages phys_dirty is mem_size_in_pages bytes. Thanks. I don't really understand the point of not creating a VCPU with KVM. Is there some type of overhead in doing that? I briefly looked at both main loops, TCG's was the first one I happened to crack, and I didn't feel like doing both then. If the general approach is okay, I'll gladly investigate how to do it with KVM. I guess what I don't understand is why do you need to not run guest code? Specifically, if you remove the following, is it any less useful? diff --git a/cpu-exec.c b/cpu-exec.c index 8c9fb8b..cd1259a 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -230,7 +230,7 @@ int cpu_exec(CPUState *env1) uint8_t *tc_ptr; unsigned long next_tb; -if (cpu_halted(env1) == EXCP_HALTED) +if (fake_machine || cpu_halted(env1) == EXCP_HALTED) return EXCP_HALTED; I don't want 1000s of guests running infinite "not enough memory to do anything useful, panic!" reboot loops. Because that's 1000s of guests competing for CPU. Hrm, that's not the behavior I see. With no bootable drive, the BIOS will spin in a HLT loop as part of int18. Regards, Anthony Liguori If you think we can achieve my goals (stated in my first paragraph) in a different way, I'm all ears.
[Qemu-devel] [Bug 702885] Re: "Internal resource leak" error with ARM NEON vmull.s32 insn
** Attachment added: "Binary to reproduce bug" https://bugs.launchpad.net/qemu/+bug/702885/+attachment/1801849/+files/main.axf -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/702885 Title: "Internal resource leak" error with ARM NEON vmull.s32 insn Status in QEMU: New Bug description: This bug occurs in qemu, commit 78a59470e6bbc6e16dc4033767492649c1ae4cfd (most recent as of 01/14/2011). Compile, assemble, and link the code below, with the ARM tools. (I use ARM C/C++ Compiler, 4.1 [Build 462]). armasm --cpu Cortex-A8 --licensing=flex foo.s armcc --cpu Cortex-A8 --licensing=flex -o main -L--sysv main.c foo.o Execute on qemu-arm and observe an "Internal resource leak" message. > qemu-arm main Internal resource leak before 818c - Wolfgang main.c: int main(void) { void foofunc(void); foofunc(); return 0 ; } foo.s: ARM REQUIRE8 PRESERVE8 AREA code, CODE, READONLY, ALIGN=2 foofunc PROC VMULL.S32 q1, d2, d4 MOV pc, lr ENDP EXPORT foofunc [CODE] END
[Qemu-devel] Re: [PATCH v4 00/32] usb descriptor overhaul + more
On 01/12/11 12:19, Gerd Hoffmann wrote: Hi, Here is a update of the usb descriptor patches and a bunch of improvements on top of that. This patch series features: [ ... ] The following changes since commit 05bf441eb69a813d3893174d54faa6afa8c0d39b: cris: Remove unused orig_flags (2011-01-10 23:28:08 +0100) are available in the git repository at: git://anongit.freedesktop.org/spice/qemu usb.4 Ping? cheers, Gerd
Re: [Spice-devel] [Qemu-devel] usb redirection status report
On 21.01.2011, at 15:33, Gerd Hoffmann wrote: > On 01/21/11 14:41, Alexander Graf wrote: >> >> On 21.01.2011, at 14:33, Gerd Hoffmann wrote: >> >>> Hi, >>> > Unfortunately there has been little response to this patch set, > so I've no idea when it will get merged. > > (this seems to be the story with a lot of qemu patch sets here > on the list, like Gerd's usb descriptor rehandling patches, > maybe we need to review how qemu's patch merging processes > work?) IIUC the main problem is that we have no active usb maintainer and Anthony wants people to volunteer themselves instead of just assigning them to be one :). >>> >>> http://cgit.freedesktop.org/spice/qemu/commit/?h=usb.4&id=5d0d62feee8aa75525207ef24919c0522651a432 >> >>> >> Thanks for stepping up :). Since you volunteered, I guess it's your >> role to review those patches then, despite that maintainers commit >> not being upstream yet. > > They are sitting in my inbox, waiting for the usb patch queue being > merged+flushed. I can't stack up stuff endlessly, the patch queue is quite > big as-is already ... Sounds like a good topic for the community call maybe? There's got to be a reason Anthony hasn't pulled yet. Alex
[Qemu-devel] [Bug 705931] Re: make ui sdl error 1 on git devel
** Attachment added: "konsole.txt" https://bugs.launchpad.net/bugs/705931/+attachment/1801911/+files/konsole.txt -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/705931 Title: make ui sdl error 1 on git devel Status in QEMU: New Bug description: after clone git devel, try compile on slackware 13.1 32 bit machine got error: ui/sdl.c:809:1: error: expected '=', ',', ';', 'asm' or '__attribute__' before '{' token ui/sdl.c:815:36: error: expected ')' before '*' token /usr/include/X11/Xlib.h:3575:14: error: old-style parameter declarations in prototyped function definition /usr/include/X11/Xlib.h:3576:5: error: parameter name omitted ui/sdl.c:883:1: error: expected '{' at end of input ui/sdl.c:883:1: error: control reaches end of non-void function make: *** [ui/sdl.o] Error 1 root@darkstar:/usr/src/qemu/qemu# gcc -v Reading specs from /usr/lib/gcc/i486-slackware-linux/4.5.1/specs COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i486-slackware-linux/4.5.1/lto-wrapper Target: i486-slackware-linux Configured with: ../gcc-4.5.1/configure --prefix=/usr --libdir=/usr/lib --mandir=/usr/man --infodir=/usr/info --enable-shared --enable-bootstrap --enable-languages=ada,c,c++,fortran,java,objc,lto --enable-threads=posix --enable-checking=release --with-system-zlib --with-python-dir=/lib/python2.6/site-packages --disable-libunwind-exceptions --enable-__cxa_atexit --enable-libssp --enable-lto --with-gnu-ld --verbose --with-arch=i486 --target=i486-slackware-linux --build=i486-slackware-linux --host=i486-slackware-linux Thread model: posix gcc version 4.5.1 (GCC) root@darkstar:/usr/src/qemu/qemu# uname -a Linux darkstar 2.6.35.7-smp #2 SMP Mon Oct 11 14:52:09 CDT 2010 i686 Intel(R) Core(TM)2 Duo CPU T7100 @ 1.80GHz GenuineIntel GNU/Linux root@darkstar:/usr/src/qemu/qemu# cat /etc/slackware-version Slackware 13.1.0 thanks
[Qemu-devel] [Bug 705931] [NEW] make ui sdl error 1 on git devel
Public bug reported: after clone git devel, try compile on slackware 13.1 32 bit machine got error: ui/sdl.c:809:1: error: expected '=', ',', ';', 'asm' or '__attribute__' before '{' token ui/sdl.c:815:36: error: expected ')' before '*' token /usr/include/X11/Xlib.h:3575:14: error: old-style parameter declarations in prototyped function definition /usr/include/X11/Xlib.h:3576:5: error: parameter name omitted ui/sdl.c:883:1: error: expected '{' at end of input ui/sdl.c:883:1: error: control reaches end of non-void function make: *** [ui/sdl.o] Error 1 root@darkstar:/usr/src/qemu/qemu# gcc -v Reading specs from /usr/lib/gcc/i486-slackware-linux/4.5.1/specs COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i486-slackware-linux/4.5.1/lto-wrapper Target: i486-slackware-linux Configured with: ../gcc-4.5.1/configure --prefix=/usr --libdir=/usr/lib --mandir=/usr/man --infodir=/usr/info --enable-shared --enable-bootstrap --enable-languages=ada,c,c++,fortran,java,objc,lto --enable-threads=posix --enable-checking=release --with-system-zlib --with-python-dir=/lib/python2.6/site-packages --disable-libunwind-exceptions --enable-__cxa_atexit --enable-libssp --enable-lto --with-gnu-ld --verbose --with-arch=i486 --target=i486-slackware-linux --build=i486-slackware-linux --host=i486-slackware-linux Thread model: posix gcc version 4.5.1 (GCC) root@darkstar:/usr/src/qemu/qemu# uname -a Linux darkstar 2.6.35.7-smp #2 SMP Mon Oct 11 14:52:09 CDT 2010 i686 Intel(R) Core(TM)2 Duo CPU T7100 @ 1.80GHz GenuineIntel GNU/Linux root@darkstar:/usr/src/qemu/qemu# cat /etc/slackware-version Slackware 13.1.0 thanks ** Affects: qemu Importance: Undecided Status: New ** Tags: 13.1 32 bit qemu sdl slackware -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/705931 Title: make ui sdl error 1 on git devel Status in QEMU: New Bug description: after clone git devel, try compile on slackware 13.1 32 bit machine got error: ui/sdl.c:809:1: error: expected '=', ',', ';', 'asm' or '__attribute__' before '{' token ui/sdl.c:815:36: error: expected ')' before '*' token /usr/include/X11/Xlib.h:3575:14: error: old-style parameter declarations in prototyped function definition /usr/include/X11/Xlib.h:3576:5: error: parameter name omitted ui/sdl.c:883:1: error: expected '{' at end of input ui/sdl.c:883:1: error: control reaches end of non-void function make: *** [ui/sdl.o] Error 1 root@darkstar:/usr/src/qemu/qemu# gcc -v Reading specs from /usr/lib/gcc/i486-slackware-linux/4.5.1/specs COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i486-slackware-linux/4.5.1/lto-wrapper Target: i486-slackware-linux Configured with: ../gcc-4.5.1/configure --prefix=/usr --libdir=/usr/lib --mandir=/usr/man --infodir=/usr/info --enable-shared --enable-bootstrap --enable-languages=ada,c,c++,fortran,java,objc,lto --enable-threads=posix --enable-checking=release --with-system-zlib --with-python-dir=/lib/python2.6/site-packages --disable-libunwind-exceptions --enable-__cxa_atexit --enable-libssp --enable-lto --with-gnu-ld --verbose --with-arch=i486 --target=i486-slackware-linux --build=i486-slackware-linux --host=i486-slackware-linux Thread model: posix gcc version 4.5.1 (GCC) root@darkstar:/usr/src/qemu/qemu# uname -a Linux darkstar 2.6.35.7-smp #2 SMP Mon Oct 11 14:52:09 CDT 2010 i686 Intel(R) Core(TM)2 Duo CPU T7100 @ 1.80GHz GenuineIntel GNU/Linux root@darkstar:/usr/src/qemu/qemu# cat /etc/slackware-version Slackware 13.1.0 thanks
Re: [Qemu-devel] [PATCH 4/5] spitz: make spitz-keyboard to use qdev infrastructure
Does this patch preserve the screen rotation feature?
[Qemu-devel] [Bug 702885] Re: "Internal resource leak" error with ARM NEON vmull.s32 insn
That binary executes OK for me with no resource leak messages with: qemu master as of commit b646968336 http://patchwork.ozlabs.org/patch/79728/ http://patchwork.ozlabs.org/patch/79581/ (i386 host.) -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/702885 Title: "Internal resource leak" error with ARM NEON vmull.s32 insn Status in QEMU: New Bug description: This bug occurs in qemu, commit 78a59470e6bbc6e16dc4033767492649c1ae4cfd (most recent as of 01/14/2011). Compile, assemble, and link the code below, with the ARM tools. (I use ARM C/C++ Compiler, 4.1 [Build 462]). armasm --cpu Cortex-A8 --licensing=flex foo.s armcc --cpu Cortex-A8 --licensing=flex -o main -L--sysv main.c foo.o Execute on qemu-arm and observe an "Internal resource leak" message. > qemu-arm main Internal resource leak before 818c - Wolfgang main.c: int main(void) { void foofunc(void); foofunc(); return 0 ; } foo.s: ARM REQUIRE8 PRESERVE8 AREA code, CODE, READONLY, ALIGN=2 foofunc PROC VMULL.S32 q1, d2, d4 MOV pc, lr ENDP EXPORT foofunc [CODE] END
[Qemu-devel] Re: [RFC][PATCH v6 03/23] Make qemu timers available for tools
On 01/17/11 14:14, Michael Roth wrote: > diff --git a/qemu-ioh.c b/qemu-ioh.c > index cc71470..001e7a2 100644 > --- a/qemu-ioh.c > +++ b/qemu-ioh.c > @@ -22,7 +22,11 @@ > * THE SOFTWARE. > */ > #include "qemu-ioh.h" > +#include "qemu-char.h" > #include "qlist.h" > +#ifdef CONFIG_EVENTFD > +#include > +#endif > > /* XXX: fd_read_poll should be suppressed, but an API change is > necessary in the character devices to suppress fd_can_read(). */ > @@ -113,3 +117,92 @@ void qemu_process_fd_handlers2(void *ioh_record_list, > const fd_set *rfds, > } > } > } > + > +#ifndef _WIN32 > +void iothread_event_increment(int *io_thread_fd) Please split the WIN32 stuff into it's own file, similar to oslib-posix and oslib-win32.c etc. > diff --git a/qemu-ioh.h b/qemu-ioh.h > index 7c6e833..2c714a9 100644 > --- a/qemu-ioh.h > +++ b/qemu-ioh.h > @@ -31,4 +31,13 @@ void qemu_get_fdset2(void *ioh_record_list, int *nfds, > fd_set *rfds, > void qemu_process_fd_handlers2(void *ioh_record_list, const fd_set *rfds, > const fd_set *wfds, const fd_set *xfds); > > + > +#ifndef _WIN32 > +void iothread_event_increment(int *io_thread_fd); > +int iothread_event_init(int *io_thread_fd); > +#else > +int win32_event_init(HANDLE *qemu_event_handle); > +void win32_event_increment(HANDLE *qemu_event_handle); > +#endif Can you not do something slightly nicer that allows for those to be the same prototype for all users? Like define a event_handle_t? > + > +#ifndef _WIN32 > +static int io_thread_fd = -1; Needs splitting into separate files too. > diff --git a/qemu-tool.h b/qemu-tool.h > new file mode 100644 > index 000..fd693cf > --- /dev/null > +++ b/qemu-tool.h > @@ -0,0 +1,26 @@ > +#ifndef QEMU_TOOL_H > +#define QEMU_TOOL_H > + > +#include "qemu-common.h" > + > +#ifdef CONFIG_EVENTFD > +#include > +#endif > + > +typedef void VMStateDescription; > +typedef int VMStateInfo; > + > +#ifndef _WIN32 > +void qemu_event_increment(void); > +int qemu_event_init(void); > +#else > +int qemu_event_init(void); > +void qemu_event_increment(void); > +#endif No matter how long I stare at those prototypes, I fail to see the difference between the win32 and the posix version :) Cheers, Jes
Re: [Qemu-devel] [PATCH 5/5] pxa2xx_gpio: switch to using qdev
Dmitry Eremin-Solenikov writes: > Signed-off-by: Dmitry Eremin-Solenikov > --- > hw/gumstix.c |4 +- > hw/pxa.h | 10 +--- > hw/pxa2xx.c |4 +- > hw/pxa2xx_gpio.c | 151 > ++ > hw/spitz.c | 34 ++-- > hw/tosa.c| 12 ++-- > 6 files changed, 102 insertions(+), 113 deletions(-) > > diff --git a/hw/gumstix.c b/hw/gumstix.c > index af8b464..ee63f63 100644 > --- a/hw/gumstix.c > +++ b/hw/gumstix.c > @@ -78,7 +78,7 @@ static void connex_init(ram_addr_t ram_size, > > /* Interrupt line of NIC is connected to GPIO line 36 */ > smc91c111_init(&nd_table[0], 0x04000300, > -pxa2xx_gpio_in_get(cpu->gpio)[36]); > +qdev_get_gpio_in(cpu->gpio, 36)); > } > > static void verdex_init(ram_addr_t ram_size, > @@ -117,7 +117,7 @@ static void verdex_init(ram_addr_t ram_size, > > /* Interrupt line of NIC is connected to GPIO line 99 */ > smc91c111_init(&nd_table[0], 0x04000300, > -pxa2xx_gpio_in_get(cpu->gpio)[99]); > +qdev_get_gpio_in(cpu->gpio, 99)); > } > > static QEMUMachine connex_machine = { > diff --git a/hw/pxa.h b/hw/pxa.h > index 8d6a8c3..f73d33b 100644 > --- a/hw/pxa.h > +++ b/hw/pxa.h > @@ -70,13 +70,9 @@ void pxa25x_timer_init(target_phys_addr_t base, qemu_irq > *irqs); > void pxa27x_timer_init(target_phys_addr_t base, qemu_irq *irqs, qemu_irq > irq4); > > /* pxa2xx_gpio.c */ > -typedef struct PXA2xxGPIOInfo PXA2xxGPIOInfo; > -PXA2xxGPIOInfo *pxa2xx_gpio_init(target_phys_addr_t base, > +DeviceState *pxa2xx_gpio_init(target_phys_addr_t base, > CPUState *env, qemu_irq *pic, int lines); > -qemu_irq *pxa2xx_gpio_in_get(PXA2xxGPIOInfo *s); > -void pxa2xx_gpio_out_set(PXA2xxGPIOInfo *s, > -int line, qemu_irq handler); > -void pxa2xx_gpio_read_notifier(PXA2xxGPIOInfo *s, qemu_irq handler); > +void pxa2xx_gpio_read_notifier(DeviceState *dev, qemu_irq handler); > > /* pxa2xx_dma.c */ > typedef struct PXA2xxDMAState PXA2xxDMAState; > @@ -132,7 +128,7 @@ typedef struct { > qemu_irq *pic; > qemu_irq reset; > PXA2xxDMAState *dma; > -PXA2xxGPIOInfo *gpio; > +DeviceState *gpio; > PXA2xxLCDState *lcd; > SSIBus **ssp; > PXA2xxI2CState *i2c[2]; > diff --git a/hw/pxa2xx.c b/hw/pxa2xx.c > index 6e72a5c..d966846 100644 > --- a/hw/pxa2xx.c > +++ b/hw/pxa2xx.c > @@ -2158,7 +2158,7 @@ PXA2xxState *pxa270_init(unsigned int sdram_size, const > char *revision) > > /* GPIO1 resets the processor */ > /* The handler can be overridden by board-specific code */ > -pxa2xx_gpio_out_set(s->gpio, 1, s->reset); > +qdev_connect_gpio_out(s->gpio, 1, s->reset); > return s; > } > > @@ -2279,7 +2279,7 @@ PXA2xxState *pxa255_init(unsigned int sdram_size) > > /* GPIO1 resets the processor */ > /* The handler can be overridden by board-specific code */ > -pxa2xx_gpio_out_set(s->gpio, 1, s->reset); > +qdev_connect_gpio_out(s->gpio, 1, s->reset); > return s; > } > > diff --git a/hw/pxa2xx_gpio.c b/hw/pxa2xx_gpio.c > index 0d03446..295a0ff 100644 > --- a/hw/pxa2xx_gpio.c > +++ b/hw/pxa2xx_gpio.c > @@ -8,15 +8,17 @@ > */ > > #include "hw.h" > +#include "sysbus.h" > #include "pxa.h" > > #define PXA2XX_GPIO_BANKS4 > > +typedef struct PXA2xxGPIOInfo PXA2xxGPIOInfo; > struct PXA2xxGPIOInfo { > -qemu_irq *pic; > +SysBusDevice busdev; > +qemu_irq irq0, irq1, irqX; > int lines; > -CPUState *cpu_env; > -qemu_irq *in; > +void *cpu_env; cpu_env made void * here because you DEFINE_PROP_PTR() it in pxa2xxgpioinfo. DEFINE_PROP_PTR() is for dirty hacks only. Which means you got one around here. See use of cpu_env below. > > /* XXX: GNU C vectors are more suitable */ > uint32_t ilevel[PXA2XX_GPIO_BANKS]; > @@ -66,19 +68,19 @@ static struct { > static void pxa2xx_gpio_irq_update(PXA2xxGPIOInfo *s) > { > if (s->status[0] & (1 << 0)) > -qemu_irq_raise(s->pic[PXA2XX_PIC_GPIO_0]); > +qemu_irq_raise(s->irq0); > else > -qemu_irq_lower(s->pic[PXA2XX_PIC_GPIO_0]); > +qemu_irq_lower(s->irq0); > > if (s->status[0] & (1 << 1)) > -qemu_irq_raise(s->pic[PXA2XX_PIC_GPIO_1]); > +qemu_irq_raise(s->irq1); > else > -qemu_irq_lower(s->pic[PXA2XX_PIC_GPIO_1]); > +qemu_irq_lower(s->irq1); > > if ((s->status[0] & ~3) | s->status[1] | s->status[2] | s->status[3]) > -qemu_irq_raise(s->pic[PXA2XX_PIC_GPIO_X]); > +qemu_irq_raise(s->irqX); > else > -qemu_irq_lower(s->pic[PXA2XX_PIC_GPIO_X]); > +qemu_irq_lower(s->irqX); > } > > /* Bitmap of pins used as standby and sleep wake-up sources. */ > @@ -114,7 +116,7 @@ static void pxa2xx_gpio_set(void *opaque, int line, int > level) > pxa2xx_gpio_irq_update(s); > > /* Wake-up GPIOs */ > -if (s->cpu
Re: [Qemu-devel] [PATCH 1/5] SharpSL scoop device - convert to qdev
Dmitry Eremin-Solenikov writes: > Convert SharpSL scoop device to qdev, remove lots of supporting code, as > lot of init and gpio related things can now be done automagically. Bonus: conversion to vmstate. I don't know the device, but the conversion looks sane to me. Same for 2/5 and 3/5.
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On Fri, Jan 21, 2011 at 8:46 AM, Gerd Hoffmann wrote: > Hi, > >> By the way, we don't have a QEMUState but instead use globals. > > /me wants to underline this. > > IMO it is absolutely pointless to worry about ways to pass around kvm_state. > There never ever will be a serious need for that. > > We can stick with the current model of keeping global state in global > variables. And just do the same with kvm_state. > > Or we can move to have all state in a QEMUState struct which we'll pass > around basically everywhere. Then we can simply embed or reference > kvm_state there. > > I'd tend to stick with the global variables as I don't see the point in > having a QEMUstate. I doubt we'll ever see two virtual machines driven by a > single qemu process. YMMV. Global variables are signs of a poor design. QEMUState would not help that, instead more specific structures should be designed, much like what I've proposed for KVMState. Some of these new structures should be even passed around when it makes sense. But I'd not start kvm_state redesign around global variables or QEMUState.
[Qemu-devel] Re: [RFC][PATCH v6 07/23] virtagent: base server definitions
> diff --git a/virtagent-server.h b/virtagent-server.h > new file mode 100644 > index 000..9f68921 > --- /dev/null > +++ b/virtagent-server.h > @@ -0,0 +1,34 @@ > +/* > + * virt-agent - host/guest RPC daemon functions > + * > + * Copyright IBM Corp. 2010 > + * > + * Authors: > + * Michael Roth > + * > + * This work is licensed under the terms of the GNU GPL, version 2 or later. > + * See the COPYING file in the top-level directory. > + * > + */ > + > +#include > +#include > + > +#define GUEST_AGENT_SERVICE_ID "virtagent" > +#define GUEST_AGENT_PATH "/tmp/virtagent-guest.sock" > +#define HOST_AGENT_SERVICE_ID "virtagent-host" > +#define HOST_AGENT_PATH "/tmp/virtagent-host.sock" > +#define VA_GETFILE_MAX 1 << 30 > +#define VA_FILEBUF_LEN 16384 > +#define VA_DMESG_LEN 16384 I really don't like these hard coded constants - you you have a command line interface allowing for the change of the sockets and file names? Otherwise you'll hit problems on the host side with concurrent runs of qemu. I really would like to see the dmesg stuff removed too for now as we discussed earlier. Cheers, Jes
[Qemu-devel] Re: [PATCH] pci/pcie: make pci_find_device() ARI aware.
On Fri, Jan 21, 2011 at 04:29:41PM +0200, Michael S. Tsirkin wrote: > On Fri, Jan 21, 2011 at 07:44:16PM +0900, Isaku Yamahata wrote: > > On Thu, Jan 20, 2011 at 04:15:48PM +0200, Michael S. Tsirkin wrote: > > > On Thu, Jan 20, 2011 at 03:57:39PM +0900, Isaku Yamahata wrote: > > > > make pci_find_device() ARI aware. > > > > > > > > Signed-off-by: Isaku Yamahata > > > > --- > > > > hw/pci.c |7 +++ > > > > 1 files changed, 7 insertions(+), 0 deletions(-) > > > > > > > > diff --git a/hw/pci.c b/hw/pci.c > > > > index 8d0e3df..851f350 100644 > > > > --- a/hw/pci.c > > > > +++ b/hw/pci.c > > > > @@ -1596,11 +1596,18 @@ PCIBus *pci_find_bus(PCIBus *bus, int bus_num) > > > > > > > > PCIDevice *pci_find_device(PCIBus *bus, int bus_num, int slot, int > > > > function) > > > > { > > > > +PCIDevice *d; > > > > bus = pci_find_bus(bus, bus_num); > > > > > > > > if (!bus) > > > > return NULL; > > > > > > > > +d = bus->parent_dev; > > > > +if (d && pci_is_express(d) && > > > > +pcie_cap_get_type(d) == PCI_EXP_TYPE_DOWNSTREAM && > > > > +!pcie_cap_is_ari_enabled(d) && slot > 0) { > > > > +return NULL; > > > > +} > > > > return bus->devices[PCI_DEVFN(slot, function)]; > > > > } > > > > > > I'd like to split this express-specific code out in some way. > > > Further, the downstream port always has a single slot. > > > Maybe it shouldn't be a bus at all, but this needs some thought. > > > > Yes, it needs consideration. > > > > > > > How about we put flag in PCIBus that says that a single > > > slot is supported? Downstream ports would just set it. > > > > So such a flag must be set/clear by something like > > pcie_cap_ari_write_config() > > depending on ARI bit at runtime. > > Well, to figure it out, could you please describe what is the situation > your patch tries to fix? I would generally would like a reason for the > change to be given in commit logs, please try to avoid just restating > what the patch does. It seems that I should have added the comment to refer the spec. I'd like to implement ARI enable bit correctly. Downstream port(and root port) doesn't forward pci transaction for function > 7 by default for compatibility, Only when ARI forwarding enable bit of downstream/root port is set, the virtual p2p bridge forwards pci transaction for function > 7 (i.e. slot > 0). 6.13 Alternative routing-ID Interpretation(ARI) 7.8.15 Device capabilites 2 register ARI forwarding supproted 7.8.16 Device control 2 register ARI forwarding Enable 5 ARI Forwarding Enable When set, the Downstream Port disables its traditional Device Number field being 0 enforcement when turning a Type 1 Configuration Request into a Type 0 Configuration Request, permitting access to Extended Functions in an ARI Device immediately below the Port. See Section 6.13. Default value of this bit is 0b. Must be hardwired to 0b if the ARI Forwarding Supported bit is 0b. Oh, the patch should check root port in addition to downstream port. > Are you trying to create a device with > 8 functions? > If that is the case I suspect this is not the best way > to do this at all. > > > pcie device can have 256 functions instead of 8. > > Only if it's an ARI device. And note that if you have a device with > 256 functions and disable ARI in the port, it appears as > multiple devices. > > > Maybe we'd like to emulate how p2p bridge transfers pci transaction > > to child pci bus somehow. > > To support > 8 functions per device, some refactoring would be needed: > you can not figure out slot and function from the root bus, > it depends on ARI along the way. So APIs that pass in > decoded slot/function do not make sense anymore, > you must pass in devfn all the way. > > But since everyone decodes and encodes them in the same way, > many things will work even without decoding. I think there are only two issues to address. Configuration space and hot plug. As you already described it, ARI is defined such that APIs can stay same, just interpret slot bits as part of function number. This patch addresses configuration space access. Multifunction hot plug hasn't been addressed even for conventional pci case. Some kind of refactoring would be necessary for it, I think. -- yamahata
[Qemu-devel] Re: [RFC][PATCH v6 08/23] virtagent: add va.getfile RPC
On 01/17/11 14:15, Michael Roth wrote: > Add RPC to retrieve a guest file. This interface is intended > for smaller reads like peeking at logs and /proc and such. > > Signed-off-by: Michael Roth > --- > virtagent-server.c | 59 > > 1 files changed, 59 insertions(+), 0 deletions(-) > > diff --git a/virtagent-server.c b/virtagent-server.c > index c38a9e0..af4b940 100644 > --- a/virtagent-server.c > +++ b/virtagent-server.c > @@ -62,12 +62,71 @@ out: > return ret; > } > > +/* RPC functions common to guest/host daemons */ > + > +/* va_getfile(): return file contents > + * rpc return values: > + * - base64-encoded file contents > + */ > +static xmlrpc_value *va_getfile(xmlrpc_env *env, > +xmlrpc_value *params, > +void *user_data) > +{ > +const char *path; > +char *file_contents = NULL; > +char buf[VA_FILEBUF_LEN]; malloc()! > +int fd, ret, count = 0; > +xmlrpc_value *result = NULL; > + > +/* parse argument array */ > +xmlrpc_decompose_value(env, params, "(s)", &path); > +if (env->fault_occurred) { > +return NULL; > +} > + > +SLOG("va_getfile(), path:%s", path); > + > +fd = open(path, O_RDONLY); > +if (fd == -1) { > +LOG("open failed: %s", strerror(errno)); > +xmlrpc_faultf(env, "open failed: %s", strerror(errno)); > +return NULL; > +} > + > +while ((ret = read(fd, buf, VA_FILEBUF_LEN)) > 0) { > +file_contents = qemu_realloc(file_contents, count + VA_FILEBUF_LEN); > +memcpy(file_contents + count, buf, ret); Sorry, I brought this up before. This realloc() stuff is a disaster waiting to happen. Please remove it from the patch series, until you have an implementation that copies over a page of the time. > +count += ret; Cheers, Jes
[Qemu-devel] Re: [RFC][PATCH v6 09/23] virtagent: add agent_viewfile qmp/hmp command
On 01/17/11 14:15, Michael Roth wrote: > Utilize the getfile RPC to provide a means to view text files in the > guest. Getfile can handle binary files as well but we don't advertise > that here due to the special handling requiring to store it and provide > it back to the user (base64 encoding it for instance). Hence the > otherwise confusing "viewfile" as opposed to "getfile". I am really against this in any shape or form. Please do a copy and view on the host. Cheers, Jes
Re: [Qemu-devel] [RFC PATCH] Fake machine for scalability testing
Anthony Liguori writes: > On 01/21/2011 04:38 AM, Markus Armbruster wrote: >> Anthony Liguori writes: >> >> >>> On 01/20/2011 11:12 AM, Markus Armbruster wrote: >>> Anthony Liguori writes: > On 01/18/2011 02:16 PM, Markus Armbruster wrote: > > >> The problem: you want to do serious scalability testing (1000s of VMs) >> of your management stack. If each guest eats up a few 100MiB and >> competes for CPU, that requires a serious host machine. Which you don't >> have. You also don't want to modify the management stack at all, if you >> can help it. >> >> The solution: a perfectly normal-looking QEMU that uses minimal >> resources. Ability to execute any guest code is strictly optional ;) >> >> New option -fake-machine creates a fake machine incapable of running >> guest code. Completely compiled out by default, enable with configure >> --enable-fake-machine. >> >> With -fake-machine, CPU use is negligible, and memory use is rather >> modest. >> >> Non-fake VM running F-14 live, right after boot: >> UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD >> armbru 15707 2558 53 191837 414388 1 21:05 pts/300:00:29 [...] >> >> Same VM -fake-machine, after similar time elapsed: >> UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD >> armbru 15742 2558 0 85129 9412 0 21:07 pts/300:00:00 [...] >> >> We're using a very similar patch for RHEL scalability testing. >> >> >> > Interesting, but: > >9432 anthony 20 0 153m 14m 5384 S0 0.2 0:00.22 > qemu-system-x86 > > That's qemu-system-x86 -m 4 > > Sure you ran qemu-system-x86 -fake-machine? >>> No, I didn't try it. My point was that -m 4 is already pretty small. >>> >> Ah! >> >> However, it's not as small as -fake-machine, and eats all the CPU it can >> get. >> >> None-fake VM as above, but with -m 4: >> UIDPID PPID CSZ RSS PSR STIME TTY TIME CMD >> armbru 19325 2558 93 39869 17020 1 11:30 pts/300:00:42 [...] >> >> And I believe we can make -fake-machine use even less memory than now, >> with a little more work. >> >> > In terms of memory overhead, the largest source is not really going to > be addressed by -fake-machine (l1_phys_map and phys_ram_dirty). > > git-grep phys_ram_dirty finds nothing. >>> Yeah, it's now ram_list[i].phys_dirty. >>> >>> l1_phys_map is (sizeof(void *) + sizeof(PhysPageDesc)) * mem_size_in_pages >>> >>> phys_dirty is mem_size_in_pages bytes. >>> >> Thanks. >> >> > I don't really understand the point of not creating a VCPU with KVM. > Is there some type of overhead in doing that? > > I briefly looked at both main loops, TCG's was the first one I happened to crack, and I didn't feel like doing both then. If the general approach is okay, I'll gladly investigate how to do it with KVM. >>> I guess what I don't understand is why do you need to not run guest >>> code? Specifically, if you remove the following, is it any less >>> useful? >>> >>> diff --git a/cpu-exec.c b/cpu-exec.c >>> index 8c9fb8b..cd1259a 100644 >>> --- a/cpu-exec.c >>> +++ b/cpu-exec.c >>> @@ -230,7 +230,7 @@ int cpu_exec(CPUState *env1) >>> uint8_t *tc_ptr; >>> unsigned long next_tb; >>> >>> -if (cpu_halted(env1) == EXCP_HALTED) >>> +if (fake_machine || cpu_halted(env1) == EXCP_HALTED) >>> >>> return EXCP_HALTED; >>> >> I don't want 1000s of guests running infinite "not enough memory to do >> anything useful, panic!" reboot loops. Because that's 1000s of guests >> competing for CPU. >> > > Hrm, that's not the behavior I see. With no bootable drive, the BIOS > will spin in a HLT loop as part of int18. Aha. I used a bootable drive. Using a non-bootable drive may well curb the CPU use sufficiently. Not sure we can always do that in our testing. The less we have to hack up the stack for testing, the better.
[Qemu-devel] [PATCH 4/5] spitz: make spitz-keyboard to use qdev infrastructure
Signed-off-by: Dmitry Eremin-Solenikov --- hw/spitz.c | 127 ++-- 1 files changed, 72 insertions(+), 55 deletions(-) diff --git a/hw/spitz.c b/hw/spitz.c index c69a121..e3ece05 100644 --- a/hw/spitz.c +++ b/hw/spitz.c @@ -219,11 +219,10 @@ static const int spitz_gpiomap[5] = { SPITZ_GPIO_AK_INT, SPITZ_GPIO_SYNC, SPITZ_GPIO_ON_KEY, SPITZ_GPIO_SWA, SPITZ_GPIO_SWB, }; -static int spitz_gpio_invert[5] = { 0, 0, 0, 0, 0, }; typedef struct { +SysBusDevice busdev; qemu_irq sense[SPITZ_KEY_SENSE_NUM]; -qemu_irq *strobe; qemu_irq gpiomap[5]; int keymap[0x80]; uint16_t keyrow[SPITZ_KEY_SENSE_NUM]; @@ -274,8 +273,7 @@ static void spitz_keyboard_keydown(SpitzKeyboardState *s, int keycode) /* Handle the additional keys */ if ((spitz_keycode >> 4) == SPITZ_KEY_SENSE_NUM) { -qemu_set_irq(s->gpiomap[spitz_keycode & 0xf], (keycode < 0x80) ^ -spitz_gpio_invert[spitz_keycode & 0xf]); +qemu_set_irq(s->gpiomap[spitz_keycode & 0xf], (keycode < 0x80)); return; } @@ -293,8 +291,9 @@ static void spitz_keyboard_keydown(SpitzKeyboardState *s, int keycode) #define QUEUE_KEY(c) s->fifo[(s->fifopos + s->fifolen ++) & 0xf] = c -static void spitz_keyboard_handler(SpitzKeyboardState *s, int keycode) +static void spitz_keyboard_handler(void *opaque, int keycode) { +SpitzKeyboardState *s = opaque; uint16_t code; int mapcode; switch (keycode) { @@ -440,34 +439,15 @@ static void spitz_keyboard_pre_map(SpitzKeyboardState *s) s->imodifiers = 0; s->fifopos = 0; s->fifolen = 0; -s->kbdtimer = qemu_new_timer(vm_clock, spitz_keyboard_tick, s); -spitz_keyboard_tick(s); } #undef SHIFT #undef CTRL #undef FN -static void spitz_keyboard_save(QEMUFile *f, void *opaque) +static int spitz_keyboard_post_load(void *opaque, int version_id) { SpitzKeyboardState *s = (SpitzKeyboardState *) opaque; -int i; - -qemu_put_be16s(f, &s->sense_state); -qemu_put_be16s(f, &s->strobe_state); -for (i = 0; i < 5; i ++) -qemu_put_byte(f, spitz_gpio_invert[i]); -} - -static int spitz_keyboard_load(QEMUFile *f, void *opaque, int version_id) -{ -SpitzKeyboardState *s = (SpitzKeyboardState *) opaque; -int i; - -qemu_get_be16s(f, &s->sense_state); -qemu_get_be16s(f, &s->strobe_state); -for (i = 0; i < 5; i ++) -spitz_gpio_invert[i] = qemu_get_byte(f); /* Release all pressed keys */ memset(s->keyrow, 0, sizeof(s->keyrow)); @@ -482,36 +462,55 @@ static int spitz_keyboard_load(QEMUFile *f, void *opaque, int version_id) static void spitz_keyboard_register(PXA2xxState *cpu) { -int i, j; +int i; +DeviceState *dev; SpitzKeyboardState *s; -s = (SpitzKeyboardState *) -qemu_mallocz(sizeof(SpitzKeyboardState)); -memset(s, 0, sizeof(SpitzKeyboardState)); - -for (i = 0; i < 0x80; i ++) -s->keymap[i] = -1; -for (i = 0; i < SPITZ_KEY_SENSE_NUM + 1; i ++) -for (j = 0; j < SPITZ_KEY_STROBE_NUM; j ++) -if (spitz_keymap[i][j] != -1) -s->keymap[spitz_keymap[i][j]] = (i << 4) | j; +dev = sysbus_create_simple("spitz-keyboard", -1, NULL); +s = FROM_SYSBUS(SpitzKeyboardState, sysbus_from_qdev(dev)); for (i = 0; i < SPITZ_KEY_SENSE_NUM; i ++) -s->sense[i] = pxa2xx_gpio_in_get(cpu->gpio)[spitz_gpio_key_sense[i]]; +qdev_connect_gpio_out(dev, i, pxa2xx_gpio_in_get(cpu->gpio)[spitz_gpio_key_sense[i]]); for (i = 0; i < 5; i ++) s->gpiomap[i] = pxa2xx_gpio_in_get(cpu->gpio)[spitz_gpiomap[i]]; -s->strobe = qemu_allocate_irqs(spitz_keyboard_strobe, s, -SPITZ_KEY_STROBE_NUM); +if (!graphic_rotate) +s->gpiomap[4] = qemu_irq_invert(s->gpiomap[4]); + +for (i = 0; i < 5; i++) +qemu_set_irq(s->gpiomap[i], 0); + for (i = 0; i < SPITZ_KEY_STROBE_NUM; i ++) -pxa2xx_gpio_out_set(cpu->gpio, spitz_gpio_key_strobe[i], s->strobe[i]); +pxa2xx_gpio_out_set(cpu->gpio, spitz_gpio_key_strobe[i], +qdev_get_gpio_in(dev, i)); + +qemu_mod_timer(s->kbdtimer, qemu_get_clock(vm_clock)); + +qemu_add_kbd_event_handler(spitz_keyboard_handler, s); +} + +static int spitz_keyboard_init(SysBusDevice *dev) +{ +SpitzKeyboardState *s; +int i, j; + +s = FROM_SYSBUS(SpitzKeyboardState, dev); + +for (i = 0; i < 0x80; i ++) +s->keymap[i] = -1; +for (i = 0; i < SPITZ_KEY_SENSE_NUM + 1; i ++) +for (j = 0; j < SPITZ_KEY_STROBE_NUM; j ++) +if (spitz_keymap[i][j] != -1) +s->keymap[spitz_keymap[i][j]] = (i << 4) | j; spitz_keyboard_pre_map(s); -qemu_add_kbd_event_handler((QEMUPutKBDEvent *) spitz_keyboard_handler, s); -register_savevm(NULL, "spitz_keyboard", 0, 0, -spitz_keyboard_save, spitz_keyboard_load, s); +s->kbdtimer =
[Qemu-devel] Re: [PATCH 3/5] blockdev: Reject multiple definitions for the same drive
Kevin Wolf writes: > Am 17.01.2011 19:31, schrieb Markus Armbruster: >> For reasons lost in the mist of time, we silently ignore multiple >> definitions for the same drive: >> >> $ qemu-system-x86_64 -nodefaults -vnc :1 -S -monitor stdio -drive >> if=ide,index=1,file=tmp.qcow2 -drive if=ide,index=1,file=nonexistant >> QEMU 0.13.50 monitor - type 'help' for more information >> (qemu) info block >> ide0-hd1: type=hd removable=0 file=tmp.qcow2 backing_file=tmp.img ro=0 >> drv=qcow2 encrypted=0 >> >> With if=none, this can become quite confusing: >> >> $ qemu-system-x86_64 -nodefaults -vnc :1 -S -monitor stdio -drive >> if=none,index=1,file=tmp.qcow2,id=eins -drive >> if=none,index=1,file=nonexistant,id=zwei -device ide-drive,drive=eins >> -device ide-drive,drive=zwei >> qemu-system-x86_64: -device ide-drive,drive=zwei: Property >> 'ide-drive.drive' can't find value 'zwei' >> >> The second -device fails, because it refers to drive zwei, which got >> silently ignored. >> >> Make multiple drive definitions fail cleanly. >> >> Signed-off-by: Markus Armbruster > > Dropped this one (and patch 5, which depends on it) from the block > branch again, it breaks -cdrom and probably other drives which are > created by default. --verbose? I was wondering what crap could depend on the crazy silent ignore...
[Qemu-devel] [PATCH 5/5] pxa2xx_gpio: switch to using qdev
Signed-off-by: Dmitry Eremin-Solenikov --- hw/gumstix.c |4 +- hw/pxa.h | 10 +--- hw/pxa2xx.c |4 +- hw/pxa2xx_gpio.c | 150 ++ hw/spitz.c | 34 ++-- hw/tosa.c| 12 ++-- 6 files changed, 103 insertions(+), 111 deletions(-) diff --git a/hw/gumstix.c b/hw/gumstix.c index af8b464..ee63f63 100644 --- a/hw/gumstix.c +++ b/hw/gumstix.c @@ -78,7 +78,7 @@ static void connex_init(ram_addr_t ram_size, /* Interrupt line of NIC is connected to GPIO line 36 */ smc91c111_init(&nd_table[0], 0x04000300, -pxa2xx_gpio_in_get(cpu->gpio)[36]); +qdev_get_gpio_in(cpu->gpio, 36)); } static void verdex_init(ram_addr_t ram_size, @@ -117,7 +117,7 @@ static void verdex_init(ram_addr_t ram_size, /* Interrupt line of NIC is connected to GPIO line 99 */ smc91c111_init(&nd_table[0], 0x04000300, -pxa2xx_gpio_in_get(cpu->gpio)[99]); +qdev_get_gpio_in(cpu->gpio, 99)); } static QEMUMachine connex_machine = { diff --git a/hw/pxa.h b/hw/pxa.h index 8d6a8c3..f73d33b 100644 --- a/hw/pxa.h +++ b/hw/pxa.h @@ -70,13 +70,9 @@ void pxa25x_timer_init(target_phys_addr_t base, qemu_irq *irqs); void pxa27x_timer_init(target_phys_addr_t base, qemu_irq *irqs, qemu_irq irq4); /* pxa2xx_gpio.c */ -typedef struct PXA2xxGPIOInfo PXA2xxGPIOInfo; -PXA2xxGPIOInfo *pxa2xx_gpio_init(target_phys_addr_t base, +DeviceState *pxa2xx_gpio_init(target_phys_addr_t base, CPUState *env, qemu_irq *pic, int lines); -qemu_irq *pxa2xx_gpio_in_get(PXA2xxGPIOInfo *s); -void pxa2xx_gpio_out_set(PXA2xxGPIOInfo *s, -int line, qemu_irq handler); -void pxa2xx_gpio_read_notifier(PXA2xxGPIOInfo *s, qemu_irq handler); +void pxa2xx_gpio_read_notifier(DeviceState *dev, qemu_irq handler); /* pxa2xx_dma.c */ typedef struct PXA2xxDMAState PXA2xxDMAState; @@ -132,7 +128,7 @@ typedef struct { qemu_irq *pic; qemu_irq reset; PXA2xxDMAState *dma; -PXA2xxGPIOInfo *gpio; +DeviceState *gpio; PXA2xxLCDState *lcd; SSIBus **ssp; PXA2xxI2CState *i2c[2]; diff --git a/hw/pxa2xx.c b/hw/pxa2xx.c index 6e72a5c..d966846 100644 --- a/hw/pxa2xx.c +++ b/hw/pxa2xx.c @@ -2158,7 +2158,7 @@ PXA2xxState *pxa270_init(unsigned int sdram_size, const char *revision) /* GPIO1 resets the processor */ /* The handler can be overridden by board-specific code */ -pxa2xx_gpio_out_set(s->gpio, 1, s->reset); +qdev_connect_gpio_out(s->gpio, 1, s->reset); return s; } @@ -2279,7 +2279,7 @@ PXA2xxState *pxa255_init(unsigned int sdram_size) /* GPIO1 resets the processor */ /* The handler can be overridden by board-specific code */ -pxa2xx_gpio_out_set(s->gpio, 1, s->reset); +qdev_connect_gpio_out(s->gpio, 1, s->reset); return s; } diff --git a/hw/pxa2xx_gpio.c b/hw/pxa2xx_gpio.c index 0d03446..789965d 100644 --- a/hw/pxa2xx_gpio.c +++ b/hw/pxa2xx_gpio.c @@ -8,15 +8,18 @@ */ #include "hw.h" +#include "sysbus.h" #include "pxa.h" #define PXA2XX_GPIO_BANKS 4 +typedef struct PXA2xxGPIOInfo PXA2xxGPIOInfo; struct PXA2xxGPIOInfo { -qemu_irq *pic; +SysBusDevice busdev; +qemu_irq irq0, irq1, irqX; int lines; +int ncpu; CPUState *cpu_env; -qemu_irq *in; /* XXX: GNU C vectors are more suitable */ uint32_t ilevel[PXA2XX_GPIO_BANKS]; @@ -66,19 +69,19 @@ static struct { static void pxa2xx_gpio_irq_update(PXA2xxGPIOInfo *s) { if (s->status[0] & (1 << 0)) -qemu_irq_raise(s->pic[PXA2XX_PIC_GPIO_0]); +qemu_irq_raise(s->irq0); else -qemu_irq_lower(s->pic[PXA2XX_PIC_GPIO_0]); +qemu_irq_lower(s->irq0); if (s->status[0] & (1 << 1)) -qemu_irq_raise(s->pic[PXA2XX_PIC_GPIO_1]); +qemu_irq_raise(s->irq1); else -qemu_irq_lower(s->pic[PXA2XX_PIC_GPIO_1]); +qemu_irq_lower(s->irq1); if ((s->status[0] & ~3) | s->status[1] | s->status[2] | s->status[3]) -qemu_irq_raise(s->pic[PXA2XX_PIC_GPIO_X]); +qemu_irq_raise(s->irqX); else -qemu_irq_lower(s->pic[PXA2XX_PIC_GPIO_X]); +qemu_irq_lower(s->irqX); } /* Bitmap of pins used as standby and sleep wake-up sources. */ @@ -249,96 +252,89 @@ static CPUWriteMemoryFunc * const pxa2xx_gpio_writefn[] = { pxa2xx_gpio_write }; -static void pxa2xx_gpio_save(QEMUFile *f, void *opaque) -{ -PXA2xxGPIOInfo *s = (PXA2xxGPIOInfo *) opaque; -int i; - -qemu_put_be32(f, s->lines); - -for (i = 0; i < PXA2XX_GPIO_BANKS; i ++) { -qemu_put_be32s(f, &s->ilevel[i]); -qemu_put_be32s(f, &s->olevel[i]); -qemu_put_be32s(f, &s->dir[i]); -qemu_put_be32s(f, &s->rising[i]); -qemu_put_be32s(f, &s->falling[i]); -qemu_put_be32s(f, &s->status[i]); -qemu_put_be32s(f, &s->gafr[i * 2 + 0]); -qemu_put_be32s(f, &s->gafr[i * 2 + 1]); -
[Qemu-devel] [PATCH v3 5/5] SPARC: Add asr17 register support
This register is activated by CPU_FEATURE_ASR17 in the feature field. Signed-off-by: Fabien Chouteau --- target-sparc/cpu.h |1 + target-sparc/helper.c|3 ++- target-sparc/translate.c | 11 +++ 3 files changed, 14 insertions(+), 1 deletions(-) diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h index 5c50d9e..6f5990b 100644 --- a/target-sparc/cpu.h +++ b/target-sparc/cpu.h @@ -267,6 +267,7 @@ typedef struct sparc_def_t { #define CPU_FEATURE_CMT (1 << 12) #define CPU_FEATURE_GL (1 << 13) #define CPU_FEATURE_TA0_SHUTDOWN (1 << 14) /* Shutdown on "ta 0x0" */ +#define CPU_FEATURE_ASR17(1 << 15) #ifndef TARGET_SPARC64 #define CPU_DEFAULT_FEATURES (CPU_FEATURE_FLOAT | CPU_FEATURE_SWAP | \ CPU_FEATURE_MUL | CPU_FEATURE_DIV | \ diff --git a/target-sparc/helper.c b/target-sparc/helper.c index ec6ac27..2f3d1e6 100644 --- a/target-sparc/helper.c +++ b/target-sparc/helper.c @@ -1288,7 +1288,8 @@ static const sparc_def_t sparc_defs[] = { .mmu_sfsr_mask = 0x, .mmu_trcr_mask = 0x, .nwindows = 8, -.features = CPU_DEFAULT_FEATURES | CPU_FEATURE_TA0_SHUTDOWN, +.features = CPU_DEFAULT_FEATURES | CPU_FEATURE_TA0_SHUTDOWN | +CPU_FEATURE_ASR17, }, #endif }; diff --git a/target-sparc/translate.c b/target-sparc/translate.c index dff0f19..e26462e 100644 --- a/target-sparc/translate.c +++ b/target-sparc/translate.c @@ -2067,6 +2067,17 @@ static void disas_sparc_insn(DisasContext * dc) case 0x10 ... 0x1f: /* implementation-dependent in the SPARCv8 manual, rdy on the microSPARC II */ +/* Read Asr17 */ +if (rs1 == 0x11 && dc->def->features & CPU_FEATURE_ASR17) { +TCGv r_const; + +/* Read Asr17 for a Leon3 monoprocessor */ +r_const = tcg_const_tl((1 << 8) + | (dc->def->nwindows - 1)); +gen_movl_TN_reg(rd, r_const); +tcg_temp_free(r_const); +break; +} #endif gen_movl_TN_reg(rd, cpu_y); break; -- 1.7.1
[Qemu-devel] [PATCH v3 3/5] SPARC: Emulation of GRLIB APB UART
This device exposes one parameter: - chardev (ptr) : Pointer to a qemu character device Emulation of GrLib devices is base on the GRLIB IP Core User's Manual: http://www.gaisler.com/products/grlib/grip.pdf Signed-off-by: Fabien Chouteau --- hw/grlib.h | 23 ++ hw/grlib_apbuart.c | 189 trace-events |4 + 3 files changed, 216 insertions(+), 0 deletions(-) diff --git a/hw/grlib.h b/hw/grlib.h index f92d6d3..fdf4b11 100644 --- a/hw/grlib.h +++ b/hw/grlib.h @@ -100,4 +100,27 @@ DeviceState *grlib_gptimer_create(target_phys_addr_t base, return dev; } +/* APB UART */ + +static inline +DeviceState *grlib_apbuart_create(target_phys_addr_t base, + CharDriverState*serial, + qemu_irqirq) +{ +DeviceState *dev; + +dev = qdev_create(NULL, "grlib,apbuart"); +qdev_prop_set_chr(dev, "chrdev", serial); + +if (qdev_init(dev)) { +return NULL; +} + +sysbus_mmio_map(sysbus_from_qdev(dev), 0, base); + +sysbus_connect_irq(sysbus_from_qdev(dev), 0, irq); + +return dev; +} + #endif /* ! _GRLIB_H_ */ diff --git a/hw/grlib_apbuart.c b/hw/grlib_apbuart.c new file mode 100644 index 000..e603419 --- /dev/null +++ b/hw/grlib_apbuart.c @@ -0,0 +1,189 @@ +/* + * QEMU GRLIB APB UART Emulator + * + * Copyright (c) 2010-2011 AdaCore + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include "sysbus.h" +#include "qemu-char.h" + +#include "trace.h" + +#define UART_REG_SIZE 20 /* Size of memory mapped registers */ + +/* UART status register fields */ +#define UART_DATA_READY (1 << 0) +#define UART_TRANSMIT_SHIFT_EMPTY (1 << 1) +#define UART_TRANSMIT_FIFO_EMPTY (1 << 2) +#define UART_BREAK_RECEIVED (1 << 3) +#define UART_OVERRUN (1 << 4) +#define UART_PARITY_ERROR (1 << 5) +#define UART_FRAMING_ERROR(1 << 6) +#define UART_TRANSMIT_FIFO_HALF (1 << 7) +#define UART_RECEIVE_FIFO_HALF(1 << 8) +#define UART_TRANSMIT_FIFO_FULL (1 << 9) +#define UART_RECEIVE_FIFO_FULL(1 << 10) + +/* UART control register fields */ +#define UART_RECEIVE_ENABLE (1 << 0) +#define UART_TRANSMIT_ENABLE (1 << 1) +#define UART_RECEIVE_INTERRUPT (1 << 2) +#define UART_TRANSMIT_INTERRUPT (1 << 3) +#define UART_PARITY_SELECT (1 << 4) +#define UART_PARITY_ENABLE (1 << 5) +#define UART_FLOW_CONTROL(1 << 6) +#define UART_LOOPBACK(1 << 7) +#define UART_EXTERNAL_CLOCK (1 << 8) +#define UART_RECEIVE_FIFO_INTERRUPT (1 << 9) +#define UART_TRANSMIT_FIFO_INTERRUPT (1 << 10) +#define UART_FIFO_DEBUG_MODE (1 << 11) +#define UART_OUTPUT_ENABLE (1 << 12) +#define UART_FIFO_AVAILABLE (1 << 31) + +/* Memory mapped register offsets */ +#define DATA_OFFSET 0x00 +#define STATUS_OFFSET 0x04 +#define CONTROL_OFFSET0x08 +#define SCALER_OFFSET 0x0C /* not supported */ +#define FIFO_DEBUG_OFFSET 0x10 /* not supported */ + +typedef struct UART +{ +SysBusDevice busdev; + +qemu_irq irq; + +CharDriverState *chr; + +/* registers */ +uint32_t receive; +uint32_t status; +uint32_t control; +} UART; + +static int grlib_apbuart_can_receive(void *opaque) +{ +UART *uart = opaque; + +return !!(uart->status & UART_DATA_READY); +} + +static void grlib_apbuart_receive(void *opaque, const uint8_t *buf, int size) +{ +UART *uart = opaque; + +uart->receive = *buf; +uart->status |= UART_DATA_READY; + +if (uart->control & UART_RECEIVE_INTERRUPT) { +qemu_irq_pulse(uart->irq); +} +} + +static void grlib_apbuart_event(void *opaque, int event) +{ +trace_grlib_apbuart_event(event); +} + +static void +grlib_apbuart_writel(void *opaque, target_phys_addr_t addr, uint32_t value) +{ +UART *uar
[Qemu-devel] [PATCH v3 0/5][RFC] New SPARC machine: Leon3
Hello Qemu-devel, Here is the third version of Leon3 emulation patch-set. Modifications since v2: - Tracepoints - DEFINE_PROP_* macros - New interface to trigger interrupts on Leon3 (set_pil_in:leon3.c) - Minor reformating Please feel free to comment. Regards, --- This patch set introduces a new SPARC V8 machine: Leon3. It's an open-source VHDL System-On-Chip, well known in space industry (more information on http://www.gaisler.com). Leon3 is made of multiple components available in the GrLib VHDL library. Three devices are implemented: uart, timers and IRQ manager. You can find code for these peripherals in the grlib_* files. Modifications have been done to the SPARC cpu emulation code to handle Leon3's specific behavior: - IRQ management - Cache control - Asr17 (implementation-dependent Ancillary State Registers) - Shutdown Fabien Chouteau (5): SPARC: Emulation of GRLIB GPTimer SPARC: Emulation of GRLIB IRQMP SPARC: Emulation of GRLIB APB UART SPARC: Emulation of Leon3 SPARC: Add asr17 register support Makefile.target |5 +- hw/grlib.h | 126 +++ hw/grlib_apbuart.c | 189 ++ hw/grlib_gptimer.c | 401 ++ hw/grlib_irqmp.c | 380 +++ hw/leon3.c | 231 ++ target-sparc/cpu.h | 38 +++-- target-sparc/helper.c|8 +- target-sparc/helper.h|1 + target-sparc/op_helper.c | 154 +- target-sparc/translate.c | 24 +++- trace-events | 20 +++ 12 files changed, 1555 insertions(+), 22 deletions(-) create mode 100644 hw/grlib.h create mode 100644 hw/grlib_apbuart.c create mode 100644 hw/grlib_gptimer.c create mode 100644 hw/grlib_irqmp.c create mode 100644 hw/leon3.c
[Qemu-devel] [PATCH v3 2/5] SPARC: Emulation of GRLIB IRQMP
This device exposes two parameters: - set_pil_in(ptr) : A function to set the pil_in of the SPARC CPU - set_pil_in_opaque (ptr) : Opaque argument of the set_pil_in function Emulation of GrLib devices is base on the GRLIB IP Core User's Manual: http://www.gaisler.com/products/grlib/grip.pdf Signed-off-by: Fabien Chouteau --- hw/grlib.h | 38 ++ hw/grlib_irqmp.c | 380 ++ trace-events |6 + 3 files changed, 424 insertions(+), 0 deletions(-) diff --git a/hw/grlib.h b/hw/grlib.h index 776acf9..f92d6d3 100644 --- a/hw/grlib.h +++ b/hw/grlib.h @@ -32,6 +32,44 @@ * http://www.gaisler.com/products/grlib/grip.pdf */ +/* IRQMP */ + +typedef void (*set_pil_in_fn) (void *opaque, uint32_t pil_in); + +void grlib_irqmp_set_irq(void *opaque, int irq, int level); + +void grlib_irqmp_ack(DeviceState *dev, int intno); + +static inline +DeviceState *grlib_irqmp_create(target_phys_addr_t base, +CPUState*env, +qemu_irq **cpu_irqs, +uint32_t nr_irqs, +set_pil_in_fnset_pil_in) +{ +DeviceState *dev; + +assert(cpu_irqs != NULL); + +dev = qdev_create(NULL, "grlib,irqmp"); +qdev_prop_set_ptr(dev, "set_pil_in", set_pil_in); +qdev_prop_set_ptr(dev, "set_pil_in_opaque", env); + +if (qdev_init(dev)) { +return NULL; +} + +env->irq_manager = dev; + +sysbus_mmio_map(sysbus_from_qdev(dev), 0, base); + +*cpu_irqs = qemu_allocate_irqs(grlib_irqmp_set_irq, + dev, + nr_irqs); + +return dev; +} + /* GPTimer */ static inline diff --git a/hw/grlib_irqmp.c b/hw/grlib_irqmp.c new file mode 100644 index 000..3db35ec --- /dev/null +++ b/hw/grlib_irqmp.c @@ -0,0 +1,380 @@ +/* + * QEMU GRLIB IRQMP Emulator + * + * (Multiprocessor and extended interrupt not supported) + * + * Copyright (c) 2010-2011 AdaCore + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include "sysbus.h" +#include "cpu.h" + +#include "grlib.h" + +#include "trace.h" + +#define IRQMP_MAX_CPU 16 +#define IRQMP_REG_SIZE 256 /* Size of memory mapped registers */ + +/* Memory mapped register offsets */ +#define LEVEL_OFFSET 0x00 +#define PENDING_OFFSET 0x04 +#define FORCE0_OFFSET0x08 +#define CLEAR_OFFSET 0x0C +#define MP_STATUS_OFFSET 0x10 +#define BROADCAST_OFFSET 0x14 +#define MASK_OFFSET 0x40 +#define FORCE_OFFSET 0x80 +#define EXTENDED_OFFSET 0xC0 + +typedef struct IRQMPState IRQMPState; + +typedef struct IRQMP +{ +SysBusDevice busdev; + +void *set_pil_in; +void *set_pil_in_opaque; + +IRQMPState *state; +} IRQMP; + +struct IRQMPState +{ +uint32_t level; +uint32_t pending; +uint32_t clear; +uint32_t broadcast; + +uint32_t mask[IRQMP_MAX_CPU]; +uint32_t force[IRQMP_MAX_CPU]; +uint32_t extended[IRQMP_MAX_CPU]; + +IRQMP*parent; +}; + +static void grlib_irqmp_check_irqs(IRQMPState *state) +{ +uint32_t pend = 0; +uint32_t level0 = 0; +uint32_t level1 = 0; +set_pil_in_fn set_pil_in; + +assert(state != NULL); +assert(state->parent != NULL); + +/* IRQ for CPU 0 (no SMP support) */ +pend = (state->pending | state->force[0]) +& state->mask[0]; + +level0 = pend & ~state->level; +level1 = pend & state->level; + +trace_grlib_irqmp_check_irqs(state->pending, state->force[0], + state->mask[0], level1, level0); + +set_pil_in = (set_pil_in_fn)state->parent->set_pil_in; + +/* Trigger level1 interrupt first and level0 if there is no level1 */ +if (level1 != 0) { +set_pil_in(state->parent->set_pil_in_opaque, level1); +} else { +set_pil_in(state->parent->set_pil_in_opaque, lev
Re: [Qemu-devel] Re: [PATCH v4 00/32] usb descriptor overhaul + more
On Fri, Jan 21, 2011 at 04:00:47PM +0100, Gerd Hoffmann wrote: > On 01/12/11 12:19, Gerd Hoffmann wrote: >>Hi, >> >> Here is a update of the usb descriptor patches and a bunch of >> improvements on top of that. This patch series features: > > [ ... ] > >> The following changes since commit 05bf441eb69a813d3893174d54faa6afa8c0d39b: >> >>cris: Remove unused orig_flags (2011-01-10 23:28:08 +0100) >> >> are available in the git repository at: >>git://anongit.freedesktop.org/spice/qemu usb.4 > > Ping? > Pulled. Thanks for all the good work. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
[Qemu-devel] [PATCH v3 1/5] SPARC: Emulation of GRLIB GPTimer
This device exposes three parameters: - frequency (uint32) : The system frequency - irq-line (uint32) : IRQ line number for the first timer (others use irq-line + 1, irq-line + 2...) - nr-timers (uint32) : Number of timers Emulation of GrLib devices is base on the GRLIB IP Core User's Manual: http://www.gaisler.com/products/grlib/grip.pdf Signed-off-by: Fabien Chouteau --- hw/grlib.h | 65 + hw/grlib_gptimer.c | 401 trace-events | 10 ++ 3 files changed, 476 insertions(+), 0 deletions(-) diff --git a/hw/grlib.h b/hw/grlib.h new file mode 100644 index 000..776acf9 --- /dev/null +++ b/hw/grlib.h @@ -0,0 +1,65 @@ +/* + * QEMU GRLIB Components + * + * Copyright (c) 2010-2011 AdaCore + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#ifndef _GRLIB_H_ +#define _GRLIB_H_ + +#include "qdev.h" +#include "sysbus.h" + +/* Emulation of GrLib device is base on the GRLIB IP Core User's Manual: + * http://www.gaisler.com/products/grlib/grip.pdf + */ + +/* GPTimer */ + +static inline +DeviceState *grlib_gptimer_create(target_phys_addr_t base, + uint32_tnr_timers, + uint32_tfreq, + qemu_irq *cpu_irqs, + int base_irq) +{ +DeviceState *dev; +int i; + +dev = qdev_create(NULL, "grlib,gptimer"); +qdev_prop_set_uint32(dev, "nr-timers", nr_timers); +qdev_prop_set_uint32(dev, "frequency", freq); +qdev_prop_set_uint32(dev, "irq-line", base_irq); + +if (qdev_init(dev)) { +return NULL; +} + +sysbus_mmio_map(sysbus_from_qdev(dev), 0, base); + +for (i = 0; i < nr_timers; i++) { +sysbus_connect_irq(sysbus_from_qdev(dev), i, cpu_irqs[base_irq + i]); +} + +return dev; +} + +#endif /* ! _GRLIB_H_ */ diff --git a/hw/grlib_gptimer.c b/hw/grlib_gptimer.c new file mode 100644 index 000..43661d3 --- /dev/null +++ b/hw/grlib_gptimer.c @@ -0,0 +1,401 @@ +/* + * QEMU GRLIB GPTimer Emulator + * + * Copyright (c) 2010-2011 AdaCore + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include "sysbus.h" +#include "qemu-timer.h" + +#include "trace.h" + +#define UNIT_REG_SIZE16 /* Size of memory mapped regs for the unit */ +#define GPTIMER_REG_SIZE 16 /* Size of memory mapped regs for a GPTimer */ + +#define GPTIMER_MAX_TIMERS 8 + +/* GPTimer Config register fields */ +#define GPTIMER_ENABLE (1 << 0) +#define GPTIMER_RESTART (1 << 1) +#define GPTIMER_LOAD(1 << 2) +#define GPTIMER_INT_ENABLE (1 << 3) +#define GPTIMER_INT_PENDING (1 << 4) +#define GPTIMER_CHAIN (1 << 5) /* Not supported */ +#define GPTIMER_DEBUG_HALT (1 << 6) /* Not supported */ + +/* Memory mapped register offsets */ +#define
[Qemu-devel] [PATCH v3 4/5] SPARC: Emulation of Leon3
Leon3 is an open-source VHDL System-On-Chip, well known in space industry (more information on http://www.gaisler.com). Leon3 is made of multiple components available in the GrLib VHDL library. Three devices are implemented: uart, timers and IRQ manager. You can find code for these peripherals in the grlib_* files. Signed-off-by: Fabien Chouteau --- Makefile.target |5 +- hw/leon3.c | 231 ++ target-sparc/cpu.h | 37 +--- target-sparc/helper.c|7 +- target-sparc/helper.h|1 + target-sparc/op_helper.c | 154 ++- target-sparc/translate.c | 13 ++- 7 files changed, 426 insertions(+), 22 deletions(-) diff --git a/Makefile.target b/Makefile.target index e15b1c4..efd8406 100644 --- a/Makefile.target +++ b/Makefile.target @@ -286,7 +286,10 @@ obj-sparc-y += cirrus_vga.o else obj-sparc-y = sun4m.o lance.o tcx.o sun4m_iommu.o slavio_intctl.o obj-sparc-y += slavio_timer.o slavio_misc.o sparc32_dma.o -obj-sparc-y += cs4231.o eccmemctl.o sbi.o sun4c_intctl.o +obj-sparc-y += cs4231.o eccmemctl.o sbi.o sun4c_intctl.o leon3.o + +# GRLIB +obj-sparc-y += grlib_gptimer.o grlib_irqmp.o grlib_apbuart.o endif obj-arm-y = integratorcp.o versatilepb.o arm_pic.o arm_timer.o diff --git a/hw/leon3.c b/hw/leon3.c new file mode 100644 index 000..f8f994a --- /dev/null +++ b/hw/leon3.c @@ -0,0 +1,231 @@ +/* + * QEMU Leon3 System Emulator + * + * Copyright (c) 2010-2011 AdaCore + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ +#include "hw.h" +#include "qemu-timer.h" +#include "qemu-char.h" +#include "sysemu.h" +#include "boards.h" +#include "loader.h" +#include "elf.h" + +#include "grlib.h" + +//#define DEBUG_LEON3 + +#ifdef DEBUG_LEON3 +#define DPRINTF(fmt, ...) \ +do { printf("Leon3: " fmt , ## __VA_ARGS__); } while (0) +#else +#define DPRINTF(fmt, ...) +#endif + +/* Default system clock. */ +#define CPU_CLK (40 * 1000 * 1000) + +#define PROM_FILENAME"u-boot.bin" + +#define MAX_PILS 16 + +typedef struct ResetData { +CPUState *env; +uint32_t entry;/* save kernel entry in case of reset */ +} ResetData; + +static void main_cpu_reset(void *opaque) +{ +ResetData *s = (ResetData *)opaque; +CPUState *env = s->env; + +cpu_reset(env); + +env->halted = 0; +env->pc = s->entry; +env->npc= s->entry + 4; +} + +static void leon3_irq_ack(void *irq_manager, int intno) +{ +grlib_irqmp_ack((DeviceState *)irq_manager, intno); +leon3_cache_control_int(); +} + +static void leon3_set_pil_in(void *opaque, uint32_t pil_in) +{ +CPUState *env = (CPUState *)opaque; + +assert(env != NULL); + +env->pil_in = pil_in; + +if (env->pil_in && (env->interrupt_index == 0 || +(env->interrupt_index & ~15) == TT_EXTINT)) { +unsigned int i; + +for (i = 15; i > 0; i--) { +if (env->pil_in & (1 << i)) { +int old_interrupt = env->interrupt_index; + +env->interrupt_index = TT_EXTINT | i; +if (old_interrupt != env->interrupt_index) { +DPRINTF("Set CPU IRQ %d\n", i); +cpu_interrupt(env, CPU_INTERRUPT_HARD); +} +break; +} +} +} else if (!env->pil_in && (env->interrupt_index & ~15) == TT_EXTINT) { +DPRINTF("Reset CPU IRQ %d\n", env->interrupt_index & 15); +env->interrupt_index = 0; +cpu_reset_interrupt(env, CPU_INTERRUPT_HARD); +} +} + +static void leon3_generic_hw_init(ram_addr_t ram_size, + const char *boot_device, + const char *kernel_filename, + const char *kernel_cmdline, + const char *initrd_filename, +
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 2011-01-21 17:37, Blue Swirl wrote: > On Fri, Jan 21, 2011 at 8:46 AM, Gerd Hoffmann wrote: >> Hi, >> >>> By the way, we don't have a QEMUState but instead use globals. >> >> /me wants to underline this. >> >> IMO it is absolutely pointless to worry about ways to pass around kvm_state. >> There never ever will be a serious need for that. >> >> We can stick with the current model of keeping global state in global >> variables. And just do the same with kvm_state. >> >> Or we can move to have all state in a QEMUState struct which we'll pass >> around basically everywhere. Then we can simply embed or reference >> kvm_state there. >> >> I'd tend to stick with the global variables as I don't see the point in >> having a QEMUstate. I doubt we'll ever see two virtual machines driven by a >> single qemu process. YMMV. > > Global variables are signs of a poor design. s/are/can be/. > QEMUState would not help > that, instead more specific structures should be designed, much like > what I've proposed for KVMState. Some of these new structures should > be even passed around when it makes sense. > > But I'd not start kvm_state redesign around global variables or QEMUState. We do not need to move individual fields yet, but we need to define classes of fields and strategies how to deal with them long-term. Then we can move forward, and that already in the right direction. Obvious classes are - static host capabilities and means for the KVM core to query them - per-VM fields - fields related to memory management And we now need at least a plan for the second class to proceed with the actual job. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH] checkpatch: Fix bracing false positives on #else
Signed-off-by: Jan Kiszka --- scripts/checkpatch.pl |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 55ef439..4fa06c0 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -2536,7 +2536,8 @@ sub process { } } if (!defined $suppress_ifbraces{$linenr - 1} && - $line =~ /\b(if|while|for|else)\b/) { + $line =~ /\b(if|while|for|else)\b/ && + $line !~ /\#\s*else/) { my $allowed = 0; # Check the pre-context. -- 1.7.1
[Qemu-devel] Re: [PATCH 3/5] blockdev: Reject multiple definitions for the same drive
Am 21.01.2011 17:58, schrieb Markus Armbruster: > Kevin Wolf writes: > >> Am 17.01.2011 19:31, schrieb Markus Armbruster: >>> For reasons lost in the mist of time, we silently ignore multiple >>> definitions for the same drive: >>> >>> $ qemu-system-x86_64 -nodefaults -vnc :1 -S -monitor stdio -drive >>> if=ide,index=1,file=tmp.qcow2 -drive if=ide,index=1,file=nonexistant >>> QEMU 0.13.50 monitor - type 'help' for more information >>> (qemu) info block >>> ide0-hd1: type=hd removable=0 file=tmp.qcow2 backing_file=tmp.img ro=0 >>> drv=qcow2 encrypted=0 >>> >>> With if=none, this can become quite confusing: >>> >>> $ qemu-system-x86_64 -nodefaults -vnc :1 -S -monitor stdio -drive >>> if=none,index=1,file=tmp.qcow2,id=eins -drive >>> if=none,index=1,file=nonexistant,id=zwei -device ide-drive,drive=eins >>> -device ide-drive,drive=zwei >>> qemu-system-x86_64: -device ide-drive,drive=zwei: Property >>> 'ide-drive.drive' can't find value 'zwei' >>> >>> The second -device fails, because it refers to drive zwei, which got >>> silently ignored. >>> >>> Make multiple drive definitions fail cleanly. >>> >>> Signed-off-by: Markus Armbruster >> >> Dropped this one (and patch 5, which depends on it) from the block >> branch again, it breaks -cdrom and probably other drives which are >> created by default. > > --verbose? > > I was wondering what crap could depend on the crazy silent ignore... Just try using -cdrom and you'll see yourself. >From what I understand, we always create the default device. If the user has actually specified one, we still try to create the default device, it fails and that failure was ignored until now (and with the patch applied qemu aborts in this case). Kevin
[Qemu-devel] Re: [PATCH] checkpatch: Fix bracing false positives on #else
Thanks, applied. On Fri, Jan 21, 2011 at 5:19 PM, Jan Kiszka wrote: > Signed-off-by: Jan Kiszka > --- > scripts/checkpatch.pl | 3 ++- > 1 files changed, 2 insertions(+), 1 deletions(-) > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl > index 55ef439..4fa06c0 100755 > --- a/scripts/checkpatch.pl > +++ b/scripts/checkpatch.pl > @@ -2536,7 +2536,8 @@ sub process { > } > } > if (!defined $suppress_ifbraces{$linenr - 1} && > - $line =~ /\b(if|while|for|else)\b/) { > + $line =~ /\b(if|while|for|else)\b/ && > + $line !~ /\#\s*else/) { > my $allowed = 0; > > # Check the pre-context. > -- > 1.7.1 >
Re: [Qemu-devel] [PATCH v3 2/4] scsi: Allow SCSI devices to override the removable bit
Stefan Hajnoczi writes: > Some SCSI devices may wish to override the removable bit. Add support > for a qdev property on the SCSI device. I find this description a bit misleading. The qdev property is added in 1/4. Here, you merely extend scsi_bus_legacy_add_drive() to provide access to it. Its primary users (-drive if=scsi & friends) don't use that access. But there's another user: usb_msd_initfn()[*], and 3/4 will make that one use the access. I guess I'd squash 2+3 together, but that's strictly a matter of taste. > > Signed-off-by: Stefan Hajnoczi > --- > hw/pci-hotplug.c |2 +- > hw/scsi-bus.c|8 ++-- > hw/scsi.h|3 ++- > hw/usb-msd.c |2 +- > 4 files changed, 10 insertions(+), 5 deletions(-) > > diff --git a/hw/pci-hotplug.c b/hw/pci-hotplug.c > index 716133c..270a982 100644 > --- a/hw/pci-hotplug.c > +++ b/hw/pci-hotplug.c > @@ -90,7 +90,7 @@ static int scsi_hot_add(Monitor *mon, DeviceState *adapter, > * specified). > */ > dinfo->unit = qemu_opt_get_number(dinfo->opts, "unit", -1); > -scsidev = scsi_bus_legacy_add_drive(scsibus, dinfo->bdrv, dinfo->unit); > +scsidev = scsi_bus_legacy_add_drive(scsibus, dinfo->bdrv, dinfo->unit, > false); > if (!scsidev) { > return -1; > } > diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c > index 7febb86..ceeb4ec 100644 > --- a/hw/scsi-bus.c > +++ b/hw/scsi-bus.c > @@ -87,7 +87,8 @@ void scsi_qdev_register(SCSIDeviceInfo *info) > } > > /* handle legacy '-drive if=scsi,...' cmd line args */ > -SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, > int unit) > +SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, > + int unit, bool removable) > { > const char *driver; > DeviceState *dev; > @@ -95,6 +96,9 @@ SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, > BlockDriverState *bdrv, int > driver = bdrv_is_sg(bdrv) ? "scsi-generic" : "scsi-disk"; > dev = qdev_create(&bus->qbus, driver); > qdev_prop_set_uint32(dev, "scsi-id", unit); > +if (qdev_prop_exists(dev, "removable")) { Isn't this just a funky way to check for "scsi-disk"? Removable gets silently ignored for -device usb-storage with a scsi generic drive. Worth nothing in 4/4. > +qdev_prop_set_bit(dev, "removable", removable); > +} > if (qdev_prop_set_drive(dev, "drive", bdrv) < 0) { > qdev_free(dev); > return NULL; > @@ -117,7 +121,7 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus) > continue; > } > qemu_opts_loc_restore(dinfo->opts); > -if (!scsi_bus_legacy_add_drive(bus, dinfo->bdrv, unit)) { > +if (!scsi_bus_legacy_add_drive(bus, dinfo->bdrv, unit, false)) { > res = -1; > break; > } [...] [*] I hate that usb_msd_initfn() uses scsi_bus_legacy_add_drive(). Actually, I hate pretty much everything about usb-msd.c. But it's not your fault.
[Qemu-devel] RE: [Bug 702885] Re: "Internal resource leak" error with ARM NEONvmull.s32 insn
Duh. I had missed the greater part of Christophe's patch (I am still having trouble with my mail client; applying patches off the list is manual for me). With both patches applied, indeed the bug filed on launchpad seems fixed. On my second test case, behaviour is much improved. Thanks much! - Wolfgang > -Original Message- > From: boun...@canonical.com [mailto:boun...@canonical.com] On > Behalf Of Peter Maydell > Sent: Friday, January 21, 2011 8:01 AM > To: Schildbach, Wolfgang > Subject: [Bug 702885] Re: "Internal resource leak" error with > ARM NEONvmull.s32 insn > > That binary executes OK for me with no resource leak messages with: > qemu master as of commit b646968336 > http://patchwork.ozlabs.org/patch/79728/ > http://patchwork.ozlabs.org/patch/79581/ > > (i386 host.) > > -- > You received this bug notification because you are a direct > subscriber of the bug. > https://bugs.launchpad.net/bugs/702885 > > Title: > "Internal resource leak" error with ARM NEON vmull.s32 insn > > Status in QEMU: > New > > Bug description: > This bug occurs in qemu, commit > 78a59470e6bbc6e16dc4033767492649c1ae4cfd (most recent as of > 01/14/2011). > > Compile, assemble, and link the code below, with the ARM > tools. (I use > ARM C/C++ Compiler, 4.1 [Build 462]). > > armasm --cpu Cortex-A8 --licensing=flex foo.s > armcc --cpu Cortex-A8 --licensing=flex -o main -L--sysv main.c foo.o > > Execute on qemu-arm and observe an "Internal resource leak" message. > > > qemu-arm main > Internal resource leak before 818c > > - Wolfgang > > main.c: > int main(void) > { > void foofunc(void); > foofunc(); > > return 0 ; > } > > > foo.s: > ARM > REQUIRE8 > PRESERVE8 > AREA code, CODE, READONLY, ALIGN=2 > > foofunc PROC > VMULL.S32 q1, d2, d4 > MOV pc, lr > > ENDP > > EXPORT foofunc [CODE] > END > > To unsubscribe from this bug, go to: > https://bugs.launchpad.net/qemu/+bug/702885/+subscribe > -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/702885 Title: "Internal resource leak" error with ARM NEON vmull.s32 insn Status in QEMU: New Bug description: This bug occurs in qemu, commit 78a59470e6bbc6e16dc4033767492649c1ae4cfd (most recent as of 01/14/2011). Compile, assemble, and link the code below, with the ARM tools. (I use ARM C/C++ Compiler, 4.1 [Build 462]). armasm --cpu Cortex-A8 --licensing=flex foo.s armcc --cpu Cortex-A8 --licensing=flex -o main -L--sysv main.c foo.o Execute on qemu-arm and observe an "Internal resource leak" message. > qemu-arm main Internal resource leak before 818c - Wolfgang main.c: int main(void) { void foofunc(void); foofunc(); return 0 ; } foo.s: ARM REQUIRE8 PRESERVE8 AREA code, CODE, READONLY, ALIGN=2 foofunc PROC VMULL.S32 q1, d2, d4 MOV pc, lr ENDP EXPORT foofunc [CODE] END
[Qemu-devel] Re: [RFC][PATCH v6 03/23] Make qemu timers available for tools
On 01/21/2011 10:30 AM, Jes Sorensen wrote: On 01/17/11 14:14, Michael Roth wrote: diff --git a/qemu-ioh.c b/qemu-ioh.c index cc71470..001e7a2 100644 --- a/qemu-ioh.c +++ b/qemu-ioh.c @@ -22,7 +22,11 @@ * THE SOFTWARE. */ #include "qemu-ioh.h" +#include "qemu-char.h" #include "qlist.h" +#ifdef CONFIG_EVENTFD +#include +#endif /* XXX: fd_read_poll should be suppressed, but an API change is necessary in the character devices to suppress fd_can_read(). */ @@ -113,3 +117,92 @@ void qemu_process_fd_handlers2(void *ioh_record_list, const fd_set *rfds, } } } + +#ifndef _WIN32 +void iothread_event_increment(int *io_thread_fd) Please split the WIN32 stuff into it's own file, similar to oslib-posix and oslib-win32.c etc. Will look into this, but qemu-ioh.c has common code too so we'd end up with qemu-ioh/qemu-ioh-posix/qemu-ioh-win2.c. We could alternatively have a "#ifndef _WIN32" around functions in qemu-ioh.c that would be replaced by win32-specific ones from qemu-ioh-win32. No strong preference either way, but sometimes I find navigating across too many files more annoying that #ifdefs, and there's not a whole lot in these. diff --git a/qemu-ioh.h b/qemu-ioh.h index 7c6e833..2c714a9 100644 --- a/qemu-ioh.h +++ b/qemu-ioh.h @@ -31,4 +31,13 @@ void qemu_get_fdset2(void *ioh_record_list, int *nfds, fd_set *rfds, void qemu_process_fd_handlers2(void *ioh_record_list, const fd_set *rfds, const fd_set *wfds, const fd_set *xfds); + +#ifndef _WIN32 +void iothread_event_increment(int *io_thread_fd); +int iothread_event_init(int *io_thread_fd); +#else +int win32_event_init(HANDLE *qemu_event_handle); +void win32_event_increment(HANDLE *qemu_event_handle); +#endif Can you not do something slightly nicer that allows for those to be the same prototype for all users? Like define a event_handle_t? Don't see why not. + +#ifndef _WIN32 +static int io_thread_fd = -1; Needs splitting into separate files too. diff --git a/qemu-tool.h b/qemu-tool.h new file mode 100644 index 000..fd693cf --- /dev/null +++ b/qemu-tool.h @@ -0,0 +1,26 @@ +#ifndef QEMU_TOOL_H +#define QEMU_TOOL_H + +#include "qemu-common.h" + +#ifdef CONFIG_EVENTFD +#include +#endif + +typedef void VMStateDescription; +typedef int VMStateInfo; + +#ifndef _WIN32 +void qemu_event_increment(void); +int qemu_event_init(void); +#else +int qemu_event_init(void); +void qemu_event_increment(void); +#endif No matter how long I stare at those prototypes, I fail to see the difference between the win32 and the posix version :) Heh, the ordering of course! :) Not sure how I missed this one. The patch is pretty rough in general, I'll see what I can do about cleaning things up a bit. Cheers, Jes