[Qemu-devel] Nyheter hos DokuMera

2010-08-24 Thread DokuMera Nyhetsbrev
Om du har problem med att läsa detta e-postmeddelande, klicka här 
(http://www.anp.se/newsletterweb/846225/444059437941455D4B7142445C43) för en 
webb-version.

Vårt nyhetsbrev skickas automatiskt till våra kunder och intressenter. Vill du 
inte ha detta nyhetsbrev framöver, klicka här för att avprenumerera 
(http://www.anp.se/oa/846225/444059437941455D4B7142445C43).


Nyhetsbrev 34/2010Detta nyhetsbrev är skickat till: qemu-devel@nongnu.org

(http://www.dokumera.se/newsletter_redirect.asp?to=http://www.dokumera.se&from=172191075&prefix=dm)
 (http://www.anp.se/taf/846225/444059437941455D4B7142445C43)  
(http://www.dokumera.se/newsletter_redirect.asp?to=http://www.dokumera.se/out_newsletters.asp&from=172191075&prefix=dm)
  
(http://www.anp.se/newsletter.asp?sqid=846225&sid=444059437941455D4B7142445C43&print=true)


Medierätt - musik

Det uppkommer ständigt nya kanaler inom media vilket skapar nya förutsättningar 
för kommunikation och underhållning. Som upphovsman äger du rätten till ditt 
verk. Upphovsrätten uppstår i och med att du tecknar ner din musik (text 
och/eller noter) eller spelar in din den. Det kan vara svårt att bevisa att du 
verkligen är den ursprunglige upphovsmannen men det finns några sätt att skydda 
sitt verk på:

.Skicka din demo, dina noter eller din text i ett rekommenderat brev till dig 
själv och behåll paketet oöppnat. Genom att göra det kan du använda 
poststämpeln som ett bevis på när verket uppkommit.
.Anmäl ditt verk till Stim (krävs att du är registrerad hos Stim).

Dessa åtgärder är inga bevis på att du verkligen är den ursprunglige 
upphovsmannen men du kan vid en tvist bevisa när verket uppstått.

Medierätt är ett nytt område i DokuMeras produktdatabas. Här finner du bland 
annat avtal inom musik och film. Genom DokuMeras Företagspaket 
(http://www.dokumera.se/visa-kategorier.asp?id=1321) får du tillgång till fler 
än 1300 avtal, checklistor, policys, expertsvar och mycket annat som 
underlättar och juridiskt säkerställer arbetet i ditt företag. 


Veckans dokument 

Artistavtal skivinspelning >> 
(http://www.dokumera.se/artistavtal_skivinspelning_4380_dd.html) Ny !

Musikproducentavtal >> 
(http://www.dokumera.se/musikproducentavtal_4379_dd.html) Ny !

Musikförlagsavtal >> (http://www.dokumera.se/musikforlagsavtal_4381_dd.html) Ny 
!


Veckans expert

Jens Göthberg, Expert inom Medierätt och Upphovsrätt

Advokatfirman Jens Göthberg

Advokat Jens Göthberg har lång erfarenhet av rådgivning till aktörer inom 
underhållningsindustrin och har biträtt flera av Sveriges största artister, 
skådespelare och manusförfattare under de senaste åren.

08-20 89 80
jens.gothb...@gothberg-co.se


Erbjudande

I höst startar Sveriges första utbildning för entreprenörer och företagsledare 
som vill växa med sina företag - Academy of Excellence. Som prenumerant på 
Dokumeras nyhetsbrev får du 2 000 kronor rabatt om du bokar din plats före 27 
augusti. Uppge kampanjkod "dokumera" och anmäl dig på 
www.academyofexcellence.se (http://www.academyofexcellence.se/).

Den 2 september anordnas eventet Växa 2010 på Café Opera - med inspiratörer som 
Bert Karlsson, Jan Carlzon, David Lega m.fl. Eventet är öppet även för dig som 
inte går utbildningen.

(http://www.dokumera.se/newsletter_redirect.asp?to=http://www.dokumera.se/foretagspaketet_1321_dc.html&from=172191075&prefix=dm)
 
(http://www.dokumera.se/newsletter_redirect.asp?to=http://www.dokumera.se/out_atq_ppdviewer.asp&from=172191075&prefix=dm)
 
(http://www.dokumera.se/newsletter_redirect.asp?to=http://www.dokumera.se/styckvisa_dokumentmallar_1330_dc.html&from=172191075&prefix=dm)

(http://www.dokumera.se/newsletter_redirect.asp?to=http://www.dokumera.se/out_contactusmessage.asp&from=172191075&prefix=dm)

För en kostnadsfri exklusiv presentation av hur DokuMera kan spara tiotusentals 
kronor åt just mitt företag.
Givetvis är du varmt välkommen att ringa oss på 08-664 04 50.

Innehållet i nyhetsbrev ska inte tolkas som ett åtagande från DokuMeras sida. 
Informationen sänds ut i befintligt skick, utan garantier och digitala 
signaturer.

Re: [Qemu-devel] Emulate character device

2010-08-24 Thread Stefan Hajnoczi
On Tue, Aug 24, 2010 at 6:04 AM, Thisara Rupasinghe
 wrote:
> There I need to use machines
> character device "dev/vedio0" and stream that to the qemu environment.
> Therefore on the emulator that web cam stream will be available as the
> emulators character device "dev/video0". So how can I able to do that? I'm
> quite new to Qemu and please can anyone help me to do this?

QEMU hardware emulation usually doesn't work at the Linux character
device passthrough.  In the source tree, look at hw/ which contains
the emulated guest devices.  These are emulated hardware devices that
sit on busses like PCI, not Linux character device passthrough.  For
example, a real RTL8139 PCI network card is emulated.

There is some support for USB device passthrough.  If you have a USB
webcam (even "internal" webcams are often USB) and Android supports
USB webcams, then that might work.

It really depends on what webcam devices the Android guest knows how
to drive.  QEMU either needs to pass through one of those devices or
it needs hardware emulation so the guest thinks it is talking to a
supported device.

Stefan



Re: [Qemu-devel] [PATCH] Introduce NBD named exports.

2010-08-24 Thread Kevin Wolf
Am 24.08.2010 03:04, schrieb Laurent Vivier:
> This patch allows to connect Qemu using NBD protocol to an nbd-server
> using named exports.
> 
> For instance, if on the host "isoserver", in /etc/nbd-server/config, you have:
> 
> [generic]
> [debian-500-ppc-netinst]
> exportname = /ISO/debian-500-powerpc-netinst.iso
> [Fedora-10-ppc-netinst]
> exportname = /ISO/Fedora-10-ppc-netinst.iso
> 
> You can connect to it, using:
> 
> qemu -cdrom nbd:isoserver:exportname=debian-500-ppc-netinst
> qemu -cdrom nbd:isoserver:exportname=Fedora-10-ppc-netinst
> 
> NOTE: you need at least nbd-server 2.9.18
> Signed-off-by: Laurent Vivier 
> ---
>  block/nbd.c   |   57 +++
>  nbd.c |  119 ++--
>  nbd.h |5 ++-
>  qemu-doc.texi |7 +++
>  qemu-nbd.c|2 +-
>  5 files changed, 158 insertions(+), 32 deletions(-)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index a1ec123..ee22b47 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -42,55 +42,78 @@ typedef struct BDRVNBDState {
>  static int nbd_open(BlockDriverState *bs, const char* filename, int flags)
>  {
>  BDRVNBDState *s = bs->opaque;
> +
> +#define EN_OPTSTR ":exportname="

Hm, this makes EN_OPTSTR look like a local variable, but it's actually a
macro which is valid after the end of the function. So maybe move it out?

> +char *file;
> +char *name;
>  const char *host;
>  const char *unixpath;
>  int sock;
>  off_t size;
>  size_t blocksize;
>  int ret;
> +int err = -EINVAL;
> +
> +file = qemu_strdup(filename);
>  
> -if (!strstart(filename, "nbd:", &host))
> -return -EINVAL;
> +name = strstr(file, EN_OPTSTR);
> +if (name) {
> +if (name[sizeof(EN_OPTSTR) - 1] == 0)
> +goto out;

Please add braces here and throughout the patch (see CODING_STYLE)

Maybe using strlen(EN_OPTSTR) would be clearer? At first I missed the
fact that sizeof includes the null byte, but maybe it's just me.

> +name[0] = 0;
> +name += sizeof(EN_OPTSTR) - 1;
> +}
> +
> +if (!strstart(file, "nbd:", &host))
> +goto out;
>  
>  if (strstart(host, "unix:", &unixpath)) {
>  
>  if (unixpath[0] != '/')
> -return -EINVAL;
> +goto out;
>  
>  sock = unix_socket_outgoing(unixpath);
>  
>  } else {
> -uint16_t port;
> +uint16_t port = NBD_DEFAULT_PORT;
>  char *p, *r;
>  char hostname[128];
>  
>  pstrcpy(hostname, 128, host);
>  
>  p = strchr(hostname, ':');
> -if (p == NULL)
> -return -EINVAL;
> +if (p != NULL) {
> +*p = '\0';
> +p++;
>  
> -*p = '\0';
> -p++;
> +port = strtol(p, &r, 0);
> +if (r == p)
> +goto out;
> +} else if (name == NULL)
> +goto out;

Wrong indentation level (and missing braces again)

>  
> -port = strtol(p, &r, 0);
> -if (r == p)
> -return -EINVAL;
>  sock = tcp_socket_outgoing(hostname, port);
>  }
>  
> -if (sock == -1)
> -return -errno;
> +if (sock == -1) {
> +err = -errno;
> +goto out;
> +}
>  
> -ret = nbd_receive_negotiate(sock, &size, &blocksize);
> -if (ret == -1)
> -return -errno;
> +ret = nbd_receive_negotiate(sock, name, &size, &blocksize);
> +if (ret == -1) {
> +err = -errno;
> +goto out;
> +}
>  
>  s->sock = sock;
>  s->size = size;
>  s->blocksize = blocksize;
> +err = 0;
>  
> -return 0;
> +out:
> +qemu_free(file);
> +return err;
>  }
>  
>  static int nbd_read(BlockDriverState *bs, int64_t sector_num,
> diff --git a/nbd.c b/nbd.c
> index a9f295f..755613a 100644
> --- a/nbd.c
> +++ b/nbd.c
> @@ -62,6 +62,8 @@
>  #define NBD_SET_SIZE_BLOCKS  _IO(0xab, 7)
>  #define NBD_DISCONNECT  _IO(0xab, 8)
>  
> +#define NBD_OPT_EXPORT_NAME  (1 << 0)
> +
>  /* That's all folks */
>  
>  #define read_sync(fd, buffer, size) nbd_wr_sync(fd, buffer, size, true)
> @@ -296,22 +298,26 @@ int nbd_negotiate(int csock, off_t size)
>   return 0;
>  }
>  
> -int nbd_receive_negotiate(int csock, off_t *size, size_t *blocksize)
> +int nbd_receive_negotiate(int csock, const char *name,
> +  off_t *size, size_t *blocksize)
>  {
> - char buf[8 + 8 + 8 + 128];
> - uint64_t magic;
> + char buf[256] = "\0\0\0\0\0\0\0\0\0";
> + uint64_t magic, s;
> + uint16_t tmp;
> + uint32_t flags;

Hm, where are the flags actually used? There are a few places where they
are assigned, but I can't see any use.

>  
>   TRACE("Receiving negotation.");
>  
> - if (read_sync(csock, buf, sizeof(buf)) != sizeof(buf)) {
> + if (read_sync(csock, buf, 8) != 8) {
>   LOG("read failed");
>   errno = EINVAL;
> 

Re: [Qemu-devel] Emulate character device

2010-08-24 Thread Stefan Hajnoczi
On Tue, Aug 24, 2010 at 10:02 AM, Thisara Rupasinghe
 wrote:
> Thank you very much for your information.
>
> That means i can pass web cam as usb device into the emulator. right?
> If you can give me some directions, that will be very helpful.

Please keep qemu-devel@nongnu.org CCed so others can help too.

Passing a USB webcam into the Android guest would work if the Android
OS has the required USB webcam driver.  You could check the Android
source tree or ask the developers to understand what webcam drivers
are available.

QEMU USB passthrough works like this:
$ qemu -usb -usbdevice host::

Where  is the vendor ID and  is the product ID of the device.
You can find these using lsusb(1) on the host.  For example:
$ lsusb
Bus 008 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 007 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 006 Device 003: ID 413c:2003 Dell Computer Corp. Keyboard
Bus 006 Device 002: ID 413c:3012 Dell Computer Corp. Optical Wheel Mouse

If I want to pass through the keyboard, I use -usbdevice host:413c:2003.

I haven't tested these steps, it's been a while since I used USB
passthrough.  For a full guide, try:
http://bitbud.com/2008/08/09/usb-device-passthrough-under-kvm/

Stefan



Re: [Qemu-devel] [PATCH 4/5] Add generic drive hotplugging

2010-08-24 Thread Daniel P. Berrange
On Tue, Aug 24, 2010 at 12:02:30AM +0200, Alexander Graf wrote:
> The monitor command for hotplugging is in i386 specific code. This is just
> plain wrong, as S390 just learned how to do hotplugging too and needs to
> get drives for that.
> 
> So let's add a generic copy to generic code that handles drive_add in a
> way that doesn't have pci dependencies.
> 
> I'm not fully happy with the patch as is. IMHO there should only be a
> single target agnostic drive_hot_add function available. How we could
> potentially fit IF_SCSI in there I don't know though.

I'm not sure that this patch is actually neccessary. Via a undocumented,
sick, dirty hack, you can already use the current drive_add command
without a PCI address, for both virtio + scsi. In fact not using the
PCI address with drive_add is the preferred approach in the new qdev
world even on x86

The key is that you should use  if=none for all cases. Here are two
examples of how libvirt does it currently:

VirtIO:

  drive_add dummy 
file=/var/lib/libvirt/images/data.img,if=none,id=drive-virtio-disk1,format=raw
  device_add 
virtio-blk-pci,bus=pci.0,addr=0x0,drive=drive-virtio-disk1,id=virtio-disk1'

SCSI:

  drive_add dummy 
file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw'
  device_add scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1

The 'dummy' value there can be absolutely anything you want.
It is totaly ignored when QEMU sees if=none in 2nd arg.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London-o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org-o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|



Re: [Qemu-devel] [PATCH v2 0/7] APIC/IOAPIC cleanup

2010-08-24 Thread Avi Kivity

 On 08/23/2010 07:02 PM, Anthony Liguori wrote:

On 08/23/2010 10:14 AM, Avi Kivity wrote:

 On 08/23/2010 05:47 PM, Anthony Liguori wrote:




Devices can contain references to structs and objects.  If a 
Device contains a reference to an object, the object should be 
stored in a BusState which is a container of Devices.  Therefore, 
the object should inherit from Device.


I disagree.  It's up to the author to decide whether to split a 
Device into 1 or 15 objects.


If one of the other objects is also a subclass of DeviceState, then 
you're probably violating that DeviceState's contract.  But that's 
a different (and trivial) matter.


(side point: in C no objects have constructors and methods.  in C++ 
all objects have constructors and methods, even PODs)


Things that inherit from DeviceState have ctors/dtors.  Things that 
don't end up inventing their own and probably will do so poorly.


you mean misimplement other_state_init(OtherState *) and 
other_state_destroy(OtherState *)?


And hot plug, save/restore, properties, and all of the other things 
that go along with qdev.


Hot plug is a good example of how easy it is to screw this up by not 
having the right hooks being propagated through the device model.


Come on now, this is all trivial.  If you get a qdev callback you just 
propagate it to all objects that need it.


Practically all non-trivial devices need to do it.  For example if you 
remove gpio from qdev, then a device that has an array of gpios will 
have un-embedded objects in its DeviceState.


Look at ivshmem.c with its non-embedded Peer objects.

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Emulate character device

2010-08-24 Thread Thisara Rupasinghe
Thank you Stefan,
I will work on this.

Thanks for the support.

On Tue, Aug 24, 2010 at 2:43 PM, Stefan Hajnoczi  wrote:

> On Tue, Aug 24, 2010 at 10:02 AM, Thisara Rupasinghe
>  wrote:
> > Thank you very much for your information.
> >
> > That means i can pass web cam as usb device into the emulator. right?
> > If you can give me some directions, that will be very helpful.
>
> Please keep qemu-devel@nongnu.org CCed so others can help too.
>
> Passing a USB webcam into the Android guest would work if the Android
> OS has the required USB webcam driver.  You could check the Android
> source tree or ask the developers to understand what webcam drivers
> are available.
>
> QEMU USB passthrough works like this:
> $ qemu -usb -usbdevice host::
>
> Where  is the vendor ID and  is the product ID of the device.
> You can find these using lsusb(1) on the host.  For example:
> $ lsusb
> Bus 008 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
> Bus 007 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
> Bus 006 Device 003: ID 413c:2003 Dell Computer Corp. Keyboard
> Bus 006 Device 002: ID 413c:3012 Dell Computer Corp. Optical Wheel Mouse
>
> If I want to pass through the keyboard, I use -usbdevice host:413c:2003.
>
> I haven't tested these steps, it's been a while since I used USB
> passthrough.  For a full guide, try:
> http://bitbud.com/2008/08/09/usb-device-passthrough-under-kvm/
>
> Stefan
>



-- 
Thanks & Regards,
Thisara.


[Qemu-devel] [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Kevin Wolf
This reverts commit 8b3b720620a1137a1b794fc3ed64734236f94e06.

This fix has caused severe slowdowns on recent kernels that actually do flush
when they are told so. Reverting this patch hurts correctness and means that we
could get corrupted images in case of a host crash. This means that qcow2 might
not be an option for some people without this fix. On the other hand, I get
reports that the slowdown is so massive that not reverting it would mean that
people can't use it either because it just takes ages to complete stuff. It
probably can be fixed, but not in time for 0.13.0.

Usually, if there's a possible tradeoff between correctness and performance, I
tend to choose correctness, but I'm not so sure in this case. I'm not sure with
reverting either, which is why I post this as an RFC only.

I hope to get some more comments on how to proceed here for 0.13.

Kevin


---
 block/qcow2-cluster.c  |   24 
 block/qcow2-refcount.c |   24 
 block/qcow2-snapshot.c |   23 ---
 block/qcow2.c  |   10 +-
 4 files changed, 41 insertions(+), 40 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 166922f..5760ad6 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -64,8 +64,8 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size)
 BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_WRITE_TABLE);
 for(i = 0; i < s->l1_size; i++)
 new_l1_table[i] = cpu_to_be64(new_l1_table[i]);
-ret = bdrv_pwrite_sync(bs->file, new_l1_table_offset, new_l1_table, 
new_l1_size2);
-if (ret < 0)
+ret = bdrv_pwrite(bs->file, new_l1_table_offset, new_l1_table, 
new_l1_size2);
+if (ret != new_l1_size2)
 goto fail;
 for(i = 0; i < s->l1_size; i++)
 new_l1_table[i] = be64_to_cpu(new_l1_table[i]);
@@ -74,8 +74,8 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size)
 BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ACTIVATE_TABLE);
 cpu_to_be32w((uint32_t*)data, new_l1_size);
 cpu_to_be64w((uint64_t*)(data + 4), new_l1_table_offset);
-ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size), 
data,sizeof(data));
-if (ret < 0) {
+ret = bdrv_pwrite(bs->file, offsetof(QCowHeader, l1_size), 
data,sizeof(data));
+if (ret != sizeof(data)) {
 goto fail;
 }
 qemu_free(s->l1_table);
@@ -87,7 +87,7 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size)
  fail:
 qemu_free(new_l1_table);
 qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2);
-return ret;
+return ret < 0 ? ret : -EIO;
 }
 
 void qcow2_l2_cache_reset(BlockDriverState *bs)
@@ -207,7 +207,7 @@ static int write_l1_entry(BlockDriverState *bs, int 
l1_index)
 }
 
 BLKDBG_EVENT(bs->file, BLKDBG_L1_UPDATE);
-ret = bdrv_pwrite_sync(bs->file, s->l1_table_offset + 8 * l1_start_index,
+ret = bdrv_pwrite(bs->file, s->l1_table_offset + 8 * l1_start_index,
 buf, sizeof(buf));
 if (ret < 0) {
 return ret;
@@ -263,7 +263,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, 
uint64_t **table)
 }
 /* write the l2 table to the file */
 BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE);
-ret = bdrv_pwrite_sync(bs->file, l2_offset, l2_table,
+ret = bdrv_pwrite(bs->file, l2_offset, l2_table,
 s->l2_size * sizeof(uint64_t));
 if (ret < 0) {
 goto fail;
@@ -413,8 +413,8 @@ static int copy_sectors(BlockDriverState *bs, uint64_t 
start_sect,
 &s->aes_encrypt_key);
 }
 BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
-ret = bdrv_write_sync(bs->file, (cluster_offset >> 9) + n_start,
-s->cluster_data, n);
+ret = bdrv_write(bs->file, (cluster_offset >> 9) + n_start,
+ s->cluster_data, n);
 if (ret < 0)
 return ret;
 return 0;
@@ -631,10 +631,10 @@ uint64_t 
qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
 
 BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE_COMPRESSED);
 l2_table[l2_index] = cpu_to_be64(cluster_offset);
-if (bdrv_pwrite_sync(bs->file,
+if (bdrv_pwrite(bs->file,
 l2_offset + l2_index * sizeof(uint64_t),
 l2_table + l2_index,
-sizeof(uint64_t)) < 0)
+sizeof(uint64_t)) != sizeof(uint64_t))
 return 0;
 
 return cluster_offset;
@@ -655,7 +655,7 @@ static int write_l2_entries(BlockDriverState *bs, uint64_t 
*l2_table,
 int ret;
 
 BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
-ret = bdrv_pwrite_sync(bs->file, l2_offset + start_offset,
+ret = bdrv_pwrite(bs->file, l2_offset + start_offset,
 &l2_table[l2_start_index], len);
 if (ret < 0) {
 return ret;
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 4c19e7e..4542068 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -44,8 +44,8 @@ static int write_refcount_block(BlockDriverState *bs)
 }
 
  

Re: [Qemu-devel] [PATCH 4/5] Add generic drive hotplugging

2010-08-24 Thread Alexander Graf
Daniel P. Berrange wrote:
> On Tue, Aug 24, 2010 at 12:02:30AM +0200, Alexander Graf wrote:
>   
>> The monitor command for hotplugging is in i386 specific code. This is just
>> plain wrong, as S390 just learned how to do hotplugging too and needs to
>> get drives for that.
>>
>> So let's add a generic copy to generic code that handles drive_add in a
>> way that doesn't have pci dependencies.
>>
>> I'm not fully happy with the patch as is. IMHO there should only be a
>> single target agnostic drive_hot_add function available. How we could
>> potentially fit IF_SCSI in there I don't know though.
>> 
>
> I'm not sure that this patch is actually neccessary. Via a undocumented,
> sick, dirty hack, you can already use the current drive_add command
> without a PCI address, for both virtio + scsi. In fact not using the
> PCI address with drive_add is the preferred approach in the new qdev
> world even on x86
>   

It is certainly necessary since the current code is in a big fat #if
defined(TARGET_I386) block :).

> The key is that you should use  if=none for all cases. Here are two
> examples of how libvirt does it currently:
>
> VirtIO:
>
>   drive_add dummy 
> file=/var/lib/libvirt/images/data.img,if=none,id=drive-virtio-disk1,format=raw
>   device_add 
> virtio-blk-pci,bus=pci.0,addr=0x0,drive=drive-virtio-disk1,id=virtio-disk1'
>
> SCSI:
>
>   drive_add dummy 
> file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw'
>   device_add 
> scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1
>
> The 'dummy' value there can be absolutely anything you want.
> It is totaly ignored when QEMU sees if=none in 2nd arg.
>   

I'd be all for removing the pci-hotplug.c version of drive_add then. But
I think the IF_SCSI option there is to append a drive to an existing
SCSI bus, no?


Alex




Re: [Qemu-devel] [PATCH 4/5] Add generic drive hotplugging

2010-08-24 Thread Daniel P. Berrange
On Tue, Aug 24, 2010 at 12:45:19PM +0200, Alexander Graf wrote:
> Daniel P. Berrange wrote:
> > On Tue, Aug 24, 2010 at 12:02:30AM +0200, Alexander Graf wrote:
> >   
> >> The monitor command for hotplugging is in i386 specific code. This is just
> >> plain wrong, as S390 just learned how to do hotplugging too and needs to
> >> get drives for that.
> >>
> >> So let's add a generic copy to generic code that handles drive_add in a
> >> way that doesn't have pci dependencies.
> >>
> >> I'm not fully happy with the patch as is. IMHO there should only be a
> >> single target agnostic drive_hot_add function available. How we could
> >> potentially fit IF_SCSI in there I don't know though.
> >> 
> >
> > I'm not sure that this patch is actually neccessary. Via a undocumented,
> > sick, dirty hack, you can already use the current drive_add command
> > without a PCI address, for both virtio + scsi. In fact not using the
> > PCI address with drive_add is the preferred approach in the new qdev
> > world even on x86
> >   
> 
> It is certainly necessary since the current code is in a big fat #if
> defined(TARGET_I386) block :).

True, true, killing the #ifdef is needed :-)

> > The key is that you should use  if=none for all cases. Here are two
> > examples of how libvirt does it currently:
> >
> > VirtIO:
> >
> >   drive_add dummy 
> > file=/var/lib/libvirt/images/data.img,if=none,id=drive-virtio-disk1,format=raw
> >   device_add 
> > virtio-blk-pci,bus=pci.0,addr=0x0,drive=drive-virtio-disk1,id=virtio-disk1'
> >
> > SCSI:
> >
> >   drive_add dummy 
> > file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw'
> >   device_add 
> > scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1
> >
> > The 'dummy' value there can be absolutely anything you want.
> > It is totaly ignored when QEMU sees if=none in 2nd arg.
> >   
> 
> I'd be all for removing the pci-hotplug.c version of drive_add then. But
> I think the IF_SCSI option there is to append a drive to an existing
> SCSI bus, no?

Actually this SCSI example I give above is appending a drive to an existing
bus (scsi0), in slot 1 (scsi-id=1).  To best of my knowledge there is no
remaining use case that requires use of IF_SCSI, IF_IDE, etc. The IF_NONE
approach can cope with all, modulo bugs that appear periodically with code
that mistakenly checks for a particular IF_XXX constant.

If you wanted to also create a new SCSI bus, before creating the drive on
it, you'd need to run three commands in total:

  device_add lsi,id=scsi0,bus=pci.0,addr=0x7
  drive_add dummy 
file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw
  device_add scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1

Regards,
Daniel
-- 
|: Red Hat, Engineering, London-o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org-o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|



[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Stefan Hajnoczi
On Tue, Aug 24, 2010 at 11:40 AM, Kevin Wolf  wrote:
> This reverts commit 8b3b720620a1137a1b794fc3ed64734236f94e06.
>
> This fix has caused severe slowdowns on recent kernels that actually do flush
> when they are told so. Reverting this patch hurts correctness and means that 
> we
> could get corrupted images in case of a host crash. This means that qcow2 
> might
> not be an option for some people without this fix. On the other hand, I get
> reports that the slowdown is so massive that not reverting it would mean that
> people can't use it either because it just takes ages to complete stuff. It
> probably can be fixed, but not in time for 0.13.0.
>
> Usually, if there's a possible tradeoff between correctness and performance, I
> tend to choose correctness, but I'm not so sure in this case. I'm not sure 
> with
> reverting either, which is why I post this as an RFC only.
>
> I hope to get some more comments on how to proceed here for 0.13.

Sometimes an improvement has a side effect and it makes sense to hold
back the improvement until the side effect can be resolved.  The
period of time in which users could rely on qcow2 data integrity is
small to none, I feel reverting the commit makes sense.

QEMU 0.12.5 has qcow2 sync metadata writes in commit
37060c28e522843fbf6f7e59af745dfcb05b132c.  Was the performance
regression spotted on 0.12.5 or 0.13?

Stefan



[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Michael Tokarev
24.08.2010 15:02, Stefan Hajnoczi wrote:
[]
> Sometimes an improvement has a side effect and it makes sense to hold
> back the improvement until the side effect can be resolved.  The
> period of time in which users could rely on qcow2 data integrity is
> small to none, I feel reverting the commit makes sense.

And I agree with this.

> QEMU 0.12.5 has qcow2 sync metadata writes in commit
> 37060c28e522843fbf6f7e59af745dfcb05b132c.  Was the performance
> regression spotted on 0.12.5 or 0.13?

It started as a debian bugreport ( http://bugs.debian.org/594069 )
which were reported once 0.12.5 entered debian testing.  So
it were spotted in 0.12.5, not 0.13.  I verfied the bug locally
and indeed, it makes _huge_ difference (0.12.4 vs 0.12.5), --
e.g 600mb mem windows7 hybernation on my machine now takes about
40 minutes to complete instead of ~20 sec in 0.12.4.

I'm reverting whole series (there are 5 patches in total, covering
qcow, qcow2, vmdk and vpc plus infrastructure) for debian 0.12
package.  It is sad but there's no other option for now.

/mjt



Re: [Qemu-devel] [PATCH 12/15] piix_pci: Introduces Xen specific call for irq.

2010-08-24 Thread Isaku Yamahata
On Mon, Aug 23, 2010 at 10:50:49AM +0100, stefano.stabell...@eu.citrix.com 
wrote:
> From: Anthony PERARD 
> 
> This patch introduces Xen specific call in piix_pci.
> 
> The specific part for Xen is in write_config, set_irq and get_pirq.
> 
> Signed-off-by: Anthony PERARD 
> Signed-off-by: Stefano Stabellini 
> ---
>  hw/piix_pci.c |   19 ---
>  hw/xen.h  |3 +++
>  xen-all.c |   25 +
>  xen-stub.c|9 +
>  4 files changed, 53 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> index f152a0f..994057f 100644
> --- a/hw/piix_pci.c
> +++ b/hw/piix_pci.c
> @@ -28,6 +28,7 @@
>  #include "pci_host.h"
>  #include "isa.h"
>  #include "sysbus.h"
> +#include "xen.h"
>  
>  /*
>   * I440FX chipset data sheet.
> @@ -61,9 +62,13 @@ static void piix3_set_irq(void *opaque, int irq_num, int 
> level);
> mapping. */
>  static int pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
>  {
> -int slot_addend;
> -slot_addend = (pci_dev->devfn >> 3) - 1;
> -return (irq_num + slot_addend) & 3;
> +if (!xen_enabled()) {
> +int slot_addend;
> +slot_addend = (pci_dev->devfn >> 3) - 1;
> +return (irq_num + slot_addend) & 3;
> +} else {
> +return irq_num + ((pci_dev->devfn >> 3) << 2);
> +}
>  }
>  
>  static void update_pam(PCII440FXState *d, uint32_t start, uint32_t end, int 
> r)

pci_slot_get_pirq() is passed to pci_bus_irqs() in i440fx_init().
It would be better to pass xen specific function to pci_bus_irqs() instead of
dynamic check.


> @@ -142,6 +147,9 @@ static void i440fx_write_config(PCIDevice *dev,
>  {
>  PCII440FXState *d = DO_UPCAST(PCII440FXState, dev, dev);
>  
> +if (xen_enabled())
> +xen_piix_pci_write_config_client(address, val, len);
> +
>  /* XXX: implement SMRAM.D_LOCK */
>  pci_default_write_config(dev, address, val, len);
>  if (ranges_overlap(address, len, I440FX_PAM, I440FX_PAM_SIZE) ||

Maybe do we want another PCIDeviceInfo whose difference is .write_config?
I'm not sure which is better, though.


> @@ -255,6 +263,11 @@ static void piix3_set_irq(void *opaque, int irq_num, int 
> level)
>  int i, pic_irq, pic_level;
>  PIIX3State *piix3 = opaque;
>  
> +if (xen_enabled()) {
> +xen_piix3_set_irq(irq_num, level);
> +return;
> +}
> +
>  piix3->pci_irq_levels[irq_num] = level;

Ditto. Pass xen_piix3_set_irq() to pci_bus_irqs()?

thanks,

>  
>  /* now we change the pic irq level according to the piix irq mappings */
> diff --git a/hw/xen.h b/hw/xen.h
> index f1d01d3..77012c2 100644
> --- a/hw/xen.h
> +++ b/hw/xen.h
> @@ -26,4 +26,7 @@ extern int xen_allowed;
>  #define xen_enabled() (0)
>  #endif
>  
> +void xen_piix3_set_irq(int irq_num, int level);
> +void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int 
> len);
> +
>  #endif /* QEMU_HW_XEN_H */
> diff --git a/xen-all.c b/xen-all.c
> index e69de29..2d789ad 100644
> --- a/xen-all.c
> +++ b/xen-all.c
> @@ -0,0 +1,25 @@
> +#include "config.h"
> +
> +#include "hw/xen.h"
> +#include "hw/xen_backend.h"
> +
> +void xen_piix3_set_irq(int irq_num, int level)
> +{
> +xc_hvm_set_pci_intx_level(xen_xc, xen_domid, 0, 0, irq_num >> 2,
> +irq_num & 3, level);
> +}
> +
> +void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int 
> len)
> +{
> +int i;
> +
> +/* Scan for updates to PCI link routes (0x60-0x63). */
> +for (i = 0; i < len; i++) {
> +uint8_t v = (val >> (8*i)) & 0xff;
> +if (v & 0x80)
> +v = 0;
> +v &= 0xf;
> +if (((address+i) >= 0x60) && ((address+i) <= 0x63))
> +xc_hvm_set_pci_link_route(xen_xc, xen_domid, address + i - 0x60, 
> v);
> +}
> +}
> diff --git a/xen-stub.c b/xen-stub.c
> index e69de29..38c64cf 100644
> --- a/xen-stub.c
> +++ b/xen-stub.c
> @@ -0,0 +1,9 @@
> +#include "hw/xen.h"
> +
> +void xen_piix3_set_irq(int irq_num, int level)
> +{
> +}
> +
> +void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int 
> len)
> +{
> +}
> -- 
> 1.7.0.4
> 
> 

-- 
yamahata



Re: [Qemu-devel] [PATCH 3/3] Check pci slot number in parse_pci_devfn

2010-08-24 Thread Isaku Yamahata
Add Cc: m...@redhat.com.

MAX_PCI_SLOTS should be in pci.h instead of qdev.h?
And the name should be start with PCI_ prefix for consistency?

Except that, the patches look okay.

thanks,

On Tue, Aug 24, 2010 at 02:49:27PM +0800, Ken CC wrote:
> Define MAX_PCI_SLOTS as 0x1f, if pci addr provided from command line
> is bigger than 0x1f, return error -EINVAL.
> 
> 0x1f << 3 | 7 == 255 (PCIBUS_MAX_DEVICES - 1)
> 
> Signed-off-by: Ken CC 
> ---
>  hw/qdev-properties.c |2 ++
>  hw/qdev.h|3 +++
>  2 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
> index 9219cd7..96d814c 100644
> --- a/hw/qdev-properties.c
> +++ b/hw/qdev-properties.c
> @@ -514,6 +514,8 @@ static int parse_pci_devfn(DeviceState *dev, Property 
> *prop, const char *str)
>  return -EINVAL;
>  }
>  }
> +if (slot > MAX_PCI_SLOTS)
> +return -EINVAL;
>  if (str[n] != '\0')
>  return -EINVAL;
>  if (fn > 7)
> diff --git a/hw/qdev.h b/hw/qdev.h
> index 678f8b7..fcfe52e 100644
> --- a/hw/qdev.h
> +++ b/hw/qdev.h
> @@ -7,6 +7,9 @@
>  #include "qemu-char.h"
>  #include "qemu-option.h"
>  
> +
> +#define MAX_PCI_SLOTS 0x1f
> +
>  typedef struct Property Property;
>  
>  typedef struct PropertyInfo PropertyInfo;
> 
> 

-- 
yamahata



[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Kevin Wolf
Am 24.08.2010 13:02, schrieb Stefan Hajnoczi:
> On Tue, Aug 24, 2010 at 11:40 AM, Kevin Wolf  wrote:
>> This reverts commit 8b3b720620a1137a1b794fc3ed64734236f94e06.
>>
>> This fix has caused severe slowdowns on recent kernels that actually do flush
>> when they are told so. Reverting this patch hurts correctness and means that 
>> we
>> could get corrupted images in case of a host crash. This means that qcow2 
>> might
>> not be an option for some people without this fix. On the other hand, I get
>> reports that the slowdown is so massive that not reverting it would mean that
>> people can't use it either because it just takes ages to complete stuff. It
>> probably can be fixed, but not in time for 0.13.0.
>>
>> Usually, if there's a possible tradeoff between correctness and performance, 
>> I
>> tend to choose correctness, but I'm not so sure in this case. I'm not sure 
>> with
>> reverting either, which is why I post this as an RFC only.
>>
>> I hope to get some more comments on how to proceed here for 0.13.
> 
> Sometimes an improvement has a side effect and it makes sense to hold
> back the improvement until the side effect can be resolved.  The
> period of time in which users could rely on qcow2 data integrity is
> small to none, I feel reverting the commit makes sense.

Right, that's the vague feeling I have, too.

> QEMU 0.12.5 has qcow2 sync metadata writes in commit
> 37060c28e522843fbf6f7e59af745dfcb05b132c.  Was the performance
> regression spotted on 0.12.5 or 0.13?

Both. You mean we should consider a 0.12.6 if we decide to revert? I
think so far 0.12.5 was planned to be last 0.12.x release.

Kevin



Re: [Qemu-devel] [PATCH 3/3] Check pci slot number in parse_pci_devfn

2010-08-24 Thread Avi Kivity

 On 08/24/2010 02:35 PM, Isaku Yamahata wrote:

Add Cc: m...@redhat.com.

MAX_PCI_SLOTS should be in pci.h instead of qdev.h?
And the name should be start with PCI_ prefix for consistency?

Except that, the patches look okay.



These aren't slots, are they?  They are functions.

There's a lot of slot/function confusion in qemu.

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Alexander Graf
Kevin Wolf wrote:
> Am 24.08.2010 13:02, schrieb Stefan Hajnoczi:
>   
>> On Tue, Aug 24, 2010 at 11:40 AM, Kevin Wolf  wrote:
>> 
>>> This reverts commit 8b3b720620a1137a1b794fc3ed64734236f94e06.
>>>
>>> This fix has caused severe slowdowns on recent kernels that actually do 
>>> flush
>>> when they are told so. Reverting this patch hurts correctness and means 
>>> that we
>>> could get corrupted images in case of a host crash. This means that qcow2 
>>> might
>>> not be an option for some people without this fix. On the other hand, I get
>>> reports that the slowdown is so massive that not reverting it would mean 
>>> that
>>> people can't use it either because it just takes ages to complete stuff. It
>>> probably can be fixed, but not in time for 0.13.0.
>>>
>>> Usually, if there's a possible tradeoff between correctness and 
>>> performance, I
>>> tend to choose correctness, but I'm not so sure in this case. I'm not sure 
>>> with
>>> reverting either, which is why I post this as an RFC only.
>>>
>>> I hope to get some more comments on how to proceed here for 0.13.
>>>   
>> Sometimes an improvement has a side effect and it makes sense to hold
>> back the improvement until the side effect can be resolved.  The
>> period of time in which users could rely on qcow2 data integrity is
>> small to none, I feel reverting the commit makes sense.
>> 
>
> Right, that's the vague feeling I have, too.
>   

If we don't think of qcow2 as integer format, why don't we just default
to cache=unsafe there then? That way you could keep all the syncs in
place making it stable with cache=!unsafe, but the default for users
would be fast albeit unsafe, which it already is.


Alex




Re: [Qemu-devel] [PATCH 3/3] Check pci slot number in parse_pci_devfn

2010-08-24 Thread Isaku Yamahata
On Tue, Aug 24, 2010 at 02:42:18PM +0300, Avi Kivity wrote:
>  On 08/24/2010 02:35 PM, Isaku Yamahata wrote:
>> Add Cc: m...@redhat.com.
>>
>> MAX_PCI_SLOTS should be in pci.h instead of qdev.h?
>> And the name should be start with PCI_ prefix for consistency?
>>
>> Except that, the patches look okay.
>>
>
> These aren't slots, are they?  They are functions.

The function checks if given $slot.$fn (or $slot) is valid.
So it's slots. max 32.

> There's a lot of slot/function confusion in qemu.

Agreed.
-- 
yamahata



Re: [Qemu-devel] [PATCH 3/3] Check pci slot number in parse_pci_devfn

2010-08-24 Thread Avi Kivity

 On 08/24/2010 03:07 PM, Isaku Yamahata wrote:

On Tue, Aug 24, 2010 at 02:42:18PM +0300, Avi Kivity wrote:

  On 08/24/2010 02:35 PM, Isaku Yamahata wrote:

Add Cc: m...@redhat.com.

MAX_PCI_SLOTS should be in pci.h instead of qdev.h?
And the name should be start with PCI_ prefix for consistency?

Except that, the patches look okay.


These aren't slots, are they?  They are functions.

The function checks if given $slot.$fn (or $slot) is valid.
So it's slots. max 32.


+assert(devfn<  PCIBUS_MAX_DEVICES);


Looks like we're comparing a function number to PCIBUS_MAX_DEVICES.

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Kevin Wolf
Am 24.08.2010 13:56, schrieb Alexander Graf:
> Kevin Wolf wrote:
>> Am 24.08.2010 13:02, schrieb Stefan Hajnoczi:
>>   
>>> On Tue, Aug 24, 2010 at 11:40 AM, Kevin Wolf  wrote:
>>> 
 This reverts commit 8b3b720620a1137a1b794fc3ed64734236f94e06.

 This fix has caused severe slowdowns on recent kernels that actually do 
 flush
 when they are told so. Reverting this patch hurts correctness and means 
 that we
 could get corrupted images in case of a host crash. This means that qcow2 
 might
 not be an option for some people without this fix. On the other hand, I get
 reports that the slowdown is so massive that not reverting it would mean 
 that
 people can't use it either because it just takes ages to complete stuff. It
 probably can be fixed, but not in time for 0.13.0.

 Usually, if there's a possible tradeoff between correctness and 
 performance, I
 tend to choose correctness, but I'm not so sure in this case. I'm not sure 
 with
 reverting either, which is why I post this as an RFC only.

 I hope to get some more comments on how to proceed here for 0.13.
   
>>> Sometimes an improvement has a side effect and it makes sense to hold
>>> back the improvement until the side effect can be resolved.  The
>>> period of time in which users could rely on qcow2 data integrity is
>>> small to none, I feel reverting the commit makes sense.
>>> 
>>
>> Right, that's the vague feeling I have, too.
>>   
> 
> If we don't think of qcow2 as integer format, why don't we just default
> to cache=unsafe there then? That way you could keep all the syncs in
> place making it stable with cache=!unsafe, but the default for users
> would be fast albeit unsafe, which it already is.

Well, safety is not boolean. Considering to make it mostly safe instead
of completely safe because of the performance doesn't mean that we
should make it completely unsafe.

That said, what we should do is changing the cache mode to unsafe in
certain places in qemu-img, e.g. in convert for the destination image.
If it fails, you'll throw it away anyway.

Kevin



Re: [Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Alexander Graf
Kevin Wolf wrote:
> Am 24.08.2010 13:56, schrieb Alexander Graf:
>   
>> Kevin Wolf wrote:
>> 
>>> Am 24.08.2010 13:02, schrieb Stefan Hajnoczi:
>>>   
>>>   
 On Tue, Aug 24, 2010 at 11:40 AM, Kevin Wolf  wrote:
 
 
> This reverts commit 8b3b720620a1137a1b794fc3ed64734236f94e06.
>
> This fix has caused severe slowdowns on recent kernels that actually do 
> flush
> when they are told so. Reverting this patch hurts correctness and means 
> that we
> could get corrupted images in case of a host crash. This means that qcow2 
> might
> not be an option for some people without this fix. On the other hand, I 
> get
> reports that the slowdown is so massive that not reverting it would mean 
> that
> people can't use it either because it just takes ages to complete stuff. 
> It
> probably can be fixed, but not in time for 0.13.0.
>
> Usually, if there's a possible tradeoff between correctness and 
> performance, I
> tend to choose correctness, but I'm not so sure in this case. I'm not 
> sure with
> reverting either, which is why I post this as an RFC only.
>
> I hope to get some more comments on how to proceed here for 0.13.
>   
>   
 Sometimes an improvement has a side effect and it makes sense to hold
 back the improvement until the side effect can be resolved.  The
 period of time in which users could rely on qcow2 data integrity is
 small to none, I feel reverting the commit makes sense.
 
 
>>> Right, that's the vague feeling I have, too.
>>>   
>>>   
>> If we don't think of qcow2 as integer format, why don't we just default
>> to cache=unsafe there then? That way you could keep all the syncs in
>> place making it stable with cache=!unsafe, but the default for users
>> would be fast albeit unsafe, which it already is.
>> 
>
> Well, safety is not boolean. Considering to make it mostly safe instead
> of completely safe because of the performance doesn't mean that we
> should make it completely unsafe.
>   

What is safety then? A vague feeling of "oh today is monday so my data
is safe, but on tuesday I always lose my image data"? Either we promise
to keep data safe or we don't. There is no in between.

> That said, what we should do is changing the cache mode to unsafe in
> certain places in qemu-img, e.g. in convert for the destination image.
> If it fails, you'll throw it away anyway.
>   

That would be useful either way.


Alex




Re: [Qemu-devel] [PATCH 3/3] Check pci slot number in parse_pci_devfn

2010-08-24 Thread Isaku Yamahata
On Tue, Aug 24, 2010 at 03:04:44PM +0300, Avi Kivity wrote:
>  On 08/24/2010 03:07 PM, Isaku Yamahata wrote:
>> On Tue, Aug 24, 2010 at 02:42:18PM +0300, Avi Kivity wrote:
>>>   On 08/24/2010 02:35 PM, Isaku Yamahata wrote:
 Add Cc: m...@redhat.com.

 MAX_PCI_SLOTS should be in pci.h instead of qdev.h?
 And the name should be start with PCI_ prefix for consistency?

 Except that, the patches look okay.

>>> These aren't slots, are they?  They are functions.
>> The function checks if given $slot.$fn (or $slot) is valid.
>> So it's slots. max 32.
>
> +assert(devfn<  PCIBUS_MAX_DEVICES);
>
>
> Looks like we're comparing a function number to PCIBUS_MAX_DEVICES.

Ah, you're talking about 2/3. I talked about 3/3.
You're right. The name is misleading.
PCIBUS_MAX_FUNCTIONS? Or suggestions?
-- 
yamahata



Re: [Qemu-devel] [PATCH 3/3] Check pci slot number in parse_pci_devfn

2010-08-24 Thread Avi Kivity

 On 08/24/2010 03:24 PM, Isaku Yamahata wrote:

On Tue, Aug 24, 2010 at 03:04:44PM +0300, Avi Kivity wrote:

  On 08/24/2010 03:07 PM, Isaku Yamahata wrote:

On Tue, Aug 24, 2010 at 02:42:18PM +0300, Avi Kivity wrote:

   On 08/24/2010 02:35 PM, Isaku Yamahata wrote:

Add Cc: m...@redhat.com.

MAX_PCI_SLOTS should be in pci.h instead of qdev.h?
And the name should be start with PCI_ prefix for consistency?

Except that, the patches look okay.


These aren't slots, are they?  They are functions.

The function checks if given $slot.$fn (or $slot) is valid.
So it's slots. max 32.

+assert(devfn<   PCIBUS_MAX_DEVICES);


Looks like we're comparing a function number to PCIBUS_MAX_DEVICES.

Ah, you're talking about 2/3. I talked about 3/3.
You're right. The name is misleading.
PCIBUS_MAX_FUNCTIONS? Or suggestions?


Or assert(devfn / 8 < PCIBUS_MAX_DEVICES) (and change that to be 32).


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH 3/3] Check pci slot number in parse_pci_devfn

2010-08-24 Thread Chen Cao
On Tue, Aug 24, 2010 at 03:04:44PM +0300, Avi Kivity wrote:
>  On 08/24/2010 03:07 PM, Isaku Yamahata wrote:
> >On Tue, Aug 24, 2010 at 02:42:18PM +0300, Avi Kivity wrote:
> >>  On 08/24/2010 02:35 PM, Isaku Yamahata wrote:
> >>>Add Cc: m...@redhat.com.
> >>>
> >>>MAX_PCI_SLOTS should be in pci.h instead of qdev.h?
> >>>And the name should be start with PCI_ prefix for consistency?
> >>>
> >>>Except that, the patches look okay.
> >>>
> >>These aren't slots, are they?  They are functions.
> >The function checks if given $slot.$fn (or $slot) is valid.
> >So it's slots. max 32.
> 
> +assert(devfn<  PCIBUS_MAX_DEVICES);
> 
> 
> Looks like we're comparing a function number to PCIBUS_MAX_DEVICES.
> 

PCIBUS_MAX_DEVICES is the size of PCIBus.devices[], I have added it in
the first patch at the defination of struct PCIBus, line 50 hw/pci.c.
so i think the better name of the macro should be PCIBUS_MAX_FN,
right?


Ken

> -- 
> error compiling committee.c: too many arguments to function
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 



Re: [Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Avi Kivity

 On 08/24/2010 03:12 PM, Alexander Graf wrote:



Well, safety is not boolean. Considering to make it mostly safe instead
of completely safe because of the performance doesn't mean that we
should make it completely unsafe.


What is safety then? A vague feeling of "oh today is monday so my data
is safe, but on tuesday I always lose my image data"? Either we promise
to keep data safe or we don't. There is no in between.



Do you drive a car?

Though in general I agree we shouldn't compromise on data integrity.

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH 3/3] Check pci slot number in parse_pci_devfn

2010-08-24 Thread Avi Kivity

 On 08/24/2010 03:16 PM, Chen Cao wrote:

On Tue, Aug 24, 2010 at 03:04:44PM +0300, Avi Kivity wrote:

  On 08/24/2010 03:07 PM, Isaku Yamahata wrote:

On Tue, Aug 24, 2010 at 02:42:18PM +0300, Avi Kivity wrote:

  On 08/24/2010 02:35 PM, Isaku Yamahata wrote:

Add Cc: m...@redhat.com.

MAX_PCI_SLOTS should be in pci.h instead of qdev.h?
And the name should be start with PCI_ prefix for consistency?

Except that, the patches look okay.


These aren't slots, are they?  They are functions.

The function checks if given $slot.$fn (or $slot) is valid.
So it's slots. max 32.

+assert(devfn<   PCIBUS_MAX_DEVICES);


Looks like we're comparing a function number to PCIBUS_MAX_DEVICES.


PCIBUS_MAX_DEVICES is the size of PCIBus.devices[], I have added it in
the first patch at the defination of struct PCIBus, line 50 hw/pci.c.
so i think the better name of the macro should be PCIBUS_MAX_FN,
right?


Or make it 32 and scale it by PCI_FUNCTIONS_PER_DEVICE.

PCIBus.devices[] should be renamed to functions[] (later).

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Alexander Graf
Avi Kivity wrote:
>  On 08/24/2010 03:12 PM, Alexander Graf wrote:
>>
>>> Well, safety is not boolean. Considering to make it mostly safe instead
>>> of completely safe because of the performance doesn't mean that we
>>> should make it completely unsafe.
>>>
>> What is safety then? A vague feeling of "oh today is monday so my data
>> is safe, but on tuesday I always lose my image data"? Either we promise
>> to keep data safe or we don't. There is no in between.
>>
>
> Do you drive a car?

Would you buy a car where the breaks are known to not always work? ;)

> Though in general I agree we shouldn't compromise on data integrity.

That's my point. Either we go for it or we don't.


Alex




[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Stefan Hajnoczi
On Tue, Aug 24, 2010 at 12:40 PM, Kevin Wolf  wrote:
> Am 24.08.2010 13:02, schrieb Stefan Hajnoczi:
>> QEMU 0.12.5 has qcow2 sync metadata writes in commit
>> 37060c28e522843fbf6f7e59af745dfcb05b132c.  Was the performance
>> regression spotted on 0.12.5 or 0.13?
>
> Both. You mean we should consider a 0.12.6 if we decide to revert? I
> think so far 0.12.5 was planned to be last 0.12.x release.

Yes, especially if distros will revert the patches manually without an
upstream release.  I can see arguments for either way though.

Stefan



Re: [Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Avi Kivity

 On 08/24/2010 03:21 PM, Alexander Graf wrote:

Avi Kivity wrote:

  On 08/24/2010 03:12 PM, Alexander Graf wrote:

Well, safety is not boolean. Considering to make it mostly safe instead
of completely safe because of the performance doesn't mean that we
should make it completely unsafe.


What is safety then? A vague feeling of "oh today is monday so my data
is safe, but on tuesday I always lose my image data"? Either we promise
to keep data safe or we don't. There is no in between.


Do you drive a car?

Would you buy a car where the breaks are known to not always work? ;)


That's not the case, even with cache=unsafe.


Though in general I agree we shouldn't compromise on data integrity.

That's my point. Either we go for it or we don't.


I don't know how bad the performance regression is, and how large the 
integrity risk is.  I'd default towards preserving integrity, but maybe 
this situation is different.


--
error compiling committee.c: too many arguments to function




[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Michael Tokarev
24.08.2010 16:21, Stefan Hajnoczi wrote:
> On Tue, Aug 24, 2010 at 12:40 PM, Kevin Wolf  wrote:
>> Am 24.08.2010 13:02, schrieb Stefan Hajnoczi:
>>> QEMU 0.12.5 has qcow2 sync metadata writes in commit
>>> 37060c28e522843fbf6f7e59af745dfcb05b132c.  Was the performance
>>> regression spotted on 0.12.5 or 0.13?
>>
>> Both. You mean we should consider a 0.12.6 if we decide to revert? I
>> think so far 0.12.5 was planned to be last 0.12.x release.
> 
> Yes, especially if distros will revert the patches manually without an
> upstream release.  I can see arguments for either way though.

Note that for 0.12 there's no other option than to revert the
whole thing.  Because, well, there's no "cache=unsafe" there,
not to say about it being the default.  And quemu-kvm breaks
the user's systems in the middle of stable series (by "breaks"
I mean making their working guests unusable).

/mjt



Re: [Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Kevin Wolf
Am 24.08.2010 14:27, schrieb Avi Kivity:
>   On 08/24/2010 03:21 PM, Alexander Graf wrote:
>> Avi Kivity wrote:
>>>   On 08/24/2010 03:12 PM, Alexander Graf wrote:
> Well, safety is not boolean. Considering to make it mostly safe instead
> of completely safe because of the performance doesn't mean that we
> should make it completely unsafe.
>
 What is safety then? A vague feeling of "oh today is monday so my data
 is safe, but on tuesday I always lose my image data"? Either we promise
 to keep data safe or we don't. There is no in between.

>>> Do you drive a car?
>> Would you buy a car where the breaks are known to not always work? ;)
> 
> That's not the case, even with cache=unsafe.
> 
>>> Though in general I agree we shouldn't compromise on data integrity.
>> That's my point. Either we go for it or we don't.
> 
> I don't know how bad the performance regression is, and how large the 
> integrity risk is.  I'd default towards preserving integrity, but maybe 
> this situation is different.

I have reports of installations taking like 50 min instead of 14 min. My
own qemu-io based test goes up from 1 s to 23 s. And I think the winner
is Michael's image conversion which went up from 30 s to 49 min.

So it's not like we're talking about just some 10 or 20 percent.

Kevin



Re: [Qemu-devel] [PATCH 0/5] CODING_STYLE amendments

2010-08-24 Thread Markus Armbruster
Anthony Liguori  writes:

> On 08/22/2010 01:36 PM, malc wrote:
>
> But how would you do that? Drop the CODING_STYLE (and accept
> anything)? Switch to a new CODING_STYLE that is widely appreciated and
> so all bikeshedding will cease? Enforce current style?
>  
 I would suggest we either clean up the existing rule, or switch to the
 Linux kernel style, with the explicit exemption that existing code can
 keep the 4-char indentation, unless the whole file is converted. I'd
 like to avoid a total reformatting of the codebase, but we could look at
 it on a file by file base if it becomes relevant.

>>> Sounds reasonable.
>>>
>>>  
>> Doesn't to me.
>>
>
> I'm strongly opposed to any reformatting of the tree.
>
> All it does is break git blame which makes debugging harder without
> offering any real benefits.

And that's why enforcing the parts of the coding style that are
insufficiently consistent with the existing code (mandatory braces, for
instance) is a waste of time.  Writing our own tools to help with that
is an even bigger waste of time.

Everyone's free to waste his or her time however he or she pleases, of
course.  But I respectfully request to refrain from wasting somebody
else's time.  Yes, we should insist people make their changes blend with
the existing code.  But making them jump through additional brace-hoops
when their changes do blend crosses the line for me.



Re: [Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Avi Kivity

 On 08/24/2010 03:35 PM, Kevin Wolf wrote:



I don't know how bad the performance regression is, and how large the
integrity risk is.  I'd default towards preserving integrity, but maybe
this situation is different.

I have reports of installations taking like 50 min instead of 14 min. My
own qemu-io based test goes up from 1 s to 23 s. And I think the winner
is Michael's image conversion which went up from 30 s to 49 min.

So it's not like we're talking about just some 10 or 20 percent.


Image conversion should be done with cache=unsafe (with an fsync at the 
end to make sure a guest isn't launched with volatile data in the 
cache[1]).  The io test isn't a user workload.


The 14 min -> 50 min regression is pretty bad, but I'm not sure it's bad 
enough to merit risking user data.


[1] or better, when launching a guest with cache!=unsafe start with an 
fsync to make sure it's on disk.


--
error compiling committee.c: too many arguments to function




[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Juan Quintela
Kevin Wolf  wrote:
> This reverts commit 8b3b720620a1137a1b794fc3ed64734236f94e06.
>
> This fix has caused severe slowdowns on recent kernels that actually do flush
> when they are told so. Reverting this patch hurts correctness and means that 
> we
> could get corrupted images in case of a host crash. This means that qcow2 
> might
> not be an option for some people without this fix. On the other hand, I get
> reports that the slowdown is so massive that not reverting it would mean that
> people can't use it either because it just takes ages to complete stuff. It
> probably can be fixed, but not in time for 0.13.0.
>
> Usually, if there's a possible tradeoff between correctness and performance, I
> tend to choose correctness, but I'm not so sure in this case. I'm not sure 
> with
> reverting either, which is why I post this as an RFC only.
>
> I hope to get some more comments on how to proceed here for 0.13.

I think that we have to revert also.

It is a thoug decission so.  This patch makes things more "correct",
but it makes some load really slow, for instance, and install went from
15m to 50m on my hardware :(

My point here is that with the patch, qcow2 becomes almost unusable, so
... what to do?

Later, Juan.



Re: [Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Kevin Wolf
Am 24.08.2010 14:39, schrieb Avi Kivity:
>   On 08/24/2010 03:35 PM, Kevin Wolf wrote:
>>
>>> I don't know how bad the performance regression is, and how large the
>>> integrity risk is.  I'd default towards preserving integrity, but maybe
>>> this situation is different.
>> I have reports of installations taking like 50 min instead of 14 min. My
>> own qemu-io based test goes up from 1 s to 23 s. And I think the winner
>> is Michael's image conversion which went up from 30 s to 49 min.
>>
>> So it's not like we're talking about just some 10 or 20 percent.
> 
> Image conversion should be done with cache=unsafe (with an fsync at the 

Well, should, but is not. Arguably we should improve our format drivers
to make the metadata flushes less of a problem, too. But we haven't done
so yet. This is why this discussion is about 0.13, not 0.14.

> end to make sure a guest isn't launched with volatile data in the 
> cache[1]).  The io test isn't a user workload.
> 
> The 14 min -> 50 min regression is pretty bad, but I'm not sure it's bad 
> enough to merit risking user data.

Without a disk write cache (or with a battery-backed one) cache=none is
safe without any additional flushes. So yes, users need to be aware of
the meaning of cache options. But they already had to be aware of it
with 0.12.

However, it might trap users who are not aware of it, which is why we
even have this discussion. It's not an easy question.

> [1] or better, when launching a guest with cache!=unsafe start with an 
> fsync to make sure it's on disk.

Hm, that's an interesting idea, never thought of that. Might make sense.

Kevin



[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Anthony Liguori

On 08/24/2010 05:40 AM, Kevin Wolf wrote:

This reverts commit 8b3b720620a1137a1b794fc3ed64734236f94e06.

This fix has caused severe slowdowns on recent kernels that actually do flush
when they are told so. Reverting this patch hurts correctness and means that we
could get corrupted images in case of a host crash. This means that qcow2 might
not be an option for some people without this fix. On the other hand, I get
reports that the slowdown is so massive that not reverting it would mean that
people can't use it either because it just takes ages to complete stuff. It
probably can be fixed, but not in time for 0.13.0.

Usually, if there's a possible tradeoff between correctness and performance, I
tend to choose correctness, but I'm not so sure in this case. I'm not sure with
reverting either, which is why I post this as an RFC only.

I hope to get some more comments on how to proceed here for 0.13.
   


How fundamental of an issue is this?  Is this something we think we know 
how to fix and we just don't think there's time to fix it for 0.13?


And with this fix in place, how much confidence do we have in qcow2 with 
respect to data integrity on power loss?


We've shipped every version of QEMU since qcow2 was introduced with 
known data corruptions.  It sucks big time.  I think it's either that 
building an image format is a really hard problem akin to making a file 
system and we shouldn't be in that business or that qcow2 is bad as an 
image format which makes this all harder than it should be.


Either way, I think we need to look seriously at enabling alternatives 
(i.e. external snapshots, simple sparse files, etc.).


Regards,

Anthony Liguori


Kevin
   





[Qemu-devel] Re: KVM call agenda for August 24

2010-08-24 Thread Anthony Liguori

On 08/23/2010 05:30 PM, Chris Wright wrote:

Please send in any agenda items you are interested in covering.
   


There are quite a few important discussions on the list but I think they 
should stay on the list right now.


So it sounds like we don't have an agenda for today.

Regards,

Anthony Liguori


thanks,
-chris
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
   





[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Kevin Wolf
Am 24.08.2010 15:01, schrieb Anthony Liguori:
> On 08/24/2010 05:40 AM, Kevin Wolf wrote:
>> This reverts commit 8b3b720620a1137a1b794fc3ed64734236f94e06.
>>
>> This fix has caused severe slowdowns on recent kernels that actually do flush
>> when they are told so. Reverting this patch hurts correctness and means that 
>> we
>> could get corrupted images in case of a host crash. This means that qcow2 
>> might
>> not be an option for some people without this fix. On the other hand, I get
>> reports that the slowdown is so massive that not reverting it would mean that
>> people can't use it either because it just takes ages to complete stuff. It
>> probably can be fixed, but not in time for 0.13.0.
>>
>> Usually, if there's a possible tradeoff between correctness and performance, 
>> I
>> tend to choose correctness, but I'm not so sure in this case. I'm not sure 
>> with
>> reverting either, which is why I post this as an RFC only.
>>
>> I hope to get some more comments on how to proceed here for 0.13.
>>
> 
> How fundamental of an issue is this?  Is this something we think we know 
> how to fix and we just don't think there's time to fix it for 0.13?

I think we can improve things basically by trying to batch metadata
writes and do them in parallel while already processing the next requests.

I'm not sure what the numbers are going to look like with something like
this in place, I need to try it. It's definitely not something that I
want to go into 0.13 at this point.

> And with this fix in place, how much confidence do we have in qcow2 with 
> respect to data integrity on power loss?

How do you measure confidence? :-)

There are power failure tests and I don't have any bugs open to that
respect. I'm not sure how intensively it's tested, though.

> We've shipped every version of QEMU since qcow2 was introduced with 
> known data corruptions.  It sucks big time.  I think it's either that 
> building an image format is a really hard problem akin to making a file 
> system and we shouldn't be in that business or that qcow2 is bad as an 
> image format which makes this all harder than it should be.

I tend to say that it's just hard to get right. Most of the problems
that were fixed in qcow2 over the last year are probably present in our
VMDK implementation as well, just to pick one example.

Kevin



[Qemu-devel] [PATCH 1/4] PCI: define PCIBUS_MAX_DEVICES and PCI_FUNCTIONS_PER_DEVICE in pci.h

2010-08-24 Thread Ken CC
And update the max function number used in struct PCIBus{} to
PCIBUS_MAX_FUNCTIONS = PCI_FUNCTIONS_PER_DEVICE * PCIBUS_MAX_DEVICES

TODO:
according to Avi Kivity, PCIBus.devices[] should be renamed to functions[]

Signed-off-by: Ken CC 
---
 hw/pci.c |2 +-
 hw/pci.h |2 ++
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 70dbace..9234fe3 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -47,7 +47,7 @@ struct PCIBus {
 pci_hotplug_fn hotplug;
 DeviceState *hotplug_qdev;
 void *irq_opaque;
-PCIDevice *devices[256];
+PCIDevice *devices[PCIBUS_MAX_DEVICES * PCI_FUNCTIONS_PER_DEVICE];
 PCIDevice *parent_dev;
 target_phys_addr_t mem_base;
 
diff --git a/hw/pci.h b/hw/pci.h
index ccb99d0..eb97b76 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -17,6 +17,8 @@ struct kvm_irq_routing_entry;
 #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
 #define PCI_FUNC(devfn) ((devfn) & 0x07)
 #define PCI_FUNC_MAX8
+#define PCI_FUNCTIONS_PER_DEVICE 8
+#define PCIBUS_MAX_DEVICES 32
 
 /* Class, Vendor and Device IDs from Linux's pci_ids.h */
 #include "pci_ids.h"




[Qemu-devel] [PATCH 2/4] pci init: fail qemu if devfn exceeding the max function number supported on bus

2010-08-24 Thread Ken CC
Check if devfn < PCIBUS_MAX_DEVICES * PCI_FUNCTIONS_PER_DEVICE

Signed-off-by: Ken CC 
---
 hw/pci.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 9234fe3..fc4becd 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -747,6 +747,7 @@ static PCIDevice *do_pci_register_device(PCIDevice 
*pci_dev, PCIBus *bus,
  PCIConfigWriteFunc *config_write,
  bool is_bridge)
 {
+assert(devfn / PCI_FUNCTIONS_PER_DEVICE < PCIBUS_MAX_DEVICES);
 if (devfn < 0) {
 for(devfn = bus->devfn_min ; devfn < ARRAY_SIZE(bus->devices);
 devfn += PCI_FUNC_MAX) {




[Qemu-devel] [PATCH 3/4] Check pci slot number against PCIBUS_MAX_DEVICES in parse_pci_devfn

2010-08-24 Thread Ken CC
If pci addr provided from command line is bigger than 32,
PCIBUS_MAX_DEVICES, return error -EINVAL.

32 << 3 | 7 == 256 (PCIBUS_MAX_FUNCTIONS)
PCIBUS_MAX_FUNCTIONS = PCIBUS_MAX_DEVICES * PCI_FUNCTIONS_PER_DEVICE

Signed-off-by: Ken CC 
---
 hw/qdev-properties.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index 9219cd7..565fd08 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -1,5 +1,5 @@
 #include "net.h"
-#include "qdev.h"
+#include "pci.h"
 #include "qerror.h"
 
 void *qdev_get_prop_ptr(DeviceState *dev, Property *prop)
@@ -514,6 +514,8 @@ static int parse_pci_devfn(DeviceState *dev, Property 
*prop, const char *str)
 return -EINVAL;
 }
 }
+if (slot >= PCIBUS_MAX_DEVICES)
+return -EINVAL;
 if (str[n] != '\0')
 return -EINVAL;
 if (fn > 7)




[Qemu-devel] [PATCH 4/4] Rename PCI_FUNC_MAX to PCI_FUNCTIONS_PER_DEVICES in pci.[ch]

2010-08-24 Thread Ken CC
PCI_FUNC_MAX is introduced by
6eab3de16d36c48a983366b09d0a0029a5260bc3
and
6fa84913eccec4266a27c81ae88465f6790742b9
which should be safe to rename to PCI_FUNCTIONS_PER_DEVICES.

Signed-off-by: Ken CC 
---
 hw/pci.c |4 ++--
 hw/pci.h |1 -
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index fc4becd..3901455 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -711,7 +711,7 @@ static int pci_init_multifunction(PCIBus *bus, PCIDevice 
*dev)
 return 0;
 }
 /* function 0 indicates single function, so function > 0 must be NULL */
-for (func = 1; func < PCI_FUNC_MAX; ++func) {
+for (func = 1; func < PCI_FUNCTIONS_PER_DEVICE; ++func) {
 if (bus->devices[PCI_DEVFN(slot, func)]) {
 error_report("PCI: %x.0 indicates single function, "
  "but %x.%x is already populated.",
@@ -750,7 +750,7 @@ static PCIDevice *do_pci_register_device(PCIDevice 
*pci_dev, PCIBus *bus,
 assert(devfn / PCI_FUNCTIONS_PER_DEVICE < PCIBUS_MAX_DEVICES);
 if (devfn < 0) {
 for(devfn = bus->devfn_min ; devfn < ARRAY_SIZE(bus->devices);
-devfn += PCI_FUNC_MAX) {
+devfn += PCI_FUNCTIONS_PER_DEVICE) {
 if (!bus->devices[devfn])
 goto found;
 }
diff --git a/hw/pci.h b/hw/pci.h
index eb97b76..f6fb6d8 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -16,7 +16,6 @@ struct kvm_irq_routing_entry;
 #define PCI_DEVFN(slot, func)   slot) & 0x1f) << 3) | ((func) & 0x07))
 #define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
 #define PCI_FUNC(devfn) ((devfn) & 0x07)
-#define PCI_FUNC_MAX8
 #define PCI_FUNCTIONS_PER_DEVICE 8
 #define PCIBUS_MAX_DEVICES 32
 




[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Anthony Liguori

On 08/24/2010 08:16 AM, Kevin Wolf wrote:

Am 24.08.2010 15:01, schrieb Anthony Liguori:
   

On 08/24/2010 05:40 AM, Kevin Wolf wrote:
 

This reverts commit 8b3b720620a1137a1b794fc3ed64734236f94e06.

This fix has caused severe slowdowns on recent kernels that actually do flush
when they are told so. Reverting this patch hurts correctness and means that we
could get corrupted images in case of a host crash. This means that qcow2 might
not be an option for some people without this fix. On the other hand, I get
reports that the slowdown is so massive that not reverting it would mean that
people can't use it either because it just takes ages to complete stuff. It
probably can be fixed, but not in time for 0.13.0.

Usually, if there's a possible tradeoff between correctness and performance, I
tend to choose correctness, but I'm not so sure in this case. I'm not sure with
reverting either, which is why I post this as an RFC only.

I hope to get some more comments on how to proceed here for 0.13.

   

How fundamental of an issue is this?  Is this something we think we know
how to fix and we just don't think there's time to fix it for 0.13?
 

I think we can improve things basically by trying to batch metadata
writes and do them in parallel while already processing the next requests.

I'm not sure what the numbers are going to look like with something like
this in place, I need to try it. It's definitely not something that I
want to go into 0.13 at this point.
   


I'm not sure this patch is needed in the first place.

If you have a sequence of operations like:

0) receive guest write request Z
1) submit write A
2) write A completes
3) submit write B
4) write B completes
5) report guest write Z complete

You're adding a:

4.5) sync write B

Which is ultimately unnecessary if what you care about is avoiding 
reordering of step (2) and (4).  When a write() request completes, 
you're guaranteed that a subsequent read() request will return the 
written data.  That's always true.  If I could do a write(A) followed by 
a write(B) and then read()=A, no software would actually function correctly.


It's important to make sure that you don't get image corruption if (2) 
happens but not (4).  But I think that's okay in qcow2 today.


Regards,

Anthony Liguori


And with this fix in place, how much confidence do we have in qcow2 with
respect to data integrity on power loss?
 

How do you measure confidence? :-)

There are power failure tests and I don't have any bugs open to that
respect. I'm not sure how intensively it's tested, though.

   

We've shipped every version of QEMU since qcow2 was introduced with
known data corruptions.  It sucks big time.  I think it's either that
building an image format is a really hard problem akin to making a file
system and we shouldn't be in that business or that qcow2 is bad as an
image format which makes this all harder than it should be.
 

I tend to say that it's just hard to get right. Most of the problems
that were fixed in qcow2 over the last year are probably present in our
VMDK implementation as well, just to pick one example.

Kevin
   





[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Avi Kivity

 On 08/24/2010 04:29 PM, Anthony Liguori wrote:


I'm not sure this patch is needed in the first place.

If you have a sequence of operations like:

0) receive guest write request Z
1) submit write A
2) write A completes
3) submit write B
4) write B completes
5) report guest write Z complete

You're adding a:

4.5) sync write B

Which is ultimately unnecessary if what you care about is avoiding 
reordering of step (2) and (4).  When a write() request completes, 
you're guaranteed that a subsequent read() request will return the 
written data.  That's always true.  If I could do a write(A) followed 
by a write(B) and then read()=A, no software would actually function 
correctly.


It's important to make sure that you don't get image corruption if (2) 
happens but not (4).  But I think that's okay in qcow2 today.


It's about metadata writes.  If an operation changes metadata, we must 
sync it to disk before writing any data or other metadata which depends 
on it, regardless of any promises to the guest.


--
error compiling committee.c: too many arguments to function




[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Anthony Liguori

On 08/24/2010 08:31 AM, Avi Kivity wrote:

 On 08/24/2010 04:29 PM, Anthony Liguori wrote:


I'm not sure this patch is needed in the first place.

If you have a sequence of operations like:

0) receive guest write request Z
1) submit write A
2) write A completes
3) submit write B
4) write B completes
5) report guest write Z complete

You're adding a:

4.5) sync write B

Which is ultimately unnecessary if what you care about is avoiding 
reordering of step (2) and (4).  When a write() request completes, 
you're guaranteed that a subsequent read() request will return the 
written data.  That's always true.  If I could do a write(A) followed 
by a write(B) and then read()=A, no software would actually function 
correctly.


It's important to make sure that you don't get image corruption if 
(2) happens but not (4).  But I think that's okay in qcow2 today.


It's about metadata writes.  If an operation changes metadata, we must 
sync it to disk before writing any data or other metadata which 
depends on it, regardless of any promises to the guest.


Why?  If the metadata isn't sync, we loose the write.

But that can happen anyway because we're not sync'ing the data

We need to sync the metadata in the event of a guest initiated flush, 
but we shouldn't need to for a normal write.


Regards,

Anthony Liguori




[Qemu-devel] [Bug 621950] Re: qemu not able to run 64 bit OS when -enable-kvm is used with a 64 bit processor

2010-08-24 Thread bhasker
oops
I made a mistake
there were old binaries left out.
It is working fine now.


** Changed in: qemu
   Status: New => Invalid

-- 
qemu not able to run 64 bit OS when -enable-kvm is used with a 64 bit processor
https://bugs.launchpad.net/bugs/621950
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.

Status in QEMU: Invalid

Bug description:
Hi,

 The host processor is a 64 bit processor (as below)

  When I run the emulated mode of 64 bit (full emulation) I can run the 64 bit 
OS fine. when I start using -enable-kvm, I get error message that 

"Your CPU does not support long mode. Use a 32bit distribution."

trying to install SuSe linux enterprise server 64 bit 

Information 
--
--
processor cat /proc/cpuinfo 

processor   : 3
vendor_id   : GenuineIntel
cpu family  : 6
model   : 15
model name  : Intel(R) Core(TM)2 Quad CPU   @ 2.66GHz
stepping: 7
cpu MHz : 2671.246
cache size  : 4096 KB
physical id : 0
siblings: 4
core id : 3
cpu cores   : 4
apicid  : 3
initial apicid  : 3
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 10
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm constant_tsc 
arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 
cx16 xtpr pdcm lahf_lm tpr_shadow
bogomips: 5342.46
clflush size: 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

--
command line used

qemu-system-x86_64 -enable-kvm -cdrom /img/cd.iso 


why is kvm not able to use 64 bit capability in a host which is a 64 bit 
processor ?





[Qemu-devel] KVM call cancelled [was: Re: KVM call agenda for August 24]

2010-08-24 Thread Chris Wright
* Anthony Liguori (anth...@codemonkey.ws) wrote:
> On 08/23/2010 05:30 PM, Chris Wright wrote:
> >Please send in any agenda items you are interested in covering.
> 
> There are quite a few important discussions on the list but I think
> they should stay on the list right now.
> 
> So it sounds like we don't have an agenda for today.

No agenda, so this week's call is cancelled.

thanks,
-chris



[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Avi Kivity

 On 08/24/2010 04:35 PM, Anthony Liguori wrote:
It's about metadata writes.  If an operation changes metadata, we 
must sync it to disk before writing any data or other metadata which 
depends on it, regardless of any promises to the guest.



Why?  If the metadata isn't sync, we loose the write.

But that can happen anyway because we're not sync'ing the data

We need to sync the metadata in the event of a guest initiated flush, 
but we shouldn't need to for a normal write.


1. Allocate a cluster (increase refcount table)

2. Link cluster to L2 table

3. Second operation makes it to disk; first still in pagecache

4. Crash

5. Dangling pointer from L2 to freed cluster

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH 4/5] Add generic drive hotplugging

2010-08-24 Thread Alexander Graf
Daniel P. Berrange wrote:
> On Tue, Aug 24, 2010 at 12:45:19PM +0200, Alexander Graf wrote:
>   
>> Daniel P. Berrange wrote:
>> 
>>> On Tue, Aug 24, 2010 at 12:02:30AM +0200, Alexander Graf wrote:
>>>   
>>>   
 The monitor command for hotplugging is in i386 specific code. This is just
 plain wrong, as S390 just learned how to do hotplugging too and needs to
 get drives for that.

 So let's add a generic copy to generic code that handles drive_add in a
 way that doesn't have pci dependencies.

 I'm not fully happy with the patch as is. IMHO there should only be a
 single target agnostic drive_hot_add function available. How we could
 potentially fit IF_SCSI in there I don't know though.
 
 
>>> I'm not sure that this patch is actually neccessary. Via a undocumented,
>>> sick, dirty hack, you can already use the current drive_add command
>>> without a PCI address, for both virtio + scsi. In fact not using the
>>> PCI address with drive_add is the preferred approach in the new qdev
>>> world even on x86
>>>   
>>>   
>> It is certainly necessary since the current code is in a big fat #if
>> defined(TARGET_I386) block :).
>> 
>
> True, true, killing the #ifdef is needed :-)
>   

And moving it out of the pci specific file. Yeah :). Basically my
proposal was to take the patches as is and phase out the pci-hotplug.c
version.

>   
>>> The key is that you should use  if=none for all cases. Here are two
>>> examples of how libvirt does it currently:
>>>
>>> VirtIO:
>>>
>>>   drive_add dummy 
>>> file=/var/lib/libvirt/images/data.img,if=none,id=drive-virtio-disk1,format=raw
>>>   device_add 
>>> virtio-blk-pci,bus=pci.0,addr=0x0,drive=drive-virtio-disk1,id=virtio-disk1'
>>>
>>> SCSI:
>>>
>>>   drive_add dummy 
>>> file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw'
>>>   device_add 
>>> scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1
>>>
>>> The 'dummy' value there can be absolutely anything you want.
>>> It is totaly ignored when QEMU sees if=none in 2nd arg.
>>>   
>>>   
>> I'd be all for removing the pci-hotplug.c version of drive_add then. But
>> I think the IF_SCSI option there is to append a drive to an existing
>> SCSI bus, no?
>> 
>
> Actually this SCSI example I give above is appending a drive to an existing
> bus (scsi0), in slot 1 (scsi-id=1).  To best of my knowledge there is no
> remaining use case that requires use of IF_SCSI, IF_IDE, etc. The IF_NONE
> approach can cope with all, modulo bugs that appear periodically with code
> that mistakenly checks for a particular IF_XXX constant.
>
> If you wanted to also create a new SCSI bus, before creating the drive on
> it, you'd need to run three commands in total:
>
>   device_add lsi,id=scsi0,bus=pci.0,addr=0x7
>   drive_add dummy 
> file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw
>   device_add 
> scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1
>   

Nice - so we can just deprecate if=!none?


Alex




[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Anthony Liguori

On 08/24/2010 08:39 AM, Avi Kivity wrote:

 On 08/24/2010 04:35 PM, Anthony Liguori wrote:
It's about metadata writes.  If an operation changes metadata, we 
must sync it to disk before writing any data or other metadata which 
depends on it, regardless of any promises to the guest.



Why?  If the metadata isn't sync, we loose the write.

But that can happen anyway because we're not sync'ing the data

We need to sync the metadata in the event of a guest initiated flush, 
but we shouldn't need to for a normal write.


1. Allocate a cluster (increase refcount table)

2. Link cluster to L2 table

3. Second operation makes it to disk; first still in pagecache

4. Crash

5. Dangling pointer from L2 to freed cluster


Yes, having this discussion in IRC.

The problem is that we maintain a refcount table.  If we didn't do 
internal disk snapshots, we wouldn't have this problem.  IOW, VMDK 
doesn't have this problem so the answer to my very first question is 
that qcow2 is too difficult a format to get right.


Regards,

Anthony Liguori




[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Avi Kivity

 On 08/24/2010 04:40 PM, Anthony Liguori wrote:

1. Allocate a cluster (increase refcount table)

2. Link cluster to L2 table

3. Second operation makes it to disk; first still in pagecache

4. Crash

5. Dangling pointer from L2 to freed cluster



Yes, having this discussion in IRC.

The problem is that we maintain a refcount table. 


Are you sure that's the only issue?

If we didn't do internal disk snapshots, we wouldn't have this 
problem.  IOW, VMDK doesn't have this problem so the answer to my very 
first question is that qcow2 is too difficult a format to get right.


One doesn't follow from the other (though I'm no fan of internal 
snapshots, myself).


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH 4/5] Add generic drive hotplugging

2010-08-24 Thread Alexander Graf
Daniel P. Berrange wrote:
> On Tue, Aug 24, 2010 at 03:40:25PM +0200, Alexander Graf wrote:
>   
>> Daniel P. Berrange wrote:
>> 
>>> On Tue, Aug 24, 2010 at 12:45:19PM +0200, Alexander Graf wrote:
>>>   
> The key is that you should use  if=none for all cases. Here are two
> examples of how libvirt does it currently:
>
> VirtIO:
>
>   drive_add dummy 
> file=/var/lib/libvirt/images/data.img,if=none,id=drive-virtio-disk1,format=raw
>   device_add 
> virtio-blk-pci,bus=pci.0,addr=0x0,drive=drive-virtio-disk1,id=virtio-disk1'
>
> SCSI:
>
>   drive_add dummy 
> file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw'
>   device_add 
> scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1
>
> The 'dummy' value there can be absolutely anything you want.
> It is totaly ignored when QEMU sees if=none in 2nd arg.
>   
>   
>   
 I'd be all for removing the pci-hotplug.c version of drive_add then. But
 I think the IF_SCSI option there is to append a drive to an existing
 SCSI bus, no?
 
 
>>> Actually this SCSI example I give above is appending a drive to an existing
>>> bus (scsi0), in slot 1 (scsi-id=1).  To best of my knowledge there is no
>>> remaining use case that requires use of IF_SCSI, IF_IDE, etc. The IF_NONE
>>> approach can cope with all, modulo bugs that appear periodically with code
>>> that mistakenly checks for a particular IF_XXX constant.
>>>
>>> If you wanted to also create a new SCSI bus, before creating the drive on
>>> it, you'd need to run three commands in total:
>>>
>>>   device_add lsi,id=scsi0,bus=pci.0,addr=0x7
>>>   drive_add dummy 
>>> file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw
>>>   device_add 
>>> scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1
>>>   
>>>   
>> Nice - so we can just deprecate if=!none?
>> 
>
> In theory yes, but its not nice to tell users to switch everything over to
> use if=none, if we're going to deprecate that too in the next release when
> blockdev appears. Might as well just deprecate entire of drive_add/-drive
> at once.
>   

I guess I still fail to see the reason for blockdev when we force
drive_add to if=none...


Alex




Re: [Qemu-devel] [PATCH 4/5] Add generic drive hotplugging

2010-08-24 Thread Daniel P. Berrange
On Tue, Aug 24, 2010 at 03:40:25PM +0200, Alexander Graf wrote:
> Daniel P. Berrange wrote:
> > On Tue, Aug 24, 2010 at 12:45:19PM +0200, Alexander Graf wrote:
> >   
> >> Daniel P. Berrange wrote:
> >> 
> >>> On Tue, Aug 24, 2010 at 12:02:30AM +0200, Alexander Graf wrote:
> >>>   
> >>>   
>  The monitor command for hotplugging is in i386 specific code. This is 
>  just
>  plain wrong, as S390 just learned how to do hotplugging too and needs to
>  get drives for that.
> 
>  So let's add a generic copy to generic code that handles drive_add in a
>  way that doesn't have pci dependencies.
> 
>  I'm not fully happy with the patch as is. IMHO there should only be a
>  single target agnostic drive_hot_add function available. How we could
>  potentially fit IF_SCSI in there I don't know though.
>  
>  
> >>> I'm not sure that this patch is actually neccessary. Via a undocumented,
> >>> sick, dirty hack, you can already use the current drive_add command
> >>> without a PCI address, for both virtio + scsi. In fact not using the
> >>> PCI address with drive_add is the preferred approach in the new qdev
> >>> world even on x86
> >>>   
> >>>   
> >> It is certainly necessary since the current code is in a big fat #if
> >> defined(TARGET_I386) block :).
> >> 
> >
> > True, true, killing the #ifdef is needed :-)
> >   
> 
> And moving it out of the pci specific file. Yeah :). Basically my
> proposal was to take the patches as is and phase out the pci-hotplug.c
> version.
> 
> >   
> >>> The key is that you should use  if=none for all cases. Here are two
> >>> examples of how libvirt does it currently:
> >>>
> >>> VirtIO:
> >>>
> >>>   drive_add dummy 
> >>> file=/var/lib/libvirt/images/data.img,if=none,id=drive-virtio-disk1,format=raw
> >>>   device_add 
> >>> virtio-blk-pci,bus=pci.0,addr=0x0,drive=drive-virtio-disk1,id=virtio-disk1'
> >>>
> >>> SCSI:
> >>>
> >>>   drive_add dummy 
> >>> file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw'
> >>>   device_add 
> >>> scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1
> >>>
> >>> The 'dummy' value there can be absolutely anything you want.
> >>> It is totaly ignored when QEMU sees if=none in 2nd arg.
> >>>   
> >>>   
> >> I'd be all for removing the pci-hotplug.c version of drive_add then. But
> >> I think the IF_SCSI option there is to append a drive to an existing
> >> SCSI bus, no?
> >> 
> >
> > Actually this SCSI example I give above is appending a drive to an existing
> > bus (scsi0), in slot 1 (scsi-id=1).  To best of my knowledge there is no
> > remaining use case that requires use of IF_SCSI, IF_IDE, etc. The IF_NONE
> > approach can cope with all, modulo bugs that appear periodically with code
> > that mistakenly checks for a particular IF_XXX constant.
> >
> > If you wanted to also create a new SCSI bus, before creating the drive on
> > it, you'd need to run three commands in total:
> >
> >   device_add lsi,id=scsi0,bus=pci.0,addr=0x7
> >   drive_add dummy 
> > file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw
> >   device_add 
> > scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1
> >   
> 
> Nice - so we can just deprecate if=!none?

In theory yes, but its not nice to tell users to switch everything over to
use if=none, if we're going to deprecate that too in the next release when
blockdev appears. Might as well just deprecate entire of drive_add/-drive
at once.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London-o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org-o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|



Re: [Qemu-devel] [PATCH 4/5] Add generic drive hotplugging

2010-08-24 Thread Daniel P. Berrange
On Tue, Aug 24, 2010 at 03:46:14PM +0200, Alexander Graf wrote:
> Daniel P. Berrange wrote:
> > On Tue, Aug 24, 2010 at 03:40:25PM +0200, Alexander Graf wrote:
> >   
> >> Daniel P. Berrange wrote:
> >> 
> >>> On Tue, Aug 24, 2010 at 12:45:19PM +0200, Alexander Graf wrote:
> >>>   
> > The key is that you should use  if=none for all cases. Here are two
> > examples of how libvirt does it currently:
> >
> > VirtIO:
> >
> >   drive_add dummy 
> > file=/var/lib/libvirt/images/data.img,if=none,id=drive-virtio-disk1,format=raw
> >   device_add 
> > virtio-blk-pci,bus=pci.0,addr=0x0,drive=drive-virtio-disk1,id=virtio-disk1'
> >
> > SCSI:
> >
> >   drive_add dummy 
> > file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw'
> >   device_add 
> > scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1
> >
> > The 'dummy' value there can be absolutely anything you want.
> > It is totaly ignored when QEMU sees if=none in 2nd arg.
> >   
> >   
> >   
>  I'd be all for removing the pci-hotplug.c version of drive_add then. But
>  I think the IF_SCSI option there is to append a drive to an existing
>  SCSI bus, no?
>  
>  
> >>> Actually this SCSI example I give above is appending a drive to an 
> >>> existing
> >>> bus (scsi0), in slot 1 (scsi-id=1).  To best of my knowledge there is no
> >>> remaining use case that requires use of IF_SCSI, IF_IDE, etc. The IF_NONE
> >>> approach can cope with all, modulo bugs that appear periodically with code
> >>> that mistakenly checks for a particular IF_XXX constant.
> >>>
> >>> If you wanted to also create a new SCSI bus, before creating the drive on
> >>> it, you'd need to run three commands in total:
> >>>
> >>>   device_add lsi,id=scsi0,bus=pci.0,addr=0x7
> >>>   drive_add dummy 
> >>> file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw
> >>>   device_add 
> >>> scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1
> >>>   
> >>>   
> >> Nice - so we can just deprecate if=!none?
> >> 
> >
> > In theory yes, but its not nice to tell users to switch everything over to
> > use if=none, if we're going to deprecate that too in the next release when
> > blockdev appears. Might as well just deprecate entire of drive_add/-drive
> > at once.
> >   
> 
> I guess I still fail to see the reason for blockdev when we force
> drive_add to if=none...

Markus can no doubt explain better than me, but off the top of my head
 
 - 'drive' has guest properties that should be against the device
   not the drive which is for host properties (eg serial=, if=)
 - 'file' is mangled to include protocol/format which means that 
   it can't be unambigously parsed. (eg filenames containing :)

Fixing those, particularly the latter, would breaks back-compat so we
really need a new arg with sensible definition. This is what blockdev
is intended todo (as well as a internal code cleanup)

Regards,
Daniel
-- 
|: Red Hat, Engineering, London-o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org-o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|



[Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"

2010-08-24 Thread Anthony Liguori

On 08/24/2010 08:44 AM, Avi Kivity wrote:

 On 08/24/2010 04:40 PM, Anthony Liguori wrote:

1. Allocate a cluster (increase refcount table)

2. Link cluster to L2 table

3. Second operation makes it to disk; first still in pagecache

4. Crash

5. Dangling pointer from L2 to freed cluster



Yes, having this discussion in IRC.

The problem is that we maintain a refcount table. 


Are you sure that's the only issue?


No.

If we didn't do internal disk snapshots, we wouldn't have this 
problem.  IOW, VMDK doesn't have this problem so the answer to my 
very first question is that qcow2 is too difficult a format to get 
right.


One doesn't follow from the other (though I'm no fan of internal 
snapshots, myself).


It does.  Let's consider the failure scenarios:

1) guest submits write request
2) allocate extent
3) write data to disk (a)
4) write (a) completes
5) update reference count table for new extent (b)
6) write (b) completes
7) write extent table (c)
8) write (c) completes
9) complete guest write request

If this all happened in order and we lost power, the worst case error is 
that we leak a block which isn't terrible.


But we're not guaranteed that this happens in order.

If (b) or (c) happen before (a), then the image is not corrupted but 
data gets lost.  That's okay because it's part of the guest contract.


If (c) happens before (b), then we've created an extent that's attached 
to a table with a zero reference count.  This is a corrupt image.


Let's consider if we eliminate the reference count table which means 
eliminating internal snapshots.


1) guest submits write request
2) allocate extent
3) write data to disk (a)
4) write (a) completes
5) write extent table (c)
6) write (c) completes
7) complete guest write request

If this all happens in order and we lose power, we just leak a block.  
It means we need a periodic fsck.


If (c) completes before (a), then it means that the image is not 
corrupted but data gets lost.  This is okay based on the guest contract.


And that's it.  There is no scenario where the disk is corrupted.

So in summary, both situations are not perfect, but scenario (1) can 
result in a corrupted image whereas scenario (2) results in leakage.  
The classic solution to this is fsck.


Regards,

Anthony Liguori



[Qemu-devel] Triburile Online - Tribal-Speed

2010-08-24 Thread Triburile Online - Tribal-Speed
Triburile Speed REGISTER



[Qemu-devel] Build error - target i386-linux-user on RHEL4.8 x86_64

2010-08-24 Thread C K Kashyap
Hi,
I tried to build qemu's user mode linux emulation on RHEL 4.8 x86_64 -
but I run into this issue -
  ARi386-linux-user/libqemu.a
  LINK  i386-linux-user/qemu-i386
/usr/bin/ld:/home/ckk/lab/qemu-0.12.5/x86_64.ld:41: syntax error
collect2: ld returned 1 exit status
make[1]: *** [qemu-i386] Error 1
make: *** [subdir-i386-linux-user] Error 2
[...@windowher-dr qemu-0.12.5]$


This was from qemu-0.12.5.

I am however, able to build qemu-0.11.1 correctly.

I'd really appreciate it if someone could help me with the build error.

-- 
Regards,
Kashyap



Re: [Qemu-devel] Re: Unusual physical address when using 64-bit BAR

2010-08-24 Thread Cam Macdonell
On Tue, Jul 20, 2010 at 9:49 PM, Isaku Yamahata  wrote:
> Added Cc: seab...@seabios.org
>
> On Wed, Jul 21, 2010 at 06:31:01AM +0300, Michael S. Tsirkin wrote:
>> On Tue, Jul 20, 2010 at 06:52:23PM +0900, Isaku Yamahata wrote:
>> > On Wed, Jul 14, 2010 at 09:10:28AM -0600, Cam Macdonell wrote:
>> > > On Tue, Jul 13, 2010 at 8:52 PM, Isaku Yamahata  
>> > > wrote:
>> > > > On Tue, Jul 13, 2010 at 04:48:19PM -0600, Cam Macdonell wrote:
>> > > >> On Tue, Jul 13, 2010 at 2:41 PM, Isaku Yamahata 
>> > > >>  wrote:
>> > > >> > On Tue, Jul 13, 2010 at 02:05:51PM -0600, Cam Macdonell wrote:
>> > > >> >> >> > Seabios completely ignore the 64-bitness of the BAR. ?Looks 
>> > > >> >> >> > like it also
>> > > >> >> >> > thinks the second half of the BAR is an I/O region instead of 
>> > > >> >> >> > memory (hence
>> > > >> >> >> > the c200, that's part of the pci portio region.
>> > > >> >> >
>> > > >> >> > I've sent the patches to address it. But they haven't been 
>> > > >> >> > merged yet.
>> > > >> >> > seabios doesn't map BARs beyond 4GB.
>> > > >> >> > If bar is mapped beyond 4GB, guest BIOS does it.
>> > > >> >>
>> > > >> >> Have those patches been merged yet?
>> > > >> >
>> > > >> > They have been merged into seabios upstream now.
>> > > >> > qemu seabios fork hasn't pulled for a while, though.
>> > > >> >
>> > > >> >
>> > > >> >> > To see how seabios works, it would help to increase 
>> > > >> >> > CONFIG_DEBUG_LEVEL
>> > > >> >> > in config.h of seabios
>> > > >> >>
>> > > >> >> Where does the output from seabios end up? ?Inside dmesg?
>> > > >> >
>> > > >> > It outputs them to the serial console which qemu emulates.
>> > > >> > seabios is out of kernel control, so dmesg doesn't show it.
>> > > >> >
>> > > >> >
>> > > >> >> >> pci_read_config: (val) 0x0 <- 0x1c (addr)
>> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > > >> >> >> pci_read_config: (val) 0x <- 0x1c (addr)
>> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > > >> >> >> pci_read_config: (val) 0x0 <- 0x1c (addr)
>> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > > >> >> >
>> > > >> >> > seabios BAR3. Not sure how it is mapped from this
>> > > >> >> > message.
>> > > >> >>
>> > > >> >> Isn't the BAR3 from the fact that a 64-bit BAR would use both BAR2 
>> > > >> >> and
>> > > >> >> BAR3 to store all 64-bits?
>> > > >> >
>> > > >> > Yes. Seabios misbehaves. 64bit bar is(was) a missing feature.
>> > > >> > --
>> > > >> > yamahata
>> > > >> >
>> > > >> >
>> > > >>
>> > > >> With the latest seabios git passed via -bios, I no longer see the
>> > > >> 48-bit address, but instead a 32-bit address and then
>> > > >> . ?This guest has 1gb of RAM so the address isn't be
>> > > >> mapped beyond 4g.
>> > > >
>> > > > Can I see the debug log like before?
>> > > > (hopefully seabios with CONFIG_DEBUG_LEVEL enabled.)
>> > >
>> > > Here's the dump from SeaBIOS in the region related to the PCI devices.
>> > >  The SeaBIOS output is identical whether the BAR is 32-bit or 64-bit.
>> > >
>> > > PCI: bus=0 devfn=0x10: vendor_id=0x1013 device_id=0x00b8
>> > > region 0: 0xf000
>> > > region 1: 0xf200
>> > > region 6: 0xf201
>> > > PCI: bus=0 devfn=0x18: vendor_id=0x1af4 device_id=0x1000
>> > > region 0: 0xc020
>> > > region 1: 0xf202
>> > > region 6: 0xf203
>> > > PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110
>> > > region 0: 0xf204
>> > > region 1: 0xf2041000
>> > > region 2: 0x
>> >
>> > Is this region (region 2 of devfn=0x20: vendor_id=0x1af4 device_id=0x1110)
>> > the BAR in quistion?
>> > The value 0 seems odd. Probably BAR address calculation overflowed.
>> > Currently seabios doesn't check overflow. I attached the patch.
>> >
>> >
>> > > > Do you know who sets the BAR to ?
>> > >
>> > > Here are the config reads/writes related to the 0x18/1c, the 'IVSHMEM'
>> > > lines are from the map function passed to pci_register_bar().  It
>> > > looks like SeaBIOS sets the address to 0 and then the potentially
>> > > useful e000 address gets mangled into 00.
>> >
>> > There is something wrong with the debug message of write case, I suppose.
>> > All written value are 0, but the resulted effect doesn't seems so.
>> >
>> > >
>> > > IVSHMEM: guest pci addr = 0, guest h/w addr = 1090912256, size = 
>> > > 536870912
>> > >
>> > > ...snip...
>> > >
>> > > pci_read_config: (val) 0x4 <- 0x18 (addr)
>> > > pci_write_config: (val) 0x0 -> 0x18 (addr)
>> > > IVSHMEM: guest pci addr = e000, guest h/w addr = 1090912256, size = 
>> > > 2000
>> >
>> > If 0 is written to 0x18, the bar address should be 0, but it says e000.
>> >
>> > > pci_read_config: (val) 0xe004 <- 0x18 (addr)
>> >
>> > The read value isn't 0. and so on...
>> >
>> > > pci_write_config: (val) 0x0 -> 0x18 (addr)
>> > > pci_read_config: (val) 0x0 <- 0x1c (addr)
>> > > pci_write_config: (val) 0x0 -> 0x1c (addr)
>> > > IVSHMEM: guest pci addr = , guest h/w ad

Re: [Qemu-devel] [PATCH 4/5] Add generic drive hotplugging

2010-08-24 Thread Anthony Liguori

On 08/24/2010 08:44 AM, Daniel P. Berrange wrote:



Actually this SCSI example I give above is appending a drive to an existing
bus (scsi0), in slot 1 (scsi-id=1).  To best of my knowledge there is no
remaining use case that requires use of IF_SCSI, IF_IDE, etc. The IF_NONE
approach can cope with all, modulo bugs that appear periodically with code
that mistakenly checks for a particular IF_XXX constant.

If you wanted to also create a new SCSI bus, before creating the drive on
it, you'd need to run three commands in total:

   device_add lsi,id=scsi0,bus=pci.0,addr=0x7
   drive_add dummy 
file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw
   device_add scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1

   

Nice - so we can just deprecate if=!none?
 

In theory yes, but its not nice to tell users to switch everything over to
use if=none, if we're going to deprecate that too in the next release when
blockdev appears. Might as well just deprecate entire of drive_add/-drive
at once.
   


I think what Alex is really asking is can we have 'blockdev_add 
var0=val0,var1=val1[,...]' implemented as 'drive_add dummy 
if=none,var0=val0,var1=val1[,...]'.  I don't know the answer to why that 
isn't possible or desirable.


Regards,

Anthony Liguori


Regards,
Daniel
   





[Qemu-devel] Re: Bug#592875: pxelinux: incompatible with qemu with kvm enabled

2010-08-24 Thread Vagrant Cascadian
it seems versions of pxelinux 4.00 and later hangs qemu (0.12.x, 0.13.x) when
network booting using etherboot or gPXE and qemu's kvm support is enabled.

pxelinux 3.86 and earlier do not appear to trigger the problem. i also didn't
experience the problem with qemu-kvm (formerly known as kvm). qemu without kvm
support enabled also seems to work fine.

for more details, please see:

  http://bugs.debian.org/592875
  http://bugs.debian.org/591934

thanks!

live well,
  vagrant



Re: [Qemu-devel] [PATCH 4/5] Add generic drive hotplugging

2010-08-24 Thread Alexander Graf

On 24.08.2010, at 20:35, Anthony Liguori wrote:

> On 08/24/2010 08:44 AM, Daniel P. Berrange wrote:
>> 
 Actually this SCSI example I give above is appending a drive to an existing
 bus (scsi0), in slot 1 (scsi-id=1).  To best of my knowledge there is no
 remaining use case that requires use of IF_SCSI, IF_IDE, etc. The IF_NONE
 approach can cope with all, modulo bugs that appear periodically with code
 that mistakenly checks for a particular IF_XXX constant.
 
 If you wanted to also create a new SCSI bus, before creating the drive on
 it, you'd need to run three commands in total:
 
   device_add lsi,id=scsi0,bus=pci.0,addr=0x7
   drive_add dummy 
 file=/var/lib/libvirt/images/data.img,if=none,id=drive-scsi0-0-1,format=raw
   device_add 
 scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1
 
   
>>> Nice - so we can just deprecate if=!none?
>>> 
>> In theory yes, but its not nice to tell users to switch everything over to
>> use if=none, if we're going to deprecate that too in the next release when
>> blockdev appears. Might as well just deprecate entire of drive_add/-drive
>> at once.
>>   
> 
> I think what Alex is really asking is can we have 'blockdev_add 
> var0=val0,var1=val1[,...]' implemented as 'drive_add dummy 
> if=none,var0=val0,var1=val1[,...]'.  I don't know the answer to why that 
> isn't possible or desirable.

Uh, I was really asking for why we need something not drive_add and instead 
have yet another iteration of drive parameter. But oh well.

I guess we deferred a bit from the original thread. What's the preferred way to 
go here? Remove the IF_SCSI case completely or keep two copies of the drive_add 
function around?


Alex




Re: [Qemu-devel] [PATCH] Introduce NBD named exports.

2010-08-24 Thread Laurent Vivier
Le mardi 24 août 2010 à 10:59 +0200, Kevin Wolf a écrit :
> Am 24.08.2010 03:04, schrieb Laurent Vivier:
> > This patch allows to connect Qemu using NBD protocol to an nbd-server
> > using named exports.
> > 
> > For instance, if on the host "isoserver", in /etc/nbd-server/config, you 
> > have:
> > 
> > [generic]
> > [debian-500-ppc-netinst]
> > exportname = /ISO/debian-500-powerpc-netinst.iso
> > [Fedora-10-ppc-netinst]
> > exportname = /ISO/Fedora-10-ppc-netinst.iso
> > 
> > You can connect to it, using:
> > 
> > qemu -cdrom nbd:isoserver:exportname=debian-500-ppc-netinst
> > qemu -cdrom nbd:isoserver:exportname=Fedora-10-ppc-netinst
> > 
> > NOTE: you need at least nbd-server 2.9.18
> > Signed-off-by: Laurent Vivier 
> > ---
> >  block/nbd.c   |   57 +++
> >  nbd.c |  119 
> > ++--
> >  nbd.h |5 ++-
> >  qemu-doc.texi |7 +++
> >  qemu-nbd.c|2 +-
> >  5 files changed, 158 insertions(+), 32 deletions(-)
> > 
> > diff --git a/block/nbd.c b/block/nbd.c
> > index a1ec123..ee22b47 100644
> > --- a/block/nbd.c
> > +++ b/block/nbd.c
> > @@ -42,55 +42,78 @@ typedef struct BDRVNBDState {
> >  static int nbd_open(BlockDriverState *bs, const char* filename, int flags)
> >  {
> >  BDRVNBDState *s = bs->opaque;
> > +
> > +#define EN_OPTSTR ":exportname="
> 
> Hm, this makes EN_OPTSTR look like a local variable, but it's actually a
> macro which is valid after the end of the function. So maybe move it out?

I agree, but have a look at curl_open().

> 
> > +char *file;
> > +char *name;
> >  const char *host;
> >  const char *unixpath;
> >  int sock;
> >  off_t size;
> >  size_t blocksize;
> >  int ret;
> > +int err = -EINVAL;
> > +
> > +file = qemu_strdup(filename);
> >  
> > -if (!strstart(filename, "nbd:", &host))
> > -return -EINVAL;
> > +name = strstr(file, EN_OPTSTR);
> > +if (name) {
> > +if (name[sizeof(EN_OPTSTR) - 1] == 0)
> > +goto out;
> 
> Please add braces here and throughout the patch (see CODING_STYLE)

OK.

> Maybe using strlen(EN_OPTSTR) would be clearer? At first I missed the
> fact that sizeof includes the null byte, but maybe it's just me.

I think sizeof() is resolved at compile time whereas strlen() it is at
runtime. For me, it is better, it like to have a constant.

> 
> > +name[0] = 0;
> > +name += sizeof(EN_OPTSTR) - 1;
> > +}
> > +
> > +if (!strstart(file, "nbd:", &host))
> > +goto out;
> >  
> >  if (strstart(host, "unix:", &unixpath)) {
> >  
> >  if (unixpath[0] != '/')
> > -return -EINVAL;
> > +goto out;
> >  
> >  sock = unix_socket_outgoing(unixpath);
> >  
> >  } else {
> > -uint16_t port;
> > +uint16_t port = NBD_DEFAULT_PORT;
> >  char *p, *r;
> >  char hostname[128];
> >  
> >  pstrcpy(hostname, 128, host);
> >  
> >  p = strchr(hostname, ':');
> > -if (p == NULL)
> > -return -EINVAL;
> > +if (p != NULL) {
> > +*p = '\0';
> > +p++;
> >  
> > -*p = '\0';
> > -p++;
> > +port = strtol(p, &r, 0);
> > +if (r == p)
> > +goto out;
> > +} else if (name == NULL)
> > +goto out;
> 
> Wrong indentation level (and missing braces again)

OK

> 
> >  
> > -port = strtol(p, &r, 0);
> > -if (r == p)
> > -return -EINVAL;
> >  sock = tcp_socket_outgoing(hostname, port);
> >  }
> >  
> > -if (sock == -1)
> > -return -errno;
> > +if (sock == -1) {
> > +err = -errno;
> > +goto out;
> > +}
> >  
> > -ret = nbd_receive_negotiate(sock, &size, &blocksize);
> > -if (ret == -1)
> > -return -errno;
> > +ret = nbd_receive_negotiate(sock, name, &size, &blocksize);
> > +if (ret == -1) {
> > +err = -errno;
> > +goto out;
> > +}
> >  
> >  s->sock = sock;
> >  s->size = size;
> >  s->blocksize = blocksize;
> > +err = 0;
> >  
> > -return 0;
> > +out:
> > +qemu_free(file);
> > +return err;
> >  }
> >  
> >  static int nbd_read(BlockDriverState *bs, int64_t sector_num,
> > diff --git a/nbd.c b/nbd.c
> > index a9f295f..755613a 100644
> > --- a/nbd.c
> > +++ b/nbd.c
> > @@ -62,6 +62,8 @@
> >  #define NBD_SET_SIZE_BLOCKS_IO(0xab, 7)
> >  #define NBD_DISCONNECT  _IO(0xab, 8)
> >  
> > +#define NBD_OPT_EXPORT_NAME(1 << 0)
> > +
> >  /* That's all folks */
> >  
> >  #define read_sync(fd, buffer, size) nbd_wr_sync(fd, buffer, size, true)
> > @@ -296,22 +298,26 @@ int nbd_negotiate(int csock, off_t size)
> > return 0;
> >  }
> >  
> > -int nbd_receive_negotiate(int csock, off_t *size, size_t *blocksize)
> > +int nbd_receive_negotiate(int csock, const char *name,
> > +  

[Qemu-devel] Re: [syslinux] Bug#592875: pxelinux: incompatible with qemu with kvm enabled

2010-08-24 Thread H. Peter Anvin
On 08/24/2010 12:15 PM, Vagrant Cascadian wrote:
> it seems versions of pxelinux 4.00 and later hangs qemu (0.12.x, 0.13.x) when
> network booting using etherboot or gPXE and qemu's kvm support is enabled.
> 
> pxelinux 3.86 and earlier do not appear to trigger the problem. i also didn't
> experience the problem with qemu-kvm (formerly known as kvm). qemu without kvm
> support enabled also seems to work fine.
> 
> for more details, please see:
> 
>   http://bugs.debian.org/592875
>   http://bugs.debian.org/591934
> 

Any way you can do a git bisect on this one?

-hpa



[Qemu-devel] [PATCH][v2] Introduce NBD named exports.

2010-08-24 Thread Laurent Vivier
This patch allows to connect Qemu using NBD protocol to an nbd-server
using named exports.

For instance, if on the host "isoserver", in /etc/nbd-server/config, you have:

[generic]
[debian-500-ppc-netinst]
exportname = /ISO/debian-500-powerpc-netinst.iso
[Fedora-10-ppc-netinst]
exportname = /ISO/Fedora-10-ppc-netinst.iso

You can connect to it, using:

qemu -cdrom nbd:isoserver:exportname=debian-500-ppc-netinst
qemu -cdrom nbd:isoserver:exportname=Fedora-10-ppc-netinst

NOTE: you need at least nbd-server 2.9.18

Signed-off-by: Laurent Vivier 
---
v2: address Kevin comments (CODING STYLE)

 block/nbd.c   |   68 +++-
 nbd.c |  118 +++--
 nbd.h |5 ++-
 qemu-doc.texi |7 +++
 qemu-nbd.c|4 +-
 5 files changed, 169 insertions(+), 33 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index a1ec123..5012b70 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -33,6 +33,8 @@
 #include 
 #include 
 
+#define EN_OPTSTR ":exportname="
+
 typedef struct BDRVNBDState {
 int sock;
 off_t size;
@@ -42,55 +44,83 @@ typedef struct BDRVNBDState {
 static int nbd_open(BlockDriverState *bs, const char* filename, int flags)
 {
 BDRVNBDState *s = bs->opaque;
+uint32_t nbdflags;
+
+char *file;
+char *name;
 const char *host;
 const char *unixpath;
 int sock;
 off_t size;
 size_t blocksize;
 int ret;
+int err = -EINVAL;
+
+file = qemu_strdup(filename);
 
-if (!strstart(filename, "nbd:", &host))
-return -EINVAL;
+name = strstr(file, EN_OPTSTR);
+if (name) {
+if (name[sizeof(EN_OPTSTR) - 1] == 0) {
+goto out;
+}
+name[0] = 0;
+name += sizeof(EN_OPTSTR) - 1;
+}
+
+if (!strstart(file, "nbd:", &host)) {
+goto out;
+}
 
 if (strstart(host, "unix:", &unixpath)) {
 
-if (unixpath[0] != '/')
-return -EINVAL;
+if (unixpath[0] != '/') {
+goto out;
+}
 
 sock = unix_socket_outgoing(unixpath);
 
 } else {
-uint16_t port;
+uint16_t port = NBD_DEFAULT_PORT;
 char *p, *r;
 char hostname[128];
 
 pstrcpy(hostname, 128, host);
 
 p = strchr(hostname, ':');
-if (p == NULL)
-return -EINVAL;
+if (p != NULL) {
+*p = '\0';
+p++;
+
+port = strtol(p, &r, 0);
+if (r == p) {
+goto out;
+}
+} else if (name == NULL) {
+goto out;
+}
 
-*p = '\0';
-p++;
-
-port = strtol(p, &r, 0);
-if (r == p)
-return -EINVAL;
 sock = tcp_socket_outgoing(hostname, port);
 }
 
-if (sock == -1)
-return -errno;
+if (sock == -1) {
+err = -errno;
+goto out;
+}
 
-ret = nbd_receive_negotiate(sock, &size, &blocksize);
-if (ret == -1)
-return -errno;
+ret = nbd_receive_negotiate(sock, name, &nbdflags, &size, &blocksize);
+if (ret == -1) {
+err = -errno;
+goto out;
+}
 
 s->sock = sock;
 s->size = size;
 s->blocksize = blocksize;
+err = 0;
 
-return 0;
+out:
+qemu_free(file);
+return err;
 }
 
 static int nbd_read(BlockDriverState *bs, int64_t sector_num,
diff --git a/nbd.c b/nbd.c
index a9f295f..30dec8d 100644
--- a/nbd.c
+++ b/nbd.c
@@ -62,6 +62,8 @@
 #define NBD_SET_SIZE_BLOCKS_IO(0xab, 7)
 #define NBD_DISCONNECT  _IO(0xab, 8)
 
+#define NBD_OPT_EXPORT_NAME(1 << 0)
+
 /* That's all folks */
 
 #define read_sync(fd, buffer, size) nbd_wr_sync(fd, buffer, size, true)
@@ -296,22 +298,27 @@ int nbd_negotiate(int csock, off_t size)
return 0;
 }
 
-int nbd_receive_negotiate(int csock, off_t *size, size_t *blocksize)
+int nbd_receive_negotiate(int csock, const char *name, uint32_t *flags,
+  off_t *size, size_t *blocksize)
 {
-   char buf[8 + 8 + 8 + 128];
-   uint64_t magic;
+   char buf[256];
+   uint64_t magic, s;
+   uint16_t tmp;
 
TRACE("Receiving negotation.");
 
-   if (read_sync(csock, buf, sizeof(buf)) != sizeof(buf)) {
+   if (read_sync(csock, buf, 8) != 8) {
LOG("read failed");
errno = EINVAL;
return -1;
}
 
-   magic = be64_to_cpup((uint64_t*)(buf + 8));
-   *size = be64_to_cpup((uint64_t*)(buf + 16));
-   *blocksize = 1024;
+   buf[8] = '\0';
+   if (strlen(buf) == 0) {
+   LOG("server connection closed");
+   errno = EINVAL;
+   return -1;
+   }
 
TRACE("Magic is %c%c%c%c%c%c%c%c",
  qemu_isprint(buf[0]) ? buf[0] : '.',
@@ -322,8 +329,6 @@ int nbd_receive_negotiate(int csock, off_t *size, size_t 
*blocksize)
  qemu_isprint(buf[5]) ? buf[5] : '.',
  qemu_isprint(buf[6]) ? 

[Qemu-devel] vhost_net.c broken by --kerneldir

2010-08-24 Thread Hollis Blanchard

I'm scratching my head on this one, but here's my problem:
% ./configure --target-list=x86_64-softmmu 
--kerneldir=/home/hollisb/work/linux-2.6.hg

% make V=1
[...]
gcc -I/home/hollisb/work/qemu.git/slirp -Werror -m32 
-fstack-protector-all -Wold-style-definition -Wold-style-declaration -I. 
-I/home/hollisb/work/qemu.git -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE 
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes 
-Wredundant-decls -Wall -Wundef -Wendif-labels -Wwrite-strings 
-Wmissing-prototypes -fno-strict-aliasing  -DHAS_AUDIO 
-DHAS_AUDIO_CHOICE -I/home/hollisb/work/qemu.git/fpu 
-I/home/hollisb/work/qemu.git/tcg 
-I/home/hollisb/work/qemu.git/tcg/i386  -DTARGET_PHYS_ADDR_BITS=64 -I.. 
-I/home/hollisb/work/qemu.git/target-i386 -DNEED_CPU_H 
-I/home/hollisb/work/linux-2.6.hg/include 
-I/home/hollisb/work/linux-2.6.hg/arch/x86/include -MMD -MP -MT 
vhost_net.o -MF ./vhost_net.d -O2 -g  -c -o vhost_net.o 
/home/hollisb/work/qemu.git/hw/vhost_net.c
In file included from 
/home/hollisb/work/linux-2.6.hg/arch/x86/include/asm/sigcontext.h:5,

 from /usr/include/bits/sigcontext.h:28,
 from /usr/include/signal.h:339,
 from /home/hollisb/work/qemu.git/cpu-defs.h:29,
 from /home/hollisb/work/qemu.git/target-i386/cpu.h:46,
 from /home/hollisb/work/qemu.git/qemu-common.h:100,
 from /home/hollisb/work/qemu.git/net.h:5,
 from /home/hollisb/work/qemu.git/hw/vhost_net.c:13:
/home/hollisb/work/linux-2.6.hg/include/linux/types.h:13:2: error: 
#warning "Attempt to use kernel headers from user space, see 
http://kernelnewbies.org/KernelHeaders";


The problem seems to be that jump from /usr/include/bits/sigcontext.h to 
asm/sigcontext.h inside  rather than inside /usr/include. It 
seems like adding -Ikerneldir/arch/foo/include will always be a problem, 
since it will always be used to satisfy "#include ". Only 
files built with KVM_CFLAGS would be affected, and I guess vhost_net.c 
accidentally gets into a broken include path where the other KVM_CFLAGS 
files doesn't.


I'm wondering why this isn't causing more problems for more people. My 
host is Fedora 12, FWIW, but this doesn't seem like it could at all be 
related to toolchain version, for example.


--
Hollis Blanchard
Mentor Graphics, Embedded Systems Division




Re: [Qemu-devel] Re: Unusual physical address when using 64-bit BAR

2010-08-24 Thread Isaku Yamahata
On Tue, Aug 24, 2010 at 10:52:36AM -0600, Cam Macdonell wrote:
> Hi, 64-bit BARs still do not seem to be working.
> 
> When using the latest seabios the guest does not hit a "BUG:"
> statement, but booting still fails
> 
> HPET: 1 timers in total, 0 timers will be used for per-cpu timer
> divide error:  [#1] SMP
> last sysfs file:
> CPU 0
> Modules linked in:
> 
> Pid: 1, comm: swapper Not tainted 2.6.35+ #299 /Bochs
> RIP: 0010:[]  [] hpet_alloc+0x12c/0x35b
> RSP: 0018:88007d7b3d80  EFLAGS: 00010246
> RAX: 00038d7ea4c68000 RBX: 88007d062cc0 RCX: 
> RDX:  RSI:  RDI: 817bb9b0
> RBP: 88007d7b3dc0 R08: 80d0 R09: c900
> R10: 88007d72b5a0 R11:  R12: 88007d7b3dd0
> R13: c900 R14:  R15: 817a41c3
> FS:  () GS:88000200() knlGS:
> CS:  0010 DS:  ES:  CR0: 8005003b
> CR2:  CR3: 01a42000 CR4: 06f0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Process swapper (pid: 1, threadinfo 88007d7b2000, task 88007d7b8000)
> Stack:
>  88007f43ab90 88007f43ab90 81ca6174 81b1f5e1
> <0>  0100 0100 
> <0> 88007d7b3e80 810294ea fed0 c900
> Call Trace:
>  [] ? hpet_late_init+0x0/0xea
>  [] hpet_reserve_platform_timers+0x10b/0x115
>  [] ? hpet_late_init+0x0/0xea
>  [] hpet_late_init+0x6b/0xea
>  [] ? hpet_late_init+0x0/0xea
>  [] do_one_initcall+0x5e/0x159
>  [] kernel_init+0x19a/0x228
>  [] kernel_thread_helper+0x4/0x10
>  [] ? kernel_init+0x0/0x228
>  [] ? kernel_thread_helper+0x0/0x10
> Code: 89 1d ca b2 b3 00 48 c1 ea 21 8b 73 34 49 c7 c7 c3 41 7a 81 48
> 8d 04 02 4c 89 f2 48 c7 c7 b0 b9 7b 81 48 c1 ea 20 48 89 d1 31 d2 <48>
> f7 f1 83 7b 30 01 48 c7 c1 86 1c 7d 81 49 0f 46 cf 48 89 43
> RIP  [] hpet_alloc+0x12c/0x35b
>  RSP 
> ---[ end trace a7919e7f17c0a725 ]---
> Kernel panic - not syncing: Attempted to kill init!
> Pid: 1, comm: swapper Tainted: G  D 2.6.35+ #299
> Call Trace:
>  [] panic+0x8b/0x10b
>  [] ? exit_ptrace+0x38/0x121
>  [] do_exit+0x7a/0x722
>  [] ? spin_unlock_irqrestore+0xe/0x10
>  [] ? kmsg_dump+0x12b/0x145
>  [] oops_end+0xbf/0xc7
>  [] die+0x5a/0x63
>  [] do_trap+0x121/0x130
>  [] do_divide_error+0x96/0x9f
>  [] ? hpet_alloc+0x12c/0x35b
>  [] ? radix_tree_preload+0x34/0x88
>  [] divide_error+0x1b/0x20
>  [] ? hpet_alloc+0x12c/0x35b
>  [] ? hpet_late_init+0x0/0xea
>  [] hpet_reserve_platform_timers+0x10b/0x115
>  [] ? hpet_late_init+0x0/0xea
>  [] hpet_late_init+0x6b/0xea
>  [] ? hpet_late_init+0x0/0xea
>  [] do_one_initcall+0x5e/0x159
>  [] kernel_init+0x19a/0x228
>  [] kernel_thread_helper+0x4/0x10
>  [] ? kernel_init+0x0/0x228
>  [] ? kernel_thread_helper+0x0/0x10
> 
> seabios output for the device:
> 
> PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110
> region 0: 0xf102
> region 2: 0x
> init smm
> 
> Running the latest seabios, the debug output only remaps the BAR
> twice, once with a potentially correct address of e
> 
> pci_read_config: (val) 0xe004 <- 0x18 (addr)

The upstream seabios lacks overflow check at the moment.
I haven't found time to address PMM yet.


> pci_default_write_config: (val) 0x0 -> 0x18 (addr)
> IVSHMEM: guest pci addr = e000, guest h/w addr = 2164588544, size = 
> 2000
> pci_read_config: (val) 0xe004 <- 0x18 (addr)
> pci_default_write_config: (val) 0x0 -> 0x18 (addr)
> pci_read_config: (val) 0x0 <- 0x1c (addr)
> pci_default_write_config: (val) 0x0 -> 0x1c (addr)
> IVSHMEM: guest pci addr = , guest h/w addr =
> 2164588544, size = 2000
> pci_read_config: (val) 0x <- 0x1c (addr)
> pci_default_write_config: (val) 0x0 -> 0x1c (addr)
> pci_read_config: (val) 0x0 <- 0x20 (addr)
> 
> the pci writes are all still 0, I can't see how my debug statements
> are incorrect though.  Below is my trivial pci config debugging patch.

The debug out put should be before the for-loop.

for (i = 0; i < l && addr + i < config_size; val >>= 8, ++i) {
 ^ 
 Here val becomes 0
uint8_t wmask = d->wmask[addr + i];
d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
}

-- 
yamahata



Re: [Qemu-devel] [PATCH] qdev: Reset hotplugged devices

2010-08-24 Thread Isaku Yamahata
On Fri, Aug 20, 2010 at 10:56:57AM -0500, Anthony Liguori wrote:
> The real problem is how we do reset.  We shouldn't register a reset  
> handler for every qdev device but rather register a single reset handler  
> that walks the device tree and calls reset on every reachable device.
>
> Then we can always call reset in init() and there's no need to have a  
> dev->hotplugged check.  The qdev device tree reset handler should not be  
> registered until *after* we call qemu_system_reset() after creating the  
> device model which will ensure that we don't do a double reset.

I don't see why you re-invent the patch.
Please see the patches I sent out on August 5.
http://lists.nongnu.org/archive/html/qemu-devel/2010-08/msg00377.html
http://lists.nongnu.org/archive/html/qemu-devel/2010-08/msg00383.html

Maybe we can merge the patches.
As for your patch, I have some comment.
- bus itself may want its own handler. At lease pci bus needs it.
  And propagating reset signal to children is up to the bus controller.

- child devices should be reset before parent.
  This doesn't matter much though.

-- 
yamahata



[Qemu-devel] Da lau khong hoi tham suc khoe!

2010-08-24 Thread htsoft.tuannq
Xin chào và chúc anh/chị một ngày vui vẻ!

Có phải bạn đang đi tìm phần mềm quản trị kinh doanh như thế này?

HỆ THỐNG PHẦN MỀM QUẢN TRỊ KINH DOANH

HTSOFT BIZMAN.NET

“Không chỉ là công cụ ghi chép, báo cáo mà còn là công cụ tư vấn, định
hướng 
và hỗ trợ ra quyết định cho nhà quản lý!”

HTsoft   – Chúng tôi định hướng sự phát
triển bền vững với phương châm CHẤT LƯỢNG TỐT +

GIÁ BÁN HỢP LÝ  bởi điều đó luôn luôn nằm trong khả năng
của HTsoft

HTsoft BizMan.NET – Phần mềm quản lý bán hàng, quản trị kinh doanh
chuyên nghiệp

 Với chi phí license chỉ: 900.000 VNĐ/ máy tính

Download Free  :
http://htsoftgroup.com/Default.aspx?portalid=35
  (demo user: htsoft
mật khẩu: 1)

HTsoft BizMan.NET – Có đầy đủ các chức năng của một phần mềm quản lý
công ty/ cửa hàng/ hệ thống Bán sỉ/Bán lẻ/Phân phối/Chuỗi siêu thị

1. Module Quản lý Kho­ và Mua hàng  

2. Module Bán hàng (bán lẻ/bán sỉ)

3. Module Quản lý đơn hàng (PO/SO)

4. Module Quản lý tạm ứng

5. Module Quản lý công nợ phải thu

6. Module Quản lý công nợ phải trả

7. Module Lập kế hoạch và theo dõi tiến độ

 

 8. Module Quản lý đối trừ công nợ

 9. Module Quản lý sổ quỹ tiền mặt

 10. Module Quản lý sổ ngân hàng

 11. Module Quản lý khách hàng thân

   thiết (Thẻ KHTT, tích điểm,...)

 12. Module Báo cáo – Thống kê

 13. Module Quản trị, phân quyền

14. Quản lý chuyển kho đa chi nhánh

 

HTsoft – Chúng tôi là nhà sản xuất phần mềm có uy tín, đặc biệt trong
lĩnh vực quản lý Bán hàng

   Qua nhiều năm phát triển, chúng tôi đã trao đổi, tham vấn
cùng với nhiều công ty lớn trong lĩnh vực này để hình thành sản phẩm như
hiện nay. HTSOFT BIZMAN.NET
  đã được hầu
hết khách hàng đánh giá rất cao trong việc nâng cao hiệu quả hoạt động,
góp phần mở rộng hoạt động kinh doanh, tăng tối đa tính chặt chẽ của
nghiệp vụ quản lý các nguồn tài chính. (Xem danh sách khách hàng tiêu
biểu của HTsoft)  

HTsoft BizMan.NET – Thay đổi xứng tầm với doanh nghiệp

Chúng tôi tự hào khẳng định rằng giải pháp của mình có
khả năng đáp ứng rất cao cho nhiều bài toán quản lý phức tạp, trong
nhiều lĩnh vực với mọi quy mô nhỏ và vừa, đa chi nhánh online. HTsoft
luôn sẵn sàng tư vấn cho bạn một giải pháp phần mềm nhằm tối ưu hóa họat
động kinh doanh.

Hãy gọi chúng tôi để nhận được sự tư vấn tốt nhất

HTsoft Co ,. LTD 
Add: Số 6/76-Phố An Dương-Q.Tây Hồ-Hà Nội 
Website: www.htsoftgroup.com   

Tư vấn giải pháp: Trần Thanh Hoàng 
Yahoo: htsoft.hoangtt
Add: Số 6/76-Phố An Dương-Q.Tây Hồ-Hà Nội
Email: hoang...@htsoftgroup.com  ;
hoang...@gmail.com   
Mobile: 0936.198.079/ Tel: (04) 22.433.079 

Quản lý kinh doanh: Ngô Quang Tuân 
Yahoo: htsoft.tuannq
Email: htsoft.tua...@gmail.com    
Mobile: 0987.847.736/ Tel: (04) 66.751.207

Chúng tôi thành thật xin lỗi nếu mail này làm phiền đến quý vị!