Re: [Qemu-devel] [PATCH RFC 1/1] KVM: s390: Add MEMOP ioctl for reading/writing guest memory

2015-02-04 Thread Christian Borntraeger
Am 03.02.2015 um 16:22 schrieb Paolo Bonzini:
> 
> 
> On 03/02/2015 16:16, Thomas Huth wrote:
>> Actually, I'd prefer to keep the "virtual" in the defines for the type
>> of operation below: When it comes to s390 storage keys, we likely might
>> need some calls for reading and writing to physical memory, too. Then
>> we could simply extend this ioctl instead of inventing a new one.

Rereading that. Shall we replace "virtual" with "logical"? That is what is
used architecturally when we mean "do whatever is appropriate right now"
That can boil down to virtual via DAT, virtual via access register mode, 
real if DAT is off... and if fact your kernel implementation does that.


> 
> Can you explain why it is necessary to read/write physical addresses
> from user space?  In the case of QEMU, I'm worried that you would have
> to invent your own memory read/write APIs that are different from
> everything else.
> 
> On real s390 zPCI, does bus-master DMA update storage keys?

the classic channel I/O does set the storage key change/reference and
also triggers errors in the storage key protection value mismatches.

The PCI IOTA structure does contain a storage key value for accesses,
so I assume its the same here, but I dont know for sure.

Conny:
I am asking myself, if we should explicitly add a comment in the 
virtio-ccw spec, that all accesses are assumed to be with key 0 and 
thus never cause key protection. The change/reference bit is set
by the underlying I/O or memory copy anyway.
We can then add a ccw later on to set a different key if we ever need
that.


> 
>>> Not really true, as you don't check it.  So "It is not used by KVM with
>>> the currently defined set of flags" is a better explanation.
>>
>> ok ... and maybe add "should be set to zero" ?
> 
> If you don't check it, it is misleading to document this.
> 
> Paolo
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 




[Qemu-devel] [PATCH v3] pc: acpi-build: make linker & RSDP tables dynamic

2015-02-04 Thread Igor Mammedov
Linker and RSDP tables are build only once, so if later
during rebuild sizes of other ACPI tables change
pointers will be patched incorrectly due to wrong
offsets in RSDP and linker.

To fix it rebuild linker and RSDP tables along with
the rest of ACPI tables so that they would have
offsets that match just built tables.

Here is a simple reproducer:
 1: hotplug bridge using command:
 device_add pci-bridge,chassis_nr=1
 2: reset system from monitor:
 system_reset

As result pointers to ACPI tables are not correct
and guest can't read/parse ACPI tables and on top
of it linker corrupted them by patching at stale
offsets.

Windows guests just refuses to boot and
Linux guests are more resilient and try to boot without
ACPI, sometimes successfully.

Fix applies only to new machine types starting from 2.3,
so it won't break migration for old machine types.

Signed-off-by: Igor Mammedov 
---
 hw/i386/acpi-build.c | 27 ---
 hw/i386/pc_piix.c|  3 +++
 hw/i386/pc_q35.c |  3 +++
 include/hw/i386/pc.h |  1 +
 4 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 4944249..58cf8b7 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1523,6 +1523,10 @@ struct AcpiBuildState {
 /* Copy of table in RAM (for patching). */
 ram_addr_t table_ram;
 uint32_t table_size;
+ram_addr_t linker_ram;
+uint32_t linker_size;
+ram_addr_t rsdp_ram;
+uint32_t rsdp_size;
 /* Is table patched? */
 uint8_t patched;
 PcGuestInfo *guest_info;
@@ -1733,6 +1737,10 @@ static void acpi_build_update(void *build_opaque, 
uint32_t offset)
 
 memcpy(qemu_get_ram_ptr(build_state->table_ram), tables.table_data->data,
build_state->table_size);
+memcpy(qemu_get_ram_ptr(build_state->linker_ram), tables.linker->data,
+   build_state->linker_size);
+memcpy(qemu_get_ram_ptr(build_state->rsdp_ram), tables.rsdp->data,
+   build_state->rsdp_size);
 
 cpu_physical_memory_set_dirty_range_nocode(build_state->table_ram,
build_state->table_size);
@@ -1799,17 +1807,22 @@ void acpi_setup(PcGuestInfo *guest_info)
 assert(build_state->table_ram != RAM_ADDR_MAX);
 build_state->table_size = acpi_data_len(tables.table_data);
 
-acpi_add_rom_blob(NULL, tables.linker, "etc/table-loader", 0);
+build_state->linker_ram = acpi_add_rom_blob(build_state, tables.linker,
+"etc/table-loader", 0);
+build_state->linker_size = acpi_data_len(tables.linker);
 
 fw_cfg_add_file(guest_info->fw_cfg, ACPI_BUILD_TPMLOG_FILE,
 tables.tcpalog->data, acpi_data_len(tables.tcpalog));
 
-/*
- * RSDP is small so it's easy to keep it immutable, no need to
- * bother with ROM blobs.
- */
-fw_cfg_add_file(guest_info->fw_cfg, ACPI_BUILD_RSDP_FILE,
-tables.rsdp->data, acpi_data_len(tables.rsdp));
+if (guest_info->has_imutable_rsdp) {
+/* Keep for compatibility with old machine types */
+fw_cfg_add_file(guest_info->fw_cfg, ACPI_BUILD_RSDP_FILE,
+tables.rsdp->data, acpi_data_len(tables.rsdp));
+} else {
+build_state->rsdp_ram = acpi_add_rom_blob(build_state, tables.rsdp,
+  ACPI_BUILD_RSDP_FILE, 0);
+build_state->rsdp_size = acpi_data_len(tables.rsdp);
+}
 
 qemu_register_reset(acpi_build_reset, build_state);
 acpi_build_reset(build_state);
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 38b42b0..866b783 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -60,6 +60,7 @@ static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
 static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
 
 static bool has_acpi_build = true;
+static bool has_imutable_rsdp;
 static int legacy_acpi_table_size;
 static bool smbios_defaults = true;
 static bool smbios_legacy_mode;
@@ -168,6 +169,7 @@ static void pc_init1(MachineState *machine,
 
 guest_info->isapc_ram_fw = !pci_enabled;
 guest_info->has_reserved_memory = has_reserved_memory;
+guest_info->has_imutable_rsdp = has_imutable_rsdp;
 
 if (smbios_defaults) {
 MachineClass *mc = MACHINE_GET_CLASS(machine);
@@ -310,6 +312,7 @@ static void pc_init_pci(MachineState *machine)
 
 static void pc_compat_2_2(MachineState *machine)
 {
+has_imutable_rsdp = true;
 x86_cpu_compat_set_features("kvm64", FEAT_1_EDX, 0, CPUID_VME);
 x86_cpu_compat_set_features("kvm32", FEAT_1_EDX, 0, CPUID_VME);
 x86_cpu_compat_set_features("Conroe", FEAT_1_EDX, 0, CPUID_VME);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 63027ee..6f649a1 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -50,6 +50,7 @@
 #define MAX_SATA_PORTS 6
 
 static bool has_acpi_build = true;
+static bool has_imutable_rsdp;
 static bool smbios_defaults = true;
 static bool 

Re: [Qemu-devel] [PATCH v2] qga: add guest-set-admin-password command

2015-02-04 Thread Daniel P. Berrange
On Tue, Feb 03, 2015 at 03:16:08PM -0700, Eric Blake wrote:
> On 01/12/2015 08:58 AM, Daniel P. Berrange wrote:
> > Add a new 'guest-set-admin-password' command for changing the
> > root/administrator password. This command is needed to allow
> > OpenStack to support its API for changing the admin password
> > on a running guest.
> > 
> > Accepts either the raw password string:
> > 
> > $ virsh -c qemu:///system  qemu-agent-command f21x86_64 \
> >'{ "execute": "guest-set-admin-password", "arguments":
> >  { "crypted": false, "password": "12345678" } }'
> >   {"return":{}}
> > 
> > Or a pre-encrypted string (recommended)
> > 
> > $ virsh -c qemu:///system  qemu-agent-command f21x86_64 \
> >'{ "execute": "guest-set-admin-password", "arguments":
> >  { "crypted": true, "password":
> > "$6$T9O/j/aGPrE...sniprQoRN4F0.GG0MPjNUNyml." } }'
> > 
> > NB windows support is desirable, but not implemented in this
> > patch.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  qga/commands-posix.c | 90 
> > 
> >  qga/commands-win32.c |  6 
> >  qga/qapi-schema.json | 19 +++
> >  3 files changed, 115 insertions(+)
> > 
> 
> > +++ b/qga/qapi-schema.json
> > @@ -738,3 +738,22 @@
> >  ##
> >  { 'command': 'guest-get-fsinfo',
> >'returns': ['GuestFilesystemInfo'] }
> > +
> > +##
> > +# @guest-set-admin-password
> > +#
> > +# @crypted: true if password is already crypt()d, false if raw
> > +# @password: the new password entry
> > +#
> > +# If the @crypted flag is true, it is the callers responsibility
> 
> s/callers/caller's/
> 
> > +# to ensure the correct crypt() encryption scheme is used. This
> > +# command does not attempt to interpret or report on the encryption
> > +# scheme. Refer to the documentation of the guest operating system
> > +# in question to determine what is supported.
> > +#
> > +# Returns: Nothing on success.
> > +#
> > +# Since 2.3
> > +##
> > +{ 'command': 'guest-set-admin-password',
> > +  'data': { 'crypted': 'bool', 'password': 'str' } }
> > 
> 
> Normally, 'password':'str' means we are passing UTF8 JSON.  But what if
> the desired password is NOT valid UTF8, but still valid to the end user
> (for example, a user that intentionally wants a Latin1 encoded password
> that uses 8-bit characters)?  In other interfaces, we've allowed an enum
> that specifies whether a raw data string is 'utf8' or 'base64' encoded;
> should we have such a parameter here?

IMHO, QEMU itself needs to avoid interpreting the password at all and
just pass the raw bytes through to the operating system specific
command as-is. ie we should be 8-bit pure in what we accept. This
probably argues for unconditionally using base64 encoded data for
the parameter.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] RFC: Proposal to add QEMU "Guest Environment Variables"

2015-02-04 Thread Daniel P. Berrange
On Tue, Feb 03, 2015 at 02:09:22PM -0500, Gabriel L. Somlo wrote:
> Hi,
> 
> I'm interested in adding a way for a host to pass environment variables
> into a qemu guest VM -- analogous to setting environment variables for
> a process to access via getenv() and friends.
> 
> The QEMU Guest Agent (QGA) does not appear to quite fit the bill, at
> least not in its current form: The agent must have been successfully
> started on the guest before the host would have to connect to it (in
> a separate act from just starting the guest in the first place), and
> get it to execute any hypothetical commands to configure or otherwise
> influence the guest.

> So, my question for the QEMU dev team:
> 
> 1. Would you consider this feature a useful addition to QEMU ?
>I.e., would this be acceptable (of interest) to the upstream project?
> 
> 2. Is anything similar already being worked on (so I could either join
>that effort, or back off, as the case may be) ? :)

IMHO this is already a solved problem via the cloud-init project which
is the widely used standard for injecting information into guest OS
at boot time. Any OS distro shipping cloud images is already going to
have cloud-init provided.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] RFC: Proposal to add QEMU "Guest Environment Variables"

2015-02-04 Thread Daniel P. Berrange
On Tue, Feb 03, 2015 at 04:38:59PM -0500, Gabriel L. Somlo wrote:
> On Tue, Feb 03, 2015 at 02:11:12PM -0600, Michael Roth wrote:
> > 
> > This does seem like useful functionality, but I think I'd like to know
> > more about the actual use-cases being looked at.
> 
> The proposed functionality is mostly equivalent to that offered by
> "GuestInfo variables". So yes, initial activation scripts :)
> 
> > Is this mostly about executing initial activation scripts? Because after
> > that point, a key-value store can be managed through the
> > guest-file-read/write interfaces for anything on the guest-side that's
> > interested in these variables.
> > 
> > Even activation could be done using this approach, where the
> > scripts start QGA and wait for the host to coordinate the initial creation
> > of the file containing those variables, then setting a file marker that
> > allows activation to proceed. And if that seems wonky, I'm fairly sure you
> > could script the creation of the initial key-value store prior to starting
> > the guest using libguestfs:
> > 
> >   http://libguestfs.org/
> 
> Specifically, I'm trying to port to QEMU a simulation/training setup
> where multiple VMs are started from the same base image, and guestinfo
> environment variables help each instance determine its "personality".
> 
> Editing the disk image is not feasible, since the idea is to share the
> base disk image across multiple VMs. And needing to connect to each VM
> after having started it, wait for it to bring up the QGA, then get it
> to accept environment variables, that's precisely the wonkiness I'm
> trying to avoid :)

AFAICT, you're describing the exact scenario that cloud-init solves.
You have a generic cloud base image, and you provide metadata to the
guest (either via a transient CDROM image generated at boot, or via
a speciall network interface with well known IP addr + HTTP service)
which is uses to configure itself for a specific task as boot up.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[Qemu-devel] [PATCH] acpi: update RSDP on guest access

2015-02-04 Thread Michael S. Tsirkin
As Igor pointed out, when RSDT offset changes,
RSDP needs to change as well.
We really should have put it in a ROM region, but
we can't change that for old machine types,
let's just set the callback and update it explicitly.

Signed-off-by: Michael S. Tsirkin 
---
 hw/i386/acpi-build.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 74586f3..dfbf690 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1529,6 +1529,7 @@ struct AcpiBuildState {
 /* Is table patched? */
 uint8_t patched;
 PcGuestInfo *guest_info;
+void *rsdp;
 } AcpiBuildState;
 
 static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
@@ -1737,6 +1738,8 @@ static void acpi_build_update(void *build_opaque, 
uint32_t offset)
 memcpy(qemu_get_ram_ptr(build_state->table_ram), tables.table_data->data,
build_state->table_size);
 
+memcpy(build_state->rsdp, tables.rsdp->data, acpi_data_len(tables.rsdp));
+
 cpu_physical_memory_set_dirty_range_nocode(build_state->table_ram,
build_state->table_size);
 
@@ -1811,8 +1814,11 @@ void acpi_setup(PcGuestInfo *guest_info)
  * RSDP is small so it's easy to keep it immutable, no need to
  * bother with ROM blobs.
  */
-fw_cfg_add_file(guest_info->fw_cfg, ACPI_BUILD_RSDP_FILE,
-tables.rsdp->data, acpi_data_len(tables.rsdp));
+fw_cfg_add_file_callback(guest_info->fw_cfg, ACPI_BUILD_RSDP_FILE,
+ acpi_build_update, NULL,
+ tables.rsdp->data, acpi_data_len(tables.rsdp));
+
+build_state->rsdp = tables.rsdp->data;
 
 qemu_register_reset(acpi_build_reset, build_state);
 acpi_build_reset(build_state);
-- 
MST



[Qemu-devel] [PATCH RFC v3 6/6] qemu-iotests: s390x: fix test 055

2015-02-04 Thread Xiao Guang Chen
From: Mao Chuan Li 

There is no such device 'ide-cd' defined on the s390 platform, so
test_medium_not_found() test is skipped.

Reviewed-by:   Michael Mueller 
Signed-off-by: Mao Chuan Li  
---
 tests/qemu-iotests/055 | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055
index 451b67d..5cf487b 100755
--- a/tests/qemu-iotests/055
+++ b/tests/qemu-iotests/055
@@ -90,6 +90,9 @@ class TestSingleDrive(iotests.QMPTestCase):
 'target image does not match source after backup')
 
 def test_medium_not_found(self):
+if iotests.qemu_default_machine == 's390-virtio':
+   return
+
 result = self.vm.qmp('drive-backup', device='ide1-cd0',
  target=target_img, sync='full')
 self.assert_qmp(result, 'error/class', 'GenericError')
@@ -252,6 +255,9 @@ class TestSingleTransaction(iotests.QMPTestCase):
 'target image does not match source after backup')
 
 def test_medium_not_found(self):
+if iotests.qemu_default_machine == 's390-virtio':
+   return
+
 result = self.vm.qmp('transaction', actions=[{
 'type': 'drive-backup',
 'data': { 'device': 'ide1-cd0',
-- 
1.9.1




[Qemu-devel] [PATCH RFC v3 0/6] Update tests/qemu-iotests failing cases for the s390 platform

2015-02-04 Thread Xiao Guang Chen
v3: 
1. Fix a typo in v2.

v2:
1. Drop the patches for test 039 for it has been fixed in upstream.
2. Integrate patches for test 071, 067 and 087.
3. Keep the other patches.

v1:
1. updated the test suite to be default-machine-type-aware, from the previous 
platform-aware
2. created a new patch "qemu-iotests: run qemu with -nodefaults" to counterpart 
the impact from the commit:
c88930a6866e74953e931ae749781e98e486e5c8
qemu-char: Permit only a single "stdio" character device

When more than one is used, the terminal settings aren't restored
correctly on exit.  Fixable.  However, such usage makes no sense,
because the users race for input, so outlaw it instead.

If you want to connect multiple things to stdio, use the mux
chardev.
3. updated all the checking of platform name to the current machine name

Mao Chuan Li (5):
  qemu-iotests: qemu machine type support
  qemu-iotests: run qemu with -nodefaults
  qemu-iotests: s390x: fix test 041
  qemu-iotests: s390x: fix test 051
  qemu-iotests: s390x: fix test 055

Xiao Guang Chen (1):
  qemu-iotests: fix tests 067, 071 and 087

 tests/qemu-iotests/041 |   6 +
 tests/qemu-iotests/051 |  91 +---
 tests/qemu-iotests/051.s390-virtio.out | 377 +
 tests/qemu-iotests/055 |   6 +
 tests/qemu-iotests/067 |   8 +-
 tests/qemu-iotests/067.out |  26 +--
 tests/qemu-iotests/071.out |  12 +-
 tests/qemu-iotests/087.out |  18 +-
 tests/qemu-iotests/check   |   5 +
 tests/qemu-iotests/common  |   1 +
 tests/qemu-iotests/common.config   |   3 +-
 tests/qemu-iotests/common.qemu |   2 +-
 tests/qemu-iotests/iotests.py  |   1 +
 13 files changed, 481 insertions(+), 75 deletions(-)
 create mode 100644 tests/qemu-iotests/051.s390-virtio.out

-- 
1.9.1




[Qemu-devel] [PATCH RFC v3 1/6] qemu-iotests: fix tests 067, 071 and 087

2015-02-04 Thread Xiao Guang Chen
From: Xiao Guang Chen 

Update the output files for test case 067, 071 and 087 because qemu option
-nodefaults was used to start a guest so there are no default floppy and
cdrom for guests any more.
Use virtio-blk instead of virtio-blk-pci as the device driver for test case
067. For virtio-blk-pci is the same with virtio-blk as device driver but
other platform such as s390 may not recognize the virtio-blk-pci.

Reviewed-by:   Michael Mueller 
Signed-off-by: Xiao Guang Chen 
---
 tests/qemu-iotests/067 |  8 
 tests/qemu-iotests/067.out | 26 +-
 tests/qemu-iotests/071.out | 12 
 tests/qemu-iotests/087.out | 18 +++---
 4 files changed, 20 insertions(+), 44 deletions(-)

diff --git a/tests/qemu-iotests/067 b/tests/qemu-iotests/067
index d025192..6cddf10 100755
--- a/tests/qemu-iotests/067
+++ b/tests/qemu-iotests/067
@@ -56,7 +56,7 @@ echo
 echo === -drive/-device and device_del ===
 echo
 
-run_qemu -drive file=$TEST_IMG,format=$IMGFMT,if=none,id=disk -device 
virtio-blk-pci,drive=disk,id=virtio0 <

[Qemu-devel] [PATCH RFC v3 2/6] qemu-iotests: qemu machine type support

2015-02-04 Thread Xiao Guang Chen
From: Mao Chuan Li 

This patch adds qemu machine type support to the io test suite. Based on
the qemu default machine type the reference output file can now vary
from the default to a machine specific output file if necessary. That
shall allow all platforms to use this test suite.

Reviewed-by:   Michael Mueller 
Signed-off-by: Mao Chuan Li 
---
 tests/qemu-iotests/check | 5 +
 tests/qemu-iotests/common.config | 1 +
 tests/qemu-iotests/iotests.py| 1 +
 3 files changed, 7 insertions(+)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 8ca4011..fc0351d 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -323,6 +323,11 @@ do
 fi
 
 reference="$source_iotests/$seq.out"
+reference_machine="$source_iotests/$seq.$QEMU_DEFAULT_MACHINE.out"
+if [ -f $reference_machine ]; then
+reference=$reference_machine
+fi
+
 if [ "$CACHEMODE" = "none" ]; then
 [ -f "$source_iotests/$seq.out.nocache" ] && 
reference="$source_iotests/$seq.out.nocache"
 fi
diff --git a/tests/qemu-iotests/common.config b/tests/qemu-iotests/common.config
index bd6790b..73e25da 100644
--- a/tests/qemu-iotests/common.config
+++ b/tests/qemu-iotests/common.config
@@ -107,6 +107,7 @@ export QEMU=$QEMU_PROG
 export QEMU_IMG=$QEMU_IMG_PROG
 export QEMU_IO="$QEMU_IO_PROG $QEMU_IO_OPTIONS"
 export QEMU_NBD=$QEMU_NBD_PROG
+export QEMU_DEFAULT_MACHINE=$($QEMU -machine ? | awk '/(default)/{print $1}')
 
 [ -f /etc/qemu-iotest.config ]   && . /etc/qemu-iotest.config
 
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index f57f154..69dee95 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -39,6 +39,7 @@ imgproto = os.environ.get('IMGPROTO', 'file')
 test_dir = os.environ.get('TEST_DIR', '/var/tmp')
 output_dir = os.environ.get('OUTPUT_DIR', '.')
 cachemode = os.environ.get('CACHEMODE')
+qemu_default_machine = os.environ.get('QEMU_DEFAULT_MACHINE')
 
 socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
 
-- 
1.9.1




[Qemu-devel] [PATCH RFC v3 3/6] qemu-iotests: run qemu with -nodefaults

2015-02-04 Thread Xiao Guang Chen
From: Mao Chuan Li 

This patch fixes an io test suite issue that was introduced with the
commit c88930a6866e74953e931ae749781e98e486e5c8 'qemu-char: Permit only
a single "stdio" character device'. The option supresses the creation of
default devices.

Reviewed-by: Michael Mueller 
Signed-off-by: Mao Chuan Li 
---
 tests/qemu-iotests/common| 1 +
 tests/qemu-iotests/common.config | 2 +-
 tests/qemu-iotests/common.qemu   | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
index 9e12bec..f3fa6dd 100644
--- a/tests/qemu-iotests/common
+++ b/tests/qemu-iotests/common
@@ -51,6 +51,7 @@ export IMGOPTS=""
 export CACHEMODE="writeback"
 export QEMU_IO_OPTIONS=""
 export CACHEMODE_IS_DEFAULT=true
+export QEMU_OPTIONS="-nodefaults"
 
 for r
 do
diff --git a/tests/qemu-iotests/common.config b/tests/qemu-iotests/common.config
index 73e25da..3025921 100644
--- a/tests/qemu-iotests/common.config
+++ b/tests/qemu-iotests/common.config
@@ -103,7 +103,7 @@ if [ -z "$QEMU_NBD_PROG" ]; then
 export QEMU_NBD_PROG="`set_prog_path qemu-nbd`"
 fi
 
-export QEMU=$QEMU_PROG
+export QEMU="$QEMU_PROG $QEMU_OPTIONS"
 export QEMU_IMG=$QEMU_IMG_PROG
 export QEMU_IO="$QEMU_IO_PROG $QEMU_IO_OPTIONS"
 export QEMU_NBD=$QEMU_NBD_PROG
diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index ee7ebb4..ea27536 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -153,7 +153,7 @@ function _launch_qemu()
 mkfifo "${fifo_out}"
 mkfifo "${fifo_in}"
 
-"${QEMU}" -nographic -serial none ${comm} -machine accel=qtest "${@}" 2>&1 
\
+${QEMU} -nographic -serial none ${comm} -machine accel=qtest "${@}" 2>&1 \
 >"${fifo_out}" 
\
 <"${fifo_in}" &
 QEMU_PID[${_QEMU_HANDLE}]=$!
-- 
1.9.1




[Qemu-devel] [PATCH RFC v3 4/6] qemu-iotests: s390x: fix test 041

2015-02-04 Thread Xiao Guang Chen
From: Mao Chuan Li 

There is no such device 'ide-cd' defined on the s390 platform, so
test_medium_not_found() test is skipped.

Reviewed-by:   Michael Mueller 
Signed-off-by: Mao Chuan Li  
---
 tests/qemu-iotests/041 | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 59a8f73..4fb1b34 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -197,6 +197,9 @@ class TestSingleDrive(ImageMirroringTestCase):
 'target image does not match source after mirroring')
 
 def test_medium_not_found(self):
+   if iotests.qemu_default_machine == 's390-virtio':
+   return
+
 result = self.vm.qmp('drive-mirror', device='ide1-cd0', sync='full',
  target=target_img)
 self.assert_qmp(result, 'error/class', 'GenericError')
@@ -867,6 +870,9 @@ class TestRepairQuorum(ImageMirroringTestCase):
 if not self.has_quorum():
 return
 
+if iotests.qemu_default_machine == 's390-virtio':
+   return
+
 result = self.vm.qmp('drive-mirror', device='ide1-cd0', sync='full',
  node_name='repair0',
  replaces='img1',
-- 
1.9.1




[Qemu-devel] [PATCH RFC v3 5/6] qemu-iotests: s390x: fix test 051

2015-02-04 Thread Xiao Guang Chen
From: Mao Chuan Li 

The tests for device type "ide_cd" are skipped for the s390 platform.
The default device id of hard disk on the s390 platform differs to that
of the x86 platform. A new variable device_id is defined and "virtio0"
set for the s390 platform. A s390 platform specific output file is also
needed.

Reviewed-by:   Michael Mueller 
Signed-off-by: Mao Chuan Li 
---
 tests/qemu-iotests/051 |  91 +---
 tests/qemu-iotests/051.s390-virtio.out | 377 +
 2 files changed, 439 insertions(+), 29 deletions(-)
 create mode 100644 tests/qemu-iotests/051.s390-virtio.out

diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index 11c858f..2b600df 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -137,13 +137,19 @@ run_qemu -drive if=ide
 run_qemu -drive if=virtio
 run_qemu -drive if=scsi
 
-run_qemu -drive if=none,id=disk -device ide-cd,drive=disk
-run_qemu -drive if=none,id=disk -device lsi53c895a -device scsi-cd,drive=disk
-
-run_qemu -drive if=none,id=disk -device ide-drive,drive=disk
-run_qemu -drive if=none,id=disk -device ide-hd,drive=disk
-run_qemu -drive if=none,id=disk -device lsi53c895a -device scsi-disk,drive=disk
-run_qemu -drive if=none,id=disk -device lsi53c895a -device scsi-hd,drive=disk
+case "$QEMU_DEFAULT_MACHINE" in
+s390-virtio)
+;;
+*)
+run_qemu -drive if=none,id=disk -device ide-cd,drive=disk
+run_qemu -drive if=none,id=disk -device lsi53c895a -device 
scsi-cd,drive=disk
+
+run_qemu -drive if=none,id=disk -device ide-drive,drive=disk
+run_qemu -drive if=none,id=disk -device ide-hd,drive=disk
+run_qemu -drive if=none,id=disk -device lsi53c895a -device 
scsi-disk,drive=disk
+run_qemu -drive if=none,id=disk -device lsi53c895a -device 
scsi-hd,drive=disk
+;;
+esac
 
 echo
 echo === Read-only ===
@@ -157,13 +163,19 @@ run_qemu -drive file="$TEST_IMG",if=ide,readonly=on
 run_qemu -drive file="$TEST_IMG",if=virtio,readonly=on
 run_qemu -drive file="$TEST_IMG",if=scsi,readonly=on
 
-run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
ide-cd,drive=disk
-run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
lsi53c895a -device scsi-cd,drive=disk
+case "$QEMU_DEFAULT_MACHINE" in
+s390-virtio)
+;;
+*)
+run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
ide-cd,drive=disk
+run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
lsi53c895a -device scsi-cd,drive=disk
 
-run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
ide-drive,drive=disk
-run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
ide-hd,drive=disk
-run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
lsi53c895a -device scsi-disk,drive=disk
-run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
lsi53c895a -device scsi-hd,drive=disk
+run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
ide-drive,drive=disk
+run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
ide-hd,drive=disk
+run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
lsi53c895a -device scsi-disk,drive=disk
+run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
lsi53c895a -device scsi-hd,drive=disk
+;;
+esac
 
 echo
 echo === Cache modes ===
@@ -172,12 +184,24 @@ echo
 # Cannot use the test image because cache=none might not work on the host FS
 # Use cdrom so that we won't get errors about missing media
 
-run_qemu -drive media=cdrom,cache=none
-run_qemu -drive media=cdrom,cache=directsync
-run_qemu -drive media=cdrom,cache=writeback
-run_qemu -drive media=cdrom,cache=writethrough
-run_qemu -drive media=cdrom,cache=unsafe
-run_qemu -drive media=cdrom,cache=invalid_value
+case "$QEMU_DEFAULT_MACHINE" in
+s390-virtio)
+run_qemu -drive if=scsi,media=cdrom,cache=none
+run_qemu -drive if=scsi,media=cdrom,cache=directsync
+run_qemu -drive if=scsi,media=cdrom,cache=writeback
+run_qemu -drive if=scsi,media=cdrom,cache=writethrough
+run_qemu -drive if=scsi,media=cdrom,cache=unsafe
+run_qemu -drive if=scsi,media=cdrom,cache=invalid_value
+;;
+*)
+run_qemu -drive media=cdrom,cache=none
+run_qemu -drive media=cdrom,cache=directsync
+run_qemu -drive media=cdrom,cache=writeback
+run_qemu -drive media=cdrom,cache=writethrough
+run_qemu -drive media=cdrom,cache=unsafe
+run_qemu -drive media=cdrom,cache=invalid_value
+;;
+esac
 
 echo
 echo === Specifying the protocol layer ===
@@ -241,28 +265,37 @@ echo
 echo === Snapshot mode ===
 echo
 
+case "$QEMU_DEFAULT_MACHINE" in
+s390-virtio)
+device_id="virtio0"
+;;
+*)
+device_id="ide0-hd0"
+;;
+esac
+
 $QEMU_IO -c "write -P 0x11 0 4k" "$TEST_IMG" | _filter_qemu_io
 
-echo 'qem

Re: [Qemu-devel] [PATCH 00/10] pci: Partial conversion to realize

2015-02-04 Thread Markus Armbruster
Ping?

Michael, can this go through your tree?

Markus Armbruster  writes:

> I posted this series as RFC back in October, but it depended on
> patches then still under review, so I put it aside, and promptly
> forgot.  Fortunately, rebasing and updating it wasn't much trouble.
>
> While discussing Gonglei's "[PATCH v2 00/19] usb: convert device init
> to realize", Paolo called the PCI conversion job "Gargantuan".  This
> series attempts to crack it into manageable jobs.
>
> The basic idea comes from qdev core: have the core deal with just
> realize, but default the device models' realize() method to one that
> calls the old init() method.  Unconverted device models don't set
> their realize(), thus get one that calls their init().  We can then
> convert device by device instead of having to convert of all of PCI in
> one Gargantuan go.
>
> Since PCI's exit() cannot fail, I chose not to add an unrealize().
> Precedence: USBDeviceClass method handle_destroy(), called on USB
> unrealize.
>
> Aside: USBDeviceClass also has an unrealize() method, but it's never
> set and never called.
>
> PATCH 01 converts the interface between PCI core and qdev to realize.
>
> PATCH 02 adds realize to the interface between PCI core and PCI device
> models.  Once all device models are converted to realize, the old init
> interface can be dropped, completing the Gargantuan job.
>
> PATCH 03-04 convert device models that cannot fail initialization.
>
> PATCH 05-10 convert a few that can fail, but are really easy to
> convert.
>
> Markus Armbruster (10):
>   pci: Convert core to realize
>   pci: Permit incremental conversion of device models to realize
>   pci: Trivial device model conversions to realize
>   pcnet: pcnet_common_init() always returns 0, change to void
>   pcnet: Convert to realize
>   serial-pci: Convert to realize
>   ide/ich: Convert to realize
>   cirrus-vga: Convert to realize
>   qxl: Convert to realize
>   pci-assign: Convert to realize
>
>  hw/acpi/piix4.c|   5 +-
>  hw/audio/ac97.c|   5 +-
>  hw/audio/es1370.c  |   5 +-
>  hw/audio/intel-hda.c   |   6 +--
>  hw/char/serial-pci.c   |  22 -
>  hw/display/cirrus_vga.c|  11 ++---
>  hw/display/qxl.c   |  36 +++
>  hw/display/vga-pci.c   |  11 ++---
>  hw/display/vmware_vga.c|   6 +--
>  hw/i2c/smbus_ich9.c|   5 +-
>  hw/i386/kvm/pci-assign.c   |  10 ++--
>  hw/ide/cmd646.c|   5 +-
>  hw/ide/ich.c   |  13 +++---
>  hw/ide/piix.c  |  10 ++--
>  hw/ide/via.c   |   6 +--
>  hw/ipack/tpci200.c |   6 +--
>  hw/isa/i82378.c|   6 +--
>  hw/isa/piix4.c |   5 +-
>  hw/isa/vt82c686.c  |  24 --
>  hw/misc/pci-testdev.c  |   6 +--
>  hw/net/e1000.c |   6 +--
>  hw/net/eepro100.c  |   6 +--
>  hw/net/lance.c |   3 +-
>  hw/net/ne2000.c|   6 +--
>  hw/net/pcnet-pci.c |   6 +--
>  hw/net/pcnet.c |   4 +-
>  hw/net/pcnet.h |   2 +-
>  hw/net/rtl8139.c   |   6 +--
>  hw/net/vmxnet3.c   |   6 +--
>  hw/pci-bridge/dec.c|   5 +-
>  hw/pci-host/apb.c  |   5 +-
>  hw/pci-host/bonito.c   |   6 +--
>  hw/pci-host/grackle.c  |   5 +-
>  hw/pci-host/piix.c |  12 ++---
>  hw/pci-host/ppce500.c  |   6 +--
>  hw/pci-host/prep.c |   6 +--
>  hw/pci-host/q35.c  |   5 +-
>  hw/pci-host/uninorth.c |  20 
>  hw/pci-host/versatile.c|   5 +-
>  hw/pci/pci.c   | 113 
> ++---
>  hw/sd/sdhci.c  |   5 +-
>  hw/usb/hcd-ehci-pci.c  |   6 +--
>  hw/usb/hcd-xhci.c  |   6 +--
>  hw/watchdog/wdt_i6300esb.c |   6 +--
>  include/hw/pci/pci.h   |   3 +-
>  45 files changed, 203 insertions(+), 259 deletions(-)



Re: [Qemu-devel] [PATCH v4 4/4] Add migration stream analyzation script

2015-02-04 Thread Amit Shah
On (Thu) 22 Jan 2015 [15:01:40], Alexander Graf wrote:

> [PATCH v4 4/4] Add migration stream analyzation script

analysis is a more widely-accepted word :-)

Amit



Re: [Qemu-devel] [PATCH 00/10] pci: Partial conversion to realize

2015-02-04 Thread Michael S. Tsirkin
On Wed, Feb 04, 2015 at 10:34:45AM +0100, Markus Armbruster wrote:
> Ping?
> 
> Michael, can this go through your tree?

Thanks, I'll apply them.

-- 
MST



Re: [Qemu-devel] [PATCH 00/10] pci: Partial conversion to realize

2015-02-04 Thread Gonglei
On 2015/1/19 22:52, Markus Armbruster wrote:

> I posted this series as RFC back in October, but it depended on
> patches then still under review, so I put it aside, and promptly
> forgot.  Fortunately, rebasing and updating it wasn't much trouble.
> 
> While discussing Gonglei's "[PATCH v2 00/19] usb: convert device init
> to realize", Paolo called the PCI conversion job "Gargantuan".  This
> series attempts to crack it into manageable jobs.
> 
> The basic idea comes from qdev core: have the core deal with just
> realize, but default the device models' realize() method to one that
> calls the old init() method.  Unconverted device models don't set
> their realize(), thus get one that calls their init().  We can then
> convert device by device instead of having to convert of all of PCI in
> one Gargantuan go.
> 
> Since PCI's exit() cannot fail, I chose not to add an unrealize().
> Precedence: USBDeviceClass method handle_destroy(), called on USB
> unrealize.
> 
> Aside: USBDeviceClass also has an unrealize() method, but it's never
> set and never called.
> 
> PATCH 01 converts the interface between PCI core and qdev to realize.
> 
> PATCH 02 adds realize to the interface between PCI core and PCI device
> models.  Once all device models are converted to realize, the old init
> interface can be dropped, completing the Gargantuan job.
> 
> PATCH 03-04 convert device models that cannot fail initialization.
> 
> PATCH 05-10 convert a few that can fail, but are really easy to
> convert.
> 
> Markus Armbruster (10):
>   pci: Convert core to realize
>   pci: Permit incremental conversion of device models to realize
>   pci: Trivial device model conversions to realize
>   pcnet: pcnet_common_init() always returns 0, change to void
>   pcnet: Convert to realize
>   serial-pci: Convert to realize
>   ide/ich: Convert to realize
>   cirrus-vga: Convert to realize
>   qxl: Convert to realize
>   pci-assign: Convert to realize
> 
>  hw/acpi/piix4.c|   5 +-
>  hw/audio/ac97.c|   5 +-
>  hw/audio/es1370.c  |   5 +-
>  hw/audio/intel-hda.c   |   6 +--
>  hw/char/serial-pci.c   |  22 -
>  hw/display/cirrus_vga.c|  11 ++---
>  hw/display/qxl.c   |  36 +++
>  hw/display/vga-pci.c   |  11 ++---
>  hw/display/vmware_vga.c|   6 +--
>  hw/i2c/smbus_ich9.c|   5 +-
>  hw/i386/kvm/pci-assign.c   |  10 ++--
>  hw/ide/cmd646.c|   5 +-
>  hw/ide/ich.c   |  13 +++---
>  hw/ide/piix.c  |  10 ++--
>  hw/ide/via.c   |   6 +--
>  hw/ipack/tpci200.c |   6 +--
>  hw/isa/i82378.c|   6 +--
>  hw/isa/piix4.c |   5 +-
>  hw/isa/vt82c686.c  |  24 --
>  hw/misc/pci-testdev.c  |   6 +--
>  hw/net/e1000.c |   6 +--
>  hw/net/eepro100.c  |   6 +--
>  hw/net/lance.c |   3 +-
>  hw/net/ne2000.c|   6 +--
>  hw/net/pcnet-pci.c |   6 +--
>  hw/net/pcnet.c |   4 +-
>  hw/net/pcnet.h |   2 +-
>  hw/net/rtl8139.c   |   6 +--
>  hw/net/vmxnet3.c   |   6 +--
>  hw/pci-bridge/dec.c|   5 +-
>  hw/pci-host/apb.c  |   5 +-
>  hw/pci-host/bonito.c   |   6 +--
>  hw/pci-host/grackle.c  |   5 +-
>  hw/pci-host/piix.c |  12 ++---
>  hw/pci-host/ppce500.c  |   6 +--
>  hw/pci-host/prep.c |   6 +--
>  hw/pci-host/q35.c  |   5 +-
>  hw/pci-host/uninorth.c |  20 
>  hw/pci-host/versatile.c|   5 +-
>  hw/pci/pci.c   | 113 
> ++---
>  hw/sd/sdhci.c  |   5 +-
>  hw/usb/hcd-ehci-pci.c  |   6 +--
>  hw/usb/hcd-xhci.c  |   6 +--
>  hw/watchdog/wdt_i6300esb.c |   6 +--
>  include/hw/pci/pci.h   |   3 +-
>  45 files changed, 203 insertions(+), 259 deletions(-)
> 

Markus, you may forget to add my 'R-by' in this formal version. :)

For series:
Reviewed-by: Gonglei 

Regards,
-Gonglei




Re: [Qemu-devel] [PATCH 0/4] Pair g_malloc() with g_free(), not free()

2015-02-04 Thread Markus Armbruster
Got reviews, want qemu-trivial: *ping* :)

Markus Armbruster  writes:

> I'm routing these patches through qemu-trivial, even though some of
> them touch actively maintained code, and could go through the
> respective tree instead:
>
> * PATCH 3 sPAPR (Alex)
>
> * PATCH 4 USB (Gerd)
>
> If you want me to reroute any of them, let me know.
>
> Markus Armbruster (4):
>   qemu-option: Replace pointless use of g_malloc0() by g_malloc()
>   qemu-option: Pair g_malloc() with g_free(), not free()
>   spapr_vio: Pair g_malloc() with g_free(), not free()
>   usb: Pair g_malloc() with g_free(), not free()
>
>  hw/ppc/spapr_vio.c | 2 +-
>  hw/usb/desc-msos.c | 2 +-
>  util/qemu-option.c | 8 
>  3 files changed, 6 insertions(+), 6 deletions(-)



Re: [Qemu-devel] [PATCH] xilinx_ethlite: Clean up after commit 2f991ad

2015-02-04 Thread Markus Armbruster
Ping?

Markus Armbruster  writes:

> The "fall through" added by the commit is clearly intentional.  Mark
> it so.  Hushes up Coverity.
>
> Signed-off-by: Markus Armbruster 
> ---
>  hw/net/xilinx_ethlite.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/net/xilinx_ethlite.c b/hw/net/xilinx_ethlite.c
> index 9536f64..ad6b553 100644
> --- a/hw/net/xilinx_ethlite.c
> +++ b/hw/net/xilinx_ethlite.c
> @@ -146,6 +146,7 @@ eth_write(void *opaque, hwaddr addr,
>  if (!(value & CTRL_S)) {
>  qemu_flush_queued_packets(qemu_get_queue(s->nic));
>  }
> +/* fall through */
>  case R_TX_LEN0:
>  case R_TX_LEN1:
>  case R_TX_GIE0:



Re: [Qemu-devel] [PATCH 0/3] util/uri: Cleanups and a bug fix

2015-02-04 Thread Markus Armbruster
Ping?

Markus Armbruster  writes:

> Note: checkpatch is unhappy with the first patch, because I refrained
> from cleaning up the ugly return(NULL).  They're all over the place.
>
> Markus Armbruster (3):
>   util/uri: uri_new() can't fail, drop dead error handling
>   util/uri: realloc2n() can't fail, drop dead error handling
>   util/uri: URI member path can be null, compare more carfully
>
>  util/uri.c | 61 +
>  1 file changed, 13 insertions(+), 48 deletions(-)



Re: [Qemu-devel] [PATCH 2/3] rcu: use rcu_{dereference, assign_pointer} instead of atomic_rcu_{read, set}

2015-02-04 Thread Paolo Bonzini


On 03/02/2015 23:08, Emilio G. Cota wrote:
> This matches the semantics of liburcu.

This is not necessary.  The two sets of macros are exactly the same, so
it's okay to use atomic_rcu_read/write.

Paolo



Re: [Qemu-devel] [RFC v2 6/8] vfio_pci: fix a wrong check in vfio_pci_reset

2015-02-04 Thread Chen Fan


On 02/03/2015 04:16 AM, Alex Williamson wrote:

On Wed, 2015-01-28 at 16:37 +0800, Chen Fan wrote:

when vfio device support FLR, then when device reset,
we call VFIO_DEVICE_RESET ioctl to reset the device first,
at kernel side, we also can see the order of reset:
3330 rc = pcie_flr(dev, probe);
3331 if (rc != -ENOTTY)
3332 goto done;

3334 rc = pci_af_flr(dev, probe);
3335 if (rc != -ENOTTY)
3336 goto done;
3337
3338 rc = pci_pm_reset(dev, probe);
3339 if (rc != -ENOTTY)
3340 goto done;

so when vfio has FLR, reset it directly.

Signed-off-by: Chen Fan 
---
  hw/vfio/pci.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 8c81bb3..54eb6b4 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3455,7 +3455,7 @@ static void vfio_pci_reset(DeviceState *dev)
  vfio_pci_pre_reset(vdev);
  
  if (vdev->vbasedev.reset_works &&

-(vdev->has_flr || !vdev->has_pm_reset) &&
+vdev->has_flr &&
  !ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {
  trace_vfio_pci_reset_flr(vdev->vbasedev.name);
  goto post_reset;

Does this actually fix anything?  QEMU shouldn't rely on a specific
behavior of the kernel.  This test is de-prioritizing a PM reset because
they're often non-effective.  If the device supports FLR, the second
part of the OR is unreached, so what's the point of this change?

For this change, when I tested the code on my own machine.
I found the vfio device has neither flr nor pm reset (e.g. NoSoftRst+).
this also trigger ioctl VFIO_DEVICE_RESET, is it right?

Thanks,
Chen



Thanks,

Alex

.






Re: [Qemu-devel] [PATCH v4 0/4] Migration Deciphering aid

2015-02-04 Thread Amit Shah
On (Thu) 22 Jan 2015 [15:01:36], Alexander Graf wrote:
> Migration is a black hole to most people. One of the biggest reasons for
> this is that its protocol is a secret, undocumented sauce of code rolling
> around random parts of the QEMU code base.
> 
> But what if we simply exposed the description of how the format looks like
> alongside the actual migration stream? This is what this patch set does.
> 
> It adds a new section that comes after the end of stream marker (so that it
> doesn't slow down migration) that contains a JSON description of the device
> state description.
> 
> Along with this patch set also comes a python script that can read said JSON
> from a migration dump and decipher the device state and ram contents of the
> migration dump using it.
> 
> With this, you can now fully examine all glorious details that go over the
> wire when virtual machine state gets dumped, such as during live migration.
> 
> We discussed the approach taken here during KVM Forum 2013. Originally, my 
> idea
> was to include a special device that contains the JSON data which can be 
> enabled
> on demand. Anthony suggested however to just always include the description 
> data
> after the end marker which I think is a great idea.

'make check' fails with this series applied:

migration/vmstate.o: In function `vmstate_save_state':
qemu/migration/vmstate.c:289: undefined reference to `json_prop_str'
qemu/migration/vmstate.c:290: undefined reference to `json_prop_int'
qemu/migration/vmstate.c:291: undefined reference to `json_start_array'
migration/vmstate.o: In function `vmsd_desc_field_start':
qemu/migration/vmstate.c:245: undefined reference to `json_start_object'
qemu/migration/vmstate.c:246: undefined reference to `json_prop_str'
qemu/migration/vmstate.c:249: undefined reference to `json_prop_int'
qemu/migration/vmstate.c:254: undefined reference to `json_prop_str'
migration/vmstate.o: In function `vmsd_desc_field_end':
qemu/migration/vmstate.c:275: undefined reference to `json_prop_int'
qemu/migration/vmstate.c:276: undefined reference to `json_end_object'
migration/vmstate.o: In function `vmstate_save_state':
qemu/migration/vmstate.c:337: undefined reference to `json_end_array'
migration/vmstate.o: In function `vmstate_subsection_save':
qemu/migration/vmstate.c:423: undefined reference to `json_start_object'
qemu/migration/vmstate.c:434: undefined reference to `json_end_object'
qemu/migration/vmstate.c:419: undefined reference to `json_start_array'
migration/vmstate.o: In function `vmsd_desc_field_start':
qemu/migration/vmstate.c:251: undefined reference to `json_prop_int'
migration/vmstate.o: In function `vmsd_desc_field_end':
qemu/migration/vmstate.c:272: undefined reference to `json_end_object'
migration/vmstate.o: In function `vmsd_desc_field_start':
qemu/migration/vmstate.c:257: undefined reference to `json_start_object'
migration/vmstate.o: In function `vmstate_subsection_save':
qemu/migration/vmstate.c:441: undefined reference to `json_end_array'
collect2: error: ld returned 1 exit status
qemu/rules.mak:122: recipe for target 'tests/test-vmstate' failed
make: *** [tests/test-vmstate] Error 1

Amit



[Qemu-devel] [PATCH v2 2/6] rtl8139: g_malloc() can't fail, bury dead error handling

2015-02-04 Thread Markus Armbruster
Signed-off-by: Markus Armbruster 
Reviewed-by: Eric Blake 
Reviewed-by: Gonglei 
---
 hw/net/rtl8139.c | 14 --
 1 file changed, 14 deletions(-)

diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 6fa9e0a..b55e438 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -2075,20 +2075,6 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
 "length to %d\n", txsize);
 }
 
-if (!s->cplus_txbuffer)
-{
-/* out of memory */
-
-DPRINTF("+++ C+ mode transmiter failed to reallocate %d bytes\n",
-s->cplus_txbuffer_len);
-
-/* update tally counter */
-++s->tally_counters.TxERR;
-++s->tally_counters.TxAbt;
-
-return 0;
-}
-
 /* append more data to the packet */
 
 DPRINTF("+++ C+ mode transmit reading %d bytes from host memory at "
-- 
1.9.3




[Qemu-devel] [PATCH v2 3/6] kvm: g_malloc() can't fail, bury dead error handling

2015-02-04 Thread Markus Armbruster
Signed-off-by: Markus Armbruster 
Reviewed-by: Eric Blake 
Reviewed-by: Gonglei 
---
 kvm-all.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 2f21a4e..05a79c2 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -2070,10 +2070,6 @@ int kvm_insert_breakpoint(CPUState *cpu, target_ulong 
addr,
 }
 
 bp = g_malloc(sizeof(struct kvm_sw_breakpoint));
-if (!bp) {
-return -ENOMEM;
-}
-
 bp->pc = addr;
 bp->use_count = 1;
 err = kvm_arch_insert_sw_breakpoint(cpu, bp);
-- 
1.9.3




[Qemu-devel] [PATCH v2 5/6] vnc: g_realloc() can't fail, bury dead error handling

2015-02-04 Thread Markus Armbruster
Signed-off-by: Markus Armbruster 
Reviewed-by: Eric Blake 
Reviewed-by: Gonglei 
Reviewed-by: Thomas Huth 
---
 ui/vnc.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index a742c90..02552ee 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -659,10 +659,6 @@ void buffer_reserve(Buffer *buffer, size_t len)
 if ((buffer->capacity - buffer->offset) < len) {
 buffer->capacity += (len + 1024);
 buffer->buffer = g_realloc(buffer->buffer, buffer->capacity);
-if (buffer->buffer == NULL) {
-fprintf(stderr, "vnc: out of memory\n");
-exit(1);
-}
 }
 }
 
-- 
1.9.3




[Qemu-devel] [PATCH v2 0/6] Trivial cleanups around g_malloc()

2015-02-04 Thread Markus Armbruster
I'm routing these patches through qemu-trivial, even though some of
them touch actively maintained code, and could go through the
respective tree instead:

* PATCH 1 block (Kevin, Stefan)

* PATCH 3 KVM (Paolo)

* PATCH 4 migration (Juan, Amit)

* PATCH 5 VNC (Gerd)

If you want me to reroute any of them, let me know.

v2:
* Trivially rebased, R-bys retained

Markus Armbruster (6):
  onenand: g_malloc() can't fail, bury dead error handling
  rtl8139: g_malloc() can't fail, bury dead error handling
  kvm: g_malloc() can't fail, bury dead error handling
  rdma: g_malloc0() can't fail, bury dead error handling
  vnc: g_realloc() can't fail, bury dead error handling
  translate-all: Use g_try_malloc() for dynamic translator buffer

 hw/block/onenand.c |  8 +---
 hw/net/rtl8139.c   | 14 --
 kvm-all.c  |  4 
 migration/rdma.c   |  3 ---
 translate-all.c|  2 +-
 ui/vnc.c   |  4 
 6 files changed, 2 insertions(+), 33 deletions(-)

-- 
1.9.3




[Qemu-devel] [PATCH v2 4/6] rdma: g_malloc0() can't fail, bury dead error handling

2015-02-04 Thread Markus Armbruster
Signed-off-by: Markus Armbruster 
Reviewed-by: Eric Blake 
Reviewed-by: Gonglei 
---
 migration/rdma.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index b32dbdf..306570d 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1147,9 +1147,6 @@ static int qemu_rdma_register_and_get_keys(RDMAContext 
*rdma,
 /* allocate memory to store chunk MRs */
 if (!block->pmr) {
 block->pmr = g_malloc0(block->nb_chunks * sizeof(struct ibv_mr *));
-if (!block->pmr) {
-return -1;
-}
 }
 
 /*
-- 
1.9.3




[Qemu-devel] [PATCH v2 6/6] translate-all: Use g_try_malloc() for dynamic translator buffer

2015-02-04 Thread Markus Armbruster
The USE_MMAP code can fail, and the caller handles the failure
already.  Let the !USE_MMAP code fail as well, for consistency.

Signed-off-by: Markus Armbruster 
Reviewed-by: Eric Blake 
Reviewed-by: Gonglei 
---
 translate-all.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/translate-all.c b/translate-all.c
index 4a1b64f..9f47ce7 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -631,7 +631,7 @@ static inline void *alloc_code_gen_buffer(void)
 #else
 static inline void *alloc_code_gen_buffer(void)
 {
-void *buf = g_malloc(tcg_ctx.code_gen_buffer_size);
+void *buf = g_try_malloc(tcg_ctx.code_gen_buffer_size);
 
 if (buf == NULL) {
 return NULL;
-- 
1.9.3




Re: [Qemu-devel] [PATCH v2 00/12] QEmu memory hot unplug support

2015-02-04 Thread Michael S. Tsirkin
Re: subject - it's really QEMU - not QEmu.

you have this in code as well, so I'm guessing you didn't
run your patches through checkpatch.pl. Please do
and fix fallout if any.

-- 
MST



[Qemu-devel] [PATCH v2 1/6] onenand: g_malloc() can't fail, bury dead error handling

2015-02-04 Thread Markus Armbruster
Signed-off-by: Markus Armbruster 
Reviewed-by: Eric Blake 
Reviewed-by: Gonglei 
---
 hw/block/onenand.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/hw/block/onenand.c b/hw/block/onenand.c
index 348630d..1b2c893 100644
--- a/hw/block/onenand.c
+++ b/hw/block/onenand.c
@@ -346,15 +346,9 @@ static inline int onenand_prog_spare(OneNANDState *s, int 
sec, int secn,
 static inline int onenand_erase(OneNANDState *s, int sec, int num)
 {
 uint8_t *blankbuf, *tmpbuf;
+
 blankbuf = g_malloc(512);
-if (!blankbuf) {
-return 1;
-}
 tmpbuf = g_malloc(512);
-if (!tmpbuf) {
-g_free(blankbuf);
-return 1;
-}
 memset(blankbuf, 0xff, 512);
 for (; num > 0; num--, sec++) {
 if (s->blk_cur) {
-- 
1.9.3




Re: [Qemu-devel] [PATCH 00/10] pci: Partial conversion to realize

2015-02-04 Thread Markus Armbruster
Gonglei  writes:

> Markus, you may forget to add my 'R-by' in this formal version. :)

I chose not to carry R-bys from the RFC version forward, because the
rebase after almost three months wasn't 100% trivial.  I certainly
didn't mean to snub your review!

> For series:
> Reviewed-by: Gonglei 

Thanks!



Re: [Qemu-devel] [PATCH 0/3] rcu: add option to use upstream liburcu

2015-02-04 Thread Paolo Bonzini


On 03/02/2015 23:08, Emilio G. Cota wrote:
> * The first two patches bring back the RCU API to exactly
>   match that of liburcu.

Bringing over rcu_dereference/rcu_assign_pointer is unnecessary, I
think.  The names do not conflict.

As to call_rcu/call_rcu1, I understand there is a problem.  Maybe call
the QEMU versions rcu_call/rcu_call1?  Then you can add simple wrappers
if you want to use liburcu-mb.

> * The third patch adds a configure flag to choose from either
>   liburcu or QEMU's RCU.
> 
> Apart from this, I wonder what to do about other valuable bits in
> liburcu, particularly in liburcu-cds, which I'm using currently
> off-tree.  I see three ways of eventually doing this:
> 
> a) Add Windows support in liburcu, thereby eliminating the
>de facto fork in QEMU.

This would be certainly doable.

Note that liburcu is not widely packaged, so we would have to add it as
a submodule.  What is the advantage of using liburcu?

> b) Bring (fork) liburcu-cds to QEMU, just like liburcu-mb was.

To some extent this is what we're doing.  cds is very much inspired by
the Linux kernel, but QEMU is already using both BSD (intrusive) and
GLib (non-intrusive) lists, and I didn't like the idea of adding yet
another API.  I like the simplicity of the Linux hlist/list APIs, but
two kinds of lists are already one too many.

So, part 2 of the RCU work has an API for RCU lists based on BSD lists
(that QEMU is already using).  These are not the lock-free data
structures available in CDS, just the usual RCU-based lists with
blocking write side and wait-free read side.

QEMU has very limited support for (non-RCU-based) lock-free data
structures in qemu/queue.h: see QSLIST_INSERT_HEAD_ATOMIC and
QSLIST_MOVE_ATOMIC.  The call_rcu implementation in util/rcu.c is based
on wfqueue from liburcu-cds, but it would not be hard to change it to
use QSLIST_INSERT_HEAD_ATOMIC/QSLIST_MOVE_ATOMIC instead.  In both cases
the data structure is multi-producer/single-consumer.

QEMU hardly uses hash tables at all.

Another thing to note is that I didn't envision a huge use of RCU in
QEMU; I'm employing it in decidedly central places where it can bring
great benefit, but I wouldn't be surprised if RCU only found a handful
of users in the end.

Coarse locks with very low contention (such as AioContext) have great
performance, and most data structures in QEMU fit this model: data
structures for separate devices are more or less independent, and the
lock is only needed for the rare cases when the main thread (for example
the monitor) interacts with the device.

> c) Add a compile-time flag (say CONFIG_LIBURCU_CDS), and then only
>use data structures from liburcu-cds where appropriate, falling
>back to traditional locked structures when !CONFIG_LIBURCU_CDS.
> 
> Would c) be acceptable for upstream, provided the gains (say in
> scalability/memory footprint) are significant?

I think if there were a killer use of liburcu-cds, we would just port
liburcu (the mb, bp and qsbr variants) to Windows, and switch to
liburcu-mb/liburcu-cds.

This brings the most important question of all: what are you doing with
QEMU? :)

Paolo



Re: [Qemu-devel] [PATCH 00/10] pci: Partial conversion to realize

2015-02-04 Thread Gonglei
On 2015/2/4 18:30, Markus Armbruster wrote:

> Gonglei  writes:
> 
>> Markus, you may forget to add my 'R-by' in this formal version. :)
> 
> I chose not to carry R-bys from the RFC version forward, because the
> rebase after almost three months wasn't 100% trivial.  I certainly
> didn't mean to snub your review!
> 

Get it, thanks.

Regards,
-Gonglei

>> For series:
>> Reviewed-by: Gonglei 
> 
> Thanks!






Re: [Qemu-devel] [PATCH v2 00/12] QEmu memory hot unplug support

2015-02-04 Thread Zhu Guihua
On Wed, 2015-02-04 at 11:28 +0100, Michael S. Tsirkin wrote:
> Re: subject - it's really QEMU - not QEmu.
> 
> you have this in code as well, so I'm guessing you didn't
> run your patches through checkpatch.pl. Please do
> and fix fallout if any.

I do it now, and I find warnings about this.
I will fix it. Thanks.

Regards,
Zhu

> 





Re: [Qemu-devel] [PATCH RFC 1/1] KVM: s390: Add MEMOP ioctl for reading/writing guest memory

2015-02-04 Thread Paolo Bonzini


On 04/02/2015 09:26, Christian Borntraeger wrote:
> Am 03.02.2015 um 16:22 schrieb Paolo Bonzini:
>> On 03/02/2015 16:16, Thomas Huth wrote:
>>> Actually, I'd prefer to keep the "virtual" in the defines for the type
>>> of operation below: When it comes to s390 storage keys, we likely might
>>> need some calls for reading and writing to physical memory, too. Then
>>> we could simply extend this ioctl instead of inventing a new one.
> 
> Rereading that. Shall we replace "virtual" with "logical"? That is what is
> used architecturally when we mean "do whatever is appropriate right now"
> That can boil down to virtual via DAT, virtual via access register mode, 
> real if DAT is off... and if fact your kernel implementation does that.

That makes sense.

>> Can you explain why it is necessary to read/write physical addresses
>> from user space?  In the case of QEMU, I'm worried that you would have
>> to invent your own memory read/write APIs that are different from
>> everything else.
>>
>> On real s390 zPCI, does bus-master DMA update storage keys?
> 
> the classic channel I/O does set the storage key change/reference and
> also triggers errors in the storage key protection value mismatches.
> 
> The PCI IOTA structure does contain a storage key value for accesses,
> so I assume its the same here, but I dont know for sure.

Emulating that in QEMU would be very hard.  Every DMA read/write would
have to go through a bounce buffer, but QEMU block device models for
example try hard to read from host disk directly into guest memory.

> Conny:
> I am asking myself, if we should explicitly add a comment in the 
> virtio-ccw spec, that all accesses are assumed to be with key 0 and 
> thus never cause key protection. The change/reference bit is set
> by the underlying I/O or memory copy anyway.

Can you explain the last sentence? :)

Paolo

> We can then add a ccw later on to set a different key if we ever need
> that.
> 
> 
>>
 Not really true, as you don't check it.  So "It is not used by KVM with
 the currently defined set of flags" is a better explanation.
>>>
>>> ok ... and maybe add "should be set to zero" ?
>>
>> If you don't check it, it is misleading to document this.
>>
>> Paolo
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



Re: [Qemu-devel] [v2][PATCH] libxl: add one machine property to support IGD GFX passthrough

2015-02-04 Thread Ian Campbell
On Wed, 2015-02-04 at 09:34 +0800, Chen, Tiejun wrote:
> >
> >>   "-machine xxx,igd-passthru=on", to enable/disable that feature.
> >>   And we also remove that old option, "-gfx_passthru", just from
> >>   the case of LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN since actually
> >>   no any qemu stream version really need or use that.
> > ^up ?
> >
> > What happens if you pass this new option to an older version of qemu
> > upstream? I suppose it doesn't fail any worse than passing -gfx_passthru
> > would have done.
> 
> Neither '-gfx_passthru' nor 'igd-passthrou' exists in any version of 
> qemu upstream. As I mentioned previously, just now we're starting to 
> support this in qemu upstream :)
> 
> But you known, on Xen side we have two qemu versions, 
> qemu-xen-traditional and qemu-xen. Although '-gfx_passthru' is adopted 
> in both two versions, just qemu-xen-traditional supports IGD passthrough 
> completely. For qemu-xen, we just have this term definition but without 
> that IGD passthrough feature support actually.

I'm afraid I don't follow, you seem to be simultaneously saying that
qemu-xen both does and does not support -gfx_passthru, which cannot be
right. I think from the following paragraph that what you mean is that
upstream qemu-xen has no support for any kind of -gfx_passthru command
line option in any form in any existing version, including the git tree.
Is that correct?

(for the purposes of this conversation qemu-traditional is not of
interest)

> And now we're trying to support IGD passthrough in qemu stream. This 
> mean we should set 'device_model_version="qemu-xen", right? Then libxl 
> still pass '-gfx_passthru' to qemu upstream. But when I post other 
> stuffs to qemu upstream community to support IGD passthrough. Gerd 
> thought "-machine xxx,igd-passthru=on" is better than '-gfx_passthru'. 
> So finally you see this change in Xen/tools/libxl.
> 
> >
> > I have one more general concern, which is that hardcoding igd-passthru
> > here may make it harder to support gfx_passthru of other cards in the
> > future.
> 
> Actually gfx_passthrou is introduced to just work for IGD passthrough 
> since something specific to IGD is tricky, so we have to need such a 
> flag to handle this precisely, like its fixed at 00:02.0, and expose 
> some ISA bridge PCI config info and even host bridge PCI config info.
> 
> So I don't thing other cards need this.

If one type VGA device needs these sorts of workaround it is not
inconceivable that some other one will also need workarounds in the
future.

Even if you don't consider non-IGD, what about the possibility of a
future IGD device which needs different (or additional, or fewer)
workarounds?

> > Somehow something in the stack needs to either detect or be told which
> > kind of card to passthrough. Automatic detection would be preferable,
> > but maybe being told by the user is the only possibility.
> 
> Based on the above explanation, something should be done before we 
> detect to construct that real card , so its difficulty to achieve this 
> goal currently.
> 
> >
> > Is there any way, given gfx_passthru as a boolean that libxl can
> > automatically figure out that IGD passthru is what is actually desired
> > -- e.g. by scanning the set of PCI devices given to the guest perhaps?
> 
> Sorry I don't understand what you mean here.

"gfx_passthru" is a generically named option, but it is being
implemented in an IGD specific way. We need to consider the possibility
of other graphics devices needing special handling in the future and
plan accordingly such that we can try and maintain our API guarantees
when this happens.

I think there are three ways to achieve that:

  * Make the libxl/xl option something which is not generic e.g.
igd_passthru=1
  * Add a second option to allow the user to configure the kind of
graphics device being passed thru (e.g. gfx_passthru=1,
passthru_device="igd"), or combine the two by making the
gfx_passthru option a string instead of a boolean.
  * Make libxl detect the type of graphics device somehow and
therefore automatically determine when gfx_passthru=1 =>
igd-passthru

>  Currently, we have to set 
> something as follows,
> 
> gfx_passthru=1
> pci=["00:02.0"]
> 
> This always works for qemu-xen-traditional.
> 
> But you should know '00:02.0' doesn't mean we are passing IGD through.

But by looking at the device 00:02.0 (e.g. its PCI vendor and device ID
and other properties) we can unambiguously determine if it is an IGD
device or not, can't we?

> > If not then that _might_ suggest we should deprecate the gdx_passthru
> > option at the libxl interface and switch to something more expressive,
> > such as an Enumeration of card types (with a singleton of IGD right
> > now), but I'm not really very familiar with passthru nor the qemu side
> > of this.
> >
> > What happens if you try to pass two different GFX cards to a guest?
> >
> 

Re: [Qemu-devel] [PATCH v2] qga: add guest-set-admin-password command

2015-02-04 Thread Roman Kagan
On Mon, Jan 12, 2015 at 03:58:14PM +, Daniel P. Berrange wrote:
> Add a new 'guest-set-admin-password' command for changing the
> root/administrator password. This command is needed to allow
> OpenStack to support its API for changing the admin password
> on a running guest.
> 
> Accepts either the raw password string:
> 
> $ virsh -c qemu:///system  qemu-agent-command f21x86_64 \
>'{ "execute": "guest-set-admin-password", "arguments":
>  { "crypted": false, "password": "12345678" } }'
>   {"return":{}}
> 
> Or a pre-encrypted string (recommended)
> 
> $ virsh -c qemu:///system  qemu-agent-command f21x86_64 \
>'{ "execute": "guest-set-admin-password", "arguments":
>  { "crypted": true, "password":
> "$6$T9O/j/aGPrE...sniprQoRN4F0.GG0MPjNUNyml." } }'

Does it have to be a QMP command?  Wouldn't the recently (re-)submitted
guest-exec allow to do the same, by running "chpasswd" in the guest and
piping the username:password into its stdin?

Besides I think it makes sense to (optionally) pass the username, to
allow to change the password for arbitrary users.  This would make the
functionality useful for systems where root password plays no role as
root logins are disallowed, and the only access to root shell is via
sudo from a user belonging to a particular group (IIRC Ubuntu is usually
set up like that).

> NB windows support is desirable, but not implemented in this
> patch.

Yes Windows may have an issue with username here too, because the admin
user can be any user (and even "Administrator" can be localized).

Roman.



Re: [Qemu-devel] [PATCH v2] qga: add guest-set-admin-password command

2015-02-04 Thread Daniel P. Berrange
On Wed, Feb 04, 2015 at 01:48:40PM +0300, Roman Kagan wrote:
> On Mon, Jan 12, 2015 at 03:58:14PM +, Daniel P. Berrange wrote:
> > Add a new 'guest-set-admin-password' command for changing the
> > root/administrator password. This command is needed to allow
> > OpenStack to support its API for changing the admin password
> > on a running guest.
> > 
> > Accepts either the raw password string:
> > 
> > $ virsh -c qemu:///system  qemu-agent-command f21x86_64 \
> >'{ "execute": "guest-set-admin-password", "arguments":
> >  { "crypted": false, "password": "12345678" } }'
> >   {"return":{}}
> > 
> > Or a pre-encrypted string (recommended)
> > 
> > $ virsh -c qemu:///system  qemu-agent-command f21x86_64 \
> >'{ "execute": "guest-set-admin-password", "arguments":
> >  { "crypted": true, "password":
> > "$6$T9O/j/aGPrE...sniprQoRN4F0.GG0MPjNUNyml." } }'
> 
> Does it have to be a QMP command?  Wouldn't the recently (re-)submitted
> guest-exec allow to do the same, by running "chpasswd" in the guest and
> piping the username:password into its stdin?

guest-exec puts the burden on the calling application to figure out which
command to invoke and what syntax it has. This is really sucky for any
kind of cross-OS portability. ie windows is going to be completely different
from Linux, and even different UNIX variants are different to some extent.

I don't consider guest-exec to be something that managment applications
should *ever* use to build features around. It is just a useful mechanism
for human administrators to do ad-hoc interactions with guests.

> Besides I think it makes sense to (optionally) pass the username, to
> allow to change the password for arbitrary users.  This would make the
> functionality useful for systems where root password plays no role as
> root logins are disallowed, and the only access to root shell is via
> sudo from a user belonging to a particular group (IIRC Ubuntu is usually
> set up like that).

Yep, extending it to any username is a possibility if it is thought to
be useful

> 
> > NB windows support is desirable, but not implemented in this
> > patch.
> 
> Yes Windows may have an issue with username here too, because the admin
> user can be any user (and even "Administrator" can be localized).

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH RFC 1/1] KVM: s390: Add MEMOP ioctl for reading/writing guest memory

2015-02-04 Thread Thomas Huth
On Wed, 04 Feb 2015 09:26:11 +0100
Christian Borntraeger  wrote:

> Am 03.02.2015 um 16:22 schrieb Paolo Bonzini:
> > 
> > 
> > On 03/02/2015 16:16, Thomas Huth wrote:
> >> Actually, I'd prefer to keep the "virtual" in the defines for the type
> >> of operation below: When it comes to s390 storage keys, we likely might
> >> need some calls for reading and writing to physical memory, too. Then
> >> we could simply extend this ioctl instead of inventing a new one.
> 
> Rereading that. Shall we replace "virtual" with "logical"? That is what is
> used architecturally when we mean "do whatever is appropriate right now"
> That can boil down to virtual via DAT, virtual via access register mode, 
> real if DAT is off... and if fact your kernel implementation does that.

True, but so far I tried to avoid to include s390-only wording into
this ioctl in case other architectures might need such a functionality
later, too (then this ioctl could be simply used there, too).

OTOH, maybe the memory model on s390 is just too special to try to keep
this ioctl generic ... but then I guess I also should rename the
define KVM_GUEST_MEM_OP into KVM_S390_GUEST_MEM_OP, for example?

 Thomas




[Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Daniel P. Berrange
In QEMU there are a number of features which involve communication with an
external system over an I/O channel of some form. The features include
migration, NBD, VNC and character devices. The I/O channel in question might
might be a FIFO pipe, a PTY, a TCP socket, a UNIX domain socket, RMDA channel
or something else, while the external system can be another QEMU, libvirt, an
NBD server, or something else.

Currently the only place where there is any meaningful level of security is
the VNC server, which supports the VeNCrypt extension providing TLS sessions
for data encryption and x509 cert validation for authentication, and the SASL
extension which also provides either encryption of authentication or both.
The migration data channel is more or less completely unprotected unless it
is tunnelled via libvirt or equivalent external secure channel. The same is
true for NBD, though there was a recent discussion about defining an extension
to use TLS. Likewise serial ports, parallel ports, virtio consoles all use the
chardev backends which offer no security features.

There are some other network related pieces in QEMU, namely the built-in iSCSI,
RBD, NFS clients, and SPICE server. Since those are all implemented via 3rd
party libraries rather than natively by QEMU I'm going ignore them for the sake
of this discussion and focus on network interaction directly implemented in
QEMU.

We have a broad goal in OpenStack that every network channel in use must have
encryption and authentication capabilities. Currently all the communication
channels between the end user and the cloud infrastructure edge servers are
secured, but internally a number of the cloud infrastructure components are
unsecured. For example, we recommend to tunnel migration via libvirt, though
that excludes use of the NBD for block migration since libvirt can't currently
tunnel that. Internally we will shortly use  VeNCrypt for protecting VNC
between the compute hosts and the openstack console proxy edge servers. We
are lacking the abilty to secure serial ports between the compute nods and
console proxy.

Essentially the project considers that it is no longer sufficient to consider
the private management LAN (on which the cloud infrastructure is deployed) to
be fully trusted; it must be considered hostile.

So back to QEMU/KVM, we want to have

 - Native support for encryption & authentication with migration, since
   tunnelling via libvirt has a non-negligble performance overhead adding
   both latency and CPU load

 - Native support for encryption & authentication with NBD server to enable
   security of the block migration service

 - Native support for encryption & authentication with chardev TCP backends
   to enable security of the serial port consoles.

As a starting point, the desire is to have TLS for session encryption and
x509 certificate verification for authentication, but at a later date we'd
also like to add the ability to use SASL for either aspect too. Thinking about
QEMU users in general, it would probably be useful if any impl could allow for
future enhancements such as SSH tunnelling too.

I have some development cycles available to work on addressing these gaps in
QEMU, along with relevant experiance since I did the previous VNC server work
for adding TLS and SASL in QEMU, along with similar in libvirt and GTK-VNC.

Having looked at the QEMU code for VNC, migration and chardevs I think the
main stumbling block is currently that QEMU does not have any kind of standard
approach for dealing with I/O channels internally. Migration has the QEMUFile
abstraction, but it is currently fairly coupled to the migration code itself
and limited in scope, because it doesn't actually deal with socket listen
or connect tasks. That's still part of the migration code itself, QEMUFile
only deals with I/O transfer, so it is a pretty incomplete abstraction layer.
The chardev code has a backend abstraction, but that is really horribly
entwined with the chardev code so not reusable in any way without a rewrite
from scratch IMHO. The VNC server has alot of useful code for dealing with
TLS & SASL, but it is somewhat entwined with the VNC server. The VNC server
doesn't use any I/O abstraction just preferring raw FDs / sockets. I've not
looked at the NBD code in detail, but I'm presuming it is using raw FDs /
sockets.

Since this TLS/SASL code is non-trivial (and obviously security critical), I
really don't want to end up with 4 separate places to implement it in QEMU.
IMHO the only practical / sensible approach is to define some kind of standard
I/O channel API internally to QEMU which migration, NBD, chardev and VNC all
use. That gives us a single place to integrate all the security mechanisms
we need to support.  In libvirt we did something like this a little while
ago by defining a standard internal sockets API[1], with plugins for things
like SASL[2], TLS[4] and SSH[5] and it has been very successful in simplifying
the code by centralizing the hairy l

[Qemu-devel] [PATCH] vl.c: fixed QEMU crash if no display was selected in command line

2015-02-04 Thread Marcel Apfelbaum
Commit 4db14629c3 (vnc: switch to QemuOpts, allow multiple servers)
converted vnc option to QemuOpts.
However, the default vnc display had ...,to=99,... between parameters
that is not covered by the QemuOpts.
Remove it because is not longer used and solves the crash.

Signed-off-by: Marcel Apfelbaum 
---
This issue is bugging me some time now. Please let me know if
I got it wrong.

 vl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vl.c b/vl.c
index 983259b..c8f33d2 100644
--- a/vl.c
+++ b/vl.c
@@ -3970,7 +3970,7 @@ int main(int argc, char **argv, char **envp)
 #elif defined(CONFIG_SDL) || defined(CONFIG_COCOA)
 display_type = DT_SDL;
 #elif defined(CONFIG_VNC)
-vnc_parse_func("localhost:0,to=99,id=default");
+vnc_parse_func("localhost:0,id=default");
 show_vnc_port = 1;
 #else
 display_type = DT_NONE;
-- 
2.1.0




Re: [Qemu-devel] [PATCH v5 07/26] qcow2: Respect error in qcow2_alloc_bytes()

2015-02-04 Thread Kevin Wolf
Am 15.12.2014 um 13:50 hat Max Reitz geschrieben:
> qcow2_update_cluster_refcount() may fail, and qcow2_alloc_bytes() should
> mind that case.
> 
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 
> Reviewed-by: Stefan Hajnoczi 
> ---
>  block/qcow2-refcount.c | 33 +
>  1 file changed, 21 insertions(+), 12 deletions(-)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 0308a7e..db81647 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -778,8 +778,8 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, 
> uint64_t offset,
>  int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
>  {
>  BDRVQcowState *s = bs->opaque;
> -int64_t offset, cluster_offset;
> -int free_in_cluster;
> +int64_t offset, cluster_offset, new_cluster;
> +int free_in_cluster, ret;
>  
>  BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC_BYTES);
>  assert(size > 0 && size <= s->cluster_size);
> @@ -800,23 +800,32 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int 
> size)
>  free_in_cluster -= size;
>  if (free_in_cluster == 0)
>  s->free_byte_offset = 0;
> -if (offset_into_cluster(s, offset) != 0)
> -qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
> -  false, QCOW2_DISCARD_NEVER);
> +if (offset_into_cluster(s, offset) != 0) {
> +ret = qcow2_update_cluster_refcount(bs, offset >> 
> s->cluster_bits,
> +1, false, 
> QCOW2_DISCARD_NEVER);
> +if (ret < 0) {
> +return ret;

Not sure how relevant it is, but s->free_byte_offset has already been
increased, so we're leaving sub-cluster space unused. (It's not really
leaking as freeing all references still frees the cluster.)

> +}
> +}
>  } else {
> -offset = qcow2_alloc_clusters(bs, s->cluster_size);
> -if (offset < 0) {
> -return offset;
> +new_cluster = qcow2_alloc_clusters(bs, s->cluster_size);
> +if (new_cluster < 0) {
> +return new_cluster;
>  }

offset is the return value of this function, and now there are cases
where it isn't set to new_cluster any more (I wonder why gcc doesn't
warn).

Why can't we keep offset where it was used and only assign new_cluster
additionally so we can do the cleanup?

>  cluster_offset = start_of_cluster(s, s->free_byte_offset);
> -if ((cluster_offset + s->cluster_size) == offset) {
> +if ((cluster_offset + s->cluster_size) == new_cluster) {
>  /* we are lucky: contiguous data */
>  offset = s->free_byte_offset;
> -qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
> -  false, QCOW2_DISCARD_NEVER);
> +ret = qcow2_update_cluster_refcount(bs, offset >> 
> s->cluster_bits,
> +1, false, 
> QCOW2_DISCARD_NEVER);
> +if (ret < 0) {
> +qcow2_free_clusters(bs, new_cluster, s->cluster_size,
> +QCOW2_DISCARD_NEVER);
> +return ret;
> +}
>  s->free_byte_offset += size;
>  } else {
> -s->free_byte_offset = offset;
> +s->free_byte_offset = new_cluster;
>  goto redo;
>  }
>  }

Kevin



Re: [Qemu-devel] [PATCH RFC 1/1] KVM: s390: Add MEMOP ioctl for reading/writing guest memory

2015-02-04 Thread Christian Borntraeger
Am 04.02.2015 um 11:39 schrieb Paolo Bonzini:
>> Conny:
>> I am asking myself, if we should explicitly add a comment in the 
>> virtio-ccw spec, that all accesses are assumed to be with key 0 and 
>> thus never cause key protection. The change/reference bit is set
>> by the underlying I/O or memory copy anyway.
> 
> Can you explain the last sentence? :)

Whenever vhost or qemu or a finished aio request wrote content into a
virtio buffer, the HW has set the storage key for that physical page,
which  makes it automatically dirty/referenced in the guest visible
storage key. 


For completeness sake: 
Now, if the guest does not use the storage key, but instead the new fault
based software dirty tracking, it wont notice the change bit. The guest 
I/O itself when finished will mark the struct page as Dirty, just like on
x86.

Makes sense?

Christian




Re: [Qemu-devel] [PATCH RFC 1/1] KVM: s390: Add MEMOP ioctl for reading/writing guest memory

2015-02-04 Thread Paolo Bonzini


On 04/02/2015 12:25, Christian Borntraeger wrote:
> Whenever vhost or qemu or a finished aio request wrote content into a
> virtio buffer, the HW has set the storage key for that physical page,
> which  makes it automatically dirty/referenced in the guest visible
> storage key. 

Ah, I knew the storage keys were per-physical page, but I wasn't sure if
they were separate for the host and the guest.  That's obvious now.

Can we detect non-zero storage key in emulated zPCI requests, and fail
the request somehow?

Paolo



Re: [Qemu-devel] [PATCH v5 08/26] qcow2: Refcount overflow and qcow2_alloc_bytes()

2015-02-04 Thread Kevin Wolf
Am 15.12.2014 um 13:50 hat Max Reitz geschrieben:
> qcow2_alloc_bytes() may reuse a cluster multiple times, in which case
> the refcount is increased accordingly. However, if this would lead to an
> overflow the function should instead just not reuse this cluster and
> allocate a new one.
> 
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 
> Reviewed-by: Stefan Hajnoczi 
> ---
>  block/qcow2-refcount.c | 31 ++-
>  1 file changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index db81647..fd28a13 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -780,9 +780,11 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
>  BDRVQcowState *s = bs->opaque;
>  int64_t offset, cluster_offset, new_cluster;
>  int free_in_cluster, ret;
> +uint64_t refcount;
>  
>  BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC_BYTES);
>  assert(size > 0 && size <= s->cluster_size);
> + redo:
>  if (s->free_byte_offset == 0) {
>  offset = qcow2_alloc_clusters(bs, s->cluster_size);
>  if (offset < 0) {
> @@ -790,12 +792,25 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int 
> size)
>  }
>  s->free_byte_offset = offset;
>  }
> - redo:
> +
>  free_in_cluster = s->cluster_size -
>  offset_into_cluster(s, s->free_byte_offset);
>  if (size <= free_in_cluster) {
>  /* enough space in current cluster */
>  offset = s->free_byte_offset;
> +
> +if (offset_into_cluster(s, offset) != 0) {
> +/* We will have to increase the refcount of this cluster; if the
> + * maximum has been reached already, this cluster cannot be used 
> */
> +ret = qcow2_get_refcount(bs, offset >> s->cluster_bits, 
> &refcount);
> +if (ret < 0) {
> +return ret;
> +} else if (refcount == s->refcount_max) {
> +s->free_byte_offset = 0;
> +goto redo;
> +}
> +}
> +
>  s->free_byte_offset += size;
>  free_in_cluster -= size;
>  if (free_in_cluster == 0)
> @@ -816,6 +831,20 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
>  if ((cluster_offset + s->cluster_size) == new_cluster) {
>  /* we are lucky: contiguous data */
>  offset = s->free_byte_offset;
> +
> +/* Same as above: In order to reuse the cluster, the refcount 
> has to
> + * be increased; if that will not work, we are not so lucky after
> + * all */
> +ret = qcow2_get_refcount(bs, offset >> s->cluster_bits, 
> &refcount);
> +if (ret < 0) {
> +qcow2_free_clusters(bs, new_cluster, s->cluster_size,
> +QCOW2_DISCARD_NEVER);
> +return ret;
> +} else if (refcount == s->refcount_max) {
> +s->free_byte_offset = offset;

I think you mean 0. offset is already the old value.

> +goto redo;
> +}
> +
>  ret = qcow2_update_cluster_refcount(bs, offset >> 
> s->cluster_bits,
>  1, false, 
> QCOW2_DISCARD_NEVER);
>  if (ret < 0) {

I wonder if the code duplication is necessary. I was already thinking
that there was some duplication when I reviewed the previous patch, but
now it seems to become even more obvious that the three parts of this
function are:

1. Allocate a new cluster
2. Allocate space in the already allocated cluster
3. Allocate a new cluster and space inside it, which is just 1. + 2.

Kevin



[Qemu-devel] [PATCH v2 0/3] vfio: free data and unmap BARs in instance_finalize

2015-02-04 Thread Paolo Bonzini
This is v2 of the patch sent yesterday.  In addition to including
the missing vfio bits, I split it in two parts: one introducing
instance_finalize (patch 2), and the second making the freeing of BARs
RCU-friendly (patch 3).

With these two changes I found the error path logic a bit hard
to follow.  So I preceded it with patch 1, which tries to make things
a little bit clearer, at least to me.

VFIO is probably the device that requires the largest changes, due to
the complex, highly data-driven initialization sequence.  No other device
can use so many dynamic data structures, because their configuration is
obviously not as variable as for PCI pass-through.

Paolo

Paolo Bonzini (3):
  vfio: cleanup vfio_get_device error path, remove vfio_populate_device callback
  vfio: free dynamically-allocated data in instance_finalize
  vfio: unmap and free BAR data in instance_finalize

 hw/vfio/common.c  | 36 
 hw/vfio/pci.c | 96 ++-
 include/hw/vfio/vfio-common.h |  1 -
 3 files changed, 92 insertions(+), 41 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [PATCH 2/3] vfio: free dynamically-allocated data in instance_finalize

2015-02-04 Thread Paolo Bonzini
In order to enable out-of-BQL address space lookup, destruction of
devices needs to be split in two phases.

Unrealize is the first phase; once it complete no new accesses will
be started, but there may still be pending memory accesses can still
be completed.

The second part is freeing the device, which only happens once all memory
accesses are complete.  At this point the reference count has dropped to
zero, an RCU grace period must have completed (because the RCU-protected
FlatViews hold a reference to the device via memory_region_ref).  This is
when instance_finalize is called.

Freeing data belongs in an instance_finalize callback, because the
dynamically allocated memory can still be used after unrealize by the
pending memory accesses.

This starts the process by creating an instance_finalize callback and
freeing most of the dynamically-allocated data in instance_finalize.
Because instance_finalize is also called on error paths or also when
the device is actually not realized, the common code needs some changes
to be ready for this.  The error path in vfio_initfn can be simplified too.

Cc: Alex Williamson 
Signed-off-by: Paolo Bonzini 
---
 hw/vfio/common.c |  5 -
 hw/vfio/pci.c| 22 +-
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 242b71d..1bf9de9 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -847,7 +847,7 @@ free_group_exit:
 
 void vfio_put_group(VFIOGroup *group)
 {
-if (!QLIST_EMPTY(&group->device_list)) {
+if (!group || !QLIST_EMPTY(&group->device_list)) {
 return;
 }
 
@@ -902,6 +902,9 @@ int vfio_get_device(VFIOGroup *group, const char *name,
 
 void vfio_put_base_device(VFIODevice *vbasedev)
 {
+if (!vbasedev->group) {
+return;
+}
 QLIST_REMOVE(vbasedev, next);
 trace_vfio_put_base_device(vbasedev->fd);
 close(vbasedev->fd);
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 2f5f718..0e1d229 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3257,7 +3257,7 @@ static int vfio_initfn(PCIDevice *pdev)
 if (ret < (int)MIN(pci_config_size(&vdev->pdev), vdev->config_size)) {
 ret = ret < 0 ? -errno : -EFAULT;
 error_report("vfio: Failed to read device config space");
-goto out_put;
+return ret;
 }
 
 /* vfio emulates a lot for us, but some bits need extra love */
@@ -3289,7 +3289,7 @@ static int vfio_initfn(PCIDevice *pdev)
 
 ret = vfio_early_setup_msix(vdev);
 if (ret) {
-goto out_put;
+return ret;
 }
 
 vfio_map_bars(vdev);
@@ -3328,17 +3328,24 @@ out_teardown:
 pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
 vfio_teardown_msi(vdev);
 vfio_unmap_bars(vdev);
-out_put:
+return ret;
+}
+
+static void vfio_instance_finalize(Object *obj)
+{
+PCIDevice *pci_dev = PCI_DEVICE(obj);
+VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pci_dev);
+VFIOGroup *group = vdev->vbasedev.group;
+
 g_free(vdev->emulated_config_bits);
+g_free(vdev->rom);
 vfio_put_device(vdev);
 vfio_put_group(group);
-return ret;
 }
 
 static void vfio_exitfn(PCIDevice *pdev)
 {
 VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
-VFIOGroup *group = vdev->vbasedev.group;
 
 vfio_unregister_err_notifier(vdev);
 pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
@@ -3348,10 +3355,6 @@ static void vfio_exitfn(PCIDevice *pdev)
 }
 vfio_teardown_msi(vdev);
 vfio_unmap_bars(vdev);
-g_free(vdev->emulated_config_bits);
-g_free(vdev->rom);
-vfio_put_device(vdev);
-vfio_put_group(group);
 }
 
 static void vfio_pci_reset(DeviceState *dev)
@@ -3439,6 +3442,7 @@ static const TypeInfo vfio_pci_dev_info = {
 .instance_size = sizeof(VFIOPCIDevice),
 .class_init = vfio_pci_dev_class_init,
 .instance_init = vfio_instance_init,
+.instance_finalize = vfio_instance_finalize,
 };
 
 static void register_vfio_pci_dev_type(void)
-- 
1.8.3.1





[Qemu-devel] [PATCH 3/3] vfio: unmap and free BAR data in instance_finalize

2015-02-04 Thread Paolo Bonzini
In the case of VFIO, the unrealize callback is too early to munmap the
BARs.  The munmap must be delayed until memory accesses are complete.
To do this, split vfio_unmap_bars in two.  The removal step, now called
vfio_unregister_bars, remains in vfio_exitfn.  The reclamation step
is vfio_unmap_bars and is moved to the instance_finalize callback.

Similarly, quirk MemoryRegions have to be removed during
vfio_unregister_bars, but freeing the data structure must be delayed
to vfio_unmap_bars.

Cc: Alex Williamson 
Signed-off-by: Paolo Bonzini 
---
 hw/vfio/pci.c | 65 +++
 1 file changed, 57 insertions(+), 8 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 0e1d229..6eb07ed 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1996,12 +1996,23 @@ static void vfio_vga_quirk_setup(VFIOPCIDevice *vdev)
 
 static void vfio_vga_quirk_teardown(VFIOPCIDevice *vdev)
 {
+VFIOQuirk *quirk;
+int i;
+
+for (i = 0; i < ARRAY_SIZE(vdev->vga.region); i++) {
+QLIST_FOREACH(quirk, &vdev->vga.region[i].quirks, next) {
+memory_region_del_subregion(&vdev->vga.region[i].mem, &quirk->mem);
+}
+}
+}
+
+static void vfio_vga_quirk_free(VFIOPCIDevice *vdev)
+{
 int i;
 
 for (i = 0; i < ARRAY_SIZE(vdev->vga.region); i++) {
 while (!QLIST_EMPTY(&vdev->vga.region[i].quirks)) {
 VFIOQuirk *quirk = QLIST_FIRST(&vdev->vga.region[i].quirks);
-memory_region_del_subregion(&vdev->vga.region[i].mem, &quirk->mem);
 object_unparent(OBJECT(&quirk->mem));
 QLIST_REMOVE(quirk, next);
 g_free(quirk);
@@ -2022,10 +2033,19 @@ static void vfio_bar_quirk_setup(VFIOPCIDevice *vdev, 
int nr)
 static void vfio_bar_quirk_teardown(VFIOPCIDevice *vdev, int nr)
 {
 VFIOBAR *bar = &vdev->bars[nr];
+VFIOQuirk *quirk;
+
+QLIST_FOREACH(quirk, &bar->quirks, next) {
+memory_region_del_subregion(&bar->region.mem, &quirk->mem);
+}
+}
+
+static void vfio_bar_quirk_free(VFIOPCIDevice *vdev, int nr)
+{
+VFIOBAR *bar = &vdev->bars[nr];
 
 while (!QLIST_EMPTY(&bar->quirks)) {
 VFIOQuirk *quirk = QLIST_FIRST(&bar->quirks);
-memory_region_del_subregion(&bar->region.mem, &quirk->mem);
 object_unparent(OBJECT(&quirk->mem));
 QLIST_REMOVE(quirk, next);
 g_free(quirk);
@@ -2281,7 +2301,7 @@ static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, 
bool enabled)
 }
 }
 
-static void vfio_unmap_bar(VFIOPCIDevice *vdev, int nr)
+static void vfio_unregister_bar(VFIOPCIDevice *vdev, int nr)
 {
 VFIOBAR *bar = &vdev->bars[nr];
 
@@ -2292,10 +2312,25 @@ static void vfio_unmap_bar(VFIOPCIDevice *vdev, int nr)
 vfio_bar_quirk_teardown(vdev, nr);
 
 memory_region_del_subregion(&bar->region.mem, &bar->region.mmap_mem);
-munmap(bar->region.mmap, memory_region_size(&bar->region.mmap_mem));
 
 if (vdev->msix && vdev->msix->table_bar == nr) {
 memory_region_del_subregion(&bar->region.mem, &vdev->msix->mmap_mem);
+}
+}
+
+static void vfio_unmap_bar(VFIOPCIDevice *vdev, int nr)
+{
+VFIOBAR *bar = &vdev->bars[nr];
+
+if (!bar->region.size) {
+return;
+}
+
+vfio_bar_quirk_free(vdev, nr);
+
+munmap(bar->region.mmap, memory_region_size(&bar->region.mmap_mem));
+
+if (vdev->msix && vdev->msix->table_bar == nr) {
 munmap(vdev->msix->mmap, memory_region_size(&vdev->msix->mmap_mem));
 }
 }
@@ -2403,12 +2438,12 @@ static void vfio_map_bars(VFIOPCIDevice *vdev)
 }
 }
 
-static void vfio_unmap_bars(VFIOPCIDevice *vdev)
+static void vfio_unregister_bars(VFIOPCIDevice *vdev)
 {
 int i;
 
 for (i = 0; i < PCI_ROM_SLOT; i++) {
-vfio_unmap_bar(vdev, i);
+vfio_unregister_bar(vdev, i);
 }
 
 if (vdev->has_vga) {
@@ -2417,6 +2452,19 @@ static void vfio_unmap_bars(VFIOPCIDevice *vdev)
 }
 }
 
+static void vfio_unmap_bars(VFIOPCIDevice *vdev)
+{
+int i;
+
+for (i = 0; i < PCI_ROM_SLOT; i++) {
+vfio_unmap_bar(vdev, i);
+}
+
+if (vdev->has_vga) {
+vfio_vga_quirk_free(vdev);
+}
+}
+
 /*
  * General setup
  */
@@ -3327,7 +3375,7 @@ static int vfio_initfn(PCIDevice *pdev)
 out_teardown:
 pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
 vfio_teardown_msi(vdev);
-vfio_unmap_bars(vdev);
+vfio_unregister_bars(vdev);
 return ret;
 }
 
@@ -3337,6 +3385,7 @@ static void vfio_instance_finalize(Object *obj)
 VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pci_dev);
 VFIOGroup *group = vdev->vbasedev.group;
 
+vfio_unmap_bars(vdev);
 g_free(vdev->emulated_config_bits);
 g_free(vdev->rom);
 vfio_put_device(vdev);
@@ -3354,7 +3403,7 @@ static void vfio_exitfn(PCIDevice *pdev)
 timer_free(vdev->intx.mmap_timer);
 }
 vfio_teardown_msi(vdev);
-vfio_unmap_bars(vdev);
+vfio_unregister_bars(vdev);
 }
 
 static void vfio_pci_reset(DeviceState *dev)
-- 
1.8.

[Qemu-devel] [PATCH 1/3] vfio: cleanup vfio_get_device error path, remove vfio_populate_device callback

2015-02-04 Thread Paolo Bonzini
With the next patch vfio_put_base_device will be called unconditionally at
instance_finalize time, which will mean calling it twice if vfio_populate_device
fails.  This works, but it is slightly harder to follow.

Change vfio_get_device to not touch the vbasedev struct until it will
definitely succeed, moving the vfio_populate_device call back to vfio-pci.
This way, vfio_put_base_device will only be called once and only on
non-error paths.

Cc: Alex Williamson 
Signed-off-by: Paolo Bonzini 
---
 hw/vfio/common.c  | 31 ---
 hw/vfio/pci.c | 11 +++
 include/hw/vfio/vfio-common.h |  1 -
 3 files changed, 19 insertions(+), 24 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index cf483ff..242b71d 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -867,27 +867,28 @@ int vfio_get_device(VFIOGroup *group, const char *name,
VFIODevice *vbasedev)
 {
 struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) };
-int ret;
+int ret, fd;
 
-ret = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
-if (ret < 0) {
+fd = ioctl(group->fd, VFIO_GROUP_GET_DEVICE_FD, name);
+if (fd < 0) {
 error_report("vfio: error getting device %s from group %d: %m",
  name, group->groupid);
 error_printf("Verify all devices in group %d are bound to vfio- "
  "or pci-stub and not already in use\n", group->groupid);
-return ret;
+return fd;
 }
 
-vbasedev->fd = ret;
-vbasedev->group = group;
-QLIST_INSERT_HEAD(&group->device_list, vbasedev, next);
-
-ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_INFO, &dev_info);
+ret = ioctl(fd, VFIO_DEVICE_GET_INFO, &dev_info);
 if (ret) {
 error_report("vfio: error getting device info: %m");
-goto error;
+close(fd);
+return ret;
 }
 
+vbasedev->fd = fd;
+vbasedev->group = group;
+QLIST_INSERT_HEAD(&group->device_list, vbasedev, next);
+
 vbasedev->num_irqs = dev_info.num_irqs;
 vbasedev->num_regions = dev_info.num_regions;
 vbasedev->flags = dev_info.flags;
@@ -896,20 +897,12 @@ int vfio_get_device(VFIOGroup *group, const char *name,
   dev_info.num_irqs);
 
 vbasedev->reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
-
-ret = vbasedev->ops->vfio_populate_device(vbasedev);
-
-error:
-if (ret) {
-vfio_put_base_device(vbasedev);
-}
-return ret;
+return 0;
 }
 
 void vfio_put_base_device(VFIODevice *vbasedev)
 {
 QLIST_REMOVE(vbasedev, next);
-vbasedev->group = NULL;
 trace_vfio_put_base_device(vbasedev->fd);
 close(vbasedev->fd);
 }
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 014a92c..2f5f718 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -195,7 +195,6 @@ static uint32_t vfio_pci_read_config(PCIDevice *pdev, 
uint32_t addr, int len);
 static void vfio_pci_write_config(PCIDevice *pdev, uint32_t addr,
   uint32_t val, int len);
 static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
-static int vfio_populate_device(VFIODevice *vbasedev);
 
 /*
  * Disabling BAR mmaping can be slow, but toggling it around INTx can
@@ -2934,12 +2933,11 @@ static VFIODeviceOps vfio_pci_ops = {
 .vfio_compute_needs_reset = vfio_pci_compute_needs_reset,
 .vfio_hot_reset_multi = vfio_pci_hot_reset_multi,
 .vfio_eoi = vfio_eoi,
-.vfio_populate_device = vfio_populate_device,
 };
 
-static int vfio_populate_device(VFIODevice *vbasedev)
+static int vfio_pci_populate_device(VFIOPCIDevice *vdev)
 {
-VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+VFIODevice *vbasedev = &vdev->vbasedev;
 struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
 struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) };
 int i, ret = -1;
@@ -3247,6 +3245,11 @@ static int vfio_initfn(PCIDevice *pdev)
 return ret;
 }
 
+ret = vfio_pci_populate_device(vdev);
+if (ret) {
+return ret;
+}
+
 /* Get a copy of config space */
 ret = pread(vdev->vbasedev.fd, vdev->pdev.config,
 MIN(pci_config_size(&vdev->pdev), vdev->config_size),
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 1d5bfe8..5f3679b 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -112,7 +112,6 @@ struct VFIODeviceOps {
 void (*vfio_compute_needs_reset)(VFIODevice *vdev);
 int (*vfio_hot_reset_multi)(VFIODevice *vdev);
 void (*vfio_eoi)(VFIODevice *vdev);
-int (*vfio_populate_device)(VFIODevice *vdev);
 };
 
 typedef struct VFIOGroup {
-- 
1.8.3.1





Re: [Qemu-devel] [PATCH RFC 1/1] KVM: s390: Add MEMOP ioctl for reading/writing guest memory

2015-02-04 Thread Christian Borntraeger
Am 04.02.2015 um 12:42 schrieb Paolo Bonzini:
> 
> 
> On 04/02/2015 12:25, Christian Borntraeger wrote:
>> Whenever vhost or qemu or a finished aio request wrote content into a
>> virtio buffer, the HW has set the storage key for that physical page,
>> which  makes it automatically dirty/referenced in the guest visible
>> storage key. 
> 
> Ah, I knew the storage keys were per-physical page, but I wasn't sure if
> they were separate for the host and the guest.  That's obvious now.

Just something on top:
the storage key is per physical page. Just once. It contains C/R/ACC/F
(change, reference, access key, fetch protection)

But: there is also the pgste page table extension. That is used to  separate
both by doing logically ORs. The host and millicode will do the right shifting
copying  to keep both values separate, but when the physical storage key gets
dirty, the host and the guest view is now "changed==yes"

> 
> Can we detect non-zero storage key in emulated zPCI requests, and fail
> the request somehow?

Not right now. Even the kernel KVM module does not do this for emulated
instructions (as Linux has always key 0). Somewhen we might want to add
that capability, but its obviously not trivial for I/O like things. It
would get easier if we avoid VFIO etc and just had used the hardware support,
though. but as far as I can see this is not an option in QEMU.



Christian




[Qemu-devel] [PATCH 4/7] usb: Suppress bogus error when automatic usb-hub creation fails

2015-02-04 Thread Markus Armbruster
USBDevice's realize method usb_qdev_realize() automatically creates a
usb-hub when only one port is left.  Creating devices in realize
methods is questionable, but works.

If usb-hub creation fails, an error is reported to stderr, but the
failure is otherwise ignored.  We then create the actual device using
the last port, which may well succeed.

Example:

$ qemu -nodefaults -S -display none -machine usb=on -monitor stdio
QEMU 2.2.50 monitor - type 'help' for more information
(qemu) device_add usb-mouse
[Repeat 36 times]
(qemu) info usb
  Device 0.0, Port 1, Speed 12 Mb/s, Product QEMU USB Mouse
  Device 0.0, Port 2, Speed 12 Mb/s, Product QEMU USB Hub
  Device 0.0, Port 2.1, Speed 12 Mb/s, Product QEMU USB Mouse
[More mice and hubs omitted...]
  Device 0.0, Port 2.8.8.8.8.7, Speed 12 Mb/s, Product QEMU USB Mouse
(qemu) device_add usb-mouse
usb hub chain too deep
Failed to initialize USB device 'usb-hub'
(qemu) info usb
[...]
  Device 0.0, Port 2.8.8.8.8.7, Speed 12 Mb/s, Product QEMU USB Mouse
  Device 0.0, Port 2.8.8.8.8.8, Speed 12 Mb/s, Product QEMU USB Mouse

Despite the "Failed" message, the command actually succeeded.

In QMP, it's worse.  When adding the 37th mouse via QMP, the command
fails with

{"error": {"class": "GenericError", "desc": "usb hub chain too deep"}}

Additionally, "Failed to initialize USB device 'usb-hub'" is reported
on stderr.  Despite the command failure, the device was created.  This
is wrong.

Fix by avoiding qdev_init() for usb-hub creation, so we can ignore
errors cleanly.

Signed-off-by: Markus Armbruster 
---
 hw/usb/bus.c | 35 ---
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/hw/usb/bus.c b/hw/usb/bus.c
index 3e85afe..5abfac0 100644
--- a/hw/usb/bus.c
+++ b/hw/usb/bus.c
@@ -315,15 +315,36 @@ USBDevice *usb_create(USBBus *bus, const char *name)
 return USB_DEVICE(dev);
 }
 
+static USBDevice *usb_try_create_simple(USBBus *bus, const char *name,
+Error **errp)
+{
+Error *err = NULL;
+USBDevice *dev;
+
+dev = USB_DEVICE(qdev_try_create(&bus->qbus, name));
+if (!dev) {
+error_setg(errp, "Failed to create USB device '%s'", name);
+return NULL;
+}
+object_property_set_bool(OBJECT(dev), true, "realized", &err);
+if (err) {
+error_setg(errp, "Failed to initialize USB device '%s': %s",
+   name, error_get_pretty(err));
+error_free(err);
+object_unparent(OBJECT(dev));
+return NULL;
+}
+return dev;
+}
+
 USBDevice *usb_create_simple(USBBus *bus, const char *name)
 {
-USBDevice *dev = usb_create(bus, name);
-int rc;
+Error *err = NULL;
+USBDevice *dev = usb_try_create_simple(bus, name, &err);
 
-rc = qdev_init(&dev->qdev);
-if (rc < 0) {
-error_report("Failed to initialize USB device '%s'", name);
-return NULL;
+if (!dev) {
+error_report("%s", error_get_pretty(err));
+error_free(err);
 }
 return dev;
 }
@@ -419,7 +440,7 @@ void usb_claim_port(USBDevice *dev, Error **errp)
 } else {
 if (bus->nfree == 1 && strcmp(object_get_typename(OBJECT(dev)), 
"usb-hub") != 0) {
 /* Create a new hub and chain it on */
-usb_create_simple(bus, "usb-hub");
+usb_try_create_simple(bus, "usb-hub", NULL);
 }
 if (bus->nfree == 0) {
 error_setg(errp, "tried to attach usb device %s to a bus "
-- 
1.9.3




[Qemu-devel] [PATCH 0/7] usb: Improvements around device realization

2015-02-04 Thread Markus Armbruster
Markus Armbruster (7):
  usb: usb_create() can't fail, drop useless error handling
  usb: Improve -usbdevice error reporting a bit
  usb: Do not prefix error_setg() messages with "Error: "
  usb: Suppress bogus error when automatic usb-hub creation fails
  usb: Change usb_create_simple() to abort on failure
  r2d: Don't use legacy -usbdevice support for setting up board
  PPC: Don't use legacy -usbdevice support for setting up board

 hw/ppc/mac_newworld.c  |  7 --
 hw/ppc/spapr.c |  7 --
 hw/sh4/r2d.c   |  2 +-
 hw/usb/bus.c   | 64 +++---
 hw/usb/dev-bluetooth.c | 11 +
 hw/usb/dev-network.c   |  4 
 hw/usb/dev-serial.c|  7 --
 hw/usb/dev-storage.c   |  6 -
 hw/usb/host-legacy.c   |  1 -
 9 files changed, 57 insertions(+), 52 deletions(-)

-- 
1.9.3




[Qemu-devel] [PATCH 2/7] usb: Improve -usbdevice error reporting a bit

2015-02-04 Thread Markus Armbruster
Most LegacyUSBFactory usbdevice_init() methods realize with
qdev_init_nofail(), even though their caller usbdevice_create() can
handle failure.  Okay if it really can't fail (I didn't check), but
somewhat brittle.

usb_msd_init() and usb_bt_init() call qdev_init().  The latter
additionally reports an error when qdev_init() fails.

Realization failure produces multiple error reports: a specific one
from qdev_init(), and generic ones from usb_bt_init(),
usb_create_simple(), usbdevice_create() and usb_parse().

Remove realization from the usbdevice_init() methods.  Realize in
usbdevice_create(), and produce exactly one error message there.  You
still get another one from usb_parse().

Signed-off-by: Markus Armbruster 
---
 hw/usb/bus.c   | 22 +++---
 hw/usb/dev-bluetooth.c |  5 -
 hw/usb/dev-network.c   |  1 -
 hw/usb/dev-serial.c|  4 
 hw/usb/dev-storage.c   |  3 ---
 hw/usb/host-legacy.c   |  1 -
 6 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/hw/usb/bus.c b/hw/usb/bus.c
index eeb6872..3f69fe1 100644
--- a/hw/usb/bus.c
+++ b/hw/usb/bus.c
@@ -651,10 +651,12 @@ USBDevice *usbdevice_create(const char *cmdline)
 {
 USBBus *bus = usb_bus_find(-1 /* any */);
 LegacyUSBFactory *f = NULL;
+Error *err = NULL;
 GSList *i;
 char driver[32];
 const char *params;
 int len;
+USBDevice *dev;
 
 params = strchr(cmdline,':');
 if (params) {
@@ -689,14 +691,28 @@ USBDevice *usbdevice_create(const char *cmdline)
 return NULL;
 }
 
-if (!f->usbdevice_init) {
+if (f->usbdevice_init) {
+dev = f->usbdevice_init(bus, params);
+} else {
 if (*params) {
 error_report("usbdevice %s accepts no params", driver);
 return NULL;
 }
-return usb_create_simple(bus, f->name);
+dev = usb_create(bus, f->name);
 }
-return f->usbdevice_init(bus, params);
+if (!dev) {
+error_report("Failed to create USB device '%s'", f->name);
+return NULL;
+}
+object_property_set_bool(OBJECT(dev), true, "realized", &err);
+if (err) {
+error_report("Failed to initialize USB device '%s': %s",
+ f->name, error_get_pretty(err));
+error_free(err);
+object_unparent(OBJECT(dev));
+return NULL;
+}
+return dev;
 }
 
 static void usb_device_class_init(ObjectClass *klass, void *data)
diff --git a/hw/usb/dev-bluetooth.c b/hw/usb/dev-bluetooth.c
index 8b3b316..9bf6730 100644
--- a/hw/usb/dev-bluetooth.c
+++ b/hw/usb/dev-bluetooth.c
@@ -536,11 +536,6 @@ static USBDevice *usb_bt_init(USBBus *bus, const char 
*cmdline)
 dev = usb_create(bus, name);
 s = DO_UPCAST(struct USBBtState, dev, dev);
 s->hci = hci;
-if (qdev_init(&dev->qdev) < 0) {
-error_report("Failed to initialize USB device '%s'", name);
-return NULL;
-}
-
 return dev;
 }
 
diff --git a/hw/usb/dev-network.c b/hw/usb/dev-network.c
index 620fc11..7131abd 100644
--- a/hw/usb/dev-network.c
+++ b/hw/usb/dev-network.c
@@ -1406,7 +1406,6 @@ static USBDevice *usb_net_init(USBBus *bus, const char 
*cmdline)
 
 dev = usb_create(bus, "usb-net");
 qdev_set_nic_properties(&dev->qdev, &nd_table[idx]);
-qdev_init_nofail(&dev->qdev);
 return dev;
 }
 
diff --git a/hw/usb/dev-serial.c b/hw/usb/dev-serial.c
index f347c05..67c2072 100644
--- a/hw/usb/dev-serial.c
+++ b/hw/usb/dev-serial.c
@@ -549,8 +549,6 @@ static USBDevice *usb_serial_init(USBBus *bus, const char 
*filename)
 qdev_prop_set_uint16(&dev->qdev, "vendorid", vendorid);
 if (productid)
 qdev_prop_set_uint16(&dev->qdev, "productid", productid);
-qdev_init_nofail(&dev->qdev);
-
 return dev;
 }
 
@@ -565,8 +563,6 @@ static USBDevice *usb_braille_init(USBBus *bus, const char 
*unused)
 
 dev = usb_create(bus, "usb-braille");
 qdev_prop_set_chr(&dev->qdev, "chardev", cdrv);
-qdev_init_nofail(&dev->qdev);
-
 return dev;
 }
 
diff --git a/hw/usb/dev-storage.c b/hw/usb/dev-storage.c
index 5f22275..af2e1b9 100644
--- a/hw/usb/dev-storage.c
+++ b/hw/usb/dev-storage.c
@@ -711,9 +711,6 @@ static USBDevice *usb_msd_init(USBBus *bus, const char 
*filename)
 object_unparent(OBJECT(dev));
 return NULL;
 }
-if (qdev_init(&dev->qdev) < 0)
-return NULL;
-
 return dev;
 }
 
diff --git a/hw/usb/host-legacy.c b/hw/usb/host-legacy.c
index 3cc9c42..422ed9a 100644
--- a/hw/usb/host-legacy.c
+++ b/hw/usb/host-legacy.c
@@ -128,7 +128,6 @@ USBDevice *usb_host_device_open(USBBus *bus, const char 
*devname)
 qdev_prop_set_uint32(&dev->qdev, "hostaddr",  filter.addr);
 qdev_prop_set_uint32(&dev->qdev, "vendorid",  filter.vendor_id);
 qdev_prop_set_uint32(&dev->qdev, "productid", filter.product_id);
-qdev_init_nofail(&dev->qdev);
 return dev;
 
 fail:
-- 
1.9.3




[Qemu-devel] [PATCH 5/7] usb: Change usb_create_simple() to abort on failure

2015-02-04 Thread Markus Armbruster
Instead of returning null pointer.  Matches pci_create_simple(),
isa_create_simple(), sysbus_create_simple().  It's unused since the
previous commit, but I'll put it to use again shortly.

Signed-off-by: Markus Armbruster 
---
 hw/usb/bus.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/hw/usb/bus.c b/hw/usb/bus.c
index 5abfac0..d83a938 100644
--- a/hw/usb/bus.c
+++ b/hw/usb/bus.c
@@ -339,14 +339,7 @@ static USBDevice *usb_try_create_simple(USBBus *bus, const 
char *name,
 
 USBDevice *usb_create_simple(USBBus *bus, const char *name)
 {
-Error *err = NULL;
-USBDevice *dev = usb_try_create_simple(bus, name, &err);
-
-if (!dev) {
-error_report("%s", error_get_pretty(err));
-error_free(err);
-}
-return dev;
+return usb_try_create_simple(bus, name, &error_abort);
 }
 
 static void usb_fill_port(USBPort *port, void *opaque, int index,
-- 
1.9.3




[Qemu-devel] [PATCH 6/7] r2d: Don't use legacy -usbdevice support for setting up board

2015-02-04 Thread Markus Armbruster
It's tempting, because usbdevice_create() is so simple to use.  But
there's a lot of unwanted complexity behind the simple interface.
Switch to usb_create_simple().

Cc: Magnus Damm 
Signed-off-by: Markus Armbruster 
---
 hw/sh4/r2d.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/sh4/r2d.c b/hw/sh4/r2d.c
index 12f44d2..d1d0847 100644
--- a/hw/sh4/r2d.c
+++ b/hw/sh4/r2d.c
@@ -301,7 +301,7 @@ static void r2d_init(MachineState *machine)
 "rtl8139", i==0 ? "2" : NULL);
 
 /* USB keyboard */
-usbdevice_create("keyboard");
+usb_create_simple(usb_bus_find(-1), "usb-kbd");
 
 /* Todo: register on board registers */
 memset(&boot_params, 0, sizeof(boot_params));
-- 
1.9.3




[Qemu-devel] [PATCH 1/7] usb: usb_create() can't fail, drop useless error handling

2015-02-04 Thread Markus Armbruster
Signed-off-by: Markus Armbruster 
---
 hw/usb/bus.c   | 4 
 hw/usb/dev-bluetooth.c | 6 +-
 hw/usb/dev-network.c   | 3 ---
 hw/usb/dev-serial.c| 3 ---
 hw/usb/dev-storage.c   | 3 ---
 5 files changed, 1 insertion(+), 18 deletions(-)

diff --git a/hw/usb/bus.c b/hw/usb/bus.c
index 986b2d8..eeb6872 100644
--- a/hw/usb/bus.c
+++ b/hw/usb/bus.c
@@ -320,10 +320,6 @@ USBDevice *usb_create_simple(USBBus *bus, const char *name)
 USBDevice *dev = usb_create(bus, name);
 int rc;
 
-if (!dev) {
-error_report("Failed to create USB device '%s'", name);
-return NULL;
-}
 rc = qdev_init(&dev->qdev);
 if (rc < 0) {
 error_report("Failed to initialize USB device '%s'", name);
diff --git a/hw/usb/dev-bluetooth.c b/hw/usb/dev-bluetooth.c
index 390d475..8b3b316 100644
--- a/hw/usb/dev-bluetooth.c
+++ b/hw/usb/dev-bluetooth.c
@@ -530,14 +530,10 @@ static USBDevice *usb_bt_init(USBBus *bus, const char 
*cmdline)
 } else {
 hci = bt_new_hci(qemu_find_bt_vlan(0));
 }
-
 if (!hci)
 return NULL;
+
 dev = usb_create(bus, name);
-if (!dev) {
-error_report("Failed to create USB device '%s'", name);
-return NULL;
-}
 s = DO_UPCAST(struct USBBtState, dev, dev);
 s->hci = hci;
 if (qdev_init(&dev->qdev) < 0) {
diff --git a/hw/usb/dev-network.c b/hw/usb/dev-network.c
index 5b95d5c..620fc11 100644
--- a/hw/usb/dev-network.c
+++ b/hw/usb/dev-network.c
@@ -1405,9 +1405,6 @@ static USBDevice *usb_net_init(USBBus *bus, const char 
*cmdline)
 }
 
 dev = usb_create(bus, "usb-net");
-if (!dev) {
-return NULL;
-}
 qdev_set_nic_properties(&dev->qdev, &nd_table[idx]);
 qdev_init_nofail(&dev->qdev);
 return dev;
diff --git a/hw/usb/dev-serial.c b/hw/usb/dev-serial.c
index 1cee450..f347c05 100644
--- a/hw/usb/dev-serial.c
+++ b/hw/usb/dev-serial.c
@@ -544,9 +544,6 @@ static USBDevice *usb_serial_init(USBBus *bus, const char 
*filename)
 return NULL;
 
 dev = usb_create(bus, "usb-serial");
-if (!dev) {
-return NULL;
-}
 qdev_prop_set_chr(&dev->qdev, "chardev", cdrv);
 if (vendorid)
 qdev_prop_set_uint16(&dev->qdev, "vendorid", vendorid);
diff --git a/hw/usb/dev-storage.c b/hw/usb/dev-storage.c
index 4539733..5f22275 100644
--- a/hw/usb/dev-storage.c
+++ b/hw/usb/dev-storage.c
@@ -706,9 +706,6 @@ static USBDevice *usb_msd_init(USBBus *bus, const char 
*filename)
 
 /* create guest device */
 dev = usb_create(bus, "usb-storage");
-if (!dev) {
-return NULL;
-}
 if (qdev_prop_set_drive(&dev->qdev, "drive",
 blk_by_legacy_dinfo(dinfo)) < 0) {
 object_unparent(OBJECT(dev));
-- 
1.9.3




[Qemu-devel] [PATCH 7/7] PPC: Don't use legacy -usbdevice support for setting up board

2015-02-04 Thread Markus Armbruster
It's tempting, because usbdevice_create() is so simple to use.  But
there's a lot of unwanted complexity behind the simple interface.
Switch to usb_create_simple().

Cc: Alexander Graf 
Cc: qemu-...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/ppc/mac_newworld.c | 7 +--
 hw/ppc/spapr.c| 7 +--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index c377012..624b4ab 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -420,11 +420,14 @@ static void ppc_core99_init(MachineState *machine)
 
 if (machine->usb) {
 pci_create_simple(pci_bus, -1, "pci-ohci");
+
 /* U3 needs to use USB for input because Linux doesn't support via-cuda
 on PPC64 */
 if (machine_arch == ARCH_MAC99_U3) {
-usbdevice_create("keyboard");
-usbdevice_create("mouse");
+USBBus *usb_bus = usb_bus_find(-1);
+
+usb_create_simple(usb_bus, "usb-kbd");
+usb_create_simple(usb_bus, "usb-mouse");
 }
 }
 
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b560459..5112373 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1533,9 +1533,12 @@ static void ppc_spapr_init(MachineState *machine)
 
 if (machine->usb) {
 pci_create_simple(phb->bus, -1, "pci-ohci");
+
 if (spapr->has_graphics) {
-usbdevice_create("keyboard");
-usbdevice_create("mouse");
+USBBus *usb_bus = usb_bus_find(-1);
+
+usb_create_simple(usb_bus, "usb-kbd");
+usb_create_simple(usb_bus, "usb-mouse");
 }
 }
 
-- 
1.9.3




[Qemu-devel] [PATCH 3/7] usb: Do not prefix error_setg() messages with "Error: "

2015-02-04 Thread Markus Armbruster
Because it produces beauties like

(qemu) usb_add mouse
Failed to initialize USB device 'usb-mouse': Error: tried to attach usb 
device QEMU USB Mouse to a bus with no free ports

Signed-off-by: Markus Armbruster 
---
 hw/usb/bus.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/usb/bus.c b/hw/usb/bus.c
index 3f69fe1..3e85afe 100644
--- a/hw/usb/bus.c
+++ b/hw/usb/bus.c
@@ -412,7 +412,7 @@ void usb_claim_port(USBDevice *dev, Error **errp)
 }
 }
 if (port == NULL) {
-error_setg(errp, "Error: usb port %s (bus %s) not found (in use?)",
+error_setg(errp, "usb port %s (bus %s) not found (in use?)",
dev->port_path, bus->qbus.name);
 return;
 }
@@ -422,7 +422,7 @@ void usb_claim_port(USBDevice *dev, Error **errp)
 usb_create_simple(bus, "usb-hub");
 }
 if (bus->nfree == 0) {
-error_setg(errp, "Error: tried to attach usb device %s to a bus "
+error_setg(errp, "tried to attach usb device %s to a bus "
"with no free ports", dev->product_desc);
 return;
 }
-- 
1.9.3




Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Paolo Bonzini


On 04/02/2015 12:32, Daniel P. Berrange wrote:
> So my idea would be that we define a QEMUChannel object and set of APIs to
> standardize all interaction with sockets, pipes, RDMA, whatever $channel,
> and then convert the QEMU features I've mentioned over to use that. I think
> that would be simpler than trying to untangle QEMUFile code from migration
> and then extend its features.

Could it be GIOChannel simply?

1) Chardev is already mostly a wrapper around GIOChannel

2) NBD and VNC could be converted to GIOChannel with relative ease

3) migration is more complicated because (unlike everything else) it
uses a separate thread and blocking sockets, but you could probably
write a GIOChannel-based implementation of QEMUFile.

I found a GIOChannel wrapper for gnutls at
https://github.com/aldebaran/connman/blob/master/gweb/giognutls.c.  It's
not the right license for QEMU (GPLv2-only) but it's only 400 lines of
code.  If necessary I can help with clean-room reverse engineering.

Paolo



Re: [Qemu-devel] [PATCH 4/9] rcu: introduce RCU-enabled QLIST

2015-02-04 Thread Paolo Bonzini


On 04/02/2015 04:42, Fam Zheng wrote:
> On Tue, 02/03 13:52, Paolo Bonzini wrote:
>> From: Mike Day 
>>
>> Add RCU-enabled variants on the existing bsd DQ facility. Each
>> operation has the same interface as the existing (non-RCU)
>> version. Also, each operation is implemented as macro.
>>
>> Using the RCU-enabled QLIST, existing QLIST users will be able to
>> convert to RCU without using a different list interface.
>>
>> Signed-off-by: Mike Day 
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  hw/9pfs/virtio-9p-synth.c |   2 +-
>>  include/qemu/queue.h  |  11 --
>>  include/qemu/rcu_queue.h  | 134 
>>  tests/Makefile|   5 +-
>>  tests/test-rcu-list.c | 306 
>> ++
>>  5 files changed, 445 insertions(+), 13 deletions(-)
>>  create mode 100644 include/qemu/rcu_queue.h
>>  create mode 100644 tests/test-rcu-list.c
>>
>> diff --git a/hw/9pfs/virtio-9p-synth.c b/hw/9pfs/virtio-9p-synth.c
>> index e75aa87..a0ab9a8 100644
>> --- a/hw/9pfs/virtio-9p-synth.c
>> +++ b/hw/9pfs/virtio-9p-synth.c
>> @@ -18,7 +18,7 @@
>>  #include "fsdev/qemu-fsdev.h"
>>  #include "virtio-9p-synth.h"
>>  #include "qemu/rcu.h"
>> -
>> +#include "qemu/rcu_queue.h"
>>  #include 
>>  
>>  /* Root node for synth file system */
>> diff --git a/include/qemu/queue.h b/include/qemu/queue.h
>> index c602797..8094150 100644
>> --- a/include/qemu/queue.h
>> +++ b/include/qemu/queue.h
>> @@ -139,17 +139,6 @@ struct {
>> \
>>  (elm)->field.le_prev = &(head)->lh_first;   \
>>  } while (/*CONSTCOND*/0)
>>  
>> -#define QLIST_INSERT_HEAD_RCU(head, elm, field) do {\
>> -(elm)->field.le_prev = &(head)->lh_first;   \
>> -(elm)->field.le_next = (head)->lh_first;\
>> -smp_wmb(); /* fill elm before linking it */ \
>> -if ((head)->lh_first != NULL)  {\
>> -(head)->lh_first->field.le_prev = &(elm)->field.le_next;\
>> -}   \
>> -(head)->lh_first = (elm);   \
>> -smp_wmb();  \
>> -} while (/* CONSTCOND*/0)
>> -
>>  #define QLIST_REMOVE(elm, field) do {   \
>>  if ((elm)->field.le_next != NULL)   \
>>  (elm)->field.le_next->field.le_prev =   \
>> diff --git a/include/qemu/rcu_queue.h b/include/qemu/rcu_queue.h
>> new file mode 100644
>> index 000..3aca7a5
>> --- /dev/null
>> +++ b/include/qemu/rcu_queue.h
>> @@ -0,0 +1,134 @@
>> +#ifndef QEMU_RCU_QUEUE_H
>> +#define QEMU_RCU_QUEUE_H
>> +
>> +/*
>> + * rcu_queue.h
>> + *
>> + * RCU-friendly versions of the queue.h primitives.
>> + *
>> + * This library is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation; either
>> + * version 2.1 of the License, or (at your option) any later version.
>> + *
>> + * This library is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with this library; if not, write to the Free Software
>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
>> USA
>> + *
>> + * Copyright (c) 2013 Mike D. Day, IBM Corporation.
>> + *
>> + * IBM's contributions to this file may be relicensed under LGPLv2 or later.
>> + */
>> +
>> +#include "qemu/queue.h"
>> +#include "qemu/atomic.h"
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +
>> +/*
>> + * List access methods.
>> + */
>> +#define QLIST_EMPTY_RCU(head) (atomic_rcu_read(&(head)->lh_first) == NULL)
>> +#define QLIST_FIRST_RCU(head) (atomic_rcu_read(&(head)->lh_first))
>> +#define QLIST_NEXT_RCU(elm, field) (atomic_rcu_read(&(elm)->field.le_next))
>> +
>> +/*
>> + * List functions.
>> + */
>> +
>> +
>> +/*
>> + *  The difference between atomic_read/set and atomic_rcu_read/set
>> + *  is in the including of a read/write memory barrier to the volatile
>> + *  access. atomic_rcu_* macros include the memory barrier, the
>> + *  plain atomic macros do not. Therefore, it should be correct to
>> + *  issue a series of reads or writes to the same element using only
>> + *  the atomic_* macro, until the last read or write, which should be
>> + *  atomic_rcu_* to introduce a read or write memory barrier as
>> + *  appropriate.
>> + */
>> +
>> +/* Upon publication of the listelm->next value, list readers
>> + 

Re: [Qemu-devel] [PATCH 6/9] cosmetic changes preparing for the following patches

2015-02-04 Thread Paolo Bonzini


On 04/02/2015 04:10, Fam Zheng wrote:
>> > -return;
> Other changes are equivalent, but not quite for this one. But I think it is
> still correct, so:
> 
> Reviewed-by: Fam Zheng 

True, this is not entirely cosmetic.  Still, the loop is guaranteed to
only hit one block, because of the check on offset.

Removing the "return;" matches what we do in the RAM_PREALLOC case.

Paolo



Re: [Qemu-devel] [PATCH 7/7] PPC: Don't use legacy -usbdevice support for setting up board

2015-02-04 Thread Alexander Graf


On 04.02.15 13:28, Markus Armbruster wrote:
> It's tempting, because usbdevice_create() is so simple to use.  But
> there's a lot of unwanted complexity behind the simple interface.
> Switch to usb_create_simple().
> 
> Cc: Alexander Graf 
> Cc: qemu-...@nongnu.org
> Signed-off-by: Markus Armbruster 

Reviewed-by: Alexander Graf 


Alex



Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Daniel P. Berrange
On Wed, Feb 04, 2015 at 01:43:12PM +0100, Paolo Bonzini wrote:
> 
> 
> On 04/02/2015 12:32, Daniel P. Berrange wrote:
> > So my idea would be that we define a QEMUChannel object and set of APIs to
> > standardize all interaction with sockets, pipes, RDMA, whatever $channel,
> > and then convert the QEMU features I've mentioned over to use that. I think
> > that would be simpler than trying to untangle QEMUFile code from migration
> > and then extend its features.
> 
> Could it be GIOChannel simply?
> 
> 1) Chardev is already mostly a wrapper around GIOChannel
> 
> 2) NBD and VNC could be converted to GIOChannel with relative ease
> 
> 3) migration is more complicated because (unlike everything else) it
> uses a separate thread and blocking sockets, but you could probably
> write a GIOChannel-based implementation of QEMUFile.

It might be possible to base it on GIOChannel, but IIRC some of the
migration code was using iovecs for I/O and GIOChannel API doesn't
allow for that. So you'd have to sacrifice performance by issuing a
separate syscall for each iovec element which seems sucky to me.
If you think that's an acceptable limitation though, I could certainly
explore use of GIOChannel.

More broadly speaking GIOChannel has fallen out of favour in the
glib ecosystem, with most apps/libraries more focused on use of
the GIO APIs instead, but IIUC QEMU avoids use of the GIO library
due to need to support older glib versions.

> I found a GIOChannel wrapper for gnutls at
> https://github.com/aldebaran/connman/blob/master/gweb/giognutls.c.  It's
> not the right license for QEMU (GPLv2-only) but it's only 400 lines of
> code.  If necessary I can help with clean-room reverse engineering.

It doesn't seem todo any thing related to certificate validation which
explains why it is so short compared ot the gnutls code we already have
for VNC in QEMU. So I don't think it's particularly useful in terms of
saving effort.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v6 0/5] Geometry and blocksize detection for backing devices.

2015-02-04 Thread Ekaterina Tumanova

On 01/19/2015 05:34 PM, Ekaterina Tumanova wrote:

Updates v5 -> v6:

Minor Updates according the last review from Stefan Hajnoczi:
1. Do not change the flow of code, factored out of raw_probe_alignment.
2. added #ifdef __linux__ in 2 places of raw-posix.c, mentioned by reviewer.
3. adjusted the comment hdev_probe_geometry according suggestment.
4. use bdrv_nb_sectors(bs) instead of bs->total_sectors.
5. do not discard error blk_probe_blocksizes(). now has rc.
6. put the 512-byte default blocksize value in blkconf_blocksizes.
7. drop the default parameter from the DEFINE_PROP_BLOCKSIZE() macro.

Thanks,
Kate.

---
Patchset Description (didn't change):

Proper geometry and blocksize information is vital for support of
DASD/ECKD drives in Linux guests. Otherwise things will fail in
certain cases.

The existing geometry and blocksize qemu defaults have no sense
for DASD drives (hd_geometry_guess detection and 512 for sizes).
Setting this information manually in XML file is far from user-friendly,
as the admin has to manually look up the properties of the
host disks and then modify the guest definition accordingly.

Since Linux uses DASDs as normal block devices, we actually
want to use virtio-blk to pass those to KVM guests.

In order to avoid any change in behavior of other drives, the DASD
special casing was advised. We call ioctl BIODASDINFO2 on the block
device, which will only succeed if the device is really a DASD.

In order to retrieve the underlying device geometry and blocksizes
a new block-backend functions and underlying driver functions were
introduced (blk_probe_blocksizes anf blk_probe_geometry wrappers
and corresponding bdrv_xx functions).

As for now only "host_device" driver received new detection methods.
For "raw" we call childs method as usual. In future one may update
other drivers to add some other detection heuristics.

If the host_device appears to be a DASD, the driver functions
(hdev_probe_blocksizes and hdev_probe_geometry) will call certain
ioctls in order to detect geometry and blocksizes of the underlying device.
if probing failed bdrv_probe_blocksizes caller will set defaults,
and bdrv_probe_geometry will fail to allow fallback to old detection logic.

The front-end (BlockConf API) was updated:
1. a new blkconf_blocksizes function was added. It doesn't
change user-defined blocksize values. If properties are unset, it will
set values, returned by blk_probe_backend. In order to allow this logic,
blocksize properties were initialized with 0. (driver will return 512 if
backing device probing didn't succeed or if driver method is not defined).
2. hd_geometry guess was changed to firstly try to retrieve values via
blk_probe_geometry and if it fails, fallback to the old logic.

Ekaterina Tumanova (5):
   block: add bdrv functions for geometry and blocksize
   raw-posix: Factor block size detection out of raw_probe_alignment()
   block: Add driver methods to probe blocksizes and geometry
   block-backend: Add wrappers for blocksizes and geometry probing
   BlockConf: Call backend functions to detect geometry and blocksizes

  block.c|  36 ++
  block/block-backend.c  |  10 +++
  block/raw-posix.c  | 154 -
  block/raw_bsd.c|  12 
  hw/block/block.c   |  20 ++
  hw/block/hd-geometry.c |  10 ++-
  hw/block/nvme.c|   1 +
  hw/block/virtio-blk.c  |   1 +
  hw/core/qdev-properties.c  |   3 +-
  hw/ide/qdev.c  |   1 +
  hw/scsi/scsi-disk.c|   1 +
  hw/usb/dev-storage.c   |   1 +
  include/block/block.h  |  13 
  include/block/block_int.h  |  15 
  include/hw/block/block.h   |   5 +-
  include/hw/qdev-properties.h   |   4 +-
  include/sysemu/block-backend.h |   2 +
  17 files changed, 267 insertions(+), 22 deletions(-)



ping




Re: [Qemu-devel] [PATCH v5 09/26] qcow2: Helper for refcount array reallocation

2015-02-04 Thread Kevin Wolf
Am 15.12.2014 um 13:50 hat Max Reitz geschrieben:
> Add a helper function for reallocating a refcount array, independent of
> the refcount order. The newly allocated space is zeroed and the function
> handles failed reallocations gracefully.
> 
> The helper function will always align the buffer size to a cluster
> boundary; if storing the refcounts in such an array in big endian byte
> order, this makes it possible to write parts of the array directly as
> refcount blocks into the image file.
> 
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 
> ---
>  block/qcow2-refcount.c | 137 
> +++--
>  1 file changed, 88 insertions(+), 49 deletions(-)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index fd28a13..4ede971 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -1130,6 +1130,70 @@ fail:
>  /* refcount checking functions */
>  
>  
> +static size_t refcount_array_byte_size(BDRVQcowState *s, uint64_t entries)
> +{
> +if (s->refcount_order < 3) {
> +/* sub-byte width */
> +int shift = 3 - s->refcount_order;
> +return DIV_ROUND_UP(entries, 1 << shift);
> +} else if (s->refcount_order == 3) {
> +/* byte width */
> +return entries;
> +} else {
> +/* multiple bytes wide */
> +
> +/* This assertion holds because there is no way we can address more 
> than
> + * 2^(64 - 9) clusters at once (with cluster size 512 = 2^9, and 
> because
> + * offsets have to be representable in bytes); due to every cluster

Considering this, which implies that a multiplication by 64 can't
overflow, why can't this function be as simple as the following?

assert(entries <= (1 << (64 - 9)));
return DIV_ROUND_UP(entries * s->refcount_bits, 8)

> + * corresponding to one refcount entry and because refcount_order 
> has to
> + * be below 7, we are far below that limit */
> +assert(!(entries >> (64 - (s->refcount_order - 3;
> +
> +return entries << (s->refcount_order - 3);
> +}
> +}
> +
> +/**
> + * Reallocates *array so that it can hold new_size entries. *size must 
> contain
> + * the current number of entries in *array. If the reallocation fails, *array
> + * and *size will not be modified and -errno will be returned. If the
> + * reallocation is successful, *array will be set to the new buffer and *size
> + * will be set to new_size.

And 0 is returned.

> The size of the reallocated refcount array buffer
> + * will be aligned to a cluster boundary, and the newly allocated area will 
> be
> + * zeroed.
> + */
> +static int realloc_refcount_array(BDRVQcowState *s, uint16_t **array,
> +  int64_t *size, int64_t new_size)
> +{
> +/* Round to clusters so the array can be directly written to disk */
> +size_t old_byte_size = ROUND_UP(refcount_array_byte_size(s, *size),
> +s->cluster_size);
> +size_t new_byte_size = ROUND_UP(refcount_array_byte_size(s, new_size),
> +s->cluster_size);

size_to_clusters()? (Unfortunately still not short enough to keep each
initialisation on a single line...)

> +uint16_t *new_ptr;
> +
> +if (new_byte_size == old_byte_size) {
> +*size = new_size;
> +return 0;
> +}

This is only correct if the array was previously allocated by the same
function, or at least rounded up to a cluster boundary. What do we win
by keeping a smaller byte count in *size than is actually allocated? If
we had the real allocation size in it, we wouldn't have to make
assumptions about the real array size.

> +assert(new_byte_size > 0);
> +
> +new_ptr = g_try_realloc(*array, new_byte_size);
> +if (!new_ptr) {
> +return -ENOMEM;
> +}
> +
> +if (new_byte_size > old_byte_size) {
> +memset((void *)((uintptr_t)new_ptr + old_byte_size), 0,
> +   new_byte_size - old_byte_size);
> +}
> +
> +*array = new_ptr;
> +*size  = new_size;
> +
> +return 0;
> +}
>  
>  /*
>   * Increases the refcount for a range of clusters in a given refcount table.
> @@ -1146,6 +1210,7 @@ static int inc_refcounts(BlockDriverState *bs,
>  {
>  BDRVQcowState *s = bs->opaque;
>  uint64_t start, last, cluster_offset, k;
> +int ret;
>  
>  if (size <= 0) {
>  return 0;
> @@ -1157,23 +1222,12 @@ static int inc_refcounts(BlockDriverState *bs,
>  cluster_offset += s->cluster_size) {
>  k = cluster_offset >> s->cluster_bits;
>  if (k >= *refcount_table_size) {
> -int64_t old_refcount_table_size = *refcount_table_size;
> -uint16_t *new_refcount_table;
> -
> -*refcount_table_size = k + 1;
> -new_refcount_table = g_try_realloc(*refcount_table,
> -   *refcount_table_size *
> -   sizeof

Re: [Qemu-devel] [PATCH 1/2] Add GDB qAttached support

2015-02-04 Thread Pedro Alves
Hi, I was skimming the list, and noticed:

On 01/31/2015 10:28 AM, Jan Kiszka wrote:
> @@ -1187,6 +1193,10 @@ static int gdb_handle_packet(GDBState *s, const char 
> *line_buf)
>  put_packet_binary(s, buf, len + 1);
>  break;
>  }
> +if (strncmp(p, "Attached", 8) == 0) {

This looks like it'd mishandle a future qAttached2 packet.

It should be doing something like:

   if (strncmp(p, "Attached", 8) == 0 &&
  (p[8] == '\0' || p[8] == ':')) {

or:

   if (strcmp(p, "Attached") == 0 || strncmp(p, "Attached:", 9) == 0) {


Likewise other packets, if they have the same issue.
(I'm not familiar with qemu's stub's internals.)

Thanks,
Pedro Alves




Re: [Qemu-devel] [PATCH v2] qga: add guest-set-admin-password command

2015-02-04 Thread Olga Krishtal

On 12/01/15 18:58, Daniel P. Berrange wrote:

Add a new 'guest-set-admin-password' command for changing the
root/administrator password. This command is needed to allow
OpenStack to support its API for changing the admin password
on a running guest.

Accepts either the raw password string:

$ virsh -c qemu:///system  qemu-agent-command f21x86_64 \
'{ "execute": "guest-set-admin-password", "arguments":
  { "crypted": false, "password": "12345678" } }'
   {"return":{}}

Or a pre-encrypted string (recommended)

$ virsh -c qemu:///system  qemu-agent-command f21x86_64 \
'{ "execute": "guest-set-admin-password", "arguments":
  { "crypted": true, "password":
 "$6$T9O/j/aGPrE...sniprQoRN4F0.GG0MPjNUNyml." } }'

NB windows support is desirable, but not implemented in this
patch.

Signed-off-by: Daniel P. Berrange 
---
  qga/commands-posix.c | 90 
  qga/commands-win32.c |  6 
  qga/qapi-schema.json | 19 +++
  3 files changed, 115 insertions(+)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index f6f3e3c..4887889 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -1875,6 +1875,90 @@ int64_t qmp_guest_set_vcpus(GuestLogicalProcessorList 
*vcpus, Error **errp)
  return processed;
  }
  
+void qmp_guest_set_admin_password(bool crypted, const char *password,

+  Error **errp)
+{
+Error *local_err = NULL;
+char *passwd_path = NULL;
+pid_t pid;
+int status;
+int datafd[2] = { -1, -1 };
+char *acctpw = g_strdup_printf("root:%s\n", password);
+size_t acctpwlen = strlen(acctpw);
+
+if (strchr(password, '\n')) {
+error_setg(errp, "forbidden characters in new password");
+goto out;
+}
+
+passwd_path = g_find_program_in_path("chpasswd");
+
+if (!passwd_path) {
+error_setg(errp, "cannot find 'passwd' program in PATH");
+goto out;
+}
+
+if (pipe(datafd) < 0) {
+error_setg(errp, "cannot create pipe FDs");
+goto out;
+}
+
+pid = fork();
+if (pid == 0) {
+close(datafd[1]);
+/* child */
+setsid();
+dup2(datafd[0], 0);
+reopen_fd_to_null(1);
+reopen_fd_to_null(2);
+
+if (crypted) {
+execle(passwd_path, "chpasswd", "-e", NULL, environ);
+} else {
+execle(passwd_path, "chpasswd", NULL, environ);
+}
+_exit(EXIT_FAILURE);
+} else if (pid < 0) {
+error_setg_errno(errp, errno, "failed to create child process");
+goto out;
+}
+close(datafd[0]);
+datafd[0] = -1;
+
+if (qemu_write_full(datafd[1], acctpw, acctpwlen) != acctpwlen) {
+error_setg_errno(errp, errno, "cannot write new account password");
+goto out;
+}
+close(datafd[1]);
+datafd[1] = -1;
+
+ga_wait_child(pid, &status, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+goto out;
+}
+
+if (!WIFEXITED(status)) {
+error_setg(errp, "child process has terminated abnormally");
+goto out;
+}
+
+if (WEXITSTATUS(status)) {
+error_setg(errp, "child process has failed to set admin password");
+goto out;
+}
+
+out:
+g_free(acctpw);
+g_free(passwd_path);
+if (datafd[0] != -1) {
+close(datafd[0]);
+}
+if (datafd[1] != -1) {
+close(datafd[1]);
+}
+}
+
  #else /* defined(__linux__) */
  
  void qmp_guest_suspend_disk(Error **errp)

@@ -1910,6 +1994,12 @@ int64_t qmp_guest_set_vcpus(GuestLogicalProcessorList 
*vcpus, Error **errp)
  return -1;
  }
  
+void qmp_guest_set_admin_password(bool crypted, const char *password,

+  Error **errp)
+{
+error_set(errp, QERR_UNSUPPORTED);
+}
+
  #endif
  
  #if !defined(CONFIG_FSFREEZE)

diff --git a/qga/commands-win32.c b/qga/commands-win32.c
index 3bcbeae..56854d5 100644
--- a/qga/commands-win32.c
+++ b/qga/commands-win32.c
@@ -446,6 +446,12 @@ int64_t qmp_guest_set_vcpus(GuestLogicalProcessorList 
*vcpus, Error **errp)
  return -1;
  }
  
+void qmp_guest_set_admin_password(bool crypted, const char *password,

+  Error **errp)
+{
+error_set(errp, QERR_UNSUPPORTED);
+}
+
  /* add unsupported commands to the blacklist */
  GList *ga_command_blacklist_init(GList *blacklist)
  {
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index 376e79f..25118e2 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -738,3 +738,22 @@
  ##
  { 'command': 'guest-get-fsinfo',
'returns': ['GuestFilesystemInfo'] }
+
+##
+# @guest-set-admin-password
+#
+# @crypted: true if password is already crypt()d, false if raw
+# @password: the new password entry
+#
+# If the @crypted flag is true, it is the callers responsibility
+# to ensure the correct crypt() encryption scheme is used. This
+# command does not attempt to interpret or report

Re: [Qemu-devel] [PATCH v15 2/2] sPAPR: Implement sPAPRPHBClass::eeh_handler

2015-02-04 Thread Alexander Graf


On 14.01.15 02:41, David Gibson wrote:
> On Mon, Jan 05, 2015 at 11:26:28AM +1100, Gavin Shan wrote:
>> The patch implements sPAPRPHBClass::eeh_handler so that the
>> EEH RTAS requests can be routed to VFIO for further handling.
>>
>> Signed-off-by: Gavin Shan 
>> ---
>>  hw/ppc/spapr_pci_vfio.c | 56 
>> +
>>  hw/vfio/common.c|  1 +
>>  2 files changed, 57 insertions(+)
>>
>> diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
>> index 144912b..73652a9 100644
>> --- a/hw/ppc/spapr_pci_vfio.c
>> +++ b/hw/ppc/spapr_pci_vfio.c
>> @@ -71,6 +71,61 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState 
>> *sphb, Error **errp)
>>  spapr_tce_get_iommu(tcet));
>>  }
>>  
>> +static int spapr_phb_vfio_eeh_handler(sPAPRPHBState *sphb, int req, int opt)
>> +{
>> +sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
>> +struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
> 
> This is a local variable, which means it won't be initialized.  You
> never memset() it and it's not obvious that all fields get
> initialized, which makes it dangerous to pass to an ioctl().

As far as I understand C, in the construct above all unmentioned fields
actually do get initialized to 0.


Alex



Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Paolo Bonzini


On 04/02/2015 14:00, Daniel P. Berrange wrote:
> On Wed, Feb 04, 2015 at 01:43:12PM +0100, Paolo Bonzini wrote:
>>
>>
>> On 04/02/2015 12:32, Daniel P. Berrange wrote:
>>> So my idea would be that we define a QEMUChannel object and set of APIs to
>>> standardize all interaction with sockets, pipes, RDMA, whatever $channel,
>>> and then convert the QEMU features I've mentioned over to use that. I think
>>> that would be simpler than trying to untangle QEMUFile code from migration
>>> and then extend its features.
>>
>> Could it be GIOChannel simply?
>>
>> 1) Chardev is already mostly a wrapper around GIOChannel
>>
>> 2) NBD and VNC could be converted to GIOChannel with relative ease
>>
>> 3) migration is more complicated because (unlike everything else) it
>> uses a separate thread and blocking sockets, but you could probably
>> write a GIOChannel-based implementation of QEMUFile.
> 
> It might be possible to base it on GIOChannel, but IIRC some of the
> migration code was using iovecs for I/O and GIOChannel API doesn't
> allow for that. So you'd have to sacrifice performance by issuing a
> separate syscall for each iovec element which seems sucky to me.
> If you think that's an acceptable limitation though, I could certainly
> explore use of GIOChannel.

As long as QEMUFile remains there and GIOChannel is used only when
encryption is required, that would be an acceptable limitation.  As I
wrote above, migration is a bit special anyway.

> More broadly speaking GIOChannel has fallen out of favour in the
> glib ecosystem, with most apps/libraries more focused on use of
> the GIO APIs instead, but IIUC QEMU avoids use of the GIO library
> due to need to support older glib versions.

Besides that, QEMU developers are not extremely familiar with all the
glib stuff, and GIOChannel is a thin-enough wrapper that it's pretty
easy to understand what's going on.  But that can change if the
alternative has advantages.  Perhaps we could start by converting
chardevs from GIOChannel to GIO.

GIO has TLS bindings (not SASL I think?) in GIO 2.28.  Currently we
require glib 2.12 (released 2006) on POSIX systems and glib 2.20
(released 2009) on Windows.  That's very conservative indeed, I wouldn't
mind changing to a newer version.

>> I found a GIOChannel wrapper for gnutls at
>> https://github.com/aldebaran/connman/blob/master/gweb/giognutls.c.  It's
>> not the right license for QEMU (GPLv2-only) but it's only 400 lines of
>> code.  If necessary I can help with clean-room reverse engineering.
> 
> It doesn't seem todo any thing related to certificate validation which
> explains why it is so short compared ot the gnutls code we already have
> for VNC in QEMU. So I don't think it's particularly useful in terms of
> saving effort.

Yeah, it was only interesting for the GIOChannel boilerplate.

Paolo



Re: [Qemu-devel] [PATCH v15 2/2] sPAPR: Implement sPAPRPHBClass::eeh_handler

2015-02-04 Thread David Gibson
On Wed, Feb 04, 2015 at 02:42:00PM +0100, Alexander Graf wrote:
> 
> 
> On 14.01.15 02:41, David Gibson wrote:
> > On Mon, Jan 05, 2015 at 11:26:28AM +1100, Gavin Shan wrote:
> >> The patch implements sPAPRPHBClass::eeh_handler so that the
> >> EEH RTAS requests can be routed to VFIO for further handling.
> >>
> >> Signed-off-by: Gavin Shan 
> >> ---
> >>  hw/ppc/spapr_pci_vfio.c | 56 
> >> +
> >>  hw/vfio/common.c|  1 +
> >>  2 files changed, 57 insertions(+)
> >>
> >> diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
> >> index 144912b..73652a9 100644
> >> --- a/hw/ppc/spapr_pci_vfio.c
> >> +++ b/hw/ppc/spapr_pci_vfio.c
> >> @@ -71,6 +71,61 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState 
> >> *sphb, Error **errp)
> >>  spapr_tce_get_iommu(tcet));
> >>  }
> >>  
> >> +static int spapr_phb_vfio_eeh_handler(sPAPRPHBState *sphb, int req, int 
> >> opt)
> >> +{
> >> +sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
> >> +struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
> > 
> > This is a local variable, which means it won't be initialized.  You
> > never memset() it and it's not obvious that all fields get
> > initialized, which makes it dangerous to pass to an ioctl().
> 
> As far as I understand C, in the construct above all unmentioned fields
> actually do get initialized to 0.

Sorry, my mistake.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpk3U1Mw0Afn.pgp
Description: PGP signature


Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Dr. David Alan Gilbert
* Daniel P. Berrange (berra...@redhat.com) wrote:
> In QEMU there are a number of features which involve communication with an
> external system over an I/O channel of some form. The features include
> migration, NBD, VNC and character devices. The I/O channel in question might
> might be a FIFO pipe, a PTY, a TCP socket, a UNIX domain socket, RMDA channel
> or something else, while the external system can be another QEMU, libvirt, an
> NBD server, or something else.

> Currently the only place where there is any meaningful level of security is
> the VNC server, which supports the VeNCrypt extension providing TLS sessions
> for data encryption and x509 cert validation for authentication, and the SASL
> extension which also provides either encryption of authentication or both.
> The migration data channel is more or less completely unprotected unless it
> is tunnelled via libvirt or equivalent external secure channel. The same is
> true for NBD, though there was a recent discussion about defining an extension
> to use TLS. Likewise serial ports, parallel ports, virtio consoles all use the
> chardev backends which offer no security features.

I think fixing this for migration might simplify stuff for users a lot as well;
the choice of whether libvirt does tunneling or not and what it means
for how block migration happens etc can get very confusing.

> There are some other network related pieces in QEMU, namely the built-in 
> iSCSI,
> RBD, NFS clients, and SPICE server. Since those are all implemented via 3rd
> party libraries rather than natively by QEMU I'm going ignore them for the 
> sake
> of this discussion and focus on network interaction directly implemented in
> QEMU.
> 
> We have a broad goal in OpenStack that every network channel in use must have
> encryption and authentication capabilities. Currently all the communication
> channels between the end user and the cloud infrastructure edge servers are
> secured, but internally a number of the cloud infrastructure components are
> unsecured. For example, we recommend to tunnel migration via libvirt, though
> that excludes use of the NBD for block migration since libvirt can't currently
> tunnel that. Internally we will shortly use  VeNCrypt for protecting VNC
> between the compute hosts and the openstack console proxy edge servers. We
> are lacking the abilty to secure serial ports between the compute nods and
> console proxy.
> 
> Essentially the project considers that it is no longer sufficient to consider
> the private management LAN (on which the cloud infrastructure is deployed) to
> be fully trusted; it must be considered hostile.
> 
> So back to QEMU/KVM, we want to have
> 
>  - Native support for encryption & authentication with migration, since
>tunnelling via libvirt has a non-negligble performance overhead adding
>both latency and CPU load
> 
>  - Native support for encryption & authentication with NBD server to enable
>security of the block migration service
> 
>  - Native support for encryption & authentication with chardev TCP backends
>to enable security of the serial port consoles.
> 
> As a starting point, the desire is to have TLS for session encryption and
> x509 certificate verification for authentication, but at a later date we'd
> also like to add the ability to use SASL for either aspect too. Thinking about
> QEMU users in general, it would probably be useful if any impl could allow for
> future enhancements such as SSH tunnelling too.
> 
> I have some development cycles available to work on addressing these gaps in
> QEMU, along with relevant experiance since I did the previous VNC server work
> for adding TLS and SASL in QEMU, along with similar in libvirt and GTK-VNC.
> 
> Having looked at the QEMU code for VNC, migration and chardevs I think the
> main stumbling block is currently that QEMU does not have any kind of standard
> approach for dealing with I/O channels internally. Migration has the QEMUFile
> abstraction, but it is currently fairly coupled to the migration code itself
> and limited in scope, because it doesn't actually deal with socket listen
> or connect tasks. That's still part of the migration code itself, QEMUFile
> only deals with I/O transfer, so it is a pretty incomplete abstraction layer.
> The chardev code has a backend abstraction, but that is really horribly
> entwined with the chardev code so not reusable in any way without a rewrite
> from scratch IMHO. The VNC server has alot of useful code for dealing with
> TLS & SASL, but it is somewhat entwined with the VNC server. The VNC server
> doesn't use any I/O abstraction just preferring raw FDs / sockets. I've not
> looked at the NBD code in detail, but I'm presuming it is using raw FDs /
> sockets.
> 
> Since this TLS/SASL code is non-trivial (and obviously security critical), I
> really don't want to end up with 4 separate places to implement it in QEMU.
> IMHO the only practical / sensible approach is to define some kind of standar

Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Markus Armbruster
"Daniel P. Berrange"  writes:

> On Wed, Feb 04, 2015 at 01:43:12PM +0100, Paolo Bonzini wrote:
>> 
>> 
>> On 04/02/2015 12:32, Daniel P. Berrange wrote:
>> > So my idea would be that we define a QEMUChannel object and set of APIs to
>> > standardize all interaction with sockets, pipes, RDMA, whatever $channel,
>> > and then convert the QEMU features I've mentioned over to use that. I think
>> > that would be simpler than trying to untangle QEMUFile code from migration
>> > and then extend its features.
>> 
>> Could it be GIOChannel simply?
>> 
>> 1) Chardev is already mostly a wrapper around GIOChannel
>> 
>> 2) NBD and VNC could be converted to GIOChannel with relative ease
>> 
>> 3) migration is more complicated because (unlike everything else) it
>> uses a separate thread and blocking sockets, but you could probably
>> write a GIOChannel-based implementation of QEMUFile.
>
> It might be possible to base it on GIOChannel, but IIRC some of the
> migration code was using iovecs for I/O and GIOChannel API doesn't
> allow for that. So you'd have to sacrifice performance by issuing a
> separate syscall for each iovec element which seems sucky to me.
> If you think that's an acceptable limitation though, I could certainly
> explore use of GIOChannel.
>
> More broadly speaking GIOChannel has fallen out of favour in the
> glib ecosystem, with most apps/libraries more focused on use of
> the GIO APIs instead, but IIUC QEMU avoids use of the GIO library
> due to need to support older glib versions.

Remind me: what GLib version are we targeting, and why?

[...]



Re: [Qemu-devel] [PATCH v1] vhost-user: fix not send all hugepage files to vhost-user

2015-02-04 Thread Michael S. Tsirkin
On Mon, Feb 02, 2015 at 12:29:47AM +0100, Paolo Bonzini wrote:
> 
> 
> On 17/12/2014 07:02, haifeng@huawei.com wrote:
> > From: linhaifeng 
> > 
> > If we create VM with two or more numa nodes qemu will create two
> > or more hugepage files but qemu only send one hugepage file fd
> > to vhost-user when VM's memory size is 2G and with two numa nodes.
> > 
> > Signed-off-by: linhaifeng 
> 
> The bug is in vhost_dev_assign_memory.  It doesn't check that the file
> descriptor matches when merging regions.  Michael, does the merging
> trigger in practice?  Can we just eliminate it?
> 
> Paolo

I'm not sure: does memory core ever give us two adjacent
RAM segments that we *can* merge?
If yes it would trigger, and extra memory slots slow down lookups
linearly so they aren't free.

> > ---
> >  hw/virtio/vhost-user.c  | 78 
> > ++---
> >  hw/virtio/vhost.c   | 13 
> >  linux-headers/linux/vhost.h |  7 
> >  3 files changed, 73 insertions(+), 25 deletions(-)
> > 
> > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > index aefe0bb..439cbba 100644
> > --- a/hw/virtio/vhost-user.c
> > +++ b/hw/virtio/vhost-user.c
> > @@ -24,6 +24,10 @@
> >  #include 
> >  
> >  #define VHOST_MEMORY_MAX_NREGIONS8
> > +/* FIXME: same as the max number of numa node?*/
> > +#define HUGEPAGE_MAX_FILES   8
> > +
> > +#define RAM_SHARED (1 << 1)
> >  
> >  typedef enum VhostUserRequest {
> >  VHOST_USER_NONE = 0,
> > @@ -41,14 +45,15 @@ typedef enum VhostUserRequest {
> >  VHOST_USER_SET_VRING_KICK = 12,
> >  VHOST_USER_SET_VRING_CALL = 13,
> >  VHOST_USER_SET_VRING_ERR = 14,
> > -VHOST_USER_MAX
> > +VHOST_USER_MMAP_HUGEPAGE_FILE = 15,
> > +VHOST_USER_UNMAP_HUGEPAGE_FILE = 16,
> > +VHOST_USER_MAX,
> >  } VhostUserRequest;
> >  
> >  typedef struct VhostUserMemoryRegion {
> >  uint64_t guest_phys_addr;
> >  uint64_t memory_size;
> >  uint64_t userspace_addr;
> > -uint64_t mmap_offset;
> >  } VhostUserMemoryRegion;
> >  
> >  typedef struct VhostUserMemory {
> > @@ -57,6 +62,16 @@ typedef struct VhostUserMemory {
> >  VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS];
> >  } VhostUserMemory;
> >  
> > +typedef struct HugepageMemoryInfo {
> > +uint64_t base_addr;
> > +uint64_t size;
> > +}HugeMemInfo;
> > +
> > +typedef struct HugepageInfo {
> > +uint32_t num;
> > +HugeMemInfo files[HUGEPAGE_MAX_FILES];
> > +}HugepageInfo;
> > +
> >  typedef struct VhostUserMsg {
> >  VhostUserRequest request;
> >  
> > @@ -71,6 +86,7 @@ typedef struct VhostUserMsg {
> >  struct vhost_vring_state state;
> >  struct vhost_vring_addr addr;
> >  VhostUserMemory memory;
> > +HugepageInfo huge_info;
> >  };
> >  } QEMU_PACKED VhostUserMsg;
> >  
> > @@ -104,7 +120,9 @@ static unsigned long int 
> > ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
> >  VHOST_GET_VRING_BASE,   /* VHOST_USER_GET_VRING_BASE */
> >  VHOST_SET_VRING_KICK,   /* VHOST_USER_SET_VRING_KICK */
> >  VHOST_SET_VRING_CALL,   /* VHOST_USER_SET_VRING_CALL */
> > -VHOST_SET_VRING_ERR /* VHOST_USER_SET_VRING_ERR */
> > +VHOST_SET_VRING_ERR,/* VHOST_USER_SET_VRING_ERR */
> > +VHOST_MMAP_HUGEPAGE_FILE,  /* VHOST_USER_MMAP_HUGEPAGE_FILE */
> > +VHOST_UNMAP_HUGEPAGE_FILE, /* VHOST_USER_UNMAP_HUGEPAGE_FILE */
> >  };
> >  
> >  static VhostUserRequest vhost_user_request_translate(unsigned long int 
> > request)
> > @@ -190,6 +208,7 @@ static int vhost_user_call(struct vhost_dev *dev, 
> > unsigned long int request,
> >  int fds[VHOST_MEMORY_MAX_NREGIONS];
> >  int i, fd;
> >  size_t fd_num = 0;
> > +RAMBlock *block;
> >  
> >  assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER);
> >  
> > @@ -213,37 +232,46 @@ static int vhost_user_call(struct vhost_dev *dev, 
> > unsigned long int request,
> >  case VHOST_RESET_OWNER:
> >  break;
> >  
> > -case VHOST_SET_MEM_TABLE:
> > -for (i = 0; i < dev->mem->nregions; ++i) {
> > -struct vhost_memory_region *reg = dev->mem->regions + i;
> > -ram_addr_t ram_addr;
> > +case VHOST_MMAP_HUGEPAGE_FILE:
> > +qemu_mutex_lock_ramlist();
> >  
> > -assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
> > -qemu_ram_addr_from_host((void 
> > *)(uintptr_t)reg->userspace_addr, &ram_addr);
> > -fd = qemu_get_ram_fd(ram_addr);
> > -if (fd > 0) {
> > -msg.memory.regions[fd_num].userspace_addr = 
> > reg->userspace_addr;
> > -msg.memory.regions[fd_num].memory_size  = reg->memory_size;
> > -msg.memory.regions[fd_num].guest_phys_addr = 
> > reg->guest_phys_addr;
> > -msg.memory.regions[fd_num].mmap_offset = 
> > reg->userspace_addr -
> > -(uintptr_t) qemu_get_ram_block_host_ptr(ram_addr);
> > -ass

Re: [Qemu-devel] [RFC v2 6/8] vfio_pci: fix a wrong check in vfio_pci_reset

2015-02-04 Thread Alex Williamson
On Wed, 2015-02-04 at 17:54 +0800, Chen Fan wrote:
> On 02/03/2015 04:16 AM, Alex Williamson wrote:
> > On Wed, 2015-01-28 at 16:37 +0800, Chen Fan wrote:
> >> when vfio device support FLR, then when device reset,
> >> we call VFIO_DEVICE_RESET ioctl to reset the device first,
> >> at kernel side, we also can see the order of reset:
> >> 3330 rc = pcie_flr(dev, probe);
> >> 3331 if (rc != -ENOTTY)
> >> 3332 goto done;
> >> 
> >> 3334 rc = pci_af_flr(dev, probe);
> >> 3335 if (rc != -ENOTTY)
> >> 3336 goto done;
> >> 3337
> >> 3338 rc = pci_pm_reset(dev, probe);
> >> 3339 if (rc != -ENOTTY)
> >> 3340 goto done;
> >>
> >> so when vfio has FLR, reset it directly.
> >>
> >> Signed-off-by: Chen Fan 
> >> ---
> >>   hw/vfio/pci.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> >> index 8c81bb3..54eb6b4 100644
> >> --- a/hw/vfio/pci.c
> >> +++ b/hw/vfio/pci.c
> >> @@ -3455,7 +3455,7 @@ static void vfio_pci_reset(DeviceState *dev)
> >>   vfio_pci_pre_reset(vdev);
> >>   
> >>   if (vdev->vbasedev.reset_works &&
> >> -(vdev->has_flr || !vdev->has_pm_reset) &&
> >> +vdev->has_flr &&
> >>   !ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {
> >>   trace_vfio_pci_reset_flr(vdev->vbasedev.name);
> >>   goto post_reset;
> > Does this actually fix anything?  QEMU shouldn't rely on a specific
> > behavior of the kernel.  This test is de-prioritizing a PM reset because
> > they're often non-effective.  If the device supports FLR, the second
> > part of the OR is unreached, so what's the point of this change?
> For this change, when I tested the code on my own machine.
> I found the vfio device has neither flr nor pm reset (e.g. NoSoftRst+).
> this also trigger ioctl VFIO_DEVICE_RESET, is it right?

Yes, that means that the device has a device specific reset or that it's
a singleton device on the bus and we can use the simpler path of
VFIO_DEVICE_RESET.  Thanks,

Alex




Re: [Qemu-devel] [PATCH v5 11/26] qcow2: More helpers for refcount modification

2015-02-04 Thread Kevin Wolf
Am 15.12.2014 um 13:50 hat Max Reitz geschrieben:
> Add helper functions for getting and setting refcounts in a refcount
> array for any possible refcount order, and choose the correct one during
> refcount initialization.
> 
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 
> Reviewed-by: Stefan Hajnoczi 

Hm... Wouldn't it be better to have a single generic implementation that
is based on uint64_t* and calculates the right shifts and bitmasks on
the fly?

Kevin



Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Peter Maydell
On 4 February 2015 at 13:49, Markus Armbruster  wrote:
> Remind me: what GLib version are we targeting, and why?

Our current minimum is 2.12 (or 2.20 in Windows specific code),
and the reason is RHEL5/Centos 5.

-- PMM



Re: [Qemu-devel] [PATCH v1] vhost-user: fix not send all hugepage files to vhost-user

2015-02-04 Thread Paolo Bonzini


On 04/02/2015 14:52, Michael S. Tsirkin wrote:
> I'm not sure: does memory core ever give us two adjacent
> RAM segments that we *can* merge?

I don't think so.  Memory core merges ranges already if the following holds:

- same memory region
- same dirty logging mode
- same readonly behavior (ignore reads, trap reads as MMIO, read/write)
- r2 end address is the same as r1 start address
- r2's offset in the memory region is equal to r1's plus the size of r1

Paolo

> If yes it would trigger, and extra memory slots slow down lookups
> linearly so they aren't free.



Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Daniel P. Berrange
On Wed, Feb 04, 2015 at 01:08:21PM +, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > In QEMU there are a number of features which involve communication with an
> > external system over an I/O channel of some form. The features include
> > migration, NBD, VNC and character devices. The I/O channel in question might
> > might be a FIFO pipe, a PTY, a TCP socket, a UNIX domain socket, RMDA 
> > channel
> > or something else, while the external system can be another QEMU, libvirt, 
> > an
> > NBD server, or something else.
> 
> > Currently the only place where there is any meaningful level of security is
> > the VNC server, which supports the VeNCrypt extension providing TLS sessions
> > for data encryption and x509 cert validation for authentication, and the 
> > SASL
> > extension which also provides either encryption of authentication or both.
> > The migration data channel is more or less completely unprotected unless it
> > is tunnelled via libvirt or equivalent external secure channel. The same is
> > true for NBD, though there was a recent discussion about defining an 
> > extension
> > to use TLS. Likewise serial ports, parallel ports, virtio consoles all use 
> > the
> > chardev backends which offer no security features.
> 
> I think fixing this for migration might simplify stuff for users a lot as 
> well;
> the choice of whether libvirt does tunneling or not and what it means
> for how block migration happens etc can get very confusing.

Yes, and not helped by the fact that no single current impl offers all
desired features - they are all partially overlapping sets :-(


> > Since this TLS/SASL code is non-trivial (and obviously security critical), I
> > really don't want to end up with 4 separate places to implement it in QEMU.
> > IMHO the only practical / sensible approach is to define some kind of 
> > standard
> > I/O channel API internally to QEMU which migration, NBD, chardev and VNC all
> > use. That gives us a single place to integrate all the security mechanisms
> > we need to support.  In libvirt we did something like this a little while
> > ago by defining a standard internal sockets API[1], with plugins for things
> > like SASL[2], TLS[4] and SSH[5] and it has been very successful in 
> > simplifying
> > the code by centralizing the hairy logic, though I wouldn't aim for exactly
> > the same design if doing it again - a more layered approach like QEMU 
> > blockdev
> > drivers is probably better in retrospect.
> > 
> > So my idea would be that we define a QEMUChannel object and set of APIs to
> > standardize all interaction with sockets, pipes, RDMA, whatever $channel,
> 
> I'm not sure if it makes sense for RDMA; it already has a couple of hooks
> that go around the back of QEMUFile in migration, and it's transferring the
> data via DMA so the page data doesn't go through the same path.

Could you ever anticipate any need for authentication or encryption in
the RDMA based channel ? I don't know enough about RDMA myself to know
if it makes sense or not, other than the fact that any channel between
two separate hosts needs security at some level in the stack.


> Some things to keep in mind:
>   1) The current patch from Liang Li doing multi threaded compression
>  It just strikes me as an exmaple of another type of filter in the 
> migration
>  stream.

Yes, that does make sense in that way,

>   2) Postcopy and fault tolerance need a bidirectional channel; and that back
>  channel tends to be small messages.

TLS will require bidirectional channels too in order to perform the
handshake. SASL will require bidirectional channels too. So this
seems rather inevitable.

>   3) I'd considered making a separate socket/fd for passing page data
>  in the hope of maybe making that page take data via splice; but am not
>  sure yet.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH 1/2] Add GDB qAttached support

2015-02-04 Thread Jan Kiszka
On 2015-02-04 14:36, Pedro Alves wrote:
> Hi, I was skimming the list, and noticed:
> 
> On 01/31/2015 10:28 AM, Jan Kiszka wrote:
>> @@ -1187,6 +1193,10 @@ static int gdb_handle_packet(GDBState *s, const char 
>> *line_buf)
>>  put_packet_binary(s, buf, len + 1);
>>  break;
>>  }
>> +if (strncmp(p, "Attached", 8) == 0) {
> 
> This looks like it'd mishandle a future qAttached2 packet.
> 
> It should be doing something like:
> 
>if (strncmp(p, "Attached", 8) == 0 &&
>   (p[8] == '\0' || p[8] == ':')) {
> 
> or:
> 
>if (strcmp(p, "Attached") == 0 || strncmp(p, "Attached:", 9) == 0) {
> 
> 
> Likewise other packets, if they have the same issue.
> (I'm not familiar with qemu's stub's internals.)

Thanks for the remark! Will update the patch using the easier readable
second variant.

Jan




signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Daniel P. Berrange
On Wed, Feb 04, 2015 at 02:42:20PM +0100, Paolo Bonzini wrote:
> 
> 
> On 04/02/2015 14:00, Daniel P. Berrange wrote:
> > On Wed, Feb 04, 2015 at 01:43:12PM +0100, Paolo Bonzini wrote:
> >>
> >>
> >> On 04/02/2015 12:32, Daniel P. Berrange wrote:
> >>> So my idea would be that we define a QEMUChannel object and set of APIs to
> >>> standardize all interaction with sockets, pipes, RDMA, whatever $channel,
> >>> and then convert the QEMU features I've mentioned over to use that. I 
> >>> think
> >>> that would be simpler than trying to untangle QEMUFile code from migration
> >>> and then extend its features.
> >>
> >> Could it be GIOChannel simply?
> >>
> >> 1) Chardev is already mostly a wrapper around GIOChannel
> >>
> >> 2) NBD and VNC could be converted to GIOChannel with relative ease
> >>
> >> 3) migration is more complicated because (unlike everything else) it
> >> uses a separate thread and blocking sockets, but you could probably
> >> write a GIOChannel-based implementation of QEMUFile.
> > 
> > It might be possible to base it on GIOChannel, but IIRC some of the
> > migration code was using iovecs for I/O and GIOChannel API doesn't
> > allow for that. So you'd have to sacrifice performance by issuing a
> > separate syscall for each iovec element which seems sucky to me.
> > If you think that's an acceptable limitation though, I could certainly
> > explore use of GIOChannel.
> 
> As long as QEMUFile remains there and GIOChannel is used only when
> encryption is required, that would be an acceptable limitation.  As I
> wrote above, migration is a bit special anyway.

I'm not sure I'd like the idea of having different codepaths for
the encrypted vs non-encrypted impl. it seems like a recipe for
increased maintainence work and inconsistent behaviour over the
long term. My thought was that QEMUFile would basically go
away entirely by the end of the conversion, or at most be dealing
with the data rate throttling if that didn't fit nicely into the
generic IO layer.

> > More broadly speaking GIOChannel has fallen out of favour in the
> > glib ecosystem, with most apps/libraries more focused on use of
> > the GIO APIs instead, but IIUC QEMU avoids use of the GIO library
> > due to need to support older glib versions.
> 
> Besides that, QEMU developers are not extremely familiar with all the
> glib stuff, and GIOChannel is a thin-enough wrapper that it's pretty
> easy to understand what's going on.  But that can change if the
> alternative has advantages.  Perhaps we could start by converting
> chardevs from GIOChannel to GIO.
> 
> GIO has TLS bindings (not SASL I think?) in GIO 2.28.  Currently we
> require glib 2.12 (released 2006) on POSIX systems and glib 2.20
> (released 2009) on Windows.  That's very conservative indeed, I wouldn't
> mind changing to a newer version.

Yeah, it has some level of functionality for TLS, but I'm not sure about
the full extent of it and whether it'd be sufficient for what we need
in VNC for example.

The main difference between GIO's APIs and GIOChannel is that the new
GIO stuff is really designed around the idea of asynchronous callbacks
for completion of IO.

  eg

 g_input_stream_read_async(stream, buffer, size, read_done_callback);

 and then when read_done_callback gets triggered you have to call

 g_input_stream_read_finish(stream)

in order to get the success/failure status of the read, and the byte
count. While it is quite nice for new code IME, this is probably quite
alot harder to refit into existing QEMU codebase.  So either GIOChannel
or something custom that is similar to GIOChannel would likely be an
easier fit.

> >> I found a GIOChannel wrapper for gnutls at
> >> https://github.com/aldebaran/connman/blob/master/gweb/giognutls.c.  It's
> >> not the right license for QEMU (GPLv2-only) but it's only 400 lines of
> >> code.  If necessary I can help with clean-room reverse engineering.
> > 
> > It doesn't seem todo any thing related to certificate validation which
> > explains why it is so short compared ot the gnutls code we already have
> > for VNC in QEMU. So I don't think it's particularly useful in terms of
> > saving effort.
> 
> Yeah, it was only interesting for the GIOChannel boilerplate.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v2] qga: add guest-set-admin-password command

2015-02-04 Thread Daniel P. Berrange
On Wed, Feb 04, 2015 at 04:25:47PM +0300, Olga Krishtal wrote:
> On 12/01/15 18:58, Daniel P. Berrange wrote:
> >Add a new 'guest-set-admin-password' command for changing the
> >root/administrator password. This command is needed to allow
> >OpenStack to support its API for changing the admin password
> >on a running guest.
> >
> >Accepts either the raw password string:
> >
> >$ virsh -c qemu:///system  qemu-agent-command f21x86_64 \
> >'{ "execute": "guest-set-admin-password", "arguments":
> >  { "crypted": false, "password": "12345678" } }'
> >   {"return":{}}
> >
> >Or a pre-encrypted string (recommended)
> >
> >$ virsh -c qemu:///system  qemu-agent-command f21x86_64 \
> >'{ "execute": "guest-set-admin-password", "arguments":
> >  { "crypted": true, "password":
> > "$6$T9O/j/aGPrE...sniprQoRN4F0.GG0MPjNUNyml." } }'
> >
> >NB windows support is desirable, but not implemented in this
> >patch.
> >
> >Signed-off-by: Daniel P. Berrange 
> >---
> >  qga/commands-posix.c | 90 
> > 
> >  qga/commands-win32.c |  6 
> >  qga/qapi-schema.json | 19 +++
> >  3 files changed, 115 insertions(+)
> >
> >diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> >index f6f3e3c..4887889 100644
> >--- a/qga/commands-posix.c
> >+++ b/qga/commands-posix.c
> >@@ -1875,6 +1875,90 @@ int64_t qmp_guest_set_vcpus(GuestLogicalProcessorList 
> >*vcpus, Error **errp)
> >  return processed;
> >  }
> >+void qmp_guest_set_admin_password(bool crypted, const char *password,
> >+  Error **errp)
> >+{
> >+Error *local_err = NULL;
> >+char *passwd_path = NULL;
> >+pid_t pid;
> >+int status;
> >+int datafd[2] = { -1, -1 };
> >+char *acctpw = g_strdup_printf("root:%s\n", password);
> >+size_t acctpwlen = strlen(acctpw);
> >+
> >+if (strchr(password, '\n')) {
> >+error_setg(errp, "forbidden characters in new password");
> >+goto out;
> >+}
> >+
> >+passwd_path = g_find_program_in_path("chpasswd");
> >+
> >+if (!passwd_path) {
> >+error_setg(errp, "cannot find 'passwd' program in PATH");
> >+goto out;
> >+}
> >+
> >+if (pipe(datafd) < 0) {
> >+error_setg(errp, "cannot create pipe FDs");
> >+goto out;
> >+}
> >+
> >+pid = fork();
> >+if (pid == 0) {
> >+close(datafd[1]);
> >+/* child */
> >+setsid();
> >+dup2(datafd[0], 0);
> >+reopen_fd_to_null(1);
> >+reopen_fd_to_null(2);
> >+
> >+if (crypted) {
> >+execle(passwd_path, "chpasswd", "-e", NULL, environ);
> >+} else {
> >+execle(passwd_path, "chpasswd", NULL, environ);
> >+}
> >+_exit(EXIT_FAILURE);
> >+} else if (pid < 0) {
> >+error_setg_errno(errp, errno, "failed to create child process");
> >+goto out;
> >+}
> >+close(datafd[0]);
> >+datafd[0] = -1;
> >+
> >+if (qemu_write_full(datafd[1], acctpw, acctpwlen) != acctpwlen) {
> >+error_setg_errno(errp, errno, "cannot write new account password");
> >+goto out;
> >+}
> >+close(datafd[1]);
> >+datafd[1] = -1;
> >+
> >+ga_wait_child(pid, &status, &local_err);
> >+if (local_err) {
> >+error_propagate(errp, local_err);
> >+goto out;
> >+}
> >+
> >+if (!WIFEXITED(status)) {
> >+error_setg(errp, "child process has terminated abnormally");
> >+goto out;
> >+}
> >+
> >+if (WEXITSTATUS(status)) {
> >+error_setg(errp, "child process has failed to set admin password");
> >+goto out;
> >+}
> >+
> >+out:
> >+g_free(acctpw);
> >+g_free(passwd_path);
> >+if (datafd[0] != -1) {
> >+close(datafd[0]);
> >+}
> >+if (datafd[1] != -1) {
> >+close(datafd[1]);
> >+}
> >+}
> >+
> >  #else /* defined(__linux__) */
> >  void qmp_guest_suspend_disk(Error **errp)
> >@@ -1910,6 +1994,12 @@ int64_t qmp_guest_set_vcpus(GuestLogicalProcessorList 
> >*vcpus, Error **errp)
> >  return -1;
> >  }
> >+void qmp_guest_set_admin_password(bool crypted, const char *password,
> >+  Error **errp)
> >+{
> >+error_set(errp, QERR_UNSUPPORTED);
> >+}
> >+
> >  #endif
> >  #if !defined(CONFIG_FSFREEZE)
> >diff --git a/qga/commands-win32.c b/qga/commands-win32.c
> >index 3bcbeae..56854d5 100644
> >--- a/qga/commands-win32.c
> >+++ b/qga/commands-win32.c
> >@@ -446,6 +446,12 @@ int64_t qmp_guest_set_vcpus(GuestLogicalProcessorList 
> >*vcpus, Error **errp)
> >  return -1;
> >  }
> >+void qmp_guest_set_admin_password(bool crypted, const char *password,
> >+  Error **errp)
> >+{
> >+error_set(errp, QERR_UNSUPPORTED);
> >+}
> >+
> >  /* add unsupported commands to the blacklist */
> >  GList *ga_command_blacklist_init(GList *blacklist)
> >  {
> >diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
> 

Re: [Qemu-devel] [PATCH] vl.c: fix -usb option assertion failure in qemu_opt_get_bool_helper()

2015-02-04 Thread Christian Borntraeger
Am 02.02.2015 um 11:02 schrieb Marcel Apfelbaum:
> On 01/31/2015 11:23 AM, Jan Kiszka wrote:
>> On 2015-01-05 12:37, Jan Kiszka wrote:
>>> On 2015-01-05 12:22, Stefan Hajnoczi wrote:
 Commit 49d2e648e8087d154d8bf8b91f27c8e05e79d5a6 ("machine: remove
 qemu_machine_opts global list") removed option descriptions from the
 -machine QemuOptsList to avoid repeating MachineState's QOM properties.

 This change broke vl.c:usb_enabled() because qemu_opt_get_bool() cannot
 be used on QemuOptsList without option descriptions since QemuOpts
 doesn't know the type and therefore left an unparsed string value.

 This patch avoids calling qemu_opt_get_bool() to fix the assertion
 failure:

$ qemu-system-x86_64 -usb
qemu_opt_get_bool_helper: Assertion `opt->desc && opt->desc->type == 
 QEMU_OPT_BOOL' failed.

 Test the presence of -usb using qemu_opt_find() but use the
 MachineState->usb field instead of qemu_opt_get_bool().

 Cc: Marcel Apfelbaum 
 Cc: Tiejun Chen 
 Signed-off-by: Stefan Hajnoczi 
 ---
   vl.c | 7 +--
   1 file changed, 5 insertions(+), 2 deletions(-)

 diff --git a/vl.c b/vl.c
 index bea9656..6e8889c 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -999,8 +999,11 @@ static int parse_name(QemuOpts *opts, void *opaque)

   bool usb_enabled(bool default_usb)
   {
 -return qemu_opt_get_bool(qemu_get_machine_opts(), "usb",
 - has_defaults && default_usb);
 +if (qemu_opt_find(qemu_get_machine_opts(), "usb")) {
 +return current_machine->usb;
 +} else {
 +return has_defaults && default_usb;
 +}
   }

   #ifndef _WIN32

>>>
>>> That still leaves the other boolean machine options broken. A generic
>>> fix would be good. Or revert the original commit until we have one.
>>
>> Just pulled current master, and at least the iommu machine option is
>> still triggering an abort.
>>
>> What is the status of fixing these fallouts? Was anything else addressed
>> by now, just this one forgotten?
> No, is not forgotten. I waited to see that the approach to fix this issue
> is accepted by the reviewers.
> 
> Thanks for the reminder, patches will be submitted soon.
> Marcel

Anything to look at a branch or something? Adding new machine bool options
is currently pretty hard - as long as it doesnt work ;-)

Christian




Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Paolo Bonzini


On 04/02/2015 15:08, Daniel P. Berrange wrote:
>> > As long as QEMUFile remains there and GIOChannel is used only when
>> > encryption is required, that would be an acceptable limitation.  As I
>> > wrote above, migration is a bit special anyway.
> I'm not sure I'd like the idea of having different codepaths for
> the encrypted vs non-encrypted impl. it seems like a recipe for
> increased maintainence work and inconsistent behaviour over the
> long term. My thought was that QEMUFile would basically go
> away entirely by the end of the conversion, or at most be dealing
> with the data rate throttling if that didn't fit nicely into the
> generic IO layer.

QEMUFile has a bunch of hooks for RDMA (they were also used by the
never-upstreamed patches to speed up AF_UNIX migration with vmsplice),
so it cannot go away:

typedef struct QEMUFileOps {
QEMUFilePutBufferFunc *put_buffer;
QEMUFileGetBufferFunc *get_buffer;
QEMUFileCloseFunc *close;
QEMUFileGetFD *get_fd;
QEMUFileWritevBufferFunc *writev_buffer;
QEMURamHookFunc *before_ram_iterate;
QEMURamHookFunc *after_ram_iterate;
QEMURamHookFunc *hook_ram_load;
QEMURamSaveFunc *save_page;
QEMUFileShutdownFunc *shut_down;
} QEMUFileOps;

GIO doesn't provide writev either, so it's not a good match for
non-encrypted migration, which really tries hard to do no copies in
userspace.

> > GIO has TLS bindings (not SASL I think?) in GIO 2.28.  Currently we
> > require glib 2.12 (released 2006) on POSIX systems and glib 2.20
> > (released 2009) on Windows.  That's very conservative indeed, I wouldn't
> > mind changing to a newer version.
>
> Yeah, it has some level of functionality for TLS, but I'm not sure about
> the full extent of it and whether it'd be sufficient for what we need
> in VNC for example.

Okay...  I don't know much about this.

> The main difference between GIO's APIs and GIOChannel is that the new
> GIO stuff is really designed around the idea of asynchronous callbacks
> for completion of IO.
> 
>   eg
> 
>  g_input_stream_read_async(stream, buffer, size, read_done_callback);
> 
>  and then when read_done_callback gets triggered you have to call
> 
>  g_input_stream_read_finish(stream)
> 
> in order to get the success/failure status of the read, and the byte
> count. While it is quite nice for new code IME, this is probably quite
> alot harder to refit into existing QEMU codebase.

It also supports GIOChannel's GSource model via
GPollableInputStream/GPollableOutputStream.  The GNUTLS bindings support
that interface too.

It also supports blocking operation, which is what migration wants.

So I think we could take a look at GIO if its TLS support is advanced
enough for your purposes.

Paolo



Re: [Qemu-devel] [PATCH 1/8] qga: fixed warning in qemu-ga.exe for mingw >= 4.9.1

2015-02-04 Thread Olga Krishtal

On 04/02/15 00:29, Eric Blake wrote:

On 12/31/2014 06:06 AM, Denis V. Lunev wrote:

From: Olga Krishtal 

strtok_r was redefined before the patch

Signed-off-by: Olga Krishtal 
Signed-off-by: Denis V. Lunev 
CC: Michael Roth 
---
  include/sysemu/os-win32.h | 2 ++
  1 file changed, 2 insertions(+)

What's the actual compiler error you get without this patch?
/mingw/include/string.h:88:9: note: previous declaration of 'strtok_r' 
was here
   char *strtok_r(char * __restrict__ _Str, const char * __restrict__ 
_Delim, char ** __restrict__ __last);


/include/sysemu/os-win32.h:83:7: warning: redundant redeclaration of 
'strtok_r' [-Wredundant-decls]

 char *strtok_r(char *str, const char *delim, char **saveptr);



diff --git a/include/sysemu/os-win32.h b/include/sysemu/os-win32.h
index af3fbc4..84e229b 100644
--- a/include/sysemu/os-win32.h
+++ b/include/sysemu/os-win32.h
@@ -81,7 +81,9 @@ struct tm *gmtime_r(const time_t *timep, struct tm *result);
  #undef localtime_r
  struct tm *localtime_r(const time_t *timep, struct tm *result);
  
+#if defined __MINGW32__ && !(__GNUC__ >= 4 && __GNUC_MINOR__ >= 9)

  char *strtok_r(char *str, const char *delim, char **saveptr);

It feels weird to key off of the gcc version.  Whether strtok_r is
defined is a property of the system headers, not the compiler.  For
example, I'm aware a but where older mingw  would pollute
the namespace with a broken #define of strtok_r; but that has been fixed
in newer mingw headers (on Fedora, I know that F20 mingw has the bug,
F21 does not).  But that is the version of mingw, not the version of
gcc, that determined that problem.  Therefore, I'm not sure this is the
right patch, but without knowing the symptoms it fixes, I don't know if
anything is better.

 There is an idea to get rid of the strtok_r() because it is used only 
once, so we can replace it with other functions to get the same 
functionality.




Re: [Qemu-devel] [PATCH] vl.c: fix -usb option assertion failure in qemu_opt_get_bool_helper()

2015-02-04 Thread Marcel Apfelbaum

On 02/04/2015 04:18 PM, Christian Borntraeger wrote:

Am 02.02.2015 um 11:02 schrieb Marcel Apfelbaum:

On 01/31/2015 11:23 AM, Jan Kiszka wrote:

On 2015-01-05 12:37, Jan Kiszka wrote:

On 2015-01-05 12:22, Stefan Hajnoczi wrote:

Commit 49d2e648e8087d154d8bf8b91f27c8e05e79d5a6 ("machine: remove
qemu_machine_opts global list") removed option descriptions from the
-machine QemuOptsList to avoid repeating MachineState's QOM properties.

This change broke vl.c:usb_enabled() because qemu_opt_get_bool() cannot
be used on QemuOptsList without option descriptions since QemuOpts
doesn't know the type and therefore left an unparsed string value.

This patch avoids calling qemu_opt_get_bool() to fix the assertion
failure:

$ qemu-system-x86_64 -usb
qemu_opt_get_bool_helper: Assertion `opt->desc && opt->desc->type == 
QEMU_OPT_BOOL' failed.

Test the presence of -usb using qemu_opt_find() but use the
MachineState->usb field instead of qemu_opt_get_bool().

Cc: Marcel Apfelbaum 
Cc: Tiejun Chen 
Signed-off-by: Stefan Hajnoczi 
---
   vl.c | 7 +--
   1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index bea9656..6e8889c 100644
--- a/vl.c
+++ b/vl.c
@@ -999,8 +999,11 @@ static int parse_name(QemuOpts *opts, void *opaque)

   bool usb_enabled(bool default_usb)
   {
-return qemu_opt_get_bool(qemu_get_machine_opts(), "usb",
- has_defaults && default_usb);
+if (qemu_opt_find(qemu_get_machine_opts(), "usb")) {
+return current_machine->usb;
+} else {
+return has_defaults && default_usb;
+}
   }

   #ifndef _WIN32



That still leaves the other boolean machine options broken. A generic
fix would be good. Or revert the original commit until we have one.


Just pulled current master, and at least the iommu machine option is
still triggering an abort.

What is the status of fixing these fallouts? Was anything else addressed
by now, just this one forgotten?

No, is not forgotten. I waited to see that the approach to fix this issue
is accepted by the reviewers.

Thanks for the reminder, patches will be submitted soon.
Marcel


Anything to look at a branch or something? Adding new machine bool options
is currently pretty hard - as long as it doesnt work ;-)

It will be ready today or tomorrow :)

Thanks,
Marcel


Christian






Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Paolo Bonzini


On 04/02/2015 15:02, Daniel P. Berrange wrote:
> > I'm not sure if it makes sense for RDMA; it already has a couple of hooks
> > that go around the back of QEMUFile in migration, and it's transferring the
> > data via DMA so the page data doesn't go through the same path.
> 
> Could you ever anticipate any need for authentication or encryption in
> the RDMA based channel ? I don't know enough about RDMA myself to know
> if it makes sense or not, other than the fact that any channel between
> two separate hosts needs security at some level in the stack.

Authentication, possibly; but I don't think encryption makes sense.  Marcel?

Paolo



Re: [Qemu-devel] [PATCH 1/2] Add GDB qAttached support

2015-02-04 Thread Peter Maydell
On 4 February 2015 at 13:36, Pedro Alves  wrote:
>
> This looks like it'd mishandle a future qAttached2 packet.
>
> It should be doing something like:
>
>if (strncmp(p, "Attached", 8) == 0 &&
>   (p[8] == '\0' || p[8] == ':')) {
>
> or:
>
>if (strcmp(p, "Attached") == 0 || strncmp(p, "Attached:", 9) == 0) {
>
>
> Likewise other packets, if they have the same issue.
> (I'm not familiar with qemu's stub's internals.)

Looks like we get this wrong for a lot of our existing
query packet handling too... Maybe worth having a utility
function for "is this a foo query packet" rather than
raw strcmp/strncmp?

-- PMM



Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Daniel P. Berrange
On Wed, Feb 04, 2015 at 03:23:22PM +0100, Paolo Bonzini wrote:
> 
> 
> On 04/02/2015 15:08, Daniel P. Berrange wrote:
> >> > As long as QEMUFile remains there and GIOChannel is used only when
> >> > encryption is required, that would be an acceptable limitation.  As I
> >> > wrote above, migration is a bit special anyway.
> > I'm not sure I'd like the idea of having different codepaths for
> > the encrypted vs non-encrypted impl. it seems like a recipe for
> > increased maintainence work and inconsistent behaviour over the
> > long term. My thought was that QEMUFile would basically go
> > away entirely by the end of the conversion, or at most be dealing
> > with the data rate throttling if that didn't fit nicely into the
> > generic IO layer.
> 
> QEMUFile has a bunch of hooks for RDMA (they were also used by the
> never-upstreamed patches to speed up AF_UNIX migration with vmsplice),
> so it cannot go away:
> 
> typedef struct QEMUFileOps {
> QEMUFilePutBufferFunc *put_buffer;
> QEMUFileGetBufferFunc *get_buffer;
> QEMUFileCloseFunc *close;
> QEMUFileGetFD *get_fd;
> QEMUFileWritevBufferFunc *writev_buffer;
> QEMURamHookFunc *before_ram_iterate;
> QEMURamHookFunc *after_ram_iterate;
> QEMURamHookFunc *hook_ram_load;
> QEMURamSaveFunc *save_page;
> QEMUFileShutdownFunc *shut_down;
> } QEMUFileOps;
> 
> GIO doesn't provide writev either, so it's not a good match for
> non-encrypted migration, which really tries hard to do no copies in
> userspace.

Ok, maybe RDMA will still need QEMUFile, but for non-encrypted TCP
I'd hope to be able to achieve zero-copy with the new API too - it
would certainly be my intention/goal.

> > The main difference between GIO's APIs and GIOChannel is that the new
> > GIO stuff is really designed around the idea of asynchronous callbacks
> > for completion of IO.
> > 
> >   eg
> > 
> >  g_input_stream_read_async(stream, buffer, size, read_done_callback);
> > 
> >  and then when read_done_callback gets triggered you have to call
> > 
> >  g_input_stream_read_finish(stream)
> > 
> > in order to get the success/failure status of the read, and the byte
> > count. While it is quite nice for new code IME, this is probably quite
> > alot harder to refit into existing QEMU codebase.
> 
> It also supports GIOChannel's GSource model via
> GPollableInputStream/GPollableOutputStream.  The GNUTLS bindings support
> that interface too.
> 
> It also supports blocking operation, which is what migration wants.
> 
> So I think we could take a look at GIO if its TLS support is advanced
> enough for your purposes.

Ok, I'll investigate GIO a little further to see how practical a fit it
is for us.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Paolo Bonzini


On 04/02/2015 14:08, Dr. David Alan Gilbert wrote:
>   3) I'd considered making a separate socket/fd for passing page data
>  in the hope of maybe making that page take data via splice; but am not
>  sure yet.

There were even patches for this:
http://lists.gnu.org/archive/html/qemu-devel/2013-11/msg04006.html

The submitter disappeared together with Anthony, but the patches were
pretty good and reused some of the RDMA hooks.

Paolo



Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> 
> 
> On 04/02/2015 14:08, Dr. David Alan Gilbert wrote:
> >   3) I'd considered making a separate socket/fd for passing page data
> >  in the hope of maybe making that page take data via splice; but am not
> >  sure yet.
> 
> There were even patches for this:
> http://lists.gnu.org/archive/html/qemu-devel/2013-11/msg04006.html
> 
> The submitter disappeared together with Anthony, but the patches were
> pretty good and reused some of the RDMA hooks.

Yep, I looked at those a while back and intend to have another look
at some time; I think they were primarily intended for same-host migration
to upgrade running QEMU instances.

Dave

> 
> Paolo
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH RFC v2 7/8] migration: add dirty parameter

2015-02-04 Thread Vladimir Sementsov-Ogievskiy

On 27.01.2015 19:20, Eric Blake wrote:

On 01/27/2015 03:56 AM, Vladimir Sementsov-Ogievskiy wrote:

Add dirty parameter to qmp-migrate command. If this parameter is true,
migration/block.c will migrate dirty bitmaps. This parameter can be used
without "blk" parameter to migrate only dirty bitmaps, skipping block
migration.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
+++ b/qapi-schema.json
@@ -1656,12 +1656,17 @@
  # @detach: this argument exists only for compatibility reasons and
  #  is ignored by QEMU
  #
+# @dirty: #optional do dirty-bitmaps migration (can be used with or without
+# @blk parameter)
+# (since 2.3)

Rather than adding it to 'migrate', where the command is not
introspectible, I'd rather you add it to 'migrate-set-capabilities',
where I can then use 'query-migrate-capabilities' to learn if it is
supported.

Thank you. Moreover, it is turned out to be a simpler way than with 
migration parameter. I've done the change, it will be in v3. I'm just 
waiting for review of the core part of the series - migration/dirty-bitmap.c


--
Best regards,
Vladimir




Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Daniel P. Berrange
On Wed, Feb 04, 2015 at 04:48:44PM +0200, Marcel Apfelbaum wrote:
> On 02/04/2015 04:28 PM, Paolo Bonzini wrote:
> >
> >
> >On 04/02/2015 15:02, Daniel P. Berrange wrote:
> >>>I'm not sure if it makes sense for RDMA; it already has a couple of hooks
> >>>that go around the back of QEMUFile in migration, and it's transferring the
> >>>data via DMA so the page data doesn't go through the same path.
> >>
> >>Could you ever anticipate any need for authentication or encryption in
> >>the RDMA based channel ? I don't know enough about RDMA myself to know
> >>if it makes sense or not, other than the fact that any channel between
> >>two separate hosts needs security at some level in the stack.
> >
> >Authentication, possibly; but I don't think encryption makes sense.  Marcel?
> I personally think that the protocol is safe enough, but as always there are 
> holes
> and I am not a security expert:
> 
> "RDMA mechanisms can create a potential security vulnerability. A node 
> may access another nodes
>  memory region that was supposed to be banned.
>  In order to protect remote memory access to unauthorized memory areas, 
> InfiniBand defines memory
>  protection mechanisms, where a remote memory access requires a special 
> key (R_Key). The R_Key is
>  negotiated between the peers and is validated at the target’s system HCA 
> card. In case of illegal key the
>  packet is dropped. The R_Key requirement is built into silicon and 
> driver code and cannot be disabled
>  even when attacker compromises root/admin/superuser account on one or 
> multiple servers."
> 
> More on Layer 2 attacks and how Infiniband handle those:
> 
> http://www.mellanox.com/related-docs/whitepapers/WP_Secuirty_In_InfiniBand_Fabrics_Final.pdf
>

Ok I guess we could probably just assume RDMA is sufficient as is for now.
We can fairly easily revisit it later if we find it it needs more work

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Marcel Apfelbaum

On 02/04/2015 04:28 PM, Paolo Bonzini wrote:



On 04/02/2015 15:02, Daniel P. Berrange wrote:

I'm not sure if it makes sense for RDMA; it already has a couple of hooks
that go around the back of QEMUFile in migration, and it's transferring the
data via DMA so the page data doesn't go through the same path.


Could you ever anticipate any need for authentication or encryption in
the RDMA based channel ? I don't know enough about RDMA myself to know
if it makes sense or not, other than the fact that any channel between
two separate hosts needs security at some level in the stack.


Authentication, possibly; but I don't think encryption makes sense.  Marcel?

I personally think that the protocol is safe enough, but as always there are 
holes
and I am not a security expert:

"RDMA mechanisms can create a potential security vulnerability. A node may 
access another nodes
 memory region that was supposed to be banned.
 In order to protect remote memory access to unauthorized memory areas, 
InfiniBand defines memory
 protection mechanisms, where a remote memory access requires a special key 
(R_Key). The R_Key is
 negotiated between the peers and is validated at the target’s system HCA 
card. In case of illegal key the
 packet is dropped. The R_Key requirement is built into silicon and driver 
code and cannot be disabled
 even when attacker compromises root/admin/superuser account on one or multiple 
servers."

More on Layer 2 attacks and how Infiniband handle those:

http://www.mellanox.com/related-docs/whitepapers/WP_Secuirty_In_InfiniBand_Fabrics_Final.pdf

I hope I helped,
Marcel



Paolo






Re: [Qemu-devel] RFC: Universal encryption on QEMU I/O channels

2015-02-04 Thread Paolo Bonzini


On 04/02/2015 15:34, Daniel P. Berrange wrote:
> > GIO doesn't provide writev either, so it's not a good match for
> > non-encrypted migration, which really tries hard to do no copies in
> > userspace.
> 
> Ok, maybe RDMA will still need QEMUFile, but for non-encrypted TCP
> I'd hope to be able to achieve zero-copy with the new API too - it
> would certainly be my intention/goal.

For GIO/GIOChannel, you'd have to choose between zerocopy and many
syscalls, or one copy and few syscalls.  Since every page has two iov
entries, one of which is just 8 bytes, one copy and few syscalls is
probably faster---but still only half the speed of writev.

Paolo



Re: [Qemu-devel] [PATCH v5 07/26] qcow2: Respect error in qcow2_alloc_bytes()

2015-02-04 Thread Max Reitz

On 2015-02-04 at 06:40, Kevin Wolf wrote:

Am 15.12.2014 um 13:50 hat Max Reitz geschrieben:

qcow2_update_cluster_refcount() may fail, and qcow2_alloc_bytes() should
mind that case.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
---
  block/qcow2-refcount.c | 33 +
  1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 0308a7e..db81647 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -778,8 +778,8 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t 
offset,
  int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
  {
  BDRVQcowState *s = bs->opaque;
-int64_t offset, cluster_offset;
-int free_in_cluster;
+int64_t offset, cluster_offset, new_cluster;
+int free_in_cluster, ret;
  
  BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC_BYTES);

  assert(size > 0 && size <= s->cluster_size);
@@ -800,23 +800,32 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
  free_in_cluster -= size;
  if (free_in_cluster == 0)
  s->free_byte_offset = 0;
-if (offset_into_cluster(s, offset) != 0)
-qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
-  false, QCOW2_DISCARD_NEVER);
+if (offset_into_cluster(s, offset) != 0) {
+ret = qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits,
+1, false, QCOW2_DISCARD_NEVER);
+if (ret < 0) {
+return ret;

Not sure how relevant it is, but s->free_byte_offset has already been
increased, so we're leaving sub-cluster space unused. (It's not really
leaking as freeing all references still frees the cluster.)


Right, will fix.


+}
+}
  } else {
-offset = qcow2_alloc_clusters(bs, s->cluster_size);
-if (offset < 0) {
-return offset;
+new_cluster = qcow2_alloc_clusters(bs, s->cluster_size);
+if (new_cluster < 0) {
+return new_cluster;
  }

offset is the return value of this function, and now there are cases
where it isn't set to new_cluster any more (I wonder why gcc doesn't
warn).


Because @offset is always set. In case the next condition is true, it is 
set to s->free_byte_offset, just like it was before. In case it isn't, 
s->free_byte_offset will be set to @new_cluster and the loop will be 
started again (probably always resulting in size <= free_in_cluster 
being true and thus @offset being set to s->free_byte_offset).



Why can't we keep offset where it was used and only assign new_cluster
additionally so we can do the cleanup?


Why should we? If I were a reader of this code, I think it would confuse 
me to have two variables holding the same value but for some reason only 
using @offset, but once it's been overwritten, suddenly using 
@new_cluster (whereas one could just use @new_cluster everywhere, and 
then writing @offset would be superfluous). Also, the idea was having 
@offset hold the offset of where the compressed data is to be stored, 
whereas @new_cluster is just the offset of a new cluster, but not 
necessarily the offset where compressed data will be stored (it won't be 
if (cluster_offset + s->cluster_size) == new_cluster).


Max


  cluster_offset = start_of_cluster(s, s->free_byte_offset);
-if ((cluster_offset + s->cluster_size) == offset) {
+if ((cluster_offset + s->cluster_size) == new_cluster) {
  /* we are lucky: contiguous data */
  offset = s->free_byte_offset;
-qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
-  false, QCOW2_DISCARD_NEVER);
+ret = qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits,
+1, false, QCOW2_DISCARD_NEVER);
+if (ret < 0) {
+qcow2_free_clusters(bs, new_cluster, s->cluster_size,
+QCOW2_DISCARD_NEVER);
+return ret;
+}
  s->free_byte_offset += size;
  } else {
-s->free_byte_offset = offset;
+s->free_byte_offset = new_cluster;
  goto redo;
  }
  }

Kevin





Re: [Qemu-devel] [PATCH V11 2/3] i386: Add a Virtual Machine Generation ID device

2015-02-04 Thread Gal Hammer
Hi Igor,

- Original Message -
> From: "Igor Mammedov" 
> To: "Gal Hammer" 
> Cc: qemu-devel@nongnu.org, m...@redhat.com
> Sent: Monday, February 2, 2015 3:55:02 PM
> Subject: Re: [Qemu-devel] [PATCH V11 2/3] i386: Add a Virtual Machine 
> Generation ID device
> 
> On Mon, 02 Feb 2015 15:13:39 +0200
> Gal Hammer  wrote:
> 
> > On 02/02/2015 14:46, Igor Mammedov wrote:
> > > On Sun, 01 Feb 2015 14:56:26 +0200
> > > Gal Hammer  wrote:
> > >
> > >> On 22/01/2015 15:52, Igor Mammedov wrote:
> > >>> On Tue, 16 Dec 2014 17:50:43 +0200
> > >>> Gal Hammer  wrote:
> > >>>
> >  Based on Microsoft's sepecifications (paper can be dowloaded from
> >  http://go.microsoft.com/fwlink/?LinkId=260709), add a device
> >  description to the SSDT ACPI table and its implementation.
> > 
> >  The GUID is set using a global "vmgenid.uuid" parameter.
> > 
> >  Signed-off-by: Gal Hammer 
> > 
> > >>>
> >  --- a/hw/i386/acpi-build.c
> >  +++ b/hw/i386/acpi-build.c
> >  @@ -257,6 +257,7 @@ static void acpi_get_pci_info(PcPciInfo *info)
> > #define ACPI_BUILD_TABLE_FILE "etc/acpi/tables"
> > #define ACPI_BUILD_RSDP_FILE "etc/acpi/rsdp"
> > #define ACPI_BUILD_TPMLOG_FILE "etc/tpm/log"
> >  +#define ACPI_BUILD_VMGENID_FILE "etc/vm-generation-id"
> > 
> > static void
> > build_header(GArray *linker, GArray *table_data,
> >  @@ -1068,6 +1069,8 @@ build_ssdt(GArray *table_data, GArray *linker,
> > {
> > MachineState *machine = MACHINE(qdev_get_machine());
> > uint32_t nr_mem = machine->ram_slots;
> >  +uint32_t vm_gid_physical_address;
> >  +uint32_t vm_gid_offset = 0;
> > unsigned acpi_cpus = guest_info->apic_id_limit;
> > int ssdt_start = table_data->len;
> > uint8_t *ssdt_ptr;
> >  @@ -1096,6 +1099,21 @@ build_ssdt(GArray *table_data, GArray *linker,
> > ACPI_BUILD_SET_LE(ssdt_ptr, sizeof(ssdp_misc_aml),
> >   ssdt_isa_pest[0], 16, misc->pvpanic_port);
> > 
> >  +if (vm_generation_id_set()) {
> >  +vm_gid_physical_address = ssdt_start +
> >  ssdt_acpi_vm_gid_addr[0];
> >  +bios_linker_loader_alloc(linker, ACPI_BUILD_VMGENID_FILE, 8,
> >  true);
> >  +bios_linker_loader_add_pointer(linker,
> >  ACPI_BUILD_VMGENID_FILE,
> >  +   ACPI_BUILD_TABLE_FILE,
> >  +   table_data,
> >  +   &vm_gid_offset,
> >  +   sizeof(vm_gid_offset));
> > >>> could some explain how this pointer magic works,
> > >>
> > >> I can try, but don't you think that a magic is gone once explained? ;-)
> > >>
> > >>>   From my weak understanding it seems broken.
> > >>> Lets see:
> > >>>
> > >>>[1] &vm_gid_offset - must be pointer inside of dest_file blob
> > >>>(ACPI_BUILD_VMGENID_FILE)
> > >>>[2] vm_gid_offset - should hold offset of the place inside of
> > >>>src_file
> > >>>   (ACPI_BUILD_TABLE_FILE) where to pointer inside
> > >>>   of dest_file should point to
> > >>
> > >> The vm_gid_offset should point where in the ACPI_BUILD_VMGENID_FILE the
> > >> VM's GUID is stored. At the moment, it should always be zero because the
> > >> GUID is stored at the begging of the ACPI_BUILD_VMGENID_FILE.
> > >>
> > >>>
> > >>> now:
> > >>> vm_gid_physical_address - holds [2] i.e. offset of VGIA constant in
> > >>> inside SSDT in ACPI_BUILD_TABLE_FILE.
> > >>>
> >  +ACPI_BUILD_SET_LE(ssdt_ptr, sizeof(ssdp_misc_aml),
> >  +  ssdt_acpi_vm_gid_addr[0], 32,
> >  vm_gid_physical_address);
> > >>> Then we write this offset into VGIA in ACPI_BUILD_TABLE_FILE.
> > >>
> > >> Yes. This offset is later patched by the linker to the full physical
> > >> address.
> > >>
> > >>> After BIOS loads tables it's going to patch at
> > >>>[3] ACPI_BUILD_VMGENID_FILE + (&vm_gid_offset - table_data->data) /*
> > >>>only god knows where it will be/
> > >>>
> > >>> and on top of it write in it value:
> > >>>*(ACPI_BUILD_TABLE_FILE +  *[3])
> > >>
> > >> We know exactly where it is, no need to call for god's help :-).
> > >>
> > >>> This approach in general of patching arbitrary place in AML blob
> > >>> to get PHY addr of buffer with UUID, is quite a hack, especially
> > >>> in light of that we are trying to hide all direct access to AML
> > >>> blobs with related pointer arithmetic and manual patching.
> > >>>
> > >>> Why not reserve some potion of RAM and pass to BIOS/guest
> > >>> a reservation so it won't be part of AddressRangeMemory or
> > >>> AddressRangeACPI as MS spec requires? Then you won't need
> > >>> jump all above hoops to just get buffer's PHY addr.
> > >>
> > >> I'll be glad to hear a new idea that I didn't already try in one of
> > >> 

Re: [Qemu-devel] [PATCH v5 07/26] qcow2: Respect error in qcow2_alloc_bytes()

2015-02-04 Thread Kevin Wolf
Am 04.02.2015 um 16:04 hat Max Reitz geschrieben:
> On 2015-02-04 at 06:40, Kevin Wolf wrote:
> >Am 15.12.2014 um 13:50 hat Max Reitz geschrieben:
> >>qcow2_update_cluster_refcount() may fail, and qcow2_alloc_bytes() should
> >>mind that case.
> >>
> >>Signed-off-by: Max Reitz 
> >>Reviewed-by: Eric Blake 
> >>Reviewed-by: Stefan Hajnoczi 
> >>---
> >>  block/qcow2-refcount.c | 33 +
> >>  1 file changed, 21 insertions(+), 12 deletions(-)
> >>
> >>diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> >>index 0308a7e..db81647 100644
> >>--- a/block/qcow2-refcount.c
> >>+++ b/block/qcow2-refcount.c
> >>@@ -778,8 +778,8 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, 
> >>uint64_t offset,
> >>  int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
> >>  {
> >>  BDRVQcowState *s = bs->opaque;
> >>-int64_t offset, cluster_offset;
> >>-int free_in_cluster;
> >>+int64_t offset, cluster_offset, new_cluster;
> >>+int free_in_cluster, ret;
> >>  BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC_BYTES);
> >>  assert(size > 0 && size <= s->cluster_size);
> >>@@ -800,23 +800,32 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int 
> >>size)
> >>  free_in_cluster -= size;
> >>  if (free_in_cluster == 0)
> >>  s->free_byte_offset = 0;
> >>-if (offset_into_cluster(s, offset) != 0)
> >>-qcow2_update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
> >>-  false, QCOW2_DISCARD_NEVER);
> >>+if (offset_into_cluster(s, offset) != 0) {
> >>+ret = qcow2_update_cluster_refcount(bs, offset >> 
> >>s->cluster_bits,
> >>+1, false, 
> >>QCOW2_DISCARD_NEVER);
> >>+if (ret < 0) {
> >>+return ret;
> >Not sure how relevant it is, but s->free_byte_offset has already been
> >increased, so we're leaving sub-cluster space unused. (It's not really
> >leaking as freeing all references still frees the cluster.)
> 
> Right, will fix.
> 
> >>+}
> >>+}
> >>  } else {
> >>-offset = qcow2_alloc_clusters(bs, s->cluster_size);
> >>-if (offset < 0) {
> >>-return offset;
> >>+new_cluster = qcow2_alloc_clusters(bs, s->cluster_size);
> >>+if (new_cluster < 0) {
> >>+return new_cluster;
> >>  }
> >offset is the return value of this function, and now there are cases
> >where it isn't set to new_cluster any more (I wonder why gcc doesn't
> >warn).
> 
> Because @offset is always set. In case the next condition is true,
> it is set to s->free_byte_offset, just like it was before. In case
> it isn't, s->free_byte_offset will be set to @new_cluster and the
> loop will be started again (probably always resulting in size <=
> free_in_cluster being true and thus @offset being set to
> s->free_byte_offset).

Yes, I misread. Hooray for goto. One more reason for a cleanup of the
whole function.

Kevin



Re: [Qemu-devel] [PATCH 0/8] block: persistent dirty bitmaps (RFC)

2015-02-04 Thread Vladimir Sementsov-Ogievskiy

On 13.01.2015 20:02, Vladimir Sementsov-Ogievskiy wrote:

The bitmaps are saved into qcow2 file format. It provides both
'internal' and 'external' dirty bitmaps feature:
  - for qcow2 drives we can store bitmaps in the same file
  - for other formats we can store bitmaps in the separate qcow2 file

QCow2 header is extended by fields 'nb_dirty_bitmaps' and
'dirty_bitmaps_offset' like with snapshots.

Proposed command line syntax is the following:

-dirty-bitmap [option1=val1][,option2=val2]...
 Available options are:
 name The name for the bitmap (necessary).

 file The file to load the bitmap from.

 file_id  When specified with 'file' option, then this file will
  be available through this id for other -dirty-bitmap
  options when specified without 'file' option, then it
  is a reference to 'file', specified with another
  -dirty-bitmap option, and it will be used to load the
  bitmap from.

 driveThe drive to bind the bitmap to. It should be specified
  as 'id' suboption of one of -drive options. If nor
  'file' neither 'file_id' are specified, then the bitmap
  will be loaded from that drive (internal dirty bitmap).

 granularity  The granularity for the bitmap. Not necessary, the
  default value may be used.

 enabled  on|off. Default is 'on'. Disabled bitmaps are not
  changing regardless of writes to corresponding drive.

Examples:

qemu -drive file=a.qcow2,id=disk -dirty-bitmap name=b,drive=disk
qemu -drive file=a.raw,id=disk \
  -dirty-bitmap name=b,drive=disk,file=b.qcow2,enabled=off

Vladimir Sementsov-Ogievskiy (8):
   spec: add qcow2-dirty-bitmaps specification
   hbitmap: store / restore
   qcow2: add dirty-bitmaps feature
   block: store persistent dirty bitmaps
   block: add bdrv_load_dirty_bitmap
   qemu: command line option for dirty bitmaps
   qmp: print dirty bitmap
   iotests: test internal persistent dirty bitmap

  block.c| 113 ++
  block/Makefile.objs|   2 +-
  block/qcow2-dirty-bitmap.c | 514 +
  block/qcow2.c  |  26 +++
  block/qcow2.h  |  48 +
  blockdev.c |  51 +
  docs/specs/qcow2.txt   |  59 ++
  hmp-commands.hx|  15 ++
  hmp.c  |   8 +
  hmp.h  |   1 +
  include/block/block.h  |   9 +
  include/block/block_int.h  |  10 +
  include/qemu/hbitmap.h |  49 +
  include/sysemu/blockdev.h  |   1 +
  include/sysemu/sysemu.h|   1 +
  qapi-schema.json   |   3 +-
  qapi/block-core.json   |   3 +
  qemu-options.hx|  37 
  qmp-commands.hx|   5 +
  tests/qemu-iotests/115 |  96 +
  tests/qemu-iotests/115.out |  64 ++
  tests/qemu-iotests/group   |   1 +
  util/hbitmap.c |  87 
  vl.c   | 100 +
  24 files changed, 1301 insertions(+), 2 deletions(-)
  create mode 100644 block/qcow2-dirty-bitmap.c
  create mode 100755 tests/qemu-iotests/115
  create mode 100644 tests/qemu-iotests/115.out



Ping. I've already done (locally):
1) using qcow2 header extension instead of changing the header
2) normal qmp query request instead of "print dirty bitmap"
 - thanks to Eric and Markus

Now I'm waiting for some comments on the concept and it's realization to 
roll v3.


--
Best regards,
Vladimir




Re: [Qemu-devel] [RFC] pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations

2015-02-04 Thread Alexander Graf


On 04.02.15 02:32, David Gibson wrote:
> On Wed, Feb 04, 2015 at 08:19:06AM +1100, Paul Mackerras wrote:
>> On Tue, Feb 03, 2015 at 05:10:51PM +1100, David Gibson wrote:
>>> qemu currently implements the hypercalls H_LOGICAL_CI_LOAD and
>>> H_LOGICAL_CI_STORE as PAPR extensions.  These are used by the SLOF firmware
>>> for IO, because performing cache inhibited MMIO accesses with the MMU off
>>> (real mode) is very awkward on POWER.
>>>
>>> This approach breaks when SLOF needs to access IO devices implemented
>>> within KVM instead of in qemu.  The simplest example would be virtio-blk
>>> using an iothread, because the iothread / dataplane mechanism relies on
>>> an in-kernel implementation of the virtio queue notification MMIO.
>>>
>>> To fix this, an in-kernel implementation of these hypercalls has been made,
>>> however, the hypercalls still need to be enabled from qemu.  This performs
>>> the necessary calls to do so.
>>>
>>> Signed-off-by: David Gibson 
>>
>> [snip]
>>
>>> +ret1 = kvmppc_enable_hcall(kvm_state, H_LOGICAL_CI_LOAD);
>>> +if (ret1 != 0) {
>>> +fprintf(stderr, "Warning: error enabling H_LOGICAL_CI_LOAD in KVM:"
>>> +" %s\n", strerror(errno));
>>> +}
>>> +
>>> +ret2 = kvmppc_enable_hcall(kvm_state, H_LOGICAL_CI_STORE);
>>> +if (ret2 != 0) {
>>> +fprintf(stderr, "Warning: error enabling H_LOGICAL_CI_STORE in 
>>> KVM:"
>>> +" %s\n", strerror(errno));
>>> + }
>>> +
>>> +if ((ret1 != 0) || (ret2 != 0)) {
>>> +fprintf(stderr, "Warning: Couldn't enable H_LOGICAL_CI_* in KVM, 
>>> SLOF"
>>> +" may be unable to operate devices with in-kernel 
>>> emulation\n");
>>> +}
>>
>> You'll always get these warnings if you're running on an old (meaning
>> current upstream) kernel, which could be annoying.
> 
> True.
> 
>> Is there any way
>> to tell whether you have configured any devices which need the
>> in-kernel MMIO emulation and only warn if you have?
> 
> In theory, I guess so.  In practice I can't see how you'd enumerate
> all devices that might require kernel intervention without something
> horribly invasive.

We could WARN_ONCE in QEMU if we emulate such a hypercall, but its
handler is io_mem_unassigned (or we add another minimum priority huge
memory region on all 64bits of address space that reports the breakage).


Alex



  1   2   3   >