date:20130307

Re: [Qemu-devel] [PATCH v6 08/24] hw/arm: add Faraday FTRTC011 RTC timer support

2013-03-07 Thread Wei-Ren Chen

On Thu, Mar 07, 2013 at 02:41:26AM -0500, Paolo Bonzini wrote:
> 
> >   From Makefile.target, qtest is built when you build softmmu target.
> > I guess there is no need to add "--enable-qtest" option. However, I
> > don't know how to run qtest...
> 
> make check-qtest-arm

  O.K., let me try to summarize the working flow..

  $ ../configure --target-list=i386-softmmu
  $ make; make install
  $ make check-qtest-i386
  $ cd tests
  $ export QTEST_QEMU_BINARY=/path/to/qemu-system-i386
  $ ./rtc-test
  /i386/rtc/check-time/bcd: OK
  /i386/rtc/check-time/dec: OK
  /i386/rtc/alarm/interrupt: OK
  
  ...

  Developers should add their test into ${QEMU_SRC}/tests . Am I right?

Regards,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj

Re: [Qemu-devel] [PATCH 2/2] qga: add windows implementation for guest-set-time

2013-03-07 Thread Lei Li


On 03/06/2013 11:05 PM, Eric Blake wrote:

On 03/06/2013 06:45 AM, Lei Li wrote:

Signed-off-by: Lei Li 
---
  qga/commands-win32.c |   35 +++
  1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/qga/commands-win32.c b/qga/commands-win32.c
index 4febec7..1a90aa7 100644
--- a/qga/commands-win32.c
+++ b/qga/commands-win32.c
@@ -136,6 +136,41 @@ int64_t qmp_guest_get_time(Error **errp)
 return time_ns;
  }
  
+void qmp_guest_set_time(int64_t time_ns, Error **errp)

+{
+SYSTEMTIME ts;
+FILETIME tf;
+LONGLONG time;
+
+/* year-2038 will overflow in case time_t is 32bit */
+if (time_ns / 10 != (time_t)(time_ns / 10)) {
+error_setg(errp, "Time %" PRId64 " is too large", time_ns);
+return;
+}

Do you really need this?  That is, don't you already know what size
time_t is on windows; not to mention that on Windows, you aren't going
through time_t, but directly to tf.dwHighDateTime.


You are right.




+
+acquire_privilege(SE_SYSTEMTIME_NAME, errp);
+if (error_is_set(errp)) {
+error_setg(errp, "Failed to acquire privilege");
+return;
+}
+
+time = time_ns / 100 + _W32_FT_OFFSET;

On the other hand, _this_ operation can overflow, so you should be
checking that time_ns doesn't result in an unexpected time value.


OK.


+
+tf.dwLowDateTime = (DWORD) time;
+tf.dwHighDateTime = (DWORD) (time >> 32);
+
+if (!FileTimeToSystemTime(&tf, &ts)) {
+error_setg(errp, "Failed to convert system time");
+return;
+}
+
+if (!SetSystemTime(&ts)) {
+slog("guest-set-time failed: %d", GetLastError());
+error_setg_errno(errp, errno, "Failed to set time to guest");
+return;
+}

Trailing whitespace.  Run your submission through scripts/checkpatch.pl.


Yes.


Should you relinquish the SE_SYSTEMTIME_NAME privilege when exiting this
function, instead of leaving it always enabled for the remaining life of
the agent service?


According to the remarks on msdn, this SetSystemTime function will
disables the SE_SYSTEMTIME_NAME privilege before returning.



--
Lei

Re: [Qemu-devel] [PATCH v6 08/24] hw/arm: add Faraday FTRTC011 RTC timer support

2013-03-07 Thread Kuo-Jung Su

2013/3/7 Paolo Bonzini :
>
>>   From Makefile.target, qtest is built when you build softmmu target.
>> I guess there is no need to add "--enable-qtest" option. However, I
>> don't know how to run qtest...
>
> make check-qtest-arm
>
> Paolo

Got it, thanks.


-- 
Best wishes,
Kuo-Jung Su

Re: [Qemu-devel] [PATCH] RTC: enable lost_tick_policy=slew as default (v2)

2013-03-07 Thread Paolo Bonzini

Il 12/12/2012 22:36, Marcelo Tosatti ha scritto:
> 
> RTC interrupt reinjection has no known negative effect. Lack of
> RTC interrupt reinjection, though, has negative effects: time drift
> for Windows guests which use it as a timer source.
> 
> Based on that, enable lost_tick_policy=slew option as default.
> 
> Signed-off-by: Marcelo Tosatti 
> 
> v2: do not change default for older machines types (Paolo Bonzini)
> 
> diff --git a/hw/mc146818rtc.c b/hw/mc146818rtc.c
> index c79fca7..c9e007d 100644
> --- a/hw/mc146818rtc.c
> +++ b/hw/mc146818rtc.c
> @@ -884,7 +884,7 @@ ISADevice *rtc_init(ISABus *bus, int base_year, qemu_irq 
> intercept_irq)
>  static Property mc146818rtc_properties[] = {
>  DEFINE_PROP_INT32("base_year", RTCState, base_year, 1980),
>  DEFINE_PROP_LOSTTICKPOLICY("lost_tick_policy", RTCState,
> -   lost_tick_policy, LOST_TICK_DISCARD),
> +   lost_tick_policy, LOST_TICK_SLEW),
>  DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> index 19e342a..475bb4c 100644
> --- a/hw/pc_piix.c
> +++ b/hw/pc_piix.c
> @@ -295,6 +295,10 @@ static QEMUMachine pc_machine_v1_4 = {
>  .driver   = "usb-tablet",\
>  .property = "usb_version",\
>  .value= stringify(1),\
> +},{\
> +.driver   = "mc146818rtc",\
> +.property = "lost_tick_policy",\
> +.value= "discard",\
>  }
>  
>  static QEMUMachine pc_machine_v1_3 = {
> 
> 

Looks like this was never applied.  Can you redo it for the new 1.5
machine (compatibility defines are now in hw/pc.h)?

Paolo

[Qemu-devel] [PATCH V7 0/5] Send the gratuitous by guest

2013-03-07 Thread Jason Wang

This series tries to let guest instead of qemu to send the gratuitous packets
after migration when guest is capable of doing this. This is needed since it's
impossible for qemu to keep track of all configurations (e.g 802.1Q) and mac
addresses (more than one mac address may be used by guest). So qemu can't build
gratuitous packets for all those configurations properly. The only solution is
let guest driver who knew all needed information to do this.

The series first introduces a new runstate which just tracks the state when the
migration is finished and guest is about to start. And then we can just trying
to notify the guest to send the GARP after changing from this state to
running. A model specific announcing method were also also introduced to let
each kinds of nic do its own notification. When there's no such method register
for the nic, the old style of sending RARP were kept. And the last two patches
implemented the virtio-net method of notification.

Changes from V6:
- introduce a new runstate instead of using a global variable check the state

Changes from V5:
- use a global variable to decide whether an announcement is needed after 
migration
- align with virtio spec and let guest ack the announcement notification through
  control vq instead of config status writing

Changes from V4:
- keep the old behavior that send the gratuitous packets only after migration
- decide whether to send gratuitous packets by previous runstate instead of a 
dedicated parameter
- check virtio_net_started() instead of VIRTIO_NET_S_LINK_UP before issue the 
config update interrupt
- move VIRTIO_NET_S_ANNOUNCE to 0x100 and supress guest config write to RO bits
- cleanups suggested by Michael

Tested with migration within 802.1Q.

Jason Wang (5):
  runstate: introduce prelaunch-migrate state
  net: announce self after vm is started
  net: model specific announcing support
  virtio-net: notify guest to annouce itself
  virtio-net: compat guest announce

 hw/pc.h   |6 +-
 hw/virtio-net.c   |   30 ++
 hw/virtio-net.h   |   15 ++-
 include/net/net.h |2 ++
 migration.c   |4 +---
 qapi-schema.json  |5 -
 savevm.c  |8 ++--
 vl.c  |8 +++-
 8 files changed, 69 insertions(+), 9 deletions(-)

[Qemu-devel] [PATCH V7 1/5] runstate: introduce prelaunch-migrate state

2013-03-07 Thread Jason Wang

Sometimes, we need track the state when guest is just about to start after
migration. There's not a accurate state available which do this accurately
(consider qemu may started with -S in destination).

So this patch introduces a new state prelaunch-migrate which just tracks this
state, it covers the case both w/ and w/o -S in destination. The first user of
this is the support of doing announce by guest.

Signed-off-by: Jason Wang 
---
 migration.c  |3 +--
 qapi-schema.json |5 -
 vl.c |4 +++-
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/migration.c b/migration.c
index 11725ae..ecdf2c5 100644
--- a/migration.c
+++ b/migration.c
@@ -107,10 +107,9 @@ static void process_incoming_migration_co(void *opaque)
 /* Make sure all file formats flush their mutable metadata */
 bdrv_invalidate_cache_all();
 
+runstate_set(RUN_STATE_PRELAUNCH_MIGRATE);
 if (autostart) {
 vm_start();
-} else {
-runstate_set(RUN_STATE_PAUSED);
 }
 }
 
diff --git a/qapi-schema.json b/qapi-schema.json
index 28b070f..baa6361 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -174,11 +174,14 @@
 # @suspended: guest is suspended (ACPI S3)
 #
 # @watchdog: the watchdog action is configured to pause and has been triggered
+#
+# @migrate-prelaunch: migration is completed and QEMU were started with -S
 ##
 { 'enum': 'RunState',
   'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
 'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
-'running', 'save-vm', 'shutdown', 'suspended', 'watchdog' ] }
+'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
+'prelaunch-migrate'] }
 
 ##
 # @SnapshotInfo
diff --git a/vl.c b/vl.c
index c03edf1..5dd2e0e 100644
--- a/vl.c
+++ b/vl.c
@@ -534,7 +534,7 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 { RUN_STATE_DEBUG, RUN_STATE_RUNNING },
 
 { RUN_STATE_INMIGRATE, RUN_STATE_RUNNING },
-{ RUN_STATE_INMIGRATE, RUN_STATE_PAUSED },
+{ RUN_STATE_INMIGRATE, RUN_STATE_PRELAUNCH_MIGRATE },
 
 { RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
 { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
@@ -580,6 +580,8 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
 { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
 
+{ RUN_STATE_PRELAUNCH_MIGRATE, RUN_STATE_RUNNING },
+
 { RUN_STATE_MAX, RUN_STATE_MAX },
 };
 
-- 
1.7.1

[Qemu-devel] [PATCH V7 2/5] net: announce self after vm is started

2013-03-07 Thread Jason Wang

Since we may want to send garp by guest if guest driver is capable of it. We
need trigger this event after vm is started. So this patch do this when the
state is changing from RUN_STATE_PRELAUNCH_MIGRATE to RUN_STATE_RUNINNG.

Signed-off-by: Jason Wang 
---
 migration.c |1 -
 vl.c|4 
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/migration.c b/migration.c
index ecdf2c5..3599d2d 100644
--- a/migration.c
+++ b/migration.c
@@ -100,7 +100,6 @@ static void process_incoming_migration_co(void *opaque)
 fprintf(stderr, "load of migration failed\n");
 exit(0);
 }
-qemu_announce_self();
 DPRINTF("successfully loaded vm state\n");
 
 bdrv_clear_incoming_migration_all();
diff --git a/vl.c b/vl.c
index 5dd2e0e..ccf6ea7 100644
--- a/vl.c
+++ b/vl.c
@@ -1681,11 +1681,15 @@ void vm_state_notify(int running, RunState state)
 void vm_start(void)
 {
 if (!runstate_is_running()) {
+RunState prev_run_state = current_run_state;
 cpu_enable_ticks();
 runstate_set(RUN_STATE_RUNNING);
 vm_state_notify(1, RUN_STATE_RUNNING);
 resume_all_vcpus();
 monitor_protocol_event(QEVENT_RESUME, NULL);
+if (prev_run_state == RUN_STATE_PRELAUNCH_MIGRATE) {
+qemu_announce_self();
+}
 }
 }
 
-- 
1.7.1

[Qemu-devel] [PATCH V7 4/5] virtio-net: notify guest to annouce itself

2013-03-07 Thread Jason Wang

It's hard to track all mac addresses and their configurations (e.g vlan) in
qemu. Without those information, it's impossible to build proper garp packet
after migration. The only possible solution to this is let guest ( who knew all
configurations) to do this.

So, this patch introduces a new rw config status bit of virtio-net,
VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce presence of its
link through config update interrupt. When gust have done the announcement, it
should ack the notification through VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This
feature is negotiated by a new feature bit VIRTIO_NET_F_ANNOUNCE.

Signed-off-by: Jason Wang 
---
 hw/virtio-net.c |   30 ++
 hw/virtio-net.h |   15 ++-
 2 files changed, 44 insertions(+), 1 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 009f09e..278c42c 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -610,6 +610,18 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
 
 return VIRTIO_NET_OK;
 }
+
+static int virtio_net_handle_announce(VirtIONet *n, uint8_t cmd,
+  struct iovec *iov, unsigned int iov_cnt)
+{
+if (cmd == VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
+n->status &= ~VIRTIO_NET_S_ANNOUNCE;
+return VIRTIO_NET_OK;
+} else {
+return VIRTIO_NET_ERR;
+}
+}
+
 static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 {
 VirtIONet *n = to_virtio_net(vdev);
@@ -639,6 +651,8 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
VirtQueue *vq)
 status = virtio_net_handle_mac(n, ctrl.cmd, iov, iov_cnt);
 } else if (ctrl.class == VIRTIO_NET_CTRL_VLAN) {
 status = virtio_net_handle_vlan_table(n, ctrl.cmd, iov, iov_cnt);
+} else if (ctrl.class == VIRTIO_NET_CTRL_ANNOUNCE) {
+status = virtio_net_handle_announce(n, ctrl.cmd, iov, iov_cnt);
 } else if (ctrl.class == VIRTIO_NET_CTRL_MQ) {
 status = virtio_net_handle_mq(n, ctrl.cmd, iov, iov_cnt);
 }
@@ -1275,6 +1289,21 @@ static void virtio_net_cleanup(NetClientState *nc)
 n->nic = NULL;
 }
 
+static int virtio_net_announce(NetClientState *nc)
+{
+VirtIONet *n = qemu_get_nic_opaque(nc);
+
+if (n->vdev.guest_features & (0x1 << VIRTIO_NET_F_GUEST_ANNOUNCE)
+&& n->vdev.guest_features & (0x1 << VIRTIO_NET_F_CTRL_VQ)
+&& virtio_net_started(n, n->vdev.status)) {
+n->status |= VIRTIO_NET_S_ANNOUNCE;
+virtio_notify_config(&n->vdev);
+return 0;
+}
+
+return -1;
+}
+
 static NetClientInfo net_virtio_info = {
 .type = NET_CLIENT_OPTIONS_KIND_NIC,
 .size = sizeof(NICState),
@@ -1282,6 +1311,7 @@ static NetClientInfo net_virtio_info = {
 .receive = virtio_net_receive,
 .cleanup = virtio_net_cleanup,
 .link_status_changed = virtio_net_set_link_status,
+.announce = virtio_net_announce,
 };
 
 static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx)
diff --git a/hw/virtio-net.h b/hw/virtio-net.h
index e654c13..c0f3de5 100644
--- a/hw/virtio-net.h
+++ b/hw/virtio-net.h
@@ -43,12 +43,13 @@
 #define VIRTIO_NET_F_CTRL_RX18  /* Control channel RX mode support */
 #define VIRTIO_NET_F_CTRL_VLAN  19  /* Control channel VLAN filtering */
 #define VIRTIO_NET_F_CTRL_RX_EXTRA 20   /* Extra RX mode control support */
+#define VIRTIO_NET_F_GUEST_ANNOUNCE 21  /* Guest can announce itself */
 #define VIRTIO_NET_F_MQ 22  /* Device supports Receive Flow
  * Steering */
-
 #define VIRTIO_NET_F_CTRL_MAC_ADDR   23 /* Set MAC address */
 
 #define VIRTIO_NET_S_LINK_UP1   /* Link is up */
+#define VIRTIO_NET_S_ANNOUNCE   2   /* Announcement is needed */
 
 #define TX_TIMER_INTERVAL 15 /* 150 us */
 
@@ -152,6 +153,17 @@ struct virtio_net_ctrl_mac {
  #define VIRTIO_NET_CTRL_VLAN_DEL 1
 
 /*
+ * Control link announce acknowledgement
+ *
+ * The command VIRTIO_NET_CTRL_ANNOUNCE_ACK is used to indicate that
+ * driver has recevied the notification and device would clear the
+ * VIRTIO_NET_S_ANNOUNCE bit in the status filed after it received
+ * this command.
+ */
+#define VIRTIO_NET_CTRL_ANNOUNCE   3
+ #define VIRTIO_NET_CTRL_ANNOUNCE_ACK 0
+
+/*
  * Control Multiqueue
  *
  * The command VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
@@ -180,6 +192,7 @@ struct virtio_net_ctrl_mq {
 DEFINE_PROP_BIT("guest_tso6", _state, _field, VIRTIO_NET_F_GUEST_TSO6, 
true), \
 DEFINE_PROP_BIT("guest_ecn", _state, _field, VIRTIO_NET_F_GUEST_ECN, 
true), \
 DEFINE_PROP_BIT("guest_ufo", _state, _field, VIRTIO_NET_F_GUEST_UFO, 
true), \
+DEFINE_PROP_BIT("guest_announce", _state, _field, 
VIRTIO_NET_F_GUEST_ANNOUNCE, true), \
 DEFINE_PROP_BIT("host_tso4", _state, _field, VIRTIO_NET_F_HOST_TSO4, 
true), \
 DEFINE_PROP_BIT("host_tso6", _state, _field, VIRTIO_NET_F_HOST_TSO6, 
true), \
 DE

[Qemu-devel] [PATCH V7 3/5] net: model specific announcing support

2013-03-07 Thread Jason Wang

This patch introduces a function pointer in NetClientInfo which is called during
self announcement. With this, each kind of card can announce the link with a
model specific way. The old method is still kept for cards that have not
implemented this or old guest. The first user would be virtio-net.

Signed-off-by: Jason Wang 
---
 include/net/net.h |2 ++
 savevm.c  |8 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/net/net.h b/include/net/net.h
index cb049a1..49031b4 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -44,6 +44,7 @@ typedef ssize_t (NetReceiveIOV)(NetClientState *, const 
struct iovec *, int);
 typedef void (NetCleanup) (NetClientState *);
 typedef void (LinkStatusChanged)(NetClientState *);
 typedef void (NetClientDestructor)(NetClientState *);
+typedef int (NetAnnounce)(NetClientState *);
 
 typedef struct NetClientInfo {
 NetClientOptionsKind type;
@@ -55,6 +56,7 @@ typedef struct NetClientInfo {
 NetCleanup *cleanup;
 LinkStatusChanged *link_status_changed;
 NetPoll *poll;
+NetAnnounce *announce;
 } NetClientInfo;
 
 struct NetClientState {
diff --git a/savevm.c b/savevm.c
index a8a53ef..fd69e32 100644
--- a/savevm.c
+++ b/savevm.c
@@ -76,12 +76,16 @@ static int announce_self_create(uint8_t *buf,
 
 static void qemu_announce_self_iter(NICState *nic, void *opaque)
 {
+NetClientState *nc = qemu_get_queue(nic);
+NetAnnounce *announce = nc->info->announce;
 uint8_t buf[60];
 int len;
 
-len = announce_self_create(buf, nic->conf->macaddr.a);
+if (!announce || announce(nc)) {
+len = announce_self_create(buf, nic->conf->macaddr.a);
 
-qemu_send_packet_raw(qemu_get_queue(nic), buf, len);
+qemu_send_packet_raw(qemu_get_queue(nic), buf, len);
+}
 }
 
 
-- 
1.7.1

[Qemu-devel] [PATCH V7 5/5] virtio-net: compat guest announce

2013-03-07 Thread Jason Wang

Disable guest announce for pre-1.5.

Signed-off-by: Jason Wang 
---
 hw/pc.h |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/hw/pc.h b/hw/pc.h
index f2c1b1c..2de7cd4 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -221,6 +221,10 @@ int e820_add_entry(uint64_t, uint64_t, uint32_t);
 .property = "vectors",\
 /* DEV_NVECTORS_UNSPECIFIED as a uint32_t string */\
 .value= stringify(0x),\
-   }
+},{\
+.driver   = "virtio-net-pci",   \
+.property = "guest_announce", \
+.value= "off",\
+}
 
 #endif
-- 
1.7.1

Re: [Qemu-devel] [PATCH v6 11/24] hw/nand.c: correct the sense of the BUSY/READY status bit

2013-03-07 Thread Edgar E. Iglesias

On Thu, Mar 07, 2013 at 12:11:51PM +1000, Peter Crosthwaite wrote:
> Hi Kuo Jung, Peter,
> 
> This patch fixes bugs for us in Zynq Nand (cc Wendy Liang). Can we get
> a cherry pick of this?

I've applied this one.

Thanks,
Edgar


> 
> Regards,
> Peter
> 
> On Wed, Mar 6, 2013 at 5:27 PM, Kuo-Jung Su  wrote:
> > The BIT6 of Status Register(SR):
> >
> > SR[6] behaves the same as R/B# pin
> > SR[6] = 0 indicates the device is busy;
> > SR[6] = 1 means the device is ready
> >
> > Some NAND flash controller (i.e. ftnandc021) relies on the SR[6]
> > to determine if the NAND flash erase/program is success or error timeout.
> >
> > P.S:
> > The exmaple NAND flash datasheet could be found at following link:
> > http://www.mxic.com.tw/QuickPlace/hq/PageLibrary4825740B00298A3B.nsf/h_Index/8FEA549237D2F7674825795800104C26/$File/MX30LF1G08AA,%203V,%201Gb,%20v1.1.pdf
> >
> > Signed-off-by: Kuo-Jung Su 
> 
> Reviewed-by: Peter Crosthwaite 
> 
> > ---
> >  hw/nand.c |3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/nand.c b/hw/nand.c
> > index 4a71265..61e918f 100644
> > --- a/hw/nand.c
> > +++ b/hw/nand.c
> > @@ -46,7 +46,7 @@
> >  # define NAND_IOSTATUS_PLANE1  (1 << 2)
> >  # define NAND_IOSTATUS_PLANE2  (1 << 3)
> >  # define NAND_IOSTATUS_PLANE3  (1 << 4)
> > -# define NAND_IOSTATUS_BUSY(1 << 6)
> > +# define NAND_IOSTATUS_READY(1 << 6)
> >  # define NAND_IOSTATUS_UNPROTCT(1 << 7)
> >
> >  # define MAX_PAGE  0x800
> > @@ -231,6 +231,7 @@ static void nand_reset(DeviceState *dev)
> >  s->iolen = 0;
> >  s->offset = 0;
> >  s->status &= NAND_IOSTATUS_UNPROTCT;
> > +s->status |= NAND_IOSTATUS_READY;
> >  }
> >
> >  static inline void nand_pushio_byte(NANDFlashState *s, uint8_t value)
> > --
> > 1.7.9.5
> >
> >

Re: [Qemu-devel] problems with freeBSD

2013-03-07 Thread Aurelien Jarno

On Wed, Mar 06, 2013 at 07:53:51PM -0500, Kevin O'Connor wrote:
> On Thu, Mar 07, 2013 at 12:12:08AM +0100, Aurelien Jarno wrote:
> > On Wed, Mar 06, 2013 at 08:21:11AM +, Dietmar Maurer wrote:
> > > Using qemu 1.4.0:
> > > 
> > > # qemu -hda test.raw -m 512 -cdrom 
> > > pfSense-LiveCD-2.0.2-RELEASE-amd64-20121207-2239.iso
> > > 
> > > Results in:
> > > 
> > > trap 12: page fault while in kernel mode
> > > ...
> > > stopped at x86bios_emu_rdw+0x2f: movzwl (%rbx),%eax
> > > 
> > > Any ideas? Can somebody reproduce that?
> > > 
> > > To get the FreeBSD VM boot use the console, enter the boot loader, then:
> > > # set hint.atkbd.0.disabled="1"
> > > # boot
> > > 
> > > But that disables the keyboard.
> > 
> > I was actually digging about that problem. It is indeed present in
> > version 1.4.0, but is fixed in the current git master. The problem is
> > actually not directly in QEMU but in seabios, the update to version
> > 1.7.2.1 commit 5c75fb10) fixes the issue. Maybe it is worth 
> > cherry-picking it into stable-1.4 (hence the Cc:). In the meantime
> > using bios.bin from master with QEMU version 1.4.0 should also fix the
> > issue.
> > 
> > What is strange is the seabios commit fixing the issue:
> > 
> > commit 4219149ad2b783abfa61e80e9e9f6910db0c76c9
> > Author: Kevin O'Connor 
> > Date:   Sun Feb 17 10:56:10 2013 -0500
> > 
> > build: Don't require $(OUT) to be a sub-directory of the main 
> > directory.
> 
> That change is definitely just build related - I don't see how it
> could impact the final SeaBIOS binary.  How did you conclude that this
> commit is what fixes the issue?
> 

I did a git bisect to find the commit fixing the issue. Then, as I was
not believing the result, I tried the following sequence a dozen of
times (for some unknown reasons the FreeBSD install CD doesn't exhibit
the issue, so I used the Debian GNU/kFreeBSD installer):

| mkdir qemu-freebsd-bug
| cd qemu-freebsd-bug
|
| wget 
http://ftp.debian.org/debian/dists/squeeze/main/installer-kfreebsd-amd64/current/images/netboot/mini.iso
 
|
| git clone git://git.qemu.org/qemu.git
| cd qemu
| git checkout -b stable-1.4 v1.4.0
| ./configure --target-list=x86_64-softmmu
| make
| cd ..
|
| git clone git://git.seabios.org/seabios.git
| cd seabios
| git checkout -b 1.7.2-stable origin/1.7.2-stable
| git reset --hard 4219149ad2b783abfa61e80e9e9f6910db0c76c9
| make
| cp out/bios.bin ../qemu/pc-bios
| cd..
|
| # debian-installer boots correctly 
| ./qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cdrom mini.iso
|
| cd seabios
| git reset --hard d75c22fcb6521dad11428b65789d92f89675c600 
| git clean -fdx
| make
| cp out/bios.bin ../qemu/pc-bios
| cd ..
|
| # debian-installer fails to boot
| ./qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cdrom mini.iso 


Maybe I am doing something wrong or there is a bug in my toolchain
(Debian Sid). It would be nice if someone could try to reproduce that on
another distro/system.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH] sysbus: add no_user for devices using mmio or IRQ or GPIO

2013-03-07 Thread Markus Armbruster

Peter Maydell  writes:

> On 5 March 2013 04:16, Paolo Bonzini  wrote:
>> Il 04/03/2013 18:58, Peter Maydell ha scritto:
 > Mass-mark these devices as no_user.
>>> "There is no such thing as a 'no-user' device" -- Anthony
>>> (http://lists.gnu.org/archive/html/qemu-devel/2013-01/msg00896.html)
>>>
>>> We should figure out what we might be trying to use 'no-user'
>>> for, and consistently use it that way. Or alternatively we
>>> should remove it (perhaps replacing it with other flags).
>>> Mass-marking all the sysbus devices when we don't have a
>>> consistent sane defined semantics for the flag seems like
>>> a bad idea.
>>
>> $ x86_64-softmmu/qemu-system-x86_64 -device xlnx,,ps7-usb
>> (qemu) info qtree
>> bus: main-system-bus
>>   type System
>>   dev: xlnx,ps7-usb, id ""
>> maxframes = 128
>> irq 1
>> mmio /1000
>> bus: usb-bus.0
>>   type usb-bus
>>
>> I have no idea what this means, but I'm pretty sure that no matter how I
>> configure it, it will never work.
>
> I agree. Exhibiting something that's a suboptimal user
> experience doesn't count as defining consistent semantics
> for no_user, though :-)

Can we make some progress towards something that makes more sense?

First step: reasons for marking a device no_user.

>From a user point of view, I think there's just one: -device/device_add
cannot possibly result in a working device.  Coherent enough semantics
for no_user, in my opinion, but I readily concede it's not particularly
useful for maintaining it as infrastructure and device models evolve.

For that, we need to drill down into reasons for "cannot possibly work".
Here are two, please add more:

* Can't make required connections

  -device can make only one connection: to a qbus (first order
  approximation).  Usually suffices for devices on "real" buses like
  PCI, SCSI, ...  Doesn't make any useful connection on "sysbus" (put in
  quotes because it's not really a bus).  Cases in between may exist:
  the bus connection is real enough, but the device wants additional
  connections.

  Additional connections are made by setup code that's not reachable
  from -device.  If they're required for the device to work, then
  -device cannot possibly result in a working device.  Worse, -device
  could happily claim success all the same.

  Are these additional connections always required?

  The need for additional connections depends only on the device, right?

  Automated static reasoning on setup code to determine the value of a
  "makes additional connections" flag feels intractable.  A runtime
  check could be feasible, though.

* Resource collision with board

  The device must connect to some fixed resource, but the board already
  connects something to it.  Without no_user, this should result in an
  error message of sorts, which is much better than the silent breakage
  above.  Whether the message makes any sense to the user is a different
  question.

  Unlike above, this isn't just about the device, it's about the device
  and the board.  Right now, one no_user has to make do for all boards.
  Hmm.

  A -device can also resource-collide with another -device.  But that's
  outside no_user's mission: -device *can* result in a working device
  there, just not in this particular combination.

  In a declarative world, we could automatically filter out devices who
  need resources that are never available on this board.  qdev/QOM is
  moving away from declarative.  Ideas?

>  [and I don't think "this device
> can be added via the monitor but not the command line"
> counts as consistent or coherent...]

no_user applies equally to -device and device_add.

The command line additionally provides a number of settings the board
setup code may (or may not) use, and no_user has no effect on them.

Coherent enough for me.

>> Yes, the right thing to do would be to QOMify memory regions and
>> introduce pins, but that's a bit more than the amount of time I have now
>> for this.
>
> ...plus it means that when we do have these things we
> have to go round and identify the cases where no_user
> was set only because we didn't have the features before.
>
> My attitude here really is "yes, it's not great but it's
> been like this forever and we don't seem to have had a
> huge flood of user complaints, so better not to mess
> with it unless what you're doing is going to amount to
> some kind of cleanup".

Never underestimate users' capability to suffer quietly.

Re: [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking

2013-03-07 Thread Kevin Wolf

Am 06.03.2013 um 21:39 hat Paolo Bonzini geschrieben:
> Il 06/03/2013 20:03, Peter Lieven ha scritto:
> > Am 06.03.2013 19:48, schrieb Jeff Cody:
> >> On Wed, Mar 06, 2013 at 07:31:51PM +0100, Paolo Bonzini wrote:
> >>> Il 06/03/2013 19:14, Jeff Cody ha scritto:
>  QCOW breaks with it using a normal raw posix file as a device.  As a
>  test: qemu-img create -f qcow test.qcow 5G.  Now run qemu with that
>  drive mounted, and try to partition and format it.  QEMU now asserts.
> 
>  The nicety of being able to using truncate during a write call,
>  especially for VHDX (which can have relatively large block/cluster
>  sizes), so to grow the file sparsely in a dynamically allocated file.
> >>>
> >>> Perhaps we need two APIs, "truncate" and "revalidate".
> >>>
> >>> Truncate should be a no-op if (!bs->growable).
> >>>
> >>> Revalidate could be called by the block_resize monitor command with no
> >>> size specified.
> >>>
> >>> Paolo
> >>
> >> I think that is a good solution.  Is it better to have "truncate" and
> >> "revalidate", or "truncate" and "grow", with grow being a subset of
> >> truncate, with fewer restrictions?  There may still be operations
> >> where it is OK to grow a file, but not OK to shrink it.

What semantics would the both operations have? Is truncate the same as
it used to be? I don't really understand what "revalidate" would do, it
sounds like a read-only operation from its name?

> > Or as a first step:
> > 
> > a) Call brdv_drain_all() only if the device is shrinked (independently of 
> > !bs->growable)
> > b) Call brdv_drain_all() inside iscsi_truncate() because it is a special 
> > requirement there
> > c) Fix the value of bs->growable for all drivers
> 
> Let's start from (c).  bdrv_file_open sets bs->growable = 1.  I think it
> should be removed and only the file protocol should set it.

This is probably right.

> Then we can add bdrv_revalidate and, for block_resize, call
> bdrv_revalidate+bdrv_truncate.  For bs->growable = 0 &&
> !bs->drv->bdrv_truncate, bdrv_truncate can just check that the actual
> size is the same or bigger as the one requested, and fail otherwise.

This one not so much. bs->growable does not mean that you can use
bdrv_truncate. It rather means that you may write beyond the end of the
file even without truncating it first. Mabye bs->auto_grow would be a
better for it.

So bs->growable == true implies that bdrv_truncate() should be allowed
as well, because obviously changing the BDS size is possbile, but it's
not true the other way round.

Kevin

Re: [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking

2013-03-07 Thread Peter Lieven


On 06.03.2013 21:39, Paolo Bonzini wrote:

Il 06/03/2013 20:03, Peter Lieven ha scritto:

Am 06.03.2013 19:48, schrieb Jeff Cody:

On Wed, Mar 06, 2013 at 07:31:51PM +0100, Paolo Bonzini wrote:

Il 06/03/2013 19:14, Jeff Cody ha scritto:

QCOW breaks with it using a normal raw posix file as a device.  As a
test: qemu-img create -f qcow test.qcow 5G.  Now run qemu with that
drive mounted, and try to partition and format it.  QEMU now asserts.

The nicety of being able to using truncate during a write call,
especially for VHDX (which can have relatively large block/cluster
sizes), so to grow the file sparsely in a dynamically allocated file.


Perhaps we need two APIs, "truncate" and "revalidate".

Truncate should be a no-op if (!bs->growable).

Revalidate could be called by the block_resize monitor command with no
size specified.

Paolo


I think that is a good solution.  Is it better to have "truncate" and
"revalidate", or "truncate" and "grow", with grow being a subset of
truncate, with fewer restrictions?  There may still be operations
where it is OK to grow a file, but not OK to shrink it.


Or as a first step:

a) Call brdv_drain_all() only if the device is shrinked (independently of 
!bs->growable)
b) Call brdv_drain_all() inside iscsi_truncate() because it is a special 
requirement there
c) Fix the value of bs->growable for all drivers


Let's start from (c).  bdrv_file_open sets bs->growable = 1.  I think it
should be removed and only the file protocol should set it.

Then we can add bdrv_revalidate and, for block_resize, call
bdrv_revalidate+bdrv_truncate.  For bs->growable = 0 &&
!bs->drv->bdrv_truncate, bdrv_truncate can just check that the actual
size is the same or bigger as the one requested, and fail otherwise.

Paolo



Regarding brd_drain_all(). Is the fix right to call it only on device shrink?
In this case it has to be added to iscsi_truncate as well.

Peter

Re: [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking

2013-03-07 Thread Peter Lieven


On 07.03.2013 09:50, Kevin Wolf wrote:

Am 06.03.2013 um 21:39 hat Paolo Bonzini geschrieben:

Il 06/03/2013 20:03, Peter Lieven ha scritto:

Am 06.03.2013 19:48, schrieb Jeff Cody:

On Wed, Mar 06, 2013 at 07:31:51PM +0100, Paolo Bonzini wrote:

Il 06/03/2013 19:14, Jeff Cody ha scritto:

QCOW breaks with it using a normal raw posix file as a device.  As a
test: qemu-img create -f qcow test.qcow 5G.  Now run qemu with that
drive mounted, and try to partition and format it.  QEMU now asserts.

The nicety of being able to using truncate during a write call,
especially for VHDX (which can have relatively large block/cluster
sizes), so to grow the file sparsely in a dynamically allocated file.


Perhaps we need two APIs, "truncate" and "revalidate".

Truncate should be a no-op if (!bs->growable).

Revalidate could be called by the block_resize monitor command with no
size specified.

Paolo


I think that is a good solution.  Is it better to have "truncate" and
"revalidate", or "truncate" and "grow", with grow being a subset of
truncate, with fewer restrictions?  There may still be operations
where it is OK to grow a file, but not OK to shrink it.


What semantics would the both operations have? Is truncate the same as
it used to be? I don't really understand what "revalidate" would do, it
sounds like a read-only operation from its name?


Or as a first step:

a) Call brdv_drain_all() only if the device is shrinked (independently of 
!bs->growable)
b) Call brdv_drain_all() inside iscsi_truncate() because it is a special 
requirement there
c) Fix the value of bs->growable for all drivers


Let's start from (c).  bdrv_file_open sets bs->growable = 1.  I think it
should be removed and only the file protocol should set it.


This is probably right.


If bs->growable is 1 for all drivers, whats the fix status of CVE-2008-0928? 
This
flag was introduced as a fix for this problem.

bdrv_check_byte_request() does nothing useful if bs->growable is 1.

Peter

Re: [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking

2013-03-07 Thread Kevin Wolf

Am 06.03.2013 um 19:27 hat Peter Lieven geschrieben:
> Am 06.03.2013 19:06, schrieb Paolo Bonzini:
> > Il 06/03/2013 18:50, Peter Lieven ha scritto:
>  Commit 9a665b2b made bdrv_truncate() call bdrv_drain_all(), but this 
>  breaks
>  QCOW images, as well other future image formats (such as VHDX) that may 
>  call
>  bdrv_truncate(bs->file) from within a read/write operation.  For 
>  example, QCOW
>  will cause an assert, due to tracked_requests not being empty (since the
>  read/write that called bdrv_truncate() is still in progress).
> > 
> > I'm not sure such bdrv_truncate calls are necessary.  QCOW2 doesn't have
> > them (almost; there is one in qcow2_write_compressed, I'm not even sure
> > that one is necessary though), and I think QCOW's breaks using it with a
> > block device as a backing file.
> 
> I think we have to check the sense of bs->growable nevertheless. It is set
> to 1 for all drivers which can't be right?!

For everything that goes through bdrv_file_open(), which are protocol
drivers, not format drivers. It is required for files so that formats
can write past EOF; for other drivers that can't actually grow their
backing storage it doesn't hurt because they will return an eror anyway
when you write to it.

"bdrv_truncate() works" and "bs->growable == true" are totally different
things, so even though it doesn't look right at the first sight,
bs->growable does its job correctly.

In your other mail you're talking about setting it for raw, qcow2, VHDX.
This would be discussing on the entirely wrong level, this isn't about
formats, but about protocols (file, host_device, nbd, iscsi, http,
vvfat...), that are below the format drivers.

Kevin

Re: [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking

2013-03-07 Thread Kevin Wolf

Am 07.03.2013 um 09:53 hat Peter Lieven geschrieben:
> On 06.03.2013 21:39, Paolo Bonzini wrote:
> >Il 06/03/2013 20:03, Peter Lieven ha scritto:
> >>Am 06.03.2013 19:48, schrieb Jeff Cody:
> >>>On Wed, Mar 06, 2013 at 07:31:51PM +0100, Paolo Bonzini wrote:
> Il 06/03/2013 19:14, Jeff Cody ha scritto:
> >QCOW breaks with it using a normal raw posix file as a device.  As a
> >test: qemu-img create -f qcow test.qcow 5G.  Now run qemu with that
> >drive mounted, and try to partition and format it.  QEMU now asserts.
> >
> >The nicety of being able to using truncate during a write call,
> >especially for VHDX (which can have relatively large block/cluster
> >sizes), so to grow the file sparsely in a dynamically allocated file.
> 
> Perhaps we need two APIs, "truncate" and "revalidate".
> 
> Truncate should be a no-op if (!bs->growable).
> 
> Revalidate could be called by the block_resize monitor command with no
> size specified.
> 
> Paolo
> >>>
> >>>I think that is a good solution.  Is it better to have "truncate" and
> >>>"revalidate", or "truncate" and "grow", with grow being a subset of
> >>>truncate, with fewer restrictions?  There may still be operations
> >>>where it is OK to grow a file, but not OK to shrink it.
> >>
> >>Or as a first step:
> >>
> >>a) Call brdv_drain_all() only if the device is shrinked (independently of 
> >>!bs->growable)
> >>b) Call brdv_drain_all() inside iscsi_truncate() because it is a special 
> >>requirement there
> >>c) Fix the value of bs->growable for all drivers
> >
> >Let's start from (c).  bdrv_file_open sets bs->growable = 1.  I think it
> >should be removed and only the file protocol should set it.
> >
> >Then we can add bdrv_revalidate and, for block_resize, call
> >bdrv_revalidate+bdrv_truncate.  For bs->growable = 0 &&
> >!bs->drv->bdrv_truncate, bdrv_truncate can just check that the actual
> >size is the same or bigger as the one requested, and fail otherwise.
> >
> >Paolo
> >
> 
> Regarding brd_drain_all(). Is the fix right to call it only on device shrink?
> In this case it has to be added to iscsi_truncate as well.

The real fix would bdrv_drain(bs). I hope we're not too far away from
that today, even though we're not quite there.

Kevin

Re: [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking

2013-03-07 Thread Kevin Wolf

Am 07.03.2013 um 09:56 hat Peter Lieven geschrieben:
> On 07.03.2013 09:50, Kevin Wolf wrote:
> >Am 06.03.2013 um 21:39 hat Paolo Bonzini geschrieben:
> >>Il 06/03/2013 20:03, Peter Lieven ha scritto:
> >>>Am 06.03.2013 19:48, schrieb Jeff Cody:
> On Wed, Mar 06, 2013 at 07:31:51PM +0100, Paolo Bonzini wrote:
> >Il 06/03/2013 19:14, Jeff Cody ha scritto:
> >>QCOW breaks with it using a normal raw posix file as a device.  As a
> >>test: qemu-img create -f qcow test.qcow 5G.  Now run qemu with that
> >>drive mounted, and try to partition and format it.  QEMU now asserts.
> >>
> >>The nicety of being able to using truncate during a write call,
> >>especially for VHDX (which can have relatively large block/cluster
> >>sizes), so to grow the file sparsely in a dynamically allocated file.
> >
> >Perhaps we need two APIs, "truncate" and "revalidate".
> >
> >Truncate should be a no-op if (!bs->growable).
> >
> >Revalidate could be called by the block_resize monitor command with no
> >size specified.
> >
> >Paolo
> 
> I think that is a good solution.  Is it better to have "truncate" and
> "revalidate", or "truncate" and "grow", with grow being a subset of
> truncate, with fewer restrictions?  There may still be operations
> where it is OK to grow a file, but not OK to shrink it.
> >
> >What semantics would the both operations have? Is truncate the same as
> >it used to be? I don't really understand what "revalidate" would do, it
> >sounds like a read-only operation from its name?
> >
> >>>Or as a first step:
> >>>
> >>>a) Call brdv_drain_all() only if the device is shrinked (independently of 
> >>>!bs->growable)
> >>>b) Call brdv_drain_all() inside iscsi_truncate() because it is a special 
> >>>requirement there
> >>>c) Fix the value of bs->growable for all drivers
> >>
> >>Let's start from (c).  bdrv_file_open sets bs->growable = 1.  I think it
> >>should be removed and only the file protocol should set it.
> >
> >This is probably right.
> 
> If bs->growable is 1 for all drivers, whats the fix status of CVE-2008-0928? 
> This
> flag was introduced as a fix for this problem.
> 
> bdrv_check_byte_request() does nothing useful if bs->growable is 1.

Don't ignore the difference between bdrv_open() and bdrv_file_open().
Typically you have two BDSes: On top there is e.g. a qcow2 BDS that is
opened through bdrv_open() and has bs->growable = false. Its bs->file is
using the file protocol (raw-posix driver) and opened by
bdrv_file_open(). This one has bs->file->growable = true so that qcow2
can write to newly allocated areas without calling bdrv_truncate()
first.

Kevin

Re: [Qemu-devel] [PATCH v4 3/6] add backup related monitor commands

2013-03-07 Thread Kevin Wolf

Am 06.03.2013 um 18:39 hat Dietmar Maurer geschrieben:
> > > > How about variable block sizes? I mean this is a stream format that
> > > > has a header for each block anyway. Include a size there and be done.
> > >
> > > You can make that as complex as you want. I simply do not need that.
> > 
> > Then you also don't need the performance that you lose by using NBD.
> 
> Why exactly?

Because your format kills more performance than any NBD connection
could.

Kevin

Re: [Qemu-devel] [PATCH v4 3/6] add backup related monitor commands

2013-03-07 Thread Stefan Hajnoczi

On Wed, Mar 06, 2013 at 02:42:57PM +, Dietmar Maurer wrote:
> > > > Maybe you'd better use a different output format that doesn't
> > > > restrict you to 64k writes.
> > >
> > > The output format is not really the restriction. The problem is that
> > > an additional IPC layer add overhead, an d I do not want that (because it 
> > > is
> > totally unnecessary).
> > 
> > I missed the reason why you cannot increase the block size.  
> 
> When we run backup, we need to read such block on every write from the guest.
> So if we increase block size we get additional delays.

Don't increase the bitmap block size.

Just let the block job do larger reads.  This is the bulk of the I/O
workload.  You can use large reads here independently of the bitmap
block size.

That way guest writes still only have a 64 KB read overhead but you
reduce the overhead of doing so many 64 KB writes from the backup block
job.

Stefan

Re: [Qemu-devel] [PATCH] Fix the wrong description in qemu manual

2013-03-07 Thread Markus Armbruster

Copying qemu-trivial.

Lei Li  writes:

> Fix LP#1151450 the wrong description in qemu manual:
>
> 'qemu-system-x86_84' should be 'qemu-system-x86_64'.
>
> Signed-off-by: Lei Li 
> ---
>  qemu-options.hx |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 6f9334a..cd76f2a 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -2132,7 +2132,7 @@
> gluster[+transport]://[server[:port]]/volname/image[?socket=...]
>  
>  Example
>  @example
> -qemu-system-x86_84 --drive file=gluster://192.0.2.1/testvol/a.img
> +qemu-system-x86_64 --drive file=gluster://192.0.2.1/testvol/a.img
>  @end example
>  
>  See also @url{http://www.gluster.org}.

Re: [Qemu-devel] [PATCHv3 0/2] bridge helper: includedir conf arg

2013-03-07 Thread Stefan Hajnoczi

On Thu, Mar 07, 2013 at 12:32:08AM -0600, Doug Goldstein wrote:
> The goal is to support an 'includedir' to include all files within a
> directory specified in the bridge.conf file. The rationale is to allow
> libvirt to be able to configure interfaces to for use by unprivileged
> users by just simply generating a new configuration file to the directory.
> 
> Change from v2:
> - Integrated review changes from Corey Bryant
> - Integrated review changes from Stefan Hajnoczi
> 
> Change from v1:
> - Reversed patch order to make the series clearer
> - Integrated review changes from Corey Bryant
> - Integrated review changes from Stefan Hajnoczi
> 
> Doug Goldstein (2):
>   bridge helper: unified error cleanup for parse_acl_file
>   bridge helper: support conf dirs
> 
>  qemu-bridge-helper.c | 83 
> 
>  1 file changed, 71 insertions(+), 12 deletions(-)
> 
> -- 
> 1.7.12.4
> 
> 

Reviewed-by: Stefan Hajnoczi

Re: [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking

2013-03-07 Thread Peter Lieven


Am 07.03.2013 um 10:03 schrieb Kevin Wolf :

> Am 07.03.2013 um 09:56 hat Peter Lieven geschrieben:
>> On 07.03.2013 09:50, Kevin Wolf wrote:
>>> Am 06.03.2013 um 21:39 hat Paolo Bonzini geschrieben:
 Il 06/03/2013 20:03, Peter Lieven ha scritto:
> Am 06.03.2013 19:48, schrieb Jeff Cody:
>> On Wed, Mar 06, 2013 at 07:31:51PM +0100, Paolo Bonzini wrote:
>>> Il 06/03/2013 19:14, Jeff Cody ha scritto:
 QCOW breaks with it using a normal raw posix file as a device.  As a
 test: qemu-img create -f qcow test.qcow 5G.  Now run qemu with that
 drive mounted, and try to partition and format it.  QEMU now asserts.
 
 The nicety of being able to using truncate during a write call,
 especially for VHDX (which can have relatively large block/cluster
 sizes), so to grow the file sparsely in a dynamically allocated file.
>>> 
>>> Perhaps we need two APIs, "truncate" and "revalidate".
>>> 
>>> Truncate should be a no-op if (!bs->growable).
>>> 
>>> Revalidate could be called by the block_resize monitor command with no
>>> size specified.
>>> 
>>> Paolo
>> 
>> I think that is a good solution.  Is it better to have "truncate" and
>> "revalidate", or "truncate" and "grow", with grow being a subset of
>> truncate, with fewer restrictions?  There may still be operations
>> where it is OK to grow a file, but not OK to shrink it.
>>> 
>>> What semantics would the both operations have? Is truncate the same as
>>> it used to be? I don't really understand what "revalidate" would do, it
>>> sounds like a read-only operation from its name?
>>> 
> Or as a first step:
> 
> a) Call brdv_drain_all() only if the device is shrinked (independently of 
> !bs->growable)
> b) Call brdv_drain_all() inside iscsi_truncate() because it is a special 
> requirement there
> c) Fix the value of bs->growable for all drivers
 
 Let's start from (c).  bdrv_file_open sets bs->growable = 1.  I think it
 should be removed and only the file protocol should set it.
>>> 
>>> This is probably right.
>> 
>> If bs->growable is 1 for all drivers, whats the fix status of CVE-2008-0928? 
>> This
>> flag was introduced as a fix for this problem.
>> 
>> bdrv_check_byte_request() does nothing useful if bs->growable is 1.
> 
> Don't ignore the difference between bdrv_open() and bdrv_file_open().
> Typically you have two BDSes: On top there is e.g. a qcow2 BDS that is
> opened through bdrv_open() and has bs->growable = false. Its bs->file is
> using the file protocol (raw-posix driver) and opened by
> bdrv_file_open(). This one has bs->file->growable = true so that qcow2
> can write to newly allocated areas without calling bdrv_truncate()
> first.

Sorry, I have to admin I am little confused by what is happening in bdrv_open().

However, what I can say is that bs->growable is 1 for an iSCSI backed
harddrive and I wonder how this can happen if bdrv_file_open is not used for
opening it because that is the only place where bs->growable is set to 1.

cmdline:
x86_64-softmmu/qemu-system-x86_64 -k de -enable-kvm -m 1024 -drive 
if=virtio,file=iscsi://172.21.200.31/iqn.2001-05.com.equallogic:0-8a0906-16470e107-713001aa6de511e0-001-test/0
 -vnc :1 -boot dc -monitor stdio

Peter

> 
> Kevin

Re: [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking

2013-03-07 Thread Kevin Wolf

Am 07.03.2013 um 10:16 hat Peter Lieven geschrieben:
> >> If bs->growable is 1 for all drivers, whats the fix status of 
> >> CVE-2008-0928? This
> >> flag was introduced as a fix for this problem.
> >> 
> >> bdrv_check_byte_request() does nothing useful if bs->growable is 1.
> > 
> > Don't ignore the difference between bdrv_open() and bdrv_file_open().
> > Typically you have two BDSes: On top there is e.g. a qcow2 BDS that is
> > opened through bdrv_open() and has bs->growable = false. Its bs->file is
> > using the file protocol (raw-posix driver) and opened by
> > bdrv_file_open(). This one has bs->file->growable = true so that qcow2
> > can write to newly allocated areas without calling bdrv_truncate()
> > first.
> 
> Sorry, I have to admin I am little confused by what is happening in 
> bdrv_open().
> 
> However, what I can say is that bs->growable is 1 for an iSCSI backed
> harddrive and I wonder how this can happen if bdrv_file_open is not used for
> opening it because that is the only place where bs->growable is set to 1.
> 
> cmdline:
> x86_64-softmmu/qemu-system-x86_64 -k de -enable-kvm -m 1024 -drive 
> if=virtio,file=iscsi://172.21.200.31/iqn.2001-05.com.equallogic:0-8a0906-16470e107-713001aa6de511e0-001-test/0
>  -vnc :1 -boot dc -monitor stdio

It is used for the iscsi driver. You have a raw BDS (growable == false)
on top of an iscsi one (growable == true).

Kevin

Re: [Qemu-devel] [PATCH v4 3/6] add backup related monitor commands

2013-03-07 Thread Dietmar Maurer

> > > > You can make that as complex as you want. I simply do not need that.
> > >
> > > Then you also don't need the performance that you lose by using NBD.
> >
> > Why exactly?
> 
> Because your format kills more performance than any NBD connection could.

That is not true.

Re: [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking

2013-03-07 Thread Peter Lieven


Am 07.03.2013 um 10:22 schrieb Kevin Wolf :

> Am 07.03.2013 um 10:16 hat Peter Lieven geschrieben:
 If bs->growable is 1 for all drivers, whats the fix status of 
 CVE-2008-0928? This
 flag was introduced as a fix for this problem.
 
 bdrv_check_byte_request() does nothing useful if bs->growable is 1.
>>> 
>>> Don't ignore the difference between bdrv_open() and bdrv_file_open().
>>> Typically you have two BDSes: On top there is e.g. a qcow2 BDS that is
>>> opened through bdrv_open() and has bs->growable = false. Its bs->file is
>>> using the file protocol (raw-posix driver) and opened by
>>> bdrv_file_open(). This one has bs->file->growable = true so that qcow2
>>> can write to newly allocated areas without calling bdrv_truncate()
>>> first.
>> 
>> Sorry, I have to admin I am little confused by what is happening in 
>> bdrv_open().
>> 
>> However, what I can say is that bs->growable is 1 for an iSCSI backed
>> harddrive and I wonder how this can happen if bdrv_file_open is not used for
>> opening it because that is the only place where bs->growable is set to 1.
>> 
>> cmdline:
>> x86_64-softmmu/qemu-system-x86_64 -k de -enable-kvm -m 1024 -drive 
>> if=virtio,file=iscsi://172.21.200.31/iqn.2001-05.com.equallogic:0-8a0906-16470e107-713001aa6de511e0-001-test/0
>>  -vnc :1 -boot dc -monitor stdio
> 
> It is used for the iscsi driver. You have a raw BDS (growable == false)
> on top of an iscsi one (growable == true).

Ok, but growable == true is wrong for the iSCSI driver isn`t it?

Peter

Re: [Qemu-devel] [PATCH v4 3/6] add backup related monitor commands

2013-03-07 Thread Dietmar Maurer

> > When we run backup, we need to read such block on every write from the
> guest.
> > So if we increase block size we get additional delays.
> 
> Don't increase the bitmap block size.
> 
> Just let the block job do larger reads.  This is the bulk of the I/O 
> workload.  You
> can use large reads here independently of the bitmap block size.
> 
> That way guest writes still only have a 64 KB read overhead but you reduce the
> overhead of doing so many 64 KB writes from the backup block job.

If there are many writes from the guest, you still get many 64KB block.

Anyways, and additional RPC layer always adds overhead, and It can be 
completely avoided.
Maybe not much, but I want to make backup as efficient as possible.

Re: [Qemu-devel] [PATCH 0/3] * make netlayer re-entrant *

2013-03-07 Thread Stefan Hajnoczi

On Thu, Mar 07, 2013 at 10:06:52AM +0800, liu ping fan wrote:
> On Wed, Mar 6, 2013 at 5:30 AM, mdroth  wrote:
> > On Sun, Mar 03, 2013 at 09:21:19PM +0800, Liu Ping Fan wrote:
> >> From: Liu Ping Fan 
> >>
> >> This series aim to make netlayer re-entrant, so netlayer can
> >> run out of biglock safely.
> >
> > I think most of the locking considerations are still applicable either
> > way, but this series seems to be written under the assumption that
> > we'll be associating hubs/ports with separate AioContexts to facilitate
> > driving the event handling outside of the iothread. Is this the case?
> >
> Yes.
> > From what I gathered from the other thread, the path forward was to
> > replace the global iohandler list that we currently use to drive
> > NetClient events and replace it with a GSource and GMainContext, rather
> > than relying on AioContexts.
> >
> Not quite sure about it. Seems that AioContext is built on GSource, so
> I think they are similar, and AioContext is easy to reuse.
> 
> > I do agree that the event handlers currently grouped under
> > iohandler.c:io_handlers look like a nice fit for AioContexts, but other
> > things like slirp and chardevs seem better served by a more general
> > mechanism like GSources/GMainContexts. The chardev flow control patches
> > seem to be doing something similar already as well.
> >
> I have made some fix for this series, apart from the concert about
> GSource/ AioContext.  Hope to discuss it clearly in next version and
> fix it too. BTW what can we benefit from AioContext besides those from
> GSource

This is a good discussion.  I'd like to hear more about using glib event
loop concepts instead of rolling our own.  Here are my thoughts after
exploring the glib main loop vs AioContext.

AioContext supports two things:

1. BH which is similar to GIdle for scheduling functions that get called
   from the event loop.

   Note that GIdle doesn't work in aio_poll() because we don't run
   integrate the glib event loop.

2. aio_poll() which goes a step beyond iohandlers because the
   ->io_flush() handler signals whether there are pending aio requests.
   This way aio_poll() can be called in a loop until all pending
   requests have finished.

   Imagine block/iscsi.c which has a TCP socket to the iSCSI target.
   We're ready to receive from the socket but we only want to wait until
   all pending requests complete.  That means the socket fd is always
   looking for G_IO_IN events but we shouldn't wait unless there are
   actually iSCSI requests pending.

   This feature is important for the synchronous (nested event loop)
   functionality that QEMU relies on for bdrv_drain_all() and
   synchronous I/O (bdrv_read(), bdrv_write(), bdrv_flush()).

The glib equivalent to aio_poll() is g_main_context_iteration():

https://developer.gnome.org/glib/2.30/glib-The-Main-Event-Loop.html#g-main-context-iteration

But note that the return value is different.  g_main_context_iteration()
tells you if any event handlers were called.  aio_poll() tells you
whether more calls are necessary in order to reach a quiescent state
(all requests completed).

I guess it's time to look at the chardev flow control patches to see how
glib event loop concepts are being used.

Stefan

Re: [Qemu-devel] [PATCH 0/3] * make netlayer re-entrant *

2013-03-07 Thread Stefan Hajnoczi

On Thu, Mar 07, 2013 at 10:06:52AM +0800, liu ping fan wrote:
> On Wed, Mar 6, 2013 at 5:30 AM, mdroth  wrote:
> > On Sun, Mar 03, 2013 at 09:21:19PM +0800, Liu Ping Fan wrote:
> >> From: Liu Ping Fan 
> >>
> >> This series aim to make netlayer re-entrant, so netlayer can
> >> run out of biglock safely.
> >
> > I think most of the locking considerations are still applicable either
> > way, but this series seems to be written under the assumption that
> > we'll be associating hubs/ports with separate AioContexts to facilitate
> > driving the event handling outside of the iothread. Is this the case?
> >
> Yes.
> > From what I gathered from the other thread, the path forward was to
> > replace the global iohandler list that we currently use to drive
> > NetClient events and replace it with a GSource and GMainContext, rather
> > than relying on AioContexts.
> >
> Not quite sure about it. Seems that AioContext is built on GSource, so
> I think they are similar, and AioContext is easy to reuse.
> 
> > I do agree that the event handlers currently grouped under
> > iohandler.c:io_handlers look like a nice fit for AioContexts, but other
> > things like slirp and chardevs seem better served by a more general
> > mechanism like GSources/GMainContexts. The chardev flow control patches
> > seem to be doing something similar already as well.
> >
> I have made some fix for this series, apart from the concert about
> GSource/ AioContext.  Hope to discuss it clearly in next version and
> fix it too. BTW what can we benefit from AioContext besides those from
> GSource

One thing I forgot to add: net/ doesn't use BH or
qemu_aio_set_fd_handler() so AioContext isn't strictly necessary.

Stefan

Re: [Qemu-devel] [PER] Re: socket, mcast looping back frames -> IPv6 broken

2013-03-07 Thread Stefan Hajnoczi

On Wed, Mar 06, 2013 at 02:15:25PM +0100, Samuel Thibault wrote:
> Stefan Hajnoczi, le Wed 06 Mar 2013 13:29:37 +0100, a écrit :
> > On Tue, Mar 05, 2013 at 05:35:10PM +0100, Samuel Thibault wrote:
> > Unfortunately net/socket.c does not have the concept of a link-layer
> > address, so we cannot easily filter out multicast packets coming from
> > our NIC's address.
> > 
> > Are you aware of a way to filter out just the packets sent by *this*
> > process?
> 
> I haven't seen any in the Linux source code.  One thing that should
> work, however, is to use recvfrom, and drop whatever comes from our
> sockname.

Sounds like a plan :).

Stefan

Re: [Qemu-devel] [PATCH] qdev: DEVICE_DELETED event

2013-03-07 Thread Markus Armbruster

Eric Blake  writes:

> [adding libvirt]
>
> On 03/06/2013 07:52 AM, Paolo Bonzini wrote:
>> Il 06/03/2013 15:44, Eric Blake ha scritto:
>>> Question - if libvirt misses the event (for example, if libvirtd
>>> requests a remove, but then gets restarted, and the event arrives before
>>> libvirtd is back up), is there a way to poll whether the the removal has
>>> completed?  The event is great to minimize polling overhead in the
>>> common case, but we generally provide this sort of information via a
>>> pollable interface at the same time.

General rule: no event without a way to poll.

>> Yes, you can use qom-list on /machine/peripheral.
>
> Which means libvirt should be patched to use qom-list right now, even
> before this event makes it in, in order to avoid its current bug of
> reusing a device id before the deletion has completed.  Adding the event
> is still useful, as polling is never nice; and we already know how to
> make libvirt do conditional code based on whether query-events shows
> that the event has been added to minimize polling where possible.

Yes.

[Qemu-devel] [Bug 1151450] [NEW] wrong description in qemu manual

2013-03-07 Thread cyberyoung

Public bug reported:


Description:
man qemu, there is a line:
qemu-system-x86_84 --drive file=gluster://192.0.2.1/testvol/a.img
seems should be:
qemu-system-x86_64 --drive file=gluster://192.0.2.1/testvol/a.img

Additional info:
* operating system
arch linux x86_64
* package version(s)
1.4.0
* config and/or log files etc.


Steps to reproduce:
man qemu

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1151450

Title:
  wrong description in qemu manual

Status in QEMU:
  New

Bug description:
  
  Description:
  man qemu, there is a line:
  qemu-system-x86_84 --drive file=gluster://192.0.2.1/testvol/a.img
  seems should be:
  qemu-system-x86_64 --drive file=gluster://192.0.2.1/testvol/a.img

  Additional info:
  * operating system
  arch linux x86_64
  * package version(s)
  1.4.0
  * config and/or log files etc.

  
  Steps to reproduce:
  man qemu

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1151450/+subscriptions

Re: [Qemu-devel] [PATCH] Fix the wrong description in qemu manual

2013-03-07 Thread Stefan Hajnoczi

On Thu, Mar 07, 2013 at 03:50:26PM +0800, Lei Li wrote:
> Fix LP#1151450 the wrong description in qemu manual:
> 
> 'qemu-system-x86_84' should be 'qemu-system-x86_64'.
> 
> Signed-off-by: Lei Li 
> ---
>  qemu-options.hx |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)

Thanks, applied to the trivial patches tree:
https://github.com/stefanha/qemu/commits/trivial-patches

Stefan

[Qemu-devel] [PATCHv3] qdev: DEVICE_DELETED event

2013-03-07 Thread Michael S. Tsirkin

libvirt has a long-standing bug: when removing the device,
it can request removal but does not know when the
removal completes. Add an event so we can fix this in a robust way.

Signed-off-by: Michael S. Tsirkin 
---

Changes from v2:
- move event toward the end of device_unparent,
  so that parents are reported after their children,
  as suggested by Paolo
Changes from v1:
- move to device_unparent
- address comments by Andreas and Eric


 QMP/qmp-events.txt| 15 +++
 hw/qdev.c |  6 ++
 include/monitor/monitor.h |  1 +
 monitor.c |  1 +
 qapi-schema.json  |  4 +++-
 5 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
index b2698e4..f2f115a 100644
--- a/QMP/qmp-events.txt
+++ b/QMP/qmp-events.txt
@@ -136,6 +136,21 @@ Example:
 Note: The "ready to complete" status is always reset by a BLOCK_JOB_ERROR
 event.
 
+DEVICE_DELETED
+-
+
+Emitted whenever the device removal completion is acknowledged
+by the guest. At this point, it's safe to reuse the specified device ID.
+Device removal can be initiated by the guest or by HMP/QMP commands.
+
+Data:
+
+- "device": device name (json-string)
+
+{ "event": "DEVICE_DELETED",
+  "data": { "device": "virtio-net-pci-0" },
+  "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
+
 DEVICE_TRAY_MOVED
 -
 
diff --git a/hw/qdev.c b/hw/qdev.c
index 689cd54..393e83e 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -29,6 +29,7 @@
 #include "sysemu/sysemu.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
+#include "qapi/qmp/qjson.h"
 
 int qdev_hotplug = 0;
 static bool qdev_hot_added = false;
@@ -778,6 +779,11 @@ static void device_unparent(Object *obj)
 object_unref(OBJECT(dev->parent_bus));
 dev->parent_bus = NULL;
 }
+if (dev->id) {
+QObject *data = qobject_from_jsonf("{ 'device': %s }", dev->id);
+monitor_protocol_event(QEVENT_DEVICE_DELETED, data);
+qobject_decref(data);
+}
 }
 
 static void device_class_init(ObjectClass *class, void *data)
diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index 87fb49c..b868760 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -39,6 +39,7 @@ typedef enum MonitorEvent {
 QEVENT_BLOCK_JOB_CANCELLED,
 QEVENT_BLOCK_JOB_ERROR,
 QEVENT_BLOCK_JOB_READY,
+QEVENT_DEVICE_DELETED,
 QEVENT_DEVICE_TRAY_MOVED,
 QEVENT_SUSPEND,
 QEVENT_SUSPEND_DISK,
diff --git a/monitor.c b/monitor.c
index 32a6e74..2a5e7b6 100644
--- a/monitor.c
+++ b/monitor.c
@@ -457,6 +457,7 @@ static const char *monitor_event_names[] = {
 [QEVENT_BLOCK_JOB_CANCELLED] = "BLOCK_JOB_CANCELLED",
 [QEVENT_BLOCK_JOB_ERROR] = "BLOCK_JOB_ERROR",
 [QEVENT_BLOCK_JOB_READY] = "BLOCK_JOB_READY",
+[QEVENT_DEVICE_DELETED] = "DEVICE_DELETED",
 [QEVENT_DEVICE_TRAY_MOVED] = "DEVICE_TRAY_MOVED",
 [QEVENT_SUSPEND] = "SUSPEND",
 [QEVENT_SUSPEND_DISK] = "SUSPEND_DISK",
diff --git a/qapi-schema.json b/qapi-schema.json
index 28b070f..bb361e1 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2354,7 +2354,9 @@
 # Notes: When this command completes, the device may not be removed from the
 #guest.  Hot removal is an operation that requires guest cooperation.
 #This command merely requests that the guest begin the hot removal
-#process.
+#process.  Completion of the device removal process is signaled with a
+#DEVICE_DELETED event. Guest reset will automatically complete removal
+#for all devices.
 #
 # Since: 0.14.0
 ##
-- 
MST

Re: [Qemu-devel] [PATCH] qdev: DEVICE_DELETED event

2013-03-07 Thread Markus Armbruster

"Michael S. Tsirkin"  writes:

> On Wed, Mar 06, 2013 at 02:57:22PM +0100, Andreas Färber wrote:
>> Am 06.03.2013 14:00, schrieb Michael S. Tsirkin:
>> > libvirt has a long-standing bug: when removing the device,
>> > it can request removal but does not know when does the
>> > removal complete. Add an event so we can fix this in a robust way.
>> > 
>> > Signed-off-by: Michael S. Tsirkin 
>> 
>> Sounds like a good idea to me. :)
>> 
>> [...]
>> > diff --git a/hw/qdev.c b/hw/qdev.c
>> > index 689cd54..f30d251 100644
>> > --- a/hw/qdev.c
>> > +++ b/hw/qdev.c
>> > @@ -29,6 +29,7 @@
>> >  #include "sysemu/sysemu.h"
>> >  #include "qapi/error.h"
>> >  #include "qapi/visitor.h"
>> > +#include "qapi/qmp/qjson.h"
>> >  
>> >  int qdev_hotplug = 0;
>> >  static bool qdev_hot_added = false;
>> > @@ -267,6 +268,11 @@ void qdev_init_nofail(DeviceState *dev)
>> >  /* Unlink device from bus and free the structure.  */
>> >  void qdev_free(DeviceState *dev)
>> >  {
>> > +if (dev->id) {
>> > +QObject *data = qobject_from_jsonf("{ 'device': %s }", dev->id);
>> > +monitor_protocol_event(QEVENT_DEVICE_DELETED, data);
>> > +qobject_decref(data);
>> > +}
>> >  object_unparent(OBJECT(dev));
>> >  }
>> >  
>> 
>> I'm pretty sure this is the wrong place to fire the notification. We
>> should rather do this when the device is actually deleted - which
>> qdev_free() does *not* actually guarantee, as criticized in the s390x
>> and unref'ing contexts.
>> I would suggest to place your code into device_unparent() instead.
>> 
>> Another thing to consider is what data to pass to the event: Not all
>> devices have an ID.
>
> If they don't they were not created by management so management is
> probably not interested in them being removed.
>
> We could always add a 'path' key later if this assumption
> proves incorrect.

In old qdev, ID was all we had, because paths were busted.  Thus,
management had no choice but use IDs.

If I understand modern qdev correctly, we got a canonical path.  Old
APIs like device_del still accept only ID.  Should new APIs still be
designed that way?  Or should they always accept / provide the canonical
path, plus optional ID for convenience?

>> We should still have a canonical path when we fire
>> this event in either qdev_free() or in device_unparent() before the if
>> (dev->parent_bus) block though. That would be a question for Anthony,
>> not having a use case for the event I am indifferent there.
>> 
>> Further, thinking of objects such as virtio-rng backends or future
>> blockdev/chardev objects, might it make sense to turn this into a
>> generic object deletion event rather than a device event?
>> 
>> Andreas
>
> Backend deletion doesn't normally have guest interaction right?
> So why do we need an event?

We need an event because device_del may send its reply before it
completes the job.

device_del does that when it deletion needs to interact with the guest,
which can take unbounded time.

Conversely, we don't need an event when a QMP always completes the job
(as far as observable by the QMP client) before it sends its reply.  Off
hand, I can't see why backend deletion would do anything else.

I'm always reluctant to abstract when there are fewer than two
different, concrete things to abstract from.  Right now, we got just
one: device models.

Re: [Qemu-devel] qemu-img do not verify encrytption

2013-03-07 Thread Stefan Hajnoczi

On Thu, Mar 07, 2013 at 02:49:30PM +0800, yue-kvm wrote:
> i create qcow2 format image with -o encryption, and assign password.
> 
> but when i qemu-img info encryt.qcow2 ,  it display encryt.qcow2 info no 
> master  what i input, even just hit enter.
> the 'Password Authentication ' of qemu-img may be invalid.

Don't worry, your data is encrypted.  The qemu-img info password check
is unnecessary because qcow2 metadata is unencrypted.

I didn't get a password prompt from qemu-img info on an encrypted qcow2
file using qemu.git/master.

Which version of qemu-img are you using?  Perhaps the unnecessary prompt
has been fixed in a later version.

Stefan

Re: [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking

2013-03-07 Thread Kevin Wolf

Am 07.03.2013 um 10:25 hat Peter Lieven geschrieben:
> 
> Am 07.03.2013 um 10:22 schrieb Kevin Wolf :
> 
> > Am 07.03.2013 um 10:16 hat Peter Lieven geschrieben:
>  If bs->growable is 1 for all drivers, whats the fix status of 
>  CVE-2008-0928? This
>  flag was introduced as a fix for this problem.
>  
>  bdrv_check_byte_request() does nothing useful if bs->growable is 1.
> >>> 
> >>> Don't ignore the difference between bdrv_open() and bdrv_file_open().
> >>> Typically you have two BDSes: On top there is e.g. a qcow2 BDS that is
> >>> opened through bdrv_open() and has bs->growable = false. Its bs->file is
> >>> using the file protocol (raw-posix driver) and opened by
> >>> bdrv_file_open(). This one has bs->file->growable = true so that qcow2
> >>> can write to newly allocated areas without calling bdrv_truncate()
> >>> first.
> >> 
> >> Sorry, I have to admin I am little confused by what is happening in 
> >> bdrv_open().
> >> 
> >> However, what I can say is that bs->growable is 1 for an iSCSI backed
> >> harddrive and I wonder how this can happen if bdrv_file_open is not used 
> >> for
> >> opening it because that is the only place where bs->growable is set to 1.
> >> 
> >> cmdline:
> >> x86_64-softmmu/qemu-system-x86_64 -k de -enable-kvm -m 1024 -drive 
> >> if=virtio,file=iscsi://172.21.200.31/iqn.2001-05.com.equallogic:0-8a0906-16470e107-713001aa6de511e0-001-test/0
> >>  -vnc :1 -boot dc -monitor stdio
> > 
> > It is used for the iscsi driver. You have a raw BDS (growable == false)
> > on top of an iscsi one (growable == true).
> 
> Ok, but growable == true is wrong for the iSCSI driver isn`t it?

I guess it depends on your definition. If you say that growable includes
the capability of growing the image, then yes, it's wrong. If you only
interpret it as permission to write beyond EOF (if the driver supports
that), then it's right even though any such attempt will result in an
error.

Practically speaking, the difference is between -EIO returned from
bdrv_check_byte_request() and -EIO from iscsi_aio_write16_cb(), i.e.
the result is the same.

Kevin

Re: [Qemu-devel] [PATCH] sysbus: add no_user for devices using mmio or IRQ or GPIO

2013-03-07 Thread Peter Maydell

On 7 March 2013 16:48, Markus Armbruster  wrote:
> Can we make some progress towards something that makes more sense?
>
> First step: reasons for marking a device no_user.
>
> From a user point of view, I think there's just one: -device/device_add
> cannot possibly result in a working device.  Coherent enough semantics
> for no_user, in my opinion, but I readily concede it's not particularly
> useful for maintaining it as infrastructure and device models evolve.

Ideally it would be nice to move to a model where the error message
was "the device you've created has some required connections which
haven't been wired up", and then for some devices you'd always get
that error [until we implemented syntax for doing the wiring] and
for some you'd only get it if you messed up the command line switches.

> For that, we need to drill down into reasons for "cannot possibly work".
> Here are two, please add more:
>
> * Can't make required connections

> * Resource collision with board
>
>   The device must connect to some fixed resource, but the board already
>   connects something to it.  Without no_user, this should result in an
>   error message of sorts, which is much better than the silent breakage
>   above.  Whether the message makes any sense to the user is a different
>   question.

Do we have any concrete examples of this? I don't think devices should
be making connections to resources themselves. If the only things wiring
up devices are (a) the board models and (b) anything the user says on
the command line, then the conflict only happens if the user says
-device foo,bar=0x42 and bar 0x42 is in use; in that case the message
will make sense to them.

I think a third case for no-user is "this device is actually an
abstract base class" (eg arm_gic_common), which we could deal with
by checking the TypeInfo instead.

Case four: "we really don't expect anybody to be trying to wire this
up dynamically", which would apply to things like the on-cpu peripherals
for some ARM cores. There it is really just an attempt at being friendly
by cutting down the length of the devices list.

Speaking of friendlyness, it might be helpful if the '-devices help'
list was somehow more structured than a single long list and if
more devices had description text?

>>  [and I don't think "this device
>> can be added via the monitor but not the command line"
>> counts as consistent or coherent...]
>
> no_user applies equally to -device and device_add.

In the codebase as it stands, it applies only to -device.
I agree that we should be consistent here, which we could do
by applying Christian's patch or some variation to make device_add
honour no_user. (Or by removing no_user altogether :-))

-- PMM

Re: [Qemu-devel] [PATCH 3/5] aio: add a ThreadPool instance to AioContext

2013-03-07 Thread Stefan Hajnoczi

On Wed, Mar 06, 2013 at 06:24:37PM +0100, Paolo Bonzini wrote:
> Il 06/03/2013 16:45, Stefan Hajnoczi ha scritto:
> > This patch adds a ThreadPool to AioContext.  It's possible that some
> > AioContext instances will never use the ThreadPool, so defer creation
> > until aio_get_thread_pool().
> 
> What lock should protect against doing this twice?

aio_get_thread_pool() is not thread-safe and I don't think it needs to
be.  I imagine it will only be called from the thread that runs the
AioContext.

Stefan

Re: [Qemu-devel] [PATCH V7 0/5] Send the gratuitous by guest

2013-03-07 Thread Michael S. Tsirkin

On Thu, Mar 07, 2013 at 04:23:46PM +0800, Jason Wang wrote:
> This series tries to let guest instead of qemu to send the gratuitous packets
> after migration when guest is capable of doing this. This is needed since it's
> impossible for qemu to keep track of all configurations (e.g 802.1Q) and mac
> addresses (more than one mac address may be used by guest). So qemu can't 
> build
> gratuitous packets for all those configurations properly. The only solution is
> let guest driver who knew all needed information to do this.
> 
> The series first introduces a new runstate which just tracks the state when 
> the
> migration is finished and guest is about to start. And then we can just trying
> to notify the guest to send the GARP after changing from this state to
> running. A model specific announcing method were also also introduced to let
> each kinds of nic do its own notification. When there's no such method 
> register
> for the nic, the old style of sending RARP were kept. And the last two patches
> implemented the virtio-net method of notification.

Do we want to retry SELF_ANNOUNCE_ROUNDS?

> Changes from V6:
> - introduce a new runstate instead of using a global variable check the state
> 
> Changes from V5:
> - use a global variable to decide whether an announcement is needed after 
> migration
> - align with virtio spec and let guest ack the announcement notification 
> through
>   control vq instead of config status writing
> 
> Changes from V4:
> - keep the old behavior that send the gratuitous packets only after migration

I wonder why it's a sane thing to do. How about simply sending the event after 
load?

> - decide whether to send gratuitous packets by previous runstate instead of a 
> dedicated parameter
> - check virtio_net_started() instead of VIRTIO_NET_S_LINK_UP before issue the 
> config update interrupt
> - move VIRTIO_NET_S_ANNOUNCE to 0x100 and supress guest config write to RO 
> bits
> - cleanups suggested by Michael
> 
> Tested with migration within 802.1Q.
> 
> Jason Wang (5):
>   runstate: introduce prelaunch-migrate state
>   net: announce self after vm is started
>   net: model specific announcing support
>   virtio-net: notify guest to annouce itself
>   virtio-net: compat guest announce
> 
>  hw/pc.h   |6 +-
>  hw/virtio-net.c   |   30 ++
>  hw/virtio-net.h   |   15 ++-
>  include/net/net.h |2 ++
>  migration.c   |4 +---
>  qapi-schema.json  |5 -
>  savevm.c  |8 ++--
>  vl.c  |8 +++-
>  8 files changed, 69 insertions(+), 9 deletions(-)

Re: [Qemu-devel] [PATCH] qdev: DEVICE_DELETED event

2013-03-07 Thread Michael S. Tsirkin

On Thu, Mar 07, 2013 at 10:55:23AM +0100, Markus Armbruster wrote:
> "Michael S. Tsirkin"  writes:
> 
> > On Wed, Mar 06, 2013 at 02:57:22PM +0100, Andreas Färber wrote:
> >> Am 06.03.2013 14:00, schrieb Michael S. Tsirkin:
> >> > libvirt has a long-standing bug: when removing the device,
> >> > it can request removal but does not know when does the
> >> > removal complete. Add an event so we can fix this in a robust way.
> >> > 
> >> > Signed-off-by: Michael S. Tsirkin 
> >> 
> >> Sounds like a good idea to me. :)
> >> 
> >> [...]
> >> > diff --git a/hw/qdev.c b/hw/qdev.c
> >> > index 689cd54..f30d251 100644
> >> > --- a/hw/qdev.c
> >> > +++ b/hw/qdev.c
> >> > @@ -29,6 +29,7 @@
> >> >  #include "sysemu/sysemu.h"
> >> >  #include "qapi/error.h"
> >> >  #include "qapi/visitor.h"
> >> > +#include "qapi/qmp/qjson.h"
> >> >  
> >> >  int qdev_hotplug = 0;
> >> >  static bool qdev_hot_added = false;
> >> > @@ -267,6 +268,11 @@ void qdev_init_nofail(DeviceState *dev)
> >> >  /* Unlink device from bus and free the structure.  */
> >> >  void qdev_free(DeviceState *dev)
> >> >  {
> >> > +if (dev->id) {
> >> > +QObject *data = qobject_from_jsonf("{ 'device': %s }", dev->id);
> >> > +monitor_protocol_event(QEVENT_DEVICE_DELETED, data);
> >> > +qobject_decref(data);
> >> > +}
> >> >  object_unparent(OBJECT(dev));
> >> >  }
> >> >  
> >> 
> >> I'm pretty sure this is the wrong place to fire the notification. We
> >> should rather do this when the device is actually deleted - which
> >> qdev_free() does *not* actually guarantee, as criticized in the s390x
> >> and unref'ing contexts.
> >> I would suggest to place your code into device_unparent() instead.
> >> 
> >> Another thing to consider is what data to pass to the event: Not all
> >> devices have an ID.
> >
> > If they don't they were not created by management so management is
> > probably not interested in them being removed.
> >
> > We could always add a 'path' key later if this assumption
> > proves incorrect.
> 
> In old qdev, ID was all we had, because paths were busted.  Thus,
> management had no choice but use IDs.
> 
> If I understand modern qdev correctly, we got a canonical path.  Old
> APIs like device_del still accept only ID.  Should new APIs still be
> designed that way?  Or should they always accept / provide the canonical
> path, plus optional ID for convenience?

What are advantages of exposing the path to users in this way?
Looks like maintainance hassle without real benefits?

> >> We should still have a canonical path when we fire
> >> this event in either qdev_free() or in device_unparent() before the if
> >> (dev->parent_bus) block though. That would be a question for Anthony,
> >> not having a use case for the event I am indifferent there.
> >> 
> >> Further, thinking of objects such as virtio-rng backends or future
> >> blockdev/chardev objects, might it make sense to turn this into a
> >> generic object deletion event rather than a device event?
> >> 
> >> Andreas
> >
> > Backend deletion doesn't normally have guest interaction right?
> > So why do we need an event?
> 
> We need an event because device_del may send its reply before it
> completes the job.
> 
> device_del does that when it deletion needs to interact with the guest,
> which can take unbounded time.
> 
> Conversely, we don't need an event when a QMP always completes the job
> (as far as observable by the QMP client) before it sends its reply.  Off
> hand, I can't see why backend deletion would do anything else.
> 
> I'm always reluctant to abstract when there are fewer than two
> different, concrete things to abstract from.  Right now, we got just
> one: device models.

Re: [Qemu-devel] [PATCH 5/5] threadpool: drop global thread pool

2013-03-07 Thread Stefan Hajnoczi

On Wed, Mar 06, 2013 at 05:35:09PM +0100, Paolo Bonzini wrote:
> Il 06/03/2013 16:45, Stefan Hajnoczi ha scritto:
> > Now that each AioContext has a ThreadPool and the main loop AioContext
> > can be fetched with qemu_get_aio_context(), we can eliminate the concept
> > of a global thread pool from thread-pool.c.
> > 
> > The submit functions must take a ThreadPool* argument.
> 
> This is certainly ok for thread-pool.c.  For raw-posix and raw-win32,
> what about adding already a bdrv_get_aio_context() function and using
> that in paio_submit?  Is it putting the cart before the horse?

Thanks, we might as well do that.  Then I won't have to go back and
modify block/raw-posix.c and block/raw-win32.c in the next patch series
which lets BlockDriverState bind to an AioContext.

Stefan

Re: [Qemu-devel] [PATCH] coroutine: use AioContext for CoQueue BH

2013-03-07 Thread Stefan Hajnoczi

On Wed, Mar 06, 2013 at 04:03:49PM +0100, Paolo Bonzini wrote:
> Il 06/03/2013 15:53, Stefan Hajnoczi ha scritto:
> > CoQueue uses a BH to awake coroutines that were made ready to run again
> > using qemu_co_queue_next() or qemu_co_queue_restart_all().  The BH
> > currently runs in the iothread AioContext and would break coroutines
> > that run in a different AioContext.
> > 
> > This is a slightly tricky problem because the lifetime of the BH exceeds
> > that of the CoQueue.  This means coroutines can be awoken after CoQueue
> > itself has been freed.  Also, there is no qemu_co_queue_destroy()
> > function which we could use to handle freeing resources.
> > 
> > Introducing qemu_co_queue_destroy() has a ripple effect of requiring us
> > to also add qemu_co_mutex_destroy() and qemu_co_rwlock_destroy(), as
> > well as updating all callers.  Avoid doing that.
> > 
> > We also cannot switch from BH to GIdle function because aio_poll() does
> > not dispatch GIdle functions.  (GIdle functions make memory management
> > slightly easier because they free themselves.)
> > 
> > Finally, I don't want to move unlock_queue and unlock_bh into
> > AioContext.  That would break encapsulation - AioContext isn't supposed
> > to know about CoQueue.
> > 
> > This patch implements a different solution: each qemu_co_queue_next() or
> > qemu_co_queue_restart_all() call creates a new BH and list of coroutines
> > to wake up.  Callers tend to invoke qemu_co_queue_next() and
> > qemu_co_queue_restart_all() occasionally after blocking I/O, so creating
> > a new BH for each call shouldn't be massively inefficient.
> > 
> > Note that this patch does not add an interface for specifying the
> > AioContext.  That is left to future patches which will convert CoQueue,
> > CoMutex, and CoRwlock to expose AioContext.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> >  include/block/coroutine.h |  1 +
> >  qemu-coroutine-lock.c | 59 
> > ---
> >  2 files changed, 42 insertions(+), 18 deletions(-)
> > 
> > diff --git a/include/block/coroutine.h b/include/block/coroutine.h
> > index c31fae3..a978162 100644
> > --- a/include/block/coroutine.h
> > +++ b/include/block/coroutine.h
> > @@ -104,6 +104,7 @@ bool qemu_in_coroutine(void);
> >   */
> >  typedef struct CoQueue {
> >  QTAILQ_HEAD(, Coroutine) entries;
> > +AioContext *ctx;
> >  } CoQueue;
> >  
> >  /**
> > diff --git a/qemu-coroutine-lock.c b/qemu-coroutine-lock.c
> > index 97ef01c..ae986b3 100644
> > --- a/qemu-coroutine-lock.c
> > +++ b/qemu-coroutine-lock.c
> > @@ -29,28 +29,34 @@
> >  #include "block/aio.h"
> >  #include "trace.h"
> >  
> > -static QTAILQ_HEAD(, Coroutine) unlock_bh_queue =
> > -QTAILQ_HEAD_INITIALIZER(unlock_bh_queue);
> > -static QEMUBH* unlock_bh;
> > +/* Coroutines are awoken from a BH to allow the current coroutine to 
> > complete
> > + * its flow of execution.  The BH may run after the CoQueue has been 
> > destroyed,
> > + * so keep BH data in a separate heap-allocated struct.
> > + */
> > +typedef struct {
> > +QEMUBH *bh;
> > +QTAILQ_HEAD(, Coroutine) entries;
> > +} CoQueueNextData;
> >  
> >  static void qemu_co_queue_next_bh(void *opaque)
> >  {
> > +CoQueueNextData *data = opaque;
> >  Coroutine *next;
> >  
> >  trace_qemu_co_queue_next_bh();
> > -while ((next = QTAILQ_FIRST(&unlock_bh_queue))) {
> > -QTAILQ_REMOVE(&unlock_bh_queue, next, co_queue_next);
> > +while ((next = QTAILQ_FIRST(&data->entries))) {
> > +QTAILQ_REMOVE(&data->entries, next, co_queue_next);
> >  qemu_coroutine_enter(next, NULL);
> >  }
> > +
> > +qemu_bh_delete(data->bh);
> > +g_slice_free(CoQueueNextData, data);
> >  }
> >  
> >  void qemu_co_queue_init(CoQueue *queue)
> >  {
> >  QTAILQ_INIT(&queue->entries);
> > -
> > -if (!unlock_bh) {
> > -unlock_bh = qemu_bh_new(qemu_co_queue_next_bh, NULL);
> > -}
> > +queue->ctx = NULL;
> 
> What about adding an accessor for qemu_aio_context and using it?  Then
> you can just use aio_bh_new in qemu_co_queue_do_restart.

Your wish is my command.  I'll add this patch to the threadpool series
where I've already introduced qemu_get_aio_context().

Stefan

Re: [Qemu-devel] [PATCH V7 0/5] Send the gratuitous by guest

2013-03-07 Thread Jason Wang

On 03/07/2013 06:04 PM, Michael S. Tsirkin wrote:
> On Thu, Mar 07, 2013 at 04:23:46PM +0800, Jason Wang wrote:
>> This series tries to let guest instead of qemu to send the gratuitous packets
>> after migration when guest is capable of doing this. This is needed since 
>> it's
>> impossible for qemu to keep track of all configurations (e.g 802.1Q) and mac
>> addresses (more than one mac address may be used by guest). So qemu can't 
>> build
>> gratuitous packets for all those configurations properly. The only solution 
>> is
>> let guest driver who knew all needed information to do this.
>>
>> The series first introduces a new runstate which just tracks the state when 
>> the
>> migration is finished and guest is about to start. And then we can just 
>> trying
>> to notify the guest to send the GARP after changing from this state to
>> running. A model specific announcing method were also also introduced to let
>> each kinds of nic do its own notification. When there's no such method 
>> register
>> for the nic, the old style of sending RARP were kept. And the last two 
>> patches
>> implemented the virtio-net method of notification.
> Do we want to retry SELF_ANNOUNCE_ROUNDS?

Yes, we do the announcement several times like in the past.
>> Changes from V6:
>> - introduce a new runstate instead of using a global variable check the state
>>
>> Changes from V5:
>> - use a global variable to decide whether an announcement is needed after 
>> migration
>> - align with virtio spec and let guest ack the announcement notification 
>> through
>>   control vq instead of config status writing
>>
>> Changes from V4:
>> - keep the old behavior that send the gratuitous packets only after migration
> I wonder why it's a sane thing to do. How about simply sending the event 
> after load?

The aim is to limit the change of the behaviour to focus on migration.

We may also need this after cont, and then maybe we can just do this
unconditionally in vm_start().
>> - decide whether to send gratuitous packets by previous runstate instead of 
>> a dedicated parameter
>> - check virtio_net_started() instead of VIRTIO_NET_S_LINK_UP before issue 
>> the config update interrupt
>> - move VIRTIO_NET_S_ANNOUNCE to 0x100 and supress guest config write to RO 
>> bits
>> - cleanups suggested by Michael
>>
>> Tested with migration within 802.1Q.
>>
>> Jason Wang (5):
>>   runstate: introduce prelaunch-migrate state
>>   net: announce self after vm is started
>>   net: model specific announcing support
>>   virtio-net: notify guest to annouce itself
>>   virtio-net: compat guest announce
>>
>>  hw/pc.h   |6 +-
>>  hw/virtio-net.c   |   30 ++
>>  hw/virtio-net.h   |   15 ++-
>>  include/net/net.h |2 ++
>>  migration.c   |4 +---
>>  qapi-schema.json  |5 -
>>  savevm.c  |8 ++--
>>  vl.c  |8 +++-
>>  8 files changed, 69 insertions(+), 9 deletions(-)

[Qemu-devel] [PATCH 0/7] linux-user updates

2013-03-07 Thread riku . voipio

From: Riku Voipio 

Hi,

I did a dig through the archive for linux-user patches
that have slipped through during my inactivity. The ones
in this patchset appear good and pass smoketest.

I will send these as pull request once I'm back from travel 
and can update my git repository.

Riku

Dillon Amburgey (3):
  linux-user: Add Alpha socket constants
  linux-user: Support setgroups syscall with no groups
  linux-user: Add more sparc syscall numbers

John Rigby (2):
  linux-user/syscall.c: handle FUTEX_WAIT_BITSET in do_futex
  linux-user: fix futex strace of FUTEX_CLOCK_REALTIME

Laurent Vivier (2):
  linux-user: improve print_fcntl()
  linux-user: correct semctl() and shmctl()

 linux-user/socket.h   |  69 
 linux-user/sparc/syscall_nr.h |   2 +
 linux-user/strace.c   | 103 ++
 linux-user/syscall.c  |  81 -
 4 files changed, 205 insertions(+), 50 deletions(-)

-- 
1.8.1.2

[Qemu-devel] [PATCH 1/7] linux-user: Add Alpha socket constants

2013-03-07 Thread riku . voipio

From: Dillon Amburgey 

Without these, some networking programs will not work

Signed-off-by: Dillon Amburgey 
Reviewed-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 linux-user/socket.h | 69 +
 1 file changed, 69 insertions(+)

diff --git a/linux-user/socket.h b/linux-user/socket.h
index 93d4782..339cae5 100644
--- a/linux-user/socket.h
+++ b/linux-user/socket.h
@@ -87,6 +87,75 @@
 
#define TARGET_SOCK_MAX (SOCK_PACKET + 1)
 
+#elif defined(TARGET_ALPHA)
+
+/* For setsockopt(2) */
+#define TARGET_SOL_SOCKET   0x
+
+#define TARGET_SO_DEBUG 0x0001
+#define TARGET_SO_REUSEADDR 0x0004
+#define TARGET_SO_KEEPALIVE 0x0008
+#define TARGET_SO_DONTROUTE 0x0010
+#define TARGET_SO_BROADCAST 0x0020
+#define TARGET_SO_LINGER0x0080
+#define TARGET_SO_OOBINLINE 0x0100
+/* To add :#define TARGET_SO_REUSEPORT 0x0200 */
+
+#define TARGET_SO_TYPE  0x1008
+#define TARGET_SO_ERROR 0x1007
+#define TARGET_SO_SNDBUF0x1001
+#define TARGET_SO_RCVBUF0x1002
+#define TARGET_SO_SNDBUFFORCE   0x100a
+#define TARGET_SO_RCVBUFFORCE   0x100b
+#define TARGET_SO_RCVLOWAT  0x1010
+#define TARGET_SO_SNDLOWAT  0x1011
+#define TARGET_SO_RCVTIMEO  0x1012
+#define TARGET_SO_SNDTIMEO  0x1013
+#define TARGET_SO_ACCEPTCONN0x1014
+#define TARGET_SO_PROTOCOL  0x1028
+#define TARGET_SO_DOMAIN0x1029
+
+/* linux-specific, might as well be the same as on i386 */
+#define TARGET_SO_NO_CHECK  11
+#define TARGET_SO_PRIORITY  12
+#define TARGET_SO_BSDCOMPAT 14
+
+#define TARGET_SO_PASSCRED  17
+#define TARGET_SO_PEERCRED  18
+#define TARGET_SO_BINDTODEVICE 25
+
+/* Socket filtering */
+#define TARGET_SO_ATTACH_FILTER26
+#define TARGET_SO_DETACH_FILTER27
+
+#define TARGET_SO_PEERNAME  28
+#define TARGET_SO_TIMESTAMP 29
+#define TARGET_SCM_TIMESTAMPTARGET_SO_TIMESTAMP
+
+#define TARGET_SO_PEERSEC   30
+#define TARGET_SO_PASSSEC   34
+#define TARGET_SO_TIMESTAMPNS   35
+#define TARGET_SCM_TIMESTAMPNS  TARGET_SO_TIMESTAMPNS
+
+/* Security levels - as per NRL IPv6 - don't actually do anything */
+#define TARGET_SO_SECURITY_AUTHENTICATION   19
+#define TARGET_SO_SECURITY_ENCRYPTION_TRANSPORT 20
+#define TARGET_SO_SECURITY_ENCRYPTION_NETWORK   21
+
+#define TARGET_SO_MARK  36
+
+#define TARGET_SO_TIMESTAMPING  37
+#define TARGET_SCM_TIMESTAMPING TARGET_SO_TIMESTAMPING
+
+#define TARGET_SO_RXQ_OVFL 40
+
+#define TARGET_SO_WIFI_STATUS   41
+#define TARGET_SCM_WIFI_STATUS  TARGET_SO_WIFI_STATUS
+#define TARGET_SO_PEEK_OFF  42
+
+/* Instruct lower device to use last 4-bytes of skb data as FCS */
+#define TARGET_SO_NOFCS 43
+
 #else
 
/* For setsockopt(2) */
-- 
1.8.1.2

[Qemu-devel] [PATCH 4/7] linux-user: fix futex strace of FUTEX_CLOCK_REALTIME

2013-03-07 Thread riku . voipio

From: John Rigby 

Handle same as existing FUTEX_PRIVATE_FLAG.

Signed-off-by: John Rigby 
Signed-off-by: Riku Voipio 
---
 linux-user/strace.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/linux-user/strace.c b/linux-user/strace.c
index 9a18146..0fbae3c 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -1498,6 +1498,12 @@ if( cmd == val ) { \
 cmd &= ~FUTEX_PRIVATE_FLAG;
 }
 #endif
+#ifdef FUTEX_CLOCK_REALTIME
+if (cmd & FUTEX_CLOCK_REALTIME) {
+gemu_log("FUTEX_CLOCK_REALTIME|");
+cmd &= ~FUTEX_CLOCK_REALTIME;
+}
+#endif
 print_op(FUTEX_WAIT)
 print_op(FUTEX_WAKE)
 print_op(FUTEX_FD)
-- 
1.8.1.2

[Qemu-devel] [PATCH 2/7] linux-user: improve print_fcntl()

2013-03-07 Thread riku . voipio

From: Laurent Vivier 

Signed-off-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
---
 linux-user/strace.c | 97 +++--
 1 file changed, 79 insertions(+), 18 deletions(-)

diff --git a/linux-user/strace.c b/linux-user/strace.c
index 4e91a6e..9a18146 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -462,18 +462,6 @@ UNUSED static struct flags mmap_flags[] = {
 FLAG_END,
 };
 
-UNUSED static struct flags fcntl_flags[] = {
-FLAG_TARGET(F_DUPFD),
-FLAG_TARGET(F_GETFD),
-FLAG_TARGET(F_SETFD),
-FLAG_TARGET(F_GETFL),
-FLAG_TARGET(F_SETFL),
-FLAG_TARGET(F_GETLK),
-FLAG_TARGET(F_SETLK),
-FLAG_TARGET(F_SETLKW),
-FLAG_END,
-};
-
 UNUSED static struct flags clone_flags[] = {
 FLAG_GENERIC(CLONE_VM),
 FLAG_GENERIC(CLONE_FS),
@@ -867,12 +855,85 @@ print_fcntl(const struct syscallname *name,
 {
 print_syscall_prologue(name);
 print_raw_param("%d", arg0, 0);
-print_flags(fcntl_flags, arg1, 0);
-/*
- * TODO: check flags and print following argument only
- *   when needed.
- */
-print_pointer(arg2, 1);
+switch(arg1) {
+case TARGET_F_DUPFD:
+gemu_log("F_DUPFD,");
+print_raw_param(TARGET_ABI_FMT_ld, arg2, 1);
+break;
+case TARGET_F_GETFD:
+gemu_log("F_GETFD");
+break;
+case TARGET_F_SETFD:
+gemu_log("F_SETFD,");
+print_raw_param(TARGET_ABI_FMT_ld, arg2, 1);
+break;
+case TARGET_F_GETFL:
+gemu_log("F_GETFL");
+break;
+case TARGET_F_SETFL:
+gemu_log("F_SETFL,");
+print_open_flags(arg2, 1);
+break;
+case TARGET_F_GETLK:
+gemu_log("F_GETLK,");
+print_pointer(arg2, 1);
+break;
+case TARGET_F_SETLK:
+gemu_log("F_SETLK,");
+print_pointer(arg2, 1);
+break;
+case TARGET_F_SETLKW:
+gemu_log("F_SETLKW,");
+print_pointer(arg2, 1);
+break;
+case TARGET_F_GETOWN:
+gemu_log("F_GETOWN");
+break;
+case TARGET_F_SETOWN:
+gemu_log("F_SETOWN,");
+print_raw_param(TARGET_ABI_FMT_ld, arg2, 0);
+break;
+case TARGET_F_GETSIG:
+gemu_log("F_GETSIG");
+break;
+case TARGET_F_SETSIG:
+gemu_log("F_SETSIG,");
+print_raw_param(TARGET_ABI_FMT_ld, arg2, 0);
+break;
+#if TARGET_ABI_BITS == 32
+case TARGET_F_GETLK64:
+gemu_log("F_GETLK64,");
+print_pointer(arg2, 1);
+break;
+case TARGET_F_SETLK64:
+gemu_log("F_SETLK64,");
+print_pointer(arg2, 1);
+break;
+case TARGET_F_SETLKW64:
+gemu_log("F_SETLKW64,");
+print_pointer(arg2, 1);
+break;
+#endif
+case TARGET_F_SETLEASE:
+gemu_log("F_SETLEASE,");
+print_raw_param(TARGET_ABI_FMT_ld, arg2, 0);
+break;
+case TARGET_F_GETLEASE:
+gemu_log("F_GETLEASE");
+break;
+case TARGET_F_DUPFD_CLOEXEC:
+gemu_log("F_DUPFD_CLOEXEC,");
+print_raw_param(TARGET_ABI_FMT_ld, arg2, 1);
+break;
+case TARGET_F_NOTIFY:
+gemu_log("F_NOTIFY,");
+print_raw_param(TARGET_ABI_FMT_ld, arg2, 0);
+break;
+default:
+print_raw_param(TARGET_ABI_FMT_ld, arg1, 0);
+print_pointer(arg2, 1);
+break;
+}
 print_syscall_epilogue(name);
 }
 #define print_fcntl64   print_fcntl
-- 
1.8.1.2

[Qemu-devel] [PATCH 5/7] linux-user: correct semctl() and shmctl()

2013-03-07 Thread riku . voipio

From: Laurent Vivier 

The parameter "union semun" of semctl() is not a value
but a pointer to the value.

Moreover, all fields of target_su must be swapped (if needed).

The third argument of shmctl is a pointer.

WITHOUT this patch:

$ ipcs

kernel not configured for shared memory

qemu: uncaught target signal 11 (Segmentation fault) - core dumped

WITH this patch:

$ ipcs

-- Shared Memory Segments 
keyshmid  owner  perms  bytes  nattch status
0x4e545030 0  root  60096 1
0x4e545031 32769  root  60096 1
0x4e545032 65538  root  66696 1
0x4e545033 98307  root  66696 1
0x47505344 131076 root  6668240   1
0x3c81b7f5 163845 laurent   6664096   0
0x 729513990  laurent   600393216 2  dest
0x 729546759  laurent   600393216 2  dest
0x 1879179273 laurent   600393216 2  dest

-- Semaphore Arrays 
keysemid  owner  perms  nsems
0x3c81b7f6 32768  laurent   6661
0x1c44ac47 6586369laurent   6001

-- Message Queues 
keymsqid  owner  perms  used-bytes   messages
0x1c44ac45 458752 laurent60000
0x1c44ac46 491521 laurent60000

Signed-off-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall.c | 56 
 1 file changed, 35 insertions(+), 21 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index c7fcfc0..7f12563 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -2652,8 +2652,9 @@ static inline abi_long host_to_target_semarray(int semid, 
abi_ulong target_addr,
 }
 
 static inline abi_long do_semctl(int semid, int semnum, int cmd,
- union target_semun target_su)
+ abi_ulong ptr)
 {
+union target_semun *target_su;
 union semun arg;
 struct semid_ds dsarg;
 unsigned short *array = NULL;
@@ -2662,43 +2663,55 @@ static inline abi_long do_semctl(int semid, int semnum, 
int cmd,
 abi_long err;
 cmd &= 0xff;
 
+if (!lock_user_struct(VERIFY_READ, target_su, ptr, 1)) {
+return -TARGET_EFAULT;
+}
 switch( cmd ) {
case GETVAL:
case SETVAL:
-arg.val = tswap32(target_su.val);
+arg.val = tswap32(target_su->val);
 ret = get_errno(semctl(semid, semnum, cmd, arg));
-target_su.val = tswap32(arg.val);
+target_su->val = tswap32(arg.val);
 break;
case GETALL:
case SETALL:
-err = target_to_host_semarray(semid, &array, target_su.array);
-if (err)
-return err;
+err = target_to_host_semarray(semid, &array,
+  tswapal(target_su->array));
+if (err) {
+ret = err;
+break;
+}
 arg.array = array;
 ret = get_errno(semctl(semid, semnum, cmd, arg));
-err = host_to_target_semarray(semid, target_su.array, &array);
-if (err)
-return err;
+err = host_to_target_semarray(semid, tswapal(target_su->array),
+  &array);
+if (err) {
+ret = err;
+}
 break;
case IPC_STAT:
case IPC_SET:
case SEM_STAT:
-err = target_to_host_semid_ds(&dsarg, target_su.buf);
-if (err)
-return err;
+err = target_to_host_semid_ds(&dsarg, tswapal(target_su->buf));
+if (err) {
+ret = err;
+break;
+}
 arg.buf = &dsarg;
 ret = get_errno(semctl(semid, semnum, cmd, arg));
-err = host_to_target_semid_ds(target_su.buf, &dsarg);
-if (err)
-return err;
+err = host_to_target_semid_ds(tswapal(target_su->buf), &dsarg);
+if (err) {
+ret = err;
+}
 break;
case IPC_INFO:
case SEM_INFO:
 arg.__buf = &seminfo;
 ret = get_errno(semctl(semid, semnum, cmd, arg));
-err = host_to_target_seminfo(target_su.__buf, &seminfo);
-if (err)
-return err;
+err = host_to_target_seminfo(tswapal(target_su->__buf), &seminfo);
+if (err) {
+ret = err;
+}
 break;
case IPC_RMID:
case GETPID:
@@ -2707,6 +2720,7 @@ static inline abi_long do_semctl(int semid, int semnum, 
int cmd,
 ret = get_errno(semctl(semid, semnum, cmd, NULL));
 break;
 }
+unlock_user_struct(target_su, ptr, 0);

[Qemu-devel] [PATCH 6/7] linux-user: Support setgroups syscall with no groups

2013-03-07 Thread riku . voipio

From: Dillon Amburgey 

Signed-off-by: Dillon Amburgey 
Reviewed-by: Peter Maydell 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 7f12563..e0c71fb 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -7694,18 +7694,20 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 {
 int gidsetsize = arg1;
 target_id *target_grouplist;
-gid_t *grouplist;
+gid_t *grouplist = NULL;
 int i;
-
-grouplist = alloca(gidsetsize * sizeof(gid_t));
-target_grouplist = lock_user(VERIFY_READ, arg2, gidsetsize * 2, 1);
-if (!target_grouplist) {
-ret = -TARGET_EFAULT;
-goto fail;
+if (gidsetsize) {
+grouplist = alloca(gidsetsize * sizeof(gid_t));
+target_grouplist = lock_user(VERIFY_READ, arg2, gidsetsize * 
2, 1);
+if (!target_grouplist) {
+ret = -TARGET_EFAULT;
+goto fail;
+}
+for (i = 0; i < gidsetsize; i++) {
+grouplist[i] = low2highgid(tswapid(target_grouplist[i]));
+}
+unlock_user(target_grouplist, arg2, 0);
 }
-for(i = 0;i < gidsetsize; i++)
-grouplist[i] = low2highgid(tswapid(target_grouplist[i]));
-unlock_user(target_grouplist, arg2, 0);
 ret = get_errno(setgroups(gidsetsize, grouplist));
 }
 break;
-- 
1.8.1.2

[Qemu-devel] [PATCH 3/7] linux-user/syscall.c: handle FUTEX_WAIT_BITSET in do_futex

2013-03-07 Thread riku . voipio

From: John Rigby 

Upstream libc has recently changed to start using
FUTEX_WAIT_BITSET instead of FUTEX_WAIT and this
is causing do_futex to return -TARGET_ENOSYS.

Pass bitset in val3 to sys_futex which will be
ignored by kernel for the FUTEX_WAIT case.

Signed-off-by: John Rigby 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 19630ea..c7fcfc0 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -4922,6 +4922,7 @@ static int do_futex(target_ulong uaddr, int op, int val, 
target_ulong timeout,
 #endif
 switch (base_op) {
 case FUTEX_WAIT:
+case FUTEX_WAIT_BITSET:
 if (timeout) {
 pts = &ts;
 target_to_host_timespec(pts, timeout);
@@ -4929,7 +4930,7 @@ static int do_futex(target_ulong uaddr, int op, int val, 
target_ulong timeout,
 pts = NULL;
 }
 return get_errno(sys_futex(g2h(uaddr), op, tswap32(val),
- pts, NULL, 0));
+ pts, NULL, val3));
 case FUTEX_WAKE:
 return get_errno(sys_futex(g2h(uaddr), op, val, NULL, NULL, 0));
 case FUTEX_FD:
-- 
1.8.1.2

[Qemu-devel] [PATCH 7/7] linux-user: Add more sparc syscall numbers

2013-03-07 Thread riku . voipio

From: Dillon Amburgey 

Signed-off-by: Dillon Amburgey 
Signed-off-by: Riku Voipio 
---
 linux-user/sparc/syscall_nr.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/linux-user/sparc/syscall_nr.h b/linux-user/sparc/syscall_nr.h
index 061711c..534e6e9 100644
--- a/linux-user/sparc/syscall_nr.h
+++ b/linux-user/sparc/syscall_nr.h
@@ -200,6 +200,8 @@
 #define TARGET_NR__newselect 230 /* Linux Specific 
 */
 #define TARGET_NR_time   231 /* Linux Specific 
 */
 #define TARGET_NR_stime  233 /* Linux Specific 
 */
+#define TARGET_NR_statfs64   234 /* Linux Specific 
 */
+#define TARGET_NR_fstatfs64  235 /* Linux Specific 
 */
 #define TARGET_NR__llseek236 /* Linux Specific 
 */
 #define TARGET_NR_mlock  237
 #define TARGET_NR_munlock238
-- 
1.8.1.2

Re: [Qemu-devel] [PATCH V7 0/5] Send the gratuitous by guest

2013-03-07 Thread Michael S. Tsirkin

On Thu, Mar 07, 2013 at 06:13:41PM +0800, Jason Wang wrote:
> On 03/07/2013 06:04 PM, Michael S. Tsirkin wrote:
> > On Thu, Mar 07, 2013 at 04:23:46PM +0800, Jason Wang wrote:
> >> This series tries to let guest instead of qemu to send the gratuitous 
> >> packets
> >> after migration when guest is capable of doing this. This is needed since 
> >> it's
> >> impossible for qemu to keep track of all configurations (e.g 802.1Q) and 
> >> mac
> >> addresses (more than one mac address may be used by guest). So qemu can't 
> >> build
> >> gratuitous packets for all those configurations properly. The only 
> >> solution is
> >> let guest driver who knew all needed information to do this.
> >>
> >> The series first introduces a new runstate which just tracks the state 
> >> when the
> >> migration is finished and guest is about to start. And then we can just 
> >> trying
> >> to notify the guest to send the GARP after changing from this state to
> >> running. A model specific announcing method were also also introduced to 
> >> let
> >> each kinds of nic do its own notification. When there's no such method 
> >> register
> >> for the nic, the old style of sending RARP were kept. And the last two 
> >> patches
> >> implemented the virtio-net method of notification.
> > Do we want to retry SELF_ANNOUNCE_ROUNDS?
> 
> Yes, we do the announcement several times like in the past.
> >> Changes from V6:
> >> - introduce a new runstate instead of using a global variable check the 
> >> state
> >>
> >> Changes from V5:
> >> - use a global variable to decide whether an announcement is needed after 
> >> migration
> >> - align with virtio spec and let guest ack the announcement notification 
> >> through
> >>   control vq instead of config status writing
> >>
> >> Changes from V4:
> >> - keep the old behavior that send the gratuitous packets only after 
> >> migration
> > I wonder why it's a sane thing to do. How about simply sending the event 
> > after load?
> 
> The aim is to limit the change of the behaviour to focus on migration.
> We may also need this after cont,

Hmm why after cont?

> and then maybe we can just do this
> unconditionally in vm_start().

OK but then the new infrastructure we are adding will be dead code,
won't it?

Can we do this simply using a post load hook for now?

>
> >> - decide whether to send gratuitous packets by previous runstate instead 
> >> of a dedicated parameter
> >> - check virtio_net_started() instead of VIRTIO_NET_S_LINK_UP before issue 
> >> the config update interrupt
> >> - move VIRTIO_NET_S_ANNOUNCE to 0x100 and supress guest config write to RO 
> >> bits
> >> - cleanups suggested by Michael
> >>
> >> Tested with migration within 802.1Q.
> >>
> >> Jason Wang (5):
> >>   runstate: introduce prelaunch-migrate state
> >>   net: announce self after vm is started
> >>   net: model specific announcing support
> >>   virtio-net: notify guest to annouce itself
> >>   virtio-net: compat guest announce
> >>
> >>  hw/pc.h   |6 +-
> >>  hw/virtio-net.c   |   30 ++
> >>  hw/virtio-net.h   |   15 ++-
> >>  include/net/net.h |2 ++
> >>  migration.c   |4 +---
> >>  qapi-schema.json  |5 -
> >>  savevm.c  |8 ++--
> >>  vl.c  |8 +++-
> >>  8 files changed, 69 insertions(+), 9 deletions(-)

Re: [Qemu-devel] [PATCH] coroutine: use AioContext for CoQueue BH

2013-03-07 Thread Stefan Hajnoczi

On Wed, Mar 06, 2013 at 04:41:01PM +0100, Kevin Wolf wrote:
> Am 06.03.2013 um 15:53 hat Stefan Hajnoczi geschrieben:
> > CoQueue uses a BH to awake coroutines that were made ready to run again
> > using qemu_co_queue_next() or qemu_co_queue_restart_all().  The BH
> > currently runs in the iothread AioContext and would break coroutines
> > that run in a different AioContext.
> > 
> > This is a slightly tricky problem because the lifetime of the BH exceeds
> > that of the CoQueue.  This means coroutines can be awoken after CoQueue
> > itself has been freed.
> 
> Does this really happen in practice? If so, that sounds like a bug to
> me.

I didn't audit the callers.  It seems reasonable that a CoQueue is
allowed to go out of scope once qemu_co_queue_empty() returns true.  I
don't see what is buggy about that.

> > Finally, I don't want to move unlock_queue and unlock_bh into
> > AioContext.  That would break encapsulation - AioContext isn't supposed
> > to know about CoQueue.
> 
> So what you would need here is "AioContext local storage". I wonder if
> this will stay a requirement unique to CoQueues when AioContexts gain
> wider use.

I'm not convinced we need AioContext local storage.

We could go all the way and make AioContext* thread-local.  Then we can
figure out the current AioContext* or NULL, if none, at any location in
the code.

Then CoQueue (and others?) could attach their resources to AioContext
local storage.

But let's avoid it if there are clean solutions because implicit state
like globals or thread-locals makes code harder to understand.

Stefan

Re: [Qemu-devel] [PATCH 1/7] block: only force IO completion in .bdrv_truncate if we are shrinking

2013-03-07 Thread Peter Lieven


Am 07.03.2013 um 11:00 schrieb Kevin Wolf :

> Am 07.03.2013 um 10:25 hat Peter Lieven geschrieben:
>> 
>> Am 07.03.2013 um 10:22 schrieb Kevin Wolf :
>> 
>>> Am 07.03.2013 um 10:16 hat Peter Lieven geschrieben:
>> If bs->growable is 1 for all drivers, whats the fix status of 
>> CVE-2008-0928? This
>> flag was introduced as a fix for this problem.
>> 
>> bdrv_check_byte_request() does nothing useful if bs->growable is 1.
> 
> Don't ignore the difference between bdrv_open() and bdrv_file_open().
> Typically you have two BDSes: On top there is e.g. a qcow2 BDS that is
> opened through bdrv_open() and has bs->growable = false. Its bs->file is
> using the file protocol (raw-posix driver) and opened by
> bdrv_file_open(). This one has bs->file->growable = true so that qcow2
> can write to newly allocated areas without calling bdrv_truncate()
> first.
 
 Sorry, I have to admin I am little confused by what is happening in 
 bdrv_open().
 
 However, what I can say is that bs->growable is 1 for an iSCSI backed
 harddrive and I wonder how this can happen if bdrv_file_open is not used 
 for
 opening it because that is the only place where bs->growable is set to 1.
 
 cmdline:
 x86_64-softmmu/qemu-system-x86_64 -k de -enable-kvm -m 1024 -drive 
 if=virtio,file=iscsi://172.21.200.31/iqn.2001-05.com.equallogic:0-8a0906-16470e107-713001aa6de511e0-001-test/0
  -vnc :1 -boot dc -monitor stdio
>>> 
>>> It is used for the iscsi driver. You have a raw BDS (growable == false)
>>> on top of an iscsi one (growable == true).
>> 
>> Ok, but growable == true is wrong for the iSCSI driver isn`t it?
> 
> I guess it depends on your definition. If you say that growable includes
> the capability of growing the image, then yes, it's wrong. If you only
> interpret it as permission to write beyond EOF (if the driver supports
> that), then it's right even though any such attempt will result in an
> error.
> 
> Practically speaking, the difference is between -EIO returned from
> bdrv_check_byte_request() and -EIO from iscsi_aio_write16_cb(), i.e.
> the result is the same.

Yes, but there are many broken storage implementations outside which might react
freaky if there is a read/write beyond the LUN size or with negative offset.
The check_request would block such requests earlier protecting the 
infrastructure.
I have a case open with the vendor of a storage system where I am able to crash
the storage sending illegal requests. I personally would feel more comfortable
if illegal requests where just not possible.

Peter

Re: [Qemu-devel] [PATCH V7 0/5] Send the gratuitous by guest

2013-03-07 Thread Jason Wang

On 03/07/2013 06:25 PM, Michael S. Tsirkin wrote:
> On Thu, Mar 07, 2013 at 06:13:41PM +0800, Jason Wang wrote:
>> On 03/07/2013 06:04 PM, Michael S. Tsirkin wrote:
>>> On Thu, Mar 07, 2013 at 04:23:46PM +0800, Jason Wang wrote:
 This series tries to let guest instead of qemu to send the gratuitous 
 packets
 after migration when guest is capable of doing this. This is needed since 
 it's
 impossible for qemu to keep track of all configurations (e.g 802.1Q) and 
 mac
 addresses (more than one mac address may be used by guest). So qemu can't 
 build
 gratuitous packets for all those configurations properly. The only 
 solution is
 let guest driver who knew all needed information to do this.

 The series first introduces a new runstate which just tracks the state 
 when the
 migration is finished and guest is about to start. And then we can just 
 trying
 to notify the guest to send the GARP after changing from this state to
 running. A model specific announcing method were also also introduced to 
 let
 each kinds of nic do its own notification. When there's no such method 
 register
 for the nic, the old style of sending RARP were kept. And the last two 
 patches
 implemented the virtio-net method of notification.
>>> Do we want to retry SELF_ANNOUNCE_ROUNDS?
>> Yes, we do the announcement several times like in the past.
 Changes from V6:
 - introduce a new runstate instead of using a global variable check the 
 state

 Changes from V5:
 - use a global variable to decide whether an announcement is needed after 
 migration
 - align with virtio spec and let guest ack the announcement notification 
 through
   control vq instead of config status writing

 Changes from V4:
 - keep the old behavior that send the gratuitous packets only after 
 migration
>>> I wonder why it's a sane thing to do. How about simply sending the event 
>>> after load?
>> The aim is to limit the change of the behaviour to focus on migration.
>> We may also need this after cont,
> Hmm why after cont?

If we stop the vm for a long period, the mac will be missed in the
forward table of the bridge also.
>> and then maybe we can just do this
>> unconditionally in vm_start().
> OK but then the new infrastructure we are adding will be dead code,
> won't it?

If we do this, there's no need to introduce a new state then.
>
> Can we do this simply using a post load hook for now?

Maybe not, this means we may want to inject an interrupt to guest when
vm is not running.
 - decide whether to send gratuitous packets by previous runstate instead 
 of a dedicated parameter
 - check virtio_net_started() instead of VIRTIO_NET_S_LINK_UP before issue 
 the config update interrupt
 - move VIRTIO_NET_S_ANNOUNCE to 0x100 and supress guest config write to RO 
 bits
 - cleanups suggested by Michael

 Tested with migration within 802.1Q.

 Jason Wang (5):
   runstate: introduce prelaunch-migrate state
   net: announce self after vm is started
   net: model specific announcing support
   virtio-net: notify guest to annouce itself
   virtio-net: compat guest announce

  hw/pc.h   |6 +-
  hw/virtio-net.c   |   30 ++
  hw/virtio-net.h   |   15 ++-
  include/net/net.h |2 ++
  migration.c   |4 +---
  qapi-schema.json  |5 -
  savevm.c  |8 ++--
  vl.c  |8 +++-
  8 files changed, 69 insertions(+), 9 deletions(-)

Re: [Qemu-devel] [RFC] s390: dump-guest-memory implementation for s390x arch.

2013-03-07 Thread Alexander Graf


On 28.02.2013, at 10:39, Jens Freimann wrote:

> From: Ekaterina Tumanova 
> 
> dump-guest-memory QEMU monitor command didn't work for s390 architecture.
> The result of the command is supposed to be ELF format crash-readable
> dump.
> In order to implement this, the arch-specific part of dump-guest-memory
> was added:
> target-s390x/arch_dump.c contains the whole set of function for writing
> Elf note sections of all types for s390x.
> target-s390x/arch_memory_mapping.c contains stub functions, current
> patch places all guest physical memory into the dump.
> 
> Signed-off-by: Ekaterina Tumanova 
> Signed-off-by: Jens Freimann 
> 
> ---
> configure  |   4 +-
> include/elf.h  |   6 +
> target-s390x/Makefile.objs |   2 +-
> target-s390x/arch_dump.c   | 240 +
> target-s390x/arch_memory_mapping.c |  26 
> 5 files changed, 275 insertions(+), 3 deletions(-)
> create mode 100644 target-s390x/arch_dump.c
> create mode 100644 target-s390x/arch_memory_mapping.c
> 
> diff --git a/configure b/configure
> index 19738ac..938be14 100755
> --- a/configure
> +++ b/configure
> @@ -4146,7 +4146,7 @@ case "$target_arch2" in
> fi
> esac
> case "$target_arch2" in
> -  i386|x86_64)
> +  i386|x86_64|s390x)
> echo "CONFIG_HAVE_GET_MEMORY_MAPPING=y" >> $config_target_mak
> esac
> if test "$target_arch2" = "ppc64" -a "$fdt" = "yes"; then
> @@ -4159,7 +4159,7 @@ if test "$target_softmmu" = "yes" ; then
>   echo "CONFIG_SOFTMMU=y" >> $config_target_mak
>   echo "LIBS+=$libs_softmmu $target_libs_softmmu" >> $config_target_mak
>   case "$target_arch2" in
> -i386|x86_64)
> +i386|x86_64|s390x)
>   echo "CONFIG_HAVE_CORE_DUMP=y" >> $config_target_mak
>   esac
> fi
> diff --git a/include/elf.h b/include/elf.h
> index a21ea53..ba4b3a7 100644
> --- a/include/elf.h
> +++ b/include/elf.h
> @@ -1219,11 +1219,17 @@ typedef struct elf64_shdr {
> 
> /* Notes used in ET_CORE */
> #define NT_PRSTATUS   1
> +#define NT_FPREGSET 2
> #define NT_PRFPREG2
> #define NT_PRPSINFO   3
> #define NT_TASKSTRUCT 4
> #define NT_AUXV   6
> #define NT_PRXFPREG 0x46e62b7f  /* copied from 
> gdb5.1/include/elf/common.h */
> +#define NT_S390_PREFIX  0x305   /* s390 prefix register */
> +#define NT_S390_CTRS0x304   /* s390 control registers */
> +#define NT_S390_TODPREG 0x303   /* s390 TOD programmable register */
> +#define NT_S390_TODCMP  0x302   /* s390 TOD clock comparator 
> register */
> +#define NT_S390_TIMER   0x301   /* s390 timer register */
> 
> 
> /* Note header in a PT_NOTE section */
> diff --git a/target-s390x/Makefile.objs b/target-s390x/Makefile.objs
> index 4e63417..82c79c4 100644
> --- a/target-s390x/Makefile.objs
> +++ b/target-s390x/Makefile.objs
> @@ -1,4 +1,4 @@
> obj-y += translate.o helper.o cpu.o interrupt.o
> obj-y += int_helper.o fpu_helper.o cc_helper.o mem_helper.o misc_helper.o
> -obj-$(CONFIG_SOFTMMU) += ioinst.o
> +obj-$(CONFIG_SOFTMMU) += ioinst.o arch_dump.o arch_memory_mapping.o
> obj-$(CONFIG_KVM) += kvm.o
> diff --git a/target-s390x/arch_dump.c b/target-s390x/arch_dump.c
> new file mode 100644
> index 000..3098cd1
> --- /dev/null
> +++ b/target-s390x/arch_dump.c
> @@ -0,0 +1,240 @@
> +/*
> + * writing ELF notes for s390x arch
> + *
> + *  
> + * Copyright IBM Corp. 2012
> + *
> + * Ekaterina Tumanova 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "cpu.h"
> +#include "elf.h"
> +#include "exec/cpu-all.h"
> +#include "sysemu/dump.h"
> +#include "sysemu/kvm.h"
> +
> +
> +struct s390x_user_regs_struct {
> +uint64_tpsw[2];
> +uint64_tgprs[16];
> +uint32_tacrs[16];
> +} QEMU_PACKED;
> +
> +typedef struct s390x_user_regs_struct s390x_user_regs;
> +
> +struct s390x_elf_prstatus_struct {
> +uint8_t pad1[32];
> +uint32_t pid;
> +uint8_t pad2[76];
> +s390x_user_regs regs;
> +uint8_t pad3[16];
> +} QEMU_PACKED;
> +
> +typedef struct s390x_elf_prstatus_struct s390x_elf_prstatus;
> +
> +struct s390x_elf_fpregset_struct {
> +uint32_tfpc;
> +uint32_tpad;
> +uint64_tfprs[16];
> +} QEMU_PACKED;
> +
> +typedef struct s390x_elf_fpregset_struct s390x_elf_fpregset;
> +
> +typedef struct note_struct {
> +Elf64_Nhdr hdr;
> +char name[5];
> +char pad3[3];
> + union {
> +s390x_elf_prstatus prstatus;
> +s390x_elf_fpregset fpregset;
> +uint32_t prefix;
> +uint64_t timer;
> +uint64_t todcmp;
> +uint32_t todpreg;
> +uint64_t ctrs[16];
> +} contents;
> +} QEMU_PACKED note_t;
> +
> +static int s390x_write_elf64_prstatus(note_t *note, CPUArchState *env)
> +{
> +int i;
> +s390x_user_regs *regs;
> +
> +note->

Re: [Qemu-devel] [PATCH V7 0/5] Send the gratuitous by guest

2013-03-07 Thread Michael S. Tsirkin

On Thu, Mar 07, 2013 at 06:33:30PM +0800, Jason Wang wrote:
> On 03/07/2013 06:25 PM, Michael S. Tsirkin wrote:
> > On Thu, Mar 07, 2013 at 06:13:41PM +0800, Jason Wang wrote:
> >> On 03/07/2013 06:04 PM, Michael S. Tsirkin wrote:
> >>> On Thu, Mar 07, 2013 at 04:23:46PM +0800, Jason Wang wrote:
>  This series tries to let guest instead of qemu to send the gratuitous 
>  packets
>  after migration when guest is capable of doing this. This is needed 
>  since it's
>  impossible for qemu to keep track of all configurations (e.g 802.1Q) and 
>  mac
>  addresses (more than one mac address may be used by guest). So qemu 
>  can't build
>  gratuitous packets for all those configurations properly. The only 
>  solution is
>  let guest driver who knew all needed information to do this.
> 
>  The series first introduces a new runstate which just tracks the state 
>  when the
>  migration is finished and guest is about to start. And then we can just 
>  trying
>  to notify the guest to send the GARP after changing from this state to
>  running. A model specific announcing method were also also introduced to 
>  let
>  each kinds of nic do its own notification. When there's no such method 
>  register
>  for the nic, the old style of sending RARP were kept. And the last two 
>  patches
>  implemented the virtio-net method of notification.
> >>> Do we want to retry SELF_ANNOUNCE_ROUNDS?
> >> Yes, we do the announcement several times like in the past.
>  Changes from V6:
>  - introduce a new runstate instead of using a global variable check the 
>  state
> 
>  Changes from V5:
>  - use a global variable to decide whether an announcement is needed 
>  after migration
>  - align with virtio spec and let guest ack the announcement notification 
>  through
>    control vq instead of config status writing
> 
>  Changes from V4:
>  - keep the old behavior that send the gratuitous packets only after 
>  migration
> >>> I wonder why it's a sane thing to do. How about simply sending the event 
> >>> after load?
> >> The aim is to limit the change of the behaviour to focus on migration.
> >> We may also need this after cont,
> > Hmm why after cont?
> 
> If we stop the vm for a long period, the mac will be missed in the
> forward table of the bridge also.

Hmm okay, needs some thought.

> >> and then maybe we can just do this
> >> unconditionally in vm_start().
> > OK but then the new infrastructure we are adding will be dead code,
> > won't it?
> 
> If we do this, there's no need to introduce a new state then.
> >
> > Can we do this simply using a post load hook for now?
> 
> Maybe not, this means we may want to inject an interrupt to guest when
> vm is not running.

What I'm suggesting is basically:
- set some per device flag on load
- announce based on vmstart if flag is set

We can drop the flag later if we want it on every vmstart.

>  - decide whether to send gratuitous packets by previous runstate instead 
>  of a dedicated parameter
>  - check virtio_net_started() instead of VIRTIO_NET_S_LINK_UP before 
>  issue the config update interrupt
>  - move VIRTIO_NET_S_ANNOUNCE to 0x100 and supress guest config write to 
>  RO bits
>  - cleanups suggested by Michael
> 
>  Tested with migration within 802.1Q.
> 
>  Jason Wang (5):
>    runstate: introduce prelaunch-migrate state
>    net: announce self after vm is started
>    net: model specific announcing support
>    virtio-net: notify guest to annouce itself
>    virtio-net: compat guest announce
> 
>   hw/pc.h   |6 +-
>   hw/virtio-net.c   |   30 ++
>   hw/virtio-net.h   |   15 ++-
>   include/net/net.h |2 ++
>   migration.c   |4 +---
>   qapi-schema.json  |5 -
>   savevm.c  |8 ++--
>   vl.c  |8 +++-
>   8 files changed, 69 insertions(+), 9 deletions(-)

Re: [Qemu-devel] qemu-img do not verify encrytption

2013-03-07 Thread yue-kvm

hi stefan:


[root@kvm ~]# qemu-img -V
qemu-img version 0.12.1, Copyright (c) 2004-2008 Fabrice Bellard


OS centos 6.3
thanks








At 2013-03-07 17:56:50,"Stefan Hajnoczi"  wrote:
>On Thu, Mar 07, 2013 at 02:49:30PM +0800, yue-kvm wrote:
>> i create qcow2 format image with -o encryption, and assign password.
>> 
>> but when i qemu-img info encryt.qcow2 ,  it display encryt.qcow2 info no 
>> master  what i input, even just hit enter.
>> the 'Password Authentication ' of qemu-img may be invalid.
>
>Don't worry, your data is encrypted.  The qemu-img info password check
>is unnecessary because qcow2 metadata is unencrypted.
>
>I didn't get a password prompt from qemu-img info on an encrypted qcow2
>file using qemu.git/master.
>
>Which version of qemu-img are you using?  Perhaps the unnecessary prompt
>has been fixed in a later version.
>
>Stefan
>

Re: [Qemu-devel] qemu-img do not verify encrytption

2013-03-07 Thread Wei-Ren Chen

On Thu, Mar 07, 2013 at 06:57:19PM +0800, yue-kvm wrote:
> hi stefan:
> 
> [root@kvm ~]# qemu-img -V
> qemu-img version 0.12.1, Copyright (c) 2004-2008 Fabrice Bellard

  Hrm..., 0.12 is too old, why don't you use newer version?

Regards,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj

Re: [Qemu-devel] [PATCH 01/19] linux-user: stack_base is now mandatory on all targets

2013-03-07 Thread Laurent Desnogues

On Wed, Feb 8, 2012 at 10:46 AM, Laurent Desnogues
 wrote:
> On Fri, Feb 3, 2012 at 3:49 PM,   wrote:
>> From: Riku Voipio 
>>
>> Signed-off-by: Riku Voipio 
>> ---
>>  linux-user/qemu.h |2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/linux-user/qemu.h b/linux-user/qemu.h
>> index 55ad9d8..30e2abd 100644
>> --- a/linux-user/qemu.h
>> +++ b/linux-user/qemu.h
>> @@ -123,10 +123,10 @@ typedef struct TaskState {
>>  #endif
>>  #if defined(TARGET_ARM) || defined(TARGET_M68K) || defined(TARGET_UNICORE32)
>> /* Extra fields for semihosted binaries.  */
>> -uint32_t stack_base;
>> uint32_t heap_base;
>> uint32_t heap_limit;
>>  #endif
>> +uint32_t stack_base;
>
> Shouldn't this be abi_ulong instead of uint32_t?

Ping...

Yes it's more than a year old, but if Aarch64 support with semihosting
ever comes to life :)

Laurent

>> int used; /* non zero if used */
>> struct image_info *info;
>> struct linux_binprm *bprm;
>
> Laurent

Re: [Qemu-devel] problems with freeBSD

2013-03-07 Thread Gerd Hoffmann

  Hi,

> Probably works, but never appeared in a separate release:
> 
> commit 3588185b8396eb97fd9efd41c2b97775465f67c4
> Author: Gerd Hoffmann 
> Date:   Mon Jan 21 09:17:16 2013 +0100
> 
> seabios: update to 1.7.2 release

Built with "gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3)" (rhel-6).

> commit 5f876756c57c15f5e14d4136fc432b74f05f082b
> Author: Anthony Liguori 
> Date:   Wed Feb 6 05:12:06 2013 -0600
> 
> bios: recompile BIOS

> gcc version 4.7.2 20121109 (Red Hat 4.7.2-8) (GCC)

> commit 5c75fb10029c5fd1e705a6ef5d698fbea06c7a33
> Author: Gerd Hoffmann 
> Date:   Thu Feb 28 09:18:56 2013 +0100
> 
> update seabios to 1.7.2.1

Built with "gcc (GCC) 4.7.2 20121015 (Red Hat 4.7.2-5)"
(rhel-6 devtoolkit-1.1) as gcc 4.4 builds are too big and bumb seabios
size from 128k to 256k.  Which was the reason for anthonys rebuild.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v11 1/3] iov: Factor out hexdumper

2013-03-07 Thread Peter Crosthwaite

Ping!

Any issues here? Peter wanted to give this list time so it missed the
last arm-devs.

Regards,
Peter

On Wed, Feb 27, 2013 at 3:17 PM, Peter Crosthwaite
 wrote:
> Factor out the hexdumper functionality from iov for all to use. Useful for
> creating verbose debug printfery that dumps packet data.
>
> Signed-off-by: Peter Crosthwaite 
> ---
> changed from v10:
> Added Gerd and Red Hat to (c)
> reworked iov hexdumper to use iov_to_buf (Gerd Review)
> changed from v9:
> changed original source info in header to point to Gerds hexdump commit to 
> iov.c
> Moved header prototype to qemu-common.h (MJT review)
> Added brief comment in header for hexdump()
>
>  include/qemu-common.h |6 ++
>  util/Makefile.objs|1 +
>  util/hexdump.c|   37 +
>  util/iov.c|   36 +++-
>  4 files changed, 55 insertions(+), 25 deletions(-)
>  create mode 100644 util/hexdump.c
>
> diff --git a/include/qemu-common.h b/include/qemu-common.h
> index 80016ad..804667a 100644
> --- a/include/qemu-common.h
> +++ b/include/qemu-common.h
> @@ -430,4 +430,10 @@ int64_t pow2floor(int64_t value);
>  int uleb128_encode_small(uint8_t *out, uint32_t n);
>  int uleb128_decode_small(const uint8_t *in, uint32_t *n);
>
> +/*
> + * Hexdump a buffer to a file. An optional string prefix is added to every 
> line
> + */
> +
> +void hexdump(const char *buf, FILE *fp, const char *prefix, size_t size);
> +
>  #endif
> diff --git a/util/Makefile.objs b/util/Makefile.objs
> index 495a178..068ceac 100644
> --- a/util/Makefile.objs
> +++ b/util/Makefile.objs
> @@ -8,3 +8,4 @@ util-obj-y += error.o qemu-error.o
>  util-obj-$(CONFIG_POSIX) += compatfd.o
>  util-obj-y += iov.o aes.o qemu-config.o qemu-sockets.o uri.o notify.o
>  util-obj-y += qemu-option.o qemu-progress.o
> +util-obj-y += hexdump.o
> diff --git a/util/hexdump.c b/util/hexdump.c
> new file mode 100644
> index 000..0d0efc8
> --- /dev/null
> +++ b/util/hexdump.c
> @@ -0,0 +1,37 @@
> +/*
> + * Helper to hexdump a buffer
> + *
> + * Copyright (c) 2013 Red Hat, Inc.
> + * Copyright (c) 2013 Gerd Hoffmann 
> + * Copyright (c) 2013 Peter Crosthwaite 
> + * Copyright (c) 2013 Xilinx, Inc
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + * Contributions after 2012-01-13 are licensed under the terms of the
> + * GNU GPL, version 2 or (at your option) any later version.
> + */
> +
> +#include "qemu-common.h"
> +
> +void hexdump(const char *buf, FILE *fp, const char *prefix, size_t size)
> +{
> +unsigned int b;
> +
> +for (b = 0; b < size; b++) {
> +if ((b % 16) == 0) {
> +fprintf(fp, "%s: %04x:", prefix, b);
> +}
> +if ((b % 4) == 0) {
> +fprintf(fp, " ");
> +}
> +fprintf(fp, " %02x", (unsigned char)buf[b]);
> +if ((b % 16) == 15) {
> +fprintf(fp, "\n");
> +}
> +}
> +if ((b % 16) != 0) {
> +fprintf(fp, "\n");
> +}
> +}
> diff --git a/util/iov.c b/util/iov.c
> index fbe675d..9dae318 100644
> --- a/util/iov.c
> +++ b/util/iov.c
> @@ -201,32 +201,18 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, 
> unsigned iov_cnt,
>  void iov_hexdump(const struct iovec *iov, const unsigned int iov_cnt,
>   FILE *fp, const char *prefix, size_t limit)
>  {
> -unsigned int i, v, b;
> -uint8_t *c;
> -
> -c = iov[0].iov_base;
> -for (i = 0, v = 0, b = 0; b < limit; i++, b++) {
> -if (i == iov[v].iov_len) {
> -i = 0; v++;
> -if (v == iov_cnt) {
> -break;
> -}
> -c = iov[v].iov_base;
> -}
> -if ((b % 16) == 0) {
> -fprintf(fp, "%s: %04x:", prefix, b);
> -}
> -if ((b % 4) == 0) {
> -fprintf(fp, " ");
> -}
> -fprintf(fp, " %02x", c[i]);
> -if ((b % 16) == 15) {
> -fprintf(fp, "\n");
> -}
> -}
> -if ((b % 16) != 0) {
> -fprintf(fp, "\n");
> +int v;
> +size_t size = 0;
> +char *buf;
> +
> +for (v = 0; v < iov_cnt; v++) {
> +size += iov[v].iov_len;
>  }
> +size = size > limit ? limit : size;
> +buf = g_malloc(size);
> +iov_to_buf(iov, iov_cnt, 0, buf, size);
> +hexdump(buf, fp, prefix, size);
> +g_free(buf);
>  }
>
>  unsigned iov_copy(struct iovec *dst_iov, unsigned int dst_iov_cnt,
> --
> 1.7.0.4
>

Re: [Qemu-devel] [RFC] s390: dump-guest-memory implementation for s390x arch.

2013-03-07 Thread Christian Borntraeger

On 07/03/13 11:49, Alexander Graf wrote:
> 
> On 28.02.2013, at 10:39, Jens Freimann wrote:
> 
>> From: Ekaterina Tumanova 
>>
>> dump-guest-memory QEMU monitor command didn't work for s390 architecture.
>> The result of the command is supposed to be ELF format crash-readable
>> dump.
>> In order to implement this, the arch-specific part of dump-guest-memory
>> was added:
>> target-s390x/arch_dump.c contains the whole set of function for writing
>> Elf note sections of all types for s390x.
>> target-s390x/arch_memory_mapping.c contains stub functions, current
>> patch places all guest physical memory into the dump.
>>
>> Signed-off-by: Ekaterina Tumanova 
>> Signed-off-by: Jens Freimann 
>>
>> ---
>> configure  |   4 +-
>> include/elf.h  |   6 +
>> target-s390x/Makefile.objs |   2 +-
>> target-s390x/arch_dump.c   | 240 
>> +
>> target-s390x/arch_memory_mapping.c |  26 
>> 5 files changed, 275 insertions(+), 3 deletions(-)
>> create mode 100644 target-s390x/arch_dump.c
>> create mode 100644 target-s390x/arch_memory_mapping.c
>>
>> diff --git a/configure b/configure
>> index 19738ac..938be14 100755
>> --- a/configure
>> +++ b/configure
>> @@ -4146,7 +4146,7 @@ case "$target_arch2" in
>> fi
>> esac
>> case "$target_arch2" in
>> -  i386|x86_64)
>> +  i386|x86_64|s390x)
>> echo "CONFIG_HAVE_GET_MEMORY_MAPPING=y" >> $config_target_mak
>> esac
>> if test "$target_arch2" = "ppc64" -a "$fdt" = "yes"; then
>> @@ -4159,7 +4159,7 @@ if test "$target_softmmu" = "yes" ; then
>>   echo "CONFIG_SOFTMMU=y" >> $config_target_mak
>>   echo "LIBS+=$libs_softmmu $target_libs_softmmu" >> $config_target_mak
>>   case "$target_arch2" in
>> -i386|x86_64)
>> +i386|x86_64|s390x)
>>   echo "CONFIG_HAVE_CORE_DUMP=y" >> $config_target_mak
>>   esac
>> fi
>> diff --git a/include/elf.h b/include/elf.h
>> index a21ea53..ba4b3a7 100644
>> --- a/include/elf.h
>> +++ b/include/elf.h
>> @@ -1219,11 +1219,17 @@ typedef struct elf64_shdr {
>>
>> /* Notes used in ET_CORE */
>> #define NT_PRSTATUS  1
>> +#define NT_FPREGSET 2
>> #define NT_PRFPREG   2
>> #define NT_PRPSINFO  3
>> #define NT_TASKSTRUCT4
>> #define NT_AUXV  6
>> #define NT_PRXFPREG 0x46e62b7f  /* copied from 
>> gdb5.1/include/elf/common.h */
>> +#define NT_S390_PREFIX  0x305   /* s390 prefix register */
>> +#define NT_S390_CTRS0x304   /* s390 control registers */
>> +#define NT_S390_TODPREG 0x303   /* s390 TOD programmable register */
>> +#define NT_S390_TODCMP  0x302   /* s390 TOD clock comparator 
>> register */
>> +#define NT_S390_TIMER   0x301   /* s390 timer register */
>>
>>
>> /* Note header in a PT_NOTE section */
>> diff --git a/target-s390x/Makefile.objs b/target-s390x/Makefile.objs
>> index 4e63417..82c79c4 100644
>> --- a/target-s390x/Makefile.objs
>> +++ b/target-s390x/Makefile.objs
>> @@ -1,4 +1,4 @@
>> obj-y += translate.o helper.o cpu.o interrupt.o
>> obj-y += int_helper.o fpu_helper.o cc_helper.o mem_helper.o misc_helper.o
>> -obj-$(CONFIG_SOFTMMU) += ioinst.o
>> +obj-$(CONFIG_SOFTMMU) += ioinst.o arch_dump.o arch_memory_mapping.o
>> obj-$(CONFIG_KVM) += kvm.o
>> diff --git a/target-s390x/arch_dump.c b/target-s390x/arch_dump.c
>> new file mode 100644
>> index 000..3098cd1
>> --- /dev/null
>> +++ b/target-s390x/arch_dump.c
>> @@ -0,0 +1,240 @@
>> +/*
>> + * writing ELF notes for s390x arch
>> + *
>> + *  
>> + * Copyright IBM Corp. 2012
>> + *
>> + * Ekaterina Tumanova 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + *
>> + */
>> +
>> +#include "cpu.h"
>> +#include "elf.h"
>> +#include "exec/cpu-all.h"
>> +#include "sysemu/dump.h"
>> +#include "sysemu/kvm.h"
>> +
>> +
>> +struct s390x_user_regs_struct {
>> +uint64_tpsw[2];
>> +uint64_tgprs[16];
>> +uint32_tacrs[16];
>> +} QEMU_PACKED;
>> +
>> +typedef struct s390x_user_regs_struct s390x_user_regs;
>> +
>> +struct s390x_elf_prstatus_struct {
>> +uint8_t pad1[32];
>> +uint32_t pid;
>> +uint8_t pad2[76];
>> +s390x_user_regs regs;
>> +uint8_t pad3[16];
>> +} QEMU_PACKED;
>> +
>> +typedef struct s390x_elf_prstatus_struct s390x_elf_prstatus;
>> +
>> +struct s390x_elf_fpregset_struct {
>> +uint32_tfpc;
>> +uint32_tpad;
>> +uint64_tfprs[16];
>> +} QEMU_PACKED;
>> +
>> +typedef struct s390x_elf_fpregset_struct s390x_elf_fpregset;
>> +
>> +typedef struct note_struct {
>> +Elf64_Nhdr hdr;
>> +char name[5];
>> +char pad3[3];
>> +union {
>> +s390x_elf_prstatus prstatus;
>> +s390x_elf_fpregset fpregset;
>> +uint32_t prefix;
>> +uint64_t timer;
>> +uint64_t todcmp;
>> +uint32_t todpreg;
>> +uint64_t ctrs[16];
>> +

Re: [Qemu-devel] [RFC] s390: dump-guest-memory implementation for s390x arch.

2013-03-07 Thread Alexander Graf


On 07.03.2013, at 12:21, Christian Borntraeger wrote:

> On 07/03/13 11:49, Alexander Graf wrote:
>> 
>> On 28.02.2013, at 10:39, Jens Freimann wrote:
>> 
>>> From: Ekaterina Tumanova 
>>> 
>>> dump-guest-memory QEMU monitor command didn't work for s390 architecture.
>>> The result of the command is supposed to be ELF format crash-readable
>>> dump.
>>> In order to implement this, the arch-specific part of dump-guest-memory
>>> was added:
>>> target-s390x/arch_dump.c contains the whole set of function for writing
>>> Elf note sections of all types for s390x.
>>> target-s390x/arch_memory_mapping.c contains stub functions, current
>>> patch places all guest physical memory into the dump.
>>> 
>>> Signed-off-by: Ekaterina Tumanova 
>>> Signed-off-by: Jens Freimann 
>>> 
>>> ---
>>> configure  |   4 +-
>>> include/elf.h  |   6 +
>>> target-s390x/Makefile.objs |   2 +-
>>> target-s390x/arch_dump.c   | 240 
>>> +
>>> target-s390x/arch_memory_mapping.c |  26 
>>> 5 files changed, 275 insertions(+), 3 deletions(-)
>>> create mode 100644 target-s390x/arch_dump.c
>>> create mode 100644 target-s390x/arch_memory_mapping.c
>>> 
>>> diff --git a/configure b/configure
>>> index 19738ac..938be14 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -4146,7 +4146,7 @@ case "$target_arch2" in
>>>fi
>>> esac
>>> case "$target_arch2" in
>>> -  i386|x86_64)
>>> +  i386|x86_64|s390x)
>>>echo "CONFIG_HAVE_GET_MEMORY_MAPPING=y" >> $config_target_mak
>>> esac
>>> if test "$target_arch2" = "ppc64" -a "$fdt" = "yes"; then
>>> @@ -4159,7 +4159,7 @@ if test "$target_softmmu" = "yes" ; then
>>>  echo "CONFIG_SOFTMMU=y" >> $config_target_mak
>>>  echo "LIBS+=$libs_softmmu $target_libs_softmmu" >> $config_target_mak
>>>  case "$target_arch2" in
>>> -i386|x86_64)
>>> +i386|x86_64|s390x)
>>>  echo "CONFIG_HAVE_CORE_DUMP=y" >> $config_target_mak
>>>  esac
>>> fi
>>> diff --git a/include/elf.h b/include/elf.h
>>> index a21ea53..ba4b3a7 100644
>>> --- a/include/elf.h
>>> +++ b/include/elf.h
>>> @@ -1219,11 +1219,17 @@ typedef struct elf64_shdr {
>>> 
>>> /* Notes used in ET_CORE */
>>> #define NT_PRSTATUS 1
>>> +#define NT_FPREGSET 2
>>> #define NT_PRFPREG  2
>>> #define NT_PRPSINFO 3
>>> #define NT_TASKSTRUCT   4
>>> #define NT_AUXV 6
>>> #define NT_PRXFPREG 0x46e62b7f  /* copied from 
>>> gdb5.1/include/elf/common.h */
>>> +#define NT_S390_PREFIX  0x305   /* s390 prefix register */
>>> +#define NT_S390_CTRS0x304   /* s390 control registers */
>>> +#define NT_S390_TODPREG 0x303   /* s390 TOD programmable register 
>>> */
>>> +#define NT_S390_TODCMP  0x302   /* s390 TOD clock comparator 
>>> register */
>>> +#define NT_S390_TIMER   0x301   /* s390 timer register */
>>> 
>>> 
>>> /* Note header in a PT_NOTE section */
>>> diff --git a/target-s390x/Makefile.objs b/target-s390x/Makefile.objs
>>> index 4e63417..82c79c4 100644
>>> --- a/target-s390x/Makefile.objs
>>> +++ b/target-s390x/Makefile.objs
>>> @@ -1,4 +1,4 @@
>>> obj-y += translate.o helper.o cpu.o interrupt.o
>>> obj-y += int_helper.o fpu_helper.o cc_helper.o mem_helper.o misc_helper.o
>>> -obj-$(CONFIG_SOFTMMU) += ioinst.o
>>> +obj-$(CONFIG_SOFTMMU) += ioinst.o arch_dump.o arch_memory_mapping.o
>>> obj-$(CONFIG_KVM) += kvm.o
>>> diff --git a/target-s390x/arch_dump.c b/target-s390x/arch_dump.c
>>> new file mode 100644
>>> index 000..3098cd1
>>> --- /dev/null
>>> +++ b/target-s390x/arch_dump.c
>>> @@ -0,0 +1,240 @@
>>> +/*
>>> + * writing ELF notes for s390x arch
>>> + *
>>> + *  
>>> + * Copyright IBM Corp. 2012
>>> + *
>>> + * Ekaterina Tumanova 
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2 or 
>>> later.
>>> + * See the COPYING file in the top-level directory.
>>> + *
>>> + */
>>> +
>>> +#include "cpu.h"
>>> +#include "elf.h"
>>> +#include "exec/cpu-all.h"
>>> +#include "sysemu/dump.h"
>>> +#include "sysemu/kvm.h"
>>> +
>>> +
>>> +struct s390x_user_regs_struct {
>>> +uint64_tpsw[2];
>>> +uint64_tgprs[16];
>>> +uint32_tacrs[16];
>>> +} QEMU_PACKED;
>>> +
>>> +typedef struct s390x_user_regs_struct s390x_user_regs;
>>> +
>>> +struct s390x_elf_prstatus_struct {
>>> +uint8_t pad1[32];
>>> +uint32_t pid;
>>> +uint8_t pad2[76];
>>> +s390x_user_regs regs;
>>> +uint8_t pad3[16];
>>> +} QEMU_PACKED;
>>> +
>>> +typedef struct s390x_elf_prstatus_struct s390x_elf_prstatus;
>>> +
>>> +struct s390x_elf_fpregset_struct {
>>> +uint32_tfpc;
>>> +uint32_tpad;
>>> +uint64_tfprs[16];
>>> +} QEMU_PACKED;
>>> +
>>> +typedef struct s390x_elf_fpregset_struct s390x_elf_fpregset;
>>> +
>>> +typedef struct note_struct {
>>> +Elf64_Nhdr hdr;
>>> +char name[5];
>>> +char pad3[3];
>>> +   union {
>>> +s390x_elf_prstatus prstatus;
>>> +

Re: [Qemu-devel] [RFC] s390: dump-guest-memory implementation for s390x arch.

2013-03-07 Thread Christian Borntraeger

 diff --git a/target-s390x/arch_memory_mapping.c 
 b/target-s390x/arch_memory_mapping.c
 new file mode 100644
 index 000..3dad3b9
 --- /dev/null
 +++ b/target-s390x/arch_memory_mapping.c
 @@ -0,0 +1,26 @@
 +/*
 + * s390x memory mapping
 + *
 + * Copyright IBM Corp. 2012
 + *
 + * Authors:
 + * Ekaterina Tumanova 
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or 
 later.
 + * See the COPYING file in the top-level directory.
 + *
 + */
 +
 +#include "cpu.h"
 +#include "exec/cpu-all.h"
 +#include "sysemu/memory_mapping.h"
 +
 +int cpu_get_memory_mapping(MemoryMappingList *list, CPUArchState *env)
 +{
 +return 0;
 +}
 +
 +bool cpu_paging_enabled(CPUArchState *env)
 +{
 +return false;
>>>
>>> Why? :)
>>
>> To make things compile :-)
>> Why no paging? The idea was to have some code that allows getting an ELF 
>> dump of a Linux kernel.
>> All existing tooling for s390 expects real memory dumps. 
>>
>> Actually having CONFIG_HAVE_GET_MEMORY_MAPPING=n would be a sane thing todo, 
>> but IIRC the common
>> code in dump.c references some functions that are only available with 
>> CONFIG_HAVE_GET_MEMORY_MAPPING=y. 
>>
>> /home/cborntra/REPOS/qemu/dump.c:84: undefined reference to 
>> `memory_mapping_list_free'
>> dump.o: In function `dump_init':
>> /home/cborntra/REPOS/qemu/dump.c:753: undefined reference to 
>> `memory_mapping_list_init'
>> /home/cborntra/REPOS/qemu/dump.c:757: undefined reference to 
>> `qemu_get_guest_simple_memory_mapping'
>> /home/cborntra/REPOS/qemu/dump.c:761: undefined reference to 
>> `memory_mapping_filter'
> 
> It might be better to just fix those places and not claim that you can fetch 
> memory mappings then. QEMU has all the knowledge of s390 virtual memory you 
> would need to implement it. I can see how you wouldn't need it for core dumps 
> of the kernel, but it's confusing to claim dump support for VM and then say 
> you don't.
> 
> 
> Alex

So you suggest to implementing stubs for those 4 functions in 
memory_mapping-stub.c and see if dumping
of a kernel still works with CONFIG_HAVE_GET_MEMORY_MAPPING=n, right?

Christian

Re: [Qemu-devel] [RFC] s390: dump-guest-memory implementation for s390x arch.

2013-03-07 Thread Alexander Graf


On 07.03.2013, at 12:30, Christian Borntraeger wrote:

> diff --git a/target-s390x/arch_memory_mapping.c 
> b/target-s390x/arch_memory_mapping.c
> new file mode 100644
> index 000..3dad3b9
> --- /dev/null
> +++ b/target-s390x/arch_memory_mapping.c
> @@ -0,0 +1,26 @@
> +/*
> + * s390x memory mapping
> + *
> + * Copyright IBM Corp. 2012
> + *
> + * Authors:
> + * Ekaterina Tumanova 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or 
> later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "cpu.h"
> +#include "exec/cpu-all.h"
> +#include "sysemu/memory_mapping.h"
> +
> +int cpu_get_memory_mapping(MemoryMappingList *list, CPUArchState *env)
> +{
> +return 0;
> +}
> +
> +bool cpu_paging_enabled(CPUArchState *env)
> +{
> +return false;
 
 Why? :)
>>> 
>>> To make things compile :-)
>>> Why no paging? The idea was to have some code that allows getting an ELF 
>>> dump of a Linux kernel.
>>> All existing tooling for s390 expects real memory dumps. 
>>> 
>>> Actually having CONFIG_HAVE_GET_MEMORY_MAPPING=n would be a sane thing 
>>> todo, but IIRC the common
>>> code in dump.c references some functions that are only available with 
>>> CONFIG_HAVE_GET_MEMORY_MAPPING=y. 
>>> 
>>> /home/cborntra/REPOS/qemu/dump.c:84: undefined reference to 
>>> `memory_mapping_list_free'
>>> dump.o: In function `dump_init':
>>> /home/cborntra/REPOS/qemu/dump.c:753: undefined reference to 
>>> `memory_mapping_list_init'
>>> /home/cborntra/REPOS/qemu/dump.c:757: undefined reference to 
>>> `qemu_get_guest_simple_memory_mapping'
>>> /home/cborntra/REPOS/qemu/dump.c:761: undefined reference to 
>>> `memory_mapping_filter'
>> 
>> It might be better to just fix those places and not claim that you can fetch 
>> memory mappings then. QEMU has all the knowledge of s390 virtual memory you 
>> would need to implement it. I can see how you wouldn't need it for core 
>> dumps of the kernel, but it's confusing to claim dump support for VM and 
>> then say you don't.
>> 
>> 
>> Alex
> 
> So you suggest to implementing stubs for those 4 functions in 
> memory_mapping-stub.c and see if dumping
> of a kernel still works with CONFIG_HAVE_GET_MEMORY_MAPPING=n, right?

Anything that gets you rolling without CONFIG_HAVE_GET_MEMORY_MAPPING :)


Alex

Re: [Qemu-devel] [Qemu-stable] [SeaBIOS] problems with freeBSD

2013-03-07 Thread Gerd Hoffmann

  Hi,

>> Would qemu consider using those blobs rather than different developers
>> using their distro toolchain to build up a random commit ID. I say
>> random only because often qemu releases ship with a non-release
>> seabios.

It happened (well, the snapshot thing).  But that isn't our preferred
model, it was just bad release cycle coordination and we'll try to avoid
doing it again.  There is reason why we have a 1.7.2 stable branch now.

The binary blobs are just a convenience thing because rebuilding the
firmware might not be that easy.  Not so much for virtualization where
host + firmware are the same arch.  But for emulation, where you need
the -- say -- ppc slof on a x86 machine, and the binaries shipped allow
people to get started without having to install (or compile) a cross
compiler first.

It is certainly not required to use blobs shipped.  In fact the release
tarballs include all the firmware submodules, so you can build the
firmware yourself if you want.  I've created roms/Makefile so you can
easily rebuild the (x86) firmware.  seabios is there, vgabios is there,
patches for ipxe are in flight.

> Just a note, or an "alternative opinion", so to say.  In Debian, we have
> a social contract which, among other things, ensures that the binaries
> you, as a user, get, comes with source which you can modify on the
> system you installed.  This requires, for example, that all binaries
> shipped are actually built from the corresponding source, and that
> no blobs from whatever other sources are used, ever.

That is perfectly fine.  How to you handle ppc firmware for x86 hosts
btw?  Build on ppc buildhost and ship as noarch package?  Or do you do
cross compiler builds, so that users can patch+rebuild it on their x86
host too?

> We don't even ship any upstream blobs in the debian qemu _source_
> package: we repack upstream qemu.tar.gz by removing these blobs.

That's a bit over the top for my taste as the release tarballs include
both source and blobs.  Although it might be the debian release is just
a bit too old for that, not fully sure with which release anthony
started to include the firmware submodules, might be it was after 1.1

cheers,
  Gerd

Re: [Qemu-devel] problems with freeBSD

2013-03-07 Thread Laszlo Ersek

On 03/07/13 09:43, Aurelien Jarno wrote:

> I did a git bisect to find the commit fixing the issue. Then, as I was
> not believing the result, I tried the following sequence a dozen of
> times (for some unknown reasons the FreeBSD install CD doesn't exhibit
> the issue, so I used the Debian GNU/kFreeBSD installer):
> 
> | mkdir qemu-freebsd-bug
> | cd qemu-freebsd-bug
> |
> | wget 
> http://ftp.debian.org/debian/dists/squeeze/main/installer-kfreebsd-amd64/current/images/netboot/mini.iso
>  
> |
> | git clone git://git.qemu.org/qemu.git
> | cd qemu
> | git checkout -b stable-1.4 v1.4.0
> | ./configure --target-list=x86_64-softmmu
> | make
> | cd ..
> |
> | git clone git://git.seabios.org/seabios.git
> | cd seabios
> | git checkout -b 1.7.2-stable origin/1.7.2-stable
> | git reset --hard 4219149ad2b783abfa61e80e9e9f6910db0c76c9
> | make
> | cp out/bios.bin ../qemu/pc-bios
> | cd..
> |
> | # debian-installer boots correctly 
> | ./qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cdrom mini.iso
> |
> | cd seabios
> | git reset --hard d75c22fcb6521dad11428b65789d92f89675c600 
> | git clean -fdx
> | make
> | cp out/bios.bin ../qemu/pc-bios
> | cd ..
> |
> | # debian-installer fails to boot
> | ./qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cdrom mini.iso 
> 
> 
> Maybe I am doing something wrong or there is a bug in my toolchain
> (Debian Sid). It would be nice if someone could try to reproduce that on
> another distro/system.
> 

Can you save the out/ directory from both builds and "diff -ur" them
(maybe just the *.lds files)?

I'm noticing that pathnames are embedded in some ELF section names (I
hope this sentence makes sense), and when you build at d75c22fc, those
pathnames contain dot-dot (".."), ie. two section name separators next
to each other. Maybe that's not good; no idea.

Laszlo

Re: [Qemu-devel] [PATCH 2/3] hw/virtio-net.c: set config size using host features

2013-03-07 Thread Christian Borntraeger

Ping.
Anthony, Jesse,

how is this supposed to work?


Christian
On 05/03/13 18:03, Christian Borntraeger wrote:
> On 05/03/13 17:48, Alexander Graf wrote:
>> On 02/06/2013 12:47 AM, Jesse Larrew wrote:
>>> Currently, the config size for virtio devices is hard coded. When a new
>>> feature is added that changes the config size, drivers that assume a static
>>> config size will break. For purposes of backward compatibility, there needs
>>> to be a way to inform drivers of the config size needed to accommodate the
>>> set of features enabled.
>>>
>>> Signed-off-by: Jesse Larrew
>>
>> The following patch gets my s390 virtio guest working again, but I doubt 
>> it's the right fix.
>>
>> What is the expected dependency chain of feature calls?
>>
>>
>> Alex
>>
>> diff --git a/hw/s390x/s390-virtio-bus.c b/hw/s390x/s390-virtio-bus.c
>> index 089ed92..81be971 100644
>> --- a/hw/s390x/s390-virtio-bus.c
>> +++ b/hw/s390x/s390-virtio-bus.c
>> @@ -154,7 +154,7 @@ static int s390_virtio_net_init(VirtIOS390Device *dev)
>>  VirtIODevice *vdev;
>>
>>  vdev = virtio_net_init((DeviceState *)dev, &dev->nic, &dev->net,
>> -   dev->host_features);
>> +   dev->host_features | (1 << VIRTIO_NET_F_MAC));
>>  if (!vdev) {
>>  return -1;
>>  }
>>
>>
> 
> Actually this goes back to
> 
> commit 1e89ad5b00ba0426d4e949c9e6ce2926c15b81b7
> Author: Anthony Liguori 
> Date:   Tue Feb 5 17:47:15 2013 -0600
> 
> virtio-net: pass host features to virtio_net_init
> 
> Signed-off-by: Anthony Liguori 
> 
> virtio-s390 calls  virtio_net_init  before it actually queries the 
> dev->features into
> the host_features. virtio-ccw does the same, but it does not BUG. (Its still 
> wrong IMHO)
> 
> Same for virtio-pci:
> 
> 
> static int virtio_net_init_pci(PCIDevice *pci_dev)
> {
> VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
> VirtIODevice *vdev;
> 
> vdev = virtio_net_init(&pci_dev->qdev, &proxy->nic, &proxy->net,
>proxy->host_features);   <--- use host_feature
> 
> vdev->nvectors = proxy->nvectors;
> virtio_init_pci(proxy, vdev);   <- actually gets host_feature 
> (!)
> [..]
> 
> 
> Good that my old rusty virtio-ccw transport has lots of BUG_ONS :-)
> 
> 
>

Re: [Qemu-devel] [PATCH 28/28] hw/sd.c: add SD card save/load support

2013-03-07 Thread Igor Mitsyanko

 On 03/06/2013 10:52 PM, Peter Maydell wrote:

On 7 March 2013 02:31, Michael Walle 
 wrote:

 Sorry for digging out such an old thread :) but this patch introduced a memory
corruption, see below.

 CC'ing Igor as the author of the patch...


 Am Dienstag 30 Oktober 2012, 09:44:24 schrieb Peter Maydell:

 From: Igor Mitsyanko  

This patch updates SD card model to support save/load of card's state.

+static const VMStateDescription sd_vmstate = {
+.name = "sd-card",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32(mode, SDState),
+VMSTATE_INT32(state, SDState),
+VMSTATE_UINT8_ARRAY(cid, SDState, 16),
+VMSTATE_UINT8_ARRAY(csd, SDState, 16),
+VMSTATE_UINT16(rca, SDState),
+VMSTATE_UINT32(card_status, SDState),
+VMSTATE_PARTIAL_BUFFER(sd_status, SDState, 1),
+VMSTATE_UINT32(vhs, SDState),
+VMSTATE_BITMAP(wp_groups, SDState, 0, wpgrps_size),
+VMSTATE_UINT32(blk_len, SDState),
+VMSTATE_UINT32(erase_start, SDState),
+VMSTATE_UINT32(erase_end, SDState),
+VMSTATE_UINT8_ARRAY(pwd, SDState, 16),
+VMSTATE_UINT32(pwd_len, SDState),
+VMSTATE_UINT8_ARRAY(function_group, SDState, 6),
+VMSTATE_UINT8(current_cmd, SDState),
+VMSTATE_BOOL(expecting_acmd, SDState),
+VMSTATE_UINT32(blk_written, SDState),
+VMSTATE_UINT64(data_start, SDState),
+VMSTATE_UINT32(data_offset, SDState),
+VMSTATE_UINT8_ARRAY(data, SDState, 512),
+VMSTATE_BUFFER_UNSAFE(buf, SDState, 1, 512),

 buf is dynamically allocated in the sd_init(), see also the SDState:

struct SDState {
[...]
uint8_t *buf;

bool enable;
};


 Agreed, VMSTATE_BUFFER_UNSAFE() is for buffers that are inline
in the struct, not for buffers that the struct only points to.
I guess we want one of the VMSTATE_VARRAY_* types instead.

thanks
-- PMM


Oops, sorry. It needs VMS_POINTER flag apparently. Looks like hw/onenand.c
has the same problem with it's "otp" variable?
I can think of two good ways to fix it:

1. Use VMSTATE_BUFFER_UNSAFE_INFO instead. Just pass a custom VMStateInfo
to it where we will dereference address and get a proper data pointer.
hw/pci/pci.c does this, for example.

2. Introduce a new vmstate macro VMSTATE_BUFFER_POINTER_UNSAFE, which will
be the same as VMSTATE_BUFFER_UNSAFE,
but with an extra VMS_POINTER flag. The best option in my opinion, it also
could be reused for hw/onenand.c

Re: [Qemu-devel] [PATCH] sysbus: add no_user for devices using mmio or IRQ or GPIO

2013-03-07 Thread Markus Armbruster

Peter Maydell  writes:

> On 7 March 2013 16:48, Markus Armbruster  wrote:
>> Can we make some progress towards something that makes more sense?
>>
>> First step: reasons for marking a device no_user.
>>
>> From a user point of view, I think there's just one: -device/device_add
>> cannot possibly result in a working device.  Coherent enough semantics
>> for no_user, in my opinion, but I readily concede it's not particularly
>> useful for maintaining it as infrastructure and device models evolve.
>
> Ideally it would be nice to move to a model where the error message
> was "the device you've created has some required connections which
> haven't been wired up", and then for some devices you'd always get
> that error [until we implemented syntax for doing the wiring] and
> for some you'd only get it if you messed up the command line switches.

Sounds good to me.

>> For that, we need to drill down into reasons for "cannot possibly work".
>> Here are two, please add more:
>>
>> * Can't make required connections
>
>> * Resource collision with board
>>
>>   The device must connect to some fixed resource, but the board already
>>   connects something to it.  Without no_user, this should result in an
>>   error message of sorts, which is much better than the silent breakage
>>   above.  Whether the message makes any sense to the user is a different
>>   question.
>
> Do we have any concrete examples of this? I don't think devices should
> be making connections to resources themselves. If the only things wiring
> up devices are (a) the board models and (b) anything the user says on
> the command line, then the conflict only happens if the user says
> -device foo,bar=0x42 and bar 0x42 is in use; in that case the message
> will make sense to them.

Not exactly what you asked for, but here goes anyway:

$ qemu-system-x86_64 -nodefaults -S -vnc :0 -monitor stdio -device 
isa-fdc,id=foo
QEMU 1.4.50 monitor - type 'help' for more information
(qemu) info qtree
bus: main-system-bus
[...]
bus: isa.0
  type ISA
  dev: isa-fdc, id "foo"
iobase = 0x3f0
irq = 6
dma = 2
driveA = 
driveB = 
bootindexA = -1
bootindexB = -1
check_media_rate = on
isa irq 6
  dev: isa-fdc, id ""
iobase = 0x3f0
irq = 6
dma = 2
driveA = 
driveB = 
bootindexA = -1
bootindexB = -1
check_media_rate = on
isa irq 6
[...]
$ qemu-system-x86_64 -nodefaults -S -vnc :0 -monitor stdio -device 
q35-pcihost
Segmentation fault (core dumped)

> I think a third case for no-user is "this device is actually an
> abstract base class" (eg arm_gic_common), which we could deal with
> by checking the TypeInfo instead.

Yes.

> Case four: "we really don't expect anybody to be trying to wire this
> up dynamically", which would apply to things like the on-cpu peripherals
> for some ARM cores. There it is really just an attempt at being friendly
> by cutting down the length of the devices list.

Yes.  Case-by-case decision, I guess.

> Speaking of friendlyness, it might be helpful if the '-devices help'
> list was somehow more structured than a single long list and if
> more devices had description text?

Oh yes.  I wanted to do that, but when Anthony started to dig up qdev, I
hastened to get out of the way.

>>>  [and I don't think "this device
>>> can be added via the monitor but not the command line"
>>> counts as consistent or coherent...]
>>
>> no_user applies equally to -device and device_add.
>
> In the codebase as it stands, it applies only to -device.
> I agree that we should be consistent here, which we could do
> by applying Christian's patch or some variation to make device_add
> honour no_user. (Or by removing no_user altogether :-))

Actually, it appears to apply only to help now!

git-bisect blames this one:

commit 18b6dade8c0799c48f5c5e124b8c407cd5e22e96
Author: Anthony Liguori 
Date:   Fri Dec 9 12:08:01 2011 -0600

qdev: refactor device creation to allow bus_info to be set only in class

As we use class_init to set class members, DeviceInfo no longer holds this
information.

Signed-off-by: Anthony Liguori 

Review fail, testing fail :(

[Qemu-devel] [PATCH v2 0/7] threadpool: support multiple ThreadPools

2013-03-07 Thread Stefan Hajnoczi

This patch series changes the global thread pool to a one ThreadPool per
AioContext model.  We still only use the main loop AioContext so in practice
there is just one ThreadPool.  But this opens the door to refactoring the block
layer (which depends on ThreadPool) so block devices can be accessed outside
the global mutex in the future.

ThreadPool is tightly bound to an AioContext because it uses an EventNotifier
to signal work completion.  Completed work items are reaped and their callback
functions are invoked from the EventNotifier read handler (executing under
AioContext).

It might be possible to record the AioContext for the completion callback on a
per-request basis and continuing to use a global pool of worker threads.  After
discussing thread pool models with Paolo I have been convinced that it is
simpler and more scalable to have one ThreadPool per AioContext instead.
Therefore this series implements the 1:1 approach.  For details on previous
thread pool model discussion, see:
http://lists.gnu.org/archive/html/qemu-devel/2013-02/msg03987.html

The final patch was previously separate but I have included it because it
depends on qemu_get_aio_context().  It is unrelated to ThreadPool but the patch
reviewers are the same in both instances, so I combined the series.

At the end of this series block/raw-posix.c and block/raw-win32.c are aware of
the ThreadPool they submit work to.  The next step after this series is to
associate BlockDriverState with an AioContext so that the block layer can run
outside the global main loop.

v2:
 * Always find AioContext, don't split if (ctx) cases [Paolo]
 * Introduce bdrv_get_aio_context() [Paolo]
 * Add CoQueue AioContext patch since it depends on qemu_get_aio_context()

Stefan Hajnoczi (7):
  main-loop: add qemu_get_aio_context()
  threadpool: move globals into struct ThreadPool
  threadpool: add thread_pool_new() and thread_pool_free()
  aio: add a ThreadPool instance to AioContext
  block: add bdrv_get_aio_context()
  threadpool: drop global thread pool
  coroutine: use AioContext for CoQueue BH

 async.c |  11 ++
 block.c |   6 ++
 block/raw-posix.c   |   8 +-
 block/raw-win32.c   |   4 +-
 include/block/aio.h |   6 ++
 include/block/block_int.h   |   7 ++
 include/block/coroutine.h   |   1 +
 include/block/thread-pool.h |  15 ++-
 include/qemu/main-loop.h|   5 +
 main-loop.c |   5 +
 qemu-coroutine-lock.c   |  55 ++
 tests/test-thread-pool.c|  44 
 thread-pool.c   | 243 
 trace-events|   4 +-
 14 files changed, 276 insertions(+), 138 deletions(-)

-- 
1.8.1.4

[Qemu-devel] [PATCH v2 1/7] main-loop: add qemu_get_aio_context()

2013-03-07 Thread Stefan Hajnoczi

It is very useful to get the main loop AioContext, which is a static
variable in main-loop.c.

I'm not sure whether qemu_get_aio_context() will be necessary in the
future once devices focus on using their own AioContext instead of the
main loop AioContext, but for now it allows us to refactor code to
support multiple AioContext while actually passing the main loop
AioContext.

Signed-off-by: Stefan Hajnoczi 
---
 include/qemu/main-loop.h | 5 +
 main-loop.c  | 5 +
 2 files changed, 10 insertions(+)

diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index 0995288..6f0200a 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -82,6 +82,11 @@ int qemu_init_main_loop(void);
 int main_loop_wait(int nonblocking);
 
 /**
+ * qemu_get_aio_context: Return the main loop's AioContext
+ */
+AioContext *qemu_get_aio_context(void);
+
+/**
  * qemu_notify_event: Force processing of pending events.
  *
  * Similar to signaling a condition variable, qemu_notify_event forces
diff --git a/main-loop.c b/main-loop.c
index 8c9b58c..eb80ff3 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -109,6 +109,11 @@ static int qemu_signal_init(void)
 
 static AioContext *qemu_aio_context;
 
+AioContext *qemu_get_aio_context(void)
+{
+return qemu_aio_context;
+}
+
 void qemu_notify_event(void)
 {
 if (!qemu_aio_context) {
-- 
1.8.1.4

[Qemu-devel] [PATCH v2 2/7] threadpool: move globals into struct ThreadPool

2013-03-07 Thread Stefan Hajnoczi

Move global variables into a struct so multiple thread pools can be
supported in the future.

This patch does not change thread-pool.h interfaces.  There is still a
global thread pool and it is not yet possible to create/destroy
individual thread pools.  Moving the variables into a struct first makes
later patches easier to review.

Signed-off-by: Stefan Hajnoczi 
---
 thread-pool.c | 190 +-
 trace-events  |   4 +-
 2 files changed, 112 insertions(+), 82 deletions(-)

diff --git a/thread-pool.c b/thread-pool.c
index e3ca64d..a0aecd0 100644
--- a/thread-pool.c
+++ b/thread-pool.c
@@ -24,7 +24,9 @@
 #include "qemu/event_notifier.h"
 #include "block/thread-pool.h"
 
-static void do_spawn_thread(void);
+typedef struct ThreadPool ThreadPool;
+
+static void do_spawn_thread(ThreadPool *pool);
 
 typedef struct ThreadPoolElement ThreadPoolElement;
 
@@ -37,6 +39,7 @@ enum ThreadState {
 
 struct ThreadPoolElement {
 BlockDriverAIOCB common;
+ThreadPool *pool;
 ThreadPoolFunc *func;
 void *arg;
 
@@ -54,49 +57,56 @@ struct ThreadPoolElement {
 QLIST_ENTRY(ThreadPoolElement) all;
 };
 
-static EventNotifier notifier;
-static QemuMutex lock;
-static QemuCond check_cancel;
-static QemuSemaphore sem;
-static int max_threads = 64;
-static QEMUBH *new_thread_bh;
-
-/* The following variables are protected by the global mutex.  */
-static QLIST_HEAD(, ThreadPoolElement) head;
-
-/* The following variables are protected by lock.  */
-static QTAILQ_HEAD(, ThreadPoolElement) request_list;
-static int cur_threads;
-static int idle_threads;
-static int new_threads; /* backlog of threads we need to create */
-static int pending_threads; /* threads created but not running yet */
-static int pending_cancellations; /* whether we need a cond_broadcast */
-
-static void *worker_thread(void *unused)
+struct ThreadPool {
+EventNotifier notifier;
+QemuMutex lock;
+QemuCond check_cancel;
+QemuSemaphore sem;
+int max_threads;
+QEMUBH *new_thread_bh;
+
+/* The following variables are only accessed from one AioContext. */
+QLIST_HEAD(, ThreadPoolElement) head;
+
+/* The following variables are protected by lock.  */
+QTAILQ_HEAD(, ThreadPoolElement) request_list;
+int cur_threads;
+int idle_threads;
+int new_threads; /* backlog of threads we need to create */
+int pending_threads; /* threads created but not running yet */
+int pending_cancellations; /* whether we need a cond_broadcast */
+};
+
+/* Currently there is only one thread pool instance. */
+static ThreadPool global_pool;
+
+static void *worker_thread(void *opaque)
 {
-qemu_mutex_lock(&lock);
-pending_threads--;
-do_spawn_thread();
+ThreadPool *pool = opaque;
+
+qemu_mutex_lock(&pool->lock);
+pool->pending_threads--;
+do_spawn_thread(pool);
 
 while (1) {
 ThreadPoolElement *req;
 int ret;
 
 do {
-idle_threads++;
-qemu_mutex_unlock(&lock);
-ret = qemu_sem_timedwait(&sem, 1);
-qemu_mutex_lock(&lock);
-idle_threads--;
-} while (ret == -1 && !QTAILQ_EMPTY(&request_list));
+pool->idle_threads++;
+qemu_mutex_unlock(&pool->lock);
+ret = qemu_sem_timedwait(&pool->sem, 1);
+qemu_mutex_lock(&pool->lock);
+pool->idle_threads--;
+} while (ret == -1 && !QTAILQ_EMPTY(&pool->request_list));
 if (ret == -1) {
 break;
 }
 
-req = QTAILQ_FIRST(&request_list);
-QTAILQ_REMOVE(&request_list, req, reqs);
+req = QTAILQ_FIRST(&pool->request_list);
+QTAILQ_REMOVE(&pool->request_list, req, reqs);
 req->state = THREAD_ACTIVE;
-qemu_mutex_unlock(&lock);
+qemu_mutex_unlock(&pool->lock);
 
 ret = req->func(req->arg);
 
@@ -105,45 +115,47 @@ static void *worker_thread(void *unused)
 smp_wmb();
 req->state = THREAD_DONE;
 
-qemu_mutex_lock(&lock);
-if (pending_cancellations) {
-qemu_cond_broadcast(&check_cancel);
+qemu_mutex_lock(&pool->lock);
+if (pool->pending_cancellations) {
+qemu_cond_broadcast(&pool->check_cancel);
 }
 
-event_notifier_set(¬ifier);
+event_notifier_set(&pool->notifier);
 }
 
-cur_threads--;
-qemu_mutex_unlock(&lock);
+pool->cur_threads--;
+qemu_mutex_unlock(&pool->lock);
 return NULL;
 }
 
-static void do_spawn_thread(void)
+static void do_spawn_thread(ThreadPool *pool)
 {
 QemuThread t;
 
 /* Runs with lock taken.  */
-if (!new_threads) {
+if (!pool->new_threads) {
 return;
 }
 
-new_threads--;
-pending_threads++;
+pool->new_threads--;
+pool->pending_threads++;
 
-qemu_thread_create(&t, worker_thread, NULL, QEMU_THREAD_DETACHED);
+qemu_thread_create(&t, worker_thread, pool, QEMU_THREAD_DETAC

[Qemu-devel] [PATCH v2 5/7] block: add bdrv_get_aio_context()

2013-03-07 Thread Stefan Hajnoczi

For now bdrv_get_aio_context() is just a stub that calls
qemu_aio_get_context() since the block layer is currently tied to the
main loop AioContext.

Add the stub now so that the block layer can begin accessing its
AioContext.

Signed-off-by: Stefan Hajnoczi 
---
 block.c   | 6 ++
 include/block/block_int.h | 7 +++
 2 files changed, 13 insertions(+)

diff --git a/block.c b/block.c
index 124a9eb..0e5cd01 100644
--- a/block.c
+++ b/block.c
@@ -4638,3 +4638,9 @@ out:
 bdrv_delete(bs);
 }
 }
+
+AioContext *bdrv_get_aio_context(BlockDriverState *bs)
+{
+/* Currently BlockDriverState always uses the main loop AioContext */
+return qemu_get_aio_context();
+}
diff --git a/include/block/block_int.h b/include/block/block_int.h
index eaad53e..966f7fd 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -293,6 +293,13 @@ int get_tmp_filename(char *filename, int size);
 void bdrv_set_io_limits(BlockDriverState *bs,
 BlockIOLimit *io_limits);
 
+/**
+ * bdrv_get_aio_context:
+ *
+ * Returns: the currently bound #AioContext
+ */
+AioContext *bdrv_get_aio_context(BlockDriverState *bs);
+
 #ifdef _WIN32
 int is_windows_drive(const char *filename);
 #endif
-- 
1.8.1.4

[Qemu-devel] [PATCH v2 3/7] threadpool: add thread_pool_new() and thread_pool_free()

2013-03-07 Thread Stefan Hajnoczi

ThreadPool is tied to an AioContext through its event notifier, which
dictates in which AioContext the work item's callback function will be
invoked.

In order to support multiple AioContexts we need to support multiple
ThreadPool instances.

This patch adds the new/free functions.  The free function deserves
special attention because it quiesces remaining worker threads.  This
requires a new condition variable and a "stopping" flag to let workers
know they should terminate once idle.

We never needed to do this before since the global threadpool was not
explicitly destroyed until process termination.

Also stash the AioContext pointer in ThreadPool so that we can call
aio_set_event_notifier() in thread_pool_free().  We didn't need to hold
onto AioContext previously since there was no free function.

Signed-off-by: Stefan Hajnoczi 
---
 include/block/thread-pool.h |  5 +
 thread-pool.c   | 52 +
 2 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h
index 200703e..e1453c6 100644
--- a/include/block/thread-pool.h
+++ b/include/block/thread-pool.h
@@ -26,6 +26,11 @@
 
 typedef int ThreadPoolFunc(void *opaque);
 
+typedef struct ThreadPool ThreadPool;
+
+ThreadPool *thread_pool_new(struct AioContext *ctx);
+void thread_pool_free(ThreadPool *pool);
+
 BlockDriverAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg,
  BlockDriverCompletionFunc *cb, void *opaque);
 int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg);
diff --git a/thread-pool.c b/thread-pool.c
index a0aecd0..d1e4570 100644
--- a/thread-pool.c
+++ b/thread-pool.c
@@ -24,8 +24,6 @@
 #include "qemu/event_notifier.h"
 #include "block/thread-pool.h"
 
-typedef struct ThreadPool ThreadPool;
-
 static void do_spawn_thread(ThreadPool *pool);
 
 typedef struct ThreadPoolElement ThreadPoolElement;
@@ -59,8 +57,10 @@ struct ThreadPoolElement {
 
 struct ThreadPool {
 EventNotifier notifier;
+AioContext *ctx;
 QemuMutex lock;
 QemuCond check_cancel;
+QemuCond worker_stopped;
 QemuSemaphore sem;
 int max_threads;
 QEMUBH *new_thread_bh;
@@ -75,6 +75,7 @@ struct ThreadPool {
 int new_threads; /* backlog of threads we need to create */
 int pending_threads; /* threads created but not running yet */
 int pending_cancellations; /* whether we need a cond_broadcast */
+bool stopping;
 };
 
 /* Currently there is only one thread pool instance. */
@@ -88,7 +89,7 @@ static void *worker_thread(void *opaque)
 pool->pending_threads--;
 do_spawn_thread(pool);
 
-while (1) {
+while (!pool->stopping) {
 ThreadPoolElement *req;
 int ret;
 
@@ -99,7 +100,7 @@ static void *worker_thread(void *opaque)
 qemu_mutex_lock(&pool->lock);
 pool->idle_threads--;
 } while (ret == -1 && !QTAILQ_EMPTY(&pool->request_list));
-if (ret == -1) {
+if (ret == -1 || pool->stopping) {
 break;
 }
 
@@ -124,6 +125,7 @@ static void *worker_thread(void *opaque)
 }
 
 pool->cur_threads--;
+qemu_cond_signal(&pool->worker_stopped);
 qemu_mutex_unlock(&pool->lock);
 return NULL;
 }
@@ -298,8 +300,10 @@ static void thread_pool_init_one(ThreadPool *pool, 
AioContext *ctx)
 
 memset(pool, 0, sizeof(*pool));
 event_notifier_init(&pool->notifier, false);
+pool->ctx = ctx;
 qemu_mutex_init(&pool->lock);
 qemu_cond_init(&pool->check_cancel);
+qemu_cond_init(&pool->worker_stopped);
 qemu_sem_init(&pool->sem, 0);
 pool->max_threads = 64;
 pool->new_thread_bh = aio_bh_new(ctx, spawn_thread_bh_fn, pool);
@@ -311,6 +315,46 @@ static void thread_pool_init_one(ThreadPool *pool, 
AioContext *ctx)
thread_pool_active);
 }
 
+ThreadPool *thread_pool_new(AioContext *ctx)
+{
+ThreadPool *pool = g_new(ThreadPool, 1);
+thread_pool_init_one(pool, ctx);
+return pool;
+}
+
+void thread_pool_free(ThreadPool *pool)
+{
+if (!pool) {
+return;
+}
+
+assert(QLIST_EMPTY(&pool->head));
+
+qemu_mutex_lock(&pool->lock);
+
+/* Stop new threads from spawning */
+qemu_bh_delete(pool->new_thread_bh);
+pool->cur_threads -= pool->new_threads;
+pool->new_threads = 0;
+
+/* Wait for worker threads to terminate */
+pool->stopping = true;
+while (pool->cur_threads > 0) {
+qemu_sem_post(&pool->sem);
+qemu_cond_wait(&pool->worker_stopped, &pool->lock);
+}
+
+qemu_mutex_unlock(&pool->lock);
+
+aio_set_event_notifier(pool->ctx, &pool->notifier, NULL, NULL);
+qemu_sem_destroy(&pool->sem);
+qemu_cond_destroy(&pool->check_cancel);
+qemu_cond_destroy(&pool->worker_stopped);
+qemu_mutex_destroy(&pool->lock);
+event_notifier_cleanup(&pool->notifier);
+g_free(pool);
+}
+
 static void thread_pool_init(void)
 {
 thread_pool_init_one(&global_pool, NULL);
-

[Qemu-devel] [PATCH v2 6/7] threadpool: drop global thread pool

2013-03-07 Thread Stefan Hajnoczi

Now that each AioContext has a ThreadPool and the main loop AioContext
can be fetched with bdrv_get_aio_context(), we can eliminate the concept
of a global thread pool from thread-pool.c.

The submit functions must take a ThreadPool* argument.

block/raw-posix.c and block/raw-win32.c use
aio_get_thread_pool(bdrv_get_aio_context(bs)) to fetch the main loop's
ThreadPool.

tests/test-thread-pool.c must be updated to reflect the new
thread_pool_submit() function prototypes.

Signed-off-by: Stefan Hajnoczi 

use bdrv_get_aio_context

Signed-off-by: Stefan Hajnoczi 
---
 block/raw-posix.c   |  8 ++--
 block/raw-win32.c   |  4 +++-
 include/block/thread-pool.h | 10 ++
 tests/test-thread-pool.c| 44 +---
 thread-pool.c   | 23 +++
 5 files changed, 43 insertions(+), 46 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 4dfdf98..8a3cdbc 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -750,6 +750,7 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, 
int fd,
 BlockDriverCompletionFunc *cb, void *opaque, int type)
 {
 RawPosixAIOData *acb = g_slice_new(RawPosixAIOData);
+ThreadPool *pool;
 
 acb->bs = bs;
 acb->aio_type = type;
@@ -763,7 +764,8 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, 
int fd,
 acb->aio_offset = sector_num * 512;
 
 trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
-return thread_pool_submit_aio(aio_worker, acb, cb, opaque);
+pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
 
 static BlockDriverAIOCB *raw_aio_submit(BlockDriverState *bs,
@@ -1413,6 +1415,7 @@ static BlockDriverAIOCB *hdev_aio_ioctl(BlockDriverState 
*bs,
 {
 BDRVRawState *s = bs->opaque;
 RawPosixAIOData *acb;
+ThreadPool *pool;
 
 if (fd_open(bs) < 0)
 return NULL;
@@ -1424,7 +1427,8 @@ static BlockDriverAIOCB *hdev_aio_ioctl(BlockDriverState 
*bs,
 acb->aio_offset = 0;
 acb->aio_ioctl_buf = buf;
 acb->aio_ioctl_cmd = req;
-return thread_pool_submit_aio(aio_worker, acb, cb, opaque);
+pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
 
 #elif defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
diff --git a/block/raw-win32.c b/block/raw-win32.c
index b89ac19..18e0068 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -144,6 +144,7 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, 
HANDLE hfile,
 BlockDriverCompletionFunc *cb, void *opaque, int type)
 {
 RawWin32AIOData *acb = g_slice_new(RawWin32AIOData);
+ThreadPool *pool;
 
 acb->bs = bs;
 acb->hfile = hfile;
@@ -157,7 +158,8 @@ static BlockDriverAIOCB *paio_submit(BlockDriverState *bs, 
HANDLE hfile,
 acb->aio_offset = sector_num * 512;
 
 trace_paio_submit(acb, opaque, sector_num, nb_sectors, type);
-return thread_pool_submit_aio(aio_worker, acb, cb, opaque);
+pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
 
 int qemu_ftruncate64(int fd, int64_t length)
diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h
index e1453c6..32afcdd 100644
--- a/include/block/thread-pool.h
+++ b/include/block/thread-pool.h
@@ -31,9 +31,11 @@ typedef struct ThreadPool ThreadPool;
 ThreadPool *thread_pool_new(struct AioContext *ctx);
 void thread_pool_free(ThreadPool *pool);
 
-BlockDriverAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg,
- BlockDriverCompletionFunc *cb, void *opaque);
-int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg);
-void thread_pool_submit(ThreadPoolFunc *func, void *arg);
+BlockDriverAIOCB *thread_pool_submit_aio(ThreadPool *pool,
+ThreadPoolFunc *func, void *arg,
+BlockDriverCompletionFunc *cb, void *opaque);
+int coroutine_fn thread_pool_submit_co(ThreadPool *pool,
+ThreadPoolFunc *func, void *arg);
+void thread_pool_submit(ThreadPool *pool, ThreadPoolFunc *func, void *arg);
 
 #endif
diff --git a/tests/test-thread-pool.c b/tests/test-thread-pool.c
index 9998e03..22915aa 100644
--- a/tests/test-thread-pool.c
+++ b/tests/test-thread-pool.c
@@ -4,6 +4,8 @@
 #include "block/thread-pool.h"
 #include "block/block.h"
 
+static AioContext *ctx;
+static ThreadPool *pool;
 static int active;
 
 typedef struct {
@@ -38,19 +40,10 @@ static void done_cb(void *opaque, int ret)
 active--;
 }
 
-/* A non-blocking poll of the main AIO context (we cannot use aio_poll
- * because we do not know the AioContext).
- */
-static void qemu_aio_wait_nonblocking(void)
-{
-qemu_notify_event();
-qemu_aio_wait();
-}
-
 /* Wait until all aio and bh activity has finished */
 static void qemu_aio_wait_all(void)
 {
-while (qemu_aio_wait()) {
+while (aio_poll(ctx, true)

Re: [Qemu-devel] [PATCH] sysbus: add no_user for devices using mmio or IRQ or GPIO

2013-03-07 Thread Peter Crosthwaite

Hi Peter,

[snip]
> Case four: "we really don't expect anybody to be trying to wire this
> up dynamically", which would apply to things like the on-cpu peripherals
> for some ARM cores.

What ARM cores were you thinking here exactly? We are already doing
dynamic machine creation of ARM systems including instantiation of
individual MPCore components.

Regards,
Peter

> There it is really just an attempt at being friendly
> by cutting down the length of the devices list.
>

[Qemu-devel] [PATCH v2 4/7] aio: add a ThreadPool instance to AioContext

2013-03-07 Thread Stefan Hajnoczi

This patch adds a ThreadPool to AioContext.  It's possible that some
AioContext instances will never use the ThreadPool, so defer creation
until aio_get_thread_pool().

The reason why AioContext should have the ThreadPool is because the
ThreadPool is bound to a AioContext instance where the work item's
callback function is invoked.  It doesn't make sense to keep the
ThreadPool pointer anywhere other than AioContext.  For example,
block/raw-posix.c can get its AioContext's ThreadPool and submit work.

Special note about headers: I used struct ThreadPool in aio.h because
there is a circular dependency if aio.h includes thread-pool.h.

Signed-off-by: Stefan Hajnoczi 
---
 async.c | 11 +++
 include/block/aio.h |  6 ++
 2 files changed, 17 insertions(+)

diff --git a/async.c b/async.c
index f2d47ba..90fe906 100644
--- a/async.c
+++ b/async.c
@@ -24,6 +24,7 @@
 
 #include "qemu-common.h"
 #include "block/aio.h"
+#include "block/thread-pool.h"
 #include "qemu/main-loop.h"
 
 /***/
@@ -172,6 +173,7 @@ aio_ctx_finalize(GSource *source)
 {
 AioContext *ctx = (AioContext *) source;
 
+thread_pool_free(ctx->thread_pool);
 aio_set_event_notifier(ctx, &ctx->notifier, NULL, NULL);
 event_notifier_cleanup(&ctx->notifier);
 g_array_free(ctx->pollfds, TRUE);
@@ -190,6 +192,14 @@ GSource *aio_get_g_source(AioContext *ctx)
 return &ctx->source;
 }
 
+ThreadPool *aio_get_thread_pool(AioContext *ctx)
+{
+if (!ctx->thread_pool) {
+ctx->thread_pool = thread_pool_new(ctx);
+}
+return ctx->thread_pool;
+}
+
 void aio_notify(AioContext *ctx)
 {
 event_notifier_set(&ctx->notifier);
@@ -200,6 +210,7 @@ AioContext *aio_context_new(void)
 AioContext *ctx;
 ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
 ctx->pollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
+ctx->thread_pool = NULL;
 event_notifier_init(&ctx->notifier, false);
 aio_set_event_notifier(ctx, &ctx->notifier, 
(EventNotifierHandler *)
diff --git a/include/block/aio.h b/include/block/aio.h
index 5b54d38..1836793 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -66,6 +66,9 @@ typedef struct AioContext {
 
 /* GPollFDs for aio_poll() */
 GArray *pollfds;
+
+/* Thread pool for performing work and receiving completion callbacks */
+struct ThreadPool *thread_pool;
 } AioContext;
 
 /* Returns 1 if there are still outstanding AIO requests; 0 otherwise */
@@ -223,6 +226,9 @@ void aio_set_event_notifier(AioContext *ctx,
  */
 GSource *aio_get_g_source(AioContext *ctx);
 
+/* Return the ThreadPool bound to this AioContext */
+struct ThreadPool *aio_get_thread_pool(AioContext *ctx);
+
 /* Functions to operate on the main QEMU AioContext.  */
 
 bool qemu_aio_wait(void);
-- 
1.8.1.4

[Qemu-devel] [PATCH v2 7/7] coroutine: use AioContext for CoQueue BH

2013-03-07 Thread Stefan Hajnoczi

CoQueue uses a BH to awake coroutines that were made ready to run again
using qemu_co_queue_next() or qemu_co_queue_restart_all().  The BH
currently runs in the iothread AioContext and would break coroutines
that run in a different AioContext.

This is a slightly tricky problem because the lifetime of the BH exceeds
that of the CoQueue.  This means coroutines can be awoken after CoQueue
itself has been freed.  Also, there is no qemu_co_queue_destroy()
function which we could use to handle freeing resources.

Introducing qemu_co_queue_destroy() has a ripple effect of requiring us
to also add qemu_co_mutex_destroy() and qemu_co_rwlock_destroy(), as
well as updating all callers.  Avoid doing that.

We also cannot switch from BH to GIdle function because aio_poll() does
not dispatch GIdle functions.  (GIdle functions make memory management
slightly easier because they free themselves.)

Finally, I don't want to move unlock_queue and unlock_bh into
AioContext.  That would break encapsulation - AioContext isn't supposed
to know about CoQueue.

This patch implements a different solution: each qemu_co_queue_next() or
qemu_co_queue_restart_all() call creates a new BH and list of coroutines
to wake up.  Callers tend to invoke qemu_co_queue_next() and
qemu_co_queue_restart_all() occasionally after blocking I/O, so creating
a new BH for each call shouldn't be massively inefficient.

Note that this patch does not add an interface for specifying the
AioContext.  That is left to future patches which will convert CoQueue,
CoMutex, and CoRwlock to expose AioContext.

Signed-off-by: Stefan Hajnoczi 
---
 include/block/coroutine.h |  1 +
 qemu-coroutine-lock.c | 55 ---
 2 files changed, 39 insertions(+), 17 deletions(-)

diff --git a/include/block/coroutine.h b/include/block/coroutine.h
index c31fae3..a978162 100644
--- a/include/block/coroutine.h
+++ b/include/block/coroutine.h
@@ -104,6 +104,7 @@ bool qemu_in_coroutine(void);
  */
 typedef struct CoQueue {
 QTAILQ_HEAD(, Coroutine) entries;
+AioContext *ctx;
 } CoQueue;
 
 /**
diff --git a/qemu-coroutine-lock.c b/qemu-coroutine-lock.c
index 97ef01c..86efe1f 100644
--- a/qemu-coroutine-lock.c
+++ b/qemu-coroutine-lock.c
@@ -29,28 +29,36 @@
 #include "block/aio.h"
 #include "trace.h"
 
-static QTAILQ_HEAD(, Coroutine) unlock_bh_queue =
-QTAILQ_HEAD_INITIALIZER(unlock_bh_queue);
-static QEMUBH* unlock_bh;
+/* Coroutines are awoken from a BH to allow the current coroutine to complete
+ * its flow of execution.  The BH may run after the CoQueue has been destroyed,
+ * so keep BH data in a separate heap-allocated struct.
+ */
+typedef struct {
+QEMUBH *bh;
+QTAILQ_HEAD(, Coroutine) entries;
+} CoQueueNextData;
 
 static void qemu_co_queue_next_bh(void *opaque)
 {
+CoQueueNextData *data = opaque;
 Coroutine *next;
 
 trace_qemu_co_queue_next_bh();
-while ((next = QTAILQ_FIRST(&unlock_bh_queue))) {
-QTAILQ_REMOVE(&unlock_bh_queue, next, co_queue_next);
+while ((next = QTAILQ_FIRST(&data->entries))) {
+QTAILQ_REMOVE(&data->entries, next, co_queue_next);
 qemu_coroutine_enter(next, NULL);
 }
+
+qemu_bh_delete(data->bh);
+g_slice_free(CoQueueNextData, data);
 }
 
 void qemu_co_queue_init(CoQueue *queue)
 {
 QTAILQ_INIT(&queue->entries);
 
-if (!unlock_bh) {
-unlock_bh = qemu_bh_new(qemu_co_queue_next_bh, NULL);
-}
+/* This will be exposed to callers once there are multiple AioContexts */
+queue->ctx = qemu_get_aio_context();
 }
 
 void coroutine_fn qemu_co_queue_wait(CoQueue *queue)
@@ -69,26 +77,39 @@ void coroutine_fn qemu_co_queue_wait_insert_head(CoQueue 
*queue)
 assert(qemu_in_coroutine());
 }
 
-bool qemu_co_queue_next(CoQueue *queue)
+static bool qemu_co_queue_do_restart(CoQueue *queue, bool single)
 {
 Coroutine *next;
+CoQueueNextData *data;
+
+if (QTAILQ_EMPTY(&queue->entries)) {
+return false;
+}
 
-next = QTAILQ_FIRST(&queue->entries);
-if (next) {
+data = g_slice_new(CoQueueNextData);
+data->bh = aio_bh_new(queue->ctx, qemu_co_queue_next_bh, data);
+QTAILQ_INIT(&data->entries);
+qemu_bh_schedule(data->bh);
+
+while ((next = QTAILQ_FIRST(&queue->entries)) != NULL) {
 QTAILQ_REMOVE(&queue->entries, next, co_queue_next);
-QTAILQ_INSERT_TAIL(&unlock_bh_queue, next, co_queue_next);
+QTAILQ_INSERT_TAIL(&data->entries, next, co_queue_next);
 trace_qemu_co_queue_next(next);
-qemu_bh_schedule(unlock_bh);
+if (single) {
+break;
+}
 }
+return true;
+}
 
-return (next != NULL);
+bool qemu_co_queue_next(CoQueue *queue)
+{
+return qemu_co_queue_do_restart(queue, true);
 }
 
 void qemu_co_queue_restart_all(CoQueue *queue)
 {
-while (qemu_co_queue_next(queue)) {
-/* Do nothing */
-}
+qemu_co_queue_do_restart(queue, false);
 }
 
 bool qemu_co_queue_empty(CoQueue *queue)
-- 
1.8.1.

Re: [Qemu-devel] [PATCH v13 5/5] VMXNET3 device implementation

2013-03-07 Thread Andreas Färber

Am 06.03.2013 08:21, schrieb Dmitry Fleytman:
> Signed-off-by: Dmitry Fleytman 
> Signed-off-by: Yan Vugenfirer 
> ---
>  default-configs/pci.mak |1 +
>  hw/Makefile.objs|1 +
>  hw/pci/pci.h|1 +
>  hw/vmxnet3.c| 2460 
> +++
>  hw/vmxnet3.h|  760 +++
>  5 files changed, 3223 insertions(+)
>  create mode 100644 hw/vmxnet3.c
>  create mode 100644 hw/vmxnet3.h
> 
> diff --git a/default-configs/pci.mak b/default-configs/pci.mak
> index ee2d18d..ce56d58 100644
> --- a/default-configs/pci.mak
> +++ b/default-configs/pci.mak
> @@ -13,6 +13,7 @@ CONFIG_LSI_SCSI_PCI=y
>  CONFIG_MEGASAS_SCSI_PCI=y
>  CONFIG_RTL8139_PCI=y
>  CONFIG_E1000_PCI=y
> +CONFIG_VMXNET3_PCI=y
>  CONFIG_IDE_CORE=y
>  CONFIG_IDE_QDEV=y
>  CONFIG_IDE_PCI=y
> diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> index 14922cb..026aff6 100644
> --- a/hw/Makefile.objs
> +++ b/hw/Makefile.objs
> @@ -120,6 +120,7 @@ common-obj-$(CONFIG_PCNET_COMMON) += pcnet.o
>  common-obj-$(CONFIG_E1000_PCI) += e1000.o
>  common-obj-$(CONFIG_RTL8139_PCI) += rtl8139.o
>  common-obj-$(CONFIG_VMXNET3_PCI) += vmxnet_tx_pkt.o vmxnet_rx_pkt.o
> +common-obj-$(CONFIG_VMXNET3_PCI) += vmxnet3.o
>  
>  common-obj-$(CONFIG_SMC91C111) += smc91c111.o
>  common-obj-$(CONFIG_LAN9118) += lan9118.o
> diff --git a/hw/pci/pci.h b/hw/pci/pci.h
> index f340fe5..3beb70b 100644
> --- a/hw/pci/pci.h
> +++ b/hw/pci/pci.h
> @@ -60,6 +60,7 @@
>  #define PCI_DEVICE_ID_VMWARE_NET 0x0720
>  #define PCI_DEVICE_ID_VMWARE_SCSI0x0730
>  #define PCI_DEVICE_ID_VMWARE_IDE 0x1729
> +#define PCI_DEVICE_ID_VMWARE_VMXNET3 0x07B0
>  
>  /* Intel (0x8086) */
>  #define PCI_DEVICE_ID_INTEL_82551IT  0x1209
> diff --git a/hw/vmxnet3.c b/hw/vmxnet3.c
> new file mode 100644
> index 000..75b7181
> --- /dev/null
> +++ b/hw/vmxnet3.c
> @@ -0,0 +1,2460 @@
> +/*
> + * QEMU VMWARE VMXNET3 paravirtual NIC
> + *
> + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com)
> + *
> + * Developed by Daynix Computing LTD (http://www.daynix.com)
> + *
> + * Authors:
> + * Dmitry Fleytman 
> + * Tamir Shomer 
> + * Yan Vugenfirer 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "hw.h"
> +#include "pci/pci.h"
> +#include "net/net.h"
> +#include "virtio-net.h"
> +#include "net/tap.h"
> +#include "net/checksum.h"
> +#include "sysemu/sysemu.h"
> +#include "qemu-common.h"
> +#include "qemu/bswap.h"
> +#include "pci/msix.h"
> +#include "pci/msi.h"
> +
> +#include "vmxnet3.h"
> +#include "vmxnet_debug.h"
> +#include "vmware_utils.h"
> +#include "vmxnet_tx_pkt.h"
> +#include "vmxnet_rx_pkt.h"
> +
> +#define PCI_DEVICE_ID_VMWARE_VMXNET3_REVISION 0x1
> +#define VMXNET3_MSIX_BAR_SIZE 0x2000
> +
> +#define VMXNET3_BAR0_IDX  (0)
> +#define VMXNET3_BAR1_IDX  (1)
> +#define VMXNET3_MSIX_BAR_IDX  (2)
> +
> +#define VMXNET3_OFF_MSIX_TABLE (0x000)
> +#define VMXNET3_OFF_MSIX_PBA   (0x800)
> +
> +/* Link speed in Mbps should be shifted by 16 */
> +#define VMXNET3_LINK_SPEED  (1000 << 16)
> +
> +/* Link status: 1 - up, 0 - down. */
> +#define VMXNET3_LINK_STATUS_UP  0x1
> +
> +/* Least significant bit should be set for revision and version */
> +#define VMXNET3_DEVICE_VERSION0x1
> +#define VMXNET3_DEVICE_REVISION   0x1
> +
> +/* Macros for rings descriptors access */
> +#define VMXNET3_READ_TX_QUEUE_DESCR8(dpa, field) \
> +(vmw_shmem_ld8(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field)))
> +
> +#define VMXNET3_WRITE_TX_QUEUE_DESCR8(dpa, field, value) \
> +(vmw_shmem_st8(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field, value)))
> +
> +#define VMXNET3_READ_TX_QUEUE_DESCR32(dpa, field) \
> +(vmw_shmem_ld32(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field)))
> +
> +#define VMXNET3_WRITE_TX_QUEUE_DESCR32(dpa, field, value) \
> +(vmw_shmem_st32(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field), 
> value))
> +
> +#define VMXNET3_READ_TX_QUEUE_DESCR64(dpa, field) \
> +(vmw_shmem_ld64(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field)))
> +
> +#define VMXNET3_WRITE_TX_QUEUE_DESCR64(dpa, field, value) \
> +(vmw_shmem_st64(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field), 
> value))
> +
> +#define VMXNET3_READ_RX_QUEUE_DESCR64(dpa, field) \
> +(vmw_shmem_ld64(dpa + offsetof(struct Vmxnet3_RxQueueDesc, field)))
> +
> +#define VMXNET3_READ_RX_QUEUE_DESCR32(dpa, field) \
> +(vmw_shmem_ld32(dpa + offsetof(struct Vmxnet3_RxQueueDesc, field)))
> +
> +#define VMXNET3_WRITE_RX_QUEUE_DESCR64(dpa, field, value) \
> +(vmw_shmem_st64(dpa + offsetof(struct Vmxnet3_RxQueueDesc, field), 
> value))
> +
> +#define VMXNET3_WRITE_RX_QUEUE_DESCR8(dpa, field, value) \
> +(vmw_shmem_st8(dpa + offsetof(struct Vmxnet3_RxQueueDesc, field), value))
> +
> +/* Macros for guest driver shared area access */
> +#define VMXNET3_READ_DRV_SHARED64(shpa, field)

Re: [Qemu-devel] [PATCHv3] qdev: DEVICE_DELETED event

2013-03-07 Thread Andreas Färber

Am 07.03.2013 10:49, schrieb Michael S. Tsirkin:
> libvirt has a long-standing bug: when removing the device,
> it can request removal but does not know when the
> removal completes. Add an event so we can fix this in a robust way.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
> 
> Changes from v2:
> - move event toward the end of device_unparent,
>   so that parents are reported after their children,
>   as suggested by Paolo
> Changes from v1:
> - move to device_unparent
> - address comments by Andreas and Eric

Reviewed-by: Andreas Färber 

Thanks,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [Qemu-stable] [SeaBIOS] problems with freeBSD

2013-03-07 Thread Michael Tokarev

07.03.2013 15:56, Gerd Hoffmann wrote:

>> Just a note, or an "alternative opinion", so to say.  In Debian, we have
>> a social contract which, among other things, ensures that the binaries
>> you, as a user, get, comes with source which you can modify on the
>> system you installed.  This requires, for example, that all binaries
>> shipped are actually built from the corresponding source, and that
>> no blobs from whatever other sources are used, ever.
> 
> That is perfectly fine.  How to you handle ppc firmware for x86 hosts
> btw?  Build on ppc buildhost and ship as noarch package?  Or do you do
> cross compiler builds, so that users can patch+rebuild it on their x86
> host too?

"Foreign" firmware handling is an unsolved issue.  Yes, basically,
it is built on the corresponding build machine (eg, openbios-ppc is
built on a ppc machine) and shipped as "noarch" package.  The same
is for X86 seabios/vgabios/etc stuff actually.

So technically this is a violation in some way.  However, that same
ppc firmware actually can be rebuilt in a (qemu) virtual machine with
the same debian release installed.

That's why I say it is basically unsolved issue -- the "solution"
isn't actuall a solution but a workaround instead.

For added "fun", currently we ship optionroms from qemu in seabios
source package, exactly because of this reason - seabios is what
gets built on x86 only and is distributed on as "noarch" binary,
while qemu is built on all architectures.

And for another fun, we've several broken qemu architectures in
Debian at the moment -- s390 and ppc64 are missing firmwares
exactly because of the difficulties building them, despite the
fact they're shipping in upstream qemu tarball.

>> We don't even ship any upstream blobs in the debian qemu _source_
>> package: we repack upstream qemu.tar.gz by removing these blobs.
> 
> That's a bit over the top for my taste as the release tarballs include
> both source and blobs.

Yes this is a bit extremistic, so to say.  But this is the same
base principle of Debian: we should ensure and demonstrate it all
is buildable.  Just mere presence of a blob in a tarball is already
suspicious :)

>   Although it might be the debian release is just
> a bit too old for that, not fully sure with which release anthony
> started to include the firmware submodules, might be it was after 1.1

Nope, qemu started including sources before that.  But it doesn't
matter really, it is just the way how debian works, nothing to
do with qemu really.  Especially since most of these submodules
are already packaged in Debian separately (be it because the
qemu needs or due to other means).

Thanks,

/mjt

Re: [Qemu-devel] [PATCH] qdev: DEVICE_DELETED event

2013-03-07 Thread Andreas Färber

Am 07.03.2013 11:07, schrieb Michael S. Tsirkin:
> On Thu, Mar 07, 2013 at 10:55:23AM +0100, Markus Armbruster wrote:
>> "Michael S. Tsirkin"  writes:
>>
>>> On Wed, Mar 06, 2013 at 02:57:22PM +0100, Andreas Färber wrote:
 Am 06.03.2013 14:00, schrieb Michael S. Tsirkin:
> libvirt has a long-standing bug: when removing the device,
> it can request removal but does not know when does the
> removal complete. Add an event so we can fix this in a robust way.
>
> Signed-off-by: Michael S. Tsirkin 

 Sounds like a good idea to me. :)

 [...]
> diff --git a/hw/qdev.c b/hw/qdev.c
> index 689cd54..f30d251 100644
> --- a/hw/qdev.c
> +++ b/hw/qdev.c
> @@ -29,6 +29,7 @@
>  #include "sysemu/sysemu.h"
>  #include "qapi/error.h"
>  #include "qapi/visitor.h"
> +#include "qapi/qmp/qjson.h"
>  
>  int qdev_hotplug = 0;
>  static bool qdev_hot_added = false;
> @@ -267,6 +268,11 @@ void qdev_init_nofail(DeviceState *dev)
>  /* Unlink device from bus and free the structure.  */
>  void qdev_free(DeviceState *dev)
>  {
> +if (dev->id) {
> +QObject *data = qobject_from_jsonf("{ 'device': %s }", dev->id);
> +monitor_protocol_event(QEVENT_DEVICE_DELETED, data);
> +qobject_decref(data);
> +}
>  object_unparent(OBJECT(dev));
>  }
>  

 I'm pretty sure this is the wrong place to fire the notification. We
 should rather do this when the device is actually deleted - which
 qdev_free() does *not* actually guarantee, as criticized in the s390x
 and unref'ing contexts.
 I would suggest to place your code into device_unparent() instead.

 Another thing to consider is what data to pass to the event: Not all
 devices have an ID.
>>>
>>> If they don't they were not created by management so management is
>>> probably not interested in them being removed.
>>>
>>> We could always add a 'path' key later if this assumption
>>> proves incorrect.
>>
>> In old qdev, ID was all we had, because paths were busted.  Thus,
>> management had no choice but use IDs.
>>
>> If I understand modern qdev correctly, we got a canonical path.  Old
>> APIs like device_del still accept only ID.  Should new APIs still be
>> designed that way?  Or should they always accept / provide the canonical
>> path, plus optional ID for convenience?
> 
> What are advantages of exposing the path to users in this way?
> Looks like maintainance hassle without real benefits?

Anthony had rejected earlier QOM patches by Paolo related to qdev id,
saying it was deprecated in favor of those QOM paths.

 We should still have a canonical path when we fire
 this event in either qdev_free() or in device_unparent() before the if
 (dev->parent_bus) block though. That would be a question for Anthony,
 not having a use case for the event I am indifferent there.

 Further, thinking of objects such as virtio-rng backends or future
 blockdev/chardev objects, might it make sense to turn this into a
 generic object deletion event rather than a device event?

 Andreas
>>>
>>> Backend deletion doesn't normally have guest interaction right?
>>> So why do we need an event?
>>
>> We need an event because device_del may send its reply before it
>> completes the job.
>>
>> device_del does that when it deletion needs to interact with the guest,
>> which can take unbounded time.
>>
>> Conversely, we don't need an event when a QMP always completes the job
>> (as far as observable by the QMP client) before it sends its reply.  Off
>> hand, I can't see why backend deletion would do anything else.
>>
>> I'm always reluctant to abstract when there are fewer than two
>> different, concrete things to abstract from.  Right now, we got just
>> one: device models.

Not quite: It's about unparenting hook and object deletion, which are
both not limited to devices.

But if the ID based approach gets accepted by Anthony then we can still
introduce an OBJECT_DELETED event once someone needs it.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH v2] migration: Improve QMP documentation

2013-03-07 Thread Markus Armbruster

Juan Quintela  writes:

> v2:
> Add Markus sugestions:

This needs to go below the --- line.

> Signed-off-by: Juan Quintela 
> ---
>  qmp-commands.hx | 50 ++
>  1 file changed, 30 insertions(+), 20 deletions(-)
>
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index 95022e2..a2c0c6b 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -644,7 +644,7 @@ EQMP
>
>  SQMP
>  migrate-set-cache-size
> --
> +--
>
>  Set cache size to be used by XBZRLE migration, the cache size will be rounded
>  down to the nearest power of 2
> @@ -667,7 +667,7 @@ EQMP
>
>  SQMP
>  query-migrate-cache-size
> --
> +
>
>  Show cache size to be used by XBZRLE migration
>
> @@ -2438,25 +2438,35 @@ The main json-object contains the following:
>  total amount in ms for downtime that was calculated on
>   the last bitmap round (json-int)
>  - "ram": only present if "status" is "active", it is a json-object with the
> -  following RAM information (in bytes):
> - - "transferred": amount transferred (json-int)
> - - "remaining": amount remaining (json-int)
> - - "total": total (json-int)
> - - "duplicate": number of duplicated pages (json-int)
> - - "normal" : number of normal pages transferred (json-int)
> - - "normal-bytes" : number of normal bytes transferred (json-int)
> +  following RAM information:
> + - "transferred": amount transferred in bytes (json-int)
> + - "remaining": amount remaining to transfer in bytes (json-int)
> + - "total": total amount of memory in bytes (json-int)
> + - "duplicate": number of pages filled entirely with the same
> +byte (json-int)
> + These are sent over the wire much more efficiently.

Tab damage.

> + - "normal" : number of whole pages transfered.  I.e. they
> +were not sent as duplicate or xbzrle pages (json-int)
> + - "normal-bytes" : number of bytes transferred in whole
> +pages. This is just normal pages times size of one page,
> +but this way upper levels don't need to care about page
> +size (json-int)
>  - "disk": only present if "status" is "active" and it is a block migration,
> -  it is a json-object with the following disk information (in bytes):
> - - "transferred": amount transferred (json-int)
> - - "remaining": amount remaining (json-int)
> - - "total": total (json-int)
> +  it is a json-object with the following disk information:
> + - "transferred": amount transferred in bytes (json-int)
> + - "remaining": amount remaining to transfer in bytes json-int)
> + - "total": total disk size in bytes (json-int)
>  - "xbzrle-cache": only present if XBZRLE is active.
>It is a json-object with the following XBZRLE information:
> - - "cache-size": XBZRLE cache size
> - - "bytes": total XBZRLE bytes transferred

Suggest

 - "bytes": number of bytes transferred for XBZRLE compressed pages


> + - "cache-size": XBZRLE cache size in bytes
> + - "bytes": total XBZRLE bytes transferred as xbzrle pages
>   - "pages": number of XBZRLE compressed pages
> - - "cache-miss": number of cache misses
> - - "overflow": number of XBZRLE overflows
> + - "cache-miss": number of XBRZRLE page cache misses
> + - "overflow": number of times XBZRLE overflows.  This means
> +   that the XBZRLE encoding was bigger than just sent the
> +   whole page, and then we sent the whole page instead (as as
> +   normal page).
> +
>  Examples:
>
>  1. Before the first migration
[...]

[Qemu-devel] [PATCHv2] migration: move ram migration support

2013-03-07 Thread Michael S. Tsirkin

Move RAM migration code from arch_init to savevm-ram.

Signed-off-by: Michael S. Tsirkin 
---

Note: this is on top of Juan's pull request

Changes from v1:
- renamed source file, rebased on top of migration.next as
  suggested by Paolo

 Makefile.target |   2 +-
 arch_init.c | 763 -
 savevm-ram.c| 804 
 3 files changed, 805 insertions(+), 764 deletions(-)
 create mode 100644 savevm-ram.c

diff --git a/Makefile.target b/Makefile.target
index ca657b3..54bc21b 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -108,7 +108,7 @@ CONFIG_NO_XEN = $(if $(subst n,,$(CONFIG_XEN)),n,y)
 CONFIG_NO_GET_MEMORY_MAPPING = $(if $(subst 
n,,$(CONFIG_HAVE_GET_MEMORY_MAPPING)),n,y)
 CONFIG_NO_CORE_DUMP = $(if $(subst n,,$(CONFIG_HAVE_CORE_DUMP)),n,y)
 
-obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o
+obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o savevm-ram.o
 obj-y += qtest.o
 obj-y += hw/
 obj-$(CONFIG_KVM) += kvm-all.o
diff --git a/arch_init.c b/arch_init.c
index 98e2bc6..9943ed4 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -31,20 +31,15 @@
 #include "config.h"
 #include "monitor/monitor.h"
 #include "sysemu/sysemu.h"
-#include "qemu/bitops.h"
-#include "qemu/bitmap.h"
 #include "sysemu/arch_init.h"
 #include "audio/audio.h"
 #include "hw/pc.h"
 #include "hw/pci/pci.h"
 #include "hw/audiodev.h"
 #include "sysemu/kvm.h"
-#include "migration/migration.h"
 #include "exec/gdbstub.h"
 #include "hw/smbios.h"
-#include "exec/address-spaces.h"
 #include "hw/pcspk.h"
-#include "migration/page_cache.h"
 #include "qemu/config-file.h"
 #include "qmp-commands.h"
 #include "trace.h"
@@ -103,38 +98,6 @@ int graphic_depth = 15;
 
 const uint32_t arch_type = QEMU_ARCH;
 
-/***/
-/* ram save/restore */
-
-#define RAM_SAVE_FLAG_FULL 0x01 /* Obsolete, not used anymore */
-#define RAM_SAVE_FLAG_COMPRESS 0x02
-#define RAM_SAVE_FLAG_MEM_SIZE 0x04
-#define RAM_SAVE_FLAG_PAGE 0x08
-#define RAM_SAVE_FLAG_EOS  0x10
-#define RAM_SAVE_FLAG_CONTINUE 0x20
-#define RAM_SAVE_FLAG_XBZRLE   0x40
-
-#ifdef __ALTIVEC__
-#include 
-#define VECTYPEvector unsigned char
-#define SPLAT(p)   vec_splat(vec_ld(0, p), 0)
-#define ALL_EQ(v1, v2) vec_all_eq(v1, v2)
-/* altivec.h may redefine the bool macro as vector type.
- * Reset it to POSIX semantics. */
-#undef bool
-#define bool _Bool
-#elif defined __SSE2__
-#include 
-#define VECTYPE__m128i
-#define SPLAT(p)   _mm_set1_epi8(*(p))
-#define ALL_EQ(v1, v2) (_mm_movemask_epi8(_mm_cmpeq_epi8(v1, v2)) == 0x)
-#else
-#define VECTYPEunsigned long
-#define SPLAT(p)   (*(p) * (~0UL / 255))
-#define ALL_EQ(v1, v2) ((v1) == (v2))
-#endif
-
-
 static struct defconfig_file {
 const char *filename;
 /* Indicates it is an user config file (disabled by -no-user-config) */
@@ -145,7 +108,6 @@ static struct defconfig_file {
 { NULL }, /* end of list */
 };
 
-
 int qemu_read_default_config_files(bool userconfig)
 {
 int ret;
@@ -164,731 +126,6 @@ int qemu_read_default_config_files(bool userconfig)
 return 0;
 }
 
-static int is_dup_page(uint8_t *page)
-{
-VECTYPE *p = (VECTYPE *)page;
-VECTYPE val = SPLAT(page);
-int i;
-
-for (i = 0; i < TARGET_PAGE_SIZE / sizeof(VECTYPE); i++) {
-if (!ALL_EQ(val, p[i])) {
-return 0;
-}
-}
-
-return 1;
-}
-
-/* struct contains XBZRLE cache and a static page
-   used by the compression */
-static struct {
-/* buffer used for XBZRLE encoding */
-uint8_t *encoded_buf;
-/* buffer for storing page content */
-uint8_t *current_buf;
-/* buffer used for XBZRLE decoding */
-uint8_t *decoded_buf;
-/* Cache for XBZRLE */
-PageCache *cache;
-} XBZRLE = {
-.encoded_buf = NULL,
-.current_buf = NULL,
-.decoded_buf = NULL,
-.cache = NULL,
-};
-
-
-int64_t xbzrle_cache_resize(int64_t new_size)
-{
-if (XBZRLE.cache != NULL) {
-return cache_resize(XBZRLE.cache, new_size / TARGET_PAGE_SIZE) *
-TARGET_PAGE_SIZE;
-}
-return pow2floor(new_size);
-}
-
-/* accounting for migration statistics */
-typedef struct AccountingInfo {
-uint64_t dup_pages;
-uint64_t norm_pages;
-uint64_t iterations;
-uint64_t xbzrle_bytes;
-uint64_t xbzrle_pages;
-uint64_t xbzrle_cache_miss;
-uint64_t xbzrle_overflows;
-} AccountingInfo;
-
-static AccountingInfo acct_info;
-
-static void acct_clear(void)
-{
-memset(&acct_info, 0, sizeof(acct_info));
-}
-
-uint64_t dup_mig_bytes_transferred(void)
-{
-return acct_info.dup_pages * TARGET_PAGE_SIZE;
-}
-
-uint64_t dup_mig_pages_transferred(void)
-{
-return acct_info.dup_pages;
-}
-
-uint64_t norm_mig_bytes_transferred(void)
-{
-return acct_info.norm_pages * TARGET_PAGE_SIZE;
-}
-
-uint64_t norm_mig_pages_transferred(void)
-{
-return

Re: [Qemu-devel] [PATCH 3/7] block: vhdx header for the QEMU support of VHDX images

2013-03-07 Thread Stefan Hajnoczi

On Wed, Mar 06, 2013 at 09:47:41AM -0500, Jeff Cody wrote:
> +#define VHDX_ZERO_MGIC 0x6F72657A /* 'zero' */

s/MGIC/MAGIC/

> +typedef struct QEMU_PACKED vhdx_log_zero_descriptor {
> +uint32_tzero_signature; /* "zero" in ASCII */
> +uint32_treserver;

s/reserver/reserved/ ?

Re: [Qemu-devel] propose to implement ower device

2013-03-07 Thread Wenchao Xia


于 2013-3-7 11:40, li guang 写道:

Hi, Anthony and all

By now all devices of QEMU do not have much more
power management consideration, for example, if
system do suspend, it will call all registered notifiers,
this was loosely required, and the code to do power management
state transition seems just do 'ugly emulation', rather than be
conscious with whole system devices, same condition with reset(it
has been embedded in DeviceClass, good!),
shutdown, in real world, commonly all devices' power are controlled
by a power chip, then all power sequence can be done just
issue commands to this chip.
so, I come across an idea to implement qdev'ed power device, and
make all qdev struct of devices aware of self power management(add
reset, suspend, shutdown ... filed for DeviceClass), this will
bring tidy power management, and the emulation will more like what
happened in real world.
Hope I've expressed my idea clearly, if you have any questions, let me
know, if get your permission, I'll step ahead to do this power
chip emulation work.

Thanks!




Hi, Guang
  it seems the mail was sent as a child mail of "Default guest RAM
size". The topic is interesting, suggest resend it.

--
Best Regards

Wenchao Xia

Re: [Qemu-devel] [PATCH 3/7] block: vhdx header for the QEMU support of VHDX images

2013-03-07 Thread Jeff Cody

On Thu, Mar 07, 2013 at 02:15:42PM +0100, Stefan Hajnoczi wrote:
> On Wed, Mar 06, 2013 at 09:47:41AM -0500, Jeff Cody wrote:
> > +#define VHDX_ZERO_MGIC 0x6F72657A /* 'zero' */
> 
> s/MGIC/MAGIC/
> 
> > +typedef struct QEMU_PACKED vhdx_log_zero_descriptor {
> > +uint32_tzero_signature; /* "zero" in ASCII */
> > +uint32_treserver;
> 
> s/reserver/reserved/ ?

Yes, thanks (on both counts).

Re: [Qemu-devel] [PATCH v13 5/5] VMXNET3 device implementation

2013-03-07 Thread Dmitry Fleytman

Thanks, Andreas

Now I see.
We'll resubmit our patches soon.

Dmitry.


On Thu, Mar 7, 2013 at 2:59 PM, Andreas Färber  wrote:

> Am 06.03.2013 08:21, schrieb Dmitry Fleytman:
> > Signed-off-by: Dmitry Fleytman 
> > Signed-off-by: Yan Vugenfirer 
> > ---
> >  default-configs/pci.mak |1 +
> >  hw/Makefile.objs|1 +
> >  hw/pci/pci.h|1 +
> >  hw/vmxnet3.c| 2460
> +++
> >  hw/vmxnet3.h|  760 +++
> >  5 files changed, 3223 insertions(+)
> >  create mode 100644 hw/vmxnet3.c
> >  create mode 100644 hw/vmxnet3.h
> >
> > diff --git a/default-configs/pci.mak b/default-configs/pci.mak
> > index ee2d18d..ce56d58 100644
> > --- a/default-configs/pci.mak
> > +++ b/default-configs/pci.mak
> > @@ -13,6 +13,7 @@ CONFIG_LSI_SCSI_PCI=y
> >  CONFIG_MEGASAS_SCSI_PCI=y
> >  CONFIG_RTL8139_PCI=y
> >  CONFIG_E1000_PCI=y
> > +CONFIG_VMXNET3_PCI=y
> >  CONFIG_IDE_CORE=y
> >  CONFIG_IDE_QDEV=y
> >  CONFIG_IDE_PCI=y
> > diff --git a/hw/Makefile.objs b/hw/Makefile.objs
> > index 14922cb..026aff6 100644
> > --- a/hw/Makefile.objs
> > +++ b/hw/Makefile.objs
> > @@ -120,6 +120,7 @@ common-obj-$(CONFIG_PCNET_COMMON) += pcnet.o
> >  common-obj-$(CONFIG_E1000_PCI) += e1000.o
> >  common-obj-$(CONFIG_RTL8139_PCI) += rtl8139.o
> >  common-obj-$(CONFIG_VMXNET3_PCI) += vmxnet_tx_pkt.o vmxnet_rx_pkt.o
> > +common-obj-$(CONFIG_VMXNET3_PCI) += vmxnet3.o
> >
> >  common-obj-$(CONFIG_SMC91C111) += smc91c111.o
> >  common-obj-$(CONFIG_LAN9118) += lan9118.o
> > diff --git a/hw/pci/pci.h b/hw/pci/pci.h
> > index f340fe5..3beb70b 100644
> > --- a/hw/pci/pci.h
> > +++ b/hw/pci/pci.h
> > @@ -60,6 +60,7 @@
> >  #define PCI_DEVICE_ID_VMWARE_NET 0x0720
> >  #define PCI_DEVICE_ID_VMWARE_SCSI0x0730
> >  #define PCI_DEVICE_ID_VMWARE_IDE 0x1729
> > +#define PCI_DEVICE_ID_VMWARE_VMXNET3 0x07B0
> >
> >  /* Intel (0x8086) */
> >  #define PCI_DEVICE_ID_INTEL_82551IT  0x1209
> > diff --git a/hw/vmxnet3.c b/hw/vmxnet3.c
> > new file mode 100644
> > index 000..75b7181
> > --- /dev/null
> > +++ b/hw/vmxnet3.c
> > @@ -0,0 +1,2460 @@
> > +/*
> > + * QEMU VMWARE VMXNET3 paravirtual NIC
> > + *
> > + * Copyright (c) 2012 Ravello Systems LTD (http://ravellosystems.com)
> > + *
> > + * Developed by Daynix Computing LTD (http://www.daynix.com)
> > + *
> > + * Authors:
> > + * Dmitry Fleytman 
> > + * Tamir Shomer 
> > + * Yan Vugenfirer 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "hw.h"
> > +#include "pci/pci.h"
> > +#include "net/net.h"
> > +#include "virtio-net.h"
> > +#include "net/tap.h"
> > +#include "net/checksum.h"
> > +#include "sysemu/sysemu.h"
> > +#include "qemu-common.h"
> > +#include "qemu/bswap.h"
> > +#include "pci/msix.h"
> > +#include "pci/msi.h"
> > +
> > +#include "vmxnet3.h"
> > +#include "vmxnet_debug.h"
> > +#include "vmware_utils.h"
> > +#include "vmxnet_tx_pkt.h"
> > +#include "vmxnet_rx_pkt.h"
> > +
> > +#define PCI_DEVICE_ID_VMWARE_VMXNET3_REVISION 0x1
> > +#define VMXNET3_MSIX_BAR_SIZE 0x2000
> > +
> > +#define VMXNET3_BAR0_IDX  (0)
> > +#define VMXNET3_BAR1_IDX  (1)
> > +#define VMXNET3_MSIX_BAR_IDX  (2)
> > +
> > +#define VMXNET3_OFF_MSIX_TABLE (0x000)
> > +#define VMXNET3_OFF_MSIX_PBA   (0x800)
> > +
> > +/* Link speed in Mbps should be shifted by 16 */
> > +#define VMXNET3_LINK_SPEED  (1000 << 16)
> > +
> > +/* Link status: 1 - up, 0 - down. */
> > +#define VMXNET3_LINK_STATUS_UP  0x1
> > +
> > +/* Least significant bit should be set for revision and version */
> > +#define VMXNET3_DEVICE_VERSION0x1
> > +#define VMXNET3_DEVICE_REVISION   0x1
> > +
> > +/* Macros for rings descriptors access */
> > +#define VMXNET3_READ_TX_QUEUE_DESCR8(dpa, field) \
> > +(vmw_shmem_ld8(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field)))
> > +
> > +#define VMXNET3_WRITE_TX_QUEUE_DESCR8(dpa, field, value) \
> > +(vmw_shmem_st8(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field,
> value)))
> > +
> > +#define VMXNET3_READ_TX_QUEUE_DESCR32(dpa, field) \
> > +(vmw_shmem_ld32(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field)))
> > +
> > +#define VMXNET3_WRITE_TX_QUEUE_DESCR32(dpa, field, value) \
> > +(vmw_shmem_st32(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field),
> value))
> > +
> > +#define VMXNET3_READ_TX_QUEUE_DESCR64(dpa, field) \
> > +(vmw_shmem_ld64(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field)))
> > +
> > +#define VMXNET3_WRITE_TX_QUEUE_DESCR64(dpa, field, value) \
> > +(vmw_shmem_st64(dpa + offsetof(struct Vmxnet3_TxQueueDesc, field),
> value))
> > +
> > +#define VMXNET3_READ_RX_QUEUE_DESCR64(dpa, field) \
> > +(vmw_shmem_ld64(dpa + offsetof(struct Vmxnet3_RxQueueDesc, field)))
> > +
> > +#define VMXNET3_READ_RX_QUEUE_DESCR32(dpa, field) \
> > +(vmw_shmem_ld32(dpa + offsetof(struct Vmxnet3_RxQueueDesc, field)))
> > +
> > +#d

Re: [Qemu-devel] problems with freeBSD

2013-03-07 Thread Aurelien Jarno

On Thu, Mar 07, 2013 at 01:16:54PM +0100, Laszlo Ersek wrote:
> On 03/07/13 09:43, Aurelien Jarno wrote:
> 
> > I did a git bisect to find the commit fixing the issue. Then, as I was
> > not believing the result, I tried the following sequence a dozen of
> > times (for some unknown reasons the FreeBSD install CD doesn't exhibit
> > the issue, so I used the Debian GNU/kFreeBSD installer):
> > 
> > | mkdir qemu-freebsd-bug
> > | cd qemu-freebsd-bug
> > |
> > | wget 
> > http://ftp.debian.org/debian/dists/squeeze/main/installer-kfreebsd-amd64/current/images/netboot/mini.iso
> >  
> > |
> > | git clone git://git.qemu.org/qemu.git
> > | cd qemu
> > | git checkout -b stable-1.4 v1.4.0
> > | ./configure --target-list=x86_64-softmmu
> > | make
> > | cd ..
> > |
> > | git clone git://git.seabios.org/seabios.git
> > | cd seabios
> > | git checkout -b 1.7.2-stable origin/1.7.2-stable
> > | git reset --hard 4219149ad2b783abfa61e80e9e9f6910db0c76c9
> > | make
> > | cp out/bios.bin ../qemu/pc-bios
> > | cd..
> > |
> > | # debian-installer boots correctly 
> > | ./qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cdrom mini.iso
> > |
> > | cd seabios
> > | git reset --hard d75c22fcb6521dad11428b65789d92f89675c600 
> > | git clean -fdx
> > | make
> > | cp out/bios.bin ../qemu/pc-bios
> > | cd ..
> > |
> > | # debian-installer fails to boot
> > | ./qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cdrom mini.iso 
> > 
> > 
> > Maybe I am doing something wrong or there is a bug in my toolchain
> > (Debian Sid). It would be nice if someone could try to reproduce that on
> > another distro/system.
> > 
> 
> Can you save the out/ directory from both builds and "diff -ur" them
> (maybe just the *.lds files)?

Please find it attached.

> I'm noticing that pathnames are embedded in some ELF section names (I
> hope this sentence makes sense), and when you build at d75c22fc, those
> pathnames contain dot-dot (".."), ie. two section name separators next
> to each other. Maybe that's not good; no idea.
> 

It's clearly the case, but I also don't know if it is a good solution or
not. What is sure is that it also changes the address of the text
section in bios.bin.elf.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net


seabios.qemu-freebsd-bug.diff.xz
Description: Binary data

Re: [Qemu-devel] [PATCH 01/10] qdev: add qdev property for bool type

2013-03-07 Thread Andreas Färber

Am 25.02.2013 02:03, schrieb Igor Mammedov:
> Signed-off-by: Igor Mammedov 

I vaguely remember having written something like this long time ago for
ISA and for Vasilis... looks good except for minor nits.

> ---
>  hw/qdev-properties.c |   33 +
>  hw/qdev-properties.h |   10 ++
>  2 files changed, 43 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
> index a8a31f5..16ac814 100644
> --- a/hw/qdev-properties.c
> +++ b/hw/qdev-properties.c
> @@ -106,6 +106,39 @@ PropertyInfo qdev_prop_bit = {
>  .set   = set_bit,
>  };
>  
> +/* --- bool --- */
> +
> +static void get_bool(Object *obj, Visitor *v, void *opaque,
> + const char *name, Error **errp)
> +{
> +DeviceState *dev = DEVICE(obj);
> +Property *prop = opaque;
> +bool *ptr = qdev_get_prop_ptr(dev, prop);
> +
> +visit_type_bool(v, ptr, name, errp);
> +}
> +
> +static void set_bool(Object *obj, Visitor *v, void *opaque,
> +   const char *name, Error **errp)

Indentation is off.

> +{
> +DeviceState *dev = DEVICE(obj);
> +Property *prop = opaque;
> +bool *ptr = qdev_get_prop_ptr(dev, prop);
> +
> +if (dev->realized) {
> +error_setg(errp, "Insufficient permission to perform this 
> operation");

Should we be more specific here?
E.g., "Property '%s' cannot be changed once realized."

Andreas

> +return;
> +}
> +
> +visit_type_bool(v, ptr, name, errp);
> +}
> +
> +PropertyInfo qdev_prop_bool = {
> +.name  = "boolean",
> +.get   = get_bool,
> +.set   = set_bool,
> +};
> +
>  /* --- 8bit integer --- */
>  
>  static void get_uint8(Object *obj, Visitor *v, void *opaque,
> diff --git a/hw/qdev-properties.h b/hw/qdev-properties.h
> index 20c67f3..3915f7c 100644
> --- a/hw/qdev-properties.h
> +++ b/hw/qdev-properties.h
> @@ -6,6 +6,7 @@
>  /*** qdev-properties.c ***/
>  
>  extern PropertyInfo qdev_prop_bit;
> +extern PropertyInfo qdev_prop_bool;
>  extern PropertyInfo qdev_prop_uint8;
>  extern PropertyInfo qdev_prop_uint16;
>  extern PropertyInfo qdev_prop_uint32;
> @@ -51,6 +52,15 @@ extern PropertyInfo qdev_prop_pci_host_devaddr;
>  .defval= (bool)_defval,  \
>  }
>  
> +#define DEFINE_PROP_BOOL(_name, _state, _field, _defval) {   \
> +.name  = (_name),\
> +.info  = &(qdev_prop_bool),  \
> +.offset= offsetof(_state, _field)\
> ++ type_check(bool, typeof_field(_state, _field)),\
> +.qtype = QTYPE_QBOOL,\
> +.defval= (bool)_defval,  \
> +}
> +
>  #define DEFINE_PROP_UINT8(_n, _s, _f, _d)   \
>  DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_uint8, uint8_t)
>  #define DEFINE_PROP_UINT16(_n, _s, _f, _d)  \
> 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] problems with freeBSD

2013-03-07 Thread Kevin O'Connor

On Thu, Mar 07, 2013 at 09:43:04AM +0100, Aurelien Jarno wrote:
> On Wed, Mar 06, 2013 at 07:53:51PM -0500, Kevin O'Connor wrote:
> > That change is definitely just build related - I don't see how it
> > could impact the final SeaBIOS binary.  How did you conclude that this
> > commit is what fixes the issue?
> > 
> 
> I did a git bisect to find the commit fixing the issue. Then, as I was
> not believing the result, I tried the following sequence a dozen of
> times (for some unknown reasons the FreeBSD install CD doesn't exhibit
> the issue, so I used the Debian GNU/kFreeBSD installer):

Thanks.  I'll take a look at this tonight.  If you get a chance first,
could you try adding "make distclean; make clean" into the seabios
steps?

-Kevin

Re: [Qemu-devel] [PATCH] qdev: DEVICE_DELETED event

2013-03-07 Thread Markus Armbruster

Andreas Färber  writes:

> Am 07.03.2013 11:07, schrieb Michael S. Tsirkin:
>> On Thu, Mar 07, 2013 at 10:55:23AM +0100, Markus Armbruster wrote:
>>> "Michael S. Tsirkin"  writes:
>>>
 On Wed, Mar 06, 2013 at 02:57:22PM +0100, Andreas Färber wrote:
> Am 06.03.2013 14:00, schrieb Michael S. Tsirkin:
>> libvirt has a long-standing bug: when removing the device,
>> it can request removal but does not know when does the
>> removal complete. Add an event so we can fix this in a robust way.
>>
>> Signed-off-by: Michael S. Tsirkin 
>
> Sounds like a good idea to me. :)
>
> [...]
>> diff --git a/hw/qdev.c b/hw/qdev.c
>> index 689cd54..f30d251 100644
>> --- a/hw/qdev.c
>> +++ b/hw/qdev.c
>> @@ -29,6 +29,7 @@
>>  #include "sysemu/sysemu.h"
>>  #include "qapi/error.h"
>>  #include "qapi/visitor.h"
>> +#include "qapi/qmp/qjson.h"
>>  
>>  int qdev_hotplug = 0;
>>  static bool qdev_hot_added = false;
>> @@ -267,6 +268,11 @@ void qdev_init_nofail(DeviceState *dev)
>>  /* Unlink device from bus and free the structure.  */
>>  void qdev_free(DeviceState *dev)
>>  {
>> +if (dev->id) {
>> +QObject *data = qobject_from_jsonf("{ 'device': %s }", dev->id);
>> +monitor_protocol_event(QEVENT_DEVICE_DELETED, data);
>> +qobject_decref(data);
>> +}
>>  object_unparent(OBJECT(dev));
>>  }
>>  
>
> I'm pretty sure this is the wrong place to fire the notification. We
> should rather do this when the device is actually deleted - which
> qdev_free() does *not* actually guarantee, as criticized in the s390x
> and unref'ing contexts.
> I would suggest to place your code into device_unparent() instead.
>
> Another thing to consider is what data to pass to the event: Not all
> devices have an ID.

 If they don't they were not created by management so management is
 probably not interested in them being removed.

 We could always add a 'path' key later if this assumption
 proves incorrect.
>>>
>>> In old qdev, ID was all we had, because paths were busted.  Thus,
>>> management had no choice but use IDs.
>>>
>>> If I understand modern qdev correctly, we got a canonical path.  Old
>>> APIs like device_del still accept only ID.  Should new APIs still be
>>> designed that way?  Or should they always accept / provide the canonical
>>> path, plus optional ID for convenience?
>> 
>> What are advantages of exposing the path to users in this way?

The path is the device's canonical name.  Canonical means path:device is
1:1.  Path always works.  Qdev ID only works when the user assigned one.

Funny case: board creates a hot-pluggable device by default (thus no
qdev ID), guest ejects it, what do you put into the event?  Your code
simply doesn't emit one.

You could blame the user; after all he could've used -nodefaults, and
added the device himself, with an ID.

I blame your design instead, which needlessly complicates the event's
semantics: it gets emitted only for devices with a qdev ID.  Which you
neglected to document clearly, by the way.

If you put the path into the event, you can emit it always, which is
simpler.  Feel free to throw in the qdev ID.

>> Looks like maintainance hassle without real benefits?

I can't see path being a greater maintenance hassle than ID.

> Anthony had rejected earlier QOM patches by Paolo related to qdev id,
> saying it was deprecated in favor of those QOM paths.

More reason to put the path into the event, not just the qdev ID.

> We should still have a canonical path when we fire
> this event in either qdev_free() or in device_unparent() before the if
> (dev->parent_bus) block though. That would be a question for Anthony,
> not having a use case for the event I am indifferent there.
>
> Further, thinking of objects such as virtio-rng backends or future
> blockdev/chardev objects, might it make sense to turn this into a
> generic object deletion event rather than a device event?
>
> Andreas

 Backend deletion doesn't normally have guest interaction right?
 So why do we need an event?
>>>
>>> We need an event because device_del may send its reply before it
>>> completes the job.
>>>
>>> device_del does that when it deletion needs to interact with the guest,
>>> which can take unbounded time.
>>>
>>> Conversely, we don't need an event when a QMP always completes the job
>>> (as far as observable by the QMP client) before it sends its reply.  Off
>>> hand, I can't see why backend deletion would do anything else.
>>>
>>> I'm always reluctant to abstract when there are fewer than two
>>> different, concrete things to abstract from.  Right now, we got just
>>> one: device models.
>
> Not quite: It's about unparenting hook and object deletion, which are
> both not limited to devices.

Yes, the implementation of the event happens to be sit

Re: [Qemu-devel] [PATCH 2/2] input: introduce keyboard handler list

2013-03-07 Thread Markus Armbruster

Gerd Hoffmann  writes:

> Add a linked list of keyboard handlers.  Added handlers will go
> to the head of the list.  Removed handlers will be zapped from
> the list.  The head of the list will be used for events.
>
> This fixes the keyboard-dead-after-usb-kbd-unplug issue, key events
> will be re-routed to the ps/2 kbd instead of being discarded.

Patterned after the mouse code, which already has such a list.

For mice, we have a monitor command mouse_set to select the active one.
Useful for keyboards, too?  I dimly recall somebody trying something
like that in the past.

> Signed-off-by: Gerd Hoffmann 
> ---
>  hw/hid.c |4 ++--
>  hw/hid.h |1 +
>  include/ui/console.h |6 --
>  ui/input.c   |   37 +
>  4 files changed, 32 insertions(+), 16 deletions(-)
>
> diff --git a/hw/hid.c b/hw/hid.c
> index 89b5415..6be00ec 100644
> --- a/hw/hid.c
> +++ b/hw/hid.c
> @@ -415,7 +415,7 @@ void hid_free(HIDState *hs)
>  {
>  switch (hs->kind) {
>  case HID_KEYBOARD:
> -qemu_remove_kbd_event_handler();
> +qemu_remove_kbd_event_handler(hs->kbd.eh_entry);
>  break;
>  case HID_MOUSE:
>  case HID_TABLET:
> @@ -431,7 +431,7 @@ void hid_init(HIDState *hs, int kind, HIDEventFunc event)
>  hs->event = event;
>  
>  if (hs->kind == HID_KEYBOARD) {
> -qemu_add_kbd_event_handler(hid_keyboard_event, hs);
> +hs->kbd.eh_entry = qemu_add_kbd_event_handler(hid_keyboard_event, 
> hs);
>  } else if (hs->kind == HID_MOUSE) {
>  hs->ptr.eh_entry = qemu_add_mouse_event_handler(hid_pointer_event, 
> hs,
>  0, "QEMU HID Mouse");
> diff --git a/hw/hid.h b/hw/hid.h
> index 56c71ed..2567879 100644
> --- a/hw/hid.h
> +++ b/hw/hid.h
> @@ -31,6 +31,7 @@ typedef struct HIDKeyboardState {
>  uint8_t leds;
>  uint8_t key[16];
>  int32_t keys;
> +QEMUPutKbdEntry *eh_entry;
>  } HIDKeyboardState;
>  
>  struct HIDState {
> diff --git a/include/ui/console.h b/include/ui/console.h
> index ce5a550..fd39d94 100644
> --- a/include/ui/console.h
> +++ b/include/ui/console.h
> @@ -28,10 +28,12 @@ typedef void QEMUPutLEDEvent(void *opaque, int ledstate);
>  typedef void QEMUPutMouseEvent(void *opaque, int dx, int dy, int dz, int 
> buttons_state);
>  
>  typedef struct QEMUPutMouseEntry QEMUPutMouseEntry;
> +typedef struct QEMUPutKbdEntry QEMUPutKbdEntry;
>  typedef struct QEMUPutLEDEntry QEMUPutLEDEntry;
>  
> -void qemu_add_kbd_event_handler(QEMUPutKBDEvent *func, void *opaque);
> -void qemu_remove_kbd_event_handler(void);
> +QEMUPutKbdEntry *qemu_add_kbd_event_handler(QEMUPutKBDEvent *func,
> +void *opaque);
> +void qemu_remove_kbd_event_handler(QEMUPutKbdEntry *entry);
>  QEMUPutMouseEntry *qemu_add_mouse_event_handler(QEMUPutMouseEvent *func,
>  void *opaque, int absolute,
>  const char *name);
> diff --git a/ui/input.c b/ui/input.c
> index 87a23df..59d0578 100644
> --- a/ui/input.c
> +++ b/ui/input.c
> @@ -41,18 +41,25 @@ struct QEMUPutMouseEntry {
>  QTAILQ_ENTRY(QEMUPutMouseEntry) node;
>  };
>  
> +struct QEMUPutKbdEntry {
> +QEMUPutLEDEvent *put_kbd;

Sure it's not QEMUPutKbdEvent?

> +void *opaque;
> +QTAILQ_ENTRY(QEMUPutKbdEntry) next;
> +};
> +
>  struct QEMUPutLEDEntry {
>  QEMUPutLEDEvent *put_led;
>  void *opaque;
>  QTAILQ_ENTRY(QEMUPutLEDEntry) next;
>  };
>  
> -static QEMUPutKBDEvent *qemu_put_kbd_event;
> -static void *qemu_put_kbd_event_opaque;
> -static QTAILQ_HEAD(, QEMUPutLEDEntry) led_handlers = 
> QTAILQ_HEAD_INITIALIZER(led_handlers);
> +static QTAILQ_HEAD(, QEMUPutLEDEntry) led_handlers =
> +QTAILQ_HEAD_INITIALIZER(led_handlers);

Unrelated style cleanup.  Tolerable.

> +static QTAILQ_HEAD(, QEMUPutKbdEntry) kbd_handlers =
> +QTAILQ_HEAD_INITIALIZER(kbd_handlers);
>  static QTAILQ_HEAD(, QEMUPutMouseEntry) mouse_handlers =
>  QTAILQ_HEAD_INITIALIZER(mouse_handlers);
> -static NotifierList mouse_mode_notifiers = 
> +static NotifierList mouse_mode_notifiers =
>  NOTIFIER_LIST_INITIALIZER(mouse_mode_notifiers);

Unrelated whitespace cleanup.  Tolerable.

>  
>  static const int key_defs[] = {
> @@ -306,16 +313,20 @@ void qmp_send_key(KeyValueList *keys, bool 
> has_hold_time, int64_t hold_time,
> muldiv64(get_ticks_per_sec(), hold_time, 1000));
>  }
>  
> -void qemu_add_kbd_event_handler(QEMUPutKBDEvent *func, void *opaque)
> +QEMUPutKbdEntry *qemu_add_kbd_event_handler(QEMUPutKBDEvent *func, void 
> *opaque)
>  {
> -qemu_put_kbd_event_opaque = opaque;
> -qemu_put_kbd_event = func;
> +QEMUPutKbdEntry *entry;
> +
> +entry = g_malloc0(sizeof(QEMUPutKbdEntry));
> +entry->put_kbd = func;
> +entry->opaque = opaque;
> +QTAILQ_INSERT_HEAD(&kbd_handlers, entry, next);
> +return entry;

Re: [Qemu-devel] [PATCH 1/2] input: make QEMUPutLEDEntry + QEMUPutMouseEntry private

2013-03-07 Thread Markus Armbruster

Gerd Hoffmann  writes:

> There is no need for anybody outside ui/input.c to access the
> struct elements.  Move the definitions, leaving only the typedefs
> in the header files.

No-brainer (assuming it compiles).

Reviewed-by: Markus Armbruster

Re: [Qemu-devel] [PATCH 4/7] block: initial VHDX driver support framework - supports open and probe

2013-03-07 Thread Stefan Hajnoczi

On Wed, Mar 06, 2013 at 09:48:11AM -0500, Jeff Cody wrote:
> +#define leguid_to_cpus(guid) do { \
> +le32_to_cpus(&(guid)->data1); \
> +le16_to_cpus(&(guid)->data2); \
> +le16_to_cpus(&(guid)->data3); } while (0)

This should be a function.  Please avoid macros.

> +static const ms_guid logical_sector_guid = {.data1 = 0x8141bf1d,
> +.data2 = 0xa96f,
> +.data3 = 0x4709,
> +   .data4 = { 0xba, 0x47, 0xf2, 0x33,

Indentation

> +  0xa8, 0xfa, 0xab, 
> 0x5f} };
> +
> +static const ms_guid phys_sector_guid =  { .data1 = 0xcda348c7,
> +   .data2 = 0x445d,
> +   .data3 = 0x4471,
> +   .data4 = { 0x9c, 0xc9, 0xe9, 0x88,
> +  0x52, 0x51, 0xc5, 
> 0x56} };
> +
> +static const ms_guid parent_locator_guid = {.data1 = 0xa8d35f2d,
> +.data2 = 0xb30b,
> +.data3 = 0x454d,
> +   .data4 = { 0xab, 0xf7, 0xd3, 0xd8,

Indentation

> +typedef struct BDRVVHDXState {
> +CoMutex lock;
> +
> +int curr_header;
> +vhdx_header *headers[2];
> +
> +vhdx_region_table_header rt;
> +vhdx_region_table_entry bat_rt; /* region table for the BAT */
> +vhdx_region_table_entry metadata_rt;/* region table for the metadata 
> */
> +vhdx_region_table_entry *unknown_rt;
> +unsigned int unknown_rt_size;

This field isn't used yet but "unsigned int" for something that has
"size" in its name is suspicious.  Should it be size_t (memory) or
uint64_t (disk)?

> +static int vhdx_probe(const uint8_t *buf, int buf_size, const char *filename)
> +{
> +if (buf_size >= 8 && !strncmp((char *)buf, "vhdxfile", 8)) {

memcmp() saves you the trouble of casting buf (which should stay const,
BTW).

> +static int vhdx_parse_metadata(BlockDriverState *bs, BDRVVHDXState *s)
> +{
> +int ret = 0;
> +uint8_t *buffer;
> +int offset = 0;
> +int i = 0;
> +uint32_t block_size, sectors_per_block, logical_sector_size;
> +uint64_t chunk_ratio;
> +vhdx_metadata_table_entry md_entry;
> +
> +buffer = g_malloc(VHDX_METADATA_TABLE_MAX_SIZE);

Please use qemu_blockalign() to avoid bounce buffers in case the memory
is not 512-byte aligned.  Use qemu_vfree() instead of g_free().

This applies to all I/O buffers that the block driver allocates.

> +
> +ret = bdrv_pread(bs->file, s->metadata_rt.file_offset, buffer,
> + VHDX_METADATA_TABLE_MAX_SIZE);
> +if (ret < 0) {
> +goto fail_no_free;
> +}
> +memcpy(&s->metadata_hdr, buffer, sizeof(s->metadata_hdr));
> +offset += sizeof(s->metadata_hdr);
> +
> +le64_to_cpus(&s->metadata_hdr.signature);
> +le16_to_cpus(&s->metadata_hdr.reserved);
> +le16_to_cpus(&s->metadata_hdr.entry_count);
> +
> +if (s->metadata_hdr.signature != VHDX_METADATA_MAGIC) {
> +ret = -EINVAL;
> +goto fail_no_free;
> +}
> +
> +s->metadata_entries.present = 0;
> +
> +for (i = 0; i < s->metadata_hdr.entry_count; i++) {
> +memcpy(&md_entry, buffer+offset, sizeof(md_entry));

Read beyond end of buffer if metadata_hdr.entry_count is wrong.

We cannot trust the image file - it can be corrupt or malicious.  Checks
are required for everything that can be validated.  QEMU cannot do
anything that would cause a crash or memory corruption on invalid input.

> +ret = bdrv_pread(bs->file,
> + s->metadata_entries.file_parameters_entry.offset
> + + s->metadata_rt.file_offset,
> + &s->params,
> + sizeof(s->params));

Missing error code path when ret < 0.

> +/* determine virtual disk size, logical sector size,
> + * and phys sector size */
> +
> +ret = bdrv_pread(bs->file,
> + s->metadata_entries.virtual_disk_size_entry.offset
> +   + s->metadata_rt.file_offset,
> + &s->virtual_disk_size,
> + sizeof(uint64_t));
> +if (ret < 0) {
> +goto exit;
> +}
> +ret = bdrv_pread(bs->file,
> + s->metadata_entries.logical_sector_size_entry.offset
> + + s->metadata_rt.file_offset,
> + &s->logical_sector_size,
> + sizeof(uint32_t));
> +if (ret < 0) {
> +goto exit;
> +}
> +ret = bdrv_pread(bs->file,
> + s->metadata_entries.phys_sector_size_entry.offset
> +  + s->metadata_rt.file_offset,
> + &s->physical_sector_size,
> +

[Qemu-devel] Windows doesn't like MSI/MSI-X

2013-03-07 Thread Hannes Reinecke


Hi all,

recently I've tried to teach megasas MSI/MSI-X. While it works 
perfectly under Linux, Windows refuses to.


With really strange symptoms:
Windows Vista will BSOD when both MSI/MSI-X registers are present, 
and Windows 7 will hang as Windows (apparently) thinks MSI/MSI-X is 
enabled, whereas qemu doesn't and uses INTx.

So the Windows 7 guest will never see any interrupts.

The _really_ odd thing is that when I remove the MSI-X capability 
Windows will fall back to INTx and everything works.


Even more curious is that from the logs Windows will only ever write 
zeros into the MSI/MSI-X config registers.

Which makes me wonder what's going on there.

As I'm not sure if that's my fault I was wondering if anybody every 
succeeded in getting AHCI to use MSI under Windows.

Any pointers?

Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

1 2 >

1 - 100 of 197 matches

Mail list logo