[Qemu-devel] [PATCH] qemu-img: use the heap instead of the huge stack array for win32

2010-02-08 Thread TeLeMan
The default stack size of PE is 1MB on win32 and IO_BUF_SIZE in
img_convert() & img_rebase() is 2MB, so qemu-img will crash when doing
"convert" & "rebase" on win32.
Although we can improve the stack size of PE to resolve it, I think we
should avoid using the huge stack variables.

Signed-off-by: TeLeMan 
---
 qemu-img.c |   14 +++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index bbfeea1..9994b3d 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -571,7 +571,7 @@ static int img_convert(int argc, char **argv)
 BlockDriverState **bs, *out_bs;
 int64_t total_sectors, nb_sectors, sector_num, bs_offset;
 uint64_t bs_sectors;
-uint8_t buf[IO_BUF_SIZE];
+uint8_t * buf;
 const uint8_t *buf1;
 BlockDriverInfo bdi;
 QEMUOptionParameter *param = NULL;
@@ -690,6 +690,7 @@ static int img_convert(int argc, char **argv)
 bs_i = 0;
 bs_offset = 0;
 bdrv_get_geometry(bs[0], &bs_sectors);
+buf = qemu_malloc(IO_BUF_SIZE);

 if (flags & BLOCK_FLAG_COMPRESS) {
 if (bdrv_get_info(out_bs, &bdi) < 0)
@@ -822,6 +823,7 @@ static int img_convert(int argc, char **argv)
 }
 }
 }
+qemu_free(buf);
 bdrv_delete(out_bs);
 for (bs_i = 0; bs_i < bs_n; bs_i++)
 bdrv_delete(bs[bs_i]);
@@ -1178,8 +1180,11 @@ static int img_rebase(int argc, char **argv)
 uint64_t num_sectors;
 uint64_t sector;
 int n, n1;
-uint8_t buf_old[IO_BUF_SIZE];
-uint8_t buf_new[IO_BUF_SIZE];
+uint8_t * buf_old;
+uint8_t * buf_new;
+
+buf_old = qemu_malloc(IO_BUF_SIZE);
+buf_new = qemu_malloc(IO_BUF_SIZE);

 bdrv_get_geometry(bs, &num_sectors);

@@ -1226,6 +1231,9 @@ static int img_rebase(int argc, char **argv)
 written += pnum;
 }
 }
+
+qemu_free(buf_old);
+qemu_free(buf_new);
 }

 /*
-- 
1.6.5.1.1367.gcd48


--
SUN OF A BEACH




Re: [Qemu-devel] rtl8139 timer interrupt rewrote

2010-02-08 Thread Markus Armbruster
You can improve your patch's chances for getting noticed, reviewed and
merged by putting [PATCH] in your subject.  Consider using
git-format-patch and git-send-email.




[Qemu-devel] Re: [PATCH] fix the static compilation for sdl

2010-02-08 Thread Paolo Bonzini

On 02/08/2010 06:56 AM, TeLeMan wrote:

The static compilation for sdl is broken after
79427693174a553d62f3da44aacd3f19ba8df3a7.


ACK.


Signed-off-by: TeLeMan
---
  configure |7 +--
  1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 4c95c27..213dddf 100644
--- a/configure
+++ b/configure
@@ -1063,7 +1063,11 @@ if test "$sdl" != "no" ; then
  int main( void ) { return SDL_Init (SDL_INIT_VIDEO); }
  EOF
sdl_cflags=`$sdlconfig --cflags 2>  /dev/null`
-  sdl_libs=`$sdlconfig --libs 2>  /dev/null`
+  if test "$static" = "yes" ; then
+sdl_libs=`sdl-config --static-libs 2>/dev/null`
+  else
+sdl_libs=`$sdlconfig --libs 2>  /dev/null`
+  fi
if compile_prog "$sdl_cflags" "$sdl_libs" ; then
  if test "$_sdlversion" -lt 121 ; then
sdl_too_old=yes
@@ -1075,7 +1079,6 @@ EOF

  # static link with sdl ? (note: sdl.pc's --static --libs is broken)
  if test "$sdl" = "yes" -a "$static" = "yes" ; then
-  sdl_libs=`sdl-config --static-libs 2>/dev/null`
if test $? = 0&&  echo $sdl_libs | grep -- -laa>  /dev/null; then
   sdl_libs="$sdl_libs `aalib-config --static-libs>2 /dev/null`"
   sdl_cflags="$sdl_cflags `aalib-config --cflags>2 /dev/null`"






[Qemu-devel] How fun it must be to commit patches without posting them!

2010-02-08 Thread Paolo Bonzini

malc,

commit bc5b600 does the same thing as the series I posted at 
http://permalink.gmane.org/gmane.comp.emulators.qemu/62997, only worse:


1) for vl.c and qemu-img.c, it disable FORTIFY_SOURCE checking which is 
only enabled by the printf macro:


#  define printf(...) \
  __builtin___printf_chk (__USE_FORTIFY_LEVEL - 1, __VA_ARGS__)

2) for readline.c, it includes a useless complication, namely

#ifdef printf
#undef printf
#endif

3) for vl.c, it doesn't take the occasion to cleanup qemu-options.hx's 
useless usage of % sequences that are filled in by vl.c.


Please revert it and commit my series instead, or explain why you didn't 
care about reading the entire thread or about explicitly NACKing my patches.


Paolo




[Qemu-devel] Re: [PATCH v2 RESEND 0/5] clang fixes

2010-02-08 Thread Paolo Bonzini

On 02/05/2010 07:26 PM, Blue Swirl wrote:

Thanks, applied.


Thanks Blue Swirl, can you please cherry-pick commit 0dfbd514 (fix 
undefined shifts by >32) into stable-0.12?


Paolo




Re: [Qemu-devel] [Patch] Support translating Guest physical address to Host virtual address.

2010-02-08 Thread Avi Kivity

On 02/08/2010 12:09 AM, Anthony Liguori wrote:

On 02/07/2010 10:31 AM, Avi Kivity wrote:
Only insofar as you don't have to deal with getting at the VM fd.  
You can avoid the problem by having the kvm ioctl interface take a 
pid or something.



That's a racy interface.


The mechanism itself is racy.  That said, pid's don't recycle very 
quickly so the chances of running into a practical issue is quite small.


While a low probability of a race is acceptable for a test tool, it 
isn't for a kernel interface.



Well, we need to provide a reasonable alternative.


I think this is the sort of thing that really needs to be a utility 
that lives outside of qemu.  I'm absolutely in favor of exposing 
enough internals to let people do interesting things provided it's 
reasonably correct.


I agree that's desirable.  However in light of the changable 
gpa->hva->hpa mappings, this may not be feasible.




One might be to use -mempath (which is hacky by itself, but so far we 
have no alternative) and use an external tool on the memory object to 
poison it.  An advantage is that you can use it independently of kvm.


It would help if the actual requirements were spelled out a bit more.  
What exactly needs validating?  Do we need to validate that a 
poisoning a host physical address results in a very particular guest 
page getting poisoned?


Is it not enough to just choose a random anonymous memory area within 
the qemu process, generate an MCE to that location, see whether qemu 
SIGBUS's.  If it doesn't, validate that an MCE has been received in 
the guest?


/proc/pid/pagemap may help, though that's racy too.  If you pick the 
largest vma (or use -mempath) you're pretty much guaranteed to hit on 
the guest memory area.




But FWIW, I think a set of per-VM directories in sysfs could be very 
useful for this sort of debugging.


Maybe we should consider having the equivalent of a QMP-for-debugging 
session.  This would be a special QMP session that we basically 
provided no compatibility or even sanity guarantees that was 
specifically there for debugging.  I would expect that it be disabled 
in any production build (even perhaps even by default in the general 
build).




We have 'info cpus' that shows the vcpu->thread mappings, allowing 
management to pin cpus.  Why not have 'info memory' that shows guest 
numa nodes and host virtual addresses?  The migrate_pages() syscall 
takes a pid so it can be used by qemu's controller to load-balance a 
numa machine, and this can also be used by the poisoner to do its work.


--
error compiling committee.c: too many arguments to function





Re: [Qemu-devel] [PATCH] fix the static compilation for sdl

2010-02-08 Thread Loïc Minier
On Mon, Feb 08, 2010, TeLeMan wrote:
> The static compilation for sdl is broken after
> 79427693174a553d62f3da44aacd3f19ba8df3a7.
> 
> Signed-off-by: TeLeMan 
> ---
>  configure |7 +--
>  1 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/configure b/configure
> index 4c95c27..213dddf 100644
> --- a/configure
> +++ b/configure
> @@ -1063,7 +1063,11 @@ if test "$sdl" != "no" ; then
>  int main( void ) { return SDL_Init (SDL_INIT_VIDEO); }
>  EOF
>sdl_cflags=`$sdlconfig --cflags 2> /dev/null`
> -  sdl_libs=`$sdlconfig --libs 2> /dev/null`
> +  if test "$static" = "yes" ; then
> +sdl_libs=`sdl-config --static-libs 2>/dev/null`
> +  else
> +sdl_libs=`$sdlconfig --libs 2> /dev/null`
> +  fi
>if compile_prog "$sdl_cflags" "$sdl_libs" ; then
>  if test "$_sdlversion" -lt 121 ; then
>sdl_too_old=yes
> @@ -1075,7 +1079,6 @@ EOF
> 
>  # static link with sdl ? (note: sdl.pc's --static --libs is broken)
>  if test "$sdl" = "yes" -a "$static" = "yes" ; then
> -  sdl_libs=`sdl-config --static-libs 2>/dev/null`
>if test $? = 0 && echo $sdl_libs | grep -- -laa > /dev/null; then
>   sdl_libs="$sdl_libs `aalib-config --static-libs >2 /dev/null`"
>   sdl_cflags="$sdl_cflags `aalib-config --cflags >2 /dev/null`"

Signed-off-by: Loïc Minier 

 Ah right, the tests were being run on a shared libsdl no matter actual
 linking would happen against the static or shared libsdl; thanks for
 fixing the ordering.

-- 
Loïc Minier




[Qemu-devel] Re: How fun it must be to commit patches without posting them!

2010-02-08 Thread malc
On Mon, 8 Feb 2010, Paolo Bonzini wrote:

> malc,
> 
> commit bc5b600 does the same thing as the series I posted at
> http://permalink.gmane.org/gmane.comp.emulators.qemu/62997, only worse:
> 
> 1) for vl.c and qemu-img.c, it disable FORTIFY_SOURCE checking which is only
> enabled by the printf macro:
> 
> #  define printf(...) \
>   __builtin___printf_chk (__USE_FORTIFY_LEVEL - 1, __VA_ARGS__)
> 
> 2) for readline.c, it includes a useless complication, namely
> 
> #ifdef printf
> #undef printf
> #endif
> 
> 3) for vl.c, it doesn't take the occasion to cleanup qemu-options.hx's useless
> usage of % sequences that are filled in by vl.c.
> 
> Please revert it and commit my series instead, or explain why you didn't care
> about reading the entire thread or about explicitly NACKing my patches.

Because i had a hard disk crash and hence ended up with a new shiny
system where QEMU does not build anymore, also not enough patience
to revisit the thread and figure out why it wasn't applied yet.

-- 
mailto:av1...@comtv.ru




[Qemu-devel] Re: How fun it must be to commit patches without posting them!

2010-02-08 Thread Paolo Bonzini

On 02/08/2010 10:09 AM, malc wrote:

Because i had a hard disk crash and hence ended up with a new shiny
system where QEMU does not build anymore, also not enough patience
to revisit the thread and figure out why it wasn't applied yet.


Sorry for you, but that's what local branches are for.

Paolo




[Qemu-devel] [PATCH] rtl8139 timer interrupt rewrote

2010-02-08 Thread Frediano Ziglio
rewrote timer implementation for rtl8139. Add a QEMU timer only when needed
(timeout status not set, timeout irq wanted and timer set).

Signed-off-by: Frediano Ziglio 
---
 hw/rtl8139.c |  136 ++---
 1 files changed, 81 insertions(+), 55 deletions(-)

diff --git a/hw/rtl8139.c b/hw/rtl8139.c
index f04dd54..3b332a6 100644
--- a/hw/rtl8139.c
+++ b/hw/rtl8139.c
@@ -41,6 +41,10 @@
  *  segmentation offloading
  *  Removed slirp.h dependency
  *  Added rx/tx buffer reset when
enabling rx/tx operation
+ *
+ *  2009-Feb-04  Frediano Ziglio:   Rewrote timer support using QEMU timer only
+ *  when strictly needed (required for for
+ *  Darwin)
  */

 #include "hw.h"
@@ -60,9 +64,6 @@
 /* Calculate CRCs properly on Rx packets */
 #define RTL8139_CALCULATE_RXCRC 1

-/* Uncomment to enable on-board timer interrupts */
-//#define RTL8139_ONBOARD_TIMER 1
-
 #if defined(RTL8139_CALCULATE_RXCRC)
 /* For crc32 */
 #include 
@@ -491,9 +492,12 @@ typedef struct RTL8139State {

 /* PCI interrupt timer */
 QEMUTimer *timer;
+int64_t TimerExpire;

 } RTL8139State;

+static void rtl8139_set_next_tctr_time(RTL8139State *s, int64_t current_time);
+
 static void prom9346_decode_command(EEprom9346 *eeprom, uint8_t command)
 {
 DEBUG_PRINT(("RTL8139: eeprom command 0x%02x\n", command));
@@ -2522,7 +2526,9 @@ static void rtl8139_IntrMask_write(RTL8139State
*s, uint32_t val)

 s->IntrMask = val;

+rtl8139_set_next_tctr_time(s, qemu_get_clock(vm_clock));
 rtl8139_update_irq(s);
+
 }

 static uint32_t rtl8139_IntrMask_read(RTL8139State *s)
@@ -2555,12 +2561,22 @@ static void
rtl8139_IntrStatus_write(RTL8139State *s, uint32_t val)
 rtl8139_update_irq(s);

 s->IntrStatus = newStatus;
+/*
+ * Computing if we miss an interrupt here is not that correct but
+ * considered that we should have had already an interrupt
+ * and probably emulated is slower is better to assume this resetting was
+ * done before testing on previous rtl8139_update_irq lead to IRQ loosing
+ */
+rtl8139_set_next_tctr_time(s, qemu_get_clock(vm_clock));
 rtl8139_update_irq(s);
+
 #endif
 }

 static uint32_t rtl8139_IntrStatus_read(RTL8139State *s)
 {
+rtl8139_set_next_tctr_time(s, qemu_get_clock(vm_clock));
+
 uint32_t ret = s->IntrStatus;

 DEBUG_PRINT(("RTL8139: IntrStatus read(w) val=0x%04x\n", ret));
@@ -2739,6 +2755,43 @@ static void rtl8139_io_writew(void *opaque,
uint8_t addr, uint32_t val)
 }
 }

+static void rtl8139_set_next_tctr_time(RTL8139State *s, int64_t current_time)
+{
+DEBUG_PRINT(("RTL8139: entered rtl8139_set_next_tctr_time\n"));
+
+if (s->TimerExpire && current_time >= s->TimerExpire) {
+s->IntrStatus |= PCSTimeout;
+rtl8139_update_irq(s);
+}
+
+/* Set QEMU timer only if needed that is
+ * - TimerInt <> 0 (we have a timer)
+ * - mask = 1 (we want an interrupt timer)
+ * - irq = 0  (irq is not already active)
+ * If any of above change we need to compute timer again
+ * Also we must check if timer is passed without QEMU timer
+ */
+s->TimerExpire = 0;
+if (!s->TimerInt) {
+return;
+}
+
+int64_t pci_time = muldiv64(current_time - s->TCTR_base, PCI_FREQUENCY,
+get_ticks_per_sec());
+uint32_t low_pci = pci_time & 0x;
+pci_time = pci_time - low_pci + s->TimerInt;
+if (low_pci >= s->TimerInt) {
+pci_time += 0x1LL;
+}
+int64_t next_time = s->TCTR_base + muldiv64(pci_time, get_ticks_per_sec(),
+PCI_FREQUENCY);
+s->TimerExpire = next_time;
+
+if ((s->IntrMask & PCSTimeout) != 0 && (s->IntrStatus & PCSTimeout) == 0) {
+qemu_mod_timer(s->timer, next_time);
+}
+}
+
 static void rtl8139_io_writel(void *opaque, uint8_t addr, uint32_t val)
 {
 RTL8139State *s = opaque;
@@ -2784,13 +2837,16 @@ static void rtl8139_io_writel(void *opaque,
uint8_t addr, uint32_t val)

 case Timer:
 DEBUG_PRINT(("RTL8139: TCTR Timer reset on write\n"));
-s->TCTR = 0;
 s->TCTR_base = qemu_get_clock(vm_clock);
+rtl8139_set_next_tctr_time(s, s->TCTR_base);
 break;

 case FlashReg:
 DEBUG_PRINT(("RTL8139: FlashReg TimerInt write
val=0x%08x\n", val));
-s->TimerInt = val;
+if (s->TimerInt != val) {
+s->TimerInt = val;
+rtl8139_set_next_tctr_time(s, qemu_get_clock(vm_clock));
+}
 break;

 default:
@@ -3000,7 +3056,8 @@ static uint32_t rtl8139_io_readl(void *opaque,
uint8_t addr)
 break;

 case Timer:
-ret = s->TCTR;
+ret = muldiv64(qemu_get_clock(vm_clock) - s->TCT

[Qemu-devel] Re: [PATCH] pci: fix info pci with host bridge.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 03:40:38PM +0900, Isaku Yamahata wrote:
> This patch fixes 525e05147d5a3bdc08caa422d108c1ef71b584b5.
> pci host bridge doesn't have header type of bridge.
> The check should be by header type, instead of pci class device.
> 
> Cc: Blue Swirl 
> Cc: "Michael S. Tsirkin" 
> Signed-off-by: Isaku Yamahata 

Re: 525e05147d5a3bdc08caa422d108c1ef71b584b5
this commit put hard-coded constant in pci.c.
It would have been better to post it on list for review
instead of direct commit.

> ---
>  hw/pci.c |6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/pci.c b/hw/pci.c
> index e91d2e6..eb2043e 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -1273,7 +1273,7 @@ static QObject *pci_get_devices_list(PCIBus *bus, int 
> bus_num);
>  
>  static QObject *pci_get_dev_dict(PCIDevice *dev, PCIBus *bus, int bus_num)
>  {
> -int class;
> +uint8_t type;
>  QObject *obj;
>  
>  obj = qobject_from_jsonf("{ 'bus': %d, 'slot': %d, 'function': %d,"  
>  "'class_info': %p, 'id': %p, 'regions': %p,"
> @@ -1289,8 +1289,8 @@ static QObject *pci_get_dev_dict(PCIDevice *dev, PCIBus 
> *bus, int bus_num)
>  qdict_put(qdict, "irq", 
> qint_from_int(dev->config[PCI_INTERRUPT_LINE]));
>  }
>  
> -class = pci_get_word(dev->config + PCI_CLASS_DEVICE);
> -if (class == PCI_CLASS_BRIDGE_HOST || class == PCI_CLASS_BRIDGE_PCI) {
> +type = dev->config[PCI_HEADER_TYPE] & ~PCI_HEADER_TYPE_MULTI_FUNCTION;
> +if (type == PCI_HEADER_TYPE_BRIDGE) {
>  QDict *qdict;
>  QObject *pci_bridge;
>  
> -- 
> 1.6.6.1




[Qemu-devel] Seabios dislikes -M isapc

2010-02-08 Thread Jan Kiszka
Hi,

Seabios seems to have some assumptions built in that break when -M isapc
is selected. Is this supposed to work or is isapc about to die?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux




[Qemu-devel] Re: [SeaBIOS] [PATCH] Seabios: Fix PkgLength calculation for the SSDT.

2010-02-08 Thread Avi Kivity

On 01/12/2010 03:36 PM, Kevin O'Connor wrote:

On Tue, Jan 12, 2010 at 10:10:41AM +0100, Magnus Christensson wrote:
   

On 12/24/2009 01:29 AM, Kevin O'Connor wrote:
 

On Mon, Dec 14, 2009 at 10:25:14AM +0100, Magnus Christensson wrote:
   

Should be cpu_length + 2 as far as I can tell. The current code is
definitely broken.

   

Right. That should be cpu_length +2 in the else-part.
 

Can you resend the patch with the change?
   

Attached (sorry for the delay).
 

Thanks.  Commit 3012af18.
   


Without this patch, Windows 2008 r2 won't boot with more than 4 cpus.  
Kevin/Anthony, can you coordinate a SeaBIOS release including this patch?


Probably needed for qemu 0.12 as well.

--
error compiling committee.c: too many arguments to function





[Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:
> initialize header type register in pci generic code.
> 
> Cc: Blue Swirl 
> Cc: "Michael S. Tsirkin" 
> Signed-off-by: Isaku Yamahata 

No objections here, I am assuming this will be followed
by patches removing header type init from bridges?
>From qdev perspective, it is probably cleaner to make
multifunction bit a separate qdev property though, right?

I queued this one, let's wait a bit for comments from
interested people.

> ---
>  hw/pci.c |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/pci.c b/hw/pci.c
> index eb2043e..7b055b4 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -603,6 +603,7 @@ static PCIDevice *do_pci_register_device(PCIDevice 
> *pci_dev, PCIBus *bus,
>  pci_dev->irq_state = 0;
>  pci_config_alloc(pci_dev);
>  
> +pci_dev->config[PCI_HEADER_TYPE] = header_type;
>  header_type &= ~PCI_HEADER_TYPE_MULTI_FUNCTION;
>  if (header_type == PCI_HEADER_TYPE_NORMAL) {
>  pci_set_default_subsystem_id(pci_dev);
> -- 
> 1.6.6.1




[Qemu-devel] [PATCH 1/3] qemu-kvm: Wrap phys_ram_dirty with additional inline functions.

2010-02-08 Thread OHMURA Kei
We think access phys_ram_dirty through inline functions is better
than directly for encoupseling reason.

We devided the ram in a 64 pages block. Each block has a counter, which is
stored in phys_ram_dirty_by_word. It shows the number of dirty pages.
We will find the 64 pages block is dirty or non-dirty using
phys_ram_dirty_by_word.

Signed-off-by: OHMURA Kei 
---
 cpu-all.h  |   74 
 cpu-defs.h |1 +
 2 files changed, 75 insertions(+), 0 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 8ed76c7..2251f14 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -168,6 +168,33 @@ typedef union {
 } CPU_QuadU;
 #endif
 
+static inline unsigned long unroll_flags_to_ul(int flags)
+{
+unsigned long ret = 0, flags_ul = (unsigned long)flags;
+
+#if TARGET_LONG_SIZE == 4
+ret |= flags_ul <<  0;
+ret |= flags_ul <<  8;
+ret |= flags_ul << 16;
+ret |= flags_ul << 24;
+#elif TARGET_LONG_SIZE == 8
+ret |= flags_ul <<  0;
+ret |= flags_ul <<  8;
+ret |= flags_ul << 16;
+ret |= flags_ul << 24;
+ret |= flags_ul << 32;
+ret |= flags_ul << 40;
+ret |= flags_ul << 48;
+ret |= flags_ul << 56;
+#else
+int i;
+for (i = 0; i < sizeof(unsigned long); i++)
+ret |= flags_ul << (i * 8);
+#endif
+
+return ret;
+}
+
 /* CPU memory access without any memory or io remapping */
 
 /*
@@ -847,6 +874,7 @@ int cpu_str_to_log_mask(const char *str);
 
 extern int phys_ram_fd;
 extern uint8_t *phys_ram_dirty;
+extern int *phys_ram_dirty_by_word;
 extern ram_addr_t ram_size;
 extern ram_addr_t last_ram_offset;
 extern uint8_t *bios_mem;
@@ -882,6 +910,11 @@ static inline int cpu_physical_memory_is_dirty(ram_addr_t 
addr)
 return phys_ram_dirty[addr >> TARGET_PAGE_BITS] == 0xff;
 }
 
+static inline int cpu_physical_memory_get_dirty_flags(ram_addr_t addr)
+{
+return phys_ram_dirty[addr >> TARGET_PAGE_BITS];
+}
+
 static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
 int dirty_flags)
 {
@@ -890,9 +923,50 @@ static inline int cpu_physical_memory_get_dirty(ram_addr_t 
addr,
 
 static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
 {
+if (phys_ram_dirty[addr >> TARGET_PAGE_BITS] != 0xff)
+++phys_ram_dirty_by_word[(addr >> TARGET_PAGE_BITS) / 
+ TARGET_LONG_BITS];
+
 phys_ram_dirty[addr >> TARGET_PAGE_BITS] = 0xff;
 }
 
+static inline int cpu_physical_memory_set_dirty_flags(ram_addr_t addr,
+  int dirty_flags)
+{
+if ((phys_ram_dirty[addr >> TARGET_PAGE_BITS] != 0xff) &&
+((phys_ram_dirty[addr >> TARGET_PAGE_BITS] | dirty_flags) == 0xff)) 
+++phys_ram_dirty_by_word[(addr >> TARGET_PAGE_BITS) / 
+ TARGET_LONG_BITS];
+
+return phys_ram_dirty[addr >> TARGET_PAGE_BITS] |= dirty_flags;
+}
+
+static inline void cpu_physical_memory_mask_dirty_range(ram_addr_t start,
+int length,
+int dirty_flags)
+{
+int i, mask, len, *pw;
+uint8_t *p;
+
+len = length >> TARGET_PAGE_BITS;
+mask = ~dirty_flags;
+p = phys_ram_dirty + (start >> TARGET_PAGE_BITS);
+pw = phys_ram_dirty_by_word + (start >> TARGET_PAGE_BITS) / 
+TARGET_LONG_BITS;
+
+for (i = 0; i < len; i++) {
+if (p[i] == 0xff)
+--pw[i / TARGET_LONG_BITS];
+p[i] &= mask;
+}
+}
+
+int cpu_physical_memory_get_dirty_range(ram_addr_t start, ram_addr_t end, 
+int dirty_flags);
+
+int cpu_physical_memory_get_non_dirty_range(ram_addr_t start, ram_addr_t end, 
+int dirty_flags);
+
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
  int dirty_flags);
 void cpu_tlb_update_dirty(CPUState *env);
diff --git a/cpu-defs.h b/cpu-defs.h
index cf502e9..8e89e96 100644
--- a/cpu-defs.h
+++ b/cpu-defs.h
@@ -37,6 +37,7 @@
 #endif
 
 #define TARGET_LONG_SIZE (TARGET_LONG_BITS / 8)
+#define TARGET_LONG_ALIGN(addr) (((addr) + TARGET_LONG_BITS - 1) / 
TARGET_LONG_BITS) 
 
 /* target_ulong is the type of a virtual address */
 #if TARGET_LONG_SIZE == 4
-- 
1.6.3.3







[Qemu-devel] [PATCH 0/3] qemu-kvm: Check the dirty and non-dirty pages by multiple size

2010-02-08 Thread OHMURA Kei
The dirty and non-dirty pages are checked one by one in vl.c.
Since we believe that most of the memory is not dirty,
checking the dirty and non-dirty pages by multiple page size
should be much faster than checking them one by one.

We think there is mostly two kind of situation.
One is almost all the page is clean because there should be small
amount of pages become dirty between each round of dirty bitmap check.
The other is all the pages is dirty because all the bitmap is marked as
dirty at the beginning of migration.

To prove our prospect,
we have evaluated effect of this patch. we compared runtime of
ram_save_remaining with original ram_save_remaining() and
ram_save_remaining() using functions of this patch.

Test Environment:
CPU: 4x Intel Xeon Quad Core 2.66GHz
Mem size: 6GB
kvm version: 2.6.31-17-server
qemu version: commit ed880109f74f0a4dd5b7ec09e6a2d9ba4903d9a5

Host OS: Ubuntu 9.10 (kernel 2.6.31)
Guest OS: Debian/GNU Linux lenny (kernel 2.6.26)
Guest Mem size: 512MB

Conditions of experiments are as follows:
Cond1: Guest OS periodically makes the 256MB continuous dirty pages.
Cond2: Guest OS periodically makes the 256MB dirty pages and non-dirty pages
in turn.
Cond3: Guest OS read 3GB file, which is bigger than memory.
Cond4: Guest OS read/write 3GB file, which is bigger than memory.

Experimental results:
Cond1: 8~16 times speed up
Cond2: 3~4 times speed down
Cond3: 8~16 times speed up
Cond4: 2~16 times speed up

# Runtime of ram_save_remaining() is changed by the number of remaining
# dirty pages.







[Qemu-devel] [PATCH 2/3] Check continuous dirty and non-dirty pages.

2010-02-08 Thread OHMURA Kei
We will check the dirty and non-dirty pages as follows:
1. Checked by 64 pages block.
Check pages using phys_ram_dirty_by_word. Count up/down it when
phys_ram_dirty when is 0xff or not.
2. Checked by TARGET_LONG_SIZE pages block.
3. Checked by one page.


Signed-off-by: OHMURA Kei 
---
 exec.c |  125 +++-
 1 files changed, 100 insertions(+), 25 deletions(-)

diff --git a/exec.c b/exec.c
index ade09cb..5770281 100644
--- a/exec.c
+++ b/exec.c
@@ -119,6 +119,7 @@ uint8_t *code_gen_ptr;
 #if !defined(CONFIG_USER_ONLY)
 int phys_ram_fd;
 uint8_t *phys_ram_dirty;
+int *phys_ram_dirty_by_word;
 uint8_t *bios_mem;
 static int in_migration;
 
@@ -1843,7 +1844,7 @@ static void tlb_protect_code(ram_addr_t ram_addr)
 static void tlb_unprotect_code_phys(CPUState *env, ram_addr_t ram_addr,
 target_ulong vaddr)
 {
-phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] |= CODE_DIRTY_FLAG;
+cpu_physical_memory_set_dirty_flags(ram_addr, CODE_DIRTY_FLAG);
 }
 
 static inline void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry,
@@ -1858,14 +1859,83 @@ static inline void tlb_reset_dirty_range(CPUTLBEntry 
*tlb_entry,
 }
 }
 
+int cpu_physical_memory_get_dirty_range(ram_addr_t start, ram_addr_t end, 
+int dirty_flags)
+{
+static unsigned long mask = 0;
+static int cached_dirty_flags = 0;
+uint8_t *p = phys_ram_dirty + (start >> TARGET_PAGE_BITS);
+int *p_by_word = phys_ram_dirty_by_word +
+(start >> TARGET_PAGE_BITS) / TARGET_LONG_BITS;
+
+/*
+ * Since making the mask in every time this function called is too slow,
+ * we will cache the value.
+ */
+if (dirty_flags != cached_dirty_flags) {
+cached_dirty_flags = dirty_flags;
+mask = unroll_flags_to_ul(dirty_flags);
+}  
+
+/*
+ * We can get the dirty-pages very fast, 
+ * when a lot of continuous pages are dirty.
+ */ 
+if start >> TARGET_PAGE_BITS) & (TARGET_LONG_BITS - 1)) == 0) && 
+((end - start) >> TARGET_PAGE_BITS) >= TARGET_LONG_BITS &&
+*p_by_word == TARGET_LONG_BITS)
+return TARGET_LONG_BITS;
+
+if start >> TARGET_PAGE_BITS) & (TARGET_LONG_SIZE - 1)) == 0) && 
+((end - start) >> TARGET_PAGE_BITS) >= TARGET_LONG_SIZE &&
+(*(unsigned long *)p & mask) == mask)
+return TARGET_LONG_SIZE;
+
+return (cpu_physical_memory_get_dirty(start, dirty_flags) == dirty_flags);
+}
+
+int cpu_physical_memory_get_non_dirty_range(ram_addr_t start, ram_addr_t end, 
+int dirty_flags)
+{
+static unsigned long mask = 0;
+static int cached_dirty_flags = 0;
+uint8_t *p = phys_ram_dirty + (start >> TARGET_PAGE_BITS);
+int *p_by_word = phys_ram_dirty_by_word +
+(start >> TARGET_PAGE_BITS) / TARGET_LONG_BITS;
+
+/* 
+ * Since making the mask in every time this function called is too slow,
+ * we will cache the value.
+ */
+if (dirty_flags != cached_dirty_flags) {
+cached_dirty_flags = dirty_flags;
+mask = unroll_flags_to_ul(dirty_flags); 
+}
+
+/* 
+ * We can skip the non-dirty-pages very fast, 
+ * when a lot of continuous pages are not dirty. 
+ */
+if start >> TARGET_PAGE_BITS) & (TARGET_LONG_BITS - 1)) == 0) && 
+((end - start) >> TARGET_PAGE_BITS) >= TARGET_LONG_BITS &&
+*p_by_word == 0)
+return TARGET_LONG_BITS;
+
+if start >> TARGET_PAGE_BITS) & (TARGET_LONG_SIZE - 1)) == 0) &&
+((end - start) >> TARGET_PAGE_BITS) >= TARGET_LONG_SIZE &&
+(*(unsigned long *)p & mask) == 0)
+return TARGET_LONG_SIZE;
+
+return (cpu_physical_memory_get_dirty(start, dirty_flags) == 0);
+}
+
 /* Note: start and end must be within the same ram block.  */
 void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
  int dirty_flags)
 {
 CPUState *env;
 unsigned long length, start1;
-int i, mask, len;
-uint8_t *p;
+int i;
 
 start &= TARGET_PAGE_MASK;
 end = TARGET_PAGE_ALIGN(end);
@@ -1873,11 +1943,7 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, 
ram_addr_t end,
 length = end - start;
 if (length == 0)
 return;
-len = length >> TARGET_PAGE_BITS;
-mask = ~dirty_flags;
-p = phys_ram_dirty + (start >> TARGET_PAGE_BITS);
-for(i = 0; i < len; i++)
-p[i] &= mask;
+cpu_physical_memory_mask_dirty_range(start, length, dirty_flags);
 
 /* we modify the TLB cache so that the dirty bit will be set again
when accessing the range */
@@ -2535,6 +2601,7 @@ extern const char *mem_path;
 ram_addr_t qemu_ram_alloc(ram_addr_t size)
 {
 RAMBlock *new_block;
+int i, *p;
 
 size = TARGET_PAGE_ALIGN(size);
 new_block = qemu_malloc(sizeof(*new_block));
@@ -2564,6 +2631,14 @@ ram_addr_t qemu_ram_alloc(ram_addr

[Qemu-devel] [PATCH 3/3] qemu-kvm: Change the methods of get dirty pages.

2010-02-08 Thread OHMURA Kei
Get number of the dirty and non-dirty pages using
cpu_physical_memory_get_{dirty|non_dirty}_range().


Signed-off-by: OHMURA Kei 
---
 vl.c |   57 ++---
 1 files changed, 38 insertions(+), 19 deletions(-)

diff --git a/vl.c b/vl.c
index 4ef6a78..0835da6 100644
--- a/vl.c
+++ b/vl.c
@@ -2798,7 +2798,7 @@ static int ram_save_block(QEMUFile *f)
 static ram_addr_t current_addr = 0;
 ram_addr_t saved_addr = current_addr;
 ram_addr_t addr = 0;
-int found = 0;
+int i, found = 0, skip = 0;
 
 while (addr < last_ram_offset) {
 if (kvm_enabled() && current_addr == 0) {
@@ -2810,28 +2810,37 @@ static int ram_save_block(QEMUFile *f)
 return 0;
 }
 }
-if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) 
{
+if ((found = cpu_physical_memory_get_dirty_range(current_addr, 
+last_ram_offset, MIGRATION_DIRTY_FLAG))) {
 uint8_t *p;
 
-cpu_physical_memory_reset_dirty(current_addr,
-current_addr + TARGET_PAGE_SIZE,
-MIGRATION_DIRTY_FLAG);
+for (i = 0; i < found; i++) {
+ram_addr_t page_addr = current_addr + (i * TARGET_PAGE_SIZE);
+cpu_physical_memory_reset_dirty(page_addr,
+page_addr + TARGET_PAGE_SIZE,
+MIGRATION_DIRTY_FLAG);
 
-p = qemu_get_ram_ptr(current_addr);
+p = qemu_get_ram_ptr(page_addr);
 
-if (is_dup_page(p, *p)) {
-qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_COMPRESS);
-qemu_put_byte(f, *p);
-} else {
-qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_PAGE);
-qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+if (is_dup_page(p, *p)) {
+qemu_put_be64(f, (page_addr) |
+  RAM_SAVE_FLAG_COMPRESS);
+qemu_put_byte(f, *p);
+} else {
+qemu_put_be64(f, (page_addr) |
+  RAM_SAVE_FLAG_PAGE);
+qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+}
 }
 
-found = 1;
 break;
 }
-addr += TARGET_PAGE_SIZE;
-current_addr = (saved_addr + addr) % last_ram_offset;
+while ((skip = cpu_physical_memory_get_non_dirty_range(current_addr,
+last_ram_offset, MIGRATION_DIRTY_FLAG)) != 0) {
+addr += TARGET_PAGE_SIZE * skip;
+current_addr = (saved_addr + addr) % last_ram_offset;
+if (addr >= last_ram_offset) break;
+}
 }
 
 return found;
@@ -2841,12 +2850,22 @@ static uint64_t bytes_transferred;
 
 static ram_addr_t ram_save_remaining(void)
 {
-ram_addr_t addr;
+ram_addr_t addr = 0;
 ram_addr_t count = 0;
+int found = 0, skip = 0;
 
-for (addr = 0; addr < last_ram_offset; addr += TARGET_PAGE_SIZE) {
-if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG))
-count++;
+while (addr < last_ram_offset) {
+if ((found = cpu_physical_memory_get_dirty_range(addr, 
+last_ram_offset, MIGRATION_DIRTY_FLAG))) {
+count += found;
+addr += TARGET_PAGE_SIZE * found;
+} else {
+while ((skip = cpu_physical_memory_get_non_dirty_range(addr,
+last_ram_offset, MIGRATION_DIRTY_FLAG)) != 0) {
+addr += TARGET_PAGE_SIZE * skip;
+if (addr >= last_ram_offset) break;
+}
+}
 }
 
 return count;
-- 
1.6.3.3






Re: [Qemu-devel] [PATCH v3] block: more read-only changes, related to backing files

2010-02-08 Thread Kevin Wolf
Am 07.02.2010 15:19, schrieb Naphtali Sprei:
> This version addresses comments by Kevin Wolf  to v2
> Also separate commits squashed.
> 
> 
> Open image file read-only where possible
> Upgrade file to read-write during commit, back to read-only after commit
> Added option for qemu-img.c bdrv_new_open() to open file as read-only
> 
> qemu-img changes based on patch by Sheng Yang 
> 
> 
> Signed-off-by: Naphtali Sprei 

Looks much better to me now.

The only thing I'm still unsure about is what to do with the case where
re-opening the image fails completely. Have you tested this case? I have
no idea what would happen, probably a segfault somewhere.

Kevin




[Qemu-devel] Re: [PATCH] apb_pci: fix header type of pbm pci host bridge.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 03:45:03PM +0900, Isaku Yamahata wrote:
> The change set of 776e1bbb6cf4fe66a93c1a5dd814bbb650deca00
> specifies pbm pci host bridge is type of bridge.
> It contradicts with pbm_pci_host_init().
> 
> Blue Swirl, could you please check this patch?
> To be honest I don't know about pbm pci host bridge so that
> I don't know which is correct, pbm_pci_host_init() or pbm_pci_host_info.
> I just took the older code.
> 
> Cc: Blue Swirl 
> Cc: "Michael S. Tsirkin" 
> Signed-off-by: Isaku Yamahata 

Blue Swirl, can you Ack please?

> ---
>  hw/apb_pci.c |3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/apb_pci.c b/hw/apb_pci.c
> index 46d5b0e..359a84f 100644
> --- a/hw/apb_pci.c
> +++ b/hw/apb_pci.c
> @@ -471,7 +471,7 @@ static int pbm_pci_host_init(PCIDevice *d)
>  d->config[0x09] = 0x00; // programming i/f
>  pci_config_set_class(d->config, PCI_CLASS_BRIDGE_HOST);
>  d->config[0x0D] = 0x10; // latency_timer

Looks like apb_pci needs another sweep of getting rid of
hard-coded constants. Any takers?

> -d->config[PCI_HEADER_TYPE] = PCI_HEADER_TYPE_NORMAL; // header_type
> +/* header type is initialized by do_pci_register_device() */

Let's not put such comments around code: that function can get renamed
or removed or moved to another file and no one will remember to update
this comment. This belongs in commit message.

>  return 0;
>  }
>  
> @@ -479,7 +479,6 @@ static PCIDeviceInfo pbm_pci_host_info = {
>  .qdev.name = "pbm",
>  .qdev.size = sizeof(PCIDevice),
>  .init  = pbm_pci_host_init,
> -.header_type  = PCI_HEADER_TYPE_BRIDGE,
>  };
>  
>  static SysBusDeviceInfo pbm_host_info = {
> -- 
> 1.6.6.1




[Qemu-devel] Re: [PATCH] apb_pci: fix header type of pbm pci host bridge.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 03:45:03PM +0900, Isaku Yamahata wrote:
> The change set of 776e1bbb6cf4fe66a93c1a5dd814bbb650deca00
> specifies pbm pci host bridge is type of bridge.
> It contradicts with pbm_pci_host_init().


By the way, the next below (cover letter) should be put after
--- rather than here, so that it does not end up in
commit message.


> Blue Swirl, could you please check this patch?
> To be honest I don't know about pbm pci host bridge so that
> I don't know which is correct, pbm_pci_host_init() or pbm_pci_host_info.
> I just took the older code.
> 
> Cc: Blue Swirl 
> Cc: "Michael S. Tsirkin" 
> Signed-off-by: Isaku Yamahata 
> ---
>  hw/apb_pci.c |3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/apb_pci.c b/hw/apb_pci.c
> index 46d5b0e..359a84f 100644
> --- a/hw/apb_pci.c
> +++ b/hw/apb_pci.c
> @@ -471,7 +471,7 @@ static int pbm_pci_host_init(PCIDevice *d)
>  d->config[0x09] = 0x00; // programming i/f
>  pci_config_set_class(d->config, PCI_CLASS_BRIDGE_HOST);
>  d->config[0x0D] = 0x10; // latency_timer
> -d->config[PCI_HEADER_TYPE] = PCI_HEADER_TYPE_NORMAL; // header_type
> +/* header type is initialized by do_pci_register_device() */
>  return 0;
>  }
>  
> @@ -479,7 +479,6 @@ static PCIDeviceInfo pbm_pci_host_info = {
>  .qdev.name = "pbm",
>  .qdev.size = sizeof(PCIDevice),
>  .init  = pbm_pci_host_init,
> -.header_type  = PCI_HEADER_TYPE_BRIDGE,
>  };
>  
>  static SysBusDeviceInfo pbm_host_info = {
> -- 
> 1.6.6.1




[Qemu-devel] Re: [PATCH] Seabios - read e820 table from qemu_cfg

2010-02-08 Thread Jes Sorensen

On 01/28/10 05:39, Kevin O'Connor wrote:

As a side note, it should probably do the e820 map check even for qemu
users (ie, not just kvm).


Hi Kevin,

Here is an updated version of the patch which does the e820 read
unconditionally of  the return from kvm_para_available() so it should
work for coreboot too.

I haven't touched the file descriptor issue as I find it's a different
discussion.

Let me know what you think.

Cheers,
Jes

PS: I tried to subscribe to the seabios list but I never got an ack for
it :(
Read optional table of e820 entries from qemu_cfg

Read optional table of e820 entries through qemu_cfg, allowing QEMU to
provide the location of KVM's switch area etc. rather than rely on
hard coded values.

For now, fall back to the old hard coded values for the TSS and EPT
switch page for compatibility reasons. Compatibility code could
possibly be removed in the future.

Signed-off-by: Jes Sorensen 

---
 src/paravirt.c |   17 +
 src/paravirt.h |9 +
 src/post.c |   13 -
 3 files changed, 38 insertions(+), 1 deletion(-)

Index: seabios/src/paravirt.c
===
--- seabios.orig/src/paravirt.c
+++ seabios/src/paravirt.c
@@ -132,6 +132,23 @@ u16 qemu_cfg_smbios_entries(void)
 return cnt;
 }
 
+u32 qemu_cfg_e820_entries(void)
+{
+u32 cnt;
+
+if (!qemu_cfg_present)
+return 0;
+
+qemu_cfg_read_entry(&cnt, QEMU_CFG_E820_TABLE, sizeof(cnt));
+return cnt;
+}
+
+void* qemu_cfg_e820_load_next(void *addr)
+{
+qemu_cfg_read(addr, sizeof(struct e820_entry));
+return addr;
+}
+
 struct smbios_header {
 u16 length;
 u8 type;
Index: seabios/src/paravirt.h
===
--- seabios.orig/src/paravirt.h
+++ seabios/src/paravirt.h
@@ -36,6 +36,7 @@ static inline int kvm_para_available(voi
 #define QEMU_CFG_ACPI_TABLES   (QEMU_CFG_ARCH_LOCAL + 0)
 #define QEMU_CFG_SMBIOS_ENTRIES(QEMU_CFG_ARCH_LOCAL + 1)
 #define QEMU_CFG_IRQ0_OVERRIDE (QEMU_CFG_ARCH_LOCAL + 2)
+#define QEMU_CFG_E820_TABLE(QEMU_CFG_ARCH_LOCAL + 3)
 
 extern int qemu_cfg_present;
 
@@ -61,8 +62,16 @@ typedef struct QemuCfgFile {
 char name[56];
 } QemuCfgFile;
 
+struct e820_entry {
+u64 address;
+u64 length;
+u32 type;
+};
+
 u16 qemu_cfg_first_file(QemuCfgFile *entry);
 u16 qemu_cfg_next_file(QemuCfgFile *entry);
 u32 qemu_cfg_read_file(QemuCfgFile *entry, void *dst, u32 maxlen);
+u32 qemu_cfg_e820_entries(void);
+void* qemu_cfg_e820_load_next(void *addr);
 
 #endif
Index: seabios/src/post.c
===
--- seabios.orig/src/post.c
+++ seabios/src/post.c
@@ -135,10 +135,21 @@ ram_probe(void)
  , E820_RESERVED);
 add_e820(BUILD_BIOS_ADDR, BUILD_BIOS_SIZE, E820_RESERVED);
 
-if (kvm_para_available())
+u32 count = qemu_cfg_e820_entries();
+if (count) {
+struct e820_entry entry;
+int i;
+
+for (i = 0; i < count; i++) {
+qemu_cfg_e820_load_next(&entry);
+add_e820(entry.address, entry.length, entry.type);
+}
+} else if (kvm_para_available()) {
+// Backwards compatibility - provide hard coded range.
 // 4 pages before the bios, 3 pages for vmx tss pages, the
 // other page for EPT real mode pagetable
 add_e820(0xfffbc000, 4*4096, E820_RESERVED);
+}
 
 dprintf(1, "Ram Size=0x%08x (0x%08x%08x high)\n"
 , RamSize, (u32)(RamSizeOver4G >> 32), (u32)RamSizeOver4G);


[Qemu-devel] Re: [PATCH] pci: fix pci_find_bus()

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 03:38:36PM +0900, Isaku Yamahata wrote:
> typo in c021f8e65f5009a5ab5711d9d5326fcab553ef1c.
> comparison fix.
> 
> Cc: Blue Swirl 
> Cc: "Michael S. Tsirkin" 
> Signed-off-by: Isaku Yamahata 

thanks, applied.

> ---
>  hw/pci.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/hw/pci.c b/hw/pci.c
> index 9ad63dd..e91d2e6 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -1558,7 +1558,7 @@ PCIBus *pci_find_bus(PCIBus *bus, int bus_num)
>  /* try child bus */
>  QLIST_FOREACH(sec, &bus->child, sibling) {
>  if (!bus->parent_dev /* pci host bridge */
> -|| (pci_bus_num(sec) >= bus_num &&
> +|| (pci_bus_num(sec) <= bus_num &&
>  bus_num <= bus->parent_dev->config[PCI_SUBORDINATE_BUS]) ) {
>  ret = pci_find_bus(sec, bus_num);
>  if (ret) {
> -- 
> 1.6.6.1




Re: [Qemu-devel] [PATCH] rtl8139 timer interrupt rewrote

2010-02-08 Thread Frediano Ziglio
2010/2/8 Igor Kovalenko :
> On Sun, Feb 7, 2010 at 6:22 PM, Frediano Ziglio  wrote:
>> rewrote timer implementation for rtl8139. Add a QEMU
>> timer only when needed (timeout status not set, timeout irq wanted and
>> timer set).
>>
>> Signed-off-by: Frediano Ziglio 
>> --
>> diff --git a/hw/rtl8139.c b/hw/rtl8139.c
>> index f04dd54..0d877b7 100644
>> --- a/hw/rtl8139.c
>> +++ b/hw/rtl8139.c
>> @@ -41,6 +41,10 @@
>>  *  segmentation offloading
>>  *  Removed slirp.h dependency
>>  *  Added rx/tx buffer reset when
>> enabling rx/tx operation
>> + *
>> + *  2009-Feb-04  Frediano Ziglio:   Rewrote timer support using QEMU timer 
>> only
>> + *  when strictly needed (required for for
>> + *  Darwin)
>>  */
>>
>>  #include "hw.h"
>> @@ -61,7 +65,7 @@
>>  #define RTL8139_CALCULATE_RXCRC 1
>>
>>  /* Uncomment to enable on-board timer interrupts */
>> -//#define RTL8139_ONBOARD_TIMER 1
>> +#define RTL8139_ONBOARD_TIMER 1
>
> Please remove this macro.
>

Removed (see updated patch)

>>
>> +static void rtl8139_pre_save(void *opaque)
>> +{
>> +RTL8139State* s = opaque;
>> +
>> +// set IntrStatus correctly
>> +int64_t current_time = qemu_get_clock(vm_clock);
>> +rtl8139_set_next_tctr_time(s, current_time);
>> +s->TCTR = muldiv64(current_time - s->TCTR_base, PCI_FREQUENCY,
>> +   get_ticks_per_sec());
>> +}
>> +
>
> Seems like TCTR is not used after restoring from saved state. Is it 
> intentional?
>

Yes, wanted, mainly TCTR is computed from TCTR_base and tick counts so
is currently quite unused but is computed for compatibility with
previous state.

> Can you please check if freebsd works with this change?
>

Yes, it works correctly. I don't know FreeBSD that much but I
downloaded latest 8.0 live version, got a shell, configured with
network manually with old ifconfig and tried some data exchange
(mainly scp/ssh).

Regards
 Frediano Ziglio




[Qemu-devel] Re: [PATCH] pci: fix info pci with host bridge.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 12:10:52PM +0200, Michael S. Tsirkin wrote:
> On Mon, Feb 08, 2010 at 03:40:38PM +0900, Isaku Yamahata wrote:
> > This patch fixes 525e05147d5a3bdc08caa422d108c1ef71b584b5.
> > pci host bridge doesn't have header type of bridge.
> > The check should be by header type, instead of pci class device.
> > 
> > Cc: Blue Swirl 
> > Cc: "Michael S. Tsirkin" 
> > Signed-off-by: Isaku Yamahata 
> 
> Re: 525e05147d5a3bdc08caa422d108c1ef71b584b5
> this commit put hard-coded constant in pci.c.
> It would have been better to post it on list for review
> instead of direct commit.

Heh, I looked at the reverse patch for some reason,
it didn't put in constants, it removed them :)

> > ---
> >  hw/pci.c |6 +++---
> >  1 files changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/pci.c b/hw/pci.c
> > index e91d2e6..eb2043e 100644
> > --- a/hw/pci.c
> > +++ b/hw/pci.c
> > @@ -1273,7 +1273,7 @@ static QObject *pci_get_devices_list(PCIBus *bus, int 
> > bus_num);
> >  
> >  static QObject *pci_get_dev_dict(PCIDevice *dev, PCIBus *bus, int bus_num)
> >  {
> > -int class;
> > +uint8_t type;
> >  QObject *obj;
> >  
> >  obj = qobject_from_jsonf("{ 'bus': %d, 'slot': %d, 'function': %d,"
> >"'class_info': %p, 'id': %p, 'regions': 
> > %p,"
> > @@ -1289,8 +1289,8 @@ static QObject *pci_get_dev_dict(PCIDevice *dev, 
> > PCIBus *bus, int bus_num)
> >  qdict_put(qdict, "irq", 
> > qint_from_int(dev->config[PCI_INTERRUPT_LINE]));
> >  }
> >  
> > -class = pci_get_word(dev->config + PCI_CLASS_DEVICE);
> > -if (class == PCI_CLASS_BRIDGE_HOST || class == PCI_CLASS_BRIDGE_PCI) {
> > +type = dev->config[PCI_HEADER_TYPE] & ~PCI_HEADER_TYPE_MULTI_FUNCTION;
> > +if (type == PCI_HEADER_TYPE_BRIDGE) {
> >  QDict *qdict;
> >  QObject *pci_bridge;
> >  
> > -- 
> > 1.6.6.1




[Qemu-devel] Re: [PATCH] pci: fix info pci with host bridge.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 03:40:38PM +0900, Isaku Yamahata wrote:
> This patch fixes 525e05147d5a3bdc08caa422d108c1ef71b584b5.
> pci host bridge doesn't have header type of bridge.
> The check should be by header type, instead of pci class device.
> 
> Cc: Blue Swirl 
> Cc: "Michael S. Tsirkin" 
> Signed-off-by: Isaku Yamahata 

So the effect of this will be that info pci won't report
host bridge, right? IOW, it kind of reverts
525e05147d5a3bdc08caa422d108c1ef71b584b5, or am I
missing something?

> ---
>  hw/pci.c |6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/pci.c b/hw/pci.c
> index e91d2e6..eb2043e 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -1273,7 +1273,7 @@ static QObject *pci_get_devices_list(PCIBus *bus, int 
> bus_num);
>  
>  static QObject *pci_get_dev_dict(PCIDevice *dev, PCIBus *bus, int bus_num)
>  {
> -int class;
> +uint8_t type;
>  QObject *obj;
>  
>  obj = qobject_from_jsonf("{ 'bus': %d, 'slot': %d, 'function': %d,"  
>  "'class_info': %p, 'id': %p, 'regions': %p,"
> @@ -1289,8 +1289,8 @@ static QObject *pci_get_dev_dict(PCIDevice *dev, PCIBus 
> *bus, int bus_num)
>  qdict_put(qdict, "irq", 
> qint_from_int(dev->config[PCI_INTERRUPT_LINE]));
>  }
>  
> -class = pci_get_word(dev->config + PCI_CLASS_DEVICE);
> -if (class == PCI_CLASS_BRIDGE_HOST || class == PCI_CLASS_BRIDGE_PCI) {
> +type = dev->config[PCI_HEADER_TYPE] & ~PCI_HEADER_TYPE_MULTI_FUNCTION;
> +if (type == PCI_HEADER_TYPE_BRIDGE) {
>  QDict *qdict;
>  QObject *pci_bridge;
>  
> -- 
> 1.6.6.1




[Qemu-devel] Re: How fun it must be to commit patches without posting them!

2010-02-08 Thread Paolo Bonzini

On 02/08/2010 10:10 AM, Paolo Bonzini wrote:

On 02/08/2010 10:09 AM, malc wrote:

Because i had a hard disk crash and hence ended up with a new shiny
system where QEMU does not build anymore, also not enough patience
to revisit the thread and figure out why it wasn't applied yet.


Sorry for you, but that's what local branches are for.


Thanks for committing my series.

Paolo




[Qemu-devel] USB function application on QEMU

2010-02-08 Thread Taimoor Mirza

Hi all,

 

Is it possible to run any linux based USB device side application on QEMU? For 
example I want to develop a linux based mass storage device and I want to test 
it on QEMU.

 

BR,

Taimoor
  
_
Hotmail: Free, trusted and rich email service.
https://signup.live.com/signup.aspx?id=60969

Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Gerd Hoffmann

On 02/08/10 11:17, Michael S. Tsirkin wrote:

On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:

initialize header type register in pci generic code.

Cc: Blue Swirl
Cc: "Michael S. Tsirkin"
Signed-off-by: Isaku Yamahata


No objections here, I am assuming this will be followed
by patches removing header type init from bridges?
 From qdev perspective, it is probably cleaner to make
multifunction bit a separate qdev property though, right?


From a qdev perspective it would make *alot* of sense to move a bunch 
of pci config stuff (including, but not limited to header type) into 
PCIDeviceInfo.


cheers,
  Gerd





Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:
> On 02/08/10 11:17, Michael S. Tsirkin wrote:
>> On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:
>>> initialize header type register in pci generic code.
>>>
>>> Cc: Blue Swirl
>>> Cc: "Michael S. Tsirkin"
>>> Signed-off-by: Isaku Yamahata
>>
>> No objections here, I am assuming this will be followed
>> by patches removing header type init from bridges?
>>  From qdev perspective, it is probably cleaner to make
>> multifunction bit a separate qdev property though, right?
>
> From a qdev perspective it would make *alot* of sense to move a bunch of 
> pci config stuff (including, but not limited to header type) into  
> PCIDeviceInfo.
>
> cheers,
>   Gerd

It's an Ack then?




Re: [Qemu-devel] [PATCH 1/4] qjson: Improve debugging

2010-02-08 Thread Luiz Capitulino
On Fri, 05 Feb 2010 18:14:41 +0100
Markus Armbruster  wrote:

> Anthony Liguori  writes:

[...]

> Yes.  But what's reasonably expected entirely depends on the contract
> between the function and its callers.
> 
> I think we need a function that cannot fail and shouldn't used with
> untrusted arguments (for what it's worth, that's how we use
> qobject_from_jsonf() now).  Having related functions with different
> contracts is fine with me.

 I completely agree.




[Qemu-devel] Re: [PATCH] qemu-kvm: Speed up of the dirty-bitmap-traveling

2010-02-08 Thread OHMURA Kei
>> Would be great if you could provide a version for upstream as well
>> because it will likely replace this qemu-kvm code on day.
> O.K.  We'll prepare it.


We have implemented the version for upstream.  Some source code are borrowed 
from qemu-kvm.c.  It is not fully tested yet, though.

We also did performance test against this patch.  Test environment is the same 
as the email I sent before.


Experimental results:
Test1: Guest OS read 3GB file, which is bigger than memory.
#called orig.(msec) patch(msec) ratio
14  3.790.1820.8
12  3.200.1521.4
11  2.890.1421.0
 
Test2: Guest OS read/write 3GB file, which is bigger than memory.
#called orig.(msec) patch(msec) ratio
364 180 8.7020.7
326 161 7.7120.9
474 235 11.720.1


---
 kvm-all.c |   80 +---
 1 files changed, 65 insertions(+), 15 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 15ec38e..9666843 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -279,9 +279,69 @@ int kvm_set_migration_log(int enable)
 return 0;
 }
 
-static int test_le_bit(unsigned long nr, unsigned char *addr)
+static inline void kvm_get_dirty_pages_log_range_by_byte(unsigned int start,
+ unsigned int end,
+ unsigned char *bitmap,
+ unsigned long offset)
 {
-return (addr[nr >> 3] >> (nr & 7)) & 1;
+unsigned int i, j, n = 0;
+unsigned long page_number, addr, addr1;
+ram_addr_t ram_addr;
+unsigned char c;
+
+/*   
+ * bitmap-traveling is faster than memory-traveling (for addr...)
+ * especially when most of the memory is not dirty.
+ */
+for (i = start; i < end; i++) {
+c = bitmap[i];
+while (c > 0) {
+j = ffsl(c) - 1;
+c &= ~(1u << j);
+page_number = i * 8 + j;
+addr1 = page_number * TARGET_PAGE_SIZE;
+addr = offset + addr1;
+ram_addr = cpu_get_physical_page_desc(addr);
+cpu_physical_memory_set_dirty(ram_addr);
+n++;
+}
+}
+}
+
+static int kvm_get_dirty_pages_log_range_by_long(unsigned long start_addr,
+ unsigned char *bitmap,
+ unsigned long mem_size)
+{
+unsigned int i;
+unsigned int len;
+unsigned long *bitmap_ul = (unsigned long *)bitmap;
+
+/* bitmap-traveling by long size is faster than by byte size
+ * especially when most of memory is not dirty.
+ * bitmap should be long-size aligned for traveling by long.
+ */
+if (((unsigned long)bitmap & (TARGET_LONG_SIZE - 1)) == 0) {
+len = ((mem_size / TARGET_PAGE_SIZE) + TARGET_LONG_BITS - 1) / 
+TARGET_LONG_BITS;
+for (i = 0; i < len; i++)
+if (bitmap_ul[i] != 0)
+kvm_get_dirty_pages_log_range_by_byte(i * TARGET_LONG_SIZE,
+(i + 1) * TARGET_LONG_SIZE, bitmap, start_addr);
+/*  
+ * We will check the remaining dirty-bitmap, 
+ * when the mem_size is not a multiple of TARGET_LONG_SIZE. 
+ */
+if ((mem_size & (TARGET_LONG_SIZE - 1)) != 0) {
+len = ((mem_size / TARGET_PAGE_SIZE) + 7) / 8;
+kvm_get_dirty_pages_log_range_by_byte(i * TARGET_LONG_SIZE,
+len, bitmap, start_addr);
+}
+} else { /* slow path: traveling by byte. */
+len = ((mem_size / TARGET_PAGE_SIZE) + 7) / 8;
+kvm_get_dirty_pages_log_range_by_byte(0, len, bitmap, start_addr);
+}
+
+return 0;
 }
 
 /**
@@ -297,8 +357,6 @@ int kvm_physical_sync_dirty_bitmap(target_phys_addr_t 
start_addr,
 {
 KVMState *s = kvm_state;
 unsigned long size, allocated_size = 0;
-target_phys_addr_t phys_addr;
-ram_addr_t addr;
 KVMDirtyLog d;
 KVMSlot *mem;
 int ret = 0;
@@ -327,17 +385,9 @@ int kvm_physical_sync_dirty_bitmap(target_phys_addr_t 
start_addr,
 break;
 }
 
-for (phys_addr = mem->start_addr, addr = mem->phys_offset;
- phys_addr < mem->start_addr + mem->memory_size;
- phys_addr += TARGET_PAGE_SIZE, addr += TARGET_PAGE_SIZE) {
-unsigned char *bitmap = (unsigned char *)d.dirty_bitmap;
-unsigned nr = (phys_addr - mem->start_addr) >> TARGET_PAGE_BITS;
-
-if (test_le_bit(nr, bitmap)) {
-cpu_physical_memory_set_dirty(addr);
-}
-}
-start_addr = phys_addr;
+kvm_get_dirty_pages_log_range_by_long(mem->start_addr, 
+d.dirty_bitmap, mem->memory_size);
+start_

Re: [Qemu-devel] [PATCH] fix the static compilation for sdl

2010-02-08 Thread Aurelien Jarno
On Mon, Feb 08, 2010 at 01:56:44PM +0800, TeLeMan wrote:
> The static compilation for sdl is broken after
> 79427693174a553d62f3da44aacd3f19ba8df3a7.
> 
> Signed-off-by: TeLeMan 

Thanks, applied.

> ---
>  configure |7 +--
>  1 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/configure b/configure
> index 4c95c27..213dddf 100644
> --- a/configure
> +++ b/configure
> @@ -1063,7 +1063,11 @@ if test "$sdl" != "no" ; then
>  int main( void ) { return SDL_Init (SDL_INIT_VIDEO); }
>  EOF
>sdl_cflags=`$sdlconfig --cflags 2> /dev/null`
> -  sdl_libs=`$sdlconfig --libs 2> /dev/null`
> +  if test "$static" = "yes" ; then
> +sdl_libs=`sdl-config --static-libs 2>/dev/null`
> +  else
> +sdl_libs=`$sdlconfig --libs 2> /dev/null`
> +  fi
>if compile_prog "$sdl_cflags" "$sdl_libs" ; then
>  if test "$_sdlversion" -lt 121 ; then
>sdl_too_old=yes
> @@ -1075,7 +1079,6 @@ EOF
> 
>  # static link with sdl ? (note: sdl.pc's --static --libs is broken)
>  if test "$sdl" = "yes" -a "$static" = "yes" ; then
> -  sdl_libs=`sdl-config --static-libs 2>/dev/null`
>if test $? = 0 && echo $sdl_libs | grep -- -laa > /dev/null; then
>   sdl_libs="$sdl_libs `aalib-config --static-libs >2 /dev/null`"
>   sdl_cflags="$sdl_cflags `aalib-config --cflags >2 /dev/null`"
> -- 
> 1.6.5.1.1367.gcd48
> 
> --
> SUN OF A BEACH
> 
> 
> 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net




[Qemu-devel] Re: [PATCH] qemu-kvm: Speed up of the dirty-bitmap-traveling

2010-02-08 Thread Jan Kiszka
OHMURA Kei wrote:
>>> Would be great if you could provide a version for upstream as well
>>> because it will likely replace this qemu-kvm code on day.
>> O.K.  We'll prepare it.
> 
> 
> We have implemented the version for upstream.  Some source code are borrowed 
> from qemu-kvm.c.  It is not fully tested yet, though.
> 
> We also did performance test against this patch.  Test environment is the 
> same 
> as the email I sent before.
> 
> 
> Experimental results:
> Test1: Guest OS read 3GB file, which is bigger than memory.
> #called orig.(msec) patch(msec) ratio
> 14  3.790.1820.8
> 12  3.200.1521.4
> 11  2.890.1421.0
>  
> Test2: Guest OS read/write 3GB file, which is bigger than memory.
> #called orig.(msec) patch(msec) ratio
> 364 180 8.7020.7
> 326 161 7.7120.9
> 474 235 11.720.1
> 

Wow, so we were really inefficient here. Nice work!

Once you are done with your tests, please post this against
qemu-kvm.git's uq/master so that Avi or Marcelo can push it upstream.
Minor remarks below.

> 
> ---
>  kvm-all.c |   80 +---
>  1 files changed, 65 insertions(+), 15 deletions(-)
> 
> diff --git a/kvm-all.c b/kvm-all.c
> index 15ec38e..9666843 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -279,9 +279,69 @@ int kvm_set_migration_log(int enable)
>  return 0;
>  }
>  
> -static int test_le_bit(unsigned long nr, unsigned char *addr)
> +static inline void kvm_get_dirty_pages_log_range_by_byte(unsigned int start,

I don't think inline is appropriate here. Smart compilers are able to do
this on their own. And small code footprint actually contributes to
speed as well.

> + unsigned int end,
> + unsigned char 
> *bitmap,
> + unsigned long 
> offset)
>  {
> -return (addr[nr >> 3] >> (nr & 7)) & 1;
> +unsigned int i, j, n = 0;
> +unsigned long page_number, addr, addr1;
> +ram_addr_t ram_addr;
> +unsigned char c;
> +
> +/*   
> + * bitmap-traveling is faster than memory-traveling (for addr...)
> + * especially when most of the memory is not dirty.
> + */
> +for (i = start; i < end; i++) {
> +c = bitmap[i];
> +while (c > 0) {
> +j = ffsl(c) - 1;
> +c &= ~(1u << j);
> +page_number = i * 8 + j;
> +addr1 = page_number * TARGET_PAGE_SIZE;
> +addr = offset + addr1;
> +ram_addr = cpu_get_physical_page_desc(addr);
> +cpu_physical_memory_set_dirty(ram_addr);
> +n++;
> +}
> +}
> +}
> +
> +static int kvm_get_dirty_pages_log_range_by_long(unsigned long start_addr,
> + unsigned char *bitmap,
> + unsigned long mem_size)
> +{
> +unsigned int i;
> +unsigned int len;
> +unsigned long *bitmap_ul = (unsigned long *)bitmap;
> +
> +/* bitmap-traveling by long size is faster than by byte size
> + * especially when most of memory is not dirty.
> + * bitmap should be long-size aligned for traveling by long.
> + */
> +if (((unsigned long)bitmap & (TARGET_LONG_SIZE - 1)) == 0) {
> +len = ((mem_size / TARGET_PAGE_SIZE) + TARGET_LONG_BITS - 1) / 
> +TARGET_LONG_BITS;
> +for (i = 0; i < len; i++)
> +if (bitmap_ul[i] != 0)
> +kvm_get_dirty_pages_log_range_by_byte(i * TARGET_LONG_SIZE,
> +(i + 1) * TARGET_LONG_SIZE, bitmap, start_addr);

Missing { }, 2x.

> +/*  
> + * We will check the remaining dirty-bitmap, 
> + * when the mem_size is not a multiple of TARGET_LONG_SIZE. 
> + */
> +if ((mem_size & (TARGET_LONG_SIZE - 1)) != 0) {
> +len = ((mem_size / TARGET_PAGE_SIZE) + 7) / 8;
> +kvm_get_dirty_pages_log_range_by_byte(i * TARGET_LONG_SIZE,
> +len, bitmap, start_addr);

This line should be indented to the '('.

> +}
> +} else { /* slow path: traveling by byte. */
> +len = ((mem_size / TARGET_PAGE_SIZE) + 7) / 8;
> +kvm_get_dirty_pages_log_range_by_byte(0, len, bitmap, start_addr);
> +}
> +
> +return 0;
>  }
>  
>  /**
> @@ -297,8 +357,6 @@ int kvm_physical_sync_dirty_bitmap(target_phys_addr_t 
> start_addr,
>  {
>  KVMState *s = kvm_state;
>  unsigned long size, allocated_size = 0;
> -target_phys_addr_t phys_addr;
> -ram_addr_t addr;
>  KVMDirtyLog d;
>  KVMSlot *mem;
>  int ret = 0;
> @@ -327,17 +385,9 @@ int kvm_physical_sync_dirty_bitmap(target_phys_addr_t 

Re: [Qemu-devel] [PATCH 4/4] target-arm: neon fix

2010-02-08 Thread Riku Voipio
On Sun, Feb 07, 2010 at 02:02:31PM +0100, Laurent Desnogues wrote:
> On Sun, Feb 7, 2010 at 1:54 PM, Laurent Desnogues
>  wrote:
> > On Fri, Feb 5, 2010 at 4:52 PM, Riku Voipio  wrote:
> >> From: Juha Riihimäki 

> >> add an extra check in "two registers and a shift" to ensure element
> >> size decoding logic cannot fail.

> > I think there's a patch ordering problem that makes
> > the comment and the change not agree :-)

Sorry, apparently messed up while rebasing.
 
> BTW I don't think adding the check for size is needed
> here.  The encoding at that point looks like this:
 
>  332211
>  10987654321098765432109876543210
>  001_1___1__1
>  001_1__1___1
>  001_1_11
 
> so it will stop for size == 0 given that bit 19 will have to
> be set.

Juha agrees so we'll drop this patch (or more precisely get the actual change
out of the previous patch..)





Re: [Qemu-devel] [PATCH] use "%lld" instead of "%I64d" for qobject_from_jsonf in monitor.c and migration.c

2010-02-08 Thread Luiz Capitulino
On Mon, 8 Feb 2010 15:42:30 +0800
Roy Tam  wrote:

> 2010/2/8 TeLeMan :
> > The json parser does not support "%I64d", so we have to use "%lld"
> > instead of "%I64d".
> >
> 
> We use PRId64 with json in more places besides migration.c and
> monitor.c, adding %I64d support in json lexer/parser is a better
> choice IMO.

 Yes, Anthony didn't merge patches posted to the list yet, I hope
he will merge yours.




[Qemu-devel] 0.12.2, PowerPC, CPU 750 wrongly identified (?), hardware error

2010-02-08 Thread Bartlomiej Celary
Hi,
I've just compiled qemu 0.12.2 for Windows under msys.

I am using "-m prep -M 750" options which on 0.11 worked fine. In 012.2
however, I get the following error:

qemu: hardware error: PowerPC 601 / 620 / 970 need a 1MB BIOS

CPU #0:
NIP    LR  CTR  XER 
MSR  HID0   HF  idx 0
TB   DECR 
GPR00    
GPR04    
GPR08    
GPR12    
GPR16    
GPR20    
GPR24    
GPR28    
CR   [ -  -  -  -  -  -  -  -  ] RES 
FPR00    
FPR04    
FPR08    
FPR12    
FPR16    
FPR20    
FPR24    
FPR28    
FPSCR 
SRR0  SRR1  SDR1 

This application has requested the Runtime to terminate it in an unusual
way.
Please contact the application's support team for more information.

Am I correct, that somehow the CPU was recognized as PowerPC 601 / 620 /
970? Is this OK?

Any help greatly appreciated.

Regards,
Bartek Celary


[Qemu-devel] Re: [PATCH] qemu-kvm: Speed up of the dirty-bitmap-traveling

2010-02-08 Thread Avi Kivity
On 02/05/2010 12:18 PM, OHMURA Kei wrote:
> dirty-bitmap-traveling is carried out by byte size in qemu-kvm.c.
> But We think that dirty-bitmap-traveling by long size is faster than by byte
> size especially when most of memory is not dirty.
>
>
>
> +
> +static int kvm_get_dirty_pages_log_range_by_long(unsigned long start_addr,
> + unsigned char *bitmap,
> + unsigned long offset,
> + unsigned long mem_size)
> +{
> +unsigned int i;
> +unsigned int len;
> +unsigned long *bitmap_ul = (unsigned long *)bitmap;
> +
> +/* bitmap-traveling by long size is faster than by byte size
> + * especially when most of memory is not dirty.
> + * bitmap should be long-size aligned for traveling by long.
> + */
> +if (((unsigned long)bitmap & (TARGET_LONG_SIZE - 1)) == 0) {
>   

Since we allocate the bitmap, we can be sure that it is aligned on a
long boundary (qemu_malloc() should guarantee that). So you can
eliminate the fallback.

> +len = ((mem_size / TARGET_PAGE_SIZE) + TARGET_LONG_BITS - 1) /
> +TARGET_LONG_BITS;
> +for (i = 0; i < len; i++)
> +if (bitmap_ul[i] != 0)
> +kvm_get_dirty_pages_log_range_by_byte(i * TARGET_LONG_SIZE, 
> +(i + 1) * TARGET_LONG_SIZE, bitmap, offset);
>   

Better to just use the original loop here (since we don't need the
function as a fallback).

> +/* 
> + * We will check the remaining dirty-bitmap, 
> + * when the mem_size is not a multiple of TARGET_LONG_SIZE. 
> + */ 
> +if ((mem_size & (TARGET_LONG_SIZE - 1)) != 0) {
> +len = ((mem_size / TARGET_PAGE_SIZE) + 7) / 8;
> +kvm_get_dirty_pages_log_range_by_byte(i * TARGET_LONG_SIZE, 
> +len, bitmap, offset);
> +}
>   

Seems like the bitmap size is also aligned as well (allocated using
BITMAP_SIZE which aligns using HOST_LONG_BITS), so this is unnecessary
as well.

-- 
error compiling committee.c: too many arguments to function





Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Gerd Hoffmann

On 02/08/10 12:16, Michael S. Tsirkin wrote:

On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:

On 02/08/10 11:17, Michael S. Tsirkin wrote:

On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:

initialize header type register in pci generic code.

Cc: Blue Swirl
Cc: "Michael S. Tsirkin"
Signed-off-by: Isaku Yamahata


No objections here, I am assuming this will be followed
by patches removing header type init from bridges?
  From qdev perspective, it is probably cleaner to make
multifunction bit a separate qdev property though, right?


 From a qdev perspective it would make *alot* of sense to move a bunch of
pci config stuff (including, but not limited to header type) into
PCIDeviceInfo.

cheers,
   Gerd


It's an Ack then?


Yes, count it as ack.  It is a move into the right direction, and the 
fact that more cleanups can/should be done here shouldn't hold it up.


cheers,
  Gerd




Re: [Qemu-devel] [PATCH] Add assignment operation to config file parser..

2010-02-08 Thread Gerd Hoffmann

On 02/07/10 17:24, Anthony Liguori wrote:

On 02/06/2010 12:59 PM, john cooper wrote:

This patch reworks support for both assignment and
append in the config file parser. It was motivated
by comments received on the cpu model config file
format.

Commit dc9ca4ba27be4fe6a0284061b8f056c4364fb0d9
changed the behavior of "=" from assign to append.
This patch preserves the ability to append to an
option (however now via "+="), reverts "=" to its
previous behavior, and allows both to interoperate.

Signed-off-by: john cooper


This deviates from standard ini syntax which makes me a big
uncomfortable with it. Gerd, do you have an opinion?


Also it the syntax change will break existing users of the append 
feature (host/guestfwd for slirp networking).


Any reason why you can't use the current append to avoid the overlong 
feature flag lines?


Another idea:  One could reference other processors "base", then you can 
define a -- say -- Opteron_G3 as "Opteron_G2 features plus these".


cheers,
  Gerd




[Qemu-devel] iothread + smp 2 + alt-f4 = lock-up

2010-02-08 Thread Jan Kiszka
Just a note, maybe someone finds the time to look at this:

Sending ALT-F4 to the SDL window of qemu (stable-0.12 and master) when
it was configured with --enable-io-thread and runs -smp 2 or more causes
a lock-up of the qemu process.

[I currently trigger way too many bugs...]

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux




[Qemu-devel] Two QMP events issues

2010-02-08 Thread Luiz Capitulino

 Hi there,

 I have two not so related QMP events issues two discuss, but I will talk about
them in the same email to avoid starting two threads.

 The first problem is wrt the STOP event. Right now it's only emitted if it's
triggered through qemu_system_vmstop_request(), which afaik will only be
called if CONFIG_IOTHREAD is enabled (nonsense, yes).

 The best fix I can think of is to move the STOP event down to do_vm_stop().
We could even have a 'reason' data member with the string representation of
the EXCP_ macros. Looks like this is the right thing do to.

 There's a problem, though. Migration and block subsystems also do vm_stop(0).
The former's reason seems to be 'stop to be loaded' and the latter is 'can't
continue' on disk errors. Note that the block subsystem already has its own
event for disk errors.

 So, my solution is to not generate the STOP event on vm_stop(0). If any
vm_stop(0) user (eg. migration) wants to generate events they should create
the appropriate EXCP_ macro for that.

 Does this look good?

 The second problem is about the watchdog device. I have been asked to
add events for the watchdog's device actions (see
hw/watchdog.c:watchdog_perform_action()).

 Issue is: most of those events directly map to QEMU's events already
generated by QMP, such as RESET, SHUTDOWN, POWEROFF etc.

 We have two solutions:

1. Introduce watchdog's own events. This is easy to do, but will
generate two QMP events for most actions. Eg. the watchdog's WDT_RESET
action will generate a QMP event for WDT_RESET and will generate
another RESET event when this action takes place in QEMU

2. Add a 'source' data member to all events requested via the
qemu_system_* functions, so that we can have a 'wachtdog' source and
only one event is triggered. This will require a more complex change
and maybe some hacks will be needed (eg. for vm_stop())

 Opinions?




[Qemu-devel] Slowdowns comparing qemu-kvm.git to qemu.git: vcpu/thread scheduling differences

2010-02-08 Thread Amit Shah
Hello,

In my testing of virtio-console, I found qemu-kvm.git introduces a lot
of overhead in thread scheduling compared to qemu.git.

My test sends a 260M file from the host to a guest via a virtio-console
port and then computes the sha1sum of the file on the host as well as on
the guest, compares the checksum and declares the result based on the
checksum match. The test passes in all the scenarios listed below,
indicating there's no unsafe data transfer.


Repo   Time taken
-  --
qemu.git < 1 m (typically 30s)
qemu-kvm.git > 16m
qemu-iothread~ 5m


The guest remains the same in all cases.

I went back upto Oct 2009 in the git history to check if this was a
regression; it is not. The qemu-kvm.git results are reproducible on
qemu-kvm-0.12.git as well.

The slowdown with qemu-iothread was reproduced by virtio-net as well:
netperf results were better on qemu.git with iothread disabled than on
qemu.git with iothread enabled.


The virtio-console code currently puts only one buffer in the in_vq,
meaning the host can only transfer 4k at a time before a guest consumes
the data and makes the buffer available for subsequent transfers.
Increasing the number of buffers available to 128 though "fixes" this --
qemu-kvm.git performance is much better in this case.


I have some gprof results comparing qemu.git and qemu-kvm.git (30% of
time in qemu-kvm.git is spent in main_loop_wait()), which I'm still
analyzing and also going over the code paths that are different.

I just thought I'd put out the observations out here so that others who
have an idea can chime in or validate these results with other
workloads.

Amit




[Qemu-devel] Re: Two QMP events issues

2010-02-08 Thread Daniel P. Berrange
On Mon, Feb 08, 2010 at 11:41:45AM -0200, Luiz Capitulino wrote:
> 
>  Hi there,
> 
>  I have two not so related QMP events issues two discuss, but I will talk 
> about
> them in the same email to avoid starting two threads.
> 
>  The first problem is wrt the STOP event. Right now it's only emitted if it's
> triggered through qemu_system_vmstop_request(), which afaik will only be
> called if CONFIG_IOTHREAD is enabled (nonsense, yes).
> 
>  The best fix I can think of is to move the STOP event down to do_vm_stop().
> We could even have a 'reason' data member with the string representation of
> the EXCP_ macros. Looks like this is the right thing do to.
> 
>  There's a problem, though. Migration and block subsystems also do vm_stop(0).
> The former's reason seems to be 'stop to be loaded' and the latter is 'can't
> continue' on disk errors. Note that the block subsystem already has its own
> event for disk errors.
> 
>  So, my solution is to not generate the STOP event on vm_stop(0). If any
> vm_stop(0) user (eg. migration) wants to generate events they should create
> the appropriate EXCP_ macro for that.
> 
>  Does this look good?
> 
>  The second problem is about the watchdog device. I have been asked to
> add events for the watchdog's device actions (see
> hw/watchdog.c:watchdog_perform_action()).
> 
>  Issue is: most of those events directly map to QEMU's events already
> generated by QMP, such as RESET, SHUTDOWN, POWEROFF etc.
> 
>  We have two solutions:
> 
> 1. Introduce watchdog's own events. This is easy to do, but will
> generate two QMP events for most actions. Eg. the watchdog's WDT_RESET
> action will generate a QMP event for WDT_RESET and will generate
> another RESET event when this action takes place in QEMU
> 
> 2. Add a 'source' data member to all events requested via the
> qemu_system_* functions, so that we can have a 'wachtdog' source and
> only one event is triggered. This will require a more complex change
> and maybe some hacks will be needed (eg. for vm_stop())
> 
>  Opinions?

For further backgrou, the key end goal here is that in a QMP client, upon
receipt of the  'RESET' event, we need to reliably & immediately determine
why it  occurred. eg, triggered by watchdog, or by guest OS request. There
are actually 3 possible sequences

 - WATCHDOG + action=reset, followed by RESET.  Assuming no intervening 
   event can occurr, the client can merely record 'WATCHDOG' and interpret
   it when it gets the immediately following 'RESET' event

 - RESET, followed by WATCHDOG + action=reset. The client doesn't know
   the reason for the RESET and can't wait arbitrarily for WATCHDOG since
   there might never be one arriving.

 - RESET + source=watchdog. Client directly sees the reason

The second scenario is the one I'd like us to avoid at all costs, since it
will require the client to introduce arbitrary delays in processing events
to determine cause. The first is slightly inconvenient, but doable if we 
can assume no intervening events will occur, between WATCHDOG and the
RESET events. The last is obviously simplest for the clients.

This question is also pretty relevant for Luiz's previous posting of disk
block I/O errors, since one of those actions  can result in a PAUSE event

Regards,
Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|




[Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:
> initialize header type register in pci generic code.
> 
> Cc: Blue Swirl 
> Cc: "Michael S. Tsirkin" 
> Signed-off-by: Isaku Yamahata 

Applied, thanks.

> ---
>  hw/pci.c |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/pci.c b/hw/pci.c
> index eb2043e..7b055b4 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -603,6 +603,7 @@ static PCIDevice *do_pci_register_device(PCIDevice 
> *pci_dev, PCIBus *bus,
>  pci_dev->irq_state = 0;
>  pci_config_alloc(pci_dev);
>  
> +pci_dev->config[PCI_HEADER_TYPE] = header_type;
>  header_type &= ~PCI_HEADER_TYPE_MULTI_FUNCTION;
>  if (header_type == PCI_HEADER_TYPE_NORMAL) {
>  pci_set_default_subsystem_id(pci_dev);
> -- 
> 1.6.6.1




[Qemu-devel] Emulating MIPS self-examining code

2010-02-08 Thread Dmitry Antipov

Hello,

I'm trying to emulate the following MIPS code (taken from the bootloader of my 
system):

/* Initialize GOT pointer.
** Global symbols can't be resolved before this is done, and as such we 
can't
** use any global symbols in this code.  We use the bal/ move xxx,ra 
combination to access
** data in a PC relative manner to avoid this.  This code will 
correctly set the
** gp regardless of whether the code has already been relocated or not.
** This code determines the current gp by computing the link time (gp - 
pc)
** and adding this to the current pc.
** runtime_gp = runtime_pc + (linktime_gp - linktime_pc)
** U-boot is running from the address it is linked at at this time, so 
this
** general case code is not strictly necessary here.
*/

/* Branch and link to get current PC in ra */
bal 1f
nop
.extern _GLOBAL_OFFSET_TABLE_
.word   _GLOBAL_OFFSET_TABLE_  /* This contains the linked address of 
the GOT */
/* The ra register now contains the runtime address of the above memory 
location */

.word   . - 4  /* This contains the link time address 
of the previous word,
which is also what the link time 
expected PC value is */
1:
movegp, ra/* Move current PC into gp register */
lw  t1, 0(ra) /* Load linked address of the GOT into t1 */
lw  t2, 4(ra) /* Load the link time address of the GOT storage 
location into t2 */
sub t1, t2/* Subtract t2 from t1. */
/* t1 now contains the difference between the link-time GOT table 
address and the link time expected PC */

/* Add this difference to the current PC (copied into gp above) so that 
gp now has the current runtime
** GOT table address */
add gp, t1  # calculate current location of offset table

Corresponding objdump output is:

...[skipped]
bfc306c4:   04110003bal bfc306d4 
bfc306c8:   nop
bfc306cc:   bfc79d10cache   0x7,-25328(s8)
bfc306d0:   bfc306cccache   0x3,1740(s8)
bfc306d4:   03e0e02dmovegp,ra
bfc306d8:   8fe9lw  a5,0(ra)
bfc306dc:   8fea0004lw  a6,4(ra)
bfc306e0:   012a4822sub a5,a5,a6
bfc306e4:   0389e020add gp,gp,a5
...[skipped]
bfc79d10 <_GLOBAL_OFFSET_TABLE_>:
bfc79d10:   nop
...[skipped]

This is a kind of self-examining code (bfc306cc..bfc306d0 is treated as data). 
The problem
is that QEMU translates this (master?)piece into two translation blocks 
bfc306c4..bfc306c8
and bfc306d4..bfc306e4, silently ignoring bfc306cc..bfc306d0 because there is 
no way
to execute in that area. Due to this, 0(ra) at bfc306d8 can't be evaluated 
correctly.

Is there any ideas on how to get such code emulated?

Dmitry




Re: [Qemu-devel] Re: Two QMP events issues

2010-02-08 Thread Anthony Liguori

On 02/08/2010 08:12 AM, Daniel P. Berrange wrote:


For further backgrou, the key end goal here is that in a QMP client, upon
receipt of the  'RESET' event, we need to reliably&  immediately determine
why it  occurred. eg, triggered by watchdog, or by guest OS request. There
are actually 3 possible sequences

  - WATCHDOG + action=reset, followed by RESET.  Assuming no intervening
event can occurr, the client can merely record 'WATCHDOG' and interpret
it when it gets the immediately following 'RESET' event

  - RESET, followed by WATCHDOG + action=reset. The client doesn't know
the reason for the RESET and can't wait arbitrarily for WATCHDOG since
there might never be one arriving.

  - RESET + source=watchdog. Client directly sees the reason

The second scenario is the one I'd like us to avoid at all costs, since it
will require the client to introduce arbitrary delays in processing events
to determine cause. The first is slightly inconvenient, but doable if we
can assume no intervening events will occur, between WATCHDOG and the
RESET events. The last is obviously simplest for the clients.
   


I really prefer the third option but I'm a little concerned that we're 
throwing events around somewhat haphazardly.


So let me ask, why does a client need to determine when a guest reset 
and why it reset?


Regards,

Anthony Liguori





Re: [Qemu-devel] [PATCH 1/4] qjson: Improve debugging

2010-02-08 Thread Anthony Liguori

On 02/05/2010 11:14 AM, Markus Armbruster wrote:

Run time asserts are a terrible way to deal with reasonably expected errors.
 

Yes.  But what's reasonably expected entirely depends on the contract
between the function and its callers.

I think we need a function that cannot fail and shouldn't used with
untrusted arguments (for what it's worth, that's how we use
qobject_from_jsonf() now).  Having related functions with different
contracts is fine with me.
   


I think the key point is that if we're going to establish these 
contracts, it must be obvious.


A reasonable programmer is going to assume that if a function can return 
a NULL, it can possibly return an error.  If you want to deviate from 
those semantics, you either have to name the function appropriately or 
put a big comment above the declaration explaining the semantics.


Regards,

Anthony Liguori




Re: [Qemu-devel] Re: Two QMP events issues

2010-02-08 Thread Daniel P. Berrange
On Mon, Feb 08, 2010 at 08:49:20AM -0600, Anthony Liguori wrote:
> On 02/08/2010 08:12 AM, Daniel P. Berrange wrote:
> >
> >For further backgrou, the key end goal here is that in a QMP client, upon
> >receipt of the  'RESET' event, we need to reliably&  immediately determine
> >why it  occurred. eg, triggered by watchdog, or by guest OS request. There
> >are actually 3 possible sequences
> >
> >  - WATCHDOG + action=reset, followed by RESET.  Assuming no intervening
> >event can occurr, the client can merely record 'WATCHDOG' and interpret
> >it when it gets the immediately following 'RESET' event
> >
> >  - RESET, followed by WATCHDOG + action=reset. The client doesn't know
> >the reason for the RESET and can't wait arbitrarily for WATCHDOG since
> >there might never be one arriving.
> >
> >  - RESET + source=watchdog. Client directly sees the reason
> >
> >The second scenario is the one I'd like us to avoid at all costs, since it
> >will require the client to introduce arbitrary delays in processing events
> >to determine cause. The first is slightly inconvenient, but doable if we
> >can assume no intervening events will occur, between WATCHDOG and the
> >RESET events. The last is obviously simplest for the clients.
> >   
> 
> I really prefer the third option but I'm a little concerned that we're 
> throwing events around somewhat haphazardly.
> 
> So let me ask, why does a client need to determine when a guest reset 
> and why it reset?

If a guest OS is repeatedly hanging/crashing resulting in the watchdog 
device firing, management software for the host really wants to know about
that (so that appropriate alerts/action can be taken) and thus needs to 
be able to distinguish this from a "normal"  guest OS initiated reboot.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|




[Qemu-devel] Re: Slowdowns comparing qemu-kvm.git to qemu.git: vcpu/thread scheduling differences

2010-02-08 Thread Anthony Liguori

On 02/08/2010 07:46 AM, Amit Shah wrote:

Hello,

In my testing of virtio-console, I found qemu-kvm.git introduces a lot
of overhead in thread scheduling compared to qemu.git.

My test sends a 260M file from the host to a guest via a virtio-console
port and then computes the sha1sum of the file on the host as well as on
the guest, compares the checksum and declares the result based on the
checksum match. The test passes in all the scenarios listed below,
indicating there's no unsafe data transfer.


Repo   Time taken
-  --
qemu.git<  1 m (typically 30s)
qemu-kvm.git>  16m
qemu-iothread~ 5m
   


That very likely suggests that there are missing qemu_notify_events() in 
qemu-kvm.git and you're getting blocked waiting for the next timer event 
to fire.


IOW, I assume that during the qemu-kvm.git run, the CPU isn't pegged at 
100% whereas it is in qemu.git.


Regards,

Anthony Liguori




Re: [Qemu-devel] [PATCH] User mode: Handle x86_64 vsyscall

2010-02-08 Thread Vince Weaver
On Sun, 7 Feb 2010, Richard Henderson wrote:
> 
> I imagine that QEMU's VDSO would not have the complicated bits that the
> kernel's version does, where it arranges to read the clock without going into
> kernel space.  I imagine QEMU would simply stuff a normal syscall sequence in
> there, which would automatically be emulated in the normal way.

For what it's worth, this is how various other systems I'm aware of handle 
x86_64 VDSOs (both Valgrind and the m5 simulator do it this way).

Vince




[Qemu-devel] [PATCH] qcow2: don't ignore failed update_refcount

2010-02-08 Thread Jim Meyering
update_refcount is marked as a function for which we must use its result,

static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,

and rightly so, since doing otherwise would amount to ignoring write failure.
However, there are two cases in which the return value is currently ignored.
This fixes them:

>From 107940556a2d0ef1de1d59a5da0c6c3086246817 Mon Sep 17 00:00:00 2001
From: Jim Meyering 
Date: Mon, 8 Feb 2010 11:50:59 +0100
Subject: [PATCH] qcow2: don't ignore failed update_refcount

* block/qcow2-refcount.c (grow_refcount_table): When update_refcount
fails, return its negative return code to our caller.
(alloc_refcount_block): Likewise.
---
 block/qcow2-refcount.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 2fdc26b..b9f5093 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -181,7 +181,9 @@ static int grow_refcount_table(BlockDriverState *bs, int 
min_size)
 s->refcount_table_size = new_table_size;
 s->refcount_table_offset = table_offset;

-update_refcount(bs, table_offset, new_table_size2, 1);
+ret = update_refcount(bs, table_offset, new_table_size2, 1);
+if (ret < 0)
+   goto fail;
 qcow2_free_clusters(bs, old_table_offset, old_table_size * 
sizeof(uint64_t));
 return 0;
  fail:
@@ -231,7 +233,9 @@ static int64_t alloc_refcount_block(BlockDriverState *bs, 
int64_t cluster_index)

 refcount_block_offset = offset;
 s->refcount_block_cache_offset = offset;
-update_refcount(bs, offset, s->cluster_size, 1);
+ret = update_refcount(bs, offset, s->cluster_size, 1);
+if (ret < 0)
+return ret;
 cache_refcount_updates = cache;
 } else {
 if (refcount_block_offset != s->refcount_block_cache_offset) {
--
1.7.0.rc2.156.g2ac04




Re: [Qemu-devel] [PATCH 1/4] qjson: Improve debugging

2010-02-08 Thread Luiz Capitulino
On Mon, 08 Feb 2010 08:53:26 -0600
Anthony Liguori  wrote:

> On 02/05/2010 11:14 AM, Markus Armbruster wrote:
> >> Run time asserts are a terrible way to deal with reasonably expected 
> >> errors.
> >>  
> > Yes.  But what's reasonably expected entirely depends on the contract
> > between the function and its callers.
> >
> > I think we need a function that cannot fail and shouldn't used with
> > untrusted arguments (for what it's worth, that's how we use
> > qobject_from_jsonf() now).  Having related functions with different
> > contracts is fine with me.
> >
> 
> I think the key point is that if we're going to establish these 
> contracts, it must be obvious.
> 
> A reasonable programmer is going to assume that if a function can return 
> a NULL, it can possibly return an error.  If you want to deviate from 
> those semantics, you either have to name the function appropriately or 
> put a big comment above the declaration explaining the semantics.

 Given that qobject_from_jsonf() is already a good and long name, I
prefer to add the comment.

 I will do that and re-submit.




Re: [Qemu-devel] Emulating MIPS self-examining code

2010-02-08 Thread Aurelien Jarno
On Mon, Feb 08, 2010 at 05:26:33PM +0300, Dmitry Antipov wrote:
> Hello,
> 
> I'm trying to emulate the following MIPS code (taken from the bootloader of 
> my system):
> 
> /* Initialize GOT pointer.
> ** Global symbols can't be resolved before this is done, and as such 
> we can't
> ** use any global symbols in this code.  We use the bal/ move xxx,ra 
> combination to access
> ** data in a PC relative manner to avoid this.  This code will 
> correctly set the
> ** gp regardless of whether the code has already been relocated or 
> not.
> ** This code determines the current gp by computing the link time (gp 
> - pc)
> ** and adding this to the current pc.
> ** runtime_gp = runtime_pc + (linktime_gp - linktime_pc)
> ** U-boot is running from the address it is linked at at this time, 
> so this
> ** general case code is not strictly necessary here.
> */
> 
> /* Branch and link to get current PC in ra */
> bal 1f
> nop
> .extern _GLOBAL_OFFSET_TABLE_
> .word   _GLOBAL_OFFSET_TABLE_  /* This contains the linked address of 
> the GOT */
> /* The ra register now contains the runtime address of the above 
> memory location */
> 
> .word   . - 4  /* This contains the link time address 
> of the previous word,
> which is also what the link time 
> expected PC value is */
> 1:
> movegp, ra/* Move current PC into gp register */
> lw  t1, 0(ra) /* Load linked address of the GOT into t1 */
> lw  t2, 4(ra) /* Load the link time address of the GOT storage 
> location into t2 */
> sub t1, t2/* Subtract t2 from t1. */
> /* t1 now contains the difference between the link-time GOT table 
> address and the link time expected PC */
> 
> /* Add this difference to the current PC (copied into gp above) so 
> that gp now has the current runtime
> ** GOT table address */
> add gp, t1  # calculate current location of offset table
> 
> Corresponding objdump output is:
> 
> ...[skipped]
> bfc306c4:   04110003bal bfc306d4 
> bfc306c8:   nop
> bfc306cc:   bfc79d10cache   0x7,-25328(s8)
> bfc306d0:   bfc306cccache   0x3,1740(s8)
> bfc306d4:   03e0e02dmovegp,ra
> bfc306d8:   8fe9lw  a5,0(ra)
> bfc306dc:   8fea0004lw  a6,4(ra)
> bfc306e0:   012a4822sub a5,a5,a6
> bfc306e4:   0389e020add gp,gp,a5
> ...[skipped]
> bfc79d10 <_GLOBAL_OFFSET_TABLE_>:
> bfc79d10:   nop
> ...[skipped]
> 
> This is a kind of self-examining code (bfc306cc..bfc306d0 is treated as 
> data). The problem
> is that QEMU translates this (master?)piece into two translation blocks 
> bfc306c4..bfc306c8
> and bfc306d4..bfc306e4, silently ignoring bfc306cc..bfc306d0 because there is 
> no way

It does not ignore it, it skips it because of the jump in 0xbfc306c4
which instructs the CPU to jump into bfc306d4. That's why the second 
block starts at this address.

> to execute in that area. Due to this, 0(ra) at bfc306d8 can't be evaluated 
> correctly.

Why? Do you mean this instruction is not executed? What is important
here is the value of gp.

> Is there any ideas on how to get such code emulated?

I personally don't see the problem. Please also post the input asm code
and the output tcg code from qemu (-d in_asm,op).

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net




Re: [Qemu-devel] Re: Two QMP events issues

2010-02-08 Thread Anthony Liguori

On 02/08/2010 08:56 AM, Daniel P. Berrange wrote:

On Mon, Feb 08, 2010 at 08:49:20AM -0600, Anthony Liguori wrote:
   

On 02/08/2010 08:12 AM, Daniel P. Berrange wrote:
 

For further backgrou, the key end goal here is that in a QMP client, upon
receipt of the  'RESET' event, we need to reliably&   immediately determine
why it  occurred. eg, triggered by watchdog, or by guest OS request. There
are actually 3 possible sequences

  - WATCHDOG + action=reset, followed by RESET.  Assuming no intervening
event can occurr, the client can merely record 'WATCHDOG' and interpret
it when it gets the immediately following 'RESET' event

  - RESET, followed by WATCHDOG + action=reset. The client doesn't know
the reason for the RESET and can't wait arbitrarily for WATCHDOG since
there might never be one arriving.

  - RESET + source=watchdog. Client directly sees the reason

The second scenario is the one I'd like us to avoid at all costs, since it
will require the client to introduce arbitrary delays in processing events
to determine cause. The first is slightly inconvenient, but doable if we
can assume no intervening events will occur, between WATCHDOG and the
RESET events. The last is obviously simplest for the clients.

   

I really prefer the third option but I'm a little concerned that we're
throwing events around somewhat haphazardly.

So let me ask, why does a client need to determine when a guest reset
and why it reset?
 

If a guest OS is repeatedly hanging/crashing resulting in the watchdog
device firing, management software for the host really wants to know about
that (so that appropriate alerts/action can be taken) and thus needs to
be able to distinguish this from a "normal"  guest OS initiated reboot.
   


I think that's an argument for having the watchdog events independent of 
the reset events.


The watchdog condition happening is not directly related to the action 
the watchdog takes.  The watchdog event really belongs in a class events 
that are closely associated with a particular device emulation.


In fact, I think what we're really missing in events today is a notion 
of a context.  A RESET event is really a CPU event.  A watchdog 
expiration event is a watchdog event.  A connect event is a VNC event 
(Spice and chardevs will also generate connect events).


Including what the current action is in the watchdog expiration event is 
certainly reasonable although not strictly necessary.


Regards,

Anthony Liguori


Regards,
Daniel
   






[Qemu-devel] Re: [PATCH] qcow2: don't ignore failed update_refcount

2010-02-08 Thread Kevin Wolf
Am 08.02.2010 16:01, schrieb Jim Meyering:
> update_refcount is marked as a function for which we must use its result,
> 
> static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
> 
> and rightly so, since doing otherwise would amount to ignoring write failure.
> However, there are two cases in which the return value is currently ignored.
> This fixes them:
> 
> From 107940556a2d0ef1de1d59a5da0c6c3086246817 Mon Sep 17 00:00:00 2001
> From: Jim Meyering 
> Date: Mon, 8 Feb 2010 11:50:59 +0100
> Subject: [PATCH] qcow2: don't ignore failed update_refcount
> 
> * block/qcow2-refcount.c (grow_refcount_table): When update_refcount
> fails, return its negative return code to our caller.
> (alloc_refcount_block): Likewise.

I'm currently working on fixing exactly this, and unfortunaly, no, it's
not that easy. What you introduce looks like proper error handling at
first sight, but what happens in fact is that while the current write
request correctly fails now we're running with corrupted metadata for
all future requests (the new refcount table/block is already in use, but
it has a refcount of 0 and will be overwritten sooner or later).

Actually, I have found it impossible to fix the current approach, so the
fix I'm working on will be more of a rewrite of the two functions.

Kevin




[Qemu-devel] Re: [PATCH] qcow2: don't ignore failed update_refcount

2010-02-08 Thread Jim Meyering
Kevin Wolf wrote:
...
> I'm currently working on fixing exactly this, and unfortunaly, no, it's
> not that easy. What you introduce looks like proper error handling at
> first sight, but what happens in fact is that while the current write
> request correctly fails now we're running with corrupted metadata for
> all future requests (the new refcount table/block is already in use, but
> it has a refcount of 0 and will be overwritten sooner or later).
>
> Actually, I have found it impossible to fix the current approach, so the
> fix I'm working on will be more of a rewrite of the two functions.

Nicely worded NACK ;-)
Thanks.




[Qemu-devel] Training request

2010-02-08 Thread Jean-Christophe Voisin
French version at the end of mail.
Please excuse our English.

Hello,

We are two apprentice in the last year of our training as engineers.
We would like to offer our services for the development of qemu. We
have 256 hours of school project to accompling, helped by a teacher
and researcher at ESEO (Ecole Supérieure d'Electronique de l'Ouest, in
Angers, France). This project starts in March and ends in July.

Your project interests us, we are highly motivated by the integration
or the improvement of a target platform in your software, especially
PowerPC 405, 440, ARM7 or Cortex-M3. We already worked with these architectures
in our companies. These propositions are not restrictive, we are open
to any proposition that could allow us to acquire experience in the
field of emulation.

We rely on your knowledge of the qemu project to advise us on what
kind of projects we could achieve in the time we have. We hope this
project will fulfill our objectives of learning and being useful to
the community.

Kind regards,


Antoine Chagneau (antoine.chagn...@gmail.com) and Jean-Christophe
Voisin (jc.voi...@gmail.com)

-- French version --

Bonjour,


Apprentis en 3ème année d'école d'ingénieur à l' ESEO-ITII (Ecole
Supérieure d'Electronique de l'Ouest, à Angers en partenariat avec
l'Institut des Techniques d'Ingénieur de l'Industrie). Nous vous
contactons pour vous proposer notre collaboration au développement de
qemu. Dans le cadre de notre cursus, nous devons effectuer en binôme
un projet de 128 heures chacun (soit 256 heures au total) avec l'appui
d'un enseignant de l'ESEO. Ce projet commence en mars et se termine en
juillet.

Votre projet nous intéresse, nous sommes fortement motivés par
l'intégration dans votre logiciel ou l'amélioration d'une plateforme
cible, en particulier PowerPC 405, 440, ARM7 or Cortex-M3. Ce sont des
architectures avec lesquels nous avons eu l'occasion de travailler
dans nos projets en entreprise. Ces propositions ne sont pas
restrictives, nous sommes ouvert à toute proposition qui nous
permettrait d'acquérir des compétences dans le domaine de l'émulation
de systèmes.

Nous comptons sur votre connaissance du projet qemu pour nous
aiguiller sur le type de sujet réalisable dans le temps imparti qui,
outre un intérêt pédagogique pour nous, répondra au besoin de la
communauté.

Cordialement,


Antoine Chagneau (antoine.chagn...@gmail.com) et Jean-Christophe
Voisin (jc.voi...@gmail.com)




[Qemu-devel] Re: [PATCH 4/4] KVM: Rework of guest debug state writing

2010-02-08 Thread Marcelo Tosatti
On Thu, Feb 04, 2010 at 08:21:08PM +0100, Jan Kiszka wrote:
> Jan Kiszka wrote:
> > Marcelo Tosatti wrote:
> >> With kvm-autotest the failure is not sporadic (and the above commit
> >> applied): with KVM_SET_GUEST_DEBUG in arch_put_regs all migration 
> >> tests fail, without, all of them succeed. 
> >>
> >> So env->kvm_guest_debug has been zeroed by cpu_x86_init, which means
> >> the writeback via KVM_SET_GUEST_DEBUG does almost nothing. It does
> >> get_rflags and set_rflags in the kernel.
> > 
> > Hmm, it also copies debug regs around... BTW, where do we save/restore
> > dr0..7 between kernel and user space?
> > 
> > But that should not be a problem, both shadow as well as effective regs
> > should be properly initialized, specifically for a newly created VCPU.
> 
> Could you retry after pushing SET_GUEST_DEBUG at the end of
> kvm_arch_put_registers? Maybe it is no good idea to run get/set_rflags
> without having the sregs properly initialized.
> 
> Jan

Can't reproduce (with the original patch, KVM_SET_GUEST_DEBUG at
beginning of arch_put_regs). Different test box though. Go figure.

Anyway, as you mentioned, the workaround can be made cleaner through
set_vcpu_events.





[Qemu-devel] Re: [PATCH 4/4] KVM: Rework of guest debug state writing

2010-02-08 Thread Marcelo Tosatti
On Thu, Feb 04, 2010 at 08:00:26PM +0100, Jan Kiszka wrote:
> Jan Kiszka wrote:
> > Marcelo Tosatti wrote:
> >>
> >> Unrelated to this problem, won't put_vcpu_events, which is executed 
> >> after KVM_SET_GUEST_DEBUG, overwrite any queued debug exceptions?
> > 
> > Good point, SET_GUEST_DEBUG should be last in the writeback for that reason.
> 
> Actually, we no longer need the exception injection via SET_GUEST_DEBUG
> now that we have full access via vcpu_events. 

> So this needs a cleanup, and I'm afraid quite a few cases are broken
> ATM with vcpu_events writeback overwriting the reinjected exceptions.

Don't see what you mean here. Can you be more explicit? What breakage?






Re: [Qemu-devel] [PATCH 4/4] target-arm: neon fix

2010-02-08 Thread Laurent Desnogues
On Mon, Feb 8, 2010 at 12:47 PM, Riku Voipio  wrote:
> On Sun, Feb 07, 2010 at 02:02:31PM +0100, Laurent Desnogues wrote:
>> On Sun, Feb 7, 2010 at 1:54 PM, Laurent Desnogues
>>  wrote:
>> > On Fri, Feb 5, 2010 at 4:52 PM, Riku Voipio  wrote:
>> >> From: Juha Riihimäki 
>
>> >> add an extra check in "two registers and a shift" to ensure element
>> >> size decoding logic cannot fail.
>
>> > I think there's a patch ordering problem that makes
>> > the comment and the change not agree :-)
>
> Sorry, apparently messed up while rebasing.
>
>> BTW I don't think adding the check for size is needed
>> here.  The encoding at that point looks like this:
>
>>  332211
>>  10987654321098765432109876543210
>>  001_1___1__1
>>  001_1__1___1
>>  001_1_11
>
>> so it will stop for size == 0 given that bit 19 will have to
>> be set.
>
> Juha agrees so we'll drop this patch (or more precisely get the actual change
> out of the previous patch..)

Do you intend to resend the patch 3 of this set or should it
be reviewed as is?

Thanks,

Laurent




[Qemu-devel] Re: [PATCH 4/4] KVM: Rework of guest debug state writing

2010-02-08 Thread Jan Kiszka
Marcelo Tosatti wrote:
> On Thu, Feb 04, 2010 at 08:00:26PM +0100, Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Marcelo Tosatti wrote:
 Unrelated to this problem, won't put_vcpu_events, which is executed 
 after KVM_SET_GUEST_DEBUG, overwrite any queued debug exceptions?
>>> Good point, SET_GUEST_DEBUG should be last in the writeback for that reason.
>> Actually, we no longer need the exception injection via SET_GUEST_DEBUG
>> now that we have full access via vcpu_events. 
> 
>> So this needs a cleanup, and I'm afraid quite a few cases are broken
>> ATM with vcpu_events writeback overwriting the reinjected exceptions.
> 
> Don't see what you mean here. Can you be more explicit? What breakage?

SET_GUEST_DEBUG and SET_VCPU_EVENTS are mutually exclusive. If we have
the latter, don't use the former for exception injection anymore. And
this is broken already without any of my patches applied, that was my
point. Will work on this soon.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux




Re: [Qemu-devel] [PATCH] Add assignment operation to config file parser..

2010-02-08 Thread john cooper
Gerd Hoffmann wrote:
> On 02/07/10 17:24, Anthony Liguori wrote:
>> On 02/06/2010 12:59 PM, john cooper wrote:
>>> This patch reworks support for both assignment and
>>> append in the config file parser. It was motivated
>>> by comments received on the cpu model config file
>>> format.
>>>
>>> Commit dc9ca4ba27be4fe6a0284061b8f056c4364fb0d9
>>> changed the behavior of "=" from assign to append.
>>> This patch preserves the ability to append to an
>>> option (however now via "+="), reverts "=" to its
>>> previous behavior, and allows both to interoperate.
>>>
>>> Signed-off-by: john cooper
>>
>> This deviates from standard ini syntax which makes me a big
>> uncomfortable with it. Gerd, do you have an opinion?
> 
> Also it the syntax change will break existing users of the append
> feature (host/guestfwd for slirp networking).
> 
> Any reason why you can't use the current append to avoid the overlong
> feature flag lines?

Impacting existing usage was my primary concern as well.
I'd run this by Mark who created the patch changing the
disposition of "=" to append and I didn't find any alarms
there.

The proposed scheme supports both semantics and arguably
seems more intuitive.  If that is the general consensus
and it won't unduly perturb existing usage the above
would be opportune before the current behavior becomes
more entrenched (e.g. by further dependencies such as the
proposed cpu model configuration).

But to answer the question, my prior patch can work with
either.

Thanks,

-john

-- 
john.coo...@redhat.com




[Qemu-devel] Re: Network shutdown under load

2010-02-08 Thread Tom Lendacky

Fix a race condition where qemu finds that there are not enough virtio
ring buffers available and the guest make more buffers available before
qemu can enable notifications.

Signed-off-by: Tom Lendacky 
Signed-off-by: Anthony Liguori 

 hw/virtio-net.c |   10 +-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 6e48997..5c0093e 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -379,7 +379,15 @@ static int virtio_net_has_buffers(VirtIONet *n, int 
bufsize)
 (n->mergeable_rx_bufs &&
  !virtqueue_avail_bytes(n->rx_vq, bufsize, 0))) {
 virtio_queue_set_notification(n->rx_vq, 1);
-return 0;
+
+/* To avoid a race condition where the guest has made some buffers
+ * available after the above check but before notification was
+ * enabled, check for available buffers again.
+ */
+if (virtio_queue_empty(n->rx_vq) ||
+(n->mergeable_rx_bufs &&
+ !virtqueue_avail_bytes(n->rx_vq, bufsize, 0)))
+return 0;
 }
 
 virtio_queue_set_notification(n->rx_vq, 0);

On Friday 29 January 2010 02:06:41 pm Tom Lendacky wrote:
> There's been some discussion of this already in the kvm list, but I want to
> summarize what I've found and also include the qemu-devel list in an effort
>  to find a solution to this problem.
> 
> Running a netperf test between two kvm guests results in the guest's
>  network interface shutting down. I originally found this using kvm guests
>  on two different machines that were connected via a 10GbE link.  However,
>  I found this problem can be easily reproduced using two guests on the same
>  machine.
> 
> I am running the 2.6.32 level of the kvm.git tree and the 0.12.1.2 level of
> the qemu-kvm.git tree.
> 
> The setup includes two bridges, br0 and br1.
> 
> The commands used to start the guests are as follows:
> usr/local/bin/qemu-system-x86_64 -name cape-vm001 -m 1024 -drive
> file=/autobench/var/tmp/cape-vm001-
> raw.img,if=virtio,index=0,media=disk,boot=on -net
> nic,model=virtio,vlan=0,macaddr=00:16:3E:00:62:51,netdev=cape-vm001-eth0 -
> netdev tap,id=cape-vm001-eth0,script=/autobench/var/tmp/ifup-kvm-
> br0,downscript=/autobench/var/tmp/ifdown-kvm-br0 -net
> nic,model=virtio,vlan=1,macaddr=00:16:3E:00:62:D1,netdev=cape-vm001-eth1 -
> netdev tap,id=cape-vm001-eth1,script=/autobench/var/tmp/ifup-kvm-
> br1,downscript=/autobench/var/tmp/ifdown-kvm-br1 -vnc :1 -monitor
> telnet::5701,server,nowait -snapshot -daemonize
> 
> usr/local/bin/qemu-system-x86_64 -name cape-vm002 -m 1024 -drive
> file=/autobench/var/tmp/cape-vm002-
> raw.img,if=virtio,index=0,media=disk,boot=on -net
> nic,model=virtio,vlan=0,macaddr=00:16:3E:00:62:61,netdev=cape-vm002-eth0 -
> netdev tap,id=cape-vm002-eth0,script=/autobench/var/tmp/ifup-kvm-
> br0,downscript=/autobench/var/tmp/ifdown-kvm-br0 -net
> nic,model=virtio,vlan=1,macaddr=00:16:3E:00:62:E1,netdev=cape-vm002-eth1 -
> netdev tap,id=cape-vm002-eth1,script=/autobench/var/tmp/ifup-kvm-
> br1,downscript=/autobench/var/tmp/ifdown-kvm-br1 -vnc :2 -monitor
> telnet::5702,server,nowait -snapshot -daemonize
> 
> The ifup-kvm-br0 script takes the (first) qemu created tap device and
>  brings it up and adds it to bridge br0.  The ifup-kvm-br1 script take the
>  (second) qemu created tap device and brings it up and adds it to bridge
>  br1.
> 
> Each ethernet device within a guest is on it's own subnet.  For example:
>   guest 1 eth0 has addr 192.168.100.32 and eth1 has addr 192.168.101.32
>   guest 2 eth0 has addr 192.168.100.64 and eth1 has addr 192.168.101.64
> 
> On one of the guests run netserver:
>   netserver -L 192.168.101.32 -p 12000
> 
> On the other guest run netperf:
>   netperf -L 192.168.101.64 -H 192.168.101.32 -p 12000 -t TCP_STREAM -l 60
>  -c -C -- -m 16K -M 16K
> 
> It may take more than one netperf run (I find that my second run almost
>  always causes the shutdown) but the network on the eth1 links will stop
>  working.
> 
> I did some debugging and found that in qemu on the guest running netserver:
>  - the receive_disabled variable is set and never gets reset
>  - the read_poll event handler for the eth1 tap device is disabled and
>  never re-enabled
> These conditions result in no packets being read from the tap device and
>  sent to the guest - effectively shutting down the network.  Network
>  connectivity can be restored by shutting down the guest interfaces,
>  unloading the virtio_net module, re-loading the virtio_net module and
>  re-starting the guest interfaces.
> 
> I'm continuing to work on debugging this, but would appreciate if some
>  folks with more qemu network experience could try to recreate and debug
>  this.
> 
> If my kernel config matters, I can provide that.
> 
> Thanks,
> Tom
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-i

Re: [PATCH] JSON: add %I64d support (Was: Re: [Qemu-devel] system_reset command cause assert failed)

2010-02-08 Thread Anthony Liguori

On 02/04/2010 10:59 AM, Roy Tam wrote:

2010/2/4 Luiz Capitulino:
   

On Thu, 4 Feb 2010 10:30:30 +0800
Roy Tam  wrote:

 

2010/2/4 Roy Tam:
   

2010/2/3 Luiz Capitulino:
 

OK we are fooled by the json lexer and parser. As we use %I64d to
print 'long long' variables in Win32, but lexer and parser only deal
with %lld but not %I64d, this patch add support for %I64d and solve
'info pci', 'powser_reset' and 'power_powerdown' assert failure in
Win32.
   

  Hm, I guess this has been suggested before... Anthony?

 

OK I just missed this. And the wheel was reinvented. :-S
http://www.mail-archive.com/qemu-devel@nongnu.org/msg23983.html
   


I asked for the json parser changes to be split from the PRId64 
changes.  It was never resubmitted.  I'll apply your patch.


Regards,

Anthony Liguori




Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:
> On 02/08/10 11:17, Michael S. Tsirkin wrote:
>> On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:
>>> initialize header type register in pci generic code.
>>>
>>> Cc: Blue Swirl
>>> Cc: "Michael S. Tsirkin"
>>> Signed-off-by: Isaku Yamahata
>>
>> No objections here, I am assuming this will be followed
>> by patches removing header type init from bridges?
>>  From qdev perspective, it is probably cleaner to make
>> multifunction bit a separate qdev property though, right?
>
> From a qdev perspective it would make *alot* of sense to move a bunch of 
> pci config stuff (including, but not limited to header type) into  
> PCIDeviceInfo.
>
> cheers,
>   Gerd

Actually - won't this make it possible to create broken configurations
by tweaking properties from command-line?
And generally, it sounds bad to have header type duplicated in qdev
and in config. Why do we want it in qdev?

Isaku Yamahata, could you please clarify?


-- 
MT




[Qemu-devel] Re: [PATCH 0/6] [GIT PULL] qemu-kvm.git uq/master queue

2010-02-08 Thread Anthony Liguori

On 02/03/2010 05:55 PM, Marcelo Tosatti wrote:

The following changes since commit 117f8eb81dfdf51a0418fbf6d260cbb72bcd4a9d:
   Markus Armbruster (1):
 qdev: Add rudimentary help for property value

are available in the git repository at:

   git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git uq/master

Jan Kiszka (4):
   KVM: Request setting of nmi_pending and sipi_vector
   KVM: x86: Fix up misreported CPU features
   KVM: Make vmport KVM-compatible
   KVM: Move and rename regs_modified

Marcelo Tosatti (1):
   Fix incoming migration with iothread

Sheng Yang (1):
   kvm: Flush coalesced MMIO buffer periodly

  cpu-all.h |2 ++
  cpu-defs.h|3 ++-
  exec.c|6 ++
  hw/vmport.c   |3 +++
  kvm-all.c |   36 +---
  kvm.h |1 +
  target-i386/kvm.c |   11 ++-
  vl.c  |4 
  8 files changed, 49 insertions(+), 17 deletions(-)
   


Applied.  Thanks.

Regards,

Anthony Liguori




[Qemu-devel] Re: [PULL] linux-user-for upstream

2010-02-08 Thread Anthony Liguori

On 02/05/2010 08:05 AM, Riku Voipio wrote:

Same patchset as sent last week, with the TLS patched changed to the
"refactor cp15.c13" patch already acked by Laurent.

The following changes since commit 117f8eb81dfdf51a0418fbf6d260cbb72bcd4a9d:
   Markus Armbruster (1):
 qdev: Add rudimentary help for property value

are available in the git repository at:

   http://git.gitorious.org/qemu-maemo/qemu-maemo.git linux-user-for-upstream

Loïc Minier (1):
   linux-user: adapt uname machine to emulated CPU

Riku Voipio (3):
   fix locking error with current_tb
   linux-user: remove signal handler before calling abort()
   target-arm: refactor cp15.c13 register access

  Makefile.target|2 +-
  exec.c |   13 +++-
  linux-user/cpu-uname.c |   72 
  linux-user/cpu-uname.h |1 +
  linux-user/syscall.c   |3 +-
  target-arm/helper.c|   16 --
  target-arm/translate.c |   55 
  7 files changed, 142 insertions(+), 20 deletions(-)
  create mode 100644 linux-user/cpu-uname.c
  create mode 100644 linux-user/cpu-uname.h
   



Pulled.  Thanks.

Regards,

Anthony Liguori





Re: [Qemu-devel] Re: [SeaBIOS] [PATCH] Seabios: Fix PkgLength calculation for the SSDT.

2010-02-08 Thread Anthony Liguori

On 02/08/2010 04:19 AM, Avi Kivity wrote:

On 01/12/2010 03:36 PM, Kevin O'Connor wrote:

On Tue, Jan 12, 2010 at 10:10:41AM +0100, Magnus Christensson wrote:

On 12/24/2009 01:29 AM, Kevin O'Connor wrote:

On Mon, Dec 14, 2009 at 10:25:14AM +0100, Magnus Christensson wrote:

Should be cpu_length + 2 as far as I can tell. The current code is
definitely broken.


Right. That should be cpu_length +2 in the else-part.

Can you resend the patch with the change?

Attached (sorry for the delay).

Thanks.  Commit 3012af18.


Without this patch, Windows 2008 r2 won't boot with more than 4 cpus.  
Kevin/Anthony, can you coordinate a SeaBIOS release including this patch?


Probably needed for qemu 0.12 as well.


Kevin,

Just let me know when you cut a new release.  Did we ever decide what to 
do about a stable branch?


Regards,

Anthony Liguori






[Qemu-devel] Re: [PATCH] pci: fix info pci with host bridge.

2010-02-08 Thread Blue Swirl
On Mon, Feb 8, 2010 at 12:37 PM, Michael S. Tsirkin  wrote:
> On Mon, Feb 08, 2010 at 03:40:38PM +0900, Isaku Yamahata wrote:
>> This patch fixes 525e05147d5a3bdc08caa422d108c1ef71b584b5.
>> pci host bridge doesn't have header type of bridge.
>> The check should be by header type, instead of pci class device.
>>
>> Cc: Blue Swirl 
>> Cc: "Michael S. Tsirkin" 
>> Signed-off-by: Isaku Yamahata 
>
> So the effect of this will be that info pci won't report
> host bridge, right? IOW, it kind of reverts
> 525e05147d5a3bdc08caa422d108c1ef71b584b5, or am I
> missing something?

Yes, it breaks info pci. PBM/APB does not use PCI_HEADER_TYPE_BRIDGE.




Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Gerd Hoffmann

On 02/08/10 17:27, Michael S. Tsirkin wrote:

On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:

On 02/08/10 11:17, Michael S. Tsirkin wrote:

On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:

initialize header type register in pci generic code.

Cc: Blue Swirl
Cc: "Michael S. Tsirkin"
Signed-off-by: Isaku Yamahata


No objections here, I am assuming this will be followed
by patches removing header type init from bridges?
  From qdev perspective, it is probably cleaner to make
multifunction bit a separate qdev property though, right?


 From a qdev perspective it would make *alot* of sense to move a bunch of
pci config stuff (including, but not limited to header type) into
PCIDeviceInfo.

cheers,
   Gerd


Actually - won't this make it possible to create broken configurations
by tweaking properties from command-line?


Not as property, as struct element in PCIDeviceInfo.  i.e.

static PCIDeviceInfo e1000_info = {
[ stuff which is here right now ]
.vendor_id = PCI_VENDOR_ID_INTEL,
.device_id = E1000_DEVID,
.class = PCI_CLASS_NETWORK_ETHERNET,
[ probably more stuff which makes sense ]
}

Then setup this in generic pci code instead of having each driver doing 
a bunch of pci_config_set_*() calls.


cheers,
  Gerd





[Qemu-devel] Re: [PATCH] apb_pci: fix header type of pbm pci host bridge.

2010-02-08 Thread Blue Swirl
On Mon, Feb 8, 2010 at 8:45 AM, Isaku Yamahata  wrote:
> The change set of 776e1bbb6cf4fe66a93c1a5dd814bbb650deca00
> specifies pbm pci host bridge is type of bridge.
> It contradicts with pbm_pci_host_init().

Bridge header type in qdev info is needed so that the write masks are
correct. Otherwise the masks make for example PCI_SECONDARY_BUS
readonly. But the device uses PCI_HEADER_TYPE_NORMAL. So both are
correct.




Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 06:24:57PM +0100, Gerd Hoffmann wrote:
> On 02/08/10 17:27, Michael S. Tsirkin wrote:
>> On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:
>>> On 02/08/10 11:17, Michael S. Tsirkin wrote:
 On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:
> initialize header type register in pci generic code.
>
> Cc: Blue Swirl
> Cc: "Michael S. Tsirkin"
> Signed-off-by: Isaku Yamahata

 No objections here, I am assuming this will be followed
 by patches removing header type init from bridges?
   From qdev perspective, it is probably cleaner to make
 multifunction bit a separate qdev property though, right?
>>>
>>>  From a qdev perspective it would make *alot* of sense to move a bunch of
>>> pci config stuff (including, but not limited to header type) into
>>> PCIDeviceInfo.
>>>
>>> cheers,
>>>Gerd
>>
>> Actually - won't this make it possible to create broken configurations
>> by tweaking properties from command-line?
>
> Not as property, as struct element in PCIDeviceInfo.  i.e.
>
> static PCIDeviceInfo e1000_info = {
> [ stuff which is here right now ]
> .vendor_id = PCI_VENDOR_ID_INTEL,
> .device_id = E1000_DEVID,
> .class = PCI_CLASS_NETWORK_ETHERNET,
> [ probably more stuff which makes sense ]
> }
>
> Then setup this in generic pci code instead of having each driver doing  
> a bunch of pci_config_set_*() calls.
>
> cheers,
>   Gerd

We still end up with class, vendor etc duplicated in 2 places.  Why do
we want stuff like vendor id in PCIDeviceInfo at all?  Why can't
everyone just use pci_config_set/get calls?

-- 
MST




[Qemu-devel] Re: Slowdowns comparing qemu-kvm.git to qemu.git: vcpu/thread scheduling differences

2010-02-08 Thread Amit Shah
On (Mon) Feb 08 2010 [08:57:05], Anthony Liguori wrote:
> On 02/08/2010 07:46 AM, Amit Shah wrote:
>> Hello,
>>
>> In my testing of virtio-console, I found qemu-kvm.git introduces a lot
>> of overhead in thread scheduling compared to qemu.git.
>>
>> My test sends a 260M file from the host to a guest via a virtio-console
>> port and then computes the sha1sum of the file on the host as well as on
>> the guest, compares the checksum and declares the result based on the
>> checksum match. The test passes in all the scenarios listed below,
>> indicating there's no unsafe data transfer.
>>
>>
>> Repo   Time taken
>> -  --
>> qemu.git<  1 m (typically 30s)
>> qemu-kvm.git>  16m
>> qemu-iothread~ 5m
>>
>
> That very likely suggests that there are missing qemu_notify_events() in  
> qemu-kvm.git and you're getting blocked waiting for the next timer event  
> to fire.

Hm, if that's the case, should virtio_notify() have a call to
qemu_notify_event()?

I added a qemu_notify_event() as the last statement in write_to_port()
but that didn't help.

> IOW, I assume that during the qemu-kvm.git run, the CPU isn't pegged at  
> 100% whereas it is in qemu.git.

I think you're right; for qemu-kvm.git, I have seen various numbers:
usage at an avg. of 2% and also at an avg. of 20% in different runs.

I just did another run before writing this: 2% for qemu-kvm.git and 100%
for qemu.git.

Amit




Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Gerd Hoffmann

On 02/08/10 18:32, Michael S. Tsirkin wrote:

On Mon, Feb 08, 2010 at 06:24:57PM +0100, Gerd Hoffmann wrote:

On 02/08/10 17:27, Michael S. Tsirkin wrote:

On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:

On 02/08/10 11:17, Michael S. Tsirkin wrote:

On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:

initialize header type register in pci generic code.

Cc: Blue Swirl
Cc: "Michael S. Tsirkin"
Signed-off-by: Isaku Yamahata


No objections here, I am assuming this will be followed
by patches removing header type init from bridges?
   From qdev perspective, it is probably cleaner to make
multifunction bit a separate qdev property though, right?


   From a qdev perspective it would make *alot* of sense to move a bunch of
pci config stuff (including, but not limited to header type) into
PCIDeviceInfo.

cheers,
Gerd


Actually - won't this make it possible to create broken configurations
by tweaking properties from command-line?


Not as property, as struct element in PCIDeviceInfo.  i.e.

static PCIDeviceInfo e1000_info = {
 [ stuff which is here right now ]
 .vendor_id = PCI_VENDOR_ID_INTEL,
 .device_id = E1000_DEVID,
 .class = PCI_CLASS_NETWORK_ETHERNET,
 [ probably more stuff which makes sense ]
}

Then setup this in generic pci code instead of having each driver doing
a bunch of pci_config_set_*() calls.

cheers,
   Gerd


We still end up with class, vendor etc duplicated in 2 places.


No.  The info should be *only* in PCIDeviceInfo then.


 Why do
we want stuff like vendor id in PCIDeviceInfo at all?  Why can't
everyone just use pci_config_set/get calls?


You can do nice stuff like printing vendor/device IDs in the '-device ?' 
list then.


cheers,
  Gerd




Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 06:37:41PM +0100, Gerd Hoffmann wrote:
> On 02/08/10 18:32, Michael S. Tsirkin wrote:
>> On Mon, Feb 08, 2010 at 06:24:57PM +0100, Gerd Hoffmann wrote:
>>> On 02/08/10 17:27, Michael S. Tsirkin wrote:
 On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:
> On 02/08/10 11:17, Michael S. Tsirkin wrote:
>> On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:
>>> initialize header type register in pci generic code.
>>>
>>> Cc: Blue Swirl
>>> Cc: "Michael S. Tsirkin"
>>> Signed-off-by: Isaku Yamahata
>>
>> No objections here, I am assuming this will be followed
>> by patches removing header type init from bridges?
>>From qdev perspective, it is probably cleaner to make
>> multifunction bit a separate qdev property though, right?
>
>From a qdev perspective it would make *alot* of sense to move a bunch 
> of
> pci config stuff (including, but not limited to header type) into
> PCIDeviceInfo.
>
> cheers,
> Gerd

 Actually - won't this make it possible to create broken configurations
 by tweaking properties from command-line?
>>>
>>> Not as property, as struct element in PCIDeviceInfo.  i.e.
>>>
>>> static PCIDeviceInfo e1000_info = {
>>>  [ stuff which is here right now ]
>>>  .vendor_id = PCI_VENDOR_ID_INTEL,
>>>  .device_id = E1000_DEVID,
>>>  .class = PCI_CLASS_NETWORK_ETHERNET,
>>>  [ probably more stuff which makes sense ]
>>> }
>>>
>>> Then setup this in generic pci code instead of having each driver doing
>>> a bunch of pci_config_set_*() calls.
>>>
>>> cheers,
>>>Gerd
>>
>> We still end up with class, vendor etc duplicated in 2 places.
>
> No.  The info should be *only* in PCIDeviceInfo then.

That would put a lot of code in pci config cycle path.  A single array
mirroring the whole config space is much cleaner.

>>  Why do
>> we want stuff like vendor id in PCIDeviceInfo at all?  Why can't
>> everyone just use pci_config_set/get calls?
>
> You can do nice stuff like printing vendor/device IDs in the '-device ?'  
> list then.

That should use pci functions as well.

> cheers,
>   Gerd




Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Blue Swirl
On Mon, Feb 8, 2010 at 7:37 PM, Michael S. Tsirkin  wrote:
> On Mon, Feb 08, 2010 at 06:37:41PM +0100, Gerd Hoffmann wrote:
>> On 02/08/10 18:32, Michael S. Tsirkin wrote:
>>> On Mon, Feb 08, 2010 at 06:24:57PM +0100, Gerd Hoffmann wrote:
 On 02/08/10 17:27, Michael S. Tsirkin wrote:
> On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:
>> On 02/08/10 11:17, Michael S. Tsirkin wrote:
>>> On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:
 initialize header type register in pci generic code.

 Cc: Blue Swirl
 Cc: "Michael S. Tsirkin"
 Signed-off-by: Isaku Yamahata
>>>
>>> No objections here, I am assuming this will be followed
>>> by patches removing header type init from bridges?
>>>    From qdev perspective, it is probably cleaner to make
>>> multifunction bit a separate qdev property though, right?
>>
>>    From a qdev perspective it would make *alot* of sense to move a bunch 
>> of
>> pci config stuff (including, but not limited to header type) into
>> PCIDeviceInfo.
>>
>> cheers,
>>     Gerd
>
> Actually - won't this make it possible to create broken configurations
> by tweaking properties from command-line?

 Not as property, as struct element in PCIDeviceInfo.  i.e.

 static PCIDeviceInfo e1000_info = {
      [ stuff which is here right now ]
      .vendor_id = PCI_VENDOR_ID_INTEL,
      .device_id = E1000_DEVID,
      .class     = PCI_CLASS_NETWORK_ETHERNET,
      [ probably more stuff which makes sense ]
 }

 Then setup this in generic pci code instead of having each driver doing
 a bunch of pci_config_set_*() calls.

 cheers,
    Gerd
>>>
>>> We still end up with class, vendor etc duplicated in 2 places.
>>
>> No.  The info should be *only* in PCIDeviceInfo then.
>
> That would put a lot of code in pci config cycle path.  A single array
> mirroring the whole config space is much cleaner.

I'd suppose the arrays would remain as they are now, they just would
be initialized (using the pci functions) based on PCIDeviceInfo
structure. This would simplify the device code a lot.

>>>  Why do
>>> we want stuff like vendor id in PCIDeviceInfo at all?  Why can't
>>> everyone just use pci_config_set/get calls?
>>
>> You can do nice stuff like printing vendor/device IDs in the '-device ?'
>> list then.
>
> That should use pci functions as well.
>
>> cheers,
>>   Gerd
>




Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Gerd Hoffmann

On 02/08/10 18:37, Michael S. Tsirkin wrote:

On Mon, Feb 08, 2010 at 06:37:41PM +0100, Gerd Hoffmann wrote:

On 02/08/10 18:32, Michael S. Tsirkin wrote:

On Mon, Feb 08, 2010 at 06:24:57PM +0100, Gerd Hoffmann wrote:

On 02/08/10 17:27, Michael S. Tsirkin wrote:

On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:

On 02/08/10 11:17, Michael S. Tsirkin wrote:

On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:

initialize header type register in pci generic code.

Cc: Blue Swirl
Cc: "Michael S. Tsirkin"
Signed-off-by: Isaku Yamahata


No objections here, I am assuming this will be followed
by patches removing header type init from bridges?
From qdev perspective, it is probably cleaner to make
multifunction bit a separate qdev property though, right?


 From a qdev perspective it would make *alot* of sense to move a bunch of
pci config stuff (including, but not limited to header type) into
PCIDeviceInfo.

cheers,
 Gerd


Actually - won't this make it possible to create broken configurations
by tweaking properties from command-line?


Not as property, as struct element in PCIDeviceInfo.  i.e.

static PCIDeviceInfo e1000_info = {
  [ stuff which is here right now ]
  .vendor_id = PCI_VENDOR_ID_INTEL,
  .device_id = E1000_DEVID,
  .class = PCI_CLASS_NETWORK_ETHERNET,
  [ probably more stuff which makes sense ]
}

Then setup this in generic pci code instead of having each driver doing
a bunch of pci_config_set_*() calls.

cheers,
Gerd


We still end up with class, vendor etc duplicated in 2 places.


No.  The info should be *only* in PCIDeviceInfo then.


That would put a lot of code in pci config cycle path.  A single array
mirroring the whole config space is much cleaner.


  Why do
we want stuff like vendor id in PCIDeviceInfo at all?  Why can't
everyone just use pci_config_set/get calls?


You can do nice stuff like printing vendor/device IDs in the '-device ?'
list then.


That should use pci functions as well.


Hmm, do you mix up PCIDevice and PCIDeviceInfo structs?

cheers,
  Gerd





[Qemu-devel] Re: Two QMP events issues

2010-02-08 Thread Luiz Capitulino
On Mon, 8 Feb 2010 14:12:20 +
"Daniel P. Berrange"  wrote:

> On Mon, Feb 08, 2010 at 11:41:45AM -0200, Luiz Capitulino wrote:
> > 
> >  Hi there,
> > 
> >  I have two not so related QMP events issues two discuss, but I will talk 
> > about
> > them in the same email to avoid starting two threads.
> > 
> >  The first problem is wrt the STOP event. Right now it's only emitted if 
> > it's
> > triggered through qemu_system_vmstop_request(), which afaik will only be
> > called if CONFIG_IOTHREAD is enabled (nonsense, yes).
> > 
> >  The best fix I can think of is to move the STOP event down to do_vm_stop().
> > We could even have a 'reason' data member with the string representation of
> > the EXCP_ macros. Looks like this is the right thing do to.
> > 
> >  There's a problem, though. Migration and block subsystems also do 
> > vm_stop(0).
> > The former's reason seems to be 'stop to be loaded' and the latter is 'can't
> > continue' on disk errors. Note that the block subsystem already has its own
> > event for disk errors.
> > 
> >  So, my solution is to not generate the STOP event on vm_stop(0). If any
> > vm_stop(0) user (eg. migration) wants to generate events they should create
> > the appropriate EXCP_ macro for that.
> > 
> >  Does this look good?
> > 
> >  The second problem is about the watchdog device. I have been asked to
> > add events for the watchdog's device actions (see
> > hw/watchdog.c:watchdog_perform_action()).
> > 
> >  Issue is: most of those events directly map to QEMU's events already
> > generated by QMP, such as RESET, SHUTDOWN, POWEROFF etc.
> > 
> >  We have two solutions:
> > 
> > 1. Introduce watchdog's own events. This is easy to do, but will
> > generate two QMP events for most actions. Eg. the watchdog's WDT_RESET
> > action will generate a QMP event for WDT_RESET and will generate
> > another RESET event when this action takes place in QEMU
> > 
> > 2. Add a 'source' data member to all events requested via the
> > qemu_system_* functions, so that we can have a 'wachtdog' source and
> > only one event is triggered. This will require a more complex change
> > and maybe some hacks will be needed (eg. for vm_stop())
> > 
> >  Opinions?
> 
> For further backgrou, the key end goal here is that in a QMP client, upon
> receipt of the  'RESET' event, we need to reliably & immediately determine
> why it  occurred. eg, triggered by watchdog, or by guest OS request. There
> are actually 3 possible sequences
> 
>  - WATCHDOG + action=reset, followed by RESET.  Assuming no intervening 
>event can occurr, the client can merely record 'WATCHDOG' and interpret
>it when it gets the immediately following 'RESET' event

 I don't think we can guarantee that, as there a number of events that
can happen between the two (eg. VNC events, disk errors etc).

>  - RESET, followed by WATCHDOG + action=reset. The client doesn't know
>the reason for the RESET and can't wait arbitrarily for WATCHDOG since
>there might never be one arriving.

 This is possible :(

>  - RESET + source=watchdog. Client directly sees the reason
> 
> The second scenario is the one I'd like us to avoid at all costs, since it
> will require the client to introduce arbitrary delays in processing events
> to determine cause. The first is slightly inconvenient, but doable if we 
> can assume no intervening events will occur, between WATCHDOG and the
> RESET events. The last is obviously simplest for the clients.
> 
> This question is also pretty relevant for Luiz's previous posting of disk
> block I/O errors, since one of those actions  can result in a PAUSE event

 Not exactly. The first part my original email describes the low-level
part of this problem.

 First, the block I/O error event may not stop the VM, so I think even if
we implement the third scenario, we should keep the block's event separated.

 Second, the block layer uses vm_stop(0) to stop the VM, according to the
description in this email I plan not to generate the STOP event when
vm_stop()'s argument is 0.





Re: [Qemu-devel] Re: Two QMP events issues

2010-02-08 Thread Luiz Capitulino
On Mon, 08 Feb 2010 09:13:37 -0600
Anthony Liguori  wrote:

> On 02/08/2010 08:56 AM, Daniel P. Berrange wrote:
> > On Mon, Feb 08, 2010 at 08:49:20AM -0600, Anthony Liguori wrote:
> >
> >> On 02/08/2010 08:12 AM, Daniel P. Berrange wrote:
> >>  
> >>> For further backgrou, the key end goal here is that in a QMP client, upon
> >>> receipt of the  'RESET' event, we need to reliably&   immediately 
> >>> determine
> >>> why it  occurred. eg, triggered by watchdog, or by guest OS request. There
> >>> are actually 3 possible sequences
> >>>
> >>>   - WATCHDOG + action=reset, followed by RESET.  Assuming no intervening
> >>> event can occurr, the client can merely record 'WATCHDOG' and 
> >>> interpret
> >>> it when it gets the immediately following 'RESET' event
> >>>
> >>>   - RESET, followed by WATCHDOG + action=reset. The client doesn't know
> >>> the reason for the RESET and can't wait arbitrarily for WATCHDOG since
> >>> there might never be one arriving.
> >>>
> >>>   - RESET + source=watchdog. Client directly sees the reason
> >>>
> >>> The second scenario is the one I'd like us to avoid at all costs, since it
> >>> will require the client to introduce arbitrary delays in processing events
> >>> to determine cause. The first is slightly inconvenient, but doable if we
> >>> can assume no intervening events will occur, between WATCHDOG and the
> >>> RESET events. The last is obviously simplest for the clients.
> >>>
> >>>
> >> I really prefer the third option but I'm a little concerned that we're
> >> throwing events around somewhat haphazardly.
> >>
> >> So let me ask, why does a client need to determine when a guest reset
> >> and why it reset?
> >>  
> > If a guest OS is repeatedly hanging/crashing resulting in the watchdog
> > device firing, management software for the host really wants to know about
> > that (so that appropriate alerts/action can be taken) and thus needs to
> > be able to distinguish this from a "normal"  guest OS initiated reboot.
> >
> 
> I think that's an argument for having the watchdog events independent of 
> the reset events.
> 
> The watchdog condition happening is not directly related to the action 
> the watchdog takes.  The watchdog event really belongs in a class events 
> that are closely associated with a particular device emulation.
> 
> In fact, I think what we're really missing in events today is a notion 
> of a context.  A RESET event is really a CPU event.  A watchdog 
> expiration event is a watchdog event.  A connect event is a VNC event 
> (Spice and chardevs will also generate connect events).

 This could be done by adding a 'context' member to all the events and
then an event would have to be identified by the pair event_name:context.

 This way we can have the same event_name for events in different
contexts. For example:

{ 'event': DISCONNECT, 'context': 'spice', [...] }

{ 'event': DISCONNECT, 'context': 'vnc', [...] }

 Note that today we have VNC_DISCONNECT and will probably have
SPICE_DISCONNECT too.




Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 06:43:51PM +0100, Gerd Hoffmann wrote:
> On 02/08/10 18:37, Michael S. Tsirkin wrote:
>> On Mon, Feb 08, 2010 at 06:37:41PM +0100, Gerd Hoffmann wrote:
>>> On 02/08/10 18:32, Michael S. Tsirkin wrote:
 On Mon, Feb 08, 2010 at 06:24:57PM +0100, Gerd Hoffmann wrote:
> On 02/08/10 17:27, Michael S. Tsirkin wrote:
>> On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:
>>> On 02/08/10 11:17, Michael S. Tsirkin wrote:
 On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:
> initialize header type register in pci generic code.
>
> Cc: Blue Swirl
> Cc: "Michael S. Tsirkin"
> Signed-off-by: Isaku Yamahata

 No objections here, I am assuming this will be followed
 by patches removing header type init from bridges?
 From qdev perspective, it is probably cleaner to make
 multifunction bit a separate qdev property though, right?
>>>
>>>  From a qdev perspective it would make *alot* of sense to move a 
>>> bunch of
>>> pci config stuff (including, but not limited to header type) into
>>> PCIDeviceInfo.
>>>
>>> cheers,
>>>  Gerd
>>
>> Actually - won't this make it possible to create broken configurations
>> by tweaking properties from command-line?
>
> Not as property, as struct element in PCIDeviceInfo.  i.e.
>
> static PCIDeviceInfo e1000_info = {
>   [ stuff which is here right now ]
>   .vendor_id = PCI_VENDOR_ID_INTEL,
>   .device_id = E1000_DEVID,
>   .class = PCI_CLASS_NETWORK_ETHERNET,
>   [ probably more stuff which makes sense ]
> }
>
> Then setup this in generic pci code instead of having each driver doing
> a bunch of pci_config_set_*() calls.
>
> cheers,
> Gerd

 We still end up with class, vendor etc duplicated in 2 places.
>>>
>>> No.  The info should be *only* in PCIDeviceInfo then.
>>
>> That would put a lot of code in pci config cycle path.  A single array
>> mirroring the whole config space is much cleaner.
>>
   Why do
 we want stuff like vendor id in PCIDeviceInfo at all?  Why can't
 everyone just use pci_config_set/get calls?
>>>
>>> You can do nice stuff like printing vendor/device IDs in the '-device ?'
>>> list then.
>>
>> That should use pci functions as well.
>
> Hmm, do you mix up PCIDevice and PCIDeviceInfo structs?
>
> cheers,
>   Gerd

OK. OTOH, I don't think we need to out print header type in -device ?
so what good is it there?






[Qemu-devel] [PATCH] don't dereference NULL after failed strdup

2010-02-08 Thread Jim Meyering
Most of these are obvious NULL-deref bug fixes, for example,
the ones in these files:

  block/curl.c
  net.c
  slirp/misc.c

and the first one in block/vvfat.c.
The others in block/vvfat.c may not lead to an immediate segfault, but I
traced the two schedule_rename(..., strdup(path)) uses, and a failed
strdup would appear to trigger this assertion in handle_renames_and_mkdirs:

assert(commit->path);

The conversion to use qemu_strdup in envlist_to_environ is not technically
needed, but does avoid a theoretical leak in the caller when strdup fails
for one value, but later succeeds in allocating another buffer(plausible,
if one string length is much larger than the others).  The caller does
not know the length of the returned list, and as such can only free
pointers until it hits the first NULL.  If there are non-NULL pointers
beyond the first, their buffers would be leaked.  This one is admittedly
far-fetched.

The two in linux-user/main.c are worth fixing to ensure that an
OOM error is diagnosed up front, rather than letting it provoke some
harder-to-diagnose secondary error, in case of exec failure, or worse, in
case the exec succeeds but with an invalid list of command line options.
However, considering how unlikely it is to encounter a failed strdup early
in main, this isn't a big deal.  Note that adding the required uses of
qemu_strdup here and in envlist.c induce link failures because qemu_strdup
is not currently in any library they're linked with.  So for now, I've
omitted those changes, as well as the fixes in target-i386/helper.c
and target-sparc/helper.c.

If you'd like to see the above discussion (or anything else)
in the commit log, just let me know and I'll be happy to adjust.


>From 9af42864fd1ea666bd25e2cecfdfae74c20aa8c7 Mon Sep 17 00:00:00 2001
From: Jim Meyering 
Date: Mon, 8 Feb 2010 18:29:29 +0100
Subject: [PATCH] don't dereference NULL after failed strdup

Handle failing strdup by replacing each use with qemu_strdup,
so as not to dereference NULL or trigger a failing assertion.
* block/curl.c (curl_open): s/\bstrdup\b/qemu_strdup/
* block/vvfat.c (init_directories): Likewise.
(get_cluster_count_for_direntry, check_directory_consistency): Likewise.
* net.c (parse_host_src_port): Likewise.
* slirp/misc.c (fork_exec): Likewise.
---
 block/curl.c  |2 +-
 block/vvfat.c |   10 +-
 net.c |2 +-
 slirp/misc.c  |2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index fe08f7b..2cf72cb 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -309,7 +309,7 @@ static int curl_open(BlockDriverState *bs, const char 
*filename, int flags)

 static int inited = 0;

-file = strdup(filename);
+file = qemu_strdup(filename);
 s->readahead_size = READ_AHEAD_SIZE;

 /* Parse a trailing ":readahead=#:" param, if present. */
diff --git a/block/vvfat.c b/block/vvfat.c
index d2787b9..bb707c0 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -883,7 +883,7 @@ static int init_directories(BDRVVVFATState* s,
 mapping->dir_index = 0;
 mapping->info.dir.parent_mapping_index = -1;
 mapping->first_mapping_index = -1;
-mapping->path = strdup(dirname);
+mapping->path = qemu_strdup(dirname);
 i = strlen(mapping->path);
 if (i > 0 && mapping->path[i - 1] == '/')
mapping->path[i - 1] = '\0';
@@ -1633,10 +1633,10 @@ static uint32_t 
get_cluster_count_for_direntry(BDRVVVFATState* s,

/* rename */
if (strcmp(basename, basename2))
-   schedule_rename(s, cluster_num, strdup(path));
+   schedule_rename(s, cluster_num, qemu_strdup(path));
} else if (is_file(direntry))
/* new file */
-   schedule_new_file(s, strdup(path), cluster_num);
+   schedule_new_file(s, qemu_strdup(path), cluster_num);
else {
assert(0);
return 0;
@@ -1753,10 +1753,10 @@ static int check_directory_consistency(BDRVVVFATState 
*s,
mapping->mode &= ~MODE_DELETED;

if (strcmp(basename, basename2))
-   schedule_rename(s, cluster_num, strdup(path));
+   schedule_rename(s, cluster_num, qemu_strdup(path));
 } else
/* new directory */
-   schedule_mkdir(s, cluster_num, strdup(path));
+   schedule_mkdir(s, cluster_num, qemu_strdup(path));

 lfn_init(&lfn);
 do {
diff --git a/net.c b/net.c
index 6ef93e6..8e951ca 100644
--- a/net.c
+++ b/net.c
@@ -96,7 +96,7 @@ int parse_host_src_port(struct sockaddr_in *haddr,
 struct sockaddr_in *saddr,
 const char *input_str)
 {
-char *str = strdup(input_str);
+char *str = qemu_strdup(input_str);
 char *host_str = str;
 char *src_str;
 const char *src_str2;
diff --git a/slirp/misc.c b/slirp/misc.c
index 05f4fb3..dcb1dc1 100644
--- a/slirp/misc.c
+++ b/slirp/misc.c
@@ -179,7 +179,7 @@ fork_exec(struct socket *so, const char *ex, int do_pty)
   close(s)

Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 07:56:27PM +0200, Blue Swirl wrote:
> On Mon, Feb 8, 2010 at 7:37 PM, Michael S. Tsirkin  wrote:
> > On Mon, Feb 08, 2010 at 06:37:41PM +0100, Gerd Hoffmann wrote:
> >> On 02/08/10 18:32, Michael S. Tsirkin wrote:
> >>> On Mon, Feb 08, 2010 at 06:24:57PM +0100, Gerd Hoffmann wrote:
>  On 02/08/10 17:27, Michael S. Tsirkin wrote:
> > On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:
> >> On 02/08/10 11:17, Michael S. Tsirkin wrote:
> >>> On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:
>  initialize header type register in pci generic code.
> 
>  Cc: Blue Swirl
>  Cc: "Michael S. Tsirkin"
>  Signed-off-by: Isaku Yamahata
> >>>
> >>> No objections here, I am assuming this will be followed
> >>> by patches removing header type init from bridges?
> >>>    From qdev perspective, it is probably cleaner to make
> >>> multifunction bit a separate qdev property though, right?
> >>
> >>    From a qdev perspective it would make *alot* of sense to move a 
> >> bunch of
> >> pci config stuff (including, but not limited to header type) into
> >> PCIDeviceInfo.
> >>
> >> cheers,
> >>     Gerd
> >
> > Actually - won't this make it possible to create broken configurations
> > by tweaking properties from command-line?
> 
>  Not as property, as struct element in PCIDeviceInfo.  i.e.
> 
>  static PCIDeviceInfo e1000_info = {
>       [ stuff which is here right now ]
>       .vendor_id = PCI_VENDOR_ID_INTEL,
>       .device_id = E1000_DEVID,
>       .class     = PCI_CLASS_NETWORK_ETHERNET,
>       [ probably more stuff which makes sense ]
>  }
> 
>  Then setup this in generic pci code instead of having each driver doing
>  a bunch of pci_config_set_*() calls.
> 
>  cheers,
>     Gerd
> >>>
> >>> We still end up with class, vendor etc duplicated in 2 places.
> >>
> >> No.  The info should be *only* in PCIDeviceInfo then.
> >
> > That would put a lot of code in pci config cycle path.  A single array
> > mirroring the whole config space is much cleaner.
> 
> I'd suppose the arrays would remain as they are now, they just would
> be initialized (using the pci functions) based on PCIDeviceInfo
> structure.

This still means we have two copies of same data
and need to maintain code that keeps them in sync,
even if that is called just at init time.

> This would simplify the device code a lot.

Well, I think

pci_set_class(pci_dev, PCI_CLASS_NETWORK_ETHERNET)

is simpler than

.class = PCI_CLASS_NETWORK_ETHERNET

and some magic that copies that to pci config.


> >>>  Why do
> >>> we want stuff like vendor id in PCIDeviceInfo at all?  Why can't
> >>> everyone just use pci_config_set/get calls?
> >>
> >> You can do nice stuff like printing vendor/device IDs in the '-device ?'
> >> list then.
> >
> > That should use pci functions as well.
> >
> >> cheers,
> >>   Gerd
> >




[Qemu-devel] Re: Seabios dislikes -M isapc

2010-02-08 Thread Sebastian Herbszt

Jan Kiszka wrote:

Hi,

Seabios seems to have some assumptions built in that break when -M isapc
is selected. Is this supposed to work or is isapc about to die?


SeaBIOS doesn't POST if the F-segment is not writeable [1]. A possible, but IMO
wrong fix was posted on the list [2].

[1] http://lists.gnu.org/archive/html/qemu-devel/2009-11/msg01188.html
[2] http://lists.gnu.org/archive/html/qemu-devel/2009-12/msg00445.html

- Sebastian





[Qemu-devel] [PATCH 1/4] qjson: Improve debugging

2010-02-08 Thread Luiz Capitulino
Add an assert() to qobject_from_jsonf() to assure that the returned
QObject is not NULL. Currently this is duplicated in the callers.

Signed-off-by: Luiz Capitulino 
---
 qjson.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/qjson.c b/qjson.c
index 9ad8a91..483c667 100644
--- a/qjson.c
+++ b/qjson.c
@@ -53,6 +53,10 @@ QObject *qobject_from_json(const char *string)
 return qobject_from_jsonv(string, NULL);
 }
 
+/*
+ * IMPORTANT: This function aborts on error, thus it must not
+ * be used with untrusted arguments.
+ */
 QObject *qobject_from_jsonf(const char *string, ...)
 {
 QObject *obj;
@@ -62,6 +66,7 @@ QObject *qobject_from_jsonf(const char *string, ...)
 obj = qobject_from_jsonv(string, &ap);
 va_end(ap);
 
+assert(obj != NULL);
 return obj;
 }
 
-- 
1.6.6





[Qemu-devel] [PATCH 2/4] Monitor: remove unneeded checks

2010-02-08 Thread Luiz Capitulino
It's not needed to check the return of qobject_from_jsonf()
anymore, as an assert() has been added there.

Signed-off-by: Luiz Capitulino 
---
 block.c  |3 ---
 hw/pci-hotplug.c |1 -
 migration.c  |3 ---
 monitor.c|5 -
 4 files changed, 0 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index 1919d19..89d697e 100644
--- a/block.c
+++ b/block.c
@@ -1259,7 +1259,6 @@ void bdrv_info(Monitor *mon, QObject **ret_data)
 "'removable': %i, 'locked': %i }",
 bs->device_name, type, bs->removable,
 bs->locked);
-assert(bs_obj != NULL);
 
 if (bs->drv) {
 QObject *obj;
@@ -1270,7 +1269,6 @@ void bdrv_info(Monitor *mon, QObject **ret_data)
  bs->filename, bs->read_only,
  bs->drv->format_name,
  bdrv_is_encrypted(bs));
-assert(obj != NULL);
 if (bs->backing_file[0] != '\0') {
 QDict *qdict = qobject_to_qdict(obj);
 qdict_put(qdict, "backing_file",
@@ -1356,7 +1354,6 @@ void bdrv_info_stats(Monitor *mon, QObject **ret_data)
  bs->device_name,
  bs->rd_bytes, bs->wr_bytes,
  bs->rd_ops, bs->wr_ops);
-assert(obj != NULL);
 qlist_append_obj(devices, obj);
 }
 
diff --git a/hw/pci-hotplug.c b/hw/pci-hotplug.c
index ba13d2b..0fb96f0 100644
--- a/hw/pci-hotplug.c
+++ b/hw/pci-hotplug.c
@@ -285,7 +285,6 @@ void pci_device_hot_add(Monitor *mon, const QDict *qdict, 
QObject **ret_data)
 qobject_from_jsonf("{ 'domain': 0, 'bus': %d, 'slot': %d, "
"'function': %d }", pci_bus_num(dev->bus),
PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn));
-assert(*ret_data != NULL);
 } else
 monitor_printf(mon, "failed to add %s\n", opts);
 }
diff --git a/migration.c b/migration.c
index f20315f..2320c5f 100644
--- a/migration.c
+++ b/migration.c
@@ -183,8 +183,6 @@ static void migrate_put_status(QDict *qdict, const char 
*name,
 obj = qobject_from_jsonf("{ 'transferred': %" PRId64 ", "
"'remaining': %" PRId64 ", "
"'total': %" PRId64 " }", trans, rem, total);
-assert(obj != NULL);
-
 qdict_put_obj(qdict, name, obj);
 }
 
@@ -258,7 +256,6 @@ void do_info_migrate(Monitor *mon, QObject **ret_data)
 *ret_data = qobject_from_jsonf("{ 'status': 'cancelled' }");
 break;
 }
-assert(*ret_data != NULL);
 }
 }
 
diff --git a/monitor.c b/monitor.c
index a0ec7fc..cb7eb65 100644
--- a/monitor.c
+++ b/monitor.c
@@ -351,8 +351,6 @@ static void timestamp_put(QDict *qdict)
 obj = qobject_from_jsonf("{ 'seconds': %" PRId64 ", "
 "'microseconds': %" PRId64 " }",
 (int64_t) tv.tv_sec, (int64_t) tv.tv_usec);
-assert(obj != NULL);
-
 qdict_put_obj(qdict, "timestamp", obj);
 }
 
@@ -897,7 +895,6 @@ static void do_info_cpus(Monitor *mon, QObject **ret_data)
 obj = qobject_from_jsonf("{ 'CPU': %d, 'current': %i, 'halted': %i }",
  env->cpu_index, env == mon->mon_cpu,
  env->halted);
-assert(obj != NULL);
 
 cpu = qobject_to_qdict(obj);
 
@@ -4412,8 +4409,6 @@ static void monitor_control_event(void *opaque, int event)
 json_message_parser_init(&mon->mc->parser, handle_qmp_command);
 
 data = get_qmp_greeting();
-assert(data != NULL);
-
 monitor_json_emitter(mon, data);
 qobject_decref(data);
 }
-- 
1.6.6





[Qemu-devel] [PATCH 3/4] QError: Don't abort on multiple faults

2010-02-08 Thread Luiz Capitulino
Ideally, Monitor code should report an error only once and
return the error information up the call chain.

To assure that this happens as expected and that no error is
lost, we have an assert() in qemu_error_internal().

However, we still have not fully converted handlers using
monitor_printf() to report errors. As there can be multiple
monitor_printf() calls on an error, the assertion is easily
triggered when debugging is enabled; and we will get a memory
leak if it's not.

The solution to this problem is to allow multiple faults by only
reporting the first one, and to release the additional error objects.

A better mechanism to report multiple errors to programmers is
underway.

Signed-off-by: Luiz Capitulino 
---
 monitor.c |9 +++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/monitor.c b/monitor.c
index cb7eb65..c8b63aa 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4625,8 +4625,13 @@ void qemu_error_internal(const char *file, int linenr, 
const char *func,
 QDECREF(qerror);
 break;
 case ERR_SINK_MONITOR:
-assert(qemu_error_sink->mon->error == NULL);
-qemu_error_sink->mon->error = qerror;
+/* report only the first error */
+if (!qemu_error_sink->mon->error) {
+qemu_error_sink->mon->error = qerror;
+} else {
+/* XXX: warn the programmer */
+QDECREF(qerror);
+}
 break;
 }
 }
-- 
1.6.6





[Qemu-devel] [PATCH 4/4] QMP: Don't leak on connection close

2010-02-08 Thread Luiz Capitulino
QMP's chardev event callback doesn't call
json_message_parser_destroy() on CHR_EVENT_CLOSED. As the call
to json_message_parser_init() on CHR_EVENT_OPENED allocates memory,
we'are leaking on close.

Fix that by just calling json_message_parser_destroy() on
CHR_EVENT_CLOSED.

Signed-off-by: Luiz Capitulino 
---
 monitor.c |   12 
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/monitor.c b/monitor.c
index c8b63aa..aacc0af 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4401,16 +4401,20 @@ static QObject *get_qmp_greeting(void)
  */
 static void monitor_control_event(void *opaque, int event)
 {
-if (event == CHR_EVENT_OPENED) {
-QObject *data;
-Monitor *mon = opaque;
+QObject *data;
+Monitor *mon = opaque;
 
+switch (event) {
+case CHR_EVENT_OPENED:
 mon->mc->command_mode = 0;
 json_message_parser_init(&mon->mc->parser, handle_qmp_command);
-
 data = get_qmp_greeting();
 monitor_json_emitter(mon, data);
 qobject_decref(data);
+break;
+case CHR_EVENT_CLOSED:
+json_message_parser_destroy(&mon->mc->parser);
+break;
 }
 }
 
-- 
1.6.6





[Qemu-devel] Re: Seabios dislikes -M isapc

2010-02-08 Thread Jan Kiszka
Sebastian Herbszt wrote:
> Jan Kiszka wrote:
>> Hi,
>>
>> Seabios seems to have some assumptions built in that break when -M isapc
>> is selected. Is this supposed to work or is isapc about to die?
> 
> SeaBIOS doesn't POST if the F-segment is not writeable [1]. A possible, but 
> IMO
> wrong fix was posted on the list [2].
> 
> [1] http://lists.gnu.org/archive/html/qemu-devel/2009-11/msg01188.html
> [2] http://lists.gnu.org/archive/html/qemu-devel/2009-12/msg00445.html
> 

Indeed, [2] makes it work again.

But taking away IO_MEM_ROM really looks like a lazy workaround. I don't
know how much Seabios needs to write - can't it use normal RAM for this?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux




Re: [Qemu-devel] Re: Two QMP events issues

2010-02-08 Thread Anthony Liguori

On 02/08/2010 12:25 PM, Luiz Capitulino wrote:

On Mon, 08 Feb 2010 09:13:37 -0600
Anthony Liguori  wrote:

   

On 02/08/2010 08:56 AM, Daniel P. Berrange wrote:
 

On Mon, Feb 08, 2010 at 08:49:20AM -0600, Anthony Liguori wrote:

   

On 02/08/2010 08:12 AM, Daniel P. Berrange wrote:

 

For further backgrou, the key end goal here is that in a QMP client, upon
receipt of the  'RESET' event, we need to reliably&immediately determine
why it  occurred. eg, triggered by watchdog, or by guest OS request. There
are actually 3 possible sequences

   - WATCHDOG + action=reset, followed by RESET.  Assuming no intervening
 event can occurr, the client can merely record 'WATCHDOG' and interpret
 it when it gets the immediately following 'RESET' event

   - RESET, followed by WATCHDOG + action=reset. The client doesn't know
 the reason for the RESET and can't wait arbitrarily for WATCHDOG since
 there might never be one arriving.

   - RESET + source=watchdog. Client directly sees the reason

The second scenario is the one I'd like us to avoid at all costs, since it
will require the client to introduce arbitrary delays in processing events
to determine cause. The first is slightly inconvenient, but doable if we
can assume no intervening events will occur, between WATCHDOG and the
RESET events. The last is obviously simplest for the clients.


   

I really prefer the third option but I'm a little concerned that we're
throwing events around somewhat haphazardly.

So let me ask, why does a client need to determine when a guest reset
and why it reset?

 

If a guest OS is repeatedly hanging/crashing resulting in the watchdog
device firing, management software for the host really wants to know about
that (so that appropriate alerts/action can be taken) and thus needs to
be able to distinguish this from a "normal"  guest OS initiated reboot.

   

I think that's an argument for having the watchdog events independent of
the reset events.

The watchdog condition happening is not directly related to the action
the watchdog takes.  The watchdog event really belongs in a class events
that are closely associated with a particular device emulation.

In fact, I think what we're really missing in events today is a notion
of a context.  A RESET event is really a CPU event.  A watchdog
expiration event is a watchdog event.  A connect event is a VNC event
(Spice and chardevs will also generate connect events).
 

  This could be done by adding a 'context' member to all the events and
then an event would have to be identified by the pair event_name:context.

  This way we can have the same event_name for events in different
contexts. For example:

{ 'event': DISCONNECT, 'context': 'spice', [...] }

{ 'event': DISCONNECT, 'context': 'vnc', [...] }

  Note that today we have VNC_DISCONNECT and will probably have
SPICE_DISCONNECT too.
   


Which is why we gave ourselves until 0.13 to straighten out the protocol.

N.B. in this model, you'd have:

{ 'event' : 'EXPIRED', 'context': 'watchdog', 'action': 'reset' }
/* some arbitrary number of events */
{ 'event' : 'RESET', 'context': 'cpu' }

And the only reason RESET follows EXPIRED is because action=reset.  If 
action was different, a RESET might not occur.


A client needs to see the EXPIRED event, determine whether to expect a 
RESET event, and if so, wait for the next RESET event to happen.


Regards,

Anthony Liguori





Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Blue Swirl
On Mon, Feb 8, 2010 at 8:26 PM, Michael S. Tsirkin  wrote:
> On Mon, Feb 08, 2010 at 07:56:27PM +0200, Blue Swirl wrote:
>> On Mon, Feb 8, 2010 at 7:37 PM, Michael S. Tsirkin  wrote:
>> > On Mon, Feb 08, 2010 at 06:37:41PM +0100, Gerd Hoffmann wrote:
>> >> On 02/08/10 18:32, Michael S. Tsirkin wrote:
>> >>> On Mon, Feb 08, 2010 at 06:24:57PM +0100, Gerd Hoffmann wrote:
>>  On 02/08/10 17:27, Michael S. Tsirkin wrote:
>> > On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:
>> >> On 02/08/10 11:17, Michael S. Tsirkin wrote:
>> >>> On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:
>>  initialize header type register in pci generic code.
>> 
>>  Cc: Blue Swirl
>>  Cc: "Michael S. Tsirkin"
>>  Signed-off-by: Isaku Yamahata
>> >>>
>> >>> No objections here, I am assuming this will be followed
>> >>> by patches removing header type init from bridges?
>> >>>    From qdev perspective, it is probably cleaner to make
>> >>> multifunction bit a separate qdev property though, right?
>> >>
>> >>    From a qdev perspective it would make *alot* of sense to move a 
>> >> bunch of
>> >> pci config stuff (including, but not limited to header type) into
>> >> PCIDeviceInfo.
>> >>
>> >> cheers,
>> >>     Gerd
>> >
>> > Actually - won't this make it possible to create broken configurations
>> > by tweaking properties from command-line?
>> 
>>  Not as property, as struct element in PCIDeviceInfo.  i.e.
>> 
>>  static PCIDeviceInfo e1000_info = {
>>       [ stuff which is here right now ]
>>       .vendor_id = PCI_VENDOR_ID_INTEL,
>>       .device_id = E1000_DEVID,
>>       .class     = PCI_CLASS_NETWORK_ETHERNET,
>>       [ probably more stuff which makes sense ]
>>  }
>> 
>>  Then setup this in generic pci code instead of having each driver doing
>>  a bunch of pci_config_set_*() calls.
>> 
>>  cheers,
>>     Gerd
>> >>>
>> >>> We still end up with class, vendor etc duplicated in 2 places.
>> >>
>> >> No.  The info should be *only* in PCIDeviceInfo then.
>> >
>> > That would put a lot of code in pci config cycle path.  A single array
>> > mirroring the whole config space is much cleaner.
>>
>> I'd suppose the arrays would remain as they are now, they just would
>> be initialized (using the pci functions) based on PCIDeviceInfo
>> structure.
>
> This still means we have two copies of same data
> and need to maintain code that keeps them in sync,
> even if that is called just at init time.
>
>> This would simplify the device code a lot.
>
> Well, I think
>
> pci_set_class(pci_dev, PCI_CLASS_NETWORK_ETHERNET)
>
> is simpler than
>
>        .class = PCI_CLASS_NETWORK_ETHERNET
>
> and some magic that copies that to pci config.

The advantage is that if the code happens to change, only the magic
(which is in one place) needs to be changed. Similar process has
happened with savevm.




Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 09:32:54PM +0200, Blue Swirl wrote:
> On Mon, Feb 8, 2010 at 8:26 PM, Michael S. Tsirkin  wrote:
> > On Mon, Feb 08, 2010 at 07:56:27PM +0200, Blue Swirl wrote:
> >> On Mon, Feb 8, 2010 at 7:37 PM, Michael S. Tsirkin  wrote:
> >> > On Mon, Feb 08, 2010 at 06:37:41PM +0100, Gerd Hoffmann wrote:
> >> >> On 02/08/10 18:32, Michael S. Tsirkin wrote:
> >> >>> On Mon, Feb 08, 2010 at 06:24:57PM +0100, Gerd Hoffmann wrote:
> >>  On 02/08/10 17:27, Michael S. Tsirkin wrote:
> >> > On Mon, Feb 08, 2010 at 12:14:11PM +0100, Gerd Hoffmann wrote:
> >> >> On 02/08/10 11:17, Michael S. Tsirkin wrote:
> >> >>> On Mon, Feb 08, 2010 at 03:41:47PM +0900, Isaku Yamahata wrote:
> >>  initialize header type register in pci generic code.
> >> 
> >>  Cc: Blue Swirl
> >>  Cc: "Michael S. Tsirkin"
> >>  Signed-off-by: Isaku Yamahata
> >> >>>
> >> >>> No objections here, I am assuming this will be followed
> >> >>> by patches removing header type init from bridges?
> >> >>>    From qdev perspective, it is probably cleaner to make
> >> >>> multifunction bit a separate qdev property though, right?
> >> >>
> >> >>    From a qdev perspective it would make *alot* of sense to move a 
> >> >> bunch of
> >> >> pci config stuff (including, but not limited to header type) into
> >> >> PCIDeviceInfo.
> >> >>
> >> >> cheers,
> >> >>     Gerd
> >> >
> >> > Actually - won't this make it possible to create broken 
> >> > configurations
> >> > by tweaking properties from command-line?
> >> 
> >>  Not as property, as struct element in PCIDeviceInfo.  i.e.
> >> 
> >>  static PCIDeviceInfo e1000_info = {
> >>       [ stuff which is here right now ]
> >>       .vendor_id = PCI_VENDOR_ID_INTEL,
> >>       .device_id = E1000_DEVID,
> >>       .class     = PCI_CLASS_NETWORK_ETHERNET,
> >>       [ probably more stuff which makes sense ]
> >>  }
> >> 
> >>  Then setup this in generic pci code instead of having each driver 
> >>  doing
> >>  a bunch of pci_config_set_*() calls.
> >> 
> >>  cheers,
> >>     Gerd
> >> >>>
> >> >>> We still end up with class, vendor etc duplicated in 2 places.
> >> >>
> >> >> No.  The info should be *only* in PCIDeviceInfo then.
> >> >
> >> > That would put a lot of code in pci config cycle path.  A single array
> >> > mirroring the whole config space is much cleaner.
> >>
> >> I'd suppose the arrays would remain as they are now, they just would
> >> be initialized (using the pci functions) based on PCIDeviceInfo
> >> structure.
> >
> > This still means we have two copies of same data
> > and need to maintain code that keeps them in sync,
> > even if that is called just at init time.
> >
> >> This would simplify the device code a lot.
> >
> > Well, I think
> >
> > pci_set_class(pci_dev, PCI_CLASS_NETWORK_ETHERNET)
> >
> > is simpler than
> >
> >        .class = PCI_CLASS_NETWORK_ETHERNET
> >
> > and some magic that copies that to pci config.
> 
> The advantage is that if the code happens to change, only the magic
> (which is in one place) needs to be changed. Similar process has
> happened with savevm.

Yes, one place is good. But magic is bad.  What's wrong with
pci_set_header type or something like that?  Why do we need header type
in qdev?

-- 
MST




Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Gerd Hoffmann

  Hi,


This still means we have two copies of same data
and need to maintain code that keeps them in sync,
even if that is called just at init time.


No.  There is nothing to keep in sync.  And there is no extra copy of data.

Today you have pci_set_*() calls somewhere in PCIDeviceInfo->init().
I'd like to see them replaced with PCIDeviceInfo->$field + setup in 
common code.  The information that device $foo has vendor id 42 and 
device id 4711 (and other properties) just moves from code to data.


It is static information, it should be static data.  And having the 
information in a well defined place in a data structure instead of 
hidden somewhere in the ->init() code makes it alot easier to reuse the 
information for something else.  That is the whole point.


cheers,
  Gerd





[Qemu-devel] [PATCH v1 0/4]: QMP related fixes

2010-02-08 Thread Luiz Capitulino
 Should be applied on top of feature negotiation series.

changelog:
--

v0 -> v1:

- Document qobject_from_jsonf() new semantics




Re: [Qemu-devel] Re: Two QMP events issues

2010-02-08 Thread Luiz Capitulino
On Mon, 08 Feb 2010 13:14:24 -0600
Anthony Liguori  wrote:

> On 02/08/2010 12:25 PM, Luiz Capitulino wrote:
> > On Mon, 08 Feb 2010 09:13:37 -0600
> > Anthony Liguori  wrote:
> >
> >
> >> On 02/08/2010 08:56 AM, Daniel P. Berrange wrote:
> >>  
> >>> On Mon, Feb 08, 2010 at 08:49:20AM -0600, Anthony Liguori wrote:
> >>>
> >>>
>  On 02/08/2010 08:12 AM, Daniel P. Berrange wrote:
> 
>   
> > For further backgrou, the key end goal here is that in a QMP client, 
> > upon
> > receipt of the  'RESET' event, we need to reliably&immediately 
> > determine
> > why it  occurred. eg, triggered by watchdog, or by guest OS request. 
> > There
> > are actually 3 possible sequences
> >
> >- WATCHDOG + action=reset, followed by RESET.  Assuming no 
> > intervening
> >  event can occurr, the client can merely record 'WATCHDOG' and 
> > interpret
> >  it when it gets the immediately following 'RESET' event
> >
> >- RESET, followed by WATCHDOG + action=reset. The client doesn't know
> >  the reason for the RESET and can't wait arbitrarily for WATCHDOG 
> > since
> >  there might never be one arriving.
> >
> >- RESET + source=watchdog. Client directly sees the reason
> >
> > The second scenario is the one I'd like us to avoid at all costs, since 
> > it
> > will require the client to introduce arbitrary delays in processing 
> > events
> > to determine cause. The first is slightly inconvenient, but doable if we
> > can assume no intervening events will occur, between WATCHDOG and the
> > RESET events. The last is obviously simplest for the clients.
> >
> >
> >
>  I really prefer the third option but I'm a little concerned that we're
>  throwing events around somewhat haphazardly.
> 
>  So let me ask, why does a client need to determine when a guest reset
>  and why it reset?
> 
>   
> >>> If a guest OS is repeatedly hanging/crashing resulting in the watchdog
> >>> device firing, management software for the host really wants to know about
> >>> that (so that appropriate alerts/action can be taken) and thus needs to
> >>> be able to distinguish this from a "normal"  guest OS initiated reboot.
> >>>
> >>>
> >> I think that's an argument for having the watchdog events independent of
> >> the reset events.
> >>
> >> The watchdog condition happening is not directly related to the action
> >> the watchdog takes.  The watchdog event really belongs in a class events
> >> that are closely associated with a particular device emulation.
> >>
> >> In fact, I think what we're really missing in events today is a notion
> >> of a context.  A RESET event is really a CPU event.  A watchdog
> >> expiration event is a watchdog event.  A connect event is a VNC event
> >> (Spice and chardevs will also generate connect events).
> >>  
> >   This could be done by adding a 'context' member to all the events and
> > then an event would have to be identified by the pair event_name:context.
> >
> >   This way we can have the same event_name for events in different
> > contexts. For example:
> >
> > { 'event': DISCONNECT, 'context': 'spice', [...] }
> >
> > { 'event': DISCONNECT, 'context': 'vnc', [...] }
> >
> >   Note that today we have VNC_DISCONNECT and will probably have
> > SPICE_DISCONNECT too.
> >
> 
> Which is why we gave ourselves until 0.13 to straighten out the protocol.

 Yeah.

> N.B. in this model, you'd have:
> 
> { 'event' : 'EXPIRED', 'context': 'watchdog', 'action': 'reset' }
> /* some arbitrary number of events */
> { 'event' : 'RESET', 'context': 'cpu' }
> 
> And the only reason RESET follows EXPIRED is because action=reset.  If 
> action was different, a RESET might not occur.
> 
> A client needs to see the EXPIRED event, determine whether to expect a 
> RESET event, and if so, wait for the next RESET event to happen.

 Looks reasonable to me, what do think Daniel?

 Note that if we agree on the 'context design', I'll have to change
VNC's events names..




Re: [Qemu-devel] Re: [PATCH] pci: initialize header type register.

2010-02-08 Thread Michael S. Tsirkin
On Mon, Feb 08, 2010 at 08:55:58PM +0100, Gerd Hoffmann wrote:
>   Hi,
>
>> This still means we have two copies of same data
>> and need to maintain code that keeps them in sync,
>> even if that is called just at init time.
>
> No.  There is nothing to keep in sync.  And there is no extra copy of data.
>
> Today you have pci_set_*() calls somewhere in PCIDeviceInfo->init().
> I'd like to see them replaced with PCIDeviceInfo->$field + setup in  
> common code.  The information that device $foo has vendor id 42 and  
> device id 4711 (and other properties) just moves from code to data.

We still need it in config array which is read by guest.
So that is two places.

> It is static information, it should be static data.  And having the  
> information in a well defined place in a data structure instead of  
> hidden somewhere in the ->init() code makes it alot easier to reuse the  
> information for something else.  That is the whole point.
>
> cheers,
>   Gerd

more important IMO is making code easier to follow.

-- 
MST




Re: [Qemu-devel] Re: Two QMP events issues

2010-02-08 Thread Anthony Liguori

On 02/08/2010 01:59 PM, Luiz Capitulino wrote:


  Looks reasonable to me, what do think Daniel?

  Note that if we agree on the 'context design', I'll have to change
VNC's events names..
   


Let me give you a few suggestions before diving into it.  context might 
not be the best name.


For event generated by devices, the event should be raised with 
something like qdev_event(&s->dev, QMPEV_WD_EXPIRED, ...).


The context argument should allow a client to determine which device 
raised the event.  So it could be a combination of the device's qdev 
name and it's id.


For event generated by non-qdev mechanisms, we should try our best to 
associate context with that event.  For instance, a DISCONNECT event 
happens to a particular session.  We don't quite have VNC session ids 
yet but if we did, it would make sense to include that in the context info.


So my thinking is that we don't just want context to serve as a 
classification mechanism, but we want it to indicate what 
subsystem/device/session generated the event.


Regards,

Anthony Liguori




[Qemu-devel] [PATCH 1/1] Increase VNC_MAX_WIDTH

2010-02-08 Thread Brian Jackson
Increase VNC_MAX_WIDTH to match "commonly available" consumer level monitors 
available these days.

This also closes KVM bug 2907597

Signed-off-by: Brian Jackson 
---
 vnc.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/vnc.h b/vnc.h
index 1210824..1575af2 100644
--- a/vnc.h
+++ b/vnc.h
@@ -68,7 +68,7 @@ typedef void VncSendHextileTile(VncState *vs,
 void *last_fg,
 int *has_bg, int *has_fg);

-#define VNC_MAX_WIDTH 2048
+#define VNC_MAX_WIDTH 2560
 #define VNC_MAX_HEIGHT 2048
 #define VNC_DIRTY_WORDS (VNC_MAX_WIDTH / (16 * 32))





[Qemu-devel] Re: Slowdowns comparing qemu-kvm.git to qemu.git: vcpu/thread scheduling differences

2010-02-08 Thread Anthony Liguori

On 02/08/2010 11:35 AM, Amit Shah wrote:

On (Mon) Feb 08 2010 [08:57:05], Anthony Liguori wrote:
   

On 02/08/2010 07:46 AM, Amit Shah wrote:
 

Hello,

In my testing of virtio-console, I found qemu-kvm.git introduces a lot
of overhead in thread scheduling compared to qemu.git.

My test sends a 260M file from the host to a guest via a virtio-console
port and then computes the sha1sum of the file on the host as well as on
the guest, compares the checksum and declares the result based on the
checksum match. The test passes in all the scenarios listed below,
indicating there's no unsafe data transfer.


Repo   Time taken
-  --
qemu.git<   1 m (typically 30s)
qemu-kvm.git>   16m
qemu-iothread~ 5m

   

That very likely suggests that there are missing qemu_notify_events() in
qemu-kvm.git and you're getting blocked waiting for the next timer event
to fire.
 

Hm, if that's the case, should virtio_notify() have a call to
qemu_notify_event()?
   


No, basically, the problem will boil down to, the IO thread is 
select()'d waiting for an event to occur.  However, you've done 
something in the VCPU thread that requires the IO thread to run it's 
main loop.  You need to use qemu_notify_event() to force the IO thread 
to break out of select().


Debugging these problems are very difficult and the complexity here is 
the main reason the IO thread still hasn't been enabled by default in 
qemu.git.


I'd suggest looking at the main loop in qemu-kvm, seeing what isn't 
processed as a result of fd becoming readable/writable, and then making 
sure that any of those mechanisms you're relying on in virtio-console 
have the appropriate tie-ins to qemu_notify_event().


Regards,

Anthony Liguori




Re: [Qemu-devel] [PATCH] [For stable-0.12] Sync OSS_GETVERSION handling with head

2010-02-08 Thread Juergen Lock
In article  you write:
>On Tue, 19 Jan 2010, Anthony Liguori wrote:
>
>> On 01/17/2010 11:23 AM, Juergen Lock wrote:
>> > As suggested by Andreas F?rber, here is a cumulative patch that syncs
>> > OSS_GETVERSION handling with head by merging the following commits:
>> > 
>> > 1. oss: issue OSS_GETVERSION ioctl only when needed
>> > 6d246526ce3c145b2831285def6983f5de6190d3
>> > 
>> > 2. oss: fix fragment setting
>> > 3d709fe73a77c40e263b3af6e650fd4b519c3562
>> > 
>> > 3. Workaround for broken OSS_GETVERSION on FreeBSD, part two
>> > 72ff25e4e98d6dba9286d032b9ff5432553bbad5
>> > 
>> > Signed-off-by: Juergen Lock
>> >
>> 
>> malc, please Ack.
>
>quack
>
>[..snip..]

-qu? :)

 (Because I just noticed this is not in stable yet...)

 Thanx,
Juergen




  1   2   >