Re: [Qemu-devel] [PATCH 12/16] scsi-generic: use plain ioctl

2010-11-20 Thread Nicholas A. Bellinger
On Sat, 2010-11-20 at 01:25 +, adq wrote:
> On 20 November 2010 00:41, Nicholas A. Bellinger  wrote:
> > On Fri, 2010-11-19 at 19:39 +0100, Christoph Hellwig wrote:
> >> On Thu, Nov 18, 2010 at 03:47:36PM +0100, Hannes Reinecke wrote:
> >> >
> >> > aio_ioctl is emulated anyway and currently broken.
> >>
> >> What's broken about it currently?
> >
> > Mm, I do not recall this being broken in the first place..?  There
> > was a single issue with megasas+bdrv_aio_ioctl() with WinXP (that did
> > not appear with lsi53c895a) that was mentioned on the list earlier in
> > the year that required a patch to use bdev_ioctl(), but last I recall
> > Hannes had already fixed this in recent megasas.c code w/ 32-bit MSFT
> > guests.  Also, this is what I have been with scsi_generic.c and
> > scsi_bsg.c into TCM_loop in my v0.12.5 megasas tree, and I am not
> > observing any obvious issues with AIO IOCTLs for SG_IO/BSG into Linux
> > guests.
> >
> > I will give AIO IOCTL ops a run with these on v2.6.37-rc2 lock-less KVM
> > host mode <-> TCM_Loop to verify against the v0.12.5 megasas tree.
> 
> Could this AIO ioctl breakage perhaps be the one I fixed here?
> http://web.archiveorange.com/archive/v/1XS1vROmfC7dN9wYxsmt
> 
> The patch is defintely in the latest git... it works fine for me with
> my scsi-generic MMC command patches.
> 

Interesting read, and thanks for the heads up on this bit..   I do not
personally recall running into any issues with TYPE_DISK w/ lsi53c895a
and AIO SG_IO into WinXP guests on v0.12.5 code.   After a quick double
check in the v0.12.5 megasas tree, the proper get_async_context_id() is
still present:

http://git.kernel.org/?p=virt/kvm/nab/qemu-kvm.git;a=blob;f=posix-aio-compat.c;h=ccdbf9a16d0ef1d7e57c87dbe43f318d4c7a5967;hb=HEAD#l560

So it appears this acb->async_context_id was incorrectly dropped during
v0.13 development, and with your fix commited into v0.13 mainline code
that Hannes should be able to safetly drop this the megasas series,
yes..?

Thank you for your comments!

--nab





Re: [Qemu-devel] Re: [PATCH 0/2] v8 Decouple block device removal from device removal

2010-11-20 Thread Kevin Wolf
Am 16.11.2010 14:51, schrieb Luiz Capitulino:
> On Fri, 12 Nov 2010 18:38:57 +0100
> Kevin Wolf  wrote:
> 
>> Am 12.11.2010 18:07, schrieb Ryan Harper:
>>> details, details, v8
>>>
>>> This patch series decouples the detachment of a block device from the
>>> removal of the backing pci-device.  Removal of a hotplugged pci device
>>> requires the guest to respond before qemu tears down the block device.
>>> In some cases, the guest may not respond leaving the guest with
>>> continued access to the block device.  Mgmt layer doesn't have a
>>> reliable method to force a disconnect.  
>>>
>>> The new monitor command, drive_del, will revoke a guests access to the
>>> block device independently of the removal of the pci device.
>>>
>>> The first patch implements drive_del, the second patch implements the
>>> qmp version of the monitor command.
>>>
>>> Changes since v7:
>>> - Fixed up doc strings (delete -> drive_del)
>>> Changes since v6:
>>> - Updated patch description
>>> - Dropped bdrv_unplug and inlined in drive_del
>>> - Explicitly invoke drive_uninit()
>>> Changes since v5:
>>> - Removed dangling pointers in guest and host state.  This ensures things 
>>> like 
>>>   info block no longer displays the deleted drive; though info pci will
>>>   continue to display the pci device until the guest responds to the removal
>>>   request.
>>> - Renamed drive_unplug -> drive_del
>>> Changes since v4:
>>> - Droppped drive_get_by_id patch and use bdrv_find() instead
>>> - Added additional details about drive_unplug to hmp/qmp interface
>>>
>>> Changes since v3:
>>> - Moved QMP command for drive_unplug() to separate patch
>>>
>>> Changes since v2:
>>> - Added QMP command for drive_unplug()
>>>
>>> Changes since v1:
>>> - CodingStyle fixes
>>> - Added qemu_aio_flush() to bdrv_unplug()
>>>
>>> Signed-off-by: Ryan Harper 
>>
>> Thanks, applied both to the block branch.
> 
> I guess the conclusion was that we don't want the new command
> in QMP?
> 
>   http://lists.gnu.org/archive/html/qemu-devel/2010-11/msg01084.html

If you compare the time of these mails, Markus sent his mail only a few
minutes after I had applied the patches and posted this.

Ryan split the patch in two parts only to allow dropping the QMP part if
we decided so, so I think he'll agree. I'm going to drop the second
patch from my queue again before I send a pull request.

Kevin



Re: [Qemu-devel] [PATCH v2 2/2] RAM API: Make use of it for x86 PC

2010-11-20 Thread Anthony Liguori

On 11/01/2010 10:14 AM, Alex Williamson wrote:

Register the actual VM RAM using the new API

Signed-off-by: Alex Williamson
---

  hw/pc.c |   12 ++--
  1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 69b13bf..0ea6d10 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -912,14 +912,14 @@ void pc_memory_init(ram_addr_t ram_size,
  /* allocate RAM */
  ram_addr = qemu_ram_alloc(NULL, "pc.ram",
below_4g_mem_size + above_4g_mem_size);
-cpu_register_physical_memory(0, 0xa, ram_addr);
-cpu_register_physical_memory(0x10,
- below_4g_mem_size - 0x10,
- ram_addr + 0x10);
+
+qemu_ram_register(0, 0xa, ram_addr);
+qemu_ram_register(0x10, below_4g_mem_size - 0x10,
+  ram_addr + 0x10);
  #if TARGET_PHYS_ADDR_BITS>  32
  if (above_4g_mem_size>  0) {
-cpu_register_physical_memory(0x1ULL, above_4g_mem_size,
- ram_addr + below_4g_mem_size);
+qemu_ram_register(0x1ULL, above_4g_mem_size,
+  ram_addr + below_4g_mem_size);
  }
   


Take a look at the memory shadowing in the i440fx.  The regions of 
memory in the BIOS area can temporarily become RAM.


That's because there is normally RAM backing this space but the memory 
controller redirects writes to the ROM space.


Not sure the best way to handle this, but the basic concept is, RAM 
always exists but if a device tries to access it, it may or may not be 
accessible as RAM at any given point in time.


Regards,

Anthony Liguori


  #endif




   





Re: [Qemu-devel] macaddr doesn't work

2010-11-20 Thread Mulyadi Santosa
On Fri, Nov 19, 2010 at 16:09, 郭沐錫  wrote:
> However the eth0 will disapear and induce I cannot assign the IP address to
> the QEMU.
> http://myweb.ncku.edu.tw/~p76991028/eth0.png

I think I already asked you to type "ifconfig -a" and see if it is there?

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com



Re: [Qemu-devel] [PATCH 12/16] scsi-generic: use plain ioctl

2010-11-20 Thread adq
On 20 November 2010 08:23, Nicholas A. Bellinger  wrote:
> On Sat, 2010-11-20 at 01:25 +, adq wrote:
>> On 20 November 2010 00:41, Nicholas A. Bellinger  
>> wrote:
>> > On Fri, 2010-11-19 at 19:39 +0100, Christoph Hellwig wrote:
>> >> On Thu, Nov 18, 2010 at 03:47:36PM +0100, Hannes Reinecke wrote:
>> >> >
>> >> > aio_ioctl is emulated anyway and currently broken.
>> >>
>> >> What's broken about it currently?
>> >
>> > Mm, I do not recall this being broken in the first place..?  There
>> > was a single issue with megasas+bdrv_aio_ioctl() with WinXP (that did
>> > not appear with lsi53c895a) that was mentioned on the list earlier in
>> > the year that required a patch to use bdev_ioctl(), but last I recall
>> > Hannes had already fixed this in recent megasas.c code w/ 32-bit MSFT
>> > guests.  Also, this is what I have been with scsi_generic.c and
>> > scsi_bsg.c into TCM_loop in my v0.12.5 megasas tree, and I am not
>> > observing any obvious issues with AIO IOCTLs for SG_IO/BSG into Linux
>> > guests.
>> >
>> > I will give AIO IOCTL ops a run with these on v2.6.37-rc2 lock-less KVM
>> > host mode <-> TCM_Loop to verify against the v0.12.5 megasas tree.
>>
>> Could this AIO ioctl breakage perhaps be the one I fixed here?
>> http://web.archiveorange.com/archive/v/1XS1vROmfC7dN9wYxsmt
>>
>> The patch is defintely in the latest git... it works fine for me with
>> my scsi-generic MMC command patches.
>>
>
> Interesting read, and thanks for the heads up on this bit..   I do not
> personally recall running into any issues with TYPE_DISK w/ lsi53c895a
> and AIO SG_IO into WinXP guests on v0.12.5 code.   After a quick double
> check in the v0.12.5 megasas tree, the proper get_async_context_id() is
> still present:
>
> http://git.kernel.org/?p=virt/kvm/nab/qemu-kvm.git;a=blob;f=posix-aio-compat.c;h=ccdbf9a16d0ef1d7e57c87dbe43f318d4c7a5967;hb=HEAD#l560
>
> So it appears this acb->async_context_id was incorrectly dropped during
> v0.13 development, and with your fix commited into v0.13 mainline code
> that Hannes should be able to safetly drop this the megasas series,
> yes..?
>
> Thank you for your comments!

No problem. Oh, that bug was /definitely/ in the released 0.12.x
series code; I spotted it when trying to get scsi-generic to work on
arch linux, which was using 0.12.x.



Re: [Qemu-devel] macaddr doesn't work

2010-11-20 Thread 郭沐錫
Dear all

Sorry, I was wrong.

I was too hurry to result in that I don't understand that command.

By "ifconfig -a", I found other eth...

Thank you to Mulyadi.

Best Regards,

2010/11/20 Mulyadi Santosa 

> On Fri, Nov 19, 2010 at 16:09, 郭沐錫  wrote:
> > However the eth0 will disapear and induce I cannot assign the IP address
> to
> > the QEMU.
> > http://myweb.ncku.edu.tw/~p76991028/eth0.png
>
> I think I already asked you to type "ifconfig -a" and see if it is there?
>
> --
> regards,
>
> Mulyadi Santosa
> Freelance Linux trainer and consultant
>
> blog: the-hydra.blogspot.com
> training: mulyaditraining.blogspot.com
>


Re: [Qemu-devel] [PATCH v2 1/2] Minimal RAM API support

2010-11-20 Thread Anthony Liguori

On 11/01/2010 10:14 AM, Alex Williamson wrote:

This adds a minimum chunk of Anthony's RAM API support so that we
can identify actual VM RAM versus all the other things that make
use of qemu_ram_alloc.

Signed-off-by: Alex Williamson
---

  Makefile.objs |1 +
  cpu-common.h  |2 +
  memory.c  |  109 +
  memory.h  |   18 +
  4 files changed, 130 insertions(+), 0 deletions(-)
  create mode 100644 memory.c
  create mode 100644 memory.h

diff --git a/Makefile.objs b/Makefile.objs
index f07fb01..33fae0b 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -154,6 +154,7 @@ hw-obj-y += vl.o loader.o
  hw-obj-y += virtio.o virtio-console.o
  hw-obj-y += fw_cfg.o pci.o pci_host.o pcie_host.o
  hw-obj-y += watchdog.o
+hw-obj-y += memory.o
  hw-obj-$(CONFIG_ISA_MMIO) += isa_mmio.o
  hw-obj-$(CONFIG_ECC) += ecc.o
  hw-obj-$(CONFIG_NAND) += nand.o
diff --git a/cpu-common.h b/cpu-common.h
index a543b5d..6aa2738 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -23,6 +23,8 @@
  /* address in the RAM (different from a physical address) */
  typedef unsigned long ram_addr_t;

+#include "memory.h"
+
  /* memory API */

  typedef void CPUWriteMemoryFunc(void *opaque, target_phys_addr_t addr, 
uint32_t value);
diff --git a/memory.c b/memory.c
new file mode 100644
index 000..2895082
--- /dev/null
+++ b/memory.c
@@ -0,0 +1,109 @@
+/*
+ * RAM API
+ *
+ *  Copyright Red Hat, Inc. 2010
+ *
+ * Authors:
+ *  Alex Williamson
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see.
+ */
+#include "memory.h"
+#include "range.h"
+
+typedef struct QemuRamSlot {
+target_phys_addr_t start_addr;
+ram_addr_t size;
+ram_addr_t offset;
+QLIST_ENTRY(QemuRamSlot) next;
+} QemuRamSlot;
+
+typedef struct QemuRamSlots {
+QLIST_HEAD(slots, QemuRamSlot) slots;
+} QemuRamSlots;
   


No need for all of the 'Qemu' prefixes.


+
+static QemuRamSlots ram_slots = { .slots = QLIST_HEAD_INITIALIZER(ram_slots) };
   


Might be nicer to just typedef the extra struct away.


+static QemuRamSlot *qemu_ram_find_slot(target_phys_addr_t start_addr,
+   ram_addr_t size)
+{
+QemuRamSlot *slot;
+
+QLIST_FOREACH(slot,&ram_slots.slots, next) {
+if (slot->start_addr == start_addr&&  slot->size == size) {
+return slot;
+}
+
+if (ranges_overlap(start_addr, size, slot->start_addr, slot->size)) {
+abort();
   


Should display a message before aborting.


+}
+}
+
+return NULL;
+}
+
+int qemu_ram_register(target_phys_addr_t start_addr, ram_addr_t size,
+   ram_addr_t phys_offset)
+{
+QemuRamSlot *slot;
+
+if (!size) {
+return -EINVAL;
+}
+
+assert(!qemu_ram_find_slot(start_addr, size));
+
+slot = qemu_mallocz(sizeof(QemuRamSlot));
+
+slot->start_addr = start_addr;
+slot->size = size;
+slot->offset = phys_offset;
+
+QLIST_INSERT_HEAD(&ram_slots.slots, slot, next);
+
+cpu_register_physical_memory(slot->start_addr, slot->size, slot->offset);
+
+return 0;
+}
+
+void qemu_ram_unregister(target_phys_addr_t start_addr, ram_addr_t size)
+{
+QemuRamSlot *slot;
+
+if (!size) {
+return;
+}
+
+slot = qemu_ram_find_slot(start_addr, size);
+assert(slot != NULL);
+
+QLIST_REMOVE(slot, next);
+qemu_free(slot);
+cpu_register_physical_memory(start_addr, size, IO_MEM_UNASSIGNED);
+
+return;
+}
+
+int qemu_ram_for_each_slot(void *opaque, qemu_ram_for_each_slot_fn fn)
+{
+QemuRamSlot *slot;
+
+QLIST_FOREACH(slot,&ram_slots.slots, next) {
+int ret = fn(opaque, slot->start_addr, slot->size, slot->offset);
+if (ret) {
+return ret;
+}
+}
+return 0;
+}
diff --git a/memory.h b/memory.h
new file mode 100644
index 000..0c17ff9
--- /dev/null
+++ b/memory.h
@@ -0,0 +1,18 @@
+#ifndef QEMU_MEMORY_H
+#define QEMU_MEMORY_H
+
+#include "qemu-common.h"
+#include "cpu-common.h"
   


Header needs copyright and would be nice to have some comments 
explaining these functions.


Regards,

Anthony Liguori


+int qemu_ram_register(target_phys_addr_t start_addr, ram_addr_t size,
+  ram_addr_t phys_offset);
+
+void qemu_ram_unregister(target_phys_addr_t start_addr, ram_addr_t size);
+
+typedef int (*qemu_ram_for_e

Re: [Qemu-devel] How to make shadow memory for a process? and how to trace the data propation from the instruction level in QEMU?

2010-11-20 Thread Lluís
F Zhang writes:

> This topic includes things that I recognized as critical. Have you any
> suggestions?

Sorry, I don't understand about what you want suggestions.


  Yes. For each process’s memory space A, I wanna make a shadow memory B. The
  shadow memory is used to store the tag of data. In other words, if addr in
  memory A is tainted, then the corresponding byte in B should be marked to
  indicate that addr in A is tainted.
>> 
>> The main question here is... what is the granularity that you want to
>> track with? Bytes? Words? Pages? This will greatly influence which is
>> your best approach.

> I think one byte per tag is necessary for malware analysis in most cases,
> because only a few bytes are used to launch an attack. For example, a few
> tainted bytes sent to EIP register will cause CPU to do bad things.
[...]
>> Now that I think of it, you could use the tracing points I sent for
>> guest virtual memory accesses, and instrument them instead of calling a
>> file-tracing backend (this should provide a hook for an arbitrary
>> granularity). Then, simply keep track also of address-space changes and
>> your instrumentation code can always know when to activate propagation.
>> 

> Sorry, what is “a file-tracing backend”? Could you be a little more detailed? 
> I
> think I need byte-level granularity. Thanks!

Well, the initial patch series I sent were based on macros, so that you
can place any code you want on these macros, not only tracing.

On its current form (sorry, I don't have spare time right now to finish
it), the points generate code for tracing, but there is a patch series
that lets the user re-define some of the trace points to call any
function provided by the user (look for the "trace-instrument" series).


>> This, together with the optimization I sent for dynamic control of trace
>> generation in TCG emulation code should get you on tracks.
>> 
>> Of course, you should still modify all register-accessing instructions
>> to propagate information passing through the register set. For that,
>> maybe you could start with the "fetch" tracing/instrumentation point I
>> sent long time ago, which keeps track of general-purpose register
>> usage/definition on x86 (although I'm sure I left some astray usages due
>> to the decoding complexity in x86).

> Thanks! I will read that code first, though I am currently just a newbie. L

  The guest os collects “higher” semantic
  from the OS level, and the QEMU collects “lower” semantic from the
  instruction level. Combination of both semantics is necessary in the
  analysis process.
>> 
>>>  The question is, in a situation where malware already compromise "the
>>>  higher semantic", could we trust the analysis?
>> 
>> Beware, I've read exactly this kind of scheme on previous top-tier
>> conferences (but I think tests were using an architectural simulator, so
>> it's not for a current production environment).
>> 
>> I've found it :)
>> 
>>  Secure program execution via dynamic information flow tracking
>>  ASPLOS 2004
>> 
> That is a significant paper, which is cited for more than 300 times!

That's why I said you should be careful. Porting this kind of analysis
into QEMU is not significant by itself, although I suppose it should
gain some extra relevance if you implement it in such a way that it can
be used on a production system.

You could start with guest OS taint propagation, and through the "guest
OS to QEMU" channel, activate taint propagation checks when a process
gains access to tainted information coming from the outer world (e.g.,
socket read) [*]. Then, you can conditionally generate taint checks like
I did in the "trace-gen" series, so that programs without access to
tainted information will have no checks at all.

Even more, the optimal solution would be to run in KVM-mode when no
instruction-based taint checking is needed, and use QEMU emulation
otherwise. The down side is that I was told this is not currently
possible with multiple CPUs, and only theoretically possible with one
CPU.

[*] This is just a rough summary of what I remember from the ASPLOS
paper


  The question is: how to communicate between the QEMU and the guest OS, so
  that they can cooperate with each other?
>> 
>> A few choices here, but you should first define if the communication
>> must be based just on control signals, and/or providing memory storage:
>>   * virtual device : If you need some kind of storage that the guest OS
>> must access, you could look at the ivshmem device
>>   * backdoor instruction : It's the simplest option; I sent some patch
>> series recently with two different implementations for x86.
>> 
>> 

> Both of control signals and (shadow) memory storage are required. So, the
> virtual device may be the right choice.

Shadow memory is not a problem here, as it can be handled by the
intrumented trace points and the guest OS has no need to access it. So
from my understanding, just usin

Re: [Qemu-devel] [PATCH v2] ioport: Fix duplicated code

2010-11-20 Thread Anthony Liguori

On 11/11/2010 08:03 AM, Luiz Capitulino wrote:

Functions register_ioport_read() and register_ioport_write() are almost
identical, the only difference is that they write to different arrays.

Introduce register_ioport_rw() to handle this difference and change both
functions to use it instead of duplicating code.

Signed-off-by: Luiz Capitulino
   


While it make take some scripting, let's do a global query/replace.

Having two interfaces where one is only scarely used hurts code 
readability.  We need to do the janitorial work when changing interfaces 
like this.


Regards,

Anthony Liguori


---

v2: Fix error messages and make register_ioport_rw() register both handlers
 at the same call

  ioport.c |   37 ++---
  1 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/ioport.c b/ioport.c
index ec3dc65..4560973 100644
--- a/ioport.c
+++ b/ioport.c
@@ -137,41 +137,40 @@ static int ioport_bsize(int size, int *bsize)
  }

  /* size is the word size in byte */
-int register_ioport_read(pio_addr_t start, int length, int size,
- IOPortReadFunc *func, void *opaque)
+static int register_ioport_rw(pio_addr_t start, int length, int size,
+  IOPortReadFunc *read_func,
+  IOPortWriteFunc *write_func, void *opaque)
  {
  int i, bsize;

  if (ioport_bsize(size,&bsize)) {
-hw_error("register_ioport_read: invalid size");
+hw_error("register_ioport_rw: invalid size");
  return -1;
  }
  for(i = start; i<  start + length; i += size) {
-ioport_read_table[bsize][i] = func;
+if (read_func) {
+ioport_read_table[bsize][i] = read_func;
+}
+if (write_func) {
+ioport_write_table[bsize][i] = write_func;
+}
  if (ioport_opaque[i] != NULL&&  ioport_opaque[i] != opaque)
-hw_error("register_ioport_read: invalid opaque");
+hw_error("register_ioport_rw: invalid opaque");
  ioport_opaque[i] = opaque;
  }
  return 0;
  }

-/* size is the word size in byte */
+int register_ioport_read(pio_addr_t start, int length, int size,
+ IOPortReadFunc *func, void *opaque)
+{
+return register_ioport_rw(start, length, size, func, NULL, opaque);
+}
+
  int register_ioport_write(pio_addr_t start, int length, int size,
IOPortWriteFunc *func, void *opaque)
  {
-int i, bsize;
-
-if (ioport_bsize(size,&bsize)) {
-hw_error("register_ioport_write: invalid size");
-return -1;
-}
-for(i = start; i<  start + length; i += size) {
-ioport_write_table[bsize][i] = func;
-if (ioport_opaque[i] != NULL&&  ioport_opaque[i] != opaque)
-hw_error("register_ioport_write: invalid opaque");
-ioport_opaque[i] = opaque;
-}
-return 0;
+return register_ioport_rw(start, length, size, NULL, func, opaque);
  }

  void isa_unassign_ioport(pio_addr_t start, int length)
   





Re: [Qemu-devel] Re: [PATCH v2 0/4] use new vgabios.

2010-11-20 Thread Anthony Liguori

On 11/17/2010 04:26 AM, Gerd Hoffmann wrote:

On 11/16/10 15:51, Anthony Liguori wrote:

On 11/01/2010 11:03 AM, Gerd Hoffmann wrote:

On 10/15/10 12:02, Gerd Hoffmann wrote:

This patch series will put the new vgabios into use for stdvga and
vmware_vga. The vgabios patches have been posted a while ago, they
are also also available from
git://anongit.freedesktop.org/~kraxel/vgabios pcibios



Merged this and the qemu patches (vgabios.3 branch). Thanks.


The updates are not visible @ http://git.qemu.org/vgabios.git/, forgot 
to push?


Hrm, vgabios is currently setup in mirror mode.  I guess if we're going 
to take it over, I need to switch it back to push mode.


I'll reswizzle the repo.

Regards,

Anthony Liguori


cheers,
  Gerd







Re: [Qemu-devel] [PATCH 0/3] add hotplug opt-out option for devices.

2010-11-20 Thread Anthony Liguori

On 11/18/2010 04:45 AM, Gerd Hoffmann wrote:

   Hi,

This patch series adds a qdev flag which allows devices being tagged as
not hotpluggable.  It also sets this flag for a number of devices.

   


I understand why you're adding this but this is one of those horrible 
abuses of qdev that we really need to avoid.


There are two valid reasons why hotplug is not possible:

1) Hotplugging is not supported by the *slot*.  This is something that 
needs to be exposes through ACPI.  This is not a qdev property, but a 
property of a PCI slot.  It's very important that we do this correctly 
because Windows puts a little icon in the systray that advertises 
quick-removal of devices in slots that support hotplug.


2) The PCI device is soldered to the MB or is otherwise not part of a 
PCI slot.  Again, this is part of the ACPI definition.


Since the PIIX3 lives in slot 1, our ACPI tables should not advertise 
slot 0 or slot 1 as supporting hotplug.


Hotplug information has no business being part of the core qdev 
structures.  Hotplug is a PCI concept and the information needs to live 
at the PCI layer to be meaningfully.


An ideal interface would explicitly allow a user to mark a series of PCI 
slots as no supporting hotplug.  It would be convenient in order to 
ensure that your virtio-net wasn't accidentally ejected by a click-happy 
Windows user.


Regards,

Anthony Liguori


Gerd Hoffmann (3):
   qdev: allow devices being tagged as not hotpluggable.
   piix: tag as non-hotpluggable.
   vga: tag as not hotplugable.

  hw/acpi_piix4.c |2 ++
  hw/cirrus_vga.c |1 +
  hw/ide/piix.c   |2 ++
  hw/piix4.c  |1 +
  hw/piix_pci.c   |2 ++
  hw/qdev.c   |   16 +---
  hw/qdev.h   |1 +
  hw/vga-pci.c|1 +
  hw/vmware_vga.c |1 +
  qerror.c|4 
  qerror.h|3 +++
  11 files changed, 31 insertions(+), 3 deletions(-)



   





Re: [Qemu-devel] [PATCH 2/3] vnc: support password expire

2010-11-20 Thread Anthony Liguori

On 11/17/2010 04:23 AM, Gerd Hoffmann wrote:

But the later let's a management tool implement arbitrarily complex
expiration policies.



Hmm, we could do this:

set-password $protocol $secret
expire-password $protocol [ now | never | $seconds ]

Comments?


I would be happy with this.  I don't mind a bit of policy creeping into 
qemu as long as we're exposing the underlying mechanisms.


If it were me, I'd do:

set-password $protocol $secret
unset-password $protocol
expire-password [never | $seconds]

And I would implement expire-password in terms of unset-password.

Regards,

Anthony Liguori


cheers,
  Gerd







Re: [Qemu-devel] [PATCH] stop the iteration when too many pages is transferred

2010-11-20 Thread Anthony Liguori
On 11/17/2010 08:32 PM, Wen Congyang wrote:
> When the total sent page size is larger than max_factor
> times of the size of guest OS's memory, stop the
> iteration.
> The default value of max_factor is 3.
>
> This is similar to XEN.
>
>
> Signed-off-by: Wen Congyang
>   

I'm strongly opposed to doing this. I think Xen gets this totally wrong.

Migration is a contract. When you set the stop time, you're saying that
you want only want the guest to experience a fixed amount of downtime.
Stopping the guest after some arbitrary number of iterations makes the
downtime non-deterministic. With a very large guest, this could wreak
havoc causing dropped networking connections, etc.

It's totally unsafe.

If a management tool wants this behavior, they can set a timeout and
explicitly stop the guest during the live migration. IMHO, such a
management tool is not doing it's job properly but it still can be
implemented.

Regards,

Anthony Liguori

> ---
>  arch_init.c |   13 -
>  1 files changed, 12 insertions(+), 1 deletions(-)
>
> diff --git a/arch_init.c b/arch_init.c
> index 4486925..67e90f8 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -212,6 +212,14 @@ uint64_t ram_bytes_total(void)
>  return total;
>  }
>  
> +static uint64_t ram_blocks_total(void)
> +{
> +return ram_bytes_total() / TARGET_PAGE_SIZE;
> +}
> +
> +static uint64_t blocks_transferred = 0;
> +static int max_factor = 3;
> +
>  int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
>  {
>  ram_addr_t addr;
> @@ -234,6 +242,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
> void *opaque)
>  bytes_transferred = 0;
>  last_block = NULL;
>  last_offset = 0;
> +blocks_transferred = 0;
>  
>  /* Make sure all dirty bits are set */
>  QLIST_FOREACH(block, &ram_list.blocks, next) {
> @@ -266,6 +275,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
> void *opaque)
>  
>  bytes_sent = ram_save_block(f);
>  bytes_transferred += bytes_sent;
> +blocks_transferred += !!bytes_sent;
>  if (bytes_sent == 0) { /* no more blocks */
>  break;
>  }
> @@ -295,7 +305,8 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, 
> void *opaque)
>  
>  expected_time = ram_save_remaining() * TARGET_PAGE_SIZE / bwidth;
>  
> -return (stage == 2) && (expected_time <= migrate_max_downtime());
> +return (stage == 2) && ((expected_time <= migrate_max_downtime())
> +|| (blocks_transferred > ram_blocks_total() * max_factor));
>  }
>  
>  static inline void *host_from_stream_offset(QEMUFile *f,
>   




Re: [Qemu-devel] [PATCH 1/1] NBD isn't used by qemu-img, so don't link qemu-img against NBD objects

2010-11-20 Thread Andreas Färber

Am 19.11.2010 um 17:30 schrieb jes.soren...@redhat.com:


From: Jes Sorensen 

Signed-off-by: Jes Sorensen 
---
Makefile  |2 +-
Makefile.objs |   12 ++--
2 files changed, 11 insertions(+), 3 deletions(-)


Tested-by: Andreas Färber 

Looks good to me and a clean build works okay.

Any plans for a way to disable NBD build completely? There are  
warnings about use of daemon() on Mac OS X and possibly Solaris, and  
there's little point in building qemu-nbd if one does not use it.


Andreas


Re: [Qemu-devel] [PATCH 0/3] add hotplug opt-out option for devices.

2010-11-20 Thread Gleb Natapov
On Fri, Nov 19, 2010 at 08:30:35PM -0600, Anthony Liguori wrote:
> On 11/18/2010 04:45 AM, Gerd Hoffmann wrote:
> >   Hi,
> >
> >This patch series adds a qdev flag which allows devices being tagged as
> >not hotpluggable.  It also sets this flag for a number of devices.
> >
> 
> I understand why you're adding this but this is one of those
> horrible abuses of qdev that we really need to avoid.
> 
> There are two valid reasons why hotplug is not possible:
> 
> 1) Hotplugging is not supported by the *slot*.  This is something
> that needs to be exposes through ACPI.  This is not a qdev property,
> but a property of a PCI slot.  It's very important that we do this
> correctly because Windows puts a little icon in the systray that
> advertises quick-removal of devices in slots that support hotplug.
> 
> 2) The PCI device is soldered to the MB or is otherwise not part of
> a PCI slot.  Again, this is part of the ACPI definition.
> 
This patch series introduce "soldered" property for qdev device. It
means that device is not removable from the MB and thus can't be deleted
by issuing delete command in qemu monitor.

> Since the PIIX3 lives in slot 1, our ACPI tables should not
> advertise slot 0 or slot 1 as supporting hotplug.
> 
ACPI table may not advertise it, but it will not prevent removing of
PIIX3 by device_del command (or how it is called this days).

> Hotplug information has no business being part of the core qdev
> structures.  Hotplug is a PCI concept and the information needs to
> live at the PCI layer to be meaningfully.
> 
I am not sure that PCI is the only bus that supports hot-plug. USB and
SCSI support it to and I am sure many others.

> An ideal interface would explicitly allow a user to mark a series of
> PCI slots as no supporting hotplug.  It would be convenient in order
> to ensure that your virtio-net wasn't accidentally ejected by a
> click-happy Windows user.
> 
> Regards,
> 
> Anthony Liguori
> 
> >Gerd Hoffmann (3):
> >   qdev: allow devices being tagged as not hotpluggable.
> >   piix: tag as non-hotpluggable.
> >   vga: tag as not hotplugable.
> >
> >  hw/acpi_piix4.c |2 ++
> >  hw/cirrus_vga.c |1 +
> >  hw/ide/piix.c   |2 ++
> >  hw/piix4.c  |1 +
> >  hw/piix_pci.c   |2 ++
> >  hw/qdev.c   |   16 +---
> >  hw/qdev.h   |1 +
> >  hw/vga-pci.c|1 +
> >  hw/vmware_vga.c |1 +
> >  qerror.c|4 
> >  qerror.h|3 +++
> >  11 files changed, 31 insertions(+), 3 deletions(-)
> >
> >
> >
> 

--
Gleb.



Re: [Qemu-devel] [PATCH 1/1] NBD isn't used by qemu-img, so don't link qemu-img against NBD objects

2010-11-20 Thread Stefan Hajnoczi
On Sat, Nov 20, 2010 at 5:22 PM, Andreas Färber  wrote:
> Any plans for a way to disable NBD build completely? There are warnings
> about use of daemon() on Mac OS X and possibly Solaris, and there's little
> point in building qemu-nbd if one does not use it.

daemon() could be replaced by sharing os_daemonize().

What is the warning message that you get?

Stefan



Re: [Qemu-devel] [PATCH 1/1] NBD isn't used by qemu-img, so don't link qemu-img against NBD objects

2010-11-20 Thread Andreas Färber

Am 20.11.2010 um 18:39 schrieb Stefan Hajnoczi:

On Sat, Nov 20, 2010 at 5:22 PM, Andreas Färber > wrote:
Any plans for a way to disable NBD build completely? There are  
warnings
about use of daemon() on Mac OS X and possibly Solaris, and there's  
little

point in building qemu-nbd if one does not use it.


daemon() could be replaced by sharing os_daemonize().

What is the warning message that you get?


  CCqemu-nbd.o
/Users/andreas/QEMU/qemu/qemu-nbd.c: In function ‘main’:
/Users/andreas/QEMU/qemu/qemu-nbd.c:364: warning: ‘daemon’ is  
deprecated (declared at /usr/include/stdlib.h:283)


http://developer.apple.com/library/mac/#documentation/Darwin/Reference/ 
ManPages/man3/daemon.3.html


Andreas


[Qemu-devel] Re: [PATCH] pci: Replace unneeded type casts in calls of pci_register_bar

2010-11-20 Thread Michael S. Tsirkin
On Fri, Nov 19, 2010 at 07:29:07PM +0100, Stefan Weil wrote:
> There is no need for these type casts (as other existing
> code shows). So re-write the first argument without
> type cast (and remove a related TODO comment).
> 
> Cc: Michael S. Tsirkin 
> Signed-off-by: Stefan Weil 

Thanks, applied.

> ---
>  hw/cirrus_vga.c |4 ++--
>  hw/e1000.c  |4 ++--
>  hw/ide/via.c|2 +-
>  hw/lsi53c895a.c |7 +++
>  hw/openpic.c|2 +-
>  hw/usb-ohci.c   |2 +-
>  6 files changed, 10 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/cirrus_vga.c b/hw/cirrus_vga.c
> index aadc56f..40be55d 100644
> --- a/hw/cirrus_vga.c
> +++ b/hw/cirrus_vga.c
> @@ -3204,10 +3204,10 @@ static int pci_cirrus_vga_initfn(PCIDevice *dev)
>   /* memory #0 LFB */
>   /* memory #1 memory-mapped I/O */
>   /* XXX: s->vga.vram_size must be a power of two */
> - pci_register_bar((PCIDevice *)d, 0, 0x200,
> + pci_register_bar(&d->dev, 0, 0x200,
>PCI_BASE_ADDRESS_MEM_PREFETCH, cirrus_pci_lfb_map);
>   if (device_id == CIRRUS_ID_CLGD5446) {
> - pci_register_bar((PCIDevice *)d, 1, CIRRUS_PNPMMIO_SIZE,
> + pci_register_bar(&d->dev, 1, CIRRUS_PNPMMIO_SIZE,
>PCI_BASE_ADDRESS_SPACE_MEMORY, 
> cirrus_pci_mmio_map);
>   }
>   return 0;
> diff --git a/hw/e1000.c b/hw/e1000.c
> index 7811699..57d08cf 100644
> --- a/hw/e1000.c
> +++ b/hw/e1000.c
> @@ -1133,10 +1133,10 @@ static int pci_e1000_init(PCIDevice *pci_dev)
>  d->mmio_index = cpu_register_io_memory(e1000_mmio_read,
>  e1000_mmio_write, d);
>  
> -pci_register_bar((PCIDevice *)d, 0, PNPMMIO_SIZE,
> +pci_register_bar(&d->dev, 0, PNPMMIO_SIZE,
> PCI_BASE_ADDRESS_SPACE_MEMORY, e1000_mmio_map);
>  
> -pci_register_bar((PCIDevice *)d, 1, IOPORT_SIZE,
> +pci_register_bar(&d->dev, 1, IOPORT_SIZE,
> PCI_BASE_ADDRESS_SPACE_IO, ioport_map);
>  
>  memmove(d->eeprom_data, e1000_eeprom_template,
> diff --git a/hw/ide/via.c b/hw/ide/via.c
> index b2c7cad..2001a36 100644
> --- a/hw/ide/via.c
> +++ b/hw/ide/via.c
> @@ -153,7 +153,7 @@ static int vt82c686b_ide_initfn(PCIDevice *dev)
>  pci_set_long(pci_conf + PCI_CAPABILITY_LIST, 0x00c0);
>  
>  qemu_register_reset(via_reset, d);
> -pci_register_bar((PCIDevice *)d, 4, 0x10,
> +pci_register_bar(&d->dev, 4, 0x10,
> PCI_BASE_ADDRESS_SPACE_IO, bmdma_map);
>  
>  vmstate_register(&dev->qdev, 0, &vmstate_ide_pci, d);
> diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
> index f97335e..1aef62f 100644
> --- a/hw/lsi53c895a.c
> +++ b/hw/lsi53c895a.c
> @@ -2177,12 +2177,11 @@ static int lsi_scsi_init(PCIDevice *dev)
>  s->ram_io_addr = cpu_register_io_memory(lsi_ram_readfn,
>  lsi_ram_writefn, s);
>  
> -/* TODO: use dev and get rid of cast below */
> -pci_register_bar((struct PCIDevice *)s, 0, 256,
> +pci_register_bar(&s->dev, 0, 256,
> PCI_BASE_ADDRESS_SPACE_IO, lsi_io_mapfunc);
> -pci_register_bar((struct PCIDevice *)s, 1, 0x400,
> +pci_register_bar(&s->dev, 1, 0x400,
> PCI_BASE_ADDRESS_SPACE_MEMORY, lsi_mmio_mapfunc);
> -pci_register_bar((struct PCIDevice *)s, 2, 0x2000,
> +pci_register_bar(&s->dev, 2, 0x2000,
> PCI_BASE_ADDRESS_SPACE_MEMORY, lsi_ram_mapfunc);
>  QTAILQ_INIT(&s->queue);
>  
> diff --git a/hw/openpic.c b/hw/openpic.c
> index 01bf15f..f6b8f21 100644
> --- a/hw/openpic.c
> +++ b/hw/openpic.c
> @@ -1197,7 +1197,7 @@ qemu_irq *openpic_init (PCIBus *bus, int *pmem_index, 
> int nb_cpus,
>  pci_conf[0x3d] = 0x00; // no interrupt pin
>  
>  /* Register I/O spaces */
> -pci_register_bar((PCIDevice *)opp, 0, 0x4,
> +pci_register_bar(&opp->pci_dev, 0, 0x4,
> PCI_BASE_ADDRESS_SPACE_MEMORY, &openpic_map);
>  } else {
>  opp = qemu_mallocz(sizeof(openpic_t));
> diff --git a/hw/usb-ohci.c b/hw/usb-ohci.c
> index c60fd8d..8fb2f83 100644
> --- a/hw/usb-ohci.c
> +++ b/hw/usb-ohci.c
> @@ -1741,7 +1741,7 @@ static int usb_ohci_initfn_pci(struct PCIDevice *dev)
>  ohci->state.irq = ohci->pci_dev.irq[0];
>  
>  /* TODO: avoid cast below by using dev */
> -pci_register_bar((struct PCIDevice *)ohci, 0, 256,
> +pci_register_bar(&ohci->pci_dev, 0, 256,
> PCI_BASE_ADDRESS_SPACE_MEMORY, ohci_mapfunc);
>  return 0;
>  }
> -- 
> 1.7.2.3



Re: [Qemu-devel] [PATCH 1/1] NBD isn't used by qemu-img, so don't link qemu-img against NBD objects

2010-11-20 Thread Stefan Hajnoczi
On Sat, Nov 20, 2010 at 6:04 PM, Andreas Färber  wrote:
> Am 20.11.2010 um 18:39 schrieb Stefan Hajnoczi:
>
>> On Sat, Nov 20, 2010 at 5:22 PM, Andreas Färber 
>> wrote:
>>>
>>> Any plans for a way to disable NBD build completely? There are warnings
>>> about use of daemon() on Mac OS X and possibly Solaris, and there's
>>> little
>>> point in building qemu-nbd if one does not use it.
>>
>> daemon() could be replaced by sharing os_daemonize().
>>
>> What is the warning message that you get?
>
>  CC    qemu-nbd.o
> /Users/andreas/QEMU/qemu/qemu-nbd.c: In function ‘main’:
> /Users/andreas/QEMU/qemu/qemu-nbd.c:364: warning: ‘daemon’ is deprecated
> (declared at /usr/include/stdlib.h:283)
>
> http://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man3/daemon.3.html

Deprecated in favor of using launchd.

Removing qemu-nbd from the build because there is warning isn't a good
strategy.  You may not use qemu-io either but it is always built.
That's important because otherwise it could bitrot, no one would
notice, and one day qemu-tools wouldn't work on Mac OSX at all
anymore.  Instead we should fix the code that causes a warning.

Is it cheating much to daemonize manually in qemu-nbd? ;)

Stefan



Re: [Qemu-devel] macaddr doesn't work

2010-11-20 Thread Mulyadi Santosa
Hi...

2010/11/20 郭沐錫 :
> Dear all
> Sorry, I was wrong.
> I was too hurry to result in that I don't understand that command.
>
> By "ifconfig -a", I found other eth...
> Thank you to Mulyadi.

No problem...I believe you had important lesson here :)

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com



Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device

2010-11-20 Thread Michael S. Tsirkin
On Fri, Nov 19, 2010 at 10:38:42PM +0200, Gleb Natapov wrote:
> On Fri, Nov 19, 2010 at 06:02:58PM +0100, Markus Armbruster wrote:
> > "Michael S. Tsirkin"  writes:
> > 
> > > On Tue, Nov 09, 2010 at 11:41:43AM +0900, Isaku Yamahata wrote:
> > >> On Mon, Nov 08, 2010 at 06:26:33PM +0200, Michael S. Tsirkin wrote:
> > >> > Replace bus number with slot numbers of parent bridges up to the root.
> > >> > This works for root bridge in a compatible way because bus number there
> > >> > is hard-coded to 0.
> > >> > IMO nested bridges are broken anyway, no way to be compatible there.
> > >> > 
> > >> > 
> > >> > Gleb, Markus, I think the following should be sufficient for PCI.  What
> > >> > do you think?  Also - do we need to update QMP/monitor to teach them to
> > >> > work with these paths?
> > >> > 
> > >> > This is on top of Alex's patch, completely untested.
> > >> > 
> > >> > 
> > >> > pci: fix device path for devices behind nested bridges
> > >> > 
> > >> > We were using bus number in the device path, which is clearly
> > >> > broken as this number is guest-assigned for all devices
> > >> > except the root.
> > >> > 
> > >> > Fix by using hierarchical list of slots, walking the path
> > >> > from root down to device, instead. Add :00 as bus number
> > >> > so that if there are no nested bridges, this is compatible
> > >> > with what we have now.
> > >> 
> > >> This format, Domain:00:Slot:Slot:Slot.Function, doesn't work
> > >> because pci-to-pci bridge is pci function.
> > >> So the format should be
> > >> Domain:00:Slot.Function:Slot.Function:Slot.Function
> > >> 
> > >> thanks,
> > >
> > > Hmm, interesting. If we do this we aren't backwards compatible
> > > though, so maybe we could try using openfirmware paths, just as well.
> > 
> > Whatever we do, we need to make it work for all (qdevified) devices and
> > buses.
> > 
> > It should also be possible to use canonical addressing with device_add &
> > friends.  I.e. permit naming a device by (a unique abbreviation of) its
> > canonical address in addition to naming it by its user-defined ID.  For
> > instance, something like
> > 
> >device_del /pci/@1,1
> > 
> FWIW openbios allows this kind of abbreviation.
> 
> > in addition to
> > 
> >device_del ID
> > 
> > Open Firmware is a useful source of inspiration there, but should it
> > come into conflict with usability, we should let usability win.
> 
> --
>   Gleb.


I think that the domain (PCI segment group), bus, slot, function way to
address pci devices is still the most familiar and the easiest to map to
functionality in the guests.  Qemu is buggy in the moment in that is
uses the bus addresses assigned by guest and not the ones in ACPI,
but that can be fixed.

That should be enough for e.g. device_del. We do have the need to
describe the topology when we interface with firmware, e.g. to describe
the ACPI tables themselves to qemu (this is what Gleb's patches deal
with), but that's probably the only case.

-- 
MST



Re: [Qemu-devel] [PULL] vhost, netdev, e1000, pci

2010-11-20 Thread Luiz Capitulino
On Tue, 16 Nov 2010 15:16:20 +0200
"Michael S. Tsirkin"  wrote:

> Here are some fixes I collected in my tree.
> Please merge.
> 
> The following changes since commit 5fc9cfedfa09199e10b5f9b67dcd286bfeae4f7a:
> 
>   Fold send_all() wrapper unix_write() into one function (2010-11-03 12:48:09 
> -0500)
> 
> are available in the git repository at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu.git for_anthony
> 
> Alex Williamson (2):
>   e1000: Fix TCP checksum overflow with TSO
>   PCI: Bus number from the bridge, not the device
> 
> Michael S. Tsirkin (3):
>   tap: clear vhost_net backend on cleanup
>   tap: make set_offload a nop after netdev cleanup
>   pci: allow hotplug removal of cold-plugged devices

IMHO it's good practice to re-send patches in the pull request itself,
so that:

 1. the submitter knows his/her patches changed from state 'applied in
maintainer's tree' to 'submitted for merging in master'

 2. you minimize the risk of merging something that wasn't sent to the
list first

 3. people have a chance to review what you picked up, in case they didn't



[Qemu-devel] Re: [PATCH] ahci: fix lst+fis mapping

2010-11-20 Thread Alexander Graf

On 20.11.2010, at 00:06, Gerd Hoffmann wrote:

> The ahci map_page() function checks whenever it got a full page mapped.
> This is wrong.   The data structures are much smaller:  command list is
> 1k and fis is 256 bytes.  Checking whenever we can access that much
> bytes without crossing a page border is good enougth.

Looks good :). Do you want me to incorporate this with the next revision of my 
patch set or keep it separate?


Alex




Re: [Qemu-devel] [PATCH 00/10] AHCI emulation support v2

2010-11-20 Thread Alexander Graf

On 19.11.2010, at 14:46, Kevin Wolf wrote:

> Am 19.11.2010 14:08, schrieb Alexander Graf:
>> 
>> On 19.11.2010, at 10:15, Kevin Wolf wrote:
>> 
>>> Am 18.11.2010 19:43, schrieb Alexander Graf:
> Then I believe that core.c is now a mixture of some generic ATA code
> (that is also used by SATA) and the Legacy IDE code. SATA doesn't seem
> to interact with the generic code through clean interfaces, but by
> accessing internal data structures and calls to somewhere in the middle
> of the existing IDE emultion. I think we should get a clean abstraction
> there and have a clean split between SATA, PATA and common code, with
> each of them sitting in its own file in hw/ide.
> 
> I haven't reviewed the patches in detail but just had a quick look at
> them, so my impressions might be wrong. If so, please correct me.
 
 No, you're completely right. We're in a chicken and egg situation. We 
 don't have ahci, but the ide code is ugly. We would probably do a bad job 
 at refactoring the ata code if we don't know which interfaces to design 
 for.
>>> 
>>> That problem is solved. You have posted patches, so you're aware what
>>> interfaces you need for AHCI. This awareness doesn't come from putting
>>> the code into git master.
>> 
>> I guess you should go back and read the "this doesn't work yet" list. There 
>> is a lot of stuff that I'm not sure we have all correctly sorted out. The 
>> most intrusive one on that side is the legacy IDE compatibility. I don't 
>> know what interfaces and what changes we will need for that to become 
>> realistic.
> 
> Fair enough. I'll accept that we can't get it sorted out now, but let's
> try to do the part that we can do. Let's change the split to SATA
> (ahci.c), Legacy IDE (ide.c?), common code (ata.c) and "don't know yet"
> (core.c).
> 
> A start for that would be if in Patch 2 you moved the function to ata.c
> instead of just reindenting. We're also probably pretty sure that, for
> example, the I/O port handling won't need to be shared and can be moved
> to ide.c. Whenever it becomes clear for a part in which category it
> belongs, we would move it out of core.c and eventually, I hope, core.c
> could be removed.

I can certainly move out obviously pata specific pieces to a new file called 
pata.c. As for the split between ata.c and core.c, I don't think it's useful. 
Once we moved everything pata specific out or core.c, that file will 
essentially be ata.c.

> 
>> Also to catch up on Gerd's point - whatever refactoring we do, we will 
>> basically have to break migration. There is no way we can change all the 
>> internal state and structure and maintain binary compatibility with the old 
>> save states.
> 
> Hm, breaking migration isn't really an option. I'm not familiar with
> migration code, but maybe Juan can contribute some magic?
> 
> Speaking of migration, that seems to be missing for the AHCI yet, too.
> Are you planning to complete the functionality first before you start
> with that?

I'm planning to take baby steps so others can contribute and I don't keep a 
patch set lying around become more invasive and thus more prone to bitrot every 
day :). I myself just don't scale well enough for a feature this intrusive and 
important.

I had the code bitrot for quite a bit already btw. GSoC ended a couple months 
ago and I just didn't get around to polish the code well enough for upstream 
submission. And believe me, it rots very fast.

Vmstate is an issue we need to solve. I'm not sure what the right way forward 
for that would be. I certainly would not recommend declaring the migration 
protocol for sata even remotely stable for the time being, as we're missing 
crucial pieces still that might require major restructuring or even duplicating 
of core ide code. And as long as that's the case, I'm not sure how much sense 
it really makes to have any at all.

Basically, if there was a CONFIG_EXPERIMENTAL flag, I would set it on ahci. The 
code is not and will not be 100% stable and well structures and reliable within 
the next probably 1/2 year to year. But we need to start walking into a 
direction where it can finally end up being there some day, and that only works 
by multiple people working together on this, preferably upstream, so we don't 
collide with other possible ide work.

Of course there's some chance that it won't get there. If there is interest in 
it though, it will. And from what I've gathered so far, there is interest, as 
it's a speedup for a lot of guests without the need for special drivers. If it 
just lies around without getting improved, rip it out again :). If nobody works 
on it, nobody uses it.


Alex




Re: [Qemu-devel] [PATCH 00/10] AHCI emulation support v2

2010-11-20 Thread Alexander Graf

On 19.11.2010, at 10:12, Stefan Hajnoczi wrote:

> On Thu, Nov 18, 2010 at 11:24 PM, Alexander Graf  wrote:
>> linux-uztg:~ # dd if=/dev/sdc of=/dev/null bs=10M count=300 iflag=direct
> 
> That's a big block size.  bs=8k is interesting too because we see the
> per-request overhead.  Since IDE, SATA, and virtio have different
> hardware interfaces that the guest drives we may see an interesting
> picture with small block sizes.

IDE: 60MB/s
SATA: 100MB/s
virtio: 120 MB/s

I had a pretty big jitter on that one though, with SATA going between 80MB and 
120MB :). But overall, it's the same picture as the one with big block sizes.


Alex