[Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Blue Swirl
On Sun, Feb 13, 2011 at 1:15 AM, Benjamin Herrenschmidt
 wrote:
> On Sun, 2011-02-13 at 00:52 +0200, Blue Swirl wrote:
>> On Sat, Feb 12, 2011 at 11:00 PM, Benjamin Herrenschmidt
>>  wrote:
>> > On Sat, 2011-02-12 at 18:59 +0200, Blue Swirl wrote:
>> >>
>> >> Actually I don't quite understand the need for vty layer, why not use
>> >> the chardev here directly?
>> >
>> > I'm not sure what you mean here...
>>
>> Maybe it would be reasonable to leave h_put_term_char to spapr_hcall.c
>> instead of moving those to a separate file.
>
> Well, the VIO device instance gives the chardev instance which is all
> nicely encapsulated inside spapr-vty... Also VIO devices tend to have
> dedicated hcalls, not only VTY, so it makes a lot of sense to keep them
> close to the rest of the VIO driver they belong to don't you think ?
>
> (Actually veth does, vscsi uses the more "generic" CRQ stuff which we've
> added to the core VIO but you'll see that when we get those patches
> ready, hopefully soon).

This is a bit of a special case, much like semihosting modes for m68k
or ARM, or like MOL hacks which were removed recently. From QEMU point
of view, the most natural way of handling this would be hypervisor
implemented in the guest side (for example BIOS). Then the hypervisor
would use normal IO (or virtio) to communicate with the host. If
inside QEMU, the interface of the hypervisor to the devices needs some
thought. We'd like to avoid ugly interfaces like vmmouse where a
device probes CPU registers directly or spaghetti interfaces like
APIC.

> Actually, one thing I noticed is that the current patches David posted
> still have a single function with a switch/case statement for hcalls.
>
> I'm not 100% certain what David long term plans are here, but in our
> internal "WIP" tree, we've subsequently turned that into a mechanism
> where any module can call powerpc_register_hypercall() to add hcalls.
>
> So if David intends to move the "upstream candidate" tree in that
> direction, then naturally, the calls in spapr_hcall.c are going to
> disappear in favor of a pair of powerpc_register_hypercall() locally in
> the vty module.

Is the interface new design, or are you implementing what is used also
on real HW?



Re: [Qemu-devel] [PATCH 1/2] linux-user: Define target alignment size

2011-02-13 Thread Blue Swirl
On Sun, Feb 13, 2011 at 4:22 AM, Laurent Vivier  wrote:
> Datatype alignment can be found using following application:
>
> int main(void)
> {
>        printf("alignof(short) %ld\n", __alignof__(short));
>        printf("alignof(int) %ld\n", __alignof__(int));
>        printf("alignof(long) %ld\n", __alignof__(long));
>        printf("alignof(long long) %ld\n", __alignof__(long long));
> }
>
> This patch includes following alignments:
>
> i386
>
>   alignof(short) 2
>   alignof(int) 4
>   alignof(long) 4
>   alignof(long long) 8
>
>  x86_64
>
>   alignof(short) 2
>   alignof(int) 4
>   alignof(long) 8
>   alignof(long long) 8
>
>  arm
>
>   alignof(short) 2
>   alignof(int) 4
>   alignof(long) 4
>   alignof(long long) 4
>
>  m68k (680x0)
>
>   alignof(short) 2
>   alignof(int) 2
>   alignof(long) 2
>   alignof(long long) 2
>
>  mips
>
>   alignof(short) 2
>   alignof(int) 4
>   alignof(long) 4
>   alignof(long long) 8
>
>  ppc
>
>   alignof(short) 2
>   alignof(int) 4
>   alignof(long) 4
>   alignof(long long) 8
>
> for other targets, use by default (2,4,4,8).
>
> Please, update for your favorite target...

For Sparc32 (I think also sparc32plus), the default is OK.

For Sparc64, please use 2, 4, 8, 8. I'd guess other 64 bit platforms
(Alpha, MIPS64, PPC64 etc) should use the same.

Does GCC produce correct code using the attributes on strictly aligned
host, when the target is less strictly aligned?

Should the alignment of floating point variables be specified as well?
The strict alignment required for doubles is 4, but recommended
alignment is 8, I'm not sure which one is used for structures
containing doubles.



Re: [Qemu-devel] Binary Translation hooking - reading registers

2011-02-13 Thread Blue Swirl
On Sun, Feb 13, 2011 at 5:48 AM, felix.matenaar@rwth-aachen
 wrote:
> Hello everyone,
>
> i am working on a project adding instrumentation into qemu. My approach
> is to use gen_helper stuff do hook specific opcodes like call or ret to
> gain information about running processes in the virtual machine.
>
> Today I noticed that the CPUState* env is not in all cases up-to-date
> when my hooks are called on block execution. That makes totally sense
> since blocks are natively executed in one step as far as I understood so
> there is no code which would keep the cpu environment up-to-date.
>
> To achieve my goal, it is necessary being able reading actual register
> configuration like eax when a ret hook is called to get a function
> return value. So my question is how I can do this. Are there already
> some functions which generate code to update the cpu environment? If
> not, is there anything you can point me towards for adding support?

Without seeing your code, you are probably confusing translation phase
and executing the code generated by TCG.



Re: [Qemu-devel] [PATCH] PS/2 keyboard Scancode Set 3 support

2011-02-13 Thread Blue Swirl
On Sun, Feb 13, 2011 at 9:32 AM, Roy Tam  wrote:
> The following patch adds PS/2 keyboard Scancode Set 3 support.
>
> Sign-off-by: Roy Tam 
> --
> diff --git a/hw/ps2.c b/hw/ps2.c
> index 762bb00..4b73967 100644
> --- a/hw/ps2.c
> +++ b/hw/ps2.c
> @@ -143,12 +143,87 @@ static void ps2_put_keycode(void *opaque, int keycode)
>  {
>     PS2KbdState *s = opaque;
>
> -    /* XXX: add support for scancode sets 1 and 3 */
> -    if (!s->translate && keycode < 0xe0 && s->scancode_set == 2)
> +    /* XXX: add support for scancode sets 1 */
> +    if (!s->translate && keycode < 0xe0 && s->scancode_set > 1)
>       {

The braces belong to the previous line and the indentation is off.
Please run the patch through scripts/checkpatch.pl.

>         if (keycode & 0x80)
>             ps2_queue(&s->common, 0xf0);
>         keycode = ps2_raw_keycode[keycode & 0x7f];
> +        if (s->scancode_set == 3)
> +          {
> +            switch (keycode)
> +              {
> +                case 0x1:
> +                  keycode = 0x47;
> +                  break;
> +                case 0x3:
> +                  keycode = 0x27;
> +                  break;
> +                case 0x4:
> +                  keycode = 0x17;
> +                  break;
> +                case 0x5:
> +                  keycode = 0x7;
> +                  break;
> +                case 0x6:
> +                  keycode = 0xf;
> +                  break;
> +                case 0x7:
> +                  keycode = 0x5e;
> +                  break;
> +                case 0x9:
> +                  keycode = 0x4f;
> +                  break;
> +                case 0xa:
> +                  keycode = 0x3f;
> +                  break;
> +                case 0xb:
> +                  keycode = 0x2f;
> +                  break;
> +                case 0xc:
> +                  keycode = 0x1f;
> +                  break;
> +                case 0x11:
> +                  keycode = 0x19;
> +                  break;
> +                case 0x14:
> +                  keycode = 0x11;
> +                  break;
> +                case 0x58:
> +                  keycode = 0x14;
> +                  break;
> +                case 0x5d:
> +                  keycode = 0x5c;
> +                  break;
> +                case 0x76:
> +                  keycode = 0x8;
> +                  break;
> +                case 0x77:
> +                  keycode = 0x76;
> +                  break;
> +                case 0x78:
> +                  keycode = 0x56;
> +                  break;
> +                case 0x79:
> +                  keycode = 0x7c;
> +                  break;
> +                case 0x7b:
> +                  keycode = 0x84;
> +                  break;
> +                case 0x7c:
> +                  keycode = 0x7e;
> +                  break;
> +                case 0x7e:
> +                  keycode = 0x5f;
> +                  break;
> +                case 0x83:
> +                  keycode = 0x37;
> +                  break;
> +                case 0x84:
> +                  keycode = 0x57;
> +                  break;
> +              }
> +          }
>       }
>     ps2_queue(&s->common, keycode);
>  }
>
>


Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Gleb Natapov
On Fri, Feb 11, 2011 at 08:14:16PM +0200, Blue Swirl wrote:
> On Thu, Feb 10, 2011 at 6:05 PM, Anthony Liguori  
> wrote:
> > On 02/10/2011 03:20 PM, Gleb Natapov wrote:
> >>
> >> Jugging by how well all previous conversion went we will end up with one
> >> more way of creating devices. One legacy, another qdev and your new one.
> >> And what is the problem with qdev again (not that I am a big qdev fan)?
> >>
> >
> > We've really been arguing about probably the most minor aspect of the
> > problem with qdev.
Probably. But if this aspect affects the overall re-design of device tree it
is not so minor after all.

> >
> > All I'm really saying is that we shouldn't tie device construction to a
> > factory interface as we do with qdev.
> >
> > That simply means that we should be able to do:
> >
> > RTC *rtc_create(arg1, arg2, arg2);
> 
> I don't see how that would help at all. Throwing qdev away and just
> calling various functions directly, with all states exposed would be
> like QEMU 0.9.0.
> 
That is my sentiment exactly. This is what we had before. What was the
reason we wanted device tree (let alone specific implementation of it
that was committed without much external input)? Why do you no longer want
device tree abstraction? 
 
> > And that a separate piece of code decides which devices are exposed through
> > -device or device_add.  Which devices are exposed is really a minor detail.
> >
> > That said, qdev has a number of significant limitations in my mind.  The
> > first is that the only relationship between devices is through the BusState
> > interface.
> 
> There's also qemu_irq for arbitrary signals.
> 
> >  I don't think we should even try to have a generic bus model.
> >  When you look at how badly broken PCI hotplug is current in qdev, I think
> > this is symptomatic of this.
> 
> And how should this be fixed? The API change would not help.
> 
> > There's also no way in qdev to really have polymorphism.  Interfaces really
> > aren't meaningful in qdev so you have things like PCIDevice where some
> > methods are stored in the object instead of the class dispatch table and you
> > have overuse of static class members.
> 
> QEMU is developed in C, not C++.
> 
> > And it's all unrelated to VMState.
> 
> Right, but this has also the good side that not all device state is
> automatically exported. If other devices would be allowed to muck with
> a devices internal state freely, bad things could happen.
> 
> Device reset could also use standard register definitions, shared with 
> VMState.
> 
> > And this is just the basic mechanisms of qdev.  The actual implementation is
> > worse.  The use of qemu_irq as gpio in the base class and overuse of
> > SystemBus is really quite insane.
> 
> Maybe qemu_irq should be renamed to QEMUSignal (and I don't like
> typedeffing pointers), otherwise it looks quite sane to me.
> 
> Could you point to examples of SystemBus overuse?
> 
> > And so far, the use of qdev has been entirely superficial.  Devices still
> > don't make use of bus level interfaces to do I/O so we don't have any better
> > componentization than we did before qdev.
> >
> >> The fact that there is no enough interest to convert all devices to it?
> >>
> >
> > I don't think there is any device that has been improved by qdev.  -device
> > is a nice feature, but it could have been implemented without qdev.
> 
> We have 'info qtree' which can't be implemented easily without a
> generic device class. Avi (or who was it) sent patches to expose even
> more device state.
> 
> With the patches I'm going to apply, if Redhat wants to disable
> building various devices, it can be done without #ifdeffery. This is
> not possible without a generic factory interface.

--
Gleb.



Re: [Qemu-devel] Re: [PATCH 09/15] Parse SDR1 on mtspr instead of at translate time

2011-02-13 Thread David Gibson
On Sat, Feb 12, 2011 at 04:37:46PM +0100, Alexander Graf wrote:
> On 12.02.2011, at 15:54, David Gibson wrote:
[snip]
> > +#define SDR_HTABORG_32 0xUL
> > +#define SDR_HTABMASK   0x01FFUL
> 
> Please mark this constant as ppc32
> 
> > +
> > +#if defined(TARGET_PPC64)
> > +#define SDR_HTABORG_64 0xFFFCULL
> > +#define SDR_HTABSIZE   0x001FULL
> 
> Please mark this constant as ppc64

Um.. I'm not sure what you mean by this.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] RFC: Implement emulation of pSeries logical partitions

2011-02-13 Thread David Gibson
On Sat, Feb 12, 2011 at 05:26:20PM +0100, Laurent Vivier wrote:
> Hi,
> 
> Do you plan to boot AIX in one of these partitions ?

Not really, no.  This is aimed at existing pSeries Linux kernels.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] Re: [PATCH 12/15] Support 1T segments on ppc

2011-02-13 Thread David Gibson
On Sat, Feb 12, 2011 at 04:57:39PM +0100, Alexander Graf wrote:
> On 12.02.2011, at 15:54, David Gibson wrote:
[snip]
> > +if (rb & (0x1000 - env->slb_nr))
> 
> Braces...

Oops, yeah.  These later patches in the series I haven't really
audited for coding style adequately yet.  I'll fix these before the
next version.

[snip]
> > +   return -1; /* 1T segment on MMU that doesn't support it */
> > + 
> > +/* We stuff a copy of the B field into slb->esid to simplify
> > + * lookup later */
> > +slb->esid = (rb & (SLB_ESID_ESID | SLB_ESID_V)) |
> > +(rs >> SLB_VSID_SSIZE_SHIFT);
> 
> Wouldn't it be easier to add another field?

Easier for what?  The reason I put these bits in here is that the rest
of the things slb_lookup() needs to scan for are all in the esid
field, so putting B in there means slb_lookup() needs only one
comparison per-slot, per segment size.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] Re: [PATCH 13/15] Add POWER7 support for ppc

2011-02-13 Thread David Gibson
On Sat, Feb 12, 2011 at 05:09:39PM +0100, Alexander Graf wrote:
> On 12.02.2011, at 15:54, David Gibson wrote:
[snip]
> > +/* Don't generate spurious events */
> > +if ((cur_level == 1 && level == 0) || (cur_level == 0 && level != 0)) {
> 
> Did you hit this? Qemu's irq framework should already ensure that
> property. I'm also not sure it's actually correct - if a level
> interrupt is on, the guest would get another interrupt injected, no?
> That would be cur_level ==1 && level == 1 IIUC.

[snip]
> > +case POWER7_INPUT_CKSTP:
> 
> POWER7 has checkstop?

[snip]
> > +case POWER7_INPUT_HRESET:
> 
> Does this ever get triggered? POWER7 is run in lpar only, so there is no 
> hreset, right?

[snip]
> > +case POWER7_INPUT_TBEN:
> > +LOG_IRQ("%s: set the TBEN state to %d\n", __func__,
> > +level);
> > +/* XXX: TODO */
> 
> Hrm - what is this?

Ah, drat.  I forgot about this.  The POWER7 interrupt stuff I copied
from 970 and them modified minimally to get it working.  I meant to
get around to auditing this stuff to see what was actually relevant to
POWER7.  I'll address this for the next version.

[snip]
> > +#if !defined(CONFIG_USER_ONLY)
> > +env->slb_nr = 32;
> 
> POWER7 has 64, no? Please check this :).

Nope.  POWER4 and POWER5 have 64, but POWER7 has 32.  This one I did
check and change.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



[Qemu-devel] [PATCH v2] PS/2 keyboard Scancode Set 3 support

2011-02-13 Thread Roy Tam
The following patch adds PS/2 keyboard Scancode Set 3 support.

Sign-off-by: Roy Tam 
--
v2: checkpatch.pl style fixes

diff --git a/hw/ps2.c b/hw/ps2.c
index 762bb00..6bea0ef 100644
--- a/hw/ps2.c
+++ b/hw/ps2.c
@@ -143,13 +143,85 @@ static void ps2_put_keycode(void *opaque, int keycode)
 {
 PS2KbdState *s = opaque;

-/* XXX: add support for scancode sets 1 and 3 */
-if (!s->translate && keycode < 0xe0 && s->scancode_set == 2)
-  {
+/* XXX: add support for scancode sets 1 */
+if (!s->translate && keycode < 0xe0 && s->scancode_set > 1) {
 if (keycode & 0x80)
 ps2_queue(&s->common, 0xf0);
 keycode = ps2_raw_keycode[keycode & 0x7f];
-  }
+if (s->scancode_set == 3) {
+switch (keycode) {
+case 0x1:
+keycode = 0x47;
+break;
+case 0x3:
+keycode = 0x27;
+break;
+case 0x4:
+keycode = 0x17;
+break;
+case 0x5:
+keycode = 0x7;
+break;
+case 0x6:
+keycode = 0xf;
+break;
+case 0x7:
+keycode = 0x5e;
+break;
+case 0x9:
+keycode = 0x4f;
+break;
+case 0xa:
+keycode = 0x3f;
+break;
+case 0xb:
+keycode = 0x2f;
+break;
+case 0xc:
+keycode = 0x1f;
+break;
+case 0x11:
+keycode = 0x19;
+break;
+case 0x14:
+keycode = 0x11;
+break;
+case 0x58:
+keycode = 0x14;
+break;
+case 0x5d:
+keycode = 0x5c;
+break;
+case 0x76:
+keycode = 0x8;
+break;
+case 0x77:
+keycode = 0x76;
+break;
+case 0x78:
+keycode = 0x56;
+break;
+case 0x79:
+keycode = 0x7c;
+break;
+case 0x7b:
+keycode = 0x84;
+break;
+case 0x7c:
+keycode = 0x7e;
+break;
+case 0x7e:
+keycode = 0x5f;
+break;
+case 0x83:
+keycode = 0x37;
+break;
+case 0x84:
+keycode = 0x57;
+break;
+}
+}
+}
 ps2_queue(&s->common, keycode);
 }



Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread David Gibson
On Sun, Feb 13, 2011 at 10:08:23AM +0200, Blue Swirl wrote:
> On Sun, Feb 13, 2011 at 1:15 AM, Benjamin Herrenschmidt
>  wrote:
> > On Sun, 2011-02-13 at 00:52 +0200, Blue Swirl wrote:
> >> On Sat, Feb 12, 2011 at 11:00 PM, Benjamin Herrenschmidt
> >>  wrote:
> >> > On Sat, 2011-02-12 at 18:59 +0200, Blue Swirl wrote:
> >> >>
> >> >> Actually I don't quite understand the need for vty layer, why not use
> >> >> the chardev here directly?
> >> >
> >> > I'm not sure what you mean here...
> >>
> >> Maybe it would be reasonable to leave h_put_term_char to spapr_hcall.c
> >> instead of moving those to a separate file.
> >
> > Well, the VIO device instance gives the chardev instance which is all
> > nicely encapsulated inside spapr-vty... Also VIO devices tend to have
> > dedicated hcalls, not only VTY, so it makes a lot of sense to keep them
> > close to the rest of the VIO driver they belong to don't you think ?
> >
> > (Actually veth does, vscsi uses the more "generic" CRQ stuff which we've
> > added to the core VIO but you'll see that when we get those patches
> > ready, hopefully soon).
> 
> This is a bit of a special case, much like semihosting modes for m68k
> or ARM, or like MOL hacks which were removed recently. From QEMU point
> of view, the most natural way of handling this would be hypervisor
> implemented in the guest side (for example BIOS). Then the hypervisor
> would use normal IO (or virtio) to communicate with the host. If
> inside QEMU, the interface of the hypervisor to the devices needs some
> thought. We'd like to avoid ugly interfaces like vmmouse where a
> device probes CPU registers directly or spaghetti interfaces like
> APIC.

I really don't follow what you're saying here.  Running the hypervisor
in the guest, rather than emulating its effect in qemu seems like an
awful lot of complexity for no clear reason.

> > Actually, one thing I noticed is that the current patches David posted
> > still have a single function with a switch/case statement for hcalls.
> >
> > I'm not 100% certain what David long term plans are here, but in our
> > internal "WIP" tree, we've subsequently turned that into a mechanism
> > where any module can call powerpc_register_hypercall() to add hcalls.
> >
> > So if David intends to move the "upstream candidate" tree in that
> > direction, then naturally, the calls in spapr_hcall.c are going to
> > disappear in favor of a pair of powerpc_register_hypercall() locally in
> > the vty module.
> 
> Is the interface new design, or are you implementing what is used also
> on real HW?

The interface already exists on real HW.  It's described in the PAPR
document we keep mentioning.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread David Gibson
On Sun, Feb 13, 2011 at 10:15:03AM +1100, Benjamin Herrenschmidt wrote:
> On Sun, 2011-02-13 at 00:52 +0200, Blue Swirl wrote:
> > On Sat, Feb 12, 2011 at 11:00 PM, Benjamin Herrenschmidt
[snip]
> Actually, one thing I noticed is that the current patches David posted
> still have a single function with a switch/case statement for hcalls.
> 
> I'm not 100% certain what David long term plans are here, but in our
> internal "WIP" tree, we've subsequently turned that into a mechanism
> where any module can call powerpc_register_hypercall() to add hcalls.
> 
> So if David intends to move the "upstream candidate" tree in that
> direction, then naturally, the calls in spapr_hcall.c are going to
> disappear in favor of a pair of powerpc_register_hypercall() locally in
> the vty module.

Ah, yeah.  I'm still not sure what to do about it.  I was going to
fold the dynamic hcall registration into the patch set before
upstreaming.  But then something paulus said made me rethink whether
the dynamic registration was a good idea.  Still need to sort this out
before the series is really ready.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread David Gibson
On Sat, Feb 12, 2011 at 05:47:53PM +0100, Alexander Graf wrote:
> On 12.02.2011, at 15:54, David Gibson wrote:
[snip] 
> > @@ -267,6 +295,7 @@ static QEMUMachine spapr_machine = {
> > .desc = "pSeries Logical Partition (PAPR compliant)",
> > .init = ppc_spapr_init,
> > .max_cpus = 1,
> > +.no_parallel = 1,
> 
> duplicate?

Oops, rebasing mistake.  Fixed now.

[snip]
> > +VIOsPAPRDevice *spapr_vio_find_by_reg(VIOsPAPRBus *bus, uint32_t reg)
> > +{
> > +DeviceState *qdev;
> > +VIOsPAPRDevice *dev = NULL;
> > +
> > +QLIST_FOREACH(qdev, &bus->bus.children, sibling) {
> > +dev = (VIOsPAPRDevice *)qdev;
> > +if (dev->reg == reg)
> 
> Braces
> 
> > +break;
> > +}
> > +
> > +return dev;
> 
> What if the device doesn't exist?

This returns NULL, the caller returns an error...

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH] eepro100: pad to ensure minimum packet size

2011-02-13 Thread Stefan Weil

Am 11.02.2011 20:36, schrieb Bruce Rogers:

Recent gpxe e100pro drivers will drop small packets because the emulated
nic will report an error for small frames. In the qemu model we should
instead have the e100pro pad out the received frames to be the minimum
size and not report this case as an error.

Signed-off-by: Bruce Rogers 
---
hw/eepro100.c | 23 +--
1 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/hw/eepro100.c b/hw/eepro100.c
index edf48f6..03b6934 100644
--- a/hw/eepro100.c
+++ b/hw/eepro100.c
@@ -1645,6 +1645,8 @@ static int nic_can_receive(VLANClientState *nc)
#endif
}

+#define MIN_BUF_SIZE 60 /* Min. octets in an ethernet frame sans FCS */
+
static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, 
size_t size)

{
/* TODO:
@@ -1653,6 +1655,7 @@ static ssize_t nic_receive(VLANClientState *nc, 
const uint8_t * buf, size_t size

*/
EEPRO100State *s = DO_UPCAST(NICState, nc, nc)->opaque;
uint16_t rfd_status = 0xa000;
+ uint8_t min_buf[MIN_BUF_SIZE];
static const uint8_t broadcast_macaddr[6] =
{ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

@@ -1660,15 +1663,15 @@ static ssize_t nic_receive(VLANClientState 
*nc, const uint8_t * buf, size_t size

/* CSMA is disabled. */
logout("%p received while CSMA is disabled\n", s);
return -1;
- } else if (size < 64 && (s->configuration[7] & BIT(0))) {
- /* Short frame and configuration byte 7/0 (discard short receive) set:
- * Short frame is discarded */
- logout("%p received short frame (%zu byte)\n", s, size);
- s->statistics.rx_short_frame_errors++;
-#if 0
- return -1;
-#endif
- } else if ((size > MAX_ETH_FRAME_SIZE + 4) && !(s->configuration[18] 
& BIT(3))) {

+ }
+ /* Pad to minimum Ethernet frame length */
+ if (size < sizeof(min_buf)) {
+ memcpy(min_buf, buf, size);
+ memset(&min_buf[size], 0, sizeof(min_buf) - size);
+ buf = min_buf;
+ size = sizeof(min_buf);
+ }
+ if ((size > MAX_ETH_FRAME_SIZE + 4) && !(s->configuration[18] & 
BIT(3))) {

/* Long frame and configuration byte 18/3 (long receive ok) not set:
* Long frames are discarded. */
logout("%p received long frame (%zu byte), ignored\n", s, size);
@@ -1744,7 +1747,7 @@ static ssize_t nic_receive(VLANClientState *nc, 
const uint8_t * buf, size_t size

"(%zu bytes); data truncated\n", rfd_size, size);
size = rfd_size;
}
- if (size < 64) {
+ if (size < MIN_BUF_SIZE) {
rfd_status |= 0x0080;
}
TRACE(OTHER, logout("command 0x%04x, link 0x%08x, addr 0x%08x, size %u\n",



Could you please give more details of the test scenario which fails?
I'd like to reproduce it here and find a better solution.

The configuration bit "discard short frame" exists in real hardware,
so removing the code which emulates this behavior is not the correct
solution - even if it helps in a special case.

No acknowledge for this patch from me.

Regards,
Stefan Weil




Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Blue Swirl
On Sun, Feb 13, 2011 at 1:12 PM, David Gibson
 wrote:
> On Sun, Feb 13, 2011 at 10:08:23AM +0200, Blue Swirl wrote:
>> On Sun, Feb 13, 2011 at 1:15 AM, Benjamin Herrenschmidt
>>  wrote:
>> > On Sun, 2011-02-13 at 00:52 +0200, Blue Swirl wrote:
>> >> On Sat, Feb 12, 2011 at 11:00 PM, Benjamin Herrenschmidt
>> >>  wrote:
>> >> > On Sat, 2011-02-12 at 18:59 +0200, Blue Swirl wrote:
>> >> >>
>> >> >> Actually I don't quite understand the need for vty layer, why not use
>> >> >> the chardev here directly?
>> >> >
>> >> > I'm not sure what you mean here...
>> >>
>> >> Maybe it would be reasonable to leave h_put_term_char to spapr_hcall.c
>> >> instead of moving those to a separate file.
>> >
>> > Well, the VIO device instance gives the chardev instance which is all
>> > nicely encapsulated inside spapr-vty... Also VIO devices tend to have
>> > dedicated hcalls, not only VTY, so it makes a lot of sense to keep them
>> > close to the rest of the VIO driver they belong to don't you think ?
>> >
>> > (Actually veth does, vscsi uses the more "generic" CRQ stuff which we've
>> > added to the core VIO but you'll see that when we get those patches
>> > ready, hopefully soon).
>>
>> This is a bit of a special case, much like semihosting modes for m68k
>> or ARM, or like MOL hacks which were removed recently. From QEMU point
>> of view, the most natural way of handling this would be hypervisor
>> implemented in the guest side (for example BIOS). Then the hypervisor
>> would use normal IO (or virtio) to communicate with the host. If
>> inside QEMU, the interface of the hypervisor to the devices needs some
>> thought. We'd like to avoid ugly interfaces like vmmouse where a
>> device probes CPU registers directly or spaghetti interfaces like
>> APIC.
>
> I really don't follow what you're saying here.  Running the hypervisor
> in the guest, rather than emulating its effect in qemu seems like an
> awful lot of complexity for no clear reason.

Maybe it would be more complex but also emulation accuracy would be
increased and the interfaces would be saner. We don't shortcut BIOS
and implement its services to OS in QEMU for other machines either.

I'd expect one problem with that approach though, the interface used
on real HW between the hypervisor and the underlying HW may be
undocumented, but then it could use for example existing virtio
devices.

One way to handle this could be to add the hypervisor interface now to
QEMU and switch to guest hypervisor when (if) it becomes available.
I'd just like to avoid duplication with virtio or messy interfaces
like vmport.



[Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Alexander Graf

On 13.02.2011, at 09:08, Blue Swirl wrote:

> On Sun, Feb 13, 2011 at 1:15 AM, Benjamin Herrenschmidt
>  wrote:
>> On Sun, 2011-02-13 at 00:52 +0200, Blue Swirl wrote:
>>> On Sat, Feb 12, 2011 at 11:00 PM, Benjamin Herrenschmidt
>>>  wrote:
 On Sat, 2011-02-12 at 18:59 +0200, Blue Swirl wrote:
> 
> Actually I don't quite understand the need for vty layer, why not use
> the chardev here directly?
 
 I'm not sure what you mean here...
>>> 
>>> Maybe it would be reasonable to leave h_put_term_char to spapr_hcall.c
>>> instead of moving those to a separate file.
>> 
>> Well, the VIO device instance gives the chardev instance which is all
>> nicely encapsulated inside spapr-vty... Also VIO devices tend to have
>> dedicated hcalls, not only VTY, so it makes a lot of sense to keep them
>> close to the rest of the VIO driver they belong to don't you think ?
>> 
>> (Actually veth does, vscsi uses the more "generic" CRQ stuff which we've
>> added to the core VIO but you'll see that when we get those patches
>> ready, hopefully soon).
> 
> This is a bit of a special case, much like semihosting modes for m68k
> or ARM, or like MOL hacks which were removed recently. From QEMU point
> of view, the most natural way of handling this would be hypervisor
> implemented in the guest side (for example BIOS). Then the hypervisor
> would use normal IO (or virtio) to communicate with the host. If
> inside QEMU, the interface of the hypervisor to the devices needs some
> thought. We'd like to avoid ugly interfaces like vmmouse where a
> device probes CPU registers directly or spaghetti interfaces like
> APIC.

In this case I disagree. While the "emulate real hardware" case would be to 
have a full proprietary binary blob of firmware running in the guest that would 
handle all the hypervisor stuff, this is not feasible. Implementing PAPR in 
Qemu is hard (and slow) enough - doing it all emulated simply is overkill for 
what we're trying to achieve here.

The PAPR interfaces are well specified (in the PAPR spec - seems to be 
power.org member restricted) and are the only thing you ever get to see on 
recent POWER hardware. The real hardware interface is simply inaccessible to 
you.

It's basically the same as the S390 machine we have. On those machine we simply 
don't have access to real hw, so emulating it is moot. All interfaces that the 
OS sees are PV interfaces.

> 
>> Actually, one thing I noticed is that the current patches David posted
>> still have a single function with a switch/case statement for hcalls.
>> 
>> I'm not 100% certain what David long term plans are here, but in our
>> internal "WIP" tree, we've subsequently turned that into a mechanism
>> where any module can call powerpc_register_hypercall() to add hcalls.
>> 
>> So if David intends to move the "upstream candidate" tree in that
>> direction, then naturally, the calls in spapr_hcall.c are going to
>> disappear in favor of a pair of powerpc_register_hypercall() locally in
>> the vty module.
> 
> Is the interface new design, or are you implementing what is used also
> on real HW?

PAPR is the spec that defines the PV interface in use on POWER. Outside of IBM, 
nobody can run Linux on POWER without going through phyp which is their 
hypervisor.

So this implements exactly what the OS sees on real HW :).


Alex




Re: [Qemu-devel] Re: [PATCH 09/15] Parse SDR1 on mtspr instead of at translate time

2011-02-13 Thread Alexander Graf

On 13.02.2011, at 10:02, David Gibson wrote:

> On Sat, Feb 12, 2011 at 04:37:46PM +0100, Alexander Graf wrote:
>> On 12.02.2011, at 15:54, David Gibson wrote:
> [snip]
>>> +#define SDR_HTABORG_32 0xUL
>>> +#define SDR_HTABMASK   0x01FFUL
>> 
>> Please mark this constant as ppc32
>> 
>>> +
>>> +#if defined(TARGET_PPC64)
>>> +#define SDR_HTABORG_64 0xFFFCULL
>>> +#define SDR_HTABSIZE   0x001FULL
>> 
>> Please mark this constant as ppc64
> 
> Um.. I'm not sure what you mean by this.

Well, while to you SDR_HTABMASK and SDR_HTABSIZE are "obviously" meant for 
ppc32/ppc64 respectively, the average code reader won't know the difference. 
What I'm proposing is:

#define SDR_32_HTABORG
#define SDR_32_HTABMASK

#define SDR_64_HTABORG
#define SDR_64_HTABSIZE

This way it's a lot more obvious that the two constants really belong to two 
completely different semantics :).


Alex




Re: [Qemu-devel] Re: [PATCH 12/15] Support 1T segments on ppc

2011-02-13 Thread Alexander Graf

On 13.02.2011, at 10:34, David Gibson wrote:

> On Sat, Feb 12, 2011 at 04:57:39PM +0100, Alexander Graf wrote:
>> On 12.02.2011, at 15:54, David Gibson wrote:
> [snip]
>>> +if (rb & (0x1000 - env->slb_nr))
>> 
>> Braces...
> 
> Oops, yeah.  These later patches in the series I haven't really
> audited for coding style adequately yet.  I'll fix these before the
> next version.
> 
> [snip]
>>> +   return -1; /* 1T segment on MMU that doesn't support it */
>>> + 
>>> +/* We stuff a copy of the B field into slb->esid to simplify
>>> + * lookup later */
>>> +slb->esid = (rb & (SLB_ESID_ESID | SLB_ESID_V)) |
>>> +(rs >> SLB_VSID_SSIZE_SHIFT);
>> 
>> Wouldn't it be easier to add another field?
> 
> Easier for what?  The reason I put these bits in here is that the rest
> of the things slb_lookup() needs to scan for are all in the esid
> field, so putting B in there means slb_lookup() needs only one
> comparison per-slot, per segment size.

Hrm - but it also needs random & ~3 masking in other code which is very 
unpretty. Comparing two numbers really shouldn't hurt performance too much, but 
makes the code better maintainable.

struct slb_entry {
uint64_t esid;
uint64_t vsid;
int b;
}

or so :).

Alex




Re: [Qemu-devel] Re: [PATCH 13/15] Add POWER7 support for ppc

2011-02-13 Thread Alexander Graf

On 13.02.2011, at 10:39, David Gibson wrote:

> On Sat, Feb 12, 2011 at 05:09:39PM +0100, Alexander Graf wrote:
>> On 12.02.2011, at 15:54, David Gibson wrote:
> [snip]
>>> +/* Don't generate spurious events */
>>> +if ((cur_level == 1 && level == 0) || (cur_level == 0 && level != 0)) {
>> 
>> Did you hit this? Qemu's irq framework should already ensure that
>> property. I'm also not sure it's actually correct - if a level
>> interrupt is on, the guest would get another interrupt injected, no?
>> That would be cur_level ==1 && level == 1 IIUC.
> 
> [snip]
>>> +case POWER7_INPUT_CKSTP:
>> 
>> POWER7 has checkstop?
> 
> [snip]
>>> +case POWER7_INPUT_HRESET:
>> 
>> Does this ever get triggered? POWER7 is run in lpar only, so there is no 
>> hreset, right?
> 
> [snip]
>>> +case POWER7_INPUT_TBEN:
>>> +LOG_IRQ("%s: set the TBEN state to %d\n", __func__,
>>> +level);
>>> +/* XXX: TODO */
>> 
>> Hrm - what is this?
> 
> Ah, drat.  I forgot about this.  The POWER7 interrupt stuff I copied
> from 970 and them modified minimally to get it working.  I meant to
> get around to auditing this stuff to see what was actually relevant to
> POWER7.  I'll address this for the next version.
> 
> [snip]
>>> +#if !defined(CONFIG_USER_ONLY)
>>> +env->slb_nr = 32;
>> 
>> POWER7 has 64, no? Please check this :).
> 
> Nope.  POWER4 and POWER5 have 64, but POWER7 has 32.  This one I did
> check and change.

Oh? Interesting. Good to know :)


Alex




Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Alexander Graf

On 13.02.2011, at 12:09, David Gibson wrote:

> On Sat, Feb 12, 2011 at 05:47:53PM +0100, Alexander Graf wrote:
>> On 12.02.2011, at 15:54, David Gibson wrote:
> [snip] 
>>> @@ -267,6 +295,7 @@ static QEMUMachine spapr_machine = {
>>>.desc = "pSeries Logical Partition (PAPR compliant)",
>>>.init = ppc_spapr_init,
>>>.max_cpus = 1,
>>> +.no_parallel = 1,
>> 
>> duplicate?
> 
> Oops, rebasing mistake.  Fixed now.
> 
> [snip]
>>> +VIOsPAPRDevice *spapr_vio_find_by_reg(VIOsPAPRBus *bus, uint32_t reg)
>>> +{
>>> +DeviceState *qdev;
>>> +VIOsPAPRDevice *dev = NULL;
>>> +
>>> +QLIST_FOREACH(qdev, &bus->bus.children, sibling) {
>>> +dev = (VIOsPAPRDevice *)qdev;
>>> +if (dev->reg == reg)
>> 
>> Braces
>> 
>>> +break;
>>> +}
>>> +
>>> +return dev;
>> 
>> What if the device doesn't exist?
> 
> This returns NULL, the caller returns an error...

Makes sense :).


Alex




Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Alexander Graf

On 13.02.2011, at 12:14, David Gibson wrote:

> On Sun, Feb 13, 2011 at 10:15:03AM +1100, Benjamin Herrenschmidt wrote:
>> On Sun, 2011-02-13 at 00:52 +0200, Blue Swirl wrote:
>>> On Sat, Feb 12, 2011 at 11:00 PM, Benjamin Herrenschmidt
> [snip]
>> Actually, one thing I noticed is that the current patches David posted
>> still have a single function with a switch/case statement for hcalls.
>> 
>> I'm not 100% certain what David long term plans are here, but in our
>> internal "WIP" tree, we've subsequently turned that into a mechanism
>> where any module can call powerpc_register_hypercall() to add hcalls.
>> 
>> So if David intends to move the "upstream candidate" tree in that
>> direction, then naturally, the calls in spapr_hcall.c are going to
>> disappear in favor of a pair of powerpc_register_hypercall() locally in
>> the vty module.
> 
> Ah, yeah.  I'm still not sure what to do about it.  I was going to
> fold the dynamic hcall registration into the patch set before
> upstreaming.  But then something paulus said made me rethink whether
> the dynamic registration was a good idea.  Still need to sort this out
> before the series is really ready.

We can surely move it to dynamic later on. I think the "proper" way would be to 
populate a qdev bus and have the individual hypercall receivers register 
themselves through -device creations. But Blue really is the expert here :).


Alex




Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread David Gibson
On Sun, Feb 13, 2011 at 01:40:14PM +0100, Alexander Graf wrote:
> 
> On 13.02.2011, at 12:14, David Gibson wrote:
> 
> > On Sun, Feb 13, 2011 at 10:15:03AM +1100, Benjamin Herrenschmidt wrote:
> >> On Sun, 2011-02-13 at 00:52 +0200, Blue Swirl wrote:
> >>> On Sat, Feb 12, 2011 at 11:00 PM, Benjamin Herrenschmidt
> > [snip]
> >> Actually, one thing I noticed is that the current patches David posted
> >> still have a single function with a switch/case statement for hcalls.
> >> 
> >> I'm not 100% certain what David long term plans are here, but in our
> >> internal "WIP" tree, we've subsequently turned that into a mechanism
> >> where any module can call powerpc_register_hypercall() to add hcalls.
> >> 
> >> So if David intends to move the "upstream candidate" tree in that
> >> direction, then naturally, the calls in spapr_hcall.c are going to
> >> disappear in favor of a pair of powerpc_register_hypercall() locally in
> >> the vty module.
> > 
> > Ah, yeah.  I'm still not sure what to do about it.  I was going to
> > fold the dynamic hcall registration into the patch set before
> > upstreaming.  But then something paulus said made me rethink whether
> > the dynamic registration was a good idea.  Still need to sort this out
> > before the series is really ready.
> 
> We can surely move it to dynamic later on. I think the "proper" way
> would be to populate a qdev bus and have the individual hypercall
> receivers register themselves through -device creations. But Blue
> really is the expert here :).

Ok, not sure what you mean here.  I already have a qdev bus for the
VIO devices.  With my tentative dynamic model as devices are created
on the bus they may register hypercalls as well.

Is that what you mean, or do you mean have a separate "hypercall"
bus.  That sounds like serious overkill for a simple number->function
translation.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] Re: [PATCH 05/15] Implement PowerPC slbmfee and slbmfev instructions

2011-02-13 Thread David Gibson
On Sat, Feb 12, 2011 at 04:23:39PM +0100, Alexander Graf wrote:
> On 12.02.2011, at 15:54, David Gibson wrote:
[snip]
> > +target_ulong helper_load_slb_esid (target_ulong rb)
> > +{
> > +target_ulong rt;
> > +
> > +if (ppc_load_slb_esid(env, rb, &rt) < 0) {
> > +helper_raise_exception_err(POWERPC_EXCP_PROGRAM, 
> > POWERPC_EXCP_INVAL);
> 
> The spec doesn't say what to do in this case. Have you checked what
> real hardware does?

Erm, I don't think I've checked this specific case, on this specific
CPU.  Generally I've found that invalid parameters to MMU management
instructions results in invalid instruction program checks, so I
assumed that's what would happen in this case.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



[Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Blue Swirl
On Sun, Feb 13, 2011 at 2:31 PM, Alexander Graf  wrote:
>
> On 13.02.2011, at 09:08, Blue Swirl wrote:
>
>> On Sun, Feb 13, 2011 at 1:15 AM, Benjamin Herrenschmidt
>>  wrote:
>>> On Sun, 2011-02-13 at 00:52 +0200, Blue Swirl wrote:
 On Sat, Feb 12, 2011 at 11:00 PM, Benjamin Herrenschmidt
  wrote:
> On Sat, 2011-02-12 at 18:59 +0200, Blue Swirl wrote:
>>
>> Actually I don't quite understand the need for vty layer, why not use
>> the chardev here directly?
>
> I'm not sure what you mean here...

 Maybe it would be reasonable to leave h_put_term_char to spapr_hcall.c
 instead of moving those to a separate file.
>>>
>>> Well, the VIO device instance gives the chardev instance which is all
>>> nicely encapsulated inside spapr-vty... Also VIO devices tend to have
>>> dedicated hcalls, not only VTY, so it makes a lot of sense to keep them
>>> close to the rest of the VIO driver they belong to don't you think ?
>>>
>>> (Actually veth does, vscsi uses the more "generic" CRQ stuff which we've
>>> added to the core VIO but you'll see that when we get those patches
>>> ready, hopefully soon).
>>
>> This is a bit of a special case, much like semihosting modes for m68k
>> or ARM, or like MOL hacks which were removed recently. From QEMU point
>> of view, the most natural way of handling this would be hypervisor
>> implemented in the guest side (for example BIOS). Then the hypervisor
>> would use normal IO (or virtio) to communicate with the host. If
>> inside QEMU, the interface of the hypervisor to the devices needs some
>> thought. We'd like to avoid ugly interfaces like vmmouse where a
>> device probes CPU registers directly or spaghetti interfaces like
>> APIC.
>
> In this case I disagree. While the "emulate real hardware" case would be to 
> have a full proprietary binary blob of firmware running in the guest that 
> would handle all the hypervisor stuff, this is not feasible. Implementing 
> PAPR in Qemu is hard (and slow) enough - doing it all emulated simply is 
> overkill for what we're trying to achieve here.
>
> The PAPR interfaces are well specified (in the PAPR spec - seems to be 
> power.org member restricted) and are the only thing you ever get to see on 
> recent POWER hardware. The real hardware interface is simply inaccessible to 
> you.

Hah, I'm sure that if sufficiently motivated bunch of cool hackers got
access to a truckload (or several, aren't these big machines?) of
POWER hardware, which they could open and brick at leisure, like it is
done with other firmware reverse engineering efforts, we'd have open
hypervisor at no time. ;-)

It's not impossible to imagine also IBM open sourcing theirs.

> It's basically the same as the S390 machine we have. On those machine we 
> simply don't have access to real hw, so emulating it is moot. All interfaces 
> that the OS sees are PV interfaces.
>
>>
>>> Actually, one thing I noticed is that the current patches David posted
>>> still have a single function with a switch/case statement for hcalls.
>>>
>>> I'm not 100% certain what David long term plans are here, but in our
>>> internal "WIP" tree, we've subsequently turned that into a mechanism
>>> where any module can call powerpc_register_hypercall() to add hcalls.
>>>
>>> So if David intends to move the "upstream candidate" tree in that
>>> direction, then naturally, the calls in spapr_hcall.c are going to
>>> disappear in favor of a pair of powerpc_register_hypercall() locally in
>>> the vty module.
>>
>> Is the interface new design, or are you implementing what is used also
>> on real HW?
>
> PAPR is the spec that defines the PV interface in use on POWER. Outside of 
> IBM, nobody can run Linux on POWER without going through phyp which is their 
> hypervisor.
>
> So this implements exactly what the OS sees on real HW :).

Right. As long as the resulting spaghetti is well contained, I'm OK
with this approach. But should ever the millionaire hacker team (or
IBM) appear with their open hypervisor, this shall make room for it.



Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Alexander Graf

On 13.02.2011, at 13:44, David Gibson wrote:

> On Sun, Feb 13, 2011 at 01:40:14PM +0100, Alexander Graf wrote:
>> 
>> On 13.02.2011, at 12:14, David Gibson wrote:
>> 
>>> On Sun, Feb 13, 2011 at 10:15:03AM +1100, Benjamin Herrenschmidt wrote:
 On Sun, 2011-02-13 at 00:52 +0200, Blue Swirl wrote:
> On Sat, Feb 12, 2011 at 11:00 PM, Benjamin Herrenschmidt
>>> [snip]
 Actually, one thing I noticed is that the current patches David posted
 still have a single function with a switch/case statement for hcalls.
 
 I'm not 100% certain what David long term plans are here, but in our
 internal "WIP" tree, we've subsequently turned that into a mechanism
 where any module can call powerpc_register_hypercall() to add hcalls.
 
 So if David intends to move the "upstream candidate" tree in that
 direction, then naturally, the calls in spapr_hcall.c are going to
 disappear in favor of a pair of powerpc_register_hypercall() locally in
 the vty module.
>>> 
>>> Ah, yeah.  I'm still not sure what to do about it.  I was going to
>>> fold the dynamic hcall registration into the patch set before
>>> upstreaming.  But then something paulus said made me rethink whether
>>> the dynamic registration was a good idea.  Still need to sort this out
>>> before the series is really ready.
>> 
>> We can surely move it to dynamic later on. I think the "proper" way
>> would be to populate a qdev bus and have the individual hypercall
>> receivers register themselves through -device creations. But Blue
>> really is the expert here :).
> 
> Ok, not sure what you mean here.  I already have a qdev bus for the
> VIO devices.  With my tentative dynamic model as devices are created
> on the bus they may register hypercalls as well.

Oh, guess I just overlooked that then, sorry :).

> Is that what you mean, or do you mean have a separate "hypercall"
> bus.  That sounds like serious overkill for a simple number->function
> translation.

Yup.


Alex




Re: [Qemu-devel] Re: [PATCH 09/15] Parse SDR1 on mtspr instead of at translate time

2011-02-13 Thread David Gibson
On Sun, Feb 13, 2011 at 01:33:44PM +0100, Alexander Graf wrote:
> 
> On 13.02.2011, at 10:02, David Gibson wrote:
> 
> > On Sat, Feb 12, 2011 at 04:37:46PM +0100, Alexander Graf wrote:
> >> On 12.02.2011, at 15:54, David Gibson wrote:
> > [snip]
> >>> +#define SDR_HTABORG_32 0xUL
> >>> +#define SDR_HTABMASK   0x01FFUL
> >> 
> >> Please mark this constant as ppc32
> >> 
> >>> +
> >>> +#if defined(TARGET_PPC64)
> >>> +#define SDR_HTABORG_64 0xFFFCULL
> >>> +#define SDR_HTABSIZE   0x001FULL
> >> 
> >> Please mark this constant as ppc64
> > 
> > Um.. I'm not sure what you mean by this.
> 
> Well, while to you SDR_HTABMASK and SDR_HTABSIZE are "obviously"
> meant for ppc32/ppc64 respectively, the average code reader won't
> know the difference. What I'm proposing is:
> 
> #define SDR_32_HTABORG
> #define SDR_32_HTABMASK
> 
> #define SDR_64_HTABORG
> #define SDR_64_HTABSIZE
> 
> This way it's a lot more obvious that the two constants really
> belong to two completely different semantics :).

Ah! I see.  Done, I'll have this in the next cut.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] Re: [PATCH 12/15] Support 1T segments on ppc

2011-02-13 Thread David Gibson
On Sun, Feb 13, 2011 at 01:37:12PM +0100, Alexander Graf wrote:
> On 13.02.2011, at 10:34, David Gibson wrote:
> > On Sat, Feb 12, 2011 at 04:57:39PM +0100, Alexander Graf wrote:
> >> On 12.02.2011, at 15:54, David Gibson wrote:
> > [snip]
> >>> +if (rb & (0x1000 - env->slb_nr))
> >> 
> >> Braces...
> > 
> > Oops, yeah.  These later patches in the series I haven't really
> > audited for coding style adequately yet.  I'll fix these before the
> > next version.
> > 
> > [snip]
> >>> + return -1; /* 1T segment on MMU that doesn't support it */
> >>> + 
> >>> +/* We stuff a copy of the B field into slb->esid to simplify
> >>> + * lookup later */
> >>> +slb->esid = (rb & (SLB_ESID_ESID | SLB_ESID_V)) |
> >>> +(rs >> SLB_VSID_SSIZE_SHIFT);
> >> 
> >> Wouldn't it be easier to add another field?
> > 
> > Easier for what?  The reason I put these bits in here is that the rest
> > of the things slb_lookup() needs to scan for are all in the esid
> > field, so putting B in there means slb_lookup() needs only one
> > comparison per-slot, per segment size.
> 
> Hrm - but it also needs random & ~3 masking in other code which is
> very unpretty. Comparing two numbers really shouldn't hurt
> performance too much, but makes the code better maintainable.

Well, it's only one place.  But fair enough, I'll avoid this hack in
the next version.

> struct slb_entry {
> uint64_t esid;
> uint64_t vsid;
> int b;
> }
> 
> or so :).

Actually, we don't even need that.  The B field is already in
slb->vsid.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



[Qemu-devel] Re: Problem with DOS application and 286 DOS Extender application

2011-02-13 Thread Gerhard Wiesinger

Hello,

After some fortune I found out that also Turbo Debugger 286 doesn't work 
under plain DOS 6.22 (without any memory mananger just pressing F5) or 
with some memory mananagers (HIMEM.SYS, EMM386, QEMM386).


Error message is:
Error 266 loading D:\DIR\TD286.EXE into extended memory.

So it looks like that there is a major issue with extended memory. Any 
ideas how to fix or how to find the problem and fix it?


Version is latest seabios and QEMU from git as of now (own builds).

I'm pretty sure that it is the same reason that the 286 DOS Extender 
application doesn't work.


For full reference of the previously discussed have a look here:
http://www.mail-archive.com/qemu-devel@nongnu.org/msg29518.html
http://www.mail-archive.com/qemu-devel@nongnu.org/msg29465.html

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/


On Wed, 14 Apr 2010, Jan Kiszka wrote:


Jamie Lokier wrote:

Gerhard Wiesinger wrote:

It is a non public, proprietary application which uses the Ergo Computing
286 DOS Extender. I guess some other application which use the same DOS
extender have the same problem. So best thing is to find another
application which uses the Ergo Computing 286 DOS Extender, too.


The 286 was obsolete 20 years ago, although code depending on it
persisted for some years after.

I'm fairly sure the number of people using (or trying to use) Qemu
with 286-specific code is very small indeed, so unfortunately for a
286 problem, you will need to help reproduce it as much as you can for
it to be fixed.


In some scenarios, we use QEMU in emulation mode for such a legacy guest
(16-bit protected mode), but we mostly run it in KVM mode these days. It
works fairly well under QEMU, but also we did not explore all corner cases.



Note that Qemu doesn't emulate segments properly even for 32-bit x86
code, and 16-bit (286) code depends on that all the more.  That may be
the problem.


More precisely: QEMU does not check for segment limits. This can be a
problem with buggy or pedantic guests, but usually one tried to avoid
triggering this anyway. I once wrote a crude patch to add this, but it
had significant performance impact and did not properly make use of the
TCG to optimize the checks. You'll find it in the archives (but I guess
it no longer applies).



Or it may be the "reset using keyboard controller and BIOS" method
used to switch from protected mode to real mode on a 286 is not
implemented properly, or is not supported by the BIOS properly.

Or it may simply be a bug in 16-bit task segment switching or
something like that, which is quite complex and so rarely used that it
might never have been properly tested.


Task switching looks fairly stable in QEMU (in contrast to KVM where we
just ran into some more corner cases).



Did you try running the application under Bochs, which has a more
accurate emulation of very old x86 CPUs?

-- Jamie



That said, having some test case to reproduce the issue is essential.
I'm willing to have a look if you can provide such thing (publicly or
privately). Before that, you could already try building QEMU with
--enable-debug and run it with "-d exec,int". The generated
/tmp/qemu.log may point out where things go wrong (usually where faults
starts to occur).

Jan






[Qemu-devel] Re: Problem with DOS application and 286 DOS Extender application

2011-02-13 Thread Kevin O'Connor
On Sun, Feb 13, 2011 at 03:06:44PM +0100, Gerhard Wiesinger wrote:
> Hello,
> 
> After some fortune I found out that also Turbo Debugger 286 doesn't
> work under plain DOS 6.22 (without any memory mananger just pressing
> F5) or with some memory mananagers (HIMEM.SYS, EMM386, QEMM386).
> 
> Error message is:
> Error 266 loading D:\DIR\TD286.EXE into extended memory.
> 
> So it looks like that there is a major issue with extended memory.
> Any ideas how to fix or how to find the problem and fix it?

It would help if you could post the seabios log.  The easiest way to
get at that is to add the following to the qemu command line:

-chardev stdio,id=seabios -device isa-debugcon,iobase=0x402,chardev=seabios

It's also possible to recompile seabios with the debug level
increased to get more info on specific calls.

-Kevin



Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Anthony Liguori

On 02/13/2011 05:12 AM, David Gibson wrote:

On Sun, Feb 13, 2011 at 10:08:23AM +0200, Blue Swirl wrote:
   

On Sun, Feb 13, 2011 at 1:15 AM, Benjamin Herrenschmidt
  wrote:
 

On Sun, 2011-02-13 at 00:52 +0200, Blue Swirl wrote:
   

On Sat, Feb 12, 2011 at 11:00 PM, Benjamin Herrenschmidt
  wrote:
 

On Sat, 2011-02-12 at 18:59 +0200, Blue Swirl wrote:
   

Actually I don't quite understand the need for vty layer, why not use
the chardev here directly?
 

I'm not sure what you mean here...
   

Maybe it would be reasonable to leave h_put_term_char to spapr_hcall.c
instead of moving those to a separate file.
 

Well, the VIO device instance gives the chardev instance which is all
nicely encapsulated inside spapr-vty... Also VIO devices tend to have
dedicated hcalls, not only VTY, so it makes a lot of sense to keep them
close to the rest of the VIO driver they belong to don't you think ?

(Actually veth does, vscsi uses the more "generic" CRQ stuff which we've
added to the core VIO but you'll see that when we get those patches
ready, hopefully soon).
   

This is a bit of a special case, much like semihosting modes for m68k
or ARM, or like MOL hacks which were removed recently. From QEMU point
of view, the most natural way of handling this would be hypervisor
implemented in the guest side (for example BIOS). Then the hypervisor
would use normal IO (or virtio) to communicate with the host. If
inside QEMU, the interface of the hypervisor to the devices needs some
thought. We'd like to avoid ugly interfaces like vmmouse where a
device probes CPU registers directly or spaghetti interfaces like
APIC.
 

I really don't follow what you're saying here.  Running the hypervisor
in the guest, rather than emulating its effect in qemu seems like an
awful lot of complexity for no clear reason.
   


In KVM for x86, instead of using a secondary interface (like 
vmmcall/vmcall), we do all of our paravirtualization using native 
hardware interfaces that we can trap (PIO/MMIO).


IIUC, on Power, trapping MMIO is not possible due to the MMU mode that 
is currently used (PFs are delivered directly to the guest).


So it's not really possible to switch from using hypercalls to using MMIO.

What I would suggest is modelling hypercalls as another I/O address 
space for the processor.  So instead of having a function pointer in the 
CPUState, introduce a:


typedef void (HypercallFunc)(CPUState *env, void *opaque);

/* register a hypercall handler */
void register_hypercall(target_ulong index, HypercallFunc *handler, void 
*opaque);

void unregister_hypercall(target_ulong index);

/* dispatch a hypercall */
void cpu_hypercall(CPUState *env, target_ulong index);

This interface could also be used to implement hypercall based 
interfaces on s390 and x86.


The arguments will have to be extracted from the CPU state but I don't 
think we'll really ever have common hypercall implementations anyway so 
that's not a huge problem.



on real HW?
 

The interface already exists on real HW.  It's described in the PAPR
document we keep mentioning.
   


Another thing to note is that the hypercall based I/O devices the 
interfaces that the current Power hypervisor uses so implementing this 
interface is necessary to support existing guests.


Regards,

Anthony Liguori





Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Anthony Liguori

On 02/13/2011 06:40 AM, Alexander Graf wrote:



Ah, yeah.  I'm still not sure what to do about it.  I was going to
fold the dynamic hcall registration into the patch set before
upstreaming.  But then something paulus said made me rethink whether
the dynamic registration was a good idea.  Still need to sort this out
before the series is really ready.
 

We can surely move it to dynamic later on. I think the "proper" way would be to 
populate a qdev bus and have the individual hypercall receivers register themselves 
through -device creations.
   


Ignore the qdev bit for a moment.  Hypercalls could be plausibly 
implemented as another I/O space from a processor so the thing to model 
off of would be PIO dispatch (cpu_outb and friends).


From a qdev perspective, having a VIO bus makes sense.  The details of 
which I/O spaces are uses are not as important from a device tree 
perspective.


Regards,

Anthony Liguori


Alex


   





Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Anthony Liguori

On 02/11/2011 12:14 PM, Blue Swirl wrote:

On Thu, Feb 10, 2011 at 6:05 PM, Anthony Liguori  wrote:
   

On 02/10/2011 03:20 PM, Gleb Natapov wrote:
 

Jugging by how well all previous conversion went we will end up with one
more way of creating devices. One legacy, another qdev and your new one.
And what is the problem with qdev again (not that I am a big qdev fan)?

   

We've really been arguing about probably the most minor aspect of the
problem with qdev.

All I'm really saying is that we shouldn't tie device construction to a
factory interface as we do with qdev.

That simply means that we should be able to do:

RTC *rtc_create(arg1, arg2, arg2);
 

I don't see how that would help at all. Throwing qdev away and just
calling various functions directly, with all states exposed would be
like QEMU 0.9.0.
   


qdev doesn't expose any state today.  qdev properties are 
construction-only properties that happen to be stored in each device state.


What we really need is a full property framework that includes 
properties with hookable getters and setters along with the ability to 
mark properties as construct-only, read-only, or read-write.


But I think it's reasonable to expose construct-only properties as just 
an initfn argument.



And that a separate piece of code decides which devices are exposed through
-device or device_add.  Which devices are exposed is really a minor detail.

That said, qdev has a number of significant limitations in my mind.  The
first is that the only relationship between devices is through the BusState
interface.
 

There's also qemu_irq for arbitrary signals.
   


Yes, but qemu_irq is very restricted as it only models a signal bit of 
information and doesn't really have a mechanism to attach/detach in any 
generic way.



  I don't think we should even try to have a generic bus model.
  When you look at how badly broken PCI hotplug is current in qdev, I think
this is symptomatic of this.
 

And how should this be fixed? The API change would not help.
   


Just as we have bus level creation functions, we should have bus level 
hotplug interfaces.



There's also no way in qdev to really have polymorphism.  Interfaces really
aren't meaningful in qdev so you have things like PCIDevice where some
methods are stored in the object instead of the class dispatch table and you
have overuse of static class members.
 

QEMU is developed in C, not C++.
   


But we're trying to do object oriented programming in C so as long as 
we're doing that, we ought to do it right.



And it's all unrelated to VMState.
 

Right, but this has also the good side that not all device state is
automatically exported. If other devices would be allowed to muck with
a devices internal state freely, bad things could happen.

Device reset could also use standard register definitions, shared with VMState.
   


There's a way to have formally verifiable serialization/deserialization 
if we can satisfy two conditions 1) the devices rely on no global state 
(i.e. static variables) and 2) every field asssociated with a device is 
marshalled during serialization/deserialization.


When we define a device, right now we say that certain state is writable 
during construction.  It's not a stretch to want to have some properties 
writable during runtime.  If we also had a mechanism to mark certain 
properties as read-only, but still were able to introspect them, we 
could implement serialization without having to have a second VMState 
definition.


Compatibility will always require manipulating state, but once you have 
the state stored in a data structure, you can describe those 
transformations in a pretty high level fashion.



And this is just the basic mechanisms of qdev.  The actual implementation is
worse.  The use of qemu_irq as gpio in the base class and overuse of
SystemBus is really quite insane.
 

Maybe qemu_irq should be renamed to QEMUSignal (and I don't like
typedeffing pointers), otherwise it looks quite sane to me.
   


Any interfaces of a base class should make sense even for derived classes.

That means if the base class is going to expose essentially a pin-out 
interface, that if I have a PCIDevice and cast it to Device, I should be 
able to interact with the GPIO interface to interact with the PCI 
device.  Presumably, that means interfacing at the PCI signalling 
level.  That's insane to model in QEMU :-)


In reality, GPIO only makes sense for a small class of simple devices 
where modelling the pin-out interface makes sense (like a 7-segment 
LCD).  That suggests that GPIO should not be in the DeviceState 
interface but instead should be in a SimpleDevice subclass or something 
like that.



Could you point to examples of SystemBus overuse?
   


anthony@titi:~/git/qemu/hw$ grep qdev_create *.c | wc -l
73
anthony@titi:~/git/qemu/hw$ grep 'qdev_create(NULL' *.c | wc -l
56

SystemBus has become a catch-all for shallow qdev conversions.  We've 
got Northbridges

Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Anthony Liguori

On 02/10/2011 04:29 AM, Avi Kivity wrote:

On 02/10/2011 09:47 AM, Anthony Liguori wrote:


So very concretely, I'm suggesting we do the following to target-i386:

1) make the i440fx device have an embedded ide controller, piix3, and 
usb controller that get initialized automatically.  The piix3 embeds 
the PCI-to-ISA bridge along with all of the default ISA devices (rtc, 
serial, etc.).


This I like.



2) get rid of the entire concept of machines.  Creating a i440fx is 
essentially equivalent to creating a bare machine.


No, it's not.  The 440fx does not include an IOAPIC, for example.  
There may be other optional components, or differences in wiring, that 
make two machines with i440fx not identical.


The IOAPIC is basically the only other component and I view it as part 
of the CPU interface to the chipset.


But still, if we're creating a machine from scratch:

qemu -device i440fx,id=nb -device piix3,id=sb,chipset=nb -device 
ioapic,id=ioapic,chipset=sb -device cpu,ioapic=ioapic,northbridge=nb


Is not all that unreasonable and presents a fully functioning PC.



4) model the CPUs as devices that take a pointer to a host 
controller, for x86, the normal case would be giving it a pointer to 
i440fx.




Surely the connection is via a bus?  An x86 cpu talks to the bus, and 
there happens to be an 440fx north bridge at the end of it.  It could 
also be a Q35 or something else.


I see being on a bus as really just taking a pointer to an interface.  
So yes, the i440fx would implement a PentiumCpuInterface or something 
like that and the CPU would take a pointer to a PentiumCpuInterface[1].


This is part of why having proper polymorphism is important.  We need it 
in order to be able to express concepts like interfaces.


[1] This is just a Random Bad Name.  Don't read anything into it.

Regards,

Anthony Liguori



Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Anthony Liguori

On 02/10/2011 04:29 AM, Avi Kivity wrote:

On 02/10/2011 09:47 AM, Anthony Liguori wrote:


So very concretely, I'm suggesting we do the following to target-i386:

1) make the i440fx device have an embedded ide controller, piix3, and 
usb controller that get initialized automatically.  The piix3 embeds 
the PCI-to-ISA bridge along with all of the default ISA devices (rtc, 
serial, etc.).


This I like.



2) get rid of the entire concept of machines.  Creating a i440fx is 
essentially equivalent to creating a bare machine.


No, it's not.  The 440fx does not include an IOAPIC, for example.  
There may be other optional components, or differences in wiring, that 
make two machines with i440fx not identical.


The IOAPIC is basically the only other component and I view it as part 
of the CPU interface to the chipset.


But still, if we're creating a machine from scratch:

qemu -device i440fx,id=nb -device piix3,id=sb,chipset=nb -device 
ioapic,id=ioapic,chipset=sb -device cpu,ioapic=ioapic,northbridge=nb


Is not all that unreasonable and presents a fully functioning PC.



4) model the CPUs as devices that take a pointer to a host 
controller, for x86, the normal case would be giving it a pointer to 
i440fx.




Surely the connection is via a bus?  An x86 cpu talks to the bus, and 
there happens to be an 440fx north bridge at the end of it.  It could 
also be a Q35 or something else.


I see being on a bus as really just taking a pointer to an interface.  
So yes, the i440fx would implement a PentiumCpuInterface or something 
like that and the CPU would take a pointer to a PentiumCpuInterface[1].


This is part of why having proper polymorphism is important.  We need it 
in order to be able to express concepts like interfaces.


[1] This is just a Random Bad Name.  Don't read anything into it.

Regards,

Anthony Liguori



Re: [Qemu-devel] [PATCH 06/10] vmmouse: convert to qdev

2011-02-13 Thread Anthony Liguori

On 02/12/2011 11:03 AM, Markus Armbruster wrote:

Blue Swirl  writes:

   

Convert to qdev, also add a proper reset function.

Signed-off-by: Blue Swirl
---
  hw/pc.c  |5 +++--
  hw/pc.h  |3 ---
  hw/vmmouse.c |   37 +
  3 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index fcee09a..f66ac5d 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1096,7 +1096,7 @@ void pc_basic_device_init(qemu_irq *isa_irq,
  PITState *pit;
  qemu_irq rtc_irq = NULL;
  qemu_irq *a20_line;
-ISADevice *i8042, *port92;
+ISADevice *i8042, *port92, *vmmouse;
  qemu_irq *cpu_exit_irq;

  register_ioport_write(0x80, 1, 1, ioport80_write, NULL);
@@ -1133,7 +1133,8 @@ void pc_basic_device_init(qemu_irq *isa_irq,
  a20_line = qemu_allocate_irqs(handle_a20_line_change, first_cpu, 2);
  i8042 = isa_create_simple("i8042");
  i8042_setup_a20_line(i8042,&a20_line[0]);
-vmmouse_init(i8042);
+vmmouse = isa_create("vmmouse");
+qdev_prop_set_ptr(&vmmouse->qdev, "ps2_mouse", i8042);
  port92 = isa_create_simple("port92");
  port92_init(port92,&a20_line[1]);

diff --git a/hw/pc.h b/hw/pc.h
index 603a2a3..ae83934 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -67,9 +67,6 @@ void hpet_pit_enable(void);
  /* vmport.c */
  void vmport_register(unsigned char command, IOPortReadFunc *func,
void *opaque);

-/* vmmouse.c */
-void *vmmouse_init(void *m);
-
  /* pckbd.c */

  void i8042_init(qemu_irq kbd_irq, qemu_irq mouse_irq, uint32_t io_base);
diff --git a/hw/vmmouse.c b/hw/vmmouse.c
index 2097119..3b39144 100644
--- a/hw/vmmouse.c
+++ b/hw/vmmouse.c
@@ -25,6 +25,7 @@
  #include "console.h"
  #include "ps2.h"
  #include "pc.h"
+#include "qdev.h"

  /* debug only vmmouse */
  //#define DEBUG_VMMOUSE
@@ -52,6 +53,7 @@

  typedef struct _VMMouseState
  {
+ISADevice dev;
  uint32_t queue[VMMOUSE_QUEUE_SIZE];
  int32_t queue_size;
  uint16_t nb_queue;
@@ -270,22 +272,41 @@ static const VMStateDescription vmstate_vmmouse = {
  }
  };

-void *vmmouse_init(void *m)
+static void vmmouse_reset(DeviceState *d)
  {
-VMMouseState *s = NULL;
+VMMouseState *s = container_of(d, VMMouseState, dev.qdev);

-DPRINTF("vmmouse_init\n");
+s->status = 0x;
+}

-s = qemu_mallocz(sizeof(VMMouseState));
+static int vmmouse_initfn(ISADevice *dev)
+{
+VMMouseState *s = DO_UPCAST(VMMouseState, dev, dev);

-s->status = 0x;
-s->ps2_mouse = m;
-s->queue_size = VMMOUSE_QUEUE_SIZE;
 

Where is member queue_size initialized in your new code?

   

+DPRINTF("vmmouse_init\n");

  vmport_register(VMMOUSE_STATUS, vmmouse_ioport_read, s);
  vmport_register(VMMOUSE_COMMAND, vmmouse_ioport_read, s);
  vmport_register(VMMOUSE_DATA, vmmouse_ioport_read, s);
  vmstate_register(NULL, 0,&vmstate_vmmouse, s);

-return s;
+return 0;
+}
+
+static ISADeviceInfo vmmouse_info = {
+.init  = vmmouse_initfn,
+.qdev.name = "vmmouse",
+.qdev.size = sizeof(VMMouseState),
+.qdev.no_user  = 1,
+.qdev.reset= vmmouse_reset,
+.qdev.props = (Property[]) {
+DEFINE_PROP_PTR("ps2_mouse", VMMouseState, ps2_mouse),
 

Pointer properties are for dirty hacks only.  Is there really no better
solution?  Why does it have to be a property?
   


vmmouse is really just an extension to the PS2 Mouse.  It's definitely 
not an ISA device.


In terms of qdev enablement, I would just make it a boolean option to 
the PS2Mouse and not expose it as a top level device at all.  It cannot 
exist without a PS2Mouse.


Regards,

Anthony Liguori


+DEFINE_PROP_END_OF_LIST(),
+}
+};
+
+static void vmmouse_dev_register(void)
+{
+isa_qdev_register(&vmmouse_info);
  }
+device_init(vmmouse_dev_register)
 
   





Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Avi Kivity

On 02/13/2011 05:38 PM, Anthony Liguori wrote:




2) get rid of the entire concept of machines.  Creating a i440fx is 
essentially equivalent to creating a bare machine.


No, it's not.  The 440fx does not include an IOAPIC, for example.  
There may be other optional components, or differences in wiring, 
that make two machines with i440fx not identical.



The IOAPIC is basically the only other component and I view it as part 
of the CPU interface to the chipset.


But it isn't.  The IOAPIC is not per-core or per-socket.  It's strictly 
board level.  It's a board interface to the cpu.




But still, if we're creating a machine from scratch:

qemu -device i440fx,id=nb -device piix3,id=sb,chipset=nb -device 
ioapic,id=ioapic,chipset=sb -device cpu,ioapic=ioapic,northbridge=nb


Is not all that unreasonable and presents a fully functioning PC.


Sure.  And -M blah is a shortcut.





4) model the CPUs as devices that take a pointer to a host 
controller, for x86, the normal case would be giving it a pointer to 
i440fx.




Surely the connection is via a bus?  An x86 cpu talks to the bus, and 
there happens to be an 440fx north bridge at the end of it.  It could 
also be a Q35 or something else.


I see being on a bus as really just taking a pointer to an interface.  
So yes, the i440fx would implement a PentiumCpuInterface or something 
like that and the CPU would take a pointer to a PentiumCpuInterface[1].


This is part of why having proper polymorphism is important.  We need 
it in order to be able to express concepts like interfaces.


The CPUs and northbridge are peers on the system bus.  However, this 
isn't something good to model, since it keeps changing without any guest 
software impact.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Alexander Graf

On 13.02.2011, at 16:08, Anthony Liguori wrote:

> On 02/13/2011 05:12 AM, David Gibson wrote:
>> On Sun, Feb 13, 2011 at 10:08:23AM +0200, Blue Swirl wrote:
>>   
>>> On Sun, Feb 13, 2011 at 1:15 AM, Benjamin Herrenschmidt
>>>   wrote:
>>> 
 On Sun, 2011-02-13 at 00:52 +0200, Blue Swirl wrote:
   
> On Sat, Feb 12, 2011 at 11:00 PM, Benjamin Herrenschmidt
>   wrote:
> 
>> On Sat, 2011-02-12 at 18:59 +0200, Blue Swirl wrote:
>>   
>>> Actually I don't quite understand the need for vty layer, why not use
>>> the chardev here directly?
>>> 
>> I'm not sure what you mean here...
>>   
> Maybe it would be reasonable to leave h_put_term_char to spapr_hcall.c
> instead of moving those to a separate file.
> 
 Well, the VIO device instance gives the chardev instance which is all
 nicely encapsulated inside spapr-vty... Also VIO devices tend to have
 dedicated hcalls, not only VTY, so it makes a lot of sense to keep them
 close to the rest of the VIO driver they belong to don't you think ?
 
 (Actually veth does, vscsi uses the more "generic" CRQ stuff which we've
 added to the core VIO but you'll see that when we get those patches
 ready, hopefully soon).
   
>>> This is a bit of a special case, much like semihosting modes for m68k
>>> or ARM, or like MOL hacks which were removed recently. From QEMU point
>>> of view, the most natural way of handling this would be hypervisor
>>> implemented in the guest side (for example BIOS). Then the hypervisor
>>> would use normal IO (or virtio) to communicate with the host. If
>>> inside QEMU, the interface of the hypervisor to the devices needs some
>>> thought. We'd like to avoid ugly interfaces like vmmouse where a
>>> device probes CPU registers directly or spaghetti interfaces like
>>> APIC.
>>> 
>> I really don't follow what you're saying here.  Running the hypervisor
>> in the guest, rather than emulating its effect in qemu seems like an
>> awful lot of complexity for no clear reason.
>>   
> 
> In KVM for x86, instead of using a secondary interface (like vmmcall/vmcall), 
> we do all of our paravirtualization using native hardware interfaces that we 
> can trap (PIO/MMIO).
> 
> IIUC, on Power, trapping MMIO is not possible due to the MMU mode that is 
> currently used (PFs are delivered directly to the guest).
> 
> So it's not really possible to switch from using hypercalls to using MMIO.
> 
> What I would suggest is modelling hypercalls as another I/O address space for 
> the processor.  So instead of having a function pointer in the CPUState, 
> introduce a:
> 
> typedef void (HypercallFunc)(CPUState *env, void *opaque);
> 
> /* register a hypercall handler */
> void register_hypercall(target_ulong index, HypercallFunc *handler, void 
> *opaque);
> void unregister_hypercall(target_ulong index);
> 
> /* dispatch a hypercall */
> void cpu_hypercall(CPUState *env, target_ulong index);
> 
> This interface could also be used to implement hypercall based interfaces on 
> s390 and x86.
> 
> The arguments will have to be extracted from the CPU state but I don't think 
> we'll really ever have common hypercall implementations anyway so that's not 
> a huge problem.

Is there enough common ground between the hypercall interfaces that this makes 
sense? It sounds nice on paper, but in the end the "hypercall not implemented" 
return codes differ, the argument interpretation differs and even the amount of 
return values differ.

So at the end of the day, all this interface could do is call a machine helper 
function to evaluate the hypercall id from its register state and dispatch to 
that, leaving the rest to the individual hypercall function implementation.

The implementations themselves also couldn't be reused. A PAPR hypercall simply 
doesn't work on x86. PIO and MMIO interfaces make sense to share - devices 
implemented using them could potentially be reused later by other platforms. 
For the hypercall devices, that doesn't work.

> 
>>> on real HW?
>>> 
>> The interface already exists on real HW.  It's described in the PAPR
>> document we keep mentioning.
>>   
> 
> Another thing to note is that the hypercall based I/O devices the interfaces 
> that the current Power hypervisor uses so implementing this interface is 
> necessary to support existing guests.

Yes, implementing the existing PAPR interfaces is crucial to run existing 
guests. Implementing it at the hypercall level is required if we ever want to 
run it with hardware accelerated KVM on ppc, as there hypercalls simply get 
forwarded to the hypervisor (kvm) which would pass them on to qemu. And since 
the interface is not nesting capable, running a hypervisor inside the guest 
doesn't work.


Alex




[Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Benjamin Herrenschmidt
On Sun, 2011-02-13 at 10:08 +0200, Blue Swirl wrote:
> This is a bit of a special case, much like semihosting modes for m68k
> or ARM, or like MOL hacks which were removed recently. From QEMU point
> of view, the most natural way of handling this would be hypervisor
> implemented in the guest side (for example BIOS). Then the hypervisor
> would use normal IO (or virtio) to communicate with the host. If
> inside QEMU, the interface of the hypervisor to the devices needs some
> thought. We'd like to avoid ugly interfaces like vmmouse where a
> device probes CPU registers directly or spaghetti interfaces like
> APIC.

I'm not sure I understand your point. We are emulating a guest running
under pHyp, ie, qemu in that case is the hypervisor, and provides the
same interfaces as pHyp does (the IBM proprietary hypervisor, aka
sPAPR). This is how we enable booting existing kernels and distributions
inside qemu/kvm.

> Is the interface new design, or are you implementing what is used also
> on real HW?

We are implementing a subset of the interfaces implemented by pHyp today
on "real HW". In the long run, when you throw KVM into the picture, on
machines (hypothetical machines at this stage) where one can run a KVM
has a hypervisor (currently you can only run pHyp on all released IBM
machines), this will give you an environment that is compatible with
pHyp and thus supports existing distributions and kernels.

We -will- add support for the "real" virtio on top of that, but those
will require modified guest kernels (and will provide better
performances than the sPAPR emulation). But the sPAPR emulation is a
necessary step to enable existing stuff to run.

Cheers,
Ben.




Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Benjamin Herrenschmidt
On Sun, 2011-02-13 at 14:15 +0200, Blue Swirl wrote:
> 
> Maybe it would be more complex but also emulation accuracy would be
> increased and the interfaces would be saner. We don't shortcut BIOS
> and implement its services to OS in QEMU for other machines either.

But that is not comparable. BIOS is comparable for example to Open
Firmware and we do not 'emulate' OF, we will provide an implementation
that runs inside the guest, just like you do for BIOS (SLOF based, tho
people are welcome to play with OpenBIOS if they want, but SLOF is what
we will provide and support).

In this case, we are talking about a hypervisor which is somewhat a
different beast. Sure you -could- run it into the guest, I suppose, if
emulation accuracy was your ultimate goal. That would entail at least
the followings:

 - Implement support for the complete "hypervisor" mode inside qemu
 - Re-implement a complete hypervisor compatible with pHyp

An enormous amount of work, for a result that would have low
performances and about zero interest to anybody.

The goal here is to provide a runtime environment for kernels and
distributions that is -compatible- with sPAPR/pHyp to enable existing
distributions to operate in KVM.

> I'd expect one problem with that approach though, the interface used
> on real HW between the hypervisor and the underlying HW may be
> undocumented, but then it could use for example existing virtio
> devices.

But what would be the point ?

> One way to handle this could be to add the hypervisor interface now to
> QEMU and switch to guest hypervisor when (if) it becomes available.
> I'd just like to avoid duplication with virtio or messy interfaces
> like vmport. 

Again, what would be the point ? Eventually, KVM will be available as an
"alternate" hypervisor to pHyp which I suppose one could run entirely
inside qemu once we add support for the HV mode to it, and that would
somewhat do what you describe but that isn't what we are trying to get
at here.

Cheers,
Ben.




Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Benjamin Herrenschmidt
On Sun, 2011-02-13 at 13:40 +0100, Alexander Graf wrote:
> 
> We can surely move it to dynamic later on. I think the "proper" way
> would be to populate a qdev bus and have the individual hypercall
> receivers register themselves through -device creations. But Blue
> really is the expert here :).

Why would you want to go through a bus for all hcalls ? (ie including
the ones that aren't device related ?). That doesn't quite "tick" but
I'm sure I'm missing part of the picture here :-)

A simple dispatch table based approach is the most efficient and simple
way to do that (after a switch/case) in my opinion, this is more/less
what we have done internally, with a pair of calls for "modules" to
register hcalls if they need to. The hcalls numbers are fixed and
specified in sPAPR.

BTW. We are still missing in this picture RTAS. I suppose David is still
cleaning up those patches. Basically, we use a 5 instruction trampoline
that calls a private h-call, the RTAS emulation is entirely in qemu.
This has to be done that way for various reasons, but essentially RTAS
under pHyp also more/less turns into private pHyp calls under the hood.
 
Cheers,
Ben.





Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Anthony Liguori

On 02/13/2011 09:56 AM, Alexander Graf wrote:



This interface could also be used to implement hypercall based interfaces on 
s390 and x86.

The arguments will have to be extracted from the CPU state but I don't think 
we'll really ever have common hypercall implementations anyway so that's not a 
huge problem.
 

Is there enough common ground between the hypercall interfaces that this makes 
sense?


Between the dispatch, registration, and tracing, yes.


  It sounds nice on paper, but in the end the "hypercall not implemented" 
return codes differ, the argument interpretation differs and even the amount of return 
values differ.
   


Yes, that's why we pass CPUState.  But keep in mind, CPUState needs to 
be sync'd before and after the invocation.



So at the end of the day, all this interface could do is call a machine helper 
function to evaluate the hypercall id from its register state and dispatch to 
that, leaving the rest to the individual hypercall function implementation.

The implementations themselves also couldn't be reused. A PAPR hypercall simply 
doesn't work on x86. PIO and MMIO interfaces make sense to share - devices 
implemented using them could potentially be reused later by other platforms. 
For the hypercall devices, that doesn't work.
   


Yes, which is why I'm not advocating trying to unmarshal the calls or 
anything like that.  But the dispatch, registration, tracing, and CPU 
sync'ing bits can all be reused.


Regards,

Anthony Liguori



Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Anthony Liguori

On 02/13/2011 10:07 AM, Benjamin Herrenschmidt wrote:

On Sun, 2011-02-13 at 10:08 +0200, Blue Swirl wrote:
   

This is a bit of a special case, much like semihosting modes for m68k
or ARM, or like MOL hacks which were removed recently. From QEMU point
of view, the most natural way of handling this would be hypervisor
implemented in the guest side (for example BIOS). Then the hypervisor
would use normal IO (or virtio) to communicate with the host. If
inside QEMU, the interface of the hypervisor to the devices needs some
thought. We'd like to avoid ugly interfaces like vmmouse where a
device probes CPU registers directly or spaghetti interfaces like
APIC.
 

I'm not sure I understand your point. We are emulating a guest running
under pHyp, ie, qemu in that case is the hypervisor, and provides the
same interfaces as pHyp does (the IBM proprietary hypervisor, aka
sPAPR). This is how we enable booting existing kernels and distributions
inside qemu/kvm.

   

Is the interface new design, or are you implementing what is used also
on real HW?
 

We are implementing a subset of the interfaces implemented by pHyp today
on "real HW". In the long run, when you throw KVM into the picture, on
machines (hypothetical machines at this stage) where one can run a KVM
has a hypervisor (currently you can only run pHyp on all released IBM
machines), this will give you an environment that is compatible with
pHyp and thus supports existing distributions and kernels.
   


We try very, very hard to make our paravirtualization look like real 
hardware.


When the paravirtualization interfaces are already defined, we have no 
choice in supporting those interfaces although sometimes we try to push 
that to firmware (like with Xenner).  It's better from a security PoV.


But in this case, we have no choice but to implement the 
paravirtualization interfaces in QEMU because of the way the hardware 
works and because these interfaces are already well defined.


Regards,

Anthony Liguori


We -will- add support for the "real" virtio on top of that, but those
will require modified guest kernels (and will provide better
performances than the sPAPR emulation). But the sPAPR emulation is a
necessary step to enable existing stuff to run.

Cheers,
Ben.


   





Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Anthony Liguori

On 02/13/2011 10:17 AM, Benjamin Herrenschmidt wrote:

On Sun, 2011-02-13 at 13:40 +0100, Alexander Graf wrote:
   

We can surely move it to dynamic later on. I think the "proper" way
would be to populate a qdev bus and have the individual hypercall
receivers register themselves through -device creations. But Blue
really is the expert here :).
 

Why would you want to go through a bus for all hcalls ? (ie including
the ones that aren't device related ?). That doesn't quite "tick" but
I'm sure I'm missing part of the picture here :-)
   


A virtual bus is just an interface.  If all virtual devices that 
interact via hcalls would all reside on the same virtual bus, then 
having hypercalls registered through that interface makes sense because 
you can associate hypercalls with particular devices.  This means that 
you can automatically deregister on device removal and things like that.


But I don't think this will work out well.  I think treating the 
hypercalls as a simple dispatch table just like ioport would make sense.


Regards,

Anthony Liguori


A simple dispatch table based approach is the most efficient and simple
way to do that (after a switch/case) in my opinion, this is more/less
what we have done internally, with a pair of calls for "modules" to
register hcalls if they need to. The hcalls numbers are fixed and
specified in sPAPR.

BTW. We are still missing in this picture RTAS. I suppose David is still
cleaning up those patches. Basically, we use a 5 instruction trampoline
that calls a private h-call, the RTAS emulation is entirely in qemu.
This has to be done that way for various reasons, but essentially RTAS
under pHyp also more/less turns into private pHyp calls under the hood.

Cheers,
Ben.



   





Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Anthony Liguori

On 02/13/2011 09:56 AM, Avi Kivity wrote:

On 02/13/2011 05:38 PM, Anthony Liguori wrote:




2) get rid of the entire concept of machines.  Creating a i440fx is 
essentially equivalent to creating a bare machine.


No, it's not.  The 440fx does not include an IOAPIC, for example.  
There may be other optional components, or differences in wiring, 
that make two machines with i440fx not identical.



The IOAPIC is basically the only other component and I view it as 
part of the CPU interface to the chipset.


But it isn't.  The IOAPIC is not per-core or per-socket.


Yeah, I wasn't implying that it was.


  It's strictly board level.  It's a board interface to the cpu.



But still, if we're creating a machine from scratch:

qemu -device i440fx,id=nb -device piix3,id=sb,chipset=nb -device 
ioapic,id=ioapic,chipset=sb -device cpu,ioapic=ioapic,northbridge=nb


Is not all that unreasonable and presents a fully functioning PC.


Sure.  And -M blah is a shortcut.


Exactly.  Or better yet, blah is a config file that contains

[device "nb"]
driver=i440fx

[device "sb"]
driver=piix3
chipset=nb

[device "ioapic"]
driver=ioapic
chipset=sb

[device "cpu"]
driver=cpu
ioapic=ioapic
northbridge=nb





4) model the CPUs as devices that take a pointer to a host 
controller, for x86, the normal case would be giving it a pointer 
to i440fx.




Surely the connection is via a bus?  An x86 cpu talks to the bus, 
and there happens to be an 440fx north bridge at the end of it.  It 
could also be a Q35 or something else.


I see being on a bus as really just taking a pointer to an 
interface.  So yes, the i440fx would implement a PentiumCpuInterface 
or something like that and the CPU would take a pointer to a 
PentiumCpuInterface[1].


This is part of why having proper polymorphism is important.  We need 
it in order to be able to express concepts like interfaces.


The CPUs and northbridge are peers on the system bus.  However, this 
isn't something good to model, since it keeps changing without any 
guest software impact.


If we can move away from Bus abstraction and to a simpler interface 
mechanism, then we can express peer relationships by just having 
bidirection references.  IOW:


-device cpus,northbridge=nb,id=cpus,count=16 -device i440fx,cpus=cpus

I don't think modelling each CPU makes sense.  We should probably just 
model all cpus in a single device for the sake of simplicity.


Regards,

Anthony Liguori




Re: [Qemu-devel] [PATCH 1/2] linux-user: Define target alignment size

2011-02-13 Thread Laurent Vivier
Le dimanche 13 février 2011 à 10:24 +0200, Blue Swirl a écrit :
> On Sun, Feb 13, 2011 at 4:22 AM, Laurent Vivier  wrote:
> > Datatype alignment can be found using following application:
> >
> > int main(void)
> > {
> >printf("alignof(short) %ld\n", __alignof__(short));
> >printf("alignof(int) %ld\n", __alignof__(int));
> >printf("alignof(long) %ld\n", __alignof__(long));
> >printf("alignof(long long) %ld\n", __alignof__(long long));
> > }
> >
> > This patch includes following alignments:
> >
> > i386
> >
> >   alignof(short) 2
> >   alignof(int) 4
> >   alignof(long) 4
> >   alignof(long long) 8
> >
> >  x86_64
> >
> >   alignof(short) 2
> >   alignof(int) 4
> >   alignof(long) 8
> >   alignof(long long) 8
> >
> >  arm
> >
> >   alignof(short) 2
> >   alignof(int) 4
> >   alignof(long) 4
> >   alignof(long long) 4
> >
> >  m68k (680x0)
> >
> >   alignof(short) 2
> >   alignof(int) 2
> >   alignof(long) 2
> >   alignof(long long) 2
> >
> >  mips
> >
> >   alignof(short) 2
> >   alignof(int) 4
> >   alignof(long) 4
> >   alignof(long long) 8
> >
> >  ppc
> >
> >   alignof(short) 2
> >   alignof(int) 4
> >   alignof(long) 4
> >   alignof(long long) 8
> >
> > for other targets, use by default (2,4,4,8).
> >
> > Please, update for your favorite target...
> 
> For Sparc32 (I think also sparc32plus), the default is OK.
> 
> For Sparc64, please use 2, 4, 8, 8. I'd guess other 64 bit platforms
> (Alpha, MIPS64, PPC64 etc) should use the same.

OK, I update my patch.

> Does GCC produce correct code using the attributes on strictly aligned
> host, when the target is less strictly aligned?

It seems it is OK. I did some tests into a mips-linux-user chroot (sparc
one is broken ;-) ) :

mips is a strictly aligned host, from gcc/config/mips/mips.h:

#define STRICT_ALIGNMENT 1

aligments are:

   alignof(short) 2
   alignof(int) 4
   alignof(long) 4
   alignof(long long) 8

We try to align int on 2.

#include 

typedef int target_int __attribute__((aligned(2)));

struct {
char Z;
target_int A;
} B;

int main(void)
{
B.A = 0xdeadbeaf;
printf("%d: %x\n", __alignof__(B.A), B.A);
}

./test_align
2: deadbeaf

disass:

lw  $2,%got(B)($28)
li  $3,-559087616   # 0xdead
ori $3,$3,0xbeaf
swl $3,2($2)
swr $3,5($2)

normal case:

lw  $3,%got(B)($28)
li  $2,-559087616   # 0xdead
ori $2,$2,0xbeaf
sw  $2,4($3)

So, gcc seems smart enough to split the memory access in several ones
compatible with the strict alignment rules.

> Should the alignment of floating point variables be specified as well?

At the moment it seems useless.

> The strict alignment required for doubles is 4, but recommended
> alignment is 8, I'm not sure which one is used for structures
> containing doubles.

if necessary, some tests will be helpfull.

Thank you for your comments.

Regards,
Laurent 

-- 
- laur...@vivier.eu --
"Tout ce qui est impossible reste à accomplir"Jules Verne
"Things are only impossible until they're not" Jean-Luc Picard




Re: [Qemu-devel] Binary Translation hooking - reading registers

2011-02-13 Thread felix.matenaar@rwth-aachen
On 02/13/2011 06:38 AM, Mulyadi Santosa wrote:
> Hi
>
> On Sun, Feb 13, 2011 at 10:48, felix.matenaar@rwth-aachen
>  wrote:
>> To achieve my goal, it is necessary being able reading actual register
>> configuration like eax when a ret hook is called to get a function
>> return value. So my question is how I can do this. Are there already
>> some functions which generate code to update the cpu environment? If
>> not, is there anything you can point me towards for adding support?
> I think you should look into the tracing infrastructure that is
> gradually added to Qemu. I forgot the URL that provide the patch
> (since I am not sure whether it's fully merged with mainline). Please
> check this list archieve...
>
> NB: You're talking about qemu system emulation,right? not the user
> mode emulation, I assume? Because you said "executed in one step" (or
> something like that). AFAIK, although Qemu does lazy evalution, but
> for general registers it should be always updated. The one that gets
> lazy evalution for example is eflags.
>

Yes I am talking about full system emulation (i386).

Tracing infrastructure may be suitable for my needs in the future but I
think that my instrumentation patches which are already added as far as
I need it have a bit less overhead.

To be more specific on my env update problem, here an example:

push ebp
mv esp,ebp
/* do something */
call 0xfoo
test eax,eax
/* do something */
ret

The first line is the start of a block. What I did was adding a
gen_helper_instrument_call in the 'call' opcode cases in disas_insn()
which will call my analysis code to examine from where the call is
launched, where it will go and what is the address we will return to
after (in this case regarding to call 0xfoo).
Because in this case, usage of tcg_const_i32(pc_start) and providing
CPU_T[0] as parameters to my generated helper, I can provide all three
values. But what I would _assume_ is that after
gen_helper_instrument_call is executed, env->eip would point towards
'call 0xfoo', but it points to 'push ebp'. Thats why I am confused if I
can trust the values in env to gain more information about the current
register values of the guest machine. This particular case is indeed not
a problem since I can provide all parameters to
gen_helper_instrument_call here but there are cases where I would like
to read e.g. esp for keeping an analysis shadow stack up to date or
reading return values by examining eax when my ret hook is called.

This is my call immediate handler:

void helper_call_im_protected(target_ulong src_eip, target_ulong
new_eip, int next_eip){
  if (!(new_eip & 0x8000) ){
/* env->eip == src_eip will evaluate to 0 if 'call' opcode was not
at the beginning of the translation block*/
analysis_examine_call(src_eip,new_eip,next_eip);
  }
}

Here my hook inserting in disas_insn:
case 0xe8: /* call im */
{   
/* some code */
gen_movtl_T0_im(next_eip);
gen_push_T0(s);

/* will call helper_call_im_protected when call opcode is
executed */
gen_helper_call_im_protected(tcg_const_i32(pc_start),
 tcg_const_i32(tval),
 cpu_T[0]);
gen_jmp(s, tval);
}   

Hope you understand.



Re: [Qemu-devel] [PATCH 02/10] parallel: make optional

2011-02-13 Thread David Ahern


On 02/12/11 15:40, Blue Swirl wrote:
> Ignore failure with parallel device creation.
> 
> Signed-off-by: Blue Swirl 
> ---
>  hw/pc.h |5 -
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/hw/pc.h b/hw/pc.h
> index 443ba34..f823b7d 100644
> --- a/hw/pc.h
> +++ b/hw/pc.h
> @@ -24,7 +24,10 @@ static inline bool parallel_init(int index,
> CharDriverState *chr)
>  {
>  ISADevice *dev;
> 
> -dev = isa_create("isa-parallel");
> +dev = isa_try_create("isa-parallel");
> +if (!dev) {
> +return false;
> +}
>  qdev_prop_set_uint32(&dev->qdev, "index", index);
>  qdev_prop_set_chr(&dev->qdev, "chardev", chr);
>  if (qdev_init(&dev->qdev) < 0) {


How is this design supposed to be better than wrapping init functions in
#ifdef CONFIG_ ... #endif?

If a hardware model is compiled out via the CONFIG options qemu should
fail to accept the command line parameters and not try to create the
device. Instead with this design it tries and fails to create the device
and yet continues on.

David



Re: [Qemu-devel] Binary Translation hooking - reading registers

2011-02-13 Thread Peter Maydell
On 13 February 2011 17:14, felix.matenaar@rwth-aachen
 wrote:
> To be more specific on my env update problem, here an example:
>
> push ebp
> mv esp,ebp
> /* do something */
> call 0xfoo
> test eax,eax
> /* do something */
> ret
>
> The first line is the start of a block. What I did was adding a
> gen_helper_instrument_call in the 'call' opcode cases in disas_insn()
> which will call my analysis code to examine from where the call is
> launched, where it will go and what is the address we will return to
> after (in this case regarding to call 0xfoo).
> Because in this case, usage of tcg_const_i32(pc_start) and providing
> CPU_T[0] as parameters to my generated helper, I can provide all three
> values. But what I would _assume_ is that after
> gen_helper_instrument_call is executed, env->eip would point towards
> 'call 0xfoo', but it points to 'push ebp'.

Ah, you should have said you were talking about EIP; you've
hit an optimisation case. If we updated env->eip after every
instruction then this would slow things down. Instead, since
we know what the value is statically at translate time, we can
just use that calculated value mostly. We only update env->eip:
 * at the end of the TB
 * when we translate jump commands (usually ends the TB)
 * when we translate something that will raise an exception
   or some other thing we know will need an accurate env->eip
Unexpected interrupts (typically faults on load/store) are
handled by constructing a mapping from host PC to guest
PC and then looking up the host fault address in it and
resetting env->eip (in gen_pc_load()).

Which registers are dealt with like this varies from target to
target; on target-arm we do this for both r15 (pc) and the
IT state bits in the CPSR, for instance.

The other case to watch out for is where the value of some
register is kept in env, but as a broken out set of components
which are only reassembled into the actual register value
when the target needs to read it; see for example the comment
in target-i386/cpu.h about env->eflags.

If the registers you care about aren't in either of these
categories you should be able to trust them in a helper fn.
This typically includes most of the general purpose registers.

-- PMM



Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Gleb Natapov
On Sun, Feb 13, 2011 at 10:56:30AM -0600, Anthony Liguori wrote:
> >>
> >>qemu -device i440fx,id=nb -device piix3,id=sb,chipset=nb -device
> >>ioapic,id=ioapic,chipset=sb -device
> >>cpu,ioapic=ioapic,northbridge=nb
> >>
> >>Is not all that unreasonable and presents a fully functioning PC.
> >
> >Sure.  And -M blah is a shortcut.
> 
> Exactly.  Or better yet, blah is a config file that contains
> 
> [device "nb"]
> driver=i440fx
> 
You are trying to model how particular (very ancient) HW looked like,
instead of emulating guest visible functionality, but is dead end since
things are changing constantly. Northbridge functionality moves onto
cpu for instance. What CPU i440fx was designed for? Pentium?  What if
user runs QEMU with emulated CPU that in real life has internal memory
controller? Does you config have sense for such setup? Should we allow
to specify only Pentium CPU since this is how real HW worked?

> [device "sb"]
> driver=piix3
And piix3 refers to piix3.cfg which describe devices that present on the
chipset.

> chipset=nb
> 
> [device "ioapic"]
> driver=ioapic
> chipset=sb
Here, for instance, IOAPIC is included in a chipset for a long time
now. Why user should care that piix3 didn't have it. How this detail
changes qemu functionality? If it doesn't why should we expose it?

> 
> [device "cpu"]
> driver=cpu
> ioapic=ioapic
Why ioapic here? Doesn't cpu talks to ioapic via northbridge?

> northbridge=nb
> 

--
Gleb.



[Qemu-devel] [RFC] qapi: events in QMP

2011-02-13 Thread Anthony Liguori

Hi,

In my QAPI branch[1], I've now got almost every existing QMP command 
converted with (hopefully) all of the hard problems solved.  There is 
only one remaining thing to attack before posting for inclusion and 
that's events.  Here's my current thinking about what to do.


Events in QMP Today

QMP events are asynchronous messages.  They are not tied explicitly to 
any type of context, do not have a well defined format, and are have no 
mechanism to mask or filter events.  As of right now, we have somewhere 
around a dozen events.


Goals of QAPI

1) Make all interfaces consumable in C such that we can use the 
interfaces in QEMU


2) Make all interfaces exposed through a library using code generation 
from static introspection


3) Make all interfaces well specified in a formal schema

Proposal for events in QAPI

For QAPI, I'd like to model events on the notion of signals and 
slots[2].  A client would explicitly connect to a signal through a QMP 
interface which would result in a slot being added that then generates 
an event.  Internally in QEMU, we could also connect slots to the same 
signals.  Since we don't have an object API in QMP, we'd use a pair of 
connect/disconnect functions that had enough information to identify the 
signal.


Example:

We would define QEVENT_BLOCK_IO_EVENT as:

# qmp-schema.json
{ 'BlockIOEvent': {'device': 'str', 'action': 'str', 'operation': 'str'} }

The 'Event' suffix is used to determine that the type is an event and 
not a structure.  This would generate the following structures for QEMU:


typedef void (BlockIOEventFunc)(const char *device, const char *action, 
const char *operation, void *opaque);


typedef struct BlockIOEvent {
/* contents are private */
} BlockIOEvent;

/* Connect a slot to a BlockIOEvent signal and return a handle to the 
connection */
int block_io_event_connect(BlockIOEvent *event, BlockIOEventFunc *cb, 
void *opaque);


/* Signal any connect slots of a BlockIOEvent */
void block_io_event_signal(BlockIOEvent *event, const char *device, 
const char *action, const char *operation);


/* Disconnect a slot from a signal based on a connection handle */
void block_io_event_disconnect(BlockIOEvent *event, int handle);

In the case of BlockIOEvent, this is a global signal that is not tied to 
any specific object so there would be a global BlockIOEvent object 
within QEMU.


From a QMP perspective, we would have the following function in the schema:

[ 'block-io-event', {}, {}, 'BlockIOEvent' ]

A function definition that returns an Event is a signal accessor.  
Within QEMU, we implement a signal accessor like so:


BlockIOEvent *qmp_block_io_event(Error **errp)
{
 return &global_block_io_event;
}

For QMP and libqmp, signal accessors are handled specially.  A signal 
accessor is expanded into two QMP functions:


[ 'block-io-event-connect', {}, {}, 'str' ]
[ 'block-io-event-disconnect', {'tag' : 'str'}, {}, 'none' ]

Connect returns a handle to the connection as an opaque string.  
Disconnect takes that handle.  Both functions will also take any 
arguments that the signal accessor takes and under the cover uses the 
signal accessor to access the object.


In libqmp, the following interfaces will be generated:

int libqmp_block_io_event_connect(QmpSession *sess, BlockIOEventFunc 
*cb, void *opaque, Error **errp);
void libqmp_block_io_event_disconnect(QmpSession *sess, int handle, 
Error **errp);


Again, if the signal accessor takes additional arguments, they will be 
passed here.


All existing events will be converted to signals with signal accessors 
that take no arguments.  However, BlockIOEvent is a good example of a 
signal that we'll add a new version of that takes a 'const char *device' 
as an argument for the signal accessor.  This will let clients register 
for block io events on specific block devices.


Backwards Compatibility

Right now, QMP events to not have any type of context information which 
means that there's no way to communicate the connection handle.  We'll 
solve this by adding a new 'tag' field to events that are registered 
through one of the connect interfaces.  Since this is only present on 
events registered through this new interface, backwards compatibility is 
preserved.


Upon initial connection, we'll treat all existing signals as having a 
'default signal connection'.  We'll introduce a new QMP command that can 
be executed while in capabilities mode to disconnect default signal 
connections.  The effect is that new clients can use the explicit 
register/unregister interface while old clients will continue to see the 
old style events.


We'll deprecate the 'default signal connections' and make sure to never 
add a new one post 0.15.  Old clients will need to be updated to 
explicitly register for events.


Another Example

Here's an example of BlockDeviceEvent, the successor to BlockIOEvent.  
It makes use of signal accessor arguments and QAPI enumeration support:


# qmp-schema.json
{ 'BlockDeviceAction

Re: [Qemu-devel] [RFC] qapi: events in QMP

2011-02-13 Thread Anthony Liguori

On 02/13/2011 12:08 PM, Anthony Liguori wrote:

Hi,

In my QAPI branch[1], I've now got almost every existing QMP command 
converted with (hopefully) all of the hard problems solved.  There is 
only one remaining thing to attack before posting for inclusion and 
that's events.  Here's my current thinking about what to do.


Events in QMP Today

QMP events are asynchronous messages.  They are not tied explicitly to 
any type of context, do not have a well defined format, and are have 
no mechanism to mask or filter events.  As of right now, we have 
somewhere around a dozen events.


Goals of QAPI

1) Make all interfaces consumable in C such that we can use the 
interfaces in QEMU


2) Make all interfaces exposed through a library using code generation 
from static introspection


3) Make all interfaces well specified in a formal schema

Proposal for events in QAPI

For QAPI, I'd like to model events on the notion of signals and 
slots[2].  A client would explicitly connect to a signal through a QMP 
interface which would result in a slot being added that then generates 
an event.  Internally in QEMU, we could also connect slots to the same 
signals.  Since we don't have an object API in QMP, we'd use a pair of 
connect/disconnect functions that had enough information to identify 
the signal.


Example:

We would define QEVENT_BLOCK_IO_EVENT as:

# qmp-schema.json
{ 'BlockIOEvent': {'device': 'str', 'action': 'str', 'operation': 
'str'} }


The 'Event' suffix is used to determine that the type is an event and 
not a structure.  This would generate the following structures for QEMU:


typedef void (BlockIOEventFunc)(const char *device, const char 
*action, const char *operation, void *opaque);


One thing I'm on the fence about, is making the slot callback take a 
BlockIOEvent *event as the first argument.  If we did this, the signals 
would be compatible with GObject signals.


It's hard to map that to a libqmp though so I'm not sure.

Regards,

Anthony Liguori



typedef struct BlockIOEvent {
/* contents are private */
} BlockIOEvent;

/* Connect a slot to a BlockIOEvent signal and return a handle to the 
connection */
int block_io_event_connect(BlockIOEvent *event, BlockIOEventFunc *cb, 
void *opaque);


/* Signal any connect slots of a BlockIOEvent */
void block_io_event_signal(BlockIOEvent *event, const char *device, 
const char *action, const char *operation);


/* Disconnect a slot from a signal based on a connection handle */
void block_io_event_disconnect(BlockIOEvent *event, int handle);

In the case of BlockIOEvent, this is a global signal that is not tied 
to any specific object so there would be a global BlockIOEvent object 
within QEMU.


From a QMP perspective, we would have the following function in the 
schema:


[ 'block-io-event', {}, {}, 'BlockIOEvent' ]

A function definition that returns an Event is a signal accessor.  
Within QEMU, we implement a signal accessor like so:


BlockIOEvent *qmp_block_io_event(Error **errp)
{
 return &global_block_io_event;
}

For QMP and libqmp, signal accessors are handled specially.  A signal 
accessor is expanded into two QMP functions:


[ 'block-io-event-connect', {}, {}, 'str' ]
[ 'block-io-event-disconnect', {'tag' : 'str'}, {}, 'none' ]

Connect returns a handle to the connection as an opaque string.  
Disconnect takes that handle.  Both functions will also take any 
arguments that the signal accessor takes and under the cover uses the 
signal accessor to access the object.


In libqmp, the following interfaces will be generated:

int libqmp_block_io_event_connect(QmpSession *sess, BlockIOEventFunc 
*cb, void *opaque, Error **errp);
void libqmp_block_io_event_disconnect(QmpSession *sess, int handle, 
Error **errp);


Again, if the signal accessor takes additional arguments, they will be 
passed here.


All existing events will be converted to signals with signal accessors 
that take no arguments.  However, BlockIOEvent is a good example of a 
signal that we'll add a new version of that takes a 'const char 
*device' as an argument for the signal accessor.  This will let 
clients register for block io events on specific block devices.


Backwards Compatibility

Right now, QMP events to not have any type of context information 
which means that there's no way to communicate the connection handle.  
We'll solve this by adding a new 'tag' field to events that are 
registered through one of the connect interfaces.  Since this is only 
present on events registered through this new interface, backwards 
compatibility is preserved.


Upon initial connection, we'll treat all existing signals as having a 
'default signal connection'.  We'll introduce a new QMP command that 
can be executed while in capabilities mode to disconnect default 
signal connections.  The effect is that new clients can use the 
explicit register/unregister interface while old clients will continue 
to see the old style events.


We'll deprecate the 'default signal connections

Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Benjamin Herrenschmidt
On Sun, 2011-02-13 at 10:48 -0600, Anthony Liguori wrote:
> 
> We try very, very hard to make our paravirtualization look like real 
> hardware.

Sure, that makes sense when you invent new paravirt interfaces, but that
isn't the case. Note also that our current processors do not have the
ability to emulate MMIOs in all cases, ie, when doing "real" KVM in HV
mode, we cannot trap MMIO unless we redirect all page faults to the
hypervisor, which comes at a cost.

> When the paravirtualization interfaces are already defined, we have no
> choice in supporting those interfaces although sometimes we try to
> push  that to firmware (like with Xenner).  It's better from a
> security PoV.
> 
> But in this case, we have no choice but to implement the 
> paravirtualization interfaces in QEMU because of the way the hardware 
> works and because these interfaces are already well defined.

Right.

Now, in the long run, we might decide to "reflect" some of these back
into some guest co-located FW or whatever of that kind, especially if we
get to a point where linux virt-io becomes more prevalent and the sPAPR
style IOs become purely legacy backward compat stubs, but we aren't
there yet.

Cheers,
Ben.




Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Benjamin Herrenschmidt
On Sun, 2011-02-13 at 10:52 -0600, Anthony Liguori wrote:
> 
> A virtual bus is just an interface.  If all virtual devices that 
> interact via hcalls would all reside on the same virtual bus, then 
> having hypercalls registered through that interface makes sense
> because 
> you can associate hypercalls with particular devices.  This means
> that 
> you can automatically deregister on device removal and things like
> that.

I see. Well, VIO related h-calls are only part of the picture here, I
think we can live with having explicit de-registration if needed ;-)
Besides the h-call is still implemented even if no device -instance- is
currently plugged into the partition anyways. It just returns a (well
defined per-hcall) error code if the instance handle passed to it is
bogus.

> But I don't think this will work out well.  I think treating the 
> hypercalls as a simple dispatch table just like ioport would make
> sense.

Yup.

Cheers,
Ben.




Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Blue Swirl
On Sun, Feb 13, 2011 at 5:08 PM, Anthony Liguori  wrote:
> On 02/13/2011 05:12 AM, David Gibson wrote:
>>
>> On Sun, Feb 13, 2011 at 10:08:23AM +0200, Blue Swirl wrote:
>>
>>>
>>> On Sun, Feb 13, 2011 at 1:15 AM, Benjamin Herrenschmidt
>>>   wrote:
>>>

 On Sun, 2011-02-13 at 00:52 +0200, Blue Swirl wrote:

>
> On Sat, Feb 12, 2011 at 11:00 PM, Benjamin Herrenschmidt
>   wrote:
>
>>
>> On Sat, 2011-02-12 at 18:59 +0200, Blue Swirl wrote:
>>
>>>
>>> Actually I don't quite understand the need for vty layer, why not use
>>> the chardev here directly?
>>>
>>
>> I'm not sure what you mean here...
>>
>
> Maybe it would be reasonable to leave h_put_term_char to spapr_hcall.c
> instead of moving those to a separate file.
>

 Well, the VIO device instance gives the chardev instance which is all
 nicely encapsulated inside spapr-vty... Also VIO devices tend to have
 dedicated hcalls, not only VTY, so it makes a lot of sense to keep them
 close to the rest of the VIO driver they belong to don't you think ?

 (Actually veth does, vscsi uses the more "generic" CRQ stuff which we've
 added to the core VIO but you'll see that when we get those patches
 ready, hopefully soon).

>>>
>>> This is a bit of a special case, much like semihosting modes for m68k
>>> or ARM, or like MOL hacks which were removed recently. From QEMU point
>>> of view, the most natural way of handling this would be hypervisor
>>> implemented in the guest side (for example BIOS). Then the hypervisor
>>> would use normal IO (or virtio) to communicate with the host. If
>>> inside QEMU, the interface of the hypervisor to the devices needs some
>>> thought. We'd like to avoid ugly interfaces like vmmouse where a
>>> device probes CPU registers directly or spaghetti interfaces like
>>> APIC.
>>>
>>
>> I really don't follow what you're saying here.  Running the hypervisor
>> in the guest, rather than emulating its effect in qemu seems like an
>> awful lot of complexity for no clear reason.
>>
>
> In KVM for x86, instead of using a secondary interface (like
> vmmcall/vmcall), we do all of our paravirtualization using native hardware
> interfaces that we can trap (PIO/MMIO).
>
> IIUC, on Power, trapping MMIO is not possible due to the MMU mode that is
> currently used (PFs are delivered directly to the guest).
>
> So it's not really possible to switch from using hypercalls to using MMIO.
>
> What I would suggest is modelling hypercalls as another I/O address space
> for the processor.  So instead of having a function pointer in the CPUState,
> introduce a:
>
> typedef void (HypercallFunc)(CPUState *env, void *opaque);
>
> /* register a hypercall handler */
> void register_hypercall(target_ulong index, HypercallFunc *handler, void
> *opaque);
> void unregister_hypercall(target_ulong index);
>
> /* dispatch a hypercall */
> void cpu_hypercall(CPUState *env, target_ulong index);
>
> This interface could also be used to implement hypercall based interfaces on
> s390 and x86.
>
> The arguments will have to be extracted from the CPU state but I don't think
> we'll really ever have common hypercall implementations anyway so that's not
> a huge problem.

Nice idea. Then the part handling CPUState probably should belong to
target-ppc/ rather than hw/.



Re: [Qemu-devel] [PATCH 02/10] parallel: make optional

2011-02-13 Thread Blue Swirl
On Sun, Feb 13, 2011 at 7:30 PM, David Ahern  wrote:
>
>
> On 02/12/11 15:40, Blue Swirl wrote:
>> Ignore failure with parallel device creation.
>>
>> Signed-off-by: Blue Swirl 
>> ---
>>  hw/pc.h |    5 -
>>  1 files changed, 4 insertions(+), 1 deletions(-)
>>
>> diff --git a/hw/pc.h b/hw/pc.h
>> index 443ba34..f823b7d 100644
>> --- a/hw/pc.h
>> +++ b/hw/pc.h
>> @@ -24,7 +24,10 @@ static inline bool parallel_init(int index,
>> CharDriverState *chr)
>>  {
>>      ISADevice *dev;
>>
>> -    dev = isa_create("isa-parallel");
>> +    dev = isa_try_create("isa-parallel");
>> +    if (!dev) {
>> +        return false;
>> +    }
>>      qdev_prop_set_uint32(&dev->qdev, "index", index);
>>      qdev_prop_set_chr(&dev->qdev, "chardev", chr);
>>      if (qdev_init(&dev->qdev) < 0) {
>
>
> How is this design supposed to be better than wrapping init functions in
> #ifdef CONFIG_ ... #endif?

We avoid the #ifdeffery and the device model is not violated. Most of
the code is not affected in any way by leaving the device out.

> If a hardware model is compiled out via the CONFIG options qemu should
> fail to accept the command line parameters and not try to create the
> device. Instead with this design it tries and fails to create the device
> and yet continues on.

Maybe command line parameter handling should be pushed to the devices.
The devices should pull their parameters, removing them thus from a
parameter pool. After the machine init finishes, errors could be
reported for unhandled parameters.



Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread Anthony Liguori

On 02/13/2011 12:29 PM, Blue Swirl wrote:

On Sun, Feb 13, 2011 at 5:08 PM, Anthony Liguori  wrote:
   


In KVM for x86, instead of using a secondary interface (like
vmmcall/vmcall), we do all of our paravirtualization using native hardware
interfaces that we can trap (PIO/MMIO).

IIUC, on Power, trapping MMIO is not possible due to the MMU mode that is
currently used (PFs are delivered directly to the guest).

So it's not really possible to switch from using hypercalls to using MMIO.

What I would suggest is modelling hypercalls as another I/O address space
for the processor.  So instead of having a function pointer in the CPUState,
introduce a:

typedef void (HypercallFunc)(CPUState *env, void *opaque);

/* register a hypercall handler */
void register_hypercall(target_ulong index, HypercallFunc *handler, void
*opaque);
void unregister_hypercall(target_ulong index);

/* dispatch a hypercall */
void cpu_hypercall(CPUState *env, target_ulong index);

This interface could also be used to implement hypercall based interfaces on
s390 and x86.

The arguments will have to be extracted from the CPU state but I don't think
we'll really ever have common hypercall implementations anyway so that's not
a huge problem.
 

Nice idea. Then the part handling CPUState probably should belong to
target-ppc/ rather than hw/.
   


Would be nice to have the target-ppc/ hypercall handlers call into a 
higher level VIO interface that didn't deal directly with CPUState.  The 
actual hardware emulation would then be implemented in hw/ and would not 
be compiled for a specific target.  That helps avoid future sloppiness.


Regards,

Anthony Liguori



Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Blue Swirl
On Sun, Feb 13, 2011 at 5:31 PM, Anthony Liguori  wrote:
> On 02/11/2011 12:14 PM, Blue Swirl wrote:
>>
>> On Thu, Feb 10, 2011 at 6:05 PM, Anthony Liguori
>>  wrote:
>>
>>>
>>> On 02/10/2011 03:20 PM, Gleb Natapov wrote:
>>>

 Jugging by how well all previous conversion went we will end up with one
 more way of creating devices. One legacy, another qdev and your new one.
 And what is the problem with qdev again (not that I am a big qdev fan)?


>>>
>>> We've really been arguing about probably the most minor aspect of the
>>> problem with qdev.
>>>
>>> All I'm really saying is that we shouldn't tie device construction to a
>>> factory interface as we do with qdev.
>>>
>>> That simply means that we should be able to do:
>>>
>>> RTC *rtc_create(arg1, arg2, arg2);
>>>
>>
>> I don't see how that would help at all. Throwing qdev away and just
>> calling various functions directly, with all states exposed would be
>> like QEMU 0.9.0.
>>
>
> qdev doesn't expose any state today.  qdev properties are construction-only
> properties that happen to be stored in each device state.
>
> What we really need is a full property framework that includes properties
> with hookable getters and setters along with the ability to mark properties
> as construct-only, read-only, or read-write.
>
> But I think it's reasonable to expose construct-only properties as just an
> initfn argument.

Sounds OK. About read-write properties, what happens if we one day
have extensive threading, and locks are pushed to device level? I can
imagine a deadlock involving one thread running in IO thread for a
device and another trying to access that device's properties. Maybe
that is not different from function call version.

>>> And that a separate piece of code decides which devices are exposed
>>> through
>>> -device or device_add.  Which devices are exposed is really a minor
>>> detail.
>>>
>>> That said, qdev has a number of significant limitations in my mind.  The
>>> first is that the only relationship between devices is through the
>>> BusState
>>> interface.
>>>
>>
>> There's also qemu_irq for arbitrary signals.
>>
>
> Yes, but qemu_irq is very restricted as it only models a signal bit of
> information and doesn't really have a mechanism to attach/detach in any
> generic way.

Basic signals are already very useful for many purposes, since they
match digital logic signals in real HW. In theory, whole machines
could be constructed with just qemu_irq and NAND gate emulator. ;-)

In the message passing IRQ discussion earlier, it was IIRC decided
that the one bit version would not be changed but a separate message
passing version would be created if ever needed.

>>>  I don't think we should even try to have a generic bus model.
>>>  When you look at how badly broken PCI hotplug is current in qdev, I
>>> think
>>> this is symptomatic of this.
>>>
>>
>> And how should this be fixed? The API change would not help.
>>
>
> Just as we have bus level creation functions, we should have bus level
> hotplug interfaces.
>
>>> There's also no way in qdev to really have polymorphism.  Interfaces
>>> really
>>> aren't meaningful in qdev so you have things like PCIDevice where some
>>> methods are stored in the object instead of the class dispatch table and
>>> you
>>> have overuse of static class members.
>>>
>>
>> QEMU is developed in C, not C++.
>>
>
> But we're trying to do object oriented programming in C so as long as we're
> doing that, we ought to do it right.
>
>>> And it's all unrelated to VMState.
>>>
>>
>> Right, but this has also the good side that not all device state is
>> automatically exported. If other devices would be allowed to muck with
>> a devices internal state freely, bad things could happen.
>>
>> Device reset could also use standard register definitions, shared with
>> VMState.
>>
>
> There's a way to have formally verifiable serialization/deserialization if
> we can satisfy two conditions 1) the devices rely on no global state (i.e.
> static variables) and 2) every field asssociated with a device is marshalled
> during serialization/deserialization.
>
> When we define a device, right now we say that certain state is writable
> during construction.  It's not a stretch to want to have some properties
> writable during runtime.  If we also had a mechanism to mark certain
> properties as read-only, but still were able to introspect them, we could
> implement serialization without having to have a second VMState definition.
>
> Compatibility will always require manipulating state, but once you have the
> state stored in a data structure, you can describe those transformations in
> a pretty high level fashion.
>
>>> And this is just the basic mechanisms of qdev.  The actual implementation
>>> is
>>> worse.  The use of qemu_irq as gpio in the base class and overuse of
>>> SystemBus is really quite insane.
>>>
>>
>> Maybe qemu_irq should be renamed to QEMUSignal (and I don't like
>> typedeffing pointers), otherwise it

Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Anthony Liguori

On 02/13/2011 12:08 PM, Gleb Natapov wrote:

On Sun, Feb 13, 2011 at 10:56:30AM -0600, Anthony Liguori wrote:
   

qemu -device i440fx,id=nb -device piix3,id=sb,chipset=nb -device
ioapic,id=ioapic,chipset=sb -device
cpu,ioapic=ioapic,northbridge=nb

Is not all that unreasonable and presents a fully functioning PC.
 

Sure.  And -M blah is a shortcut.
   

Exactly.  Or better yet, blah is a config file that contains

[device "nb"]
driver=i440fx

 

You are trying to model how particular (very ancient) HW looked like,
instead of emulating guest visible functionality, but is dead end since
things are changing constantly. Northbridge functionality moves onto
cpu for instance. What CPU i440fx was designed for? Pentium?  What if
user runs QEMU with emulated CPU that in real life has internal memory
controller? Does you config have sense for such setup? Should we allow
to specify only Pentium CPU since this is how real HW worked?
   


Yes, how we structure the device tree will change over time.  If users 
have to specify this stuff, we've already failed to start with.


The i440fx is definitely guest visible.  And yes, a guest could look at 
freak out that an i440fx is in use with a non-Pentium class CPU.  We 
cheat here, and it works just fine.



[device "sb"]
driver=piix3
 

And piix3 refers to piix3.cfg which describe devices that present on the
chipset.
   


I disagree in this case that it's the best thing to do, but in general, 
yeah, we should provide a mechanism to have essentially device tree 
macros such that one high level device represents a larger tree of 
devices.  This would be a good mechanism to support the concept of machines.


But I don't think we should rely on having this tomorrow as doing this 
well is going to be challenging.



chipset=nb

[device "ioapic"]
driver=ioapic
chipset=sb
 

Here, for instance, IOAPIC is included in a chipset for a long time
now. Why user should care that piix3 didn't have it. How this detail
changes qemu functionality? If it doesn't why should we expose it?
   


Yeah, I'd be inclined to push the ioapic into the chipset here except it 
may be useful to have multiple ioapics at some point down the road.


Yes, we're exposing gory details of the device model.  No, I don't have 
a problem with that.



[device "cpu"]
driver=cpu
ioapic=ioapic
 

Why ioapic here? Doesn't cpu talks to ioapic via northbridge?
   


The i440fx is the northbridge.  For the i440fx, the I/O apic is a 
separate chip.


Regards,

Anthony Liguori



Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Anthony Liguori

On 02/13/2011 01:37 PM, Blue Swirl wrote:

On Sun, Feb 13, 2011 at 5:31 PM, Anthony Liguori  wrote:
   


qdev doesn't expose any state today.  qdev properties are construction-only
properties that happen to be stored in each device state.

What we really need is a full property framework that includes properties
with hookable getters and setters along with the ability to mark properties
as construct-only, read-only, or read-write.

But I think it's reasonable to expose construct-only properties as just an
initfn argument.
 

Sounds OK. About read-write properties, what happens if we one day
have extensive threading, and locks are pushed to device level? I can
imagine a deadlock involving one thread running in IO thread for a
device and another trying to access that device's properties. Maybe
that is not different from function call version.
   


You need hookable setters/getters that can acquire a lock and do the 
right thing.  It shouldn't be able to dead lock if the locking is 
designed right.




Yes, but qemu_irq is very restricted as it only models a signal bit of
information and doesn't really have a mechanism to attach/detach in any
generic way.
 

Basic signals are already very useful for many purposes, since they
match digital logic signals in real HW. In theory, whole machines
could be constructed with just qemu_irq and NAND gate emulator. ;-)
   


It's not just in theory.  In the C++ port of QEMU that I wrote, I 
implemented an AND, OR, and XOR gate and implemented a full 32-bit adder 
by just using a device config file.


If done correctly, using referencing can be extremely powerful.  A full 
adder is a good example.  The gates really don't have any concept of bus 
and the relationship between gates is definitely not a tree.



In the message passing IRQ discussion earlier, it was IIRC decided
that the one bit version would not be changed but a separate message
passing version would be created if ever needed.
   


C already has a message passing interface that supports type safety 
called function pointers :-)


An object that implements multiple interfaces where the interface 
becomes the "message passing interface" is exactly what I've been saying 
we need.  It's flexible and the compiler helps us enforce typing.




Any interfaces of a base class should make sense even for derived classes.

That means if the base class is going to expose essentially a pin-out
interface, that if I have a PCIDevice and cast it to Device, I should be
able to interact with the GPIO interface to interact with the PCI device.
  Presumably, that means interfacing at the PCI signalling level.  That's
insane to model in QEMU :-)
 

This would be doable, if we built buses from a bunch of signals, like
in VHDL or Verilog. It would simplify aliased MMIO addresses nicely,
the undecoded address pins would be ignored. I don't think it would be
useful, but a separate interface could be added for connecting to
PCIBus with just qemu_irqs.
   


Yeah, it's possible, but I don't want to spend my time doing this.


In reality, GPIO only makes sense for a small class of simple devices where
modelling the pin-out interface makes sense (like a 7-segment LCD).  That
suggests that GPIO should not be in the DeviceState interface but instead
should be in a SimpleDevice subclass or something like that.

 

Could you point to examples of SystemBus overuse?

   

anthony@titi:~/git/qemu/hw$ grep qdev_create *.c | wc -l
73
anthony@titi:~/git/qemu/hw$ grep 'qdev_create(NULL' *.c | wc -l
56

SystemBus has become a catch-all for shallow qdev conversions.  We've got
Northbridges, RAM, and network devices sitting on the same bus...
 

On Sparc32 I have not bothered to create a SBus bus. Now it would be
useful to get bootindex corrected. Most devices (even on-board IO)
should use SBus.

The only other bus (MBus) would exist between CPU, IOMMU and memory.

But SysBus fitted the need until recently.
   


A good way to judge where a device is using a bus interface correct: 
does all of it's interactions with any other part of the guest state 
involve method calls to the bus?


Right now, the answer is no for just about every device in QEMU.  This 
is the problem that qdev really was meant to solve and we're not really 
any further along solving it unfortunately.



I'm not arguing against a generic factory interface, I'm arguing that it
should be separate.

IOW:

SerialState *serial_create(int iobase, int irq, ...);

static DeviceState *qdev_serial_create(QemuOpts *opts);

static void serial_init(void)
{
 qdev_register("serial", qdev_serial_create);
}

The key point is that when we create devices internally, we should have a
C-friendly, type-safe interface to interact with.  This will encourage
composition and a richer device model than what we have today.
 

Isn't this what we have now in most cases?
   


No.  The common pattern of shallow conversion is:

struct SerialState
{
// device state
};

struct I

Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Blue Swirl
On Sun, Feb 13, 2011 at 9:57 PM, Anthony Liguori  wrote:
> On 02/13/2011 01:37 PM, Blue Swirl wrote:
>>
>> On Sun, Feb 13, 2011 at 5:31 PM, Anthony Liguori
>>  wrote:
>>
>>>
>>> qdev doesn't expose any state today.  qdev properties are
>>> construction-only
>>> properties that happen to be stored in each device state.
>>>
>>> What we really need is a full property framework that includes properties
>>> with hookable getters and setters along with the ability to mark
>>> properties
>>> as construct-only, read-only, or read-write.
>>>
>>> But I think it's reasonable to expose construct-only properties as just
>>> an
>>> initfn argument.
>>>
>>
>> Sounds OK. About read-write properties, what happens if we one day
>> have extensive threading, and locks are pushed to device level? I can
>> imagine a deadlock involving one thread running in IO thread for a
>> device and another trying to access that device's properties. Maybe
>> that is not different from function call version.
>>
>
> You need hookable setters/getters that can acquire a lock and do the right
> thing.  It shouldn't be able to dead lock if the locking is designed right.
>
>
>>> Yes, but qemu_irq is very restricted as it only models a signal bit of
>>> information and doesn't really have a mechanism to attach/detach in any
>>> generic way.
>>>
>>
>> Basic signals are already very useful for many purposes, since they
>> match digital logic signals in real HW. In theory, whole machines
>> could be constructed with just qemu_irq and NAND gate emulator. ;-)
>>
>
> It's not just in theory.  In the C++ port of QEMU that I wrote, I
> implemented an AND, OR, and XOR gate and implemented a full 32-bit adder by
> just using a device config file.
>
> If done correctly, using referencing can be extremely powerful.  A full
> adder is a good example.  The gates really don't have any concept of bus and
> the relationship between gates is definitely not a tree.
>
>> In the message passing IRQ discussion earlier, it was IIRC decided
>> that the one bit version would not be changed but a separate message
>> passing version would be created if ever needed.
>>
>
> C already has a message passing interface that supports type safety called
> function pointers :-)
>
> An object that implements multiple interfaces where the interface becomes
> the "message passing interface" is exactly what I've been saying we need.
>  It's flexible and the compiler helps us enforce typing.
>
>>>
>>> Any interfaces of a base class should make sense even for derived
>>> classes.
>>>
>>> That means if the base class is going to expose essentially a pin-out
>>> interface, that if I have a PCIDevice and cast it to Device, I should be
>>> able to interact with the GPIO interface to interact with the PCI device.
>>>  Presumably, that means interfacing at the PCI signalling level.  That's
>>> insane to model in QEMU :-)
>>>
>>
>> This would be doable, if we built buses from a bunch of signals, like
>> in VHDL or Verilog. It would simplify aliased MMIO addresses nicely,
>> the undecoded address pins would be ignored. I don't think it would be
>> useful, but a separate interface could be added for connecting to
>> PCIBus with just qemu_irqs.
>>
>
> Yeah, it's possible, but I don't want to spend my time doing this.
>
>>> In reality, GPIO only makes sense for a small class of simple devices
>>> where
>>> modelling the pin-out interface makes sense (like a 7-segment LCD).  That
>>> suggests that GPIO should not be in the DeviceState interface but instead
>>> should be in a SimpleDevice subclass or something like that.
>>>
>>>

 Could you point to examples of SystemBus overuse?


>>>
>>> anthony@titi:~/git/qemu/hw$ grep qdev_create *.c | wc -l
>>> 73
>>> anthony@titi:~/git/qemu/hw$ grep 'qdev_create(NULL' *.c | wc -l
>>> 56
>>>
>>> SystemBus has become a catch-all for shallow qdev conversions.  We've got
>>> Northbridges, RAM, and network devices sitting on the same bus...
>>>
>>
>> On Sparc32 I have not bothered to create a SBus bus. Now it would be
>> useful to get bootindex corrected. Most devices (even on-board IO)
>> should use SBus.
>>
>> The only other bus (MBus) would exist between CPU, IOMMU and memory.
>>
>> But SysBus fitted the need until recently.
>>
>
> A good way to judge where a device is using a bus interface correct: does
> all of it's interactions with any other part of the guest state involve
> method calls to the bus?
>
> Right now, the answer is no for just about every device in QEMU.  This is
> the problem that qdev really was meant to solve and we're not really any
> further along solving it unfortunately.
>
>>> I'm not arguing against a generic factory interface, I'm arguing that it
>>> should be separate.
>>>
>>> IOW:
>>>
>>> SerialState *serial_create(int iobase, int irq, ...);
>>>
>>> static DeviceState *qdev_serial_create(QemuOpts *opts);
>>>
>>> static void serial_init(void)
>>> {
>>>     qdev_register("serial", qdev_serial_create);
>>> }
>>>
>>> The key p

[Qemu-devel] [PATCH 0/4] More qdev conversion and optional devices

2011-02-13 Thread Blue Swirl
Make applesmc and vga-isa optional, convert i8254 to qdev.

Blue Swirl (4):
  applesmc: make optional
  vga-isa: convert to qdev
  vga-isa: make optional
  i8254: convert to qdev

 Makefile.objs  |1 +
 Makefile.target|2 +-
 default-configs/i386-softmmu.mak   |1 +
 default-configs/x86_64-softmmu.mak |1 +
 hw/i8254.c |   61 ++--
 hw/mips_fulong2e.c |4 +-
 hw/mips_jazz.c |4 +-
 hw/mips_malta.c|4 +-
 hw/mips_r4k.c  |4 +-
 hw/pc.c|5 +--
 hw/pc.h|   39 +-
 hw/pcspk.c |4 +-
 hw/ppc_prep.c  |4 +-
 hw/vga-isa.c   |   51 ++---
 hw/vga.c   |   22 -
 hw/vga_int.h   |1 +
 16 files changed, 152 insertions(+), 56 deletions(-)



[Qemu-devel] [PATCH 1/4] applesmc: make optional

2011-02-13 Thread Blue Swirl
Based on patch by David Ahern.

Signed-off-by: Blue Swirl 
---
 Makefile.objs  |1 +
 Makefile.target|2 +-
 default-configs/i386-softmmu.mak   |1 +
 default-configs/x86_64-softmmu.mak |1 +
 4 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index adb5842..25ff8a5 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -196,6 +196,7 @@ hw-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o
 hw-obj-$(CONFIG_APM) += pm_smbus.o apm.o
 hw-obj-$(CONFIG_DMA) += dma.o
 hw-obj-$(CONFIG_HPET) += hpet.o
+hw-obj-$(CONFIG_APPLESMC) += applesmc.o

 # PPC devices
 hw-obj-$(CONFIG_OPENPIC) += openpic.o
diff --git a/Makefile.target b/Makefile.target
index a6c30dd..e4b1210 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -215,7 +215,7 @@ obj-$(CONFIG_KVM) += ivshmem.o
 obj-i386-y += vga.o
 obj-i386-y += mc146818rtc.o i8259.o pc.o
 obj-i386-y += cirrus_vga.o apic.o ioapic.o piix_pci.o
-obj-i386-y += vmport.o applesmc.o
+obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 3e0eddf..55589fa 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -20,3 +20,4 @@ CONFIG_NE2000_ISA=y
 CONFIG_PIIX_PCI=y
 CONFIG_SOUND=y
 CONFIG_HPET=y
+CONFIG_APPLESMC=y
diff --git a/default-configs/x86_64-softmmu.mak
b/default-configs/x86_64-softmmu.mak
index 1cc1b61..59b7893 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -20,3 +20,4 @@ CONFIG_NE2000_ISA=y
 CONFIG_PIIX_PCI=y
 CONFIG_SOUND=y
 CONFIG_HPET=y
+CONFIG_APPLESMC=y
-- 
1.6.2.4


0001-applesmc-make-optional.patch
Description: application/mbox


[Qemu-devel] [PATCH 2/4] vga-isa: convert to qdev

2011-02-13 Thread Blue Swirl
Signed-off-by: Blue Swirl 
---
 hw/pc.h  |8 +++-
 hw/vga-isa.c |   51 +--
 hw/vga.c |   22 ++
 hw/vga_int.h |1 +
 4 files changed, 67 insertions(+), 15 deletions(-)

diff --git a/hw/pc.h b/hw/pc.h
index 64a3a22..475484a 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -181,7 +181,13 @@ enum vga_retrace_method {

 extern enum vga_retrace_method vga_retrace_method;

-int isa_vga_init(void);
+static inline int isa_vga_init(void)
+{
+isa_create_simple("isa-vga");
+
+return 0;
+}
+
 int pci_vga_init(PCIBus *bus);
 int isa_vga_mm_init(target_phys_addr_t vram_base,
 target_phys_addr_t ctrl_base, int it_shift);
diff --git a/hw/vga-isa.c b/hw/vga-isa.c
index 3046054..fde0d56 100644
--- a/hw/vga-isa.c
+++ b/hw/vga-isa.c
@@ -29,16 +29,40 @@
 #include "qemu-timer.h"
 #include "loader.h"

-int isa_vga_init(void)
+typedef struct ISAVGAState {
+ISADevice dev;
+struct VGACommonState state;
+} ISAVGAState;
+
+static void vga_reset_isa(DeviceState *dev)
 {
-VGACommonState *s;
+ISAVGAState *d = container_of(dev, ISAVGAState, dev.qdev);
+VGACommonState *s = &d->state;

-s = qemu_mallocz(sizeof(*s));
+vga_common_reset(s);
+}

-vga_common_init(s, VGA_RAM_SIZE);
-vga_init(s);
-vmstate_register(NULL, 0, &vmstate_vga_common, s);
+static int vga_initfn(ISADevice *dev)
+{
+ISAVGAState *d = DO_UPCAST(ISAVGAState, dev, dev);
+VGACommonState *s = &d->state;
+int vga_io_memory;

+vga_common_init(s, VGA_RAM_SIZE);
+vga_io_memory = vga_init_io(s);
+cpu_register_physical_memory(isa_mem_base + 0x000a, 0x2,
+ vga_io_memory);
+qemu_register_coalesced_mmio(isa_mem_base + 0x000a, 0x2);
+isa_init_ioport(dev, 0x3c0);
+isa_init_ioport(dev, 0x3b4);
+isa_init_ioport(dev, 0x3ba);
+isa_init_ioport(dev, 0x3da);
+isa_init_ioport(dev, 0x3c0);
+#ifdef CONFIG_BOCHS_VBE
+isa_init_ioport(dev, 0x1ce);
+isa_init_ioport(dev, 0x1cf);
+isa_init_ioport(dev, 0x1d0);
+#endif /* CONFIG_BOCHS_VBE */
 s->ds = graphic_console_init(s->update, s->invalidate,
  s->screen_dump, s->text_update, s);

@@ -47,3 +71,18 @@ int isa_vga_init(void)
 rom_add_vga(VGABIOS_FILENAME);
 return 0;
 }
+
+static ISADeviceInfo vga_info = {
+.qdev.name = "isa-vga",
+.qdev.size = sizeof(ISAVGAState),
+.qdev.vmsd = &vmstate_vga_common,
+.qdev.reset = vga_reset_isa,
+.qdev.no_user  = 1,
+.init  = vga_initfn,
+};
+
+static void vga_register(void)
+{
+isa_qdev_register(&vga_info);
+}
+device_init(vga_register)
diff --git a/hw/vga.c b/hw/vga.c
index e2151a2..fd492d0 100644
--- a/hw/vga.c
+++ b/hw/vga.c
@@ -2259,12 +2259,8 @@ void vga_common_init(VGACommonState *s, int vga_ram_size)
 }

 /* used by both ISA and PCI */
-void vga_init(VGACommonState *s)
+int vga_init_io(VGACommonState *s)
 {
-int vga_io_memory;
-
-qemu_register_reset(vga_reset, s);
-
 register_ioport_write(0x3c0, 16, 1, vga_ioport_write, s);

 register_ioport_write(0x3b4, 2, 1, vga_ioport_write, s);
@@ -2278,7 +2274,6 @@ void vga_init(VGACommonState *s)
 register_ioport_read(0x3d4, 2, 1, vga_ioport_read, s);
 register_ioport_read(0x3ba, 1, 1, vga_ioport_read, s);
 register_ioport_read(0x3da, 1, 1, vga_ioport_read, s);
-s->bank_offset = 0;

 #ifdef CONFIG_BOCHS_VBE
 #if defined (TARGET_I386)
@@ -2296,8 +2291,19 @@ void vga_init(VGACommonState *s)
 #endif
 #endif /* CONFIG_BOCHS_VBE */

-vga_io_memory = cpu_register_io_memory(vga_mem_read, vga_mem_write, s,
-   DEVICE_LITTLE_ENDIAN);
+return cpu_register_io_memory(vga_mem_read, vga_mem_write, s,
+  DEVICE_LITTLE_ENDIAN);
+}
+
+void vga_init(VGACommonState *s)
+{
+int vga_io_memory;
+
+qemu_register_reset(vga_reset, s);
+
+s->bank_offset = 0;
+
+vga_io_memory = vga_init_io(s);
 cpu_register_physical_memory(isa_mem_base + 0x000a, 0x2,
  vga_io_memory);
 qemu_register_coalesced_mmio(isa_mem_base + 0x000a, 0x2);
diff --git a/hw/vga_int.h b/hw/vga_int.h
index 1067f2c..d2811bd 100644
--- a/hw/vga_int.h
+++ b/hw/vga_int.h
@@ -191,6 +191,7 @@ static inline int c6_to_8(int v)

 void vga_common_init(VGACommonState *s, int vga_ram_size);
 void vga_init(VGACommonState *s);
+int vga_init_io(VGACommonState *s);
 void vga_common_reset(VGACommonState *s);

 void vga_dirty_log_start(VGACommonState *s);
-- 
1.6.2.4


0002-vga-isa-convert-to-qdev.patch
Description: application/mbox


[Qemu-devel] [PATCH 3/4] vga-isa: make optional

2011-02-13 Thread Blue Swirl
Ignore failure with vga-isa device creation, but print a warning
message.

Signed-off-by: Blue Swirl 
---
 hw/pc.h |   10 --
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/hw/pc.h b/hw/pc.h
index 475484a..60f8c42 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -183,9 +183,15 @@ extern enum vga_retrace_method vga_retrace_method;

 static inline int isa_vga_init(void)
 {
-isa_create_simple("isa-vga");
+ISADevice *dev;

-return 0;
+dev = isa_try_create("isa-vga");
+if (!dev) {
+fprintf(stderr, "Warning: isa-vga not available\n");
+return 0;
+}
+qdev_init_nofail(&dev->qdev);
+return 1;
 }

 int pci_vga_init(PCIBus *bus);
-- 
1.6.2.4


0003-vga-isa-make-optional.patch
Description: application/mbox


[Qemu-devel] [PATCH 4/4] i8254: convert to qdev

2011-02-13 Thread Blue Swirl
Convert to qdev. Don't expose PITState.

Signed-off-by: Blue Swirl 
---
 hw/i8254.c |   61 +--
 hw/mips_fulong2e.c |4 +-
 hw/mips_jazz.c |4 +-
 hw/mips_malta.c|4 +-
 hw/mips_r4k.c  |4 +-
 hw/pc.c|5 +--
 hw/pc.h|   25 ++--
 hw/pcspk.c |4 +-
 hw/ppc_prep.c  |4 +-
 9 files changed, 75 insertions(+), 40 deletions(-)

diff --git a/hw/i8254.c b/hw/i8254.c
index 06b225c..680caab 100644
--- a/hw/i8254.c
+++ b/hw/i8254.c
@@ -53,9 +53,12 @@ typedef struct PITChannelState {
 qemu_irq irq;
 } PITChannelState;

-struct PITState {
+typedef struct PITState {
+ISADevice dev;
+uint32_t irq;
+uint32_t iobase;
 PITChannelState channels[3];
-};
+} PITState;

 static PITState pit_state;

@@ -119,8 +122,9 @@ static int pit_get_out1(PITChannelState *s,
int64_t current_time)
 return out;
 }

-int pit_get_out(PITState *pit, int channel, int64_t current_time)
+int pit_get_out(ISADevice *dev, int channel, int64_t current_time)
 {
+PITState *pit = DO_UPCAST(PITState, dev, dev);
 PITChannelState *s = &pit->channels[channel];
 return pit_get_out1(s, current_time);
 }
@@ -179,8 +183,9 @@ static int64_t
pit_get_next_transition_time(PITChannelState *s,
 }

 /* val must be 0 or 1 */
-void pit_set_gate(PITState *pit, int channel, int val)
+void pit_set_gate(ISADevice *dev, int channel, int val)
 {
+PITState *pit = DO_UPCAST(PITState, dev, dev);
 PITChannelState *s = &pit->channels[channel];

 switch(s->mode) {
@@ -210,20 +215,23 @@ void pit_set_gate(PITState *pit, int channel, int val)
 s->gate = val;
 }

-int pit_get_gate(PITState *pit, int channel)
+int pit_get_gate(ISADevice *dev, int channel)
 {
+PITState *pit = DO_UPCAST(PITState, dev, dev);
 PITChannelState *s = &pit->channels[channel];
 return s->gate;
 }

-int pit_get_initial_count(PITState *pit, int channel)
+int pit_get_initial_count(ISADevice *dev, int channel)
 {
+PITState *pit = DO_UPCAST(PITState, dev, dev);
 PITChannelState *s = &pit->channels[channel];
 return s->count;
 }

-int pit_get_mode(PITState *pit, int channel)
+int pit_get_mode(ISADevice *dev, int channel)
 {
+PITState *pit = DO_UPCAST(PITState, dev, dev);
 PITChannelState *s = &pit->channels[channel];
 return s->mode;
 }
@@ -462,9 +470,9 @@ static const VMStateDescription vmstate_pit = {
 }
 };

-static void pit_reset(void *opaque)
+static void pit_reset(DeviceState *dev)
 {
-PITState *pit = opaque;
+PITState *pit = container_of(dev, PITState, dev.qdev);
 PITChannelState *s;
 int i;

@@ -498,20 +506,39 @@ void hpet_pit_enable(void)
 pit_load_count(s, 0);
 }

-PITState *pit_init(int base, qemu_irq irq)
+static int pit_initfn(ISADevice *dev)
 {
-PITState *pit = &pit_state;
+PITState *pit = DO_UPCAST(PITState, dev, dev);
 PITChannelState *s;

 s = &pit->channels[0];
 /* the timer 0 is connected to an IRQ */
 s->irq_timer = qemu_new_timer(vm_clock, pit_irq_timer, s);
-s->irq = irq;
+s->irq = isa_reserve_irq(pit->irq);

-vmstate_register(NULL, base, &vmstate_pit, pit);
-qemu_register_reset(pit_reset, pit);
-register_ioport_write(base, 4, 1, pit_ioport_write, pit);
-register_ioport_read(base, 3, 1, pit_ioport_read, pit);
+register_ioport_write(pit->iobase, 4, 1, pit_ioport_write, pit);
+register_ioport_read(pit->iobase, 3, 1, pit_ioport_read, pit);
+isa_init_ioport(dev, pit->iobase);

-return pit;
+return 0;
+}
+
+static ISADeviceInfo pit_info = {
+.qdev.name = "isa-pit",
+.qdev.size = sizeof(PITState),
+.qdev.vmsd = &vmstate_pit,
+.qdev.reset= pit_reset,
+.qdev.no_user  = 1,
+.init  = pit_initfn,
+.qdev.props = (Property[]) {
+DEFINE_PROP_UINT32("irq", PITState, irq,  -1),
+DEFINE_PROP_HEX32("iobase", PITState, iobase,  -1),
+DEFINE_PROP_END_OF_LIST(),
+},
+};
+
+static void pit_register(void)
+{
+isa_qdev_register(&pit_info);
 }
+device_init(pit_register)
diff --git a/hw/mips_fulong2e.c b/hw/mips_fulong2e.c
index 2783ed5..f5ae639 100644
--- a/hw/mips_fulong2e.c
+++ b/hw/mips_fulong2e.c
@@ -67,7 +67,7 @@
 #define FULONG2E_ATI_SLOT6
 #define FULONG2E_RTL8139_SLOT7

-static PITState *pit;
+static ISADevice *pit;

 static struct _loaderparams {
 int ram_size;
@@ -369,7 +369,7 @@ static void mips_fulong2e_init(ram_addr_t
ram_size, const char *boot_device,
 qdev_init_nofail(eeprom);

 /* init other devices */
-pit = pit_init(0x40, isa_reserve_irq(0));
+pit = pit_init(0x40, 0);
 cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
 DMA_init(0, cpu_exit_irq);

diff --git a/hw/mips_jazz.c b/hw/mips_jazz.c
index 85eba5a..a100394 100644
--- a/hw/mips_jazz.c
+++ b/hw/mips_jazz.c
@@ -115,7 +115,7 @@ void mips_jazz_init (ram_addr_t ram_size,
 void* rc4030_opaque;
 int s_rt

Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Peter Maydell
On 13 February 2011 16:56, Anthony Liguori  wrote:
> If we can move away from Bus abstraction and to a simpler interface
> mechanism, then we can express peer relationships by just having bidirection
> references.  IOW:
>
> -device cpus,northbridge=nb,id=cpus,count=16 -device i440fx,cpus=cpus
>
> I don't think modelling each CPU makes sense.  We should probably just model
> all cpus in a single device for the sake of simplicity.

How would this work for systems with multiple CPUs which have different
views of the world? (ie their memory maps differ so that eg some RAM is
shared between them but some parts of the address space are different
RAM for the two cores, some devices one core only, some devices shared
between cores but the device can tell which core made an IO request)
With a bus-style abstraction this is straightforward: each core has its
own bus which is what defines its view of the world, some devices
and RAM are wired up to both buses. I'm not sure how the bidirectional
reference model would look for this?

(Real world examples would be if we ever had any need to actually
model any of the auxiliary cores in say an OMAP device, or the
M3 in a versatile-express. Yes, most systems won't look that odd
but it does come up, especially in testbench type designs, and our
interface abstraction should be able to handle it.)

-- PMM



[Qemu-devel] [PATCH 2/2][v3] linux-user: correct core dump format

2011-02-13 Thread Laurent Vivier
This patch allows to really use the core dumped by qemu with guest
architecture tools.

- it adds a missing bswap_phdr() for the program headers
  of memory regions.

  "objdump -x" sample:

BEFORE:

0x100 off0x0020 vaddr 0x0400 paddr 0x align 2**21
 filesz 0x memsz 0x0010 flags ---
0x100 off0x0020 vaddr 0x00100400 paddr 0x align 2**21
 filesz 0x memsz 0x0008 flags --- 600

AFTER:

LOAD off0x2000 vaddr 0x0004 paddr 0x align 2**13
 filesz 0x memsz 0x1000 flags ---
LOAD off0x2000 vaddr 0x00041000 paddr 0x align 2**13
 filesz 0x memsz 0x0800 flags rw-

- it doesn't pad the note size to sizeof(int32_t).
  On m68k the NT_PRSTATUS note size is 154 and
  must not be rounded up to 156, because this value is checked by
  objdump and gdb.

  "gdb" symptoms:

  "warning: Couldn't find general-purpose registers in core file."

  "objdump -x" sample:

BEFORE:

Sections:
Idx Name  Size  VMA   LMA   File off  Algn
  0 note0 01c4      03b4  2**0
  CONTENTS, READONLY
  1 .auxv 0070      0508  2**2
  CONTENTS
  2 proc1 0010  0400    0020  2**10
  READONLY

AFTER:

Sections:
Idx Name  Size  VMA   LMA   File off  Algn
  0 note0 01c4      03b4  2**0
  CONTENTS, READONLY
  1 .reg/190220050      040e  2**2
  CONTENTS
  2 .reg  0050      040e  2**2
  CONTENTS
  3 .auxv 0070      0508  2**2
  CONTENTS
  4 load1   0004    2000  2**13
  ALLOC, READONLY

Signed-off-by: Laurent Vivier 
---
v2: use a predefined alignment size for target_elf_prstatus
v3: use target_ aligned according target properties
 linux-user/elfload.c |   34 ++
 1 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 2de83e4..fe5410e 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -103,13 +103,13 @@ enum {
 
 typedef target_ulongtarget_elf_greg_t;
 #ifdef USE_UID16
-typedef uint16_ttarget_uid_t;
-typedef uint16_ttarget_gid_t;
+typedef target_ushort   target_uid_t;
+typedef target_ushort   target_gid_t;
 #else
-typedef uint32_ttarget_uid_t;
-typedef uint32_ttarget_gid_t;
+typedef target_uint target_uid_t;
+typedef target_uint target_gid_t;
 #endif
-typedef int32_t target_pid_t;
+typedef target_int  target_pid_t;
 
 #ifdef TARGET_I386
 
@@ -1761,19 +1761,20 @@ struct memelfnote {
 size_t namesz_rounded;
 inttype;
 size_t datasz;
+size_t datasz_rounded;
 void   *data;
 size_t notesz;
 };
 
 struct target_elf_siginfo {
-int  si_signo; /* signal number */
-int  si_code;  /* extra code */
-int  si_errno; /* errno */
+target_int  si_signo; /* signal number */
+target_int  si_code;  /* extra code */
+target_int  si_errno; /* errno */
 };
 
 struct target_elf_prstatus {
 struct target_elf_siginfo pr_info;  /* Info associated with signal */
-short  pr_cursig;/* Current signal */
+target_short   pr_cursig;/* Current signal */
 target_ulong   pr_sigpend;   /* XXX */
 target_ulong   pr_sighold;   /* XXX */
 target_pid_t   pr_pid;
@@ -1785,7 +1786,7 @@ struct target_elf_prstatus {
 struct target_timeval pr_cutime; /* XXX Cumulative user time */
 struct target_timeval pr_cstime; /* XXX Cumulative system time */
 target_elf_gregset_t  pr_reg;   /* GP registers */
-intpr_fpvalid;   /* XXX */
+target_int pr_fpvalid;   /* XXX */
 };
 
 #define ELF_PRARGSZ (80) /* Number of chars for args */
@@ -2036,7 +2037,9 @@ static void fill_note(struct memelfnote *note, const char 
*name, int type,
 note->namesz = namesz;
 note->namesz_rounded = roundup(namesz, sizeof (int32_t));
 note->type = type;
-note->datasz = roundup(sz, sizeof (int32_t));;
+note->datasz = sz;
+note->datasz_rounded = roundup(sz, sizeof (int32_t));
+
 note->data = data;
 
 /*
@@ -2044,7 +2047,7 @@ static void fill_note(struct memelfnote *note, const char 
*name, int type,
  * ELF document.
  */
 note->notesz = sizeof (struct elf_note) +
-note->namesz_rounded + note->datasz;
+note->namesz_rounded + note->datasz_rounded;
 }
 
 static void fill_elf_header(struct elfhdr *elf, int segs, uint16_t machine,
@@ -2264,7 +2267,7 @@ static int write_note(struct memelfnote *men, int fd)
 return (-1);
 if (dump_write(fd, men->name, me

[Qemu-devel] [PATCH 1/2][v3] linux-user: Define target alignment size

2011-02-13 Thread Laurent Vivier
Datatype alignment can be found using following application:

int main(void)
{
printf("alignof(short) %ld\n", __alignof__(short));
printf("alignof(int) %ld\n", __alignof__(int));
printf("alignof(long) %ld\n", __alignof__(long));
printf("alignof(long long) %ld\n", __alignof__(long long));
}

This patch includes following alignments:

i386

   alignof(short) 2
   alignof(int) 4
   alignof(long) 4
   alignof(long long) 8

 x86_64

   alignof(short) 2
   alignof(int) 4
   alignof(long) 8
   alignof(long long) 8

 arm

   alignof(short) 2
   alignof(int) 4
   alignof(long) 4
   alignof(long long) 4

 m68k (680x0)

   alignof(short) 2
   alignof(int) 2
   alignof(long) 2
   alignof(long long) 2

 mips

   alignof(short) 2
   alignof(int) 4
   alignof(long) 4
   alignof(long long) 8

 ppc

   alignof(short) 2
   alignof(int) 4
   alignof(long) 4
   alignof(long long) 8

for other targets, use by default (2,4,4,8).

Please, update for your favorite target...

Signed-off-by: Laurent Vivier 
---
v2: compute align size for each basic datatype
v3: add alignments for some 64bit targets, renanme long_long to llong
 configure  |   17 +
 cpu-defs.h |   14 ++
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index 25381e5..c36c553 100755
--- a/configure
+++ b/configure
@@ -2919,6 +2919,10 @@ target_nptl="no"
 interp_prefix1=`echo "$interp_prefix" | sed "s/%M/$target_arch2/g"`
 echo "CONFIG_QEMU_INTERP_PREFIX=\"$interp_prefix1\"" >> $config_target_mak
 gdb_xml_files=""
+target_short_alignment=2
+target_int_alignment=4
+target_long_alignment=4
+target_llong_alignment=8
 
 TARGET_ARCH="$target_arch2"
 TARGET_BASE_ARCH=""
@@ -2931,9 +2935,11 @@ case "$target_arch2" in
   x86_64)
 TARGET_BASE_ARCH=i386
 target_phys_bits=64
+target_long_alignment=8
   ;;
   alpha)
 target_phys_bits=64
+target_long_alignment=8
 target_nptl="yes"
   ;;
   arm|armeb)
@@ -2942,6 +2948,7 @@ case "$target_arch2" in
 target_nptl="yes"
 gdb_xml_files="arm-core.xml arm-vfp.xml arm-vfp3.xml arm-neon.xml"
 target_phys_bits=32
+target_llong_alignment=4
   ;;
   cris)
 target_nptl="yes"
@@ -2951,6 +2958,9 @@ case "$target_arch2" in
 bflt="yes"
 gdb_xml_files="cf-core.xml cf-fp.xml"
 target_phys_bits=32
+target_int_alignment=2
+target_long_alignment=2
+target_llong_alignment=2
   ;;
   microblaze)
 bflt="yes"
@@ -2974,6 +2984,7 @@ case "$target_arch2" in
 TARGET_BASE_ARCH=mips
 echo "TARGET_ABI_MIPSN64=y" >> $config_target_mak
 target_phys_bits=64
+target_long_alignment=8
   ;;
   ppc)
 gdb_xml_files="power-core.xml power-fpu.xml power-altivec.xml 
power-spe.xml"
@@ -2992,6 +3003,7 @@ case "$target_arch2" in
 TARGET_ABI_DIR=ppc
 gdb_xml_files="power64-core.xml power-fpu.xml power-altivec.xml 
power-spe.xml"
 target_phys_bits=64
+target_long_alignment=8
   ;;
   ppc64abi32)
 TARGET_ARCH=ppc64
@@ -3013,6 +3025,7 @@ case "$target_arch2" in
   sparc64)
 TARGET_BASE_ARCH=sparc
 target_phys_bits=64
+target_long_alignment=8
   ;;
   sparc32plus)
 TARGET_ARCH=sparc64
@@ -3029,6 +3042,10 @@ case "$target_arch2" in
 exit 1
   ;;
 esac
+echo "TARGET_SHORT_ALIGNMENT=$target_short_alignment" >> $config_target_mak
+echo "TARGET_INT_ALIGNMENT=$target_int_alignment" >> $config_target_mak
+echo "TARGET_LONG_ALIGNMENT=$target_long_alignment" >> $config_target_mak
+echo "TARGET_LLONG_ALIGNMENT=$target_llong_alignment" >> $config_target_mak
 echo "TARGET_ARCH=$TARGET_ARCH" >> $config_target_mak
 target_arch_name="`echo $TARGET_ARCH | tr '[:lower:]' '[:upper:]'`"
 echo "TARGET_$target_arch_name=y" >> $config_target_mak
diff --git a/cpu-defs.h b/cpu-defs.h
index 8d4bf86..37780e7 100644
--- a/cpu-defs.h
+++ b/cpu-defs.h
@@ -37,16 +37,22 @@
 
 #define TARGET_LONG_SIZE (TARGET_LONG_BITS / 8)
 
+typedef int16_t target_short __attribute__ ((aligned(TARGET_SHORT_ALIGNMENT)));
+typedef uint16_t target_ushort 
__attribute__((aligned(TARGET_SHORT_ALIGNMENT)));
+typedef int32_t target_int __attribute__((aligned(TARGET_INT_ALIGNMENT)));
+typedef uint32_t target_uint __attribute__((aligned(TARGET_INT_ALIGNMENT)));
+typedef int64_t target_llong __attribute__((aligned(TARGET_LLONG_ALIGNMENT)));
+typedef uint64_t target_ullong 
__attribute__((aligned(TARGET_LLONG_ALIGNMENT)));
 /* target_ulong is the type of a virtual address */
 #if TARGET_LONG_SIZE == 4
-typedef int32_t target_long;
-typedef uint32_t target_ulong;
+typedef int32_t target_long __attribute__((aligned(TARGET_LONG_ALIGNMENT)));
+typedef uint32_t target_ulong __attribute__((aligned(TARGET_LONG_ALIGNMENT)));
 #define TARGET_FMT_lx "%08x"
 #define TARGET_FMT_ld "%d"
 #define TARGET_FMT_lu "%u"
 #elif TARGET_LONG_SIZE == 8
-typedef int64_t target_long;
-typedef uint64_t target_ulong;
+typedef int64_t target_long __attribute__((aligned(TARGET_LONG_ALIGNMENT)));
+typedef uint64_t target_ulong __attribute__((alig

[Qemu-devel] [PATCH 0/2][v4] correct core dump format

2011-02-13 Thread Laurent Vivier
This is the v4 of my patch correcting the core dump format.
(3 versions for patch 2, 3 versions for patch 1 starting at version 2
of patch 2...)

v4 adds some long alignments for 64bit targets,
   renames target_long_long to target_llong, and so on...

v3 introduces a new parameter of the target: the datatype alignment size.

Targets like i386, mips or ppc align (short, int, long, long long) on
(2, 4, 4, 8), target like x86_64 aligns on (2, 4, 8, 8)

but arm aligns on (2, 4, 4, 4) and m68k (680x0) on (2, 2, 2, 2).

And this knowledge is needed to correctly generate a core dump.

For other targets, please update the patch with your favorite one.

V2 introduces target_elf_prstatus alignment (to manage arm and m68k)

v1 corrects core dump for m68k

[PATCH 1/2][v3] linux-user: Define target alignment size
[PATCH 2/2][v3] linux-user: correct core dump format



Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Anthony Liguori

On 02/13/2011 03:00 PM, Blue Swirl wrote:

On Sun, Feb 13, 2011 at 9:57 PM, Anthony Liguori  wrote:
   

On 02/13/2011 01:37 PM, Blue Swirl wrote:
 

On Sun, Feb 13, 2011 at 5:31 PM, Anthony Liguori
  wrote:

   

qdev doesn't expose any state today.  qdev properties are
construction-only
properties that happen to be stored in each device state.

What we really need is a full property framework that includes properties
with hookable getters and setters along with the ability to mark
properties
as construct-only, read-only, or read-write.

But I think it's reasonable to expose construct-only properties as just
an
initfn argument.

 

Sounds OK. About read-write properties, what happens if we one day
have extensive threading, and locks are pushed to device level? I can
imagine a deadlock involving one thread running in IO thread for a
device and another trying to access that device's properties. Maybe
that is not different from function call version.

   

You need hookable setters/getters that can acquire a lock and do the right
thing.  It shouldn't be able to dead lock if the locking is designed right.


 

Yes, but qemu_irq is very restricted as it only models a signal bit of
information and doesn't really have a mechanism to attach/detach in any
generic way.

 

Basic signals are already very useful for many purposes, since they
match digital logic signals in real HW. In theory, whole machines
could be constructed with just qemu_irq and NAND gate emulator. ;-)

   

It's not just in theory.  In the C++ port of QEMU that I wrote, I
implemented an AND, OR, and XOR gate and implemented a full 32-bit adder by
just using a device config file.

If done correctly, using referencing can be extremely powerful.  A full
adder is a good example.  The gates really don't have any concept of bus and
the relationship between gates is definitely not a tree.

 

In the message passing IRQ discussion earlier, it was IIRC decided
that the one bit version would not be changed but a separate message
passing version would be created if ever needed.

   

C already has a message passing interface that supports type safety called
function pointers :-)

An object that implements multiple interfaces where the interface becomes
the "message passing interface" is exactly what I've been saying we need.
  It's flexible and the compiler helps us enforce typing.

 

Any interfaces of a base class should make sense even for derived
classes.

That means if the base class is going to expose essentially a pin-out
interface, that if I have a PCIDevice and cast it to Device, I should be
able to interact with the GPIO interface to interact with the PCI device.
  Presumably, that means interfacing at the PCI signalling level.  That's
insane to model in QEMU :-)

 

This would be doable, if we built buses from a bunch of signals, like
in VHDL or Verilog. It would simplify aliased MMIO addresses nicely,
the undecoded address pins would be ignored. I don't think it would be
useful, but a separate interface could be added for connecting to
PCIBus with just qemu_irqs.

   

Yeah, it's possible, but I don't want to spend my time doing this.

 

In reality, GPIO only makes sense for a small class of simple devices
where
modelling the pin-out interface makes sense (like a 7-segment LCD).  That
suggests that GPIO should not be in the DeviceState interface but instead
should be in a SimpleDevice subclass or something like that.


 

Could you point to examples of SystemBus overuse?


   

anthony@titi:~/git/qemu/hw$ grep qdev_create *.c | wc -l
73
anthony@titi:~/git/qemu/hw$ grep 'qdev_create(NULL' *.c | wc -l
56

SystemBus has become a catch-all for shallow qdev conversions.  We've got
Northbridges, RAM, and network devices sitting on the same bus...

 

On Sparc32 I have not bothered to create a SBus bus. Now it would be
useful to get bootindex corrected. Most devices (even on-board IO)
should use SBus.

The only other bus (MBus) would exist between CPU, IOMMU and memory.

But SysBus fitted the need until recently.

   

A good way to judge where a device is using a bus interface correct: does
all of it's interactions with any other part of the guest state involve
method calls to the bus?

Right now, the answer is no for just about every device in QEMU.  This is
the problem that qdev really was meant to solve and we're not really any
further along solving it unfortunately.

 

I'm not arguing against a generic factory interface, I'm arguing that it
should be separate.

IOW:

SerialState *serial_create(int iobase, int irq, ...);

static DeviceState *qdev_serial_create(QemuOpts *opts);

static void serial_init(void)
{
 qdev_register("serial", qdev_serial_create);
}

The key point is that when we create devices internally, we should have a
C-friendly, type-safe interface to interact with.  This will encourage
composition and a richer device model than what

[Qemu-devel] Re: RFC: New API for PPC for vcpu mmu access

2011-02-13 Thread Alexander Graf

On 12.02.2011, at 01:57, Scott Wood wrote:

> On Fri, 11 Feb 2011 22:07:11 +0100
> Alexander Graf  wrote:
> 
>> 
>> On 11.02.2011, at 21:53, Scott Wood wrote:
>> 
>>> On Fri, 11 Feb 2011 02:41:35 +0100
>>> Alexander Graf  wrote:
>>> 
>> Maybe we should go with Avi's proposal after all and simply keep the 
>> full soft-mmu synced between kernel and user space? That way we only 
>> need a setup call at first, no copying in between and simply update the 
>> user space version whenever something changes in the guest. We need to 
>> store the TLB's contents off somewhere anyways, so all we need is an 
>> additional in-kernel array with internal translation data, but that can 
>> be separate from the guest visible data, right?
>>> 
>>> Hmm, the idea is growing on me.
>>> 
 So then everything we need to get all the functionality we need is a hint 
 from kernel to user space that something changed and vice versa.
 
 From kernel to user space is simple. We can just document that after every 
 RUN, all fields can be modified.
 From user space to kernel, we could modify the entries directly and then 
 pass in an ioctl that passes in a dirty bitmap to kernel space. KVM can 
 then decide what to do with it. I guess the easiest implementation for now 
 would be to ignore the bitmap and simply flush the shadow tlb.
 
 That gives us the flush almost for free. All we need to do is set the tlb 
 to all zeros (should be done by env init anyways) and pass in the 
 "something changed" call. KVM can then decide to simply drop all of its 
 shadow state or loop through every shadow entry and flush it individually. 
 Maybe we should give a hint on the amount of flushes, so KVM can implement 
 some threshold.
>>> 
>>> OK.  We'll also need a config ioctl to specify MMU type/size and the address
>>> of the arrays.
>> 
>> Right, a setup call basically :).
> 
> OK, here goes v3:
> 
> [Note: one final thought after writing this up -- this isn't going to work
> too well in cases where the guest can directly manipulate its TLB, such as
> with the LRAT feature of Power Arch 2.06.  We'll still need a
> copy-in/copy-out mechanism for that.]

In that case qemu sets the mmu mode respectively and is aware of the fact that 
it needs get/set methods. We'll get to that when we have LRAT implemented :).

> struct kvmppc_book3e_tlb_entry {
>   union {
>   __u64 mas8_1;
>   struct {
>   __u32 mas8;
>   __u32 mas1;
>   };
>   };
>   __u64 mas2;
>   union {
>   __u64 mas7_3
>   struct {
>   __u32 mas7;
>   __u32 mas3;
>   };
>   };
> };

Looks good to me, except for the anonymous unions and structs of course. Avi 
dislikes those :). Is there any obvious reason we need to have unions in the 
first place? The compiler should be clever enough to pick the right size 
accessors when writing/reading masked  __u64 entries, no? The struct name 
should also have a version indicator - it's the data descriptor only a single 
specific mmu_type, right?

> 
> For an MMU type of KVM_MMU_PPC_BOOK3E_NOHV, the mas8 in 
> kvmppc_book3e_tlb_entry is
> present but not supported.
> 
> struct kvmppc_book3e_tlb_params {
>   /*
>* book3e defines 4 TLBs.  Individual implementations may have
>* fewer.  TLBs that do not already exist on the target must be
>* configured with a size of zero.  A tlb_ways value of zero means
>* the array is fully associative.  Only TLBs that are already
>* set-associative on the target may be configured with a different
>* associativity.  A set-associative TLB may not exceed 255 ways.
>*
>* KVM will adjust TLBnCFG based on the sizes configured here,
>* though arrays greater than 2048 entries will have TLBnCFG[NENTRY]
>* set to zero.
>*
>* The size of any TLB that is set-associative must be a multiple of
>* the number of ways, and the number of sets must be a power of two.
>*
>* The page sizes supported by a TLB shall be determined by reading
>* the TLB configuration registers.  This is not adjustable by 
> userspace.
>* [Note: need sregs]
>*/
>   __u32 tlb_sizes[4];
>   __u8 tlb_ways[4];
>   __u32 reserved[11];
> };
> 
> KVM_CONFIG_TLB
> --
> 
> Capability: KVM_CAP_SW_TLB
> Type: vcpu ioctl
> Parameters: struct kvm_config_tlb (in)
> Returns: 0 on success
> -1 on error
> 
> struct kvm_config_tlb {
>   __u64 params;
>   __u64 array;
>   __u32 mmu_type;
>   __u32 array_len;

Some reserved bits please. IIRC Avi also really likes it when in addition to 
reserved fields there's also a "features" int that indicates which reserved 
fields are actually usable. Shouldn't hurt here either, right?

> };
> 

Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Anthony Liguori

On 02/13/2011 03:24 PM, Peter Maydell wrote:

On 13 February 2011 16:56, Anthony Liguori  wrote:
   

If we can move away from Bus abstraction and to a simpler interface
mechanism, then we can express peer relationships by just having bidirection
references.  IOW:

-device cpus,northbridge=nb,id=cpus,count=16 -device i440fx,cpus=cpus

I don't think modelling each CPU makes sense.  We should probably just model
all cpus in a single device for the sake of simplicity.
 

How would this work for systems with multiple CPUs which have different
views of the world? (ie their memory maps differ so that eg some RAM is
shared between them but some parts of the address space are different
RAM for the two cores, some devices one core only, some devices shared
between cores but the device can tell which core made an IO request)
With a bus-style abstraction this is straightforward: each core has its
own bus which is what defines its view of the world, some devices
and RAM are wired up to both buses. I'm not sure how the bidirectional
reference model would look for this?
   


Each core has it's own northbridge.  You would do:

-device arm-cpu,northbridge=nb1  -device dsp,northbridge=nb2

Or whatever.

Regards,

Anthony Liguori


(Real world examples would be if we ever had any need to actually
model any of the auxiliary cores in say an OMAP device, or the
M3 in a versatile-express. Yes, most systems won't look that odd
but it does come up, especially in testbench type designs, and our
interface abstraction should be able to handle it.)

-- PMM

   





Re: [Qemu-devel] KVM call minutes for Feb 8

2011-02-13 Thread Peter Maydell
On 13 February 2011 22:43, Anthony Liguori  wrote:
> On 02/13/2011 03:24 PM, Peter Maydell wrote:
>> How would this work for systems with multiple CPUs which have different
>> views of the world? (ie their memory maps differ so that eg some RAM is
>> shared between them but some parts of the address space are different
>> RAM for the two cores, some devices one core only, some devices shared
>> between cores but the device can tell which core made an IO request)
>> With a bus-style abstraction this is straightforward: each core has its
>> own bus which is what defines its view of the world, some devices
>> and RAM are wired up to both buses. I'm not sure how the bidirectional
>> reference model would look for this?
>
> Each core has it's own northbridge.  You would do:
>
> -device arm-cpu,northbridge=nb1  -device dsp,northbridge=nb2

I'm afraid I don't really understand what you mean here. "Northbridge"
as I understand it is a very PC-architecture specific term; can you
explain a bit more about what would actually be going on here?
(ie what are the various components in this kind of system model
and what are they doing?)

-- PMM



Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread David Gibson
On Sun, Feb 13, 2011 at 08:29:05PM +0200, Blue Swirl wrote:
> On Sun, Feb 13, 2011 at 5:08 PM, Anthony Liguori  
> wrote:
> > On 02/13/2011 05:12 AM, David Gibson wrote:
[snip]
> > The arguments will have to be extracted from the CPU state but I don't think
> > we'll really ever have common hypercall implementations anyway so that's not
> > a huge problem.
> 
> Nice idea. Then the part handling CPUState probably should belong to
> target-ppc/ rather than hw/.

Doesn't work.  Different hypervisors may have arguments - even the
hcall number itself - arranged differently in the registers.  My
earlier drafts had this in target-ppc/; I moved it to hw/ for a
reason.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] Re: [PATCH 15/15] Implement the bus structure for PAPR virtual IO

2011-02-13 Thread David Gibson
On Sun, Feb 13, 2011 at 09:08:22AM -0600, Anthony Liguori wrote:
> On 02/13/2011 05:12 AM, David Gibson wrote:
> >On Sun, Feb 13, 2011 at 10:08:23AM +0200, Blue Swirl wrote:
> >>On Sun, Feb 13, 2011 at 1:15 AM, Benjamin Herrenschmidt
[snip]
> In KVM for x86, instead of using a secondary interface (like
> vmmcall/vmcall), we do all of our paravirtualization using native
> hardware interfaces that we can trap (PIO/MMIO).
> 
> IIUC, on Power, trapping MMIO is not possible due to the MMU mode
> that is currently used (PFs are delivered directly to the guest).
> 
> So it's not really possible to switch from using hypercalls to using MMIO.

That's correct.

> What I would suggest is modelling hypercalls as another I/O address
> space for the processor.  So instead of having a function pointer in
> the CPUState, introduce a:
> 
> typedef void (HypercallFunc)(CPUState *env, void *opaque);
> 
> /* register a hypercall handler */
> void register_hypercall(target_ulong index, HypercallFunc *handler,
> void *opaque);
> void unregister_hypercall(target_ulong index);
> 
> /* dispatch a hypercall */
> void cpu_hypercall(CPUState *env, target_ulong index);

Well, I can certainly implement dynamic registration - in fact I've
done that, I just need to fold it into the earlier part of the patch
series.

But the only "address" we have for this hypercall address space is the
hypercall number, and it's not architected where that will be
supplied.  So we'd still need a per-platform hook to extract the
index from the CPUState.

> This interface could also be used to implement hypercall based
> interfaces on s390 and x86.
> 
> The arguments will have to be extracted from the CPU state but I
> don't think we'll really ever have common hypercall implementations
> anyway so that's not a huge problem.
> 
> >>on real HW?
> >The interface already exists on real HW.  It's described in the PAPR
> >document we keep mentioning.
> 
> Another thing to note is that the hypercall based I/O devices the
> interfaces that the current Power hypervisor uses so implementing
> this interface is necessary to support existing guests.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] RFC: Implement emulation of pSeries logical partitions

2011-02-13 Thread FUJITA Tomonori
On Sun, 13 Feb 2011 01:54:12 +1100
David Gibson  wrote:

> This patch series adds a "pseries" machine to qemu, allowing it to
> emulate IBM pSeries logical partitions.  Along the way we add a bunch
> of support for more modern ppc CPUs than are currently supported.  It
> also makes some significant cleanups to the translation code for hash
> page table based ppc MMUs.
> 
> This is a first version of this series for review.  There are a number
> of additional patches adding features such as virtual IO devices to
> the emulated pSeries platform, which will be added to the series once
> they're a bit more polished.

The communication between LPARs that can be used for something like
VIO server is (or will be) supported?



Re: [Qemu-devel] [PATCH] eepro100: pad to ensure minimum packet size

2011-02-13 Thread Bruce Rogers
 >>> On 2/13/2011 at 05:17 AM, Stefan Weil  wrote: 
> Am 11.02.2011 20:36, schrieb Bruce Rogers:
>> Recent gpxe e100pro drivers will drop small packets because the emulated
>> nic will report an error for small frames. In the qemu model we should
>> instead have the e100pro pad out the received frames to be the minimum
>> size and not report this case as an error.
>>
>> Signed-off-by: Bruce Rogers 
>> ---
>> hw/eepro100.c | 23 +--
>> 1 files changed, 13 insertions(+), 10 deletions(-)
>>
>> diff --git a/hw/eepro100.c b/hw/eepro100.c
>> index edf48f6..03b6934 100644
>> --- a/hw/eepro100.c
>> +++ b/hw/eepro100.c
>> @@ -1645,6 +1645,8 @@ static int nic_can_receive(VLANClientState *nc)
>> #endif
>> }
>>
>> +#define MIN_BUF_SIZE 60 /* Min. octets in an ethernet frame sans FCS */
>> +
>> static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, 
>> size_t size)
>> {
>> /* TODO:
>> @@ -1653,6 +1655,7 @@ static ssize_t nic_receive(VLANClientState *nc, 
>> const uint8_t * buf, size_t size
>> */
>> EEPRO100State *s = DO_UPCAST(NICState, nc, nc)->opaque;
>> uint16_t rfd_status = 0xa000;
>> + uint8_t min_buf[MIN_BUF_SIZE];
>> static const uint8_t broadcast_macaddr[6] =
>> { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
>>
>> @@ -1660,15 +1663,15 @@ static ssize_t nic_receive(VLANClientState 
>> *nc, const uint8_t * buf, size_t size
>> /* CSMA is disabled. */
>> logout("%p received while CSMA is disabled\n", s);
>> return -1;
>> - } else if (size < 64 && (s->configuration[7] & BIT(0))) {
>> - /* Short frame and configuration byte 7/0 (discard short receive) set:
>> - * Short frame is discarded */
>> - logout("%p received short frame (%zu byte)\n", s, size);
>> - s->statistics.rx_short_frame_errors++;
>> -#if 0
>> - return -1;
>> -#endif
>> - } else if ((size > MAX_ETH_FRAME_SIZE + 4) && !(s->configuration[18] 
>> & BIT(3))) {
>> + }
>> + /* Pad to minimum Ethernet frame length */
>> + if (size < sizeof(min_buf)) {
>> + memcpy(min_buf, buf, size);
>> + memset(&min_buf[size], 0, sizeof(min_buf) - size);
>> + buf = min_buf;
>> + size = sizeof(min_buf);
>> + }
>> + if ((size > MAX_ETH_FRAME_SIZE + 4) && !(s->configuration[18] & 
>> BIT(3))) {
>> /* Long frame and configuration byte 18/3 (long receive ok) not set:
>> * Long frames are discarded. */
>> logout("%p received long frame (%zu byte), ignored\n", s, size);
>> @@ -1744,7 +1747,7 @@ static ssize_t nic_receive(VLANClientState *nc, 
>> const uint8_t * buf, size_t size
>> "(%zu bytes); data truncated\n", rfd_size, size);
>> size = rfd_size;
>> }
>> - if (size < 64) {
>> + if (size < MIN_BUF_SIZE) {
>> rfd_status |= 0x0080;
>> }
>> TRACE(OTHER, logout("command 0x%04x, link 0x%08x, addr 0x%08x, size %u\n",
> 
> 
> Could you please give more details of the test scenario which fails?
> I'd like to reproduce it here and find a better solution.
> 
> The configuration bit "discard short frame" exists in real hardware,
> so removing the code which emulates this behavior is not the correct
> solution - even if it helps in a special case.
> 
> No acknowledge for this patch from me.
> 
> Regards,
> Stefan Weil

What I'm doing is using the current v1.0.1 gpxe based epro100 rom to pxe boot 
using the user mode networking stack. For example, the following (in tree) 
command line fails to pxe boot successfully, when the above mentioned pxe rom 
is used:

x86_64-softmmu/qemu-system-x86_64 -boot n -bootp 
http://boot.kernel.org/bko/pxelinux.0 -net user -net nic,model=i82559er

(It doesn't have to be using http protocol however - using a local tftp path to 
pxelinux.0 fails for me as well.)

I found the gpxe git commit which causes the failure to be the following:

commit cdcb4165bdabf3adbc4b2e5937c279614d769aa9
Author: Thomas Miletich 
Date:   Tue Oct 6 23:57:49 2009 +0200

[eepro100] Convert to native gPXE API

This version is Based on Michael Decker's GSoC 2008 code.
A number cleanups and fixes were applied.


This commit causes packets which are short, such as come from the slirp code, 
to be rejected.

Bruce



Re: [Qemu-devel] [PATCH] Remove a detached device from qemu_device_opts.

2011-02-13 Thread Wen Congyang
At 01/27/2011 05:00 PM, Ken'ichi Ohmichi Write:
> 
> Hi,
> 
> When I tried to attach the interface after detaching the same interface,
> the virsh command output the following and it failed:
> 
>   # virsh detach-interface Domain01 network --mac 52:54:00:0d:78:92
>   Interface detached successfully
> 
>   # virsh attach-interface Domain01 network default --mac 52:54:00:0d:78:92
>   error: Failed to attach interface
>   error: internal error unable to execute QEMU command 'device_add': 
> Duplicate ID 'net0' for device
>   #

I can reproduce this problem.

> 
> The reason is that a detached device is not removed from the list
> "qemu_device_opts", and this patch fixes it.

The detached device will be removed from the list "qemu_device_opts"
in qdev_free() asynchronously.

> 
> 
> Thanks
> Ken'ichi Ohmichi
> 
> 
> Signed-off-by: Ken'ichi Ohmichi 
> ---
> --- a/hw/qdev.c   2011-01-27 17:42:25.0 +0900
> +++ b/hw/qdev.c   2011-01-27 17:43:46.0 +0900
> @@ -905,6 +905,8 @@ int do_device_del(Monitor *mon, const QD
>  qerror_report(QERR_DEVICE_NOT_FOUND, id);
>  return -1;
>  }
> +qemu_opts_del(qemu_opts_find(&qemu_device_opts, id));
> +

We can not use qemu_device_opts here, because qemu_device_opts is
a static variable in qemu-config.c. We should use qemu_find_opts("device")
instead of qemu_device_opts.

I test your patch by attach/detach interface, and qemu core dumps.
The reason may be that we call qemu_opts_del() twice.

Here is the log(/var/log/libvirt/qemu/.log):

*** glibc detected *** /usr/libexec/qemu-system-x86_64: free(): invalid 
pointer: 0x0355f000 ***
=== Backtrace: =
/lib64/libc.so.6[0x351f275676]
/usr/libexec/qemu-system-x86_64[0x43e2a6]
/usr/libexec/qemu-system-x86_64[0x43f9a0]
/usr/libexec/qemu-system-x86_64[0x4400ee]
/usr/libexec/qemu-system-x86_64[0x4cb176]
/usr/libexec/qemu-system-x86_64[0x5af9f6]
/usr/libexec/qemu-system-x86_64[0x4953c1]
/usr/libexec/qemu-system-x86_64[0x495dab]
/usr/libexec/qemu-system-x86_64[0x43a087]
/usr/libexec/qemu-system-x86_64[0x43a5dd]
/usr/libexec/qemu-system-x86_64[0x51f8f5]
/usr/libexec/qemu-system-x86_64[0x40ad4b]
/usr/libexec/qemu-system-x86_64[0x40ae52]
/usr/libexec/qemu-system-x86_64[0x585c81]
/usr/libexec/qemu-system-x86_64[0x589da7]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x351f21ec5d]
/usr/libexec/qemu-system-x86_64[0x408a19]
=== Memory map: 
0040-00784000 r-xp  08:06 679152 
/usr/libexec/qemu-system-x86_64
00983000-009c9000 rw-p 00383000 08:06 679152 
/usr/libexec/qemu-system-x86_64
009c9000-01177000 rw-p  00:00 0 
01177000-01178000 rwxp  00:00 0 
01178000-011d5000 rw-p  00:00 0 
02d32000-03109000 rw-p  00:00 0 
03109000-03119000 rw-p  00:00 0 
03119000-03134000 rw-p  00:00 0 
03134000-03144000 rw-p  00:00 0 
03144000-0352c000 rw-p  00:00 0 
0352c000-0353c000 rw-p  00:00 0 
0353c000-0354f000 rw-p  00:00 0 
0354f000-0355f000 rw-p  00:00 0 
0355f000-0358 rw-p  00:00 0 
4154f000-4954f000 rwxp  00:00 0 
328d00-328d038000 r-xp  08:06 394303 
/lib64/libgssapi_krb5.so.2.2
328d038000-328d237000 ---p 00038000 08:06 394303 
/lib64/libgssapi_krb5.so.2.2
328d237000-328d239000 rw-p 00037000 08:06 394303 
/lib64/libgssapi_krb5.so.2.2
328d40-328d453000 r-xp  08:06 685072 
/usr/lib64/libssl.so.1.0.0
328d453000-328d653000 ---p 00053000 08:06 685072 
/usr/lib64/libssl.so.1.0.0
328d653000-328d65b000 rw-p 00053000 08:06 685072 
/usr/lib64/libssl.so.1.0.0
328d80-328d827000 r-xp  08:06 394300 
/lib64/libk5crypto.so.3.1
328d827000-328da27000 ---p 00027000 08:06 394300 
/lib64/libk5crypto.so.3.1
328da27000-328da29000 rw-p 00027000 08:06 394300 
/lib64/libk5crypto.so.3.1
328ec0-328eccf000 r-xp  08:06 394302 
/lib64/libkrb5.so.3.3
328eccf000-328eecf000 ---p 000cf000 08:06 394302 
/lib64/libkrb5.so.3.3
328eecf000-328eeda000 rw-p 000cf000 08:06 394302 
/lib64/libkrb5.so.3.3
328f00-328f003000 r-xp  08:06 394301 
/lib64/libcom_err.so.2.1
328f003000-328f202000 ---p 3000 08:06 394301 
/lib64/libcom_err.so.2.1
328f202000-328f203000 rw-p 2000 08:06 394301 
/lib64/libcom_err.so.2.1
328f40-328f40a000 r-xp  08:06 393785 
/lib64/libkrb5support.so.0.1
328f40a000-328f609000 ---p a000 08:06 393785 
/lib64/libkrb5support.so.0.1
328f609000-328f60a000 rw-p 9000 08:06 393785 
/lib64/libkrb5support.so.0.1
351ea0-351ea1e000 r-xp 000

[Qemu-devel] CREATING CUSTOMER LOYALTY THROUGH EXTRAORDINARY SERVICE

2011-02-13 Thread MarkPlus Malaysia
 
Good day,
 
We would like to inform you
that MarkPlus&Co Consultancy is holding a
one-day
training session on How to Create Customer
Loyalty on February  22, 2011 at the Royale
Chulan Hotel in Kuala Lumpur. 
 
This training will be
helpful as it will discuss tools on how service
can attract and keep your best customers. The
training will feature a very experienced trainer
and practitioner, Ms. Emilia Zainol, who will
share her experience on boosting customer
loyalty  amongst retailers and mall
employees. It was partly due to her efforts,
that her company has won many accolades
on service, has an impressive loyal following of
visitors and retailers, and thus continues to
become a major profit center. 
 
This training is perfect for
Senior
Executives, Head of Departments, Managers,
Supervisors and Team Leaders and those that are
responsible for service to customers.

 
We can provide a special
price for you if you call us at 03 2166 8197
or 019 309 9907  (and ask for Mien
Aziz).
 
 
 
Program
Summary:
 
CREATING CUSTOMER LOYALTY
THROUGH EXTRAORDINARY
SERVICE
 
22 February 2011
9 AM  to 5 PM
Royale Chulan
Hotel 
5 Jalan Conlay, 50450 Kuala
Lumpur
 
Certificate of attendance awarded for those
who complete the course
 
 
INTRODUCTION:
 
In today's
competitive business world, many companies are
implementing marketing campaigns designed to
attract new business. Unfortunately, the cost for
these campaigns can be very high with little
return on investment. What is often lost in the
mix is the fact that it can be much more cost
effective to have a loyal customer base that
returns again and again rather than constantly
seeking the next new customers.
 
Every business
has a need for loyal customers. Not only is it
less expensive to build loyalty in your existing
customers but often they will spend more with you
and spread word of their satisfaction to their
friends. Overall, the return of investment of
keeping customers satisfied and loyal is much
higher than the return on advertising to new
customers. 
This one day training
program will teach you how to create a service
system in your company that will boost customer
satisfaction and create customer loyalty. Learn
how your company can increase revenue and profit
through extraordinary service.
 
LEARNING
OBJECTIVES
At the end of
the course participants shall be able
to:
- Master the
underlying concepts of customer expectations,
experience, satisfaction and loyalty
- Be able to
identify regular, loyal and at risk customers and
respond accordingly
- Be able to
deliver excellent service to customers and achieve
higher satisfaction
- Be able to
proactively go the extra mile to deliver
extraordinary service
- Be able to
manage customer relationships to maximize
repurchase and referrals
- Be able to
winback lost customers through personalized
approaches
 
COURSE SYLLABUS
Session 1:
Understanding Customer Expectations, Customer
Satisfaction and Customer Loyalty
Session 2:
Understanding & Managing Customer Types Based
on Satisfaction Level
Session 3:
Service Excellence and Beyond
Session 4:
Building An Extraordinary Service Culture In Your
Organization
 
WHO
SHOULD ATTEND?
Senior
Executives, Head of Departments, Managers,
Supervisors and Team Leaders and those that are
responsible for service to customers.
 
SPEAKER
Emilia Zainol
has over 10 years experience as a service
practitioner. She has helped retailers and malls
improve their service and boost customer loyalty.
As a trainer, Emilia has helped banks, retailers,
and many service companies improve their service
performance.
 
COURSE FEE AND
REGISTRATION
RM 800 per
person (includes Lunch, Tea breaks, Course Notes
and Certificate of Completion)
 
To register, please contact:
MIMIEN AZIZ
MARKPLUS&CO CONSULTANCY SDN
BHD
Mobile: 019 309 9907
Telephone: 03 2166 8197
Faxsimile: 03 2166 9197