On Fri, Jul 22, 2011 at 12:56:58PM +0200, Jan Kiszka wrote:
> On 2011-07-21 14:45, Gleb Natapov wrote:
> > On Thu, Jul 21, 2011 at 02:51:18PM +0300, Gleb Natapov wrote:
> >>>> Jan can you look at this please?
> >>>
> >>> I can't promise to do debugging myself.
> >>>
> >>> Also, as I never succeeded in getting anything working with CPU hotplug,
> >>> even back in the days when it was supposed to work, I'm a bit clueless
> >>> wrt the right test cases.
> >>>
> >> CPU hotplug for Linux is supposed to be easy (with the allow_hotplug
> >> patch applied). But we currently have two bugs. One is that the ACPI
> >> interrupt is not sent when a cpu is onlined (at least this appears to be
> >> the case). I will look at that one. The other is that after a new cpu is
> >> detected it can't be onlined.
> >>
> >> After fixing the first bug the test should look like this:
> >> 1. start vm with -smp 1,maxcpus=2
> >> 2. wait for it to boot
> >> 3. do "cpu 1 online" in monitor.
> >> 4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"
> >>
> >> Step 4 should succeed. It fails now.
> >>
> > The first one was easy to solve. See patch below. Step 3 should be
> > "cpu_set 1 online".
> >
> > ---
> >
> > Trigger sci interrupt after cpu hotplug/unplug event.
> >
> > Signed-off-by: Gleb Natapov <[email protected]>
> > diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> > index c30a050..40f3fcd 100644
> > --- a/hw/acpi_piix4.c
> > +++ b/hw/acpi_piix4.c
> > @@ -92,7 +92,8 @@ static void pm_update_sci(PIIX4PMState *s)
> > ACPI_BITMASK_POWER_BUTTON_ENABLE |
> > ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
> > ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
> > - (((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0);
> > + (((s->gpe.sts[0] & s->gpe.en[0]) &
> > + (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0);
> >
> > qemu_set_irq(s->irq, sci_level);
> > /* schedule a timer interruption if needed */
> > --
> > Gleb.
>
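To make the effect of the patch above concrete, here is a minimal
standalone sketch of just the GPE part of the SCI level computation. It is
not the QEMU code: gpe_sci_pending() and check_gpe_sci() are hypothetical
helpers, and the two bit values are assumptions chosen for illustration
(check hw/acpi_piix4.c for the real definitions).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values, for illustration only; see hw/acpi_piix4.c. */
#define PIIX4_PCI_HOTPLUG_STATUS 2
#define PIIX4_CPU_HOTPLUG_STATUS 4

/*
 * Hypothetical helper: returns true when an enabled GPE status bit
 * should raise SCI. with_cpu_bit selects the behaviour before (false)
 * or after (true) the patch.
 */
bool gpe_sci_pending(uint8_t sts, uint8_t en, bool with_cpu_bit)
{
    uint8_t mask = PIIX4_PCI_HOTPLUG_STATUS;

    if (with_cpu_bit) {
        mask |= PIIX4_CPU_HOTPLUG_STATUS;
    }
    return ((sts & en) & mask) != 0;
}

/* Small self-check contrasting the two behaviours. */
void check_gpe_sci(void)
{
    uint8_t cpu = PIIX4_CPU_HOTPLUG_STATUS;

    /* Before the patch a pending, enabled CPU hotplug event is ignored. */
    assert(!gpe_sci_pending(cpu, cpu, false));
    /* After the patch the same event raises SCI. */
    assert(gpe_sci_pending(cpu, cpu, true));
    /* A status bit that is not enabled never raises SCI. */
    assert(!gpe_sci_pending(cpu, 0, true));
}
```

With only the PCI bit in the mask, a guest that enabled the CPU hotplug
GPE would never see the interrupt, which matches the first bug described
above.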
> I had a closer look and identified two further issues, one generic, one
> CPU-hotplug-specific:
> - (qdev) devices that are hotplugged do not receive any reset. That
> does not only apply to the APIC in case of CPU hotplugging, it is
> also broken for NICs, storage controllers, etc. when doing PCI
> hot-add as I just checked via gdb.
> - CPU hotplugging was always (or at least for a fairly long time),
> well, fragile as it failed to make CPU thread creation and CPU
> initialization atomic against APIC addition and other initialization
> steps. IOW, we need to create CPUs stopped, finish all init work,
> sync their states completely to the kernel
> (cpu_synchronize_post_init), and then kick them off. Actually I'm
Syncing the state to the kernel should be done by the vcpu thread, so it
cannot be stopped while the sync is done. Maybe I misunderstood what
you mean here.
> considering stopping all CPUs during that short phase to make things
> simpler and future-proof (when we reduce qemu_global_mutex
> dependencies).
>
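The ordering described above (create stopped, finish all init including
the APIC, sync state, then kick off) can be pictured as a small state
machine that rejects out-of-order steps. This is only a single-threaded
model to illustrate the sequencing, not a proposal for the actual code:
every name below is hypothetical, and it does not address which thread
performs the sync.

```c
#include <stdbool.h>

/* Hypothetical bring-up phases; the names are illustrative, not QEMU's. */
enum vcpu_phase {
    VCPU_CREATED_STOPPED,
    VCPU_INIT_DONE,
    VCPU_SYNCED,
    VCPU_RUNNING
};

struct vcpu_model {
    enum vcpu_phase phase;
    bool apic_attached;
};

/* A new vcpu starts out stopped, with no APIC attached yet. */
void vcpu_model_create(struct vcpu_model *v)
{
    v->phase = VCPU_CREATED_STOPPED;
    v->apic_attached = false;
}

void vcpu_model_attach_apic(struct vcpu_model *v)
{
    v->apic_attached = true;
}

/* Each step below returns false if it is called out of order. */
bool vcpu_model_finish_init(struct vcpu_model *v)
{
    if (v->phase != VCPU_CREATED_STOPPED || !v->apic_attached) {
        return false;
    }
    v->phase = VCPU_INIT_DONE;
    return true;
}

bool vcpu_model_sync_state(struct vcpu_model *v)
{
    /* In a real implementation this step would run on the vcpu thread. */
    if (v->phase != VCPU_INIT_DONE) {
        return false;
    }
    v->phase = VCPU_SYNCED;
    return true;
}

bool vcpu_model_kick(struct vcpu_model *v)
{
    if (v->phase != VCPU_SYNCED) {
        return false;
    }
    v->phase = VCPU_RUNNING;
    return true;
}
```

In this model a vcpu can never reach VCPU_RUNNING before its APIC is
attached and its state has been synced, which is the atomicity property
asked for above.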
> Still, something else must be different for hotplugged CPUs as they fail
> to come up properly every 2 or 3 system resets or online transitions of
> the Linux guest. Will try to understand that once time permits.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
--
Gleb.