On Thu, Nov 21, 2013 at 03:21:37PM +0100, Igor Mammedov wrote: > On Thu, 21 Nov 2013 11:42:02 +0200 > "Michael S. Tsirkin" <m...@redhat.com> wrote: > > > On Thu, Nov 21, 2013 at 03:38:34AM +0100, Igor Mammedov wrote: > > > - implements QEMU hardware part of memory hotplug protocol > > > described at "docs/specs/acpi_mem_hotplug.txt" > > > - handles only memory add notification event for now > > > > > > Signed-off-by: Igor Mammedov <imamm...@redhat.com> > > > --- > > > docs/specs/acpi_mem_hotplug.txt | 38 ++++++++++ > > > hw/acpi/core.c | 144 > > > +++++++++++++++++++++++++++++++++++++++ > > > include/hw/acpi/acpi.h | 24 +++++++ > > > 3 files changed, 206 insertions(+), 0 deletions(-) > > > create mode 100644 docs/specs/acpi_mem_hotplug.txt > > > > > > diff --git a/docs/specs/acpi_mem_hotplug.txt > > > b/docs/specs/acpi_mem_hotplug.txt > > > new file mode 100644 > > > index 0000000..fc34142 > > > --- /dev/null > > > +++ b/docs/specs/acpi_mem_hotplug.txt > > > @@ -0,0 +1,38 @@ > > > +QEMU<->ACPI BIOS memory hotplug interface > > > +-------------------------------------- > > > + > > > +ACPI BIOS GPE.3 handler is dedicated for notifying OS about memory > > > hot-add > > > +or hot-remove events. > > > + > > > +Memory hot-plug interface (IO port 0xa00-0xa17, 1-4 byte access): > > > +--------------------------------------------------------------- > > > +0xa00: > > > + read access: > > > + [0x0-0x3] Lo part of memory device phys address > > > + [0x4-0x7] Hi part of memory device phys address > > > + [0x8-0xb] Lo part of memory device size in bytes > > > + [0xc-0xf] Hi part of memory device size in bytes > > > + [0x14] highest memory hot-plug interface version supported by QEMU > > > > So this can make guest fail gracefully but it appears that > > detecting guest version would be nicer? > > It would let us actually support old guests ... > > my idea of how to it was, > guest writes its version into [0x14] register and reads QEMU version > from it back, if they do not match then than BIOS ignores GPE.3 event > effectively disabling hotplug on guest side. > I haven't thought about supporting multiple implementations in QEMU though. > Do we really want it?
I'm talking about old bios which does not read acpi from qemu. We want it to work even if it can't see hotplugged memory. > > > > > + [0x15] Memory device status fields > > > + bits: > > > + 1: device is enabled and may be used by guest > > > + 2: device insert event, used by ACPI BIOS to distinguish > > > + device for which no device check event to OSPM was > > > issued > > > > what does the above mean? > After OSPM issued device check on selected device it clears this bit to mark > event > as handled. > It allows to avoid keeping this state in ASL (as it's done for CPU hotplug, > see CPON) That's fine. > > what if device is not present? > ASL will issue device check and clear bit, it might be a bug since _STA would > report > not present but no eject event was issued. > > Papering over it ASL could check present bit first and issue device check > only if > it's present. Is this a problem? If yes - that will still be racy won't it? Also, should guest reset eject memory that we requested unplug for? > > > > > + [0x16-0x17] reserved > > > + > > > + write access: > > > + [0x0-0x3] Memory device slot selector, selects active memory > > > device. > > > + All following accesses to other registers in > > > 0xaf80-0xaf97 > > > + region will read/store data from/to selected memory > > > device. > > > + [0x4-0x7] OST event code reported by OSPM > > > + [0x8-0xb] OST status code reported by OSPM > > > + [0x15] Memory device status fields > > > > this is control, not status? > Thanks, I'll fix it. > > > > > > + bits: > > > + 2: if set to 1 clears device insert event, set by ACPI BIOS > > > + after it has sent device check event to OSPM for > > > + seleted memory device > > > > selected? > see "write access: [0x0-0x3]" yes but you have a typo above > > > > How about we actually require guest to enable memory? > > > > This way if we hotplug, but old guest does not enable > > and puts a PCI device there, it just works. > I've lost you here, could you elaborate pls? Assume qemu adds memory by hotplug. Is it immediately enabled? I suggest it's not enabled, and only enable after ACPI enables it (or after reboot?) > > > > > + > > > +Selecting memory device slot beyond present range has no effect on > > > platform: > > > + - not documented above write accesses to memory hot-plug registers > > > + are ignored; > > > + - not documented above read accesses to memory hot-plug registers > > > return 0xFF > > > > 00 would be cleaner (matches not enabled - no event)? > I'm following pattern, where reading from not present IO port returns 0xFF on > hardware. > Fact that ASL reads 0xFF could be used as not supported indication. But isn't this a valid pattern when all bits are set? > > > > > diff --git a/hw/acpi/core.c b/hw/acpi/core.c > > > index 8c0d48c..18e169c 100644 > > > --- a/hw/acpi/core.c > > > +++ b/hw/acpi/core.c > > > @@ -680,3 +680,147 @@ void acpi_update_sci(ACPIREGS *regs, qemu_irq irq, > > > uint32_t gpe0_sts_mask) > > > (regs->pm1.evt.en & ACPI_BITMASK_TIMER_ENABLE) && > > > !(pm1a_sts & ACPI_BITMASK_TIMER_STATUS)); > > > } > > > + > > > +static uint64_t acpi_memory_hotplug_read(void *opaque, hwaddr addr, > > > + unsigned int size) > > > +{ > > > + uint32_t val = 0; > > > + MemHotplugState *mem_st = opaque; > > > + MemStatus *mdev; > > > + > > > + if (mem_st->selector >= mem_st->dev_count) { > > > + return 0; > > > + } > > > + > > > + mdev = &mem_st->devs[mem_st->selector]; > > > + switch (addr) { > > > + case 0x0: /* Lo part of phys address where DIMM is mapped */ > > > + val = object_property_get_int(OBJECT(mdev->dimm), "start", NULL); > > > + break; > > > + case 0x4: /* Hi part of phys address where DIMM is mapped */ > > > + val = object_property_get_int(OBJECT(mdev->dimm), "start", NULL) > > > >> 32; > > > + break; > > > + case 0x8: /* Lo part of DIMM size */ > > > + val = object_property_get_int(OBJECT(mdev->dimm), "size", NULL); > > > + break; > > > + case 0xc: /* Hi part of DIMM size */ > > > + val = object_property_get_int(OBJECT(mdev->dimm), "size", NULL) > > > >> 32; > > > + break; > > > + case 0x10: /* node proximity for _PXM method */ > > > + val = object_property_get_int(OBJECT(mdev->dimm), "node", NULL); > > > + break; > > > + case 0x14: /* intf version */ > > > + val = 1; > > > + break; > > > + case 0x15: /* pack and return is_* fields */ > > > + val |= mdev->is_enabled ? 1 : 0; > > > + val |= mdev->is_inserting ? 2 : 0; > > > + break; > > > + } > > > + return val; > > > +} > > > + > > > +static void acpi_memory_hotplug_write(void *opaque, hwaddr addr, > > > uint64_t data, > > > + unsigned int size) > > > +{ > > > + MemHotplugState *mem_st = opaque; > > > + MemStatus *mdev; > > > + > > > + if (!mem_st->dev_count) { > > > + return; > > > + } > > > + > > > + if (addr) { > > > + if (mem_st->selector >= mem_st->dev_count) { > > > + return; > > > + } > > > + } > > > + > > > + switch (addr) { > > > + case 0x0: /* DIMM slot selector */ > > > + mem_st->selector = data; > > > + break; > > > + case 0x4: /* _OST event */ > > > + mdev = &mem_st->devs[mem_st->selector]; > > > + if (data == 1) { > > > + /* TODO: handle device insert OST event */ > > > + } else if (data == 3) { > > > + /* TODO: handle device remove OST event */ > > > + } > > > + mdev->ost_event = data; > > > + break; > > > + case 0x8: /* _OST status */ > > > + mdev = &mem_st->devs[mem_st->selector]; > > > + mdev->ost_status = data; > > > + /* TODO: report async error */ > > > + /* TODO: implement memory removal on guest signal */ > > > + break; > > > + case 0x15: > > > + mdev = &mem_st->devs[mem_st->selector]; > > > + if (data & 2) { /* clear insert event */ > > > + mdev->is_inserting = false; > > > + } > > > + break; > > > + } > > > + > > > +} > > > +static const MemoryRegionOps acpi_memory_hotplug_ops = { > > > + .read = acpi_memory_hotplug_read, > > > + .write = acpi_memory_hotplug_write, > > > + .endianness = DEVICE_LITTLE_ENDIAN, > > > + .valid = { > > > + .min_access_size = 1, > > > + .max_access_size = 4, > > > + }, > > > +}; > > > + > > > +void acpi_memory_hotplug_init(Object *owner, MemoryRegion *io, > > > + MemHotplugState *state) > > > +{ > > > + QemuOpts *opts = qemu_opts_find(qemu_find_opts("memory-opts"), NULL); > > > + g_assert(opts); > > > + > > > + state->dev_count = qemu_opt_get_number(opts, "slots", 0); > > > + > > > + if (!state->dev_count) { > > > + return; > > > + } > > > + > > > + state->devs = g_malloc0(sizeof(*state->devs) * state->dev_count); > > > + memory_region_init_io(io, owner, &acpi_memory_hotplug_ops, state, > > > + "apci-mem-hotplug", > > > ACPI_MEMORY_HOTPLUG_IO_LEN); > > > +} > > > + > > > +int acpi_memory_hotplug_cb(ACPIREGS *regs, MemHotplugState *mem_st, > > > + DeviceState *dev, HotplugState state) > > > +{ > > > + MemStatus *mdev; > > > + Error *local_err = NULL; > > > + int slot = object_property_get_int(OBJECT(dev), "slot", &local_err); > > > + > > > + if (error_is_set(&local_err)) { > > > + qerror_report_err(local_err); > > > + error_free(local_err); > > > + return -1; > > > + } > > > + > > > + if (slot >= mem_st->dev_count) { > > > + char *dev_path = object_get_canonical_path(OBJECT(dev)); > > > + qerror_report(ERROR_CLASS_GENERIC_ERROR, > > > "acpi_memory_hotplug_cb: " > > > + "device [%s] returned invalid memory slot[%d]", > > > + dev_path, slot); > > > + g_free(dev_path); > > > + return -1; > > > + } > > > + > > > + mdev = &mem_st->devs[slot]; > > > + if (state == HOTPLUG_ENABLED) { > > > + mdev->dimm = dev; > > > + mdev->is_enabled = true; > > > + mdev->is_inserting = true; > > > + } > > > + > > > + /* do ACPI magic */ > > > + regs->gpe.sts[0] |= ACPI_MEMORY_HOTPLUG_STATUS; > > > + return 0; > > > +} > > > diff --git a/include/hw/acpi/acpi.h b/include/hw/acpi/acpi.h > > > index c4ae7d7..e5717df 100644 > > > --- a/include/hw/acpi/acpi.h > > > +++ b/include/hw/acpi/acpi.h > > > @@ -168,6 +168,30 @@ uint32_t acpi_gpe_ioport_readb(ACPIREGS *ar, > > > uint32_t addr); > > > > > > void acpi_update_sci(ACPIREGS *acpi_regs, qemu_irq irq, uint32_t > > > gpe0_sts_mask); > > > > > > +#define ACPI_MEMORY_HOTPLUG_IO_LEN 24 > > > +#define ACPI_MEMORY_HOTPLUG_BASE 0x0a00 > > > + > > > +#define ACPI_MEMORY_HOTPLUG_STATUS 8 > > > + > > > +typedef struct MemStatus { > > > + DeviceState *dimm; > > > + bool is_enabled; > > > + bool is_inserting; > > > + uint32_t ost_event; > > > + uint32_t ost_status; > > > +} MemStatus; > > > + > > > +typedef struct MemHotplugState { > > > + uint32_t selector; > > > + uint32_t dev_count; > > > + MemStatus *devs; > > > +} MemHotplugState; > > > + > > > +void acpi_memory_hotplug_init(Object *owner, MemoryRegion *io, > > > + MemHotplugState *state); > > > + > > > +int acpi_memory_hotplug_cb(ACPIREGS *regs, MemHotplugState *mem_st, > > > + DeviceState *dev, HotplugState state); > > > /* acpi.c */ > > > extern int acpi_enabled; > > > extern char unsigned *acpi_tables; > > > -- > > > 1.7.1