* Eduardo Habkost (ehabk...@redhat.com) wrote:
> On Wed, Sep 19, 2018 at 08:15:25PM +0100, Dr. David Alan Gilbert wrote:
> > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > On Wed, 19 Sep 2018 13:14:05 +0100
> > > "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:
> > > > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > > > On Wed, 19 Sep 2018 11:58:22 +0100
> > > > > "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:
> > > > > > * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> > > > > > > Hi
> > > > > > >
> > > > > > > On Tue, Sep 18, 2018 at 7:49 PM Dr. David Alan Gilbert
> > > > > > > <dgilb...@redhat.com> wrote:
> > > > > > > > * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> > > > > > > > > Hi
> > > > > > > > >
> > > > > > > > > On Tue, Sep 11, 2018 at 6:19 PM Laszlo Ersek
> > > > > > > > > <ler...@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > +Alex, due to mention of 21e00fa55f3fd
> > > > > > > > > >
> > > > > > > > > > On 09/10/18 15:03, Marc-André Lureau wrote:
> > > > > > > > > > > Hi
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Sep 10, 2018 at 2:44 PM Dr. David Alan Gilbert
> > > > > > > > > > > <dgilb...@redhat.com> wrote:
> > > > > > > > > > >> (I didn't know about guest_phys_block* and would have
> > > > > > > > > > >> probably just used qemu_ram_foreach_block )
> > > > > > > > > > >
> > > > > > > > > > > guest_phys_block*() seems to fit, as it lists only the
> > > > > > > > > > > blocks actually used, and already skips the device RAM.
> > > > > > > > > > >
> > > > > > > > > > > Laszlo, you wrote the functions
> > > > > > > > > > > (https://git.qemu.org/?p=qemu.git;a=commit;h=c5d7f60f0614250bd925071e25220ce5958f75d0),
> > > > > > > > > > > do you think it's appropriate to list the memory to
> > > > > > > > > > > clear, or should we rather use qemu_ram_foreach_block() ?
> > > > > > > > > >
> > > > > > > > > > Originally, I would have said, "use either, doesn't
> > > > > > > > > > matter". Namely, when I introduced the guest_phys_block*()
> > > > > > > > > > functions, the original purpose was not related to RAM
> > > > > > > > > > *contents*, but to RAM *addresses* (GPAs). This is evident
> > > > > > > > > > if you look at the direct child commit of c5d7f60f0614,
> > > > > > > > > > namely 56c4bfb3f07f, which put GuestPhysBlockList to use.
> > > > > > > > > > And, for your use case (= wiping RAM), GPAs don't matter,
> > > > > > > > > > only contents matter.
> > > > > > > > > >
> > > > > > > > > > However, with the commits I mentioned previously, namely
> > > > > > > > > > e4dc3f5909ab9 and 21e00fa55f3fd, we now filter out some
> > > > > > > > > > RAM blocks from the dumping based on contents / backing
> > > > > > > > > > as well. I think? So I believe we should honor that for
> > > > > > > > > > the wiping too. I guess I'd (vaguely) suggest using
> > > > > > > > > > guest_phys_block*().
> > > > > > > > > >
> > > > > > > > > > (And then, as Dave suggests, maybe extend the filter to
> > > > > > > > > > consider pmem too, separately.)
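(For reference, wiping through the guest_phys_block*() helpers could
look something like the sketch below. This is an illustration against
the 2018-era API in include/sysemu/memory_mapping.h, not code from this
thread: wipe_guest_memory() and the bare memset() are made up, and any
pmem/ram_device filtering would go inside the loop.)

#include "qemu/osdep.h"
#include "sysemu/memory_mapping.h"

static void wipe_guest_memory(void)
{
    GuestPhysBlockList list;
    GuestPhysBlock *block;

    guest_phys_blocks_init(&list);
    guest_phys_blocks_append(&list);   /* collect guest-visible RAM ranges */

    QTAILQ_FOREACH(block, &list.head, next) {
        /* Each block covers [target_start, target_end) in guest physical
         * address space and is mapped at host_addr inside QEMU. */
        memset(block->host_addr, 0,
               block->target_end - block->target_start);
    }

    guest_phys_blocks_free(&list);
}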
> > > > > > > > > I looked a bit into skipping pmem memory. The issue is
> > > > > > > > > that RAMBlock and MemoryRegion have no idea they are
> > > > > > > > > actually used for nvram (you could rely on the hostmem.pmem
> > > > > > > > > flag, but that is optional), and I don't see a clear way to
> > > > > > > > > figure this out.
> > > > > > > >
> > > > > > > > I think the pmem flag is what we should use; the problem
> > > > > > > > though is we
> > > > > > >
> > > > > > > That would be much simpler. But what if you set up an nvdimm
> > > > > > > backend with non-pmem memory? Will it always be cleared? What
> > > > > > > about platforms that do not support libpmem?
> > > > > >
> > > > > > Right, that's basically the problem I describe below, the
> > > > > > difference between (a) and (b).
> > > > > >
> > > > > > > > have two different pieces of semantics:
> > > > > > > >   a) PMEM - needs special flush instructions/calls
> > > > > > > >   b) PMEM - my data is persistent, please don't clear me
> > > > > > > >
> > > > > > > > Do those always go together?
> > > > > > > >
> > > > > > > > (Copying in Junyan He who added the RAM_PMEM flag)
> > > > > > > >
> > > > > > > > > I can imagine retrieving the MemoryRegion from the guest
> > > > > > > > > phys address, then checking whether the owner is
> > > > > > > > > TYPE_NVDIMM, for example. Is this a good solution?
> > > > > > > >
> > > > > > > > No, I think it's up to whatever creates the region to set a
> > > > > > > > flag somewhere properly - there's no telling whether it'll
> > > > > > > > always be NVDIMM or some other object.
> > > > > > >
> > > > > > > We could make the owner object set a flag on the MemoryRegion,
> > > > > > > or implement a common NV interface.
> > > > > >
> > > > > > I think preferably on the RAMBlock, but yes; the question then is
> > > > > > whether we need to split the PMEM flag into two for (a) and (b)
> > > > > > above, and whether they need to be separately set.
> > > > >
> > > > > Well, I read the current pmem flag semantics as "the backend
> > > > > supports persistent memory"; whether the frontend will use it or
> > > > > not is a different story.
> > > > >
> > > > > Perhaps we need a separate flag which says that pmem is in use. But
> > > > > what about the use case where an nvdimm is used as RAM and not as
> > > > > storage? We would probably like to wipe it out like normal RAM.
> > > >
> > > > Right, so we're slowly building up the number of flags/policies we're
> > > > trying to represent here; it's certainly not the one 'pmem' flag we
> > > > currently have.
> > > >
> > > > > Maybe we should add a frontend property to allow selecting the
> > > > > policy per device, which could then be propagated to the
> > > > > MemoryRegion/RAMBlock?
> > > >
> > > > By 'device' here you mean memory-backend-* objects? If so then yes,
> > > > I'd agree a property on those, propagated to the underlying RAMBlock,
> > > > would be ideal.
> > >
> > > Nope, 'device' is the frontend, i.e. pc-dimm, nvdimm, whatever comes
> > > next ... it can mark a backend/MemoryRegion as 'clear on request' when
> > > it's realized() (which has nothing to do with pmem or backends).
> >
> > Why do you want to do it on the dimm? I thought the behaviour was more
> > associated with the backing device. I worry about that because there
> > are lots of places you can wire a backing device up - for example I
> > tend to do -object memory-backend-*.... -numa node rather than using a
> > dimm device - so you'd have to go and add the option to all of those
> > DIMMs etc.
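(To make the (a)/(b) split in the quoted discussion concrete, here is a
minimal sketch. Both flag names and bit values are hypothetical - today
QEMU has only the single RAM_PMEM flag - and only RAMBlock::flags itself
is existing code:)

#include "qemu/osdep.h"
#include "exec/ram_addr.h"   /* struct RAMBlock (internal header) */

/* Hypothetical split of today's single RAM_PMEM flag:
 *   (a) the block needs pmem_persist()-style flushing on writes
 *   (b) the block's contents must survive a "wipe RAM" request
 * Placeholder bit values, chosen away from the existing RAM_* bits. */
#define RAM_PMEM_FLUSH    (1 << 8)
#define RAM_KEEP_ON_WIPE  (1 << 9)

static inline bool ramblock_needs_pmem_flush(RAMBlock *rb)
{
    return rb->flags & RAM_PMEM_FLUSH;
}

static inline bool ramblock_should_wipe(RAMBlock *rb)
{
    return !(rb->flags & RAM_KEEP_ON_WIPE);
}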
> I'm a bit confused by the details here, but it's important that we get
> the frontend/backend distinction right.
>
> I'm still reading this thread and trying to understand what exactly
> is/isn't guest-visible here. Anything that is guest-visible is supposed
> to be a frontend option (especially if it's a setting that isn't
> allowed to change during migration). Backend options are never supposed
> to affect guest ABI.
That's a fair argument.

> Which is the case here?

From what I can see, the current RAM_PMEM/is_pmem flags are purely used
by QEMU to decide whether it should make pmem_* library calls to persist
things - so that's a host-visible feature only. docs/nvdimm.txt seems to
cover a lot of the detail here.

We've got quite a lot of places in the code that do:

  if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {

and that's what seems to drive the ACPI building - see hw/acpi/nvdimm.c -
so I guess that's most of the guest-visible stuff. And for DIMMs we have
a MEMORY_DEVICE_INFO_KIND_NVDIMM flag.

Then we also have a flag for the CPU/machine type stating the
persistence mode (either 'cpu' or 'mem-ctrl', I think), which sets some
ACPI state to the magic values 2 or 3 - see
pc_machine_set_nvdimm_persistence. But that's guest-visible state and
global, not about specific RAM chunks.

So I think you're right that RAM_PMEM is the wrong thing to use here -
it shouldn't change the guest-visible behaviour of whether the RAM is
cleared; at the moment the only marker seems to be TYPE_NVDIMM, but
that's not passed down to the memory regions.

So should it just be that the nvdimm frontend sets a flag saying that a
memory region/RAMBlock is exposed to the guest as nvdimm? And that
should make the difference here? (A rough sketch of what I mean follows
at the end of this mail.)

(cc'ing in Pankaj because of virtio-pmem).

Dave

> > > I wonder if the NVDIMM spec has a means to express it, or the TCG
> > > spec (they should have some thoughts on how to deal with the coming
> > > problem, as it affects bare-metal as well).

> > > > > > > > > There is memory_region_from_host(), is there a
> > > > > > > > > memory_region_from_guest() ?
> > > > > > > > >
> > > > > > > > > thanks

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
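P.S. For illustration only: a frontend-side marking could look roughly
like the sketch below. memory_region_set_guest_persistent() is a
hypothetical helper - no such API exists in QEMU - and the exact
signature of host_memory_backend_get_memory() has varied between QEMU
versions.

#include "qemu/osdep.h"
#include "hw/mem/nvdimm.h"
#include "sysemu/hostmem.h"

static void nvdimm_realize(PCDIMMDevice *dimm, Error **errp)
{
    MemoryRegion *mr = host_memory_backend_get_memory(dimm->hostmem);

    /* Hypothetical helper: would set a "guest sees this as persistent,
     * don't clear it on a wipe request" flag on the underlying RAMBlock,
     * making the behaviour a frontend decision rather than a backend
     * (pmem=) one. */
    memory_region_set_guest_persistent(mr, true);
}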