* Eduardo Habkost (ehabk...@redhat.com) wrote:
> On Wed, Sep 19, 2018 at 08:15:25PM +0100, Dr. David Alan Gilbert wrote:
> > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > On Wed, 19 Sep 2018 13:14:05 +0100
> > > "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:
> > > > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > > > On Wed, 19 Sep 2018 11:58:22 +0100
> > > > > "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:
> > > > > > * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> > > > > > > Hi
> > > > > > >
> > > > > > > On Tue, Sep 18, 2018 at 7:49 PM Dr. David Alan Gilbert
> > > > > > > <dgilb...@redhat.com> wrote:
> > > > > > > > * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
> > > > > > > > > Hi
> > > > > > > > >
> > > > > > > > > On Tue, Sep 11, 2018 at 6:19 PM Laszlo Ersek
> > > > > > > > > <ler...@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > +Alex, due to mention of 21e00fa55f3fd
> > > > > > > > > >
> > > > > > > > > > On 09/10/18 15:03, Marc-André Lureau wrote:
> > > > > > > > > > > Hi
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Sep 10, 2018 at 2:44 PM Dr. David Alan Gilbert
> > > > > > > > > > > <dgilb...@redhat.com> wrote:
> > > > > > > > > > >> (I didn't know about guest_phys_block* and would have
> > > > > > > > > > >> probably just used qemu_ram_foreach_block )
> > > > > > > > > > >
> > > > > > > > > > > guest_phys_block*() seems to fit, as it lists only the
> > > > > > > > > > > blocks actually used, and already skips the device RAM.
> > > > > > > > > > >
> > > > > > > > > > > Laszlo, you wrote the functions
> > > > > > > > > > > (https://git.qemu.org/?p=qemu.git;a=commit;h=c5d7f60f0614250bd925071e25220ce5958f75d0),
> > > > > > > > > > > do you think it's appropriate to list the memory to
> > > > > > > > > > > clear, or should we rather use qemu_ram_foreach_block() ?
> > > > > > > > > >
> > > > > > > > > > Originally, I would have said, "use either, doesn't
> > > > > > > > > > matter". Namely, when I introduced the guest_phys_block*()
> > > > > > > > > > functions, the original purpose was not related to RAM
> > > > > > > > > > *contents*, but to RAM *addresses* (GPAs). This is evident
> > > > > > > > > > if you look at the direct child commit of c5d7f60f0614,
> > > > > > > > > > namely 56c4bfb3f07f, which put GuestPhysBlockList to use.
> > > > > > > > > > And, for your use case (= wiping RAM), GPAs don't matter,
> > > > > > > > > > only contents matter.
> > > > > > > > > >
> > > > > > > > > > However, with the commits I mentioned previously, namely
> > > > > > > > > > e4dc3f5909ab9 and 21e00fa55f3fd, we now filter out some
> > > > > > > > > > RAM blocks from the dumping based on contents / backing
> > > > > > > > > > as well. I think? So I believe we should honor that for
> > > > > > > > > > the wiping too. I guess I'd (vaguely) suggest using
> > > > > > > > > > guest_phys_block*().
> > > > > > > > > >
> > > > > > > > > > (And then, as Dave suggests, maybe extend the filter to
> > > > > > > > > > consider pmem too, separately.)
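(For reference, wiping through the guest_phys_block*() helpers could
look something like the sketch below. This is an illustration against
the 2018-era API in include/sysemu/memory_mapping.h, not code from this
thread: wipe_guest_memory() and the bare memset() are made up, and any
pmem/ram_device filtering would go inside the loop.)

#include "qemu/osdep.h"
#include "sysemu/memory_mapping.h"

static void wipe_guest_memory(void)
{
    GuestPhysBlockList list;
    GuestPhysBlock *block;

    guest_phys_blocks_init(&list);
    guest_phys_blocks_append(&list);   /* collect guest-visible RAM ranges */

    QTAILQ_FOREACH(block, &list.head, next) {
        /* Each block covers [target_start, target_end) in guest physical
         * address space and is mapped at host_addr inside QEMU. */
        memset(block->host_addr, 0,
               block->target_end - block->target_start);
    }

    guest_phys_blocks_free(&list);
}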
> > > > > > > > > I looked a bit into skipping pmem memory. The issue is
> > > > > > > > > that RAMBlock and MemoryRegion have no idea they are
> > > > > > > > > actually used for nvram (you could rely on the hostmem.pmem
> > > > > > > > > flag, but that is optional), and I don't see a clear way to
> > > > > > > > > figure this out.
> > > > > > > >
> > > > > > > > I think the pmem flag is what we should use; the problem
> > > > > > > > though is we
> > > > > > >
> > > > > > > That would be much simpler. But what if you set up an nvdimm
> > > > > > > backend with non-pmem memory? Will it always be cleared? What
> > > > > > > about platforms that do not support libpmem?
> > > > > >
> > > > > > Right, that's basically the problem I describe below, the
> > > > > > difference between (a) and (b).
> > > > > >
> > > > > > > > have two different pieces of semantics:
> > > > > > > >   a) PMEM - needs special flush instructions/calls
> > > > > > > >   b) PMEM - my data is persistent, please don't clear me
> > > > > > > >
> > > > > > > > Do those always go together?
> > > > > > > >
> > > > > > > > (Copying in Junyan He who added the RAM_PMEM flag)
> > > > > > > >
> > > > > > > > > I can imagine retrieving the MemoryRegion from the guest
> > > > > > > > > phys address, then checking whether the owner is
> > > > > > > > > TYPE_NVDIMM, for example. Is this a good solution?
> > > > > > > >
> > > > > > > > No, I think it's up to whatever creates the region to set a
> > > > > > > > flag somewhere properly - there's no telling whether it'll
> > > > > > > > always be NVDIMM or some other object.
> > > > > > >
> > > > > > > We could make the owner object set a flag on the MemoryRegion,
> > > > > > > or implement a common NV interface.
> > > > > >
> > > > > > I think preferably on the RAMBlock, but yes; the question then is
> > > > > > whether we need to split the PMEM flag into two for (a) and (b)
> > > > > > above, and whether they need to be separately set.
> > > > >
> > > > > Well, I read the current pmem flag semantics as "the backend
> > > > > supports persistent memory"; whether the frontend will use it or
> > > > > not is a different story.
> > > > >
> > > > > Perhaps we need a separate flag which says that pmem is in use. But
> > > > > what about the use case where an nvdimm is used as RAM and not as
> > > > > storage? We would probably like to wipe it out like normal RAM.
> > > >
> > > > Right, so we're slowly building up the number of flags/policies we're
> > > > trying to represent here; it's certainly not the one 'pmem' flag we
> > > > currently have.
> > > >
> > > > > Maybe we should add a frontend property to allow selecting the
> > > > > policy per device, which could then be propagated to the
> > > > > MemoryRegion/RAMBlock?
> > > >
> > > > By 'device' here you mean memory-backend-* objects? If so then yes,
> > > > I'd agree a property on those, propagated to the underlying RAMBlock,
> > > > would be ideal.
> > >
> > > Nope, 'device' is the frontend, i.e. pc-dimm, nvdimm, whatever comes
> > > next ... it can mark a backend/MemoryRegion as 'clear on request' when
> > > it's realized() (which has nothing to do with pmem or backends).
> >
> > Why do you want to do it on the dimm? I thought the behaviour was more
> > associated with the backing device. I worry about that because there
> > are lots of places you can wire a backing device up - for example I
> > tend to do -object memory-backend-*.... -numa node rather than using a
> > dimm device - so you'd have to go and add the option to all of those
> > DIMMs etc.
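(To make the (a)/(b) split in the quoted discussion concrete, here is a
minimal sketch. Both flag names and bit values are hypothetical - today
QEMU has only the single RAM_PMEM flag - and only RAMBlock::flags itself
is existing code:)

#include "qemu/osdep.h"
#include "exec/ram_addr.h"   /* struct RAMBlock (internal header) */

/* Hypothetical split of today's single RAM_PMEM flag:
 *   (a) the block needs pmem_persist()-style flushing on writes
 *   (b) the block's contents must survive a "wipe RAM" request
 * Placeholder bit values, chosen away from the existing RAM_* bits. */
#define RAM_PMEM_FLUSH    (1 << 8)
#define RAM_KEEP_ON_WIPE  (1 << 9)

static inline bool ramblock_needs_pmem_flush(RAMBlock *rb)
{
    return rb->flags & RAM_PMEM_FLUSH;
}

static inline bool ramblock_should_wipe(RAMBlock *rb)
{
    return !(rb->flags & RAM_KEEP_ON_WIPE);
}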
> I'm a bit confused by the details here, but it's important that we get
> the frontend/backend distinction right.
>
> I'm still reading this thread and trying to understand what exactly
> is/isn't guest-visible here. Anything that is guest-visible is supposed
> to be a frontend option (especially if it's a setting that isn't
> allowed to change during migration). Backend options are never supposed
> to affect guest ABI.
That's a fair argument.

> Which is the case here?

From what I can see, the current RAM_PMEM/is_pmem flags are purely used
by QEMU to decide whether it should make pmem_* library calls to persist
things - so that's a host-visible feature only. docs/nvdimm.txt seems to
cover a lot of the detail here.

We've got quite a lot of places in the code that do:

  if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {

and that's what seems to drive the ACPI building - see hw/acpi/nvdimm.c -
so I guess that's most of the guest-visible stuff. And for DIMMs we have
a MEMORY_DEVICE_INFO_KIND_NVDIMM flag.

Then we also have a flag for the CPU/machine type stating the
persistence mode (either 'cpu' or 'mem-ctrl', I think), which sets some
ACPI state to the magic values 2 or 3 - see
pc_machine_set_nvdimm_persistence. But that's guest-visible state and
global, not about specific RAM chunks.

So I think you're right that RAM_PMEM is the wrong thing to use here -
it shouldn't change the guest-visible behaviour of whether the RAM is
cleared; at the moment the only marker seems to be TYPE_NVDIMM, but
that's not passed down to the memory regions.

So should it just be that the nvdimm frontend sets a flag saying that a
memory region/RAMBlock is exposed to the guest as nvdimm? And that
should make the difference here? (A rough sketch of what I mean follows
at the end of this mail.)

(cc'ing in Pankaj because of virtio-pmem).

Dave

> > > I wonder if the NVDIMM spec has a means to express it, or the TCG
> > > spec (they should have some thoughts on how to deal with the coming
> > > problem, as it affects bare-metal as well).

> > > > > > > > > There is memory_region_from_host(), is there a
> > > > > > > > > memory_region_from_guest() ?
> > > > > > > > >
> > > > > > > > > thanks

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
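P.S. For illustration only: a frontend-side marking could look roughly
like the sketch below. memory_region_set_guest_persistent() is a
hypothetical helper - no such API exists in QEMU - and the exact
signature of host_memory_backend_get_memory() has varied between QEMU
versions.

#include "qemu/osdep.h"
#include "hw/mem/nvdimm.h"
#include "sysemu/hostmem.h"

static void nvdimm_realize(PCDIMMDevice *dimm, Error **errp)
{
    MemoryRegion *mr = host_memory_backend_get_memory(dimm->hostmem);

    /* Hypothetical helper: would set a "guest sees this as persistent,
     * don't clear it on a wipe request" flag on the underlying RAMBlock,
     * making the behaviour a frontend decision rather than a backend
     * (pmem=) one. */
    memory_region_set_guest_persistent(mr, true);
}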