On 09/20/18 15:20, Igor Mammedov wrote:
> On Thu, 20 Sep 2018 10:20:49 +0100 "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:
>
>> * Eduardo Habkost (ehabk...@redhat.com) wrote:
>>> On Wed, Sep 19, 2018 at 08:15:25PM +0100, Dr. David Alan Gilbert wrote:
>>>> * Igor Mammedov (imamm...@redhat.com) wrote:
>>>>> On Wed, 19 Sep 2018 13:14:05 +0100 "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:
>>>>>
>>>>>> * Igor Mammedov (imamm...@redhat.com) wrote:
>>>>>>> On Wed, 19 Sep 2018 11:58:22 +0100 "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:
>>>>>>>
>>>>>>>> * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> On Tue, Sep 18, 2018 at 7:49 PM Dr. David Alan Gilbert <dgilb...@redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>> * Marc-André Lureau (marcandre.lur...@gmail.com) wrote:
>>>>>>>>>>> Hi
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Sep 11, 2018 at 6:19 PM Laszlo Ersek <ler...@redhat.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> +Alex, due to mention of 21e00fa55f3fd
>>>>>>>>>>>>
>>>>>>>>>>>> On 09/10/18 15:03, Marc-André Lureau wrote:
>>>>>>>>>>>>> Hi
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Sep 10, 2018 at 2:44 PM Dr. David Alan Gilbert <dgilb...@redhat.com> wrote:
>>>>>>>>>>>>>> (I didn't know about guest_phys_block* and would have probably just used qemu_ram_foreach_block )
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> guest_phys_block*() seems to fit, as it lists only the blocks actually used, and already skips the device RAM.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Laszlo, you wrote the functions (https://git.qemu.org/?p=qemu.git;a=commit;h=c5d7f60f0614250bd925071e25220ce5958f75d0), do you think it's appropriate to list the memory to clear, or should we rather use qemu_ram_foreach_block() ?
>>>>>>>>>>>>
>>>>>>>>>>>> Originally, I would have said, "use either, doesn't matter". Namely, when I introduced the guest_phys_block*() functions, the original purpose was not related to RAM *contents*, but to RAM *addresses* (GPAs). This is evident if you look at the direct child commit of c5d7f60f0614, namely 56c4bfb3f07f, which put GuestPhysBlockList to use. And, for your use case (= wiping RAM), GPAs don't matter, only contents matter.
>>>>>>>>>>>>
>>>>>>>>>>>> However, with the commits I mentioned previously, namely e4dc3f5909ab9 and 21e00fa55f3fd, we now filter out some RAM blocks from the dumping based on contents / backing as well. I think? So I believe we should honor that for the wiping too. I guess I'd (vaguely) suggest using guest_phys_block*().
>>>>>>>>>>>>
>>>>>>>>>>>> (And then, as Dave suggests, maybe extend the filter to consider pmem too, separately.)
>>>>>>>>>>>
>>>>>>>>>>> I looked a bit into skipping pmem memory. The issue is that RAMBlock and MemoryRegion have no idea they are actually used for nvram (you could rely on the hostmem.pmem flag, but this is optional), and I don't see a clear way to figure this out.
>>>>>>>>>>
>>>>>>>>>> I think the pmem flag is what we should use; the problem though is we
>>>>>>>>>
>>>>>>>>> That would be much simpler. But what if you set up an nvdimm backend with non-pmem memory? Will it always be cleared? What about platforms that do not support libpmem?
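
(For illustration, the "list the guest blocks and wipe them" idea discussed above could look roughly like the sketch below. It assumes the guest_phys_blocks_init/append/free() helpers and the GuestPhysBlock fields as declared in include/sysemu/memory_mapping.h, and it deliberately ignores the pmem/nvdimm filtering question the rest of the thread is about; treat it as a sketch, not a tested patch.)

    #include "qemu/osdep.h"
    #include "sysemu/memory_mapping.h"

    /* Sketch: zero every RAM range that is actually mapped into the guest,
     * reusing the block list that the dump code builds.  No pmem/nvdimm
     * filtering yet - that is exactly the open question in this thread. */
    static void wipe_guest_ram(void)
    {
        GuestPhysBlockList list;
        GuestPhysBlock *block;

        guest_phys_blocks_init(&list);
        guest_phys_blocks_append(&list);   /* collects only guest-mapped RAM */

        QTAILQ_FOREACH(block, &list.head, next) {
            /* host_addr is the host mapping of [target_start, target_end) */
            memset(block->host_addr, 0,
                   block->target_end - block->target_start);
        }

        guest_phys_blocks_free(&list);
    }

(qemu_ram_foreach_block() would visit more blocks; the point of guest_phys_block*() is that it already applies the dump-time filtering.)
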
>>>>>>>>
>>>>>>>> Right, that's basically the problem I describe below, the difference between (a) and (b).
>>>>>>>>
>>>>>>>>>> have two different pieces of semantics:
>>>>>>>>>> a) PMEM - needs special flush instruction/calls
>>>>>>>>>> b) PMEM - my data is persistent, please don't clear me
>>>>>>>>>>
>>>>>>>>>> Do those always go together?
>>>>>>>>>>
>>>>>>>>>> (Copying in Junyan He who added the RAM_PMEM flag)
>>>>>>>>>>
>>>>>>>>>>> I can imagine retrieving the MemoryRegion from the guest phys address, then checking whether the owner is TYPE_NVDIMM, for example. Is this a good solution?
>>>>>>>>>>
>>>>>>>>>> No, I think it's up to whatever creates the region to set a flag somewhere properly - there's no telling whether it'll always be NVDIMM or some other object.
>>>>>>>>>
>>>>>>>>> We could make the owner object set a flag on the MemoryRegion, or implement a common NV interface.
>>>>>>>>
>>>>>>>> I think preferably on the RAMBlock, but yes; the question then is whether we need to split the PMEM flag into two for (a) and (b) above and whether they need to be separately set.
>>>>>>> well, I understand the current pmem flag semantics as "the backend supports persistent memory"; whether the frontend will use it or not is a different story.
>>>>>>>
>>>>>>> Perhaps we need a separate flag which says that pmem is in use, but what about the use case where an nvdimm is used as RAM and not storage? We would probably like to wipe it out as normal RAM.
>>>>>>
>>>>>> Right, so we're slowly building up the number of flags/policies we're trying to represent here; it's certainly not the one 'pmem' flag we currently have.
>>>>>>
>>>>>>> Maybe we should add a frontend property to allow selecting the policy per device, which then could be propagated to the MemoryRegion/RAMBlock?
>>>>>>
>>>>>> By 'device' here you mean memory-backend-* objects? If so then yes, I'd agree a property on those propagated to the underlying RAMBlock would be ideal.
>>>>> nope, 'device' is the frontend, i.e. pc-dimm, nvdimm, whatever comes next ... it can mark a backend/MemoryRegion as 'clear on request' when it's realized() (which has nothing to do with pmem or backends).
>>>>
>>>> Why do you want to do it on the dimm? I thought the behaviour was more associated with the backing device.
>>>> I worry about that because there are lots of places you can wire a backing device up; for example I tend to do -object memory-backend-*.... -numa node, rather than using a dimm device, so you'd have to go and add the option to all of those DIMMs etc.
> That's just RAM guest-wise regardless of backend, so it must be wiped out.
>
>>>
>>> I'm a bit confused by the details here, but it's important that we get the frontend/backend distinction correct.
>>>
>>> I'm still reading this thread and trying to understand what exactly is/isn't guest-visible here. Anything that is guest-visible is supposed to be a frontend option (especially if it's a setting that isn't allowed to change during migration). Backend options are never supposed to affect guest ABI.
>>
>> That's a fair argument.
>>
>>> Which is the case here?
>>
>> From what I can see, the current RAM_PMEM/is_pmem flags are purely used by QEMU to decide if it should call pmem_* library calls to persist things - so that's a host-visible feature only.
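
(As an aside, here is a rough illustration of why semantics (a) is host-side only: the pmem flag changes how a write - including a wipe - is made durable, not what the guest observes. pmem_memset_persist() and the libpmem header are real; clear_block(), its is_pmem parameter, and the exact CONFIG_LIBPMEM guard are assumptions made for the sketch.)

    #include <string.h>
    #include <stdbool.h>
    #ifdef CONFIG_LIBPMEM
    #include <libpmem.h>
    #endif

    /* Zero a host range backing guest RAM.  Whether to zero it at all -
     * semantics (b) - is a separate, guest-visible policy decision. */
    static void clear_block(void *host, size_t len, bool is_pmem)
    {
    #ifdef CONFIG_LIBPMEM
        if (is_pmem) {
            pmem_memset_persist(host, 0, len);  /* zero and flush to the NVDIMM */
            return;
        }
    #endif
        memset(host, 0, len);                   /* ordinary RAM path */
    }
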
>>
>> docs/nvdimm.txt seems to cover a lot of the detail here.
>>
>> We've got quite a lot of places in the code that do:
>> if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {
>>
>> and that's what seems to drive the ACPI building - see hw/acpi/nvdimm.c - so I guess that's most of the guest-visible stuff.
>>
>> and for DIMMs we have a MEMORY_DEVICE_INFO_KIND_NVDIMM flag.
>>
>> Then we also have a flag for the CPU/machine type stating the persistence mode (either cpu or mem-ctrl, I think), which sets some ACPI state to the magic values 2 or 3 - see pc_machine_set_nvdimm_persistence. But that's guest-visible state and global, not about specific RAM chunks.
>>
>> So I think you're right that RAM_PMEM is the wrong thing to use here - it shouldn't change the guest-visible behaviour of whether the RAM is cleared; at the moment the only thing seems to be the TYPE_NVDIMM check, but that's not passed down to the memory regions.
>>
>> So should it just be that the nvdimm frontend sets the flag saying that a memory region/ramblock is exposed to the guest as nvdimm? And that should make the difference here?
>
> Well, I'm also starting to lose track of what we are trying to figure out here. Should we start anew, and sum it up?
>
> * we have a TPM feature 'clear memory on reboot'
> * it's typically the FW that implements it; however, it would be hard to implement in OVMF, so this patch tries to implement it in QEMU by going over RAMBlocks and clearing them, thinking it would be easier - alas, it turns out that's not really true, as:
(QEMU now faces a design question. OVMF would face the same design question *plus* a huge implementation mess. To my eyes, the relative difference in difficulty persists.)

> we notice that there is an nvdimm and go off on a tangent discussing the backend's 'pmem' option, which is not visible to guests and could be set or not set, or the backend could be used by other frontends (RAM).
>
> Regardless of where the feature should be implemented, I think we should first figure out (ask the TCG spec folks) what to do with nvdimms in the first place, or try to get hold of a platform that supports both and copy the behavior from there.
> Once we know what to do with it we can discuss the details.

Agreed 100%.

Thanks
Laszlo
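
(Going back to Dave's question above - should the nvdimm frontend simply set a flag saying that a memory region/ramblock is exposed to the guest as nvdimm? - a marking done at realize() time, as Igor suggested, could look roughly like the sketch below. memory_region_set_guest_nvdimm() does not exist in QEMU; its name, its signature and the idea of a per-RAMBlock "guest nvdimm" mark are assumptions, and host_memory_backend_get_memory() is shown with an approximate signature.)

    #include "qemu/osdep.h"
    #include "hw/mem/nvdimm.h"
    #include "sysemu/hostmem.h"

    /* Hypothetical API - illustrative only, not part of QEMU today. */
    void memory_region_set_guest_nvdimm(MemoryRegion *mr, bool guest_nvdimm);

    /* Sketch: the nvdimm *frontend* tags its backing memory when realized,
     * so a later "clear memory on reboot" pass can decide what to do with
     * it, independently of the backend's pmem=on/off setting. */
    static void nvdimm_mark_guest_visible(NVDIMMDevice *nvdimm)
    {
        PCDIMMDevice *dimm = PC_DIMM(nvdimm);
        MemoryRegion *mr = host_memory_backend_get_memory(dimm->hostmem);

        memory_region_set_guest_nvdimm(mr, true);
    }
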