On Thu, 26 Apr 2018 03:37:51 -0400 (EDT) Pankaj Gupta <pagu...@redhat.com> wrote:
trimming CC list to keep people that might be interested in the topic and
renaming thread to reflect it.

> > > > > > > >> +
> > > > > > > >> +    memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr);
> > > > > > > > missing vmstate registration?
> > > > > > > Missed this one: To be called by the caller. Important because e.g.
> > > > > > > for virtio-pmem we don't want this (I assume :) ).
> > > > > > if pmem isn't on shared storage, then we'd probably want to migrate
> > > > > > it as well, otherwise the target would experience data loss.
> > > > > > Anyways, I'd just treat it as normal RAM in the migration case
> > > > > The main difference between RAM and pmem is that it acts like a
> > > > > combination of RAM and disk. That said, in a normal use-case the size
> > > > > would be in the 100 GB's - few TB's range. I am not sure we really
> > > > > want to migrate it for the non-shared storage use-case.
> > > > with non-shared storage you'd have to migrate it to the target host, but
> > > > with shared storage it might be possible to flush it and use it directly
> > > > from the target host. That probably won't work right out of the box and
> > > > would need some sort of synchronization between src/dst hosts.
> > > Shared storage should work out of the box. The only thing is that data on
> > > the destination host will be cache cold and existing pages in the cache
> > > should be invalidated first. But if we migrate the entire fake DAX RAM
> > > state it will populate the destination host page cache, including pages
> > > which were idle on the source host. This would unnecessarily create
> > > entropy on the destination host.
> > >
> > > To me this feature doesn't make much sense. The problem which we are
> > > solving is: efficiently use guest RAM.
> > What would the live migration handover flow look like in case of a guest
> > constantly dirtying memory provided by virtio-pmem and sometimes issuing
> > an async flush req along with it?
> Dirtying the entire pmem (disk) at once is not a usual scenario. Some part
> of the disk/pmem would get dirty and we need to handle that. I just want to
> say that moving the entire pmem (disk) is not an efficient solution, because
> we are using this solution to manage guest memory efficiently. Otherwise it
> will be like any block device copy with non-shared storage. Not sure if we
> can use the block layer analogy here.
> > > > The same applies to nv/pc-dimm as well, as the backend file could
> > > > easily be on pmem storage as well.
> > > Are you saying the backing file is on actual nvdimm hardware? We don't
> > > need emulation at all.
> > depends on whether the file is on a DAX filesystem, but your argument
> > about migrating a huge 100 GB - TB's range applies in this case as well.
> >
> > > > Maybe for now we should migrate everything so it would work in case of
> > > > a non-shared NVDIMM on the host. And then later add a migration-less
> > > > capability to all of them.
> > > not sure I agree.
> > So would you inhibit migration in case of non-shared backend storage, to
> > avoid losing data since it isn't migrated?
> I am just thinking about what features we want to support with pmem. And
> live migration with shared storage is the one which comes to my mind.
>
> If live migration with non-shared storage is what we want to support (I
> don't know yet) we can add this? Even with shared storage it would copy the
> entire pmem state?
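(For context, the caller-side registration discussed above would look roughly
like the sketch below. Only memory_region_add_subregion() and
vmstate_register_ram() are existing QEMU APIs; the wrapper, its arguments and
the migrate_contents knob are made up purely for illustration.)

#include "qemu/osdep.h"
#include "exec/memory.h"
#include "hw/qdev-core.h"
#include "migration/vmstate.h"

/* sketch: map a device's memory region into the hotplug container and let
 * the caller decide whether its contents take part in migration */
static void example_memory_plug(MemoryRegion *container, hwaddr container_base,
                                MemoryRegion *mr, hwaddr addr,
                                DeviceState *dev, bool migrate_contents)
{
    memory_region_add_subregion(container, addr - container_base, mr);

    if (migrate_contents) {
        /* pc-dimm/nvdimm would register the RAM block for migration here;
         * virtio-pmem could simply leave this out */
        vmstate_register_ram(mr, dev);
    }
}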
Perhaps we should register vmstate like for normal RAM and use something
similar to
  http://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00003.html
to skip shared memory on migration. In that case we could use it for pc-dimms
as well. David, what's your take on it?

> Thanks,
> Pankaj
>
> > > > > One reason why nvdimm added vmstate info could be: there would still
> > > > > be transient writes in memory with fake DAX and there is no way (till
> > > > > now) to flush the guest writes. But with virtio-pmem we can flush such
> > > > > writes before migration, and on the destination host with a shared
> > > > > disk we will automatically have updated data.
> > > > nvdimm has the concept of a flush hint address (maybe not implemented
> > > > in qemu yet) but it can flush. The only reason I'm buying into the
> > > > virtio-pmem idea is that it would allow async flush queues, which would
> > > > reduce the number of vmexits.
> > > That's correct.
> > >
> > > Thanks,
> > > Pankaj
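(To make the "skip shared memory" idea above a bit more concrete, here is a
rough sketch of a migration-side check. The linked series is the authoritative
approach; migrate_skip_shared() is invented here just to show where a knob
would plug in, and qemu_ram_is_shared() is assumed to be available; if it
isn't, the equivalent test would be on the block's RAM_SHARED flag.)

#include "qemu/osdep.h"
#include "exec/cpu-common.h"

/* stand-in for a migration capability/parameter, not a real QEMU function */
static bool migrate_skip_shared(void)
{
    return true;
}

/* sketch: decide whether a RAM block's contents go into the migration
 * stream; shared, file-backed blocks (e.g. a pc-dimm or virtio-pmem backend
 * on shared storage) could be re-opened on the destination instead of being
 * copied */
static bool ram_block_needs_migration(RAMBlock *rb)
{
    if (migrate_skip_shared() && qemu_ram_is_shared(rb)) {
        return false;
    }
    return true;
}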