On 2019-01-18 at 16:11:47 -0200, Eduardo Habkost wrote:
> On Wed, Jan 16, 2019 at 10:58:44AM -0500, Michael S. Tsirkin wrote:
> > On Wed, Jan 16, 2019 at 04:10:58PM +0800, Zhang Yi wrote:
> > > When a file supporting DAX is used as vNVDIMM backend, mmap it with
> > > MAP_SYNC flag in addition which can ensure file system metadata
> > > synced in each guest writes to the backend file, without other QEMU
> > > actions (e.g., periodic fsync() by QEMU).
> > >
> > > Signed-off-by: Haozhong Zhang <haozhong.zh...@intel.com>
> > > Signed-off-by: Zhang Yi <yi.z.zh...@linux.intel.com>
> > > ---
> > >  include/qemu/mmap-alloc.h |  1 +
> > >  include/qemu/osdep.h      | 16 ++++++++++++++++
> > >  util/mmap-alloc.c         |  7 ++++++-
> > >  3 files changed, 23 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
> > > index 6fe6ed4..a95d91c 100644
> > > --- a/include/qemu/mmap-alloc.h
> > > +++ b/include/qemu/mmap-alloc.h
> > > @@ -18,6 +18,7 @@ size_t qemu_mempath_getpagesize(const char *mem_path);
> > >   *  @flags: specifies additional properties of the mapping, which can be one or
> > >   *          bit-or of following values
> > >   *          - RAM_SHARED: mmap with MAP_SHARED flag
> > > + *          - RAM_PMEM: mmap with MAP_SYNC flag
> > >   *          Other bits are ignored.
> > >   *
> > >   *  Return:
> > > diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
> > > index 457d24e..27a6bfe 100644
> > > --- a/include/qemu/osdep.h
> > > +++ b/include/qemu/osdep.h
> > > @@ -419,6 +419,22 @@ void qemu_anon_ram_free(void *ptr, size_t size);
> > >  #  define QEMU_VMALLOC_ALIGN getpagesize()
> > >  #endif
> > >
> > > +/*
> > > + * MAP_SHARED_VALIDATE and MAP_SYNC are introduced in Linux kernel
> > > + * 4.15, so they may not be defined when compiling on older kernels.
> > > + */
> > > +#ifdef CONFIG_LINUX
> > > +
> > > +#include <asm-generic/mman.h>
> >
> > I suspect this is a wrong way to pull in this header.
> >
> > You are normally supposed to use
> > #include <linux/mman.h>
> >
> > but see below.
> >
> >
> > > +
> > > +#ifndef MAP_SYNC
> > > +#define MAP_SYNC 0x0
> > > +#endif
> >
> > Oh that's bad.
> >
> > So if you run with a new kernel but
> > your installed headers are old, you get MAP_SYNC 0
> > and no persistence transparently with no warning.
>
> Yes.  The semantics of the command-line to not change depending on
> build time circumstances.
>
> Anyway, I see a more fundamental problem in each version of this
> patch: the semantics of the command-line options are not clearly
> documented.
>
> We have at least 3 different possible use cases we might need to
> support:
>
> 1) pmem=on, MAP_SYNC not desired
> 2) pmem=on, MAP_SYNC desired but optional

From V9, as Michael suggested, we removed the sync option; MAP_SYNC is
forced on whenever pmem=on is set. So we only have two use cases, and I
will update the user documentation accordingly:

1) pmem=on, MAP_SYNC not desired
   We will not pass the flag to mmap2.
2) pmem=on, MAP_SYNC desired
   We will pass the flag to mmap2.

> 3) pmem=on, MAP_SYNC required, not optional
>
> Which cases from the list above we need to support?
>
> From the cases above, what's the expected semantics of "pmem=on"
> with no extra options?
>
> If these questions are not answered (in the commit message and
> user documentation), we won't be able to review and discuss the
> code.
>
>
> >
> > > +
> > > +#else /* !CONFIG_LINUX */
> > > +#define MAP_SYNC 0x0
> > > +#endif /* CONFIG_LINUX */
> > > +
> > >  #ifdef CONFIG_POSIX
> > >  struct qemu_signalfd_siginfo {
> > >      uint32_t ssi_signo;   /* Signal number */
> > > diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
> > > index 8f0a740..cba961c 100644
> > > --- a/util/mmap-alloc.c
> > > +++ b/util/mmap-alloc.c
> > > @@ -99,6 +99,8 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, uint32_t flags)
> > >      void *ptr = mmap(0, total, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> > >  #endif
> > >      bool shared = flags & RAM_SHARED;
> > > +    bool is_pmem = flags & RAM_PMEM;
> > > +    int mmap_xflags = 0;
> > >      size_t offset;
> > >      void *ptr1;
> > >
> > > @@ -109,12 +111,15 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, uint32_t flags)
> > >      assert(is_power_of_2(align));
> > >      /* Always align to host page size */
> > >      assert(align >= getpagesize());
> > > +    if (shared && is_pmem) {
> > > +        mmap_xflags |= MAP_SYNC;
> > > +    }
> > >
> > >      offset = QEMU_ALIGN_UP((uintptr_t)ptr, align) - (uintptr_t)ptr;
> > >      ptr1 = mmap(ptr + offset, size, PROT_READ | PROT_WRITE,
> > >                  MAP_FIXED |
> > >                  (fd == -1 ? MAP_ANONYMOUS : 0) |
> > > -                (shared ? MAP_SHARED : MAP_PRIVATE),
> > > +                (shared ? MAP_SHARED : MAP_PRIVATE) | mmap_xflags,
> > >                  fd, 0);
> > >      if (ptr1 == MAP_FAILED) {
> > >          munmap(ptr, total);
> > > --
> > > 2.7.4
> >
>
> --
> Eduardo
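
For illustration, regarding the "#define MAP_SYNC 0x0" fallback discussed
above: the following is only a minimal sketch, not the code from this
series, of how the flag selection could refuse a pmem request when the
build headers lack MAP_SYNC, instead of silently mapping without it.
DEMO_RAM_SHARED, DEMO_RAM_PMEM and demo_mmap_flags() are hypothetical
names invented for this example (the bit values are arbitrary).
MAP_SHARED_VALIDATE is used because, per mmap(2), MAP_SYNC is only
honoured with that mapping type and is silently ignored with plain
MAP_SHARED.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-ins for QEMU's RAM_SHARED / RAM_PMEM bits. */
#define DEMO_RAM_SHARED (1u << 1)
#define DEMO_RAM_PMEM   (1u << 5)

/*
 * Compute the mmap() sharing flags for a RAM backend.
 *
 * Returns 0 and stores the flags in *out on success.  Returns -ENOTSUP
 * when a shared pmem mapping is requested but the headers used at build
 * time do not provide MAP_SYNC / MAP_SHARED_VALIDATE, so the caller can
 * report an error rather than run without persistence guarantees.
 */
int demo_mmap_flags(unsigned ram_flags, int *out)
{
    bool shared = ram_flags & DEMO_RAM_SHARED;
    bool is_pmem = ram_flags & DEMO_RAM_PMEM;

    assert(out != NULL);

    /* Same condition as in the patch: only shared pmem gets MAP_SYNC. */
    if (!(shared && is_pmem)) {
        *out = shared ? MAP_SHARED : MAP_PRIVATE;
        return 0;
    }

#if defined(MAP_SYNC) && defined(MAP_SHARED_VALIDATE)
    /* The kernel validates the flags and fails mmap() if unsupported. */
    *out = MAP_SHARED_VALIDATE | MAP_SYNC;
    return 0;
#else
    /* No silent "MAP_SYNC 0x0" fallback: tell the caller explicitly. */
    return -ENOTSUP;
#endif
}
```

A caller would OR the result with MAP_FIXED and the other flags as in
qemu_ram_mmap(), and would have to decide how to handle an mmap() failure
(for example EOPNOTSUPP on a backend without DAX support); whether that
is a hard error or a warning depends on which of the use cases listed
above pmem=on is meant to cover.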