On Fri, Jan 25, 2019 at 01:26:53AM -0200, Eduardo Habkost wrote:
> On Thu, Jan 24, 2019 at 10:08:37PM -0500, Michael S. Tsirkin wrote:
> > On Thu, Jan 24, 2019 at 05:14:43PM -0200, Eduardo Habkost wrote:
> > > On Thu, Jan 24, 2019 at 02:05:45PM -0500, Michael S. Tsirkin wrote:
> > > > On Thu, Jan 24, 2019 at 04:28:39PM -0200, Eduardo Habkost wrote:
> > > > > On Thu, Jan 24, 2019 at 12:45:54PM -0500, Michael S. Tsirkin wrote:
> > > > > > On Thu, Jan 24, 2019 at 02:59:26PM -0200, Eduardo Habkost wrote:
> > > > > > > On Thu, Jan 24, 2019 at 07:21:03PM +0800, Yi Zhang wrote:
> > > > > > > > On 2019-01-23 at 12:50:50 -0200, Eduardo Habkost wrote:
> > > > > > > > > On Wed, Jan 23, 2019 at 11:00:02AM +0800, Zhang, Yi wrote:
> > > > > > > > > > From: Zhang Yi <yi.z.zh...@linux.intel.com>
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Zhang Yi <yi.z.zh...@linux.intel.com>
> > > > > [...]
> > > > > > > > > > + - 'pmem' option of memory-backend-file is 'on':
> > > > > > > > > > + The backend is a file supporting DAX, e.g., a file on an ext4 or
> > > > > > > > > > + xfs file system mounted with '-o dax'. if your pmem=on ,but the backend is
> > > > > > > > > > + not a file supporting DAX, mapping with this flag results in an EOPNOTSUPP
> > > > > > > > > > + error.
> > > > > > > > >
> > > > > > > > > Won't this break existing configurations that work today on QEMU
> > > > > > > > > 3.1.0? Why exactly it is OK to break compatibility here?
> > > > > > > > won't, pmem option default is off, if people who start VM don't know what
> > > > > > > > backend file is, it is suggested and *default to set pmem=off,
> > > > > > > > if people well know the backend file have dax capability. it is suggest
> > > > > > > > to set pmem=on.
> > > > > > > >
> > > > > > > > For a special case that we use /dev/dax as backend, we already have a
> > > > > > > > patch to add MAP_SYNC flag mapping from device dax mode.
> > > > > > > > see https://lkml.org/lkml/2018/4/22/524
> > > > > > > >
> > > > > > > > So, if people force set pmem=on, mapping a regular file, it will result
> > > > > > > > in an EOPNOTSUPP error.
> > > > > > >
> > > > > > > This is where compatibility is being broken, isn't it? People
> > > > > > > currently using pmem=on on a regular file will start getting
> > > > > > > errors after a QEMU upgrade. Existing VMs with pmem=on may stop
> > > > > > > booting. Maybe this is OK, but we need to be able to explain why
> > > > > > > it is OK.
> > > > > >
> > > > > > I think it's OK since pmem explicitly means "persistent":
> > > > > >
> > > > > > The @option{pmem} option specifies whether the backing file specified
> > > > > > by @option{mem-path} is in host persistent memory that can be accessed
> > > > > > using the SNIA NVM programming model (e.g. Intel NVDIMM).
> > > > > > If @option{pmem} is set to 'on', QEMU will take necessary operations to
> > > > > > guarantee the persistence of its own writes to @option{mem-path}
> > > > > > (e.g. in vNVDIMM label emulation and live migration).
> > > > >
> > > > > If it's OK, let's at least explicitly document that we are
> > > > > breaking compatibility in those cases.
> > > > >
> > > > > >
> > > > > >
> > > > > > [...]
> > > > > > I think generally MAP_SYNC is required.
> > > > > > But for compatibility reasons we might need to support
> > > > > > !MAP_SYNC on old kernels even though it's risky.
> > > > >
> > > > > What about making MAP_SYNC optional only on older machine-types?
> > > >
> > > > I don't think this makes sense. It's not a guest visible change,
> > > > machine types are for that.
> > >
> > > Losing data written to persistent memory is surely guest-visible
> > > behavior.
> >
> > I think we need not be purists here. Most people don't lose power and
> > then it's fine and compatible. People who want more robustness need to
> > use more modern kernels, that is all.
>
> I don't think that's being purist. I want to avoid hidden bugs
> if we ignore that MAP_SYNC failed for any unexpected reason. If
> we need to ignore errors in some cases, let's at least limit that
> to cases where we absolutely have to.
> But I would also be happy with just a warning.

Makes sense to me. So if it fails with EOPNOTSUPP, we try with
MAP_SHARED_VALIDATE without MAP_SYNC. If that succeeds then it's not a
DAX file, and we warn. If it fails too then it's an old kernel and we
silently proceed for compatibility reasons.

>
> --
> Eduardo
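
To make the proposed fallback order concrete, here is a rough C sketch of
it. This is not QEMU code and not part of the patch: the function name
pmem_map_sketch(), the enum pmem_map_how and the warning text are
hypothetical, and the #ifndef fallback values for MAP_SHARED_VALIDATE and
MAP_SYNC assume the generic Linux definitions. It follows the wording above
literally, so only an EOPNOTSUPP failure of the first mmap() triggers the
retries; which errno an old kernel actually reports for the first call is
not settled in this thread.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/* Fallbacks for old libc headers (assumed generic Linux values). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/* Hypothetical result codes, used only by this sketch. */
enum pmem_map_how {
    PMEM_MAP_FAILED,      /* no mapping could be created */
    PMEM_MAP_SYNCED,      /* MAP_SYNC mapping succeeded */
    PMEM_MAP_NOT_DAX,     /* kernel validates flags, file is not DAX */
    PMEM_MAP_OLD_KERNEL,  /* plain MAP_SHARED, no warning */
};

static void *pmem_map_sketch(int fd, size_t len, enum pmem_map_how *how)
{
    const int prot = PROT_READ | PROT_WRITE;
    void *p;

    *how = PMEM_MAP_FAILED;

    /* Step 1: ask for a synchronous mapping. */
    p = mmap(NULL, len, prot, MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    if (p != MAP_FAILED) {
        *how = PMEM_MAP_SYNCED;
        return p;
    }
    if (errno != EOPNOTSUPP) {
        /* Unexpected error: do not hide it. */
        return MAP_FAILED;
    }

    /* Step 2: same call without MAP_SYNC. Success means the kernel
     * knows MAP_SHARED_VALIDATE, so the file itself lacks DAX. */
    p = mmap(NULL, len, prot, MAP_SHARED_VALIDATE, fd, 0);
    if (p != MAP_FAILED) {
        fprintf(stderr, "warning: pmem=on but backend does not support "
                        "MAP_SYNC; persistence is not guaranteed\n");
        *how = PMEM_MAP_NOT_DAX;
        return p;
    }

    /* Step 3: treat it as an old kernel, proceed silently. */
    p = mmap(NULL, len, prot, MAP_SHARED, fd, 0);
    if (p != MAP_FAILED) {
        *how = PMEM_MAP_OLD_KERNEL;
    }
    return p;
}
```

A caller would check the result against MAP_FAILED as usual and could use
the reported enum value to decide whether to log anything further.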