On Thu, Mar 21, 2013 at 09:15:41PM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 21, 2013 at 12:41:35PM -0600, Jason Gunthorpe wrote:
> > On Thu, Mar 21, 2013 at 08:16:33PM +0200, Michael S. Tsirkin wrote:
> > >
> > > This is the one I find redundant. Since the write will be done by
> > > the adaptor under direct control by the application, why does it
> > > make sense to declare this beforehand? If you don't want to allow
> > > local write access to memory, just do not post any receive WRs with
> > > this address. If you posted and regret it, reset the QP to cancel.
> >
> > This is to support your COW scenario - the app declares before hand to
> > the kernel that it will write to the memory and the kernel ensures
> > pages are dedicated to the app at registration time. Or the app says
> > it will only read and the kernel could leave them shared.
>
> Someone here is confused. LOCAL_WRITE/absence of it does not address
> COW, it breaks COW anyway. Are you now saying we should change rdma so
> without LOCAL_WRITE it will not break COW?

I am talking about this from a spec perspective - not what Linux does
today. The absence of LOCAL_WRITE is part of the specification to
support shared pages. Pages can only be kept shared if all the ACCESS
WRITE bits are clear. Today Linux always breaks the COW, but if you
patch in the ability to keep things shared then it must only happen
when *all* the ACCESS WRITE bits are clear.

> > The adaptor enforces the access control to prevent a naughty app from
> > writing to shared memory - think about mmap'ing libc.so and then using
> > RDMA to write to the shared pages. It is necessary to ensure that is
> > impossible.
>
> That's why it's redundant: we can't trust an application to tell us
> 'this page is writeable', we must get this info from kernel. And so
> there's apparently no need for application to tell adaptor about
> LOCAL_WRITE.

The API design gives user space maximum flexibility: if it wants to
create an enforced no-write MR in otherwise writable pages by skipping
LOCAL_WRITE, then it can do so.

The kernel's role in this should be to deny ibv_reg_mr with WRITE bits
set if the pages are not writable by the app. I don't know if it does
this today; it isn't critically important as long as the pages are
unshared.

Jason
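
The registration policy described above can be sketched as a small
stand-alone model. This is only an illustration of the spec-level
argument, not what Linux does today (which, as noted, always breaks the
COW). All names below (model_access, model_reg_mr, and so on) are
hypothetical and merely modeled on the verbs access flags; treating
atomic access as a write bit is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags modeled on the verbs access flags. These are NOT the
 * real libibverbs definitions. */
enum model_access {
    MODEL_LOCAL_WRITE   = 1 << 0,
    MODEL_REMOTE_WRITE  = 1 << 1,
    MODEL_REMOTE_READ   = 1 << 2,
    MODEL_REMOTE_ATOMIC = 1 << 3, /* assumption: atomics count as writes */
};

/* "All the ACCESS WRITE bits" in this model. */
#define MODEL_WRITE_BITS \
    (MODEL_LOCAL_WRITE | MODEL_REMOTE_WRITE | MODEL_REMOTE_ATOMIC)

enum model_result {
    MODEL_DENY,        /* write access requested on pages the app cannot write */
    MODEL_KEEP_SHARED, /* no write bits set: pages may stay shared */
    MODEL_BREAK_COW,   /* some write bit set: pages must be private to the app */
};

/* Decide what registration should do, given whether the app itself may
 * write to the pages and which access bits it requested. */
enum model_result model_reg_mr(bool pages_writable_by_app, int access)
{
    bool wants_write = (access & MODEL_WRITE_BITS) != 0;

    if (wants_write && !pages_writable_by_app)
        return MODEL_DENY;          /* e.g. a read-only mapping of libc.so */
    if (wants_write)
        return MODEL_BREAK_COW;     /* dedicate pages at registration time */
    return MODEL_KEEP_SHARED;       /* only when *all* write bits are clear */
}

/* A few expected outcomes under the model above. */
void model_self_check(void)
{
    /* Read-only MR on a read-only mapping: sharing can be preserved. */
    assert(model_reg_mr(false, MODEL_REMOTE_READ) == MODEL_KEEP_SHARED);
    /* Any write bit on a read-only mapping: registration is refused. */
    assert(model_reg_mr(false, MODEL_LOCAL_WRITE) == MODEL_DENY);
    /* Enforced no-write MR in otherwise writable pages: allowed. */
    assert(model_reg_mr(true, 0) == MODEL_KEEP_SHARED);
    /* A single write bit is enough to require private pages. */
    assert(model_reg_mr(true, MODEL_REMOTE_READ | MODEL_LOCAL_WRITE)
           == MODEL_BREAK_COW);
}
```

The third case is the flexibility point: user space may skip
LOCAL_WRITE on pages it could otherwise write, and the adaptor then
enforces the no-write MR.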