On Thu, May 02, 2019 at 10:52:23AM -0700, Santosh Shilimkar wrote: > On 5/1/2019 11:21 PM, Leon Romanovsky wrote: > > On Wed, May 01, 2019 at 10:54:00AM -0700, Santosh Shilimkar wrote: > > > On 5/1/2019 12:44 AM, Leon Romanovsky wrote: > > > > On Mon, Apr 29, 2019 at 04:37:19PM -0700, Santosh Shilimkar wrote: > > > > > From: Hans Westgaard Ry <hans.westgaard...@oracle.com> > > > > > > > > > > RDS doesn't support RDMA on memory apertures that require On Demand > > > > > Paging (ODP), such as FS DAX memory. User applications can try to use > > > > > RDS to perform RDMA over such memories and since it doesn't report any > > > > > failure, it can lead to unexpected issues like memory corruption when > > > > > a couple of out of sync file system operations like ftruncate etc. are > > > > > performed. > > > > > > > > > > The patch adds a check so that such an attempt to RDMA to/from memory > > > > > apertures requiring ODP will fail. > > > > > > > > > > Reviewed-by: H??kon Bugge <haakon.bu...@oracle.com> > > > > > Reviewed-tested-by: Zhu Yanjun <yanjun....@oracle.com> > > > > > Signed-off-by: Hans Westgaard Ry <hans.westgaard...@oracle.com> > > > > > Signed-off-by: Santosh Shilimkar <santosh.shilim...@oracle.com> > > > > > --- > > > > > net/rds/rdma.c | 5 +++-- > > > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/net/rds/rdma.c b/net/rds/rdma.c > > > > > index 182ab84..e0a6b72 100644 > > > > > --- a/net/rds/rdma.c > > > > > +++ b/net/rds/rdma.c > > > > > @@ -158,8 +158,9 @@ static int rds_pin_pages(unsigned long user_addr, > > > > > unsigned int nr_pages, > > > > > { > > > > > int ret; > > > > > > > > > > - ret = get_user_pages_fast(user_addr, nr_pages, write, pages); > > > > > - > > > > > + /* get_user_pages return -EOPNOTSUPP for fs_dax memory */ > > > > > + ret = get_user_pages_longterm(user_addr, nr_pages, > > > > > + write, pages, NULL); > > > > > > > > I'm not RDS expert, but from what I see in net/rds/rdma.c and this code, > > > > you tried to mimic ib_umem_get() without protection, checks and native > > > > ODP, FS and DAX supports. > > > > > > > > The real way to solve your ODP problem will require to extend > > > > ib_umem_get() to work for kernel ULPs too and use it instead of > > > > get_user_pages(). We are working on that and it is in internal review > > > > now. > > > > > > > Yes am aware of it. For FS_DAX like memory, get_user_pages_longterm() > > > fails and then using ib_reg_user_mr() the memory is registered as > > > ODP regsion. This work is not ready yet and without above check, > > > one can do RDMA on FS DAX memory with Fast Reg or FMR memory > > > registration which is not safe and hence need to fail the operation. > > > > > > Once the support is added to RDS, this code path will make that > > > registration go through. > > > > > > Hope it clarifies. > > > > Only partial, why don't you check if user asked ODP through verbs > > interface and return EOPNOTSUPP in such case? > > > I think you are mixing two separate things. ODP is just one way of > supporting RDMA on FS DAX memory. Tomorrow, some other mechanism > can be used as well. RDS is just using inbuilt kernel mm API > to find out if its FS DAX memory(get_user_pages_longterm). > Current code will make RDS get_mr fail if RDS application issues > memory registration request on FS DAX memory and in future when > support gets added, it will do the ODP registration and return > the key.
But we are talking about kernel code only, right? Future support will be added if it exists. > > > It will ensure that once your code will support ODP properly written > > applications will work with/without ODP natively. > > > Application shouldn't care if RDS ULP internally uses ODP > or some other mechanism to support RDMA on FS DAX memory. > This makes it transparent it to RDS application. ODP checks need to be internal to kernel, user won't see those ODP checks. Thanks > > Regards, > Santosh