Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-11 Thread James Simmons
> On Sun, Jul 10, 2016 at 07:14:18PM +0100, James Simmons wrote: > > > [ 111.210818] [] kiblnd_send+0x51d/0x9e0 [ko2iblnd] > > Mea culpa - in kiblnd_send() this > if (payload_kiov) > iov_iter_bvec(&from, ITER_BVEC | WRITE, > payload_kiov,

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-10 Thread Al Viro
On Mon, Jul 11, 2016 at 02:01:13AM +0100, Al Viro wrote: > On Sun, Jul 10, 2016 at 07:14:18PM +0100, James Simmons wrote: > > > [ 111.210818] [] kiblnd_send+0x51d/0x9e0 [ko2iblnd] > > Mea culpa - in kiblnd_send() this > if (payload_kiov) > iov_iter_bvec(&from, ITER_BVEC

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-10 Thread Al Viro
On Sun, Jul 10, 2016 at 07:14:18PM +0100, James Simmons wrote: > [ 111.210818] [] kiblnd_send+0x51d/0x9e0 [ko2iblnd] Mea culpa - in kiblnd_send() this if (payload_kiov) iov_iter_bvec(&from, ITER_BVEC | WRITE, payload_kiov, payload_niov, pa

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-10 Thread James Simmons
> On Jul 4, 2016, at 10:25 PM, Al Viro wrote: > > > BTW, could you take a look at > > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git#sendmsg.lustre? > > It's a bunch of simplifications that became possible once > > sendmsg()/recvmsg() > > switched to iov_iter, stopped mangling the io

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-10 Thread Oleg Drokin
On Jul 4, 2016, at 10:25 PM, Al Viro wrote: > BTW, could you take a look at > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git#sendmsg.lustre? > It's a bunch of simplifications that became possible once sendmsg()/recvmsg() > switched to iov_iter, stopped mangling the iovecs and went for

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-06 Thread Oleg Drokin
On Jul 5, 2016, at 9:51 AM, Al Viro wrote: > On Tue, Jul 05, 2016 at 01:31:10PM +0100, Al Viro wrote: >> On Tue, Jul 05, 2016 at 02:22:48AM -0400, Oleg Drokin wrote: >> + if (!(open_flags & O_CREAT) && !d_unhashed(dentry)) { >> >> s/d_unhashed/d_in_lookup/ in that. >> >>> So we come raci

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Oleg Drokin
On Jul 5, 2016, at 11:25 PM, Oleg Drokin wrote: > > On Jul 5, 2016, at 11:20 PM, Al Viro wrote: > >> On Tue, Jul 05, 2016 at 08:29:37PM -0400, Oleg Drokin wrote: + /* Otherwise we just unhash it to be rehashed afresh via + * lookup if necessary + */ >

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Oleg Drokin
On Jul 5, 2016, at 11:20 PM, Al Viro wrote: > On Tue, Jul 05, 2016 at 08:29:37PM -0400, Oleg Drokin wrote: >>> + /* Otherwise we just unhash it to be rehashed afresh via >>> +* lookup if necessary >>> +*/ >>> + d_drop(dentry); >> >> So we can even drop

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Al Viro
On Tue, Jul 05, 2016 at 08:29:37PM -0400, Oleg Drokin wrote: > > + /* Otherwise we just unhash it to be rehashed afresh via > > +* lookup if necessary > > +*/ > > + d_drop(dentry); > > So we can even drop this part and retain the top condition as it was.

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Oleg Drokin
On Jul 5, 2016, at 4:21 PM, Oleg Drokin wrote: > >>> So with Lustre the dentry can be in three states, really: >>> >>> 1. hashed dentry that's all A-ok to reuse. >>> 2. hashed dentry that's NOT valid (dlm lock went away) - this is >>> distinguished in d_compare by looking at a bit in the fs_dat

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Oleg Drokin
On Jul 5, 2016, at 4:08 PM, Al Viro wrote: > On Tue, Jul 05, 2016 at 03:12:44PM -0400, Oleg Drokin wrote: >> >> When d_in_lookup was introduced, it was described as: >>New primitives: d_in_lookup() (a predicate checking if dentry is in >>the in-lookup state) and d_lookup_done() (tells th

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Al Viro
On Tue, Jul 05, 2016 at 03:12:44PM -0400, Oleg Drokin wrote: > > When d_in_lookup was introduced, it was described as: > New primitives: d_in_lookup() (a predicate checking if dentry is in > the in-lookup state) and d_lookup_done() (tells the system that > we are done with lookup and i

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Oleg Drokin
On Jul 5, 2016, at 2:08 PM, Al Viro wrote: > On Tue, Jul 05, 2016 at 12:33:09PM -0400, Oleg Drokin wrote: > >> This also makes me question the whole thing some more. We are definitely in >> lookup >> when this hits, so the dentry is already new, yet it does not check off as >> d_in_lookup(). Th

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Oleg Drokin
On Jul 5, 2016, at 1:42 PM, Al Viro wrote: > On Tue, Jul 05, 2016 at 11:21:32AM -0400, Oleg Drokin wrote: >>> ... >>> - if (d_unhashed(*de)) { >>> + if (d_in_lookup(*de)) { >>>struct dentry *alias; >>> >>>alias = ll_splice_alias(inode, *de); >> >> Thi

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Al Viro
On Tue, Jul 05, 2016 at 12:33:09PM -0400, Oleg Drokin wrote: > This also makes me question the whole thing some more. We are definitely in > lookup > when this hits, so the dentry is already new, yet it does not check off as > d_in_lookup(). That also means that by skipping the ll_splice_alias we

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Al Viro
On Tue, Jul 05, 2016 at 11:21:32AM -0400, Oleg Drokin wrote: > > ... > > - if (d_unhashed(*de)) { > > + if (d_in_lookup(*de)) { > > struct dentry *alias; > > > > alias = ll_splice_alias(inode, *de); > > This breaks Lustre because we now might progress

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Oleg Drokin
On Jul 5, 2016, at 9:51 AM, Al Viro wrote: > On Tue, Jul 05, 2016 at 01:31:10PM +0100, Al Viro wrote: >> On Tue, Jul 05, 2016 at 02:22:48AM -0400, Oleg Drokin wrote: >> + if (!(open_flags & O_CREAT) && !d_unhashed(dentry)) { >> >> s/d_unhashed/d_in_lookup/ in that. >> >>> So we come raci

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Oleg Drokin
On Jul 5, 2016, at 9:51 AM, Al Viro wrote: > On Tue, Jul 05, 2016 at 01:31:10PM +0100, Al Viro wrote: >> On Tue, Jul 05, 2016 at 02:22:48AM -0400, Oleg Drokin wrote: >> + if (!(open_flags & O_CREAT) && !d_unhashed(dentry)) { >> >> s/d_unhashed/d_in_lookup/ in that. >> >>> So we come raci

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Al Viro
On Tue, Jul 05, 2016 at 01:31:10PM +0100, Al Viro wrote: > On Tue, Jul 05, 2016 at 02:22:48AM -0400, Oleg Drokin wrote: > > > > + if (!(open_flags & O_CREAT) && !d_unhashed(dentry)) { > > s/d_unhashed/d_in_lookup/ in that. > > > So we come racing here from multiple threads (say 3 or more - we ha

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-05 Thread Al Viro
On Tue, Jul 05, 2016 at 02:22:48AM -0400, Oleg Drokin wrote: > > + if (!(open_flags & O_CREAT) && !d_unhashed(dentry)) { s/d_unhashed/d_in_lookup/ in that. > So we come racing here from multiple threads (say 3 or more - we have seen > this > in the older crash reports, so totally possible) >

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-04 Thread Oleg Drokin
On Jul 3, 2016, at 2:29 AM, Al Viro wrote: > On Sat, Jun 25, 2016 at 12:38:40PM -0400, Oleg Drokin wrote: > >> Sorry to nag you about this, but did any of those pan out? >> >> d_alloc_parallel() sounds like a bit too heavy there, esp. considering we >> came in with >> a dentry already (though

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-04 Thread Oleg Drokin
On Jul 4, 2016, at 10:28 PM, Oleg Drokin wrote: > > On Jul 3, 2016, at 2:29 AM, Al Viro wrote: > >> On Sat, Jun 25, 2016 at 12:38:40PM -0400, Oleg Drokin wrote: >> >>> Sorry to nag you about this, but did any of those pan out? >>> >>> d_alloc_parallel() sounds like a bit too heavy there, esp.

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-04 Thread Oleg Drokin
ah, and of course the testcase that I forgot to attach ;) lstest.sh Description: Binary data On Jul 4, 2016, at 10:28 PM, Oleg Drokin wrote: > > On Jul 3, 2016, at 2:29 AM, Al Viro wrote: > >> On Sat, Jun 25, 2016 at 12:38:40PM -0400, Oleg Drokin wrote: >> >>> Sorry to nag you about this,

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-04 Thread Oleg Drokin
On Jul 3, 2016, at 2:29 AM, Al Viro wrote: > On Sat, Jun 25, 2016 at 12:38:40PM -0400, Oleg Drokin wrote: > >> Sorry to nag you about this, but did any of those pan out? >> >> d_alloc_parallel() sounds like a bit too heavy there, esp. considering we >> came in with >> a dentry already (though

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-04 Thread Al Viro
On Sun, Jul 03, 2016 at 11:55:09PM -0400, Oleg Drokin wrote: > Quite a bit, actually. If you connect to an rogue Lustre server, > currently there are many ways it can crash the client. > I suspect this is true not just of Lustre, if e.g. NFS server starts to > send directory inodes with duplicated

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-03 Thread Oleg Drokin
On Jul 3, 2016, at 11:08 PM, Al Viro wrote: > On Sun, Jul 03, 2016 at 08:37:22PM -0400, Oleg Drokin wrote: > >> Hm… This dates to sometime in 2006 and my memory is a bit hazy here. >> >> I think when we called into the open, it went into fifo open and stuck there >> waiting for the other opener

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-03 Thread Al Viro
On Sun, Jul 03, 2016 at 08:37:22PM -0400, Oleg Drokin wrote: > Hm… This dates to sometime in 2006 and my memory is a bit hazy here. > > I think when we called into the open, it went into fifo open and stuck there > waiting for the other opener. Something like that. And we cannot really be > stuc

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-03 Thread Oleg Drokin
On Jul 3, 2016, at 8:08 PM, Al Viro wrote: > On Sun, Jul 03, 2016 at 07:29:46AM +0100, Al Viro wrote: > >> Tentative NFS patch follows; I don't understand Lustre well enough, but it >> looks like a plausible strategy there as well. > > Speaking of Lustre: WTF is >/* Open

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-03 Thread Al Viro
On Sun, Jul 03, 2016 at 07:29:46AM +0100, Al Viro wrote: > Tentative NFS patch follows; I don't understand Lustre well enough, but it > looks like a plausible strategy there as well. Speaking of Lustre: WTF is /* Open dentry. */ if (S_ISFIFO(d_inode

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-07-02 Thread Al Viro
On Sat, Jun 25, 2016 at 12:38:40PM -0400, Oleg Drokin wrote: > Sorry to nag you about this, but did any of those pan out? > > d_alloc_parallel() sounds like a bit too heavy there, esp. considering we > came in with > a dentry already (though a potentially shared one, I understand). > Would not i

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-06-25 Thread Oleg Drokin
Hello! On Jun 17, 2016, at 12:29 AM, Al Viro wrote: > On Fri, Jun 17, 2016 at 12:09:19AM -0400, Oleg Drokin wrote: > >>So they both do d_drop(), the dentry is now unhashed, and they both >>dive into nfs_lookup(). >>There eventually they both call >> >> res = d_splice_alias(inod

Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-06-16 Thread Al Viro
On Fri, Jun 17, 2016 at 12:09:19AM -0400, Oleg Drokin wrote: > So they both do d_drop(), the dentry is now unhashed, and they both > dive into nfs_lookup(). > There eventually they both call > > res = d_splice_alias(inode, dentry); > >And so the first lucky one continues on

More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

2016-06-16 Thread Oleg Drokin
Hello! So I think I finally kind of understand this other d_splice_alias problem I've been having. Imagine that we have a negative dentry in the cache. Now we have two threads that are trying to open this file in parallel, the fs is NFS, so atomic_open is defined, it's not a creat