Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-19 Thread NeilBrown
On Sun, 17 May 2015 19:56:26 -0700 Linus Torvalds wrote: > On Sun, May 17, 2015 at 4:16 PM, NeilBrown wrote: > > > > Just to be crystal clear about what I want: > > I want the filesystem to be in control > > Yeah, no. Not going to happen. > > You seem to think that the dcache is "just" a cac

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-17 Thread Linus Torvalds
On Sun, May 17, 2015 at 8:42 PM, Al Viro wrote: > > "Rest of the path" makes no sense, obviously. "More of the path" (and _not_ > as a string, TYVM - we have those components in ->d_name.name of dentries we > want revalidated [..]) For revalidate, yes we kind of have them as dentries. I say kind

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-17 Thread Al Viro
On Sun, May 17, 2015 at 07:56:26PM -0700, Linus Torvalds wrote: > > So for Al's example of revalidating multiple components at once, once the > > VFS > > gets to a point in the path where d_revalidate says "I need more time", > > the VFS just passes the rest of the path to the filesystem. > >

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-17 Thread Linus Torvalds
On Sun, May 17, 2015 at 4:16 PM, NeilBrown wrote: > > Just to be crystal clear about what I want: > I want the filesystem to be in control Yeah, no. Not going to happen. You seem to think that the dcache is "just" a cache. It's not. It's a cache, but that is absolutely not all that it is. It's

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-17 Thread Al Viro
On Mon, May 18, 2015 at 09:39:07AM +1000, NeilBrown wrote: > There is no reason to be so gloomy. RTFS. > The VFS would provide a generic_do_last() (or whatever) which handles > everything correctly for local filesystems which keep the dcache precisely > consistent and use it for all the valuable

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-17 Thread NeilBrown
On Sun, 17 May 2015 11:55:35 +0100 Al Viro wrote: > As for Neil's point re do_last() and friends being much too convoluted - yes, > they are. And it's not a result of trying to shoehorn everything in one > model. "Just let NFS have at it" as soon as we reach do_last() won't make > things any si

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-17 Thread NeilBrown
On Sun, 17 May 2015 09:43:34 -0700 Linus Torvalds wrote: > On Sun, May 17, 2015 at 3:55 AM, Al Viro wrote: > > > > And that is complete crap. Multi-component lookups do make sense; once > > we are at the edge of the area present in dcache, we _know_ there won't > > be any existing mountpoints i

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-17 Thread Linus Torvalds
On Sun, May 17, 2015 at 9:43 AM, Linus Torvalds wrote: > > d_instantiate(dentry, inode); > > could decide that *before* it does that "d_instantiate()", it could > pre-populate the child list of 'dentry' with the lookup information > for 'b' (and possibly recursively for 'c' too under 'd').

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-17 Thread Linus Torvalds
On Sun, May 17, 2015 at 3:55 AM, Al Viro wrote: > > And that is complete crap. Multi-component lookups do make sense; once > we are at the edge of the area present in dcache, we _know_ there won't > be any existing mountpoints involved; parsing the components and feeding > them to fs at once, alo

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-17 Thread Al Viro
On Sat, May 16, 2015 at 09:04:34PM -0700, Linus Torvalds wrote: > It's now about things like overlayfs etc, all those things. Er... Bad example, that - overlayfs is _not_ fs-agnostic. > When somebody does a lookup of a filename, it is not a "pass this > filename to the filesystem". It very much

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-16 Thread NeilBrown
On Sat, 16 May 2015 21:04:34 -0700 Linus Torvalds wrote: > On Sat, May 16, 2015 at 8:48 PM, Linus Torvalds > wrote: > > > > Sorry, but that really is how it is. NFS isn't special enough for some > > badly designed lookup models to matter one whit. > > Btw, it's not just about performance, altho

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-16 Thread Linus Torvalds
On Sat, May 16, 2015 at 8:48 PM, Linus Torvalds wrote: > > Sorry, but that really is how it is. NFS isn't special enough for some > badly designed lookup models to matter one whit. Btw, it's not just about performance, although the whole "we can do cached lookups without ever having to et the fil

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-16 Thread Linus Torvalds
On Sat, May 16, 2015 at 8:12 PM, NeilBrown wrote: > > The problem isn't getting intermediates. The problem is that not having > intermediates confuses the dcache. When the dcache is just providing a > caching service, and not providing a consistency service, then it shouldn't > let itself get co

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-16 Thread NeilBrown
On Sat, 16 May 2015 15:18:11 +0100 Al Viro wrote: > On Sat, May 16, 2015 at 06:46:26AM +0100, Al Viro wrote: > > > Dealing with multi-component lookups isn't impossible and might be a good > > idea, but only if all intermediates are populated. What information does > > NFSv4 multi-component loo

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-16 Thread NeilBrown
On Sat, 16 May 2015 06:46:26 +0100 Al Viro wrote: > On Sat, May 16, 2015 at 02:45:27PM +1000, NeilBrown wrote: > > > Yes, I've looked lately :-) > > I think that all of RCU-walk, and probably some of REF-walk should happen > > before the filesystem gets to see anything. > > But once you hit a no

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-16 Thread Linus Torvalds
On Fri, May 15, 2015 at 9:31 PM, Al Viro wrote: > > Point, but... A lot of our problems comes from the fact that ->i_mutex > doubles as protection against the addition to the list of children, on > top of protection of directory itself. Yeah, ok, we'd need to change that too. Maybe just make it

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-16 Thread Al Viro
On Sat, May 16, 2015 at 06:46:26AM +0100, Al Viro wrote: > Dealing with multi-component lookups isn't impossible and might be a good > idea, but only if all intermediates are populated. What information does > NFSv4 multi-component lookup give you? 9p one gives an array of FIDs, > one per compon

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Sat, May 16, 2015 at 02:45:27PM +1000, NeilBrown wrote: > Yes, I've looked lately :-) > I think that all of RCU-walk, and probably some of REF-walk should happen > before the filesystem gets to see anything. > But once you hit a non-positive dentry or the parent of the target name, I'd > rather

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread NeilBrown
On Sat, 16 May 2015 02:47:18 +0100 Al Viro wrote: > On Sat, May 16, 2015 at 11:25:03AM +1000, NeilBrown wrote: > > But surely those things can be managed with a spinlock. > > > > I think a big part of the problem is that the VFS tries to control > > filesystems rather than provide services to th

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Fri, May 15, 2015 at 08:37:20PM -0700, Linus Torvalds wrote: > On May 15, 2015 8:17 PM, "Al Viro" wrote: > > > > What for? All we need is a flag, waitqueue and being woken > > up when the flag gets cleared. > > You need to have the flag somewhere. > > The child dentry doesn't exist y
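
The "flag, waitqueue and being woken up when the flag gets cleared" idea quoted above can be illustrated in userspace. What follows is only a rough analogue, assuming POSIX threads: a condition variable stands in for a kernel waitqueue, and `struct pending_entry` and the `pending_*` helpers are hypothetical names, not anything from fs/namei.c or the patchset.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry whose lookup is still in flight. */
struct pending_entry {
	pthread_mutex_t lock;      /* protects in_lookup */
	pthread_cond_t  wait;      /* userspace analogue of a waitqueue */
	bool            in_lookup; /* the "flag": set while lookup is running */
};

/* Create the entry in the "lookup in progress" state. */
static void pending_init(struct pending_entry *e)
{
	assert(e != NULL);
	pthread_mutex_init(&e->lock, NULL);
	pthread_cond_init(&e->wait, NULL);
	e->in_lookup = true;
}

/* Called by a second thread that finds the entry: sleep until it is ready. */
static void pending_wait(struct pending_entry *e)
{
	pthread_mutex_lock(&e->lock);
	while (e->in_lookup)	/* loop guards against spurious wakeups */
		pthread_cond_wait(&e->wait, &e->lock);
	pthread_mutex_unlock(&e->lock);
}

/* Called by the thread that performed the lookup: clear flag, wake waiters. */
static void pending_done(struct pending_entry *e)
{
	pthread_mutex_lock(&e->lock);
	e->in_lookup = false;
	pthread_cond_broadcast(&e->wait);
	pthread_mutex_unlock(&e->lock);
}
```

The sketch places the flag in the entry itself, which is exactly the point the reply raises: the flag has to live somewhere, so some object must exist before the lookup completes.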

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Fri, May 15, 2015 at 07:23:11PM -0700, Linus Torvalds wrote: >For filesystems that say that they are ok with, make lookup_slow() > (and *only* lookup_slow for now) instead take the rwsem for reading, > but in addition to that, take a hashed mutex. > > By "hashed mutex", I mean having a sma
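
The quoted description of a "hashed mutex" is cut off in this listing, so the following is only a sketch of one common reading of the term: a fixed array of mutexes, with the lock chosen by hashing the (parent directory, name) pair. The array size, the FNV-1a hash, and every name here (`lookup_locks`, `lookup_lock_index`, and so on) are assumptions made for illustration, and the code uses POSIX threads rather than kernel primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define LOOKUP_LOCK_COUNT 64	/* assumed size; must be a power of two */

static pthread_mutex_t lookup_locks[LOOKUP_LOCK_COUNT];

/* Initialise every mutex in the table; call once before use. */
static void lookup_locks_init(void)
{
	for (size_t i = 0; i < LOOKUP_LOCK_COUNT; i++)
		pthread_mutex_init(&lookup_locks[i], NULL);
}

/* Map (parent, name) to a slot using 64-bit FNV-1a (an arbitrary choice). */
static size_t lookup_lock_index(const void *parent, const char *name)
{
	uint64_t h = 0xcbf29ce484222325ULL;	/* FNV-1a offset basis */
	uintptr_t p = (uintptr_t)parent;

	assert(name != NULL);
	for (size_t i = 0; i < sizeof(p); i++) {
		h ^= (p >> (8 * i)) & 0xff;
		h *= 0x100000001b3ULL;		/* FNV-1a prime */
	}
	for (; *name; name++) {
		h ^= (unsigned char)*name;
		h *= 0x100000001b3ULL;
	}
	return (size_t)(h & (LOOKUP_LOCK_COUNT - 1));
}

/* Take the mutex that serialises lookups of this name in this parent. */
static pthread_mutex_t *lookup_lock(const void *parent, const char *name)
{
	pthread_mutex_t *m = &lookup_locks[lookup_lock_index(parent, name)];

	pthread_mutex_lock(m);
	return m;
}

static void lookup_unlock(pthread_mutex_t *m)
{
	pthread_mutex_unlock(m);
}
```

Two lookups of the same name in the same directory always land on the same mutex and so serialise. Unrelated names may collide on a slot, which costs some concurrency but not correctness.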

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Linus Torvalds
On Fri, May 15, 2015 at 6:55 PM, Al Viro wrote: > > See upthread. It might be doable (provided that we turn ->i_mutex into > rwsem, to keep the exclusion with directory _modifiers_), but it'll need > a really non-trivial code review of a bunch of filesystems, especially ones > that want to play w

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Fri, May 15, 2015 at 06:47:04PM -0700, Linus Torvalds wrote: > Now, maybe we could solve it with a new sleeping lock in the dentry > itself. Maybe we could allocate the new dentry early, add it to the > directory the usual way, but mark it as being "not ready" (so that > d_lookup() wouldn't use

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Sat, May 16, 2015 at 11:25:03AM +1000, NeilBrown wrote: > But surely those things can be managed with a spinlock. > > I think a big part of the problem is that the VFS tries to control > filesystems rather than provide services to them. What with being the thing syscalls talk to for sending th

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Linus Torvalds
On Fri, May 15, 2015 at 6:25 PM, NeilBrown wrote: >> >>For example, simply that we only ever have one single dentry for a >> particular name, and that we only ever have one active lookup per >> dentry. Those things happen independently of - and before - the server >> even sees the operation. >

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Fri, May 15, 2015 at 05:45:56PM -0700, Linus Torvalds wrote: > Al, do you have any ideas? Personally, I've wanted to make I_mutex a > rwsem for a long time, but right now pretty much everything uses it > for exclusion. For example, filename lookup is clearly just reading > the directory, so it

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread NeilBrown
On Fri, 15 May 2015 17:45:56 -0700 Linus Torvalds wrote: > On Fri, May 15, 2015 at 4:30 PM, NeilBrown wrote: > > > > .. and I've been wondering what to do about i_mutex and NFS. I've had > > customer reports of slowness in creating files that seems to be due to > > i_mutex on the directory bein

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Linus Torvalds
On Fri, May 15, 2015 at 4:38 PM, Dave Chinner wrote: > > Right, because it's cold cache performance that everyone complains > about. People really do complain about the hot-cache one too. Did you read the description of the sample benchmark that Jeremy described Windows sales people using?

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Linus Torvalds
On Fri, May 15, 2015 at 4:30 PM, NeilBrown wrote: > > .. and I've been wondering what to do about i_mutex and NFS. I've had > customer reports of slowness in creating files that seems to be due to > i_mutex on the directory being held over the whole 'create' RPC, so only one > of those can be in

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Sat, May 16, 2015 at 01:10:27AM +0100, Al Viro wrote: > Er... Remember the clusterfuck around the ->i_size and alignment > checks on XFS DIO writes? Just this cycle. Correctness of XFS > locking is nothing to boast about - it *is* convoluted as hell and you > guys are not superhuman enough t

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Sat, May 16, 2015 at 09:38:08AM +1000, Dave Chinner wrote: > > Both readdir() and path component lookup are technically read > > operations, so why the hell do we use a mutex, rather than just > > get a read-write lock for reading? Yeah, it's that (d) above. I > > might trust xfs and ext4 to ge

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Al Viro
On Sat, May 16, 2015 at 09:30:22AM +1000, NeilBrown wrote: > .. and I've been wondering what to do about i_mutex and NFS. I've had > customer reports of slowness in creating files that seems to be due to > i_mutex on the directory being held over the whole 'create' RPC, so only one > of those can

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Dave Chinner
On Thu, May 14, 2015 at 04:57:22PM -0700, Jeremy Allison wrote: > On Thu, May 14, 2015 at 04:24:13PM -0700, Linus Torvalds wrote: > > On Thu, May 14, 2015 at 3:09 PM, Jeremy Allison wrote: > > > > > > Of course we tell people to just set their filesystems > > > up using mkfs.xfs -n version=ci :-).

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Dave Chinner
On Thu, May 14, 2015 at 08:51:12AM -0700, Linus Torvalds wrote: > On Thu, May 14, 2015 at 4:23 AM, Dave Chinner wrote: > > > > IIRC, ext4 readdir is not slow because of the use of the buffer > > cache, it's slow because of the way it hashes dirents across blocks > > on disk. i.e. it has locality

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Dave Chinner
On Fri, May 15, 2015 at 03:15:48PM -0600, Andreas Dilger wrote: > On May 14, 2015, at 5:23 AM, Dave Chinner wrote: > > > > On Wed, May 13, 2015 at 08:52:59PM -0700, Linus Torvalds wrote: > >> On Wed, May 13, 2015 at 8:30 PM, Al Viro wrote: > >>> > >>> Maybe... I'd like to see the profiles, TBH

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread NeilBrown
On Fri, 15 May 2015 15:15:48 -0600 Andreas Dilger wrote: > On May 14, 2015, at 5:23 AM, Dave Chinner wrote: > > > > On Wed, May 13, 2015 at 08:52:59PM -0700, Linus Torvalds wrote: > >> On Wed, May 13, 2015 at 8:30 PM, Al Viro wrote: > >>> > >>> Maybe... I'd like to see the profiles, TBH - es

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Andreas Dilger
On May 14, 2015, at 5:23 AM, Dave Chinner wrote: > > On Wed, May 13, 2015 at 08:52:59PM -0700, Linus Torvalds wrote: >> On Wed, May 13, 2015 at 8:30 PM, Al Viro wrote: >>> >>> Maybe... I'd like to see the profiles, TBH - especially getxattr() and >>> access() frequency on various loads. Sure,

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-15 Thread Linus Torvalds
On Thu, May 14, 2015 at 7:51 PM, Al Viro wrote: > > What's the benefit compared to c-i mount? Not hitting filesystem's > ->d_hash() and ->d_compare()? So the reason I'd be interested in per-access flags rather than mount flags are: - only special apps should use this anyway. IOW, samba and per

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Al Viro
On Thu, May 14, 2015 at 07:18:16PM -0700, Linus Torvalds wrote: > The only difference - EVER - would be if you pass in the ICASE flag. > Nothing I suggested would change semantics without it (the _hash_ > changes, but that doesn't change semantics, it's a purely internal > random number). > > Now,

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Linus Torvalds
On Thu, May 14, 2015 at 6:26 PM, Al Viro wrote: > > Hold on. Should > stat("blah", &buf) => ENOENT, OK, let's create it > mkdir("blah", 0)=> EEXIST, bugger, looks like a race > stat("blah", &buf) => ENOENT, Whiskey, Tango, Foxtrot > be possible? No. What

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Al Viro
On Thu, May 14, 2015 at 05:25:39PM -0700, Linus Torvalds wrote: > We can easily make things per-operation, by adding another flag. We > already have per-operation flags like LOOKUP_FOLLOW, which decides if > we follow the last symlink or not. We could add a LOOKUP_ICASE, which > decides whether we
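
A per-operation case-insensitivity flag of the kind quoted above might behave roughly as in the sketch below. This is a userspace illustration, not kernel code: `LOOKUP_ICASE_SKETCH`, `name_hash` and `name_matches` are hypothetical names, folding is ASCII-only, and having the hash always fold case (so that only the comparison depends on the flag) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LOOKUP_ICASE_SKETCH 0x1u	/* hypothetical per-operation flag */

/* ASCII-only case folding; bytes outside 'A'..'Z' pass through unchanged. */
static unsigned char ascii_fold(unsigned char c)
{
	return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/*
 * Hash over the folded name, so "Foo" and "foo" share a bucket whether or
 * not the caller asked for case-insensitive matching. The hash value is
 * purely internal, so this alone does not change lookup semantics.
 */
static uint32_t name_hash(const char *name)
{
	uint32_t h = 2166136261u;	/* 32-bit FNV-1a offset basis */

	assert(name != NULL);
	for (; *name; name++) {
		h ^= ascii_fold((unsigned char)*name);
		h *= 16777619u;		/* 32-bit FNV-1a prime */
	}
	return h;
}

/* Exact comparison by default; folded comparison only when the flag is set. */
static bool name_matches(const char *a, const char *b, unsigned int flags)
{
	assert(a != NULL && b != NULL);
	if (!(flags & LOOKUP_ICASE_SKETCH))
		return strcmp(a, b) == 0;
	for (; *a && *b; a++, b++)
		if (ascii_fold((unsigned char)*a) != ascii_fold((unsigned char)*b))
			return false;
	return *a == *b;	/* both strings must end at the same point */
}
```

Callers that do not pass the flag see exactly the behaviour of a byte-wise compare. Only a caller that opts in, such as a file server emulating case-insensitive semantics, gets the folded match.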

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Linus Torvalds
On Thu, May 14, 2015 at 4:36 PM, Al Viro wrote: > On Thu, May 14, 2015 at 04:24:13PM -0700, Linus Torvalds wrote: > >> So ASCII-only case-insensitivity is sufficient for you guys? >> >> Doing case-insensitive lookups at a vfs layer level wouldn't be >> impossible (add some new lookup flag, so it w

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Jeremy Allison
On Thu, May 14, 2015 at 04:24:13PM -0700, Linus Torvalds wrote: > On Thu, May 14, 2015 at 3:09 PM, Jeremy Allison wrote: > > > > Of course we tell people to just set their filesystems > > up using mkfs.xfs -n version=ci :-). > > So ASCII-only case-insensitivity is sufficient for you guys? No it'

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Al Viro
On Thu, May 14, 2015 at 04:24:13PM -0700, Linus Torvalds wrote: > So ASCII-only case-insensitivity is sufficient for you guys? > > Doing case-insensitive lookups at a vfs layer level wouldn't be > impossible (add some new lookup flag, so it would *not* be > per-filesystem, it would be per-operati

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Linus Torvalds
On Thu, May 14, 2015 at 3:09 PM, Jeremy Allison wrote: > > Of course we tell people to just set their filesystems > up using mkfs.xfs -n version=ci :-). So ASCII-only case-insensitivity is sufficient for you guys? Doing case-insensitive lookups at a vfs layer level wouldn't be impossible (add so

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Jeremy Allison
On Wed, May 13, 2015 at 08:52:59PM -0700, Linus Torvalds wrote: > On Wed, May 13, 2015 at 8:30 PM, Al Viro wrote: > > > > Maybe... I'd like to see the profiles, TBH - especially getxattr() and > > access() frequency on various loads. Sure, make(1) and cc(1) really care > > about stat() very much

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Linus Torvalds
On Thu, May 14, 2015 at 8:51 AM, Linus Torvalds wrote: > > Basically, in computer science, pretty much all performance work is > about caching. Credit where credit is due. Terje "almost all programming can be viewed as an exercise in caching" Mathisen. Linus -- To unsubscribe f

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Eric W. Biederman
Al Viro writes: > In particular, automounts will require > discussing what exactly in the process' state is used for those - both > with autofs/NFS/AFS/CIFS folks and with Eric (what netns should be used > when we are crossing an NFSv4 referral point? Should it come from the > NFS mount we'd foun

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Linus Torvalds
On Thu, May 14, 2015 at 4:23 AM, Dave Chinner wrote: > > IIRC, ext4 readdir is not slow because of the use of the buffer > cache, it's slow because of the way it hashes dirents across blocks > on disk. i.e. it has locality issues, not a caching problem. No, you're just worrying about IO. Natural

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Jan Kara
On Thu 14-05-15 21:23:04, Dave Chinner wrote: > On Wed, May 13, 2015 at 08:52:59PM -0700, Linus Torvalds wrote: > > And readdir() itself, for that matter - we have no good vfs-level > > readdir caching, so it all ends up serialized on the inode > > semaphore, and it all goes all the way into the fi

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-14 Thread Dave Chinner
On Wed, May 13, 2015 at 08:52:59PM -0700, Linus Torvalds wrote: > On Wed, May 13, 2015 at 8:30 PM, Al Viro wrote: > > > > Maybe... I'd like to see the profiles, TBH - especially getxattr() and > > access() frequency on various loads. Sure, make(1) and cc(1) really care > > about stat() very much

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-13 Thread Linus Torvalds
On Wed, May 13, 2015 at 8:30 PM, Al Viro wrote: > > Maybe... I'd like to see the profiles, TBH - especially getxattr() and > access() frequency on various loads. Sure, make(1) and cc(1) really care > about stat() very much, but I wouldn't be surprised if something like > httpd or samba would be

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-13 Thread Al Viro
On Wed, May 13, 2015 at 06:39:53PM -0700, Linus Torvalds wrote: > On Wed, May 13, 2015 at 3:25 PM, Al Viro wrote: > > More on top of the current vfs.git#for-next (== the posted patchset > > with a couple of fixes): more fs/namei.c reorganization and stack footprint > > reduction (below 1Kb

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-13 Thread Linus Torvalds
On Wed, May 13, 2015 at 3:25 PM, Al Viro wrote: > More on top of the current vfs.git#for-next (== the posted patchset > with a couple of fixes): more fs/namei.c reorganization and stack footprint > reduction (below 1Kb now). One interesting piece of that is that we don't > touch current->

Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

2015-05-13 Thread Al Viro
More on top of the current vfs.git#for-next (== the posted patchset with a couple of fixes): more fs/namei.c reorganization and stack footprint reduction (below 1Kb now). One interesting piece of that is that we don't touch current->fs->lock anymore - unlazy_walk() used to, but now we can