Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-24 Thread Paul E. McKenney
On Wed, Jun 24, 2015 at 09:50:50AM -0700, Dave Hansen wrote: > On 06/22/2015 05:26 PM, Paul E. McKenney wrote: > > OK, here is an experimental patch that provides a fast-readers variant > > of RCU, forward-ported from v3.3. Because we didn't have call_srcu() > > and srcu_barrier() back then, it is

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-24 Thread Dave Hansen
On 06/22/2015 05:26 PM, Paul E. McKenney wrote: > OK, here is an experimental patch that provides a fast-readers variant > of RCU, forward-ported from v3.3. Because we didn't have call_srcu() > and srcu_barrier() back then, it is not a drop-in replacement for SRCU, > so you need to adapt the code

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-23 Thread Jan Kara
On Fri 19-06-15 14:50:25, Dave Hansen wrote: > > From: Dave Hansen > > I have a _tiny_ microbenchmark that sits in a loop and writes > single bytes to a file. Writing one byte to a tmpfs file is > around 2x slower than reading one byte from a file, which is a > _bit_ more than I expected. This

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-22 Thread Paul E. McKenney
On Mon, Jun 22, 2015 at 09:03:08PM +0200, Peter Zijlstra wrote: > On Mon, Jun 22, 2015 at 09:29:49AM -0700, Paul E. McKenney wrote: > > > I believe that there still are some cases. But why would offline > > CPUs seem so iffy? CPUs coming up execute code before they are fully > > operational, and

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-22 Thread Paul E. McKenney
On Mon, Jun 22, 2015 at 08:52:29PM +0200, Peter Zijlstra wrote: > On Mon, Jun 22, 2015 at 08:11:21AM -0700, Paul E. McKenney wrote: > > That depends on how slow the resulting slow global state would be. > > We have some use cases (definitely KVM, perhaps also some of the VFS > > code) that need the

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-22 Thread Paul E. McKenney
On Mon, Jun 22, 2015 at 11:50:50AM -0700, Dave Hansen wrote: > On 06/22/2015 08:11 AM, Paul E. McKenney wrote: > > But if Dave is willing to test it, I would be happy to send along > > a fast-readers patch, easy to do. > > I'm always willing to test, but the cost of the srcu_read_lock() barrier >

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-22 Thread Peter Zijlstra
On Mon, Jun 22, 2015 at 09:29:49AM -0700, Paul E. McKenney wrote: > I believe that there still are some cases. But why would offline > CPUs seem so iffy? CPUs coming up execute code before they are fully > operational, and during that time, much of the kernel views them as > being offline. Yet

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-22 Thread Peter Zijlstra
On Mon, Jun 22, 2015 at 08:11:21AM -0700, Paul E. McKenney wrote: > That depends on how slow the resulting slow global state would be. > We have some use cases (definitely KVM, perhaps also some of the VFS > code) that need the current speed, as opposed to the profound slowness > that three trips t

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-22 Thread Dave Hansen
On 06/22/2015 08:11 AM, Paul E. McKenney wrote: > But if Dave is willing to test it, I would be happy to send along > a fast-readers patch, easy to do. I'm always willing to test, but the cost of the srcu_read_lock() barrier shows up even on my 2-year-old "Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz"

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-22 Thread Paul E. McKenney
On Mon, Jun 22, 2015 at 05:20:13PM +0200, Peter Zijlstra wrote: > On Mon, Jun 22, 2015 at 08:11:21AM -0700, Paul E. McKenney wrote: > > That depends on how slow the resulting slow global state would be. > > We have some use cases (definitely KVM, perhaps also some of the VFS > > code) that need the

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-22 Thread Peter Zijlstra
On Mon, Jun 22, 2015 at 08:11:21AM -0700, Paul E. McKenney wrote: > That depends on how slow the resulting slow global state would be. > We have some use cases (definitely KVM, perhaps also some of the VFS > code) that need the current speed, as opposed to the profound slowness > that three trips t

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-22 Thread Paul E. McKenney
On Mon, Jun 22, 2015 at 03:28:21PM +0200, Peter Zijlstra wrote: > On Sat, Jun 20, 2015 at 06:30:58PM -0700, Paul E. McKenney wrote: > > Well, it is not hard to have an SRCU-like thing that doesn't have > > read-side memory barriers, given that older versions of SRCU didn't > > have them. However,

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-22 Thread Peter Zijlstra
On Sat, Jun 20, 2015 at 06:30:58PM -0700, Paul E. McKenney wrote: > Well, it is not hard to have an SRCU-like thing that doesn't have > read-side memory barriers, given that older versions of SRCU didn't > have them. However, the price is increased latency for the analog to > synchronize_srcu().

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-20 Thread Paul E. McKenney
On Sat, Jun 20, 2015 at 11:02:08AM -0700, Dave Hansen wrote: > On 06/19/2015 07:21 PM, Paul E. McKenney wrote: > >>> > > What is so expensive in it? Just the memory barrier in it? > >> > > >> > The profiling doesn't hit on the mfence directly, but I assume that the > >> > overhead is coming from t

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-20 Thread Dave Hansen
On 06/19/2015 07:21 PM, Paul E. McKenney wrote: >>> > > What is so expensive in it? Just the memory barrier in it? >> > >> > The profiling doesn't hit on the mfence directly, but I assume that the >> > overhead is coming from there. The "mov 0x8(%rdi),%rcx" is identical >> > before and after t

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-19 Thread Paul E. McKenney
On Fri, Jun 19, 2015 at 05:39:11PM -0700, Dave Hansen wrote: > On 06/19/2015 04:33 PM, Andi Kleen wrote: > >> > I *think* we can avoid taking the srcu_read_lock() for the > >> > common case where there are no actual marks on the file > >> > being modified *or* the vfsmount. > > What is so expensive

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-19 Thread Dave Hansen
On 06/19/2015 04:33 PM, Andi Kleen wrote: >> > I *think* we can avoid taking the srcu_read_lock() for the >> > common case where there are no actual marks on the file >> > being modified *or* the vfsmount. > What is so expensive in it? Just the memory barrier in it? The profiling doesn't hit on th

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-19 Thread Paul E. McKenney
On Fri, Jun 19, 2015 at 04:33:06PM -0700, Andi Kleen wrote: > On Fri, Jun 19, 2015 at 02:50:25PM -0700, Dave Hansen wrote: > > > > From: Dave Hansen > > > > I have a _tiny_ microbenchmark that sits in a loop and writes > > single bytes to a file. Writing one byte to a tmpfs file is > > around 2

Re: [RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-19 Thread Andi Kleen
On Fri, Jun 19, 2015 at 02:50:25PM -0700, Dave Hansen wrote: > > From: Dave Hansen > > I have a _tiny_ microbenchmark that sits in a loop and writes > single bytes to a file. Writing one byte to a tmpfs file is > around 2x slower than reading one byte from a file, which is a > _bit_ more than I

[RFC][PATCH] fs: optimize inotify/fsnotify code for unwatched files

2015-06-19 Thread Dave Hansen
From: Dave Hansen I have a _tiny_ microbenchmark that sits in a loop and writes single bytes to a file. Writing one byte to a tmpfs file is around 2x slower than reading one byte from a file, which is a _bit_ more than I expected. This is a dumb benchmark, but I think it's hard to deny that wri