Re: Recent kernel "mount" slow

2012-11-27 Thread Mikulas Patocka
On Tue, 27 Nov 2012, Jens Axboe wrote: > On 2012-11-27 11:06, Jeff Chua wrote: > > On Tue, Nov 27, 2012 at 3:38 PM, Jens Axboe wrote: > >> On 2012-11-27 06:57, Jeff Chua wrote: > >>> On Sun, Nov 25, 2012 at 7:23 AM, Jeff Chua > >>> wrote: >

[PATCH 2/2] block_dev: don't take the write lock if block size doesn't change

2012-11-27 Thread Mikulas Patocka
stem with block size equal to the default block size). The logic to test if the block device is mapped was moved to a separate function is_bdev_mapped to avoid code duplication. Signed-off-by: Mikulas Patocka --- fs/block_dev.c | 25 ++--- 1 file changed, 18 insertions(
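The idea in this patch description (skip the expensive write lock when the requested block size already matches) can be sketched roughly as follows. This is a single-threaded illustration, not the code from the patch: `demo_bdev`, `demo_write_lock` and `demo_set_blocksize` are hypothetical names, and the "lock" only counts acquisitions so the fast path is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block device; NOT the kernel's struct block_device. */
struct demo_bdev {
    unsigned int block_size;
    unsigned int write_lock_count; /* counts simulated write-lock acquisitions */
};

/* Simulated expensive write lock: here it only counts how often it is taken. */
static void demo_write_lock(struct demo_bdev *b)   { b->write_lock_count++; }
static void demo_write_unlock(struct demo_bdev *b) { (void)b; }

/* Returns true if the block size was changed, false if it already matched. */
static bool demo_set_blocksize(struct demo_bdev *b, unsigned int size)
{
    assert(b != NULL);

    /* Fast path: nothing to change, so skip the expensive write lock. */
    if (b->block_size == size)
        return false;

    demo_write_lock(b);
    b->block_size = size;
    demo_write_unlock(b);
    return true;
}
```

In real concurrent code the unlocked comparison would also have to be safe against a simultaneous size change; the sketch leaves that out.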

[PATCH 1/2] percpu-rwsem: use synchronize_sched_expedited

2012-11-27 Thread Mikulas Patocka
On Tue, 27 Nov 2012, Jeff Chua wrote: > On Tue, Nov 27, 2012 at 3:38 PM, Jens Axboe wrote: > > On 2012-11-27 06:57, Jeff Chua wrote: > >> On Sun, Nov 25, 2012 at 7:23 AM, Jeff Chua > >> wrote: > >>> On Sun, Nov 25, 2012 at 5:09 AM, Mikulas Patocka >

[PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Jens Axboe wrote: > On 2012-11-28 04:57, Mikulas Patocka wrote: > > > > This patch is wrong because you must check if the device is mapped while > > holding bdev->bd_block_size_semaphore (because > > bdev->bd_block_size_semaphore prevent

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Linus Torvalds wrote: > No, this is crap. > > We don't introduce random hooks like this just because the block layer > has shit-for-brains and cannot be bothered to do things right. > > The fact is, the whole locking in the block layer open routine is > total and utter cra

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Linus Torvalds wrote: > On Wed, Nov 28, 2012 at 12:03 PM, Linus Torvalds > wrote: > > > > mmap() is in *no* way special. The exact same thing happens for > > regular read/write. Yet somehow the mmap code is special-cased, while > > the normal read-write code is not. > > I

[PATCH v2] Do a proper locking for mmap and block size change

2012-11-28 Thread Mikulas Patocka
blocksize, changes the block size while the block device is mapped (which is incorrect and may result in a crash or misbehavior). To fix this possible race condition, we introduce a counter bd_mmap_count that counts the number of vmas that map the block device. bd_mmap_count is
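A minimal sketch of the counting idea, assuming that the counter goes up when a mapping is created, goes down when it is torn down, and that a block size change is refused while it is non-zero. The snippet above is truncated before it says where the counter is updated, so the function names here (`demo_vm_open`, `demo_vm_close`, `demo_set_blocksize`) and the `-EBUSY` return are illustrative assumptions, and the locking that makes this race-free is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a block device; NOT the kernel's struct block_device. */
struct demo_mapped_bdev {
    unsigned int block_size;
    int mmap_count; /* plays the role of bd_mmap_count: number of live mappings */
};

/* Called when a mapping of the device is created. */
static void demo_vm_open(struct demo_mapped_bdev *b)
{
    assert(b != NULL);
    b->mmap_count++;
}

/* Called when a mapping of the device goes away. */
static void demo_vm_close(struct demo_mapped_bdev *b)
{
    assert(b != NULL && b->mmap_count > 0);
    b->mmap_count--;
}

/* Refuse to change the block size while any mapping exists. */
static int demo_set_blocksize(struct demo_mapped_bdev *b, unsigned int size)
{
    assert(b != NULL);
    if (b->mmap_count != 0)
        return -EBUSY;
    b->block_size = size;
    return 0;
}
```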

Re: [PATCH 2/2] block_dev: don't take the write lock if block size doesn't change

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Jeff Chua wrote: > On Wed, Nov 28, 2012 at 12:01 PM, Mikulas Patocka wrote: > > block_dev: don't take the write lock if block size doesn't change > > > > Taking the write lock has a big performance impact on the whole system > > (b

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Linus Torvalds wrote: > On Wed, Nov 28, 2012 at 12:32 PM, Linus Torvalds > wrote: > > > > Here is a *COMPLETELY* untested patch. Caveat emptor. It will probably > > do unspeakable things to your family and pets. > > Btw, *if* this approach works, I suspect we could just sw

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Linus Torvalds wrote: > > For example, __block_write_full_page and __block_write_begin do > > if (!page_has_buffers(page)) { create_empty_buffers... } > > and then they do > > WARN_ON(bh->b_size != blocksize) > > err = get_block(inode, block, bh, 1) >

Re: [PATCH] Introduce a method to catch mmap_region (was: Recent kernel "mount" slow)

2012-11-28 Thread Mikulas Patocka
On Wed, 28 Nov 2012, Linus Torvalds wrote: > A bigger issue is for things that emulate what blkdev.c does, and > doesn't do the locking. I see code in md/bitmap.c that seems a bit > suspicious, for example. That said, it's not *new* breakage, and the > "lock at mmap/read/write() time" approach d

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Mikulas Patocka
On Thu, 29 Nov 2012, Linus Torvalds wrote: > On Wed, Nov 28, 2012 at 2:01 PM, Mikulas Patocka wrote: > > > > This sounds sensible. I'm sending this patch. > > This looks much better. > > I think I'll apply this for 3.7 (since it's too late to do an

Re: [PATCH 1/2] percpu-rwsem: use synchronize_sched_expedited

2012-11-29 Thread Mikulas Patocka
On Thu, 29 Nov 2012, Andrew Morton wrote: > On Tue, 27 Nov 2012 22:59:52 -0500 (EST) > Mikulas Patocka wrote: > > > percpu-rwsem: use synchronize_sched_expedited > > > > Use synchronize_sched_expedited() instead of synchronize_sched() > > to improve mount sp

Re: [dm-devel] [PATCH 0/3 v3] add resync speed control for dm-raid1

2013-02-05 Thread Mikulas Patocka
On Wed, 23 Jan 2013, NeilBrown wrote: > On Tue, 22 Jan 2013 20:44:41 -0500 (EST) Mikulas Patocka > wrote: > > > > > > > On Wed, 16 Jan 2013, Guangliang Zhao wrote: > > > > > On Wed, Jan 09, 2013 at 12:43:21AM -0500, Mikulas Patocka wrote: >

Re: [PATCH] Track block device users that created dirty pages

2013-04-03 Thread Mikulas Patocka
On Mon, 1 Apr 2013, Jeff Moyer wrote: > Mikulas Patocka writes: > > > The new semantics is: if a process did some buffered writes to the block > > device (with write or mmap), the cache is flushed when the process > > closes the block device. Processes that didn

Re: [PATCH] A possible deadlock with stacked devices (was: [PATCH v4 08/12] block: Introduce new bio_split())

2012-08-29 Thread Mikulas Patocka
On Wed, 15 Aug 2012, Kent Overstreet wrote: > > Both md and dm use __GFP_WAIT allocations from mempools in > > generic_make_request. > > > > I think you found an interesting bug here. Suppose that we have three > > stacked devices: d1 depends on d2 and d2 depends on d3. > > > > Now, a bio b1

Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers

2012-08-29 Thread Mikulas Patocka
Hi This fixes the bio allocation problems, but doesn't fix a similar deadlock in device mapper when allocating from md->io_pool or other mempools in the target driver. The problem is that the majority of device mapper code assumes that if we submit a bio, that bio will be finished in a finite time

Re: [PATCH 5/8] hpfs: drop lock/unlock super

2012-08-30 Thread Mikulas Patocka
It looks ok. Mikulas On Thu, 30 Aug 2012, Marco Stornelli wrote: > Removed lock/unlock super. > > Signed-off-by: Marco Stornelli > --- > fs/hpfs/super.c |3 --- > 1 files changed, 0 insertions(+), 3 deletions(-) > > diff --git a/fs/hpfs/super.c b/fs/hpfs/super.c > index 706a12c..8af2cdc

[PATCH 0/4] Fix a crash when block device is read and block size is changed at the same time

2012-08-31 Thread Mikulas Patocka
Hi This is a series of patches to prevent a crash when someone is reading a block device and the block size is changed simultaneously (the crash is already happening in the production environment). The first patch adds a rw-lock to struct block_device, but doesn't use the lock anywhere. The rea

[PATCH 1/4] Add a lock that will be needed by the next patch

2012-08-31 Thread Mikulas Patocka
nce change of locking. Signed-off-by: Mikulas Patocka --- drivers/char/raw.c |2 - fs/block_dev.c | 60 +++-- include/linux/fs.h |4 +++ 3 files changed, 63 insertions(+), 3 deletions(-) Index: linux-3.5-rc6-devel/include/

[PATCH 2/4] blockdev: fix a crash when block size is changed and I/O is issued simultaneously

2012-08-31 Thread Mikulas Patocka
vents block size changing while the device is mapped with mmap. Signed-off-by: Mikulas Patocka --- drivers/char/raw.c |2 - fs/block_dev.c | 60 +++-- include/linux/fs.h |2 + 3 files changed, 61 inser

[PATCH 3/4] blockdev: turn a rw semaphore into a percpu rw semaphore

2012-08-31 Thread Mikulas Patocka
blockdev: turn a rw semaphore into a percpu rw semaphore This avoids cache line bouncing when many processes lock the semaphore for read. Partially based on a patch by Jeff Moyer . Signed-off-by: Mikulas Patocka --- fs/block_dev.c | 30 -- include/linux/fs.h

[PATCH 4/4] New percpu lock implementation

2012-08-31 Thread Mikulas Patocka
RCU has been synchronized, no processes can create new read locks. We wait until the sum of percpu counters is zero - when it is, there are no readers in the critical section. Signed-off-by: Mikulas Patocka --- fs/block_dev.c | 15 ++ inclu

Re: [PATCH 0/4] Fix a crash when block device is read and block size is changed at the same time

2012-08-31 Thread Mikulas Patocka
On Fri, 31 Aug 2012, Mikulas Patocka wrote: > Hi > > This is a series of patches to prevent a crash when someone is > reading block device and block size is changed simultaneously. (the crash > is already happening in the production environment) > > The first pa

Re: [PATCH 0/4] Fix a crash when block device is read and block size is changed at the same time

2012-08-31 Thread Mikulas Patocka
On Fri, 31 Aug 2012, Jeff Moyer wrote: > Mikulas Patocka writes: > > > On Fri, 31 Aug 2012, Mikulas Patocka wrote: > > > >> Hi > >> > >> This is a series of patches to prevent a crash when someone is > >> reading block devic

Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers

2012-09-03 Thread Mikulas Patocka
On Thu, 30 Aug 2012, Kent Overstreet wrote: > On Thu, Aug 30, 2012 at 06:07:45PM -0400, Vivek Goyal wrote: > > On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote: > > > > [..] > > > > Performance aside, punting submission to per device worker in case of > > > > deep > > > > stack

Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers

2012-09-04 Thread Mikulas Patocka
On Mon, 3 Sep 2012, Kent Overstreet wrote: > On Mon, Sep 03, 2012 at 04:41:37PM -0400, Mikulas Patocka wrote: > > ... or another possibility - start a timer when something is put to > > current->bio_list and use that timer to pop entries off current->bio_list >

[PATCH 2] dm: Use bioset's front_pad for dm_target_io

2012-09-11 Thread Mikulas Patocka
On Tue, 4 Sep 2012, Kent Overstreet wrote: > On Tue, Sep 04, 2012 at 03:26:19PM -0400, Mikulas Patocka wrote: > > > > > > On Mon, 3 Sep 2012, Kent Overstreet wrote: > > > > > On Mon, Sep 03, 2012 at 04:41:37PM -0400, Mikulas Patocka wrote: > > >

Re: [PATCH 2] dm: Use bioset's front_pad for dm_target_io

2012-09-12 Thread Mikulas Patocka
On Tue, 11 Sep 2012, Kent Overstreet wrote: > On Tue, Sep 11, 2012 at 03:28:57PM -0400, Mikulas Patocka wrote: > > > > > > On Tue, 4 Sep 2012, Kent Overstreet wrote: > > > > > On Tue, Sep 04, 2012 at 03:26:19PM -0400, Mikulas Patocka wrote: > > >

Re: [PATCH 1/2] percpu-rw-semaphores: use light/heavy barriers

2012-10-23 Thread Mikulas Patocka
On Tue, 23 Oct 2012, Oleg Nesterov wrote: > On 10/23, Oleg Nesterov wrote: > > > > Not really the comment, but the question... > > Damn. And another question. > > Mikulas, I am sorry for this (almost) off-topic noise. Let me repeat > just in case that I am not arguing with your patches. > >

Re: [PATCH 1/2] percpu-rw-semaphores: use light/heavy barriers

2012-10-23 Thread Mikulas Patocka
On Tue, 23 Oct 2012, Paul E. McKenney wrote: > On Tue, Oct 23, 2012 at 01:29:02PM -0700, Paul E. McKenney wrote: > > On Tue, Oct 23, 2012 at 08:41:23PM +0200, Oleg Nesterov wrote: > > > On 10/23, Paul E. McKenney wrote: > > > > > > > > * Note that this guarantee implies a further memory-orderin

Re: [PATCH 1/2] percpu-rw-semaphores: use light/heavy barriers

2012-10-24 Thread Mikulas Patocka
On Wed, 24 Oct 2012, Paul E. McKenney wrote: > On Tue, Oct 23, 2012 at 05:39:43PM -0400, Mikulas Patocka wrote: > > > > > > On Tue, 23 Oct 2012, Paul E. McKenney wrote: > > > > > On Tue, Oct 23, 2012 at 01:29:02PM -0700, Paul E. McKenney wrote: > >

Re: [PATCH 1/2] percpu-rw-semaphores: use light/heavy barriers

2012-10-24 Thread Mikulas Patocka
On Wed, 24 Oct 2012, Paul E. McKenney wrote: > On Wed, Oct 24, 2012 at 04:22:17PM -0400, Mikulas Patocka wrote: > > > > > > On Wed, 24 Oct 2012, Paul E. McKenney wrote: > > > > > On Tue, Oct 23, 2012 at 05:39:43PM -0400, Mikulas Patocka wrote: > >

Re: [PATCH 1/2] percpu-rw-semaphores: use light/heavy barriers

2012-10-25 Thread Mikulas Patocka
On Wed, 24 Oct 2012, Paul E. McKenney wrote: > On Wed, Oct 24, 2012 at 04:44:14PM -0400, Mikulas Patocka wrote: > > > > > > On Wed, 24 Oct 2012, Paul E. McKenney wrote: > > > > > On Wed, Oct 24, 2012 at 04:22:17PM -0400, Mikulas Patocka wrote: > >

Re: [PATCH 1/2] brw_mutex: big read-write mutex

2012-10-25 Thread Mikulas Patocka
On Wed, 24 Oct 2012, Dave Chinner wrote: > On Fri, Oct 19, 2012 at 06:54:41PM -0400, Mikulas Patocka wrote: > > > > > > On Fri, 19 Oct 2012, Peter Zijlstra wrote: > > > > > > Yes, I tried this approach - it involves doing LOCK instruction on read >

Re: [PATCH 2/2] percpu-rw-semaphores: use rcu_read_lock_sched

2012-10-25 Thread Mikulas Patocka
On Wed, 24 Oct 2012, Paul E. McKenney wrote: > On Mon, Oct 22, 2012 at 07:39:16PM -0400, Mikulas Patocka wrote: > > Use rcu_read_lock_sched / rcu_read_unlock_sched / synchronize_sched > > instead of rcu_read_lock / rcu_read_unlock / synchronize_rcu. > > > > This

Re: [PATCH 2/2] percpu-rw-semaphores: use rcu_read_lock_sched

2012-10-25 Thread Mikulas Patocka
On Thu, 25 Oct 2012, Paul E. McKenney wrote: > On Thu, Oct 25, 2012 at 10:54:11AM -0400, Mikulas Patocka wrote: > > > > > > On Wed, 24 Oct 2012, Paul E. McKenney wrote: > > > > > On Mon, Oct 22, 2012 at 07:39:16PM -0400, Mikulas Patocka wr

Re: [PATCH 1/2] brw_mutex: big read-write mutex

2012-10-26 Thread Mikulas Patocka
On Fri, 26 Oct 2012, Oleg Nesterov wrote: > On 10/26, Dave Chinner wrote: > > > > On Thu, Oct 25, 2012 at 10:09:31AM -0400, Mikulas Patocka wrote: > > > > > > Yes, mnt_want_write()/mnt_make_readonly() do the same thing as percpu rw > > > semaphor

[PATCH] A possible deadlock with stacked devices (was: [PATCH v4 08/12] block: Introduce new bio_split())

2012-07-26 Thread Mikulas Patocka
suppose that for some reason mempool in d2 is exhausted and the driver needs to wait until b2.1 finishes. b2.1 never finishes, because b2.1 depends on b3.1 and b3.1 is still in current->bio_list. So it deadlocks. Turning off __GFP_WAIT fixes nothing - it just turns one bug (a possible deadlock

[PATCH 1/3] Fix Crash when IO is being submitted and block size is changed

2012-07-28 Thread Mikulas Patocka
On Thu, 19 Jul 2012, Jeff Moyer wrote: > Mikulas Patocka writes: > > > On Tue, 17 Jul 2012, Jeff Moyer wrote: > > > > >> > This is the patch that fixes this crash: it takes a rw-semaphore around > >> > all direct-IO path. > >> > >

[PATCH 2/3] Introduce percpu rw semaphores

2012-07-28 Thread Mikulas Patocka
locking percpu rw semaphore may be rescheduled, it doesn't cause a bug, but cache line bouncing occurs in this case. Signed-off-by: Mikulas Patocka --- include/linux/percpu-rwsem.h | 77 +++ 1 file changed, 77 insertions(+) Index: linux-3.5-fast/in

[PATCH 3/3] blockdev: turn a rw semaphore into a percpu rw semaphore

2012-07-28 Thread Mikulas Patocka
blockdev: turn a rw semaphore into a percpu rw semaphore This avoids cache line bouncing when many processes lock the semaphore for read. Partially based on a patch by Jeff Moyer . Signed-off-by: Mikulas Patocka --- fs/block_dev.c | 30 -- include/linux/fs.h

Re: [dm-devel] [PATCH 2/3] Introduce percpu rw semaphores

2012-07-28 Thread Mikulas Patocka
On Sat, 28 Jul 2012, Eric Dumazet wrote: > On Sat, 2012-07-28 at 12:41 -0400, Mikulas Patocka wrote: > > Introduce percpu rw semaphores > > > > When many CPUs are locking a rw semaphore for read concurrently, cache > > line bouncing occurs. When a CPU acquires

Re: [dm-devel] [PATCH 2/3] Introduce percpu rw semaphores

2012-07-30 Thread Mikulas Patocka
On Mon, 30 Jul 2012, Paul E. McKenney wrote: > On Sun, Jul 29, 2012 at 01:13:34AM -0400, Mikulas Patocka wrote: > > On Sat, 28 Jul 2012, Eric Dumazet wrote: > > > On Sat, 2012-07-28 at 12:41 -0400, Mikulas Patocka wrote: > > [ . . . ] > > > > (bdev->bd

[PATCH 1/2] Fix a crash when block device is read and block size is changed at the same time

2012-09-25 Thread Mikulas Patocka
On Tue, 25 Sep 2012, Jens Axboe wrote: > On 2012-09-25 19:59, Jens Axboe wrote: > > On 2012-09-25 19:49, Jeff Moyer wrote: > >> Jeff Moyer writes: > >> > >>> Mikulas Patocka writes: > >>> > >>>> Hi Jeff > >>>>

[PATCH 2/2] Fix a crash when block device is read and block size is changed at the same time

2012-09-25 Thread Mikulas Patocka
"locked" variable to true and synchronize rcu. Since RCU has been synchronized, no processes can create new read locks. We wait until the sum of percpu counters is zero - when it is, there are no readers in the critical section. Signed-off-by: Mikulas Patocka --- Documentation/percpu-rw-semap
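The writer-side steps described here (set a "locked" flag, synchronize RCU, then wait until the per-CPU reader counters sum to zero) can be modelled in a few lines. This is a single-threaded model under stated assumptions, not the kernel implementation: all `demo_*` names are hypothetical, RCU synchronization is only a comment, and a reader that finds the flag set gets `false` back instead of blocking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4

struct demo_percpu_rwsem {
    bool locked;                /* set by the writer to stop new readers */
    int counters[DEMO_NR_CPUS]; /* per-CPU reader counts; one slot may go negative */
};

/* Reader side: returns false if a writer holds or is taking the lock. */
static bool demo_down_read_try(struct demo_percpu_rwsem *s, size_t cpu)
{
    assert(s != NULL && cpu < DEMO_NR_CPUS);
    if (s->locked)
        return false;
    s->counters[cpu]++;
    return true;
}

/* A reader may unlock on a different CPU than the one it locked on. */
static void demo_up_read(struct demo_percpu_rwsem *s, size_t cpu)
{
    assert(s != NULL && cpu < DEMO_NR_CPUS);
    s->counters[cpu]--;
}

/* Only the sum over all CPUs is meaningful. */
static int demo_reader_sum(const struct demo_percpu_rwsem *s)
{
    int sum = 0;
    for (size_t i = 0; i < DEMO_NR_CPUS; i++)
        sum += s->counters[i];
    return sum;
}

/* Writer step 1: block new readers. Real code would synchronize RCU here. */
static void demo_down_write_begin(struct demo_percpu_rwsem *s)
{
    assert(s != NULL);
    s->locked = true;
}

/* Writer step 2: the writer may proceed once no readers remain. */
static bool demo_down_write_ready(const struct demo_percpu_rwsem *s)
{
    assert(s != NULL);
    return s->locked && demo_reader_sum(s) == 0;
}

static void demo_up_write(struct demo_percpu_rwsem *s)
{
    assert(s != NULL);
    s->locked = false;
}
```

The model shows why the description talks about the sum of the counters: a reader that locks on CPU 0 and unlocks on CPU 2 leaves the slots at 1 and -1, yet the sum is zero and the writer may proceed.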

Re: [PATCH 0/4] Fix a crash when block device is read and block size is changed at the same time

2012-09-25 Thread Mikulas Patocka
On Tue, 25 Sep 2012, Jeff Moyer wrote: > Jeff Moyer writes: > > > Mikulas Patocka writes: > > > >> Hi Jeff > >> > >> Thanks for testing. > >> > >> It would be interesting ... what happens if you take the patch 3, leave >

Re: [PATCH 0/4] Fix a crash when block device is read and block size is changed at the same time

2012-09-26 Thread Mikulas Patocka
On Wed, 26 Sep 2012, Jeff Moyer wrote: > Mikulas Patocka writes: > > > On Tue, 25 Sep 2012, Jeff Moyer wrote: > > > >> Jeff Moyer writes: > >> > >> > Mikulas Patocka writes: > >> > > >> >> Hi Jeff > >> >

Re: [PATCH 1/2] brw_mutex: big read-write mutex

2012-10-26 Thread Mikulas Patocka
On Fri, 26 Oct 2012, Oleg Nesterov wrote: > > The code is different, but it can be changed to use percpu rw semaphores > > (if we add percpu_down_write_trylock). > > I don't really understand how you can make percpu_down_write_trylock() > atomic so that it can be called under br_write_lock(vfsm

Re: dm: Make MIN_IOS, et al, tunable via sysctl.

2013-08-26 Thread Mikulas Patocka
On Tue, 20 Aug 2013, Frank Mayhar wrote: > On Tue, 2013-08-20 at 18:24 -0400, Mike Snitzer wrote: > > Mikulas' point is that you cannot reduce the size to smaller than 1. > > And aside from rq-based DM, 1 is sufficient to allow for forward > > progress even when memory is completely consumed. >
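The point that a reserve of one element is enough for forward progress can be illustrated with a toy pool. This is a sketch under assumptions, not the kernel mempool API: `demo_pool` and its functions are hypothetical, memory exhaustion is simulated with a flag, and where a real pool would make the caller wait, the sketch returns NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_pool {
    void *reserve;     /* the single pre-allocated element, or NULL while in use */
    size_t elem_size;
    bool simulate_oom; /* when true, pretend the normal allocator always fails */
};

static bool demo_pool_init(struct demo_pool *p, size_t elem_size)
{
    assert(p != NULL && elem_size > 0);
    p->elem_size = elem_size;
    p->simulate_oom = false;
    p->reserve = malloc(elem_size);
    return p->reserve != NULL;
}

/* Try the normal allocator first; fall back to the reserved element. */
static void *demo_pool_alloc(struct demo_pool *p)
{
    assert(p != NULL);
    void *e = p->simulate_oom ? NULL : malloc(p->elem_size);
    if (e == NULL) {
        e = p->reserve; /* NULL here means the caller would have to wait */
        p->reserve = NULL;
    }
    return e;
}

/* Refill the reserve first, so that a waiting caller could make progress. */
static void demo_pool_free(struct demo_pool *p, void *e)
{
    assert(p != NULL);
    if (e == NULL)
        return;
    if (p->reserve == NULL)
        p->reserve = e;
    else
        free(e);
}

static void demo_pool_destroy(struct demo_pool *p)
{
    assert(p != NULL);
    free(p->reserve);
    p->reserve = NULL;
}
```

With memory "exhausted", one request at a time can still be served: the element comes from the reserve, and freeing it refills the reserve for the next request. Progress depends on every holder eventually freeing its element without first needing a second one from the same pool.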

Re: [dm-devel] [RFC] dm-lc: plan to go to staging

2013-08-29 Thread Mikulas Patocka
Another idea: Make the interface of dm-lc (the arguments to constructor, messages and the status line) the same as dm-cache, so that they can be driven by the same userspace code. Mikulas On Thu, 29 Aug 2013, Alasdair G Kergon wrote: > On Wed, Aug 28, 2013 at 07:05:55PM -0700, Greg KH wrote:

Re: [dm-devel] Reworking dm-writeboost [was: Re: staging: Add dm-writeboost]

2013-10-02 Thread Mikulas Patocka
On Tue, 1 Oct 2013, Joe Thornber wrote: > > Alternatively, delaying them will stall the filesystem because it's > > waiting for said REQ_FUA IO to complete. For example, journal writes > > in XFS are extremely IO latency sensitive in workloads that have a > > significant number of ordering cons

dm-writeboost testing

2013-10-02 Thread Mikulas Patocka
Hi I tested dm-writeboost and I found these problems: Performance problems: I tested dm-writeboost with a disk as the backing device and a ramdisk as the cache device. When I run mkfs.ext4 on the dm-writeboost device, it writes data to the cache the first time. However, on subsequent mkfs.ext4 invocations,

Re: [dm-devel] dm-writeboost testing

2013-10-04 Thread Mikulas Patocka
On Fri, 4 Oct 2013, Akira Hayakawa wrote: > Hi, Mikulas, > > I am sorry to say that > I don't have such machines to reproduce the problem. > > But agree with that I am dealing with workqueue subsystem > in a little bit weird way. > I should clean them up. > > For example, > free_cache() routi

Re: [dm-devel] dm-writeboost testing

2013-10-04 Thread Mikulas Patocka
On Fri, 4 Oct 2013, Akira Hayakawa wrote: > Mikulas, > > Thanks for your pointing out. > > > The problem is that you are using workqueues the wrong way. You submit a > > work item to a workqueue and the work item is active until the device is > > unloaded. > > > > If you submit a work item

Re: [dm-devel] dm-writeboost testing

2013-10-05 Thread Mikulas Patocka
On Sat, 5 Oct 2013, Akira Hayakawa wrote: > Mikulas, > > > nvidia binary driver, but it may happen in other parts of the kernel too. > > The fact that it works in your setup doesn't mean that it is correct. > You are right. I am convinced. > > As far as I looked around the kernel code, > it s

A review of dm-writeboost

2013-10-05 Thread Mikulas Patocka
Hi I looked at dm-writeboost source code and here I'm sending the list of problems I found: Polling for state - Some of the kernel threads that you spawn poll for data in one-second interval - see migrate_proc, modulator_proc, recorder_proc, sync_proc. flush_proc correctly co

Re: A review of dm-writeboost

2013-10-07 Thread Mikulas Patocka
On Sun, 6 Oct 2013, Akira Hayakawa wrote: > Mikulas, > > Thank you for your reviewing. > > I will reply to polling issue first. > For the rest, I will reply later. > > > Polling for state > > - > > > > Some of the kernel threads that you spawn poll for data in one-second > >

Re: [REGRESSION][BISECTED] skge: add dma_mapping check

2013-09-24 Thread Mikulas Patocka
On Tue, 24 Sep 2013, Joseph Salisbury wrote: > On 09/19/2013 05:03 AM, Igor Gnatenko wrote: > > Please, send patch. > > > The patch is in mainline as of 3.12-rc2 as commit: > > Author: Mikulas Patocka > Date: Thu Sep 19 14:13:17 2013 -0400 > > skge: f

Re: [PATCH] percpu-rwsem: use barrier in unlock path

2012-10-17 Thread Mikulas Patocka
Hi On Wed, 17 Oct 2012, Lai Jiangshan wrote: > On 10/17/2012 10:23 AM, Linus Torvalds wrote: > > [ Architecture people, note the potential new SMP barrier! ] > > > > On Tue, Oct 16, 2012 at 4:30 PM, Mikulas Patocka > > wrote: > >> + /* > >>

Re: [PATCH] percpu-rwsem: use barrier in unlock path

2012-10-18 Thread Mikulas Patocka
On Thu, 18 Oct 2012, Lai Jiangshan wrote: > On 10/18/2012 04:28 AM, Steven Rostedt wrote: > > On Wed, Oct 17, 2012 at 11:07:21AM -0400, Mikulas Patocka wrote: > >>> > >>> Even the previous patch is applied, percpu_down_read() still > >>> needs mb(

Re: [PATCH] percpu-rwsem: use barrier in unlock path

2012-10-18 Thread Mikulas Patocka
On Wed, 17 Oct 2012, Steven Rostedt wrote: > On Wed, Oct 17, 2012 at 11:07:21AM -0400, Mikulas Patocka wrote: > > > > > > Even the previous patch is applied, percpu_down_read() still > > > needs mb() to pair with it. > > > > percpu_down_read uses

Re: [PATCH] percpu-rwsem: use barrier in unlock path

2012-10-18 Thread Mikulas Patocka
On Thu, 18 Oct 2012, Steven Rostedt wrote: > On Thu, 2012-10-18 at 10:18 +0800, Lai Jiangshan wrote: > > > > > > Looking at the patch, you are correct. The read side doesn't need the > > > memory barrier as the worse thing that will happen is that it sees the > > > locked = false, and will just

Re: [PATCH] percpu-rwsem: use barrier in unlock path

2012-10-18 Thread Mikulas Patocka
On Tue, 16 Oct 2012, Linus Torvalds wrote: > [ Architecture people, note the potential new SMP barrier! ] > > On Tue, Oct 16, 2012 at 4:30 PM, Mikulas Patocka wrote: > > + /* > > +* The lock is considered unlocked when p->locked is set to false. >

Re: [PATCH 1/2] brw_mutex: big read-write mutex

2012-10-18 Thread Mikulas Patocka
On Thu, 18 Oct 2012, Oleg Nesterov wrote: > Ooooh. And I just noticed include/linux/percpu-rwsem.h which does > something similar. Certainly it was not in my tree when I started > this patch... percpu_down_write() doesn't allow multiple writers, > but the main problem it uses msleep(1). It shoul

Re: [PATCH] percpu-rwsem: use barrier in unlock path

2012-10-18 Thread Mikulas Patocka
This patch looks sensible. I'd apply either this or my previous patch that adds synchronize_rcu() to percpu_up_write. This patch avoids the memory barrier on non-x86 cpus in percpu_up_read, so it is faster than the previous approach. Mikulas On Thu, 18 Oct 2012, Lai Jiangshan wrote: > -

Re: [PATCH 1/2] brw_mutex: big read-write mutex

2012-10-19 Thread Mikulas Patocka
On Fri, 19 Oct 2012, Peter Zijlstra wrote: > On Thu, 2012-10-18 at 15:28 -0400, Mikulas Patocka wrote: > > > > On Thu, 18 Oct 2012, Oleg Nesterov wrote: > > > > > Ooooh. And I just noticed include/linux/percpu-rwsem.h which does > > > something similar.

Re: [PATCH 1/2] brw_mutex: big read-write mutex

2012-10-19 Thread Mikulas Patocka
On Fri, 19 Oct 2012, Peter Zijlstra wrote: > > Yes, I tried this approach - it involves doing LOCK instruction on read > > lock, remembering the cpu and doing another LOCK instruction on read > > unlock (which will hopefully be on the same CPU, so no cacheline bouncing > > happens in the comm

Re: [PATCH 1/2] brw_mutex: big read-write mutex

2012-10-22 Thread Mikulas Patocka
On Fri, 19 Oct 2012, Oleg Nesterov wrote: > On 10/19, Mikulas Patocka wrote: > > > > synchronize_rcu() is way slower than msleep(1) - > > This depends, I guess. but this doesn't matter, > > > so I don't see a reason > > why should it be complicat

[PATCH 0/2] fix and improvements for percpu-rw-semaphores (was: brw_mutex: big read-write mutex)

2012-10-22 Thread Mikulas Patocka
> > Ooooh. And I just noticed include/linux/percpu-rwsem.h which does > > something similar. Certainly it was not in my tree when I started > > this patch... percpu_down_write() doesn't allow multiple writers, > > but the main problem it uses msleep(1). It should not, I think. > > > > But. It seem

[PATCH 2/2] percpu-rw-semaphores: use rcu_read_lock_sched

2012-10-22 Thread Mikulas Patocka
rcu_read_lock_sched / rcu_read_unlock_sched that translate to preempt_disable / preempt_enable. It is smaller (and supposedly faster) than preemptible rcu_read_lock / rcu_read_unlock. Signed-off-by: Mikulas Patocka --- include/linux/percpu-rwsem.h |8 1 file changed, 4 insertions(+), 4 deletions

[PATCH 1/2] percpu-rw-semaphores: use light/heavy barriers

2012-10-22 Thread Mikulas Patocka
() in percpu_up_read. This patch changes it to a compiler barrier and removes the "#if defined(X86) ..." condition. From: Lai Jiangshan Signed-off-by: Mikulas Patocka --- include/linux/percpu-rwsem.h | 20 +++- 1 file changed, 7 insertions(+), 13 deletions(-) In

Re: blk: bd_block_size_semaphore related lockdep warning

2012-12-06 Thread Mikulas Patocka
Hi It should be fixed in 3.7-rc8 Mikulas On Sat, 1 Dec 2012, Sasha Levin wrote: > Hi all, > > While fuzzing with trinity inside a KVM tools guest, running latest -next, > I've > stumbled on: > > [ 3130.099477] == > [ 3130.104862] [ INFO: p

[PATCH v3 1/1] percpu_rw_semaphore: reimplement to not block the readers unnecessarily

2012-11-07 Thread Mikulas Patocka
With this patch the code relies on the documented behaviour of synchronize_sched(), it doesn't try to pair synchronize_sched() with barrier. Signed-off-by: Oleg Nesterov Signed-off-by: Mikulas Patocka --- include/linux/percpu-rwsem.h | 80 ++- lib/Makefile

Re: [PATCH v3 1/1] percpu_rw_semaphore: reimplement to not block the readers unnecessarily

2012-11-07 Thread Mikulas Patocka
On Wed, 7 Nov 2012, Oleg Nesterov wrote: > On 11/07, Mikulas Patocka wrote: > > > > It looks sensible. > > > > Here I'm sending an improvement of the patch - I changed it so that there > > are not two-level nested functions for the fast path and

Re: [PATCH RESEND v2 1/1] percpu_rw_semaphore: reimplement to not block the readers unnecessarily

2012-11-08 Thread Mikulas Patocka
On Thu, 8 Nov 2012, Paul E. McKenney wrote: > On Thu, Nov 08, 2012 at 12:07:00PM -0800, Andrew Morton wrote: > > On Thu, 8 Nov 2012 14:48:49 +0100 > > Oleg Nesterov wrote: > > > > > Currently the writer does msleep() plus synchronize_sched() 3 times > > > to acquire/release the semaphore, and

Re: [PATCH RESEND v2 1/1] percpu_rw_semaphore: reimplement to not block the readers unnecessarily

2012-11-09 Thread Mikulas Patocka
On Thu, 8 Nov 2012, Andrew Morton wrote: > On Thu, 8 Nov 2012 14:48:49 +0100 > Oleg Nesterov wrote: > > > Currently the writer does msleep() plus synchronize_sched() 3 times > > to acquire/release the semaphore, and during this time the readers > > are blocked completely. Even if the "write" s

Re: [PATCH v2] make dm and dm-crypt forward cgroup context (was: dm-crypt parallelization patches)

2013-04-16 Thread Mikulas Patocka
On Tue, 16 Apr 2013, Tejun Heo wrote: > Hey, > > On Mon, Apr 15, 2013 at 09:02:06AM -0400, Mikulas Patocka wrote: > > The patch is not bug-prone, because we already must make sure that the > > cloned bio has shorter lifetime than the master bio - so the patch doesn

Re: [07/65] dm bufio: avoid a possible __vmalloc deadlock

2013-06-04 Thread Mikulas Patocka
On Mon, 3 Jun 2013, Steven Rostedt wrote: > 3.6.11.5 stable review patch. > If anyone has any objections, please let me know. > > -- > > From: Mikulas Patocka > > [ Upstream commit 502624bdad3dba45dfaacaf36b7d83e39e74b2d2 ] > > This patch uses

Re: [dm-devel] [PATCH v2] dm: verity: Add support for emitting uevents on dm-verity errors.

2013-06-26 Thread Mikulas Patocka
Hi I think the idea is fine. There is an architecture problem - target-specific routines are being pushed into generic dm core. I suggest that instead of dm_send_verity_uevent and dm_path_uevent you create just one generic function (for example dm_send_uevent) that takes variable argument l
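What a single generic function taking a variable argument list might look like can be sketched in userspace C. This is only an illustration of the suggestion, not a proposed kernel interface: `demo_format_uevent`, its NULL-terminated key/value convention, and the output format are all assumptions made for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats "KEY=VALUE" pairs into buf, one per line.
 * The variable arguments are (key, value) C strings terminated by a NULL key.
 * Returns the number of pairs written, or -1 if buf is too small (buf may
 * then hold a truncated result).
 */
static int demo_format_uevent(char *buf, size_t len, ...)
{
    va_list ap;
    size_t used = 0;
    int pairs = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    va_start(ap, len);
    for (;;) {
        const char *key = va_arg(ap, const char *);
        if (key == NULL)
            break;
        const char *value = va_arg(ap, const char *);
        int n = snprintf(buf + used, len - used, "%s=%s\n", key, value);
        if (n < 0 || (size_t)n >= len - used) {
            va_end(ap);
            return -1;
        }
        used += (size_t)n;
        pairs++;
    }
    va_end(ap);
    return pairs;
}
```

A caller would pass its target-specific keys itself, for example `demo_format_uevent(buf, sizeof(buf), "A", "1", "B", "22", (const char *)NULL)`, so the generic code needs no knowledge of any particular target. The explicit cast on the terminator matters in variadic calls.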

Re: [dm-devel] [PATCH] dm: Make MIN_IOS, et al, tunable via sysctl.

2013-08-20 Thread Mikulas Patocka
On Fri, 16 Aug 2013, Frank Mayhar wrote: > The device mapper and some of its modules allocate memory pools at > various points when setting up a device. In some cases, these pools are > fairly large, for example the multipath module allocates a 256-entry > pool and the dm itself allocates three

Re: [dm-devel] dm: Make MIN_IOS, et al, tunable via sysctl.

2013-08-20 Thread Mikulas Patocka
On Mon, 19 Aug 2013, Mike Snitzer wrote: > On Fri, Aug 16 2013 at 6:55pm -0400, > Frank Mayhar wrote: > > > The device mapper and some of its modules allocate memory pools at > > various points when setting up a device. In some cases, these pools are > > fairly large, for example the multipa

Re: [dm-devel] dm: Make MIN_IOS, et al, tunable via sysctl.

2013-08-20 Thread Mikulas Patocka
On Mon, 19 Aug 2013, Frank Mayhar wrote: > On Mon, 2013-08-19 at 10:00 -0400, Mike Snitzer wrote: > > Performance isn't the concern. The concern is: does DM allow for > > forward progress if the system's memory is completely exhausted? > > > > This is why request-based has such an extensive re

Re: [dm-devel] [PATCH] dm: Make MIN_IOS, et al, tunable via sysctl.

2013-08-20 Thread Mikulas Patocka
On Tue, 20 Aug 2013, Frank Mayhar wrote: > On Tue, 2013-08-20 at 17:22 -0400, Mikulas Patocka wrote: > > On Fri, 16 Aug 2013, Frank Mayhar wrote: > > > The device mapper and some of its modules allocate memory pools at > > > various points when setting up a device.

Re: dm: Make MIN_IOS, et al, tunable via sysctl.

2013-08-20 Thread Mikulas Patocka
On Tue, 20 Aug 2013, Mike Snitzer wrote: > Mikulas' point is that you cannot reduce the size to smaller than 1. > And aside from rq-based DM, 1 is sufficient to allow for forward > progress even when memory is completely consumed. > > A patch that simply changes them to 1 but makes the rq-based

Re: [dm-devel] [PATCH v2] dm ioctl: allow change device target type to error

2013-08-21 Thread Mikulas Patocka
On Wed, 21 Aug 2013, Joe Jin wrote: > commit a5664da "dm ioctl: make bio or request based device type immutable" > prevented "dmsetup wipe_table" change the target type to "error". That commit a5664da is there for a reason (it is not possible to change bio-based device to request-based and vic

Re: [PATCH] md: dm-verity: Fix to avoid a deadlock in dm-bufio

2013-03-04 Thread Mikulas Patocka
aiting to get the mutex held by the first thread. The fix allows only one I/O request from dm-verity to dm-bufio per thread. To do this, the prefetch requests were queued on worker threads. In addition to avoiding the deadlock, this fix made a slight improvement in performance. seconds_kernel_to_lo

dm-crypt parallelization patches

2013-04-09 Thread Mikulas Patocka
Hi, I placed the dm-crypt parallelization patches at: http://people.redhat.com/~mpatocka/patches/kernel/dm-crypt-paralelizace/current/ The patches parallelize dm-crypt and make it possible to use all processor cores. The patch dm-crypt-remove-percpu.patch removes some percpu variables and repla

Re: dm-crypt parallelization patches

2013-04-09 Thread Mikulas Patocka
On Tue, 9 Apr 2013, Tejun Heo wrote: > On Tue, Apr 09, 2013 at 01:51:43PM -0400, Mikulas Patocka wrote: > > The patch dm-crypt-sort-requests.patch sorts write requests submitted by a > > single thread. The requests are sorted according to the sector number, > > rb-tree

Re: [dm-devel] dm-crypt performance

2013-04-09 Thread Mikulas Patocka
On Tue, 26 Mar 2013, Milan Broz wrote: > - Are we sure we are not introducing some another side channel in disc > encryption? (Unprivileged user can measure timing here). > (Perhaps stupid reason but please do not prefer performance to security > in encryption. Enough we have timing attacks for A

Re: dm-crypt parallelization patches

2013-04-09 Thread Mikulas Patocka
On Tue, 9 Apr 2013, Tejun Heo wrote: > Hey, > > On Tue, Apr 09, 2013 at 02:08:06PM -0400, Mikulas Patocka wrote: > > > Hmmm? Why not just keep the issuing order along with plugging > > > boundaries? > > > > What do you mean? > > > > I used t

Re: dm-crypt parallelization patches

2013-04-09 Thread Mikulas Patocka
On Tue, 9 Apr 2013, Tejun Heo wrote: > On Tue, Apr 09, 2013 at 03:42:16PM -0400, Mikulas Patocka wrote: > > If I drop ifdefs, it doesn't compile (because other cgroup stuff is > > missing). > > > > So I enabled bio cgroups. > > > > bio_associate_c

Re: dm-crypt parallelization patches

2013-04-09 Thread Mikulas Patocka
On Tue, 9 Apr 2013, Vivek Goyal wrote: > On Tue, Apr 09, 2013 at 04:32:28PM -0400, Mikulas Patocka wrote: > > [..] > > Generally, we shouldn't associate bios with "current" task in device > > mapper targets. For example suppose that we have two stacked d

[PATCH] make dm and dm-crypt forward cgroup context (was: dm-crypt parallelization patches)

2013-04-10 Thread Mikulas Patocka
On Wed, 10 Apr 2013, Vivek Goyal wrote: > On Tue, Apr 09, 2013 at 05:18:25PM -0400, Mikulas Patocka wrote: > > [..] > > > bio_associate_current() return -EBUSY if bio has already been associated > > > with an io context. > > > > > > So in a stac

Re: [PATCH v2] make dm and dm-crypt forward cgroup context (was: dm-crypt parallelization patches)

2013-04-11 Thread Mikulas Patocka
On Wed, 10 Apr 2013, Tejun Heo wrote: > On Wed, Apr 10, 2013 at 07:42:59PM -0400, Mikulas Patocka wrote: > > /* > > + * bio_clone_context copies cgroup context from the original bio to the > > new bio. > > + * It is used by bio midlayer drivers that create new bi

Re: [PATCH v2] make dm and dm-crypt forward cgroup context (was: dm-crypt parallelization patches)

2013-04-11 Thread Mikulas Patocka
On Thu, 11 Apr 2013, Tejun Heo wrote: > On Thu, Apr 11, 2013 at 12:52:03PM -0700, Tejun Heo wrote: > > If this becomes an actual bottleneck, the right thing to do is making > > css ref per-cpu. Please stop messing around with refcounting. > > If you think this kind of hackery is acceptable, yo

Re: [PATCH v2] make dm and dm-crypt forward cgroup context (was: dm-crypt parallelization patches)

2013-04-12 Thread Mikulas Patocka
On Thu, 11 Apr 2013, Tejun Heo wrote: > On Thu, Apr 11, 2013 at 12:52:03PM -0700, Tejun Heo wrote: > > If this becomes an actual bottleneck, the right thing to do is making > > css ref per-cpu. Please stop messing around with refcounting. > > If you think this kind of hackery is acceptable, yo

Re: [PATCH v2] make dm and dm-crypt forward cgroup context (was: dm-crypt parallelization patches)

2013-04-12 Thread Mikulas Patocka
On Thu, 11 Apr 2013, Tejun Heo wrote: > On Thu, Apr 11, 2013 at 08:06:10PM -0400, Mikulas Patocka wrote: > > All that I can tell you is that adding an empty atomic operation > > "cmpxchg(&bio->bi_css->refcnt, bio->bi_css->refcnt, bio->bi_css->

Re: [PATCH v2] make dm and dm-crypt forward cgroup context (was: dm-crypt parallelization patches)

2013-04-15 Thread Mikulas Patocka
On Fri, 12 Apr 2013, Tejun Heo wrote: > On Fri, Apr 12, 2013 at 02:01:08PM -0400, Mikulas Patocka wrote: > > So if you think that reference counts should be incremented by every clone > > of the original bio, what kind of bug should it protect against? If we > > don'

[PATCH] Track block device users that created dirty pages

2013-03-29 Thread Mikulas Patocka
is flushed when the process closes the block device. Processes that didn't do any buffered writes to the device don't cause cache flush. It has these advantages: * processes that don't do buffered writes (such as "lvm") don't flush other process's data. * if the
