[PATCH] bcache: add separate workqueue for journal_write to avoid deadlock

2018-09-27 Thread Stefan Priebe - Profihost AG
Hi Coly, is this the deadlock I reported some weeks ago? Greets, Stefan Excuse my typo sent from my mobile phone. Am 27.09.2018 um 17:53 schrieb Eddie Chapman <ed...@ehuk.net>: > On 27/09/18 16:23, Coly Li wrote: >> On 9/27/18 9:45 PM, guoju wrote: >>> After write SSD completed, bcache

Re: [PATCH] mm, thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings

2018-09-10 Thread Stefan Priebe - Profihost AG
Am 10.09.2018 um 22:08 schrieb David Rientjes: > On Fri, 7 Sep 2018, Michal Hocko wrote: > >> From: Michal Hocko >> >> Andrea has noticed [1] that a THP allocation might be really disruptive >> when allocated on NUMA system with the local node full or hard to >> reclaim. Stefan has posted an allo

Re: [PATCH] mm, thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings

2018-09-08 Thread Stefan Priebe - Profihost AG
MA placement. > > Be careful when the vma has an explicit numa binding though, because > __GFP_THISNODE is not playing well with it. We want to follow the > explicit numa policy rather than enforce a node which happens to be > local to the cpu we are running on. > > [1] http://

Re: kernel 4.8-rc5 kernel BUG at block/blk-core.c:2032!

2016-09-09 Thread Stefan Priebe - Profihost AG
Am 08.09.2016 um 19:33 schrieb Shaohua Li: > On Thu, Sep 08, 2016 at 10:16:59AM -0600, Jens Axboe wrote: >> On 09/08/2016 02:23 AM, Stefan Priebe - Profihost AG wrote: >>> Hi, >>> >>> while trying Kernel 4.8-rc5 my raid5 breaks every few minutes. >

kernel 4.8-rc5 kernel BUG at block/blk-core.c:2032!

2016-09-08 Thread Stefan Priebe - Profihost AG
Hi, while trying Kernel 4.8-rc5 my raid5 breaks every few minutes. Trace: [ cut here ] kernel BUG at block/blk-core.c:2032! invalid opcode: [#1] SMP Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 xt_multiport iptable_filter ip_tables x_tables 8021q garp bondi

Re: shrink_active_list/try_to_release_page bug? (was Re: xfs trace in 4.4.2 / also in 4.3.3 WARNING fs/xfs/xfs_aops.c:1232 xfs_vm_releasepage)

2016-06-02 Thread Stefan Priebe - Profihost AG
Am 31.05.2016 um 09:31 schrieb Dave Chinner: > On Tue, May 31, 2016 at 08:11:42AM +0200, Stefan Priebe - Profihost AG wrote: >>> I'm half tempted at this point to mostly ignore this mm/ behaviour >>> because we are moving down the path of removing buffer heads from >>

Re: shrink_active_list/try_to_release_page bug? (was Re: xfs trace in 4.4.2 / also in 4.3.3 WARNING fs/xfs/xfs_aops.c:1232 xfs_vm_releasepage)

2016-05-31 Thread Stefan Priebe - Profihost AG
Am 31.05.2016 um 09:31 schrieb Dave Chinner: > On Tue, May 31, 2016 at 08:11:42AM +0200, Stefan Priebe - Profihost AG wrote: >>> I'm half tempted at this point to mostly ignore this mm/ behaviour >>> because we are moving down the path of removing buffer heads from >>

Re: shrink_active_list/try_to_release_page bug? (was Re: xfs trace in 4.4.2 / also in 4.3.3 WARNING fs/xfs/xfs_aops.c:1232 xfs_vm_releasepage)

2016-05-30 Thread Stefan Priebe - Profihost AG
Hi Dave, Am 31.05.2016 um 08:07 schrieb Dave Chinner: > On Tue, May 31, 2016 at 12:59:04PM +0900, Minchan Kim wrote: >> On Tue, May 31, 2016 at 12:55:09PM +1000, Dave Chinner wrote: >>> On Tue, May 31, 2016 at 10:07:24AM +0900, Minchan Kim wrote: On Tue, May 31, 2016 at 08:36:57AM +1000, Dave

Re: divide error: 0000 [#1] SMP in task_numa_migrate - handle_mm_fault vanilla 4.4.6

2016-05-16 Thread Stefan Priebe - Profihost AG
Am 21.03.2016 um 14:38 schrieb Greg KH: > On Mon, Mar 21, 2016 at 11:52:23AM +0100, Stefan Priebe - Profihost AG wrote: >> >> Am 20.03.2016 um 22:41 schrieb Greg KH: >>> On Sun, Mar 20, 2016 at 10:27:23PM +0100, Stefan Priebe wrote: >>>> >>>>

Re: divide error: 0000 [#1] SMP in task_numa_migrate - handle_mm_fault vanilla 4.4.6

2016-03-21 Thread Stefan Priebe - Profihost AG
Am 20.03.2016 um 22:41 schrieb Greg KH: > On Sun, Mar 20, 2016 at 10:27:23PM +0100, Stefan Priebe wrote: >> >> Am 19.03.2016 um 23:26 schrieb Vlastimil Babka: >>> On 03/17/2016 07:45 PM, Greg KH wrote: >>>> On Thu, Mar 17, 2016 at 07:38:03PM +0100, Stefan Prieb

Re: divide error: 0000 [#1] SMP in task_numa_migrate - handle_mm_fault vanilla 4.4.6

2016-03-20 Thread Stefan Priebe
Am 19.03.2016 um 23:26 schrieb Vlastimil Babka: On 03/17/2016 07:45 PM, Greg KH wrote: On Thu, Mar 17, 2016 at 07:38:03PM +0100, Stefan Priebe wrote: Hi, while running qemu 2.5 on a host running 4.4.6 the host system has crashed (load > 200) 3 times in the last 3 days. Always with t

divide error: 0000 [#1] SMP in task_numa_migrate - handle_mm_fault vanilla 4.4.6

2016-03-19 Thread Stefan Priebe
Hi, while running qemu 2.5 on a host running 4.4.6 the host system has crashed (load > 200) 3 times in the last 3 days. Always with this stack trace: (copy left here: http://pastebin.com/raw/bCWTLKyt) [69068.874268] divide error: 0000 [#1] SMP [69068.875242] Modules linked in: ebtable_filte

Re: Asterisk deadlocks since Kernel 4.1

2015-12-06 Thread Stefan Priebe - Profihost AG
Hi Herbert, Am 07.12.2015 um 02:20 schrieb Herbert Xu: > On Sun, Dec 06, 2015 at 09:56:34PM +0100, Stefan Priebe wrote: >> Hi Herbert, >> >> i think i found the issue in 4.1 with netlink. Somebody made a >> mistake while backporting or cherry-picking your patch "

Re: Asterisk deadlocks since Kernel 4.1

2015-12-06 Thread Stefan Priebe
12PM +0100, Stefan Priebe wrote: * 9f87e0c - (2 months ago) netlink: Replace rhash_portid with bound - Herbert Xu * 35e9890 - (3 months ago) netlink: Fix autobind race condition that leads to zero port ID - Herbert Xu * 30c6472 - (7 months ago) netlink: Use random autobind rover - Herbert Xu These thr

Re: Asterisk deadlocks since Kernel 4.1

2015-12-05 Thread Stefan Priebe
Hello Philipp, Am 05.12.2015 um 15:19 schrieb Philipp Matthias Hahn: Hello Hannes, On Wed, Dec 02, 2015 at 12:40:32PM +0100, Hannes Frederic Sowa wrote: git bisect tells me it stopped working after those two commits were applied: commit d48623677191e0f035d7afd344f92cf880b01f8e Author: Herbert

Re: Asterisk deadlocks since Kernel 4.1

2015-12-04 Thread Stefan Priebe
) netlink: rename private flags and states - Nicolas Dichtel * 0356126 - (2 days ago) Revert "netlink: don't hold mutex in rcu callback when releasing mmapd ring" - Stefan Priebe * 231d0da - (2 days ago) Revert "netlink: make sure -EBUSY won't escape from netlink_insert

Re: Asterisk deadlocks since Kernel 4.1

2015-12-03 Thread Stefan Priebe - Profihost AG
> Am 02.12.2015 um 12:40 schrieb Hannes Frederic Sowa > : > > Hello Stefan, > > Stefan Priebe - Profihost AG writes: > > >> here are the results. >> >> It works with 4.1. >> It works with 4.2. >> It does not work with 4.1.13. >>

Re: Asterisk deadlocks since Kernel 4.1

2015-12-02 Thread Stefan Priebe - Profihost AG
:44, Stefan Priebe - Profihost AG wrote: >> Am 19.11.2015 um 20:51 schrieb Stefan Priebe: >>> >>> Am 19.11.2015 um 14:19 schrieb Florian Weimer: >>>> On 11/19/2015 01:46 PM, Stefan Priebe - Profihost AG wrote: >>>> >>>>> I can try Kerne

Re: Asterisk deadlocks since Kernel 4.1

2015-11-24 Thread Stefan Priebe - Profihost AG
Am 23.11.2015 um 13:57 schrieb Hannes Frederic Sowa: > On Mon, Nov 23, 2015, at 13:44, Stefan Priebe - Profihost AG wrote: >> Am 19.11.2015 um 20:51 schrieb Stefan Priebe: >>> >>> Am 19.11.2015 um 14:19 schrieb Florian Weimer: >>>> On 11/19/2015 01:46

Re: Asterisk deadlocks since Kernel 4.1

2015-11-23 Thread Stefan Priebe - Profihost AG
Am 19.11.2015 um 20:51 schrieb Stefan Priebe: > > Am 19.11.2015 um 14:19 schrieb Florian Weimer: >> On 11/19/2015 01:46 PM, Stefan Priebe - Profihost AG wrote: >> >>> I can try Kernel 4.4-rc1 next week. Or something else? >> >> I found this bug r

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Stefan Priebe
Am 19.11.2015 um 14:19 schrieb Florian Weimer: On 11/19/2015 01:46 PM, Stefan Priebe - Profihost AG wrote: I can try Kernel 4.4-rc1 next week. Or something else? I found this bug report which indicates that 4.1.10 works: <https://issues.asterisk.org/jira/browse/ASTERISK-25251>

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Stefan Priebe - Profihost AG
Am 19.11.2015 um 13:41 schrieb Hannes Frederic Sowa: > On Thu, Nov 19, 2015, at 12:43, Stefan Priebe - Profihost AG wrote: >> >> Am 19.11.2015 um 12:41 schrieb Hannes Frederic Sowa: >>> On Thu, Nov 19, 2015, at 10:56, Stefan Priebe - Profihost AG wrote: >>>>

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Stefan Priebe - Profihost AG
Am 19.11.2015 um 12:41 schrieb Hannes Frederic Sowa: > On Thu, Nov 19, 2015, at 10:56, Stefan Priebe - Profihost AG wrote: >> OK it had a livelock again. It just took more time. >> >> So here is the data: > > Thanks, I couldn't reproduce it so far with simple

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Stefan Priebe - Profihost AG
fff8800b16cf000 15 4410001 000 20 1542 8800b1168800 15 4294962900 000 20 7978 8800b7088800 15 0 0000 0 00 20 5 8800b71c9800 16 0 000 20 15 f

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Stefan Priebe - Profihost AG
Am 19.11.2015 um 10:44 schrieb Florian Weimer: > On 11/18/2015 10:36 PM, Stefan Priebe wrote: > >>> please try to get a backtrace with debugging information. It is likely >>> that this is the make_request/__check_pf functionality in glibc, but it >>> wo

Re: Asterisk deadlocks since Kernel 4.1

2015-11-19 Thread Stefan Priebe - Profihost AG
Am 18.11.2015 um 22:22 schrieb Hannes Frederic Sowa: > On Wed, Nov 18, 2015, at 22:20, Stefan Priebe wrote: >> you mean just: >> la /proc/$pid/fd > > ls -l /proc/pid/fd/ > > the numbers in brackets in return from readlink are the inode numbers. > >> and >

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Stefan Priebe
Am 18.11.2015 um 22:40 schrieb Hannes Frederic Sowa: On Wed, Nov 18, 2015, at 22:36, Stefan Priebe wrote: sorry here it is. What I'm wondering is why is there ipv6 stuff? I don't have ipv6 except for link local. Could it be this one? https://bugzilla.redhat.com/show_bug.cgi?id=

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Stefan Priebe
Am 18.11.2015 um 22:18 schrieb Florian Weimer: On 11/18/2015 09:23 PM, Stefan Priebe wrote: Am 17.11.2015 um 20:43 schrieb Thomas Gleixner: On Tue, 17 Nov 2015, Stefan Priebe wrote: I've now also two gdb backtraces from two crashes: http://pastebin.com/raw.php?i=yih5jNt8 http://pastebi

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Stefan Priebe
Am 18.11.2015 um 22:18 schrieb Florian Weimer: On 11/18/2015 09:23 PM, Stefan Priebe wrote: Am 17.11.2015 um 20:43 schrieb Thomas Gleixner: On Tue, 17 Nov 2015, Stefan Priebe wrote: I've now also two gdb backtraces from two crashes: http://pastebin.com/raw.php?i=yih5jNt8

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Stefan Priebe
Am 18.11.2015 um 22:00 schrieb Hannes Frederic Sowa: On Wed, Nov 18, 2015, at 21:23, Stefan Priebe wrote: Am 17.11.2015 um 20:43 schrieb Thomas Gleixner: On Tue, 17 Nov 2015, Stefan Priebe wrote: I've now also two gdb backtraces from two crashes: http://pastebin.com/raw.php?i=yih

Re: Asterisk deadlocks since Kernel 4.1

2015-11-18 Thread Stefan Priebe
Am 17.11.2015 um 20:43 schrieb Thomas Gleixner: On Tue, 17 Nov 2015, Stefan Priebe wrote: I've now also two gdb backtraces from two crashes: http://pastebin.com/raw.php?i=yih5jNt8 http://pastebin.com/raw.php?i=kGEcvH4T They don't tell me anything as I have no idea of the inner w

Re: Asterisk deadlocks since Kernel 4.1

2015-11-17 Thread Stefan Priebe
Am 17.11.2015 um 20:15 schrieb Thomas Gleixner: On Tue, 17 Nov 2015, Stefan Priebe - Profihost AG wrote: since Upgrading our Asterisk System from Kernel 3.18.17 to 4.1.13 it deadlocks every few hours (kill -9 is the only thing working). Booting with 3.18 again let it run smooth again. An

Asterisk deadlocks since Kernel 4.1

2015-11-17 Thread Stefan Priebe - Profihost AG
Hello, since Upgrading our Asterisk System from Kernel 3.18.17 to 4.1.13 it deadlocks every few hours (kill -9 is the only thing working). Booting with 3.18 again let it run smooth again. An strace shows asterisk is looping like this: [pid 6068] timerfd_gettime(8, , {it_interval={0, 2000},

Re: do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-23 Thread Stefan Priebe
Am 22.07.2015 um 09:23 schrieb Stefan Priebe - Profihost AG: Am 21.07.2015 um 23:15 schrieb Thomas Gleixner: On Tue, 21 Jul 2015, Stefan Priebe wrote: Am 20.07.2015 um 12:53 schrieb Thomas Gleixner: On Mon, 20 Jul 2015, Stefan Priebe - Profihost AG wrote: Hello list, i've 36 server

Re: do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-22 Thread Stefan Priebe - Profihost AG
Am 21.07.2015 um 23:15 schrieb Thomas Gleixner: > On Tue, 21 Jul 2015, Stefan Priebe wrote: >> Am 20.07.2015 um 12:53 schrieb Thomas Gleixner: >>> On Mon, 20 Jul 2015, Stefan Priebe - Profihost AG wrote: >>>> Hello list, >>>> >>>> i've

Re: do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-21 Thread Stefan Priebe
Am 20.07.2015 um 12:53 schrieb Thomas Gleixner: On Mon, 20 Jul 2015, Stefan Priebe - Profihost AG wrote: Hello list, i've 36 servers all running vanilla 3.18.18 kernel which have a very high disk and network load. Since a few days i encounter regular the following error messages and p

Re: do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-20 Thread Stefan Priebe - Profihost AG
Am 20.07.2015 um 12:53 schrieb Thomas Gleixner: > On Mon, 20 Jul 2015, Stefan Priebe - Profihost AG wrote: >> Hello list, >> >> i've 36 servers all running vanilla 3.18.18 kernel which have a very >> high disk and network load. >> >> Since a few

do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-20 Thread Stefan Priebe - Profihost AG
Hello list, i've 36 servers all running vanilla 3.18.18 kernel which have a very high disk and network load. Since a few days i encounter regular the following error messages and pretty often completely hanging disk i/o: [535040.439859] do_IRQ: 0.126 No irq handler for vector (irq -1) [548400.353

Re: [GIT PULL] bcache changes for 3.17

2014-09-05 Thread Stefan Priebe
Am 05.09.2014 20:33, schrieb Kent Overstreet: On Fri, Sep 05, 2014 at 11:10:13AM -0600, Jens Axboe wrote: On 09/05/2014 11:03 AM, Arne Wiebalck wrote: On Sep 5, 2014, at 6:41 PM, Peter Kieser wrote: On 2014-09-05 8:37 AM, Eddie Chapman wrote: On 05/09/14 15:17, Jens Axboe wrote: (from

Re: netconsole breaks netpoll on bridge

2014-06-16 Thread Stefan Priebe - Profihost AG
Am 16.06.2014 23:30, schrieb Francois Romieu: > Stefan Priebe - Profihost AG : > [...] >> That sounds great! Is there anything I can do or some code I can port to >> veth? > > You may add an empty handler for .ndo_poll_controller in drivers/net/veth.c > and give

Re: netconsole breaks netpoll on bridge

2014-06-16 Thread Stefan Priebe - Profihost AG
> Am 16.06.2014 um 21:12 schrieb Cong Wang: > > On Mon, Jun 16, 2014 at 12:05 PM, Stefan Priebe - Profihost AG > wrote: >> >>> Am 16.06.2014 um 20:51 schrieb Cong Wang: >>> >>> On Mon, Jun 16, 2014 at 11:41 AM, Stefan Priebe - Profihost AG

Re: netconsole breaks netpoll on bridge

2014-06-16 Thread Stefan Priebe - Profihost AG
> Am 16.06.2014 um 20:51 schrieb Cong Wang: > > On Mon, Jun 16, 2014 at 11:41 AM, Stefan Priebe - Profihost AG > wrote: >> >>> Am 16.06.2014 um 20:05 schrieb Cong Wang: >>> >>> On Mon, Jun 16, 2014 at 5:51 AM, Stefan Priebe - Profihost AG >>

Re: netconsole breaks netpoll on bridge

2014-06-16 Thread Stefan Priebe - Profihost AG
> Am 16.06.2014 um 20:05 schrieb Cong Wang : > > On Mon, Jun 16, 2014 at 5:51 AM, Stefan Priebe - Profihost AG > wrote: >> Hi, >> >> i'm using a vanilla 3.10.43 kernel and netconsole on top of a bridge. >> >> netconsole is used with vmbr0 (bridge

netconsole breaks netpoll on bridge

2014-06-16 Thread Stefan Priebe - Profihost AG
Hi, i'm using a vanilla 3.10.43 kernel and netconsole on top of a bridge. netconsole is used with vmbr0 (bridge) which is on top of bond0. If i want to add another bridge to vmbr0 it fails as long as netconsole is in use. # brctl addif vmbr0 fwpr2004p0 can't add fwpr2004p0 to bridge vmbr0: Unkn

Vanilla 3.10.22 kvm_arch_vcpu_uninit => WARNING: at kernel/jump_label.c:80 __static_key_slow_dec

2013-12-19 Thread Stefan Priebe
Hello list, while running Qemu 1.7.0 on vanilla kernel 3.10.22 i've seen several times this error: [99964.659578] WARNING: at kernel/jump_label.c:80 __static_key_slow_dec+0xb6/0xc0() [99964.659579] jump label: negative count! [99964.659579] Modules linked in: sch_htb act_police cls_u32 sch_ing

[PATCH] Commit 39c60a0948cc '[SCSI] sd: fix array cache flushing bug causing performance problems' added a possibility to temporary disable the write cache and skip flushing. But when setting temporar

2013-11-25 Thread Stefan Priebe
Signed-off-by: Stefan Priebe --- drivers/scsi/sd.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 734a29a..ccc6242 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -174,7 +174,7 @@ sd_store_cache_type(struct device

Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

2013-11-24 Thread Stefan Priebe
Hi Ric, Am 23.11.2013 20:35, schrieb Ric Wheeler: On 11/23/2013 01:27 PM, Stefan Priebe wrote: Hi Ric, Am 22.11.2013 21:37, schrieb Ric Wheeler: On 11/22/2013 03:01 PM, Stefan Priebe wrote: Hi Christoph, Am 21.11.2013 11:11, schrieb Christoph Hellwig: 2. Some drives may implement

Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

2013-11-23 Thread Stefan Priebe
Hi Ric, Am 23.11.2013 20:35, schrieb Ric Wheeler: On 11/23/2013 01:27 PM, Stefan Priebe wrote: Hi Ric, Am 22.11.2013 21:37, schrieb Ric Wheeler: On 11/22/2013 03:01 PM, Stefan Priebe wrote: Hi Christoph, Am 21.11.2013 11:11, schrieb Christoph Hellwig: 2. Some drives may implement

Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

2013-11-23 Thread Stefan Priebe
Hi Ric, Am 22.11.2013 21:37, schrieb Ric Wheeler: On 11/22/2013 03:01 PM, Stefan Priebe wrote: Hi Christoph, Am 21.11.2013 11:11, schrieb Christoph Hellwig: 2. Some drives may implement CMD_FLUSH to return immediately i.e. no guarantee the data is actually on disk. In which case they

Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

2013-11-22 Thread Stefan Priebe
Hi Ric, Am 22.11.2013 21:37, schrieb Ric Wheeler: On 11/22/2013 03:01 PM, Stefan Priebe wrote: Hi Christoph, Am 21.11.2013 11:11, schrieb Christoph Hellwig: 2. Some drives may implement CMD_FLUSH to return immediately i.e. no guarantee the data is actually on disk. In which case they

Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

2013-11-22 Thread Stefan Priebe
Hi Christoph, Am 21.11.2013 11:11, schrieb Christoph Hellwig: 2. Some drives may implement CMD_FLUSH to return immediately i.e. no guarantee the data is actually on disk. In which case they aren't spec compliant. While I've seen countless data integrity bugs on lower end ATA SSDs I've not se

Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

2013-11-22 Thread Stefan Priebe
Am 20.11.2013 16:55, schrieb J. Bruce Fields: On Wed, Nov 20, 2013 at 10:37:03AM -0500, Theodore Ts'o wrote: On Wed, Nov 20, 2013 at 08:52:36PM +0530, Chinmay V S wrote: If you have confirmed the performance numbers, then it indicates that the Intel 530 controller is more advanced and makes be

Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

2013-11-22 Thread Stefan Priebe
Am 20.11.2013 16:22, schrieb Chinmay V S: Hi Stefan, thanks for your great and detailed reply. I'm just wondering why an intel 520 ssd degrades the speed just by 2% in case of O_SYNC. intel 530 the newer model and replacement for the 520 degrades speed by 75% like the crucial m4. The Intel DC

Re: Why is O_DSYNC on linux so slow / what's wrong with my SSD?

2013-11-20 Thread Stefan Priebe - Profihost AG
Hi ChinmayVS, Am 20.11.2013 14:34, schrieb Chinmay V S: > Hi Stefan, > > Christoph is bang on right. To further elaborate upon this, here is > what is happening in the above case : > By using DIRECT, SYNC/DSYNC flags on a block device (i.e. bypassing > the file-systems layer), essentially you are

Why is O_DSYNC on linux so slow / what's wrong with my SSD?

2013-11-20 Thread Stefan Priebe - Profihost AG
Hello, while struggling with an application being so slow on my SSD and having high I/O waits while the app is using the raw block device i've detected that this is caused by opening the block device with O_DSYNC. I've used dd and fio with oflag=direct,dsync / --direct=1 and --sync=1 and got the

Re: [PATCH] bcache: Fix a shrinker deadlock

2013-09-03 Thread Stefan Priebe - Profihost AG
Thanks! No crashes since your fix. Stefan This mail was sent with my iPhone. Am 30.08.2013 um 23:15 schrieb Kent Overstreet : > GFP_NOIO means we could be getting called recursively - mca_alloc() -> > mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then. > Whoops. > > Signed-of

Re: [PATCH] bcache: Fix a shrinker deadlock

2013-08-31 Thread Stefan Priebe
thanks applied to my local kernel git Stefan Am 30.08.2013 23:15, schrieb Kent Overstreet: GFP_NOIO means we could be getting called recursively - mca_alloc() -> mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then. Whoops. Signed-off-by: Kent Overstreet --- On Thu, Aug 29, 2

Re: bcache: Fix a writeback performance regression

2013-08-29 Thread Stefan Priebe
sorry seems i see something similar: http://pastebin.com/raw.php?i=ZqgLf9gp Stefan Am 28.08.2013 22:15, schrieb Stefan Priebe: sorry but that's completely wrong. please use branch bcache-for-3.10 http://evilpiepirate.org/git/linux-bcache.git/log/?h=bcache-for-3.10 Stefan Am 28.08.20

Re: bcache: Fix a writeback performance regression

2013-08-28 Thread Stefan Priebe
sorry but that's completely wrong. please use branch bcache-for-3.10 http://evilpiepirate.org/git/linux-bcache.git/log/?h=bcache-for-3.10 Stefan Am 28.08.2013 22:12, schrieb kernel neophyte: On Wed, Aug 28, 2013 at 12:20 PM, Stefan Priebe wrote: Am 28.08.2013 20:47, schrieb kernel neo

Re: bcache: Fix a writeback performance regression

2013-08-28 Thread Stefan Priebe
Am 28.08.2013 20:47, schrieb kernel neophyte: On Wed, Aug 28, 2013 at 11:38 AM, Stefan Priebe - Profihost AG wrote: I haven't had one for a few days. Which kernel do you use? 3.10 kernel with all of kent's stable patches and perf patches.. Which exact 3.10 version? Which patc

Re: bcache: Fix a writeback performance regression

2013-08-28 Thread Stefan Priebe - Profihost AG
2522.956977] 882fa6aeb320 882f8ec94cb0 00020003 >>> 882f8ec94cb0 >>> [ 2522.956981] Call Trace: >>> [ 2522.956987] [] schedule+0x29/0x70 >>> [ 2522.956992] [] rwsem_down_read_failed+0x9d/0xe5 >>> [ 2522.956997] [] call_rwsem_down_r

Re: bcache: Fix a writeback performance regression

2013-08-26 Thread Stefan Priebe
[] __sync_filesystem+0x4a/0x50 2013-08-26 21:05:27 [] sync_filesystem+0x32/0x60 2013-08-26 21:05:27 [] SyS_syncfs+0x50/0x90 2013-08-26 21:05:27 [] system_call_fastpath+0x16/0x1b 2013-08-26 21:05:27 INFO: task ceph-osd:8798 blocked for more than 120 seconds. Stefan Am 22.08.2013 09:32, schri

Re: bcache: Fix a writeback performance regression

2013-08-22 Thread Stefan Priebe - Profihost AG
great! Everything seems to work fine now! Except read_dirty always going to negative values after a reboot. Stefan Am 22.08.2013 08:02, schrieb Kent Overstreet: > On Thu, Aug 22, 2013 at 07:59:04AM +0200, Stefan Priebe wrote: >> >>> schedule_timeout()

Re: bcache: Fix a writeback performance regression

2013-08-21 Thread Stefan Priebe
>schedule_timeout() is not the same as >schedule_timeout_interruptible(). just search and replace? So i can try on my own. Stefan Am 22.08.2013 07:43, schrieb Kent Overstreet: On Thu, Aug 22, 2013 at 07:27:12AM +0200, Stefan Priebe wrote: today i had this one: Heh, I finally trac

Re: bcache: Fix a writeback performance regression

2013-08-21 Thread Stefan Priebe
he] 2013-08-22 06:28:43 [] bch_insert_data_loop+0xf8/0x610 [bcache] 2013-08-22 06:28:43 [] ? bch_get_congested+0x25/0x70 [bcache] 2013-08-22 06:28:43 [] bch_insert_data+0x1d/0x20 [bcache] 2013-08-22 06:28:43 [] closure_queue+0x43/0x60 [bcache] 2013-08-22 06:28:43 [] request_wr

Re: bcache: Fix a writeback performance regression

2013-08-21 Thread Stefan Priebe
Am 22.08.2013 01:47, schrieb Kent Overstreet: On Tue, Aug 20, 2013 at 10:07:45AM +0200, Stefan Priebe - Profihost AG wrote: Am 20.08.2013 10:01, schrieb Stefan Priebe - Profihost AG: Am 20.08.2013 00:27, schrieb Kent Overstreet: On Mon, Aug 19, 2013 at 12:09:24AM +0200, Stefan Priebe wrote

Re: bcache: Fix a writeback performance regression

2013-08-20 Thread Stefan Priebe - Profihost AG
Am 20.08.2013 10:01, schrieb Stefan Priebe - Profihost AG: > Am 20.08.2013 00:27, schrieb Kent Overstreet: >> On Mon, Aug 19, 2013 at 12:09:24AM +0200, Stefan Priebe wrote: >>> >>> Vanilla 3.10.7 + bcache: Fix a writeback performance regression >>> >>

Re: bcache: Fix a writeback performance regression

2013-08-20 Thread Stefan Priebe - Profihost AG
Am 20.08.2013 00:27, schrieb Kent Overstreet: > On Mon, Aug 19, 2013 at 12:09:24AM +0200, Stefan Priebe wrote: >> >> Vanilla 3.10.7 + bcache: Fix a writeback performance regression >> >> http://pastebin.com/raw.php?i=LXZk4cMH > > Whoops, at first I thought this wa

Re: bcache: Fix a writeback performance regression

2013-08-18 Thread Stefan Priebe
Vanilla 3.10.7 + bcache: Fix a writeback performance regression http://pastebin.com/raw.php?i=LXZk4cMH Stefan Am 16.08.2013 12:11, schrieb Stefan Priebe - Profihost AG: Hi, bcache: Fix a writeback performance regression this one results in 3.10 into hung tasks in bcache_writeback

Re: [GIT PULL] bcache fixes for 3.11

2013-08-16 Thread Stefan Priebe - Profihost AG
Hi, bcache: Fix a writeback performance regression this one results in 3.10 into hung tasks in bcache_writeback read_dirty. Stefan Am 15.08.2013 08:43, schrieb Stefan Priebe - Profihost AG: > Am 15.08.2013 00:59, schrieb Kent Overstreet: >> Jens, here's the latest bcache fixe

Re: [GIT PULL] bcache fixes for 3.11

2013-08-14 Thread Stefan Priebe - Profihost AG
Am 15.08.2013 00:59, schrieb Kent Overstreet: > Jens, here's the latest bcache fixes. Some urgent stuff in here: > > > The following changes since commit 79826c35eb99cd3c0873b8396f45fa26c87fb0b0: > > bcache: Allocation kthread fixes (2013-07-12 00:22:49 -0700) > > are available in the git rep

kernel 3.8.9 call trace kvm Watchdog detected hard LOCKUP

2013-05-07 Thread Stefan Priebe - Profihost AG
Hello list, today i kvm host crashed running vanilla kernel 3.8.9. The call trace looks like this: Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 18 Pid: 29053, comm: kvm Tainted: G O 3.8.9+16-ph #1 Call Trace: [] panic+0xbf/0x1df [] ? native_sched_clock+0x13/0x80 [] watchdog_o

Re: Problem with GVRP on eth while having a bridge

2013-02-07 Thread Stefan Priebe - Profihost AG
then the bridge on top of the VLANs. Greets, Stefan Am 07.02.2013 12:22, schrieb Patrick McHardy: > On Thu, Feb 07, 2013 at 11:56:38AM +0100, Stefan Priebe - Profihost AG wrote: >> Hello list, >> >> this was tested using vanilla 3.7.6 kernel. >> >> When i add a vlan to

Re: KVM segfaults with 3.5 while installing ubuntu 12.04

2012-08-08 Thread Stefan Priebe
ah OK - thanks. Will there be a fixed 1.1.2 as well? Stefan Am 08.08.2012 10:06, schrieb Stefan Hajnoczi: On Wed, Aug 08, 2012 at 07:51:07AM +0200, Stefan Priebe wrote: Any news? Was this applied upstream? Kevin is ill. He has asked me to review and test patches in his absence. When he

Re: KVM segfaults with 3.5 while installing ubuntu 12.04

2012-08-07 Thread Stefan Priebe
Any news? Was this applied upstream? Am 06.08.2012 14:37, schrieb Avi Kivity: On 08/06/2012 03:12 PM, Avi Kivity wrote: On 08/06/2012 11:46 AM, Stefan Priebe - Profihost AG wrote: But still i got the segfault and core dump - this is my main problem? I mean qemu-kvm master isn't declar

Re: KVM segfaults with 3.5 while installing ubuntu 12.04

2012-08-06 Thread Stefan Priebe - Profihost AG
can confirm - this fixed it! Am 06.08.2012 14:37, schrieb Avi Kivity: On 08/06/2012 03:12 PM, Avi Kivity wrote: On 08/06/2012 11:46 AM, Stefan Priebe - Profihost AG wrote: But still i got the segfault and core dump - this is my main problem? I mean qemu-kvm master isn't declared as stabl

Re: KVM segfaults with 3.5 while installing ubuntu 12.04

2012-08-06 Thread Stefan Priebe - Profihost AG
>Am 06.08.2012 10:36, schrieb Avi Kivity: On 08/05/2012 10:00 PM, Stefan Priebe wrote: So here are 3 backtraces from booting the rescue system: http://pastebin.com/raw.php?i=xCy2pEcP To me they all look the same. They are. What version of qemu are you using? latest stable-1.1 bra

Re: KVM segfaults with 3.5 while installing ubuntu 12.04

2012-08-05 Thread Stefan Priebe
Am 05.08.2012 17:52, schrieb Stefan Priebe: Am 05.08.2012 12:29, schrieb Avi Kivity: On 08/05/2012 01:08 PM, Stefan Priebe wrote: Am 01.08.2012 11:53, schrieb Avi Kivity: On 08/01/2012 12:42 PM, Stefan Priebe - Profihost AG wrote: Am 01.08.2012 11:33, schrieb Avi Kivity: So here are 3

Re: KVM segfaults with 3.5 while installing ubuntu 12.04

2012-08-05 Thread Stefan Priebe
Am 05.08.2012 12:29, schrieb Avi Kivity: On 08/05/2012 01:08 PM, Stefan Priebe wrote: Am 01.08.2012 11:53, schrieb Avi Kivity: On 08/01/2012 12:42 PM, Stefan Priebe - Profihost AG wrote: Am 01.08.2012 11:33, schrieb Avi Kivity: So here are 3 backtraces from booting the rescue system: http

Re: KVM segfaults with 3.5 while installing ubuntu 12.04

2012-08-05 Thread Stefan Priebe
Am 01.08.2012 11:53, schrieb Avi Kivity: On 08/01/2012 12:42 PM, Stefan Priebe - Profihost AG wrote: Am 01.08.2012 11:33, schrieb Avi Kivity: So here are 3 backtraces from booting the rescue system: http://pastebin.com/raw.php?i=xCy2pEcP To me they all look the same. They are. What version

Re: KVM segfaults with 3.5 while installing ubuntu 12.04

2012-08-01 Thread Stefan Priebe - Profihost AG
Am 01.08.2012 11:53, schrieb Avi Kivity: On 08/01/2012 12:42 PM, Stefan Priebe - Profihost AG wrote: Am 01.08.2012 11:33, schrieb Avi Kivity: So here are 3 backtraces from booting the rescue system: http://pastebin.com/raw.php?i=xCy2pEcP To me they all look the same. They are. What version

Re: KVM segfaults with 3.5 while installing ubuntu 12.04

2012-08-01 Thread Stefan Priebe - Profihost AG
Am 01.08.2012 11:33, schrieb Avi Kivity: So here are 3 backtraces from booting the rescue system: http://pastebin.com/raw.php?i=xCy2pEcP To me they all look the same. They are. What version of qemu are you using? latest stable-1.1 branch (1.1.1) - which works fine with latest RHEL6 kernel.

Re: KVM segfaults with 3.5 while installing ubuntu 12.04

2012-08-01 Thread Stefan Priebe - Profihost AG
, Stefan Priebe wrote: Now i got it working - sorry used old gdb. This is the backtrace: Core was generated by `/usr/bin/qemu-system-x86_64 -id 103 -chardev socket,id=qmp,path=/var/run/qemu-s'. Program terminated with signal 11, Segmentation fault. #0 0x7f6ca10faed8 in ?? () from /lib/libc

Re: KVM segfaults with 3.5 while installing ubuntu 12.04

2012-07-31 Thread Stefan Priebe
Am 31.07.2012 16:54, schrieb Avi Kivity: On 07/31/2012 02:59 PM, Stefan Priebe - Profihost AG wrote: Hello list, i hope it is correct to list the maintainers of kvm. While trying to install ubuntu 12.04 amd64 on a kvm based vm the KVM process segfaults while ubuntu tries to detect the HW: kvm

Re: KVM segfaults with 3.5 while installing ubuntu 12.04

2012-07-31 Thread Stefan Priebe
Now i got it working - sorry used old gdb. This is the backtrace: Core was generated by `/usr/bin/qemu-system-x86_64 -id 103 -chardev socket,id=qmp,path=/var/run/qemu-s'. Program terminated with signal 11, Segmentation fault. #0 0x7f6ca10faed8 in ?? () from /lib/libc.so.6 (gdb) where #0

KVM segfaults with 3.5 while installing ubuntu 12.04

2012-07-31 Thread Stefan Priebe - Profihost AG
Hello list, i hope it is correct to list the maintainers of kvm. While trying to install ubuntu 12.04 amd64 on a kvm based vm the KVM process segfaults while ubuntu tries to detect the HW: kvm[2978]: segfault at 7fb90d9035e0 ip 7fb90d9035e0 sp7fff652e4ed8 error 15 This does not happe

Re: getting uninterruptible sleep processes after upgrade from 2.6.20.20 to 2.6.24.2

2008-02-21 Thread Stefan Priebe - allied internet ag
the moment. Stefan Jiri Slaby wrote: Stefan Priebe - allied internet ag wrote: Hello! I've done the (echo t > /proc/sysrq-trigger) now but i'm not able to get the whole output via dmesg. Here is what i get: # dmesg 3.432124] [] do_select+0x390/0x46e [272363.432226]

Re: getting uninterruptible sleep processes after upgrade from 2.6.20.20 to 2.6.24.2

2008-02-21 Thread Stefan Priebe - allied internet ag
Hello! I've done the (echo t > /proc/sysrq-trigger) now but i'm not able to get the whole output via dmesg. Here is what i get: # dmesg 3.432124] [] do_select+0x390/0x46e [272363.432226] [] __pollwait+0x0/0xcf [272363.432319] [] default_wake_function+0x0/0x8 [272363.432416] [] default_wake

2.6.24.2 many WARNING: at net/ipv4/tcp_input.c / tcp_output.c

2008-02-21 Thread Stefan Priebe - allied internet ag
307812] [] __do_softirq+0x72/0xdf [267457.307874] [] do_softirq+0x37/0x39 [267457.307932] [] smp_apic_timer_interrupt+0x5a/0x85 [267457.308015] [] apic_timer_interrupt+0x28/0x30 [267457.308083] [] qword_get+0x14d/0x27c [267457.308161] === -- Regards, Stefan Priebe -- To unsubscrib

Re: getting uninterruptible sleep processes after upgrade from 2.6.20.20 to 2.6.24.2

2008-02-17 Thread allied internet ag- Stefan Priebe
>> One week ago we upgraded about 300 servers from 2.6.20.20 to 2.6.24.2. > > Painfull. OK not really - i've tested the new kernel on all models. And it works fine... the DN state only comes sometimes - absolutely not reproducible. So my tests went OK and then we've done the update. I've now se

getting uninterruptible sleep processes after upgrade from 2.6.20.20 to 2.6.24.2

2008-02-17 Thread allied internet ag- Stefan Priebe
Hello! One week ago we upgraded about 300 servers from 2.6.20.20 to 2.6.24.2. Now we sometimes get dozens of processes in state DN. The system is completely idle - but the load is 20 or 90 or whatever. And it can only be "repaired" by rebooting the system. I'm NOT on the LIST so please CC me

Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-12 Thread Stefan Priebe - allied internet ag
Hello! I've tested this patch now - and it works fine. Now rmmod, halt and reboot also work. Stefan Priebe Boaz Harrosh wrote: gdth _exit would first remove all cards then stop the timer and would not sync with the timer function. This caused a crash in gdth_timer() when modul

Re: Kernel 2.6.20 does not work anymore with SCSI or SATA on old Opteron / Xeon servers

2007-03-20 Thread Stefan Priebe
Hello! With the sysrq i've found the function which is the problem: inode.c => nfs_getattr => nfs_sync_mapping_range I've also found the attached patch - which is not included in any stable release nor in 2.6.21.X but has been public since 20.02.07 I think this is very important

Re: Kernel 2.6.20 does not work anymore with SCSI or SATA on old Opteron / Xeon servers

2007-03-20 Thread Stefan Priebe
0046 c03be392 c317bfc4 0046 0086 c313fee8 0002 c312f560 kthread+0x72/0x96 002e schedule_timeout+0x70/0x8d 0082 prep_new_page+0xb2/0xea [] inet_csk_accept+0x51/0x125 Stefan Olaf Kirch wrote: > On Tuesday 20 March 2007 11:59, Stefan Priebe wrote: >> Kernel command

Re: Kernel 2.6.20 does not work anymore with SCSI or SATA on old Opteron / Xeon servers

2007-03-20 Thread Stefan Priebe
Kirch wrote: On Tuesday 20 March 2007 11:33, Stefan Priebe wrote: 1.) I've booted these systems through NFS and would like to access /dev/sda or /dev/sdb then. For example via fdisk and this does not work. What do you mean by "booted through NFS"? Do you mean the machine run

Re: Kernel 2.6.20 does not work anymore with SCSI or SATA on old Opteron / Xeon servers

2007-03-20 Thread Stefan Priebe
Hello! It runs with nfsroot # mount 192.168.0.100:/PXE/debian on / type nfs (rw) Kernel command line: nfs root=/dev/nfs nfsroot=192.168.0.100:/PXE/debian ip=dhcp Stefan Olaf Kirch wrote: On Tuesday 20 March 2007 11:33, Stefan Priebe wrote: 1.) I've booted these systems through NF

Re: Kernel 2.6.20 does not work anymore with SCSI or SATA on old Opteron / Xeon servers

2007-03-20 Thread Stefan Priebe
i also can fdisk /dev/sdb or so. It only does not work if the system itself is booted via NFS... Stefan Andrew Morton wrote: On Sun, 18 Mar 2007 21:50:46 +0100 Stefan Priebe <[EMAIL PROTECTED]> wrote: Hello! We've a very strange Problem with Kernel 2.6.20.x If i try to access a SCSI

Kernel 2.6.20 does not work anymore with SCSI or SATA on old Opteron / Xeon servers

2007-03-18 Thread Stefan Priebe
Hello! We've a very strange Problem with Kernel 2.6.20.x If i try to access a SCSI or SATA Disk (tested with Adaptec U320 ASC-29320, ICP Vortex 9024, Promise TX300) the whole server hangs - no output - no error on the screen - but it hangs completely. But it does not happen on all our systems

Re: SATA ahci Bug in 2.6.19.x

2007-01-30 Thread Stefan Priebe - FH
Hi! Any News? Stefan Stefan Priebe - FH wrote: Hi! acpi=off does not help i've already tried that. Ok here some outputs: 1.) complete dmesg with 2.6.16.27 (works) Linux version 2.6.16.27amd ([EMAIL PROTECTED]) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #6 SMP Sat Aug 26 14:29:07 CEST

Re: XFS or Kernel Problem / Bug

2007-01-30 Thread Stefan Priebe - FH
Hi! Any News? Stefan Stefan Priebe - FH wrote: Hi! OK - i rechecked everything. We've 22 Servers with the DFI PM-12 Mainboard with VIA Chipset. But only the 5 oldest of them (before 2004 / 01 / 20) (we bought them all within a range of 10 months) have this problem. So i think it is
