Re: [ceph-users] Slow responding OSDs are not OUTed and cause RBD client IO hangs

2015-08-24 Thread Nick Fisk
> >>>> The problem is caused by the RBD device not handling device aborts
> >>>> properly causing LIO and ESXi to enter a death spiral together.
> >>>>
> >>>> If something in the Ceph cluster causes an IO to take longer than
> >>>> 10

Re: [ceph-users] Slow responding OSDs are not OUTed and cause RBD client IO hangs

2015-08-24 Thread Alex Gorbachev
>>>> nger than 10 seconds (I think!!!) ESXi submits an iSCSI abort message. Once this happens,
>>>> as you have seen, it never recovers.
>>>>
>>>> Mike Christie from Red Hat is doing a lot of work on this currently, so
>>>> hopefully in the futur

Re: [ceph-users] Slow responding OSDs are not OUTed and cause RBD client IO hangs

2015-08-24 Thread Jan Schermer
>>> t RBD interface into LIO and it will all work much better.
>>>
>>> Either tgt or SCST seem to be pretty stable in testing.
>>>
>>> Nick
>>>
>>>> -Original Message-
>>>> From: ceph-users [mailto:ceph-users-boun.

Re: [ceph-users] Slow responding OSDs are not OUTed and cause RBD client IO hangs

2015-08-24 Thread Alex Gorbachev
>> T seem to be pretty stable in testing.
>>
>> Nick
>>
>>> -Original Message-
>>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alex Gorbachev
>>> Sent: 23 August 2015 02:17
>>> To: ceph-users
>>>

Re: [ceph-users] Slow responding OSDs are not OUTed and cause RBD client IO hangs

2015-08-24 Thread Jan Schermer
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alex Gorbachev
>> Sent: 23 August 2015 02:17
>> To: ceph-users
>> Subject: [ceph-users] Slow responding OSDs are not OUTed and cause RBD client

Re: [ceph-users] Slow responding OSDs are not OUTed and cause RBD client IO hangs

2015-08-23 Thread Nick Fisk
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alex Gorbachev
> Sent: 23 August 2015 02:17
> To: ceph-users
> Subject: [ceph-users] Slow responding OSDs are not OUTed and cause RBD client IO hangs
>
> Hello, th

[ceph-users] Slow responding OSDs are not OUTed and cause RBD client IO hangs

2015-08-22 Thread Alex Gorbachev
Hello, this is an issue we have been suffering from and researching along with a good number of other Ceph users, as evidenced by the recent posts. In our specific case, these issues manifest themselves in an RBD -> iSCSI LIO -> ESXi configuration, but the problem is more general. When there is an