Hi,

Well, I've found out that a clean stop of an OSD (/etc/init.d/ceph stop osd) causes no IO downtime. But that can hardly be called fault tolerance, which is what Ceph is supposed to provide. An unclean "killall -9 ceph-osd", however, causes IO to stop for about 20 seconds.
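To be concrete, here is roughly what I run on one of the nodes while writes are going to the rbd image (osd.3 is just an example id, and the watch loop is only there to time how long the cluster takes to notice the failure):

    # clean stop: the OSD notifies the monitors it is going down, no IO freeze
    /etc/init.d/ceph stop osd.3

    # unclean kill: the cluster has to wait out the heartbeat grace before
    # marking the OSD down, and writes freeze for ~20 seconds
    killall -9 ceph-osd

    # time how long it takes until the OSD shows up as "down"
    watch -n 1 'ceph osd tree | grep -w down'
    # and watch cluster events in parallel
    ceph -w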
I've tried lowering some timeouts, but without luck. Here is the related part of my ceph.conf after lowering the timeout values:

[global]
heartbeat interval = 5
mon osd down out interval = 90
mon pg warn max per osd = 2000
mon osd adjust heartbeat grace = false

[client]
rbd cache = false

[mon]
mon clock drift allowed = .200
mon osd min down reports = 1

[osd]
osd heartbeat interval = 3
osd heartbeat grace = 5

Can you help me reduce the IO downtime somehow? Twenty seconds for production is just horrible. (How I'm checking whether these values actually take effect at runtime is appended at the bottom, below the quoted thread.)

Regards,
Vasily

On Wed, May 13, 2015 at 9:57 AM, Vasiliy Angapov <anga...@gmail.com> wrote:
> Thanks, Gregory!
>
> My Ceph version is 0.94.1. What I'm trying to test is the worst-case situation,
> when a node loses its network or becomes unresponsive. So what I do is
> "killall -9 ceph-osd", then reboot.
>
> Well, I also tried a clean reboot several times (just the "reboot" command),
> but I saw no difference - there is always an IO freeze of about 30 seconds.
> Btw, I'm using Fedora 20 on all nodes.
>
> Ok, I will play with the timeouts more.
>
> Thanks again!
>
> On Wed, May 13, 2015 at 10:46 AM, Gregory Farnum <g...@gregs42.com> wrote:
>
>> On Tue, May 12, 2015 at 11:39 PM, Vasiliy Angapov <anga...@gmail.com> wrote:
>> > Hi, colleagues!
>> >
>> > I'm testing a simple Ceph cluster in order to use it in a production
>> > environment. I have 8 OSDs (1TB SATA drives) which are evenly distributed
>> > between 4 nodes.
>> >
>> > I've mapped an rbd image on the client node and started writing a lot of
>> > data to it. Then I just reboot one node and see what happens. What happens
>> > is very sad. I get a write freeze for about 20-30 seconds, which is enough
>> > for the ext4 filesystem to switch to read-only.
>> >
>> > I wonder, is there any way to minimize this lag? AFAIK, ext filesystems
>> > have a 5-second timeout before switching to read-only. So is there any way
>> > to get that lag below 5 seconds? I've tried lowering different osd
>> > timeouts, but it doesn't seem to help.
>> >
>> > How do you deal with such situations? 20 seconds of downtime is not
>> > tolerable in production.
>>
>> What version of Ceph are you running, and how are you rebooting it?
>> Any newish version that gets a clean reboot will notify the cluster
>> that it's shutting down, so you shouldn't see blocked writes at all.
>>
>> If you're doing a reboot that involves just ending the daemon, you
>> will have to wait through the timeout period before the OSD gets
>> marked down, which defaults to 30 seconds. This is adjustable (look
>> for docs on the "osd heartbeat grace" config option), although if you
>> set it too low you'll need to change a bunch of other timeouts which I
>> don't know off-hand...
>> -Greg
>>
>
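P.S. For the record, here is roughly how I'm checking whether the lowered values actually took effect, and how I'm pushing them to the running daemons without a restart. osd.0 is just an example id (the first command has to be run on the node hosting that OSD), and I'm not sure whether the grace also needs to be visible to the monitors, since they are the ones that actually mark the OSD down, so the last line is a guess on my part:

    # show the values the running OSD is actually using (run on osd.0's node)
    ceph daemon osd.0 config show | grep -E 'heartbeat_grace|heartbeat_interval'

    # push the lowered values to all running OSDs without restarting them
    ceph tell osd.* injectargs '--osd_heartbeat_grace 5 --osd_heartbeat_interval 3'

    # guess: the monitors may also need the lowered grace, since they decide
    # when to mark an OSD down based on the failure reports they receive
    ceph tell mon.* injectargs '--osd_heartbeat_grace 5'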