After looking a little closer, now that I have a better understanding of 
osd_heartbeat_grace for the monitor, all of the OSD failures are coming from 1 
node in the cluster. Yes, your hunch was correct: that node had stale rules in 
iptables. After disabling iptables the OSD "flapping" has stopped.
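
For the record, this is roughly how I spotted and cleared it on that node (I'm 
sketching CentOS-style service commands here; adjust for your distro):

# list the active rules - the stale REJECT/DROP entries showed up here
iptables -L -n -v
# disable the firewall until we put correct rules in place
service iptables stop
chkconfig iptables off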

Now I'm going to bring the osd_heartbeat_grace value back down incrementally 
and see if the cluster runs without reported failures at the default value.
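
For anyone following along, the change I'm iterating on is just this in 
ceph.conf on all nodes (25 is an example value), plus an injectargs call so the 
running OSDs pick it up without a restart:

[global]
osd_heartbeat_grace = 25

ceph tell osd.* injectargs '--osd_heartbeat_grace 25'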

Thank you very much for your help.

I have some default pool questions concerning cluster bring-up:
I have 90 OSDs (one 4TB HDD per OSD, each with a 96GB journal partition on an 
SSD RAID0), 30 OSDs per storage node.
I have the default placement group (PG) settings in the [global] section of 
ceph.conf:
osd_pool_default_pg_num = 4096
osd_pool_default_pgp_num = 4096
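
For what it's worth, 4096 came from the rule of thumb in the Ceph docs - 
roughly 100 PGs per OSD, divided by the replica count (I'm assuming 3x 
replication here), rounded up to the next power of two:

(90 OSDs * 100) / 3 replicas = 3000 -> round up to 4096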

When I bring up a cluster I'm running with just the default pools 0-data, 
1-metadata, and 2-rbd, and getting error messages about too few PGs per OSD. 
Since each OSD wants between 20 and 32 PGs, as soon as I've brought up the 
first storage node I need a minimum of 600 PGs (30 OSDs x 20), but the system 
comes up with the default of 64 PGs per pool. After creating each node's OSDs 
I increased the pool sizes with ceph osd pool set <pool> pg_num and pgp_num 
for each of the default pools, as shown below. Do I need to increase all 3 
pools? Is there a ceph.conf setting that handles this startup issue?
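
For reference, the per-pool bump I've been running after each node looks like 
this (pool names per the defaults above; pg_num has to be raised before 
pgp_num, since pgp_num can't exceed pg_num):

ceph osd pool set data pg_num 4096
ceph osd pool set data pgp_num 4096
ceph osd pool set metadata pg_num 4096
ceph osd pool set metadata pgp_num 4096
ceph osd pool set rbd pg_num 4096
ceph osd pool set rbd pgp_num 4096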

- What's the "best practices" way to handle bringing up more OSDs than the 
default pool PG settings can handle?



-----Original Message-----
From: Gregory Farnum [mailto:g...@inktank.com] 
Sent: Monday, August 25, 2014 11:01 AM
To: Bruce McFarland
Cc: ceph-us...@ceph.com
Subject: Re: [ceph-users] osd_heartbeat_grace set to 30 but osd's still fail 
for grace > 20

On Mon, Aug 25, 2014 at 10:56 AM, Bruce McFarland 
<bruce.mcfarl...@taec.toshiba.com> wrote:
> Thank you very much for the help.
>
> I'm moving osd_heartbeat_grace to the global section and trying to figure out 
> what's going on between the OSDs. Since increasing osd_heartbeat_grace in the 
> [mon] section of ceph.conf on the monitor I still see failures, but now they 
> are 2 seconds > osd_heartbeat_grace. It seems that no matter how much I 
> increase this value, OSDs are reported as failing just outside of it.
>
> I've looked at netstat -s on all of the nodes and will go back and look at 
> the network stats much more closely.
>
> Would it help to put the monitor on a 10G link to the storage nodes? 
> Everything is set up, but we chose to leave the monitor on a 1G link to the 
> storage nodes.

No. They're being marked down because they aren't heartbeating with their peer 
OSDs, and those peers are reporting the failures to the monitor (whose 
connection is apparently working fine). The most likely guess without more data 
is that you've got firewall rules set up blocking the ports the OSDs are using 
to send their heartbeats...but it could be many things in your network stack or 
your CPU scheduler or whatever.
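
If you'd rather rule the firewall out explicitly than disable it wholesale, 
open the monitor port and the OSD range on each storage node - something like 
the following (the 6800:7300 OSD range matches current releases; check the 
docs for your version):

# Ceph monitor
iptables -A INPUT -p tcp --dport 6789 -j ACCEPT
# Ceph OSDs (messenger and heartbeat ports)
iptables -A INPUT -p tcp --dport 6800:7300 -j ACCEPT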
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
