[ceph-users] osd node heartbeat NIC broken and kick out

2014-07-19 Thread Haomai Wang
Hi all, our production Ceph nodes each have two NICs: one used for heartbeat, the other for the cluster_network. By accident the heartbeat NIC broke while the cluster_network NIC stayed healthy, yet the OSDs reported the node with the broken NIC as unavailable, so the monitor decided to kick the node out. I'm not sure what
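A minimal ceph.conf sketch of such a two-NIC split, with illustrative subnets assumed here (they are not taken from the post):

    [global]
    # client and monitor traffic on the first NIC (assumed subnet)
    public network = 192.168.0.0/24
    # replication and recovery traffic on the second NIC (assumed subnet)
    cluster network = 10.0.0.0/24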

[ceph-users] Adding OSD with journal on another disk

2014-07-19 Thread Simon Ironside
Hi there, OS: RHEL 7.0 x86_64, ceph: 0.80.4, ceph-deploy: 1.5.9 (disks have been zapped first). I think there might be a typo in the documentation here: http://ceph.com/docs/master/rados/deployment/ceph-deploy-osd/#prepare-osds If I follow it by doing this: ceph-deploy osd prepare ceph-osd1:sdc:/dev
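The documented form of that command is ceph-deploy osd prepare HOST:DISK[:JOURNAL]. A sketch with hypothetical device names, putting the journal on a separate disk:

    # data on sdc, journal on a separate device (hypothetical names)
    ceph-deploy osd prepare ceph-osd1:sdc:/dev/sdd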

Re: [ceph-users] osd node heartbeat NIC broken and kick out

2014-07-19 Thread Gregory Farnum
The heartbeat code is very careful to use the same physical interfaces as 1) the cluster network and 2) the public network. If the first breaks, the OSD can't talk with its peers; if the second breaks, it can't talk with the monitors or clients. Either way, the OSD can't do its job, so it gets marked down
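The result is visible from the monitor side with the standard status commands; a quick way to confirm an OSD was marked down:

    # show each OSD's up/down and in/out state
    ceph osd tree
    # watch cluster events as the OSD is marked down and then out
    ceph -w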

Re: [ceph-users] osd node heartbeat NIC broken and kick out

2014-07-19 Thread Wang Haomai
Oh, it's our fault. public_addr and cluster_addr use the same NIC (eth1), but we found that during recovery the heartbeat may time out because of busy traffic. I *misunderstood* the meaning of heartbeat and used another NIC's address (eth0) for heartbeat to avoid the timeout. From your points, it's easy to understand
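The setup described would look roughly like the sketch below; the addresses are illustrative, and osd heartbeat addr is the era-appropriate option name assumed here:

    [osd]
    # public and cluster traffic both on eth1 (assumed address)
    public addr = 10.0.1.5
    cluster addr = 10.0.1.5
    # heartbeat pinned to eth0 -- the misunderstanding described above
    osd heartbeat addr = 192.168.0.5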

Re: [ceph-users] osd node heartbeat NIC broken and kick out

2014-07-19 Thread Gregory Farnum
On Sat, Jul 19, 2014 at 11:08 AM, Wang Haomai wrote: > Oh, it's our fault. > > public_addr and cluster_addr use the same NIC (eth1), but we found that during recovery the heartbeat may time out because of busy traffic. I *misunderstood* the meaning of heartbeat and used another NIC's address (eth0) for heartbeat

Re: [ceph-users] Placing different pools on different OSDs in the same physical servers

2014-07-19 Thread Dmitry Smirnov
On Tue, 15 Jul 2014 14:58:28 Marc wrote:
> [global]
> osd crush update on start = false

IMHO setting the OSD's CRUSH map host may be easier from the OSD configuration section, as follows:

[osd.5]
host = realhost
osd crush location = host=crushost

-- Best wishes, Dmitry
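The same placement can also be applied at runtime rather than via ceph.conf; a sketch using the CRUSH CLI, with an assumed weight and root:

    # move osd.5 under the desired host bucket (weight and root assumed)
    ceph osd crush create-or-move osd.5 1.0 root=default host=crushost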