Okay, last one until I get some guidance.  Sorry for the spam, but I wanted
to paint a full picture.  Here are debug logs from all three mons,
capturing what looks to me like an election sequence:

ceph0:
2014-03-25 16:17:24.324846 7fa5c53fc700  5 mon.ceph0@0(electing).elector(35)
start -- can i be leader?
2014-03-25 16:17:24.324900 7fa5c53fc700  1 mon.ceph0@0(electing).elector(35)
init, last seen epoch 35
2014-03-25 16:17:24.324913 7fa5c53fc700  1 -- 10.10.30.0:6789/0 --> mon.1
10.10.30.1:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose
35) v4 -- ?+0 0x263d480
2014-03-25 16:17:24.324948 7fa5c53fc700  1 -- 10.10.30.0:6789/0 --> mon.2
10.10.30.2:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose
35) v4 -- ?+0 0x263d6c0
2014-03-25 16:17:25.353975 7fa5c4bfb700  1 -- 10.10.30.0:6789/0 <== mon.2
10.10.30.2:6789/0 493 ==== election(b3f38955-4321-4850-9ddb-3b09940dc951
propose 35) v4 ==== 537+0+0 (4036841703 0 0) 0x265fd80 con 0x1df0c60
2014-03-25 16:17:25.354042 7fa5c4bfb700  5 mon.ceph0@0(electing).elector(35)
handle_propose from mon.2
2014-03-25 16:17:29.325107 7fa5c53fc700  5 mon.ceph0@0(electing).elector(35)
election timer expired

ceph1:
2014-03-25 16:17:24.325529 7ffe48cc1700  5 mon.ceph1@1(electing).elector(35)
handle_propose from mon.0
2014-03-25 16:17:24.325535 7ffe48cc1700  5 mon.ceph1@1(electing).elector(35)
defer to 0
2014-03-25 16:17:24.325546 7ffe48cc1700  1 -- 10.10.30.1:6789/0 --> mon.0
10.10.30.0:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 ack 35)
v4 -- ?+0 0x1bbfb40
2014-03-25 16:17:25.354038 7ffe48cc1700  1 -- 10.10.30.1:6789/0 <== mon.2
10.10.30.2:6789/0 489 ==== election(b3f38955-4321-4850-9ddb-3b09940dc951
propose 35) v4 ==== 537+0+0 (4036841703 0 0) 0x1bbf6c0 con 0x14d9b00
2014-03-25 16:17:25.354102 7ffe48cc1700  5 mon.ceph1@1(electing).elector(35)
handle_propose from mon.2
2014-03-25 16:17:25.354113 7ffe48cc1700  5 mon.ceph1@1(electing).elector(35)
no, we already acked 0

ceph2:
2014-03-25 16:17:20.353135 7f80d0013700  5 mon.ceph2@2(electing).elector(35)
election timer expired
2014-03-25 16:17:20.353154 7f80d0013700  5 mon.ceph2@2(electing).elector(35)
start -- can i be leader?
2014-03-25 16:17:20.353225 7f80d0013700  1 mon.ceph2@2(electing).elector(35)
init, last seen epoch 35
2014-03-25 16:17:20.353238 7f80d0013700  1 -- 10.10.30.2:6789/0 --> mon.0
10.10.30.0:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose
35) v4 -- ?+0 0x18e7900
2014-03-25 16:17:20.353272 7f80d0013700  1 -- 10.10.30.2:6789/0 --> mon.1
10.10.30.1:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose
35) v4 -- ?+0 0x18e7d80
2014-03-25 16:17:25.353559 7f80d0013700  5 mon.ceph2@2(electing).elector(35)
election timer expired
2014-03-25 16:17:25.353578 7f80d0013700  5 mon.ceph2@2(electing).elector(35)
start -- can i be leader?
2014-03-25 16:17:25.353647 7f80d0013700  1 mon.ceph2@2(electing).elector(35)
init, last seen epoch 35
2014-03-25 16:17:25.353660 7f80d0013700  1 -- 10.10.30.2:6789/0 --> mon.0
10.10.30.0:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose
35) v4 -- ?+0 0x19b7240
2014-03-25 16:17:25.353695 7f80d0013700  1 -- 10.10.30.2:6789/0 --> mon.1
10.10.30.1:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951 propose
35) v4 -- ?+0 0x19b76c0
2014-03-25 16:17:30.354040 7f80d0013700  5 mon.ceph2@2(electing).elector(35)
election timer expired

Oddly, it looks to me like mon.2 (ceph2) never receives (or at least never
handles) the proposal from mon.0 (ceph0), while every other propose gets a
matching handle_propose.  But I admit I have no clue how monitor election
works.
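To sanity-check that reading, here is a rough sketch (plain Python; the log
patterns, the condensed excerpts, and the `missing_proposals` helper are my
own, and I'm assuming each send/handle event fits on one line) that
cross-references "propose" sends against handle_propose receives in
excerpts like the ones above:

```python
import re

def missing_proposals(logs):
    """logs: {rank: debug log text from mon.<rank>}.
    Returns (sender, receiver) pairs where a 'propose' was sent
    but the receiver never logged 'handle_propose from' that sender."""
    sent = set()      # (sender, receiver)
    handled = set()   # (receiver, sender)
    for rank, text in logs.items():
        # outbound: "--> mon.N <addr> -- election(... propose ...)"
        for m in re.finditer(r"--> mon\.(\d+) .*propose", text):
            sent.add((rank, int(m.group(1))))
        # inbound: "handle_propose from mon.N"
        for m in re.finditer(r"handle_propose from mon\.(\d+)", text):
            handled.add((rank, int(m.group(1))))
    return sorted((s, r) for (s, r) in sent if (r, s) not in handled)

# Condensed versions of the three excerpts above:
logs = {
    0: "--> mon.1 ... propose 35\n--> mon.2 ... propose 35\n"
       "handle_propose from mon.2",
    1: "handle_propose from mon.0\nhandle_propose from mon.2",
    2: "--> mon.0 ... propose 35\n--> mon.1 ... propose 35",
}
print(missing_proposals(logs))  # [(0, 2)] -- mon.2 never saw mon.0's propose
```

Which matches the eyeball reading: the only dropped propose is the one from
mon.0 to mon.2.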

 - Travis


On Tue, Mar 25, 2014 at 12:04 PM, Travis Rhoden <trho...@gmail.com> wrote:

> I bumped debug mon and debug ms up on one of the monitors (ceph0), and
> this is what I see:
>
> 2014-03-25 16:02:19.273406 7fa5c53fc700  5 mon.ceph0@0(electing).elector(35)
> election timer expired
> 2014-03-25 16:02:19.273447 7fa5c53fc700  5 mon.ceph0@0(electing).elector(35)
> start -- can i be leader?
> 2014-03-25 16:02:19.273528 7fa5c53fc700  1 mon.ceph0@0(electing).elector(35)
> init, last seen epoch 35
> 2014-03-25 16:02:19.273543 7fa5c53fc700  1 -- 10.10.30.0:6789/0 --> mon.1
> 10.10.30.1:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951
> propose 35) v4 -- ?+0 0x251bd80
> 2014-03-25 16:02:19.273569 7fa5c53fc700  1 -- 10.10.30.0:6789/0 --> mon.2
> 10.10.30.2:6789/0 -- election(b3f38955-4321-4850-9ddb-3b09940dc951
> propose 35) v4 -- ?+0 0x251bb40
> 2014-03-25 16:02:20.284459 7fa5c4bfb700  1 -- 10.10.30.0:6789/0 <== mon.2
> 10.10.30.2:6789/0 312 ==== election(b3f38955-4321-4850-9ddb-3b09940dc951
> propose 35) v4 ==== 537+0+0 (4036841703 0 0) 0x2515480 con 0x1df0c60
> 2014-03-25 16:02:20.284524 7fa5c4bfb700  5 mon.ceph0@0(electing).elector(35)
> handle_propose from mon.2
> 2014-03-25 16:02:24.273726 7fa5c53fc700  5 mon.ceph0@0(electing).elector(35)
> election timer expired
>
>  That just repeats...
>
>
> On Tue, Mar 25, 2014 at 11:48 AM, Travis Rhoden <trho...@gmail.com> wrote:
>
>> Just to emphasize that I don't think it's clock skew, here is the NTP
>> state of all three monitors:
>>
>> # ansible ceph_mons -m command -a "ntpq -p" -kK
>> SSH password:
>> sudo password [defaults to SSH password]:
>> ceph0 | success | rc=0 >>
>>      remote           refid      st t when poll reach   delay   offset
>> jitter
>>
>> ==============================================================================
>> *controller-10g  198.60.73.8      2 u   43   64  377    0.236    0.057
>> 0.097
>>
>> ceph1 | success | rc=0 >>
>>      remote           refid      st t when poll reach   delay   offset
>> jitter
>>
>> ==============================================================================
>> *controller-10g  198.60.73.8      2 u   39   64  377    0.273    0.035
>> 0.064
>>
>> ceph2 | success | rc=0 >>
>>      remote           refid      st t when poll reach   delay   offset
>> jitter
>>
>> ==============================================================================
>> *controller-10g  198.60.73.8      2 u   30   64  377    0.201   -0.063
>> 0.063
>>
>> I think they are pretty well in sync.
>>
>>  - Travis
>>
>>
>> On Tue, Mar 25, 2014 at 11:09 AM, Travis Rhoden <trho...@gmail.com>wrote:
>>
>>> Hello,
>>>
>>> I just deployed a new Emperor cluster using ceph-deploy 1.4.  All went
>>> very smooth, until I rebooted all the nodes.  After reboot, the monitors no
>>> longer form a quorum.
>>>
>>> I followed the troubleshooting steps here:
>>> http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/
>>>
>>> Specifically, I'm in the state described in this section:
>>> http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/#most-common-monitor-issues
>>>
>>> The state for all the monitors is "electing".  The docs say this is most
>>> likely clock skew, but I do have all nodes synced with NTP.  I've
>>> confirmed this multiple times.  I've also confirmed the monitors can reach
>>> each other (by telnetting to IP:PORT, and I can see established
>>> connections via netstat).
>>>
>>> I'm baffled.
>>>
>>> Here is a sample quorum_status output:
>>>
>>> root@ceph0:~# ceph daemon mon.ceph0 quorum_status
>>> { "election_epoch": 31,
>>>   "quorum": [],
>>>   "quorum_names": [],
>>>   "quorum_leader_name": "",
>>>   "monmap": { "epoch": 2,
>>>       "fsid": "XXX", (redacted)
>>>       "modified": "2014-03-24 14:35:22.332646",
>>>       "created": "0.000000",
>>>       "mons": [
>>>             { "rank": 0,
>>>               "name": "ceph0",
>>>               "addr": "10.10.30.0:6789\/0"},
>>>             { "rank": 1,
>>>               "name": "ceph1",
>>>               "addr": "10.10.30.1:6789\/0"},
>>>             { "rank": 2,
>>>               "name": "ceph2",
>>>               "addr": "10.10.30.2:6789\/0"}]}}
>>>
>>> They all look identical to that.
>>>
>>> Any ideas what I can look at besides NTP?  The docs really stress that
>>> it should be clock skew, so I'll keep looking at that...
>>>
>>>  - Travis
>>>
>>
>>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com