It is most likely the model of switch. In its settings the minimum
frame size you can set is 1518 and the default MTU is 1500, so it seems
the switch wants the 18-byte difference.
We are using a pair of Netgear XS712T switches and bonded pairs of Intel
10-Gigabit X540-AT2 (rev 01) NICs with 3 VLANs.
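The 18-byte difference matches standard Ethernet framing overhead: a 14-byte header (destination MAC, source MAC, EtherType) plus a 4-byte frame check sequence. Below is a minimal Python sketch of that arithmetic. The function names are illustrative, and it assumes the switch's "frame size" setting counts the header and FCS, which is not confirmed from Netgear documentation. A single 802.1Q VLAN tag adds a further 4 bytes.

```python
# Ethernet framing overhead around an IP MTU (standard values).
ETH_HEADER = 14   # dst MAC (6) + src MAC (6) + EtherType (2)
ETH_FCS = 4       # frame check sequence (CRC32)
VLAN_TAG = 4      # one 802.1Q tag, present on tagged VLAN traffic


def frame_size(mtu, vlan_tags=0):
    """Return the frame size (without preamble) needed for a given MTU."""
    return mtu + ETH_HEADER + ETH_FCS + vlan_tags * VLAN_TAG


def max_interface_mtu(switch_frame_size, vlan_tags=0):
    """Largest interface MTU that still fits in the switch frame size."""
    return switch_frame_size - ETH_HEADER - ETH_FCS - vlan_tags * VLAN_TAG


if __name__ == "__main__":
    print(frame_size(1500))               # 1518
    print(frame_size(9000))               # 9018
    print(frame_size(9000, vlan_tags=1))  # 9022
    print(max_interface_mtu(1518))        # 1500
```

If the assumption holds, an interface MTU of 9000 on a tagged VLAN would need the switch frame size set to at least 9022.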
Cameron Scrace
Infrastructure Engineer
Mobile +64 22 610 4629
Phone +64 4 462 5085
Email cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140
www.solnet.co.nz
From: Somnath Roy <somnath....@sandisk.com>
To: "cameron.scr...@solnet.co.nz" <cameron.scr...@solnet.co.nz>, Jan
Schermer <j...@schermer.cz>
Cc: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>,
ceph-users <ceph-users-boun...@lists.ceph.com>
Date: 04/06/2015 11:13 a.m.
Subject: RE: [ceph-users] Monitors not reaching quorum. (SELinux off,
IPtables off, can see tcp traffic)
------------------------------------------------------------------------
Hmm… thanks for sharing this.
Any chance it depends on the switch?
Could you please share which NIC and switch you are using?
Thanks & Regards
Somnath
From: cameron.scr...@solnet.co.nz [mailto:cameron.scr...@solnet.co.nz]
Sent: Wednesday, June 03, 2015 4:07 PM
To: Somnath Roy; Jan Schermer
Cc: ceph-users@lists.ceph.com; ceph-users
Subject: RE: [ceph-users] Monitors not reaching quorum. (SELinux off,
IPtables off, can see tcp traffic)
The interface MTU has to be 18 or more bytes lower than the switch MTU
or it just stops working. As far as I know the monitor communication
is not being encapsulated by any SDN.
From: Somnath Roy <Somnath.Roy@sandisk.com>
To: Jan Schermer <jan@schermer.cz>,
"cameron.scr...@solnet.co.nz" <cameron.scr...@solnet.co.nz>
Cc: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>,
ceph-users <ceph-users-boun...@lists.ceph.com>
Date: 04/06/2015 02:58 a.m.
Subject: RE: [ceph-users] Monitors not reaching quorum. (SELinux off,
IPtables off, can see tcp traffic)
------------------------------------------------------------------------
The TCP_NODELAY issue was with kernel rbd, **not** with the OSD. The Ceph
messenger code base sets it by default.
BTW, I doubt TCP_NODELAY has anything to do with it.
Thanks & Regards
Somnath
From: Jan Schermer [mailto:jan@schermer.cz]
Sent: Wednesday, June 03, 2015 1:37 AM
To: cameron.scr...@solnet.co.nz
Cc: Somnath Roy; ceph-users@lists.ceph.com; ceph-users
Subject: Re: [ceph-users] Monitors not reaching quorum. (SELinux off,
IPtables off, can see tcp traffic)
Interface and switch should have the same MTU and that should not
cause any issues (setting switch MTU higher is always safe, though).
Aren’t you encapsulating the mon communication in some SDN like
Open vSwitch? Is that a straight L2 connection?
I think this is worth investigating. For example, are mons properly
setting TCP_NODELAY on the sockets that are latency sensitive? (I just
tried finding out and lsof/netstat doesn’t report that to me, I’d need
to restart and strace it… I vaguely remember there was an issue with
NODELAY that was fixed on OSD side.)
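For readers unfamiliar with the option, here is a minimal Python sketch of setting and reading back TCP_NODELAY on an ordinary TCP socket. It is not Ceph's messenger code and does not inspect a running daemon; the helper names are illustrative only.

```python
import socket


def nodelay_enabled(sock):
    """Return True if TCP_NODELAY is set (Nagle's algorithm disabled)."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


def demo():
    """Report TCP_NODELAY on a fresh TCP socket before and after setting it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        before = nodelay_enabled(sock)
        # Disable Nagle's algorithm so small writes are sent immediately.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        after = nodelay_enabled(sock)
    return before, after


if __name__ == "__main__":
    print(demo())
```

The option is per socket and is set by the application itself, which is why checking it on a running process from outside requires tracing the calls.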
Jan
On 03 Jun 2015, at 06:30, cameron.scr...@solnet.co.nz wrote:
Seems to be something to do with our switch. If the interface MTU is
too close to the switch MTU it stops working. Thanks for all your help :)
From: Somnath Roy <Somnath.Roy@sandisk.com>
To: "cameron.scr...@solnet.co.nz" <cameron.scr...@solnet.co.nz>
Cc: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>,
ceph-users <ceph-users-boun...@lists.ceph.com>, Joao Eduardo Luis
<joao@suse.de>
Date: 03/06/2015 11:49 a.m.
Subject: RE: [ceph-users] Monitors not reaching quorum. (SELinux off,
IPtables off, can see tcp traffic)
------------------------------------------------------------------------
I doubt it has anything to do with Ceph. I hope you have checked that
your switch supports jumbo frames and that you have set MTU 9000 on all
the devices in between. It's better to ping your devices (all the devices
participating in the cluster) the way it is described in the following
articles, just in case you are not sure.
http://www.mylesgray.com/hardware/test-jumbo-frames-working/
http://serverfault.com/questions/234311/testing-whether-jumbo-frames-are-actually-working
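The usual jumbo-frame ping test sends a packet that exactly fills the MTU with fragmentation prohibited. A rough Python sketch of the payload arithmetic follows. It assumes IPv4 without IP options and the Linux iputils `ping` flags (`-M do`, `-s`, `-c`); the helper names are illustrative, and the address is one of the monitors from the monmap quoted in this thread.

```python
import shlex

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def icmp_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(host, mtu=9000, count=3):
    """Build a Linux ping command that forbids fragmentation (-M do)."""
    args = ["ping", "-M", "do", "-s", str(icmp_payload(mtu)),
            "-c", str(count), host]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    print(icmp_payload(9000))           # 8972
    print(ping_command("10.1.226.65"))  # ping -M do -s 8972 -c 3 10.1.226.65
```

If the full-size ping fails while a smaller one succeeds, some device on the path is not passing frames of that size.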
Hope this helps,
Thanks & Regards
Somnath
From: cameron.scr...@solnet.co.nz [mailto:cameron.scr...@solnet.co.nz]
Sent: Tuesday, June 02, 2015 4:32 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com; ceph-users; Joao Eduardo Luis
Subject: RE: [ceph-users] Monitors not reaching quorum. (SELinux off,
IPtables off, can see tcp traffic)
Setting the MTU to 1500 worked, monitors reach quorum right away.
Unfortunately we really want Jumbo Frames to be on, any ideas on how
to get ceph to work with them on?
Thanks!
From: Somnath Roy <Somnath.Roy@sandisk.com>
To: "cameron.scr...@solnet.co.nz" <cameron.scr...@solnet.co.nz>
Cc: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>,
ceph-users <ceph-users-boun...@lists.ceph.com>, Joao Eduardo Luis
<joao@suse.de>
Date: 03/06/2015 10:34 a.m.
Subject: RE: [ceph-users] Monitors not reaching quorum. (SELinux off,
IPtables off, can see tcp traffic)
------------------------------------------------------------------------
We have seen some communication issues with that; try setting the MTU to
1500 on all the servers and see whether that helps…
From: cameron.scr...@solnet.co.nz [mailto:cameron.scr...@solnet.co.nz]
Sent: Tuesday, June 02, 2015 3:31 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com; ceph-users; Joao Eduardo Luis
Subject: Re: [ceph-users] Monitors not reaching quorum. (SELinux off,
IPtables off, can see tcp traffic)
We are running with Jumbo Frames turned on. Is that likely to be the
issue? Do I need to configure something in ceph?
The mon maps are fine, and after setting debug mon to 10 and debug ms to 1,
I see probe timeouts in the logs: http://pastebin.com/44M1uJZc
I just set probe timeout to 10 (up from 2) and it still times out.
Thanks!
From: Somnath Roy <Somnath.Roy@sandisk.com>
To: Joao Eduardo Luis <joao@suse.de>,
"ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>
Date: 03/06/2015 03:49 a.m.
Subject: Re: [ceph-users] Monitors not reaching quorum. (SELinux off,
IPtables off, can see tcp traffic)
Sent by: "ceph-users" <ceph-users-boun...@lists.ceph.com>
------------------------------------------------------------------------
By any chance, are you running with jumbo frames turned on?
Thanks & Regards
Somnath
-----Original Message-----
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
Behalf Of Joao Eduardo Luis
Sent: Tuesday, June 02, 2015 12:52 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Monitors not reaching quorum. (SELinux off,
IPtables off, can see tcp traffic)
On 06/02/2015 01:42 AM, cameron.scr...@solnet.co.nz wrote:
> I am trying to deploy a new ceph cluster and my monitors are not
> reaching quorum. SELinux is off, firewalls are off, I can see traffic
> between the nodes on port 6789 but when I use the admin socket to
> force a re-election only the monitor I send the request to shows the
> new election in its logs. My logs are filled entirely with the following
> two lines:
>
> 2015-06-02 11:31:56.447975 7f795b17a700 0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
> 2015-06-02 11:31:56.448272 7f795b17a700 0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
You are running on default debug levels, so you'll hardly get anything
more than that. I suggest setting 'debug mon = 10' and 'debug ms = 1'
for added verbosity and come back to us with the logs.
There are many reasons for this, but the more common are due to the
monitors not being able to communicate with each other. Given you see
traffic between the monitors, I'm inclined to assume that the other
two monitors do not have each other on the monmap or, if they do know
each other, either 1) the monitor's auth keys do not match, or 2) the
probe timeout is being triggered before they successfully manage to
find enough monitors to trigger an election -- which may be due to
latency.
Logs will tell us more.
-Joao
> Querying the admin socket with mon_status (the other two are
> similar but with their own hostnames and ranks):
>
> {
>     "name": "wcm1",
>     "rank": 0,
>     "state": "probing",
>     "election_epoch": 1,
>     "quorum": [],
>     "outside_quorum": [
>         "wcm1"
>     ],
>     "extra_probe_peers": [],
>     "sync_provider": [],
>     "monmap": {
>         "epoch": 0,
>         "fsid": "adb8c500-122e-49fd-9c1e-a99af7832307",
>         "modified": "2015-06-02 10:43:41.467811",
>         "created": "2015-06-02 10:43:41.467811",
>         "mons": [
>             {
>                 "rank": 0,
>                 "name": "wcm1",
>                 "addr": "10.1.226.64:6789\/0"
>             },
>             {
>                 "rank": 1,
>                 "name": "wcm2",
>                 "addr": "10.1.226.65:6789\/0"
>             },
>             {
>                 "rank": 2,
>                 "name": "wcm3",
>                 "addr": "10.1.226.66:6789\/0"
>             }
>         ]
>     }
> }
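As a rough illustration of reading the mon_status output above, the following Python sketch summarizes which monitors from the monmap are missing from quorum. The function name is hypothetical, the sample is a trimmed copy of the JSON quoted in this thread, and it assumes the `quorum` array lists monitor ranks.

```python
import json

# Trimmed copy of the mon_status output quoted in this thread.
SAMPLE = r'''
{
  "name": "wcm1",
  "rank": 0,
  "state": "probing",
  "quorum": [],
  "outside_quorum": ["wcm1"],
  "monmap": {
    "mons": [
      {"rank": 0, "name": "wcm1", "addr": "10.1.226.64:6789\/0"},
      {"rank": 1, "name": "wcm2", "addr": "10.1.226.65:6789\/0"},
      {"rank": 2, "name": "wcm3", "addr": "10.1.226.66:6789\/0"}
    ]
  }
}
'''


def summarize_mon_status(raw):
    """Summarize one monitor's view; assumes 'quorum' holds monitor ranks."""
    status = json.loads(raw)
    quorum = set(status.get("quorum", []))
    mons = status["monmap"]["mons"]
    return {
        "name": status["name"],
        "state": status["state"],
        "in_quorum": status["rank"] in quorum,
        "not_in_quorum": [m["name"] for m in mons if m["rank"] not in quorum],
    }


if __name__ == "__main__":
    print(summarize_mon_status(SAMPLE))
```

For the sample, the monitor is in the "probing" state and none of the three monmap members is in quorum, which is consistent with the symptoms described in this thread.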
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
________________________________
PLEASE NOTE: The information contained in this electronic mail message
is intended only for the use of the designated recipient(s) named
above. If the reader of this message is not the intended recipient,
you are hereby notified that you have received this message in error
and that any review, dissemination, distribution, or copying of this
message is strictly prohibited. If you have received this
communication in error, please notify the sender by telephone or
e-mail (as shown above) immediately and destroy any and all copies of
this message in your possession (whether hard copies or electronically
stored copies).
Attention: This email may contain information intended for the sole
use of the original recipient. Please respect this when sharing or
disclosing this email's contents with any third party. If you believe
you have received this email in error, please delete it and notify the
sender or postmas...@solnetsolutions.co.nz as soon as possible. The
content of this email does not necessarily reflect the views of Solnet
Solutions Ltd.