3) Look through the logs for when the beacon fails (the first of
"mds.0.16 is_laggy 600.641332 > 15 since last acked beacon") and see
if there's anything tell-tale going on at the time.
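For example, something like this can pin down the spot (a sketch; the
log path and mds name are assumptions, adjust to your deployment):

  # find the first laggy-beacon complaint and note its line number
  N=$(grep -n is_laggy /var/log/ceph/ceph-mds.a.log | head -1 | cut -d: -f1)
  # then read ~50 lines of context on either side of it
  sed -n "$((N-50)),$((N+50))p" /var/log/ceph/ceph-mds.a.log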
anything in particular we should be looking for?
the log goes as follows:
mds starts around 11:43
...
2014-04-01 11:44:23.658583 7ffec89c6700 1 mds.0.server reconnect_clients -- 1 sessions
2014-04-01 11:44:41.212488 7ffec89c6700 0 log [DBG] : reconnect by client.4585 10.141.8.199:0/3551 after 17.553854
2014-04-01 11:44:45.692237 7ffec89c6700 1 mds.0.10 reconnect_done
2014-04-01 11:44:45.996384 7ffec89c6700 1 mds.0.10 handle_mds_map i am now mds.0.10
2014-04-01 11:44:45.996388 7ffec89c6700 1 mds.0.10 handle_mds_map state change up:reconnect --> up:rejoin
2014-04-01 11:44:45.996390 7ffec89c6700 1 mds.0.10 rejoin_start
2014-04-01 11:49:53.158471 7ffec89c6700 1 mds.0.10 rejoin_joint_start
then lots (4667 lines) of
2014-04-01 11:50:10.237035 7ffebc844700 0 -- 10.141.8.180:6837/55117 >> 10.141.8.180:6789/0 pipe(0x38a7da00 sd=104 :41115 s=4 pgs=0 cs=0 l=1 c=0x6513e8840).connect got RESETSESSION but no longer connecting
with one intermediate
2014-04-01 11:51:50.181354 7ffebcf4b700 0 -- 10.141.8.180:6837/55117 >> 10.141.8.180:6789/0 pipe(0x10e282580 sd=103 :0 s=1 pgs=0 cs=0 l=1 c=0xc77d5ee0).fault
then a sudden change:
2014-04-01 11:57:30.440554 7ffebcd49700 0 -- 10.141.8.180:6837/55117 >> 10.141.8.182:6789/0 pipe(0xa1534100 sd=104 :48176 s=4 pgs=0 cs=0 l=1 c=0xd99b11e0).connect got RESETSESSION but no longer connecting
2014-04-01 11:57:30.722607 7ffebec68700 0 -- 10.141.8.180:6837/55117 >> 10.141.8.181:6789/0 pipe(0x1ce98a00 sd=104 :48235 s=1 pgs=0 cs=0 l=1 c=0xc48a3f40).connect got BADAUTHORIZER
2014-04-01 11:57:30.722669 7ffebec68700 0 -- 10.141.8.180:6837/55117 >> 10.141.8.181:6789/0 pipe(0x1ce98a00 sd=104 :48235 s=1 pgs=0 cs=0 l=1 c=0xc48a3f40).connect got BADAUTHORIZER
2014-04-01 11:57:30.722885 7ffebec68700 0 -- 10.141.8.180:6837/55117 >> 10.141.8.181:6789/0 pipe(0x1ce98a00 sd=57 :48237 s=1 pgs=0 cs=0 l=1 c=0xc48a3f40).connect got BADAUTHORIZER
2014-04-01 11:57:30.722945 7ffebec68700 0 -- 10.141.8.180:6837/55117 >> 10.141.8.181:6789/0 pipe(0x1ce98a00 sd=57 :48237 s=1 pgs=0 cs=0 l=1 c=0xc48a3f40).connect got BADAUTHORIZER
followed by lots of
2014-04-01 11:57:30.738562 7ffebec68700 0 -- 10.141.8.180:6837/55117 >> 10.141.8.181:6789/0 pipe(0xead9fa80 sd=57 :0 s=1 pgs=0 cs=0 l=1 c=0x10e5d280).fault
with sporadic
2014-04-01 11:57:32.431219 7ffebeb67700 0 -- 10.141.8.180:6837/55117 >> 10.141.8.182:6789/0 pipe(0xef85cd80 sd=103 :41218 s=4 pgs=0 cs=0 l=1 c=0x130590dc0).connect got RESETSESSION but no longer connecting
until the dump:
2014-04-01 11:59:27.612850 7ffebea66700 0 -- 10.141.8.180:6837/55117 >> 10.141.8.181:6789/0 pipe(0xe3036400 sd=103 :0 s=1 pgs=0 cs=0 l=1 c=0xa7be300).fault
2014-04-01 11:59:27.639009 7ffec89c6700 -1 mds/CDir.cc: In function 'void CDir::_omap_fetched(ceph::bufferlist&, std::map<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list, std::less<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<const std::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list> > >&, const std::string&, int)' thread 7ffec89c6700 time 2014-04-01 11:59:27.620684
mds/CDir.cc: 1474: FAILED assert(r == 0 || r == -2 || r == -61)
ceph version 0.78 (f6c746c314d7b87b8419b6e584c94bfe4511dbd4)
1: (CDir::_omap_fetched(ceph::buffer::list&, std::map<std::string, ceph::buffer::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::list> > >&, std::string const&, int)+0x4d71) [0x77c3c1]
2: (Context::complete(int)+0x9) [0x56bb79]
3: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x11a6) [0x806dd6]
4: (MDS::handle_core_message(Message*)+0x9c7) [0x5901d7]
5: (MDS::_dispatch(Message*)+0x2f) [0x59028f]
6: (MDS::ms_dispatch(Message*)+0x1ab) [0x591d4b]
7: (DispatchQueue::entry()+0x582) [0x902072]
8: (DispatchQueue::DispatchThread::entry()+0xd) [0x85ef4d]
9: /lib64/libpthread.so.0() [0x34c36079d1]
10: (clone()+0x6d) [0x34c32e8b6d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
--- begin dump of recent events ---
-10000> 2014-04-01 11:59:27.137779 7ffec89c6700 5 mds.0.10 initiating monitor reconnect; maybe we're not the slow one
-9999> 2014-04-01 11:59:27.137787 7ffec89c6700 10 monclient(hunting): _reopen_session rank -1 name
-9998> 2014-04-01 11:59:27.137790 7ffec89c6700 1 -- 10.141.8.180:6837/55117 mark_down 0x1326be460 -- 0x2cae5f00
-9997> 2014-04-01 11:59:27.137807 7ffebf672700 2 -- 10.141.8.180:6837/55117 >> 10.141.8.181:6789/0 pipe(0x2cae5f00 sd=57 :0 s=4 pgs=0 cs=0 l=1 c=0x1326be460).connect error 10.141.8.181:6789/0, 104: Connection reset by peer
-9996> 2014-04-01 11:59:27.137819 7ffebf672700 3 -- 10.141.8.180:6837/55117 >> 10.141.8.181:6789/0 pipe(0x2cae5f00 sd=57 :0 s=4 pgs=0 cs=0 l=1 c=0x1326be460).connect fault, but state = closed != connecting, stopping
-9995> 2014-04-01 11:59:27.137824 7ffec89c6700 10 monclient(hunting): picked mon.ceph003 con 0x3af54ba0 addr 10.141.8.182:6789/0
-9994> 2014-04-01 11:59:27.137832 7ffec89c6700 10 monclient(hunting): _send_mon_message to mon.ceph003 at 10.141.8.182:6789/0
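As the NOTE in the dump says, interpreting those raw frame addresses
needs the matching binary; a minimal sketch, assuming a ceph-mds build
with debug symbols and the paths below (both are assumptions):

  # disassemble with source interleaved, as the NOTE suggests
  objdump -rdS /usr/bin/ceph-mds > ceph-mds.asm
  # or resolve a single return address (e.g. frame 1 above) to file:line
  addr2line -Cfe /usr/bin/ceph-mds 0x77c3c1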
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Wed, Apr 2, 2014 at 3:39 AM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote:
----- Message from Gregory Farnum <g...@inktank.com> ---------
Date: Tue, 1 Apr 2014 09:03:17 -0700
From: Gregory Farnum <g...@inktank.com>
Subject: Re: [ceph-users] ceph 0.78 mon and mds crashing (bus error)
To: "Yan, Zheng" <uker...@gmail.com>
Cc: Kenneth Waegeman <kenneth.waege...@ugent.be>, ceph-users <ceph-users@lists.ceph.com>
On Tue, Apr 1, 2014 at 7:12 AM, Yan, Zheng <uker...@gmail.com> wrote:
On Tue, Apr 1, 2014 at 10:02 PM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote:
After some more searching, I've found that the source of the problem
is with the mds and not the mon. The mds crashes, generates a core
dump that eats the local space, and in turn the monitor (because of
leveldb) crashes.
The error in the mds log of one host:
2014-04-01 15:46:34.414615 7f870e319700 0 -- 10.141.8.180:6836/13152 >> 10.141.8.180:6789/0 pipe(0x517371180 sd=54 :42439 s=4 pgs=0 cs=0 l=1 c=0x147ac780).connect got RESETSESSION but no longer connecting
2014-04-01 15:46:34.438792 7f871194f700 0 -- 10.141.8.180:6836/13152 >> 10.141.8.180:6789/0 pipe(0x1b099f580 sd=8 :43150 s=4 pgs=0 cs=0 l=1 c=0x1fd44360).connect got RESETSESSION but no longer connecting
2014-04-01 15:46:34.439028 7f870e319700 0 -- 10.141.8.180:6836/13152 >> 10.141.8.182:6789/0 pipe(0x13aa64880 sd=54 :37085 s=4 pgs=0 cs=0 l=1 c=0x1fd43de0).connect got RESETSESSION but no longer connecting
2014-04-01 15:46:34.468257 7f871b7ae700 -1 mds/CDir.cc: In function 'void CDir::_omap_fetched(ceph::bufferlist&, std::map<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list, std::less<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<const std::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::list> > >&, const std::string&, int)' thread 7f871b7ae700 time 2014-04-01 15:46:34.448320
mds/CDir.cc: 1474: FAILED assert(r == 0 || r == -2 || r == -61)
Could you use gdb to check what the value of the variable 'r' is?
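(If going the gdb route, a hypothetical session; the binary/core paths
and the frame number are assumptions:

  gdb /usr/bin/ceph-mds /path/to/core
  (gdb) bt        # locate the CDir::_omap_fetched frame
  (gdb) frame 1   # select it; the number will differ
  (gdb) print r   # the return code the assert rejected)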
If you look at the crash dump log you can see the return value in the
osd_op_reply message:
-1> 2014-04-01 15:46:34.440860 7f871b7ae700 1 -- 10.141.8.180:6836/13152 <== osd.3 10.141.8.180:6827/4366 33077 ==== osd_op_reply(4179177 100001f2ef1.00000000 [omap-get-header 0~0,omap-get-vals 0~16] v0'0 uv0 ack = -108 (Cannot send after transport endpoint shutdown)) v6 ==== 229+0+0 (958358678 0 0) 0x2cff7aa80 con 0x37ea3c0
-108, which is ESHUTDOWN, but we also use it (via the 108 constant, I
think because ESHUTDOWN varies across platforms) as EBLACKLISTED.
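(Easy to confirm on a Linux box; a small sketch, nothing Ceph-specific
assumed:

  # errno 108 is ESHUTDOWN in the kernel headers
  grep -w 108 /usr/include/asm-generic/errno.h
  python -c 'import errno, os; print errno.errorcode[108], os.strerror(108)')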
So it looks like this is itself actually a symptom of another problem
that is causing the MDS to get timed out on the monitor. If a core
dump is "eating the local space", maybe the MDS is stuck in an
infinite allocation loop of some kind? How big are your disks,
Kenneth? Do you have any information on how much CPU/memory the MDS
was using before this?
I monitored the mds process after restart:
PID   USER  PR  NI  VIRT   RES   SHR   S  %CPU   %MEM  TIME+    COMMAND
19215 root  20  0   6070m  5.7g  5236  S  778.6  18.1  1:27.54  ceph-mds
19215 root  20  0   7926m  7.5g  5236  S  179.2  23.8  2:44.39  ceph-mds
19215 root  20  0   12.4g  12g   5236  S  157.2  38.8  3:43.47  ceph-mds
19215 root  20  0   16.6g  16g   5236  S  144.4  52.0  4:15.01  ceph-mds
19215 root  20  0   19.9g  19g   5236  S  137.2  62.5  4:35.83  ceph-mds
19215 root  20  0   24.5g  24g   5224  S  136.5  77.0  5:04.66  ceph-mds
19215 root  20  0   25.8g  25g   2944  S  33.7   81.2  5:13.74  ceph-mds
19215 root  20  0   26.0g  25g   2916  S  24.6   81.7  5:19.07  ceph-mds
19215 root  20  0   26.1g  25g   2916  S  13.0   82.1  5:22.16  ceph-mds
19215 root  20  0   27.7g  26g   1856  S  100.0  85.8  5:36.46  ceph-mds
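(Samples like these can also be captured unattended; a sketch, assuming
a single ceph-mds process on the host:

  # append a memory/CPU sample for the mds every 30 seconds
  top -b -d 30 -p "$(pidof ceph-mds)" | grep --line-buffered ceph-mds >> mds-usage.log)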
Then it crashes. I changed the core dump location out of the root fs;
the core dump is indeed about 26G.
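(For anyone following along, relocating cores is done with
kernel.core_pattern; a sketch, the target directory is an assumption:

  # write core dumps to a bigger filesystem instead of the daemon's cwd
  mkdir -p /var/lib/ceph/log/sda4/cores
  sysctl -w kernel.core_pattern=/var/lib/ceph/log/sda4/cores/core.%e.%p)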
My disks:
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 9.9G 2.9G 6.5G 31% /
tmpfs 16G 0 16G 0% /dev/shm
/dev/sda1 248M 53M 183M 23% /boot
/dev/sda4 172G 61G 112G 35% /var/lib/ceph/log/sda4
/dev/sdb 187G 61G 127G 33% /var/lib/ceph/log/sdb
/dev/sdc 3.7T 1.7T 2.0T 47% /var/lib/ceph/osd/sdc
/dev/sdd 3.7T 1.5T 2.2T 41% /var/lib/ceph/osd/sdd
/dev/sde 3.7T 1.4T 2.4T 37% /var/lib/ceph/osd/sde
/dev/sdf 3.7T 1.5T 2.3T 39% /var/lib/ceph/osd/sdf
/dev/sdg 3.7T 2.1T 1.7T 56% /var/lib/ceph/osd/sdg
/dev/sdh 3.7T 1.7T 2.0T 47% /var/lib/ceph/osd/sdh
/dev/sdi 3.7T 1.7T 2.0T 47% /var/lib/ceph/osd/sdi
/dev/sdj 3.7T 1.7T 2.0T 47% /var/lib/ceph/osd/sdj
/dev/sdk 3.7T 2.1T 1.6T 58% /var/lib/ceph/osd/sdk
/dev/sdl 3.7T 1.7T 2.0T 46% /var/lib/ceph/osd/sdl
/dev/sdm 3.7T 1.5T 2.2T 41% /var/lib/ceph/osd/sdm
/dev/sdn 3.7T 1.4T 2.3T 38% /var/lib/ceph/osd/sdn
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
----- End message from Gregory Farnum <g...@inktank.com> -----
--
Met vriendelijke groeten,
Kenneth Waegeman
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com