Hi,

I’m hoping someone can unblock me on this. We recently ran into a very odd
issue, which I described in an earlier email to the list:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-March/033579.html

After trying unsuccessfully to repair it, we decided to abandon the filesystem.

I marked the cluster down, failed the MDSs, and removed the FS along with its
metadata and data pools.

I then created a new filesystem from scratch.

However, I am still seeing the MDS segfault when a client tries to connect.
This is quite urgent for me, as we don’t have a functioning filesystem. If
someone can advise how to remove any and all state, please do so; I just want
to start fresh. I am very puzzled that a brand new FS doesn’t work.

Here is the MDS log at debug level 20. One odd thing I notice is that the
client id starts showing as ? (client.?) well before the segfault. In any
case, I’m just asking what needs to be done to remove all state from the MDS
nodes:


2019-03-08 19:30:12.024535 7f25ec184700 20 mds.0.server get_session have 
0x5477e00 client.2160819875 <client_ip>:0/945029522 state open

2019-03-08 19:30:12.024537 7f25ec184700 15 mds.0.server  oldest_client_tid=1

2019-03-08 19:30:12.024564 7f25ec184700  7 mds.0.cache request_start 
request(client.?:1 cr=0x54a8680)

2019-03-08 19:30:12.024566 7f25ec184700  7 mds.0.server dispatch_client_request 
client_request(client.?:1 getattr pAsLsXsFs #1 2019-03-08 19:29:15.425510 
RETRY=2) v2

2019-03-08 19:30:12.024576 7f25ec184700 10 mds.0.server rdlock_path_pin_ref 
request(client.?:1 cr=0x54a8680) #1

2019-03-08 19:30:12.024577 7f25ec184700  7 mds.0.cache traverse: opening base 
ino 1 snap head

2019-03-08 19:30:12.024579 7f25ec184700 10 mds.0.cache path_traverse finish on 
snapid head

2019-03-08 19:30:12.024580 7f25ec184700 10 mds.0.server ref is [inode 1 
[...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | 
dirfrag=1 0x53ca968]

2019-03-08 19:30:12.024589 7f25ec184700 10 mds.0.locker acquire_locks 
request(client.?:1 cr=0x54a8680)

2019-03-08 19:30:12.024591 7f25ec184700 20 mds.0.locker  must rdlock (iauth 
sync) [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) 
(iversion lock) | request=1 dirfrag=1 0x53ca968]

2019-03-08 19:30:12.024594 7f25ec184700 20 mds.0.locker  must rdlock (ilink 
sync) [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) 
(iversion lock) | request=1 dirfrag=1 0x53ca968]

2019-03-08 19:30:12.024597 7f25ec184700 20 mds.0.locker  must rdlock (ifile 
sync) [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) 
(iversion lock) | request=1 dirfrag=1 0x53ca968]

2019-03-08 19:30:12.024600 7f25ec184700 20 mds.0.locker  must rdlock (ixattr 
sync) [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) 
(iversion lock) | request=1 dirfrag=1 0x53ca968]

2019-03-08 19:30:12.024602 7f25ec184700 20 mds.0.locker  must rdlock (isnap 
sync) [inode 1 [...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) 
(iversion lock) | request=1 dirfrag=1 0x53ca968]

2019-03-08 19:30:12.024605 7f25ec184700 10 mds.0.locker  must authpin [inode 1 
[...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | 
request=1 dirfrag=1 0x53ca968]

2019-03-08 19:30:12.024607 7f25ec184700 10 mds.0.locker  auth_pinning [inode 1 
[...2,head] / auth v1 snaprealm=0x53b8480 f() n(v0 1=0+1) (iversion lock) | 
request=1 dirfrag=1 0x53ca968]

2019-03-08 19:30:12.024610 7f25ec184700 10 mds.0.cache.ino(1) auth_pin by 
0x51e5e00 on [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 
1=0+1) (iversion lock) | request=1 dirfrag=1 authpin=1 0x53ca968] now 1+0

2019-03-08 19:30:12.024614 7f25ec184700  7 mds.0.locker rdlock_start  on (isnap 
sync) on [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 
1=0+1) (iversion lock) | request=1 dirfrag=1 authpin=1 0x53ca968]

2019-03-08 19:30:12.024618 7f25ec184700 10 mds.0.locker  got rdlock on (isnap 
sync r=1) [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 
1=0+1) (isnap sync r=1) (iversion lock) | request=1 lock=1 dirfrag=1 authpin=1 
0x53ca968]

2019-03-08 19:30:12.024621 7f25ec184700  7 mds.0.locker rdlock_start  on (ifile 
sync) on [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 
1=0+1) (isnap sync r=1) (iversion lock) | request=1 lock=1 dirfrag=1 authpin=1 
0x53ca968]

2019-03-08 19:30:12.024625 7f25ec184700 10 mds.0.locker  got rdlock on (ifile 
sync r=1) [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 
1=0+1) (isnap sync r=1) (ifile sync r=1) (iversion lock) | request=1 lock=2 
dirfrag=1 authpin=1 0x53ca968]

2019-03-08 19:30:12.024628 7f25ec184700  7 mds.0.locker rdlock_start  on (iauth 
sync) on [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 
1=0+1) (isnap sync r=1) (ifile sync r=1) (iversion lock) | request=1 lock=2 
dirfrag=1 authpin=1 0x53ca968]

2019-03-08 19:30:12.024631 7f25ec184700 10 mds.0.locker  got rdlock on (iauth 
sync r=1) [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 
1=0+1) (iauth sync r=1) (isnap sync r=1) (ifile sync r=1) (iversion lock) | 
request=1 lock=3 dirfrag=1 authpin=1 0x53ca968]

2019-03-08 19:30:12.024635 7f25ec184700  7 mds.0.locker rdlock_start  on (ilink 
sync) on [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 
1=0+1) (iauth sync r=1) (isnap sync r=1) (ifile sync r=1) (iversion lock) | 
request=1 lock=3 dirfrag=1 authpin=1 0x53ca968]

2019-03-08 19:30:12.024638 7f25ec184700 10 mds.0.locker  got rdlock on (ilink 
sync r=1) [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 
1=0+1) (iauth sync r=1) (ilink sync r=1) (isnap sync r=1) (ifile sync r=1) 
(iversion lock) | request=1 lock=4 dirfrag=1 authpin=1 0x53ca968]

2019-03-08 19:30:12.024642 7f25ec184700  7 mds.0.locker rdlock_start  on 
(ixattr sync) on [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() 
n(v0 1=0+1) (iauth sync r=1) (ilink sync r=1) (isnap sync r=1) (ifile sync r=1) 
(iversion lock) | request=1 lock=4 dirfrag=1 authpin=1 0x53ca968]

2019-03-08 19:30:12.024646 7f25ec184700 10 mds.0.locker  got rdlock on (ixattr 
sync r=1) [inode 1 [...2,head] / auth v1 ap=1+0 snaprealm=0x53b8480 f() n(v0 
1=0+1) (iauth sync r=1) (ilink sync r=1) (isnap sync r=1) (ifile sync r=1) 
(ixattr sync r=1) (iversion lock) | request=1 lock=5 dirfrag=1 authpin=1 
0x53ca968]

2019-03-08 19:30:12.024658 7f25ec184700 10 mds.0.server reply to stat on 
client_request(client.?:1 getattr pAsLsXsFs #1 2019-03-08 19:29:15.425510 
RETRY=2) v2

2019-03-08 19:30:12.024661 7f25ec184700 10 mds.0.server reply_client_request 0 
((0) Success) client_request(client.?:1 getattr pAsLsXsFs #1 2019-03-08 
19:29:15.425510 RETRY=2) v2

2019-03-08 19:30:12.024673 7f25ec184700 10 mds.0.server apply_allocated_inos 0 
/ [] / 0

2019-03-08 19:30:12.024674 7f25ec184700 20 mds.0.server lat 0.060895

2019-03-08 19:30:12.024677 7f25ec184700 20 mds.0.server set_trace_dist snapid 
head

2019-03-08 19:30:12.024679 7f25ec184700 10 mds.0.server set_trace_dist 
snaprealm snaprealm(1 seq 1 lc 0 cr 0 cps 1 snaps={} 0x53b8480) len=48

2019-03-08 19:30:12.024683 7f25ec184700 20 mds.0.cache.ino(1)  pfile 0 pauth 0 
plink 0 pxattr 0 plocal 0 ctime 2019-03-07 21:12:21.476328 valid=1

2019-03-08 19:30:12.024688 7f25ec184700 10 mds.0.cache.ino(1) add_client_cap 
first cap, joining realm snaprealm(1 seq 1 lc 0 cr 0 cps 1 snaps={} 0x53b8480)

2019-03-08 19:30:12.026741 7f25ec184700 -1 *** Caught signal (Segmentation 
fault) **

 in thread 7f25ec184700



 ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)

 1: ceph_mds() [0x89982a]

 2: (()+0x10350) [0x7f25f4647350]

 3: (CInode::get_caps_allowed_for_client(client_t) const+0x130) [0x7a19f0]

 4: (CInode::encode_inodestat(ceph::buffer::list&, Session*, SnapRealm*, 
snapid_t, unsigned int, int)+0x132d) [0x7b383d]

 5: (Server::set_trace_dist(Session*, MClientReply*, CInode*, CDentry*, 
snapid_t, int, std::tr1::shared_ptr<MDRequestImpl>&)+0x471) [0x5f26e1]

 6: (Server::reply_client_request(std::tr1::shared_ptr<MDRequestImpl>&, 
MClientReply*)+0x846) [0x611056]

 7: (Server::respond_to_request(std::tr1::shared_ptr<MDRequestImpl>&, 
int)+0x4d9) [0x611759]

 8: (Server::handle_client_getattr(std::tr1::shared_ptr<MDRequestImpl>&, 
bool)+0x47b) [0x613eab]

 9: 
(Server::dispatch_client_request(std::tr1::shared_ptr<MDRequestImpl>&)+0xa38) 
[0x633da8]

 10: (Server::handle_client_request(MClientRequest*)+0x3df) [0x63435f]

 11: (Server::dispatch(Message*)+0x3f3) [0x63b8b3]

 12: (MDS::handle_deferrable_message(Message*)+0x847) [0x5b6c27]

 13: (MDS::_dispatch(Message*)+0x6d) [0x5d2bed]

 14: (C_MDS_RetryMessage::finish(int)+0x1b) [0x63d24b]

 15: (MDSInternalContextBase::complete(int)+0x163) [0x7e3363]

 16: (MDS::_advance_queues()+0x48d) [0x5c9e4d]

 17: (MDS::ProgressThread::entry()+0x4a) [0x5ca1aa]

 18: (()+0x8192) [0x7f25f463f192]

 19: (clone()+0x6d) [0x7f25f3b4c26d]

 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to 
interpret this.




