On Wed, Apr 20, 2016 at 4:09 AM, Yan, Zheng <uker...@gmail.com> wrote:
> On Wed, Apr 20, 2016 at 12:12 PM, Brady Deetz <bde...@gmail.com> wrote:
> > As soon as I create a snapshot on the root of my test cephfs deployment
> > with a single file within the root, my mds server kernel panics. I
> > understand that snapshots are not recommended. Is it beneficial to
> > developers for me to leave my cluster in its present state and provide
> > whatever debugging information they'd like? I'm not really looking for a
> > solution to a mission critical issue as much as providing an opportunity
> > for developers to pull stack traces, logs, etc. from a system affected by
> > some sort of bug in cephfs/mds. This happens every time I create a
> > directory inside my .snap directory.
>
> It's likely your kernel is too old for the kernel mount. Which version of
> kernel do you use?

All nodes in the cluster share the versions listed below. This actually
appears to be a cephfs client (native) issue rather than an mds crash (see
the stack trace and kernel dump below). I have my fs mounted on my mds host,
which is why I initially thought the mds was causing the panic. Rough
reproduction steps are sketched at the end of this mail, after the trace.

Linux mon0 3.13.0-77-generic #121-Ubuntu SMP Wed Jan 20 10:50:42 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

ceph-admin@mon0:~$ cat /etc/issue
Ubuntu 14.04.4 LTS \n \l

ceph-admin@mon0:~$ dpkg -l | grep ceph | tr -s ' ' | cut -d ' ' -f 2,3
ceph 0.80.11-0ubuntu1.14.04.1
ceph-common 0.80.11-0ubuntu1.14.04.1
ceph-deploy 1.4.0-0ubuntu1
ceph-fs-common 0.80.11-0ubuntu1.14.04.1
ceph-mds 0.80.11-0ubuntu1.14.04.1
libcephfs1 0.80.11-0ubuntu1.14.04.1
python-ceph 0.80.11-0ubuntu1.14.04.1

ceph-admin@mon0:~$ ceph status
    cluster 186408c3-df8a-4e46-a397-a788fc380039
     health HEALTH_OK
     monmap e1: 1 mons at {mon0=192.168.1.120:6789/0}, election epoch 1, quorum 0 mon0
     mdsmap e48: 1/1/1 up {0=mon0=up:active}
     osdmap e206: 15 osds: 15 up, 15 in
      pgmap v25298: 704 pgs, 5 pools, 123 MB data, 53 objects
            1648 MB used, 13964 GB / 13965 GB avail
                 704 active+clean

ceph-admin@mon0:~$ ceph osd tree
# id    weight  type name       up/down reweight
-1      13.65   root default
-2      2.73            host osd0
0       0.91                    osd.0   up      1
1       0.91                    osd.1   up      1
2       0.91                    osd.2   up      1
-3      2.73            host osd1
3       0.91                    osd.3   up      1
4       0.91                    osd.4   up      1
5       0.91                    osd.5   up      1
-4      2.73            host osd2
6       0.91                    osd.6   up      1
7       0.91                    osd.7   up      1
8       0.91                    osd.8   up      1
-5      2.73            host osd3
9       0.91                    osd.9   up      1
10      0.91                    osd.10  up      1
11      0.91                    osd.11  up      1
-6      2.73            host osd4
12      0.91                    osd.12  up      1
13      0.91                    osd.13  up      1
14      0.91                    osd.14  up      1

Full kernel dump: http://tech-hell.com/dump.201604201536

[ 5869.157340] ------------[ cut here ]------------
[ 5869.157527] kernel BUG at /build/linux-faWYrf/linux-3.13.0/fs/ceph/inode.c:928!
[ 5869.157797] invalid opcode: 0000 [#1] SMP
[ 5869.157977] Modules linked in: kvm_intel kvm serio_raw ceph libceph libcrc32c fscache psmouse floppy
[ 5869.158415] CPU: 0 PID: 46 Comm: kworker/0:1 Not tainted 3.13.0-77-generic #121-Ubuntu
[ 5869.158709] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 5869.158925] Workqueue: ceph-msgr con_work [libceph]
[ 5869.159124] task: ffff8809abf3c800 ti: ffff8809abf46000 task.ti: ffff8809abf46000
[ 5869.159422] RIP: 0010:[<ffffffffa009edd5>]  [<ffffffffa009edd5>] splice_dentry+0xd5/0x190 [ceph]
[ 5869.159768] RSP: 0018:ffff8809abf47b68  EFLAGS: 00010282
[ 5869.159963] RAX: 0000000000000004 RBX: ffff8809a08b2780 RCX: 0000000000000001
[ 5869.160224] RDX: 0000000000000000 RSI: ffff8809a04f8370 RDI: ffff8809a08b2780
[ 5869.160484] RBP: ffff8809abf47ba8 R08: ffff8809a982c400 R09: ffff8809a99ef6e8
[ 5869.160550] R10: 00000000000819d8 R11: 0000000000000000 R12: ffff8809a04f8370
[ 5869.160550] R13: ffff8809a08b2780 R14: ffff8809aad5fc00 R15: 0000000000000000
[ 5869.160550] FS:  0000000000000000(0000) GS:ffff8809e3c00000(0000) knlGS:0000000000000000
[ 5869.160550] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 5869.160550] CR2: 00007f60f37ff5c0 CR3: 00000009a5f63000 CR4: 00000000000006f0
[ 5869.160550] Stack:
[ 5869.160550]  ffff8809a5da1000 ffff8809aad5fc00 ffff8809a99ef408 ffff8809a99ef400
[ 5869.160550]  ffff8809a04f8370 ffff8809a08b2780 ffff8809aad5fc00 0000000000000000
[ 5869.160550]  ffff8809abf47c08 ffffffffa00a0dc7 ffff8809a982c544 ffff8809ab3f5400
[ 5869.160550] Call Trace:
[ 5869.160550]  [<ffffffffa00a0dc7>] ceph_fill_trace+0x2a7/0x770 [ceph]
[ 5869.160550]  [<ffffffffa00bb2c5>] handle_reply+0x3d5/0xc70 [ceph]
[ 5869.160550]  [<ffffffffa00bd437>] dispatch+0xe7/0xa90 [ceph]
[ 5869.160550]  [<ffffffffa0053a78>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
[ 5869.160550]  [<ffffffffa0056a9b>] try_read+0x4ab/0x10d0 [libceph]
[ 5869.160550]  [<ffffffff8104f28f>] ? kvm_clock_read+0x1f/0x30
[ 5869.160550]  [<ffffffff810a0685>] ? set_next_entity+0x95/0xb0
[ 5869.160550]  [<ffffffffa00588d9>] con_work+0xb9/0x640 [libceph]
[ 5869.160550]  [<ffffffff81083cd2>] process_one_work+0x182/0x450
[ 5869.160550]  [<ffffffff81084ac1>] worker_thread+0x121/0x410
[ 5869.160550]  [<ffffffff810849a0>] ? rescuer_thread+0x430/0x430
[ 5869.160550]  [<ffffffff8108b8a2>] kthread+0xd2/0xf0
[ 5869.160550]  [<ffffffff8108b7d0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 5869.160550]  [<ffffffff81735c68>] ret_from_fork+0x58/0x90
[ 5869.160550]  [<ffffffff8108b7d0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 5869.160550] Code: e7 e8 20 60 13 e1 eb c7 66 0f 1f 44 00 00 48 83 7b 78 00 0f 84 c2 00 00 00 f6 05 80 32 03 00 04 0f 85 83 00 00 00 49 89 dc eb 98 <0f> 0b 4d 8b 8e 98 fc ff ff 4d 8b 86 90 fc ff ff 48 89 c6 4c 89
[ 5869.160550] RIP  [<ffffffffa009edd5>] splice_dentry+0xd5/0x190 [ceph]
[ 5869.160550]  RSP <ffff8809abf47b68>

> > Let me know if I should blow my cluster away?
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
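For reference, the reproduction boils down to the commands below. This is
only a sketch: the mount point, file name, snapshot name, and secretfile
path are illustrative rather than the exact ones used here; the monitor
address is the one shown in the ceph status output above.

# kernel (native) cephfs mount on the mds/mon host
sudo mount -t ceph 192.168.1.120:6789:/ /mnt/cephfs \
    -o name=admin,secretfile=/etc/ceph/admin.secret

# a single file in the root of the filesystem
echo test | sudo tee /mnt/cephfs/testfile

# cephfs snapshots are taken by creating a directory under the hidden
# .snap directory; the client panics as soon as this mkdir runs
sudo mkdir /mnt/cephfs/.snap/snap1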
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com