Re: [ceph-users] mds server(s) crashed

2015-08-13 Thread John Spray
On Thu, Aug 13, 2015 at 5:12 AM, Bob Ababurko wrote: > I am actually looking for the most stable way to implement cephfs at this point. My cephfs cluster contains millions of small files, hence many inodes, if that needs to be taken into account. Perhaps I should only be using …
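
For readers checking cache pressure on a cluster like this, a minimal sketch (the daemon name mds.a is a placeholder; run on the MDS host with access to the admin socket):

    # cluster health and MDS map at a glance
    ceph -s
    ceph mds stat
    # dump the MDS perf counters; the mds section reports inode and
    # cache statistics useful when tracking millions of small files
    ceph daemon mds.a perf dump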

Re: [ceph-users] mds server(s) crashed

2015-08-13 Thread John Spray
On Thu, Aug 13, 2015 at 3:29 AM, yangyongp...@bwstor.com.cn wrote: > I also encountered a problem: the standby MDS cannot be promoted to active when the active MDS service stops, which has bothered me for several days. Maybe an MDS cluster can solve this problem, but the ceph team hasn't released this feature yet. That …
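
A minimal ceph.conf sketch of a standby-replay MDS, assuming the Hammer-era option names mds_standby_for_name and mds_standby_replay (mds.b standing by for mds.a is a hypothetical layout; verify the options against your release):

    [mds.b]
        # follow mds.a and continuously replay its journal so a
        # takeover does not have to start from a cold cache
        mds standby replay = true
        mds standby for name = a

With this in place, ceph mds stat should report one daemon up:active and the other up:standby-replay.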

Re: [ceph-users] mds server(s) crashed

2015-08-12 Thread Bob Ababurko
On Wed, Aug 12, 2015 at 7:21 PM, Yan, Zheng wrote: > On Thu, Aug 13, 2015 at 7:05 AM, Bob Ababurko wrote: >> If I am using a more recent client (kernel or ceph-fuse), should I still be worried about the MDSs crashing? I have added RAM to my MDS hosts and it's my understanding this will also help mitigate any issues …

Re: [ceph-users] mds server(s) crashed

2015-08-12 Thread yangyongp...@bwstor.com.cn
On Thu, Aug 13, 2015 at 7:05 AM, Bob Ababurko wrote: > If I am using a more recent client (kernel or ceph-fuse), should I still be worried about the MDSs crashing? I have added RAM to my MDS hosts …

Re: [ceph-users] mds server(s) crashed

2015-08-12 Thread Yan, Zheng
On Thu, Aug 13, 2015 at 7:05 AM, Bob Ababurko wrote: > If I am using a more recent client (kernel or ceph-fuse), should I still be worried about the MDSs crashing? I have added RAM to my MDS hosts and it's my understanding this will also help mitigate any issues, in addition to setting mds_bal_frag = true. …

Re: [ceph-users] mds server(s) crashed

2015-08-12 Thread Bob Ababurko
If I am using a more recent client (kernel or ceph-fuse), should I still be worried about the MDSs crashing? I have added RAM to my MDS hosts and it's my understanding this will also help mitigate any issues, in addition to setting mds_bal_frag = true. Not having used cephfs before, do I always ne…
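
For context, mds_bal_frag lets the MDS split large directories into fragments instead of keeping each directory whole. A minimal ceph.conf sketch, assuming the pre-Luminous option name from this thread:

    [mds]
        # allow directory fragmentation; helps with directories
        # holding very large numbers of small files
        mds bal frag = true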

Re: [ceph-users] mds server(s) crashed

2015-08-12 Thread John Spray
On Wed, Aug 12, 2015 at 5:08 AM, Bob Ababurko wrote: > What is risky about enabling mds_bal_frag on a cluster with data, and will there be any performance degradation if enabled? No specific gotchas, just that it is not something that has especially good coverage in our automated tests. We rece…
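
If a restart is undesirable, the option can likely also be injected into a running daemon; a sketch, assuming mds.a as the daemon name (injected values do not persist, so the setting should also go into ceph.conf):

    # runtime change on one MDS; lost on the next daemon restart
    ceph tell mds.a injectargs '--mds_bal_frag true'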

Re: [ceph-users] mds server(s) crashed

2015-08-11 Thread Bob Ababurko
John, This seems to have worked. I rebooted my client and restarted ceph on the MDS hosts after giving them more RAM. I restarted the rsyncs that were running on the client after remounting the cephfs filesystem, and things seem to be working. I can access the files, so that is a relief. What is risky…
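
For reference, remounting the kernel CephFS client might look like the sketch below (monitor address, mount point, and secret file path are placeholders):

    # lazy-unmount in case the old mount is hung, then remount
    umount -l /mnt/cephfs
    mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret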

Re: [ceph-users] mds server(s) crashed

2015-08-11 Thread Yan, Zheng
On Wed, Aug 12, 2015 at 1:23 AM, Bob Ababurko wrote: > Here is the backtrace from the core dump. > > (gdb) bt > #0 0x7f71f5404ffb in raise () from /lib64/libpthread.so.0 > #1 0x0087065d in reraise_fatal (signum=6) at global/signal_handler.cc:59 > #2 handle_fatal_signal (signum=6) …

Re: [ceph-users] mds server(s) crashed

2015-08-11 Thread Yan, Zheng
On Wed, Aug 12, 2015 at 5:53 AM, John Spray wrote: > For the record: I've created issue #12671 to improve our memory management in this type of situation. > John > http://tracker.ceph.com/issues/12671 This situation has been improved in recent clients: recent clients trim their cache first, …
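
Illustrative only: the client cache being trimmed here is bounded by options such as client_cache_size for ceph-fuse; a sketch, assuming that option name applies to this release (the value shown is the historical default):

    [client]
        # maximum number of inodes ceph-fuse keeps cached locally
        client cache size = 16384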

Re: [ceph-users] mds server(s) crashed

2015-08-11 Thread John Spray
For the record: I've created issue #12671 to improve our memory management in this type of situation. John http://tracker.ceph.com/issues/12671 On Tue, Aug 11, 2015 at 10:25 PM, John Spray wrote: > On Tue, Aug 11, 2015 at 6:23 PM, Bob Ababurko wrote: >> Here is the backtrace from the core dump
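
Until that issue landed, MDS memory was bounded only indirectly through the cache limit; a ceph.conf sketch, assuming the pre-Luminous inode-count option (100000 is the historical default, and actual RSS can far exceed what that number suggests):

    [mds]
        # cache limit is an inode count, not a byte limit
        mds cache size = 100000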

Re: [ceph-users] mds server(s) crashed

2015-08-11 Thread John Spray
On Tue, Aug 11, 2015 at 6:23 PM, Bob Ababurko wrote: > Here is the backtrace from the core dump. > > (gdb) bt > #0 0x7f71f5404ffb in raise () from /lib64/libpthread.so.0 > #1 0x0087065d in reraise_fatal (signum=6) at global/signal_handler.cc:59 > #2 handle_fatal_signal (signum=6) …

Re: [ceph-users] mds server(s) crashed

2015-08-11 Thread Bob Ababurko
Yes, this was a package install and ceph-debuginfo was used; hopefully the output of the backtrace is useful. I thought it was interesting that you mentioned reproducing with an ls, because aside from my doing a large dd before this issue surfaced, your post made me recall that I also ran ls a few…
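
For anyone reproducing this, the backtrace workflow from a package install is roughly the sketch below (the core file path is a placeholder; yum assumes an RPM-based distro):

    # install matching debug symbols so frames resolve to source lines
    yum install -y ceph-debuginfo
    # open the core against the crashed binary and print the stack
    gdb /usr/bin/ceph-mds /path/to/core
    (gdb) bt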

Re: [ceph-users] mds server(s) crashed

2015-08-11 Thread Bob Ababurko
Here is the backtrace from the core dump. (gdb) bt #0 0x7f71f5404ffb in raise () from /lib64/libpthread.so.0 #1 0x0087065d in reraise_fatal (signum=6) at global/signal_handler.cc:59 #2 handle_fatal_signal (signum=6) at global/signal_handler.cc:109 #3 <signal handler called> #4 0x7f71f40235d7 in rais…

Re: [ceph-users] mds server(s) crashed

2015-08-11 Thread John Spray
On Tue, Aug 11, 2015 at 2:21 AM, Bob Ababurko wrote: > I had a dual MDS server configuration and have been copying data via the cephfs kernel module to my cluster for the past 3 weeks, and just had an MDS crash halting all IO. Leading up to the crash, I ran a test dd that increased the throughput…

Re: [ceph-users] mds server(s) crashed

2015-08-10 Thread Yan, Zheng
On Tue, Aug 11, 2015 at 9:21 AM, Bob Ababurko wrote: > I had a dual MDS server configuration and have been copying data via the cephfs kernel module to my cluster for the past 3 weeks, and just had an MDS crash halting all IO. Leading up to the crash, I ran a test dd that increased the throughput…
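
The throughput-raising dd test mentioned might look like the sketch below (file path, block size, and count are hypothetical):

    # large sequential write into the cephfs kernel mount
    dd if=/dev/zero of=/mnt/cephfs/ddtest bs=1M count=10240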