Quick update: I'm trying out the procedure as documented here. So far I've:
1. Stopped ceph-mds
2. Set noout, norecover, norebalance, nobackfill
3. Stopped all ceph-osd
4. Stopped ceph-mon
5. Installed the new OS
6. Started ceph-mon
7. Started all ceph-osd

This is where I've stopped. All but one OSD came back online. The one that didn't has this backtrace:

2017-02-28 17:44:54.884235 7fb2ba3187c0 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2017-02-28 17:44:58.819039 7fb2ba3187c0 -1 osd.7 153 log_to_monitors {default=true}
sh: lsb_release: command not found
2017-02-28 17:45:04.159907 7fb29efe6700 -1 osd.7 153 lsb_release_parse - pclose failed: (0) Success
terminate called after throwing an instance of 'ceph::buffer::end_of_buffer'
  what():  buffer::end_of_buffer
*** Caught signal (Aborted) **
 in thread 7fb29cfe2700
 ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)
 1: ceph_osd_ceph_70d3b7af() [0xa5093a]
 2: (()+0x10350) [0x7fb2b916e350]
 3: (gsignal()+0x39) [0x7fb2b85afc49]
 4: (abort()+0x148) [0x7fb2b85b3058]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fb2b8eba555]
 6: (()+0x5e6f6) [0x7fb2b8eb86f6]
 7: (()+0x5e723) [0x7fb2b8eb8723]
 8: (()+0x5e942) [0x7fb2b8eb8942]
 9: (ceph::buffer::list::iterator::copy(unsigned int, char*)+0x137) [0xbbe497]
 10: (object_info_t::decode(ceph::buffer::list::iterator&)+0x7c) [0x77664c]
 11: (ReplicatedBackend::build_push_op(ObjectRecoveryInfo const&, ObjectRecoveryProgress const&, ObjectRecoveryProgress*, PushOp*, object_stat_sum_t*)+0x8bb) [0x97232b]
 12: (ReplicatedBackend::handle_pull(pg_shard_t, PullOp&, PushOp*)+0xf8) [0x974528]
 13: (ReplicatedBackend::do_pull(std::tr1::shared_ptr<OpRequest>)+0xd6) [0x974836]
 14: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x3ed) [0x97b89d]
 15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x19d) [0x80b84d]
 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3cd) [0x67720d]
 17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x2f9) [0x6776f9]
 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x85c) [0xb3c7bc]
 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xb3e8b0]
 20: (()+0x8192) [0x7fb2b9166192]
 21: (clone()+0x6d) [0x7fb2b867351d]
2017-02-28 17:45:17.723931 7fb29cfe2700 -1 *** Caught signal (Aborted) ** in thread 7fb29cfe2700
[...snip: same backtrace logged a second time...]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
[...snip: the recent-events dump at crash time ("0>" line) repeats the same backtrace a third time...]

I'm not sure why I would have encountered an issue, since the data was at rest before the install (unless there is another step that was needed). Currently the cluster is recovering objects, although `ceph osd stat` shows that the 'norecover' flag is still set.
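In case it's useful to anyone following along, this is roughly how I'm checking the flag state, and how I plan to clear the flags once recovery settles (standard ceph CLI from an admin node; the flag names are the four from the procedure above):

```shell
# Show the flags currently set on the cluster; 'norecover' should
# be listed here if it really is still in effect.
ceph osd stat
ceph -s

# Once the cluster looks stable, clear the maintenance flags.
for f in noout norecover norebalance nobackfill; do
    ceph osd unset "$f"
done
```
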
I'm going to wait out the recovery and see if the Ceph FS is OK. That would be huge if it is. But I am curious why I lost an OSD, and why recovery is happening with 'norecover' still set.

-Chris

> On Feb 28, 2017, at 4:51 AM, Peter Maloney <peter.malo...@brockmann-consult.de> wrote:
>
> On 02/27/17 18:01, Heller, Chris wrote:
>> First I bring down the Ceph FS via `ceph mds cluster_down`.
>> Second, to prevent OSDs from trying to repair data, I run `ceph osd set noout`
>> Finally I stop the ceph processes in the following order: ceph-mds, ceph-mon, ceph-osd
>>
> This is the wrong procedure. Likely it will just involve more cpu and memory usage on startup, not broken behavior (unless you run out of RAM). After all, it has to recover from power outages, so any order ought to work, just some are better.
>
> I am unsure on the cephfs part... but I would think you have it right, except I wouldn't do `ceph mds cluster_down` (but don't know if it's right to)... maybe try without that. I never used that except when I want to remove all mds nodes and destroy all the cephfs data. And I didn't find any docs on what it really even does, except it won't let you remove all your mds and destroy the cephfs without it.
>
> The correct procedure as far as I know is:
>
> ## 1. cluster must be healthy; then set noout, norecover, norebalance, nobackfill
> ceph -s
> for s in noout norecover norebalance nobackfill; do ceph osd set $s; done
>
> ## 2. shut down all OSDs and then all the MONs - not MONs before OSDs
> # all nodes
> service ceph stop osd
>
> # see that all osds are down
> ceph osd tree
>
> # all nodes again
> ceph -s
> service ceph stop
>
> ## 3. start MONs before OSDs.
> # This already happens on boot per node, but not cluster wide. But with the
> # flags set, it likely doesn't matter. It seems unnecessary on a small cluster.
>
> ## 4. unset the flags
> # see that all osds are up
> ceph -s
> ceph osd tree
> for s in noout norecover norebalance nobackfill; do ceph osd unset $s; done
>
>> Note my cluster has 1 mds and 1 mon, and 7 osd.
>>
>> I then install the new OS and then bring the cluster back up by walking the steps in reverse:
>>
>> First I start the ceph processes in the following order: ceph-osd, ceph-mon, ceph-mds
>> Second I restore OSD functionality with `ceph osd unset noout`
>> Finally I bring up the Ceph FS via `ceph mds cluster_up`
>>
> adjust those steps too... mons start first
>
>> Everything works smoothly except the Ceph FS bring up. [...snip...]
>
>> How can I safely stop a Ceph cluster, so that it will cleanly start back up again?
>>
> Don't know about the cephfs problem... all I can say is try the right general procedure and see if the result changes.
>
> (and I'd love to cite a source on why that's the right procedure and yours isn't, but don't know what to cite... for example http://docs.ceph.com/docs/jewel/rados/operations/operating/#id8 says to use -a in the arguments, but doesn't say whether that's systemd or not, or what it does exactly. I have only seen it discussed a few places, like the mailing list and IRC)
>
>> -Chris
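PS - For next time, here is Peter's quoted procedure condensed into a single sketch. It assumes the sysvinit-style `service ceph` commands from his mail (service names differ under systemd), and it has not been tested on my cluster:

```shell
#!/bin/sh
# Condensed stop/start cycle, per the quoted procedure.
# Assumes sysvinit-style 'service ceph' commands; untested sketch.

FLAGS="noout norecover norebalance nobackfill"

## Shutdown
ceph -s                                    # cluster must be healthy first
for s in $FLAGS; do ceph osd set "$s"; done
service ceph stop osd                      # run on every node
ceph osd tree                              # verify all OSDs are down
service ceph stop                          # run on every node; MONs stop after OSDs

## ... OS install / maintenance happens here ...

## Startup: MONs before OSDs
service ceph start                         # run on every node
ceph osd tree                              # verify all OSDs are back up
for s in $FLAGS; do ceph osd unset "$s"; done
```
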
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com