Hi Dongdong,

On Sun, Aug 8, 2021 at 10:08 PM 陶冬冬 <tdd21151...@gmail.com> wrote:
>
> Hi Patrick,
>
> Thanks a lot for letting us know about this issue!
>
> From reading your fix [1] carefully, I understand the heart of this issue to 
> be the following:
> Since Jewel, CephFS has used a new data structure, FSMap (for MultiFS), and 
> the monitor has been storing this new structure as the Paxos value.
> Pre-Jewel, however, the structure stored in the monitor DB was MDSMap, and the 
> initial MDSMap stays in the DB and never gets trimmed if CephFS was never used 
> at all.
> As of Pacific, the monitor no longer expects the MDSMap structure 
> in the DB, which causes the crash.
>
> To detect whether any old MDSMap exists, we just need to get the 
> oldest mdsmap from the monitor DB and try to decode it with the Pacific 
> ceph-dencoder. We can do the following:
> 1. Stop one monitor (since this has to be done during the upgrade)
> 2. Export the binary of the first committed mdsmap from the monitor 
> DB (ceph-kvstore-tool can do this)
> 3. Feed the binary to the Pacific version of ceph-dencoder
> 4. If the binary can be decoded, then we can be sure there is no legacy data 
> structure.
>     Otherwise, there is a legacy data structure, and the upgrade needs a short 
> stop at the just-released Octopus v15.2.14 before continuing to 
> Pacific.
>
> I've done some testing and it worked. Below is the same crash stack, produced 
> when I used the Pacific ceph-dencoder to decode the mdsmap from a cluster 
> (without CephFS) upgraded from Firefly.
>
> ~# ceph-dencoder import mdsmap.1.f2j type FSMap decode dump_json
> /build/ceph-dJyyVB/ceph-16.2.0/src/mds/FSMap.cc: In function 'void 
> FSMap::decode(ceph::buffer::v15_2_0::list::const_iterator&)' thread 
> 7fda1b03a240 time 2021-08-08T04:27:57.491978+0000
> /build/ceph-dJyyVB/ceph-16.2.0/src/mds/FSMap.cc: 648: ceph_abort_msg("abort() 
> called")
>  ceph version 16.2.0 (0c2054e95bcd9b30fdd908a79ac1d8bbc3394442) pacific 
> (stable)
>  1: (ceph::__ceph_abort(char const*, int, char const*, 
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> 
> > const&)+0xe0) [0x7fda1e3a652d]
>  2: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0xdca) 
> [0x7fda1e9535aa]
>  3: (DencoderBase<FSMap>::decode[abi:cxx11](ceph::buffer::v15_2_0::list, 
> unsigned long)+0x54) [0x55b3a5e6ed84]
>  4: main()
>  5: __libc_start_main()
>  6: _start()
> Aborted (core dumped)
>
> Basically, the above steps follow the same workflow the monitor uses to 
> load the mdsmap from the DB and decode it.
>
> [1] https://github.com/ceph/ceph/pull/42349

The steps you outlined look reasonable.
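
For anyone who wants to script the decode check from the quoted steps, here is a minimal sketch in Python. It assumes the first committed mdsmap has already been exported to a file (for example with ceph-kvstore-tool) and that the Pacific ceph-dencoder is on the PATH. The function names and the file name are hypothetical; only the ceph-dencoder invocation is taken from the quoted message. Any non-zero exit status, including the abort shown above, is treated as "legacy structure present", which is an assumption rather than documented behaviour.

```python
import subprocess
from typing import Callable, List


def dencoder_command(blob_path: str) -> List[str]:
    """Build the ceph-dencoder invocation used in the quoted message."""
    return ["ceph-dencoder", "import", blob_path,
            "type", "FSMap", "decode", "dump_json"]


def is_legacy_mdsmap(
    blob_path: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True if the exported blob does NOT decode as an FSMap.

    Assumption: a failed decode (non-zero exit status, or a negative
    status when the process is killed by SIGABRT) means the blob is a
    legacy MDSMap. `run` is injectable so the logic can be exercised
    without a Ceph installation.
    """
    result = run(dencoder_command(blob_path), capture_output=True)
    return result.returncode != 0
```

Calling `is_legacy_mdsmap("mdsmap.bin")` on the exported file (the name is a placeholder) would then indicate whether the intermediate stop at Octopus v15.2.14 is needed before continuing to Pacific.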

Thanks,

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
