Thank you. After running 'ceph auth add mon.a' 25 times, the unpatched version works.
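For anyone repeating this, the 25 runs can be done with a small loop. This is only a minimal sketch: it assumes the patched ceph-mon is already running and that the ceph CLI can reach it. The function name repeat_auth_add and the CEPH_BIN variable are illustrative names, not part of Ceph, and stopping on the first failure is my own choice.

```shell
#!/bin/sh
# Sketch: run 'ceph auth add mon.a' a given number of times (25 in this thread).
# CEPH_BIN and repeat_auth_add are hypothetical names used for illustration;
# CEPH_BIN defaults to the regular 'ceph' command found in PATH.
CEPH_BIN="${CEPH_BIN:-ceph}"

repeat_auth_add() {
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        # Stop at the first failure so a dead monitor is noticed immediately.
        "$CEPH_BIN" auth add mon.a || return 1
        i=$((i + 1))
    done
}

# Usage (with the patched monitor running):
#   repeat_auth_add 25
```

The count is a parameter because 25 is simply the number suggested in the quoted reply; whether another gap needs the same number is not something this thread establishes.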
Here are the detailed steps:

1. Stop the cluster (mon, osd and mds) and back up the current
   /var/lib/ceph/mon/ceph-a directory.
2. Start the patched ceph-mon and ceph-osd (I am not sure whether ceph-osd
   is necessary).
3. Run 'ceph auth add mon.a' 25 times.
4. Stop ceph-mon and ceph-osd, then run the unpatched ceph-mon with the
   command 'ceph-mon -i a -f'. It works.
5. Stop ceph-mon and back up the current, working /var/lib/ceph/mon/ceph-a
   directory.
6. Revert to the /var/lib/ceph/mon/ceph-a directory saved in step 1 and run
   the unpatched ceph-mon again, to confirm that ceph-mon does not start
   with this version of the files (it throws errors).
7. Switch back to the directory saved in step 5.

On Mon, Aug 26, 2013 at 12:16 AM, Sage Weil <s...@inktank.com> wrote:

> On Sun, 25 Aug 2013, Yu Changyuan wrote:
> > Today, when I restart ceph service, the problem I asked on mail-list
> > before happened
> > again(http://article.gmane.org/gmane.comp.file-systems.ceph.user/2995),
> > ceph-mon refuse to start and report below error:
> >
> > 2013-08-25 18:24:52.465600 7fb50a496780 -1 mon/AuthMonitor.cc: In function
> > 'virtual void AuthMonitor::update_from_paxos(bool*)' thread 7fb50a496780
> > time 2013-08-25 18:24:52.453920
> > mon/AuthMonitor.cc: 152: FAILED assert(ret == 0)
> >
> >  ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
> >  1: (AuthMonitor::update_from_paxos(bool*)+0x1fee) [0x57742e]
> >  2: (PaxosService::refresh(bool*)+0x18d) [0x4f630d]
> >  3: (Monitor::refresh_from_paxos(bool*)+0x57) [0x496477]
> >  4: (Monitor::init_paxos()+0xf5) [0x496635]
> >  5: (Monitor::preinit()+0x6bc) [0x4ad1dc]
> >  6: (main()+0x1bec) [0x48ac8c]
> >  7: (__libc_start_main()+0xed) [0x7fb5084c660d]
> >  8: ceph-mon() [0x48dab9]
> >
> > Then, I switch to "wip-mon-skip-auth-cuttlefish" branch, ceph-mon complain
> > some "missing auth inc"(from 1 to 500), and continue running, then
> > everything is ok again.
> >
> > But when I stop this patched ceph-mon, and try to start regular unpatched
> > ceph-mon, above error happened again. As I mentioned, the ceph-mon files
> > last time I use is not the final one that 'missing auth', but the files 2
> > days before ceph-mon fail, which actually ceph-mon start ok but ceph-osd
> > refuse to work.
> >
> > So, I want to know how to make these ceph-mon files that only work with
> > patched ceph-mon to work again with regular unpatched ceph-mon.
>
> Without seeing logs and knowing exactly what is going on, my first guess
> is that running several 'ceph auth add' or 'ceph auth import' commands
> that makes modifications to the auth db 25 times will get you past the
> gap.  After that, the mon should start with the unpatched version.
>
> If that doesn't fix it, can you generate a log with 'debug ms = 1' 'debug
> paxos = 20' 'debug mon = 20' and share that?
>
> Thanks-
> sage

--
Best regards,
Changyuan
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com