The keys in the keyrings for the broken mgrs match what is shown in ceph auth list. The relevant entries are below so that you can see the caps.
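(For anyone following along: one way to cross-check a mgr's local keyring against what the mons have on file, and to restore the usual mgr caps if they have drifted, is sketched below. The id `8` and the default cluster name `ceph` are taken from this thread; adjust for your hosts. Note that `ceph auth caps` overwrites the existing caps for the entity.)

```shell
# Show the key and caps the mons have on file for this mgr,
# then compare against the mgr host's local keyring.
ceph auth get mgr.8
cat /var/lib/ceph/mgr/ceph-8/keyring

# Reset the caps to the standard mgr set if they don't match.
ceph auth caps mgr.8 mon 'allow profile mgr' osd 'allow *' mds 'allow *'
```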
I am having problems with both mgr.6 and mgr.8. mgr.7 is the only mgr currently functioning.

mgr.6
        key: [redacted]
        caps: [mds] allow *
        caps: [mgr] allow r
        caps: [mon] allow profile mgr
        caps: [osd] allow *
mgr.7
        key: [redacted]
        caps: [mds] allow *
        caps: [mgr] allow r
        caps: [mon] allow profile mgr
        caps: [osd] allow *
mgr.8
        key: [redacted]
        caps: [mds] allow *
        caps: [mon] allow profile mgr
        caps: [osd] allow *

I agree that an auth issue seems unlikely to have been triggered, but I'm not sure what else it can be.

On Fri, Jan 4, 2019 at 10:51 AM Steve Taylor <steve.tay...@storagecraft.com> wrote:

> I can't think of why the upgrade would have broken your keys, but have
> you verified that the mons still have the correct mgr keys configured?
> 'ceph auth ls' should list an mgr.<host> key for each mgr with a key
> matching the contents of /var/lib/ceph/mgr/<cluster>-<host>/keyring on
> the mgr host, and some caps that should minimally include '[mon] allow
> profile mgr' and '[osd] allow *', I would think.
>
> Again, it seems unlikely that this would have broken with the upgrade
> if it had been working previously, but if you're seeing auth errors it
> might be something to check out.
>
> Steve Taylor | Senior Software Engineer | StorageCraft Technology
> Corporation <https://storagecraft.com>
> 380 Data Drive Suite 300 | Draper | Utah | 84020
> Office: 801.871.2799
>
> On Fri, 2019-01-04 at 07:26 -0700, Randall Smith wrote:
>
> Greetings,
>
> I'm upgrading my cluster from luminous to mimic. I've upgraded my
> monitors and am attempting to upgrade the mgrs.
> Unfortunately, after an upgrade the mgr daemon exits immediately with
> error code 1.
>
> I've tried running ceph-mgr in debug mode to try to see what's
> happening, but the output (below) is a bit cryptic for me. It looks
> like authentication might be failing, but it was working prior to the
> upgrade.
>
> I do have "auth supported = cephx" in the global section of ceph.conf.
>
> What do I need to do to fix this?
>
> Thanks.
>
> /usr/bin/ceph-mgr -f --cluster ceph --id 8 --setuser ceph --setgroup ceph -d --debug_ms 5
>
> 2019-01-04 07:01:38.457 7f808f83f700 2 Event(0x30c42c0 nevent=5000 time_id=1).set_owner idx=0 owner=140190140331776
> 2019-01-04 07:01:38.457 7f808f03e700 2 Event(0x30c4500 nevent=5000 time_id=1).set_owner idx=1 owner=140190131939072
> 2019-01-04 07:01:38.457 7f808e83d700 2 Event(0x30c4e00 nevent=5000 time_id=1).set_owner idx=2 owner=140190123546368
> 2019-01-04 07:01:38.457 7f809dd5b380 1 Processor -- start
> 2019-01-04 07:01:38.477 7f809dd5b380 1 -- - start start
> 2019-01-04 07:01:38.481 7f809dd5b380 1 -- - --> 192.168.253.147:6789/0 -- auth(proto 0 26 bytes epoch 0) v1 -- 0x32a6780 con 0
> 2019-01-04 07:01:38.481 7f809dd5b380 1 -- - --> 192.168.253.148:6789/0 -- auth(proto 0 26 bytes epoch 0) v1 -- 0x32a6a00 con 0
> 2019-01-04 07:01:38.481 7f808e83d700 1 -- 192.168.253.148:0/1359135487 learned_addr learned my addr 192.168.253.148:0/1359135487
> 2019-01-04 07:01:38.481 7f808e83d700 2 -- 192.168.253.148:0/1359135487 >> 192.168.253.148:6789/0 conn(0x332d500 :-1 s=STATE_CONNECTING_WAIT_ACK_SEQ pgs=0 cs=0 l=0)._process_connection got newly_acked_seq 0 vs out_seq 0
> 2019-01-04 07:01:38.481 7f808f03e700 2 -- 192.168.253.148:0/1359135487 >> 192.168.253.147:6789/0 conn(0x332ce00 :-1 s=STATE_CONNECTING_WAIT_ACK_SEQ pgs=0 cs=0 l=0)._process_connection got newly_acked_seq 0 vs out_seq 0
> 2019-01-04 07:01:38.481 7f808f03e700 5 -- 192.168.253.148:0/1359135487 >> 192.168.253.147:6789/0 conn(0x332ce00 :-1
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=74172 cs=1 l=1). rx mon.1 seq 1 0x30c5440 mon_map magic: 0 v1
> 2019-01-04 07:01:38.481 7f808e83d700 5 -- 192.168.253.148:0/1359135487 >> 192.168.253.148:6789/0 conn(0x332d500 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=74275 cs=1 l=1). rx mon.2 seq 1 0x30c5680 mon_map magic: 0 v1
> 2019-01-04 07:01:38.481 7f808f03e700 5 -- 192.168.253.148:0/1359135487 >> 192.168.253.147:6789/0 conn(0x332ce00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=74172 cs=1 l=1). rx mon.1 seq 2 0x32a6780 auth_reply(proto 2 0 (0) Success) v1
> 2019-01-04 07:01:38.481 7f808e83d700 5 -- 192.168.253.148:0/1359135487 >> 192.168.253.148:6789/0 conn(0x332d500 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=74275 cs=1 l=1). rx mon.2 seq 2 0x32a6a00 auth_reply(proto 2 0 (0) Success) v1
> 2019-01-04 07:01:38.481 7f808e03c700 1 -- 192.168.253.148:0/1359135487 <== mon.1 192.168.253.147:6789/0 1 ==== mon_map magic: 0 v1 ==== 370+0+0 (3034216899 0 0) 0x30c5440 con 0x332ce00
> 2019-01-04 07:01:38.481 7f808e03c700 1 -- 192.168.253.148:0/1359135487 <== mon.2 192.168.253.148:6789/0 1 ==== mon_map magic: 0 v1 ==== 370+0+0 (3034216899 0 0) 0x30c5680 con 0x332d500
> 2019-01-04 07:01:38.481 7f808e03c700 1 -- 192.168.253.148:0/1359135487 <== mon.1 192.168.253.147:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (3430158761 0 0) 0x32a6780 con 0x332ce00
> 2019-01-04 07:01:38.481 7f808e03c700 1 -- 192.168.253.148:0/1359135487 --> 192.168.253.147:6789/0 -- auth(proto 2 2 bytes epoch 0) v1 -- 0x32a6f00 con 0
> 2019-01-04 07:01:38.481 7f808e03c700 1 -- 192.168.253.148:0/1359135487 <== mon.2 192.168.253.148:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (3242503871 0 0) 0x32a6a00 con 0x332d500
> 2019-01-04 07:01:38.481 7f808e03c700 1 -- 192.168.253.148:0/1359135487 --> 192.168.253.148:6789/0 -- auth(proto 2 2 bytes epoch 0) v1 -- 0x32a6780 con 0
> 2019-01-04
07:01:38.481 7f808f03e700 5 -- 192.168.253.148:0/1359135487 >> 192.168.253.147:6789/0 conn(0x332ce00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=74172 cs=1 l=1). rx mon.1 seq 3 0x32a6f00 auth_reply(proto 2 -22 (22) Invalid argument) v1
> 2019-01-04 07:01:38.481 7f808e03c700 1 -- 192.168.253.148:0/1359135487 <== mon.1 192.168.253.147:6789/0 3 ==== auth_reply(proto 2 -22 (22) Invalid argument) v1 ==== 24+0+0 (882932531 0 0) 0x32a6f00 con 0x332ce00
> 2019-01-04 07:01:38.481 7f808e03c700 1 -- 192.168.253.148:0/1359135487 >> 192.168.253.147:6789/0 conn(0x332ce00 :-1 s=STATE_OPEN pgs=74172 cs=1 l=1).mark_down
> 2019-01-04 07:01:38.481 7f808e03c700 2 -- 192.168.253.148:0/1359135487 >> 192.168.253.147:6789/0 conn(0x332ce00 :-1 s=STATE_OPEN pgs=74172 cs=1 l=1)._stop
> 2019-01-04 07:01:38.481 7f808e83d700 5 -- 192.168.253.148:0/1359135487 >> 192.168.253.148:6789/0 conn(0x332d500 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=74275 cs=1 l=1). rx mon.2 seq 3 0x32a6780 auth_reply(proto 2 -22 (22) Invalid argument) v1
> 2019-01-04 07:01:38.481 7f808e03c700 1 -- 192.168.253.148:0/1359135487 <== mon.2 192.168.253.148:6789/0 3 ==== auth_reply(proto 2 -22 (22) Invalid argument) v1 ==== 24+0+0 (1359424806 0 0) 0x32a6780 con 0x332d500
> 2019-01-04 07:01:38.481 7f808e03c700 1 -- 192.168.253.148:0/1359135487 >> 192.168.253.148:6789/0 conn(0x332d500 :-1 s=STATE_OPEN pgs=74275 cs=1 l=1).mark_down
> 2019-01-04 07:01:38.481 7f808e03c700 2 -- 192.168.253.148:0/1359135487 >> 192.168.253.148:6789/0 conn(0x332d500 :-1 s=STATE_OPEN pgs=74275 cs=1 l=1)._stop
> 2019-01-04 07:01:38.481 7f809dd5b380 1 -- 192.168.253.148:0/1359135487 shutdown_connections
> 2019-01-04 07:01:38.481 7f809dd5b380 5 -- 192.168.253.148:0/1359135487 shutdown_connections mark down 192.168.253.148:6789/0 0x332d500
> 2019-01-04 07:01:38.481 7f809dd5b380 5 -- 192.168.253.148:0/1359135487 shutdown_connections mark down 192.168.253.147:6789/0 0x332ce00
> 2019-01-04 07:01:38.481 7f809dd5b380 5 -- 192.168.253.148:0/1359135487 shutdown_connections delete 0x332ce00
> 2019-01-04 07:01:38.481 7f809dd5b380 5 -- 192.168.253.148:0/1359135487 shutdown_connections delete 0x332d500
> 2019-01-04 07:01:38.485 7f809dd5b380 1 -- 192.168.253.148:0/1359135487 shutdown_connections
> 2019-01-04 07:01:38.485 7f809dd5b380 1 -- 192.168.253.148:0/1359135487 wait complete.
> 2019-01-04 07:01:38.485 7f809dd5b380 1 -- 192.168.253.148:0/1359135487 >> 192.168.253.148:0/1359135487 conn(0x332c000 :-1 s=STATE_NONE pgs=0 cs=0 l=0).mark_down
> 2019-01-04 07:01:38.485 7f809dd5b380 2 -- 192.168.253.148:0/1359135487 >> 192.168.253.148:0/1359135487 conn(0x332c000 :-1 s=STATE_NONE pgs=0 cs=0 l=0)._stop
> failed to fetch mon config (--no-mon-config to skip)
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Randall Smith
Computing Services
Adams State University
http://www.adams.edu/
719-587-7741