ceph-post-file: 60e07a0c-ee5b-4174-9f51-fa091d5662dc

On Wed, Apr 3, 2019 at 5:30 PM Shawn Edwards <lesser.e...@gmail.com> wrote:

> According to ceph versions, all bits are running 14.2.0
>
> I have restarted all of the OSDs at least twice and am still getting the
> same error.
>
> I'll send a log file with confirmed interesting bad behavior shortly.
>
> On Wed, Apr 3, 2019, 17:17 Sage Weil <s...@newdream.net> wrote:
>
>> 2019-04-03 15:04:01.986 7ffae5778700 10 --1- v1:10.36.9.46:6813/5003637 >> v1:10.36.9.28:6809/8224 conn(0xf6a6000 0x30a02000 :6813 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2 authorizor_protocol 2 len 174
>> 2019-04-03 15:04:01.986 7ffae5778700 20 AuthRegistry(0xcd64a40) get_handler peer_type 4 method 2 cluster_methods [2] service_methods [2] client_methods [2]
>> 2019-04-03 15:04:01.986 7ffae5778700 10 cephx: verify_authorizer decrypted service osd secret_id=41686
>> 2019-04-03 15:04:01.986 7ffae5778700  0 auth: could not find secret_id=41686
>> 2019-04-03 15:04:01.986 7ffae5778700 10 auth: dump_rotating:
>> 2019-04-03 15:04:01.986 7ffae5778700 10 auth:  id 41691 ... expires 2019-04-03 14:43:07.042860
>> 2019-04-03 15:04:01.986 7ffae5778700 10 auth:  id 41692 ... expires 2019-04-03 15:43:09.895511
>> 2019-04-03 15:04:01.986 7ffae5778700 10 auth:  id 41693 ... expires 2019-04-03 16:43:09.895511
>> 2019-04-03 15:04:01.986 7ffae5778700  0 cephx: verify_authorizer could not get service secret for service osd secret_id=41686
>> 2019-04-03 15:04:01.986 7ffae5778700  0 --1- v1:10.36.9.46:6813/5003637 >> v1:10.36.9.28:6809/8224 conn(0xf6a6000 0x30a02000 :6813 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2: got bad authorizer, auth_reply_len=0
>>
>> For some reason this OSD has much newer rotating keys than the
>> connecting OSD.  But earlier in the day, this osd was the one
>> getting BADAUTHORIZER, so maybe that shifted.  Can you find an OSD where
>> you still see BADAUTHORIZER appearing in the log?
>>
>> My guess is that if you restart the OSDs, they'll get fresh rotating keys
>> and things will be fine.  But that doesn't explain why they're not
>> renewing on their own right now.. that I'm not so sure about.
>>
>> Are your mons all running nautilus?  Does 'ceph versions' show everything
>> has upgraded?
>>
>> sage
>>
>>
>> On Wed, 3 Apr 2019, Shawn Edwards wrote:
>>
>> > File uploaded: f1a2bfb3-92b4-495c-8706-f99cb228efc7
>> >
>> > On Wed, Apr 3, 2019 at 4:57 PM Sage Weil <s...@newdream.net> wrote:
>> >
>> > > Hmm, that doesn't help.
>> > >
>> > > Can you set
>> > >
>> > >  ceph config set osd debug_ms 20
>> > >  ceph config set osd debug_auth 20
>> > >  ceph config set osd debug_monc 20
>> > >
>> > > for a few minutes and ceph-post-file the osd logs?  (Or send a private
>> > > email with a link or something.)
>> > >
>> > > Thanks!
>> > > sage
>> > >
>> > >
>> > > On Wed, 3 Apr 2019, Shawn Edwards wrote:
>> > >
>> > > > No strange auth config:
>> > > >
>> > > > root@tyr-ceph-mon0:~# ceph config dump | grep -E '(auth|cephx)'
>> > > > global        advanced auth_client_required               cephx  *
>> > > > global        advanced auth_cluster_required              cephx  *
>> > > > global        advanced auth_service_required              cephx  *
>> > > >
>> > > > All boxes are using 'minimal' ceph.conf files and centralized config.
>> > > >
>> > > > If you need the full config, it's here:
>> > > > https://gist.github.com/lesserevil/3b82d37e517f4561ce53c81629717aae
>> > > >
>> > > > On Wed, Apr 3, 2019 at 4:07 PM Sage Weil <s...@newdream.net> wrote:
>> > > >
>> > > > > On Wed, 3 Apr 2019, Shawn Edwards wrote:
>> > > > > > Recent nautilus upgrade from mimic.  No issues on mimic.
>> > > > > >
>> > > > > > Now getting this or similar in all osd logs, there is very little osd
>> > > > > > communication, and most of the PGs are either 'down' or 'unknown', even
>> > > > > > though I can see the data on the filestores.
>> > > > > >
>> > > > > > 2019-04-03 13:47:55.280 7f13346e3700  0 --1- [v2:10.36.9.26:6802/3107,v1:10.36.9.26:6803/3107] >> v1:10.36.9.37:6821/8825 conn(0xa7132000 0xa6b28000 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=0).handle_connect_reply_2 connect got BADAUTHORIZER
>> > > > > > 2019-04-03 13:47:55.296 7f1333ee2700  0 --1- [v2:10.36.9.26:6802/3107,v1:10.36.9.26:6803/3107] >> v1:10.36.9.37:6841/11204 conn(0xa9826d00 0xa9b78000 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=0).handle_connect_reply_2 connect got BADAUTHORIZER
>> > > > > > 2019-04-03 13:47:55.340 7f13346e3700  0 --1- [v2:10.36.9.26:6802/3107,v1:10.36.9.26:6803/3107] >> v1:10.36.9.37:6829/8425 conn(0xa7997180 0xaeb22800 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=0).handle_connect_reply_2 connect got BADAUTHORIZER
>> > > > > > 2019-04-03 13:47:55.428 7f1334ee4700  0 auth: could not find secret_id=41687
>> > > > > > 2019-04-03 13:47:55.428 7f1334ee4700  0 cephx: verify_authorizer could not get service secret for service osd secret_id=41687
>> > > > > > 2019-04-03 13:47:55.428 7f1334ee4700  0 --1- [v2:10.36.9.26:6802/3107,v1:10.36.9.26:6803/3107] >> v1:10.36.9.48:6805/49547 conn(0xe02f24480 0xe088cb800 :6803 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2: got bad authorizer, auth_reply_len=0
>> > > > > >
>> > > > > > Thoughts?  I have confirmed that all ceph boxes have good time sync.
>> > > > >
>> > > > > Do you have any non-default auth-related settings in ceph.conf?
>> > > > >
>> > > > > sage
>> > > > >
>> > > >
>> > > >
>> > > > --
>> > > >  Shawn Edwards
>> > > >  Beware programmers with screwdrivers.  They tend to spill them on their
>> > > > keyboards.
>> > > >
>> > >
>> >
>> >
>> > --
>> >  Shawn Edwards
>> >  Beware programmers with screwdrivers.  They tend to spill them on their
>> > keyboards.
>> >
>>
>
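
For anyone comparing key ids across OSDs, the check described above (which
secret_id the peer presented versus which rotating keys the accepting OSD
holds) can be sketched in a few lines of Python. This is only a sketch under
assumptions: the regexes are modeled on the unwrapped debug_auth 20 lines
quoted in this thread and may not match other releases, and
summarize_rotating_keys and SAMPLE are hypothetical names, not part of Ceph.

```python
import re
from datetime import datetime

# Patterns modeled on the log excerpts quoted in this thread (assumption:
# one log entry per line, i.e. before mail clients re-wrapped them).
MISSING_RE = re.compile(r"auth: could not find secret_id=(\d+)")
ROTATING_RE = re.compile(
    r"auth:\s+id (\d+) \.\.\. expires "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)"
)


def summarize_rotating_keys(lines):
    """Return (missing, held): ids the OSD could not find, and a dict
    mapping each rotating key id it holds to its expiry time."""
    missing, held = set(), {}
    for line in lines:
        m = MISSING_RE.search(line)
        if m:
            missing.add(int(m.group(1)))
            continue
        m = ROTATING_RE.search(line)
        if m:
            held[int(m.group(1))] = datetime.strptime(
                m.group(2), "%Y-%m-%d %H:%M:%S.%f"
            )
    return missing, held


# Sample input: entries copied from the log quoted in this thread.
SAMPLE = [
    "2019-04-03 15:04:01.986 7ffae5778700  0 auth: could not find secret_id=41686",
    "2019-04-03 15:04:01.986 7ffae5778700 10 auth: dump_rotating:",
    "2019-04-03 15:04:01.986 7ffae5778700 10 auth:  id 41691 ... expires 2019-04-03 14:43:07.042860",
    "2019-04-03 15:04:01.986 7ffae5778700 10 auth:  id 41692 ... expires 2019-04-03 15:43:09.895511",
    "2019-04-03 15:04:01.986 7ffae5778700 10 auth:  id 41693 ... expires 2019-04-03 16:43:09.895511",
]

missing, held = summarize_rotating_keys(SAMPLE)
for secret_id in sorted(missing):
    if held:
        # Positive gap: the peer presented an id older than any key held here.
        gap = min(held) - secret_id
        print(f"secret_id {secret_id} not held; oldest held id is {min(held)} (gap {gap})")
    else:
        print(f"secret_id {secret_id} not held; no rotating keys seen in log")
```

On the sample lines the gap is 5: the connecting OSD presented an id five
behind the oldest key the accepting OSD holds. Running the same check
against logs from both ends of a failing connection should show which side
is the stale one.
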

-- 
 Shawn Edwards
 Beware programmers with screwdrivers.  They tend to spill them on their
keyboards.
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
