What's strange is that OSD rebalancing clearly works; the only problem is
that new objects can't be written, because the new segments can't be
distributed to the new OSDs.

Here is the error from radosgw.log:

2014-06-17 10:34:01.568754 7fc7e83f4700  0 cephx: verify_reply couldn't
decrypt with error: error decoding block for decryption
2014-06-17 10:34:01.568763 7fc7e83f4700  0 -- 172.17.9.218:0/1034041 >>
10.122.134.204:6820/14745 pipe(0x1e24710 sd=11 :54045 s=1 pgs=0 cs=0 l=1
c=0x1e23db0).failed verifying authorize reply

So it appears the OSDs can authenticate with each other, but the key
generated between the client and the mon is visible only to the existing
OSDs, not to the newly added ones?
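
One way to rule out a plain key mismatch is to compare the key the monitors
hold for an OSD with the key in that OSD's on-disk keyring. The following is
only a sketch: the helper names keyring_key and keys_match are made up for
illustration, and the keyring path in the usage comment assumes the default
/var/lib/ceph/osd/ceph-<id>/keyring layout.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of Ceph.

# Print the value of the first "key = ..." line in a keyring file.
keyring_key() {
    # Drop the leading "key = " part and print what remains of that line.
    sed -n 's/^[[:space:]]*key[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Succeed only if the expected key ($1) is non-empty and equals the key
# stored in the keyring file ($2).
keys_match() {
    [ -n "$1" ] && [ "$1" = "$(keyring_key "$2")" ]
}

# Usage on an OSD node (assumes osd.45 and the default keyring path):
#   keys_match "$(ceph auth get-key osd.45)" /var/lib/ceph/osd/ceph-45/keyring \
#       && echo "osd.45: keys match" || echo "osd.45: keys differ"
```

A match only shows that the stored keys agree; it does not by itself explain
the verify_reply failure.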

I'm trying to increase the cephx debug level on the mon, but the command
seems to hang:

# ceph tell mon.* injectarts '--debug-auth=5'
no valid command found; 10 closest matches:
config-key exists <key>
config-key list
config-key put <key> {<val>}
config-key del <key>
osd tier remove-overlay <poolname>
config-key get <key>
osd tier cache-mode <poolname> none|writeback|invalidate+forward|readonly
osd tier set-overlay <poolname> <poolname>
mon remove <name>
osd tier remove <poolname> <poolname>
mon.nysanlab04: Error EINVAL: invalid command
mon.nysanlab04: invalid command
2014-06-17 11:08:20.995510 7ffcd078b700  0 -- 172.17.9.218:0/1001296 >>
172.17.9.219:6789/0 pipe(0x7ffccc021fe0 sd=4 :0 s=1 pgs=0 cs=0 l=1
c=0x7ffccc022240).fault

Am I using the wrong syntax to increase the debug level for auth on the mon?
Or is something wrong with the cluster?
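
For what it's worth, the command as typed above says injectarts, while the
subcommand is spelled injectargs, which would account for the "no valid
command found" reply. A minimal sketch of the corrected call follows. The
wrapper name set_mon_auth_debug and the CEPH_BIN variable are hypothetical,
added only so the command can be dry-run, and the sketch does not address the
pipe fault line at the end of the output.

```shell
#!/bin/sh
# Hypothetical wrapper for illustration; it is not part of Ceph.
# CEPH_BIN defaults to the real CLI; set CEPH_BIN=echo to print the
# arguments instead of sending them to a cluster.
set_mon_auth_debug() {
    mon_id=$1    # monitor id, e.g. nysanlab04
    level=$2     # debug level, e.g. 5
    "${CEPH_BIN:-ceph}" tell "mon.${mon_id}" injectargs "--debug-auth ${level}"
}

# Usage: set_mon_auth_debug nysanlab04 5
```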


On Mon, Jun 16, 2014 at 5:56 PM, John Wilkins <john.wilk...@inktank.com>
wrote:

> Did you run ceph-deploy in the directory where you ran ceph-deploy new and
> ceph-deploy gatherkeys? That's where the monitor bootstrap key should be.
>
>
> On Mon, Jun 16, 2014 at 8:49 AM, Fred Yang <frederic.y...@gmail.com>
> wrote:
>
>> I'm adding three OSD nodes (36 OSDs in total) to an existing 3-node
>> cluster (35 OSDs) using ceph-deploy. After the disks were prepared and the
>> OSDs activated, the cluster rebalanced and shows all pgs active+clean:
>>
>>      osdmap e820: 72 osds: 71 up, 71 in
>>       pgmap v173328: 15920 pgs, 17 pools, 12538 MB data, 3903 objects
>>             30081 MB used, 39631 GB / 39660 GB avail
>>                15920 active+clean
>>
>> However, object writes started having issues once the new OSDs were added
>> to the cluster:
>>
>> 2014-06-16 11:36:36.421868 osd.35 [WRN] slow request 30.317529 seconds
>> old, received at 2014-06-16 11:36:06.104256: osd_op(client.5568.0:1502400
>> default.5250.4_loadtest/512B_file [getxattrs,stat] 9.552a7900 e820) v4
>> currently waiting for rw locks
>>
>> And from the log of an existing OSD, it seems to have a problem
>> authenticating the new OSDs (10.122.134.204 is the IP of one of the new
>> OSD nodes):
>>
>> 2014-06-16 11:38:25.281270 7f58562ce700  0 cephx: verify_reply couldn't
>> decrypt with error: error decoding block for decryption
>> 2014-06-16 11:38:25.281288 7f58562ce700  0 -- 172.17.9.218:6811/2047255
>> >> 10.122.134.204:6831/17571 pipe(0x2891280 sd=90 :48493 s=1 pgs=3091
>> cs=10 l=0 c=0x62d1840).failed verifying authorize reply
>>
>>
>> The cephx auth list looks good to me:
>>
>> exported keyring for osd.45
>> [osd.45]
>>         key = AQAoCp5TqBq/MhAANwclbs1nCgefNfxqqPnkZQ==
>>         caps mon = "allow profile osd"
>>         caps osd = "allow *"
>>
>> The key above matches the keyring on osd.45.
>>
>> Does anybody have any idea what the authentication issue might be? I'm
>> running Ceph 0.72.2.
>>
>> Thanks in advance,
>> Fred
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
>
> --
> John Wilkins
> Senior Technical Writer
> Inktank
> john.wilk...@inktank.com
> (415) 425-9599
> http://inktank.com
>
