[ceph-users] cephx: verify_reply couldn't decrypt with error (failed verifying authorize reply)

2015-03-24 Thread Erming Pei

Hi Experts,

  After initially deploying Ceph with 3 OSDs, I am now facing an issue:
  it reports healthy but sometimes (or often) fails to access the pools,
while at other times it comes back to normal on its own.


  For example:

[ceph@gcloudcon ceph-cluster]$ rados -p volumes ls
2015-03-24 11:44:17.262941 7f3d6bfff700  0 cephx: verify_reply couldn't 
decrypt with error: error decoding block for decryption
2015-03-24 11:44:17.262951 7f3d6bfff700  0 -- 206.12.25.25:0/1004580 >> 
206.12.25.27:6800/802 pipe(0x26d7fe0 sd=4 :55582 s=1 pgs=0 cs=0 l=1 
c=0x26d8270).failed verifying authorize reply
2015-03-24 11:44:17.262999 7f3d6bfff700  0 -- 206.12.25.25:0/1004580 >> 
206.12.25.27:6800/802 pipe(0x26d7fe0 sd=4 :55582 s=1 pgs=0 cs=0 l=1 
c=0x26d8270).fault
2015-03-24 11:44:17.263637 7f3d6bfff700  0 cephx: verify_reply couldn't 
decrypt with error: error decoding block for decryption
2015-03-24 11:44:17.263645 7f3d6bfff700  0 -- 206.12.25.25:0/1004580 >> 
206.12.25.27:6800/802 pipe(0x26d7fe0 sd=4 :55583 s=1 pgs=0 cs=0 l=1 
c=0x26d8270).failed verifying authorize reply
2015-03-24 11:44:17.464379 7f3d6bfff700  0 cephx: verify_reply couldn't 
decrypt with error: error decoding block for decryption
2015-03-24 11:44:17.464388 7f3d6bfff700  0 -- 206.12.25.25:0/1004580 >> 
206.12.25.27:6800/802 pipe(0x26d7fe0 sd=4 :55584 s=1 pgs=0 cs=0 l=1 
c=0x26d8270).failed verifying authorize reply
2015-03-24 11:44:17.865222 7f3d6bfff700  0 cephx: verify_reply couldn't 
decrypt with error: error decoding block for decryption
2015-03-24 11:44:17.865245 7f3d6bfff700  0 -- 206.12.25.25:0/1004580 >> 
206.12.25.27:6800/802 pipe(0x26d7fe0 sd=4 :55585 s=1 pgs=0 cs=0 l=1 
c=0x26d8270).failed verifying authorize reply
2015-03-24 11:44:18.666056 7f3d6bfff700  0 cephx: verify_reply couldn't 
decrypt with error: error decoding block for decryption
2015-03-24 11:44:18.666077 7f3d6bfff700  0 -- 206.12.25.25:0/1004580 >> 
206.12.25.27:6800/802 pipe(0x26d7fe0 sd=4 :55587 s=1 pgs=0 cs=0 l=1 
c=0x26d8270).failed verifying authorize reply



[ceph@gcloudcon ceph-cluster]$ ceph auth list
installed auth entries:

mds.gcloudnet
key: xxx
caps: [mds] allow
caps: [mon] allow profile mds
caps: [osd] allow rwx
osd.0
key: xxx
caps: [mon] allow profile osd
caps: [osd] allow *
osd.1
key: xxx
caps: [mon] allow profile osd
caps: [osd] allow *
osd.2
key: xxx
caps: [mon] allow profile osd
caps: [osd] allow *
client.admin
key: xxx
caps: [mds] allow
caps: [mon] allow *
caps: [osd] allow *
client.backups
key: xxx
caps: [mon] allow r
caps: [osd] allow class-read object_prefix rbd_children, allow rwx 
pool=backups

client.bootstrap-mds
key: xxx
caps: [mon] allow profile bootstrap-mds
client.bootstrap-osd
key: xxx
caps: [mon] allow profile bootstrap-osd
client.images
key: xxx
caps: [mon] allow r
caps: [osd] allow class-read object_prefix rbd_children, allow rwx 
pool=images

client.libvirt
key: xxx
caps: [mon] allow r
caps: [osd] allow class-read object_prefix rbd_children, allow rwx 
pool=libvirt-pool

client.volumes
key: xxx
caps: [mon] allow r
caps: [osd] allow class-read object_prefix rbd_children, allow rwx 
pool=volumes, allow rx pool=images





[root@gcloudcon ~]# more /etc/ceph/ceph.conf
[global]
auth_service_required = cephx
osd_pool_default_size = 2
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 206.12.25.26
public_network = 206.12.25.0/16
mon_initial_members = gcloudnet
cluster_network = 192.168.10.0/16
fsid = xx

[client.images]
keyring = /etc/ceph/ceph.client.images.keyring

[client.volumes]
keyring = /etc/ceph/ceph.client.volumes.keyring

[client.backups]
keyring = /etc/ceph/ceph.client.backups.keyring



[ceph@gcloudcon ceph-cluster]$ ceph -w
cluster a4d0879f-abdc-4f9d-8a4b-53ce57d822f1
 health HEALTH_OK
 monmap e1: 1 mons at {gcloudnet=206.12.25.26:6789/0}, election 
epoch 1, quorum 0 gcloudnet

 osdmap e27: 3 osds: 3 up, 3 in
  pgmap v1894: 704 pgs, 6 pools, 1640 MB data, 231 objects
18757 MB used, 22331 GB / 22350 GB avail
 704 active+clean

2015-03-24 17:56:20.884293 mon.0 [INF] from='client.? 
206.12.25.25:0/1006501' entity='client.admin' cmd=[{"prefix": "auth 
list"}]: dispatch



Can anybody give me a hint on what I should check?
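[Editor's note] verify_reply "couldn't decrypt" failures typically come from one of two things: the key the client presents no longer matches what the cluster holds for that entity (a stale local keyring, e.g. after re-running ceph-deploy), or clock drift between nodes beyond what cephx tickets tolerate, so checking NTP on every host is also worth doing. A minimal sketch of the keyring check, with fabricated file contents; on a real cluster the "cluster side" copy would come from `ceph auth export client.admin -o /tmp/cluster.keyring`:

```shell
#!/bin/sh
# Illustrative only: compare the key a client holds locally against the
# key the cluster has for the same entity. Both files are fabricated
# here so the comparison logic can be demonstrated.
local_keyring=/tmp/local.keyring
cluster_keyring=/tmp/cluster.keyring

printf '[client.admin]\n\tkey = AQAexamplelocal==\n'   > "$local_keyring"
printf '[client.admin]\n\tkey = AQAexamplecluster==\n' > "$cluster_keyring"

# Pull out the value after "key = " in a keyring section.
extract_key() { awk -F' = ' '/key = / { print $2 }' "$1"; }

if [ "$(extract_key "$local_keyring")" = "$(extract_key "$cluster_keyring")" ]; then
    echo "keys match"
else
    echo "KEY MISMATCH: client-side keyring is stale"
fi
```

If the keys differ, replacing the local keyring with the exported one and retrying the `rados` command would be the next step.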


Thanks,

--

 Erming Pei,  Senior System Analyst

 Information Services & Technology
 University of Alberta, Canada

 Tel: 780-492-9914   Fax: 780-492-1729


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cephfs read-only setting doesn't work?

2015-09-01 Thread Erming Pei

Hi,

  I tried to set up a read-only permission for a client, but the mount 
always appears writable.


  I did the following:

==Server end==

[client.cephfs_data_ro]
key = AQxx==
caps mon = "allow r"
caps osd = "allow r pool=cephfs_data, allow r pool=cephfs_metadata"


==Client end==
mount -v -t ceph hostname.domainname:6789:/ /cephfs -o 
name=cephfs_data_ro,secret=AQxx==


But I can still touch, delete, and overwrite files.

I read that touch/delete could be metadata-only operations, but why can 
I still overwrite?


Is there any way I could test/check the data pool (rather than the 
metadata) to see whether the writes actually reach it?



Erming






Re: [ceph-users] cephfs read-only setting doesn't work?

2015-09-02 Thread Erming Pei

On 9/2/15, 9:31 AM, Gregory Farnum wrote:

[ Re-adding the list. ]

On Wed, Sep 2, 2015 at 4:29 PM, Erming Pei  wrote:

Hi Gregory,

Thanks very much for the confirmation and explanation.


And I presume you have an MDS cap in there as well?

   Is there a difference between setting this cap and not setting it?

Well, I don't think you can access the MDS without a read cap, but
maybe it's really just null...


  I asked because I don't see any difference when operating on files.



I think you'll find that the data you've overwritten isn't really written
to the OSDs — you wrote it in the local page cache, but the OSDs will reject
the writes with EPERM.

I see. Is there a way for me to verify that, i.e., confirm there is no
change to the data on the OSDs? I found I can overwrite a file and then
see that the file has changed. That may just be the local cache, but how
can I test this and retrieve the copy stored in the OSD pool?

Mounting it on another client and seeing if changes are reflected
there would do it. Or unmounting the filesystem, mounting again, and
seeing if the file has really changed.
-Greg


Good idea.

Thank you Gregory.

Erming




On 9/2/15, 2:44 AM, Gregory Farnum wrote:

On Tue, Sep 1, 2015 at 9:20 PM, Erming Pei  wrote:

Hi,

I tried to set up a read-only permission for a client but it looks
always
writable.

I did the following:

==Server end==

[client.cephfs_data_ro]
  key = AQxx==
  caps mon = "allow r"
  caps osd = "allow r pool=cephfs_data, allow r
pool=cephfs_metadata"

The clients don't directly access the metadata pool at all so you
don't need to grant that. :) And I presume you have an MDS cap in
there as well?


==Client end==
mount -v -t ceph hostname.domainname:6789:/ /cephfs -o
name=cephfs_data_ro,secret=AQxx==

But I still can touch, delete, overwrite.

I read that touch/delete could be only meta data operations, but why I
still
can overwrite?

Is there anyway I could test/check the data pool (instead of meta data)
to
see if any effect on it?

What you're seeing here is an unfortunate artifact of the page cache
and the way these user capabilities work in Ceph. As you surmise,
touch/delete are metadata operations through the MDS and in current
code you can't block the client off from that (although we have work
in progress to improve things). I think you'll find that the data
you've overwritten isn't really written to the OSDs — you wrote it in
the local page cache, but the OSDs will reject the writes with EPERM.
I don't remember the kernel's exact behavior here though — we updated
the userspace client to preemptively check access permissions on new
pools but I don't think the kernel ever got that. Zheng?
-Greg
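[Editor's note] Greg's point can be made concrete from the cap string quoted in this thread: the OSD cap grants only "r" on both pools, so buffered writes are rejected with EPERM when the page cache is flushed, even though the MDS happily performed the metadata operations. A small illustrative parser over that exact string:

```shell
#!/bin/sh
# Purely illustrative string munging: split the OSD cap string from this
# thread on commas and print the access mode granted per pool.
caps='allow r pool=cephfs_data, allow r pool=cephfs_metadata'

echo "$caps" | tr ',' '\n' | while read -r _allow mode pool; do
    echo "${pool#pool=}: $mode"
done
# Prints:
#   cephfs_data: r
#   cephfs_metadata: r
```

Neither entry contains "w" (or "x" for class-write), which is why the OSDs refuse the data writes at flush time.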









[ceph-users] mds issue

2015-10-14 Thread Erming Pei

Hi,

   After I set up more than one MDS server, client access sometimes gets 
stuck or becomes slow. When I tried stopping one MDS, the client end 
hung.


   I accidentally set 'bal frag = true'; not sure if it matters. I later 
disabled this feature.


   Is there any explanation for the above issue? What can be done to 
check or tune MDS performance?


  Can I just reduce the number of MDSes on the fly?

Thanks,

Erming








[ceph-users] CephFS namespace

2015-10-19 Thread Erming Pei

Hi,

   Is there a way to list the namespaces in CephFS? How does one set 
them up?

   From the man page of mount.ceph, I see this:

To mount only part of the namespace:

  mount.ceph monhost1:/some/small/thing /mnt/thing

  But how do I know the namespaces in the first place?

Thanks,

Erming





Re: [ceph-users] CephFS namespace

2015-10-19 Thread Erming Pei

I see. That's also what I needed.
Thanks.

Can we allow only a part of the 'namespace' (directory tree) to be 
mounted, enforced from the server end, just like NFS exporting?

And set permissions that way as well?

Erming




On 10/19/15, 4:07 PM, Gregory Farnum wrote:

On Mon, Oct 19, 2015 at 3:06 PM, Erming Pei  wrote:

Hi,

Is there a way to list the namespaces in cephfs? How to set it up?

From man page of ceph.mount, I see this:

To mount only part of the namespace:

   mount.ceph monhost1:/some/small/thing /mnt/thing

   But how to know the namespaces at first?

"Namespace" here means "directory tree" or "folder hierarchy".
-Greg





[ceph-users] cephfs best practice

2015-10-21 Thread Erming Pei

Hi,

  I am just wondering which approach is better within one single file 
system: set up one data pool for each project, or let projects share 
one big pool?



Thanks,
Erming




[ceph-users] Increased pg_num and pgp_num

2015-11-04 Thread Erming Pei

Hi,

  I found that pg_num and pgp_num for the metadata pool were too small, 
so I increased them.

  Then I got "300 pgs stuck unclean".

  $ ceph -s
    cluster a4d0879f-abdc-4f9d-8a4b-53ce57d822f1
     health HEALTH_WARN 248 pgs backfill; 52 pgs backfilling; 300 pgs 
stuck unclean; recovery 58417161/113290060 objects misplaced (51.564%); 
mds0: Client physics-007:Physics01_data failing to respond to cache pressure
Is it critical?
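[Editor's note] This warning is usually not critical in itself: splitting PGs forces data movement, and the "stuck unclean" count should fall to zero as backfill completes. For choosing the new value, a widely used heuristic (a rule of thumb, not official policy) is sketched below; the OSD count and replica size are assumptions taken from elsewhere in these threads:

```shell
#!/bin/sh
# Rule-of-thumb PG sizing: total PGs per pool ~ (OSDs * 100) / replicas,
# rounded up to a power of two. Values below are illustrative assumptions.
osds=18
replicas=2
target=$(( osds * 100 / replicas ))   # 900

pg_num=1
while [ "$pg_num" -lt "$target" ]; do
    pg_num=$(( pg_num * 2 ))
done
echo "suggested pg_num: $pg_num"   # prints 1024
```

With many pools, the per-pool number should be scaled down so the cluster-wide total stays in a sane range.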

thanks,

Erming







[ceph-users] scrub error with ceph

2015-12-07 Thread Erming Pei
Hi,

   I found there are 128 scrub errors in my Ceph system. Checked with
health detail and found many PGs with the stuck-unclean issue. Should I
repair all of them, or what should I do?

[root@gcloudnet ~]# ceph -s

cluster a4d0879f-abdc-4f9d-8a4b-53ce57d822f1

 health HEALTH_ERR 128 pgs inconsistent; 128 scrub errors; mds1: Client
HTRC:cephfs_data failing to respond to cache pressure; mds0: Client
physics-007:cephfs_data failing to respond to cache pressure; pool
'cephfs_data' is full

 monmap e3: 3 mons at
{gcloudnet=xxx.xxx.xxx.xxx:6789/0,gcloudsrv1=xxx.xxx.xxx.xxx:6789/0,gcloudsrv2=xxx.xxx.xxx.xxx:6789/0},
election epoch 178, quorum 0,1,2 gcloudnet,gcloudsrv1,gcloudsrv2

 mdsmap e51000: 2/2/2 up {0=gcloudsrv1=up:active,1=gcloudnet=up:active}

 osdmap e2821: 18 osds: 18 up, 18 in

  pgmap v10457877: 3648 pgs, 23 pools, 10501 GB data, 38688 kobjects

14097 GB used, 117 TB / 130 TB avail

   6 active+clean+scrubbing+deep

3513 active+clean

 128 active+clean+inconsistent

   1 active+clean+scrubbing


P.S. I am increasing the pg and pgp numbers for cephfs_data pool.
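[Editor's note] A common first step for "pgs inconsistent" is to list the affected PGs from `ceph health detail` and issue `ceph pg repair <pgid>` for each, ideally after checking the OSD logs for the root cause (e.g. a failing disk), since repair trusts the primary copy. The sketch below only demonstrates extracting the PG ids; the input lines are fabricated samples in the format that release printed:

```shell
#!/bin/sh
# Illustrative: pull inconsistent PG ids out of health-detail output and
# print the corresponding repair commands (printed, not executed).
cat > /tmp/health_detail.txt <<'EOF'
pg 3.5 is active+clean+inconsistent, acting [4,11,7]
pg 7.1a is active+clean+inconsistent, acting [2,9,15]
EOF

awk '$1 == "pg" && /inconsistent/ { print "ceph pg repair " $2 }' /tmp/health_detail.txt
```

The "pool 'cephfs_data' is full" condition in the same health output needs separate attention; repairs alone will not clear it.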

Thanks,

Erming





[ceph-users] Fwd: scrub error with ceph

2015-12-08 Thread Erming Pei
(I got no response from the current list, so I forwarded this to 
ceph-us...@ceph.com.)


Sorry if it's duplicated.


 Original Message 
Subject:scrub error with ceph
Date:   Mon, 7 Dec 2015 14:15:07 -0700
From:   Erming Pei 
To: ceph-users@lists.ceph.com


