[ceph-users] Stuck creating pg

2015-08-16 Thread Bart Vanbrabant
Hi,

I have a Ceph cluster with 26 OSDs in 4 hosts, used only for RBD for an
OpenStack cluster (started at 0.48, I think), currently running 0.94.2 on
Ubuntu 14.04. A few days ago one of the OSDs was at 85% disk usage while
only 30% of the raw disk space was used. I ran reweight-by-utilization with
150 as the cutoff level, which reshuffled the data. I also noticed that the
number of PGs was still at the level from when there were fewer disks in
the cluster (1300).
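
For reference, the two steps described here correspond roughly to the
following commands (a sketch; the pool name "volumes" comes from the health
output below):

ceph osd reweight-by-utilization 150
ceph osd pool set volumes pg_num 2048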

Based on the current guidelines I increased pg_num to 2048. It created the
placement groups except for the last one. To try to force the creation of
that PG I marked the OSDs assigned to it out (ceph osd out), but that made
no difference. Currently all OSDs are back in, and two PGs are also stuck
in an unclean state:

ceph health detail:

HEALTH_WARN 2 pgs degraded; 2 pgs stale; 2 pgs stuck degraded; 1 pgs stuck
inactive; 2 pgs stuck stale; 3 pgs stuck unclean; 2 pgs stuck undersized; 2
pgs undersized; 59 requests are blocked > 32 sec; 3 osds have slow
requests; recovery 221/549658 objects degraded (0.040%); recovery
221/549658 objects misplaced (0.040%); pool volumes pg_num 2048 > pgp_num
1400
pg 5.6c7 is stuck inactive since forever, current state creating, last
acting [19,25]
pg 5.6c7 is stuck unclean since forever, current state creating, last
acting [19,25]
pg 5.2c7 is stuck unclean for 313513.609864, current state
stale+active+undersized+degraded+remapped, last acting [9]
pg 15.2bd is stuck unclean for 313513.610368, current state
stale+active+undersized+degraded+remapped, last acting [9]
pg 5.2c7 is stuck undersized for 308381.750768, current state
stale+active+undersized+degraded+remapped, last acting [9]
pg 15.2bd is stuck undersized for 308381.751913, current state
stale+active+undersized+degraded+remapped, last acting [9]
pg 5.2c7 is stuck degraded for 308381.750876, current state
stale+active+undersized+degraded+remapped, last acting [9]
pg 15.2bd is stuck degraded for 308381.752021, current state
stale+active+undersized+degraded+remapped, last acting [9]
pg 5.2c7 is stuck stale for 281750.295301, current state
stale+active+undersized+degraded+remapped, last acting [9]
pg 15.2bd is stuck stale for 281750.295293, current state
stale+active+undersized+degraded+remapped, last acting [9]
16 ops are blocked > 268435 sec
10 ops are blocked > 134218 sec
10 ops are blocked > 1048.58 sec
23 ops are blocked > 524.288 sec
16 ops are blocked > 268435 sec on osd.1
8 ops are blocked > 134218 sec on osd.17
2 ops are blocked > 134218 sec on osd.19
10 ops are blocked > 1048.58 sec on osd.19
23 ops are blocked > 524.288 sec on osd.19
3 osds have slow requests
recovery 221/549658 objects degraded (0.040%)
recovery 221/549658 objects misplaced (0.040%)
pool volumes pg_num 2048 > pgp_num 1400
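
The last warning line points at a related problem: pg_num was raised to
2048, but pgp_num is still 1400, so the new PGs are not yet used for
placement and data will not rebalance onto them. The usual fix is a
one-liner (a sketch; whether it also unsticks the creating PG is
uncertain):

ceph osd pool set volumes pgp_num 2048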

OSD 9 was the primary when the PG creation process got stuck. This OSD has
been removed and added again (not only marked out, but also removed from
the crush map and re-added).
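
For completeness, hammer also has a command to retry creation of a stuck
PG (a sketch; it can leave the PG right back in the creating state if the
CRUSH mapping itself is the problem):

ceph pg force_create_pg 5.6c7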

The bad data distribution was probably caused by the low number of PGs and
mainly by bad weighting of the OSDs. I changed the crush map to give the
same weight to each of the OSDs, but that did not resolve these problems
either:
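
Equalizing the weights as described is done per OSD, roughly like this (a
sketch, repeated for each OSD id):

ceph osd crush reweight osd.16 0.25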

ceph osd tree:
ID WEIGHT  TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 6.50000 pool default
-6 2.00000     host droplet4
16 0.25000         osd.16          up  1.00000          1.00000
20 0.25000         osd.20          up  1.00000          1.00000
21 0.25000         osd.21          up  1.00000          1.00000
22 0.25000         osd.22          up  1.00000          1.00000
 6 0.25000         osd.6           up  1.00000          1.00000
18 0.25000         osd.18          up  1.00000          1.00000
19 0.25000         osd.19          up  1.00000          1.00000
23 0.25000         osd.23          up  1.00000          1.00000
-5 1.50000     host droplet3
 3 0.25000         osd.3           up  1.00000          1.00000
13 0.25000         osd.13          up  1.00000          1.00000
15 0.25000         osd.15          up  1.00000          1.00000
 4 0.25000         osd.4           up  1.00000          1.00000
25 0.25000         osd.25          up  1.00000          1.00000
14 0.25000         osd.14          up  1.00000          1.00000
-2 1.50000     host droplet1
 7 0.25000         osd.7           up  1.00000          1.00000
 1 0.25000         osd.1           up  1.00000          1.00000
 0 0.25000         osd.0           up  1.00000          1.00000
 9 0.25000         osd.9           up  1.00000          1.00000
12 0.25000         osd.12          up  1.00000          1.00000
17 0.25000         osd.17          up  1.00000          1.00000
-4 1.50000     host droplet2
10 0.25000         osd.10          up  1.00000          1.00000
 8 0.25000         osd.8           up  1.00000          1.00000
11 0.25000         osd.11          up  1.00000          1.00000
 2 0.25000         osd.2           up  1.00000          1.00000
24 0.25000         osd.24          up  1.00000          1.00000
 

[ceph-users] Ceph File System ACL Support

2015-08-16 Thread Eric Eastman
Hi,

I need to verify that in Ceph v9.0.2 the kernel version of the Ceph file
system supports ACLs while the libcephfs file system interface does not.
I am trying to have SAMBA, version 4.3.0rc1, support Windows ACLs
using "vfs objects = acl_xattr" with the SAMBA VFS Ceph file system
interface "vfs objects = ceph", and my tests are failing. If I use a
kernel mount of the same Ceph file system, it works. Using the SAMBA
Ceph VFS interface with logging set to 3 in my smb.conf file shows
the following error when, on my Windows AD server, I try to "Disable
inheritance" on the SAMBA-exported directory uu/home:

[2015/08/16 18:27:11.546307,  2]
../source3/smbd/posix_acls.c:3006(set_canon_ace_list)
  set_canon_ace_list: sys_acl_set_file type file failed for file
uu/home (Operation not supported).

This works using the same Ceph file system kernel mounted. It also
works with an XFS file system.
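
For context, a minimal sketch of the smb.conf share being described (the
share name, path, and config location are illustrative assumptions, not
taken from the message):

[global]
   log level = 3

[uu]
   path = /uu
   vfs objects = acl_xattr ceph
   ; vfs_ceph option pointing at the cluster config (assumed default location)
   ceph:config_file = /etc/ceph/ceph.conf
   read only = no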

Doing some Googling I found this entry on the SAMBA email list:

https://lists.samba.org/archive/samba-technical/2015-March/106699.html

It states: "libcephfs does not support ACL yet, so this patch adds ACL
callbacks that do nothing."

If ACL support is not in libcephfs, are there plans to add it? Without ACL
support, the SAMBA Ceph VFS interface is severely limited in a multi-user
Windows environment.

Thanks,
Eric


Re: [ceph-users] How to improve single thread sequential reads?

2015-08-16 Thread Alex Gorbachev
Hi Nick,

On Thu, Aug 13, 2015 at 4:37 PM, Nick Fisk  wrote:
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Nick Fisk
>> Sent: 13 August 2015 18:04
>> To: ceph-users@lists.ceph.com
>> Subject: [ceph-users] How to improve single thread sequential reads?
>>
>> Hi,
>>
>> I'm trying to use an RBD to act as a staging area for some data before
>> pushing it down to some LTO6 tapes. As I cannot use striping with the
>> kernel client, I tend to max out at around 80MB/s reads testing with dd.
>> Has anyone got any clever suggestions for giving this a bit of a boost?
>> I think I need to get it up to around 200MB/s to make sure there is
>> always a steady flow of data to the tape drive.
>
> I've just tried the testing kernel with the blk-mq fixes in it for full
> size IOs. This, combined with bumping readahead up to 4MB, is now getting
> me on average 150MB/s to 200MB/s, so this might suffice.
>
> Out of personal interest, I would still like to know if anyone has ideas
> on how to really push much higher bandwidth through an RBD.

Some settings in our ceph.conf that may help:

osd_op_threads = 20
osd_mount_options_xfs = rw,noatime,inode64,logbsize=256k
filestore_queue_max_ops = 9
filestore_flusher = false
filestore_max_sync_interval = 10
filestore_sync_flush = false

Regards,
Alex

>
>>
>> Rbd-fuse seems to top out at 12MB/s, so there goes that option.
>>
>> I'm thinking mapping multiple RBD's and then combining them into a mdadm
>> RAID0 stripe might work, but seems a bit messy.
>>
>> Any suggestions?
>>
>> Thanks,
>> Nick
>>
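For reference, Nick's mdadm idea would look roughly like this (an untested
sketch; pool and image names are illustrative):

rbd map rbd/stage1
rbd map rbd/stage2
rbd map rbd/stage3
mdadm --create /dev/md0 --level=0 --raid-devices=3 /dev/rbd0 /dev/rbd1 /dev/rbd2
mkfs.xfs /dev/md0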


Re: [ceph-users] How to improve single thread sequential reads?

2015-08-16 Thread Somnath Roy
Have you tried setting read_ahead_kb to a bigger number on both the client
and OSD side, if you are using krbd?
In the case of librbd, try the different config options for the rbd cache.
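
A sketch of what those two suggestions typically look like (device names
and values are illustrative; the librbd options go in the [client] section
of ceph.conf):

# krbd: raise client-side readahead on the mapped device (value in KB)
echo 4096 > /sys/block/rbd0/queue/read_ahead_kb
# OSD side: same knob on the data disks, e.g. /sys/block/sda/queue/read_ahead_kb

# librbd: rbd cache settings in ceph.conf
[client]
rbd cache = true
rbd cache size = 67108864        # 64MB
rbd cache max dirty = 50331648   # 48MB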

Thanks & Regards
Somnath

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alex 
Gorbachev
Sent: Sunday, August 16, 2015 7:07 PM
To: Nick Fisk
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] How to improve single thread sequential reads?

[snip - full quote of Alex's message above]


Re: [ceph-users] Ceph File System ACL Support

2015-08-16 Thread Yan, Zheng
On Mon, Aug 17, 2015 at 9:38 AM, Eric Eastman wrote:
> [snip - Eric's message quoted in full above]

libcephfs does not support ACL. I have an old patch that adds ACL
support to samba's vfs ceph module, but haven't tested it carefully.

Yan, Zheng





Re: [ceph-users] Ceph File System ACL Support

2015-08-16 Thread Eric Eastman
On Sun, Aug 16, 2015 at 9:12 PM, Yan, Zheng  wrote:
> On Mon, Aug 17, 2015 at 9:38 AM, Eric Eastman wrote:
>> [snip - Eric's message quoted in full above]
>
> libcephfs does not support ACL. I have an old patch that adds ACL
> support to samba's vfs ceph module, but haven't tested it carefully.
>
> Yan, Zheng
>
Thank you for confirming what I am seeing.  It would be nice to have
ACL support for SAMBA.  I would be able to do some testing of the
patch if that would help.

Eric


Re: [ceph-users] ceph distributed osd

2015-08-16 Thread gjprabu
Hi All,



   We are testing three OSDs and one image with replica 2 (image size 1GB).
While testing, data cannot be written beyond 1GB. Is there any option to
write to the third OSD?



ceph osd pool get repo pg_num
pg_num: 126



# rbd showmapped
id pool image          snap device
0  rbd  integdownloads -    /dev/rbd0  -- already existing
2  repo integrepotest  -    /dev/rbd2  -- newly created





[root@hm2 repository]# df -Th
Filesystem           Type      Size  Used Avail Use% Mounted on
/dev/sda5            ext4      289G   18G  257G   7% /
devtmpfs             devtmpfs  252G     0  252G   0% /dev
tmpfs                tmpfs     252G     0  252G   0% /dev/shm
tmpfs                tmpfs     252G  538M  252G   1% /run
tmpfs                tmpfs     252G     0  252G   0% /sys/fs/cgroup
/dev/sda2            ext4      488M  212M  241M  47% /boot
/dev/sda4            ext4      1.9T   20G  1.8T   2% /var
/dev/mapper/vg0-zoho ext4      8.6T  1.7T  6.5T  21% /zoho
/dev/rbd0            ocfs2     977G  101G  877G  11% /zoho/build/downloads
/dev/rbd2            ocfs2    1000M 1000M     0 100% /zoho/build/repository



@:~$ scp -r sample.txt root@integ-hm2:/zoho/build/repository/
root@integ-hm2's password:
sample.txt                              100% 1024MB   4.5MB/s   03:48
scp: /zoho/build/repository//sample.txt: No space left on device
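
The df output above shows the cause: the ocfs2 filesystem on /dev/rbd2 is
100% full because the image itself is only 1GB. Growing the image, rather
than adding an OSD, is what allows more data to be written (a sketch;
afterwards the ocfs2 filesystem must also be grown with the ocfs2 tools,
whose exact invocation depends on the tools version):

# grow the image to 10GB (the size argument is in MB)
rbd resize --size 10240 repo/integrepotest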



Regards

Prabu










  On Thu, 13 Aug 2015 19:42:11 +0530 gjprabu wrote:




Dear Team,



 We are using two Ceph OSDs with replica 2 and it is working properly.
My doubt is this: Pool A's image size will be 10GB and it is replicated on
two OSDs. What will happen if the size reaches that limit? Is there any
way to make the data continue writing to another two OSDs?



Regards

Prabu


Re: [ceph-users] ceph distributed osd

2015-08-16 Thread gjprabu
Hi All,



   Also, please find the OSD information below.



ceph osd dump | grep 'replicated size'
pool 2 'repo' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 126 pgp_num 126 last_change 21573 flags hashpspool stripe_width 0
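
Note that "size 2" here is the number of copies, not capacity. Raising it
would place a third copy on the third OSD (a sketch below), but it would
not grow the 1GB image, which is the limit the writes are hitting:

ceph osd pool set repo size 3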



Regards

Prabu









  On Mon, 17 Aug 2015 11:58:55 +0530 gjprabu wrote:




[snip - the message above quoted in full]
