Re: [ceph-users] Ceph health warn MDS failing to respond to cache pressure

2017-05-10 Thread John Spray
On Thu, May 4, 2017 at 7:28 AM, gjprabu wrote: > Hi Team, > > We are running cephfs with 5 OSDs, 3 Mons and 1 MDS. There is a > Health Warn "failing to respond to cache pressure". Kindly advise how to fix > this issue. This is usually due to buggy old clients, and occasionally due to a buggy

Re: [ceph-users] Ceph health warn MDS failing to respond to cache pressure

2017-05-10 Thread gjprabu
Hi Webert, Thanks for your reply. Can you please suggest a ceph pg value for data and metadata? I have set 128 for data and 128 for metadata; is this correct? Regards, Prabu GJ On Thu, 04 May 2017 17:04:38 +0530 Webert de Souza Lima wrote I have fa
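As a hedged aside (the pool names cephfs_data and cephfs_metadata and the example value 256 are assumptions, not taken from the thread), the current pg_num of the CephFS pools can be inspected and, if needed, raised roughly like this; on Jewel pg_num can only ever be increased, never decreased:

    # Check the current placement-group counts of the CephFS pools
    ceph osd pool get cephfs_data pg_num
    ceph osd pool get cephfs_metadata pg_num
    # Raising pg_num (example value only) must be followed by pgp_num
    ceph osd pool set cephfs_data pg_num 256
    ceph osd pool set cephfs_data pgp_num 256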

Re: [ceph-users] Ceph health warn MDS failing to respond to cache pressure

2017-05-10 Thread gjprabu
Hi John, Thanks for your reply. We are using the below version for client and MDS (ceph version 10.2.2). Regards, Prabu GJ On Wed, 10 May 2017 12:29:06 +0530 John Spray wrote On Thu, May 4, 2017 at 7:28 AM, gjprabu wrote: >

Re: [ceph-users] All OSD fails after few requests to RGW

2017-05-10 Thread Piotr Nowosielski
Hey, We had similar problems. Look for information on "Filestore merge and split". A short explanation: after reaching a certain number of files in a directory (it depends on the 'filestore merge threshold' and 'filestore split multiple' parameters), the OSD rebuilds the structure of that directory. If the
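For reference, a minimal sketch of how to read the two parameters from a running OSD (osd.0 and a local admin socket are just examples); with the stock defaults a subfolder splits at roughly filestore_split_multiple * abs(filestore_merge_threshold) * 16 objects:

    # Query the effective values over the local admin socket (example OSD id 0)
    ceph daemon osd.0 config get filestore_merge_threshold
    ceph daemon osd.0 config get filestore_split_multiple
    # With the defaults merge_threshold=10 and split_multiple=2, a PG subfolder
    # splits at about 2 * 10 * 16 = 320 objects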

Re: [ceph-users] All OSD fails after few requests to RGW

2017-05-10 Thread Anton Dmitriev
How did you solve it? Did you set new split/merge thresholds and manually apply them with ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${osd_num} --journal-path /var/lib/ceph/osd/ceph-${osd_num}/journal --log-file=/var/log/ceph/objectstore_tool.${osd_num}.log --op apply-layout-settings --po
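The quoted command is cut off; a hedged reconstruction of its general shape (the trailing --pool argument, the placeholder pool name, and stopping the OSD first are assumptions, not verbatim from the thread):

    # The OSD must not be running while ceph-objectstore-tool works on its store
    stop ceph-osd id=${osd_num}        # upstart; use systemctl on newer distros
    ceph-objectstore-tool \
        --data-path /var/lib/ceph/osd/ceph-${osd_num} \
        --journal-path /var/lib/ceph/osd/ceph-${osd_num}/journal \
        --log-file=/var/log/ceph/objectstore_tool.${osd_num}.log \
        --op apply-layout-settings \
        --pool <pool_name>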

Re: [ceph-users] All OSD fails after few requests to RGW

2017-05-10 Thread Piotr Nowosielski
You can: - change these parameters and use ceph-objectstore-tool - add an OSD host; rebuilding the cluster will reduce the number of files in the directories - wait until the "split" operations are over ;-) In our case, we could afford to wait until the "split" operation is over (we have 2 clusters in slig

Re: [ceph-users] All OSD fails after few requests to RGW

2017-05-10 Thread Anton Dmitriev
When I created the cluster I made a mistake in the configuration and set the split parameter to 32 and merge to 40, so 32*40*16 = 20480 files per folder. After that I changed split to 8, and increased the number of pg and pgp from 2048 to 4096 for the pool where the problem occurs. While it was backfilling I observ
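The arithmetic behind those figures, as a quick sketch (the factor 16 is the filestore hash fan-out per directory level):

    # files per subfolder before a split: split_multiple * merge_threshold * 16
    echo $(( 32 * 40 * 16 ))   # the original mistaken settings -> 20480
    echo $((  8 * 40 * 16 ))   # after lowering split to 8      -> 5120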

Re: [ceph-users] All OSD fails after few requests to RGW

2017-05-10 Thread Piotr Nowosielski
It is difficult for me to say clearly why some PGs have not been migrated. Crushmap settings? OSD weight? One thing is certain - you will not find any information about the split process in the logs ... pn -Original Message- From: Anton Dmitriev [mailto:t...@enumnet.ru] Sent: Wednes

Re: [ceph-users] v12.0.2 Luminous (dev) released

2017-05-10 Thread Jurian Broertjes
I'm having issues with this as well. Since no new dev build is available yet, I tried the gitbuilder route, but that seems to be outdated, e.g. http://gitbuilder.ceph.com/ceph-deb-jessie-x86_64-basic/ref/ (the last build was in January and Luminous is missing). Are building from source or downgrading

[ceph-users] trouble starting ceph @ boot

2017-05-10 Thread vida.zach
System: Ubuntu Trusty 14.04 Release: Kraken Issue: When starting the ceph-osd daemon on boot via upstart, the error message in /var/log/upstart/ceph-osd-ceph_#.log reports 3 attempts to start the service with the error message below: starting osd.12 at - osd_data /var/lib/ceph/osd/ceph-12 /var/li

Re: [ceph-users] All OSD fails after few requests to RGW

2017-05-10 Thread David Turner
PG subfolder splitting is the primary reason people are going to be deploying Luminous and Bluestore much faster than any other major release of Ceph. Bluestore removes the concept of subfolders in PGs. I have had clusters that reached what seemed to be a hardcoded maximum of 12,800 objects in a subfol
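A rough, hedged way to gauge how full each PG is on a filestore OSD (the path and OSD id 12 are examples; splitting actually happens per subdirectory, so this only gives a top-level view):

    OSD=/var/lib/ceph/osd/ceph-12
    # List the ten PG head directories holding the most objects on this OSD
    for pg in "$OSD"/current/*_head; do
        printf '%8d  %s\n' "$(find "$pg" -type f | wc -l)" "$pg"
    done | sort -rn | head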

Re: [ceph-users] CephFS Performance

2017-05-10 Thread Webert de Souza Lima
On Tue, May 9, 2017 at 9:07 PM, Brady Deetz wrote: > So with email, you're talking about lots of small reads and writes. In my > experience with DICOM data (thousands of 20KB files per directory), cephfs > doesn't perform very well at all on platter drives. I haven't experimented > with pure ssd

Re: [ceph-users] trouble starting ceph @ boot

2017-05-10 Thread David Turner
Have you attempted to place the ceph-osd startup later in the boot process? Which distribution/version are you running? Each does it slightly differently. This can be problematic for some services, very commonly in cases where a network drive is mapped and used by a service like mysql (terrible ex

Re: [ceph-users] trouble starting ceph @ boot

2017-05-10 Thread Peter Maloney
On 05/10/17 15:34, vida.z...@gmail.com wrote: > > System: Ubuntu Trusty 14.04 > > Release: Kraken > > > Issue: > > When starting the ceph-osd daemon on boot via upstart. Error message in > /var/log/upstart/ceph-osd-ceph_#.log reports 3 attempts to start the > service with the error message below > > >

Re: [ceph-users] trouble starting ceph @ boot

2017-05-10 Thread vida.zach
David, I get what you are saying. Do you have a suggestion as to what service I should make ceph-osd depend on to reliably start? My understanding is that these daemons should all be sort of independent of each other. -Zach From: David Turner Sent: Wednesday, May 10, 2017 1:18 PM To: vida.z...@gm

Re: [ceph-users] trouble starting ceph @ boot

2017-05-10 Thread David Turner
I would probably just make it start last in the boot order. Depending on your distribution/version, that will be as simple as setting it to 99 for starting up. Which distribution/version are you running? On Wed, May 10, 2017 at 2:36 PM wrote: > David, > > > > I get what you are saying. Do you

Re: [ceph-users] trouble starting ceph @ boot

2017-05-10 Thread vida.zach
David, ceph tell osd.12 version replies version 11.2.0. The distro is Ubuntu 14.04.5 LTS (trusty), which utilizes upstart for ceph. I don't see a good way to ensure it starts last in an event-based system like upstart. For the record, I already tried after networking and after filesystems are mounted too, and th

Re: [ceph-users] trouble starting ceph @ boot

2017-05-10 Thread David Turner
`update-rc.d 'ceph' defaults 99` That should put it last in the boot order. The '99' here is a number 01-99 where the lower the number, the earlier in the boot sequence the service is started. To see what order your service is set to start and stop, `ls /etc/rc*.d/*{service}`. Each rc# represents
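A hedged sketch of that sequence (note: on Trusty the ceph daemons are driven by upstart jobs in /etc/init/, so the sysvinit ordering below may not apply, as the follow-up messages suggest):

    # Put the sysvinit ceph script last in the boot order
    sudo update-rc.d ceph defaults 99
    # Show the resulting S##/K## ordering across runlevels
    ls -l /etc/rc*.d/*ceph*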

Re: [ceph-users] trouble starting ceph @ boot

2017-05-10 Thread David Turner
Are you mounting your OSDs using fstab or anything else? Ceph uses udev rules and partition identifiers to know what a disk is and where to mount it, assuming that you have your GUIDs set properly on your disks. ceph-deploy does this by default. On Wed, May 10, 2017 at 3:46 PM David Turner wrot
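For context, a sketch of how to check the partition type GUID that those udev rules key on (the device and partition number are examples, not from the thread):

    # Show GPT metadata, including the partition type GUID, for partition 1
    sudo sgdisk --info=1 /dev/sdb
    # ceph-disk/ceph-deploy prepared OSDs carry a Ceph-specific type GUID,
    # which lets the udev rules mount and activate them without fstab entries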

Re: [ceph-users] trouble starting ceph @ boot

2017-05-10 Thread vida.zach
I was able to set the order to 99 as you indicated, but the /var/log/upstart/ceph logs still complain excessively about "unable to look up group 'ceph': (34) Numerical result out of range". Mounting is done via /etc/fstab for the OSDs, which are xfs-formatted HDDs. -Zach From: David Turner Sent: Wednesda
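The failing lookup is for the local 'ceph' account; a quick sanity check (a sketch, not from the thread) is to confirm the account resolves at all on that host:

    # Verify the ceph user and group are resolvable
    getent passwd ceph
    getent group ceph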

Re: [ceph-users] trouble starting ceph @ boot

2017-05-10 Thread Peter Maloney
On 05/10/17 22:07, David Turner wrote: > Are you mounting your OSDs using fstab or anything else? Ceph uses > udev rules and partition identifiers to know what a disk is and where > to mount it, assuming that you have your GUIDs set properly on your > disks. ceph-deploy does this by default. > >

Re: [ceph-users] Rebalancing causing IO Stall/IO Drops to zero

2017-05-10 Thread Alex Gorbachev
On Thu, May 4, 2017 at 8:40 AM Osama Hasebou wrote: > Hi Everyone, > > We keep running into stalled IOs / they drop almost to zero whenever > a node suddenly goes down or there is a large amount of rebalancing > going on, and once rebalancing is completed, we would also get stalled i

Re: [ceph-users] All OSD fails after few requests to RGW

2017-05-10 Thread Anton Dmitriev
"recent enough version of the ceph-objectstore-tool" - sounds very interesting. Would it be released in one of next Jewel minor releases? On 10.05.2017 19:03, David Turner wrote: PG subfolder splitting is the primary reason people are going to be deploying Luminous and Bluestore much faster tha