Re: [ceph-users] mkfs.ext4 hang on RBD volume

2017-01-17 Thread Vincent Godin
We found the issue. It was simply the "max open files" limit for the qemu user being reached. When we run a lot of mkfs operations in series, a lot of sockets are opened to the Ceph backend and qemu reaches its max open files limit. So we increased max open files in qemu.conf and the problem disappeared 2017-01-16 19:19 G
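
A minimal sketch of that fix, assuming "qemu.conf" refers to libvirt's /etc/libvirt/qemu.conf and using an illustrative limit:

  # /etc/libvirt/qemu.conf -- raise the per-process open file limit for qemu
  max_files = 32768

  # restart libvirtd so newly started qemu processes pick up the new limit
  systemctl restart libvirtd

  # verify on a running qemu process
  grep "open files" /proc/$(pgrep -f qemu-system | head -1)/limits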

Re: [ceph-users] CephFS

2017-01-17 Thread Sean Redmond
I found the kernel clients to perform better in my case. I ran into a couple of issues with some metadata pool corruption and omap inconsistencies. That said, the repair tools are useful and managed to get things back up and running. The community has been very responsive to any issues I have run

[ceph-users] Issue with upgrade from 0.94.9 to 10.2.5

2017-01-17 Thread Piotr Dałek
Hello, During our testing we found out that during the upgrade from 0.94.9 to 10.2.5 we're hitting issue http://tracker.ceph.com/issues/17386 ("Upgrading 0.94.6 -> 0.94.9 saturating mon node networking"). Apparently, there are a few commits for both hammer and jewel which are supposed to fix this is

Re: [ceph-users] CephFS

2017-01-17 Thread Kingsley Tart
How did you find the fuse client performed? I'm more interested in the fuse client because I'd like to use CephFS for shared volumes, and my understanding of the kernel client is that it uses the volume as a block device. Cheers, Kingsley. On Tue, 2017-01-17 at 11:46 +, Sean Redmond wrote: >

Re: [ceph-users] CephFS

2017-01-17 Thread Loris Cuoghi
Hello, On 17/01/2017 at 13:38, Kingsley Tart wrote: How did you find the fuse client performed? I'm more interested in the fuse client because I'd like to use CephFS for shared volumes, and my understanding of the kernel client is that it uses the volume as a block device. I think you're co

[ceph-users] Manual deep scrub

2017-01-17 Thread Richard Arends
Hi, When I start a deep scrub on a PG by hand with 'ceph pg deep-scrub 1.18d5', sometimes the deep scrub is executed directly after the command is entered, but often it is not and there is a lot of time between starting and executing. For example: 2017-01-17 05:25:31.786 session 01162017 :: Starting d
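
A quick way to check whether the requested deep scrub has actually run (a sketch; the PG id is the one from the post above):

  ceph pg deep-scrub 1.18d5
  # the timestamp only moves once the deep scrub has really been executed
  ceph pg 1.18d5 query | grep last_deep_scrub_stamp
  # or watch the cluster log for the "deep-scrub starts" / "deep-scrub ok" lines
  ceph -w | grep 1.18d5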

Re: [ceph-users] Manual deep scrub

2017-01-17 Thread David Turner
You want to look into the setting osd_max_scrubs, which indicates how many different scrub operations an OSD can be involved in at once (the well-chosen default is 1), as well as osd_scrub_max_interval and osd_deep_scrub_interval. One of the differences in your cluster from before to now is tim
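
For reference, a sketch of how those values can be inspected and changed at runtime (the OSD id and interval values are illustrative):

  # show the current scrub-related settings on one OSD
  ceph daemon osd.0 config show | grep -E 'osd_max_scrubs|scrub_interval'
  # adjust intervals cluster-wide at runtime (seconds; 604800 = one week)
  ceph tell osd.* injectargs '--osd_scrub_max_interval 604800 --osd_deep_scrub_interval 604800'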

Re: [ceph-users] Manual deep scrub

2017-01-17 Thread Richard Arends
On 01/17/2017 04:09 PM, David Turner wrote: Hi, You want to look into the settings osd_max_scrubs which indicates how many different scrub operations an OSD can be involved in at once That's still at the default, thus 1. A PG that I wanted to deep-scrub this afternoon should be done by an OSD

Re: [ceph-users] Manual deep scrub

2017-01-17 Thread David Turner
All OSDs with a copy of the PG need to not be involved in any scrub for the scrub to start immediately. It is not just the primary OSD but all secondary OSDs as well for a scrub to be able to run on a PG.

Re: [ceph-users] CephFS

2017-01-17 Thread Kingsley Tart
On Tue, 2017-01-17 at 13:49 +0100, Loris Cuoghi wrote: > I think you're confusing CephFS kernel client and RBD kernel client. > > The Linux kernel contains both: > > * a module ceph.ko for accessing a CephFS > * a module rbd.ko for accessing an RBD (Rados Block Device) > > You can mount a CephFS

Re: [ceph-users] CephFS

2017-01-17 Thread Alex Evonosky
For what it's worth, I have been using CephFS shared between six servers (all kernel mounted) with no issues. Running three monitors and two metadata servers (one as a backup). This has been running great. On Tue, Jan 17, 2017 at 12:14 PM, Kingsley Tart wrote: > On Tue, 2017-01-17 at 13:49 +0100, Loris

Re: [ceph-users] CephFS

2017-01-17 Thread Kingsley Tart
Hi, Are these all sharing the same volume? Cheers, Kingsley. On Tue, 2017-01-17 at 12:19 -0500, Alex Evonosky wrote: > for whats its worth, I have been using CephFS shared between six > servers (all kernel mounted) and no issues. Running three monitors > and 2 meta servers (one as backup). Thi

Re: [ceph-users] CephFS

2017-01-17 Thread Alex Evonosky
Yes, they are. I created one volume shared by all the webservers, so essentially it is acting like a NAS using NFS. All servers see the same data. On Tue, Jan 17, 2017 at 12:26 PM, Kingsley Tart wrote: > Hi, > > Are these all sharing the same volume? > > Cheers, > Kingsley. > > On Tue, 2017-01-17

Re: [ceph-users] CephFS

2017-01-17 Thread Alex Evonosky
Example: each server mounts it like this: /bin/mount -t ceph -o name=admin,secret= 10.10.10.138,10.10.10.252,10.10.10.103:/ /media/network-storage and all of them point to the monitor servers. On Tue, Jan 17, 2017 at 12:27 PM, Alex Evonosky wrote: > yes they are. I created one volume all
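
For completeness, the equivalent kernel mount could also go in /etc/fstab (a sketch; the secretfile path is hypothetical, the monitor addresses are the ones from the post above):

  # /etc/fstab
  10.10.10.138,10.10.10.252,10.10.10.103:/  /media/network-storage  ceph  name=admin,secretfile=/etc/ceph/admin.secret,noatime,_netdev  0  2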

Re: [ceph-users] CephFS

2017-01-17 Thread Kingsley Tart
Oh that's good. I thought the kernel clients only supported block devices. I guess that has changed since I last looked. Cheers, Kingsley. On Tue, 2017-01-17 at 12:29 -0500, Alex Evonosky wrote: > example: > Each server looks like this on their mounting: > > /bin/mount -t ceph -o name=admin,sec

[ceph-users] failing to respond to capability release, mds cache size?

2017-01-17 Thread Darrell Enns
I've just had one of my cephfs servers showing an "mdsY: Client X failing to respond to capability release" error. The client in question was acting strangely, not allowing files to be deleted, etc. The issue was cleared by restarting the affected server. I see there have been a few posts about thi

Re: [ceph-users] CephFS

2017-01-17 Thread Kingsley Tart
On Tue, 2017-01-17 at 19:04 +0100, Ilya Dryomov wrote: > On Tue, Jan 17, 2017 at 6:49 PM, Kingsley Tart wrote: > > Oh that's good. I thought the kernel clients only supported block > > devices. I guess that has changed since I last looked. > > That has always been the case -- block device support

Re: [ceph-users] CephFS

2017-01-17 Thread Ilya Dryomov
On Tue, Jan 17, 2017 at 6:49 PM, Kingsley Tart wrote: > Oh that's good. I thought the kernel clients only supported block > devices. I guess that has changed since I last looked. That has always been the case -- block device support came about a year after the filesystem was merged into the kerne

Re: [ceph-users] failing to respond to capability release, mds cache size?

2017-01-17 Thread Gregory Farnum
On Tue, Jan 17, 2017 at 10:07 AM, Darrell Enns wrote: > I’ve just had one of my cephfs servers showing an “mdsY: Client X > failing to respond to capability release” error. The client in question was > acting strange, not allowing deleting files, etc. The issue was cleared by > restarting the

[ceph-users] Hosting Ceph Day Stockholm?

2017-01-17 Thread Patrick McGarry
Hey cephers, I have talked to a few different people about hosting our first Scandinavian Ceph Day in Stockholm this year, but I either have email addresses wrong or they are not replying. If anyone is interested in hosting (or knows someone who might be) a Ceph Day in Stockholm in April, please

Re: [ceph-users] CephFS

2017-01-17 Thread Gregory Farnum
On Tue, Jan 17, 2017 at 10:09 AM, Kingsley Tart wrote: > On Tue, 2017-01-17 at 19:04 +0100, Ilya Dryomov wrote: >> On Tue, Jan 17, 2017 at 6:49 PM, Kingsley Tart wrote: >> > Oh that's good. I thought the kernel clients only supported block >> > devices. I guess that has changed since I last looke

Re: [ceph-users] failing to respond to capability release, mds cache size?

2017-01-17 Thread Darrell Enns
Thanks for the info, I'll be sure to "dump_ops_in_flight" and "session ls" if it crops up again. Is there any other info you can think of that might be useful? I want to make sure I capture all the evidence needed if it happens again.
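
A sketch of the commands that capture that state on the MDS host (the daemon name mds.a is a placeholder):

  # list client sessions and any operations stuck in flight on the active MDS
  ceph daemon mds.a session ls
  ceph daemon mds.a dump_ops_in_flight
  # rough view of how full the MDS cache is
  ceph daemon mds.a perf dump | grep -i inode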

[ceph-users] Ceph Day Speakers (San Jose / Boston)

2017-01-17 Thread Patrick McGarry
Hey cephers, Now that we are starting to nail down Ceph Days for 2017, I'm going to be needing community speakers to fill those slots. The first events we have on the docket are: 1) San Jose, CA - 17 Mar 2017 2) Boston, MA - OpenStack Summit (exact date not solidified yet, but we will have a ful

Re: [ceph-users] Manual deep scrub

2017-01-17 Thread Richard Arends
On 01/17/2017 05:15 PM, David Turner wrote: David, All OSDs with a copy of the PG need to not be involved in any scrub for the scrub to start immediately. It is not just the primary OSD but all secondary OSDs as well for a scrub to be able to run on a PG. I thought of that and checked if th

Re: [ceph-users] Manual deep scrub

2017-01-17 Thread David Turner
Looking through the additional osd config options for scrubbing shows a couple of options that can prevent a PG from scrubbing immediately. osd_scrub_during_recovery - default true - If false, no new scrub can be scheduled while there is active recovery. osd_scrub_load_threshold - default 0.5 - Ceph
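
A quick sketch for checking whether either of those conditions is currently gating scrubs on a given OSD (osd.0 is a placeholder):

  # are the scrub-gating options at their defaults on this OSD?
  ceph daemon osd.0 config show | grep -E 'osd_scrub_during_recovery|osd_scrub_load_threshold'
  # is the host load average above osd_scrub_load_threshold (default 0.5)?
  uptime
  # is there active recovery? (relevant if osd_scrub_during_recovery is false)
  ceph -s | grep -i recover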

Re: [ceph-users] Manual deep scrub

2017-01-17 Thread Richard Arends
On 01/17/2017 09:21 PM, David Turner wrote: Looking through the additional osd config options for scrubbing show a couple options that can prevent a PG from scrubbing immediately. osd_scrub_during_recovery - default true - If false, no new scrub can be scheduled while their is active recover

Re: [ceph-users] CephFS

2017-01-17 Thread Lindsay Mathieson
On 18/01/2017 3:14 AM, Kingsley Tart wrote: This is why I thought ceph-fuse would be what I needed. cephfs fuse and cephfs kernel are the *same* thing, except the kernel-based one has better performance. However, due to deadlock issues you can't run the kernel mount on the same machine as the

Re: [ceph-users] Ceph Monitoring

2017-01-17 Thread Trey Palmer
Just going into production now with a large-ish multisite radosgw setup on 10.2. We are starting off by alerting on anything that isn't HEALTH_OK, just to see how things go. If we get HEALTH_WARN but no mons or OSDs are down then it will be a low-level alert. We will massage scripts to pick
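
A minimal sketch of that kind of check (the severity mapping is an assumption, not the poster's actual script):

  #!/bin/sh
  # map the overall ceph health status to an alert level
  status=$(ceph health)
  case "$status" in
    HEALTH_OK*)   exit 0 ;;                                   # no alert
    HEALTH_WARN*) echo "low-level alert: $status";  exit 1 ;;
    *)            echo "high-level alert: $status"; exit 2 ;;
  esac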

[ceph-users] ceph mon unable to reach quorum

2017-01-17 Thread lee_yiu_ch...@yahoo.com
Dear all, I have a ceph installation (dev site) with two nodes, each running a mon daemon and an osd daemon. (Yes, I know running a cluster of two mons is bad, but I have no choice since I only have two nodes.) Now the two nodes have been migrated to another datacenter, but after they are booted up the mo
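
A sketch of the usual first checks when mons will not form quorum after a move (the socket name and config path are defaults; <other-mon-ip> is a placeholder):

  # what does each mon think its state is? (run on each node)
  ceph daemon mon.$(hostname -s) mon_status
  # can the mons reach each other on the monitor port?
  nc -zv <other-mon-ip> 6789
  # do mon_host / mon_initial_members in ceph.conf still match the new addresses?
  grep -E 'mon_host|mon_initial_members' /etc/ceph/ceph.conf

Note that with only two mons, both have to be up and reachable for quorum to form at all.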

[ceph-users] Testing a node by fio - strange results to me

2017-01-17 Thread Ahmed Khuraidah
Hello community, I need your help to understand a little bit more about the current MDS architecture. I have created a one-node CephFS deployment and tried to test it with fio. I have used two file sizes, 3G and 320G. My question is why I get around 1k+ IOPS when performing random reads from the 3G file int
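
For context, a sketch of the kind of fio invocation being described (the mount point and file name are illustrative; only the 3G case is shown):

  fio --name=randread-3g --filename=/mnt/cephfs/test3g --size=3G \
      --rw=randread --bs=4k --ioengine=libaio --iodepth=32 \
      --direct=1 --runtime=60 --time_based --group_reporting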