Re: [ceph-users] Unexplainable slow request

2014-12-09 Thread Gregory Farnum
On Mon, Dec 8, 2014 at 6:39 PM, Christian Balzer wrote: > > Hello, > > Debian Jessie cluster, thus kernel 3.16, ceph 0.80.7. > 3 storage nodes with 8 OSDs (journals on 4 SSDs) each, 3 mons. > 2 compute nodes, everything connected via Infiniband. > > This is pre-production, currently there are only

Re: [ceph-users] experimental features

2014-12-09 Thread Christian Balzer
On Mon, 08 Dec 2014 08:33:25 -0600 Mark Nelson wrote: > I've been thinking for a while that we need another more general command > than Ceph health to more generally inform you about your cluster. IE I > personally don't like having min/max PG warnings in Ceph health (they > can be independent

Re: [ceph-users] Unexplainable slow request

2014-12-09 Thread Christian Balzer
On Mon, 8 Dec 2014 20:36:17 -0800 Gregory Farnum wrote: > They never fixed themselves? As I wrote, it took a restart of OSD 8 to resolve this on the next day. > Did the reported times ever increase? Indeed, the last before the reboot was: --- 2014-12-07 13:12:42.933396 7fceac82f700 0 log [WRN]
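
For requests stuck like this, the OSD admin socket shows what each op is waiting on; a minimal sketch, assuming the default socket path for osd.8:

    # ops currently blocked in osd.8, with per-stage timestamps
    ceph --admin-daemon /var/run/ceph/ceph-osd.8.asok dump_ops_in_flight
    # recently completed slow ops
    ceph --admin-daemon /var/run/ceph/ceph-osd.8.asok dump_historic_ops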

Re: [ceph-users] Unexplainable slow request

2014-12-09 Thread Christian Balzer
Hello, On Mon, 8 Dec 2014 19:51:00 -0800 Gregory Farnum wrote: > On Mon, Dec 8, 2014 at 6:39 PM, Christian Balzer wrote: > > > > Hello, > > > > Debian Jessie cluster, thus kernel 3.16, ceph 0.80.7. > > 3 storage nodes with 8 OSDs (journals on 4 SSDs) each, 3 mons. > > 2 compute nodes, everythin

[ceph-users] Watch for fstrim running on your Ubuntu systems

2014-12-09 Thread Wido den Hollander
Hi, Last Sunday I got a call early in the morning that a Ceph cluster was having some issues. Slow requests and OSDs marking each other down. Since this is a 100% SSD cluster I was a bit confused and started investigating. It took me about 15 minutes to see that fstrim was running and was utiliz
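
On Ubuntu the trigger is the weekly cron job; a sketch of checking and containing it, assuming the stock 14.04 layout:

    ls -l /etc/cron.weekly/fstrim          # the shipped job; it runs fstrim-all
    sudo chmod -x /etc/cron.weekly/fstrim  # disable the batch run
    sudo ionice -c3 fstrim /var/lib/ceph/osd/ceph-0  # or trim one OSD mount at a time, at idle priority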

Re: [ceph-users] Watch for fstrim running on your Ubuntu systems

2014-12-09 Thread Sebastien Han
Good to know. Thanks for sharing! > On 09 Dec 2014, at 10:21, Wido den Hollander wrote: > > Hi, > > Last sunday I got a call early in the morning that a Ceph cluster was > having some issues. Slow requests and OSDs marking each other down. > > Since this is a 100% SSD cluster I was a bit confu

[ceph-users] active+degraded on an empty new cluster

2014-12-09 Thread Giuseppe Civitella
Hi all, last week I installed a new ceph cluster on 3 vm running Ubuntu 14.04 with default kernel. There is a ceph monitor and two osd hosts. Here are some details: ceph -s cluster c46d5b02-dab1-40bf-8a3d-f8e4a77b79da health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean monmap e1

Re: [ceph-users] Monitors repeatedly calling for new elections

2014-12-09 Thread Rodrigo Severo
On Mon, Dec 8, 2014 at 5:23 PM, Sanders, Bill wrote: > Under activity, we'll get monitors going into election cycles repeatedly, > OSD's being "wrongly marked down", as well as slow requests "osd.11 > 39.7.48.6:6833/21938 failed (3 reports from 1 peers after 52.914693 >= grace > 20.00)" . Du
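
The grace in that report is the OSD heartbeat grace; raising it can paper over marginal networks or clocks while the real cause is found. A sketch, option name per the OSD config reference:

    [osd]
    osd heartbeat grace = 30   # default is 20 seconds before a peer is reported down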

Re: [ceph-users] active+degraded on an empty new cluster

2014-12-09 Thread Giuseppe Civitella
Hi, thanks for the quick answer. I did try force_create_pg on a pg but it is stuck on "creating": root@ceph-mon1:/home/ceph# ceph pg dump |grep creating dumped all in format plain 2.2f 0 0 0 0 0 0 creating 2014-12-09 13:11:37.384808 0'0

[ceph-users] Problems running ceph commands on custom linux system

2014-12-09 Thread Patrick Darley
Hi, I'm having a problem running commands such as `ceph --help` and `ceph -s`. These commands output the expected information, but then they hang indefinitely. Using strace I have found that the system seems to get stuck with futex operations running and timing out repeatedly. However I'm un

Re: [ceph-users] Monitors repeatedly calling for new elections

2014-12-09 Thread Sanders, Bill
Thanks for the response. I did forget to mention that NTP is setup and does appear to be running (just double checked). Is this good enough resolution? $ for node in $nodes; do ssh tvsa${node} sudo date --rfc-3339=ns; done 2014-12-09 09:15:39.404292557-08:00 2014-12-09 09:15:39.521762397-08:00

Re: [ceph-users] active+degraded on an empty new cluster

2014-12-09 Thread Gregory Farnum
It looks like your OSDs all have weight zero for some reason. I'd fix that. :) -Greg On Tue, Dec 9, 2014 at 6:24 AM Giuseppe Civitella <giuseppe.civite...@gmail.com> wrote: > Hi, > > thanks for the quick answer. > I did try force_create_pg on a pg but it is stuck on "creating": > root@ceph-mon1:

[ceph-users] Query about osd pool default flags & hashpspool

2014-12-09 Thread Abhishek L
Hi I was going through various conf options to customize a ceph cluster and came across `osd pool default flags` in pool-pg config ref[1]. The value is specified as an integer, though I couldn't find a mention of the possible values it can take in the docs. Looking a bit deeper into ceph sources [2

Re: [ceph-users] Watch for fstrim running on your Ubuntu systems

2014-12-09 Thread Wido den Hollander
On 12/09/2014 12:12 PM, Luis Periquito wrote: > Hi Wido, > thanks for sharing. > > fortunately I'm still running precise but planning on moving to trusty. > > From what I'm aware it's not a good idea to be running discard on the FS, > as it does have an impact of the delete operation, which some

Re: [ceph-users] normalizing radosgw

2014-12-09 Thread Abhishek L
Sage Weil writes: [..] > Thoughts? Suggestions? > [..] Suggestion: radosgw should handle injectargs like other ceph clients do? This is not a major annoyance, but it would be nice to have. -- Abhishek
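
For comparison, a sketch of the runtime-config paths other daemons offer; the radosgw socket name below is an assumption and depends on your client name:

    # what other ceph daemons accept
    ceph tell osd.0 injectargs '--debug_ms 1'
    # radosgw does expose an admin socket, which covers some of the same ground
    ceph --admin-daemon /var/run/ceph/ceph-client.radosgw.gateway.asok config set debug_rgw 20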

Re: [ceph-users] Monitors repeatedly calling for new elections

2014-12-09 Thread Sanders, Bill
Apologies for replying to myself, I thought I'd add a bit more information. We don't think the ceph cluster is the issue, but maybe something on the clients (bad configuration setting? Bug in our older version of ceph-client?). I've attached our CrushMap and OSD tree, as well. Neither /var/l

Re: [ceph-users] Problems running ceph commands on custom linux system

2014-12-09 Thread Jeffrey Ollie
On Tue, Dec 9, 2014 at 10:15 AM, Patrick Darley wrote: > > I'm having a problem running commands such as `ceph --help` and `ceph -s`. > These commands output the expected information, but then they hang > indefinitely. If you're using Python 2.7.8 it's probably this issue: http://tracker.ceph.co

Re: [ceph-users] Multiple MDS servers...

2014-12-09 Thread Gregory Farnum
You'll need to be a little more explicit about your question. In general there is nothing special that needs to be done. If you're trying to get multiple active MDSes (instead of active and standby/standby-replay/etc) you'll need to tell the monitors to increase the mds num (check the docs; this is
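
A sketch of the command being referred to, using the firefly/giant-era syntax:

    ceph mds set_max_mds 2   # allow two active MDS daemons; extra ones stay standby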

Re: [ceph-users] Query about osd pool default flags & hashpspool

2014-12-09 Thread Gregory Farnum
On Tue, Dec 9, 2014 at 10:24 AM, Abhishek L wrote: > Hi > > I was going through various conf options to customize a ceph cluster and > came across `osd pool default flags` in pool-pg config ref[1]. The value > is specified as an integer, though I couldn't find a mention of the > possible values this
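
A sketch of what the integer encodes, going by the pg_pool_t flag definitions in the source; the value below is an assumption from the firefly-era tree, so verify against your version:

    # FLAG_HASHPSPOOL is bit 0, i.e. the value 1
    [global]
    osd pool default flags = 1   # new pools get hashpspool set
    # firefly also carries a dedicated boolean for this one flag:
    osd pool default flag hashpspool = true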

Re: [ceph-users] Giant osd problems - loss of IO

2014-12-09 Thread Andrei Mikhailovsky
Following Jake's recommendation I have updated my sysctl.conf file and it seems to have helped with the problem of osds being marked down by other osd peers. It has been 3 days already. I am currently using the following settings in the sysctl.conf: # Increase Linux autotuning TCP buffer limit
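
The quoted list is cut off above; a sketch of the kind of settings involved, with illustrative numbers rather than Jake's exact values:

    # /etc/sysctl.conf -- raise the autotuning TCP buffer ceilings
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216

Apply with sysctl -p, no reboot needed.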

Re: [ceph-users] Watch for fstrim running on your Ubuntu systems

2014-12-09 Thread Luis Periquito
Hi Wido, thanks for sharing. Fortunately I'm still running precise but planning on moving to trusty. From what I'm aware, it's not a good idea to be running discard on the FS, as it does have an impact on the delete operation, which some may even consider an unnecessary amount of work for the SSD
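
"Running discard on the FS" means the mount option, as opposed to the batched fstrim above; a sketch of an fstab line, with device and mount point as placeholders:

    # inline TRIM on every delete -- convenient, but each unlink costs the SSD work
    /dev/sdb1  /var/lib/ceph/osd/ceph-0  xfs  noatime,discard  0  0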

Re: [ceph-users] active+degraded on an empty new cluster

2014-12-09 Thread Irek Fasikhov
Hi. http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/ ceph pg force_create_pg 2014-12-09 14:50 GMT+03:00 Giuseppe Civitella : > Hi all, > > last week I installed a new ceph cluster on 3 vm running Ubuntu 14.04 with > default kernel. > There is a ceph monitor and two osd hos
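
Usage sketch, taking the pg id from the ceph pg dump output quoted earlier on this page:

    ceph pg force_create_pg 2.2f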

[ceph-users] Unable to start radosgw

2014-12-09 Thread Vivek Varghese Cherian
Hi, I am trying to integrate OpenStack Juno Keystone with the Ceph Object Gateway (radosgw). I want to use keystone as the users authority. A user that keystone authorizes to access the gateway will also be created on the radosgw. Tokens that keystone validates will be considered as valid by the ra
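
A sketch of the rgw side of such a setup, with option names from the radosgw keystone docs and all values as placeholders:

    [client.radosgw.gateway]
    rgw keystone url = http://keystone-host:35357
    rgw keystone admin token = ADMIN_TOKEN
    rgw keystone accepted roles = Member, admin
    rgw keystone token cache size = 500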

Re: [ceph-users] seg fault

2014-12-09 Thread Philipp Strobl
Hi Samuel, After reading your mail again carefully, I see that my last questions are obsolete. I will surely upgrade to 0.67.10 as soon as I can and take a closer look at the improvements. As far as I understand the release notes for 0.67.11, there are no upgrading fixes, so I hope there were no

[ceph-users] "store is getting too big" on monitors after Firefly to Giant upgrade

2014-12-09 Thread Kevin Sumner
Hi all, We recently upgraded our cluster to Giant from. Since then, we’ve been driving load tests against CephFS. However, we’re getting “store is getting too big” warnings from the monitors and the mons have started consuming way more disk space, 40GB-60GB now as opposed to ~10GB pre-upgrade

Re: [ceph-users] active+degraded on an empty new cluster

2014-12-09 Thread Craig Lewis
When I first created a test cluster, I used 1 GiB disks. That causes problems: every OSD has a CRUSH weight, and by default the weight is the size of the disk in TiB, truncated to 2 decimal places. I.e., any disk smaller than 10 GiB will have a weight of 0.00. I increased all of my virtual disks to 10 G
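
A sketch of checking and correcting the weights, with osd ids as placeholders:

    ceph osd tree                       # the WEIGHT column reads 0 for the tiny disks
    ceph osd crush reweight osd.0 0.01  # give it a nonzero weight, or rebuild on >=10 GiB disks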

Re: [ceph-users] Monitors repeatedly calling for new elections

2014-12-09 Thread Jon Kåre Hellan
On 09. des. 2014 18:19, Sanders, Bill wrote: Thanks for the response. I did forget to mention that NTP is setup and does appear to be running (just double checked). You probably know this, but just in case: If 'ntpq -p' shows a '*' in front of one of the servers, NTP has managed to synch up.

Re: [ceph-users] Multiple MDS servers...

2014-12-09 Thread JIten Shah
Hi Greg, Sorry for the confusion. I am not looking for an active/active configuration, which I know is not supported, but what documentation can I refer to for installing active/standby MDSes? I tried looking on Ceph.com but could not find anything that explains how to set up an active/standby MDS cluste

[ceph-users] Is mon initial members used after the first quorum?

2014-12-09 Thread Christopher Armstrong
Hi folks, I think we have a bit of confusion around how initial members is used. I understand that we can specify a single monitor (or a subset of monitors) so that the cluster can form a quorum when it first comes up. This is how we're using the setting now - so the cluster can come up with just
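
For reference, the settings under discussion; a sketch with placeholder names, the comments reflecting the config docs' descriptions:

    [global]
    mon initial members = mon1               # consulted when forming the first quorum
    mon host = 10.0.0.1, 10.0.0.2, 10.0.0.3  # where clients look for monitors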

Re: [ceph-users] Multiple MDS servers...

2014-12-09 Thread JIten Shah
I have been trying to do that for quite some time now (using puppet) but it keeps failing. Here’s what the error says. Error: Could not start Service[ceph-mds]: Execution of '/sbin/service ceph start mds.Lab-cephmon003' returned 1: /etc/init.d/ceph: mds.Lab-cephmon003 not found (/etc/ceph/ceph.
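
The sysvinit script only starts daemons it finds in ceph.conf, so a sketch of the likely missing piece, using the name from the error above:

    [mds.Lab-cephmon003]
    host = Lab-cephmon003

followed by running service ceph start mds.Lab-cephmon003 again.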

Re: [ceph-users] Multiple MDS servers...

2014-12-09 Thread Christopher Armstrong
JIten, You simply start more metadata servers. You'll notice when you inspect the cluster health that one will be the active, and the rest will be standbys. Chris On Tue, Dec 9, 2014 at 3:10 PM, JIten Shah wrote: > Hi Greg, > > Sorry for the confusion. I am not looking for active/active confi

Re: [ceph-users] "store is getting too big" on monitors after Firefly to Giant upgrade

2014-12-09 Thread Haomai Wang
Maybe you can enable "mon_compact_on_start=true" when restarting mon, it will compact data On Wed, Dec 10, 2014 at 6:50 AM, Kevin Sumner wrote: > Hi all, > > We recently upgraded our cluster to Giant from. Since then, we’ve been > driving load tests against CephFS. However, we’re getting “store
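
A sketch of both routes, one-shot and persistent; command and option names per the monitor docs:

    # compact a running mon without restarting it
    ceph tell mon.`hostname -s` compact

    # or have every mon compact its store on startup
    [mon]
    mon compact on start = true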