[ceph-users] Slow performance during recovery operations

2015-04-02 Thread Stillwell, Bryan
All, Whenever we're doing some kind of recovery operation on our ceph clusters (cluster expansion or dealing with a drive failure), there seems to be a fairly noticeable performance drop while it does the backfills (last time I measured it the performance during recovery was something like 20% of a

Re: [ceph-users] Slow performance during recovery operations

2015-04-02 Thread Stillwell, Bryan
>Recovery creates I/O performance drops in our VM too but it's manageable. >What really hurts us are deep scrubs. >Our current situation is Firefly 0.80.9 with a total of 24 identical OSDs >evenly distributed on 4 servers with the following relevant configuration: > >osd recovery max active
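
The settings being quoted here are the usual recovery/backfill throttles. A minimal sketch of how they are typically lowered, assuming a Firefly/Hammer-era cluster (the exact values are illustrative, not the poster's):

    # ceph.conf on the OSD hosts
    [osd]
    osd max backfills = 1          # concurrent backfill ops per OSD
    osd recovery max active = 1    # concurrent recovery ops per OSD
    osd recovery op priority = 1   # deprioritize recovery vs. client I/O

    # or change them at runtime without restarting the OSDs
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'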

Re: [ceph-users] low power single disk nodes

2015-04-09 Thread Stillwell, Bryan
These are really interesting to me, but how can you buy them? What's the performance like in ceph? Are they using the keyvaluestore backend, or something specific to these drives? Also what kind of chassis do they go into (some kind of ethernet JBOD)? Bryan On 4/9/15, 9:43 AM, "Mark Nelson" w

[ceph-users] Managing larger ceph clusters

2015-04-15 Thread Stillwell, Bryan
I'm curious what people managing larger ceph clusters are doing with configuration management and orchestration to simplify their lives? We've been using ceph-deploy to manage our ceph clusters so far, but feel that moving the management of our clusters to standard tools would provide a little mor

[ceph-users] cephfs map command deprecated

2015-04-22 Thread Stillwell, Bryan
I have a PG that is in the active+inconsistent state and found the following objects to have differing md5sums: -fa8298048c1958de3c04c71b2f225987 ./DIR_5/DIR_0/DIR_D/DIR_9/1008a75.017c__head_502F9D05__0 +b089c2dcd4f1d8b4419ba34fe468f784 ./DIR_5/DIR_0/DIR_D/DIR_9/1008a75.017c__head_

Re: [ceph-users] cephfs map command deprecated

2015-04-22 Thread Stillwell, Bryan
On 4/22/15, 2:08 PM, "Gregory Farnum" wrote: >On Wed, Apr 22, 2015 at 12:35 PM, Stillwell, Bryan > wrote: >> I have a PG that is in the active+inconsistent state and found the >> following objects to have differing md5sums: >> >> -fa8298048c1958de3c04c

Re: [ceph-users] Discuss: New default recovery config settings

2015-05-29 Thread Stillwell, Bryan
I like the idea of turning the defaults down. During the ceph operators session at the OpenStack conference last week Warren described the behavior pretty accurately as "Ceph basically DOSes itself unless you reduce those settings." Maybe this is more of a problem when the clusters are small?

[ceph-users] Expanding a ceph cluster with ansible

2015-06-17 Thread Stillwell, Bryan
I've been working on automating a lot of our ceph admin tasks lately and am pretty pleased with how the puppet-ceph module has worked for installing packages, managing ceph.conf, and creating the mon nodes. However, I don't like the idea of puppet managing the OSDs. Since we also use ansible in m

Re: [ceph-users] Expanding a ceph cluster with ansible

2015-06-23 Thread Stillwell, Bryan
new disks/nodes? >Is this statement correct? > >One more thing, why don't you use ceph-ansible entirely to do the >provisioning and life cycle management of your cluster? :) >> On 18 Jun 2015, at 00:14, Stillwell, Bryan >> wrote: >> >> I've been working o

[ceph-users] Inconsistency in 'ceph df' stats

2015-08-31 Thread Stillwell, Bryan
On one of our staging ceph clusters (firefly 0.80.10) I've noticed that some of the statistics in the 'ceph df' output don't seem to match up. For example in the output below the amount of raw used is 8,402G, which with triple replication would be 2,800.7G used (all the pools are triple replicatio

Re: [ceph-users] Storage node refurbishing, a "freeze" OSD feature would be nice

2015-08-31 Thread Stillwell, Bryan
We have the following in our ceph.conf to bring in new OSDs with a weight of 0: [osd] osd_crush_initial_weight = 0 We then set 'nobackfill' and bring in each OSD at full weight one at a time (letting things settle down before bringing in the next OSD). Once all the OSDs are brought in we unset 'no
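
A rough sketch of the expansion workflow described above (the OSD id and CRUSH weight are hypothetical):

    # ceph.conf so new OSDs join the CRUSH map with zero weight
    [osd]
    osd_crush_initial_weight = 0

    # pause backfill, then bring each new OSD up to full weight in turn
    ceph osd set nobackfill
    ceph osd crush reweight osd.42 1.81929   # hypothetical id and weight
    # ...wait for peering to settle, then repeat for the next OSD...
    ceph osd unset nobackfill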

Re: [ceph-users] OSDs stuck in booting state on CentOS 7.2.1511 and ceph infernalis 9.2.0

2015-12-18 Thread Stillwell, Bryan
I ran into a similar problem while in the middle of upgrading from Hammer (0.94.5) to Infernalis (9.2.0). I decided to try rebuilding one of the OSDs by using 'ceph-disk prepare /dev/sdb' and it never comes up: root@b3:~# ceph daemon osd.10 status { "cluster_fsid": "----

Re: [ceph-users] Infernalis upgrade breaks when journal on separate partition

2016-01-04 Thread Stillwell, Bryan
I ran into this same issue, and found that a reboot ended up setting the ownership correctly. If you look at /lib/udev/rules.d/95-ceph-osd.rules you'll see the magic that makes it happen: # JOURNAL_UUID ACTION=="add", SUBSYSTEM=="block", \ ENV{DEVTYPE}=="partition", \ ENV{ID_PART_ENTRY_TYPE}=
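
The quoted udev rule matches journal partitions by their GPT type GUID and chowns them to ceph:ceph. A hedged sketch of fixing ownership by hand when a reboot isn't convenient (device names are hypothetical; the type-code step assumes the journal partition wasn't tagged with the Ceph journal GUID):

    # one-off fix, not persistent across reboots
    chown ceph:ceph /dev/sdb2

    # persistent fix: set the Ceph journal partition type GUID so the
    # udev rule matches the partition on the next add event
    sgdisk --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
    partprobe /dev/sdb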

Re: [ceph-users] Infernalis upgrade breaks when journal on separate partition

2016-01-11 Thread Stillwell, Bryan
On 1/10/16, 2:26 PM, "ceph-users on behalf of Stuart Longland" wrote: >On 05/01/16 07:52, Stuart Longland wrote: >>> I ran into this same issue, and found that a reboot ended up setting >>>the >>> > ownership correctly. If you look at >>>/lib/udev/rules.d/95-ceph-osd.rules >>> > you'll see the m

[ceph-users] Unified queue in Infernalis

2016-02-05 Thread Stillwell, Bryan
I saw the following in the release notes for Infernalis, and I'm wondering where I can find more information about it? * There is now a unified queue (and thus prioritization) of client IO, recovery, scrubbing, and snapshot trimming. I've tried checking the docs for more details, but didn't have

Re: [ceph-users] getting rid of misplaced objects

2016-02-11 Thread Stillwell, Bryan
What does 'ceph osd tree' look like for this cluster? Also have you done anything special to your CRUSH rules? I've usually found this to be caused by modifying OSD weights a little too much. As for the inconsistent PG, you should be able to run 'ceph pg repair' on it: http://docs.ceph.com/docs
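
For reference, a minimal sketch of the repair sequence being pointed to (the PG id is hypothetical):

    # find the inconsistent PG
    ceph health detail | grep inconsistent

    # ask the primary OSD to repair it
    ceph pg repair 2.37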

Re: [ceph-users] pg repair behavior? (Was: Re: getting rid of misplaced objects)

2016-02-16 Thread Stillwell, Bryan
Zoltan, It's good to hear that you were able to get the PGs stuck in 'remapped' back into a 'clean' state. Based on your response I'm guessing that your failure domains (node, rack, or maybe row) are too close (or equal) to your replica size. For example if your cluster looks like this: 3 repli

Re: [ceph-users] How to properly deal with NEAR FULL OSD

2016-02-17 Thread Stillwell, Bryan
Vlad, First off your cluster is rather full (80.31%). Hopefully you have hardware ordered for an expansion in the near future. Based on your 'ceph osd tree' output, it doesn't look like the reweight-by-utilization did anything for you. That last number for each OSD is set to 1, which means it d
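
The "last number" is the reweight column in 'ceph osd tree'. A hedged sketch of adjusting it, either in bulk or per OSD (the threshold, id, and value are illustrative):

    # only reweight OSDs more than 20% above the average utilization
    ceph osd reweight-by-utilization 120

    # or nudge a single over-full OSD down by hand
    ceph osd reweight osd.12 0.85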

Re: [ceph-users] How to recover from OSDs full in small cluster

2016-02-18 Thread Stillwell, Bryan
When I've run into this situation I look for PGs that are on the full drives, but are in an active+clean state in the cluster. That way I can safely remove the PGs from the full drives and not have to risk data loss. It usually doesn't take much before you can restart the OSDs and let ceph take c
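
One rough way to check whether the PGs on a full OSD are also active+clean elsewhere, as described above (the OSD id is hypothetical and the grep is a crude match on the acting set):

    # list PGs with their state and acting set, filtered for osd.7 as primary
    ceph pg dump pgs_brief | grep '\[7,'

    # on newer releases the same thing more directly:
    ceph pg ls-by-osd osd.7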

Re: [ceph-users] osd not removed from crush map after ceph osd crush remove

2016-02-22 Thread Stillwell, Bryan
Dimitar, I'm not sure why those PGs would be stuck in the stale+active+clean state. Maybe try upgrading to the 0.80.11 release to see if it's a bug that was fixed already? You can use the 'ceph tell osd.* version' command after the upgrade to make sure all OSDs are running the new version. A
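
For reference, the usual sequence for fully removing an OSD, plus the version check mentioned above (the OSD id is hypothetical):

    ceph osd crush remove osd.5
    ceph auth del osd.5
    ceph osd rm 5

    # after an upgrade, confirm every OSD reports the new release
    ceph tell osd.* version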

Re: [ceph-users] osd not removed from crush map after ceph osd crush remove

2016-02-23 Thread Stillwell, Bryan
am Lead >AXSMarine Sofia >Phone: +359 889 22 55 42 >Skype: dimitar.boichev.axsmarine >E-mail: >dimitar.boic...@axsmarine.com > > >From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] >On Behalf Of Stillwell, Bryan >Sent: Tuesday, February 23, 2016 1:51 AM >To: ceph-users@list

[ceph-users] Over 13,000 osdmaps in current/meta

2016-02-25 Thread Stillwell, Bryan
After evacuating all the PGs from a node in hammer 0.94.5, I noticed that each of the OSDs was still using ~8GB of storage. After investigating, it appears like all the data is coming from around 13,000 files in /usr/lib/ceph/osd/ceph-*/current/meta/ with names like: DIR_4/DIR_0/DIR_0/osdmap.303231
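
A quick way to measure what is being described (the post shows the path under /usr/lib/ceph, but on most installs the OSD data lives under /var/lib/ceph; the 'ceph report' field names are from memory and may vary by release):

    # count and size the cached osdmap epochs on the OSDs of one node
    find /var/lib/ceph/osd/ceph-*/current/meta -name 'osdmap*' | wc -l
    du -sh /var/lib/ceph/osd/ceph-*/current/meta

    # compare against the osdmap epochs the monitors still hold
    ceph report 2>/dev/null | grep -E 'osdmap_(first|last)_committed'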

Re: [ceph-users] Over 13,000 osdmaps in current/meta

2016-02-25 Thread Stillwell, Bryan
n't being trimmed as they should. > > > >On Thu, Feb 25, 2016 at 10:16 AM, Stillwell, Bryan > wrote: > >After evacuating all the PGs from a node in hammer 0.94.5, I noticed that >each of the OSDs was still using ~8GB of storage. After investigating, it >appears like al

Re: [ceph-users] osd not removed from crush map after ceph osd crush remove

2016-02-25 Thread Stillwell, Bryan
hev.axsmarine >E-mail: dimitar.boic...@axsmarine.com > > >-Original Message- >From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >Stillwell, Bryan >Sent: Tuesday, February 23, 2016 7:31 PM >To: ceph-users@lists.ceph.com >Subject: Re: [ceph-users]

Re: [ceph-users] Using bluestore in Jewel 10.0.4

2016-03-14 Thread Stillwell, Bryan
Mark, Since most of us already have existing clusters that use SSDs for journals, has there been any testing of converting that hardware over to using BlueStore and re-purposing the SSDs as a block cache (like using bcache)? To me this seems like it would be a good combination for a typical RBD c

[ceph-users] pg has invalid (post-split) stats; must scrub before tier agent can activate

2016-05-24 Thread Stillwell, Bryan J
On one of my test clusters that I've upgraded from Infernalis to Jewel (10.2.1), I'm having a problem where reads are resulting in unfound objects. I'm using cephfs on top of an erasure coded pool with cache tiering, which I believe is related. From what I can piece together, here is what the

[ceph-users] Rebuilding/recreating CephFS journal?

2016-05-27 Thread Stillwell, Bryan J
I have a Ceph cluster at home that I've been running CephFS on for the last few years. Recently my MDS server became damaged, and while attempting to fix it I believe I've destroyed my CephFS journal based off this: 2016-05-25 16:48:23.882095 7f8d2fac2700 -1 log_channel(cluster) log [ERR] : Error

Re: [ceph-users] Rebuilding/recreating CephFS journal?

2016-05-27 Thread Stillwell, Bryan J
On 5/27/16, 11:27 AM, "Gregory Farnum" wrote: >On Fri, May 27, 2016 at 9:44 AM, Stillwell, Bryan J > wrote: >> I have a Ceph cluster at home that I've been running CephFS on for the >> last few years. Recently my MDS server became damaged and while >> a

Re: [ceph-users] Rebuilding/recreating CephFS journal?

2016-05-27 Thread Stillwell, Bryan J
On 5/27/16, 3:01 PM, "Gregory Farnum" wrote: >> >> So would the next steps be to run the following commands?: >> >> cephfs-table-tool 0 reset session >> cephfs-table-tool 0 reset snap >> cephfs-table-tool 0 reset inode >> cephfs-journal-tool --rank=0 journal reset >> cephfs-data-scan init >> >> c
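
For context, the quoted commands are the standard offline CephFS disaster-recovery sequence; the data-scan steps that normally follow are sketched below (the data pool name is hypothetical, and this reflects the general procedure rather than what was ultimately agreed in the thread):

    cephfs-table-tool 0 reset session
    cephfs-table-tool 0 reset snap
    cephfs-table-tool 0 reset inode
    cephfs-journal-tool --rank=0 journal reset
    cephfs-data-scan init
    cephfs-data-scan scan_extents cephfs_data   # hypothetical data pool name
    cephfs-data-scan scan_inodes cephfs_data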

Re: [ceph-users] Rebuilding/recreating CephFS journal?

2016-05-27 Thread Stillwell, Bryan J
mark it as repaired. That's a monitor command. > >On Fri, May 27, 2016 at 2:09 PM, Stillwell, Bryan J > wrote: >> On 5/27/16, 3:01 PM, "Gregory Farnum" wrote: >> >>>> >>>> So would the next steps be to run the following commands?: >

Re: [ceph-users] Rebuilding/recreating CephFS journal?

2016-05-27 Thread Stillwell, Bryan J
On 5/27/16, 3:23 PM, "Gregory Farnum" wrote: >On Fri, May 27, 2016 at 2:22 PM, Stillwell, Bryan J > wrote: >> Here's the full 'ceph -s' output: >> >> # ceph -s >> cluster c7ba6111-e0d6-40e8-b0af-8428e8702df9 >> health HEALT

Re: [ceph-users] pg has invalid (post-split) stats; must scrub before tier agent can activate

2016-06-16 Thread Stillwell, Bryan J
6, 4:27 PM, "Stillwell, Bryan J" wrote: >On one of my test clusters that I've upgraded from Infernalis to Jewel >(10.2.1), I'm having a problem where reads are resulting in unfound >objects. > >I'm using cephfs on top of an erasure coded pool with cache tiering which I >believ

[ceph-users] Multi-device BlueStore testing

2016-07-19 Thread Stillwell, Bryan J
I would like to do some BlueStore testing using multiple devices like mentioned here: https://www.sebastien-han.fr/blog/2016/05/04/Ceph-Jewel-configure-BlueStore-with-multiple-devices/ However, si

[ceph-users] Multi-device BlueStore OSDs multiple fsck failures

2016-08-03 Thread Stillwell, Bryan J
I've been doing some benchmarking of BlueStore in 10.2.2 the last few days and have come across a failure that keeps happening after stressing the cluster fairly heavily. Some of the OSDs started failing and attempts to restart them fail to log anything in /var/log/ceph/, so I tried starting them

Re: [ceph-users] Multi-device BlueStore OSDs multiple fsck failures

2016-08-03 Thread Stillwell, Bryan J
>This is a good test case and I doubt any of us testing by enabling fsck() >on mount/unmount. > >Thanks & Regards >Somnath > >-Original Message- >From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >Stillwell, Bryan J >Sent: Wednesday,
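
The fsck-on-mount behaviour Somnath refers to is controlled by OSD config options; a minimal sketch, assuming Jewel-era experimental BlueStore (option names as I recall them, so treat them as an assumption):

    [osd]
    bluestore fsck on mount = true
    bluestore fsck on umount = true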

[ceph-users] Upgrading 0.94.6 -> 0.94.9 saturating mon node networking

2016-09-21 Thread Stillwell, Bryan J
While attempting to upgrade a 1200+ OSD cluster from 0.94.6 to 0.94.9 I've run into serious performance issues every time I restart an OSD. At first I thought the problem I was running into was caused by the osdmap encoding bug that Dan and Wido ran into when upgrading to 0.94.7, because I was see

Re: [ceph-users] [EXTERNAL] Upgrading 0.94.6 -> 0.94.9 saturating mon node networking

2016-09-23 Thread Stillwell, Bryan J
causing some kind of spinlock condition. > >> On Sep 21, 2016, at 4:21 PM, Stillwell, Bryan J >> wrote: >> >> While attempting to upgrade a 1200+ OSD cluster from 0.94.6 to 0.94.9 >>I've >> run into serious performance issues every time I restart an OSD.

Re: [ceph-users] upgrade from v0.94.6 or lower and 'failed to encode map X with expected crc'

2016-10-06 Thread Stillwell, Bryan J
Thanks Kefu! Downgrading the mons to 0.94.6 got us out of this situation. I appreciate you tracking this down! Bryan On 10/4/16, 1:18 AM, "ceph-users on behalf of kefu chai" wrote: >hi ceph users, > >If user upgrades the cluster from a prior release to v0.94.7 or up by >following the steps: >

[ceph-users] Missing arm64 Ubuntu packages for 10.2.3

2016-10-13 Thread Stillwell, Bryan J
I have a basement cluster that is partially built with Odroid-C2 boards, and when I attempted to upgrade to the 10.2.3 release I noticed that this release doesn't have an arm64 build. Are there any plans on continuing to make arm64 builds? Thanks, Bryan

Re: [ceph-users] Missing arm64 Ubuntu packages for 10.2.3

2016-10-13 Thread Stillwell, Bryan J
On 10/13/16, 2:32 PM, "Alfredo Deza" wrote: >On Thu, Oct 13, 2016 at 11:33 AM, Stillwell, Bryan J > wrote: >> I have a basement cluster that is partially built with Odroid-C2 boards >>and >> when I attempted to upgrade to the 10.2.3 release I noticed that thi

Re: [ceph-users] Missing arm64 Ubuntu packages for 10.2.3

2016-10-14 Thread Stillwell, Bryan J
On 10/14/16, 2:29 PM, "Alfredo Deza" wrote: >On Thu, Oct 13, 2016 at 5:19 PM, Stillwell, Bryan J > wrote: >> On 10/13/16, 2:32 PM, "Alfredo Deza" wrote: >> >>>On Thu, Oct 13, 2016 at 11:33 AM, Stillwell, Bryan J >>> wrote: >>>&

[ceph-users] Announcing the ceph-large mailing list

2016-10-20 Thread Stillwell, Bryan J
Do you run a large Ceph cluster? Do you find that you run into issues that you didn't have when your cluster was smaller? If so we have a new mailing list for you! Announcing the new ceph-large mailing list. This list is targeted at experienced Ceph operators with cluster(s) over 500 OSDs to di

[ceph-users] Total free space in addition to MAX AVAIL

2016-11-01 Thread Stillwell, Bryan J
I recently learned that 'MAX AVAIL' in the 'ceph df' output doesn't represent what I thought it did. It actually represents the amount of data that can be used before the first OSD becomes full, and not the sum of all free space across a set of OSDs. This means that balancing the data with 'ceph
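
A quick way to see both numbers being contrasted here, the pool-level MAX AVAIL and the per-OSD usage whose worst case drives it:

    ceph df            # per-pool MAX AVAIL, limited by the fullest OSD
    ceph osd df tree   # per-OSD %USE, to spot the OSD constraining MAX AVAIL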

Re: [ceph-users] Total free space in addition to MAX AVAIL

2016-11-01 Thread Stillwell, Bryan J
On 11/1/16, 1:45 PM, "Sage Weil" wrote: >On Tue, 1 Nov 2016, Stillwell, Bryan J wrote: >> I recently learned that 'MAX AVAIL' in the 'ceph df' output doesn't >> represent what I thought it did. It actually represents the amount of >> data

[ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-09 Thread Stillwell, Bryan J
Last week I decided to play around with Kraken (11.1.1-1xenial) on a single node, two OSD cluster, and after a while I noticed that the new ceph-mgr daemon is frequently using a lot of the CPU: 17519 ceph 20 0 850044 168104208 S 102.7 4.3 1278:27 ceph-mgr Restarting it with 'system
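
The truncated 'system...' is presumably systemctl; a hedged sketch of inspecting and restarting the daemon (the mgr id "node1" is hypothetical):

    # see what the manager is doing before bouncing it
    ceph daemon mgr.node1 perf dump

    # restart the daemon
    systemctl restart ceph-mgr@node1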

Re: [ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-10 Thread Stillwell, Bryan J
On 1/10/17, 5:35 AM, "John Spray" wrote: >On Mon, Jan 9, 2017 at 11:46 PM, Stillwell, Bryan J > wrote: >> Last week I decided to play around with Kraken (11.1.1-1xenial) on a >> single node, two OSD cluster, and after a while I noticed that the new >> ceph-mgr d

Re: [ceph-users] Crushmap (tunables) flapping on cluster

2017-01-10 Thread Stillwell, Bryan J
On 1/10/17, 2:56 AM, "ceph-users on behalf of Breunig, Steve (KASRL)" wrote: >Hi list, > > >I'm running a cluster which is currently in migration from hammer to >jewel. > > >Actually i have the problem, that the tunables are flapping and a map of >an rbd image is not working. > > >It is flapping

Re: [ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-10 Thread Stillwell, Bryan J
, 2017 at 9:00 AM, Stillwell, Bryan J > wrote: >> On 1/10/17, 5:35 AM, "John Spray" wrote: >> >>>On Mon, Jan 9, 2017 at 11:46 PM, Stillwell, Bryan J >>> wrote: >>>> Last week I decided to play around with Kraken (11.1.1-1xenial) on a >>>>

Re: [ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-10 Thread Stillwell, Bryan J
>On Tue, Jan 10, 2017 at 9:41 AM, Stillwell, Bryan J > wrote: >> This is from: >> >> ceph version 11.1.1 (87597971b371d7f497d7eabad3545d72d18dd755) >> >> On 1/10/17, 10:23 AM, "Samuel Just" wrote: >> >>>What ceph sha1 is that? Does it incl

Re: [ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-11 Thread Stillwell, Bryan J
n behalf of Stillwell, Bryan J" wrote: >On 1/10/17, 5:35 AM, "John Spray" wrote: > >>On Mon, Jan 9, 2017 at 11:46 PM, Stillwell, Bryan J >> wrote: >>> Last week I decided to play around with Kraken (11.1.1-1xenial) on a >>> single node, two OSD cluster

Re: [ceph-users] OSD create with SSD journal

2017-01-11 Thread Stillwell, Bryan J
On 1/11/17, 10:31 AM, "ceph-users on behalf of Reed Dier" wrote: >>2017-01-03 12:10:23.514577 7f1d821f2800 0 ceph version 10.2.5 >>(c461ee19ecbc0c5c330aca20f7392c9a00730367), process ceph-osd, pid 19754 >> 2017-01-03 12:10:23.517465 7f1d821f2800 1 >>filestore(/var/lib/ceph/tmp/mnt.WaQmjK) mkfs

Re: [ceph-users] Experience with 5k RPM/archive HDDs

2017-02-03 Thread Stillwell, Bryan J
On 2/3/17, 3:23 AM, "ceph-users on behalf of Wido den Hollander" wrote: > >> Op 3 februari 2017 om 11:03 schreef Maxime Guyot >>: >> >> >> Hi, >> >> Interesting feedback! >> >> > In my opinion the SMR can be used exclusively for the RGW. >> > Unless it's something like a backup/archive clus