Re: [ceph-users] expanding cluster with minimal impact

2017-08-08 Thread Laszlo Budai
Hello, Thank you all for sharing your experiences and thoughts. One more question: regarding the pool used for the measurement (the -p option of the script), is it recommended to create a new pool for this, or could I use one of our already existing pools in the cluster? Thank you, Laszlo On 0
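As a sketch (the pool name and PG count below are placeholders, not from the thread), a dedicated throwaway pool for the measurement could be created and cleaned up like this:

    # create a dedicated pool for the benchmark
    ceph osd pool create testbench 128 128
    # ... run the measurement script with -p testbench ...
    # delete it afterwards; the mons must permit pool deletion
    ceph osd pool delete testbench testbench --yes-i-really-really-mean-it

A dedicated pool has the advantage that its objects can be dropped wholesale afterwards without touching production data.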

Re: [ceph-users] expanding cluster with minimal impact

2017-08-08 Thread Laszlo Budai
Hi Dan, Thank you for your answer. Yes, I understood that I need to have the initial crush weight 0. That's how I tested when manually adding OSDs in my test cluster. I see that with the settings mentioned by you, adding OSDs using the ceph-disk tool will also have the crush weight 0, so I cou
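For reference, the setting under discussion is osd_crush_initial_weight; a minimal ceph.conf sketch (assuming this is the option Dan referred to):

    [osd]
    # new OSDs join the CRUSH map with weight 0, so no data migrates
    # until each OSD is reweighted explicitly
    osd crush initial weight = 0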

Re: [ceph-users] expanding cluster with minimal impact

2017-08-08 Thread Dan van der Ster
Hi Bryan, How does the norebalance procedure work? You set the flag, increase the weight, then I expect the PGs to stay in remapped unless they're degraded ... why would a PG be degraded just because of a weight change? And then what happens when you unset norebalance? Cheers, Dan On Mon, Aug 7

[ceph-users] Running commands on Mon or OSD nodes

2017-08-08 Thread Osama Hasebou
Hi Everyone, I was trying to run the ceph osd crush reweight command to move data out of one node that has hardware failures and I noticed that as I set the crush reweight to 0, some nodes would reflect it when I do ceph osd tree and some wouldn't. What is the proper way to run command access
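For illustration (the OSD id below is hypothetical), draining a failing node one OSD at a time looks like:

    # move data off osd.12 by setting its CRUSH weight to 0
    ceph osd crush reweight osd.12 0
    # the weight change lives in the cluster map held by the mons, so
    # "ceph osd tree" should report the same value from any node
    ceph osd tree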

Re: [ceph-users] how to fix X is an unexpected clone

2017-08-08 Thread Gregory Farnum
On Mon, Aug 7, 2017 at 11:55 PM Stefan Priebe - Profihost AG < s.pri...@profihost.ag> wrote: > Hello, > > how can i fix this one: > > 2017-08-08 08:42:52.265321 osd.20 [ERR] repair 3.61a > 3:58654d3d:::rbd_data.106dd406b8b4567.018c:9d455 is an > unexpected clone > 2017-08-08 08:43:04.9

Re: [ceph-users] how to fix X is an unexpected clone

2017-08-08 Thread Stefan Priebe - Profihost AG
Hello Greg, Am 08.08.2017 um 11:56 schrieb Gregory Farnum: > On Mon, Aug 7, 2017 at 11:55 PM Stefan Priebe - Profihost AG > mailto:s.pri...@profihost.ag>> wrote: > > Hello, > > how can i fix this one: > > 2017-08-08 08:42:52.265321 osd.20 [ERR] repair 3.61a > 3:58654d3d:::rbd_da

[ceph-users] RGW - Unable to delete bucket with radosgw-admin

2017-08-08 Thread Andreas Calminder
Hi, I'm running into a weird issue while trying to delete a bucket with radosgw-admin # radosgw-admin --cluster ceph bucket rm --bucket=12856/weird_bucket --purge-objects This returns almost instantly even though the bucket contains +1M objects and the bucket isn't removed. Running above command
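One knob worth knowing when deleting very large buckets (whether it helps in this particular case is not established in the thread, and flag availability depends on the radosgw version) is skipping the garbage collector:

    # purge the objects and bypass the gc queue
    radosgw-admin --cluster ceph bucket rm --bucket=12856/weird_bucket \
        --purge-objects --bypass-gc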

Re: [ceph-users] implications of losing the MDS map

2017-08-08 Thread John Spray
On Tue, Aug 8, 2017 at 1:51 AM, Daniel K wrote: > I finally figured out how to get the ceph-monstore-tool (compiled from > source) and am ready to attemp to recover my cluster. > > I have one question -- in the instructions, > http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-
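For context, the rebuild step in the mon-store recovery procedure looks roughly like this (paths are placeholders; the exact invocation should be checked against the troubleshooting guide linked above):

    # rebuild the mon store from data gathered off the OSDs; the keyring
    # must contain the admin and mon. keys
    ceph-monstore-tool /tmp/mon-store rebuild -- --keyring /path/to/admin.keyring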

Re: [ceph-users] 答复: hammer(0.94.5) librbd dead lock,i want to how to resolve

2017-08-08 Thread Jason Dillaman
The hammer release is nearly end-of-life pending the release of luminous. I wouldn't say it's a bug so much as a consequence of timing out RADOS operations -- as I stated before, you most likely have another thread stuck waiting on the cluster while that lock is held, but you only provided the back

[ceph-users] New install error

2017-08-08 Thread Timothy Wolgemuth
I have a new installation and following the quick start guide at: http://docs.ceph.com/docs/master/start/quick-ceph-deploy/ Running into the following error in the create-initial step. See below: $ ceph-deploy --username ceph-deploy mon create-initial [ceph_deploy.conf][DEBUG ] found configur

Re: [ceph-users] how to fix X is an unexpected clone

2017-08-08 Thread Steve Taylor
I encountered this same issue on two different clusters running Hammer 0.94.9 last week. In both cases I was able to resolve it by deleting (moving) all replicas of the unexpected clone manually and issuing a pg repair. Which version did you see this on? A call stack for the resulting crash woul
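A hedged sketch of the manual removal Steve describes, using the PG and object prefix from Stefan's log (the OSD id, paths, and exact object identifier are placeholders; on FileStore the journal path is required, and the OSD must be stopped first):

    # stop one OSD holding a replica of the unexpected clone
    service ceph stop osd.20
    # find the clone's JSON identifier inside pg 3.61a
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-20 \
        --journal-path /var/lib/ceph/osd/ceph-20/journal \
        --pgid 3.61a --op list | grep 106dd406b8b4567
    # remove it, restart the OSD, repeat for every replica, then repair
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-20 \
        --journal-path /var/lib/ceph/osd/ceph-20/journal \
        '<object-json-from-list>' remove
    service ceph start osd.20
    ceph pg repair 3.61a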

Re: [ceph-users] All flash ceph witch NVMe and SPDK

2017-08-08 Thread Mike A
> On 7 Aug 2017, at 9:54, Wido den Hollander wrote: > > >> On 3 August 2017 at 15:28, Mike A wrote: >> >> >> Hello >> >> Our goal is to make the storage as fast as possible. >> By now our configuration of 6 servers looks like this: >> * 2 x CPU Intel Gold 6150 20 core 2.4Ghz >> * 2 x 16 Gb

[ceph-users] One Monitor filling the logs

2017-08-08 Thread Konrad Riedel
Hi Ceph users, my luminous (ceph version 12.1.1) testcluster is doing fine, except that one Monitor is filling the logs -rw-r--r-- 1 ceph ceph 119M Aug 8 15:27 ceph-mon.1.log ceph-mon.1.log: 2017-08-08 15:57:49.509176 7ff4573c4700 0 log_channel(cluster) log [DBG] : Standby manager daemon

Re: [ceph-users] Pg inconsistent / export_files error -5

2017-08-08 Thread Marc Roos
The --debug indeed comes up with something bluestore(/var/lib/ceph/osd/ceph-12) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x100ac314, expected 0x90407f75, device location [0x15a017~1000], logical extent 0x0~1000, bluestore(/var/lib/ceph/osd/ceph-9) _verify_csum bad

[ceph-users] bluestore on luminous using ramdisk?

2017-08-08 Thread matthew.wells
Hi, I'm coming at this with not a lot of ceph experience but some enthusiasm, so forgive me if this is an inappropriate question, but is there any reason why it's not possible, in theory, to set up bluestore using a ramdisk? In my application I can afford to risk losing all data on system failure/re

Re: [ceph-users] CEPH bluestore space consumption with small objects

2017-08-08 Thread Pavel Shub
Marcus, You may want to look at the bluestore_min_alloc_size setting, as well as the respective bluestore_min_alloc_size_ssd and bluestore_min_alloc_size_hdd. By default bluestore sets a 64k block size for ssds. I'm also using ceph for small objects and I've seen my OSD usage go down from 80% to 20%
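A minimal ceph.conf sketch of those settings (values are illustrative, not a recommendation; note that min_alloc_size is baked in when the OSD is created, so changing it requires re-provisioning the OSD):

    [osd]
    # allocation granularity for small objects; smaller values reduce
    # space overhead at the cost of more metadata
    bluestore min alloc size ssd = 4096
    bluestore min alloc size hdd = 4096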

Re: [ceph-users] bluestore on luminous using ramdisk?

2017-08-08 Thread Gregory Farnum
I've no idea how the setup would go, but there's also a "Memstore" backend. It's used exclusively for testing, may or may not scale well, and doesn't have integration with the tooling, but it's got very limited setup (I think you just start an OSD with the appropriate config options set). You might
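A minimal sketch of such a memstore OSD configuration (option names as found in the source tree; the size is an example):

    [osd]
    # testing-only backend: all data lives in RAM and is lost on restart
    osd objectstore = memstore
    # per-OSD memory budget in bytes (4 GiB here)
    memstore device bytes = 4294967296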

Re: [ceph-users] CEPH bluestore space consumption with small objects

2017-08-08 Thread Marcus Haarmann
Hi, I can check if this would change anything, but we are currently trying to find a different solution. The issue we ran into using rados as a backend with a bluestore osd was that every object seems to be cached in the osd, and the memory consumption of the osd kept increasing. T

Re: [ceph-users] ceph cluster experiencing major performance issues

2017-08-08 Thread David Turner
Are you also seeing osds marking themselves down for a little bit and then coming back up? There are 2 very likely problems causing/contributing to this. The first is if you are using a lot of snapshots. Deleting snapshots is a very expensive operation for your cluster and can cause a lot of slo
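If snapshot trimming turns out to be the culprit, one commonly used throttle from this era (the value shown is an example, not a recommendation) is:

    # pause between snap-trim operations on every OSD, smoothing the load
    ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.1'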

[ceph-users] Mon time to form quorum

2017-08-08 Thread Travis Nielsen
At cluster creation I'm seeing that the mons are taking a long time to form quorum. It seems like I'm hitting a timeout of 60s somewhere. Am I missing a config setting that would help paxos establish quorum sooner? When initializing with the monmap I would have expected the mons to initialize v

[ceph-users] jewel - radosgw-admin bucket limit check broken?

2017-08-08 Thread Sam Wouters
Hi, I wanted to test the new feature that checks the present buckets for optimal index sharding. According to the docs this should be as simple as "radosgw-admin -n client.xxx bucket limit check", with an optional param for printing only buckets over or nearing the limit. When I invoke this, howe
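For reference, the invocation being tested (the --warnings-only flag is per the docs; whether the jewel build in question honors it is exactly what the thread is probing):

    # report index sharding status for all buckets of the user
    radosgw-admin -n client.xxx bucket limit check
    # restrict output to buckets over or nearing the shard limit
    radosgw-admin -n client.xxx bucket limit check --warnings-only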

Re: [ceph-users] expanding cluster with minimal impact

2017-08-08 Thread bstillw...@godaddy.com
Dan, I set norebalance, do a bunch of reweights, then unset norebalance. Degraded PGs will still recover as long as they're not waiting on one of the PGs that is marked as backfilling (which does happen). What I believe is happening is that when you change CRUSH weights while PGs are actively
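A sketch of the procedure Bryan describes (OSD ids and weights are examples):

    ceph osd set norebalance
    # apply the whole batch of weight changes
    ceph osd crush reweight osd.42 3.64
    ceph osd crush reweight osd.43 3.64
    ceph osd unset norebalance
    # remapped PGs now begin backfilling; degraded PGs recover regardless,
    # unless they wait on a PG already marked backfilling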

[ceph-users] Two clusters on same hosts - mirroring

2017-08-08 Thread Oscar Segarra
Hi, I'd like to use the mirroring feature http://docs.ceph.com/docs/master/rbd/rbd-mirroring/ In my environment I have just one host (at the moment, for testing purposes before production deployment). I want to use: /dev/sdb for standard operation, /dev/sdc for mirror. Of course, I'd like to
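For two clusters coexisting on one host, the per-command --cluster flag keeps them apart; a rough sketch of pool-level mirroring between clusters named site-a and site-b (cluster and pool names are placeholders):

    # enable mirroring on the pool in both clusters
    rbd --cluster site-a mirror pool enable rbd pool
    rbd --cluster site-b mirror pool enable rbd pool
    # register each cluster as the other's peer
    rbd --cluster site-a mirror pool peer add rbd client.site-b@site-b
    rbd --cluster site-b mirror pool peer add rbd client.site-a@site-a
    # the rbd-mirror daemon runs against the cluster receiving the replicas
    rbd-mirror --cluster site-b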

Re: [ceph-users] Inconsistent pgs with size_mismatch_oi

2017-08-08 Thread Lincoln Bryant
Hi all, Apologies for necromancing an old thread, but I was wondering if anyone had any more thoughts on this. We're running v10.2.9 now and still have 3 PGs exhibiting this behavior in our cache pool after scrubs, deep-scrubs, and repair attempts. Some more information below. Thanks much, Li
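For anyone comparing notes, the inconsistency details can be dumped per PG (the pgid is a placeholder):

    # list which objects mismatch and on which OSDs, including the
    # size_mismatch_oi error type
    rados list-inconsistent-obj <pgid> --format=json-pretty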

Re: [ceph-users] ceph cluster experiencing major performance issues

2017-08-08 Thread Mclean, Patrick
On 08/08/17 10:50 AM, David Turner wrote: > Are you also seeing osds marking themselves down for a little bit and > then coming back up? There are 2 very likely problems > causing/contributing to this. The first is if you are using a lot of > snapshots. Deleting snapshots is a very expensive ope

Re: [ceph-users] Running commands on Mon or OSD nodes

2017-08-08 Thread David Turner
Regardless of which node you run that command on, the command is talking to the mons. If you are getting different values between different nodes, double check their configs and make sure your mon quorum isn't somehow in a split-brain scenario. Which version of Ceph are you running? On Tue, Aug
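Two quick checks for that (the mon id in the second command is a placeholder):

    # ask the cluster, from any node, who is currently in quorum
    ceph quorum_status --format json-pretty
    # ask one specific mon directly through its admin socket
    ceph daemon mon.<id> mon_status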

[ceph-users] Iscsi configuration

2017-08-08 Thread Samuel Soulard
Hi all, Platform: CentOS 7, Luminous 12.1.2. First time here, but are there any guides or guidelines out there on how to configure iSCSI gateways in HA so that if one gateway fails, IO can continue on the passive node? What I've done so far: -iSCSI node with Ceph client, map rbd on boot -Rbd has ex

Re: [ceph-users] ceph cluster experiencing major performance issues

2017-08-08 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Mclean, Patrick > Sent: 08 August 2017 20:13 > To: David Turner ; ceph-us...@ceph.com > Cc: Colenbrander, Roelof ; Payno, > Victor ; Yip, Rae > Subject: Re: [ceph-users] ceph cluster experie

Re: [ceph-users] One Monitor filling the logs

2017-08-08 Thread Mehmet
I guess this is related to "debug_mgr": "1/5", but not sure... Give it a try. HTH, Mehmet. On 8 August 2017 16:28:21 CEST, Konrad Riedel wrote: >Hi Ceph users, > >my luminous (ceph version 12.1.1) testcluster is doing fine, except >that >one Monitor is filling the logs > > -rw-r--r-- 1 c
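A sketch of trying that suggestion at runtime, targeting the noisy mon (the mon id is per the report; the value lowers only the mgr-related log level):

    ceph tell mon.1 injectargs '--debug_mgr 0/5'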

Re: [ceph-users] Iscsi configuration

2017-08-08 Thread Adrian Saul
Hi Sam, We use SCST for iSCSI with Ceph, and a pacemaker cluster to orchestrate the management of active/passive presentation using ALUA though SCST device groups. In our case we ended up writing our own pacemaker resources to support our particular model and preferences, but I believe there

Re: [ceph-users] Iscsi configuration

2017-08-08 Thread Jason Dillaman
We are working hard to formalize active/passive iSCSI configuration across Linux/Windows/ESX via LIO. We have integrated librbd into LIO's tcmu-runner and have developed a set of support applications for managing the clustered configuration of your iSCSI targets. There is some preliminary documentat

Re: [ceph-users] Pg inconsistent / export_files error -5

2017-08-08 Thread Brad Hubbard
Wee On Wed, Aug 9, 2017 at 12:41 AM, Marc Roos wrote: > > > > The --debug indeed comes up with something > bluestore(/var/lib/ceph/osd/ceph-12) _verify_csum bad crc32c/0x1000 > checksum at blob offset 0x0, got 0x100ac314, expected 0x90407f75, device > location [0x15a017~1000], logical extent

Re: [ceph-users] Pg inconsistent / export_files error -5

2017-08-08 Thread Sage Weil
On Wed, 9 Aug 2017, Brad Hubbard wrote: > Wee > > On Wed, Aug 9, 2017 at 12:41 AM, Marc Roos wrote: > > > > > > > > The --debug indeed comes up with something > > bluestore(/var/lib/ceph/osd/ceph-12) _verify_csum bad crc32c/0x1000 > > checksum at blob offset 0x0, got 0x100ac314, expected 0x90407f

Re: [ceph-users] New install error

2017-08-08 Thread Brad Hubbard
On ceph01 if you login as ceph-deploy and run the following command what output do you get? $ sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph01/keyring auth get client.admin On Tue, Aug 8, 2017 at 11:41 PM, Timothy Wolgemuth wrote: > I hav