Re: [ceph-users] Write back cache removal

2017-01-10 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Wido > den Hollander > Sent: 10 January 2017 07:54 > To: ceph new ; Stuart Harland > > Subject: Re: [ceph-users] Write back cache removal > > > > On 9 January 2017 at 13:02, Stuart Ha

[ceph-users] rgw swift api long term support

2017-01-10 Thread Marius Vaitiekunas
Hi, I would like to ask the ceph developers if there is any chance that swift api support for rgw is going to be dropped in the future (say, within 5 years). Why am I asking? :) We were happy openstack glance users on the ceph s3 api until openstack decided to drop glance s3 support. So, we need to switch our

Re: [ceph-users] Write back cache removal

2017-01-10 Thread Wido den Hollander
> On 10 January 2017 at 9:52, Nick Fisk wrote: > > > > -Original Message- > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > > Wido den Hollander > > Sent: 10 January 2017 07:54 > > To: ceph new ; Stuart Harland > > > > Subject: Re: [ceph-users] Write bac

[ceph-users] Crushmap (tunables) flapping on cluster

2017-01-10 Thread Breunig, Steve (KASRL)
Hi list, I'm running a cluster which is currently being migrated from hammer to jewel. Currently I have the problem that the tunables are flapping and mapping an rbd image is not working. It is flapping between: { "choose_local_tries": 0, "choose_local_fallback_tries": 0, "choose
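(Aside: to see which tunables a cluster is actually applying, the crushmap can be dumped and decompiled; file names below are placeholders.)
    ceph osd crush show-tunables               # query the tunables currently in effect
    ceph osd getcrushmap -o crushmap.bin       # export the binary crushmap
    crushtool -d crushmap.bin -o crushmap.txt  # decompile it; the tunables appear at the top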

Re: [ceph-users] Write back cache removal

2017-01-10 Thread jiajia zhong
It's been fixed since v0.94.6, http://ceph.com/releases/v0-94-6-hammer-released/ - fs: CephFS restriction on removing cache tiers is overly strict (issue#11504, pr#6402, John Spray) - but you have to make sure you
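(For context, the writeback-tier removal being discussed normally follows the documented cache-tiering sequence; a sketch, with pool names as placeholders.)
    ceph osd tier cache-mode cachepool forward  # stop admitting new writes into the cache tier
    rados -p cachepool cache-flush-evict-all    # flush and evict every remaining object
    ceph osd tier remove-overlay basepool       # detach the overlay from the base pool
    ceph osd tier remove basepool cachepool     # finally remove the tier relationship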

Re: [ceph-users] Write back cache removal

2017-01-10 Thread Stuart Harland
Yes Wido, you are correct. There is an RBD pool in the cluster, but it is not currently running with a cache attached. The pool I'm trying to manage here is only used by librados to write objects directly, as opposed to any of the other niceties that ceph provides. Specifically I ran: `

Re: [ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-10 Thread John Spray
On Mon, Jan 9, 2017 at 11:46 PM, Stillwell, Bryan J wrote: > Last week I decided to play around with Kraken (11.1.1-1xenial) on a > single node, two OSD cluster, and after a while I noticed that the new > ceph-mgr daemon is frequently using a lot of the CPU: > > 17519 ceph 20 0 850044 1681

[ceph-users] pg stuck in peering while power failure

2017-01-10 Thread Craig Chi
Hi list, I am testing the stability of my Ceph cluster under power failure. I brutally powered off 2 Ceph units, each with 90 OSDs, while client I/O was continuing. Since then, some of the pgs of my cluster are stuck in peering: pgmap v3261136: 17408 pgs, 4 pools, 176 TB data, 5082 kobject
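(The usual first steps for inspecting pgs stuck like this look roughly as follows; the pgid is a placeholder.)
    ceph health detail           # lists stuck pgs and the osds they are waiting for
    ceph pg dump_stuck inactive  # summarises pgs that are not active
    ceph pg 1.2f3 query          # full peering state for one pg, including past_intervals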

Re: [ceph-users] pg stuck in peering while power failure

2017-01-10 Thread Samuel Just
{ "name": "Started\/Primary\/Peering", "enter_time": "2017-01-10 13:43:34.933074", "past_intervals": [ { "first": 75858, "last": 75860, "maybe_went_rw": 1, "up

Re: [ceph-users] PGs stuck active+remapped and osds lose data?!

2017-01-10 Thread Samuel Just
Shinobu isn't correct, you have 9/9 osds up and running. up does not equal acting because crush is having trouble fulfilling the weights in your crushmap and the acting set is being padded out with an extra osd which happens to have the data to keep you up to the right number of replicas. Please
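(The difference Sam describes between the up set and the acting set can be seen directly; the pgid below is a placeholder.)
    ceph pg map 1.2f3        # prints "... -> up [a,b,c] acting [a,b,c,d]" for a single pg
    ceph pg dump pgs_brief   # UP and ACTING columns for every pg at once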

Re: [ceph-users] rgw swift api long term support

2017-01-10 Thread Yehuda Sadeh-Weinraub
On Tue, Jan 10, 2017 at 1:35 AM, Marius Vaitiekunas wrote: > Hi, > > I would like to ask ceph developers if there any chance that swift api > support for rgw is going to be dropped in the future (like in 5 years). > > Why am I asking? :) > > We were happy openstack glance users on ceph s3 api unti

Re: [ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-10 Thread Stillwell, Bryan J
On 1/10/17, 5:35 AM, "John Spray" wrote: >On Mon, Jan 9, 2017 at 11:46 PM, Stillwell, Bryan J > wrote: >> Last week I decided to play around with Kraken (11.1.1-1xenial) on a >> single node, two OSD cluster, and after a while I noticed that the new >> ceph-mgr daemon is frequently using a lot of

Re: [ceph-users] Crushmap (tunables) flapping on cluster

2017-01-10 Thread Stillwell, Bryan J
On 1/10/17, 2:56 AM, "ceph-users on behalf of Breunig, Steve (KASRL)" wrote: >Hi list, > > >I'm running a cluster which is currently in migration from hammer to >jewel. > > >Actually i have the problem, that the tunables are flapping and a map of >an rbd image is not working. > > >It is flapping

[ceph-users] Your company listed as a user / contributor on ceph.com

2017-01-10 Thread Patrick McGarry
Hey cephers, Now that we're getting ready to launch the new ceph.com site, I'd like to open it up to anyone that would like to have their company logo listed as either a "ceph user" or "ceph contributor" with a hyperlink to your site. In order to do this I'll need you to send me a logo that is at

Re: [ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-10 Thread Samuel Just
What ceph sha1 is that? Does it include 6c3d015c6854a12cda40673848813d968ff6afae which fixed the messenger spin? -Sam On Tue, Jan 10, 2017 at 9:00 AM, Stillwell, Bryan J wrote: > On 1/10/17, 5:35 AM, "John Spray" wrote: > >>On Mon, Jan 9, 2017 at 11:46 PM, Stillwell, Bryan J >> wrote: >>> Last

Re: [ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-10 Thread Stillwell, Bryan J
This is from: ceph version 11.1.1 (87597971b371d7f497d7eabad3545d72d18dd755) On 1/10/17, 10:23 AM, "Samuel Just" wrote: >What ceph sha1 is that? Does it include >6c3d015c6854a12cda40673848813d968ff6afae which fixed the messenger >spin? >-Sam > >On Tue, Jan 10, 2017 at 9:00 AM, Stillwell, Bryan

Re: [ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-10 Thread Samuel Just
Can you push that branch somewhere? I don't have a v11.1.1 or that sha1. -Sam On Tue, Jan 10, 2017 at 9:41 AM, Stillwell, Bryan J wrote: > This is from: > > ceph version 11.1.1 (87597971b371d7f497d7eabad3545d72d18dd755) > > On 1/10/17, 10:23 AM, "Samuel Just" wrote: > >>What ceph sha1 is that?

Re: [ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-10 Thread Stillwell, Bryan J
That's strange, I installed that version using packages from here: http://download.ceph.com/debian-kraken/pool/main/c/ceph/ Bryan On 1/10/17, 10:51 AM, "Samuel Just" wrote: >Can you push that branch somewhere? I don't have a v11.1.1 or that sha1. >-Sam > >On Tue, Jan 10, 2017 at 9:41 AM, Sti

Re: [ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-10 Thread Samuel Just
Mm, maybe the tag didn't get pushed. Alfredo, is there supposed to be a v11.1.1 tag? -Sam On Tue, Jan 10, 2017 at 9:57 AM, Stillwell, Bryan J wrote: > That's strange, I installed that version using packages from here: > > http://download.ceph.com/debian-kraken/pool/main/c/ceph/ > > > Bryan > > O

Re: [ceph-users] Failing to Activate new OSD ceph-deploy

2017-01-10 Thread David Turner
Did you ever figure out why the /var/lib/ceph/osd/ceph-22 folder was not being created automatically? We are having this issue while testing adding storage to a ceph cluster upgraded to jewel. Like you, manually creating the directory and setting the permissions for the directory will allow u

Re: [ceph-users] Analysing ceph performance with SSD journal, 10gbe NIC and 2 replicas -Hammer release

2017-01-10 Thread Brian Andrus
On Mon, Jan 9, 2017 at 3:33 PM, Willem Jan Withagen wrote: > On 9-1-2017 23:58, Brian Andrus wrote: > > Sorry for spam... I meant D_SYNC. > > That term does not run any lights in Google... > So I would expect it has to be O_DSYNC. > (https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to- > test-i
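(For reference, the journal-suitability test described in the linked blog post is typically run along these lines; the device name is a placeholder and the test overwrites data on it.)
    fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=journal-test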

Re: [ceph-users] Failing to Activate new OSD ceph-deploy

2017-01-10 Thread Scottix
I think I got it to work by removing setuser_match_path = /var/lib/ceph/$type/$cluster-$id from the host machine. I think I did do a reboot; it was a while ago, so I don't remember exactly. Then I ran ceph-deploy activate. --Scott On Tue, Jan 10, 2017 at 10:16 AM David Turner wrote: Did you ever f

Re: [ceph-users] Failing to Activate new OSD ceph-deploy

2017-01-10 Thread Scottix
My guess is that ceph-deploy doesn't know how to handle that setting. I just remove it on the host machine to add the disk, then put it back so the others will boot as root. --Scottie On Tue, Jan 10, 2017 at 11:02 AM David Turner wrote: > Removing the setuser_match_path resolved this. This seems like an >

Re: [ceph-users] Failing to Activate new OSD ceph-deploy

2017-01-10 Thread David Turner
Removing the setuser_match_path resolved this. This seems like an oversight: the setting that allows people to run osds as root prevents them from adding storage. David Turner | Cl
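(For context, the setting being toggled in this thread lives in ceph.conf; a sketch of the workaround described above.)
    [osd]
    # lets OSDs whose data dirs are still owned by root keep running as root
    setuser match path = /var/lib/ceph/$type/$cluster-$id
    # comment this out while adding new OSDs with ceph-deploy, then restore it afterwards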

Re: [ceph-users] PGs stuck active+remapped and osds lose data?!

2017-01-10 Thread Marcus Müller
Ok, thanks. Then I will change the tunables. As far as I can see, this would already help me: ceph osd crush tunables bobtail. Even if we run ceph hammer this would work according to the documentation, am I right? And: I'm using librados for our clients (hammer too); could this change create proble
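(A minimal sketch of the commands involved; note that changing tunables can trigger significant data movement.)
    ceph osd crush show-tunables     # inspect the profile currently in effect
    ceph osd crush tunables bobtail  # the change proposed above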

Re: [ceph-users] High CPU usage by ceph-mgr on idle Ceph cluster

2017-01-10 Thread Alfredo Deza
On Tue, Jan 10, 2017 at 12:59 PM, Samuel Just wrote: > Mm, maybe the tag didn't get pushed. Alfredo, is there supposed to be > a v11.1.1 tag? Yep. You can see there is one here: https://github.com/ceph/ceph/releases Specifically: https://github.com/ceph/ceph/releases/tag/v11.1.1 which points to
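(One way to verify that a given tag contains a particular fix, sketched against a local clone of the ceph repository.)
    git fetch --tags https://github.com/ceph/ceph.git
    git merge-base --is-ancestor 6c3d015c6854a12cda40673848813d968ff6afae v11.1.1 \
        && echo "fix included" || echo "fix missing"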

Re: [ceph-users] Analysing ceph performance with SSD journal, 10gbe NIC and 2 replicas -Hammer release

2017-01-10 Thread Lionel Bouton
Hi, On 10/01/2017 at 19:32, Brian Andrus wrote: > [...] > > > I think the main point I'm trying to address is - as long as the > backing OSD isn't egregiously handling large amounts of writes and it > has a good journal in front of it (that properly handles O_DSYNC [not > D_SYNC as Sebastien's a

Re: [ceph-users] PGs stuck active+remapped and osds lose data?!

2017-01-10 Thread Marcus Müller
Hi Sam, another idea: I have two HDDs here and already wanted to add them to ceph5, which would mean a new crush map anyway. Could this problem be solved by doing that? > On 10.01.2017 at 17:50, Samuel Just wrote: > > Shinobu isn't correct, you have 9/9 osds up and running. up does not > equal

Re: [ceph-users] Write back cache removal

2017-01-10 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stuart Harland Sent: 10 January 2017 11:58 To: Wido den Hollander Cc: ceph new ; n...@fisk.me.uk Subject: Re: [ceph-users] Write back cache removal Yes Wido, you are correct. There is a RBD pool in the cluster, but is no

Re: [ceph-users] Ceph cache tier removal.

2017-01-10 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Daznis > Sent: 09 January 2017 12:54 > To: ceph-users > Subject: [ceph-users] Ceph cache tier removal. > > Hello, > > > I'm running preliminary test on cache tier removal on a live cluster

[ceph-users] Review of Ceph on ZFS - or how not to deploy Ceph for RBD + OpenStack

2017-01-10 Thread Kevin Olbrich
Dear Ceph-users, just to make sure nobody makes the same mistake, I would like to share my experience with Ceph on ZFS in our test lab. ZFS is a copy-on-write filesystem and is IMHO suitable where data resilience has high priority. I work for a mid-sized datacenter in Germany, and we set up a cluster us

Re: [ceph-users] Review of Ceph on ZFS - or how not to deploy Ceph for RBD + OpenStack

2017-01-10 Thread Lindsay Mathieson
On 11/01/2017 7:21 AM, Kevin Olbrich wrote: > Read-Cache using normal Samsung PRO SSDs works very well. How did you implement the cache and measure the results? A ZFS SSD cache will perform very badly with VM hosting and/or distributed filesystems; the random nature of the I/O and the ARC cach

Re: [ceph-users] Analysing ceph performance with SSD journal, 10gbe NIC and 2 replicas -Hammer release

2017-01-10 Thread Willem Jan Withagen
On 10-1-2017 20:35, Lionel Bouton wrote: > Hi, I usually don't top-post, but this time it is just to agree wholeheartedly with what you wrote. And you have given yet more arguments as to why. Using SSDs that don't work right is a certain recipe for losing data. --WjW > On 10/01/2017 at 19:32, Brian

Re: [ceph-users] PGs stuck active+remapped and osds lose data?!

2017-01-10 Thread Shinobu Kinjo
Yeah, Sam is correct. I've not looked at the crushmap, but I should have noticed what the trouble is just by looking at `ceph osd tree`. That's my bad, sorry for that. Again, please refer to: http://www.anchor.com.au/blog/2013/02/pulling-apart-cephs-crush-algorithm/ Regards, On Wed, Jan 11, 2017 at 1:

Re: [ceph-users] Review of Ceph on ZFS - or how not to deploy Ceph for RBD + OpenStack

2017-01-10 Thread Patrick Donnelly
Hello Kevin, On Tue, Jan 10, 2017 at 4:21 PM, Kevin Olbrich wrote: > 5x Ceph node equipped with 32GB RAM, Intel i5, Intel DC P3700 NVMe journal, Is the "journal" used as a ZIL? > We experienced a lot of io blocks (X requests blocked > 32 sec) when a lot > of data is changed in cloned RBDs (disk

Re: [ceph-users] Review of Ceph on ZFS - or how not to deploy Ceph for RBD + OpenStack

2017-01-10 Thread Adrian Saul
I would concur, having spent a lot of time on ZFS on Solaris. ZIL will reduce the fragmentation problem a lot (because it is not doing intent logging into the filesystem itself, which fragments the block allocations) and write response will be a lot better. I would use different devices for L2AR
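(For reference, dedicated SLOG/ZIL and L2ARC devices are attached to an existing pool roughly like this; pool and device names are placeholders.)
    zpool add tank log /dev/nvme0n1p1   # separate intent-log (SLOG) device for synchronous writes
    zpool add tank cache /dev/sdy       # separate device for the L2ARC read cache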

Re: [ceph-users] Crushmap (tunables) flapping on cluster

2017-01-10 Thread Breunig, Steve (KASRL)
Yes, that was the problem, thx. From: Stillwell, Bryan J Sent: Tuesday, 10 January 2017 18:06:10 To: Breunig, Steve (KASRL); ceph-users@lists.ceph.com Subject: Re: [ceph-users] Crushmap (tunables) flapping on cluster On 1/10/17, 2:56 AM, "ceph-users on behalf

Re: [ceph-users] PGs stuck active+remapped and osds lose data?!

2017-01-10 Thread Marcus Müller
I have to thank you all. You give free support and this already helps me. I'm not someone who knows ceph that well, but every day it's getting better and better ;-) According to the article Brad posted, I have to change the ceph osd crush tunables. But there are two questions left, as I already wr

Re: [ceph-users] pg stuck in peering while power failure

2017-01-10 Thread Craig Chi
Hi Sam, Thank you for your precise inspection. I reviewed the log from that time, and I discovered that the cluster failed an OSD just after I shut the first unit down. Thus, as you said, the pg can't finish peering because the second unit was then shut off suddenly. I much appreciate your advice, but
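(As an aside, planned power-off tests are often preceded by setting cluster flags so the downed OSDs are not immediately marked out and rebalanced; a sketch.)
    ceph osd set noout    # before the test: keep down OSDs in the map, avoid backfill
    ceph osd unset noout  # after the units are powered back on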