Re: [ceph-users] is rgw crypt default encryption key long term supported ?

2019-06-14 Thread Scheurer François
Dear Casey, Thank you for the update and bug report. Best Regards, Francois Scheurer From: Casey Bodley Sent: Tuesday, June 11, 2019 4:53 PM To: Scheurer François; ceph-users@lists.ceph.com Subject: Re: is rgw crypt default encryption key long term supporte
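For reference, the option under discussion is set per RGW instance in ceph.conf and is meant as a testing-only stand-in for a real KMS. A minimal sketch, assuming an instance named client.rgw.gateway1 (the key is a placeholder, e.g. generated with `openssl rand -base64 32`):

    [client.rgw.gateway1]
    # testing only: one static 256-bit key used for every object
    rgw crypt default encryption key = <base64-encoded 256-bit key>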

[ceph-users] problem with degraded PG

2019-06-14 Thread Luk
Hello, Maybe someone has already fought with this kind of stuck state in ceph. This is a production cluster and I can't/don't want to make wrong steps, so please advise what to do. After replacing one failed disk (it was osd-7) on our cluster, ceph didn't recover to HEALTH_OK; it stopped in state:
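A minimal sketch of the usual first-look commands for a cluster stuck short of HEALTH_OK (the pg id is a placeholder):

    ceph health detail            # which PGs are degraded/undersized and why
    ceph pg dump_stuck unclean    # PGs stuck in a non-clean state
    ceph osd df tree              # per-OSD utilisation and CRUSH weights
    ceph pg 1.2f query            # up/acting sets and recovery state of one PG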

Re: [ceph-users] problem with degraded PG

2019-06-14 Thread Dan van der Ster
Hi, This looks like a tunables issue. What is the output of `ceph osd crush show-tunables ` -- Dan On Fri, Jun 14, 2019 at 11:19 AM Luk wrote: > > Hello, > > Maybe somone was fighting with this kind of stuck in ceph already. > This is production cluster, can't/don't want to make wrong s

Re: [ceph-users] problem with degraded PG

2019-06-14 Thread Luk
Hi, here is the output: ceph osd crush show-tunables { "choose_local_tries": 0, "choose_local_fallback_tries": 0, "choose_total_tries": 100, "chooseleaf_descend_once": 1, "chooseleaf_vary_r": 1, "chooseleaf_stable": 0, "straw_calc_version": 1, "allowed_bucket_algs"

Re: [ceph-users] problem with degraded PG

2019-06-14 Thread Dan van der Ster
Ahh I was thinking of chooseleaf_vary_r, which you already have. So probably not related to tunables. What is your `ceph osd tree` ? By the way, 12.2.9 has an unrelated bug (details http://tracker.ceph.com/issues/36686) AFAIU you will just need to update to v12.2.11 or v12.2.12 for that fix. -- D

Re: [ceph-users] problem with degraded PG

2019-06-14 Thread Luk
Here is ceph osd tree, in first post there is also ceph osd df tree: https://pastebin.com/Vs75gpwZ > Ahh I was thinking of chooseleaf_vary_r, which you already have. > So probably not related to tunables. What is your `ceph osd tree` ? > By the way, 12.2.9 has an unrelated bug (details > http:

Re: [ceph-users] problem with degraded PG

2019-06-14 Thread Caspar Smit
Hi, Something seems off in the weights and sizes of hosts ssdstor-a01, ssdstor-b01 and ssdstor-c01. ssdstor-c01 has a weight of 0.4 while its size is ~10TiB, so I would expect the weight to be around 10 instead of 0.4. The same goes for the other two nodes I mentioned above. Could you explain this? Kind re
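If the host really holds ~10 TiB, its CRUSH weight would normally be roughly the sum of its OSD weights (by convention about 1.0 per TiB). A sketch of how one might check and correct a single mis-weighted OSD - the host name is from the thread, the OSD id and weight are assumptions:

    ceph osd tree | grep -A 12 ssdstor-c01   # compare item weights against disk sizes
    ceph osd crush reweight osd.110 0.873    # hypothetical: ~0.873 for a ~894 GiB SSD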

Re: [ceph-users] problem with degraded PG

2019-06-14 Thread Luk
Hi, this is another case - these three storage nodes were added a few weeks ago to the ssd pool. I don't understand this kind of behavior either and wrote about it on ceph-devel a few weeks ago (didn't get an answer) :/. https://www.spinics.net/lists/ceph-

[ceph-users] radosgw multisite replication segfaults on init in 13.2.6

2019-06-14 Thread Płaza Tomasz
Hi, We have a standalone ceph cluster v13.2.6 and wanted to replicate it to another DC. After going through "Migrating a Single Site System to Multi-Site" and "Configure a Secondary Zone" from http://docs.ceph.com/docs/master/radosgw/multisite/, we set all buckets to "disable replicat

[ceph-users] Erasure Coding - FPGA / Hardware Acceleration

2019-06-14 Thread Sean Redmond
Hi Ceph-Users, I noticed that Soft Iron now have hardware acceleration for Erasure Coding[1]; this is interesting as the CPU overhead can be a problem, in addition to the extra disk I/O required for EC pools. Does anyone know if any other work is ongoing to support generic FPGA Hardware Acceleratio

Re: [ceph-users] problem with degraded PG

2019-06-14 Thread Luk
Hello, All kudos go to friends from Wroclaw, PL :) It was as simple as a typo... An osd was added two times to the crushmap due to this (these commands were run over a week ago - we didn't have the problem then, it showed up after replacing another osd - osd-7): ceph osd crush add osd.112 0.00 roo
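A sketch of how a duplicate crush entry like that can be spotted and pinned back to the intended location (osd.112 is the id from the thread, the host name and weight are assumptions):

    ceph osd tree | grep -w osd.112                                 # should appear exactly once
    ceph osd crush set osd.112 0.873 root=default host=ssdstor-c01  # set weight and location explicitly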

Re: [ceph-users] Erasure Coding - FPGA / Hardware Acceleration

2019-06-14 Thread Janne Johansson
Den fre 14 juni 2019 kl 13:58 skrev Sean Redmond : > Hi Ceph-Uers, > I noticed that Soft Iron now have hardware acceleration for Erasure > Coding[1], this is interesting as the CPU overhead can be a problem in > addition to the extra disk I/O required for EC pools. > Does anyone know if any other

Re: [ceph-users] Erasure Coding - FPGA / Hardware Acceleration

2019-06-14 Thread Brett Niver
Also, the picture I saw at Cephalocon - which could have been inaccurate - looked to me as if it multiplied the data path. On Fri, Jun 14, 2019 at 8:27 AM Janne Johansson wrote: > > Den fre 14 juni 2019 kl 13:58 skrev Sean Redmond : >> >> Hi Ceph-Uers, >> I noticed that Soft Iron now have hardware

Re: [ceph-users] Erasure Coding - FPGA / Hardware Acceleration

2019-06-14 Thread David Byte
I can't speak to the SoftIron solution, but I have done some testing on an all-SSD environment comparing latency, CPU, etc between using the Intel ISA plugin and using Jerasure. Very little difference is seen in CPU and capability in my tests, so I am not sure of the benefit. David Byte Sr. Te
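For anyone who wants to repeat that comparison, a sketch of two otherwise identical EC profiles differing only in the plugin (k/m values, pool name and PG count are arbitrary examples; note the ISA plugin targets Intel CPUs, so it would not help an ARM64 platform):

    ceph osd erasure-code-profile set ec42-jerasure plugin=jerasure k=4 m=2 crush-failure-domain=host
    ceph osd erasure-code-profile set ec42-isa plugin=isa k=4 m=2 crush-failure-domain=host
    ceph osd pool create ecpool-isa 128 128 erasure ec42-isa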

Re: [ceph-users] Erasure Coding - FPGA / Hardware Acceleration

2019-06-14 Thread Sean Redmond
Hi James, Thanks for your comments. I think the CPU burn is more of a concern to Soft Iron here, as they are using low-power ARM64 CPUs to keep the power draw low, compared to using Intel CPUs where, like you say, the problem may be less of a concern. Using less power by using ARM64 and providing E

Re: [ceph-users] Erasure Coding - FPGA / Hardware Acceleration

2019-06-14 Thread Janne Johansson
Den fre 14 juni 2019 kl 15:47 skrev Sean Redmond : > Hi James, > Thanks for your comments. > I think the CPU burn is more of a concern to soft iron here as they are > using low power ARM64 CPU's to keep the power draw low compared to using > Intel CPU's where like you say the problem maybe less of

[ceph-users] RGW 405 Method Not Allowed on CreateBucket

2019-06-14 Thread Drew Weaver
Hello, I am using the latest AWS PHP SDK to create a bucket. Every time I attempt to do this in the log I see: 2019-06-14 11:42:53.092 7fdff5459700 1 civetweb: 0x55c5450249d8: redacted - - [14/Jun/2019:11:42:53 -0400] "PUT / HTTP/1.1" 405 405 - aws-sdk-php/3.100.3 GuzzleHttp/6.3.3 curl/7.29.0

[ceph-users] scrub start hour = heavy load

2019-06-14 Thread Joe Comeau
I wonder if anyone has dealt with deep-scrubbing being really heavy when it kicks off at the defined start time? I currently have a script that runs deep-scrub every 10 minutes on the oldest un-deep-scrubbed pg. This script runs 24/7, regardless of when deep scrub is scheduled. My cep
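One stock way to spread that load, if a script like the above is not enough, would be to narrow the scrub window and throttle it in ceph.conf; a sketch with example values (these only affect scheduled scrubs, not manually issued ones):

    [osd]
    osd scrub begin hour = 22       # only start scheduled scrubs between 22:00 ...
    osd scrub end hour = 6          # ... and 06:00
    osd scrub load threshold = 0.5  # skip scheduled scrubs when host load is high
    osd scrub sleep = 0.1           # sleep between scrub chunks to throttle I/O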

[ceph-users] out of date python-rtslib repo on https://shaman.ceph.com/

2019-06-14 Thread Matthias Leopold
Hi, to the people running https://shaman.ceph.com/: please update the repo for python-rtslib so that recent ceph-iscsi packages, which need python-rtslib >= 2.1.fb68, can be installed. The shaman python-rtslib version is 2.1.fb67; the upstream python-rtslib version is 2.1.fb69. Thanks + thanks for running this

[ceph-users] Nautilus HEALTH_WARN for msgr2 protocol

2019-06-14 Thread Bob Farrell
Hi. Firstly thanks to all involved in this great mailing list, I learn lots from it every day. We are running Ceph with a huge amount of success to store website themes/templates across a large collection of websites. We are very pleased with the solution in every way. The only issue we have, whi

Re: [ceph-users] Nautilus HEALTH_WARN for msgr2 protocol

2019-06-14 Thread Brett Chancellor
If you don't figure out how to enable it on your monitor, you can always disable it to squash the warnings *ceph config set mon.node01 ms_bind_msgr2 false* On Fri, Jun 14, 2019 at 12:11 PM Bob Farrell wrote: > Hi. Firstly thanks to all involved in this great mailing list, I learn > lots from it

[ceph-users] Weird behaviour of ceph-deploy

2019-06-14 Thread CUZA Frédéric
Hi everyone, I am facing some strange behaviour from ceph-deploy. I try to add a new node to our cluster: ceph-deploy install --no-adjust-repos sd0051 Everything seems to work fine, but the new bucket (host) is not created in the crushmap, and when I try to add a new osd to that host, the osd is cre

Re: [ceph-users] Nautilus HEALTH_WARN for msgr2 protocol

2019-06-14 Thread Yury Shevchuk
This procedure worked for us: http://docs.ceph.com/docs/master/releases/nautilus/#v14-2-0-nautilus 13. To enable the new v2 network protocol, issue the following command: ceph mon enable-msgr2 This will instruct all monitors that bind to the old default port 6789 for the legacy v1 prot
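A sketch of enabling and then verifying msgr2 on a Nautilus cluster:

    ceph mon enable-msgr2   # mons start binding to the new v2 port 3300 as well
    ceph mon dump           # each mon should now list both v2:...:3300 and v1:...:6789
    ceph -s                 # the msgr2 health warning should clear once all mons report v2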

Re: [ceph-users] Nautilus HEALTH_WARN for msgr2 protocol

2019-06-14 Thread Paul Emmerich
Yeah, msgr2 still causes some problems sometimes. Try to (re-)run "ceph mon enable-msgr2" and if that doesn't help, I'd just delete and re-create the mon (usually way easier than playing around with the mon map manually). Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us

Re: [ceph-users] Nautilus HEALTH_WARN for msgr2 protocol

2019-06-14 Thread Paul Emmerich
On Fri, Jun 14, 2019 at 6:23 PM Brett Chancellor wrote: > If you don't figure out how to enable it on your monitor, you can always > disable it to squash the warnings > *ceph config set mon.node01 ms_bind_msgr2 false* > No, that just disables msgr2 on that mon. Use this option if you want to di

Re: [ceph-users] mutable health warnings

2019-06-14 Thread Sage Weil
On Thu, 13 Jun 2019, Neha Ojha wrote: > Hi everyone, > > There has been some interest in a feature that helps users to mute > health warnings. There is a trello card[1] associated with it and > we've had some discussion[2] in the past in a CDM about it. In > general, we want to understand a few th

Re: [ceph-users] RGW 405 Method Not Allowed on CreateBucket

2019-06-14 Thread Casey Bodley
Hi Drew, Judging by the "PUT /" in the request line, this request is using the virtual hosted bucket format [1]. This means the bucket name is part of the dns name and Host header, rather than in the path of the http request. Making this work in radosgw takes a little extra configuration [2].
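A sketch of the extra configuration, assuming the gateways answer on s3.example.com (wildcard DNS for *.s3.example.com must also point at them, and the instance name is an example):

    [client.rgw.gateway1]
    rgw dns name = s3.example.com

Alternatively the client can keep path-style requests; in the AWS PHP SDK that is the S3Client option use_path_style_endpoint => true, if I remember the option name correctly.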

Re: [ceph-users] RGW Multisite Q's

2019-06-14 Thread Casey Bodley
On 6/12/19 11:49 AM, Peter Eisch wrote: Hi, Could someone be able to point me to a blog or documentation page which helps me resolve the issues noted below? All nodes are Luminous, 12.2.12; one realm, one zonegroup (clustered haproxies fronting), two zones (three rgw in each); All endpoint re

Re: [ceph-users] Nautilus HEALTH_WARN for msgr2 protocol

2019-06-14 Thread DHilsbos
Bob; Have you verified that port 3300 is open for TCP on that host? The extra host firewall rules for v2 protocol caused me all kinds of grief when I was setting up my MONs. Thank you, Dominic L. Hilsbos, MBA Director – Information Technology Perform Air International Inc. dhils...@performai
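For reference, a sketch of opening both monitor ports (assuming firewalld is in use) and checking what the mon actually listens on:

    firewall-cmd --permanent --add-port=3300/tcp   # msgr2
    firewall-cmd --permanent --add-port=6789/tcp   # legacy msgr1
    firewall-cmd --reload
    ss -tlnp | grep ceph-mon                       # mon should be bound to both ports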

Re: [ceph-users] Weird behaviour of ceph-deploy

2019-06-14 Thread CUZA Frédéric
Little update: I checked one osd I've installed even though the host isn't present in the crushmap (or in the cluster, I guess), and I found this: monclient: wait_auth_rotating timed out after 30 osd.xxx 0 unable to obtain rotating service keys; retrying I also added the host to the admin hosts: ceph-
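wait_auth_rotating timeouts usually point at clock skew or a key mismatch between the OSD and the monitors. A sketch of what one might check (the osd id and standard data path are assumptions):

    ceph time-sync-status                        # skew between mons breaks rotating service keys
    ceph auth get osd.112                        # the key the cluster expects ...
    cat /var/lib/ceph/osd/ceph-112/keyring       # ... versus the key the OSD actually has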

[ceph-users] Monitor stuck at "probing"

2019-06-14 Thread ☣Adam
I have a monitor which I just can't seem to get to join the quorum, even after injecting a monmap from one of the other servers.[1] I use NTP on all servers and have also manually verified that the clocks are synchronized. My monitors are named ceph0, ceph2, xe, and tc. I'm transitioning away from the ce
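A sketch of the usual checks for a mon stuck probing, hypothetically assuming "xe" is the stuck one and the other three still have quorum:

    ceph daemon mon.xe mon_status       # run on the stuck host; shows state, peers and the monmap it holds
    ceph mon getmap -o /tmp/monmap      # current map from the quorum side
    monmaptool --print /tmp/monmap      # all four mons should be listed with reachable addresses
    systemctl stop ceph-mon@xe
    ceph-mon -i xe --inject-monmap /tmp/monmap
    systemctl start ceph-mon@xe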