Re: [ceph-users] Jewel -> Luminous upgrade, package install stopped all daemons

2017-09-18 Thread Brad Hubbard
Well OK now. Before we go setting off the fire alarms all over town, let's work out what is happening, and why. I spent some time reproducing this, and it is indeed tied to selinux being (at least) permissive. It does not happen when selinux is disabled. If we look at the journalctl output in the
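
For anyone checking their own nodes, a minimal sketch of confirming the SELinux mode around the package install (assumes a RHEL/CentOS-style host with the standard SELinux tools; osd.0 is a placeholder):

  getenforce                                # Enforcing, Permissive or Disabled
  sestatus                                  # fuller report: mode and loaded policy
  journalctl -u ceph-osd@0 --since "1 hour ago"   # look for restarts around the upgrade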

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-18 Thread Florian Haas
On Mon, Sep 18, 2017 at 8:48 AM, Christian Theune wrote: > Hi Josh, > >> On Sep 16, 2017, at 3:13 AM, Josh Durgin wrote: >> >> (Sorry for top posting, this email client isn't great at editing) > > Thanks for taking the time to respond. :) > >> The mitigation strategy I mentioned before of forcing

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-18 Thread Christian Theune
Hi, > On Sep 18, 2017, at 9:51 AM, Florian Haas wrote: > > For Josh's and others' benefit, I think you might want to share how > many nodes you operate, as that would be quite relevant to the > discussion. Sure. See the OSD tree at the end. We’re doing the typical SSD/non-SSD pool separation.

[ceph-users] Ceph 12.2.0 and replica count

2017-09-18 Thread Max Krasilnikov
Hello! In the times of Hammer it was necessary to have 3 replicas of data to avoid ending up with non-identical data on different OSDs. Now we have full data and metadata checksumming. So, is it still necessary to have 3 replicas? Does the checksumming free us from the requirement of 3 replicas? Thanks a l

Re: [ceph-users] Ceph 12.2.0 and replica count

2017-09-18 Thread Wido den Hollander
> On 18 September 2017 at 10:14, Max Krasilnikov wrote: > > > Hello! > > In the times of Hammer it was necessary to have 3 replicas of data to avoid ending up > with non-identical data on different OSDs. Now we have full data and > metadata checksumming. So, is it still necessary to have 3 replicas?
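
For reference, replica count is still a per-pool setting; a minimal sketch of inspecting and changing it (the pool name rbd and the values are placeholders, not a recommendation — checksums detect corruption, but the extra replicas are what allow repair and surviving a lost OSD):

  ceph osd pool get rbd size        # current replica count
  ceph osd pool get rbd min_size    # replicas that must be available before I/O is blocked
  ceph osd pool set rbd size 3      # example value only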

[ceph-users] Help change civetweb front port error: Permission denied

2017-09-18 Thread 谭林江
Hi, I created a gateway node and changed its port with rgw_frontends = "civetweb port=80". When I run it, it reports the error: 2017-09-18 04:25:16.967378 7f2dd72e08c0 0 ceph version 9.2.1 (752b6a3020c3de74e07d2a8b4c5e48dab5a6b6fd), process radosgw, pid 3151 2017-09-18 04:25:17.025703 7f2dd72e08c0 0 framew

[ceph-users] [RGW] SignatureDoesNotMatch using curl

2017-09-18 Thread junho_k...@tmax.co.kr
I’m trying to use Ceph Object Storage from the CLI. I used curl to make a request to the RGW the S3 way. When I use a Python library (boto), everything works fine, but when I try to make the same request using curl, I always get the error “SignatureDoesNotMatch”. I don’t know what goes wrong. Here i
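
For anyone comparing against boto, a minimal sketch of signing a GET by hand with AWS signature v2 and sending it with curl (ACCESS_KEY, SECRET_KEY, the host rgw.example.com and the bucket mybucket are placeholders; the Date header sent must match the signed one byte for byte):

  access=ACCESS_KEY
  secret=SECRET_KEY
  date=$(date -R -u)
  resource="/mybucket/"
  # StringToSign = VERB \n Content-MD5 \n Content-Type \n Date \n CanonicalizedResource
  string_to_sign="GET\n\n\n${date}\n${resource}"
  signature=$(printf "${string_to_sign}" | openssl sha1 -hmac "${secret}" -binary | base64)
  curl -v -H "Date: ${date}" \
       -H "Authorization: AWS ${access}:${signature}" \
       "http://rgw.example.com${resource}"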

Re: [ceph-users] Help change civetweb front port error: Permission denied

2017-09-18 Thread Marcus Haarmann
Ceph runs as a non-root user, so it is normally not permitted to listen on a port < 1024; this is not specific to Ceph. You could trick a listener on port 80 with a redirect via iptables, or you might proxy the connection through an apache/nginx instance. Marcus Haarma
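
A sketch of the iptables variant Marcus mentions, with civetweb listening on an unprivileged port and port 80 redirected to it (8080 is an example; run as root on the gateway host):

  # ceph.conf on the RGW node: rgw_frontends = "civetweb port=8080"
  iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8080
  # also cover connections made from the gateway host itself:
  iptables -t nat -A OUTPUT -o lo -p tcp --dport 80 -j REDIRECT --to-port 8080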

Re: [ceph-users] RBD: How many snapshots is too many?

2017-09-18 Thread Florian Haas
On 09/16/2017 01:36 AM, Gregory Farnum wrote: > On Mon, Sep 11, 2017 at 1:10 PM Florian Haas wrote: > > On Mon, Sep 11, 2017 at 8:27 PM, Mclean, Patrick wrote: > > > > On 2017-09-08 06:06 PM, Gregory Farnum wrote: >

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-18 Thread Christian Theune
Hi, > On Sep 18, 2017, at 10:06 AM, Christian Theune wrote: > > We’re doing the typical SSD/non-SSD pool separation. Currently we effectively > only use 2 pools: rbd.hdd and rbd.ssd. The ~4TB OSDs in the rbd.hdd pool are > “capacity endurance” SSDs (Micron S610DC). We have 10 machines at the m

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-18 Thread Christian Theune
Hi, and here’s another update which others might find quite interesting. Florian and I spent some time discussing the issue further, face to face. I had one switch that I brought up again (--osd-recovery-start-delay) which I looked at a few weeks ago but came to the conclusion that its rules are

Re: [ceph-users] RBD: How many snapshots is too many?

2017-09-18 Thread Piotr Dałek
On 17-09-16 01:36 AM, Gregory Farnum wrote: I got the chance to discuss this a bit with Patrick at the Open Source Summit Wednesday (good to see you!). So the idea in the previously-referenced CDM talk essentially involves changing the way we distribute snap deletion instructions from a "dele

[ceph-users] bluestore compression statistics

2017-09-18 Thread Peter Gervai
Hello, Is there any way to get compression stats of compressed bluestore storage? Thanks, Peter
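
One place such counters show up is the OSD admin socket perf dump; a sketch (osd.0 is a placeholder, and the exact counter names may vary between releases):

  ceph daemon osd.0 perf dump | grep -i compress
  # typically bluestore_compressed, bluestore_compressed_allocated and
  # bluestore_compressed_original under the "bluestore" section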

Re: [ceph-users] Collectd issues

2017-09-18 Thread Matthew Vernon
On 13/09/17 15:06, Marc Roos wrote: > > > Am I the only one having these JSON issues with collectd, did I do > something wrong in configuration/upgrade? I also see these, although my dashboard seems to mostly be working. I'd be interested in knowing what the problem is! > Sep 13 15:44:15 c01 c

Re: [ceph-users] Collectd issues

2017-09-18 Thread Matthew Vernon
On 18/09/17 16:37, Matthew Vernon wrote: > On 13/09/17 15:06, Marc Roos wrote: >> >> >> Am I the only one having these JSON issues with collectd, did I do >> something wrong in configuration/upgrade? > > I also see these, although my dashboard seems to mostly be working. I'd > be interested in kn

[ceph-users] CephFS Segfault 12.2.0

2017-09-18 Thread Derek Yarnell
We have a recent cluster upgraded from Jewel to Luminous. Today we had a segmentation fault that left the file system degraded. Systemd then decided to restart the daemon over and over with a different stack trace (it can be seen after the 10k events in the log file[0]). After trying to fail over to
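
For context, the sort of commands involved in checking MDS state and forcing a failover to a standby (a sketch; the rank and daemon name are placeholders):

  ceph fs status                  # which MDS is active, which are standby
  ceph mds fail 0                 # fail rank 0 so a standby can take over
  ceph daemon mds.<name> status   # query a specific daemon over its admin socket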

[ceph-users] Rbd resize, refresh rescan

2017-09-18 Thread Marc Roos
Is there something like this for scsi, to rescan the size of the rbd device and make it available? (while it is being used) echo 1 > /sys/class/scsi_device/2\:0\:0\:0/device/rescan

Re: [ceph-users] Rbd resize, refresh rescan

2017-09-18 Thread David Turner
I've never needed to do anything other than extend the partition and/or filesystem when I increased the size of an RBD. In particular, if I didn't partition the RBD I only needed to extend the filesystem. Which method are you using to map/mount the RBD? Is it through a Hypervisor or just mapped to a
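
For the kernel-mapped case, a minimal sketch of the whole sequence (pool/image names, the size and the filesystem type are examples):

  rbd resize mypool/myimage --size 6144   # grow the image to 6 GiB
  # the kernel rbd driver picks up the new size on its own
  # ("detected capacity change" appears in dmesg)
  xfs_growfs /mnt/myimage                 # XFS: grow the mounted filesystem
  resize2fs /dev/rbd0                     # or, for ext4 on an unpartitioned RBD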

Re: [ceph-users] CephFS Segfault 12.2.0

2017-09-18 Thread Patrick Donnelly
Hi Derek, On Mon, Sep 18, 2017 at 1:30 PM, Derek Yarnell wrote: > We have a recent cluster upgraded from Jewel to Luminous. Today we had > a segmentation fault that led to file system degraded. Systemd then > decided to restart the daemon over and over with a different stack trace > (can be see

Re: [ceph-users] Rbd resize, refresh rescan

2017-09-18 Thread Marc Roos
Yes, I think you are right. After I saw this in dmesg, I noticed with fdisk that the block device was updated: rbd21: detected capacity change from 5368709120 to 6442450944 Maybe this also works (I found something that referred to a /sys/class path, which I don’t have): echo 1 > /sys/devices/rbd/21/refre
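
To confirm what size the kernel now reports for the device (rbd21 matches the dmesg line above):

  blockdev --getsize64 /dev/rbd21   # size in bytes as seen by the kernel
  lsblk /dev/rbd21                  # human-readable view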

Re: [ceph-users] Rbd resize, refresh rescan

2017-09-18 Thread David Turner
Disk Management in Windows should very easily extend a partition to use the rest of the disk. You should just right click the partition and select "Extend Volume" and that's it. I did it in Windows 10 over the weekend for a laptop that had been set up weird. On Mon, Sep 18, 2017 at 4:49 PM Marc

Re: [ceph-users] Bluestore aio_nr?

2017-09-18 Thread Sage Weil
On Tue, 19 Sep 2017, Xiaoxi Chen wrote: > Hi, > I just hit an OSD that cannot start due to insufficient aio_nr. Each > OSD has a separate SSD partition as block.db. Can you paste the message you saw? I'm not sure which check you mean. > Further checking showed 6144 AIO contexts were re
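
The relevant limits are the kernel AIO sysctls; a sketch of checking current usage against the ceiling and raising it (the value is an example, not a recommendation):

  cat /proc/sys/fs/aio-nr           # AIO contexts currently allocated system-wide
  cat /proc/sys/fs/aio-max-nr       # the ceiling the OSD is running into
  sysctl -w fs.aio-max-nr=1048576   # raise it; persist via a drop-in such as /etc/sysctl.d/90-ceph-aio.conf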

Re: [ceph-users] Jewel -> Luminous upgrade, package install stopped all daemons

2017-09-18 Thread Brad Hubbard
On Sat, Sep 16, 2017 at 8:34 AM, David Turner wrote: > I don't understand a single use case where I would want updating my packages using > yum, apt, etc. to restart a ceph daemon. ESPECIALLY when there are so many > clusters out there with multiple types of daemons running on the same > server. > > My
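
On RPM-based systems the Ceph packaging consults a sysconfig switch before restarting daemons during an upgrade; a sketch of turning that off (path and variable name as shipped by some distributions — worth verifying against your own packages, and the selinux scriptlet discussed in this thread may restart daemons regardless of it):

  # /etc/sysconfig/ceph
  CEPH_AUTO_RESTART_ON_UPGRADE=no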