[ceph-users] Dumpling cluster can't resolve peering failures, ceph pg query blocks, auth failures in logs

2014-09-14 Thread Florian Haas
Hi everyone, [Keeping this on the -users list for now. Let me know if I should cross-post to -devel.] I've been asked to help out on a Dumpling cluster (a system "bequeathed" by one admin to the next, currently on 0.67.10, was originally installed with 0.67.5 and subsequently updated a few times)

Re: [ceph-users] what are these files for mon?

2014-09-16 Thread Florian Haas
Hi Greg, just picked up this one from the archive while researching a different issue and thought I'd follow up. On Tue, Aug 19, 2014 at 6:24 PM, Gregory Farnum wrote: > The sst files are files used by leveldb to store its data; you cannot > remove them. Are you running on a very small VM? How m

Re: [ceph-users] what are these files for mon?

2014-09-16 Thread Florian Haas
On Tue, Sep 16, 2014 at 6:15 PM, Joao Eduardo Luis wrote: > Forcing the monitor to compact on start and restarting the mon is the > current workaround for overgrown ssts. This happens on a regular basis with > some clusters and I've not been able to track down the source. It seems > that leveldb
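For reference, the workaround described there boils down to something like the following sketch (the mon ID "a" is a placeholder):

    # ceph.conf: have the monitor compact its leveldb store on every start
    [mon]
        mon compact on start = true

    # or trigger a one-off compaction on a running monitor
    ceph tell mon.a compact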

Re: [ceph-users] Dumpling cluster can't resolve peering failures, ceph pg query blocks, auth failures in logs

2014-09-17 Thread Florian Haas
Gregory Farnum wrote: > Not sure, but have you checked the clocks on their nodes? Extreme > clock drift often results in strange cephx errors. > -Greg > Software Engineer #42 @ http://inktank.com | http://ceph.com > > > On Sun, Sep 14, 2014 at 11:03 PM, Florian Haas wrote: >>
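A quick way to rule clock drift in or out, assuming NTP-managed nodes (hostnames are placeholders):

    # any skew Ceph itself has already detected shows up here
    ceph health detail | grep -i 'clock skew'
    # check NTP synchronization on each mon node
    for h in mon1 mon2 mon3; do ssh $h 'ntpq -p | head -5'; done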

Re: [ceph-users] monitor quorum

2014-09-17 Thread Florian Haas
On Wed, Sep 17, 2014 at 1:58 PM, James Eckersall wrote: > Hi, > > I have a ceph cluster running 0.80.1 on Ubuntu 14.04. I have 3 monitors and > 4 OSD nodes currently. > > Everything has been running great up until today where I've got an issue > with the monitors. > I moved mon03 to a different s

Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU

2014-09-17 Thread Florian Haas
Hi Craig, just dug this up in the list archives. On Fri, Mar 28, 2014 at 2:04 AM, Craig Lewis wrote: > In the interest of removing variables, I removed all snapshots on all pools, > then restarted all ceph daemons at the same time. This brought up osd.8 as > well. So just to summarize this: yo

Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU

2014-09-17 Thread Florian Haas
On Wed, Sep 17, 2014 at 5:24 PM, Dan Van Der Ster wrote: > Hi Florian, > >> On 17 Sep 2014, at 17:09, Florian Haas wrote: >> >> Hi Craig, >> >> just dug this up in the list archives. >> >> On Fri, Mar 28, 2014 at 2:04 AM, Craig Lewis >&g

Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU

2014-09-17 Thread Florian Haas
On Wed, Sep 17, 2014 at 5:42 PM, Dan Van Der Ster wrote: > From: Florian Haas > Sent: Sep 17, 2014 5:33 PM > To: Dan Van Der Ster > Cc: Craig Lewis ;ceph-users@lists.ceph.com > Subject: Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU > > On Wed, Sep 17, 2014 at 5:24

Re: [ceph-users] monitor quorum

2014-09-17 Thread Florian Haas
On Wed, Sep 17, 2014 at 5:21 PM, James Eckersall wrote: > Hi, > > Thanks for the advice. > > I feel pretty dumb as it does indeed look like a simple networking issue. > You know how you check things 5 times and miss the most obvious one... > > J No worries at all. :) Cheers, Florian

[ceph-users] Status of snapshots in CephFS

2014-09-19 Thread Florian Haas
Hello everyone, Just thought I'd circle back on some discussions I've had with people earlier in the year: Shortly before firefly, snapshot support for CephFS clients was effectively disabled by default at the MDS level, and can only be enabled after accepting a scary warning that your filesystem

Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU

2014-09-19 Thread Florian Haas
Hi Craig, On Fri, Sep 19, 2014 at 2:49 AM, Craig Lewis wrote: > No, removing the snapshots didn't solve my problem. I eventually traced > this problem to XFS deadlocks caused by > [osd] > "osd mkfs options xfs": "-l size=1024m -n size=64k -i size=2048 -s > size=4096" > > Changing to just "-s s

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Florian Haas
On Mon, Sep 22, 2014 at 10:21 AM, Christian Balzer wrote: > The linux scheduler usually is quite decent in keeping processes where the > action is, thus you see for example a clear preference of DRBD or KVM vnet > processes to be "near" or on the CPU(s) where the IRQs are. Since you're just menti

[ceph-users] Unexpectedly low number of concurrent backfills

2015-02-17 Thread Florian Haas
Hello everyone, I'm seeing some OSD behavior that I consider unexpected; perhaps someone can shed some insight. Ceph giant (0.87.0), osd max backfills and osd recovery max active both set to 1. Please take a moment to look at the following "ceph health detail" screen dump: HEALTH_WARN 14 pgs ba
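For context, the two throttles mentioned can be verified and adjusted at runtime roughly like this (osd.0 is a placeholder; "ceph daemon" has to run on the host that owns that OSD's admin socket):

    # show the values an OSD is actually running with
    ceph daemon osd.0 config show | egrep 'osd_max_backfills|osd_recovery_max_active'
    # inject new values into all OSDs without restarting them
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'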

Re: [ceph-users] Unexpectedly low number of concurrent backfills

2015-02-17 Thread Florian Haas
On Tue, Feb 17, 2015 at 11:19 PM, Gregory Farnum wrote: > On Tue, Feb 17, 2015 at 12:09 PM, Florian Haas wrote: >> Hello everyone, >> >> I'm seeing some OSD behavior that I consider unexpected; perhaps >> someone can shed some insight. >> >> Ce

Re: [ceph-users] Unexpectedly low number of concurrent backfills

2015-02-18 Thread Florian Haas
On Wed, Feb 18, 2015 at 6:56 AM, Gregory Farnum wrote: > On Tue, Feb 17, 2015 at 9:48 PM, Florian Haas wrote: >> On Tue, Feb 17, 2015 at 11:19 PM, Gregory Farnum wrote: >>> On Tue, Feb 17, 2015 at 12:09 PM, Florian Haas wrote: >>>> Hello everyone, >>>>

Re: [ceph-users] PG stuck degraded, undersized, unclean

2015-02-18 Thread Florian Haas
On Wed, Feb 18, 2015 at 7:53 PM, Brian Rak wrote: > We're running ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578), > and seeing this: > > HEALTH_WARN 1 pgs degraded; 1 pgs stuck degraded; 1 pgs stuck unclean; 1 pgs > stuck undersized; 1 pgs undersized > pg 4.2af is stuck unclean for 7

[ceph-users] ceph-osd pegging CPU on giant, no snapshots involved this time

2015-02-18 Thread Florian Haas
Hey everyone, I must confess I'm still not fully understanding this problem and don't exactly know where to start digging deeper, but perhaps other users have seen this and/or it rings a bell. System info: Ceph giant on CentOS 7; approx. 240 OSDs, 6 pools using 2 different rulesets where the prob

Re: [ceph-users] PG stuck degraded, undersized, unclean

2015-02-18 Thread Florian Haas
On Wed, Feb 18, 2015 at 9:09 PM, Brian Rak wrote: >> What does your crushmap look like (ceph osd getcrushmap -o >> /tmp/crushmap; crushtool -d /tmp/crushmap)? Does your placement logic >> prevent Ceph from selecting an OSD for the third replica? >> >> Cheers, >> Florian > > > I have 5 hosts, and i
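The usual decompile/edit/recompile cycle for the crushmap inspection mentioned above, with placeholder paths:

    ceph osd getcrushmap -o /tmp/crushmap            # dump the binary crushmap
    crushtool -d /tmp/crushmap -o /tmp/crushmap.txt  # decompile to text
    # after reviewing or editing the rules and chooseleaf steps:
    crushtool -c /tmp/crushmap.txt -o /tmp/crushmap.new
    ceph osd setcrushmap -i /tmp/crushmap.new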

Re: [ceph-users] ceph-osd pegging CPU on giant, no snapshots involved this time

2015-02-18 Thread Florian Haas
On Wed, Feb 18, 2015 at 9:32 PM, Mark Nelson wrote: > On 02/18/2015 02:19 PM, Florian Haas wrote: >> >> Hey everyone, >> >> I must confess I'm still not fully understanding this problem and >> don't exactly know where to start digging deeper, but perh

Re: [ceph-users] Privileges for read-only CephFS access?

2015-02-18 Thread Florian Haas
On Wed, Feb 18, 2015 at 10:28 PM, Oliver Schulz wrote: > Dear Ceph Experts, > > is it possible to define a Ceph user/key with privileges > that allow for read-only CephFS access but do not allow > write or other modifications to the Ceph cluster? Warning, read this to the end, don't blindly do as
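The naive approach, which the rest of that thread cautions against applying blindly, is a key with read-only caps along these lines (client and pool names are placeholders; whether "allow r" actually gives you a safe read-only client is exactly what the discussion is about):

    ceph auth get-or-create client.cephfs-ro \
        mon 'allow r' \
        mds 'allow r' \
        osd 'allow r pool=cephfs_data'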

Re: [ceph-users] Privileges for read-only CephFS access?

2015-02-18 Thread Florian Haas
On Wed, Feb 18, 2015 at 11:41 PM, Gregory Farnum wrote: > On Wed, Feb 18, 2015 at 1:58 PM, Florian Haas wrote: >> On Wed, Feb 18, 2015 at 10:28 PM, Oliver Schulz wrote: >>> Dear Ceph Experts, >>> >>> is it possible to define a Ceph user/key with privileges

Re: [ceph-users] ceph-osd pegging CPU on giant, no snapshots involved this time

2015-02-19 Thread Florian Haas
On Wed, Feb 18, 2015 at 10:27 PM, Florian Haas wrote: > On Wed, Feb 18, 2015 at 9:32 PM, Mark Nelson wrote: >> On 02/18/2015 02:19 PM, Florian Haas wrote: >>> >>> Hey everyone, >>> >>> I must confess I'm still not fully understanding this problem

Re: [ceph-users] Privileges for read-only CephFS access?

2015-02-19 Thread Florian Haas
On Thu, Feb 19, 2015 at 12:50 AM, Gregory Farnum wrote: > On Wed, Feb 18, 2015 at 3:30 PM, Florian Haas wrote: >> On Wed, Feb 18, 2015 at 11:41 PM, Gregory Farnum wrote: >>> On Wed, Feb 18, 2015 at 1:58 PM, Florian Haas wrote: >>>> On Wed, Feb 18, 2015 at

Re: [ceph-users] ceph-osd pegging CPU on giant, no snapshots involved this time

2015-02-23 Thread Florian Haas
On Wed, Feb 18, 2015 at 9:19 PM, Florian Haas wrote: > Hey everyone, > > I must confess I'm still not fully understanding this problem and > don't exactly know where to start digging deeper, but perhaps other > users have seen this and/or it rings a bell. > > Syst

[ceph-users] Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

2015-02-27 Thread Florian Haas
Hi everyone, I always have a bit of trouble wrapping my head around how libvirt seems to ignore ceph.conf options while qemu/kvm does not, so I thought I'd ask. Maybe Josh, Wido or someone else can clarify the following. http://ceph.com/docs/master/rbd/qemu-rbd/ says: "Important: If you set rbd_c

Re: [ceph-users] Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

2015-02-27 Thread Florian Haas
On 02/27/2015 01:56 PM, Alexandre DERUMIER wrote: > Hi, > > from qemu rbd.c > > if (flags & BDRV_O_NOCACHE) { > rados_conf_set(s->cluster, "rbd_cache", "false"); > } else { > rados_conf_set(s->cluster, "rbd_cache", "true"); > } > > and > block.c > > int bdrv_parse_ca
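In other words, with the qemu logic quoted above the effective rbd_cache value simply follows the cache mode that libvirt puts on the qemu command line, e.g. via the disk definition (a minimal sketch; pool and image names are placeholders):

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>  <!-- anything other than none/directsync means rbd_cache=true -->
      <source protocol='rbd' name='libvirt-pool/my-image'/>
      <target dev='vda' bus='virtio'/>
    </disk>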

Re: [ceph-users] multiple CephFS filesystems on the same pools

2015-02-27 Thread Florian Haas
On 02/27/2015 11:37 AM, Blair Bethwaite wrote: > Sorry if this is actually documented somewhere, It is. :) > but is it possible to > create and use multiple filesystems on the data and metadata > pools? I'm guessing yes, but requires multiple MDSs? Nope. Every fs needs one data and one meta

Re: [ceph-users] Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

2015-02-27 Thread Florian Haas
through")) { > /* this is the default */ > } else { > return -1; > } > > return 0; > } > > > So rbd_cache is > > disabled for cache=directsync|none > > and enabled for writet

[ceph-users] Heads up: libvirt produces unusable images from RBD pool on Ubuntu trusty

2015-04-22 Thread Florian Haas
Hi everyone, I don't think this has been posted to this list before, so just writing it up so it ends up in the archives. tl;dr: Using RBD storage pools with libvirt is currently broken on Ubuntu trusty (LTS), and any other platform using libvirt 1.2.2. In libvirt 1.2.2, the rbd_create3 function

Re: [ceph-users] Packages for Debian jessie, Ubuntu vivid etc

2015-04-22 Thread Florian Haas
On Wed, Apr 22, 2015 at 10:55 AM, Daniel Swarbrick wrote: > Hi all, > > With Debian jessie scheduled to be released in a few days on April 25, > many of us will be thinking of upgrading wheezy based systems to jessie. > The Ceph packages in upstream Debian repos are version 0.80.7 (i.e., > firefly

Re: [ceph-users] Heads up: libvirt produces unusable images from RBD pool on Ubuntu trusty

2015-04-22 Thread Florian Haas
On Wed, Apr 22, 2015 at 1:02 PM, Wido den Hollander wrote: > On 04/22/2015 12:07 PM, Florian Haas wrote: >> Hi everyone, >> >> I don't think this has been posted to this list before, so just >> writing it up so it ends up in the archives. >> >> tl

Re: [ceph-users] Heads up: libvirt produces unusable images from RBD pool on Ubuntu trusty

2015-04-22 Thread Florian Haas
On 04/22/2015 03:38 PM, Wido den Hollander wrote: > On 04/22/2015 03:20 PM, Florian Haas wrote: >> I'm not entirely sure, though, why virStorageBackendRBDCreateImage() >> enables striping unconditionally; could you explain the reasoning >> behind that? >> > &g

Re: [ceph-users] Status of snapshots in CephFS

2014-09-24 Thread Florian Haas
On Fri, Sep 19, 2014 at 5:25 PM, Sage Weil wrote: > On Fri, 19 Sep 2014, Florian Haas wrote: >> Hello everyone, >> >> Just thought I'd circle back on some discussions I've had with people >> earlier in the year: >> >> Shortly before firefly, snaps

Re: [ceph-users] Status of snapshots in CephFS

2014-09-30 Thread Florian Haas
On Fri, Sep 19, 2014 at 5:25 PM, Sage Weil wrote: > On Fri, 19 Sep 2014, Florian Haas wrote: >> Hello everyone, >> >> Just thought I'd circle back on some discussions I've had with people >> earlier in the year: >> >> Shortly before firefly, snaps

Re: [ceph-users] v0.86 released (Giant release candidate)

2014-10-10 Thread Florian Haas
Hi Sage, On Tue, Oct 7, 2014 at 9:20 PM, Sage Weil wrote: > This is a release candidate for Giant, which will hopefully be out in > another week or two (s v0.86). We did a feature freeze about a month ago > and since then have been doing only stabilization and bug fixing (and a > handful on low-

Re: [ceph-users] the state of cephfs in giant

2014-10-30 Thread Florian Haas
Hi Sage, sorry to be late to this thread; I just caught this one as I was reviewing the Giant release notes. A few questions below: On Mon, Oct 13, 2014 at 8:16 PM, Sage Weil wrote: > [...] > * ACLs: implemented, tested for kernel client. not implemented for > ceph-fuse. > [...] > * samba VFS

[ceph-users] RBD Cache Considered Harmful? (on all-SSD pools, at least)

2014-11-21 Thread Florian Haas
Hi everyone, been trying to get to the bottom of this for a few days; thought I'd take this to the list to see if someone had insight to share. Situation: Ceph 0.87 (Giant) cluster with approx. 250 OSDs. One set of OSD nodes with just spinners put into one CRUSH ruleset assigned to a "spinner" po
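A spinner/SSD split like the one described typically comes down to separate CRUSH roots and rules plus a per-pool rule assignment; a rough sketch with placeholder names, assuming a "spinners" root already exists in the CRUSH tree (on Giant the pool field is still called crush_ruleset):

    ceph osd crush rule create-simple spinner-rule spinners host  # rule selecting OSDs under the 'spinners' root
    ceph osd pool set spinner-pool crush_ruleset 1                # rule id 1 is an example; check 'ceph osd crush rule dump'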

[ceph-users] Revisiting MDS memory footprint

2014-11-28 Thread Florian Haas
Hi everyone, I'd like to come back to a discussion from 2012 (thread at http://marc.info/?l=ceph-devel&m=134808745719233) to estimate the expected MDS memory consumption from file metadata caching. I am certain the following is full of untested assumptions, some of which are probably inaccurate, s
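As a back-of-the-envelope illustration of the kind of estimate being discussed (the per-inode figure is an assumption, not a measured value):

    assume ~2 KB of MDS memory per cached inode/dentry   (assumption)
    mds cache size = 1,000,000 inodes                     (example value)
    => roughly 1,000,000 x 2 KB = ~2 GB of resident MDS memory for the cache alone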

Re: [ceph-users] Revisiting MDS memory footprint

2014-11-28 Thread Florian Haas
Just thought of one other thing; allow me to insert that below. On Fri, Nov 28, 2014 at 1:04 PM, Florian Haas wrote: > Hi everyone, > > I'd like to come back to a discussion from 2012 (thread at > http://marc.info/?l=ceph-devel&m=134808745719233) to estimate the > expecte

Re: [ceph-users] Revisiting MDS memory footprint

2014-11-28 Thread Florian Haas
On Fri, Nov 28, 2014 at 3:14 PM, Wido den Hollander wrote: > On 11/28/2014 01:04 PM, Florian Haas wrote: >> Hi everyone, >> >> I'd like to come back to a discussion from 2012 (thread at >> http://marc.info/?l=ceph-devel&m=134808745719233) to estimate the >

Re: [ceph-users] Revisiting MDS memory footprint

2014-11-28 Thread Florian Haas
On Fri, Nov 28, 2014 at 3:29 PM, Wido den Hollander wrote: > On 11/28/2014 03:22 PM, Florian Haas wrote: >> On Fri, Nov 28, 2014 at 3:14 PM, Wido den Hollander wrote: >>> On 11/28/2014 01:04 PM, Florian Haas wrote: >>>> Hi everyone, >>>> >>&g

Re: [ceph-users] Backfilling caused RBD corruption on Hammer?

2016-05-12 Thread Florian Haas
On Sun, May 8, 2016 at 11:57 PM, Robert Sander wrote: > On 29.04.2016 at 17:11, Robert Sander wrote: > >> As the backfilling with the full weight of the new OSDs would have run >> for more than 28h and no VM was usable we re-weighted the new OSDs to >> 0.1. The backfilling finished after about 2h

Re: [ceph-users] performance issue with jewel on ubuntu xenial (kernel)

2016-06-21 Thread Florian Haas
Hi Yoann, On Tue, Jun 21, 2016 at 3:11 PM, Yoann Moulin wrote: > Hello, > > I found a performance drop between kernel 3.13.0-88 (default kernel on Ubuntu > Trusty 14.04) and kernel 4.4.0.24.14 (default kernel on Ubuntu Xenial 16.04) > > ceph version is Jewel (10.2.2). > All tests have been done u

Re: [ceph-users] performance issue with jewel on ubuntu xenial (kernel)

2016-06-22 Thread Florian Haas
On Wed, Jun 22, 2016 at 10:56 AM, Yoann Moulin wrote: > Hello Florian, > >> On Tue, Jun 21, 2016 at 3:11 PM, Yoann Moulin wrote: >>> Hello, >>> >>> I found a performance drop between kernel 3.13.0-88 (default kernel on >>> Ubuntu >>> Trusty 14.04) and kernel 4.4.0.24.14 (default kernel on Ubuntu

Re: [ceph-users] performance issue with jewel on ubuntu xenial (kernel)

2016-06-24 Thread Florian Haas
On Thu, Jun 23, 2016 at 9:01 AM, Yoann Moulin wrote: > On 23/06/2016 at 08:25, Sarni Sofiane wrote: >> Hi Florian, >> > > >> On 23.06.16 06:25, "ceph-users on behalf of Florian Haas" >> wrote: >> >>> On Wed, Jun 22, 2016 at 10:56 AM,

[ceph-users] State of play for RDMA on Luminous

2017-08-23 Thread Florian Haas
Hello everyone, I'm trying to get a handle on the current state of the async messenger's RDMA transport in Luminous, and I've noticed that the information available is a little bit sparse (I've found https://community.mellanox.com/docs/DOC-2693 and https://community.mellanox.com/docs/DOC-2721, whi
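For anyone following along, the Mellanox write-ups referenced above essentially boil down to messenger settings of this shape (the device name is a placeholder and assumes an RDMA-capable NIC on all nodes):

    [global]
        ms_type = async+rdma
        ms_async_rdma_device_name = mlx5_0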

Re: [ceph-users] State of play for RDMA on Luminous

2017-08-28 Thread Florian Haas
On Mon, Aug 28, 2017 at 4:21 PM, Haomai Wang wrote: > On Wed, Aug 23, 2017 at 1:26 AM, Florian Haas wrote: >> Hello everyone, >> >> I'm trying to get a handle on the current state of the async messenger's >> RDMA transport in Luminous, and I've noti

Re: [ceph-users] State of play for RDMA on Luminous

2017-08-29 Thread Florian Haas
Sorry, I worded my questions poorly in the last email, so I'm asking for clarification here: On Mon, Aug 28, 2017 at 6:04 PM, Haomai Wang wrote: > On Mon, Aug 28, 2017 at 7:54 AM, Florian Haas wrote: >> On Mon, Aug 28, 2017 at 4:21 PM, Haomai Wang wrote: >>> On Wed,

[ceph-users] RBD: How many snapshots is too many?

2017-09-05 Thread Florian Haas
Hi everyone, with the Luminous release out the door and the Labor Day weekend over, I hope I can kick off a discussion on another issue that has irked me a bit for quite a while. There doesn't seem to be a good documented answer to this: what are Ceph's real limits when it comes to RBD snapshots?
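For context, these are the per-image snapshot operations whose scaling limits the thread is asking about (pool, image and snapshot names are placeholders):

    rbd snap create mypool/myimage@snap1   # take a snapshot
    rbd snap ls mypool/myimage             # list snapshots on one image
    rbd snap rm mypool/myimage@snap1       # remove a single snapshot
    rbd snap purge mypool/myimage          # remove all snapshots on the image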

Re: [ceph-users] RBD: How many snapshots is too many?

2017-09-06 Thread Florian Haas
Hi Greg, thanks for your insight! I do have a few follow-up questions. On 09/05/2017 11:39 PM, Gregory Farnum wrote: >> It seems to me that there still isn't a good recommendation along the >> lines of "try not to have more than X snapshots per RBD image" or "try >> not to have more than Y snapsh

Re: [ceph-users] RBD: How many snapshots is too many?

2017-09-08 Thread Florian Haas
> In our use case, we are severly hampered by the size of removed_snaps > (50k+) in the OSDMap to the point were ~80% of ALL cpu time is spent in > PGPool::update and its interval calculation code. We have a cluster of > around 100k RBDs with each RBD having upto 25 snapshots and only a small > por

Re: [ceph-users] RBD: How many snapshots is too many?

2017-09-11 Thread Florian Haas
On Mon, Sep 11, 2017 at 8:27 PM, Mclean, Patrick wrote: > > On 2017-09-08 06:06 PM, Gregory Farnum wrote: > > On Fri, Sep 8, 2017 at 5:47 PM, Mclean, Patrick > > wrote: > > > >> On a related note, we are very curious why the snapshot id is > >> incremented when a snapshot is deleted, this create

[ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-13 Thread Florian Haas
Hi everyone, disclaimer upfront: this was seen in the wild on Hammer, and on 0.94.7 no less. Reproducing this on 0.94.10 is a pending process, and we'll update here with findings, but my goal with this post is really to establish whether the behavior as seen is expected, and if so, what the ratio

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-14 Thread Florian Haas
On Thu, Sep 14, 2017 at 3:15 AM, Brad Hubbard wrote: > On Wed, Sep 13, 2017 at 8:40 PM, Florian Haas wrote: >> Hi everyone, >> >> >> disclaimer upfront: this was seen in the wild on Hammer, and on 0.94.7 >> no less. Reproducing this on 0.94.10 is a pending proce

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-14 Thread Florian Haas
On Thu, Sep 14, 2017 at 2:47 AM, Josh Durgin wrote: > On 09/13/2017 03:40 AM, Florian Haas wrote: >> >> So we have a client that is talking to OSD 30. OSD 30 was never down; >> OSD 17 was. OSD 30 is also the preferred primary for this PG (via >> primary affinity). The O

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-15 Thread Florian Haas
On Fri, Sep 15, 2017 at 8:58 AM, Josh Durgin wrote: >> OK, maybe the "also" can be removed to reduce potential confusion? > > > Sure That'd be great. :) >> - We have a bunch of objects that need to be recovered onto the >> just-returned OSD(s). >> - Clients access some of these objects while the

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-15 Thread Florian Haas
On Fri, Sep 15, 2017 at 10:37 PM, Josh Durgin wrote: >> So this affects just writes. Then I'm really not following the >> reasoning behind the current behavior. Why would you want to wait for >> the recovery of an object that you're about to clobber anyway? Naïvely >> thinking an object like that

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-18 Thread Florian Haas
On Mon, Sep 18, 2017 at 8:48 AM, Christian Theune wrote: > Hi Josh, > >> On Sep 16, 2017, at 3:13 AM, Josh Durgin wrote: >> >> (Sorry for top posting, this email client isn't great at editing) > > Thanks for taking the time to respond. :) > >> The mitigation strategy I mentioned before of forcing

Re: [ceph-users] RBD: How many snapshots is too many?

2017-09-18 Thread Florian Haas
On 09/16/2017 01:36 AM, Gregory Farnum wrote: > On Mon, Sep 11, 2017 at 1:10 PM Florian Haas <flor...@hastexo.com> wrote: > > On Mon, Sep 11, 2017 at 8:27 PM, Mclean, Patrick > <patrick.mcl...@sony.com> wrote: > > > > On 20

Re: [ceph-users] Clarification on sequence of recovery and client ops after OSDs rejoin cluster (also, slow requests)

2017-09-19 Thread Florian Haas
On Mon, Sep 18, 2017 at 2:02 PM, Christian Theune wrote: > Hi, > > and here’s another update which others might find quite interesting. > > Florian and I spend some time discussing the issue further, face to face. I > had one switch that I brought up again (—osd-recovery-start-delay) which I > l

Re: [ceph-users] RBD: How many snapshots is too many?

2017-09-21 Thread Florian Haas
On Thu, Sep 21, 2017 at 9:53 AM, Gregory Farnum wrote: >> > The other reason we maintain the full set of deleted snaps is to prevent >> > client operations from re-creating deleted snapshots — we filter all >> > client IO which includes snaps against the deleted_snaps set in the PG. >> > Apparentl

Re: [ceph-users] objects degraded higher than 100%

2017-10-12 Thread Florian Haas
On Mon, Sep 11, 2017 at 8:13 PM, Andreas Herrmann wrote: > Hi, > > how could this happen: > > pgs: 197528/1524 objects degraded (12961.155%) > > I did some heavy failover tests, but a value higher than 100% looks strange > (ceph version 12.2.0). Recovery is quite slow. > > cluster: >

Re: [ceph-users] objects degraded higher than 100%

2017-10-12 Thread Florian Haas
On Thu, Oct 12, 2017 at 7:22 PM, Gregory Farnum wrote: > > > On Thu, Oct 12, 2017 at 3:50 AM Florian Haas wrote: >> >> On Mon, Sep 11, 2017 at 8:13 PM, Andreas Herrmann >> wrote: >> > Hi, >> > >> > how could this happen: >>

Re: [ceph-users] objects degraded higher than 100%

2017-10-13 Thread Florian Haas
On Thu, Oct 12, 2017 at 7:56 PM, Gregory Farnum wrote: > > > On Thu, Oct 12, 2017 at 10:52 AM Florian Haas wrote: >> >> On Thu, Oct 12, 2017 at 7:22 PM, Gregory Farnum >> wrote: >> > >> > >> > On Thu, Oct 12, 2017 at 3:50 AM Florian Haas &

Re: [ceph-users] objects degraded higher than 100%

2017-10-13 Thread Florian Haas
>> > Okay, in that case I've no idea. What was the timeline for the recovery >> > versus the rados bench and cleanup versus the degraded object counts, >> > then? >> >> 1. Jewel deployment with filestore. >> 2. Upgrade to Luminous (including mgr deployment and "ceph osd >> require-osd-release lumin

Re: [ceph-users] Upgrading 2K OSDs from Hammer to Jewel. Our experience

2017-03-12 Thread Florian Haas
On Sat, Mar 11, 2017 at 12:21 PM, wrote: > The upgrade of our biggest cluster, nr 4, did not go without > problems. Since we where expecting a lot of "failed to encode map > e with expected crc" messages, we disabled clog to monitors > with 'ceph tell osd.* injectargs -- --clog_to_monitors=false'

Re: [ceph-users] osd_disk_thread_ioprio_priority help

2017-03-12 Thread Florian Haas
On Sat, Mar 11, 2017 at 4:24 PM, Laszlo Budai wrote: >>> Can someone explain the meaning of osd_disk_thread_ioprio_priority. I'm >>> [...] >>> >>> Now I am confused :( >>> >>> Can somebody bring some light here? >> >> >> Only to confuse you some more. If you are running Jewel or above then >>

Re: [ceph-users] osd_disk_thread_ioprio_priority help

2017-03-13 Thread Florian Haas
On Sun, Mar 12, 2017 at 9:07 PM, Laszlo Budai wrote: > Hi Florian, > > thank you for your answer. > > We have already set the IO scheduler to cfq in order to be able to lower the > priority of the scrub operations. > My problem is that I've found different values set for the same parameter, > and
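For reference, the combination being discussed only has an effect when the OSD data disks use the CFQ scheduler; a sketch with a placeholder device name:

    # switch the data disk to CFQ so per-thread I/O priorities are honored
    echo cfq > /sys/block/sdb/queue/scheduler
    # then deprioritize the OSD disk thread (scrubbing etc.); the numeric priority
    # only matters for the best-effort class, not for idle
    ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 7'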

Re: [ceph-users] osd_disk_thread_ioprio_priority help

2017-03-13 Thread Florian Haas
On Mon, Mar 13, 2017 at 11:00 AM, Dan van der Ster wrote: >> I'm sorry, I may have worded that in a manner that's easy to >> misunderstand. I generally *never* suggest that people use CFQ on >> reasonably decent I/O hardware, and thus have never come across any >> need to set this specific ceph.co

Re: [ceph-users] osd_disk_thread_ioprio_priority help

2017-03-15 Thread Florian Haas
On Wed, Mar 15, 2017 at 2:41 AM, Alex Gorbachev wrote: > On Mon, Mar 13, 2017 at 6:09 AM, Florian Haas wrote: >> On Mon, Mar 13, 2017 at 11:00 AM, Dan van der Ster >> wrote: >>>> I'm sorry, I may have worded that in a manner that's easy to >>>>

[ceph-users] Maintaining write performance under a steady intake of small objects

2017-04-24 Thread Florian Haas
Hi everyone, so this will be a long email — it's a summary of several off-list conversations I've had over the last couple of weeks, but the TL;DR version is this question: How can a Ceph cluster maintain near-constant performance characteristics while supporting a steady intake of a large number

Re: [ceph-users] radosgw bucket index sharding tips?

2015-12-16 Thread Florian Haas
Hi Ben & everyone, just following up on this one from July, as I don't think there's been a reply here then. On Wed, Jul 8, 2015 at 7:37 AM, Ben Hines wrote: > Anyone have any data on optimal # of shards for a radosgw bucket index? > > We've had issues with bucket index contention with a few mil
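For newly created buckets the shard count is driven by a single rgw option; a sketch (the section name is a placeholder, and the value 24 simply mirrors the figure mentioned later in this thread):

    [client.rgw.gateway1]
        rgw override bucket index max shards = 24   # only affects buckets created after the change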

Re: [ceph-users] radosgw bucket index sharding tips?

2015-12-17 Thread Florian Haas
Hey Wido, On Dec 17, 2015 09:52, "Wido den Hollander" wrote: > > On 12/17/2015 06:29 AM, Ben Hines wrote: > > > > > > On Wed, Dec 16, 2015 at 11:05 AM, Florian Haas > <flor...@hastexo.com> wrote: > > > > Hi Ben & eve

[ceph-users] Dealing with radosgw and large OSD LevelDBs: compact, start over, something else?

2015-12-17 Thread Florian Haas
Hey everyone, I recently got my hands on a cluster that has been underperforming in terms of radosgw throughput, averaging about 60 PUTs/s with 70K objects where a freshly-installed cluster with near-identical configuration would do about 250 PUTs/s. (Neither of these values are what I'd consider
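One of the knobs behind the "compact" option in the subject is having each OSD compact its leveldb store at startup, along these lines (whether this option is available depends on the Ceph version in use):

    [osd]
        leveldb compact on mount = true   # compaction runs on the next OSD restart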

Re: [ceph-users] Dealing with radosgw and large OSD LevelDBs: compact, start over, something else?

2015-12-21 Thread Florian Haas
On Thu, Dec 17, 2015 at 6:16 PM, Florian Haas wrote: > Hey everyone, > > I recently got my hands on a cluster that has been underperforming in > terms of radosgw throughput, averaging about 60 PUTs/s with 70K > objects where a freshly-installed cluster with near-identical > conf

Re: [ceph-users] radosgw bucket index sharding tips?

2015-12-21 Thread Florian Haas
On Mon, Dec 21, 2015 at 10:20 AM, Wido den Hollander wrote: >>> > Oh, and to answer this part. I didn't do that much experimentation >>> > unfortunately. I actually am using about 24 index shards per bucket >>> > currently and we delete each bucket once it hits about a million >>> > objects. (i

Re: [ceph-users] Infernalis MDS crash (debug log included)

2015-12-21 Thread Florian Haas
On Mon, Dec 21, 2015 at 10:15 AM, Florent B wrote: > Hi all, > > It seems I had an MDS crash being in standby-replay. > > Version is Infernalis, running on Debian Jessie (packaged version). > > Log is here (2.5MB) : http://paste.ubuntu.com/14126366/ > > Has someone information about it ? Hi Flore

Re: [ceph-users] radosgw bucket index sharding tips?

2015-12-21 Thread Florian Haas
On Mon, Dec 21, 2015 at 12:36 PM, Wido den Hollander wrote: > > > On 21-12-15 10:34, Florian Haas wrote: >> On Mon, Dec 21, 2015 at 10:20 AM, Wido den Hollander wrote: >>>>>> Oh, and to answer this part. I didn't do that much experimentation >>>&

Re: [ceph-users] Dealing with radosgw and large OSD LevelDBs: compact, start over, something else?

2015-12-21 Thread Florian Haas
On Mon, Dec 21, 2015 at 3:35 PM, Haomai Wang wrote: > > > On Fri, Dec 18, 2015 at 1:16 AM, Florian Haas wrote: >> >> Hey everyone, >> >> I recently got my hands on a cluster that has been underperforming in >> terms of radosgw throughput, averaging abou

Re: [ceph-users] Dealing with radosgw and large OSD LevelDBs: compact, start over, something else?

2015-12-21 Thread Florian Haas
On Mon, Dec 21, 2015 at 4:15 PM, Haomai Wang wrote: > > > On Mon, Dec 21, 2015 at 10:55 PM, Florian Haas wrote: >> >> On Mon, Dec 21, 2015 at 3:35 PM, Haomai Wang wrote: >> > >> > >> > On Fri, Dec 18, 2015 at 1:16 AM, Florian Haas >> >

Re: [ceph-users] Dealing with radosgw and large OSD LevelDBs: compact, start over, something else?

2015-12-21 Thread Florian Haas
On Mon, Dec 21, 2015 at 9:15 PM, Ben Hines wrote: > I'd be curious to compare benchmarks. What size objects are you putting? As stated earlier, I ran rest-bench with 70KB objects which is a good approximation of the average object size in the underperforming system. > 10gig end to end from clien

Re: [ceph-users] Dealing with radosgw and large OSD LevelDBs: compact, start over, something else?

2015-12-21 Thread Florian Haas
On Tue, Dec 22, 2015 at 3:10 AM, Haomai Wang wrote: >> >> >> Hey everyone, >> >> >> >> >> >> I recently got my hands on a cluster that has been underperforming >> >> >> in >> >> >> terms of radosgw throughput, averaging about 60 PUTs/s with 70K >> >> >> objects where a freshly-installed cluster wi

Re: [ceph-users] RGW pool contents

2015-12-23 Thread Florian Haas
On Tue, Nov 24, 2015 at 8:48 PM, Somnath Roy wrote: > Hi Yehuda/RGW experts, > > I have one cluster with RGW up and running in the customer site. > > I did some heavy performance testing on that with CosBench and as a result > written significant amount of data to showcase performance on that. > >

[ceph-users] How-to doc: hosting a static website on radosgw

2016-01-26 Thread Florian Haas
Hi everyone, we recently worked a bit on running a full static website just on radosgw (akin to http://docs.aws.amazon.com/AmazonS3/latest/dev/WebsiteHosting.html), and didn't find a good how-to writeup out there. So we did a bit of fiddling with radosgw and HAproxy, and wrote one: https://www.has

Re: [ceph-users] How-to doc: hosting a static website on radosgw

2016-01-26 Thread Florian Haas
On Tue, Jan 26, 2016 at 8:56 PM, Wido den Hollander wrote: > On 01/26/2016 08:29 PM, Florian Haas wrote: >> Hi everyone, >> >> we recently worked a bit on running a full static website just on >> radosgw (akin to >> http://docs.aws.amazon.com/AmazonS3/latest

Re: [ceph-users] How-to doc: hosting a static website on radosgw

2016-01-26 Thread Florian Haas
On Tue, Jan 26, 2016 at 11:46 PM, Yehuda Sadeh-Weinraub wrote: > On Tue, Jan 26, 2016 at 2:37 PM, Florian Haas wrote: >> On Tue, Jan 26, 2016 at 8:56 PM, Wido den Hollander wrote: >>> On 01/26/2016 08:29 PM, Florian Haas wrote: >>>> Hi everyone, >>>>

Re: [ceph-users] How-to doc: hosting a static website on radosgw

2016-01-26 Thread Florian Haas
On Wed, Jan 27, 2016 at 12:00 AM, Robin H. Johnson wrote: > On Tue, Jan 26, 2016 at 11:51:51PM +0100, Florian Haas wrote: >> Hey, slick. Thanks! Out of curiosity, does the wip branch correctly >> handle Accept-Encoding: gzip? > No, Accept-Encoding is NOT presently implemented

Re: [ceph-users] ceph pg query hangs for ever

2016-04-01 Thread Florian Haas
On Fri, Apr 1, 2016 at 2:48 PM, Wido den Hollander wrote: > Somehow the PG got corrupted on one of the OSDs and it kept crashing on a > single > object. Vaguely reminds me of the E2BIG from that one issue way-back-when in Dumpling (https://www.hastexo.com/resources/hints-and-kinks/fun-extended

Re: [ceph-users] Ceph InfiniBand Cluster - Jewel - Performance

2016-04-07 Thread Florian Haas
On Thu, Apr 7, 2016 at 10:09 PM, German Anders wrote: > also jewel does not supposed to get more 'performance', since it used > bluestore in order to store metadata. Or do I need to specify during install > to use bluestore? Do the words "enable experimental unrecoverable data corrupting features
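That phrase refers to the switch you would literally have to put into ceph.conf on Jewel to run bluestore OSDs at all, which is the point being made:

    [global]
        enable experimental unrecoverable data corrupting features = bluestore
    [osd]
        osd objectstore = bluestore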

Re: [ceph-users] cephfs Kernel panic

2016-04-12 Thread Florian Haas
On Tue, Apr 12, 2016 at 11:53 AM, Simon Ferber wrote: > Thank you! That's it. I have installed the Kernel from the Jessie > backport. Now the crashes are gone. > How often do these things happen? It would be a worst case scenario, if > a system update breaks a productive system. For what it's wor

[ceph-users] Suggestion: flag HEALTH_WARN state if monmap has 2 mons

2016-04-12 Thread Florian Haas
Hi everyone, I wonder what others think about the following suggestion: running an even number of mons almost never makes sense, and specifically two mons never does at all. Wouldn't it make sense to just flag a HEALTH_WARN state if the monmap contained an even number of mons, or maybe only if the

Re: [ceph-users] openstack swift multitenancy problems with ceph RGW

2018-11-19 Thread Florian Haas
On 18/11/2018 22:08, Dilip Renkila wrote: > Hi all, > > We are provisioning openstack swift api though ceph rgw (mimic). We have > problems when trying to create two containers in two projects of same > name. After scraping web, i came to know that i have to enable  > > * rgw_keystone_implicit_te
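The option in question is set in the rgw section of ceph.conf, e.g. (section name is a placeholder):

    [client.rgw.gateway1]
        rgw keystone implicit tenants = true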

[ceph-users] radosgw, Keystone integration, and the S3 API

2018-11-19 Thread Florian Haas
Hi everyone, I've recently started a documentation patch to better explain Swift compatibility and OpenStack integration for radosgw; a WIP PR is at https://github.com/ceph/ceph/pull/25056/. I have, however, run into an issue that I would really *like* to document, except I don't know whether what

Re: [ceph-users] radosgw, Keystone integration, and the S3 API

2018-11-22 Thread Florian Haas
On 19/11/2018 16:23, Florian Haas wrote: > Hi everyone, > > I've recently started a documentation patch to better explain Swift > compatibility and OpenStack integration for radosgw; a WIP PR is at > https://github.com/ceph/ceph/pull/25056/. I have, however, run into an

Re: [ceph-users] Slow rbd reads (fast writes) with luminous + bluestore

2018-11-28 Thread Florian Haas
On 14/08/2018 15:57, Emmanuel Lacour wrote: > On 13/08/2018 at 16:58, Jason Dillaman wrote: >> >> See [1] for ways to tweak the bluestore cache sizes. I believe that by >> default, bluestore will not cache any data but instead will only >> attempt to cache its key/value store and metadata. > > I

Re: [ceph-users] Slow rbd reads (fast writes) with luminous + bluestore

2018-11-28 Thread Florian Haas
On 28/11/2018 15:52, Mark Nelson wrote: >> Shifting over a discussion from IRC and taking the liberty to resurrect >> an old thread, as I just ran into the same (?) issue. I see >> *significantly* reduced performance on RBD reads, compared to writes >> with the same parameters. "rbd bench --io-type
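The read and write benchmarks referenced above, spelled out (pool/image names, sizes and thread counts are placeholders):

    rbd bench --io-type read  --io-size 4K --io-threads 16 --io-total 1G --io-pattern rand mypool/myimage
    rbd bench --io-type write --io-size 4K --io-threads 16 --io-total 1G --io-pattern rand mypool/myimage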

[ceph-users] RGW Swift metadata dropped when S3 bucket versioning enabled

2018-11-28 Thread Florian Haas
On 27/11/2018 20:28, Maxime Guyot wrote: > Hi, > > I'm running into an issue with the RadosGW Swift API when the S3 bucket > versioning is enabled. It looks like it silently drops any metadata sent > with the "X-Object-Meta-foo" header (see example below). > This is observed on a Luminous 12.2.8 c

[ceph-users] RGW Swift metadata dropped when S3 bucket versioning enabled

2018-11-30 Thread Florian Haas
On 28/11/2018 19:06, Maxime Guyot wrote: > Hi Florian, > > You assumed correctly, the "test" container (private) was created with > the "openstack container create test", then I am using the S3 API to > enable/disable object versioning on it. > I use the following Python snippet to enable/disable

Re: [ceph-users] Slow rbd reads (fast writes) with luminous + bluestore

2018-12-02 Thread Florian Haas
Hi Mark, just taking the liberty to follow up on this one, as I'd really like to get to the bottom of this. On 28/11/2018 16:53, Florian Haas wrote: > On 28/11/2018 15:52, Mark Nelson wrote: >> Option("bluestore_default_buffered_read", Option::TYPE_BOOL, &
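For anyone wanting to experiment with the option quoted above, it can be inspected and toggled like this (osd.0 is a placeholder; "ceph daemon" must run on that OSD's host, and a change may only fully apply after an OSD restart):

    ceph daemon osd.0 config get bluestore_default_buffered_read
    ceph tell osd.* injectargs '--bluestore_default_buffered_read=true'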

[ceph-users] Multi tenanted radosgw and existing Keystone users/tenants

2018-12-05 Thread Florian Haas
Hi Mark, On 04/12/2018 04:41, Mark Kirkwood wrote: > Hi, > > I've set up a Luminous RGW with Keystone integration, and subsequently set > > rgw keystone implicit tenants = true > > So now all newly created users/tenants (or old ones that never accessed > RGW) get their own namespaces. However t
