Fedotov wrote:
>
> I'd like to see more details (preferably backed with logs) on this...
>
> On 6/20/2019 6:23 PM, Dan van der Ster wrote:
> > P.S. I know this has been discussed before, but the
> > compression_(mode|algorithm) pool options [1] seem completely broken
http://tracker.ceph.com/issues/40480
On Thu, Jun 20, 2019 at 9:12 PM Dan van der Ster wrote:
>
> I will try to reproduce with logs and create a tracker once I find the
> smoking gun...
>
> It's very strange -- I had the osd mode set to 'passive', and pool
> optio
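For illustration, a minimal sketch of the per-pool and per-OSD compression
settings being discussed (the pool name "mypool" is a placeholder):

  # per-pool compression options
  ceph osd pool set mypool compression_mode aggressive
  ceph osd pool set mypool compression_algorithm snappy
  # cluster-wide OSD default (the 'passive' mode mentioned above)
  ceph config set osd bluestore_compression_mode passive
  # confirm what the pool currently has set
  ceph osd pool get mypool compression_mode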
Hi Lars,
Is there a specific bench result you're concerned about?
I would think that small write perf could be kept reasonable thanks to
bluestore's deferred writes.
FWIW, our bench results (all flash cluster) didn't show a massive
performance difference between 3 replica and 4+2 EC.
I agree abou
On Mon, Jul 8, 2019 at 1:02 PM Lars Marowsky-Bree wrote:
>
> On 2019-07-08T12:25:30, Dan van der Ster wrote:
>
> > Is there a specific bench result you're concerned about?
>
> We're seeing ~5800 IOPS, ~23 MiB/s on 4 KiB IO (stripe_width 8192) on a
> pool that
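For context, a minimal sketch of how a 4+2 EC pool and a small-block benchmark
like the one quoted above might be set up (profile and pool names are placeholders):

  ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host
  ceph osd pool create ec42pool 128 128 erasure ec-4-2
  # needed if RBD or CephFS will do small overwrites on the EC pool
  ceph osd pool set ec42pool allow_ec_overwrites true
  # 4 KiB writes, 16 concurrent ops, as in the numbers quoted above
  rados bench -p ec42pool 60 write -b 4096 -t 16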
Hi all,
In September we'll need to power down a CephFS cluster (currently
mimic) for a several-hour electrical intervention.
Having never done this before, I thought I'd check with the list.
Here's our planned procedure:
1. umount /cephfs from all HPC clients.
2. ceph osd set noout
3. wait unti
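For reference, the flag-setting part of such a power-down usually looks
something like the sketch below (exact steps depend on the cluster; the flags
are unset in reverse once everything is back up):

  # before shutting down
  ceph osd set noout
  ceph osd set norebalance
  ceph osd set nobackfill
  ceph osd set norecover
  # after power-up, once all OSDs have rejoined
  ceph osd unset norecover
  ceph osd unset nobackfill
  ceph osd unset norebalance
  ceph osd unset noout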
Hi all,
Last night we had 60 ERRs like this:
2019-07-26 00:56:44.479240 7efc6cca1700 0 mds.2.cache.dir(0x617)
_fetched badness: got (but i already had) [inode 0x10006289992
[...2,head] ~mds2/stray1/10006289992 auth v14438219972 dirtyparent
s=116637332 nl=8 n(v0 rc2019-07-26 00:56:17.199090 b116
On Mon, Jul 29, 2019 at 2:52 PM Yan, Zheng wrote:
>
> On Fri, Jul 26, 2019 at 4:45 PM Dan van der Ster wrote:
> >
> > Hi all,
> >
> > Last night we had 60 ERRs like this:
> >
> > 2019-07-26 00:56:44.479240 7efc6cca1700 0 mds.2.cache.dir(0x617)
>
On Mon, Jul 29, 2019 at 3:47 PM Yan, Zheng wrote:
>
> On Mon, Jul 29, 2019 at 9:13 PM Dan van der Ster wrote:
> >
> > On Mon, Jul 29, 2019 at 2:52 PM Yan, Zheng wrote:
> > >
> > > On Fri, Jul 26, 2019 at 4:45 PM Dan van der Ster
> > > wrote:
>
Hi,
Which version of ceph are you using? Which balancer mode?
The balancer score isn't a percent-error or anything humanly usable.
`ceph osd df tree` can better show you exactly which osds are
over/under utilized and by how much.
You might be able to manually fix things by using `ceph osd reweight`
Thanks. The version and balancer config look good.
So you can try `ceph osd reweight osd.10 0.8` to see if it helps to
get you out of this.
-- dan
On Mon, Aug 26, 2019 at 11:35 AM Simon Oosthoek
wrote:
>
> On 26-08-19 11:16, Dan van der Ster wrote:
> > Hi,
> >
> > Which
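For illustration, the inspection and manual-reweight steps referred to above,
with osd.10 and the 0.8 override weight taken from the reply (examples, not
recommendations):

  # show per-OSD utilisation and spot over/under-filled OSDs
  ceph osd df tree
  # temporarily lower the override weight (0.0-1.0) of an over-full OSD
  ceph osd reweight osd.10 0.8
  # check the balancer afterwards
  ceph balancer status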
You were running v14.2.2 before?
It seems that the ceph_assert you're hitting was indeed added
between v14.2.2 and v14.2.3 in this commit:
https://github.com/ceph/ceph/commit/12f8b813b0118b13e0cdac15b19ba8a7e127730b
There's a comment in the tracker for that commit which says the
original fix wa
Hi all,
We're midway through an update from 13.2.6 to 13.2.7 and started
getting OSDs crashing regularly like this [1].
Does anyone obviously know what the issue is? (Maybe
https://github.com/ceph/ceph/pull/26448/files ?)
Or is it some temporary problem while we still have v13.2.6 and
v13.2.7 osds
I created https://tracker.ceph.com/issues/43106 and we're downgrading
our osds back to 13.2.6.
-- dan
On Tue, Dec 3, 2019 at 4:09 PM Dan van der Ster wrote:
>
> Hi all,
>
> We're midway through an update from 13.2.6 to 13.2.7 and started
> getting OSDs crashing regula
e only some
> specific unsafe scenarios?
>
> Best regards,
>
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
>
> From: ceph-users on behalf of Dan van der
> Ster
> Sent: 03 December
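For reference, a quick way to see how far a mixed-version rollout like the
13.2.6/13.2.7 one above has progressed (generic commands, not specific to this
incident):

  # per-daemon-type version summary
  ceph versions
  # per-OSD detail
  ceph tell osd.* version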
Hi,
One way this can happen is if you change the crush rule of a pool after the
balancer has been running awhile.
This is because the balancer upmaps are only validated when they are
initially created.
ceph osd dump | grep upmap
Does it explain your issue?
.. Dan
On Tue, 14 Jan 2020, 04:17 Yi
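For illustration, checking for and clearing a stale upmap entry left over
after a crush rule change (pg 2.7f is a placeholder pg id):

  # list existing upmap exceptions
  ceph osd dump | grep upmap
  # remove an entry that no longer satisfies the pool's new crush rule
  ceph osd rm-pg-upmap-items 2.7f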
Hi Nick,
We saw the exact same problem yesterday after a network outage -- a few of
our down OSDs were stuck down until we restarted their processes.
-- Dan
On Wed, Jan 15, 2020 at 3:37 PM Nick Fisk wrote:
> Hi All,
>
> Running 14.2.5, currently experiencing some network blips isolated to a
>
t 12:08 PM Nick Fisk wrote:
> On Thursday, January 16, 2020 09:15 GMT, Dan van der Ster <
> d...@vanderster.com> wrote:
>
> > Hi Nick,
> >
> > We saw the exact same problem yesterday after a network outage -- a few
> of
> > our down OSDs were stuc
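For reference, the kind of check and process restart mentioned above
(OSD id 12 is a placeholder):

  # which OSDs does the cluster currently consider down?
  ceph osd tree down
  # restart the stuck daemon on its host
  systemctl restart ceph-osd@12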
On Wed, Jan 22, 2020 at 12:24 AM Patrick Donnelly
wrote:
> On Tue, Jan 21, 2020 at 8:32 AM John Madden wrote:
> >
> > On 14.2.5 but also present in Luminous, buffer_anon memory use spirals
> > out of control when scanning many thousands of files. The use case is
> > more or less "look up this fi
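For illustration, one way to watch the buffer_anon pool mentioned above on a
running daemon (mds.a is a placeholder name; run on the host with access to
its admin socket):

  ceph daemon mds.a dump_mempools | grep -A3 buffer_anon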