[ceph-users] Re: Upmap balancer after node failure

2021-04-02 Thread Dan van der Ster
Hi Andras.

Assuming that you've already tightened the
mgr/balancer/upmap_max_deviation to 1, I suspect that this cluster
already has too many upmaps.
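(If you haven't done that yet, it's the first thing to try -- it should
just be

    ceph config set mgr mgr/balancer/upmap_max_deviation 1

assuming the option name hasn't changed on your release.)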

Last time I checked, the balancer implementation is not able to
improve a pg-upmap-items entry if one already exists for a PG. (It can
add an OSD mapping pair to a PG, but not change an existing pair from
one OSD to another).
So I think that what happens in this case is the balancer gets stuck
in a sort of local minimum in the overall optimization.

It can therefore help to simply remove some upmaps, and then wait for
the balancer to do a better job when it re-creates new entries for
those PGs.
And there's usually some low hanging fruit -- you can start by
removing pg-upmap-items which are mapping PGs away from the least full
OSDs. (Those upmap entries are making the least full OSDs even *less*
full.)

We have a script for that:
https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/rm-upmaps-underfull.py
It's pretty hacky and I don't use it often, so please use it with
caution -- you can run it and review which upmaps it would remove.
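If you'd rather roll your own, the core idea is small enough to sketch.
The following is an untested outline only -- the "10 least full" cutoff
is arbitrary, and the JSON field names are from memory, so double-check
them against your own 'ceph osd df' / 'ceph osd dump' output. It only
prints the rm-pg-upmap-items commands instead of running them:

    import json, subprocess

    def ceph_json(*args):
        # run a ceph CLI command and parse its JSON output
        out = subprocess.check_output(["ceph", *args, "--format", "json"])
        return json.loads(out)

    # treat the N least-full OSDs as "underfull"
    nodes = ceph_json("osd", "df")["nodes"]
    underfull = {n["id"] for n in sorted(nodes, key=lambda n: n["utilization"])[:10]}

    # find pg-upmap-items entries that map data *away from* those OSDs
    for item in ceph_json("osd", "dump").get("pg_upmap_items", []):
        if any(m["from"] in underfull for m in item["mappings"]):
            # removing this entry lets data flow back to the underfull OSD
            print("ceph osd rm-pg-upmap-items", item["pgid"])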

Hope this helps,

Dan



On Fri, Apr 2, 2021 at 10:18 AM Andras Pataki
 wrote:
>
> Dear ceph users,
>
> On one of our clusters I have some difficulties with the upmap
> balancer.  We started with a reasonably well balanced cluster (using the
> balancer in upmap mode).  After a node failure, we crush reweighted all
> the OSDs of the node to take it out of the cluster - and waited for the
> cluster to rebalance.  Obviously, this significantly changes the crush
> map - hence the nice balance created by the balancer was gone.  The
> recovery mostly completed - but some of the OSDs became too full - so we
> ended up with a few PGs that were backfill_toofull.  The cluster has
> plenty of space (overall perhaps 65% full), only a few OSDs are >90% (we
> have backfillfull_ratio at 92%).  The balancer refuses to change
> anything since the cluster is not clean.  Yet - the cluster can't become
> clean without a few upmaps to help the top 3 or 4 most full OSDs.
>
> I would think this is a fairly common situation - trying to recover
> after some failure.  Are there any recommendations on how to proceed?
> Obviously I can manually find and insert upmaps - but for a large
> cluster with tens of thousands of PGs, that isn't too practical.  Is
> there a way to tell the balancer to still do something even though some
> PGs are undersized (with a quick look at the python module - I didn't
> see any)?
>
> The cluster is on Nautilus 14.2.15.
>
> Thanks,
>
> Andras
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph orch update fails - got new digests

2021-04-02 Thread Alexander Sporleder
Hello Ceph user list!

I tried to update Ceph 15.2.10 to 16.2.0 via ceph orch. In the
beginning everything seemed to work fine and the new MGR and MONs were
deployed. But now I have ended up in a pulling loop and I am unable to fix
the issue by myself.

# ceph -W cephadm --watch-debug

2021-04-02T10:36:20.704960+0200 mgr.mon-a-02.tvcrfq [INF] Upgrade: Need to upgrade myself (mgr.mon-a-02.tvcrfq)
2021-04-02T10:36:21.837596+0200 mgr.mon-a-02.tvcrfq [INF] Upgrade: Pulling ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000 on mon-a-01
2021-04-02T10:36:24.591487+0200 mgr.mon-a-02.tvcrfq [INF] Upgrade: image ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000 pull on mon-a-01 got new digests ['docker.io/ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000', 'docker.io/ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a'] (not ['ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000']), restarting
2021-04-02T10:36:37.054786+0200 mgr.mon-a-02.tvcrfq [INF] Upgrade: Need to upgrade myself (mgr.mon-a-02.tvcrfq)
2021-04-02T10:36:38.419014+0200 mgr.mon-a-02.tvcrfq [INF] Upgrade: Pulling ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000 on mon-a-01
2021-04-02T10:36:41.172835+0200 mgr.mon-a-02.tvcrfq [INF] Upgrade: image ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000 pull on mon-a-01 got new digests ['docker.io/ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000', 'docker.io/ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a'] (not ['ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000']), restarting


After I stopped the update I got the following health error:

Module 'cephadm' has failed: 'NoneType' object has no attribute
'target_digests'

Thanks in advance!
 
Best,
Alex


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upmap balancer after node failure

2021-04-02 Thread Dan van der Ster
Hi again,

Oops, I'd missed the part about some PGs being degraded, which
prevents the balancer from continuing.

So I assume that you have PGs which are simultaneously
undersized+backfill_toofull?
That case does indeed sound tricky. To solve that you would either
need to move PGs out of the toofull OSD, to make room for the
undersized PGs; or, upmap those undersized PGs to some other less-full
OSDs.

For the former, you could either use the rm-upmaps-underfull script
and hope that it incidentally moves data out of those toofull OSDs. Or
a similar script with some variables reversed could be used to remove
any upmaps which are directing PGs *to* those toofull OSDs. Or maybe
it will be enough to just reweight those OSDs to 0.9.
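For that last option it's just the classic (replace 123 with the actual
OSD id, of course):

    ceph osd reweight 123 0.9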

-- Dan


On Fri, Apr 2, 2021 at 10:47 AM Dan van der Ster  wrote:
>
> Hi Andras.
>
> Assuming that you've already tightened the
> mgr/balancer/upmap_max_deviation to 1, I suspect that this cluster
> already has too many upmaps.
>
> Last time I checked, the balancer implementation is not able to
> improve a pg-upmap-items entry if one already exists for a PG. (It can
> add an OSD mapping pair to a PG, but not change an existing pair from
> one OSD to another).
> So I think that what happens in this case is the balancer gets stuck
> in a sort of local minimum in the overall optimization.
>
> It can therefore help to simply remove some upmaps, and then wait for
> the balancer to do a better job when it re-creates new entries for
> those PGs.
> And there's usually some low hanging fruit -- you can start by
> removing pg-upmap-items which are mapping PGs away from the least full
> OSDs. (Those upmap entries are making the least full OSDs even *less*
> full.)
>
> We have a script for that:
> https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/rm-upmaps-underfull.py
> It's pretty hacky and I don't use it often, so please use it with
> caution -- you can run it and review which upmaps it would remove.
>
> Hope this helps,
>
> Dan
>
>
>
> On Fri, Apr 2, 2021 at 10:18 AM Andras Pataki
>  wrote:
> >
> > Dear ceph users,
> >
> > On one of our clusters I have some difficulties with the upmap
> > balancer.  We started with a reasonably well balanced cluster (using the
> > balancer in upmap mode).  After a node failure, we crush reweighted all
> > the OSDs of the node to take it out of the cluster - and waited for the
> > cluster to rebalance.  Obviously, this significantly changes the crush
> > map - hence the nice balance created by the balancer was gone.  The
> > recovery mostly completed - but some of the OSDs became too full - so we
> > ended up with a few PGs that were backfill_toofull.  The cluster has
> > plenty of space (overall perhaps 65% full), only a few OSDs are >90% (we
> > have backfillfull_ratio at 92%).  The balancer refuses to change
> > anything since the cluster is not clean.  Yet - the cluster can't become
> > clean without a few upmaps to help the top 3 or 4 most full OSDs.
> >
> > I would think this is a fairly common situation - trying to recover
> > after some failure.  Are there any recommendations on how to proceed?
> > Obviously I can manually find and insert upmaps - but for a large
> > cluster with tens of thousands of PGs, that isn't too practical.  Is
> > there a way to tell the balancer to still do something even though some
> > PGs are undersized (with a quick look at the python module - I didn't
> > see any)?
> >
> > The cluster is on Nautilus 14.2.15.
> >
> > Thanks,
> >
> > Andras
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephfs-top: "cluster ceph does not exist"

2021-04-02 Thread Erwin Bogaard
Hi,

I just installed Pacific on our test cluster. This really is a minimal, but
fully functional, cluster.
Everything works as expected, except for the new (and much-anticipated, at
least by me) cephfs-top.
When I run that tool, it says: "cluster ceph does not exist"

If I point it to the correct config file:
# cephfs-top --conffile /etc/ceph/ceph.conf

I still get the same error.
Doesn't matter if I run this as the ceph user or as root.

I added the "client.fstop"-user as required by the documentation.
module "stats" is enabled and functioning (tested with "ceph fs perf
stats").

Does anyone have any suggestions about what might be wrong?

Regards,
Erwin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs-top: "cluster ceph does not exist"

2021-04-02 Thread Venky Shankar
On Fri, Apr 2, 2021 at 2:59 PM Erwin Bogaard  wrote:
>
> Hi,
>
> just installed pacific on our test-cluster. This really is a minimal, but
> fully functional cluster.
> Everything works as expected, except for the new (and by me anticipated)
> cephfs-top.
> When I run that tool, it says: "cluster ceph does not exist"
>
> If I point it to the correct config file:
> # cephfs-top --conffile /etc/ceph/ceph.conf

Does running "cephfs-top" work for you? (since it is the default cluster name)

>
> I still get the same error.
> Doesn't matter if I run this as the ceph user or as root.
>
> I added the "client.fstop"-user as required by the documentation.
> module "stats" is enabled and functioning (tested with "ceph fs perf
> stats").
>
> Anyone any suggestions what might be wrong?
>
> Regards,
> Erwin
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upmap balancer after node failure

2021-04-02 Thread Janne Johansson
Den fre 2 apr. 2021 kl 11:23 skrev Dan van der Ster :
>
> Hi again,
>
> Oops, I'd missed the part about some PGs being degraded, which
> prevents the balancer from continuing.
> any upmaps which are directing PGs *to* those toofull OSDs. Or maybe
> it will be enough to just reweight those OSDs to 0.9.

I was also thinking this, in that case, just lower OSD weight on the
toofull OSDs like us old pre-upmap admins do. ;)
When all the dust has settled, move weight up again.

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Upmap balancer after node failure

2021-04-02 Thread Andras Pataki

Dear ceph users,

On one of our clusters I have some difficulties with the upmap 
balancer.  We started with a reasonably well balanced cluster (using the 
balancer in upmap mode).  After a node failure, we crush reweighted all 
the OSDs of the node to take it out of the cluster - and waited for the 
cluster to rebalance.  Obviously, this significantly changes the crush 
map - hence the nice balance created by the balancer was gone.  The 
recovery mostly completed - but some of the OSDs became too full - so we 
ended up with a few PGs that were backfill_toofull.  The cluster has
plenty of space (overall perhaps 65% full), only a few OSDs are >90% (we 
have backfillfull_ratio at 92%).  The balancer refuses to change 
anything since the cluster is not clean.  Yet - the cluster can't become 
clean without a few upmaps to help the top 3 or 4 most full OSDs.


I would think this is a fairly common situation - trying to recover 
after some failure.  Are there any recommendations on how to proceed?  
Obviously I can manually find and insert upmaps - but for a large 
cluster with tens of thousands of PGs, that isn't too practical.  Is 
there a way to tell the balancer to still do something even though some 
PGs are undersized (with a quick look at the python module - I didn't 
see any)?
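(By "manually" I mean something along these lines per stuck PG - the ids
below are made up:

    ceph pg ls backfill_toofull           # which PGs are stuck, and where
    ceph osd df                           # which OSDs are nearly full
    ceph osd pg-upmap-items 1.7f 241 87   # map that PG off the full OSD

which is fine for a handful of PGs, but not for thousands.)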


The cluster is on Nautilus 14.2.15.

Thanks,

Andras
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orch update fails - got new digests

2021-04-02 Thread Sage Weil
Hi Alex,

Thanks for the report!  I've opened
https://tracker.ceph.com/issues/50114.  It looks like the
target_digests check needs to check for overlap instead of equality.
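The gist, just to illustrate the distinction (made-up digests, not the
actual cephadm code):

    # the digests cephadm is aiming for vs. what the pull reported
    target_digests = {"docker.io/ceph/ceph@sha256:aaa"}
    pulled_digests = {"docker.io/ceph/ceph@sha256:aaa",
                      "docker.io/ceph/ceph@sha256:bbb"}

    # equality check: the extra digest looks like a mismatch -> restart loop
    needs_restart = pulled_digests != target_digests        # True

    # overlap check: any shared digest means the right image was pulled
    needs_restart = not (pulled_digests & target_digests)   # False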

sage

On Fri, Apr 2, 2021 at 4:04 AM Alexander Sporleder
 wrote:
>
> Hello Ceph user list!
>
> I tried to update Ceph 15.2.10 to 16.2.0 via ceph orch. In the
> beginning everything seemed to work fine and the new MGR and MONs were
> deployed. But now I have ended up in a pulling loop and I am unable to fix
> the issue by myself.
>
> # ceph -W cephadm --watch-debug
>
> 2021-04-02T10:36:20.704960+0200 mgr.mon-a-02.tvcrfq [INF] Upgrade: Need to upgrade myself (mgr.mon-a-02.tvcrfq)
> 2021-04-02T10:36:21.837596+0200 mgr.mon-a-02.tvcrfq [INF] Upgrade: Pulling ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000 on mon-a-01
> 2021-04-02T10:36:24.591487+0200 mgr.mon-a-02.tvcrfq [INF] Upgrade: image ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000 pull on mon-a-01 got new digests ['docker.io/ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000', 'docker.io/ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a'] (not ['ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000']), restarting
> 2021-04-02T10:36:37.054786+0200 mgr.mon-a-02.tvcrfq [INF] Upgrade: Need to upgrade myself (mgr.mon-a-02.tvcrfq)
> 2021-04-02T10:36:38.419014+0200 mgr.mon-a-02.tvcrfq [INF] Upgrade: Pulling ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000 on mon-a-01
> 2021-04-02T10:36:41.172835+0200 mgr.mon-a-02.tvcrfq [INF] Upgrade: image ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000 pull on mon-a-01 got new digests ['docker.io/ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000', 'docker.io/ceph/ceph@sha256:9b04c0f15704c49591640a37c7adfd40ffad0a4b42fecb950c3407687cb4f29a'] (not ['ceph/ceph@sha256:35b2786dc4cd535dd84f6a1a585503db4b43623ba6c43120b5e5e00951949000']), restarting
>
>
> After I stopped the update I got the following health error:
>
> Module 'cephadm' has failed: 'NoneType' object has no attribute
> 'target_digests'
>
> Thanks in advance!
>
> Best,
> Alex
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orch update fails - got new digests

2021-04-02 Thread Sage Weil
I'm a bit confused by the log messages--I'm not sure why the
target_digests aren't changing.  Can you post the whole
ceph-mgr.mon-a-02.tvcrfq.log?  (ceph-post-file
/var/log/ceph/*/ceph-mgr.mon-a-02.tvcrfq.log)

Thanks!
s
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upmap balancer after node failure

2021-04-02 Thread Andras Pataki
Lowering the weight is what I ended up doing.  But this isn't ideal,
since afterwards the balancer will remove too many PGs from the OSD
because it now has a lower weight.  So I'll have to put the weight back
once the cluster recovers and the balancer goes back to its business.


But in any case - this is conceptually challenging - the upmap
balancer won't help in the case where one perhaps needs it the most -
when recovering from some kind of disaster.  So one can be under the
illusion that everything is fine - OSDs are all balanced, the cluster is
running smoothly.  Then some non-trivial failure happens, and we are back
to a pre-upmap situation - the balance is completely thrown off.  Also,
for larger clusters the pre-upmap imbalance is worse - and also
harder to fix (due to the large number of OSDs).  I've done some
analysis of what the expected imbalance is, given various factors of the
cluster - but that's a longer story ...


Thanks for the input - I was really wondering if I was missing something 
with upmap ...


Andras

On 4/2/21 8:12 AM, Janne Johansson wrote:

> Den fre 2 apr. 2021 kl 11:23 skrev Dan van der Ster :
>>
>> Hi again,
>>
>> Oops, I'd missed the part about some PGs being degraded, which
>> prevents the balancer from continuing.
>> any upmaps which are directing PGs *to* those toofull OSDs. Or maybe
>> it will be enough to just reweight those OSDs to 0.9.
>
> I was also thinking this, in that case, just lower OSD weight on the
> toofull OSDs like us old pre-upmap admins do. ;)
> When all the dust has settled, move weight up again.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orch update fails - got new digests

2021-04-02 Thread Sage Weil
On Fri, Apr 2, 2021 at 12:08 PM Alexander Sporleder
 wrote:
>
> Hello Sage, thank you for your response!
>
> I had some problems updating 15.2.8 -> 15.2.9 but after updating Podman
> to 3.0.1 and Ceph to 15.2.10 everything was fine again.
>
> Then I started the update 15.2.10 -> 16.2.0 and in the beginning
> everything worked well. But at some point the update got stuck and
> something broke the dashboard (port is in use). I stopped the update
> but it was not possible to start the update process again without the
> loop.
>
> Now my mons, the mgr, a few OSDs and cephadm are V16.2.0.
>
> Unfortunately the mgr is not logging to a file after I converted the
> cluster to cephadm.

ceph config set global log_to_file true

You might also need to chown -R 167.167 /var/log/ceph/*/. if you're on
debian/ubuntu (the packages installed on the host may have fiddled
with the ownership; the uid for fedora/rhel/centos is different than
the one for debian/ubuntu).

s
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io