On 9/13/19 4:51 AM, Reed Dier wrote:
I would love to deprecate the multi-root, and may try to do just that
in my next OSD add, just worried about data shuffling unnecessarily.
Would this in theory help my distribution across disparate OSD topologies?
Maybe. Actually I don't know where is bala
> 1. Multi-root. You should deprecate your 'ssd' root and move your osds of
> this root to 'default' root.
>
I would love to deprecate the multi-root, and may try to do just that in my
next OSD add, just worried about data shuffling unnecessarily.
Would this in theory help my distribution across
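For anyone wanting to try the same consolidation, a rough sketch of folding a
separate 'ssd' root back under 'default', assuming the OSDs already carry
device classes; the rule, pool, and host bucket names below are placeholders,
not taken from this cluster, and moving buckets between roots does trigger
data movement, so plan it during a quiet window (or with norebalance set):

  ceph osd crush rule create-replicated replicated-ssd default host ssd
  ceph osd pool set <ssd-pool> crush_rule replicated-ssd   # repeat per pool using the old root
  ceph osd crush move <ssd-host> root=default              # repeat per host bucket under 'ssd'
  ceph osd crush remove ssd                                # remove the now-empty root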
On 9/2/19 5:47 PM, Jake Grimmett wrote:
Hi Konstantin,
To confirm, disabling the balancer allows the mgr to work properly.
I tried re-enabling the balancer; it briefly worked, then locked up the
mgr again.
Here it's working OK...
[root@ceph-s1 ~]# time ceph balancer optimize new
real    0m1.6
On 8/29/19 9:56 PM, Reed Dier wrote:
"config/mgr/mgr/balancer/active",
"config/mgr/mgr/balancer/max_misplaced",
"config/mgr/mgr/balancer/mode",
"config/mgr/mgr/balancer/pool_ids",
These keys are useless; you can remove them.
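For reference, a sketch of dropping those leftover keys (using the key names
listed above); the balancer itself is normally driven via "ceph balancer mode"
and "ceph balancer on|off" rather than these raw config-key entries:

  ceph config-key rm config/mgr/mgr/balancer/active
  ceph config-key rm config/mgr/mgr/balancer/max_misplaced
  ceph config-key rm config/mgr/mgr/balancer/mode
  ceph config-key rm config/mgr/mgr/balancer/pool_ids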
https://pastebin.com/bXPs28h1
Issues that you have:
1. Multi-root.
See responses below.
> On Aug 28, 2019, at 11:13 PM, Konstantin Shalygin wrote:
>> Just a follow up 24h later, and the mgr's seem to be far more stable, and
>> have had no issues or weirdness after disabling the balancer module.
>>
>> Which isn't great, because the balancer plays an important r
Just a follow-up 24h later, and the mgrs seem to be far more stable, and have
had no issues or weirdness after disabling the balancer module.
Which isn't great, because the balancer plays an important role, but after
fighting distribution for a few weeks and getting it 'good enough' I'm taking
Just to further piggyback,
Probably the hardest the mgr seems to get pushed is when the balancer is
engaged.
When trying to eval a pool or cluster, it takes upwards of 30-120 seconds for
it to score it, and then another 30-120 seconds to execute the plan, and it
never seems to engage automa
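For anyone reproducing those timings, the manual plan workflow is roughly the
following (the plan name is arbitrary):

  time ceph balancer eval                  # score the whole cluster
  time ceph balancer eval <pool>           # or score a single pool
  time ceph balancer optimize myplan       # build a plan
  ceph balancer show myplan                # inspect the proposed changes
  ceph balancer execute myplan             # apply it
  ceph balancer rm myplan                  # clean up the plan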
Hi Reed, Lenz, John
I've just tried disabling the balancer; so far ceph-mgr is keeping its
CPU mostly under 20%, even with both the iostat and dashboard back on.
# ceph balancer off
was
[root@ceph-s1 backup]# ceph balancer status
{
    "active": true,
    "plans": [],
    "mode": "upmap"
}
now
Yes, the problem still occurs with the dashboard disabled...
Possibly relevant, when both the dashboard and iostat plugins are
disabled, I occasionally see ceph-mgr rise to 100% CPU.
as suggested by John Hearns, the output of gstack ceph-mgr when at 100%
is here:
http://p.ip.fi/52sV
many thank
I'm currently seeing this with the dashboard disabled.
My instability decreases, but isn't wholly cured, by disabling prometheus and
rbd_support, which I use in tandem, as the only thing I'm using the
prom-exporter for is the per-rbd metrics.
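For anyone following along, modules are toggled roughly like this; note that
on some releases certain modules are "always on" and cannot be disabled this
way:

  ceph mgr module ls                       # see enabled_modules / disabled_modules
  ceph mgr module disable prometheus
  ceph mgr module disable iostat
  ceph mgr module enable prometheus        # re-enable once done testing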
> ceph mgr module ls
> {
> "enabled_modules": [
Try running gstack on the ceph mgr process when it is frozen?
This could be a name resolution problem, as you suspect. Maybe gstack will
show where the process is 'stuck', and this might be a call to your name
resolution service.
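A minimal way to capture that, assuming gstack (from the gdb package) is
available on the node running the active mgr:

  pidof ceph-mgr                                        # PID of the mgr daemon on this host
  gstack $(pidof ceph-mgr) > /tmp/ceph-mgr-stack.txt    # dump stack traces of all threads

Grabbing it a few times while the CPU is pegged makes it easier to spot where
it is stuck.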
On Tue, 27 Aug 2019 at 14:25, Jake Grimmett wrote:
> Whoops, I'm r
Hi Jake,
On 8/27/19 3:22 PM, Jake Grimmett wrote:
> That exactly matches what I'm seeing:
>
> when iostat is working OK, I see ~5% CPU use by ceph-mgr
> and when iostat freezes, ceph-mgr CPU increases to 100%
Does this also occur if the dashboard module is disabled? Just wondering
if this is is
Whoops, I'm running Scientific Linux 7.6, going to upgrade to 7.7 soon...
thanks
Jake
On 8/27/19 2:22 PM, Jake Grimmett wrote:
> Hi Reed,
>
> That exactly matches what I'm seeing:
>
> when iostat is working OK, I see ~5% CPU use by ceph-mgr
> and when iostat freezes, ceph-mgr CPU increases t
Hi Reed,
That exactly matches what I'm seeing:
when iostat is working OK, I see ~5% CPU use by ceph-mgr
and when iostat freezes, ceph-mgr CPU increases to 100%
regarding OS, I'm using Scientific Linux 7.7
Kernel 3.10.0-957.21.3.el7.x86_64
I'm not sure if the mgr initiates scrubbing, but if so,
Curious what distro you're running on, as I've been having similar issues with
mgr instability as well; curious if there are any similar threads to pull at.
While the iostat command is running, is the active mgr using 100% CPU in top?
Reed
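A quick way to check, assuming it is run on whichever host currently holds the
active mgr:

  ceph mgr dump | grep active_name        # name of the currently active mgr
  top -p $(pidof ceph-mgr)                # watch that daemon's CPU while iostat runs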
> On Aug 27, 2019, at 6:41 AM, Jake Grimmett wrote:
>
> De
Dear All,
We have a new Nautilus (14.2.2) cluster, with 328 OSDs spread over 40 nodes.
Unfortunately "ceph iostat" spends most of its time frozen, with
occasional periods of working normally for less than a minute; then it
freezes again for a couple of minutes, comes back to life, and so
on..
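For context, the iostat command here comes from the mgr iostat module; the
basic usage is roughly:

  ceph mgr module enable iostat
  ceph iostat            # streams cluster-wide I/O stats until interrupted (Ctrl-C)

so when the module stalls, the symptom is exactly this kind of frozen output.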