On 8/22/19 5:49 PM, Jason Dillaman wrote:
> On Thu, Aug 22, 2019 at 11:29 AM Wido den Hollander wrote:
>>
>>
>>
>> On 8/22/19 3:59 PM, Jason Dillaman wrote:
>>> On Thu, Aug 22, 2019 at 9:23 AM Wido den Hollander wrote:
Hi,
In a couple of situations I have encountered that V
Hi all,
we're building up our experience with our ceph cluster before we take it
into production. I've now tried to fill up the cluster with cephfs,
which we plan to use for about 95% of all data on the cluster.
The cephfs pools are full when the cluster reports 67% raw capacity
used. There
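For context: a pool's reported fullness is driven by its most-full OSD rather
than by the cluster average, so `ceph df` can show a cephfs pool as full while
raw usage is only around 67%. A minimal sketch of how to check the spread; the
sort column index is only an example and differs between releases:

  $ ceph df            # MAX AVAIL per pool is derived from the fullest OSD under the pool's rule
  $ ceph osd df tree   # per-OSD %USE and VAR (deviation from the mean)
  $ ceph osd df | sort -rnk11   # example only: sort on the %USE column to spot outliers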
Hi,
Which version of ceph are you using? Which balancer mode?
The balancer score isn't a percent-error or anything humanly usable.
`ceph osd df tree` can better show you exactly which osds are
over/under utilized and by how much.
You might be able to manually fix things by using `ceph osd reweigh
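A hedged sketch of the manual approach described above; osd.10 and the 0.9
value are placeholders only:

  $ ceph osd df tree              # look for OSDs with the highest %USE / VAR
  $ ceph osd reweight osd.10 0.9  # temporary override (0 < weight <= 1), does not touch the CRUSH weight
  $ ceph osd reweight osd.10 1.0  # undo the override once the data has moved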
On 26-08-19 11:16, Dan van der Ster wrote:
Hi,
Which version of ceph are you using? Which balancer mode?
Nautilus (14.2.2), balancer is in upmap mode.
The balancer score isn't a percent-error or anything humanly usable.
`ceph osd df tree` can better show you exactly which osds are
over/under
Thanks. The version and balancer config look good.
So you can try `ceph osd reweight osd.10 0.8` to see if it helps to
get you out of this.
-- dan
On Mon, Aug 26, 2019 at 11:35 AM Simon Oosthoek
wrote:
>
> On 26-08-19 11:16, Dan van der Ster wrote:
> > Hi,
> >
> > Which version of ceph are you
On 26-08-19 11:37, Dan van der Ster wrote:
Thanks. The version and balancer config look good.
So you can try `ceph osd reweight osd.10 0.8` to see if it helps to
get you out of this.
I've done this, and the same for the 3 next-fullest osds. This will take some
time to recover; I'll let you know when it's d
The balancer only balances when the cluster is healthy.
The problem is that data is not balanced on its first write, which leaves
data unevenly distributed across the osds.
This problem only happens in Ceph; we see the same on 14.2.2 and have to
change the weights manually. Because the balancer is a p
On 26-08-19 12:00, EDH - Manuel Rios Fernandez wrote:
The balancer only balances when the cluster is healthy.
The problem is that data is not balanced on its first write, which leaves
data unevenly distributed across the osds.
I suppose the crush algorithm doesn't take the fullness of the osds int
The balancer is unfortunately not that good when you have a large k+m in
your erasure coding profiles and relatively few servers; some manual
balancing will be required.
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
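One way to do that manual balancing is to have osdmaptool compute
pg-upmap-items entries offline; this is only a sketch, it assumes upmap is
usable (`ceph osd set-require-min-compat-client luminous`) and the pool name
is hypothetical:

  $ ceph osd getmap -o osd.map
  $ osdmaptool osd.map --upmap upmap.sh --upmap-pool cephfs_data --upmap-max 100
  $ bash upmap.sh   # runs the generated 'ceph osd pg-upmap-items ...' commands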
hi Zheng,
On 8/21/19 4:32 AM, Yan, Zheng wrote:
> Please enable debug mds (debug_mds=10), and try reproducing it again.
please find the logs at
https://www.user.tu-berlin.de/thoralf.schulze/ceph-debug.tar.xz .
we managed to reproduce the issue as a worst case scenario: before
snapshotting, juju-
On 8/26/19 12:33 PM, Simon Oosthoek wrote:
> On 26-08-19 12:00, EDH - Manuel Rios Fernandez wrote:
>> The balancer only balances when the cluster is healthy.
>>
>> The problem is that data is not balanced on its first write, which
>> leaves data unevenly distributed across the osds.
>
> I suppose
On 26-08-19 13:11, Wido den Hollander wrote:
The reweight might actually cause even more confusion for the balancer.
The balancer uses upmap mode and that re-allocates PGs to different OSDs
if needed.
Looking at the output sent earlier I have some replies. See below.
Looking at this outpu
On 26-08-19 13:25, Simon Oosthoek wrote:
On 26-08-19 13:11, Wido den Hollander wrote:
The reweight might actually cause even more confusion for the balancer.
The balancer uses upmap mode and that re-allocates PGs to different OSDs
if needed.
Looking at the output sent earlier I have some repl
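For reference, a quick way to see what the balancer is doing and to clear a
manual reweight that might be fighting it (Nautilus syntax; the osd id is just
an example):

  $ ceph balancer status          # mode, whether it is active, queued plans
  $ ceph balancer mode upmap
  $ ceph balancer on
  $ ceph balancer eval            # score for the current distribution (lower is better)
  $ ceph osd reweight osd.10 1.0  # reset a temporary reweight back to the default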
Dear Jason,
I shall explain my setup first.
The DR centre is 300 km from the site.
Site-A - OSD 0 - 1 TB Mon - 10.236.248.XX/24
Site-B - OSD 0 - 1 TB Mon - 10.236.228.XX/27 - RBD-Mirror daemon running
All ports are open and there is no firewall. There is connectivity between
My ini
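A minimal sketch of one-way, journal-based RBD mirroring for a setup like
this; the pool, image, client and cluster names are hypothetical, and it
assumes Site-A's ceph.conf and keyring are available on Site-B under the
cluster name site-a:

  # on Site-A (primary), per image to be mirrored
  $ rbd feature enable rbd/vm-disk-1 journaling
  $ rbd mirror pool enable rbd image
  $ rbd mirror image enable rbd/vm-disk-1

  # on Site-B, where the rbd-mirror daemon runs
  $ rbd mirror pool enable rbd image
  $ rbd mirror pool peer add rbd client.admin@site-a
  $ rbd mirror pool status rbd --verbose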
On 8/26/19 1:35 PM, Simon Oosthoek wrote:
> On 26-08-19 13:25, Simon Oosthoek wrote:
>> On 26-08-19 13:11, Wido den Hollander wrote:
>>
>>>
>>> The reweight might actually cause even more confusion for the balancer.
>>> The balancer uses upmap mode and that re-allocates PGs to different OSDs
>>
On Mon, Aug 26, 2019 at 5:01 AM Wido den Hollander wrote:
>
>
>
> On 8/22/19 5:49 PM, Jason Dillaman wrote:
> > On Thu, Aug 22, 2019 at 11:29 AM Wido den Hollander wrote:
> >>
> >>
> >>
> >> On 8/22/19 3:59 PM, Jason Dillaman wrote:
> >>> On Thu, Aug 22, 2019 at 9:23 AM Wido den Hollander wrote:
On Mon, Aug 26, 2019 at 6:57 PM thoralf schulze wrote:
>
> hi Zheng,
>
> On 8/21/19 4:32 AM, Yan, Zheng wrote:
> > Please enable debug mds (debug_mds=10), and try reproducing it again.
>
> please find the logs at
> https://www.user.tu-berlin.de/thoralf.schulze/ceph-debug.tar.xz .
>
> we managed to
On Mon, Aug 26, 2019 at 7:54 AM V A Prabha wrote:
>
> Dear Jason
> I shall explain my setup first
> The DR centre is 300 km from the site.
> Site-A - OSD 0 - 1 TB Mon - 10.236.248.XX/24
> Site-B - OSD 0 - 1 TB Mon - 10.236.228.XX/27 - RBD-Mirror daemon
> running
> All por
hi Zheng -
On 8/26/19 2:55 PM, Yan, Zheng wrote:
> I tracked down the bug
> https://tracker.ceph.com/issues/41434
wow, that was quick - thank you for investigating. we are looking
forward to the fix :-)
in the meantime, is there anything we can do to prevent q ==
p->second.end() from happening?
On Mon, Aug 26, 2019 at 9:25 PM thoralf schulze wrote:
>
> hi Zheng -
>
> On 8/26/19 2:55 PM, Yan, Zheng wrote:
> > I tracked down the bug
> > https://tracker.ceph.com/issues/41434
>
> wow, that was quick - thank you for investigating. we are looking
> forward to the fix :-)
>
> in the meantime,
On 8/26/19 7:39 AM, Wido den Hollander wrote:
On 8/26/19 1:35 PM, Simon Oosthoek wrote:
On 26-08-19 13:25, Simon Oosthoek wrote:
On 26-08-19 13:11, Wido den Hollander wrote:
The reweight might actually cause even more confusion for the balancer.
The balancer uses upmap mode and that re-allo
Hi,
I'm running Debian 10 with btrfs-progs=5.2.1.
Creating snapshots with snapper=0.8.2 works w/o errors.
However, I have run into an issue and need to restore various files.
I thought that I could simply take the files from a snapshot created before.
However, the files required don't exist in any
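A hedged sketch of restoring files from an existing snapper snapshot; the
config name, snapshot number and path are placeholders:

  $ snapper -c root list                            # find a snapshot from before the problem
  $ snapper -c root status 42..0                    # diff snapshot 42 against the current filesystem (0)
  $ snapper -c root undochange 42..0 /etc/foo.conf  # restore that file from snapshot 42
  # or, for the root config on btrfs, copy straight out of the read-only snapshot:
  $ cp -a /.snapshots/42/snapshot/etc/foo.conf /etc/foo.conf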
On 8/26/19 6:46 PM, Thomas Schneider wrote:
> Hi,
>
> I'm running Debian 10 with btrfs-progs=5.2.1.
>
> Creating snapshots with snapper=0.8.2 works w/o errors.
>
> However, I have run into an issue and need to restore various files.
>
> I thought that I could simply take the files from a snapshot
I created a ticket: https://tracker.ceph.com/issues/41511
Note that I think I was mistaken when I said that sometimes the problem
goes away on its own. I've looked back through our monitoring and it
looks like when the problem did go away, it was because either the
machine was rebooted or the
I have some friends for whom I set up a Ceph cluster for use with PVE a few years ago.
It wasn't maintained and is now in bad shape.
They've reached out to me for help, but I do not have the time to assist
right now.
Is there anyone on the list that would be willing to help? As a
professional service of cou