Re: [ceph-users] Theory: High I/O-wait inside VM with RBD due to CPU throttling

2019-08-26 Thread Wido den Hollander
On 8/22/19 5:49 PM, Jason Dillaman wrote: > On Thu, Aug 22, 2019 at 11:29 AM Wido den Hollander wrote: >> >> >> >> On 8/22/19 3:59 PM, Jason Dillaman wrote: >>> On Thu, Aug 22, 2019 at 9:23 AM Wido den Hollander wrote: Hi, In a couple of situations I have encountered that V
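Not from the thread itself, but one way to check whether this theory holds on a given hypervisor is to look at the CFS throttling counters for the qemu process; the cgroup path and the libvirt domain name below are assumptions and will differ per host:

  # cgroup v1 layout on a systemd host; adjust the path for your distro / cgroup version
  grep . /sys/fs/cgroup/cpu,cpuacct/machine.slice/machine-qemu*/cpu.stat
  # a steadily growing nr_throttled / throttled_time means the guest's threads are hitting
  # their CPU quota, which can surface as I/O-wait inside the VM
  virsh schedinfo <domain>   # vcpu_quota / emulator_quota currently applied by libvirt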

[ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
Hi all, we're building up our experience with our ceph cluster before we take it into production. I've now tried to fill up the cluster with cephfs, which we plan to use for about 95% of all data on the cluster. The cephfs pools are full when the cluster reports 67% raw capacity used. There
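For readers hitting the same symptom, the usual first checks compare the pool-level and OSD-level view; a minimal sketch with standard commands:

  ceph df detail                    # per-pool USED vs MAX AVAIL (MAX AVAIL is capped by the fullest OSD)
  ceph osd df tree                  # per-OSD %USE and VAR; a handful of over-full OSDs limit the whole pool
  ceph osd dump | grep full_ratio   # the full / backfillfull / nearfull thresholds in effect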

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Dan van der Ster
Hi, Which version of ceph are you using? Which balancer mode? The balancer score isn't a percent-error or anything humanly usable. `ceph osd df tree` can better show you exactly which osds are over/under utilized and by how much. You might be able to manually fix things by using `ceph osd reweigh
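For reference, the per-OSD view Dan refers to, and the manual fix he hints at (the OSD id and weight are only examples):

  ceph osd df tree                  # the %USE and VAR columns show which OSDs are over- or under-utilised
  ceph osd reweight osd.10 0.8      # temporarily lower the override reweight of an over-full OSD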

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 11:16, Dan van der Ster wrote: Hi, Which version of ceph are you using? Which balancer mode? Nautilus (14.2.2), balancer is in upmap mode. The balancer score isn't a percent-error or anything humanly usable. `ceph osd df tree` can better show you exactly which osds are over/under
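To double-check the balancer setup mentioned here, something along these lines works on Nautilus:

  ceph balancer status                      # confirms the mode (upmap) and whether the balancer is active
  ceph osd get-require-min-compat-client    # upmap mode requires luminous or newer clients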

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Dan van der Ster
Thanks. The version and balancer config look good. So you can try `ceph osd reweight osd.10 0.8` to see if it helps to get you out of this. -- dan On Mon, Aug 26, 2019 at 11:35 AM Simon Oosthoek wrote: > > On 26-08-19 11:16, Dan van der Ster wrote: > > Hi, > > > > Which version of ceph are you
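A short sketch of applying and then watching the suggested reweight (IDs as used earlier in the thread):

  ceph osd reweight osd.10 0.8   # lower the override reweight of the fullest OSD
  ceph osd df tree               # re-check %USE / VAR as PGs move off it
  ceph -s                        # follow backfill/recovery until the cluster settles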

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 11:37, Dan van der Ster wrote: Thanks. The version and balancer config look good. So you can try `ceph osd reweight osd.10 0.8` to see if it helps to get you out of this. I've done this for that osd and the next 3 fullest osds. This will take some time to recover; I'll let you know when it's d

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread EDH - Manuel Rios Fernandez
The balancer only balances when the cluster is in a healthy state. The problem is that data is not balanced at the time of its first write, which leaves data unevenly distributed across OSDs. This problem only happens in Ceph; we see the same on 14.2.2 and have to change the weights manually. Because the balancer is a p

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 12:00, EDH - Manuel Rios Fernandez wrote: The balancer only balances when the cluster is in a healthy state. The problem is that data is not balanced at the time of its first write, which leaves data unevenly distributed across OSDs. I suppose the crush algorithm doesn't take the fullness of the osds int

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Paul Emmerich
The balancer is unfortunately not that good when you have a large k+m in your erasure coding profiles and relatively few servers; some manual balancing will be required. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 Mü
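One concrete form of the manual balancing Paul mentions is adding explicit upmap entries by hand; a hedged sketch with placeholder PG and OSD ids, assuming the target OSD still satisfies the CRUSH/EC placement rules:

  ceph osd df tree                    # pick an over-full OSD and an under-full one in a valid failure domain
  ceph pg ls-by-osd osd.10 | head     # list PGs currently mapped to the over-full OSD
  ceph osd pg-upmap-items 6.12 10 34  # remap PG 6.12: replace osd.10 with osd.34 in its mapping
  ceph osd rm-pg-upmap-items 6.12     # remove the exception again if it causes trouble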

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread thoralf schulze
hi Zheng, On 8/21/19 4:32 AM, Yan, Zheng wrote: > Please enable debug mds (debug_mds=10), and try reproducing it again. please find the logs at https://www.user.tu-berlin.de/thoralf.schulze/ceph-debug.tar.xz . we managed to reproduce the issue as a worst case scenario: before snapshotting, juju-
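For anyone wanting to capture the same debug data, the level Zheng asked for can be set like this (the MDS name is a placeholder; debug_mds=10 is very verbose, so revert it afterwards):

  ceph config set mds debug_mds 10                   # persistent, applies to all MDS daemons
  ceph tell mds.<name> injectargs '--debug_mds=10'   # runtime change on a single daemon
  ceph config rm mds debug_mds                       # revert to the default when done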

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Wido den Hollander
On 8/26/19 12:33 PM, Simon Oosthoek wrote: > On 26-08-19 12:00, EDH - Manuel Rios Fernandez wrote: >> The balancer only balances when the cluster is in a healthy state. >> >> The problem is that data is not balanced at the time of its first >> write, >> which leaves data unevenly distributed across OSDs. > > I suppose

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 13:11, Wido den Hollander wrote: The reweight might actually cause even more confusion for the balancer. The balancer uses upmap mode and that re-allocates PGs to different OSDs if needed. Looking at the output sent earlier, I have some replies. See below. Looking at this outpu

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Simon Oosthoek
On 26-08-19 13:25, Simon Oosthoek wrote: On 26-08-19 13:11, Wido den Hollander wrote: The reweight might actually cause even more confusion for the balancer. The balancer uses upmap mode and that re-allocates PGs to different OSDs if needed. Looking at the output sent earlier I have some repl

Re: [ceph-users] [question] one-way RBD mirroring doesn't work

2019-08-26 Thread V A Prabha
Dear Jason, I shall explain my setup first. The DR centre is 300 km from the primary site. Site-A - OSD 0 - 1 TB, Mon - 10.236.248.XX/24. Site-B - OSD 0 - 1 TB, Mon - 10.236.228.XX/27 - rbd-mirror daemon running. All ports are open and there is no firewall. Connectivity is there between My ini
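For orientation only, a minimal sketch of one-way, journal-based pool mirroring between two such sites; the pool, image, cluster and client names are assumptions, not taken from this setup:

  # on both clusters (pool name is an example)
  rbd mirror pool enable rbd pool
  # images need the journaling feature for journal-based mirroring on Nautilus
  rbd feature enable rbd/<image> exclusive-lock journaling
  # on Site-B only, where the rbd-mirror daemon runs, with Site-A's conf/keyring available locally as "site-a"
  rbd mirror pool peer add rbd client.admin@site-a
  rbd mirror pool status rbd --verbose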

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Wido den Hollander
On 8/26/19 1:35 PM, Simon Oosthoek wrote: > On 26-08-19 13:25, Simon Oosthoek wrote: >> On 26-08-19 13:11, Wido den Hollander wrote: >> >>> >>> The reweight might actually cause even more confusion for the balancer. >>> The balancer uses upmap mode and that re-allocates PGs to different OSDs >>

Re: [ceph-users] Theory: High I/O-wait inside VM with RBD due to CPU throttling

2019-08-26 Thread Jason Dillaman
On Mon, Aug 26, 2019 at 5:01 AM Wido den Hollander wrote: > > > > On 8/22/19 5:49 PM, Jason Dillaman wrote: > > On Thu, Aug 22, 2019 at 11:29 AM Wido den Hollander wrote: > >> > >> > >> > >> On 8/22/19 3:59 PM, Jason Dillaman wrote: > >>> On Thu, Aug 22, 2019 at 9:23 AM Wido den Hollander wrote:

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread Yan, Zheng
On Mon, Aug 26, 2019 at 6:57 PM thoralf schulze wrote: > > hi Zheng, > > On 8/21/19 4:32 AM, Yan, Zheng wrote: > > Please enable debug mds (debug_mds=10), and try reproducing it again. > > please find the logs at > https://www.user.tu-berlin.de/thoralf.schulze/ceph-debug.tar.xz . > > we managed to

Re: [ceph-users] [question] one-way RBD mirroring doesn't work

2019-08-26 Thread Jason Dillaman
On Mon, Aug 26, 2019 at 7:54 AM V A Prabha wrote: > > Dear Jason, > I shall explain my setup first. > The DR centre is 300 km from the primary site. > Site-A - OSD 0 - 1 TB, Mon - 10.236.248.XX/24 > Site-B - OSD 0 - 1 TB, Mon - 10.236.228.XX/27 - rbd-mirror daemon > running > All por

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread thoralf schulze
hi Zheng - On 8/26/19 2:55 PM, Yan, Zheng wrote: > I tracked down the bug > https://tracker.ceph.com/issues/41434 wow, that was quick - thank you for investigating. we are looking forward to the fix :-) in the meantime, is there anything we can do to prevent q == p->second.end() from happening?

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread Yan, Zheng
On Mon, Aug 26, 2019 at 9:25 PM thoralf schulze wrote: > > hi Zheng - > > On 8/26/19 2:55 PM, Yan, Zheng wrote: > > I tracked down the bug > > https://tracker.ceph.com/issues/41434 > > wow, that was quick - thank you for investigating. we are looking > forward for the fix :-) > > in the meantime,

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread Mark Nelson
On 8/26/19 7:39 AM, Wido den Hollander wrote: On 8/26/19 1:35 PM, Simon Oosthoek wrote: On 26-08-19 13:25, Simon Oosthoek wrote: On 26-08-19 13:11, Wido den Hollander wrote: The reweight might actually cause even more confusion for the balancer. The balancer uses upmap mode and that re-allo

[ceph-users] No files in snapshot

2019-08-26 Thread Thomas Schneider
Hi, I'm running Debian 10 with btrfs-progs=5.2.1. Creating snapshots with snapper=0.8.2 works w/o errors. However, I ran into an issue and need to restore various files. I thought that I could simply take the files from a snapshot created before. However, the files required don't exist in any
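For completeness, the usual way to inspect and restore a file from a snapper snapshot (config name, snapshot number and path are placeholders):

  snapper -c root list                             # list snapshots and their numbers
  snapper -c root status 42..0                     # files that differ between snapshot 42 and the live system
  ls /.snapshots/42/snapshot/path/to/file          # snapshots are plain subvolumes and can be browsed directly
  snapper -c root undochange 42..0 /path/to/file   # copy the file back from snapshot 42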

Re: [ceph-users] No files in snapshot

2019-08-26 Thread Wido den Hollander
On 8/26/19 6:46 PM, Thomas Schneider wrote: > Hi, > > I'm running Debian 10 with btrfs-progs=5.2.1. > > Creating snapshots with snapper=0.8.2 works w/o errors. > > However, I ran into an issue and need to restore various files. > > I thought that I could simply take the files from a snapshot

Re: [ceph-users] radosgw pegging down 5 CPU cores when no data is being transferred

2019-08-26 Thread Vladimir Brik
I created a ticket: https://tracker.ceph.com/issues/41511 Note that I think I was mistaken when I said that sometimes the problem goes away on its own. I've looked back through our monitoring and it looks like when the problem did go away, it was because either the machine was rebooted or the
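A hedged way to see what the busy radosgw threads are actually doing while the problem occurs (the admin-socket path depends on the rgw instance name and is an assumption here):

  top -H -p "$(pgrep -n radosgw)"                                        # per-thread CPU usage of the radosgw process
  ceph daemon /var/run/ceph/ceph-client.rgw.<instance>.asok perf dump    # internal perf counters for that instance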

[ceph-users] Ceph PVE cluster help

2019-08-26 Thread Daniel K
I set up a Ceph cluster for use with PVE for some friends a few years ago. It wasn't maintained and is now in bad shape. They've reached out to me for help, but I do not have the time to assist right now. Is there anyone on the list who would be willing to help? As a professional service of cou