On Fri, Oct 18, 2019 at 9:10 AM Gustavo Tonini wrote:
>
> Hi Zheng,
> the cluster is running ceph mimic. This warning about network only appears
> when using nautilus' cephfs-journal-tool.
>
> "cephfs-data-scan scan_links" does not report any issue.
>
> How could variable "newparent" be NULL at
Hi all,
Has anyone succeeded in using the collectd ceph plugin to collect Ceph
cluster data?
I'm using collectd 5.8.1 and Ceph 15.0.0. collectd fails to get the
cluster data with the error below:
"collectd.service holdoff time over, scheduling restart"
Regards,
Changcheng
On Sun, Oct 20, 2019 at 1:53 PM Stefan Kooman wrote:
>
> Dear list,
>
> Quoting Stefan Kooman (ste...@bit.nl):
>
> > I wonder if this situation is more likely to be hit on Mimic 13.2.6 than
> > on any other system.
> >
> > Any hints / help to prevent this from happening?
>
> We have had this happe
I am, collectd with luminous, and upgraded to nautilus and collectd
5.8.1-1.el7 this weekend. Maybe increase logging or so.
I had to wait a long time before collectd was supporting the luminous
release, maybe it is the same with octopus (=15?)
-Original Message-
From: Liu, Changch
Quoting Yan, Zheng (uker...@gmail.com):
> delete 'mdsX_openfiles.0' object from cephfs metadata pool. (X is rank
> of the crashed mds)
Just to make sure I understand correctly. Current status is that the MDS
is active (no standby for now) and not in a "crashed" state (although it
has been crashin
On 09:50 Mon 21 Oct, Marc Roos wrote:
>
> I am, collectd with luminous, and upgraded to nautilus and collectd
> 5.8.1-1.el7 this weekend. Maybe increase logging or so.
> I had to wait a long time before collectd was supporting the luminous
> release, maybe it is the same with octopus (=15?)
>
I have the same. I do not think ConvertSpecialMetricTypes is necessary.
Globals true
LongRunAvgLatency false
ConvertSpecialMetricTypes true
SocketPath "/var/run/ceph/ceph-osd.1.asok"
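(The list archive seems to strip the angle-bracket tags; a hedged sketch
of what the full plugin block presumably looks like, with only the values
above taken from the actual config:)

<LoadPlugin ceph>
  Globals true
</LoadPlugin>

<Plugin ceph>
  LongRunAvgLatency false
  ConvertSpecialMetricTypes true
  <Daemon "osd.1">
    SocketPath "/var/run/ceph/ceph-osd.1.asok"
  </Daemon>
</Plugin>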
-Original Message-
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] collectd
On 10:16 Mon 21 Oct, Marc Roos wrote:
> I have the same. I do not think ConvertSpecialMetricTypes is necessary.
>
>
> Globals true
>
>
>
> LongRunAvgLatency false
> ConvertSpecialMetricTypes true
>
> SocketPath "/var/run/ceph/ceph-osd.1.asok"
>
>
Same configuration, but there
Quoting Yan, Zheng (uker...@gmail.com):
> delete 'mdsX_openfiles.0' object from cephfs metadata pool. (X is rank
> of the crashed mds)
OK, MDS crashed again, restarted. I stopped it, deleted the object and
restarted the MDS. It became active right away.
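For the record, a hedged sketch of the steps (the metadata pool name is
just the common default, and rank 0 is assumed; adjust for your cluster):

systemctl stop ceph-mds@<name>                 # stop the rank-0 MDS
rados -p cephfs_metadata rm mds0_openfiles.0   # delete its openfiles object
systemctl start ceph-mds@<name>                # restart and wait for active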
Any idea on why the openfiles list (object
Does your collectd start OK without the ceph plugin?
I also had your error "didn't register a configuration callback",
because I configured debug logging but did not enable it by loading the
'logfile' plugin. Maybe it is the order in which your configuration
files are read (I think this used to
Are there any instructions for installing the plugin configuration?
Attached is my RHEL collectd configuration file from the /etc/ directory.
On RHEL:
[rdma@rdmarhel0 collectd.d]$ pwd
/etc/collectd.d
[rdma@rdmarhel0 collectd.d]$ tree .
.
0 directories, 0 files
[rdma@rdmarhel0 collectd.d
The 'xx-.conf' files are mine, custom, so I would not have to merge
changes with newer /etc/collectd.conf rpm updates.
I would suggest getting a small configuration that is working, setting
debug logging [0], and growing the configuration in small steps until it
fails. Load the ceph plugin empty, confi
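For example, a minimal sketch of the debug logging part (the file path is
just an example):

LoadPlugin logfile
<Plugin logfile>
  LogLevel debug
  File "/var/log/collectd.log"
  Timestamp true
</Plugin>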
I've made a ticket for this issue: https://tracker.ceph.com/issues/42338
Thanks again!
K
On 15/10/2019 18:00, Kenneth Waegeman wrote:
Hi Robert, all,
On 23/09/2019 17:37, Robert LeBlanc wrote:
On Mon, Sep 23, 2019 at 4:14 AM Kenneth Waegeman
wrote:
Hi all,
When syncing data with rsync,
I think I am having this issue also (at least I had it with Luminous). I had
to remove the hidden temp files rsync had left behind when the cephfs mount
'stalled'; otherwise I would never be able to complete the rsync.
-Original Message-
Cc: ceph-users
Subject: Re: [ceph-users] hanging slow req
On Sat, Oct 19, 2019 at 2:00 PM Lei Liu wrote:
>
> Hello Ilya,
>
> After updated client kernel version to 3.10.0-862 , ceph features shows:
>
> "client": {
> "group": {
> "features": "0x7010fb86aa42ada",
> "release": "jewel",
> "num": 5
> },
>
On Mon, Oct 21, 2019 at 4:33 PM Stefan Kooman wrote:
>
> Quoting Yan, Zheng (uker...@gmail.com):
>
> > delete 'mdsX_openfiles.0' object from cephfs metadata pool. (X is rank
> > of the crashed mds)
>
> OK, MDS crashed again, restarted. I stopped it, deleted the object and
> restarted the MDS. It b
Quoting Yan, Zheng (uker...@gmail.com):
> I double checked the code, but didn't find any clue. Can you compile
> mds with a debug patch?
Sure, I'll try to do my best to get a properly packaged Ceph Mimic
13.2.6 with the debug patch in it (and/or get help to get it built).
Do you already have th
Hello,
I use ceph 12.2.12 and would like to activate the ceph balancer.
Unfortunately, no redistribution of the PGs is started:
ceph balancer status
{
"active": true,
"plans": [],
"mode": "crush-compat"
}
ceph balancer eval
current cluster score 0.023776 (lower is better)
ceph conf
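(For context, a hedged sketch of how the balancer can be driven manually
to see whether it proposes any movement; the plan name 'myplan' is
illustrative, not from the original post:)

ceph balancer optimize myplan
ceph balancer show myplan      # inspect the proposed changes
ceph balancer eval myplan      # score with the plan applied, lower is better
ceph balancer execute myplan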
Hello Ilya and Paul,
Thanks for your reply. Yes, you are right: 0x7fddff8ee8cbffb comes from the
kernel upgrade; it is reported by a docker container
(digitalocean/ceph_exporter) used for ceph monitoring.
Now upmap mode is enabled, client features:
"client": {
"group": {
"featur
On Mon, Oct 21, 2019 at 7:58 PM Stefan Kooman wrote:
>
> Quoting Yan, Zheng (uker...@gmail.com):
>
> > I double checked the code, but didn't find any clue. Can you compile
> > mds with a debug patch?
>
> Sure, I'll try to do my best to get a properly packaged Ceph Mimic
> 13.2.6 with the debug pat
Hello,
This Wednesday we'll have a ceph science user group call. This is an informal
conversation focused on using ceph in htc/hpc and scientific research
environments.
Call details copied from the event:
Wednesday October 23rd
14:00 UTC
4:00PM Central European
10:00AM Eastern American
Main p
Hello
/var/log/messages on machines in our ceph cluster are inundated with
entries from Prometheus scraping ("GET /metrics HTTP/1.1" 200 - ""
"Prometheus/2.11.1")
Is it possible to configure ceph to not send those to syslog? If not,
can I configure something so that none of ceph-mgr messages
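One thing I might try (not sure it is the right knob) is the mgr daemon's
own logging options, e.g.:

ceph config set mgr log_to_syslog false    # keep ceph-mgr logs out of syslog
ceph config set mgr debug_mgr 0/5          # or lower the mgr log level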
Just to clarify my situation: we have 2 datacenters with 3 hosts each, and 12 4TB
disks per host (2 in a RAID with the OS installed and the remaining 10 used for
Ceph). Right now I'm trying a single-DC installation and intend to migrate to a
multi-site setup mirroring DC1 to DC2, so if we lose DC1 we can
We have a new ceph Nautilus setup (Nautilus from scratch - not upgraded):
# ceph versions
{
"mon": {
"ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba)
nautilus (stable)": 3
},
"mgr": {
"ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba)
nau
Is there a possibility to lose data if I use "cephfs-data-scan init
--force-init"?
On Mon, Oct 21, 2019 at 4:36 AM Yan, Zheng wrote:
> On Fri, Oct 18, 2019 at 9:10 AM Gustavo Tonini
> wrote:
> >
> > Hi Zheng,
> > the cluster is running ceph mimic. This warning about network only
> appears when
Apparently the graph is too big, so my last post is stuck. Resending
without the graph.
Thanks
-- Forwarded message -
From: Void Star Nill
Date: Mon, Oct 21, 2019 at 4:41 PM
Subject: large concurrent rbd operations block for over 15 mins!
To: ceph-users
Hello,
I have been ru
We recently needed to reweight a couple of OSDs on one of our clusters
(luminous on Ubuntu, 8 hosts, 8 OSD/host). I (think) we reweighted by
approx 0.2. This was perhaps too much, as IO latency on RBD drives
spiked to several seconds at times.
We'd like to lessen this effect as much as we can
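(A hedged sketch of the throttles commonly used to soften backfill impact;
the OSD id and values below are illustrative only:)

ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1'
ceph osd reweight 12 0.95    # and reweight in smaller steps, e.g. 0.05 at a time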
hi,
I have a problem with a cluster being stuck in recovery after an OSD
failure. At first recovery was doing quite well, but now it just sits
there without any progress. It currently looks like this:
health HEALTH_ERR
36 pgs are stuck inactive for more than 300 seconds
5
Hello cephers,
So I am having trouble with a new hardware system showing strange OSD behavior,
and I want to replace a disk with a brand new one to test the theory.
I run all daemons in containers and on one of the nodes I have mon, mgr, and 6
osds. So following
https://docs.ceph.com/docs/maste
Hi,
can you share `ceph osd tree`? What crush rules are in use in your
cluster? I assume that the two failed OSDs prevent the remapping
because the rules can't be applied.
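For reference, the commands that would show this (nothing cluster-specific
assumed):

ceph osd tree
ceph osd crush rule ls
ceph osd crush rule dump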
Regards,
Eugen
Zitat von Philipp Schwaha :
hi,
I have a problem with a cluster being stuck in recovery after osd