Ceph 17.2.3 (dockerized in Ubuntu 20.04)
The subject says it. The MDS process always crashes after evicting. ceph -w
shows:
2022-09-22T13:26:23.305527+0200 mds.ksz-cephfs2.ceph00.kqjdwe [INF]
Evicting (and blocklisting) client session 5181680 (
10.149.12.21:0/3369570791)
2022-09-22T13:26:35.72931
- What operation was being carried out which led to client eviction?
- Can you share MDS side logs when that event was being carried out?
On Thu, Sep 22, 2022 at 5:12 PM E Taka <0eta...@gmail.com> wrote:
> Ceph 17.2.3 (dockerized in Ubuntu 20.04)
>
> The subject says it. The MDS process always cr
Hello all,
taking advantage of the redundancy of my EC pool, I destroyed a
couple of servers in order to reinstall them with a new operating system.
I am on Nautilus (but will evolve soon to Pacific), and today I am
not in "emergency mode": this is just for my education. :-)
"ceph pg d
Hi Fulvio,
https://docs.ceph.com/en/quincy/dev/osd_internals/backfill_reservation/
describes the prioritization and reservation mechanism used for
recovery and backfill. AIUI, unless a PG is below min_size, all
backfills for a given pool will be at the same priority.
force-recovery will modify the
Hi,
On 9/21/22 18:00, Gauvain Pocentek wrote:
Hello all,
We are running several Ceph clusters and are facing an issue on one of
them, we would appreciate some input on the problems we're seeing.
We run Ceph in containers on CentOS Stream 8, and we deploy using
ceph-ansible. While upgrading cep
Greetings,
We are trying to use the telegraf module to send metrics to InfluxDB and we
keep facing the below error. Any help will be appreciated, thank you.
# ceph telegraf config-show
Error EIO: Module 'telegraf' has experienced an error and cannot handle
commands: invalid literal for int() wi
Hello,
If I had to guess, the ':' indicates a port number like :443, so it's expecting an
int and you are passing a string. Try changing https to 443.
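
A minimal sketch of the class of error being guessed at here, assuming nothing about the telegraf module's actual code: Python's int() raises exactly this message when handed a non-numeric string such as a URL scheme. The helper split_address is hypothetical and only shows one way to separate scheme, host and port with the standard library.

```python
from urllib.parse import urlsplit


def split_address(address: str):
    """Hypothetical helper: split an address into (scheme, host, port).

    urlsplit only recognises 'host:port' when the network location is
    introduced by '//', so a bare 'host:port' gets that prefix added.
    """
    parts = urlsplit(address if "//" in address else "//" + address)
    return parts.scheme, parts.hostname, parts.port


# Passing a scheme where a port number is expected reproduces the
# wording of the reported error.
try:
    int("https")
except ValueError as exc:
    message = str(exc)

print(message)
print(split_address("https://test.xyz.com:443"))
print(split_address("test.xyz.com:443"))
```

This only illustrates how the message can arise; which stored setting the module is actually converting is not shown in the thread.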
On Thu, Sep 22, 2022 at 8:24 PM Nikhil Mitra (nikmitra)
wrote:
> Greetings,
>
> We are trying to use the telegraf module to send metrics to InfluxDB an
Tried that but still fails.
# ceph telegraf config-set address test.xyz.com:443
Error EIO: Module 'telegraf' has experienced an error and cannot handle
commands: invalid literal for int() with base 10: 'https
--
Regards,
Nikhil Mitra
From: Curt
Date: Thursday, September 22, 2022 at 12:34 PM
To
On Thu, Sep 22, 2022 at 12:55 PM Yuri Weinstein wrote:
>
> We are publishing a release candidate this time for users to try
> for testing only.
>
> Please note this RC had only limited testing. Full testing is being done now.
It might be worth sharing that the Gibba cluster has been upgraded to
On 9/22/22 18:23, Nikhil Mitra (nikmitra) wrote:
Greetings,
We are trying to use the telegraf module to send metrics to InfluxDB and we
keep facing the below error. Any help will be appreciated, thank you.
# ceph telegraf config-show
Error EIO: Module 'telegraf' has experienced an error and ca
Hi,
We've been running into a mysterious issue on Ceph 16.2.7. Every few
weeks or so (can be from 2 weeks to a month and a half), we get
input/output errors on a random OSD. Here are the logs:
2022-09-22T15:54:11.600Z syslog debug -6>
2022-09-22T15:41:05.678+ 7fec2ebaa080 -1 bde
On 9/22/22 19:55, J-P Methot wrote:
Hi,
We've been running into a mysterious issue on Ceph 16.2.7. Every few
weeks or so (can be from 2 weeks to a month and a half), we get
input/output errors on a random OSD. Here are the logs:
2022-09-22T15:54:11.600Z syslog debug -6>
2022-09-22T
Hoping someone can point me to possible tunables that could hopefully better
tighten my OSD distribution.
Cluster is currently
> "ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus
> (stable)": 307
With plans to begin moving to pacific before end of year, with a possible
in
This was a short meeting, and in summary:
* Testing of upgrades for 17.2.4 in Gibba commenced and slowness during
upgrade has been investigated.
* Workaround available; not a release blocker
___
ceph-users mailing list -- ceph-users@ceph.io
To unsub
Hi,
We have a system running Ceph Pacific with a large number of delete
requests (several hundred thousand files per day) and I'm investigating
how I can increase the gc speed to keep up with our deletes (right now
there are 44 million objects in the gc list).
I changed max_concurr
Hi Reed,
Just taking a quick glance at the Pastebin provided I have to say your cluster
balance is already pretty damn good all things considered.
We've seen the upmap balancer at its best in practice provides a deviation of
about 10-20% across OSDs which seems to be matching up on yo
hi,
I am using the kernel client to mount a CephFS filesystem on CentOS 8.2.
But my ceph's kernel module does not contain fscache.
[root@host ~]# uname -r
5.4.163-1.el8.elrepo.x86_64
[root@host ~]# lsmod|grep ceph
ceph 446464 0
libceph 368640 1 ceph
dns_resolver 16384 1 libceph
libcrc32c 16384 2 xfs,lib
On 9/22/22 21:48, Reed Dier wrote:
Any tips or help would be greatly appreciated.
Try JJ's Ceph balancer [1]. In our case it turned out to be *way* more
efficient than the built-in balancer (faster convergence, fewer movements
involved). And able to achieve a very good PG distribution and "reclai
Hi,
I can't speak from the developers' perspective, but we discussed this
just recently internally and with a customer. We doubled the number of
PGs on one of our customer's data pools from around 100 to 200 PGs/OSD
(HDDs with rocksDB on SSDs). We're still waiting for the final
conclusion i
+1 for increasing PG numbers, those are quite low.
Quoting Bailey Allison:
Hi Reed,
Just taking a quick glance at the Pastebin provided I have to say
your cluster balance is already pretty damn good all things
considered.
We've seen the upmap balancer at its best in practice provides
Good to know, thank you. So in that case, during recovery, is it worth increasing
those values?
Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---