I am not absolutely sure, but you should be able to do something like
ceph config mon set
Or try to restart the mon/osd daemon
Hth
On 29 April 2020 16:42:31 MESZ, "Gencer W. Genç" wrote:
>Hi,
>
>I just deployed a new cluster with cephadm instead of ceph-deploy. In
>the past, if I change ceph
Hello Everyone,
The new cephadm is giving me a headache.
I'm setting up a new test environment where I have to use LVM partitions,
because I don't have any more hardware.
I couldn't find any information about the compatibility of existing LVM
partitions and cephadm/octopus.
I tried the old metho
Hello Igor,
On 30.04.20 at 15:52, Igor Fedotov wrote:
> 1) reset perf counters for the specific OSD
>
> 2) run bench
>
> 3) dump perf counters.
This is OSD 0:
# ceph tell osd.0 bench -f plain 12288000 4096
bench: wrote 12 MiB in blocks of 4 KiB in 6.70482 sec at 1.7 MiB/sec 447
IOPS
https://
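For reference, the three steps above can be run roughly like this on the host of osd.0 (a sketch; the admin socket is assumed to be at its default path, and the bench arguments are the same as in the output above):
# 1) reset all perf counters for osd.0 via its admin socket
ceph daemon osd.0 perf reset all
# 2) run the bench (12 MB in 4 KiB blocks, as above)
ceph tell osd.0 bench -f plain 12288000 4096
# 3) dump the counters and pick out the latencies of interest
ceph daemon osd.0 perf dump | grep -A 4 '_lat'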
Sorry, I misclicked; here is the second part:
ceph-volume --cluster ceph lvm prepare --data /dev/centos_node1/ceph
But that gives me just:
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd
--keyring /var/lib/ceph/boots
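In case it helps, the documented bare-metal flow for an existing logical volume is roughly the following (a sketch; the vg/lv name is just the one from the command above, and I have not verified how cephadm adopts such an OSD):
# prepare the existing LV as a BlueStore OSD (vg/lv notation)
ceph-volume lvm prepare --bluestore --data centos_node1/ceph
# then activate it (or use 'ceph-volume lvm create' to do both steps in one)
ceph-volume lvm activate --all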
Hi Dave,
Probably not a complete list, but I know two interesting ways to get the configuration
of a Bluestore OSD:
1/ the show-label option of the ceph-bluestore-tool command
Ex:
$ sudo ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0/
2/ the config show and perf dump parameters of the
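To spell that second option out (assuming osd.0 and that you are on the OSD's host so its admin socket is reachable; just a sketch):
# runtime configuration as the daemon currently sees it
ceph daemon osd.0 config show
# internal performance counters, including BlueStore/BlueFS statistics
ceph daemon osd.0 perf dump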
Hi Casey,
Hi all,
Casey, thanks a lot for your reply! That was really helpful.
A question, please: do these tests reflect a realistic workload? Basically I
am profiling (CPU profiling) the computations in these tests, and naturally
I am interested in a big workload. I have started with CRUSH and her
Hi Eric,
In which Nautilus version is your tool expected to be included? Maybe the next release?
Best Regards
Manuel
-Original Message-
From: Katarzyna Myrek
Sent: Monday, 20 April 2020 12:19
To: Eric Ivancich
CC: EDH - Manuel Rios ; ceph-users@ceph.io
Subject: Re: [ceph-users] RGW and
Hi Stefan,
so (surprise!) some DB access counters show a significant difference, e.g.
"kv_flush_lat": {
"avgcount": 1423,
"sum": 0.000906419,
"avgtime": 0.00636
},
"kv_sync_lat": {
"avgcount": 1423,
"sum": 0.
ceph@elchaka.de wrote:
> I am not absolutely sure, but you should be able to do something like
>
> ceph config mon set
Yes, please use `ceph config ...`. cephadm only uses a minimal ceph.conf, which
only contains the IPs of the other MONs.
>
> Or try to restart the mon/osd daemon
>
> Hth
>
> Am
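A minimal example of the `ceph config ...` workflow mentioned above (the option and value are only illustrative):
ceph config dump                                   # everything stored in the mon config database
ceph config set osd osd_memory_target 4294967296   # set an option for all OSDs
ceph config get osd.0 osd_memory_target            # what a specific daemon resolves it to
ceph config rm osd osd_memory_target               # remove the override again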
Hi Frank,
Could you share any ceph-osd logs and also the ceph.log from a mon to
see why the cluster thinks all those osds are down?
Simply marking them up isn't going to help, I'm afraid.
Cheers, Dan
On Tue, May 5, 2020 at 4:12 PM Frank Schilder wrote:
>
> Hi all,
>
> a lot of OSDs crashed in
On 20/05/05 08:46, Simon Sutter wrote:
> Sorry I missclicked, here the second part:
>
>
> ceph-volume --cluster ceph lvm prepare --data /dev/centos_node1/ceph
> But that gives me just:
>
> Running command: /usr/bin/ceph-authtool --gen-print-key
> Running command: /usr/bin/ceph --cluster ceph --n
Hi Dave,
wouldn't this help (particularly "Viewing runtime settings" section):
https://docs.ceph.com/docs/nautilus/rados/configuration/ceph-conf/
Thanks,
Igor
On 5/5/2020 2:52 AM, Dave Hall wrote:
Hello,
Sorry if this has been asked before...
A few months ago I deployed a small Nautilus c
Hi Frank,
On Tue, May 5, 2020 at 10:43 AM Frank Schilder wrote:
> Dear Dan,
>
> thank you for your fast response. Please find the log of the first OSD
> that went down and the ceph.log with these links:
>
> https://files.dtu.dk/u/tF1zv5zdc6mmXXO_/ceph.log?l
> https://files.dtu.dk/u/hPb5qax2-b6
Check network connectivity on all configured networks between all hosts;
OSDs running but being marked as down is usually a network problem.
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Te
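A couple of quick checks that tend to catch this (hostnames and MTU size are examples; adjust to your public and cluster networks):
ping -c 3 ceph-02                 # basic reachability on the public network
ping -c 3 ceph-02-cluster         # and on the cluster network, if you have one
ping -M do -s 8972 -c 3 ceph-02   # don't-fragment ping to catch MTU/jumbo-frame mismatches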
Hi,
The osds are getting marked down due to this:
2020-05-05 15:18:42.893964 mon.ceph-01 mon.0 192.168.32.65:6789/0
292689 : cluster [INF] osd.40 marked down after no beacon for
903.781033 seconds
2020-05-05 15:18:42.894009 mon.ceph-01 mon.0 192.168.32.65:6789/0
292690 : cluster [INF] osd.60 mark
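To see which OSDs tripped that timeout and what the timeout currently is (assuming the default cluster log path on a mon host; the default value is 900 seconds, which matches the ~903 s above):
grep 'no beacon' /var/log/ceph/ceph.log        # OSDs marked down for missing beacons
ceph config get mon mon_osd_report_timeout     # current report/beacon timeout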
On Tue, May 5, 2020 at 11:27 AM Frank Schilder wrote:
> I tried that and get:
>
> 2020-05-05 17:23:17.008 7fbbe700 0 -- 192.168.32.64:0/2061991714 >>
> 192.168.32.68:6826/5216 conn(0x7fbbf01d6f80 :-1
> s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0
> l=1).handle_connect_reply connect
OK, those requires look correct.
While the PGs are inactive there will be no client I/O, so there's
nothing to pause at this point. In general, I would evict those
misbehaving clients with ceph tell mds.* client evict id=
For now, keep nodown and noout, let all the PGs get active again. You
might n
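For example (the session id here is made up; list the sessions first to find the real ones):
ceph tell mds.* client ls              # list client sessions and their ids
ceph tell mds.* client evict id=4305   # evict the misbehaving client by id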
ceph osd tree down # shows the down osds
ceph osd tree out # shows the out osds
There is no "active/inactive" state on an OSD.
You can force an individual osd to do a soft restart with "ceph osd
down " -- this will cause it to restart and recontact mons and
osd peers. If that doesn't work, restar
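Putting that together (osd.40 is just an example id; the systemctl line assumes a non-containerized deployment):
ceph osd tree down              # which OSDs the cluster considers down
ceph osd tree out               # which OSDs are out
ceph osd down 40                # soft restart: osd.40 re-contacts the mons and its peers
systemctl restart ceph-osd@40   # harder restart on the OSD host, if the above is not enough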
Ditto, I had a bad optic on a 48x10 switch. The only way I detected it was my
Prometheus TCP retransmit failure count. Looking back over the previous 4 weeks, I
could see it increment in small bursts, but Ceph was able to handle it, and
then it went crazy and a bunch of OSDs just dropped out.
Hi all,
The Ceph documentation mentions two types of tests: *unit tests* (also
called make check tests) and *integration tests*. Strictly speaking, the *make
check tests* are not "unit tests", but rather tests that can be run easily
on a single build machine after compiling Ceph from source.
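For context, the make check tests can be run from a source checkout roughly like this (a sketch; the script names are the ones in the Ceph repository, and run-make-check.sh wraps the whole sequence):
./install-deps.sh             # install build dependencies
./do_cmake.sh                 # configure a build directory
cd build && make -j$(nproc)   # compile
make check                    # run the "make check" tests
# or, as a single step from the repository root:
# ./run-make-check.sh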
Hi all,
a lot of OSDs crashed in our cluster. Mimic 13.2.8. Current status included
below. All daemons are running, no OSD process crashed. Can I start marking
OSDs in and up to get them back talking to each other?
Please advise on next steps. Thanks!!
[root@gnosis ~]# ceph status
cluster:
Dear Dan,
thank you for your fast response. Please find the log of the first OSD that
went down and the ceph.log with these links:
https://files.dtu.dk/u/tF1zv5zdc6mmXXO_/ceph.log?l
https://files.dtu.dk/u/hPb5qax2-b6W9vmp/ceph-osd.2.log?l
I can collect more osd logs if this helps.
Best regards
Hi,
We’ve recently installed a new Ceph cluster running Octopus 15.2.1, and we’re
using RGW with an erasure-coded backing pool.
I started to get a suspicion that deleted objects were not getting cleaned up
properly, and I wanted to verify this by checking the garbage collector.
That’s when I di
The situation is improving very slowly. I set nodown, noout, and norebalance since all
daemons are running and nothing actually crashed. Current status:
[root@gnosis ~]# ceph status
cluster:
id:
health: HEALTH_WARN
2 MDSs report slow metadata IOs
1 MDSs report slow requests
I tried that and get:
2020-05-05 17:23:17.008 7fbbe700 0 -- 192.168.32.64:0/2061991714 >>
192.168.32.68:6826/5216 conn(0x7fbbf01d6f80 :-1
s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=1).handle_connect_reply
connect got BADAUTHORIZER
Strange.
=
Frank Schilder
AIT
Thanks! Here it is:
[root@gnosis ~]# ceph osd dump | grep require
require_min_compat_client jewel
require_osd_release mimic
It looks like we had an extremely aggressive job running on our cluster,
completely flooding everything with small I/O. I think the cluster built up a
huge backlog and is/
It's not the time:
[root@gnosis ~]# pdsh -w ceph-[01-20] date
ceph-01: Tue May 5 17:34:52 CEST 2020
ceph-03: Tue May 5 17:34:52 CEST 2020
ceph-02: Tue May 5 17:34:52 CEST 2020
ceph-04: Tue May 5 17:34:52 CEST 2020
ceph-07: Tue May 5 17:34:52 CEST 2020
ceph-14: Tue May 5 17:34:52 CEST 2020
cep
Hi Dan,
looking at an older thread, I found that "OSDs do not send beacons if they are
not active". Is there any way to activate an OSD manually? Or check which ones
are inactive?
Also, I looked at this here:
[root@gnosis ~]# ceph mon feature ls
all features
supported: [kraken,luminous
Dear all,
the command
ceph config set mon.ceph-01 mon_osd_report_timeout 3600
saved the day. Within a few seconds, the cluster became:
==
[root@gnosis ~]# ceph status
cluster:
id:
health: HEALTH_WARN
2 slow ops, oldest one blocked for 10884
But what does mon_osd_report_timeout do, such that it resolved your issues? Is
this related to the suggested NTP / time sync? From the name I assume
that now your monitor just waits longer before it reports the OSD as
'unreachable'(?), so your OSD has more time to 'announce' itself.
And I am a little
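In other words, it gives the OSDs more time to report in before the mon marks them down. A possible way to apply and later undo it (the default is 900 seconds):
ceph config set mon mon_osd_report_timeout 3600   # temporarily allow up to an hour between beacons
ceph config rm mon mon_osd_report_timeout         # revert to the default once the cluster has recovered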
Dear Cephalopodians,
seeing the recent moves of major HDD vendors to sell SMR disks targeted for use
in consumer NAS devices (including RAID systems),
I got curious and wonder what the current status of SMR support in Bluestore
is.
Of course, I'd expect disk vendors to give us host-managed SMR
I've been using CentOS 7 and the 5.6.10-1.el7.elrepo.x86_64 Linux kernel.
After today's update and reboot, the OSDs won't start.
# podman run --privileged --pid=host --cpuset-cpus 0,1 --memory 2g --name
ceph_osd0 --hostname ceph_osd0 --ip 172.30.0.10 -v /dev:/dev -v
/etc/localtime:/etc/localtime:ro -v /etc
Hi James,
Does radosgw-admin gc list --include-all give the same error? If yes, can
you please open a tracker issue and share the rgw and osd logs?
Thanks,
Pritha
On Wed, May 6, 2020 at 12:22 AM James, GleSYS
wrote:
> Hi,
>
> We’ve recently installed a new Ceph cluster running Octopus 15.2.1, and
Is there a way to get the block, block.db, and block.wal paths and sizes?
What if all or some of them are colocated on one disk?
I can get the info from an OSD with colocated wal, db, and block like below:
ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0/
{
"/var/lib/ceph/osd/ceph-0//block"
Hi,
Yes, it’s the same error with “—include-all”. I am currently awaiting
confirmation of my account creation on the tracker site.
In the meantime, here are some logs which I’ve obtained:
radosgw-admin gc list --debug-rgw=10 --debug-ms=10:
2020-05-06T06:06:33.922+ 7ff4ccffb700 1 -- [2a00:X
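For completeness, the two gc commands being exercised here are (both flags are documented radosgw-admin options):
radosgw-admin gc list --include-all      # list all gc entries, not only the expired ones
radosgw-admin gc process --include-all   # process gc entries immediately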