Hi again,
something is very wrong with my hardware it seems and i'm slowly turning
insane.
I'm trying to debug why ceph has incredibly poor performance for us.
we've got
- 3 EPYC 7713 dual-cpu systems
- datacenter nvme drives (3GB/s top)
- 100G infiniband
ceph does 800MB/s read max,
CPU is i
Yes, during my last adventure of trying to get any reasonable
performance out of ceph, i realized my testing methodology was wrong.
Both the kernel client and qemu have queues everywhere that make the
numbers hard to understand.
fio has rbd support, which gives more useful values.
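For reference, a minimal sketch of what such a run could look like, going through librbd directly instead of krbd or qemu. The pool and image names, the `admin` client and the block size / queue depth are all assumptions for illustration, not the values used in the original test, and the image has to exist beforehand (e.g. `rbd create --size 10G rbd/fio-test`).

```shell
# Sketch only: random-read benchmark against an RBD image via fio's rbd engine.
# Pool and image names are hypothetical; the image must already exist.
rbd_fio_randread() {
    pool="$1"     # e.g. rbd
    image="$2"    # e.g. fio-test (hypothetical name)
    fio --name=rbd-randread \
        --ioengine=rbd \
        --clientname=admin \
        --pool="$pool" \
        --rbdname="$image" \
        --rw=randread \
        --bs=4k \
        --iodepth=32 \
        --runtime=60 \
        --time_based
}

# usage: rbd_fio_randread rbd fio-test
```

Raising `--bs` (e.g. to 4M) and switching `--rw` to `read` would measure throughput rather than small-block IOPS.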
https://subscri
on rocky, which should be identical to alma (?), i had to do this:
https://almalinux.discourse.group/t/nothing-provides-python3-pecan-in-almalinux-9/2017/4
because the rpm has a broken dependency on pecan.
But switching from debian to the official ceph rpm packages was worth
it. The systemd unit
since quincy i'm randomly getting authentication issues from clients to osds.
symptom is qemu hangs, but when it happens, i can reproduce it using:
> ceph tell osd.\* version
some - but only some - osds will never respond, but only to clients
on _some_ hosts.
the client gets stuck in a loop w
Hi,
Doing some lab tests to understand why ceph isn't working for us,
and here's the first puzzle:
setup: A completely fresh quincy cluster, 64 core EPYC 7713, 2 nvme drives
> ceph osd crush rule create-replicated osd default osd ssd
> ceph osd pool create rbd replicated osd --size 2
> dd if=/d
Heya,
ever since we had that one osd causing the entire cluster to hang
(it's been removed since),
we keep having hard-to-debug issues.
for example sometimes on start, qemu just hangs forever.
when i kill it manually, the next start works fine.
when i map the same volume using krbd on another hos
Hi,
today our entire cluster froze, or rather anything that uses librbd, to be specific.
ceph version 16.2.10
The message that saved me was "256 slow ops, oldest one blocked for
2893 sec, osd.7 has slow ops", because it makes it immediately clear
that this osd is the issue.
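A small sketch of pulling the offending osd out of such a message, assuming the health text uses the single-osd wording quoted above ("osd.N has slow ops"); other wordings, such as a list of several daemons, are not handled here. The function name is made up for illustration.

```shell
# Sketch only: print every osd named in an "osd.N has slow ops" phrase,
# one per line, reading health text from stdin.
slow_osds() {
    grep -oE 'osd\.[0-9]+ has slow ops' | awk '{print $1}' | sort -u
}

# Example with the message quoted above:
printf '%s\n' '256 slow ops, oldest one blocked for 2893 sec, osd.7 has slow ops' | slow_osds
# prints: osd.7
```

On a live cluster the same filter can be fed from `ceph health detail | slow_osds`.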
I stopped the osd, which mad