On Mon, 12 Dec 2022 at 21:18, Benjamin.Zieglmeier wrote:
> We are in the process of building new stage (non-production) Ceph RGW
> clusters hosting s3 buckets. We are looking to have our customers migrate
> their non-production buckets to these new clusters. We want to help ease the
> migration
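As a minimal sketch of one common way to copy S3 buckets between an old and a new RGW cluster (not necessarily what was planned here), rclone can sync bucket contents; the remote names old-rgw/new-rgw, the endpoints and the bucket name below are placeholders:
  # ~/.config/rclone/rclone.conf -- endpoints and credentials are placeholders
  [old-rgw]
  type = s3
  provider = Ceph
  endpoint = https://old-rgw.example.com
  access_key_id = <access-key>
  secret_access_key = <secret-key>
  [new-rgw]
  type = s3
  provider = Ceph
  endpoint = https://new-rgw.example.com
  access_key_id = <access-key>
  secret_access_key = <secret-key>

  # copy one bucket; rerunning only transfers changed objects
  rclone sync old-rgw:<bucket> new-rgw:<bucket> --progress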
Hi Serkan,
Thanks for your reply.
-- Server setting --
OS: Ubuntu 20.04 LTS
NIC: Mellanox ConnectX-6 EN
Driver: MLNX_OFED_LINUX-5.6-2.0.9.0-ubuntu20.04-x86_64
--
My ceph.conf is as follows:
-- ceph.conf --
# minimal ceph.conf for 2f383ac8-76cb-11ed-bfbc-6dd8bf17bdf9
[global]
fsid = 2f383ac8-76cb-11ed-bfbc-6dd8bf17bdf9
Hi Eugen,
I assume the mon db is stored on the "OS disk". I could not find any
error-related lines in cephadm.log; here is what journalctl -xe tells me:
Dec 13 11:24:21 sparci-store1
ceph-8c774934-1535-11ec-973e-525400130e4f-mon-sparci-store1[786211]:
debug 2022-12-13T10:24:21.392+ 7f318
So you get "Permission denied" errors; I'm guessing either the mon
keyring is not present (or wrong) or the mon directory doesn't belong
to the ceph user. Can you check:
ls -l /var/lib/ceph/FSID/mon.sparci-store1/
Compare the keyring file with the ones on the working mon nodes.
Quoting Mev
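A minimal sketch of that check on a cephadm/containerized deployment; <FSID> and <working-host> are placeholders:
  # on the failed mon host: ownership/permissions of the mon data directory
  ls -ln /var/lib/ceph/<FSID>/mon.sparci-store1/
  # compare the keyring with one from a healthy mon (run on each host)
  sha256sum /var/lib/ceph/<FSID>/mon.sparci-store1/keyring
  sha256sum /var/lib/ceph/<FSID>/mon.<working-host>/keyring
In cephadm containers the mon directory is usually owned by uid/gid 167 (the container's ceph user), so an ownership mismatch is a common source of "Permission denied".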
Hi Xiubo,
Thanks for pointing me in the right direction. All involved ESX hosts seem to use
the correct policy. I am going to detach the LUN on each host one by one until
I find the host causing the problem.
Regards, Felix
--
The keyring is the same, but I found the following log lines:
Dec 13 12:22:18 sparci-store1
ceph-8c774934-1535-11ec-973e-525400130e4f-mon-sparci-store1[813780]:
debug 2022-12-13T11:22:18.016+ 7f789e7f3700 0
mon.sparci-store1@1(probing) e18 removed from monmap, suicide.
Dec 13 12:22:18 sp
On 13/12/2022 18:57, Stolte, Felix wrote:
Hi Xiubo,
Thanks for pointing me in the right direction. All involved ESX hosts
seem to use the correct policy. I am going to detach the LUN on each
host one by one until I find the host causing the problem.
From the logs it means the client was sw
Did you check the permissions? To me it reads like the "Permission denied"
errors prevent the MONs from starting, and then as a result they
are removed from the monmap:
ceph-8c774934-1535-11ec-973e-525400130e4f-mon-sparci-store1[786211]:
debug 2022-12-13T10:24:21.599+ 7f317ba4d700 -1
mon
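If the ownership does turn out to be wrong, a sketch of one way to fix and retest it; 167:167 is the usual ceph uid/gid inside the upstream containers, so verify against a working mon before applying:
  chown -R 167:167 /var/lib/ceph/<FSID>/mon.sparci-store1/
  systemctl restart ceph-<FSID>@mon.sparci-store1.service
  journalctl -fu ceph-<FSID>@mon.sparci-store1.service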
Hi,
On Mon, 12 Dec 2022, Sascha Lucas wrote:
On Mon, 12 Dec 2022, Gregory Farnum wrote:
Yes, we’d very much like to understand this. What versions of the server
and kernel client are you using? What platform stack — I see it looks like
you are using CephFS through the volumes interface? The
It's very strange. The keyring of the ceph monitor is the same as on one
of the working monitor hosts. The failed mon and the working mons also
have the same SELinux policies and firewalld settings. The connection is
also present, since all OSD daemons are up on the failed ceph monitor node.
Am
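Given the "removed from monmap" message earlier in the thread, a sketch of confirming that and redeploying the daemon via the orchestrator; the mon IP is a placeholder and this assumes the cephadm placement still covers the host:
  ceph mon dump                                  # is sparci-store1 still listed?
  ceph orch daemon rm mon.sparci-store1 --force
  ceph orch daemon add mon sparci-store1:<mon-ip>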
Hello
I have a bunch of HDD OSDs with DB/WAL devices on SSD. If
the current trends continue, the DB/WAL devices will become
full before the HDDs completely fill up (e.g. a 50% full HDD
has a DB/WAL device that is about 65% full).
Will anything terrible happen when DB/WAL devices fill up?
Will
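A sketch of checking how full the DB volumes really are and whether anything has already spilled over to the HDDs; osd.<id> is a placeholder and the perf dump must be run on the host carrying that OSD:
  ceph health detail | grep -i spillover     # BLUEFS_SPILLOVER warnings, if any
  ceph osd df                                # per-OSD META/OMAP usage
  ceph daemon osd.<id> perf dump bluefs      # db_used_bytes vs db_total_bytes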
We are happy to announce another release of the go-ceph API library.
This is a regular release following our every-two-months release
cadence.
https://github.com/ceph/go-ceph/releases/tag/v0.19.0
More details are available at the link above.
The library includes bindings that aim to play a sim
On Tue, 13 Dec 2022 at 17:47, Vladimir Brik wrote:
>
> Hello
>
> I have a bunch of HDD OSDs with DB/WAL devices on SSD. If
> the current trends continue, the DB/WAL devices will become
> full before the HDDs completely fill up (e.g. a 50% full HDD
> has a DB/WAL device that is about 65% full).
>
> W
Hi all,
in Ceph Pacific 16.2.5, the MDS failover function is not working. The
one host with the active MDS had to be rebooted, and after that the
standby daemons did not jump in. The fs was not accessible; instead, all
MDS daemons remain in standby until now. The cluster also remains in Ceph Error
du
On Tue, Dec 13, 2022 at 2:02 PM Mevludin Blazevic wrote:
>
> Hi all,
>
> in Ceph Pacific 16.2.5, the MDS failover function is not working. The
> one host with the active MDS had to be rebooted, and after that the
> standby daemons did not jump in. The fs was not accessible; instead, all
> mds rema
Hi,
thanks for the quick response!
CEPH STATUS:
  cluster:
    id:     8c774934-1535-11ec-973e-525400130e4f
    health: HEALTH_ERR
            7 failed cephadm daemon(s)
            There are daemons running an older version of ceph
            1 filesystem is degraded
            1 filesystem ha
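A sketch of the usual things to check when standbys refuse to take over; <fs_name> is a placeholder, and the joinable flag is only one possible cause (it can be left false after an interrupted upgrade):
  ceph fs status
  ceph fs dump | grep -E 'joinable|max_mds|standby_count_wanted'
  ceph orch ps --daemon-type mds             # are the standby daemons actually running?
  # if the filesystem was left non-joinable, standbys will not be promoted:
  ceph fs set <fs_name> joinable true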
> The DB uses "fixed" sizes like 3, 30, 300G for different levels of
> data, and when it needs to start filling a new level and it doesn't
> fit, this level moves over to the data device.
I thought this no longer applied since the changes in
Pacific that Nathan mentioned?
Vlad
On 12/13/22 12:46,
On Tue, Dec 13, 2022 at 2:21 PM Mevludin Blazevic wrote:
>
> Hi,
>
> thanks for the quick response!
>
> CEPH STATUS:
>
>   cluster:
>     id:     8c774934-1535-11ec-973e-525400130e4f
>     health: HEALTH_ERR
>             7 failed cephadm daemon(s)
>             There are daemons running an olde
Hi William,
On Mon, 12 Dec 2022, William Edwards wrote:
On 12 Dec 2022 at 22:47, Sascha Lucas wrote the following:
Ceph "servers" like MONs, OSDs, MDSs etc. are all
17.2.5/cephadm/podman. The filesystem kernel clients are co-located on
the same hosts running the "servers".
Isn
I am curious about what is happening with your iscsi configuration
Is this a new iscsi config or something that has just cropped up?
We are using/have been using vmware for 5+ years with iscsi
We are using the kernel iscsi vs tcmu
We are running ALUA and all datastores are set up as RR
We routi
Is there any problem removing the radosgw and all backing pools from a cephadm-managed
cluster? Ceph won't become unhappy about it? We have one cluster with a
really old, historical radosgw that we think would be better to remove and someday
later recreate fresh.
Thanks,
Kevin
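A sketch of one way to do this on a cephadm cluster; the service name and pool list are the typical defaults and should be checked against the actual cluster before deleting anything:
  ceph orch ls rgw                           # find the exact rgw service name
  ceph orch rm <rgw-service-name>            # remove the rgw daemons
  ceph config set mon mon_allow_pool_delete true
  ceph osd pool ls | grep rgw                # e.g. .rgw.root, default.rgw.meta, default.rgw.log,
                                             #      default.rgw.control, default.rgw.buckets.*
  ceph osd pool rm <pool> <pool> --yes-i-really-really-mean-it   # once per rgw pool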
Hi,
we had an issue with an old cluster where we put disks from one host
to another.
We destroyed the disks and added them as new OSDs, but since then the
mgr daemon was restarting at 120-second intervals.
I tried to debug it a bit, and it looks like the balancer is the problem.
I tried to disable it
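A sketch of taking the balancer out of the picture while debugging; <active-mgr> is a placeholder:
  ceph balancer status
  ceph balancer off
  ceph mgr fail <active-mgr>                 # fail over to a standby mgr and see whether the restarts stop
  ceph crash ls                              # any recorded mgr crashes?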
You could try to do this in a screen session for a while:
while true; do radosgw-admin gc process; done
Maybe your normal RGW daemons are too busy for GC processing.
We have this in our config and have started extra RGW instances for GC only:
[global]
...
# disable garbage collector default
rgw_en
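In addition to the loop above, a sketch of inspecting and draining the GC queue by hand; --include-all also covers entries whose expiration time has not been reached yet:
  radosgw-admin gc list --include-all | head   # how much is queued
  radosgw-admin gc process --include-all       # drain the queue regardless of expiration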
On 14/12/2022 06:54, Joe Comeau wrote:
I am curious about what is happening with your iscsi configuration
Is this a new iscsi config or something that has just cropped up?
We are using/have been using vmware for 5+ years with iscsi
We are using the kernel iscsi vs tcmu
Do you mean you
The issue is resolved now. After verifying that all ESX hosts are configured for
MRU, I took a closer look at the paths on each host.
`gwcli` reported that the LUN in question was owned by gateway A, but one ESX host used
the path to gateway B for I/O. I reconfigured that particular host and it's now
using
Everything Open is a new open tech conference auspiced by Linux
Australia. For background see:
https://everythingopen.au/news/introducing-everything-open/
For the CFP and other details, read on...
Forwarded Message
Subject: [Announce] Everything Open, All At Once!
Date: Tue,
Hi guys,
we recently had some issues with our CephFS, which were probably caused
(at least partly) by an MTU mismatch. The scenario was the following:
OSD servers: MTU 9000 on public and cluster network
MON+MDS: MTU 1500 on public network
CephFS clients (kernel mount): MTU 9000 on public network
RBD
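A sketch of how an MTU mismatch like this can be verified host by host; <iface> and <peer-ip> are placeholders, and 8972 is 9000 minus the 28 bytes of IP and ICMP headers:
  ip link show dev <iface> | grep mtu
  ping -M do -s 8972 <peer-ip>    # 'do' forbids fragmentation; fails if any hop cannot pass jumbo frames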
After some manual rebalancing, all PGs went into a clean state and I was able
to start the balancer again.
¯\_(ツ)_/¯
> On 14.12.2022 at 01:18, Boris Behrens wrote:
>
> Hi,
> we had an issue with an old cluster, where we put disks from one host
> to another.
> We destroyed the disks and ad