Hi,
ceph orch commands are no longer executed in my cephadm-managed cluster
(17.2.3) and I can't see why. The cluster is healthy and working overall,
except for the orchestrator part.
For instance, when I run `ceph orch redeploy ingress.rgw.default`, I see
the command in audit logs, cephadm also lo
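(Not part of the original message, but a rough sketch of the usual checks when the orchestrator stops acting on commands; output will of course depend on the cluster.)

  ceph orch status        # is the cephadm backend loaded and available?
  ceph log last cephadm   # recent cephadm log messages from the active mgr
  ceph -W cephadm         # follow cephadm activity live while re-running the command
  ceph mgr fail           # fail over the active mgr, a common way to restart a stuck module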
Hi Dale
Can you please post the ceph status? I'm no expert, but I would make sure that
the datacenter you intend to operate (while the connection gets re-established)
has two active monitors. Thanks.
Yanko.
> On Nov 29, 2022, at 7:20 AM, Wolfpaw - Dale Corse wrote:
>
> Hi All,
>
>
>
> We
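(Not from the thread, just a sketch of how the monitor situation suggested above could be checked; no cluster-specific names assumed.)

  ceph status                               # overall health, mon/mgr summary, PG states
  ceph mon stat                             # which monitors exist and which are in quorum
  ceph quorum_status --format json-pretty   # detailed quorum membership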
Hi All,
We had a fiber cut tonight between 2 data centers, and a Ceph cluster didn't
do very well :( We ended up with 98% of PGs down.
This setup has 2 data centers defined, with 4 copies across both and a
minimum size of 1. We have 1 mon/mgr in each DC, with one in a 3rd data
cente
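(For reference, not part of the original mail: the replication settings described above can be inspected per pool; <pool> is a placeholder.)

  ceph osd pool get <pool> size        # number of copies, 4 in this setup
  ceph osd pool get <pool> min_size    # minimum copies required to serve I/O, 1 here
  ceph osd tree                        # CRUSH hierarchy, e.g. the two datacenter buckets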
Hi Frank,
On Tue, Nov 29, 2022 at 12:32 AM Frank Schilder wrote:
>
> Hi Reed,
>
> I sometimes had stuck MDS ops as well, making the journal trim stop and the
> metadata pool slowly run full. It's usually a race condition in the MDS
> ops queue and re-scheduling the OPS in the MDS queue reso
Hi Reed,
On Tue, Nov 29, 2022 at 3:13 AM Reed Dier wrote:
>
> So, ironically, I did try and take some of these approaches here.
>
> I first moved the nearfull goalpost to see if that made a difference; it did
> for client writes, but it didn't unstick the metadata.
>
> I did some hunting for so
On 28/11/2022 23:21, Adrien Georget wrote:
Hi Xiubo,
I did a journal reset today, followed by a session reset, and then the MDS
was able to start without switching to read-only mode.
An MDS scrub was also useful to repair some bad inode backtraces.
Thanks again for your help with this issue!
Coo
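(For readers hitting the same problem: the recovery Adrien describes roughly corresponds to the commands below. This is only a sketch with <fs> as a placeholder; the CephFS disaster-recovery documentation should be consulted before running any of them, and the journal should be exported first.)

  cephfs-journal-tool --rank=<fs>:0 journal export backup.bin   # keep a copy first
  cephfs-journal-tool --rank=<fs>:0 journal reset               # the journal reset
  cephfs-table-tool all reset session                           # the session reset
  ceph tell mds.<fs>:0 scrub start / recursive,repair           # scrub to repair bad backtraces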
Hi Reed,
forget what I wrote about pinning: you use only 1 MDS, so it won't change
anything. I think the problem you are facing is with the standby-replay daemon
mode. I used that in the past too, but found out that it actually didn't help
with fail-over speed to begin with. On top of that, the
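(Not part of Frank's mail: if someone wants to test a plain standby instead of standby-replay, that is a per-filesystem flag; <fs> is a placeholder.)

  ceph fs set <fs> allow_standby_replay false   # use a plain standby instead of standby-replay
  ceph fs status <fs>                           # shows which daemons are active / standby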
So, ironically, I did try and take some of these approaches here.
I first moved the nearfull goalpost to see if that made a difference; it did
for client writes, but it didn't unstick the metadata.
I did some hunting for some hung/waiting processes on some of the client nodes,
and was able to
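("Moving the nearfull goalpost" presumably means raising the nearfull ratio; a sketch of how that is usually done. The value is only an example and must stay below the backfillfull and full ratios.)

  ceph osd dump | grep ratio         # current full / backfillfull / nearfull ratios
  ceph osd set-nearfull-ratio 0.90   # raise the nearfull warning threshold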
Hi Reed,
I sometimes had stuck MDS ops as well, making the journal trim stop and the
metadata pool slowly run full. It's usually a race condition in the MDS ops
queue, and re-scheduling the OPS in the MDS queue resolves it. To achieve that,
I usually try in escalating order:
- Find the clie
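(Not from the original mail, but the usual commands for finding the stuck ops and the clients behind them; <name> and <client-id> are placeholders.)

  ceph tell mds.<name> dump_ops_in_flight            # list ops currently stuck in the MDS queue
  ceph tell mds.<name> session ls                    # map client session ids to hosts/mounts
  ceph tell mds.<name> client evict id=<client-id>   # last resort: evict the offending client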
Hi Venky,
Thanks for responding.
> A good chunk of those are waiting for the directory to finish
> fragmentation (split). I think those ops are not progressing since
> fragmentation involves creating more objects in the metadata pool.
> Update ops will involve appending to the mds journal consum
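(For context, not from the thread: the fragmentation thresholds referred to above are MDS config options and can be inspected like this.)

  ceph config get mds mds_bal_split_size          # directory size at which a dirfrag is split
  ceph config get mds mds_bal_fragment_size_max   # hard cap on entries per fragment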
On Mon, Nov 28, 2022 at 10:19 PM Reed Dier wrote:
>
> Hopefully someone will be able to point me in the right direction here:
>
> Cluster is Octopus/15.2.17 on Ubuntu 20.04.
> All are kernel cephfs clients, either 5.4.0-131-generic or 5.15.0-52-generic.
> Cluster is nearful, and more storage is co
Hopefully someone will be able to point me in the right direction here:
Cluster is Octopus/15.2.17 on Ubuntu 20.04.
All are kernel cephfs clients, either 5.4.0-131-generic or 5.15.0-52-generic.
Cluster is nearful, and more storage is coming, but still 2-4 weeks out from
delivery.
> HEALTH_WARN 1
Hi Xiubo,
I did a journal reset today, followed by a session reset, and then the MDS
was able to start without switching to read-only mode.
An MDS scrub was also useful to repair some bad inode backtraces.
Thanks again for your help with this issue!
Cheers,
Adrien
On 26/11/2022 at 05:08, Xiubo Li wrote
I’ve never done it myself, but the network config options for public/private
should take a comma-separated list of CIDR blocks.
The client/public should be fine.
For the backend/private/replication network, that is likely overkill. Are your
OSDs SSDs or HDDs? If you do go this route, be sure
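(A sketch of what the comma-separated form could look like, not taken from the thread; the subnets are just the examples from the original question plus a placeholder cluster network.)

  [global]
      public_network  = 192.168.1.0/24, 192.168.2.0/24
      cluster_network = 10.0.0.0/24

  # or via the config database:
  ceph config set global public_network "192.168.1.0/24,192.168.2.0/24"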
Hi Mathias,
(apologies for the super late reply - I was getting back from a long
vacation and missed seeing this).
I updated the tracker ticket. Let's move the discussion there...
On Mon, Nov 28, 2022 at 7:46 PM Venky Shankar wrote:
>
> On Tue, Aug 23, 2022 at 10:01 PM Kuhring, Mathias
> wrote
The “Network Configuration Reference” is always a good place to start:
https://docs.ceph.com/en/latest/rados/configuration/network-config-ref/
Multiple client networks are possible (see the "public_network" configuration
option).
I believe you’d configure 2 “public_network”s:
1. For actual
On Tue, Aug 23, 2022 at 10:01 PM Kuhring, Mathias
wrote:
>
> Dear Ceph developers and users,
>
> We are using ceph version 17.2.1
> (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy (stable).
> We are using cephadm since version 15 octopus.
>
> We mirror several CephFS directories from our main cl
Hello,
I have a Ceph cluster with 3 MONs and 6 OSD nodes with 72 OSDs.
I would like to have multiple client and backend networks. I now have
2x 10Gbps and 2x 25Gbps NICs in the nodes, and my idea is to have:
- 2 client networks, for example 192.168.1.0/24 on 10Gbps NICs and
192.168.2.0/24 on 25Gbps N
Thanks, also for finding the related tracker issue! It looks like a fix has
already been approved. Hope it shows up in the next release.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Eugen Block
Sent: 28 Novemb
Hi,
seems like this tracker issue [1] already covers your question. I'll
update the issue and add a link to our thread.
[1] https://tracker.ceph.com/issues/57767
Quoting Frank Schilder:
Hi Eugen,
can you confirm that the silent corruption also happens on a
collocated OSD (everythin
Hi Matt,
Also, make sure that the rejoining host has the correct time. I have seen
clusters go down when rejoining hosts that were down for maintenance
for several weeks and came back with datetime deltas of some months (no
idea why that happened, I arrived with the firefighter team ;-) )
Chee
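(Not part of the original mail: clock skew can be checked both from the cluster and on the host itself, for example.)

  ceph time-sync-status   # the monitors' view of clock skew
  chronyc tracking        # NTP state on the (re)joining host, if chrony is used
  timedatectl             # quick sanity check of local time and NTP sync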