- On 25 Mar 25, at 10:59, Илья Безруков rbe...@gmail.com wrote:
>
> Hello Janne,
>
> We only have a single network configured for our OSDs:
>
> ```sh
> ceph config get osd public_network
> 172.20.180.0/24
>
> ceph config get osd cluster_network
> 172.20.180.0/24
> ```
Hi Eugen,
try deploying the NFS service like this:
https://docs.ceph.com/en/latest/mgr/nfs/
Some people only had success deploying it via the dashboard.
Best,
Malte
On 25.03.25 13:02, Eugen Block wrote:
Hi,
I'm re-evaluating NFS again, testing on a virtual cluster with 18.2.4.
For now, I don't need haproxy so I use "keepalive_only: true" as described in the docs [0].
--
Hello Janne,
We only have a single network configured for our OSDs:
```sh
ceph config get osd public_network
172.20.180.0/24

ceph config get osd cluster_network
172.20.180.0/24
```
However, in the output of ceph health detail, we see multiple networks
being checked
> > After upgrading our Ceph cluster from 17.2.7 to 17.2.8 using `cephadm`, all
> > OSDs are reported as unreachable with the following error:
> >
> > HEALTH_ERR 32 osds(s) are not reachable
> > [ERR] OSD_UNREACHABLE: 32 osds(s) are not reachable
> > osd.0's public address is not in '172.20.180
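The reachability check introduced in 17.2.8 appears to compare each OSD's public address against whatever public_network value it can resolve, so it is worth confirming at which scope the option is actually set. A minimal sketch, assuming the subnet really is 172.20.180.0/24 (setting it at the global scope is a workaround reported in similar cases, not a confirmed fix):

```sh
# Check where public_network is defined and what each section resolves to
ceph config dump | grep public_network
ceph config get mon public_network
ceph config get osd public_network

# Possible workaround: define it at the global level so the check sees it
# (assumption: adjust the subnet to your environment before running)
ceph config set global public_network 172.20.180.0/24
```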
Yeah, it seems to work without the "keepalive-only" flag, at least
from a first test. So keepalive-only is not working properly, it
seems? Should I create a tracker for that or am I misunderstanding its
purpose?
Quoting Malte Stroem:
Hi Eugen,
try omitting
--ingress-mode keepalive-only
Hi Danish,
Can you specify the version of Ceph used and whether versioning is enabled on
this bucket?
There are two ways to clean up orphan entries in a bucket index that I'm aware
of:
- One (the preferable way) is to rely on the radosgw-admin command to check and
hopefully fix the issue, cleaning
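A minimal sketch of that first approach (the bucket name is a placeholder; available flags vary by release, so check `radosgw-admin bucket check --help` on your version first):

```sh
# Dry run: report index inconsistencies without changing anything
radosgw-admin bucket check --bucket=<bucket>

# Attempt a repair of the index entries
radosgw-admin bucket check --bucket=<bucket> --fix
```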
Okay, so I don't see anything in the keepalived logs about the daemons
communicating with each other. The config files are almost identical: no
difference in priority, but unicast_peer differs. ceph03 has no
unicast_peer entry at all, ceph02 has only ceph03 in there, while ceph01
has both of the others
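For reference, a sketch of how to inspect the rendered configs and what a consistent set would look like (the paths and daemon directory names are assumptions; cephadm renders one keepalived.conf per daemon):

```sh
# Locate the rendered config on each host (FSID/daemon dir names will differ)
ls /var/lib/ceph/*/keepalived.*/
grep -A4 unicast_peer /var/lib/ceph/*/keepalived.*/keepalived.conf

# On a 3-node setup, each host's vrrp_instance should list the other two, e.g.:
# unicast_peer {
#   192.168.168.112
#   192.168.168.113
# }
```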
On Tue, Mar 25, 2025 at 7:40 AM Yuri Weinstein wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/70563#note-1
> Release Notes - TBD
> LRC upgrade - TBD
>
> Seeking approvals/reviews for:
>
> smoke - Laura approved?
Approved, issues are https://tracker.cep
Am 2025-03-20 15:15, schrieb Chris Palmer:
Hi,
* Ceph cluster 19.2.1 with 3 nodes, 4 x SATA disks with shared NVMe
DB/WAL, single 10g NICs
* Proxmox 8.3.5 cluster with 2 nodes (separate nodes to Ceph), single
10g NICs, single 1g NICs for corosync
* Test VM was using KRBD R3 pool on HDD
On Mon, 2025-03-24 at 15:35 -0700, Anthony D'Atri wrote:
> So probably all small-block RBD?
Correct. I am using RBD pools.
> Since you’re calling them thin, I’m thinking that they’re probably
> E3.S. U.3 is the size of a conventional 2.5” SFF SSD or HDD.
Hrm, my terminology is probably confusing.
>
> OK, good to know about the 5% misplaced objects report 😊
>
> I just checked 'ceph -s' and the misplaced objects is showing 1.948%, but I
> suspect I will see this up to 5% or so later on 😊
If you see a place in the docs where it would help to note the balancer
phenomenon and mistakenly th
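For context, that roughly 5% ceiling comes from the balancer's target_max_misplaced_ratio option, which defaults to 0.05. A quick way to confirm it (a sketch, not a recommendation to change the value):

```sh
# Show the ceiling the balancer applies to misplaced objects at any one time
ceph config get mgr target_max_misplaced_ratio
# Default is 0.050000, i.e. ~5% of objects may be misplaced at once
```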
Hi Eugen,
yes, for me it's kind of a "test setting" for small setups.
Doc says:
Setting --ingress-mode keepalive-only deploys a simplified ingress
service that provides a virtual IP with the nfs server directly binding
to that virtual IP and leaves out any sort of load balancing or traffic
redirection.
>> Since you’re calling them thin, I’m thinking that they’re probably
>> E3.S. U.3 is the size of a conventional 2.5” SFF SSD or HDD.
>
> Hrm, my terminology is probably confusing. According to the specs of
> the servers, they are U.3 slots.
Ah. I forget sometimes that there are both 7mm a
Dear Frédéric,
Unfortunately, I am still using the *Octopus* release, and these commands are
not recognized there.
Versioning is also not enabled on the bucket.
I tried running:
radosgw-admin bucket check --bucket= --fix
which ran for a few minutes, producing a lot of output that contained the lines below
fo
Hi Alan,
- On 25 Mar 25, at 16:47, Alan Murrell a...@t-net.ca wrote:
> OK, so just an update that the recovery did finally complete, and I am pretty
> sure that the "inconsistent" PGs were PGs that the failed OSD were part of.
> Running 'ceph pg repair' has them sorted out, along with the 600+ "scrub errors" I had.
>
> I just tried it with 3 keepalive daemons and one nfs daemon, it
> doesn't really work because all three hosts have the virtual IP
> assigned, preventing my client from mounting. So this doesn't really
> work as a workaround, it seems.
That's a bit surprising. The keepalive daemons are meant t
Thanks, Adam.
I just tried it with 3 keepalive daemons and one nfs daemon, it
doesn't really work because all three hosts have the virtual IP
assigned, preventing my client from mounting. So this doesn't really
work as a workaround, it seems. I feel like the proper solution would
be to inc
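A quick way to confirm that symptom (a sketch; the host names and VIP are taken from this thread, and with keepalived only the MASTER should hold the address):

```sh
# If more than one host prints a match, keepalived peering is broken
for h in ceph01 ceph02 ceph03; do
  echo "== $h =="
  ssh "$h" ip -br addr show | grep 192.168.168.114
done
```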
OK, so just an update that the recovery did finally complete, and I am pretty
sure that the "inconsistent" PGs were PGs that the failed OSD were part of.
Running 'ceph pg repair' has them sorted out, along with the 600+ "scrub
errors" I had.
I was able to remove the OSD from the cluster, and a
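For anyone following along, the usual workflow for inconsistent PGs looks roughly like this (the PG id is a placeholder):

```sh
# List the PGs flagged inconsistent
ceph health detail | grep inconsistent

# Inspect what is actually damaged in one of them
rados list-inconsistent-obj <pgid> --format=json-pretty

# Queue a repair on that PG
ceph pg repair <pgid>
```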
Hi Frédéric,
Thank you for replying.
I followed the steps mentioned in https://tracker.ceph.com/issues/62845 and
was able to trim all the errors.
Everything seemed to be working fine until the same error appeared again.
I am still assuming the main culprit of this issue is one missing
object an
Which daemons get moved around like that is controlled by
https://github.com/ceph/ceph/blob/main/src/pybind/mgr/cephadm/utils.py#L30,
which appears to only include nfs and haproxy, so maybe this keepalive-only
case was missed in that sense. I do think that you could alter the
placement of the ingre
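If memory serves, the relevant constant is a short list of daemon types; a sketch of how to verify against the linked source (the expected value shown is from memory, not a quote):

```sh
# Show the daemon types cephadm will reschedule away from offline hosts
grep -n "RESCHEDULE_FROM_OFFLINE_HOSTS" src/pybind/mgr/cephadm/utils.py
# Expected: something like RESCHEDULE_FROM_OFFLINE_HOSTS_TYPES = ['nfs', 'haproxy']
```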
I completely agree that the test I did is not suitable for testing ceph
performance. I merely did the same command as the OP and obtained very
different results. I suspect the performance difference is much more due
to things like network, OS config, memory constraints, etc. But that
needs a ri
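For a more controlled comparison than replaying an ad-hoc client command, something like rados bench isolates the cluster from client-side effects (the pool name is a placeholder; create a throwaway pool first):

```sh
# 30-second write benchmark, keeping the objects for the read phase
rados bench -p testbench 30 write --no-cleanup
# Sequential read benchmark against the objects written above
rados bench -p testbench 30 seq
# Remove the benchmark objects afterwards
rados cleanup -p testbench
```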
Sounds weird to me. Don't you have some element in the network that is limited
to 5140, and above that it starts fragmenting packets or something similar? I
remember asking the data center to enable 9000 and they never did; they were
also experimenting with some software-defined network.
I will bet th
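An easy way to test that theory is to probe the path with the don't-fragment bit set (Linux ping syntax; 8972 and 5112 are the ICMP payload sizes for MTU 9000 and 5140 after subtracting 28 bytes of IP+ICMP headers):

```sh
# Should succeed end-to-end if MTU 9000 is honored on every hop
ping -M do -s 8972 -c 3 <peer-ip>
# If the large probe fails but this one passes, some hop tops out near 5140
ping -M do -s 5112 -c 3 <peer-ip>
```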
No, that was peer-to-peer, controlled testing. The results were
different with different NIC chipsets, even on the same machines through
the same switch. And even without a switch. I have to say some of these
were cheaper NICs. With better ones there are fewer problems. But you
don't know until
Hello Frédéric,
thank you very much. Yes, that's what I saw, too. Thank you for your feedback
and acknowledgement.
Best,
Malte
On 20.03.25 10:00, Frédéric Nass wrote:
Hi Malte,
Yeah, I just wanted to make you aware of this separate Kafka bug in Quincy and
Reef v18.2.4.
Regarding your issue, if
Hello Yuval,
yes, I would really like to help here.
We're running Reef but can upgrade immediately.
Contact me if you need the help.
Best,
Malte
On 22.03.25 18:46, Yuval Lifshitz wrote:
Hi,
As noted above, I already started implementing mtls support. Currently
blocked on adding an mtls test
Hi,
I'm re-evaluating NFS again, testing on a virtual cluster with 18.2.4.
For now, I don't need haproxy so I use "keepalive_only: true" as
described in the docs [0]. I first create the ingress service, wait
for it to start, then create the nfs cluster. I've added the specs at
the bottom.
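For reference, a sketch of what the two specs might look like (the service ids, hosts and VIP are taken from this thread; exact fields can differ between releases, so treat this as an illustration rather than the specs Eugen posted):

```sh
cat > nfs-keepalive.yaml <<'EOF'
service_type: ingress
service_id: nfs.ebl-nfs-cephfs
placement:
  hosts: [ceph01, ceph02, ceph03]
spec:
  backend_service: nfs.ebl-nfs-cephfs
  virtual_ip: 192.168.168.114/24
  keepalive_only: true
---
service_type: nfs
service_id: ebl-nfs-cephfs
placement:
  hosts: [ceph01, ceph02, ceph03]
  count: 1
spec:
  # In keepalive-only mode the NFS daemon binds to the VIP directly
  virtual_ip: 192.168.168.114
EOF
ceph orch apply -i nfs-keepalive.yaml
```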
Hi Eugen,
try omitting
--ingress-mode keepalive-only
like this
ceph nfs cluster create ebl-nfs-cephfs "1 ceph01 ceph02 ceph03"
--ingress --virtual_ip "192.168.168.114/24"
Best,
Malte
On 25.03.25 13:25, Eugen Block wrote:
Thanks for your quick response. The specs I pasted are actually the
result of deploying a nfs cluster like this:
ceph nfs cluster create ebl-nfs-cephfs "1 ceph01 ceph02 ceph03"
--ingress --virtual_ip 192.168.168.114 --ingress-mode keepalive-only
I can try redeploying it via dashboard, but I