I think I found the problem. Setting the cephadm log level to debug and
then watching the logs during the upgrade:
ceph config set mgr mgr/cephadm/log_to_cluster_level debug
ceph -W cephadm --watch-debug
I found this line just before the error:
ceph: stderr Fatal glibc error: CPU does no
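A possible next step, assuming the truncated message is glibc's x86-64
microarchitecture-level check (newer container builds can require a newer
CPU level): ask the dynamic loader which levels the host CPU supports.
This is a sketch and needs glibc >= 2.33 on the host.

/lib64/ld-linux-x86-64.so.2 --help | grep -E 'x86-64-v[234]'
# each level the CPU provides is listed as "(supported, searched)"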
Hi all,
we have some Ceph clusters with RGW replication between them. For at least
the last month, replication seems to get stuck at around the same time
~every day. Not exactly the same time, and not every day, but in recent
days it seems to happen more often, and for longer.
With "stuc
Hi
[ceph: root@ceph-flash1 /]# rbd info rbd_ec/projects
rbd image 'projects':
size 750 TiB in 196608000 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 15a979db61dda7
data_pool: rbd_ec_data
block_name_prefix: rbd_data.10.15a979db61dda7
On Tue, Aug 6, 2024 at 11:55 AM Torkil Svensgaard wrote:
>
> Hi
>
> [ceph: root@ceph-flash1 /]# rbd info rbd_ec/projects
> rbd image 'projects':
> size 750 TiB in 196608000 objects
> order 22 (4 MiB objects)
> snapshot_count: 0
> id: 15a979db61dda7
> da
On 06/08/2024 12:37, Ilya Dryomov wrote:
> On Tue, Aug 6, 2024 at 11:55 AM Torkil Svensgaard wrote:
>> Hi
>> [ceph: root@ceph-flash1 /]# rbd info rbd_ec/projects
>> rbd image 'projects':
>> size 750 TiB in 196608000 objects
>> order 22 (4 MiB objects)
>> snapshot_count: 0
Hi,
I am in the process of creating disaster recovery documentation and I have
two topics where I am not sure how to do it or even if it is possible.
Is it possible to recover from a 100% mon data loss? Like all mons fail and
the actual mon data is not recoverable.
In my head I would think that
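For the mon question: the upstream troubleshooting docs describe rebuilding
the mon store from the surviving OSDs. A rough sketch, assuming the OSD data
itself is intact (paths and keyring location are illustrative):

ms=/tmp/mon-store
mkdir -p $ms
# collect cluster map info from every OSD on this host; repeat per host,
# carrying $ms along
for osd in /var/lib/ceph/osd/ceph-*; do
  ceph-objectstore-tool --data-path $osd --no-mon-config \
    --op update-mon-db --mon-store-path $ms
done
# rebuild the store; the keyring must hold the mon. key and admin caps
ceph-monstore-tool $ms rebuild -- --keyring /path/to/admin.keyring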
Hi,
the upgrade notes for Nautilus [0] contain this section:
Running nautilus OSDs will not bind to their v2 address
automatically. They must be restarted for that to happen.
Regards,
Eugen
[0] https://docs.ceph.com/en/latest/releases/nautilus/#instructions
Quoting Mark Kirkwood:
We ha
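In practice that amounts to something like the sketch below (commands are
standard; host-by-host sequencing and timing are up to you):

ceph mon enable-msgr2                 # if not already enabled
systemctl restart ceph-osd.target     # per host, so OSDs rebind
ceph osd dump | grep v2:              # verify v2 addresses now appear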
On 06.08.24 1:19 PM, Boris wrote:
> I am in the process of creating disaster recovery documentation and I have
> two topics where I am not sure how to do it or even if it is possible.
> Is it possible to recover from a 100% mon data loss? Like all mons fail and
> the actual mon data is not recoverable.
Hi Ceph users,
The next Ceph Developer Summit is happening virtually from August 12 – 19,
2024 and we want to see you there. The focus of the summit will include
planning around our next release, Tentacle, and everyone in our community
is welcome to participate!
Learn more and RSVP here:
https://
Looks like the issue was fixed in the latest reef release (18.2.4)
I found the following commit that seems to fix it:
https://github.com/ceph/ceph/commit/26f1d6614bbc45a0079608718f191f94bd4eebb6
After upgrading we also haven’t encountered the problem again.
Cheers,
Florian
> On 5. Aug 2024, at
Hello everyone,
We need to add 180 20TB OSDs to our Ceph cluster, which currently
consists of 540 OSDs of identical size (replicated size 3).
I'm not sure, though: is it a good idea to add all the OSDs at once? Or
is it better to add them gradually?
The idea is to minimize the impact of reb
What operating system/distribution are you running? What hardware?
David
On Tue, Aug 6, 2024, at 02:20, Nicola Mori wrote:
> I think I found the problem. Setting the cephadm log level to debug and
> then watching the logs during the upgrade:
>
>ceph config set mgr mgr/cephadm/log_to_cluster_
If you're using VMs,
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/6X6QIEMWDYSA6XOKEYH5OJ4TIQSBD5BL/
might be relevant
On Tue, Aug 6, 2024 at 3:21 AM Nicola Mori wrote:
> I think I found the problem. Setting the cephadm log level to debug and
> then watching the logs during th
Hi Ceph-users!
Ceph version: ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5)
quincy (stable)
Using cephadm to orchestrate the Ceph cluster
I’m running into https://tracker.ceph.com/issues/59189, which is fixed in the
next version, quincy 17.2.7, via
https://github.com/ceph/ceph/pull/50
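Assuming stock images, moving a cephadm-managed cluster to the fixed release
would be something like this sketch:

ceph orch upgrade start --ceph-version 17.2.7
ceph orch upgrade status   # watch progress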
Hi Dhairya,
Thanks for the response! We tried removing it as you suggested with `rm
-rf` but the command just hangs indefinitely with no output. We are also
unable to `ls lost_found`, or otherwise interact with the directory's
contents.
Best,
Justin Lee
On Fri, Aug 2, 2024 at 8:24 AM Dhairya Par
Hi,
we just set up 2 new ceph clusters (using rook). To do some processing of the
user activity we configured a topic that sends events to Kafka.
After 5-12 hours this stops working with a 503 SlowDown response:
debug 2024-08-02T09:17:58.205+ 7ff4359ad700 1 req 13681579273117692719
0.005000
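A first step might be inspecting the configured topic; a sketch, assuming
radosgw-admin access (topic name is illustrative):

radosgw-admin topic list
radosgw-admin topic get --topic my-kafka-topic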
Hi,
Currently I see it only supports the latest version, is there any way to
support old versions like Pacific or Quincy?
The actual mount command doesn't hang, we just can't interact with any of
the directory's contents once mounted. I couldn't find anything unusual in
the logs.
Best,
Justin Lee
On Fri, Aug 2, 2024 at 10:38 AM Dhairya Parmar wrote:
> So the mount hung? Can you see anything suspicious in the logs?
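If the client logs are quiet, the MDS side can sometimes say more. A sketch
of things worth checking, assuming admin-socket access (names illustrative):

ceph tell mds.<fsname>:0 damage ls         # recorded metadata damage, if any
ceph daemon mds.<id> dump_ops_in_flight    # ops stuck on the hung directory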
Since they’re 20TB, I’m going to assume that these are HDDs.
There are a number of approaches. One common theme is to avoid rebalancing
until after all have been added to the cluster and are up / in, otherwise you
can end up with a storm of map updates and superfluous rebalancing.
One strateg
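The flag-based variant of that theme is roughly this sketch (the flags are
standard; when exactly to unset them is a judgment call):

ceph osd set norebalance
ceph osd set nobackfill
# ... add and activate all 180 OSDs ...
ceph osd unset nobackfill
ceph osd unset norebalance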
Some kernels (el7?) lie about being jewel until after they are blocked from
connecting at jewel; then they report newer. Just FYI.
From: Anthony D'Atri
Sent: Tuesday, August 6, 2024 5:08 PM
To: Fabien Sirjean
Cc: ceph-users
Subject: [ceph-users] Re: What'
Hi Fabien,
in addition to what Anthony said, you could do the following:
- `ceph osd set nobackfill` to disable initial backfilling
- `ceph config set osd osd_mclock_override_recovery_settings true` to
override the mclock scheduler backfill settings (see the sketch after this list)
- Let the orchestrator add one host each time. I w
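For illustration, once the mclock override above is set you can throttle
backfill explicitly, roughly like this (values are examples, not
recommendations):

ceph config set osd osd_max_backfills 1
# ... after all hosts are added, allow data movement again ...
ceph osd unset nobackfill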