Yes, the mgrs are running as intended. It just seems that the MONs and OSDs don't
recognize each other, because the monitor map is outdated.
On 2025-04-11 07:07, Eugen Block wrote:
Is at least one mgr running? PG states are reported by the mgr daemon.
Quoting Jonas Schwab:
I solved the problem wit
Is at least one mgr running? PG states are reported by the mgr daemon.
Quoting Jonas Schwab:
I solved the problem with executing ceph-mon. Among other things, -i
mon.rgw2-06 was not the correct option; it should have been -i rgw2-06.
Unfortunately, that brought the next problem:
The cluster now shows "100
I solved the problem with executing ceph-mon. Among other things, -i
mon.rgw2-06 was not the correct option; it should have been -i rgw2-06.
Unfortunately, that brought the next problem:
The cluster now shows "100.000% pgs unknown", which is probably because
the monitor data is not completely up to date, but rath
Hi Jonas,
Anthony gave some good advice for some things to check. You can also
dump the mempool statistics for OSDs that you identify are over their
memory target using: "ceph daemon osd.NNN dump_mempools"
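For example, a rough way to compare the configured target with the actual
usage might be the following (osd.12 is just a placeholder ID, and whether
"ceph tell" accepts this command depends on your release):

  # what target is this OSD configured with?
  ceph config get osd.12 osd_memory_target
  # what is it actually using, broken down by mempool?
  ceph daemon osd.12 dump_mempools
  # roughly the same without needing the admin socket on the host
  ceph tell osd.12 dump_mempools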
The osd_memory_target code basically looks at the memory usage of the
process and the
On Fri, Apr 11, 2025 at 10:39 AM Alex wrote:
> I created a pull request; not sure what the etiquette is on whether I can
> merge it myself. First-timer here.
>
Hi Alex, I cannot find your pull request in
https://github.com/ceph/cephadm-ansible/. Did you create it in this
project?
Link please.
> On Apr 10, 2025, at 10:59 PM, Alex wrote:
>
> I made a pull request regarding cephadm.log being set to DEBUG.
> Not sure if I should merge it.
Filestore IIRC used partitions, with cute hex GPT types for various states and
roles. Udev activation was sometimes problematic, and LVM tags are more
flexible and reliable than the prior approach. There is no doubt more to it,
but that's what I recall.
> On Apr 10, 2025, at 9:11 PM, Tim Hol
I made a pull request regarding cephadm.log being set to DEBUG.
Not sure if I should merge it.
I created a pull request; not sure what the etiquette is on whether I can
merge it myself. First-timer here.
Peter,
I don't think udev factors in based on the original question. Firstly,
because I'm not sure udev deals with permanently-attached devices (it's
more for hot-swap items). Secondly, because the original complaint
mentioned LVM specifically.
I agree that the hosts seem overloaded, by the
19.2.2 Installed!
# ceph -s
  cluster:
    id:     ,,,
    health: HEALTH_ERR
            27 osds(s) are not reachable
...
  osd: 27 osds: 27 up (since 32m), 27 in (since 5w)
...
It's such a 'bad look' for something so visible, in such an often-run command.
10/4/25 06:00 PM [ERR] osd.27's public a
Hi,
I just did a new Ceph installation and would like to enable the "read
balancer".
However, the documentation requires that the minimum client version be
reef. I checked this information through "ceph features" and came across the
situation of having 2 luminous clients.
# ceph featur
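For what it's worth, the usual sequence here (once the old clients are
upgraded or gone) is roughly the following; the mon ID below is a
placeholder:

  # see which releases/features the connected clients report
  ceph features
  # on a mon host, the session list shows which client addresses are the old ones
  ceph daemon mon.<id> sessions
  # once no pre-reef clients remain, raise the floor so the read balancer can be enabled
  ceph osd set-require-min-compat-client reef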
Sounds like a discussion for a Discord server. Or BlueSky or something
that's very definitely NOT what used to be known as Twitter.
My viewpoint is a little different. I really didn't consider HIPAA
stuff, although since technically that is info that shouldn't be
accessible to anyone but autho
Super cool idea - I too wanted to refer to blockchain methods to avoid data
being tampered with.
Ceph would need a completely different distribution coded for such storage,
though one could say that the fundamentals are already in place?
Best,
Laimis J.
> On 7 Apr 2025, at 18:23, Tim Holloway wrot
Hi everyone,
On today's Ceph Steering Committee call we discussed the idea of removing
the diskprediction_local mgr module, as the current prediction model is
obsolete and not maintained.
We would like to gather feedback from the community about the usage of this
module, and find out if anyone is
Hi Alex,
"Cost concerns" is the fig leaf that is being used in many cases, but
often a closer look indicates political motivations.
The current US administration is actively engaged in the destruction of
anything that would conflict with their view of the world. That includes
health practic
I edited the monmap to include only rgw2-06 and then followed
https://docs.ceph.com/en/squid/rados/operations/add-or-rm-mons/#adding-a-monitor-manual
to create a new monitor.
Unfortunately, `ceph-mon -i mon.rgw2-06 --public-addr 10.127.239.63 -f`
crashed with the traceback seen in the attachment.
We're happy to announce the 5th point release in the Reef series.
We recommend that users update to this release.
For detailed release notes with links & changelog please refer to the
official blog entry at https://ceph.io/en/news/blog/2025/v18-2-5-reef-released/
Notable Changes
---
*
> I have 4 nodes with 112 OSDs each [...]
As an aside, I reckon that is not such a good idea, as Ceph was
designed for one small OSD per small server and lots of them,
but lots of people of course know better.
> Maybe you can gimme a hint how to struggle it over?
That is not so much a Ceph questi
That was my assumption, yes.
Quoting Alex:
Is this bit of code responsible for hardcoding DEBUG to cephadm.log?
'loggers': {
    '': {
        'level': 'DEBUG',
        'handlers': ['console', 'log_file'],
    }
}
in /var/lib/ceph//cephadm.* ?
I think it's the same block of code Eugen found.
We're happy to announce the 2nd backport release in the Squid series.
https://ceph.io/en/news/blog/2025/v19-2-2-squid-released/
Notable Changes
---
- This hotfix release resolves an RGW data loss bug when CopyObject is
used to copy an object onto itself.
S3 clients typically do this
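(For context, the self-copy pattern referred to above is typically something
like the following, shown with the AWS CLI against a hypothetical bucket and
key; it is the common way to rewrite an object's metadata in place:)

  # copy an object onto itself, replacing its metadata
  aws s3api copy-object \
      --bucket mybucket --key mykey \
      --copy-source mybucket/mykey \
      --metadata-directive REPLACE \
      --metadata newkey=newvalue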
Hey all,
Just confirming that the same debug level has been in Reef and Squid.
We got so used to it that just decided not to care anymore.
Best,
Laimis J.
> On 8 Apr 2025, at 14:21, Alex wrote:
>
> Interesting. So it's like that for everybody?
> Meaning cephadm.log logs debug messages.
>
Is this bit of code responsible for hardcoding DEBUG to cephadm.log?
'loggers': {
    '': {
        'level': 'DEBUG',
        'handlers': ['console', 'log_file'],
    }
}
in /var/lib/ceph//cephadm.* ?
It depends a bit. Which mon do the OSDs still know about? You can
check /var/lib/ceph//osd.X/config to retrieve that piece of
information. I'd try to revive one of them.
Do you still have the mon store.db for all of the mons or at least one
of them? Just to be safe, back up all the store.db d
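Something along these lines (the paths contain the cluster fsid, shown here
as a placeholder):

  # which mon(s) does this OSD still know about?
  grep mon_host /var/lib/ceph/<fsid>/osd.X/config
  # back up every surviving mon store.db before touching anything
  cp -a /var/lib/ceph/<fsid>/mon.<id>/store.db /root/mon-store-backup-<id>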
Again, thank you very much for your help!
The container is not there any more, but I discovered that the "old" mon
data still exists. I have the same situation for two mons I removed at
the same time:
$ monmaptool --print monmap1
monmaptool: monmap file monmap1
epoch 29
fsid 6d0d4ed4-0052-4eb9-9
>> anthonydatri@Mac models % pwd
>> /Users/anthonydatri/git/ceph/src/pybind/mgr/diskprediction_local/models
>> anthonydatri@Mac models % file redhat/*
>> redhat/config.json: JSON data
>> redhat/hgst_predictor.pkl: data
>> redhat/hgst_scaler.pkl: data
>> redhat/seagate_predictor
+1
I wasn't aware that this module is obsolete and was trying to start it a
few weeks ago.
We developed a home-made solution some time ago to monitor SMART data from
both HDDs (uncorrected errors, grown defect list) and SSDs (WLC/TBW). But
keeping it up to date with non-unified disk models is a nigh
It can work, but it might be necessary to modify the monmap first,
since it's complaining that it has been removed from it. Are you
familiar with monmaptool
(https://docs.ceph.com/en/latest/man/8/monmaptool/)?
The procedure is similar to changing a monitor's IP address the "messy
way
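Roughly, the sequence is to extract the current map from the mon's data
directory, remove/add entries, and inject it back while the mon is stopped.
A sketch (the mon names and address are the ones from this thread; in a
cephadm setup the commands would run inside "cephadm shell --name mon.X"):

  # extract a monmap from a surviving mon's store (mon must be stopped)
  ceph-mon -i rgw2-06 --extract-monmap /tmp/monmap
  # inspect and edit it
  monmaptool --print /tmp/monmap
  monmaptool --rm ceph2-01 /tmp/monmap
  monmaptool --add rgw2-06 10.127.239.63:6789 /tmp/monmap
  # inject it back and start the mon again
  ceph-mon -i rgw2-06 --inject-monmap /tmp/monmap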
More complete description:
1-) I formatted and installed the operating system
2-) This is "ceph installed":
curl --silent --remote-name --location https://download.ceph.com/rpm-19.2.1/el9/noarch/cephadm
chmod +x cephadm
./cephadm add-repo --release squid
./cephadm install
cephadm -v bootstrap
I did have to add "su root root" to the logrotate script to fix the
permissions issue.
There's a RH KB article and Ceph GitHub pull requests to fix it.
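For anyone hitting the same thing, the change is just the su directive in the
logrotate file cephadm installs; the other directives below are only what I
recall that file containing, so treat them as illustrative:

  # /etc/logrotate.d/cephadm
  /var/log/ceph/cephadm.log {
      su root root
      rotate 7
      daily
      compress
      missingok
  }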
Thank you very much! I now started the first step, namely "Collect the
map from each OSD host". As I have a cephadm deployment, I will have to
execute ceph-objectstore-tool within each container. Unfortunately, this
produces the error "Mount failed with '(11) Resource temporarily
unavailable'". Doe
I realized I have access to a data directory of a monitor I removed
just before the oopsie happened. Can I launch a ceph-mon from that? If I
just try to launch ceph-mon, it commits suicide:
2025-04-10T19:32:32.174+0200 7fec628c5e00 -1 mon.mon.ceph2-01@-1(???)
e29 not in monmap and have been in a
That's quite a large number of storage units per machine.
My suspicion is that since you apparently have an unusually high number
of LVs coming online at boot, the time it takes to linearly activate
them is long enough to overlap with the point in time that ceph starts
bringing up its storage-
Hi,
Has it worked for any other Glance image? The snapshot shouldn't make
any difference, I just tried the same in a lab cluster. Have you
checked on the client side (OpenStack) for anything in dmesg etc.? Can
you query any information from that image? For example:
rbd info images_meta/im
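Other quick checks in the same vein (the image spec is a placeholder):

  rbd snap ls <pool>/<image>
  rbd status <pool>/<image>    # shows watchers, i.e. clients holding the image open
  rbd du <pool>/<image>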
Thanks!
No, didn't issue any commands to the OSDs.
On 2025-04-10 17:28, Eugen Block wrote:
Did you stop the OSDs?
Quoting Jonas Schwab:
Thank you very much! I now started the first step, namely "Collect the
map from each OSD host". As I have a cephadm deployment, I will have to
execute ceph-object
No, you have to run the objectstore-tool command within the cephadm shell:
cephadm shell --name osd.x -- ceph-objectstore-tool
There are plenty of examples online. I’m on my mobile phone right now
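A fuller sketch of that step (OSD id, fsid and paths are placeholders; the
OSD has to be stopped first, and whether --no-mon-config is needed may vary):

  # stop the OSD, then run the tool in its container context via cephadm shell
  systemctl stop ceph-<fsid>@osd.12.service
  cephadm shell --name osd.12 -- \
      ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
      --no-mon-config --op update-mon-db --mon-store-path /tmp/mon-store
  # note: /tmp/mon-store lives inside the container, so copy it out (or mount
  # a host directory) before the shell exits, and accumulate it across OSDs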
Quoting Jonas Schwab:
Thank you for the help! Does that mean stopping the container and
mountin
On 4/10/25 10:01 AM, Jonas Schwab wrote:
Hello everyone,
I believe I accidentally nuked all monitors of my cluster (please
don't ask how). Is there a way to recover from this disaster?
Depends on how thoroughly they were really “nuked.” Are there monitor directories with data
still under /var/lib/ceph/ by any chance
You have to stop the OSDs in order to mount them with the objectstore tool.
Quoting Jonas Schwab:
No, didn't issue any commands to the OSDs.
On 2025-04-10 17:28, Eugen Block wrote:
Did you stop the OSDs?
Quoting Jonas Schwab:
Thank you very much! I now started the first step, namely
Thank you for the help! Does that mean stopping the container and
mounting the LV?
On 2025-04-10 17:38, Eugen Block wrote:
You have to stop the OSDs in order to mount them with the objectstore
tool.
Quoting Jonas Schwab:
No, didn't issue any commands to the OSDs.
On 2025-04-10 17:28, Euge
Did you stop the OSDs?
Quoting Jonas Schwab:
Thank you very much! I now started the first step, namely "Collect the
map from each OSD host". As I have a cephadm deployment, I will have to
execute ceph-objectstore-tool within each container. Unfortunately, this
produces the error "Mount faile
Hi,
This is my first post to the forum and I don't know if it's appropriate,
but I'd like to express my gratitude to all the people working hard on Ceph,
because I think it's a fantastic piece of software.
The problem I'm having is caused by me; we had a well-working CephFS
mirror solution; let's call
Can you bring back at least one of them? In that case you could reduce
the monmap to 1 mon and bring the cluster back up. If the MONs are
really dead, you can recover using OSDs [0]. I've never had to use
that myself, but people have reported that to work.
[0]
https://docs.ceph.com/en/lat
Hi Jonas,
On 4/10/25 at 16:01, Jonas Schwab wrote:
I believe I accidentally nuked all monitors of my cluster (please don't
ask how). Is there a way to recover from this disaster? I have a cephadm
setup.
There is a procedure to recover the MON-DB from the OSDs:
https://docs.ceph.com/en/reef/r
Hello everyone,
I believe I accidentally nuked all monitors of my cluster (please don't
ask how). Is there a way to recover from this disaster? I have a cephadm
setup.
I am very grateful for all help!
Best regards,
Jonas Schwab
Glad I could help! I'm also waiting for 18.2.5 to upgrade our own
cluster from Pacific after getting rid of our cache tier. :-D
Quoting Jeremy Hansen:
This seems to have worked to get the orch back up and put me back to
16.2.15. Thank you. Debating on waiting for 18.2.5 to move forward.
Hi Jonas,
Is swap enabled on OSD nodes?
I've seen OSDs using way more memory than osd_memory_target and being
OOM-killed from time to time just because swap was enabled. If that's the case,
please disable swap in /etc/fstab and reboot the system.
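A minimal sketch of that (the sed expression just comments out fstab lines
whose type field is "swap"):

  # turn off swap immediately
  swapoff -a
  # and comment out the swap entry so it stays off after reboot
  sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab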
Regards,
Frédéric.
Hello Dominique!
The OS is quite new: Ubuntu 22.04 with all the latest upgrades.
Hi Alex,
Which OS? I had the same problem with LVs not being activated automatically on
an older version of Ubuntu. I never found a workaround except upgrading to a
newer release.
> -----Original Message-----
> From: Alex from North
> Sent: Thursday, 10 April 2025 13:17
> To: ce
Hello everybody!
I have 4 nodes with 112 OSDs each, running 18.2.4. The OSDs consist of a DB on SSD and
data on HDD.
For some reason, when I reboot a node, not all OSDs come up because some VGs or LVs
are not active.
To make them active again I manually run vgchange -ay $VG_NAME or lvchange -ay
$LV_NAME.
I suspect