Awesome, glad to hear it worked!
Regarding your question of whether you should upgrade further: it's not
a simple "yes" or "no". Do you need features or bug fixes from Squid
that are missing in Reef?
Reef is still supported, but it was announced just yesterday that it
will go EOL in August.
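If you do decide to move on to Squid, the cephadm flow is the same as the Reef
upgrade you just finished. A minimal sketch, assuming 19.2.2 is the current
Squid point release when you run it (check the release notes first):
# Confirm every daemon is on the same Reef release before starting
$ ceph versions
# Start the upgrade to a specific Squid point release
$ ceph orch upgrade start --ceph-version 19.2.2
# Watch progress
$ ceph orch upgrade status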
Just to follow this through, 18.2.6 fixed my issues and I was able to complete
the upgrade. Is it advisable to go to 19 or should I stay on reef?
-jeremy
> On Monday, Apr 14, 2025 at 12:14 AM, Jeremy Hansen wrote:
> Thanks. I’ll wait. I need this to go smoothly on another cluster that
> has to go through the same process.
Ah, this looks like the encryption issue which seems new in 18.2.5,
brought up here:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/UJ4DREAWNBBVVUJXYVZO25AYVQ5RLT42/
In that case it's questionable whether you really want to upgrade to
18.2.5. Maybe 18.2.4 would be more suitable.
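If you'd rather avoid 18.2.5 for now, a sketch of pinning the upgrade to
18.2.4 instead (stopping a running upgrade first, if one is already in flight):
# Stop an in-progress upgrade, if any
$ ceph orch upgrade stop
# Target the 18.2.4 image explicitly
$ ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.4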
Thanks. I’ll wait. I need this to go smoothly on another cluster that has to go
through the same process.
-jeremy
> On Monday, Apr 14, 2025 at 12:10 AM, Eugen Block wrote:
> Ah, this looks like the encryption issue which seems new in 18.2.5,
> brought up here:
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/UJ4DREAWNBBVVUJXYVZO25AYVQ5RLT42/
Are you using Rook? Usually, I see this warning when a host is not
reachable, for example during a reboot. But it also clears when the
host comes back. Do you see this permanently or from time to time? It
might have to do with the different Ceph versions, I'm not sure. But
it shouldn't be a problem.
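For the host-reachability case, a couple of quick checks as a sketch (the
hostname is just an example taken from the logs further down this thread):
# Is any host marked offline?
$ ceph orch host ls
# Ask cephadm to verify connectivity to a specific host
$ ceph cephadm check-host cn02.ceph.xyz.corp
# Full text of the current warnings
$ ceph health detail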
I haven’t attempted the remaining upgrade just yet. I wanted to check on this
before proceeding. Things seem “stable” in the sense that I’m running VMs and
all volumes and images are still functioning. I’m using whatever would have
been the default from 16.2.14. It seems to happen from time to time.
This looks relevant.
https://github.com/rook/rook/issues/13600#issuecomment-1905860331
> On Sunday, Apr 13, 2025 at 10:08 AM, Jeremy Hansen wrote:
> I’m now seeing this:
>
>   cluster:
>     id:     95f49c1c-b1e8-11ee-b5d0-0cc47a8f35c1
>     health: HEALTH_WARN
>             Failed to apply 1 service(s): osd.cost_capacity
I’m now seeing this:
  cluster:
    id:     95f49c1c-b1e8-11ee-b5d0-0cc47a8f35c1
    health: HEALTH_WARN
            Failed to apply 1 service(s): osd.cost_capacity
I’m assuming this is due to the fact that I’ve only upgraded mgr but I wanted
to double check before proceeding with the rest of the components.
Thanks
-jer
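For what it's worth, a few commands that usually show why cephadm failed to
apply a service spec (osd.cost_capacity is the service name from the warning
above; this is a sketch for digging in, not a fix):
# Which check is failing, exactly?
$ ceph health detail
# State of the OSD service specs cephadm is trying to apply
$ ceph orch ls osd
# Recent cephadm log entries often contain the actual apply error
$ ceph log last 50 info cephadm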
Updating mgr’s to 18.2.5 seemed to work just fine. I will go for the remaining
services after the weekend. Thanks.
-jeremy
> On Thursday, Apr 10, 2025 at 6:37 AM, Eugen Block wrote:
> Glad I could help! I'm also waiting for 18.2.5 to upgrade our own
> cluster from Pacific after getting rid of our cache tier. :-D
Glad I could help! I'm also waiting for 18.2.5 to upgrade our own
cluster from Pacific after getting rid of our cache tier. :-D
Quoting Jeremy Hansen:
This seems to have worked to get the orch back up and put me back to
16.2.15. Thank you. Debating on waiting for 18.2.5 to move forward.
This seems to have worked to get the orch back up and put me back to 16.2.15.
Thank you. Debating on waiting for 18.2.5 to move forward.
-jeremy
> On Monday, Apr 07, 2025 at 1:26 AM, Eugen Block wrote:
> Still no, just edit the unit.run file for the MGRs to use a different image.
Got it. Thank you. I forgot that you mentioned the run files. I’ll hold off
a bit to see if there are more comments, but I feel like I at least have
things to try. Thanks again.
-jeremy
> On Monday, Apr 07, 2025 at 1:26 AM, Eugen Block wrote:
> Still no, just edit the unit.run file for the MGRs to use a different image.
Still no, just edit the unit.run file for the MGRs to use a different
image. See Frédéric's instructions (now that I'm re-reading it,
there's a little mistake with dots and hyphens):
# Backup the unit.run file
$ cp /var/lib/ceph/$(ceph fsid)/mgr.ceph01.eydqvm/unit.run{,.bak}
# Change container image
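The quoted instructions are cut off here; a rough sketch of the remaining
steps, assuming unit.run references the v17.2.0 tag (adjust the pattern if it
references an image digest instead) and reusing the same example daemon name:
# Point the mgr's unit.run at the Pacific image
$ sed -i 's#quay.io/ceph/ceph:v17.2.0#quay.io/ceph/ceph:v16.2.15#g' \
    /var/lib/ceph/$(ceph fsid)/mgr.ceph01.eydqvm/unit.run
# Restart the daemon so it starts from the older image
$ systemctl restart ceph-$(ceph fsid)@mgr.ceph01.eydqvm.service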
Thank you. The only thing I’m unclear on is the rollback to Pacific.
Are you referring to
https://docs.ceph.com/en/quincy/cephadm/troubleshooting/#manually-deploying-a-manager-daemon ?
Thank you. I appreciate all the help. Should I wait for Adam to comment? At the
moment, the cluster is functioning.
I haven't tried it this way yet, and I had hoped that Adam would chime
in, but my approach would be to remove this key (it's not present when
no upgrade is in progress):
ceph config-key rm mgr/cephadm/upgrade_state
Then roll back the two newer MGRs to Pacific as described before. If
they co
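Before and after removing the key, these two commands should confirm whether
cephadm still thinks an upgrade is in progress (standard commands, though
whether this alone unsticks the orchestrator is exactly the open question here):
# Returns an error once the key has been removed
$ ceph config-key exists mgr/cephadm/upgrade_state
# Should report that no upgrade is in progress
$ ceph orch upgrade status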
Snipped some of the irrelevant logs to keep message size down.
ceph config-key get mgr/cephadm/upgrade_state
{"target_name": "quay.io/ceph/ceph:v17.2.0", "progress_id":
"e7e1a809-558d-43a7-842a-c6229fdc57af", "target_id":
"e1d6a67b021eb077ee22bf650f1a9fb1980a2cf5c36bdb9cba9eac6de8f702d9",
"tar
The mgr logs should contain a stack trace; can you check again?
Quoting Jeremy Hansen:
No. Same issue unfortunately.
On Saturday, Apr 05, 2025 at 1:27 PM, Eugen Block wrote:
I don't see the module error in the logs you provided. Has it cleared?
A mgr failover is often helpful in such cases.
[ceph: root@cn01 /]# ceph orch host ls
Error ENOENT: Module not found
-jeremy
> On Saturday, Apr 05, 2025 at 1:27 PM, Eugen Block wrote:
> I don't see the module error in the logs you provided. Has it cleared?
> A mgr failover is often helpful in such cases; it's basically the
> first thing I've suggested for the last two or three years.
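That ENOENT usually means the active mgr isn't running its orchestrator/cephadm
module properly; a couple of quick checks, as a sketch (they only diagnose,
they don't fix a crashed module):
# Is the cephadm/orchestrator module listed as enabled on the active mgr?
$ ceph mgr module ls
# Failing over to a standby mgr sometimes brings the module back
$ ceph mgr fail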
No. Same issue unfortunately.
> On Saturday, Apr 05, 2025 at 1:27 PM, Eugen Block wrote:
> I don't see the module error in the logs you provided. Has it cleared?
> A mgr failover is often helpful in such cases; it's basically the
> first thing I've suggested for the last two or three years.
I don't see the module error in the logs you provided. Has it cleared?
A mgr failover is often helpful in such cases; it's basically the
first thing I've suggested for the last two or three years.
Quoting Jeremy Hansen:
Thank you so much for the detailed instructions. Here’s logs from the
failover to a new node.
Thank you so much for the detailed instructions. Here’s logs from the failover
to a new node.
Apr 05 20:06:08 cn02.ceph.xyz.corp
ceph-95f49c1c-b1e8-11ee-b5d0-0cc47a8f35c1-mgr-cn02-ceph-xyz-corp-ggixgj[2357414]:
:::192.168.47.72 - - [05/Apr/2025:20:06:08] "GET /metrics HTTP/1.1" 200 -
"" "P
Please don't drop the list from your response.
The orchestrator might be down, but you should have an active mgr
(ceph -s). On the host running the mgr, you can just run 'cephadm logs
--name mgr.<daemon-name>', or 'journalctl -u ceph-<fsid>@mgr.<daemon-name>'.
To get fresh logs, you can just run 'ceph mgr fail', look on which host
the new active mgr is running, and check the logs there.
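Filling in the placeholders with the fsid and mgr daemon name that appear
elsewhere in this thread (the daemon name is reconstructed from the container
name in the log above; adjust to whichever mgr is currently active):
# Fail over so the standby mgr becomes active and writes a fresh startup log
$ ceph mgr fail
# See which mgr is active now
$ ceph -s
# On that host, pull the daemon's logs
$ cephadm logs --name mgr.cn02.ceph.xyz.corp.ggixgj
# Or via systemd directly
$ journalctl -u ceph-95f49c1c-b1e8-11ee-b5d0-0cc47a8f35c1@mgr.cn02.ceph.xyz.corp.ggixgj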
It might be a different issue, can you paste the entire stack trace
from the mgr when it's failing?
Also, you could go directly from Pacific to Reef; there's no need to
upgrade to Quincy (which is EOL). And I would also recommend upgrading
to the latest point release of a major version, e.g. 17.2.8, not .0.
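As a sketch of what that could look like (18.2.4 is only an example of a Reef
point release; pick whatever is current, and note the pre-flight check needs a
reasonably recent cephadm):
# Optional pre-flight: see which daemons the target image would touch
$ ceph orch upgrade check --image quay.io/ceph/ceph:v18.2.4
# Upgrade straight from Pacific to a Reef point release
$ ceph orch upgrade start --ceph-version 18.2.4
# Follow along in the cephadm log channel
$ ceph -W cephadm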