Hi,

I don’t have a fourth machine available, so that’s unfortunately not an option.

I did enable a lot of debugging earlier, but it shows no information about why things are not working as expected.

Proxmox just deploys the mons, nothing fancy there, no special cases.

Can anyone confirm that mons with an ancient (2017) leveldb database should accept ‘mon.$hostname’ names as well as ‘mon.$id’?
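For anyone wanting to compare: the names the cluster currently has for its mons can be read straight from the monmap. A minimal sketch (assumes a working admin keyring on the node; these are standard ceph/monmaptool commands, not anything Proxmox-specific):

```shell
# Show the monmap as the quorum sees it; the trailing "mon.<name>"
# on each entry is the name the daemon must be started with.
ceph mon dump

# Or save a copy and inspect it offline with monmaptool:
ceph mon getmap -o /tmp/monmap
monmaptool --print /tmp/monmap
```

In our case this shows ranks 0 and 1 named "0" and "1" (the old $id style) and rank 2 named "proxmox03", which is exactly the mix in question.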

—
Mark Schouten
CTO, Tuxis B.V.
+31 318 200208 / m...@tuxis.nl


------ Original Message ------
From "Eugen Block" <ebl...@nde.ag>
To ceph-users@ceph.io
Date 31/01/2024, 13:02:04
Subject [ceph-users] Re: Cannot recreate monitor in upgrade from pacific to quincy (leveldb -> rocksdb)

Hi Mark,

as I'm not familiar with proxmox I'm not sure what happens under the hood. 
There are a couple of things I would try, not necessarily in this order:

- Check the troubleshooting guide [1]; for example, a clock skew could be one 
reason. Have you verified ntp/chronyd functionality?
- Inspect debug log output, maybe first on the probing mon, and if those don't 
reveal the reason, enable debug logs for the other MONs as well:
ceph config set mon.proxmox03 debug_mon 20
ceph config set mon.proxmox03 debug_paxos 20

or for all MONs:
ceph config set mon debug_mon 20
ceph config set mon debug_paxos 20

- Try to deploy an additional MON on a different server (if you have more 
available) and see if that works.
- Does proxmox log anything?
- Maybe as a last resort, try to start a MON manually after adding it to the 
monmap with the monmaptool, but only if you know what you're doing. I wonder 
if the monmap doesn't get updated...
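To flesh out that last point, a rough sketch of the manual monmap route. The hostname and IP below are taken from this thread; adapt them, and only attempt this with good backups of the mon store, since injecting a bad map can make things worse:

```shell
# Stop the probing mon before touching its store.
systemctl stop ceph-mon@proxmox03

# Fetch the monmap from the surviving quorum and inspect it.
ceph mon getmap -o /tmp/monmap
monmaptool --print /tmp/monmap

# If proxmox03 is missing, or present under a stale name/address,
# correct the map:
monmaptool --rm proxmox03 /tmp/monmap            # drop a stale entry, if any
monmaptool --add proxmox03 10.10.10.3:6789 /tmp/monmap

# Inject the corrected map into the mon's store and start it again.
ceph-mon -i proxmox03 --inject-monmap /tmp/monmap
systemctl start ceph-mon@proxmox03
```

This mirrors the monmap-recovery procedure in the Ceph troubleshooting docs; whether it applies here depends on whether the monmap really is the thing not being updated.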

Regards,
Eugen

[1] https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/

Quoting Mark Schouten <m...@tuxis.nl>:

Hi,

During an upgrade from pacific to quincy, we needed to recreate the mons 
because they were quite old and still using leveldb.

So step one was to destroy one of the mons. After that we recreated the 
monitor, and although it starts, it remains in state ‘probing’, as you can see 
below.

No matter what I tried, it won’t come up. I’ve seen quite a few reports that 
the MTU might be an issue, but that seems to be OK:
root@proxmox03:/var/log/ceph# fping -b 1472 10.10.10.{1..3} -M
10.10.10.1 is alive
10.10.10.2 is alive
10.10.10.3 is alive


Does anyone have an idea how to fix this? I’ve tried destroying and recreating 
the mon a few times now. Could it be that the leveldb mons only support 
mon.$id notation for the monitors?

root@proxmox03:/var/log/ceph# ceph daemon mon.proxmox03 mon_status
{
    "name": "proxmox03",
    "rank": 2,
    "state": "probing",
    "election_epoch": 0,
    "quorum": [],
    "features": {
        "required_con": "2449958197560098820",
        "required_mon": [
            "kraken",
            "luminous",
            "mimic",
            "osdmap-prune",
            "nautilus",
            "octopus",
            "pacific",
            "elector-pinging"
        ],
        "quorum_con": "0",
        "quorum_mon": []
    },
    "outside_quorum": [
        "proxmox03"
    ],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 0,
        "fsid": "39b1e85c-7b47-4262-9f0a-47ae91042bac",
        "modified": "2024-01-23T21:02:12.631320Z",
        "created": "2017-03-15T14:54:55.743017Z",
        "min_mon_release": 16,
        "min_mon_release_name": "pacific",
        "election_strategy": 1,
        "disallowed_leaders: ": "",
        "stretch_mode": false,
        "tiebreaker_mon": "",
        "removed_ranks: ": "2",
        "features": {
            "persistent": [
                "kraken",
                "luminous",
                "mimic",
                "osdmap-prune",
                "nautilus",
                "octopus",
                "pacific",
                "elector-pinging"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "0",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.10.10.1:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.10.10.1:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.10.10.1:6789/0",
                "public_addr": "10.10.10.1:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            },
            {
                "rank": 1,
                "name": "1",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.10.10.2:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.10.10.2:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.10.10.2:6789/0",
                "public_addr": "10.10.10.2:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            },
            {
                "rank": 2,
                "name": "proxmox03",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.10.10.3:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.10.10.3:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.10.10.3:6789/0",
                "public_addr": "10.10.10.3:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            }
        ]
    },
    "feature_map": {
        "mon": [
            {
                "features": "0x3f01cfbdfffdffff",
                "release": "luminous",
                "num": 1
            }
        ]
    },
    "stretch_mode": false
}

—
Mark Schouten
CTO, Tuxis B.V.
+31 318 200208 / m...@tuxis.nl
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


