I have another suggestion: check the RAM, just in case, with memtest86
or https://github.com/martinwhitaker/pcmemtest (which is a fork of
memtest86+). Ignore the suggestion if you have ECC RAM.
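If booting into a dedicated memory tester is not convenient right away, a rough first pass can be done from the running OS (a sketch, assuming a Linux node with EDAC reporting and, optionally, the memtester package installed):

  # look for machine-check / memory errors reported by the kernel
  dmesg | grep -iE 'mce|edac|memory error'
  # stress-test a chunk of currently free RAM from userspace (size here is just an example)
  memtester 2G 1

Neither replaces a full offline memtest pass, but they can catch obvious problems quickly.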
Tue, 22 Feb 2022 at 15:45, Igor Fedotov:
>
> Hi Sebastian,
>
> On 2/22/2022 3:01 AM, Sebastian Mazza wrote:
Thank you Matt, Etienne, and Frank for your great advice. I'm going to
set up a small test cluster to familiarize myself with the process
before making the change on my production environment. Thank you all
again, I really appreciate it!
Jason
On 2022-02-21 17:58, Jason Borden wrote:
> Hi all
Need to work out why the 4 aren’t starting then.
First I would check that they are showing up at the OS layer via dmesg or fdisk etc.
If you can see the correct number of disks on each node, then check the service
status / Ceph logs for each OSD.
Depending on how you set up the cluster/OSDs depends on the l
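For example, something along these lines on each node (assuming systemd-managed OSDs and the default log locations; cephadm deployments use per-FSID unit names instead):

  # confirm the kernel actually sees the disks
  dmesg | grep -iE 'sd[a-z]|nvme'
  lsblk
  # check the OSD service and its logs (replace <id> with the OSD number)
  systemctl status ceph-osd@<id>
  journalctl -u ceph-osd@<id> --since "1 hour ago"
  # or look at /var/log/ceph/ceph-osd.<id>.log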
I should have 10 OSDs, below is the output:
root@ceph-mon1:~# ceph osd tree
ID  CLASS  WEIGHT   TYPE NAME            STATUS  REWEIGHT  PRI-AFF
-1         1.95297  root default
-5         0.78119      host ceph-mon1
 2    hdd  0.19530          osd.2        down           0      1.0
 4    hdd  0.19530
What does ‘ceph osd tree’ show?
How many OSDs should you have, 7 or 10?
> On 22 Feb 2022, at 14:40, Michel Niyoyita wrote:
>
> Actually, one of my colleagues tried to reboot all the nodes and he did not
> prepare them first by setting noout, norecover, etc. Once all the nodes are up,
> the cluster i
Actually, one of my colleagues tried to reboot all the nodes and he did not
prepare them first by setting noout, norecover, etc. Once all the nodes are
up, the cluster is no longer accessible and the above are the messages we are
getting. I did not remove any OSDs, except the ones that are marked down.
below is my ceph.conf:
m
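For reference, the usual preparation before a planned whole-cluster reboot is to set the maintenance flags first and clear them once everything is back up, roughly:

  # before rebooting the nodes
  ceph osd set noout
  ceph osd set norebalance
  # (optionally also: ceph osd set norecover / nobackfill)
  # after all nodes are back and all OSDs are up again
  ceph osd unset norebalance
  ceph osd unset noout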
You have 1 OSD offline. Has this disk failed, or are you aware of what has caused
it to go offline?
It shows you have 10 OSDs but only 7 in; have you removed the other 3? Was the
data fully drained off these first?
I see you have 11 pools. What are these set up as, in terms of type and min/max size?
> On 22 Feb 2
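Pool type and replication settings can be checked with, for example:

  ceph osd pool ls detail
  # shows replicated vs erasure-coded, plus size and min_size for each pool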
Hello team,
below are the details when I run ceph osd dump:
pg_temp 11.11 [7,8]
blocklist 10.10.29.157:6825/1153 expires 2022-02-23T04:55:01.060277+
blocklist 10.10.29.157:0/176361525 expires 2022-02-23T04:55:01.060277+
blocklist 10.10.29.156:0/815007610 expires 2022-02-23T04:54:56.05665
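Blocklist entries normally expire on their own, but they can be inspected and, if you are sure the clients in question are gone, removed (commands as in recent releases; older releases use "blacklist" instead of "blocklist"):

  ceph osd blocklist ls
  # remove a single entry, e.g. one of the addresses from the dump above
  ceph osd blocklist rm 10.10.29.157:0/176361525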
Dear Ceph Users,
Kindly help me repair my cluster. It has been down since yesterday and I have
not been able to bring it back up. Below are some findings:
  id:     6ad86187-2738-42d8-8eec-48b2a43c298f
  health: HEALTH_ERR
          mons are allowing insecure global_id reclaim
          1/3
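The "insecure global_id reclaim" warning is separate from the outage itself; once every client and daemon is on a patched release it can be closed out with (a sketch, not something to do while unpatched clients remain):

  ceph config set mon auth_allow_insecure_global_id_reclaim false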
Hello
We have a new Pacific cluster configured via Cephadm.
For the OSDs, the spec is like this, with the intention for DB and WAL to
be on NVMe:
spec:
  data_devices:
    rotational: true
  db_devices:
    model: SSDPE2KE032T8L
  filter_logic: AND
  objectstore: bluestore
  wal_devices:
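For what it's worth, a spec like this is normally saved to a file and applied through the orchestrator; a dry run shows how cephadm would group the devices before anything is created (the file name here is just an example):

  ceph orch apply -i osd-spec.yaml --dry-run
  ceph orch apply -i osd-spec.yaml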
Hi Sebastian,
On 2/22/2022 3:01 AM, Sebastian Mazza wrote:
Hey Igor!
thanks a lot for the new logs - looks like they provide some insight.
I'm glad the logs are helpful.
At this point I think the root cause is apparently a race between deferred
writes replay and some DB maintenance task
Hi Yuval!
You are correct, I looked at the “optional” fields and didn’t realize that the
parent parameter was optional.
I see that there is a check for whether CopyFrom is nil in the example; I’m
going to try something like that.
I mentioned the Request.User field because of this from the