The rbd_info, rbd_directory objects will remain until you delete the
pool; you don't need to clean them up, e.g. if you decide to create
new rbd images in there.
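If you want to double-check what's actually left over, something like
this should do it (just a sketch; the pool name "rbd" is only an
example, adjust to yours):

  # list what is left in the pool and count the objects
  rados -p rbd ls | head -20
  rados -p rbd ls | wc -l
  # per-pool object counts as Ceph reports them
  ceph df detail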
The number of remaining objects usually slowly decreases depending on
the amount of data that was deleted. Just last week I deleted
Hi,
I have a health warning regarding a full pool:
health: HEALTH_WARN
1 pool(s) full
This is the pool that is complaining:
Ceph df:
NAME ID USED   %USED MAX AVAIL OBJECTS
k8s  8  200GiB 0.22
Hi,
I have a large omap object in one of my clusters under Luminous 12.2.8.
HEALTH_WARN 1 large omap objects
LARGE_OMAP_OBJECTS 1 large omap objects
1 large objects found in pool 'default.rgw.log'
Search the cluster log for 'Large omap object found' for more details.
In my setup the ha-pr
We upgraded our Jewel cluster to Nautilus a few months ago and I've noticed
that op behavior has changed. This is an HDD cluster (NVMe journals and
NVMe CephFS metadata pool) with about 800 OSDs. When on Jewel and running
WPQ with the high cut-off, it was rock solid. When we had recoveries going
on
Hi Andras,
To me it looks like the osd.0 is not peering when it starts with crush weight 0.
I would try forcing the re-peering with `ceph osd down osd.0` when the
PGs are unexpectedly degraded. (E.g. start the osd when crush weight is
0, then observe the PGs are still degraded, then force the re-p
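Roughly like this (only a sketch, reusing osd.0 from the example above):

  # mark the OSD down so it re-peers when it comes back up
  ceph osd down osd.0
  # then check whether the degraded PG count drops after peering
  ceph pg dump pgs_brief | grep -c degraded
  ceph -s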
On Wed 20 May 2020 at 05:23, Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote:
> LARGE_OMAP_OBJECTS 1 large omap objects
> 1 large objects found in pool 'default.rgw.log'
> When I look for this large omap object, this is the one:
> for i in `ceph pg ls-by-pool default.rgw.log | tail -n +2
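Something along these lines usually finds it (just a sketch, not
necessarily the exact loop quoted above; it simply counts omap keys per
object in the pool, which can be slow on large pools):

  for obj in $(rados -p default.rgw.log ls); do
    n=$(rados -p default.rgw.log listomapkeys "$obj" 2>/dev/null | wc -l)
    [ "$n" -gt 0 ] && echo "$n $obj"
  done | sort -rn | head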
Okay, so the OSDs are in fact not full; it's strange that the pool
is still reported as full. Maybe restart the mgr services?
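E.g. by failing over to the standby, something like this (just a
sketch; the systemd unit name will differ on containerized/cephadm
deployments, and jq is assumed to be installed):

  # fail the active mgr so the standby takes over
  ceph mgr fail $(ceph mgr dump | jq -r .active_name)
  # or restart it directly on the mgr host
  systemctl restart ceph-mgr@$(hostname -s)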
Quoting "Szabo, Istvan (Agoda)":
Yeah, sorry:
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
12 ssd 2.29799 1.0 2.30TiB 2.67GiB 2.30TiB 0.
Hi Robert,
Since you didn't mention -- are you using osd_op_queue_cut_off low or
high? I know you are usually advocating high, but the default is still
low and most users don't change this setting.
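For reference, checking and changing it would look roughly like this
(a sketch only; the option typically only takes effect after an OSD
restart):

  # what a running OSD currently uses
  ceph daemon osd.0 config get osd_op_queue_cut_off
  # set it cluster-wide in the config database (Nautilus and later)
  ceph config set osd osd_op_queue_cut_off high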
Cheers, Dan
On Wed, May 20, 2020 at 9:41 AM Robert LeBlanc wrote:
>
> We upgraded our Jewel clus
I just upgraded a cephadm cluster from 15.2.1 to 15.2.2.
Everything went fine on the upgrade, however after restarting one node that has
3 OSDs for ecmeta, two of the 3 OSDs now won't boot with the following error:
May 20 08:29:42 sn-m01 bash[6833]: debug 2020-05-20T08:29:42.598+
7fbcc4
Does anyone know anything about this?
Tiered pools should be able to do this for you.
It has been discouraged as a performance gain in some cases (i.e. the reverse
case, when you have spinning drives and want to put an SSD pool in front of
them to get SSD performance at HDD price/capacity), but if you do it for
migrations it should probably be worth it.
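The rough outline, based on the cache-tiering docs (a sketch only;
"basepool"/"cachepool" are placeholders, and for a migration the exact
cache-mode and ordering need checking against the docs first):

  # attach a cache tier in front of the base pool
  ceph osd tier add basepool cachepool
  ceph osd tier cache-mode cachepool writeback
  ceph osd tier set-overlay basepool cachepool
  # later, to drain and detach it again
  ceph osd tier cache-mode cachepool readproxy
  rados -p cachepool cache-flush-evict-all
  ceph osd tier remove-overlay basepool
  ceph osd tier remove basepool cachepool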
Hello,
No, I haven't deleted it; this warning has been there for quite a long time.
ceph health detail
HEALTH_WARN 1 pool(s) full
POOL_FULL 1 pool(s) full
pool 'k8s' is full (no quota)
ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
315TiB 313TiB 2.27TiB 0.72
POOLS:
NA
Hi Khodayar,
Setting placement policies is probably not what you're looking for.
I've used placement policies successfully to separate an HDD pool from an
SSD pool. However, once set, this policy only applies to new data. You
would have to read it out and write it back in at the s3 level using
Hello All,
We have 6 servers.
Configuration for each server:
1 ssd for mon (only on three servers)
1 ssd 1.9 TB for db/wal
1 nvme 1.6 TB for db/wal
10 SAS hdd 3.6 TB for osd
We decided to create a pool of 30 osd (5x6) with db/wal on ssd and a pool
of 30 (5x6) osd with db/wal on nvme.
S
On Wed 20 May 2020 at 12:00, Ignazio Cassano wrote:
> Hello All,
> We have 6 servers.
> Configuration for each server:
> 1 ssd for mon (only on three servers)
> 1 ssd 1.9 TB for db/wal
> 1 nvme 1.6 TB for db/wal
> 10 SAS hdd 3.6 TB for osd
> We decided to create a pool of 30 osd (5x6) with db/w
Hello Janne, so do you think we must move from 10Gb/s to 40 or 100Gb/s
to make the most of NVMe?
Thanks
Ignazio
On Wed 20 May 2020 at 12:06, Janne Johansson <icepic...@gmail.com> wrote:
> On Wed 20 May 2020 at 12:00, Ignazio Cassano <ignaziocass...@gmail.com> wrote:
>
>> H
Yeah, sorry:
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
12 ssd 2.29799 1.0 2.30TiB 2.67GiB 2.30TiB 0.11 0.16 24
13 ssd 2.29799 1.0 2.30TiB 2.33GiB 2.30TiB 0.10 0.14 21
14 ssd 3.49300 1.0 3.49TiB 2.71GiB 3.49TiB 0.08 0.11 27
27 ssd 2.29799 1.0 2.3
So reading online it looked like a dead-end error, so I recreated the 3 OSDs on
that node and they are now working fine after a reboot.
However I restarted the next server with 3 OSD's and one of them is now facing
the same issue.
Let me know if you need any more logs.
Thanks
On Wed, 20 May 2
On Wed 20 May 2020 at 12:14, Ignazio Cassano wrote:
> Hello Janne, so do you think we must move from 10Gb/s to 40 or 100Gb/s
> to make the most of NVMe?
>
I think there are several factors to weigh in when you need to maximize
performance, from putting the BIOS into performance mode, having as fa
Many thanks, Janne
Ignazio
On Wed 20 May 2020 at 12:32, Janne Johansson <icepic...@gmail.com> wrote:
> On Wed 20 May 2020 at 12:14, Ignazio Cassano <ignaziocass...@gmail.com> wrote:
>
>> Hello Janne, so do you think we must move from 10Gb/s to 40 or 100Gb/s
>> to make the mo
Hi,
Have you looked at the omap keys to see what's listed there?
In our configuration, the radosgw garbage collector uses the
*default.rgw.logs* pool for garbage collection (radosgw-admin zone get
default | jq .gc_pool).
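To see what's actually in there, something like this works (a sketch;
the pool name should match whatever your zone's gc_pool points at, and
the "gc" namespace is an assumption on my side):

  # gc entries are omap keys on the gc shard objects in the log pool
  rados -p default.rgw.log -N gc ls
  for obj in $(rados -p default.rgw.log -N gc ls); do
    echo -n "$obj: "
    rados -p default.rgw.log -N gc listomapkeys "$obj" | wc -l
  done
  # pending gc entries can also be listed via radosgw-admin
  radosgw-admin gc list --include-all | head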
I've seen large omaps in my *default.rgw.logs* pool before when I've
deleted l
As a follow-up to our recent memory problems with OSDs (with high pglog
values:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/LJPJZPBSQRJN5EFE632CWWPK3UMGG3VF/#XHIWAIFX4AXZK5VEFOEBPS5TGTH33JZO
), we also see high buffer_anon values. E.g. more than 4 GB, with "osd
memory target
Thomas,
Yes, you are correct. I would have to move objects manually between (more
than one) buckets if I use "Pool placements and Storage classes"
So you have successfully used this method and it was OK? I may be forced to
use this method because clients need more features than mere cache
tierin
Dear cephers,
I'm sitting with a major ceph outage again. The mon/mgr hosts suffer from a
packet storm of ceph traffic between ceph fs clients and the mons. No idea why
this is happening.
The main problem is that I can't get through to the cluster. Admin commands hang
forever:
[root@gnosis ~]# c
Looks like the immediate danger has passed:
[root@gnosis ~]# ceph status
cluster:
id: e4ece518-f2cb-4708-b00f-b6bf511e91d9
health: HEALTH_WARN
nodown,noout flag(s) set
735 slow ops, oldest one blocked for 3573 sec, daemons
[mon.ceph-02,mon.ceph-03] have sl
Hi Harald,
Any idea what the priority_cache_manager perf counters show? (or you can
also enable debug osd / debug priority_cache_manager) The osd memory
autotuning works by shrinking the bluestore and rocksdb caches to some
target value to try and keep the mapped memory of the process below
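For reference, the relevant numbers can be pulled from the admin socket
roughly like this (a sketch; osd.0 is just an example and jq is assumed
to be installed):

  ceph daemon osd.0 config get osd_memory_target
  ceph daemon osd.0 dump_mempools | jq .mempool.by_pool.buffer_anon
  ceph daemon osd.0 perf dump | jq .prioritycache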
Hi Ashley,
looks like this is a regression. Neha observed similar error(s) during
her QA run, see https://tracker.ceph.com/issues/45613
Please preserve broken OSDs for a while if possible, likely I'll come
back to you for more information to troubleshoot.
Thanks,
Igor
On 5/20/2020 1:26
Thanks, fyi the OSDs that went down back two pools, an Erasure code Meta (RBD)
and a CephFS Meta. The CephFS pool does have compression enabled (I noticed it
mentioned in the ceph tracker)
Thanks
On Wed, 20 May 2020 20:17:33 +0800 Igor Fedotov wrote
Hi Ashley,
looks like
I don't believe compression is related to be honest.
Wondering if these OSDs have standalone WAL and/or DB devices or just a
single shared main device.
Also could you please set debug-bluefs/debug-bluestore to 20 and collect
a startup log for the broken OSD.
Kind regards,
Igor
On 5/20/2020 3:2
lz4 ? It's not obviously related, but I've seen it involved in really
non-obvious ways: https://tracker.ceph.com/issues/39525
-- dan
On Wed, May 20, 2020 at 2:27 PM Ashley Merrick wrote:
>
> Thanks, fyi the OSD's that went down back two pools, an Erasure code Meta
> (RBD) and cephFS Meta. The c
Hi Mark
Thank you for your explanations! Some numbers from this example OSD below.
Cheers
Harry
From dump mempools:
"buffer_anon": {
"items": 29012,
"bytes": 4584503367
},
From perf dump:
"prioritycache": {
"target_bytes": 375
It's a single shared main device.
Sadly I had already rebuilt the failed OSD's to bring me back in the green
after a while.
I have just tried a few restarts and none are failing (seems after a rebuild
using 15.2.2 they are stable?)
I don't have any other servers/OSD's I am willing to risk no
Do you still have any original failure logs?
On 5/20/2020 3:45 PM, Ashley Merrick wrote:
It's a single shared main device.
Sadly I had already rebuilt the failed OSD's to bring me back in the
green after a while.
I have just tried a few restarts and none are failing (seems after a
rebuild usin
Hi,
I've 15.2.1 installed on all machines. On primary machine I executed ceph
upgrade command:
$ ceph orch upgrade start --ceph-version 15.2.2
When I check ceph -s I see this:
progress:
Upgrade to docker.io/ceph/ceph:v15.2.2 (30m)
[=...] (remaining: 8h)
It
Dan, thanks for the info. Good to know.
Failed QA run in the ticket uses snappy though.
And in fact any stuff writing to process memory can introduce data
corruption in a similar manner.
So will keep that in mind but IMO relation to compression is still not
evident...
Kind regards,
Ig
What does
ceph orch upgrade status
show?
On Wed, 20 May 2020 20:52:39 +0800 Gencer W. Genç
wrote
Hi,
I've 15.2.1 installed on all machines. On primary machine I executed ceph
upgrade command:
$ ceph orch upgrade start --ceph-version 15.2.2
When I check ceph -s I see
I attached the log but it was too big and got moderated.
Here is it in a paste bin : https://pastebin.pl/view/69b2beb9
I have cut the log to start from the point of the original upgrade.
Thanks
On Wed, 20 May 2020 20:55:51 +0800 Igor Fedotov wrote
Dan, thanks for the info. Go
Hi Ashley,
$ ceph orch upgrade status
{
"target_image": "docker.io/ceph/ceph:v15.2.2",
"in_progress": true,
"services_complete": [],
"message": ""
}
Thanks,
Gencer.
On 20.05.2020 15:58:34, Ashley Merrick wrote:
What does
ceph orch upgrade status
show?
On Wed, 20 May 20
Does:
ceph versions
show any services yet running on 15.2.2?
On Wed, 20 May 2020 21:01:12 +0800 Gencer W. Genç
wrote
Hi Ashley,$ ceph orch upgrade status
{
"target_image": "docker.io/ceph/ceph:v15.2.2",
"in_progress": true,
"services_complete": [],
"m
Ah yes,
{
"mon": {
"ceph version 15.2.1 (9fd2f65f91d9246fae2c841a6222d34d121680ee) octopus
(stable)": 2
},
"mgr": {
"ceph version 15.2.2 (0c857e985a29d90501a285f242ea9c008df49eb8) octopus
(stable)": 2
},
"osd": {
"ceph version 15.2.1 (9fd2f65f91d9246fa
ceph config set mgr mgr/cephadm/log_to_cluster_level debug
ceph -W cephadm --watch-debug
See if you see anything that stands out as an issue with the upgrade; it seems
it has only completed the two MGR instances
If not:
ceph orch upgrade stop
ceph orch upgrade start --ceph-version 15.2.2
Hi All,
When we enabled *ceph mon enable-msgr2* after the gateway service upgrade, one
of the mon services crashed and never came back. It shows:
/usr/bin/ceph-mon -f --cluster ceph --id mon01 --setuser ceph --setgroup
ceph --debug_monc 20 --debug_ms 5
global_init: error reading config fil
Hi Harald,
Thanks! So you can see from the perf dump that the target bytes are a
little below 4GB, but the mapped bytes are around 7GB. The priority
cache manager has reacted by setting the "cache_bytes" to 128MB which is
the minimum global value and each cache is getting 64MB (the local
m
Hi Ashley,
I see this:
[INF] Upgrade: Target is docker.io/ceph/ceph:v15.2.2 with id
4569944bbW86c3f9b5286057a558a3f852156079f759c9734e54d4f64092be9fa
[INF] Upgrade: It is NOT safe to stop mon.vx-rg23-rk65-u43-130
Does this mean anything to you?
I've also attached full log. See especially af
Yes, I think it's because you're only running two mons, so the script is halting
at a check to stop you ending up with just one running (no backup).
I had the same issue with a single MGR instance and had to add a second to
allow the upgrade to continue. Can you bring up an extra MON?
I have 2 mons and 2 mgrs.
cluster:
id: 7d308992-8899-11ea-8537-7d489fa7c193
health: HEALTH_OK
services:
mon: 2 daemons, quorum vx-rg23-rk65-u43-130,vx-rg23-rk65-u43-130-1 (age 91s)
mgr: vx-rg23-rk65-u43-130.arnvag(active, since 28m), standbys:
vx-rg23-rk65-u43-130-1.pxmyi
Correct, however it will need to stop one to do the upgrade, leaving you with
only one working MON (this is what I would suggest the error means, seeing I had
the same thing when I only had a single MGR); normally it is suggested to have 3
MONs due to quorum.
Do you not have a node you can run a m
This is a 2-node setup. I have no third node :(
I am planning to add more in the future but currently 2 nodes only.
At the moment, is there a --force command for such usage?
On 20.05.2020 16:32:15, Ashley Merrick wrote:
Correct, however it will need to stop one to do the upgrade leaving you with
Thanks!
So for now I can see the following similarities between your case and the
ticket:
1) Single main spinner as an OSD backing device.
2) Corruption happens to RocksDB WAL file
3) OSD has user data compression enabled.
And one more question. For the following line:
May 20 06:05:14 sn-m
Hey Igor,
The OSDs only back two metadata pools, so only hold a couple of MB of data
(hence they were easy and quick to rebuild); they're actually NVMe LVM devices
passed through QEMU into a VM (hence only 10GB and showing as rotational).
I have large 10TB disks that back the EC (RBD/FS) them se
Hi Chris,
could you please share the full log prior to the first failure?
Also if possible please set debug-bluestore/debug bluefs to 20 and
collect another one for failed OSD startup.
Thanks,
Igor
On 5/20/2020 4:39 PM, Chris Palmer wrote:
I'm getting similar errors after rebooting a node.
Hi Khodayar,
> Yes, you are correct. I would have to move objects manually between (more
> than one) buckets if I use "Pool placements and Storage classes"
>
> So you have successfully used this method and it was OK?
>
After we set up the new placement rule in the zone and zonegroups we
modified us
Hi Gencer,
I'm going to need the full mgr log file.
Best,
Sebastian
Am 20.05.20 um 15:07 schrieb Gencer W. Genç:
> Ah yes,
>
> {
> "mon": {
> "ceph version 15.2.1 (9fd2f65f91d9246fae2c841a6222d34d121680ee)
> octopus (stable)": 2
> },
> "mgr": {
> "ceph version 15.2.
Hello,
I have a pool of 300+ OSDs that are all the identical model (Seagate model:
ST1800MM0129, size: 1.64 TiB).
Only 1 OSD crashes regularly, however I cannot identify a root cause.
Based on the output of smartctl the disk is ok.
# smartctl -a -d megaraid,1
/dev/sda
The disk is not OK, look at the output below:
SMART Health Status: HARDWARE IMPENDING FAILURE GENERAL HARD DRIVE
You should replace the disk.
On Wed, May 20, 2020 at 5:11 PM Thomas <74cmo...@gmail.com> wrote:
>
> Hello,
>
> I have a pool of +300 OSDs that are identical model (Seagate model:
> ST1800M
Chris,
got them, thanks!
Investigating
Thanks,
Igor
On 5/20/2020 5:23 PM, Chris Palmer wrote:
Hi Igor
I've sent you these directly as they're a bit chunky. Let me know if
you haven't got them.
Thx, Chris
On 20/05/2020 14:43, Igor Fedotov wrote:
Hi Chris,
could you please share the f
Hi list,
Looking into the diskprediction_local module, I see that it only
predicts a few states: good, warning and bad:
ceph/src/pybind/mgr/diskprediction_local/predictor.py:
if score > 10:
    return "Bad"
if score > 4:
    return "Warning"
return "Good"
The predicted fail date is just a deriva
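For anyone who wants to try it, enabling and querying the module looks
roughly like this (a sketch; <devid> is a placeholder taken from
`ceph device ls`):

  ceph mgr module enable diskprediction_local
  ceph device monitoring on
  ceph device ls
  ceph device get-health-metrics <devid>
  ceph device predict-life-expectancy <devid>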
On Wed, May 20, 2020 at 5:36 PM Vytenis A wrote:
> Is it possible to get any finer prediction date?
>
related question: did anyone actually observe any correlation between the
predicted failure time and the actual time until a failure occurs?
Paul
--
Paul Emmerich
Looking for help with you
We are using high, and the people on the list that have also changed it
have not seen the improvements that I would expect.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Wed, May 20, 2020 at 1:38 AM Dan van der Ster wrote:
>
> Hi Robert,
>
> Sin
Hi folks, at this time we recommend pausing OSD upgrades to 15.2.2.
There have been a couple reports of OSDs crashing due to rocksdb
corruption after upgrading to 15.2.2 [1] [2]. It's safe to upgrade
monitors and mgr, but OSDs and everything else should wait.
We're investigating and will get a f
I'm getting similar errors after rebooting a node. Cluster was upgraded
15.2.1 -> 15.2.2 yesterday. No problems after rebooting during upgrade.
On the node I just rebooted, 2/4 OSDs won't restart. Similar logs from
both. Logs from one below.
Neither OSD has compression enabled, although there
Hi Frank,
Thanks for the explanation - I wasn't aware of this subtle point. So
when some OSDs are down, one has to be very careful with changing the
cluster. I guess one could even end up with incomplete PGs this
way that ceph can't recover from in an automated fashion?
Andras
On 5/1
Hi Igor
I've sent you these directly as they're a bit chunky. Let me know if you
haven't got them.
Thx, Chris
On 20/05/2020 14:43, Igor Fedotov wrote:
Hi Chris,
could you please share the full log prior to the first failure?
Also if possible please set debug-bluestore/debug bluefs to 20 and
Hi Dan,
Unfortunately 'ceph osd down osd.0' doesn't help - it is marked down and
soon after comes back up, but it still doesn't peer. I tried reweighting the
OSD to half its weight, 4.0 instead of 0.0, and that results in about
half the PGs staying degraded. So this is not specific to zero weight.
Adding the right dev list.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Wed, May 20, 2020 at 12:40 AM Robert LeBlanc wrote:
>
> We upgraded our Jewel cluster to Nautilus a few months ago and I've noticed
> that op behavior has changed. Thi
Hello,
I came across a section of the documentation that I don't quite
understand. In the section about inconsistent PGs it says if one of the
shards listed in `rados list-inconsistent-obj` has a read_error the disk is
probably bad.
Quote from documentation:
https://docs.ceph.com/docs/master/ra
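For context, the commands in question look roughly like this (a sketch;
the exact JSON layout may differ between releases, and jq is only there
for readability):

  # list PGs flagged inconsistent in a pool, then inspect the shard errors
  rados list-inconsistent-pg <pool>
  rados list-inconsistent-obj <pgid> --format=json-pretty | jq '.inconsistents[].shards'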
Hi Eugen,
Thanks for the suggestion. The object count of the rbd pool still stays at
430.11K (all images were deleted 3+ days ago).
I will keep monitoring it and post the results here.
Regs,
Icy
On Wed, 20 May 2020 at 15:12, Eugen Block wrote:
> The rbd_info, rbd_directory objects will remain unt
Hello,
Yes it is, this is the output: "default.rgw.log:gc"
From: Thomas Bennett
Sent: Wednesday, May 20, 2020 5:44 PM
To: Szabo, Istvan (Agoda)
Cc: ceph-users
Subject: Re: [ceph-users] Re: Large omap
Restarted mgr and mon services, nothing helped :/
-Original Message-
From: Eugen Block
Sent: Wednesday, May 20, 2020 3:05 PM
To: Szabo, Istvan (Agoda)
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: Pool full but the user cleaned it up already
And a broader question: is anyone using diskprediction (local or cloud)?
On Wed, May 20, 2020 at 7:35 PM Paul Emmerich wrote:
>
>
>
> On Wed, May 20, 2020 at 5:36 PM Vytenis A wrote:
>>
>> Is it possible to get any finer prediction date?
>
>
> related question: did anyone actually observe a