[ceph-users] Can see objects with "rados ls" but cannot delete them with "rados rm"

2021-01-28 Thread James, GleSYS
Hi, We have an issue in our cluster (Octopus 15.2.7) where we’re unable to remove orphaned objects from a pool, despite the fact that these objects can be listed with “rados ls”. Here is an example of an orphaned object which we can list (not sure why multiple objects are returned with the same nam

[ceph-users] Differences between heap stats and dump_mempools

2021-01-28 Thread 展荣臻(信泰)
Hello all, "ceph daemon osd.0 heap stats" and "ceph daemon osd.0 dump_mempools" both analyse the memory used by an OSD. The "Virtual address space used" value in the "ceph daemon osd.0 heap stats" output is larger than total.bytes in the "ceph daemon osd.0 dump_mempools" output. I dug into the source code and I find t

[ceph-users] Re: scrub errors: inconsistent PGs

2021-01-28 Thread Suresh Rama
Just query the PG to see what it is reporting and take action accordingly. On Thu, Jan 28, 2021, 7:13 PM Void Star Nill wrote: > Hello all, > > One of our clusters running nautilus release 14.2.15 is reporting health > error. It reports that there are inconsistent PGs. However, when I inspec

[ceph-users] extended multisite downtime...

2021-01-28 Thread Christopher Durham
Hi, There is a possibility that my Ceph RGW multisite solution may be down for an extended time (2 weeks?) for a physical relocation. Some questions, particularly in regard to RGW: 1. Is there any limit on downtime after which I might have to restart an entire sync? I want to still be able to wri

[ceph-users] scrub errors: inconsistent PGs

2021-01-28 Thread Void Star Nill
Hello all, One of our clusters running nautilus release 14.2.15 is reporting a health error. It reports that there are inconsistent PGs. However, when I inspect each of the reported PGs, I don't see any inconsistencies. Any inputs on what's going on? $ sudo ceph health detail HEALTH_ERR 3 scrub erro

[ceph-users] Re: 14.2.16 Low space hindering backfill after reboot

2021-01-28 Thread Marco Pizzolo
Thanks Eugen, The issue is pretty much in the rear view now. It's correcting the last 2.5M misplaced objects. The OSDs are now evenly balanced at 77% usage, but we will be adding another 120 OSDs all the same. Thanks, Marco On Thu, Jan 28, 2021 at 8:16 AM Eugen Block wrote: > What are your

[ceph-users] osd recommended scheduler

2021-01-28 Thread Andrei Mikhailovsky
Hello everyone, Could someone please let me know which modern kernel disk scheduler is recommended for SSD and HDD OSDs? The information in the manuals is pretty dated and refers to schedulers which have been deprecated in recent kernels. Thanks Andrei _

[ceph-users] Re: radosgw process crashes multiple times an hour

2021-01-28 Thread Andrei Mikhailovsky
Hi Daniel, Thanks for your reply. I've checked the package versions on that server and all Ceph-related packages on it are version 15.2.8: ii librados2 15.2.8-1focal amd64 RADOS distributed object store client library ii libradosstriper1 15.2.8-1focal amd64

[ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses

2021-01-28 Thread Adam Boyhan
Went through the process like 4-5 times now and it's looking good. I am going to continue beating on it to make sure. I will report back tomorrow. Nice catch! From: "Jason Dillaman" To: "adamb" Cc: "ceph-users" , "Matt Wilder" Sent: Thursday, January 28, 2021 12:53:50 PM Subject: Re:

[ceph-users] Re: RGW Bucket notification troubleshooting

2021-01-28 Thread Yuval Lifshitz
On Thu, Jan 28, 2021 at 7:34 PM Schoonjans, Tom (RFI,RAL,-) < tom.schoonj...@rfi.ac.uk> wrote: > Hi Yuval, > > > Together with Tom Byrne I ran some more tests today while keeping an eye > on the logs as well. > > We immediately noticed that the nodes were logging errors when uploading > files like

[ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses

2021-01-28 Thread Jason Dillaman
On Thu, Jan 28, 2021 at 10:31 AM Jason Dillaman wrote: > > On Wed, Jan 27, 2021 at 7:27 AM Adam Boyhan wrote: > > > > Doing some more testing. > > > > I can demote the rbd image on the primary, promote on the secondary and the > > image looks great. I can map it, mount it, and it looks just lik

[ceph-users] Re: Where has my capacity gone?

2021-01-28 Thread Josh Baergen
Hi George, > May I ask if enabling pool compression helps for the future space > amplification? If the amplification is indeed due to min_alloc_size, then I don't think that compression will help. My understanding is that compression is applied post-EC (and thus probably won't even activate due
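
The min_alloc_size amplification discussed in this reply can be sketched with some simple arithmetic. The snippet below is only a rough illustration under stated assumptions, not Ceph's actual allocator: it assumes a hypothetical k=4, m=2 erasure-coded pool and a 64 KiB min_alloc_size, splits each object evenly into k chunks, and ignores stripe-unit details and metadata overhead. The function name and its parameters are made up for this example.

```python
import math


def ec_allocated_bytes(object_size, k, m, min_alloc_size):
    """Estimate on-disk bytes for one object in a k+m erasure-coded pool.

    Simplifying assumptions (not taken from the thread): the object is
    split evenly into k data chunks, m coding chunks of the same size are
    added, and every chunk is rounded up to a multiple of min_alloc_size.
    """
    chunk = math.ceil(object_size / k)
    allocated_per_chunk = math.ceil(chunk / min_alloc_size) * min_alloc_size
    return allocated_per_chunk * (k + m)


KIB = 1024

# Hypothetical object sizes: one small object and one large object.
small = ec_allocated_bytes(16 * KIB, k=4, m=2, min_alloc_size=64 * KIB)
large = ec_allocated_bytes(4096 * KIB, k=4, m=2, min_alloc_size=64 * KIB)

# Amplification factor = allocated bytes / logical object size.
print(small / (16 * KIB))
print(large / (4096 * KIB))
```

Under these assumptions, a 16 KiB object occupies 24 times its logical size, while a 4 MiB object lands at the nominal 1.5x overhead of a 4+2 profile, which is why pools dominated by small objects show the largest gap between stored and used capacity.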

[ceph-users] Re: [Suspicious newsletter] Re: Rbd pool shows 458GB USED but the image is empty

2021-01-28 Thread Szabo, Istvan (Agoda)
Does this mean the space is allocated but actually empty, so it can be, let’s say, overwritten? Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com -

[ceph-users] radosgw process crashes multiple times an hour

2021-01-28 Thread Andrei Mikhailovsky
Hello, I am experiencing very frequent crashes of the radosgw service. It happens multiple times every hour. As an example, over the last 12 hours we've had 35 crashes. Has anyone experienced similar behaviour of the radosgw octopus release service? More info below: Radosgw service is runnin

[ceph-users] Re: [Suspicious newsletter] Re: Rbd pool shows 458GB USED but the image is empty

2021-01-28 Thread Szabo, Istvan (Agoda)
I mean the image hasn’t been deleted, but the content from the image. Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --

[ceph-users] Re: RBD-Mirror Snapshot Backup Image Uses

2021-01-28 Thread Jason Dillaman
On Wed, Jan 27, 2021 at 7:27 AM Adam Boyhan wrote: > > Doing some more testing. > > I can demote the rbd image on the primary, promote on the secondary and the > image looks great. I can map it, mount it, and it looks just like it should. > > However, the rbd snapshots are still unusable on the

[ceph-users] Rbd pool shows 458GB USED but the image is empty

2021-01-28 Thread Szabo, Istvan (Agoda)
Hi, We have a pool where the user has 2 images. They cleaned up the images and there are no snapshots in them, but when I look at ceph df detail it still shows 458GB in the first column. Why? Thanks This message is confidential and is for the sole use of the intended recipient(s). I

[ceph-users] Re: radosgw process crashes multiple times an hour

2021-01-28 Thread Daniel Gryniewicz
It looks like your radosgw is using a different version of librados. In the backtrace, the top useful line begins: librados::v14_2_0 when it should be v15.2.0, like the ceph::buffer in the same line. Is there an old librados lying around that didn't get cleaned up somehow? Daniel On 1/28/

[ceph-users] Re: 14.2.16 Low space hindering backfill after reboot

2021-01-28 Thread Eugen Block
What are your full ratios? The defaults are: "mon_osd_backfillfull_ratio": "0.90", "mon_osd_full_ratio": "0.95", You could temporarily increase the mon_osd_backfillfull_ratio a bit and see if it resolves. But it's not recommended to get an OSD really full, so be careful with
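
As a rough sketch of how the two default ratios quoted in this reply partition OSD utilisation, the snippet below classifies a few made-up utilisation values. The function name, the OSD names, and the sample numbers are hypothetical, the nearfull threshold is omitted, and the exact comparison Ceph performs internally may differ.

```python
def classify_osd(utilization, backfillfull_ratio=0.90, full_ratio=0.95):
    """Classify an OSD by used fraction (0.0 to 1.0).

    The default thresholds are the two values quoted in the thread;
    the classification logic itself is an illustrative assumption.
    """
    if utilization >= full_ratio:
        return "full"
    if utilization >= backfillfull_ratio:
        return "backfillfull"
    return "ok"


# Hypothetical utilisation figures, for illustration only.
sample = {"osd.0": 0.77, "osd.1": 0.91, "osd.2": 0.96}

for name, used in sample.items():
    print(name, classify_osd(used))

# Temporarily raising the backfillfull threshold lets osd.1 accept backfill.
print("osd.1", classify_osd(sample["osd.1"], backfillfull_ratio=0.92))
```

With the defaults, an OSD at 91% would refuse backfill; raising the backfillfull ratio slightly moves it back below the threshold, but it does nothing for an OSD that is already past the full ratio.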

[ceph-users] RGW multi-site sudden accumulation of inconsistent PGs

2021-01-28 Thread Eugen Block
Hi *, is there any correlation between multi-site clusters and inconsistent PGs? One customer has two Octopus clusters (fresh install a few months ago) which have been expanded recently with new disks. Before that they had one single occurrence of inconsistent PGs during a deep-scrub which

[ceph-users] Re: [Suspicious newsletter] Re: Rbd pool shows 458GB USED but the image is empty

2021-01-28 Thread Burkhard Linke
Hi, On 28.01.21 13:21, Szabo, Istvan (Agoda) wrote: I mean the image hasn’t been deleted, but the content from the image. RBD (as the name implies) is a block device layer. Block devices do not have a concept of content, files, directories or even allocated or unallocated space. They are

[ceph-users] Re: [Suspicious newsletter] Re: Rbd pool shows 458GB USED but the image is empty

2021-01-28 Thread Eugen Block
Ah, in that case you might want to sparsify the image: rbd sparsify / Zitat von "Szabo, Istvan (Agoda)" : I mean the image hasn’t been deleted, but the content from the image. Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co.,

[ceph-users] Re: Rbd pool shows 458GB USED but the image is empty

2021-01-28 Thread Eugen Block
The image is probably still in the trash, I assume. rbd -p trash ls Zitat von "Szabo, Istvan (Agoda)" : Hi, We have a pool where the user has 2 images. They cleaned up the images and there are no snapshots in them, but when I look at ceph df detail it still shows 458GB in the first column. Why? Thanks __

[ceph-users] Re: Balancing with upmap

2021-01-28 Thread Jonas Jelten
Hi! We also suffer heavily from this, so I wrote a custom balancer which yields much better results: https://github.com/TheJJ/ceph-balancer After you run it, it echoes the PG movements it suggests. You can then just run those commands and the cluster will be better balanced. It's kinda work in progress, s

[ceph-users] Re: Where has my capacity gone?

2021-01-28 Thread George Yil
Hi Marc, Thanks for participating. At first I thought this was an incorrect report and that maybe I needed to upgrade for a bugfix. But I couldn’t find such a report, so I asked here. When people shared their experiences, it appeared there may be two causes: unbalanced OSDs or storage amplification. As f

[ceph-users] Re: Planning: Ceph User Survey 2020

2021-01-28 Thread Anthony D'Atri
The survey team spent some time discussing the pros and cons of formats for a number of the questions in the new survey. I think when we initially sent out the first draft of the survey, that specific question was simple checkboxes, as I think it had been in the previous year’s edition. The fi