[ceph-users] Re: ceph pool with a whitespace as name

2021-03-10 Thread Boris Behrens
After doing radosgw-admin period update --commit it looks like it is gone now. Sorry for spamming the ML, but I am not denvercoder9 :) On Wed., 10 March 2021 at 08:29, Boris Behrens wrote: > Ok, > I changed the value to > "metadata_heap": "", > but it is still used. > > Any ideas how to sto
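A minimal sketch of the fix described above, assuming a single zone named "default" (the zone name and file name are illustrative, not taken from the thread):

    # Dump the zone config, blank out metadata_heap, write it back, then commit the period.
    radosgw-admin zone get --rgw-zone=default > zone.json
    #   ...edit zone.json so it contains:  "metadata_heap": ""
    radosgw-admin zone set --rgw-zone=default --infile zone.json
    radosgw-admin period update --commit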

[ceph-users] RadosGW unable to start resharding

2021-03-10 Thread Ansgar Jazdzewski
Hi Folks, We are running ceph 14.2.16 and I'd like to reshard a bucket because I have a large object warning! So I did: radosgw-admin bucket reshard --tenant="..." --bucket="..." --uid="..." --num-shards=512 but I received an error: ERROR: the bucket is currently undergoing resharding and canno

[ceph-users] Re: RadosGW unable to start resharding

2021-03-10 Thread Konstantin Shalygin
Try to look at: radosgw-admin reshard stale-instances list Then: radosgw-admin reshard stale-instances rm k > On 10 Mar 2021, at 12:11, Ansgar Jazdzewski > wrote: > > We are running ceph 14.2.16 and I like to reshard a bucket because I > have a large object warning! > > so I did: > radosgw

[ceph-users] Re: RadosGW unable to start resharding

2021-03-10 Thread Ansgar Jazdzewski
Hi, Both commands did not come back with any output after 30 min. I found that people had run: radosgw-admin reshard cancel --tenant="..." --bucket="..." --uid="..." --debug-rgw=20 --debug-ms=1 and I got this error in the output: 2021-03-10 09:30:20.215 7f8aa239d940 -1 ERROR: failed to remov
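For reference, a hedged recap of the commands mentioned across this thread, in the order they would typically be tried (tenant/bucket/uid placeholders as in the posts):

    # Clean up leftover shard instances, then cancel any stuck reshard and retry it.
    radosgw-admin reshard stale-instances list
    radosgw-admin reshard stale-instances rm
    radosgw-admin reshard cancel --tenant="..." --bucket="..." --uid="..."
    radosgw-admin bucket reshard --tenant="..." --bucket="..." --uid="..." --num-shards=512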

[ceph-users] buckets with negative num_objects

2021-03-10 Thread Boris Behrens
Hi, I am in the process of resharding large buckets, and to find them I ran radosgw-admin bucket limit check | grep '"fill_status": "OVER' -B5 and I see that there are two buckets with a negative num_objects: "bucket": "ncprod", "tenant": "", "num_object
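A hedged example of the check quoted above, plus a per-bucket stats call (the stats step is an assumption, not part of the post) to inspect the odd counter:

    # Find buckets over the shard limit, then look at the usage counters of a suspect bucket.
    radosgw-admin bucket limit check | grep '"fill_status": "OVER' -B5
    radosgw-admin bucket stats --bucket=ncprod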

[ceph-users] Unpurgeable rbd image from trash

2021-03-10 Thread Enrico Bocchi
Hello everyone, We have an unpurgeable image living in the trash of one of our clusters: # rbd --pool volumes trash ls 5afa5e5a07b8bc volume-02d959fe-a693-4acb-95e2-ca04b965389b If we try to purge the whole trash it says the image is being restored but we have never tried to do that: # rbd --p

[ceph-users] Re: Unpurgeable rbd image from trash

2021-03-10 Thread Jason Dillaman
Can you provide the output from "rados -p volumes listomapvals rbd_trash"? On Wed, Mar 10, 2021 at 8:03 AM Enrico Bocchi wrote: > > Hello everyone, > > We have an unpurgeable image living in the trash of one of our clusters: > # rbd --pool volumes trash ls > 5afa5e5a07b8bc volume-02d959fe-a693-4a

[ceph-users] Re: Unpurgeable rbd image from trash

2021-03-10 Thread Enrico Bocchi
Hello Jason, # rados -p volumes listomapvals rbd_trash id_5afa5e5a07b8bc value (71 bytes) : 0000  02 01 41 00 00 00 00 2b  00 00 00 76 6f 6c 75 6d |..A....+...volum| 0010  65 2d 30 32 64 39 35 39  66 65 2d 61 36 39 33 2d |e-02d959fe-a693-| 0020  34 61 63 62 2d 39 35 65  32 2d 63 61

[ceph-users] Re: Rados gateway basic pools missing

2021-03-10 Thread St-Germain, Sylvain (SSC/SPC)
OK, I fixed it; it works now. -Original Message- From: St-Germain, Sylvain (SSC/SPC) Sent: 9 March 2021 17:41 To: St-Germain, Sylvain (SSC/SPC); ceph-users@ceph.io Subject: RE: Rados gateway basic pools missing OK, in the interface, when I create a bucket the index is created automatically 1 dev

[ceph-users] Alertmanager not using custom configuration template

2021-03-10 Thread Marc 'risson' Schmitt
Hi, I'm trying to use a custom template for Alertmanager deployed with Cephadm. Following its documentation[1], I set the option `mgr/cephadm/alertmanager_alertmanager.yml` to my own template, restarted the mgr, and re-deployed Alertmanager. However, Cephadm seems to always use its internal templa
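A minimal sketch of the workflow being attempted, assuming the config-key name quoted in the message and a local template file called alertmanager.yml.j2 (both illustrative):

    # Store the custom template in the config-key store, then have cephadm redeploy Alertmanager.
    ceph config-key set mgr/cephadm/alertmanager_alertmanager.yml -i alertmanager.yml.j2
    ceph orch redeploy alertmanager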

[ceph-users] A practical approach to efficiently store 100 billions small objects in Ceph

2021-03-10 Thread Loïc Dachary
Bonjour, In the past weeks a few mailing list threads[0][1][2] explored the problem of storing billions of small objects in Ceph. There was great feedback (I learned a lot) and it turns out the solution is a rather simple aggregation of the ideas that were suggested during these discussions. I

[ceph-users] Re: A practical approach to efficiently store 100 billions small objects in Ceph

2021-03-10 Thread Marc
Bonjour Loic, I think we should all use Bonjour and Merci more often, that sounds much better than Hi, Thanks and Cheers. I am looking forward to reading your research! > -Original Message- > From: Loïc Dachary > Sent: 10 March 2021 15:54 > To: Ceph Users > Cc: swh-de...@inria.fr > Su

[ceph-users] Re: OSD crashes create_aligned_in_mempool in 15.2.9 and 14.2.16

2021-03-10 Thread David Orman
Thank you for confirmation. Hopefully it will be approved in bodhi (you can leave feedback here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2021-0eda4297eb to help it along) soon, and new docker images can be built with the older version. On Tue, Mar 9, 2021 at 10:57 AM Andrej Filipcic w

[ceph-users] PG inactive when host is down despite CRUSH failure domain being host

2021-03-10 Thread Janek Bevendorff
Hi, I am seeing a weird phenomenon which I am having trouble debugging. We have 16 OSDs per host, so when I reboot one node, 16 OSDs will be missing for a short time. Since our minimum CRUSH failure domain is host, this should not cause any problems. Unfortunately, I always have a handful (1-5)

[ceph-users] Re: PG inactive when host is down despite CRUSH failure domain being host

2021-03-10 Thread Eugen Block
Hi, I only took a quick look, but is that pool configured with size 2? The crush_rule says min_size 2, which would explain what you're describing. Quoting Janek Bevendorff: Hi, I am seeing a weird phenomenon which I am having trouble debugging. We have 16 OSDs per host, so when I reb

[ceph-users] Re: Unpurgeable rbd image from trash

2021-03-10 Thread Jason Dillaman
Odd, it looks like it's stuck in the "MOVING" state. Perhaps the "rbd trash mv" command was aborted mid-operation? The way to work around this issue is as follows: $ rados -p volumes getomapval rbd_trash id_5afa5e5a07b8bc key_file $ hexedit key_file ## CHANGE LAST BYTE FROM '01' to '00' $ rados -p

[ceph-users] Re: PG inactive when host is down despite CRUSH failure domain being host

2021-03-10 Thread Janek Bevendorff
No, the pool is size 3. But you put me on the right track. The pool had an explicit min_size set that was equal to the size. No idea why I didn't check that in the first place. Reducing it to 2 seems to solve the problem. How embarrassing, thanks! :-D May I suggest giving this a better error de
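A minimal sketch of the check and fix discussed here (the pool name is a placeholder):

    # Verify the replication settings, then lower min_size below size.
    ceph osd pool get <pool> size
    ceph osd pool get <pool> min_size
    ceph osd pool set <pool> min_size 2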

[ceph-users] Ceph server

2021-03-10 Thread Ignazio Cassano
Hello, what do you think about a Ceph cluster made up of 6 nodes, each one with the following configuration? A+ Server 1113S-WN10RT Barebone Supermicro A+ Server 1113S-WN10RT - 1U - 10x U.2 NVMe - 2x M.2 - Dual 10-Gigabit LAN - 750W Redundant Processor AMD EPYC™ 7272 Processor 12-core 2.90GHz 64M

[ceph-users] Re: A practical approach to efficiently store 100 billions small objects in Ceph

2021-03-10 Thread Loïc Dachary
Hi Konstantin, Thanks for the advice. Luckily objects are packed together and Ceph will only see larger objects. The median object size is ~4KB, written into RBD images using the default 4MB[0] object size. That will be ~100 million RADOS objects instead of 100 billion. Cheers [0] https://doc
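A quick back-of-the-envelope check of those figures, assuming exactly 100 billion objects at the ~4 KB median and the default 4 MB RBD object size:

    # 100e9 objects x 4 KB          ~= 400 TB of payload
    # 400 TB / 4 MB per RADOS object ~= 100e6, i.e. ~100 million RADOS objects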

[ceph-users] Re: Ceph server

2021-03-10 Thread Ignazio Cassano
Hello, MON and OSD. 1 small SSD is for the operating system and 1 is for the MON. I agree to increase the RAM. As far as NVMe size goes, it is true that more OSDs on smaller disks is a better choice for performance, but I would have to buy more servers. Ignazio On Wed 10 Mar 2021, 20:24, Stefan Kooman wrote: > O

[ceph-users] Re: Ceph server

2021-03-10 Thread Ignazio Cassano
Sorry, I forgot to mention I will not use CephFS. On Wed 10 Mar 2021, 20:44, Ignazio Cassano wrote: > Hello, MON and OSD. > 1 small SSD is for the operating system and 1 is for the MON. > I agree to increase the RAM. > As far as NVMe size goes, it is true that more OSDs on smaller disks is a better > choos

[ceph-users] Best way to add OSDs - whole node or one by one?

2021-03-10 Thread Dave Hall
Hello, I am currently in the process of expanding my Nautilus cluster from 3 nodes (combined OSD/MGR/MON/MDS) to 6 OSD nodes and 3 management nodes. The old and new OSD nodes all have 8 x 12TB HDDs plus NVMe. The front and back networks are 10 GbE. Last Friday evening I injected a whole new
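One commonly used approach, shown here only as a hedged sketch and not taken from this thread, is to gate data movement with cluster flags so that backfill starts only once all new OSDs are in place:

    # Pause rebalancing/recovery while the new OSDs are created, then release it in one go.
    ceph osd set norebalance
    ceph osd set norecover
    #   ...deploy the new OSDs (whole node or one by one)...
    ceph osd unset norecover
    ceph osd unset norebalance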

[ceph-users] Bluestore OSD Layout - WAL, DB, Journal

2021-03-10 Thread Dave Hall
Hello, I'm in the process of doubling the number of OSD nodes in my Nautilus cluster, from 3 to 6. Based on answers received from earlier posts to this list, the new nodes have more NVMe than the old nodes. More to the point, on the original nodes the amount of NVMe allocated to each OSD w

[ceph-users] Failure Domain = NVMe?

2021-03-10 Thread Dave Hall
Hello, In some documentation I was reading last night about laying out OSDs, it was suggested that if more than one OSD uses the same NVMe drive, the failure domain should probably be set to node. However, for a small cluster the inclination is to use EC pools and failure-domain = OSD. I was

[ceph-users] How to speed up removing big rbd pools

2021-03-10 Thread huxia...@horebdata.cn
Dear Cephers, For some reason, I have a cluster with several 20 TB pools and 100 TB ones, which were previously linked with iSCSI for virtual machines. When deleting those big rbd images, it turns out to be extremely slow, taking hours if not days. The Ceph cluster is running Luminous 12.2.13

[ceph-users] how smart is ceph recovery?

2021-03-10 Thread Marc
1. Is backfilling/remapping so smart that it will do whatever it can? Or are there situations like: PG a is scheduled to be moved but cannot be moved because of min_size; now another PG b cannot be moved because PG a allocated OSD space and the backfill ratio will be met. Yet if the o

[ceph-users] Re: Ceph server

2021-03-10 Thread Stefan Kooman
On 3/10/21 5:43 PM, Ignazio Cassano wrote: Hello, what do you think about a Ceph cluster made up of 6 nodes, each one with the following configuration? A+ Server 1113S-WN10RT Barebone Supermicro A+ Server 1113S-WN10RT - 1U - 10x U.2 NVMe - 2x M.2 - Dual 10-Gigabit LAN - 750W Redundant Processor

[ceph-users] Re: Ceph server

2021-03-10 Thread Stefan Kooman
On 3/10/21 8:12 PM, Stefan Kooman wrote: On 3/10/21 5:43 PM, Ignazio Cassano wrote: Hello, what do you think about a Ceph cluster made up of 6 nodes, each one with the following configuration? I forgot to ask: are you planning on only OSDs, or should this be OSDs and MONs and ...? In case of

[ceph-users] mon db growing. over 500Gb

2021-03-10 Thread ricardo.re.azevedo
Hi all, I have a fairly pressing issue. I had a monitor fall out of quorum because it ran out of disk space during the rebalancing from switching to upmap. I noticed my monitors' store.db started taking up nearly all disk space, so I set noout, nobackfill and norecover and shut down all the monitor

[ceph-users] Re: mon db growing. over 500Gb

2021-03-10 Thread Lincoln Bryant
Hi Ricardo, I just had a similar issue recently. I did a dump of the monitor store (i.e., something like "ceph-monstore-tool /var/lib/ceph/mon/mon-a/ dump-keys") and most messages were of type 'logm'. For me I think it was a lot of log messages coming from an oddly behaving OSD. I've seen folk
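A hedged example of that kind of inspection, using the mon path from the message (the awk/sort/uniq summarising step is an assumption):

    # Count keys per prefix in the mon store to see which message type dominates.
    ceph-monstore-tool /var/lib/ceph/mon/mon-a/ dump-keys | awk '{print $1}' | sort | uniq -c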

[ceph-users] Re: mon db growing. over 500Gb

2021-03-10 Thread ricardo.re.azevedo
Thanks for the input Lincoln. I think I am in a similar boat. I don't have the insight module activated. I checked one of my troublesome monitors with the command you gave and indeed it is full of logm messages. I am not sure what would have caused it though. My OSDs have been behaving relatively

[ceph-users] Re: Openstack rbd image Error deleting problem

2021-03-10 Thread Norman.Kern
On 2021/3/10 3:05 PM, Konstantin Shalygin wrote: >> On 10 Mar 2021, at 09:50, Norman.Kern wrote: >> >> I have used Ceph RBD for OpenStack for some time, and I met a problem while >> destroying a VM. OpenStack tried to >> >> delete the rbd image but failed. I tested deleting an image with the rbd command

[ceph-users] Some confusion around PG, OSD and balancing issue

2021-03-10 Thread Darrin Hodges
Hi all, Just looking for clarification around the relationship between PGs, OSDs, and balancing on a Ceph (Octopus) cluster. We have PG auto-balancing on and the balancer mode is set to upmap. There are 2 pools: one is the default metrics pool with 1 PG, the other is the pool we are using for everything; it h
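A few read-only commands (a hedged suggestion, not taken from the post) that show how the autoscaler, the balancer, and per-OSD utilisation currently look:

    # Inspect PG autoscaling targets, balancer state, and per-OSD data distribution.
    ceph osd pool autoscale-status
    ceph balancer status
    ceph osd df tree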

[ceph-users] Re: mon db growing. over 500Gb

2021-03-10 Thread Lincoln Bryant
You can try compacting with monstore tool instead of using mon-compact-on-start. I am not sure if it makes any difference. From: ricardo.re.azev...@gmail.com Sent: Wednesday, March 10, 2021 6:59 PM To: Lincoln Bryant ; ceph-users@ceph.io Subject: RE: [ceph-users
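The post refers to compacting with the monstore tool; the sketch below uses ceph-kvstore-tool, which is more commonly documented for offline compaction, so treat the exact invocation as an assumption (mon name and path are illustrative, and the mon must be stopped first):

    # Stop the mon, compact its RocksDB store offline, then start it again.
    systemctl stop ceph-mon@mon-a
    ceph-kvstore-tool rocksdb /var/lib/ceph/mon/mon-a/store.db compact
    systemctl start ceph-mon@mon-a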

[ceph-users] Re: RadosGW unable to start resharding

2021-03-10 Thread Ansgar Jazdzewski
Hi, no luck after running radosgw-admin bucket check --fix. radosgw-admin reshard stale-instances list and radosgw-admin reshard stale-instances rm work now, but I cannot start the resharding: radosgw-admin bucket reshard --tenant=... --bucket=... --uid=... --num-shards=512 ERROR: the bucket is cur

[ceph-users] Re: A practical approach to efficiently store 100 billions small objects in Ceph

2021-03-10 Thread Loïc Dachary
Hi, On 11/03/2021 04:38, Szabo, Istvan (Agoda) wrote: > Does this mean that even in an object store, files which are smaller than > 4 MB will be packed into one 4 MB object? I'm not sure I understand the question. Would you be so kind as to rephrase it? Cheers > > -Original Message- > F

[ceph-users] Re: A practical approach to efficiently store 100 billions small objects in Ceph

2021-03-10 Thread Szabo, Istvan (Agoda)
Does this mean that even in an object store, files which are smaller than 4 MB will be packed into one 4 MB object? -Original Message- From: Loïc Dachary Sent: Thursday, March 11, 2021 2:13 AM To: Konstantin Shalygin Cc: Ceph Users ; swh-de...@inria.fr Subject: [ceph-users] Re: A practi

[ceph-users] Re: Failure Domain = NVMe?

2021-03-10 Thread Szabo, Istvan (Agoda)
Don't forget that if you have a server failure you might lose many objects. If the failure domain is osd then, with say 12 drives in each server, all chunks of an 8+2 EC placement can in an unlucky situation end up on 1 server. Istvan Szabo Senior Infrastructure Engineer --
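A hedged example of pinning the EC failure domain to host rather than osd (the profile name is illustrative; k=8 and m=2 are taken from the message):

    # Create an 8+2 erasure-code profile whose chunks must land on distinct hosts.
    ceph osd erasure-code-profile set ec82-host k=8 m=2 crush-failure-domain=host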

[ceph-users] Re: Bluestore OSD Layout - WAL, DB, Journal

2021-03-10 Thread Szabo, Istvan (Agoda)
Hi, If you don't specify a WAL device it will be located on the same drive as the RocksDB. You only need to specify a WAL device if you have a disk faster than the one holding your RocksDB, e.g. data on HDD, RocksDB on SSD, WAL on NVMe/Optane. In the past the suggestion was something like: 300 GB data, 30 GB RocksDB, 3 GB WAL. Not sure if this is sti
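A hedged sketch of an explicit DB/WAL split with ceph-volume, matching the HDD + SSD + NVMe layout described above (device paths are placeholders):

    # Data on HDD, RocksDB on SSD, WAL on NVMe; omit --block.wal to co-locate the WAL with the DB.
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/sdc1 --block.wal /dev/nvme0n1p1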