[ceph-users] how to avoid pglogs dups bug in Pacific

2024-01-30 Thread ADRIAN NICOLAE
Hi, I'm running Pacific 16.2.4 and I want to start a manual pg split process on the data pool (from 2048 to 4096). I'm reluctant to upgrade to 16.2.14/15 at this point. Can I avoid the dups bug (https://tracker.ceph.com/issues/53729) if I increase the pgs slowly, with 32 or 64 pgs at e
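
A minimal sketch of the slow-split approach being asked about, assuming the pool is named default.rgw.buckets.data (the actual pool name is not given in the preview):

  # check the current value, then raise pg_num in small steps (+64 here)
  ceph osd pool get default.rgw.buckets.data pg_num
  ceph osd pool set default.rgw.buckets.data pg_num 2112
  # wait for peering/backfill to settle (HEALTH_OK, no misplaced objects)
  # before taking the next step
  ceph -s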

[ceph-users] orchestrator issues on ceph 16.2.9

2023-03-04 Thread Adrian Nicolae
Hi, I have some orchestrator issues on our cluster running 16.2.9 with rgw only services. We first noticed these issues a few weeks ago when adding new hosts to the cluster -  the orch was not detecting the new drives to build the osd containers for them. Debugging the mgr logs, I noticed th
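
A few commands commonly used to debug this kind of cephadm/orchestrator issue; the host names and root cause are not shown in the preview, so this is only a generic sketch:

  # force a fresh device inventory scan on all hosts
  ceph orch device ls --refresh
  # check the cephadm log channel for errors from the device scan
  ceph log last cephadm
  # restart the orchestrator by failing over the active mgr
  ceph mgr fail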

[ceph-users] Re: ceph is stuck after increasing pg_nums

2022-11-04 Thread Adrian Nicolae
The problem was a single osd daemon (not reported in 'ceph health detail') which slowed down the entire peering process; after restarting it, the cluster got back to normal. On 11/4/2022 10:49 AM, Adrian Nicolae wrote:  ceph health detail HEALTH_WARN Reduced data availability: 42 pgs inactive, 33
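
A sketch of how a single slow OSD like this can be tracked down and bounced; osd.103 is taken from the health output quoted in this thread, and the restart command depends on whether the daemon is cephadm-managed or packaged:

  # list the operations stuck on a suspect OSD (run on that OSD's host)
  ceph daemon osd.103 dump_ops_in_flight
  # restart it: cephadm-managed ...
  ceph orch daemon restart osd.103
  # ... or packaged/systemd-managed
  systemctl restart ceph-osd@103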

[ceph-users] Re: ceph is stuck after increasing pg_nums

2022-11-04 Thread Adrian Nicolae
ast-1.rgw.buckets.data' pg_num 2480 is not a power of two [WRN] SLOW_OPS: 2371 slow ops, oldest one blocked for 6218 sec, daemons [osd.103,osd.115,osd.126,osd.129,osd.130,osd.138,osd.155,osd.174,osd.179,osd.181]... have slow ops. On 11/4/2022 10:45 AM, Adrian Nicolae wrote: Hi, We have a Pacific clust

[ceph-users] ceph is stuck after increasing pg_nums

2022-11-04 Thread Adrian Nicolae
Hi, We have a Pacific cluster (16.2.4) with 30 servers and 30 osds. We started to increase the pg_num for the data bucket more than a month ago; I usually added 64 pgs in every step and didn't have any issue. The cluster was healthy before increasing the pgs. Today I've added 128 pgs and the
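
When a split leaves pgs inactive like this, the usual first step is to find which pgs are stuck and which OSDs are blocking peering; a generic sketch (the pg id below is just a placeholder):

  # list pgs stuck in an inactive state
  ceph pg dump_stuck inactive
  # show which OSDs are blocking peering
  ceph osd blocked-by
  # query one of the stuck pgs for details (placeholder pg id)
  ceph pg 11.3f query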

[ceph-users] questions about rgw gc max objs and rgw gc speed in general

2022-09-22 Thread Adrian Nicolae
Hi, We have a system running Ceph Pacific with a large number of delete requests (several hundred thousand files per day) and I'm investigating how I can increase the gc speed to keep up with our deletes (right now there are 44 million objects in the gc list). I changed max_concurr
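
For context, a sketch of the commands and knobs usually involved in speeding up rgw garbage collection; the values are illustrative, not the ones from this thread:

  # rough count of objects waiting in the gc queue
  radosgw-admin gc list --include-all | grep -c oid
  # run a gc pass manually, including entries not yet expired
  radosgw-admin gc process --include-all
  # raise gc concurrency / trim chunk size (illustrative values)
  ceph config set client.rgw rgw_gc_max_concurrent_io 20
  ceph config set client.rgw rgw_gc_max_trim_chunk 64

Note that rgw_gc_max_objs, mentioned in the subject, is one of the options the docs advise against changing after the initial deployment.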

[ceph-users] Re: v16.2.6 Pacific released

2021-09-17 Thread Adrian Nicolae
Hi, Does the 16.2.6 version fix the following bug: https://github.com/ceph/ceph/pull/42690 ? It's not listed in the changelog. Message: 3 Date: Thu, 16 Sep 2021 15:48:42 -0400 From: David Galloway Subject: [ceph-users] v16.2.6 Pacific released To: ceph-annou...@ceph.io, ceph-users@ceph

[ceph-users] how to set rgw parameters in Pacific

2021-06-19 Thread Adrian Nicolae
Hi, I have some doubts regarding the best way to change some rgw parameters in Pacific. Let's say I want to change rgw_max_put_size and some rgw_gc default values like rgw_gc_max_concurrent_io.  What is the recommended way to do it :  - via 'ceph config set global' or - via 'ceph config
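
A sketch of the centralized-config approach being discussed, assuming the rgw daemons inherit from the client.rgw section (the section can be narrowed to a specific daemon if needed):

  # apply to all rgw daemons via the mon config store
  ceph config set client.rgw rgw_max_put_size 5368709120
  ceph config set client.rgw rgw_gc_max_concurrent_io 20
  # verify what is stored centrally
  ceph config dump | grep rgw_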

[ceph-users] Re: Ceph Pacific mon is not starting after host reboot

2021-05-25 Thread Adrian Nicolae
Hi, On my setup I didn't enable a stretch cluster. It's just a 3 x VM setup running on the same Proxmox node, and all the nodes are using a single network. I installed Ceph using the documented cephadm flow. Thanks for the confirmation, Greg! I'll try with a newer release then. >That's wh

[ceph-users] Re: Ceph Pacific mon is not starting after host reboot

2021-05-23 Thread Adrian Nicolae
nd haven't experienced this. On May 24, 2021, at 00:35, Adrian Nicolae wrote: Hi, I waited for more than a day on the first mon failure; it didn't resolve automatically. I checked with 'ceph status' and also the ceph.conf on that host, and the failed mon was removed from the monmap. Th

[ceph-users] Re: Ceph Pacific mon is not starting after host reboot

2021-05-23 Thread Adrian Nicolae
Could you also try "ceph mon dump" to see whether mon.node03 is actually removed from the monmap when it failed to start? On May 23, 2021, at 16:40, Adrian Nicolae wrote: Hi guys, I'm testing Ceph Pacific 16.2.4 in my lab before deciding if I will put it in production on a 1PB+ storage cluster with
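
For reference, the suggested check is a one-liner; it prints the current monmap epoch and the mons that are still members:

  ceph mon dump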

[ceph-users] Re: Ceph Pacific mon is not starting after host reboot

2021-05-23 Thread Adrian Nicolae
a.com --- On 2021. May 23., at 15:40, Adrian Nicolae wrote: Hi guys, I'm testing Ceph Pacific 16.2.4 in my lab before deciding if I will put it in production on a 1PB+ storage cluster with rgw-only access. I

[ceph-users] Ceph Pacific mon is not starting after host reboot

2021-05-23 Thread Adrian Nicolae
Hi guys, I'm testing Ceph Pacific 16.2.4 in my lab before deciding if I will put it in production on a 1PB+ storage cluster with rgw-only access. I noticed a weird issue with my mons : - if I reboot a mon host, the ceph-mon container is not starting after reboot - I can see with 'ceph orch
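
A generic sketch of how this is usually investigated on a cephadm deployment; mon.node03 is the daemon name mentioned elsewhere in the thread:

  # does the orchestrator still know about the mon daemon?
  ceph orch ps --daemon-type mon
  # on the rebooted host: list deployed daemons and read the mon's log
  cephadm ls
  cephadm logs --name mon.node03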

[ceph-users] adding a second rgw instance on the same zone

2021-02-11 Thread Adrian Nicolae
Hi guys, I have a Mimic cluster with only one RGW machine.  My setup is simple - one realm, one zonegroup, one zone.  How can I safely add a second RGW server to the same zone ? Is it safe to just run "ceph-deploy rgw create" for the second server without impacting the existing metadata pool
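
With ceph-deploy, adding another gateway to the same (single) zone is normally just deploying a second instance; a sketch, with node2 as a placeholder hostname:

  # from the admin / ceph-deploy node
  ceph-deploy rgw create node2

Both instances then serve the same zone and share the existing rgw pools; the new host only needs its own frontend port and a DNS or load-balancer entry.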

[ceph-users] safest way to remove a host from Mimic

2021-01-07 Thread Adrian Nicolae
Hi guys, I need to remove a host (osd server) from my Ceph Mimic cluster. First I started to remove every OSD drive one by one with 'ceph osd out' and then 'ceph osd purge'. After all the drives are removed from the crush map, I will still have the host empty, without any drive, and with the cr
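
For reference, a sketch of the per-OSD removal flow described above, plus the final step that removes the now-empty host bucket from the crush map (osd.12 and node5 are placeholders):

  ceph osd out osd.12
  # wait for rebalancing to finish, then stop and purge the daemon
  systemctl stop ceph-osd@12
  ceph osd purge 12 --yes-i-really-mean-it
  # once the host bucket is empty, remove it from the crush map
  ceph osd crush remove node5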

[ceph-users] Re: Ceph on ARM ?

2020-11-25 Thread Adrian Nicolae
: Robert Sander [mailto:r.san...@heinlein-support.de] Sent: Tuesday, November 24, 2020 5:56 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: Ceph on ARM ? On 24.11.20 at 13:12, Adrian N

[ceph-users] Ceph on ARM ?

2020-11-24 Thread Adrian Nicolae
Hi guys, I was looking at some Huawei ARM-based servers and the datasheets are very interesting. The high CPU core counts and the SoC architecture should be ideal for a distributed storage system like Ceph, at least in theory. I'm planning to build a new Ceph cluster in the future and my best cas

[ceph-users] question about rgw index pool

2020-11-21 Thread Adrian Nicolae
Hi guys, I'll have a future Ceph deployment with the following setup : - 7 powerful nodes running Ceph 15.2.x with mon, rgw and osd daemons colocated - 100+ SATA drives with EC 4+2 - every OSD will have a large NVME partition (300GB) for rocksdb - the storage will be dedicated for rgw traff
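
One option often discussed for this kind of layout, assuming some NVMe capacity is deployed as separate OSDs with the nvme device class (rather than only as rocksdb partitions), is to pin the index pool to those OSDs with a dedicated crush rule:

  # replicated rule restricted to OSDs with device class nvme
  ceph osd crush rule create-replicated rgw-index-nvme default host nvme
  ceph osd pool set default.rgw.buckets.index crush_rule rgw-index-nvme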

[ceph-users] Re: question about rgw delete speed

2020-11-13 Thread Adrian Nicolae
18 osd servers, 3 mons, 4 gateways, 2 iscsi gateways US Production(SSD): Nautilus 14.2.11 with 6 osd servers, 3 mons, 4 gateways, 2 iscsi gateways UK Production(SSD): Octopus 15.2.5 with 5 osd servers, 3 mons, 4 gateways -----Original Message- From: Adrian Nicolae Sent: Wednesday, Nove

[ceph-users] question about rgw delete speed

2020-11-11 Thread Adrian Nicolae
Hey guys, I'm in charge of a local cloud-storage service. Our primary object storage is a vendor-based one and I want to replace it in the near future with Ceph with the following setup : - 6 OSD servers with 36 SATA 16TB drives each and 3 big NVME per server (1 big NVME for every 12 drive

[ceph-users] changing acces vlan for all the OSDs - potential downtime ?

2020-06-04 Thread Adrian Nicolae
Hi all, I have a Ceph cluster with a standard setup: - the public network: MONs and OSDs connected to the same aggregation switch, with ports in the same access vlan - private network: OSDs connected to another switch, with a second eth connected to another access vlan. I need to change the public
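
Before and after a change like this, it helps to confirm which addresses the daemons have registered on the public network; a minimal check (no addresses from the thread are known):

  # mon public addresses (changing these requires a monmap update)
  ceph mon dump
  # osd public/cluster addresses; osds re-register them on restart
  ceph osd dump | grep "^osd\."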

[ceph-users] Re: RGW resharding

2020-05-25 Thread Adrian Nicolae
objects (max_shards=16) then you should be ok. linyunfan Adrian Nicolae wrote on Monday, May 25, 2020 at 3:04 PM: I'm using only Swift, not S3. We have a container for every customer. Right now there are thousands of containers. On 5/25/2020 9:02 AM, lin yunfan wrote: Can you store your data in diff

[ceph-users] Re: RGW resharding

2020-05-25 Thread Adrian Nicolae
I'm using only Swift, not S3. We have a container for every customer. Right now there are thousands of containers. On 5/25/2020 9:02 AM, lin yunfan wrote: Can you store your data in different buckets? linyunfan Adrian Nicolae wrote on Tuesday, May 19, 2020 at 3:32 PM: Hi, I have the following Ceph

[ceph-users] Bluestore config recommendations

2020-05-22 Thread Adrian Nicolae
Hi, I'm planning to install a new Ceph cluster (Nautilus) using 8+3 EC, SATA-only storage. We want to store only big files here (from 40-50MB to 200-300GB each). The write load will be higher than the read load. I was thinking of the following Bluestore config to reduce the load on the
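
The actual config list is cut off in the preview above, so purely as an illustration of the kind of Bluestore knobs involved for a large-object, HDD-backed EC pool (example values, not recommendations from the thread):

  # per-OSD memory budget for caches
  ceph config set osd osd_memory_target 4294967296
  # compression is usually not worth it for already-compressed big files
  ceph config set osd bluestore_compression_mode none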

[ceph-users] RGW resharding

2020-05-19 Thread Adrian Nicolae
bucket can lead to OSDs flapping or having IO timeouts during deep-scrub, or even to OSD failures due to leveldb compacting all the time if we have a large number of DELETEs. Any advice would be appreciated. Thank you, Adrian Nicolae
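
A sketch of the manual resharding workflow usually discussed in this context (the bucket name and shard count are placeholders; Swift containers are rgw buckets underneath, so the same tooling applies):

  # show buckets whose index shards exceed the recommended object count
  radosgw-admin bucket limit check
  # queue a bucket for resharding and run the resharder
  radosgw-admin reshard add --bucket=customer-container --num-shards=101
  radosgw-admin reshard process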

[ceph-users] Removing OSDs in Mimic

2020-04-06 Thread ADRIAN NICOLAE
Hi all, I have a Ceph cluster with ~70 OSDs of different sizes running on Mimic. I'm using ceph-deploy for managing the cluster. I have to remove some smaller drives and replace them with bigger drives. From your experience, are the 'removing an OSD' guidelines from the Mimic docs accurate
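
Since the goal is replacing drives rather than shrinking the cluster, the Mimic-era "replace an OSD" flow (which keeps the OSD id) may be worth comparing against the plain removal steps; a sketch with placeholder ids/devices:

  ceph osd out osd.7
  # wait for data migration, stop the daemon, then mark it destroyed
  systemctl stop ceph-osd@7
  ceph osd destroy 7 --yes-i-really-mean-it
  # prepare the new, bigger drive and reuse the same OSD id
  ceph-volume lvm zap /dev/sdX
  ceph-volume lvm create --osd-id 7 --data /dev/sdX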