[ceph-users] Re: Balancing with upmap

2021-02-01 Thread Francois Legrand
Hi, Actually we have no EC pools... all are replica 3. And we have only 9 pools. The average number of pg/osd is not very high (40.6). Here is the detail of the pools: pool 2 replicated size 3 min_size 1 crush_rule 2 object_hash rjenkins pg_num 64 pgp_num 64 last_change 623105 lfor 0/608315/
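
For readers following along: the PG-per-OSD spread and the balancer mode can be checked with standard commands. A minimal sketch (nothing below is taken from Francois' cluster):

```
# Per-OSD utilization, PG count (PGS column) and the standard deviation
ceph osd df tree

# Balancer state; upmap mode requires all clients at luminous or newer
ceph balancer status
ceph osd set-require-min-compat-client luminous
ceph balancer mode upmap
ceph balancer on
```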

[ceph-users] Re: Balancing with upmap

2021-02-01 Thread Dan van der Ster
On Mon, Feb 1, 2021 at 10:03 AM Francois Legrand wrote: > > Hi, > > Actually we have no EC pools... all are replica 3. And we have only 9 pools. > > The average number of pg/osd is not very high (40.6). > > Here is the detail of the pools : > > pool 2 replicated size 3 min_size 1 crush_rule 2 obje

[ceph-users] Re: Multisite recovering shards

2021-02-01 Thread Eugen Block
Hi, We are using octopus 15.2.7 for bucket sync with symmetrical replication. Replication is asynchronous with both CephFS and RGW, so if your clients keep writing new data into the cluster as you state, the sync status will always stay behind a little bit. I have two one-node test cluste
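
For reference, the sync backlog Eugen describes can be inspected from either zone with standard radosgw-admin calls; a minimal sketch (the bucket name is a placeholder):

```
# Zone-wide multisite sync state, including behind/recovering shards
radosgw-admin sync status

# Sync state of a single bucket (replace with a real bucket name)
radosgw-admin bucket sync status --bucket=mybucket
```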

[ceph-users] Re: Balancing with upmap

2021-02-01 Thread Francois Legrand
This is the PG repartition as given by the command I found here http://cephnotes.ksperis.com/blog/2015/02/23/get-the-number-of-placement-groups-per-osd : pool : 35 44 36 31 32 33 2 34 43 | SUM -
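
The linked cephnotes post builds a pool-by-OSD matrix from `ceph pg dump`; a rough shell equivalent for just the per-OSD totals is sketched below. It assumes the plain-text `pgs_brief` layout where the up set is the third column, which can vary between releases:

```
# Count PGs per OSD from the up set in `ceph pg dump pgs_brief`
ceph pg dump pgs_brief 2>/dev/null \
  | awk '/^[0-9]+\./ {
           gsub(/[\[\]]/, "", $3);        # strip brackets around the up set
           n = split($3, osds, ",");
           for (i = 1; i <= n; i++) count[osds[i]]++
         }
         END { for (o in count) printf "osd.%s\t%d PGs\n", o, count[o] }' \
  | sort -V
```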

[ceph-users] Re: [Suspicious newsletter] Re: Multisite recovering shards

2021-02-01 Thread Szabo, Istvan (Agoda)
Sorry for the late response and thank you for picking up my question. I wanted to put together some detailed information; here it is, please have a look, and if you could help me I'd very much appreciate it. https://tracker.ceph.com/issues/49075 -Original Message- From: Eugen Block Sent: Monday, Februa

[ceph-users] Re: Unable to enable RBD-Mirror Snapshot on image when VM is using RBD

2021-02-01 Thread Adam Boyhan
Doing some testing, it seems this issue only happens about 10% of the time. I haven't really correlated anything that could be the cause. A resync gets things going again. Any way to further determine what this means? description: failed to copy snapshots from remote to local image From: "a
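
For reference, the status and resync operations mentioned above are plain rbd commands; a minimal sketch with placeholder pool/image names:

```
# Mirroring state and last error description for one image
rbd mirror image status mypool/myimage

# Request a full resync of the local copy from the remote cluster
rbd mirror image resync mypool/myimage

# Pool-level overview, including images reporting errors
rbd mirror pool status mypool --verbose
```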

[ceph-users] rbd resize progress report?

2021-02-01 Thread Jorge Garcia
I'm trying to resize a block device using "rbd --resize". The block device is pretty huge (100+ TB). The resize has been running for over a week, and I have no idea if it's actually doing anything, or if it's just hanging or in some infinite loop. Is there any way of getting a progress report from
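
As a hedged aside: growing an image is a near-instant metadata change, while shrinking trims objects and is the slow case; when run interactively, `rbd resize` normally prints its own progress bar unless --no-progress was passed. A sketch with placeholder names:

```
# Shrinking prints "Resizing image: N% complete..." while it trims objects
rbd resize --size 200T mypool/myimage

# Current size and object count recorded for the image
rbd info mypool/myimage

# Actual space provisioned/used per image (can be slow on huge images)
rbd du mypool/myimage
```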

[ceph-users] Re: Using RBD to pack billions of small files

2021-02-01 Thread Alex Gorbachev
Hi Loïc, Does not borg need a file system to write its files to? We do replicate the chunks incrementally with rsync, and that is a very nice and, importantly, idempotent way to sync up data to a second site. -- Alex Gorbachev ISS/Storcium On Mon, Feb 1, 2021 at 2:43 AM Loïc Dachary wrote:

[ceph-users] Re: Using RBD to pack billions of small files

2021-02-01 Thread Loïc Dachary
On 01/02/2021 20:18, Alex Gorbachev wrote: > Hi Loïc, > > Does not borg need a file system to write its files to? That's also my understanding. > We do replicate the chunks incrementally with rsync, and that is a very nice > and, importantly, idempotent way, to sync up data to a second site.

[ceph-users] Re: Using RBD to pack billions of small files

2021-02-01 Thread Dan van der Ster
Hi Loïc, We've never managed 100TB+ in a single RBD volume. I can't think of anything, but perhaps there are some unknown limitations when they get so big. It should be easy enough to use rbd bench to create and fill a massive test image to validate everything works well at that size. Also, I ass
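
A sketch of the rbd bench approach Dan suggests (pool, image name and sizes are placeholders; --io-total bounds how much data is actually written into the thin-provisioned image):

```
# Create a large, thin-provisioned test image
rbd create --size 100T mypool/bigtest

# Sequentially write a bounded amount of data into it
rbd bench --io-type write --io-size 4M --io-threads 16 \
          --io-total 1T --io-pattern seq mypool/bigtest

# Clean up when done
rbd rm mypool/bigtest
```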

[ceph-users] Re: no device listed after adding host

2021-02-01 Thread Tony Liu
"ceph log last cephadm" shows the host was added without errors. "ceph orch host ls" shows the host as well. "python3 -c import sys;exec(...)" is running on the host. But still no devices on this host is listed. Where else can I check? Thanks! Tony > -Original Message- > From: Tony Liu >

[ceph-users] Re: no device listed after adding host

2021-02-01 Thread Eugen Block
Hi, you could try ceph-volume inventory to see if it finds or reports anything. Quoting Tony Liu: "ceph log last cephadm" shows the host was added without errors. "ceph orch host ls" shows the host as well. "python3 -c import sys;exec(...)" is running on the host. But still no devices on
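
For completeness, the inventory can be run either with the host's own packages or through cephadm's containerized ceph-volume; a minimal sketch, assuming cephadm is present on that host:

```
# Directly, if the ceph-osd/ceph-volume packages are installed on the host
ceph-volume inventory

# Via cephadm (some versions expect the arguments after a literal --)
cephadm ceph-volume inventory
```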

[ceph-users] Re: Using RBD to pack billions of small files

2021-02-01 Thread Loïc Dachary
Hi Dan, On 01/02/2021 21:13, Dan van der Ster wrote: > Hi Loïc, > > We've never managed 100TB+ in a single RBD volume. I can't think of > anything, but perhaps there are some unknown limitations when they get so > big. > It should be easy enough to use rbd bench to create and fill a massive test >

[ceph-users] Re: no device listed after adding host

2021-02-01 Thread Tony Liu
Hi Eugen, I installed ceph-osd on the osd-host to run ceph-volume, which then lists all devices. But "ceph orch device ls" on the controller (mon and mgr) still doesn't show those devices. This worked when I initially built the cluster. Not sure what is missing here. Trying to find out how to trac
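
A few hedged checks that can help narrow down where the device scan stalls on a cephadm-managed Octopus cluster:

```
# Force a rescan instead of serving the mgr's cached device list
ceph orch device ls --refresh

# Recent cephadm module activity, including host/device refresh errors
ceph log last cephadm

# Overall orchestrator and cluster health
ceph orch status
ceph health detail
```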

[ceph-users] is unknown pg going to be active after osds are fixed?

2021-02-01 Thread Tony Liu
Hi, With 3 replicas, a pg has 3 osds. If all those 3 osds are down, the pg becomes unknown. Is that right? If those 3 osds are replaced and in and up, is that pg going to be eventually back to active? Or does anything else have to be done to fix it? Thanks! Tony

[ceph-users] db_devices doesn't show up in exported osd service spec

2021-02-01 Thread Tony Liu
Hi, When building the cluster with Octopus 15.2.5 initially, here is the OSD service spec file that was applied.
```
service_type: osd
service_id: osd-spec
placement:
  host_pattern: ceph-osd-[1-3]
data_devices:
  rotational: 1
db_devices:
  rotational: 0
```
After applying it, all HDDs were added and the DB of each hdd i
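
The exported spec can be compared against the file that was applied; a minimal sketch (the service name follows the service_id above):

```
# Export the stored OSD service specification(s)
ceph orch ls osd --export

# Or export every stored service spec
ceph orch ls --export
```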

[ceph-users] `cephadm` not deploying OSDs from a storage spec

2021-02-01 Thread Davor Cubranic
Hello, I am trying to set up a test cluster with the cephadm tool on Ubuntu 20.04 nodes. Following the directions at https://docs.ceph.com/en/octopus/cephadm/install/, I have set up the monitor and manager on a management node, and added two hosts that I want to use for storage. All storage de
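
A hedged checklist for this situation (the spec filename is a placeholder, and --dry-run may not be available on every Octopus point release):

```
# Are the devices on the new hosts seen and marked available?
ceph orch device ls --refresh

# Preview which OSDs the spec would create before re-applying it
ceph orch apply -i osd-spec.yml --dry-run

# Errors from the cephadm mgr module while trying to deploy
ceph log last cephadm
```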

[ceph-users] Re: osd recommended scheduler

2021-02-01 Thread Andrei Mikhailovsky
Bump - Original Message - > From: "andrei" > To: "ceph-users" > Sent: Thursday, 28 January, 2021 17:09:23 > Subject: [ceph-users] osd recommended scheduler > Hello everyone, > > Could someone please let me know what is the recommended modern kernel disk > scheduler that should be use

[ceph-users] Re: radosgw process crashes multiple times an hour

2021-02-01 Thread Andrei Mikhailovsky
bump - Original Message - > From: "andrei" > To: "Daniel Gryniewicz" > Cc: "ceph-users" > Sent: Thursday, 28 January, 2021 17:07:00 > Subject: [ceph-users] Re: radosgw process crashes multiple times an hour > Hi Daniel, > > Thanks for your reply. I've checked the package versions o

[ceph-users] Re: Issue with cephadm upgrading containers.

2021-02-01 Thread Darrin Hodges
Hi all, Still can't seem to get this upgrade to work. cephadm is at 15.2.8 but the containers are still 15.2.4. Any ideas on how to find out what the issue is? Many thanks, Darrin On 1/2/21 11:48 am, Darrin Hodges wrote: > Hi all, > > I'm attempting to upgrade our octopus 15.2.4 containers to 1
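
For reference, the upgrade state and the version each daemon actually runs can be checked with the orchestrator itself; a minimal sketch:

```
# Target image/version of the running upgrade and any error message
ceph orch upgrade status

# Version and container image reported per daemon
ceph orch ps
ceph versions

# Re-issue the upgrade if it stalled
ceph orch upgrade start --ceph-version 15.2.8
```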

[ceph-users] Re: radosgw process crashes multiple times an hour

2021-02-01 Thread Brad Hubbard
On Tue, Feb 2, 2021 at 9:20 AM Andrei Mikhailovsky wrote: > > bump Can you create a tracker for this? I'd suggest the first step would be working out what "NOTICE: invalid dest placement: default-placement/REDUCED_REDUNDANCY" is trying to tell you. Someone more familiar with rgw than I should be
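
As a hedged pointer on that NOTICE: it usually refers to a request asking for a storage class (here REDUCED_REDUNDANCY) that is not defined under the zonegroup's placement target; the configured targets and storage classes can be listed like this:

```
# Placement targets and their storage classes in the current zonegroup
radosgw-admin zonegroup placement list

# Full zonegroup and zone configuration for cross-checking
radosgw-admin zonegroup get
radosgw-admin zone get
```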

[ceph-users] Bucket synchronization works only after disable/enable, once finished, some operation maxes out SSDs/nvmes and sync degraded.

2021-02-01 Thread Szabo, Istvan (Agoda)
Hello, We have a freshly installed 3-geo-location multisite setup running Octopus, upgraded from 15.2.5 to 15.2.7. We have 6 osd nodes and 3 mon/mgr/rgw nodes in each dc, full SSD, with 3 SSDs sharing 1 nvme for journaling. Each zone is backed with 3 RGWs, one on each mon/mgr node. The goal is to replicate 2 (cu
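
For reference, the disable/enable workaround and the status checks described in this thread map to these radosgw-admin calls (the bucket name is a placeholder):

```
# Per-bucket sync state as seen from this zone
radosgw-admin bucket sync status --bucket=mybucket

# The disable/enable cycle that kicks a stuck bucket sync again
radosgw-admin bucket sync disable --bucket=mybucket
radosgw-admin bucket sync enable --bucket=mybucket

# Zone-wide view, including recovering shards
radosgw-admin sync status
```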

[ceph-users] Re: is unknown pg going to be active after osds are fixed?

2021-02-01 Thread Wido den Hollander
On 01/02/2021 22:48, Tony Liu wrote: > Hi, With 3 replicas, a pg has 3 osds. If all those 3 osds are down, the pg becomes unknown. Is that right? Yes. As no OSD can report the status to the MONs. > If those 3 osds are replaced and in and up, is that pg going to be eventually back to active? Or
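
A short sketch of how to find and inspect such a PG (the PG id 2.1a is a placeholder):

```
# Unknown PGs are listed in the health detail output
ceph health detail | grep -i unknown

# Which OSDs the PG maps to according to CRUSH
ceph pg map 2.1a

# Detailed peering state once at least one of its OSDs reports in again
ceph pg 2.1a query
```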

[ceph-users] Re: osd recommended scheduler

2021-02-01 Thread Wido den Hollander
On 28/01/2021 18:09, Andrei Mikhailovsky wrote: Hello everyone, Could someone please let me know what is the recommended modern kernel disk scheduler that should be used for SSD and HDD osds? The information in the manuals is pretty dated and refers to the schedulers which have been deprec
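
On current blk-mq kernels the available choices are none, mq-deadline, bfq and kyber; a minimal sketch for checking and changing them (device names are placeholders and the udev rule is only an illustration, not an official recommendation):

```
# Show available and active scheduler for one disk (active one in brackets)
cat /sys/block/sda/queue/scheduler

# Change it at runtime (root required); 'none' is common for SSD/NVMe,
# 'mq-deadline' is a frequent choice for HDDs
echo mq-deadline > /sys/block/sda/queue/scheduler

# Persist per device type with a udev rule (illustrative only)
cat > /etc/udev/rules.d/60-io-scheduler.rules <<'EOF'
ACTION=="add|change", KERNEL=="sd[a-z]*", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="none"
ACTION=="add|change", KERNEL=="sd[a-z]*", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="mq-deadline"
EOF
```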