I think he means that after a disk failure he waits for the cluster to get back
to OK (so all data on the lost disk have been reconstructed elsewhere) and then
the disk is changed. In that case it's normal to have misplaced objects
(because with the new disk some PGs need to be migrated to populate this new space)
We are now using osd_op_queue = wpq. Maybe returning to prio would help?
What are you using on your mimic cluster?
F.
On 25/06/2020 at 19:28, Frank Schilder wrote:
OK, this *does* sound bad. I would consider this a show stopper for upgrade
from mimic.
Best regards,
=
Frank
Hi,
in our production cluster (proxmox 5.4, ceph 12.2) there has been an issue
since yesterday. After an increase of a pool, 5 OSDs do not start;
their status is "down/in". ceph health: HEALTH_WARN nodown,noout flag(s) set,
5 osds down, 128 osds: 123 up, 128 in.
Last lines of the OSD logfile:
2020-06-26 08:40:26.
Hi all,
I'm going to deploy a cluster with an erasure code pool for cold storage.
There are 3 servers for me to set up the cluster, 12 OSDs on each server.
Does that mean the data is secure while 1/3 of the cluster's OSDs are down,
or only while 2 of the OSDs are down, if I set the EC profile with k=4 and m=2?
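For reference, a minimal sketch of how such a profile and pool could be created; the profile name, pool name and PG count are placeholders:

  ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
  ceph osd pool create cold-ec 128 128 erasure ec42
  ceph osd erasure-code-profile get ec42

Note that with k=4, m=2 and crush-failure-domain=host, CRUSH needs 6 distinct hosts to place all chunks, so on 3 hosts the PGs could not be fully mapped; the later replies about failure domains deal with exactly this.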
Thanks. I will try to change osd_op_queue_cut_off to high and restart
everything (and use this downtime to upgrade the servers).
F.
On 26/06/2020 at 09:46, Frank Schilder wrote:
I'm using
osd_op_queue = wpq
osd_op_queue_cut_off = high
and these settings are recommended.
Best regards,
=
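For reference, a minimal sketch of how these two options could be set in ceph.conf; placing them under [global] so both OSDs and the MDS pick them up follows a later reply in this thread, and daemons need a restart for the change to apply:

  [global]
  osd_op_queue = wpq
  osd_op_queue_cut_off = high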
From my point of view, it's better to have no more than 6 OSD WAL/DBs on one
NVMe. I think that may be the root cause of the slow requests.
Mark Kirkwood wrote on Fri, 26 Jun 2020 at 07:47:
> Progress update:
>
> - tweaked debug_rocksdb to 1/5. *possibly* helped, fewer slow requests
>
> - will increase osd_m
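As an aside, a debug level like that can usually be changed at runtime; a sketch, assuming the setting should reach every OSD and also be persisted in ceph.conf:

  ceph tell osd.* injectargs '--debug_rocksdb 1/5'
  # and, to persist across restarts, in the [osd] section of ceph.conf:
  #   debug rocksdb = 1/5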
On Fri, 26 Jun 2020 at 10:32, Zhenshi Zhou wrote:
> Hi all,
>
> I'm going to deploy a cluster with an erasure code pool for cold storage.
> There are 3 servers for me to set up the cluster, 12 OSDs on each server.
> Does that mean the data is secure while 1/3 of the cluster's OSDs are down,
> or only 2
Hi Marc,
None of the CephFS issues are show-stoppers, but we're waiting for them
to land in nautilus anyway:
* https://tracker.ceph.com/issues/45090
* https://tracker.ceph.com/issues/45261
* https://tracker.ceph.com/issues/45835
* https://tracker.ceph.com/issues/45875
Cheers, Dan
On Thu, Jun 25,
On 26/06/2020 5:27 pm, Francois Legrand wrote:
In that case it's normal to have misplaced objects (because with the new disk
some PGs need to be migrated to populate this new space), but degraded PGs do
not seem to be the correct behaviour!
Yes, that would be bad, not sure if that's the proce
On 26/06/2020 6:31 pm, Zhenshi Zhou wrote:
I'm going to deploy a cluster with an erasure code pool for cold storage.
There are 3 servers for me to set up the cluster, 12 OSDs on each server.
Does that mean the data is secure while 1/3 of the cluster's OSDs are down,
or only while 2 of the OSDs are down, if I
Hi Janne,
I use the default profile (2+1) and set failure-domain=host; is that best
practice?
Janne Johansson wrote on Fri, 26 Jun 2020 at 16:59:
> On Fri, 26 Jun 2020 at 10:32, Zhenshi Zhou wrote:
>
>> Hi all,
>>
>> I'm going to deploy a cluster with an erasure code pool for cold storage.
>> There are 3 server
Hi Lindsay,
I have only 3 hosts; is there any method to set up an EC pool cluster in a
better way?
Lindsay Mathieson wrote on Fri, 26 Jun 2020 at 18:03:
> On 26/06/2020 6:31 pm, Zhenshi Zhou wrote:
> > I'm going to deploy a cluster with an erasure code pool for cold storage.
> > There are 3 servers for me to s
On 26/06/2020 8:08 pm, Zhenshi Zhou wrote:
Hi Lindsay,
I have only 3 hosts; is there any method to set up an EC pool cluster
in a better way?
There's failure domain by OSD, which Janne knows far better than I :)
--
Lindsay
I will give it a try, thanks:)
Lindsay Mathieson wrote on Fri, 26 Jun 2020 at 19:07:
> On 26/06/2020 8:08 pm, Zhenshi Zhou wrote:
> > Hi Lindsay,
> >
> > I have only 3 hosts; is there any method to set up an EC pool cluster
> > in a better way?
>
> There's failure domain by OSD, which Janne knows far better
This depends on which point in the procedure you refer to. He explicitly wrote
> Note, we have not deployed the new OSD yet.
meaning he observed misplaced objects before deploying the new disk. This
should not happen.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
I'm using
osd_op_queue = wpq
osd_op_queue_cut_off = high
and these settings are recommended.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Francois Legrand
Sent: 26 June 2020 09:44:00
To: Frank Schilder; ceph-
I'm running EC 8+2 with 'failure domain OSD' on a 3-node cluster with 24
OSDs. Until one has 10s of nodes, it pretty much has to be failure domain
OSD.
The documentation lists certain other important settings which took some time
to find. Most important are recommendations to have a small replicat
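For illustration, a sketch of what such an 8+2 profile with failure domain OSD could look like; the profile name, pool name and PG count are placeholders, not taken from this cluster:

  ceph osd erasure-code-profile set ec82 k=8 m=2 crush-failure-domain=osd
  ceph osd pool create archive-ec 256 256 erasure ec82

With failure domain OSD, several chunks of one object can share a host, so a single node failure can take out more than m chunks; that is the trade-off discussed elsewhere in this thread.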
I changed osd_op_queue_cut_off to high and rebooted all the OSDs, but
the result is more or less the same (storage is still extremely slow:
2h30 to extract a 64GB image with rbd!). The only improvement is that the
degraded PGs seem to have disappeared (which is at least a good
point). It seems tha
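One way to put a number on that slowness independently of rbd is a short RADOS benchmark against a scratch pool; a sketch, with pool name and duration as placeholders:

  rados bench -p testpool 60 write --no-cleanup
  rados bench -p testpool 60 seq
  rados -p testpool cleanup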
Does somebody use mclock in a production cluster?
M=1 is never a good choice. Just use replication instead.
> On Jun 26, 2020, at 3:05 AM, Zhenshi Zhou wrote:
>
> Hi Janne,
>
> I use the default profile (2+1) and set failure-domain=host; is that best
> practice?
>
> Janne Johansson wrote on Fri, 26 Jun 2020 at 16:59:
>
>> On Fri, 26 Jun 2020 at 10:3
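Following that advice, a minimal sketch of a replicated pool in place of a 2+1 EC pool; the pool name and PG count are placeholders:

  ceph osd pool create cold-rep 128 128 replicated
  ceph osd pool set cold-rep size 3
  ceph osd pool set cold-rep min_size 2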
Thanks. I also set osd_op_queue_cut_off to high in the global section (as you
mentioned in a previous thread that both osd and mds should use it).
F.
On 26/06/2020 at 16:35, Frank Schilder wrote:
I never tried "prio" out, but the reports I have seen claim that prio is
inferior.
However, as far as I know i
Have you checked if the OSD keyring is present in /var/lib/ceph/osd/?
Compare the content to other OSDs that do restart successfully.
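A sketch of such a check; the OSD id is a placeholder:

  # on the host of the failing OSD
  ls -l /var/lib/ceph/osd/ceph-<id>/
  cat /var/lib/ceph/osd/ceph-<id>/keyring
  # compare with the key registered in the cluster
  ceph auth get osd.<id>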
Quoting "Naumann, Thomas":
Hi,
in our production cluster (proxmox 5.4, ceph 12.2) there has been an issue
since yesterday. After an increase of a pool, 5 OSDs do
As others have pointed out, setting the failure domain to OSD is dangerous
because then all 6 chunks for an object can end up on the same host. 6 hosts
really seems like the minimum to mess with EC pools.
Adding a bucket type between host and osd seems like a good idea here, if you
absolutely
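Adding a bucket type means editing the CRUSH map; a rough sketch of the usual decompile/edit/recompile cycle, with file names and the new type left as placeholders:

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  # add a new entry with an unused id to the "types" section, define buckets
  # of that type under each host, and point the EC rule's chooseleaf step at it
  crushtool -c crushmap.txt -o crushmap.new
  ceph osd setcrushmap -i crushmap.new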
We're happy to announce the tenth release in the Nautilus series. In
addition to fixing a security-related bug in RGW, this release brings a
number of bugfixes across all major components of Ceph. We recommend
that all Nautilus users upgrade to this release. For a detailed
changelog please refer t
Hello!
Thanks for bringing this issue up, Victoria.
Ramana and David - we're using shaman to look up appropriate builds of packages
on chacra to test Ceph with OpenStack Cinder, Manila, Nova, and Glance in the
upstream OpenStack projects.
This LRC outage hit us - we're sorted for everything e
I never tried "prio" out, but the reports I have seen claim that prio is
inferior.
However, as far as I know it is safe to change these settings. Unfortunately,
you need to restart services to apply the changes.
Before you do, check if *all* daemons are using the same setting. Contrary to
the
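To verify what each daemon is actually running with, something like the following could be used; the daemon ids are placeholders, and the admin-socket query has to run on that daemon's host:

  ceph daemon osd.0 config get osd_op_queue
  ceph daemon osd.0 config get osd_op_queue_cut_off
  # on Mimic and later, the effective value can also be queried centrally
  ceph config show osd.0 osd_op_queue_cut_off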
Hi all,
I have a question regarding pointer variables used in the __crush_do_rule__
function of CRUSH __mapper.c__. Can someone please help me understand the
purpose of the following four pointer variables inside __crush_do_rule__:
int *b = a + result_max;
int *c = b + result_max;
int *w = a;
int *o
Can anyone explain why ktdreyer's dev repo is still landing on production
installs?
[root@centos8 ~]# ./cephadm add-repo --release octopus
INFO:root:Writing repo to /etc/yum.repos.d/ceph.repo...
INFO:cephadm:Enabling EPEL...
INFO:cephadm:Enabling supplementary copr repo ktdreyer/ceph-el8...
[root@