[ceph-users] Adding additional disks to the production cluster without performance impacts on the existing

2018-06-06 Thread John Molefe
Hi everyone, We have completed all phases and the only remaining part is adding the disks to the current cluster, but I am afraid of impacting performance as it is in production. Any guides or advice on how this can be achieved with the least impact on production? Thanks in advance John Vr

[ceph-users] Update to Mimic with prior Snapshots leads to MDS damaged metadata

2018-06-06 Thread Tobias Florek
Hi, I upgraded a Ceph cluster to Mimic yesterday according to the release notes. Specifically, I stopped all standby MDS daemons and then restarted the only active MDS with the new version. The cluster was installed with Luminous. Its CephFS volume had snapshots prior to the update, but only one active M

[ceph-users] Problem with S3 policy (grant RW access)

2018-06-06 Thread Valéry Tschopp
Hello, We have a problem with an R/W policy on a bucket. If the bucket owner grants read/write access to another user, the objects created by the grantee are not accessible to the owner (see below)! Why can the owner of a bucket not access objects created by a grantee? Is it a bug? ## Setup

Re: [ceph-users] Jewel/Luminous Filestore/Bluestore for a new cluster

2018-06-06 Thread Simon Ironside
On 05/06/18 01:14, Subhachandra Chandra wrote: We have not observed any major issues. We have had occasional OSD daemon crashes due to an assert which is a known bug but the cluster recovered without any intervention each time. All the nodes have been rebooted 2-3 times due to CoreOS updates a

Re: [ceph-users] Stop scrubbing

2018-06-06 Thread Alexandru Cucu
Hi, The only way I know is pretty brutal: list all the PGs with a scrubbing process, get the primary OSD, and mark it as down. The scrubbing process will stop. Make sure you set the noout, norebalance and norecover flags so you don't add even more load to your cluster. On Tue, Jun 5, 2018 at 11:4
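
As a rough sketch of that approach (the flag name in Luminous is "norecover"; the OSD id is a placeholder):

    ceph pg dump pgs_brief | grep -i scrub      # find scrubbing PGs and their acting primary
    ceph osd set noout
    ceph osd set norebalance
    ceph osd set norecover
    ceph osd down <osd-id>                      # mark the scrubbing PG's primary down; it rejoins on its own
    ceph osd unset norecover; ceph osd unset norebalance; ceph osd unset noout   # when done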

[ceph-users] How to throttle operations like "rbd rm"

2018-06-06 Thread Yao Guotao
Hi Cephers, We use Ceph with OpenStack via the librbd library. Last week, my colleague deleted 10 volumes from the OpenStack dashboard at the same time; each volume had about 1 TB used. During this time the OSD disks were busy, and there was no I/O for the normal VMs. So, I want to know if there are an

[ceph-users] Reduced productivity because of slow requests

2018-06-06 Thread Grigory Murashov
Hello Cephers! I have a Luminous 12.2.5 cluster of 3 nodes with 5 OSDs each, fronted by an S3 RGW. All OSDs are HDDs. I often (about twice a day) have a slow-request problem which reduces cluster efficiency. It can start both at the daily peak and at night; it doesn't seem to matter. That's what I have in ceph health d
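
For reference, the usual commands for tracking down where slow requests originate (not quoted from the truncated message; the OSD id is a placeholder):

    ceph health detail                        # lists the OSDs with blocked/slow requests
    ceph daemon osd.<id> dump_ops_in_flight   # run on the host carrying that OSD
    ceph daemon osd.<id> dump_historic_ops    # recently completed slow ops with per-stage timings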

Re: [ceph-users] Reduced productivity because of slow requests

2018-06-06 Thread Piotr Dałek
On 18-06-06 01:57 PM, Grigory Murashov wrote: Hello cephers! I have luminous 12.2.5 cluster of 3 nodes 5 OSDs each with S3 RGW. All OSDs are HDD. I often (about twice a day) have slow request problem which reduces cluster efficiency. It can be started both in day peak and night time. Doesn't

[ceph-users] CephFS/ceph-fuse performance

2018-06-06 Thread Andras Pataki
We're using CephFS with Luminous 12.2.5 and the fuse client (on CentOS 7.4, kernel 3.10.0-693.5.2.el7.x86_64).  Performance has been very good generally, but we're currently running into some strange performance issues with one of our applications.  The client in this case is on a higher latenc

Re: [ceph-users] Update to Mimic with prior Snapshots leads to MDS damaged metadata

2018-06-06 Thread Yan, Zheng
On Wed, Jun 6, 2018 at 3:25 PM, Tobias Florek wrote: > Hi, > > I upgraded a ceph cluster to mimic yesterday according to the release > notes. Specifically I did stop all standby MDS and then restarted the > only active MDS with the new version. > > The cluster was installed with luminous. Its ceph

Re: [ceph-users] How to throttle operations like "rbd rm"

2018-06-06 Thread Jason Dillaman
The 'rbd_concurrent_management_ops' setting controls how many concurrent, in-flight RADOS object delete operations are possible per image removal. The default is only 10, so given 10 images being deleted concurrently, I am actually surprised that it blocked all I/O from your VMs. Adding support fo
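
A hedged sketch of how that throttle can be lowered on the client doing the deletions (librbd reads it from the client-side ceph.conf, e.g. on the Cinder node; the value 5 is illustrative, pool/image names are placeholders):

    [client]
        rbd concurrent management ops = 5

    # or as a one-off for a CLI removal:
    CEPH_ARGS="--rbd-concurrent-management-ops 5" rbd rm <pool>/<image>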

Re: [ceph-users] Reduced productivity because of slow requests

2018-06-06 Thread Jamie Fargen
Is bond0.111 just for RGW traffic? Is there a load balancer in front of your RGWs? From the graph you linked to for bond0.111, it seems like you might just be having a spike in RADOS Gateway traffic. You might want to dig into the logs at those times on your load balancer/RGWs to see if you can

Re: [ceph-users] Update to Mimic with prior Snapshots leads to MDS damaged metadata

2018-06-06 Thread Yan, Zheng
Tob wrote on Wednesday, 6 June 2018 at 22:21: > Hi! > > Thank you for your reply. > > I just did: > > > The correct commands should be: > > > > ceph daemon scrub_path / force recursive repair > > ceph daemon scrub_path '~mdsdir' force recursive repair > > They returned instantly and in the mds' logfile only the f
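
The daemon target in the quoted commands appears to have been stripped by the archive; the full admin-socket form would look roughly like this, with mds.<name> standing in for the active MDS:

    ceph daemon mds.<name> scrub_path / force recursive repair
    ceph daemon mds.<name> scrub_path '~mdsdir' force recursive repair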

Re: [ceph-users] CephFS/ceph-fuse performance

2018-06-06 Thread Gregory Farnum
On Wed, Jun 6, 2018 at 5:52 AM, Andras Pataki wrote: > We're using CephFS with Luminous 12.2.5 and the fuse client (on CentOS 7.4, > kernel 3.10.0-693.5.2.el7.x86_64). Performance has been very good > generally, but we're currently running into some strange performance issues > with one of our ap

Re: [ceph-users] Reduced productivity because of slow requests

2018-06-06 Thread Grigory Murashov
Hello Jamie! Do you think this spike is the cause of the problem, not a consequence? Grigory Murashov On 06.06.2018 16:57, Jamie Fargen wrote: Is bond0.111 just for RGW traffic? Is there a load balancer in front of your RGWs? From the graph you linked to for bond0.111, it seems like you might

Re: [ceph-users] CephFS/ceph-fuse performance

2018-06-06 Thread Andras Pataki
Hi Greg, The docs say that client_cache_size is the number of inodes that are cached, not bytes of data.  Is that incorrect? Andras On 06/06/2018 11:25 AM, Gregory Farnum wrote: On Wed, Jun 6, 2018 at 5:52 AM, Andras Pataki wrote: We're using CephFS with Luminous 12.2.5 and the fuse clien

[ceph-users] Reinstall everything

2018-06-06 Thread Max Cuttins
Hi everybody, I would like to start from zero. However, the last time I ran the command to purge everything I hit an issue. I had a completely cleaned-up system as expected, but the disks were still flagged as OSDs and the new installation refused to overwrite disks in use. The only way to make it work was to manually form
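
If leftover OSD metadata on the disks is what blocks the reinstall, a minimal sketch of wiping a device (the device name is an example; this destroys all data on it):

    wipefs -a /dev/sdb
    sgdisk --zap-all /dev/sdb
    ceph-volume lvm zap /dev/sdb    # Luminous alternative that also clears LVM metadata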

[ceph-users] QEMU maps RBD but can't read them

2018-06-06 Thread Wladimir Mutel
Dear all, I installed QEMU, libvirtd and its RBD plugins and am now trying to make QEMU use my Ceph storage. I created an 'iso' pool and imported a Windows installation image there (rbd import). Also, I created a 'libvirt' pool and there created a 2.7-TB image

Re: [ceph-users] QEMU maps RBD but can't read them

2018-06-06 Thread Jason Dillaman
Can you run "rbd --id libvirt --pool libvirt win206-test-3tb " w/o error? It sounds like your CephX caps for client.libvirt are not permitting read access to the image data objects. On Wed, Jun 6, 2018 at 2:18 PM, Wladimir Mutel wrote: > > Dear all, > > I installed QEMU, libvirtd

Re: [ceph-users] Stop scrubbing

2018-06-06 Thread Joe Comeau
When I am upgrading from filestore to bluestore or any other server maintenance for a short time (ie high I/O while rebuilding) ceph osd set noout ceph osd set noscrub ceph osd set nodeep-scrub when finished ceph osd unset noscrub ceph osd unset nodeep-scrub ceph osd unset noout again on
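
Laid out one per line, the sequence quoted above is:

    ceph osd set noout
    ceph osd set noscrub
    ceph osd set nodeep-scrub
    # ... perform the maintenance / filestore-to-bluestore conversion ...
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub
    ceph osd unset noout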

Re: [ceph-users] QEMU maps RBD but can't read them

2018-06-06 Thread Wladimir Mutel
Jason Dillaman wrote: Can you run "rbd --id libvirt --pool libvirt win206-test-3tb " w/o error? It sounds like your CephX caps for client.libvirt are not permitting read access to the image data objects. I tried to run 'rbd export' with these params, but it said it was unable to

Re: [ceph-users] QEMU maps RBD but can't read them

2018-06-06 Thread Jason Dillaman
The caps for those users look correct for Luminous and later clusters. Any chance you are using data pools with the images? It's just odd that you have enough permissions to open the RBD image but cannot read its data objects. On Wed, Jun 6, 2018 at 2:46 PM, Wladimir Mutel wrote: > Jason Dillama

Re: [ceph-users] QEMU maps RBD but can't read them

2018-06-06 Thread Wladimir Mutel
Jason Dillaman wrote: The caps for those users looks correct for Luminous and later clusters. Any chance you are using data pools with the images? It's just odd that you have enough permissions to open the RBD image but cannot read its data objects. Yes, I use erasure-pool as data-pool
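
One thing worth checking here (an assumption on my part, not confirmed by the truncated thread): with a separate erasure-coded data pool, client.libvirt needs OSD caps on that data pool as well, not only on 'libvirt'. Pool names below are examples:

    ceph auth caps client.libvirt \
        mon 'profile rbd' \
        osd 'profile rbd pool=libvirt, profile rbd pool=ec-data'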

Re: [ceph-users] QEMU maps RBD but can't read them

2018-06-06 Thread Jason Dillaman
On Wed, Jun 6, 2018 at 3:02 PM, Wladimir Mutel wrote: > Jason Dillaman wrote: >> >> The caps for those users looks correct for Luminous and later >> clusters. Any chance you are using data pools with the images? It's >> just odd that you have enough permissions to open the RBD image but >> cannot

[ceph-users] Prioritize recovery over backfilling

2018-06-06 Thread Caspar Smit
Hi all, We have a Luminous 12.2.2 cluster with 3 nodes and I recently added a node to it. osd-max-backfills is at the default of 1, so backfilling didn't go very fast, but that doesn't matter. Once it started backfilling, everything looked OK: ~300 pgs in backfill_wait ~10 pgs backfilling (~number of
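
Luminous also added per-PG priority overrides that may help in this situation (PG ids are placeholders; whether they apply here is a judgment call):

    ceph pg force-recovery <pgid> [<pgid> ...]
    ceph pg force-backfill <pgid> [<pgid> ...]
    ceph tell osd.* injectargs '--osd-max-backfills 2'   # optionally raise the backfill throttle temporarily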

Re: [ceph-users] CephFS/ceph-fuse performance

2018-06-06 Thread Andras Pataki
Staring at the logs a bit more it seems like the following lines might be the clue: 2018-06-06 08:14:17.615359 7fffefa45700 10 objectcacher trim  start: bytes: max 2147483640  clean 2145935360, objects: max 8192 current 8192 2018-06-06 08:14:17.615361 7fffefa45700 10 objectcacher trim finish: 
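
Those counters map to the ceph-fuse object cacher limits; if the cache really is pinned at its ceiling, raising them in the client's ceph.conf is one knob to try (values below are illustrative):

    [client]
        client oc size = 4294967296       # object cache bytes; the log shows a ~2 GB max
        client oc max objects = 16384     # the log shows max 8192, current 8192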

Re: [ceph-users] CephFS/ceph-fuse performance

2018-06-06 Thread Gregory Farnum
> > On 06/06/2018 12:22 PM, Andras Pataki wrote: > > Hi Greg, > > > > The docs say that client_cache_size is the number of inodes that are > > cached, not bytes of data. Is that incorrect? > Oh whoops, you're correct of course. Sorry about that! On Wed, Jun 6, 2018 at 12:33 PM Andras Pataki wro

Re: [ceph-users] QEMU maps RBD but can't read them

2018-06-06 Thread Wladimir Mutel
Jason Dillaman wrote: The caps for those users looks correct for Luminous and later clusters. Any chance you are using data pools with the images? It's just odd that you have enough permissions to open the RBD image but cannot read its data objects. Yes, I use erasure-pool as data-po

Re: [ceph-users] Ceph Developer Monthly - June 2018

2018-06-06 Thread Leonardo Vaz
On Fri, Jun 1, 2018 at 4:56 PM, Leonardo Vaz wrote: > Hey Cephers, > > This is just a friendly reminder that the next Ceph Developer Monthly > meeting is coming up: > > http://wiki.ceph.com/Planning > > If you have work that you're doing that is feature work, significant > backports, or anythin

[ceph-users] Openstack VMs with Ceph EC pools

2018-06-06 Thread Pardhiv Karri
Hi, Is anyone using OpenStack with Ceph erasure-coded pools, now that Luminous supports RBD on them? If so, how is the performance? Thanks, Pardhiv Karri
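
For context, the Luminous way to back RBD images with an EC pool (names and PG counts are illustrative; allow_ec_overwrites requires BlueStore OSDs):

    ceph osd pool create rbd-ec-data 128 128 erasure
    ceph osd pool set rbd-ec-data allow_ec_overwrites true
    # image metadata stays in a replicated pool; only data objects land in the EC pool
    rbd create --size 100G --data-pool rbd-ec-data vms/test-image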

Re: [ceph-users] pg inconsistent, scrub stat mismatch on bytes

2018-06-06 Thread Adrian
Update to this. The affected pg didn't seem inconsistent: [root@admin-ceph1-qh2 ~]# ceph health detail HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent OSD_SCRUB_ERRORS 1 scrub errors PG_DAMAGED Possible data damage: 1 pg inconsistent pg 6.20 is active+clean+inconsistent, act
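
The usual follow-up for a PG flagged with scrub errors like 6.20 (assuming the mismatch is confirmed; for a pure stat mismatch this is generally considered safe) would be:

    ceph pg deep-scrub 6.20    # re-check first
    ceph pg repair 6.20        # let the primary repair the stat mismatch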

Re: [ceph-users] QEMU maps RBD but can't read them

2018-06-06 Thread Jason Dillaman
On Wed, Jun 6, 2018 at 4:48 PM, Wladimir Mutel wrote: > Jason Dillaman wrote: > The caps for those users looks correct for Luminous and later clusters. Any chance you are using data pools with the images? It's just odd that you have enough permissions to open the RBD image but

[ceph-users] mimic cephfs snapshot in active/standby mds env

2018-06-06 Thread Brady Deetz
I've seen several mentions of stable CephFS snapshots in Mimic for multi-active MDS environments. I'm currently running active/standby on 12.2.5 with no snapshots. If I upgrade to Mimic, is there any concern with snapshots in an active/standby MDS environment? It seems like a silly question sinc

Re: [ceph-users] How to throttle operations like "rbd rm"

2018-06-06 Thread Yao Guotao
Hi Jason, Thank you very much for your reply. I think the RBD trash is a good way. But QoS in Ceph would be a better solution; I am looking forward to backend QoS in Ceph. Thanks. At 2018-06-06 21:53:23, "Jason Dillaman" wrote: >The 'rbd_concurrent_management_ops' setting controls how m
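
For reference, the Luminous rbd trash workflow mentioned above (pool/image names are placeholders); moving an image to the trash is a quick metadata operation, so the slow space reclamation can be deferred to a quiet period:

    rbd trash mv <pool>/<image>       # fast; the image disappears from 'rbd ls' immediately
    rbd trash ls <pool>               # shows the trashed image and its id
    rbd trash rm <pool>/<image-id>    # the actual (slow) deletion, run off-peak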

[ceph-users] rbd map hangs

2018-06-06 Thread Tracy Reed
Hello all! I'm running Luminous with old-style non-BlueStore OSDs, though with ceph 10.2.9 clients; I haven't been able to upgrade those yet. Occasionally, access to RBDs hangs on the client, such as right now. I tried to dd a VM image into a mapped rbd and it just hung. Then I tried to map a new

Re: [ceph-users] Stop scrubbing

2018-06-06 Thread Wido den Hollander
On 06/06/2018 08:32 PM, Joe Comeau wrote: > When I am upgrading from filestore to bluestore > or any other server maintenance for a short time > (ie high I/O while rebuilding) >   > ceph osd set noout > ceph osd set noscrub > ceph osd set nodeep-scrub >   > when finished >   > ceph osd unset nos

Re: [ceph-users] Update to Mimic with prior Snapshots leads to MDS damaged metadata

2018-06-06 Thread Tobias Florek
Hi! Thank you for your help! The cluster has been running healthily for a day now. Regarding the problem, I just checked the release notes [1] and docs.ceph.com and did not find the right invocation to use after an upgrade. Maybe that ought to be fixed. >> [upgrade from luminous to mimic with prior cep

Re: [ceph-users] Prioritize recovery over backfilling

2018-06-06 Thread Piotr Dałek
On 18-06-06 09:29 PM, Caspar Smit wrote: Hi all, We have a Luminous 12.2.2 cluster with 3 nodes and i recently added a node to it. osd-max-backfills is at the default 1 so backfilling didn't go very fast but that doesn't matter. Once it started backfilling everything looked ok: ~300 pgs i

Re: [ceph-users] mimic cephfs snapshot in active/standby mds env

2018-06-06 Thread Yan, Zheng
On Thu, Jun 7, 2018 at 10:04 AM, Brady Deetz wrote: > I've seen several mentions of stable snapshots in Mimic for cephfs in > multi-active mds environments. I'm currently running active/standby in > 12.2.5 with no snapshops. If I upgrade to Mimic, is there any concern with > snapshots in an active