Re: [ceph-users] What is the meaning of size and min_size for erasure-coded pools?

2018-05-09 Thread Maciej Puzio
I still don't understand why I get any clean PGs in the erasure-coded pool, when with two OSDs down there is no more redundancy, and therefore all PGs should be undersized (or so I think). I repeated the experiment by bringing two remaining OSDs online, and then killing them, and got results simila
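
A minimal way to check what the cluster actually thinks of those PGs; the profile and pool names below (ec32, cephfs_data) are placeholders, not taken from the thread:

  ceph osd erasure-code-profile get ec32          # confirm k, m and the crush failure domain
  ceph osd pool get cephfs_data min_size          # min_size currently applied to the EC pool
  ceph pg ls-by-pool cephfs_data undersized       # PGs the cluster flags as undersized
  ceph pg dump pgs_brief | grep -v 'active+clean' # anything not fully clean, with its acting set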

Re: [ceph-users] Public network faster than cluster network

2018-05-09 Thread Christian Balzer
On Wed, 09 May 2018 20:46:19 + Gandalf Corvotempesta wrote: > As subject, what would happen ? > This cosmic imbalance would clearly lead to the end of the universe. Seriously, think it through, what do you _think_ will happen? Depending on the actual networks in question, the configuration

Re: [ceph-users] Public network faster than cluster network

2018-05-09 Thread David Turner
Nothing out of the ordinary. The cluster network would operate at its speed and the public would do the same. The traffic on the cluster network is all communication between OSDs. No other ceph daemon other than OSDs use that network. All communication to the OSDs from any other daemon or service
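
For reference, the split David describes comes down to two options in ceph.conf; a sketch with RFC 5737 example subnets (substitute your own):

  [global]
      public network  = 192.0.2.0/24      # MONs, MGRs, MDSs, clients, and client-facing OSD traffic
      cluster network = 198.51.100.0/24   # OSD-to-OSD replication, recovery and heartbeat traffic only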

Re: [ceph-users] What is the meaning of size and min_size for erasure-coded pools?

2018-05-09 Thread Gregory Farnum
On Wed, May 9, 2018 at 4:37 PM, Maciej Puzio wrote: > My setup consists of two pools on 5 OSDs, and is intended for cephfs: > 1. erasure-coded data pool: k=3, m=2, size=5, min_size=3 (originally > 4), number of PGs=128 > 2. replicated metadata pool: size=3, min_size=2, number of PGs=100 > > When a

Re: [ceph-users] What is the meaning of size and min_size for erasure-coded pools?

2018-05-09 Thread Maciej Puzio
My setup consists of two pools on 5 OSDs, and is intended for cephfs: 1. erasure-coded data pool: k=3, m=2, size=5, min_size=3 (originally 4), number of PGs=128 2. replicated metadata pool: size=3, min_size=2, number of PGs=100 When all OSDs were online, all PGs from both pools had status active+c
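
A sketch of how a layout like this can be created on Luminous, assuming a failure domain of osd (only 5 OSDs) and placeholder profile/pool names; not the exact commands used in the thread:

  ceph osd erasure-code-profile set ec32 k=3 m=2 crush-failure-domain=osd
  ceph osd pool create cephfs_data 128 128 erasure ec32
  ceph osd pool set cephfs_data allow_ec_overwrites true   # needed for an EC pool behind CephFS on BlueStore
  ceph osd pool create cephfs_metadata 100 100 replicated
  ceph osd pool set cephfs_metadata size 3
  ceph osd pool set cephfs_metadata min_size 2
  ceph fs new cephfs cephfs_metadata cephfs_data           # may need --force, as an EC default data pool is discouraged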

Re: [ceph-users] What is the meaning of size and min_size for erasure-coded pools?

2018-05-09 Thread Gregory Farnum
On Tue, May 8, 2018 at 2:16 PM Maciej Puzio wrote: > Thank you everyone for your replies. However, I feel that at least > part of the discussion deviated from the topic of my original post. As > I wrote before, I am dealing with a toy cluster, whose purpose is not > to provide a resilient storage

Re: [ceph-users] Inconsistent PG automatically got "repaired" automatically?

2018-05-09 Thread Gregory Farnum
On Wed, May 9, 2018 at 8:21 AM Nikos Kormpakis wrote: > Hello, > > we operate a Ceph cluster running v12.2.4, on top of Debian Stretch, > deployed > with ceph-volume lvm with a default crushmap and a quite vanilla > ceph.conf. OSDs > live on single disks in JBOD mode, with a separate block.db LV

[ceph-users] Public network faster than cluster network

2018-05-09 Thread Gandalf Corvotempesta
As subject, what would happen?

Re: [ceph-users] Ceph RBD trim performance

2018-05-09 Thread Jason Dillaman
It's not a zero-cost operation since it involves librbd issuing 1 to N delete or truncate ops to the OSDs. The OSD will treat those ops just like any other read or write op (i.e. they don't take lower priority). Additionally, any discard smaller than the stripe period of the RBD image will potentia
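
One practical consequence: it can help to align guest discards with the image's object size. A hedged sketch (the pool/image name and mount point are made up):

  rbd info vms/vm01 | grep -E 'order|stripe'   # e.g. "order 22 (4096 kB objects)"
  fstrim -v -m 4M /mnt/data                    # inside the guest: skip free extents smaller than 4 MiB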

[ceph-users] Ceph RBD trim performance

2018-05-09 Thread Andre Goree
Is there anything I should be concerned with performance-wise when having my libvirt-based guest systems run trim on their attached RBD devices? I'd imagine it all happens in the background (on the Ceph cluster) with minimal performance hit, given the fact that normal removal/deletion of files

Re: [ceph-users] Question: CephFS + Bluestore

2018-05-09 Thread Webert de Souza Lima
Hey Jon! On Wed, May 9, 2018 at 12:11 PM, John Spray wrote: > It depends on the metadata intensity of your workload. It might be > quite interesting to gather some drive stats on how many IOPS are > currently hitting your metadata pool over a week of normal activity. > Any ceph built-in tool f
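
One way to gather the numbers John asks about is to sample per-pool client IO over the week; a rough sketch, assuming the metadata pool is called cephfs_metadata:

  while true; do
      date
      ceph osd pool stats cephfs_metadata -f json   # instantaneous read/write ops against the pool
      sleep 60
  done >> cephfs_metadata_iops.log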

Re: [ceph-users] fstrim issue in VM for cloned rbd image with fast-diff feature

2018-05-09 Thread Jason Dillaman
On Wed, May 9, 2018 at 9:13 AM, Youzhong Yang wrote: > Thanks Jason. > > Yes, my concern is that fstrim increases clones' disk usage. The VM didn't > use any additional space but fstrim caused its disk usage (in ceph) to go up > significantly. Imagine when there are hundreds of VMs, it would soon

Re: [ceph-users] fstrim issue in VM for cloned rbd image with fast-diff feature

2018-05-09 Thread Youzhong Yang
Thanks Jason. Yes, my concern is that fstrim increases clones' disk usage. The VM didn't use any additional space but fstrim caused its disk usage (in ceph) to go up significantly. Imagine when there are hundreds of VMs, it would soon cause a space issue. If this is expected behavior, does it mean

Re: [ceph-users] fstrim issue in VM for cloned rbd image with fast-diff feature

2018-05-09 Thread Jason Dillaman
On Wed, May 9, 2018 at 11:39 AM, Youzhong Yang wrote: > This is what I did: > > # rbd import /var/tmp/debian93-raw.img images/debian93 > # rbd info images/debian93 > rbd image 'debian93': > size 81920 MB in 20480 objects > order 22 (4096 kB objects) > block_name_prefix: rbd_data.384b74b0dc51 >

[ceph-users] fstrim issue in VM for cloned rbd image with fast-diff feature

2018-05-09 Thread Youzhong Yang
This is what I did: # rbd import /var/tmp/debian93-raw.img images/debian93 # rbd info images/debian93 rbd image 'debian93': size 81920 MB in 20480 objects order 22 (4096 kB objects) block_name_prefix: rbd_data.384b74b0dc51 format: 2 features: layering, exclusive-lock, object-map, fast-diff, d
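
A way to quantify what fstrim does to a clone is to compare rbd du before and after; a sketch built on the image above, with the clone name (vms/vm01) made up:

  rbd snap create images/debian93@base
  rbd snap protect images/debian93@base
  rbd clone images/debian93@base vms/vm01
  rbd du vms/vm01          # provisioned vs. used space before running fstrim in the guest
  # ... run fstrim inside the VM ...
  rbd du vms/vm01          # used space afterwards, for comparison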

[ceph-users] Inconsistent PG automatically got "repaired" automatically?

2018-05-09 Thread Nikos Kormpakis
Hello, we operate a Ceph cluster running v12.2.4, on top of Debian Stretch, deployed with ceph-volume lvm with a default crushmap and a quite vanilla ceph.conf. OSDs live on single disks in JBOD mode, with a separate block.db LV on a shared SSD. We have a single pool (min_size=2, size=3) on th
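
For context, the usual way to inspect (and only then repair) an inconsistent PG looks roughly like this; the PG id 1.2a is a placeholder:

  ceph health detail                                      # names the inconsistent PG
  rados list-inconsistent-obj 1.2a --format=json-pretty   # which object/shard failed scrub, and why
  ceph pg repair 1.2a                                     # only after deciding which copy is the bad one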

Re: [ceph-users] Question: CephFS + Bluestore

2018-05-09 Thread John Spray
On Wed, May 9, 2018 at 3:32 PM, Webert de Souza Lima wrote: > Hello, > > Currently, I run Jewel + Filestore for cephfs, with SSD-only pools used for > cephfs-metadata, and HDD-only pools for cephfs-data. The current > metadata/data ratio is something like 0,25% (50GB metadata for 20TB data). > > R

Re: [ceph-users] Question: CephFS + Bluestore

2018-05-09 Thread Webert de Souza Lima
I'm sorry I have mixed up some information. The actual ratio I have now is 0,0005% (*100MB for 20TB data*). Regards, Webert Lima DevOps Engineer at MAV Tecnologia *Belo Horizonte - Brasil* *IRC NICK - WebertRLZ* On Wed, May 9, 2018 at 11:32 AM, Webert de Souza Lima wrote: > Hello, > > Current

[ceph-users] Question: CephFS + Bluestore

2018-05-09 Thread Webert de Souza Lima
Hello, Currently, I run Jewel + Filestore for cephfs, with SSD-only pools used for cephfs-metadata, and HDD-only pools for cephfs-data. The current metadata/data ratio is something like 0,25% (50GB metadata for 20TB data). Regarding bluestore architecture, assuming I have: - SSDs for WAL+DB -
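
For reference, a single-command sketch of that layout with ceph-volume; device, VG and LV names are illustrative:

  # HDD for data, DB on a pre-created LV on the shared SSD; the WAL is colocated with the DB unless --block.wal is given
  ceph-volume lvm create --bluestore --data /dev/sdb --block.db ssd_vg/db-sdb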

Re: [ceph-users] How to configure s3 bucket acl so that one user's bucket is visible to another.

2018-05-09 Thread Sean Purdy
The other way to do it is with policies. e.g. a bucket owned by user1, but read access granted to user2: { "Version":"2012-10-17", "Statement":[ { "Sid":"user2 policy", "Effect":"Allow", "Principal": {"AWS": ["arn:aws:iam:::user/user2"]}, "Action":["s3:GetObject",
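
The quoted policy is cut off above; a complete example of the same shape (the remaining actions, resource ARNs and bucket name are assumptions, not Sean's exact policy), applied by the bucket owner:

  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "user2 policy",
        "Effect": "Allow",
        "Principal": {"AWS": ["arn:aws:iam:::user/user2"]},
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": ["arn:aws:s3:::user1-bucket", "arn:aws:s3:::user1-bucket/*"]
      }
    ]
  }

  s3cmd setpolicy policy.json s3://user1-bucket   # run as user1, the bucket owner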

Re: [ceph-users] Deleting an rbd image hangs

2018-05-09 Thread David Turner
Yeah, I was about to suggest looking up all currently existing rbd IDs and snapshot IDs, compare to rados ls and remove the objects that exist for rbds and snapshots not reported by the cluster. On Wed, May 9, 2018, 8:38 AM Jason Dillaman wrote: > On Tue, May 8, 2018 at 2:31 PM, wrote: > > Hel
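
A very rough sketch of the comparison David describes, assuming a single pool named rbd; it ignores snapshots and the trash, so treat the output as candidates to investigate, not objects to delete blindly:

  # prefixes of every image the cluster still knows about
  for img in $(rbd ls rbd); do
      rbd info rbd/$img | awk '/block_name_prefix/ {print $2}'
  done | sort -u > known_prefixes.txt

  # prefixes actually present as data objects in the pool
  rados -p rbd ls | grep '^rbd_data\.' | sed 's/^\(rbd_data\.[^.]*\)\..*/\1/' | sort -u > seen_prefixes.txt

  comm -13 known_prefixes.txt seen_prefixes.txt   # prefixes with no matching image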

Re: [ceph-users] Deleting an rbd image hangs

2018-05-09 Thread Jason Dillaman
On Tue, May 8, 2018 at 2:31 PM, wrote: > Hello Jason, > > > Am 8. Mai 2018 15:30:34 MESZ schrieb Jason Dillaman : >>Perhaps the image had associated snapshots? Deleting the object >>doesn't delete the associated snapshots so those objects will remain >>until the snapshot is removed. However, if y

[ceph-users] Open-sourcing GRNET's Ceph-related tooling

2018-05-09 Thread Nikos Kormpakis
Hello, I'm happy to announce that GRNET [1] is open-sourcing its Ceph-related tooling on GitHub [2]. This repo includes multiple monitoring health checks compatible with Luminous, and tooling to quickly deploy our new Ceph clusters based on Luminous, ceph-volume lvm and BlueStore. We hope t

Re: [ceph-users] Ceph ObjectCacher FAILED assert (qemu/kvm)

2018-05-09 Thread Jason Dillaman
Any clue what Windows is doing issuing a discard against an extent that has an in-flight read? If this is repeatable, can you add "debug rbd = 20" and "debug objectcacher = 20" to your hypervisor's ceph.conf and attach the Ceph log to a tracker ticket? On Tue, May 8, 2018 at 8:09 PM, Richard Bade
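
The two debug settings Jason mentions go into the [client] section of the hypervisor's ceph.conf; the log file line is a common convention, not something from the thread:

  [client]
      debug rbd = 20
      debug objectcacher = 20
      log file = /var/log/ceph/qemu-$pid.log   # must be writable by the qemu/libvirt user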

Re: [ceph-users] cephfs-data-scan safety on active filesystem

2018-05-09 Thread John Spray
On Tue, May 8, 2018 at 8:49 PM, Ryan Leimenstoll wrote: > Hi Gregg, John, > > Thanks for the warning. It was definitely conveyed that they are dangerous. I > thought the online part was implied to be a bad idea, but just wanted to > verify. > > John, > > We were mostly operating off of what the

Re: [ceph-users] stale status from monitor?

2018-05-09 Thread John Spray
On Tue, May 8, 2018 at 9:50 PM, Bryan Henderson wrote: > My cluster got stuck somehow, and at one point in trying to recycle things to > unstick it, I ended up shutting down everything, then bringing up just the > monitors. At that point, the cluster reported the status below. > > With nothing bu