Re: [ceph-users] RBD Cache and rbd-nbd

2018-05-11 Thread Marc Schöchlin
Hello Jason, thanks for your response. On 10.05.2018 at 21:18, Jason Dillaman wrote: >> If I configure caches as described at >> http://docs.ceph.com/docs/luminous/rbd/rbd-config-ref/, are there dedicated >> caches per rbd-nbd/krbd device or is there only a single cache area? > The librbd
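
For readers following along, the options in question live in the [client] section of ceph.conf; a minimal sketch, where the values shown are the documented Luminous defaults rather than recommendations:

    [client]
    rbd cache = true
    # per-client cache size in bytes (32 MiB default)
    rbd cache size = 33554432
    # dirty data allowed before writeback kicks in (24 MiB default)
    rbd cache max dirty = 25165824
    # behave as writethrough until the first flush is seen
    rbd cache writethrough until flush = true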

Re: [ceph-users] howto: multiple ceph filesystems

2018-05-11 Thread Marc Roos
If I would like to use an erasure-coded pool for a cephfs directory, how would I create these placement rules? -Original Message- From: David Turner [mailto:drakonst...@gmail.com] Sent: Friday, 11 May 2018 1:54 To: João Paulo Sacchetto Ribeiro Bastos Cc: ceph-users@lists.ceph.com Subj
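
On the question above, one possible sequence is sketched below; the profile and pool names are illustrative, and creating the pool from an erasure-code profile implicitly creates a matching crush rule:

    # define an EC profile (k=2, m=1) and a pool that uses it
    ceph osd erasure-code-profile set ec21 k=2 m=1 crush-failure-domain=host
    ceph osd pool create fs_data.ec21 64 64 erasure ec21
    # EC data pools for CephFS need overwrites enabled (Luminous+, bluestore)
    ceph osd pool set fs_data.ec21 allow_ec_overwrites true
    ceph osd pool application enable fs_data.ec21 cephfs
    ceph fs add_data_pool cephfs fs_data.ec21
    # pin a directory to the new pool via its layout
    setfattr -n ceph.dir.layout.pool -v fs_data.ec21 /mnt/cephfs/somedir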

Re: [ceph-users] Inaccurate client io stats

2018-05-11 Thread John Spray
On Fri, May 11, 2018 at 4:51 AM, Horace wrote: > Hi everyone, > > I've got a 3-node cluster running without any issue. However, I found out > that since upgrading to luminous, the client io stat is way off from > the real one. I have no idea how to troubleshoot this after going through all

[ceph-users] Adding pool to cephfs, setfattr permission denied

2018-05-11 Thread Marc Roos
I have added a data pool by:

    ceph osd pool set fs_data.ec21 allow_ec_overwrites true
    ceph osd pool application enable fs_data.ec21 cephfs
    ceph fs add_data_pool cephfs fs_data.ec21

    setfattr -n ceph.dir.layout.pool -v fs_data.ec21 folder
    setfattr: folder: Permission denied

Added the pool also to

Re: [ceph-users] Adding pool to cephfs, setfattr permission denied

2018-05-11 Thread John Spray
On Fri, May 11, 2018 at 7:40 AM, Marc Roos wrote: > > I have added a data pool by: > > ceph osd pool set fs_data.ec21 allow_ec_overwrites true > ceph osd pool application enable fs_data.ec21 cephfs > ceph fs add_data_pool cephfs fs_data.ec21 > > setfattr -n ceph.dir.layout.pool -v fs_data.ec21 fol

Re: [ceph-users] Adding pool to cephfs, setfattr permission denied

2018-05-11 Thread Marc Roos
Thanks! That did it. This 'tag cephfs' is probably a restriction you can add when you have multiple filesystems? And I don't need x permission on the OSDs? -Original Message- From: John Spray [mailto:jsp...@redhat.com] Sent: Friday, 11 May 2018 14:05 To: Marc Roos Cc: ceph-users

Re: [ceph-users] Adding pool to cephfs, setfattr permission denied

2018-05-11 Thread John Spray
On Fri, May 11, 2018 at 8:10 AM, Marc Roos wrote: > > > Thanks! That did it. This 'tag cephfs' is probably a restriction you can > add when you have multiple filesystems? And I don't need x permission on > the OSDs? The "tag cephfs data " bit is authorising the client to access any pools that ar
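
To make the capability syntax concrete, a hedged example of the kind of cap being discussed; the client name and filesystem name are placeholders, not taken from the thread:

    # grant a CephFS client access to any pool tagged as a data pool of filesystem "cephfs"
    ceph auth caps client.foo \
        mon 'allow r' \
        mds 'allow rw' \
        osd 'allow rw tag cephfs data=cephfs'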

[ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-05-11 Thread Oliver Schulz
Dear Ceph Experts, I'm trying to set up some new OSD storage nodes, now with bluestore (our existing nodes still use filestore). I'm a bit unclear on how to specify WAL/DB devices: Can several OSDs share one WAL/DB partition? So, can I do ceph-deploy osd create --bluestore --osd-db=/dev/nvme

Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-05-11 Thread João Paulo Sacchetto Ribeiro Bastos
Hello Oliver, As far as I know, you can use the same DB device for about 4 or 5 OSDs; you just need to be aware of the free space. I'm also building a bluestore cluster, and our DB and WAL will be on the same SSD of about 480GB serving 4 OSD HDDs of 4 TB each. About the sizes, it's just a feeling

Re: [ceph-users] howto: multiple ceph filesystems

2018-05-11 Thread Webert de Souza Lima
Basically what we're trying to figure out looks like what is being done here: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/020958.html But instead of using LIBRADOS to store EMAILs directly into RADOS we're still using CEPHFS for it, just figuring out if it makes sense to sep

Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-05-11 Thread Oliver Schulz
Hi Jaroslaw, I tried that (using /dev/nvme0n1), but no luck: [ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb --block.wal /dev/nvme0n1 When I run "/usr/sbin/ceph-volume ..." on the storage node, it fail

Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-05-11 Thread Oliver Schulz
Hi, thanks for the advice! I'm a bit confused now, though. ;-) I thought DB and WAL were supposed to go on raw block devices, not file systems? Cheers, Oliver On 11.05.2018 16:01, João Paulo Sacchetto Ribeiro Bastos wrote: Hello Oliver, As far as I know yet, you can use the same DB device

Re: [ceph-users] Nfs-ganesha 2.6 packages in ceph repo

2018-05-11 Thread David C
Hi Oliver, Thanks for the detailed response! I've downgraded my libcephfs2 to 12.2.4 and still get a similar error: load_fsal :NFS STARTUP :CRIT :Could not dlopen module:/usr/lib64/ganesha/libfsalceph.so Error:/lib64/libcephfs.so.2: undefined symbol: _Z14common_preinitRK18CephInitParameters18code_

Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-05-11 Thread João Paulo Sacchetto Ribeiro Bastos
Actually, if you go to https://ceph.com/community/new-luminous-bluestore/ you will see that DB/WAL work on an XFS partition, while the data itself goes on a raw block device. Also, I told you the wrong command in the last mail. When I said --osd-db it should be --block-db. On Fri, May 11, 2018 at 11:51 A

Re: [ceph-users] RBD Cache and rbd-nbd

2018-05-11 Thread Jason Dillaman
On Fri, May 11, 2018 at 3:59 AM, Marc Schöchlin wrote: > Hello Jason, > > thanks for your response. > > > On 10.05.2018 at 21:18, Jason Dillaman wrote: > > If I configure caches as described at > http://docs.ceph.com/docs/luminous/rbd/rbd-config-ref/, are there dedicated > caches per rbd-nbd/kr

Re: [ceph-users] Nfs-ganesha 2.6 packages in ceph repo

2018-05-11 Thread Oliver Freyermuth
Hi David, On 11.05.2018 at 16:55, David C wrote: > Hi Oliver > > Thanks for the detailed response! I've downgraded my libcephfs2 to 12.2.4 and > still get a similar error: > > load_fsal :NFS STARTUP :CRIT :Could not dlopen > module:/usr/lib64/ganesha/libfsalceph.so Error:/lib64/libcephfs.so.2:

Re: [ceph-users] Inconsistent PG automatically got "repaired"?

2018-05-11 Thread Nikos Kormpakis
On 2018-05-10 00:39, Gregory Farnum wrote: On Wed, May 9, 2018 at 8:21 AM Nikos Kormpakis wrote: 1) After how much time does RADOS try to read from a secondary replica? Is this timeout configurable? 2) If a primary shard is missing, does Ceph try to recreate it somehow automatically? 3) If C

[ceph-users] Node crash, filesystem not usable

2018-05-11 Thread Daniel Davidson
Hello, Today we had a node crash, and looking at it, it seems there is a problem with the RAID controller, so it is not coming back up, maybe ever. It corrupted the local filesystem for the ceph storage there. The remainder of our storage (10.2.10) cluster is running, and it looks to be repa

[ceph-users] Bucket reporting content inconsistently

2018-05-11 Thread Sean Redmond
Hi all, We have recently upgraded to 10.2.10 in preparation for our upcoming upgrade to Luminous, and I have been attempting to remove a bucket. When using tools such as s3cmd I can see files are listed, verified by checking with bi list too as shown below: root@ceph-rgw-1:~# radosgw-admin
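
For context, a sketch of the kind of commands typically used to compare the S3-visible listing with the bucket index; the bucket name is a placeholder and these are not the poster's exact commands:

    # objects as seen through the bucket index
    radosgw-admin bi list --bucket=<bucket>
    # objects as seen by the bucket listing
    radosgw-admin bucket list --bucket=<bucket>
    # check (and optionally rebuild) the index if the two disagree
    radosgw-admin bucket check --bucket=<bucket> --fix --check-objects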

Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-05-11 Thread David Turner
This thread is off in left field and needs to be brought back to how things work. While multiple OSDs can use the same device for block/wal partitions, they each need their own partition. osd.0 could use nvme0n1p1, osd.2/nvme0n1p2, etc. You cannot use the same partition for each osd. Ceph-volum
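
A minimal sketch of what that looks like in practice, assuming a shared NVMe for block.db; device names and partition sizes are purely illustrative:

    # one DB partition per OSD on the shared NVMe
    sgdisk --new=1:0:+40G --change-name=1:osd-sdb-db /dev/nvme0n1
    sgdisk --new=2:0:+40G --change-name=2:osd-sdc-db /dev/nvme0n1

    # each OSD gets its own data device and its own DB partition
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
    ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/nvme0n1p2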

Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-05-11 Thread David Turner
Note that instead of including the step to use the UUID in the osd creation like [1] this, I opted to separate it out in those instructions. That was to simplify the commands and to give people an idea of how to fix their OSDs if they created them using the device name instead of UUID. It would b
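
A short sketch of the by-UUID variant being described; device names are illustrative:

    # find the stable partition UUID instead of relying on /dev/nvme0n1p1
    blkid -s PARTUUID -o value /dev/nvme0n1p1
    # then reference it via /dev/disk/by-partuuid when creating the OSD
    ceph-volume lvm create --bluestore --data /dev/sdb \
        --block.db /dev/disk/by-partuuid/<partuuid-from-blkid>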

Re: [ceph-users] Node crash, filesystem not usable

2018-05-11 Thread David Turner
What are some outputs of commands to show us the state of your cluster. Most notable is `ceph status` but `ceph osd tree` would be helpful. What are the size of the pools in your cluster? Are they all size=3 min_size=2? On Fri, May 11, 2018 at 12:05 PM Daniel Davidson wrote: > Hello, > > Today

Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-05-11 Thread Jacob DeGlopper
Thanks, this is useful in general. I have a semi-related question: Given an OSD server with multiple SSDs or NVME devices, is there an advantage to putting wal/db on a different device of the same speed? For example, data on sda1, matching wal/db on sdb1, and then data on sdb2 and wal/db on

Re: [ceph-users] Ceph osd crush weight to utilization incorrect on one node

2018-05-11 Thread Bryan Stillwell
> We have a large 1PB ceph cluster. We recently added 6 nodes with 16 2TB disks > each to the cluster. All the 5 nodes rebalanced well without any issues and > the sixth/last node OSDs started acting weird as I increase weight of one osd > the utilization doesn't change but a different osd on the s

Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-05-11 Thread David Turner
Nope, only detriment. If you lost sdb, you would have to rebuild 2 OSDs instead of just 1. Also you add more complexity as ceph-volume would much prefer to just take sda and make it the OSD with all data/db/wal without partitions or anything. On Fri, May 11, 2018 at 1:06 PM Jacob DeGlopper wrot

Re: [ceph-users] Ceph osd crush weight to utilization incorrect on one node

2018-05-11 Thread David Turner
There was a time in the history of Ceph where a weight of 0.0 was not always what you thought. People had better experiences with crush weights of something like 0.0001 or something. This is just a memory tickling in the back of my mind of things I've read on the ML years back. On Fri, May 11, 2

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-11 Thread Alexandre DERUMIER
Hi, I'm still seeing a memory leak with 12.2.5; it seems to leak some MB every 5 minutes. I'll try to resend some stats next weekend. - Original Message - From: "Patrick Donnelly" To: "Brady Deetz" Cc: "Alexandre Derumier", "ceph-users" Sent: Thursday, 10 May 2018 21:11:19 Subject: Re: [ceph-users

Re: [ceph-users] Node crash, filesystem not usable

2018-05-11 Thread Daniel Davidson
Below is the information you were asking for. I think they are size=2, min_size=1.

Dan

# ceph status
    cluster 7bffce86-9d7b-4bdf-a9c9-67670e68ca77
     health HEALTH_ERR
            140 pgs are stuck inactive for more than 300 seconds
            64 pgs backfill_wait
            76 pgs back

Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-05-11 Thread Oliver Schulz
Dear David, thanks a lot for the detailed answer(s) and clarifications! Can I ask just a few more questions? On 11.05.2018 18:46, David Turner wrote: partitions is 10GB per 1TB of OSD. If your OSD is a 4TB disk you should be looking closer to a 40GB block.db partition. If your block.db parti

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-11 Thread Webert de Souza Lima
You could use "mds_cache_size" to limit the number of caps until you have this fixed, but I'd say for your number of caps and inodes, 20GB is normal. This mds (jewel) here is consuming 24GB RAM: { "mds": { "request": 7194867047, "reply": 7194866688, "reply_latency": {

Re: [ceph-users] Node crash, filesystem not usable

2018-05-11 Thread Webert de Souza Lima
This message seems to be very concerning: > mds0: Metadata damage detected But for the rest, the cluster still seems to be recovering. You could try to speed things up with ceph tell, like:

    ceph tell osd.* injectargs --osd_max_backfills=10
    ceph tell osd.* injectargs --osd_recovery_sleep
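
For completeness, a sketch of that kind of recovery tuning; the values are illustrative assumptions, not the poster's, and would normally be reverted once recovery finishes:

    ceph tell osd.* injectargs '--osd_max_backfills=10'
    ceph tell osd.* injectargs '--osd_recovery_max_active=10'
    # osd_recovery_sleep=0 removes the throttling pause between recovery ops
    ceph tell osd.* injectargs '--osd_recovery_sleep=0'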

Re: [ceph-users] Question: CephFS + Bluestore

2018-05-11 Thread Webert de Souza Lima
I think ceph doesn't have IO metrics with filters by pool, right? I see IO metrics from clients only: ceph_client_io_ops ceph_client_io_read_bytes ceph_client_io_read_ops ceph_client_io_write_bytes ceph_client_io_write_ops and pool "byte" metrics, but not "io": ceph_pool(write/read)_bytes(_total)

Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-05-11 Thread David Turner
As for whether you should do WAL only on the NVMe vs. using a filestore journal, that depends on your write patterns, use case, etc. In my clusters with 10TB disks I use 2GB partitions for the WAL and leave the DB on the HDD with the data. Those are in archival RGW use cases and that works fine for the thro

Re: [ceph-users] Question: CephFS + Bluestore

2018-05-11 Thread David Turner
`ceph osd pool stats` with the option to specify the pool you are interested in should get you the breakdown of IO per pool. This was introduced with luminous. On Fri, May 11, 2018 at 2:39 PM Webert de Souza Lima wrote: > I think ceph doesn't have IO metrics will filters by pool right? I see IO
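
For reference, a quick usage sketch of the command being described; the pool name is illustrative:

    # all pools
    ceph osd pool stats
    # a single pool
    ceph osd pool stats rbd_cache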

Re: [ceph-users] Question: CephFS + Bluestore

2018-05-11 Thread Webert de Souza Lima
Thanks David. Although you mentioned this was introduced with Luminous, it's working with Jewel.

~# ceph osd pool stats
Fri May 11 17:41:39 2018
pool rbd id 5
  client io 505 kB/s rd, 3801 kB/s wr, 46 op/s rd, 27 op/s wr
pool rbd_cache id 6
  client io 2538 kB/s rd,

[ceph-users] Test for Leo

2018-05-11 Thread Tom W
Test for Leo, please ignore.

Re: [ceph-users] Ceph osd crush weight to utilization incorrect on one node

2018-05-11 Thread Pardhiv Karri
Hi Bryan, Thank you for the reply. We are on Hammer, ceph version 0.94.9 (fe6d859066244b97b24f09d46552afc2071e6f90). We tried with full weight on all OSDs on that node, and OSDs like 611 were going above 90%, so we downsized and tested with only 0.2. Our PGs are at 119 for all 12 pools in the clust

Re: [ceph-users] Ceph osd crush weight to utilization incorrect on one node

2018-05-11 Thread Pardhiv Karri
Hi David, Thanks for the reply. Yeah, we are seeing that 0.0001 usage on pretty much all OSDs. But this node is different: whether OSD 611 is at full weight or just 0.2, its utilization starts increasing. --Pardhiv K On Fri, May 11, 2018 at 10:50 AM, David Turner wrote: > There was a time in the h

Re: [ceph-users] Ceph osd crush weight to utilization incorrect on one node

2018-05-11 Thread David Turner
What's your `ceph osd tree`, `ceph df`, `ceph osd df`? You sound like you just have a fairly full cluster that you haven't balanced the crush weights on. On Fri, May 11, 2018, 10:06 PM Pardhiv Karri wrote: > Hi David, > > Thanks for the reply. Yeah we are seeing that 0.0001 usage on pretty much

Re: [ceph-users] Question: CephFS + Bluestore

2018-05-11 Thread David Turner
That's right. I didn't actually use Jewel for very long. I'm glad it worked for you. On Fri, May 11, 2018, 4:49 PM Webert de Souza Lima wrote: > Thanks David. > Although you mentioned this was introduced with Luminous, it's working > with Jewel. > > ~# ceph osd pool stats > >

Re: [ceph-users] Open-sourcing GRNET's Ceph-related tooling

2018-05-11 Thread Brad Hubbard
+ceph-devel On Wed, May 9, 2018 at 10:00 PM, Nikos Kormpakis wrote: > Hello, > > I'm happy to announce that GRNET [1] is open-sourcing its Ceph-related > tooling on GitHub [2]. This repo includes multiple monitoring health > checks compatible with Luminous and tooling in order to quickly deploy our

Re: [ceph-users] Ceph osd crush weight to utilization incorrect on one node

2018-05-11 Thread Pardhiv Karri
Hi David, Here is the output of ceph df. We have a lot of space in our ceph cluster. We have 2 OSDs (266, 500) that went down earlier due to a hardware issue, and we never got a chance to fix them.

GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    1101T     701T      400T         36.37
POOLS:
    NAME

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-11 Thread Alexandre DERUMIER
Hi >> You could use "mds_cache_size" to limit the number of caps until you have this >> fixed, but I'd say for your number of caps and inodes, 20GB is normal. The documentation (luminous) says: "mds cache size - Description: The number of inodes to cache. A value of 0 indicates an unlimited numbe
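
To make the two knobs concrete, a hedged sketch of how one might cap the cache via the admin socket; the MDS name and values are placeholder assumptions, not taken from this thread:

    # Jewel-style limit: number of cached inodes
    ceph daemon mds.<id> config set mds_cache_size 1000000
    # Luminous-style limit: cache memory in bytes (here 5 GiB)
    ceph daemon mds.<id> config set mds_cache_memory_limit 5368709120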

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-11 Thread Alexandre DERUMIER
My cache is correctly capped at 5G currently. Here are some stats (the mds was restarted yesterday, is using around 8.8GB, and the cache is capped at 5G). I'll try to send some stats in 1 or 2 weeks, when the memory should be at 20G.

    # while sleep 1; do ceph daemon mds.ceph4-2.odiso.net perf dump | jq '.m

[ceph-users] PG show inconsistent active+clean+inconsistent

2018-05-11 Thread Faizal Latif
Hi guys, I need some help. I can currently see my ceph storage showing "*active+clean+inconsistent*", which results in a HEALTH_ERR state and causes scrub errors. Below is some sample output:

HEALTH_ERR 1 pgs inconsistent; 11685 scrub errors; noscrub,nodeep-scrub flag(s) set
pg 2.2c0 is act
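
For readers hitting the same state, a sketch of the usual first steps; this is a general assumption about how such reports are investigated, not advice tailored to this cluster, and the nature of the inconsistency should be understood before repairing:

    # scrubs are currently disabled cluster-wide; re-enable them
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub
    # inspect what exactly is inconsistent in the reported PG
    rados list-inconsistent-obj 2.2c0 --format=json-pretty
    # then, if appropriate, ask the primary to repair it
    ceph pg repair 2.2c0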