Re: [ceph-users] CephFS Samba VFS RHEL packages

2016-07-21 Thread Yan, Zheng
On Fri, Jul 22, 2016 at 11:15 AM, Blair Bethwaite wrote: > Thanks Zheng, > > On 22 July 2016 at 12:12, Yan, Zheng wrote: >> We actively back-port fixes to RHEL 7.x kernel. When RHCS2.0 release, >> the RHEL kernel should contain fixes up to 3.7 upstream kernel. > > You meant 4.7 right? I mean 4.

Re: [ceph-users] Radosgw admin ops API command question

2016-07-21 Thread Horace
Thanks for your help!! That's what I need. Regards, Horace Ng - Original Message - From: "Daniel Gryniewicz" To: ceph-users@lists.ceph.com Sent: Thursday, July 21, 2016 10:34:14 PM Subject: Re: [ceph-users] Radosgw admin ops API command question On 07/21/2016 05:04 AM, Horace wrote: > H

Re: [ceph-users] CephFS Samba VFS RHEL packages

2016-07-21 Thread Blair Bethwaite
Thanks Zheng, On 22 July 2016 at 12:12, Yan, Zheng wrote: > We actively back-port fixes to RHEL 7.x kernel. When RHCS2.0 release, > the RHEL kernel should contain fixes up to 3.7 upstream kernel. You meant 4.7 right? -- Cheers, ~Blairo ___ ceph-user

Re: [ceph-users] CephFS Samba VFS RHEL packages

2016-07-21 Thread Yan, Zheng
On Fri, Jul 22, 2016 at 9:11 AM, Ira Cooper wrote: > On Fri, Jul 22, 2016 at 08:29:39AM +1100, Blair Bethwaite wrote: >> Ken, Ira, John - >> >> Thanks a lot for the replies. Our initial setup is simply running >> samba atop a cephfs kernel mount, and initial cursory checks seem to >> show the basi

[ceph-users] Infernalis -> Jewel, 10x+ RBD latency increase

2016-07-21 Thread Martin Millnert
Hi, I just upgraded from Infernalis to Jewel and see an approximate 10x latency increase. Quick facts:  - 3x replicated pool  - 4x 2x-"E5-2690 v3 @ 2.60GHz", 128GB RAM, 6x 1.6 TB Intel S3610 SSDs,  - LSI3008 controller with up-to-date firmware and upstream driver, and up-to-date firmware on SSDs.
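A quick way to quantify the regression is to compare single-threaded small-block write latency before and after the upgrade, and to look at per-OSD commit/apply latency. A hedged sketch of standard commands (the pool name "rbd" is an assumption):

    # 30 seconds of 4K writes with a single outstanding op
    rados bench -p rbd 30 write -t 1 -b 4096
    # per-OSD commit and apply latency as reported by the cluster
    ceph osd perf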

Re: [ceph-users] CephFS Samba VFS RHEL packages

2016-07-21 Thread Blair Bethwaite
Ken, Ira, John - Thanks a lot for the replies. Our initial setup is simply running samba atop a cephfs kernel mount, and initial cursory checks seem to show the basics are working as expected (even clustered with ctdb - what are your concerns here Ira?). Though we've yet to try any of our planned
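For anyone wanting to try the Samba CephFS VFS module instead of re-exporting a kernel mount, a minimal smb.conf share would look roughly like the sketch below; the share name, path and ceph:user_id are assumptions, so check the vfs_ceph man page for your Samba build.

    [cephfs]
        path = /
        vfs objects = ceph
        ceph:config_file = /etc/ceph/ceph.conf
        ceph:user_id = samba
        read only = no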

Re: [ceph-users] Uncompactable Monitor Store at 69GB -- Re: Cluster in warn state, not sure what to do next.

2016-07-21 Thread David Turner
The weight of osd.53 wasn't 0.0 and the weights of your current osds aren't 1.0. Where are you getting this from? If you're getting it from ceph osd tree, then you're looking at the wrong column; the weight is the second column, right between the id and the osd name. If you did ceph osd rm 53,
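For illustration, hammer/jewel-era ceph osd tree output looks roughly like the sketch below (values made up); the CRUSH weight is the WEIGHT column, while REWEIGHT is the separate in/out override to its right.

    ID WEIGHT  TYPE NAME        UP/DOWN REWEIGHT PRIMARY-AFFINITY
    -1 3.64000 root default
    -2 1.82000     host node-a
     0 0.91000         osd.0         up  1.00000          1.00000
     1 0.91000         osd.1         up  1.00000          1.00000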

Re: [ceph-users] Uncompactable Monitor Store at 69GB -- Re: Cluster in warn state, not sure what to do next.

2016-07-21 Thread Salwasser, Zac
Ok, I’ve gotten as far as “ceph osd rm 53”. The tree command showed a weight of 0, along with two other “down” osds, both of which I “rebuilt” two days ago (more on this later). One of the (formerly) two down pgs is still down. When I run a query on it, I get: { "state": "down+incomplete",

Re: [ceph-users] Uncompactable Monitor Store at 69GB -- Re: Cluster in warn state, not sure what to do next.

2016-07-21 Thread David Turner
The mon store is important, and since your cluster isn't healthy, the mons need to hold onto it so that when things come up the mon can replay everything for them. Once you fix the 2 down and peering PGs, the mon store will fix itself in no time at all. Ceph is rightly refusing to co

Re: [ceph-users] Uncompactable Monitor Store at 69GB -- Re: Cluster in warn state, not sure what to do next.

2016-07-21 Thread w...@42on.com
> On 21 Jul 2016 at 21:06, Salwasser, Zac wrote the > following: > > Thanks for the response! Long story short, there’s one specific osd in my > cluster that is responsible, according to the dump command, for the two pg’s > that are still down. > > I wiped the osd data directory

Re: [ceph-users] Uncompactable Monitor Store at 69GB -- Re: Cluster in warn state, not sure what to do next.

2016-07-21 Thread Salwasser, Zac
Thanks for the response! Long story short, there’s one specific osd in my cluster that is responsible, according to the dump command, for the two pg’s that are still down. I wiped the osd data directory and recreated that osd a couple of days ago, but it is still stuck in the “booting” state.

Re: [ceph-users] Uncompactable Monitor Store at 69GB -- Re: Cluster in warn state, not sure what to do next.

2016-07-21 Thread Dan van der Ster
Hi, The mon's keep all maps going back to the last time the cluster had HEALTH_OK, which is why the mon leveldb's are so large in your case. (I see Greg responded with the same info). Focus on getting the cluster healthy, then the mon sizes should resolve themselves. -- Dan On Thu, Jul 21, 2016

Re: [ceph-users] Uncompactable Monitor Store at 69GB -- Re: Cluster in warn state, not sure what to do next.

2016-07-21 Thread Gregory Farnum
On Thu, Jul 21, 2016 at 11:54 AM, Salwasser, Zac wrote: > Rephrasing for brevity – I have a monitor store that is 69GB and won’t > compact any further on restart or with ‘tell compact’. Has anyone dealt > with this before? The monitor can't trim OSD maps over a period where PGs are unclean; you'
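To see which PGs are holding back map trimming, and to retry compaction once the cluster is healthy again, something along these lines can be used (a sketch of standard commands, not from the thread; the pgid is an example):

    ceph health detail                      # lists the down/incomplete PGs
    ceph pg dump_stuck unclean              # PGs that have not been clean for a while
    ceph pg 2.3f query                      # inspect one stuck PG
    ceph tell mon.$(hostname -s) compact    # retry compaction after HEALTH_OK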

[ceph-users] Uncompactable Monitor Store at 69GB -- Re: Cluster in warn state, not sure what to do next.

2016-07-21 Thread Salwasser, Zac
Rephrasing for brevity – I have a monitor store that is 69GB and won’t compact any further on restart or with ‘tell compact’. Has anyone dealt with this before? From: "Salwasser, Zac" Date: Thursday, July 21, 2016 at 1:18 PM To: "ceph-users@lists.ceph.com" Cc: "Salwasser, Zac" , "Heller, Ch

Re: [ceph-users] CephFS write performance

2016-07-21 Thread Gregory Farnum
On Thu, Jul 21, 2016 at 10:55 AM, Fabiano de O. Lucchese wrote: > Hey, guys. > > I'm still feeling unlucky about these experiments. Here's what I did: > > 1) Set the parameters described below in ceph.conf > 2) Push the ceph.conf to all nodes using ceph-deploy > 3) Restart monitor,

Re: [ceph-users] CephFS write performance

2016-07-21 Thread Fabiano de O. Lucchese
Hey, guys. I'm still feeling unlucky about these experiments. Here's what I did: 1) Set the parameters described below in ceph.conf 2) Push the ceph.conf to all nodes using ceph-deploy 3) Restart monitor, mds and osd’s on all nodes 4) Ran the test twice at least and look at the re
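For reference, steps 2 and 3 on a Jewel/systemd cluster usually look something like the sketch below; the hostnames are placeholders and the unit names depend on how the daemons were deployed.

    ceph-deploy --overwrite-conf config push node1 node2 node3
    # then, on each node, restart the daemons that run there
    systemctl restart ceph-mon.target ceph-mds.target ceph-osd.target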

Re: [ceph-users] ceph + vmware

2016-07-21 Thread Mike Christie
On 07/21/2016 11:41 AM, Mike Christie wrote: > On 07/20/2016 02:20 PM, Jake Young wrote: >> >> For starters, STGT doesn't implement VAAI properly and you will need to >> disable VAAI in ESXi. >> >> LIO does seem to implement VAAI properly, but performance is not nearly >> as good as STGT even with

[ceph-users] Try to install ceph hammer on CentOS7

2016-07-21 Thread Manuel Lausch
Hi, I'm trying to install ceph hammer on CentOS 7, but something with the RPM repository seems to be wrong. In my yum.repos.d/ceph.repo file I have the following configuration: [ceph] name=Ceph packages for $basearch baseurl=baseurl=http://download.ceph.com/rpm-hammer/el7/$basearch enabled=1 priorit
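For comparison, a working hammer repo file for el7 generally looks like the sketch below; note the baseurl line should contain "baseurl=" only once, and the gpgkey URL should be double-checked against download.ceph.com.

    [ceph]
    name=Ceph packages for $basearch
    baseurl=http://download.ceph.com/rpm-hammer/el7/$basearch
    enabled=1
    priority=2
    gpgcheck=1
    gpgkey=https://download.ceph.com/keys/release.asc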

[ceph-users] Cluster in warn state, not sure what to do next.

2016-07-21 Thread Salwasser, Zac
Hi, I have a cluster that has been in an unhealthy state for a month or so. We realized the OSDs were flapping due to not having user access to enough file handles, but it took us a while to realize this and we appear to have done a lot of damage to the state of the monitor store in the meanti
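For reference, the file-handle limit can be raised either via the max open files option in ceph.conf (honored by the sysvinit/upstart scripts) or, for systemd-managed OSDs, with a unit drop-in; the values below are illustrative only.

    # ceph.conf, [global] or [osd] section
    max open files = 131072

    # or /etc/systemd/system/ceph-osd@.service.d/limits.conf
    [Service]
    LimitNOFILE=131072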

Re: [ceph-users] ceph + vmware

2016-07-21 Thread Mike Christie
On 07/20/2016 02:20 PM, Jake Young wrote: > > For starters, STGT doesn't implement VAAI properly and you will need to > disable VAAI in ESXi. > > LIO does seem to implement VAAI properly, but performance is not nearly > as good as STGT even with VAAI's benefits. The assumption for the cause > is

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread Nick Fisk
Yes awesome, as long as you fully test bcache and you are happy with it. Also, if you intend to do HA, you will have to use dual port SAS SSD’s instead of NVME and make sure you create your resource agent scripts correctly, otherwise bye bye data. If you enable writeback caching in TGT an

Re: [ceph-users] Radosgw admin ops API command question

2016-07-21 Thread Daniel Gryniewicz
On 07/21/2016 05:04 AM, Horace wrote: Hello all, I have a question regarding the Ceph Radosgw Admin Ops API and would like to ask for your help. I couldn't find the API equivalent of the following CLI command to look up the total bytes of a user. Your help will be appreciated, thanks in advance! radosgw-admin

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread Nick Fisk
What you are seeing is probably averaged over 1 second or something like that. So yes in 1 second IO would have run on all OSD’s. But for any 1 point in time a single thread will only run on 1 OSD (+2 replicas) assuming the IO size isn’t bigger than the object size. For RBD, If data is stri

Re: [ceph-users] Unknown error (95->500) when creating buckets or putting files to RGW after upgrade from Infernalis to Jewel

2016-07-21 Thread nick
Hi Maciej, I am not really sure how to fix this error but executing the same command on our cluster outputs: """ $~ # radosgw-admin zonegroup get { "id": "default", "name": "default", "api_name": "", "is_master": "true", "endpoints": [], "hostnames": [], "hostnames_s3w
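When the zonegroup/zone metadata looks wrong after an Infernalis-to-Jewel upgrade, the usual inspection and repair steps are along these lines (a hedged sketch; take a backup of the realm/zone configuration before committing anything):

    radosgw-admin zonegroup get
    radosgw-admin zone get
    radosgw-admin zonegroup default --rgw-zonegroup=default
    radosgw-admin period update --commit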

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread w...@globe.de
Okay, that should be the answer... I think it would be great to use an Intel P3700 1.6TB as bcache in the iscsi rbd client gateway nodes. caching device: Intel P3700 1.6TB backing device: RBD from Ceph Cluster What do you think? I think this setup should improve the performance dramatically or n
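As a rough illustration of that layering (untested here, and writeback bcache in front of an iSCSI gateway carries exactly the data-loss risk Nick describes), attaching an NVMe cache to a mapped RBD with bcache-tools would look roughly like this; the device names are assumptions:

    # /dev/rbd0 is the mapped RBD image, /dev/nvme0n1 the P3700
    make-bcache -B /dev/rbd0 -C /dev/nvme0n1
    echo writeback > /sys/block/bcache0/bcache/cache_mode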

[ceph-users] rbd image creation command hangs in Jewel 10.2.2 (CentOS 7.2) on AWS Environment

2016-07-21 Thread Rakesh Parkiti
rbd image creation command hangs. 1. Configured ceph cluster on AWS environment, CEPH Jewel Ver. 10.2.2 on CentOS Linux release 7.2.1511 (Core). 2. Created 6 OSDs (SSD drives) on each OSD Node (3 OSD Nodes). Total 24 OSDs across cluster. 3. Edited crush map by picking each SSD disk from each OSD N
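An rbd create that hangs usually means the target pool's PGs are not active, often because the edited CRUSH rule can no longer place all replicas. A few hedged checks ("rbd" is an example pool name):

    ceph -s                               # look for inactive/peering/undersized PGs
    ceph osd pool get rbd crush_ruleset   # which rule the pool uses
    ceph osd crush rule dump              # confirm the rule can actually be satisfied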

Re: [ceph-users] Unknown error (95->500) when creating buckets or putting files to RGW after upgrade from Infernalis to Jewel

2016-07-21 Thread Naruszewicz, Maciej
Hi Nick, Thanks for your suggestion, I've tried the script on an isolated testing cluster. Unfortunately, the script did not help us a lot, it only made creating buckets possible. The logs I provided earlier actually make some sense because they were collected using RGW in Jewel and Ceph in I

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread w...@globe.de
That cannot be correct. Check it on your cluster with dstat as I said... You will see parallel IO on every OSD and journal on every node. On 21.07.16 at 15:02, Jake Young wrote: I think the answer is that with 1 thread you can only ever write to one journal at a time. Theoretically, you

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread Jake Young
I think the answer is that with 1 thread you can only ever write to one journal at a time. Theoretically, you would need 10 threads to be able to write to 10 nodes at the same time. Jake On Thursday, July 21, 2016, w...@globe.de wrote: > What i not really undertand is: > > Lets say the Intel P3

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread Nick Fisk
Yes, but not if you are using iSCSI and don't want data loss. If the data is in a cache somewhere and you lose power or crash, game over. That's why you want to cache to a non-volatile device close to the source. If you use something like FIO and use buffered IO, you will see that you will ge

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread w...@globe.de
What I don't really understand is: let's say the Intel P3700 does 200 MByte/s in a single-threaded rados bench... see Nick's results below... If we have multiple OSD nodes, for example 10 nodes, and every node has exactly 1x P3700 NVMe built in, why is the single-thread performance still exactly 200 MByte/s

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread w...@globe.de
Is there not a way to enable the Linux page cache? That is, not use D_Sync... Then performance would improve dramatically. On 21.07.16 at 14:33, Nick Fisk wrote: -Original Message- From: w...@globe.de [mailto:w...@globe.de] Sent: 21 July 2016 13:23 To: n...@fisk.me.uk; 'Horace Ng' C

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread Nick Fisk
From: Jake Young [mailto:jak3...@gmail.com] Sent: 21 July 2016 13:24 To: n...@fisk.me.uk; w...@globe.de Cc: Horace Ng ; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance My workaround to your single threaded performance issue was to increase the

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread Nick Fisk
> -Original Message- > From: w...@globe.de [mailto:w...@globe.de] > Sent: 21 July 2016 13:23 > To: n...@fisk.me.uk; 'Horace Ng' > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance > > Okay and what is your plan now to speed up ? Now I hav

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread Jake Young
My workaround to your single threaded performance issue was to increase the thread count of the tgtd process (I added --nr_iothreads=128 as an argument to tgtd). This does help my workload. FWIW below are my rados bench numbers from my cluster with 1 thread: This first one is a "cold" run. This
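For anyone wanting to try the same tweak, the flag goes on the tgtd invocation; where that lives depends on the distribution (the sysconfig path below is an assumption, so check your init script or systemd unit):

    # run tgtd with a larger IO thread pool
    tgtd --nr_iothreads=128
    # e.g. via /etc/sysconfig/tgtd or a systemd drop-in overriding ExecStart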

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread w...@globe.de
Okay, and what is your plan now to speed things up? Would it help to put multiple P3700s in each OSD node to improve performance for a single thread (for example Storage VMotion)? Regards On 21.07.16 at 14:17, Nick Fisk wrote: -Original Message- From: ceph-users [mailto:ceph-users-boun...@list

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > w...@globe.de > Sent: 21 July 2016 13:04 > To: n...@fisk.me.uk; 'Horace Ng' > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance > > Hi, > >

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread w...@globe.de
Hi, hmm, I think 200 MByte/s is really bad. Is your cluster in production right now? So if you start a storage migration you get only 200 MByte/s, right? I think it would be awesome if you got 1000 MByte/s. Where is the bottleneck? A FIO test from Sebastien Han gives us 400 MByte/s raw performa
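The Sebastien Han number usually refers to his single-job, queue-depth-1 sync-write test against the journal device, roughly the following; note it writes to the named device, so point it at an unused disk or file:

    fio --name=journal-test --filename=/dev/nvme0n1 --direct=1 --sync=1 \
        --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based \
        --group_reporting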

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread Nick Fisk
I've had a lot of pain with this; smaller block sizes are even worse. You want to try and minimize latency at every point, as there is no buffering happening in the iSCSI stack. This means: 1. Fast journals (NVMe or NVRAM) 2. 10GB or better networking 3. Fast CPUs (GHz) 4. Fix CPU c-states to C
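On point 4, one common way (an assumption here, not from the mail) to keep CPUs out of deep C-states on RHEL/CentOS is the latency-performance tuned profile or explicit kernel parameters:

    tuned-adm profile latency-performance
    # or via the kernel command line:
    #   intel_idle.max_cstate=1 processor.max_cstate=1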

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread Horace
Hi, Same here, I've read some blog saying that vmware will frequently verify the locking on VMFS over iSCSI, hence it will have much slower performance than NFS (with different locking mechanism). Regards, Horace Ng - Original Message - From: w...@globe.de To: ceph-users@lists.ceph.com

[ceph-users] Ceph + VMware + Single Thread Performance

2016-07-21 Thread w...@globe.de
Hi everyone, we see relatively slow single-thread performance on the iSCSI nodes of our cluster. Our setup: 3 racks: 18x data nodes, 3 mon nodes, 3 iSCSI gateway nodes with tgt (rbd cache off). 2x Samsung SM863 Enterprise SSD for journal (3 OSDs per SSD) and 6x WD Red 1TB per data node as
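For context, a tgt target backed directly by RBD (rather than a kernel-mapped device) is typically defined along these lines in targets.conf; the IQN, pool and image names are placeholders:

    <target iqn.2016-07.de.globe:rbd-lun1>
        driver iscsi
        bs-type rbd
        backing-store rbd/vmware-lun1
    </target>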

[ceph-users] Radosgw admin ops API command question

2016-07-21 Thread Horace
Hello all, I have a question regarding the Ceph Radosgw Admin Ops API and would like to ask for your help. I couldn't find the API equivalent of the following CLI command to look up the total bytes of a user. Your help will be appreciated, thanks in advance! radosgw-admin user stats --uid=user1 http://docs.cep
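For what it's worth, per-user storage can also be pulled from the Admin Ops API via the bucket stats call; the request must be signed with the admin user's S3 keys like any other RGW request, and summing the sizes over the returned buckets gives the per-user total. A hedged sketch of the request shape (check the admin ops docs for the exact parameters and fields in your release):

    GET /admin/bucket?uid=user1&stats=True&format=json HTTP/1.1
    Host: rgw.example.com
    Authorization: AWS {access_key}:{signature}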

Re: [ceph-users] performance decrease after continuous run

2016-07-21 Thread K K
Hello. Maybe a deep-scrub is starting at this time? Thursday, 21 July 2016, 11:10 +05:00 from Christian Balzer: > > >Hello, > >On Wed, 20 Jul 2016 12:19:07 -0700 Kane Kim wrote: > >> Hello, >> >> I was running cosbench for some time and noticed a sharp consistent >> performance decrease at some point.
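To confirm or rule that out, check whether a deep scrub is running at the moment the drop occurs and, if so, temporarily pause deep scrubbing (a sketch of standard commands):

    ceph -s                      # shows "scrubbing+deep" PG states while a deep scrub runs
    ceph osd set nodeep-scrub    # temporarily pause deep scrubbing to test the theory
    ceph osd unset nodeep-scrub  # re-enable afterwards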

Re: [ceph-users] OSD / Journal disk failure

2016-07-21 Thread Christian Balzer
Hello, On Tue, 19 Jul 2016 08:10:27 +0800 Pei Feng Lin wrote: > Dear Cephers: > > I have two questions that need advice. > > 1) If there is an OSD disk failure (for example, pulling a disk out), how long > does it take the osd daemon to detect the disk failure? And how long does the ceph > cluster mark thi
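Those timings are governed by the heartbeat and down-out settings; the sketch below shows the relevant options (the values are illustrative and defaults differ between releases, so check the running config):

    ceph daemon osd.0 config show | egrep 'osd_heartbeat_grace|mon_osd_down_out_interval'
    # ceph.conf equivalents, illustrative values only:
    #   osd heartbeat grace = 20
    #   mon osd down out interval = 300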

Re: [ceph-users] thoughts about Cache Tier Levels

2016-07-21 Thread Christian Balzer
Hello, On Wed, 20 Jul 2016 15:36:34 +0200 Götz Reinicke - IT Koordinator wrote: > Hi, > > currently there are two levels I know of: storage- and cachepool. From > our workload I do expect a third "level" of data, which will stay > currently in the storagepool as well. > > Has anyone as we bee

[ceph-users] Antw: Ceph : Generic Query : Raw Format of images

2016-07-21 Thread Steffen Weißgerber
>>> Gaurav Goyal wrote on Wednesday, 20 July 2016 >>> at 17:41: > Dear Ceph User, > Hi, > I want to ask a very generic query regarding ceph. > > Ceph does use the .raw format. But every single company is providing qcow2 > images. > It takes a lot of time to convert the images to raw format.
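The conversion itself is a single qemu-img call (the file names below are placeholders), and an RBD-backed Glance/Cinder can then be fed the raw image directly:

    qemu-img convert -p -f qcow2 -O raw image.qcow2 image.raw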