Re: [ceph-users] radosgw: scrub causing slow requests in the md log

2017-06-22 Thread Dan van der Ster
I'm now running the three relevant OSDs with that patch. (Recompiled, replaced /usr/lib64/rados-classes/libcls_log.so with the new version, then restarted the osds). It's working quite well, trimming 10 entries at a time instead of 1000, and no more timeouts. Do you think it would be worth decrea…
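A minimal sketch of that hot-swap, assuming a systemd-managed Jewel install and the OSD id seen later in this thread (build path illustrative; the rados-classes path is from the message itself):

    # stop the OSD, swap in the rebuilt object class, restart
    systemctl stop ceph-osd@155
    cp build/lib/libcls_log.so /usr/lib64/rados-classes/libcls_log.so
    systemctl start ceph-osd@155
    ceph osd tree | grep -w osd.155   # confirm it rejoined as "up"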

Re: [ceph-users] Transitioning to Intel P4600 from P3700 Journals

2017-06-22 Thread Luis Periquito
> Keep in mind that 1.6TB P4600 is going to last about as long as your 400GB
> P3700, so if wear-out is a concern, don't put more stress on them.
> I've been looking at the 2T ones, but it's about the same as the 400G P3700
> Also the P4600 is only slightly faster in writes than the P3700, so tha…
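Wear-out is easy to track empirically rather than argue about; a sketch using nvme-cli, assuming the journal shows up as /dev/nvme0 (device name illustrative):

    # percentage_used climbs toward 100 as rated endurance is consumed
    nvme smart-log /dev/nvme0 | grep -i -e percentage_used -e data_units_written
    # smartmontools (6.5+) reads the same counters
    smartctl -a /dev/nvme0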

Re: [ceph-users] radosgw: scrub causing slow requests in the md log

2017-06-22 Thread Dan van der Ster
On Wed, Jun 21, 2017 at 4:16 PM, Peter Maloney wrote:
> On 06/14/17 11:59, Dan van der Ster wrote:
>> Dear ceph users,
>>
>> Today we had O(100) slow requests which were caused by deep-scrubbing
>> of the metadata log:
>>
>> 2017-06-14 11:07:55.373184 osd.155
>> [2001:1458:301:24::100:d]:6837/3817…
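For anyone chasing the same symptom, the PGs behind the metadata log can be found by mapping its objects; a sketch, assuming the default .log pool and an illustrative pg id:

    rados -p .log ls | grep '^meta.log'   # the mdlog shard objects
    ceph osd map .log meta.log.0          # which pg and OSDs hold shard 0
    ceph pg deep-scrub 5.3d               # reproduce on that pg (id from the line above)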

[ceph-users] Re: Can't start ceph-mon through systemctl start ceph-mon@.service after upgrading from Hammer to Jewel

2017-06-22 Thread 许雪寒
I set mon_data to “/home/ceph/software/ceph/var/lib/ceph/mon”, and its owner has always been “ceph” since we were running Hammer. I also tried setting the permissions to “777”, but that didn’t work either.
From: Linh Vu [mailto:v...@unimelb.edu.au]
Sent: 22 June 2017 14:26
To: 许雪寒; ceph-users@lists.cep…
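Since Jewel runs the daemons as ceph:ceph, ownership usually has to be fixed recursively rather than via mode bits; a sketch using the custom path from the message (unit instance name illustrative):

    chown -R ceph:ceph /home/ceph/software/ceph/var/lib/ceph/mon
    systemctl start ceph-mon@$(hostname -s)
    journalctl -u ceph-mon@$(hostname -s) -e   # the real error lands here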

Re: [ceph-users] Transitioning to Intel P4600 from P3700 Journals

2017-06-22 Thread Maxime Guyot
Hi, One of the benefits of PCIe NVMe is that it does not take up a disk slot, resulting in a higher density. For example, a 6048R-E1CR36N with 3x PCIe NVMe yields 36 OSDs per server (12 OSDs per NVMe), whereas it yields 30 OSDs per server if using SATA SSDs (6 OSDs per SSD). Since you say that you used…

Re: [ceph-users] Transitioning to Intel P4600 from P3700 Journals

2017-06-22 Thread Willem Jan Withagen
On 22-6-2017 03:59, Christian Balzer wrote:
>> Agreed. On the topic of journals and double bandwidth, am I correct in
>> thinking that btrfs (as insane as it may be) does not require double
>> bandwidth like xfs? Furthermore with bluestore being close to stable, will
>> my architecture need to chan…

[ceph-users] Does CephFS support SELinux?

2017-06-22 Thread Stéphane Klein
Hi, Does CephFS support SELinux? I have this issue with OpenShift (with SELinux) + CephFS: http://lists.openshift.redhat.com/openshift-archives/users/2017-June/msg00116.html
Best regards, Stéphane
--
Stéphane Klein
blog: http://stephane-klein.info
cv: http://cv.stephane-klein.info
Twitter: ht…
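CephFS clients of that era don't persist SELinux labels, so a common workaround is pinning one label for the whole mount; a sketch (monitor address, secret file, and context are all illustrative, and whether the kernel client honors context= depends on the kernel version):

    mount -t ceph 192.168.1.1:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret,context="system_u:object_r:svirt_sandbox_file_t:s0"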

Re: [ceph-users] Does CephFS support SELinux?

2017-06-22 Thread John Spray
On Thu, Jun 22, 2017 at 10:25 AM, Stéphane Klein wrote:
> Hi,
>
> Does CephFS support SELinux?
>
> I have this issue with OpenShift (with SELinux) + CephFS:
> http://lists.openshift.redhat.com/openshift-archives/users/2017-June/msg00116.html
We do test running CephFS server and client bits on mac…

Re: [ceph-users] Does CephFS support SELinux?

2017-06-22 Thread Stéphane Klein
2017-06-22 11:48 GMT+02:00 John Spray :
> On Thu, Jun 22, 2017 at 10:25 AM, Stéphane Klein
> wrote:
> > Hi,
> >
> > Does CephFS support SELinux?
> >
> > I have this issue with OpenShift (with SELinux) + CephFS:
> > http://lists.openshift.redhat.com/openshift-archives/users/2017-June/msg00116.h…

Re: [ceph-users] FW: radosgw: stale/leaked bucket index entries

2017-06-22 Thread Pavan Rallabhandi
Looks like I’ve now got a consistent repro scenario; please find the gory details here: http://tracker.ceph.com/issues/20380 Thanks!
On 20/06/17, 2:04 PM, "Pavan Rallabhandi" wrote:
Hi Orit, No, we do not use multi-site. Thanks, -Pavan.
From: Orit Wasserman…
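For anyone wanting to check their own buckets for stale index entries, the index can be cross-checked against the actual RADOS objects; a sketch with a hypothetical bucket name:

    radosgw-admin bucket check --bucket=mybucket --check-objects
    # add --fix only after reviewing what the dry run reports
    radosgw-admin bucket check --bucket=mybucket --check-objects --fix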

Re: [ceph-users] VMware + CEPH Integration

2017-06-22 Thread Nick Fisk
> -----Original Message-----
> From: Adrian Saul [mailto:adrian.s...@tpgtelecom.com.au]
> Sent: 19 June 2017 06:54
> To: n...@fisk.me.uk; 'Alex Gorbachev'
> Cc: 'ceph-users'
> Subject: RE: [ceph-users] VMware + CEPH Integration
>
> > Hi Alex,
> >
> > Have you experienced any problems with timeou…

Re: [ceph-users] Transitioning to Intel P4600 from P3700 Journals

2017-06-22 Thread David Turner
Christian and everyone else have expertly responded to the SSD capabilities, pros, and cons, so I'll ignore that. I believe you were saying that it was risky to swap out your existing journals to a new journal device. That is actually a very simple operation that can be scripted to take only minutes…
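The swap boils down to a few commands per OSD; a rough sketch, with the OSD id and the new journal partition purely illustrative:

    ceph osd set noout                     # avoid rebalancing during the swap
    systemctl stop ceph-osd@12
    ceph-osd -i 12 --flush-journal         # drain the old journal safely
    ln -sf /dev/disk/by-partuuid/NEW-UUID /var/lib/ceph/osd/ceph-12/journal
    ceph-osd -i 12 --mkjournal             # initialize the new device
    systemctl start ceph-osd@12
    ceph osd unset noout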

Re: [ceph-users] SSD OSD's Dual Use

2017-06-22 Thread David Turner
I wouldn't see this as problematic at all. As long as you're watching the disk utilization and durability, those are the only factors that would eventually tell you that they are busy enough.
On Thu, Jun 22, 2017, 1:36 AM Ashley Merrick wrote:
> Hello,
>
> Currently have a pool of SSDs runni…
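Concretely, the "watching the disk utilization" David mentions can be as simple as this sketch (device names illustrative):

    iostat -x sdb sdc 5     # %util and await per SSD, every 5 s
    smartctl -A /dev/sdb    # wear/endurance attributes, if the drive exposes them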

Re: [ceph-users] Re: Can't start ceph-mon through systemctl start ceph-mon@.service after upgrading from Hammer to Jewel

2017-06-22 Thread David Turner
Did you previously edit the init scripts to look in your custom location? Those could have been overwritten. As was mentioned, Jewel changed what user the daemon runs as, but you said that you tested running the daemon manually under the ceph user? Was this without sudo? It used to run as root unde…
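One way to settle the user question is to run the monitor in the foreground exactly as the unit would; a sketch (instance name illustrative):

    sudo -u ceph ceph-mon -i $(hostname -s) -d   # foreground, log to stdout
    systemctl cat ceph-mon@$(hostname -s)        # what the packaged unit really executes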

Re: [ceph-users] Config parameters for system tuning

2017-06-22 Thread Maged Mokhtar
Looking at the sources, the config values were in Hammer but not Jewel. For the journal config, I recommend that journal_queue_max_ops and journal_queue_max_bytes be removed from the docs: http://docs.ceph.com/docs/master/rados/configuration/journal-ref/ Also for the added filestore throttling params: filestor…
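Whether an option still exists in a running daemon is easy to check against its admin socket; a sketch, run on the host of an illustrative osd.0:

    ceph daemon osd.0 config get journal_queue_max_ops     # errors if the option is gone
    ceph daemon osd.0 config show | grep -e journal_queue -e filestore_queue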

Re: [ceph-users] radosgw: scrub causing slow requests in the md log

2017-06-22 Thread Casey Bodley
On 06/22/2017 04:00 AM, Dan van der Ster wrote:
I'm now running the three relevant OSDs with that patch. (Recompiled, replaced /usr/lib64/rados-classes/libcls_log.so with the new version, then restarted the osds). It's working quite well, trimming 10 entries at a time instead of 1000, and no mo…

Re: [ceph-users] radosgw: scrub causing slow requests in the md log

2017-06-22 Thread Dan van der Ster
On Thu, Jun 22, 2017 at 4:25 PM, Casey Bodley wrote:
>
> On 06/22/2017 04:00 AM, Dan van der Ster wrote:
>>
>> I'm now running the three relevant OSDs with that patch. (Recompiled,
>> replaced /usr/lib64/rados-classes/libcls_log.so with the new version,
>> then restarted the osds).
>>
>> It's work…
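For context, the trim that exercises cls_log is normally driven from radosgw-admin; a sketch with illustrative dates (exact flags vary slightly by release, check radosgw-admin help on your version):

    radosgw-admin mdlog list
    radosgw-admin mdlog trim --start-date="2017-01-01 00:00:00" --end-date="2017-06-01 00:00:00"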

Re: [ceph-users] radosgw: scrub causing slow requests in the md log

2017-06-22 Thread Casey Bodley
On 06/22/2017 10:40 AM, Dan van der Ster wrote:
On Thu, Jun 22, 2017 at 4:25 PM, Casey Bodley wrote:
On 06/22/2017 04:00 AM, Dan van der Ster wrote:
I'm now running the three relevant OSDs with that patch. (Recompiled, replaced /usr/lib64/rados-classes/libcls_log.so with the new version, then…

Re: [ceph-users] rbd IO hang (was disk timeouts in libvirt/qemu VMs...)

2017-06-22 Thread Hall, Eric
After some testing (doing heavy IO on an rbd-based VM with hung_task_timeout_secs=1 while manually requesting deep-scrubs on the underlying PGs, as determined via rados ls -> osdmaptool), I don’t think scrubbing is the cause. At least, I can’t make it happen this way… although I can’t *always* mak…
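The rados-ls-to-pg mapping can also be done without osdmaptool; a sketch, with the pool, object prefix, and pg id all illustrative:

    rados -p rbd ls | grep '^rbd_data\.'             # data objects behind the image
    ceph osd map rbd rbd_data.1234.0000000000000000  # prints the pg id and acting set
    ceph pg deep-scrub 0.1a                          # scrub that pg on demand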

[ceph-users] Squeezing Performance of CEPH

2017-06-22 Thread Massimiliano Cuttini
Hi everybody, I want to squeeze all the performance out of Ceph (we are using Jewel 10.2.7). We are testing a test environment with 2 nodes having the same configuration:
* CentOS 7.3
* 24 CPUs (12 real, with hyper-threading)
* 32GB of RAM
* 2x 100Gbit/s ethernet cards
* 2x OS-dedicated i…
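For a setup like this it helps to benchmark each layer separately before tuning; a sketch of a direct RBD test with fio's rbd engine (pool and image names illustrative):

    # 4k random writes straight against an RBD image, no VM in the path
    fio --name=rbdtest --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testimg \
        --rw=randwrite --bs=4k --iodepth=32 --time_based --runtime=60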

[ceph-users] Obtaining perf counters/stats from krbd client

2017-06-22 Thread Prashant Murthy
Hi Ceph users, We are currently using the Ceph kernel client module (krbd) in our deployment and we were looking to determine if there are ways by which we can obtain perf counters, log dumps, etc. from such a deployment. Has anybody been able to obtain such stats? It looks like the libvirt interf…
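What the kernel client does expose is its debugfs view, a rough substitute for daemon perf counters (requires root and a mounted debugfs; rbd0 illustrative):

    cat /sys/kernel/debug/ceph/*/osdc   # in-flight requests per OSD session
    cat /sys/kernel/debug/ceph/*/monc   # monitor session state
    cat /sys/block/rbd0/stat            # block-layer I/O counters per rbd device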

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-22 Thread Mark Nelson
Hello Massimiliano, Based on the configuration below, it appears you have 8 SSDs total (2 nodes with 4 SSDs each)? I'm going to assume you have 3x replication and are using filestore, so in reality you are writing 3 copies and doing full data journaling for each copy, for 6x writes per c…

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-22 Thread Ashley Merrick
Hello, Also, as Mark put it, one minute you're testing bandwidth capacity, the next minute you're testing disk capacity. No way is a small set of SSDs going to be able to max out your current bandwidth, even if you removed the Ceph/journal overhead. I would say the speeds you are getting are what you shoul…

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-22 Thread Maged Mokhtar
Generally you can measure your bottleneck via a tool like atop/collectl/sysstat and see how busy (i.e. %busy, %util) your resources are: CPU/disks/net. As was pointed out, in your case you will most probably have maxed out on your disks. But the above tools should help as you grow and tune your c…
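A sketch of that in practice with sysstat (intervals illustrative):

    iostat -x 5        # %util and await per disk
    sar -n DEV 5       # throughput per NIC
    mpstat -P ALL 5    # saturation per CPU core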

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-22 Thread ceph
On 22/06/2017 19:19, Massimiliano Cuttini wrote:
> We are already expecting the following bottlenecks:
>
> * [ SATA speed x n° disks ] = 24Gbit/s
> * [ Networks speed x n° bonded cards ] = 200Gbit/s
6Gbps SATA does not mean you can read 6Gbps from that device…

Re: [ceph-users] Mon Create currently at the state of probing

2017-06-22 Thread Jim Forde
David, SUCCESS!! Thank you so much! I rebuilt the node because I could not install Jewel over the remnants of Kraken. So, while I did install Jewel, I am not convinced that was the solution. I did something that I had not tried under the Kraken attempts that solved the problem. For future_me h…

[ceph-users] osd down but the service is up

2017-06-22 Thread Alex Wang
Hi All, I am currently testing a new Ceph cluster with SSDs as journals.
ceph -v
ceph version 10.2.7
cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.4 Beta (Maipo)
I followed http://ceph.com/geen-categorie/ceph-recover-osds-after-ssd-journal-failure/ to replace the journal drive. (…
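When an OSD is marked down while its process is alive, comparing the daemon's state with the cluster map is the quickest first check; a sketch with an illustrative OSD id:

    systemctl status ceph-osd@3              # is the process really running?
    ceph osd tree | grep -w down             # what does the cluster map think?
    ls -l /var/lib/ceph/osd/ceph-3/journal   # does the journal link point at the new partition?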