Re: [ceph-users] red IO hang (was disk timeouts in libvirt/qemu VMs...)

2017-06-23 Thread Jason Dillaman
Yes, I"d say they aren't related. Since you can repeat this issue after a fresh VM boot, can you enable debug-level logging for said VM (add "debug rbd = 20" to your ceph.conf) and recreate the issue. Just to confirm, this VM doesn't have any features enabled besides (perhaps) layering? On Fri, Ju

[ceph-users] when I set quota on CephFS folder I have this error => setfattr: /mnt/cephfs/foo: Invalid argument

2017-06-23 Thread Stéphane Klein
Hi, I have a CephFS cluster based on Ceph version: 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367). I use ceph-fuse to mount the CephFS volume on Debian with Ceph version 10.2.5. I would like to set a quota on a CephFS folder: # setfattr -n ceph.quota.max_bytes -v 10 /mnt/cephfs/foo setfattr: /mnt
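For reference, the quota vxattr syntax being used here, sketched with a larger value (the 100 GB figure is only an illustration):

    # set a quota of 100 GB on the directory
    setfattr -n ceph.quota.max_bytes -v 107374182400 /mnt/cephfs/foo
    # read it back; "0" means no quota is set
    getfattr -n ceph.quota.max_bytes /mnt/cephfs/foo

John Spray's reply later in the thread points at the client-side "client quota" setting as one thing to check when this returns "Invalid argument".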

Re: [ceph-users] red IO hang (was disk timeouts in libvirt/qemu VMs...)

2017-06-23 Thread Hall, Eric
The problem seems to be reliably reproducible after a fresh reboot of the VM… With this knowledge, I can cause the hung IO condition while having noscrub and nodeepscrub set. Does this confirm this is not related to http://tracker.ceph.com/issues/20041 ? -- Eric On 6/22/17, 11:23 AM, "Hall, E

[ceph-users] Re: Re: Can't start ceph-mon through systemctl start ceph-mon@.service after upgrading from Hammer to Jewel

2017-06-23 Thread 许雪寒
I set the "mon_data" configuration item and "user" configuration item in my ceph.conf, and start ceph-mon using the user "ceph". I tested directly calling "ceph-mon" command to start the daemon using "root" and "ceph", there were no problem. Only when starting through systemctl, the start failed

[ceph-users] Which one should I sacrifice: Tunables or Kernel-rbd?

2017-06-23 Thread Massimiliano Cuttini
Dear all, running all servers and clients on a CentOS release with kernel 3.10.*, I'm facing this choice: * sacrifice TUNABLES and downgrade the whole cluster to CEPH_FEATURE_CRUSH_TUNABLES3 (which should be the right profile for jewel on the old 3.10 kernel) * sacrifice KERNEL RBD and map Cep
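For reference, the commands involved in the first option, with the target profile left as an assumption (check the feature-set-mismatch tables for your exact kernel before picking one):

    ceph osd crush show-tunables
    ceph osd crush tunables bobtail     # example downgrade target for an old kernel client; profile name is an assumption

Note that changing the tunables profile triggers data movement across the cluster.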

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread Massimiliano Cuttini
Hi Ashley, I already know; I was already expecting the bottleneck to be the minimum of network bandwidth and disks (and it was currently the disks, per my first email). I think the write speed is still too low. I read that removing the journal overhead is not a good idea. However, I'm writing twice on an SSD...

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread Massimiliano Cuttini
Hi Mark, having 2 nodes for testing allows me to downgrade the replication to 2x (until production). The SSDs have the following product specs: * sequential read: 540 MB/sec * sequential write: 520 MB/sec As you state, my sequential write should be: ~600 * 2 (copies) * 2 (journal write per c
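As a rough back-of-the-envelope formula for this filestore layout (the 4-SSD figure below is purely illustrative, not from the thread):

    expected client write ≈ (per-SSD write BW × number of OSD SSDs) / (copies × 2 journal writes)
                          ≈ (520 MB/s × 4) / (2 × 2) ≈ 520 MB/s

so with colocated journals the usable write bandwidth is roughly a quarter of the raw aggregate, before any network limit applies.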

Re: [ceph-users] radosgw: scrub causing slow requests in the md log

2017-06-23 Thread Dan van der Ster
On Thu, Jun 22, 2017 at 5:31 PM, Casey Bodley wrote: > > On 06/22/2017 10:40 AM, Dan van der Ster wrote: >> >> On Thu, Jun 22, 2017 at 4:25 PM, Casey Bodley wrote: >>> >>> On 06/22/2017 04:00 AM, Dan van der Ster wrote: I'm now running the three relevant OSDs with that patch. (Recompile

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread Massimiliano Cuttini
Very good to know! Thanks for the info. On 22/06/2017 20:15, Maged Mokhtar wrote: Generally you can measure your bottleneck via a tool like atop/collectl/sysstat and see how busy (i.e. %busy, %util) your resources are: cpu/disks/net. As was pointed out, in your case you will most prob

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread Massimiliano Cuttini
*Of course yes!* The SSD bottleneck is the SATA controller. If you use an NVMe/PCIe controller you get, from almost the same SSD, 2400 MB/sec instead of 580 MB/sec. 2400 MB/sec x 8 = ~19 Gbit/sec; 580 MB/sec x 8 = ~5 Gbit/sec. If you don't trust me, take a look at this benchmark between 2 really common S

[ceph-users] CephFS vs RBD

2017-06-23 Thread Bogdan SOLGA
Hello, everyone! We are working on a project which uses RBD images (formatted with XFS) as home folders for the project's users. The access speed and the overall reliability have been pretty good, so far. From the architectural perspective, our main focus is on providing a seamless user experien

Re: [ceph-users] red IO hang (was disk timeouts in libvirt/qemu VMs...)

2017-06-23 Thread Hall, Eric
Only features enabled are layering and deep-flatten: root@cephproxy01:~# rbd -p vms info c9c5db8e-7502-4acc-b670-af18bdf89886_disk rbd image 'c9c5db8e-7502-4acc-b670-af18bdf89886_disk': size 20480 MB in 5120 objects order 22 (4096 kB objects) block_name_prefix: rbd_data.f4e

Re: [ceph-users] Which one should I sacrifice: Tunables or Kernel-rbd?

2017-06-23 Thread Jason Dillaman
CentOS 7.3's krbd supports Jewel tunables (CRUSH_TUNABLES5) and does not support NBD since that driver is disabled out-of-the-box. As an alternative for NBD, the goal is to also offer LIO/TCMU starting with Luminous and the next point release of CentOS (or a vanilla >=4.12-ish kernel). On Fri, Jun

Re: [ceph-users] red IO hang (was disk timeouts in libvirt/qemu VMs...)

2017-06-23 Thread Jason Dillaman
On Fri, Jun 23, 2017 at 8:47 AM, Hall, Eric wrote: > I have debug logs. Should I open a RBD tracker ticket at > http://tracker.ceph.com/projects/rbd/issues for this? Yes, please. You might need to use the "ceph-post-file" utility if the logs are too large to attach to the ticket. In that case,
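For reference, ceph-post-file usage roughly looks like this (the description text and log path are assumptions):

    ceph-post-file -d "rbd IO hang debug logs" /var/log/ceph/client.rbd.log

It uploads the file to a drop point readable only by Ceph developers and prints an id to paste into the tracker ticket.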

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread Ashley Merrick
You could move your journal to another SSD; this would remove the double write. Ideally you'd want one or two PCIe NVMe devices in the servers for the journal. Or, if you can hold off a bit, then BlueStore, which removes the double write; however, it is still handy to move some of the services to a separate di
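For reference, a sketch of how a separate journal device was typically specified with jewel-era ceph-disk (device names are assumptions):

    ceph-disk prepare /dev/sdb /dev/nvme0n1    # data device, then journal device; a journal partition is created on the NVMe

Each additional OSD pointed at the same NVMe gets its own journal partition on it.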

Re: [ceph-users] CephFS vs RBD

2017-06-23 Thread Burkhard Linke
Hi, On 06/23/2017 02:44 PM, Bogdan SOLGA wrote: Hello, everyone! We are working on a project which uses RBD images (formatted with XFS) as home folders for the project's users. The access speed and the overall reliability have been pretty good, so far. From the architectural perspective, o

[ceph-users] Ceph random read IOPS

2017-06-23 Thread Kostas Paraskevopoulos
Hello, We are in the process of evaluating the performance of a testing cluster (3 nodes) with ceph jewel. Our setup consists of: 3 monitors (VMs) 2 physical servers each connected with 1 JBOD running Ubuntu Server 16.04 Each server has 32 threads @2.1GHz and 128GB RAM. The disk distribution per
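For a first pass at numbers on a setup like this, rados bench is a reasonable baseline before moving to fio/rbd tests (pool name and thread count are assumptions):

    rados bench -p rbd 60 write -t 16 --no-cleanup
    rados bench -p rbd 60 rand -t 16
    rados -p rbd cleanup

--no-cleanup keeps the written objects around so the random-read phase has something to read.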

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread Massimiliano Cuttini
Hi Ashley, "You could move your journal to another SSD; this would remove the double write." If I move the journal to another SSD, I will lose an available OSD, so that is like improving by *x2* and then decreasing by *x½*... this should not improve performance in any case on a full-SSD di

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread Ashley Merrick
Sorry for not replying inline. If you put 6 OSDs per NVMe, as long as you're getting a decently rated NVMe, your bottleneck will be the NVMe, but it will still improve over your current bottleneck. You could add two NVMe OSDs, but their higher performance would be lost along with the other 12

[ceph-users] v12.1.0 Luminous RC released

2017-06-23 Thread Abhishek L
This is the first release candidate for Luminous, the next long term stable release. Ceph Luminous will be the foundation for the next long-term stable release series. There have been major changes since Kraken (v11.2.z) and Jewel (v10.2.z). Major Changes from Kraken

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread Alan Johnson
We have found that we can place 18 journals on the Intel 3700 PCI-e devices comfortably. We also tried it with fio, adding more jobs to ensure that performance did not drop off (via Sébastien Han's tests described at https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-sui
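The test referenced above is, roughly, a sync-write fio run against the journal device; a sketch under an assumed device name, with one job per planned journal:

    fio --filename=/dev/nvme0n1 --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=18 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test

The idea is to keep raising --numjobs and check that per-job IOPS doesn't collapse.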

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread Massimiliano Cuttini
Ashley, but... instead of using the NVMe as a journal, why not add 2 OSDs to the cluster? Increasing the number of OSDs instead of improving the performance of the current OSDs? On 23/06/2017 15:40, Ashley Merrick wrote: Sorry for not replying inline. If you put 6 OSDs per NVMe, as long as you're gettin

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread Ashley Merrick
But you'd then have a big mismatch of performance across your OSDs, which is never recommended with Ceph. It's all about what you can do with your current boxes' capacity to increase performance across the whole OSD set. ,Ashley Sent from my iPhone On 23 Jun 2017, at 10:40 PM, Massimiliano Cuttini

Re: [ceph-users] red IO hang (was disk timeouts in libvirt/qemu VMs...)

2017-06-23 Thread Hall, Eric
http://tracker.ceph.com/issues/20393 created with supporting logs/info noted. -- Eric On 6/23/17, 7:54 AM, "Jason Dillaman" wrote: On Fri, Jun 23, 2017 at 8:47 AM, Hall, Eric wrote: > I have debug logs. Should I open a RBD tracker ticket at http://tracker.ceph.com/projects/rbd/issu

Re: [ceph-users] Which one should I sacrifice: Tunables or Kernel-rbd?

2017-06-23 Thread Massimiliano Cuttini
Not all servers are real CentOS servers. Some of them are dedicated distributions locked at 7.2 with the kernel fixed at 3.10, which as far as I can understand needs CRUSH_TUNABLES2 and not even 3! http://cephnotes.ksperis.com/blog/2014/01/21/feature-set-mismatch-error-on-ceph-kernel-client So

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread Massimiliano Cuttini
Hi everybody, I also see that VMs on top of this drive see an even lower speed: hdparm -Tt --direct /dev/xvdb /dev/xvdb: Timing O_DIRECT cached reads: 2596 MB in 2.00 seconds = 1297.42 MB/sec Timing O_DIRECT disk reads: 910 MB in 3.00 seconds = 303.17 MB/sec It seems there is a huge diff

Re: [ceph-users] Re: Re: Can't start ceph-mon through systemctl start ceph-mon@.service after upgrading from Hammer to Jewel

2017-06-23 Thread Curt
Did you set "setuser match path" in your config? If you look at the release notes for Infernalis, it outlines how to still use the ceph user. Also to note below from Infernalis, "Ceph daemons now run as user and group ceph by default. The ceph user has a static UID assigned by Fedora and Debian (

Re: [ceph-users] when I set quota on CephFS folder I have this error => setfattr: /mnt/cephfs/foo: Invalid argument

2017-06-23 Thread David Turner
What is the output of the following command? If a directory has no quota, it should respond "0" as the quota. # getfattr -n ceph.quota.max_bytes /mnt/cephfs/foo I tested this in my home cluster that uses ceph-fuse to mount cephfs under the david user (hence no need for sudo). I'm using Ubuntu 16

Re: [ceph-users] when I set quota on CephFS folder I have this error => setfattr: /mnt/cephfs/foo: Invalid argument

2017-06-23 Thread John Spray
On Fri, Jun 23, 2017 at 4:59 PM, David Turner wrote: > What is the output of the following command? If a directory has no quota, > it should respond "0" as the quota. > # getfattr -n ceph.quota.max_bytes /mnt/cephfs/foo > > I tested this in my home cluster that uses ceph-fuse to mount cephfs unde

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread David Turner
I don't really have anything to add to this conversation, but I see emails like this in the ML all the time. Have you looked through the archives? Everything that's been told to you and everything you're continuing to ask have been covered many many times. http://lists.ceph.com/pipermail/ceph-use

Re: [ceph-users] Which one should I sacrifice: Tunables or Kernel-rbd?

2017-06-23 Thread David Turner
If you have no control over what kernel the clients are going to use, then I wouldn't even consider using the kernel driver for the clients. For me, I would do anything to maintain the ability to use the object map which would require the 4.9 kernel to use with the kernel driver. Because of this

Re: [ceph-users] when I set quota on CephFS folder I have this error => setfattr: /mnt/cephfs/foo: Invalid argument

2017-06-23 Thread Stéphane Klein
2017-06-23 18:06 GMT+02:00 John Spray : > I can't immediately remember which version we enabled quota by default > in -- you might also need to set "client quota = true" in the client's > ceph.conf. > > Do I need to set this option only on the host where I want to mount the volume, or on all MDS hosts? What
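For reference, a minimal sketch of the setting John mentions, in the ceph.conf of the machine doing the ceph-fuse mount (placement in [client] is my reading, not stated in the thread):

    [client]
        client quota = true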

Re: [ceph-users] when I set quota on CephFS folder I have this error => setfattr: /mnt/cephfs/foo: Invalid argument

2017-06-23 Thread Stéphane Klein
2017-06-23 17:59 GMT+02:00 David Turner : > It might be possible that it doesn't want an absolute path and wants a > relative path for setfattr, although my version doesn't seem to care. I > mention that based on the getfattr response. > > I did the test with a relative path and I have the same err

[ceph-users] Inpu/output error mounting

2017-06-23 Thread Daniel Davidson
Two of our OSD systems hit 75% disk utilization, so I added another system to try and bring that back down. The system was usable for a day while the data was being migrated, but now the system is not responding when I try to mount it: mount -t ceph ceph-0,ceph-1,ceph-2,ceph-3:6789:/ /home -
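For reference, a fuller form of that mount line with explicit authentication, assuming cephx is enabled (key name and secret file path are assumptions):

    mount -t ceph ceph-0,ceph-1,ceph-2,ceph-3:6789:/ /home \
        -o name=admin,secretfile=/etc/ceph/admin.secret

Though, as the replies below suggest, the hang here is on the cluster side rather than in the mount options.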

Re: [ceph-users] when I set quota on CephFS folder I have this error => setfattr: /mnt/cephfs/foo: Invalid argument

2017-06-23 Thread David Turner
I doubt the ceph version from 10.2.5 to 10.2.7 makes that big of a difference. Read through the release notes since 10.2.5 to see if it mentions anything about cephfs quotas. On Fri, Jun 23, 2017 at 12:30 PM Stéphane Klein wrote: > 2017-06-23 17:59 GMT+02:00 David Turner : > >> It might be poss

Re: [ceph-users] Inpu/output error mounting

2017-06-23 Thread David Turner
# ceph health detail | grep 'ops are blocked' # ceph osd blocked-by My guess is that you have an OSD that is in a funky state blocking the requests and the peering. Let me know what the output of those commands is. Also, what are the replica sizes of your 2 pools? It shows that only 1 OSD was l

Re: [ceph-users] Inpu/output error mounting

2017-06-23 Thread Daniel Davidson
Thanks for the response: [root@ceph-control ~]# ceph health detail | grep 'ops are blocked' 100 ops are blocked > 134218 sec on osd.13 [root@ceph-control ~]# ceph osd blocked-by osd num_blocked A problem with osd.13? Dan On 06/23/2017 02:03 PM, David Turner wrote: # ceph health detail | grep

[ceph-users] Help needed rbd feature enable

2017-06-23 Thread Massimiliano Cuttini
Hi everybody, I just realized that all my images are completely without features: rbd info VHD-4c7ebb38-b081-48da-9b57-aac14bdf88c4 rbd image 'VHD-4c7ebb38-b081-48da-9b57-aac14bdf88c4': size 102400 MB in 51200 objects order 21 (2048 kB objects) block_name_
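If the goal is to turn features back on, the dynamic ones can be enabled per image; a sketch using the image named above, assuming a format-2 image and clients new enough to understand the features:

    rbd feature enable VHD-4c7ebb38-b081-48da-9b57-aac14bdf88c4 exclusive-lock
    rbd feature enable VHD-4c7ebb38-b081-48da-9b57-aac14bdf88c4 object-map fast-diff
    rbd object-map rebuild VHD-4c7ebb38-b081-48da-9b57-aac14bdf88c4

exclusive-lock has to come first, since object-map and fast-diff depend on it.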

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-23 Thread Massimiliano Cuttini
Ok, I get the point. On 23/06/2017 17:42, Ashley Merrick wrote: But you'd then have a big mismatch of performance across your OSDs, which is never recommended with Ceph. It's all about what you can do with your current boxes' capacity to increase performance across the whole OSD set. ,Ash

Re: [ceph-users] Which one should I sacrifice: Tunables or Kernel-rbd?

2017-06-23 Thread Massimiliano Cuttini
Ok, so if I understand your opinion correctly: if you cannot choose the kernel, then you'll sacrifice kernel-rbd immediately. I was of the same opinion but I'm still gathering opinions. Can you tell me whether, by using nbd-rbd, I'm not losing any features? I just cannot understand if nbd is a sor

Re: [ceph-users] Help needed rbd feature enable

2017-06-23 Thread David Turner
All of the features you are talking about likely require exclusive-lock, which requires the 4.9 Linux kernel. You cannot map any RBDs that have these features enabled with any kernel older than that. The features you can enable are layering, exclusive-lock, object-map, and fast-diff. You cann

Re: [ceph-users] Inpu/output error mounting

2017-06-23 Thread David Turner
Something about it is blocking the cluster. I would first try running this command. If that doesn't work, then I would restart the daemon. # ceph osd down 13 Marking it down should force it to reassert itself to the cluster without restarting the daemon and stopping any operations it's working

Re: [ceph-users] Which one should I sacrifice: Tunables or Kernel-rbd?

2017-06-23 Thread David Turner
I've never used nbd-rbd; I would use rbd-fuse. Its version should match your cluster's running version, as it's a package compiled with each Ceph release. On Fri, Jun 23, 2017 at 3:58 PM Massimiliano Cuttini wrote: > Ok, > > so if I understand your opinion correctly: if you cannot choose the >

Re: [ceph-users] Help needed rbd feature enable

2017-06-23 Thread Massimiliano Cuttini
Ok, at the moment my clients use only nbd-rbd; can I use all these features, or is this something unavoidable? I guess it's ok. Reading around, it seems that a lost feature cannot be re-enabled due to backward compatibility with old clients. ... I guess I'll need to export and import into a new image fully f

Re: [ceph-users] Which one should I sacrifice: Tunables or Kernel-rbd?

2017-06-23 Thread David Turner
What is your use case? That matters the most. On Fri, Jun 23, 2017 at 4:31 PM David Turner wrote: > I've never used nbd-rbd, I would use rbd-fuse. It's version should match > your cluster's running version as it's a package compiled with each ceph > release. > > On Fri, Jun 23, 2017 at 3:58 PM

Re: [ceph-users] Help needed rbd feature enable

2017-06-23 Thread David Turner
I upgraded to Jewel from Hammer and was able to enable those features on all of my rbds that were format 2, which yours is. Just test it on some non-customer data and see how it goes. On Fri, Jun 23, 2017, 4:33 PM Massimiliano Cuttini wrote: > Ok, > > at the moment my clients use only nbd-rbd; can I

Re: [ceph-users] v12.1.0 Luminous RC released

2017-06-23 Thread Sage Weil
On Fri, 23 Jun 2017, Abhishek L wrote: > This is the first release candidate for Luminous, the next long term > stable release. I just want to reiterate that this is a release candidate, not the final luminous release. We're still squashing bugs and merging a few last items. Testing is welcome,

Re: [ceph-users] Inpu/output error mounting

2017-06-23 Thread Daniel Davidson
We are using replica 2 and min size is 2. A small amount of data is sitting around from when we were running the default 3. Looks like the problem started around here: 2017-06-22 14:54:29.173982 7f3c39f6f700 0 log_channel(cluster) log [INF] : 1.2c9 deep-scrub ok 2017-06-22 14:54:29.690401 7f

Re: [ceph-users] Help needed rbd feature enable

2017-06-23 Thread Massimiliano Cuttini
I guess you updated those features before the commit that fixed this: https://github.com/ceph/ceph/blob/master/src/include/rbd/features.h As stated: // features that make an image inaccessible for read or write by /// clients that don't understand them #define RBD_FEATURES_INCOMPATIBLE

Re: [ceph-users] Help needed rbd feature enable

2017-06-23 Thread Massimiliano Cuttini
What seems strange is that features are *all disabled* when I create some images, while Ceph should use the Jewel defaults at least. Do I need to put something in ceph.conf in order to use the default settings? On 23/06/2017 23:43, Massimiliano Cuttini wrote: I guess you upd
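For reference, the knob that controls this is "rbd default features"; a sketch of pinning it explicitly on the client that creates the images (61 is the usual Jewel default bitmask: layering, exclusive-lock, object-map, fast-diff, deep-flatten):

    [client]
        rbd default features = 61

If the creating client (e.g. whatever toolchain makes the VHD-* images) overrides the features via the CLI or API, the ceph.conf default won't apply.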

Re: [ceph-users] when I set quota on CephFS folder I have this error => setfattr: /mnt/cephfs/foo: Invalid argument

2017-06-23 Thread Stéphane Klein
2017-06-23 20:44 GMT+02:00 David Turner : > I doubt the ceph version from 10.2.5 to 10.2.7 makes that big of a > difference. Read through the release notes since 10.2.5 to see if it > mentions anything about cephfs quotas. > Yes, same error with 10.2.7 :(

Re: [ceph-users] Ceph random read IOPS

2017-06-23 Thread Christian Wuerdig
The general advice floating around is that you want CPUs with high clock speeds rather than more cores to reduce latency and increase IOPS for SSD setups (see also http://www.sys-pro.co.uk/ceph-storage-fast-cpus-ssd-performance/) So something like an E5-2667V4 might bring better results in that sit

Re: [ceph-users] Inpu/output error mounting

2017-06-23 Thread David Turner
Your min_size=2 is why the cluster is blocking and you can't mount cephfs. Those 2 PGs, while the cluster is performing the backfilling, are currently only on 1 OSD (osd.13). That is not enough OSDs to satisfy the min_size, so any requests for data on those PGs will block and wait until a second O
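If one chose to relax this during recovery (the snippet is cut off, so this is not necessarily what David goes on to recommend), the knob looks like this, at the cost of accepting writes with a single remaining copy (pool name is a placeholder):

    ceph osd pool set <poolname> min_size 1
    # ... once backfill finishes and the PGs are clean again ...
    ceph osd pool set <poolname> min_size 2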

Re: [ceph-users] Help needed rbd feature enable

2017-06-23 Thread David Turner
It all depends on how you are creating your RBDs. Whatever you're using is likely overriding the defaults and using a custom line in its code. What you linked did not say that you cannot turn on the features I mentioned. There are indeed some features that cannot be enabled if they have ever been

Re: [ceph-users] when I set quota on CephFS folder I have this error => setfattr: /mnt/cephfs/foo: Invalid argument

2017-06-23 Thread David Turner
I do not have a single mention of quotas, or even MDS, in my config files anywhere in my cluster... which is to say that everything is running on default settings. What settings do you have explicitly stated in your config file related to MDS and/or quotas on both your client server and your MDS s