Re: [ceph-users] MDS failover, how to speed it up?

2016-06-21 Thread Brian Lagoni
I plan to add the extra logging and the other info you have asked for at the next MDS restart. As this cluster is being used in production, I have a limited maintenance window, so unless I find a time outside this window you will have to wait until Sunday/Monday to get the logs. @John, yes I have u
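The debug levels are not stated in the truncated message above; a minimal sketch of raising MDS logging at runtime, without waiting for the restart window, would be something like the following, where <name> is a placeholder for the MDS id:

    # raise MDS debug logging on the fly; revert once the failover has been captured
    ceph tell mds.<name> injectargs '--debug_mds 20 --debug_ms 1'
    ceph tell mds.<name> injectargs '--debug_mds 1 --debug_ms 0'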

[ceph-users] Inconsistent PGs

2016-06-21 Thread Paweł Sadowski
Hello, We have an issue on one of our clusters. One node with 9 OSDs was down for more than 12 hours. During that time the cluster recovered without problems. When the host came back into the cluster we got two PGs in the incomplete state. We decided to mark the OSDs on this host as out, but the two PGs are still in incom
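The rest of the report is cut off; a generic first step for PGs stuck incomplete (the PG id below is taken from the dump_stuck output later in this thread) is to query them and look at the peering section:

    ceph health detail | grep incomplete
    ceph pg 3.2929 query > pg-3.2929.json   # inspect peer_info and down_osds_we_would_probe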

Re: [ceph-users] Inconsistent PGs

2016-06-21 Thread M Ranga Swami Reddy
Try restarting OSDs 109 and 166 and check if that helps. On Tue, Jun 21, 2016 at 4:05 PM, Paweł Sadowski wrote: > Thanks for the response. > > All OSDs seem to be OK, they have been restarted, joined the cluster after > that, nothing weird in the logs. > > # ceph pg dump_stuck stale > ok > > # ceph pg dump_st

[ceph-users] performance issue with jewel on ubuntu xenial (kernel)

2016-06-21 Thread Yoann Moulin
Hello, I found a performance drop between kernel 3.13.0-88 (the default kernel on Ubuntu Trusty 14.04) and kernel 4.4.0.24.14 (the default kernel on Ubuntu Xenial 16.04). The ceph version is Jewel (10.2.2). All tests have been done under Ubuntu 14.04. Kernel 4.4 has a drop of 50% compared to 4.2. Kernel 4.4 ha
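The exact benchmark used is cut off above; a minimal sketch for comparing two kernels on the same cluster (the pool name, block size and concurrency here are assumptions, not taken from the original tests):

    rados bench -p testpool 60 write -b 4M -t 16 --no-cleanup
    rados bench -p testpool 60 seq -t 16
    rados -p testpool cleanup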

Re: [ceph-users] Inconsistent PGs

2016-06-21 Thread Paweł Sadowski
Already restarted those OSDs and then the whole cluster (rack by rack; the failure domain is rack in this setup). We would like to try the *ceph-objectstore-tool mark-complete* operation. Is there any way (other than checking mtime on the files and querying the PGs) to determine which replica has the most up-to-date data?
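A hedged sketch of how that comparison might be done with ceph-objectstore-tool (paths and ids are placeholders; mark-complete rewrites PG metadata, so it should only be run on a stopped OSD, ideally after exporting the PG first):

    # on each host holding a copy of the PG, with that OSD stopped
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid 3.2929 --op info
    # compare last_update / last_epoch_started across the copies, then on the chosen replica
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid 3.2929 --op mark-complete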

Re: [ceph-users] librbd compatibility

2016-06-21 Thread Jason Dillaman
The librbd API is stable between releases. While new API methods might be added, the older API methods are kept for backwards compatibility. For example, qemu-kvm under RHEL 7 is built against a librbd from Firefly but can function using a librbd from Jewel. On Tue, Jun 21, 2016 at 1:47 AM, min

Re: [ceph-users] Inconsistent PGs

2016-06-21 Thread M Ranga Swami Reddy
You can use the commands below:
  ceph pg dump_stuck stale
  ceph pg dump_stuck inactive
  ceph pg dump_stuck unclean
Then query the PGs which are in the unclean or stale state and check for any issue with a specific OSD. Thanks Swami On Tue, Jun 21, 2016 at 3:02 PM, Paweł Sadowski wrote: > Hello, >

Re: [ceph-users] OSD out/down detection

2016-06-21 Thread Adrien Gillard
Regarding your original issue, you may want to configure kdump on one of the machines to get more insight into what is happening when the box hangs/crashes. I faced a similar issue when trying 4.4.8 on my Infernalis cluster (box hangs, black screen, OSD down and out), and as it happens, there were c

[ceph-users] Bucket index question

2016-06-21 Thread Василий Ангапов
Hello, I have a question regarding the bucket index: 1) As far as I know, the index of a given bucket is a single RADOS object and it lives in the OSD omap. But does it get replicated or not? 2) When trying to copy the bucket index pool to some other pool I get the following error: $ rados cppool ed-1.rgw.b
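On 1): the index data is stored as omap key/value pairs attached to that RADOS object, and omap is replicated along with the object like any other PG data. On 2): the error itself is truncated, but rados cppool historically does not copy omap data, which is one common explanation for failures on index pools. A hedged sketch for inspecting an index object directly (pool name and bucket marker are placeholders):

    rados -p <index pool> ls | head
    rados -p <index pool> listomapkeys .dir.<bucket marker id> | wc -l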

Re: [ceph-users] Inconsistent PGs

2016-06-21 Thread Paweł Sadowski
Thanks for the response. All OSDs seem to be OK; they have been restarted, joined the cluster after that, and there is nothing weird in the logs.
# ceph pg dump_stuck stale
ok
# ceph pg dump_stuck inactive
ok
pg_stat   state        up             up_primary   acting   acting_primary
3.2929    incomplete   [109,272,83]   10

[ceph-users] Regarding executing COSBench onto a specific pool

2016-06-21 Thread Venkata Manojawa Paritala
Hi, In our Ceph cluster we are currently seeing that COSBench writes IO to the default pools that are created while configuring the RADOS gateway. Can you please let me know if there is a way to direct IO (using COSBench) to a specific pool? Thanks & Regards, Manoj
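One possible approach (an assumption, not something confirmed in this thread) is to leave COSBench pointed at the gateway and instead change the zone placement so that bucket data lands in the pool you want:

    radosgw-admin zone get > zone.json
    # edit placement_pools in zone.json so data_pool (and index_pool if desired) point at the target pool
    radosgw-admin zone set < zone.json
    # restart the radosgw daemons so the new placement takes effect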

[ceph-users] Observations after upgrading to latest Hammer (0.94.7)

2016-06-21 Thread Kostis Fardelas
Hello, I upgraded a staging ceph cluster from the latest Firefly to the latest Hammer last week. Everything went fine overall and I would like to share my observations so far: a. every OSD upgrade takes approx. 3 minutes; I doubt there is any way to speed this up though. b. rados bench with different block si

[ceph-users] Does flushbufs on a rbd-nbd invalidate librbd cache?

2016-06-21 Thread Nick Fisk
Hi All, Does anybody know if calling blockdev --flushbufs on an rbd-nbd device causes the librbd read cache to be invalidated? I've done a quick test and the invalidate_cache counter doesn't increment the way it does when you send the invalidate command via the admin socket. Thanks, Nick
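For reference, the two operations being compared look roughly like this (the admin socket path is an assumption; it depends on the client section in ceph.conf):

    # flush the kernel-side buffers of the nbd device
    blockdev --flushbufs /dev/nbd0
    # list the commands librbd registered on the client admin socket, then check the counter
    ceph --admin-daemon /var/run/ceph/<client>.asok help
    ceph --admin-daemon /var/run/ceph/<client>.asok perf dump | grep -i invalidate_cache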

Re: [ceph-users] osds udev rules not triggered on reboot (jewel, jessie)

2016-06-21 Thread Loic Dachary
On 16/06/2016 18:01, stephane.d...@orange.com wrote: > Hi, > > Same issue with CentOS 7, I also put back this file in /etc/udev/rules.d. Hi Stephane, Could you please detail which version of CentOS 7 you are using? I tried to reproduce the problem with CentOS 7.2 as found on the CentOS clou
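Until the udev rule question is settled, a common workaround on an affected box (a sketch, not something prescribed in this thread) is to re-trigger activation by hand after boot:

    # re-run the activation that the udev rule would normally trigger
    ceph-disk activate-all
    # or re-fire the udev block-device events
    udevadm trigger --subsystem-match=block --action=add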

Re: [ceph-users] Issue installing ceph with ceph-deploy

2016-06-21 Thread shane
Fran Barrera writes: > > Hi all, > I have a problem installing ceph jewel with ceph-deploy (1.5.33) on ubuntu 14.04.4 (openstack instance). > > This is my setup: > > > ceph-admin > > ceph-mon > ceph-osd-1 > ceph-osd-2 > > > I've followed these steps from the ceph-admin node: > > I have the u

Re: [ceph-users] Chown / symlink issues on download.ceph.com

2016-06-21 Thread Dan Mick
On 06/20/2016 12:54 AM, Wido den Hollander wrote: > Hi Dan, > > There seems to be a symlink issue on download.ceph.com: > > # rsync -4 -avrn download.ceph.com::ceph /tmp|grep 'rpm-hammer/rhel7' > rpm-hammer/rhel7 -> /home/dhc-user/repos/rpm-hammer/el7 > > Could you take a quick look at that? It

Re: [ceph-users] Issue installing ceph with ceph-deploy

2016-06-21 Thread Vasu Kulkarni
On Tue, Jun 21, 2016 at 8:16 AM, shane wrote: > Fran Barrera writes: > >> >> Hi all, >> I have a problem installing ceph jewel with ceph-deploy (1.5.33) on ubuntu > 14.04.4 (openstack instance). >> >> This is my setup: >> >> >> ceph-admin >> >> ceph-mon >> ceph-osd-1 >> ceph-osd-2 >> >> >> I've f

[ceph-users] slow request, waiting for rw locks / subops from osd doing deep scrub of pg in rgw.buckets.index

2016-06-21 Thread Trygve Vea
Hi, I believe I've stumbled on a bug in Ceph, and I'm currently trying to figure out if this is a new bug, some behaviour caused by our cluster being in the midst of a hammer(0.94.6)->jewel(10.2.2) upgrade, or other factors. The state of the cluster at the time of the incident: - All monitor n

[ceph-users] Bluestore Backend Tech Talk

2016-06-21 Thread Patrick McGarry
Hey cephers, Just a reminder, the Bluestore backend Ceph Tech Talk by Sage is going to be starting in ~10m. Feel free to dial in and ask questions. Thanks. http://ceph.com/ceph-tech-talks/ -- Best Regards, Patrick McGarry Director Ceph Community || Red Hat http://ceph.com || http://commun

Re: [ceph-users] slow request, waiting for rw locks / subops from osd doing deep scrub of pg in rgw.buckets.index

2016-06-21 Thread Samuel Just
.rgw.bucket.index.pool is the pool with rgw's index objects, right? The actual on-disk directory for one of those pgs would contain only empty files -- the actual index data is stored in the osd's leveldb instance. I suspect your index objects are very large (because the buckets contain many objec
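A quick way to test the large-index theory (pool, bucket and marker names are placeholders): compare the bucket's object count with the number of omap keys behind its index object; if they run into the millions, bucket index sharding (rgw_override_bucket_index_max_shards, which only affects newly created buckets) is the usual mitigation:

    radosgw-admin bucket stats --bucket=<name> | grep num_objects
    rados -p <index pool> listomapkeys .dir.<bucket marker id> | wc -l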

[ceph-users] Ceph Performance vs Entry Level San Arrays

2016-06-21 Thread Denver Williams
Hi All, I'm planning an OpenStack private cloud deployment and I'm trying to decide which would be the better option. What would the performance advantages/disadvantages be when comparing a 3-node Ceph setup with 15K/12G SAS drives in HP DL380p G8 servers with SSDs for write cache, compared to s

Re: [ceph-users] Ceph Performance vs Entry Level San Arrays

2016-06-21 Thread Christian Balzer
Hello, On Wed, 22 Jun 2016 11:09:46 +1200 Denver Williams wrote: > Hi All > > > I'm planning an OpenStack private cloud deployment and I'm trying to > decide which would be the better option. > > What would the performance advantages/disadvantages be when comparing a > 3-node Ceph setup with 1

[ceph-users] Ceph 10.1.1 rbd map fail

2016-06-21 Thread 王海涛
Hi All, I'm using ceph-10.1.1 to map an rbd image, but it doesn't work. The error messages are:
root@heaven:~# rbd map rbd/myimage --id admin
2016-06-22 11:16:34.546623 7fc87ca53d80 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
2016-06-22 11:16:34.54716

Re: [ceph-users] performance issue with jewel on ubuntu xenial (kernel)

2016-06-21 Thread Florian Haas
Hi Yoann, On Tue, Jun 21, 2016 at 3:11 PM, Yoann Moulin wrote: > Hello, > > I found a performance drop between kernel 3.13.0-88 (default kernel on Ubuntu > Trusty 14.04) and kernel 4.4.0.24.14 (default kernel on Ubuntu Xenial 16.04) > > ceph version is Jewel (10.2.2). > All tests have been done u

Re: [ceph-users] Ceph 10.1.1 rbd map fail

2016-06-21 Thread Brad Hubbard
On Wed, Jun 22, 2016 at 1:35 PM, 王海涛 wrote: > Hi All > > I'm using ceph-10.1.1 to map an rbd image, but it doesn't work. The error > messages are: > > root@heaven:~# rbd map rbd/myimage --id admin > 2016-06-22 11:16:34.546623 7fc87ca53d80 -1 WARNING: the following dangerous > and experimental featur

Re: [ceph-users] Ceph 10.1.1 rbd map fail

2016-06-21 Thread 王海涛
I found this message in dmesg: [83090.212918] libceph: mon0 192.168.159.128:6789 feature set mismatch, my 4a042a42 < server's 2004a042a42, missing 200 According to "http://cephnotes.ksperis.com/blog/2014/01/21/feature-set-mismatch-error-on-ceph-kernel-client", this could mean that I nee
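Depending on which feature bits turn out to be missing (the value is truncated above), the usual remedies are either to disable the image features that the running kernel client does not understand, or to fall back to an older CRUSH tunables profile; a hedged sketch:

    # drop the Jewel-default image features the kernel client cannot handle
    rbd feature disable rbd/myimage deep-flatten fast-diff object-map exclusive-lock
    # or, if the missing bit is a CRUSH/tunables feature, relax the tunables profile
    ceph osd crush tunables hammer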